treat all data as conversation

What if data published by governments had tracked changes and comments turned on, like a Word document, for every user? That is basically the question asked the other day by Gov 2.0 evangelist Ben Balter.

It’s a valuable idea, driven by several urgent needs in the open data world: more usable government data, more engaged data users, and a more fluid, accountable dialogue between data publishers and data users. And Balter stands at the forefront of the issue, having pioneered the federal government’s earliest uses of GitHub, the socially-oriented software platform, and now working with GitHub to help lead its new focus on state, local and federal government usage.

The vision in his post is lucid and compelling, but the proposal is off the mark. We need better conversations more than we need better annotations. More precisely, we’ll get more users to give more feedback that is more useful to more government data publishers if we don’t require issue experts (or civil servants) to become more like programmers.

Balter compares open data sets to open source software, which relies on annotations and iterations of code to create ever-improving tools based on loosely-joined collaborative communities of coders.

“Being able to track changes at that level of granularity … empowers contributors to propose and discuss changes with great efficiently [sic], accurately, and precisely. It makes software a team sport. All of a sudden line-by-line code reviews, issues, and pull requests arise to address challenges both large and small.”

If governments made their data more accessible and annotatable, Balter says, “rather than posting the data as a zip file or to a proprietary data portal,” the open data community could evolve on a faster track, using lessons from the open source community, and build up habits and standards of markup that strengthen both the quality of supply and the capacity of those with demands. “Consumers of the data can submit proposed changes to do everything from normalizing columns to correcting errors to making the data itself more useable.”

He’s right that current challenges in open data usability are slowing feedback loops between, for example, an education ministry posting schools data and education advocates trying to make the most of the data in local advocacy.

But in a world where people had the time and savvy to annotate the way that he pictures, we wouldn’t need the system he’s picturing: The user/publisher dynamics would already be rich enough to yield more usable applications and more accountable data publishers.

As it stands, most governments are working to fulfill the basic promises of the open data movement: delivering a steady supply of higher quality data that is genuinely usable. And open data users are striving to improve their own basic data capabilities while maintaining the pressure on government to fulfill its promises.

Since Kenya’s government launched its data portal, for example, the number of published data sets has more than doubled, but open data experts have expressed a range of concerns about the effectiveness of the portal and the degree of uptake among citizen groups. As Kenya’s own ICT champion Bitange Ndemo said last year, “Right now, we have dealt with just the supply side of data. The challenge now is to build the demand side of data.”

Our #TABridge colleague Mikel Maron has spoken on several occasions about the value and challenges of “distributed version control” for data tools and the need for stronger user communities if the open data environment is to flourish. To build those communities, governments and experts need to meet the members where they are. That’s what groups like the Open Knowledge Foundation, the Open Government Partnership, the Sunlight Foundation and we at TABridge are trying to do.

Before creating a new layer of tags, comments and tracked changes on government data sets, however, let’s create a new community of curious, competent, confident data users among NGOs and citizens.

Let’s expand our vision of open data community outreach from hackathons and datapaloozas to include feedback-athons and train-apaloozas. We have a language available to us to describe our challenges and imagine better uses of data: conversation.

I’d rather teach government publishers to listen more often than assume they’ll find the time, money or permission to build the infrastructure and job descriptions Balter is imagining. And I’d rather set the threshold of skill lower for users with feedback than assume that all the NGOs who need government data can become good users of GitHub.

It’s worth noting that the open source community Balter looks to for inspiration has itself confronted the challenges of usability and adapted. Just because code is open for collaboration does not mean that breakthrough applications or dynamic feedback loops will spring forth. The developer community’s embrace of user communities and user-centric design has deepened and widened the impact of open source code, and continues to push publishers to do better.

So, yes, we should push the open government data community to evolve faster than the open source community did from “open up and win” to “open up then listen,” but the best listening and revision platform should be something more human-centric than comment tags and version histories online.

Originally posted on the TAI blog

Jed Miller
