My introduction to open source was a little unusual - it started around the discussion of open data instead of open code. I got involved in “The Public Knowledge Workshop,” a non-profit organization of Israeli social activists devoting their leisure time to making public data available and meaningful.

The founding story of this volunteer community impressed me - and even more so, their social impact. It began with a university student contemplating who he should vote for in the coming election. He sent a request for data and statistics to The Knesset (the Israeli parliament) spokesperson and received a standard reply with a reference to the official website. Back then, the website mainly consisted of static protocols. It meant that he had to devote months to skimming and reading hundreds of pages to answer each of the questions he was curious about. Here is where the story becomes interesting - to speed up the process and have the information he needed before election time, he wrote scripts to mine and aggregate the information.

As in any good open source story, he found that many others had the same “itch”. Together they created the Open Knowledge website, which became the leading instrument for gathering updated and archival information about the Knesset. The site was never intended to replace or compete with the official website. The purpose was to demonstrate how public data can be presented in a way that is meaningful to citizens.

There are more great stories like this all around the globe. You can find something that may interest you by choosing a field of interest at the opengovguide report generator. These stories teach the essence of public open data: the effort to redesign relationships between citizens and governments, especially in an era when the data required to make decisions costs as much as (or less than) a rubber duck.

I believe that open source developers should get involved and take part in efforts like these for two main reasons. First, I believe that dealing with open data challenges makes you a better developer. They tend to improve critical and creative thinking and force us to focus on the end users and user stories. Second, it is a great way to use your professional skills for a better society. Yes, decent data is crucial, even if you believe we just entered “the post-truth era”.

From data to public open data

When thinking of public data, just like with any code, you should ask yourself “how do I want it to be presented”:

Data presented as a final product? Credit: Porotherm style heat-insulating clay block brick. By Kozuch (Own work) CC BY-SA 3.0, via Wikimedia Commons.
Shaping open data to make it useful. Credit: Drop spindles of clay making. Peter van der Sluijs By Kozuch (Own work) CC BY-SA 3.0, via Wikimedia Commons.

The main question is whether we expect the data producer/collector to present it as a solid manufactured product, or whether the client (or user) should have the right (and ability) to reshape and reuse it. Another way to put it: who actually owns the data, and do they have the right to limit its usage? The principles of governmental open data were defined back in 2007 in a workshop of 30 open government advocates, and have since been adopted and recognized by state organizations. They are:

  1. Complete (Bring it all, not just a part or parts of it).
  2. Primary (Publicize it as it was collected, as granular as possible).
  3. Timely (Provide it ASAP, since time influences the usability and value of data).
  4. Accessible (Consider the means of distribution to enable as many as possible to use the data).
  5. Machine Processable (Make it usable, not only readable).
  6. Non-discriminatory (Make it available to all, without any demand to register).
  7. Non-proprietary (Neither the content nor the format should limit users).
  8. License-free (Except for some reasonable privacy, security, and privilege restrictions. These words may make you yawn, but actually it is one of the most tumultuous debates.)

These are the original principles of the open data definition, but of course, like any other open source project, they are dynamic: they have been forked and have evolved.
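To make the "Machine Processable" principle a little more concrete, here is a minimal Python sketch of what becomes possible once data is published in a structured format rather than as static pages. Everything in it is an assumption for illustration only: the sample records, the column names (`member`, `bill`, `vote`), and the helper `count_votes` are hypothetical and do not reflect any real Knesset or Open Knesset dataset or API.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: invented records in an invented layout, not real data.
SAMPLE_CSV = """member,bill,vote
Member A,Bill 1,for
Member A,Bill 2,against
Member B,Bill 1,for
Member B,Bill 2,for
"""


def count_votes(csv_text):
    """Return a Counter mapping (member, vote) to how often it occurs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["member"], row["vote"]) for row in reader)


if __name__ == "__main__":
    # Print one line per (member, vote) pair, in alphabetical order.
    for (member, vote), n in sorted(count_votes(SAMPLE_CSV).items()):
        print(f"{member}: {vote} x{n}")
```

With data in this shape, a question that would take weeks of reading static pages can be answered by a few lines of code, and anyone can rerun or adapt the script when new records are published.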

Got it, so where do I start?

  • Check out what is happening in your country, region, municipality, or in any other place of interest. The World Wide Web Foundation’s Barometer or Open Knowledge’s Global Open Data Index are beneficial resources.
  • Search relevant open source code repositories, and look for open issues that fit your development skills.
  • Resist the temptation to start by building your own project - most likely someone already thought about a solution for the problem you face. Check it out and consider joining those efforts. Open Knowledge can help you connect with active groups in your region.
  • Embrace the opportunity to learn and gain experience. Don’t expect quick results.

Noam Castel