Is Raw Data Bad For You? Open Data Obligations to Government.

By: Leah Cooke, Stephanie Piper, Alana Kingdon, and Peter Johnson

*This blog post was written collaboratively during the springtime Geothink meetup between Ryerson University and University of Waterloo students and faculty. The goals of this meetup were to discuss current and future issues related to Geothink research themes.

What strings are attached to governments that provide open data to citizens? Alongside the current interest in government open data, questions remain about how governments should share data. Specifically, what obligations do governments have beyond simple data provision? These obligations could include educating citizens, contextualizing data, and being receptive to citizen feedback on the data provided. For example, if a government publishes drinking water quality data, does it have a (moral, ethical, operational) obligation to support this data with relevant contextualizing information? We propose five main responses that governments could provide when answering this question.

1. Nothing

Providing the data as it exists without any contextual information to aid in understanding the data.

2. Metadata

Defining the details of the data, such as acronyms and field names, to make the document readable for technically adept users.

3. Processed data

Data that includes maps, legends, annotations, or graphs/charts to aid viewers' understanding, while still including the original data to allow for additional analyses. Also included is descriptive information or explanatory text that may help users understand the data.

4. Engagement and Responsiveness

A responsive format for the distribution of open data would see a commitment to the sustainability of the data itself, ensured through updates and maintenance to open data portals. An obligation for citizen engagement would also be present at this level, with governments creating workshops or tools to help citizens become knowledgeable about the data, as well as ensuring two-way communication between government and those with questions or suggestions about the data.

5. Interoperable Standards for Data Sets

Data sets are released in a standardized format, with the intention of making data more accessible to novice users and easier to integrate across different municipalities for regional analyses.

While these five responses are different potential ways governments can operationally structure and release their data, the question still remains: which option is ethically or morally the one that should be adopted? Further, government bodies must abide by complex legislative requirements, including the Accessibility for Ontarians with Disabilities Act (AODA), that also need to be considered when releasing any information. Do these requirements alter these obligations? Beyond the regulations themselves, further accessibility issues are also raised. Should the data be accessible to various levels of users, from novice to expert? What does this mean for the ethical framework surrounding the release of the data? As data is often released in formats, such as .csv files, that are familiar only to technical users, is there an additional obligation to release data in a form that is open to nontechnical users as well? Inherent in the name "open data" is the assumption that this data is being released in order to increase transparency. It would be natural to assume that this data should therefore be accessible to users regardless of their technical skill level.

In conclusion, for municipal governments, providing raw data is really just the first step. Governments that are serious about using open data as a prelude or support to open government also need to provide tools and support that enable data to be turned into information. Metadata is not enough, and open data does not replace targeted information and publications created internally and shared with citizens.