Sunlightfoundation.com

Reasons to Not Release Data, Part 4: Cost

2013-10-03

Earlier this month, we shared a crowdsourced collection of the top concerns data advocates have heard when they’ve raised an open data project with government officials at the federal, state, and local levels, and we asked you to share how you’ve responded. Dozens of you contributed to the project, sharing your thoughts on social media, our public Google doc, and even on the Open Data Stack Exchange, where 8 threads were opened to dive deeper into specific subjects.

Drawing from your input, our own experience, and existing materials from our peers at the National Neighborhood Indicators Partnership and some data warriors from the UK, we’ve compiled a number of answers -- discussion points, if you will -- to help unpack and respond to some of the most commonly cited open data concerns. This mash-up of expertise is a work in progress, but we bet you’ll find it a useful conversation starter (or continuer) for your own data advocacy efforts.


Over the next few weeks, we’ll be sharing challenges and responses from our #WhyOpenData list that correspond to different themes. Today’s theme is Cost.

It's expensive

14. We don't have the budget

A. It requires expensive software

Unpack assumptions about software, hardware, applications, and hackathons. Identify specifically where the cost fears lie, research existing platforms and open source solutions, and price out (and compare!) the options.

If the costs fall not on the tech, but on personnel, you should draw from our #WhyOpenData chapter on staffing concerns (coming soon!).

B. It would be costly to provide copies of data in open formats

"Why would it have to be costly? Have you explored the processes that would be required for this?"

Show them examples of low-cost ways to provide data in open formats.

Try to find out whether this is actually true. Get the details of the software they are using and figure out what it can do. Talking directly to the vendor can be the best next step. This excuse is often given by people who may not be well informed about what is technically possible, so try to talk to whoever is directly responsible. If it does turn out to be true, ask them to release the data in whatever format they can. If the data is interesting enough and released accessibly (ideally in bulk, too), the community will figure out how to convert an obscure data format into an open one.
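If it helps to have a concrete low-cost example on hand, here is a minimal sketch in Python that dumps a database table to CSV using only the standard library. The `permits` table, its columns, and the `export_table_to_csv` helper are hypothetical stand-ins; an agency's real system will look different, and SQLite is used here only because it needs no setup.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to the file-like `out` as CSV, header first.

    Returns the number of data rows written.
    """
    # Table names cannot be passed as query parameters, so quote it as an identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; the first item is the name.
    writer.writerow([col[0] for col in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


def demo():
    """Build a tiny in-memory table (hypothetical data) and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permits (id INTEGER, address TEXT, issued TEXT)")
    conn.executemany(
        "INSERT INTO permits VALUES (?, ?, ?)",
        [(1, "12 Main St", "2013-09-30"), (2, "48 Oak Ave, Unit B", "2013-10-01")],
    )
    buffer = io.StringIO()
    rows = export_table_to_csv(conn, "permits", buffer)
    conn.close()
    return rows, buffer.getvalue()
```

The point for the conversation is less the code than the scale: for many systems, a first open-format export is a short script rather than a new software purchase.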

C. It would require a lot of staff time

“Does your staff/department/agency currently respond to FOI or public records requests? These may already be using lots of staff time, requiring staff to spend a great deal of time answering inquiries one-by-one. By making the most commonly requested documents, maps, and other information open by default and posted online, staffers can refer data seekers to your website and spend less time answering these requests.”

Offer additional analysis useful to the agency's work. Agency staff may lack the time or expertise for analysis of their own data for internal purposes. In return for data sharing, you can offer to fulfill simple agency requests or return enhanced files to the agency.

This is also an opportunity to explore the creation of a data inventory: a complete listing of government data holdings by agency (ideally including all public data and noting all non-public data). For a crunched agency, an inventory can be a great tool to help prioritize and pace data release, making it easier to tell which data can be released quickly and which will require more effort and consultation. Inventories also open a door for the public, or at least public stakeholders, to support data prioritization and release, potentially removing some administrative burdens and certainly adding political will.

D. It would require new processes and staff training

“Very likely, but these are processes that will have to happen eventually anyway. With the rapid rate of change in technology and archival storage, there is a need to, if not keep up, at least ensure that the systems our government has in place to manage information can perform sustainably. Many of the government processes in use today will likely not survive the coming shifts and changing expectations of the community. Exploring ways to implement new systems now will ultimately lessen work down the road and likely lower costs, too.”

“There may never be a ‘perfect moment’ when opening data is easy and instant, but we can begin to explore small steps that benefit your staff and the public and ultimately open more data now and build toward a greater scale down the road.”

Offer to speak with staff about the benefits of opening data and to connect the agency with other governments that have had comparable experiences so that they can share lessons learned.

E. Keeping everything updated would be costly

"It may be more expensive to continue with business as usual. Plot out the current process: where are there inefficiencies?"

This can depend upon current information management practices. If there are good, computerized systems in place for managing data internally, it can be easy to automate public data releases. If the agency lacks those internal systems, or they are not updated frequently, explore simple options for regular upload (for example, publishing daily bulk data) and seek to identify the internal champion or administrator who can drive broader change in internal data processing.
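To make "publishing daily bulk data" concrete, here is a rough sketch of what a scheduled job could do: copy the current export into a public folder under a dated name, and refresh a "latest" copy next to it. The folder layout, file names, and the `publish_daily_snapshot` helper are assumptions for illustration, not a description of any particular agency's setup.

```python
import datetime
import pathlib
import shutil


def publish_daily_snapshot(source, publish_dir, today):
    """Copy `source` into `publish_dir` as a dated file plus a 'latest' copy.

    Returns the paths of the dated file and the latest file.
    """
    source = pathlib.Path(source)
    publish_dir = pathlib.Path(publish_dir)
    publish_dir.mkdir(parents=True, exist_ok=True)

    # Example: permits.csv published on 2013-10-03 becomes permits-2013-10-03.csv
    dated = publish_dir / f"{source.stem}-{today.isoformat()}{source.suffix}"
    shutil.copyfile(source, dated)

    # A stable name lets data users always fetch the newest file from one place.
    latest = publish_dir / f"{source.stem}-latest{source.suffix}"
    shutil.copyfile(dated, latest)
    return dated, latest
```

A job like this could be run once a day by whatever scheduler the agency already uses. Keeping the dated copies also gives data users a simple archive without any extra tooling.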

As the agency gets used to data publishing and sharing, there will be automated options to explore to help with real-time publishing, an important step for data accuracy and completeness.

F. What if something breaks and the open version becomes out of date?

"Integrating the open data into internal data flows means that people will notice if the data becomes stale, reducing the chance that it remains so."

“It might also be appropriate to include disclaimers for the data, or at least to publicly identify the sources of the data and how they’re processed so that people accessing the data are aware of potential stoppages. This will both aid civilian flagging of stale data and ensure that reuse of the data doesn’t include false assumptions. It might also garner civilian or private sector support for enhancing whatever software or automated processes are ‘breaking down’.”
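One way to picture the disclaimer idea is a small check that compares a dataset's last update against how often it is supposed to be refreshed, and produces a notice when it has fallen behind. This is only a sketch: the `staleness_notice` helper, the wording of the notice, and the example dates are all hypothetical.

```python
import datetime


def staleness_notice(last_updated, expected_interval, today):
    """Return a warning string if the data is older than expected, else None.

    `last_updated` and `today` are dates; `expected_interval` is a timedelta
    describing how often the dataset is supposed to be refreshed.
    """
    age = today - last_updated
    if age <= expected_interval:
        return None
    return (
        f"Last updated {last_updated.isoformat()}; "
        f"normally refreshed every {expected_interval.days} day(s). "
        "This copy may be out of date."
    )


# A daily dataset last refreshed two days ago triggers the notice.
notice = staleness_notice(
    datetime.date(2013, 10, 1),
    datetime.timedelta(days=1),
    datetime.date(2013, 10, 3),
)
```

The same check could feed an internal alert as well as a public notice, so staff and data users learn about a stoppage at the same time.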

Stay tuned tomorrow for our next #WhyOpenData post on Staffing Concerns.