2016-10-16

(This article was originally published at Blog about Stats, and syndicated at StatsBlogs.)

State of Open Data in Europe

The European Commission (Directorate General for Communications Networks, Content and Technology) just published the second Open Data Maturity Report.

‘Open Data refers to the information collected, produced or paid for by public bodies which can be freely used, modified, and shared by anyone for any purpose.’ (p. 6; all page references are to the report)

‘The two key indicators used to measure Open Data Maturity are Open Data Readiness and Portal Maturity.

— The first key indicator, Open Data Readiness, assesses to what extent countries have an Open Data policy in place, licensing norms and the extent of national coordination regarding guidelines and setting common approaches.

— The second key indicator, Portal Maturity, explores the usability of the portal regarding the availability of functionalities, the overall re-usability of data such as machine readability and accessibility of data sets, for example, as well as the spread of data across domains.

The two key indicators as well as the sub-indicators are shown in the table below.

Open Data Maturity in Europe 2016: The results

Overview by countries:



Source: European Data Portal, which has all the details.

(p.59)

And one more result that Blog about Stats enjoys:

About Re-Usability

One of the criteria for open data maturity is the re-usability of data, especially its machine readability. Six questions focus on this item:

‘When looking at the data on the European Data Portal, over 49 different file formats are used. The most used data formats are CSV, HTML and WMS. The fourth most used data format is PDF. PDF is one of the few data formats that is not machine-readable. The following most frequent distributions are ZIP, JSON, XLS and XLSX, followed by WFS and XML. Numbers range from nearly 49,000 CSV formats to just over 23,000 JSON formats to the least used 263 shape formats. Most data formats are or are related to a spreadsheet, which enables to analyse the data more swiftly.’ (p. 49).

That is not enough, and the report recommends:

‘On the more technical side, some improvements are still necessary. To further develop automated processes each national portal should have an API in combination with a complete metadata profile. This allows a portal to share the data with data users more easily. This can for instance enable harvesting data directly from public administrations in an automated fashion, saving efforts in manual uploading of data and limiting errors when editing data and metadata manually. … Typos or different spellings can limit the discovery of data. Here activities conducted at EU level on controlled vocabularies can be of interest to learn from in order to increase semantic interoperability.’ (p. 63)
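The point about typos and controlled vocabularies can be illustrated with a minimal Python sketch. It maps free-text keywords onto a fixed list of terms using fuzzy matching from the standard library. The vocabulary, the function name `normalise_keyword` and the cutoff of 0.8 are assumptions made for this illustration; they are not taken from the report or from any portal.

```python
import difflib

# Hypothetical controlled vocabulary; a real portal would use a published one.
CONTROLLED_TERMS = ["agriculture", "education", "environment", "health", "transport"]


def normalise_keyword(raw, vocabulary=CONTROLLED_TERMS, cutoff=0.8):
    """Return the closest controlled term for a free-text keyword, or None."""
    cleaned = raw.strip().lower()
    # get_close_matches ranks candidates by similarity and drops those below cutoff.
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for keyword in ["Enviroment", " HEALTH ", "transprt", "astronomy"]:
        print(repr(keyword), "->", normalise_keyword(keyword))
```

Keywords that match no term closely enough come back as `None`, so they can be sent to a human for review instead of being silently mapped to a wrong term.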

There is more

PDF is poor, XLS is better, and CSV is better still, as are JSON and APIs; metadata are of crucial importance. The European Data Portal sets a good example: it organizes the datasets in the triple format (RDF) and offers a SPARQL search.
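To give a feel for what the triple format and a SPARQL-style search mean, here is a toy sketch in plain Python. It is not RDF tooling and not the portal's implementation: the dataset identifiers and titles are invented, and the prefixes (`dcat:`, `dct:`) are only borrowed from the DCAT and Dublin Core vocabularies in a simplified way.

```python
# Each fact is a (subject, predicate, object) triple. The identifiers below
# are made up for illustration and are not taken from the European Data Portal.
TRIPLES = [
    ("ex:dataset1", "rdf:type", "dcat:Dataset"),
    ("ex:dataset1", "dct:title", "Population by region"),
    ("ex:dataset1", "dct:format", "CSV"),
    ("ex:dataset2", "rdf:type", "dcat:Dataset"),
    ("ex:dataset2", "dct:title", "Road traffic counts"),
    ("ex:dataset2", "dct:format", "PDF"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def titles_with_format(triples, file_format):
    """Titles of datasets with the given format, roughly what the SPARQL
    pattern { ?d dct:format "CSV" . ?d dct:title ?title } would select."""
    subjects = [s for (s, _, _) in match(triples, predicate="dct:format", obj=file_format)]
    return [o for s in subjects for (_, _, o) in match(triples, subject=s, predicate="dct:title")]


if __name__ == "__main__":
    print(titles_with_format(TRIPLES, "CSV"))
```

A real SPARQL endpoint does the same kind of pattern matching, joining triples on shared variables, but over a proper RDF store.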

But there is more.

Not only the datasets but also the data in these sets could foster semantic interoperability. Linked data and adequate formats can ensure this interoperability and, with it, extend the machine readability and use of data. So why not add this criterion to the questionnaire (Question 7.6 +) and lead the national portals in this direction?
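What "linked data inside the sets" could look like can be sketched as follows: each row of a CSV file is turned into triples, with the key column becoming a subject URI. The sample data, the base URI `http://example.org/` and the function names are invented for this illustration, and the output is only N-Triples-like (literal escaping and datatypes are ignored).

```python
import csv
import io

# Invented sample data and base URI, for illustration only.
SAMPLE_CSV = "region,year,population\nNorth,2015,1200\nSouth,2015,3400\n"
BASE = "http://example.org/"


def csv_to_triples(text, key_column, base=BASE):
    """Turn each CSV row into (subject, predicate, object) triples.

    The key column becomes the subject URI; every other column becomes a
    predicate URI whose object is the cell value as a plain literal.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{base}{key_column}/{row[key_column]}>"
        for column, value in row.items():
            if column != key_column:
                triples.append((subject, f"<{base}def/{column}>", f'"{value}"'))
    return triples


def to_ntriples(triples):
    """Serialise triples one per line in an N-Triples-like form."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(csv_to_triples(SAMPLE_CSV, "region")))
```

The gain for interoperability comes when publishers reuse shared URIs for things like regions and measures, so that data from different portals can be joined without guessing at column names.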

Linked data? Tim Berners-Lee explains

And related: https://blogstats.wordpress.com/2016/02/03/open-data-portals-news/

Filed under: 032 Metadata, 037 Open data initiatives Tagged: EU Commission, linked data, open data, Report, SPARQL


The post There’s more appeared first on All About Statistics.
