2013-04-26



How is Wikidata different from other tools claiming to be a "free knowledge base"? That remains to be seen

Wikidata, the latest initiative from Wikimedia, the operator of Wikipedia, Wiktionary and Wikiversity (not to be confused with the unaffiliated Wikileaks or WikiMapia), was officially launched this week. Is this a wonderful breakthrough in open data, or one wiki too far?

Wikidata aims to create:

a free, collaborative, multilingual, secondary database, collecting structured data to provide support for Wikipedia, Wikimedia Commons, the other Wikimedia projects, and well beyond that.

Beyond the bright adjectives, the key phrase is "structured data". This means data in the more traditional sense: not media files or articles, but information that can be analysed to look for connections. More importantly, structured data means that information can be automatically updated across a range of different languages - that makes editing articles on Wikipedia less time-consuming, and it makes Wikipedia more current.

"Structured data" doesn't necessarily mean numbers though as the site's example demonstrates: "before Wikidata, Wikipedians needed to manually update hundreds of Wikipedia language versions every time a famous person died or a country's leader changed".

But there already seem to be some issues with Wikidata's promotion of information consistency across different pages. For a start, the press release claims Wikidata began in March 2012, while the introduction page claims it began in April 2012 - a discrepancy that will probably be corrected after this article is published.

Similarly, the main page claims that phase 2 of the Wikidata project (creating infoboxes) was launched this week. But the timeline suggests that this happened in February and that Wikidata is now in its third phase (which involves automatically updating and translating articles).

To be fair, Wikimedia doesn't pretend that this isn't a work in progress. For example, though it claims that the Wikidata database will be "read and edited by humans and machines alike", it acknowledges elsewhere the importance of the human editors, stating "only when we have a working community will we have an interesting data set". That community component of information management is particularly critical for "large quantities of data", where "you need to be even more patient".

The guiding principle behind all this - making information more accessible across languages - is not new, but it is an important one. What's more, the concept of automatically generating lists and charts could be a powerful step forward in making data easier to understand, and to question, in a shorter space of time.

Here's a link to Wikidata's page on The Guardian. So far, it doesn't look like much more than a useful index to Wikipedia articles about the Guardian in various languages. But who knows what it might become?

Do you know more about the Wikidata project? Do you have concerns about it? If so, please tell us by posting your comment/question below.


Mona Chalabi

guardian.co.uk © 2013 Guardian News and Media Limited or its affiliated companies. All rights reserved.
