2014-08-28

Wikipedia as the Front Matter to all Research

A session at the recent Wikimania conference provided an opportunity to discuss the topic “The fount of all knowledge – wikipedia as the front matter to all research”. The abstract describes how:

This discussion focuses on how Wikipedia could become the entry or discovery point to all significant research for the general public, and for scholars who are working just outside of the topic of interest. For most people, even researchers from closely related areas, summaries and explanations of a piece of research can be a crucial means both to discover and to begin to get into a new piece of research.

Currently overviews of research topics are supported through two mechanisms: reviews and “front matter” content. A review is a systematic summary of a field, written by an expert. These go out of date quickly, particularly in rapidly moving areas of research. Front matter is “News and Views” pieces, often found at the “front” of scientific journals that explain newly published research and put it in context. This often includes a discussion of explaining how the research is an important advance and its broader societal implications.

Both of these functions could easily be provided in a more up to date and scalable manner by tapping into a global community of experts. Wikipedia articles are often the top web search result for initial queries in many research areas and these articles are a major source of traffic for scientific journals. As the first port of call for many users of research and a significant discovery route the potential for Wikipedia as a form of dynamic, expertly curated “front matter” for the whole research literature is substantial. This facilitated discussion session will focus on how this role could be enhanced, what is currently missing and what risks exist in taking this route.

Reading this I wondered about the extent to which Wikipedia articles currently link to papers hosted in institutional repositories.

To explore this question I used Wikipedia’s External links search tool to count the number of links from Wikipedia pages to institutional repositories provided by the Russell Group universities.

The survey was carried out on 28 August 2014 using this tool. Note that the current findings can be obtained by following the link in the final column of the table.

Table 1: Numbers of Links from Wikipedia to Repositories Hosted at Russell Group Universities

| Ref. No. | Institution | Repository used | Nos. of links from Wikipedia | View Results |
|---|---|---|---|---|
| 1 | University of Birmingham | eprint Repository (http://eprints.bham.ac.uk/) | 2 | [Link] |
| 2 | University of Bristol | ROSE (http://rose.bris.ac.uk/) | 6 | [Link] |
| 3 | University of Cambridge | Dspace @ Cambridge (http://www.dspace.cam.ac.uk/) | 82 | [Link] |
| 4 | Cardiff University | ORCA (http://orca.cardiff.ac.uk/) | 1 | [Link] |
| 5 | University of Durham | DRO (http://dro.dur.ac.uk/) | 109 | [Link] |
| 6 | University of Edinburgh | ERA (http://www.era.lib.ed.ac.uk/) | 55 | [Link] |
| 7 | University of Exeter | ERIC (https://eric.exeter.ac.uk/repository/) | 17 | [Link] |
| 8 | University of Glasgow | Enlighten (http://eprints.gla.ac.uk/) | 120 | [Link] |
| 9 | Imperial College | Spiral (http://spiral.imperial.ac.uk/) | 5 | [Link] |
| 10 | King’s College London | King’s Research Portal (https://kclpure.kcl.ac.uk/portal/) | 45 | [Link] |
| 11 | University of Leeds | White Rose Research Online (http://eprints.whiterose.ac.uk/) | 65 | [Link] |
| 12 | University of Liverpool | University of Liverpool Research Archive (http://research-archive.liv.ac.uk/) | 1 | [Link] |
| 13 | LSE | LSE Research Online (http://eprints.lse.ac.uk/) | 186 | [Link] |
| 14 | University of Manchester | eScholar (https://www.escholar.manchester.ac.uk/) | 74 | [Link] |
| 15 | Newcastle University | Newcastle Eprints (http://eprint.ncl.ac.uk/) | 4 | [Link] |
| 16 | University of Nottingham | Nottingham Eprints (http://eprints.nottingham.ac.uk/) | 10 | [Link] |
| 17 | University of Oxford | ORA (http://ora.ouls.ox.ac.uk/) | 19 | [Link] |
| 18 | Queen Mary, University of London | QMRO (https://qmro.qmul.ac.uk/) | 15 | [Link] |
| 19 | Queen’s University Belfast | QUB Research Portal (http://pure.qub.ac.uk/portal/) | 3 | [Link] |
| 20 | University of Sheffield | The University of Sheffield also uses the White Rose repository which is also used by Leeds and York. See the Leeds entry for the statistics. | (65) | [Link] |
| 21 | University of Southampton | eprints.soton (http://eprints.soton.ac.uk/) | 134 | [Link] |
| 22 | University College London | UCL Discovery (http://discovery.ucl.ac.uk/) | 98 | [Link] |
| 23 | University of Warwick | WRAP (http://wrap.warwick.ac.uk/) | 57 | [Link] |
| 24 | University of York | The University of York uses the White Rose repository which is also used by Leeds and Sheffield. See the Leeds entry for the statistics. | (65) | [Link] |
| Total | | | 1,108 | |

NOTE:

The URLs of the repositories are taken from the OpenDOAR service.

Since the universities of Leeds, Sheffield and York share a repository the figures are provided in the entry for Leeds.

A number of institutions appear to host more than one research repository. In such cases the repository which appears to be the main research repository for the institution is used.

Discussion

The Survey Methodology

It should be noted that this initial survey does not pretend to provide an answer to the question “How many research papers hosted by institutional repositories provided by Russell Group universities are cited in Wikipedia articles?” Rather, the survey reflects the use of this blog as an ‘open notebook’ in which the initial steps in gathering evidence are documented openly in order to solicit feedback on the methodology. This post also documents flaws and limitations in the methodology so that others who may wish to use similar approaches are aware of them. Possible ways in which such limitations can be addressed are given and feedback is welcomed.

In particular it should be noted that the search engine used in the survey covers all public pages on the Wikipedia web site and not just Wikipedia articles. It includes Talk pages and user profile pages.

In addition the repository web sites include a variety of resources and not just research papers; for example it was observed that some user profile pages for researchers provide links to their profile on their institutional repository.

It was also noticed that some of the files linked to from Wikipedia were listed in the search results as PDFs. Since it seems likely that PDFs referenced on Wikipedia which are hosted on institutional repositories will be research papers, a more accurate reflection of the number of repository-hosted research papers which are cited in Wikipedia may be obtained by filtering the findings to include only PDF results.

In addition, if the findings from the search tool were restricted to Wikipedia articles only (omitting Talk pages, user profile pages, etc.) we should get a better understanding of the extent to which Wikipedia is being used as the “front matter” to research hosted in Russell Group university institutional repositories.

If any Wikipedia developers would be interested in taking up this challenge, this could help to provide a more meaningful benchmark which could be useful in monitoring trends.
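As a starting point, the two refinements suggested above (article pages only, PDF targets only) could be sketched roughly as follows. This is a minimal illustration and not the method used for the survey: it assumes the MediaWiki API's list=exturlusage query with its euquery, eunamespace and eulimit parameters, assumes each result carries "ns", "title" and "url" fields, and ignores paging through large result sets. The sample records and the eprints.example.ac.uk host are invented purely for illustration.

```python
from urllib.parse import urlencode, urlparse

# Assumption: the English Wikipedia API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(repository_host, namespace=0, limit=500):
    """Build an external-link usage query for one repository host.

    namespace=0 asks for article pages only, leaving out Talk and User pages.
    """
    params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": repository_host,
        "eunamespace": namespace,
        "eulimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def count_article_pdf_links(records):
    """Count records from article pages (ns == 0) whose target path ends in .pdf."""
    return sum(
        1
        for record in records
        if record["ns"] == 0
        and urlparse(record["url"]).path.lower().endswith(".pdf")
    )


# Hypothetical records in the assumed result shape (not real survey data).
sample_records = [
    {"ns": 0, "title": "Example article", "url": "http://eprints.example.ac.uk/123/1/paper.pdf"},
    {"ns": 1, "title": "Talk:Example article", "url": "http://eprints.example.ac.uk/123/1/paper.pdf"},
    {"ns": 2, "title": "User:Example", "url": "http://eprints.example.ac.uk/view/author/1.html"},
    {"ns": 0, "title": "Another article", "url": "http://eprints.example.ac.uk/456/"},
]

query_url = build_query_url("eprints.bham.ac.uk")
pdf_links_in_articles = count_article_pdf_links(sample_records)
```

Counting only paths that end in .pdf is a crude proxy for "research paper": it would miss papers cited via their repository landing page, so any figures produced this way would still need to be treated with caution.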

Policy Implications of Encouraging Wikipedia to Act as the Front Matter to Research

There are risks when gathering such data that observers with vested interests will seek to make too much of the findings if they suggest a league table, particularly if there seem to be runaway leaders.

However, as can be seen from the accompanying pie chart, in this case no single institutional repository has more than 17% of the total number of links (and remember that these figures are flawed for the reasons summarised above).
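The 17% figure can be checked directly from the numbers in Table 1. The short Python sketch below uses the link counts as recorded on 28 August 2014, with the shared White Rose repository (Leeds, Sheffield and York) counted once; the shortened institution labels are mine.

```python
# Link counts from Table 1 (28 August 2014); White Rose is counted once.
link_counts = {
    "Birmingham": 2,
    "Bristol": 6,
    "Cambridge": 82,
    "Cardiff": 1,
    "Durham": 109,
    "Edinburgh": 55,
    "Exeter": 17,
    "Glasgow": 120,
    "Imperial": 5,
    "King's College London": 45,
    "White Rose (Leeds, Sheffield, York)": 65,
    "Liverpool": 1,
    "LSE": 186,
    "Manchester": 74,
    "Newcastle": 4,
    "Nottingham": 10,
    "Oxford": 19,
    "Queen Mary": 15,
    "Queen's Belfast": 3,
    "Southampton": 134,
    "UCL": 98,
    "Warwick": 57,
}

total_links = sum(link_counts.values())

# Repository with the most links and its share of the total, as a percentage.
largest = max(link_counts, key=link_counts.get)
largest_share = 100 * link_counts[largest] / total_links

print(f"Total: {total_links}")
print(f"Largest: {largest} with {largest_share:.1f}% of links")
```

The total comes to 1,108, matching the table, and the largest single share is LSE's 186 links, which is just under 17%.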

There will, however, be interesting policy implications if universities agree with the suggestion that Wikipedia can act as “the front matter to all research”, especially if links from Wikipedia to the institution’s repository result in increased traffic to the repository. Another way of characterising the proposal would be to suggest that Wikipedia can act as “the marketing tool to an institution’s research outputs”.

This could easily lead to institutions failing to abide both by Wikipedia’s core principle that content should be written from a neutral point of view and by the Wikimedia Foundation’s terms of use.

Earlier today I came across an article entitled “So who’s editing the SNHU Wikipedia page?” which described how analysis of editing patterns and deviations from the norm may be indicative of inappropriate Wikipedia editing strategies, such as pay-for updates to institutional Wikipedia articles.

The article also pointed out how the PR sector has responded to criticisms that PR companies have been failing to abide by the Wikimedia Foundation’s terms of use: “Top PR Firms Promise They Won’t Edit Clients’ Wikipedia Entries on the Sly”. That article describes the Statement on Wikipedia from participating communications firms which is hosted on Wikipedia. The following statement was issued on 10 June 2014:

On behalf of our firms, we recognize Wikipedia’s unique and important role as a public knowledge resource. We also acknowledge that the prior actions of some in our industry have led to a challenging relationship with the community of Wikipedia editors.

Our firms believe that it is in the best interest of our industry, and Wikipedia users at large, that Wikipedia fulfill its mission of developing an accurate and objective online encyclopedia. Therefore, it is wise for communications professionals to follow Wikipedia policies as part of ethical engagement practices.

We therefore publicly state and commit, on behalf of our respective firms, to the best of our ability, to abide by the following principles:

To seek to better understand the fundamental principles guiding Wikipedia and other Wikimedia projects.

To act in accordance with Wikipedia’s policies and guidelines, particularly those related to “conflict of interest.”

To abide by the Wikimedia Foundation’s Terms of Use.

To the extent we become aware of potential violations of Wikipedia policies by our respective firms, to investigate the matter and seek corrective action, as appropriate and consistent with our policies.

Beyond our own firms, to take steps to publicize our views and counsel our clients and peers to conduct themselves accordingly.

We also seek opportunities for a productive and transparent dialogue with Wikipedia editors, inasmuch as we can provide accurate, up-to-date, and verifiable information that helps Wikipedia better achieve its goals.

A significant improvement in relations between our two communities may not occur quickly or easily, but it is our intention to do what we can to create a long-term positive change and contribute toward Wikipedia’s continued success.

If we wish to see Wikipedia acting as the front matter to research provided by the university sector, should we be seeking to develop a similar statement on how we will do this whilst ensuring that we act in accordance with Wikipedia’s policies and guidelines? Of course the challenge would then be to identify what the appropriate best practices should be.


Filed under: Evidence, Repositories, Wikipedia
