Libraryjournal.com

Making Open Access Content Discoverable

2016-12-15

Finding the right journal content has always been hard work for scholars and librarians. The move from print to digital was supposed to make that easier—in theory. Data-at-your-fingertips should have meant less stack climbing, less poring over endless endnotes, and better overall results.

The reality is far different. Search technology, already in a fragmented state, has been further disrupted by the rise of Open Access (OA). Library information professionals agree. In a recent Library Journal survey, about 72% of all respondents cited OA article discoverability as a major concern.

Gold & Green Factors

The problem is easy to describe but hard to resolve. Typically, Gold OA publishers have robust search capabilities. But they often use different metadata—sometimes only slightly different—than other OA or subscription-based journals. Indexing services do a respectable job, but they can only collect metadata that publishers choose to provide.

Green OA is also problematic. Unequally funded and resourced institutional repositories can have complex, irregularly developed approaches to metadata, which are often incompatible with one another. Even when full-text articles are openly available, discovering the right one can come down to a combination of long hours and luck.

This feature article is part of our Open Access in Action series, sponsored by Dove Press, which tracks the evolution of important open access (OA) issues through a library lens by presenting regular original articles, video interviews, news, and perspectives. To learn more about how librarians like you are driving practice across the lifecycle of open access, be sure to visit our Open Access in Action hub page.

Finding the Right Tools

There are, of course, good tools for academic article discovery, as we briefly discussed in April. Not all of them are equal, however. We spoke with Allen Lopez, Collections Librarian at the University of Texas MD Anderson Cancer Research Medical Library, for more insight into the current discoverability tools.

“As librarians, we have to accept that Google Scholar is where many of our folks are going,” Lopez said. “The drawbacks include the sheer mass of hits that a single query generates. More isn’t necessarily a good thing. It’s not humanly possible to evaluate each one. Also, no one outside Google knows the ins and outs of their search algorithm, so we simply don’t know if these links are to reliable sources.”

Lopez emphasized the discipline needed to evaluate a Google Scholar finding, including checking the journal’s peer review process, its reputation, and its willingness to openly discuss methodology. Quantitative metrics, such as numbers of views or downloads, should be weighed less heavily, he feels. “We do talk about numbers, but it’s an attempt to apply an objective measurement to a subjective value,” he said.

Of course, there are other, discipline-specific tools available. “If you’re a serious researcher, you need to go into the individual databases in your field,” Lopez said. For OA, these include PubMed Central for biomedical and life sciences research, JSTOR for a wide variety of arts and sciences disciplines, and the OAJSE for mechanical engineering.

Between general tools like Google Scholar and highly specialized databases, Lopez noted several library-specific applications like Primo, Summon, and EDS. Although these are understandably subscription-based, they are increasingly including Open Access journal content. As OA metadata become more standardized (see below), these discovery tools will become more efficient.

The keys to greater OA discoverability include some presumably basic steps. “Journal article authors need to hold to the highest standards of the peer review process—not just publishing for the sake of being published,” Lopez said. “Publishers also need to systematically connect with all the major indexes in their field.”

Significantly, Lopez also expressed the desire for a more standard and consistently applied set of metadata—covering not only DOIs and ORCiDs but many other things that librarians struggle with daily. “There are some good, common identifiers out there,” he said, “but they are not required or even consistently applied by every journal. Even something as seemingly simple as volume and issue numbers are not consistent—even within the same journal!”

Metadata 2020

Lopez’s wish is shared by many, and it may have a solution. We spoke with Ginny Hendricks, Crossref’s Director of Member and Community Outreach, about the persistent chicken-and-egg dilemma surrounding journal metadata. “To start with, publishers have registered 232 different licenses with us, many of which are easily identifiable as OA because they use, e.g., CC-BY versions,” she said. “But we can’t tell for the others without reading the licenses in full. It’s also not just about either Green or Gold; it’s more like a spectrum of openness.” Hendricks noted that all publishers struggle with metadata consistency, and that an OA license does not make content any less susceptible to the discoverability problem.
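
To make Hendricks’s point about licenses concrete, here is a minimal Python sketch of pattern-based license detection. It assumes only that Creative Commons license URLs live under creativecommons.org; the function name, the labels, and the example publisher URL are hypothetical and are not part of any Crossref tool. Anything that does not match the pattern falls into the bucket she describes: it cannot be judged without reading the license.

```python
from urllib.parse import urlparse

# Hypothetical labels for this sketch; not a Crossref classification.
CREATIVE_COMMONS = "creative-commons"
UNKNOWN = "unknown"


def classify_license(url: str) -> str:
    """Label a license URL by its pattern alone, without reading the license text."""
    parsed = urlparse(url.strip().lower())
    host = parsed.netloc
    if host.startswith("www."):
        host = host[4:]
    # Split the path into its non-empty segments, e.g. ["licenses", "by", "4.0"].
    parts = [segment for segment in parsed.path.split("/") if segment]
    # Creative Commons licenses and public-domain dedications share this URL shape.
    if host == "creativecommons.org" and parts and parts[0] in ("licenses", "publicdomain"):
        return CREATIVE_COMMONS
    # Everything else would need a person to read the license in full.
    return UNKNOWN


if __name__ == "__main__":
    for url in (
        "https://creativecommons.org/licenses/by/4.0/",
        "https://publisher.example/terms-of-use",  # made-up publisher URL
    ):
        print(url, "->", classify_license(url))
```

A match here says only that a Creative Commons license is in use. Where a given license sits on the “spectrum of openness” is a separate judgment.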

To address this, Crossref and at least 17 other organizations, including schools, bodies like ORCiD, and even the Wikimedia Foundation, are launching a new initiative, Metadata 2020, next year. Hendricks asked, “What if we could make it easy to include as much information as possible? All the basic stuff but also license info, funding/grant data, ORCiDs, organization IDs, clinical trial data, and, along the way, corrections and retractions? What if it was a simple case of entering once, and watching that work, with clean and ‘complete’ metadata, grow and get added to, permeating through other systems, contributing to research throughout the world?”
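
As a rough illustration of what checking for “complete” metadata could look like, the Python sketch below tests a record against the items Hendricks lists. The field names and the sample record are hypothetical, chosen only to mirror her list; they are not a Metadata 2020 or Crossref schema.

```python
# Hypothetical field names mirroring the items Hendricks lists.
WISHLIST = (
    "license",
    "funding",
    "orcids",
    "organization_ids",
    "clinical_trials",
    "updates",  # corrections and retractions
)


def missing_fields(record: dict) -> list:
    """Return the wishlist fields that are absent or empty in a metadata record."""
    return [field for field in WISHLIST if not record.get(field)]


if __name__ == "__main__":
    # A made-up record with only some of the fields filled in.
    record = {
        "title": "An example article",
        "license": "https://creativecommons.org/licenses/by/4.0/",
        "orcids": ["0000-0002-1825-0097"],  # placeholder value
        "funding": [],
    }
    print(missing_fields(record))
```

A check this crude treats every empty field as a gap, even though a field such as clinical trial data does not apply to every article. A real completeness measure would need to account for that.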

The campaign’s goal is to build on existing efforts, increase awareness among publishers, and “rally the community” in promoting more robust, consistent metadata. Besides improving areas such as reporting, Metadata 2020 will also pursue new types of incentives, particularly gamification, for publishers to provide better metadata.

“All parties in the research enterprise should aim to improve the discoverability of content,” Hendricks said. “Whether they’re funders, authors, preprint servers, publishers, libraries, or repositories, it is in everyone’s interest to add value through search, discovery, annotation, and analyses.” Awareness and application of a common metadata standard offer a clear path to that goal.