Anticipating Demand: The User Experience as Driver
By Donald T. Hawkins, Freelance Editor and Conference Blogger
Note: A summary of this article appeared in the June 2015 issue of Against The Grain, v27 #3, on page XX.
The 57th Annual Meeting of the National Federation of Advanced Information Services (NFAIS) was held in Arlington, VA on February 22-24, 2015. It attracted an audience of approximately 175 attendees and featured a mix of plenary and panel sessions and, of course, the Miles Conrad Memorial Lecture, presented this year by Tim Collins, President and CEO of EBSCO Industries, Inc.
Keynote Address: The User of the Future
Kalev Leetaru, Founder of the GDELT Project, presented an information-packed keynote address on reimagining how we think about information that we can access from anywhere. In today’s world:
There are now as many cell phones in existence as there are people on earth,
Facebook alone has 240 billion photographs (35% of all online photos),
Every year, 6.1 trillion text messages and 107 trillion e-mails are sent,
Every day, 2.5 billion new items and 300 million new photos are posted to Facebook,
As many words are posted to Twitter every day as were published in the entire New York Times over the last half century, and
Over 100 billion social media actions are taken every day.
Despite the huge growth in the amount of data now available, the basic search experience has not changed much in the last 50 years: users must still figure out how authors would describe their subject and enter the relevant words in a search box. (On a personal note, I can confirm this; when I began my career as an online searcher over 40 years ago, I found that one of the best questions to ask a requester in a reference interview was, "If you were writing an article on this, what would its title be?")
We must understand users better. Successful search companies know this. For example, Google usually understands what we are looking for when we enter a query. Users do not need or want to think about the structure of the underlying database or the search profile that Google creates for each one of its users; all they want is the answer to their query.
Many companies recognize that they cannot be the source of all information, so they employ crowdsourcing techniques to improve their platforms. Suppose they created their own clouds of information for their customers and allowed them to mine the data: significant revenue streams could result, and many unexpected uses of the data would come to light. For example, similar images from two different databases could be compared side by side for the first time ever.
The GDELT Project identifies events, quotes, people, images, emotions, and themes in global news articles and creates metadata from them. From the GDELT database, one can create a knowledge graph of what is happening and how people are reacting to events. For example, the U.S. Army analyzed 21 billion words from the academic literature, reports, and dissertations to create knowledge graphs of information sources in Africa and the Middle East. Studies such as this will help us understand what information will look like in the future, how users will interact with it and use it, and how it can be delivered effectively and in a frictionless manner. We already have an incredible amount of knowledge about how users interact with the data they come into contact with; we must create ways for them to mine the information, selectively share it, and make it useful. Successful companies will create interfaces that adapt to the user's needs.
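GDELT's actual pipeline is vastly larger, but a toy sketch can make the idea of turning event metadata into a knowledge graph concrete. The event records, field names, and tone scores below are invented for illustration; only the general technique (counting co-occurrences of actors across events) reflects the approach Leetaru described.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, simplified event records of the kind a GDELT-style
# pipeline might extract from news articles (not GDELT's real schema).
events = [
    {"actors": ["US Army", "Ministry of Health"], "theme": "aid", "tone": -1.2},
    {"actors": ["US Army", "Red Cross"], "theme": "aid", "tone": 2.3},
    {"actors": ["Red Cross", "Ministry of Health"], "theme": "epidemic", "tone": -3.1},
]

# A knowledge graph here is just weighted edges: how often two actors
# co-occur in the same event, and the average tone of those mentions.
edge_weights = Counter()
edge_tones = {}
for ev in events:
    for a, b in combinations(sorted(ev["actors"]), 2):
        edge_weights[(a, b)] += 1
        edge_tones.setdefault((a, b), []).append(ev["tone"])

for (a, b), w in edge_weights.most_common():
    avg_tone = sum(edge_tones[(a, b)]) / len(edge_tones[(a, b)])
    print(f"{a} -- {b}: {w} events, average tone {avg_tone:+.1f}")
```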
User Experience Demands on Libraries
David Shumaker, Professor of Library and Information Science at the Catholic University of America, appropriately titled his presentation "Caught in the Middle: Librarians, Scholars, and Information Professionals". He began with a well-known quotation from Stewart Brand's book The Media Lab: Inventing the Future at M.I.T. (Penguin, 1988): "information wants to be free." But, as Brand also observed, information wants to be expensive too, so we are in a very dynamic environment, pulled in two directions at once, which can be bewildering.
We are each at the center of our own information universe, and we all have our own personal information management system (PIM).
PIMs require feeding and maintenance: incoming content must be evaluated, metadata records must be created for it, and usually it must be cleaned up before being added to the PIM. Time spent on these activities could be better used for scholarship, teaching, or research, so PIM maintenance tends to be neglected. Information is a means to an end; once a problem is solved, we focus on other activities. Why do users allow this situation to continue? We need better tools to help us keep up. Librarians, experts at evaluating competing tools and giving advice, have an opportunity to become change agents in their organizations.
Peter Fox, Professor at Rensselaer Polytechnic Institute, agreed with Shumaker and said that the goal of an information system must be to get something back to the user as quickly as possible. He advocated use cases to expose system requirements and said that we must get away from building “systems” and focus on frameworks instead. (Systems have well-defined entry and exit points, but frameworks can have many.) Social aspects and small teams are important when building a platform.
New Workflow Tools
Five new startup companies presented their products in a series of lightning presentations:
Kudos helps researchers maximize the visibility and impact of their published articles. So many articles are published (about 50 million/year) that many of them are never viewed or read, which is painful for researchers, publishers, universities, and funding organizations. Academia is moving from a “publish or perish” environment to “be discovered or die”; Kudos puts researchers in control of their output by providing tools to explain, enrich, share, and measure the impact of their publications. It has been well received; since May 2014, Kudos has registered over 30,000 users.
Sparrho is a personalized recommendation engine that helps researchers connect with others who have similar interests and discover content outside their normal sources. It uses its database of over 2 million articles, events, newsfeeds, and patents from over 18,000 sources to create a personalized newsfeed for its users.
Sciencescape organizes and maps published papers in the life sciences from its database of over 24 million biomedical articles. Journal routing services have disappeared because most journals are now digital, and search works only if you know what you are looking for, so today’s distribution system has broken down. Researchers at Sciencescape’s over 1,000 customers can view the latest research results as a knowledge graph, which helps them discover and share new key papers in their field.
ZappyLab has built a repository of science protocols used in research and is trying to overcome the problem of rediscovering knowledge previously discovered by others. Currently, if a scientist discovers a change that should be made in a published protocol, there is no way to inform others about it, so the next person to use that protocol must rediscover the change. ZappyLab's crowdsourced protocol repository provides a platform for documenting such changes and notifying other users about them.
Hypothes.is provides a method of annotating web pages (like marginal notes made in books). Although commenting (bookmarking, tagging, note taking, or discussion) is possible on many websites, the comments are often buried at the bottom of the page, sharing them is difficult, and they use many incompatible systems. The Hypothes.is platform permits private discussions, collaborative discussions, and public annotations.
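Because Hypothes.is exposes its public annotations through an open API, third parties can pull the annotation layer for any page. The minimal sketch below uses its public search endpoint as I understand it from the project's documentation; the endpoint and response fields should be verified against the current API docs before relying on them.

```python
import json
from urllib.request import urlopen
from urllib.parse import urlencode

def public_annotations(page_url, limit=5):
    """Fetch public Hypothes.is annotations attached to a given page URL."""
    query = urlencode({"uri": page_url, "limit": limit})
    with urlopen(f"https://api.hypothes.is/api/search?{query}") as resp:
        results = json.load(resp)
    # Each row of the search response carries the annotator and the note text.
    return [(row.get("user", ""), row.get("text", ""))
            for row in results.get("rows", [])]

for user, text in public_annotations("https://example.com/article"):
    print(f"{user}: {text}")
```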
Information Wants Someone Else to Pay for It
Micah Altman, Director of MIT Libraries’ Program on Information Science, began the second day of the meeting by describing several current trends in authorship:
Data is being created at an ever increasing rate, and publication is no longer the end stage of information dissemination. We now tend to publish information as soon as it is created, then filter and edit it afterwards. Some websites even have a button labeled “Publish” to be used as part of the data creation process.
Collaborations have become multidisciplinary. The average number of collaborators on an article is now over 6, and some articles have thousands of authors.
Many new challenges to curating and evaluating the output of the scientific process have arisen. The information flow is changing and becoming increasingly complex.
We are tracking more types of information, and more processes are generating data than ever before. Digital technologies now allow us to "unpack" and re-bundle scholarly information.
How do we communicate trustworthiness and how do we enable corrections and annotations? Provenance, the source of the information, is important.
Production methods have changed, which raises new questions.
No single organization can preserve and maintain all the information upon which it relies. Many individuals maintain their own repositories, which they may be reluctant to share.
Mark Jacobson, a consultant at Delta Think Inc., said that Google and other search giants have had a significant influence on users’ expectations. Complex user interfaces have disappeared, and we now have the ability to string small apps together to make personalized and focused tools. We still must understand customers’ experiences and keep technology current and flexible. It is important for developers to become very agile and understand the compromises necessary to build successful products.
Flash Builds: Rapid Prototyping
JSTOR has recently created a new team, JSTOR Labs, which is using "Flash Builds" to create prototypes of new products. Alex Humphreys, Associate Vice President, JSTOR Labs, described two recent projects:
JSTOR Snap allows a user to take a photo of a page with a smartphone and receive a list of relevant articles on the same topic from the JSTOR database. It was built in only a week with the participation of students and faculty at the University of Michigan. (A sketch of what such a pipeline might look like appears after these project descriptions.)
In a partnership with the Folger Shakespeare Library, a JSTOR Labs team used Flash Builds to create an app linking the lines in plays with research articles that quoted them. The primary text thus became a portal to the scholarship and promoted the value of Folger’s Digital Texts.
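JSTOR has not published Snap's internals, so the following is only a guess at the general shape of such a pipeline: OCR the photo, distill the text into a few distinctive terms, and submit those as a search query. The OCR step and the search call are stubbed out here, and all names are hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "that", "for", "as"}

def key_terms(page_text, n=5):
    """Pick the most frequent non-stopword terms as a crude topic signature."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
    return [w for w, _ in counts.most_common(n)]

# In the real app this text would come from OCR of the phone photo, and
# the resulting terms would be sent to JSTOR's search service; both of
# those steps are stubbed out in this sketch.
ocr_text = """Hamlet's delay has been read as melancholy, as Oedipal
conflict, and as a structural necessity of revenge tragedy. The play
stages revenge as a problem of knowledge rather than of will."""

query = " ".join(key_terms(ocr_text))
print("Search query:", query)
```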
Humphreys said that several ingredients are necessary for projects like this to succeed. The most important, of course, is the team, which must be small but have technical, design, and business skills combined. It is also necessary to have a place to innovate with both technical and cultural support, so that there is a safe place to fail, a comfort level with uncertainty, and dedicated time to work without the distractions of meetings, e-mail, etc. The prototypes must be shown to users early and often, and in the presence of the whole team. The team must recognize what it can and cannot learn from users and must collaborate openly. Partners can help the team see the world through users' eyes and understand the environment in which the product must operate. For those interested in these types of projects, Humphreys recommended the following books: Eric Ries, The Lean Startup (Crown Business, 2011); Laura Klein, UX for Lean Startups (O'Reilly Media, 2013); and Alistair Croll and Benjamin Yoskovitz, Lean Analytics (O'Reilly Media, 2013).
An Integrated Knowledge Platform
Tim Clark, a member of the Scientific Advisory Board of OpenPHACTS, noted that informatics is essential in the drug development process; all companies search the scientific literature, download relevant articles, and store them in proprietary databases. But there is little sharing of ideas across sources and providers. Over the last decade, open data has become more widely available, but there has been little integration of it.
OpenPHACTS, funded by the European Community, is a semantic data integration platform that seeks to address common problems in pharmaceutical research, such as patent expiry, competition from generic drugs, cost containment, and the need for increased R&D productivity. The system is built around the scientific questions that support business activities in drug development. It includes a viewer that allows terms, chemical structures, and other entities to be unpacked from documents, and it lets the user annotate the PDF; when the HTML version of the PDF is opened, the annotations appear in the correct spot in the document. The OpenPHACTS Foundation is a nonprofit membership organization that supports the project.
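OpenPHACTS's own API is not shown here; the sketch below only illustrates the basic integration idea behind such platforms: records about one compound held by two sources are merged through a curated cross-reference table. The datasets and identifiers are reduced to toy form.

```python
# Toy illustration of semantic data integration (not the OpenPHACTS API):
# two sources describe the same compound under different identifiers.
chembl = {"CHEMBL25": {"name": "aspirin", "targets": ["PTGS1", "PTGS2"]}}
drugbank = {"DB00945": {"name": "Aspirin", "indication": "pain, fever"}}

# Cross-reference table mapping one source's IDs onto the other's;
# in a real platform this alignment is itself curated data.
same_as = {"CHEMBL25": "DB00945"}

integrated = {}
for chembl_id, rec in chembl.items():
    merged = dict(rec)
    db_id = same_as.get(chembl_id)
    if db_id and db_id in drugbank:
        merged.update(drugbank[db_id])  # later source wins on conflicts
    integrated[chembl_id] = merged

print(integrated["CHEMBL25"])
# {'name': 'Aspirin', 'targets': ['PTGS1', 'PTGS2'], 'indication': 'pain, fever'}
```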
New Content For Researchers
Victor Camlek, President, Camlek Concepts, reviewed the use of social networks by healthcare professionals. Social networks play a role in doctors’ prescribing choices. According to a 2012 study published in the Journal of Medical Internet Research, one in four physicians uses specialized social networks as a source of medical information. In the U.S., Sermo, Doximity, and QuantiaMD are the most heavily used; outside the U.S., physicians use Neuros.org, DXY.cn, and Networks in Health. The most significant concerns involve patient privacy regulations and rules of ethics and conduct.
Karie Kirkpatrick, Digital Publications Manager, American Physiological Society (APS), said that content is everywhere: we buy, share, and find lots of it on social media and publisher websites. Researchers have specific needs when choosing a search system.
Kirkpatrick gave these examples of products that fulfill those needs:
MIT’s Press Batches are bundles of 9 to 12 articles on topics identified by altmetric studies.
APS offers APSelect, groups of 8 to 10 articles selected monthly by APS editors.
eLife Lens is an HTML-based viewer that improves the readability of journal articles by presenting parts of an article (citations, references, tables, etc.) in a side panel so that the reader can view them alongside the text without scrolling through the article.
She suggested several possible enhancements that would improve content presentation: a Newsmap-type view of Google news or other popular content, content analysis using Google’s Ngram viewer, and Harvard’s StackLife browser that organizes books held by multiple libraries as if they were on a single shelf.
Karim Boughida, Associate University Librarian for Digital Initiatives and Content Management, George Washington University, noted that libraries exist for their users and are intermediaries between users and publishers, but they are not doing a good job of using machines to help their users. Machines are the gateway to reusing content, and the web is changing from human-readable to machine-readable. He also noted that one developer has said that systems have 3 seconds to convince a user not to use the Back button.
In today’s academic libraries:
We still do discovery the same way as we have for the last 20 years, and we are focusing less on making the library a starting point.
Open access is inevitable; digital collections have led to repository chaos.
We are entering the post-monograph era, but junior faculty must still publish monographs to get tenure.
The U.S. is not doing well in designing systems for accessibility. Harvard and MIT have been sued over the lack of closed captions in online lectures.
Linked and open data are in an early experimental phase. OCLC has created the Evolving Scholarly Record (ESR) Framework which provides a view of material encompassed by the scholarly record and the roles associated with its creation, management, and use.
The Changing Landscape of Scholarly Communication
In his very challenging Members-only Luncheon Address, "Quo Vadis? The Changing Landscape of Scholarly Communication," Keith Webster, Dean of Libraries at Carnegie Mellon University, gave a sobering view of the changing world of libraries and how they must adapt to survive. He began with a list of phenomena in today's world of academic libraries:
Students crowd libraries without using them (in the traditional way).
The world of open science is beginning to transform scholarly communication.
The success of e-journals has driven the researcher from the library.
Web-based knowledge and research tools are growing, often outside the institution.
Open access has shaped policy agendas.
Library budgets are under pressure.
Shareholders and venture capitalists expect returns on their investments, leading to pressure between customers and investors.
Events of the world are bypassing university libraries. A study by the Research Information Network ("Researchers and Discovery Services: Behaviour, Perceptions, and Needs", November 2006) found that contact between researchers and information professionals is rare. Researchers are generally confident in their self-taught abilities, but librarians see them as relatively unsophisticated; and although librarians see a problem in not reaching researchers with formal training, the researchers do not think they need it. Our library ecosystem is clearly under threat.
Webster traced the development of libraries through five generations, from collections to the provision of collaborative knowledge, media, and fabrication facilities. He noted that the generations have been cumulative; we did not stop doing things from a previous generation when a new one arrived. There has been huge growth in scientific output in the last 30 years, particularly in China. Twitter is spreading awareness of articles; the number of multi-author papers is growing rapidly; and the journal article is no longer necessarily the true record of research. In-house journals are being sold; open access is financially challenging; and conferences may be under threat as new generations of researchers lose interest in joining professional associations.
Libraries are not going out of business, but they have missed the mark in connecting with researchers to ensure their long-term viability and have not shared in government investments in science. They are therefore having difficulties demonstrating their impact on their institutions. Instead of focusing on the size of their collections, libraries must think about their impact on teaching and research.
If researchers had to acquire their own resources, where would they go for information? Many would approach colleagues, purchase directly from publishers, or use open access repositories, which might cause librarians to become embedded in their users’ environments, become specialists in evaluation of technology, and thus continue to have a meaningful impact on the research process. Libraries would be repurposed as learning spaces, with the focus on collection development shifting to local curation.
New Ways of Interacting With Content
Kate Lawrence, Vice President, User Experience Research at EBSCO Information Services, has studied today's student researchers and said that information providers must accept the reality that students are no longer adapting to our tools, rules, and processes. The emotional experience of the research process matters; Lawrence showed a word cloud of the feelings student researchers reported.
Google is everything to students. They also love Wikipedia because it first presents an overview of a topic in layman’s language, then a table of contents for the page, followed by references for further research. EBSCO has responded to these findings by creating “Research Starters” for education students that contain articles, reading lists, topic overviews, and discussions chosen from an analysis of courses at leading universities.
Library websites are a challenge to today's students because they use language and terminology that are not widely understood; as one student said, "I don't speak 'library-ese'." To succeed in the student research market, information providers must accept the influence of Google and Wikipedia. It is important to recognize that search results are a destination, to design for binary decision making, and to create experiences that make students want to return to your site.
Pierre Montagno, Business Development Director, Squid Solutions, said that both content and delivery are now important. We must develop an experience that meets the expectations set by Google. The user has become king, and we must find out what users want. The best way to study users is to watch what they do online and learn how they are getting to your site. Path analysis will reveal the most frequent user paths through the site and determine how long it takes users to get results.
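A minimal sketch of this kind of path analysis follows, assuming hypothetical per-session clickstream logs; the log format and page names are invented for illustration.

```python
from collections import Counter

# Hypothetical per-session clickstreams: (page, seconds_on_page) pairs.
sessions = [
    [("home", 5), ("search", 12), ("results", 8), ("article", 90)],
    [("home", 4), ("browse", 30), ("article", 60)],
    [("home", 6), ("search", 10), ("results", 7), ("article", 120)],
]

# Most frequent paths through the site.
path_counts = Counter(tuple(page for page, _ in s) for s in sessions)

def time_to_article(session):
    """Seconds elapsed before the user reaches an article (the 'result')."""
    elapsed = 0
    for page, seconds in session:
        if page == "article":
            return elapsed
        elapsed += seconds
    return None

for path, n in path_counts.most_common():
    print(" -> ".join(path), f"({n} sessions)")
print("Seconds to reach an article:",
      sorted(time_to_article(s) for s in sessions))
```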
Alex Wade, Director of Scholarly Communications at Microsoft Research, noted that there has been a decrease in domain-specific databases and a rise in web-scale searches, with the result that searchers are finding what they searched for but not necessarily what they needed. Microsoft Cortana, an intelligent personal assistant, attempts to move beyond the search box to a more long-term engagement with the user. To get rich semantics, one must know the user (their location, calendar, and preferences) and link that data to knowledge of the world. The result is a more personal user experience. For example, a user could get answers to a search based on the results of a previous one, an instance of the "proactive discovery" made possible by knowing the user's immediate environment.
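Cortana's internals are proprietary, so the following is only a toy illustration of the general idea of context-aware ranking: results that overlap with the user's recent queries are boosted, so a new search is answered in light of the previous one.

```python
# Toy context-aware ranking (not Cortana's actual method): results that
# share terms with the user's recent queries score higher.
def rerank(results, recent_queries):
    context = {w for q in recent_queries for w in q.lower().split()}
    def score(title):
        return sum(1 for w in title.lower().split() if w in context)
    return sorted(results, key=score, reverse=True)

results = ["Flight times to Boston", "Boston weather this weekend",
           "History of Boston"]
recent = ["book flight to boston", "weekend weather"]
print(rerank(results, recent))
# The flight and weather results outrank the generic history page.
```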
Business Models and New Policy Impacts
The final day of the meeting began with two panel discussions: business models for startups, and policies for today’s changing environment. The meeting wrapped up with the awards luncheon and a closing keynote address on the future of the internet, cloud, and big data. Here are some of the points made in the panel discussions:
Business Models: Partner, Build, or Acquire
Help users to discover what is happening, especially on the fringes of a field.
Make connections through content to people; help increase research performance and impact. Researchers want their work to be read; empower them to have an active role in the post-publication impact of their work.
It is expensive to market directly to researchers; link to partners and work with publishers who have an interest in getting their content used.
Most startups do not realize how difficult it is to gain users.
Bringing an idea to life as a new product or service is exciting, something like parenting. Always think about what is needed to make the company very successful.
Normalizing content is a critical and difficult step.
The best way to make people aware of your existence is by personal networking. Get press coverage and have a business development person on the team.
Impacts of Policy
Publishers are continuing to look at problems in a traditional way. It is important to try to understand what users do.
Sometimes traditional policies get in the way of what we want to do. We must work on change management.
Societies must be sure they are meeting the needs of their authors and that they remain an outstanding place to publish research, which will increase the emphasis on marketing to authors.
We have not yet figured out the model to move from subscriptions to open access. See Cory Doctorow’s book Information Doesn’t Want to Be Free: Laws for the Internet Age (McSweeney’s, 2014).
Metadata is really an advertisement for your content. It must be continuously and publicly available, both in the language of publication and in English.
We cannot know what users will want five years from now, but we have a responsibility to think about that even as we provide for them now. There are many ways to present information; we must give it to users in the way they want it. Consider repackaging books (“chunking” them) for those who want only parts of a book.
Policies must be reviewed regularly and validated. Some of them are helpful in opening opportunities for change. See Doc Searls's The Intention Economy: When Customers Take Charge (Harvard Business Review Press, 2012) for a description of how policies are changing. Policies should grow out of best practices, not the other way around.
We want services that give us personalized information, but we are also concerned about privacy. Privacy and customized services will always be at odds with each other.
Where Do We Go From Here?
The closing address by Michael Nelson, consultant with CloudFlare, Inc. and Adjunct Professor, Georgetown University, was very appropriate because it gave a look at the future. He said that in the next 10 years, we will see as much change as we saw in the last 20. Big data is a hot topic now because:
The supply is huge and growing. We are entering the zettabyte era (1 zettabyte = 1 billion terabytes) with the availability of internet traffic data, photos on Facebook and Flickr, Twitter updates, and data from widespread sensors. With billions of ways to combine it, all that data generates more data.
Tools for data manipulation have become widely available. The Internet of Things, open source data analytics tools, and massive data sources from the government have all led to what Nelson calls “Cloud+”.
Demand for data is growing and has led to several dreams by users:
Situational awareness: knowing what is happening and where,
X-ray vision: smartphone apps, image detectors, and similar tools that put the data you need where you need it,
Real-time prediction: actionable traffic and weather forecasts, predictive analytics,
A super “sidekick”: technology that knows what you want,
A team of sidekicks: online collaboration anywhere anytime, and
A team of everyone: social media, real-time polling, videoconferencing.
These trends are reshaping our economies. The Cloud+ gives people almost free cycles of processing and storage, which makes it easier for them to get the information and tools they need, improves collaboration, and enables them to contribute their time and expertise to new projects. The results are more innovation in more places, stronger communities, and a safer, more sustainable, and more prosperous society.
CloudFlare is a new system that routes traffic from over 2 million websites to users and protects them from malware, denial of service attacks, and the like. The company can see what is happening on the web and identify cyberattacks. CloudFlare is less than 5 years old and has 130 employees. It has 32 data centers and handles about 5% of today's web queries. It is one example of how the Cloud+ is improving the quality of life for billions of people. But the benefits are not guaranteed. Governments and businesses must make the right choices when setting internet policy. The internet could have grown up like the cable TV networks, where the owners control the content. It succeeded because enlightened choices were made in its early days. Most countries have done as little as possible to regulate the internet, which has been good for innovators.
Today’s vision of the digital economy revolves around innovative ecosystems: the ability to share information while protecting private information, more open companies, erring on the side of transparency, and learning to share. We need smart policies, smart technologies, a smart culture, and smart intellectual property rights. The key to all of this is trust.
Nelson closed with a list of suggested readings.
Next year, the NFAIS annual meeting will return to Philadelphia in February.
Donald T. Hawkins is an information industry freelance writer based in Pennsylvania. In addition to blogging and writing about conferences for Against the Grain, he blogs the Computers in Libraries and Internet Librarian conferences for Information Today, Inc. (ITI) and maintains the Conference Calendar on the ITI website. He is the Editor of Personal Archiving (Information Today, 2013) and is currently editing a book on public information. He holds a Ph.D. degree from the University of California, Berkeley and has worked in the online information industry for over 40 years.