Thenoisychannel.com

HCIR 2012: A Personal Report

2012-10-08

Human-computer information retrieval (HCIR) is the study of information retrieval techniques that integrate human intelligence and algorithmic search to help people explore, understand, and use information. Since 2007, the HCIR Symposium (previously known as the HCIR Workshop) has provided a venue for the theoretical and practical study of HCIR. We even inspired a EuroHCIR workshop across the pond that started in 2011 and is going strong.

Overview

The Sixth Symposium on Human-Computer Interaction and Information Retrieval (HCIR 2012) took place on October 4th and 5th at IBM Research in Cambridge, MA. The 75 attendees represented a cross-section of HCIR research and practice. Over a third of the attendees were from industry — including startups and large technology firms. We had a similar diversity of sponsors, benefiting from the generosity of FXPAL, IBM Research, LinkedIn, Mendeley, Microsoft Research, MIT CSAIL, and Oracle. And we had participants from 6 countries: Canada, Germany, Israel, New Zealand, Switzerland, and the United States.

Keynote

We started the Symposium with a keynote from UC Berkeley professor Marti Hearst, a pioneer in the area of search user interfaces, as well as a prominent researcher of information visualization, natural language processing, and social media analysis. Marti set the tone for the symposium with a visionary keynote that she entitled her “Halloween Cauldron of Ideas for Research”.

She started by talking about the unaddressed seams of sensemaking, reminding us that information seeking is only one part of an overall sensemaking process. She used the challenge of saving and personally organizing search results as an example of a neglected but crucial part of a search interface.

She then challenged us to think about how audio could be used in search interfaces. She cited a study showing that programmers comment their code better when the commenting interface uses speech rather than the keyboard. She then challenged us to consider how auditory notification or feedback could enhance the search experience.

Finally, she presented the idea of “radical collaboration”, offering as an example the use of Mechanical Turk to crowdsource vacation planning. The plans were tested by real tourists, who were delighted with the results.

Marti’s keynote was not only insightful and entertaining (one of her slides featured brain cupcakes!), but notable in how much she engaged all of us in discussion throughout her presentation. This approach was especially appropriate for an HCIR Symposium, given our emphasis on human interaction. For more detail about the keynote, I recommend Gene Golovchinsky’s summary.

Short Paper Presentations

After a coffee break, we had a session devoted to 5 short papers. Each presenter had 10 minutes: 5 minutes to present and 5 minutes for discussion.

We started off with UXLabs director Tony Russell-Rose presenting “Designing for Consumer Search Behaviour”, joint work with University College London researcher Stephann Makri. Tony could not attend in person, so he submitted a video. He presented a framework for describing consumer search behavior along with concrete examples, many of them familiar from the time that Tony and I both worked at Endeca. Most of all, he emphasized the need to close the gap between information science research and industry practice.

Then MIT professor (and Haystack principal investigator) David Karger talked about “Standards Opportunities around Data-Bearing Web Pages”. He argued that there is a small set of standard user interface patterns for authoring structured data: text search, sorting by properties, presenting items in a template, and faceted browsing. He then advocated that these primitives (which have already been implemented in the popular Exhibit framework) be incorporated into a W3C standard so that content authors can use them with the expectation that all modern browsers support them.

Next, Harvard student Elena Agapie presented joint work that she did at FXPAL with Gene Golovchinsky and Pernilla Qvarfordt, entitled “Encouraging Behavior: A Foray into Persuasive Computing”. Information retrieval researchers and practitioners have often argued that longer queries lead to better retrieval performance. But how do we get users to enter longer queries? Elena and colleagues found that the best way was not to explicitly tell them that longer queries are better, but rather to present a halo around the search box that changes color as the query gets longer. A very interesting application of persuasive technology to search!

Then Rutgers student Roberto González-Ibáñez presented joint work with Chirag Shah and Ryen White on “Pseudo-Collaboration as a Method to Perform Selective Algorithmic Mediation in Collaborative IR Systems”. He presented a novel approach that identified when a user should be aided by a collaborator, and to what extent such help could enhance the user’s search success. An interesting way to achieve the benefits of both user-mediated and system-mediated collaboration.

Finally, University of Washington student Jeff Huang presented joint work with Abdigani Diriye on “Web User Interaction Mining from Touch-Enabled Mobile Devices”. He focused on the practical concerns of instrumenting interaction with search engines in mobile environments. Specifically, he suggested tracking the viewport coordinates, that is, the visible portion of the page at any given time.

The short presentation format was extremely effective, encouraging presenters to communicate their ideas efficiently and leaving ample time for discussion.

Posters and Demos

As in previous years, we followed lunch with a vibrant session for posters and demos. Some of the more popular poster themes included question answering, task difficulty, and collaborative information seeking. Here is the full list of poster / demo presentations:

Developing a Typology of Online Q&A Models and Recommending the Right Model for Each Question Type
Erik Choi, Vanessa Kitzie, Chirag Shah

Investigating Positive and Negative Affects in Collaborative Information Seeking: A Pilot Study Report
Roberto González-Ibáñez, Chirag Shah

To Ask or Not to Ask, That is The Question: Investigating Methods and Motivations for Online Q&A
Vanessa Kitzie, Erik Choi, Chirag Shah

Information Seeking Tasks: Why Do Searchers Feel Difficult?
Jingjing Liu, Chang Suk Kim

Finding Literary Themes with Relevance Feedback
Aditi Muralidharan, Marti Hearst

InFrame-Browsing: Enhancing Standard Web Search
Marcus Nitsche, Andreas Nürnberger

Trailblazer: Towards the Design of an Exploratory Search User Interface
Marcus Nitsche, Andreas Nürnberger

min: A Multi-Modal Web Interface for Math Search
Christopher Sasarak, Kevin Hart, Siyu Zhu, Richard Pospesel, David Stalnaker, Lei Hu, Robert Livolsi, Richard Zanibbi

Search Tactics in Collaborative Exploratory Web Search
Zhen Yue, Shuguang Han, Daqing He

Developing a Dual-Process Information-Seeking Model for Exploratory Search
Michael Zarro

Interactive Data Mining at the Speed of Thought
Vladimir Zelevinsky

Do Users with Different Domain Knowledge Select Different Sets of Documents?
Xiangmin Zhang, Jingjing Liu, Xiaojun Yuan, Michael Cole, Nicholas Belkin, Chang Liu

Predicting Task Difficulty from a User’s Moment to Moment Cognitive Effort During Information Seeking
Michael Cole, Jacek Gwizdka, Chang Liu, Nicholas Belkin

Effects of Domain Knowledge on User Task Performance in a Knowledge Domain Visualization System
Xiaojun Yuan, Chaomei Chen, Xiangmin Zhang, Joshua Avery, Tao Xu

Investigating the Effect of Visualization on User Performance of Information Systems
Xiaojun Yuan

Full Paper Presentations

The full paper presentations were split into two sessions, the first held on the 4th and the second held on the 5th. Each presentation slot was 30 minutes. The full papers will be made available soon through the ACM Digital Library.

University of Magdeburg student Marcus Nitsche presented “Knowledge Journey: A Web Search Interface for Young Users”, joint work with Tatiana Gossen and Andreas Nürnberger. The authors performed a study in which they found that children liked having personalized avatars that offer guidance, a wheel-shaped browsing menu, and a coverflow-style results presentation. It will be interesting to see how their study holds up in larger-scale user studies, and whether adults like some of these interface elements too.

Oregon State University professor Carlos Jensen presented “Leyline: Provenance-Based Search Using a Graphical Sketchpad”, joint work with Seyedsoroush Ghorashi. I was intrigued to see a search approach focused entirely on provenance, that is, the history of a document’s ownership and transformations. I’m particularly curious about this area, since I’m a committee member for Aleatha Parker-Wood, who is pursuing a dissertation on “Making Sense of File Systems Through Provenance and Rich Metadata”.

University of Waterloo professor Mark Smucker presented joint work with Charlie Clarke on “Modeling User Variance in Time-Biased Gain”. Their simulation-based approach produced distributions of gain that agree with distributions produced by real users. By emphasizing the effect size of differences, their approach could help uncover how much the performance differences among systems matter to real users.

Finally, University of North Carolina at Chapel Hill professor Barbara Wildemuth and University of British Columbia professor Luanne Freund delivered a highly interactive presentation on “Assigning Search Tasks Designed to Elicit Exploratory Search Behaviors”. They performed an extensive survey of information exploration literature to identify concepts that authors have used to characterize exploratory search tasks. They tested examples on the audience to see how well we agreed with their characterization and with one another.

HCIR Challenge

With Friday morning came the most anticipated event of the Symposium: the HCIR Challenge. The Challenge is now in its third year: the 2010 Challenge focused on historical exploration of news using the New York Times Annotated Corpus; the 2011 Challenge focused on the problem of information availability using the CiteSeer digital library of scientific literature.

This year, we turned to the problem of people and expertise finding, a topic of obvious personal interest. We are grateful to Mendeley for providing this year’s corpus: a database of over a million researcher profiles with associated metadata including published papers, academic status, disciplines, awards, and more taken from Mendeley’s network of 1.6M+ researchers and 180M+ academic documents.

We asked participants to build systems that could perform three kinds of tasks:

Hiring. Given a job description, produce a set of suitable candidates for the position.

Assembling a Conference Program. Given a conference’s past history, produce a set of suitable candidates for keynotes, program committee members, etc. for the conference.

Finding People to deliver Patent Research or Expert Testimony. Given a patent, produce a set of suitable candidates who could deliver relevant research or expert testimony for use in a trial. These people can be further segmented, e.g., students and other practitioners might be good at the research, while more senior experts might be more credible in high-stakes litigation.

Each of the 5 teams was given 30 minutes to present.

École Polytechnique Fédérale de Lausanne student Na Li presented “Magnifico: A Platform For Expert Mining Using Metadata”, joint work with Lei Zhou and Denis Gillet. Magnifico used a modified TF-IDF approach, in which the IDF is an inverse discipline frequency, to match search queries to topic experts. It also assigned a multi-disciplinary reputation metric based on the expertise distribution of an author’s readers.
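For readers who want a concrete picture of what an "inverse discipline frequency" could look like, here is a minimal sketch. It is not Magnifico's implementation: the talk summary above does not give the formula, so the logarithmic weighting, the function names, and the toy corpus layout (a mapping from discipline name to tokenized documents) are all assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_discipline_frequency(term, docs_by_discipline):
    """Assumed form: log(N / n_t), where N is the number of disciplines and
    n_t is the number of disciplines with at least one document containing
    the term. Returns 0.0 for terms that appear in no discipline."""
    n_disciplines = len(docs_by_discipline)
    containing = sum(
        1
        for docs in docs_by_discipline.values()
        if any(term in doc for doc in docs)
    )
    if containing == 0:
        return 0.0
    return math.log(n_disciplines / containing)


def score_author(query_terms, author_docs, docs_by_discipline):
    """Sum of (term frequency in the author's documents) times the
    hypothetical inverse discipline frequency, over the query terms."""
    tf = Counter(token for doc in author_docs for token in doc)
    return sum(
        tf[term] * inverse_discipline_frequency(term, docs_by_discipline)
        for term in query_terms
    )


# Toy corpus (hypothetical): each document is a list of tokens.
corpus = {
    "computer science": [["search", "ranking"]],
    "biology": [["protein", "search"]],
    "physics": [["quantum"]],
}
author_docs = [["ranking", "ranking", "search"]]
ranking_score = score_author(["ranking"], author_docs, corpus)
```

The intuition is the same as ordinary IDF: a term such as "search" that shows up across many disciplines says little about where an author's expertise lies, while a term confined to one discipline is weighted more heavily.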

Ben-Gurion University student Dima Kagan presented “Social Network Based Search for Experts”, joint work with Yehonatan Bitton, Michael Fire, Bracha Shapira, Lior Rokach, and Judit Bar-Ilan. Their system made excellent use of additional publicly available data, cross-referencing the Mendeley user profiles with data from Academia.edu and using Microsoft Academic Search to categorize publications and journals. You can try out their application here.

University of Pittsburgh student Shuguang Han presented “IRIS-IPS: An Interactive People Search System for HCIR Challenge”, joint work with Daqing He, Zhen Yue, Jiepu Jiang, and Wei Jeng. The system used three different types of evidence to suggest candidates: expertise relevance, authority based on a PageRank algorithm applied to the co-authorship network, and social similarity using the Jaccard similarity between co-authors.
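Two of those signals are easy to illustrate. The sketch below is not the IRIS-IPS code: it assumes the co-authorship network is an undirected graph stored as a dictionary of neighbor sets, uses a plain power-iteration PageRank with the conventional damping factor of 0.85, and compares two researchers by the Jaccard similarity of their co-author sets. All names and the toy graph are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over an undirected graph given as
    {node: set_of_neighbors}. Every neighbor must also be a key."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            neighbors = graph[node]
            if neighbors:
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # An author with no co-authors spreads rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy co-authorship network (hypothetical): "a" has written with "b" and "c".
coauthors = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
authority = pagerank(coauthors)
similarity = jaccard(coauthors["b"], coauthors["c"])
```

In this toy network the best-connected author "a" ends up with the highest authority score, while "b" and "c", who share exactly the same co-authors, get a Jaccard similarity of 1.0.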

Luanne Freund and Kristof Kessler, both from the University of British Columbia, presented “Exposing and exploring academic expertise with Virtu”, joint work with Michael Huggett and Edie Rasmussen. Virtu takes a task-based approach to expertise, exposing and giving the user control over dimensions of expertise that are more or less desirable depending on the type of expert-finding task. The search interface supports information interaction and exploration through a number of browsing and filtering tools, including facets and sliders. You can try out their application here.

UCLA student Fei Liu presented the “‘iF’ People Search System”, an impressive solo effort. Also unique among the entries, iF is a mobile application, designed for the iPad and supporting swipe and multi-touch gestures. A very slick application, iF offered a novel approach to exploring the corpus of documents and people using the analysis of their reputations and social network relationships.

THE WINNER: Virtu! The competition was fierce, but Virtu stood out for the compelling approach it took to offering users control over the expert-finding process. Congratulations to Luanne, Kristof, and their colleagues for their outstanding work and well-deserved honor.

Reception

After we wrapped up the first day of the Symposium, we walked over to the nearby Technique, a restaurant in the Athenaeum Press building (home to two of Endeca’s offices in our early years) where students of Le Cordon Bleu practice their culinary skills. I’m no master chef, but I certainly hope these students earned excellent grades for their performance. We enjoyed a delightful sampling of wines, appetizers, main courses, and desserts.

Conclusion

HCIR has been getting better every year, and this year was no exception. Many attendees in previous years had felt that the one-day format made the event feel rushed, and expanding to a second day relieved much of the time pressure. We had ample opportunity for discussion, during the presentations as well as at the coffee breaks and reception. Finally, the Challenge was our best yet, eliciting extraordinary results from the five participating teams.

I’m proud of how far we’ve taken HCIR in these six years, and especially grateful to co-organizers Robert Capra, Gene Golovchinsky, Bill Kules, Catherine Smith, and Ryen White.

Time to start thinking about HCIR 2013!