The following is a guest post by Cullen Hendrix of the University of Denver.
If you’ve read or seen Moneyball, the following anecdote will be familiar to you: Baseball is a complex sport requiring a diverse, often hard-to-quantify[1] skillset. Before the 2000s, baseball talent scouts relied heavily on a variety of heuristics marked by varying degrees of sanity: whether the player had a toned physique, whether the player had an attractive girlfriend, and whether the player seemed arrogant (this was seen as a good thing). Billy Beane and the Oakland Athletics changed things with a radical concept: instead of relying entirely on hoary seers and their tea-leaf reading, you might look at data on players’ actual productivity and form assessments that way. This thinking was revolutionary little more than a decade ago; now it’s the way every baseball team does business.
At roughly the same time, physicist Jorge Hirsch was starting a revolution of his own. Hirsch was ruminating on a simple question: what constitutes a productive scholar? Since the implicit answer to this question informs all our hiring, promotion, and firing decisions, the answer is pretty important. In 2005, Hirsch published “An Index to Quantify an Individual’s Scientific Research Output”, which introduced the world to the h-index. Like most revolutionary ideas, its brilliance lay in its simplicity. Here’s the abstract:
I propose the index h, defined as the number of papers with citation number ≥h, as a useful index to characterize the scientific output of a researcher.
Thus, a metrics revolution was born. Hirsch had distilled information on citations and number of papers published into a simple metric that could be used to compare researchers and forecast their scholarly trajectory. That metric is at the heart of Google Scholar’s attempts to rank journals and forms the core of its scholar profiles. With Google’s constant indexing and unrivaled breadth, Google Scholar is fast becoming the industry standard for citation metrics. Its scholar profiles track three basic statistics: total citations, the h-index, and the i10 index, which is simply the number of articles/books/etc. a researcher has published that have at least 10 citations.
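For the code-inclined, here’s a minimal sketch of how those three statistics are computed from a list of per-paper citation counts. It’s in Python, and the citation counts below are made up for illustration; nothing here comes from Hirsch’s paper beyond the definitions themselves.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

# Hypothetical citation counts for one scholar's papers
papers = [312, 148, 45, 22, 11, 9, 6, 3, 1, 0]
print(sum(papers))        # total citations: 557
print(h_index(papers))    # 6: six papers have 6 or more citations each
print(i10_index(papers))  # 5: five papers have 10 or more citations
```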
So, what do these metrics say about productivity in the field of international relations, and why should we care?
Citation metrics are worth investigating for at least two reasons. First, metrics tell us something about how the work is being used. Stellar book reviews, gushing tenure letters, and article awards may tell us how the work is perceived, but those perceptions can be highly idiosyncratic. And, if we’re being honest, they can be driven by a host of factors (how well you are liked personally, the kind of shadow your advisor casts, whether the letter writer had just survived a shark attack, membership in the Skull and Bones Society, etc.) that have little if anything to do with the quality and/or the usefulness of the work in question. Yes, metrics do not completely get around these biases – see Maliniak et al. on the gender citation gap in IR – but that bias is much easier to account for than the idiosyncrasies of reviewers and letter writers. Show me a book that was reviewed harshly but eventually cited 1,000 times and I’ll show you a game changer. Show me a book that was reviewed glowingly and that has had virtually no quantifiable impact and I’ll show you a dud.
Second, this information may be useful to people submitting files for tenure and promotion to full. Before I started putting together my file, I realized I was completely unaware of what constituted a good citation record for someone who had been out for seven years. I’d heard in various places that an h-index equal to or greater than years since PhD was a good rule of thumb, but that standard seems to have been designed with physicists in mind, and physicists publish faster than most people type. If you hear grumblings that you should have an h-index of such-and-such come tenure time, it would be good to know whether that bar is low or high, given prevailing citation patterns in the discipline.
With the help of RAs, I compiled data on the 1,000 most highly cited IR scholars according to Google Scholar.[2] Then, the RAs collected supplemental information on the year in which each scholar’s PhD had been granted (PhD Year). The sample is one of convenience, based on those individuals who listed “International Relations” as one of the tags in their profile and for whom the year their PhD was granted could be ascertained. For this reason, many highly (and lowly) cited individuals did not appear on the list.[3] However, the list includes all sorts: realists, liberals, constructivists, feminists, formal theorists, etc., and at all manner of institutions, though the bias is toward research universities. The list appears to be dominated by people at universities in the USA, UK, Canada and Australia.
Descriptive statistics for the group are as follows:
| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| GS citations | 713 | 915.2 | 2804.9 | 0 | 40978 |
| ln GS citations | 713 | 4.8 | 2.2 | 0 | 10.6 |
| h-index | 713 | 8.5 | 8.9 | 0 | 73 |
| i10 index | 713 | 10.6 | 18.3 | 0 | 188 |
| ln i10 index | 713 | 1.6 | 1.3 | 0 | 5.2 |
| Most Cited | 713 | 184.9 | 567.7 | 0 | 9429 |
| ln Most Cited | 713 | 3.4 | 2.1 | 0 | 9.2 |
| Most Cited Solo | 713 | 121.3 | 361.9 | 0 | 4620 |
| ln Most Cited Solo | 713 | 3.2 | 1.9 | 0 | 8.4 |
| PhD Year | 713 | 2003.5 | 9.4 | 1961 | 2015 |
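If you want to see how a table like this comes together, here’s a rough sketch. The file name and column names (gs_citations, h_index, i10, most_cited, most_cited_solo, phd_year) are placeholders I’ve made up; the original coding was done by hand from Google Scholar profiles, and the log transform shown is just one common way to handle zero counts, not necessarily the one used here.

```python
import numpy as np
import pandas as pd

# Hypothetical file and column names standing in for the hand-coded data
df = pd.read_csv("ir_scholars_gs.csv")

# Logged versions of the count variables; log1p (i.e., ln(1 + x)) keeps
# zero-count scholars in the sample -- an assumption, not necessarily the
# transform used in the post.
for col in ["gs_citations", "i10", "most_cited", "most_cited_solo"]:
    df["ln_" + col] = np.log1p(df[col])

# Obs, mean, std. dev., min, and max for each variable
summary = df.describe().T[["count", "mean", "std", "min", "max"]]
print(summary.round(1))
```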
I plan to crunch the numbers in a variety of ways. For the moment, a cursory look at the data yields some potentially interesting insights:
Most scholars are not cited all that frequently. It’s time to take a deep breath when worrying about your citation count. Yes, the Joe Nyes and Kathryn Sikkinks of the world can give us all a little count envy, but the median total citation count for all 713 scholars in the sample was 119. That includes at least one person who got their PhD while John F. Kennedy was still president. If we just look at people who got their PhD since 2000, the median is 57. That the mean is so much higher than the median tells us what many of us suspect is true: it’s a pretty unequal world. The top 10% of cite-getters in the sample account for ~75% of all the citations.
The “h-index ≥ years since PhD” rule of thumb for scholarly productivity is probably a bit high, at least for IR scholars. The mean ratio of h-index to years since PhD in this sample is closer to 0.76 (a rough version of that calculation is sketched after these points). A tenure case with an h-index of 6 six years out from their PhD would be in the 75th percentile of this group. This information is the kind of thing that should be conveyed to university-wide promotion and tenure committees, as notions of what constitutes productivity vary widely across fields. The 500th-ranked IR scholar has 71 GS citations and an h-index of 5; the 500th-ranked physicist has a few more than that.
Co-authoring is pretty common. For 59% of scholars in the sample, their most highly cited article/book was solo-authored; for the remaining 41%, their most highly cited article/book was co-authored. Interestingly, it breaks down that way even if we just look at people who got their PhD since 2000. Co-authoring, at least of IR scholars’ most influential works, does not appear to be such a recent fad.
Seriously? Nearly 30% of IR scholars don’t have a readily available CV that indicates the year they received their PhD? I feel no further comment is necessary.
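Here’s the rough calculation behind the rule-of-thumb point above, again using the made-up file and column names from the earlier sketch and treating the 2015 coding window as “now”:

```python
import pandas as pd

# Same hypothetical dataframe and column names as in the earlier sketch
df = pd.read_csv("ir_scholars_gs.csv")

# h-index per year since PhD, using 2015 (the coding window) as "now"
years_out = (2015 - df["phd_year"]).clip(lower=1)  # avoid dividing by zero for 2015 PhDs
ratio = df["h_index"] / years_out

print(ratio.mean())          # the post reports a mean of roughly 0.76
print((ratio < 1.0).mean())  # share of scholars falling short of the h >= years-since-PhD bar
```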
Diving a little deeper, I used locally weighted scatterplot smoothing (LOWESS) to estimate predicted GS citations, h-index, and i10 scores as a function of PhD year. The results are as follows and can be interpreted as the predicted mean of each metric for a given PhD year; I only go back to 1990, as data are rather sparse before then:
| PhD Year | Predicted GS Cites | Predicted h-index | Predicted i10 |
|---|---|---|---|
| 1990 | 2685.4 | 17.2 | 26.9 |
| 1991 | 2510.9 | 16.6 | 25.5 |
| 1992 | 2339.8 | 15.9 | 24.2 |
| 1993 | 2174.3 | 15.3 | 22.9 |
| 1994 | 2012.4 | 14.7 | 21.5 |
| 1995 | 1852.2 | 14.0 | 20.2 |
| 1996 | 1698.1 | 13.3 | 18.9 |
| 1997 | 1549.4 | 12.7 | 17.6 |
| 1998 | 1399.8 | 12.0 | 16.3 |
| 1999 | 1260.7 | 11.3 | 15.0 |
| 2000 | 1132.8 | 10.7 | 13.8 |
| 2001 | 1006.5 | 10.1 | 12.7 |
| 2002 | 880.4 | 9.4 | 11.4 |
| 2003 | 765.2 | 8.7 | 10.2 |
| 2004 | 640.6 | 8.1 | 9.0 |
| 2005 | 506.3 | 7.4 | 7.8 |
| 2006 | 393.7 | 6.7 | 6.5 |
| 2007 | 305.5 | 6.0 | 5.5 |
| 2008 | 223.3 | 5.3 | 4.5 |
| 2009 | 170.8 | 4.8 | 3.6 |
| 2010 | 135.4 | 4.3 | 3.0 |
| 2011 | 108.9 | 3.8 | 2.4 |
| 2012 | 87.0 | 3.3 | 1.9 |
| 2013 | 64.9 | 2.9 | 1.4 |
| 2014 | 47.2 | 2.6 | 1.1 |
| 2015 | 46.2 | 2.5 | 1.0 |
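For anyone who wants to reproduce this kind of smoothing, here’s a sketch using the lowess smoother in statsmodels. It again assumes the made-up file and column names from above, and the bandwidth (frac) is an arbitrary choice for illustration, not the setting used to generate the table:

```python
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess

df = pd.read_csv("ir_scholars_gs.csv")  # hypothetical file/column names, as above
recent = df[df["phd_year"] >= 1990]

# Locally weighted scatterplot smoothing of citations on PhD year;
# frac controls the bandwidth and is an arbitrary choice here.
fit = lowess(recent["gs_citations"], recent["phd_year"], frac=0.3)

# lowess returns (phd_year, smoothed value) pairs, one per observation;
# collapse to a single predicted value per PhD year.
predicted = (
    pd.DataFrame(fit, columns=["phd_year", "pred_gs_citations"])
    .groupby("phd_year")
    .mean()
)
print(predicted.round(1))
```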
These predictions are pretty clearly biased upwards by the presence of publishing all-stars (the aforementioned top 10%, plus very highly cited junior and mid-career people) whose citation counts skew the distribution. Here’s the same table, but substituting the observed median values by PhD year:
| PhD Year | Median GS Cites | Median h-index | Median i10 | N |
|---|---|---|---|---|
| 1990 | 1786.0 | 18.0 | 24.0 | 9 |
| 1991 | 2160.0 | 19.0 | 27.0 | 9 |
| 1992 | 1491.5 | 19.0 | 29.5 | 10 |
| 1993 | 1654.0 | 18.0 | 22.5 | 16 |
| 1994 | 1643.0 | 15.0 | 19.0 | 9 |
| 1995 | 1983.0 | 16.0 | 17.0 | 7 |
| 1996 | 583.5 | 9.5 | 9.5 | 20 |
| 1997 | 396.0 | 10.0 | 10.0 | 15 |
| 1998 | 376.0 | 10.0 | 11.0 | 11 |
| 1999 | 755.0 | 12.5 | 14.5 | 24 |
| 2000 | 701.0 | 11.0 | 12.0 | 19 |
| 2001 | 301.0 | 9.5 | 9.5 | 18 |
| 2002 | 153.5 | 6.0 | 4.0 | 28 |
| 2003 | 220.0 | 8.0 | 7.0 | 28 |
| 2004 | 213.0 | 7.0 | 5.0 | 25 |
| 2005 | 144.0 | 6.0 | 4.0 | 15 |
| 2006 | 105.0 | 5.0 | 3.5 | 38 |
| 2007 | 98.0 | 5.0 | 4.0 | 46 |
| 2008 | 78.0 | 5.0 | 2.0 | 54 |
| 2009 | 76.0 | 5.0 | 3.0 | 29 |
| 2010 | 34.5 | 3.0 | 1.0 | 42 |
| 2011 | 22.0 | 3.0 | 1.0 | 46 |
| 2012 | 21.0 | 2.0 | 0.0 | 41 |
| 2013 | 19.0 | 2.0 | 1.0 | 42 |
| 2014 | 17.0 | 2.0 | 0.0 | 33 |
| 2015 | 8.0 | 1.0 | 0.0 | 17 |
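The median table is even simpler to reproduce: group by PhD year and take medians and counts. Same caveat as before about the placeholder file and column names:

```python
import pandas as pd

df = pd.read_csv("ir_scholars_gs.csv")  # hypothetical file/column names, as above
recent = df[df["phd_year"] >= 1990]

# Median GS cites, h-index, and i10 by PhD year, plus the number of scholars per year
medians = recent.groupby("phd_year")[["gs_citations", "h_index", "i10"]].median()
medians["N"] = recent.groupby("phd_year").size()
print(medians)
```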
So if you’re a couple years out and your citation count is barely cracking double digits, don’t worry: you’re in pretty good company.
Some caveats are in order. First, everyone in the sample both has a Google Scholar profile and self-identifies as an IR scholar; there is self-selection bias all over these results. The nature of the bias is probably upward, inflating the sample citation metrics relative to those of the population of interest. This rests on the assumption that people who believe they are doing well, by these metrics, will be more likely to make this information public. I believe this issue is particularly acute for the most recent PhDs in the sample. Second, there is no good way to gauge the bias stemming from excluding those who do IR scholarship but do not self-identify as IR scholars. Third, I believe metrics should be a complement to, not a substitute for, our subjective evaluations of the work. They’re another useful piece of information in forming the assessments of bodies of scholarly work that make or break tenure, promotion, and hiring processes.
Metrics will never fully supplant subjective evaluations of theoretical, empirical and normative merit. But they provide a necessary complement to them. So, what did I miss? And what would you like to see done with these data?
[1] As was thought at the time, anyway.
[2] Coding took place between 7/16/15 and 8/3/15.
[3] Both my PhD mentors, Steph Haggard and Kristian Gleditsch, were left off the list. You’re killing me!