The Sloan Sports Analytics Conference took place last week. In the past, I've attended and enjoyed myself. When it comes to recognizing the value of Sloan, it's important to ask what it actually is. I view Sloan as a "sports geek comic-con": lots of people with a similar interest gathering in one place to see big names talk. It's also a great networking event if you're looking for a job with a team or a major media outlet. What it is not is an academic conference, despite its tie to MIT's business school. I examined the two NBA papers that were selected as Sloan research papers, and sadly I found both underwhelming. I'll walk through the problems I had with each paper and end with a discussion of what Sloan should be.
The two papers I'll be reviewing are:
Accounting for Complementary Skill Sets When Evaluating NBA Players' Values to a Specific Team
This paper examines play-by-play data to attempt to find how much players help or hurt their teammates.
Recognizing and Analyzing Ball Screen Defense in the NBA
This paper uses sportVU data to analyze the effectiveness of different defenses against ball screens.
References Analysis
One basic part of doing research is examining related work, and in any given paper you're expected to cite your sources. Sadly, both of the NBA papers accepted as research papers this year are lacking on this front.
"Recognizing and Analyzing Ball Screen Defense in the NBA" only lists four references, listed below, with some notes by me.
[1] http://wwwstats.com/sportvu/sportvu.asp. STATS sportVU, 2015
This is a hyperlink to the sportVU data, I'd assume. Except if you follow the link, it's actually broken.
[2] Armand McQueen, Jenna Wiens, and John Guttag. Automatically recognizing on-ball screens. In 2014 MIT Sloan Sports Analytics Conference, 2014.
You'll recognize this as a Sloan paper from the same group two years ago. Not strictly terrible, but still. We'll get back to this soon.
[3] Kevin P Murphy. Machine learning: a probabilistic perspective. MIT press, 2012.
This is a textbook on machine learning.
[4] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825–2830, 2011.
This is a four-page, high-level description of a Python library for machine learning. Worth noting: its reference section is much bigger than this Sloan paper's.
Of the four sources, three are basic references (a broken data link, a textbook, and a library description) and one is the group's own prior work, which I'll get to in a bit. Now, this may not be terrible, as it's possible the "real research" is in the data they examined. Sadly, that doesn't end up being the case.
"Accounting for Complementary Skill Sets When Evaluating NBA Players' Values to a Specific Team"
[1] Peter Arcidiacono, Josh Kinsler, and Joseph Price. Productivity spillovers in team production: Evidence from professional basketball, 2014.
This is a mostly solid paper that is forthcoming in the Journal of Labor Economics. I'll note that it isn't listed as the basis of the Sloan research; rather, it's noted as "other people are also doing similar research." That paper is more thoroughly researched and uses more data. So it's an interesting link, as it essentially says: "there is better work being done."
[2] ESPN. ESPN NBA player salaries, 2015. URL http://www.espn.go.com/nba/salaries.
A link to player salaries at ESPN.
[3] Sports Illustrated. Sports Illustrated NBA play-by-play, 2015. URL http://www.si.com.
Oddly, the URL provided only goes to the Sports Illustrated home page, not to the play-by-play data itself.
[4] Allan Maymin, Philip Maymin, and Eugene Shen. NBA chemistry: Positive and negative synergies in basketball. In 2012 MIT Sloan Sports Analytics Conference, 2012.
[5] Min-hwan Oh, Suraj Keshri, and Garud Iyengar. Graphical model for basketball match simulation. In 2015 MIT Sloan Sports Analytics Conference, 2015.
I'll just note that the last two papers are previous Sloan papers.
Neither paper provides a very comprehensive reference section. In fact, the trend seems to be to look only at past Sloan work and to reference basic data sources. Now, again, this isn't strictly a problem, except it means the research on the actual data itself has to be good. That's our next point.
Data Analysis - Part 1
"Accounting for Complementary Skill Sets When Evaluating NBA Players' Values to a Specific Team" decided to examine play by play data for the 2014-2015 season. As per the paper:
The data I use is play-by-play data from SI.com[3] for the 2014-2015 NBA season. With this data I record the 10 players on the court for each possession and the detailed result of the possession. To avoid trying to calculate ratings for players with few possessions, I only look at the 250 players with the most possessions during the 2014-2015 season. All the rest of the players are considered "replacement" players. I also use data on player salaries for the 2015-2016 NBA season from ESPN.com[2].
A few notes. This is only one season of data, which I find a bit odd, particularly since we have play-by-play data back to 2001; Basketball-Reference, for instance, offers a tool for examining actions back to the 2000-2001 season. Second, only the top 250 players in terms of possessions are used (I'm not sure if this is total, per-minute, etc.), and this also seems off. In 2014-15, the players sitting in the mid-200s by minutes played included many important players; the easiest one that springs to mind is Hassan Whiteside. In short, this data set seems severely limited, especially as the one paper it references above used four seasons of data.
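As an aside, the "top 250 by possessions, everyone else is a replacement player" step the paper describes is easy to sketch, and just as easy to extend to more seasons. Here's a minimal, hypothetical version in pandas; the data and column names are my own invention, not the paper's:

```python
# Hypothetical sketch of the paper's "top 250 players by possessions" filter.
# The data below is randomly generated; column names are placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Fake play-by-play: one row per (possession, player on court).
pbp = pd.DataFrame({
    "possession_id": rng.integers(0, 20_000, size=100_000),
    "player_id": rng.integers(0, 450, size=100_000),  # roughly 450 NBA players
})

# Count distinct possessions per player and keep the 250 most-used players.
possessions_per_player = pbp.groupby("player_id")["possession_id"].nunique()
top_250 = possessions_per_player.nlargest(250).index

# Everyone outside the top 250 gets pooled into a single "replacement" bucket.
pbp["player_group"] = pbp["player_id"].astype(str).where(
    pbp["player_id"].isin(top_250), other="replacement"
)
print(pbp["player_group"].nunique())  # 251: the top 250 plus "replacement"
```

At this level of the pipeline, at least, swapping in more seasons is a cheap change, which makes the single-season restriction feel more like a choice than a constraint.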
The rest of the paper goes on to explain a long series of steps that try to examine how each action a player takes impacts the rates of their teammates, for both offense and defense. While interesting, one major issue I had is that we are never given any indication of how well this model actually works. Nevertheless, many pages are devoted to very specific lineup analysis, including a full page of lineup analysis around LaMarcus Aldridge's free agency.
Finally, the author runs a regression of current NBA salaries on their metric for the players and the players' position. What's iffy is that only one of the variables is found to be statistically significant (a player's actions on offense), and from that they conclude teams don't care about teamwork when evaluating players. There are more issues too. Using only 2014-2015 data against current salaries doesn't make sense: many of these contracts were signed in previous years, or belong to rookies. What's more, the author spends several sentences discussing variables that weren't found to be significant, which makes no sense. I could go on with the issues in this research, but I'll just note there has been tons and tons of research on player productivity and salary in the NBA (Dave Berri's research page has a ton of it that's easy to find), and this paper both ignores that work and proceeds to do it worse.
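For readers who want to see what that kind of test looks like, here's a minimal, hypothetical sketch of a salary regression and a significance check using statsmodels. The data is synthetic and the variable names are mine, not the paper's:

```python
# Hypothetical sketch of a salary-on-metrics regression; synthetic data and
# placeholder column names -- not the paper's actual variables or results.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 250  # matching the paper's 250-player sample

players = pd.DataFrame({
    "offense_metric": rng.normal(size=n),
    "defense_metric": rng.normal(size=n),
    "is_guard": rng.integers(0, 2, size=n),  # crude position dummy
})
# Fake salaries tied (noisily) to the offense metric only, to mimic a
# "only one coefficient comes out significant" situation.
players["salary_millions"] = 8 + 3 * players["offense_metric"] + rng.normal(scale=4, size=n)

X = sm.add_constant(players[["offense_metric", "defense_metric", "is_guard"]])
model = sm.OLS(players["salary_millions"], X).fit()
print(model.summary())  # the P>|t| column is where "statistically significant" comes from
```

Even in this toy setup, a coefficient failing to reach significance only means the data can't distinguish it from zero; it doesn't license a strong claim that teams ignore that trait, which is part of my complaint above.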
To finish up on this paper: a very limited data set is used to build a very complicated model. We are not told how well this model works at explaining what it's trying to explain. The model is then applied to very specific examples. Finally, a shoddy regression against salary is run. And as mentioned, the paper opens by acknowledging that other work using more data is already being done.
Data Analysis - Part 2
"Recognizing and Analyzing Ball Screen Defense in the NBA" has a much more interesting data set. They have sportVU data (cameras that track player location at 25 frames per second) from the 2011-2012 season through most of the 2014-2015 season. They note that the 2013-2014 data set is the most complete. I'm not sure if they did their work midway through the 2014-2015 season, or if the data feed they were given was cut off. Regardless, with almost three thousand games of data, and hundreds of thousands of game interactions, I was excited. Using this data, the authors built a model to automatically detect ball screens and the various defenses to them. Awesome? Well, kind of. Here's their data for training their model:
To build our training set of data, we watched film from six games from the 2012-2013 regular season. In total we hand labeled a set of 340 attempts to defend a ball screen. Each attempt was labeled based on its most defining characteristic. For example, we would label an instance where the on-ball defender goes over and then a trap occurs as a trap, since the occurrence of a trap defines the screen defense. In total our training set consisted of 199 instances of over, 56 instances of under, 57 instances of switch, and 28 instances of trap.
Thousands of games and they train it on six games' worth of film? Now, that would be alright if their classifier were great, except, by their own results, it's not. Using the final parameters they list, they can identify (see the sketch after the list for why the rarer labels carry such wide error bars):
Over - 78% of the time plus or minus 5%
Under - 65% of the time plus or minus 12%
Trap - 46% of the time plus or minus 36%!
Switch - 69% of the time plus or minus 11%
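Those error bars shouldn't surprise anyone given the class counts above: there are only 28 traps in the whole training set. Here's a hedged illustration of why such small classes produce such noisy estimates, using cross-validation in scikit-learn on a synthetic data set with the paper's class sizes. The features and classifier choice are mine, purely for illustration:

```python
# Illustration (not the authors' code): per-fold recall for a class with
# only 28 examples swings wildly under cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
counts = {"over": 199, "under": 56, "switch": 57, "trap": 28}  # paper's class sizes
y = np.repeat(list(counts), list(counts.values()))
X = rng.normal(size=(y.size, 10))  # fake features
signal = {"over": 0.0, "under": 0.4, "switch": 0.8, "trap": 1.2}
X[:, 0] += np.array([signal[c] for c in y])  # weak class-dependent signal to learn

clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each test fold holds only ~5-6 traps, so a single mistake moves the
# recall estimate by close to 20 percentage points.
for train, test in cv.split(X, y):
    clf.fit(X[train], y[train])
    pred = clf.predict(X[test])
    is_trap = y[test] == "trap"
    print("trap recall this fold:", round((pred[is_trap] == "trap").mean(), 2))
```

None of this says the authors' numbers are wrong; it just means a plus-or-minus 36% band on the trap estimate is exactly what you'd expect from 28 examples.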
Of course, this is based on their training set. Additionally, they get these numbers by adding a threshold that throws out many screens because the classifier "can't tell." And that brings up the original work this paper is built on, by the same authors: "Automatically Recognizing On-Ball Screens."
Obviously, the crux of all of this work is being able to use machine learning to examine lots of sportVU data and identify when a screen occurs. That earlier paper gives a graph of the tradeoff between correctly identifying screens and misidentifying non-screens (this is called an ROC curve). What it showed is that it's really hard to identify screens using their method: to correctly catch 90% of screens, the classifier would also flag about 30% of non-screens as screens. This leaves you with two options: knowingly leave out lots of real screens, or accept that your data set contains lots of "screens" that aren't actually screens.
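To make that tradeoff concrete, here's a minimal, hypothetical sketch of reading an operating point off an ROC curve with scikit-learn. The data and classifier are stand-ins, not the authors' screen detector:

```python
# Hypothetical ROC-curve sketch; synthetic data standing in for
# sportVU-derived screen/not-screen features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]  # "how screen-like is this play?"

fpr, tpr, thresholds = roc_curve(y_te, scores)

# Pick the first threshold that catches at least 90% of true screens and see
# what false-positive rate comes with it (the 90%-for-30% tradeoff described above).
idx = np.argmax(tpr >= 0.90)
print(f"catch {tpr[idx]:.0%} of screens, accept {fpr[idx]:.0%} false positives "
      f"(threshold {thresholds[idx]:.2f})")
```

Move the threshold and the two numbers trade against each other; there's no setting that makes both problems go away, which is the bind the screen detector is in.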
This matters because the defense classifier (trained on 340 data points) is set loose on 270,823 detected ball screens (well, actually only 51,451, as over 200,000 of the screens are labeled "unclear" and thrown out). To reiterate: a classifier that is not super great at finding ball screens finds a bunch of ball screens. Then a classifier that is not super great at identifying ball-screen defense is run on those. Then the authors go into very minute detail about specific players and defenses. Whew!
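A quick back-of-the-envelope number shows how the two stages compound. The screen detector's recall here is an assumed, illustrative figure; only the 46% trap rate comes from the paper:

```python
# Hypothetical error-compounding arithmetic. screen_detector_recall is an
# assumed number for illustration; 0.46 is the paper's reported trap rate.
screen_detector_recall = 0.90  # assumed share of real ball screens the detector catches
trap_recognition_rate = 0.46   # traps correctly identified among classified screens

# Rough share of actual trap coverages that survive both stages correctly,
# treating the two error sources as roughly independent.
print(screen_detector_recall * trap_recognition_rate)  # ~0.41
```

And that's before accounting for the 200,000-plus screens thrown out as "unclear," so the player-level breakdowns at the end of the paper are resting on a wobbly base.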
Finishing Up
I'll be candid: I decided to review these papers and, given their size (each is under 10 pages, with lots of charts), I assumed I'd get a quick article out of it. The number of issues I found in both papers grew very quickly. The primary point is that these two papers used limited data to build iffy models. In both cases, lots of data was excluded or knowingly mislabeled. Then the authors went into intricate, player-level detail using those models. And as noted, neither paper appeared to build on much prior research. These are not good papers. Sorry. Now, I'll give an out: both have some fascinating ideas and exciting data sets. But neither has, in my opinion, done the work needed to be considered worthy of a research paper. As I told Dave Berri while I was reviewing these: "I say this as someone who has turned in a bad semester-long graduate project or two. These qualify as bad semester-long graduate projects."
I'll again stress that we should be asking what the Sloan Sports Analytics Conference is trying to be. It's a place for data nerds to meet and network. It's essentially a convention; it reminds me a lot of a comic book convention. There are already special big-name guests giving panels, for instance. When I've attended, some of the most popular people were those showing off their work on laptops in the resting areas outside the presentation rooms. Going forward, I'd recommend this be what Sloan aims for: a place to meet celebrities, network, and show off your work. There are already research paper sessions and areas where people could display their in-progress work. It is not, however, an academic conference. Year after year I see poor research papers that are inexplicably accepted. They seem to be getting worse! It's time for Sloan to admit it doesn't have the staff to correctly vet these papers and to stop taking iffy work. At least, that's my two cents.
-Dre