MLB DFS Strategy: Small Sample Sizes and Adjusting Projection Baselines
Earlier this week, a great question in our forums about evaluating small sample sizes on individual skill sets gave me an opportunity to revisit one of my favorite articles I’ve ever read on FanGraphs. Published in 2009, the article remains relevant to the work we do when adjusting player baselines and projections throughout the season. It’s a very quick read that explains some of the background on sample size, the peripheral indicators we value, and when to start treating changes in them as meaningful. Within it is a link to the larger research article most of the data is pulled from, and at the bottom you’ll find similar links to more updated research. In general the results are similar, so I like to link back to the original.
One of the things I love about this research is how intuitive the results are. The peripherals that reached a level of predictability the fastest were the indicators most controlled by the batter. The batter can control how often they swing. They can control how often they make contact and thus how often they strike out. They can control how hard they hit the ball, but what they can’t control is the result once the ball is put in play. Most hard line drives fall for base hits, but not all of them. As you’d expect, things like Home Run/Fly Ball Rate take longer to stabilize than things like swing rate or contact rate.
You’ll also notice a few indicators start becoming more reliable around 100 plate appearances. In general, most everyday hitters approach 100 plate appearances per month, and we’re almost a month into the season! This is when we start evaluating in-season player performance and adjusting projected baselines for players in our model, so it makes sense to approach this topic while it’s fresh in our minds. You don’t want to overwrite multiple seasons of performance with just the first 100 plate appearances of a season, but it’s appropriate to adjust players’ baselines as the season goes on. This gives us a better chance of capturing changes in performance and skill level without erasing all of the context our baseline projection was founded on. Ultimately, combining past history with current performance creates more accurate player baselines. We don’t want to do this too early in the season, before indicators reach potentially predictive thresholds, because we don’t want to overweight small samples.
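To make the idea of combining past history with current performance concrete, here is a minimal sketch of one common approach: a plate-appearance-weighted blend that shrinks the in-season rate toward the prior baseline. This is an illustration, not our actual model. The function name `blend_rate` and the `stabilization_pa` parameter are hypothetical, and treating the stabilization point as the sample size at which this season gets equal weight with the baseline is an assumption.

```python
def blend_rate(baseline_rate, season_rate, season_pa, stabilization_pa):
    """Blend a multi-year baseline with an in-season rate.

    Hypothetical helper: the in-season weight grows with plate appearances
    and reaches 50 percent when season_pa equals stabilization_pa
    (an assumed weighting scheme, not a published formula).
    """
    if season_pa < 0 or stabilization_pa <= 0:
        raise ValueError("season_pa must be >= 0 and stabilization_pa must be > 0")
    # Small samples get little weight; large samples approach full weight.
    weight = season_pa / (season_pa + stabilization_pa)
    return weight * season_rate + (1 - weight) * baseline_rate


# Example inputs: 74 PA, 48.7 percent baseline swing rate,
# 58.5 percent swing rate this season, assumed 50 PA stabilization point.
adjusted = blend_rate(48.7, 58.5, season_pa=74, stabilization_pa=50)
print(f"Adjusted swing rate: {adjusted:.1f}%")
```

With those inputs the blend lands at roughly 54.5 percent, between the baseline and the in-season rate. With zero plate appearances it simply returns the baseline, which is the behavior we want early in the season.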
Over the next few weeks, one of the first areas we’ll start examining is swing percentage and contact rate. Pizza Cutter’s research found that swing percentage becomes reliable at about 50 plate appearances and contact rate starts becoming reliable at about 100 plate appearances. In this article, we’re going to highlight some of the hitters around Major League Baseball who have seen the biggest changes in their swing rates and contact rates this season compared to the prior few years of data. Specifically, we’re going to examine the largest changes from 2014-2016 to this season.
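The screen described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our production code: the `HitterRate` class, the `flag_changes` function, and the `RELIABLE_PA` mapping are hypothetical names, the five-point cutoff mirrors the one used in this article, and the 50 and 100 plate appearance thresholds are the figures cited above.

```python
from dataclasses import dataclass

# Assumed reliability thresholds (plate appearances), per the figures cited above.
RELIABLE_PA = {"swing": 50, "contact": 100}


@dataclass
class HitterRate:
    player: str
    pa: int              # plate appearances this season
    prior_rate: float    # multi-year rate, in percent
    current_rate: float  # this season's rate, in percent


def flag_changes(hitters, stat, min_delta=5.0):
    """Return (player, delta, sample_is_reliable) for large rate changes.

    Delta is in percentage points (current minus prior), rounded to one
    decimal so floating-point noise does not drop a hitter at the cutoff.
    """
    threshold = RELIABLE_PA[stat]
    flagged = []
    for h in hitters:
        delta = round(h.current_rate - h.prior_rate, 1)
        if abs(delta) >= min_delta:
            flagged.append((h.player, delta, h.pa >= threshold))
    # Largest absolute change first.
    return sorted(flagged, key=lambda row: abs(row[1]), reverse=True)


swing_sample = [
    HitterRate("Evan Longoria", 74, 48.7, 58.5),
    HitterRate("Jose Altuve", 82, 51.6, 41.3),
    HitterRate("Hypothetical Hitter", 60, 50.0, 52.0),  # made-up row, filtered out
]
for player, delta, reliable in flag_changes(swing_sample, "swing"):
    print(player, delta, "reliable sample" if reliable else "small sample")
```

Keeping the reliability flag next to each delta makes it harder to forget that a large change in a stat that has not yet reached its threshold is something to monitor rather than act on.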
Let’s start with swing rate. Below are the players who are swinging at least five percentage points more frequently than in the past few seasons:
Player | PA | 2014-16 Swing % | 2016 Swing % | Swing Delta
Evan Longoria | 74 | 48.70% | 58.50% | 9.80%
Brandon Phillips | 65 | 55.00% | 64.60% | 9.60%
Andrelton Simmons | 70 | 48.50% | 57.30% | 8.80%
Dee Gordon | 72 | 48.00% | 55.90% | 7.90%
Yonder Alonso | 52 | 47.20% | 54.40% | 7.20%
Yan Gomes | 52 | 52.10% | 59.20% | 7.10%
Khris Davis | 62 | 50.50% | 57.50% | 7.00%
Nick Castellanos | 57 | 51.50% | 58.10% | 6.60%
Nick Ahmed | 67 | 48.80% | 55.10% | 6.30%
Kole Calhoun | 71 | 48.80% | 54.90% | 6.10%
Edwin Encarnacion | 80 | 44.10% | 50.20% | 6.10%
Anthony Rendon | 74 | 40.30% | 45.70% | 5.40%
Josh Harrison | 77 | 54.30% | 59.70% | 5.40%
Nori Aoki | 73 | 45.70% | 50.90% | 5.20%
Joc Pederson | 59 | 43.00% | 48.10% | 5.10%
Swinging more frequently isn’t always a bad thing. Sometimes, hitters can be too patient. However, an elevated swing rate can sometimes accompany declining bat speed. As players age, their bats slow and they need to find ways to adjust. One of the most common ways is to start their swings earlier and “cheat” a little bit to try to overcome the lack of bat speed. As a result, I like to look for aging veterans who may be trying to speed up the bat early in the season. Brandon Phillips (CIN) fits the profile at age 34, and his results early in the season have been great. He’s shown more power than in the last few seasons and he’s making the most hard contact he’s made since 2010. He’s also seen far more pitches inside the strike zone than in recent years (50.5 percent this season, sub-45 percent since 2011), but his swinging strike rate is up. Once the league adjusts to Phillips’ more aggressive approach, we’d expect the power to fall off substantially.
On the flip side, here is a list of players swinging less frequently than in years past:
Player | PA | 2014-16 Swing % | 2016 Swing % | Swing Delta
Danny Espinosa | 59 | 52.50% | 41.80% | -10.70%
Jose Altuve | 82 | 51.60% | 41.30% | -10.30%
Chase Headley | 54 | 42.70% | 34.20% | -8.50%
J.D. Martinez | 69 | 54.40% | 46.60% | -7.80%
Scooter Gennett | 68 | 54.60% | 46.80% | -7.80%
Chris Davis | 72 | 47.50% | 39.70% | -7.80%
Chris Carter | 68 | 47.80% | 40.40% | -7.40%
Brett Gardner | 59 | 37.00% | 29.80% | -7.20%
Nolan Arenado | 73 | 53.80% | 46.70% | -7.10%
Colby Rasmus | 69 | 46.30% | 39.30% | -7.00%
Randal Grichuk | 61 | 51.40% | 44.60% | -6.80%
J.J. Hardy | 60 | 39.90% | 33.10% | -6.80%
Christian Yelich | 70 | 40.60% | 34.30% | -6.30%
Jon Jay | 78 | 48.60% | 42.40% | -6.20%
Hunter Pence | 81 | 46.60% | 40.50% | -6.10%
John Jaso | 69 | 44.70% | 38.80% | -5.90%
Ian Desmond | 70 | 49.90% | 44.00% | -5.90%
Dexter Fowler | 81 | 42.00% | 36.20% | -5.80%
Brian McCann | 55 | 42.90% | 37.10% | -5.80%
Joe Mauer | 80 | 39.90% | 34.20% | -5.70%
Mike Moustakas | 71 | 48.00% | 42.30% | -5.70%
Jose Abreu | 77 | 52.40% | 46.80% | -5.60%
Francisco Lindor | 65 | 49.80% | 44.20% | -5.60%
Mark Teixeira | 68 | 42.80% | 37.40% | -5.40%
Kolten Wong | 56 | 49.20% | 43.80% | -5.40%
Jason Kipnis | 65 | 42.60% | 37.20% | -5.40%
Daniel Murphy | 64 | 48.60% | 43.40% | -5.20%
Ryan Braun | 70 | 50.70% | 45.60% | -5.10%
Chase Utley | 69 | 41.70% | 36.60% | -5.10%
Jacoby Ellsbury | 68 | 46.80% | 41.80% | -5.00%
David Freese | 75 | 45.40% | 40.40% | -5.00%
Swinging less frequently is one way to increase your BB Rate (and, because it leads to deeper counts, often your K Rate as well), and it’s something we like to look at as potential growth in young hitters. Jose Altuve (HOU) is sporting a career-high 11 percent BB Rate (previous high was 6.3 percent), which goes hand in hand with swinging less frequently. There is a line where you can be “too patient,” but in general more selectivity at the plate often correlates with improved power as well as improved BB Rates. We’re seeing this early with Altuve (and a number of other players on this list). He’ll give some of this back over time, but an improved approach likely means even better things for the soon-to-be 26-year-old. This list is filled with players off to strong early season starts.
The contact rate samples aren’t yet reaching reliable levels, but they will within the next month. As with the swing rates above, we’ll post those who have seen an increase or decrease of at least five percentage points and then provide some comments below:
Contact rate decliners:
Player | 2016 PA | 2014-16 Contact % | 2016 Contact % | Contact Delta
David Wright | 69 | 79.80% | 60.20% | -19.60%
Khris Davis | 62 | 69.40% | 53.10% | -16.30%
Russell Martin | 59 | 78.80% | 65.10% | -13.70%
Alex Gordon | 67 | 77.60% | 64.40% | -13.20%
Ryan Zimmerman | 56 | 79.90% | 68.50% | -11.40%
Lorenzo Cain | 71 | 79.80% | 68.80% | -11.00%
Neil Walker | 67 | 82.00% | 71.70% | -10.30%
Jose Altuve | 82 | 89.90% | 80.50% | -9.40%
Marcus Semien | 64 | 78.30% | 69.10% | -9.20%
Daniel Murphy | 64 | 89.50% | 80.80% | -8.70%
Edwin Encarnacion | 80 | 79.50% | 71.20% | -8.30%
Andrew McCutchen | 82 | 76.70% | 68.40% | -8.30%
Yoenis Cespedes | 66 | 78.60% | 70.60% | -8.00%
Mitch Moreland | 62 | 75.50% | 67.70% | -7.80%
Corey Dickerson | 57 | 76.30% | 68.50% | -7.80%
Logan Forsythe | 68 | 83.20% | 75.70% | -7.50%
Hanley Ramirez | 73 | 81.30% | 73.90% | -7.40%
Omar Infante | 58 | 83.00% | 75.80% | -7.20%
Yonder Alonso | 52 | 85.20% | 78.00% | -7.20%
Troy Tulowitzki | 75 | 80.00% | 72.90% | -7.10%
Jason Kipnis | 65 | 83.00% | 76.00% | -7.00%
Rajai Davis | 56 | 82.40% | 75.50% | -6.90%
Prince Fielder | 78 | 80.80% | 74.00% | -6.80%
Ian Desmond | 70 | 72.40% | 65.80% | -6.60%
Jon Jay | 78 | 84.00% | 77.50% | -6.50%
Francisco Lindor | 65 | 82.00% | 75.50% | -6.50%
Desmond Jennings | 63 | 80.70% | 74.30% | -6.40%
Nick Markakis | 73 | 90.50% | 84.20% | -6.30%
Mookie Betts | 80 | 86.50% | 80.30% | -6.20%
Leonys Martin | 62 | 77.20% | 71.20% | -6.00%
Erick Aybar | 69 | 87.80% | 82.40% | -5.40%
Nori Aoki | 73 | 90.70% | 85.60% | -5.10%
Kyle Seager | 71 | 82.90% | 77.80% | -5.10%
Jayson Werth | 56 | 80.90% | 75.90% | -5.00%
Yasiel Puig | 70 | 74.10% | 69.10% | -5.00%
Contact rate gainers:
Player | 2016 PA | 2014-16 Contact % | 2016 Contact % | Contact Delta
John Jaso | 69 | 81.60% | 91.90% | 10.30%
Wilson Ramos | 54 | 78.90% | 87.50% | 8.60%
Kris Bryant | 82 | 67.30% | 75.80% | 8.50%
Bryce Harper | 73 | 74.80% | 83.10% | 8.30%
Jay Bruce | 71 | 75.00% | 82.70% | 7.70%
Brett Gardner | 59 | 84.10% | 91.70% | 7.60%
George Springer | 79 | 66.50% | 73.90% | 7.40%
Nolan Arenado | 73 | 82.50% | 89.70% | 7.20%
Melvin Upton Jr. | 68 | 69.10% | 76.20% | 7.10%
Brandon Belt | 78 | 74.60% | 81.00% | 6.40%
Joe Mauer | 80 | 85.00% | 90.80% | 5.80%
Andrelton Simmons | 70 | 88.60% | 94.40% | 5.80%
Ryan Braun | 70 | 79.20% | 84.90% | 5.70%
Zack Cozart | 53 | 86.90% | 92.30% | 5.40%
Jose Iglesias | 55 | 91.20% | 96.50% | 5.30%
Danny Espinosa | 59 | 70.30% | 75.30% | 5.00%
David Peralta | 80 | 78.60% | 83.60% | 5.00%
Let’s start with the worrisome contact rate issues. Again, we’re still shy of the plate appearance number where things show some predictive reliability, but wide gaps from expected performance are a concern. David Wright’s (NYM) contact woes early in the season are very alarming. Wright has hit for power early in the season, which has helped hide the fact that he’s striking out in nearly 35 percent of his plate appearances. Wright is 33 years old, and injuries erased his 2015 season. We’re two years removed from Wright playing a full season and three years removed from a truly productive one. While Wright’s early season results might hint at a return to productivity, the early season indicators suggest otherwise. Some other high-profile names we’re concerned about on the contact side early in the season include Andrew McCutchen (PIT), Edwin Encarnacion (TOR), Troy Tulowitzki (TOR), and Prince Fielder (TEX). All of these players are seeing decreased contact rates accompanied by decreased hard hit rates. Another thing to note here is the potential sampling bias: we’re often seeing pairs of teammates on these lists, which may simply suggest a team has faced pitchers who are tougher to make contact against. Remember, we’re still shy of the reliability mark, so we’re not ready to draw significant conclusions, but these are situations we’re monitoring.
On the positive side, we see some young players making more contact early in the season as they continue their ascent towards super-duper stardom. Bryce Harper (WAS), Kris Bryant (CHC), Nolan Arenado (COL), and George Springer (HOU) are all making even more contact than they have in recent years. Strikeouts were a problem for Bryant and Springer before but weren’t big concerns for Harper or Arenado, who are both getting rather ridiculous. One other notable on this list is Melvin Upton Jr. (SD), who has had a bit of a rebirth early this season in San Diego and has been a recent nuisance to our strikeout hopes when picking on the Padres.
As we work our way through some baseline projection updates over the next week, we wanted to give you a peek into our process behind the scenes and how we incorporate in-season indicators into our analysis. As samples approach reliability, players will receive slight adjustments to their baselines that will ultimately impact their own ranking and the strength of the matchup for opposing starters. Often, our content may feel a little repetitive as we recommend similar players in similar matchups over and over, but that’s the result of sticking to a process. As we adjust baselines during the season, you may see some slight differences in the names popping up in different matchups, and that is a direct result of adjusting to the new information coming in. With “data” becoming less of a competitive advantage in the marketplace, we wanted to emphasize some of the ways we actually put the data to use and help you avoid some of the pitfalls of putting too much weight on data that hasn’t stabilized.
The post MLB DFS Strategy: Small Sample Sizes and Adjusting Projection Baselines appeared first on DailyRoto.