2016-04-25

MLB DFS Strategy: Small Sample Sizes and Adjusting Projection Baselines

Earlier this week, a great question in our forums about evaluating small sample sizes on individual skill sets gave me an opportunity to revisit one of my favorite Fangraphs articles. Published in 2009, it remains relevant to the work we do when adjusting player baselines and projections throughout the season. It’s a very quick read. Within it is a link to the larger research article that most of the data is pulled from, and at the bottom you’ll find similar links that lead to more updated research; the results are broadly similar, so I like to link back to the original. The article helps you understand some of the background on sample size, the peripheral indicators we value, and when to start treating changes in them as meaningful.

One of the things I love about this research is how intuitive the results are. The peripherals that reached a level of predictability the fastest were the indicators most controlled by the batter. The batter can control how often they swing. They can control how often they make contact and thus how often they strike out. They can control how hard they hit the ball, but what they can’t control is the result of the play once the ball is put in play. Most hard line drives fall for base hits, but not all of them. As you’d expect, things like Home Run/Fly Ball Rate take longer to normalize than things like swing rate or contact rate.

You’ll also notice a few indicators start becoming more reliable around 100 plate appearances. In general, most everyday hitters approach 100 plate appearances per month, and we’re almost a month into the season! This is when we start evaluating in-season player performance and adjusting projected baselines for players in our model, so it makes sense to approach this topic while it’s fresh in our minds. While you don’t want to overwrite multiple seasons of performance with just the first 100 plate appearances in a season, it’s appropriate to adjust players’ baselines as the season goes on. This gives us a better chance at capturing changes in player performance and skill level without erasing all of the context that our baseline projection was founded on. Ultimately, a combination of past history and current performance will help create more accurate player baselines. We don’t want to do this too early in the season, before indicators reach potentially predictive thresholds, because we don’t want to overweight small samples.
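One simple way to picture combining past history with current performance is a weighted average that treats the multi-year baseline as if it were worth a fixed number of plate appearances. The sketch below is only an illustration of that idea, not the method our model actually uses: the function name `blend_rate`, the `prior_weight_pa` parameter, and the value of 50 are all assumptions chosen for the example. The sample numbers are Jose Altuve’s swing rates from the table later in this article.

```python
def blend_rate(prior_rate, current_rate, current_pa, prior_weight_pa):
    """Shrink an in-season rate toward a multi-year baseline.

    prior_weight_pa is a hypothetical tuning constant: the number of
    plate appearances the baseline is treated as being "worth".
    """
    if current_pa < 0 or prior_weight_pa <= 0:
        raise ValueError("current_pa must be >= 0 and prior_weight_pa > 0")
    total_pa = current_pa + prior_weight_pa
    return (current_rate * current_pa + prior_rate * prior_weight_pa) / total_pa


# Altuve: 51.6% baseline swing rate, 41.3% this season over 82 PA.
# The weight of 50 PA is an assumption made for illustration only.
adjusted = blend_rate(prior_rate=51.6, current_rate=41.3,
                      current_pa=82, prior_weight_pa=50)
print(round(adjusted, 1))  # 45.2
```

With no in-season plate appearances the function returns the baseline unchanged, and the in-season rate gradually takes over as the sample grows, which is the behavior we want from a small-sample adjustment.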

Over the next few weeks, one of the first areas we’ll start examining is swing percentage and contact rates. Pizza Cutter’s research found that swing percentage becomes reliable at about 50 plate appearances and contact rates start becoming reliable at 100 plate appearances. In this article, we’re going to highlight some of the hitters around Major League Baseball who have seen the biggest changes in their swing rates and contact rates this season compared to the prior few years of data. Specifically, we’re going to examine the largest changes from 2014-2016 to this season.
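The comparison described above boils down to subtracting a hitter’s multi-year rate from his current-season rate and keeping the largest gaps. Here is a minimal sketch of that filter. The names `SwingLine` and `flag_changes` are made up for illustration, the 50-PA floor is an assumption borrowed from the reliability figure cited above rather than a stated rule, and the two “Hypothetical Hitter” rows are invented to show what gets filtered out.

```python
from typing import NamedTuple


class SwingLine(NamedTuple):
    player: str
    pa: int               # current-season plate appearances
    baseline_pct: float   # multi-year swing rate, in percent
    current_pct: float    # current-season swing rate, in percent


def flag_changes(lines, min_pa=50, min_delta=5.0):
    """Return (player, delta) pairs whose swing rate moved by at least
    min_delta percentage points, sorted from biggest gain to biggest drop."""
    flagged = []
    for line in lines:
        delta = round(line.current_pct - line.baseline_pct, 1)
        if line.pa >= min_pa and abs(delta) >= min_delta:
            flagged.append((line.player, delta))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


sample = [
    SwingLine("Evan Longoria", 74, 48.7, 58.5),
    SwingLine("Jose Altuve", 82, 51.6, 41.3),
    SwingLine("Hypothetical Hitter A", 30, 45.0, 55.0),  # invented: too few PA
    SwingLine("Hypothetical Hitter B", 70, 45.0, 47.0),  # invented: small change
]

for player, delta in flag_changes(sample):
    print(f"{player}: {delta:+.1f} percentage points")
```

The same function works for contact rates by swapping in contact percentages; only the plate appearance floor would need to change.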

Let’s start with swing rate. Below are the players who are swinging at least five percentage points more frequently than in the past few seasons:

| Player | PA | 14-16 Swing % | 16 Swing % | Swing Delta |
|---|---|---|---|---|
| Evan Longoria | 74 | 48.70% | 58.50% | 9.80% |
| Brandon Phillips | 65 | 55.00% | 64.60% | 9.60% |
| Andrelton Simmons | 70 | 48.50% | 57.30% | 8.80% |
| Dee Gordon | 72 | 48.00% | 55.90% | 7.90% |
| Yonder Alonso | 52 | 47.20% | 54.40% | 7.20% |
| Yan Gomes | 52 | 52.10% | 59.20% | 7.10% |
| Khris Davis | 62 | 50.50% | 57.50% | 7.00% |
| Nick Castellanos | 57 | 51.50% | 58.10% | 6.60% |
| Nick Ahmed | 67 | 48.80% | 55.10% | 6.30% |
| Kole Calhoun | 71 | 48.80% | 54.90% | 6.10% |
| Edwin Encarnacion | 80 | 44.10% | 50.20% | 6.10% |
| Anthony Rendon | 74 | 40.30% | 45.70% | 5.40% |
| Josh Harrison | 77 | 54.30% | 59.70% | 5.40% |
| Nori Aoki | 73 | 45.70% | 50.90% | 5.20% |
| Joc Pederson | 59 | 43.00% | 48.10% | 5.10% |

Swinging more frequently isn’t always a bad thing. Sometimes, hitters can be too patient. However, an elevated swing rate can sometimes accompany declining bat speed. As players age, their bat speeds slow and they need to find ways to adjust. One of the most common ways is to start their swings earlier and “cheat” a little bit to try to overcome the lack of bat speed. As a result, I like to look for aging veterans who may be trying to speed up the bat early in the season. Brandon Phillips (CIN) fits the profile (age 34) and his results early in the season have been great. He’s shown more power than in the last few seasons and he’s making the most hard contact he’s made since 2010. He’s also seen far more pitches inside the strike zone than in recent years (50.5 percent this season, sub-45 percent since 2011), but his swinging strike rates are up. Once the league adjusts to Phillips’ more aggressive approach, we’d expect the power to fall off substantially.

On the flip side, here is a list of players swinging less frequently than in years past:

| Player | PA | 14-16 Swing % | 16 Swing % | Swing Delta |
|---|---|---|---|---|
| Danny Espinosa | 59 | 52.50% | 41.80% | -10.70% |
| Jose Altuve | 82 | 51.60% | 41.30% | -10.30% |
| Chase Headley | 54 | 42.70% | 34.20% | -8.50% |
| J.D. Martinez | 69 | 54.40% | 46.60% | -7.80% |
| Scooter Gennett | 68 | 54.60% | 46.80% | -7.80% |
| Chris Davis | 72 | 47.50% | 39.70% | -7.80% |
| Chris Carter | 68 | 47.80% | 40.40% | -7.40% |
| Brett Gardner | 59 | 37.00% | 29.80% | -7.20% |
| Nolan Arenado | 73 | 53.80% | 46.70% | -7.10% |
| Colby Rasmus | 69 | 46.30% | 39.30% | -7.00% |
| Randal Grichuk | 61 | 51.40% | 44.60% | -6.80% |
| J.J. Hardy | 60 | 39.90% | 33.10% | -6.80% |
| Christian Yelich | 70 | 40.60% | 34.30% | -6.30% |
| Jon Jay | 78 | 48.60% | 42.40% | -6.20% |
| Hunter Pence | 81 | 46.60% | 40.50% | -6.10% |
| John Jaso | 69 | 44.70% | 38.80% | -5.90% |
| Ian Desmond | 70 | 49.90% | 44.00% | -5.90% |
| Dexter Fowler | 81 | 42.00% | 36.20% | -5.80% |
| Brian McCann | 55 | 42.90% | 37.10% | -5.80% |
| Joe Mauer | 80 | 39.90% | 34.20% | -5.70% |
| Mike Moustakas | 71 | 48.00% | 42.30% | -5.70% |
| Jose Abreu | 77 | 52.40% | 46.80% | -5.60% |
| Francisco Lindor | 65 | 49.80% | 44.20% | -5.60% |
| Mark Teixeira | 68 | 42.80% | 37.40% | -5.40% |
| Kolten Wong | 56 | 49.20% | 43.80% | -5.40% |
| Jason Kipnis | 65 | 42.60% | 37.20% | -5.40% |
| Daniel Murphy | 64 | 48.60% | 43.40% | -5.20% |
| Ryan Braun | 70 | 50.70% | 45.60% | -5.10% |
| Chase Utley | 69 | 41.70% | 36.60% | -5.10% |
| Jacoby Ellsbury | 68 | 46.80% | 41.80% | -5.00% |
| David Freese | 75 | 45.40% | 40.40% | -5.00% |

Swinging less frequently is one way to increase your BB Rate, and subsequently your K Rate as well, and it’s something we look at as a sign of potential growth in young hitters. Jose Altuve (HOU) is sporting a career-high 11 percent BB Rate (previous high was 6.3 percent), which ties directly to swinging less frequently. There is a line where you can be “too patient,” but in general more selectivity at the plate often correlates with improved power as well as improved BB Rates. We’re seeing this early with Altuve (and a number of other players on this list). He’ll give some of this back over time, but an improved approach likely means even better things for the soon-to-be 26-year-old. This list is filled with players off to strong early season starts.

The contact rate samples aren’t yet reaching reliable levels, but they will within the next month. As with the swing rates above, we’ll list the players who have seen an increase or decrease of at least five percentage points and then provide some comments below:

Largest contact rate declines:

| Player | 2016 PA | 14-16 Contact % | 16 Contact % | Contact Delta |
|---|---|---|---|---|
| David Wright | 69 | 79.80% | 60.20% | -19.60% |
| Khris Davis | 62 | 69.40% | 53.10% | -16.30% |
| Russell Martin | 59 | 78.80% | 65.10% | -13.70% |
| Alex Gordon | 67 | 77.60% | 64.40% | -13.20% |
| Ryan Zimmerman | 56 | 79.90% | 68.50% | -11.40% |
| Lorenzo Cain | 71 | 79.80% | 68.80% | -11.00% |
| Neil Walker | 67 | 82.00% | 71.70% | -10.30% |
| Jose Altuve | 82 | 89.90% | 80.50% | -9.40% |
| Marcus Semien | 64 | 78.30% | 69.10% | -9.20% |
| Daniel Murphy | 64 | 89.50% | 80.80% | -8.70% |
| Edwin Encarnacion | 80 | 79.50% | 71.20% | -8.30% |
| Andrew McCutchen | 82 | 76.70% | 68.40% | -8.30% |
| Yoenis Cespedes | 66 | 78.60% | 70.60% | -8.00% |
| Mitch Moreland | 62 | 75.50% | 67.70% | -7.80% |
| Corey Dickerson | 57 | 76.30% | 68.50% | -7.80% |
| Logan Forsythe | 68 | 83.20% | 75.70% | -7.50% |
| Hanley Ramirez | 73 | 81.30% | 73.90% | -7.40% |
| Omar Infante | 58 | 83.00% | 75.80% | -7.20% |
| Yonder Alonso | 52 | 85.20% | 78.00% | -7.20% |
| Troy Tulowitzki | 75 | 80.00% | 72.90% | -7.10% |
| Jason Kipnis | 65 | 83.00% | 76.00% | -7.00% |
| Rajai Davis | 56 | 82.40% | 75.50% | -6.90% |
| Prince Fielder | 78 | 80.80% | 74.00% | -6.80% |
| Ian Desmond | 70 | 72.40% | 65.80% | -6.60% |
| Jon Jay | 78 | 84.00% | 77.50% | -6.50% |
| Francisco Lindor | 65 | 82.00% | 75.50% | -6.50% |
| Desmond Jennings | 63 | 80.70% | 74.30% | -6.40% |
| Nick Markakis | 73 | 90.50% | 84.20% | -6.30% |
| Mookie Betts | 80 | 86.50% | 80.30% | -6.20% |
| Leonys Martin | 62 | 77.20% | 71.20% | -6.00% |
| Erick Aybar | 69 | 87.80% | 82.40% | -5.40% |
| Nori Aoki | 73 | 90.70% | 85.60% | -5.10% |
| Kyle Seager | 71 | 82.90% | 77.80% | -5.10% |
| Jayson Werth | 56 | 80.90% | 75.90% | -5.00% |
| Yasiel Puig | 70 | 74.10% | 69.10% | -5.00% |

Largest contact rate gains:

| Player | 2016 PA | 14-16 Contact % | 16 Contact % | Contact Delta |
|---|---|---|---|---|
| John Jaso | 69 | 81.60% | 91.90% | 10.30% |
| Wilson Ramos | 54 | 78.90% | 87.50% | 8.60% |
| Kris Bryant | 82 | 67.30% | 75.80% | 8.50% |
| Bryce Harper | 73 | 74.80% | 83.10% | 8.30% |
| Jay Bruce | 71 | 75.00% | 82.70% | 7.70% |
| Brett Gardner | 59 | 84.10% | 91.70% | 7.60% |
| George Springer | 79 | 66.50% | 73.90% | 7.40% |
| Nolan Arenado | 73 | 82.50% | 89.70% | 7.20% |
| Melvin Upton Jr. | 68 | 69.10% | 76.20% | 7.10% |
| Brandon Belt | 78 | 74.60% | 81.00% | 6.40% |
| Joe Mauer | 80 | 85.00% | 90.80% | 5.80% |
| Andrelton Simmons | 70 | 88.60% | 94.40% | 5.80% |
| Ryan Braun | 70 | 79.20% | 84.90% | 5.70% |
| Zack Cozart | 53 | 86.90% | 92.30% | 5.40% |
| Jose Iglesias | 55 | 91.20% | 96.50% | 5.30% |
| Danny Espinosa | 59 | 70.30% | 75.30% | 5.00% |
| David Peralta | 80 | 78.60% | 83.60% | 5.00% |

Let’s start with the worrisome contact rate issues. Again, we’re still shy of the plate appearance number where things show some predictive reliability, but wide gaps from expected performance are worth flagging. David Wright’s (NYM) contact woes early in the season are very alarming. Wright has hit for power early in the season, which has helped hide the fact that he’s striking out in nearly 35 percent of his plate appearances. Wright is 33 years old and injuries erased his 2015 season. We’re two years removed from Wright playing a full season and three years removed from a truly productive season. While Wright’s early season performance might hint at a return to productivity, the early season indicators suggest otherwise. Some other high-profile names we’re concerned about on the contact side early in the season include Andrew McCutchen (PIT), Edwin Encarnacion (TOR), Troy Tulowitzki (TOR), and Prince Fielder (TEX). All of these players are seeing decreased contact rates accompanied by decreased hard hit rates. Another thing to note here is the potential sampling bias: we’re often seeing pairs of teammates on these lists, which may simply suggest a team has faced pitchers who are tougher to make contact against. Remember, we’re still shy of the reliability mark, so we’re not ready to draw significant conclusions, but these are situations we’re monitoring.

On the positive side, we see some young players making more contact early in the season as they aid their ascent towards super-duper stardom. Bryce Harper (WAS), Kris Bryant (CHC), Nolan Arenado (COL), and George Springer (HOU) are all making even more contact than they have in recent years. Strikeouts were a problem for Bryant and Springer before but weren’t big concerns for Harper or Arenado, who are both getting rather ridiculous. One other notable on this list is Melvin Upton Jr. (SD), who has had a bit of a rebirth early this season in San Diego and has been a recent nuisance to our strikeout hopes when picking on the Padres.

As we work our way through some baseline projection updates over the next week, we wanted to give you a peek into our process behind the scenes and how we incorporate in-season indicators into some of our analysis. As samples approach reliability, players will receive slight adjustments to their baselines that will ultimately impact their own ranking and the strength of the matchup for opposing starters. Often, our content may feel a little repetitive as we’re recommending similar players in similar matchups over and over, but it’s the result of sticking to a process. As we adjust baselines during the season, you may see some slight differences in the names that are popping up in different matchups, and that is the direct result of trying to adjust to the new information coming in. With “data” becoming less of a competitive advantage in the marketplace, we wanted to emphasize some of the ways we actually put the data to use and help you avoid some of the pitfalls of putting too much weight on data that hasn’t stabilized.

The post MLB DFS Strategy: Small Sample Sizes and Adjusting Projection Baselines appeared first on DailyRoto.
