Predicting Hit Distances & Investigating Changes to the Ball in 2021

How did the baseball change in 2021? 

Introduction

Before the 2021 season, MLB announced that it would deaden the baseball in an effort to reduce the historically high home run rates of recent seasons. In this post I investigate how this deadened ball behaved differently from the ball used in 2020 and earlier, primarily by modelling flyball distances.

I'll be using Generalized Additive Models (GAMs) in this project. These are easy to use and can model smooth non-linear effects along with interactions between variables.

Before looking at how the ball changed in 2021, we need a baseline for expected ball behaviour in earlier years. The next few sections walk through building models that account for the important variables when predicting flyball distance. This is similar to existing analysis by Alan Nathan, which was done much more rigorously and with more data than I have access to.

Launch Angle and Exit Velocity

The most important factor to consider when working out how far a flyball will travel is its initial trajectory, that is: exit velocity and launch angle. The graph below shows the output of the GAM which aims to predict flyball distance from these two variables. The contours show where on the graph most flyballs are found. Predicted flyball distances peak at launch angles around 30 degrees.
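As a rough sketch of what such a model looks like on paper: the post does not state the exact specification, so the single two-dimensional smooth and the Gaussian error term below are assumptions, with EV standing for exit velocity and LA for launch angle.

```latex
\mathrm{distance}_i = \beta_0 + f(\mathrm{EV}_i, \mathrm{LA}_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

Here $f$ is a smooth surface estimated from the data rather than a fixed formula, which is what lets the predicted distance peak at a particular launch angle instead of rising or falling linearly.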

Spray Angle

Surprisingly, spray angle is also an important consideration for flyball distance. This may seem counter-intuitive: once the ball has left the bat, the distance it travels can't be affected by which batter's box the hitter was standing in. However, we can clearly see that pulled flyballs travel further.
The root cause of this effect is not spray angle but spin. The faster a ball spins, the larger the drag force on it and therefore the shorter the distance it travels. Pulled flyballs have less spin, which contributes to an increase in distance. I don't have access to ball spin data, so spray angle acts as a proxy for spin on the batted ball.

Weather

Now that we've accounted for the physical variables describing the motion of the ball, we can move on to the environmental variables which affect flyball distances. Weather conditions are an important consideration, as wind and temperature can turn home runs into flyouts and vice versa. This has long been known in betting markets, where over-under lines are affected by the weather conditions.

I couldn't find an easily available dataset with weather conditions for MLB games going back to 2015 (the beginning of the Statcast era), so I had to gather this data myself. Box scores on the MLB website contain a short description of the weather conditions, including temperature, wind speed and wind direction, as seen in the image below. I was able to scrape this data for all games from 2015-2021 in order to see how the weather affects flyball distances.
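A minimal sketch of how a box score weather description could be turned into numbers. The exact text layout shown in the comment is an assumption about how the box score line is written, and `parse_weather` and its field names are hypothetical, not the code used for this post.

```python
import re
from typing import Optional, TypedDict


class Weather(TypedDict):
    temp_f: int
    sky: str
    wind_mph: int
    wind_dir: str


# Assumed layout of the box score text, for example:
#   "Weather: 72 degrees, Partly Cloudy. Wind: 8 mph, Out To CF."
WEATHER_RE = re.compile(
    r"Weather:\s*(?P<temp>\d+)\s*degrees,\s*(?P<sky>[^.]+)\.\s*"
    r"Wind:\s*(?P<mph>\d+)\s*mph,\s*(?P<dir>[^.]+)\.?",
    re.IGNORECASE,
)


def parse_weather(text: str) -> Optional[Weather]:
    """Extract temperature, sky, wind speed and wind direction.

    Returns None when the text does not match the assumed layout,
    so that unparseable games can be inspected rather than guessed at.
    """
    match = WEATHER_RE.search(text)
    if match is None:
        return None
    return Weather(
        temp_f=int(match["temp"]),
        sky=match["sky"].strip(),
        wind_mph=int(match["mph"]),
        wind_dir=match["dir"].strip(),
    )
```

The wind direction is kept as free text here; turning it into a signed "blowing out" or "blowing in" value is a separate modelling choice.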
The graph below shows how the weather can increase or decrease flyball distances relative to the expectation of a model which only includes the ball's trajectory. Note that my wind data only includes the speed and direction provided in the box score. Because wind can gust and change direction over short periods of time, many of these values are probably inaccurate, so the effect of wind is likely lower in the model than in reality.
This is useful, but weather effects can be stadium-specific. Training linear models on the flyballs in each stadium shows the effects of weather more clearly.
We can see a clear relationship: parks that are more exposed to the wind direction are also more exposed to the effect of temperature differences. Wrigley Field is by far the most variable park for flyball distances with changes in weather. There is also a group of parks where temperature has an effect on flyball distance but the effect of the wind (as presented in the box score) is close to zero.
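The per-stadium linear models can be sketched roughly as below. This is an illustration under assumptions, not the code behind the graph: it assumes the response is the residual distance (actual minus the trajectory-only prediction) and that wind has already been reduced to a signed speed, positive when blowing out. The names `solve3`, `fit_park`, `fit_by_park` and the row layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

# (park, temperature_f, signed_wind_mph, residual_distance_ft)
Row = Tuple[str, float, float, float]


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_park(obs: Sequence[Tuple[float, float, float]]) -> List[float]:
    """Least squares fit of residual = b0 + b_temp * temp + b_wind * wind."""
    X = [[1.0, t, w] for t, w, _ in obs]
    y = [r for _, _, r in obs]
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(3)]
    return solve3(xtx, xty)


def fit_by_park(rows: Iterable[Row], min_obs: int = 3) -> Dict[str, List[float]]:
    """Fit one linear model per park, skipping parks with too few flyballs."""
    grouped: Dict[str, list] = defaultdict(list)
    for park, temp, wind, resid in rows:
        grouped[park].append((temp, wind, resid))
    return {p: fit_park(o) for p, o in grouped.items() if len(o) >= min_obs}
```

Each park then gets a temperature slope and a wind slope, which is one way to place parks on a two-axis chart of temperature effect against wind effect.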

With all these variables considered, let's see how the 2021 ball was different.

What changed in 2021?

In 2021 the ball was deadened. This means we expect flyball distances to decrease compared to earlier years. But which types of flyballs were most affected? 

I retrained the flyball distance models using 2021 data. Comparing the two sets of models shows where the largest differences emerge.

Using the model which only included launch angle and exit velocity, we can see that it was mostly flyballs with especially high or low launch angles which were affected: the areas at the top and bottom of the contoured region are blue. The very low launch angle balls had a higher predicted hit distance, but I believe this is because few balls have these properties, so differences here are more likely to represent model error at the extremes of the dataset.
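The comparison idea can be illustrated with a much cruder stand-in for the GAM: average distance in exit velocity and launch angle bins, differenced between the two periods. The binning, the 5 unit bin widths, the `min_count` threshold and the function names are all assumptions made for this sketch. Masking sparse bins reflects the caveat above that differences at the extremes of the data are unreliable.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# (exit_velocity_mph, launch_angle_deg, distance_ft)
Ball = Tuple[float, float, float]
Cell = Tuple[int, int]


def bin_key(ev: float, la: float, ev_step: float = 5.0, la_step: float = 5.0) -> Cell:
    """Map a batted ball to its (exit velocity, launch angle) bin."""
    return (int(ev // ev_step), int(la // la_step))


def binned_surface(balls: Iterable[Ball], min_count: int = 30) -> Dict[Cell, float]:
    """Mean distance per bin, dropping bins with fewer than min_count balls."""
    cells: Dict[Cell, list] = defaultdict(list)
    for ev, la, dist in balls:
        cells[bin_key(ev, la)].append(dist)
    return {k: mean(v) for k, v in cells.items() if len(v) >= min_count}


def surface_difference(
    old_balls: Iterable[Ball], new_balls: Iterable[Ball], min_count: int = 30
) -> Dict[Cell, float]:
    """New minus old mean distance, only where both periods have enough data."""
    old = binned_surface(old_balls, min_count)
    new = binned_surface(new_balls, min_count)
    return {k: new[k] - old[k] for k in old.keys() & new.keys()}
```

Negative values mark bins where the newer balls travelled less far on average. A smooth model such as a GAM does the same job without hard bin edges.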

Balls with especially high or low launch angles are more likely to have high spin because they are not hit as squarely. Perhaps it is a higher drag coefficient in the 2021 ball which causes spin to have a larger effect on deadening the ball.

Looking at the changes in the model which includes spray angle shows the effect of spin more clearly. Firstly, for extremely high or low launch angles, the 2021 predicted distance is lower than the pre-2021 one. For high launch angle flyballs hit to the opposite field the distance is lowered, while for pulled flyballs the decrease is smaller. This agrees with the previous conclusion that 2021 balls exhibit more drag, which means that balls hit with more spin are deadened to a greater degree.
An example of a player who could be most affected by this would be DJ LeMahieu. He gained much of his value in 2019 and 2020 by hitting opposite field flyballs which snuck over the short porch in Yankee Stadium. Deadened balls would reduce the distance of these flyballs the most, turning some home runs into flyouts and representing a large decrease in value.
Finally, we can look at how changes to the ball in 2021 affected its behaviour in different weather conditions. The graph below shows how the effects of wind and temperature on the ball changed from before 2021 to the 2021 season, split by stadium. The higher up a team's logo is, the greater the increase in the effect of wind on ball distance; the further to the right it is, the greater the increase in the effect of temperature.
In general the ball was more affected by wind than before, with most teams placed slightly above the horizontal line representing no change. The largest increases were at Wrigley Field and Kauffman Stadium. Meanwhile temperature effect changes were more balanced, with teams evenly split about the vertical line representing no change. Arizona showed the largest increase in the effect of temperature, but perhaps this is because they added a humidor in 2018: the 2021 model contains all humidor data but the pre-2021 model contains a mix of humidor and no-humidor.

The increase in the effect of wind corroborates the earlier conclusion that the 2021 balls had extra drag and therefore would be pushed around more by the wind.

Conclusions

In this post we have seen how different factors can affect flyball distances, from exit velocity and launch angle, to the surprising effect of spray angle, and the obvious but difficult to measure aspect of weather conditions.

I've compared how the effect of these factors changed in 2021 with the introduction of the deadened ball. Flyballs with extreme launch angles and those hit to the opposite field are deadened the most. This is likely because the aerodynamic properties of the new ball result in increased drag on balls with high spin.

The 2021 balls are also more affected by the wind direction and strength, as balls with greater drag will feel a stronger force from air resistance.

I did not find evidence for two separate populations of balls. The errors on my distance measurements are much too large and smear over any possible measurement of balls with different properties. I did observe a larger standard deviation in model errors in 2021, but I believe this is due to the effect of spin (which I cannot measure) being larger.

These changes to flyball behaviour are likely to stick if MLB continues with the 2021 ball. The consequence for player evaluation is that flyballs have become less valuable than they were during the 2015-2020 period. However, not all flyballs are deadened equally. Pulled flyballs have always been more valuable than opposite field flyballs, and now they are relatively even more valuable because they are not deadened to the same degree. Players who pull a lot of their flyballs (Arenado, Bregman) will not see decreases in production to the same degree as players who spray their flyballs (LeMahieu).

In addition, wind conditions now have a larger impact on flyball distances than they used to, especially at Wrigley Field.

Changing the baseball could have had other effects on the game, and as we progress into the 2022 season and gather more data there will surely be more analysis to be done on these consequences.
