Introduction
When a pitcher starts dealing or falling apart over the course of a couple of months, I often see release point changes suggested as a possible cause for the sudden change in fortunes.
I’m interested to see if this holds up in aggregate. Do release point changes have any effect on pitcher performance? Do many pitchers undergo release point changes every month with no change in ability level?
If any correlation is found, then this is informative, and suggests that release point changes could be incorporated into a general theory of pitching. However, if not, it implies that analysis of release points may be similar to reading tea leaves. Searching for patterns to explain randomness is human nature, and this could be an example of it. This isn’t to say that release point changes cannot be informative and causes for changes in ability level, but that there should be some additional justification for why a particular release point change is special and defies the general relationship.
I’ll be including R code segments in this article. I’ve been using R Markdown for various job applications and technical assessments recently, and it seems pretty neat. This will allow others to repeat my analysis and improve upon it if necessary. If you’re not interested in these then feel free to skip past them. See an example below:
# I am a code segment
# Ignore me if you're not a nerd
Gathering the Data
There are probably many possible definitions of a “release point change”. I’ll be looking at the monthly difference between a pitcher’s release height and their career average. Monthly changes in a pitcher’s wOBA and run value, also relative to their career average, will be used to measure pitcher performance. Correlating these factors will tell us if there’s anything useful here.
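# Packages used throughout: dplyr for the pipeline (dbplyr must be
# installed for database tables) and ggplot2 for the plots
library(dplyr)
library(ggplot2)
# My personal statcast database is in the db object here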
# Take regular season games in the statcast era (2015-)
# Select relevant columns and collect from the database
# Purists may get mad that I haven't done all the data
# manipulation in SQL. But I find this method much quicker
# because I'm so much more familiar with the R language.
data = tbl(db,"statcast") %>% filter(game_type=="R",game_year>=2015) %>%
select(release_pos_z,woba_value,woba_denom,
delta_run_exp,p_throws,pitcher,game_year,game_date) %>% collect()
This loads over 5 million rows of data! Luckily my laptop can handle it. The next step is to group by month and find a pitcher’s career average at each point.
# This is a big piped set of manipulations which get
# the data to where I want it to be
data_grouped = data %>%
# remove NAs
filter(!is.na(release_pos_z),
!is.na(delta_run_exp)
) %>%
# turn NAs in woba value & woba_denom to 0
# Add month variable
mutate(
woba_value = if_else(is.na(woba_value),0,woba_value),
woba_denom = if_else(is.na(woba_denom),0,woba_denom),
# game_date is "YYYY-MM-DD", so characters 6-7 hold the two-digit month
game_month = as.numeric(substr(game_date,6,7)),
) %>%
group_by(pitcher) %>%
arrange(game_year,game_date) %>%
# Find career-to-date stats
mutate(
career_pitches = seq_along(pitcher),
career_height = cummean(release_pos_z),
career_rv = cummean(delta_run_exp),
career_woba = cumsum(woba_value)/cumsum(woba_denom)
) %>%
ungroup() %>%
# Don't care about march/october, probably very few pitches
# in these months anyway, get rid of them
filter(
game_month >=4,
game_month <=9
) %>%
# group by each pitcher-month
group_by(pitcher,game_year,game_month) %>%
arrange(game_date) %>%
# Create the summary statistics
# Also include career stats from beginning of month
summarise(
count = n(),
height = mean(release_pos_z,na.rm=TRUE),
woba = weighted.mean(woba_value,woba_denom,na.rm=TRUE),
RV = mean(delta_run_exp,na.rm=TRUE),
p_throws = first(p_throws),
career_pitches = first(career_pitches),
career_height = first(career_height),
career_rv = first(career_rv),
career_woba = first(career_woba),
.groups="drop"
) %>%
# Add on the differences to career averages
mutate(
height_diff = height - career_height,
woba_diff = woba - career_woba,
rv_diff = RV - career_rv
)
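To make the career-to-date logic concrete, here is a minimal base R sketch on a made-up four-pitch data set for one hypothetical pitcher. The values and the `toy` data frame are illustrative assumptions, not Statcast data; the column names simply mirror the ones used above.

```r
# Toy pitch-level data for one hypothetical pitcher (values are made up)
toy <- data.frame(
  pitcher       = 1L,
  game_date     = as.Date(c("2021-04-05", "2021-04-20", "2021-05-03", "2021-05-18")),
  release_pos_z = c(6.0, 6.2, 6.4, 6.6),  # release height in feet
  woba_value    = c(0.0, 0.9, 0.0, 2.0),
  woba_denom    = c(1, 1, 1, 1)
)
toy <- toy[order(toy$game_date), ]

# Career-to-date quantities, mirroring cummean() and cumsum()/cumsum()
toy$career_height <- cumsum(toy$release_pos_z) / seq_along(toy$release_pos_z)
toy$career_woba   <- cumsum(toy$woba_value) / cumsum(toy$woba_denom)
toy$game_month    <- as.numeric(format(toy$game_date, "%m"))

# Monthly summary for May: mean height in the month, and the career
# value recorded at the first pitch of the month
may <- toy[toy$game_month == 5, ]
may_height        <- mean(may$release_pos_z)         # (6.4 + 6.6) / 2 = 6.5
may_career_height <- may$career_height[1]            # (6.0 + 6.2 + 6.4) / 3 = 6.2
may_height_diff   <- may_height - may_career_height  # 0.3 ft

# Height change in inches
print(12 * may_height_diff)
```

Note that, as in the pipeline above, the career snapshot taken at the start of a month already includes the first pitch of that month, because the cumulative mean is computed up to and including each row.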
Are Release Height Changes Bad?
Now that the data is set up, let’s take a look at it. I want to see if changes in release height correlate with changes in performance.
The easiest way to see this is to make a graph!
To start with, let’s see the distribution of monthly release height differences from career averages, with a minimum sample size applied (100 pitches in the month, 1000 career pitches):
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
ggplot(aes(x = 12*height_diff))+
geom_histogram(bins = 100)+
theme_minimal()+
ggtitle("Monthly Release Height Difference to Career Average /inches")+
xlab("Height Difference /inches")
This graph looks reasonable. After all, most pitchers don’t change their release points by much month-to-month.
If I try to plot a graph of run value/release height change for every single pitcher-month, it will look like a massive blob. So I’ll round by inch of release height change and see what the effect is in each rounded bin.
windowsFonts("bahn" = windowsFont("Bahnschrift"))
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
mutate(height_change = round(height_diff*12)) %>%
filter(abs(height_change)<=5) %>%
ggplot(aes(
x = factor(height_change),
y = 100*rv_diff
)) +
geom_violin()+
geom_boxplot(width=0.1)+
theme_minimal()+
theme(text = element_text(family="bahn",size=15))+
xlab("Height Change /inches")+
ylab("Run Value /100 pitches Change")+
ggtitle("Release Height Change - Effect on Pitch Run Value")
All of those distributions look pretty similar to me. Let’s try using wOBA too.
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
mutate(height_change = round(height_diff*12)) %>%
filter(abs(height_change)<=5) %>%
ggplot(aes(
x = factor(height_change),
y = woba_diff
)) +
geom_violin()+
geom_boxplot(width=0.1)+
theme_minimal()+
theme(text = element_text(family="bahn",size=15))+
xlab("Release Height Change /inches")+
ylab("wOBA Change")+
ggtitle("Release Height Change - Effect on wOBA")
Even just looking at the average change in each bin, there isn’t a clear relationship, as seen below. If anything, changing release height in both directions produces slightly better results on average.
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
group_by(height_change = round(height_diff*12)) %>%
summarise(rv_diff = mean(rv_diff),
Count = n()) %>%
filter(abs(height_change)<=5) %>%
ggplot(aes(
x = factor(height_change),
y = 100*rv_diff,
size=Count
)) +
geom_point()+
theme_minimal()+
theme(text = element_text(family="bahn",size=15))+
xlab("Height Change /inches")+
ylab("Run Value /100 pitches Change")+
ggtitle("Release Height Change - Effect on Pitch Run Value")
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
group_by(height_change = round(height_diff*12)) %>%
summarise(woba_diff = mean(woba_diff),
Count = n()) %>%
filter(abs(height_change)<=5) %>%
ggplot(aes(
x = factor(height_change),
y = woba_diff,
size=Count
)) +
geom_point()+
theme_minimal()+
theme(text = element_text(family="bahn",size=15))+
xlab("Height Change /inches")+
ylab("wOBA Change")+
ggtitle("Release Height Change - Effect on wOBA")
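The plots could be complemented with a single summary number. The sketch below shows one way to do that with a Pearson correlation and a simple linear fit. It runs on synthetic data (`sim` is a hypothetical stand-in generated with assumed standard deviations, not the real `data_grouped`), so the printed values say nothing about actual pitchers.

```r
set.seed(1)

# Synthetic stand-in for the filtered pitcher-months (NOT real Statcast data).
# height_diff is in feet; both columns are drawn independently, and the
# standard deviations are assumptions chosen only for illustration.
n <- 2000
sim <- data.frame(
  height_diff = rnorm(n, mean = 0, sd = 0.12),
  woba_diff   = rnorm(n, mean = 0, sd = 0.08)
)

# Pearson correlation between height change (inches) and wOBA change,
# with a 95% confidence interval
ct <- cor.test(12 * sim$height_diff, sim$woba_diff)
print(ct$estimate)
print(ct$conf.int)

# Linear fit: estimated wOBA change per inch of release height change
fit <- lm(woba_diff ~ I(12 * height_diff), data = sim)
print(coef(summary(fit)))
```

On the real data, `sim` would be replaced by `data_grouped` with the same sample-size filter (`count >= 100`, `career_pitches >= 1000`). A confidence interval that comfortably includes zero would be consistent with the flat pattern in the binned plots.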
Applications to Josh Hader
I wrote this article because I’ve seen a lot of this analysis applied to Josh Hader. He had an awful July before being traded by the Brewers, and his release point has been higher than his career average all year. That release point change didn’t stop Hader from putting together a nearly record-breaking scoreless appearance streak to start the season.
Let’s look at some other pitchers with significantly higher release height in July 2022 compared to their career average. Hader clearly stands out with the worst wOBA change. All the other players are spread around the plot, averaging zero change.
Kenley Jansen’s release point is much higher than his career average, by over half a foot, without significant adverse effects.
library(ggrepel)
data_grouped %>% group_by(pitcher) %>% arrange(game_year,game_month) %>%
mutate(count_to_date = cumsum(count)) %>%
filter(12*height_diff >= 2.5,count_to_date >=1000) %>%
ungroup() %>%
filter(game_year==2022,game_month == 7) %>%
# names: lookup table of pitcher ID to PLAYERNAME, loaded from an external file
left_join(names,by="pitcher") %>%
ggplot(
aes(x = 12 * height_diff, y = woba_diff)
)+
geom_hline(yintercept = 0)+
geom_vline(xintercept = 0)+
geom_point(size=5)+
geom_label_repel(aes(label = PLAYERNAME),family="bahn",
alpha=0.5,min.segment.length = 0,max.overlaps = 100)+
xlab("Release Height Change to Career Average /inches")+
ylab("wOBA Change to Career Average")+
ggtitle("July 2022 Release Height Outliers")+
theme_minimal()+
theme(text = element_text(family = "bahn",size=15))
Maybe Josh Hader’s performance decrease isn’t random. Looking at similar decreases in performance alongside similar release height changes may tell us whether there is a type of player who is vulnerable to this change.
Let’s take all pitcher-months with a release height change of at least 2.5” and a wOBA change of at least .150.
filtered_group = data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
filter(12*height_diff >= 2.5,
woba_diff >=0.150) %>%
# Add on name data, I load this from an external file
left_join(names,by="pitcher")
filtered_group %>%
arrange(PLAYERNAME) %>%
select(PLAYERNAME,
game_year,
game_month,
height_diff,
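woba_diff) %>%
# Rounds in feet (to 0.1 ft) before converting to inches,
# so the displayed heights are multiples of 1.2 inches
mutate(height_diff = 12*round(height_diff,1),
woba_diff = round(woba_diff,3)) %>%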
rename(Player = PLAYERNAME,
Season = game_year,
Month = game_month,
`Release Height Change` = height_diff,
`wOBA Change` = woba_diff) %>%
knitr::kable()
Player | Season | Month | Release Height Change (in) | wOBA Change |
---|---|---|---|---|
Adam Conley | 2016 | 8 | 2.4 | 0.153 |
Alex Colome | 2017 | 6 | 2.4 | 0.150 |
Andrew Cashner | 2018 | 9 | 4.8 | 0.210 |
Arquimedes Caminero | 2016 | 5 | 2.4 | 0.194 |
Bud Norris | 2018 | 9 | 3.6 | 0.273 |
Cam Bedrosian | 2021 | 4 | 2.4 | 0.189 |
Casey Fien | 2017 | 6 | 2.4 | 0.185 |
Jhoulys Chacin | 2021 | 5 | 2.4 | 0.156 |
Jordan Lyles | 2019 | 7 | 3.6 | 0.189 |
Josh Hader | 2022 | 7 | 3.6 | 0.297 |
Josh Osich | 2017 | 9 | 2.4 | 0.157 |
Josh Tomlin | 2016 | 8 | 2.4 | 0.154 |
Josh Tomlin | 2021 | 8 | 2.4 | 0.229 |
Luke Weaver | 2018 | 9 | 2.4 | 0.193 |
Matt Bush | 2017 | 9 | 2.4 | 0.270 |
Mike Clevinger | 2019 | 6 | 2.4 | 0.166 |
Nate Jones | 2020 | 8 | 3.6 | 0.151 |
Pedro Baez | 2017 | 9 | 3.6 | 0.212 |
Rick Porcello | 2020 | 7 | 2.4 | 0.170 |
Ross Stripling | 2021 | 9 | 2.4 | 0.213 |
Ryan Dull | 2019 | 4 | 2.4 | 0.192 |
Sam Freeman | 2018 | 7 | 3.6 | 0.165 |
Tommy Milone | 2016 | 6 | 3.6 | 0.150 |
Tommy Milone | 2016 | 8 | 3.6 | 0.155 |
Travis Wood | 2017 | 9 | 2.4 | 0.218 |
Will Harris | 2019 | 7 | 2.4 | 0.184 |
Williams Perez | 2016 | 9 | 2.4 | 0.210 |
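The height column in the table only takes the values 2.4, 3.6 and 4.8 because the table code rounds the change in feet before multiplying by 12. A row showing 2.4 therefore still cleared the 2.5 inch filter. The short sketch below illustrates the effect of the rounding order on made-up values (`height_diff_ft` is hypothetical, not taken from the data).

```r
# Hypothetical release height changes in feet (made-up values)
height_diff_ft <- c(0.21, 0.26, 0.33, 0.42)

# Round in feet first, then convert: results are multiples of 1.2 inches
round_then_convert <- 12 * round(height_diff_ft, 1)

# Convert to inches first, then round: keeps 0.1 inch resolution
convert_then_round <- round(12 * height_diff_ft, 1)

print(data.frame(height_diff_ft, round_then_convert, convert_then_round))
```

If finer resolution is wanted in the table, converting to inches before rounding is the usual choice.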
There are only 27 pitcher-months that fit these criteria. Is there anything special about them? Firstly, do they have odd release heights?
player_list = unique(filtered_group$pitcher)
data_players = data_grouped %>%
filter(count >=100,
career_pitches >=1000,
pitcher %in% c(player_list))
data_grouped %>%
filter(count >=100,
career_pitches >=1000) %>%
ggplot(aes(x = 12*height))+
geom_density(fill="grey")+
geom_density(data = data_players,color="red",fill="red",alpha=0.5)+
theme_minimal()+
ggtitle("Grey - Total population\nRed - Pitchers who saw worse results with increased release height")+
xlab("Height /inches")+
theme(text = element_text(family = "bahn",size=10))
From the graph above we can see that they have quite a representative spread in release height, if anything skewing slightly higher than the general distribution.
In addition, we can look at whether there is a clear correlation between wOBA and release height for these players, or whether the selected months are clear outliers from the distribution.
data_players %>%
ggplot(aes(x = 12*height_diff,y = woba_diff))+
annotate("segment",x=2.5,y=0.15,xend=6,yend=0.15)+
annotate("segment",x=2.5,y=0.15,xend=2.5,yend=0.4)+
geom_point(size=2)+
geom_point(data =
filter(
data_players,
12*height_diff >= 2.5,
woba_diff >=0.150),size=2,color="red")+
theme_minimal()+
ggtitle("Red: Results in the 27 filtered Pitcher-months
\nBlack: All Pitcher-months for players picked up by the filter")+
xlab("Height Change /inches")+
ylab("wOBA Change")+
theme(text = element_text(family = "bahn",size=10))
This graph shows that there is nothing special about these pitchers: they have simply had a bad month that coincided with a release height change. Across the rest of their careers, there is no visible correlation between release height change and results. These pitchers have had months with much higher release points where they saw little change compared to their career wOBA.
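To get a feel for how many such coincidences pure chance can produce, here is a rough simulation sketch. Every number in it (the count of pitcher-months and both standard deviations) is an assumption picked for illustration, not an estimate from the Statcast data, so the output should not be compared directly with the 27 pitcher-months found above.

```r
set.seed(42)

# ASSUMED values, not estimated from the Statcast data
n_months  <- 10000  # number of qualifying pitcher-months
sd_height <- 1.5    # sd of monthly release height change, in inches
sd_woba   <- 0.08   # sd of monthly wOBA change

# Height and wOBA changes are drawn independently, so any pitcher-month
# that clears both thresholds does so purely by chance
height_change <- rnorm(n_months, mean = 0, sd = sd_height)
woba_change   <- rnorm(n_months, mean = 0, sd = sd_woba)

flagged <- height_change >= 2.5 & woba_change >= 0.150
print(sum(flagged))

# Expected count under independence, from the two normal tail probabilities
expected <- n_months *
  pnorm(2.5,   mean = 0, sd = sd_height, lower.tail = FALSE) *
  pnorm(0.150, mean = 0, sd = sd_woba,   lower.tail = FALSE)
print(expected)
```

The point of the sketch is only that, with thousands of pitcher-months, a handful of "higher release point plus bad month" cases will appear even when the two quantities are unrelated.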
data_players %>%
filter(pitcher==623352) %>%
ggplot(aes(x = 12*height_diff,y = woba_diff))+
geom_path(size=2)+
geom_point(size=4,aes(color = factor(game_year)))+
theme_minimal()+
xlab("Release Height Change /inches")+
ylab("wOBA Change")+
labs(color = "Season")+
theme(text = element_text(family = "bahn",size=15))+
ggtitle("Josh Hader")
If we only look at Josh Hader, we can see that there has been a clear release height increase in 2022, but that this only coincided with a wOBA increase in July 2022.
The first two months of 2022 had a similarly high release point but Hader was posting wOBAs almost 100 points below his career average. If the release point was important to explaining this decrease in effectiveness, then I would have expected the first few months of 2022 to go much worse.
Conclusions
Using this data, I have found very little to suggest that there is a global relationship between release height changes and pitcher effectiveness. In addition, pitchers whose particularly bad months coincided with release height increases don’t show any correlation between release height and results outside of those months.
Pitcher release point changes could still be a mechanism for explaining performance differences. However, I believe this may only be possible to analyse with performance statistics that are very stable, such as stuff quality.