Investigating Pitcher Release Point Changes

Introduction

When a pitcher starts dealing or falling apart over the course of a couple of months, I often see release point changes suggested as a possible cause for the sudden change in fortunes.

I’m interested to see if this holds up in aggregate. Do release point changes have any effect on pitcher performance? Do many pitchers undergo release point changes every month with no change in ability level?

If any correlation is found, then this is informative, and suggests that release point changes could be incorporated into a general theory of pitching. However, if not, it implies that analysis of release points may be similar to reading tea leaves. Searching for patterns to explain randomness is human nature, and this could be an example of it. This isn’t to say that release point changes cannot be informative and causes for changes in ability level, but that there should be some additional justification for why a particular release point change is special and defies the general relationship.

I’ll be including R code segments in this article. I’ve been using R Markdown for various job applications and technical assessments recently, and it seems pretty neat. This will allow others to repeat my analysis and improve upon it if necessary. If you’re not interested in these then feel free to skip past them. See an example below:

# I am a code segment
# Ignore me if you're not a nerd

Gathering the Data

There are probably many possible definitions of a “release point change”. I’ll be looking at the monthly difference between a pitcher’s release height and their career average. Monthly changes in a pitcher’s wOBA and run value, again compared to their career average, will be used to measure pitcher performance. Correlating these factors will tell us if there’s anything useful here.

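# Packages used throughout
# (dbplyr also needs to be installed for tbl() to work on a database)
library(dplyr)
library(ggplot2)
library(knitr)

# My personal statcast database is in the db object here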

# Take regular season games in the statcast era (2015-)
# Select relevant columns and collect from the database

# Purists may get mad that I haven't done all the data
# manipulation in SQL. But I find this method much quicker 
# because I'm so much more familiar with the R language.

data = tbl(db,"statcast") %>% filter(game_type=="R",game_year>=2015) %>% 
  select(release_pos_z,woba_value,woba_denom,
         delta_run_exp,p_throws,pitcher,game_year,game_date) %>% collect()

This loads over 5 million rows of data! Luckily my laptop can handle it. The next step is to group by month and find a pitcher’s career average at each point.

# This is a big piped set of manipulations which get
# the data to where I want it to be

data_grouped = data %>% 
  # remove NAs
  filter(!is.na(release_pos_z),
         !is.na(delta_run_exp)
         ) %>% 
  # turn NAs in woba value & woba_denom to 0
  # Add month variable
  mutate(
    woba_value = if_else(is.na(woba_value),0,woba_value),
    woba_denom = if_else(is.na(woba_denom),0,woba_denom),
    # Month is characters 6-7 of a "YYYY-MM-DD" date
    game_month = as.numeric(substr(game_date,6,7))
    ) %>% 
  group_by(pitcher) %>% 
  arrange(game_year,game_date) %>% 
  # Find career-to-date stats
  mutate(
    career_pitches = seq_along(pitcher),
    career_height = cummean(release_pos_z),
    career_rv = cummean(delta_run_exp),
    career_woba = cumsum(woba_value)/cumsum(woba_denom)
  ) %>% 
    ungroup() %>% 
  # Don't care about march/october, probably very few pitches
  # in these months anyway, get rid of them
  filter(
    game_month >=4,
    game_month <=9
    ) %>% 
  # group by each pitcher-month
  group_by(pitcher,game_year,game_month) %>% 
  arrange(game_date) %>% 
  # Create the summary statistics
  # Also include career stats from beginning of month
  summarise(
    count = n(),
    height = mean(release_pos_z,na.rm=TRUE),
    woba = weighted.mean(woba_value,woba_denom,na.rm=TRUE),
    RV = mean(delta_run_exp,na.rm=TRUE),
    p_throws = first(p_throws),
    career_pitches = first(career_pitches),
    career_height = first(career_height),
    career_rv = first(career_rv),
    career_woba = first(career_woba),
    .groups="drop"
  ) %>% 
  # Add on the differences to career averages
  mutate(
    height_diff  = height - career_height,
    woba_diff = woba - career_woba,
    rv_diff = RV - career_rv
    )
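
To make the career-to-date step less abstract, here is a toy version of the same idea on six made-up pitches from one hypothetical pitcher. The data and the `toy` names are purely illustrative and don't come from my database. It's a sketch of the logic, not the full pipeline.

```r
library(dplyr)

# Made-up data: one hypothetical pitcher, three pitches in April, three in May
toy = data.frame(
  pitcher = 1,
  game_date = c("2022-04-05","2022-04-05","2022-04-20",
                "2022-05-02","2022-05-02","2022-05-15"),
  release_pos_z = c(6.0, 6.2, 6.1, 5.9, 5.7, 5.8)
)

toy_monthly = toy %>% 
  mutate(game_month = as.numeric(substr(game_date,6,7))) %>% 
  group_by(pitcher) %>% 
  arrange(game_date) %>% 
  # Running values; each one includes the current pitch
  mutate(career_pitches = seq_along(pitcher),
         career_height = cummean(release_pos_z)) %>% 
  group_by(pitcher,game_month) %>% 
  # first() picks up the running values at the first pitch of the month
  summarise(height = mean(release_pos_z),
            career_pitches = first(career_pitches),
            career_height = first(career_height),
            .groups="drop") %>% 
  mutate(height_diff = height - career_height)

print(toy_monthly)
```

In this toy case the April row is compared against a "career" of a single pitch (6.0 ft), while May is compared against the first four pitches (6.05 ft). That is why a minimum number of career pitches is worth applying before reading anything into `height_diff`.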

Are Release Height Changes Bad?

I have the data all set up, so let’s take a look at it. I want to see if changes in release height correlate with changes in performance.

The easiest way to see this is to make a graph!

To start with, let’s see the distribution of monthly release height differences from career averages, with a minimum sample size applied (100 pitches in the month, 1000 career pitches):

data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  ggplot(aes(x = 12*height_diff))+
  geom_histogram(bins = 100)+
  theme_minimal()+
  ggtitle("Monthly Release Height Difference to Career Average /inches")+
  xlab("Height Difference /inches")

This graph looks reasonable. After all, most pitchers don’t change their release points by much month-to-month.

If I try to plot run value change against release height change for every single pitcher-month, it will look like a massive blob. So I’ll round the release height change to the nearest inch and see what the effect is in each bin.

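# windowsFonts() only exists on Windows; elsewhere, skip this and drop family="bahn" from the plots
if (.Platform$OS.type == "windows") windowsFonts("bahn" = windowsFont("Bahnschrift"))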
data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  mutate(height_change = round(height_diff*12)) %>% 
  filter(abs(height_change)<=5) %>% 
  ggplot(aes(
    x = factor(height_change),
    y = 100*rv_diff
  )) +
  geom_violin()+
  geom_boxplot(width=0.1)+
  theme_minimal()+
  theme(text = element_text(family="bahn",size=15))+
  xlab("Height Change /inches")+
  ylab("Run Value /100 pitches Change")+
  ggtitle("Release Height Change - Effect on Pitch Run Value")

All of those distributions look pretty similar to me. Let’s try using wOBA too.

data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  mutate(height_change = round(height_diff*12)) %>% 
  filter(abs(height_change)<=5) %>% 
  ggplot(aes(
    x = factor(height_change),
    y = woba_diff
  )) +
  geom_violin()+
  geom_boxplot(width=0.1)+
  theme_minimal()+
  theme(text = element_text(family="bahn",size=15))+
  xlab("Release Height Change /inches")+
  ylab("wOBA Change")+
  ggtitle("Release Height Change - Effect on wOBA")

Even just looking at the average change in each bin, there isn’t a clear relationship, as seen below. If anything, changing release height in both directions produces slightly better results on average.

data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  group_by(height_change = round(height_diff*12)) %>% 
  summarise(rv_diff = mean(rv_diff),
            Count = n()) %>% 
  filter(abs(height_change)<=5) %>% 
  ggplot(aes(
    x = factor(height_change),
    y = 100*rv_diff,
    size=Count
  )) +
  geom_point()+
  theme_minimal()+
  theme(text = element_text(family="bahn",size=15))+
  xlab("Height Change /inches")+
  ylab("Run Value /100 pitches Change")+
  ggtitle("Release Height Change - Effect on Pitch Run Value")

data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  group_by(height_change = round(height_diff*12)) %>% 
  summarise(woba_diff = mean(woba_diff),
            Count = n()) %>% 
  filter(abs(height_change)<=5) %>% 
  ggplot(aes(
    x = factor(height_change),
    y = woba_diff,
    size=Count
  )) +
  geom_point()+
  theme_minimal()+
  theme(text = element_text(family="bahn",size=15))+
  xlab("Height Change /inches")+
  ylab("wOBA Change")+
  ggtitle("Release Height Change - Effect on wOBA")
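
The plots above are a visual check. A quick numerical complement would be a correlation test on the same two columns. The sketch below runs on synthetic, independent noise standing in for `data_grouped` (the `sim` data frame and its values are hypothetical, not Statcast results), so it only shows the mechanics. Since the binned averages hint that a change in either direction might matter, it also tests the absolute height change.

```r
# Synthetic stand-in for data_grouped: hypothetical values, NOT real results
set.seed(1)
n = 500
sim = data.frame(
  height_diff = rnorm(n, mean = 0, sd = 0.1),  # feet
  rv_diff = rnorm(n, mean = 0, sd = 0.01)      # runs per pitch
)

# Pearson looks for a linear trend, Spearman for any monotonic trend
pearson = cor.test(sim$height_diff, sim$rv_diff, method = "pearson")
spearman = cor.test(sim$height_diff, sim$rv_diff, method = "spearman")

# A symmetric (U-shaped) effect would cancel out above,
# so also correlate against the size of the change
abs_pearson = cor.test(abs(sim$height_diff), sim$rv_diff)

print(c(pearson = unname(pearson$estimate),
        spearman = unname(spearman$estimate),
        abs_change = unname(abs_pearson$estimate)))
```

To try it on the real data, swap `sim` for the filtered `data_grouped` (and `rv_diff` for `woba_diff` if preferred). With thousands of pitcher-months even a tiny correlation can come out as statistically significant, so the size of the estimate matters more than the p-value.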

Applications to Josh Hader

I wrote this article because I’ve seen a lot of this analysis applied to Josh Hader. He had an awful July before being traded by the Brewers, and his release point has been higher than his career average all year. That release point change didn’t stop Hader from putting together a nearly record-breaking scoreless appearance streak to start the season.

Let’s look at some other pitchers with significantly higher release height in July 2022 compared to their career average. Hader clearly stands out with the worst wOBA change. All the other players are spread around the plot, averaging zero change.

Kenley Jansen has a much higher release point than his career average, by over half a foot, without significant adverse effects.

library(ggrepel)
data_grouped %>% group_by(pitcher) %>% arrange(game_year,game_month) %>% 
  mutate(count_to_date = cumsum(count)) %>% 
  filter(12*height_diff >= 2.5,count_to_date >=1000) %>% 
  ungroup() %>% 
  filter(game_year==2022,game_month == 7) %>% 
  left_join(names,by="pitcher") %>% 
  ggplot(
    aes(x = 12 * height_diff, y = woba_diff)
  )+
  geom_hline(yintercept = 0)+
  geom_vline(xintercept = 0)+
  geom_point(size=5)+
  geom_label_repel(aes(label = PLAYERNAME),family="bahn",
                   alpha=0.5,min.segment.length = 0,max.overlaps = 100)+
  xlab("Release Height Change to Career Average /inches")+
  ylab("wOBA Change to Career Average")+
  ggtitle("July 2022 Release Height Outliers")+
  theme_minimal()+
  theme(text = element_text(family = "bahn",size=15))

Maybe Josh Hader’s performance decrease isn’t random. Looking at similar decreases in performance alongside similar release height changes may tell us whether there is a type of player who is vulnerable to this change.

Let’s take all pitcher-months with a release height increase of at least 2.5 inches and a wOBA increase of at least .150.

filtered_group = data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  filter(12*height_diff >= 2.5,
         woba_diff >=0.150) %>% 
  # Add on name data, I load this from an external file
  left_join(names,by="pitcher")

filtered_group %>% 
  arrange(PLAYERNAME) %>% 
  select(PLAYERNAME,
         game_year,
         game_month,
         height_diff,
         woba_diff) %>% 
  # Convert feet to inches before rounding, so values aren't snapped to multiples of 1.2
  mutate(height_diff = round(12*height_diff,1),
         woba_diff = round(woba_diff,3)) %>% 
  rename(Player = PLAYERNAME,
         Season = game_year,
         Month = game_month,
         `Release Height Change` = height_diff,
         `wOBA Change` = woba_diff) %>% 
  kable()

Player	Season	Month	Release Height Change (inches)	wOBA Change
Adam Conley	2016	8	2.4	0.153
Alex Colome	2017	6	2.4	0.150
Andrew Cashner	2018	9	4.8	0.210
Arquimedes Caminero	2016	5	2.4	0.194
Bud Norris	2018	9	3.6	0.273
Cam Bedrosian	2021	4	2.4	0.189
Casey Fien	2017	6	2.4	0.185
Jhoulys Chacin	2021	5	2.4	0.156
Jordan Lyles	2019	7	3.6	0.189
Josh Hader	2022	7	3.6	0.297
Josh Osich	2017	9	2.4	0.157
Josh Tomlin	2016	8	2.4	0.154
Josh Tomlin	2021	8	2.4	0.229
Luke Weaver	2018	9	2.4	0.193
Matt Bush	2017	9	2.4	0.270
Mike Clevinger	2019	6	2.4	0.166
Nate Jones	2020	8	3.6	0.151
Pedro Baez	2017	9	3.6	0.212
Rick Porcello	2020	7	2.4	0.170
Ross Stripling	2021	9	2.4	0.213
Ryan Dull	2019	4	2.4	0.192
Sam Freeman	2018	7	3.6	0.165
Tommy Milone	2016	6	3.6	0.150
Tommy Milone	2016	8	3.6	0.155
Travis Wood	2017	9	2.4	0.218
Will Harris	2019	7	2.4	0.184
Williams Perez	2016	9	2.4	0.210
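
A note on reading the table: the release height changes only take the values 2.4, 3.6 and 4.8, and some sit below the 2.5 inch cut-off. That pattern is consistent with the values being rounded to the nearest 0.1 ft before being multiplied by 12, while the filter works on the unrounded numbers. The sketch below uses made-up height differences (not taken from the data) to show how the order of rounding and converting changes what gets displayed.

```r
# Hypothetical height differences in feet, all above 2.5 inches (0.2083 ft)
height_diff_ft = c(0.21, 0.24, 0.28, 0.33)

# Round in feet first, then convert: values land on multiples of 1.2 inches
feet_first = 12 * round(height_diff_ft, 1)

# Convert to inches first, then round to 0.1 inch
inches_first = round(12 * height_diff_ft, 1)

print(data.frame(height_diff_ft, feet_first, inches_first))
```

Here the first two values display as 2.4 when rounded in feet first, even though both are above 2.5 inches before rounding.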

There are only 27 pitcher-months that fit these criteria. Is there anything special about them? Firstly, do they have odd release heights?

player_list = unique(filtered_group$pitcher)

data_players = data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000,
         pitcher %in% c(player_list))

data_grouped %>% 
  filter(count >=100,
         career_pitches >=1000) %>% 
  ggplot(aes(x = 12*height))+
  geom_density(fill="grey")+
  geom_density(data = data_players,color="red",fill="red",alpha=0.5)+
  theme_minimal()+
  ggtitle("Grey - Total population,
          \nRed - Pitchers who saw worse results with increased release height")+
  xlab("Height /inches")+
  theme(text = element_text(family = "bahn",size=10))

From the graph above we can see that they have quite a representative spread in release height, if anything sitting slightly higher than the general distribution.

In addition, we can look at whether there is a clear correlation between wOBA and release height for these players, or whether the selected months are clear outliers from the distribution.

data_players %>%
  ggplot(aes(x = 12*height_diff,y = woba_diff))+
  annotate("segment",x=2.5,y=0.15,xend=6,yend=0.15)+
  annotate("segment",x=2.5,y=0.15,xend=2.5,yend=0.4)+
  geom_point(size=2)+
  geom_point(data = 
    filter(
      data_players,
      12*height_diff >= 2.5,
      woba_diff >=0.150),size=2,color="red")+
  theme_minimal()+
  ggtitle("Red:  Results in the 27 filtered Pitcher-months
          \nBlack:  All Pitcher-months for players picked up by the filter")+
  xlab("Height Change /inches")+
  ylab("wOBA Change")+
  theme(text = element_text(family = "bahn",size=10))

This graph shows that there is nothing special about these pitchers. They have simply had a bad month that coincided with a release height change. In the rest of their careers, there has been no correlation between release height change and results. These pitchers have had months with much higher release points where they saw little change compared to their career wOBA.
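
One way to put a number on the within-pitcher relationship would be to compute a separate correlation for each pitcher. The sketch below does this on synthetic pitcher-months for three hypothetical pitcher IDs (the `sim_players` data is random noise, not real results), so it only demonstrates the grouping pattern.

```r
library(dplyr)

# Synthetic pitcher-months for three hypothetical pitchers (NOT real data)
set.seed(2)
sim_players = data.frame(
  pitcher = rep(c(101, 102, 103), each = 20),
  height_diff = rnorm(60, sd = 0.1),   # feet
  woba_diff = rnorm(60, sd = 0.05)
)

# One correlation coefficient per pitcher
within_pitcher = sim_players %>% 
  group_by(pitcher) %>% 
  summarise(months = n(),
            r = cor(height_diff, woba_diff),
            .groups = "drop")

print(within_pitcher)
```

Replacing `sim_players` with `data_players` would give one coefficient per flagged pitcher. With only a handful of qualifying months per pitcher these estimates will be noisy, so they are best read alongside the scatter plot rather than instead of it.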

data_players %>%
  filter(pitcher==623352) %>% 
  ggplot(aes(x = 12*height_diff,y = woba_diff))+
  geom_path(size=2)+
  geom_point(size=4,aes(color = factor(game_year)))+
  theme_minimal()+
  xlab("Release Height Change /inches")+
  ylab("wOBA Change")+
  labs(color = "Season")+
  theme(text = element_text(family = "bahn",size=15))+
  ggtitle("Josh Hader")

If we only look at Josh Hader, we can see that there has been a clear release height increase in 2022, but that this only led to a wOBA increase in July 2022.

The first two months of 2022 had a similarly high release point, but Hader was posting wOBAs almost 100 points below his career average. If the release point were important in explaining this decrease in effectiveness, then I would have expected the first few months of 2022 to go much worse.

Conclusions

Using this data, I have found very little to suggest that there is a global relationship between release height changes and pitcher effectiveness. In addition, those pitchers whose particularly bad months coincided with release height increases don’t show any correlation between release height and results outside of that month.

Pitcher release point changes could still be a mechanism for explaining performance differences. However, I believe this may only be possible to analyse with performance statistics that are very stable, such as stuff quality.

Ahead in the count
