
An attempt to measure catcher game-calling or: Why Sandy Leon is great

I was browsing Twitter a while ago and came across a couple of tweets which discussed the unquantifiability of catcher game-calling and I took that as a challenge. 

This is my attempt to quantify catcher game-calling, using a mixture of pitch run values and my own Statcast-based pitch-level predictions.

I should note that measuring game-calling ability has been attempted before by Harry Pavlidis, founder of Pitch Info. However, I haven't found much information on this statistic since it was published in 2015.

What is Game-Calling?

Game-calling is rather self-explanatory: the catcher tells the pitcher which pitches to throw and where to throw them, and then the pitcher attempts to execute those instructions. However, catchers aren't random number generators. Each catcher calls pitches in specific ways, and some will be better at calling good pitches than others.

There are many facets to game-calling which a catcher could be good or bad at. Firstly, some catchers may be better at calling pitches which are suitable for the ball/strike count. For example, a catcher who insists on middle-middle fastballs in 0-2 counts and breaking balls in the dirt in 3-0 counts would not be a good game-caller. This example is exaggerated, but small differences in calling suitable pitches could add up to many runs over a long enough sample.

In addition, some catchers will take more advantage of advance scouting information, calling pitches which exploit the specific weaknesses of the batters that a pitcher faces.

Finally, different catchers may pick up on when their pitcher is struggling with locating a certain pitch or if their stuff is unhittable on another pitch. They can then adjust their game-calling to favour the pitches which are working particularly well on a given day.

These many aspects of game-calling make it a challenge to measure. It's difficult to derive the intent of a catcher since pitchers can miss their locations by some distance. In addition pitch outcomes can be sufficiently noisy that a catcher could call a good pitch and still get bad results from it. 

Game-Calling is not framing or blocking or any other aspect of catcher defense. This article purely deals with what the catcher can do before they receive the ball, an entirely mental aspect of a physical game.

Method 1: Pitch Run Values

The value of every pitch can be quantified in its run value. A good explanation of pitch values can be found on FanGraphs here. The idea is to reward a pitch for each change in the ball/strike count, along with the traditional run values of events. I use a slightly modified version of run values which doesn't use the results of balls in play but instead uses the batted ball type (groundball, line drive, flyball, popup). Of the batted-ball characteristics, pitchers have the most control over launch angle, so this modification removes a lot of the randomness from balls in play and makes the run values a bit more reliable in smaller sample sizes.

For each catcher-season I followed this procedure:

For each pitcher that the catcher works with, I took the pitcher's average run value per pitch in that season. Then I found the difference between their run value per pitch with the catcher and their overall average. Finally, I added up the extra run value which the catcher produced above average for each pitcher to give a total number of game-calling runs for the season.
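The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact code behind the leaderboard. It assumes run values are signed so that higher is better for the pitcher, and that each per-pitcher difference is weighted by the number of pitches the catcher caught for that pitcher. The function name and the input layout are hypothetical.

```python
from collections import defaultdict

def game_calling_runs(pitches):
    """Sum each catcher's run value above the pitcher's season average.

    pitches: iterable of (pitcher, catcher, run_value) tuples for ONE season.
    Assumption: run_value is positive when the pitch was good for the pitcher.
    Assumption: differences are weighted by pitches caught for that pitcher.
    """
    pitcher_sum = defaultdict(float)
    pitcher_n = defaultdict(int)
    pair_sum = defaultdict(float)
    pair_n = defaultdict(int)

    for pitcher, catcher, rv in pitches:
        pitcher_sum[pitcher] += rv
        pitcher_n[pitcher] += 1
        pair_sum[(pitcher, catcher)] += rv
        pair_n[(pitcher, catcher)] += 1

    runs = defaultdict(float)
    for (pitcher, catcher), n in pair_n.items():
        season_avg = pitcher_sum[pitcher] / pitcher_n[pitcher]
        with_catcher_avg = pair_sum[(pitcher, catcher)] / n
        # Extra runs this catcher produced with this pitcher, relative to
        # the pitcher's own season average.
        runs[catcher] += (with_catcher_avg - season_avg) * n
    return dict(runs)
```

Comparing each pitcher only to himself means a catcher is not rewarded simply for working with a strong staff.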

I believe this represents game-calling because if pitchers get significantly better results with one catcher than with another over a large enough sample then there can be very few other factors which could cause this difference. There could be effects from opponent quality but these should even out over a season's worth of games and different pitchers. The only risk is that pitch run values are too noisy for the game-calling signal to be visible. I could have used a "With Or Without You" (WOWY) method which compares run values for pitchers with and without the chosen catcher but this can lead to small sample effects which could distort my measures of run value.

In my initial attempt using this method, I noticed that the leaders looked suspiciously similar to the pitch framing leaders. Therefore, to remove any impact of framing, I used my model of predicted called strikes and ignored all takes in my sample with a called strike probability between 1% and 99%. This was probably overkill but it removed all borderline pitches which could have potentially been framed by the catcher.
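As a rough sketch of that filter: swings are always kept, and a take is kept only when the call is essentially certain. The field names `swing` and `called_strike_prob` are hypothetical. Whether the 1% and 99% boundaries themselves are included is also an assumption here.

```python
def drop_frameable_takes(pitches, low=0.01, high=0.99):
    """Remove takes whose called-strike probability is strictly between low and high.

    pitches: list of dicts with hypothetical keys
      'swing' (bool) and 'called_strike_prob' (float in [0, 1]).
    """
    kept = []
    for pitch in pitches:
        is_take = not pitch["swing"]
        could_be_framed = low < pitch["called_strike_prob"] < high
        if is_take and could_be_framed:
            continue  # borderline take: the catcher's framing may have decided it
        kept.append(pitch)
    return kept
```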

Using this method the top game-callers from 2015-2021 are:

| Catcher | Game-Calling Runs |
| --- | --- |
| Sandy Leon | 66.9 |
| Roberto Perez | 64.2 |
| Austin Hedges | 59.8 |
| Mike Zunino | 50.7 |
| Luke Maile | 49.4 |

Using pitch run values can be a noisy business, so we should look at the yearly correlation in this method to make sure that the values which come out are somewhat repeatable in future seasons. This would indicate that these game-calling runs are due to catcher skill and not some lucky breaks or a few starts against sub-par opponents.

The year-on-year correlation of these game-calling runs was 0.25 for catchers who caught at least 5000 pitches each year, which is quite small. This is somewhere between the yearly correlation of pitcher wins and LOB%. But it is not close to zero, which implies that there is some skill hidden in these run values. Sandy Leon has a reputation for being an excellent game-caller, so it is good that his name appears atop the leaderboard.
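A year-on-year check of this kind can be sketched as follows. Each catcher's value in one season is paired with his value in the next, keeping only pairs where he caught at least 5000 pitches in both seasons, and the Pearson correlation is taken over those pairs. The data layout and function names are hypothetical, and using season totals rather than a per-pitch rate is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_on_year_pairs(runs, pitches_caught, min_pitches=5000):
    """Pair each catcher-season with the same catcher's following season.

    runs, pitches_caught: dicts keyed by (catcher, season).
    A pair is kept only if both seasons reach min_pitches.
    """
    pairs = []
    for (catcher, season), value in runs.items():
        following = (catcher, season + 1)
        if following not in runs:
            continue
        if (pitches_caught[(catcher, season)] >= min_pitches
                and pitches_caught[following] >= min_pitches):
            pairs.append((value, runs[following]))
    return pairs
```

With the pairs in hand, the correlation is `pearson([a for a, _ in pairs], [b for _, b in pairs])`.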

I wanted to try to remove some of the noise from these values. Sometimes batters hit line drives off good pitches, and sometimes they strike out on meatballs. How could I remove this from my measure of catcher game-calling?

Method 2: Predicted Pitch Run Values

If you've read any of my blog articles or follow me on Twitter then you'll know that I have a set of models which I use to make predictions of pitch quality based on pitch characteristics and location. Further information on these models can be found here and here. I like to think of these as a pitching analogue to xwOBA for batters, using Statcast data without looking at pitch outcomes to predict performance.

The good part about using predicted pitch values is that they remove a huge amount of variance from pitch outcomes, which should hopefully allow for a less noisy measure of game-calling. In addition we no longer need to worry about the impact of pitch framing, as the models don't know whether a pitch will get framed or not and therefore assume a league average framing rate.

There are some downsides to using predicted pitch quality. Firstly, this measure of game-calling will no longer adjust based on the batter that the pitcher faces. The model doesn't know which hitters are weak in different areas, and therefore this important aspect of game-calling will be ignored. Secondly, pitchers still miss their spots and shake off their catchers, meaning that the pitch which gets thrown is not always what the catcher intended.

With these caveats in mind, here are the leaders in game-calling by using predicted pitch quality from 2015-2021:

| Catcher | Game-Calling Runs |
| --- | --- |
| Sandy Leon | 48.5 |
| Matt Wieters | 24.2 |
| Austin Barnes | 21.0 |
| Curt Casali | 19.9 |
| Chance Sisco | 19.2 |

The yearly correlation here becomes slightly better with a value of 0.3. Sandy Leon leads the field significantly, with double the game-calling runs of any other catcher. Somehow, Sandy Leon manages to get pitchers to throw better pitches when he is catching, a remarkable feat. Let's have a look at what makes Sandy so special.

A deep dive into Sandy Leon

For much of Sandy Leon's Red Sox tenure, he was joined in the catching position by Christian Vazquez, a partnership which has resulted in Vazquez lying at the bottom of the game-calling runs leaderboard. This is because every run which Leon gains is a run which another catcher on the same team has to lose.
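The zero-sum property can be written out explicitly. This is a sketch in my own notation, and it assumes that each catcher's credit for a pitcher is weighted by the number of pitches caught. For a pitcher p, let n_{pc} be the number of pitches caught by catcher c and \bar r_{pc} the average run value of those pitches. The pitcher's season average \bar r_p is then the pitch-weighted mean over his catchers, so the credits handed out for that pitcher cancel exactly:

```latex
\begin{align*}
\bar r_p &= \frac{\sum_c n_{pc}\,\bar r_{pc}}{\sum_c n_{pc}}, \\
\sum_c n_{pc}\,\bigl(\bar r_{pc} - \bar r_p\bigr)
  &= \sum_c n_{pc}\,\bar r_{pc} \;-\; \bar r_p \sum_c n_{pc} \;=\; 0 .
\end{align*}
```

Under that assumption, whatever one catcher gains with a given pitcher is taken from the other catchers who worked with him that season.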

The total run difference between the two adds up to either 110 or 95 game-calling runs, depending on the method used, or around 9-10 wins from the difference in pitch run values alone.

Here I'll look at which pitchers contribute to this significant game-calling run difference, and what about the two catchers makes them so good/bad at game calling.

Firstly here are the game-calling run values (using predicted pitch quality) by pitcher for each catcher during their time spent together in Boston.


Leon beats Vazquez in game-calling runs for virtually every pitcher whom they both caught!

There may be quite a simple reason for this large skill difference, which is surprising since the calculation of game-calling runs is so complicated. It boils down to the fraction of breaking balls which the catcher calls for.

Breaking balls such as sliders and curveballs are generally better than fastballs (see the difference in pitch run values between fastballs and sliders here) and recently teams have realised this and started to throw them more often. Leon called for far more breaking balls than Vazquez in every season they were together.


Throwing more breaking balls is especially important in two-strike counts, where a breaking ball below the zone has a high probability of inducing a strikeout. Again, Leon is far above Vazquez in calling for two-strike breaking balls.
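Both usage rates are simple to compute from pitch-level data. The sketch below is illustrative only. The field names are hypothetical, and treating only sliders and curveballs as breaking balls is an assumption based on the examples named above.

```python
from collections import defaultdict

# Assumption: only these pitch types count as breaking balls.
BREAKING_BALLS = {"slider", "curveball"}

def breaking_ball_rates(pitches):
    """Return {catcher: (overall_rate, two_strike_rate)}.

    pitches: iterable of dicts with hypothetical keys
      'catcher', 'pitch_type' and 'strikes' (strikes in the count before the pitch).
    two_strike_rate is None if the catcher has no two-strike pitches.
    """
    # Per catcher: [all pitches, breaking balls, two-strike pitches, two-strike breaking balls]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for pitch in pitches:
        tally = counts[pitch["catcher"]]
        is_breaking = pitch["pitch_type"] in BREAKING_BALLS
        tally[0] += 1
        tally[1] += is_breaking
        if pitch["strikes"] == 2:
            tally[2] += 1
            tally[3] += is_breaking

    rates = {}
    for catcher, (n, n_brk, n2, n2_brk) in counts.items():
        rates[catcher] = (n_brk / n, n2_brk / n2 if n2 else None)
    return rates
```

A fair comparison between two catchers would restrict this to the pitchers they both caught, since breaking ball usage depends heavily on the pitcher's arsenal.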

Perhaps after all the complicated modelling, a good and simple measure of catcher game-calling is just how many breaking balls they ask for. 

However, looking across my whole dataset, there is only a very small correlation between game-calling runs and breaking ball usage so this can't explain the difference in game-calling runs. 

I also considered whether different catchers called for a pitcher's "best pitch" (as judged by my pitch rating model) more often and if this could contribute towards the game-calling differences seen, but there was only a very weak relationship there too. Clearly the skills which the best game-callers use extend far beyond pitch selection.

Conclusions

Catcher game-calling is an important but difficult-to-measure part of the game of baseball. The catcher influences the type and location of every pitch thrown, so small differences in game-calling ability can be magnified to be worth many runs over time. In this study I've investigated a couple of different ways to measure catcher game-calling, by looking at actual and predicted pitch run values. Catcher game-calling is slightly less important than pitch framing when measured using these methods, with the difference between the best and worst game-callers being around half the difference between the best and worst framers.

It is important to remember that the number of game-calling runs saved by a catcher depends on the skill of other catchers on the same team. This can make it difficult to work out if someone high on the leaderboard is actually a good game-caller, or if they were just paired with a bad game-caller.

Using pitch run values can add a lot of noise into my measure of game-calling, which reduces the year-on-year correlation. I have tried to mitigate this by also using predicted run values. Sandy Leon stands out as clearly the best game-caller in both methods from the years 2015-2021.

Red Sox starters Chris Sale, Eduardo Rodriguez, Rick Porcello, and David Price all have better ERAs with Leon catching than with Christian Vazquez, despite Vazquez being the better pitch framer. This shows the impact that good game-calling can have on a pitching staff.

I've uploaded the overall leaderboard for my catcher game-calling runs as a csv file which you can download here, and the leaderboard for the 2021 season so far can be found here.
