The Last Word on Joe Girardi’s Game 2 Replay Challenge Blunder

The Yankees manager’s defense of his non-challenge against the Indians on Friday may have a sliver of statistical support, but the data show that hoarding challenges isn’t a strategy for success

Yankees manager Joe Girardi Getty Images/Ringer illustration

On Friday evening in a Cleveland dugout, Yankees manager Joe Girardi made what most observers regarded as the first significant managerial blunder of baseball’s 2017 playoffs. As is so often the case during the postseason, a time when we obsess over starting pitchers not pulled and optimal relievers warmed but too slowly summoned, Girardi’s sin was a move not made.

With two outs and two on in the bottom of the sixth inning and the Yankees up 8–3 on the Indians, Yankees reliever Chad Green, facing his third batter in relief of starter CC Sabathia, nicked the hand of Indians pinch hitter Lonnie Chisenhall with a 96 mph fastball. So said plate umpire Dan Iassogna, who declared the hit by pitch, sending Chisenhall (who had been down 0-2) to first and loading the bases for Cleveland. But slow-motion replays showed that the ball had almost certainly nicked the knob of Chisenhall’s bat, not his hand, before deflecting into catcher Gary Sánchez’s glove.

GIF of Green’s pitch hitting Chisenhall’s bat

Hit-by-pitch calls are reviewable under MLB’s replay rules, but Girardi never issued a challenge. That non-review proved pivotal: Had Iassogna’s call been overturned, the Yankees would have been out of the inning, with a win expectancy upward of 97 percent. As it was, with the bases loaded, their win expectancy was only 93 percent—or, in this instance, slightly lower, because the next batter was not the generic major leaguer that the win-expectancy model assumes, but star shortstop Francisco Lindor. Naturally, Lindor homered, plating four runs, which brought the Indians within one and lowered the Yankees’ win expectancy to 70 percent. In time, that figure would fall to zero percent, after a Jay Bruce homer in the eighth and a Yan Gomes single in the 13th gave Cleveland a 9-8 win and a 2-0 series lead. (The Indians’ tying and winning runs were scored off of relievers whom Girardi had arguably asked to do too much, but let’s stick with one mistake at a time.)

Girardi’s decision not to challenge proved … unpopular.

In the pre-review era, Yankees fans might have blamed Iassogna and Green for their team being down 0-2 in the series instead of even at one game apiece, with a win against Indians ace Corey Kluber in the books and two of the three remaining games set in New York. Instead, Girardi was the whipping boy—and understandably so, once we accept that the replay-review system puts the onus on the team to police the umpire’s performance. Although Iassogna technically blew the call, he’d been asked to make an impossible, split-second pivot from preparing to signal a strike or a ball to judging whether a fastball had hit a hand or a spot just below, based mostly on sound. Compared with that task, Girardi’s duty to challenge within 30 seconds seemed simple. And while Green was the one who hung a slider in a high-leverage spot, he wouldn’t have had to throw it if the call had been corrected.

After an apparently unforced managerial error of this magnitude, only an airtight defense or a forthright admission of guilt can quell the anger of a fan base out for blood. Girardi’s defense Friday leaked a lot of air and entirely lacked self-flagellation. During his postgame grilling, Girardi said he didn’t challenge because whatever video angles Yankees replay coordinator Brett Weber had seen and described within the 30-second window weren’t definitive. In his first public comments, Girardi didn’t deem his decision a mistake, even though he knew by then that only Chisenhall’s bat had been hit by the pitch. “There was nothing that told us that he was not hit on the pitch,” he said. “By the time we got the super slo-mo, we are beyond a minute. It was way too late.”

Granted, what eventually seemed so clear to viewers at home—that the ball had hit the bat—was much harder to tell at the time. Even putting the pressure of the moment aside, there was no way to know that Green would get the out if Girardi challenged. The problem with Girardi’s justification, at least on the surface, is that there was seemingly little harm in challenging anyway, in the hope that the slo-mo view would reveal evidence of a foul ball. In the playoffs, managers get two challenges per game, up from the usual one. What’s more, managers who’ve run out of challenges can still appeal to the crew chief to review non-home-run calls starting in the eighth inning. With one out left in the sixth and one challenge left—Girardi had successfully challenged once already earlier in the game—it was unlikely that an upheld call in this spot would leave the Yankees lacking later.

Girardi argued, though, that there was a hidden cost to challenging, which was visible to him because of his baseball experience. “Being a catcher, my thought is I never want to break a pitcher’s rhythm,” Girardi said. “That’s how I think about it.” Later in the same presser, he doubled down, adding, “I’m going to reiterate, I think about keeping a pitcher in rhythm.” And even the next day, he returned to that alleged cost, asking, “If it isn’t overturned and we’re wrong and then Chad struggles after that, do you feel like I screwed him up?”

There’s plenty to critique about the “being a catcher” comment. For one thing, Girardi hasn’t been a catcher since 2003; after 11 years at the helm of the Marlins and Yankees, he should be thinking primarily as a manager. Of course, that’s not to say that his catching experience can’t continue to inform his thinking. But if he was thinking like a catcher in this instance, why didn’t he defer to the Yankees’ active catcher, Sánchez, who told Girardi immediately that he’d heard the ball hit the bat? And why wasn’t he focused closely enough on the hitter to notice, as others in the dugout did, that Chisenhall didn’t react to the supposed impact and seemed surprised when Iassogna told him to take his base?

Acknowledging that the “being a catcher” defense is problematic to begin with, let’s examine Girardi’s specific claim about replay disrupting pitchers’ rhythm to see whether there was reason to think that Green would be adversely affected by waiting for a replay review. If that disruption is real, then Girardi’s decision would seem more reasonable, especially since the likelihood of the HBP being overturned looked low in light of what he’d been told by a previously reliable replay person.

To test Girardi’s concern, I asked Rob McQuown of Baseball Prospectus for records of all innings since the start of the replay-review era in 2014 in which the same pitcher faced batters both before and after a replay review (thus excluding all innings in which a review came on the first or last play, or in which a pitching change occurred right before or after a review). McQuown also included the full-season true average (Baseball Prospectus’s all-in-one offensive rate stat, which is scaled to batting average such that .220 is terrible, .260 is average, .300 is great, and so on) for each pitcher and hitter, as well as the true average of their resulting matchup.

With that data in hand, I used the log5 method pioneered by Bill James to estimate the results of each batter-pitcher matchup, based on the full-season true average and true average allowed of the players involved. For instance, if a hitter with a .280 true average faced a pitcher with a .240 true average allowed, one would expect the true average of their head-to-head plate appearances to fall somewhere between those two numbers.
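For readers who want to see the arithmetic, here is a minimal sketch of a log5-style estimate in Python. It assumes the standard odds-ratio form of log5 with a .260 league-average true average as the baseline. The function name `log5_matchup` and that baseline are illustrative assumptions, not necessarily the exact calculation behind the numbers in this piece. True average is a scaled rate stat rather than a literal probability, so treating it like one is an approximation.

```python
LEAGUE_TAV = 0.260  # assumed baseline: true average is scaled so .260 is average


def log5_matchup(hitter_tav, pitcher_tav_allowed, league_tav=LEAGUE_TAV):
    """Estimate a head-to-head rate with the odds-ratio form of log5.

    Hypothetical helper: treats true average like a probability, which is
    an approximation made for illustration only.
    """
    success = hitter_tav * pitcher_tav_allowed / league_tav
    failure = (1 - hitter_tav) * (1 - pitcher_tav_allowed) / (1 - league_tav)
    return success / (success + failure)


# The example from the text: a .280 hitter against a .240 pitcher.
estimate = log5_matchup(0.280, 0.240)
print(f"{estimate:.3f}")
```

Under these assumptions, the .280-versus-.240 matchup lands at roughly .259, between the two inputs, and a league-average hitter facing a league-average pitcher comes out at exactly league average.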

The table below shows the collective estimated and actual true average figures for the matchups that preceded the replay reviews, the matchups that ended in replay reviews, and the matchups that followed replay reviews, with that last category subdivided into matchups that immediately followed reviews (as in, the first batter after a review) and subsequent matchups (as in, any batters in the same inning after the first one to follow the review). We’re most interested in the fourth column, which shows how pitchers did after replay reviews, relative to expectation.

True Averages Pre- and Post-Replays

True Average   Pre-Replay   Replay   Post-Replay   Immediate Post-Replay   Subsequent Post-Replay
Estimated      .266         .265     .265          .264                    .266
Actual         .482         .423     .269          .272                    .266

Girardi was right (maybe, and barely)! After a replay review, hitters outperform expectations by four points of true average. All of that advantage is concentrated in the first matchup after the replay review: The first hitter who follows the review outperforms expectations by eight points, while all subsequent hitters collectively match expectations precisely. The numbers are consistent with the claim that pitchers lose a little rhythm after a replay review, if for only one batter. And the batter following the review Girardi was considering in ALDS Game 2 would be a very good one, with the bases loaded to boot.
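The gaps described above can be checked directly against the table. This short sketch just subtracts the estimated figures from the actual ones and expresses the result in points of true average; the dictionary keys and the helper name `points_over_expected` are made up for illustration.

```python
# Collective true averages copied from the table above.
estimated = {"pre": 0.266, "replay": 0.265, "post": 0.265,
             "post_immediate": 0.264, "post_subsequent": 0.266}
actual = {"pre": 0.482, "replay": 0.423, "post": 0.269,
          "post_immediate": 0.272, "post_subsequent": 0.266}


def points_over_expected(column):
    """Actual minus estimated true average, in points (thousandths)."""
    return round((actual[column] - estimated[column]) * 1000)


for column in ("post", "post_immediate", "post_subsequent"):
    print(column, points_over_expected(column))
```

The three post-review columns come out to four, eight, and zero points, matching the figures in the paragraph above.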

Time for a couple of caveats. First, eight points of true average aren’t worth that much; this season, the difference between .264 and .272 was the difference between Evan Longoria and (to use a Yankees example) Chase Headley. According to McQuown, a gain of one point of true average is worth an additional half a run over 500 plate appearances. Over one plate appearance, the extra value is negligible.
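To put a number on "negligible," here is a rough sketch that takes McQuown's half-run-per-point figure and assumes it scales linearly with plate appearances. That linearity is my simplifying assumption, and the helper name `extra_runs` is hypothetical.

```python
RUNS_PER_POINT_PER_500_PA = 0.5  # McQuown's figure, as quoted in the text


def extra_runs(points_of_tav, plate_appearances):
    """Extra runs implied by a true-average gain, assuming linear scaling."""
    return points_of_tav * RUNS_PER_POINT_PER_500_PA * plate_appearances / 500


print(extra_runs(8, 500))  # a full season's worth of plate appearances
print(extra_runs(8, 1))    # the single plate appearance after a review
```

On those assumptions, an eight-point edge is worth about four runs over 500 plate appearances but less than a hundredth of a run in any one trip to the plate.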

Second, the sample here is skewed in that we’re examining pitchers who’ve just put themselves in position for a replay review. In many cases, that means one or more events that put a runner on base preceded the review, setting up a contested force play or tag play; the review itself may have been touched off by a home run, a double down the foul line or, yes, a hit by pitch. The point is that pitchers who are on the mound during a replay review have often struggled in some way that invited the review, which suggests that their apparent problems post-review could just be a symptom of a preexisting funk. Maybe it’s not that pitchers lose their rhythm while they wait two minutes for the review verdict; it’s that pitchers who have to wait are out of rhythm already.

Regardless, it doesn’t seem as if a post-review hangover effect could be strong enough for the typical pitcher to justify Girardi’s decision not to challenge. And with time to reflect—and, perhaps, be swayed by external and internal condemnation—Girardi reached the same conclusion. In an unusually frank Q&A on Saturday, he adopted a much more regretful tone. “Now, knowing that I had two challenges, in hindsight, yeah, I wish I would have challenged it,” he said. “And, yeah, I should have challenged it, now that I think about it.” Later, he added, “I screwed up. And it’s hard. It’s a hard day for me.”

In retrospect, it makes sense that Girardi would eventually be burned by being a little too tight-fisted with challenges. In the replay-review era, the Yankees have had by far the highest success rate when issuing challenges, according to data pulled from the Baseball Savant instant-replay database. Three out of every four of their challenges have led to an overturned call.

Graph showing Yankees with the highest replay challenge success rate from 2014-2017

However, the Yankees have also issued the third-fewest challenges overall.

Graph showing Rangers and Rays leading total replay challenges from 2014 to 2017

Even though Weber obviously has a good feel for when a call is likely to be overturned, it’s possible that he’s recommending challenges too selectively. The teams with the most overturned plays in the replay-review era aren’t the ones with the highest success rates; they’re the ones that challenge the most often. (The teams in first or tied for second in total plays overturned—the Cubs, Rays, and Pirates, respectively—are tied for first, third, and fifth in total challenges issued.) By reserving their challenges for clear-cut cases and eschewing some that could go either way, Weber (and by extension, Girardi) might be acting too conservatively and passing up opportunities for additional outs, a case that Sam Miller and I made on a podcast last year. “Very seldom have I ever wasted a challenge when it wasn’t conclusive,” Girardi said Saturday. “That’s just what I’ve done, you know. Maybe that’s the wrong way. But that’s the way I’ve been.”

The question now is whether the Yankees—and Girardi, whose contract expires after this season—will pay a lasting price for hoarding their challenges in Game 2. It’s fairly rare for a well-established manager of a winning team to be dismissed over one ignominious mistake; former Red Sox manager Grady Little, who was fired after infamously sticking too long with Pedro Martínez in Game 7 of the 2003 ALCS, is more the exception than the rule, and even he got to manage again. Other managers who’ve made glaring recent postseason mistakes, including Mike Matheny and Buck Showalter, have held onto their gigs. Most of the factors that predict a manager’s likelihood of returning—team wins and improvement in wins, playoff appearances, etc.—point toward Girardi coming back. And maybe one costly replay decision shouldn’t be a fireable offense, given Girardi’s long-term bullpen-management skills, ability to avoid drama in the country’s biggest media market, and history of both winning a title with a talented team and keeping less-talented teams out of losing territory.

Should the Yankees—who won 1-0 on Sunday to stay alive in the series and will fight for their survival again on Monday night—become the 10th team in 76 tries to come back from an 0-2 division series deficit, Girardi’s flub will probably be forgotten. If the Indians eliminate them, the fallout from the foul tip that wasn’t will linger much longer. As Girardi said Saturday, “Let's just see what happens tomorrow and as we move forward. That will probably determine the severity of it.”

An earlier version of this piece gave an incorrect score for Game 3 of the Indians-Yankees ALDS; it was 1-0, not 2-0.