Astros: Projecting regular season win total for 2021

(Photo by Michael Reaves/Getty Images)

One of the more ubiquitous modern metrics is Pythagorean Win Percentage, which estimates a team’s expected record from its run differential. It’s tracked by MLB, ESPN and Baseball-Reference, among others.

Earlier this season, as the Houston Astros were bludgeoning the Oakland A’s, I noticed Houston’s expected wins by this formula seemed out of whack, so I flipped over to FanGraphs and theirs seemed a bit conservative. My initial thought was the number in my head would be somewhere between the two. “It’s early in the season, surely corrections are coming.”

Those corrections never seemed to materialize, so I decided to try to develop a more realistic, if less mathematically sound, expectation of my own for the Astros’ win total.


The problem with the Pythagorean, at least as I see it, is that it ignores the outcomes of the games in the calculation of expected wins for a season. Let’s assume a three-game series goes like this: a 12-1 Astros win in game one and losses in the next two games by 2-1 and 1-0 margins. The Astros have outscored their opponent 13-3, yet are 1-2.

Pythagorean tells us these losses will be made up somewhere along the line and shows an expected record of 3-0 (2.847 to be exact). I understand that the Pythagorean formula is using run differential as a way of looking to the team’s future performance, rather than past performance.
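For readers who want to check that arithmetic, here is a minimal Python sketch of the classic Pythagorean expectation applied to the three-game example. It assumes the original exponent of 2; some sites use a slightly different exponent (around 1.83), which would give a slightly lower figure. The function name is illustrative.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x).

    exponent=2.0 is the original form; it is an assumption here,
    since sites differ on the exact value they use.
    """
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Three-game example from the text: 13 runs scored, 3 allowed.
games = 3
expected_wins = games * pythagorean_win_pct(13, 3)
print(round(expected_wins, 2))  # roughly 2.85 expected wins out of 3
```

With the exponent set to 2, the result lands at about 2.85 wins, in line with the figure quoted above, even though the actual record in the example is 1-2.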

Here’s the problem: those losses are real, and you aren’t getting those games back. No matter how badly you beat the next team, you still have those two losses on your record. Intrigued, I began building a blended, weighted formula that would reflect not only the run differential but also the on-field results.

The next step was to backtest my numbers against the Astros’ previous five full seasons (excluding 2020) to see if I was even in the ballpark. I’m calling it blended because I’m blending two things, and weighting each: the Pythagorean expected wins and the actual current won/loss percentage. Together they produce the estimated win total.
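To make the idea of blending concrete, here is a rough Python sketch of one way such a formula could be put together. The article does not give the actual weights or the exact form of the calculation, so the 50/50 weight, the 162-game scaling, the exponent of 2, and the function name below are all assumptions for illustration. The sample inputs are made-up round numbers, not Astros data.

```python
def blended_win_total(wins, losses, runs_scored, runs_allowed,
                      pyth_weight=0.5, season_games=162, exponent=2.0):
    """Project a season win total from a weighted blend of
    Pythagorean win% and actual win%.

    pyth_weight is a hypothetical weight (the real one is not given
    in the article); the remaining weight goes to the actual record.
    """
    if not 0.0 <= pyth_weight <= 1.0:
        raise ValueError("pyth_weight must be between 0 and 1")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    pyth_pct = rs / (rs + ra)              # what run differential implies
    actual_pct = wins / (wins + losses)    # what actually happened
    blended_pct = pyth_weight * pyth_pct + (1.0 - pyth_weight) * actual_pct
    return season_games * blended_pct


# Made-up inputs: a 50-40 team that has scored 450 runs and allowed 400.
projected = blended_win_total(50, 40, 450, 400)
print(round(projected, 1))
```

Setting `pyth_weight` to 1.0 reduces this to a pure Pythagorean projection, and 0.0 reduces it to simply extrapolating the current winning percentage, so the blend always falls between those two.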

Over 2015-2019, the blended formula missed the Astros’ actual win totals by a combined nine games, while the Pythagorean formula missed by a combined 19. I can’t stress enough that I’m not saying my calculation is mathematically sound or that the Pythagorean is not a better long-term answer. However, when I backtested my formula, it came much closer to the actual totals than the Pythagorean formula did.

Will it this year? We’re about to find out.

Here’s the good news: If you believe in the blended model I described above, it currently has the Astros ending the regular season with 101 wins, 12 games ahead of the A’s. Assuming the Astros get healthy and Alex Bregman and Carlos Correa return to the lineup on a regular basis, it’s not a stretch to envision a 12-game margin in the division.

Here’s what the AL West looks like with the blended calculation:

Team          Record      GB
Astros        101-61      -
A’s           89-73       12
Angels        79*-83      22
Mariners      79*-83      22
Rangers       67-95       34

* Angels are on pace for 79.4 wins while Mariners are at 78.6.
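The games-behind column and the rounding in the footnote can be checked with a short Python sketch. It uses the standard games-behind calculation (the average of the gap in wins and the gap in losses) and the projected records from the standings above; the function and variable names are illustrative.

```python
def games_behind(leader, team):
    """Standard games-behind: average of the win gap and the loss gap."""
    leader_wins, leader_losses = leader
    team_wins, team_losses = team
    return ((leader_wins - team_wins) + (team_losses - leader_losses)) / 2


# Projected records (wins, losses) from the blended standings.
standings = {
    "Astros": (101, 61),
    "A's": (89, 73),
    "Angels": (79, 83),
    "Mariners": (79, 83),
    "Rangers": (67, 95),
}

leader = standings["Astros"]
gb = {team: games_behind(leader, record) for team, record in standings.items()}

# The footnote's unrounded paces both round to the same whole number.
angels_wins = round(79.4)
mariners_wins = round(78.6)

for team, behind in gb.items():
    print(team, behind)
```

This is why the Angels and Mariners show identical records in the table even though their unrounded paces differ by almost a full win.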

As the All-Star break comes to an end, FanGraphs is projecting 95 wins for the Astros and 88 for the A’s, a much closer, but still comfortable, margin for the Astros.

The Pythagorean model used by MLB.com currently has the Astros on pace for 104 wins and the A’s for 86. The blended model’s 101 wins for the Astros sits between those two projections, and I’m comfortable with that. Going forward, I plan to update this periodically if there is interest in the topic.
