I love this. I've been playing around with BAPP on an individual level to see how it might shift our perception of players. One thing I'm noticing is that, at least relative to OPS rankings, the players hurt most by a shift to BAPP are high-BA players with moderate-to-low BB% and ISO.
You sort of addressed this at the top, but I'm wondering if you'd agree with this takeaway: BA matters, but in the effort to replace BA, some stats used as a replacement (mainly OPS) were accidentally overvaluing BA.
I'm a Cleveland fan so I've been playing around with their stats. An example that jumped out was 2014 Carlos Santana vs 1995 Carlos Baerga:
Based on OBP and OPS you'd say they're similar:
Baerga: .355 OBP, .807 OPS
Santana: .365 OBP, .792 OPS
But BAPP shows Santana as a much more valuable player:
Baerga: .314 BA / .058 BB% / .138 ISO - .510 BAPP
Santana: .231 BA / .171 BB% / .196 ISO - .598 BAPP
Thoughts on this comparison and takeaway?
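For anyone who wants to check the numbers in this comparison, here is a minimal sketch. It assumes BAPP is simply BA + BB% + ISO, which is what the two stat lines above add up to; the function and variable names are made up for illustration.

```python
def bapp(ba, bb_pct, iso):
    """BAPP, assumed here to be the plain sum of BA, BB%, and ISO."""
    return ba + bb_pct + iso

# Component values copied from the comment above: (BA, BB%, ISO).
players = {
    "Baerga 1995": (0.314, 0.058, 0.138),
    "Santana 2014": (0.231, 0.171, 0.196),
}

# Round to three places, the usual precision for these rate stats.
bapp_by_player = {name: round(bapp(*parts), 3) for name, parts in players.items()}

for name, value in bapp_by_player.items():
    print(f"{name}: {value:.3f} BAPP")
```

Under that assumption the sums reproduce the .510 and .598 figures quoted above.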
Yes, perhaps ironically, since BA makes up a large share of both OBP and SLG, OPS can make a high-BA player look better than BAPP would. BAPP clearly shows a hitter's strengths and weaknesses across the 3 pillars of hitting.
I'm really hoping that BAPP can take off, and I have a post scheduled for Saturday that digs into it some more.
Thank you for reading and commenting!
Great article, Eli! Thanks for sharing! Essentially, each of the traditional triple slash line metrics needs the others to complete the story.
SLG doesn't tell us how often the player avoids outs.
OBP doesn't tell us what proportion of the non-outs come from the more valuable hits (and doesn't reward extra base hits).
BA doesn't tell us the quality of the hits or the other ways to get on base.
I really like the idea of BAPP.
This is excellent work. Thank you!!!
Great read. I played baseball in college from 2009-2012. Before the start of my junior year, the top half of our lineup created a competition for who would walk the fewest number of times over the course of the season. We also had side wagers with the first hitter to get to 5 walks being the loser. This was born from a late night conversation over a couple frosty beers where guys with higher BAs were ragging on guys with higher OBPs.
Something odd happened: our walks went down, average went up… and we had our best conference finish in school history.
On an unrelated note, we were A LOT more aggressive on the base paths too… attempting to stretch every 50/50 play into a double/triple etc.
Obviously not MLB data nor super objective. But my view on baseball theory changed during that season.
Appreciate the write up and information!
In your example, how is it possible that trading 55 pts of BA outweighs losing 55 pts each of BB% and ISO when the coefficients are +0.224 vs -0.156 and -0.2?
Other way around. You lose 55 points of BA, and gain the other two.
So Player A .260/.365/.510 is better than Player B .315/.365/.510? If I'm understanding correctly, Player A, the patient power hitter, is 55*(-0.224+0.156+0.2)=7.26 runs more valuable per 600 PA than Player B, the aggressive contact hitter.
I see that Ryan already came to the same conclusion above in comparing Baerga to Santana.
Based on this analysis, yes. It's why I strongly believe that BAPP is superior to the traditional triple slash, and am hopeful that we can move towards that. If you want player level precision, you would want to use linear weights at the individual level.
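The arithmetic in this exchange can be checked with a short sketch. It assumes the coefficients quoted above (0.224, 0.156, 0.2) are runs per point per 600 PA, and that the trade is exactly 55 points in each component, as the commenter's example supposes. The names are illustrative only.

```python
# Coefficients quoted in the thread, read here as runs per point
# of each stat per 600 PA (an assumption about their units).
COEF_BA = 0.224
COEF_BB = 0.156
COEF_ISO = 0.2

def run_delta(ba_points, bb_points, iso_points):
    """Run difference per 600 PA for a change of N points in each component."""
    return ba_points * COEF_BA + bb_points * COEF_BB + iso_points * COEF_ISO

# Player A relative to Player B: 55 fewer points of BA,
# 55 more points each of BB% and ISO.
delta = run_delta(ba_points=-55, bb_points=55, iso_points=55)
print(f"Player A minus Player B: {delta:+.2f} runs per 600 PA")
```

This gives the same +7.26 as the hand calculation: the BA loss costs about 12.3 runs, while the BB% and ISO gains add back about 8.6 and 11.0.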
I knew this Ben Clemens answer was wrong when he wrote the article over at FanGraphs. But everything you say as a commenter at FanGraphs is deemed wrong if you don't agree with the authors. While you described it in your article, you did not mention the technical term: multicollinearity, meaning overlapping or redundant predictability. I also knew the article was wrong because I did my own analysis. I will let you take any year from 1980-1992, 96 and any year from 2010-2022 (excluding the strike years, 1981, 1994, 1995, 2020 covid, and you will have to adjust for only 26 teams in the 80's) and compare them with a 30 to 40 year gap in between. Batting average and on-base percentage together (BA and walks don't overlap) are a better predictor of runs than OPS and SLG. Why? 1. Because the modern age devalues the single by default: there are way fewer of them hit compared to the 80's, because the mantra is don't swing unless you can get a double or homer. 2. Lineup turnover was higher in the 80's than it is in the modern age. 3. More runs were scored in the 80's. Go do the calculations yourself, or if you want you can have my spreadsheets.
This is interesting research, Eli, but I think there's a question being missed here. What do we want to predict? Are we looking for metrics that will predict how many runs a team DID score? Or how many runs a team WILL score? One is backward-looking, one is forward-looking. The forward-looking question leads us to that mythic search for "true talent level."
If I am given the statistics for a game (AVG, OBP, SLG, wRC+, etc.) and told to guess the final score, honestly, simpler is better (really, the Runs metric is pretty dang simple, and it will give me the best answer). But if you give me a team's stats for June (AVG, OBP, etc.) and ask me to predict their record in July -- now THAT'S a tougher, and more important, question.
The problem with Batting Average isn't that it doesn't report facts -- it certainly does! (Albeit in a weird way, thanks to the use of ABs as the denominator.) The problem is that AVG takes some 1200 PA to stabilize, meaning we end up with random oddities like in 2000, when outfielder Jeffrey Hammonds (a career .272 hitter) randomly hit .335 for a whole season. He never did that before, and never again. Did he help Colorado win some games that year? You bet your butt he did. But Milwaukee was probably pretty disappointed the following two seasons when, after they acquired him, he averaged .250.
This is also why wRC+ has an even worse correlation with Runs scored -- wRC+ is trying to strip park effects from the data. Hammonds hit .335 in the 2000 Colorado run environment. That place was next to the moon. So a metric like wRC+ gives less credit -- not because Hammonds didn't get hits and create runs, but because he probably couldn't do it again. (That said, his wRC+ was pretty good in 2000.)
I think the next step for this test is seeing if a team's AVG on even days can predict their run scoring on odd days. Or can a team's June AVG predict July runs? Because if it performs poorly there (and it historically has), then AVG is right back where it was at the beginning -- an interesting, but comparatively very limited metric.
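The even/odd-day test described above could be sketched roughly like this. Everything here is hypothetical: the team names, the record layout, and the numbers are invented toy values purely to show the mechanics, so the printed correlation means nothing. Real game logs would be needed to actually run the test.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half(games):
    """games: list of (day_number, hits, at_bats, runs) for one team.
    Returns (AVG on even days, runs per game on odd days)."""
    even = [g for g in games if g[0] % 2 == 0]
    odd = [g for g in games if g[0] % 2 == 1]
    avg_even = sum(g[1] for g in even) / sum(g[2] for g in even)
    runs_odd = sum(g[3] for g in odd) / len(odd)
    return avg_even, runs_odd

# Invented toy logs (NOT real data): (day, hits, at-bats, runs).
toy_logs = {
    "Team X": [(1, 9, 34, 5), (2, 8, 33, 4), (3, 7, 32, 3), (4, 10, 35, 6)],
    "Team Y": [(1, 6, 31, 2), (2, 7, 33, 3), (3, 8, 34, 4), (4, 6, 32, 2)],
    "Team Z": [(1, 11, 36, 7), (2, 9, 34, 5), (3, 10, 35, 6), (4, 12, 37, 8)],
}

# One (even-day AVG, odd-day R/G) pair per team, then correlate across teams.
pairs = [split_half(games) for games in toy_logs.values()]
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
print(f"even-day AVG vs odd-day R/G: r = {r:.2f}")
```

The same split could be run for OBP, SLG, or BAPP on the even days to see which one carries over best to the odd days.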
Thanks for the detailed comment. My view is that if you're looking to make predictions, your best bet is to use one of the many excellent free projection systems, such as ATC, THE BAT or ZiPS.
Further, if you're looking for accuracy then you definitely want wOBA and/or wRC+, and not OPS or BAPP.
Hi Ben,
Really enjoyed this piece. Would you mind sharing your calculation for linear weights when comparing a .250 hitter to a .333 one? I’m trying to better understand it. Thanks!
https://www.fangraphs.com/guts.aspx
210 hits + 70 walks = 233.9
140 hits + 140 walks = 220.2
If you trade 10 walks for 10 singles, you gain about 2 runs. So if you do it 70 times (as in my contrived example), you gain about 13-14.
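A quick check of the arithmetic above, using only the two run totals quoted in this reply. The underlying linear weights from the guts page are not reproduced here, so the per-swap value below is backed out from the totals, not looked up.

```python
# Run totals quoted above for the two contrived hitters.
runs_high_avg = 233.9  # 210 hits + 70 walks
runs_low_avg = 220.2   # 140 hits + 140 walks

swaps = 210 - 140  # 70 walks turned into singles

total_gain = runs_high_avg - runs_low_avg
gain_per_swap = total_gain / swaps
gain_per_ten_swaps = 10 * gain_per_swap

print(f"total gain: {total_gain:.1f} runs")
print(f"per 10 walk-to-single swaps: {gain_per_ten_swaps:.2f} runs")
```

That works out to 13.7 runs in total and just under 2 runs per 10 swaps, matching the "about 2" and "about 13-14" figures above.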
Thanks!
This is interesting work; thanks for presenting it. How does your model account for HBP? There is a great deal of variability in HBP% and in some instances it is a significant component of a player's OBP. Unless I'm missing something, your calculation of BAPP disregards HBP.
I left it out to keep things simple, and yes it should be put into the patience bucket.
3 questions:
1) the title of the BAPP vs r/PA says the r^2 is .72 and then in the description it says it’s .86. Which is it?
2) did you graph out wRC+ to r/PA, what was that r^2?
3) is r/PA the standard way to measure offense? I would’ve thought r/outs made, but idk?
1) Should be 0.86, good catch :).
2) wRC+ had a lower correlation for some reason
3) No, but it works with the variables I was using. You can measure offense however you want.
Thanks for reading and commenting!
For this type of study, it is important to correlate to runs per out rather than runs per PA since games are measured in outs. Part of the value of getting on base is that not making outs gives your team more PAs to work with in its allotted number of outs, so if two teams have the same R/PA but one has a better OBP, that team will likely score more runs per game. Correlating to R/PA will miss this aspect of on-base value and can skew the results, such as finding a lower correlation for wRC+.
wRC+ will also probably have a lower correlation to run scoring than just wOBA (and possibly other rate stats) because wRC+ is park adjusted. Teams that play in good hitters' parks will have higher raw batting stats and score more runs without necessarily having a higher wRC+.
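The runs-per-out point above can be illustrated with a rough sketch. It assumes a 27-out game and that every plate appearance that is not a time on base costs exactly one out (no double plays, caught stealing, or extra innings). The two teams and their numbers are hypothetical.

```python
OUTS_PER_GAME = 27  # nine innings, three outs each

def runs_per_game(runs_per_pa, obp):
    """Runs per game under the simplifying assumption that each PA
    either reaches base (probability obp) or makes exactly one out."""
    pa_per_game = OUTS_PER_GAME / (1 - obp)
    return runs_per_pa * pa_per_game

# Two hypothetical teams with identical R/PA but different OBP.
low_obp = runs_per_game(runs_per_pa=0.12, obp=0.300)
high_obp = runs_per_game(runs_per_pa=0.12, obp=0.340)

print(f"{low_obp:.2f} vs {high_obp:.2f} runs per game")
```

With these made-up inputs the higher-OBP team gets roughly two extra plate appearances per game and scores about 4.9 runs instead of 4.6, even though both teams have the same R/PA.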