FWIW, Tango's "standard wOBA" formula:
wOBA = 0.9 * 1B + 1.25 * 2B + 1.6 * 3B + 2.0 * HR + 0.7 * (BB+HBP)
... could be broken down as
wOBA = 0.9 * H + (0.35 * 2B + 0.7 * 3B + 1.1 * HR) + 0.7 * (BB+HBP)
which is kind of close to (since TB-H = 2B + 2 * 3B + 3 * HR, only the HR weight changes, from 1.1 to 1.05)
0.9 * H + 0.35 * (TB-H) + 0.7 * (BB+HBP)
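So if we divide by 0.35 (0.9/0.35 is about 2.6, and 0.7/0.35 is 2), ignore the HBP, and forget about the denominator mismatch (AVG and ISO are per at-bat, wOBA is per plate appearance), this could sort of be mapped to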
2.6 * AVG + 2 * BB% + ISO
Maybe we just round up the 2.6 to 3 and call it...
BAPP321 = 3*AVG + 2*BB% + ISO
This one has an R-squared of 0.968 with wOBA (same dataset as you used here), so it's not quite as good as BAPP2. OTOH, the algebraic relationship between BAPP321 and wOBA is a little more direct than that between BAPP or BAPP2 and wOBA.
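For anyone who wants to check the arithmetic, here is a rough Python sketch of the steps above: the standard wOBA numerator, the regrouped version, the TB-H approximation, and BAPP321. The function names and the stat line are made up for illustration, and BB% is assumed to be BB/PA; none of this comes from the dataset mentioned in the thread.

```python
def woba_numerator(singles, doubles, triples, hr, bb, hbp):
    """Numerator of the 'standard wOBA' formula quoted above."""
    return (0.9 * singles + 1.25 * doubles + 1.6 * triples
            + 2.0 * hr + 0.7 * (bb + hbp))


def woba_numerator_regrouped(singles, doubles, triples, hr, bb, hbp):
    """Same numerator, regrouped as 0.9 * H plus the extra-base bonuses."""
    hits = singles + doubles + triples + hr
    return (0.9 * hits + (0.35 * doubles + 0.7 * triples + 1.1 * hr)
            + 0.7 * (bb + hbp))


def woba_numerator_approx(singles, doubles, triples, hr, bb, hbp):
    """Approximation that swaps the extra-base bonuses for 0.35 * (TB - H)."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return 0.9 * hits + 0.35 * (total_bases - hits) + 0.7 * (bb + hbp)


def bapp321(singles, doubles, triples, hr, bb, ab, pa):
    """BAPP321 = 3*AVG + 2*BB% + ISO (BB% assumed to be BB/PA)."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = hits / ab
    iso = (total_bases - hits) / ab
    bb_pct = bb / pa
    return 3 * avg + 2 * bb_pct + iso


if __name__ == "__main__":
    # Hypothetical stat line, not a real player.
    line = dict(singles=100, doubles=30, triples=3, hr=20, bb=60, hbp=5)
    exact = woba_numerator(**line)
    approx = woba_numerator_approx(**line)
    # The two differ only by 0.05 per home run (1.1 vs 1.05).
    print(exact, approx, exact - approx)
    print(bapp321(100, 30, 3, 20, 60, ab=550, pa=620))
```

The regrouped and original numerators agree exactly; the TB-H version undershoots by 0.05 per home run, which is the "kind of close" part.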
You can also see my contributions from two years ago, and you can see obvious parallels with Eli's more rigorous approach:
http://tangotiger.com/index.php/site/article/lies-damned-lies-and-batting-average
In other words: start with what you think is CENTRAL for a batter, then create the other metrics needed to support that.
Eli thinks it's getting hits. I think it's getting on base.
I mean, all you are doing is subdividing the three parts that make up the success of the batter on offense and combining them together, so why not add in ROE as part of OBP? It will be the highest R-squared of all your stats, and it is the final missing piece. I have been saying this for 9 years now and no one listens. The accounting of stats in baseball is flawed. ROE should be part of OBP anyway: if the batter does not make an out, and does not get any of his teammates out, did he not reach base safely? Is he not on base? And did he not do at least two positive things during his ROE plate appearance (catcher's interference is a type of ROE, since the catcher gets an error on the play)? 1. He at least made contact with the ball, forcing the defense to make a play. 2. He used his legs to hustle to be safe, putting even more pressure on the defense. I have 3 stats on offense that I made: bases moved, bases caught, and HIPOWER: hits, ROE, contact outs, isolated power, walks. You have to make sure it's done so that there is no multicollinearity or kurtosis, but yeah.
What's ROE?
Reached on error, I believe.
Ah, that makes sense. Yes, ROE is a real skill and should probably be included in OBP. But that's a separate discussion in my opinion.
Yes, it is indeed reached on error, and the way OBP is currently calculated omits ROE.
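To make the ROE point concrete, here is a minimal Python sketch comparing standard OBP with a version that credits ROE in the numerator. The name `obp_with_roe` and the numbers are hypothetical. It assumes ROE plate appearances are already charged as at-bats, so only the numerator changes, and it does not try to handle catcher's interference, which is not charged as an at-bat.

```python
def obp(h, bb, hbp, ab, sf):
    """Standard on-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def obp_with_roe(h, bb, hbp, roe, ab, sf):
    """Hypothetical OBP variant that also credits reached-on-error.

    Assumes each ROE is already counted as an at-bat, so the
    denominator is the same as in standard OBP.
    """
    return (h + bb + hbp + roe) / (ab + bb + hbp + sf)


if __name__ == "__main__":
    # Made-up season line for illustration only.
    print(obp(h=153, bb=60, hbp=5, ab=550, sf=5))
    print(obp_with_roe(h=153, bb=60, hbp=5, roe=6, ab=550, sf=5))
```

Under these assumptions each ROE adds 1/(AB + BB + HBP + SF) to the batter's OBP, the same as a hit or a walk would.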