ALTERNATE TITLE DEPT.
When You ASSUME, You Make (heh) of U and ME
I had a plan all worked out for how to rate baserunning in the play-by-play (PBP) era. The topic seems especially germane these days, what with the discussion over whether Hargrove's offensive strategy is hurting the Mariners' chances of scoring. Are the Mariners too aggressive on the bases? Not aggressive enough? Are they doing better than we think? The PBP database is filled with all kinds of nifty information that can help us understand the relative skill of individual players and teams when it comes to taking that extra base consistently without wasting runners and outs. I have a roughly complete outline of how I would like to approach the rating of baserunning, but I've run into a bit of a snag.
The central idea behind a strong analysis of baserunning in the PBP era is the direct comparison of the results of each play to the results we would expect if the baserunners were average. I won't go into the complexities of multiple-baserunner plays just yet (that's a whole 'nother article unto itself) or the nitty-gritty of my planned analysis method. The important point for today is that in order to compare any one play to average, you need to know the common result of that play. To show what I mean, I'll use a very simple play:
Runner at first with no one out - Batter hits a single.
A careful study of the skill of the runner at first starts with finding the probability that an average runner will reach second, third, or home, or be thrown out, on a set of plays with matching initial conditions (the batting event and starting base/out state are the same).
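To make that concrete, here is a minimal sketch of how such a baseline could be tabulated. It is only an illustration, not my actual method: the record layout (`bases`, `outs`, `event`, `result`), the base-state string "1--", and the handful of sample plays are all made up for the example and are not drawn from any real PBP file.

```python
from collections import Counter, defaultdict

# Hypothetical play records (invented for illustration):
# (bases_before, outs, batting_event, result_for_runner_on_first)
plays = [
    ("1--", 0, "1B", "2B"),
    ("1--", 0, "1B", "3B"),
    ("1--", 0, "1B", "2B"),
    ("1--", 0, "1B", "OUT"),
    ("1--", 1, "1B", "3B"),
]

def baseline_distribution(plays):
    """Group plays by initial conditions and return, for each group,
    the observed share of each runner result."""
    counts = defaultdict(Counter)
    for bases, outs, event, result in plays:
        counts[(bases, outs, event)][result] += 1
    return {
        key: {result: n / sum(c.values()) for result, n in c.items()}
        for key, c in counts.items()
    }

baseline = baseline_distribution(plays)
# Distribution for "runner at first, no one out, batter singles"
print(baseline[("1--", 0, "1B")])
```

Each key in `baseline` is one set of matching initial conditions, and the value is the observed share of each outcome, which is the "average runner" reference that an individual play would be compared against.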
My initial plan was to use league-average probabilities as my reference point, but even at the league level, you start running into small sample sizes for rare combinations of base/out states and batting events (bases loaded, two outs, line-drive extra-base hit, for example), which makes it difficult to trust any findings. The problem is greatly magnified when you start dealing with multiple runners and using conditional probabilities to account for the fact that lead runners have a limiting impact on the possible movements of trail runners.
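A rough sketch of why conditioning thins the samples so quickly follows. Everything here is an assumption for illustration: the six plays are invented (runners on first and second, batter singles), the result codes are made up, and the `MIN_SAMPLE` cutoff of 3 is arbitrary rather than a recommended threshold.

```python
from collections import Counter, defaultdict

# Hypothetical records (invented): runners on first and second, batter singles.
# (lead_runner_result, trail_runner_result)
plays = [
    ("HOME", "3B"),
    ("HOME", "2B"),
    ("HOME", "2B"),
    ("3B", "2B"),
    ("3B", "2B"),
    ("OUT", "2B"),
]

MIN_SAMPLE = 3  # arbitrary cutoff, chosen only for this example

def trail_given_lead(plays):
    """Estimate P(trail result | lead result) and record how many
    plays each conditional estimate is based on."""
    counts = defaultdict(Counter)
    for lead, trail in plays:
        counts[lead][trail] += 1
    table = {}
    for lead, c in counts.items():
        n = sum(c.values())
        table[lead] = {
            "n": n,
            "reliable": n >= MIN_SAMPLE,
            "probs": {trail: k / n for trail, k in c.items()},
        }
    return table

table = trail_given_lead(plays)
for lead, row in table.items():
    print(lead, row)
```

Six plays become three separate conditional estimates, and two of them fall below the cutoff. Note also that when the lead runner stops at third, the trail runner can only ever be observed at second, which is the limiting effect described above.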