Your Mariners Hard Rock Cafe … D-O-V, Seattle’s Best Baseball Blog :- )

SABRMath


August 16, 2007: 11:00 pm: posted by : SABRMatt ....> 329 Comments <.... filed under: SABRMath

I decided to make my comments on the Game Averaged PythagenPat win estimator a regular, updated feature of DOV, complete with PythagenMatt for every Major League team. I'll update this each night to include the latest games played by each team, and I'll leave comments open to discuss the strengths of teams as they rise and fall on the chart. Here are the 30 teams in order of adjusted PythagenMatt W%.

For those not familiar with PythagenMatt, I'll give the short explanation here; a link to the full explanation appears in comment four below.

We know that the biggest problem with ordinary seasonal Pythag is that it puts too much emphasis on a team's production in blowout games. Generally, once the game starts to get out of hand, the losing side starts to send out the reserves, especially the reserve pitchers. They in essence become a weaker team for the rest of that game. This often leads to additional run scoring that has essentially no meaning (or very close to none). I believe that the best way to cancel out the impact of blowout games is to put each game on a pythagorean scale. If you win by 8 runs or by 12, you can still only win one game at a time, and the pythagorean difference between a 10-1 game and a 15-1 game is negligible.

PythagenMatt is PythagenPat, but applied to one game at a time and then summed and averaged (per game).

For example, if you win 14-3, the PythagenPat equation gives us an exponent of 17^0.285 or 2.24 and a winning percentage of 0.969.  Do this for every game and you get something that correlates much more strongly to actual winning percentage than seasonal pythag (I demonstrated a 4% improvement in R^2 in the article I linked in comment #4).  Doing just that gives you numbers that bias toward .500 from both sides (a center-pull) because you're by definition taking away some of the extremes, but the center-pull is easily remedied by applying an adjustment based on the linear correlation I ran to prove that PythagenMatt was indeed a step in the right direction.
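The per-game calculation described above can be sketched in a few lines of Python. This is only a minimal sketch, not the actual workbook behind the chart: the function names `game_pythagenpat` and `pythagenmatt_raw` are made up for illustration, the 0.285 constant is taken from the 14-3 example, the handling of a scoreless line is an assumption, and the center-pull adjustment is left out because its regression coefficients are not given here.

```python
def game_pythagenpat(runs_scored, runs_allowed, k=0.285):
    """PythagenPat winning percentage for one game (k from the 14-3 example)."""
    total = runs_scored + runs_allowed
    if total == 0:
        # Assumption: a scoreless line carries no information, so call it .500.
        return 0.5
    exponent = total ** k  # runs per game, with games = 1
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagenmatt_raw(games, k=0.285):
    """Unadjusted PythagenMatt: the mean of the per-game values.

    `games` is a list of (runs_scored, runs_allowed) tuples.
    The center-pull correction is NOT applied here.
    """
    return sum(game_pythagenpat(rs, ra, k) for rs, ra in games) / len(games)
```

Under these assumptions, `game_pythagenpat(14, 3)` comes out at about 0.969, and a 10-1 win and a 15-1 win land within about one percentage point of each other, which is the sense in which the extra blowout runs stop mattering.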

View article …

March 27, 2007: 4:16 pm: posted by : SABRMatt ....> 16 Comments <.... filed under: SABRMath

ALTERNATE TITLE DEPT.

When You ASSUME, You Make (heh) of U and ME

I had a plan all worked out for how to rate baserunning in the play-by-play era. The topic seems especially germane these days, what with the discussion over whether Hargrove's offensive strategy is hurting the Mariners' chances of scoring. Are the Mariners too aggressive on the bases? Not aggressive enough? Are they doing better than we think? The PBP database is filled with all kinds of nifty information that can help us understand the relative skill of individual players and teams when it comes to taking that extra base consistently without wasting runners and outs. I have a roughly complete outline of how I would like to approach the rating of baserunning, but I've run into a bit of a snag.

The central idea behind a strong analysis of baserunning in the PBP era is the direct comparison of the results of each play to the results we would expect if the baserunners were average. I won't go into the complexities of multiple-baserunner plays just yet (that's a whole 'nother article unto itself) or the nitty gritty of my planned analysis method. The important point for today is that in order to compare any one play to average, you need to know the common result of that play. To show what I mean, I'll use a very simple play:

Runner at first with no one out - Batter hits a single.

A careful study of the skill of the runner at first starts with finding what the probability is that an average runner will reach second, third, or home or be thrown out on a set of plays with matching initial conditions (the batting event and starting base/out state are the same).

My initial plan was to use league average probabilities as my reference point, but even at the league level, you start running into small sample sizes for rare combinations of base/out states and batting events (bases loaded, two outs, line drive extra-base hit, for example), which makes it difficult to trust any findings. The problem is greatly magnified when you start dealing with multiple runners and using conditional probabilities to account for the fact that lead runners have a limiting impact on the possible movements of trail runners.
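To make the reference-point idea concrete, here is a rough Python sketch of tallying runner outcomes by matching initial conditions and flagging thin samples. It is an illustration under stated assumptions, not the planned method: the function name `advancement_rates`, the tuple layout of a play, the state and outcome labels, and the `min_sample=30` cutoff are all hypothetical, and the records in `sample_plays` are invented just to show the input shape.

```python
from collections import Counter, defaultdict


def advancement_rates(plays, min_sample=30):
    """Estimate outcome probabilities for each (base/out state, batting event).

    `plays` is an iterable of (start_state, batting_event, runner_outcome).
    `min_sample` is an arbitrary cutoff for flagging small samples.
    """
    counts = defaultdict(Counter)
    for state, event, outcome in plays:
        counts[(state, event)][outcome] += 1

    rates = {}
    for key, outcomes in counts.items():
        n = sum(outcomes.values())
        rates[key] = {
            "n": n,
            "reliable": n >= min_sample,  # too few plays -> don't trust it
            "probs": {o: c / n for o, c in outcomes.items()},
        }
    return rates


# Invented records, NOT real play-by-play data.
sample_plays = [
    ("runner on 1st, 0 out", "single", "second"),
    ("runner on 1st, 0 out", "single", "second"),
    ("runner on 1st, 0 out", "single", "third"),
    ("runner on 1st, 0 out", "single", "out"),
]
result = advancement_rates(sample_plays)
```

With only four plays in the bucket, the sketch reports probabilities but marks them unreliable, which is the small-sample problem in miniature.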

View article …

March 24, 2007: 11:16 pm: posted by : SABRMatt ....> 3 Comments <.... filed under: SABRMath

OVERVIEW

One of the first missions of almost every sabermetrician is to determine a preferred strategy for rating the performance of baseball teams and players while keeping in mind the many complicating factors that distort statistics like wins and losses and run differentials.

There is a host of available data today that makes analysis of teams possible, but some understanding of the dynamic way in which those statistics combine to produce wins and losses is required, and this is not a simple matter. Empirical analysis has for years centered on the idea that averages tell enough of the story to be used as the backbone of any system designed to adjust raw statistics to account for the context in which they occurred. This document will explore the problems with empirical sabermetrics and introduce a new tool designed to bridge the gap between the intrinsic skill of the players and the real-world statistics that define them.

REVIEW OF EMPIRICAL METHODS (empirical or traditional analysis includes my own work in the field…the original Pythagorean Comparative Analysis (PCA) for example)

Up until this moment, all documented analyses of player and team value have proceeded in a straightforward, logical fashion, going from point A to point B to point C in order.

A) Rate the offensive context of the league.

View article …

March 19, 2007: 10:41 pm: posted by : SABRMatt ....> 10 Comments <.... filed under: SABRMath

I was in the middle of (sort of) freaking out because I couldn't really think of a good provocative opening topic for DOV premium blend.  I was going to do something about the real Mariners this spring (the ones who will get major league playing time, rather than the ones who are bombing and ruining the ends of most of our games), but it just didn't strike me as "news" in that I think everyone here is quite aware of the good spring Seattle is actually having once you take out the losers who really had no place on the club.

Anyway, I decided instead to pick through my many files, writings and data sets looking for a sabermetric concept I had researched but hadn't divulged here yet. I ran into a note on the subject of Pythagorean W% that I had considered posting to DOV but never quite found an opportune moment to introduce.

Name the #1 problem with Pythagorean W% as a guide to team performance. Quick. Without thinking too hard. You should have said "blowout games."

The first objection you hear from most sabermetricians is that a seasonal RS or RA tally can be dramatically warped by a few big blowouts that obscure the real pattern of the team's performance.  The distribution of runs from game to game is a chief source of error in pythagorean win estimation and it got me thinking…
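A quick way to see the objection is to compute a seasonal estimate with and without a single lopsided game. The Python sketch below does that under stated assumptions: `seasonal_pythagenpat` is an illustrative name, the exponent constant of 0.285 is an assumption (versions of PythagenPat in circulation use values in that neighborhood), and the scores are invented rather than taken from any real team.

```python
def seasonal_pythagenpat(games, k=0.285):
    """Seasonal PythagenPat W% from a list of (runs_scored, runs_allowed)."""
    rs = sum(g[0] for g in games)
    ra = sum(g[1] for g in games)
    exponent = ((rs + ra) / len(games)) ** k  # based on total runs per game
    return rs ** exponent / (rs ** exponent + ra ** exponent)


# Invented scores: a 4-6 stretch made up entirely of one-run games.
close_games = [(4, 5), (3, 4), (5, 6), (2, 3), (4, 3),
               (3, 2), (6, 5), (1, 2), (5, 4), (3, 4)]

# The same stretch plus one 18-1 laugher.
with_blowout = close_games + [(18, 1)]

before = seasonal_pythagenpat(close_games)
after = seasonal_pythagenpat(with_blowout)
print(f"without blowout: {before:.3f}   with blowout: {after:.3f}")
```

In this made-up stretch, the one blowout moves the seasonal estimate from below .500 to well above it, even though it added only a single win to the record.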


Copyright©  2007 detectovision.com - All Rights Reserved