Projections vs. Reality

I love stats. I’ve been a baseball stathead since I was a kid. I kept track of my own batting average in Little League, as well as walks, extra-base hits (those were easy – there weren’t many of them), and RBIs. I used to check the box scores in The Record every morning to see what the Mets’ updated stats looked like after the previous night’s game (unless the game was on the west coast – ahhh, the dark ages), and I’ve continued that habit into the present day.

Over the past 10 years or so, I’ve been familiarizing myself with sabermetrics. I like them. I find them useful. Especially the ones I can understand. Advanced metrics have given me a whole new perspective on the careers of baseball players today, and a renewed appreciation of players from the past.

The one relatively new development I haven’t gotten on board with is the concept of statistical projections.

Projection systems – like PECOTA (which was released today for the 2014 season), ZiPS, or Steamer – are usually based on the average performance of a player over the last 3-4 years, taking into account decline or improvement caused by age and other factors. They’re meant to be an objective measurement of what a ballplayer is supposed to be based on his track record.
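To make that description a little more concrete, here is a minimal Python sketch of the general idea: weight a player's recent seasons (most recent counts most), then nudge the result for age. This is not how PECOTA, ZiPS, or Steamer actually work. The function name `project_rate`, the 5/4/3 weights, the peak age of 27, and the half-percent-per-year adjustment are all assumptions picked for illustration.

```python
def project_rate(seasons, weights=(5, 4, 3), age=None, peak_age=27, age_step=0.005):
    """Project a per-plate-appearance rate (e.g. home runs per PA).

    seasons: list of (count, plate_appearances) tuples, most recent first.
    weights: assumed season weights, most recent first (illustrative only).
    age, peak_age, age_step: a crude, assumed aging adjustment; the rate is
        scaled up for players younger than peak_age and down for older ones.
    """
    weighted_count = sum(w * count for w, (count, _) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))
    if weighted_pa <= 0:
        raise ValueError("need at least one season with plate appearances")

    rate = weighted_count / weighted_pa
    if age is not None:
        rate *= 1 + (peak_age - age) * age_step
    return rate


# Made-up seasons (count, PA), most recent first; not any real player's line.
example_seasons = [(20, 600), (30, 600), (10, 300)]
projected = project_rate(example_seasons, age=30)
print(f"projected rate per PA: {projected:.4f}")
```

Multiply the projected rate by an expected number of plate appearances and you get a counting-stat projection, which is why the playing-time guess matters so much.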

First of all, how accurate are they? Let’s look at projections for a few Mets players for the 2013 season, and how they stacked up against reality. These are based on the ZiPS algorithm:


ZiPS assumed Wright would play a full season, despite the fact that he missed a good chunk of 2011 with a broken back induced by a collision with Ike Davis. For this, I give ZiPS credit. One of my major pet peeves with projection systems is that they’ll take an injury from 3 years ago, and no matter how freakish or isolated it was, they’ll assume that the player will miss a significant chunk of time in his subsequent years.

As it turned out, Wright missed a month and a half with a hamstring pull. Had David put up 610 plate appearances in 2013, his doubles and RBI totals would have been close to the ZiPS projection. He would have hit a few more home runs, however, and ZiPS completely missed his batting average, on-base percentage, and slugging percentage. This is probably due to the fact that he hit .254/.345/.427 in 2011 (while playing with the back injury), which was a statistical outlier when you look at his entire career.
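The "had he put up 610 plate appearances" comparison is just prorating: scale a counting stat by the ratio of target PAs to actual PAs. Here is a small sketch of that arithmetic. The helper name `prorate` is my own, and the 20 doubles in 500 PA are placeholder values, not Wright's actual 2013 line; only the 610 PA target comes from the comparison above.

```python
def prorate(count, actual_pa, target_pa):
    """Scale a counting stat to a different number of plate appearances."""
    if actual_pa <= 0:
        raise ValueError("actual_pa must be positive")
    return count * target_pa / actual_pa


# Placeholder numbers for illustration only.
doubles_over_full_season = prorate(20, actual_pa=500, target_pa=610)
print(round(doubles_over_full_season, 1))
```

Note that this only works for counting stats like doubles, home runs, and RBIs. Rate stats such as batting average, on-base percentage, and slugging percentage don't change when you prorate, so a miss on those is a miss no matter how much playing time you assume.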

Two of the biggest misses ZiPS came up with last year were Ruben Tejada and Ike Davis. I’m going to throw those out because of how badly they deviated from their previous trends. Not even the most carefully crafted software program could have seen that coming.

The player ZiPS seems to have had the most success with is Daniel Murphy. Here are his 2013 results:


ZiPS predicted fewer plate appearances because of his 2011 injury. I’m not sure why it factored Murphy’s injury into his projected plate appearances, but didn’t factor Wright’s into his.

If ZiPS had the number of PAs right, the rest of his stats would have been very close. ZiPS also came close to accurately projecting Murph’s 2012 numbers.

Anyone with a strong knowledge of baseball stats can look at a player’s career trends and make an educated guess about the kind of season he’s about to have. You don’t necessarily need a supercomputer to do so. But if I’m compiling projections for every player in the league (plus minor leaguers), then you’d better believe I’m going to have a CPU do it for me.

So, while I still believe blind reliance on projections would be ill-advised, I can see how they can be useful. They can help a front office decide what they have, which in turn helps them decide what they need. I would still treat them as just one of many tools in the box, however, since their accuracy can fluctuate.

Paul is a freelance writer, blogger, and broadcast technology professional residing in Denver. A New Jersey native, he is a long-suffering Mets fan, a recently happy Giants fan, and a bewildered Islanders fan. He's also a fair-weather Avalanche and Rockies supporter. In his spare time, he enjoys the three Gs: Golf, Guitars, and Games.
  1. NormE February 5, 2014 at 6:36 pm

    As you point out, projections are “just one of many tools….”
    Personally I think they are a crutch for front offices to use to divert blame for their mistakes. The human factor makes projections a hit-or-miss proposition. Sometimes they get it right and other times they are way off.
    What was the projection on Marlon Byrd before the 2013 season (I have no idea, but I’ll bet it was way off)? I know I’m cherry picking, but I’m not a fan of such items. Projections/predictions are opinions, and as such are commodities to be offered to the fan questing for insights or validation.

  2. argonbunnies February 5, 2014 at 8:34 pm

    Projections are good for forecasting average luck. If a projection tells you the Mets will be a 74-win team, then anyone who says, “But what about this, this, and this, ha ha, they’re actually an 85-win team!” is expecting good luck (and, usually, not being honest about that).

    Few teams’ baseball seasons unfold exactly as expected. Most teams do have either upswings or downswings relative to what was expected before the season. The key, I think, is about goals and odds.

    Right now the Mets’ problem is that their odds of achieving their 2014 goals are very low. That doesn’t mean they definitely won’t happen, but it does mean the front office needs to do a lot more to give them a 50/50 shot.

    I, for one, am tired of crossing my fingers against long odds, and I’m now disgusted by spin that tries to make the odds sound shorter in defiance of all objective evidence.

    It is certainly true that the 2014 Mets could win 86 games and earn a wild card berth, but it is equally true that they could lose 100.

    I might be more inclined to take 2014 Mets projections with a grain of salt if they weren’t telling me exactly what I already expected.

  3. JoeBourgeois February 5, 2014 at 9:40 pm

    Good piece with only one caveat:

    Wright got the stress fracture in his back diving to tag out Carlos Lee at third, not due to the collision with Davis.