Stat pundit rankings: MLB 2014 over/under win totals

The 2014 Major League Baseball regular season is over, making it the perfect time to look back at how stat pundits performed at predicting each team’s total wins.

Last year, Trading Bases bested competitors while also outperforming totals set in Las Vegas.

Before looking at this year’s results, let’s take a look at our competitors, along with each one’s abbreviation.

O/U: The over/under set for each team by sportsbooks. I used the ones set by sportsbetting.ag in early March. (Note: if you are scoring at home, all bets I extracted were between +115 and -135. For simplicity, I treat each the same way.)

BP: Baseball Prospectus (PECOTA). Picks were taken from the site before the season, although there does not appear to be an active link for these projections.

PM: Prediction Machine

FG: FanGraphs. There does not appear to be an active link for these projections, although the AL West’s are still online.

TB: Trading Bases, the 2013 winner and the website of Joe Peta

Cairo: Cairo

DP: Clay Davenport

LS: LiteSabers

Ensemble: An average of the first four sites above.

Here are my metrics:

MSE: Mean squared error between the predicted and actual win totals (lower is better)

MAE: Mean absolute error between the predicted and actual win totals (lower is better)

Percent: Percentage of successful over or under bets (higher is better)
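For readers scoring at home, here is a minimal sketch of how these three metrics could be computed. The function names and the three-team numbers are made up for illustration (they are not the 2014 data), and skipping pushes and no-opinion picks is an assumption rather than a rule stated in this post.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared prediction errors (lower is better)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def mean_absolute_error(predicted, actual):
    """Average of the absolute prediction errors (lower is better)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def percent_correct(predicted, lines, actual):
    """Percentage of winning over/under bets implied by the predictions.

    A prediction above the line is treated as an "over" bet, below as "under".
    Assumption: teams where the prediction equals the line (no opinion) or the
    actual total equals the line (push) are skipped.
    """
    wins = bets = 0
    for p, line, a in zip(predicted, lines, actual):
        if p == line or a == line:
            continue
        bets += 1
        wins += (p > line) == (a > line)
    return 100.0 * wins / bets if bets else float("nan")


# Made-up numbers for three imaginary teams (not real 2014 data).
predicted = [90, 70, 81]
lines = [87.5, 72.5, 80.5]
actual = [85, 75, 83]

mse = mean_squared_error(predicted, actual)
mae = mean_absolute_error(predicted, actual)
pct = percent_correct(predicted, lines, actual)
print(mse, mae, round(pct, 1))
```

In this toy example only the third team is a winning bet, so the record is 1-2.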

Metric    O/U    BP     FG     PM     TB     Cairo  DP     LS     Ensemble
MSE       69.2   73.1   68.1   81.8   73.5   73.9   64.1   164.6  70.5
MAE       6.38   6.67   6.1    6.94   7.37   6.5    6.2    10.2   6.41
Percent   NA     47     43     37     43     50     43     27     50

While FanGraphs and Davenport appeared to outperform their competitors and boasted slightly lower average errors (MSE, MAE) than the sportsbooks, betting each team according to either FanGraphs’ or Davenport’s predicted totals would have finished just 13-17. In general, the numbers set by sportsbooks finished closer to the eventual win totals than those of most other sites.

In summary, it’s pretty amazing that not a single one of the prediction methods I tracked was able to finish with an overall record above .500.

Lastly, I plotted the predicted and observed win totals, using both the sportsbook predictions (labeled “Las Vegas Prediction”) and the ensemble method (the average of the first four sabermetrics/statistics sites above, labeled “Steadheads Prediction”).

[Figure: MLB2014, predicted vs. observed win totals]

Win totals were relatively accurate for most teams. The Rockies, Diamondbacks, Rangers, and Red Sox all fell short of expectations, while the Marlins, Orioles, and Angels all outperformed them.

Interestingly, betting the “over” on the 15 teams with the lowest predictions, and the “under” on the 15 teams with the highest predictions, would have finished 19-11.
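That low-over/high-under rule is easy to check against any set of lines. Below is a rough sketch, not the code behind the 19-11 figure: the function name and the four-team data are hypothetical, and ties at the halfway point are simply broken by sort order.

```python
def contrarian_record(lines, actual):
    """Bet "over" on the lower half of lines and "under" on the upper half.

    `lines` and `actual` are dicts keyed by team. Returns (wins, losses, pushes).
    Assumption: with an odd number of teams the extra team lands in the
    "under" half; ties between lines are broken by sort order.
    """
    ordered = sorted(lines, key=lines.get)  # teams from lowest to highest line
    half = len(ordered) // 2
    wins = losses = pushes = 0
    for rank, team in enumerate(ordered):
        bet_over = rank < half
        if actual[team] == lines[team]:
            pushes += 1
        elif (actual[team] > lines[team]) == bet_over:
            wins += 1
        else:
            losses += 1
    return wins, losses, pushes


# Made-up lines and results for four imaginary teams (not real 2014 data).
toy_lines = {"A": 65.5, "B": 75.5, "C": 85.5, "D": 92.5}
toy_actual = {"A": 70, "B": 73, "C": 80, "D": 95}

record = contrarian_record(toy_lines, toy_actual)
print(record)  # (2, 2, 0)
```

Here A (over) and C (under) win, while B (over) and D (under) lose.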

Thanks for reading, and if you have any other prediction websites you’d like me to include, feel free to send them along.


18 Comments

    1. Yes! I used both of those sites last year, but couldn’t find any predictions archived online. Would have definitely included otherwise.

      For the ones I linked, I had grabbed em in mid-March and kept in a spreadsheet.

    1. That TeamRankings link from the web archive is just showing the 2013 standings. It’s not a projection for 2014.

      I can pull preseason projections from the DB if you want them. But there’s a reason we didn’t post them on the blog — our own projections have not historically been very good for baseball, so this year we basically used an ensemble approach, rather than trying to come up with our own from scratch.

    2. That web archive link for TeamRankings is just showing the final 2013 standings. It’s not a 2014 projection.

      I can pull the preseason projections from the DB if you want. However, there’s a reason we didn’t post them on the blog this year — our MLB preseason projections the last few years have not really been very good (much worse than the other sports), so this year we essentially used an ensemble approach (though I believe our ensemble was much smaller than yours).

      1. Hi David,

        Sorry about that. I actually didn’t include TR because I remember that you didn’t vouch for your MLB projections. A reader sent a request and a link to picks; I should’ve followed up before posting. I took TR’s down.

        -Mike

  1. Thanks, appreciate it. For what it’s worth, I looked up the info for ours. We used the following ensemble with equal weight to all three:

    Baseball Prospectus
    5Dimes raw win total lines
    5Dimes win total lines adjusted for juice (so, for example, 87.5 with Over -120 / Under +110 might become 88.5)

    My file is dated 3/20, so I probably grabbed the info as of that day, but I’m not sure.

    And actually, we didn’t just use the average win totals directly. We adjusted them a bit to get the correct number of wins (since 5Dimes didn’t add to the correct number), then I created power ratings that resulted in win totals close to the adjusted average when simulating the full season.

    The MSE using those ratings was 65.2 and MAE was 6.2.

  2. while fun to track, this stuff is really for entertainment purposes only and you can’t read anything into it. by treating +115 through -135 equally, which is a difference of 10.9%, in the long run, the lack of associated odds would be a stronger influence than the actual difference in skill between the prognosticators. being forced to pick all 30 teams, i’m sure the difference between my dog and the greatest baseball prognosticator in the world is less than 10.9%.

    assuming everything is even money, a dart thrower has about a 42.8% chance of picking 16-14 or better. i would be highly impressed by someone who could pick at 52% accuracy over the long run picking every team every year–someone who makes 10 bets a year that win at 56% (a spectacular win rate), and thinks the other 20 teams are lined correctly, will pick 52% overall, and that person has about a 51.6% chance of picking 16-14 or better. someone who can pick with 55% accuracy, which would be pretty incredible considering they are forced to make predictions on all 30 teams, has about a 64.5% chance of picking 16-14 or better. so you would need many, many seasons worth of data to draw firm conclusions.

    for entertainment purposes, it doesn’t matter, but if you end up acquiring enough data to draw significant conclusions, i would make three suggestions:

    -include the associated odds with each line
    -use lines and odds from a bigger book with bigger betting limits like pinnacle or bookmaker
    -simply call the betting odds “betting odds” rather than Vegas, since sportsbetting.ag has nothing to do with Vegas. you have certainly put a lot of thought into this, but referring to an offshore book as “Vegas” makes it sound like the crappy sites that don’t know the difference between Vegas and offshore books. a friend of mine referred me to your article and asked what i thought, but if i see an article calling offshore odds “Vegas odds” i usually skip it assuming the author is poorly informed.
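The arithmetic in the comment above can be reproduced with a short sketch. It assumes the standard conversion from American odds to break-even probability, and treats the 30 picks as independent bets with the same win probability; the function names are made up for illustration.

```python
from math import comb


def implied_prob(american_odds):
    """Break-even win probability for American odds (e.g. -135 or +115)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def prob_at_least(k, n, p):
    """P(at least k wins in n independent bets, each won with probability p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Gap in break-even probability between a -135 bet and a +115 bet.
gap = implied_prob(-135) - implied_prob(115)

# Chance of going 16-14 or better over 30 teams at various skill levels.
dart_thrower = prob_at_least(16, 30, 0.50)
picker_52 = prob_at_least(16, 30, 0.52)
picker_55 = prob_at_least(16, 30, 0.55)

print(f"odds gap: {gap:.1%}")
for label, value in [("50%", dart_thrower), ("52%", picker_52), ("55%", picker_55)]:
    print(f"picker at {label}: {value:.1%} chance of 16-14 or better")
```

The odds gap comes out to about 10.9% and the coin-flip picker to about 42.8%, matching the figures quoted in the comment.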

    1. Hi Will,

      Thanks for reading, and glad you stuck it out.

      I had no idea this was going to be linked on Deadspin, otherwise I would’ve taken a bit of additional care in the writing.

      For example, I try to write “sports books” whenever possible, but sometimes I let a ‘Vegas’ or two slip. That said, my limited experience is that sportsbetting.ag’s lines are fairly close to the lines I’ve observed when actually at Vegas sportsbooks.

      I don’t think the associated odds made a big difference – I went through and used the adjusted lines that I had saved, and it didn’t make much difference in terms of the overall margin one hypothetically could have made or lost. See replies in my Twitter timeline (@StatsbyLopez) if you want that info.

      Lastly, I agree that this is mostly for entertainment. Virtually impossible to make long run judgements from one season.

    1. Yes – I try and keep track of all posted season totals at the beginning of each season (MLB, NFL, NBA). The date I had saved my spreadsheet was around March 20, so I’m pretty sure that’s when I extracted them.

      1. These were pulled from Bookmaker the night before opening day:

        Time
        # Teams Spread Total
        3:00 PM
        14301 D’BACKS RSW o79+103
        14302 D’BACKS RSW u79-123
        3:00 PM
        14303 BRAVES RSW o86½+108
        14304 BRAVES RSW u86½-128
        3:00 PM
        14305 ORIOLES RSW o80½-117
        14306 ORIOLES RSW u80½-103
        3:00 PM
        14307 RED SOX RSW o87-112
        14308 RED SOX RSW u87-108
        3:00 PM
        14309 CUBS RSW o69½-134
        14310 CUBS RSW u69½+114
        3:00 PM
        14311 WHITE SOX RSW o74½-124
        14312 WHITE SOX RSW u74½+104
        3:00 PM
        14313 REDS RSW o85+101
        14314 REDS RSW u85-121
        3:00 PM
        14315 INDIANS RSW o81-142
        14316 INDIANS RSW u81+120
        3:00 PM
        14317 ROCKIES RSW o76½-137
        14318 ROCKIES RSW u76½+115
        3:00 PM
        14319 TIGERS RSW o90-101
        14320 TIGERS RSW u90-119
        3:00 PM
        14321 MARLINS RSW o68½-123
        14322 MARLINS RSW u68½+104
        3:00 PM
        14323 ASTROS RSW o63½-105
        14324 ASTROS RSW u63½-115
        3:00 PM
        14325 ROYALS RSW o82½+106
        14326 ROYALS RSW u82½-128
        3:00 PM
        14327 ANGELS RSW o87-108
        14328 ANGELS RSW u87-112
        3:00 PM
        14329 DODGERS RSW o93½-119
        14330 DODGERS RSW u93½-101
        3:00 PM
        14331 BREWERS RSW o80EV
        14332 BREWERS RSW u80-120
        3:00 PM
        14333 TWINS RSW o69-119
        14334 TWINS RSW u69-101
        3:00 PM
        14335 METS RSW o73½-124
        14336 METS RSW u73½+104
        3:00 PM
        14337 YANKEES RSW o86½+126
        14338 YANKEES RSW u86½-147
        3:00 PM
        14339 ATHLETICS RSW o87½+113
        14340 ATHLETICS RSW u87½-133
        3:00 PM
        14341 PHILLIES RSW o77+127
        14342 PHILLIES RSW u77-147
        3:00 PM
        14343 PIRATES RSW o83½-105
        14344 PIRATES RSW u83½-115
        3:00 PM
        14345 PADRES RSW o77½-129
        14346 PADRES RSW u77½+109
        3:00 PM
        14347 GIANTS RSW o86-102
        14348 GIANTS RSW u86-118
        3:00 PM
        14349 MARINERS RSW o80½-129
        14350 MARINERS RSW u80½+109
        3:00 PM
        14351 CARDINALS RSW o91½-101
        14352 CARDINALS RSW u91½-119
        3:00 PM
        14353 RAYS RSW o88½-106
        14354 RAYS RSW u88½-114
        3:00 PM
        14355 RANGERS RSW o87+107
        14356 RANGERS RSW u87-126
        3:00 PM
        14357 BLUE JAYS RSW o80-122
        14358 BLUE JAYS RSW u80+102
        3:00 PM
        14359 NATIONALS RSW o90-134
        14360 NATIONALS RSW u90+112

  3. I think it would be more illuminating to order or weight the predictions by their difference from the O/U. A prediction where the oddsmakers say 80.5 and the forecasters say 81 is basically noise. A prediction where the oddsmakers say 80.5 but the forecasters say 90 is a strong difference of opinion.

    Do you have available the data behind the chart?

    1. Yes, I’ve done that in the past – just looked at the top picks for each site (e.g., when they differed from the O/U by more than 3 wins). Left it out this year, although I know that the aggregate’s top 5 finished 2-3, and Joe Peta’s Top 10 finished 3-7. Good point, and I’ll remember to do that next time.

      Also, if you want the data, feel free to send me an email and I’ll send you a .csv
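The “top picks” idea discussed in this thread could be sketched roughly as follows. This is an illustration only: the function name, the default 3-win threshold taken from the reply above, and the three-team data are assumptions, not the method actually used in past years.

```python
def top_picks(predicted, lines, actual, threshold=3.0):
    """Keep only picks that differ from the O/U by more than `threshold` wins.

    All three arguments are dicts keyed by team. Returns a list of
    (team, side, edge, won) tuples, strongest disagreement first.
    Assumption: teams whose actual total lands exactly on the line are skipped.
    """
    picks = []
    for team, line in lines.items():
        edge = predicted[team] - line
        if abs(edge) <= threshold or actual[team] == line:
            continue
        side = "over" if edge > 0 else "under"
        went_over = actual[team] > line
        won = went_over if side == "over" else not went_over
        picks.append((team, side, abs(edge), won))
    picks.sort(key=lambda pick: pick[2], reverse=True)
    return picks


# Made-up numbers for three imaginary teams (not real 2014 data).
toy_predicted = {"A": 90, "B": 81, "C": 70}
toy_lines = {"A": 80.5, "B": 80.5, "C": 74.5}
toy_actual = {"A": 88, "B": 79, "C": 77}

picks = top_picks(toy_predicted, toy_lines, toy_actual)
print(picks)
```

Team B is dropped because its prediction sits only half a win from the line, which is the “basically noise” case described above.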

  4. Thanks for tracking us. We had a rough year in 2014, but we’re truly honored to be included in such company. We’ll be tweaking the system and looking forward to repeating the successes of our 2012 and 2013 projections.
