Thursday, June 3, 2010

Payroll and Performance in MLB

At the beginning of the MLB season, USA Today ran an article on MLB parity. One of the hypotheses of the article is that MLB teams can buy team performance; in other words, teams with higher payrolls have higher winning percentages. This story is often told about sports teams but is rarely true (European soccer is the exception). So again, let's put it to the test and see what the correlation between total team payroll (or relative payroll) and team performance is so far this season. I took the MLB total team payroll data from USA Today's website and the MLB standings as of June 3, 2010 from MLB's website, calculated the correlation coefficient, and ran a regression to see how much of the variation in winning percentage is explained by the variation in total payroll. I have included a table with the data below, in case you are interested in replicating the results on your own.

First, notice that the correlation coefficient between total payroll and winning percentage is 0.224 (22.4%), which is a weak positive relationship. Squaring it shows that, in a simple linear fit, payroll accounts for only about 5% of the variation in winning percentage, leaving roughly 95% unexplained. That does not seem to me a great deal of support for the hypothesis that as teams spend more on payroll, they get higher winning percentages (or better quality teams).
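For anyone replicating the number, here is a minimal sketch of the correlation calculation in plain Python. It assumes the figures are transcribed correctly from the table at the end of this post; the function name `pearson_r` and the `TEAMS` list are just illustrative choices, not the tool originally used. The result should come out close to the 22.4% quoted above.

```python
import math

# (team, winning percentage, total payroll in USD), transcribed from the
# table at the end of this post (standings as of June 3, 2010).
TEAMS = [
    ("Arizona", 0.370, 60_718_166),
    ("Atlanta", 0.585, 84_423_666),
    ("Baltimore", 0.283, 81_612_500),
    ("Boston", 0.574, 162_447_333),
    ("Chicago Cubs", 0.453, 146_609_000),
    ("Chicago White Sox", 0.423, 105_530_000),
    ("Cincinnati", 0.574, 71_761_542),
    ("Cleveland", 0.373, 61_203_966),
    ("Colorado", 0.528, 84_227_000),
    ("Detroit", 0.519, 122_864_928),
    ("Florida", 0.500, 57_034_719),
    ("Houston", 0.358, 92_355_500),
    ("Kansas City", 0.407, 71_405_210),
    ("Los Angeles Angels", 0.491, 104_963_866),
    ("Los Angeles Dodgers", 0.585, 95_358_016),
    ("Milwaukee", 0.415, 81_108_278),
    ("Minnesota", 0.585, 97_559_166),
    ("New York Mets", 0.500, 134_422_942),
    ("New York Yankees", 0.623, 206_333_389),
    ("Oakland", 0.519, 51_654_900),
    ("Philadelphia", 0.538, 141_928_379),
    ("Pittsburgh", 0.415, 34_943_000),
    ("San Diego", 0.604, 37_799_300),
    ("San Francisco", 0.538, 98_641_333),
    ("Seattle", 0.404, 86_510_000),
    ("St. Louis", 0.574, 93_540_751),
    ("Tampa Bay", 0.667, 71_923_471),
    ("Texas", 0.538, 55_250_544),
    ("Toronto", 0.564, 62_234_000),
    ("Washington", 0.481, 61_400_000),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


payrolls = [payroll for _, _, payroll in TEAMS]
win_pcts = [win_pct for _, win_pct, _ in TEAMS]

r = pearson_r(payrolls, win_pcts)
# r squared is the share of variation in winning percentage that a
# straight-line fit on payroll accounts for.
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```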

Second, when I run a regression (including a constant term), I find that payroll for the 2010 season is statistically insignificant. In other words, in statistical terms the effect of payroll on winning percentage cannot be distinguished from zero at this point in the season. From that I would conclude that there is no evidence so far that payroll has an impact on winning percent in the 2010 season. Maybe given more time, this relationship will increase and become statistically significant. Over time (in other words adding more seasons) we do find a statistically significant relationship, but it is rather weak, indicating that using payroll to find out who the better teams in baseball will be is not all that reliable.
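As a rough check on the significance claim, here is a sketch that does not rerun the regression itself. It relies on a standard identity: in a regression with a constant and one explanatory variable, the t-statistic on the slope can be recovered from the correlation coefficient and the sample size. The inputs are the 22.4% correlation and the 30 teams from this post; the 5% significance level and the variable names are my assumptions for illustration.

```python
import math

r = 0.224  # correlation between payroll and winning percentage (from this post)
n = 30     # number of MLB teams in the table

# With one regressor plus a constant, R squared is simply r squared.
r_squared = r ** 2

# Slope t-statistic for simple regression: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-sided 5% critical value for n - 2 = 28 degrees of freedom,
# taken from a standard t table (assumed significance level).
T_CRIT_5PCT_DF28 = 2.048

significant = abs(t_stat) > T_CRIT_5PCT_DF28

print(f"R squared   = {r_squared:.3f}")
print(f"t-statistic = {t_stat:.2f} (critical value {T_CRIT_5PCT_DF28})")
print("significant at 5%" if significant else "not significant at 5%")
```

The t-statistic falls well short of the critical value, which is consistent with payroll being statistically insignificant at this point in the season.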


Team                 Win pct  Total payroll
Arizona              0.370    $60,718,166
Atlanta              0.585    $84,423,666
Baltimore            0.283    $81,612,500
Boston               0.574    $162,447,333
Chicago Cubs         0.453    $146,609,000
Chicago White Sox    0.423    $105,530,000
Cincinnati           0.574    $71,761,542
Cleveland            0.373    $61,203,966
Colorado             0.528    $84,227,000
Detroit              0.519    $122,864,928
Florida              0.500    $57,034,719
Houston              0.358    $92,355,500
Kansas City          0.407    $71,405,210
Los Angeles Angels   0.491    $104,963,866
Los Angeles Dodgers  0.585    $95,358,016
Milwaukee            0.415    $81,108,278
Minnesota            0.585    $97,559,166
New York Mets        0.500    $134,422,942
New York Yankees     0.623    $206,333,389
Oakland              0.519    $51,654,900
Philadelphia         0.538    $141,928,379
Pittsburgh           0.415    $34,943,000
San Diego            0.604    $37,799,300
San Francisco        0.538    $98,641,333
Seattle              0.404    $86,510,000
St. Louis            0.574    $93,540,751
Tampa Bay            0.667    $71,923,471
Texas                0.538    $55,250,544
Toronto              0.564    $62,234,000
Washington           0.481    $61,400,000

2 comments:

Unknown said...

Dear Stace,

I teach high school math, and my class stumbled upon your website. As a result, we are flabbergasted by your findings. Obviously, a person of your stature - economics teacher at the University of Iowa - would have the ability to find a proper sample size in analyzing data! How dare you have the audacity to publish a website - and be the 3rd result under a Google search - and not have reliable results. By analyzing teams' payrolls and winning percentages over a ten-year span - not two months - we determined the correlation coefficient between winning percentage and average payroll to be 0.72. Therefore, money does affect winning.

@StaceyLBrook said...

Sebastian,

First, you are correct that the sample size is small. The point of the USA Today article (linked at the beginning of the blog) is that, for this time period, payroll and performance are related. So I looked at their statements, tested them, and found that they are not verified. In fact, if you read the end of the blog, you will notice that I state, "Maybe given more time, this relationship will increase and become statistically significant. Over time (in other words adding more seasons) we do find a statistically significant relationship, but it is rather weak, indicating that using payroll to find out who the better teams in baseball will be is not all that reliable".
Second, I have no control where my blog shows up on Google. If this offends you - contact Google.
Third, you have made two errors. The first is that you state you are using team payrolls (which have an upward trend - and thus are non-stationary) and calculating the correlation coefficient with winning percentage, which is stationary. As we state in our book, The Wages of Wins, you must use relative payroll (team i's total payroll divided by the average payroll during season j). As a math teacher, hopefully you can follow that i is for the individual team and j is for each season.
Your second error is the last sentence. You use a descriptive statistic (the correlation coefficient) to make a statistical inference. The correlation coefficient on its own is just a number; it does not allow us to draw a conclusion, which is a freshman lesson in statistics. As a high school math teacher you should know better. Run a regression on winning percentage and relative payroll - you choose the league and the time period. At most the R-squared will be 30%, which means that the variation in relative payroll leaves at least 70% of the variation in winning percent unexplained.
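
For readers who want to try the relative payroll adjustment described in this reply, here is a minimal sketch. The function name `relative_payrolls`, the team labels, and the dollar figures are all hypothetical and chosen only to illustrate the arithmetic: each team's payroll is divided by the league average payroll for that same season.

```python
from statistics import mean


def relative_payrolls(payroll_by_season):
    """Divide each team's payroll by that season's league-average payroll.

    payroll_by_season maps season -> {team: total payroll}.
    Returns the same shape with relative payrolls (1.0 = league average).
    """
    result = {}
    for season, payrolls in payroll_by_season.items():
        league_avg = mean(payrolls.values())
        result[season] = {
            team: payroll / league_avg for team, payroll in payrolls.items()
        }
    return result


# Hypothetical three-team league in which every payroll grows 20%
# from one season to the next (made-up numbers, not real MLB data).
example = {
    2009: {"Team A": 50_000_000, "Team B": 100_000_000, "Team C": 150_000_000},
    2010: {"Team A": 60_000_000, "Team B": 120_000_000, "Team C": 180_000_000},
}

rel = relative_payrolls(example)
for season, teams in rel.items():
    print(season, teams)
```

In this toy example the raw payrolls trend upward, yet each team's relative payroll is identical in both seasons, which is why the relative measure can be compared across seasons when raw dollars cannot.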