First, notice that the correlation coefficient between total payroll and winning percentage is 0.224 (22.4%), which is a weak positive association. Squaring it, payroll lines up with only about 5% of the variation in winning percentage across teams. That does not seem to me to be a great deal of support for the hypothesis that as teams spend more on payroll, the result is a higher team winning percentage (or better quality teams).
Second, when I run a regression of winning percentage on payroll (including a constant term), I find that payroll for the 2010 season is statistically insignificant. In other words, in statistical terms, the effect of payroll on winning percentage cannot be distinguished from zero at this point in the season. From that I would conclude that payroll for the 2010 MLB season has no detectable impact on winning percentage in the 2010 season so far. Maybe given more time, this relationship will increase and become statistically significant. Over time (in other words adding more seasons) we do find a statistically significant relationship, but it is rather weak, indicating that using payroll to find out who the better teams in baseball will be is not all that reliable.
Team | Winpct | Total payroll |
Arizona | 0.370 | $60,718,166 |
Atlanta | 0.585 | $84,423,666 |
Baltimore | 0.283 | $81,612,500 |
Boston | 0.574 | $162,447,333 |
Chicago Cubs | 0.453 | $146,609,000 |
Chicago White Sox | 0.423 | $105,530,000 |
Cincinnati | 0.574 | $71,761,542 |
Cleveland | 0.373 | $61,203,966 |
Colorado | 0.528 | $84,227,000 |
Detroit | 0.519 | $122,864,928 |
Florida | 0.500 | $57,034,719 |
Houston | 0.358 | $92,355,500 |
Kansas City | 0.407 | $71,405,210 |
Los Angeles Angels | 0.491 | $104,963,866 |
Los Angeles Dodgers | 0.585 | $95,358,016 |
Milwaukee | 0.415 | $81,108,278 |
Minnesota | 0.585 | $97,559,166 |
New York Mets | 0.500 | $134,422,942 |
New York Yankees | 0.623 | $206,333,389 |
Oakland | 0.519 | $51,654,900 |
Philadelphia | 0.538 | $141,928,379 |
Pittsburgh | 0.415 | $34,943,000 |
San Diego | 0.604 | $37,799,300 |
San Francisco | 0.538 | $98,641,333 |
Seattle | 0.404 | $86,510,000 |
St. Louis | 0.574 | $93,540,751 |
Tampa Bay | 0.667 | $71,923,471 |
Texas | 0.538 | $55,250,544 |
Toronto | 0.564 | $62,234,000 |
Washington | 0.481 | $61,400,000 |
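For readers who want to check the numbers themselves, here is a minimal sketch in plain Python of the two calculations described above: the correlation coefficient and a one-variable regression with a constant term. The data are copied from the table; the function names and the choice to express payroll in millions of dollars are my own assumptions for illustration, not necessarily how the original calculation was set up. As a rough guide, with 30 teams a correlation of about 0.224 corresponds to a t-statistic of about 1.2 on the payroll coefficient, well short of the roughly 2.05 needed for significance at the 5% level.

```python
import math

# (team, winning percentage, total payroll in dollars), copied from the table above
DATA = [
    ("Arizona", 0.370, 60_718_166), ("Atlanta", 0.585, 84_423_666),
    ("Baltimore", 0.283, 81_612_500), ("Boston", 0.574, 162_447_333),
    ("Chicago Cubs", 0.453, 146_609_000), ("Chicago White Sox", 0.423, 105_530_000),
    ("Cincinnati", 0.574, 71_761_542), ("Cleveland", 0.373, 61_203_966),
    ("Colorado", 0.528, 84_227_000), ("Detroit", 0.519, 122_864_928),
    ("Florida", 0.500, 57_034_719), ("Houston", 0.358, 92_355_500),
    ("Kansas City", 0.407, 71_405_210), ("Los Angeles Angels", 0.491, 104_963_866),
    ("Los Angeles Dodgers", 0.585, 95_358_016), ("Milwaukee", 0.415, 81_108_278),
    ("Minnesota", 0.585, 97_559_166), ("New York Mets", 0.500, 134_422_942),
    ("New York Yankees", 0.623, 206_333_389), ("Oakland", 0.519, 51_654_900),
    ("Philadelphia", 0.538, 141_928_379), ("Pittsburgh", 0.415, 34_943_000),
    ("San Diego", 0.604, 37_799_300), ("San Francisco", 0.538, 98_641_333),
    ("Seattle", 0.404, 86_510_000), ("St. Louis", 0.574, 93_540_751),
    ("Tampa Bay", 0.667, 71_923_471), ("Texas", 0.538, 55_250_544),
    ("Toronto", 0.564, 62_234_000), ("Washington", 0.481, 61_400_000),
]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simple_ols(xs, ys):
    """Regress y on x with a constant; return (intercept, slope, R-squared, slope t-stat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    r_squared = 1 - sse / sst
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return intercept, slope, r_squared, slope / se_slope

# Assumption: payroll is rescaled to millions of dollars so the slope is readable.
payroll_millions = [p / 1e6 for _, _, p in DATA]
winpct = [w for _, w, _ in DATA]

r = pearson_r(payroll_millions, winpct)
intercept, slope, r_squared, t_stat = simple_ols(payroll_millions, winpct)

print(f"correlation coefficient: {r:.3f}")
print(f"slope per $1M of payroll: {slope:.5f}")
print(f"R-squared: {r_squared:.3f}")
# With 28 degrees of freedom, |t| must exceed about 2.05 for 5% significance.
print(f"t-statistic on payroll: {t_stat:.2f}")
```

Rescaling payroll changes the size of the slope but not the correlation, the R-squared, or the t-statistic, so the conclusion does not depend on the units chosen.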
2 comments:
Dear Stace,
I teach high school math, and my class stumbled upon your website. As a result, we are flabbergasted by your findings. Obviously, a person of your stature - economics teacher at the University of Iowa -- would have the ability to find a proper sample size in analyzing data! How dare you have the audacity to publish a website--and be the 3rd result under a Google search--and not have reliable results. We determined by analyzing teams' payrolls and winning percentages over a ten year span--not two months--that the correlation coefficient between winning percentage and average payroll is 0.72. Therefore, money does affect winning.
Sebastian,
First, you are correct that the sample size is small. The point of the USA Today article (linked at the beginning of the blog) is that, for this time period, payroll and performance are related. So I looked at their statements, tested them, and found that they are not supported by the data. In fact, if you read the end of the blog, you will notice that I state, "Maybe given more time, this relationship will increase and become statistically significant. Over time (in other words adding more seasons) we do find a statistically significant relationship, but it is rather weak, indicating that using payroll to find out who the better teams in baseball will be is not all that reliable".
Second, I have no control over where my blog shows up on Google. If this offends you, contact Google.
Third, you have made two errors. The first is that you state you are using team payrolls (which have an upward trend, and thus are non-stationary) and calculating the correlation coefficient with winning percentage, which is stationary. As we state in our book, The Wages of Wins, you must use relative payroll (team i's total payroll in season j divided by the average team payroll during season j). As a math teacher, hopefully you can follow that i indexes the individual team and j indexes the season.
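To make the relative payroll idea concrete, here is a minimal sketch in Python. The function name `relative_payroll` and the `(season, team, payroll)` row layout are hypothetical, chosen only for illustration. The three payrolls are taken from the table in the post, so the "season average" below is the average of just those three teams, not the true league average.

```python
from collections import defaultdict

def relative_payroll(rows):
    """rows: list of (season, team, payroll).
    Returns {(season, team): payroll / average payroll in that season}."""
    by_season = defaultdict(list)
    for season, _team, payroll in rows:
        by_season[season].append(payroll)
    season_avg = {s: sum(p) / len(p) for s, p in by_season.items()}
    return {(season, team): payroll / season_avg[season]
            for season, team, payroll in rows}

# Illustration only: three 2010 payrolls from the table in the post.
# A real calculation would include every team in every season.
ROWS = [
    (2010, "New York Yankees", 206_333_389),
    (2010, "Pittsburgh", 34_943_000),
    (2010, "Minnesota", 97_559_166),
]

rel = relative_payroll(ROWS)
for (season, team), value in rel.items():
    print(season, team, round(value, 2))
```

Because each season is divided by its own average, relative payroll averages 1.0 within every season, which removes the upward trend in raw payrolls before seasons are pooled.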
Your second error is the last sentence. You use a descriptive statistic (the correlation coefficient) to make a statistical inference. Just because the correlation coefficient is a number does not allow us to draw a conclusion, which is a freshman lesson in statistics. As a high school math teacher you should know better. Run a regression of winning percentage on relative payroll - you choose the league and the time period. At most the R-squared will be 30%, which means that relative payroll leaves at least 70% of the variation in winning percentage unexplained.