<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Heureusement, ici, c&#039;est le Blog!</title>
	<atom:link href="http://tomflesher.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://tomflesher.com</link>
	<description>Happily, here, it&#039;s the Blog! Baseball and economics discussion.</description>
	<lastBuildDate>Wed, 28 Jul 2010 15:20:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='tomflesher.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/2cb82d01b009edbc0585a3920947b352?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>Heureusement, ici, c&#039;est le Blog!</title>
		<link>http://tomflesher.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://tomflesher.com/osd.xml" title="Heureusement, ici, c&#039;est le Blog!" />
	<atom:link rel='hub' href='http://tomflesher.com/?pushpress=hub'/>
		<item>
		<title>The 600 Home Run Almanac</title>
		<link>http://tomflesher.com/2010/07/28/the-600-home-run-almanac/</link>
		<comments>http://tomflesher.com/2010/07/28/the-600-home-run-almanac/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 15:20:04 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Alex Rodriguez]]></category>
		<category><![CDATA[Barry Bonds]]></category>
		<category><![CDATA[baseball-reference.com]]></category>
		<category><![CDATA[Jim Thome]]></category>
		<category><![CDATA[Manny Ramirez]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Willie Mays]]></category>
		<category><![CDATA[600 home runs]]></category>
		<category><![CDATA[Hank Aaron]]></category>
		<category><![CDATA[Sammy Sosa]]></category>
		<category><![CDATA[Ken Griffey Jr.]]></category>
		<category><![CDATA[Babe Ruth]]></category>
		<category><![CDATA[A-Rod]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=398</guid>
		<description><![CDATA[People are interested in players who hit 600 home runs, at least judging by the Google searches that point people here. With that in mind, let&#8217;s take a look at some quick facts about the 600th home run and the people who have hit it. Age: There are six players to have hit #600. Sammy [...]]]></description>
			<content:encoded><![CDATA[<p>People are interested in players who hit 600 home runs, at least judging by the Google searches that point people here. With that in mind, let&#8217;s take a look at some quick facts about the 600th home run and the people who have hit it.</p>
<p><strong>Age: </strong><a href="http://bbref.com/pi/shareit/y3VbM">Six players</a> have hit #600. <strong><a href="http://www.baseball-reference.com/players/s/sosasa01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Sammy Sosa</a></strong> was the oldest at 39 years old in 2007. <strong><a href="http://www.baseball-reference.com/players/g/griffke02.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Ken Griffey</a></strong>, Jr. was 38 in 2008, as were <a href="http://www.baseball-reference.com/players/m/mayswi01.shtml"><strong>Willie Mays</strong></a> in 1969 and <strong><a href="http://www.baseball-reference.com/players/b/bondsba01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Barry Bonds</a></strong> in 2002. <strong><a href="http://www.baseball-reference.com/players/a/aaronha01.shtml">Hank Aaron</a></strong> was 37. <strong><a href="http://www.baseball-reference.com/players/r/ruthba01.shtml">Babe Ruth</a></strong> was the youngest at 36 in 1931. <strong><a href="http://www.baseball-reference.com/players/r/rodrial01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Alex Rodriguez</a></strong>, who is 35 as of July 27, will almost certainly be the youngest player to reach 600 home runs. If both <strong><a href="http://www.baseball-reference.com/players/r/ramirma02.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Manny Ramirez</a></strong> and <strong><a href="http://www.baseball-reference.com/players/t/thomeji01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Jim Thome</a></strong> hang on to hit #600 over the next two to three seasons, Thome (who was born in August of 1970) will probably be 42 in 2012; Ramirez (born in May of 1972) will be 41 in 2013. (In an earlier post that&#8217;s when I estimated each player would hit #600.) If Thome holds on, then, he&#8217;ll be the oldest player to hit his 600th home run.</p>
<p><strong>Productivity:</strong> Since 2000 (which encompasses Rodriguez, Ramirez, and Thome in their primes), the average league rate of home runs per plate appearance has been about .028. That is, a home run was hit in about 2.8% of plate appearances. Over the same time period, Rodriguez&#8217;s rate was .064 &#8211; more than double the league average. Ramirez&#8217;s rate was .059 &#8211; again, over double the league rate. Thome, for his part, hit at a rate of .065 home runs per plate appearance. From 2000 to 2009, Thome was slightly more productive per plate appearance than Rodriguez.</p>
<p><strong>Standing Out:</strong> Obviously it&#8217;s unusual for them to be that far above the curve. There were 1,877,363 plate appearances (trials) from 2000 to 2009. The margin of error for a proportion like the rate of home runs per plate appearance is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Csqrt%7B%5Cfrac%7Bp%281-p%29%7D%7Bn-1%7D%7D+%3D+%5Csqrt%7B%5Cfrac%7B.028%28.972%29%7D%7B1%2C877%2C362%7D%7D+%3D+%5Csqrt%7B%5Cfrac%7B.027%7D%7B1%2C877%2C362%7D%7D+%5Capprox+%5Csqrt%7B%5Cfrac%7B14%7D%7B1%2C000%2C000%2C000%7D%7D+%3D+.00012&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.028(.972)}{1,877,362}} = \sqrt{\frac{.027}{1,877,362}} \approx \sqrt{\frac{14}{1,000,000,000}} = .00012' title='\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.028(.972)}{1,877,362}} = \sqrt{\frac{.027}{1,877,362}} \approx \sqrt{\frac{14}{1,000,000,000}} = .00012' class='latex' /></p>
<p>Ordinarily, we expect a random individual chosen from the population to land within the space of <img src='http://l.wordpress.com/latex.php?latex=p+%5Cpm+1.96+%5Ctimes+MoE&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p \pm 1.96 \times MoE' title='p \pm 1.96 \times MoE' class='latex' /> 95% of the time. That means our interval is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=.028+%5Cpm+.00024&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='.028 \pm .00024' title='.028 \pm .00024' class='latex' /></p>
<p>That means that all three of the players are well outside that confidence interval. (However, it&#8217;s likely that home run hitting is highly correlated with other factors that make this test less useful than it is in other situations.)</p>
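<p>For anyone who wants to check the arithmetic, here&#8217;s a minimal Python sketch of the same margin-of-error calculation. The function and variable names are made up for illustration (they don&#8217;t come from any library), and the interval is centered on the league rate of .028 quoted above.</p>
<pre>
```python
import math


def margin_of_error(p, n):
    """Standard error of a proportion p over n trials (n - 1 in the denominator, as above)."""
    return math.sqrt(p * (1 - p) / (n - 1))


league_rate = 0.028          # league home runs per plate appearance, 2000-2009
plate_appearances = 1877363  # total plate appearances, 2000-2009

moe = margin_of_error(league_rate, plate_appearances)
low = league_rate - 1.96 * moe
high = league_rate + 1.96 * moe

print("margin of error:", round(moe, 5))
print("95% interval:", round(low, 5), "to", round(high, 5))

# Rates quoted above for the three hitters.
for name, rate in [("Rodriguez", 0.064), ("Ramirez", 0.059), ("Thome", 0.065)]:
    print(name, "is above the interval:", rate > high)
```
</pre>
<p>All three rates land far above the upper end of the interval, which is the point of the paragraph above.</p>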
<p><strong>Alex&#8217;s Drought:</strong> Finally, just how likely is it that Alex Rodriguez would go this long without a home run? He hit his last home run in his fourth plate appearance on <a href="http://www.baseball-reference.com/boxes/NYA/NYA201007220.shtml">July 22</a>. He had a fifth plate appearance in which he doubled. Since then, he&#8217;s played in five games totalling 22 plate appearances, so he&#8217;s gone 23 plate appearances without a home run. At his rate of .064 home runs per plate appearance, we&#8217;d expect (.064*23) = about 1.5 home runs in that time, so how unlikely is this drought?</p>
<p>The binomial distribution is used to model strings of successes and failures in tests where we can say clearly whether each trial ended in a &#8220;yes&#8221; or &#8220;no.&#8221; We don&#8217;t need to break out that tool here, though &#8211; if the probability of a home run is .064, the probability of anything else is .936. The likelihood of a string of 23 non-home runs is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=.936%5E%7B23%7D+%3D+.218&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='.936^{23} = .218' title='.936^{23} = .218' class='latex' /></p>
<p>If every plate appearance is independent, a drought this long would happen by chance only about 22% of the time. The better guess is that, as Rodriguez has said, he&#8217;s distracted by the switch to marked baseballs and media pressure to finally hit #600.</p>
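<p>The same number falls out of a couple of lines of Python. This is just a sketch that assumes every plate appearance is an independent trial with the same .064 home run rate; the function name is hypothetical.</p>
<pre>
```python
def drought_probability(hr_rate, plate_appearances):
    """Chance of zero home runs in a run of independent plate appearances."""
    return (1 - hr_rate) ** plate_appearances


rate = 0.064  # Rodriguez's home runs per plate appearance, from above
streak = 23   # plate appearances since his last home run

expected_home_runs = rate * streak
p_drought = drought_probability(rate, streak)

print("expected home runs:", round(expected_home_runs, 2))
print("chance of none:", round(p_drought, 3))
```
</pre>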
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a>, <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/600-home-runs/'>600 home runs</a>, <a href='http://tomflesher.com/tag/a-rod/'>A-Rod</a>, <a href='http://tomflesher.com/tag/alex-rodriguez/'>Alex Rodriguez</a>, <a href='http://tomflesher.com/tag/babe-ruth/'>Babe Ruth</a>, <a href='http://tomflesher.com/tag/barry-bonds/'>Barry Bonds</a>, <a href='http://tomflesher.com/tag/baseball/'>Baseball</a>, <a href='http://tomflesher.com/tag/baseball-reference-com/'>baseball-reference.com</a>, <a href='http://tomflesher.com/tag/hank-aaron/'>Hank Aaron</a>, <a href='http://tomflesher.com/tag/jim-thome/'>Jim Thome</a>, <a href='http://tomflesher.com/tag/ken-griffey-jr/'>Ken Griffey Jr.</a>, <a href='http://tomflesher.com/tag/manny-ramirez/'>Manny Ramirez</a>, <a href='http://tomflesher.com/tag/probability/'>probability</a>, <a href='http://tomflesher.com/tag/sammy-sosa/'>Sammy Sosa</a>, <a href='http://tomflesher.com/tag/statistics/'>statistics</a>, <a href='http://tomflesher.com/tag/willie-mays/'>Willie Mays</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/28/the-600-home-run-almanac/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>

	</item>
		<item>
		<title>Matt Garza, Fifth No-Hitter of 2010</title>
		<link>http://tomflesher.com/2010/07/26/matt-garza-fifth-no-hitter-of-2010/</link>
		<comments>http://tomflesher.com/2010/07/26/matt-garza-fifth-no-hitter-of-2010/#comments</comments>
		<pubDate>Tue, 27 Jul 2010 02:26:13 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Dallas Braden]]></category>
		<category><![CDATA[Roy Halladay]]></category>
		<category><![CDATA[Ubaldo Jimenez]]></category>
		<category><![CDATA[Edwin Jackson]]></category>
		<category><![CDATA[no-hitters]]></category>
		<category><![CDATA[Year of the Pitcher]]></category>
		<category><![CDATA[Matt Garza]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=394</guid>
		<description><![CDATA[Tonight, Matt Garza pitched the fifth no-hitter of 2010. He joins Edwin Jackson, Roy Halladay, Dallas Braden, and Ubaldo Jimenez in the Year of the Pitcher club. As I pointed out when Jackson threw his no-hitter, no-hit games are probably Poisson distributed. Let&#8217;s update the chart. The Poisson distribution has probability mass function Maintaining our [...]]]></description>
			<content:encoded><![CDATA[<p>Tonight, <strong><a href="http://www.baseball-reference.com/players/g/garzama01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Matt  Garza</a></strong> pitched the fifth no-hitter of 2010. He joins <strong><a href="http://www.baseball-reference.com/players/j/jacksed01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Edwin  Jackson</a></strong>, <strong><a href="http://www.baseball-reference.com/players/h/hallaro01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Roy  Halladay</a></strong>, <strong><a href="http://www.baseball-reference.com/players/b/bradeda01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Dallas  Braden</a></strong>, and <strong><a href="http://www.baseball-reference.com/players/j/jimenub01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Ubaldo  Jimenez</a></strong> in the Year of the Pitcher club.</p>
<p>As I pointed out when Jackson threw his no-hitter, no-hit games are probably Poisson distributed. Let&#8217;s update the chart.</p>
<p>The Poisson distribution has probability mass function</p>
<p><img src='http://l.wordpress.com/latex.php?latex=f%28n%3B+%5Clambda%29%3D%5Cfrac%7B%5Clambda%5En+e%5E%7B-%5Clambda%7D%7D%7Bn%21%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(n; \lambda)=\frac{\lambda^n e^{-\lambda}}{n!} ' title='f(n; \lambda)=\frac{\lambda^n e^{-\lambda}}{n!} ' class='latex' /></p>
<p>Maintaining our prior rate of 2.45 no-hitters per season, that means <img src='http://l.wordpress.com/latex.php?latex=%5Clambda+%3D+2.45&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lambda = 2.45' title='\lambda = 2.45' class='latex' />. Our function is then</p>
<p><img src='http://l.wordpress.com/latex.php?latex=f%28n%3B+%5Clambda+%3D+2.45%29%3D%5Cfrac%7B2.45%5En++%28.0863%29%7D%7Bn%21%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(n; \lambda = 2.45)=\frac{2.45^n  (.0863)}{n!} ' title='f(n; \lambda = 2.45)=\frac{2.45^n  (.0863)}{n!} ' class='latex' /></p>
<p>The probabilities remain the same:</p>
<table border="0" cellspacing="0" cellpadding="0" width="227">
<col width="64"></col>
<col width="64"></col>
<col width="99"></col>
<tbody>
<tr>
<td width="64" height="20">n</td>
<td width="64">p</td>
<td width="99">cumulative</td>
</tr>
<tr>
<td height="20">0</td>
<td>0.0863</td>
<td>0.0863</td>
</tr>
<tr>
<td height="20">1</td>
<td>0.2114</td>
<td>0.2977</td>
</tr>
<tr>
<td height="20">2</td>
<td>0.2590</td>
<td>0.5567</td>
</tr>
<tr>
<td height="20">3</td>
<td>0.2115</td>
<td>0.7683</td>
</tr>
<tr>
<td height="20">4</td>
<td>0.1296</td>
<td>0.8978</td>
</tr>
<tr>
<td height="20">5</td>
<td>0.0635</td>
<td>0.9613</td>
</tr>
<tr>
<td height="20">6</td>
<td>0.0259</td>
<td>0.9872</td>
</tr>
<tr>
<td height="20">7</td>
<td>0.0091</td>
<td>0.9963</td>
</tr>
<tr>
<td height="20">8</td>
<td>0.0028</td>
<td>0.9991</td>
</tr>
<tr>
<td height="20">9</td>
<td>0.0008</td>
<td>0.9998</td>
</tr>
<tr>
<td height="20">10</td>
<td>0.0002</td>
<td>1.0000</td>
</tr>
</tbody>
</table>
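<p>If you&#8217;d rather regenerate that table than trust my typing, here&#8217;s a short Python sketch. It assumes nothing beyond the Poisson formula and the rate of 2.45 used above; the function name is just for illustration.</p>
<pre>
```python
import math


def poisson_pmf(n, lam):
    """Probability of exactly n events when the mean count is lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


lam = 2.45  # no-hitters per season, the rate used above

cumulative = 0.0
for n in range(11):
    p = poisson_pmf(n, lam)
    cumulative += p
    print(n, round(p, 4), round(cumulative, 4))
```
</pre>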
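<p>The expected number of seasons with each count (E(49), the probability above times our 49 seasons) and its running total (C(49)) remain the same, but the observed values shift slightly. The rows run from n = 0 to n = 10, in the same order as the table above:</p>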
<table border="0" cellspacing="0" cellpadding="0" width="295">
<col width="90"></col>
<col width="64"></col>
<col width="77"></col>
<col width="64"></col>
<tbody>
<tr>
<td width="90" height="20">E(49)</td>
<td width="64">Observed</td>
<td width="77">C(49)</td>
<td width="64">Total</td>
</tr>
<tr>
<td height="20">4.23</td>
<td>5</td>
<td>4.23</td>
<td>5</td>
</tr>
<tr>
<td height="20">10.36</td>
<td>11</td>
<td>14.59</td>
<td>16</td>
</tr>
<tr>
<td height="20">12.69</td>
<td>8</td>
<td>27.28</td>
<td>24</td>
</tr>
<tr>
<td height="20">10.36</td>
<td>17</td>
<td>37.65</td>
<td>41</td>
</tr>
<tr>
<td height="20">6.35</td>
<td>1</td>
<td>43.99</td>
<td>42</td>
</tr>
<tr>
<td height="20">3.11</td>
<td>5</td>
<td>47.10</td>
<td>47</td>
</tr>
<tr>
<td height="20">1.27</td>
<td>1</td>
<td>48.37</td>
<td>48</td>
</tr>
<tr>
<td height="20">0.44</td>
<td>0</td>
<td>48.82</td>
<td>48</td>
</tr>
<tr>
<td height="20">0.14</td>
<td>1</td>
<td>48.95</td>
<td>49</td>
</tr>
<tr>
<td height="20">0.04</td>
<td>0</td>
<td>48.99</td>
<td>49</td>
</tr>
<tr>
<td height="20">0.01</td>
<td>0</td>
<td>49.00</td>
<td>49</td>
</tr>
</tbody>
</table>
<p>The observations in the tail (say, for 4+ no-hitters) don&#8217;t quite match the expected frequencies, but the cumulative values match quite nicely. There might be some unobserved variables that explain the weirdness in the upper tail. Still, cumulatively, we have 47 seasons with 5 or fewer no-hitters, which is almost exactly what&#8217;s expected (47.10). A five-no-hitter season is unusual, but not outside the realm of statistical expectation.</p>
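<p>Here&#8217;s a rough Python sketch of how the expected and observed columns line up. The observed counts are copied from the table above, the 49 seasons and the rate of 2.45 are the same ones used there, and the names are hypothetical.</p>
<pre>
```python
import math


def poisson_pmf(n, lam):
    """Probability of exactly n events when the mean count is lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


seasons = 49
lam = 2.45
# Seasons with n = 0, 1, ..., 10 no-hitters, copied from the table above.
observed = [5, 11, 8, 17, 1, 5, 1, 0, 1, 0, 0]

expected = [seasons * poisson_pmf(n, lam) for n in range(len(observed))]

running_expected = 0.0
running_observed = 0
for n, (e, o) in enumerate(zip(expected, observed)):
    running_expected += e
    running_observed += o
    print(n, round(e, 2), o, round(running_expected, 2), running_observed)
```
</pre>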
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a> Tagged: <a href='http://tomflesher.com/tag/dallas-braden/'>Dallas Braden</a>, <a href='http://tomflesher.com/tag/edwin-jackson/'>Edwin Jackson</a>, <a href='http://tomflesher.com/tag/matt-garza/'>Matt Garza</a>, <a href='http://tomflesher.com/tag/no-hitters/'>no-hitters</a>, <a href='http://tomflesher.com/tag/roy-halladay/'>Roy Halladay</a>, <a href='http://tomflesher.com/tag/ubaldo-jimenez/'>Ubaldo Jimenez</a>, <a href='http://tomflesher.com/tag/year-of-the-pitcher/'>Year of the Pitcher</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/26/matt-garza-fifth-no-hitter-of-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>600 Home Runs: Who&#8217;s Second?</title>
		<link>http://tomflesher.com/2010/07/25/600-home-runs-whos-second/</link>
		<comments>http://tomflesher.com/2010/07/25/600-home-runs-whos-second/#comments</comments>
		<pubDate>Mon, 26 Jul 2010 00:20:47 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Alex Rodriguez]]></category>
		<category><![CDATA[Dodgers]]></category>
		<category><![CDATA[Jim Thome]]></category>
		<category><![CDATA[Manny Ramirez]]></category>
		<category><![CDATA[home runs]]></category>
		<category><![CDATA[binomial distribution]]></category>
		<category><![CDATA[600 home runs]]></category>
		<category><![CDATA[quick and dirty stats]]></category>
		<category><![CDATA[Twins]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=390</guid>
		<description><![CDATA[Alex Rodriguez is, as I&#8217;m writing this, sitting at 599 home runs. Almost certainly, he&#8217;ll be the next player to hit the 600 home-run milestone, since the next two active players are Jim Thome at 575 and Manny Ramirez at 554. Today&#8217;s Toyota Text Poll (which runs during Yankee games on YES) asked which of [...]]]></description>
			<content:encoded><![CDATA[<p><strong><a href="http://www.baseball-reference.com/players/r/rodrial01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Alex  Rodriguez</a></strong> is, as I&#8217;m writing this, sitting at 599 home runs. Almost certainly, he&#8217;ll be the next player to hit the 600 home-run milestone, since the next two active players are <strong><a href="http://www.baseball-reference.com/players/t/thomeji01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Jim  Thome</a></strong> at 575 and <strong><a href="http://www.baseball-reference.com/players/r/ramirma02.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Manny  Ramirez</a></strong> at 554. Today&#8217;s Toyota Text Poll (which runs during Yankee games on YES) asked which of those two players would reach #600 sooner.</p>
<p>There are a few levels of abstraction to answering this question. First of all, without looking at the players&#8217; stats, Thome gets the nod at the first order because he&#8217;s significantly closer than Ramirez. Hitting 25 home runs is easier than hitting 46, so Thome will probably get there first.</p>
<p>At the second order, we should take a look at the players&#8217; respective rates. Over the past two seasons, Thome has averaged a rate of .053 home runs per plate appearance, while Ramirez has averaged .041 home runs per plate appearance. With fewer home runs to hit and a higher likelihood of hitting one each time he makes it to the plate, Thome stays more likely to hit #600 before Ramirez does&#8230; but how much more likely?</p>
<p>Using the binomial distribution, I tested the likelihood that each player would hit his required number of home runs in different numbers of plate appearances to see where that likelihood reached a maximum. For Thome, the probability increases until 471 plate appearances, then starts decreasing, so roughly, I expect Thome to hit his 25th home run within 471 plate appearances. For Manny, that maximum doesn&#8217;t occur until 1121 plate appearances. Again, the nod has to go to Thome. He&#8217;ll probably reach the milestone in less than half as many plate appearances.</p>
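<p>That search is easy to sketch in Python. What follows isn&#8217;t necessarily how I ran it at the time; it&#8217;s a minimal version that assumes independent plate appearances at the two-season rates above, with hypothetical function names and an arbitrary cap of 3000 plate appearances on the search.</p>
<pre>
```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def most_likely_trials(k, p, max_n=3000):
    """Number of trials n at which P(exactly k successes) reaches its maximum."""
    return max(range(k, max_n + 1), key=lambda n: binomial_pmf(k, n, p))


thome_pa = most_likely_trials(25, 0.053)    # 25 home runs needed
ramirez_pa = most_likely_trials(46, 0.041)  # 46 home runs needed

print("Thome:", thome_pa)
print("Ramirez:", ramirez_pa)
```
</pre>
<p>The peak sits just below the number of home runs needed divided by the rate (25/.053 and 46/.041), which is a handy sanity check.</p>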
<p>But wait. How many plate appearances is that, anyway? Until recently, Manny played 80-90% of the games in a season. Last year, he played 64%. So far the Dodgers have played 99 games and Manny appeared in 61 of them, but of course he&#8217;s spent time on the disabled list this year. Let&#8217;s make the generous assumption that Manny will play in 75% of the games in each season starting with this one. Then, let&#8217;s look at his average plate appearances per game. For most of his career, he averaged between 4.1 and 4.3 plate appearances per game, but this year he&#8217;s down to 3.6. Let&#8217;s make the (again, generous) assumption that he&#8217;ll get 4 plate appearances in each game from now on. At that rate, to get 1121 plate appearances, he needs to play in 280.25 games, which works out to about 1.73 seasons of 162 games or about 2.31 seasons at 75% playing time.</p>
<p>Thome, on the other hand, has consistently played in 80% or more of his team&#8217;s games but suffered last year and this year because he hasn&#8217;t been serving as an everyday player. He pinch-hit in the National League last year and has, in Minnesota, played in about 69% of the games averaging only 3 plate appearances in each. Let&#8217;s give Jim the benefit of the doubt and assume that from here on out he&#8217;ll hit in 70% of the games and get 3.5 appearances (fewer games and fewer appearances than Ramirez). He&#8217;d need about 134.6 games (471/3.5), which equates to a bit more than 4/5 of a 162-game season or about 1.19 seasons with 70% playing time. Even if we downgrade Thome to 2.5 PA per game and 66% playing time, that still gives us an expectation that he&#8217;ll hit #600 within roughly the next 1.8 real-time seasons.</p>
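<p>The conversion from plate appearances to seasons is simple enough to sketch in Python. The helper below is hypothetical and only encodes the assumptions stated above: a 162-game season, a fixed number of plate appearances per game, and a fixed share of games played.</p>
<pre>
```python
def seasons_needed(plate_appearances, pa_per_game, share_of_games, games_per_season=162):
    """Return (games needed, seasons needed at the given share of games played)."""
    games = plate_appearances / pa_per_game
    return games, games / (games_per_season * share_of_games)


ramirez = seasons_needed(1121, 4.0, 0.75)
thome = seasons_needed(471, 3.5, 0.70)
thome_pessimistic = seasons_needed(471, 2.5, 0.66)

for label, (games, seasons) in [("Ramirez", ramirez), ("Thome", thome), ("Thome, pessimistic", thome_pessimistic)]:
    print(label, round(games, 1), "games,", round(seasons, 2), "seasons")
```
</pre>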
<p>Since Thome and Ramirez are close in age, there&#8217;s probably no good reason to expect one to retire before the other, and they&#8217;ll probably both be hitting as designated hitters in the AL next year. As a result, it&#8217;s very fair to expect Thome to A) reach 600 home runs and B) do it before <strong><a href="http://www.baseball-reference.com/players/r/ramirma02.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Manny Ramirez</a></strong>.</p>
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a>, <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/600-home-runs/'>600 home runs</a>, <a href='http://tomflesher.com/tag/alex-rodriguez/'>Alex Rodriguez</a>, <a href='http://tomflesher.com/tag/binomial-distribution/'>binomial distribution</a>, <a href='http://tomflesher.com/tag/dodgers/'>Dodgers</a>, <a href='http://tomflesher.com/tag/home-runs/'>home runs</a>, <a href='http://tomflesher.com/tag/jim-thome/'>Jim Thome</a>, <a href='http://tomflesher.com/tag/manny-ramirez/'>Manny Ramirez</a>, <a href='http://tomflesher.com/tag/quick-and-dirty-stats/'>quick and dirty stats</a>, <a href='http://tomflesher.com/tag/twins/'>Twins</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/25/600-home-runs-whos-second/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>Micah Owings and Cobb-Douglas Production</title>
		<link>http://tomflesher.com/2010/07/22/micah-owings-and-cobb-douglas-production/</link>
		<comments>http://tomflesher.com/2010/07/22/micah-owings-and-cobb-douglas-production/#comments</comments>
		<pubDate>Thu, 22 Jul 2010 13:34:05 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[David Ortiz]]></category>
		<category><![CDATA[Micah Owings]]></category>
		<category><![CDATA[run production]]></category>
		<category><![CDATA[Reds]]></category>
		<category><![CDATA[Brooks Kieschnick]]></category>
		<category><![CDATA[Cobb-Douglas function]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=386</guid>
		<description><![CDATA[Micah Owings, who is one of the best two-way players in baseball since Brooks Kieschnick, was sent down to the minors by the Cincinnati Reds yesterday. As big a fan as I am of Micah (really, look at the blog), I think this was probably the right decision. Owings was being used as a long [...]]]></description>
			<content:encoded><![CDATA[<p><strong><a href="http://www.baseball-reference.com/players/o/owingmi01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Micah Owings</a></strong>, who is one of the best two-way players in baseball since Brooks Kieschnick, was <a href="http://news.cincinnati.com/article/20100721/SPT04/307210081/1062/SPT/Owings-out-Fisher-back">sent down to the minors</a> by the Cincinnati Reds yesterday. As big a fan as I am of Micah (really, look at the blog), I think this was probably the right decision.</p>
<p>Owings was being used as a long reliever. For a big-hitting pitcher like Micah, that&#8217;s death to begin with. Relievers need to be available to pitch, so the Reds couldn&#8217;t get their money&#8217;s worth from Owings as a pinch hitter, since he wouldn&#8217;t be available to re-enter the game as a pitcher unless they used him immediately. They also weren&#8217;t getting their money&#8217;s worth as a pitcher, since, as Cincinnati.com notes, the Reds&#8217; starting pitching was doing very well and so long relief wasn&#8217;t being used very often.</p>
<p>Letting Owings start in AAA will give him the best possible outcome &#8211; he&#8217;ll have regular opportunities to pitch, so he won&#8217;t rust, and he&#8217;ll get to bat at least some of the time. Owings needs to be cultivated as a batter because that&#8217;s where his comparative advantage is. I doubt he&#8217;ll ever be at the top of the rotation, but he could be a competent fifth starter. If he pitches often enough to get there, he&#8217;ll add significant value to the team in terms of his OBP above the expected pitcher. He&#8217;ll get on base more, so he&#8217;ll both advance runners and avoid making an out.</p>
<p>A baseball player is a factory for producing run differential. He does so using two inputs: defensive ability (pitching and fielding) and offensive ability (batting). In the National League, if a player can&#8217;t hit at all, he&#8217;s likely to produce very little in the way of run differential, but at the same time, if he&#8217;s a liability on defense, he&#8217;s not likely to be very useful either. Defense produces marginal runs by preventing opposing runs from scoring, and offense produces marginal runs by scoring runs. Having either one set to zero (in the case of a pitcher who can&#8217;t hit at all) or a negative value (an actively bad pitcher) would negatively affect the player&#8217;s run production. This is similar to a factory situation where labor and equipment are used to produce goods, and that situation is usually modeled using a <a href="http://en.wikipedia.org/wiki/Cobb%E2%80%93Douglas">Cobb-Douglas production function</a>:</p>
<p><img src='http://l.wordpress.com/latex.php?latex=Y+%3D+K%5E%7B%5Calpha%7D+%5Ctimes+L%5E%7B1+-+%5Calpha%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Y = K^{\alpha} \times L^{1 - \alpha}' title='Y = K^{\alpha} \times L^{1 - \alpha}' class='latex' /></p>
<p>with Y = production, K = equipment and technology, L = labor input, and <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> a constant between 0 and 1 that represents how important K is relative to L. (The productivity constant that usually multiplies the right-hand side is taken to be 1 here.) K might be, for example, operating expenses for a machine to produce widgets, and L might be the wages paid to the operators of the machine. This function has the nice property that if we think both inputs are equally important (that is, <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> = .5) then, for a fixed total to be split between K and L, production is maximized when the inputs are equal.</p>
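<p>As a rough numerical check of that property (a sketch, not part of the original argument), the Python below splits a hypothetical budget of 100 units between K and L and looks for the split with the highest output. The budget, the whole-unit grid, and the function name are assumptions made only for illustration.</p>

```python
def cobb_douglas(k, l, alpha):
    """Output Y = K^alpha * L^(1 - alpha), with the productivity constant set to 1."""
    return (k ** alpha) * (l ** (1 - alpha))


# Hypothetical total input to divide between equipment (K) and labor (L).
BUDGET = 100
ALPHA = 0.5  # both inputs treated as equally important

# Try every whole-unit split and keep the one that produces the most output.
best_k = max(range(0, BUDGET + 1), key=lambda k: cobb_douglas(k, BUDGET - k, ALPHA))
best_l = BUDGET - best_k
best_output = cobb_douglas(best_k, best_l, ALPHA)

print(best_k, best_l, best_output)
```

<p>With the weight at .5 the search settles on the even split. Moving the weight toward 1 should pull the best split toward K.</p>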
<p>In general, production of run differential could be modeled using the same method. For example:</p>
<p><img src='http://l.wordpress.com/latex.php?latex=RD+%3D+P%5E%7B%5Calpha%7D+%5Ctimes+F%5E%7B%5Cbeta%7D+%5Ctimes+B%5E%7B1+-+%5Calpha+-+%5Cbeta%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='RD = P^{\alpha} \times F^{\beta} \times B^{1 - \alpha - \beta}' title='RD = P^{\alpha} \times F^{\beta} \times B^{1 - \alpha - \beta}' class='latex' /></p>
<p>where P = pitching contribution, F = fielding contribution, B = batting contribution, and <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> and <img src='http://l.wordpress.com/latex.php?latex=%5Cbeta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\beta' title='\beta' class='latex' /> are both between 0 and 1 and would vary based on position. For example, <strong><a href="http://www.baseball-reference.com/players/o/ortizda01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">David Ortiz</a></strong> is a designated hitter. His pitching ability is totally irrelevant, and so is his fielding ability outside of interleague games. The DH&#8217;s <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> would be 0 and his <img src='http://l.wordpress.com/latex.php?latex=%5Cbeta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\beta' title='\beta' class='latex' /> would be very close to 0. On the other hand, an American League pitcher would have an <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> very close to 1 since pitcher fielding is not as important as pitching and his hitting is entirely inconsequential in the AL. Catchers would have <img src='http://l.wordpress.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\alpha' title='\alpha' class='latex' /> at 0 but <img src='http://l.wordpress.com/latex.php?latex=%5Cbeta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\beta' title='\beta' class='latex' /> much higher than other positions.</p>
<p>The upshot of this method of modeling production is that it shows Owings can make up for being a less than stellar pitcher by helping his team score runs and be a considerably better investment than a pitcher with a slightly lower ERA but no run production.</p>
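<p>To make the comparison concrete, here is a minimal sketch of the run differential function. Every number in it is hypothetical: the inputs are on a made-up index where 1.0 means league average, and the weights for a National League pitcher and a designated hitter are guesses, not estimates from data.</p>

```python
def run_differential(pitching, fielding, batting, alpha, beta):
    """RD = P^alpha * F^beta * B^(1 - alpha - beta)."""
    if not (0 <= alpha <= 1 and 0 <= beta <= 1 and alpha + beta <= 1):
        raise ValueError("alpha and beta must be in [0, 1] and sum to at most 1")
    return (pitching ** alpha) * (fielding ** beta) * (batting ** (1 - alpha - beta))


# Hypothetical weights for a National League pitcher:
# mostly pitching, with a little fielding and a little batting.
NL_ALPHA, NL_BETA = 0.8, 0.1

# Pitcher A: average arm, nearly useless bat (hypothetical values).
pitcher_a = run_differential(pitching=1.0, fielding=1.0, batting=0.2,
                             alpha=NL_ALPHA, beta=NL_BETA)

# Pitcher B: slightly worse arm, much better bat (hypothetical values).
pitcher_b = run_differential(pitching=0.9, fielding=1.0, batting=0.6,
                             alpha=NL_ALPHA, beta=NL_BETA)

# A designated hitter: alpha is 0, so the pitching input drops out entirely
# (Python evaluates 0.0 ** 0.0 as 1.0).
dh = run_differential(pitching=0.0, fielding=0.5, batting=1.3,
                      alpha=0.0, beta=0.05)

print(pitcher_a, pitcher_b, dh)
```

<p>Under these made-up weights the better-hitting pitcher comes out ahead despite the weaker arm. How large the gap is depends entirely on the weights chosen.</p>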
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a>, <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/brooks-kieschnick/'>Brooks Kieschnick</a>, <a href='http://tomflesher.com/tag/cobb-douglas-function/'>Cobb-Douglas function</a>, <a href='http://tomflesher.com/tag/david-ortiz/'>David Ortiz</a>, <a href='http://tomflesher.com/tag/micah-owings/'>Micah Owings</a>, <a href='http://tomflesher.com/tag/reds/'>Reds</a>, <a href='http://tomflesher.com/tag/run-production/'>run production</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/22/micah-owings-and-cobb-douglas-production/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>Adventures in the Mets Bullpen: One-Run No-Decisions and Vulture Wins</title>
		<link>http://tomflesher.com/2010/07/19/adventures-in-the-mets-bullpen/</link>
		<comments>http://tomflesher.com/2010/07/19/adventures-in-the-mets-bullpen/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 15:37:24 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Mets]]></category>
		<category><![CDATA[Roy Halladay]]></category>
		<category><![CDATA[Johan Santana]]></category>
		<category><![CDATA[Yovani Gallardo]]></category>
		<category><![CDATA[Francisco Rodriguez]]></category>
		<category><![CDATA[Tyler Clippard]]></category>
		<category><![CDATA[Phil Cuzzi]]></category>
		<category><![CDATA[Ted Lilly]]></category>
		<category><![CDATA[Randy Wels]]></category>
		<category><![CDATA[Jason Stark]]></category>
		<category><![CDATA[Phil Cuzzi's hissyfit]]></category>
		<category><![CDATA[vulture wins]]></category>
		<category><![CDATA[one-run no-decisions]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=384</guid>
		<description><![CDATA[A close cousin of the Tough Loss discussed earlier is what Jayson Stark of ESPN calls the Criminally Unsupported Start. Stark defines a CUS as a start in which the pitcher pitches 6 or more innings but the offense scores one run or less in support. Johan Santana didn&#8217;t fit that definition last night, but [...] ]]></description>
			<content:encoded><![CDATA[<p>A close cousin of the <a href="http://tomflesher.com/2010/07/08/tough-losses/">Tough Loss</a> discussed earlier is what Jayson Stark of ESPN calls the <a href="http://sports.espn.go.com/mlb/columns/story?columnist=stark_jayson&amp;id=2910803">Criminally Unsupported Start</a>. Stark defines a CUS as a start in which the pitcher pitches 6 or more innings but the offense scores one run or less in support. <strong><a href="http://www.baseball-reference.com/players/s/santajo02.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Johan  Santana</a></strong> didn&#8217;t fit that definition <a href="http://www.baseball-reference.com/boxes/SFN/SFN201007180.shtml">last night</a>, but he was close: he left the game with a 2-1 lead after 8 innings pitched and ended up with a no-decision. (A friend of mine liked to call that &#8220;the ol&#8217; <strong><a href="http://www.baseball-reference.com/players/h/hallaro01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Roy  Halladay</a></strong>&#8221; back when Doc was pitching in Toronto.) Just as he was the centerpiece of Jayson Stark&#8217;s CUS standings back in 2007, Santana currently <a href="http://bbref.com/pi/shareit/V97Ya">leads the league</a> in starts with 6.0 or more innings pitched, at most one run allowed, and no decision. He has six such games, and no other pitcher has more than four. (<strong><a href="http://www.baseball-reference.com/players/g/gallayo01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Yovani  Gallardo</a></strong>, however, has a respectable 3.)</p>
<p>In <a href="http://bbref.com/pi/shareit/QXvvn">all of 2009</a>, no one hit the six-game mark in one-run no-decisions. Surprisingly, this year the Mets <a href="http://bbref.com/pi/shareit/KW9Ct">aren&#8217;t leading the league in these one-run no-decisions</a> &#8211; the Cubs are, led by <strong><a href="http://www.baseball-reference.com/players/w/wellsra01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Randy  Wells</a></strong> and his impressive 4, along with <strong><a href="http://www.baseball-reference.com/players/l/lillyte01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Ted  Lilly</a></strong> with 3.</p>
<p><strong><a href="http://www.baseball-reference.com/player_search.cgi?search=Francisco+Rodriguez&amp;utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Francisco  Rodriguez</a></strong> also picked up his third Vulture Win of the year last night. A vulture win is the combination of a blown save and a win in the same game. Usually, that happens when a hometown closer blows the save in the top of the 9th and his teammates score in the bottom for the win. Frankie blew the save in the bottom of the 9th last night, but they left him in to pitch the bottom of the 10th and he held on (despite Phil Cuzzi&#8217;s hissyfit and some questionable umpiring going in both directions). <strong><a href="http://www.baseball-reference.com/players/c/clippty01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Tyler  Clippard</a></strong> <a href="http://bbref.com/pi/shareit/665FK">leads the league in vulture wins</a> this year with four.</p>
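<p>The two definitions used in this post are simple enough to write down as filters. This is only a sketch: the function names and the decision codes ("W", "L", "ND") are my own labels, not anything taken from baseball-reference.com.</p>

```python
def is_one_run_no_decision(innings_pitched, runs_allowed, decision):
    """A start of 6.0 or more innings, at most one run allowed, and no decision."""
    return innings_pitched >= 6.0 and runs_allowed <= 1 and decision == "ND"


def is_vulture_win(blown_save, decision):
    """A blown save and a win credited to the same pitcher in the same game."""
    return blown_save and decision == "W"


# Santana's start as described above: 8 innings, left with a 2-1 lead, no decision.
santana = is_one_run_no_decision(innings_pitched=8.0, runs_allowed=1, decision="ND")

# Rodriguez's outing as described above: blew the save, then got the win.
rodriguez = is_vulture_win(blown_save=True, decision="W")

print(santana, rodriguez)
```

<p>Run over a full season of game logs, filters like these would reproduce the kind of leaderboards linked above, assuming the logs carry those fields.</p>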
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a> Tagged: <a href='http://tomflesher.com/tag/francisco-rodriguez/'>Francisco Rodriguez</a>, <a href='http://tomflesher.com/tag/jason-stark/'>Jason Stark</a>, <a href='http://tomflesher.com/tag/johan-santana/'>Johan Santana</a>, <a href='http://tomflesher.com/tag/mets/'>Mets</a>, <a href='http://tomflesher.com/tag/one-run-no-decisions/'>one-run no-decisions</a>, <a href='http://tomflesher.com/tag/phil-cuzzi/'>Phil Cuzzi</a>, <a href='http://tomflesher.com/tag/phil-cuzzis-hissyfit/'>Phil Cuzzi's hissyfit</a>, <a href='http://tomflesher.com/tag/randy-wels/'>Randy Wels</a>, <a href='http://tomflesher.com/tag/roy-halladay/'>Roy Halladay</a>, <a href='http://tomflesher.com/tag/ted-lilly/'>Ted Lilly</a>, <a href='http://tomflesher.com/tag/tyler-clippard/'>Tyler Clippard</a>, <a href='http://tomflesher.com/tag/vulture-wins/'>vulture wins</a>, <a href='http://tomflesher.com/tag/yovani-gallardo/'>Yovani Gallardo</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/19/adventures-in-the-mets-bullpen/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>The Kate Smith Effect</title>
		<link>http://tomflesher.com/2010/07/18/the-kate-smith-effect/</link>
		<comments>http://tomflesher.com/2010/07/18/the-kate-smith-effect/#comments</comments>
		<pubDate>Sun, 18 Jul 2010 04:36:10 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[binomial distribution]]></category>
		<category><![CDATA[Kate Smith]]></category>
		<category><![CDATA[Flyers]]></category>
		<category><![CDATA[hockey-reference.com]]></category>
		<category><![CDATA[Kate Smith Effect]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=382</guid>
		<description><![CDATA[From the Mountains&#8230; To the Prairies&#8230; To the Oceans&#8230; White with foam&#8230;. It&#8217;s &#8220;well-known&#8221; that when Kate Smith sings &#8220;God Bless America&#8221; &#8211; whether live starting in 1969 or on videotape now &#8211; the Philadelphia Flyers play better, or at least they&#8217;re more likely to win. As Wikipedia indicates, she&#8217;s considered a good luck charm [...] ]]></description>
			<content:encoded><![CDATA[<p><em>From the Mountains&#8230;<br />
To the Prairies&#8230;<br />
To the Oceans&#8230;<br />
White with foam&#8230;.</em></p>
<p>It&#8217;s &#8220;well-known&#8221; that when <a href="http://en.wikipedia.org/wiki/Kate_Smith#Significance_in_professional_sports">Kate Smith</a> sings &#8220;God Bless America&#8221; &#8211; whether live starting in 1969 or on videotape now &#8211; <a href="http://www.hockey-reference.com/teams/PHI/">the Philadelphia Flyers</a> play better, or at least they&#8217;re more likely to win. As Wikipedia indicates, she&#8217;s considered a good luck charm for the Flyers. How much does she help?</p>
<p>Since 1969, the Flyers have played in 3268 games and won 1631 of them for an observed win percentage of .4991. That&#8217;s very close to the long-term win percentage of .50 that we&#8217;d expect for any team. Of those games, Kate Smith sang or was played at 114 of them with a  total record of 87-23-4, and the record when Kate Smith did not sing was 1544 wins in 3154 games for a &#8220;non-Kate&#8221; win proportion of .4895. I&#8217;ll make the null hypothesis that the Flyers play exactly the same way in games where &#8220;God Bless America&#8221; is sung &#8211; &#8220;Kate games&#8221; &#8211; as they do when it isn&#8217;t. That means that</p>
<p><img src='http://l.wordpress.com/latex.php?latex=H_%7B0%7D%3A+p%28Win+%5Cmid+Kate%29+%3D+p%28Win+%5Cmid+Non-Kate%29+%3D+.4895+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H_{0}: p(Win \mid Kate) = p(Win \mid Non-Kate) = .4895 ' title='H_{0}: p(Win \mid Kate) = p(Win \mid Non-Kate) = .4895 ' class='latex' /></p>
<p>The simplest way to attack this is to note that the Flyers&#8217; win percentage in Kate games is .7632. Qualitatively, that&#8217;s quite a jump &#8211; surely, it must be significant. Of course, we can&#8217;t leave it at that.</p>
<p>First, note that if the Flyers&#8217; true win probability in Kate games were .4895, the binomial probability of winning exactly 87 games in 114 trials would be approximately .00000000145 &#8211; that&#8217;s about 145 in one hundred billion. That&#8217;s highly unlikely. However, other methods can help us quantify the Kate Smith Effect.</p>
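<p>For anyone who wants to check that figure, a short Python sketch follows. It uses only the standard library and the win totals quoted above; the variable names are mine. It also adds up the probability of 87 or more wins, which is the quantity a one-sided test would normally report.</p>

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Non-Kate win proportion from the figures above: 1544 wins in 3154 games.
p_null = 1544 / 3154

# Probability of exactly 87 wins in 114 Kate games under the null hypothesis.
exactly_87 = binomial_pmf(87, 114, p_null)

# Probability of 87 or more wins under the same assumption.
at_least_87 = sum(binomial_pmf(k, 114, p_null) for k in range(87, 115))

print(exactly_87, at_least_87)
```

<p>Both numbers land in the neighborhood of one or two in a billion, so the conclusion does not depend on which of the two is used.</p>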
<p>The standard error for proportions is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Csqrt%7B%5Cfrac%7Bp%281-p%29%7D%7Bn%7D%7D+%3D+%5Csqrt%7B%5Cfrac%7B.7632%28.2368%29%7D%7B114%7D%7D+%3D+%5Csqrt%7B.001585%7D+%3D+.0398&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.7632(.2368)}{114}} = \sqrt{.001585} = .0398' title='\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.7632(.2368)}{114}} = \sqrt{.001585} = .0398' class='latex' /></p>
<p>With 113 degrees of freedom and a 95% confidence interval, I used <a href="http://www.stat.tamu.edu/~west/applets/tdemo.html">Texas A&amp;M&#8217;s t Calculator</a> to find that the appropriate critical value is 1.98. That means that we can be 95% confident that the win percentage in Kate games is somewhere in the range</p>
<p><img src='http://l.wordpress.com/latex.php?latex=.7632+%5Cpm+1.98+%5Ctimes+.0398+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='.7632 \pm 1.98 \times .0398 ' title='.7632 \pm 1.98 \times .0398 ' class='latex' /> or approximately <img src='http://l.wordpress.com/latex.php?latex=.6844+%5Cle+p%28Win+%5Cmid+Kate%29+%5Cle+.8420+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='.6844 \le p(Win \mid Kate) \le .8420 ' title='.6844 \le p(Win \mid Kate) \le .8420 ' class='latex' /></p>
<p>Since the observed proportion in non-Kate games is .4895, that means the Kate Smith Effect is somewhere in the range</p>
<p><img src='http://l.wordpress.com/latex.php?latex=.1949+%5Cle+%5Chat%7B%5Cdelta%7D+%5Cle+.3525+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='.1949 \le \hat{\delta} \le .3525 ' title='.1949 \le \hat{\delta} \le .3525 ' class='latex' /></p>
<p>Though I can&#8217;t explain <em>why</em>, it&#8217;s apparent that there&#8217;s a Kate Smith Effect of at least 19 percentage points in terms of winning percentage. This isn&#8217;t to say that playing Kate Smith&#8217;s &#8220;God Bless America&#8221; causes good luck. Since the Kate video is considered a good luck charm, it&#8217;s probably more likely that the players play harder in games that are deemed important enough to play it.</p>
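<p>The interval arithmetic can be reproduced with a few lines of Python. This is a sketch of the same calculation, using the standard error formula and the 1.98 critical value given above. It treats the non-Kate win rate as a fixed benchmark with no sampling error of its own, which is a simplifying assumption.</p>

```python
from math import sqrt


def proportion_interval(wins, games, critical_value):
    """Return the observed proportion, its standard error, and the interval bounds."""
    p_hat = wins / games
    se = sqrt(p_hat * (1 - p_hat) / games)
    margin = critical_value * se
    return p_hat, se, p_hat - margin, p_hat + margin


T_CRITICAL = 1.98            # critical value quoted above for 113 degrees of freedom
NON_KATE_RATE = 1544 / 3154  # benchmark win proportion in non-Kate games

p_hat, se, low, high = proportion_interval(87, 114, T_CRITICAL)

# The effect is the gap between the Kate interval and the non-Kate benchmark.
effect_low = low - NON_KATE_RATE
effect_high = high - NON_KATE_RATE

print(round(p_hat, 4), round(se, 4))
print(round(low, 4), round(high, 4))
print(round(effect_low, 4), round(effect_high, 4))
```

<p>Swapping in a different critical value widens or narrows the interval without changing the basic picture, since the lower bound stays well above the non-Kate rate.</p>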
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a> Tagged: <a href='http://tomflesher.com/tag/binomial-distribution/'>binomial distribution</a>, <a href='http://tomflesher.com/tag/flyers/'>Flyers</a>, <a href='http://tomflesher.com/tag/hockey-reference-com/'>hockey-reference.com</a>, <a href='http://tomflesher.com/tag/kate-smith/'>Kate Smith</a>, <a href='http://tomflesher.com/tag/kate-smith-effect/'>Kate Smith Effect</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/18/the-kate-smith-effect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>Cheap Wins</title>
		<link>http://tomflesher.com/2010/07/16/cheap-wins/</link>
		<comments>http://tomflesher.com/2010/07/16/cheap-wins/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 05:51:44 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[baseball-reference.com]]></category>
		<category><![CDATA[Bill James]]></category>
		<category><![CDATA[Roy Halladay]]></category>
		<category><![CDATA[Tim Lincecum]]></category>
		<category><![CDATA[Yovani Gallardo]]></category>
		<category><![CDATA[R.A. Dickey]]></category>
		<category><![CDATA[Cheap Wins]]></category>
		<category><![CDATA[Tough Losses]]></category>
		<category><![CDATA[John Danks]]></category>
		<category><![CDATA[Ricky Romero]]></category>
		<category><![CDATA[Tim Wakefield]]></category>
		<category><![CDATA[Joe Saunders]]></category>
		<category><![CDATA[John Lackey]]></category>
		<category><![CDATA[Brian Bannister]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=378</guid>
		<description><![CDATA[The opposite of the Tough Loss discussed below (which R.A. Dickey unfortunately experienced tonight in a duel with Tim Lincecum) is a Cheap Win. Logically, since a Tough Loss is a loss in a quality start, a Cheap Win (invented by Bill James) is a win in a non-quality start &#8211; that is, a start [...] ]]></description>
			<content:encoded><![CDATA[<p>The opposite of the Tough Loss discussed below (which <strong><a href="http://www.baseball-reference.com/players/d/dicker.01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">R.A.  Dickey</a></strong> unfortunately experienced tonight in a duel with <strong><a href="http://www.baseball-reference.com/players/l/linceti01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Tim  Lincecum</a></strong>) is a Cheap Win. Logically, since a Tough Loss is a loss in a quality start, a Cheap Win (invented by Bill James) is a win in a non-quality start &#8211; that is, a start with a game score of below 50 (or, officially, a start with fewer than 6.0 innings pitched or more than 3 runs allowed).</p>
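<p>Spelled out as code, the two labels look something like the sketch below. It uses the innings-and-runs version of the definition given above rather than game score, and the <code>Start</code> record and its field names are hypothetical, invented just for this example.</p>

```python
from dataclasses import dataclass


@dataclass
class Start:
    innings_pitched: float  # 6.0 means six full innings
    runs_allowed: int
    decision: str           # "W", "L", or "ND" (hypothetical codes)


def is_quality_start(start):
    """At least 6.0 innings pitched and no more than 3 runs allowed."""
    return start.innings_pitched >= 6.0 and start.runs_allowed <= 3


def classify(start):
    """Label a start as a Cheap Win, a Tough Loss, or neither."""
    quality = is_quality_start(start)
    if start.decision == "W" and not quality:
        return "Cheap Win"
    if start.decision == "L" and quality:
        return "Tough Loss"
    return "neither"


# John Danks on Thursday, as described in this post: six innings, six runs, a win.
danks = Start(innings_pitched=6.0, runs_allowed=6, decision="W")
print(classify(danks))
```

<p>A version based on game score would need strikeouts, hits, and walks as well, so it is left out here.</p>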
<p>The Chicago White Sox&#8217; starter, <strong><a href="http://www.baseball-reference.com/players/d/danksjo01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">John  Danks</a></strong>, picked up a Cheap Win in Thursday&#8217;s game against the Twins. Although he pitched six innings, he gave up six runs (all earned) in the second inning, leading to an abysmal game score of 33. Danks had two of <a href="http://bbref.com/pi/shareit/SzKxb">last year&#8217;s 304 Cheap Wins</a>. <strong><a href="http://www.baseball-reference.com/players/r/romerri01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Ricky  Romero</a></strong> led <a href="http://bbref.com/pi/shareit/SzKxb">the pack</a> with six, and <strong><a href="http://www.baseball-reference.com/players/s/saundjo01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Joe  Saunders</a></strong> and <strong><a href="http://www.baseball-reference.com/players/w/wakefti01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Tim  Wakefield</a></strong> were both among the six pitchers with five Cheap Wins. Even <strong><a href="http://www.baseball-reference.com/players/h/hallaro01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Roy  Halladay</a></strong> had two.</p>
<p>Through the beginning of the All-Star Break, there have been <a href="http://bbref.com/pi/shareit/BCQQA">136 Cheap Wins</a> in 2010. That includes one by my current favorite player, <strong><a href="http://www.baseball-reference.com/players/g/gallayo01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Yovani  Gallardo</a></strong>. <strong><a href="http://www.baseball-reference.com/players/l/lackejo01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">John  Lackey</a></strong> is <a href="http://bbref.com/pi/shareit/QHXVB">already up to 5</a>, and <strong><a href="http://www.baseball-reference.com/players/b/bannibr01.shtml?utm_source=direct&amp;utm_medium=linker&amp;utm_campaign=Linker">Brian  Bannister</a></strong> is knocking on the door with 4.</p>
<p>It&#8217;s hard to read too much into the tea leaves of Cheap Wins, since they&#8217;re not all created equal. In general, they represent a pitcher sliding a little bit off his game, but his team upping their run production to rescue him. To that end, Cheap Wins might be a better measure of a team&#8217;s ability than Tough Losses, since, while Tough Losses show a pitcher maintaining himself under fire, Cheap Wins represent an ability to hit in the clutch (assuming that run production in Cheap Wins is significantly different from run production in other games). That&#8217;s hard to validate without doing a bit more work, but it&#8217;s a project to consider.</p>
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a> Tagged: <a href='http://tomflesher.com/tag/baseball-reference-com/'>baseball-reference.com</a>, <a href='http://tomflesher.com/tag/bill-james/'>Bill James</a>, <a href='http://tomflesher.com/tag/brian-bannister/'>Brian Bannister</a>, <a href='http://tomflesher.com/tag/cheap-wins/'>Cheap Wins</a>, <a href='http://tomflesher.com/tag/joe-saunders/'>Joe Saunders</a>, <a href='http://tomflesher.com/tag/john-danks/'>John Danks</a>, <a href='http://tomflesher.com/tag/john-lackey/'>John Lackey</a>, <a href='http://tomflesher.com/tag/r-a-dickey/'>R.A. Dickey</a>, <a href='http://tomflesher.com/tag/ricky-romero/'>Ricky Romero</a>, <a href='http://tomflesher.com/tag/roy-halladay/'>Roy Halladay</a>, <a href='http://tomflesher.com/tag/tim-lincecum/'>Tim Lincecum</a>, <a href='http://tomflesher.com/tag/tim-wakefield/'>Tim Wakefield</a>, <a href='http://tomflesher.com/tag/tough-losses/'>Tough Losses</a>, <a href='http://tomflesher.com/tag/yovani-gallardo/'>Yovani Gallardo</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/16/cheap-wins/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>
	</item>
		<item>
		<title>Paul the Octopus: Credible?</title>
		<link>http://tomflesher.com/2010/07/11/paul-the-octopus-credible/</link>
		<comments>http://tomflesher.com/2010/07/11/paul-the-octopus-credible/#comments</comments>
		<pubDate>Sun, 11 Jul 2010 23:14:26 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[binomial distribution]]></category>
		<category><![CDATA[Paul the Octopus]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[World Cup]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=358</guid>
		<description><![CDATA[Paul the Octopus (hatched 2008) is an octopus who correctly predicted 12 of 14 World Cup matches, including Spain&#8217;s victory over the Dutch. Is his string of victories statistically significant? First, I&#8217;m going to posit the null hypothesis that Paul is choosing randomly. As such, Paul&#8217;s proportion of correct choices should be .5. His [...] ]]></description>
			<content:encoded><![CDATA[<div class="mceTemp">
<dl class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://heureusementici.files.wordpress.com/2010/07/oktopus-orakel_paul_mit_schuh.jpg"><img class="size-thumbnail wp-image-363" title="Oktopus-Orakel_Paul_mit_Schuh" src="http://heureusementici.files.wordpress.com/2010/07/oktopus-orakel_paul_mit_schuh.jpg?w=112&#038;h=150" alt="Paul the Octopus with a Shoe" width="112" height="150" /></a></dt>
</dl>
</div>
<p><a href="http://en.wikipedia.org/wiki/Paul_the_Octopus">Paul the Octopus</a> (hatched 2008) is an octopus who correctly predicted 12 of 14 World Cup matches, including Spain&#8217;s victory over the Dutch. Is his string of victories statistically significant?</p>
<p>First, I&#8217;m going to posit the null hypothesis that Paul is choosing randomly. As such, Paul&#8217;s proportion of correct choices should be .5 (<img src='http://l.wordpress.com/latex.php?latex=H_o+%3A+%5Cbar%7Bp%7D+%3D+.5&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H_o : \bar{p} = .5' title='H_o : \bar{p} = .5' class='latex' />). His observed proportion of correct choices is 12/14 or .857.</p>
<p>The standard error for proportions is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Csqrt%7B%5Cfrac%7Bp%281-p%29%7D%7Bn-1%7D%7D+%3D+%5Csqrt%7B%5Cfrac%7B.857%28.143%29%7D%7B13%7D%7D+%3D+%5Csqrt%7B%5Cfrac%7B.123%7D%7B13%7D%7D+%3D+%5Csqrt%7B.009%7D+%3D+.097+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.857(.143)}{13}} = \sqrt{\frac{.123}{13}} = \sqrt{.009} = .097 ' title='\sqrt{\frac{p(1-p)}{n-1}} = \sqrt{\frac{.857(.143)}{13}} = \sqrt{\frac{.123}{13}} = \sqrt{.009} = .097 ' class='latex' /></p>
<p>The t-statistic measures how far the observed proportion sits from the null value of .5, in units of standard error:</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B%5Chat%7Bp%7D+-+p_0%7D%7Bse%7D+%5Csim%5C+t_%7Bdf%7D+%3D+%5Cfrac%7B.857+-+.5%7D%7B.097%7D+%5Csim%5C+t_%7B13%7D+%3D+3.68+%5Csim%5C+t_%7B13%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{\hat{p} - p_0}{se} \sim\ t_{df} = \frac{.857 - .5}{.097} \sim\ t_{13} = 3.68 \sim\ t_{13} ' title='\frac{\hat{p} - p_0}{se} \sim\ t_{df} = \frac{.857 - .5}{.097} \sim\ t_{13} = 3.68 \sim\ t_{13} ' class='latex' /></p>
<p>According to <a href="http://www.stat.tamu.edu/~west/applets/tdemo.html">Texas A&amp;M&#8217;s t Distribution Calculator</a>, the probability (or p-value) of this result by chance alone is less than .01.</p>
<p>Using the <a href="http://en.wikipedia.org/wiki/Binomial_distribution">binomial distribution</a> with <img src='http://l.wordpress.com/latex.php?latex=p+%3D+.5&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p = .5' title='p = .5' class='latex' />, the probability of 12 or more successes in 14 trials is just .0065, or 106 chances in 16,384.</p>
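<p>The binomial figure is easy to verify by brute force. The sketch below adds up the probabilities of 12, 13, and 14 correct picks out of 14 when each pick is a coin flip; the function name is mine and only the standard library is used.</p>

```python
from math import comb


def binomial_tail(successes, trials, p):
    """Probability of at least `successes` successes in `trials` independent trials."""
    return sum(
        comb(trials, k) * (p ** k) * ((1 - p) ** (trials - k))
        for k in range(successes, trials + 1)
    )


# Null hypothesis: Paul picks at random, so each pick is right with probability .5.
p_value = binomial_tail(12, 14, 0.5)

# Number of the 2^14 equally likely outcomes with 12 or more correct picks.
favorable = comb(14, 12) + comb(14, 13) + comb(14, 14)

print(favorable, 2 ** 14, round(p_value, 4))
```

<p>Counting outcomes directly gives the same answer as the formula, which is a handy sanity check for a sample this small.</p>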
<p>So, is Paul an oracle? Almost certainly not. However, not being a zoologist, I can&#8217;t explain what biases might be in play. I&#8217;d imagine it&#8217;s something like an attraction to contrast as well as a spurious correlation between octopus-attractive flags and success at soccer.</p>
<br />Filed under: <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/binomial-distribution/'>binomial distribution</a>, <a href='http://tomflesher.com/tag/paul-the-octopus/'>Paul the Octopus</a>, <a href='http://tomflesher.com/tag/statistics/'>statistics</a>, <a href='http://tomflesher.com/tag/world-cup/'>World Cup</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/11/paul-the-octopus-credible/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/oktopus-orakel_paul_mit_schuh.jpg?w=112" medium="image">
			<media:title type="html">Oktopus-Orakel_Paul_mit_Schuh</media:title>
		</media:content>
	</item>
		<item>
		<title>More on Home Runs Per Game</title>
		<link>http://tomflesher.com/2010/07/09/more-on-home-runs-per-game/</link>
		<comments>http://tomflesher.com/2010/07/09/more-on-home-runs-per-game/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 14:35:26 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[baseball-reference.com]]></category>
		<category><![CDATA[Japan]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Rays]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[home runs]]></category>
		<category><![CDATA[Japanese baseball]]></category>
		<category><![CDATA[Chow test]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=335</guid>
		<description><![CDATA[In the previous post, I looked at the trend in home runs per game in the Major Leagues and suggested that the recent deviation from the increasing trend might have been due to the development of strong farm systems like the Tampa Bay Rays&#8217;. That means that if the same data analysis process is used [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=tomflesher.com&blog=14243162&post=335&subd=heureusementici&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p>In the previous post, I looked at the trend in home runs per game in the Major Leagues and suggested that the recent deviation from the increasing trend might have been due to the development of strong farm systems like the Tampa Bay Rays&#8217;. That means that if the same data analysis process is used on data in an otherwise identical league, we should see similar trends but no dropoff around 1995. As usual, for replication purposes I&#8217;m going to use Japan&#8217;s Pro Baseball leagues, the Pacific and Central Leagues. They&#8217;re ideal because, just like the American Major Leagues, one league uses the designated hitter and one does not. There are some differences &#8211; the talent pool is a bit smaller because of the lower population base that the leagues draw from, and there are only 6 teams in each league as opposed to MLB&#8217;s 14 and 16.</p>
<p>As a reminder, the MLB regression gave us a regression equation of</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Chat%7BHR%7D+%3D+.957+-+.0188+%5Ctimes+t+%2B+.0004+%5Ctimes+t%5E2+%2B+.0911+%5Ctimes+DH+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\hat{HR} = .957 - .0188 \times t + .0004 \times t^2 + .0911 \times DH ' title='\hat{HR} = .957 - .0188 \times t + .0004 \times t^2 + .0911 \times DH ' class='latex' /></p>
<p>where <img src='http://l.wordpress.com/latex.php?latex=%5Chat%7BHR%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\hat{HR} ' title='\hat{HR} ' class='latex' /> is the predicted number of home runs per game, <em>t</em> is a time variable starting at <em>t</em>=1 in 1955, and <em>DH</em> is a binary variable that takes value 1 if the league uses the designated hitter in the season in question.</p>
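<p>To make the equation concrete, here is a small Python sketch that evaluates it for a given <em>t</em> and <em>DH</em>, and finds where the quadratic in <em>t</em> bottoms out. The function names are hypothetical, and the coefficients are the rounded ones quoted above, so the results are approximate.</p>

```python
def predicted_hr_per_game(t, dh):
    """Evaluate the fitted MLB equation; dh is 1 for a DH league, else 0."""
    return 0.957 - 0.0188 * t + 0.0004 * t**2 + 0.0911 * dh


def quadratic_turning_point(b1, b2):
    """Value of t at which b1*t + b2*t**2 changes direction."""
    return -b1 / (2 * b2)


# Predicted home runs per game at t=40 in a league that uses the DH.
print(round(predicted_hr_per_game(40, 1), 4))  # 0.9361

# Bottom of the fitted U-shape, in units of t.
print(round(quadratic_turning_point(-0.0188, 0.0004), 1))  # 23.5
```

<p>The turning point near <em>t</em>=23.5 matches the rough U-shape of the MLB data.</p>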
<p><a href="http://heureusementici.files.wordpress.com/2010/07/japanhrpergame.jpg"><img class="alignright size-thumbnail  wp-image-336" title="japanhrpergame" src="http://heureusementici.files.wordpress.com/2010/07/japanhrpergame.jpg?w=150&#038;h=82" alt="" width="150" height="82" /></a>Just examining the data on home runs per game from the Japanese leagues, the trend looks significantly different. Instead of the rough U-shape that the MLB data showed, the Japanese data looks almost M-shaped with a maximum around 1984. (Why, I&#8217;m not sure &#8211; I&#8217;m not knowledgeable enough about Japanese baseball to know what might have caused that spike.) It reaches a minimum again and then keeps rising.</p>
<p>After running the same regression with <em>t</em>=1 in 1950, I got these results:</p>
<table border="0" cellspacing="0" cellpadding="0" width="384">
<col span="6" width="64"></col>
<tbody>
<tr>
<td width="64" height="20"></td>
<td width="64">Estimate</td>
<td width="64">Std. Error</td>
<td width="64">t-value</td>
<td width="64">p-value</td>
<td width="64">Signif</td>
</tr>
<tr>
<td height="20">B0</td>
<td align="right">0.2462</td>
<td align="right">0.0992</td>
<td align="right">2.481</td>
<td align="right">0.0148</td>
<td align="right">0.9852</td>
</tr>
<tr>
<td height="20">t</td>
<td align="right">0.0478</td>
<td align="right">0.0062</td>
<td align="right">7.64</td>
<td align="right">1.63E-11</td>
<td align="right">1</td>
</tr>
<tr>
<td height="20">tsq</td>
<td align="right">-0.0006</td>
<td align="right">0.00009</td>
<td align="right">-7.463</td>
<td align="right">3.82E-11</td>
<td align="right">1</td>
</tr>
<tr>
<td height="20">DH</td>
<td align="right">0.0052</td>
<td align="right">0.0359</td>
<td align="right">0.144</td>
<td align="right">0.8855</td>
<td align="right">0.1145</td>
</tr>
</tbody>
</table>
<p>This equation shows two things, one that surprises me and one that doesn&#8217;t. The unsurprising factor is the switching of signs for the <em>t</em> variables &#8211; we expected that based on the shape of the data. The surprising factor is that the designated hitter rule is statistically insignificant: its p-value is .8855, so the Signif column (1 minus the p-value) comes to only about 11%. In addition, this model explains less of the variation than the MLB version &#8211; while that explained about 56% of the variation, the Japanese model has an <img src='http://l.wordpress.com/latex.php?latex=R%5E2+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R^2 ' title='R^2 ' class='latex' /> value of .4045, meaning it explains about 40% of the variation in home runs per game.</p>
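<p>The t-value and Signif columns in the table above follow from the other columns: the t-value is the estimate divided by its standard error, and Signif is 1 minus the p-value. A minimal Python sketch for two of the rows, with hypothetical function names; because the table entries are rounded, the t-values it prints differ slightly in the last digit.</p>

```python
# Rows copied from the table above: (estimate, standard error, p-value).
rows = {
    "B0": (0.2462, 0.0992, 0.0148),
    "DH": (0.0052, 0.0359, 0.8855),
}


def t_value(estimate, std_error):
    """Ratio of a coefficient to its standard error."""
    return estimate / std_error


def signif(p_value):
    """The table's Signif column: 1 minus the p-value."""
    return 1 - p_value


for name, (est, se, p) in rows.items():
    print(name, round(t_value(est, se), 3), round(signif(p), 4))
```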
<p><a href="http://heureusementici.files.wordpress.com/2010/07/japanresidualhrpergame1.jpg"><img class="alignright size-thumbnail wp-image-338" title="japanresidualhrpergame" src="http://heureusementici.files.wordpress.com/2010/07/japanresidualhrpergame1.jpg?w=150&#038;h=82" alt="" width="150" height="82" /></a>There&#8217;s a slightly interesting pattern to the residual home runs per game (<img src='http://l.wordpress.com/latex.php?latex=Residual+%3D+HR+-+%5Chat%7BHR%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Residual = HR - \hat{HR}' title='Residual = HR - \hat{HR}' class='latex' />). Although it isn&#8217;t as pronounced, this data also shows a spike &#8211; but the spike is at <em>t</em>=55, so instead of showing up around 1995, the Japanese leagues spiked in the early 2000s. The timing suggests the cause isn&#8217;t the same, but why might the Japanese leagues see a similar spike later than the MLB teams? It can&#8217;t be an expansion effect, since the Japanese leagues have stayed constant at 6 teams since their inception.</p>
<p>Incidentally, the Japanese league data shows some evidence of heteroskedasticity (Breusch-Pagan test p-value .0796), so it might be better modeled using generalized least squares, but doing so would have skewed the results of the replication.</p>
<p>To test whether the parameters really are different, the appropriate test is <a href="http://en.wikipedia.org/wiki/Chow_test">Chow&#8217;s test for structural change</a>. To clean it up, I&#8217;m using only the data from 1960 on. (It&#8217;s quick and dirty, but it&#8217;ll do the job.) Chow&#8217;s test statistic is</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B%28S_C+-%28S_1%2BS_2%29%29%2F%28k%29%7D%7B%28S_1%2BS_2%29%2F%28N_1%2BN_2-2k%29%7D+%5Csim%5C+F_%7Bk%2CN_1%2BN_2-2k%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{(S_C -(S_1+S_2))/(k)}{(S_1+S_2)/(N_1+N_2-2k)} \sim\ F_{k,N_1+N_2-2k}' title='\frac{(S_C -(S_1+S_2))/(k)}{(S_1+S_2)/(N_1+N_2-2k)} \sim\ F_{k,N_1+N_2-2k}' class='latex' /></p>
<p>where <img src='http://l.wordpress.com/latex.php?latex=S_C+%3D+6.3666&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S_C = 6.3666' title='S_C = 6.3666' class='latex' /> is the combined sum of squared residuals, <img src='http://l.wordpress.com/latex.php?latex=S_1+%3D+1.2074&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S_1 = 1.2074' title='S_1 = 1.2074' class='latex' /> and <img src='http://l.wordpress.com/latex.php?latex=S_2+%3D+2.2983&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S_2 = 2.2983' title='S_2 = 2.2983' class='latex' /> are the individual (i.e. MLB and Japan) sum of squared residuals, <img src='http://l.wordpress.com/latex.php?latex=k%3D4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k=4' title='k=4' class='latex' /> is the number of parameters, and <img src='http://l.wordpress.com/latex.php?latex=N_1+%3D+100&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N_1 = 100' title='N_1 = 100' class='latex' /> and <img src='http://l.wordpress.com/latex.php?latex=N_2+%3D+100&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='N_2 = 100' title='N_2 = 100' class='latex' /> are the number of observations in each group.</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B%286.3666+-%281.2074+%2B+2.2983%29%29%2F%284%29%7D%7B%281.2074+%2B+2.2983%29%2F%28100%2B100-2%5Ctimes+4%29%7D+%5Csim%5C++F_%7B4%2C100%2B100-2+%5Ctimes+4%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{(6.3666 -(1.2074 + 2.2983))/(4)}{(1.2074 + 2.2983)/(100+100-2\times 4)} \sim\  F_{4,100+100-2 \times 4}' title='\frac{(6.3666 -(1.2074 + 2.2983))/(4)}{(1.2074 + 2.2983)/(100+100-2\times 4)} \sim\  F_{4,100+100-2 \times 4}' class='latex' /></p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B%286.3666+-%283.5057%29%29%2F%284%29%7D%7B%283.5057%29%2F%28192%29%7D+%5Csim%5C++F_%7B4%2C192%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{(6.3666 -(3.5057))/(4)}{(3.5057)/(192)} \sim\  F_{4,192}' title='\frac{(6.3666 -(3.5057))/(4)}{(3.5057)/(192)} \sim\  F_{4,192}' class='latex' /></p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B2.8609%2F4%7D%7B.01826%7D+%5Csim%5C++F_%7B4%2C192%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{2.8609/4}{.01826} \sim\  F_{4,192}' title='\frac{2.8609/4}{.01826} \sim\  F_{4,192}' class='latex' /></p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Cfrac%7B.7152%7D%7B.01826%7D+%5Csim%5C++F_%7B4%2C192%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{.7152}{.01826} \sim\  F_{4,192}' title='\frac{.7152}{.01826} \sim\  F_{4,192}' class='latex' /></p>
<p><img src='http://l.wordpress.com/latex.php?latex=39.17+%5Csim%5C++F_%7B4%2C192%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='39.17 \sim\  F_{4,192}' title='39.17 \sim\  F_{4,192}' class='latex' /></p>
<p>The critical value at the 10% significance level with 4 and 192 degrees of freedom is 1.974 according to <a href="http://www.stat.tamu.edu/~west/applets/fdemo.html">Texas A&amp;M&#8217;s F calculator</a>. Our statistic of 39.17 is far above that, so we can reject the hypothesis that the two leagues share the same parameters. The MLB and Japanese models really are different.</p>
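<p>For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Chow statistic as defined in the formula above, using the sums of squared residuals and observation counts reported in this post. The function name is hypothetical. Note that the denominator is built from the two separate sums of squared residuals, not from the observation counts.</p>

```python
def chow_statistic(s_combined, s1, s2, k, n1, n2):
    """Chow F statistic with k and n1 + n2 - 2k degrees of freedom."""
    separate = s1 + s2
    numerator = (s_combined - separate) / k
    denominator = separate / (n1 + n2 - 2 * k)
    return numerator / denominator


# Values reported above: pooled, MLB and Japan sums of squared residuals.
f_stat = chow_statistic(6.3666, 1.2074, 2.2983, k=4, n1=100, n2=100)
print(round(f_stat, 2))  # 39.17
```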
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a>, <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/baseball/'>Baseball</a>, <a href='http://tomflesher.com/tag/baseball-reference-com/'>baseball-reference.com</a>, <a href='http://tomflesher.com/tag/chow-test/'>Chow test</a>, <a href='http://tomflesher.com/tag/home-runs/'>home runs</a>, <a href='http://tomflesher.com/tag/japan/'>Japan</a>, <a href='http://tomflesher.com/tag/japanese-baseball/'>Japanese baseball</a>, <a href='http://tomflesher.com/tag/r/'>R</a>, <a href='http://tomflesher.com/tag/rays/'>Rays</a>, <a href='http://tomflesher.com/tag/regression/'>regression</a>, <a href='http://tomflesher.com/tag/replication/'>replication</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/09/more-on-home-runs-per-game/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/japanhrpergame.jpg?w=150" medium="image">
			<media:title type="html">japanhrpergame</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/japanresidualhrpergame1.jpg?w=150" medium="image">
			<media:title type="html">japanresidualhrpergame</media:title>
		</media:content>

		<media:content url="http://l.wordpress.com/latex.php?latex=%5Chat%7BHR%7D+%3D+.957+-+.0188+%5Ctimes+t+%2B+.0004+%5Ctimes+t%5E2+%2B+.0911+%5Ctimes+DH+&#38;bg=ffffff&#38;fg=000000&#38;s=0" medium="image">
			<media:title type="html">\hat{HR} = .957 - .0188 \times t + .0004 \times t^2 + .0911  \times DH </media:title>
		</media:content>

		<media:content url="http://l.wordpress.com/latex.php?latex=%5Chat%7BHR%7D+&#38;bg=ffffff&#38;fg=000000&#38;s=0" medium="image">
			<media:title type="html">\hat{HR} </media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/japanhrpergame.jpg?w=150&#38;h=82" medium="image">
			<media:title type="html">japanhrpergame</media:title>
		</media:content>

		<media:content url="http://l.wordpress.com/latex.php?latex=R%5E2+&#38;bg=ffffff&#38;fg=000000&#38;s=0" medium="image">
			<media:title type="html">R^2 </media:title>
		</media:content>

		<media:content url="http://l.wordpress.com/latex.php?latex=Residual+%3D+%5Chat%7BHR%7D+-+HR&#38;bg=ffffff&#38;fg=000000&#38;s=0" medium="image">
			<media:title type="html">Residual = \hat{HR} - HR</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/japanresidualhrpergame1.jpg?w=150&#38;h=82" medium="image">
			<media:title type="html">japanresidualhrpergame</media:title>
		</media:content>

		<media:content url="http://l.wordpress.com/latex.php?latex=%5Cfrac%7B%28S_C+-%28S_1%2BS_2%29%29%2F%28k%29%7D%7B%28S_1%2BS_2%29%2F%28N_1%2BN_2-2k%29%7D+%7E+F&#38;bg=ffffff&#38;fg=000000&#38;s=0" medium="image">
			<media:title type="html">\frac{(S_C -(S_1+S_2))/(k)}{(S_1+S_2)/(N_1+N_2-2k)} ~ F</media:title>
		</media:content>
	</item>
		<item>
		<title>Back when it was hard to hit 55&#8230;</title>
		<link>http://tomflesher.com/2010/07/08/back-when-it-was-hard-to-hit-55/</link>
		<comments>http://tomflesher.com/2010/07/08/back-when-it-was-hard-to-hit-55/#comments</comments>
		<pubDate>Thu, 08 Jul 2010 15:06:05 +0000</pubDate>
		<dc:creator>tomflesher</dc:creator>
				<category><![CDATA[Baseball]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[baseball-reference.com]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[sabermetrics]]></category>
		<category><![CDATA[Stuff Keith Hernandez Says]]></category>
		<category><![CDATA[home runs]]></category>
		<category><![CDATA[Year of the Pitcher]]></category>
		<category><![CDATA[Willie Mays]]></category>
		<category><![CDATA[talent pool dilution]]></category>

		<guid isPermaLink="false">http://tomflesher.com/?p=319</guid>
		<description><![CDATA[Last night was one of those classic Keith Hernandez moments where he started talking and then stopped abruptly, which I always like to assume is because the guys in the truck are telling him to shut the hell up. He was talking about Willie Mays for some reason, and said that Mays hit 55 home [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=tomflesher.com&blog=14243162&post=319&subd=heureusementici&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p>Last night was one of those classic Keith Hernandez moments where he started talking and then stopped abruptly, which I always like to assume is because the guys in the truck are telling him to shut the hell up. He was talking about <a href="http://www.baseball-reference.com/players/m/mayswi01.shtml">Willie Mays </a>for some reason, and said that Mays hit 55 home runs &#8220;back when it was hard to hit 55.&#8221; Keith coyly said that, while it was easy for a while, it was &#8220;getting hard again,&#8221; at which point he abruptly stopped talking.</p>
<p>Keith&#8217;s unusual candor about drug use and Mays&#8217; career best of 52 home runs aside, this pinged my &#8220;Stuff Keith Hernandez Says&#8221; meter. After accounting for any time trend and other factors that might explain home run hitting, is there an upward trend? If so, is there a pattern to the remaining home runs?</p>
<p>The first step is to examine the data to see if there appears to be any trend. Just looking at it, there appears to be a messy U shape with a minimum around t=20, which indicates a quadratic trend. That means I want to include a term for time and a term for time squared.<a href="http://heureusementici.files.wordpress.com/2010/07/homerunspergame.jpg"><img class="alignright size-thumbnail  wp-image-325" title="homerunspergame" src="http://heureusementici.files.wordpress.com/2010/07/homerunspergame.jpg?w=150&#038;h=102" alt="" width="150" height="102" /></a></p>
<p>Using the per-game averages for home runs from 1955 to 2009, I detrended the data using t=1 in 1955. I also had to correct for the effect of the designated hitter. That gives us an equation of the form</p>
<p><img src='http://l.wordpress.com/latex.php?latex=%5Chat%7BHR%7D+%3D+%5Chat%7B%5Cbeta_%7B0%7D%7D+%2B+%5Chat%7B%5Cbeta_%7B1%7D%7Dt+%2B+%5Chat%7B%5Cbeta_%7B2%7D%7D+t%5E%7B2%7D+%2B+%5Chat%7B%5Cbeta_%7B3%7D%7D+DH+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\hat{HR} = \hat{\beta_{0}} + \hat{\beta_{1}}t + \hat{\beta_{2}} t^{2} + \hat{\beta_{3}} DH ' title='\hat{HR} = \hat{\beta_{0}} + \hat{\beta_{1}}t + \hat{\beta_{2}} t^{2} + \hat{\beta_{3}} DH ' class='latex' /></p>
<p>The results:</p>
<table border="0" cellspacing="0" cellpadding="0" width="384">
<col span="6" width="64"></col>
<tbody>
<tr>
<td width="64" height="20"></td>
<td width="64">Estimate</td>
<td width="64">Std. Error</td>
<td width="64">t-value</td>
<td width="64">p-value</td>
<td width="64">Signif</td>
</tr>
<tr>
<td height="20">B0</td>
<td align="right">0.957</td>
<td align="right">0.0328</td>
<td align="right">29.189</td>
<td align="right">0.0001</td>
<td align="right">0.9999</td>
</tr>
<tr>
<td height="20">t</td>
<td align="right">-0.0188</td>
<td align="right">0.0028</td>
<td align="right">-6.738</td>
<td align="right">0.0001</td>
<td align="right">0.9999</td>
</tr>
<tr>
<td height="20">tsq</td>
<td align="right">0.0004</td>
<td align="right">0.00005</td>
<td align="right">8.599</td>
<td align="right">0.0001</td>
<td align="right">0.9999</td>
</tr>
<tr>
<td height="20">DH</td>
<td align="right">0.0911</td>
<td align="right">0.0246</td>
<td align="right">3.706</td>
<td align="right">0.0003</td>
<td align="right">0.9997</td>
</tr>
</tbody>
</table>
<p>We can see that there&#8217;s an upward quadratic trend in predicted home runs that, together with the DH rule, accounts for about 56% of the variation in the number of home runs per game in a season (<img src='http://l.wordpress.com/latex.php?latex=R%5E2+%3D+.5618&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R^2 = .5618' title='R^2 = .5618' class='latex' />). The Breusch-Pagan test has a p-value of .1610, so there may be mild heteroskedasticity, but nothing we should get concerned about.</p>
<p>Then, I needed to look at the difference between the actual number of home runs per game and the predicted number, which is found by subtracting the prediction from the actual value:</p>
<p><img src='http://l.wordpress.com/latex.php?latex=Residual+%3D+HR+-+%5Chat%7BHR%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Residual = HR - \hat{HR}' title='Residual = HR - \hat{HR}' class='latex' /></p>
<p><a href="http://heureusementici.files.wordpress.com/2010/07/homerunresiduals.jpg"><img class="alignright size-thumbnail  wp-image-331" title="homerunresiduals" src="http://heureusementici.files.wordpress.com/2010/07/homerunresiduals.jpg?w=150&#038;h=102" alt="" width="150" height="102" /></a>This represents the &#8220;abnormal&#8221; number of home runs per game in each season. The question then becomes, &#8220;Is there a pattern to the number of abnormal home runs?&#8221; There are two ways to answer this. The first way is to look at the abnormal home runs. Up until about t=40 (the mid-1990s), the abnormal home runs are pretty much scattershot above and below 0. However, at t=40, the residual jumps up for both leagues and then begins a downward trend. It&#8217;s not clear what the cause of this is, but the knee-jerk reaction is that there might be a drug use effect. On the other hand, there are a couple of other explanations.</p>
<p>The most obvious is a boring old expansion effect. In 1993, the National League added two teams (the Marlins and the Rockies), and in 1998 each league added a team (the AL&#8217;s Rays and the NL&#8217;s Diamondbacks). Talent pool dilution has shown up in our discussion of hit batsmen, and I believe that it can be a real effect. It would be mitigated over time, however, by the establishment and development of farm systems, in particular strong systems like the one that&#8217;s producing good, cheap talent for the Rays.</p>
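<p>For reference, the residual calculation described earlier in this post can be sketched in a few lines of Python. The coefficients are the rounded estimates from the results table; the function names and the sample observation of 1.10 home runs per game are hypothetical and are not taken from the data set.</p>

```python
def predicted_hr_per_game(t, dh):
    """Fitted value from the rounded estimates in the results table."""
    return 0.957 - 0.0188 * t + 0.0004 * t**2 + 0.0911 * dh


def residual(actual, t, dh):
    """Abnormal home runs per game: actual minus predicted."""
    return actual - predicted_hr_per_game(t, dh)


# Hypothetical observation (not from the data): 1.10 HR per game
# in a DH league at t=40.
example = residual(actual=1.10, t=40, dh=1)
print(round(example, 4))  # 0.1639
```

<p>A positive residual means more home runs than the trend and the DH rule would predict; a negative one means fewer.</p>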
<br />Filed under: <a href='http://tomflesher.com/category/baseball/'>Baseball</a>, <a href='http://tomflesher.com/category/economics-2/'>Economics</a> Tagged: <a href='http://tomflesher.com/tag/baseball/'>Baseball</a>, <a href='http://tomflesher.com/tag/baseball-reference-com/'>baseball-reference.com</a>, <a href='http://tomflesher.com/tag/home-runs/'>home runs</a>, <a href='http://tomflesher.com/tag/r/'>R</a>, <a href='http://tomflesher.com/tag/regression/'>regression</a>, <a href='http://tomflesher.com/tag/sabermetrics/'>sabermetrics</a>, <a href='http://tomflesher.com/tag/stuff-keith-hernandez-says/'>Stuff Keith Hernandez Says</a>, <a href='http://tomflesher.com/tag/talent-pool-dilution/'>talent pool dilution</a>, <a href='http://tomflesher.com/tag/willie-mays/'>Willie Mays</a>, <a href='http://tomflesher.com/tag/year-of-the-pitcher/'>Year of the Pitcher</a>]]></content:encoded>
			<wfw:commentRss>http://tomflesher.com/2010/07/08/back-when-it-was-hard-to-hit-55/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Tom</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/homerunspergame.jpg?w=150" medium="image">
			<media:title type="html">homerunspergame</media:title>
		</media:content>

		<media:content url="http://heureusementici.files.wordpress.com/2010/07/homerunresiduals.jpg?w=150" medium="image">
			<media:title type="html">homerunresiduals</media:title>
		</media:content>
	</item>
	</channel>
</rss>