Is Data Ruining Sports?

Who would you rather have: Tris Speaker or Ty Cobb?

Jason Whitlock says this question can no longer be discussed; it can only be answered. He blames the popularity of the book-turned-movie Moneyball and of sabermetrics, the advanced statistics that baseball fans and writers now use as a lens through which to understand and contextualize the game. (Cobb had a better career OPS+, 168 to 157, so I guess he was better.)

someecards.com - Let's see a movie about a baseball genius who leads his team to winning one playoff series in 14 years

Whitlock argues that data is sapping the fun out of sports. Little Timmy can’t enjoy the game of basketball anymore because nothing is left open to interpretation; there is a “right answer” to every question. Kobe versus MJ. Wilt versus Russell. Jason Whitlock believes it’s not even worth discussing anymore; some pencil-necked geek will inevitably come up with an empirically correct answer.

The problem with Whitlock’s argument is that it absolutely cannot be proven without resorting to data. On what basis does he believe that sports are being ruined for fans? What led him to this conclusion, other than his own personal distaste for advanced statistical measures? Here is some data to suggest that Jason is wrong.

If fans can’t enjoy sports anymore, because of data, how come ESPN keeps seeing excellent ratings for football, baseball, and basketball? When the Yankees played the Red Sox on August 7, it was the most viewed baseball broadcast on ESPN since 2007. The Patriots-Dolphins on Monday Night Football last week “delivered a 10.3 overnight rating, the second highest opening-game rating since ESPN started airing MNF in 2006.” Why are people watching instead of just watching the players’ statistics change in real time, since data has ruined sports?

If fans can’t enjoy sports anymore, because of data, how come attendance in the NBA has not slipped? It has stayed essentially level—around 21 million, near the cumulative total max capacity of all NBA arenas for 41 home games per team—since at least 2004, which is as far back as my NBA attendance data goes. Mr. Thompson and I did some rudimentary analysis of trends in NBA data. Fans aren’t staying home, and they’re not why the NBA is locked out. They’re having fun and enjoying the beauty and the drama of sports. (Data cannot tell you with certainty whether Kobe is going to hit that fadeaway at the buzzer to beat the Spurs; it can only tell you what the odds are.)

If fans can’t enjoy sports anymore, because of data, why is Major League Baseball reporting revenue increases year over year? As MLB reported after last season, the past seven seasons (2004-2010 inclusive) “are the seven best attended seasons in MLB history.” This coincides with the Moneyball era nicely, as the book was published in 2003. MLB revenue in 2010 approached $7 billion for the first time, putting it at around a 6% increase over 2009.

See, in order to prove that Moneyball and Sabermetrics have ruined sports, you’d need to show the world that they are having some sort of quantifiable negative effect. Jason absolutely cannot do that. His argument boils down to fear of needing to defend a position with more intelligence than “well, I just like Kobe Bryant better than Michael Jordan.” The reason people hate data, in sports just as in business, is that it raises the level of conversation and forces them to think more critically about the world.

Jason says we like data because we lack the ability to understand sports viscerally or strategically. I’m not sure what he means. (Oh no, I’m so buried in my data that I can’t tell what defense the Patriots are playing! Is it the 4-3 or the 3-4? Is that called a “blitz?” I can’t tell because, you know, I’m too nerdy to understand football.) This argument is ridiculous. It’s the same thing we hear in analytics from certain dyed-in-the-wool creatives who feel that data is an insufficient way to understand their “art.”

He says, “I saw Player X, and I know he was good, so therefore he’s good.” I’m afraid that’s unrealistic, Jason. See, you have biases. There are things you prefer in players, but that others might not. Errors or flaws you might not see, but that others do. You might see Brett Favre’s greatest game but miss his 20 game-ending interceptions because you were out getting coffee. This is even more likely to be true if your teams or your favorite players (or, if you’re in marketing or UXD, your favorite content/layout/design) are involved. You need data in order to look at the world on an even plane. You think that’s where the fun ends. I’d say that’s where the fun begins.

Let’s go back to Speaker and Cobb. Their advanced statistics are remarkably similar. You could legitimately make a case for either one. Sure, Jason; I suppose a nerd could come to you and say that Cobb had a higher OPS+, and that therefore there is no argument to be made for Speaker. Baseball fans don’t think that way. Speaker won three World Series, two with the Boston Red Sox and one with the Cleveland Indians; Cobb never won a World Series. Cobb was a terrible leader, perhaps the worst in sports history. His teammates utterly despised him. Yet baseball fans are far more likely to know Ty Cobb. He was one of the first five inductees in the baseball Hall of Fame, and one could legitimately argue that he was the greatest natural hitter of all time. He is one of two players to accumulate more than 4,000 hits over his career. Despite all of this, there is a strong argument to be made that Speaker, even with his lower OPS+, would be a better player to build around. There is plenty about sports that cannot be quantified, Jason. Data just makes us think a little bit more about the nuances of the games we love.

Here’s another, more current example: Justin Verlander of the Detroit Tigers. A whole bunch of people believe that Verlander, by far the best statistical starting pitcher in baseball in 2011, should be the American League Most Valuable Player. Verlander has a Wins Above Replacement (WAR) of 8.5, which means that if you were to imagine Verlander replaced by a replacement-level starting pitcher (roughly, a readily available fill-in from the bench or the minors, not a league-average starter), the Tigers would have won about 8.5 fewer games. That is a massive number of wins to attribute to a single player. It’s the best in the league. If you define “value” in baseball as “wins,” you can definitely see how Verlander might be the MVP. But there are plenty of intelligent, knowledgeable Sabermetricians (myself included) who would accept the argument that the MVP should be someone who is on the field every day, playing in nearly every game (whereas starting pitchers only see action every fifth game). It’s a topic of conversation and debate on sports radio regularly. Fans love discussing it, even fans like me who know how statistically dominant Verlander has been. Where would the Yankees be without Curtis Granderson this season? Or the Red Sox without Jacoby Ellsbury at the top of their lineup? You can make legitimate cases for any of these players, each of whom (surprise!) excels in various statistical categories. Could it be that there is more to the MVP race than pure statistics? But Jason, I thought you said that there were no discussions allowed anymore!

I think Jason Whitlock is scared. He is scared that Hall-of-Fame voting in professional sports will someday be reduced to plugging numbers into a computer and seeing who the best players were. (This would guarantee someone like Todd Helton a spot in Cooperstown.) I don’t think anyone, even the great Bill James, would advocate such a hard-line stance. Eric Peterson made this point earlier this month, and I think it was prescient of him to make the distinction, since we’re going to be hearing these anti-data arguments more often as data usage grows, in both sports and business: we like to be data-informed, not data-driven. It’s important for me to know that Ryan Howard’s numbers don’t justify his massive contract, but that doesn’t mean I wouldn’t want him clubbing home runs for my team. (Perhaps that’s the difference between me and the seemingly data-driven Billy Beane, who still hasn’t won the last game of the season.) When I have data to help me understand what I’m seeing, I can put things into context. I can “relationalize” teams, players, and individual plays in new and exciting ways. Yes, Jason, data helps me and many others enjoy sports more than if it were completely up to our eyes.

I think Jason is also scared that he can’t articulate why he loves John Elway other than “I like him.” I’m not sure why you’re scared, Jason. It seems easy to defend the fact that, even though Peyton Manning has eight more points of career completion percentage, and Tom Brady has a better postseason record, and Dan Marino has more yards, your boy Elway was a winner. He was a better leader than most of those quarterbacks, numbers be damned. Leadership matters, Jason, and it’s not quantifiable, so you’ve got your argument. You’ve got your discussion. And that doesn’t even touch on the physical aspects of Elway’s game that made him special (such as his arm). I could counter by talking about Tom Brady’s decision making, which also isn’t a statistic. (It isn’t just completing the pass that matters; it’s completing the best pass to the best possible target. This will never show on the stat sheet.) At that level (the Montana/Young/Marino/Manning/Brady/Elway level), you’re splitting hairs anyway. We nerds can say, “objectively, so-and-so is the best of all time.” You’re welcome to make a point that isn’t accounted for in the numbers. I don’t see how that should impact your enjoyment of the game or of discussions about the game, other than to make you think.

Maybe you don’t want to be forced to think. If that’s the case. . . tell me, who is ruining the vibrant discussion of sports, you or me?

7 comments

  1. Mark Bowers says:

    I wonder, do you think Whitlock would like his stockbroker to act primarily on what he saw (perhaps the commercial for that new startup looked good) without relying on the data represented in that company’s balance sheet or cash flow?

  2. Jared Conley says:

    Thanks Ben for the post. I love this topic as it ties together my two favorite passions – Sports and Numbers, and like you state, it makes for great debates – should Kobe shoot more or less, should Steve Nash be MVP, etc.

    Anyone who read Moneyball also knows that Billy Beane didn’t make every decision based solely off of statistics. He had been a top pick, been through the minor leagues, and knew that it was a grind and most players wash out. He used the statistics as a part of the equation. Some of his decisions likely were too tied to the statistics, but he knew that statistics weren’t everything, even though they provided much more insight than the naked eye could process.

    Paul DePodesta, Billy Beane’s renamed nerdy right-hand man in the movie, went on to be the General Manager of the Dodgers and failed miserably, because he was too tied to statistics. Like everything in life, there needs to be a balance.

    • Ben says:

      Great point about DePodesta, Jared. And you’ve also pointed out some of the remaining questions that we can debate once we’ve established a foundation of data and information. I actually got the sense that Beane relied a little too heavily on data, personally. It’s the only way I can reconcile the fact that he hasn’t succeeded in his ultimate goal with the fact that he was ahead of his time in the way he thought about the economics of baseball. Also, the fact that the Red Sox hired Bill James and Theo Epstein and had phenomenal success suggests to me that something is wrong with a purely data-driven approach; there’s a balance between experience and top-tier players and the high-value, low-salary guys. Just my two cents.

  3. Daniel says:

    “There is plenty about sports that cannot be quantified, Jason. Data just makes us think a little bit more about the nuances of the games we love.” Ben, this sums up the exact way I feel about stats and sports. Anthony and I have debated this many times (mainly about Hollinger, ’cause he hates the Mavs and I think he’s a wiener). As a coach I use stats daily and they are extremely useful…to a point. I love looking at our matchups for upcoming opponents and what their stats are and then seeing the team in person. It’s amazing what you can tell from watching someone warm up or try to guard someone. You can tell if they are confident or scared, and when they are scared you go right at them. You can tell if someone is a “shooter” or has simply been getting more shots because they are a poor shooter and people leave them open. Open players are bound to hit a couple (see Jason Kidd), thus giving them an unrealistic shooting percentage. Clearly, if they hit a few and then you play a little D on them, their stats fall back to earth (again, see J Kidd). No stat can see that or predict that. That’s basketball sense. Now sure, you could have shooting percentages for defended shots and open shots, but stats can’t tell you why they are open or covered. Are they doubling the post? Are they just leaving him open? Are they in a zone and some idiot post doesn’t rotate?
    Stats are what they are: simply an average of the past and a possible predictor of future odds. I love stats to the extent they can help, but I also appreciate the fact that having a feel for a sport will help more in the moment of the actual game than stats. I can’t tell you how many times we are fouling at the end of games (we lose a lot, hence the fouling) and percentage-wise you should foul one person, but you can see they want to be the one to shoot; they want the ball. Then the 80% shooter looks scared to death to shoot, and that’s who you go after.
    I guess what I am saying is that to some small extent I see what Whitlock is trying to say (kind of), but as usual he does an atrocious job of getting his point across because he is more focused on being controversial than actually writing a good article. I think it is ridiculous to say it is ruining sports in any way, but as a coach I can see somewhat what he is saying as far as watching a player and having a better sense of what they can do than stats would predict.
    Sorry for the rant. I am aware I probably spelled lots of words wrong and made little sense, but it’s my two cents from my perspective.

    • Ben says:

      I have to admit, Daniel: I was secretly hoping you would see this post and comment on it. You are the only real coach that I know, and obviously you are one of the sports fans whom I respect the most. (Anthony is another one.) A few people told me yesterday not to even bother being pissed off about Whitlock, because he’s an “out-of-touch dinosaur” and “all he does is stir the pot,” and I get that. I know he’s controversial, and he feeds off of making people squirm. Still, I felt like this warranted a response if only because a.) in a roundabout, unintentional way, he’s ripping my profession, and b.) so many of the comments and tweets that I saw were actually supporting him.

      I love the perspective that stats get you to a certain point as a coach and then basketball sense takes over. The example about a free throw shooter at the end of the game is priceless. There is no stat for “end of game confidence.” There are so many moving parts in basketball that it’s nearly impossible (or totally impossible) to get a complete sense of a player’s game from stats. And really, that’s true in baseball and especially football as well. This is what Whitlock is missing. He thinks people are trying to quantify absolutely everything, but there are literally dozens or hundreds of variables that impact player performance, such as the ones you’ve pointed out. Why is someone open or covered at any given time? What impacts a player’s confidence level? There could be hundreds of factors.
