Monday, April 17, 2006

Why Statistics can't Explain what Joe Dumars Knows

In today's print Wall Street Journal, there was an article, "The Story That Stats Don't Tell," about how the effort of NBA organizations to use statisticians to optimize their player choices has yielded results far short of similar efforts in other sports, notably Major League Baseball. I would link to the article, but WSJ.com is a complete abomination, and a search on the exact title of the article produces zero results. Anyway, it got me thinking again about how difficult it seems to quantify an individual player's worth in basketball, a subject I have pondered in the past and had lengthy, geeky conversations about with my brother.

My interest in basketball stats began while playing fantasy basketball, where it's easy to figure out who the best player is in terms of the fantasy league: there is a published formula for 'fantasy points' that depends on conventional stats (a point is worth 1 fantasy point, a rebound 2, a block 4, etc.). It was fun to use stats like fantasy points per game over the last X games to find the best players to pick up; one year I won the league (I also got butt lucky that year and had Kevin Garnett during his MVP season).
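
For the curious, here is a minimal Python sketch of that kind of calculation. It uses only the three weights mentioned above (point = 1, rebound = 2, block = 4); the rest of the league's formula isn't reproduced here, and the names WEIGHTS, fantasy_points and recent_average are made up for illustration.

```python
# Only the weights quoted in the post; the rest of the formula is left out.
WEIGHTS = {"points": 1, "rebounds": 2, "blocks": 4}


def fantasy_points(box_score):
    """Fantasy points for one game, given a dict of conventional stats."""
    return sum(weight * box_score.get(stat, 0) for stat, weight in WEIGHTS.items())


def recent_average(game_log, last_n):
    """Average fantasy points per game over the most recent last_n games.

    game_log is a list of box-score dicts, oldest game first.
    """
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    recent = game_log[-last_n:]
    if not recent:
        return 0.0
    return sum(fantasy_points(game) for game in recent) / len(recent)


if __name__ == "__main__":
    # Hypothetical box scores, purely for illustration.
    log = [
        {"points": 20, "rebounds": 10, "blocks": 2},
        {"points": 10, "rebounds": 5, "blocks": 0},
        {"points": 30, "rebounds": 12, "blocks": 3},
    ]
    print(recent_average(log, 2))
```

Ranking the free agents by recent_average for a small X is roughly how I went looking for pickups.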

Next, I read the book Moneyball, about how General Manager Billy Beane and statistician Paul DePodesta used stats to build a team of unconventional, unlikely players and take the Oakland A's to the playoffs and to the league's best record, with a payroll less than a quarter of the Yankees'. Completely fascinating stuff: the geeks prevailed!

Then, there was the fact that my favorite team in the NBA, the Detroit Pistons, happens to be the Oakland A's of the NBA. I know, that doesn't really do them justice, but bear with me. A team of cast-offs, people that no other team really wanted, like Ben Wallace and Chauncey Billups. A payroll less than half that of the awful NY Knicks (and of the very good, but still inferior, Dallas Mavericks). And they won a championship. They went to the finals last year, and they have the league's best record going into this year's playoffs. Joe Dumars is the Billy Beane of basketball.

All that, and apparently, the Pistons are NOT among the teams that make use of professional statisticians. The WSJ article named the Celtics, Sonics and Houston Rockets as teams that do, and they range from mediocre to sucky.


Why Basketball Stats Suck

1. All plays are fluid.

With the exception of free throw shots, every play in basketball occurs in motion, and is very hard to quantify. Consider a typical fast break where 2-4 guys are sprinting down court with defenders racing after from various starting positions and various speeds. That exact play, with the exact same players starting in all the same positions, may not occur again for years, if ever. Contrast that with each at bat in baseball. In baseball, the state of the game before the play is quantifiable: the game literally stops between each play. You know how many men are on base, where the fielders are, how well the particular batter has performed in the past against the exact pitcher he is facing etc. etc.

2. The stats that are recorded are vague

An individual player may have a great field goal percentage, but what are his shots like? If he makes wide open jumpers from 15 feet out because another player is drawing defenders away and getting him a great look, is that as good as someone who makes a turnaround hook shot with a hand in his face? According to the stats, yes.

What about an assist? If a player hands the ball to a teammate right next to him who jacks up a 3-pointer and makes it, is that as valuable as a pass that threads between two defenders and gives a teammate a wide open layup? Again, according to the stats, yes.

3. Many stats aren't recorded

Some stats that are of clear value simply aren't recorded officially, such as taking an offensive charge. A defender plants his feet, stands in front of a driving offensive player, and falls over, resulting in a turnover and a foul on the offender. This is counted as nothing.

Another defensive action of real value that is not recorded is a defender's ability to alter opposing players' shots. Ben Wallace may get a couple of blocks a game, but I'm guessing he makes opponents alter and miss five to six shots per game.

4. Individual Performances are very dependent on Teammates

The WSJ article mentioned this one too. In baseball, a player at bat is by himself; no teammate is going to help him hit the ball. In basketball, other players are involved in the vast majority of baskets. Some players set picks for others. The ball may be passed around the top of the key involving three players before an open man is found. Who gets what portion of the credit?


How to Do Better

1. Qualify Each Existing Stat

The first step in better predicting the value of players would be to start recording enhanced stats. Every shot, every assist, every rebound would be qualified somehow. Perhaps something as simple as a subjective rating of difficulty from 1-5. Someone with enough money could hire and train people to watch every NBA game and keep track of these things. A shot: was it wide open? Difficulty = 1. Was it a turnaround jumper with three defenders around? Difficulty = 5. A rebound: was it a simple rebound off a foul shot? Or did the player jump up and rip it down over several defenders? Each game could be independently tracked by 3-5 people, and the averages could be kept, perhaps throwing out outliers.
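
Here is one way the "average the ratings, throwing out outliers" step might look in Python. This is only a sketch: dropping the single highest and lowest rating is just one simple choice of outlier rule, and the function name consensus_difficulty is invented for the example.

```python
def consensus_difficulty(ratings):
    """Combine several observers' 1-5 difficulty ratings for a single play.

    With three or more ratings, the highest and lowest are dropped before
    averaging (a simple trimmed mean). With fewer, all ratings are averaged.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    ordered = sorted(ratings)
    if len(ordered) >= 3:
        ordered = ordered[1:-1]  # discard one low and one high rating
    return sum(ordered) / len(ordered)


if __name__ == "__main__":
    # Five hypothetical observers rate the same turnaround jumper.
    print(consensus_difficulty([1, 4, 4, 5, 4]))
```

In that example, the one observer who rated the shot a 1 gets ignored, so a single bad call doesn't drag the play's rating down.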

Or one could attempt to be a little less subjective. The location of each shot as well as the distance and velocity of the defender could be estimated for each shot.

2. Record New Stats

An easy win would be to start keeping track of those unrecorded stats I mentioned, like altering a player's shot. You could record anything that you thought could be reliably measured, really.

You could then do regression analysis to find each one's effect on the points scored and points allowed, winning percentage etc to discover how important each one is.
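
As a rough illustration of what that regression could look like, here is a small ordinary-least-squares fit in plain Python. Everything about the data is an assumption: the per-game numbers are invented, and the two stats (charges taken, shots altered) are just the examples from above. A real analysis would use a stats package and far more games.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: one list of stat values per game; an intercept column is added.
    targets: the outcome for each game (e.g. points allowed).
    Returns [intercept, coefficient_for_stat_1, coefficient_for_stat_2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("stats are collinear; their effects can't be separated")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


if __name__ == "__main__":
    # Made-up games: (charges taken, shots altered) and points allowed.
    stats = [(0, 2), (1, 4), (2, 3), (3, 6), (1, 5), (0, 7)]
    points_allowed = [97, 92, 91.5, 85, 90.5, 89.5]
    print(fit_least_squares(stats, points_allowed))
```

Each coefficient estimates how many points allowed change per extra charge or altered shot with the other stat held fixed, which is the "how important is each one" question.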

3. Team Performance With and Without an Individual Player

Another method might be to look at how a team performed with and without a given player. However, it is rare that you will have good data on this; unless a player is injured for exactly half of the games, and you have many games vs the exact same opponent with the exact same lineup to compare with and without an individual player, this would be unlikely to produce anything but noise.
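
The naive version of this comparison is easy enough to sketch in Python. The data layout (a set of active players plus a point margin for each game) and the name with_without_margin are assumptions made for the example, and the sketch does nothing about the opponent and lineup problems just described.

```python
from statistics import mean


def with_without_margin(games, player):
    """Compare a team's average point margin with and without one player.

    games: list of (active_players, point_margin) tuples, one per game.
    Returns (avg_with, avg_without, difference), or None if the player
    either played every game or missed every game.
    """
    with_player = [margin for roster, margin in games if player in roster]
    without_player = [margin for roster, margin in games if player not in roster]
    if not with_player or not without_player:
        return None  # nothing to compare against
    avg_with = mean(with_player)
    avg_without = mean(without_player)
    return avg_with, avg_without, avg_with - avg_without


if __name__ == "__main__":
    # Hypothetical four-game sample; far too small to mean anything.
    games = [
        ({"A", "B"}, 10),
        ({"A", "B"}, 4),
        ({"B"}, -2),
        ({"B"}, 0),
    ]
    print(with_without_margin(games, "A"))
```

With a handful of missed games against different opponents, that difference swings wildly, which is exactly the noise problem.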


It's Still Tough

Either way, I doubt any of these efforts would yield anything close to as accurate as what is done with baseball. I'm willing to bet the existing stats guys are already doing stuff like this; I doubt that the Rockets hired MIT MBA Daryl Morey to come in and say, "According to my calculations, Kobe Bryant is leading the league in scoring."

There are just too many ways to be a valuable player. How can you compare Richard Hamilton's ability to run around like a madman, tiring out his defender and driving him into pick after pick to get open shots, to Ben Wallace's ability to alter shots, or to what anyone else does, when no one else plays this way?

Ask Joe D

All of this is a tribute to Joe Dumars' genius. He's onto something for sure, but how does he pick? How did he know Chauncey Billups could be transformed from journeyman to All-Star and MVP candidate? How could he have predicted that undersized Ben Wallace would become the best defender in the NBA? Maybe that's what makes the NBA so interesting to me; no one has cracked the code to automating the process of picking out unlikely valuable players.

Sunday, April 02, 2006

Firefox crash recovery: crucial on OS X

I like Firefox. I've grown accustomed to its features on Windows, Linux, and, since last year when I got an Apple PowerBook, on Mac OS X too.

I would use Safari, but a couple of key applications (Gmail and Google Reader) don't work as well on it. However, while I haven't had any trouble on Windows or Linux, Firefox crashes about once a day on the Mac. This wouldn't be terrible, aside from losing the 10 tabs I have open at any given time.

Fear no more, Firefox crash recovery is here. Nothing too special: it just reopens any windows and/or tabs you had open the next time you open Firefox after a crash.

I also stumbled upon another use of crash recovery when VPNing into work. I tried to open Firefox remotely, and it complained that it was already open. So I manually killed it, and when I opened it again, all of my tabs from work were open! I usually have 10-12 work-related docs open, so that's a nice bonus when working from home.