Saturday, July 18, 2009

These aren’t John Hollinger’s dad’s statistics

Okay, I love direct observational research as much as the next guy, but as someone firmly rooted in the sciences (a.k.a. wasting my life in college and beyond), I love quantifiable results. As a result, I find that statistics are vital to observing, categorizing and understanding the environment that we inhabit (and share with other organisms). Stats are awesome, and I love them (I got 98% in stats in university). That said, I know that stats can be manipulated and abused – furthermore, statistics do not fully explain the entire situation, nor do sufficient statistics exist to approach a full knowledge of a hybrid (not just numerical) system.

We can attempt to use statistics to better understand basketball – and as a fan, and non-paid analyst, I’m not alone in this point of view (just check out the Wages of Wins sometime). That said, the statistics that we currently have in basketball are not complete. No current stats exist to fully describe all possible positive actions on the court (e.g. there’s no stat for a defender breaking / slipping a screen and staying on his man, despite the set play run by the offense). Also, some stats that would be useful are just not recorded in the NBA (e.g. in the Adriatic league their stats also include things like how many times a player has their shot blocked, or how many times a player draws a foul – don’t you want to know how many times Dirk gets stuffed, or how many fouls Shaq drew per game during his peak?).

Right now there aren’t enough stats to fully map out good play, and some of the stats that we do have appear to mean more than they actually do! A steal means that a defender has disrupted the offense to the point that a turnover has occurred – many guards and swingmen get steals. The bigman analog has always been blocks, but the most important part of the steal (aka, the change of actual possession of the ball) is lost in a block. Often a blocked shot prevents a made basket, but does not always give possession back to the defending side. In effect, a blocked shot that still gives the ball back to the offense is no better than a defender causing a deflection of a pass where the ball goes out of bounds – and the offensive team retains possession.

Deflections are the most important stat that continues to not be officially counted by the NBA; on one hand it would be difficult to adjudicate (when is it a block? when is it a deflection?), but on the other hand, deflections often lead to other stats. The guy who picks up a ball that’s deflected out of the hand of the ball carrier is rewarded with a steal – but the guy who got the ball out of the ball carrier’s hand (the deflector) doesn’t get credited for causing the play. I think that’s kind of unfair.

In fact the guy who defends well frequently gets jobbed out of a good stat. For years I’ve seen a solid defensive play in the paint by a center (straight up single coverage that results in his man taking a shot – and missing) be recorded as a GREAT defensive board by the power forward (who wasn’t defending anyone, he just happened to get the rebound off of someone else's good defense). I’ve seen this happen on all levels of play. If anything, it proves the limitations of stats, and in particular how poorly the existing stats reward defense.

It’s no secret that defense is way harder to grade by using stats alone. Firstly, there are not enough defensive stats (the guy who navigates through a number of screens and gets back in time to contest/change a shot gets the same amount of stats as another player who doesn’t even go through the screens and lets his man take an open jumper), and secondly, the stats that do exist do not always mean great defense. A very quick guy like Allen Iverson spent years on the court playing in the passing lanes. This is him gambling on defense to get steals. It’s exciting when a guy does get those breakaway steals in the game – but it’s not sound, fundamental man defense. Really, this kind of play is strongly discouraged when there isn’t a big shot blocker in the paint to clean up any mess that an unguarded PG getting into the paint can cause.

I know – I’ve seen John Stockton make this same type of ‘Defensive Back’ style steal because he had big Mark Eaton (two-time defensive player of the year – once had close to 500 blocks in a season) watching his back. Iverson had the same with Theo Ratliff and Dikembe Mutombo. Chris Paul is another player who benefits from being backed up by shot blocking force of nature Tyson Chandler. If guys like Stockton, Iverson and Paul had Mehmet Okur and Jarron Collins laying down the law in the paint behind them you can be sure that the head coach would have told them to defend their man more, and gamble on defense less.

Over the years I’ve been lucky enough to coach some basketball on the youth developmental level. We tried to go past the lack of collected stats on defense in order to quantify good play on both sides of the ball. We attempted to count the number of times the player defending a shooter forced that shooter to miss. Other times we tried to record good traps on the ball carrier, or how many times the defender harassed the ball handler into turning the ball over. These types of things can be recorded, but by far the best defense I’ve ever seen on any level is where the man being defended does not ever get the ball: entry passes are denied, position is denied, and when he does get the ball – he’s in such a disadvantageous spot that he cannot positively help his team. You know what kind of stats that guy gets – for playing the best defense? He gets nothing for it. There is no reward for him that shows up on the box score – and this is a failure in terms of statistics being used to categorize and enumerate good play in a game. But if he manages to tip an errantly passed ball that he forces the guy he’s defending into attempting (because he can’t shoot it where he is), the teammate who picks up the ball gets credited for a steal.

Oh yeah, that’s basically it for defense in the NBA. I talked about the fallacy that defensive boards are akin to good defense, I talked about how blocks may stop a shot, but do not give you possession of the ball all the time, and I talked about how steals can be indicative of poor defense, not good defense. (Of course, exceptions to these rules always exist – it is good defense to control a tough defensive rebound that’s contested, or to block a shot to a teammate, or to single-handedly strip a guy that you are defending. But let’s not forget that these instances of good defense are counted just as much as the same stats that are products of situational defense. A ball that rolls to you on defense counts just like a ball that you fought for.)

The long and short of it is that we need more stats – especially on defense. If you’ve read this far down then you are awesome . . . because I’ve tried to make some ‘new’ stats. Some people are good enough for simple points, rebounds, assists, steals and blocks. That’s fine for a casual fantasy basketball league . . . but not good enough if you really watch the game like your life depends on it. I wish there were more stats, and more ways to reward good play – but right now the NBA does not appear to give us things like that (instead the +/- stat is thrown to us, like it means anything). Anyway, I’ve tried to make some new stats and here are five calculations that you can even do at home (studio audience of old ladies says ‘ooooohh’, already put in a receptive mood because of the deal that they got on that juicer they just purchased):

1. Defensive Gambling (DG): This stat is pretty easy – I try to recognize that some of the stats that people get on defense are related to gambling on some level. (Gambling in a passing lane, gambling by leaving your man to get a weak side shot block, gambling by trying to strip a man when the ref is right there, and so forth) We already had blocks and steals . . . but they exist in a world without accumulated risks. The risk here, of course, exists as a foul being called – another stat that we already have. The fun part is balancing the risks vs. the rewards.

The Foul is the base for this equation – as it means that you tried a risky move and you got burned. Not only do you have a penalty against you, but it directly benefits the offensive team. A steal, on the other hand, is a situation where a player takes a risk, is rewarded – and the offensive team is directly disadvantaged. When a team misses a three pointer and the opposing team gets the ball and scores a layup that is a 5 point swing. A steal does not directly mean you gain points, but you directly prevent the chance of points – it’s not a 5 point swing, but it’s a possession swing. I say that this is worth 1.5 fouls, in terms of risk/reward. A block does not guarantee such a similar swing in terms of possession (it does sometimes though), but while a steal only prevents the potential of a shot being taken, a block directly affects shots that are taken. (It’s a shame there’s no stat for shots changed, because a good shot blocker has an aura that surrounds him that changes many shots – for example when players started to scout Kirilenko his blocks went way down . . . but he was changing the same number of net shots as guys changed their shots to avoid getting blocked by him) A block, then, is not as good as a steal, but it’s still good – a block is equal to 1.3 fouls. The full equation for Defensive Gambling is as follows:

Stat Formula -- Defensive Gambling: DG = [ (Blocks * 1.3) + (Steals * 1.5) ] / Fouls
Clean block on Kobe - the Ultimate Risk vs Reward

Example: In 2003-2004 Andrei Kirilenko was healthy (at this point in his career, he had played in 240 of a total possible 246 games), getting playing time (37.1 mpg), and happy. That season Andrei had 215 blocked shots and 150 steals against 174 fouls. His Defensive Gambling score was astronomical that season.

DG = [ ( 215 * 1.3) + (150 * 1.5) ] / 174

DG = [ 279.5 + 225 ] / 174

DG = 504.5 / 174 = 2.899425287

So that means that Andrei was gambling at a very fortuitous rate – he’d get away with gambling roughly three times before he would get burned once. If you extrapolate that to how many fouls he averaged per game that season you can see that he was doing something right with all that time on the court. Because the haters probably want to know, Andrei’s career average DG score is 2.25. For a point of reference, Scottie Pippen’s career DG score is 1.41 – and Pip is the prototypical wing defender who made 10 All-Defensive teams in his career. Pippen had way more steals, but he took way too many poor gambles as well. Pippen has 6 seasons where he’s averaged greater than 3 fouls per game. Andrei has none in his career. So put that in your pipe and smoke it!

Clearly looking at things like blocks and steals on their own gives a skewed view of the world – putting them against a risk/reward system is a better frame of reference that we can use to see which players make smart gambles, and which ones do not. Andrei – for all his faults – has a good Defensive Gambling ability if you ask me.
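If you would rather not do the arithmetic by hand, the Defensive Gambling calculation can be sketched in a few lines of Python. This is only a minimal sketch of the weights described in this section (1.3 fouls per block, 1.5 fouls per steal); the function name, the argument names and the zero-foul guard are assumptions made for illustration, and the inputs are the Kirilenko 2003-2004 totals quoted in the example.

```python
def defensive_gambling(blocks, steals, fouls, block_weight=1.3, steal_weight=1.5):
    """Weighted blocks and steals earned per personal foul committed.

    The default weights are the ones given in the text. Raising an error
    for zero fouls is an assumption; the post does not cover that case.
    """
    if fouls == 0:
        raise ValueError("DG is undefined for a player with zero fouls")
    rewards = blocks * block_weight + steals * steal_weight
    return rewards / fouls


# Kirilenko's 2003-2004 totals as quoted in the example
kirilenko_0304 = defensive_gambling(blocks=215, steals=150, fouls=174)
print(f"Kirilenko 2003-04 DG: {kirilenko_0304:.3f}")
```

Keeping the weights as keyword arguments makes it easy to see how sensitive a player's score is to the 1.3 and 1.5 values, which are judgment calls rather than measured quantities.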

2. Pure Hustle (PH): Pure hustle builds off of the Defensive Gambling equation, but adds more chances for risks and more chances for rewards. Here we see Blocks, Steals and Offensive Rebounds go against Turnovers and Personal Fouls. The common factor for hustle is a turnover. It’s the most benign result of hustle. Too much hustle can result in a foul – which hurts you and your team. This is why a foul is worth more. On the good side of things, hustle can result in good defense, or a second effort (or third, or fourth) on offense. We all remember Tayshawn Prince robbing Reggie Miller with a game winning block. We should all remember Bird stealing the inbounds from Isiah. And we all know that second chance shots can not only allow you to score from an advantageous position – but also result in scores themselves. This is why I think when it comes to hustle, Offensive Rebounds are worth the most, with Blocks and Steals being on equal footing with fouls.

Stat Formula -- Pure Hustle

Hustle before he went crazy

Example: When people think of hustle they are apt to think of garbage guys who don’t have much skill. That type of scrappy play can result in solid contributions for a team, but I think hustle is more than just effort. Hustle is assisted by skill and psychology. It is a risk/reward equation after all – if you go all out all the time you’re just going to pick up fouls. You have to know when to go all out, and when to hold back. Many people hated him, but no one could deny the Pure Hustle of Dennis Rodman. Rodman has 4329 career Offensive Rebounds, 611 career Steals and 531 career Blocks. They aren’t the best numbers around, but when you input them with his 1481 career turnovers and 2843 career fouls he manages a Pure Hustle rating of 1.80.

This may not look like much, but for a frame of reference, Paul Millsap (what most current Jazz fans think Pure Hustle is all about) has managed a career Pure Hustle value of 1.24. So if Paul Millsap is giving 110% out there, Dennis Rodman is giving 155% out there.
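The shape of the Pure Hustle calculation can be sketched in Python, with one big caveat: the text only gives the ordering of the weights (offensive rebounds worth the most, blocks and steals level with fouls, a turnover as the base unit), not their values. The weights passed in below (2.0 and 1.5) are purely hypothetical placeholders; they were picked only because they land close to the 1.80 quoted for Rodman, so treat them as an assumption and not as the actual formula.

```python
def pure_hustle(oreb, steals, blocks, turnovers, fouls, oreb_weight, foul_weight):
    """Hustle rewards divided by hustle risks.

    A turnover counts as 1 (the base unit). Blocks and steals share the
    foul weight, as the text puts them on equal footing with fouls.
    The weight values themselves are NOT given in the post.
    """
    rewards = oreb_weight * oreb + foul_weight * (steals + blocks)
    risks = turnovers + foul_weight * fouls
    if risks == 0:
        raise ValueError("PH is undefined with zero turnovers and zero fouls")
    return rewards / risks


# Rodman's career totals as quoted above; both weights are hypothetical
rodman = pure_hustle(oreb=4329, steals=611, blocks=531,
                     turnovers=1481, fouls=2843,
                     oreb_weight=2.0, foul_weight=1.5)
print(f"Rodman career PH (hypothetical weights): {rodman:.2f}")
```

Other weight pairs can land near the same number, so matching one quoted rating does not pin the formula down.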

3. Shooting Worth (SW): This is an easy one, because it already exists on many websites by its other name, “Points per Shot” (PPS). It’s just how many total points you get (including free throws) divided by the number of shots (FGA) you take. Sure, there are some points that may be unrepresented by FGA (like technical fouls, or being in the bonus), but the fact that the player is involved in those free throws is a benefit that they get for being good shooters. The idea is that if you take a shot it’s worth two points, right? Well, that doesn’t always work out in the pros, and the league average (or mean for all player data) is 1.22. [Yes, I looked at every shot in the L last year on ESPN.com] So, in effect, you have a positive Shooting Worth if you are better than average, and have a SW greater than 1.22 – of course, you can still be good at your job if you have a lower value; you’d just be below league average. That’s the crux of this statistic – it seems to pick out individuals who take good shots, and it can be used to differentiate between guys who score a high ppg by taking good shots and guys who just shoot a lot in order to get good stats.

Stat Formula -- Shooting Worth: SW = Total Points / FGA
Superstar or Average shooter?

Example: Allen Iverson has been a phenomenal scorer in his career. He’s averaged over 30 ppg on four separate occasions, and even with his no-show last year his career average is still 27.1 ppg. That’s a really great average, but how does that really stack up against a guy who averages something similar, but shoots much better than 42.5 fg%? Well, when you input Iverson's 23,983 total career points over his 19,590 total shots you see that his actual Shooting Worth is 1.22, which is exactly average. He’s not a “better” scorer than some of his contemporaries, he’s just a volume scorer. Karl Malone only has a career scoring average that’s in the 25 points per game range – but unlike Ivey, Karl was much better than average when it came to the quality of his shots; Karl’s Shooting Worth was 1.41 (better than lots of other known scorers like Kobe, Jerry West and Glen Rice, to name a few). From this we can conclude that a shot by Allen Iverson was, in effect, worth less than a shot by Karl – Karl was taking better shots.
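Shooting Worth is simple enough to check in Python. This is a minimal sketch using the Iverson career totals quoted in this example and the 1.22 league average from the text; the function and variable names are made up for illustration.

```python
LEAGUE_AVERAGE_SW = 1.22  # league mean quoted in the text


def shooting_worth(total_points, fga):
    """Points (free throws included) per field goal attempt."""
    if fga == 0:
        raise ValueError("SW is undefined for a player with no shot attempts")
    return total_points / fga


# Iverson's career totals as quoted in this example
iverson_sw = shooting_worth(total_points=23983, fga=19590)
vs_league = iverson_sw / LEAGUE_AVERAGE_SW  # above 1.0 means better than average

print(f"Iverson SW: {iverson_sw:.2f} ({vs_league:.2f}x the league average)")
```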

4. Shot Frequency (SF): Some websites try to find out how often a player shoots and they represent it by the number of shots / value of time (usually minutes). I don’t know what 0.282 shots looks like, so that representation means little to me. I would rather look at the same data in another way, namely, how many minutes of burn does a player need before he jacks up a shot. Some guys start shooting the ball as soon as they are on the court, while others spend a lot of time on the court – but don’t shoot the ball. In effect, this stat (Shot Frequency) will be able to identify if a certain player is gun shy to a fault (some would say that John Stockton was) or a pure, unashamed jacker of shots (too many to name). In this case, a lower number means a higher frequency of shots . . . because the number represents the time needed on the floor between shots. To find this out we divide the total minutes played by the number of Field Goals attempted (FGA).

Stat Formula -- Shot Frequency: SF = Total Minutes Played / FGA
Iverson really shoots a lot

Example: Let’s continue talking about Iverson here. Most of his life he’s been the first option on offense … a guy that the rest of the team made way for. As a result, he spent a lot of his career shooting the ball. How often does he shoot? Well, he’s played 36719 minutes and taken 19690 total field goal attempts in that time. If you input the data it reveals that Allen Iverson, for his career, will shoot the ball once every 1.865 minutes he’s on the floor. To be honest, this is the highest frequency I’ve seen in all my months ‘beta testing’ these statistics. 1.865 minutes is every 1:52 of actual game time. So that means if he plays 35 minutes in a game he’ll shoot the ball nearly 19 times. For a point of reference another Hall of Fame type “point guard” is John Stockton, who shot the ball once every 3.497 minutes (once every 3:30 of game time). If given the same amount of time (35 minutes), Stockton would shoot the ball 10 times. (Really, 10 times, 10.008 is pretty darn accurate!) That’s almost half as many shots as Iverson . . . and effectively, half as frequent.

This statistic helps us understand which guys really shoot the ball the most, and most likely, are the primary offensive options for any given team. (Though, some anomalies exist, like Matt Harpring who shoots the ball very frequently for a guy with his skills – he shot the ball more often than some guys who started over him during his career)
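The Shot Frequency numbers above can be reproduced with a short Python sketch. It uses the Iverson minutes and attempts quoted in this example and the 3.497 figure given for Stockton; the helper names (`as_clock`, `expected_shots`) are invented for illustration.

```python
def shot_frequency(minutes_played, fga):
    """Minutes on the floor per field goal attempt (lower = shoots more often)."""
    if fga == 0:
        raise ValueError("SF is undefined for a player with no shot attempts")
    return minutes_played / fga


def as_clock(minutes):
    """Format decimal minutes as m:ss, rounded to the nearest second."""
    total_seconds = round(minutes * 60)
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def expected_shots(minutes_in_game, sf):
    """How many shots a player would take in a given number of minutes."""
    return minutes_in_game / sf


iverson_sf = shot_frequency(minutes_played=36719, fga=19690)
stockton_sf = 3.497  # quoted directly in the text

for name, sf in (("Iverson", iverson_sf), ("Stockton", stockton_sf)):
    shots = expected_shots(35, sf)
    print(f"{name}: one shot every {as_clock(sf)}, about {shots:.1f} shots in 35 minutes")
```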

5. Gestalt Offense (GO): This is the biggie. I’ve fretted over this for days and days and finally am prepared to release it upon an unwitting population. It’s not perfect, and it seems to work best with larger sample sizes (a season’s worth of data, or a road trip, may look fine – any singular game can break it). It really favors guards and only really works for players who played in a modern-ish era (one where turnovers, offensive rebounds and other similar statistics were actually recorded). The basis for this was to isolate how much pressure certain players could place on the defense. What a player does on offense (when it comes to stats) can be one of five things: he can assist on a score, he can get to the foul line, he can take a shot, he can be called for a turnover, or he can grab an offensive rebound. There are no stats for setting solid screens, or anything like the ‘hockey assist’ right now with how the stats are kept. So the statistic is already limited, but I try to do the best with the data that I have. Essentially the GO Rating is the summation of a player’s modified assists, free throw opportunities, offensive rebounding ability, penchant for scoring against their turnovers, all over the number of games a player has played:

Stat Formula -- Gestalt Offense (simple): GO = [ Assists + Free Throws + Offensive Rebounds + Shooting - Turnovers ] / Games Played

Of course, it’s a lot more complicated than that . . .

Stat Formula -- Gestalt Offense (assists)

The first, and most straightforward, part of the equation is assists. This equation not only values assists higher than their normal value (2.23 > 1, after all), but it also serves to give a bonus to good passers, and penalize poor passers. (Good or poor depending on if they have a high Assist to Turnover Ratio) The first standard value here is 2.23. Where did this value come from? Well, it’s the true value of an assist when you balance it for the ratio of assisted 2PTFG vs. 3PTFG. (According to the stats on ESPN.com, where all the major stats came from, aside from archived player stats from Basketball-Reference.com) The second standard number is 1.48; this is the mean Assist to Turnover ratio for teams in the NBA. (Room for improvement exists here if I could actually find the mean Assist to Turnover ratio for the players)

As it stands how good you are at passing (vs. turning it over) affects how many total points you get from this category. After all, Tim Hardaway may be able to get you 37 assists on a road trip, but those 37 assists would be better than 37 assists from Alonzo Mourning because Zo would pick up quite a few more turnovers along the way.

Stat Formula -- Gestalt Offense (free throws): Free Throws = (FT% / 0.77) * FTA

Free throws are simple enough to understand now that we’ve gone through assists. Here we don’t look at how many free throws an individual has made, as the point of the GO rating was to see the type of pressure that an offensive player puts on the defense. Instead the focus is to see how many free throw attempts the player gets. Of course the catch is that the value for FTA is modified by how good a free throw shooter you are compared to the league average (which is 77%). A guy like José Manuel Calderón may only go to the line half as often as someone like Shaq – but Shaq is below average and José above average to the point that the difference in points from this category is not as large as we would think.

Again, like in assists, it’s a bonus to good shooters and penalty to those who are not so hot from the line. In effect, you want your best free throw shooters at the line, their FTA are then worth more than that of guys who do not go to the line with as much confidence as they do.
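The free throw piece can be sketched in Python using the form confirmed in the comments below, (FT% / 0.77) * FTA with FT% as a decimal. The function name and the two sample stat lines are hypothetical and are not real player totals.

```python
LEAGUE_FT_PCT = 0.77  # league average quoted in the text


def free_throw_term(ftm, fta):
    """Free throw attempts scaled by FT% relative to the league average."""
    if fta == 0:
        return 0.0  # no attempts, no pressure from this category (assumption)
    ft_pct = ftm / fta  # decimal, e.g. 0.889 and not 88.9
    return (ft_pct / LEAGUE_FT_PCT) * fta


# Hypothetical stat lines, for illustration only
sharpshooter = free_throw_term(ftm=360, fta=400)  # 90% on 400 attempts
bricklayer = free_throw_term(ftm=400, fta=800)    # 50% on 800 attempts

print(f"90% shooter, 400 FTA: {sharpshooter:.1f}")
print(f"50% shooter, 800 FTA: {bricklayer:.1f}")
```

One thing the sketch makes visible: because FT% is just FTM / FTA, the attempts cancel and the term works out to FTM / 0.77. So the player with twice the attempts ends up only slightly ahead here, which lines up with the Calderón and Shaq comparison.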

Stat Formula -- Gestalt Offense (offensive rebounds)

Rebounding is easy. I figured that in the grand scheme of things, an Offensive Rebound is worth 1.16 points. I based this on how a Turnover is 1.3 points (see below), how an assist is 2.23 points and how points are worth, well, points. This is a modifier.

The second modifier is similar to previous categories. Before we looked at if the player was better than average at Assist to turnover ratio, and then how they fared in FT% against the mean. Here we see that 26.73% of the total rebounds in the NBA were offensive rebounds. If you are a better than average offensive rebounder then you get a bonus here, if you are worse, then you get a negative modifier (as seen as a number less than 1.0, but not a negative number, per se). This part of the equation really seems to get broken if used on Charles Barkley (who, if you look at just one game, would get some crazy number of points in this section because he had 11 offensive rebounds and 5 defensive ones). Thankfully this section self-neuters itself if you have zero offensive rebounds.

Stat Formula -- Gestalt Offense (shooting)

Man, where do I begin with this one? Well, we went over Shot Frequency already. We divide 33.6 (representing 33.6 minutes of action for an elite player) by the player’s own Shot Frequency. This tells us how many shots he will put up if given Starter type of minutes.

The next step is adding 1 to the Effective Field Goal Percentage (eFG%). This is so that we get a positive value with the percentage when multiplied. Why use eFG% instead of just regular FG%? Well, eFG% takes three point shots into account, and the higher degree of difficulty of shooting from that distance. This step tries to even things out a bit.

The third part is how the player’s Shooting Worth (also described earlier) rates compared to the league average for Shooting Worth (or PPS). If you are Malone-like, you get a nice bonus here, if you are Iverson-like, you don’t get a bonus, and if you are really poor, then you get a penalty.

The last part is the total points scored by the individual, divided by 10. Ten was one of the first numbers I used that seemed to make things work out when I was fiddling around. I don’t know if this last part of the Shooting section is legit or not yet; I remember that when I tried this out without it, individuals would be getting very little from this part of the equation. At some stage you have to factor in how many points a guy scores, even if you are looking at how well they shoot.

Stat Formula -- Gestalt Offense (turnovers): Turnovers = TO * 1.3

Uhhhh, so this is very easy to understand. I’m not going to mention anything more than that this value is subtracted from the rest of the section values in the numerator. Don’t forget to divide everything by the total number of games, ya hear!
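To show how the pieces fit together, here is a minimal Python sketch of the final assembly step only. It assumes the assists, free throws, offensive rebounds and shooting values have already been worked out per their sections and are passed in as plain numbers. It also assumes the turnover value is the turnover count times the 1.3 points mentioned in the rebounding section. The function name and the sample inputs are hypothetical.

```python
TURNOVER_COST = 1.3  # points per turnover, as mentioned in the rebounding section


def go_rating(assists_value, free_throws_value, oreb_value, shooting_value,
              turnovers, games_played):
    """Sum the four positive section values, subtract turnovers, divide by games."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    turnovers_value = TURNOVER_COST * turnovers  # assumed form of the turnover term
    numerator = (assists_value + free_throws_value + oreb_value
                 + shooting_value - turnovers_value)
    return numerator / games_played


# Made-up section values, only to show the call
example = go_rating(assists_value=100.0, free_throws_value=50.0, oreb_value=20.0,
                    shooting_value=200.0, turnovers=100, games_played=10)
print(f"GO Rating (made-up inputs): {example:.1f}")
```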

Man, I’m super duper tired, so I’m just going to go ahead and post this. Give me feedback, tell me I’m crazy and set me straight. I think we need more stats, and unfortunately, the lack of defensive categories means I can’t make a Defensive GO rating type of stat. I’m open to changing my formulas and re-working things. If you want I can e-mail you the spreadsheet that contains the Macro that calculates these new stats. (I got really sick and tired of calculating the GO Rating by hand with pen and paper.)

I will probably make another post on the GO Rating in the future, filled with examples of players, and how they rated. So far I’ve seen that eFG%, Total Points, Assist to turnover ratio and a few other factors really seem to hinder bigmen, while giving guards and other wing players a boost. For the record, Magic Johnson’s GO Rating is 102.241, and he is the gold standard . . . some guys score higher than he does, while others politely fall into place. (Jordan is at 120, Karl Malone at 83, and so forth.)

13 comments:

V. Money said...

For free throws in GO, shouldn't it be ((FTM/FTA)/.77)*FTA instead of (FT%/.77)*FTA?

The second answer will be 100 times the first.

Amar said...

V.Money -- the way I have it set up is that FT% is exactly what you put it as, on the spreadsheet it turns into a decimal number (e.g. .889, and not 88.9%)

Amar said...

I can totally see how it's misleading, though.

V. Money said...

Thanks for clarifying.

By the way, I've already plugged in some numbers, and I think I might have some interesting data.

From what I'm getting, even though LeBron James outdid him in all five categories, Chris Paul made it razor-close in both SW and GO. LeBron put up 123.9 by my calculations, and Paul 123.4. Considering that Jordan averaged 120, those numbers are nothing to sneeze at.

Amar said...

hmmm, 123.4 for Chris Paul? when I input him I get 102.39 (for career regular season stats). Maybe you are doing just this last season? Yup, you are right, CP3 got 123.349 in the 2008-2009 regular season.

If we want to just go ahead and do singular seasons then I'm sure guys like Magic and Jordan will have things in the 130 range. Jordan's career GO Rating is 120.8299 . . . so that includes his days with the Wizards.

V. Money said...

Yes, it was last season. And last season, those guys had nothing on Dwyane Wade, who I have as about 129.4ish.

One thing that worries me about DG and PH, though, is that refs can and do treat players differently. LeBron got 2.348 in DG last season, but he also got whistled for only 1.7 fouls per game, somewhat low given his style of play.

Amar said...

stats can only deal with what's recorded . . . star players always get less fouls called on them, and when they initiate contact, they get the benefit of the doubt. fact of life . . .

V. Money said...

Yeah, but I still think it can be good to nitpick. It's not a reflection on your performance.

Another thing that could come up is that DH would likely hurt point guards who can't get a lot of blocks and rebounds because of their size.

For example, Chris Paul led the league in steals by a lopsided margin, but only managed a Defensive Hustle rating of 0.87. The lack of offensive boards, the lack of blocks, and the turnovers (because, you know, he has to handle the ball all the time) all do him in. Allen Iverson also posted a 0.87 during his MVP season for similar reasons.

It's not perfect, but no model is.

toasterhands said...

I really found the Pure Hustle portion to be right on target.

"Hustle is assisted by skill and psychology. It is a risk/reward equation after all – if you go all out all the time you’re just going to pick up fouls. You have to know when to go all out, and when to hold back. Many people hated him, but no one could deny the Pure Hustle of Dennis Rodman."

Chris Andersen reminds me of Rodman a lot. He has good, calculated hustle.

One minor gripe- Tayshuan not Tayshawn

A great read. Found you on Ball Don't Lie.

toasterhands said...

gosh, even I spelled it wrong- "Tayshaun"

Mike said...

Should the turnovers metric be normalized somehow? It seems that players who have a higher fraction of the offense going through them are going to have a higher turnover number and thus be penalized for their usage.

Maybe dividing turnovers by (assists+fga+offensive rebounds) would better indicate the true effective turnover rate for players. This could be normalized by comparing this to the NBA average of the same ratio and give you a more comparable metric to your other normalized metrics.

Ivan Bezdomny said...

Just found this. Really awesome ideas... especially the Pure Hustle (which is how I stumbled on this).

Any way you could re-publish this on a page that allows for printing or at least reasonable viewing/emailing? Blogger is a disgrace when it comes to handling formats :-(

photoshop updates said...

excellent piece of information, I had come to know about your website from my friend kishore, pune,i have read atleast 8 posts of yours by now, and let me tell you, your site gives the best and the most interesting information. This is just the kind of information that i had been looking for, i'm already your rss reader now and i would regularly watch out for the new posts, once again hats off to you! Thanx a lot once again, Regards, , photoshop updates