Wednesday, February 13, 2008

The Concept

Who are the best coaches?
Ken Pomeroy examined a metric earlier in the year for evaluating coaches. His metric will tend to find the coaches who caused the biggest turnarounds at their programs. This is ideal for predicting the “coach of the year” winner. Or, since the “coach of the year” winner also tends to come from the higher-profile conferences, Ken’s metric helps remind us of coaches from the smaller schools who also deserve the award. Coaches who maintain success at the highest level (such as Coach K at Duke) would receive high rankings in Ken’s system, but never the highest ranking.

In contrast, I develop a metric that rewards sustained success. In my opinion, NCAA tournament wins and appearances are the standard by which coaches are judged. Certainly conference titles, wins against rivals, and other factors matter, but I believe tournament success is the standard. My best example of this is Bo Ryan of Wisconsin. In his 6 years as the Wisconsin head coach, Bo Ryan has finished 1st in the Big Ten standings twice, 2nd twice, 3rd once, and 4th once. Those regular season numbers are more consistent than those of Tom Izzo, but Bo Ryan isn’t considered at the same level as Tom Izzo because Ryan has not progressed as deep into the NCAA tournament. NCAA tournament wins plus appearances form the basis for my rankings, and I try to determine whether those tournament wins are earned in recruiting, in the regular season, or in the postseason.

Recruiting vs the Regular Season vs the Tournament
In my model I first estimate how well high school talent predicts NCAA tournament wins and appearances. With a value for each type of player, I place a value on each recruiting class and determine the average recruiting class value for each coach at his current school. (This is RECR.) I associate all wins beyond those predicted by talent with coaching ability. (Coaching ability is RS + TOUR.) I then separate whether coaches earn the additional tournament wins in the regular season, by earning a high seed (RS), or in the tournament, by winning more than was expected by seed (TOUR). The exact details are a little more complicated and can be found in the section entitled “Fine Print” below.
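To make the three pieces concrete, here is a minimal Python sketch of one way the split could work. It is only an illustration, not the actual model: the function name, the inputs (expected wins plus appearances given talent, expected wins plus appearances given seed, and the actual total), and the sample numbers are all assumptions made up for this example.

```python
def decompose(talent_expected, seed_expected, actual):
    """Split annual wins plus appearances into RECR, RS, and TOUR.

    All three inputs are hypothetical quantities for illustration:
      talent_expected: wins plus appearances predicted from recruited talent
      seed_expected:   wins plus appearances predicted from the seed earned
      actual:          wins plus appearances actually achieved
    """
    recr = talent_expected                # credit for the talent recruited
    rs = seed_expected - talent_expected  # extra credit earned via a better seed
    tour = actual - seed_expected         # extra credit from beating the seed
    return {"RECR": recr, "RS": rs, "TOUR": tour}


# Made-up numbers: talent predicts 1.5, the seed predicts 3.0, the team gets 3.5.
parts = decompose(talent_expected=1.5, seed_expected=3.0, actual=3.5)
print(parts)
print("coaching ability (RS + TOUR):", parts["RS"] + parts["TOUR"])
```

Under these assumptions the three pieces always add back up to the actual total, which is what lets RS + TOUR be read as everything not explained by talent.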

What recruiting measures is pretty self-explanatory, but it is less clear what RS and TOUR are measuring. I think there are several things you can think about when comparing regular and post-season performance. Does the regular season more accurately describe which coaches are good at player development? Does the post-season more accurately describe which coaches are good game managers, good X’s and O’s coaches? Does this measure the impact of a coach’s system? For example, do certain systems win more tournament games than regular season games? Or is tournament success more a measure of luck? While regular season success is based on 30+ games, the post-season requires only one bad performance to be over.

How do these measures transfer between schools?
Ideally I’d like to separate whether wins are due to the coach or due to the prestige of the school. One can separate the impact by looking at coaching changes, but there are not enough coaching changes to separately identify all schools. For example, you can’t separately identify the importance of Coach K from the prestige of Duke at this point.

The key is that school prestige should really only impact recruiting. Coaching ability (game management/player development) should be fairly transferable between jobs. This is why the tables list RECR at the current school only, and RS and TOUR at all schools.

Even though I can’t separately identify the prestige of each school, in most cases we can still make an educated guess about how recruiting ability will transfer. For example, when a coach moves from Texas A&M to Kentucky, he should increase his recruiting success. When a coach moves from Kentucky to Minnesota, he should decrease his recruiting success. And in the case of someone like Rick Pitino, whose success is iconic, his recruiting ability would probably translate to just about any school.

Of course if we were just going to make an educated guess, we didn’t need my model. Most people saw Billy Gillispie win at Texas A&M and assumed that at a school as prestigious as Kentucky he should win even more. But a key improvement of my model is to control for the talent on hand when a coach takes a new job. Bruce Weber doesn’t get quite as much credit for his Final Four run, because Bill Self left him a team stacked with great players. Meanwhile, Todd Lickliter’s coaching ranking won’t take much of a hit this year because the Iowa team he inherited is not nearly as talented.

Which coach should fill a vacancy?
For most jobs, particularly prestigious institutions, the ability to develop and manage talent is critical. (Teams should choose coaches with high RS and TOUR ratings.) With some adjustments for quality of opposition, if you can teach effective full court pressure at Wisconsin-Milwaukee, you can teach effective full court pressure at Tennessee. And since few mid-majors can recruit BCS level talent every year, the ability to develop players and teach the game is probably all that is measurable for mid-major coaches. By picking mid-major coaches that win in the tournament, athletic directors are essentially picking the mid-major coaches that have high ratings for RS and TOUR.

But for schools where school prestige is a struggle, I would highly recommend spending the money to get a proven winner (à la Minnesota hiring Tubby Smith). If you can’t find a proven winner, I would consider hiring an assistant at a major program with stellar recruiting qualifications over a fundamentally sound mid-major coach, because you have to have BCS-level talent to have a chance to win in a BCS league.

Harder to Win in BCS Leagues
There exists a market for coaching talent and most BCS schools have more resources. The result is that the top coaches generally end up at BCS schools, and I felt the rankings needed to reflect this. To put it simply, if you put Coach K or Ben Howland at a mid-major, the odds of those coaches succeeding would be very high. On the other hand, if you take Bill Carmody and John Thompson III from the Ivy League and put them in a BCS conference, the probability of success is uncertain. Therefore I give a slightly lower weight to wins at non-BCS schools. (If I didn’t, Thad Matta and John Thompson III would be rated implausibly high in my opinion.) I determine the weight by looking at non-BCS coaches who move to BCS schools and their success rate. Unfortunately, this also makes coaches like Phil Martelli look a little worse than is probably fair.

Fine Print
Here are some details that are probably not of interest to everyone.

Talent vs Coaching Ability
Identifying the impact of talent is not simple. The coaches that have the most coaching ability end up at the best schools and eventually attract the best recruits. A simple comparison across schools will show that talent has a huge impact on NCAA success, but talent may be acting as a proxy for the unobserved ability of the coach. In simple terms, is Duke better than Florida St. this year because Duke has 8 McDonald’s All-Americans or because Coach K develops players better than Leonard Hamilton? Instead of looking across coaches, we can try to solve this problem by looking within coaches over time. For example, we can examine whether Thad Matta won more games when he had three McDonald’s All-Americans than when he had one. I focus on the within-coach variation in the current version of the tables.
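The difference between comparing across coaches and comparing within coaches can be sketched with a small simulation. This is a minimal Python illustration on made-up data, not the actual model or data: the simulated numbers, the function names, and the assumption that better coaches attract more talent are all hypothetical. The within-coach estimate subtracts each coach's own averages before fitting a line, which is the standard fixed-effects trick.

```python
import random
from collections import defaultdict


def slope(x, y):
    """OLS slope of y on x (one regressor plus an intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(groups, values):
    """Subtract each group's mean from the values belonging to that group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


def within_slope(groups, x, y):
    """Slope using only variation within each group (coach) over time."""
    return slope(demean_by_group(groups, x), demean_by_group(groups, y))


def simulate(n_coaches=50, n_years=8, true_effect=0.5, seed=0):
    """Made-up data in which better coaches also attract more talent."""
    rng = random.Random(seed)
    coach, talent, wins = [], [], []
    for c in range(n_coaches):
        ability = rng.gauss(0, 1)  # unobserved coaching ability
        for _ in range(n_years):
            t = 2.0 + ability + rng.gauss(0, 1)
            coach.append(c)
            talent.append(t)
            wins.append(true_effect * t + ability + rng.gauss(0, 0.5))
    return coach, talent, wins


coach, talent, wins = simulate()
print(f"across coaches: {slope(talent, wins):.2f}")
print(f"within coaches: {within_slope(coach, talent, wins):.2f}")
```

In this toy setup the across-coach slope overstates the value of talent because it also picks up coaching ability, while the within-coach slope lands near the true effect of 0.5 that was built into the simulation.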

Because talent no longer includes spurious coach ability, the estimated effects of talent are smaller, the value added through recruiting is smaller, and the value added in the course of the regular season (through player development and game management) is higher.

Model             Current       Previous
Coach              Recr     RS   Recr     RS
Roy Williams       1.31   2.61   3.37   0.66
Mike Krzyzewski    1.42   2.66   3.69   0.21
Paul Hewitt        0.87   0.86   2.19  -0.01
Gary Williams      0.58   0.87   1.46  -0.21
Al Skinner         0.13   1.76   0.34   1.41
Leonard Hamilton   0.75  -0.49   1.86  -1.37
Seth Greenberg     0.32   0.36   0.80   0.06
Oliver Purnell     0.22   0.28   0.58  -0.19
Frank Haith        0.35  -0.17   0.90  -0.79

I feel like the current model is more appropriate, but the within-coach model still has several potential problems. For example, in some within-coach specifications that I did not use, holding the amount of older talent constant, having more young talent actually predicts less NCAA success. This certainly doesn’t make intuitive sense. If you have the same number of talented older players and add one talented young player, you shouldn’t be worse off. (You could always leave that player on the bench.) The problem is that coaches tend to obtain the largest volume of young talent 1) when the upperclassmen have underperformed, and 2) after mass exoduses to the NBA. Here the young talent proxies for years when the team is expected to win less. More generally, the talent level will depend on the expectation about the team, and the variation over time may not be random. In the future I might try a propensity score model or find some sort of creative instrumental variable (IV) to improve my estimates of the impact of talent.

Note on Time Periods
My recruiting data covers 1999-2007, but I can only evaluate regular season performance after 2002 because I want to control for the talent level in each year. The players in the 1999 data are seniors in 2002. I believe that 5 years is enough to get a good sense of regular season performance, but given the small sample of tournament games, and the fact that one bad outcome can end a season, I expanded the tournament metric to include the last 10 years’ worth of tournament performances.

The combined rating measures the annual expectation of wins plus appearances, but since each component is measured using a different time period, it shouldn’t necessarily add up to actual wins plus appearances over any particular time period. For a coach like Coach K, who was at one school for all 10 years and was in the tournament every year, the numbers should be close. Coach K had 10 appearances and 28 wins over the decade, or 3.8 wins plus appearances per year, and his rating is 3.89.
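The Coach K comparison above is just an average, and it can be checked in a couple of lines. This is a minimal Python sketch using only the numbers quoted in the text; the function name is made up for this example.

```python
def wins_plus_appearances_per_year(appearances, wins, years):
    """Average annual NCAA tournament wins plus appearances."""
    return (appearances + wins) / years


# Coach K over the decade: 10 appearances and 28 wins in 10 years.
actual = wins_plus_appearances_per_year(appearances=10, wins=28, years=10)
rating = 3.89  # combined rating quoted in the text
print(f"actual per year: {actual:.2f}, rating: {rating:.2f}")
print(f"gap: {rating - actual:.2f}")
```

The same function can be pointed at any coach's last 10 years as the simple fallback to actual tournament results.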

The rating of 3.89 may also be high because Duke has been hurt by unexpected early entry. For coaches that change jobs, the rating will also vary because it will depend on inherited talent. Finally, since tournament performance is averaged over actual appearances, if a coach misses the tournament frequently, this number should not be considered relevant in every year. Nonetheless, the rating can be considered a long-term expectation of wins plus appearances for coaches that make the tournament just about every year. If this all sounds too convoluted, you can always defer to actual tournament wins in the last 10 years.