Is the System Broken?

On NCAA Selections, Efficiency Metrics and Resume Metrics

I love metrics. Largely because I am obsessive about things I am passionate about, and metrics give me something else to obsess over and discuss when it comes to college basketball. I also like to know how things work, and metrics can help there (not the overall team metrics, per se, but more of the advanced statistical metrics). Finally, I love to argue. While subjective observations are great and useful, subjective arguments come across much better when you can back them up with cold, hard statistics.

So with that in mind, this recent tweet caught my eye:

NoEscalators corrected himself in a Quote Tweet; Hall didn't drop in KenPom:

And I'm not here to knock the man; his UConn fandom aside, NoEscalators is alright in my book. I’m also not here to knock Hall’s W. A road win is a road win. I think Hall is a good team, but Marquette is atrocious (an article, or maybe a podcast, for another day). Any good team should blow them out on any court, home or road.

But let's talk about NoEscalators’ original point, because I think it's a commonly held belief among fans that the metrics don't respond adequately to Wins & Losses, and therefore the Selection Committee is not utilizing the criteria they should be using: wins and losses. That’s not really accurate.

Part of the problem is that KenPom & Torvik take up almost all of the metric oxygen because they're easily accessible and the media have loved to talk about them for ages. They were among the earliest metrics we had outside RPI. Because of that legacy, and the ease of accessibility, media and fans tend to put outsized importance on these metrics, both positively and negatively. Just think about it, when was the last time you heard a prominent talking head discuss WAB or KPI or SOR? You might find them discussed by stat nerds or bracketologists or bubble watchers, but rarely among national talking heads or writers.

KenPom and Torvik are efficiency metrics, which means they attempt to measure how many points you score and allow per possession, and adjust those figures for strength of opponent (an adjustment that is generally proprietary and differs from site to site). The difference between a 1 point win and a 1 point loss is 1 basket, and efficiency metrics treat those games as the extremely close results that they are. You can objectively play a good basketball game, score a ton of points efficiently, but still lose if your opponent was just slightly more efficient. You could play inefficiently, and still win if your opponent was just slightly more inefficient than you.

For a perfect example, let’s look no further than my beloved Johnnies. We hosted Alabama and posted an adjusted net efficiency rating of +18.5 per Bart Torvik (one of many reasons I like Torvik more than KenPom: he provides the data for individual game adjusted and raw efficiency margins). We lost that game by 7 points, because Alabama posted an adjusted efficiency margin of +27.9 points (the difference between the actual margin and the adjusted efficiency margins has to do with the number of possessions in the game being normalized to 100 as a control, as well as the adjustment for playing on the road). Fast forward roughly 2 weeks, and Alabama posted a 26 point adjusted efficiency margin, but lost by 10 to Gonzaga. Fast forward roughly another 2 weeks, and St. John’s posted an 18.0 adjusted efficiency margin and beat Ole Miss.

Now, you might say that is a function of opponent. However, that’s not the case. The adjustment is meant to control for opponent. So in the eyes of Torvik, St. John’s effectively played equally well (or poorly, depending on your subjective opinion) against Alabama and Ole Miss, but got wildly different results—a 12 point swing. The same is true for Alabama across its games against St. John’s and Gonzaga.

KenPom and Torvik focus strictly on "efficiency" because they're meant to be predictive. They're designed to tell you what a team is likely to do moving forward, i.e. if you maintain that efficiency level (adjusted for competition) your projected record—based on your schedule—is XX-XX. This is another reason I like Torvik a bit more than KenPom. KenPom relies strictly on an adjusted net efficiency rating, which is simply your adjusted offensive efficiency (how many points per 100 possessions you score, adjusted for strength of opponent’s defense) minus your adjusted defensive efficiency (how many points per 100 possessions you concede, adjusted for strength of opponent’s offense). So if you look at UConn, for example, their adjusted net efficiency rating is +30.15 (adjusted net efficiency not to be confused with the NET, an acronym for NCAA Evaluation Tool). That effectively means that on a neutral floor, UConn would be expected to beat the average D1 squad—a team with a 0.00 net rating—by 30 points.
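To make the arithmetic concrete, here’s a minimal Python sketch of that net rating calculation. The offensive and defensive numbers are illustrative (chosen to land on the UConn figure cited above), not KenPom’s actual published values:

```python
# Minimal sketch of a KenPom-style adjusted net efficiency rating.
# The inputs below are illustrative, not KenPom's actual numbers;
# the real values come from a proprietary opponent adjustment.

adj_off = 127.00  # adjusted points scored per 100 possessions
adj_def = 96.85   # adjusted points allowed per 100 possessions

adj_net = adj_off - adj_def
print(f"Adjusted net efficiency: {adj_net:+.2f}")  # prints +30.15
```

A 0.00 net rating is, by construction, the average D1 team, which is why a +30 team projects to win by about 30 on a neutral floor against that average opponent.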

On the other hand, Torvik does effectively the same thing, but then he calculates your expected win percentage versus the average D1 opponent. So, to stick with UConn, Torvik has them at a +30.0 adjusted net efficiency, so virtually identical to KenPom, and per Torvik that translates to a .963 win probability versus the average D1 team. This year, for reference, that would be a team like UMass, Winthrop, FIU, Montana St., Drake, Iona, Incarnate Word, Siena, Lipscomb, or South Dakota St., who sit between .510 and .490 in Barthag and 158 to 167 in the rankings.
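For the curious, Torvik’s Barthag is widely described as a Pythagorean-style expectation applied to the adjusted efficiencies. A hedged sketch, assuming the commonly cited 11.5 exponent and illustrative efficiency numbers (not Torvik’s actual published figures):

```python
# Hedged sketch of a Barthag-style win probability. The Pythagorean
# form and the 11.5 exponent are the commonly cited description of
# Torvik's method, assumed here rather than confirmed.

def barthag(adj_off: float, adj_def: float, exp: float = 11.5) -> float:
    """Projected win probability vs. an average D1 team."""
    return adj_off**exp / (adj_off**exp + adj_def**exp)

# A perfectly average team (offense == defense) lands at exactly .500:
print(round(barthag(100.0, 100.0), 3))  # prints 0.5

# Illustrative efficiencies producing roughly a +30 net rating
# translate to a win probability in the mid-.900s:
print(round(barthag(127.0, 97.0), 3))
```

This is also why the .500-ish Barthag teams listed above are, by definition, the "average D1 opponent" the .963 figure is measured against.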

A minor point that was recently tangentially discussed by Jim Root and Will Warren on their Stats by Will & Jim podcast is that you can also drop in rank, without actually dropping in efficiency, if the teams below you are playing more efficiently. So every drop in rank isn't necessarily an indictment of how that team is playing from an efficiency standpoint. So there should be more focus on the actual efficiency margin in KP or the Barthag for Torvik. For example, SJU was projected 16th preseason at KenPom, with a projected Net Rating of +23.48. As of today they sit at +23.00, ranked 19th. And even if they had maintained that +23.48 figure, they’d still be ranked 19th, because at least 3 other teams have surpassed them from a net rating perspective (realistically some above them fell and others rose).

So, to put a bow on it, efficiency metrics don't always respond the way we would like to wins & losses. You can win inefficiently & lose efficiently. That's why you can win and fall in KP or Torvik or you can lose and rise in KP or Torvik. That is the way those metrics were designed. Yelling about that is like yelling at the sky for being blue.

However, that's not an "indictment of ***everything***".

We have an entire set of team sheet metrics that DO respond to wins and losses, almost exclusively:

WAB - Wins Above Bubble, promulgated by both Torvik and the NCAA, but only the NCAA’s appears on team sheets for the Selection Committee
KPI - Kevin Pauga Index, promulgated by Kevin Pauga, an Associate AD at Michigan State…because that’s not a conflict of interest ::eye roll::
SOR - Strength of Record, an ESPN proprietary metric

And of course there's NET, which is a hybrid metric accounting for both efficiency and results.

I don't know how SOR works to be perfectly honest. As far as I can tell, it’s a complete black box. KPI includes a bunch of factors, including strength of opponent, margin of victory and pace of play, but it is a black box as to how those factors are applied. However, KPI is binary in nature, so winning adds to your KPI score and losing subtracts from it. You will always get some level of credit for a win.

WAB functions in the same binary fashion, but WAB simply uses efficiency to project what an average bubble team's win probability is based solely on opponent and location. If you win, you score 1.0 minus X, where X is the decimal probability of a bubble team winning that same game (e.g. 75% = 0.75). You get the same score whether you win by 1 or win by 40. It responds exclusively to Ws & Ls. So if your concern is “why didn’t Seton Hall get more credit in KenPom or Torvik or even NET for beating Marquette on the road?”, what you probably want to look at is WAB. Seton Hall got +0.31 to their Torvik WAB, meaning, as bad as Marquette has been, the average bubble team would only win there 69% of the time, per Torvik. Because Hall actually won the game (a result worth 1.0), they banked .31 more wins than the average bubble team would be expected to, and that’s added to their Wins Above Bubble (WAB). Pretty simple. That win at Marquette bumped Hall from 31st in WAB to 26th in WAB per Torvik.
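The per-game WAB math described above is simple enough to sketch directly. The 0.69 bubble win probability is the Torvik figure cited for the Seton Hall win at Marquette:

```python
# Per-game WAB credit, as described above: you bank the actual
# result (1.0 for a win, 0.0 for a loss) minus the average bubble
# team's win probability for the same game, regardless of margin.

def wab_delta(won: bool, bubble_win_prob: float) -> float:
    """Change in Wins Above Bubble from a single game."""
    return (1.0 if won else 0.0) - bubble_win_prob

print(round(wab_delta(True, 0.69), 2))   # the win at Marquette: +0.31
print(round(wab_delta(False, 0.69), 2))  # a loss there would cost -0.69
```

Note the asymmetry: losing a game a bubble team usually wins costs you far more than winning it gains you, which is exactly the behavior you want from a resume metric.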

It’s a bit more complicated over at the NCAA’s version of WAB in this particular circumstance, largely because they do not show the values for individual games, BUT the upshot is Hall did still get credit for improving their WAB. Other results, particularly Hall non-conference opponent Washington State losing to Seattle, saw Hall’s non-conference WAB drop from .89 to .8. However, their overall WAB only took a hit from .8 to .77, meaning they essentially got .06 of credit, mostly from beating Marquette, but perhaps partially from other results improving as well. We just don’t know, because it is a moving target (like all metrics), and the NCAA doesn’t provide the game-by-game breakdowns of WAB value that Torvik does. Over at Torvik, on a team’s page, you can see the WAB column. For completed games, that column shows the cumulative WAB total, but if you hover over the cumulative number, it shows you the impact that game had on the total. For future games, it shows the WAB value that team would get from a win.

In any event, if what you're concerned about is metrics that respond to Ws and Ls, focus on the resume metrics, particularly WAB, not KenPom or Torvik. That’s not my naked plea to you, that is what the committee has been doing in recent history.

So to bring it back full circle, the fact that KenPom and Torvik occasionally punish wins and reward losses is not an indictment of the system. Those metrics are functioning the way they are intended. It’s also not an indictment of the system because those metrics are far less important in the Selection Committee room than the resume-based metrics. Two seasons ago, KPI was the metric most correlated with inclusion and seeding. Last season—the first one in which the NCAA added their proprietary WAB to the team sheet—WAB was most correlated with inclusion and seeding.

Which is to say the committee is focusing mostly on what we all want to matter the most: a team’s results. Predictive metrics can help, and generally play a bit more of a role in seeding, because we don't want to seed an efficient, but underachieving team too low, as that could be unfair to the higher seed. Or when teams are close in resume metrics, efficiency metrics can be a tiebreaker. However, by and large, the Committee is relying on a team’s resume, not its efficiency metrics.

So please, stop claiming things are broken just because you don't understand them. The system is working as we'd all like it to.
