Five Reasons NBA Lineup Data Is Lying to You

Statistics that rate certain lineups or player combinations are widely cited but notoriously unreliable. As the 2022-23 NBA playoffs approach, here are some tips for how to use—and not use—this data correctly.

During the second quarter of Tuesday’s broadcast of the Lakers-Timberwolves play-in game, TNT celebrated the Lakers’ new starting lineup with a graphic. This indeed deserved celebration: In the regular season, LeBron James, Anthony Davis, D’Angelo Russell, Austin Reaves, and Jarred Vanderbilt were a connected, two-way force and outscored opponents by a whopping 20.6 points per 100 possessions, per NBA Advanced Stats.

“Look at the minutes and the plus-minus,” Reggie Miller said on the broadcast. “Fantastic.”

Yet an epidemic exists in the field of intelligent NBA analysis, and—much like pollen allergies and final exams—it appears most frequently and frustratingly in the spring as playoff breakdowns kick into high gear. This TNT graphic was just the latest, most visible example. Lineup data, which purports to demonstrate a team’s overall strength through the performance of a five-man group or smaller combination of players, almost always obfuscates more than it reveals.

So before the 2022-23 playoffs begin and such analyses inundate the NBA ecosystem, let’s explore five reasons that lineup data lies, alongside five tips to better discern the truth about which NBA combinations work best.

1. Sample size is (almost) always a problem.

The first overarching problem with lineup data is the same issue that afflicts so many statistical analyses in sports: There’s not enough data from which to draw meaningful conclusions.

Over a full 82-game season, with around 4,000 total minutes and 8,000-plus possessions for each team, the context surrounding a team’s performance tends to even out. But over more limited samples, factors like opponent quality, injuries, load management, and game location can skew the results sharply in one direction or the other, and one very good or bad game can have an inordinate impact.

Consider the Lakers’ new starting five, which outscored opponents by 37 points across 167 possessions in the regular season. More than half of that margin—plus-22 in 16 minutes—came in one game against the Bulls. This Lakers quintet also padded its margin with a plus-eight against the Rockets and a plus-10 against Jazz backups on the last day of the season, and it played the Suns backups the game before.
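To see how much one game can tilt a small sample, here is a minimal Python sketch using only the two figures quoted above (the plus-37 total and the plus-22 against the Bulls). The function and variable names are illustrative, not from any stats provider.

```python
# Figures quoted in the text for the Lakers' new starting five.
total_margin = 37        # regular-season point margin for the lineup
bulls_game_margin = 22   # margin from the single game against the Bulls


def share_of_margin(game_margin: int, total: int) -> float:
    """Fraction of a lineup's total margin that came from one game."""
    return game_margin / total


bulls_share = share_of_margin(bulls_game_margin, total_margin)
margin_without_bulls = total_margin - bulls_game_margin

print(f"Share of margin from the Bulls game: {bulls_share:.0%}")
print(f"Margin in every other game combined: +{margin_without_bulls}")
```

Drop that one game and the lineup's margin falls from plus-37 to plus-15, which is the kind of swing a full-season sample would absorb.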

The lineup’s construction makes sense, as it surrounds LeBron and Davis with players who can shoot and cut, and it packs the defensive back line with ferocious rim protectors. It’s promising that the unit’s early returns matched that expectation, regardless of opponent quality—but breaking down the context removes some of the shine from the eye-popping plus-20.6 figure.

To analyze lineups with some measure of certainty, then, we need to wait for all those extra variables to settle. Analyst Kostya Medvedovsky calculated last year that it takes about 550 possessions for a five-man lineup’s offensive rating to “stabilize” and about 850 possessions for its defensive rating to do the same. (Defense takes longer because a team has much more control over its own shooting percentage than its opponents’.)

For context, however, only 25 lineups this season reached 550 possessions, per Cleaning the Glass, which is fewer than one per team. Only 11 reached 850 possessions. That doesn’t mean there’s no value in analyzing a lineup before it reaches those thresholds (500 possessions will still provide much more information than 100 or 200), but it does mean that any analysis should be limited and precise.
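The two thresholds can be turned into a quick check. This is only a sketch built on the 550 and 850 figures quoted above; it is not the logic of any published tool, and the 600-possession lineup is a made-up example.

```python
# Stabilization thresholds quoted in the text (Medvedovsky's estimates).
OFFENSE_STABLE_POSSESSIONS = 550
DEFENSE_STABLE_POSSESSIONS = 850


def ratings_stabilized(possessions: int) -> dict:
    """Report which of a lineup's ratings have reached the quoted thresholds."""
    return {
        "offense": possessions >= OFFENSE_STABLE_POSSESSIONS,
        "defense": possessions >= DEFENSE_STABLE_POSSESSIONS,
    }


lakers_starters = ratings_stabilized(167)      # possessions quoted in the text
hypothetical_lineup = ratings_stabilized(600)  # made-up sample size

print("Lakers starters:", lakers_starters)
print("Hypothetical 600-possession lineup:", hypothetical_lineup)
```

A lineup can clear the offensive bar while its defensive rating is still unsettled, so the two sides are worth checking separately.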

“I suspect,” Medvedovsky wrote, “that means there’s very little you can do with simple 5-man unit ratings to see which units are ‘working.’ Even if you’re lucky enough to have a unit with a ton of possessions together, you’re going to be comparing it against other lineups which have barely any.”

The problem is that lineup data is easily accessible online, even in much smaller samples, but it’s presented without these sorts of guardrails. “Because it’s so easy to get on,” says Andrew Patton, who works with Medvedovsky on the DARKO projection system, “it’s very easy to accidentally, in good faith, misuse.”

So Patton, a data scientist and self-described “Sixers psychopath,” decided to take a proactive approach to the predicament. He helped create an online tool, simply titled “Should I Use This Rating?” Enter a lineup’s offensive rating, defensive rating, and sample size, and it will spit out a stock answer, like a Magic 8 Ball, as to whether the information is meaningful.

Plug this Lakers lineup’s regular-season stats into the “Should I Use This Rating?” tool, and it responds, in big black text against a bright-red background, “Absolutely Not.”

Tip: Double-check the sample size before citing lineup data.

The good news is that sample size information is as readily available as the net ratings themselves, and folks can use that information to ask, “Should I use this rating?” The tool is most helpful, Patton says, to gauge lineups with a moderate number of possessions that can’t be easily dismissed or obviously trusted: “One hundred possessions is not enough. Just throw that away. Fifteen hundred is definitely enough. But in that middle ground is [where] it starts to get tricky.”

Plugging in different lineup stats reveals the different shades of usefulness. Beyond the “Absolutely Not” basement, descriptors progress from “Caveat Heavily” (like the Cavaliers lineup with their four stars plus Caris LeVert) to “Meaningfulish” (like the Bulls’ fantastic starting group with Patrick Beverley and Alex Caruso in the backcourt) all the way to “Actually, Yes” for lineups with high enough volumes to be stable.

2. The NBA is a make-or-miss league.

This problem ties into small sample size concerns because of increased randomness, but it’s important enough to warrant another entry. As The Athletic’s Seth Partnow noted on Twitter about lineup analyses, “A thing that stands out when you dig in is how often ‘big changes in performance’ reflect either the regression from or progression of outliery shooting stuff.”

Let’s continue with the Lakers starters as a representative example and dig in.

In the regular season, the Lakers’ new starting five shot a scorching 52 percent from distance (24-for-46) when sharing the court—the second-best mark for any five-man unit with at least 75 minutes played, per NBA Advanced Stats. But they were rather lucky to achieve that mark: LeBron, for instance, made 11 of his 16 3-point attempts in those lineups but shot just 30 percent from distance over the rest of his season. Overall, with this lineup, the Lakers would have been “expected” to make only 16 3s, according to factors like shot location and defender distance, per Second Spectrum.

On the other end, meanwhile, the Lakers benefited from ice-cold shooting by their opponents, to the tune of a 29 percent mark on 3s (16-for-56)—tied for the fifth-lowest mark among all lineups with at least 75 minutes played. But their opponents would have been “expected” to make 21 of those attempts, per Second Spectrum.

Combining those stats—eight extra made 3s for the Lakers and five extra missed 3s from their opponents—yields an estimated 39 extra points from shooting luck alone. And remember, the total margin for this new Lakers lineup was plus-37, meaning shooting luck by itself could account for all of that.
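The arithmetic behind that estimate can be written out in a few lines of Python. The makes and expected makes are the figures quoted above; the helper name and the simple "three points per surprise make" accounting are assumptions of this sketch, not Second Spectrum's method.

```python
POINTS_PER_THREE = 3


def shooting_luck_points(made: int, expected_made: int) -> int:
    """Points gained (positive) or lost (negative) relative to expected makes."""
    return (made - expected_made) * POINTS_PER_THREE


# Figures quoted in the text for the Lakers' starting five.
lakers_luck = shooting_luck_points(made=24, expected_made=16)    # extra makes
opponent_luck = shooting_luck_points(made=16, expected_made=21)  # extra misses

# Opponent misses help the Lakers, so subtract the opponents' total.
net_luck = lakers_luck - opponent_luck

observed_margin = 37
luck_adjusted_margin = observed_margin - net_luck

print(f"Net points from 3-point luck: {net_luck}")
print(f"Margin after removing that luck: {luck_adjusted_margin}")
```

This treats every surprise make or miss as a clean three-point swing and ignores rebounds and everything else that follows a shot, so read the adjusted margin as a rough gut check rather than a true rating.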

Lo and behold, in the play-in game, the Lakers’ aberrant shooting advantage disappeared. With its new starters in the game, L.A. shot just 2-for-9 from distance, while the Timberwolves went 8-for-12. The much-ballyhooed Lakers lineup was thus outscored by 15 points in 17 minutes and didn’t play at all past the middle of the third quarter.

Tip: Double-check shooting splits, too.

For a lineup of any size, NBA Advanced Stats shows both its own shooting percentages and those of its opponents. Use this data for gut checks: If 3-point percentages are down in the 20s or up near the 50s on either offense or defense, those outliers will probably regress closer to the mid-to-high 30s going forward.

Shooter quality naturally affects these benchmarks, however. The best 3-point shooting percentage this season for any high-usage five-man lineup belonged to the Warriors, at 47 percent—which makes sense, because Steph Curry and Klay Thompson took the bulk of those shots.
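That gut check can be sketched as a simple flag. The two cutoffs below are illustrative assumptions chosen to match the "down in the 20s or up near the 50s" language; they are not official thresholds, and they deliberately ignore shooter quality.

```python
# Illustrative cutoffs (assumptions for this sketch, not official thresholds).
COLD_CUTOFF = 0.30
HOT_CUTOFF = 0.50


def three_point_gut_check(made: int, attempts: int) -> str:
    """Flag a lineup's 3-point percentage if it looks like an outlier."""
    pct = made / attempts
    if pct < COLD_CUTOFF:
        return "cold outlier: expect it to rise"
    if pct >= HOT_CUTOFF:
        return "hot outlier: expect it to fall"
    return "plausible"


# Figures quoted in the text for the Lakers' starting five.
lakers_offense = three_point_gut_check(24, 46)
lakers_opponents = three_point_gut_check(16, 56)

print("Lakers' own shooting:", lakers_offense)
print("Opponents' shooting:", lakers_opponents)
```

A lineup built around elite shooters can sustain a percentage well above the league norm, so a flag here is a prompt to look closer, not a verdict.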

3. High-usage lineups suffer from selection bias.

Pop quiz: Out of the high-usage lineups in the NBA, what fraction of them outscore their opponents, and what fraction are outscored themselves?

Your first, logical impulse might be to assume that it’s an even split. Basketball’s a zero-sum game; if one team goes ahead, the other falls behind. Therefore, about half of the league’s high-usage lineups should be positive, and about half should be negative.

In reality, we don’t see a nice, even bell curve illustrating performances by high-usage lineups. Instead, we see a giant skew in the positive direction.

NBA Five-Man Lineups, 2017-23

Minutes Played | Positive Margin | Negative Margin | Average Net Rating
100+ | 67% | 33% | +3.99
250+ | 80% | 20% | +5.75
500+ | 90% | 10% | +5.95

Over the last half-dozen seasons, two-thirds of lineups that reached at least 100 minutes posted a positive scoring margin, according to an analysis of NBA Advanced Stats data. For lineups that reached at least 250 minutes, that proportion rose to 80 percent. And for lineups with 500-plus minutes, it was an unfathomable 90 percent, and those lineups had an average net rating of about plus-six points per 100 possessions.

For reference, the Celtics led the NBA this season with a plus-6.7 net rating. That means the average lineup that plays big minutes is almost as good as the best team in the league.

Zooming in on this season, we find that of the 31 lineups that played at least 250 minutes, 25 (or 81 percent) had a positive scoring margin. Of the six that didn’t, four were barely negative (such as a Trail Blazers lineup that finished with a negative-0.5 net rating), and the worst two belonged to the Rockets.

This result seems illogical at first—but it makes sense upon further inspection because it represents a case of selection bias. Lineups don’t just appear on the court; they have to be actively chosen. And for the most part, coaches will stop playing them if they’re not working. Five-man units that are outscored probably won’t play enough to reach a meaningful threshold anyway.
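A toy simulation shows how this kind of selection produces a positive skew even when lineup talent is centered on zero. Everything here is an assumption made for illustration: the spread of true talent, the amount of noise, and a simplified coach who gives each lineup a 100-minute trial and keeps playing it only if the trial margin is positive. It is a sketch of the mechanism, not a model of real NBA data.

```python
import random

POSSESSIONS_PER_MINUTE = 2      # roughly 8,000 possessions over 4,000 minutes
TRIAL_MINUTES = 100             # assumed trial before the coach decides
EXTRA_MINUTES = 400             # assumed extra run for lineups that survive
TALENT_SD = 4.0                 # assumed spread of true net ratings
NOISE_SD_PER_100_POSS = 16.0    # assumed luck in a 100-possession sample


def simulate_margin(rng: random.Random, true_net: float, minutes: int) -> float:
    """Observed point margin over a stretch of minutes: talent plus luck."""
    possessions = minutes * POSSESSIONS_PER_MINUTE
    expected = true_net * possessions / 100
    noise_sd = NOISE_SD_PER_100_POSS * (possessions / 100) ** 0.5
    return rng.gauss(expected, noise_sd)


def run_simulation(n_lineups: int = 5000, seed: int = 0):
    """Return (number of surviving lineups, share of them with positive margin)."""
    rng = random.Random(seed)
    final_margins = []
    for _ in range(n_lineups):
        true_net = rng.gauss(0.0, TALENT_SD)   # talent centered on zero
        trial = simulate_margin(rng, true_net, TRIAL_MINUTES)
        if trial <= 0:
            continue                           # coach benches the lineup
        rest = simulate_margin(rng, true_net, EXTRA_MINUTES)
        final_margins.append(trial + rest)
    positive_share = sum(m > 0 for m in final_margins) / len(final_margins)
    return len(final_margins), positive_share


survivors, positive_share = run_simulation()
print(f"{survivors} lineups reached {TRIAL_MINUTES + EXTRA_MINUTES} minutes")
print(f"Share with a positive margin: {positive_share:.0%}")
```

Only about half the simulated lineups survive the trial, and the survivors finish with a positive margin well over half the time, because they were chosen on results that mix real talent with a lucky start.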

(One exception to this rule is Tom Thibodeau, who stubbornly sticks with his preferred lineups even if they’re not working. The Knicks had the NBA’s most-used lineup with a negative net rating in both 2021-22 and 2020-21; ditto for the Timberwolves in 2016-17.)

Tip: Cite net ratings only if they’re extreme.

Because of this skew, the baseline for what counts as notable should be higher. For instance, the Hawks’ starting lineup had a net rating of plus-six this season, per Cleaning the Glass, which could suggest they’re a playoff dark horse as long as those players stay on the court—but knowing that plus-six is the average for super-high-usage lineups makes Atlanta’s core look a lot less exciting.

As a rough rule of thumb, then, don’t pay much attention to a specific lineup’s net rating unless it’s in the double digits (at least when focusing on the positive direction). This higher bar also makes a lineup like the Nuggets’ starting group, which is plus-12.7 in nearly 1,500 possessions this season, per Cleaning the Glass, stand out as even more special, because it’s clear that its overall performance is well above average and outside the bounds of luck.

4. Five-man lineups don’t actually play that much.

OK, so you’ve taken the first three lessons to heart: You’re analyzing exclusively lineups with large samples, you’re checking their shooting splits to make sure extreme luck isn’t playing a role, and you’re homing in on those with a big scoring margin.

Yet any five-man lineup analysis still won’t mean all that much, because real life isn’t 2K with fatigue turned off: Players have to rest, and rotation patterns mean even the most-used five-man lineups don’t actually play together that often. The most-used five-man lineup this season (minimum 20 games) was the Nuggets’ starting group, which played all of 17.2 minutes per game across the 41 contests in which it appeared.

In the regular season, this lack of playing time matters in part because it prevents lineups from amassing larger sample sizes, which makes the data less reliable. But five-man minutes don’t rise all that much in the postseason either, even as minutes for individual starters increase. In the 2021-22 playoffs, the most-used five-man lineups played just 13.5 minutes per game on average. In 2020-21, the average was 14.3 minutes per game. In 2018-19 (before the bubble playoffs), it was 13.4. The high end for any five-man lineup in the postseason is about 20 minutes per game.

In other words, even the most-used five-man lineups share the court for much less than half of a game, and what happens in the other 34ish minutes of game time is just as important as what happens in the starters’ 14ish minutes together.
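Put as a share of a 48-minute game, the playoff figures above look like this. The sketch uses only the per-game averages quoted in the text and ignores overtime.

```python
GAME_MINUTES = 48  # regulation length; overtime is ignored in this sketch

# Average minutes per game for the most-used five-man lineups, as quoted above.
playoff_avg_minutes = {"2018-19": 13.4, "2020-21": 14.3, "2021-22": 13.5}

coverage = {
    season: minutes / GAME_MINUTES
    for season, minutes in playoff_avg_minutes.items()
}

for season, share in coverage.items():
    uncovered = GAME_MINUTES - playoff_avg_minutes[season]
    print(f"{season}: {share:.0%} of the game, {uncovered:.1f} minutes uncovered")

# The text puts the high end for any playoff lineup at about 20 minutes.
high_end_share = 20 / GAME_MINUTES
print(f"High end: {high_end_share:.0%} of the game")
```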

Tip: Look beyond a single season of a five-man lineup—to that lineup’s history, multiple overlapping lineups, or smaller combinations of players.

Among five-man units that played at least 100 minutes this season, the Warriors lineup of Curry, Thompson, Andrew Wiggins, Draymond Green, and Kevon Looney had the best net rating, at plus-21.9 points per 100 possessions, per NBA Advanced Stats. Due to injuries and absences, that group didn’t play enough for that figure to be truly meaningful—but because it has such an excellent history before this season too, we can be more confident in its strengths.

This lesson applies to lineup analysis more broadly: Looking at a lineup’s history beyond the current season adds useful information and context, and it helps to increase the sample size. One useful tool that captures this process is the DARKO projection system, which publishes estimated ratings for five-man lineups (split into offensive and defensive ratings) based on blended results from the past and present to account for all the problems discussed above.

Naturally, this system shows that Warriors group as by far the best five-man lineup. Here is the top of the leaderboard for lineups on current rosters, according to an analysis of DARKO’s data.

Best Current Five-Man Lineups

Team | Lineup | Points Above Average
Warriors | Curry, Thompson, Wiggins, Green, Looney | +13.9
Celtics | Smart, Brown, Tatum, Horford, R. Williams | +10.6
Celtics | Smart, Brown, Tatum, G. Williams, R. Williams | +10.2
Nuggets | Murray, Caldwell-Pope, Porter, Gordon, Jokic | +10.2
Grizzlies | Morant, Bane, Brooks, Jackson, Adams | +8.3
Celtics | White, Brown, Tatum, Horford, R. Williams | +8.2
Clippers | Mann, George, Leonard, Morris, Zubac | +7.6
Bulls | Beverley, Caruso, LaVine, DeRozan, Vucevic | +7.4
Heat | Vincent, Herro, Butler, Martin, Adebayo | +7.4
Bucks | Holiday, Connaughton, Middleton, Antetokounmpo, Portis | +7.1
Nuggets | Caldwell-Pope, Brown, Porter, Gordon, Jokic | +7.1
Based on analysis of data from the DARKO projection system

Want to know why statistical models consider the Celtics title favorites? Having three of the top six lineups helps, and that’s another clue to aid in analyzing lineup data. If multiple lineups with similar players are all excellent, like the Celtics’ groups with Brown, Tatum, and Robert Williams or the different iterations of the Nuggets’ top players, then those combinations can cover a larger portion of a game or series.

Expanding your analysis to smaller sets of players offers more game coverage too. The three-man group of Joel Embiid, James Harden, and Tobias Harris shared the court for 26.2 minutes per game this season, and last postseason, Harden, Harris, and Tyrese Maxey played 30.3 minutes per game together. Cutting smaller slices helps with the problems of time and sample size—though it has its own issue as well.

5. It’s easy to cherry-pick.

Thus far, we’ve focused mostly on five-man units, but the misuse of net ratings also extends to smaller subunits of lineup combinations. One trick that writers (including myself—I’m guilty on occasion too!) use to provide context looks something like this: Spencer Dinwiddie, Cam Johnson, and Mikal Bridges have a plus-7.1 net rating together, which would be the top mark in the league—so watch out for the Nets, because even without Kevin Durant and Kyrie Irving, they’re still awesome when their best players are on the floor.

That “top mark in the league” claim is technically true: Plus-7.1 is higher than the Celtics’ league-best plus-6.7 overall mark. But there’s an inherent flaw in that framing. Picking the Nets’ lineups with Dinwiddie, Bridges, and Johnson means we’re comparing the new-look Nets at only their best to every other team at their best, worst, and everything in between. Of course Brooklyn would look better with that imbalance.

The Nets example might seem a touch absurd because hardly anyone expects Brooklyn to challenge the 76ers in the first round. But that makes it a great illustration of the fact that, for each of the 14 teams in the current playoff field, we can cherry-pick a group of the three best players, which will look incredibly, misleadingly strong with the top-mark-in-the-league framing:

(Misleading) Three-Man Combo Ratings for Playoff Teams

Team | 3-Man Combo | Net Rating | Where Would It Rank Leaguewide?
Bucks | Holiday, Antetokounmpo, Lopez | +12.9 | 1st
Celtics | White, Brown, Tatum | +11.8 | 1st
76ers | Harden, Harris, Embiid | +9.3 | 1st
Cavaliers | Mitchell, Mobley, Allen | +8.7 | 1st
Knicks | Brunson, Randle, Robinson | +6.5 | 2nd
Nets | Dinwiddie, Johnson, Bridges | +7.1 | 1st
Hawks | Young, Hunter, Collins | +4.5 | 3rd
Nuggets | Murray, Gordon, Jokic | +14.2 | 1st
Grizzlies | Morant, Bane, Jackson | +12.8 | 1st
Kings | Fox, Barnes, Sabonis | +4.6 | 3rd
Suns | Paul, Booker, Durant | +15.3 | 1st
Clippers | George, Leonard, Zubac | +9.6 | 1st
Warriors | Curry, Thompson, Green | +8.3 | 1st
Lakers | Reaves, James, Davis | +14.3 | 1st
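
A short Python sketch makes the imbalance concrete by counting how many of these hand-picked trios "beat" the league's best full-team net rating. The numbers come from the table above and the Celtics' plus-6.7 mark quoted earlier; the variable names are illustrative.

```python
LEAGUE_BEST_TEAM_NET_RATING = 6.7  # Celtics' full-season mark quoted in the text

# Net ratings for each team's cherry-picked three-man combo, from the table.
combo_net_ratings = {
    "Bucks": 12.9, "Celtics": 11.8, "76ers": 9.3, "Cavaliers": 8.7,
    "Knicks": 6.5, "Nets": 7.1, "Hawks": 4.5, "Nuggets": 14.2,
    "Grizzlies": 12.8, "Kings": 4.6, "Suns": 15.3, "Clippers": 9.6,
    "Warriors": 8.3, "Lakers": 14.3,
}

# Mismatched comparison: one team's best trio vs. another team's entire season.
beats_best_team = [
    team for team, rating in combo_net_ratings.items()
    if rating > LEAGUE_BEST_TEAM_NET_RATING
]
falls_short = sorted(set(combo_net_ratings) - set(beats_best_team))

print(f"{len(beats_best_team)} of {len(combo_net_ratings)} trios 'lead the league'")
print("Trios that fall short:", ", ".join(falls_short))
```

When most of the field can claim the top mark in the league, the claim says more about the comparison than about any one team.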
Tip: Compare like to like.

When asked for his primary piece of advice to better analyze lineup data, Patton recommends focusing on the cherry-picking problem. “When possible,” he says, “compare two things that are the most similar.”

That means comparing on/off data for one player to on/off data for another player, not cross-contaminating player ratings and team ratings. Or it means comparing three-man data to three-man data, or five-man to five-man, or overall team to overall team, rather than comparing one team’s best unit to another team’s best, worst, and everything in between.

In other words, if you’re analyzing the first-round matchup between the Celtics and Hawks, don’t look at the net rating for Atlanta’s starters and conclude, “That’s almost as good as Boston’s net rating; this series might be close.” Instead, if you want to consider the Hawks’ best five-man lineup, place it alongside Boston’s best five-man lineup. Then you’ll notice that the Celtics’ most-used quintet is at plus-12.2, per Cleaning the Glass, twice as good as Atlanta’s.
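One way to make "compare like to like" hard to forget is to label every rating with the kind of unit it describes and refuse to compare mismatched labels. The class and function below are hypothetical helpers written for this sketch; the ratings are the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    team: str
    unit: str          # e.g. "team", "five-man", "three-man"
    net_rating: float  # points per 100 possessions


def compare(a: Rating, b: Rating) -> float:
    """Return a's net rating minus b's, but only for the same kind of unit."""
    if a.unit != b.unit:
        raise ValueError(f"cannot compare a {a.unit} rating to a {b.unit} rating")
    return a.net_rating - b.net_rating


hawks_five = Rating("Hawks", "five-man", 6.0)
celtics_team = Rating("Celtics", "team", 6.7)
celtics_five = Rating("Celtics", "five-man", 12.2)

# Like to like: best five-man unit against best five-man unit.
gap = compare(hawks_five, celtics_five)
print(f"Hawks starters vs. Celtics starters: {gap:+.1f}")

# Mismatched: a five-man unit against a full-team rating is rejected.
try:
    compare(hawks_five, celtics_team)
except ValueError as err:
    print("Rejected:", err)
```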

Fortunately, the playoffs lend themselves to comparing like to like, because so much is about head-to-head matchups between players, lineups, and teams. Keep all these warnings and tips in mind, and your analysis will be more precise and predictive all postseason long.