Two weeks ago, Michael David Smith of the Wall Street Journal’s online edition wrote that the Detroit Lions may be the unluckiest team in NFL history. Despite, at the time, outscoring their opponents, the Lions had won only 2 of 9 games. Certainly, Lions fans expected better—and hoped for much better. Infuriatingly, the Lions seem much improved, but there’s been no change in the bottom line. However, it’s hard not to consider Bill Parcells’ famous line, “You are what your record says you are.” Many fans, bloggers, and media pros subscribe to this idea: no matter how much more competitive the Lions look, they are not actually better until they have more Ws next to their name.
So, what do we make of this? Do we ignore what our eyes tell us? Do we disregard increased production on both sides of the ball as window treatments on the Titanic? Or, do we foolishly embrace false “progress” because we’re so desperate to believe? How much of the Lions’ 2-9 record can be blamed on happenstance, and how much of it is just the Lions’ lack of ability? Fortunately, Brian Burke of Advanced NFL Stats recently wrote an article exploring exactly how random win-loss records are in the NFL.
Imagine flipping a perfectly fair coin 10 times. It would actually be uncommon for the coin to come out 5 heads and 5 tails. (In fact, it would only happen 24% of the time). But if you flipped the coin an infinite number of times, the rate of heads would be certain to approach 50%. The difference between what we actually observe over the short-run and what we would observe over an infinite number of trials is known as sample error. No matter how many times you actually flip the coin, it’s only a sample of the infinitely possible times the coin could be flipped.
As a prime example, the NFL's short 16-game regular season schedule produces a great deal of sample error. To figure out how much randomness is involved in any one season, we can calculate the variance in team winning percentage that we would expect from a random binomial process, like coin flips. Then we can calculate the variance from the team records we actually observe. The difference is the variance due to true team ability.
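For anyone who wants to check the coin-flip math, here is a minimal Python sketch of the idea described above. The function names are mine, not Burke's, and the four-team "league" at the bottom is made-up data used only to show the arithmetic; it is not real NFL data.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def random_variance(n_games, p=0.5):
    """Variance in winning percentage expected from pure coin flips."""
    return p * (1 - p) / n_games

def luck_share(win_pcts, n_games):
    """Fraction of observed variance that coin flips alone would produce."""
    mean = sum(win_pcts) / len(win_pcts)
    observed = sum((w - mean) ** 2 for w in win_pcts) / len(win_pcts)
    return random_variance(n_games) / observed

# Exactly 5 heads in 10 fair flips
print(binomial_pmf(5, 10))        # 0.24609375

# Variance of winning percentage for a 16-game season of coin flips
print(random_variance(16))        # 0.015625

# Hypothetical four-team league (invented numbers, for illustration only)
print(luck_share([0.75, 0.5, 0.5, 0.25], 16))  # 0.5
```

The observed variance minus the random variance is what gets attributed to true team ability; the ratio computed by `luck_share` is the portion chalked up to sample error.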
I strongly, strongly encourage you to read “The Randomness of Win-Loss Records” at Advanced NFL Stats in its entirety. Go ahead, I’ll wait.
Okay, back? Great. Lost? Don’t worry: I’ve got you covered with some bullet points:
- 42% of an NFL team’s regular season record can be accounted for by randomness, otherwise known as sample error.
- The correlation coefficient (r) between observed team records and a team’s true ability is the square root of 0.58, which is roughly 0.75.
- After a full season of 16 games, your best guess of a team's true team strength should regress its actual record one quarter of the way back to the league-wide mean of .500.
- The theoretical maximum accuracy of any predictive model is about .75. (from the comments, and Burke’s earlier work about luck & NFL outcomes).
If 42% of the Lions’ 2-9 record can be accounted for by randomness, that’s 4.62 games’ worth out of the eleven. Assume that the Lions have had nothing but bad luck to this point (they’re at the very nadir of randomness); if we then flip it to nothing but good luck, we can see the theoretical maximum given this talent. So, if the Lions had gotten all the bounces: no Stafford injury, no Megatron Referee Fail, no Wendling/McCann freak TD return, no Alphonso Smith Disasters, Drew Stanton completes that pass, Shaun Hill doesn’t airmail that two-pointer (neither of which would happen anyway because Stafford would’ve been healthy, remember?), a few fewer specious penalties for the Lions, a few more for the opponents, recover a few more of the forced fumbles, catch a couple of dropped INTs . . . the Lions could be as good as 6-5 right now.
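The back-of-the-envelope arithmetic in that paragraph can be sketched in a few lines of Python. This is only my reading of the 42% figure (treating it as a share of games that could swing either way), not a method Burke endorses; the function name `luck_swing` is hypothetical.

```python
LUCK_SHARE = 0.42  # share of a record attributed to randomness, per the bullets above

def luck_swing(wins, games, luck_share=LUCK_SHARE):
    """Return (games attributable to luck, best-case win total)."""
    luck_games = luck_share * games
    # Best case: every luck-driven game flips in the team's favor,
    # capped at the number of games actually played.
    best_case_wins = min(games, wins + luck_games)
    return luck_games, best_case_wins

luck_games, best_case = luck_swing(wins=2, games=11)
print(f"{luck_games:.2f} games of luck, best case {best_case:.2f} wins")
# 4.62 games of luck, best case 6.62 wins
```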
Before you freak out: that assumes both a 16-game season, and that the Lions are currently having the rottenest luck possible. An 11-game sample isn’t the same as a 16-game sample; there may yet be some regression to the mean. That is, if the Lions really aren’t what their record says they are, their luck will turn before we get to the end of the season. Well, either that, or next season will be a 16-game dip in the strawberry river.
Let’s assume for a second that there’s no sudden switch in the Lions’ fortunes, and they don’t sweep the NFC North at home during these next five games. Let’s also assume they maintain their current pace: a winning percentage of .182. Applied to 16 games, that’s 2.912 wins. What’s the “best guess at their true strength,” if we regress them one-quarter of the way back to the mean? If I understand this correctly, the difference between .500 and .182 is .318—and a quarter of that is .0795. So, the Lions’ “true strength” should be a winning percentage of .262: just over four wins.
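Here is the same regression-to-the-mean calculation as a small Python sketch, in case you want to plug in other records. The one-quarter fraction and the .500 league mean come from the bullet points above; the function name `regress_to_mean` is mine.

```python
def regress_to_mean(win_pct, fraction=0.25, league_mean=0.5):
    """Pull an observed winning percentage part of the way back to the league mean."""
    return win_pct + fraction * (league_mean - win_pct)

observed = 0.182                      # 2-9, rounded as in the text
true_strength = regress_to_mean(observed)

print(f"{true_strength:.4f}")         # 0.2615
print(f"{true_strength * 16:.2f}")    # 4.18 wins over a 16-game season
```

Note that the one-quarter figure is stated for a full 16-game season, so applying it to an 11-game pace carries the same assumption the paragraph above makes.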
Again: this assumes the Lions only win one more game. If the Lions finish 3-13, we’ll have no business saying “well this was really a 7-win team that got screwed.” Sure, if everything had broken the Lions’ way, and they’d been the beneficiary of some truly rare luck, then maybe they’d have won six or seven games—but as they are, busted-up Stafford and all, if the Lions only win one more game, they really are only a 3-to-4 win team.
So, again, perspective: this is applying Brian Burke’s analysis of win/loss randomness in the NFL to the Detroit Lions’ current record. All it can do is tell us, at the end of the season, what role “the Football Gods” have played in making the Lions’ record what it is; it is a retrodictive system, giving us a way of understanding what's already happened. It can’t tell us which games were the result of randomness, if “the randomness” has already happened, or if the Lions are “due” for a hot streak. It can’t tell us what we really want to know: how many games the Lions will win going forward.
Let’s attack this from the other direction: with a predictive model, one that can actually assess teams' relative strengths and project a winner. I’m choosing the Simple Ranking System, as published by Doug at Pro Football Reference.
Yes, this is required reading too. Yes, I’ll wait.
Fortunately, it is as simple as the name implies, so it only requires one bullet:
- Every team's rating is their average point margin, adjusted up or down depending on the strength of their opponents.
Okay, so average point differential, adjusted by strength of schedule, which adjusts the rankings, which adjusts the strength of schedule, which adjusts the rankings, which adjusts the strength of schedule, over and over and over until the numbers stop changing. Very simple indeed, yes, but as Doug says, “As it turns out, this is a pretty good predictive system.”
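To make the "over and over until the numbers stop changing" part concrete, here is a minimal Python sketch of that loop. It is my own toy version, not Doug's code: the function name, the tolerance, and the three-team schedule are all invented for illustration. A plain repeat-until-stable loop like this settles down on the toy schedule below, but I haven't shown that it does so for every possible schedule.

```python
from collections import defaultdict

def simple_ranking(games, tol=1e-9, max_iter=1000):
    """games: list of (team_a, team_b, margin), margin = team_a points - team_b points."""
    margins = defaultdict(list)
    opponents = defaultdict(list)
    for team_a, team_b, margin in games:
        margins[team_a].append(margin)
        margins[team_b].append(-margin)
        opponents[team_a].append(team_b)
        opponents[team_b].append(team_a)

    # Start from each team's raw average point margin.
    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)

    for _ in range(max_iter):
        # New rating = average margin + average rating of opponents faced.
        updated = {
            t: avg_margin[t] + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in ratings
        }
        change = max(abs(updated[t] - ratings[t]) for t in ratings)
        ratings = updated
        if change < tol:  # the numbers have stopped changing
            break
    return ratings

# Hypothetical three-team round robin (made-up scores)
toy_games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]
for team, rating in sorted(simple_ranking(toy_games).items()):
    print(team, f"{rating:.2f}")
# A 5.67
# B -1.33
# C -4.33
```

In the toy result, the gap between A and B is exactly the 7 points A beat B by, which is the kind of schedule-adjusted margin the system is after.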
|Team|W|L|T|Win %|Point Diff|SOS|SRS|
|---|---|---|---|---|---|---|---|
|Green Bay Packers|7|4|0|0.636|103|1.2|10.6|
|New England Patriots|9|2|0|0.818|68|2.2|8.4|
|New York Jets|9|2|0|0.818|77|1|8|
|San Diego Chargers|6|5|0|0.545|85|-2.1|5.7|
|New York Giants|7|4|0|0.636|37|-1.4|1.9|
|Kansas City Chiefs|7|4|0|0.636|54|-3.3|1.6|
|New Orleans Saints|8|3|0|0.727|68|-4.5|1.6|
|Tampa Bay Buccaneers|7|4|0|0.636|-4|-3.1|-3.5|
|St. Louis Rams|5|6|0|0.455|-18|-4.1|-5.8|
|San Francisco 49ers|3|7|0|0.3|-59|-2.5|-8.4|
Guess how this chart is sorted? By SRS rank. You can see the Packers, Patriots, Steelers, and Jets up there at the top, and Seahawks, Cardinals, and Panthers scraping the bottom of the barrel. But wait, that team in bold, the one that’s darn near in the center? That’s the Lions, ranked 18th overall. When we take into account who they’ve played (per SRS, the Lions have played the 6th-hardest schedule in the NFL to this point) and how their offense and defense have performed, the Lions are the 18th-strongest team in the NFL.
This isn’t “with Stafford,” “with that Megatron touchdown,” “with that Drew Stanton pass,” or with anything imaginary added or subtracted. Quite literally, it’s the scoreboard of every Lions game so far this year; it’s simply been adjusted by the scoreboards of everyone they’ve played.
Ah, but how accurate is this method? It’s a predictive model, but how predictive is it? Clearly, if it says the 2-9 Lions are near the middle of the pack in relative strength, it can’t be good at predicting who’ll win and who’ll lose, right? Well, I regressed the SRS rankings against win percentage, and this is what I got:
Check out the correlation factor there: .7449205, or rounded to three decimal places, .745. What was the theoretical maximum for a predictive model again? Well, if Brian Burke is right, it’s approximately .75. That means that given the inherent randomness in NFL outcomes, the Simple Ranking System is about as good as it gets when it comes to assessing relative strength of NFL teams, and thereby predicting future NFL outcomes. Again, according to this system, the Lions are the 18th-best team in the land. Further, if I’m not mistaken, they’re the biggest outlier on the chart: they’re the lowest, rightest dot (-0.1 SRS, .182 W-L). Nobody’s getting screwed harder, or helped out more, by Lady Luck than the Lions. Just trace straight up from the Lions’ dot to the line of best fit (the diagonal one), and you’ll know what the Lions’ win percentage ought to be: .500. That’s right, SRS expects the Lions to have five or six wins by now.
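If you want to run this kind of check yourself, here is a rough Python sketch of a correlation and line of best fit between SRS and win percentage. It uses only the ten rows excerpted in the table above, so it will not reproduce the .745 figure, which came from the full league; the function names are mine.

```python
from math import sqrt

# (SRS, win %) pairs from the ten excerpted table rows
rows = [
    (10.6, 0.636), (8.4, 0.818), (8.0, 0.818), (5.7, 0.545), (1.9, 0.636),
    (1.6, 0.636), (1.6, 0.727), (-3.5, 0.636), (-5.8, 0.455), (-8.4, 0.3),
]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def best_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

srs = [r[0] for r in rows]
pct = [r[1] for r in rows]
slope, intercept = best_fit(srs, pct)

print("r on the excerpt:", pearson_r(srs, pct))
# Win % the fitted line would expect for a team with the Lions' -0.1 SRS
print("expected win % at SRS -0.1:", slope * -0.1 + intercept)
```

Reading the expected win percentage off the fitted line at a given SRS is the code equivalent of tracing up from a dot to the diagonal on the chart.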
So what does this all mean? It means that if the Lions keep playing like they’ve been playing, they’re either going to pick up multiple wins in these last five games—or next season, they’ll be tubing down the strawberry river of regression to the mean.