I’m going to break one of my own rules today. I’m going to write about Federer, Nadal and Djokovic in a way that makes sweeping generalisations about their playstyles. This is often my pet peeve in tennis coverage: commentators or pundits falling back on the outdated, decade-old stereotypes of Federer as effortlessly balletic, Nadal as the indomitable grinder with fighting spirit, and Djokovic as the impenetrable wall. All of these pigeon-holed caricatures of these players discredit much of their actual greatness. Outside those overly rigid 2D caricatures, in the more complex and rich 3D reality, Federer is one of the hardest workers in the sport and both Nadal and Djokovic are two of the most talented players to ever play the game.
But while I do think those old descriptions are boring, overplayed, and often fail miserably to match up with what’s actually happening during matchplay, I am going to write a little bit about why I think they can be vaguely useful in a bigger picture sense, and how they fit into one of my frameworks for predicting and analysing how those rivalries (and many matchups) play out on court.
This piece is going to be strange and is a bit of a brain dump. It’s also very much a theory in progress and is inarguably rough around the edges.
GTO
The field of artificial intelligence, and its subset of machine learning, likes testing its progress and abilities against table-top games. Perhaps the most famous examples of this are IBM’s Deep Blue beating Grandmaster Garry Kasparov at Chess in 1997, and much more recently DeepMind’s/Google’s AlphaGo beating Lee Sedol at Go in 2016. The field has advanced significantly in recent years, with newer and/or more generalised iterations like AlphaGo Zero, AlphaZero, and MuZero advancing DeepMind’s work to create ‘superhuman’ AI in record time. Machine learning methods have evolved along the way but one of the underlying methodologies remains similar: Monte Carlo tree search (MCTS).
I’m not going to go into too much detail on this because the rabbit holes are extremely deep and I would also certainly irritate those reading who actually work in machine learning. But the very high-level, simplified explanation is that the very best game AI use the following structures to inform and optimise their play.
The initial available moves are laid out for selection like the initial trunk and branches of a tree. The tree is then expanded as the game progresses to include possible combinations of scenarios based on how the game is being played between the competitors. Simulations are run to gauge the success rate of those and future moves that lead to the various win/loss/draw conditions. Backpropagation is then performed as a way to deconstruct what went wrong/right, and which moves and strategies need to be updated for future games if mistakes or non-optimal branches were taken. This is a rather complex way of saying this is how AI learns and improves at something narrow like a game of Chess or Go. Unlike humans, AI has the benefit of playing millions of these games in short time periods via ‘self-play’ in order to populate those ‘trees’ with information, probabilities and optimisation. It’s no wonder therefore that some of the most advanced versions of these AI can now become ‘superhuman’ in these games in a matter of hours or days rather than weeks, months or years.
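For anyone who wants to see those four steps (selection, expansion, simulation, backpropagation) in something more concrete than prose, here is a minimal sketch. It is my own toy illustration and not how AlphaGo or any real engine is built: the game is a hypothetical mini version of Nim (take 1 to 3 stones, whoever takes the last stone wins), the names `Node`, `ucb1`, `rollout` and `mcts` are invented for this example, and the exploration constant of 1.4 is simply a commonly used default.

```python
import math
import random


class Node:
    """One state in the search tree for toy Nim (take 1-3 stones, last stone wins)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player about to move
        self.parent = parent
        self.move = move          # the move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who made `move`


def ucb1(child, parent_visits, c=1.4):
    # Trade off a branch's win rate against how rarely it has been explored.
    return child.wins / child.visits + c * math.sqrt(
        math.log(parent_visits) / child.visits
    )


def rollout(stones, rng):
    """Play random moves to the end; True if the player to move here takes the last stone."""
    mover_is_first = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return mover_is_first
        mover_is_first = not mover_is_first
    return False  # no stones left: the player to move has already lost


def mcts(root_stones, iterations=3000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down fully explored branches using UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: grow one new branch if any move is still untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: finish the game with random play from the new node.
        mover_wins = rollout(node.stones, rng)
        # 4. Backpropagation: pass the result back up the tree, flipping
        #    perspective at each level because the players alternate.
        result = 0.0 if mover_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # The most visited first move is the search's recommendation.
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    print(mcts(5))
```

With enough iterations the search should settle on leaving the opponent a multiple of four stones, which is the known winning strategy for this toy game. Note that 'backpropagation' here just means feeding each simulated result back up the branches that produced it.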
Chess and Go are both games with perfect information, i.e. turn-based games where all available information is visible to both players on the board in front of them. Further up the complexity ladder is a game like six-handed No Limit Hold’em Poker with its imperfect information due to hidden cards and bluffing. Tennis, and many other non-tabletop games or sports, currently represents far too great a challenge for AI to try and ‘solve’ in the way that Go, Chess and Poker have been solved. Regular sports, while still in essence representing ‘games’ with codified rules and win/loss conditions, often require more general manifestations of intelligence and ability than the current cutting edge of AI can muster. Tennis is not only a game of imperfect information, in part because both players move constantly and simultaneously while choosing potentially unpredictable shots, but it would also represent an impossible technical challenge for current computing power and hardware development. It would require both hardware and software elements to first make the decision for where to move and what shot to play/predict in an extremely short time window, while also near-simultaneously executing that movement in sync on the court via robotics. Federer, Nadal and Djokovic can rest assured that no tennis AI player is going to ‘solve’ the game against them, at least not in their careers. Those furiously participating in the GOAT debate can avoid further shambles for now.
Of course AI is increasingly relevant in tennis and all sports, with simpler goals like automating and processing data collection, spatial player movement tracking, shot pattern analysis, strategy optimisation etc. to push the sport forward. Every elite player currently uses some form of AI to develop their game and find an edge (some more than others). But any notion of an AI actor that could actually compete with, rather than supplementarily help, the very best players is still a long way away. Because much of Poker has now been ‘solved’, i.e. there are correct plays in many combinations of cards, hands and pot scenarios, the best poker players now use ‘solvers’ (sometimes alongside ‘solved pre-flop ranges’) to pick up on hard-to-see patterns and verify whether their own play was ‘optimal’ based on hand histories. This is fascinating because it’s a bit like humans’ own version of backpropagation, and even more fascinating because in this instance humans are learning from artificial intelligence that was built by other humans, a bit like some kind of developmental ouroboros. The equivalent in tennis would be a bit like if someone built a near-perfect tennis AI, with a robotic body to execute its moves, and then the very best players trained against that AI & robotic concoction to increase their own rate of progress against the comparatively weak human field.
In Poker, if a human player’s strategy aligns with the ‘solver’ AI it’s considered to be close to ‘game theory optimal’ (GTO) play. But what does GTO look like in tennis?
Federer, Nadal and Djokovic
I think in quite different ways, all of Federer, Nadal and Djokovic (and of course Murray) have pushed the game of tennis closer to game theory optimal (GTO) play at times.
Federer gets a bit of a raw deal when talking about baseline ability largely because he was immediately followed by the two greatest baseliners to ever live in Nadal and Djokovic. But Federer, from 2003-2007ish was a transformatively great baseliner. Federer’s forehand represented a mix of devastating power, alongside forgiving margin thanks to high topspin rates. And his all court game, complemented by exceptional hand-skills and a brilliant slice, all built on top of his superlative serving foundation, combined to make the Swiss a complete nightmare for the field to play against. The only problem for Federer is that he then met two other players (Nadal and Djokovic) who managed to perfect their own versions of high margin baseline aggression to even greater accuracy and effectiveness (I go into more detail about Nadal’s high margin edge here and Djokovic’s here, although both of them count movement as hugely important). My over-generalised mental model for those three players is that Federer is the most instinctive and feel-orientated of the three, preferring and thriving when playing in the moment and sometimes suffering when forced to construct longer points especially out of his backhand corner. Nadal and Djokovic are both more comparatively comfortable when points run long with slightly more balanced toolsets (double handed backhand over a single handed backhand helps1), which perhaps rewards more patient and considered point construction, while also still possessing sufficient weapons to elevate themselves above the role of a human backboard (a la David Ferrer). This is all rather simplistic but this is one part of why I think Nadal and Djokovic have outperformed Federer, in aggregate, in terms of pressure or big point play over the years. Nadal and Djokovic, against the backdrop of tennis conditions in the 2000’s and 2010’s, usually had a safer, but still effective, way to win points than Federer did.
And over thousands of points, that edge, even if seemingly tiny when isolated, builds. There are of course other arguments, e.g. most of Federer’s prime coming before Djokovic’s, which arguably skews their rivalry. But I still think it’s generally true that both Nadal and Djokovic were able to play Federer with more margin of error than he could play against them, and therefore played something closer to ‘game theory optimal (GTO) tennis’ in this context. I should be clear here that being able to play with increased margin of error alone is not enough, otherwise David Ferrer would have a better H2H against the Big 3. It’s Nadal and Djokovic’s ability to consistently find that reproducible margin, on top of all their other extraordinary skills, which makes them such complete threats.
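To put a rough number on how a small per-point edge 'builds', here is a minimal sketch using the standard textbook formula for winning a single game. The big assumption (mine, and a very crude one) is that every point is independent and won with the same probability `p`, which real tennis obviously isn't. The function name `game_win_prob` is made up for this example.

```python
def game_win_prob(p):
    """Chance of winning a standard game if every point is won with probability p.

    Assumes independent points with a fixed p, which is a simplification.
    """
    q = 1.0 - p
    # Win the game to love, to 15 or to 30 (opponent wins 0, 1 or 2 points).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce at three points each (20 possible orderings of six points).
    reach_deuce = 20 * p**3 * q**3
    # From deuce, win two points in a row before the opponent does.
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce


if __name__ == "__main__":
    for p in (0.50, 0.52, 0.55):
        print(p, round(game_win_prob(p), 3))
```

Under those assumptions a player winning 55% of points wins roughly 62% of games, and the same compounding happens again from games to sets and from sets to matches.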
Nadal is clearly capable of devastating offence, largely thanks to him possessing one of the greatest forehands in tennis history. But he also realised very early on in this particular rivalry that forcing or encouraging an error out of Federer was closer to game theory optimal (GTO) play than trying to out-winner Federer. While the line between unforced and forced errors in tennis can get a bit blurry and subjective, both Nadal and Djokovic have built part of their enormous empires on blurring that line even further by forcing opponents into strategic dead-ends, i.e. drawing forced errors that look like, and are counted as, unforced errors.
This is not an isolated example in this rivalry. Of the 40 matches Federer and Nadal have played, Federer has made more unforced errors than Nadal 35/40 times. Federer won the only five times this was not the case (Shanghai 2017, Miami 2017, ATP Finals 2011, Hamburg 2007, Wimbledon 2006).2
The same also applies in multiple high-stakes Djokovic-Federer matches, perhaps most famously and recently the 2019 Wimbledon final. Much was written at the time about Federer being the better player overall:
Total points won:
Roger Federer: 218
Novak Djokovic: 204
Winners/Unforced errors:
Roger Federer: 94/61
Novak Djokovic: 54/52
Break points faced in first 4 sets:
Roger Federer: 2 (both in 4th set)
Novak Djokovic: 8
But when it really mattered, i.e. in the three decisive tiebreaks, Djokovic found something closer to GTO play than his opponent:
Unforced errors in the three tiebreakers (combined):
Federer: 11
Djokovic: 0
While unforced error and winner rates don’t neatly or reliably correlate to match outcomes within normal ranges, they’re still instructive in this example of looking at which players have a margin of error edge. And if you want to go a bit deeper, the most interesting thing about that 2019 Wimbledon final between Djokovic and Federer was what actually unfolded in those points in the three tiebreaks:
From my First Four Shots Meme piece a few months ago:
…what’s really telling here is that contrary to the usual average — that 65% of points usually occur in the 0-4 shot range and 35% in 5 shots or more — across those three Wimbledon 2019 Final tiebreakers, only 55% of points were in that 0-4 shot range, and a whopping 45% in 5 shots or more. So in one of the defining pressure moments of Slam matches in the last decade, there was a nearly 50-50 split between short and longer points. In these tense, high-pressure moments of the match, points ran longer than usual, and Djokovic won 11 points over five shots to Federer’s 4. Djokovic actually won more of his total tiebreaker points, in that final, over 5 shots than under 5 shots (11 to 10).
Djokovic pushed Federer into territory where he knew he had an edge, the longer points. Novak’s error rate went to zero, Federer’s ballooned. Reproducibility won. Higher margin won. Something closer to GTO tennis won.
Caveats
Of course I’m being a bit simplistic here and there are plenty of exceptions to all of the above. For example, Federer winning that brutal, long rally in the 5th set of the Australian Open final in 2017 against Nadal to come back and grab the title (and his first win against Nadal in any Slam since 2007!). Or Djokovic slapping a risky return winner, down match point, to initiate a comeback in his 2011 semi final win over Federer at the US Open. Neither of those examples neatly conforms to the idea that Nadal and Djokovic play closer to GTO tennis, with greater available margin, than Federer, or that GTO-esque tennis in the narrow context above is necessarily always the winning strategy. As I mentioned at the beginning, these descriptions paint these three players with overly broad brush strokes. Federer can clearly still thrive outside of that instinctive, in-the-moment zone, even when pushed into longer, less comfortable points. And Nadal and Djokovic, especially in recent years on the back of improved serves and +1 aggression, can also excel in those shorter, more instinctive moments. All three players have regularly confounded those simplistic stereotypes over the past twenty years. But while all three players are far too great to have what could be called a ‘weakness’ in any direction in this regard, the margins are so infinitesimally fine when they compete against each other that these small competitive edges do start to creep into their results. I think therefore, over the course of their entire careers combined, it’s generally true that Nadal and Djokovic have benefited from greater margin and more optimal patterns of play against Federer than Federer has been able to enjoy against either of them, especially when the stakes were at their highest.
In best of five sets (mostly at Slams) Nadal enjoys a dominating 12-5 H2H over Federer and Djokovic isn’t far behind with an 11-7 H2H edge against the Swiss. But in best of three matches Federer and Djokovic are tied at 16-16 and Nadal only narrowly leads Federer 12-11. It stands to reason that over the course of longer matches, i.e. best of five sets, GTO-esque play would be less affected by variance. This is why you’ll hear poker players talk about things like ‘EV’ or expected value when determining whether their plays were correct. A short poker session over just a few hundred hands could still be a losing session even if the human player or AI is playing pretty much perfectly or game theory optimal poker. But the greater the number of hands the more likely the GTO player is to beat the game against non-GTO competition as factors like variance and luck become less significant. Nadal and Djokovic being able to play closer to GTO tennis than Federer may perhaps end up enabling the most significant edge of all in that rivalry triangle: winning more of the crucial matches in the longer-format Slams which contribute to the total Slam count that fans consider so important.
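The variance point can be sketched with a few lines of arithmetic. This is a simplified model of my own, not a claim about how these matches actually played out: it assumes each set is an independent coin flip won with a fixed probability `p_set`, and the 55% figure is a made-up illustrative edge. The function name `match_win_prob` is invented for this example.

```python
from math import comb


def match_win_prob(p_set, best_of):
    """Chance of winning a best-of-`best_of` match given a fixed per-set win probability.

    Assumes sets are independent and p_set never changes, which is a simplification.
    """
    need = best_of // 2 + 1          # sets required: 2 in best of three, 3 in best of five
    q = 1.0 - p_set
    # Sum over the number of sets lost (k) before clinching the final required set.
    return sum(comb(need - 1 + k, k) * p_set**need * q**k for k in range(need))


if __name__ == "__main__":
    p = 0.55  # hypothetical per-set edge
    print("best of three:", round(match_win_prob(p, 3), 3))
    print("best of five: ", round(match_win_prob(p, 5), 3))
```

Under those assumptions the same 55% per-set edge is worth about 57.5% of best of three matches but about 59.3% of best of five matches. The longer format gives the small edge more room to show up, which is the same logic as a poker player needing a large sample of hands.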
One of the main caveats here is that tennis, unlike Chess, Go and Poker, has many more shifting variables when it comes to playing conditions: court surface (grass, hard, clay), court speed via the amount of grit on hard courts or seed type on grass courts, evolving technology like string and racquet composition, and so on. There is certainly an argument to be made that Federer, against a different backdrop of conditions (maybe smaller/heavier racquets, less forgiving strings, a mostly grass tour, or some combination of other conditions) would have been the one to be able to play closer to GTO than Nadal and Djokovic. After all, GTO tennis in the serve-and-volley-heavy periods of the 80’s and 90’s would have likely been very different to GTO orientated tennis in the 2000’s and 2010’s.
I would also argue that Djokovic is probably the closest to GTO out of the three players overall3, even more so than Nadal, purely because his balanced play-style works, relatively unchanged, across all three current (arguably homogenised) surfaces, while also enjoying the benefit of about 65% of the elite tennis calendar on his preferred surface (hard courts). Djokovic’s recent history of ageing extremely well in terms of movement and physical prowess also means his ability to play something like tennis’ version of GTO play remains largely intact in his mid-thirties, especially compared to Nadal who has had some of his equivalent GTO-friendly tools blunted by repeated injuries and declining movement. Djokovic also has no current equal when it comes to winning matches despite losing the first set, which one could speculatively argue might suggest that Novak’s own little version of backpropagation is superior in how he learns from previous mistakes or sub-optimal play, and then optimises, in the moment.
What of Nadal and Djokovic’s rivalry?
I’ll save this for another time (this piece is already too long), but one of the reasons I think the Nadal and Djokovic rivalry is so interesting is because it represents the clashing of two players who, for much of the last 15 years, often both played something like tennis’ version of GTO play, especially in their primes (their rivalry is much more predictable by surface post-prime). Two largely unstoppable, yet heavily optimised and adaptable, forces colliding. And because Nadal and Djokovic had this unusual ability to play near-optimal tennis so consistently (at least relatively speaking), both were often forced into what would usually be sub-optimal or unusually risky points to find the edge against one another (e.g. Nadal having to hit 23 winners in the 5th set of their 2013 Roland Garros semi final to claw it back)! This is partly why I think those two pushed each other in ways which no one else could across some of their epics at the Australian Open (2012), Roland Garros (2013) and Wimbledon (2018). The edge in some of the biggest matches between prime Nadal and Djokovic felt thinner than any rivalry I’ve ever seen.
Evolving meta
Because tennis remains a more fluid and changing game than some of the simpler table top games, GTO strategies, or perhaps the meta (‘most effective tactic available’), also change with it. The game, and its environment and tools, are not the same as they were twenty to thirty years ago.
Nadal has contributed to evolving playstyles in tennis not only because he continued the evolution of topspin enabled, high-margin baseline aggression, but also by way of things like his pioneering of extremely deep return positions which have now been adopted by other elite, younger players like Thiem, Zverev, Medvedev (who was statistically the best hard court returner on tour last year) and many others. And Djokovic’s open stance movement and sliding has also been the inspiration for many younger players from Auger-Aliassime, to Sinner, to Medvedev to Alcaraz etc. Novak has led a movement revolution that has shrunk the court for opponents looking to attack by enabling a crop of flexible players who can play shots on the full run, slide, and then recover back to the middle of the court in record time ready for the next shot. These traits from Nadal and Djokovic, which either directly or indirectly influenced the coming generations, are part of what supplied those two legends with that greater margin of error mentioned above, allowing them to play something closer to GTO tennis than many of their rivals. Kei Nishikori, who has competed at the top level for most of the last 15 years and played both older and younger competitors, alluded to this in an interview a week or so ago:
Medvedev and Zverev, as arguably the two most successful younger players right now, have both taken bits of the Nadal/Djokovic abilities above, like going shopping in the supermarket of tennis attributes, while also leveraging their god given height to base their games on top of nearly impenetrable service games. The result can often manifest as Isner-like service performance combined with Djokovic/Nadal-like return game stickiness (combining to form the intimidating Frankenstein of the ‘Big Serving Counterpuncher’).
Players working out solutions to current metas or strategies in tennis is one of the most interesting things about the sport. Djokovic recently ‘solved’ the Medvedev mix of ‘big serving counterpunching’ in Paris by rushing the net and serve and volleying (Nadal did something similar in the US Open final in 2019) to remove Medvedev’s often GTO-esque human backboard strategy on return of serve. To bring it back to the AI analogy, you can think of these bits of strategy solutions as human tennis players creating their own versions of new branches in their biological version of the Monte Carlo tree search. Humans can’t utilise ‘self-play’ to the degree or efficiency with which tabletop game AI does, or hope to fundamentally ‘solve’ a sport. But every time a player meets an opponent on court, new branches of information that inform their strategy (both subconscious and conscious) are grown, and incremental improvements and adaptations are made. The very best manage those adaptations and optimisations better, and faster, than anyone else.
GTO in the 2020’s
I’ve used ‘GTO’ or ‘game theory optimal’ rather loosely throughout this piece. Mostly because I don’t think it’s possible for any tennis player to actually manage game theory optimal play for myriad obvious reasons, and yet I still think it’s a valuable, if overly-general, way of framing some of the differences between Federer and Nadal/Djokovic (and more broadly the competitive landscape of tennis in coming years). This shouldn’t be read as a marginalisation of Federer, as I think what the Swiss did in the face of two less exploitable, all-time great foes, is nothing short of incredible, and his place alongside Nadal and Djokovic always had a good chance of representing a case of unfortunate timing considering his age relative to his two biggest rivals. Federer also had moments, even late in his career such as his extraordinary 2017 season, where he sustained a level of relatively low-margin, aggressive tennis for longer, and with more consistency, than just about any player I can remember. I’m not sure I will ever see a player pull off what Federer did that year, in quite the same way, again.
In this way, one of the great things about the Big 3 is that they are actually quite distinct in multiple ways. Federer’s differences from Nadal and Djokovic have made this last 15+ years all the more fascinating when they’ve clashed.
So as we look forward to the next decade of elite tennis a few things become clear. Federer, Nadal and Djokovic will soon be retired and the players who are looking most likely to dominate the game after they do so are the balanced and GTO-leaning Medvedev (No.2) and Zverev (No.3) (neither can claim to be as great as Nadal or Djokovic but their places amongst, and competitive edges relative to, their competition stand a chance at ending up looking slightly familiar). Medvedev and Zverev are followed by the more strength specific Tsitsipas (whose own game is actually very high margin and balanced on clay, but less so on other surfaces for now), and Rublev, Berrettini, Shapovalov et al. Sinner and Alcaraz will also both increasingly fit into this equation in years to come but both have an interesting mix of huge baseline weapons, solid returns, and slightly underpowered serves relative to some of the rest of the young elite right now, making their slotting into the top of the game slightly harder to predict.
Tennis will not be ‘solved’ by superhuman AI players anytime soon (thank god). But as we come to the sunset of this golden generation, the three defining players of men’s modern tennis have contributed all sorts of interesting information to tennis’ collective Monte Carlo tree search. Whether the next group of branches and optimisations are tiny refinements or larger leaps of progress is up to the next crop of players, their problem solving abilities and evolving genetic physical gifts.
For now all that’s left is to appreciate Federer, Nadal and Djokovic for not only enthralling tennis fans for much of the last two decades, but also for running all those oh-so-human simulations so entertainingly and effectively in their 148 combined meetings. Because after all, any game progresses fastest when its very best competitors clash as often as possible, throwing out rich beams of information every time something in their matchup works or fails, growing that tree of whether something should or shouldn’t be done in the future. Those three legends have populated tennis’ tree with new, significant, and bountiful branches faster than perhaps any other players in history.
GTO GOATs.
— MW
Twitter: @mattracquet
The next issue for paid subscribers will be Sunday’s usual analysis (2021 Scorecard) but it’ll be sent out on Tuesday, after Christmas, instead.
Happy Christmas to those who celebrate it.
Top image: combination of Tim Clayton/Corbis via Getty and Jean Catuffe/Getty
Not only did one of Federer’s biggest strengths (serve) happen to bump into two of the best returners of all time in Djokovic and Nadal, but Federer’s slice backhand would usually be the higher margin play when defending or returning out of his backhand corner (indeed this helped him be a complete, consistent baseliner for some of his better years on tour in 2004-2007 and in many other non-Big-3 matchups). But part of the bad luck of having to contend with Nadal was that Rafa just tore through Federer’s slice backhand on all surfaces, rendering that a less successful option for Federer. Federer was therefore faced with a choice: he could hit over his backhand to drive it, which was extremely hard to do consistently in the face of Nadal’s topspin and lefty slider serve and meant lots of errors, or he could slice, which would be met with Nadal winners/aggression. A cruel choice. Some of Federer’s high margin, or what would have been closer to GTO, plays were therefore largely taken away from him against his biggest rival.
I think Nadal and Djokovic were fairly even in this regard (i.e. the ability to execute optimal and reproducible strategy consistently when it mattered) when both were at the peaks of their powers, and that Federer wasn’t actually far behind either. Federer may be coming off poorly in some parts of this piece but, for perspective, the zoomed out version of my mental model for the GTO spectrum of pro players is something like:
(Close to GTO) Nadal/Djokovic...Federer………………………….……………………………….Bublik (Non GTO)