Howdy!
In my second blog post in this series, Networking in Games, I did an introductory dive into how networked games work and the different steps that make up the networking framework for a game. In the section of that post describing Application Logic, I referenced several techniques that multiplayer games use to mitigate the network latency experienced by the player, and today I plan to do a deeper dive into these various strategies.
Following this post, I plan to spend the next couple of weeks implementing all of these latency mitigation techniques myself while documenting my progress here in this blog series. I also want to reiterate that I do not wish to pass off any of this information as entirely correct or as techniques that I came up with on my own. This post is merely meant to summarize what I have learned while researching latency mitigation strategies this past week.
Why do any of this?
Before we get into the various latency mitigation techniques, it's important to reiterate why these techniques are needed in the first place. As I laid out in previous posts, client-server network architectures where the server is authoritative are often employed in modern multiplayer games for a variety of reasons, one of the primary ones being to make it harder for players to cheat. With the server acting as a neutral party that hosts and controls the game state, it becomes much more difficult for an individual player to manipulate game data and ruin the experience for other players. However, this network architecture comes with some physical limitations. Data can only be transmitted across the Internet so fast, and even in the most optimal network environments there will be some amount of latency accrued when sending data from clients to the server and back. The naïve implementation of application logic would then be to simply send client inputs to the server, have the server act on those inputs, and send the results back to each client. But if we were to wait for the server to validate every input from the player before acting on it in the client, we would end up with a situation like the one shown below:
In this example, the player presses the right arrow and then waits 100ms for the server to acknowledge that the input was valid before the action is displayed on the screen. This is unacceptable behavior for most modern games, and thus latency mitigation strategies are employed to reduce how much the player feels the cost of the round trip to the server while still keeping the server as the central authority over the game.
Deterministic Lockstep - An Aside
It is worth mentioning that some games actually do use this method of sending only a player's input to the server, with a technique known as deterministic lockstep. RTS games are an example of games that tend to use deterministic lockstep. Without diving into it too deeply, deterministic lockstep is a technique that relies on the game's physics simulation being completely deterministic, meaning that the exact same series of inputs will produce the exact same physics simulation every time, no matter what.
This might sound simple but even the smallest amount of variance in the simulation, even something as small as how different operating systems handle floating point precision, can cause two physics simulations to diverge drastically as shown in the video below from Glenn Fiedler's article on Deterministic Lockstep (linked at the bottom of this post):
However, if this variance can be removed entirely (again, no small feat), then the same physics simulation can be run on both server and client machines. Clients send their input to the server; the server validates the inputs, applies them to its own physics simulation, and sends them back to the clients. Upon receiving the validated inputs, each client applies them to its own simulation and, because the physics simulations are deterministic, they will produce the exact same result on both the server and the clients. The clients will need to be running their simulation a packet or two behind server time, but the end result is smooth gameplay transmitted using less bandwidth:
This level of determinism simply isn't feasible for some games though and it becomes harder to achieve as the number of players, the number of inputs, and the complexity of those inputs begins to grow. Thus, other latency mitigation strategies are employed in most modern networked games to create a better feel for players.
Latency Mitigation Strategies
There are four main latency mitigation strategies that I have found reiterated across the sources I reviewed during my research for this week:
Client-Side Prediction and Server Reconciliation
Entity Extrapolation (also known as dead reckoning)
Entity Interpolation
Lag Compensation
A lot of these techniques build off one another, and most modern networked games employ some combination of these strategies to give the illusion of a lag-free gameplay experience.
Client-Side Prediction and Server Reconciliation
Client-side prediction is a technique that relies on the knowledge that, while we need an authoritative server to prevent cheating, most players are not going to be cheating.
Using this knowledge, we can assume that the client's inputs are valid and immediately apply them locally while at the same time we send them to the server to be validated. The server then checks the input as normal and sends the validated game state back down to the client. The client rectifies its original prediction as necessary based on this validated game state. The end result is that player input feels as if it is processed immediately on the client side while still validating that input on the server:
But, this assumes that the roundtrip cost to the server is the same as the time it takes to move the character and this is rarely, if ever, the case. A more realistic example would look something like this:
In this example, the client moves twice before the server has had time to acknowledge the player's inputs, and thus when the server sends the updated game state down, it is one input behind the client. The result, with client-side prediction implemented as described so far, is that the player will move twice on their screen, snap back to their position after the first input, and then snap forward again to their predicted position once the second input is validated. This is obviously not an ideal player experience, but luckily it can be solved pretty easily using a technique known as server reconciliation.
Server reconciliation attaches a sequence number or a timestamp to each input sent from the client to the server. The client also keeps a copy of each input it sends. Now, when the client receives a validated game state from the server (with an associated sequence number/timestamp), the client discards its own copies of the acknowledged inputs and reapplies all of the inputs that the server has yet to validate. This process is shown in the diagram below:
With client-side prediction and server reconciliation, the player's inputs can now be predicted locally and validated by the server with minimal noticeable latency!
Entity Extrapolation and Interpolation
With the techniques discussed so far, we can successfully mitigate the latency of a player's own inputs, but these techniques don't handle other networked entities in the game, such as other players. Looking at the same diagram from before but with the addition of other clients, the problem becomes more recognizable:
While Client 1 sees all input being played out as intended, if Client 2 has to wait for a server update to see any input from Client 1, then Client 1 will appear to be jumping around the map at random intervals, which is not ideal. There are two main methods for dealing with the latency of other networked entities: extrapolation and interpolation.
Extrapolation
Extrapolation, also commonly referred to as dead reckoning, describes the technique of sending additional information from the server to each client about the other networked entities that allows each client to predict or extrapolate what other networked entities might be doing next.
The idea is that if each client knows where every other client is, along with the direction and speed at which that client is moving, it can make a reasonable guess as to where that client will be in the upcoming frames until the next game state update arrives from the server. When the server update finally arrives, the game state is corrected to whatever the actual positions of all the entities should be. If the client predicted the movement correctly then the illusion works, but if the prediction is incorrect it can cause significant rubber banding. This technique works best with slow-moving entities that don't change direction quickly, but it won't work at all for games where a player's speed and direction can change instantly, like in a 3D first-person shooter.
Interpolation
Interpolation addresses the pitfalls of extrapolation by instead interpolating between two known game states sent from the server. Rather than using the movement data in each server update to predict where an entity is going, the client interpolates between the previous game state received from the server and the one that was just received. This means that each client sees all other clients one server update behind where they actually are. This is illustrated in the diagram below:
Now though, each client sees all other clients in positions they were actually at, and thus no rubber banding can occur. Each client does technically see all other clients in the past, but the delay of one server update is generally not noticeable at all.
Lag Compensation
In situations where timing and spatial accuracy are important, one additional technique is employed to cover one of the pitfalls of entity interpolation: lag compensation. The classic example where lag compensation is used is in first-person shooters, where players need to be able to aim at another player, fire their weapon, and hit the other player. Using the strategies we've discussed so far, if Player 1 were to attempt this, even if they aimed perfectly at the head of a moving target, they would never be able to hit Player 2 because all clients see all other clients in the past. By the time the shot is processed on the server, Player 2 has already moved from the position they were shot at, and the server counts the shot as a miss. Lag compensation addresses this by reading the timestamp of Player 1's shot, rolling the server's simulation back to the moment Player 1 fired, when Player 2 was directly in their crosshairs, and processing the shot from that position. This correctly identifies the shot as a hit, and thus shooting feels more accurate, even in a networked environment. The only pitfall of lag compensation is from Player 2's perspective. Player 2 continues to move for some time before the server processes Player 1's shot, and thus if Player 2 is moving particularly fast, or if they move behind a wall before the shot is processed, the illusion is broken and Player 2 can be left feeling like they were killed out of nowhere. This is the tradeoff of lag compensation, but it is a design decision that prioritizes the instigating client's timeline so that moments like gunplay feel accurate and fair.
And those are some of the main latency mitigation strategies employed in modern networked games! Next week I will begin implementing them, starting with the naïve implementation and client-side prediction.
Resources:
Fast-Paced Multiplayer by Gabriel Gambatta:
Networked Physics (2004) by Glenn Fiedler:
Networked Physics (2014) by Glenn Fiedler:
Floating Point Determinism by Glenn Fiedler:
Latency Compensating Methods in Client/Server In-game Protocol Design and Optimization by Yahn W. Bernier:
Quake Engine Code Review by Fabien Sanglard: https://fabiensanglard.net/quakeSource/index.php