The story of the non-deterministic replay

This is the story of how I discovered that my simplified replay system wasn't as deterministic as I believed, because of an ugly bug. Read the post if you want to know exactly where it was.

While integrating the lockstep engine into the game I am working on, I decided to add a way to save and load replays so I could easily reproduce some bugs I was experiencing. Once that was done and working, it was pretty awesome to see that I could replay the same game multiple times (food for another blog post, by the way). Then I thought it could be fun and easy to play replays faster, so why not.

Since I have fixed timestep logic, it should be pretty straightforward: simply multiply the time passed to the update, and the fixed timestep logic does all the work without the game logic noticing the change. I decided to give it a try and it worked... almost. When playing the replays at higher speeds I noticed some visual differences, but I wasn't totally sure (it could have been the interpolation code).
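The idea of speeding up a replay through the accumulator can be sketched like this. It is a minimal Python sketch, not the engine's Unity code: `FixedTimestep`, `step_fn` and the `speed` parameter are hypothetical names, and the default step length is an assumption.

```python
FIXED_DT = 1.0 / 30.0  # assumed fixed step length, in seconds


class FixedTimestep:
    """Accumulates (scaled) frame time and runs whole fixed steps."""

    def __init__(self, step_fn, fixed_dt=FIXED_DT):
        self.step_fn = step_fn      # called once per fixed step with the frame number
        self.fixed_dt = fixed_dt
        self.accumulator = 0.0
        self.frame = 0

    def update(self, dt, speed=1.0):
        # Only the incoming delta is scaled; every step still advances by fixed_dt,
        # so the game logic cannot tell a fast replay from a normal one.
        self.accumulator += dt * speed
        while self.accumulator >= self.fixed_dt:
            self.accumulator -= self.fixed_dt
            self.step_fn(self.frame)
            self.frame += 1
```

With a large `speed`, a single call to `update` runs many fixed steps in a row, which is exactly the situation that matters later in this story.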

To verify, I went back to the test project, where I had the moving box, and tested it there, but I needed some way to be sure. Since I already have a way to calculate checksums of the game state, I used that to verify the game states when playing replays at different speeds (from 2x to 16x).

It failed. However, only some frames failed to validate, and the frames after a failed one were not necessarily invalid (this is something important to consider).

Image 1: It shows one of the best tools in the world to check game states when replaying a game.

So I was right: I had seen some differences, and now I could be sure that something was happening. The thing was, with only the checksums I couldn't know what the real difference was. Next step: making something to detect it.

To do that, I had to start saving the game state itself (at least for debug), not only the checksum, and, whenever checksum validation fails while replaying, compare the game state stored in the replay with the current game state. It worked too: now I have the exact place where the differences are.

Image 2: It shows why serializing all the game state in one string is the best thing to do in your life.
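Finding where two game states differ can be sketched as follows. This is a Python sketch under an assumption: the state is kept here as a dictionary of named values, while my actual state is a single string, and `diff_states` is a hypothetical helper name.

```python
def diff_states(recorded, current):
    """Return {field: (recorded_value, current_value)} for every mismatch.

    Meant to be called only after checksum validation has failed, using the
    state stored in the replay and the state of the running simulation.
    """
    fields = sorted(set(recorded) | set(current))
    return {
        field: (recorded.get(field), current.get(field))
        for field in fields
        if recorded.get(field) != current.get(field)
    }
```

For example, `diff_states({"box.x": 10, "box.y": 5}, {"box.x": 12, "box.y": 5})` reports only `box.x`, with both the recorded and the current value.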

After testing it a bit, I noticed another curious thing: validation wasn't always failing given the same replay and the same speed. That gave me a hint that the problem was probably not related to the game code itself (the moving box).

So, if I played the replay at 1x, it validated properly. If I played it at 8x, it failed most of the time, but not always. So there was something related to speed that I didn't understand yet.

I decided to test the same replay with the Unity timescale modified. My first test used 1x for the replay but 5x for the timescale, and validation failed. Then I tried the opposite, 10x for the replay but 0.1x for the timescale, and it worked well (in that case the effective speed is 10 × 0.1 = 1x, so the delta time reaching the update was back to normal). So the problem seemed to be related to my accumulator logic inside the fixed timestep logic?

Some test cycles later, it turned out that it was indeed a bug in one of the core classes of the engine!

The problem was in my LockstepFixedUpdate class. The first version overrode the Update() method and performed the lockstep logic there. It worked OK as long as at most one fixed update was processed per Update(), but when a big delta time arrived, it only processed the lockstep logic once, before the first fixed update, and never again.

That means that if replay commands were to be processed in frame 3, we were in frame 1 and a big dt of 10 frames arrived, then the lockstep logic checked only in frame 1 and never again until all 10 frames were processed. This bug even bypasses the lockstep!

Since I made a test to replicate the bug, it was really easy to fix: I changed the code to process the lockstep logic with each fixed step update, and it works fine now. I have high speed replays!! YEAH!!
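The bug and the fix can be reproduced in a few lines. This is a Python sketch of the idea, not the real LockstepFixedUpdate class: `Lockstep`, `update_buggy` and `update_fixed` are hypothetical names, and `steps` stands for the number of fixed steps a big delta time produces.

```python
class Lockstep:
    """Holds commands scheduled for specific fixed-step frames."""

    def __init__(self, commands_by_frame):
        self.commands_by_frame = commands_by_frame
        self.processed = []  # (frame, command) pairs, in execution order

    def process(self, frame):
        for command in self.commands_by_frame.get(frame, []):
            self.processed.append((frame, command))


def update_buggy(lockstep, frame, steps):
    """First version: lockstep is checked once per update, not once per step."""
    lockstep.process(frame)
    return frame + steps  # all pending fixed steps run without checking again


def update_fixed(lockstep, frame, steps):
    """Fixed version: lockstep is checked before every fixed step."""
    for _ in range(steps):
        lockstep.process(frame)
        frame += 1
    return frame
```

With a command scheduled for frame 3, starting at frame 1 and running 10 steps, `update_buggy` never executes the command, while `update_fixed` executes it at frame 3.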

Conclusion

In the process of finding this bug I started to expand the engine support and create better tools, which is really important if I want to build something solid on top of it.

The only way to detect issues as soon as possible is to iterate on the engine as soon as possible. To do that, use cases are needed, and games provide the best use cases. In my case, I am using not only the game I am trying to make but also other similar games as references when deciding what I want and how I want to test it: for example, having replays, being able to play a replay at different speeds, being able to save replays, etc. Also, being able to replicate a bug in a small test case where you can iterate quickly to fix it is super useful.

Detecting (and having) problems like this in a small and simple game gives an idea of the complexity of a medium to big game, with all its variables and difficulties. It is not something to underestimate, so when developers say they couldn't add multiplayer features to their game because it was really hard to do, it is not a lie.

I love all of this stuff, even though I understand it is not an easy path.

To complete this post, here is a video showing a prototype of how I load and play a replay, which was created by playing with two players over LAN (one on my computer and the other on my phone):

The quote of the day is 'Fail as much as possible, as soon as possible to avoid failing when it is too late'.

Hope you enjoyed the journey.


Lockstep multiplayer first steps

In this Gamasutra article, Tundra developers explain their approach to minimize problems when developing a lockstep multiplayer platform used for their game Rapture - World Conquest. Based on that, my first approach was to create and understand a deterministic lockstep logic without even considering networking yet, to focus on one problem at a time.

My objective is to have a reproducible game: given the start state and all the player actions over time, the game can be played again and the final state will be the same.

Lockstep logic

The first step was to create the logic that encapsulates the fixed gamestep for physics and game logic, plus a lockstep to perform player actions. As I started without considering networking, all the player actions are enqueued locally, either directly from user input or from a saved replay.

Lockstep logic means that there are some conditions that, if not met, make the game pause the simulation and wait for them. In multiplayer games this happens when the game has to wait for other players' actions that didn't arrive in time (this is when some games show the "waiting for other players" dialog).

My code looks something like this pseudo code:

update(dt) {
    if (lockstepLogic.IsLockstepTurn()) {
        if (!lockstepLogic.IsReady())
            return;
        lockstepLogic.Process();
    }
    // normal accumulator logic for the fixed gamestep
}

Since I create all player actions locally, the lockstep wait never happens right now, but it is good practice to simulate and test it anyway.
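To see the waiting behavior without any networking, the multiplayer case can be simulated. This is a Python sketch under assumptions: `LockstepLogic`, `receive`, `try_step` and the turn length of 4 frames are all hypothetical, and "ready" here simply means every player has delivered actions for that turn.

```python
class LockstepLogic:
    """Tracks which players have delivered actions for each lockstep turn."""

    def __init__(self, player_ids, frames_per_turn=4):
        self.player_ids = set(player_ids)
        self.frames_per_turn = frames_per_turn
        self.received = {}  # turn frame -> {player_id: actions}

    def receive(self, frame, player_id, actions):
        self.received.setdefault(frame, {})[player_id] = actions

    def is_lockstep_turn(self, frame):
        return frame % self.frames_per_turn == 0

    def is_ready(self, frame):
        # Ready only when every player has sent something for this turn.
        return self.player_ids <= set(self.received.get(frame, {}))


def try_step(lockstep, frame):
    """Return the next frame, or the same frame if the simulation must wait."""
    if lockstep.is_lockstep_turn(frame) and not lockstep.is_ready(frame):
        return frame
    return frame + 1
```

Calling `try_step` repeatedly on a turn frame keeps returning the same frame until the missing player's actions are passed to `receive`, which is the pause described above.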

Testing it

My first prototype using this logic is a box that moves over the screen. When the right mouse button is pressed, the game enqueues a player action to move the box to the specified position. Then, when the lockstep logic is processed, the box receives the command and starts moving to that position.

Interpolation

The moving logic was pretty simple: based on start and destination positions and a speed, the box moves itself to the destination in a straight line. Since that logic is performed in each fixed gamestep, the player sees the box jumping between positions. It works, but it doesn't look so good. To improve that and make the movement smoother, I created some simple interpolation code.

Interpolation depends a lot on what you are trying to interpolate. In some cases it can be as simple as the code I used here, but other cases, like a bouncing ball, need more data to produce a better interpolation.

Note: adding interpolation means we now have a delay of one fixed gamestep between the box position being rendered and the real position in the game, since we are using the previous and current positions for the interpolation.
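A simple version of that interpolation can be sketched as below. This is a Python sketch assuming linear interpolation driven by the leftover accumulator time; `lerp` and `render_position` are hypothetical names, and positions are plain `(x, y)` tuples.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def render_position(previous, current, accumulator, fixed_dt):
    """Blend the last two simulated positions for rendering.

    alpha is how far we are into the next fixed step. At alpha == 0 the box is
    drawn at the previous position, which is the one-step delay noted above.
    """
    alpha = accumulator / fixed_dt
    return (
        lerp(previous[0], current[0], alpha),
        lerp(previous[1], current[1], alpha),
    )
```

The simulation itself never reads the interpolated value, so the rendering stays purely visual and does not affect the game state.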

My first replays

By recording all the player actions along with the fixed gamestep frame in which they were executed, I created a basic way to replay the "game": just reset the game state to the initial state and enqueue each recorded action in its corresponding frame.
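The record and replay cycle can be sketched like this. It is a Python sketch with made-up pieces: `BoxGame` is a one-axis stand-in for the moving box that uses integers only, and `Replay` and `play_replay` are hypothetical names.

```python
class BoxGame:
    """Tiny deterministic 'game': a box walking along one axis in integer steps."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.frame = 0
        self.x = 0
        self.target = 0

    def step(self, actions):
        # Apply the actions scheduled for this frame, then advance one fixed step.
        for target in actions:
            self.target = target
        if self.x < self.target:
            self.x += 1
        elif self.x > self.target:
            self.x -= 1
        self.frame += 1


class Replay:
    """Stores every action together with the frame it was executed in."""

    def __init__(self):
        self.actions_by_frame = {}

    def record(self, frame, action):
        self.actions_by_frame.setdefault(frame, []).append(action)

    def actions_for(self, frame):
        return self.actions_by_frame.get(frame, [])


def play_replay(game, replay, total_frames):
    """Reset to the initial state, then feed each recorded action at its frame."""
    game.reset()
    for frame in range(total_frames):
        game.step(replay.actions_for(frame))
    return game.x
```

If the live session and `play_replay` end with the same `x` for the same number of frames, the replay reproduced the game.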

Here is a video of the progress so far:

Yeah, I know, it's not a great thing, but at least I am starting to understand some of the challenges of making multiplayer games.

Validating simulation

To validate the game is always performing the same simulation given the same input, I added a checksum calculation based on the game state (for now just the moving box state) which is saved from time to time to use later as validation when simulating the game again. The idea was to start defining an API to get the important game state to consider when validating the simulation, and also to start testing game state validation. The code looks like this:

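void Update(int frame) {
    // Checksums are only recorded or validated on selected frames.
    if (IsChecksumFrame(frame)) {
        if (recording) {
            // Recording: store the checksum of the current state for this frame.
            checksumRecorder.RecordState(frame, CalculateChecksum());
        } else if (!checksumValidator.IsValid(frame,
                   checksumRecorder.SavedChecksums, CalculateChecksum())) {
            // Replaying: the current state must match what was recorded.
            throw new Exception("current game state is not valid!");
        }
    }
}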

In my first tests I wasn't resetting the game to the initial state properly, so my game was producing different checksums when replaying the saved player actions. Checksum validation started to work after that was fixed.

How am I calculating the checksum? Right now I am not sure exactly which algorithm to use nor which game state I should consider for the checksum, so I encapsulated both behind some interfaces and implemented a basic version of each. For the checksum I am using a simple MD5 over a string, and for the game state, well..., a string with all the important values concatenated (the moving box position, destination, whether it is moving or not, etc.).

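string CalculateChecksum() {
    // Hash the serialized game state; equal states give equal checksums.
    string gameStateValue = game.GetGameState();
    return MD5.CalculateHash(gameStateValue);
}

string GetGameState() {
    // Concatenate the state of every object into one string.
    string gameState = "";
    foreach (var gameObject in gameObjects)
        gameState += gameObject.GetGameState();
    return gameState;
}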
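For anyone who wants to try the same idea outside Unity, here is a runnable Python sketch using the standard `hashlib` module. The `Box` class and the exact format of its state string are assumptions made for the example.

```python
import hashlib


class Box:
    """Stand-in for the moving box; only simulation-relevant values are kept."""

    def __init__(self, x, y, dest_x, dest_y, moving):
        self.x, self.y = x, y
        self.dest_x, self.dest_y = dest_x, dest_y
        self.moving = moving

    def get_game_state(self):
        return f"box:{self.x},{self.y},{self.dest_x},{self.dest_y},{self.moving};"


def get_game_state(game_objects):
    """Concatenate the state of every object, in list order."""
    return "".join(obj.get_game_state() for obj in game_objects)


def calculate_checksum(game_objects):
    """MD5 hex digest of the serialized game state."""
    state = get_game_state(game_objects)
    return hashlib.md5(state.encode("utf-8")).hexdigest()
```

Two lists of objects with identical values produce the same checksum, and changing any single value changes it, which is all the validation needs. MD5 is used here only as a fingerprint, not for security.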

For a future implementation, what I know is that the game state should be composed of the important values that can affect the simulation, so I shouldn't care about audiovisual stuff like particles, effects and sounds.

Also, the game state concept could be used to reset to the initial game state or even for saved game states (to easily replicate some bug for example), and finally, it could be used to synchronize and validate state if for some reason I end up using a client/server architecture.

Next frontier: Determinism?

For now I haven't explored the determinism realm because the solution really depends on the game logic, but at the same time I must have it clear before starting the game code. One of the next steps is probably to start testing with fixed point math, though I'm not sure yet. The idea is to follow an approach similar to the Gamasutra article's, reducing non-determinism problems to the minimum before going multiplayer.
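As a taste of what fixed point math looks like, here is a small Python sketch. It is only an illustration of the general technique, not something I have tested in the engine: the 16.16 format and all the function names are assumptions.

```python
FRACTION_BITS = 16          # assumed 16.16 format: 16 fractional bits
ONE = 1 << FRACTION_BITS    # the fixed point representation of 1.0


def to_fixed(value):
    """Convert a float to fixed point (done once, e.g. when loading data)."""
    return int(round(value * ONE))


def to_float(fixed):
    """Convert back to float, for rendering or debugging only."""
    return fixed / ONE


def fixed_mul(a, b):
    """Multiply two fixed point numbers using integer math only."""
    return (a * b) >> FRACTION_BITS


def fixed_div(a, b):
    """Divide two fixed point numbers using integer math only."""
    return (a << FRACTION_BITS) // b
```

The values are not more precise than floats (adding `to_fixed(0.1)` ten times gives 65540, not `ONE`), but the simulation only does integer operations, so the same inputs give the same bits on every device.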

If you want to take a look at all the code used for this blog post, here is the link.

Other links

Example of a dynamic lockstep implementation for Unity

Lockstep Framework for Unity

Reddit post about that Lockstep Framework

Another framework named lockstep.io

 


Exploring Remote Multiplayer

Some time ago I started to prototype a multiplayer game. It is a kind of super simplified RTS game for mobile devices where macro decisions are encapsulated into one action through a button.

First versions were prototyped in a multiplayer hot seat fashion by using only one device (a tablet or a phone), that allowed me to quickly iterate between different ideas and find fun as soon as possible. That was a successful approach since the idea of the game proved that it was indeed fun to play with friends, so the next step was to go remote multiplayer with two or more devices.

Since networking is a whole new world for me, I started by reading lots of different articles about making multiplayer games.

The first thing to know is that the common solutions depend on the kind of game, and knowing that beforehand helps you decide the best solution for your game and how your game must be adapted to support it.

In my case, I am developing a really simplified version of an RTS game for mobile, similar to Clash Royale. It is a 1v1 game where each player controls multiple units by giving high level orders like “everyone attack!” or “build new supply depot”.

The common networking approach for this kind of game, where synchronizing the whole world is a bit heavy given there are tons of units, is to run a synchronized simulation where only the main actions are transmitted (to save bandwidth) and each machine/device performs the same simulation. This implies the game should be deterministic in order to avoid desynchronization between players: given one start state of the game and a list of actions over time, the final state should always be the same.

Making a game deterministic is a really hard problem, but it also has its rewards, like being able to replay the game given the actions of each player over time. This is a great debug tool for developers and at the same time a great feature for players, since it can be used to study other strategies or to share a great victory with friends, all with almost no storage cost.

The idea is to start writing my findings in the remote multiplayer journey to share all the lessons learned.

References

These are some of the articles I read:

What every programmer needs to know about game networking

1500 Archers on a 28.8: Network Programming in Age of Empires and Beyond

Fast-Paced Multiplayer

Why adding multiplayer makes game coding twice as much work

Core network structures for games

Making Fast-Paced Multiplayer Networked Games is Hard

How to create an online multiplayer game with Unity

Networking in Unity

Creating a Cross-Platform Multiplayer Game in Unity

Understanding Fighting Game Networking

Desyncs and FPU synchronization

Cross platform RTS synchronization and floating point indeterminism

Minimizing the Pain of Lockstep Multiplayer

The Tech of Planetary Annihilation: ChronoCam

Synchronous RTS Engines and a Tale of Desyncs

Don’t Store That in a Float

 
