Networked Games -- Dynamic Shared States

    Dynamic Shared States

    By now you have already gotten a taste of sharing objects across the network, mainly focusing on the life cycle of objects and how to identify and send updates (reliable as well as unreliable). In this lecture, let us really take a look at how to achieve global consistency (or not, by being close enough :-).

    The first lesson here is that, in any rigorous sense, it is actually impossible to have exact global consistency and still be practically useful. Consider the following:

        It is impossible to allow dynamic shared state to change frequently
        and guarantee that all hosts simultaneously access identical versions
        of that state.
    

    This statement is called the Consistency-Throughput Tradeoff. In other words, you can either have a distributed virtual environment that remains mostly static, in which all hosts maintain identical information, or you can have a dynamic world with frequent changes and no absolute consistency.

    This assertion has a rigorous proof, and in fact its reasoning is very straightforward. To get absolute consistency among all distributed hosts, the data source must wait (with acknowledgments) until everybody has received the previous update before proceeding with more updates.
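
    To see why this caps throughput, here is a tiny, purely illustrative simulation (the round-trip times and function names are made up): with stop-and-wait acknowledgments, one update costs the slowest host's round trip, so the achievable update rate is capped by the worst round-trip time in the group.

        import random

        # Illustrative only: the source cannot issue the next update until the
        # slowest host has acknowledged the previous one.

        def round_trip_ms():
            return random.uniform(20.0, 120.0)        # pretend per-host RTT in ms

        def consistent_update_rate(num_hosts, samples=1000):
            total_ms = 0.0
            for _ in range(samples):
                # one update costs the *maximum* round trip over all hosts
                total_ms += max(round_trip_ms() for _ in range(num_hosts))
            return 1000.0 / (total_ms / samples)      # updates per second

        if __name__ == "__main__":
            for n in (2, 8, 32):
                print(f"{n:2d} hosts -> roughly {consistent_update_rate(n):.1f} updates/sec")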

    This is very restrictive for multiplayer games. Imagine the following scenario in a gathering game. Two players both see the same piece of resource and rush to acquire it. Really, who got it? This is a trivial problem if everything takes place in slow motion. But that takes away all the fun, doesn't it?


    So here is another way to look at the Consistency-Throughput Tradeoff. There is a finite amount of network bandwidth. Do you use it for updating everybody with changes, or do you instead send messages (including acks) to enforce a consistent view among all hosts?


    A Spectrum of Different Approaches

    Since it's a tradeoff (a very familiar word), let's take a few picks. The obvious ones would be: the left, the right, and the middle. In other words, the two extremes and a blend of the two. If we really, really want global consistency, use a centralized repository. The opposite is to care only about performance and let each host guess the global state, hoping that the end result on each host is a good enough approximation of the canonical global view. The middle-ground approach is called frequent state regeneration.

    Before we examine them in turn, you should note that any real system would inevitably have to use different approaches for different parts of the system. Such decisions have to be made on a case-by-case basis.


    Centralized Repository

    For this, there is a very familiar example: NFS, the Network File System. On NFS, let's imagine each state is stored in a separate file. All hosts write to the same NFS and read from the same NFS. Whenever a write (i.e. update) needs to take place, a host opens the file using atomic operations (i.e. locks that state), writes to it, and then releases the lock by closing the file.

    No matter what game operation a host is to perform, it first reads all the states from NFS and then proceeds. If a file is locked, it waits. In this manner, with a global synchronization step, similar to an MPI_Barrier, everybody gets exactly the same view.

    Of course we don't really have to keep all states on disk. Keeping them in core would be just fine, as long as the same mechanisms are implemented, in particular lock/unlock and a globally synchronized barrier.
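
    As a minimal sketch of that lock/read-modify-write cycle, here is what it might look like with a plain file standing in for the repository (Unix-only, and the file path is made up; locking over real NFS has additional caveats):

        import fcntl, json

        STATE_FILE = "/tmp/shared_state.json"        # stand-in for a file on the shared mount

        def write_state(key, value):
            # Take an exclusive lock (i.e. "lock that state"), update, then
            # release the lock when the file is closed.
            with open(STATE_FILE, "a+") as f:
                fcntl.flock(f, fcntl.LOCK_EX)
                f.seek(0)
                text = f.read()
                state = json.loads(text) if text else {}
                state[key] = value
                f.seek(0)
                f.truncate()
                json.dump(state, f)

        def read_state():
            # Readers take a shared lock so they never see a half-written update.
            with open(STATE_FILE, "a+") as f:
                fcntl.flock(f, fcntl.LOCK_SH)
                f.seek(0)
                text = f.read()
                return json.loads(text) if text else {}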

    This is, however, prohibitively expensive if you are implementing a basketball being passed from one player to another. It will really appear to be in slow motion :-). Let's use this approach for really important events only.


    Frequent State Regeneration

    Since we haven't yet talked about the other extreme, this method may not seem like the middle ground. But for the time being, just bear with me here.

    This time, let's try something different. Each host (client) just keeps regenerating its entire state over and over, and in the process bombards everybody it sees on the network with messages containing that entire state. Say, do this 10 times per second.

    What is the result? As long as we are not congesting the network, every host would "sooner or later (all within a few seconds though)" get some updates from everybody. Many of those messages are actually just spam, in the sense that the receiver has no idea how to use them. For instance, a player practicing his skills reclusively inside a tomb underground gets messages about sunrise. The pro, however, is that if we are passing a basketball, it will look a lot smoother and more responsive than with a centralized repository.

    Note that each message must contain the entire state so that the receiver can use that update independently. Also, messages will very likely arrive out of order due to jitter in network latency and bandwidth, so it is a good idea to add a timestamp to each message.
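
    As a sketch (the message layout and names are only for illustration), each sender would build a message like the one below roughly 10 times per second and broadcast it; a receiver keeps only the newest state it has seen for each player, which also takes care of out-of-order arrival:

        import json, time

        def make_message(player_id, state):
            # every message carries the *entire* state plus a timestamp
            return json.dumps({"player": player_id,
                               "timestamp": time.time(),
                               "state": state})

        class Receiver:
            def __init__(self):
                self.latest = {}              # player_id -> (timestamp, state)

            def on_message(self, raw):
                msg = json.loads(raw)
                pid, ts = msg["player"], msg["timestamp"]
                # keep only the newest state per player; stale messages are dropped
                if pid not in self.latest or ts > self.latest[pid][0]:
                    self.latest[pid] = (ts, msg["state"])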


    To get rid of the spamming, filtering is necessary. Actually, all we need is for each client to send its update to a server (e.g. the one in charge of its zone) and have the server simply relay those messages to the clients that really should get them.
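
    A sketch of such a relay server, filtering by zone (the zone bookkeeping and the deliver() callback are assumptions made for the example):

        class RelayServer:
            def __init__(self):
                self.zone_of = {}             # client_id -> zone name

            def register(self, client_id, zone):
                self.zone_of[client_id] = zone

            def on_state_message(self, sender_id, message, deliver):
                zone = self.zone_of.get(sender_id)
                for client_id, client_zone in self.zone_of.items():
                    # relay only to the clients that should actually see this update
                    if client_id != sender_id and client_zone == zone:
                        deliver(client_id, message)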

    When we talked about design patterns, the observer pattern mostly advocated using "pull", i.e. having each receiver come and request the update. That works well with a centralized repository.

    But to work with frequent state regeneration, "push" is more appropriate. Can your current design handle this change?
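
    The difference is small in code but big in design. A generic sketch (not tied to any particular framework): in pull style the observer goes back and reads the state, while in push style the full state rides along with the notification, which is exactly what frequent state regeneration produces.

        class Subject:
            def __init__(self):
                self.state = None
                self.observers = []

            def attach(self, observer):
                self.observers.append(observer)

            def set_state(self, state):
                self.state = state
                for observer in self.observers:
                    observer.notify(self, state)

        class PullObserver:
            def notify(self, subject, _state):
                print("pulled:", subject.state)   # go back to the source and read

        class PushObserver:
            def notify(self, _subject, state):
                print("pushed:", state)           # the state arrived with the message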


    Dead Reckoning

    Now we are at the other extreme. Dead reckoning pushes more work to each receiver. Instead of having receivers operate in a passive mode, let's have them do some work to guess or estimate what could happen.

    This method is appropriate for many scenarios where an event is easily predictable. For instance, after a bullet leaves a gun, we pretty much know where it is going and how fast. Using dead reckoning, we can probably dramatically reduce the amount of precious network bandwidth used for sending blind updates.
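
    A minimal dead-reckoning sketch (first-order extrapolation; the numbers in the example are made up): between updates, every host places the bullet at its last known position plus velocity times elapsed time.

        def dead_reckon(last_pos, velocity, last_update_time, now):
            # position = last known position + velocity * elapsed time;
            # a fancier model could also include acceleration
            dt = now - last_update_time
            return tuple(p + v * dt for p, v in zip(last_pos, velocity))

        # bullet last reported at (0, 0, 1.5) m, moving 300 m/s along +x;
        # 50 ms later every host can place it at (15, 0, 1.5) with no new message
        print(dead_reckon((0.0, 0.0, 1.5), (300.0, 0.0, 0.0), 0.00, 0.05))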

    But what if we cannot exactly predict the movement of our peers? We use correction methods called "convergence". For instance, a group of mermaid hunters is at work under the sea. For each hunter, it is fine for his perception of where the other hunters are to be temporarily off by a little, as long as the divergence does not grow larger and larger over time.

    In dead reckoning, full updates from the sender are still expected from time to time. Say hunter A finally gets an update of hunter B's location. Convergence is then the process of correcting A's estimate of B's location to the true one without causing visual artifacts.

    This is highly subjective, though. If hunter B is in hunter A's peripheral view, or even right behind hunter A, simply snapping B's location to the correct one would be fine. In other cases, more sophisticated interpolation methods need to be employed, using splines of various degrees of continuity.
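
    A sketch of both options (linear blending only; the parameter names are illustrative): snap when B is out of A's view, otherwise blend the displayed position toward the corrected one over a short window.

        def converge(displayed_pos, corrected_pos, elapsed, blend_window, in_view):
            # out of view (or window expired): just snap to the correct location
            if not in_view or elapsed >= blend_window:
                return corrected_pos
            # otherwise blend linearly toward the corrected position; splines
            # would give a smoother, higher-continuity correction
            t = elapsed / blend_window
            return tuple(d + (c - d) * t
                         for d, c in zip(displayed_pos, corrected_pos))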


    State Ownership

    Finally, let's ask: who can send "updates"? Obviously, hunter A should not be the one updating everybody else about B's location. That is actually still a simple case. How about two players trying to move the same ball at the same time?

    A common solution is to always require establishing ownership of a state before updates can be sent, and to always enforce a singleton type of ownership. This may cause some problems with physics; detailed joint actions like two players moving the same ball, however, are best avoided in networked games anyway.

    The locking mechanism, whether on centralized repositories or on servers acting as filters of frequent state regeneration, must be able to recognize such ownership.
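
    A sketch of what that bookkeeping might look like on the server or repository side (names are illustrative): a host must acquire a state before its updates are accepted, and there is at most one owner at a time.

        class OwnershipTable:
            def __init__(self):
                self.owner = {}               # state_id -> owning host_id

            def acquire(self, state_id, host_id):
                # singleton ownership: at most one owner per state
                if self.owner.get(state_id) in (None, host_id):
                    self.owner[state_id] = host_id
                    return True
                return False

            def release(self, state_id, host_id):
                if self.owner.get(state_id) == host_id:
                    del self.owner[state_id]

            def accept_update(self, state_id, host_id):
                return self.owner.get(state_id) == host_id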


    Okay, what changes are needed in your design?