Optimization, Instancing, and supporting older GPUs.

While I’ve worked on video games professionally for about ten years, this type of game is not quite like the software I’ve previously written. The major difference is the number of objects that are dynamic. The games I’ve worked on tend to have an environment built mostly of large static parts that never change, which is typical of FPS games. There tend to be something like ten or twenty enemies at once and that’s about it.

In Banished, there are thousands of objects that can change: trees, rocks, people, wildlife, and more. Each is running a simulation, and each has to do so very quickly. For the most part I’ve succeeded in keeping these simulations fast or spreading them out over time.

The biggest performance issue that I’ve run into is the sheer number of objects that need to be animated and rendered. Anything that can be seen needs to be drawn twice: once to create the shadow buffers, and once to be shown on screen. When you zoom out far enough to see the whole map, around 8,000 objects are usually visible. Under DirectX 9, making 16,000 draw calls (8,000 objects, each drawn twice) to draw all the trees, buildings, and rocks simply doesn’t work. The frame rate is terrible, even on a really fast computer.

The fix for this is to draw multiple objects at once in a single draw call. After all, there are many similar trees and buildings; they just have different scale, position, and orientation. DirectX 10 and 11 have great support for this type of operation, and DirectX 9 supports it when using Shader Model 3.
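As a rough illustration of that idea (a sketch, not the game's actual code), the visible objects can first be grouped by the mesh they share, so that each group only differs in its per-object transform. The `Object` and `Transform` types, their fields, and `groupByMesh` are hypothetical names made up for this example.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical per-object data: what differs between two copies of a mesh.
struct Transform {
    float position[3];
    float yaw;   // orientation about the up axis
    float scale;
};

// Hypothetical visible object: which mesh it uses and where it sits.
struct Object {
    std::uint32_t meshId;
    Transform transform;
};

// Collect the transforms of all visible objects, keyed by mesh, so every
// group can be submitted together instead of one draw call per object.
std::unordered_map<std::uint32_t, std::vector<Transform>>
groupByMesh(const std::vector<Object>& visible) {
    std::unordered_map<std::uint32_t, std::vector<Transform>> groups;
    for (const Object& obj : visible) {
        groups[obj.meshId].push_back(obj.transform);
    }
    return groups;
}
```

The number of groups, rather than the number of objects, then becomes the lower bound on draw calls per pass.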

While DX10/11 video cards certainly have wide support, there are still quite a few XP users and older video cards that require DX9. Because I want a wide audience to be able to play the game, I’ve been sticking to DX9 and Shader Model 2.0. It does (mostly) all the graphics operations I want and need. Unfortunately, this limits me to implementing the instancing with shader constants, in a manner similar to rendering a bone-animated, skinned model. This limits the renderer to drawing fifty-two objects in a single call, but it’s still a big win for CPU performance.
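To make the fifty-two object limit concrete, here is a minimal sketch of how the instances of one mesh might be split into batches that fit it. Only the limit of 52 comes from the text above; `Batch`, `buildBatches`, and `kMaxInstancesPerCall` are hypothetical names, and the actual upload of per-instance data to shader constants is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Limit quoted in the post for constant-based instancing.
constexpr std::size_t kMaxInstancesPerCall = 52;

// A contiguous run of instances submitted with one draw call.
struct Batch {
    std::size_t first; // index of the first instance in the run
    std::size_t count; // number of instances, at most kMaxInstancesPerCall
};

// Split instanceCount instances of one mesh into runs that fit the limit.
// Each run would have its per-instance data written to shader constants
// before a single draw call is issued for the whole run.
std::vector<Batch> buildBatches(std::size_t instanceCount) {
    std::vector<Batch> batches;
    for (std::size_t first = 0; first < instanceCount;
         first += kMaxInstancesPerCall) {
        const std::size_t count =
            std::min(kMaxInstancesPerCall, instanceCount - first);
        assert(count > 0 && count <= kMaxInstancesPerCall);
        batches.push_back(Batch{first, count});
    }
    return batches;
}
```

Under these assumptions, 8,000 instances of a single mesh would need 154 calls per pass (153 full batches plus one of 44). A real scene has many different meshes whose last batch is only partly filled, so the total ends up somewhat higher.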

I had added instancing support a while back for objects that don’t deform, like trees, rocks, and buildings. With the newly added animated livestock and wild animals, I spent part of the weekend adding instancing for animated and deforming models. The speedups in frame rate are fantastic when drawing hundreds of chickens, people, or wildlife. In the worst case the rendering system makes about 400 draw calls instead of 16,000. Animations that are common can now be shared between different objects as well, which is another big speedup for the CPU, since every animating object no longer has to update a unique hierarchy of bones.
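One way to picture the shared animations is a small cache that evaluates a pose once and hands the same result to every object playing it. This is only a sketch under assumptions: keying the cache on an animation id plus a discrete frame number is a guess at how sharing could work, and `PoseCache`, `Pose`, and `evaluate` are hypothetical names with a stand-in where the real bone hierarchy update would go.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical pose: flattened bone matrices for one animation frame.
struct Pose {
    std::vector<float> boneMatrices;
};

class PoseCache {
public:
    // Return the pose for (animationId, frame), evaluating the bone
    // hierarchy only the first time that combination is requested.
    const Pose& get(std::uint32_t animationId, std::uint32_t frame) {
        const auto key = std::make_pair(animationId, frame);
        auto it = poses_.find(key);
        if (it == poses_.end()) {
            ++evaluations_;
            it = poses_.emplace(key, evaluate(animationId, frame)).first;
        }
        return it->second;
    }

    // How many times the hierarchy actually had to be evaluated.
    std::size_t evaluations() const { return evaluations_; }

private:
    // Stand-in for walking a bone hierarchy; fills one 4x4 matrix
    // with a value derived from the inputs.
    static Pose evaluate(std::uint32_t animationId, std::uint32_t frame) {
        Pose pose;
        pose.boneMatrices.assign(16, static_cast<float>(animationId + frame));
        return pose;
    }

    std::map<std::pair<std::uint32_t, std::uint32_t>, Pose> poses_;
    std::size_t evaluations_ = 0;
};
```

With a cache like this, a hundred chickens on the same frame of the same walk cycle would cost one hierarchy update rather than a hundred, at the price of animations only varying in whole-frame steps.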

The near future should find me putting together hunters and the animals they’re after as well as implementing orchards.