Porting: SIMD

The work on making the code portable continues.

Way back when I first got an i7 CPU, I wrote a global illumination ray tracer that made really nice pictures while pushing the CPU to its limits. The math code for the ray tracer used the CPU’s SIMD instructions to perform the math operations more quickly, and I got very used to coding with them. The performance gain using them was fairly large.

For Banished I wanted to get started writing code and the game quickly – so I tended to use things I knew very well – everything from the choice of compiler, modeling tools, algorithms, and APIs. For math code (which is used by pretty much anything in the game that moves, animates, or is displayed in 3D) I chose to use the SIMD intrinsics.

SIMD stands for Single Instruction, Multiple Data – meaning the instructions operate on more than one value at a time.

For example the single instruction:

c = _mm_add_ps(a, b);

would be the same as doing 4 additions:

c.x = a.x + b.x;
c.y = a.y + b.y;
c.z = a.z + b.z;
c.w = a.w + b.w;

If all goes well, I should be able to compile on OSX and Linux using the same SIMD instructions, as I’ll still be targeting Intel/AMD CPUs. But just in case it doesn’t work or compile, or more importantly, if I port to a platform without those intrinsics, I’d like the code to be ready.

In writing the SIMD code, I didn’t keep any sort of portable C++ version around. Which is not so good. Having the C code for reference material is nice. Especially for something that’s been optimized to be fast and is no longer easily readable – or some equation that was hard to derive and that scrap of paper got thrown away…

The porting process is pretty easy. Most SIMD code just performs 4 operations at once, so in plain C++ code, all the operations are generally written out explicitly.

The C++ code I wrote is now much easier to look at. But I’ve also left the code open enough that if another platform has its own set of differing SIMD instructions, I can write code using them on that platform if desired, or I can just use the C++ code.
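As a rough sketch of what "SIMD where available, plain C++ everywhere else" can look like, here is a four-component add with both paths behind one function. The Vec4 type, the Add function, and the EXAMPLE_USE_SSE macro are made up for illustration and are not taken from the Banished source.

```cpp
// Pick the SSE path only when the compiler says SSE is available.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define EXAMPLE_USE_SSE 1
#else
#define EXAMPLE_USE_SSE 0
#endif

// Hypothetical 4-component vector, 16-byte aligned so aligned SSE loads are valid.
struct alignas(16) Vec4
{
    float x, y, z, w;
};

inline Vec4 Add(const Vec4& a, const Vec4& b)
{
#if EXAMPLE_USE_SSE
    // One instruction adds all four components.
    Vec4 c;
    _mm_store_ps(&c.x, _mm_add_ps(_mm_load_ps(&a.x), _mm_load_ps(&b.x)));
    return c;
#else
    // Portable fallback: the same four additions written out explicitly.
    return Vec4{ a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
#endif
}
```

Callers only ever see Add, so a platform with a different instruction set could get a third branch without touching the rest of the code.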

The worst part of porting the math code like this is typos. After all, there aren’t too many different ways to write a matrix multiply, matrix inverse, or quaternion interpolate. So any errors tend to be errors in typing. You write .x, .y, .z, or .w (or similar) so many times that they tend to just blur together. You look at the code so many times knowing what it should be, that you don’t see the glaring error.

So after writing the C++ code, everything appeared to work correctly, except I noticed some deer would twitch during their animations.

It took me more time debugging to find the two errors that caused this than the time it took to port the entirety of the math library to generic C++.

In one place I had mistakenly written

float d = (a._v._0 * b._v._0) + (a._v._1 * b._v._1) + (a._v._2 * b._v._2) + (b._v._3 * b._v._3);

instead of

float d = (a._v._0 * b._v._0) + (a._v._1 * b._v._1) + (a._v._2 * b._v._2) + (a._v._3 * b._v._3);

And in the other location

Vector3((_r._0 * a._r._0) + (_r._1 * a._r._0) + (_r._2 * a._u._0),
	(_r._0 * a._r._1) + (_r._1 * a._f._1) + (_r._2 * a._u._1),
	(_r._0 * a._r._2) + (_r._1 * a._f._2) + (_r._2 * a._u._2)),

instead of

Vector3((_r._0 * a._r._0) + (_r._1 * a._f._0) + (_r._2 * a._u._0),
	(_r._0 * a._r._1) + (_r._1 * a._f._1) + (_r._2 * a._u._1),
	(_r._0 * a._r._2) + (_r._1 * a._f._2) + (_r._2 * a._u._2)),

Hopefully you can spot the errors faster than I did – it’s easier without additional math code above and below. I guess my brain saw what I expected to see, rather than what it really saw.

I’m actually fairly amazed the game displayed mostly correctly with these errors – the first error was part of Quaternion::Interpolate – which is used on every object displayed in game every frame, as well as during animation. The second error was in the code to multiply one orthonormal basis by another, which is also used very heavily.

And for those of you thinking I should have unit tests for this kind of thing, yeah math is definitely a system that can be unit tested – but eh, unit tests, that’s a discussion for another day.
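For what it's worth, a check along these lines would likely have flagged the first typo. This is only a sketch: the Quat type, Dot, NearlyEqual, and TestDot are hypothetical names, and the game's real quaternion class stores its components differently.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical quaternion type, for illustration only.
struct Quat
{
    float x, y, z, w;
};

inline float Dot(const Quat& a, const Quat& b)
{
    return (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w);
}

inline bool NearlyEqual(float a, float b, float epsilon = 1e-6f)
{
    return std::fabs(a - b) <= epsilon;
}

inline void TestDot()
{
    // Every component is different, so a swapped or repeated operand changes the result.
    const Quat a{ 1.0f, 2.0f, 3.0f, 4.0f };
    const Quat b{ 5.0f, 6.0f, 7.0f, 8.0f };

    // 1*5 + 2*6 + 3*7 + 4*8 = 70. Writing b.w * b.w for the last term would give 102.
    assert(NearlyEqual(Dot(a, b), 70.0f));

    // The dot product is symmetric, which the mistyped version is not.
    assert(NearlyEqual(Dot(a, b), Dot(b, a)));
}
```

The useful trick is picking inputs where no two components share a value, so one wrong letter cannot cancel out.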

Anyway, lots of debugging later the math code is now portable and the game works properly as far as I can tell with identical output.

Next up is an OpenGL version of the renderer, which I’ve got half done… more to come later.


Porting: UTF-8

With the mod kit and steam workshop out there, I’ve been working on porting the game to OSX and Linux, cleaning up code, and writing some new code. I’ll still be fixing bugs and making small changes, but my focus now is going to be on ports, and building prototypes for new games.

I’ve built a new machine for developing on Linux, and bought some Macs, so I’m all set with hardware. And it’s been a while since I used makefiles, other IDEs, gcc/clang, or did any sort of *nix development, but I have done it so I’m not starting from ground zero.

But before I can actually go about working on the new hardware and compiling things, there are a few issues in the Banished code base that need fixing. I planned on porting the code base one day, so things are nicely set up into common code and platform specific code, but there are still some issues I didn’t properly account for.

There are code portability issues. I want to make the code as portable as possible – there’s a chance I’ll be making games for more than just Windows, Mac, and Linux one day, so I might as well try to fix them now.

The first issue is with text.

When I wrote the code I assumed one day it would support more than just English. Back when I made console games I never worked on any of the text code, so all I really knew was that the system used two bytes per character in a string of text, and this was fine for all the languages that the game was translated into – generally EFIGS and maybe Japanese. Since the Windows API takes wchar_t for all filenames and text, that’s what I used, and I happily and naively coded away using wide strings. This probably would be okay if the game stayed on Windows.

But it’s sort of a mistake for cross-platform code. I didn’t even use it correctly. The released game doesn’t currently use UTF-16, it’s just UCS-2, so there are some languages that have characters that are unrepresentable.

Not only that, but API calls on other systems generally don’t take wchar_t*, they take char* instead. Certainly I could make conversion functions and convert UCS-2 to UTF-8 as needed, but that’s not really ideal.

The final problem is that the size of wchar_t on Windows is 2 bytes, but this isn’t guaranteed to be the case on other platforms. The size issue shouldn’t really matter – but it’s possible somewhere I multiplied a string length by 2, instead of multiplying by sizeof(wchar_t). That would cause problems.
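To make the size issue concrete: wchar_t is 2 bytes on Windows but typically 4 bytes with gcc and clang on Linux and OSX, so a hard-coded 2 gives half the buffer it should. A minimal sketch of the safe form, using a hypothetical helper name:

```cpp
#include <cstddef>
#include <cwchar>

// Byte count of a wide string's characters (terminator not included).
// Multiplying by sizeof(wchar_t) rather than a literal 2 keeps this correct
// whether wchar_t is 2 bytes or 4 bytes on the target platform.
inline std::size_t WideStringByteCount(const wchar_t* text)
{
    return std::wcslen(text) * sizeof(wchar_t);
}
```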

I’ve known about these issues for a while, but I generally code something that works first, and don’t refactor until I have to. And now I have to.

So my big fix recently was to remove the use of wchar_t and use char instead. (And make sure there’s no multiplies by 2). Not only that but all strings need to use UTF8 properly – when printing text, or reading text from resource files the right thing needs to be done to properly decode and show the right character.

The first step of doing this is easy. You can find and replace a bunch of stuff using a text editor.

  • wchar_t becomes char
  • L”string” becomes “string”
  • wsprintf(buffer, “%ls”, param) becomes sprintf(buffer, “%s”, param)
  • wcscpy, wcslen, wcscat become strcpy, strlen, strcat

Next I fixed my String class to properly use char. No big deal.

All the serialization code that was previously written to serialize wchar_t became serialization of char. This conflicted with serialization of single byte signed values, which were already typed as char.

For example

void Serialize(char v);             // serialize a signed 8 bit value
void Serialize(unsigned char v);    // serialize an unsigned 8 bit value
void Serialize(wchar_t v);          // serialize a character

Became

void Serialize(signed char v);     // this looks ambiguous to me, because i think 
void Serialize(unsigned char v);   // of 'char' as signed, even though it's
void Serialize(char v);            // compiler dependent and compiles ok...

This prompted me to redeclare all integers with typedefs and make a distinction between a character and an 8-bit integer.

  • signed int became int32
  • unsigned int became uint32
  • signed short became int16
  • unsigned short became uint16
  • signed char became int8
  • unsigned char became uint8
  • char stays the same, and is only used for characters and strings.

int64 and uint64 were already typedef’d. These typedefs are made in platform code, so per platform types can be declared using the right sizes, regardless of their names on each platform.

So the overloads then became

void Serialize(int8 v);     // serialize a signed 8 bit value
void Serialize(uint8 v);    // serialize an unsigned 8 bit value
void Serialize(char v);     // serialize a character

This took care of the type conflict since the compiler treats ‘signed char’ as a different type than ‘char’. It also clarifies what the types are used for in the code. If I see a char in code, it means character, or char* means null terminated string. If I see int8 or uint8, it’s a number stored as 8 bits. This makes things a bit more clear.
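The three-way split works because C++ treats char, signed char, and unsigned char as three distinct types, whichever signedness plain char happens to have. A small sketch follows; the typedefs mirror the list above, while the Kind enum and the return values are only there to make the chosen overload visible and are not how the game's serializer works.

```cpp
#include <type_traits>

// Fixed-size names, as a platform header might declare them (sizes assumed, then checked).
typedef signed char    int8;
typedef unsigned char  uint8;
typedef signed short   int16;
typedef unsigned short uint16;
typedef signed int     int32;
typedef unsigned int   uint32;

static_assert(sizeof(int8) == 1 && sizeof(uint8) == 1, "8-bit types must be 1 byte");
static_assert(sizeof(int16) == 2 && sizeof(uint16) == 2, "16-bit types must be 2 bytes");
static_assert(sizeof(int32) == 4 && sizeof(uint32) == 4, "32-bit types must be 4 bytes");

// Plain char is its own type, never an alias for either of the other two.
static_assert(!std::is_same<char, signed char>::value, "char is distinct from signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is distinct from unsigned char");

// Hypothetical tag so the example can report which overload was picked.
enum class Kind { Signed8, Unsigned8, Character };

inline Kind Serialize(int8)  { return Kind::Signed8; }
inline Kind Serialize(uint8) { return Kind::Unsigned8; }
inline Kind Serialize(char)  { return Kind::Character; }
```

One thing to watch for with overloads like these: passing a plain int literal such as Serialize(5) is ambiguous, because all three conversions rank the same, so callers need a value that already has the intended type.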

This was also a big find and replace, except for char, where I had to go through each use and determine if it was an int8, or actually an ASCII character. This wasn’t too hard though, as most previous uses of text used wchar_t – so most chars became int8. I’ll probably take a few weeks getting used to typing int32 instead of just int.

One issue that came up with the type name change was the existing source data. If you’ve been modding the game, in the data you might see

int _value = 400;

with the new code, it would become

int32 _value = 400;

While I could version all the text data, I don’t really want to break people’s mods that they’ve already set up, or make them do the find and replace on their own, so there’s some versioning code that reads the old typename. For the next game I’ll get rid of that versioning code, but it’s going to stay in Banished for now.

Next came the hard part. Input data could be in any format – generally text files would be in ASCII if I create them, or UTF8 if I used a symbol like the Euro. But files from mod creators and translated strings could be in UTF8, UTF8 with a byte order mark, UTF16 big-endian, or UTF16 little-endian. Really it shouldn’t matter what a mod creator uses as a text format – I just want the game to load it and do the right thing.

While there are libraries for dealing with this, I tend to write my own code since I don’t like different code styles mixed in my code, type conflicts, and dealing with crazy code licenses. So 400 or so lines of code and lots of debugging later, I wrote two great functions. One detects character encoding using byte order marks and checking for valid UTF8. The other can convert any one text encoding to another. There are also support functions for decoding strings one character at a time.
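The detection half can be sketched roughly as follows: look for a byte order mark first, then fall back to checking whether the bytes are well-formed UTF8. This is an illustration and not the game's code; the Encoding enum, IsValidUtf8, and DetectEncoding are made-up names, and this version only recognizes UTF16 when a byte order mark is present.

```cpp
#include <cstddef>

enum class Encoding { Ascii, Utf8, Utf8WithBom, Utf16LittleEndian, Utf16BigEndian, Unknown };

// True if the bytes are well-formed UTF8. Also reports whether any multi-byte
// sequence was seen, so pure 7-bit text can be labeled as ASCII.
inline bool IsValidUtf8(const unsigned char* data, std::size_t size, bool* sawNonAscii)
{
    *sawNonAscii = false;
    std::size_t i = 0;
    while (i < size)
    {
        const unsigned char lead = data[i];
        if (lead < 0x80) { ++i; continue; }

        std::size_t extra = 0;          // number of continuation bytes expected
        unsigned long cp = 0;           // code point being assembled
        unsigned long minimum = 0;      // smallest value allowed for this length
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; minimum = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minimum = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minimum = 0x10000; }
        else return false;              // stray continuation byte or invalid lead

        if (size - i <= extra) return false;    // sequence runs off the end
        for (std::size_t k = 1; k <= extra; ++k)
        {
            if ((data[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (data[i + k] & 0x3F);
        }
        // Reject overlong forms, values past U+10FFFF, and surrogates.
        if (cp < minimum || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;

        *sawNonAscii = true;
        i += extra + 1;
    }
    return true;
}

inline Encoding DetectEncoding(const unsigned char* data, std::size_t size)
{
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) return Encoding::Utf8WithBom;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) return Encoding::Utf16LittleEndian;
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) return Encoding::Utf16BigEndian;

    bool sawNonAscii = false;
    if (IsValidUtf8(data, size, &sawNonAscii))
        return sawNonAscii ? Encoding::Utf8 : Encoding::Ascii;
    return Encoding::Unknown;
}
```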

After that it was a simple matter to convert all text files to UTF8 as they are opened by the game engine.

Once I had all UTF8 strings in memory, they have to be decoded when used as output – the font rendering code now decodes UTF8 into actual characters using 1 to 4 bytes of a string at a time before looking up each glyph in the font texture.
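Decoding one character at a time looks roughly like the sketch below. The lead byte says how many bytes the character uses (1 to 4), and the remaining bytes each contribute 6 bits. The function name, the position-advancing interface, and the use of U+FFFD for bad input are assumptions for the example, not a description of the game's font code.

```cpp
#include <cstddef>
#include <cstdint>

// Returned for malformed input in this sketch (the Unicode replacement character).
const std::uint32_t kReplacementCharacter = 0xFFFD;

// Decodes the character starting at text[*position] and advances *position past it.
// The caller must ensure *position < size.
inline std::uint32_t DecodeUtf8(const char* text, std::size_t size, std::size_t* position)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(text);
    const unsigned char lead = bytes[*position];

    std::size_t extra = 0;        // continuation bytes that follow the lead byte
    std::uint32_t cp = 0;         // code point being assembled
    std::uint32_t minimum = 0;    // smallest code point valid for this length

    if (lead < 0x80)                { *position += 1; return lead; }
    else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; minimum = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minimum = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minimum = 0x10000; }
    else                            { *position += 1; return kReplacementCharacter; }

    // Not enough bytes left for the whole sequence.
    if (size - *position <= extra) { *position = size; return kReplacementCharacter; }

    for (std::size_t k = 1; k <= extra; ++k)
    {
        const unsigned char c = bytes[*position + k];
        if ((c & 0xC0) != 0x80) { *position += k; return kReplacementCharacter; }
        cp = (cp << 6) | (c & 0x3F);
    }
    *position += extra + 1;

    // Overlong encodings, surrogates, and out-of-range values are not real characters.
    if (cp < minimum || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return kReplacementCharacter;
    return cp;
}
```

A text renderer would call this in a loop until the position reaches the end of the string, looking up a glyph for each returned value.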

The Windows API can be compiled to use either wide strings or not, but I left them as wide and made a WideString class to deal with the conversion to and from the internal UTF8 String format. The WideString class only exists on Windows compilations, and is only used in files that would be rewritten per platform anyway.

After all that, the game compiled and ran just fine, but it wouldn’t load old data or save games. This is bad. I can’t go breaking save games when a new version comes out. And I’d like to keep all existing mods working.

So then I had to version old strings that are stored in saves and data on disk – strings are just written as an int32 length, followed by all the bytes of data. So new strings set the high bit on the length to mark it as the new version. I doubt that the game will ever have a string of text 2 gigabytes long, so this will be okay. If an old string is detected, with the high bit unset, it is converted from UCS-2 to UTF-8 on load, and the game happily continues loading older data.
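A sketch of how the high-bit marker and the old-string conversion might fit together. The function names and the flag constant are invented here, the length is treated as unsigned to keep the bit operations simple, and whether the old length counted characters or bytes is not stated above, so that detail is left out.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// High bit of the stored 32-bit length marks a new-format (UTF-8) string.
const std::uint32_t kUtf8LengthFlag = 0x80000000u;

inline std::uint32_t MakeStoredLength(std::uint32_t byteCount) { return byteCount | kUtf8LengthFlag; }
inline bool IsNewFormat(std::uint32_t storedLength)            { return (storedLength & kUtf8LengthFlag) != 0; }
inline std::uint32_t ActualLength(std::uint32_t storedLength)  { return storedLength & ~kUtf8LengthFlag; }

// Converts old UCS-2 text to UTF-8. Each UCS-2 unit is a whole code point no
// larger than 0xFFFF, so it needs at most 3 bytes. This sketch does not
// special-case values in the surrogate range.
inline std::string Ucs2ToUtf8(const std::vector<std::uint16_t>& units)
{
    std::string out;
    for (std::uint16_t u : units)
    {
        if (u < 0x80)
        {
            out.push_back(static_cast<char>(u));
        }
        else if (u < 0x800)
        {
            out.push_back(static_cast<char>(0xC0 | (u >> 6)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        }
        else
        {
            out.push_back(static_cast<char>(0xE0 | (u >> 12)));
            out.push_back(static_cast<char>(0x80 | ((u >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        }
    }
    return out;
}
```

On load, a reader would check IsNewFormat on the stored length: if set, read ActualLength bytes as UTF-8, otherwise read the old data and run it through the conversion.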

Again this versioning code will hopefully go away when I make another game, since it won’t be needed.

So now the only wchar_t that exists is in platform code on Windows that won’t be compiled on OSX or Linux, and the game properly supports various character encodings.

Text encodings are one of those things I never really want to think about – and now that I’ve spent a while dealing with it, hopefully I never have to deal with it again. Phew.


Update to Steam 1.0.4

A new version of the game has been updated on Steam that changes the way mods are uploaded to Workshop. Adding a mod to workshop now requires the original compiled source data to be available to allow uploading. This allows only mod authors to upload their content.

Mods must be rebuilt with Mod Kit 141123 before this will work. Non-Steam versions should still function properly regardless of whether mods are built with the newest mod kit or the previous one.

Changes for 1.0.4 Build 141123

  • Uploading mods to Steam Workshop now requires the user to have the original compiled data before packaging on disk (generally .crs files). The data must be in the same location it was built and match the files in the package. For example, if the mod was built in C:\BanishedKit\mymod\bin\, then all files created during mod compilation must remain in that directory for the mod to be added to Steam Workshop.

    Without the original data, the Add to Workshop and Update on Workshop buttons are unavailable. Current mods need to be rebuilt with Banished Kit 141123 before they can be updated.

    No other changes to the game or mod kit have been made, so non-steam users can continue using build 141103.

Release 1.0.4

Version 1.0.4 is now out of beta and has been released. It’s up on Steam, and the content on Steam Workshop is now public to everyone.

For those of you not using Steam, you can download the latest version from either Humble or GOG, depending on where you downloaded it. The builds should be up soon on their sites. Until then, you can download this patch (BanishedPatch_Any_To_1.0.4.141103.zip) to bring the game up to date. As always, just extract this archive into your Banished folder and overwrite any existing files.

The mod kit hasn’t changed much, but there is a new release of it available on the Mod page.

Changes in this release:

  • Increased memory usage allowed for save games. This allows larger modded maps than default to be saved safely. However at some point very large maps will crash the game due to running out of memory, or textures failing to be created.
  • Trade UI now expands automatically for orders.
  • Fixed a bug that caused mods to become unreferenced in save games when opening the mod dialog in game and then pressing cancel.
  • Fixed a bug that caused mods that should be unloaded to stay in memory if they were loaded at game start. This fixed random ghost buttons on the toolbar.
  • Fixed tutorials not progressing properly.

Hotfix!

Apparently I really broke the last beta build. There are two new fixes in this hotfix build.

Changes for beta 1.0.4 Build 141003

  • Fixed mods not showing up on the toolbar.
  • Fixed crash when loading save games saved with 141001.

There might be some weirdness with roads if you saved a game with 141001 and now load it – roads might have disappeared until you build more of them. The joys of making mistakes in code!

There’s a new mod kit that you can download.
And here’s the new patch: Download it here.
