2015-06-07

xoreos is a FLOSS project aiming to reimplement BioWare's Aurora engine (and derivatives), covering their games starting with Neverwinter Nights and potentially up to Dragon Age II. This post gives a short update on the current progress.

Note: This is a cross-post of a news item on the xoreos website (part 1, part 2, part 3).

And further down the path of getting all targeted games to show areas I go. Previously, I wrote about my progress with The Witcher, Jade Empire and Neverwinter Nights 2. For the two months after that, I took a look at the odd one out: the Nintendo DS game Sonic Chronicles: The Dark Brotherhood.

Yes, a Nintendo DS game. I wasn't so sure myself that the game is actually a "proper" target for xoreos. I'm still not 100% sure, but I know now that it at least does use several BioWare file formats, as well as Nintendo DS formats. I also saw that some of those BioWare formats are used in Dragon Age: Origins as well, so Sonic Chronicles actually did provide a natural station on my path.

This report is divided into three pages. On this here first page, I go a bit into the details of those common BioWare file formats. On the next page, I cover the graphics (which are mostly in Nintendo formats). And the third page shows how I tied it all together in xoreos.

So, onwards to the BioWare formats.

GFF4

GFF is BioWare's "General File Format", which is used as the basis for many things in BioWare games. It's an old format, already found in the Infinity Engine, but not quite as complex yet. Conceptually, it is comparable to XML[1]: hierarchical data, organized in a tree-like fashion, able to hold basically everything. As such, it's used to describe areas, characters, items, dialogues, ... Unlike XML, however, GFF is a binary format, not directly human-readable.

Since GFF is such an important format, xoreos already implemented a reader (thanks to BioWare releasing specifications for the Neverwinter Nights toolset). And we provide a tool, gff2xml, to convert GFF files into XML for easier readability, too. The reader only handled, however, versions 3.2 (used by Neverwinter Nights, Neverwinter Nights 2, Knights of the Old Republic, Knights of the Old Republic 2 and Jade Empire) and 3.3 (used by The Witcher). But Sonic Chronicles, Dragon Age: Origins and Dragon Age II needed a reader for versions 4.0 and 4.1 -- and boy did they change the format.

You see, after converting the GFF3 to XML, the whole thing is really quite readable and understandable. Every tag has a full string as a name, making the uses and intentions clear. But from the game's perspective, this has a huge drawback: it's slow. Strings are unwieldy, slow to read and compare, and variable length items are generally a pain when you want to quickly jump to a specific field. To curb that, GFF4 removes those pesky strings. Instead, fields use a single 32-bit integer as their "name", making comparisons easy as pie.



Lucky for me, the new GFF4 format is already documented on the Dragon Age Toolset Wiki. The huge number of example files provided by the two Dragon Age games and Sonic Chronicles gave me ample opportunities to test out corner cases as well. Easy. The gff2xml tool mentioned above now supports GFF4 as well.

[1] In fact, BioWare generates their GFF4 files out of XML, as can be seen from the Dragon Age: Origins toolset.

TLK

Next up, I saw a new TLK format used in Sonic. TLK is a "talktable", a list of strings indexed by a numerical ID. The idea is that you have all text used in the game in one place, easy to use and easy to translate. The format was already used in Neverwinter Nights, so xoreos has a reader for it. It's relatively simple, too.

However, the new format is quite different. In fact, it's a GFF4! I did say that you can basically stick everything in a GFF, right? That's what they did for Sonic Chronicles (and the two Dragon Age games). With the new GFF4 reader, adding GFF4'd TLK support was quick and painless.

GDA

Just like the GFF4'd TLK, GDA is an old friend in a GFF4 suit. This time, it's 2DA: a two-dimensional array, a table if you will. If you're still lost, think Excel spreadsheet, a simple collection of data organized on a grid.

2DAs are used to, for example, specify the models of different objects. The GIT file describing objects in an area would say "Here's an object, we call it Chair, it has Appearance 179". The game then looks into appearances.2da, at row 179 and column "ModelName", grabs the filename there and loads it as the object's model.
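That lookup can be sketched in a few lines of Python. The table contents and the model_for helper are made up for illustration; only the row number 179 and the "ModelName" column come from the example above.

```python
# A 2DA is a table: rows are indexed by number, columns by name.
# The cell values below are invented placeholders, not real game data.
appearances = {
    179: {"Label": "Chair", "ModelName": "example_chair_model"},
}


def model_for(table, appearance_id, column="ModelName"):
    """Return the cell at (row, column), or None if the row or cell is missing."""
    row = table.get(appearance_id)
    if row is None:
        return None
    return row.get(column)
```

An object with Appearance 179 would then resolve its model through model_for(appearances, 179).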

GDA is, essentially, just the same thing stored as a GFF4: a list of columns giving their name and type, and a list of rows with the data for each column. However... While real 2DAs have an actual column name (the "ModelName", for example), making guessing the meaning easy, GDAs don't actually store a name. They store a hash of the name (specifically, the CRC32 of the UTF-16LE encoded string in all lowercase), a number that's meaningless in and of itself.
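As a minimal sketch of that hashing step, assuming the CRC32 in question is the standard one that Python's zlib module implements (the function name gda_column_hash is mine, not something from the game or from xoreos):

```python
import zlib


def gda_column_hash(name: str) -> int:
    """Hash a column name the way the text describes:
    lowercase it, encode it as UTF-16LE, take the CRC32."""
    data = name.lower().encode("utf-16-le")
    # Mask to 32 bits so the result is always an unsigned value.
    return zlib.crc32(data) & 0xFFFFFFFF
```

Because of the lowercasing, "ModelName" and "modelname" end up with the same hash, which also means the original capitalization can never be recovered from a GDA.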

There are 845 unique hashes in the GDA files found in Sonic. There's no real way to turn them back into readable strings, but there's a certain trick I could apply: a dictionary attack. I got myself a huge list of words found in a dictionary, hashed them, and compared the hashes. Then I extracted all strings I could find in the game (from the GFFs, mostly), and did the same. Then I combined the words of these lists. Then I combined matches. Each time, I manually went through the list to kick out the many, many false positives: strings that hashed to a valid number, but that don't make sense in the context of the game ("necklessnoflyzone", "rareuniquemummifications", "properlyunsmoked").
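The core loop of such a dictionary attack might look like the sketch below. It is not the script I actually used; the function names, the max_words limit and the use of zlib's CRC32 are assumptions made for the example.

```python
import zlib
from itertools import product


def name_hash(name: str) -> int:
    """CRC32 of the lowercase, UTF-16LE encoded name (zlib's CRC32 assumed)."""
    return zlib.crc32(name.lower().encode("utf-16-le")) & 0xFFFFFFFF


def dictionary_attack(targets, words, max_words=2):
    """Try every concatenation of up to max_words words.

    Returns {hash: [candidate strings]} for each target hash that was hit.
    A hit is only a candidate: unrelated strings can collide with a real
    hash, so the results still need a manual sanity check.
    """
    targets = set(targets)
    found = {}
    for count in range(1, max_words + 1):
        for combo in product(words, repeat=count):
            candidate = "".join(combo)
            value = name_hash(candidate)
            if value in targets:
                found.setdefault(value, []).append(candidate)
    return found
```

The number of candidates grows as len(words) ** max_words, so with a full dictionary even two-word combinations produce a lot of collisions, which is where the false positives come from.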

Phew, that was a lot of tedious work. Still, I managed to find the source strings for 534 of those 845 hashes, 63%. Sure, there's still 311 missing, but that'll have to wait for later.

And that's it for the common BioWare file formats. See page two for the graphical formats.

Page 2

After having implemented readers for the common BioWare formats, I turned to the graphics formats. They're, for the most part, stock Nintendo DS formats, which means I could build upon detective work from the Nintendo modding scene. I have to thank three people in particular: Martin Korth, of NO$GBA fame, whose GBATEK documentation is invaluable; lowlines, who figured out many of the gory details of Nintendo's formats; and pleoNeX, whose GPLv3-licensed tool Tinke provided the base on which I implemented my code.

SMALL

When I looked over the files inside the Sonic Chronicles archives, I noticed a peculiar thing. There are a lot of files with names ending in ".small". I suspected a compression scheme, especially after examining the files in a hex editor. And sure enough, there are several compression algorithms provided by the Nintendo DS firmware. The one used by Sonic Chronicles is an LZSS variant, which can be decompressed using barubary's MIT-licensed dsdecmp tool (GitHub mirror). I pulled the decompressor into xoreos.
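To give an idea of what such a decompressor does, here is a minimal sketch of the common Nintendo LZSS layout that GBATEK documents as type 0x10: a 4-byte header (the type byte plus a 24-bit decompressed size), followed by flag bytes that each describe the next eight blocks. Whether .small files match this layout byte for byte is an assumption of the example, and this is not the dsdecmp or xoreos code.

```python
def decompress_lz10(data: bytes) -> bytes:
    """Decompress a Nintendo LZSS (type 0x10) stream."""
    if len(data) < 4 or data[0] != 0x10:
        raise ValueError("not a type 0x10 LZSS stream")
    size = int.from_bytes(data[1:4], "little")
    out = bytearray()
    pos = 4
    while len(out) < size:
        flags = data[pos]
        pos += 1
        # Flag bits are read from the most significant bit down.
        for bit in range(7, -1, -1):
            if len(out) >= size:
                break
            if flags & (1 << bit):
                # Back-reference: 4 bits of length, 12 bits of displacement.
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                length = (b1 >> 4) + 3
                disp = (((b1 & 0x0F) << 8) | b2) + 1
                if disp > len(out):
                    raise ValueError("back-reference before start of output")
                # Copy byte by byte, since source and target may overlap.
                for _ in range(length):
                    out.append(out[-disp])
            else:
                # Literal byte, copied through unchanged.
                out.append(data[pos])
                pos += 1
    return bytes(out[:size])
```

The byte-wise copy matters: a back-reference with a displacement smaller than its length repeats the bytes it has just written, which is how runs get compressed.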

NSBTX

The first graphics format in Sonic Chronicles I inspected was NSBTX. NSBTX files are textures, or rather, archives of several textures, with each archive used by a single model. Implementing the reading of these was simple enough, and I added a small program to our tools collection that can "extract" them into TGAs.



NFTR

Next up, I wanted to see the fonts, NFTR, used in the game. They're bitmap fonts, with each glyph an image. The image can be 1-bit black and white, or it can be greyscale for anti-aliasing, shadowing or outlining purposes. Additionally, there are mapping tables to translate a code point in a certain encoding (UTF-16, UTF-32, CP1252 or Shift-JIS) into the index of the glyph it represents.

There was a bit of trial and error involved, as the documentation and existing FLOSS projects to display the fonts weren't quite correct in certain details (which might not even be their fault; Nintendo likes to subtly change formats between firmware versions). But, before long, I could print arbitrary strings in these fonts in xoreos.

NBFS/NBFP

Sonic Chronicles comes with a few NBFS files. NBFS is a dead-simple raw format: 8-bit paletted image data, with the palette (in 16-bit RGB555 values) in an NBFP file of the same name. They're mostly used for images spanning a whole Nintendo DS screen.
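Decoding such a pair is short enough to sketch here. The example assumes the usual Nintendo DS color layout (little-endian 16-bit values with red in the lowest five bits, then green, then blue) and a default width of 256 pixels for a full-screen image; the function names are mine.

```python
import struct


def decode_palette(nbfp: bytes):
    """Turn 16-bit 5-5-5 palette entries into (r, g, b) tuples of 8 bits each."""
    usable = len(nbfp) - (len(nbfp) % 2)
    colors = []
    for (value,) in struct.iter_unpack("<H", nbfp[:usable]):
        channels = (value & 0x1F, (value >> 5) & 0x1F, (value >> 10) & 0x1F)
        # Expand 5 bits to 8 by repeating the top bits in the low bits.
        colors.append(tuple((c << 3) | (c >> 2) for c in channels))
    return colors


def decode_nbfs(nbfs: bytes, nbfp: bytes, width: int = 256):
    """Return the image as rows of (r, g, b) tuples; one input byte per pixel."""
    palette = decode_palette(nbfp)
    usable = len(nbfs) - (len(nbfs) % width)
    return [
        [palette[index] for index in nbfs[start:start + width]]
        for start in range(0, usable, width)
    ]
```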

NCGR/NCLR

The main image format used in Sonic Chronicles, however, was still missing: NCGR. As I went along implementing a reader, I learned these ugly facts:

The palettes are in separate NCLR files that are shared among NCGR files

Most of the images are made up of several NCGR files, arranged on a grid

The NCGR image data itself is made up of 8x8 pixel tiles

Essentially, this image of Sonic is divided into these parts:

This all makes assembling the final image a bit...ugly. But hey, I made it work in the end.
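The tile part of that puzzle can be sketched as follows, assuming 8 bits per pixel and tiles stored one after another in row-major order (real NCGR files have more variations than this, and the function name is made up):

```python
def untile(tiled: bytes, width: int, height: int, tile: int = 8):
    """Rearrange tile-ordered pixel data into ordinary top-to-bottom rows.

    tiled holds one byte per pixel; each tile's 64 bytes are stored together.
    Returns a list of `height` rows, each `width` bytes long.
    """
    if width % tile or height % tile:
        raise ValueError("dimensions must be multiples of the tile size")
    if len(tiled) < width * height:
        raise ValueError("not enough pixel data")
    tiles_per_row = width // tile
    rows = [bytearray(width) for _ in range(height)]
    for index in range(tiles_per_row * (height // tile)):
        tile_x = (index % tiles_per_row) * tile
        tile_y = (index // tiles_per_row) * tile
        base = index * tile * tile
        for y in range(tile):
            src = base + y * tile
            rows[tile_y + y][tile_x:tile_x + tile] = tiled[src:src + tile]
    return [bytes(row) for row in rows]
```

Placing several NCGR files on a grid is then the same idea one level up: each decoded file is pasted at its cell offset in the final image.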

...except for one thing: a few of the NCGR files don't specify their width and height. By fiddling with the values a bit, I managed to find these values manually, but the resulting image looks off: it's as if the image is supposed to be rearranged afterwards, different pieces drawn at different places. Each of those files also has an NCER file with the same name. I assume that means information on how to draw these NCGR files is within the NCER. A thing for the TODO pile.

NSBMD

I then went for the big one: the 3D model format NSBMD. This involved a lot of fiddling, guessing and trial-and-error, as the documentation of these formats is sparse, and oftentimes wrong.

Conceptually, an NSBMD file consists of these parts:

Bones

Bone commands

Materials

Polygons

Polygon commands

A bone consists of a name and a transformation that displaces it from its (at that point unknown) parent bone. They are stored as a flat list. The bone commands then specify how the bones connect together. And they give each bone a location in the Nintendo DS's matrix stack (a list of transformation matrices containing the absolute transformation of each bone at the time of rendering).
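As a heavily simplified sketch of that idea: the example below reduces each bone's transformation to a plain translation and invents a command tuple of (bone index, parent index or None, stack slot). The real NSBMD bone commands are binary opcodes and the transformations are full matrices, so treat every name here as hypothetical.

```python
def resolve_bones(local, commands):
    """Compute absolute bone positions from parent-relative offsets.

    local:    flat list of (dx, dy, dz) offsets, one per bone.
    commands: list of (bone, parent_or_None, stack_slot), in execution order.
    Returns (absolute positions per bone, matrix-stack stand-in as a dict).
    """
    absolute = [None] * len(local)
    stack = {}
    for bone, parent, slot in commands:
        if parent is None:
            base = (0.0, 0.0, 0.0)
        else:
            base = absolute[parent]
            if base is None:
                raise ValueError("parent bone used before it was resolved")
        absolute[bone] = tuple(b + d for b, d in zip(base, local[bone]))
        # Later polygon commands would fetch the transformation by slot number.
        stack[slot] = absolute[bone]
    return absolute, stack
```

The point of the stack slots is that the geometry never refers to bones by name or index; it only says "load the matrix at slot N".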

The material contains the texture name (which references a texture inside the NSBTX with the same name as the NSBMD), and a few properties.

Each polygon can use a single material, and contains a list of polygon commands. These polygon commands produce the actual geometry. They set color, normal and texture coordinates, and generate vertices. They also manipulate the matrix stack, specifically replacing the working matrix with the matrix from the stack position of certain bones. In essence, this bases the vertices on the position of the bone.

While the Nintendo DS interprets the polygon commands on-the-fly while rendering, and while they can be nearly directly converted to OpenGL-1.2-era glBegin()/glEnd() blocks, this is not really what we want to do. So instead, while loading, we interpret the polygon commands into an intermediate structure.

The result is a relatively massive loader for these files, and that doesn't yet include support for animations.

One interesting anecdote: the Nintendo DS doesn't use floating-point numbers to represent real numbers, but various formats of fixed-point numbers. Those are found extensively in the NSBMD files. But when I implemented the GFF4 format earlier, I found, in the GFF4 files used by Sonic Chronicles, a field type not described in the Dragon Age toolset wiki. Turns out, those are Nintendo DS fixed-point numbers!
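For illustration, converting such a value into a float is a one-liner plus sign handling. The sketch assumes the layout most common on the Nintendo DS, a signed 32-bit value with 12 fractional bits; other widths exist, which is why both are parameters, and the function name is made up.

```python
def fixed_to_float(raw: int, bits: int = 32, frac_bits: int = 12) -> float:
    """Interpret raw as a two's complement fixed-point number."""
    raw &= (1 << bits) - 1
    if raw & (1 << (bits - 1)):
        # Sign bit set: undo the two's complement wrap-around.
        raw -= 1 << bits
    return raw / (1 << frac_bits)
```

With 12 fractional bits, the raw value 0x1000 stands for 1.0 and 0x0800 for 0.5.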

CBGT/PAL and CDPTH

With those pesky models out of the way, I was ready to show the areas, right? Wrong. There's yet another graphics format in Sonic Chronicles: CBGT, used for the area background images.

However, CBGT isn't a Nintendo format. No, it's one of BioWare's creation. It does, though, take inspiration from the Nintendo DS formats. It consists of blocks of 64x64 pixels, each compressed using the LZSS algorithm found in SMALL files, and each block divided into 8x8 pixel tiles. PAL files of the same name carry palettes, with each CBGT able to use a different palette within the PAL.

Since I already knew how to puzzle together those cells and tiles from the NCGR format, getting the image itself was not a problem. But I was at a loss where to get the dimensions of the image from, and how to distribute the palettes onto the cells. I figured out an algorithm for the latter, that worked for nearly all images, but the outliers still annoyed me. Then it hit me: for each CBGT/PAL pair, there's a third file: a 2DA. And that one contains the information which cell uses which palette, neatly organized in a 2D table exactly how the cells are arranged in the final image. This, of course, is enough to calculate the final image dimensions as well.
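In code, the role of that 2DA boils down to something like the sketch below, where the table is given as a list of rows of palette indices, one entry per 64x64 cell. The names are hypothetical and the table values in the usage note are invented.

```python
CELL = 64  # CBGT cells are 64x64 pixels


def image_size(palette_table):
    """Derive (width, height) in pixels from the cell grid of the 2DA."""
    rows = len(palette_table)
    cols = len(palette_table[0]) if rows else 0
    return cols * CELL, rows * CELL


def palette_for_pixel(palette_table, x, y):
    """Look up which palette within the PAL applies to the pixel at (x, y)."""
    return palette_table[y // CELL][x // CELL]
```

A table such as [[0, 1, 2], [1, 1, 0]] describes an image of three by two cells, so 192x128 pixels.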

I also found a fourth file for nearly each CBGT/PAL/2DA tuple: a CDPTH. Arranged in a similar fashion to the CBGT, it contains 16-bit depth information for each area background. This is used to let certain background pieces draw over the 3D models in the game, when the models should appear behind something.

Now I was ready to implement actual Sonic Chronicles stuff. I'll describe that on page 3.

Page 3

Now that I had (nearly) everything graphical together, it was time to weave it all together into something approaching fake Sonic Chronicles gameplay.

Window size

Being a Nintendo DS game, Sonic Chronicles runs on two screens of 256x192 each, arranged on top of each other. To make this easy on me, I decided to, for now, force xoreos' window size to a static 256x384, and draw into it as if it were the two Nintendo DS screens. For the future, we'll have to think about how to handle scaling.
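The fixed layout makes it trivial to map a window position back to one of the two screens, for example for mouse clicks. A minimal sketch, assuming a top-left origin with y growing downwards (the names are made up and this is not the xoreos code):

```python
SCREEN_W, SCREEN_H = 256, 192
WINDOW_W, WINDOW_H = SCREEN_W, 2 * SCREEN_H  # 256x384


def window_to_screen(x, y):
    """Map a window pixel to ("top" or "bottom", local x, local y)."""
    if not (0 <= x < WINDOW_W and 0 <= y < WINDOW_H):
        raise ValueError("position outside the window")
    if y < SCREEN_H:
        return "top", x, y
    return "bottom", x, y - SCREEN_H
```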

There's at least two ways to handle this:

Render into two textures, one for each screen, then scale these

Scale and position the pieces separately

The former is the easy way out, but the latter might provide higher quality. There might be a third, a middle way: draw all 2D elements combined with scaling, and render the 3D objects in higher resolution.

Intro panels

To make the game feel a bit more real, I rigged up a few static panels showing the various splash screens, and the "Start your adventure" GUI. Note that the button isn't actually working: it's really just an image that waits for a mouse click.

Area background

After the intro panels, Sonic Chronicles in xoreos then dumps you right into the first area. Using a static panel again, it displays the mini map on the top screen, and the area background on the bottom screen. With the arrow keys, you can move the camera along the X and Y axes, and the area background panel follows the camera to draw the matching section.

Area placeables

That was easy enough. I then wasted the better part of a day trying to trace and guess at how the game loads the "placeable" objects, the usable 3D objects in the area. The area description within the ARE file lists all placeables, each with an integer with the "name" of 40023 that seems to be a running ID and an integer called 40018 that's unique for each type. For example, collectable rings have a 40018 value of 0, the item chest 6, and the pile of wood in the first area has a value of 15. The model names are listed in appearances.gda. So far, so familiar. However, the model for the wood pile is on row 101, and I failed to find a consistent way to connect those two numbers, 15 and 101, either mathematically or with the help of other data files, that would have worked for other placeables as well.

With nowhere else to turn, I looked at the disassembly. And I wept. There is no clean way to connect those numbers, because the placeable instantiation is hardcoded: there's a big old switch with all possible values (43 of them, 0-42) for the integer 40018, with object instantiation for each of them.

To keep it simple for now, I added a little array mapping that type ID onto a row in the appearances.gda. Not all cells are filled yet, but enough that the first area makes sense.
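In spirit, that array looks like the sketch below. Only the entry for the pile of wood (type 15, row 101) is taken from the text above; every other cell is deliberately left empty here rather than guessed, and the names are made up.

```python
PLACEABLE_TYPE_COUNT = 43  # values 0-42 of the integer 40018

# Index: placeable type ID. Value: row in appearances.gda, or None if unknown.
APPEARANCE_ROW = [None] * PLACEABLE_TYPE_COUNT
APPEARANCE_ROW[15] = 101  # pile of wood in the first area


def appearance_row(type_id):
    """Return the appearances.gda row for a placeable type, or None."""
    if not 0 <= type_id < PLACEABLE_TYPE_COUNT:
        raise ValueError("unknown placeable type")
    return APPEARANCE_ROW[type_id]
```

Placeables whose type maps to None simply can't get a model yet.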

But, to get the models to display correctly, there was still something missing. You see, xoreos sets up a proper perspective projection matrix, where objects in the distance are smaller and all this jazz. Sonic Chronicles, however, uses an orthogonal projection at an angle of 45°. So, I added a method in our GraphicsManager to let the game code select an orthogonal projection instead.
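For reference, an orthogonal (orthographic) projection matrix is much simpler than a perspective one. The sketch below builds the standard matrix that glOrtho() produces, in plain Python and row-major order; it does not show the actual GraphicsManager method, and the 45° angle would come from a separate view rotation that is left out here.

```python
def ortho(left, right, bottom, top, near, far):
    """Standard orthographic projection matrix (same values as glOrtho)."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def project(matrix, point):
    """Apply a 4x4 row-major matrix to an (x, y, z) point with w = 1."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)
```

Unlike with a perspective projection, the projected x and y of a point don't depend on its z at all, so objects in the distance keep their size.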

And after some other minor fixes related to this, like scaling the rate of camera movement to fit the 45° angle (so that 3D objects stay at the same point on the background image when you move), and adding the changed orthogonal viewport to our unproject() function (so that detecting that you've moused over an object works correctly), the placeables now display and behave correctly within the areas.

That all?

So, what's missing? Quite a lot, actually:

Model animations. Those can be both geometry- and texture-based. In geometry-based animations, the bones move around and rotate, leading to different vertex positions. With texture-based animations, the textures move, rotate or scale, or even get replaced by different textures.

Most of the placeable types aren't yet recognized. Nor do the placeables do anything yet. Nor do we create creatures, triggers and squads. Nor do we have a player character that can move around.

There are also no conversations of any kind implemented yet, and there's no proper GUI and menu support.

We're also lacking sound and music. There's partial documentation for them, though, so it should be relatively easy to manage. Videos, which we still miss too, will be more difficult: we need to reverse-engineer the ActImagine VX video codec, since no one has done that yet.

This all is stuff for the TODO pile, though. Nothing I want to work on at the moment. Of course, we would welcome your contribution, so please, don't hesitate to contact us if you want to tackle any of these features, or anything else for that matter!

What's next, then?

If I want to continue on the path of getting all games to show areas, Dragon Age: Origins would be next. We'll see how well I'll do there, I guess!
