This War of Mine: Unpacking Gamefiles
December 2018
This War of Mine is a game where you have to deal with the consequences of war. Rather than playing a soldier, you control a group of civilians who try to survive the war, defending themselves and raiding other buildings in the process. The game especially highlights the emotional conflicts a person goes through during those times and presents the player with difficult decisions.
Unpacking the game files and inspecting how the game works has been on my todo list for a long time. I gave it a shot 4 years ago but did not put in enough time to finish it. It has been bugging me ever since, so I finally gave it another shot and finished the project.
I started by looking through the game’s directory and noticed two notable file extensions: *.idx and *.dat. Every idx file is accompanied by a dat file, and the idx file is considerably smaller than the corresponding dat file.
This implies that the idx files hold meta information describing the content, which is probably embedded inside the dat files.
Opening the executable in Binary Ninja and skimming through the functions, I noticed the function MountInternalFileSystem as well as a few other functions belonging to some kind of filesystem. Firing up GDB and breaking inside MountInternalFileSystem, I discovered that all containers, like common and animations, are mounted at the start of the game.
To figure out more, I wanted to see how the game accesses a file inside a container, hoping to find out how the .dat and .idx files are structured.
Knowing that the game uses Lua, I searched for any string containing .lua and found the function LuaWrapper::ExecuteFile by following the cross-references to that string. This function is executed right at the start of the game and executes the main Lua file, which in turn executes other Lua files. Putting a breakpoint in that function and starting the game, we see that the game tries to execute a file located at common/scripts and named Main. The path does not exist inside the game directory, but common is the name of one of the containers we saw being mounted earlier.
Looking at this function in Binary Ninja, we see three interesting calls. First of all, the function allocates a new instance of the class FileReader. After that, it retrieves the file length, allocates a buffer of that length (+1 for the zero terminator) and reads the file contents into that buffer.
Digging deeper into the constructor of the FileReader class, we arrive at a function called GetInputStream from the class FileSystem. The only thing the constructor does is retrieve the input stream of the passed file, which finally leads us to FileSystemContainerMountPoint::GetInputStream.
This function passes a pointer to a local variable to the function ContainerFileReaderCore::GetIndex. ContainerFileReaderCore::GetIndex retrieves the file information from the corresponding .idx file and fills the given struct with it. We’ll revisit this function later.
After that, FileSystemContainerMountPoint::GetInputStream constructs an instance of the FileSystemContainerInputStream class and returns it. Looking into that constructor, we discover the initialization of a class called DecompressionContext, so we’ll need to keep in mind that the data is probably compressed.
By now, we have a basic understanding of the game’s workflow. Every pair of .idx and .dat files represents a container, which is mounted as soon as the game starts. Every access to a file creates an instance of the FileReader class. The constructor of that class prepares everything needed to actually read from the file. After that, the file can be read by calling FileReader::Read.
Next, I stumbled over a function called ContainerFileReaderCore::ReadIndicesFromFile, which opens the .idx files. The function reads the first 11 bytes of the file in a first read call, and all the remaining bytes in a second one.
The first 11 bytes in an .idx file are some kind of metadata, whereas the remaining content describes the files embedded in the corresponding .dat file. We need to figure out how the file is structured. For that, we can take a look at the earlier mentioned function ContainerFileReaderCore::GetIndex, which is responsible for retrieving data about a specific file from the .idx files.
The function is pretty straightforward. It receives a path to a file, which is converted to lowercase and stripped of its first character (/). After that, the string length is calculated. Both values are passed to Hash::Murmur32, along with the seed and another constant used inside the hash function.
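The path preparation described above can be sketched in a few lines of Python. This is only an illustration of the normalization step, not the game’s code: the function name normalize_container_path and the example path are made up, and the hashing itself is left out because the Murmur variant and seed are not spelled out here.

```python
def normalize_container_path(path: str) -> bytes:
    """Lowercase the path and drop one leading slash (hypothetical helper)."""
    lowered = path.lower()
    if lowered.startswith("/"):
        lowered = lowered[1:]
    # Assumption: paths are plain ASCII, so bytes and characters match 1:1.
    return lowered.encode("ascii")


# Hypothetical example path; the real strings passed by the game may differ.
key = normalize_container_path("/Scripts/Main.lua")
print(key, len(key))  # the two values that would be handed to the hash
```

The resulting byte string and its length are what would be passed on to the hash function.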
The calculated hash is then used to find the FileIndex struct. The last block of the function is interesting, as it copies the found FileIndex struct into the FileIndex passed by the caller. From that copy we know that the struct consists of 4 dwords and 1 byte, so 17 bytes in total.
This means that, if we are correct, the .idx file should be 11 bytes (header) + N * 17 bytes (N = file count) big. The common.idx file totals 17487 bytes, and (17487 - 11) / 17 = 1028, meaning the idx file contains metadata about 1028 files, which are probably located inside the corresponding .dat file.
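The same sanity check can be written as a small Python helper. The constant and function names are my own; the only facts used are the 11-byte header and the 17-byte entry size derived above.

```python
HEADER_SIZE = 11  # bytes read by the first read call
ENTRY_SIZE = 17   # 4 dwords + 1 byte per file


def index_entry_count(idx_size: int) -> int:
    """Return the number of entries an .idx file of this size should hold."""
    payload = idx_size - HEADER_SIZE
    if payload < 0 or payload % ENTRY_SIZE != 0:
        raise ValueError("size does not match the assumed header/entry layout")
    return payload // ENTRY_SIZE


print(index_entry_count(17487))  # size of common.idx -> 1028
```

If the assumed layout were wrong, the size would most likely not divide evenly and the helper would raise instead of returning a count.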
We can leverage the Binary Ninja API to give us a quick overview of the content of the common.idx file. We know that each file is described by 4 dwords and one byte, and that the first 11 bytes are skipped. Opening the common.idx file in Binary Ninja and running this script
from binaryninja import BinaryReader, Endianness  # already in scope in the scripting console

br = BinaryReader(bv, Endianness.LittleEndian)  # bv is the BinaryView of common.idx
br.seek(11)  # skip the 11-byte header
while not br.eof:
    u1 = br.read32()  # dword
    u2 = br.read32()  # dword
    u3 = br.read32()  # dword
    u4 = br.read32()  # dword
    u5 = br.read8()   # single byte
    print(hex(u1), hex(u2), hex(u3), hex(u4), hex(u5))
returns this list: https://pastebin.com/9jw4Nygz. We will use it in a moment. Every line in that list describes one file embedded in the corresponding .dat file.
Now we need to examine how the FileIndex struct returned by GetIndex is used. As the game uses Lua, I searched for some kind of Lua initialization function and found LuaGameDelegate::OnBeforeInit, which luckily executes a Lua file via LuaWrapper::ExecuteFile. This means that the function has to go through the entire process of hashing the name, getting the FileIndex struct and finally using it to read the Lua file, so hopefully we’ll find all our answers there. The file we are going to follow is named Main.lua. As this file resides inside the common container, its hash has to be inside the list of files we extracted from common.idx earlier.
To find out the hash of the filename, we can place a breakpoint inside the game’s hash function and peek at the eax register. We discover that the hash is 0x5faa28a2, which has to be inside the list of files we extracted from common.idx earlier, since the hash is the only way to identify a file in there. Looking it up, we find its entry:
('0x5faa28a2L', '0x205L', '0x902L', '0xd5cb2L', '0x1')
Now we also know that the first DWORD represents the filename hash in our filelist:
Offset 0: Hash
Offset 4: Unknown
Offset 8: Unknown
Offset 12: Unknown
Offset 16: Unknown
First of all, LuaWrapper::ExecuteFile uses the newly retrieved FileIndex struct to acquire the file length of the Lua file, as it needs to allocate enough memory to hold its content.
Digging into FileReader::GetFileLength, we discover that it returns a value at a static offset. Breaking inside the function and looking at the memory, we find the FileIndex struct with all values from our file entry inside it. The value returned from that function is 0x902, which is at offset 0x8 in our file entry.
We now only have three unknowns left in our file entry:
Offset 0: Hash
Offset 4: Unknown
Offset 8: Filesize
Offset 12: Unknown
Offset 16: Unknown
To find the last pieces of the puzzle, we need to take another look at the FileReader constructor, again in the context of LuaWrapper::ExecuteFile. Digging deep into the FileReader constructor, we arrive at a function called PhysicalFileReader::RequestNextChunk. This is an asynchronous function which only moves the chunk request into a queue. The queue is processed by another thread. It also stores a pointer to the function BaseThread::_RPCHelper<PhysicalFileReader> in a struct. So we’ll set a breakpoint at the FileReader constructor and, after it triggers, place another breakpoint at BaseThread::_RPCHelper<PhysicalFileReader>.
Looking at the RPCHelper function, we notice fseek and fread calls. This function is most likely used to read the requested file. We can set a breakpoint at those calls and take a look at the arguments.
The fseek function is used to skip to a certain position inside a file. Its second argument is 0xd5cb2, a value we recognize from offset 12 of our file entry. So it’s safe to say that the DWORD at offset 12 is the offset at which our file starts inside the .dat file. Stepping until the fread call and looking at its arguments, we recognize another value: 0x205. This one is passed to fread as the number of bytes to read, starting at the offset we seeked to earlier. The file Main.lua can therefore be found inside the common.dat file at offset 0xd5cb2 and is 0x205 bytes long.
Now we have almost all values, except for the last byte:
Offset 0: Hash
Offset 4: Filesize (before decompression)
Offset 8: Filesize
Offset 12: Offset inside .dat file
Offset 16: Unknown
Remember the DecompressionContext we noticed earlier: compression is the reason why we have two different filesizes, the size of the file inside the .dat file before decompression and the size of the file after decompression.
At this point we are basically ready to write the program to extract the files. But we need to use zlib in the same way as the game, with the same arguments. For that, we can simply look at the DecompressionContext which we encountered earlier.
zlib’s inflateInit2 macro expands to a call to inflateInit2_ that looks like this:
inflateInit2_((strm), (windowBits), ZLIB_VERSION, (int)sizeof(z_stream))
So the arguments passed via the stack are:
- esp 0x0: The strm object
- esp 0x4: windowBits, 0x1f (31)
- esp 0x8: zlib version, “1.2.3” here
- esp 0xc: size of the strm object, 0x38 (56)
Using those values to initialize zlib, we are able to start extracting the files.
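To illustrate what a windowBits value of 31 means: in zlib, 31 is 16 + 15, which selects a 32 KiB window and tells inflate to expect a gzip header and trailer. A minimal Python sketch of the same setup, using the standard zlib module rather than the game’s code (the helper name and the sample data are made up):

```python
import gzip
import zlib

WINDOW_BITS = 31  # 16 (gzip wrapper) + 15 (32 KiB window), as seen on the stack


def inflate_like_game(blob: bytes) -> bytes:
    """Decompress a blob with the windowBits value observed in the debugger."""
    return zlib.decompress(blob, wbits=WINDOW_BITS)


# Made-up sample data, compressed with a gzip wrapper for demonstration.
sample = gzip.compress(b"print('hello from a lua file')\n")
print(inflate_like_game(sample))
```

With this setting, input that lacks a gzip header is rejected with a zlib.error rather than silently decoded.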
While doing a testrun and extracting all containers, I ran into the following error:
Extracting 'gog.dat'
Extracted file: 1461901587
Zlib returned error: -3
According to the zlib manual, error code -3 stands for Z_DATA_ERROR and is returned when the input data does not conform to the zlib format.
At this point I just manually checked the file and it turned out that some files are not zlib compressed. So we finally discovered the purpose of the last byte of each entry in the .idx file.
Offset 0: Hash
Offset 4: Filesize (before decompression)
Offset 8: Filesize
Offset 12: Offset inside .dat file
Offset 16: IsCompressed
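With the layout complete, an .idx file can be parsed outside of Binary Ninja as well. The following is a minimal sketch using Python’s struct module; the field and function names are my own, and the 11-byte header is simply skipped because its meaning was not investigated.

```python
import struct
from typing import Iterator, NamedTuple

HEADER_SIZE = 11
ENTRY = struct.Struct("<IIIIB")  # little-endian: 4 dwords + 1 byte = 17 bytes


class FileIndex(NamedTuple):
    name_hash: int       # offset 0
    packed_size: int     # offset 4, size inside the .dat file
    unpacked_size: int   # offset 8, size after decompression
    dat_offset: int      # offset 12
    is_compressed: bool  # offset 16


def parse_idx(data: bytes) -> Iterator[FileIndex]:
    """Yield one FileIndex per 17-byte entry following the header."""
    for h, packed, unpacked, offset, flag in ENTRY.iter_unpack(data[HEADER_SIZE:]):
        yield FileIndex(h, packed, unpacked, offset, bool(flag))


# Synthetic input: a zeroed header followed by the Main.lua entry from above.
sample = bytes(HEADER_SIZE) + ENTRY.pack(0x5FAA28A2, 0x205, 0x902, 0xD5CB2, 1)
for entry in parse_idx(sample):
    print(entry)
```

For a real file, the bytes of common.idx would be read from disk and passed to parse_idx in the same way.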
After modifying my unpacker to handle uncompressed files, I am finally able to extract all files. The only issue left is that we cannot reconstruct the filenames, as this information is not stored anywhere. We only have the hash of each filename, from which we cannot infer the filename.
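The extraction step, including the compressed/uncompressed distinction, could look roughly like the sketch below. It is not the code of my unpacker, just the same idea expressed in Python; extract_entry and its parameters are illustrative names, and the fake .dat content is built in memory for the demonstration.

```python
import gzip
import io
import zlib
from typing import BinaryIO


def extract_entry(dat: BinaryIO, offset: int, packed_size: int,
                  is_compressed: bool) -> bytes:
    """Read one embedded file from a .dat stream, inflating it if flagged."""
    dat.seek(offset)
    blob = dat.read(packed_size)
    if len(blob) != packed_size:
        raise EOFError("entry extends past the end of the .dat file")
    if is_compressed:
        return zlib.decompress(blob, wbits=31)  # same windowBits as the game
    return blob  # stored as-is


# In-memory stand-in for a .dat file: one raw entry, one compressed entry.
raw = b"raw bytes"
packed = gzip.compress(b"compressed bytes")
fake_dat = io.BytesIO(raw + packed)

print(extract_entry(fake_dat, 0, len(raw), False))
print(extract_entry(fake_dat, len(raw), len(packed), True))
```

Trusting the flag from the index avoids having to guess from the data whether an entry is compressed.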
Every time the game has to access a file, it needs to hash the filename. We can inject a little hook into the game which writes the parameters of every call to the hash function to a file.
I placed the hook at the end of the Hash::Murmur32 function. The hook uses the Linux int 0x80 system call interface to write to stdout. As the game does not write anything else to stdout, we are free to spam it and pipe its output to a file without the need to filter the output afterwards.
The following assembly first saves the needed registers on the stack, then writes the 4-byte hash, the filename and finally a newline to stdout.
push eax ; push register on stack
push ebx ; push register on stack
push ecx ; push register on stack
push edx ; push register on stack
mov eax, 4 ; number of write interrupt
mov ebx, 1 ; destination of write, 1 for stdout
lea ecx, [esp+0xc] ; hashing result
mov edx, 4 ; bytes to write, hash result is always 4 bytes
int 0x80 ; call interrupt
mov eax, 4 ; number of write interrupt
mov ebx, 1 ; destination of write, 1 for stdout
mov ecx, [esp+0x10] ; filename
mov edx, [esp+0x14] ; bytes to write, length of the filename
int 0x80 ; call interrupt
mov eax, 4 ; number of write interrupt
mov ebx, 1 ; destination of write, 1 for stdout
mov ecx, 0x87e86eb ; newline found somewhere in the file
mov edx, 1 ; bytes to write, newline is always 1 byte
int 0x80 ; call interrupt
pop edx ; restore saved registers
pop ecx ; restore saved registers
pop ebx ; restore saved registers
pop eax ; restore saved registers
Every time the game now accesses a file, we get the hash and the corresponding filename printed to stdout. We can pipe the entire output into a file and use that file to map the file hashes to actual filenames.
if (this->fileNames.count(fi.hash)) {
    // Known hash: use the recovered filename
    resultPath /= fs::path(this->fileNames[fi.hash]);
    printf("Extracted file: %s\n", this->fileNames[fi.hash].c_str());
} else {
    // Unknown hash: fall back to the numeric hash as the filename
    resultPath /= fs::path(std::to_string(fi.hash));
    printf("Extracted file: %s\n", std::to_string(fi.hash).c_str());
}
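Building the hash-to-name map from the hook’s output needs a little care, because the hash is written as 4 raw bytes and may itself contain a newline byte. A minimal Python sketch, assuming each record is exactly "4 hash bytes (little-endian), name, newline" and that names never contain a newline; the function name and the sample names are made up:

```python
import io
import struct
from typing import BinaryIO, Dict


def read_hash_log(stream: BinaryIO) -> Dict[int, str]:
    """Parse records of the form <4-byte LE hash><name>\\n into a dict."""
    names: Dict[int, str] = {}
    while True:
        raw_hash = stream.read(4)  # fixed-size read, so 0x0a inside is harmless
        if len(raw_hash) < 4:
            break  # end of log (or a truncated final record)
        (name_hash,) = struct.unpack("<I", raw_hash)
        name = stream.readline().rstrip(b"\n")
        names[name_hash] = name.decode("ascii", errors="replace")
    return names


# Synthetic log; the second hash deliberately ends in a 0x0a byte.
log = (struct.pack("<I", 0x5FAA28A2) + b"scripts/main.lua\n"
       + struct.pack("<I", 0x0A0B0C0D) + b"some/other/file.lua\n")
print(read_hash_log(io.BytesIO(log)))
```

Since the hook fires on every call to the hash function, the log may also contain strings that are not filenames; those simply never match an entry in an .idx file.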
Obviously this will only contain files which were opened by the game while we played. The longer we play, the bigger our file map will become.
The full source code of the unpacker/packer can be found on my GitHub.
I am hoping to transfer most of the knowledge gained from this game to the recently released Frostpunk, which uses the same engine.