Archive for June, 2008

Got into reverse engineering mode

2008-06-27 (Friday)

Hi! I got the opcode comparison between the Operation Stealth versions done after all (it took an awful lot of time with all the procrastination and indecision thrown in, and it wasn’t very inspiring work), and now I’ve mostly been reverse engineering Operation Stealth’s missing opcodes or fixing the implementations of existing ones. Feels a lot better now :-). Reverse engineering is fun!

IDA’s graph view has proven extremely useful. It gives a flow chart view of a function, which helps a lot in understanding what the function does and how it does it. It wouldn’t be easy to go back to using IDA without it, I’ll say! The graph view is one good reason for getting a license for IDA Pro, as the freeware version 4.9 of IDA doesn’t have it.

Mid-term goals

Talked with Sev and he urged me onwards, as I was still fiddling with the opcode comparison on Monday evening. So we set a new goal for the mid-term: all unimplemented opcodes that are used in Operation Stealth’s intro have to be implemented by mid-term, and an animation-related regression has to be fixed. Here’s the goal in parts:

Implement unimplemented opcodes used in intro:

  • 40h: Undefined opcode (Was o1_closePart in Future Wars)
  • A1h: o2_removeGfxElementA0
  • A2h: o2_opA2
  • A3h: o2_opA3

And fix the animation regression in Operation Stealth’s very first room (The actor just slides around unanimated).

Recent achievements

So here’s what I’ve got done recently:

Fixed already existing opcodes:

  • 08h: o1_checkCollision in r32790
  • 83h: o2_isSeqRunning in r32804
  • 9Ah: o2_wasZoneChecked in r32790
  • A0h: o2_addGfxElementType20 (Renamed opcode) in r32769

Implemented missing opcodes:

  • 82h: o2_modifySeqListElement (Renamed opcode) in r32786
  • 8Dh: o2_op8D (No good name yet) in r32785
  • A1h: o2_removeGfxElementType20 (Renamed opcode) in r32769
  • A2h: o2_addGfxElementType21 (Renamed opcode) in r32769
  • A3h: o2_removeGfxElementType21 (Renamed opcode) in r32769

Added Operation Stealth support for:

  • zoneQuery in r32790 (Only used in Operation Stealth)
  • addOverlay in r32816 (Was previously Future Wars specific)

What’s still left of the mid-term goals:

As opcode 40h turned out to be a no-operation in Operation Stealth’s PC version (it was o1_closePart in Future Wars), the only thing left is the actor animation regression (actor animation used to work in Operation Stealth’s very first room, but now the actor isn’t animated and just slides around like on wheels). Once that’s fixed I’ve got my mid-term goals done.

Delphine compression format deciphering

2008-06-16 (Monday)

Spent most of my work time last week deciphering Delphine’s compression format. Now I’m satisfied that the implementation in ScummVM’s Cine engine is quite robust and hopefully understandable too (I documented it and really made myself understand what the darn thing does).

The decompression code had originally been reverse engineered from disassembly by someone else and there were no comments to speak of, which is why it wasn’t very understandable to me in the beginning. Just compare the older version that I started with (Header file & code file) with the version that’s now in place (Header file & code file).

I also wrote documentation for the compression format on ScummVM’s wiki (look there if you want to know intimately how the compression format works).

So in the end it turned out that the compression algorithm used by Delphine’s adventure games is sliding window compression (quite like LZ77) combined with a fixed, non-adaptive entropy coding scheme (not of any type I could recognize).
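To give an idea of what "sliding window compression" means on the decoding side, here’s a minimal generic sketch. It is not Delphine’s actual format: the token layout ("lit" and "copy" tuples) and the function name lz_decode are made up for illustration, and the entropy coding layer is left out entirely.

```python
def lz_decode(tokens):
    """Decode a list of LZ77-style tokens into bytes.

    Each token is either ("lit", byte_value) or ("copy", distance, length).
    This token layout is hypothetical and only illustrates the idea.
    """
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            # A literal byte is emitted as-is.
            out.append(token[1])
        else:
            _, distance, length = token
            if distance < 1 or distance > len(out):
                raise ValueError("back-reference points outside the window")
            # Copy byte by byte so that overlapping copies
            # (distance smaller than length) repeat recent output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


if __name__ == "__main__":
    # Two literals followed by a back-reference that repeats them.
    print(lz_decode([("lit", 97), ("lit", 98), ("copy", 2, 4)]))
```

The byte-by-byte copy is the important detail: a back-reference may reach into bytes that the same copy has only just written.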

Now to use the decompressor and unpack all those scripts and do the comparison…

P.S. Oh, and I also tried IDA’s graph view for the first time and worked with it on Delphine’s unpacking routine’s disassembly. Here’s an image of the abstracted graph of the unpacking function.

Some procrastination and preparation for opcode comparison

2008-06-09 (Monday)

Hello everyone. There’s been some procrastination in the air, so to speak. I’ve been working on an opcode comparison between the different Operation Stealth versions in order to discover which opcodes are platform specific (Amiga/PC/ST), which are used extensively, etc. The idea is that once I know that, I’ll know which opcodes to try to implement first. Sev pointed me to a standalone resource file unpacker and script decompiler for Cinématique.

My first problem was which files to decompress, i.e. knowing which files are resource files. Well, I looked at the ‘vol.1’ to ‘vol.9’ files. They seemed to be simply text files containing lists of at least some of the resource files (maybe all of them, don’t know). Then I got interested in how Cine currently figures out which resource files to unpack, and it seemed it doesn’t use the ‘vol.?’ files at all but uses a ‘vol.cnf’ file instead. Well, I tried to figure out the code that deciphers the ‘vol.cnf’ file and now I understand it. I started a page about Cinématique’s internals, file formats etc. in ScummVM’s wiki at the Cine Specifications page and wrote some documentation about the ‘vol.cnf’ file format there.

The standalone resource file unpacker and script decompiler didn’t have any support for wildcards or directory recursion, so I wanted to use Python to do the directory tree walking and call the external unpacker or script decompiler when necessary. I hadn’t done directory tree walking or clean command line option parsing in Python before, so it took some time to learn those. Eventually, after a few attempts, I used getopt.getopt for command line parsing and os.walk for directory tree walking.
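Here’s a minimal sketch of how those two standard library functions fit together. The option names (-o/--outdir, -v/--verbose) and the helper names parse_args and find_files are made up for illustration; they aren’t taken from my actual script.

```python
import getopt
import os
import sys


def parse_args(argv):
    """Parse hypothetical options: -o/--outdir DIR and -v/--verbose."""
    opts, paths = getopt.getopt(argv, "o:v", ["outdir=", "verbose"])
    settings = {"outdir": ".", "verbose": False}
    for name, value in opts:
        if name in ("-o", "--outdir"):
            settings["outdir"] = value
        elif name in ("-v", "--verbose"):
            settings["verbose"] = True
    # Whatever is left after the options is treated as directories to walk.
    return settings, paths


def find_files(root):
    """Yield the path of every file below root, recursively."""
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            yield os.path.join(dirpath, filename)


if __name__ == "__main__":
    settings, roots = parse_args(sys.argv[1:])
    for root in roots:
        for path in find_files(root):
            if settings["verbose"]:
                print(path)
```

getopt.getopt raises getopt.GetoptError on an unknown option, so a real script would probably catch that and print a usage message.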

I was also somewhat torn over what to do in Python and what to do in C. The resource file unpacker and script decompiler were in C. If I were to call them from Python, what about error handling? I couldn’t find a nice way to pass info back to the Python script from the external programs. I tried popen3 a bit, but maybe the stdin/stdout/stderr passing only works on Unix, dunno, or maybe I just didn’t try hard enough :-).
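For what it’s worth, here’s a sketch of one way to get an exit code and the output streams back from an external program. It uses subprocess.run from current Python 3 (3.7 or newer), which is not what I tried at the time, and the helper name run_tool is made up. The child process in the example is just the Python interpreter standing in for the unpacker.

```python
import subprocess
import sys


def run_tool(command):
    """Run an external program and return (exit_code, stdout, stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in for the C unpacker: a child process that writes to both
    # streams and exits with a non-zero status to signal an error.
    child = "import sys; print('unpacked'); sys.stderr.write('warn'); sys.exit(3)"
    code, out, err = run_tool([sys.executable, "-c", child])
    print(code, out.strip(), err.strip())
```

With this approach the C program only needs to follow the usual convention of a non-zero exit code on failure and error messages on stderr.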

Eventually, after some procrastination and trying things in different ways, I’m now quite convinced that a good compromise is to let the low-level stuff be done in the C code (bit manipulation, unpacking etc.) and the higher level stuff (calculating the scripts’ opcode statistics etc.) be done in Python. That way I’m using each language for what it’s best at instead of trying to do most or all of the stuff in one language only.

So now I’m at the point where I’ll just combine the Python and C programs to unpack all the resource files for the different game versions, and then modify the script decompiler to output in .CSV format so that it’s easy to read the C program’s output in with Python or OpenOffice. Once I have the data parsed I can do the opcode statistics calculations. There’s still the question of how to detect which files are script files and which are not, so that I know which files to call the script decompiler on. Well, we shall see…
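The Python side of that plan could look roughly like the sketch below. The CSV layout (a header row with "script" and "opcode" columns) is purely an assumption, as the decompiler’s output format doesn’t exist yet, and the function names opcode_counts and opcodes_only_in are made up too.

```python
import csv
import io
from collections import Counter


def opcode_counts(csv_text):
    """Count how often each opcode occurs in one version's CSV dump.

    Assumes a header row with at least an "opcode" column (hypothetical).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["opcode"] for row in reader)


def opcodes_only_in(counts_by_version, version):
    """Return the opcodes that appear in one version and in no other."""
    others = set()
    for name, counts in counts_by_version.items():
        if name != version:
            others.update(counts)
    return set(counts_by_version[version]) - others


if __name__ == "__main__":
    # Tiny invented samples, not real game data.
    pc = opcode_counts("script,opcode\nintro,A0h\nintro,40h\nroom1,A0h\n")
    amiga = opcode_counts("script,opcode\nintro,A0h\n")
    print(pc.most_common())
    print(opcodes_only_in({"pc": pc, "amiga": amiga}, "pc"))
```

Counter.most_common gives the "extensively used" ordering directly, and the set difference answers the platform specific question.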

BTW sometimes understanding C code that deals with pointers and C library calls can be a bit of a PITA (and no, I don’t mean the edible kind). Just compare fixVolCnfName()’s old C code version with the more C++ style version I wrote, which has some comments. Now I can understand what it does :-).