Decompilation notes
This page is a collection of notes and ideas regarding the topic of static code analysis and decompilation, primarily with Ghidra in mind.
The code base has grown quite a bit, and most of it is PC-specific, so the first step would have to deal with that. Certain optimizations would have to be removed as well:
- Removal of unneeded video code
- Removal of unneeded audio code
- Expansion of the tcall tail call macro to call + ret rather than jmp
- Addition of a wrapper function for map access, perhaps abstracted to a macro
- More comprehensive documentation of calling conventions, either beforehand or during static analysis
All this should ideally happen in the static-analysis branch created for this purpose.
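Ghidra is a powerful disassembler with a built-in decompiler, but it works on binaries and cannot decompile assembler source code. The closest we can get is symbolic disassembly, that is, disassembling the built binary with the symbol names from the source attached.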
The following steps would be needed:
- Prepare the YASM-generated symbol map for use in Ghidra (see below)
- Use the script ImportSymbolsScript.py from Ghidra's script manager to import the symbols
- Manually add all the data types and correct errors caused by automatic code detection
- Find the best decompiler settings
- Manually add appropriate calling conventions for all the functions
- Export the resulting C code and post-process it with a bunch of regular expressions
- Manually turn the result into a proper (and properly documented) C port of the engine, ideally while viewing C and assembler code side by side
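The regular-expression post-processing step could look roughly like the sketch below. The rule list is an assumption made for illustration, not the set of expressions actually used for this project: it only maps Ghidra's fixed-width placeholder types (undefined, undefined1, undefined2, undefined4, byte, ushort) to their stdint.h equivalents. The names RULES and postprocess are hypothetical.

```python
import re

# Assumed rewrite rules (pattern, replacement). These are illustrative only;
# a real port would need many more, tuned to the actual decompiler output.
RULES = [
    (r"\bundefined1\b", "uint8_t"),
    (r"\bundefined2\b", "uint16_t"),
    (r"\bundefined4\b", "uint32_t"),
    (r"\bundefined\b", "uint8_t"),   # Ghidra's plain "undefined" is one byte
    (r"\bbyte\b", "uint8_t"),
    (r"\bushort\b", "uint16_t"),
]


def postprocess(c_source):
    """Apply every rewrite rule in order to the exported C source text."""
    for pattern, replacement in RULES:
        c_source = re.sub(pattern, replacement, c_source)
    return c_source


if __name__ == "__main__":
    # Hypothetical fragment in the style of Ghidra's C export.
    sample = "undefined2 get_tile(byte x, undefined4 p)\n{\n  ushort byte_count;\n}"
    print(postprocess(sample))
```

The word boundaries keep identifiers such as byte_count intact. The rewritten code needs `#include <stdint.h>` to compile.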
This chain of sed commands can turn YASM's symbol map output into something that Ghidra's importer script understands:
sed -nE '/Real +Virtual +Name/,$p' build/map.txt | sed -nE '2,$ s/([0-9A-F]+) +[0-9A-F]+ +([^ ]+)/\2 \1/ p' | sed -E 's/^([A-Z0-9_]+[ .])/\L\1/' | sed -E 's/(\.[A-Z0-9_]+ )/\L\1/' > px3_ghidra_sym.txt
The two rightmost sed invocations harmonize upper-case identifiers to lower case. They rely on the \L case-conversion escape, which is a GNU sed extension.
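For systems without GNU sed, the same conversion can be sketched in Python. This is a minimal approximation under stated assumptions, not a byte-for-byte replacement: the sample map excerpt is hypothetical (real map files contain more sections and may use different column widths), the function name map_to_ghidra_symbols is made up, and unlike the sed chain it keeps only the name and address of each entry and drops any surrounding text.

```python
import re

# Hypothetical excerpt of a YASM map file, modeled on what the sed patterns expect.
SAMPLE_MAP = """\
- Symbols ------------------------------

Real      Virtual   Name
00000100  00000100  start
00001A2B  00001A2B  MAIN_LOOP.DRAW
00002F00  00002F00  map_data
"""


def lower(match):
    """Lower-case the first captured group (stand-in for sed's \\L)."""
    return match.group(1).lower()


def map_to_ghidra_symbols(map_text):
    """Return 'name address' lines built from the symbol table of a map file."""
    lines = map_text.splitlines()
    # Step 1: find the symbol table header; everything before it is ignored.
    start = next((i for i, line in enumerate(lines)
                  if re.search(r"Real +Virtual +Name", line)), None)
    if start is None:
        return []
    result = []
    for line in lines[start + 1:]:
        # Step 2: turn "REAL VIRTUAL NAME" into "NAME REAL"; skip other lines.
        m = re.search(r"([0-9A-F]+) +[0-9A-F]+ +([^ ]+)", line)
        if not m:
            continue
        entry = f"{m.group(2)} {m.group(1)}"
        # Step 3: lower-case an all-upper-case global name.
        entry = re.sub(r"^([A-Z0-9_]+[ .])", lower, entry, count=1)
        # Step 4: lower-case an all-upper-case local label after the dot.
        entry = re.sub(r"(\.[A-Z0-9_]+ )", lower, entry, count=1)
        result.append(entry)
    return result


if __name__ == "__main__":
    print("\n".join(map_to_ghidra_symbols(SAMPLE_MAP)))
```

As in the sed version, addresses are left untouched and mixed-case names are not altered; only names written entirely in upper case are lower-cased.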
As of November 2023, the game logic has been ported to compilable C code and, using stubs for everything else, an executable can be created.