Part 1 done
The entire IllexBoyAdvance core has been rewritten, so now it outputs code while it executes it. Actually, this is pretty old news, and the guys at GP32X have known it for a while. I'm just lazy when it comes to updating this blog. :P
Part 2, while requiring much less code than Part 1, will be much harder, as it's neither repetitive nor simple:
Picture a bank of memory. It contains all the memory that can contain executable code. This includes the BIOS, RAM, ROM, and whatever else.
A normal static recompiler will output one code block for each address it reads from at run time.
Now, we have a very small, but fast RAM, and a very large but slow ROM. How much do you want to bet that people are swapping their code into RAM as needed to speed things up? This creates a little problem: Each address can contain many different instructions in the course of execution.
Self-modifying code is the Achilles' heel of static recompilers. So, does that mean I got myself into a dead end?
No. But that is why Part 2 is so much harder than Part 1.
I see three paths to overcoming this obstacle:
1) Ignore it.
Maybe I'm wrong and nobody executes stuff from RAM. Maybe only a few games do. I have to do some tests and find out if I actually have a problem on my hands or not.
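A test like that could be as simple as counting where instruction fetches come from. Here's a rough sketch in Python (not the emulator's actual code): `region_of`, `profile_fetches`, and the sample trace are made-up names and data, and the only real facts used are the GBA's top-byte memory map (BIOS at 0x00, EWRAM at 0x02, IWRAM at 0x03, ROM at 0x08 through 0x0D).

```python
from collections import Counter

def region_of(pc):
    """Classify a program counter by the top byte of its address (GBA memory map)."""
    top = (pc >> 24) & 0xFF
    if top == 0x00:
        return "BIOS"
    if top == 0x02:
        return "EWRAM"
    if top == 0x03:
        return "IWRAM"
    if 0x08 <= top <= 0x0D:  # cartridge ROM and its wait-state mirrors
        return "ROM"
    return "other"

def profile_fetches(pcs):
    """Count instruction fetches per region from an iterable of PC values."""
    return Counter(region_of(pc) for pc in pcs)

# Hypothetical trace: two fetches from ROM, two from IWRAM, one from BIOS.
trace = [0x08000000, 0x08000004, 0x03000100, 0x03000104, 0x00000008]
counts = profile_fetches(trace)
for name in ("BIOS", "EWRAM", "IWRAM", "ROM"):
    print(name, counts[name])
```

If the EWRAM and IWRAM counts stay at zero across a bunch of games, option 1 wins by default.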
2) Binary Tree style.
Remember that bank of memory I mentioned earlier? For each executable instruction, I'd have a pointer to a Bintree, where previously executed instructions are the keys. Whenever there's a Jump/Call, the emulator looks for the appropriate tree in a big array, uses the instruction to find out if it has been recompiled before or not, and then:
If it has NOT been recompiled previously:
Compile it, add the code offset to the tree.
If it has been recompiled: Compare each instruction of the code currently being recompiled with the version that was done previously. If a jump is reached and all opcodes are the same, then there is nothing to do. If not all the opcodes are the same, store the offset of the first different opcode and output a new translation, without overwriting the previous one.
When executing code that has already been recompiled, each time there's a jump, look in the array for the corresponding tree, get an offset, and compare it with the opcode it's supposed to execute. If it's good, jump to the pre-compiled code. If not, resort to the interpreter.
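To make the lookup a bit more concrete, here's a minimal sketch of the idea. It is not a real implementation: `VariantTable`, `add`, and `lookup` are hypothetical names, a plain dict keyed by the first opcode stands in for the Bintree, and it assumes fixed 4-byte ARM instructions (no Thumb).

```python
class VariantTable:
    """Remembers every code variant seen at each jump-target address."""

    def __init__(self):
        # address -> {first opcode -> [(opcode tuple, code offset), ...]}
        # The inner dict is a stand-in for the per-address binary tree.
        self.variants = {}

    def add(self, address, opcodes, offset):
        """Record a newly compiled variant without overwriting older ones."""
        by_first = self.variants.setdefault(address, {})
        by_first.setdefault(opcodes[0], []).append((tuple(opcodes), offset))

    def lookup(self, address, fetch):
        """Return the offset of the variant matching memory right now, or None."""
        by_first = self.variants.get(address, {})
        for opcodes, offset in by_first.get(fetch(address), []):
            # Assumes 4-byte instructions; compare every opcode up to the jump.
            if all(fetch(address + 4 * i) == op for i, op in enumerate(opcodes)):
                return offset
        return None  # caller falls back to the interpreter or recompiles

# Hypothetical RAM contents: MOV r0, #1 followed by BX lr.
memory = {0x03000000: 0xE3A00001, 0x03000004: 0xE12FFF1E}
table = VariantTable()
table.add(0x03000000, [0xE3A00001, 0xE12FFF1E], offset=0x40)
hit = table.lookup(0x03000000, memory.get)

memory[0x03000000] = 0xE3A00002            # the game swaps in different code
miss = table.lookup(0x03000000, memory.get)
table.add(0x03000000, [0xE3A00002, 0xE12FFF1E], offset=0x80)
second = table.lookup(0x03000000, memory.get)

memory[0x03000000] = 0xE3A00001            # the original code comes back
again = table.lookup(0x03000000, memory.get)
print(hit, miss, second, again)
```

The point of keeping old variants around is the last step: when the game swaps the original routine back in, the earlier translation is found again instead of being rebuilt. The cost is the opcode-by-opcode compare on every jump.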
3) Dynarec Style.
For each code block, there's a translated code cache. When an instruction overwrites memory that has been recompiled, the cache block is invalidated (thrown away). Of course, there's no chance of doing that here. Instead, a new cache would have to be made, and some sort of hash would differentiate the caches.
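One way the "some sort of hash" part might look, sketched in Python under my own assumptions: `HashedBlockCache`, `get_or_translate`, and `fake_translate` are invented for illustration, and CRC32 (via the standard `zlib.crc32`) is just one possible hash, not a decision.

```python
import zlib

class HashedBlockCache:
    """Keeps one translated block per (address, hash of source bytes) pair."""

    def __init__(self):
        self.blocks = {}  # (address, crc32 of source) -> translated block

    def get_or_translate(self, address, source, translate):
        """Return the cached translation, translating only on first sight."""
        key = (address, zlib.crc32(source))
        if key not in self.blocks:
            # Nothing is thrown away: new content just gets a new entry.
            self.blocks[key] = translate(source)
        return self.blocks[key]

calls = []

def fake_translate(source):
    """Placeholder for the real recompiler; just labels each translation."""
    calls.append(source)
    return "block#%d" % len(calls)

cache = HashedBlockCache()
a = cache.get_or_translate(0x03000000, b"\x01\x00\xa0\xe3", fake_translate)
b = cache.get_or_translate(0x03000000, b"\x02\x00\xa0\xe3", fake_translate)
c = cache.get_or_translate(0x03000000, b"\x01\x00\xa0\xe3", fake_translate)
print(a, b, c, len(calls))
```

Two caveats with this sketch: hashing the block on every entry is part of the performance penalty, and a hash collision would silently run the wrong code unless the bytes are also compared.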
Methods 2 and 3 both impose a heavy performance penalty on the emulator, but it remains faster than a Dynarec. As can be seen, neither is trivial to implement, so I have to give this a whole lot of thought before I start coding. If anybody has any suggestions (even if it's to say, "Your blog entry made no sense. Please refrain from blogging at 1 AM") I'd really like to hear them.
That's what I created the A7Board for. http://tkf15h.phpnet.us