More work on IBA
IllexBoyAdvance has seen a good amount of development lately.
Currently around line 3589 of 7190, which puts me at around 50% done.
Yesterday I spent time doing a revision, making sure I wasn't leaving too many mistakes about. And I caught quite a few bugs, mostly writing the output code to a string and then forgetting to dump it afterwards. :-P
BTW, I made a temporary webpage for this project: illexboyadvance.sourceforge.net
I am feeling rather confident that this rewrite I'm doing is capable of getting full-speed GBA emulation on the GP2X. These are a few reasons why:
1) Each instruction encodes which registers it needs, and for the emulator to access them, a series of ANDs and shifts must be done. While shifts and ANDs are not costly operations on an ARM processor, doing them often still hurts performance. The VBA C core does this several times per instruction when it could calculate each value once and store it in a variable. Illex does all these calculations when outputting code, so once compiled they will not have to be done again.
2) The output code, once optimized by GCC, is sure to be much faster than the current core.
3) There are various instructions where certain registers are treated as special cases. These checks are being eliminated from output code, unless they really need to be done.
4) VBA has routine tasks, things it does for each opcode, that don't need to be done all that often. Keeping R15 updated, for example. Going from one opcode to the next. The usual slow-downs associated with interpreters. All that's going away.
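To make points 1 and 4 a bit more concrete, here's a rough sketch in C. It is not the actual Illex or VBA code: the Cpu struct and the names run_interpreted, emit_add and run_generated are made up for illustration, and it only handles the plain register form of ADD (Rn in bits 19-16, Rd in bits 15-12, Rm in bits 3-0).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical CPU state; the real core's layout will differ. */
typedef struct { uint32_t reg[16]; } Cpu;

/* Interpreter style: every time an opcode runs it is fetched again,
   its register fields are masked out again, and R15 is advanced. */
void run_interpreted(Cpu *cpu, const uint32_t *code, int count)
{
    for (int i = 0; i < count; i++) {
        uint32_t op = code[i];
        unsigned rn = (op >> 16) & 0xF;   /* first operand register  */
        unsigned rd = (op >> 12) & 0xF;   /* destination register    */
        unsigned rm = op & 0xF;           /* second operand register */
        cpu->reg[rd] = cpu->reg[rn] + cpu->reg[rm];  /* sketch handles ADD only */
        cpu->reg[15] += 4;
    }
}

/* Generator style: the same shifts and ANDs run once, while emitting C. */
int emit_add(char *out, size_t size, uint32_t op)
{
    unsigned rn = (op >> 16) & 0xF;
    unsigned rd = (op >> 12) & 0xF;
    unsigned rm = op & 0xF;
    return snprintf(out, size, "cpu->reg[%u] = cpu->reg[%u] + cpu->reg[%u];",
                    rd, rn, rm);
}

/* Roughly what the emitted code for ADD R0,R1,R2 followed by ADD R3,R0,R0
   looks like once it sits inside a function: no decoding left at all. */
void run_generated(Cpu *cpu)
{
    cpu->reg[0] = cpu->reg[1] + cpu->reg[2];
    cpu->reg[3] = cpu->reg[0] + cpu->reg[0];
    cpu->reg[15] += 8;   /* R15 caught up once, at the end of the block */
}
```

Feeding 0xE0810002 (ADD R0,R1,R2) to emit_add gives the first line of run_generated. Updating R15 only once at the end is safe here because neither instruction reads it; an instruction that does would still need the real value.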
Then there's a bunch of little things that could be changed:
5) Useless ternary op in the VBA core:
Z_FLAG = (res == 0) ? true : false;
Assuming GCC doesn't optimize this fully, the ternary operator uses a branch, just like an "if()" would. Therefore, it's a better idea to simply do:
Z_FLAG = (res == 0);
A single branch isn't a big deal, I'm just giving an example of the things you'd find in VBA.
6) Whatever unnecessary operations (N/Z/C flag calculation, for example) do end up in the output source are going to be optimized away by GCC.
7) Various other things can be done, such as using the hardware blitter, eliminating clock cycle calculation if we're desperate, bribing the processor into thinking it's an AMD64, using the GP2X MMU to emulate the GBA's, outputting inline ASM so as to make use of the native N/Z/C/V flags rather than calculating them manually, and finally beating it with a bat until it can't take any more and does my bidding!
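As a small illustration of point 6, here's a sketch of what a stretch of output code might look like. The names n_flag, z_flag and generated_block are hypothetical, not taken from Illex or VBA. The flags set after the first operation are overwritten before anything reads them, so an optimizing compiler is allowed to drop those stores; whether GCC actually does depends on the optimization level and on what else can see the flag variables.

```c
#include <stdbool.h>
#include <stdint.h>

bool n_flag, z_flag;   /* hypothetical flag variables */

uint32_t generated_block(uint32_t a, uint32_t b)
{
    uint32_t r0 = a + b;
    z_flag = (r0 == 0);          /* never read before being overwritten */
    n_flag = (r0 >> 31) != 0;

    uint32_t r1 = r0 - 1;
    z_flag = (r1 == 0);          /* only these values are ever observable */
    n_flag = (r1 >> 31) != 0;
    return r1;
}
```

An interpreter can't do this, since it handles one opcode at a time and never knows whether the next one will read the flags.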