More C0D4G3

Monday, July 31, 2006

Part 1 done

The entire IllexBoyAdvance core has been rewritten, so now it outputs code while it executes it. Actually, this is pretty old news, and the guys at GP32X have known it for a while. I'm just lazy when it comes to updating this blog. :P

Part 2, while requiring much less code than Part 1, will be much harder as it's neither repetitive nor simple:
Picture a bank of memory. It contains all the memory that can contain executable code. This includes the BIOS, RAM, ROM, and whatever else.

A normal static-recompiler will output one code-block for each address it reads from at run-time.
Now, we have a very small, but fast RAM, and a very large but slow ROM. How much do you want to bet that people are swapping their code into RAM as needed to speed things up? This creates a little problem: Each address can contain many different instructions in the course of execution.
Self-modifying code is the Achilles' heel of static recompilers. So, does that mean I got myself into a dead end?
No. But that is why Part 2 is so much harder than Part 1.
I see three paths to overcoming this obstacle:
1) Ignore it.
Maybe I'm wrong and nobody executes stuff from RAM. Maybe only a few games do. I have to do some tests and find out if I actually have a problem on my hands or not.

2) Binary Tree style.
Remember that bank of memory I mentioned earlier? For each executable instruction, I'd have a pointer to a Bintree, where previously executed instructions are the keys. Whenever there's a Jump/Call, the emulator looks for the appropriate tree in a big array, uses the instruction to find out if it has been decompiled before or not, and then:
If it has NOT been decompiled previously:
Compile it, add the code offset to the tree.

If it has been decompiled: Compare each instruction of the code currently being decompiled with the version that was done previously. If a jump is reached and all opcodes are the same, then there is nothing to do. If not all the opcodes are the same, store the offset of the first different opcode and output a new disassembly, without overwriting the previous one.

When executing code that has already been recompiled, each time there's a jump, look in the array for the corresponding tree, get an offset, and compare it with the opcode it's supposed to execute. If it's good, jump to the pre-compiled code. If not, resort to the interpreter.

3) Dynarec Style.
For each code block, there's a translated code cache. When an instruction overwrites memory that has been recompiled, the cache block is invalidated (thrown away). Of course, there's no chance of doing that here. Instead, a new cache would have to be made, and some sort of hash would differentiate the caches.
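A rough sketch of how option 3 could look, assuming each cached block is identified by its start address plus a hash of the opcodes it was compiled from. Every name here (VersionedCache, hashBlock, BlockId) is made up for illustration and none of it is IllexBoyAdvance code. The hash is plain 32-bit FNV-1a, picked only because it is short; since any hash can collide, a real version would probably still need to compare the opcodes after a hit.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

using GuestAddr = std::uint32_t;  // address in emulated memory (hypothetical alias)
using BlockId   = std::uint32_t;  // which recompiled block to run (hypothetical alias)

// 32-bit FNV-1a over the opcode words of a block, one byte at a time.
inline std::uint32_t hashBlock(const std::uint32_t* ops, std::size_t count) {
    std::uint32_t h = 2166136261u;             // FNV offset basis
    for (std::size_t i = 0; i < count; ++i) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (ops[i] >> shift) & 0xFFu;
            h *= 16777619u;                    // FNV prime
        }
    }
    return h;
}

// Keeps every version of the code ever seen at an address instead of
// invalidating the old one when memory is overwritten.
class VersionedCache {
public:
    void store(GuestAddr addr, const std::uint32_t* ops, std::size_t n, BlockId id) {
        blocks_[std::make_pair(addr, hashBlock(ops, n))] = id;
    }

    // Returns nullptr when this exact version has not been recompiled.
    const BlockId* lookup(GuestAddr addr, const std::uint32_t* ops, std::size_t n) const {
        auto it = blocks_.find(std::make_pair(addr, hashBlock(ops, n)));
        return it == blocks_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::pair<GuestAddr, std::uint32_t>, BlockId> blocks_;
};
```

Hashing a whole block on every jump is where the performance penalty would come from, so how much of the block gets hashed is a trade-off this sketch does not try to settle.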

Both methods 2 and 3 have a heavy performance penalty on the emulator, but it remains faster than a Dynarec. As can be seen, neither is trivial to implement, so I have to give this a whole lot of thought before I start coding. If anybody has any suggestions (even if it's to say, "Your blog entry made no sense. Please refrain from blogging at 1AM") I'd really like to hear them.
That's what I created the A7Board for. http://tkf15h.phpnet.us
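For option 2, the lookup done on each jump could be sketched like this. It is only an illustration under a few assumptions: the names (BlockIndex, add, find) are hypothetical, std::map stands in for the Bintree (it is typically implemented as a balanced binary tree), and an unordered_map replaces the "big array" so the sketch doesn't need to allocate one slot per address. Only the first opcode of a block is used as the key here; the full opcode-by-opcode comparison described above is left out.

```cpp
#include <cstdint>
#include <map>
#include <unordered_map>

using GuestAddr = std::uint32_t;  // address in emulated memory
using Opcode    = std::uint32_t;  // raw instruction word found at that address
using BlockId   = std::uint32_t;  // offset/index of the recompiled code

class BlockIndex {
public:
    // Record that the code at `addr`, starting with opcode `first`,
    // was recompiled as block `id`. Older versions are kept.
    void add(GuestAddr addr, Opcode first, BlockId id) {
        trees_[addr][first] = id;
    }

    // On a jump to `addr`: is there a block that was compiled from the
    // opcode sitting in memory right now? nullptr means "not recompiled",
    // i.e. compile it (first pass) or fall back to the interpreter.
    const BlockId* find(GuestAddr addr, Opcode current) const {
        auto tree = trees_.find(addr);
        if (tree == trees_.end()) return nullptr;
        auto hit = tree->second.find(current);
        if (hit == tree->second.end()) return nullptr;
        return &hit->second;
    }

private:
    // One tree per jump target, keyed by previously executed opcode.
    std::unordered_map<GuestAddr, std::map<Opcode, BlockId>> trees_;
};
```

A jump handler would then call something like `idx.find(target, opcodeAt(target))`, where `opcodeAt` is whatever memory read the emulator already has.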

Thursday, July 13, 2006

More work on IBA

IllexBoyAdvance has seen a good amount of development lately.
Currently around line 3589 of 7190, which puts me at around 50% done.
Yesterday I spent time doing a revision, making sure I wasn't leaving too many mistakes about. And I caught quite a few bugs, mostly writing the output code to a string and then forgetting to dump it afterwards. :-P
BTW, I made a temporary webpage for this project: illexboyadvance.sourceforge.net

I am feeling rather confident that this rewrite I'm doing is capable of getting fullspeed GBA emulation on the GP2X. These are a few reasons why:
1) Each instruction states which registers it needs, and for the emulator to access them, a series of ANDs/Shifts must be done. While Shifts and ANDs are not costly operations on an ARM processor, doing them often still hurts performance. The VBA C core does this several times per instruction when it could calculate it once and store the value in a variable. Illex does all these calculations when outputting code, so once compiled they will not have to be done again.

2) The output code, once optimized by GCC, is sure to be much faster than the current core.

3) There are various instructions where certain registers are treated as special cases. These checks are being eliminated from output code, unless they really need to be done.

4) VBA has routine tasks, things it does for each opcode, that don't need to be done all that often. Keeping R15 updated, for example. Going from one opcode to the next. The usual slow-downs associated with interpreters. All that's going away.

Then there's a bunch of little things that could be changed:
5) Useless ternary Op in VBA core:
Z_FLAG = (res == 0) ? true : false;
Assuming GCC doesn't optimize this fully, the ternary operator uses a branch, just like an "if()" would. Therefore, it's a better idea to simply do:
Z_FLAG = (res == 0);
A single branch isn't a big deal, I'm just giving an example of the things you'd find in VBA.

6) Whatever unnecessary operations (N/Z/C flag calculation, for example) do end up in the output source are going to be optimized away by GCC.

7) Various other things can be done, such as using the hardware blitter, eliminating clockcycle calculation if we're desperate, bribing the processor into thinking it's an AMD64, using the GP2X MMU to emulate the GBA's, outputting inline ASM so as to make use of the native N/Z/C/V flags rather than calculating them manually, and finally beating it with a bat until it can't take any more and does my bidding!
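To make reason 1 a bit more concrete, here is a minimal sketch of decoding the register fields once, at output time, and baking them into the generated source as constants. It assumes the standard ARM data-processing layout (Rn in bits 19-16, Rd in bits 15-12, Rm in bits 3-0) and handles only a plain register-to-register ADD with no shift, no flags and no R15 special cases. The function names and the `reg[]` array in the emitted text are hypothetical, not what IllexBoyAdvance actually outputs.

```cpp
#include <cstdint>
#include <string>

struct RegFields {
    unsigned rn;  // first operand register
    unsigned rd;  // destination register
    unsigned rm;  // second operand register
};

// The shifts and ANDs happen here, once, while the code is being output.
inline RegFields decodeFields(std::uint32_t opcode) {
    RegFields f;
    f.rn = (opcode >> 16) & 0xF;
    f.rd = (opcode >> 12) & 0xF;
    f.rm = opcode & 0xF;
    return f;
}

// Emits one line of C++ source for "ADD Rd, Rn, Rm".
// The register numbers end up as literal constants in the output,
// so the compiled game code never has to decode them again.
inline std::string emitAdd(std::uint32_t opcode) {
    const RegFields f = decodeFields(opcode);
    return "reg[" + std::to_string(f.rd) + "] = reg[" +
           std::to_string(f.rn) + "] + reg[" +
           std::to_string(f.rm) + "];";
}
```

For example, `emitAdd(0xE0821003u)` (that is, ADD r1, r2, r3) yields the text `reg[1] = reg[2] + reg[3];`, which GCC can then optimize like any other hand-written statement.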

Tuesday, July 11, 2006

34

Well, to be more exact, 34.682241691. That's the percentage of the ARM CPU core I've re-written so far.
In the process, I've seen some atrocities that make me wonder how come this thing even runs on the PC. Due to this, I'm starting to think the GP2X will be perfectly capable of running GBA games, and maybe even have space for a turbo option. I'd have to see how much emulating the rest of the hardware costs, but I think it's safe to say the 200MHz ARM will be able to run GBA games faster than a real GBA (because Pokemon is unbearable at the native speed). Let's wait and see.

As for my computer, I bought the video card but I couldn't get my hands on the motherboard. The store had sold out. Bah. Hopefully another one will show up later this week.

Saturday, July 08, 2006

>_<

Yay, quite a bit of good stuff to write here...

Since my last post, I've been working a whole lot on FishMotor.
And it is beautiful!
It currently loads 3DS files (non-animated), textures them (almost any image format you can think of) and displays them with lighting. Most of it through a plugin system that is very easy to work with.
A Linux version is being made side-by-side, and currently is about 100fps slower than the Windows version. This is probably due to horrid open-source drivers for my old video card.

My old video card mentioned above won't be a problem for me much longer though:
On Monday I'll be buying an ATI x1300, which while not top-of-the-line, is a big upgrade from my ole 7500.
Actually, saying it's a big upgrade is an understatement.
To go with the fancy new card, I got myself an Athlon64 3500+. Yup, I can finally get back to work on that Dreamcast emulator project, Minerva, except...

Talking about emulators, I'm modding VisualBoyAdvance by making it spit out C++ code from whatever it executes. This code will then be compiled and linked to the emulator. Result? Hopefully full-speed GBA emulation on a GP2X by means of static-recompilation. It's a great educational experience and whatever I learn here will benefit Minerva. The VBA port takes priority, as it is much less ambitious than Minerva. Being simpler, a GBA emu will pave the road for a DC one.

Saturday, June 17, 2006

Update

It's been a long time since I updated this, so I'll probably have a lot to write... I just can't think of any of it yet.

Oh well, I'll start with this:

What you see here is version 0.3 of the FishMotor engine I've been working on lately.
It supports smooth face normals on the terrain, animated and textured MD2 models (say hello to Homer and his shotgun), static PLG models (the cat), and Anim8or exported C files. The engine had some design flaws that were bugging me:
1) Statically linked with OpenGL
2) GLUT. Urg, glut....
3) Things were getting messy.
I felt the need to write a plugin system for it, and just when I'd finished, I scrapped the whole project altogether.
And it is being re-written. Large chunks of code will be copied over so it's not being re-done from scratch really, but a major overhaul is in progress. It's a completely plugin-based engine where the renderer is in an external plugin. This means DirectX or OpenGL support without having to recompile. Currently I'm making the OpenGL plugin, and it's in very early stages. All it does is render static meshes with no lighting.
On the other hand, I now have Linux support for it.
Take a look at the code sample below:

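cPlugMan PM;                        // plugin manager
cEngine *(*NewEngine)(), *Engine;   // pointer to the plugin's "New" function, and the engine it returns
void *H;                            // handle to the loaded plugin
H = PM.Load("OpenGL_Plugin/oglRenderer");   // no .so/.dll here, the loader adds it
if( !H ) return 1337;               // plugin could not be loaded
NewEngine = (cEngine *(*)()) PM.GetFunc( H, "New" );   // look up "New" inside the plugin
if( !NewEngine ) return 1336;       // plugin has no "New" function
Engine = NewEngine();

// From here on it's just two calls to get a window on screen
Engine->Init();
if( !Engine->OpenWindow( width, height, 0 ) ) return 1335;
Engine->SetTitle( "Hello World" );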


As you might have noticed, there is no OS-dependent code in there. Even the plugin's name doesn't have .so/.dll at the end (it's added automatically by the loader). With two calls to the engine, a window is set up and ready for rendering.
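How the extension gets added isn't shown above, but one way a loader could do it is with a compile-time switch, roughly like the sketch below. The function name pluginFileName is hypothetical and this is not FishMotor's actual code; the result would then be handed to the usual OS call (LoadLibrary on Windows, dlopen on Linux).

```cpp
#include <string>

// Appends the platform's shared-library extension to a plugin name,
// so calling code can stay free of OS-specific strings.
inline std::string pluginFileName(const std::string& baseName) {
#if defined(_WIN32)
    return baseName + ".dll";
#else
    return baseName + ".so";
#endif
}
```

With this, `pluginFileName("OpenGL_Plugin/oglRenderer")` gives `OpenGL_Plugin/oglRenderer.dll` on Windows and `OpenGL_Plugin/oglRenderer.so` elsewhere.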

To Do:
  1. Triangle Strip generation from triangulated meshes
  2. Texture support.
  3. Eat some pizza
  4. Lights
  5. Transparency and other effects
  6. Collision detection
  7. Make that unnamed space game and enter it as a competition entry
After that I'll be adding features as I need them in whatever other projects I work on.


In other news, I bought the case for my new PC a few weeks ago, and I'll be buying a motherboard + AMD64 3500+ before the end of this year. A few more parts and Minerva will be back on track.

The Tamagochi thing is probably dead. Not much interest in it from anybody (including myself) :P
Now to get back to work....

Wednesday, March 15, 2006

Less codage

Minerva
Apple moving on to Intel was the stupidest idea ever. It had to be the other way around, PCs had to go over to some kind of RISC architecture... which is what AMD is doing, sorta. The new 64 bit chips look more like RISC processors than CISC. This is going to be the salvation of my Minerva project. Intel processors have few GPRs (8), compared to the SH4 (16). This results in having to do constant register allocation. Bleh. It also means your purdy 3.0GHz P4 will look like a little old granny.
Changes to the Minerva plans:
1) No more copy+paste. This was a technique I'd use to permit cross-platform code generation, but it relies on the DC's registers being stored in memory rather than in native registers. Instead, I'll have to use...
2) Proper code generation. Generating optimized binary code rather than copying general-purpose code.

To do:
1) Save money, get a 64bit machine to work on. ^_^
2) Re-do the processor in assembly rather than C
3) RAM code can be re-used, but must still be finished.
4) Debug interface


Other projects
The other projects have all progressed... sorta. I've been working on code that will be useful for all of them:
the XML/INI loader for configurations, MIX-Loader for storing game files (Now with Blowfish encryption support!), and an assembler for ARM processors (part of a VM that will run some of my projects [Tamagochi, space game, etc.] on my GP2X).

Thursday, March 09, 2006

Death, pain, and suffering

Ok, there isn't any death, but there is a lot of psychological pain and suffering. Till when must this go on?! Work sucks.

I got some work done on the renderer, it now correctly rotates/translates/draws stuff in line-mode. The frame-rate is sad, I'll buy it a chocolate bar so it can cheer up.
Also done is a parser for XML-like files for loading game configurations and stuff. It is also capable of loading INI files, compatible with the Red Alert 2 INIs. I have also managed to finish a MIX file loader, which can be used to open the MIX files that come with Red Alert 2. MIX files are like tarballs, in that they're used to store a bunch of files in a single big file, with no compression.
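For anyone who hasn't dealt with INI files, the loading part can be sketched in a few lines. This is not the actual loader, just a minimal illustration that assumes the common layout of [Section] headers, Key=Value pairs and ; comments; the names parseIni, trim and IniData are made up for the example.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// section name -> (key -> value)
using IniData = std::map<std::string, std::map<std::string, std::string>>;

// Strips leading and trailing whitespace.
inline std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

inline IniData parseIni(const std::string& text) {
    IniData data;
    std::string section;               // keys before any header go under ""
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line.substr(0, line.find(';')));   // drop ; comments
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;         // skip malformed lines
        data[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return data;
}
```

Usage would be along the lines of `parseIni(fileContents).at("General").at("Name")`, with the section and key names here being placeholders rather than anything from a real game file.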
I'm going to combine the INI-loader, MIX-loader, and the renderer to make a click-and-drag game creator for the GP2X, except, with no clicking and dragging due to the lack of a pointer device. No idea what I'm going to do with the interface, but at the moment it's not something to worry about.
To Do:
Finish the script compiler.
Make a PCX loader (I'm using Minilib here...)
Come up with some interface idea and implement it. Probably gonna use the GUI I made a while ago.
Bake a pie and savor its juicy goodness.