

title: "3D Software Rendering on the GBA"
date: '2020-07-21'
subtitle: "Game Boy Advance Fixed-point math, here we come..."
- gba
- programming
When I started programming the [gba-sprite-engine]( two years ago, I knew I would be getting myself into trouble. The Game Boy Advance CPU runs at only about 16 MHz, and its whole software library is written in low-level C using DMA (Direct Memory Access) and memory-mapped IO. Translation: pointers! `**` - Yay!
In the end, switching to `C++11` while trying to unit test and stub out BIOS code as much as possible did help soften the pain. I'm glad I got my hands dirty again, and writing "closer to the metal" was a welcome change from the usual high-level stuff I produce.
But GBA MODE 0-1-2 is not the only way to write a GBA game. There's also _bitmap mode_, 3-4-5, that lets you write pixel colors (or palette indices) yourself. This opens up the possibility of software rendering. 90% of the GBA library did **not** do that. But a few games did:
- Doom, Doom II, Duke Nukem Advance (Ray casting engines and/or ports)
- 007 Nightfire (A more modern 3D engine)
- Asterix & Obelix XXL
- A few terrible racing games
How do you render things in 3D without hardware acceleration and without an FPU to handle `float` values, taking into account the (mostly) 16-bit bus and the 16 MHz CPU? Well... It does not exactly produce 30+ FPS:
{{< row >}}
{{% col %}}
![]( "Wireframing 507 vertices and 968 faces")
{{% /col %}}
{{% col %}}
![octahedron]( "Trying to rasterize the same thing")
{{% /col %}}
{{< wor >}}
Drawing a lot of lines is not exactly something the GBA loves to do. I ended up using [tonclib's optimized routines]( after a failed attempt to implement Bresenham myself. MODE4 has weird byte-write requirements (VRAM cannot be written one byte at a time, so two 8-bit pixels share every 16-bit write), and you can optimize horizontal lines by writing them with DMA.
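To make the MODE4 quirk concrete: each 8-bit palette index has to be merged into the 16-bit halfword it shares with its neighbour. Below is a minimal sketch of that read-modify-write. It assumes a plain array (`fakePage`, a hypothetical stand-in) instead of real VRAM so it also runs on a desktop, and it is not tonclib's actual routine:

```cpp
#include <cassert>
#include <cstdint>

constexpr int kScreenWidth = 240;
constexpr int kScreenHeight = 160;

// Hypothetical stand-in for one MODE4 page; on hardware this would be VRAM.
static uint16_t fakePage[kScreenWidth * kScreenHeight / 2];

// Set one 8-bit palette index using only 16-bit accesses.
inline void plotMode4(uint16_t* page, int x, int y, uint8_t paletteIndex) {
    assert(x >= 0 && x < kScreenWidth && y >= 0 && y < kScreenHeight);
    uint16_t* cell = &page[(y * kScreenWidth + x) / 2];
    if (x & 1) {
        // Odd x lives in the high byte: keep the low byte, replace the high one.
        *cell = static_cast<uint16_t>((*cell & 0x00FF) | (paletteIndex << 8));
    } else {
        // Even x lives in the low byte: keep the high byte, replace the low one.
        *cell = static_cast<uint16_t>((*cell & 0xFF00) | paletteIndex);
    }
}
```

A horizontal line can skip most of this work: only the two end pixels may need masking, and everything in between can be filled a halfword (or more) at a time.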
But the worst part was fixed-point math, sine lookup tables, and calling the BIOS just to get the square root of something. `Math.sin()` takes its input in radians in most common programming languages. The above imported [Babylon JS]( mesh expects the same, but my sine table is split into `[1-512]` slices and expects its input as a 16-bit value. More needless bit-shifting.
I intended to design the engine again as high-level as possible, taking advantage of C++'s objects and operator overloading. How about `worldMatrix * viewMatrix;`? Everything is unit-tested (thank god for that, it caught a lot of bugs). But passing objects around in limited RAM sounds ridiculous - and it probably is, even if it's a `const MatrixFx&` reference or a `std::shared_ptr<Mesh>`.
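The radians-to-table conversion boils down to one multiply and two shifts. The sketch below assumes a 512-entry table holding 4.12 fixed-point values, a 16-bit angle where `0x10000` is a full turn, and radians stored as 24.8 fixed-point. All names are hypothetical, and the table layout in the actual engine may differ:

```cpp
#include <cmath>
#include <cstdint>

constexpr int kLutSize = 512;      // slices per full turn
constexpr int kAngleShift = 7;     // 16-bit angle -> 9-bit table index
constexpr int kSinShift = 12;      // table values are 4.12 fixed-point (assumed)
constexpr double kTwoPi = 6.283185307179586;

static int16_t sineLut[kLutSize];

// Desktop-only helper: on the GBA the table would be precomputed data.
inline void buildSineLut() {
    for (int i = 0; i < kLutSize; ++i) {
        sineLut[i] = static_cast<int16_t>(
            std::lround(std::sin(i * kTwoPi / kLutSize) * (1 << kSinShift)));
    }
}

// Radians in 24.8 fixed-point -> 16-bit angle where 0x10000 is a full turn.
// 65536 / (2 * pi) is about 10430.38, truncated here to 10430.
inline uint16_t radiansToAngle16(int32_t radiansFx8) {
    return static_cast<uint16_t>(
        (static_cast<int64_t>(radiansFx8) * 10430) >> 8);
}

// Look up the sine: the top 9 bits of the angle select one of 512 slices.
inline int32_t lutSin(uint16_t angle16) {
    return sineLut[(angle16 >> kAngleShift) & (kLutSize - 1)];
}
```

Because the angle is an unsigned 16-bit value, it wraps around on overflow for free, which is one small upside of all the bit-shifting.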
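For illustration, here is roughly what an overloaded `*` on a fixed-point matrix could look like. `MatrixFxSketch` is a hypothetical stand-in, not the engine's real `MatrixFx`, and it assumes 24.8 fixed-point values in a row-major 4x4 layout:

```cpp
#include <array>
#include <cstdint>

using Fixed = int32_t;               // 24.8 fixed-point (an assumed format)
constexpr int kFixShift = 8;

// Hypothetical stand-in for the post's MatrixFx; the real class may differ.
struct MatrixFxSketch {
    std::array<Fixed, 16> m{};       // row-major 4x4, zero-initialized

    // Uniform scaling matrix; scaling(1 << kFixShift) is the identity.
    static MatrixFxSketch scaling(Fixed s) {
        MatrixFxSketch r;
        r.m[0] = r.m[5] = r.m[10] = s;
        r.m[15] = 1 << kFixShift;
        return r;
    }
};

// Matrix product: accumulate in 64 bits, then shift the extra
// 8 fractional bits back out once per element.
inline MatrixFxSketch operator*(const MatrixFxSketch& a, const MatrixFxSketch& b) {
    MatrixFxSketch r;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            int64_t acc = 0;
            for (int k = 0; k < 4; ++k) {
                acc += static_cast<int64_t>(a.m[row * 4 + k]) * b.m[k * 4 + col];
            }
            r.m[row * 4 + col] = static_cast<Fixed>(acc >> kFixShift);
        }
    }
    return r;
}
```

Returning the result by value creates a 64-byte temporary for every product, which is exactly the kind of cost that is invisible on a desktop and painful in limited RAM.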
Reverting to a simple box sped up the FPS:
{{< row >}}
{{% col %}}
![]( "A BabylonJS-exported Box. (including a bug)")
{{% /col %}}
{{% col %}}
![]( "A rasterized octahedron, with back-face culling.")
{{% /col %}}
{{< wor >}}
Even calculating the frames per second is a pain. What's a "second"? Okay, so we need a hardware timer interrupt. When does this thing overflow? How many cycles does the CPU take before that happens? Are you seriously using the divide operator instead of `fxdiv()`?
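Why the plain divide operator bites: dividing one scaled value by another cancels the scale, so the quotient comes out as a bare integer instead of a fixed-point number. A minimal sketch, assuming a 24.8 format; the helper names are hypothetical and only modeled on the idea behind `fxdiv()`:

```cpp
#include <cstdint>

using Fixed = int32_t;               // 24.8 fixed-point (an assumed format)
constexpr int kFixShift = 8;
constexpr Fixed kFixScale = 1 << kFixShift;

constexpr Fixed intToFix(int value) { return value * kFixScale; }

// Multiply: the raw product carries 16 fractional bits, so shift 8 back out.
inline Fixed fixMul(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<int64_t>(a) * b) >> kFixShift);
}

// Divide: pre-scale the numerator so the quotient keeps its 8 fractional bits.
inline Fixed fixDiv(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<int64_t>(a) * kFixScale) / b);
}
```

With these, `intToFix(3) / intToFix(2)` yields the raw value `1` (which reads as 1/256 in 24.8), while `fixDiv(intToFix(3), intToFix(2))` yields `384`, the 24.8 encoding of 1.5. Keep in mind that the GBA's CPU has no hardware divide instruction, so either form is slow compared to a multiply.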
Also, I could not remember most of the math needed to project 3D vertices into a 2D view, so I let myself be guided by David's excellent [3D soft engine tutorial]( in JavaScript. Of course I had to port all the Matrix/Vector operations myself.
Future work: texturing - I'm curious to see at what rate we could get a simple box textured with a Mario "?" block. I won't even attempt portal rendering like the 007 Nightfire devs.
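The core of that projection is a perspective divide. The sketch below is a simplified pinhole model, not the tutorial's matrix-based pipeline and not the engine's code. It assumes 24.8 fixed-point camera-space coordinates, a hypothetical `focalLength` parameter, and the GBA's 240x160 screen:

```cpp
#include <cstdint>

using Fixed = int32_t;               // 24.8 fixed-point (an assumed format)
constexpr int kFixShift = 8;
constexpr int kScreenWidth = 240;
constexpr int kScreenHeight = 160;

struct VertexFx { Fixed x, y, z; };  // camera-space position
struct ScreenPoint { int x, y; };

// Perspective divide: scale x and y by focalLength / z, then move the origin
// to the screen centre and flip y so that "up" points up on screen.
// Assumes z > 0, i.e. the vertex is in front of the camera.
inline ScreenPoint projectVertex(const VertexFx& v, Fixed focalLength) {
    const int64_t px = (static_cast<int64_t>(focalLength) * v.x) / v.z;  // 24.8
    const int64_t py = (static_cast<int64_t>(focalLength) * v.y) / v.z;  // 24.8
    return ScreenPoint{
        kScreenWidth / 2 + static_cast<int>(px / (1 << kFixShift)),
        kScreenHeight / 2 - static_cast<int>(py / (1 << kFixShift))};
}
```

Doubling `z` halves the offset from the screen centre, which is all the "3D" there is to it. The two divides per vertex are also a good hint at where the frame rate goes.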
Check out the source code here: