Super Adventure Island / Takahashi Meijin no Daibōken Jima

Offsets refer to the US version of the ROM.

Level Format

Level Pointers

Offset 0x33F82 contains a table of pointers to level and cutscene headers. Each entry in the table is 4 bytes long: a 24-bit pointer followed by one byte that denotes the type of header (an actual level or one of the Mode 7 cutscenes).

There are (I think) a total of 60 entries:

entry 0 - opening scene (falling from sky)
entry 1 - level 1-1
entry 2 - level 1-1's midway point 
entry 3 - level 1-2
...
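
A minimal Python sketch of reading this table follows. It assumes the 24-bit pointer is stored low byte first with the bank byte last (the usual 65816 order, but not confirmed in these notes) and uses the tentative count of 60 entries. The function and constant names are made up for illustration.

```python
import struct

TABLE_OFFSET = 0x33F82   # table location from the notes (US ROM)
ENTRY_SIZE = 4           # 24-bit pointer + 1 type byte
ENTRY_COUNT = 60         # tentative count, see "I think" above


def read_level_table(rom, offset=TABLE_OFFSET, count=ENTRY_COUNT):
    """Return a list of (pointer, type_byte) tuples from the ROM bytes."""
    entries = []
    for i in range(count):
        base = offset + i * ENTRY_SIZE
        lo, mid, hi, kind = struct.unpack_from("<4B", rom, base)
        # Assumption: pointer bytes are stored low, middle, high (bank last).
        pointer = lo | (mid << 8) | (hi << 16)
        entries.append((pointer, kind))
    return entries
```

With the ROM loaded as a bytes object, entries[1] would then describe level 1-1 and entries[2] its midway point.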

Level Headers

The game stores 2 separate headers for each level - one for the start, and one for the midway point. The two headers are mostly identical except for the player start point (though theoretically you could abuse this to make the player start in a totally different area if they die past the level's midway point).

The type byte following each pointer in the pointer table has these known bit values:

    0x02 - get flown in by the bird
    0x04 - skip level (weird: the game puts you at the level AFTER this one, with the correct sprites and numbering)
    0x08 - normal level
    0x80 - cutscene (using this for a normal level just makes the HUD and player vanish, and you can't move or do anything)
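
As a rough illustration, the known bits can be decoded like this in Python. Whether the bits are ever combined in the actual table is an assumption, and the names here are hypothetical. Any bit not listed above is reported as unknown.

```python
# Known bits of the type byte, as listed in the notes.
KNOWN_FLAGS = {
    0x02: "flown in by the bird",
    0x04: "skip level",
    0x08: "normal level",
    0x80: "cutscene",
}
KNOWN_MASK = 0x02 | 0x04 | 0x08 | 0x80


def describe_type(type_byte):
    """Return the names of all known bits set in type_byte."""
    names = [name for bit, name in KNOWN_FLAGS.items() if type_byte & bit]
    unknown = type_byte & ~KNOWN_MASK & 0xFF
    if unknown:
        # Bits the notes do not document are reported, not guessed at.
        names.append("unknown bits 0x%02X" % unknown)
    return names
```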

Headers are 0x6A (?) bytes long. The headers are pretty large and I haven't bothered figuring out what each individual field does, but many of them are probably pointers to graphics data and the like.

Header Offset  Size  Description
0x00           1     Written to BG mode register and $0BFD
                     Should almost always be 0x79 except for cutscenes.
0x01           1     Written to $0C97
0x02           2     Written to $0C8D
0x04           2     Written to $0C8F
0x06           2     Written to $0C85
0x08           2     Written to $0C87
0x0A           2     Written to $0C89
0x0C           2     Written to $0C8B
0x0E           2     Written to $009A
0x10           1     Written to $0C83
0x11           2     Pointer to HDMA Table?
                     Written to $0D80
0x13           1     Written to $0D82
0x14           2     Written to $0DC0
                     Pointer to ???
0x16           1     Written to $0DC2
0x17           2     Written to $0310
                     Pointer to ???
0x19           1     Written to $0312
0x1A           2     Written to $0313
0x1C           1     Written to $0315
0x1D           2     Written to $0C29
0x1F           2     Written to $0C2F
0x21           1     Width of Level (Screens)
                     Written to $0C18
0x22           1     Height of Level (Screens)
                     Written to $0C19 and $0C2E
0x23           1     Bank Number for Map Pointers
                     Written to $0C1C
0x24           2     Pointer to Screen Indices
                     Written to $0C1D
0x26           2     Pointer to Screen 0 Data
                     Written to $0C1F
0x28           2     Pointer to Metatiles (Different part?)
                     Written to $0C21
0x2A           2     Pointer to Metatiles?
                     Written to $0C23
0x2C           2     Pointer to Tile Properties
                     Written to $0C25
0x2E           2     Written to $0C4A
0x30           2     Written to $0C50
0x32           1     Written to $0C3B and $0C49
                     Value * 8 is written to $0C3D - $0C3E
0x33           1     Written to $0C3C and $0C4F
0x34           1     Written to $0C3F
0x35           2     Written to $0C40
0x37           2     Written to $0C42
0x39           2     Written to $0C44
0x3B           2     Written to $0C46
0x3D           2     Written to $0C6B
0x3F           2     Written to $0C71
0x41           1     Written to $0C5C and $0C6A
                     Value * 8 is written to $0C5E - $0C5F
0x42           1     Written to $0C5D and $0C70
0x43           1     Written to $0C60
0x44           2     Written to $0C61
0x46           2     Written to $0C63
0x48           2     Written to $0C65
0x4A           2     Written to $0C67
0x4C           3     Pointer to some Compressed Data written to $0050
0x4F           3     Pointer to some Compressed Data written to $0050
0x52           3     Pointer to some Compressed Data written to $0050
0x55           3     Pointer to some Compressed Data written to $0050
0x58           3     Pointer to some Compressed Data written to $0050
0x5B           3     Pointer to some Compressed Data written to $0050
0x5E           3     Pointer to some Compressed Data written to $0050

The format from 0x1D onward depends on whether RAM address $0BFD equals $07 (cutscene) or not (normal level). The values listed are for normal levels.
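
For normal levels, the map-related fields at 0x21 - 0x2D could be pulled out of a header roughly as follows. This is only a sketch: it assumes the 16-bit fields are little-endian and that the bank byte at 0x23 applies to all five map pointers, neither of which is confirmed above. The dictionary keys are invented for illustration.

```python
import struct


def read_map_fields(header):
    """Extract the map fields (0x21 - 0x2D) from a normal-level header."""
    if len(header) < 0x2E:
        raise ValueError("header too short for the map fields")
    width, height, bank = struct.unpack_from("<3B", header, 0x21)
    # Assumption: 16-bit pointers are little-endian and share the bank at 0x23.
    indices, screen0, meta_a, meta_b, props = struct.unpack_from("<5H", header, 0x24)

    def full(ptr):
        return (bank << 16) | ptr

    return {
        "width": width,                    # 0x21, in screens
        "height": height,                  # 0x22, in screens
        "screen_indices": full(indices),   # 0x24
        "screen0_data": full(screen0),     # 0x26
        "metatiles_a": full(meta_a),       # 0x28, "different part?"
        "metatiles_b": full(meta_b),       # 0x2A
        "tile_properties": full(props),    # 0x2C
    }
```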

Level Screen Data

Each screen is made up of 32x32 tiles. The number of horizontal and vertical screens is defined in the header. The header points to screen 0, and subsequent screens follow immediately one after another. Each screen is stored as 16-bit tile numbers, one for each 32x32 tile.

Level Screen Indices

The order in which these screens make up the level is stored in this list (also pointed to in the header). It is made up of m*n bytes, where n is the level width and m is the level height in screens: the first n bytes make up the top row of the level and the last n bytes make up the bottom row, for a total of m rows. Level 1-1 takes place entirely on the bottom of the stage, so there are a bunch of 0s at the beginning for that stage (where screen 0 is totally blank). You can use numbers more than once to make level geometry repeat itself if you're lazy.
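
To make the layout concrete, here is a small Python sketch that splits the index list into rows. The function name is hypothetical, and the width and height are the values from header offsets 0x21 and 0x22.

```python
def screen_grid(indices, width, height):
    """Split the flat index list into `height` rows of `width` screen numbers."""
    if len(indices) < width * height:
        raise ValueError("index list is shorter than width * height")
    return [
        list(indices[row * width:(row + 1) * width])
        for row in range(height)
    ]
```

For a level that is 3 screens wide and 2 high, the bytes 00 00 00 01 02 01 would give a blank top row and a bottom row that reuses screen 1 at both ends.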

Compressed Graphics

Offset 0x356C6 has pointers to compressed data for player animation frames.
Offset 0x323F5 has pointers to compressed data for ??? (other stuff).

The graphics use some kind of RLE-like compression scheme that uses bit shifting to determine how many times to write each byte of compressed data.

How a tile is decompressed:

  1. Load the current byte of compressed data.
  2. Let x = 0.
  3. Shift the loaded byte left by one bit.
  4. If the carry bit is set, set x to the next byte of the compressed data.
  5. Write x to current memory offset, increment memory offset by two.

Do steps 3-5 a total of eight times per tile, once for each bit of the byte loaded in step 1.

Examples:

Compressed:   80 FF
Decompressed: FF FF FF FF FF FF FF FF

Compressed:   0F 01 02 03 04
Decompressed: 00 00 00 00 01 02 03 04

Compressed:   CC FF 00 FF 00
Decompressed: FF 00 00 00 FF 00 00 00

Compressed:   00
Decompressed: 00 00 00 00 00 00 00 00

Compressed:   FF 01 02 03 04 05 06 07 08
Decompressed: 01 02 03 04 05 06 07 08
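
The scheme can be sketched in Python as below and checked against the examples above. This is an illustration written from these notes, not a port of the game's code: the function names are made up, and the output is returned as contiguous bytes, leaving out the "increment by two" destination stride.

```python
def decompress_group(data, pos=0):
    """Decode one control byte plus its literals into 8 output bytes.

    Returns (output_bytes, position_after_this_group).
    """
    control = data[pos]
    pos += 1
    value = 0                      # x starts at zero for every group
    out = bytearray()
    for _ in range(8):
        carry = control & 0x80     # bit shifted out by the left shift
        control = (control << 1) & 0xFF
        if carry:
            value = data[pos]      # take the next literal byte
            pos += 1
        out.append(value)          # otherwise repeat the previous value
    return bytes(out), pos


def decompress(data, groups, pos=0):
    """Decode `groups` consecutive groups starting at `pos`."""
    out = bytearray()
    for _ in range(groups):
        chunk, pos = decompress_group(data, pos)
        out += chunk
    return bytes(out), pos
```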

Decompression order at beginning of game, before showing title screen:

Even-numbered bytes (00, 02, ...) are decompressed first, then the odd-numbered bytes are decompressed after.

07:8000 - 07:8054 -> 7F:2000
07:810D - 07:8165 -> 7F:2001

07:8055 - 07:80A6 -> 7F:2200
07:8166 - 07:81B1 -> 7F:2201

07:80A7 - 07:80DB -> 7F:2400
07:81B2 - 07:81DB -> 7F:2401

07:80DC - 07:810C -> 7F:2600
07:81DC - 07:8208 -> 7F:2601

etc...
7F:2000 - 7F:27FF are the first 4 horizontal rows of player sprite data in WRAM.
Then the next 4 rows are loaded. Player sprites span most of the used space in bank 7.
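
The even/odd layout amounts to interleaving two decompressed streams. A minimal Python sketch, assuming both passes produce the same number of bytes (which fits the 0x200-byte spacing of the destinations above but is not stated outright); the function name is hypothetical:

```python
def interleave(even_bytes, odd_bytes):
    """Merge two decompressed streams into even and odd byte positions."""
    if len(even_bytes) != len(odd_bytes):
        raise ValueError("both passes are assumed to be the same length")
    out = bytearray(len(even_bytes) * 2)
    out[0::2] = even_bytes   # first pass, e.g. written from 7F:2000
    out[1::2] = odd_bytes    # second pass, e.g. written from 7F:2001
    return bytes(out)
```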

Graphics decompression routine begins at ROM address 03:AB23.
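
These notes mix file offsets (0x33F82) with bank:address pairs (07:8000). The Python sketch below converts between the two under two loud assumptions that the notes do not confirm: the ROM uses LoROM mapping (suggested by every address falling in $8000-$FFFF), and the ROM image has no 512-byte copier header.

```python
BANK_SIZE = 0x8000  # assumption: LoROM, 32 KiB of ROM per bank


def lorom_to_offset(bank, addr):
    """Convert a bank:address pair to a headerless file offset."""
    if not 0x8000 <= addr <= 0xFFFF:
        raise ValueError("LoROM addresses are expected in $8000-$FFFF")
    return (bank & 0x7F) * BANK_SIZE + (addr - 0x8000)


def offset_to_lorom(offset):
    """Convert a headerless file offset to a (bank, address) pair."""
    return offset // BANK_SIZE, 0x8000 + (offset % BANK_SIZE)
```

Under those assumptions, 03:AB23 corresponds to file offset 0x1AB23, and the pointer table at 0x33F82 sits at 06:BF82.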

Variables:
$42 - byte of graphics data
$43 - byte being read in
$44 - bytes remaining in this item
$46 - number of items remaining to decompress for this row

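--  lda    #$08
    sta    $44         ; 8 bytes to write for this item
    stz    $42         ; the byte of graphics data starts at zero
    lda    $0000, y    ; ex: for loading sprites, y = $8000 and bank number = 7
    sta    $43         ; keep the byte just read in $43
    iny
-   asl    $43         ; shift the byte left.
    bcs    +           ; if the carry is set, jump ahead
    lda    $42         ; otherwise load the byte already at $42 and store it again.
    bra    ++
+   lda    $0000, y    ; on carry: load the next byte, store it in $42
    sta    $42
    iny
++  sta    $7F2000,x   ; write the byte to WRAM
    inx                ; advance the destination by two bytes
    inx
    dec    $44
    bne    -           ; if $44 is still nonzero, process the next bit
    dec    $46
    bne    --          ; if $46 is still nonzero, start the next item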

Level format, compression and graphic compression derived by Revenant. Started on 2011-03-06, last updated 2011-03-09, cleaned up on 2012-01-01.