SA-1 (Expansion Coprocessor Chip)
Overview
The SA-1 is a coprocessor capable of running at four times the base speed of the SNES main CPU when its memory accesses do not conflict with those of the SNES CPU, DMA or HDMA. It provides additional shared RAM, referred to as BW-RAM, that is visible to both the SNES CPU and the SA-1 and can optionally be battery-backed. The maximum installed BW-RAM size is 256 KB (2 Mbit).
The SNES WRAM (7E:xxxx,7F:xxxx) and PPU/CPU registers at 00:2xxx and 00:4xxx are not visible to the SA-1 CPU.
A smaller, fixed-size 2 KB (16 kbit) I-RAM is also provided. The SA-1 includes its own DMA controller, which is part of the character conversion logic that can convert bitmap-formatted graphics into the SNES PPU's character (i.e., tiled) format.
The SA-1 also incorporates an implementation of the Super MMC memory decoder. The MMC functions provide banking of BW-RAM and ROM pages as well as write protection logic for BW-RAM and I-RAM.
SA-1 ROMs indicate the chip's presence through the mapping type and cartridge type bytes in the ROM header, at ROM offsets 007FD5 and 007FD6 (without SMC header). These are set to $23,$34 or $23,$35.
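The header check described above can be sketched in Python as follows. This is only an illustration: the function name looks_like_sa1 is hypothetical, and the copier-header detection (a file whose size is 512 bytes past a multiple of 1 KB is assumed to carry a 512-byte SMC header) is an assumption of this example, not something the SA-1 itself defines.

```python
MAP_TYPE_OFFSET = 0x7FD5   # mapping type byte (offset in a headerless ROM image)
CART_TYPE_OFFSET = 0x7FD6  # cartridge type byte (offset in a headerless ROM image)
SMC_HEADER_SIZE = 512      # assumed size of an optional copier (SMC) header


def looks_like_sa1(rom: bytes) -> bool:
    """Return True if the header bytes read $23,$34 or $23,$35."""
    # Assumption: an image whose size is 512 bytes past a multiple of 1 KB
    # starts with an SMC header that must be skipped.
    skip = SMC_HEADER_SIZE if len(rom) % 1024 == SMC_HEADER_SIZE else 0
    if len(rom) < skip + CART_TYPE_OFFSET + 1:
        return False  # image too small to contain the header bytes
    map_type = rom[skip + MAP_TYPE_OFFSET]
    cart_type = rom[skip + CART_TYPE_OFFSET]
    return map_type == 0x23 and cart_type in (0x34, 0x35)
```

A loader would typically call this on the raw file contents before deciding which memory map to emulate. A match is a hint only, since nothing stops a non-SA-1 image from containing the same two bytes.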
The circuitry arbitrates conflicting simultaneous accesses by the SNES CPU and the SA-1 to the shared memory regions, so both CPUs always operate on valid data. However, such accesses carry a performance penalty and should normally be avoided. There are three main cases to watch for. ROM/ROM collisions are avoided by having the SNES CPU run its code from BW-RAM instead of ROM, with that code operating only on RAM locations, WRAM or PPU/CPU registers. BW-RAM/BW-RAM collisions are avoided by running the SNES CPU code from WRAM, I-RAM or ROM and having it access only WRAM, ROM, registers or I-RAM. I-RAM/I-RAM collisions are avoided by running the SNES CPU code from BW-RAM, ROM or WRAM.
The compromise that yields the fewest collisions may be to run the SNES CPU mainly from BW-RAM, using interrupts to implement a crude form of paging, as a sort of script processor (mainly for animations and other work likely to use PPU registers). The main loop (the non-interrupt thread) is a WAI/BRA loop in WRAM, which the SNES CPU sits in while the SA-1 updates the code the SNES CPU will run next. The SA-1 makes this easy because it implements interrupt vector registers at $2205-$2208 and $220C-$220F that override the ROM interrupt vectors, so program-generated interrupts can be vectored to any bank 0 address on either processor. Whenever such code executes an RTI, the SNES CPU returns to the loop in WRAM until the next interrupt arrives. In effect, the SNES CPU follows an event-driven execution model.
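To make the vector registers more concrete, the following Python sketch computes the byte writes that would point one vector at a bank 0 address. The split of the two register ranges into four 16-bit little-endian vectors (SA-1 NMI at $2205, SA-1 IRQ at $2207, SNES NMI at $220C, SNES IRQ at $220E) is an assumption based on the commonly documented register layout, and the names VECTOR_REGS and vector_writes are hypothetical.

```python
from typing import List, Tuple

# Assumed layout: each vector is a 16-bit little-endian register pair.
VECTOR_REGS = {
    "sa1_nmi": 0x2205,
    "sa1_irq": 0x2207,
    "snes_nmi": 0x220C,
    "snes_irq": 0x220E,
}


def vector_writes(which: str, target: int) -> List[Tuple[int, int]]:
    """Return (register, byte) pairs that point one vector at a bank 0 address."""
    if not 0 <= target <= 0xFFFF:
        raise ValueError("vector targets are 16-bit bank 0 addresses")
    base = VECTOR_REGS[which]
    # Low byte goes to the first register, high byte to the second.
    return [(base, target & 0xFF), (base + 1, target >> 8)]
```

For instance, vector_writes("snes_nmi", 0x1F00) yields the two stores needed to send the SNES NMI to a handler at 00:1F00. An assembler macro or build script could emit these writes whenever the SA-1 installs new code for the SNES CPU.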
With interrupt-driven processing, a small WRAM routine may look like the following. It must live in the 8 KB WRAM region at 00:0000-00:1FFF, because the interrupt vectors can only point to bank 0 addresses.
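noint:
    LDA $4210   ; read the NMI flag register to acknowledge the NMI (clobbers A)
    RTI         ; return to the idle loop below
- WAI           ; sleep until the next interrupt arrives
    BRA -       ; once the handler returns, go back to sleep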
The interrupt vectors are set to (00:)noint whenever no code is currently available that the SNES CPU could run without access collisions. The loop keeps the processor asleep, accessing only WRAM, for as long as no interrupts occur. This gives the SA-1 full-speed access to ROM, I-RAM and BW-RAM at four times the rate of the SNES CPU, allowing it to do the heavy lifting it was designed for.