A DSP chip from Capcom; it is actually a Hitachi HG51B169, as confirmed by decapping. There are 1024K words of 24-bit instructions, running at 20.000 MHz, and it is used in two games: Mega Man X2 and Mega Man X3 (MMX2 and MMX3 below).

The test program is located at $029C00.

CX4 Interface Summary

The CX4 has 16 multi-purpose triple-byte registers in the memory range $7F80 through $7FAF.
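
As a small illustration of the layout described above, the sketch below maps a register number to its first byte in the $7F80 - $7FAF window. The helper name `cx4_register_address` is hypothetical. The 3-byte stride is inferred from the routines later on this page, which load R0 from $7F80, R1 from $7F83, and so on.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the original source): address of the
// first (least significant) byte of work register n, for n in 0..15.
// Assumes the registers are packed back to back, three bytes each.
constexpr uint32_t cx4_register_address(unsigned n) {
    return 0x7F80u + 3u * n;
}
```

With this packing, R15 starts at $7FAD and its last byte lands on $7FAF, the end of the stated range.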

Mode   Size            Bank       RAM            Registers
0x20H  2Mbit - 32Mbit  $00 - $3F  $6000 - $6BFF  $7F40 - $7FAF

CX4 Registers

Registers Function
$7F40 - $7F47 DMA Transfer (?)
$7F49 - $7F4B ROM Offset
$7F4D - $7F4E Page Select
$7F4F Instruction Pointer

Start Address = (((Page_Select * 256) + Instruction_Pointer) * 2) + ROM_Offset

I ran some tests on a Mashmods flash programmer with an MMX2 cart, and the following two cases give the same result:

Page Select = $000E and ROM_Offset = $028000
Page Select = $010E and ROM_Offset = $008000
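
The equivalence of these two cases can be checked with a few lines of arithmetic. This is only a sketch: the function name `cx4_start_address` is hypothetical, and it reads the Start Address formula as ((Page_Select * 256 + Instruction_Pointer) * 2) + ROM_Offset.

```cpp
#include <cstdint>

// Hypothetical helper name; the arithmetic follows the Start Address formula,
// read as ((page * 256 + ip) * 2) + rom_offset.
constexpr uint32_t cx4_start_address(uint32_t page_select,
                                     uint32_t instruction_pointer,
                                     uint32_t rom_offset) {
    return ((page_select * 256u + instruction_pointer) * 2u) + rom_offset;
}
```

With an instruction pointer of 0, both register settings resolve to $029C00, which matches the location given for the test program.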

CX4 Memory Layout

Program ROM is loaded one page of 256 x 16-bit words at a time (taken from the SNES ROM). Program RAM is 2 x 256 x 16-bit (two banks). Data ROM is 1024 x 24-bit (the only ROM internal to the CX4). Data RAM is 4 x 384 x 16-bit. The call stack is 8 levels deep and at least 16 bits wide.

Memory Type  Size              Location
Program ROM  256 x 16-bit      SNES ROM
Program RAM  2 x 256 x 16-bit  (2 Banks)
Data ROM     1024 x 24-bit     CX4 Internal
Data RAM     4 x 384 x 16-bit

CX4 Data ROM Layout

Location Table Data
000 - 0FF Inverse
100 - 1FF Square Root (sqrt)
200 - 27F First Quadrant Sine (sin)
280 - 2FF First Quadrant Arcsine (asin)
300 - 37F First Quadrant Tangent (tan)
380 - 3FF First Quadrant Cosine (cos)
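
For tooling that walks a dump of the Data ROM, the layout above can be written down as a simple range lookup. This is a sketch only; the enum and function names are hypothetical, and the boundaries are copied from the table.

```cpp
#include <cstdint>

// Hypothetical names; the ranges are taken from the Data ROM layout table.
enum class Cx4RomTable { Inverse, SquareRoot, Sine, Arcsine, Tangent, Cosine, OutOfRange };

// Classify a Data ROM word index (0x000..0x3FF) by the table it belongs to.
constexpr Cx4RomTable cx4_rom_table(uint32_t index) {
    return index < 0x100 ? Cx4RomTable::Inverse
         : index < 0x200 ? Cx4RomTable::SquareRoot
         : index < 0x280 ? Cx4RomTable::Sine
         : index < 0x300 ? Cx4RomTable::Arcsine
         : index < 0x380 ? Cx4RomTable::Tangent
         : index < 0x400 ? Cx4RomTable::Cosine
         : Cx4RomTable::OutOfRange;
}
```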

CX4 Command Summary

Commands are executed on the CX4 by writing the command to $7F4F while bit 6 of $7F5E is clear. Bit 6 of $7F5E will stay set until the command has completed, at which time output data will be available. See individual commands for input and output parameter addresses.

Command Function Name
$00 Sprite Functions
$01 Wireframe
$05 Propulsion
$0D Set Vector Length
$10 Triangle
$13 Triangle
$15 Pythagorean
$1F Arc-Tan
$22 Trapezoid
$25 Multiply
$2D Transform Coordinates

Test Functions

Command Function Name
$00 - $3F Command Shift
$40 Sum
$54 Square
$5C Immediate Register
$5E - $7E Immediate Register (Multiple)
$89 Immediate ROM

The following steps must be completed when accessing these functions:

  1. Wait until bit 6 of $xx7F5E is clear
  2. Write $0E to $xx7F4D
  3. Write $00 to $xx7F4E
  4. Write $01 to $xx7F48
  5. Wait until bit 6 of $xx7F5E is clear
  6. Write command to $xx7F4F
  7. Wait until bit 6 of $xx7F5E is clear
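
The seven steps above can be sketched as a short routine. Everything about the bus here is an assumption made for illustration: `Cx4Bus`, `read8` and `write8` are hypothetical names standing in for whatever memory access the host environment provides, and the bank byte ($xx) is left to that implementation.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical bus abstraction (not part of the original source).
struct Cx4Bus {
    std::function<uint8_t(uint32_t)> read8;        // read one byte
    std::function<void(uint32_t, uint8_t)> write8; // write one byte
};

// Spin until bit 6 of $7F5E is clear.
inline void cx4_wait_ready(const Cx4Bus &bus) {
    while (bus.read8(0x7F5E) & 0x40) {
        // busy: the chip is still working
    }
}

// Run one test function, following the listed steps in order.
inline void cx4_run_test_command(const Cx4Bus &bus, uint8_t command) {
    cx4_wait_ready(bus);          // step 1
    bus.write8(0x7F4D, 0x0E);     // step 2
    bus.write8(0x7F4E, 0x00);     // step 3
    bus.write8(0x7F48, 0x01);     // step 4
    cx4_wait_ready(bus);          // step 5
    bus.write8(0x7F4F, command);  // step 6
    cx4_wait_ready(bus);          // step 7
}
```

Passing the bus as a pair of callbacks keeps the sequence testable against a mock before it is pointed at real hardware or an emulator core.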

CX4 Source Code Contributors

Reverse engineered by zsknight and documented by anomie. Source code and additional information by Overload.

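#include <cstdint>

unsigned char CX4_Ram[0x0C00];
unsigned char CX4_Reg[0x0100];

// Fixed-width helper types used by the routines below
typedef uint8_t  uint8;
typedef uint32_t uint32;

// 24-bit values are held in the low bits of a 32-bit unsigned integer
#define uint24 unsigned int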

// 24 Bit Work Registers
uint24 R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, R13, R14, R15;
		
const uint24 CX4_SinTable[256] = {		
    0x000000, 0x000324, 0x000648, 0x00096c, 0x000c8f, 0x000fb2, 0x0012d5, 0x0015f6,
    0x001917, 0x001c37, 0x001f56, 0x002273, 0x002590, 0x0028aa, 0x002bc4, 0x002edb,
    0x0031f1, 0x003505, 0x003817, 0x003b26, 0x003e33, 0x00413e, 0x004447, 0x00474d,
    0x004a50, 0x004d50, 0x00504d, 0x005347, 0x00563e, 0x005931, 0x005c22, 0x005f0e,
    0x0061f7, 0x0064dc, 0x0067bd, 0x006a9b, 0x006d74, 0x007049, 0x007319, 0x0075e5,
    0x0078ad, 0x007b70, 0x007e2e, 0x0080e7, 0x00839c, 0x00864b, 0x0088f5, 0x008b9a,
    0x008e39, 0x0090d3, 0x009368, 0x0095f6, 0x00987f, 0x009b02, 0x009d7f, 0x009ff6,
    0x00a267, 0x00a4d2, 0x00a736, 0x00a994, 0x00abeb, 0x00ae3b, 0x00b085, 0x00b2c8,
    0x00b504, 0x00b73a, 0x00b968, 0x00bb8f, 0x00bdae, 0x00bfc7, 0x00c1d8, 0x00c3e2,
    0x00c5e4, 0x00c7de, 0x00c9d1, 0x00cbbb, 0x00cd9f, 0x00cf7a, 0x00d14d, 0x00d318,
    0x00d4db, 0x00d695, 0x00d848, 0x00d9f2, 0x00db94, 0x00dd2d, 0x00debe, 0x00e046,
    0x00e1c5, 0x00e33c, 0x00e4aa, 0x00e60f, 0x00e76b, 0x00e8bf, 0x00ea09, 0x00eb4b,
    0x00ec83, 0x00edb2, 0x00eed8, 0x00eff5, 0x00f109, 0x00f213, 0x00f314, 0x00f40b,
    0x00f4fa, 0x00f5de, 0x00f6ba, 0x00f78b, 0x00f853, 0x00f912, 0x00f9c7, 0x00fa73,
    0x00fb14, 0x00fbac, 0x00fc3b, 0x00fcbf, 0x00fd3a, 0x00fdab, 0x00fe13, 0x00fe70,
    0x00fec4, 0x00ff0e, 0x00ff4e, 0x00ff84, 0x00ffb1, 0x00ffd3, 0x00ffec, 0x00fffb,
    0x000000, 0xfffcdb, 0xfff9b7, 0xfff693, 0xfff370, 0xfff04d, 0xffed2a, 0xffea09,
    0xffe6e8, 0xffe3c8, 0xffe0a9, 0xffdd8c, 0xffda6f, 0xffd755, 0xffd43b, 0xffd124,
    0xffce0e, 0xffcafa, 0xffc7e8, 0xffc4d9, 0xffc1cc, 0xffbec1, 0xffbbb8, 0xffb8b2,
    0xffb5af, 0xffb2af, 0xffafb2, 0xffacb8, 0xffa9c1, 0xffa6ce, 0xffa3dd, 0xffa0f1,
    0xff9e08, 0xff9b23, 0xff9842, 0xff9564, 0xff928b, 0xff8fb6, 0xff8ce6, 0xff8a1a,
    0xff8752, 0xff848f, 0xff81d1, 0xff7f18, 0xff7c63, 0xff79b4, 0xff770a, 0xff7465,
    0xff71c6, 0xff6f2c, 0xff6c97, 0xff6a09, 0xff6780, 0xff64fd, 0xff6280, 0xff6009,
    0xff5d98, 0xff5b2d, 0xff58c9, 0xff566b, 0xff5414, 0xff51c4, 0xff4f7a, 0xff4d37,
    0xff4afb, 0xff48c5, 0xff4697, 0xff4470, 0xff4251, 0xff4038, 0xff3e27, 0xff3c1e,
    0xff3a1b, 0xff3821, 0xff362e, 0xff3444, 0xff3260, 0xff3085, 0xff2eb2, 0xff2ce7,
    0xff2b24, 0xff296a, 0xff27b7, 0xff260d, 0xff246b, 0xff22d2, 0xff2141, 0xff1fb9,
    0xff1e3a, 0xff1cc3, 0xff1b55, 0xff19f0, 0xff1894, 0xff1740, 0xff15f6, 0xff14b4,
    0xff137c, 0xff124d, 0xff1127, 0xff100a, 0xff0ef6, 0xff0dec, 0xff0ceb, 0xff0bf4,
    0xff0b05, 0xff0a21, 0xff0945, 0xff0874, 0xff07ac, 0xff06ed, 0xff0638, 0xff058d,
    0xff04eb, 0xff0453, 0xff03c4, 0xff0340, 0xff02c5, 0xff0254, 0xff01ec, 0xff018f,
    0xff013b, 0xff00f1, 0xff00b1, 0xff007b, 0xff004e, 0xff002c, 0xff0013, 0xff0004
};

An angle has 9 bits of resolution: 8 data bits plus a sign bit. Angles range from -180° to +180°.
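
To make the angle format concrete, here is a minimal conversion sketch. It assumes the 9-bit value is a two's complement count of 512 steps per full turn, so one step is 360/512 = 0.703125°. That is consistent with the cosine routine on this page adding 0x080 for a 90° shift. The function name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: convert a raw 9-bit CX4 angle to degrees.
// Assumes 512 steps per full turn (one step = 0.703125 degrees).
inline double cx4_angle_to_degrees(uint32_t raw) {
    int32_t value = static_cast<int32_t>(raw & 0x1FF);
    if (value & 0x100) value -= 0x200;  // sign-extend bit 8
    return value * (360.0 / 512.0);
}
```

Under that assumption 0x080 is +90°, 0x0FF is just under +180°, and 0x100 is -180°.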

uint24 CX4_Sin(uint24 Rx){
	R0 = Rx & 0x1ff;

	if (R0 & 0x100) R0 ^= 0x1ff;
	if (R0 & 0x080) R0 ^= 0x0ff;

	if (Rx & 0x100)
		return CX4_SinTable[R0 + 0x80];
	else
		return CX4_SinTable[R0];
}

uint24 CX4_Cos(uint24 Rx){
	return CX4_Sin(Rx + 0x080);
}
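
The sine table entries look like 16-bit fixed-point fractions. The sketch below is an assumption, checked by hand against a handful of entries only: it estimates table entry `index` as floor(sin(k * pi / 256) * 65536) with k = index & 0x7F, negated for the second half of the table and stored as a 24-bit two's complement value. The function name is hypothetical, and this is not claimed to be how the table was originally generated.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical estimate of CX4_SinTable[index], for index in 0..255.
// Assumption: entries are floor(sin * 65536), and indices 0x80..0xFF hold
// the negated first-quadrant values.
inline uint32_t cx4_sin_table_estimate(unsigned index) {
    const double pi = 3.14159265358979323846;
    double s = std::sin((index & 0x7F) * pi / 256.0) * 65536.0;
    if (index & 0x80) s = -s;  // second half of the table is negative
    int32_t fixed = static_cast<int32_t>(std::floor(s));
    return static_cast<uint32_t>(fixed) & 0xFFFFFFu;  // keep 24 bits
}
```

If the assumption holds, a value such as 0x00b504 at index 64 reads as roughly 0.7071 (sin 45°) once divided by 65536.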

bool CX4_Adc24(uint24 &A, uint24 B, bool Carry){
	// Add two 24-bit values plus carry-in; bit 24 of the sum is the carry-out
	uint32 C = (A & 0xffffff) + (B & 0xffffff);
	if (Carry) C++;
	A = (uint24) (C & 0xffffff);
	return (C & 0x1000000) != 0;
}

void CX4_Mul24(uint24 A, uint24 B, uint24 &CL, uint24 &CH){
	if (B & 0x800000){
		A = -A;
		B = -B;
	}

	CL = CH = 0;

	uint24 AdderL = A;
	uint24 AdderH = 0;

	if (A & 0x800000) AdderH--;

	B &= 0xffffff;

	while (B){
		if (B & 1)
			CX4_Adc24(CH, AdderH, CX4_Adc24(CL, AdderL, false));

		AdderH <<= 1;
		if (AdderL & 0x800000) AdderH++; 
		AdderL <<= 1;

		B >>= 1;
	}
}
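
For cross-checking the shift-and-add routine above, a 64-bit reference is handy. This sketch assumes the routine is meant as a signed 24 x 24 -> 48-bit multiply with the result split into low and high 24-bit halves. The function names are hypothetical, and edge cases such as 0x800000 may not agree with the original routine.

```cpp
#include <cstdint>

// Sign-extend a 24-bit value held in the low bits of a 32-bit integer.
inline int64_t cx4_sign_extend24(uint32_t v) {
    v &= 0xFFFFFFu;
    return (v & 0x800000u) ? static_cast<int64_t>(v) - 0x1000000
                           : static_cast<int64_t>(v);
}

// Hypothetical reference multiply: signed 24 x 24 -> 48 bits,
// returned as a low and a high 24-bit half.
inline void cx4_mul24_reference(uint32_t a, uint32_t b,
                                uint32_t &lo, uint32_t &hi) {
    int64_t product = cx4_sign_extend24(a) * cx4_sign_extend24(b);
    uint64_t bits = static_cast<uint64_t>(product);
    lo = static_cast<uint32_t>(bits & 0xFFFFFFu);
    hi = static_cast<uint32_t>((bits >> 24) & 0xFFFFFFu);
}
```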

uint24 CX4_Ldr24(uint32 ofs){
	return (CX4_Reg[ofs + 0] | (CX4_Reg[ofs + 1] << 8) | (CX4_Reg[ofs + 2] << 16));
}

void CX4_Str24(uint32 ofs, uint24 Rx){
	CX4_Reg[ofs + 0] = (uint8) (Rx);
	CX4_Reg[ofs + 1] = (uint8) (Rx >> 8);
	CX4_Reg[ofs + 2] = (uint8) (Rx >> 16);
}

0x01 - Wireframe

N/A

0x05 - Propulsion

N/A

0x0D - Set Vector Length

N/A

0x10 - Triangle

I/O     Address        Name / Variable
Input   $7F80 - $7F82  Angle (R0)
        $7F83 - $7F85  Radius (R1)
Output  $7F86 - $7F88  X (R2)
        $7F89 - $7F8B  Y (R3)

void CX4_Triangle16(){
	R0 = CX4_Ldr24(0x80);
	R1 = CX4_Ldr24(0x83);
	
	R4 = R0 & 0x1ff;
	if (R1 & 0x8000) R1 |= 0xff0000;
	
	CX4_Mul24(CX4_Cos(R4), R1, R5, R2);
	R5 = (R5 >> 16) & 0xff;
	R2 = (R2 << 8) + R5;
	
	CX4_Mul24(CX4_Sin(R4), R1, R5, R3);
	R5 = (R5 >> 16) & 0xff;
	R3 = (R3 << 8) + R5;
	
	CX4_Str24(0x80, R0);
	CX4_Str24(0x83, R1);
	CX4_Str24(0x86, R2);
	CX4_Str24(0x89, R3);
	CX4_Str24(0x8c, R4);
	CX4_Str24(0x8f, R5);
}

0x13 - Triangle

I/O     Address        Name / Variable
Input   $7F80 - $7F82  Angle (R0)
        $7F83 - $7F85  Radius (R1)
Output  $7F86 - $7F88  X (R2)
        $7F89 - $7F8B  Y (R3)

void CX4_Triangle24(){
	R0 = CX4_Ldr24(0x80);
	R1 = CX4_Ldr24(0x83);

	R4 = R0 & 0x1ff;

	CX4_Mul24(CX4_Cos(R4), R1, R5, R2);
	R5 = (R5 >> 8) & 0xffff;
	R2 = (R2 << 16) + R5;

	CX4_Mul24(CX4_Sin(R4), R1, R5, R3);
	R5 = (R5 >> 8) & 0xffff;
	R3 = (R3 << 16) + R5;

	CX4_Str24(0x80, R0);
	CX4_Str24(0x83, R1);
	CX4_Str24(0x86, R2);
	CX4_Str24(0x89, R3);
	CX4_Str24(0x8c, R4);
	CX4_Str24(0x8f, R5);
}
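
The two triangle commands differ only in how they cut a 24-bit result out of the 48-bit product. The sketch below isolates that step. The helper names are hypothetical. The reading that the sine table is scaled by 65536 is an assumption; under it, $10 yields an integer result and $13 keeps 8 fractional bits.

```cpp
#include <cstdint>

// Command $10 path: keeps product bits 16..39,
// i.e. (product >> 16) & 0xFFFFFF.
constexpr uint32_t cx4_scale16(uint32_t lo, uint32_t hi) {
    return ((hi << 8) + ((lo >> 16) & 0xFFu)) & 0xFFFFFFu;
}

// Command $13 path: keeps product bits 8..31,
// i.e. (product >> 8) & 0xFFFFFF.
constexpr uint32_t cx4_scale24(uint32_t lo, uint32_t hi) {
    return ((hi << 16) + ((lo >> 8) & 0xFFFFu)) & 0xFFFFFFu;
}
```

For example, a product of 0x123456789ABC (low half 0x789ABC, high half 0x123456) gives 0x345678 on the $10 path and 0x56789A on the $13 path.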

0x15 - Pythagorean

N/A

0x1F - Arc-Tan

N/A

0x22 - Trapezoid

N/A

0x25 - Multiply

I/O     Address        Name / Variable
Input   $7F80 - $7F82  Multiplicand (R0)
        $7F83 - $7F85  Multiplier (R1)
Output  $7F80 - $7F85  Product (R1:R0)

void CX4_Multiply(){
	R0 = CX4_Ldr24(0x80);
	R1 = CX4_Ldr24(0x83);

	CX4_Mul24(R0, R1, R0, R1);

	CX4_Str24(0x80, R0);
	CX4_Str24(0x83, R1);
}

0x2D - Transform Coordinates

N/A

0x40 - Sum

I/O     Address        Name / Variable
Input   $6000 - $67FF
Output  $7F80 - $7F82  (R0)

void CX4_Sum(){
	R0 = 0;
	for (uint32 i = 0; i < 0x800; i++) R0 += CX4_Ram[i];
	CX4_Str24(0x80, R0);
}

0x54 - Square

I/O     Address        Name / Variable
Input   $7F80 - $7F82  (R0)
Output  $7F83 - $7F88  (R2:R1)

void CX4_Square(){
	R0 = CX4_Ldr24(0x80);

	CX4_Mul24(R0, R0, R1, R2);

	CX4_Str24(0x83, R1);
	CX4_Str24(0x86, R2);
}

0x5C - Immediate Register

I/O     Address        Name / Variable
Input   None
Output  $6000 - $6030
        $7F80 - $7F82  (R0)

const unsigned char ImmediateData[48] = {
	 0x00,  0xfe,  0xff,  0x00,  0x01,  0x00,  0xfe,  0xff,  0xff,  0x01,  0x00,  0x00,
	 0xff,  0xff,  0x7f,  0xff,  0x7f,  0xff,  0x00,  0x7f,  0xff,  0x00,  0x80,  0x00,
	 0x7f,  0xff,  0xff,  0x80,  0x00,  0x00,  0xff,  0xff,  0x00,  0x00,  0xff,  0xff,
	 0xff,  0x00,  0x00,  0x00,  0xff,  0x00,  0xff,  0xff,  0xff,  0x00,  0x00,  0x00	 
};

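void CX4_ImmediateReg(){
	R0 = 0;

	for (uint32 i = 48; i > 0; --i){
		// i - 1 keeps the index inside the 48-entry table (valid range 0..47);
		// the copy order is assumed from the direction of the original loop
		CX4_Ram[R0] = ImmediateData[i - 1];
		R0++;
	}

	CX4_Str24(0x80, R0);
}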

0x5E - 0x7E - Immediate Register (Multiple)

I/O     Address        Name / Variable
Input   $7F80 - $7F82  (R0)
Output  $6XXX - $6XXX
        $7F80 - $7F82  (R0)

This command transfers a preset triple-byte pattern into the memory offset specified in R0.

Command Number of Bytes
$5E 48
$60 45
$62 42
$64 39
$66 36
$68 33
$6A 30
$6C 27
$6E 24
$70 21
$72 18
$74 15
$76 12
$78 9
$7A 6
$7C 3
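
void CX4_ImmediateReg(uint32 NumberOfBytes){
	R0 = CX4_Ldr24(0x80);

	for (uint32 i = NumberOfBytes; i > 0; --i){
		// Skip offsets past the end of the 0x0C00-byte RAM array; i - 1 keeps
		// the table index in range (copy order assumed from the original loop)
		if ((R0 & 0xfff) < 0x0c00)
			CX4_Ram[R0 & 0xfff] = ImmediateData[i - 1];
		R0++;
	}

	CX4_Str24(0x80, R0);
}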
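
The byte counts in the command table follow a regular pattern: each step of two in the command number drops three bytes. A sketch of that mapping, with a hypothetical function name, is below. It covers only the commands listed in the table ($5E through $7C, even values) and returns -1 for anything else.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes transferred by commands $5E..$7C
// (even values only), per the table above. Returns -1 for other commands.
constexpr int cx4_immediate_byte_count(uint8_t command) {
    return (command >= 0x5E && command <= 0x7C && (command & 1) == 0)
        ? 48 - 3 * ((command - 0x5E) / 2)
        : -1;
}
```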

0x89 - Immediate ROM

I/O     Address        Name / Variable
Input   None
Output  $7F80 - $7F85  (R1:R0)

void CX4_ImmediateROM(){
	R0 = 0x054336;
	R1 = 0xffffff;

	CX4_Str24(0x80, R0);
	CX4_Str24(0x83, R1);
}

CX4 Data ROM Extraction

ROM_Offset = 028000
Page Select = 0006
PC = 04

04: 606b lda     r11
05: 7000 sta     rp
06: 6008 lda     rom
07: e06c sta     r12
08: 3c00 ret

Patents

5740404, 5513374, 5426600, 5440747, 5535417, 5381360, 5832258

Pinout

A0..23, D0..7, RD, WR, RST, etc. are connected to the cart edge
RA0..20, RD0..7, ROE, RWE, RCE, etc. are connected to the ROM
RCE1 is CE on the first ROM if 2x8Mbit ROMs are used, or the only ROM if a 16Mbit ROM is used
RCE2 is CE on the second ROM if 2x8Mbit ROMs are used
SRCE is CE on the SRAM chip (never used on official games)

1   A3     21  A15    41  RA8    61  /IRQ
2   A4     22  A14    42  RA7    62  D7
3   A5     23  A13    43  RA6    63  D6
4   A6     24  A12    44  RA5    64  D5
5   A7     25  SRCE   45  RA4    65  D4
6   A8     26  RCE1   46  RA3    66  Vcc
7   A9     27  RCE2   47  RA2    67  D3
8   A10    28  RA19   48  RA1    68  D2
9   A11    29  RA18   49  RA0    69  D1
10  GND    30  RA17   50  GND    70  D0
11  XIN    31  Vcc    51  RWE    71  Vcc
12  XOUT   32  RA16   52  ROE    72  RST
13  A23    33  RA15   53  RD7    73  GND
14  A22    34  RA20   54  RD6    74  ???
15  A21    35  RA14   55  RD5    75  ???
16  A20    36  RA13   56  RD4    76  RD
17  A19    37  RA12   57  RD3    77  WR
18  A18    38  RA11   58  RD2    78  A0
19  A17    39  RA10   59  RD1    79  A1
20  A16    40  RA9    60  RD0    80  A2

Pins 74 and 75 are not internally connected to GND, but are tied to GND on the MMX2 and MMX3 PCBs. I have no idea what they're for, but they're probably inputs of some kind. Possibly one of them could be a Hi/LoROM switch like pin 10 on the MAD-1, but that is purely speculation at this point.

Pin 51 (RWE) is asserted low for writes to both the ROM and SRAM address space, so it can be used to write to reprogrammable ROM chips in-circuit, as well as writing to SRAM.

Information provided by Overload (codeviolation@hotmail.com) / Dr. Decapitator / byuu. Jonas Quinn - Program ROM discovery. Overload - Data ROM extraction. Segher - Instruction set reverse-engineering. qwertymodo - Pinout.