When you have finished this tutorial, you will have written a small game. Things you have to know before you start:

  • 65816 assembly
  • Interrupts (what NMI and VBlank are)
  • What VRAM and CGRAM are, and their basic structures
  • What DMA is and how to use it
  • Video modes (mode 0), using backgrounds
  • How to get information on player input
  • Optional: Using 16x16 tiles

If you have problems with any of these, I recommend reading the other tutorials here; it worked for me :) Okay. What are we going to do? A Tic-Tac-Toe game :D I chose this because it's very simple and I can demonstrate several things while making it. Here's the plan:

  • We'll load a palette and some tiles
  • We'll set up two backgrounds. One is scrolled.
  • After this, we finish setting up (just bits and pieces)
  • In the main loop, we convert data from the RAM to SNES registers
  • While in VBlank, we get joypad input, and do what we need to do

First things first: set up a working environment (for this tutorial, use mine - tic-tac-toe-init.7z). Create the main file, and fill it with this:

.include "header.inc"
.include "initsnes.asm"

.bank 0 slot 0
.org 0
.section "Vblank"
;--------------------------------------
VBlank:
rti
;--------------------------------------
.ends


.bank 0 slot 0
.org 0
.section "Main"
;--------------------------------------
Start:
  InitSNES

forever:
wai
jmp forever
;--------------------------------------
.ends

This is my standard "Empty Code". If you'd like to test it, add these lines just after InitSNES:

sep #$30        ; get 8-bit registers
stz $2121       ; write to CGRAM from $0
lda #%11101111  ; this is
ldx #%00111111  ; a green color
sta $2122       ; write it
stx $2122       ; to CGRAM
lda #%00001111  ; turn on screen
sta $2100       ; here
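
As an aside, the two bytes written to $2122 form one 15-bit color word, low byte first, laid out as %0bbbbbgg gggrrrrr. The test above writes $3FEF, which is blue 15, green 31, red 15: a light green. Here is a minimal sketch of the same test with pure green ($03E0) instead, assuming it is placed just after InitSNES like the lines above:

```asm
sep #$30        ; get 8-bit registers
stz $2121       ; write to CGRAM from color 0
lda #$E0        ; low byte:  gggrrrrr = %11100000
sta $2122
lda #$03        ; high byte: 0bbbbbgg = %00000011
sta $2122
lda #%00001111  ; turn on screen, full brightness
sta $2100
```

Either version should fill the screen with green; only the shade differs.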

If you get a green screen, everything works! (Delete these lines now.) Now, let's get working on the actual program.

Step 1: Load the tiles, and the palette.

For now, use my file. We'll transfer the tiles using DMA, and the palette using the old-school method (byte by byte). Put this code after everything else; it places the tiles and the palette in the ROM.

.bank 1 slot 0       ; We'll use bank 1
.org 0
.section "Tiledata"
.include "tiles.inc" ; If you are using your own tiles, replace this
.ends

Now put this code after the "InitSNES" line - this loads the palette:

rep #%00010000  ;16 bit xy
sep #%00100000  ;8 bit ab

;See this? We take every byte from the palette, and put it to CGRAM
ldx #$0000
- lda UntitledPalette.l,x
sta $2122
inx
cpx #8
bne -

;I'll explain this later
;We'll have two palettes, only one color is needed for the second:
lda #33		;The color we need has index 33 (counting from 0)
sta $2121
lda.l Palette2
sta $2122
lda.l Palette2+1
sta $2122

Note: -, --, ---, +, ++ and +++ are special labels. The bne - branches backwards to the nearest "-". A "+" is the target of a forward jump. Refer to the WLA readme for details. These are useful, I use them often :)
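
To make the two directions concrete, here is a small sketch that is not part of the game: it looks for the first zero byte among nine bytes of RAM. The buffer at $0000 and the count of 9 are assumptions for illustration, and it expects 8-bit A and 16-bit X, as set up above:

```asm
ldx #$0000
- lda $0000,x   ; "-" marks the top of the loop
beq +           ; zero found: jump forward to the nearest "+"
inx
cpx #9          ; checked all nine bytes?
bne -           ; no: jump back to the nearest "-"
+               ; X = index of the first zero byte, or 9 if there is none
```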

Continue coding, here goes a typical DMA transfer (If you don't understand this, read some tutorials about it - it's really just telling the SNES what to do)

ldx #UntitledData   ; 16-bit address of UntitledData
lda #:UntitledData  ; bank of UntitledData
ldy #(15*16*2)      ; length of data
stx $4302           ; write
sta $4304           ; address
sty $4305           ; and length
lda #%00000001      ; mode 1: bytes alternate between $2118 and $2119 (words)
sta $4300
lda #$18            ; $211[89]: VRAM data write
sta $4301           ; set destination

ldy #$0000          ; Write to VRAM from $0000
sty $2116

lda #%00000001      ; start DMA, channel 0
sta $420B

Okay, we're done with this. So, here it comes:

Step 2: Create the tilemaps for BG1&BG2

BG2 will be the easier one, it will contain only one tile. So, let's start with BG1 :D BG1 will contain the "#" shape, and later, the O's and X's. We'll make the # shape first:

What to do:
X|X|X    Legend:
-+-+-    X: first empty tiles, then OX
X|X|X    |-+: Lines
-+-+-
X|X|X

We'll use the background (tile 0) for X; |, - and + are tiles 2, 4 and 6 respectively. (A 16x16 tile is built from four 8x8 tiles: N, N+1, N+16 and N+17, so consecutive 16x16 tiles are two numbers apart. Tile 3 would be the right half of | and the left half of -.) Now here's an ugly piece of code, but at least it's short :)

lda #%10000000	; VRAM writing mode
sta $2115
ldx #$4000	    ; write to vram
stx $2116       ; from $4000

;ugly code starts here - it writes the # shape I mentioned before.
.rept 2
   ;X|X|X
   .rept 2
     ldx #$0000 ; tile 0 ( )
     stx $2118
     ldx #$0002 ; tile 2 (|)
     stx $2118
   .endr
   ldx #$0000
   stx $2118
   ;first line finished, add BG's
   .rept 27
     stx $2118  ; X=0
   .endr
   ;beginning of 2nd line
   ;-+-+-
   .rept 2
     ldx #$0004 ; tile 4 (-)
     stx $2118
     ldx #$0006 ; tile 6 (+)
     stx $2118
   .endr
   ldx #$0004   ; tile 4 (-)
   stx $2118
   ldx #$0000
   .rept 27
     stx $2118
   .endr
.endr
.rept 2
  ldx #$0000    ; tile 0 ( )
  stx $2118
  ldx #$0002    ; tile 2 (|)
  stx $2118
.endr

After I wrote this, I realized that I could have used a table and copied the data from there, but I leave this to you as homework :) Now set up BG2:

ldx #$6000  ; BG2 will start here
stx $2116
ldx #$000C  ; And will contain 1 tile
stx $2118

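Note: in mode 0, BG2's palettes start at color 32, so this tile uses colors 32 and 33. Color 32 is transparent, which is why we only loaded color 33 earlier.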

Now, this was short, wasn't it? Good news, we're done with Step 2.
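
For the homework mentioned above, one possible table-based version of the # shape is sketched here. The names WriteRow, CellRow and LineRow are made up for this example. It assumes that InitSNES has already cleared VRAM (so the unused entries stay tile 0), that $2115 still holds %10000000, and that A is 8-bit with X 16-bit:

```asm
; goes after the .include lines, like any macro
.macro WriteRow         ; \1 = VRAM address, \2 = table label
ldx #\1
stx $2116               ; start of this tilemap row
ldx #$0000
- lda.l \2,x            ; tile number from the table
sta $2118               ; low byte of the tilemap entry
stz $2119               ; high byte = 0, then the VRAM address steps by 1
inx
cpx #5                  ; five entries per row
bne -
.endm

; goes after everything, next to the tile data
.bank 1 slot 0
.org 0
.section "Rowtables"
CellRow:
.db $00,$02,$00,$02,$00 ; X|X|X
LineRow:
.db $04,$06,$04,$06,$04 ; -+-+-
.ends

; replaces the .rept blocks (WLA needs a space before a macro name)
 WriteRow $4000, CellRow
 WriteRow $4020, LineRow
 WriteRow $4040, CellRow
 WriteRow $4060, LineRow
 WriteRow $4080, CellRow
```

Each tilemap row is 32 entries long, which is why the row addresses are $20 apart.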

Step 3: Set up video mode, and interrupts, then loop forever. This is a "read the documentation and write values" part, so I'll just give you the code:

;set up the screen
lda #%00110000  ; 16x16 tiles, mode 0
sta $2105       ; screen mode register
lda #%01000000  ; data starts from $4000
sta $2107       ; for BG1
lda #%01100000  ; and $6000
sta $2108       ; for BG2

stz $210B	    ; BG1 and BG2 use the $0000 tiles

lda #%00000011  ; enable bg1 and 2
sta $212C

;The PPU doesn't process the top line, so we scroll down 1 line.
rep #$20        ; 16bit a
lda #$07FF      ; this is -1 for BG1
sep #$20        ; 8bit a
sta $210E       ; BG1 vert scroll
xba
sta $210E

rep #$20        ; 16bit a
lda #$FFFF      ; this is -1 for BG2
sep #$20        ; 8bit a
sta $2110       ; BG2 vert scroll
xba
sta $2110

lda #%00001111  ; enable screen, set brightness to 15
sta $2100

lda #%10000001  ; enable NMI and joypads
sta $4200

Phew, we're done! If you haven't done so, run wla, and test your program. You should see a yellow # shape and four cyan triangles that look like the corners of a square.

No, you won't get to Step 4 this easily :) we have some planning to do... So we have a VBlank which occurs every frame. This is a short time, but for us, it's plenty. This is why I put everything (except two things) there :) Luckily, we have RAM into which we can store information. So, what do we have to store?

  • The O's and X's which are placed by the players

  • The position of the cursor (the cyan stuff)

  • The previous input of the joypad, to prevent multiple reactions

And what do we have to do? Here's some pseudocode (don't copy this):

VBlank:
  Get controller input
  If it's the same as last time, rti
  Else, get the input (we have a delete key, an X placer key, an O placer key, and up/down/left/right)
  If delete key, delete all data, then rti
  If X or O, then put the corresponding tile according to the cursor's location
  If u/d/l/r, move the cursor (and do not let it run out of the 3x3), then rti
  rti

The SNES is very nice: it mirrors $7E0000-$7E1FFF into bank $00 (see Memory Mapping), which means we can write $0100 instead of $7E0100, and we can also use the X and Y registers for data access - long addresses can only be accessed with A. Let's decide where to put the data:

The O's and X's: they form a 3x3 grid

draft   Our info in the RAM   Info for the SNES in VRAM
X|X|X    $0000|$0001|$0002    $4000|$4002|$4004 (27 empty tiles here)
-+-+-    -----+-----+-----    -----+-----+----- (here, too)
X|X|X => $0003|$0004|$0005 => $4040|$4042|$4044 (and so on)
-+-+-    -----+-----+-----    -----+-----+-----
X|X|X    $0006|$0007|$0008    $4080|$4082|$4084

Cursor position:

We will store this in a straightforward format, like this:

(0,0)|(1,0)|(2,0)   Legend:
-----+-----+-----   (X,Y)
(0,1)|(1,1)|(2,1)
-----+-----+-----
(0,2)|(1,2)|(2,2)

This is just 2 bytes. We have plenty of space, so skip a little, & put them here:
X: $0100
Y: $0101

Controller input:

This is easy, just one byte: we place it at $0200, and we will use $0201 (and later $0202) as temp values
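
If bare addresses get hard to follow, WLA's .define can give them names. Here is a small sketch of the layout described above. The names are made up for this example, and the rest of the tutorial keeps using the plain addresses:

```asm
; goes near the top of the main file, after the .include lines
.define Board    $0000  ; 9 bytes, one per cell, row by row
.define CursorX  $0100  ; 0..2
.define CursorY  $0101  ; 0..2
.define PadPrev  $0200  ; joypad state from the last frame
.define PadTemp  $0201  ; temp copy of the current joypad state
.define Scratch  $0202  ; temp used later for the multiply by 3
```

With these in place, lda CursorX assembles to the same thing as lda $0100.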

Remember I said we'll put everything to the VBlank except two things? Here they come... :) See how we stored the cursor position? You understand it, but the SNES won't. So we have to supply scroll data for it. We'll do the conversion in the main loop.

Now, we can finally do some coding :) First, we'll do the easier part: the conversion. Here's what we have to convert:

     Our data             SNES scroll data
(0,0)|(1,0)|(2,0)   ( 0, -1)|(-32, -1)|(-64, -1)  Legend:
-----+-----+-----   ---------+---------+---------  (X,Y)
(0,1)|(1,1)|(2,1)   ( 0,-33)|(-32,-33)|(-64,-33)
-----+-----+-----   ---------+---------+---------
(0,2)|(1,2)|(2,2)   ( 0,-65)|(-32,-65)|(-64,-65)

Remember: the PPU doesn't process line 0. Here's the formula: SNEScoord = -(32*Ourcoord) (for the Y coordinate, subtract 1 more). To calculate 32*Ourcoord, we have to shift the data left 5 times (X*2*2*2*2*2 = X*32).
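
As a worked example, here is the arithmetic for our coordinate 2, written out instruction by instruction. This is only an illustration with A kept 16-bit throughout; it is not part of the program:

```asm
rep #%00100000  ; 16 bit A
lda #$0002      ; our coordinate
.rept 5
asl a           ; A = $0040 after five shifts (2*32 = 64)
.endr
eor #$FFFF      ; A = $FFBF = -65, the Y scroll value
inc a           ; A = $FFC0 = -64, the X scroll value
sep #%00100000  ; 8 bit A
```

Inverting all bits gives -A-1, which is exactly the Y value; adding 1 afterwards gives -A, the X value.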

So, NOW we can code. Put this code after the first two .include lines:

.macro ConvertX
; Data in: our coord in A
; Data out: SNES scroll data in C (the 16 bit A)
.rept 5
asl a           ; multiply A by 32
.endr
rep #%00100000  ; 16 bit A
eor #$FFFF      ; invert all bits: A = -A-1
inc a           ; A = A+1, so now A = -(32*coord)
sep #%00100000  ; 8 bit A
.endm

.macro ConvertY
; Data in: our coord in A
; Data out: SNES scroll data in C (the 16 bit A)
.rept 5
asl a           ; multiply A by 32
.endr
rep #%00100000  ; 16 bit A
eor #$FFFF      ; invert all bits: A = -A-1 (the extra -1 skips line 0)
sep #%00100000  ; 8 bit A
.endm

Put this code after WAI but before JMP forever - we'll get fresh data this way.

rep #%00100000  ; get 16 bit A
lda #$0000      ; empty it
sep #%00100000  ; 8 bit A
lda $0100       ; get our X coord
 ConvertX       ; WLA needs a space before a macro name
sta $210F       ; BG2 horz scroll
xba
sta $210F       ; write 16 bits

;now repeat it, but change $0100 to $0101, and $210F to $2110
rep #%00100000  ; get 16 bit A
lda #$0000      ; empty it
sep #%00100000  ; 8 bit A
lda $0101       ; get our Y coord
 ConvertY       ; WLA needs a space before a macro name
sta $2110       ; BG2 vert scroll
xba
sta $2110       ; write 16 bits

Remember that the X's and O's have to be copied to VRAM? This will be the second thing we do. First, make a conversion table that maps each RAM cell to its VRAM address (we have to add $4000 to these values, but this is much simpler than if we stored words):

;put this after everything
.bank 2 slot 0
.org 0
.section "Conversiontable"
VRAMtable:
.db $00,$02,$04,$40,$42,$44,$80,$82,$84
.ends

;write this after the conversion routine, just before jmp forever
ldx #$0000; reset our counter
-
rep #%00100000     ; 16 bit A
lda #$0000         ; empty it
sep #%00100000     ; 8 bit a
lda VRAMtable.l,x  ; this is a long indexed address, nice :)
rep #%00100000
clc
adc #$4000         ; add $4000 to the value
sta $2116          ; write to VRAM from here
lda #$0000         ; reset A while it's still 16 bit
sep #%00100000     ; 8 bit A
lda $0000,x        ; get the corresponding tile from RAM
; VRAM data write mode is still %10000000
sta $2118          ; write
stz $2119          ; this is the hi-byte
inx
cpx #9             ; finished?
bne -              ; no, go back

Good! Now if we write to the RAM, the SNES will obey, and do what we want!

Note: I could have written these into the VBlank routine, but this way it's more organized.

For techies: when the VBlank routine finishes, the PC points to the byte after wai, and the program continues from there. Always there. This is why the conversions could just as well be written in VBlank: here the two ALWAYS run one after the other. Normally, you don't put a wai in the main loop if the CPU can still do some work; the VBlank routine is usually just for setting the video mode and doing things which can't be done mid-frame.

For simple mortals: why could I have written the conversion in the VBlank routine? Let's compare these programs:

What I did              What I could have done
VBlank:                 VBlank:
Get controller input,   Get controller input
if finished, rti        if finished, jmp label
                        label:
                        do some conversions
rti                     rti

forever:                forever:
wai                     wai
           VBLANK OCCURS HERE
do some conversions
jmp forever             jmp forever

When my program runs it's this order:  wai-controllerinput-rti-conversion-loop.
When the other program runs it's this: wai-controllerinput-conversion-rti-loop.
If you ignore where the rti sits, the order is the same.

Now for the VBlank. Once we do this, we're finished! Replace the empty VBlank routine from the beginning with this one. I tried to comment it as much as I can.

VBlank:
lda $4212       ; get joypad status
and #%00000001  ; if joy is not ready
bne VBlank      ; wait
lda $4219       ; read joypad (BYSTudlr)
sta $0201       ; store it
cmp $0200       ; compare it with the previous
bne +           ; if not equal, go
rti	            ; if it's equal, then return

+ sta $0200     ; store
and #%00010000  ; get the start button
                ; this will be the delete key
beq +           ; if it's 0, we don't have to delete
ldx #$0000
- stz $0000,x   ; delete addresses $0000 to $0008
inx
cpx #$09        ; this is 9. Guess why (homework :) )
bne -
stz $0100       ; delete the scroll
stz $0101       ; data also

+ lda $0201     ; get back the temp value
and #%11000000  ; Care only about B and Y
beq +           ; if empty, skip this
; so, B or Y is pressed. Let's say B is O,
; and Y is X.
cmp #%11000000  ; both are pressed?
beq +		    ; then don't do anything
cmp #%10000000  ; B?
bne ++          ; no, try Y
; B is pressed, write an O ($08)
; we have to tell the cursor position,
; and calculate an address from that
; Formula: Address=3*Y+X
lda $0101       ; get Y
sta $0202       ; put it to a temp value
clc
adc $0202       ; multiply by 3 - an easy way
adc $0202       ; A*3=A+A+A :)
adc $0100       ; add X
; Now A contains our address
rep #%00100000  ; 16 bit A
and #$00FF      ; tax copies all 16 bits, so clear the high byte
tax
sep #%00100000  ; 8 bit A
lda #$08
sta $0000,x     ; put $08 to the good address
jmp +           ; done with this

++              ; now for Y
cmp #%01000000  ; Y?
bne +           ; no, jump forward (this should not happen)
; Y is pressed, write an X ($0A)
lda $0101       ; get Y
sta $0202       ; put it to a temp value
clc
adc $0202       ; multiply by 3 - an easy way
adc $0202       ; A*3=A+A+A :)
adc $0100       ; add X
; Now A contains our address
rep #%00100000  ; 16 bit A
and #$00FF      ; tax copies all 16 bits, so clear the high byte
tax
sep #%00100000  ; 8 bit A
lda #$0A
sta $0000,x     ; put $0A to the good address
+               ; finished putting tiles

; cursor moving comes now
lda $0201       ; get control
and #%00001111  ; care about directions
sta $0201       ; store this

cmp #%00001000  ; up?
bne +           ; if not, skip
lda $0101       ; get scroll Y
cmp #$00        ; if on the top,
beq +           ; don't do anything
dec $0101       ; sub 1 from Y
+

lda $0201       ; get control
cmp #%00000100  ; down?
bne +           ; if not, skip
lda $0101
cmp #$02        ; if on the bottom,
beq +           ; don't do anything
inc $0101       ; add 1 to Y
+

lda $0201       ; get control
cmp #%00000010  ; left?
bne +           ; if not, skip
lda $0100
cmp #$00        ; if on the left,
beq +           ; don't do anything
dec $0100       ; sub 1 from X
+

lda $0201       ; get control
cmp #%00000001  ; right?
bne +           ; if not, skip
lda $0100
cmp #$02        ; if on the right,
beq +           ; don't do anything
inc $0100       ; add 1 to X
+
rti             ; F|NisH3D!
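
The O and X branches above compute the same address twice. One way to share that code is a small subroutine. CellIndex is a made-up name, and the sketch assumes 8-bit A and 16-bit X on entry and that it sits in the same section as VBlank (after the final rti), so jsr can reach it:

```asm
CellIndex:
lda $0101       ; get Y
asl a           ; 2*Y
clc
adc $0101       ; 3*Y
adc $0100       ; add X, result is 0..8
rep #%00100000  ; 16 bit A
and #$00FF      ; tax copies all 16 bits, so clear the high byte
tax
sep #%00100000  ; 8 bit A
rts

; in VBlank, the O branch could then shrink to:
jsr CellIndex
lda #$08
sta $0000,x     ; put $08 to the good address
```

The X branch would be the same three lines with $0A instead of $08.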

DONE!

Now run WLA and enjoy! You can find a working demo in the source below.

Complete Source Code: tic-tac-toe-tutorial.7z

Update 5/30/22 by jeffythedragonslayer: Grab wla.bat to build it here: [https://forums.nesdev.org/viewtopic.php?t=23921]

Disclaimer 1: I have tested and made sure that all the example code will assemble and run properly on several emulators. However, I cannot guarantee the same results that I have had. You take full responsibility when you use my code. Nintendo and SNES are registered trademarks of Nintendo.

Disclaimer 2: This tutorial is written by Aceman2000. You can redistribute this file, but you have to leave this section unchanged. The original version can be found at Vintagedev. All other rights reserved.