Romless NES program specification

Preliminary version: details are not finalized and may change after the discussion period.

A romless NES program is one that runs from RAM rather than ROM. Before it is run, its data is loaded into the various RAM areas in the NES and cartridge. The program doesn't care how the data is loaded, so it can be stored in more than one format. The preferred format is a standard iNES file, which lets the program run on anything that accepts iNES files. Most importantly, programs can be uploaded from a PC to a NES via a serial link cable, allowing quick development and testing on a low-cost cartridge.

See the main page for an overview and tutorial.

Program elements

A romless program consists of several elements which are loaded into the NES and cartridge.

Element    Loaded to    Data
Mapper     mapper       UNROM, MMC1, MMC3, etc.
Mirroring  mapper       horizontal, vertical
RAM        $200         $600 bytes
WRAM       $6000        $2000 bytes
CHR        PPU $0000    $2000 bytes
Screen 1   PPU $2000    $400 bytes
Screen 2   PPU $2C00    $400 bytes
Palette    PPU $3F00    $20 bytes

A second set of NMI, reset, and IRQ vectors is kept in RAM at $7FA-$7FF, and loaded as a part of loading RAM. The cartridge can then forward NMI and IRQ interrupts to these using JMP ($7FA) and JMP ($7FE) instructions, respectively.
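
As an illustration of where these vectors sit inside the RAM element, here is a minimal Python sketch for a PC-side tool. The function name read_ram_vectors is hypothetical and not part of this specification; the sketch only assumes that the RAM element is the $600 bytes loaded at $200 and that each vector is a 16-bit little-endian address, as the 6502 expects.

```python
import struct

RAM_BASE = 0x200   # CPU address where the RAM element is loaded
RAM_SIZE = 0x600   # size of the RAM element ($200-$7FF)


def read_ram_vectors(ram: bytes) -> dict:
    """Return the NMI, reset, and IRQ vectors stored at $7FA-$7FF.

    Hypothetical helper for illustration; `ram` is the $600-byte RAM element.
    """
    if len(ram) != RAM_SIZE:
        raise ValueError("RAM element must be $600 bytes")
    offset = 0x7FA - RAM_BASE  # $5FA within the RAM element
    # Three little-endian 16-bit words: NMI ($7FA), reset ($7FC), IRQ ($7FE)
    nmi, reset, irq = struct.unpack_from("<HHH", ram, offset)
    return {"nmi": nmi, "reset": reset, "irq": irq}
```

The order matches the normal $FFFA-$FFFF layout: NMI first, then reset, then IRQ.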

A program doesn't know or care how the above elements are loaded. This allows it to be stored in various formats, such as an iNES file or compilation. It also allows it to be loaded in different ways, such as from an iNES file, compilation, or serial link to PC. The same program could be stored and loaded in all of these ways, without any changes.

File format

A romless file is a standard iNES file with the program elements stored at predefined locations, so that they can be processed by other tools. Along with the elements is a program that loads them into the NES, allowing the file to be run normally like any other iNES file.

Offset   Size    Element
+$0      $10     Standard iNES header
+$10     $200    not used; clear to 0 if possible
+$210    $600    RAM
+$810    $1800   reserved; clear to 0
+$2010   $2000   WRAM
+$4010   $2000   CHR
+$6010   $400    Screen 1
+$6410   $400    Screen 2
+$6810   $20     Palette
+$6830   $13E0   reserved; clear to 0
+$7C10   $1      Mapper: 0=NROM 1=MMC1 4=MMC3 etc.
+$7C11   $1      Mirroring: 0=horiz 1=vert 2/3=either is fine
+$7C12   $E      reserved; clear to 0
+$7C20   $3F0    "ROMLESS1" signature, loader code
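
Because the elements sit at fixed offsets, a tool can pull them out with plain slicing. The following Python sketch shows one way this might look; the names ELEMENTS and extract_elements are hypothetical, and the offsets and sizes are copied from the table above.

```python
# (name, file offset, size) for each program element, copied from the table.
ELEMENTS = [
    ("ram",       0x0210, 0x0600),
    ("wram",      0x2010, 0x2000),
    ("chr",       0x4010, 0x2000),
    ("screen1",   0x6010, 0x0400),
    ("screen2",   0x6410, 0x0400),
    ("palette",   0x6810, 0x0020),
    ("mapper",    0x7C10, 0x0001),
    ("mirroring", 0x7C11, 0x0001),
]
FILE_SIZE = 0x8010  # $10 header + 32K PRG


def extract_elements(data: bytes) -> dict:
    """Slice a romless file into its program elements (illustrative only)."""
    if len(data) != FILE_SIZE:
        raise ValueError("romless file must be $8010 bytes long")
    return {name: data[off:off + size] for name, off, size in ELEMENTS}
```

A PC-based loader or converter could then work with the returned elements without knowing anything about the loader code stored in the same file.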

The iNES header should specify 32K PRG and 0K CHR, and the file should be $8010 bytes long. If no mapper is specified (+$7C10 = 0), the iNES header should specify MMC1. This way the program can use WRAM, and the PC-based loader will know that it doesn't require MMC1 specifically, just any mapper that supports WRAM. Otherwise, the two mapper values should match.
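
The header rules above could be checked along the following lines. This is a sketch rather than a reference implementation: check_header is a hypothetical name, and the iNES header fields used are the usual ones (byte 4 is the PRG size in 16K units, byte 5 is the CHR size in 8K units, and the mapper number is split across the high nibbles of bytes 6 and 7).

```python
def check_header(data: bytes) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if len(data) != 0x8010:
        # Nothing else can be checked reliably on a file of the wrong size.
        return ["file is not $8010 bytes long"]
    if data[0:4] != b"NES\x1a":
        problems.append("missing iNES magic")
    if data[4] != 2:
        problems.append("PRG size is not 32K")
    if data[5] != 0:
        problems.append("CHR size is not 0K")
    # Mapper number: low nibble from byte 6, high nibble from byte 7.
    ines_mapper = (data[6] >> 4) | (data[7] & 0xF0)
    romless_mapper = data[0x7C10]
    # With no mapper specified (+$7C10 = 0) the header should say MMC1.
    expected = 1 if romless_mapper == 0 else romless_mapper
    if ines_mapper != expected:
        problems.append("iNES mapper does not match +$7C10")
    return problems
```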

The padding at +$10 is for any other layered file formats, for example to allow the upcoming iNES music file format to put its header there without disturbing this format.

The file format has been arranged this way to allow less flexible assemblers like nesasm to work with it.

Optional elements

Some program elements are optional. If an element is specified as being not used, the program can run on hardware that doesn't support that element. For example, if a program specifies that it doesn't use WRAM, it can run on cartridges that don't have it. The following elements are optional, with use/non-use specified as described.

Element      To specify use          To specify non-use
Mirroring    Clear bit 1 of +$7C11   Set bit 1 of +$7C11
CHR RAM      Put data in CHR         Clear CHR to 0
WRAM         Put data in WRAM        Clear WRAM to 0
Interrupts   Set NMI and/or IRQ      Set NMI and IRQ to $FFFF

When mirroring is specified as used, the PC-based loader verifies that the cartridge in the NES has the proper mirroring when running the program. If using a mapper with settable mirroring, like MMC1 or MMC3, this doesn't matter, since the mirroring can be set properly while running.
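
A tool could apply the rules in the table above roughly as follows. This Python sketch is illustrative only: optional_usage is a hypothetical name, it takes the whole $8010-byte file, and it finds the NMI and IRQ vectors in the RAM element at the file offset corresponding to $7FA-$7FF.

```python
import struct


def optional_usage(data: bytes) -> dict:
    """Report which optional elements a romless file declares as used."""
    wram = data[0x2010:0x4010]
    chr_data = data[0x4010:0x6010]
    # RAM ($200) is stored at +$210, so $7FA lands at +$80A in the file.
    vec_off = 0x210 + (0x7FA - 0x200)
    nmi, _reset, irq = struct.unpack_from("<HHH", data, vec_off)
    return {
        "mirroring": (data[0x7C11] & 0x02) == 0,   # bit 1 clear = used
        "chr_ram": any(chr_data),                  # all zero = not used
        "wram": any(wram),                         # all zero = not used
        "interrupts": not (nmi == 0xFFFF and irq == 0xFFFF),
    }
```

One consequence of this scheme is that a program that really wants all-zero CHR or WRAM at startup is indistinguishable from one that doesn't use the element at all.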

Loading procedure

Addresses          Action
-                  Verify that hardware required by program is present (optional)
$8000-$FFFF        Configure mapper
-                  Wait for VBL
$2000-$2005        Clear to 0, skipping $2002 and $2004, and writing to $2005 twice
PPU $3F00-$3F1F    Load with palette. At $3F10, write the first palette entry again rather than the entry from the sprite sub-palette, so that the background color is set by the first entry.
$4000-$4017        Clear to 0, writing in reverse order so that $4015 is cleared before $4000-$4013
PPU $2C00-$2FFF    Load with screen 2. Written to $2C00 so it goes to the second nametable regardless of mirroring.
PPU 0-$1FFF        Load with CHR
PPU $2000-$23FF    Load with screen 1
$6000-$7FFF        Load with WRAM
$200-$7FF          Load with RAM
$2006              Write with 0 twice
S, A, X, Y         Set S to $FD, and clear A, X, and Y to 0
0-$1DF             Clear to 0
-                  Wait for VBL
$4017              Clear to 0 to begin mode 0
P                  Clear all flags except I ($04)
-                  Jump to reset vector in RAM with JMP ($7FC)

To support interrupts, have the NMI and IRQ vectors point to forwarding code that does JMP ($7FA) and JMP ($7FE), respectively.
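
A few of the ordering rules in the loading procedure are easy to get wrong, so the Python sketch below models them as lists of (address, value) writes. It is only a model for reasoning about the order, not loader code, and all function names are hypothetical. The APU range is taken literally from the table ($4017 down to $4000). The palette rule exists because $3F10 is a mirror of $3F00 on the PPU, so whatever is written there last becomes the background color.

```python
def ppu_register_clears() -> list:
    """CPU writes that clear $2000-$2005, skipping $2002 and $2004."""
    writes = [(addr, 0) for addr in (0x2000, 0x2001, 0x2003)]
    writes += [(0x2005, 0), (0x2005, 0)]  # $2005 is written twice
    return writes


def apu_register_clears() -> list:
    """CPU writes from $4017 down to $4000, so $4015 precedes $4000-$4013."""
    return [(addr, 0) for addr in range(0x4017, 0x3FFF, -1)]


def palette_writes(palette: bytes) -> list:
    """PPU writes for $3F00-$3F1F; $3F10 receives entry 0 again."""
    if len(palette) != 0x20:
        raise ValueError("palette must be $20 bytes")
    writes = []
    for i in range(0x20):
        # Substitute the first entry at $3F10 so the background color
        # is not overwritten by the sprite sub-palette's first byte.
        value = palette[0] if i == 0x10 else palette[i]
        writes.append((0x3F00 + i, value))
    return writes
```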

Execution environment

When code in RAM begins executing, the following environment has been established by the loader.

Hardware            Addresses          Loaded with
CPU registers       A, X, Y, S, P      A,X,Y=0 S=$FD P=$04
Zero-page & stack   0-$1DF             all 0
Top of stack        $1E0-$1FF          garbage
Code and data       $200-$7FF          program's RAM
PPU registers       $2000-$2006        all 0
APU registers       $4000-$4017        all 0
Cartridge WRAM      $6000-$7FFF        program's WRAM
Mapper registers    $8000-$FFFF        set for proper mirroring, CHR RAM and WRAM enabled
Cartridge CHR RAM   PPU $0000-$1FFF    program's CHR
Nametable 1         PPU $2000-$23FF    program's screen 1
Nametable 2         PPU $2C00-$2FFF    program's screen 2
Palette             PPU $3F00-$3F1F    program's palette
Sprites             PPU OAM            garbage

Just before the loader begins executing the program, it loops until $2002 reads back with bit 7 set, then writes 0 to $4017. To begin executing the program, it does JMP ($7FC) to start from its reset vector.

Interrupt vectors are at $7FA-$7FF and act like $FFFA-$FFFF, except that vectoring takes an extra 5 cycles.

The loader code in a romless file specifically avoids writing to $1E0-$1FF. This allows an emulator to place data there that the romless program can read. This feature will be used by the iNES music format.

Design rationale

File layout: Some assemblers have limitations on file layout. For example, if some data will be loaded into RAM at $200, it must be placed in the file at +$210, +$2210, +$4210, etc. asm6 requires that things be defined in the order they go into the file. Thus, the file order should match the most convenient order in the source file. Given that it's convenient to have screens just after CHR, as it is in memory, the only thing that can reasonably be moved is the palette. The loader must be near the end of memory, since some mappers don't guarantee that anything else is mapped in at power-up.
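
The +$210, +$2210, +$4210 pattern can be worked out mechanically. The Python sketch below assumes the constraint comes from an 8K bank granularity (an assumption, though it matches the $2000 spacing of the offsets quoted above); candidate_offsets is a hypothetical name.

```python
HEADER_SIZE = 0x10    # iNES header
BANK_SIZE = 0x2000    # assumed 8K bank granularity
PRG_SIZE = 0x8000     # 32K PRG


def candidate_offsets(load_addr: int) -> list:
    """File offsets where data destined for load_addr could be placed."""
    within_bank = load_addr % BANK_SIZE
    return [HEADER_SIZE + bank + within_bank
            for bank in range(0, PRG_SIZE, BANK_SIZE)]
```

For RAM at $200 this gives +$210, +$2210, +$4210, and +$6210, and the file format uses the first. For WRAM at $6000 it gives +$10, +$2010, +$4010, and +$6010, and the file format uses +$2010.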

Interrupt vectors: To be of use, the interrupt vectors must eventually jump to RAM. The simplest approach would be to have them point to RAM, perhaps NMI to $200 and IRQ to $203, allowing the placement of JMP instructions there. But this doesn't behave like interrupts normally do, requiring a different programming approach than usual. So the NMI and IRQ vectors point to JMP ($7FA) and JMP ($7FE) instructions in ROM, which read the vectors from RAM. The vectors could have been placed anywhere in RAM. Putting them in zero-page or the stack page might limit other uses of those pages. Putting them at $200 and $204 would make some sense, and be fairly easy to work with, but it's still different than the normal vectors. $7FA-$7FF feels almost like the normal locations in most regards. The reset vector can then be used to point to the entry for the code.

Leave OAM with garbage: OAM isn't easy to initialize to known content because it loses its data if rendering isn't enabled. Since the program starts with rendering disabled, this would require loading OAM just before starting. Even then, a program isn't going to have much use for an already-loaded OAM. Also, the PC-based loader would have difficulty loading OAM. It might be possible at some point, and wouldn't affect programs that assume it contains garbage.

Load palette: Loading the palette is questionable, but it's nice to at least black it out while loading, especially for the PC loader. If we're blacking the palette, we might as well load it with something useful.

Cleared RAM: The PC-based loader can't easily load all of RAM, so some of it must be left cleared or with garbage. Clearing is useful for small programs. The zero-page and stack page are unlikely to need any code pre-loaded, so they're the best to have cleared.

$1E0-$1FF garbage: The PC-based loader can't even clear all of RAM, since the final code that begins the program must be in RAM somewhere. The top of the stack is the best place, since that's the least likely to even be useful cleared. The iNES-based loader does NOT clear this, even though it could. This is to allow an emulator to pass parameters to the romless program; for example, the upcoming iNES music format could pass the track number to play.


Contact Shay Green