Note: The method described here doesn't work on NES hardware. See a version that does.
A relatively high-frequency interrupt is necessary to periodically set the DMC's DAC directly by writing to $4011. The DMC.txt spec mentions a PCM IRQ that fires at the end of playing a PCM sample. A sample length of 1 byte (8 one-bit samples) is available. Finally, the DMC period per output byte (8 bits) can be as low as 432 clock cycles.
If a DMC sample were initiated and the CPU IRQ were enabled, an interrupt could be triggered as soon as 432 clock cycles later. In the IRQ handler, a new DMC sample of length one could be restarted, thus triggering yet another IRQ 432 clock cycles later. With a clock rate of 1.79 MHz, this IRQ would trigger approximately 4143 times a second. Even if a complete wave could be output only every other IRQ, this would allow frequencies to 2071 Hz, plenty for musical notes.
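The arithmetic above can be checked with a short sketch. The constant `NTSC_CPU_HZ` and the function name `irq_rate_hz` are illustrative choices, not part of the original text; the clock value is the usual approximation of the NTSC CPU rate that the text rounds to 1.79 MHz.

```python
# Approximate NTSC CPU clock in Hz (the text's "1.79 MHz"); assumed value.
NTSC_CPU_HZ = 1_789_773

def irq_rate_hz(cycles_per_irq):
    """IRQs per second if one IRQ fires every `cycles_per_irq` CPU cycles."""
    return NTSC_CPU_HZ / cycles_per_irq

rate = irq_rate_hz(432)      # fastest DMC period per byte
print(round(rate))           # IRQs per second
print(int(rate / 2))         # highest frequency with one wave per two IRQs
```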
The DMC's constant sample deltas can be used to help generate a sawtooth wave. The IRQ handler sets the DAC to the current volume (0 to 16), then starts the DMC with a 1-byte sample of value $00 (in memory). This results in a smooth 8-step ramp where each step decreases the DAC value by 2.
[Diagram: sawtooth output. At each IRQ the handler sets the DAC and starts the DMC with sample = 0; the DAC then steps down evenly until the next IRQ.]
Volume is controlled by the value the DAC is reset to on each IRQ:
[Diagram: the same ramp with the DAC reset to $08. The ramp starts lower, reaches the bottom after fewer steps, and stays flat until the next IRQ.]
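The ramp and the volume control can be modeled with a minimal sketch. It assumes the commonly documented DMC behavior: sample bits are consumed least significant bit first, a 1 bit raises the DAC by 2, a 0 bit lowers it by 2, and the DAC stays within 0 to 127. The function name `dmc_levels` is hypothetical.

```python
def dmc_levels(dac, sample_byte):
    """Return the DAC level after each of the 8 bits of one sample byte.

    Assumed model: bits are taken LSB first; a 1 bit adds 2, a 0 bit
    subtracts 2, and a step that would leave 0..127 is skipped.
    """
    levels = []
    for bit in range(8):
        if (sample_byte >> bit) & 1:
            if dac <= 125:
                dac += 2
        elif dac >= 2:
            dac -= 2
        levels.append(dac)
    return levels

print(dmc_levels(16, 0x00))  # full volume: [14, 12, 10, 8, 6, 4, 2, 0]
print(dmc_levels(8, 0x00))   # DAC reset to $08: [6, 4, 2, 0, 0, 0, 0, 0]
```

In this model a lower reset value shortens the ramp and leaves the output flat at zero for the rest of the period, which is how the reset value acts as a volume control.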
The DMC only has 16 period settings with somewhat arbitrary values, so available frequencies are severely limited. This can be remedied by putting a small delay loop into the IRQ handler. The delay is only used to lengthen one DMC period to almost the length of the next larger period, thus it won't become very large. It results in slight differences in the output waveform, but they aren't that noticeable.
[Diagram: sawtooth ramp followed by a flat segment labeled "delay" before the DAC is set again at the next IRQ.]
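One way to choose a period setting and delay for a target wave length is sketched below. It uses an idealized model in which the wave period is simply the DMC byte period plus the delay, ignoring IRQ handler overhead. The table holds the NTSC DMC rates in CPU cycles per bit (8 bits per byte), and the name `pick_period_and_delay` is hypothetical.

```python
# NTSC DMC rate table, CPU cycles per output bit, indexed by rate setting 0..15.
NTSC_DMC_BIT_PERIODS = [428, 380, 340, 320, 286, 254, 226, 214,
                        190, 160, 142, 128, 106, 84, 72, 54]

def pick_period_and_delay(target_cycles):
    """Pick the longest DMC byte period that fits and pad the rest with delay.

    Returns (rate_index, byte_period, delay_cycles). Idealized: assumes the
    wave period equals byte_period + delay_cycles.
    """
    for rate_index, bit_period in enumerate(NTSC_DMC_BIT_PERIODS):
        byte_period = bit_period * 8
        if byte_period <= target_cycles:
            # Table is sorted longest first, so the first fit needs the least delay.
            return rate_index, byte_period, target_cycles - byte_period
    raise ValueError("target is shorter than the fastest DMC period (432 cycles)")

# A wave of about 1000 Hz lasts roughly 1790 CPU cycles.
print(pick_period_and_delay(1790))  # (7, 1712, 78)
```

Because the delay only covers the gap to the next longer period setting, it stays below the difference between adjacent table entries for any target inside the table's range.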
This is still rather limiting, as the largest DMC period is 3424, which yields an IRQ base rate of 523 Hz. The available volume range is also quite small. Both can be remedied by resetting the DAC every N IRQs, rather than every IRQ.
[Diagram: one wave period spanning two IRQs. The DAC is reset to $1F and the DMC is started with sample = 0; after a delay, the DMC is started again with sample = 0 and the ramp continues downward; after a second delay, the DAC is reset to $1F.]
In this example, the available volume range is doubled, and the available frequency range is lowered by one octave.
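The trade-off can be put into numbers with a rough sketch. The function name `wave_for_n_irqs` is hypothetical, delays are ignored, and the ramp depth assumes each 1-byte $00 sample lowers the DAC by 8 steps of 2.

```python
NTSC_CPU_HZ = 1_789_773  # approximate NTSC CPU clock in Hz (assumed value)

def wave_for_n_irqs(byte_period, irqs_per_wave):
    """Return (frequency_hz, ramp_depth) when the DAC is reset every N IRQs.

    ramp_depth is how far the DAC can fall within one wave period, in DAC units.
    """
    frequency_hz = NTSC_CPU_HZ / (byte_period * irqs_per_wave)
    ramp_depth = 16 * irqs_per_wave
    return frequency_hz, ramp_depth

for n in (1, 2):
    freq, depth = wave_for_n_irqs(3424, n)
    print(n, round(freq), depth)
```

With the longest period of 3424 cycles, going from one to two IRQs per wave halves the frequency (one octave down) and doubles the ramp depth from 16 to 32.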
An idea I haven't explored is to use different DMC periods for different segments of the waveform. This might allow fewer cycles devoted to delays, freeing up some CPU time.
A square wave can be generated without the help of the DMC constantly adjusting the DAC, but the DMC must be playing a sample in order to generate the periodic IRQ. DMC interference can be minimized by using samples of alternating set and reset bits, i.e. either $AA or $55. If the waveform being generated is of significant amplitude, this constant toggling will be unnoticeable.
[Diagram: square wave with a high segment and a low segment. In each segment the DMC plays sample = $55, producing a small ripple around the current level, followed by a flat delay; the level changes at each IRQ.]
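A small sketch shows why $55 and $AA leave the level essentially unchanged. It assumes the commonly documented DMC behavior (bits consumed least significant bit first, plus or minus 2 per bit, DAC kept within 0 to 127); the name `dmc_levels` is hypothetical.

```python
def dmc_levels(dac, sample_byte):
    """DAC level after each of the 8 bits of one sample byte (assumed model)."""
    levels = []
    for bit in range(8):                  # LSB first
        if (sample_byte >> bit) & 1:
            if dac <= 125:
                dac += 2                  # 1 bit: step up
        elif dac >= 2:
            dac -= 2                      # 0 bit: step down
        levels.append(dac)
    return levels

print(dmc_levels(64, 0x55))  # [66, 64, 66, 64, 66, 64, 66, 64]
print(dmc_levels(64, 0xAA))  # [62, 64, 62, 64, 62, 64, 62, 64]
```

In both cases the DAC ends the byte where it started, so the only artifact is a ripple of 2 DAC units around the level written at the IRQ.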
With square waves, the intermediate amplitudes can be any value. If more IRQ steps are used for each wave period, rich waves can be produced:
[Diagram: stepped waveform that uses several different DAC levels within each wave period.]