On choosing the Z80 over the 6502
Published: July 07, 2014
Tags: 6502 8-bit homebrew computing retrocomputing z80
As previously mentioned, in passing, I've been working on a homebrew 8-bit computer for the past few months, and doing a miserable job of documenting it. I'm hoping to get to work on correcting that in the near future, starting with this entry which will discuss a design decision I made without an awful lot of consideration to begin with, but have retro-actively justified a few times over.
When it comes to building your own 8-bit computer, there are essentially two choices for the processor. You can use the Z80 or the 6502. These are both classic chips which dominated the home computing market in the late 70s and early 80s. Perhaps surprisingly, both of them are still manufactured today (with higher clock speeds and lower power consumption than the original versions), and are available quite cheaply in hobbyist-friendly packages like 40-pin DIP or 44-pin PLCC. I'm using the Z80, and a very fair question to ask is "Why?".
In some ways, the 6502 would have been a more natural choice for me, to the extent that this kind of project is often motivated by nostalgia. I used a lot of 6502-powered devices growing up - mostly the Commodore 64 but to a lesser degree the BBC Micro and the Atari 2600. I think the first and only Z80 product I used was the Game Boy (and strictly speaking its CPU is a Z80-like Sharp part rather than a true Z80), but both these chips were used in all kinds of things in those days so it's very possible I had Z80s in my house controlling the VCR or the microwave or something. Building a computer using the same processor as the machine which gave me my love of computing in the first place would have felt amazing, and I'm slightly surprised I didn't charge into using the 6502 on this basis alone.
I chose the Z80 because when I first started researching the project I somehow got the impression that it was a much more popular choice for homebrew computing, and that by choosing it I'd find a lot more documentation on the web, a lot more people to ask questions of, etc., etc. Now that I'm almost finished with the project, I think I got this pretty much perfectly backward. There is a vibrant 6502 community online, with 6502.org forming an obvious hub, complete with active forums full of amazing projects. There are a lot of "superstars" active in the 6502 enthusiast world, including ex-Commodore engineer Bil Herd and the self-taught Jeri Ellsworth, who implemented an entire C64 in an FPGA. I've found the online Z80 scene to be largely dead in comparison, especially if you are not interested specifically and exclusively in TI's range of Z80-powered calculators. z80.info is actually, to be frank, fairly crummy. The most useful resources I've found are the old book "Build your own Z80 computer" by Steve Ciarcia (legitimately freely available as PDF online) for the hardware side and this website describing the Z80 instruction set for the software side. The N8VEM community is fairly active, but to be honest I find it fairly impenetrable: lots of different designs by different people, all with hard-to-remember acronyms and no clear picture of what's compatible with what. All of the information is stored in a terrible CMS which feels like I'm navigating somebody else's hard drive...
I've considered switching to the 6502 on more than one occasion throughout the project, motivated by a variety of things: the initial realisation that I'd get a better community, finishing reading Brian Bagnall's book "Commodore: A Company on the Edge" and being overcome with warm fuzzies for all things Commodore, and really wanting to use the super-nifty 6522 VIA chip but not understanding at the time what the hell to do with the "Phi 2" pin if interfacing it to a non-6502 processor (I know now that this basically acts as a chip enable pin, so you can just connect it to the Z80's IORQ). Each time I considered it, I tried to weigh up the choice in a sombre and technically minded way, and each time I stuck with the Z80. I really do think it has a lot to recommend it, especially for first-time computer homebrewers. I've discussed some of these points below.
Emphatically, this is not supposed to be a "6502 sux0rs, here's why the z80 beats the snot out of it!!1" kind of article (even though the web is full of the reverse argument). I have tremendous respect for the 6502, for the team who designed it, and for the great machines that it powered. I can see the beauty in its minimalist architecture. Building your own 6502 computer can be just as educational, rewarding and empowering as building your own Z80 system; if you're doing either of these things, you are doing a good thing. I do feel like the Z80 is a little bit of an underdog in the homebrewing world, and this article attempts to challenge this state of affairs a little bit with what I think are legitimate advantages the Z80 holds over the 6502. I will say that I am quite new to the Z80, and actually have no hands-on experience building 6502 systems, though I've done a lot of reading. There may be factual inaccuracies in what follows. If you find any, please email me and I'll make corrections as appropriate.
Native 16-bit operations
The Z80 has native support for 16-bit addition and subtraction - two 8-bit registers are combined to form a single 16-bit one. On the 6502, you need to do this manually: add the two low-order bytes, then add the two high-order bytes along with the carry left over from the first addition. The 6502 will still probably do it faster than the Z80 at any given clock speed, because it typically uses fewer cycles per instruction, but the machine code will be longer because it takes multiple instructions. I can imagine this getting old fast. You can of course write the code once as a subroutine and call it whenever you need to do 16-bit arithmetic, but the call overhead gives back some of the speed advantage.
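To make the difference concrete, here is a rough Python model of the two approaches. It is only a sketch: the function names are my own invention, and it models the arithmetic rather than any real instruction timing. The byte-wise version mirrors what a 6502 does with a pair of add-with-carry steps, and the native version mirrors a single 16-bit add such as the Z80's ADD HL,DE.

```python
def add8(a, b, carry_in=0):
    """8-bit add with carry in and carry out, like one add-with-carry step."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16_bytewise(x, y):
    """16-bit add built from two 8-bit adds (the 6502 approach)."""
    lo, carry = add8(x & 0xFF, y & 0xFF)        # low bytes, carry starts clear
    hi, carry = add8(x >> 8, y >> 8, carry)     # high bytes plus the carry
    return (hi << 8) | lo, carry


def add16_native(x, y):
    """16-bit add in a single step (the Z80 approach)."""
    total = x + y
    return total & 0xFFFF, total >> 16


# Both routes give the same answer; the 6502 just needs more steps to get there.
result, carry = add16_bytewise(0x12FF, 0x0001)
```

Here result comes out as 0x1300 with the carry clear, because the carry out of the low byte is folded into the high byte.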
Powerful data moving instructions
The Z80 instruction set has a handful of really nifty, powerful instructions for moving data around in memory, and between memory and IO (they're LDIR, LDDR, INIR, INDR, OTIR, and OTDR). You can do things like set up a pointer to a memory location, a byte counter and a port number and then use a single instruction to tell the CPU to move that number of consecutive bytes between memory and the port, and it'll just loop away. On the 6502, you'd need to do this manually - including the 16-bit address arithmetic. This is more boring to write, more error-prone and takes up a lot more code space. Admittedly the 6502 instruction set is smaller, simpler and neater for not having these kinds of instructions, and I can see the appeal in that, as well as the potential satisfaction that can come from crafting your whole program yourself from the simplest possible conceptual units. But for newcomers to assembly (and, I imagine, old hats who are over the thrill of bare metal and just want to get stuff done) it's really nice to be able to just write a single instruction and know that your data will get shuffled around as quickly as the people who designed the CPU could make it happen.
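As an illustration of what one of these instructions does on your behalf, here is a small Python model of LDIR. This is a sketch of the documented behaviour (copy from the address in HL to the address in DE, incrementing both and decrementing BC until BC reaches zero), not a cycle-accurate emulation, and the ldir function and the addresses used are just my own example.

```python
def ldir(memory, hl, de, bc):
    """Model of the Z80 LDIR block copy.

    Copies bytes from address hl to address de, one at a time, until the
    16-bit counter bc reaches zero. Returns the final (hl, de, bc).
    """
    while True:
        memory[de] = memory[hl]
        hl = (hl + 1) & 0xFFFF      # source pointer moves up
        de = (de + 1) & 0xFFFF      # destination pointer moves up
        bc = (bc - 1) & 0xFFFF      # byte counter counts down
        if bc == 0:
            break
    return hl, de, bc


# Copy a 5-byte string from 0x8000 to 0x9000.
memory = bytearray(64 * 1024)
memory[0x8000:0x8005] = b"HELLO"
final_registers = ldir(memory, hl=0x8000, de=0x9000, bc=5)
```

On the real chip the whole loop body is the single LDIR instruction, after loading HL, DE and BC. On the 6502 the loop, the pointer increments and the counter all have to be spelled out by hand.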
A full-fledged 16-bit stack pointer
NOTE: the original version of this blog post contained a mistake, where I claimed that the 6502's stack was restricted to the zero page (0x0000 - 0x00FF), and argued that this was especially problematic since the zero page is also often used as a bank of registers, due to faster access time. I claimed that having to share this page for these two purposes forced one to choose between having lots of fast data storage or a deep recursion capability but not both. Two readers, Ola and Chris, emailed me to let me know that the stack is actually restricted to page 1 (0x0100 - 0x01FF), so this trade-off doesn't really exist. The stack situation on the 6502 is thus not as bad as I first thought, but is still more limited than the Z80. Thanks for the corrections!
The 6502's stack pointer is an 8-bit value, meaning it can address only 256 bytes of RAM, and it is constrained to lie within page 1 of the 6502's address space (addresses 0x0100 through 0x01FF). This means that you can never make nested function calls more than 128 levels deep (as each call pushes a 2-byte return address onto the stack). Perhaps not much of a problem in practice for many people, but also a fairly ugly restriction, and definitely problematic for recursively defined functions.
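The 128-level figure can be checked with a toy model of the stack. This is a simplified sketch: the Stack6502 class is hypothetical, the starting stack pointer of 0xFF is a common convention rather than something the chip enforces, and nothing but return addresses is ever pushed.

```python
STACK_PAGE = 0x0100  # the 6502 stack always lives at 0x0100 - 0x01FF


class Stack6502:
    """Toy model of the 6502 stack: an 8-bit pointer into page 1."""

    def __init__(self):
        self.memory = bytearray(64 * 1024)
        self.sp = 0xFF  # assumed starting value (top of page 1)

    def push(self, value):
        self.memory[STACK_PAGE + self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF  # 8-bit pointer wraps around

    def call(self, return_address):
        """A subroutine call pushes a 2-byte return address."""
        self.push(return_address >> 8)
        self.push(return_address & 0xFF)


def max_call_depth():
    """Count nested calls until the stack pointer wraps back to its start."""
    stack = Stack6502()
    depth = 0
    while True:
        stack.call(0x1234)
        depth += 1
        if stack.sp == 0xFF:  # the next push would overwrite the first one
            return depth


depth_limit = max_call_depth()
```

After depth_limit calls the pointer is back where it started, so one more call silently overwrites the oldest return address. In practice the limit is lower still, since anything else you push (saved registers, interrupt state) comes out of the same 256 bytes.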
This restriction is also a bit of a pain if you want to try multitasking. You can't, say, have one process running in the bottom 32 kB of memory and one in the top 32 kB of memory, because the process in the top 32 kB can't have its own separate stack space. You'd need to either implement a bank switching scheme so you can physically point the stack page at different parts of your physical RAM address space (even if the number of processes you want to run multiplied by their size is less than 64 kB, so that you don't really need bank switching), or else physically copy the entire stack page of each process back and forth to some other part of memory whenever you switched processes, both of which strike me as a pain.
If you wanted to implement a stack-based programming language like Forth on the 6502, I have to imagine that you would need to simulate your own stack using general purpose memory rather than relying on the system stack, due to its small size. This must slow things down a bit.
The Z80 has a 16-bit stack pointer which can point anywhere in the address space at all, so you don't have any of these limitations and you can use your whole 64 kB address space to its maximum potential.
Separate address spaces for memory and peripherals
The Z80 makes a fundamental distinction between reading or writing memory (RAM or ROM, though obviously not writing ROM) and reading or writing peripheral chips like UARTs, real-time clocks, etc. These actions are done with separate instructions: the various forms of LD (load) for memory, and IN and OUT for peripherals. When reading or writing memory, the MREQ pin is brought low, along with the RD or WR pin as necessary, and when reading or writing peripherals, the IORQ pin is brought low instead. Values in memory are addressed using a 16-bit address bus, giving you 64 kB of memory, and peripherals are addressed using the lower 8 bits of the address bus, giving you 256 bytes of IO space (this is actually a bit of a simplification, but it's the official story on IO and I'll stick with it here). Importantly, these 64 kB and 256 B address spaces are totally separate, non-overlapping entities.
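One way to picture the two spaces is as two completely separate arrays. The Z80Bus class below is a toy model I've made up for illustration, with memory accesses standing in for MREQ cycles and port accesses standing in for IORQ cycles. It ignores all real bus timing.

```python
class Z80Bus:
    """Toy model of the Z80's two address spaces."""

    def __init__(self):
        self.memory = bytearray(64 * 1024)  # reached with 16-bit addresses
        self.ports = bytearray(256)         # reached with 8-bit port numbers

    def mem_write(self, address, value):
        """What an LD to memory does (an MREQ + WR cycle)."""
        self.memory[address & 0xFFFF] = value & 0xFF

    def mem_read(self, address):
        """What an LD from memory does (an MREQ + RD cycle)."""
        return self.memory[address & 0xFFFF]

    def io_write(self, port, value):
        """What an OUT does (an IORQ + WR cycle)."""
        self.ports[port & 0xFF] = value & 0xFF

    def io_read(self, port):
        """What an IN does (an IORQ + RD cycle)."""
        return self.ports[port & 0xFF]


# Memory address 0x0010 and IO port 0x10 are unrelated locations.
bus = Z80Bus()
bus.mem_write(0x0010, 0xAA)
bus.io_write(0x10, 0x55)
```

Writing to port 0x10 leaves memory address 0x0010 untouched, and vice versa, which is the whole point: the peripherals never take a bite out of the 64 kB.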
The 6502, in contrast, uses "memory mapped IO". This means that the CPU itself makes no distinction between memory and peripheral chips. You use the same load (LDA, LDX, LDY) and store (STA, STX, STY) instructions to read or write to each. There's once again a 16-bit address bus, leading to 64 kB of memory space. It's up to the computer designer to come up with an address decoding circuit which routes the appropriate signals to memory or peripheral chips accordingly. Having no distinction between memory and peripherals does make the 6502's instruction set smaller and more consistent - some would say elegant - but this comes at a pretty high cost.
All of your peripherals on a 6502 system have to be put somewhere into the same 64 kB address space as your actual memory, and this causes quite a lot of flow-on nastiness. First up, it obviously means that the more peripherals you add, the less actual memory you can use for your programs. It's true that your peripherals probably won't actually need a whole lot of address space: the fact that useful Z80 computers exist proves you can get away with under 256 bytes of such space without problems. However, if you want to dedicate just a 256 byte space to your peripherals, you need an address decoding scheme capable of decoding access to exactly that page, i.e. you have to compare all of the 8 higher order bits of the address bus to some fixed value. And then, of course, you have to do additional decoding to split that 256 byte range up for your individual peripherals. This leads to an increased chip count compared to the Z80, where you just have to split up the 256 byte space already separated out for you by the CPU's non-memory-mapped architecture. You could do the page comparison with a single extra chip, the 74688 8-bit comparator, but every extra chip increases costs and, more importantly (since 7400 chips are usually pretty cheap), eats up more of your limited board space.
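In logic terms, the two decoding stages described above look something like the following Python sketch. The page number 0x7F and the four equal slots are arbitrary choices of mine for illustration, not taken from any particular machine.

```python
IO_PAGE = 0x7F  # hypothetical: IO lives at 0x7F00 - 0x7FFF


def io_page_selected(address):
    """First stage: the 8-bit comparator on the high address byte."""
    return ((address >> 8) & 0xFF) == IO_PAGE


def select_peripheral(address, slots=4):
    """Second stage: split the 256-byte IO page into equal slots.

    Returns the slot number of the selected peripheral, or None if the
    address falls outside the IO page (i.e. it is a memory access).
    """
    if not io_page_selected(address):
        return None
    slot_size = 256 // slots
    return (address & 0xFF) // slot_size


first_slot = select_peripheral(0x7F00)
last_slot = select_peripheral(0x7FFF)
not_io = select_peripheral(0x8000)
```

On a Z80 the first stage is effectively done for you by the IORQ pin, so only the second stage needs any chips.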
With this comparator-based approach, your IO page is fixed at one particular point in memory. In simple systems this is no big deal. You probably have a bunch of ROM at the top of your address space (since the 6502 looks at the top of the space when starting up) and RAM underneath that. Your programs are constrained to live in RAM anyway, so you can just stick the IO page in between these two sections and you don't really lose much space. But what if you have a more complicated design, where you can switch the ROM in and out of the address space, so you can get a full 64 kB of RAM when you want it? Well, unless you switch your IO page out as well (in which case your program obviously can't do any IO, limiting its usefulness), then you're stuck with 63.75 kB of RAM in total, with 256 bytes of IO stuck somewhere in the middle of it. This limits your available contiguous space and forces you to split your program up into two chunks. If you want the option of switching to a mode with a lot of contiguous RAM and IO, you need to make your IO page relocatable by putting an 8-bit latch in front of your 8-bit comparator - yet another chip. This will let you move the IO page to the top or bottom of the address space, giving you a contiguous 63.75 kB.
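The effect of where the IO page sits can be checked with a few lines of Python. This sketch assumes the all-RAM mode described above, with exactly one 256-byte page given over to IO and everything else RAM. The function name is my own.

```python
PAGE_SIZE = 256      # bytes per page
TOTAL_PAGES = 256    # 256 pages x 256 bytes = 64 kB


def largest_contiguous_ram(io_page):
    """Largest unbroken block of RAM, in bytes, given the IO page number."""
    below = io_page * PAGE_SIZE                      # RAM under the IO page
    above = (TOTAL_PAGES - 1 - io_page) * PAGE_SIZE  # RAM over the IO page
    return max(below, above)


middle = largest_contiguous_ram(0x80)   # IO page stuck in the middle
top = largest_contiguous_ram(0xFF)      # IO page relocated to the very top
```

With the IO page in the middle the best you can do is 32768 bytes (32 kB) in one piece. Moved to the top, it leaves 65280 bytes, which is the 63.75 kB figure.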
All of this is quite horrible, and all the homebrewed 6502 machines I've seen online do nothing of the sort. They all have very simple address decoding schemes using a small number of chips, and as such they all have the same shortcomings: inflexible memory layouts with a lot of wasted space. These computers often have several kilobytes of memory which are either "holes" in the address space which don't do anything, or are repetitions of small chunks of memory space connected to peripherals, which are "mirrored" several times because higher order address lines aren't being decoded. A lot of simple Z80 systems have holes and mirroring too, of course, mine included, but the important difference is that this "wasted" space is not wasted RAM, which could have been used for code or data, but rather wasted IO space which there's no use for anyway once all your peripheral chips have the space they need.
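Mirroring falls straight out of the arithmetic of partial decoding. The sketch below uses a made-up example: a peripheral with four registers, selected by looking only at the top three address lines. Every address line that is neither compared nor wired to the chip's register select pins doubles the number of times the chip appears.

```python
def decoded_register(address, select_mask, select_value, register_bits):
    """Return the register index the chip sees, or None if not selected.

    Only the address lines set in select_mask are compared, and only the
    lowest register_bits lines reach the chip. The rest are ignored.
    """
    if (address & select_mask) != select_value:
        return None
    return address & ((1 << register_bits) - 1)


def mirror_count(select_mask, register_bits, address_bits=16):
    """How many times the chip's registers repeat in the address space."""
    compared = bin(select_mask).count("1")
    ignored = address_bits - compared - register_bits
    return 1 << ignored


# Hypothetical chip: selected when A15..A13 = 011, four registers on A1..A0.
MASK, VALUE, REG_BITS = 0xE000, 0x6000, 2
copies = mirror_count(MASK, REG_BITS)
same_register = (decoded_register(0x6001, MASK, VALUE, REG_BITS),
                 decoded_register(0x7FFD, MASK, VALUE, REG_BITS))
```

In this example the four registers show up 2048 times over, so a chip that needs 4 bytes ends up occupying 8 kB. On a memory-mapped machine that 8 kB comes out of the space you could have used for RAM.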
This has been a fairly long section, but in brief: the Z80's use of separate address spaces for memory and peripherals means that you can use a small number of chips and a fairly simple design and end up with minimal wasted memory space, and with just a little bit more work you can get a system where you can easily give yourself a full 64 kB of RAM when you need it. The 6502's memory mapped IO means that simple, low chip-count solutions are probably pretty wasteful and inflexible.
Easy Linux development
I've been using the GNU project's Z80 assembler, z80asm, and disassembler, z80dasm, to generate machine code for my project thus far. z80asm is in the apt repositories used by Debian and Ubuntu, so after an apt-get install z80asm I'm good to go. Surprisingly, there are no packages with "6502" in the name or description in those repositories. 6502.org has a list of assemblers, and 3 of them claim to work on Linux. None of them are in the Debian or Ubuntu repositories - which is not to say they won't work, but there's more effort involved in getting up and running. All 3 projects have websites which have not been updated in a very long time. UPDATE: S. Bryant kindly emailed to let me know that in fact there is a 6502 assembler in the Debian repositories, you just need to know what it's called! It's xa65. I was pleasantly surprised to find that it is maintained by Cameron Kaiser, who I already knew of from his great work in preserving Gopher. xa65 seems to be under active development, which is good to see.
I'm really looking forward to eventually using C to write programs for my Z80 machine, mostly because getting it to work will require me to completely and totally understand exactly what is involved in producing a C runtime. The Small Device C Compiler supports the Z80 (along with some other 8-bit architectures), and is also available via apt. The most prominent C compiler for the 6502 seems to be cc65, which was abandoned by its author about a year ago. An old version is being maintained on GitHub, but that's it.
If you're a Linux user and you want to cross-develop software for an 8-bit CPU, it seems to me like you can get up and running in a flash for the Z80, but not the 6502.
Easy to find
I only know of two places online where you can buy the 6502 - the mega-distributor Mouser and the smaller supplier Jameco, both based in the US. Mouser will ship to you if you're outside of the US, but you need to either pay a fairly hefty shipping fee, or spend a huge amount on parts to become eligible for free shipping. Either option will massively inflate the base cost of building an 8-bit computer. It's also worth noting that Mouser's page for the 6502 claims "This product may require additional documentation to export from the United States". I don't know exactly what is involved in this, and under what circumstances it applies, but it doesn't sound like fun. Jameco's shipping seems a little more sensible - they'll ship a 6502 to New Zealand for about US$15. That's still about 3 times what the chip itself costs, but it's not prohibitive. However, Jameco are a much smaller supplier than Mouser, and they recently stopped selling the modern 16-bit version of the 6502, so it's possible the 8-bit version will eventually disappear too.
On the other hand, the Z80 seems to be a lot easier to find anywhere in the world. In the US, it's carried by both Mouser and Digikey. The Farnell/Element 14 group seem to stock it in most other major countries - I can buy it locally here in New Zealand, which is nice.
The 6502 found its way into many more home computers and game machines in the 70s and 80s than the Z80, and it's no secret that one of the major reasons is that the 6502 was cheap - especially for Commodore, who bought MOS Technology early on and could therefore include the 6502 in their machines at cost, while everyone else had to pay a markup. Even with the markup, the 6502 was the cheapest 8-bit microprocessor on the market by a wide margin, so cheap that when it was originally announced, many thought it was a joke. Steve Wozniak has emphasised the 6502's price playing a prominent role in his decision to use it for the early Apple machines.
Price is perhaps not as important a factor in this decision today, as both the 6502 and Z80 are reasonably cheap, but in a cost-sensitive application, the tables have actually turned. The modern version of the 6502, the W65C02, costs US$6.95 at Mouser, whereas a modern Z80, the Z84C, costs US$4.14 at Mouser or US$4.31 at Digikey. These are unit prices. If you bought 100 Z80s from Mouser they'd cost you US$3.06 each, compared to US$5.85 each for the W65C02 - almost half the price. Most hobbyists aren't going to buy 100 of either chip, but if you wanted to use one of these processors in a hobbyist kit or something, the bulk discount matters. Combine the lower cost with the easier international sourcing and the Z80 is the clear winner for anything you want to make it easy for anybody in the world to get their hands on.
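For anyone who wants to check the sums, here they are as a few lines of Python. The prices are the Mouser figures quoted above and will certainly have drifted since. The helper names are just for illustration.

```python
# Mouser prices in US dollars, as quoted in this post.
UNIT_PRICE = {"W65C02": 6.95, "Z84C": 4.14}   # buying a single chip
BULK_PRICE = {"W65C02": 5.85, "Z84C": 3.06}   # each, when buying 100


def z80_price_ratio(prices):
    """Z80 price as a fraction of the 6502 price."""
    return prices["Z84C"] / prices["W65C02"]


def saving_on_batch(quantity=100):
    """Dollars saved by choosing the Z80 for a batch at the bulk price."""
    difference = BULK_PRICE["W65C02"] - BULK_PRICE["Z84C"]
    return round(quantity * difference, 2)


unit_ratio = z80_price_ratio(UNIT_PRICE)
bulk_ratio = z80_price_ratio(BULK_PRICE)
batch_saving = saving_on_batch()
```

The Z80 works out at roughly 60% of the 6502's price for a single chip and roughly 52% at a quantity of 100, which is where "almost half" comes from. On a batch of 100 the difference is US$279.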
That's about it. The Z80 is easier to find outside the US than the 6502, it's cheaper, and you can start hacking for it on Linux in no time flat. Both the Z80 and 6502 give you 64 kB of memory space, but the Z80 makes it easier to use all of this space however you like, and to exercise the full capabilities of your system. Every byte in the space is the same as any other, and the stack works identically everywhere, so there are no restrictions on how you choose to use the space. The space is never interrupted by memory-mapped peripherals, so there's no shortage of contiguous blocks of RAM. And the instruction set makes it very easy and very natural to move these contiguous blocks around with clear, concise code. All of this makes it very easy to try things like multitasking. There are a few other things I've not gone into detail on, like the Z80's shadow registers or native vectored interrupt handling, but I think I've covered the main points above.
In the interests of fairness, I should state that there is one place the 6502 is always going to kick the Z80's ass, and that's speed. Z80 instructions typically take more clock cycles than their 6502 counterparts. This actually wasn't much of a problem back during the heyday of these chips: the Z80's design allowed it to be clocked faster than the 6502 so, for example, the Sinclair ZX machines' Z80 CPUs ran at 3.25 MHz while the C64's 6502 ran at 1 MHz. The faster Z80 clock offset the lower cycle efficiency, leaving the machines on roughly equal footing - in fact, this article argues (with much vitriol) that the ZX Spectrum is faster than the C64. However, much like in the case of cost, the tables have turned since the 1980s. The modern W65C02 can be clocked at 14 MHz. 20 MHz versions of the Z84C exist, but are hard to find in stock, whereas the 10 MHz versions are abundant, so that's probably a more realistic speed to consider the maximum. So today's 6502s are both more cycle efficient and clocked faster. If speed is really important to you, you should probably go with the 6502.
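To put rough numbers on this, here is a tiny Python calculation using one comparable instruction on each chip: loading an immediate byte into the accumulator takes 2 cycles on the 6502 (LDA #n) and 7 clock periods on the Z80 (LD A,n). A single instruction is nowhere near a proper benchmark, so treat this as a sketch of the trend and nothing more. The clock speeds are the ones quoted above.

```python
LDA_IMMEDIATE_CYCLES = 2   # 6502: LDA #n
LD_A_N_CYCLES = 7          # Z80: LD A,n


def instruction_time_us(cycles, clock_mhz):
    """Time for one instruction in microseconds."""
    return cycles / clock_mhz


# 1980s: C64 at 1 MHz versus Sinclair ZX at 3.25 MHz.
c64_time = instruction_time_us(LDA_IMMEDIATE_CYCLES, 1.0)
zx_time = instruction_time_us(LD_A_N_CYCLES, 3.25)

# Today: W65C02 at 14 MHz versus an easy-to-find 10 MHz Z84C.
w65c02_time = instruction_time_us(LDA_IMMEDIATE_CYCLES, 14.0)
z84c_time = instruction_time_us(LD_A_N_CYCLES, 10.0)

modern_gap = z84c_time / w65c02_time
```

For this one instruction the old machines come out within about 8% of each other (2.0 versus roughly 2.15 microseconds), while the modern parts are nearly a factor of 5 apart in the 6502's favour.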