Maybe someday the 65816 target will get out there, a challenge in itself.
Thanks
LLVM is also very modular which makes it easy to maintain forks for a specific backend that don't touch core functionality.
But I'm sure it's more complicated than that. :-)
Thanks again
It would be interesting to also have a viable backend for the Z80 architecture, which also seems to have a highly interested community of potential maintainers.
... but now three years out of date, because it's hard to maintain :-)
Our goal is to get most modifications not in the third category into upstream at some point which makes the maintenance load bearable.
They usually only had a single general purpose register (plus some helpers). Registers were 8-bit but addresses (pointers) were 16-bit. Memory was highly non-uniform, with (fast) SRAM, DRAM and (slow) ROM all in one single address space. Instructions often involved RAM directly and there were a plethora of complicated addressing modes.
Partly this was because there was no big gap between processing speed and memory access, but this makes it very unlikely that similar architectures will ever come back.
As interesting as experiments like LLVM-MOS are, they would not be a good fit for upstream LLVM.
Don't think "memory access" (i.e. RAM), think "accessing generic (addressable) scratchpad storage" as a viable alternative to both low-level cache and a conventional register file. This is not too different from how GPU low-level architectures might be said to work these days.
Also, maintaining a fork is difficult, but doable. I work on LLVM a ton, so it's pretty easy for it to fold in to my work week-to-week. And quite surprisingly, I used AI to help last time, and it actually helped quite a lot!
I'd happily take a llvm-z80 and llvm-6502 over sdcc if both were available
Edit: oh wow, look at that https://github.com/grapereader/llvm-z80. Aw but not touched for 12 years.
For GBDK-2020 we've been using the 6502 support in SDCC to support the NES as a target console for about 2 years alongside the existing Game Boy and SMS/Game Gear targets.
The 6502 port has been usable, but doesn't seem fully mature. There has been a lot of code churn for it during the last 12 months compared to the z80/sm83 ports as it gets improved. Recently (their recommended pre-release build 15614) this seems to have resulted in some breaking regressions that we haven't fully tracked down.
Perhaps this port is getting less testing coverage than the z80/sm83 port. Unsure. The majority of the 6502 work seems to be done by a newer member of their team, with the longer term members seeming to be somewhat hands-off. That might be an additional factor.
Edit: BTW, the 6502 port in SDCC at build 15267 (~4.5.0+) has been reasonably stable and usable, and is what we based our last GBDK-2020 release on (6 months ago).
https://thred.github.io/c-bench-64/
I think the ideal compiler for 6502, and maybe any of the memory-poor 8-bit systems would be one that supported both native code generation where speed is needed as well as virtual machine code for compactness. Ideally would also support inline assembler.
The LLVM-MOS approach of reserving some of zero page as registers is a good start, but given how valuable zero page is, it would also be useful to be able to designate static/global variables as zero page or not.
From mysterymath's (LLVM-MOS) description, presenting some of zero page as 16 bit registers (to make up for lack thereof, and perhaps due to LLVM not having any other support for preferred/faster memory regions), while beneficial, still had limitations since LLVM just assumes that there will be FAST register-register transfer operations available, and that is not even true for the 6502's real registers (no TXY), let alone these ZP "registers" which would require using the accumulator to copy.
A code generation/optimization approach that would seem more guaranteed to do well on the 6502 might be to treat it more as tree search (with pruning) - generate multiple branching alternatives and then select the best based on whatever is being optimized for (clock cycles or code size).
Coding for the 6502 by hand was always a bit like that ... you had some ideas of alternate ways things could be coded, but there was also an iterative phase of optimization (cf search) to tweak it and save a byte here, a couple of cycles there ...
I've mentioned elsewhere I used to work for Acorn back in the day, developing for the BBC micro, with me and a buddy developing the ISO-Pascal system which was delivered in 2 16KB ROMs. Putting code in ROM gives you an absolute hard size budget, and putting a full Pascal compiler, interpreter, runtime library and programmers editor into a total 32KB was no joke! I remember at the end of the project we were still a few hundred bytes over what would fit in the ROMs, and had to fight for every byte to make it fit!
Optimizing code will never solve good data oriented design. This is just one of the reasons that Asm programmers routinely beat C code on the 6502. Another one is math. In the C language specification, if fixed point had been given equal footing with float that would also help.
These are such a blind spos that you rarely even see custom 6502 high level languages accommodate these fundamental truths.
BTW, growing up on the 6502, I had no problems moving into the very C friendly 68000 but later going backwards to the 6809, on the surface it looked so much like a 6502 that I was initially trying to force SOA in my data design before realizing it was better suited to AOS.
So, optimal C targetting the 6502 is not going to look much like C targetting a modern processor. The developer still needs to be very aware of the limitations of the processor they are targetting.
One somewhat radical thing that LLVM-MOS does is to analyze the program's call graph, and for functions that are not used recursively it will assign parameters and local variables to zero page instead, both for speed of access and to avoid need for a stack frame. Even though this violates the WYSIWYG mental model, this is a nice abstraction of what the assembly language programmer would have done themself.
very nice, it sounds similar to the 'compiled stack' concept. I've seen that here in the co2 language for the 6502
"Variables declared as subroutine parameters or by using let are statically allocated using a "compiled stack", calculated by analyzing the program's entire call graph. This means scopes will not use memory locations used by any inner scopes, but are free to use them from sibling scopes. This ensures efficient variables lookups, while also not wasting RAM. However, it does mean that recursion is not supported."
https://www.youtube.com/watch?v=ejbTKtgSZI0
However his github doesn't show any activity on it in the last 2 years.
A lot of the C library outside of math isn't going to be speed critical - things like IO and heap for example, and there could also be dual versions to choose from if needed. Especially for retrocomputing, IO devices themselves were so slow that software overhead is less important.
Memory is better allocated in more of a customized application specific way, such as an arena allocator, or just avoid dynamic allocation altogether if possible.
I was co-author of Acorn's ISO-Pascal system for the 6502-based BBC micro (16KB or 32KB RAM) back in the day, and one part I was proud of was a pretty full featured (for the time) code editor that was included, written in 4KB of heavily optimized assembler. The memory allocation I used was just to take ownership of all free RAM, and maintain the edit buffer before the cursor at one end of memory, and the buffer content after the cursor at the other end. This meant that as you typed and entered new text, it was just appended to the "before cursor" block, with no text movement or memory allocation needed.
The lack of RAM is a major factor; stack usage must be kept to a minimum and you can forget any kind of heap. RAM can be extended with a special mapper, but due to the lack of a R/W pin on the cartridge, reads and writes use different address ranges, and C does not handle this without a hacky macro solution.
Not to mention the timing constraints with 2600 display kernels and page-crossing limitations, bank switching, inefficient pointer chasing, etc. etc. My intuition is you'd need a SMT solver to write a language that compiles for this system without needing inline assembly.
See Batari Basic
Threaded code might be a worthwhile middle-of-the-way approach that spans freely across the "native" and "pure VM interpreter" extremes.
Simple architecture and really really joyful to use even for casual programmers born a decade, or two later :)
The attention the 6502 get is just because of history. The advantage the 6502 had was that it was cheap--on every other axis the 6502 sucked.
On the other side of the cpu wars, all those 8080 machines moving on the Z80s got to keep all their binary software, which happened again for IBM PCs and clones as those evolved.
The 6800 was expensive versus the 6502--almost 10x (6502 was $25 when the 6800 was $175 which was already reduced from $360)!
The 6800 (from 1974) could have been competitive with the 6502, but (https://en.wikipedia.org/wiki/Motorola_6809#6800_and_6502):
“The 6800 was initially sold at $360 in single-unit quantities, but had been lowered to $295. The 6502 was introduced at $25, and Motorola immediately reduced the 6800 to $125. It remained uncompetitive and sales prospects dimmed. The introduction of the Micralign to Motorola's lines allowed further reductions and by 1981 the price of the then-current 6800P was slightly less than the equivalent 6502, at least in single-unit quantities. By that point, however, the 6502 had sold tens of millions of units and the 6800 had been largely forgotten.”
LLVM-MOS includes all of that. It is the same process as using LLVM to cross-assemble targeting an embedded ARM board. Same GNU assembler syntax just with 6502 opcodes. This incidentally makes LLVM-MOS one of the best available 6502 assembly development environments, if you like Unix-style cross-assemblers.
https://github.com/DaveDuck321/gb-llvm
https://github.com/DaveDuck321/libgbxx
It seems to be first reasonably successful attempt (can actually be used) among a handful of previous abandoned llvm Game Boy attempts.
This is why these back-ends aren't accepted by the LLVM project: without a significant commitment to supporting them, they're a liability for LLVM.
Portable assembly has a nice ring to it but reality is a harsh mistress and she only speaks C++.
Even the hobby underdog qbe seems ill-suited to 6502 repurposing.