NeoISA
do we really need a new ISA?
i have created this page to start a discussion about whether it makes sense to commit to yet another cpu instruction set architecture, or whether there are far more agile alternatives. have fun.
why
for as long as i can remember i have been fascinated by the different cpu architectures and the connections
between the hardware architecture and the instruction set architecture, the ISA.
but why this page?
my thinking was triggered a few years ago when RISC-V started to get a lot of attention.
two points in particular got me going. my first thought was "boaaa, yet another new RISC
(yar - yet another risc)", but the second one fascinated me: from my point of view it was
a milestone to put this ISA under the BSD license - how cool is that! after i read the
design guide by andrew waterman, it inspired me to move this topic out of the back of
my mind and into focus. this website is meant for exactly that: writing down my insights from
the last few years to get them out of my head.
history
in my early days of computer science, which were shaped by the zx81 and the vc20, after some first attempts i came into contact with commodore basic listings full of strange hex codes from some magazines. colloquially this was called "assembler" and was supposed to be very fast. that was the first machine code i had to deal with, that of the 6502. that is where it started, and it still fascinates me today. from today's point of view i can only take my hat off to how much performance was squeezed out of only about 3.5k transistors. i had a lot of fun with the 6502 and learned a lot about the interplay of silicon and isa described above.
my amazing next step was an atari st with its 68k cpu. mainframe power on a home computer, insane. the 68k became my feel-good cpu: a cool ISA that was fun to develop on. in contrast, i found the 80x86 very bulky, unintuitive and slow for its time. my opinion back then was that the x86 would never catch on - not my last misconception ;) already at that time, in endless discussions with my buddy marek while we were developing a special operating system, a kind of microkernel system, one thought matured: how could we master the many different ISAs? but more about that later.
the i860 from intel was my entry into the world of pipelining and parallel processing, because it could execute up to three operations at the same time. on the i860 it became clear for the first time that, in contrast to the 68k, it would not be so easy to develop more complex systems in machine language.
one of the last surprisingly interesting processors of that time were the transputers from inmos. to this day i find the ISA elegant and efficient, and working with a stack instead of registers is still extremely charming to me for many reasons. unfortunately, the t9000 did not really manage to get the company out of its predicament, although in my opinion it was far ahead of its time.
in my opinion there were only a few really interesting developments, like the itanium, the cell and the transmeta series. i was also taken with the transmetas: emulating an x86 and running it on a very efficient vliw that is not visible to the programmer. it reminded me of the long conversations with marek. all the others are just more riscs in my view, where everybody cooked their own soup. of course there are a lot of other processors like dsps, microcontrollers, and so on, but i have always concentrated on cpus that execute general purpose code.
and what is this all about
as i wrote above, the idea of an isa never really left me. actually something like the
transputer, but brought up to date, just like risc-v, just different ;) so i set off about three
years ago; starting to research can't be that difficult, i thought. naivety and curiosity are the
beginning of knowledge.
first of all i had to realize that i am not the only one in the world
who is thinking about this topic. there are thousands of master's and doctoral theses that have dealt with
this topic in recent years. it was and still is a land of milk and honey for me! and barely
170 theses later, i am already starting to write *laughs* i had started by thinking about where to begin.
the classical chicken-and-egg problem: emulator or compiler? and if emulator, then something
that generates code for an fpga? while researching i came across PyMTL. wow, everything i
needed. after the first few attempts i soon realized that it is important to generate machine
code first, which led me directly to LLVM. so i worked through the CPU0 tutorial to build a backend,
developed my own syntax, and voilà, there i had my own risc
isa - yar. yeah, cool, but not what i wanted.
so i took a step back again to see what
this is all based on. i didn't want a classic register based risc ISA again. but to my surprise
there are and were very few stack based approaches. again and again the stack was shown not to be
scalable. there are approaches like sTTAck or BOOST
that both have their interesting sides, but are more academic in nature. for a long time i
dealt with a registerless memory to memory architecture, the perl
processor. this paper really brought me a big step forward, because among
other things it refers to many other interesting papers. one interesting finding
is that a large share of registers is used only once or not at all, that a register is typically used
within very few instructions, and that somewhere between 200 and 300 registers a kind of saturation occurs.
another interesting aspect of the registerless perl architecture is the fact that there are no
more loads and stores, which reduces the number of instructions by 25%-30%. another interesting
approach is the transport-triggered architecture (TTA), which basically maps the pipeline onto a huge switch matrix, using the
local storage of the execution units as registers. this can reduce the pressure on the registers
drastically. however, it is again a register architecture, unlike the sTTAck architecture mentioned above.
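to make the transport-triggered idea a bit more concrete, here is a minimal python sketch. it is only a toy model under my own assumptions, not the design of any real TTA: the names `FunctionUnit`, `run_moves` and the port names `operand`, `trigger` and `result` are made up for illustration. the only "instruction" is a move from a source to a destination, and an operation fires as a side effect of moving a value into a trigger port.

```python
import operator


class FunctionUnit:
    """hypothetical function unit with an operand port, a trigger port and a result port."""

    def __init__(self, op):
        self.op = op
        self.operand = 0
        self.result = 0

    def write(self, port, value):
        if port == "operand":
            self.operand = value                        # just latch the value
        elif port == "trigger":
            self.result = self.op(self.operand, value)  # the move itself fires the operation
        else:
            raise KeyError(port)

    def read(self, port):
        if port != "result":
            raise KeyError(port)
        return self.result


def read_source(src, units, memory):
    if isinstance(src, int):        # an integer literal is an immediate
        return src
    unit, port = src
    if unit == "mem":
        return memory[port]
    return units[unit].read(port)


def write_destination(dst, value, units, memory):
    unit, port = dst
    if unit == "mem":
        memory[port] = value
    else:
        units[unit].write(port, value)


def run_moves(program, units, memory):
    """execute a list of (source, destination) moves in order."""
    for src, dst in program:
        write_destination(dst, read_source(src, units, memory), units, memory)
    return memory


def demo():
    units = {"add": FunctionUnit(operator.add), "mul": FunctionUnit(operator.mul)}
    memory = {"a": 3, "b": 4, "c": 0}
    program = [
        (("mem", "a"), ("add", "operand")),
        (("mem", "b"), ("add", "trigger")),       # fires a + b
        (("add", "result"), ("mul", "operand")),  # unit to unit, no register file involved
        (2, ("mul", "trigger")),                  # fires (a + b) * 2
        (("mul", "result"), ("mem", "c")),
    ]
    return run_moves(program, units, memory)
```

calling `demo()` computes `c = (a + b) * 2` purely with moves. the intermediate sum travels directly from the adder's result port to the multiplier's operand port, which is the effect that takes pressure off a general register file.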
there is another variant of this exposed datapath architecture, namely scad. during
my research i came across two more current developments: on the one hand ForwardCom with its vector ISA and many
interesting approaches, which i will talk about later, and on the other hand the most promising approach for me,
the Mill CPU with its belt (a fifo) instead of
registers.
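as a rough illustration of the belt idea, here is a small python sketch. it is a toy model under my own assumptions and not the actual mill design: the class `Belt`, the helpers `add` and `mul` and the belt length of 4 are made up for this example. every result is dropped at the front of a fixed-length fifo, operands are named by their current position on the belt instead of by a register number, and the oldest value simply falls off the end.

```python
from collections import deque


class Belt:
    """hypothetical fixed-length belt: position 0 is always the newest value."""

    def __init__(self, length=8):
        # a deque with maxlen drops items from the opposite end when it is full
        self.slots = deque(maxlen=length)

    def drop(self, value):
        self.slots.appendleft(value)
        return value

    def __getitem__(self, position):
        return self.slots[position]


def add(belt, i, j):
    """read two belt positions and drop their sum at the front."""
    return belt.drop(belt[i] + belt[j])


def mul(belt, i, j):
    """read two belt positions and drop their product at the front."""
    return belt.drop(belt[i] * belt[j])


def demo():
    belt = Belt(length=4)
    belt.drop(3)        # belt: [3]
    belt.drop(4)        # belt: [4, 3]
    add(belt, 0, 1)     # belt: [7, 4, 3]
    belt.drop(2)        # belt: [2, 7, 4, 3]
    mul(belt, 0, 1)     # belt: [14, 2, 7, 4], the 3 has fallen off
    return list(belt.slots)
```

note that no instruction names a destination: results always land at the front, so the positions of older values shift with every drop. a value that is still needed after it would fall off the end has to be saved somewhere else first, which this sketch does not model.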
mill or not mill - that is the question
the mill really knocked my socks off. the mill has been in development for 16 years now and i have to
say that it was really worth it. it hits a nerve with me and from my point of view it is better than
risc-v. every aspect of cpu architecture was examined and not only brought up to date; old habits were also
cut without compromise. the memory management and the task switching alone are the
optimum of what is possible from my point of view. together with the many other aspects this is the
all-round carefree package. and as claudia always said: always waiting for that ;)
but - if it weren't for the patents, and the unbelievably long development time, whose end is far from
being foreseeable.
patents
yes, i'm not a big fan of patents. i believe that an open approach like risc-v's, applied to the ISA and the architecture behind it, will lead to a large community where everyone works together to bring such a complex and complicated system to a stable level. this will result in companies that, like sifive, can build up their knowledge around it and offer it accordingly.