Wednesday, August 8, 2012

Complex CPU Addressing modes, for free!

One of the most powerful features of a central processing unit is its ability to access memory using complex indexing modes. These modes are essential for implementing compound data structures such as arrays, structures and classes efficiently. Our FPGA soft-processor core will need to do all these things too.
 
The instruction set architecture (ISA) that we developed in the previous post supports only basic register indirection, like this:

MOV [R1], R2
MOV R1,[R2]

where [R1] denotes the memory location at the address held in register R1.

However, as we shall see, we can support rather complex indexing modes essentially for free!
 
Recall the layout of our execution unit:




Note that the memory address register (MAR) can only be loaded using bus3, which means taking a trip through the ALU. We can therefore load MAR with the result of an ALU operation at no additional cost. This gives us, for example, the following indirect addressing options:

MOV [R1 + R2], R3
MOV R1, [R2 - R3]

The additional information needed in the instruction is minimal: we just need to specify an ALU operation and the bus2 driver, just like we do for regular ALU operations.
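As a rough sketch of what this means for the control word, assume (as the control unit listing further down suggests) that its low 16 bits hold the ALU operation, the bus1 driver, the bus2 driver and the destination, in that order, and that destination X"C" selects MAR. The package name, the function name and the ALU_ADD opcode value are hypothetical, as is the convention that register Rn is selected by the value n.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package mar_load_sketch is
    subtype nibble is std_logic_vector(3 downto 0);

    -- Hypothetical ALU opcode, chosen only for illustration.
    constant ALU_ADD : nibble := X"1";
    -- Destination code that the control unit listing uses for MAR.
    constant DST_MAR : nibble := X"C";

    -- Build a control word that routes an ALU result into MAR.
    function mar_load_word(op, bus1, bus2 : nibble) return std_logic_vector;
end package;

package body mar_load_sketch is
    function mar_load_word(op, bus1, bus2 : nibble) return std_logic_vector is
        variable w : std_logic_vector(31 downto 0);
    begin
        -- Upper 16 bits are left at zero, as in the control unit listing.
        w := X"0000" & op & bus1 & bus2 & DST_MAR;
        return w;
    end function;
end package body;
```

Under these assumptions, mar_load_word(ALU_ADD, X"1", X"2") yields X"0000112C", which is the address half of MOV [R1 + R2], R3.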

Note that the above also holds true for reading/writing the memory block register (MBR) itself! We can only write to the MBR by going through the ALU. Also, when we want to store the contents of MBR in a register, we have to go through the ALU too! This gives us the opportunity to replace the stores with ALU operations, yielding instructions of the form

MOV [R1 + R2] - R3, R4
MOV R1 + R2, [R3 - R4]
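To make the intended behaviour of these two forms concrete, here is a small simulation-only sketch. It is not part of the processor: the 32-bit word size, the 16-entry register file and memory, and the entity name are assumptions made for illustration. The order of the steps (address into MAR first, then the data operation) follows the control unit listing further down.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity addressing_sketch is
end entity;

architecture sim of addressing_sketch is
begin
    process
        subtype word is unsigned(31 downto 0);
        type word_array is array (0 to 15) of word;
        variable regs : word_array := (others => (others => '0'));
        variable mem  : word_array := (others => (others => '0'));
        variable mar, mbr : word;
    begin
        regs(1) := to_unsigned(2, 32);
        regs(2) := to_unsigned(3, 32);
        regs(3) := to_unsigned(4, 32);
        mem(5)  := to_unsigned(10, 32);

        -- MOV [R1 + R2] - R3, R4
        mar := regs(1) + regs(2);        -- first ALU pass: address into MAR
        mbr := mem(to_integer(mar));     -- memory read fills MBR
        regs(4) := mbr - regs(3);        -- second ALU pass: MBR - R3 into R4
        assert regs(4) = 6 report "unexpected load result" severity failure;

        -- MOV R1 + R2, [R3 - R4]
        regs(4) := to_unsigned(1, 32);
        mar := regs(3) - regs(4);        -- first ALU pass: address into MAR
        mbr := regs(1) + regs(2);        -- second ALU pass: value into MBR
        mem(to_integer(mar)) := mbr;     -- memory write from MBR
        assert mem(3) = 5 report "unexpected store result" severity failure;

        wait;  -- stop the process
    end process;
end architecture;
```

Each instruction still makes exactly two passes through the ALU, the same number a plain register-indirect load or store needs, which is why the extra arithmetic comes for free.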


To achieve this, we will need to store an additional ALU operation and bus2 driver in the instruction. Fortunately, we have room to spare in these instructions! Below we see the instruction words for load and store operations as they are currently in use.

bits   31-28 | 27-24 | 23-20 | 19-16 | 15-12 | 11-8 | 7-4 | 3-0
load   0010 | unused | [src reg] | unused | dst reg
store  0011 | unused | src reg | [dst reg] | unused

To support the new indexing modes, we need to modify the instruction words to contain two complete ALU specifications instead of one. We shall call them the "top alu" (suffix T) and "bottom alu" (suffix B) configurations, and they both consist of an ALU operation (OP, 4 bits) and the driver specifiers for bus1 (B1, 4 bits) and bus2 (B2, 4 bits).

       opcode | top alu             | bottom alu         | dst reg
bits   31-28  | 27-24  23-20  19-16 | 15-12  11-8    7-4 | 3-0
load   0010   | OPT    B1T    B2T   | OPB    unused  B2B | dst reg
store  0011   | OPT    B1T    B2T   | OPB    B1B     B2B | unused

The load operation always uses the contents of the addressed memory location as input for bus1 during bottom alu, leaving B1B unused. The store operation stores the result into the addressed memory location, leaving the destination register unused.
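Packing these eight 4-bit fields into a 32-bit word is mechanical, and a small helper makes the layout explicit. The sketch below is illustrative only: the package, function and entity names, the ALU opcode values, the choice of zero for unused fields, and the convention that register Rn is selected by the value n are all assumptions. The example store follows the control unit listing further down, which drives MAR from bits 15-4 and MBR from bits 27-16, and assumes the ALU computes bus1 - bus2 for a subtraction.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package isa_pack_sketch is
    subtype nibble is std_logic_vector(3 downto 0);
    subtype word   is std_logic_vector(31 downto 0);

    constant OPC_LOAD  : nibble := "0010";
    constant OPC_STORE : nibble := "0011";
    constant UNUSED    : nibble := "0000";  -- assumption: unused fields are zero

    -- Hypothetical ALU opcodes; the real values are defined by the ALU.
    constant ALU_ADD : nibble := X"1";
    constant ALU_SUB : nibble := X"2";

    -- Concatenate the fields in the order they appear in the table.
    function pack(opcode, opt, b1t, b2t, opb, b1b, b2b, dst : nibble) return word;
end package;

package body isa_pack_sketch is
    function pack(opcode, opt, b1t, b2t, opb, b1b, b2b, dst : nibble) return word is
        variable w : word;
    begin
        w := opcode & opt & b1t & b2t & opb & b1b & b2b & dst;
        return w;
    end function;
end package body;

library ieee;
use ieee.std_logic_1164.all;
use work.isa_pack_sketch.all;

entity isa_pack_demo is
end entity;

architecture sim of isa_pack_demo is
begin
    process
        variable w : word;
    begin
        -- MOV R1 + R2, [R3 - R4]: value R1 + R2 (top), address R3 - R4 (bottom)
        w := pack(OPC_STORE, ALU_ADD, X"1", X"2", ALU_SUB, X"3", X"4", UNUSED);
        assert w = X"31122340" report "unexpected encoding" severity failure;
        wait;  -- stop the process
    end process;
end architecture;
```

Writing the fields out by hand like this is exactly the kind of bookkeeping an assembler would take over.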

The code that handles these instructions in the control unit has not become significantly more complex, even though we need to juggle a few more parameters:

when X"2" =>
        -- Indirect register load
        case PHASE is
                when X"0" =>
                        -- Transfer bottom ALU operation into MAR
                        CONTR <= X"0000" & IR(15 downto 4) & X"C";      
                        PHASE <= unsigned(phase) + 1;
                when others =>
                        -- Transfer MBR + top ALU into output register
                        CONTR <= X"0000" & IR(27 downto 24) & X"D" & IR(19 downto 16) & IR(3 downto 0);
                        -- end of instruction, load the next instruction
                        PHASE <= (others => '0');
                        PC <= unsigned(PC) + 1; 
                end case;
when X"3" =>
        -- Indirect register store
        -- (Store the contents of rN at the address in rM)
        case PHASE is
                when X"0" =>
                        -- Transfer bottom ALU operation into MAR
                        CONTR <= X"0000" & IR(15 downto 4) & X"C";      
                        PHASE <= unsigned(phase) + 1;
                when others =>
                        -- Transfer top ALU operation into MBR
                        CONTR <= X"0000" & IR(27 downto 16) & X"D";                                                     
                        -- end of instruction, load the next instruction
                        PHASE <= (others => '0');
                        PC <= unsigned(PC) + 1; 
                end case;

As always, the complete, synthesizable VHDL for this project is available on bitbucket.

Of course, with all these funky addressing modes, assembling instructions by hand is starting to become quite a chore. What we need is an assembler!
