Machine Architecture and Organization (sec 010)

CSCI 2021 Lab 0x9: Extending Y86-64

The standard Y86-64 instruction set provides the OPq instruction family (addq, subq, andq, and xorq) for doing arithmetic or bitwise operations on two numbers. These instructions require that both operands be in registers, which helps keep things simple. But you may have noticed that this requirement is somewhat inconvenient in programming, since you need to devote a separate instruction and a separate register to a constant for even simple operations like adding 1.

For this lab, you're a hardware designer at the company designing the next version of the Y86-64 processor, part of the team extending the processor with new instructions. Specifically, you're part of the team implementing a new family of instructions, called iOPq for short, which perform arithmetic or bitwise operations with a constant (immediate) operand. Each new instruction will take the place of the combination of an irmovq and an OPq instruction, and will also save on the use of registers. Other engineers on the team have already implemented support for the new instructions in the assembler and simulator and created a couple of test programs. Your job is to implement support for the new instructions in the HCL control logic of the sequential Y86-64 processor implementation.

The iaddq instruction has the effect of adding a constant to a register, with the new value stored in the same register, and there are also isubq, iandq, and ixorq instructions that do the same with the other ALU operations. The operation of these instructions is summarized as follows:

Instruction      Effect
iaddq V, rA      rA <- V + rA
isubq V, rA      rA <- V - rA
iandq V, rA      rA <- V & rA
ixorq V, rA      rA <- V ^ rA

Notice that the order of the two operands is significant in the case of isubq. The designers chose this order so that isubq can be useful for other things besides what could be done with iaddq with a negative constant. For instance, isubq with a 0 constant is like a negation instruction.
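To make the operand order concrete, here is a minimal Python sketch of the four operations as summarized above. It is only a model of the arithmetic, not part of the lab files: the names MASK64, OPS, and iopq are made up for this illustration, and condition codes are ignored.

```python
MASK64 = (1 << 64) - 1  # Y86-64 registers hold 64-bit values

# Hypothetical table mapping each mnemonic to "V op rA" (constant first)
OPS = {
    "iaddq": lambda v, ra: v + ra,
    "isubq": lambda v, ra: v - ra,   # constant minus register, in that order
    "iandq": lambda v, ra: v & ra,
    "ixorq": lambda v, ra: v ^ ra,
}

def iopq(mnemonic, v, ra):
    """Return the new value of rA, wrapped to 64 bits."""
    return OPS[mnemonic](v, ra) & MASK64

# isubq with a 0 constant acts like negation: 0 - 10 wraps to the
# 64-bit two's complement encoding of -10
negated = iopq("isubq", 0, 10)
print(hex(negated))  # 0xfffffffffffffff6
```

With a nonzero constant, isubq computes a "reverse" subtraction that iaddq with a negative constant cannot express: iopq("isubq", 10, 3) yields 7, not -7.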

The encoding of the new instructions is similar to the encoding of the existing OPq instructions. The instruction code is 0xc, and the function code uses the same encoding as for OPq: 0 for add, 1 for subtract, 2 for AND, and 3 for XOR. The register rA is encoded by the first hex digit of the register IDs byte, while the second hex digit of that byte is 0xf, signifying no second register. The remaining 8 bytes of the instruction are a 64-bit little-endian constant, so each instruction is 10 bytes long in total. For instance, since %rdi is register 7, the encoding of iaddq $8,%rdi is c07f0800000000000000.
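The byte layout described above can be checked with a short Python sketch. This is an illustration only and not how yas is implemented; the names encode_iopq, FUNCTION_CODES, and RNONE are assumptions chosen for this example.

```python
import struct

IIOPQ = 0xC                                   # instruction code from the text
FUNCTION_CODES = {"add": 0, "sub": 1, "and": 2, "xor": 3}
RNONE = 0xF                                   # "no register" in the second hex digit

def encode_iopq(op, constant, ra):
    """Build the 10-byte encoding: icode:ifun, rA:0xf, 8-byte constant."""
    opcode_byte = (IIOPQ << 4) | FUNCTION_CODES[op]
    register_byte = (ra << 4) | RNONE
    # "<Q" packs an unsigned 64-bit value in little-endian byte order;
    # masking lets negative constants be encoded in two's complement
    constant_bytes = struct.pack("<Q", constant & 0xFFFFFFFFFFFFFFFF)
    return bytes([opcode_byte, register_byte]) + constant_bytes

# iaddq $8,%rdi, with %rdi as register 7
encoded = encode_iopq("add", 8, 7)
print(encoded.hex())  # c07f0800000000000000
```

Reading the result left to right gives c0 (instruction code 0xc, function 0), 7f (rA = 7, no second register), and then the constant 8 with its least significant byte first.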

Getting Started

Extract the lab files with this command:

tar xvf /web/classes/Spring-2020/csci2021/labs/0x9/lab0x9.tar

This will create a directory named lab0x9 with the files you'll need for the lab. Because the hardware design is expressed in HCL, you may want to review the description of that language in the slides or textbook as you go. The file you'll be modifying is seq.hcl. Notice how each stage in the processor (Fetch, Decode, etc.), the control signals, and the program counter are all described in HCL. This is a programmatic representation of the logic of SEQ (a sequential version of a Y86-64 processor).

In this CPU description, the different instructions are named with constant values that start with an extra I. The existing OPq is referred to as IOPQ, and we've already given you a declaration of a new instruction type IIOPQ for the new iOPq. You'll need to modify several different parts of the processor so that they do the right thing when given the new instruction. Sometimes the new instruction will be similar to the old OPq, while other times it will need to be different.

Fetch

Recall that HCL expressions like

var in { A, B, C }

evaluate to true if var is found in the list and false if not. In the Fetch section we see this in action where IOPQ is listed under the instr_valid control signal. Add IIOPQ to this list so that it is recognized as a valid instruction.

Look at the other control signals generated during Fetch. Which of these will IIOPQ need?

Decode

Now recall that HCL switch statements like

[ cond1 : value1; cond2 : value2; 1 : default_value; ]

return the value corresponding to the first boolean expression that evaluates as true. The list-containment expression from Fetch is used as a boolean expression.
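If the first-match rule is unfamiliar, the following Python sketch models it, with a list-containment test used as each condition. It is a loose analogy rather than HCL itself: hcl_case and the codes CODE_A, CODE_B, and CODE_C are hypothetical names invented for this illustration and do not appear in seq.hcl.

```python
def hcl_case(*cases):
    """Return the value paired with the first condition that is true."""
    for condition, value in cases:
        if condition:
            return value
    raise ValueError("no case matched")

# Hypothetical instruction codes, used only for this illustration
CODE_A, CODE_B, CODE_C = 1, 2, 3

icode = CODE_C
selected = hcl_case(
    (icode in {CODE_A, CODE_B}, "first source"),   # containment test as a condition
    (icode in {CODE_C}, "second source"),
    (True, "no source"),                           # catch-all, like a condition of 1 in HCL
)
print(selected)  # second source
```

Because only the first true condition counts, the order of the cases matters: a catch-all placed first would hide every case after it.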

In IOPQ, registers rA and rB are identified as the sources for the ALU. IIOPQ also has a register input to the ALU, but how will you encode the logic for the constant value?

Execute, Memory Stage, PC Update

Once you've identified IIOPQ's input sources in Decode, you are ready to plug them into the ALU. We've covered all the HCL syntax you need to continue. Use IIOPQ's similarity to IOPQ to decide how to finish the Execute and Memory Stages.

Building and Testing Your Solution

We've given you two sample test cases named negate.ys and asumi.ys. When you run make, it will assemble the test cases into Y86-64 object files using yas, and compile the HCL processor description seq.hcl together with some supporting C code into a simulator named ssim. We've also given you a version of the ISA-level simulator yis which has already been updated with the new instructions. In other words, yis will show you the expected behavior of the instructions. If you try to run the examples in ssim before you have implemented the new instructions, they will not behave correctly: ssim will stop with the status INS, meaning illegal instruction.

Once you have finished modifying the seq.hcl file, you will need to compile a new instance of the SEQ simulator (ssim) based on your HCL file, and then test it:

Building a new simulator. You can use make to build a new SEQ simulator:

make

This builds a version of ssim that uses the control logic you specified in seq.hcl.

Testing your solution on a simple Y86-64 program. You can run the SEQ simulator on an object file with a command like ./ssim negate.yo. This will execute the program on the simulator until it halts, and print the final status. You can also have the simulator automatically check its results against the ISA model (like yis) using the option -t. The program negate.yo uses isubq to compute the negation of 10, so the final value in %rax should be 0xfffffffffffffff6. The program asumi.yo computes the sum of an array of long integers, so its final result should be 0x0000abcdabcdabcd.

For more information on the SEQ simulator refer to the handout CS:APP3e Guide to Y86-64 Processor Simulators. Note however that the version of the simulator you build in this lab doesn't include the GUI features.