Skip to content

03. Multi Cycle

$\gdef \N{\mathbb{N}} \gdef \Z{\mathbb{Z}} \gdef \Q{\mathbb{Q}} \gdef \R{\mathbb{R}} \gdef \C{\mathbb{C}} \gdef \setcomp#1{\overline{#1}} \gdef \sseq{\subseteq} \gdef \pset#1{\mathcal{P}(#1)} \gdef \covariant{\operatorname{Cov}} \gdef \of{\circ} \gdef \p{^{\prime}} \gdef \pp{^{\prime\prime}} \gdef \ppp{^{\prime\prime\prime}} \gdef \pn#1{^{\prime\times{#1}}} $

Pasted image 20230709184212.png

Single Cycle Multi-Cycle
Each instruction is executed in a single clock cycle Instructions are divided into multiple smaller stages, taking one cycle each
The hardware ‘knows’ what instruction we have at the beginning of the Execute stage The hardware does not ‘know’ what instruction it has until the end of the Execute stage
The processor has a fixed set of hardware that performs all the necessary steps for instruction execution within one cycle The hardware is designed to handle different stages of execution seperately
Some parts of the hardware may remain idle during certain instructions, leading to reduced overall performance Instructions that require fewer cycles finish faster, while complex instructions will take more cycles to complete
Duplicate hardware No duplicate hardware
No extra registers Extra registers required to store results between cycles
CPI is always one CPI depends on the types of instructions

There are three to five cycles for any given command (other than floating point), similar to Single Cycle:

  1. Fetch
  2. Decode
  3. Execute
  4. Memory
  5. Register Write

Each stage is performed in one clock cycle.

Fetch

This instruction is identical for all command types.
The address from the PC is passed to the Memory File. The Memory File reads the data at that address, and is stored in a temporary Instruction Register, to be used later as the instruction to execute. The PC is incremented by 4.

Decode

The values from the specified registers (RS and RT) are passed into temporary registers (called A and B) to make sure the data persists across the next cycle. The Control sets the control lines according to the opcode.

Just in case the command is a Branch command, we calculate what the destination address is. The bottom 16 bits are sign extended to 32 bits and shifted left two bits. The ALU adds the PC to this value and stores the result in the ALUout register

Execute

The ALU uses the values in registers A and B as it’s operands. Depending on the instruction, one of four things happen:

  • R-Type instructions
    • The ALU performs an operation on A and B, storing the result in ALU-Result
  • Memory instructions (sw/lw)
    • ALU calculates effective memory address by adding the offset to RS
  • Branch instructions
    • ALU subtracts B from A to see if they’re equal, and sets the zero flag accordingly
  • Jump instructions
    • The PC is set to the 4 leading bits of the PC (after incrementing it by 4) and the 28 bits of the shifted address
    • Excalidraw Excalidraw

All instructions other than Load, Store, and R-Types complete here.

Memory Access

  • R-Type
    • The result calculated in the Execute cycle is waiting in the ALUout register. This value is written to register RD.
  • Store word/byte
    • The address computed in the Execute cycle is passed from ALUout to the Write Address line into the Memory File, and the contents of RT are passed to the Write Data line of the Memory File. The Control unit sets the MemWrite line to 1.
  • Load word/byte
    • The address computed in the Execute cycle is passed from ALUout to the Read Address line into the Memory File. The Control unit sets the MemRead line to 1. The result is stored in the Memory Data register to persist across the next cycle.

Register Write

This stage only happens with load commands. The data in the Memory Data register is sent to the Register File and written to RT.