1. Read Sections 4.1, 4.2, 4.3, and 4.4 of Chapter 4 in the CODmips textbook. In Section 4.4, read until “A Multicycle Implementation.”
These topics are covered in the Chapter 4 presentation deck for our CODmips textbook.
Section 4.1 of the CODmips textbook serves as the introduction to the chapter on the processor. This section provides a high-level and abstract overview of the principles and techniques used in implementing a processor. It sets the context for the more detailed discussions in the subsequent sections of the chapter.
Key points from Section 4.1 include:
Starting with a highly abstract and simplified overview, the section explains the principles and techniques used in implementing a processor. This overview is followed by building a datapath and constructing a simple version of a processor capable of implementing an instruction set like MIPS.
Figure 4.1 shows a high-level view of a MIPS implementation, focusing on functional units and their interconnections. However, this figure omits the selection (control) logic for multiple data sources (multiplexors) and the control signals needed for different instruction types.
An abstract view of the implementation of the MIPS subset showing the major functional units and the major connections between them. All instructions start by using the program counter to supply the instruction address to the instruction memory. After the instruction is fetched, the register operands used by an instruction are specified by fields of that instruction. Once the register operands have been fetched, they can be operated on to compute a memory address (for a load or store), to compute an arithmetic result (for an integer arithmetic-logical instruction), or a compare (for a branch). If the instruction is an arithmetic-logical instruction, the result from the ALU must be written to a register. If the operation is a load or store, the ALU result is used as an address to either load a value from memory into the registers or store a value from the registers. The result from the ALU or memory is written back into the register file. Branches require the use of the ALU output to determine the next instruction address, which comes either from the ALU (where the PC and branch offset are summed) or from an adder that increments the current PC by 4. The thick lines interconnecting the functional units represent buses, which consist of multiple signals. The arrows are used to guide the reader in knowing how information flows. Since signal lines may cross, we explicitly show when crossing lines are connected by the presence of a dot where the lines cross.
In essence, Section 4.1 lays the groundwork for the chapter by outlining the topics that will be covered, ranging from basic datapath construction to advanced pipelining techniques and considerations for complex instruction sets. It also indicates different levels of detail that readers can focus on based on their interests.
Section 4.2 of the CODmips textbook focuses on logic design conventions and how the hardware logic implementing a computer operates and is clocked. This section reviews key ideas in digital logic that are essential for understanding the rest of the chapter on the processor.
In summary, Section 4.2 lays the groundwork for understanding the processor’s design by introducing the fundamental concepts of combinational and sequential logic and emphasizing the importance of a predictable clocking methodology, specifically the edge-triggered approach used throughout the chapter.
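The difference between combinational logic and an edge-triggered state element can be sketched in a few lines of Python. This is only an illustrative model: the `Register` class, its `d` and `q` attributes, and the `tick` method are names assumed for this sketch, not textbook notation.

```python
class Register:
    """Toy edge-triggered state element (hypothetical model for this sketch)."""

    def __init__(self, value=0):
        self.q = value  # output visible to combinational logic
        self.d = value  # input waiting to be captured

    def tick(self):
        # Active clock edge: the pending input becomes the new output.
        self.q = self.d


def increment(x):
    # Combinational logic: the output depends only on the current input.
    return (x + 1) & 0xFFFFFFFF


reg = Register(0)
reg.d = increment(reg.q)     # logic settles during the clock cycle
value_before_edge = reg.q    # the stored state has not changed yet
reg.tick()                   # state changes only at the clock edge
value_after_edge = reg.q
```

Because the stored value changes only in `tick`, the register can be read and written in the same cycle without the new value feeding back into the logic that produced it.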
Section 4.3 of the CODmips textbook is titled “Building a Datapath” and it details the construction of the essential hardware elements and their interconnections needed to execute instructions in a processor. This section focuses on laying the groundwork for a basic implementation before moving on to more complex pipelined designs.
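The section introduces the fundamental datapath elements that are required in a MIPS implementation. These elements are responsible for operating on or holding data within the processor. The key datapath elements discussed include the program counter (PC), the instruction memory, adders, the register file, the ALU, the data memory, and the sign extension unit.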
The section describes how the instruction memory, PC, and adder are combined to form the initial part of the datapath responsible for fetching instructions and incrementing the PC. Figure 4.6 illustrates this portion.
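The fetch portion can be sketched as follows. This is a minimal model under stated assumptions: the instruction memory is represented as a Python dictionary keyed by byte address, and the function name `fetch`, the starting address, and the two example instruction words are chosen only for illustration.

```python
MASK32 = 0xFFFFFFFF


def fetch(instruction_memory, pc):
    """Read the instruction at pc and compute the incremented PC."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & MASK32  # the adder: every MIPS instruction is 4 bytes
    return instruction, next_pc


# Hypothetical program: add $t0, $t1, $t2 followed by lw $t0, 4($t1).
imem = {0x00400000: 0x012A4020, 0x00400004: 0x8D280004}
pc = 0x00400000
instruction, pc = fetch(imem, pc)
```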
It explains the datapath elements needed for R-format ALU operations (like add, subtract, AND, OR, slt), which primarily involve the register file and the ALU. Figure 4.7 shows these elements.
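A rough sketch of an R-format operation is shown here: two registers are read, the ALU combines them, and the result is written to a third register. The register file is modeled as a plain list, and the function names and the string operation labels are assumptions of this sketch rather than hardware signals.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Model of the ALU operations used by the R-format subset."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "and":
        return a & b & MASK32
    if op == "or":
        return (a | b) & MASK32
    if op == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unsupported ALU operation: {op}")


def execute_r_type(regs, op, rs, rt, rd):
    # Read two source registers, operate, write the destination register.
    regs[rd] = alu(op, regs[rs], regs[rt])


regs = [0] * 32
regs[9], regs[10] = 5, 7
execute_r_type(regs, "add", 9, 10, 8)  # add $t0, $t1, $t2
```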
The section outlines the additional units required for load (lw) and store (sw) instructions, which include the data memory and the sign extension unit, in addition to the register file and ALU. Figure 4.8 depicts these elements.
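The role of the sign extension unit in forming a load or store address can be illustrated with a short sketch. The function names are hypothetical, and the base address used in the example is arbitrary.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Extend a 16-bit field to a signed Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, imm16):
    # The ALU adds the base register to the sign-extended 16-bit offset.
    return (base + sign_extend16(imm16)) & MASK32


forward = effective_address(0x10010000, 0x0004)   # offset +4
backward = effective_address(0x10010000, 0xFFFC)  # offset -4
```

Without sign extension, the offset field 0xFFFC would be treated as a large positive number instead of -4.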
It discusses the elements needed for branch equal (beq) instructions, which involve comparing two registers for equality using the ALU and calculating the branch target address by adding the sign-extended offset, shifted left by 2 bits, to the incremented PC (PC + 4). Figure 4.9 depicts these elements.
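The branch computation can be sketched as below. In the textbook's datapath the sign-extended offset is shifted left by 2 bits (it counts words, not bytes) and added to PC + 4, and the ALU signals equality through its Zero output after subtracting the two registers. The function names and example addresses are assumptions of this sketch.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Extend a 16-bit field to a signed Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, imm16):
    # Word offset -> byte offset, relative to the incremented PC.
    return (pc + 4 + (sign_extend16(imm16) << 2)) & MASK32


def next_pc_for_beq(pc, imm16, rs_value, rt_value):
    zero = ((rs_value - rt_value) & MASK32) == 0  # ALU subtract, Zero output
    return branch_target(pc, imm16) if zero else (pc + 4) & MASK32


taken = next_pc_for_beq(0x00400000, 0x0003, 5, 5)
not_taken = next_pc_for_beq(0x00400000, 0x0003, 5, 6)
```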
The section then shows how these individual datapath components are integrated into a single, unified datapath capable of executing the basic instruction classes (load-store word, ALU operations, and branches) in a single clock cycle. This integrated datapath requires the addition of multiplexors to select the appropriate data sources for different instructions. Figure 4.11 illustrates this combined datapath.
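A two-input multiplexor is simple enough to model directly. The sketch below uses two of the select signals named in Section 4.4 (ALUSrc and MemtoReg); the function names themselves are hypothetical.

```python
def mux2(select, input0, input1):
    """Two-input multiplexor: select 0 passes input0, select 1 passes input1."""
    return input1 if select else input0


def alu_second_operand(alu_src, register_value, immediate):
    # ALUSrc = 0: second register operand; ALUSrc = 1: sign-extended immediate.
    return mux2(alu_src, register_value, immediate)


def write_back_value(mem_to_reg, alu_result, memory_data):
    # MemtoReg = 0: ALU result; MemtoReg = 1: data read from memory.
    return mux2(mem_to_reg, alu_result, memory_data)
```

Sharing one ALU and one register file across instruction classes is what makes these selectors necessary: each shared input has more than one possible source.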
In essence, Section 4.3 walks through the process of identifying the necessary hardware components for different types of instructions and then combines them into a basic datapath architecture. This datapath serves as the foundation for the more detailed processor implementations discussed in the subsequent sections of the chapter.
Section 4.4 of the CODmips textbook is titled “A Simple Implementation Scheme” and it describes how to implement a basic version of the MIPS processor using the datapath that was built in Section 4.3. This implementation is characterized by using a single long clock cycle for every instruction.
The fundamental characteristic of this implementation is that each instruction begins execution on one clock edge and completes its execution on the next clock edge. This means that the clock cycle must be long enough to accommodate the slowest instruction in the instruction set, that is, the one with the longest path through the datapath.
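A small calculation makes the clock-cycle constraint concrete. The unit delays below are illustrative assumptions, not figures taken from Sections 4.1 to 4.4, and the lists of units each instruction uses are a simplification of the datapath.

```python
# Assumed unit delays in picoseconds (illustrative values only).
LATENCY_PS = {
    "imem": 200,       # instruction memory read
    "reg_read": 100,   # register file read
    "alu": 200,        # ALU operation
    "dmem": 200,       # data memory access
    "reg_write": 100,  # register file write
}

# Units on the critical path of each instruction class (simplified).
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw": ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw": ["imem", "reg_read", "alu", "dmem"],
    "beq": ["imem", "reg_read", "alu"],
}


def instruction_delay(name):
    return sum(LATENCY_PS[unit] for unit in PATHS[name])


delays = {name: instruction_delay(name) for name in PATHS}
cycle_time_ps = max(delays.values())  # every instruction pays for the slowest
slowest = max(delays, key=delays.get)
```

Under these assumed delays the load sets the cycle time, so a branch that needs far less time still occupies a full cycle.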
With the datapath constructed, the section moves on to discuss the control unit. The control unit is responsible for generating the control signals that dictate the operation of the datapath components (register file writes, memory reads/writes, ALU operations, and multiplexor selections) based on the instruction being executed.
To design the control unit, it’s necessary to understand the formats of the different instruction classes (R-type, load-store, and branch) and the control lines needed for the datapath. Figure 4.14 shows these instruction formats.
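The field layout of the MIPS formats can be sketched as a decoder. The bit positions follow the standard MIPS encoding (opcode in bits 31:26, rs in 25:21, rt in 20:16, rd in 15:11, shamt in 10:6, funct in 5:0, immediate in 15:0); the `decode` function and its dictionary keys are names assumed for this sketch.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,     # meaningful for R-type only
        "shamt": (word >> 6) & 0x1F,   # meaningful for R-type only
        "funct": word & 0x3F,          # meaningful for R-type only
        "imm": word & 0xFFFF,          # meaningful for load, store, branch
    }


r_type = decode(0x012A4020)  # add $t0, $t1, $t2
load = decode(0x8D280004)    # lw $t0, 4($t1)
```

Note that the destination register sits in a different field for the two classes (rd for R-type, rt for a load), which is why the datapath needs a multiplexor in front of the register file's write-register input.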
Figure 4.15 illustrates the simple datapath with all necessary multiplexors and identified control lines.
The design of the ALU control is addressed first. The ALU control unit takes as input the ALU operation (ALUOp) control signals from the main control unit and the function code (funct field) from R-type instructions to determine the specific operation the ALU should perform. Figure 4.12 (reproduced later in the chapter as Figure 4.47) shows how the ALU control bits are set based on ALUOp and the function code.
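The ALU control mapping can be sketched as a lookup. The encodings below follow the textbook's scheme for the subset (ALUOp 00 for load and store, 01 for beq, 10 for R-type, and 4-bit ALU control values such as 0010 for add); the function name and the use of a Python dictionary are choices made for this sketch, not a description of the gate-level design.

```python
# funct field -> 4-bit ALU control value, used when ALUOp = 10 (R-type).
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op, funct):
    """Derive the ALU control bits from ALUOp and the funct field."""
    if alu_op == 0b00:   # lw / sw: add to form the address
        return 0b0010
    if alu_op == 0b01:   # beq: subtract to compare
        return 0b0110
    if alu_op == 0b10:   # R-type: operation chosen by funct
        return FUNCT_TO_ALU_CONTROL[funct & 0x3F]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")
```

For load, store, and branch the funct bits are ignored, which is the point of the two-level decoding: the main control only needs to produce the small ALUOp field.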
The section then explains how to design the main control unit which generates the other control signals based on the opcode field of the instruction.
Figure 4.16 describes the function of the seven main control lines (RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch).
Figure 4.17 shows the datapath with the control unit and its input (opcode) and outputs (control signals).
The section walks through the execution of different instruction types (R-type, load, and branch) on the designed datapath, highlighting the active control signals and data flow for each step within the single clock cycle.
Figure 4.20 illustrates the datapath in operation for a load instruction.
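The steps a load performs within its single cycle can be sketched in sequence. This is a simplified software model: memories are dictionaries keyed by byte address, the register file is a list, and `step_lw` plus the example addresses and data value are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Extend a 16-bit field to a signed Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def step_lw(pc, imem, regs, dmem):
    """Execute one lw instruction and return the next PC."""
    word = imem[pc]                              # 1. fetch the instruction
    next_pc = (pc + 4) & MASK32                  #    and increment the PC
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    base = regs[rs]                              # 2. read the base register
    offset = sign_extend16(word & 0xFFFF)
    address = (base + offset) & MASK32           # 3. ALU computes the address
    data = dmem[address]                         # 4. read data memory
    regs[rt] = data                              # 5. write the register file
    return next_pc


regs = [0] * 32
regs[9] = 0x10010000
imem = {0x00400000: 0x8D280004}                  # lw $t0, 4($t1)
dmem = {0x10010004: 0xCAFE}
new_pc = step_lw(0x00400000, imem, regs, dmem)
```

In hardware these steps are not sequential statements; they are signals propagating through the datapath during one clock cycle, with the register write taking effect at the closing clock edge.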
The control function can be precisely defined using a truth table that maps the opcode to the required control signal settings.
Figure 4.22 shows such a truth table for the simple single-cycle implementation. This truth table can then be implemented using logic gates.
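The control truth table can be written down as a lookup from opcode to signal settings. The values follow the textbook's settings for the four instruction classes of the subset; the dictionary representation, the `main_control` name, and the use of `None` for don't-care entries are assumptions of this sketch.

```python
X = None  # don't-care: the signal's value does not affect the result

CONTROL_TABLE = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),  # R-format
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),  # lw
    0b101011: dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),  # sw
    0b000100: dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),  # beq
}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    try:
        return CONTROL_TABLE[opcode]
    except KeyError:
        raise ValueError(f"opcode not in the subset: {opcode:06b}") from None
```

The don't-care entries appear exactly where RegWrite is 0: if no register is written, it does not matter which register number or data source the write-side multiplexors select.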
While this single-cycle implementation is conceptually simple, the section points out that it is not practical for high performance: the clock cycle time is set by the slowest instruction, so faster instructions still occupy a full cycle.
Section 4.4 details the implementation of a basic MIPS processor where each instruction takes one full clock cycle to execute. It covers the design of the control unit, including the ALU control and the main control logic, and how the control signals are generated based on the instruction’s opcode and function code. The section illustrates the execution flow of different instruction types on this single-cycle datapath and explains the performance limitations of this approach.