Parameters are fundamental for creating reusable and configurable hardware designs. They allow you to define constants that can be modified at compile time, influencing the behavior and structure of your modules. Think of them as named constants within a module’s scope: unlike variables, their values are fixed during elaboration, before simulation begins.
Let’s break down how to use parameters effectively in SystemVerilog.
You declare a parameter using the parameter keyword. The basic syntax is:
parameter [data_type] parameter_name = value;
- data_type: Specifies the data type of the parameter. Common types include integer, real, time, and enumerated types. If omitted, the parameter takes its type from the value assigned to it.
- parameter_name: The name you give to your parameter. Follow standard naming conventions (typically uppercase for parameters).
- value: The constant value assigned to the parameter. This value can be a constant expression.
Some examples:
parameter WIDTH = 8; // An 8-bit wide value
parameter DEPTH = 256; // A depth value
parameter REAL_VAL = 3.14159; // A real value
parameter logic [7:0] DEFAULT_VALUE = 8'hAA; // An 8-bit logic value
parameter enum { STATE_IDLE, STATE_READ, STATE_WRITE } STATE = STATE_IDLE; // Enumerated type
Once declared, you can use parameters anywhere within the module where a constant value is required, such as a port or signal width.
For example, a parameterized adder using behavioral modeling:
module adder #(parameter WIDTH = 8) (
  input logic [WIDTH-1:0] a,
  input logic [WIDTH-1:0] b,
  input logic cin,
  output logic [WIDTH-1:0] sum,
  output logic cout
);
  assign {cout, sum} = a + b + cin;
endmodule
Using the adder in other modules, such as a test bench:
// code below would be in a test bench or another module's definition.
// Instantiating the adder with different widths:
adder #(.WIDTH(16)) adder16 (
  .a(data_a),
  .b(data_b),
  .cin(carry_in),
  .sum(sum16),
  .cout(carry_out)
);
adder adder8 ( // Using the default WIDTH = 8
  .a(data_c),
  .b(data_d),
  .cin(carry_in2),
  .sum(sum8),
  .cout(carry_out2)
);
The real power of parameters comes from the ability to override their values during module instantiation.  This is done using the #(.parameter_name(value)) syntax, as shown in the adder example above.  This allows you to reuse the same module with different configurations without modifying the module’s source code.
SystemVerilog also provides the keyword localparam.  These are similar to parameters but cannot be overridden during instantiation.  They are strictly local to the module in which they are defined.  Use localparam for constants that should not be changed externally.
localparam DELAY = 2; // A delay value that should not be modified from outside
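A common use of localparam is to derive a constant from an overridable parameter, so the derived value always stays consistent with it. The sketch below is a minimal illustration only: the module name wrap_pointer, its ports, and the constant LAST are hypothetical and do not appear elsewhere in these notes. It assumes DEPTH is at least 2.

```systemverilog
// Hypothetical example: a wrap-around pointer sized from DEPTH.
module wrap_pointer #(
  parameter int DEPTH = 256              // May be overridden at instantiation
) (
  input  logic clk,
  input  logic rst,
  input  logic advance,
  output logic [$clog2(DEPTH)-1:0] ptr   // Width follows DEPTH
);
  // Derived constant: cannot be overridden from outside the module
  localparam int LAST = DEPTH - 1;

  always_ff @(posedge clk) begin
    if (rst) begin
      ptr <= '0;                          // Synchronous reset
    end else if (advance) begin
      ptr <= (ptr == LAST) ? '0 : ptr + 1'b1; // Wrap after the last entry
    end
  end
endmodule
```

An instantiation can still write `wrap_pointer #(.DEPTH(16))`, but it has no way to set LAST directly, which is the point of using localparam here.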
Generate blocks provide a mechanism for creating multiple instances of modules or code blocks based on compile-time conditions or loop iterations. This is essential for designing regular structures like arrays of processing elements, memory banks, or replicated logic.  genvar is a special variable used exclusively within generate blocks as an index or iterator.
genvar Keyword: genvar declares an integer variable that is used as a loop counter or index within a generate block. It’s crucial to understand that genvar is not a regular variable; it exists only during the elaboration phase (before simulation) and is used to generate hardware instances. You cannot use a genvar outside of a generate block.
genvar i; // Declaring a genvar
generate Block: The generate block encloses the code that you want to replicate or conditionally instantiate. There are three main types of generate constructs:
- for loop generate: Used for repetitive instantiation.
- if-else generate: Used for conditional instantiation.
- case generate: Used for multi-conditional instantiation (similar to a case statement).
for Loop Generate: This is the most common type. It’s used to create multiple instances of a module or block of code.
generate
  for (genvar i = 0; i < N; i++) begin : instances // 'instances' is a generate block name (important!)
    // Inside the loop, 'i' is used to create unique instances.
    adder #( .WIDTH(WIDTH) ) adder_inst (
      .a(data_a[i*WIDTH+:WIDTH]), // Using 'i' to index into a wider data bus
      .b(data_b[i*WIDTH+:WIDTH]),
      .cin(carry_in[i]),
      .sum(sum[i*WIDTH+:WIDTH]),
      .cout(carry_out[i])
    );
  end
endgenerate
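The indexed part-select `i*WIDTH +: WIDTH` used in the loop above can be tried out in isolation. The following is a small sketch under assumed values: the test bench name part_select_tb, the 12-bit bus, and the value 12'hABC are all made up for illustration.

```systemverilog
// Hypothetical test bench: slice a 12-bit bus into three 4-bit pieces.
module part_select_tb;
  localparam int WIDTH = 4;
  logic [3*WIDTH-1:0] bus;

  initial begin
    bus = 12'hABC;
    for (int i = 0; i < 3; i++) begin
      // bus[i*WIDTH +: WIDTH] selects WIDTH bits starting at bit i*WIDTH, counting upward
      $display("slice %0d = %h", i, bus[i*WIDTH +: WIDTH]);
    end
    // Slice 0 is the lowest nibble (C) and slice 2 is the highest (A)
    $finish;
  end
endmodule
```

The width on the right of `+:` must be a constant, which is why this form works with both a genvar and an ordinary loop variable as the base.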
- begin : block_name: Giving a name to the generate block is essential, especially for hierarchical referencing and debugging.
- i*WIDTH+:WIDTH: This is a common pattern for indexing into a wider data bus. It creates WIDTH-bit slices of the data_a and data_b signals based on the genvar i.
- Hierarchical access: each generated instance can be referenced through its block name and index, for example instances[0].adder_inst.sum.
if-else Generate: This construct allows you to conditionally instantiate different blocks of code based on a compile-time condition.
generate
  if (ENABLE_ADDER) begin : adder_block
    adder #( .WIDTH(WIDTH) ) adder_inst (
      .a(data_a),
      .b(data_b),
      .cin(carry_in),
      .sum(sum),
      .cout(carry_out)
    );
  end else begin : multiplier_block
    multiplier #( .WIDTH(WIDTH) ) multiplier_inst (
      .a(data_a),
      .b(data_b),
      .prod(product)
    );
  end
endgenerate
case Generate: Similar to if-else, but for multiple conditions.
generate
  case (OPERATION)
    ADD: begin : add_block
      // ... instantiation for addition ...
    end
    SUBTRACT: begin : sub_block
      // ... instantiation for subtraction ...
    end
    default: begin : default_block
      // ... default instantiation ...
    end
  endcase
endgenerate
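To make the case generate skeleton concrete, here is a minimal sketch in which each branch holds a single continuous assignment instead of a module instantiation. The module name op_select, its ports, and the integer encoding of OPERATION (0 for add, 1 for subtract) are assumptions made for this example.

```systemverilog
// Hypothetical example: choose one implementation at elaboration time.
module op_select #(
  parameter int WIDTH     = 8,
  parameter int OPERATION = 0   // Assumed encoding: 0 = add, 1 = subtract
) (
  input  logic [WIDTH-1:0] a,
  input  logic [WIDTH-1:0] b,
  output logic [WIDTH-1:0] result
);
  localparam int ADD      = 0;
  localparam int SUBTRACT = 1;

  generate
    case (OPERATION)
      ADD: begin : add_block
        assign result = a + b;
      end
      SUBTRACT: begin : sub_block
        assign result = a - b;
      end
      default: begin : default_block
        assign result = a;      // Any other value passes a through unchanged
      end
    endcase
  endgenerate
endmodule
```

Only the selected branch exists in the elaborated design, so `op_select #(.OPERATION(1))` contains a subtractor and nothing else.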
- Elaboration time: generate constructs and genvar loops are evaluated during the elaboration phase, not during simulation. This means the conditions and loop iterations must be known at compile time. You cannot use run-time signals to control generate blocks.
- Naming: always name generate blocks using begin : block_name. This is crucial for hierarchical referencing and debugging.
- Scope: genvar variables are only visible within the generate block.
- No assignment to genvars: You cannot assign values to a genvar inside the generate block. A genvar changes only through the initialization and iteration expressions of its for loop.
module memory_array #(
  parameter DEPTH = 256,
  parameter WIDTH = 8
) (
  // ... ports ...
);
  genvar i;
  generate
    for (i = 0; i < DEPTH; i++) begin : memory_instances
      memory_cell #( .WIDTH(WIDTH) ) mem_cell (
        // ... connections ...
      );
    end
  endgenerate
endmodule
This example creates an array of DEPTH memory cells, each of WIDTH bits.
By mastering generate blocks and genvars, you can create highly parameterized and reusable hardware designs, significantly improving your productivity and code maintainability. Now, let’s put this knowledge into practice with some exercises!
The dff module:
module dff (
  input logic clk,
  input logic rst,
  input logic enable,
  input logic d,
  output logic q
);
  always_ff @(posedge clk) begin
    if (rst) begin
      q <= 0; // Synchronous reset
    end else if (enable) begin
      q <= d; // Data is loaded only when enable is high
    end
  end
endmodule
The test bench for the dff module:
// Testbench to demonstrate the d_flip_flop
module d_flip_flop_tb;
  logic clk;
  logic rst;
  logic enable;
  logic d;
  logic q;
  dff dut ( // The module under test is named dff
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .d(d),
    .q(q)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk; // 10ns period
  end
  // Test sequence
  initial begin
    rst = 1;
    enable = 0;
    d = 0;
    #10 rst = 0; // Release reset
    d = 1;
    enable = 1;
    #10; // q should now be 1
    d = 0;
    enable = 1;
    #10; // q should now be 0
    enable = 0; // Disable the flip-flop
    d = 1;      // Change d, but q should remain unchanged
    #10;       // q should still be 0
    enable = 1; // Enable again
    #10;       // q should now be 1
    $display("Final value of q: %b", q);
    $finish;
  end
endmodule
The register module:
module register #(
  parameter WIDTH = 8 // Default width of 8 bits
) (
  input logic clk,
  input logic rst,
  input logic enable,
  input logic [WIDTH-1:0] d, // Data input, parameterized width
  output logic [WIDTH-1:0] q // Data output, parameterized width
);
  // Array of D flip-flops to form the register
  logic [WIDTH-1:0] q_internal; // Internal storage for the register
  genvar i;
  generate
    for (i = 0; i < WIDTH; i++) begin : flip_flops
      dff flip_flop_inst (
        .clk(clk),
        .rst(rst),
        .enable(enable),
        .d(d[i]),      // Connecting individual bits of d
        .q(q_internal[i]) // Connecting individual bits of q_internal
      );
    end
  endgenerate
  assign q = q_internal; // Assign the internal storage to the output
endmodule
The test bench for the register module:
// Testbench for the parameterized register
module register_tb;
  logic clk;
  logic rst;
  logic enable;
  logic [7:0] d; // 8-bit data for default instantiation
  logic [7:0] q;
  // Instantiating the register with the default width (8 bits)
  register reg8 (
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .d(d),
    .q(q)
  );
  // Instantiating the register with a different width (16 bits)
  logic [15:0] d16;
  logic [15:0] q16;
  register #( .WIDTH(16) ) reg16 ( // Parameter override
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .d(d16),
    .q(q16)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk;
  end
  // Test sequence
  initial begin
    rst = 1;
    enable = 0;
    d = 8'hAA; // Example data for 8-bit register
    d16 = 16'hBEEF; // Example data for 16-bit register
    #10 rst = 0;  // Release reset
    enable = 1;
    #10 d = 8'h55; // Change data for 8-bit register
    #10 d16 = 16'hDEAD; // Change data for 16-bit register
    #10 enable = 0; // Disable
    #10 enable = 1; // Enable again
    #10 $display("8-bit Register q: %h", q); // Should be 55
    #10 $display("16-bit Register q16: %h", q16); // Should be DEAD
    $finish;
  end
endmodule
The register_file module:
module register_file #(
  parameter DEPTH = 8,  // Number of registers (default 8)
  parameter WIDTH = 8   // Width of each register (inherited or specified)
) (
  input logic clk,
  input logic rst,
  input logic enable,
  input logic [$clog2(DEPTH)-1:0] write_addr, // Write address
  input logic [WIDTH-1:0] write_data,      // Write data
  input logic write_en,                  // Write enable
  input logic [$clog2(DEPTH)-1:0] read_addr1, // Read address 1
  output logic [WIDTH-1:0] read_data1,     // Read data 1
  input logic [$clog2(DEPTH)-1:0] read_addr2, // Read address 2
  output logic [WIDTH-1:0] read_data2      // Read data 2
);
  // Outputs of all registers, collected so the read ports can index them
  logic [WIDTH-1:0] reg_q [DEPTH];
  genvar i;
  generate
    for (i = 0; i < DEPTH; i++) begin : register_instances
      register #( .WIDTH(WIDTH) ) reg_inst ( // Parameterized register instance
        .clk(clk),
        .rst(rst),
        .enable(enable && write_en && (write_addr == i)), // Only the addressed register loads
        .d(write_data),
        .q(reg_q[i])
      );
    end
  endgenerate
  // Read logic (combinational) - Two independent read ports
  assign read_data1 = reg_q[read_addr1];
  assign read_data2 = reg_q[read_addr2];
endmodule
The test bench for the register_file module:
// Testbench for the parameterized register file
module register_file_tb;
  logic clk;
  logic rst;
  logic enable;
  logic [2:0] write_addr; // 8 registers so 3 bits for address
  logic [7:0] write_data;
  logic write_en;
  logic [2:0] read_addr1;
  logic [7:0] read_data1;
  logic [2:0] read_addr2;
  logic [7:0] read_data2;
  register_file rf (
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .write_addr(write_addr),
    .write_data(write_data),
    .write_en(write_en),
    .read_addr1(read_addr1),
    .read_data1(read_data1),
    .read_addr2(read_addr2),
    .read_data2(read_data2)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk;
  end
  // Test sequence
  initial begin
    rst = 1;
    enable = 0;
    write_en = 0;
    #10 rst = 0;
    enable = 1;
    write_addr = 3'h3;
    write_data = 8'hAA;
    write_en = 1;
    #10 write_en = 0;
    write_addr = 3'h5;
    write_data = 8'h55;
    write_en = 1;
    #10 write_en = 0;
    read_addr1 = 3'h3;
    read_addr2 = 3'h5;
    #10;
    $display("Read Data 1 (addr 3): %h", read_data1); // Should be AA
    $display("Read Data 2 (addr 5): %h", read_data2); // Should be 55
    $finish;
  end
endmodule
The counter module:
module counter #(
  parameter WIDTH = 8 // Default width of 8 bits
) (
  input logic clk,
  input logic rst,
  input logic enable,
  output logic [WIDTH-1:0] count
);
  logic [WIDTH-1:0] count_internal; // Internal storage for the counter
  always_ff @(posedge clk) begin
    if (rst) begin
      count_internal <= '0; // Reset to 0
    end else if (enable) begin
      count_internal <= count_internal + 1; // Increment on rising clock edge when enabled
    end
  end
  assign count = count_internal; // Assign internal value to output
endmodule
The test bench for the counter module:
// Testbench for the counter
module counter_tb;
  logic clk;
  logic rst;
  logic enable;
  logic [7:0] count; // 8-bit count for default instantiation
  // Instantiate the counter (default 8-bit width)
  counter counter_8bit (
  .clk(clk),
  .rst(rst),
  .enable(enable),
  .count(count)
  );
  // Instantiate a 16-bit counter to test parameter override
  logic [15:0] count_16bit;
  counter #(16) counter_16bit_inst ( // Override WIDTH to 16
  .clk(clk),
  .rst(rst),
  .enable(enable),
  .count(count_16bit)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk; // 10ns period
  end
  // Test sequence
  initial begin
    rst = 1;
    enable = 0;
    #10 rst = 0; // Release reset
    enable = 1;
    #10;      // count should be 1 (8-bit) and 1 (16-bit)
    #10;      // count should be 2 (8-bit) and 2 (16-bit)
    #10;      // count should be 3 (8-bit) and 3 (16-bit)
    $display("8-bit Count: %h", count);      // Should be 3
    $display("16-bit Count: %h", count_16bit); // Should be 3
    // Test overflow for 8-bit counter
    repeat (252) @(posedge clk); // Count up to 255 (3 + 252)
    @(negedge clk);              // Sample away from the active clock edge
    $display("8-bit Count (Overflow): %h", count); // Should be FF (255)
    @(negedge clk);
    $display("8-bit Count (After Overflow): %h", count); // Should be 00 (wrapped around)
    // Test overflow for 16-bit counter (it has been counting the whole time)
    wait (count_16bit == 16'hFFFF);
    $display("16-bit Count (Overflow): %h", count_16bit); // Should be FFFF (65535)
    @(negedge clk);
    @(negedge clk);
    $display("16-bit Count (After Overflow): %h", count_16bit); // Should be 0000 (wrapped around)
    $finish;
  end
endmodule
The program_counter module:
module program_counter #(
  parameter WIDTH = 8 // Default width of 8 bits
) (
  input logic clk,
  input logic rst,
  input logic enable,
  input logic load,          // Load a new PC value
  input logic [WIDTH-1:0] load_value, // Value to load
  output logic [WIDTH-1:0] pc      // Program counter output
);
  logic [WIDTH-1:0] pc_internal; // Internal storage for the program counter
  always_ff @(posedge clk) begin
    if (rst) begin
      pc_internal <= '0; // Reset to 0
    end else if (enable) begin
      if (load) begin
        pc_internal <= load_value; // Load new value
      end else begin
        pc_internal <= pc_internal + 1; // Increment PC
      end
    end
  end
  assign pc = pc_internal; // Assign internal value to output
endmodule
The test bench for program_counter:
// Testbench for program counter
module program_counter_tb;
  logic clk;
  logic rst;
  logic enable;
  logic load;
  logic [7:0] load_value;
  logic [7:0] pc;
  // Instantiate the program counter (default 8-bit width)
  program_counter pc_8bit (
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .load(load),
    .load_value(load_value),
    .pc(pc)
  );
    // Instantiate a 16-bit PC for testing parameter override.
  logic [15:0] load_value_16bit;
  logic [15:0] pc_16bit;
  program_counter #(16) pc_16bit_inst (
    .clk(clk),
    .rst(rst),
    .enable(enable),
    .load(load),
    .load_value(load_value_16bit),
    .pc(pc_16bit)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk;
  end
  // Test sequence
  initial begin
    rst = 1;
    enable = 0;
    load = 0;
    #10 rst = 0; // Release reset
    enable = 1;
    #10; // pc should now be 1
    load = 1;
    load_value = 8'hFF;
    load_value_16bit = 16'hFFFF;
    #10 load = 0; // Deactivate load; pc is now FF (8 bit) and FFFF (16 bit)
    $display("8-bit PC: %h", pc);       // Should be FF
    $display("16-bit PC: %h", pc_16bit); // Should be FFFF
    #10; // pc should now be 00 (8 bit) and 0000 (16 bit) because of overflow
    $display("8-bit PC (After Overflow): %h", pc);       // Should be 00
    $display("16-bit PC (After Overflow): %h", pc_16bit); // Should be 0000
    $finish;
  end
endmodule
The sign_extender module:
module sign_extender #(
  parameter IN_WIDTH = 16, // Input width (default 16 bits)
  parameter OUT_WIDTH = 32 // Output width (default 32 bits)
) (
  input logic [IN_WIDTH-1:0] in,
  output logic [OUT_WIDTH-1:0] out
);
  // Sign extension logic: replicate the most significant bit of the input
  // to fill the additional bits in the output.
  assign out = {{(OUT_WIDTH-IN_WIDTH){in[IN_WIDTH-1]}}, in};
endmodule
The test bench for the sign_extender module:
// Testbench for sign extender
module sign_extender_tb;
  logic [15:0] in;
  logic [31:0] out;
  // Instantiate the sign extender (default parameters)
  sign_extender se_16_to_32 (
  .in(in),
  .out(out)
  );
  // Instantiate a sign extender with different parameters (e.g., 8 to 32)
  logic [7:0] in_8;
  logic [31:0] out_8;
  sign_extender #(8, 32) se_8_to_32 (
  .in(in_8),
  .out(out_8)
  );
  initial begin
    // Test cases
    in = 16'h7FFF; // Positive number
    #10;
    $display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: 00007FFF
    in = 16'h8000; // Negative number (MSB is 1)
    #10;
    $display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: FFFF8000
    in = 16'hFFFF; // -1
    #10;
    $display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: FFFFFFFF
    in_8 = 8'h7F; // Positive number
    #10;
    $display("8-bit input: %h, 32-bit output (8 to 32): %h", in_8, out_8); // Expected: 0000007F
    in_8 = 8'h80; // Negative number
    #10;
    $display("8-bit input: %h, 32-bit output (8 to 32): %h", in_8, out_8); // Expected: FFFFFF80
    $finish;
  end
endmodule
Shift left (sll) or right (slr)
sll: The shift_left module:
module shift_left #(
  parameter WIDTH = 8, // Default width of 8 bits
  parameter SHIFT_AMOUNT = 1 // Default shift amount of 1
) (
  input logic [WIDTH-1:0] data_in,
  output logic [WIDTH-1:0] data_out
);
  // Shift left by SHIFT_AMOUNT (logical shift)
  assign data_out = data_in << SHIFT_AMOUNT;
endmodule
The test bench for the shift_left module:
// Testbench for shift left module
module shift_left_tb;
  logic [7:0] data_in;
  logic [7:0] data_out;
  // Instantiate the shift left module (default parameters)
  shift_left sl_8bit (
    .data_in(data_in),
    .data_out(data_out)
  );
  // Instantiate a shift left module with different parameters
  logic [15:0] data_in_16bit;
  logic [15:0] data_out_16bit;
  shift_left #(16, 2) sl_16bit ( // 16-bit width, shift by 2
    .data_in(data_in_16bit),
    .data_out(data_out_16bit)
  );
  initial begin
    // Test cases for 8-bit shift
    data_in = 8'h01; // 0000 0001
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0010 (shift by 1)
    data_in = 8'h80; // 1000 0000
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0000 (shift by 1 - logical)
    data_in = 8'h0F; // 0000 1111
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0001 1110 (shift by 1)
    // Test cases for 16-bit shift
    data_in_16bit = 16'h0001; // 0000 0000 0000 0001
    #10;
    $display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0100 (shift by 2)
    data_in_16bit = 16'h8000; // 1000 0000 0000 0000
    #10;
    $display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0000 (shift by 2 - logical)
    $finish;
  end
endmodule
slr: The shift_right module:
module shift_right #(
  parameter WIDTH = 8, // Default width of 8 bits
  parameter SHIFT_AMOUNT = 1 // Default shift amount of 1
) (
  input logic [WIDTH-1:0] data_in,
  output logic [WIDTH-1:0] data_out
);
  // Logical shift right by SHIFT_AMOUNT
  assign data_out = data_in >> SHIFT_AMOUNT;
endmodule
The test bench for the shift_right module:
// Testbench for shift right logical module
module shift_right_tb;
  logic [7:0] data_in;
  logic [7:0] data_out;
  // Instantiate the shift right logical module (default parameters)
  shift_right srl_8bit (
  .data_in(data_in),
  .data_out(data_out)
  );
  // Instantiate a shift right logical module with different parameters
  logic [15:0] data_in_16bit;
  logic [15:0] data_out_16bit;
  shift_right #(16, 2) srl_16bit ( // 16-bit width, shift by 2
  .data_in(data_in_16bit),
  .data_out(data_out_16bit)
  );
  initial begin
    // Test cases for 8-bit shift
    data_in = 8'h01; // 0000 0001
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0000 (shift by 1)
    data_in = 8'h80; // 1000 0000
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0100 0000 (shift by 1 - logical)
    data_in = 8'hFF; // 1111 1111
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0111 1111 (shift by 1)
    data_in = 8'h0F; // 0000 1111
    #10;
    $display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0111 (shift by 1)
    // Test cases for 16-bit shift
    data_in_16bit = 16'h0001; // 0000 0000 0000 0001
    #10;
    $display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0000 (shift by 2)
    data_in_16bit = 16'h8000; // 1000 0000 0000 0000
    #10;
    $display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0010 0000 0000 0000 (shift by 2 - logical)
    data_in_16bit = 16'hFFFF; // 1111 1111 1111 1111
    #10;
    $display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0011 1111 1111 1111 (shift by 2)
    $finish;
  end
endmodule
A crucial difference between logical and arithmetic shift operations exists, particularly when dealing with signed numbers.
Let’s break it down:
The key difference arises with right shifts of signed numbers.
| Shift Type | Left Shift | Right Shift | 
|---|---|---|
| Logical | Zeros in from the right | Zeros in from the left | 
| Arithmetic | Zeros in from the right (same as logical) | Sign bit (MSB) is copied and filled in from left | 
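The right-shift row of the table can be checked directly in simulation. This sketch uses SystemVerilog's `>>` (logical) and `>>>` (arithmetic) operators; the test bench name shift_compare_tb and the sample value are chosen only for illustration. Note that `>>>` copies the sign bit only when its left operand is signed. On an unsigned operand it behaves like `>>`.

```systemverilog
// Hypothetical test bench comparing logical and arithmetic right shifts.
module shift_compare_tb;
  logic        [7:0] u;         // Unsigned operand
  logic signed [7:0] s;         // Signed operand
  logic        [7:0] r_logical;
  logic signed [7:0] r_arith;

  initial begin
    u = 8'b1000_0100;
    s = 8'sb1000_0100;          // -124 as a signed 8-bit value
    r_logical = u >> 2;         // Zeros shifted in from the left
    r_arith   = s >>> 2;        // Sign bit (MSB) copied in from the left
    $display("logical   : %b", r_logical); // 00100001
    $display("arithmetic: %b", r_arith);   // 11100001 (-31)
    $finish;
  end
endmodule
```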
A simple Mealy FSM mealy_fsm module:
module mealy_fsm #(
  parameter NUM_STATES = 4 // Example: 4 states
) (
  input logic clk,
  input logic rst,
  input logic in,
  output logic out
);
  // Define the states (using an enum is good practice)
  typedef enum logic [1:0] { S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11 } state_type;
  state_type current_state, next_state;
  // State register (sequential logic)
  always_ff @(posedge clk) begin
    if (rst) begin
      current_state <= S0; // Reset to initial state (S0)
    end else begin
      current_state <= next_state;
    end
  end
  // Next state logic (combinational)
  always_comb begin
    next_state = current_state; // Default: stay in the current state
    case (current_state)
      S0: begin
        if (in) next_state = S1;
      end
      S1: begin
        if (in) next_state = S2;
      end
      S2: begin
        if (in) next_state = S3;
      end
      S3: begin
        if (in) next_state = S0;
      end
    endcase
  end
  // Output logic (combinational - Mealy output depends on current state *and* input)
  always_comb begin
    out = 0; // Default output
    case (current_state)
      S0: begin
        if (in) out = 1;
      end
      S1: begin
        if (in) out = 0;
      end
      S2: begin
        if (in) out = 1;
      end
      S3: begin
        if (in) out = 0;
      end
    endcase
  end
endmodule
The test bench for the simple Mealy FSM mealy_fsm module:
// Testbench for Mealy FSM
module mealy_fsm_tb;
  logic clk;
  logic rst;
  logic in;
  logic out;
  mealy_fsm fsm (
  .clk(clk),
  .rst(rst),
  .in(in),
  .out(out)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk;
  end
  // Test sequence
  initial begin
    rst = 1;
    in = 0;
    #10 rst = 0; // Release reset
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S1, 1, 0
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S2, 1, 1
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S3, 1, 0
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 1, 1
    in = 0; // Input 0
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 0, 0 (No state change because input is 0)
    $finish;
  end
endmodule
A simple Moore FSM moore_fsm module:
module moore_fsm #(
  parameter NUM_STATES = 4 // Example: 4 states
) (
  input logic clk,
  input logic rst,
  input logic in,
  output logic out
);
  // Define the states (using an enum is good practice)
  typedef enum logic [1:0] { S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11 } state_type;
  state_type current_state, next_state;
  // State register (sequential logic)
  always_ff @(posedge clk) begin
    if (rst) begin
      current_state <= S0; // Reset to initial state (S0)
    end else begin
      current_state <= next_state;
    end
  end
  // Next state logic (combinational)
  always_comb begin
    next_state = current_state; // Default: stay in the current state
    case (current_state)
      S0: begin
        if (in) next_state = S1;
      end
      S1: begin
        if (in) next_state = S2;
      end
      S2: begin
        if (in) next_state = S3;
      end
      S3: begin
        if (in) next_state = S0;
      end
    endcase
  end
  // Output logic (combinational - Moore output depends *only* on current state)
  always_comb begin
    out = 0; // Default output
    case (current_state)
      S0: out = 0;
      S1: out = 1;
      S2: out = 0;
      S3: out = 1;
    endcase
  end
endmodule
The test bench for the simple Moore FSM moore_fsm module:
// Testbench for Moore FSM
module moore_fsm_tb;
  logic clk;
  logic rst;
  logic in;
  logic out;
  moore_fsm fsm (
  .clk(clk),
  .rst(rst),
  .in(in),
  .out(out)
  );
  // Clock generation
  initial begin
    clk = 0;
    forever #5 clk = ~clk;
  end
  // Test sequence
  initial begin
    rst = 1;
    in = 0;
    #10 rst = 0; // Release reset
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S1, 1, 1
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S2, 1, 0
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S3, 1, 1
    in = 1; // Input 1
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 1, 0
    in = 0; // Input 0
    #10;
    $display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 0, 0 (No state change because input is 0)
    $finish;
  end
endmodule
Welcome to Computer Architecture! This course will delve into the fundamental principles governing how computers work at a hardware level. It’s not just about programming (though that’s related!), nor is it solely about circuit design (though that plays a role). Computer architecture sits at the intersection of hardware and software, defining the interface between them.
Think of it as the blueprint of a building, like our New Academic Building. Architects don’t lay every brick, nor do they decide how the occupants will use each room. Instead, they design the structure, layout, and systems (electrical, plumbing) that enable both construction and habitation. Similarly, computer architects define the fundamental organization and behavior of a computer system, enabling both hardware implementation and software execution.
Every computer, from your smartphone to a supercomputer, can be conceptually broken down into five main components:
These five components are interconnected by buses, which are sets of wires that carry data and control signals.

One of the most crucial concepts in computer architecture is the stored program concept. Before this, computers were often hardwired for specific tasks. Changing the program required rewiring the machine—a tedious and error-prone process.
The stored program concept, attributed to John von Neumann, revolutionized computing by storing both the instructions (the program) and the data in the computer’s memory. This allows for:
This concept is fundamental to how all modern computers operate.
While the von Neumann architecture is dominant, it’s important to understand its historical context and alternatives.
| Feature | Von Neumann (Princeton) Architecture | Harvard Architecture | 
|---|---|---|
| Memory | Single memory space for both instructions and data | Separate memory spaces for instructions and data | 
| Access | Instructions and data share the same memory bus | Instructions and data can be accessed simultaneously | 
| Advantages | Simpler design, more efficient use of memory | Faster instruction fetch, avoids bottlenecks | 
| Disadvantages | Potential bottleneck (von Neumann bottleneck) as both instructions and data compete for the same memory access | More complex design, requires separate memory modules | 
| Applications | General-purpose computers, PCs, laptops | Embedded systems, digital signal processors (DSPs) | 
The von Neumann bottleneck arises because both instructions and data must travel over the same bus to and from memory. This can limit performance, especially when the CPU needs to fetch instructions and data frequently. The Harvard architecture mitigates this by allowing parallel access to instruction and data memories.
(Diagrams comparing the two architectures)
While modern general-purpose computers primarily use variations of the von Neumann architecture (often with caching and other techniques to reduce the bottleneck), the Harvard architecture is still relevant in specialized applications where performance and parallelism are critical.
Additional readings for these architecture types:
Background reading on performance
Performance in computer architecture is a multifaceted concept, and there isn’t one single “best” metric. It’s often a balancing act between different factors, and the “right” performance measure depends on the specific application and priorities. Here’s a breakdown of key aspects:
1. Execution Time:
2. Throughput:
3. Latency:
4. Resource Utilization:
5. Power Consumption:
6. Cost:
The “Power Wall” refers to the increasing difficulty and impracticality of continuing to increase processor clock speeds to achieve performance gains. For many years, increasing clock speed was the primary driver of improved CPU performance. However, this approach has run into fundamental physical limitations, leading to the “power wall.”
The Problem:
As clock speeds increase, so does the power consumption of the processor. This increased power consumption manifests as heat. The relationship is roughly cubic: doubling the clock speed can increase power consumption by a factor of eight. This heat becomes increasingly difficult and expensive to dissipate. Think of it like trying to cool a rapidly boiling pot of water; at some point, you can’t add any more heat without it boiling over.
Consequences of Excessive Heat:
The Relationship Between Clock Speed and Power
The dynamic power of a digital circuit is commonly approximated as:
$$P = C V^2 f$$
Where:
- P is the dynamic power
- C is the capacitance of the circuit
- V is the voltage
- f is the frequency (clock speed)
Note that the voltage (V) is squared in this equation. This is crucial. To increase clock speed, you often need to increase the voltage to maintain stability. This means that the power increases quadratically with voltage. Since voltage often needs to be increased proportionally with frequency, you end up with a cubic relationship overall.
Why It’s Close to a Factor of Eight
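The factor of eight can be worked out from the dynamic power relation $P = C V^2 f$, under the simplifying assumption (an idealization, not a physical law) that voltage has to scale in direct proportion to frequency. Doubling both while holding the capacitance fixed gives:

```latex
\begin{aligned}
P_{\text{new}} &= C \,(2V)^2 \,(2f) \\
               &= 4 \cdot 2 \cdot C V^2 f \\
               &= 8\,P
\end{aligned}
```

If the voltage could stay constant while the frequency doubled, the same relation would give only a factor of two, which is why the voltage term dominates the argument.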
Important Caveats
In Summary
While not an absolute rule, the “factor of eight” is a good rule of thumb to illustrate the significant power challenges associated with increasing clock speeds. It highlights the need for innovative design techniques and power management strategies in modern computer architecture.
The Shift in Focus:
The power wall has forced a fundamental shift in how computer architects design processors. Instead of focusing solely on increasing clock speed, the emphasis has moved towards:
In summary: The power wall is a critical challenge in computer architecture. It signifies the limitations of simply increasing clock speeds to achieve performance gains. The industry has responded by shifting its focus towards multi-core processors, specialized hardware, architectural innovations, and power-efficient designs. Managing power consumption and heat dissipation has become a central concern for computer architects.
Today, we’ve laid the foundation for understanding the basic components and principles of computer architecture. Our next lecture will delve into the Instruction Set Architecture (ISA).
The ISA defines the set of instructions that a particular processor can understand and execute. It’s the interface between the hardware and the software. We’ll explore:
Understanding the ISA is crucial for writing efficient code, optimizing compiler design, and designing new processors. It’s the bridge between the high-level world of programming and the low-level world of hardware.