Parameters are fundamental for creating reusable and configurable hardware designs. They let you define constants whose values are fixed at compile time (more precisely, at elaboration) and that shape the behavior and structure of your modules. Think of them as named constants within a module’s scope: unlike variables, they are resolved before simulation begins and cannot change while it runs.
Let’s break down how to use parameters effectively in SystemVerilog.
You declare a parameter using the parameter keyword. The basic syntax is:
parameter [data_type] parameter_name = value;
data_type: Specifies the data type of the parameter. Common types include integer, real, time, and enumerated types. If omitted, the parameter takes its type from the value assigned to it.
parameter_name: The name you give to your parameter. Follow standard naming conventions (typically uppercase for parameters).
value: The constant value assigned to the parameter. This value can be a constant expression.
Some examples:
parameter WIDTH = 8; // An 8-bit wide value
parameter DEPTH = 256; // A depth value
parameter REAL_VAL = 3.14159; // A real value
parameter logic [7:0] DEFAULT_VALUE = 8'hAA; // An 8-bit logic value
parameter enum { STATE_IDLE, STATE_READ, STATE_WRITE } STATE = STATE_IDLE; // Enumerated type
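Once declared, you can use parameters anywhere within the module where a constant value is required. This includes port and signal widths, array dimensions, and the conditions and loop bounds of generate constructs.
For example, a parameterized adder using behavioral modeling: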
module adder #(parameter WIDTH = 8) (
input logic [WIDTH-1:0] a,
input logic [WIDTH-1:0] b,
input logic cin,
output logic [WIDTH-1:0] sum,
output logic cout
);
assign {cout, sum} = a + b + cin;
endmodule
Using the adder in other modules, like a test bench:
// code below would be in a test bench or another module's definition.
// Instantiating the adder with different widths:
adder #(.WIDTH(16)) adder16 (
.a(data_a),
.b(data_b),
.cin(carry_in),
.sum(sum16),
.cout(carry_out)
);
adder adder8 ( // Using the default WIDTH = 8
.a(data_c),
.b(data_d),
.cin(carry_in2),
.sum(sum8),
.cout(carry_out2)
);
The real power of parameters comes from the ability to override their values during module instantiation. This is done using the #(.parameter_name(value)) syntax, as shown in the adder example above. This allows you to reuse the same module with different configurations without modifying the module’s source code.
SystemVerilog also provides the keyword localparam. These are similar to parameters but cannot be overridden during instantiation. They are strictly local to the module in which they are defined. Use localparam for constants that should not be changed externally.
localparam DELAY = 2; // A delay value that should not be modified from outside
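A common use of localparam is to hold a constant derived from an overridable parameter. The sketch below is illustrative only: the simple_ram module, its ports, and the registered-address read scheme are assumptions made for this example, not part of the notes above.

```systemverilog
// Hypothetical example module (names are illustrative assumptions).
module simple_ram #(
    parameter int DEPTH = 256,  // Overridable: number of words
    parameter int WIDTH = 8     // Overridable: bits per word
) (
    input  logic                     clk,
    input  logic                     we,
    input  logic [$clog2(DEPTH)-1:0] addr,
    input  logic [WIDTH-1:0]         wdata,
    output logic [WIDTH-1:0]         rdata
);
    // Derived constant: follows DEPTH automatically and
    // cannot be overridden when the module is instantiated.
    localparam int ADDR_WIDTH = $clog2(DEPTH);

    logic [WIDTH-1:0]      mem [DEPTH];
    logic [ADDR_WIDTH-1:0] addr_q;  // Registered read address

    always_ff @(posedge clk) begin
        if (we) mem[addr] <= wdata;
        addr_q <= addr;
    end

    // Read data appears one cycle after the address is presented
    assign rdata = mem[addr_q];
endmodule
```

Overriding DEPTH at instantiation changes ADDR_WIDTH with it, while ADDR_WIDTH itself stays out of reach of the instantiating module.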
Generate blocks provide a mechanism for creating multiple instances of modules or code blocks based on compile-time conditions or loop iterations. This is essential for designing regular structures like arrays of processing elements, memory banks, or replicated logic. genvar is a special variable used exclusively within generate blocks as an index or iterator.
genvar Keyword: genvar declares an integer variable that is used as a loop counter or index within a generate block. It’s crucial to understand that genvar is not a regular variable; it exists only during the elaboration phase (before simulation) and is used to generate hardware instances. You cannot use a genvar outside of a generate block.
genvar i; // Declaring a genvar
generate Block: The generate block encloses the code that you want to replicate or conditionally instantiate. There are three main types of generate constructs:
for loop generate: Used for repetitive instantiation.
if-else generate: Used for conditional instantiation.
case generate: Used for multi-conditional instantiation (similar to a case statement).
for Loop Generate: This is the most common type. It’s used to create multiple instances of a module or block of code.
generate
for (genvar i = 0; i < N; i++) begin : instances // 'instances' is a generate block name (important!)
// Inside the loop, 'i' is used to create unique instances.
adder #( .WIDTH(WIDTH) ) adder_inst (
.a(data_a[i*WIDTH+:WIDTH]), // Using 'i' to index into a wider data bus
.b(data_b[i*WIDTH+:WIDTH]),
.cin(carry_in[i]),
.sum(sum[i*WIDTH+:WIDTH]),
.cout(carry_out[i])
);
end
endgenerate
begin : block_name: Giving a name to the generate block is essential, especially for hierarchical referencing and debugging.
i*WIDTH+:WIDTH: This is a common pattern for indexing into a wider data bus. It creates WIDTH-bit slices of the data_a and data_b signals based on the genvar i.
Hierarchical names: Each iteration gets its own scope, indexed by the genvar value, so a signal inside the first instance can be referenced as instances[0].adder_inst.sum.
if-else Generate: This construct allows you to conditionally instantiate different blocks of code based on a compile-time condition.
generate
if (ENABLE_ADDER) begin : adder_block
adder #( .WIDTH(WIDTH) ) adder_inst (
.a(data_a),
.b(data_b),
.cin(carry_in),
.sum(sum),
.cout(carry_out)
);
end else begin : multiplier_block
multiplier #( .WIDTH(WIDTH) ) multiplier_inst (
.a(data_a),
.b(data_b),
.prod(product)
);
end
endgenerate
case Generate: Similar to if-else, but for multiple conditions.
generate
case (OPERATION)
ADD: begin : add_block
// ... instantiation for addition ...
end
SUBTRACT: begin : sub_block
// ... instantiation for subtraction ...
end
default: begin : default_block
// ... default instantiation ...
end
endcase
endgenerate
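The skeleton above leaves the block bodies elided. As a minimal complete sketch of the same pattern, the hypothetical arith_unit module below selects its logic with a case generate. The module name, the integer encoding of OPERATION, and the zero default are assumptions made for illustration.

```systemverilog
// Hypothetical example: exactly one branch is elaborated into hardware.
module arith_unit #(
    parameter int WIDTH     = 8,
    parameter int OPERATION = 0   // Assumed encoding: 0 = add, 1 = subtract
) (
    input  logic [WIDTH-1:0] a,
    input  logic [WIDTH-1:0] b,
    output logic [WIDTH-1:0] result
);
    localparam int ADD      = 0;
    localparam int SUBTRACT = 1;

    generate
        case (OPERATION)
            ADD: begin : add_block
                assign result = a + b;
            end
            SUBTRACT: begin : sub_block
                assign result = a - b;
            end
            default: begin : default_block
                assign result = '0;  // Unknown operation: drive zeros
            end
        endcase
    endgenerate
endmodule
```

An instance such as arith_unit #(.WIDTH(16), .OPERATION(1)) would contain only the subtractor; the other branches never exist in the elaborated design.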
generate statements are evaluated during the elaboration phase, not during simulation. This means the conditions and loop iterations must be known at compile time. You cannot use run-time signals to control generate blocks.
Always name your generate blocks using begin : block_name. This is crucial for hierarchical referencing and debugging.
genvar variables are only visible within the generate block.
Assigning to genvars: You cannot assign values to a genvar inside the body of the generate block. It is updated only by the iteration step in the for loop header.
module memory_array #(
parameter DEPTH = 256,
parameter WIDTH = 8
) (
// ... ports ...
);
genvar i;
generate
for (i = 0; i < DEPTH; i++) begin : memory_instances
memory_cell #( .WIDTH(WIDTH) ) mem_cell (
// ... connections ...
);
end
endgenerate
endmodule
This example creates an array of DEPTH memory cells, each of WIDTH bits.
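The memory_array example leaves its ports and connections elided. For a generate-for loop that can be compiled as written, here is a sketch of a ripple-carry adder. The full_adder and ripple_adder modules and the stage block name are hypothetical, chosen only for this illustration.

```systemverilog
// Hypothetical 1-bit full adder used as the replicated cell.
module full_adder (
    input  logic a,
    input  logic b,
    input  logic cin,
    output logic sum,
    output logic cout
);
    assign sum  = a ^ b ^ cin;
    assign cout = (a & b) | (cin & (a ^ b));
endmodule

// WIDTH full adders chained through a carry vector.
module ripple_adder #(
    parameter int WIDTH = 4
) (
    input  logic [WIDTH-1:0] a,
    input  logic [WIDTH-1:0] b,
    input  logic             cin,
    output logic [WIDTH-1:0] sum,
    output logic             cout
);
    logic [WIDTH:0] carry;          // carry[i] feeds stage i
    assign carry[0] = cin;
    assign cout     = carry[WIDTH];

    for (genvar i = 0; i < WIDTH; i++) begin : stage
        full_adder fa (
            .a   (a[i]),
            .b   (b[i]),
            .cin (carry[i]),
            .sum (sum[i]),
            .cout(carry[i+1])       // Carry out of stage i enters stage i+1
        );
    end
endmodule
```

Because the block is named stage, the carry out of the third cell can be referenced hierarchically as stage[2].fa.cout.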
genvar and generate
By mastering generate blocks and genvars, you can create highly parameterized and reusable hardware designs, significantly improving your productivity and code maintainability. Now, let’s put this knowledge into practice with some exercises!
The dff module:
module dff (
input logic clk,
input logic rst,
input logic enable,
input logic d,
output logic q
);
always_ff @(posedge clk) begin
if (rst) begin
q <= 0; // Synchronous reset
end else if (enable) begin
q <= d; // Data is loaded only when enable is high
end
end
endmodule
The test bench for the dff module:
// Testbench to demonstrate the d_flip_flop
module d_flip_flop_tb;
logic clk;
logic rst;
logic enable;
logic d;
logic q;
dff dut ( // The module under test is named dff
.clk(clk),
.rst(rst),
.enable(enable),
.d(d),
.q(q)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk; // 10ns period
end
// Test sequence
initial begin
rst = 1;
enable = 0;
d = 0;
#10 rst = 0; // Release reset
d = 1;
enable = 1;
#10; // q should now be 1
d = 0;
enable = 1;
#10; // q should now be 0
enable = 0; // Disable the flip-flop
d = 1; // Change d, but q should remain unchanged
#10; // q should still be 0
enable = 1; // Enable again
#10; // q should now be 1
$display("Final value of q: %b", q);
$finish;
end
endmodule
The register module:
module register #(
parameter WIDTH = 8 // Default width of 8 bits
) (
input logic clk,
input logic rst,
input logic enable,
input logic [WIDTH-1:0] d, // Data input, parameterized width
output logic [WIDTH-1:0] q // Data output, parameterized width
);
// Array of D flip-flops to form the register
logic [WIDTH-1:0] q_internal; // Internal storage for the register
genvar i;
generate
for (i = 0; i < WIDTH; i++) begin : flip_flops
dff flip_flop_inst (
.clk(clk),
.rst(rst),
.enable(enable),
.d(d[i]), // Connecting individual bits of d
.q(q_internal[i]) // Connecting individual bits of q_internal
);
end
endgenerate
assign q = q_internal; // Assign the internal storage to the output
endmodule
The test bench for the register module:
// Testbench for the parameterized register
module register_tb;
logic clk;
logic rst;
logic enable;
logic [7:0] d; // 8-bit data for default instantiation
logic [7:0] q;
// Instantiating the register with the default width (8 bits)
register reg8 (
.clk(clk),
.rst(rst),
.enable(enable),
.d(d),
.q(q)
);
// Instantiating the register with a different width (16 bits)
logic [15:0] d16;
logic [15:0] q16;
register #( .WIDTH(16) ) reg16 ( // Parameter override
.clk(clk),
.rst(rst),
.enable(enable),
.d(d16),
.q(q16)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk;
end
// Test sequence
initial begin
rst = 1;
enable = 0;
d = 8'hAA; // Example data for 8-bit register
d16 = 16'hBEEF; // Example data for 16-bit register
#10 rst = 0; // Release reset
enable = 1;
#10 d = 8'h55; // Change data for 8-bit register
#10 d16 = 16'hDEAD; // Change data for 16-bit register
#10 enable = 0; // Disable
#10 enable = 1; // Enable again
#10 $display("8-bit Register q: %h", q); // Should be 55
#10 $display("16-bit Register q16: %h", q16); // Should be DEAD
$finish;
end
endmodule
The register_file module:
module register_file #(
parameter DEPTH = 8, // Number of registers (default 8)
parameter WIDTH = 8 // Width of each register (inherited or specified)
) (
input logic clk,
input logic rst,
input logic enable,
input logic [$clog2(DEPTH)-1:0] write_addr, // Write address
input logic [WIDTH-1:0] write_data, // Write data
input logic write_en, // Write enable
input logic [$clog2(DEPTH)-1:0] read_addr1, // Read address 1
output logic [WIDTH-1:0] read_data1, // Read data 1
input logic [$clog2(DEPTH)-1:0] read_addr2, // Read address 2
output logic [WIDTH-1:0] read_data2 // Read data 2
);
// Outputs of all registers, collected so the read ports can index them
logic [WIDTH-1:0] reg_q [DEPTH];
genvar i;
generate
for (i = 0; i < DEPTH; i++) begin : register_instances
register #( .WIDTH(WIDTH) ) reg_inst (
.clk(clk),
.rst(rst),
// Only the addressed register loads; all others hold their value
.enable(enable && write_en && (write_addr == i)),
.d(write_data),
.q(reg_q[i])
);
end
endgenerate
// Read logic (combinational) - Two independent read ports
assign read_data1 = reg_q[read_addr1];
assign read_data2 = reg_q[read_addr2];
endmodule
The test bench for the register_file module:
// Testbench for the parameterized register file
module register_file_tb;
logic clk;
logic rst;
logic enable;
logic [2:0] write_addr; // 8 registers so 3 bits for address
logic [7:0] write_data;
logic write_en;
logic [2:0] read_addr1;
logic [7:0] read_data1;
logic [2:0] read_addr2;
logic [7:0] read_data2;
register_file rf (
.clk(clk),
.rst(rst),
.enable(enable),
.write_addr(write_addr),
.write_data(write_data),
.write_en(write_en),
.read_addr1(read_addr1),
.read_data1(read_data1),
.read_addr2(read_addr2),
.read_data2(read_data2)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk;
end
// Test sequence
initial begin
rst = 1;
enable = 0;
write_en = 0;
#10 rst = 0;
enable = 1;
write_addr = 3'h3;
write_data = 8'hAA;
write_en = 1;
#10 write_en = 0;
write_addr = 3'h5;
write_data = 8'h55;
write_en = 1;
#10 write_en = 0;
read_addr1 = 3'h3;
read_addr2 = 3'h5;
#10;
$display("Read Data 1 (addr 3): %h", read_data1); // Should be AA
$display("Read Data 2 (addr 5): %h", read_data2); // Should be 55
$finish;
end
endmodule
The counter module:
module counter #(
parameter WIDTH = 8 // Default width of 8 bits
) (
input logic clk,
input logic rst,
input logic enable,
output logic [WIDTH-1:0] count
);
logic [WIDTH-1:0] count_internal; // Internal storage for the counter
always_ff @(posedge clk) begin
if (rst) begin
count_internal <= '0; // Reset to 0
end else if (enable) begin
count_internal <= count_internal + 1; // Increment on rising clock edge when enabled
end
end
assign count = count_internal; // Assign internal value to output
endmodule
The test bench for the counter module:
// Testbench for the counter
module counter_tb;
logic clk;
logic rst;
logic enable;
logic [7:0] count; // 8-bit count for default instantiation
// Instantiate the counter (default 8-bit width)
counter counter_8bit (
.clk(clk),
.rst(rst),
.enable(enable),
.count(count)
);
// Instantiate a 16-bit counter to test parameter override
logic [15:0] count_16bit;
counter #(16) counter_16bit_inst ( // Override WIDTH to 16
.clk(clk),
.rst(rst),
.enable(enable),
.count(count_16bit)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk; // 10ns period
end
// Test sequence
initial begin
rst = 1;
enable = 0;
#10 rst = 0; // Release reset
enable = 1;
#10; // count should be 1 (8-bit) and 1 (16-bit)
#10; // count should be 2 (8-bit) and 2 (16-bit)
#10; // count should be 3 (8-bit) and 3 (16-bit)
$display("8-bit Count: %h", count); // Should be 3
$display("16-bit Count: %h", count_16bit); // Should be 3
// Test overflow for 8-bit counter
repeat (252) @(posedge clk); // 3 + 252 = 255
@(negedge clk); // Sample away from the active clock edge
$display("8-bit Count (Overflow): %h", count); // Should be FF (255)
@(negedge clk);
$display("8-bit Count (After Overflow): %h", count); // Should be 00 (wrapped around)
// Test overflow for 16-bit counter (it has kept counting and is at 0100 hex = 256 here)
repeat (65279) @(posedge clk); // 256 + 65279 = 65535
@(negedge clk);
$display("16-bit Count (Overflow): %h", count_16bit); // Should be FFFF (65535)
@(negedge clk);
$display("16-bit Count (After Overflow): %h", count_16bit); // Should be 0000 (wrapped around)
$finish;
end
endmodule
The program_counter module:
module program_counter #(
parameter WIDTH = 8 // Default width of 8 bits
) (
input logic clk,
input logic rst,
input logic enable,
input logic load, // Load a new PC value
input logic [WIDTH-1:0] load_value, // Value to load
output logic [WIDTH-1:0] pc // Program counter output
);
logic [WIDTH-1:0] pc_internal; // Internal storage for the program counter
always_ff @(posedge clk) begin
if (rst) begin
pc_internal <= '0; // Reset to 0
end else if (enable) begin
if (load) begin
pc_internal <= load_value; // Load new value
end else begin
pc_internal <= pc_internal + 1; // Increment PC
end
end
end
assign pc = pc_internal; // Assign internal value to output
endmodule
The test bench for program_counter:
// Testbench for program counter
module program_counter_tb;
logic clk;
logic rst;
logic enable;
logic load;
logic [7:0] load_value;
logic [7:0] pc;
// Instantiate the program counter (default 8-bit width)
program_counter pc_8bit (
.clk(clk),
.rst(rst),
.enable(enable),
.load(load),
.load_value(load_value),
.pc(pc)
);
// Instantiate a 16-bit PC for testing parameter override.
logic [15:0] load_value_16bit;
logic [15:0] pc_16bit;
program_counter #(16) pc_16bit_inst (
.clk(clk),
.rst(rst),
.enable(enable),
.load(load),
.load_value(load_value_16bit),
.pc(pc_16bit)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk;
end
// Test sequence
initial begin
rst = 1;
enable = 0;
load = 0;
#10 rst = 0; // Release reset
enable = 1;
#10; // pc should now be 1
load = 1;
load_value = 8'hFF;
load_value_16bit = 16'hFFFF;
#10 load = 0; // Deactivate load; pc is now FF (8 bit) and FFFF (16 bit)
$display("8-bit PC: %h", pc); // Should be FF
$display("16-bit PC: %h", pc_16bit); // Should be FFFF
#10; // pc wraps to 00 (8 bit) and 0000 (16 bit) because of overflow
$display("8-bit PC after overflow: %h", pc); // Should be 00
$display("16-bit PC after overflow: %h", pc_16bit); // Should be 0000
$finish;
end
endmodule
The sign_extender module:
module sign_extender #(
parameter IN_WIDTH = 16, // Input width (default 16 bits)
parameter OUT_WIDTH = 32 // Output width (default 32 bits)
) (
input logic [IN_WIDTH-1:0] in,
output logic [OUT_WIDTH-1:0] out
);
// Sign extension logic: replicate the most significant bit of the input
// to fill the additional bits in the output.
assign out = {{(OUT_WIDTH-IN_WIDTH){in[IN_WIDTH-1]}}, in};
endmodule
The test bench for the sign_extender module:
// Testbench for sign extender
module sign_extender_tb;
logic [15:0] in;
logic [31:0] out;
// Instantiate the sign extender (default parameters)
sign_extender se_16_to_32 (
.in(in),
.out(out)
);
// Instantiate a sign extender with different parameters (e.g., 8 to 32)
logic [7:0] in_8;
logic [31:0] out_8;
sign_extender #(8, 32) se_8_to_32 (
.in(in_8),
.out(out_8)
);
initial begin
// Test cases
in = 16'h7FFF; // Positive number
#10;
$display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: 00007FFF
in = 16'h8000; // Negative number (MSB is 1)
#10;
$display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: FFFF8000
in = 16'hFFFF; // -1
#10;
$display("16-bit input: %h, 32-bit output: %h", in, out); // Expected: FFFFFFFF
in_8 = 8'h7F; // Positive number
#10;
$display("8-bit input: %h, 32-bit output (8 to 32): %h", in_8, out_8); // Expected: 0000007F
in_8 = 8'h80; // Negative number
#10;
$display("8-bit input: %h, 32-bit output (8 to 32): %h", in_8, out_8); // Expected: FFFFFF80
$finish;
end
endmodule
Shift Left (sll) or Right (srl)
sll
The shift_left module:
module shift_left #(
parameter WIDTH = 8, // Default width of 8 bits
parameter SHIFT_AMOUNT = 1 // Default shift amount of 1
) (
input logic [WIDTH-1:0] data_in,
output logic [WIDTH-1:0] data_out
);
// Shift left by SHIFT_AMOUNT (logical shift)
assign data_out = data_in << SHIFT_AMOUNT;
endmodule
The test bench for the shift_left module:
// Testbench for shift left module
module shift_left_tb;
logic [7:0] data_in;
logic [7:0] data_out;
// Instantiate the shift left module (default parameters)
shift_left sl_8bit (
.data_in(data_in),
.data_out(data_out)
);
// Instantiate a shift left module with different parameters
logic [15:0] data_in_16bit;
logic [15:0] data_out_16bit;
shift_left #(16, 2) sl_16bit ( // 16-bit width, shift by 2
.data_in(data_in_16bit),
.data_out(data_out_16bit)
);
initial begin
// Test cases for 8-bit shift
data_in = 8'h01; // 0000 0001
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0010 (shift by 1)
data_in = 8'h80; // 1000 0000
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0000 (shift by 1 - logical)
data_in = 8'h0F; // 0000 1111
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0001 1110 (shift by 1)
// Test cases for 16-bit shift
data_in_16bit = 16'h0001; // 0000 0000 0000 0001
#10;
$display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0100 (shift by 2)
data_in_16bit = 16'h8000; // 1000 0000 0000 0000
#10;
$display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0000 (shift by 2 - logical)
$finish;
end
endmodule
srl
The shift_right module:
module shift_right #(
parameter WIDTH = 8, // Default width of 8 bits
parameter SHIFT_AMOUNT = 1 // Default shift amount of 1
) (
input logic [WIDTH-1:0] data_in,
output logic [WIDTH-1:0] data_out
);
// Logical shift right by SHIFT_AMOUNT
assign data_out = data_in >> SHIFT_AMOUNT;
endmodule
The test bench for the shift_right module:
// Testbench for shift right logical module
module shift_right_tb;
logic [7:0] data_in;
logic [7:0] data_out;
// Instantiate the shift right logical module (default parameters)
shift_right srl_8bit (
.data_in(data_in),
.data_out(data_out)
);
// Instantiate a shift right logical module with different parameters
logic [15:0] data_in_16bit;
logic [15:0] data_out_16bit;
shift_right #(16, 2) srl_16bit ( // 16-bit width, shift by 2
.data_in(data_in_16bit),
.data_out(data_out_16bit)
);
initial begin
// Test cases for 8-bit shift
data_in = 8'h01; // 0000 0001
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0000 (shift by 1)
data_in = 8'h80; // 1000 0000
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0100 0000 (shift by 1 - logical)
data_in = 8'hFF; // 1111 1111
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0111 1111 (shift by 1)
data_in = 8'h0F; // 0000 1111
#10;
$display("8-bit input: %h, output: %h", data_in, data_out); // Expected: 0000 0111 (shift by 1)
// Test cases for 16-bit shift
data_in_16bit = 16'h0001; // 0000 0000 0000 0001
#10;
$display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0000 0000 0000 0000 (shift by 2)
data_in_16bit = 16'h8000; // 1000 0000 0000 0000
#10;
$display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0010 0000 0000 0000 (shift by 2 - logical)
data_in_16bit = 16'hFFFF; // 1111 1111 1111 1111
#10;
$display("16-bit input: %h, output: %h", data_in_16bit, data_out_16bit); // Expected: 0011 1111 1111 1111 (shift by 2)
$finish;
end
endmodule
Logical and arithmetic shifts differ in one crucial way, and the difference matters when dealing with signed numbers. Left shifts behave identically in both cases. The key difference arises with right shifts of signed numbers:
Shift Type | Left Shift | Right Shift |
---|---|---|
Logical | Zeros in from the right | Zeros in from the left |
Arithmetic | Zeros in from the right (same as logical) | Sign bit (MSB) is copied and filled in from left |
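In SystemVerilog, >> and << are the logical shift operators and >>> and <<< are the arithmetic ones. The short simulation sketch below illustrates the table; the shift_demo module and its variable names are made up for this example. Note that >>> copies the sign bit only when its left operand is declared signed.

```systemverilog
// Hypothetical demo module comparing logical and arithmetic shifts.
module shift_demo;
    logic signed [7:0] s = 8'sb1111_0000;  // -16 as a signed value
    logic        [7:0] u = 8'b1111_0000;   // 240 as an unsigned value

    initial begin
        $display("s >>  2 = %b", s >>  2);  // Logical right: 00111100
        $display("s >>> 2 = %b", s >>> 2);  // Arithmetic right, signed: 11111100
        $display("u >>> 2 = %b", u >>> 2);  // Arithmetic right, unsigned: 00111100
        $display("s <<< 2 = %b", s <<< 2);  // Arithmetic left: 11000000
        $finish;
    end
endmodule
```

The third line is the common pitfall: applying >>> to an unsigned operand fills with zeros, exactly like >>.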
A simple Mealy FSM mealy_fsm module:
module mealy_fsm #(
parameter NUM_STATES = 4 // Example: 4 states
) (
input logic clk,
input logic rst,
input logic in,
output logic out
);
// Define the states (using an enum is good practice)
typedef enum logic [1:0] { S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11 } state_type;
state_type current_state, next_state;
// State register (sequential logic)
always_ff @(posedge clk) begin
if (rst) begin
current_state <= S0; // Reset to initial state (S0)
end else begin
current_state <= next_state;
end
end
// Next state logic (combinational)
always_comb begin
next_state = current_state; // Default: stay in the current state
case (current_state)
S0: begin
if (in) next_state = S1;
end
S1: begin
if (in) next_state = S2;
end
S2: begin
if (in) next_state = S3;
end
S3: begin
if (in) next_state = S0;
end
endcase
end
// Output logic (combinational - Mealy output depends on current state *and* input)
always_comb begin
out = 0; // Default output
case (current_state)
S0: begin
if (in) out = 1;
end
S1: begin
if (in) out = 0;
end
S2: begin
if (in) out = 1;
end
S3: begin
if (in) out = 0;
end
endcase
end
endmodule
The test bench for the simple Mealy FSM mealy_fsm module:
// Testbench for Mealy FSM
module mealy_fsm_tb;
logic clk;
logic rst;
logic in;
logic out;
mealy_fsm fsm (
.clk(clk),
.rst(rst),
.in(in),
.out(out)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk;
end
// Test sequence
initial begin
rst = 1;
in = 0;
#10 rst = 0; // Release reset
in = 1; // Input 1
#10;
// The Mealy output is a function of the state just entered and the current input
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S1, 1, 0
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S2, 1, 1
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S3, 1, 0
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 1, 1
in = 0; // Input 0
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 0, 0 (No state change because input is 0)
$finish;
end
endmodule
A simple Moore FSM moore_fsm module:
module moore_fsm #(
parameter NUM_STATES = 4 // Example: 4 states
) (
input logic clk,
input logic rst,
input logic in,
output logic out
);
// Define the states (using an enum is good practice)
typedef enum logic [1:0] { S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11 } state_type;
state_type current_state, next_state;
// State register (sequential logic)
always_ff @(posedge clk) begin
if (rst) begin
current_state <= S0; // Reset to initial state (S0)
end else begin
current_state <= next_state;
end
end
// Next state logic (combinational)
always_comb begin
next_state = current_state; // Default: stay in the current state
case (current_state)
S0: begin
if (in) next_state = S1;
end
S1: begin
if (in) next_state = S2;
end
S2: begin
if (in) next_state = S3;
end
S3: begin
if (in) next_state = S0;
end
endcase
end
// Output logic (combinational - Moore output depends *only* on current state)
always_comb begin
out = 0; // Default output
case (current_state)
S0: out = 0;
S1: out = 1;
S2: out = 0;
S3: out = 1;
endcase
end
endmodule
The test bench for the simple Moore FSM moore_fsm module:
// Testbench for Moore FSM
module moore_fsm_tb;
logic clk;
logic rst;
logic in;
logic out;
moore_fsm fsm (
.clk(clk),
.rst(rst),
.in(in),
.out(out)
);
// Clock generation
initial begin
clk = 0;
forever #5 clk = ~clk;
end
// Test sequence
initial begin
rst = 1;
in = 0;
#10 rst = 0; // Release reset
in = 1; // Input 1
#10;
// name() prints the enum label; a bare %s would print the raw state encoding
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S1, 1, 1
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S2, 1, 0
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S3, 1, 1
in = 1; // Input 1
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 1, 0
in = 0; // Input 0
#10;
$display("State: %s, Input: %b, Output: %b", fsm.current_state.name(), in, out); // Expected: S0, 0, 0 (No state change because input is 0)
$finish;
end
endmodule
Welcome to Computer Architecture! This course will delve into the fundamental principles governing how computers work at a hardware level. It’s not just about programming (though that’s related!), nor is it solely about circuit design (though that plays a role). Computer architecture sits at the intersection of hardware and software, defining the interface between them.
Think of it as the blueprint of a building, like our New Academic Building. Architects don’t lay every brick, nor do they decide how the occupants will use each room. Instead, they design the structure, layout, and systems (electrical, plumbing) that enable both construction and habitation. Similarly, computer architects define the fundamental organization and behavior of a computer system, enabling both hardware implementation and software execution.
Every computer, from your smartphone to a supercomputer, can be conceptually broken down into five main components: input, output, memory, datapath, and control.
These five components are interconnected by buses, which are sets of wires that carry data and control signals.
One of the most crucial concepts in computer architecture is the stored program concept. Before this, computers were often hardwired for specific tasks. Changing the program required rewiring the machine—a tedious and error-prone process.
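The stored program concept, attributed to John von Neumann, revolutionized computing by storing both the instructions (the program) and the data in the computer’s memory. This allows a machine to be reprogrammed simply by loading different instructions into memory, with no rewiring required.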
This concept is fundamental to how all modern computers operate.
While the von Neumann architecture is dominant, it’s important to understand its historical context and alternatives.
Feature | Von Neumann (Princeton) Architecture | Harvard Architecture |
---|---|---|
Memory | Single memory space for both instructions and data | Separate memory spaces for instructions and data |
Access | Instructions and data share the same memory bus | Instructions and data can be accessed simultaneously |
Advantages | Simpler design, more efficient use of memory | Faster instruction fetch, avoids bottlenecks |
Disadvantages | Potential bottleneck (von Neumann bottleneck) as both instructions and data compete for the same memory access | More complex design, requires separate memory modules |
Applications | General-purpose computers, PCs, laptops | Embedded systems, digital signal processors (DSPs) |
The von Neumann bottleneck arises because both instructions and data must travel over the same bus to and from memory. This can limit performance, especially when the CPU needs to fetch instructions and data frequently. The Harvard architecture mitigates this by allowing parallel access to instruction and data memories.
(Diagrams comparing the two architectures)
While modern general-purpose computers primarily use variations of the von Neumann architecture (often with caching and other techniques to reduce the bottleneck), the Harvard architecture is still relevant in specialized applications where performance and parallelism are critical.
Background reading on performance
Performance in computer architecture is a multifaceted concept, and there isn’t one single “best” metric. It’s often a balancing act between different factors, and the “right” performance measure depends on the specific application and priorities. Here’s a breakdown of key aspects:
1. Execution Time:
2. Throughput:
3. Latency:
4. Resource Utilization:
5. Power Consumption:
6. Cost:
The “Power Wall” refers to the increasing difficulty and impracticality of continuing to increase processor clock speeds to achieve performance gains. For many years, increasing clock speed was the primary driver of improved CPU performance. However, this approach has run into fundamental physical limitations, leading to the “power wall.”
The Problem:
As clock speeds increase, so does the power consumption of the processor. This increased power consumption manifests as heat. The relationship is roughly cubic: doubling the clock speed can increase power consumption by a factor of eight. This heat becomes increasingly difficult and expensive to dissipate. Think of it like trying to cool a rapidly boiling pot of water; at some point, you can’t add any more heat without it boiling over.
Consequences of Excessive Heat:
The Relationship Between Clock Speed and Power
The dynamic power consumed by a processor is approximately:
$$P = C V^2 f$$
Where:
C is the capacitance of the circuit
V is the voltage
f is the frequency (clock speed)
Notice that the voltage (V) is squared in this equation. This is crucial. To increase clock speed, you often need to increase the voltage to maintain stability. This means that the power increases quadratically with voltage. Since voltage often needs to be increased proportionally with frequency, you end up with a cubic relationship overall.
Why It’s Close to a Factor of Eight
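As a worked illustration, suppose the clock frequency doubles and, as an idealizing assumption, the voltage must double along with it while the capacitance stays fixed. Writing P for the original power and P' for the new power (symbols introduced here for the example):

```latex
P' = C \,(2V)^2 \,(2f) = C \cdot 4V^2 \cdot 2f = 8\, C V^2 f = 8P
```

The factor of 4 comes from the squared voltage term and the remaining factor of 2 from the frequency itself. In practice the voltage rarely needs to scale fully in proportion to frequency, so the real increase is usually somewhat below eight.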
Important Caveats
In Summary
While not an absolute rule, the “factor of eight” is a good rule of thumb to illustrate the significant power challenges associated with increasing clock speeds. It highlights the need for innovative design techniques and power management strategies in modern computer architecture.
The Shift in Focus:
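The power wall has forced a fundamental shift in how computer architects design processors. Instead of focusing solely on increasing clock speed, the emphasis has moved towards multi-core processors, specialized hardware, architectural innovations, and power-efficient designs.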
In summary: The power wall is a critical challenge in computer architecture. It signifies the limitations of simply increasing clock speeds to achieve performance gains. The industry has responded by shifting its focus towards multi-core processors, specialized hardware, architectural innovations, and power-efficient designs. Managing power consumption and heat dissipation has become a central concern for computer architects.
Today, we’ve laid the foundation for understanding the basic components and principles of computer architecture. Our next lecture will delve into the Instruction Set Architecture (ISA).
The ISA defines the set of instructions that a particular processor can understand and execute. It’s the interface between the hardware and the software. We’ll explore:
Understanding the ISA is crucial for writing efficient code, optimizing compiler design, and designing new processors. It’s the bridge between the high-level world of programming and the low-level world of hardware.