Courses & Projects by Rob Marano

Assignment 13

<5 points>

Homework Point Scheme

Total points	Explanation
0	Not handed in
1	Handed in late
2	Handed in on time, but not every problem is fully worked through with its solution clearly identified
3	Handed in on time, each problem answered with a boxed answer and a clearly worked-through solution, but fewer than a majority of problems answered correctly
4	Handed in on time, majority of problems answered correctly, each solution boxed clearly, and each problem fully worked through
5	Handed in on time, every problem answered correctly, every solution boxed clearly, and every problem fully worked through.

Reading

Notes: Week 13 Notes
Textbook: Chapter 5 (Memory Hierarchy) in Patterson & Hennessy.

Problem Set

Part 1: Cache Addressing and Geometry

1. Cache Fields and Configuration

For a direct-mapped cache design with a 32-bit physical address, the following bits of the address are used to access the cache:

Tag: 31–10
Index: 9–5
Offset: 4–0

a) What is the cache block size (in words)?
b) How many entries (blocks) does the cache have?
c) What is the ratio of the total bits required for such a cache implementation to the data storage bits? (Assume 1 valid bit per block and no dirty bit.)
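Hint: the widths of the tag, index, and offset fields fix the cache geometry. The following is a minimal sketch of that relationship, assuming byte addressing and 4-byte words. The function name `cache_geometry` and the field widths in the example call are hypothetical and deliberately differ from the values in this problem.

```python
def cache_geometry(address_bits, index_bits, offset_bits, bytes_per_word=4):
    """Derive direct-mapped cache geometry from address field widths.

    Assumptions: byte addressing, 1 valid bit per block, no dirty bit.
    """
    tag_bits = address_bits - index_bits - offset_bits
    block_bytes = 2 ** offset_bits   # the offset selects a byte within a block
    entries = 2 ** index_bits        # the index selects one block
    data_bits_per_block = block_bytes * 8
    total_bits_per_block = data_bits_per_block + tag_bits + 1  # +1 valid bit
    return {
        "tag_bits": tag_bits,
        "block_words": block_bytes // bytes_per_word,
        "entries": entries,
        "data_bits": entries * data_bits_per_block,
        "total_bits": entries * total_bits_per_block,
        "ratio": total_bits_per_block / data_bits_per_block,
    }


# Hypothetical layout: 32-bit address, index = bits 9-4, offset = bits 3-0.
example = cache_geometry(address_bits=32, index_bits=6, offset_bits=4)
print(example)
```

Because every block carries the same overhead, the total-to-data ratio can be computed per block; the number of entries cancels out.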

Part 2: Cache Access Tracing

2. Direct-Mapped Cache Access

Below is a list of 32-bit memory address references, given as word addresses (not byte addresses): 0x03, 0xb4, 0x2b, 0x02, 0xbf, 0x58, 0xbe, 0x0e, 0xb5, 0x2c, 0xba, 0xfd

For each of these references, identify the binary address, the tag, and the index given a direct-mapped cache with 16 one-word blocks. Also state whether each reference is a hit or a miss, assuming the cache is initially empty. Present your results in a trace table.
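Hint: a trace like this can be checked mechanically. The sketch below simulates a direct-mapped cache with one-word blocks and word addresses, so the index is the address modulo the number of blocks and the tag is the remaining upper bits. The function name `trace_direct_mapped`, the 8-block cache size, and the sample address list are hypothetical illustrations, not the values from this problem.

```python
def trace_direct_mapped(word_addresses, num_blocks):
    """Trace word-address references through a direct-mapped cache.

    Assumptions: one-word blocks, num_blocks is a power of two,
    and the cache starts empty.
    """
    cache = {}  # maps index -> tag currently stored in that block
    rows = []
    for addr in word_addresses:
        index = addr % num_blocks   # low-order bits of the word address
        tag = addr // num_blocks    # remaining high-order bits
        hit = cache.get(index) == tag
        cache[index] = tag          # on a miss, the block is replaced
        rows.append((addr, format(addr, "08b"), tag, index,
                     "hit" if hit else "miss"))
    return rows


# Hypothetical reference stream on a hypothetical 8-block cache.
sample = [22, 26, 22, 26, 16, 3, 16, 18, 16]
rows = trace_direct_mapped(sample, num_blocks=8)
for addr, binary, tag, index, result in rows:
    print(f"{addr:3d}  {binary}  tag={tag}  index={index}  {result}")
```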

Part 3: Average Memory Access Time and CPI

3. Cache Performance and Bottlenecks

Assume that main memory accesses take 70 ns and that memory accesses represent 36% of all instructions executing in a pipeline. The following table shows data for L1 caches attached to each of two processors, P1 and P2.

Processor	L1 Size	L1 Miss Rate	L1 Hit Time
P1	2 KB	8.0%	0.66 ns
P2	4 KB	6.0%	0.90 ns

a) Assuming that the L1 hit time completely determines the clock cycle time for P1 and P2, what are their respective clock rates in GHz?
b) What is the Average Memory Access Time (AMAT) in nanoseconds for P1 and P2?
c) Assuming a base CPI of 1.0 without any memory stalls, what is the total effective CPI for P1 and P2? Which processor is ultimately faster? Show your total execution time calculation.
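Hint: the quantities above are linked by AMAT = hit time + miss rate × miss penalty, and effective CPI = base CPI + memory stall cycles per instruction. The sketch below is one possible way to organize the arithmetic under stated assumptions: the miss penalty is the main memory access time rounded up to whole clock cycles, and `accesses_per_instr` is a parameter you must choose yourself (for example, whether instruction fetches are counted in addition to data accesses). The function name and all numbers in the example call are hypothetical and are not the values for P1 or P2.

```python
import math


def cache_performance(hit_time_ns, miss_rate, memory_ns,
                      accesses_per_instr, base_cpi=1.0):
    """Return (clock rate in GHz, AMAT in ns, effective CPI, ns per instruction).

    Assumptions: the L1 hit time equals the clock cycle time, and the
    miss penalty is the memory access time rounded up to whole cycles.
    """
    cycle_ns = hit_time_ns
    clock_ghz = 1.0 / cycle_ns                        # 1 / ns = GHz
    amat_ns = hit_time_ns + miss_rate * memory_ns
    penalty_cycles = math.ceil(memory_ns / cycle_ns)
    cpi = base_cpi + accesses_per_instr * miss_rate * penalty_cycles
    time_per_instr_ns = cpi * cycle_ns                # compare this across designs
    return clock_ghz, amat_ns, cpi, time_per_instr_ns


# Hypothetical processor: 0.5 ns hit time, 5% miss rate, 100 ns memory,
# 1.3 memory accesses per instruction.
clock_ghz, amat_ns, cpi, time_ns = cache_performance(0.5, 0.05, 100.0, 1.3)
print(clock_ghz, amat_ns, cpi, time_ns)
```

Comparing time per instruction (CPI × cycle time) rather than CPI alone matters here, because the two processors run at different clock rates.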

Submission

Submit your answers as a PDF or Markdown file via the Microsoft Teams assignment. Show all your mathematical work clearly for full credit.
