





A four-issue superscalar could execute 400 instructions during a cache miss!
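The 400-instruction figure is an issue-width times latency product. The slide does not state the miss latency; the sketch below assumes 100 cycles, which is the value that reproduces 400 for a four-issue machine. The function name is illustrative only.

```python
def issue_slots_lost(issue_width: int, miss_cycles: int) -> int:
    """Upper bound on instructions that could have issued during one miss."""
    return issue_width * miss_cycles


# Assumed numbers: 4-wide issue, 100-cycle miss (latency not given on the slide).
print(issue_slots_lost(issue_width=4, miss_cycles=100))  # 4 * 100 = 400
```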











Due to cost

Due to size of DRAM

Due to cost and wire delays (wires on-chip cost much less, and are faster)





















Advantage is low power because only hit data is accessed.



## Write Policy

Cache hit:

- write through: write both cache and memory; generally higher traffic, but simplifies cache coherence
- write back: write the cache only (memory is written only when the entry is evicted)
  - a dirty bit per block can further reduce the traffic

Cache miss:

- no write allocate: only write to main memory
- write allocate (aka fetch on write): fetch the block into the cache

**Common combinations:** 

- write through with no write allocate
- write back with write allocate
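To make the traffic difference between the two common combinations concrete, here is a minimal sketch of a direct-mapped cache that only counts words moved to and from memory. The class `ToyCache`, its parameters, and the word-granularity accounting are assumptions made for illustration; real caches differ in many details (associativity, write buffers, partial-block writes).

```python
class ToyCache:
    """Direct-mapped cache model that tracks tags and memory traffic only."""

    def __init__(self, num_lines, block_words, write_back, write_allocate):
        self.num_lines = num_lines
        self.block_words = block_words
        self.write_back = write_back
        self.write_allocate = write_allocate
        self.tags = [None] * num_lines
        self.dirty = [False] * num_lines
        self.mem_reads = 0   # words read from memory
        self.mem_writes = 0  # words written to memory

    def _locate(self, word_addr):
        block = word_addr // self.block_words
        return block % self.num_lines, block // self.num_lines

    def _fill(self, index, tag):
        # Evict the victim; a write-back cache must flush it only if dirty.
        if self.write_back and self.dirty[index]:
            self.mem_writes += self.block_words
        self.mem_reads += self.block_words
        self.tags[index] = tag
        self.dirty[index] = False

    def read(self, word_addr):
        index, tag = self._locate(word_addr)
        if self.tags[index] != tag:
            self._fill(index, tag)

    def write(self, word_addr):
        index, tag = self._locate(word_addr)
        hit = self.tags[index] == tag
        if not hit and self.write_allocate:
            self._fill(index, tag)  # fetch on write
            hit = True
        if hit and self.write_back:
            self.dirty[index] = True  # memory is updated later, on eviction
        else:
            self.mem_writes += 1      # write through, or a no-allocate miss


wt = ToyCache(4, 4, write_back=False, write_allocate=False)
wb = ToyCache(4, 4, write_back=True, write_allocate=True)
for cache in (wt, wb):
    for _ in range(10):
        cache.write(0)   # ten stores to the same word
    cache.read(64)       # maps to the same line, forcing an eviction
    print(cache.mem_reads, cache.mem_writes)
```

In this trace the write-through, no-write-allocate cache sends every store to memory, while the write-back, write-allocate cache pays one block fetch up front and one block write-back at eviction.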









Design the largest primary cache without slowing down the clock or adding pipeline stages.



Larger block size will reduce compulsory misses (first miss to a block). Larger blocks may increase conflict misses since the number of blocks is smaller.
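The two effects can be seen with simple arithmetic. The sketch below assumes a fixed 4 KiB cache and a purely sequential sweep over 64 KiB of data; both sizes and the function names are made up for illustration. Under a sequential sweep, each block is missed once on first touch, so compulsory misses fall as blocks grow, while the number of blocks available to hold distinct addresses shrinks.

```python
def num_blocks(cache_bytes: int, block_bytes: int) -> int:
    """Blocks that fit in a cache of fixed capacity."""
    return cache_bytes // block_bytes


def compulsory_misses(bytes_touched: int, block_bytes: int) -> int:
    """First-touch misses for a sequential sweep (ceiling division)."""
    return -(-bytes_touched // block_bytes)


CACHE_BYTES = 4 * 1024    # assumed cache capacity
SWEEP_BYTES = 64 * 1024   # assumed amount of data touched once, in order

for block_bytes in (16, 32, 64, 128):
    print(block_bytes,
          num_blocks(CACHE_BYTES, block_bytes),
          compulsory_misses(SWEEP_BYTES, block_bytes))
```

This only models the first-miss count and the block count; it does not simulate conflict misses, which depend on the actual address stream.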







Designers of the MIPS M/1000 estimated that waiting for a four-word buffer to empty increased the read miss penalty by a factor of 1.5.



## Write Alternatives





