CSU, Chico: Computer Science 380 (Computer Architecture)
Ch 3: INSTRUCTION-LEVEL PARALLELISM AND ITS DYNAMIC EXPLOITATION
ILP: Concepts and challenges
Pipeline performance
Advanced pipelining techniques
Basic blocks and ILP
Loop-level parallelism (LLP)
Converting LLP to ILP
Dependences
Data dependence
Name dependence
Control dependence
Relaxing control dependence
Dynamic scheduling: The idea
R.M. Tomasulo (1967) Algorithm
Scoreboarding review ...
Tomasulo's algorithm is similar to scoreboarding ...
Components of a Tomasulo Unit
MIPS FPU using Tomasulo's Algorithm
Major steps in Tomasulo's Algorithm
Reservation station fields
Code example (revisited from Scoreboard code example)
Approximate timing of example
More details of the algorithm
Dynamic loop scheduling
Two active iterations of the loop
(Comparative figure from prior edition of textbook)
Drawbacks of Tomasulo's Algorithm
When Tomasulo's Algorithm is most useful
Dynamic branch prediction
Branch prediction buffers (BPBs)
1-bit branch prediction
2-bit branch prediction
State transition diagram for 2-bit branch prediction
n-bit branch prediction
Ideal branch prediction?
Implementing branch histories
Misprediction rate for 2-bit BPBs
4K entry BPB vs "unlimited" BPB
Branch-prediction performance
Example: Correlated prediction
Even simpler example
Correlating predictors
(m,n) correlated predictors
Correlated predictor schematic
Correlated predictors do better
Tournament predictors: local and global predictors
Fraction of predictions coming from the local predictor
Performance comparison of 3 branch prediction schemes
Ex: Alpha 21264 branch prediction
Branch target buffers (BTBs)
BTB Schematic
BTB Flowchart
BTB Penalties in different cases
BTB variants
Return address prediction stack
When the best prediction fails ...
ILP and multiple issue
5 primary approaches for multiple-issue processors
A statically-scheduled superscalar MIPS processor
Some issues with this approach
Multiple instruction issue with dynamic scheduling
Approximate timing of example
Dual-issue version of MIPS Tomasulo pipeline
Resource usage table
Having a separate adder for effective address calculation
Approximate timing of example
Resource usage table
Multiple-instruction issue with dynamic scheduling
VLIW (Very Long Instruction Word)
Limits to multiple-issue
An important lesson
What is "speculation"?
Ambitious speculation methods
HW/SW cooperation for speculation
Poison bits
Speculative instructions with renaming
Hardware-based speculation
Advantages of hardware-based speculation
Implementing hardware-based speculation
Fields of reorder buffer (ROB) entries
Steps of execution in hardware-based speculation
HW-based speculation schematic
Ex 1: HW-based speculation
Ex 1: MUL.D ready to commit
Ex 2: L.D and MUL.D committed
Detail of steps in HW-based speculation 1/2
Detail of steps in HW-based speculation 2/2
Ex 1: Dual-issue without speculation
Ex 1: Dual-issue without speculation, timing
Ex 2: Dual-issue with speculation
Ex 2: Dual-issue with speculation, timing
Explicit register renaming
Limits of ILP
Average ILP in perfect processor
Implications of ILP limitations
Limitations on window size and maximum issue count
Window size and instruction issues per cycle
Effects of imperfect branch prediction
Imperfect branch prediction 1/2
Imperfect branch prediction 2/2
Effect of finite register set 1/2
Effect of finite register set 2/2
Effects of imperfect alias analysis 1/2
Effects of imperfect alias analysis 2/2
Limitations of ILP in realizable processors
Window size variations 1/2
Window size variations 2/2
Fallacies and pitfalls
This E-Slideshow was prepared by Dr. J for CSCI 380 (Last revised: Thu Jan 30 21:34:39 PDT 2003)