Instruction Latencies (R12K)
Integer units                           latency   repeat rate
- ALU 1
  - add, sub, logic ops, shift, br      1         1
- ALU 2
  - add, sub, logic ops                 1         1
  - signed multiply (32/64 bit)         6/10      6/10
    (unsigned multiply: +1 cycle)
  - divide (32/64 bit)                  35/67     35/67
- Address Unit
  - load integer                        2         1
  - load floating point                 3         1
  - store                               -         1
  - Atomic LL,ADD,SC sequence           6         6
Floating point units
- FPU 1
  - add, sub, compare, convert          2         1
- FPU 2
  - multiply                            2         1
  - multiply-add (madd)                 4         1
- FPU 3
  - divide, reciprocal (32/64 bit)      12/19     14/21
  - sqrt (32/64 bit)                    18/33     20/35
  - rsqrt (32/64 bit)                   30/52     34/56
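A repeat rate of 1 means that, once the pipeline
is full, a unit can start a new operation every cycle.
With two ALUs and separate FP add and multiply units,
the processor can complete at most:
Int operations: 2 int operations/cycle
FP operations: 2 fp operations/cycle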
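The difference between latency and repeat rate can be sketched with a small cycle-count model. This is a simplification, not the real R12K pipeline: it assumes a single unit, no stalls, and that the first result appears after `latency` cycles. The function names are hypothetical.

```c
#include <assert.h>

/* Simplified model (assumption): independent operations on one unit.
   The first result is ready after `latency` cycles; after that a new
   operation can start every `repeat` cycles. */
long independent_cycles(long n, long latency, long repeat)
{
    assert(n >= 0 && latency > 0 && repeat > 0);
    return n == 0 ? 0 : latency + (n - 1) * repeat;
}

/* Dependent chain: each operation needs the previous result, so the
   full latency is paid for every operation. */
long dependent_cycles(long n, long latency)
{
    assert(n >= 0 && latency > 0);
    return n * latency;
}
```

Under this model, 100 independent FP adds (latency 2, repeat rate 1) take 2 + 99 = 101 cycles, while 100 dependent adds take 200. For the 64-bit integer divide (latency 67, repeat rate 67) both cases come out at 6700 cycles: the unit is not pipelined, so independence does not help.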
The compiler has this table built in.
The goal of compiler scheduling
is to find instructions that can be
executed in parallel to fill all slots:
ILP - Instruction Level Parallelism
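A minimal sketch of what exposing ILP can look like in source code, assuming the FP add timings listed above (latency 2, repeat rate 1). The function names are hypothetical. Whether a compiler performs this transformation on its own depends on its optimization settings, since reordering floating-point adds can change rounding.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: every add depends on the previous one, so the
   FP add latency is paid on each iteration. */
double sum_serial(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Two accumulators: the adds into s0 and s1 are independent, so one
   can issue while the other is still in flight. */
double sum_two_chains(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    if (i < n)              /* odd element count: add the last element */
        s0 += x[i];
    return s0 + s1;
}
```

Both functions return the same value when the sums are exactly representable; for general data the last bits may differ because the additions are grouped differently.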