Prefetch Data from Memory
Reordering instructions in an unrolled loop gives an effective (pseudo-)prefetch of the data:
- no instruction overhead; the compiler does this optimization automatically.
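As an illustration only, the reordering can be sketched by hand in C. The function name sum_pseudo_prefetch and the unroll factor of 4 are assumptions made for this sketch; in practice the compiler performs the reordering itself. The loads for the next group of elements are issued before the arithmetic on the current group, so memory latency overlaps with computation and no extra prefetch instruction is needed.

```c
#include <stddef.h>

/* Sum n doubles, unrolled by 4, with the loads hoisted one group ahead.
   Assumption for brevity: n is a multiple of 4 and n >= 4. */
double sum_pseudo_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    /* Load the first group before the loop starts. */
    double x0 = a[0], x1 = a[1], x2 = a[2], x3 = a[3];
    size_t i;

    for (i = 4; i < n; i += 4) {
        /* Issue the loads for the next group first ... */
        double y0 = a[i], y1 = a[i + 1], y2 = a[i + 2], y3 = a[i + 3];
        /* ... then do the arithmetic on the group loaded earlier. */
        s += (x0 + x1) + (x2 + x3);
        x0 = y0; x1 = y1; x2 = y2; x3 = y3;
    }
    /* Consume the last group that was loaded. */
    s += (x0 + x1) + (x2 + x3);
    return s;
}
```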
Explicit (manual) prefetch for memory:
- prefetch to the 1st level cache should be done in the form of pseudo-prefetch
- the compiler inserts prefetches to the 2nd level cache automatically (LNO, the loop nest optimizer)
- manual prefetch to the 2nd level cache can be done with a compiler directive:
c*$* prefetch_ref=a(I+16)
c*$* prefetch_ref=a(I+32),stride=16,kind=rd
- same in C with the corresponding #pragma directive
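A minimal sketch of the C form follows, assuming the pragma spelling `#pragma prefetch_ref = ...` with the same stride and kind clauses as the Fortran directive. The function name sum_with_prefetch is hypothetical, and the `__sgi` guard is an assumption used so that other compilers simply skip the pragma. The directive asks for a(i+32) to be fetched for reading, once every 16 iterations, while the loop works on a[i]; a prefetch is only a hint, not a load, so the code never dereferences elements past the end of the array.

```c
#include <stddef.h>

/* Sum n doubles while requesting a prefetch 32 elements ahead.
   The pragma is compiler-specific; the guard is an assumption so that
   compilers without this directive compile the plain loop instead. */
double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
#if defined(__sgi)
#pragma prefetch_ref = a[i + 32], stride=16, kind=rd
#endif
        s += a[i];  /* the actual work uses only a[0..n-1] */
    }
    return s;
}
```

With 8-byte doubles, stride=16 corresponds to one prefetch per 128 bytes of data; whether that matches the 2nd level cache line depends on the machine.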