Summary
- Scalar optimization:
  - improving ILP (instruction-level parallelism) by transforming code and grouping independent instructions
  - improving memory access by restructuring loop nests to take better advantage of the memory hierarchy
- compilers are good at instruction-level optimizations and loop transformations, but how good depends on the language:
  - F77 is the easiest for the compiler to work with
  - C is more difficult
  - F90 and C++ are the most complex for compiler optimizations
- the user is responsible for presenting the code in a way that allows compiler optimizations:
  - don’t violate the language standard
  - write clean and clear code
  - check the data structures for (false) sharing and alignment
  - check the data structures for data dependencies
  - use the most natural presentation of algorithms, with multi-dimensional arrays
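
As a minimal sketch of two points from the summary (restructuring a loop nest to suit the memory hierarchy, and presenting code so that the compiler can optimize it), the C fragment below sums a two-dimensional array in two loop orders and shows a simple vector update. The names fill, sum_column_first, sum_row_first and axpy, and the size N, are hypothetical and not taken from the original material. The use of the C99 restrict qualifier is an assumption about one way to give a C compiler the no-overlap guarantee that Fortran dummy arguments carry by default; how much either change helps depends on the compiler and the machine.

```c
#include <stddef.h>

#define N 256  /* hypothetical problem size, chosen only for illustration */

/* Fill a with a[i][j] = i + j so the two sums below can be compared. */
void fill(double a[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            a[i][j] = (double)(i + j);
}

/* Column-first traversal: the inner loop strides N doubles between
   accesses, so consecutive iterations usually touch different cache lines. */
double sum_column_first(double a[N][N])
{
    double s = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

/* Interchanged loop nest: C stores arrays row by row, so the inner loop
   now walks contiguous memory. */
double sum_row_first(double a[N][N])
{
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* restrict is the caller's promise that x and y do not overlap, which
   removes a possible dependence between the load of x[i] and the store
   to y[i]. Calling this with overlapping arrays is undefined behavior. */
void axpy(size_t n, double alpha, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}
```

Both sum functions return the same value; only the order of memory accesses differs. Declaring the array as a true two-dimensional array, rather than as an array of row pointers, keeps the access pattern visible to the compiler.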