Programming on Parallel Machines
Why is this book different from all other parallel programming books? It is aimed more at the practical end of things, in that:
- There is very little theoretical content, such as O() analysis, maximum theoretical speedup, PRAMs, directed acyclic graphs (DAGs) and so on.
- Real code is featured throughout.
- We use the main parallel platforms (OpenMP, CUDA and MPI) rather than languages that at this stage are largely experimental or arcane.
- The running performance themes (communications latency, memory/network contention, load balancing and so on) are interleaved throughout the book, discussed in the context of specific platforms or applications.
- Considerable attention is paid to techniques for debugging.
The main programming language used is C/C++, but some of the code is in R, which has become the pre-eminent language for data analysis. As a scripting language, R can be used for rapid prototyping. In our case here, it enables me to write examples that are much less cluttered than they would be in C/C++, and thus easier for students to discern the fundamental parallelization principles involved. For the same reason, it makes it easier for students to write their own parallel code, focusing on those principles. And R has a rich set of parallel libraries.
It is assumed that the student is reasonably adept in programming, and has a math background through linear algebra. An appendix reviews the parts of the latter needed for this book. Another appendix presents an overview of various systems issues that arise, such as process scheduling and virtual memory.
It should be noted that most of the code examples in the book are NOT optimized. The primary emphasis is on simplicity and clarity of the techniques and languages used. However, there is plenty of discussion of factors that affect speed, such as cache coherency issues, network delays, GPU memory structures and so on.