
CS 5944 Scientific Computing for Engineers: Spring 2012 – 3 Credits
This class is part of the Interdisciplinary Graduate Minor in Computational Science. See IGMCS for details.
Wednesdays from 1:30 – 4:15, Room 233 Claxton
Prof. Jack Dongarra with help from Profs. George Bosilca, Jakub Kurzak, Shirley Moore, Stan Tomov, and Vince Weaver
Email: dongarra@eecs.utk.edu
Phone: 865-974-8295
Office hours: Wednesday 11:00 – 1:00, or by appointment
TA: Blake Haugen, bhaugen@utk.edu
TA's Office: 352 Claxton
TA's Office Hours: Wednesdays 10:00 – 12:00 or by appointment
There will be four major aspects of the course:
The grade will be based on homework, a midterm project, a final project, and a final project presentation. Topics for the final project are flexible and can follow the student's major area of research.
Class Roster If your name is not on the list or some information is incorrect, please send mail to the TA:
Book for the Class: The Sourcebook of Parallel Computing, Edited by Jack Dongarra, Ian Foster, Geoffrey Fox, William Gropp, Ken Kennedy, Linda Torczon, Andy White, October 2002, 760 pages, ISBN 1558608710, Morgan Kaufmann Publishers.
Lecture Notes: (Tentative outline of the class)
Introduction to High Performance Computing Read Chapters 1, 2, and 9 Homework 1 (due January 25, 2012, due date extended to Friday January 27th)
Homework 2 (due February 1, 2012) Read Chapter 3 Read Chapter 20
Parallel programming paradigms and their performance Homework 3 (due February 8, 2012) Read Chapter 21
Message Passing Interface (MPI) Read Chapter 11
Message Passing Interface (MPI) Homework 4 (due February 29, 2012)
Performance Evaluation and Tuning Homework 5 (due February 29, 2012)
Homework 6 (due March 8th, 2012) Read Chapter 9 HPC Performance Issues and Systems Read Chapter 3
Homework 7 (due March 14th, 2012)
Partitioned Global Address Space (PGAS) languages Homework 8 (due March 28, 2012)
March 23 – Spring Break
11. March 28 (Dr. Tomov) Projection and its importance in scientific computing Homework 9 (due April 11, 2012)
Discretization of PDEs and Tools for the Parallel Solution of the Resulting Syst Mesh generation and load balancing Homework 10 (due April 18, 2012)
Sparse Matrices and Optimized Parallel Implementations NVIDIA's Compute Unified Device Architecture (CUDA) Homework 11 (due April 25, 2012) Read Chapters 20 and 21
Iterative Methods in Linear Algebra (Part 1) Iterative Methods in Linear Algebra (Part 2)
Video of CUDA – "Better Performance at Lower Occupancy", V. Volkov
Video of OpenCL – "What is OpenCL"
15. April 25 (No class) Read Chapter 20 Bailey's paper on "12 ways to fool …"
Class Final reports
The project is to describe and demonstrate what you have learned in class. The idea is to take an application and implement it on a parallel computer. Describe what the application is and why it is important. You should describe the parallel implementation, examine its performance, and, if possible, compare it to another implementation. Write this up in a report of 10–15 pages; in class you will have 20 minutes to make a presentation.
Here are some ideas for projects:
• Projects and additional projects.
Additional Reading Materials
Message Passing Systems
Several implementations of the MPI standard are available today. The most widely used open source MPI implementations are Open MPI and MPICH.
Here is the link to the MPI Forum.
Other useful reference material
• Here are pointers to specs on various processors:
http://www.cpuworld.com/CPUs/index.html
http://www.cpuworld.com/sspec/index.html
http://processorfinder.intel.com
• Introduction to message passing systems and parallel computing
``Message Passing Interfaces'', Special issue of Parallel Computing, vol 20(4), April 1994.
Ian Foster, Designing and Building Parallel Programs, see http://wwwunix.mcs.anl.gov/dbpp/
Alice Koniges, ed., Industrial Strength Parallel Computing, ISBN 1558605401, Morgan Kaufmann Publishers, San Francisco, 2000.
Ananth Grama et al., Introduction to Parallel Computing, 2nd edition, Pearson Education Limited, 2003.
Michael Quinn, Parallel Programming: Theory and Practice, McGraw-Hill, 1993
David E. Culler & Jaswinder Pal Singh, Parallel Computer Architecture, Morgan Kaufmann, 1998, see http://www.cs.berkeley.edu/%7Eculler/book.alpha/index.html
George Almasi and Allan Gottlieb, Highly Parallel Computing, Addison Wesley, 1993
Matthew Sottile, Timothy Mattson, and Craig Rasmussen, Introduction to Concurrency in Programming Languages, Chapman & Hall, 2010
• Other relevant books
Stephen Chapman, Fortran 95/2003 for Scientists and Engineers, McGraw-Hill, 2007
Stephen Chapman, MATLAB Programming for Engineers, Thomson, 2007
Barbara Chapman, Gabriele Jost, Ruud van der Pas, and David J. Kuck, Using OpenMP: Portable Shared Memory Parallel Programming, MIT Press, 2007
Tarek El-Ghazawi, William Carlson, Thomas Sterling, Katherine Yelick, UPC: Distributed Shared Memory Programming, John Wiley & Sons, 2005
David Bailey, Robert Lucas, Samuel Williams, eds., Performance Tuning of Scientific Applications, Chapman & Hall, 2010
Message Passing Standards
``MPI – The Complete Reference, Volume 1, The MPI-1 Core, Second Edition''
``MPI: The Complete Reference – 2nd Edition: Volume 2 – The MPI-2 Extensions''
MPI-2.1 Standard, September 2008
PDF format: http://www.mpiforum.org/docs/mpi21report.pdf
Hardcover: https://fs.hlrs.de/projects/par/mpi//mpi21/
MPI-2.2 Standard, September 2009
PDF format: http://www.mpiforum.org/docs/mpi2.2/mpi22report.pdf
Hardcover: https://fs.hlrs.de/projects/par/mpi//mpi22/
Online Documentation and Information about Machines
• Overview of Recent Supercomputers, Aad J. van der Steen and Jack J. Dongarra, 2007.
• Green 500 List of Energy-Efficient Supercomputers
Other Scientific Computing Information Sites
• Netlib Repository at UTK/ORNL
• LAPACK
• GAMS – Guide to Available Math Software
• Fortran Standards Working Group
• Message Passing Interface (MPI) Forum
• OpenMP
• DOD High Performance Computing Modernization Program
• DOE Accelerated Strategic Computing Initiative (ASC)
• AIST Parallel and High Performance Application Software Exchange (in Japan) (includes information on parallel computing conferences and journals)
• HPCwire
Related Online Books/Textbooks
• Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM Publication, Philadelphia, 1994.
• LAPACK Users' Guide (Second Edition), SIAM Publications, Philadelphia, 1995.
• Using MPI: Portable Parallel Programming with the Message-Passing Interface by W. Gropp, E. Lusk, and A. Skjellum
• Parallel Computing Works, by G. Fox, R. Williams, and P. Messina (Morgan Kaufmann Publishers)
• Designing and Building Parallel Programs. A dead-tree version of this book is available from Addison-Wesley.
• Introduction to High-Performance Scientific Computing, by Victor Eijkhout with Edmond Chow, Robert Van De Geijn, February 2010
• Introduction to Parallel Computing, by Blaise Barney
Performance Analysis Tools Websites
• PAPI
• TAU
• Vampir
• Scalasca
• mpiP
• ompP
• IPM
• Eclipse Parallel Tools Platform
Other Online Software and Documentation
• Matlab documentation is available from several sources, most notably by typing ``help'' into the Matlab command window. A primer (for version 4.0/4.1 of Matlab, not too different from the current version) is available in either postscript or pdf.
• SuperLU is a fast implementation of sparse Gaussian elimination for sequential and parallel computers.
• Sources of test matrices for sparse matrix algorithms
• University of Florida Sparse Matrix Collection
• Templates for the solution of linear systems, a collection of iterative methods, with advice on which ones to use. The web site includes online versions of the book (in html and postscript) as well as software.
• Templates for the Solution of Algebraic Eigenvalue Problems is a survey of algorithms and software for solving eigenvalue problems. The web site points to an html version of the book, as well as software.
• Updated survey of sparse direct linear equation solvers, by Xiaoye Li
• MGNet is a repository for information and software for Multigrid and Domain Decomposition methods, which are widely used for solving linear systems arising from PDEs.
• Resources for Parallel and High Performance Computing
• ACTS (Advanced CompuTational Software) is a set of software tools that make it easier for programmers to write high performance scientific applications for parallel computers.
• PETSc: Portable, Extensible Toolkit for Scientific Computation
• Issues related to Computer Arithmetic and Error Analysis
• Efficient software for very high precision floating point arithmetic
• Notes on IEEE Floating Point Arithmetic, by Prof. W. Kahan
• Other notes on arithmetic, error analysis, etc. by Prof. W. Kahan
• Report on the arithmetic error that caused the Ariane 5 rocket crash; video of the explosion
• The IEEE floating point standard is currently being updated. To find out what issues the standard committee is considering, look here.
Jack Dongarra 4/24/2012



