Scott Emrich
Office: 608 Min Kao
Phone: (865) 974-3891; E-mail: semrich at utk.edu
Office hours: TBD, and by appointment
Overview
This course covers foundational concepts for building scalable and reproducible computational workflows. Students will learn how to decompose work into jobs, manage dependencies, handle failure, and execute workloads using batch schedulers and containers. This course is neither a Spark certification course nor an MPI programming class. The course will emphasize principles that generalize across scientific HPC environments and industry data platforms, preparing students to reason about scale, cost, and the responsible use of shared computing systems.
Text and syllabus
The syllabus can be found here
Schedule
| Date | Topic | Homework | Notes |
| 1/20/2026 | Intro to scalable computing | ||
| 1/22/2026 | Spark vs. SLURM | ||
| 1/27/2026 | Responsible HPC usage | ||
| 1/29/2026 | Intro to containers and DAGs | ||
| 2/03/2026 | Job Arrays and Parameter Sweeps | ||
| 2/05/2026 | Failure on purpose | ||
| 2/10/2026 | Workflow design at a high level | ||
| 2/12/2026 | From hand-drawn DAGs to executable workflows | ||
| 2/17/2026 | From research idea to workflow: Capstone pitches | ||
| 2/19/2026 | Execution frameworks as workflow realizations | ||
| 2/24/2026 | SLURM jam: From design to debugging | ||
| 2/26/2026 | Project DAG Design Studio (no class) | ||
| 3/03/2026 | Scaling is a tradeoff, not a goal | ||
| 3/05/2026 | Designing for humans: logging, monitoring, and debuggability | ||
| 3/10/2026 | Spring break (no class) | ||
| 3/12/2026 | Spring break (no class) | ||
| 3/17/2026 | Interactive compute as a tool | ||
| 3/19/2026 | Interactive job jam! | ||
| 3/24/2026 | Data movement & I/O in HPC workflows | ||
| 3/26/2026 | Project architecture presentation prep (no class) | ||
| 3/31/2026 | Scheduler guest lecture (student choice) | ||
| 4/02/2026 | Spring recess (no class) | ||
| 4/07/2026 | Initial project presentations for feedback | ||
| 4/09/2026 | Cost and Performance Thinking | ||
| 4/14/2026 | Evaluating, not just executing, workflows | ||
| 4/16/2026 | Workflow jam: designing towards evaluation | ||
| 4/21/2026 | Guest lecture: "War stories" from UTK NICS (Crosby) | ||
| 4/23/2026 | Guest lecture: "War stories" from ORNL OLCF (Holman) | ||
| 4/28/2026 | Draft project presentations (in class) | ||
| 4/30/2026 | Study hall to finalize projects | ||
| 5/05/2026 | Projects due (no formal class; videos will be shared) | ||
All students are required to abide by the DSE and University Honor Code.
Discussion of concepts and general approaches with classmates is encouraged; however, unless explicitly stated otherwise, all submitted code and written answers must be developed and written individually.
You may use external resources—including documentation, textbooks, and generative AI tools—as learning aids (e.g., to clarify syntax, understand error messages, or review general concepts). However, relying on such tools to generate substantial portions of assignment solutions, complete implementations, or logic specific to a graded task is not permitted.
If you are unsure whether a particular use is allowed, please ask. As a guiding principle: if you could not explain or re-derive the solution without the tool, then its use was inappropriate. Submitted work must reflect your own understanding and effort.