Computational Methods and Data Science

DSE 511 - Fall 2025

Coordinating Instructor

Scott Emrich
Office: 608 Min Kao Hall
Phone: (865) 974-3891; E-mail: semrich at utk.edu
Office hours: whenever my office door is open, or by appointment

Overview

This course equips PhD students from diverse research domains with the computational thinking, data fluency, and collaborative development skills essential for modern data-driven research. It blends foundational computer science concepts (e.g., data structures, version control), applied statistics (e.g., inference, simulation), and core data science workflows (e.g., data wrangling, ML, reproducibility) to foster a deep understanding of how to design and implement reproducible, scalable computational systems.

Designed for students with varied programming backgrounds, DSE 511 emphasizes real-world application, cross-disciplinary problem-solving, and ethical considerations in data science. Students will gain confidence writing code, collaborating via version control, processing and analyzing data, and building tools that support their thesis or domain-specific research.

Syllabus

The syllabus can be found here

Schedule

Date Topic Homework Notes
8/19/2025 (re)Intro to Computational Thinking   Wing, J. M. (2006). "Computational Thinking". Communications of the ACM, 49(3), 33-35.
"The missing semester of your CS education" (MIT)
8/21/2025 Problem decomposition   How to write pseudocode from GeeksforGeeks
8/26/2025 Programming fundamentals (Python)   Python for Data Science Intro course (YouTube, 4 hour primer for diverse backgrounds)
Official Python tutorial
8/28/2025 Version control   Atlassian git tutorials
9/02/2025 Collaborative coding   Git.... the game!
Pro git book (free online version)
9/04/2025 Stat review I: Exploratory Data Analysis   Seeing Theory online text/visualizations of statistical topics
Freedman, D., Pisani, R., Purves, R. (2007). Statistics (4th Edition), Chapters 1-5
9/09/2025 Stat review I (cont.): Inference   Think Stats (2e)
9/11/2025 Statistical computing    
9/16/2025 Data structures and alg basics (1)   Big-O Cheat Sheet
9/18/2025 Data structures and alg basics (2)   zyBook on Data Structures and Algorithms (Python; optional)
9/23/2025 Software development + best practices   The Good Research Code Handbook (free online book)
Good enough practices in scientific computing (paper)
9/25/2025 Data acquisition   Our World in Data
Practical introduction to scraping (RealPython)
9/30/2025 Data privacy and ethics   The Ethical Algorithm
10/02/2025 Data wrangling   Tidyverse cheat sheet for beginners
Python Data Science Handbook Chapter 3
Fall Break!
10/09/2025 Responsible AI   Fairness and machine learning (free online)
"Why Should I Trust You?" by Ribeiro et al.
10/14/2025 No class: fairness in AI challenge    
10/16/2025 Brief introduction to R   R for Data Science (2e)
10/21/2025 Numerical Methods (review)    
10/23/2025 Deep learning: theory    
10/28/2025 Deep learning: training   Python Data Science Handbook Chapter 5
10/30/2025 Model evaluation and interpretability   Interpretable Machine Learning (free online book)
11/04/2025 No class: Election Day
11/06/2025 Deep learning: convolutional neural networks   Papers with Code
arxiv Sanity
11/11/2025 Frontiers of Data Science: transformers   The Turing Way Community
See papers posted on Canvas
11/13/2025 Guest lecture: Applied LLMs    
11/18/2025 Modern machine learning tools (student lecture)    
11/20/2025 Team Science and technical collaboration / Capstone final checkpoint   Ten simple rules to ruin a collaborative environment (paper)
11/25/2025 Reproducible research (asynchronous/no class)   Boettiger, C. (2015). "An introduction to Docker for reproducible research", SIGOPS Oper. Syst. Rev.
Perkel, J. M. (2019). "Workflow systems turn raw data into scientific knowledge", Nature
FAIR Principles: https://www.go-fair.org/fair-principles/
12/02/2025 Final capstone project presentations

Academic dishonesty

All students are required to abide by the EECS and University honor codes. Discussion is encouraged, but all answers and programs must be written and developed individually. Final projects will be completed as a group.