ECE 691 Scalable and Resilient AI/ML Systems
Instructor: Edmon Begoli, ebegoli@utk.edu, 865.576.0599
Location: Min Kao building, Room 623
Schedule: bi-weekly, Thursdays 5-7 pm
The Scalable and Resilient AI/Machine Learning Systems course focuses on the issues, principles, and design techniques relevant to implementing production-ready AI/ML systems.
The participants in this class will learn about:
- the issues that are specific to the operation of ML systems in real-world environments,
- the principles of reliability, robustness, and resilience in the context of operational use of ML systems, and
- the design techniques and design patterns for building and operating ML systems that satisfy those principles.
Topics
- reactive systems and microservices
- methods and metrics for reliable AI/ML operations and model evaluation
- vulnerabilities to adversarial inputs and data poisoning
- techniques for scalable data management in the context of AI/ML
The class will meet bi-weekly. Meetings will focus on review and discussion of the reading assignments, led by assigned topic leads. Participants may be invited to author one or more survey papers by the end of the course.
We will use the course Canvas site for all course announcements, assignment postings, and discussions.