Optimal control solution techniques for systems with known and unknown dynamics. Dynamic programming, Hamilton-Jacobi reachability, and direct and indirect methods for trajectory optimization. Introduction to model predictive control. Adaptive control, model-based and model-free reinforcement learning, and connections between modern reinforcement learning and fundamental optimal control ideas.
Instructor: Ed Schmerling
Course assistants: Spencer M. Richards, Devansh Jalota
The class meets on Mondays and Wednesdays from 9:45am to 11:15am in Skilling Auditorium; lecture recordings will be available on the class Canvas site.
Dr. Schmerling's office hours are on Mondays 1:00pm to 2:00pm (course material) and Wednesdays 4:00pm to 5:00pm (project discussion) in Durand 217, and by appointment.
Devansh's office hours are on Tuesdays 5:00pm to 6:00pm (Zoom) and Fridays 4:00pm to 5:00pm (Durand 210 and Zoom).
Spencer's office hours are on Thursdays 8:00am to 10:00am (Zoom), and by appointment.
The class syllabus can be found here.
The schedule below is subject to change. An evolving set of partial course notes is available here; corresponding lecture slides will be uploaded before each class period.
Week | Topic | Slides & Code |
---|---|---|
1 | Introduction; control, stability, and metrics. Introduction to learning; system identification and adaptive control. Recitation: Automatic differentiation (autodiff) with JAX (see the sketch after the schedule). Monday: HW0 (ungraded) out. | Lecture 1 (Code); Lecture 2 (Code); Recitation 1 (Code) |
2 | Nonlinear optimization theory. Calculus of variations. Recitation: Convex optimization with CVXPY. Monday: HW1 out. | Lecture 3 (Code); Lecture 4; Recitation 2 |
3 | Indirect methods for optimal control. Pontryagin's maximum principle; introduction to dynamic programming. Recitation: Regression models. Friday: project proposal due. | Lecture 5 (Code); Lecture 6; Recitation 3 |
4 | Discrete LQR, stochastic DP, value iteration, policy iteration. Introduction to reinforcement learning, dual control, LQG. Recitation: Training neural networks with JAX. Monday: HW1 due, HW2 out. | Lecture 7 (Code); Lecture 8; Recitation 4 |
5 | Nonlinearity; tracking LQR. Iterative LQR, differential dynamic programming (DDP). | Lecture 9 (Code); Lecture 10 (Code) |
6 | Direct methods for optimal control; sequential convex programming. HJB, HJI, reachability analysis. Monday: HW2 due. | Lecture 11 (Code); Lecture 12 (Code) |
7 | Introduction to model predictive control (MPC). Feasibility and stability of MPC. Monday: project midterm report due; Tuesday: HW3 out. | Lecture 13; Lecture 14 |
8 | MPC implementation considerations, robust MPC. Adaptive and learning MPC. | Lecture 15 (Code); Lecture 16 |
9 | Model-based RL. Model-free RL: policy gradient and actor-critic. Tuesday: HW3 due; Wednesday: HW4 out. | Lecture 17; Lecture 18 (Code) |
10 | No lecture (Memorial Day). Model-based policy learning, wrap-up, recent work, and future directions. Wednesday: HW4 due; project video presentation and final report due. | Lecture 19 |
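As a preview of the Week 1 recitation topic, here is a minimal sketch of automatic differentiation with JAX: differentiating a dynamics model to obtain the Jacobians used when linearizing a system. The pendulum model, parameter values, and function names below are illustrative assumptions for this sketch, not taken from the course materials.

```python
# Minimal, illustrative sketch of autodiff with JAX (the Recitation 1 topic):
# differentiate a dynamics model to obtain the Jacobians used when linearizing
# a system. The pendulum model below is a stand-in example, not course code.
import jax
import jax.numpy as jnp


def pendulum_dynamics(x, u):
    """Continuous-time dynamics x_dot = f(x, u) of a damped pendulum."""
    g, l, b = 9.81, 1.0, 0.1  # gravity, length, damping (illustrative values)
    theta, theta_dot = x
    return jnp.array([theta_dot,
                      -(g / l) * jnp.sin(theta) - b * theta_dot + u])


# Jacobians of f with respect to the state (argnums=0) and the input (argnums=1).
A_fn = jax.jacobian(pendulum_dynamics, argnums=0)
B_fn = jax.jacobian(pendulum_dynamics, argnums=1)

# Evaluate the linearization about the upright equilibrium (theta = pi).
x_eq, u_eq = jnp.array([jnp.pi, 0.0]), 0.0
A, B = A_fn(x_eq, u_eq), B_fn(x_eq, u_eq)
print("A =", A)
print("B =", B)
```

Jacobians computed this way are the local linear models that reappear later in the course, for example in tracking LQR and iterative LQR.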
Follow this link to access the course website for the previous edition of Optimal and Learning-Based Control.