Optimal control solution techniques for systems with known and unknown dynamics. Dynamic programming, Hamilton-Jacobi reachability, and direct and indirect methods for trajectory optimization. Introduction to model predictive control. Adaptive control, model-based and model-free reinforcement learning, and connections between modern reinforcement learning and fundamental optimal control ideas.
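Of the topics above, dynamic programming applied to the linear quadratic regulator (LQR) is the most compact to illustrate. The sketch below is a minimal scalar example, not course material: it runs the backward Riccati recursion for a system x_{t+1} = a*x_t + b*u_t with stage cost q*x^2 + r*u^2, where the parameter values a = b = q = r = 1 are invented for illustration.

```python
# Minimal sketch (illustrative, not from the course): backward Riccati
# recursion for a scalar discrete-time LQR problem
#   x_{t+1} = a*x_t + b*u_t,  cost = sum_t q*x_t^2 + r*u_t^2.

def lqr_backward_pass(a, b, q, r, horizon):
    """Return the optimal feedback gains K_t (u_t = -K_t * x_t) for each step."""
    P = q  # terminal cost-to-go weight P_T = q
    gains = []
    for _ in range(horizon):
        # Optimal gain at this step, computed from the cost-to-go one step ahead.
        K = a * b * P / (r + b * b * P)
        # Riccati update of the cost-to-go weight.
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        gains.append(K)
    return gains[::-1]  # reorder so gains[t] applies at time t

Ks = lqr_backward_pass(a=1.0, b=1.0, q=1.0, r=1.0, horizon=50)
# For a long horizon the early gains approach the stationary value 1/phi,
# where phi = (1 + 5**0.5) / 2 solves the algebraic Riccati equation here.
```

For long horizons the recursion converges, so the gain far from the terminal time is effectively constant; this is the infinite-horizon LQR gain covered later in the course.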
Teaching team: Marco Pavone, Daniele Gammelli, Hugo Buurmeijer, Ellie Brosius, Pranit Mohnot
Lectures are held on Mondays and Wednesdays from 1:30pm to 2:50pm in Skilling Auditorium.
Lecture recordings will be available on Canvas.
Recitations are held on Fridays from 1:30pm to 2:30pm for the first four weeks of the quarter. Recitations are hybrid sessions, held both in Durand 251 and on Zoom (link on Canvas). Each recitation will be recorded and made available on Canvas.
Note: The first recitation (Friday, April 3rd) will be Zoom only and will have no in-person option.
Office hours will begin in the second week of the quarter.
Marco Pavone's office hours are on Tuesdays, 1:00pm to 2:00pm, in Durand 261 and by appointment.
Daniele Gammelli's office hours are on Wednesdays, 3:00pm to 4:00pm, in Durand 251.
Hugo Buurmeijer's office hours are on Thursdays, 1:00pm to 2:00pm, in Durand 251.
Ellie Brosius's and Pranit Mohnot's office hours are on Wednesdays, 10:00am to 11:00am, in Durand 251.
The class syllabus can be found here.
The schedule below is subject to change.
| Week | Topic | Lecture Slides |
|---|---|---|
| 1 | Course overview; intro to nonlinear optimization. Optimization theory. Recitation: automatic differentiation with JAX. Monday: HW0 (ungraded) out. | Lecture 1, Lecture 2, Recitation 1; Code |
| 2 | Calculus of variations. Indirect methods for optimal control. Recitation: convex optimization with CVXPY. Friday: HW1 released. | |
| 3 | Pontryagin's maximum principle, continuous-time LQR. Direct methods (collocation, SCP). Recitation: regression models. | |
| 4 | Dynamic programming (DP), discrete LQR. Nonlinear LQR for tracking and trajectory generation (iLQR, DDP). Recitation: training neural networks with JAX. Friday: HW1 due, HW2 released. | |
| 5 | Stochastic DP, value iteration, policy iteration. HJB, HJI, and reachability analysis. | |
| 6 | MPC I: introduction, persistent feasibility. MPC II: persistent feasibility (cont'd), stability of MPC, and explicit MPC. Friday: HW2 due, HW3 released. | |
| 7 | Intro to learning, system identification, adaptive control. Intro to imitation learning and RL. | |
| 8 | Imitation learning. RL I: foundations of RL. Friday: HW3 due, HW4 released. | |
| 9 | No lecture (Memorial Day). RL II: model-free RL, value-based methods. | |
| 10 | RL III: model-free RL, policy optimization. RL IV: model-based RL and conclusions. Wednesday: HW4 due. | |
| 11 | Final Exam: Monday, June 8, 3:30pm to 6:30pm. | |
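To give a flavor of the stochastic dynamic programming material in week 5, the sketch below runs value iteration on a toy two-state, two-action MDP. The states, transition probabilities, rewards, and discount factor are all invented placeholders, not taken from the course.

```python
# Minimal sketch (illustrative toy problem, not from the course): value
# iteration on a 2-state, 2-action MDP, then greedy policy extraction.

GAMMA = 0.9  # discount factor (placeholder value)
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward
P = {
    0: {0: [(0, 0.8), (1, 0.2)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(1, 0.6), (0, 0.4)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

def q_value(s, a, V):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])

def value_iteration(tol=1e-8):
    """Iterate the Bellman optimality operator until the update is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        V_new = {s: max(q_value(s, a, V) for a in P[s]) for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

V_star = value_iteration()
# Greedy policy with respect to the converged value function.
policy = {s: max(P[s], key=lambda a: q_value(s, a, V_star)) for s in P}
```

Because the Bellman operator is a contraction with modulus GAMMA, the iteration converges geometrically from any initialization; policy iteration, also covered in week 5, reaches the same fixed point in fewer (but more expensive) sweeps.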
Follow this link to access the course website for the previous edition of Optimal and Learning-Based Control.