Optimal control solution techniques for systems with known and unknown dynamics. Dynamic programming, Hamilton-Jacobi reachability, and direct and indirect methods for trajectory optimization. Introduction to model predictive control. Adaptive control, model-based and model-free reinforcement learning, and connections between modern reinforcement learning and fundamental optimal control ideas.
Marco Pavone | Daniele Gammelli |
---|---|
Daniel Morton | Matt Foutter |
---|---|
Lectures are held on Mondays and Wednesdays from 1:30pm to 2:50pm in Huang Engineering Center 18.
Lecture recordings will be available on Canvas.
Recitations are held on Fridays from 9:00am to 10:30am for the first four weeks of the quarter. Recitations are hybrid sessions, held both in Durand 251 and on Zoom (link on Canvas). Each recitation will be recorded and available offline on Canvas.
Office hours will begin in the second week of the quarter.
Marco Pavone's office hours are on Tuesdays 1:00pm to 2:00pm in Durand 261 and by appointment.
Daniele Gammelli's office hours are on Thursdays 4:00pm to 5:00pm in Durand 217 (project discussion).
Daniel Morton's office hours are on Fridays 3:00pm to 5:00pm in Durand 217.
Matt Foutter's hybrid office hours are on Mondays 3:00pm to 4:00pm and Thursdays 5:00pm to 6:00pm in Durand 251.
The class syllabus can be found here.
The schedule below is subject to change.
Week | Topic | Lecture Slides |
---|---|---|
1 | Course overview; intro to nonlinear optimization<br>Optimization theory<br>Recitation: Automatic differentiation with JAX<br>Monday: HW0 (ungraded) out | Lecture 1<br>Lecture 2<br>Recitation 1; Code |
2 | Calculus of variations<br>Indirect methods for optimal control<br>Recitation: Convex optimization with CVXPY<br>Monday: HW1 released | Lecture 3<br>Lecture 4<br>Recitation 2; Code |
3 | Pontryagin's maximum principle and computational methods<br>Direct methods<br>Recitation: Regression models | Lecture 5; Code<br>Lecture 6; Code<br>Recitation 3 |
4 | Dynamic programming, discrete LQR<br>Nonlinear LQR for tracking and trajectory generation<br>Recitation: Training neural networks with JAX<br>Monday: HW1 due, HW2 released | Lecture 7<br>Lecture 8; Code<br>Recitation 4 |
5 | Stochastic DP, value iteration, policy iteration<br>Introduction to RL, learning settings<br>Friday: Project proposal due | Lecture 9<br>Lecture 10 |
6 | HJB, HJI, and reachability analysis<br>MPC I: Introduction<br>Monday: HW2 due<br>Tuesday: HW3 released | Lecture 11; Code<br>Lecture 12 |
7 | MPC II: Feasibility and stability<br>MPC III: Robustness<br>Friday: Project midterm report due | Lecture 13<br>Lecture 14 |
8 | Intro to learning, sys ID, adaptive control<br>RL I: Model-free RL - Value-based methods<br>Monday: HW4 released<br>Tuesday: HW3 due | Lecture 15<br>Lecture 16 |
9 | No lecture (Memorial Day)<br>RL II: Model-free RL - Policy optimization | Lecture 17 |
10 | RL III: Model-based RL<br>RL IV: Model-based policy optimization and Conclusions<br>Monday: HW4 due<br>Wednesday: Project video presentation and final report due | Lecture 18<br>Lecture 19 |
Follow this link to access the course website for the previous edition of Optimal and Learning-Based Control.