AA 203: Optimal and Learning-Based Control

Spring 2025

Course Description

Optimal control solution techniques for systems with known and unknown dynamics. Dynamic programming, Hamilton-Jacobi reachability, and direct and indirect methods for trajectory optimization. Introduction to model predictive control. Adaptive control, model-based and model-free reinforcement learning, and connections between modern reinforcement learning and fundamental optimal control ideas.

Instructors

Marco Pavone	Daniele Gammelli

Course Assistants

Daniel Morton	Matt Foutter	Luis Pabon

Meeting Times

Lectures are held on Mondays and Wednesdays from 1:30pm to 2:50pm in Gates B3.

Lecture recordings will be available on Canvas.

Recitations are held on Fridays from 11:00am to 12:00pm for the first four weeks of the quarter. Recitations are hybrid sessions, held both in Durand 251 and on Zoom (link on Canvas). Each recitation will be recorded and made available on Canvas afterward.

Office Hours

Office hours will begin in the second week of the quarter.

Marco Pavone's office hours are on Tuesdays 1:00pm to 2:00pm in Durand 261 and by appointment.

Daniele Gammelli's office hours are on Thursdays 10:30am to 11:30am in Durand 272.

Daniel Morton's office hours are on Wednesdays 10:30am to 11:30am in Durand 272.

Matt Foutter's and Luis Pabon's office hours are on Mondays 10:00am to 11:30am in Durand 251.

Syllabus

The class syllabus can be found here.

Schedule

Subject to change.

Week	Topic	Lecture Slides
1	Course overview; intro to nonlinear optimization / Optimization theory / Recitation: Automatic differentiation with JAX / Monday: HW0 (ungraded) out	Lecture 1, Lecture 2, Recitation 1 (Code)
2	Calculus of variations / Indirect methods for optimal control / Recitation: Convex optimization with CVXPY / Monday: HW1 released	Lecture 3, Lecture 4
3	Pontryagin's maximum principle, continuous-time LQR / Direct methods (collocation, SCP) / Recitation: Regression models	Lecture 5 (Code), Lecture 6 (Code), Recitation 3
4	Dynamic programming (DP), discrete LQR / Nonlinear LQR for tracking and trajectory generation (iLQR, DDP) / Recitation: Training neural networks with JAX / Monday: HW1 due, HW2 released	Lecture 7, Lecture 8 (Code)
5	Stochastic DP, value iteration, policy iteration / HJB, HJI, and reachability analysis	Lecture 9, Lecture 10 (Code)
6	MPC I: Introduction, persistent feasibility / MPC II: Persistent feasibility of MPC (cont’d), stability of MPC, and explicit MPC / Monday: HW2 due, HW3 released	Lecture 11, Lecture 12
7	Intro to learning, sys ID, adaptive control / Intro to imitation learning and RL	Lecture 13, Lecture 14
8	Imitation learning / RL I: Foundations of RL / Monday: HW3 due, HW4 released	Lecture 15, Lecture 16
9	No lecture (Memorial Day) / RL II: Model-free RL - Value-based methods	Lecture 17
10	RL III: Model-free RL - Policy optimization / RL IV: Model-based RL and Conclusions / Wednesday: HW4 due	Lecture 18, Lecture 19
11	Final Exam: Monday, June 9, 3:30-6:30pm

Follow this link to access the course website for the previous edition of Optimal and Learning-Based Control.