Optimal control solution techniques for systems with known and unknown dynamics. Dynamic programming, Hamilton-Jacobi reachability, and direct and indirect methods for trajectory optimization. Introduction to model predictive control. Adaptive control, model-based and model-free reinforcement learning, and connections between modern reinforcement learning and fundamental optimal control ideas.
| Spencer M. Richards | Daniele Gammelli |
|---|---|
| Thomas Lew | Devansh Jalota |
Lectures are held on Mondays and Wednesdays from 1:30pm to 2:50pm in Thornton 102.
Lecture recordings will be available on Canvas.
Recitations are held on Fridays from 9:00am to 10:30am for the first four weeks of the quarter. Recitations are hybrid sessions, held both in Durand 023 Videoconference Room and on Zoom (link on Canvas).
Office hours will take place in Durand Building, Room 217 at the following times:
Daniele Gammelli's office hours are on Tuesdays 2:00pm to 3:00pm (course material) and Thursdays 9:00am to 10:00am (project discussion).
Spencer Richards' office hours are on Mondays and Wednesdays 3:00pm to 4:00pm.
Thomas Lew's office hours are on Fridays 1:00pm to 3:00pm.
Devansh Jalota's office hours are on Wednesdays 8:00am to 10:00am.
The schedule below is subject to change.
| Week | Topic | Lecture Slides |
|---|---|---|
| 1 | Introduction; feedback, stability, optimal control problems<br>Nonlinear optimization theory<br>Recitation: Automatic differentiation (AD) with JAX<br>Monday: HW0 (ungraded) out | Lecture 1<br>Lecture 2<br>Recitation 1; Code |
| 2 | Pontryagin's maximum principle (PMP) and indirect methods<br>Direct methods: Multiple shooting, collocation, and SCP<br>Recitation: Convex optimization with CVXPY<br>Monday: HW1 released | Lecture 3<br>Lecture 4; Code<br>Recitation 2 |
| 3 | Dynamic programming (DP), discrete-time LQR<br>Stochastic DP, value iteration, policy iteration<br>Recitation: Regression models<br>Friday: Project proposal due | Lecture 5<br>Lecture 6; Code<br>Recitation 3 |
| 4 | Nonlinear LQR for tracking, iLQR, DDP<br>The Hamilton-Jacobi-Bellman (HJB) equation, LQR in continuous time<br>Recitation: Training neural networks with JAX<br>Monday: HW1 due, HW2 released | Lecture 7; Code<br>Lecture 8<br>Recitation 4 |
| 5 | System identification and adaptive control<br>Introduction to RL and Markov decision processes (MDPs) | Lecture 9; Code<br>Lecture 10 |
| 6 | MPC I: Introduction<br>MPC II: Feasibility and stability<br>Monday: HW2 due, HW3 released | Lecture 11<br>Lecture 12; Code |
| 7 | MPC III: Robustness<br>MPC IV: Adaptation<br>Monday: Project midterm report due | Lectures 13 and 14 |
| 8 | RL I: Model-free RL - Value-based methods<br>RL II: Model-free RL - Policy optimization<br>Monday: HW3 due, HW4 released | Lecture 15<br>Lecture 16 |
| 9 | No lecture (Memorial Day)<br>RL III: Model-based RL | Lecture 17 |
| 10 | RL IV: Model-based policy optimization<br>Course summary, recent work, future directions<br>Monday: HW4 due<br>Wednesday: Project video presentation and final report due | Lecture 18<br>Course summary (optimal control)<br>Course summary (reinforcement learning) |
Follow this link to access the course website for the previous edition of Optimal and Learning-Based Control.