Kaidi Yang is a Postdoctoral Scholar with the Autonomous Systems Lab in the Department of Aeronautics and Astronautics at Stanford University. He obtained his Ph.D. in Civil Engineering from ETH Zurich in 2019. Prior to that, he received a B.Eng. in Automation, a B.Sc. (minor) in Pure and Applied Mathematics, and an M.Sc. in Control Science and Engineering from Tsinghua University in China.
His primary research interest lies in the control and optimization of multimodal transportation systems, with a particular focus on emerging technologies and shared mobility. He is currently working on various topics in Mobility-on-Demand systems, such as the coordination of mixed fleets of autonomous and human-driven vehicles and the integration of autonomous mobility-on-demand with public transit.
In his free time, Kaidi enjoys traveling, hiking, cooking, and tennis.
Abstract: Optimization problems over dynamic networks have been extensively studied and widely used in the past decades to formulate numerous real-world problems. However, (1) traditional optimization-based approaches do not scale to large networks, and (2) the design of good heuristics or approximation algorithms often requires significant manual trial-and-error. In this work, we argue that data-driven strategies can automate this process and learn efficient algorithms without compromising optimality. To do so, we present network control problems through the lens of reinforcement learning and propose a graph network-based framework to handle a broad class of problems. Instead of naively computing actions over high-dimensional graph elements, e.g., edges, we propose a bi-level formulation where we (1) specify a desired next state via RL, and (2) solve a convex program to best achieve it, leading to drastically improved scalability and performance. We further highlight a collection of desirable features to system designers, investigate design decisions, and present experiments on real-world control problems showing the utility, scalability, and flexibility of our framework.
@inproceedings{GammelliHarrisonEtAl2023, author = {Gammelli, D. and Harrison, J. and Yang, K. and Pavone, M. and Rodrigues, F. and Pereira, F. C.}, title = {Graph Reinforcement Learning for Network Control via Bi-Level Optimization}, booktitle = {{Int. Conf. on Machine Learning}}, year = {2023}, address = {Honolulu, Hawaii}, month = jul, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2305.09129} }
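The bi-level decomposition described in the abstract can be illustrated with a minimal, self-contained sketch (pure Python; all names are illustrative, and the simplex projection below merely stands in for the paper's convex program): an upper level proposes a desired next state, and a lower level solves a convex program to realize it as closely as possible.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex.

    Solves the convex program  min ||w - v||^2  s.t.  w >= 0, sum(w) = 1,
    standing in for the lower-level optimization in the bi-level scheme.
    """
    u = sorted(v, reverse=True)
    cumsum, rho = 0.0, 0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        if uj - (cumsum - 1.0) / j > 0:
            rho = j
    theta = (sum(u[:rho]) - 1.0) / rho
    return [max(vi - theta, 0.0) for vi in v]

# Upper level (stand-in for the RL policy): propose a desired next state
# over three regions; the proposal need not be feasible.
desired = [0.5, 0.2, 0.1]
# Lower level: convex projection yields the closest feasible distribution.
feasible = project_to_simplex(desired)
print(feasible)  # entries are nonnegative and sum to 1
```

The point of the split is that the RL agent only outputs a low-dimensional desired state, while feasibility with respect to the network is enforced by the optimization layer, rather than learned.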
Abstract: In this chapter, we will discuss some of the privacy-related challenges that have arisen in data-driven cyber-physical systems. While user data can help data-driven systems improve their efficiency, safety, and adaptability, the way that this data is collected and analyzed can lead to privacy risks for users. If left unaddressed, potential privacy risks may deter users from contributing their data to cyber-physical systems, thereby limiting the effectiveness of data-driven tools employed by the systems. To ensure that users are protected from privacy risks and feel comfortable sharing their data with cyber-physical systems, this chapter discusses how differential privacy and cryptography, techniques originally developed for privacy in health care and computer systems respectively, can be used to conduct data analysis and optimization with principled privacy guarantees in cyber-physical systems. Along the way, we will discuss and compare the properties, strengths, and weaknesses of these approaches. We will also show why these techniques address many of the privacy-related issues that arise when using data anonymization and data aggregation, two of the most common approaches to privacy-aware data sharing.
@incollection{TsaoGopalakrishnanEtAl2023b, author = {Tsao, M. and Gopalakrishnan, K. and Yang, K. and Pavone, M.}, title = {Privacy-Aware Control of Cyber Physical Systems}, booktitle = {Smarter Cyber Physical Systems: Enabling Methodologies and Application}, year = {2023}, publisher = {{CRC Press}}, note = {Submitted}, keywords = {sub}, owner = {gkarthik}, timestamp = {2022-12-06} }
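As a concrete illustration of the differential-privacy side of the chapter, here is a minimal sketch (not taken from the chapter; names are illustrative) of the Laplace mechanism for a counting query: noise with scale sensitivity/epsilon is added to the true count, yielding epsilon-differential privacy.

```python
import math
import random

def laplace_sample(scale, u=None):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling.

    Passing u in (0, 1) makes the draw reproducible; u = 0.5 gives zero noise.
    """
    if u is None:
        u = random.random()
    # Inverse CDF of the Laplace distribution centered at 0.
    return -scale * math.copysign(1.0, u - 0.5) * math.log(1.0 - 2.0 * abs(u - 0.5))

def private_count(true_count, epsilon, u=None):
    """Counting queries have sensitivity 1, so the noise scale is 1/epsilon."""
    return true_count + laplace_sample(1.0 / epsilon, u=u)

print(private_count(100, epsilon=0.5, u=0.5))  # 100.0: median noise is zero
```

Smaller epsilon means larger noise and stronger privacy, which is the accuracy/privacy trade-off the chapter compares against anonymization and aggregation.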
Abstract: Network routing problems are common across many engineering applications. Computing optimal routing policies requires knowledge about network demand, i.e., the origin and destination (OD) of all requests in the network. However, privacy considerations make it challenging to share individual OD data that would be required for computing optimal policies. Privacy can be particularly challenging in standard network routing problems because sources and sinks can be easily identified from flow conservation constraints, making feasibility and privacy mutually exclusive. In this paper, we present a differentially private algorithm for network routing problems. The main ingredient is a reformulation of network routing which moves all user data-dependent parameters out of the constraint set and into the objective function. We then present an algorithm for solving this formulation based on a differentially private variant of stochastic gradient descent. In this algorithm, differential privacy is achieved by injecting noise, and one may wonder if this noise injection compromises solution quality. We prove that our algorithm is both differentially private and asymptotically optimal as the size of the training set goes to infinity. We corroborate the theoretical results with numerical experiments on a road traffic network which show that our algorithm provides differentially private and near-optimal solutions in practice.
@inproceedings{TsaoGopalakrishnanEtAl2023, author = {Tsao, M. and Gopalakrishnan, K. and Yang, K. and Pavone, M.}, title = {Differentially Private Stochastic Convex Optimization for Network Routing Applications}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2023}, address = {Singapore}, doi = {10.1109/CDC49753.2023.10383207}, url = {https://ieeexplore.ieee.org/abstract/document/10383207/}, owner = {somrita}, timestamp = {2024-02-29} }
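The differentially private SGD variant mentioned in the abstract follows a standard pattern, sketched below in pure Python (illustrative, not the paper's implementation): clip each per-sample gradient so no single user dominates the update, average, and add noise calibrated to the clipping bound.

```python
import math
import random

def clip_gradient(g, max_norm):
    """Scale gradient g down so its L2 norm is at most max_norm.

    Clipping bounds each sample's influence, which is what lets calibrated
    noise yield a differential privacy guarantee.
    """
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm <= max_norm:
        return list(g)
    return [gi * max_norm / norm for gi in g]

def dp_sgd_step(params, per_sample_grads, lr, max_norm, sigma, rng):
    """One DP-SGD step: clip each per-sample gradient, average, add Gaussian
    noise scaled to the clipping bound, then take a descent step."""
    n, d = len(per_sample_grads), len(params)
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    avg = [sum(g[i] for g in clipped) / n for i in range(d)]
    noisy = [avg[i] + rng.gauss(0.0, sigma * max_norm / n) for i in range(d)]
    return [params[i] - lr * noisy[i] for i in range(d)]

print(clip_gradient([3.0, 4.0], max_norm=1.0))  # [0.6, 0.8]
```

The asymptotic-optimality result in the paper says, roughly, that the cost of this injected noise vanishes as the training set grows.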
Abstract: Data-driven methodologies offer many exciting upsides, but they also introduce new challenges, particularly in the realm of user privacy. Specifically, the way data is collected can pose privacy risks to end users. In many routing services, a single entity (e.g., the routing service provider) collects and manages user trajectory data. When it comes to user privacy, these systems have a central point of failure since users have to trust that this entity will not sell or use their data to infer sensitive private information. With this as motivation, we study the problem of using location data for routing services in a privacy-preserving way. Rather than having users report their location to a central operator, we present a protocol in which users participate in a decentralized and privacy-preserving computation to estimate travel times for the roads in the network in a way that no individual's location is ever observed by any other party. The protocol uses the Laplace mechanism in conjunction with secure multi-party computation to ensure that it is cryptographically secure and that its output is differentially private. The protocol is computationally efficient and does not require specialized hardware; all it needs is GPS, which is included in most mobile devices. A natural question is whether privacy necessitates degradation in accuracy or system performance. We show that if a road has sufficiently high capacity, then the travel time estimated by our protocol is provably close to the ground truth travel time. We validate the protocol through numerical experiments which show that using the protocol as a routing service provides privacy guarantees with minimal overhead to user travel time.
@inproceedings{TsaoYangEtAl2022, author = {Tsao, M. and Yang, K. and Gopalakrishnan, K. and Pavone, M.}, title = {Private Location Sharing for Decentralized Routing Services}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2022}, doi = {10.1109/ITSC55140.2022.9922387}, month = oct, url = {https://arxiv.org/abs/2202.13305}, owner = {gkarthik}, timestamp = {2022-11-21} }
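The secure multi-party computation ingredient can be sketched with additive secret sharing (a minimal illustration, not the paper's protocol; all names and numbers are made up): each user splits its report into random shares that are individually meaningless, yet sum to the true value, so no single aggregator ever sees an individual's data.

```python
import random

PRIME = 2_147_483_647  # field modulus for additive secret sharing

def share(secret, n_parties, rng):
    """Split an integer secret into n additive shares that sum to it mod PRIME.
    Any n-1 shares together reveal nothing about the secret."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

rng = random.Random(7)
# Three users report (already noise-perturbed) travel times in seconds.
reports = [120, 95, 140]
# Each user sends one share to each of three aggregators ...
all_shares = [share(r, 3, rng) for r in reports]
# ... each aggregator sums the shares it received, then totals are combined:
partials = [sum(col) % PRIME for col in zip(*all_shares)]
print(reconstruct(partials) == sum(reports))  # True
```

In the paper's protocol the Laplace mechanism is layered on top, so the reconstructed aggregate is also differentially private.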
Abstract: The era of Big Data has brought with it a richer understanding of user behavior through massive data sets, which can help organizations optimize the quality of their services. In the context of transportation research, mobility data can provide Municipal Authorities (MA) with insights on how to operate, regulate, or improve the transportation network. Mobility data, however, may contain sensitive information about end users and trade secrets of Mobility Providers (MP). Due to this data privacy concern, MPs may be reluctant to contribute their datasets to MA. Using ideas from cryptography, we propose an interactive protocol between an MA and an MP in which MA obtains insights from mobility data without MP having to reveal its trade secrets or sensitive data of its users. This is accomplished in two steps: a commitment step, and a computation step. In the first step, Merkle commitments and aggregated traffic measurements are used to generate a cryptographic commitment. In the second step, MP extracts insights from the data and sends them to MA. Using the commitment and zero-knowledge proofs, MA can certify that the information received from MP is accurate, without needing to directly inspect the mobility data. We also present a differentially private version of the protocol that is suitable for the large query regime. The protocol is verifiable for both MA and MP in the sense that dishonesty from one party can be detected by the other. The protocol can be readily extended to the more general setting with multiple MPs via secure multi-party computation.
@article{TsaoYangZoepfPavone2021, author = {Tsao, M. and Yang, K. and Zoepf, S. and Pavone, M.}, title = {Trust but Verify: Cryptographic Data Privacy for Mobility Management}, journal = {{IEEE Transactions on Control of Network Systems}}, volume = {9}, number = {1}, pages = {50--61}, year = {2022}, keywords = {pub}, url = {https://arxiv.org/abs/2104.07768} }
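The Merkle-commitment step can be illustrated with a short, self-contained sketch (illustrative only; the measurement strings below are made up): the provider commits to its measurements via a Merkle root, and later any claimed measurement can be checked against that root with a logarithmic-size proof.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Commit to a list of byte-string measurements via a Merkle tree root."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate last node on odd levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes (with side flags) proving leaves[index] is in the tree."""
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))  # (sibling, am I right child?)
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(root, leaf, proof):
    node = _h(leaf)
    for sibling, is_right in proof:
        node = _h(sibling + node) if is_right else _h(node + sibling)
    return node == root

data = [b"edge1:42s", b"edge2:17s", b"edge3:65s", b"edge4:23s"]
root = merkle_root(data)
print(verify(root, data[2], merkle_proof(data, 2)))        # True
print(verify(root, b"edge3:999s", merkle_proof(data, 2)))  # False
```

Because the root is fixed before the computation step, a provider that later misreports a measurement will fail verification, which is the "trust but verify" property in the title.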
Abstract: Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited number of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.
@inproceedings{GammelliYangEtAl2022, author = {Gammelli, D. and Yang, K. and Harrison, J. and Rodrigues, F. and Pereira, F. and Pavone, M.}, booktitle = {{ACM Int. Conf. on Knowledge Discovery and Data Mining}}, title = {Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand}, year = {2022}, keywords = {pub}, owner = {gammelli}, url = {https://arxiv.org/abs/2202.07147}, timestamp = {2022-03-02} }
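The "train so that a little experience in a new city goes a long way" idea can be sketched with a first-order, MAML-style loop on a toy problem (pure Python, purely illustrative; the paper itself uses an actor-critic with recurrent graph neural networks, not this scheme). Tasks are quadratics with different optima, standing in for different cities; meta-training moves the shared initialization so that a single adaptation step performs well on average.

```python
def grad(x, c):
    return 2.0 * (x - c)          # derivative of the task loss (x - c)^2

def adapt(x0, c, inner_lr=0.25):
    """One inner-loop gradient step on task c (the 'small amount of experience')."""
    return x0 - inner_lr * grad(x0, c)

def meta_train(tasks, x0=0.0, meta_lr=0.05, steps=200, inner_lr=0.25):
    """First-order meta-update: improve the initialization so that a single
    adaptation step does well on average across tasks ('cities')."""
    for _ in range(steps):
        g = 0.0
        for c in tasks:
            x1 = adapt(x0, c, inner_lr)
            g += grad(x1, c)      # first-order approximation of the meta-gradient
        x0 -= meta_lr * g / len(tasks)
    return x0

tasks = [1.0, 2.0, 3.0]           # toy stand-ins for different cities
x0 = meta_train(tasks)
# x0 converges to the task mean; one adaptation step then halves the
# remaining gap to each individual task optimum.
```

The qualitative takeaway matches the abstract: the meta-trained initialization is not optimal for any single task, but it is positioned so that very little task-specific experience yields good performance.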
Abstract: Dynamic network flow models have been extensively studied and widely used in the past decades to formulate many problems with great real-world impact, such as transportation, supply chain management, power grid control, and more. Within this context, time-expansion techniques currently represent a generic approach for solving control problems over dynamic networks. However, the complexity of these methods does not allow traditional approaches to scale to large networks, especially when these need to be solved recursively over a receding horizon (e.g., to yield a sequence of actions in model predictive control). Moreover, tractable optimization-based approaches are often limited to simple linear deterministic settings and are not able to handle environments with stochastic, non-linear, or unknown dynamics. In this work, we present dynamic network flow problems through the lens of reinforcement learning and propose a graph network-based framework that can handle a wide variety of problems and learn efficient algorithms without significantly compromising optimality. Instead of a naive and poorly-scalable formulation, in which agent actions (and thus network outputs) consist of actions on edges, we present a two-phase decomposition. The first phase consists of an RL agent specifying desired outcomes to the actions. The second phase exploits the problem structure to solve a convex optimization problem and achieve (as best as possible) these desired outcomes. This formulation leads to dramatically improved scalability and performance. We further highlight a collection of features that are potentially desirable to system designers, investigate design decisions, and present experiments showing the utility, scalability, and flexibility of our framework.
@inproceedings{GammelliHarrisonEtAl2022, author = {Gammelli, D. and Harrison, J. and Yang, K. and Pavone, M. and Rodrigues, F. and Pereira, F. C.}, booktitle = {{Learning on Graphs Conference}}, title = {Graph Reinforcement Learning for Network Control via Bi-Level Optimization}, year = {2022}, keywords = {pub}, owner = {gammelli}, timestamp = {2022-11-24} }
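The time-expansion technique that the abstract identifies as the scalability bottleneck can be sketched directly (illustrative names; a simplified version with unit waiting arcs): each node is replicated once per time step, and each static edge with travel time tau becomes arcs between the corresponding copies.

```python
def time_expand(nodes, edges, horizon):
    """Build a time-expanded graph from a static network.

    `edges` maps (u, v) -> travel time in steps. Each node u becomes copies
    (u, t) for t = 0..horizon, each static edge becomes movement arcs
    (u, t) -> (v, t + tau), and waiting arcs (u, t) -> (u, t + 1) are added.
    The graph grows with the horizon, which is why these formulations stop
    scaling for large networks and long receding horizons.
    """
    expanded = []
    for t in range(horizon):
        for u in nodes:
            expanded.append(((u, t), (u, t + 1)))          # waiting arc
        for (u, v), tau in edges.items():
            if t + tau <= horizon:
                expanded.append(((u, t), (v, t + tau)))    # movement arc
    return expanded

arcs = time_expand(["a", "b"], {("a", "b"): 2}, horizon=3)
print(len(arcs))  # 6 waiting arcs + 2 movement arcs = 8
```

The RL-plus-convex-optimization decomposition in the paper avoids ever materializing this expanded graph inside the learned component.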
Abstract: Automated vehicles (AVs) are expected to be beneficial for Mobility-on-Demand (MoD), thanks to their ability to be globally coordinated. To facilitate the steady transition towards full autonomy, we consider the transition period of AV deployment, whereby an MoD system operates a mixed fleet of AVs and human-driven vehicles (HVs). In such systems, AVs are centrally coordinated by the operator, and HVs might strategically respond to the coordination of AVs. We devise computationally tractable strategies to coordinate mixed fleets in MoD systems. Specifically, we model an MoD system with a mixed fleet using a Stackelberg framework where the MoD operator serves as the leader and human-driven vehicles serve as the followers. We develop two models: 1) a steady-state model to analyze the properties of the problem and determine the planning variables (e.g., compensations, prices, and the fleet size of AVs), and 2) a time-varying model to design a real-time coordination algorithm for AVs. The proposed models are validated using a case study inspired by real operational data of an MoD service in Singapore. Results show that the proposed algorithms can significantly improve system performance.
@inproceedings{YangTsaoEtAl2021, author = {Yang, K. and Tsao, M. and Xu, X. and Pavone, M.}, title = {Real-Time Control of Mixed Fleets in Mobility-on-Demand Systems}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2021}, note = {{Extended Version available} at \url{https://arxiv.org/abs/2008.08131}}, address = {Indianapolis, IN, USA}, month = sep, url = {https://arxiv.org/pdf/2008.08131}, owner = {ykd07}, timestamp = {2021-06-15} }
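The Stackelberg structure can be illustrated with a deliberately simplified numeric sketch (all quantities invented; the paper's models are far richer): the followers best-respond to the leader's compensation, and the leader anticipates that response when optimizing.

```python
def follower_supply(compensation, reservation_wages):
    """Followers (human drivers) best-respond: drive iff the pay beats their
    outside option. This is the lower level of the Stackelberg game."""
    return sum(1 for w in reservation_wages if compensation >= w)

def leader_best_compensation(reservation_wages, demand, fare, grid):
    """Leader (the MoD operator) anticipates the followers' response and picks
    the per-trip compensation maximizing its profit over a candidate grid."""
    best_c, best_profit = None, float("-inf")
    for c in grid:
        served = min(demand, follower_supply(c, reservation_wages))
        profit = served * (fare - c)
        if profit > best_profit:
            best_c, best_profit = c, profit
    return best_c, best_profit

wages = [3.0, 4.0, 5.0, 8.0]          # drivers' outside options per trip
best_c, profit = leader_best_compensation(wages, demand=3, fare=10.0,
                                          grid=[3.0, 4.0, 5.0, 8.0])
print(best_c, profit)                 # paying 5.0 recruits exactly 3 drivers
```

Note how the optimum pays just enough to recruit the needed supply and no more, which is the kind of planning-variable trade-off the steady-state model analyzes.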
Abstract: Autonomous mobility-on-demand (AMoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of robotic, self-driving vehicles. Given a graph representation of the transportation network - one where, for example, nodes represent areas of the city, and edges the connectivity between them - we argue that the AMoD control problem is naturally cast as a node-wise decision-making problem. In this paper, we propose a deep reinforcement learning framework to control the rebalancing of AMoD systems through graph neural networks. Crucially, we demonstrate that graph neural networks enable reinforcement learning agents to recover behavior policies that are significantly more transferable, generalizable, and scalable than policies learned through other approaches. Empirically, we show how the learned policies exhibit promising zero-shot transfer capabilities when faced with critical portability tasks such as inter-city generalization, service area expansion, and adaptation to potentially complex urban topologies.
@inproceedings{GammelliYangEtAl2021, author = {Gammelli, D. and Yang, K. and Harrison, J. and Rodrigues, F. and Pereira, F. C. and Pavone, M.}, title = {Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems}, year = {2021}, url = {https://arxiv.org/abs/2104.11434}, owner = {jh2}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, timestamp = {2021-03-23} }
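The node-wise decision-making view can be sketched with one round of mean-aggregation message passing followed by a per-node action (pure Python, illustrative only; the learned policy in the paper replaces the hand-written rule below). Because every node applies the same local update, the policy is independent of graph size, which is the property behind the transfer results.

```python
def message_passing_layer(features, adjacency):
    """One round of mean-aggregation message passing: each node's new
    embedding averages its own feature with its neighbors' features."""
    out = []
    for i, f in enumerate(features):
        vals = [f] + [features[j] for j in adjacency[i]]
        out.append(sum(vals) / len(vals))
    return out

def rebalancing_action(features, adjacency):
    """Node-wise policy stub: after aggregation, nodes with above-average
    vehicle surplus send vehicles out; the rest receive."""
    h = message_passing_layer(features, adjacency)
    mean = sum(h) / len(h)
    return ["send" if hi > mean else "receive" for hi in h]

# Three regions in a line; the feature is each region's vehicle surplus.
adj = {0: [1], 1: [0, 2], 2: [1]}
surplus = [9.0, 0.0, -3.0]
print(rebalancing_action(surplus, adj))  # ['send', 'send', 'receive']
```

Adding a node to the graph changes nothing about the update rule, only the adjacency dictionary, which is why such policies can be evaluated zero-shot on an expanded service area.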