Abstract: This work introduces a framework to diagnose the strengths and shortcomings of Autonomous Vehicle (AV) collision avoidance technology with synthetic yet realistic potential collision scenarios adapted from real-world, collision-free data. Our framework generates counterfactual collisions with diverse crash properties, e.g., crash angle and velocity, between an adversary and a target vehicle by adding perturbations to the adversary’s predicted trajectory from a learned AV behavior model. Our main contribution is to ground these adversarial perturbations in realistic behavior as defined through the lens of data-alignment in the behavior model’s parameter space. Then, we cluster these synthetic counterfactuals to identify plausible and representative collision scenarios to form the basis of a test suite for downstream AV system evaluation. We demonstrate our framework using two state-of-the-art behavior prediction models as sources of realistic adversarial perturbations, and show that our scenario clustering evokes interpretable failure modes from a baseline AV policy under evaluation.
@inproceedings{DyroFoutterEtAl2024, author = {Dyro, R. and Foutter, M. and Li, R. and Di Lillo, L. and Schmerling, E. and Zhou, X. and Pavone, M.}, title = {Realistic Extreme Behavior Generation for Improved AV Testing}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2025}, url = {/wp-content/papercite-data/pdf/Dyro.Foutter.Li.ea.ICRA2025.pdf}, owner = {foutter}, keywords = {sub}, timestamp = {2024-09-15} }
Abstract: Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., for traffic congestion management or allocation of food donations. Yet the applicability of these mechanisms remains limited, since it is challenging for users to learn how to bid an artificial currency that has no value outside the mechanism. Indeed, users must learn the value of the currency as well as how to optimally spend it in a coupled manner. In this paper, we study learning to bid in two prominent classes of artificial currency auctions: those in which currency is issued at the beginning of a finite period only to be spent over the period; and those where, in addition to the initial endowment, currency is transferred among users by redistributing payments in each time step. In the latter class the currency has been referred to as karma, since users do not only spend karma to acquire public resources but also gain karma for yielding them. In both classes, we propose a simple learning strategy, called adaptive karma pacing strategy, and show that a) it is asymptotically optimal for a single agent bidding against a stationary competition; b) it leads to convergent learning dynamics when all agents adopt it; and c) it constitutes an approximate Nash equilibrium as the number of agents grows. This requires a novel analysis in comparison to adaptive pacing strategies in monetary auctions, since we depart from the classical assumption that the currency has known value outside the auctions. The analysis is further complicated by the possibility of both spending and gaining currency in auctions with redistribution.
@inproceedings{BerriaudElokdaEtAl2024, author = {Berriaud, D. and Elokda, E. and Jalota, D. and Frazzoli, E. and Pavone, M. and Dörfler, F.}, title = {To Spend or to Gain: Online Learning in Repeated Karma Auctions}, booktitle = {The Conference on Web and Internet Economics (WINE)}, year = {2024}, address = {Edinburgh, United Kingdom}, month = jul, keywords = {sub}, owner = {devanshjalota}, timestamp = {2024-03-01}, url = {https://arxiv.org/abs/2403.04057} }
Abstract: Quantum computation shows promise for addressing numerous classically intractable problems, such as optimization tasks. Many optimization problems are NP-hard, meaning that they scale exponentially with problem size and thus cannot be addressed at scale by traditional computing paradigms. The recently proposed quantum algorithm https://arxiv.org/abs/2206.14999 addresses this challenge for some NP-hard problems, and is based on classical semidefinite programming (SDP). In this manuscript, we generalize the SDP-inspired quantum algorithm to sum-of-squares programming, which targets a broader problem set. Our proposed algorithm addresses degree-k polynomial optimization problems with N ≤ 2^n variables (which are representative of many NP-hard problems) using O(nk) qubits, O(k) quantum measurements, and O(poly(n)) classical calculations. We apply the proposed algorithm to the prototypical Max-kSAT problem and compare its performance against classical sum-of-squares, state-of-the-art heuristic solvers, and random guessing. Simulations show that the performance of our algorithm surpasses that of classical sum-of-squares after rounding. Our results further demonstrate that our algorithm is suitable for large problems and approximates the best known classical heuristics, while also providing a more generalizable approach compared to problem-specific heuristics.
@inproceedings{WangBrownEtAl2024, author = {Wang, I. W. and Brown, R. and Patti, T. L. and Anandkumar, A. and Pavone, M. and Yelin, S. F.}, title = {Sum-of-Squares inspired Quantum Metaheuristic for Polynomial Optimization with the Hadamard Test and Approximate Amplitude Constraints}, booktitle = {}, year = {2024}, keywords = {sub}, owner = {amine}, timestamp = {2024-09-19}, url = {https://arxiv.org/abs/2408.07774} }
Abstract: The ability to differentiate through optimization problems has unlocked numerous applications, from optimization-based layers in machine learning models to complex design problems formulated as bilevel programs. It has been shown that exploiting problem structure can yield significant computation gains for optimization and, in some cases, enable distributed computation. One should expect that this structure can be similarly exploited for gradient computation. In this work, we discuss a decentralized framework for computing gradients of constraint-coupled optimization problems. First, we show that this framework results in significant computational gains, especially for large systems, and provide sufficient conditions for its validity. Second, we leverage exponential decay of sensitivities in graph-structured problems towards building a fully distributed algorithm with convergence guarantees. Finally, we use the methodology to rigorously estimate marginal emissions rates in power systems models. Specifically, we demonstrate how the distributed scheme allows for accurate and efficient estimation of these important emissions metrics on large dynamic power system models.
@article{ValenzuelaBrownEtAl2024, author = {Valenzuela, L. F. and Brown, R. and Pavone, M.}, title = {Decentralized Implicit Differentiation}, journal = {{IEEE Transactions on Control of Network Systems}}, note = {Submitted}, year = {2024}, url = {https://arxiv.org/abs/2403.01260}, owner = {rabrown1}, timestamp = {2024-03-05}, keywords = {sub} }
Abstract: Hierarchical policies enable strong performance in many sequential decision-making problems, such as those with high-dimensional action spaces, those requiring long-horizon planning, and settings with sparse rewards. However, learning hierarchical policies from static offline datasets presents a significant challenge. Crucially, actions taken by higher-level policies may not be directly observable within hierarchical controllers, and the offline dataset might have been generated using a different policy structure, hindering the use of standard offline learning algorithms. In this work, we propose OHIO: a framework for offline reinforcement learning (RL) of hierarchical policies. Our framework leverages knowledge of the policy structure to solve the inverse problem, recovering the unobservable high-level actions that likely generated the observed data under our hierarchical policy. This approach constructs a dataset suitable for off-the-shelf offline training. We demonstrate our framework on robotic and network optimization problems and show that it substantially outperforms end-to-end RL methods and improves robustness. We investigate a variety of instantiations of our framework, both in direct deployment of policies trained offline and when online fine-tuning is performed.
@inproceedings{SchmidtGammelliEtAl2024, author = {Schmidt, C. and Gammelli, D. and Harrison, J. and Pavone, M. and Rodrigues, F.}, title = {Offline Hierarchical Reinforcement Learning via Inverse Optimization}, booktitle = {{Conf. on Neural Information Processing Systems}}, keywords = {sub}, note = {Submitted}, year = {2024}, owner = {gammelli}, timestamp = {2024-08-14} }
Abstract: The past few years have seen immense progress on two fronts that are critical to safe, widespread mobile robot deployment: predicting uncertain motion of multiple agents, and planning robot motion under uncertainty. However, the numerical methods required on each front have resulted in a mismatch of representation for prediction and planning. In prediction, numerical tractability is usually achieved by coarsely discretizing time, and by representing multimodal multi-agent interactions as distributions with infinite support. On the other hand, safe planning typically requires very fine time discretization, paired with distributions with compact support, to reduce conservativeness and ensure numerical tractability. The result is, when existing predictors are coupled with planning and control, one may often find unsafe motion plans. This paper proposes ZAPP (Zonotope Agreement of Prediction and Planning) to resolve the representation mismatch. ZAPP unites a prediction-friendly coarse time discretization and a planning-friendly zonotope uncertainty representation; the method also enables differentiating through a zonotope collision check, allowing one to integrate prediction and planning within a gradient-based optimization framework. Numerical examples show how ZAPP can produce safer trajectories compared to baselines in interactive scenes.
@inproceedings{PaparussoKousikEtAl2024, title = {{ZAPP!} Zonotope Agreement of Prediction and Planning for Continuous-Time Collision Avoidance with Discrete-Time Dynamics}, author = {Paparusso, L. and Kousik, S. and Schmerling, E. and Braghin, F. and Pavone, M.}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, owner = {rdyro}, timestamp = {2023-09-28}, keywords = {sub}, year = {2024}, url = {/wp-content/papercite-data/pdf/Paparusso.ea.ICRA24.pdf} }
Abstract: We discuss guidelines for evaluating the performance of parameterized stochastic solvers for optimization problems, with particular attention to systems that employ novel hardware, such as digital quantum processors running variational algorithms, analog processors performing quantum annealing, or Coherent Ising Machines. We illustrate through an example a benchmarking procedure grounded in the statistical analysis of the expectation of a given performance metric measured in a test environment. In particular, we discuss the necessity and cost of setting parameters that affect the algorithm’s performance. The optimal value of these parameters could vary significantly between instances of the same target problem. We present an open-source software package that facilitates the design, evaluation, and visualization of practical parameter tuning strategies for complex use of the heterogeneous components of the solver. We examine in detail an example using parallel tempering and a simulator of a photonic Coherent Ising Machine, and display the scoring of an illustrative baseline family of parameter-setting strategies that feature an exploration-exploitation trade-off.
@inproceedings{NeiraBrownEtAl2024, author = {Neira, D. E. B. and Brown, R. and Sathe, P. and Wudarski, F. and Pavone, M. and Rieffel, E. G. and Venturelli, D.}, title = {Benchmarking the Operation of Quantum Heuristics and Ising Machines: Scoring Parameter Setting Strategies on Optimization Applications}, year = {2024}, keywords = {sub}, note = {Submitted}, url = {https://arxiv.org/abs/2402.10255}, owner = {rabrown1}, timestamp = {2024-03-01} }
Abstract: We study the convex hulls of reachable sets of nonlinear systems with bounded disturbances and uncertain initial conditions. Reachable sets play a critical role in control, but remain notoriously challenging to compute, and existing over-approximation tools tend to be conservative or computationally expensive. In this work, we characterize the convex hulls of reachable sets as the convex hulls of solutions of an ordinary differential equation with initial conditions on the sphere. This finite-dimensional characterization unlocks an efficient sampling-based estimation algorithm to accurately over-approximate reachable sets. We also study the structure of the boundary of the reachable convex hulls and derive error bounds for the estimation algorithm. We give applications to neural feedback loop analysis and robust MPC.
@article{LewBonalliPavoneTAC2024, author = {Lew, T. and Bonalli, R. and Pavone, M.}, title = {Convex Hulls of Reachable Sets}, journal = {{IEEE Transactions on Automatic Control}}, year = {2024}, note = {Submitted}, url = {https://arxiv.org/abs/2303.17674}, keywords = {sub}, owner = {lew}, timestamp = {2024-03-01} }
Abstract: Artificial currencies have grown in popularity in many real-world resource allocation settings. In particular, they have gained traction in government benefits programs, e.g., food assistance or transit benefits programs, that provide support to eligible users in the population, e.g., through subsidized food or public transit. However, such programs are prone to two common fraud mechanisms: (i) misreporting fraud, wherein users can misreport their private attributes to gain access to more artificial currency (credits) than they are entitled to, and (ii) black market fraud, wherein users may seek to sell some of their credits in exchange for real money. In this work, we develop mechanisms to address these two sources of fraud in artificial currency based government benefits programs. To address misreporting fraud, we propose an audit mechanism that induces a two-stage game between an administrator and users, wherein the administrator running the benefits program can audit users at some cost and levy fines against them for misreporting their information. For this audit game, we first investigate the conditions on the administrator’s budget to establish the existence of equilibria and present a linear programming approach to compute these equilibria under both the signaling game and Bayesian persuasion formulations. We then show that the decrease in misreporting fraud corresponding to our audit mechanism far outweighs the spending of the administrator to run it by establishing that its total costs are lower than that of the status quo with no audits. To highlight the practical viability of our audit mechanism in mitigating misreporting fraud, we present a case study on Washington D.C.’s federal transit benefits program where the proposed audit mechanism even demonstrates several orders of magnitude improvement in total cost compared to a no-audit strategy for some parameter ranges.
@inproceedings{JalotaTsaoEtAl2023, author = {Jalota, D. and Tsao, M. and Pavone, M.}, title = {Catch Me If You Can: Combatting Fraud in Artificial Currency Based Government Benefits Programs}, year = {2024}, url = {https://arxiv.org/abs/2402.16162}, keywords = {sub}, owner = {devanshjalota}, timestamp = {2024-07-02} }
Abstract: Fraudulent or illegal activities are ubiquitous across applications and involve users bypassing the rule of law, often with the strategic aim of obtaining some benefit that would otherwise be unattainable within the bounds of lawful conduct. However, user fraud is detrimental, as it may compromise safety or impose disproportionate negative externalities on particular population groups. To mitigate the potential harms of user fraud, we study the problem of policing such fraud as a security game between an administrator and users. In this game, an administrator deploys R security resources (e.g., police officers) across L locations and levies fines against users engaging in fraud at those locations. For this security game, we study both welfare and revenue maximization administrator objectives. In both settings, we show that computing the optimal administrator strategy is NP-hard and develop natural greedy algorithm variants for the respective settings that achieve at least half the welfare or revenue as the welfare-maximizing or revenue-maximizing solutions, respectively. We also establish a resource augmentation guarantee that our proposed greedy algorithms with one extra resource, i.e., R+1 resources, achieve at least the same welfare (revenue) as the welfare-maximizing (revenue-maximizing) outcome with R resources. Finally, since the welfare and revenue-maximizing solutions can differ significantly, we present a framework inspired by contract theory, wherein a revenue-maximizing administrator is compensated through contracts for the welfare it contributes. Beyond extending our theoretical results in the welfare and revenue maximization settings to studying equilibrium strategies in the contract game, we also present numerical experiments highlighting the efficacy of contracts in bridging the gap between the revenue and welfare-maximizing administrator outcomes.
@inproceedings{JalotaEtAl2024, author = {Jalota, D. and Ostrovsky, M. and Pavone, M.}, title = {When Simple is Near-Optimal in Security Games}, year = {2024}, keywords = {sub}, url = {https://arxiv.org/abs/2402.11209}, owner = {devanshjalota}, timestamp = {2024-03-01} }
Abstract: Machine learning systems deployed in safety-critical robotics settings must be robust to distribution shifts. However, system designers must understand the cause of a distribution shift in order to implement the appropriate intervention or mitigation strategy and prevent system failure. In this paper, we present a novel framework for diagnosing distribution shifts in a streaming fashion by deploying multiple stochastic martingales simultaneously. We show that knowledge of the underlying cause of a distribution shift can lead to proper interventions over the lifecycle of a deployed system. Our experimental framework can easily be adapted to different types of distribution shifts, models, and datasets. We find that our method outperforms existing work on diagnosing distribution shifts in terms of speed, accuracy, and flexibility, and validate the efficiency of our model in both simulated and live hardware settings.
@inproceedings{HindyLuoEtAl2024, author = {Hindy, A. and Luo, R. and Banerjee, S. and Kuck, J. and Schmerling, E. and Pavone, M.}, title = {Diagnostic Runtime Monitoring with Martingales}, note = {Submitted}, booktitle = {}, year = {2024}, url = {https://arxiv.org/abs/2407.21748}, keywords = {sub}, owner = {somrita}, timestamp = {2024-02-09} }
Abstract: Marginal emissions rates – the sensitivity of carbon emissions to electricity demand – are important for evaluating the impact of emissions mitigation measures. Like locational marginal prices, locational marginal emissions rates (LMEs) can vary geographically, even between nearby locations, and may be coupled across time periods because of, for example, storage and ramping constraints. This temporal coupling makes computing LMEs computationally expensive for large electricity networks with high storage and renewable penetrations. Recent work demonstrates that decentralized algorithms can mitigate this problem by decoupling timesteps during differentiation. Unfortunately, we show these potential speedups are negated by the sparse structure inherent in power systems problems. We address these limitations by introducing a parallel, reverse-mode decentralized differentiation scheme that never explicitly instantiates the solution map Jacobian. We show both theoretically and empirically that parallelization is necessary to achieve non-trivial speedups when computing grid emissions sensitivities. Numerical results on a 500 node system indicate that our method can achieve greater than 10x speedups over centralized and serial decentralized approaches.
@inproceedings{DeglerisValenzuelaEtAl2024, author = {Degleris, A. and Valenzuela, L. F. and Rajagopal, R. and Pavone, M. and Gamal, A. E.}, title = {Fast Grid Emissions Sensitivities using Parallel Decentralized Implicit Differentiation}, booktitle = {}, year = {2024}, keywords = {sub}, owner = {amine}, timestamp = {2024-09-19}, url = {https://arxiv.org/abs/2408.10620} }
Abstract: In this chapter, we will discuss some of the privacy-related challenges that have arisen in data-driven cyber-physical systems. While user data can help data-driven systems improve their efficiency, safety, and adaptability, the way that this data is collected and analyzed can lead to privacy risks to users. If left unaddressed, potential privacy risks may deter users from contributing their data to cyber-physical systems, thereby limiting the effectiveness of data-driven tools employed by the systems. To ensure that users are protected from privacy risks and feel comfortable sharing their data with cyber-physical systems, this chapter discusses how differential privacy and cryptography, techniques originally developed for privacy in health care and computer systems respectively, can be used to conduct data analysis and optimization with principled privacy guarantees in cyber-physical systems. Along the way, we will discuss and compare the properties, strengths, and weaknesses of these approaches. We will also show why these techniques address many of the privacy-related issues that arise when using data anonymization and data aggregation, two of the most common approaches to privacy-aware data sharing.
@incollection{TsaoGopalakrishnanEtAl2023b, author = {Tsao, M. and Gopalakrishnan, K. and Yang, K. and Pavone, M.}, title = {Privacy-Aware Control of Cyber Physical Systems}, booktitle = {Smarter Cyber Physical Systems: Enabling Methodologies and Application}, year = {2023}, publisher = {{CRC Press}}, note = {Submitted}, keywords = {sub}, owner = {gkarthik}, timestamp = {2022-12-06} }
Abstract:
@article{JalotaOstrovskyEtAl2023, author = {Jalota, D. and Ostrovsky, M. and Pavone, M.}, title = {Matching with Transfers under Distributional Constraints}, journal = {{Games and Economic Behavior}}, year = {2023}, note = {Submitted}, keywords = {sub}, owner = {devanshjalota}, timestamp = {2023-01-19}, url = {https://arxiv.org/abs/2202.05232} }
Abstract: This letter proposes a novel force-based task-orientation controller for interaction tasks with environmental orientation uncertainties. The main aim of the controller is to align the robot tool along the main task direction (e.g., along screwing, insertion, polishing, etc.) without the use of any external sensors (e.g., vision systems), relying only on end-effector wrench measurements/estimations. We propose a gradient descent-based orientation controller, enhancing its performance with the orientation predictions provided by a Gaussian Process model. Derivation of the controller is presented, together with simulation results (considering a probing task) and experimental results involving various re-orientation scenarios, i.e., i) a task with the robot in interaction with a soft environment, ii) a task with the robot in interaction with a stiff and inclined environment, and iii) a task to enable the assembly of a gear into its shaft. The proposed controller is compared against a state-of-the-art approach, highlighting its ability to re-orient the robot tool even in complex tasks (where the state-of-the-art method fails).
@article{RovedaPavone2024, author = {Roveda, L. and Pavone, M.}, journal = {{IEEE Robotics and Automation Letters}}, title = {Gradient Descent-Based Task-Orientation Robot Control Enhanced With Gaussian Process Predictions}, year = {2024}, volume = {9}, number = {9}, pages = {8035--8042}, month = sep, doi = {10.1109/LRA.2024.3438039}, url = {https://ieeexplore.ieee.org/abstract/document/10621597}, owner = {amine}, timestamp = {2024-09-19} }
Abstract: We study the problem of estimating the convex hull of the image f(X) ⊂ ℝ^n of a compact set X ⊂ ℝ^m with smooth boundary through a smooth function f: ℝ^m → ℝ^n. Assuming that f is a diffeomorphism or a submersion, we derive new bounds on the Hausdorff distance between the convex hull of f(X) and the convex hull of the images f(x_i) of M samples x_i on the boundary of X. When applied to the problem of geometric inference from random samples, our results give tighter and more general error bounds than the state of the art. We present applications to the problems of robust optimization, of reachability analysis of dynamical systems, and of robust trajectory optimization under bounded uncertainty.
@article{LewBonalliJansonPavone2024, author = {Lew, T. and Bonalli, R. and Janson, L. and Pavone, M.}, title = {Estimating the convex hull of the image of a set with smooth boundary: error bounds and applications}, journal = {{Discrete \& Computational Geometry}}, year = {2024}, volume = {}, number = {}, pages = {1--39}, month = aug, doi = {10.1007/s00454-024-00683-5}, url = {https://arxiv.org/abs/2302.13970}, owner = {lew}, timestamp = {2023-02-27} }
Abstract: Foundation models, e.g., large language models, trained on internet-scale data possess zero-shot generalization capabilities that make them a promising technology for anomaly detection for robotic systems. Fully realizing this promise, however, poses two challenges: (i) mitigating the considerable computational expense of these models such that they may be applied online, and (ii) incorporating their judgement regarding potential anomalies into a safe control framework. In this work we present a two-stage reasoning framework: a fast binary anomaly classifier based on analyzing observations in an LLM embedding space, which may trigger a slower fallback selection stage that utilizes the reasoning capabilities of generative LLMs. These stages correspond to branch points in a model predictive control strategy that maintains the joint feasibility of continuing along various fallback plans as soon as an anomaly is detected (while the selector decides), thus ensuring safety. We demonstrate that, even when instantiated with relatively small language models, our fast anomaly classifier outperforms autoregressive reasoning with state-of-the-art GPT models. This enables our runtime monitor to improve the trustworthiness of dynamic robotic systems under resource and time constraints.
@inproceedings{SinhaElhafsiEtAl2024, author = {Sinha, R. and Elhafsi, A. and Agia, C. and Foutter, M. and Schmerling, E. and Pavone, M.}, title = {Real-Time Anomaly Detection and Planning with Large Language Models}, booktitle = {{Robotics: Science and Systems}}, address = {Delft, Netherlands}, month = jul, year = {2024}, owner = {amine}, url = {https://arxiv.org/abs/2407.08735}, timestamp = {2024-09-19} }
Abstract: While real-world problems are often challenging to analyze analytically, deep learning excels in modeling complex processes from data. Existing optimization frameworks like CasADi facilitate seamless usage of solvers but face challenges when integrating learned process models into numerical optimizations. To address this gap, we present the Learning for CasADi (L4CasADi) framework, enabling the seamless integration of PyTorch-learned models with CasADi for efficient and potentially hardware-accelerated numerical optimization. The applicability of L4CasADi is demonstrated with two tutorial examples: First, we optimize a fish’s trajectory in a turbulent river for energy efficiency where the turbulent flow is represented by a PyTorch model. Second, we demonstrate how an implicit Neural Radiance Field environment representation can be easily leveraged for optimal control with L4CasADi. L4CasADi, along with examples and documentation, is available under MIT license at this https URL.
@inproceedings{SalzmannArrizabalagaEtAl2023, author = {Salzmann, T. and Arrizabalaga, J. and Andersson, J. and Pavone, M. and Ryll, M.}, booktitle = {{Learning for Dynamics \& Control Conference}}, title = {Learning for {CasADi}: Data-driven Models in Numerical Optimization}, year = {2024}, month = jul, url = {https://arxiv.org/abs/2312.05873}, owner = {somrita}, timestamp = {2024-03-01} }
Abstract: Operators of Electric Autonomous Mobility-on-Demand (E-AMoD) fleets need to make several real-time decisions such as matching available vehicles to ride requests, rebalancing idle vehicles to areas of high demand, and charging vehicles to ensure sufficient range. While this problem can be posed as a linear program that optimizes flows over a space-charge-time graph, the size of the resulting optimization problem does not allow for real-time implementation in realistic settings. In this work, we present the E-AMoD control problem through the lens of reinforcement learning and propose a graph network-based framework to achieve drastically improved scalability and superior performance over heuristics. Specifically, we adopt a bi-level formulation where we (1) leverage a graph network-based RL agent to specify a desired next state in the space-charge graph, and (2) solve more tractable linear programs to best achieve the desired state while ensuring feasibility. Experiments using real-world data from San Francisco and New York City show that our approach achieves up to 89% of the profits of the theoretically-optimal solution while achieving more than a 100x speedup in computational time. We further highlight promising zero-shot transfer capabilities of our learned policy on tasks such as inter-city generalization and service area expansion, thus showing the utility, scalability, and flexibility of our framework. Finally, our approach outperforms the best domain-specific heuristics with comparable runtimes, with an increase in profits by up to 3.2x.
@inproceedings{SinghalGammelliEtAl2024, author = {Singhal, A. and Gammelli, D. and Luke, J. and Gopalakrishnan, K. and Helmreich, D. and Pavone, M.}, title = {Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning}, booktitle = {{European Control Conference}}, year = {2024}, address = {Stockholm, Sweden}, month = jun, doi = {10.23919/ecc64448.2024.10591098}, owner = {jthluke}, timestamp = {2024-09-12}, url = {https://arxiv.org/abs/2311.05780} }
Abstract: As natural access points to the subsurface, lava tubes and other caves have become premier targets of planetary missions for astrobiological analyses. Few existing robotic paradigms, however, are able to explore such challenging environments. ReachBot is a robot that enables navigation in planetary caves by using extendable and retractable limbs to locomote. In this paper, we outline the potential science return and mission operations for a notional mission that deploys ReachBot to a martian lava tube. We describe the motivating science goals and provide a science traceability matrix to guide payload selection. We also develop a Concept of Operations (ConOps) for ReachBot, providing a framework for deployment and activities on Mars, analyzing mission risks, and developing mitigation strategies.
@inproceedings{DiCuevasQuiñonesEtAl2024, author = {Di, J. and Cuevas-Quiñones, S. and Newdick, S. and Chen, T. G. and Pavone, M. and Lapôtre, M. G. A. and Cutkosky, M.}, title = {Martian Exploration of Lava Tubes (MELT) with ReachBot: Scientific Investigation and Concept of Operations}, booktitle = {{Int. Conf. on Space Robotics}}, year = {2024}, month = jun, owner = {amine}, url = {https://arxiv.org/abs/2406.13857}, timestamp = {2024-09-19} }
Abstract: Recent years have seen significant advances in quantum/quantum-inspired technologies capable of approximately searching for the ground state of Ising spin Hamiltonians. The promise of leveraging such technologies to accelerate the solution of difficult optimization problems has spurred an increased interest in exploring methods to integrate Ising problems as part of their solution process, with existing approaches ranging from direct transcription to hybrid quantum-classical approaches rooted in existing optimization algorithms. While it is widely acknowledged that quantum computers should augment classical computers, rather than replace them entirely, comparatively little attention has been directed toward deriving analytical characterizations of their interactions. In this paper, we present a formal analysis of hybrid algorithms in the context of solving mixed-binary quadratic programs (MBQP) via Ising solvers. By leveraging an existing completely positive reformulation of MBQPs, as well as a new strong-duality result, we show the exactness of the dual problem over the cone of copositive matrices, thus allowing the resulting reformulation to inherit the straightforward analysis of convex optimization. We propose to solve this reformulation with a hybrid quantum-classical cutting-plane algorithm. Using existing complexity results for convex cutting-plane algorithms, we deduce that the classical portion of this hybrid framework is guaranteed to be polynomial time. This suggests that when applied to NP-hard problems, the complexity of the solution is shifted onto the subroutine handled by the Ising solver.
@article{BrownBernalEtAl2024, author = {Brown, R. A. and Bernal Neira, D. E. and Venturelli, D. and Pavone, M.}, title = {A Copositive Framework for Analysis of Hybrid Ising-Classical Algorithms}, journal = {{SIAM Journal on Optimization}}, volume = {34}, number = {2}, pages = {1455--1489}, year = {2024}, month = jun, doi = {10.1137/22M1514581}, url = {https://arxiv.org/abs/2207.13630}, timestamp = {2024-09-19} }
Abstract: Adjusting robot behavior to human preferences can require intensive human feedback, preventing quick adaptation to new users and changing circumstances. Moreover, current approaches typically treat user preferences as a reward, which requires a manual balance between task success and user satisfaction. To integrate new user preferences in a zero-shot manner, our proposed Text2Interaction framework invokes a large language model to generate a task plan, motion preferences as Python code, and parameters of a safe controller. By maximizing the combined probability of task completion and user satisfaction instead of a weighted sum of rewards, we can reliably find plans that fulfill both requirements. We find that 83% of users working with Text2Interaction agree that it integrates their preferences into the robot’s plan, and 94% prefer Text2Interaction over the baseline. Our ablation study shows that Text2Interaction aligns better with unseen preferences than other baselines while maintaining a high success rate.
@inproceedings{ThummAgiaEtAl2024, author = {Thumm, J. and Agia, C. and Pavone, M. and Althoff, M.}, title = {Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction}, booktitle = {{Conf. on Robot Learning}}, year = {2024}, keywords = {press}, owner = {amine}, timestamp = {2024-09-19}, url = {https://arxiv.org/abs/2408.06105} }
Abstract: In Model Predictive Control (MPC), discrepancies between the actual system and the predictive model can lead to substantial tracking errors and significantly degrade performance and reliability. While such discrepancies can be alleviated with more complex models, this often complicates controller design and implementation. By leveraging the fact that many trajectories of interest are periodic, we show that perfect tracking is possible when incorporating a simple observer that estimates and compensates for periodic disturbances. We present the design of the observer and the accompanying tracking MPC scheme, proving that their combination achieves zero tracking error asymptotically, regardless of the complexity of the unmodelled dynamics. We validate the effectiveness of our method, demonstrating asymptotically perfect tracking on a high-dimensional soft robot with nearly 10,000 states and a fivefold reduction in tracking errors compared to a baseline MPC on small-scale autonomous race car experiments.
@inproceedings{PabonEtAl2024, author = {Pabon, L. and Köhler, J. and Alora, J.I. and Eberhard, P.B. and Carron, A. and Zeilinger, M.N. and Pavone, M.}, title = {Perfecting Periodic Trajectory Tracking: Model Predictive Control with a Periodic Observer}, year = {2024}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, address = {Abu Dhabi}, url = {https://arxiv.org/abs/2404.01550}, owner = {lpabon}, timestamp = {2024-07-01} }
Abstract: ReachBot, a proposed robotic platform, employs extendable booms as limbs for mobility in challenging environments, such as Martian caves. When attached to the environment, ReachBot acts as a parallel robot, with reconfiguration driven by the ability to detach and re-place the booms. This ability enables manipulation-focused scientific objectives: for instance, through operating tools, or handling and transporting samples. To achieve these capabilities, we develop a two-part solution, optimizing for robustness against task uncertainty and stochastic failure modes. First, we present a mixed-integer stance planner to determine the positioning of ReachBot’s booms to maximize the task wrench space about the nominal point(s). Second, we present a convex tension planner to determine boom tensions for the desired task wrenches, accounting for the probabilistic nature of microspine grasping. We demonstrate improvements in key robustness metrics from the field of dexterous manipulation, and show a large increase in the volume of the manipulation workspace. Finally, we employ Monte-Carlo simulation to validate the robustness of these methods, demonstrating good performance across a range of randomized tasks and environments, and generalization to cable-driven morphologies. We make our code available at our project webpage, https://stanfordasl.github.io/reachbot_manipulation.
@inproceedings{MortonCutkoskyPavone2024, author = {Morton, D. and Cutkosky, M. and Pavone, M.}, title = {Task-Driven Manipulation with Reconfigurable Parallel Robots}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2024}, keywords = {pub}, url = {https://arxiv.org/pdf/2403.10768.pdf}, owner = {dmorton}, timestamp = {2024-03-16} }
Abstract: When deploying modern machine learning-enabled robotic systems in high-stakes applications, detecting distribution shift is critical. However, most existing methods for detecting distribution shift are not well-suited to robotics settings, where data often arrives in a streaming fashion and may be very high-dimensional. In this work, we present an online method for detecting distribution shift with guarantees on the false positive rate — i.e., when there is no distribution shift, our system is very unlikely (with probability < ε) to falsely issue an alert; any alerts that are issued should therefore be heeded. Our method is specifically designed for efficient detection even with high dimensional data, and it empirically achieves up to 11x faster detection on realistic robotics settings compared to prior work while maintaining a low false negative rate in practice (whenever there is a distribution shift in our experiments, our method indeed emits an alert). We demonstrate our approach in both simulation and hardware for a visual servoing task, and show that our method indeed issues an alert before a failure occurs.
@inproceedings{LuoSinhaEtAl2023, author = {Luo, R. and Sinha, R. and Sun, Y. and Hindy, A. and Zhao, S. and Savarese, S. and Schmerling, E. and Pavone, M.}, title = {Online Distribution Shift Detection via Recency Prediction}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2024}, keywords = {pub}, owner = {gammelli}, timestamp = {2024-09-19}, url = {https://ieeexplore.ieee.org/abstract/document/10611114} }
Abstract: Electrifying a commercial fleet, while concurrently adopting distributed energy resources, such as solar panels and battery storage, can significantly reduce the carbon intensity of its operation. However, coordinating the fleet operations with distributed resources requires an intelligent system to determine their joint dispatch. In this paper, we propose a 24/7 Carbon-Free Electrified Fleet digital twin framework for the coordination of an electric bus fleet, co-located photovoltaic solar arrays, and a battery energy storage system. The framework includes forecasting and surrogate modules for marginal grid emissions factors, solar generation, and bus energy consumption. These inputs are then passed into the optimization module to minimize emissions and the electricity bill. We evaluate the digital platform in a case study for Stanford University’s Marguerite Shuttle fleet assuming (1) non-controllable loads are coupled behind-the-meter, and (2) a stand-alone depot. Additionally, we perform a techno-economic analysis, quantifying the value of a bus depot battery storage system. Fleet operators may leverage our flexible framework to determine electric bus and battery storage dispatch, reduce electricity costs, and achieve 24/7 carbon-free charging.
@article{LukeRibeiroEtAl2024, author = {Luke, J. and Ribeiro, M. and Martin, S. and Balogun, E. and Cezar, G. and Pavone, M. and Rajagopal, R.}, title = {Optimal Coordination of Electric Buses and Battery Storage for Achieving a 24/7 Carbon-Free Electrified Fleet}, journal = {{Applied Energy}}, year = {2024}, note = {In press}, doi = {10.2139/ssrn.4815427}, keywords = {press}, owner = {jthluke}, timestamp = {2024-09-12}, url = {https://dx.doi.org/10.2139/ssrn.4815427} }
Abstract: We revisit the sample average approximation (SAA) approach for general non-convex stochastic programming. We show that applying the SAA approach to problems with expected value equality constraints does not necessarily result in asymptotic optimality guarantees as the number of samples increases. To address this issue, we relax the equality constraints. Then, we prove the asymptotic optimality of the modified SAA approach under mild smoothness and boundedness conditions on the equality constraint functions. Our analysis uses random set theory and concentration inequalities to characterize the approximation error from the sampling procedure. We apply our approach to the problem of stochastic optimal control for nonlinear dynamical systems subject to external disturbances modeled by a Wiener process. We verify our approach on a rocket-powered descent problem and show that our computed solutions allow for significant uncertainty reduction.
@article{LewBonalliPavone2024, author = {Lew, T. and Bonalli, R. and Pavone, M.}, title = {Sample Average Approximation for Stochastic Programming with Equality Constraints}, journal = {{SIAM Journal on Optimization}}, note = {In press}, year = {2024}, keywords = {press}, owner = {lew}, timestamp = {2022-06-22}, url = {https://arxiv.org/abs/2206.09963} }
Abstract: Consider deploying a team of robots in order to visit sites in a risky environment (i.e., where a robot might be lost during a traversal), subject to team-based operational constraints such as limits on team composition, traffic throughputs, and launch constraints. We formalize this problem using a graph to represent the environment, enforcing probabilistic survival constraints for each robot, and using a matroid (which generalizes linear independence to sets) to capture the team-based operational constraints. The resulting “Matroid Team Surviving Orienteers” (MTSO) problem has broad applications for robotics such as informative path planning, resource delivery, and search and rescue. We demonstrate that the objective for the MTSO problem has submodular structure, which leads us to develop two polynomial time algorithms which are guaranteed to find a solution with value within a constant factor of the optimum. The second of our algorithms is an extension of the accelerated continuous greedy algorithm, and can be applied to much broader classes of constraints while maintaining bounds on suboptimality. In addition to in-depth analysis, we demonstrate the efficiency of our approaches by applying them to a scenario where a team of robots must gather information while avoiding dangers in the Coral Triangle and characterize scaling and parameter selection using a synthetic dataset.
@article{JorgensenPavone2019, author = {Jorgensen, S. and Pavone, M.}, title = {The Matroid Team Surviving Orienteers Problem and its Variants: Constrained Routing of Heterogeneous Teams with Risky Traversal}, journal = {{Int. Journal of Robotics Research}}, volume = {43}, number = {1}, pages = {34--52}, year = {2024}, url = {https://doi.org/10.1177/02783649231210326}, owner = {somrita}, timestamp = {2024-12-29} }
Abstract: Reliable and efficient trajectory optimization methods are a fundamental need for autonomous dynamical systems, effectively enabling applications including rocket landing, hypersonic reentry, spacecraft rendezvous, and docking. Within such safety-critical application areas, the complexity of the emerging trajectory optimization problems has motivated the application of AI-based techniques to enhance the performance of traditional approaches. However, current AI-based methods either attempt to fully replace traditional control algorithms, thus lacking constraint satisfaction guarantees and incurring expensive simulation, or aim to solely imitate the behavior of traditional methods via supervised learning. To address these limitations, this paper proposes the Autonomous Rendezvous Transformer (ART) and assesses the capability of modern generative models to solve complex trajectory optimization problems, both from a forecasting and control standpoint. Specifically, this work assesses the capabilities of Transformers to (i) learn near-optimal policies from previously collected data, and (ii) warm-start a sequential optimizer for the solution of non-convex optimal control problems, thus guaranteeing hard constraint satisfaction. From a forecasting perspective, results highlight how ART outperforms other learning-based architectures at predicting known fuel-optimal trajectories. From a control perspective, empirical analyses show how policies learned through Transformers are able to generate near-optimal warm-starts, achieving trajectories that are (i) more fuel-efficient, (ii) obtained in fewer sequential optimizer iterations, and (iii) computed with an overall runtime comparable to benchmarks based on convex optimization.
@inproceedings{GuffantiGammelliEtAl2024, author = {Guffanti, T. and Gammelli, D. and D'Amico, S. and Pavone, M.}, title = {Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous}, booktitle = {{IEEE Aerospace Conference}}, year = {2024}, keywords = {pub}, owner = {gammelli}, timestamp = {2023-11-15}, url = {https://arxiv.org/abs/2310.13831} }
Abstract:
@inproceedings{FoutterBohjEtAl2024, author = {Foutter, M. and Bhoj, P. and Sinha, R. and Elhafsi, A. and Banerjee, S. and Agia, C. and Kruger, J. and Guffanti, T. and Gammelli, D. and D'Amico, S. and Pavone, M.}, title = {Adapting a Foundation Model for Space-based Tasks}, booktitle = {{Robotics: Science and Systems - Workshop on Semantics for Robotics: From Environment Understanding and Reasoning to Safe Interaction}}, year = {2024}, asl_abstract = {Foundation models, e.g., large language models, possess attributes of intelligence which offer promise to endow a robot with the contextual understanding necessary to navigate complex, unstructured tasks in the wild. In the future of space robotics, we see three core challenges which motivate the use of a foundation model adapted to space-based applications: 1) Scalability of ground-in-the-loop operations; 2) Generalizing prior knowledge to novel environments; and 3) Multi-modality in tasks and sensor data. Therefore, as a first step towards building a foundation model for space-based applications, we automatically label the AI4Mars dataset to curate a language-annotated dataset of visual-question-answer tuples. We fine-tune a pretrained LLaVA checkpoint on this dataset to endow a vision-language model with the ability to perform spatial reasoning and navigation on Mars' surface. In this work, we demonstrate that 1) existing vision-language models are deficient visual reasoners in space-based applications, and 2) fine-tuning a vision-language model on extraterrestrial data significantly improves the quality of responses even with a limited training dataset of only a few thousand samples.}, asl_address = {Delft, Netherlands}, asl_url = {https://arxiv.org/abs/2408.05924}, url = {https://arxiv.org/abs/2408.05924}, owner = {foutter}, timestamp = {2024-08-12} }
Abstract:
@inproceedings{ConteEtAl2024evaluating, title = {Evaluating a Reinforcement Learning Approach for Collision Avoidance with Heterogeneous Aircraft}, author = {Conte, C. and Accardo, D. and Gopalakrishnan, K. and Pavone, M.}, booktitle = {AIAA SCITECH 2024 Forum}, pages = {1860}, year = {2024} }
Abstract: Tolling, or congestion pricing, offers a promising traffic management policy for regulating congestion, but has also attracted criticism for placing outsized financial burdens on low-income users. Credit-based congestion pricing (CBCP) and discount-based congestion pricing (DBCP) policies, which respectively provide travel credits and toll discounts to low-income users on tolled roads, have emerged as promising mechanisms for reducing traffic congestion without worsening societal inequities. However, the optimal design of CBCP and DBCP policies, as well as their relative advantages and disadvantages, remain poorly understood. To address this, we study the effects of implementing CBCP and DBCP policies to route users on a network of multi-lane highways with tolled express lanes. We formulate a non-atomic routing game framework in which a subset of eligible users is granted toll relief in the form of a fixed budget or toll discount, while the remaining ineligible users must pay out-of-pocket. We prove the existence of Nash equilibrium traffic flow patterns corresponding to any given CBCP or DBCP policy. Under the additional assumption that eligible users have time-invariant values of time (VoTs), we provide a convex program to efficiently compute these equilibria. For networks consisting of a single edge, we identify conditions under which CBCP policies outperform DBCP policies (and vice versa), in the sense of improving eligible users’ access to the express lane. Finally, we present empirical results from a CBCP pilot study of the San Mateo 101 Express Lane Project in California. Our empirical results corroborate our theoretical analysis of the impact of deploying credit-based and discount-based policies, and lend insights into the sensitivity of their impact with respect to the travel demand and users’ VoTs.
@inproceedings{ChiuJalotaEtAl2024, author = {Chiu, C. and Jalota, D. and Pavone, M.}, title = {Credit vs. Discount-Based Congestion Pricing: A Comparison Study}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2024}, note = {In press}, keywords = {press}, owner = {devanshjalota}, url = {https://arxiv.org/abs/2403.13923}, timestamp = {2024-03-25} }
Abstract: Caves and lava tubes on the Moon and Mars are sites of geological and astrobiological interest but consist of terrain that is inaccessible with traditional robot locomotion. To support the exploration of these sites, we present ReachBot, a robot that uses extendable booms as appendages to manipulate itself with respect to irregular rock surfaces. The booms terminate in grippers equipped with microspines and provide ReachBot with a large workspace, allowing it to achieve force closure in enclosed spaces, such as the walls of a lava tube. To propel ReachBot, we present a contact-before-motion planner for nongaited legged locomotion that uses internal force control, similar to a multifingered hand, to keep its long, slender booms in tension. Motion planning also depends on finding and executing secure grips on rock features. We used a Monte Carlo simulation to inform gripper design and predict grasp strength and variability. In addition, we used a two-step perception system to identify possible grasp locations. To validate our approach and mechanisms under realistic conditions, we deployed a single ReachBot arm and gripper in a lava tube in the Mojave Desert. The field test confirmed that ReachBot will find many targets for secure grasps with the proposed kinematic design.
@article{ChenNewdickEtAl2024, author = {Chen, T. G. and Newdick, S. and Di, J. and Bosio, C. and Ongole, N. and Lapôtre, M. and Pavone, M. and Cutkosky, M. R.}, title = {Locomotion as manipulation with ReachBot}, journal = {{Science Robotics}}, volume = {9}, number = {89}, pages = {eadi9762}, year = {2024}, keywords = {pub}, owner = {gammelli}, timestamp = {2024-09-19}, url = {https://www.science.org/doi/abs/10.1126/scirobotics.adi9762} }
Abstract: Model predictive control (MPC) has established itself as the primary methodology for constrained control, enabling general-purpose robot autonomy in diverse real-world scenarios. However, for most problems of interest, MPC relies on the recursive solution of highly non-convex trajectory optimization problems, leading to high computational complexity and strong dependency on initialization. In this work, we present a unified framework to combine the main strengths of optimization-based and learning-based methods for MPC. Our approach entails embedding high-capacity, transformer-based neural network models within the optimization process for trajectory generation, whereby the transformer provides a near-optimal initial guess, or target plan, to a non-convex optimization problem. Our experiments, performed in simulation and the real world onboard a free flyer platform, demonstrate the capabilities of our framework to improve MPC convergence and runtime. Compared to purely optimization-based approaches, results show that our approach can improve trajectory generation performance by up to 75%, reduce the number of solver iterations by up to 45%, and improve overall MPC runtime by 7x without loss in performance.
@article{CelestiniGammelliEtAl2024, author = {Celestini, D. and Gammelli, D. and Guffanti, T. and D'Amico, S. and Capelli, E. and Pavone, M.}, title = {Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling}, journal = {{IEEE Robotics and Automation Letters}}, year = {2024}, keywords = {pub}, owner = {gammelli}, timestamp = {2024-08-14}, url = {https://ieeexplore.ieee.org/document/10685132} }
Abstract: The Coherent Ising Machine (CIM) is a non-conventional architecture that takes inspiration from physical annealing processes to solve Ising problems heuristically. Its dynamics are naturally continuous and described by a set of ordinary differential equations that have been proven to be useful for the optimization of continuous-variable non-convex quadratic optimization problems. The dynamics of such Continuous Variable CIMs (CV-CIM) encourage optimization via optical pulses whose amplitudes are determined by the negative gradient of the objective; however, standard gradient descent is known to be trapped by local minima and hampered by poor problem conditioning. In this work, we propose to modify the CV-CIM dynamics using more sophisticated pulse injections based on tried-and-true optimization techniques such as momentum and Adam. Through numerical experiments, we show that the momentum and Adam updates can significantly speed up the CV-CIM’s convergence and improve sample diversity over the original CV-CIM dynamics. We also find that the Adam-CV-CIM’s performance is more stable as a function of feedback strength, especially on poorly conditioned instances, resulting in an algorithm that is more robust, reliable, and easily tunable. More broadly, we identify the CIM dynamical framework as a fertile opportunity for exploring the intersection of classical optimization and modern analog computing.
@inproceedings{BrownEtAlCPAIOR2024, author = {Brown, R. A. and Venturelli, D. and Pavone, M. and Bernal Neira, D. E.}, title = {Accelerating Continuous Variable Coherent Ising Machines via Momentum}, booktitle = {{Int. Conf. on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research}}, year = {2024}, keywords = {pub}, owner = {gammelli}, timestamp = {2024-09-19}, url = {https://link.springer.com/chapter/10.1007/978-3-031-60597-0_8} }
Abstract: Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region mapping via embodied navigation in indoor environments, generating a high-level representation of the knowledge of the agent. To enable region identification, the method uses a vision-to-language model to provide scene information for mapping. By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location. This mapping procedure is paired with a trained navigation policy to enable autonomous map generation. The proposed method significantly outperforms a variety of baselines, including an object-based system and a pretrained scene classifier, in experiments in a photorealistic simulator.
@inproceedings{BigazziEtAl2024, title = {Mapping High-level Semantic Regions in Indoor Environments without Object Recognition}, author = {Bigazzi, R. and Baraldi, L. and Kousik, S. and Cucchiara, R. and Pavone, M.}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, owner = {rdyro}, timestamp = {2023-09-28}, keywords = {pub}, year = {2024}, url = {https://arxiv.org/abs/2403.07076} }
Abstract:
@inproceedings{BanerjeeBalabanEtAl2024, author = {Banerjee, S. and Balaban, B. and Shirley, M. and Bradner, K. and Pavone, M.}, title = {Contingency Planning Using Bi-level Markov Decision Processes for Space Missions}, booktitle = {{IEEE Aerospace Conference}}, year = {2024}, asl_abstract = {This work focuses on autonomous contingency planning for scientific missions by enabling rapid policy computation from any off-nominal point in the state space in the event of a delay or deviation from the nominal mission plan. Successful contingency planning involves managing risks and rewards, often probabilistically associated with actions, in stochastic scenarios. Markov Decision Processes (MDPs) are used to mathematically model decision-making in such scenarios. However, in the specific case of planetary rover traverse planning, the vast action space and long planning time horizon pose computational challenges. A bi-level MDP framework is proposed to improve computational tractability, while also aligning with existing mission planning practices and enhancing explainability and trustworthiness of AI-driven solutions. We discuss the conversion of a mission planning MDP into a bi-level MDP, and test the framework on RoverGridWorld, a modified GridWorld environment for rover mission planning. We demonstrate the computational tractability and near-optimal policies achievable with the bi-level MDP approach, highlighting the trade-offs between compute time and policy optimality as the problem's complexity grows. This work facilitates more efficient and flexible contingency planning in the context of scientific missions.}, asl_address = {Big Sky, Montana}, asl_month = mar, asl_url = {}, owner = {somrita}, timestamp = {2024-02-09} }
Abstract:
@inproceedings{AgiaVilaEtAl2024, author = {Agia, C. and Vila, {G. C.} and Bandyopadhyay, S. and Bayard, {D. S.} and Cheung, K. and Lee, {C. H.} and Wood, E. and Aenishanslin, I. and Ardito, S. and Fesq, L. and Pavone, M. and Nesnas, {I. A. D.}}, title = {Modeling Considerations for Developing Deep Space Autonomous Spacecraft and Simulators}, booktitle = {{IEEE Aerospace Conference}}, year = {2024}, asl_abstract = {To extend the limited scope of autonomy used in prior missions for operation in distant and complex environments, there is a need to further develop and mature autonomy that jointly reasons over multiple subsystems, which we term system-level autonomy. System-level autonomy establishes situational awareness that resolves conflicting information across subsystems, which may necessitate the refinement and interconnection of the underlying spacecraft and environment onboard models. However, with a limited understanding of the assumptions and tradeoffs of modeling to arbitrary extents, designing onboard models to support system-level capabilities presents a significant challenge. In this paper, we provide a detailed analysis of the increasing levels of model fidelity for several key spacecraft subsystems, with the goal of informing future spacecraft functional- and system-level autonomy algorithms and the physics-based simulators on which they are validated. We do not argue for the adoption of a particular fidelity class of models but, instead, highlight the potential tradeoffs and opportunities associated with the use of models for onboard autonomy and in physics-based simulators at various fidelity levels. We ground our analysis in the context of deep space exploration of small bodies, an emerging frontier for autonomous spacecraft operation in space, where the choice of models employed onboard the spacecraft may determine mission success. 
We conduct our experiments in the Multi-Spacecraft Concept and Autonomy Tool (MuSCAT), a software suite for developing spacecraft autonomy algorithms.}, asl_address = {Big Sky, Montana}, asl_month = mar, asl_url = {https://arxiv.org/abs/2401.11371}, owner = {agia}, timestamp = {2024-03-01} }
Abstract: Although vehicle electrification and utilization of on-site solar PV generation can begin reducing the greenhouse gas emissions associated with bus fleet operations, a method to intelligently coordinate bus-route assignments, bus charging, and energy storage is needed to fully decarbonize fleet operations while simultaneously minimizing electricity costs. This paper proposes a 24/7 Carbon-Free Electrified Fleet digital twin framework for modeling, controlling, and analyzing an electric bus fleet, co-located solar PV arrays, and a battery energy storage system. The framework consists of forecasting modules for marginal grid emissions factors, solar generation, and bus energy consumption that are input to the optimization module, which determines bus and battery operations at minimal electricity and emissions costs. We present a digital platform based on this framework, and for a case study of Stanford University’s Marguerite Shuttle, the platform reduced peak charging demand by 99%, the electric utility bill by $2778, and associated carbon emissions by 100% for one week of simulated operations for 38 buses. When accounting for operational uncertainty, the platform still reduced the utility bill by $784 and emissions by 63%.
@inproceedings{RibeiroLukeEtAl2023, author = {Ribeiro, M. and Luke, J. and Martin, S. and Balogun, E. and Cezar, G. and Pavone, M. and Rajagopal, R.}, title = {Towards a 24/7 Carbon-Free Electric Fleet: A Digital Twin Framework}, booktitle = {{Energy Proceedings}}, year = {2023}, volume = {43}, address = {Doha, Qatar}, month = dec, doi = {10.46855/energy-proceedings-11033}, owner = {jthluke}, timestamp = {2024-08-12}, url = {https://www.energy-proceedings.org/towards-a-24-7-carbon-free-electric-fleet%3A-a-digital-twin-framework/} }
Abstract: We propose Text2Motion, a language-based planning framework enabling robots to solve sequential manipulation tasks that require long-horizon reasoning. Given a natural language instruction, our framework constructs both a task- and motion-level plan that is verified to reach inferred symbolic goals. Text2Motion uses feasibility heuristics encoded in Q-functions of a library of skills to guide task planning with Large Language Models. Whereas previous language-based planners only consider the feasibility of individual skills, Text2Motion actively resolves geometric dependencies spanning skill sequences by performing geometric feasibility planning during its search. We evaluate our method on a suite of problems that require long-horizon reasoning, interpretation of abstract goals, and handling of partial affordance perception. Our experiments show that Text2Motion can solve these challenging problems with a success rate of 82%, while prior state-of-the-art language-based planning methods only achieve 13%. Text2Motion thus provides promising generalization characteristics to semantically diverse sequential manipulation tasks with geometric dependencies between skills.
@article{LinAgiaEtAl2023, author = {Lin, K. and Agia, C. and Migimatsu, T. and Pavone, M. and Bohg, J.}, title = {Text2Motion: From Natural Language Instructions to Feasible Plans}, journal = {{Autonomous Robots}}, volume = {47}, number = {8}, pages = {1345--1365}, year = {2023}, month = nov, doi = {10.1007/s10514-023-10131-7}, url = {https://doi.org/10.1007/s10514-023-10131-7}, owner = {agia}, timestamp = {2024-02-29} }
Abstract: Autonomous agents increasingly rely on learned components to streamline safe and reliable decision making. However, data dissimilar to that seen in training, deemed to be Out-of-Distribution (OOD), creates undefined behavior in the output of our learned components, which can have detrimental consequences in a safety-critical setting such as autonomous satellite rendezvous. In the wild, we typically are exposed to a mix of in-and-out of distribution data where OOD inputs correspond to uncommon and unfamiliar data when a nominally competent system encounters a new situation. In this paper, we propose an architecture that detects the presence of OOD inputs in an online stream of data. The architecture then uses these OOD inputs to recognize domain invariant features between the original training and OOD domain to improve model inference. We demonstrate that our algorithm more than doubles model accuracy on the OOD domain with sparse, unlabeled OOD examples compared to a naive model without such data on shifted MNIST domains. Importantly, we also demonstrate our algorithm maintains strong accuracy on the original training domain, generalizing the model to a mix of in-and-out of distribution examples seen at deployment. Code for our experiment is available at: https://github.com/StanfordASL/CoRL_OODWorkshop_DANN-DL
@inproceedings{FoutterSinhaEtAl2023, author = {Foutter, M. and Sinha, R. and Banerjee, S. and Pavone, M.}, title = {Self-Supervised Model Generalization using Out-of-Distribution Detection}, booktitle = {{Conf. on Robot Learning - Workshop on Out-of-Distribution Generalization in Robotics}}, year = {2023}, address = {Atlanta, Georgia}, month = nov, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://openreview.net/forum?id=z5XS3BY13J} }
Abstract: Congestion pricing has long been hailed as a means to mitigate traffic congestion; however, its practical adoption has been limited due to the resulting social inequity issue, e.g., low-income users are priced out of certain roads. This issue has spurred interest in the design of equitable mechanisms that aim to refund the collected toll revenues as lump-sum transfers to users. Although revenue refunding has been extensively studied for over three decades, there has been no thorough characterization of how such schemes can be designed to simultaneously achieve system efficiency and equity objectives. In this work, we bridge this gap through the study of congestion pricing and revenue refunding (CPRR) schemes in non-atomic congestion games. We first develop CPRR schemes, which in comparison to the untolled case, simultaneously (i) increase system efficiency and (ii) decrease wealth inequality, while being (iii) user-favorable: irrespective of their initial wealth or values-of-time (which may differ across users) users would experience a lower travel cost after the implementation of the proposed scheme. We then characterize the set of optimal user-favorable CPRR schemes that simultaneously maximize system efficiency and minimize wealth inequality. These results assume a well-studied behavior model of users minimizing a linear function of their travel times and tolls, without considering refunds. We also study a more complex behavior model wherein users are influenced by and react to the amount of refund that they receive. Although, in general, the two models can result in different outcomes in terms of system efficiency and wealth inequality, we establish that those outcomes coincide when the aforementioned optimal CPRR scheme is implemented. Overall, our work demonstrates that through appropriate refunding policies we can achieve system efficiency while reducing wealth inequality.
@inproceedings{JalotaSoloveyGopalakrishnanEtAl2023, author = {Jalota, D. and Solovey, K. and Gopalakrishnan, K. and Zoepf, S. and Balakrishnan, H. and Pavone, M.}, title = {When Efficiency meets Equity in Congestion Pricing and Revenue Refunding Schemes}, booktitle = {{IEEE Transactions on Control of Network Systems}}, note = {In press}, year = {2023}, month = oct, keywords = {press}, owner = {devanshjalota}, timestamp = {2021-06-22}, url = {https://ieeexplore.ieee.org/document/10319707} }
Abstract: As robots acquire increasingly sophisticated skills and see increasingly complex and varied environments, the threat of an edge case or anomalous failure is ever present. For example, Tesla cars have seen interesting failure modes ranging from autopilot disengagements due to inactive traffic lights carried by trucks to phantom braking caused by images of stop signs on roadside billboards. These system-level failures are not due to failures of any individual component of the autonomy stack but rather system-level deficiencies in semantic reasoning. Such edge cases, which we call semantic anomalies, are simple for a human to disentangle yet require insightful reasoning. To this end, we study the application of large language models (LLMs), endowed with broad contextual understanding and reasoning capabilities, to recognize such edge cases and introduce a monitoring framework for semantic anomaly detection in vision-based policies. Our experiments apply this framework to a finite state machine policy for autonomous driving and a learned policy for object manipulation. These experiments demonstrate that the LLM-based monitor can effectively identify semantic anomalies in a manner that shows agreement with human reasoning. Finally, we provide an extended discussion on the strengths and weaknesses of this approach and motivate a research outlook on how we can further use foundation models for semantic anomaly detection. Our project webpage can be found at https://sites.google.com/view/llm-anomaly-detection.
@article{ElhafsiSinhaEtAl2023, author = {Elhafsi, A. and Sinha, R. and Agia, C. and Schmerling, E. and Nesnas, I. A. D and Pavone, M.}, title = {Semantic Anomaly Detection with Large Language Models}, journal = {{Autonomous Robots}}, volume = {47}, number = {8}, pages = {1035--1055}, year = {2023}, month = oct, doi = {10.1007/s10514-023-10132-6}, url = {https://arxiv.org/abs/2305.11307}, owner = {amine}, timestamp = {2024-09-19} }
Abstract: Even for known nonlinear dynamical systems, feedback controller synthesis is a difficult problem that often requires leveraging the particular structure of the dynamics to induce a stable closed-loop system. For general nonlinear models, including those fit to data, there may not be enough known structure to reliably synthesize a stabilizing feedback controller. In this paper, we discuss a state-dependent nonlinear tracking controller formulation based on a state-dependent Riccati equation for general nonlinear control-affine systems. This formulation depends on a nonlinear factorization of the system of vector fields defining the control-affine dynamics, which always exists under mild smoothness assumptions. We propose a method for learning this factorization from a finite set of data. On a variety of simulated nonlinear dynamical systems, we empirically demonstrate the efficacy of learned versions of this controller in stable trajectory tracking. Alongside our learning method, we evaluate recent ideas in jointly learning a controller and stabilizability certificate for known dynamical systems; we show experimentally that such methods can be frail in comparison.
@inproceedings{RichardsSlotineEtAl2023, author = {Richards, S. M. and Slotine, J.-J. and Azizan, N. and Pavone, M.}, title = {Learning Control-Oriented Dynamical Structure from Data}, year = {2023}, booktitle = {{Int. Conf. on Machine Learning}}, note = {In press}, owner = {spenrich}, timestamp = {2023-07-17}, url = {https://arxiv.org/abs/2302.02529}, address = {Honolulu, Hawaii}, month = jul, keywords = {} }
Abstract: Optimization problems over dynamic networks have been extensively studied and widely used in the past decades to formulate numerous real-world problems. However, (1) traditional optimization-based approaches do not scale to large networks, and (2) the design of good heuristics or approximation algorithms often requires significant manual trial-and-error. In this work, we argue that data-driven strategies can automate this process and learn efficient algorithms without compromising optimality. To do so, we present network control problems through the lens of reinforcement learning and propose a graph network-based framework to handle a broad class of problems. Instead of naively computing actions over high-dimensional graph elements, e.g., edges, we propose a bi-level formulation where we (1) specify a desired next state via RL, and (2) solve a convex program to best achieve it, leading to drastically improved scalability and performance. We further highlight a collection of desirable features to system designers, investigate design decisions, and present experiments on real-world control problems showing the utility, scalability, and flexibility of our framework.
@inproceedings{GammelliHarrisonEtAl2023, author = {Gammelli, D. and Harrison, J. and Yang, K. and Pavone, M. and Rodrigues, F. and Pereira, F. C.}, title = {Graph Reinforcement Learning for Network Control via Bi-Level Optimization}, booktitle = {{Int. Conf. on Machine Learning}}, year = {2023}, address = {Honolulu, Hawaii}, month = jul, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2305.09129} }
Abstract: Real-time optimal control of high-dimensional, nonlinear systems remains a challenging task due to the computational intractability of their models. While several model-reduction and learning-based approaches for constructing low-dimensional surrogates of the original system have been proposed in the literature, these approaches suffer from fundamental issues which limit their application in real-world scenarios. Namely, they typically lack generalizability to different control tasks, ability to trade dimensionality for accuracy, and ability to preserve the structure of the dynamics. Recently, we proposed to extract low-dimensional dynamics on Spectral Submanifolds (SSMs) to overcome these issues and validated our approach in a highly accurate simulation environment. In this manuscript, we extend our framework to a real-world setting by employing time-delay embeddings to embed SSMs in an observable space of appropriate dimension. This allows us to learn highly accurate, low-dimensional dynamics purely from observational data. We show that these innovations extend Spectral Submanifold Reduction (SSMR) to real-world applications and showcase the effectiveness of SSMR on a soft robotic system.
@inproceedings{AloraCenedeseEtAl2023b, author = {Alora, J.I. and Cenedese, M. and Schmerling, E. and Haller, G. and Pavone, M.}, title = {Practical Deployment of Spectral Submanifold Reduction for Optimal Control of High-Dimensional Systems}, booktitle = {{IFAC World Congress}}, year = {2023}, address = {Yokohama, Japan}, month = jul, doi = {10.1016/j.ifacol.2023.10.1734}, owner = {jthluke}, timestamp = {2024-09-20}, url = {/wp-content/papercite-data/pdf/Alora.Cenedese.IFAC23.pdf} }
Abstract: ReachBot is a robot that uses extendable and retractable booms as limbs to move around unpredictable environments such as Martian caves. Each boom is capped by a microspine gripper designed for grasping rocky surfaces. Motion planning for ReachBot must be versatile to accommodate variable terrain features and robust to mitigate risks from the stochastic nature of grasping with spines. In this paper, we introduce a graph traversal algorithm to select a discrete sequence of grasps based on available terrain features suitable for grasping. This discrete plan is complemented by a decoupled motion planner that considers the alternating phases of body movement and end-effector movement, using a combination of sampling-based planning and sequential convex programming to optimize individual phases. We use our motion planner to plan a trajectory across a simulated 2D cave environment with at least 90% probability of success and demonstrate improved robustness over a baseline trajectory. Finally, we use a simplified prototype to verify a body movement trajectory generated by our motion planning algorithm.
@inproceedings{NewdickOngoleEtAl2023, author = {Newdick, S. and Ongole, N and Chen, T. G. and Schmerling, E. and Cutkosky, M. R. and Pavone, M.}, title = {Motion Planning for a Climbing Robot with Stochastic Grasps}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2023}, address = {London, United Kingdom}, month = may, doi = {10.1109/ICRA48891.2023.10160218}, owner = {jthluke}, timestamp = {2024-09-19}, url = {https://arxiv.org/abs/2209.10687} }
Abstract: In this paper we present a trade study-based method to optimize the architecture of ReachBot, a new robotic concept that uses deployable booms as prismatic joints for mobility in environments with adverse gravity conditions and challenging terrain. Specifically, we introduce a design process wherein we analyze the compatibility of ReachBot’s design with its mission. We incorporate terrain parameters and mission requirements to produce a final design optimized for mission-specific objectives. ReachBot’s design parameters include (1) number of booms, (2) positions and orientations of the booms on ReachBot’s chassis, (3) boom maximum extension, (4) boom cross-sectional geometry, and (5) number of active/passive degrees-of-freedom at each joint. Using first-order approximations, we analyze the relationships between these parameters and various performance metrics including stability, manipulability, and mechanical interference. We apply our method to a mission where ReachBot navigates and gathers data from a Martian lava tube. The resulting design is shown in Fig. 1.
@inproceedings{NewdickChenEtAl2023, author = {Newdick, S. and Chen, T. G. and Hockman, B. and Schmerling, E. and Cutkosky, M. R. and Pavone, M.}, title = {Designing ReachBot: System Design Process with a Case Study of a Martian Lava Tube Mission}, booktitle = {{IEEE Aerospace Conference}}, year = {2023}, address = {Big Sky, Montana}, month = mar, doi = {10.1109/AERO55745.2023.10115893}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2210.11534} }
Abstract: Learning-enabling components are increasingly popular in many aerospace applications, including satellite pose estimation. However, as input distributions evolve over a mission lifetime, it becomes challenging to maintain performance of learned models. In this work, we present an open-source benchmark of a satellite pose estimation model trained on images of a satellite in space and deployed in novel input scenarios (e.g., different backgrounds or misbehaving pixels). We propose a framework to incrementally retrain a model by selecting a subset of test inputs to label, which allows the model to adapt to changing input distributions. Algorithms within this framework are evaluated based on (1) model performance throughout mission lifetime and (2) cumulative costs associated with labeling and model retraining. We also propose a novel algorithm to select a diverse subset of inputs for labeling, by characterizing the information gain from an input using Bayesian uncertainty quantification and choosing a subset that maximizes collective information gain using concepts from batch active learning. We show that our algorithm outperforms others on the benchmark, e.g., achieves comparable performance to an algorithm that labels 100% of inputs, while only labeling 50% of inputs, resulting in low costs and high performance over the mission lifetime.
@inproceedings{BanerjeeSharmaEtAl2022, author = {Banerjee, S. and Sharma, A. and Schmerling, E. and Spolaor, M. and Nemerouf, M. and Pavone, M.}, title = {Data Lifecycle Management in Evolving Input Distributions for Learning-based Aerospace Applications}, booktitle = {{IEEE Aerospace Conference}}, year = {2023}, address = {Big Sky, Montana}, month = mar, doi = {10.1109/AERO55745.2023.10115970}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2209.06855} }
Abstract: Locational marginal emissions rates (LMEs) estimate the rate of change in emissions due to a small change in demand in a transmission network, and are an important metric for assessing the impact of various energy policies or interventions. In this work, we develop a new method for computing the LMEs of an electricity system via implicit differentiation. The method is model agnostic; it can compute LMEs for any convex optimization-based dispatch model, including some of the complex dispatch models employed by system operators in real electricity systems. In particular, this method lets us derive LMEs for dynamic dispatch models, which have temporal constraints such as ramping and storage. Using real data from the U.S. electricity system, we validate the proposed method against a state-of-the-art merit-order-based method and show that incorporating dynamic constraints improves model accuracy by 8.2%. Finally, we use simulations on a realistic 240-bus model of WECC to demonstrate the flexibility of the tool and the importance of incorporating dynamic constraints. In this example, static and dynamic LMEs deviate from one another by 28.4% on average, implying dynamic constraints are essential in accurately modeling emissions rates.
@article{ValenzuelaDeglerisEtAl2022, author = {Valenzuela, L. F. and Degleris, A. and Gamal, A. E. and Pavone, M. and Rajagopal, R.}, title = {Dynamic Locational Marginal Emissions via Implicit Differentiation}, journal = {{IEEE Transactions on Power Systems}}, year = {2023}, volume = {39}, number = {1}, pages = {1138--1147}, doi = {10.1109/TPWRS.2023.3247345}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2302.14282} }
Abstract: In this chapter, we will discuss some of the privacy-related challenges that have arisen in data-driven cyber-physical systems. While user data can help data-driven systems improve their efficiency, safety, and adaptability, the way that this data is collected and analyzed can lead to privacy risks to users. If left unaddressed, potential privacy risks may deter users from contributing their data to cyber-physical systems, thereby limiting the effectiveness of data-driven tools employed by the systems. To ensure that users are protected from privacy risks and feel comfortable sharing their data with cyber-physical systems, this chapter discusses how differential privacy and cryptography, techniques originally developed for privacy in health care and computer systems respectively, can be used to conduct data analysis and optimization with principled privacy guarantees in cyber-physical systems. Along the way, we will discuss and compare the properties, strengths, and weaknesses of these approaches. We will also show why these techniques address many of the privacy-related issues that arise when using data anonymization and data aggregation, two of the most common approaches to privacy-aware data sharing.
@incollection{TsaoGopalakrishnanEtAl2023b, author = {Tsao, M. and Gopalakrishnan, K. and Yang, K. and Pavone, M.}, title = {Privacy-Aware Control of Cyber Physical Systems}, booktitle = {Smarter Cyber Physical Systems: Enabling Methodologies and Application}, year = {2023}, publisher = {{CRC Press}}, note = {Submitted}, keywords = {sub}, owner = {gkarthik}, timestamp = {2022-12-06} }
Abstract: Network routing problems are common across many engineering applications. Computing optimal routing policies requires knowledge about network demand, i.e., the origin and destination (OD) of all requests in the network. However, privacy considerations make it challenging to share individual OD data that would be required for computing optimal policies. Privacy can be particularly challenging in standard network routing problems because sources and sinks can be easily identified from flow conservation constraints, making feasibility and privacy mutually exclusive. In this paper, we present a differentially private algorithm for network routing problems. The main ingredient is a reformulation of network routing which moves all user data-dependent parameters out of the constraint set and into the objective function. We then present an algorithm for solving this formulation based on a differentially private variant of stochastic gradient descent. In this algorithm, differential privacy is achieved by injecting noise, and one may wonder if this noise injection compromises solution quality. We prove that our algorithm is both differentially private and asymptotically optimal as the size of the training set goes to infinity. We corroborate the theoretical results with numerical experiments on a road traffic network which show that our algorithm provides differentially private and near-optimal solutions in practice.
@inproceedings{TsaoGopalakrishnanEtAl2023, author = {Tsao, M. and Gopalakrishnan, K. and Yang, K. and Pavone, M.}, title = {Differentially Private Stochastic Convex Optimization for Network Routing Applications}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2023}, address = {Singapore}, doi = {10.1109/CDC49753.2023.10383207}, url = {https://ieeexplore.ieee.org/abstract/document/10383207/}, owner = {somrita}, timestamp = {2024-02-29} }
Abstract: When we rely on deep-learned models for robotic perception, we must recognize that these models may behave unreliably on inputs dissimilar from the training data, compromising the closed-loop system’s safety. This raises fundamental questions on how we can assess confidence in perception systems and to what extent we can take safety-preserving actions when external environmental changes degrade our perception model’s performance. Therefore, we present a framework to certify the safety of a perception-enabled system deployed in novel contexts. To do so, we leverage robust model predictive control (MPC) to control the system using the perception estimates while maintaining the feasibility of a safety-preserving fallback plan that does not rely on the perception system. In addition, we calibrate a runtime monitor using recently proposed conformal prediction techniques to certifiably detect when the perception system degrades beyond the tolerance of the MPC controller, resulting in an end-to-end safety assurance. We show that this control framework and calibration technique allows us to certify the system’s safety with orders of magnitude fewer samples than required to retrain the perception network when we deploy in a novel context on a photo-realistic aircraft taxiing simulator. Furthermore, we illustrate the safety-preserving behavior of the MPC on simulated examples of a quadrotor.
@inproceedings{SinhaSchmerlingEtAl2023, author = {Sinha, R. and Schmerling, E. and Pavone, M.}, title = {Closing the Loop on Runtime Monitors with Fallback-Safe MPC}, year = {2023}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, address = {Singapore}, doi = {10.1109/CDC49753.2023.10383965}, url = {https://ieeexplore.ieee.org/document/10383965}, owner = {rhnsinha}, timestamp = {2023-04-12} }
Abstract:
@article{SinghLandryEtAl2019, author = {Singh, S. and Landry, B. and Majumdar, A. and Slotine, J-J. E. and Pavone, M.}, title = {Robust Feedback Motion Planning via Contraction Theory}, journal = {{Int. Journal of Robotics Research}}, volume = {42}, number = {9}, pages = {655--688}, year = {2023}, keywords = {pub}, owner = {ssingh19}, timestamp = {2019-09-11}, url = {https://journals.sagepub.com/doi/pdf/10.1177/02783649231186165} }
Abstract: Model Predictive Control (MPC) has become a popular framework in embedded control for high-performance autonomous systems. However, to achieve good control performance using MPC, an accurate dynamics model is key. To maintain real-time operation, the dynamics models used on embedded systems have been limited to simple first-principle models, which substantially limits their representative power. In contrast to such simple models, machine learning approaches, specifically neural networks, have been shown to accurately model even complex dynamic effects, but their large computational complexity hindered combination with fast real-time iteration loops. With this work, we present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline. Our experiments, performed in simulation and the real world onboard a highly agile quadrotor platform, demonstrate the capabilities of the described system to run learned models with previously infeasible, large modeling capacity using gradient-based online optimization MPC. Compared to prior implementations of neural networks in online optimization MPC, we can leverage models of over 4000 times larger parametric capacity in a 50 Hz real-time window on an embedded platform. Further, we show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
@article{SalzmannPavoneEtAl2022_2, author = {Salzmann, T. and Kaufmann, E. and Arrizabalaga, J. and Pavone, M. and Scaramuzza, D. and Ryll, M.}, title = {Real-Time Neural {MPC}: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms}, journal = {{IEEE Robotics and Automation Letters}}, year = {2023}, volume = {8}, number = {4}, pages = {2397--2404}, doi = {10.1109/LRA.2023.3246839}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2203.07747.pdf} }
Abstract: Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully-actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.
@article{RichardsAzizanEtAl2023, author = {Richards, S. M. and Azizan, N. and Slotine, J.-J. and Pavone, M.}, title = {Control-Oriented Meta-Learning}, year = {2023}, journal = {{Int. Journal of Robotics Research}}, volume = {42}, number = {10}, pages = {777--797}, owner = {spenrich}, timestamp = {2024-02-29}, url = {https://arxiv.org/abs/2204.06716} }
Abstract: Today, even the most compute-and-power constrained robots can measure complex, high data-rate video and LIDAR sensory streams. Often, such robots, ranging from low-power drones to space and subterranean rovers, need to transmit high-bitrate sensory data to a remote compute server if they are uncertain or cannot scalably run complex perception or mapping tasks locally. However, today’s representations for sensory data are mostly designed for human, not robotic, perception and thus often waste precious compute or wireless network resources to transmit unimportant parts of a scene that are unnecessary for a high-level robotic task. This paper presents an algorithm to learn task-relevant representations of sensory data that are co-designed with a pre-trained robotic perception model’s ultimate objective. Our algorithm aggressively compresses robotic sensory data by up to 11x more than competing methods. Further, it achieves high accuracy and robust generalization on diverse tasks including Mars terrain classification with low-power deep learning accelerators, neural motion planning, and environmental timeseries classification.
@article{NakanoyaEtAl2021, author = {Nakanoya, Manabu and Narasimhan, Sai Shankar and Bhat, Sharachchandra and Anemogiannis, Alexandros and Datta, Akul and Katti, Sachin and Chinchali, Sandeep and Pavone, Marco}, title = {Co-Design of Communication and Machine Inference for Cloud Robotics}, journal = {{Autonomous Robots}}, volume = {47}, number = {}, pages = {579--594}, year = {2023}, owner = {rdyro}, timestamp = {2024-02-29}, keywords = {pub} }
Abstract: When deploying machine learning models in high-stakes robotics applications, the ability to detect unsafe situations is crucial. Early warning systems can provide alerts when an unsafe situation is imminent (in the absence of corrective action). To reliably improve safety, these warning systems should have a provable false negative rate; i.e., of the situations that are unsafe, fewer than epsilon will occur without an alert. In this work, we present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics, in order to tune warning systems to provably achieve an epsilon false negative rate using as few as 1/epsilon data points. We apply our framework to a driver warning system and a robotic grasping application, and empirically demonstrate guaranteed false negative rate and low false detection (positive) rate using very little data.
@inproceedings{LuoZhaoEtAl2023, author = {Luo, R. and Zhao, S. and Kuck, J. and Ivanovic, B. and Savarese, S. and Schmerling, E. and Pavone, M.}, title = {Sample-Efficient Safety Assurances using Conformal Prediction}, booktitle = {{Int. Journal of Robotics Research}}, year = {2023}, owner = {rsluo}, timestamp = {2023-02-10}, url = {https://arxiv.org/abs/2109.14082} }
Abstract: Trajectory optimization under uncertainty underpins a wide range of applications in robotics. However, existing methods are limited in terms of reasoning about sources of epistemic and aleatoric uncertainty, space and time correlations, nonlinear dynamics, and non-convex constraints. In this work, we first introduce a continuous-time planning formulation with an average-value-at-risk constraint over the entire planning horizon. Then, we propose a sample-based approximation that unlocks an efficient, general-purpose, and time-consistent algorithm for risk-averse trajectory optimization. We prove that the method is asymptotically optimal and derive finite-sample error bounds. Simulations demonstrate the high speed and reliability of the approach on problems with stochasticity in nonlinear dynamics, obstacle fields, interactions, and terrain parameters.
@article{LewBonalliPavone2023, author = {Lew, T. and Bonalli, R. and Pavone, M.}, title = {Risk-Averse Trajectory Optimization via Sample Average Approximation}, journal = {{IEEE Robotics and Automation Letters}}, volume = {9}, number = {2}, pages = {1500--1507}, year = {2023}, url = {https://arxiv.org/abs/2307.03167}, keywords = {pub}, owner = {lew}, timestamp = {2024-02-29} }
Abstract: We study the convex hulls of reachable sets of nonlinear systems with bounded disturbances. Reachable sets play a critical role in control, but remain notoriously challenging to compute, and existing over-approximation tools tend to be conservative or computationally expensive. In this work, we exactly characterize the convex hulls of reachable sets as the convex hulls of solutions of an ordinary differential equation from all possible initial values of the disturbances. This finite-dimensional characterization unlocks a tight estimation algorithm to over-approximate reachable sets that is significantly faster and more accurate than existing methods. We present applications to neural feedback loop analysis and robust model predictive control.
@inproceedings{LewBonalliEtAl2023, author = {Lew, T. and Bonalli, R. and Pavone, M.}, title = {Exact Characterization of the Convex Hulls of Reachable Sets}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2023}, address = {Singapore}, doi = {10.1109/CDC49753.2023.10383902}, owner = {lew}, timestamp = {2023-04-03}, url = {https://ieeexplore.ieee.org/document/10383902} }
Abstract: This paper presents a technique, named STLCG, to compute the quantitative semantics of Signal Temporal Logic (STL) formulas using computation graphs. STLCG provides a platform which enables the incorporation of logical specifications into robotics problems that benefit from gradient-based solutions. Specifically, STL is a powerful and expressive formal language that can specify spatial and temporal properties of signals generated by both continuous and hybrid systems. The quantitative semantics of STL provide a robustness metric, that is, how much a signal satisfies or violates an STL specification. In this work, we devise a systematic methodology for translating STL robustness formulas into computation graphs. With this representation, and by leveraging off-the-shelf automatic differentiation tools, we are able to efficiently backpropagate through STL robustness formulas and hence enable a natural and easy-to-use integration of STL specifications with many gradient-based approaches used in robotics. Through a number of examples stemming from various robotics applications, we demonstrate that STLCG is versatile, computationally efficient, and capable of incorporating human-domain knowledge into the problem formulation.
@article{LeungArechigaEtAl2021, author = {Leung, K. and Ar\'{e}chiga, N. and Pavone, M.}, title = {Backpropagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods}, journal = {{Int. Journal of Robotics Research}}, year = {2023}, volume = {42}, number = {6}, pages = {356--370}, doi = {10.1177/02783649221082115}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://arxiv.org/abs/2008.00097} }
Abstract: System optimum (SO) routing, wherein the total travel time of all users is minimized, is a holy grail for transportation authorities. However, SO routing may discriminate against users who incur much larger travel times than others to achieve high system efficiency, i.e., low total travel times. To address the inherent unfairness of SO routing, we study the β-fair SO problem whose goal is to minimize the total travel time while guaranteeing a β≥1 level of unfairness, which specifies the maximal ratio between the travel times of different users with shared origins and destinations. To obtain feasible solutions to the β-fair SO problem while achieving high system efficiency, we develop a new convex program, the Interpolated Traffic Assignment Problem (I-TAP), which interpolates between a fair and an efficient traffic-assignment objective. We then leverage the structure of I-TAP to develop two pricing mechanisms to collectively enforce the I-TAP solution in the presence of selfish homogeneous and heterogeneous users, respectively, that independently choose routes to minimize their own travel costs. We mention that this is the first study of pricing in the context of fair routing. Finally, we use origin-destination demand data for a range of transportation networks to numerically evaluate the performance of I-TAP as compared to a state-of-the-art algorithm. The numerical results indicate that our approach is faster by several orders of magnitude, while achieving higher system efficiency for most levels of unfairness.
@article{JalotaSoloveyEtAl2023, author = {Jalota, D. and Solovey, K. and Tsao, M. and Zoepf, S. and Pavone, M.}, title = {Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment}, journal = {{Autonomous Agents and Multi-Agent Systems}}, volume = {37}, number = {32}, pages = {1--40}, year = {2023}, keywords = {pub}, asl_url = {https://link.springer.com/article/10.1007/s10458-023-09616-7}, owner = {devanshjalota}, timestamp = {2024-02-29} }
Abstract:
@article{JalotaPavoneEtAl2023, author = {Jalota, D. and Pavone, M. and Qi, Q. and Ye, Y.}, title = {Fisher Markets with Linear Constraints: Equilibrium Properties and Efficient Distributed Algorithms}, journal = {{Games and Economic Behavior}}, volume = {141}, pages = {223--260}, year = {2023}, keywords = {pub}, owner = {devanshjalota}, timestamp = {2024-02-29}, url = {https://www.sciencedirect.com/science/article/pii/S0899825623000891} }
Abstract: Over the past decade, GPS-enabled traffic applications such as Google Maps and Waze have become ubiquitous and have had a significant influence on billions of daily commuters’ travel patterns. A consequence of the online route suggestions of such applications, for example, via greedy routing, has often been an increase in traffic congestion since the induced travel patterns may be far from the system optimum. Spurred by the widespread impact of traffic applications on travel patterns, this work studies online traffic routing in the context of capacity-constrained parallel road networks and analyzes this problem from two perspectives. First, we perform a worst-case analysis to identify the limits of deterministic online routing. Although we find that deterministic online algorithms achieve finite, problem/instance-dependent competitive ratios in special cases, we show that for a general setting the competitive ratio is unbounded. This result motivates us to move beyond worst-case analysis. Here, we consider algorithms that exploit knowledge of past problem instances and show how to design data-driven algorithms whose performance can be quantified and formally generalized to unseen future instances. We then present numerical experiments based on an application case for the San Francisco Bay Area to evaluate the performance of the proposed data-driven algorithms compared with the greedy algorithm and two look-ahead heuristics with access to additional information on the values of time and arrival time parameters of users. Our results show that the developed data-driven algorithms outperform commonly used greedy online-routing algorithms. Furthermore, our work sheds light on the interplay between data availability and achievable solution quality.
@article{JalotaPaccagnanEtAl2023, author = {Jalota, D. and Paccagnan, D. and Schiffer, M. and Pavone, M.}, title = {Online Routing Over Parallel Networks: Deterministic Limits and Data-driven Enhancements}, journal = {{INFORMS Journal on Computing}}, year = {2023}, volume = {35}, number = {3}, pages = {560--577}, doi = {10.1287/ijoc.2023.1275}, owner = {jthluke}, timestamp = {2024-09-19}, url = {https://arxiv.org/abs/2109.08706} }
Abstract: Credit-based congestion pricing (CBCP) has emerged as a mechanism to alleviate the social inequity concerns of road congestion pricing - a promising strategy for traffic congestion mitigation - by providing low-income users with travel credits to offset some of their toll payments. While CBCP offers immense potential for addressing inequity issues that hamper the practical viability of congestion pricing, the deployment of CBCP in practice is nascent, and the potential efficacy and optimal design of CBCP schemes have yet to be formalized. In this work, we study the design of CBCP schemes to achieve particular societal objectives and investigate their influence on traffic patterns when routing heterogeneous users with different values of time (VoTs) in a multi-lane highway with an express lane. We introduce a new non-atomic congestion game model of a mixed-economy, wherein eligible users receive travel credits while the remaining ineligible users pay out-of-pocket to use the express lane. In this setting, we investigate the effect of CBCP schemes on traffic patterns by characterizing the properties (i.e., existence, comparative statics) of the corresponding Nash equilibria and, in the setting when eligible users have time-invariant VoTs, develop a convex program to compute these equilibria. We further present a bi-level optimization framework to design optimal CBCP schemes to achieve a central planner’s societal objectives. Finally, we conduct numerical experiments based on a case study of the San Mateo 101 Express Lanes Project, one of the first North American CBCP pilots. Our results demonstrate the potential of CBCP to enable low-income travelers to avail of the travel time savings provided by congestion pricing on express lanes while having comparatively low impacts on the travel costs of other road users.
@inproceedings{JalotaEtAl2023, author = {Jalota, D. and Lazarus, J. and Bayen, A. and Pavone, M.}, title = {Credit-Based Congestion Pricing: Equilibrium Properties and Optimal Scheme Design}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2023}, address = {Singapore}, doi = {10.1109/CDC49753.2023.10384266}, url = {https://ieeexplore.ieee.org/document/10384266}, owner = {devanshjalota}, timestamp = {2023-04-01} }
Abstract: In transportation networks, users typically choose routes in a decentralized and self-interested manner to minimize their individual travel costs, which, in practice, often results in inefficient overall outcomes for society. As a result, there has been a growing interest in designing road tolling schemes to cope with these efficiency losses and steer users toward a system-efficient traffic pattern. However, the efficacy of road tolling schemes often relies on having access to complete information on users’ trip attributes, such as their origin-destination (O-D) travel information and their values of time, which may not be available in practice. Motivated by this practical consideration, we propose an online learning approach to set tolls in a traffic network to drive heterogeneous users with different values of time toward a system-efficient traffic pattern. In particular, we develop a simple yet effective algorithm that adjusts tolls at each time period solely based on the observed aggregate flows on the roads of the network without relying on any additional trip attributes of users, thereby preserving user privacy. In the setting where the O-D pairs and values of time of users are drawn i.i.d. at each period, we show that our approach obtains an expected regret and road capacity violation of O(\sqrt{T}), where T is the number of periods over which tolls are updated. Our regret guarantee is relative to an offline oracle that has complete information on users’ trip attributes. We further establish an Ω(\sqrt{T}) lower bound on the regret of any algorithm, which establishes that our algorithm is optimal up to constants. Finally, we demonstrate the superior performance of our approach relative to several benchmarks on a real-world transportation network, thereby highlighting its practical applicability.
@inproceedings{JalotaEtAl2022, author = {Jalota, D. and Gopalakrishnan, K. and Azizan, N. and Johari, R. and Pavone, M.}, title = {Online Learning for Traffic Routing under Unknown Preferences}, booktitle = {{Int. Conf. on Artificial Intelligence and Statistics}}, year = {2023}, keywords = {pub}, owner = {devanshjalota}, timestamp = {2022-05-03}, url = {https://arxiv.org/abs/2203.17150} }
Abstract: We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator’s otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
@inproceedings{EndersEtAl2023, author = {Enders, Tobias and Harrison, James and Pavone, M. and Schiffer, Maximilian}, title = {Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems}, booktitle = {{Learning for Dynamics \& Control Conference}}, year = {2023}, timestamp = {2023-03-20}, owner = {rdyro}, url = {/wp-content/papercite-data/pdf/Enders.Harrison.Pavone.Schiffer.L4DC23.pdf} }
Abstract: An underlying structure in several sampling-based methods for continuous multi-robot motion planning (MRMP) is the tensor roadmap (TR), which emerges from combining multiple PRM graphs constructed for the individual robots via a tensor product. We study the conditions under which the TR encodes a near-optimal solution for MRMP—satisfying these conditions implies near optimality for a variety of popular planners, including dRRT*, and the discrete methods M* and CBS when applied to the continuous domain. We develop the first finite-sample analysis of this kind, which specifies the number of samples, their deterministic distribution, and magnitude of the connection radii that should be used by each individual PRM graph, to guarantee near-optimality using the TR. This significantly improves upon a previous asymptotic analysis, wherein the number of samples tends to infinity, and supports guaranteed high-quality solutions in practice, within bounded running time. To achieve our new result, we first develop a sampling scheme, which we call the staggered grid, for finite-sample motion planning for individual robots, which requires significantly fewer samples than previous work. We then extend it to the much more involved MRMP setting, which requires accounting for interactions among multiple robots. Finally, we report on a few experiments that serve as a verification of our theoretical findings and raise interesting questions for further investigation.
@article{DayanSoloveyEtAl2021, author = {Dayan, D. and Solovey, K. and Pavone, M. and Halperin, D.}, title = {Near-Optimal Multi-Robot Motion Planning with Finite Sampling}, journal = {{IEEE Transactions on Robotics}}, volume = {39}, number = {5}, pages = {3422--3436}, year = {2023}, url = {https://arxiv.org/abs/2011.08944}, owner = {kirilsol}, timestamp = {2024-02-29} }
Abstract: Background. The revolutions in AI hold tremendous capacity to augment human achievements in surgery, but robust integration of deep learning algorithms with high-fidelity surgical simulation remains a challenge. We present a novel application of reinforcement learning (RL) for automating surgical maneuvers in a graphical simulation. Methods. In the Unity3D game engine, the Machine Learning-Agents package was integrated with the NVIDIA FleX particle simulator for developing autonomously behaving RL-trained scissors. Proximal Policy Optimization (PPO) was used to reward movements and desired behavior such as movement along desired trajectory and optimized cutting maneuvers along the deformable tissue-like object. Constant and proportional reward functions were tested, and TensorFlow analytics was used to inform hyperparameter tuning and evaluate performance. Results. RL-trained scissors reliably manipulated the rendered tissue that was simulated with soft-tissue properties. A desirable trajectory of the autonomously behaving scissors was achieved along 1 axis. Proportional rewards performed better compared to constant rewards. Cumulative reward and PPO metrics did not consistently improve across RL-trained scissors in the setting for movement across 2 axes (horizontal and depth). Conclusion. Game engines hold promising potential for the design and implementation of RL-based solutions to simulated surgical subtasks. Task completion was sufficiently achieved in one-dimensional movement in simulations with and without tissue-rendering. Further work is needed to optimize network architecture and parameter tuning for increasing complexity.
@article{BourdillonEtAl2022, author = {Bourdillon, A. and Garg, A. and Wang, H. and Woo, Y. and Pavone, M. and Boyd, J.}, title = {Integration of Reinforcement Learning in a Virtual Robotic Surgical Simulation}, journal = {{Journal of Surgical Innovations}}, year = {2023}, volume = {30}, number = {1}, pages = {94--102}, doi = {10.1177/15533506221095298}, owner = {jthluke}, timestamp = {2024-09-20}, url = {https://journals.sagepub.com/doi/full/10.1177/15533506221095298} }
Abstract: Sequential Convex Programming (SCP) has recently gained significant popularity as an effective method for solving optimal control problems and has been successfully applied in several different domains. However, the theoretical analysis of SCP has received comparatively limited attention, and it is often restricted to discrete-time formulations. In this paper, we present a unifying theoretical analysis of a fairly general class of SCP procedures for continuous-time optimal control problems. In addition to the derivation of convergence guarantees in a continuous-time setting, our analysis reveals two new numerical and practical insights. First, we show how one can more easily account for manifold-type constraints, which are a defining feature of optimal control of mechanical systems. Second, we show how our theoretical analysis can be leveraged to accelerate SCP-based optimal control methods by infusing techniques from indirect optimal control.
@article{BonalliLewEtAl2023, author = {Bonalli, R. and Lew, T. and Pavone, M.}, title = {Analysis of Theoretical and Numerical Properties of Sequential Convex Programming for Continuous-Time Optimal Control}, journal = {{IEEE Transactions on Automatic Control}}, volume = {68}, number = {8}, year = {2023}, pages = {4570--4585}, keywords = {pub}, owner = {lew}, timestamp = {2023-09-12}, url = {https://arxiv.org/abs/2009.05038} }
Abstract: Real-world systems are often characterized by high-dimensional nonlinear dynamics, making them challenging to control in real time. While reduced-order models (ROMs) are frequently employed in model-based control schemes, dimensionality reduction introduces model uncertainty which can potentially compromise the stability and safety of the original high-dimensional system. In this work, we propose a novel reduced-order model predictive control (ROMPC) scheme to solve constrained optimal control problems for nonlinear, high-dimensional systems. To address the challenges of using ROMs in predictive control schemes, we derive an error bounding system that dynamically accounts for model reduction error. Using these bounds, we design a robust MPC scheme that ensures robust constraint satisfaction, recursive feasibility, and asymptotic stability. We demonstrate the effectiveness of our proposed method in simulations on a high-dimensional soft robot with nearly 10,000 states.
@inproceedings{AloraPabonEtAl2023, author = {Alora, J.I. and Pabon, L. and Köhler, J. and Cenedese, M. and Schmerling, E. and Zeilinger, M. N. and Haller, G. and Pavone, M.}, title = {Robust Nonlinear Reduced-Order Model Predictive Control}, year = {2023}, keywords = {pub}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, address = {Singapore}, url = {https://arxiv.org/abs/2309.05746}, owner = {jjalora}, timestamp = {2023-09-11} }
Abstract: Modeling and control of high-dimensional, nonlinear robotic systems remains a challenging task. While various model- and learning-based approaches have been proposed to address these challenges, they broadly lack generalizability to different control tasks and rarely preserve the structure of the dynamics. In this work, we propose a new, data-driven approach for extracting low-dimensional models from data using Spectral Submanifold Reduction (SSMR). In contrast to other data-driven methods which fit dynamical models to training trajectories, we identify the dynamics on generic, low-dimensional attractors embedded in the full phase space of the robotic system. This allows us to obtain computationally-tractable models for control which preserve the system’s dominant dynamics and better track trajectories radically different from the training data. We demonstrate the superior performance and generalizability of SSMR in dynamic trajectory tracking tasks vis-a-vis the state of the art.
@inproceedings{AloraCenedeseEtAl2023, author = {Alora, J.I. and Cenedese, M. and Schmerling, E. and Haller, G. and Pavone, M.}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, title = {Data-Driven Spectral Submanifold Reduction for Nonlinear Optimal Control of High-Dimensional Robots}, year = {2023}, address = {London, United Kingdom}, doi = {10.1109/ICRA48891.2023.10160418}, owner = {somrita}, timestamp = {2024-02-29}, url = {https://arxiv.org/abs/2209.05712} }
Abstract: Data-driven methodologies offer many exciting upsides, but they also introduce new challenges, particularly in the realm of user privacy. Specifically, the way data is collected can pose privacy risks to end users. In many routing services, a single entity (e.g., the routing service provider) collects and manages user trajectory data. When it comes to user privacy, these systems have a central point of failure since users have to trust that this entity will not sell or use their data to infer sensitive private information. With this as motivation, we study the problem of using location data for routing services in a privacy-preserving way. Rather than having users report their location to a central operator, we present a protocol in which users participate in a decentralized and privacy-preserving computation to estimate travel times for the roads in the network in a way that no individual’s location is ever observed by any other party. The protocol uses the Laplace mechanism in conjunction with secure multi-party computation to ensure that it is cryptographically secure and that its output is differentially private. The protocol is computationally efficient and does not require specialized hardware; all it needs is GPS, which is included in most mobile devices. A natural question is if privacy necessitates degradation in accuracy or system performance. We show that if a road has sufficiently high capacity, then the travel time estimated by our protocol is provably close to the ground truth travel time. We validate the protocol through numerical experiments which show that using the protocol as a routing service provides privacy guarantees with minimal overhead to user travel time.
@inproceedings{TsaoYangEtAl2022, author = {Tsao, M. and Yang, K. and Gopalakrishnan, K. and Pavone, M.}, title = {Private Location Sharing for Decentralized Routing Services}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2022}, doi = {10.1109/ITSC55140.2022.9922387}, month = oct, url = {https://arxiv.org/abs/2202.13305}, owner = {gkarthik}, timestamp = {2022-11-21} }
Abstract: Robust motion planning entails computing a global motion plan that is safe under all possible uncertainty realizations, be it in the system dynamics, the robot’s initial position, or with respect to external disturbances. Current approaches for robust motion planning either lack theoretical guarantees, or make restrictive assumptions on the system dynamics and uncertainty distributions. In this paper, we address these limitations by proposing the robust rapidly-exploring random-tree (Robust-RRT) algorithm, which integrates forward reachability analysis directly into sampling-based control trajectory synthesis. We prove that Robust-RRT is probabilistically complete (PC) for nonlinear Lipschitz continuous dynamical systems with bounded uncertainty. In other words, Robust-RRT eventually finds a robust motion plan that is feasible under all possible uncertainty realizations assuming such a plan exists. Our analysis applies even to unstable systems that admit only short-horizon feasible plans; this is because we explicitly consider the time evolution of reachable sets along control trajectories. Thanks to the explicit consideration of time dependency in our analysis, PC applies to unstabilizable systems. To the best of our knowledge, this is the most general PC proof for robust sampling-based motion planning, in terms of the types of uncertainties and dynamical systems it can handle. Considering that an exact computation of reachable sets can be computationally expensive for some dynamical systems, we incorporate sampling-based reachability analysis into Robust-RRT and demonstrate our robust planner on nonlinear, underactuated, and hybrid systems.
@inproceedings{WuLewEtAl2022, author = {Wu, A. and Lew, T. and Solovey, K. and Schmerling, E. and Pavone, M.}, booktitle = {{Int. Symp. on Robotics Research}}, title = {{Robust-RRT}: Probabilistically-Complete Motion Planning for Uncertain Nonlinear Systems}, year = {2022}, month = may, keywords = {pub}, owner = {lew}, timestamp = {2022-05-22}, url = {https://arxiv.org/abs/2205.07728} }
Abstract: When deploying machine learning models in high-stakes robotics applications, the ability to detect unsafe situations is crucial. Early warning systems can provide alerts when an unsafe situation is imminent (in the absence of corrective action). To reliably improve safety, these warning systems should have a provable false negative rate; i.e., of the situations that are unsafe, fewer than epsilon will occur without an alert. In this work, we present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics, in order to tune warning systems to provably achieve an epsilon false negative rate using as few as 1/epsilon data points. We apply our framework to a driver warning system and a robotic grasping application, and empirically demonstrate guaranteed false negative rate and low false detection (positive) rate using very little data.
@inproceedings{LuoZhaoEtAl2022, author = {Luo, R. and Zhao, S. and Kuck, J. and Ivanovic, B. and Savarese, S. and Schmerling, E. and Pavone, M.}, title = {Sample-Efficient Safety Assurances using Conformal Prediction}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2022}, month = may, owner = {rsluo}, timestamp = {2021-09-20}, url = {https://arxiv.org/abs/2109.14082} }
Abstract: Uncertainty pervades through the modern robotic autonomy stack, with nearly every component (e.g., sensors, detection, classification, tracking, behavior prediction) producing continuous or discrete probabilistic distributions. Trajectory forecasting, in particular, is surrounded by uncertainty as its inputs are produced by (noisy) upstream perception and its outputs are predictions that are often probabilistic for use in downstream planning. However, most trajectory forecasting methods do not account for upstream uncertainty, instead taking only the most-likely values. As a result, perceptual uncertainties are not propagated through forecasting and predictions are frequently overconfident. To address this, we present a novel method for incorporating perceptual state uncertainty in trajectory forecasting, a key component of which is a new statistical distance-based loss function which encourages predicting uncertainties that better match upstream perception. We evaluate our approach both in illustrative simulations and on large-scale, real-world data, demonstrating its efficacy in propagating perceptual state uncertainty through prediction and producing more calibrated predictions.
@inproceedings{IvanovicLinEtAl2022, author = {Ivanovic, B. and Lin, Y. and Shrivastava, S. and Chakravarty, P. and Pavone, M.}, title = {Propagating State Uncertainty Through Trajectory Forecasting}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2022}, month = may, keywords = {pub}, owner = {borisi}, timestamp = {2022-02-01} }
Abstract: Reasoning about the future behavior of other agents is critical to safe robot navigation. The multiplicity of plausible futures is further amplified by the uncertainty inherent to agent state estimation from data, including positions, velocities, and semantic class. Forecasting methods, however, typically neglect class uncertainty, conditioning instead only on the agent’s most likely class, even though perception models often return full class distributions. To exploit this information, we present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents’ class probabilities. We additionally present PUP, a new challenging real-world autonomous driving dataset, to investigate the impact of Perceptual Uncertainty in Prediction. It contains challenging crowded scenes with unfiltered agent class probabilities that reflect the long-tail of current state-of-the-art perception systems. We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty, and enables new forecasting capabilities such as counterfactual predictions.
@inproceedings{IvanovicLeeEtAl2021, author = {Ivanovic, B. and Lee, K-H. and Tokmakov, P. and Wulfe, B. and McAllister, R. and Gaidon, A. and Pavone, M.}, title = {Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, month = may, year = {2022}, keywords = {pub}, owner = {borisi}, timestamp = {2021-09-29}, url = {https://arxiv.org/abs/2104.12446} }
Abstract: We present a data-driven algorithm for efficiently computing stochastic control policies for general joint chance constrained optimal control problems. Our approach leverages the theory of kernel distribution embeddings, which allows representing expectation operators as inner products in a reproducing kernel Hilbert space. This framework enables approximately reformulating the original problem using a dataset of observed trajectories from the system without imposing prior assumptions on the parameterization of the system dynamics or the structure of the uncertainty. By optimizing over a finite subset of stochastic open-loop control trajectories, we relax the original problem to a linear program over the control parameters that can be efficiently solved using standard convex optimization techniques. We demonstrate our proposed approach in simulation on a system with nonlinear non-Markovian dynamics navigating in a cluttered environment.
@inproceedings{ThorpeLewEtAl2022, author = {Thorpe, A.~J. and Lew, T. and Oishi, M.~M.~K. and Pavone, M.}, title = {Data-Driven Chance Constrained Control using Kernel Distribution Embeddings}, year = {2022}, booktitle = {{Learning for Dynamics \& Control Conference}}, month = mar, url = {https://arxiv.org/abs/2202.04193}, keywords = {pub}, owner = {lew}, timestamp = {2022-03-01} }
Abstract: Robots are widely deployed in space environments because of their versatility and robustness. However, adverse gravity conditions and challenging terrain geometry expose the limitations of traditional robot designs, which are often forced to sacrifice one of mobility or manipulation capabilities to attain the other. Prospective climbing operations in these environments reveal a need for small, compact robots capable of versatile mobility and manipulation. We propose a novel robotic concept called ReachBot that fills this need by combining two existing technologies: extendable booms and mobile manipulation. ReachBot leverages the reach and tensile strength of extendable booms to achieve an outsized reachable workspace and wrench capability. Through their lightweight, compactable structure, these booms also reduce mass and complexity compared to traditional rigid-link articulated-arm designs. Using these advantages, ReachBot excels in mobile manipulation missions in low gravity or that require climbing, particularly when anchor points are sparse. After introducing the ReachBot concept, we discuss modeling approaches and strategies for increasing stability and robustness. We then develop a 2D analytical model for ReachBot’s dynamics inspired by grasp models for dexterous manipulators. Next, we introduce a waypoint-tracking controller for a planar ReachBot in microgravity. Our simulation results demonstrate the controller’s robustness to disturbances and modeling error. Finally, we briefly discuss next steps that build on these initially promising results to realize the full potential of ReachBot.
@inproceedings{SchneiderBylardEtAl2022, author = {Schneider, S. and Bylard, A. and Chen, T. G. and Wang, P. and Cutkosky, M. R. and Pavone, M.}, title = {{ReachBot:} {A} Small Robot for Large Mobile Manipulation Tasks}, booktitle = {{IEEE Aerospace Conference}}, year = {2022}, address = {Big Sky, Montana}, month = mar, url = {https://arxiv.org/abs/2110.10829}, keywords = {pub}, owner = {schneids}, timestamp = {2021-11-04} }
Abstract: In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of dynamical systems. By sampling inputs, evaluating their images in the true reachable set, and taking their ε-padded convex hull as a set estimator, this algorithm applies to general problem settings and is simple to implement. Our main contribution is the derivation of asymptotic and finite-sample accuracy guarantees using random set theory. This analysis informs algorithmic design to obtain an ε-close reachable set approximation with high probability, provides insights into which reachability problems are most challenging, and motivates safety-critical applications of the technique. On a neural network verification task, we show that this approach is more accurate and significantly faster than prior work. Informed by our analysis, we also design a robust model predictive controller that we demonstrate in hardware experiments.
@inproceedings{LewJansonEtAl2022, author = {Lew, T. and Janson, L. and Bonalli, R. and Pavone, M.}, title = {A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis}, year = {2022}, booktitle = {{Learning for Dynamics \& Control Conference}}, month = mar, url = {https://arxiv.org/abs/2112.05745}, keywords = {pub}, owner = {lew}, timestamp = {2022-03-01} }
Abstract: Reliable and efficient trajectory generation methods are a fundamental need for autonomous dynamical systems of tomorrow. The goal of this article is to provide a comprehensive tutorial of three major convex optimization-based trajectory generation methods: lossless convexification (LCvx), and two sequential convex programming algorithms known as SCvx and GuSTO. In this article, trajectory generation is the computation of a dynamically feasible state and control signal that satisfies a set of constraints while optimizing key mission objectives. The trajectory generation problem is almost always nonconvex, which typically means that it is not readily amenable to efficient and reliable solution onboard an autonomous vehicle. The three algorithms that we discuss use problem reformulation and a systematic algorithmic strategy to nonetheless solve nonconvex trajectory generation tasks through the use of a convex optimizer. The theoretical guarantees and computational speed offered by convex optimization have made the algorithms popular in both research and industry circles. To date, the list of applications includes rocket landing, spacecraft hypersonic reentry, spacecraft rendezvous and docking, aerial motion planning for fixed-wing and quadrotor vehicles, robot motion planning, and more. Among these applications are high-profile rocket flights conducted by organizations like NASA, Masten Space Systems, SpaceX, and Blue Origin. This article aims to give the reader the tools and understanding necessary to work with each algorithm, and to know what each method can and cannot do. A publicly available source code repository supports the numerical examples provided at the end of this article. By the end of the article, the reader should be ready to use each method, to extend them, and to contribute to their many exciting modern applications.
@article{MalyutaEtAl2022, author = {Malyuta, D. and Reynolds, T.~P. and Szmuk, M. and Lew, T. and Bonalli, R. and Pavone, M. and Acikmese, B.}, title = {Convex Optimization for Trajectory Generation}, year = {2022}, journal = {{IEEE Control Systems Magazine}}, volume = {42}, number = {5}, pages = {40--113}, month = oct, url = {https://arxiv.org/abs/2106.09125}, keywords = {pub}, owner = {lew}, timestamp = {2022-01-30} }
Abstract: This paper presents an approach to guaranteed trajectory tracking for nonlinear control-affine systems subject to external disturbances based on robust control contraction metrics (CCM) that aim to minimize the L-infinity gain from the disturbances to the deviation of actual variables of interest from their nominal counterparts. The guarantee is in the form of invariant tubes, computed offline, around any nominal trajectories in which the actual states and inputs of the system are guaranteed to stay despite disturbances. Under mild assumptions, we prove that the proposed robust CCM (RCCM) approach yields tighter tubes than an existing approach based on CCM and input-to-state stability analysis. We show how the RCCM-based tracking controller together with tubes can be incorporated into a feedback motion planning framework to plan safe-guaranteed trajectories for robotic systems. Simulation results for a planar quadrotor illustrate the effectiveness of the proposed method and also empirically demonstrate significantly reduced conservatism compared to the CCM-based approach.
@article{ZhaoLakshmananEtAl2022, author = {Zhao, P. and Lakshmanan, A. and Ackerman, K. and Gahlawat, A. and Pavone, M. and Hovakimyan, N.}, title = {Tube-Certified Trajectory Tracking for Nonlinear Systems With Robust Control Contraction Metrics}, journal = {{IEEE Robotics and Automation Letters}}, volume = {7}, number = {2}, pages = {5528--5535}, year = {2022}, url = {https://arxiv.org/abs/2109.04453}, owner = {rdyro}, timestamp = {2022-02-05} }
Abstract: Challenged by urbanization and increasing travel needs, existing transportation systems call for new mobility paradigms. In this article, we present the emerging concept of Autonomous Mobility-on-Demand, whereby centrally orchestrated fleets of autonomous vehicles provide mobility service to customers. We provide a comprehensive review of methods and tools to model and solve problems related to Autonomous Mobility-on-Demand systems. Specifically, we first identify problem settings for their analysis and control, both from the operational and the planning perspective. We then review modeling aspects, including transportation networks, transportation demand, congestion, operational constraints, and interactions with existing infrastructure. Thereafter, we provide a systematic analysis of existing solution methods and performance metrics, highlighting trends and trade-offs. Finally, we present various directions for further research.
@article{ZardiniLanzettiEtAl2021, author = {Zardini, G. and Lanzetti, N. and Pavone, M. and Frazzoli, E.}, title = {Analysis and Control of Autonomous Mobility-on-Demand Systems: A Review}, journal = {{Annual Review of Control, Robotics, and Autonomous Systems}}, volume = {5}, number = {1}, pages = {633--658}, year = {2022}, url = {https://www.annualreviews.org/doi/abs/10.1146/annurev-control-042920-012811}, owner = {rdyro}, timestamp = {2022-02-05}, keywords = {pub} }
Abstract: The design of future mobility solutions (autonomous vehicles, micromobility solutions, etc.) and the design of the mobility systems they enable are closely coupled. Indeed, knowledge about the intended service of novel mobility solutions would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management policies. This requires tools to study such a coupling and co-design future mobility systems in terms of different objectives. This paper presents a framework to address such co-design problems. In particular, we leverage the recently developed mathematical theory of co-design to frame and solve the problem of designing and deploying an intermodal mobility system, whereby autonomous vehicles service travel demands jointly with micromobility solutions such as shared bikes and e-scooters, and public transit, in terms of fleet sizing, vehicle characteristics, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. Moreover, it only requires very general monotonicity assumptions and it naturally handles multiple objectives, delivering the rational solutions on the Pareto front and thus enabling policy makers to select a policy. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess the costs and benefits of interventions, and that such analytical techniques might inform policy-making in the future.
@article{ZardiniEtAlBis2020, author = {Zardini, G. and Lanzetti, N. and Censi, A. and Frazzoli, E. and Pavone, M.}, title = {Co-Design to Enable User-Friendly Tools to Assess the Impact of Future Mobility Solutions}, journal = {{IEEE Transactions on Network Science and Engineering}}, year = {2022}, note = {In press}, url = {https://arxiv.org/pdf/2008.08975.pdf}, keywords = {press}, owner = {gzardini}, timestamp = {2020-08-21} }
Abstract:
@article{Wollenstein-BetechSalazarEtAl2022, author = {Wollenstein-Betech, S. and Salazar, M. and Houshmand, A. and Pavone, M. and Paschalidis, I. C. and Cassandras, C. G.}, title = {Routing and Rebalancing Intermodal Autonomous Mobility-on-Demand Systems in Mixed Traffic}, journal = {{IEEE Transactions on Intelligent Transportation Systems}}, year = {2022}, volume = {23}, number = {8}, pages = {2261--2276}, url = {/wp-content/papercite-data/pdf/Wollenstein-Betech.Pavone.T-ITS22.pdf}, keywords = {pub}, owner = {rdyro}, timestamp = {2022-08-13} }
Abstract: As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small amounts of information. This stands in stark contrast to modern machine learning systems that are typically designed with a known set of classes and a large number of examples for each class. In this work we extend embedding-based few-shot learning algorithms to the open-world recognition setting. We combine Bayesian non-parametric class priors with an embedding-based pre-training scheme to yield a highly flexible framework which we refer to as few-shot learning for open world recognition (FLOWR). We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets. Our results show, compared to prior methods, strong classification accuracy performance and up to a 12% improvement in H-measure (a measure of novel class detection) from our non-parametric open-world few-shot learning scheme.
@article{WillesHarrisonEtAl2022, author = {Willes, John and Harrison, James and Harakeh, Ali and Finn, Chelsea and Pavone, Marco and Waslander, Steven}, journal = {{IEEE Transactions on Pattern Analysis \& Machine Intelligence}}, title = {Bayesian Embeddings for Few-Shot Open World Recognition}, year = {2022}, keywords = {pub}, owner = {rdyro}, timestamp = {2022-09-21}, url = {https://arxiv.org/abs/2107.13682} }
Abstract: The era of Big Data has brought with it a richer understanding of user behavior through massive data sets, which can help organizations optimize the quality of their services. In the context of transportation research, mobility data can provide Municipal Authorities (MA) with insights on how to operate, regulate, or improve the transportation network. Mobility data, however, may contain sensitive information about end users and trade secrets of Mobility Providers (MP). Due to this data privacy concern, MPs may be reluctant to contribute their datasets to MA. Using ideas from cryptography, we propose an interactive protocol between an MA and an MP in which MA obtains insights from mobility data without MP having to reveal its trade secrets or sensitive data of its users. This is accomplished in two steps: a commitment step, and a computation step. In the first step, Merkle commitments and aggregated traffic measurements are used to generate a cryptographic commitment. In the second step, MP extracts insights from the data and sends them to MA. Using the commitment and zero-knowledge proofs, MA can certify that the information received from MP is accurate, without needing to directly inspect the mobility data. We also present a differentially private version of the protocol that is suitable for the large query regime. The protocol is verifiable for both MA and MP in the sense that dishonesty from one party can be detected by the other. The protocol can be readily extended to the more general setting with multiple MPs via secure multi-party computation.
@article{TsaoYangZoepfPavone2021, author = {Tsao, M. and Yang, K. and Zoepf, S. and Pavone, M.}, title = {Trust but Verify: Cryptographic Data Privacy for Mobility Management}, journal = {{IEEE Transactions on Control of Network Systems}}, volume = {9}, number = {1}, pages = {50--61}, year = {2022}, keywords = {pub}, url = {https://arxiv.org/abs/2104.07768} }
Abstract: When testing conditions differ from those represented in training data, so-called out-of-distribution (OOD) inputs can mar the reliability of black-box learned components in the modern robot autonomy stack. Therefore, coping with OOD data is an important challenge on the path towards trustworthy learning-enabled open-world autonomy. In this paper, we aim to demystify the topic of OOD data and its associated challenges in the context of data-driven robotic systems, drawing connections to emerging paradigms in the ML community that study the effect of OOD data on learned models in isolation. We argue that as roboticists, we should reason about the overall system-level competence of a robot as it performs tasks in OOD conditions. We highlight key research questions around this system-level view of OOD problems to guide future research toward safe and reliable learning-enabled autonomy.
@inproceedings{SinhaSharmaEtAl2022, author = {Sinha, R. and Sharma, S. and Banerjee, S. and Lew, T. and Luo, R. and Richards, S. M. and Sun, Y. and Schmerling, E. and Pavone, M.}, title = {A System-Level View on Out-of-Distribution Data in Robotics}, year = {2022}, keywords = {}, url = {https://arxiv.org/abs/2212.14020}, owner = {rhnsinha}, timestamp = {2022-12-30} }
Abstract: We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. We optimize over a class of nonlinear feedback policies inspired by certainty equivalent “estimate-and-cancel” control laws pioneered in classical adaptive control to achieve significant performance improvements in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive MPC, our approach allows us to take advantage of structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when we cannot directly cancel the additive uncertain function from the dynamics. Moreover, we apply contemporary statistical estimation techniques to certify the system’s safety through persistent constraint satisfaction with high probability. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods.
@inproceedings{SinhaHarrisonEtAl2022, author = {Sinha, R. and Harrison, J. and Richards, S. M. and Pavone, M.}, title = {Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty}, year = {2022}, keywords = {pub}, booktitle = {{American Control Conference}}, url = {https://arxiv.org/abs/2104.08261}, owner = {rhnsinha}, timestamp = {2022-01-31} }
Abstract: We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems are commonly used to model the nonlinear effects of an unknown environment on a nominal linear system. Inspired by certainty equivalent “estimate-and-cancel” control laws pioneered in classical adaptive control, we optimize over a class of nonlinear feedback policies to significantly improve performance in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive model predictive control, our approach allows us to take advantage of structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when we cannot directly cancel the additive uncertain function from the dynamics. Moreover, we apply contemporary statistical estimation techniques to certify the system’s safety in the form of persistent constraint satisfaction with high probability. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods.
@article{SinhaHarrisonEtAl2022b, author = {Sinha, R. and Harrison, J. and Richards, S. M. and Pavone, M.}, title = {Adaptive Robust Model Predictive Control via Uncertainty Cancellation}, journal = {{IEEE Transactions on Automatic Control}}, year = {2022}, keywords = {press}, note = {In press}, url = {https://arxiv.org/abs/2212.01371}, owner = {rhnsinha}, timestamp = {2023-01-30} }
Abstract: Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments. However, state-of-the-art deep RL approaches typically lack safety guarantees, especially when the robot and environment models are unknown. To justify widespread deployment, robots must respect safety constraints without sacrificing performance. Thus, we propose a Black-box Reachability-based Safety Layer (BRSL) with three main components: (1) data-driven reachability analysis for a black-box robot model, (2) a “dreaming” trajectory planner that hallucinates future actions and observations using an ensemble of neural networks trained online, and (3) a differentiable polytope collision check between the reachable set and obstacles that enables correcting unsafe actions. In simulation, BRSL outperforms other state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, and a trajectory-tracking point mass with an unsafe set adjacent to the area of highest reward.
@article{SelimAlanwarEtAl2022, author = {Selim, M. and Alanwar, A. and Kousik, S. and Gao, G. and Pavone, M. and Johansson, K.}, title = {Safe Reinforcement Learning Using Black-Box Reachability Analysis}, journal = {{IEEE Robotics and Automation Letters}}, volume = {7}, number = {4}, pages = {10665--10672}, year = {2022}, url = {https://arxiv.org/pdf/2204.07417}, owner = {rdyro}, timestamp = {2022-07-22} }
Abstract: In e-commerce warehouses, online retailers increase their efficiency by using a mixed-shelves (or scattered storage) concept, where unit loads are purposefully broken down into single items, which are individually stored in multiple locations. Irrespective of the stock keeping units a customer jointly orders, this storage strategy increases the likelihood that somewhere in the warehouse the items of the requested stock keeping units will be in close vicinity, which may significantly reduce an order picker’s unproductive walking time. This paper optimizes picker routing through such mixed-shelves warehouses. Specifically, we introduce a generic exact algorithm that covers a multitude of picking policies, independently of the underlying picking zone layout, and is suitable for real-time applications. Besides its generality, this algorithm provides a new state of the art in terms of solvable instance sizes and computational times, providing average reductions of 67% compared with the best known algorithm. Using this algorithm we compare three different real-world e-commerce warehouse settings that differ slightly in their application of scattered storage and in their picking policies. Our results reveal that the right combination of drop-off points, dynamic batching, the utilization of picking carts, and the picking zone layout can greatly improve the picking performance. In particular, some combinations of policies yield efficiency increases of more than 30%.
@article{SchifferBoysenEtAl2018, author = {Schiffer, M. and Boysen, N. and Laporte, G. and Pavone, M.}, title = {Optimal picking policies in e-commerce warehouses}, journal = {{Management Science}}, volume = {68}, number = {10}, pages = {7497--7517}, year = {2022}, keywords = {pub}, owner = {schiffer}, timestamp = {2018-08-14} }
Abstract: Autonomous systems and humans are increasingly sharing the same space. Robots work side by side or even hand in hand with humans to balance each other’s limitations. Such cooperative interactions are ever more sophisticated. Thus, the ability to reason not just about a human’s center of gravity position, but also its granular motion is an important prerequisite for human-robot interaction. However, many algorithms ignore the multimodal nature of humans or neglect uncertainty in their motion forecasts. We present Motron, a multimodal, probabilistic, graph-structured model that captures human multimodality using probabilistic methods while being able to output deterministic motions and corresponding confidence values for each mode. Our model aims to be tightly integrated with the robotic planning-control-interaction loop, outputting physically feasible human motions and being computationally efficient. We demonstrate the performance of our model on several challenging real-world motion forecasting datasets, outperforming a wide array of generative methods while providing state-of-the-art deterministic motions if required, in both cases using significantly less computational power than state-of-the-art algorithms.
@inproceedings{SalzmannPavoneEtAl2022, author = {Salzmann, T. and Pavone, M. and Ryll, M.}, title = {Motron: Multimodal Probabilistic Human Motion Forecasting}, booktitle = {{IEEE Conf. on Computer Vision and Pattern Recognition}}, year = {2022}, keywords = {pub}, owner = {salzmann}, timestamp = {2022-03-02}, url = {https://arxiv.org/pdf/2203.04132.pdf} }
Abstract: We study an online hypergraph matching problem inspired by ridesharing and delivery applications. The vertices of a hypergraph are revealed sequentially and must be matched within d timesteps of their reveal; otherwise they will leave the system in favor of an outside option. Each hyperedge can contain at most k vertices, is revealed to the algorithm once all of its vertices have arrived, and can only be included in the matching before any of its vertices leave the system. We study utility maximization and cost minimization objectives in this model. In the utility maximization setting, we show that the optimal competitive ratio is 1/d whenever k >= 3, and is achievable in polynomial time for any fixed k. For the cost minimization setting, we assume costs are monotone, which is a natural assumption in ridesharing and delivery problems. When k = 2, we show that the optimal competitive ratio for deterministic algorithms is 3/2 and is achieved by a polynomial-time thresholding algorithm. When k > 2, we show that a polynomial-time randomized batching algorithm is (2 - 1/d) log k-competitive, and it is NP-hard to achieve a competitive ratio better than log k - O(log log k).
@article{TsaoEtAl2021, author = {Pavone, M. and Saberi, A. and Schiffer, M. and Tsao, M.}, title = {Online Hypergraph Matching with Delays}, journal = {{Operations Research}}, volume = {70}, number = {4}, pages = {2194--2212}, year = {2022}, comment = {This manuscript was first submitted to Operations Research on 01-27-2021. The first round of revision was submitted on 08-10-2021. The paper was accepted by Operations Research on 12-21-2021.}, keywords = {pub}, owner = {mwtsao}, timestamp = {2021-08-10} }
Abstract: This study investigated a novel mission architecture where a long-reach crawling and anchoring robot, which repurposes extendable booms for mobile manipulation, is deployed to explore and sample difficult terrains on solar system bodies, with a key focus on Mars exploration. To this end, the robot concept introduced by this effort, called ReachBot, uses rollable extendable booms as manipulator arms and as highly reconfigurable structural members. ReachBot is capable of (1) rapid and versatile crawling through sequences of long-distance grasps, (2) traversing a large workspace while anchored (by adjusting boom lengths and orientations), and (3) applying high interaction forces and torques, primarily leveraging boom tensile strength and the variety of anchors within reach. These features allow a light and compact robot to achieve versatile mobility and forceful interaction in traditionally difficult environments such as vertical cliff walls or the rocky and uneven interiors of caves on Mars (see figure, left). In particular, ReachBot is uniquely suited for exploring and sampling Noachian targets on Mars that contain key sources of historical and astrobiological information preserved in strata in the form of cliff-face fractures and sublimation pits [1]. To develop this concept, this Phase I study brought together an interdisciplinary team of experts in robot autonomy, robotic manipulation, mechanical design, bio-inspired grasping, and geological planetary science from Stanford.
@techreport{PavoneCutkoskyEtAl2012, author = {Pavone, M. and Cutkosky, M. and Lap\^{o}tre, M. and Schneider, S. and Chen, T. G. and Bylard, A.}, title = {ReachBot: a Small Robot for Large Mobile Manipulation Tasks in Martian Cave Environments}, institution = {{NASA NIAC Program}}, year = {2022}, note = {Final report}, owner = {schneids}, timestamp = {2022-10-14}, url = {/wp-content/papercite-data/pdf/Pavone.ea.NIAC.Final.Report.2022.pdf} }
Abstract: This paper proposes a two-level, data-driven, digital twin concept for the autonomous landing of aircraft, under some assumptions. It features a digital twin instance for model predictive control; and an innovative, real-time, digital twin prototype for fluid-structure interaction and flight dynamics to inform it. The latter digital twin is based on the linearization about a pre-designed glideslope trajectory of a high-fidelity, viscous, nonlinear computational model for flight dynamics; and its projection onto a low-dimensional approximation subspace to achieve real-time performance, while maintaining accuracy. Its main purpose is to predict in real-time, during flight, the state of an aircraft and the aerodynamic forces and moments acting on it. Unlike static lookup tables or regression-based surrogate models based on steady-state wind tunnel data, the aforementioned real-time digital twin prototype allows the digital twin instance for model predictive control to be informed by a truly dynamic flight model, rather than a less accurate set of steady-state aerodynamic force and moment data points. The paper describes in detail the construction of the proposed two-level digital twin concept and its verification by numerical simulation. It also reports on its preliminary flight validation in autonomous mode for an off-the-shelf unmanned aerial vehicle instrumented at Stanford University.
@article{McClellanLorenzettiEtAl2021, author = {McClellan, A. and Lorenzetti, J. and Pavone, M. and Farhat, C.}, title = {A Physics-Based Digital Twin for Model Predictive Control of Autonomous Unmanned Aerial Vehicle Landing}, journal = {{Philosophical Transactions of the Royal Society A}}, volume = {380}, keywords = {pub}, year = {2022}, url = {/wp-content/papercite-data/pdf/McClellan.Lorenzetti.ea.PTRSA21.pdf}, owner = {jlorenze}, timestamp = {2021-11-11} }
Abstract: Very high dimensional nonlinear systems arise in many engineering problems due to semi-discretization of the governing partial differential equations, e.g. through finite element methods. The complexity of these systems presents computational challenges for direct application to automatic control. While model reduction has seen ubiquitous applications in control, the use of nonlinear model reduction methods in this setting remains difficult. The problem lies in preserving the structure of the nonlinear dynamics in the reduced order model for high-fidelity control. In this work, we leverage recent advances in Spectral Submanifold (SSM) theory to enable model reduction under well-defined assumptions for the purpose of efficiently synthesizing feedback controllers.
@inproceedings{MahlknechtAloraEtAl2022, author = {Mahlknecht, F. and Alora, J. I. and Jain, S. and Schmerling, E. and Bonalli, R. and Haller, G. and Pavone, M.}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, title = {Using Spectral Submanifolds for Nonlinear Periodic Control}, year = {2022}, keywords = {pub}, owner = {jjalora}, timestamp = {2022-11-22}, url = {https://arxiv.org/abs/2209.06573} }
Abstract: Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores should be calibrated, i.e., they should reflect the reliability of the prediction. Confidence scores that minimize standard metrics such as the expected calibration error (ECE) accurately measure the reliability on average across the entire population. However, it is in general impossible to measure the reliability of an individual prediction. In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability. For each individual prediction, the LCE measures the average reliability of a set of similar predictions, where similarity is quantified by a kernel function on a pretrained feature space and by a binning scheme over predicted model confidences. We show theoretically that the LCE can be estimated sample-efficiently from data, and empirically find that it reveals miscalibration modes that are more fine-grained than the ECE can detect. Our key result is a novel local recalibration method, LoRe, that improves confidence scores for individual predictions and decreases the LCE. Experimentally, we show that our recalibration method produces more accurate confidence scores, which improves downstream fairness and decision making on classification tasks with both image and tabular data.
@inproceedings{LuoEtAl2022, author = {Luo, R. and Bhatnagar, A. and Wang, H. and Xiong, C. and Savarese, S. and Bai, Y. and Zhao, S. and Ermon, S. and Schmerling, E. and Pavone, M.}, title = {Local Calibration: Metrics and Recalibration}, booktitle = {{Proc. Conf. on Uncertainty in Artificial Intelligence}}, year = {2022}, keywords = {pub}, owner = {rdyro}, timestamp = {2022-01-26}, url = {https://arxiv.org/abs/2102.10809} }
Abstract: Model predictive controllers use dynamics models to solve constrained optimal control problems. However, computational requirements for real-time control have limited their use to systems with low-dimensional models. Nevertheless, high-dimensional models arise in many settings; for example, discretization methods for generating finite-dimensional approximations to partial differential equations can result in models with thousands to millions of dimensions. In such cases, reduced-order models (ROMs) can significantly reduce computational requirements, but model approximation error must be considered to guarantee controller performance. In this article, a reduced-order model predictive control (ROMPC) scheme is proposed to solve robust, output feedback, constrained optimal control problems for high-dimensional linear systems. Computational efficiency is obtained by using projection-based ROMs, and guarantees on robust constraint satisfaction and stability are provided. The performance of the approach is demonstrated in simulation for several examples, including an aircraft control problem leveraging an inviscid computational fluid dynamics model with dimension 998 930.
@article{LorenzettiMcClellanEtAl2022, author = {Lorenzetti, J. and McClellan, A. and Farhat, C. and Pavone, M.}, title = {Linear Reduced-Order Model Predictive Control}, journal = {{IEEE Transactions on Automatic Control}}, volume = {67}, number = {11}, pages = {5980--5995}, year = {2022}, keywords = {pub}, owner = {rdyro}, timestamp = {2022-10-27}, url = {https://arxiv.org/abs/2012.03384} }
Abstract: To safely deploy learning-based systems in highly uncertain environments, one must ensure that they always satisfy constraints. In this work, we propose a practical and theoretically justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation: the expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty where reaching the goal safely is initially infeasible by first exploring to gather data and reduce uncertainty, before autonomously exploiting the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework provides safety guarantees in the form of a single joint chance constraint. Furthermore, we use this theoretical analysis to motivate regularization of the model to improve performance. We extensively demonstrate our approach in simulation and on hardware.
@article{LewEtAl2022, author = {Lew, T. and Sharma, A. and Harrison, J. and Bylard, A. and Pavone, M.}, title = {Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework}, journal = {{IEEE Transactions on Robotics}}, volume = {38}, number = {5}, pages = {2888--2907}, year = {2022}, doi = {10.1109/TRO.2022.3154715}, url = {https://arxiv.org/pdf/2008.11700.pdf}, owner = {lew}, timestamp = {2022-01-27} }
Abstract: There are spatio-temporal rules that dictate how robots should operate in complex environments, e.g., road rules govern how (self-driving) vehicles should behave on the road. However, seamlessly incorporating such rules into a robot control policy remains challenging especially for real-time applications. In this work, given a desired spatio-temporal specification expressed in the Signal Temporal Logic (STL) language, we propose a semi-supervised controller synthesis technique that is attuned to human-like behaviors while satisfying desired STL specifications. Offline, we synthesize a trajectory-feedback neural network controller via an adversarial training scheme that summarizes past spatio-temporal behaviors when computing controls, and then online, we perform gradient steps to improve specification satisfaction. Central to the offline phase is an imitation-based regularization component that fosters better policy exploration and helps induce naturalistic human behaviors. Our experiments demonstrate that having imitation-based regularization leads to higher qualitative and quantitative performance compared to optimizing an STL objective only as done in prior work. We demonstrate the efficacy of our approach with an illustrative case study and show that our proposed controller outperforms a state-of-the-art shooting method in both performance and computation time.
@inproceedings{LeungPavone2022, author = {Leung, K. and Pavone, M.}, title = {Semi-Supervised Trajectory-Feedback Controller Synthesis with Signal Temporal Logic Specifications}, booktitle = {{American Control Conference}}, year = {2022}, keywords = {pub}, owner = {karenl7}, timestamp = {2021-10-14} }
Abstract: System optimum (SO) routing, wherein the total travel time of all users is minimized, is a holy grail for transportation authorities. However, SO routing may discriminate against users who incur much larger travel times than others to achieve high system efficiency, i.e., low total travel times. To address the inherent unfairness of SO routing, we study the β-fair SO problem whose goal is to minimize the total travel time while guaranteeing a β≥1 level of unfairness, which specifies the maximal ratio between the travel times of different users with shared origins and destinations. To obtain feasible solutions to the β-fair SO problem while achieving high system efficiency, we develop a new convex program, the Interpolated Traffic Assignment Problem (I-TAP), which interpolates between a fair and an efficient traffic-assignment objective. We then leverage the structure of I-TAP to develop two pricing mechanisms to collectively enforce the I-TAP solution in the presence of selfish homogeneous and heterogeneous users, respectively, that independently choose routes to minimize their own travel costs. We mention that this is the first study of pricing in the context of fair routing. Finally, we use origin-destination demand data for a range of transportation networks to numerically evaluate the performance of I-TAP as compared to a state-of-the-art algorithm. The numerical results indicate that our approach is faster by several orders of magnitude, while achieving higher system efficiency for most levels of unfairness.
@inproceedings{JalotaSoloveyEtAl2022, author = {Jalota, D. and Solovey, K. and Tsao, M. and Zoepf, S. and Pavone, M.}, title = {Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment}, booktitle = {{Proc. Int. Conf. on Autonomous Agents and Multiagent Systems}}, year = {2022}, owner = {devanshjalota}, timestamp = {2022-03-01} }
Abstract: Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited amount of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.
@inproceedings{GammelliYangEtAl2022, author = {Gammelli, D. and Yang, K. and Harrison, J. and Rodrigues, F. and Pereira, F. and Pavone, M.}, booktitle = {{ACM Int. Conf. on Knowledge Discovery and Data Mining}}, title = {Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand}, year = {2022}, keywords = {pub}, owner = {gammelli}, url = {https://arxiv.org/abs/2202.07147}, timestamp = {2022-03-02} }
Abstract: Dynamic network flow models have been extensively studied and widely used in the past decades to formulate many problems with great real-world impact, such as transportation, supply chain management, power grid control, and more. Within this context, time-expansion techniques currently represent a generic approach for solving control problems over dynamic networks. However, the complexity of these methods does not allow traditional approaches to scale to large networks, especially when these need to be solved recursively over a receding horizon (e.g., to yield a sequence of actions in model predictive control). Moreover, tractable optimization-based approaches are often limited to simple linear deterministic settings and are not able to handle environments with stochastic, non-linear, or unknown dynamics. In this work, we present dynamic network flow problems through the lens of reinforcement learning and propose a graph network-based framework that can handle a wide variety of problems and learn efficient algorithms without significantly compromising optimality. Instead of a naive and poorly-scalable formulation, in which agent actions (and thus network outputs) consist of actions on edges, we present a two-phase decomposition. The first phase consists of an RL agent specifying desired outcomes to the actions. The second phase exploits the problem structure to solve a convex optimization problem and achieve (as best as possible) these desired outcomes. This formulation leads to dramatically improved scalability and performance. We further highlight a collection of features that are potentially desirable to system designers, investigate design decisions, and present experiments showing the utility, scalability, and flexibility of our framework.
@inproceedings{GammelliHarrisonEtAl2022, author = {Gammelli, D. and Harrison, J. and Yang, K. and Pavone, M. and Rodrigues, F. and Pereira, F. C.}, booktitle = {{Learning on Graphs Conference}}, title = {Graph Reinforcement Learning for Network Control via Bi-Level Optimization}, year = {2022}, keywords = {pub}, owner = {gammelli}, timestamp = {2022-11-24} }
Abstract: In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the “lower” problem) is itself to be optimized (in the “upper” problem) as a function of the parameters. Many existing approaches to bilevel optimization employ first-order sensitivity analysis, based on the implicit function theorem (IFT), for the lower problem to derive a gradient of the lower problem solution with respect to its parameters; this IFT gradient is then used in a first-order optimization method for the upper problem. This paper extends this sensitivity analysis to provide second-order derivative information of the lower problem (which we call the IFT Hessian), enabling the usage of faster-converging second-order optimization methods at the upper level. Our analysis shows that (i) much of the computation already used to produce the IFT gradient can be reused for the IFT Hessian, (ii) error bounds derived for the IFT gradient readily apply to the IFT Hessian, and (iii) computing IFT Hessians can significantly reduce overall computation by extracting more information from each lower level solve. We corroborate our findings and demonstrate the broad range of applications of our method by applying it to problem instances of least squares hyperparameter auto-tuning, multi-class SVM auto-tuning, and inverse optimal control.
@inproceedings{DyroSchmerlingEtAl2022, author = {Dyro, R. and Schmerling, E. and Arechiga, N. and Pavone, M.}, title = {Second-Order Sensitivity Analysis for Bilevel Optimization}, booktitle = {{Int. Conf. on Artificial Intelligence and Statistics}}, year = {2022}, keywords = {pub}, owner = {rdyro}, url = {https://arxiv.org/abs/2205.02329}, timestamp = {2022-02-05} }
Abstract: We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to 100 agents on the real-world Manhattan road network containing nearly 4500 vertices and 10000 edges. Our framework saves more than 50% of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within 5 minutes on commodity hardware.
@inproceedings{ChoudhurySoloveyEtAl2022, author = {Choudhury, S. and Solovey, K. and Kochenderfer, M. and Pavone, M.}, title = {Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks}, booktitle = {{Proc. Int. Conf. on Autonomous Agents and Multiagent Systems}}, year = {2022}, keywords = {pub}, owner = {kirilsol}, timestamp = {2021-10-17} }
Abstract: ReachBot is a new concept for planetary exploration, consisting of a small body and long, lightweight extending arms loaded primarily in tension. The arms are equipped with spined grippers for anchoring on rock surfaces. The design and testing of a planar prototype is presented here. Experiments with rock grasping and coordinated locomotion illustrate the advantages of low inertia passive grippers, triggered by impact and using stored mechanical energy for the internal force. Gripper design involves a trade-off among the range of possible grasp angles, maximum grasp force, required triggering force, and required reset force. The current prototype can pull with up to 8 N when gripping volcanic rock, limited only by the strength of the 3D printed components. Calculations predict a maximum pull of 26 N for the same spines and stronger materials.
@inproceedings{ChenMillerEtAl2022, author = {Chen, T. G. and Miller, B. and Winston, C. and Schneider, S. and Bylard, A. and Pavone, M. and Cutkosky, M. R.}, title = {{ReachBot:} {A} Small Robot with Exceptional Reach for Rough Terrain}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2022}, url = {/wp-content/papercite-data/pdf/Chen.Miller.ea.RAL22.pdf}, keywords = {pub}, owner = {bylard}, timestamp = {2021-12-09} }
Abstract:
@article{ChenCauligiEtAl2022, author = {Chen, T. G. and Cauligi, A. and Suresh, S. A. and Pavone, M. and Cutkosky, M. R.}, title = {Testing Gecko-Inspired Adhesives with {Astrobee} Aboard the {ISS}}, journal = {{IEEE Robotics and Automation Magazine}}, volume = {29}, number = {3}, pages = {24--33}, year = {2022}, keywords = {pub}, owner = {acauligi}, timestamp = {2021-11-02}, url = {https://ieeexplore.ieee.org/document/9783137} }
Abstract: Many robotics problems, from robot motion planning to object manipulation, can be modeled as mixed-integer convex programs (MICPs). However, state-of-the-art algorithms are still unable to solve MICPs for control problems quickly enough for online use and existing heuristics can typically only find suboptimal solutions that might degrade robot performance. In this work, we turn to data-driven methods and present the Combinatorial Offline, Convex Online (CoCo) algorithm for quickly finding high-quality solutions for MICPs. CoCo consists of a two-stage approach. In the offline phase, we train a neural network classifier that maps the problem parameters to a logical strategy, which we define as the discrete arguments and relaxed big-M constraints associated with the optimal solution for that problem. Online, the classifier is applied to select a candidate logical strategy given new problem parameters; applying this logical strategy allows us to solve the original MICP as a convex optimization problem. We show through numerical experiments how CoCo finds near-optimal solutions to MICPs arising in robot planning and control with 1 to 2 orders of magnitude solution speedup compared to other data-driven approaches and solvers.
@article{CauligiCulbertsonEtAl2022, author = {Cauligi, A. and Culbertson, P. and Schmerling, E. and Schwager, M. and Stellato, B. and Pavone, M.}, title = {{CoCo}: Online Mixed-Integer Control via Supervised Learning}, journal = {{IEEE Robotics and Automation Letters}}, volume = {7}, number = {2}, pages = {1447--1454}, year = {2022}, url = {http://arxiv.org/abs/2107.08143}, keywords = {pub}, owner = {acauligi}, timestamp = {2022-03-10} }
Abstract: Verifying that input-output relationships of a neural network conform to prescribed operational specifications is a key enabler towards deploying these networks in safety-critical applications. Semidefinite programming (SDP)-based approaches to Rectified Linear Unit (ReLU) network verification transcribe this problem into an optimization problem, where the accuracy of any such formulation reflects the level of fidelity in how the neural network computation is represented, as well as the relaxations of intractable constraints. While the literature contains much progress on improving the tightness of SDP formulations while maintaining tractability, comparatively little work has been devoted to the other extreme, i.e., how to most accurately capture the original verification problem before SDP relaxation. In this work, we develop an exact, convex formulation of verification as a completely positive program (CPP), and provide analysis showing that our formulation is minimal—the removal of any constraint fundamentally misrepresents the neural network computation. We leverage our formulation to provide a unifying view of existing approaches, and give insight into the source of large relaxation gaps observed in some cases.
@inproceedings{BrownSchmerlingEtAl2022, author = {Brown, R. and Schmerling, E. and Azizan, N. and Pavone, M.}, title = {A Unified View of SDP-based Neural Network Verification through Completely Positive Programming}, booktitle = {{Int. Conf. on Artificial Intelligence and Statistics}}, year = {2022}, owner = {rabrown1}, timestamp = {2022-02-17} }
Abstract:
@article{BonalliLewESAIM2022, author = {Bonalli, R. and Lew, T. and Pavone, M.}, title = {Sequential Convex Programming For Non-Linear Stochastic Optimal Control}, journal = {{ESAIM: Control, Optimisation \& Calculus of Variations}}, volume = {28}, year = {2022}, owner = {lew}, timestamp = {2022-01-29}, url = {https://arxiv.org/abs/2009.05182} }
Abstract: Value functions are powerful abstractions broadly used across optimal control and robotics algorithms. Several lines of work have attempted to leverage trajectory optimization to learn value function approximations, usually by solving a large number of trajectory optimization problems as a means to generate training data. Even though these methods point to a promising direction, for sufficiently complex tasks, their sampling requirements can become computationally intractable. In this work, we leverage insights from adversarial learning in order to improve the sampling efficiency of a simple value function learning algorithm. We demonstrate how generating adversarial samples for this task presents a unique challenge due to the loss function that does not admit a closed form expression of the samples, but that instead requires the solution to a nonlinear optimization problem. Our key insight is that by leveraging duality theory from optimization, it is still possible to compute adversarial samples for this learning problem with virtually no computational overhead, including without having to keep track of shifting distributions of approximation errors or having to train generative models. We apply our method, named SEAGuL, to a canonical control task (balancing the acrobot) and a more challenging and highly dynamic nonlinear control task (the perching of a small glider). We demonstrate that compared to random sampling, with the same number of samples, training value function approximations using SEAGuL leads to improved generalization errors that also translate to control performance improvement.
@inproceedings{LandryDaiEtAl2021, author = {Landry, B. and Dai, H. and Pavone, M.}, title = {SEAGuL: Sample Efficient Adversarially Guided Learning of Value Functions}, booktitle = {{Learning for Dynamics \& Control Conference}}, year = {2021}, month = dec, url = {http://proceedings.mlr.press/v144/landry21a/landry21a.pdf}, owner = {blandry}, timestamp = {2021-03-15} }
Abstract: Sharing forecasts of network timeseries data, such as cellular or electricity load patterns, can improve independent control applications ranging from traffic scheduling to power generation. Typically, forecasts are designed without knowledge of a downstream controller’s task objective, and thus simply optimize for mean prediction error. However, such task-agnostic representations are often too large to stream over a communication network and do not emphasize salient temporal features for cooperative control. This paper presents a solution to learn succinct, highly-compressed forecasts that are co-designed with a modular controller’s task objective. Our simulations with real cellular, Internet-of-Things (IoT), and electricity load data show we can improve a model predictive controller’s performance by at least 25% while transmitting 80% less data than the competing method. Further, we present theoretical compression results for a networked variant of the classical linear quadratic regulator (LQR) control problem.
@inproceedings{ChengPavoneEtAl2021, author = {Cheng, J. and Pavone, M. and Katti, S. and Chinchali, S. and Tang, A.}, title = {Data Sharing and Compression for Cooperative Networked Control}, booktitle = {{Conf. on Neural Information Processing Systems}}, year = {2021}, month = dec, url = {https://arxiv.org/abs/2109.14675}, owner = {borisi}, timestamp = {2021-10-06} }
Abstract: Congestion pricing has long been hailed as a means to mitigate traffic congestion; however, its practical adoption has been limited due to the resulting social inequity issue, e.g., low-income users are priced out of certain roads. This issue has spurred interest in the design of equitable mechanisms that aim to refund the collected toll revenues as lump-sum transfers to users. Although revenue refunding has been extensively studied for over three decades, there has been no thorough characterization of how such schemes can be designed to simultaneously achieve system efficiency and equity objectives. In this work, we bridge this gap through the study of congestion pricing and revenue refunding (CPRR) schemes in non-atomic congestion games. We first develop CPRR schemes, which in comparison to the untolled case, simultaneously (i) increase system efficiency and (ii) decrease wealth inequality, while being (iii) user-favorable: irrespective of their initial wealth or values-of-time (which may differ across users) users would experience a lower travel cost after the implementation of the proposed scheme. We then characterize the set of optimal user-favorable CPRR schemes that simultaneously maximize system efficiency and minimize wealth inequality. These results assume a well-studied behavior model of users minimizing a linear function of their travel times and tolls, without considering refunds. We also study a more complex behavior model wherein users are influenced by and react to the amount of refund that they receive. Although, in general, the two models can result in different outcomes in terms of system efficiency and wealth inequality, we establish that those outcomes coincide when the aforementioned optimal CPRR scheme is implemented. Overall, our work demonstrates that through appropriate refunding policies we can achieve system efficiency while reducing wealth inequality.
@inproceedings{JalotaEtAl2021, author = {Jalota, D. and Solovey, K. and Gopalakrishnan, K. and Zoepf, S. and Balakrishnan, H. and Pavone, M.}, title = {When Efficiency meets Equity in Congestion Pricing and Revenue Refunding Schemes}, booktitle = {{ACM Conf. on Equity and Access in Algorithms, Mechanisms, and Optimization}}, year = {2021}, address = {Online}, month = oct, owner = {devanshjalota}, timestamp = {2021-06-22}, url = {https://dl.acm.org/doi/10.1145/3465416.3483296} }
Abstract: Automated vehicles (AVs) are expected to be beneficial for Mobility-on-Demand (MoD), thanks to their ability to be globally coordinated. To facilitate the steady transition towards full autonomy, we consider the transition period of AV deployment, whereby an MoD system operates a mixed fleet of automated vehicles (AVs) and human-driven vehicles (HVs). In such systems, AVs are centrally coordinated by the operator, and the HVs might strategically respond to the coordination of AVs. We devise computationally tractable strategies to coordinate mixed fleets in MoD systems. Specifically, we model an MoD system with a mixed fleet using a Stackelberg framework where the MoD operator serves as the leader and human-driven vehicles serve as the followers. We develop two models: 1) a steady-state model to analyze the properties of the problem and determine the planning variables (e.g., compensations, prices, and the fleet size of AVs), and 2) a time-varying model to design a real-time coordination algorithm for AVs. The proposed models are validated using a case study inspired by real operational data of an MoD service in Singapore. Results show that the proposed algorithms can significantly improve system performance.
@inproceedings{YangTsaoEtAl2021, author = {Yang, K. and Tsao, M. and Xu, X. and Pavone, M.}, title = {Real-Time Control of Mixed Fleets in Mobility-on-Demand Systems}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2021}, note = {{Extended Version available} at \url{https://arxiv.org/abs/2008.08131}}, address = {Indianapolis, IN, USA}, month = sep, url = {https://arxiv.org/pdf/2008.08131}, owner = {ykd07}, timestamp = {2021-06-15} }
Abstract: Charging infrastructure is the coupling link between power and transportation networks, thus determining charging station siting is necessary for planning of power and transportation systems. While previous works have either optimized for charging station siting given historic travel behavior, or optimized fleet routing and charging given an assumed placement of the stations, this paper introduces a linear program that optimizes for station siting and macroscopic fleet operations in a joint fashion. Given an electricity retail rate and a set of travel demand requests, the optimization minimizes total cost for an autonomous EV fleet comprising travel costs, station procurement costs, fleet procurement costs, and electricity costs, including demand charges. Specifically, the optimization returns the number of charging plugs for each charging rate (e.g., Level 2, DC fast charging) at each candidate location, as well as the optimal routing and charging of the fleet. From a case study of an electric vehicle fleet operating in San Francisco, our results show that, albeit with range limitations, small EVs with low procurement costs and high energy efficiencies are the most cost-effective in terms of total ownership costs. Furthermore, the optimal siting of charging stations is more spatially distributed than the current siting of stations, consisting mainly of high-power Level 2 AC stations (16.8 kW) with a small share of DC fast charging stations and no standard 7.7 kW Level 2 stations. Optimal siting reduces the total costs, empty vehicle travel, and peak charging load by up to 10%.
@inproceedings{LukeSalazarEtAl2021, author = {Luke, J. and Salazar, M. and Rajagopal, R. and Pavone, M.}, title = {Joint Optimization of Autonomous Electric Vehicle Fleet Operations and Charging Station Siting}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2021}, address = {Indianapolis, IN}, month = sep, doi = {10.1109/ITSC48978.2021.9565089}, owner = {jthluke}, timestamp = {2023-11-15}, url = {http://arxiv.org/abs/2107.00165} }
Abstract: Tractable safety-ensuring algorithms for cyber-physical systems are important in critical applications. Approaches based on Control Barrier Functions assume continuous enforcement, which is not possible in an online fashion. This paper presents two tractable algorithms to ensure forward invariance of discrete-time controlled cyber-physical systems. Both approaches are based on Control Barrier Functions to provide strict mathematical safety guarantees. The first algorithm exploits Lipschitz continuity and formulates the safety condition as a robust program which is subsequently relaxed to a set of affine conditions. The second algorithm is inspired by tube-NMPC and uses an affine Control Barrier Function formulation in conjunction with an auxiliary controller to guarantee safety of the system. We combine an approximate NMPC controller with the second algorithm to guarantee strict safety despite approximated constraints and show its effectiveness experimentally on a mini-Segway.
@article{SchilligerEtAl2021, author = {Schilliger, J. and Lew, T. and Richards, S.~M. and Hanggi, S. and Pavone, M. and Onder, C.}, title = {Control Barrier Functions for Cyber-Physical Systems and Applications to NMPC}, journal = {{IEEE Robotics and Automation Letters}}, volume = {6}, number = {4}, pages = {8623--8630}, year = {2021}, month = aug, url = {https://arxiv.org/abs/2104.14250}, owner = {lew}, timestamp = {2021-08-23} }
Abstract: Autonomous systems have played an important role in response to the Covid-19 pandemic. Notably, there have been multiple attempts to leverage Unmanned Aerial Vehicles (UAVs) to disinfect surfaces. Although recent research suggests that surface transmission has a minimal impact in the spread of Covid-19, surfaces do play a significant role in the transmission of many other viruses. Employing UAVs for mass spray disinfection offers several potential advantages including high throughput application of disinfectant, large scale deployment, and the minimization of health risks to sanitation workers. Despite these potential benefits and preliminary usage of UAVs for disinfection, there has been little research into their design and effectiveness. In this work we present an autonomous UAV capable of effectively disinfecting surfaces. We identify relevant parameters such as disinfectant concentration, amount, and application distance required of the UAV to sterilize high touch surfaces such as door handles. Finally, we develop a robotic system that enables the fully autonomous disinfection of door handles in an unstructured, previously unknown environment. To our knowledge, this is the smallest untethered UAV ever built with both full autonomy and spraying capabilities, allowing it to operate in confined indoor settings, and the first autonomous UAV to specifically target high touch surfaces on an individual basis with spray disinfectant, resulting in more efficient use of disinfectant.
@inproceedings{RoelofsLandryEtAl, author = {Roelofs, S. and Landry, B. and Jalil, M. K. and Martin, A. and Koppaka, S. and Tang, S. K. Y. and Pavone, M.}, title = {Vision-based Autonomous Disinfection of High Touch Surfaces in Indoor Environments}, booktitle = {{Int. Conf. on Control, Automation and Systems}}, year = {2021}, month = aug, url = {https://arxiv.org/pdf/2108.11456.pdf}, owner = {blandry}, timestamp = {2021-08-21} }
Abstract: In order to safely deploy Deep Neural Networks (DNNs) within the perception pipelines of real-time decision making systems, there is a need for safeguards that can detect out-of-training-distribution (OoD) inputs both efficiently and accurately. Building on recent work leveraging the local curvature of DNNs to reason about epistemic uncertainty, we propose Sketching Curvature of OoD Detection (SCOD), an architecture-agnostic framework for equipping any trained DNN with a task-relevant epistemic uncertainty estimate. Offline, given a trained model and its training data, SCOD employs tools from matrix sketching to tractably compute a low-rank approximation of the Fisher information matrix, which characterizes which directions in the weight space are most influential on the predictions over the training data. Online, we estimate uncertainty by measuring how much perturbations orthogonal to these directions can alter predictions at a new test input. We apply SCOD to pre-trained networks of varying architectures on several tasks, ranging from regression to classification. We demonstrate that SCOD achieves comparable or better OoD detection performance with lower computational burden relative to existing baselines.
@inproceedings{SharmaAzizanEtAl2021, author = {Sharma, A. and Azizan, N. and Pavone, M.}, title = {Sketching Curvature for Efficient Out-of-Distribution Detection for Deep Neural Networks}, booktitle = {{Proc. Conf. on Uncertainty in Artificial Intelligence}}, year = {2021}, month = jul, url = {https://arxiv.org/abs/2102.12567}, owner = {apoorva}, timestamp = {2021-05-24} }
Abstract: Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With a nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.
@inproceedings{RichardsAzizanEtAl2021, author = {Richards, S. M. and Azizan, N. and Slotine, J.-J. and Pavone, M.}, title = {Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems}, year = {2021}, booktitle = {{Robotics: Science and Systems}}, note = {}, owner = {spenrich}, timestamp = {2023-01-30}, url = {https://arxiv.org/abs/2103.04490}, address = {Virtual}, month = jul, keywords = {} }
Abstract: Today, even the most compute-and-power constrained robots can measure complex, high data-rate video and LIDAR sensory streams. Often, such robots, ranging from low-power drones to space and subterranean rovers, need to transmit high-bitrate sensory data to a remote compute server if they are uncertain or cannot scalably run complex perception or mapping tasks locally. However, today’s representations for sensory data are mostly designed for human, not robotic, perception and thus often waste precious compute or wireless network resources to transmit unimportant parts of a scene that are unnecessary for a high-level robotic task. This paper presents an algorithm to learn task-relevant representations of sensory data that are co-designed with a pre-trained robotic perception model’s ultimate objective. Our algorithm aggressively compresses robotic sensory data by up to 11x more than competing methods. Further, it achieves high accuracy and robust generalization on diverse tasks including Mars terrain classification with low-power deep learning accelerators, neural motion planning, and environmental timeseries classification.
@inproceedings{NakanoyaChinchaliEtAl2021, author = {Nakanoya, M. and Chinchali, S. and Anemogiannis, A. and Datta, A. and Katti, S. and Pavone, M.}, title = {Task-relevant Representation Learning for Networked Robotic Perception}, booktitle = {{Robotics: Science and Systems}}, year = {2021}, address = {Online}, month = jul, url = {https://arxiv.org/abs/2011.03216}, owner = {csandeep}, timestamp = {2021-05-19} }
Abstract: Deep learning has had a far reaching impact in robotics. Specifically, deep reinforcement learning algorithms have been highly effective in synthesizing neural-network controllers for a wide range of tasks. However, despite this empirical success, these controllers still lack theoretical guarantees on their performance, such as Lyapunov stability (i.e., all trajectories of the closed-loop system are guaranteed to converge to a goal state under the control policy). This is in stark contrast to traditional model based controller design, where principled approaches (like LQR) can synthesize stable controllers with provable guarantees. To address this gap, we propose a generic method to synthesize a Lyapunov-stable neural-network controller, together with a neural-network Lyapunov function to simultaneously certify its stability. Our approach formulates the Lyapunov condition verification as a mixed-integer linear program (MIP). Our MIP verifier either certifies the Lyapunov condition, or generates counter-examples that can help improve the candidate controller and the Lyapunov function. We also present an optimization program to compute an inner approximation of the region-of-attraction for the closed-loop system. We apply our approach to robots including an inverted pendulum, a 2D and a 3D quadrotor, and showcase that our neural-network controller outperforms a baseline LQR controller.
@inproceedings{DaiLandryEtAl2021, author = {Dai, H. and Landry, B. and Yang, L. and Pavone, M. and Tedrake, R.}, title = {Lyapunov-Stable Neural-Network Control}, booktitle = {{Robotics: Science and Systems}}, year = {2021}, address = {Virtual}, month = jul, url = {https://arxiv.org/pdf/2109.14152.pdf}, owner = {blandry}, timestamp = {2021-11-07} }
Abstract: Finite element methods have been successfully used to develop physics-based models of soft robots that capture the nonlinear dynamic behavior induced by continuous deformation. These high-fidelity models are therefore ideal for designing controllers for complex dynamic tasks such as trajectory optimization and trajectory tracking. However, finite element models are also typically very high-dimensional, which makes real-time control challenging. In this work we propose an approach for finite element model-based control of soft robots that leverages model order reduction techniques to significantly increase computational efficiency. In particular, a constrained optimal control problem is formulated based on a nonlinear reduced order finite element model and is solved via sequential convex programming. This approach is demonstrated through simulation of a cable-driven soft robot for a constrained trajectory tracking task, where a 9768-dimensional finite element model is used for controller design.
@inproceedings{TonkensLorenzettiEtAl2021, author = {Tonkens, S. and Lorenzetti, J. and Pavone, M.}, title = {Soft Robot Optimal Control Via Reduced Order Finite Element Models}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2021}, address = {Xi'an, China}, month = may, url = {https://arxiv.org/abs/2011.02092}, owner = {stonkens}, timestamp = {2021-06-10} }
Abstract: Multi-robot systems are uniquely well-suited to performing complex tasks such as patrolling and tracking, information gathering, and pick-up and delivery problems, offering significantly higher performance than single-robot systems. A fundamental building block in most multi-robot systems is task allocation: assigning robots to tasks (e.g., patrolling an area, or servicing a transportation request) as they appear based on the robots’ states to maximize reward. In many practical situations, the allocation must account for heterogeneous capabilities (e.g., availability of appropriate sensors or actuators) to ensure the feasibility of execution, and to promote a higher reward, over a long time horizon. To this end, we present the FlowDec algorithm for efficient heterogeneous task-allocation achieving an approximation factor of at least 1/2 of the optimal reward. Our approach decomposes the heterogeneous problem into several homogeneous subproblems that can be solved efficiently using min-cost flow. Through simulation experiments, we show that our algorithm is faster by several orders of magnitude than a MILP approach.
@inproceedings{SoloveyBandyopadhyayEtAl2021, author = {Solovey, K. and Bandyopadhyay, S. and Rossi, F. and Wolf, M. T. and Pavone, M.}, title = {Fast Near-Optimal Heterogeneous Task Allocation via Flow Decomposition}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2021}, address = {Xi'an, China}, month = may, url = {https://arxiv.org/abs/2011.03603}, owner = {kirilsol}, timestamp = {2020-12-07} }
Abstract: To achieve seamless human-robot interactions, robots need to intimately reason about complex interaction dynamics and future human behaviors within their motion planning process. However, there is a disconnect between state-of-the-art neural network-based human behavior models and robot motion planners—either the behavior models are limited in their consideration of downstream planning or a simplified behavior model is used to ensure tractability of the planning problem. In this work, we present a framework that fuses together the interpretability and flexibility of trajectory optimization (TO) with the predictive power of state-of-the-art human trajectory prediction models. In particular, we leverage gradient information from data-driven prediction models to explicitly reason about human-robot interaction dynamics within a gradient-based TO problem. We demonstrate the efficacy of our approach in a multi-agent scenario whereby a robot is required to safely and efficiently navigate through a crowd of up to ten pedestrians. We compare against a variety of planning methods, and show that by explicitly accounting for interaction dynamics within the planner, our method offers safer and more efficient behaviors, even yielding proactive and nuanced behaviors such as waiting for a pedestrian to pass before moving.
@inproceedings{SchaeferLeungEtAl2021, author = {Schaefer, S. and Leung, K. and Ivanovic, B. and Pavone, M.}, title = {Leveraging Neural Network Gradients within Trajectory Optimization for Proactive Human-Robot Interactions}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2021}, address = {Xi'an, China}, month = may, url = {https://arxiv.org/abs/2012.01027}, owner = {borisi}, timestamp = {2020-11-01} }
Abstract: Human behavior prediction models enable robots to anticipate how humans may react to their actions, and hence are instrumental to devising safe and proactive robot planning algorithms. However, modeling complex interaction dynamics and capturing the possibility of many possible outcomes in such interactive settings is very challenging, which has recently prompted the study of several different approaches. In this work, we provide a self-contained tutorial on a conditional variational autoencoder (CVAE) approach to human behavior prediction which, at its core, can produce a multimodal probability distribution over future human trajectories conditioned on past interactions and candidate robot future actions. Specifically, the goals of this tutorial paper are to review and build a taxonomy of state-of-the-art methods in human behavior prediction, from physics-based to purely data-driven methods, provide a rigorous yet easily accessible description of a data-driven, CVAE-based approach, highlight important design characteristics that make this an attractive model to use in the context of model-based planning for human-robot interactions, and provide important design considerations when using this class of models.
@article{IvanovicLeungEtAl2020, author = {Ivanovic, B. and Leung, K. and Schmerling, E. and Pavone, M.}, title = {Multimodal Deep Generative Models for Trajectory Prediction: A Conditional Variational Autoencoder Approach}, journal = {{IEEE Robotics and Automation Letters}}, volume = {6}, number = {2}, pages = {295--302}, year = {2021}, month = apr, url = {https://arxiv.org/abs/2008.03880}, owner = {borisi}, timestamp = {2020-12-23} }
Abstract: We consider the problem of controlling a large fleet of drones to deliver packages simultaneously across broad urban areas. To conserve their limited flight range, drones can seamlessly hop between and ride on top of public transit vehicles (e.g., buses and trams). We design a novel comprehensive algorithmic framework that strives to minimize the maximum time to complete any delivery. We address the multifaceted complexity of the problem through a two-layer approach. First, the upper layer assigns drones to package delivery sequences with a provably near-optimal polynomial-time task allocation algorithm. Then, the lower layer executes the allocation by periodically routing the fleet over the transit network while employing efficient bounded-suboptimal multi-agent pathfinding techniques tailored to our setting. We present extensive experiments supporting the efficiency of our approach on settings with up to 200 drones, 5000 packages, and large transit networks of up to 8000 stops in San Francisco and the Washington DC area. Our results show that the framework can compute solutions within a few seconds (up to 2 minutes for the largest settings) on commodity hardware, and that drones travel up to 450% of their flight range by using public transit.
@article{ChoudhurySoloveyETAL2020j, author = {Choudhury, S. and Solovey, K. and Kochenderfer, M. and Pavone, M.}, title = {Efficient Large-Scale Multi-Drone Delivery Using Transit Networks}, journal = {{Journal of Artificial Intelligence Research}}, volume = {70}, pages = {757--788}, year = {2021}, month = mar, url = {https://doi.org/10.1613/jair.1.12450}, owner = {kirilsol}, timestamp = {2021-03-23} }
Abstract: We propose a novel framework for learning stabilizable nonlinear dynamical systems for continuous control tasks in robotics. The key contribution is a control-theoretic regularizer for dynamics fitting rooted in the notion of stabilizability, a constraint which guarantees the existence of robust tracking controllers for arbitrary open-loop trajectories generated with the learned system. Leveraging tools from contraction theory and statistical learning in reproducing kernel Hilbert spaces, we formulate stabilizable dynamics learning as a functional optimization with a convex objective and bi-convex functional constraints. Under a mild structural assumption and relaxation of the functional constraints to sampling-based constraints, we derive the optimal solution with a modified representer theorem. Finally, we utilize random matrix feature approximations to reduce the dimensionality of the search parameters and formulate an iterative convex optimization algorithm that jointly fits the dynamics functions and searches for a certificate of stabilizability. We validate the proposed algorithm in simulation for a planar quadrotor, and on a quadrotor hardware testbed emulating planar dynamics. We verify, both in simulation and on hardware, significantly improved trajectory generation and tracking performance with the control-theoretic regularized model over models learned using traditional regression techniques, especially when learning from small supervised datasets. The results support the conjecture that the use of stabilizability constraints as a form of regularization can help prune the hypothesis space in a manner that is tailored to the downstream task of trajectory generation and feedback control. This produces models that are not only dramatically better conditioned, but also data efficient.
@article{SinghRichardsEtAl2020, author = {Singh, S. and Richards, S. M. and Sindhwani, V. and Slotine, J-J. E. and Pavone, M.}, title = {Learning Stabilizable Nonlinear Dynamics with Contraction-Based Regularization}, journal = {{Int. Journal of Robotics Research}}, volume = {40}, number = {10--11}, pages = {1123--1150}, year = {2021}, url = {/wp-content/papercite-data/pdf/Singh.Richards.ea.IJRR20.pdf}, owner = {ssingh19}, timestamp = {2020-03-25} }
Abstract: Cities worldwide struggle with overloaded transportation systems and their externalities, such as traffic congestion and emissions. The emerging autonomous transportation technology has a potential to alleviate these issues. At the same time, the decisions of profit-maximizing operators running large autonomous fleets could have a negative impact on other stakeholders, e.g., by disproportionately cannibalizing public transport, and therefore could make the transportation system even less efficient and sustainable. A careful analysis of these tradeoffs requires modeling the main modes of transportation, including public transport, within a unified framework. In this paper, we propose such a framework, which allows us to study the interplay among mobility service providers, public transport authorities, and customers. In particular, we analyze the effect of autonomous ride-hailing services on the demand for public transportation. Our framework combines a graph-theoretic network model for the transportation system with a game-theoretic market model in which mobility service providers are profit-maximizers, while customers select individually-optimal transportation options. We show how to reformulate the decision problem of each mobility service provider as a tractable second-order conic program. This allows us to compute equilibria via best response. Moreover, we show that the degenerate monopolistic case of a single mobility service provider can efficiently be solved as a quadratic program. We apply our framework to data for the city of Berlin, Germany, and present sensitivity analyses to study parameters that mobility service providers or municipalities can influence to steer the overall system. We show that depending on market conditions and policy restrictions, autonomous ride-hailing systems may complement or cannibalize a public transportation system, serving between 7% and 80% of all customers. We discuss the main factors behind differences in these outcomes as well as strategic design options available to policymakers. Among others, we show that the monopolistic and the competitive cases yield similar modal shares, but differ in the profit outcome of each mobility service provider.
@inproceedings{LanzettiSchifferEtAl2021, author = {Lanzetti, N. and Schiffer, M. and Ostrovsky, M. and Pavone, M.}, booktitle = {Proceedings of the TSL Second Triennial Conference}, title = {On the Interplay Between Self-Driving Cars and Public Transportation: A Game-theoretic Perspective}, year = {2021}, keywords = {pub}, owner = {borisi}, url = {https://arxiv.org/abs/2109.01627}, timestamp = {2020-12-11} }
Abstract: Autonomous mobility-on-demand (AMoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of robotic, self-driving vehicles. Given a graph representation of the transportation network - one where, for example, nodes represent areas of the city, and edges the connectivity between them - we argue that the AMoD control problem is naturally cast as a node-wise decision-making problem. In this paper, we propose a deep reinforcement learning framework to control the rebalancing of AMoD systems through graph neural networks. Crucially, we demonstrate that graph neural networks enable reinforcement learning agents to recover behavior policies that are significantly more transferable, generalizable, and scalable than policies learned through other approaches. Empirically, we show how the learned policies exhibit promising zero-shot transfer capabilities when faced with critical portability tasks such as inter-city generalization, service area expansion, and adaptation to potentially complex urban topologies.
@inproceedings{GammelliYangEtAl2021, author = {Gammelli, D. and Yang, K. and Harrison, J. and Rodrigues, F. and Pereira, F. C. and Pavone, M.}, title = {Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems}, year = {2021}, url = {https://arxiv.org/abs/2104.11434}, owner = {jh2}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, timestamp = {2021-03-23} }
Abstract: In future transportation systems, the charging behavior of electric Autonomous Mobility on Demand (AMoD) fleets, i.e., fleets of electric self-driving cars that service on-demand trip requests, will likely challenge power distribution networks (PDNs), causing overloads or voltage drops. In this paper, we show that these challenges can be significantly attenuated if the PDNs’ operational constraints and exogenous loads (e.g., from homes or businesses) are accounted for when operating an electric AMoD fleet. We focus on a system-level perspective, assuming full coordination between the AMoD and the PDN operators. From this single entity perspective, we assess potential coordination benefits. Specifically, we extend previous results on an optimization-based modeling approach for electric AMoD systems to jointly control an electric AMoD fleet and a series of PDNs, and analyze the benefit of coordination under load balancing constraints. For a case study of Orange County, CA, we show that the coordination between the electric AMoD fleet and the PDNs eliminates 99% of the overloads and 50% of the voltage drops that the electric AMoD fleet would cause in an uncoordinated setting. Our results show that coordinating electric AMoD and PDNs can help maintain the reliability of PDNs under added electric AMoD charging load, thus significantly mitigating or deferring the need for PDN capacity upgrades.
@article{EstandiaSchifferEtAl2019, author = {Estandia, A. and Schiffer, M. and Rossi, F. and Luke, J. and Kara, E. C. and Rajagopal, R. and Pavone, M.}, title = {On the Interaction between Autonomous Mobility on Demand Systems and Power Distribution Networks -- An Optimal Power Flow Approach}, journal = {{IEEE Transactions on Control of Network Systems}}, volume = {8}, number = {3}, pages = {1163--1176}, year = {2021}, doi = {10.1109/TCNS.2021.3059225}, url = {https://arxiv.org/abs/1905.00200}, owner = {jthluke}, timestamp = {2021-02-21} }
Abstract: As robotic systems move from highly structured environments to open worlds, incorporating uncertainty from dynamics learning or state estimation into the control pipeline is essential for robust performance. In this paper we present a nonlinear particle model predictive control (PMPC) approach to control under uncertainty, which directly incorporates any particle-based uncertainty representation, such as those common in robotics. Our approach builds on scenario methods for MPC, but in contrast to existing approaches, which either constrain all or only the first timestep to share actions across scenarios, we investigate the impact of a partial consensus horizon. Implementing this optimization for nonlinear dynamics by leveraging sequential convex optimization, our approach yields an efficient framework that can be tuned to the particular information gain dynamics of a system to mitigate both over-conservatism and over-optimism. We investigate our approach for two robotic systems across three problem settings: time-varying, partially observed dynamics; sensing uncertainty; and model-based reinforcement learning, and show that our approach improves performance over baselines in all settings.
@inproceedings{DyroHarrisonEtAl2021, author = {Dyro, R. and Harrison, J. and Sharma, A. and Pavone, M.}, title = {Particle MPC for Uncertain and Learning-Based Control}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2021}, keywords = {pub}, owner = {rdyro}, timestamp = {2022-02-05}, url = {https://arxiv.org/abs/2104.02213} }
Abstract:
@article{ChapmanBonalliEtAlTAC2021, author = {Chapman, M. P. and Bonalli, R. and Smith, K. M. and Yang, I. and Pavone, M. and Tomlin, C. J.}, title = {Risk-sensitive safety analysis using Conditional Value-at-Risk}, journal = {{IEEE Transactions on Automatic Control}}, volume = {67}, number = {12}, pages = {6521--6536}, year = {2021}, keywords = {pub}, owner = {bonalli}, timestamp = {2021-03-23}, url = {https://arxiv.org/abs/2101.12086} }
Abstract: Despite decades of work in fast reactive planning and control, challenges remain in developing reactive motion policies on non-Euclidean manifolds and enforcing constraints while avoiding undesirable potential function local minima. This work presents a principled method for designing and fusing desired robot task behaviors into a stable robot motion policy, leveraging the geometric structure of non-Euclidean manifolds, which are prevalent in robot configuration and task spaces. Our Pullback Bundle Dynamical Systems (PBDS) framework drives desired task behaviors and prioritizes tasks using separate position-dependent and position/velocity-dependent Riemannian metrics, respectively, thus simplifying individual task design and modular composition of tasks. For enforcing constraints, we provide a class of metric-based tasks, eliminating local minima by imposing non-conflicting potential functions only for goal region attraction. We also provide a geometric optimization problem for combining tasks inspired by Riemannian Motion Policies (RMPs) that reduces to a simple least-squares problem, and we show that our approach is geometrically well-defined. We demonstrate the PBDS framework on the sphere $S^2$ and at 300-500 Hz on a manipulator arm, and we provide task design guidance and an open-source Julia library implementation. Overall, this work presents a fast, easy-to-use framework for generating motion policies without unwanted potential function local minima on general manifolds.
@inproceedings{BylardBonalliEtAl2021, author = {Bylard, A. and Bonalli, R. and Pavone, M.}, title = {Composable Geometric Motion Policies using Multi-Task Pullback Bundle Dynamical Systems}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2021}, address = {Xi'an, China}, month = may, url = {https://arxiv.org/abs/2101.01297}, owner = {bylard}, timestamp = {2021-03-23} }
Abstract: A number of prototypical optimization problems in multi-agent systems (e.g., task allocation and network load-sharing) exhibit a highly local structure: that is, each agent’s decision variables are only directly coupled to a few other agents’ variables through the objective function or the constraints. In this paper, we develop a rigorous notion of "locality" that quantifies the degree to which agents can compute their portion of the global solution of such a distributed optimization problem based solely on information in their local neighborhood. We build upon the results of Rebeschini and Tatikonda (2019) to develop a more general theory of locality that fully captures the importance of problem data to individual solution components, as opposed to a theory that only captures response to perturbations. This analysis provides a theoretical basis for a rather simple algorithm in which agents individually solve a truncated sub-problem of the global problem, where the size of the sub-problem used depends on the locality of the problem and the desired accuracy. Numerical results show that the proposed theoretical bounds are remarkably tight for well-conditioned problems.
@article{BrownRossiEtAl2020, author = {Brown, R. A. and Rossi, F. and Solovey, K. and Tsao, M. and Wolf, M. T. and Pavone, M.}, title = {On Local Computation for Network-Structured Convex Optimization in Multi-Agent Systems}, journal = {{IEEE Transactions on Control of Network Systems}}, volume = {8}, number = {2}, pages = {542--554}, year = {2021}, url = {/wp-content/papercite-data/pdf/Brown.Rossi.ea.TCNS20.pdf}, owner = {rabrown1}, timestamp = {2021-08-31} }
Abstract: We study an online hypergraph matching problem with delays, motivated by ridesharing applications. In this model, users enter a marketplace sequentially, and are willing to wait up to d timesteps to be matched, after which they will leave the system in favor of an outside option. A platform can match groups of up to k users together, indicating that they will share a ride. Each group of users yields a match value depending on how compatible they are with one another. As an example, in ridesharing, k is the capacity of the service vehicles, and d is the amount of time a user is willing to wait for a driver to be matched to them. We present results for both the utility maximization and cost minimization variants of the problem. In the utility maximization setting, the optimal competitive ratio is 1/d whenever k is at least 3, and is achievable in polynomial-time for any fixed k. In the cost minimization variation, when k = 2, the optimal competitive ratio for deterministic algorithms is 3/2 and is achieved by a polynomial-time thresholding algorithm. When k>2, we show that a polynomial-time randomized batching algorithm is (2 - 1/d) log k-competitive, and it is NP-hard to achieve a competitive ratio better than log k - O(log log k).
@inproceedings{TsaoEtAl2020, author = {Pavone, M. and Saberi, A. and Schiffer, M. and Tsao, M.}, booktitle = {The Conference on Web and Internet Economics (WINE)}, title = {Online Hypergraph Matching with Delays}, year = {2020}, address = {Beijing, China}, month = dec, owner = {mwtsao}, url = {http://arxiv.org/abs/2009.12022} }
Abstract: Model predictive control is a powerful framework for enabling optimal control of constrained systems. However, for systems that are described by high-dimensional state spaces this framework can be too computationally demanding for real-time control. Reduced order model predictive control (ROMPC) frameworks address this issue by leveraging model reduction techniques to compress the state space model used in the online optimal control problem. While this can enable real-time control by decreasing the online computational requirements, these model reductions introduce approximation errors that must be accounted for to guarantee constraint satisfaction and closed-loop stability for the controlled high-dimensional system. In this work we propose an offline methodology for efficiently computing error bounds arising from model reduction, and show how they can be used to guarantee constraint satisfaction in a previously proposed ROMPC framework. This work considers linear, discrete, time-invariant systems that are compressed by Petrov-Galerkin projections, and considers output-feedback settings where the system is also subject to bounded disturbances.
@inproceedings{LorenzettiPavone2020b, author = {Lorenzetti, J. and Pavone, M.}, title = {Error Bounds for Reduced Order Model Predictive Control}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2020}, address = {Jeju Island, Republic of Korea}, month = dec, url = {https://arxiv.org/pdf/1911.12349.pdf}, owner = {jlorenze}, timestamp = {2020-11-30} }
Abstract: Public goods are often either over-consumed in the absence of regulatory mechanisms, or remain completely unused, as in the Covid-19 pandemic, where social distance constraints are enforced to limit the number of people who can share public spaces. In this work, we plug this gap through market based mechanisms designed to efficiently allocate capacity constrained public goods. To design these mechanisms, we leverage the theory of Fisher markets, wherein each agent in the economy is endowed with an artificial currency budget that they can spend to avail public goods. While Fisher markets provide a strong methodological backbone to model resource allocation problems, their applicability is limited to settings involving two types of constraints - budgets of individual buyers and capacities of goods. Thus, we introduce a modified Fisher market, where each individual may have additional physical constraints, characterize its solution properties and establish the existence of a market equilibrium. Furthermore, to account for additional constraints we introduce a social convex optimization problem where we perturb the budgets of agents such that the KKT conditions of the perturbed social problem establish equilibrium prices. Finally, to compute the budget perturbations we present a fixed point scheme and illustrate convergence guarantees through numerical experiments. Thus, our mechanism, both theoretically and computationally, overcomes a fundamental limitation of classical Fisher markets, which only consider capacity and budget constraints.
@inproceedings{JalotaEtAl2020, author = {Jalota, D. and Pavone, M. and Qi, Q. and Ye, Y.}, title = {Markets for Efficient Public Good Allocation with Social Distancing}, booktitle = {The Conference on Web and Internet Economics (WINE)}, year = {2020}, url = {https://arxiv.org/abs/2005.10765}, address = {Beijing, China}, month = dec, owner = {devanshjalota}, timestamp = {2020-10-10} }
Abstract: Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.
@inproceedings{ItkinaIvanovicEtAl2019, author = {Itkina, M. and Ivanovic, B. and Senanayake, R. and Kochenderfer, M. J. and Pavone, M.}, title = {Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders}, booktitle = {{Conf. on Neural Information Processing Systems}}, year = {2020}, address = {}, month = dec, owner = {borisi}, timestamp = {2020-09-27}, url = {https://arxiv.org/abs/2010.09164} }
Abstract: Meta-learning is a promising strategy for learning to efficiently learn within new tasks, using data gathered from a distribution of tasks. However, the meta-learning literature thus far has focused on the task segmented setting, where at train-time, offline data is assumed to be split according to the underlying task, and at test-time, the algorithms are optimized to learn in a single task. In this work, we enable the application of generic meta-learning algorithms to settings where this task segmentation is unavailable, such as continual online learning with a time-varying task. We present meta-learning via online changepoint analysis (MOCA), an approach which augments a meta-learning algorithm with a differentiable Bayesian changepoint detection scheme. The framework allows both training and testing directly on time series data without segmenting it into discrete tasks. We demonstrate the utility of this approach on a nonlinear meta-regression benchmark as well as two meta-image-classification benchmarks.
@inproceedings{HarrisonSharmaEtAl2020, author = {Harrison, J. and Sharma, A. and Finn, C. and Pavone, M.}, booktitle = {{Conf. on Neural Information Processing Systems}}, title = {Continuous Meta-Learning without Tasks}, year = {2020}, month = dec, url = {https://arxiv.org/abs/1912.08866}, owner = {apoorva}, timestamp = {2020-05-05} }
Abstract: We introduce an algorithm for synthesizing and verifying piecewise linear Lyapunov functions to prove global exponential stability of piecewise linear dynamical systems. The Lyapunov functions we synthesize are parameterized by feedforward neural networks with leaky ReLU activation units. To train these neural networks, we design a loss function that measures the maximal violation of the Lyapunov conditions in the state space. We show that this maximal violation can be computed by solving a mixed-integer linear program (MILP). Compared to previous learning-based approaches, our learning approach is able to certify with high precision that the learned neural network satisfies the Lyapunov conditions not only for sampled states, but over the entire state space. Moreover, compared to previous optimization-based approaches that require a pre-specified partition of the state space when synthesizing piecewise Lyapunov functions, our method can automatically search for both the partition and the Lyapunov function simultaneously. We demonstrate our algorithm on both continuous and discrete-time systems, including some for which known strategies for partitioning of the Lyapunov function would require introducing higher order Lyapunov functions.
@inproceedings{DaiLandryEtAl2020, author = {Dai, H. and Landry, B. and Pavone, M. and Tedrake, R.}, title = {Counter-Example Guided Synthesis of Neural Network Lyapunov Functions for Piecewise Linear Systems}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2020}, address = {Jeju Island, Republic of Korea}, month = dec, url = {http://groups.csail.mit.edu/robotics-center/public_papers/Dai20.pdf}, owner = {blandry}, timestamp = {2021-03-15} }
Abstract: Reasoning about human motion is a core component of modern human-robot interactive systems. In particular, one of the main uses of behavior prediction in autonomous systems is to inform ego-robot motion planning and control. However, a majority of planning and control algorithms reason about system dynamics rather than the predicted agent tracklets that are commonly output by trajectory forecasting methods, which can hinder their integration. Towards this end, we propose Mixtures of Affine Time-varying Systems (MATS) as an output representation for trajectory forecasting that is more amenable to downstream planning and control use. Our approach leverages successful ideas from probabilistic trajectory forecasting works to learn dynamical system representations that are well-studied in the planning and control literature. We integrate our predictions with a proposed multimodal planning methodology and demonstrate significant computational efficiency improvements on a large-scale autonomous driving dataset.
@inproceedings{IvanovicElhafsiEtAl2020, author = {Ivanovic, B. and Elhafsi, A. and Rosman, G. and Gaidon, A. and Pavone, M.}, title = {{MATS}: An Interpretable Trajectory Forecasting Representation for Planning and Control}, booktitle = {{Conf. on Robot Learning}}, year = {2020}, month = nov, owner = {borisi}, timestamp = {2020-10-14}, url = {https://arxiv.org/abs/2009.07517} }
Abstract: This paper presents a novel online framework for safe crowd-robot interaction based on risk-sensitive stochastic optimal control, wherein the risk is modeled by the entropic risk measure. The control algorithm relies on mode insertion gradient optimization for this risk measure as well as Monte Carlo sampling from Trajectron++, a state-of-the-art generative model that produces multimodal probabilistic trajectory forecasts for multiple interacting agents. Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control, which is advantageous compared to end-to-end policy learning methods in that it allows the robot’s desired behavior to be specified at run time. In particular, we show that the robot exhibits diverse interaction behavior by varying the risk sensitivity parameter. A simulation study and a real-world experiment show that the proposed online framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
@inproceedings{NishimuraIvanovicEtAl2020, author = {Nishimura, H. and Ivanovic, B. and Gaidon, A. and Pavone, M. and Schwager, M.}, title = {Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2020}, address = {}, month = oct, owner = {borisi}, timestamp = {2020-07-03}, url = {https://arxiv.org/abs/2009.05702} }
Abstract: Assistive free-flying robots are a promising platform for supporting and working alongside astronauts in carrying out tasks that require interaction with the environment. However, current free-flying robot platforms are limited by existing manipulation technologies in being able to grasp and manipulate surrounding objects. Instead, gecko-inspired adhesives offer many advantages for an alternate grasping and manipulation paradigm for use in assistive free-flyer applications. In this work, we present the design of a gecko-inspired adhesive gripper for performing perching and grasping maneuvers for the Astrobee robot, a free-flying robot currently operating on-board the International Space Station. We present software and hardware integration details for the gripper units that were launched to the International Space Station in 2019 for in-flight experiments with Astrobee. Finally, we present preliminary results for on-ground experiments conducted with the gripper and Astrobee on a free-floating spacecraft test bed.
@inproceedings{CauligiChenEtAl2020, author = {Cauligi, A. and Chen, T. and Suresh, S. A. and Dille, M. and Ruiz, R. G. and Vargas, A. M. and Pavone, M. and Cutkosky, M. R.}, title = {Design and Development of a Gecko-Adhesive Gripper for the {Astrobee} Free-Flying Robot}, booktitle = {{Int. Symp. on Artificial Intelligence, Robotics and Automation in Space}}, year = {2020}, address = {Pasadena, California}, month = oct, url = {https://arxiv.org/pdf/2009.09151.pdf}, owner = {acauligi}, timestamp = {2020-09-18} }
Abstract: Rovers require knowledge of terrain to plan trajectories that maximize safety and efficiency. Terrain type classification relies on input from human operators or machine learning-based image classification algorithms. However, high level terrain classification is typically not sufficient to prevent incidents such as rovers becoming unexpectedly stuck in a sand trap; in these situations, online rover-terrain interaction data can be leveraged to accurately predict future dynamics and prevent further damage to the rover. This paper presents a meta-learning-based approach to adapt probabilistic predictions of rover dynamics by augmenting a nominal model affine in parameters with a Bayesian regression algorithm (P-ALPaCA). A regularization scheme is introduced to encourage orthogonality of nominal and learned features, leading to interpretable probabilistic estimates of terrain parameters in varying terrain conditions.
@inproceedings{BanerjeeHarrisonEtAl2020, author = {Banerjee, S. and Harrison, J. and Furlong, P. M. and Pavone, M.}, title = {Adaptive Meta-Learning for Identification of Rover-Terrain Dynamics}, booktitle = {{Int. Symp. on Artificial Intelligence, Robotics and Automation in Space}}, year = {2020}, address = {Pasadena, California}, month = oct, url = {https://arxiv.org/abs/2009.10191}, owner = {somrita}, timestamp = {2020-09-18} }
Abstract: The design of autonomous vehicles (AVs) and the design of AV-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AV-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.
@inproceedings{ZardiniEtAl2020, author = {Zardini, G. and Lanzetti, N. and Salazar, M. and Censi, A. and Frazzoli, E. and Pavone, M.}, title = {On the Co-Design of AV-Enabled Mobility Systems}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2020}, address = {Rhodes, Greece}, month = sep, url = {https://arxiv.org/abs/2003.04739}, owner = {gzardini}, timestamp = {2020-03-11} }
Abstract: This paper studies congestion-aware route-planning policies for Autonomous Mobility-on-Demand (AMoD) systems, whereby a fleet of autonomous vehicles provides on demand mobility under mixed traffic conditions. Specifically, we first devise a network flow model to optimize the AMoD routing and rebalancing strategies in a congestion-aware fashion by accounting for the endogenous impact of AMoD flows on travel time. Second, we capture reactive exogenous traffic consisting of private vehicles selfishly adapting to the AMoD flows in a user centric fashion by leveraging an iterative approach. Finally, we showcase the effectiveness of our framework with a case-study considering the transportation sub-network in New York City. Our results suggest that for high levels of demand, pure AMoD travel can be detrimental due to the additional traffic stemming from its rebalancing flows, whilst the combination of AMoD with walking or micro mobility options can significantly improve the overall system performance.
@inproceedings{Wollenstein-BetechHoushmandEtAl2020, author = {Wollenstein-Betech, S. and Houshmand, A. and Salazar, M. and Pavone, M. and Cassandras, C. G. and Paschalidis, I. C.}, title = {Congestion-aware Routing and Rebalancing of Autonomous Mobility-on-Demand Systems in Mixed Traffic}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2020}, address = {Rhodes, Greece}, month = sep, url = {/wp-content/papercite-data/pdf/Wollenstein-Betech.ea.ITSC20.pdf}, owner = {samauro}, timestamp = {2020-07-02} }
Abstract: Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.
@inproceedings{SalzmannIvanovicEtAl2020, author = {Salzmann, T. and Ivanovic, B. and Chakravarty, P. and Pavone, M.}, title = {Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data}, booktitle = {{European Conf. on Computer Vision}}, year = {2020}, address = {}, month = aug, owner = {borisi}, timestamp = {2020-09-14}, url = {https://arxiv.org/abs/2001.03093} }
Abstract: Reachability analysis is at the core of many applications, from neural network verification, to safe trajectory planning of uncertain systems. However, this problem is notoriously challenging, and current approaches tend to be either too restrictive, too slow, too conservative, or approximate and therefore lack guarantees. In this paper, we propose a simple yet effective sampling-based approach to perform reachability analysis for arbitrary dynamical systems. Our key novel idea consists of using random set theory to give a rigorous interpretation of our method, and prove that it returns sets which are guaranteed to converge to the convex hull of the true reachable sets. Additionally, we leverage recent work on robust deep learning and propose a new adversarial sampling approach to robustify our algorithm and accelerate its convergence. We demonstrate that our method is faster and less conservative than prior work, present results for approximate reachability analysis of neural networks and robust trajectory optimization of high-dimensional uncertain nonlinear systems, and discuss future applications.
@inproceedings{LewPavone2020, title = {Sampling-based Reachability Analysis: A Random Set Theory Approach with Adversarial Sampling}, author = {Lew, T. and Pavone, M.}, booktitle = {{Conf. on Robot Learning}}, year = {2020}, month = aug, url = {https://arxiv.org/abs/2008.10180}, owner = {lew}, timestamp = {2020-11-07} }
Abstract: Algorithms for motion planning in unknown environments are generally limited in their ability to reason about the structure of the unobserved environment. As such, current methods generally navigate unknown environments by relying on heuristic methods to choose intermediate objectives along frontiers. We present a unified method that combines map prediction and motion planning for safe, time-efficient autonomous navigation of unknown environments by dynamically-constrained robots. We propose a data-driven method for predicting the map of the unobserved environment, using the robot’s observations of its surroundings as context. These map predictions are then used to plan trajectories from the robot’s position to the goal without requiring frontier selection. We demonstrate that our map-predictive motion planning strategy yields a substantial improvement in trajectory time over a naive frontier pursuit method and demonstrates similar performance to methods using more sophisticated frontier selection heuristics with significantly shorter computation time.
@inproceedings{ElhafsiIvanovicEtAl2020, author = {Elhafsi, A. and Ivanovic, B. and Janson, L. and Pavone, M.}, title = {Map-Predictive Motion Planning in Unknown Environments}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2020}, address = {Paris, France}, month = jun, url = {https://arxiv.org/abs/1910.08184}, owner = {borisi}, timestamp = {2019-10-21} }
Abstract: This paper presents an algorithmic framework to optimize the operation of an Autonomous Mobility-on-Demand system whereby a centrally controlled fleet of electric self-driving vehicles provides on-demand mobility. In particular, we first present a mixed-integer linear program that captures the joint vehicle coordination and charge scheduling problem, accounting for the battery level of the single vehicles and the energy availability in the power grid. Second, we devise a heuristic algorithm to compute near-optimal solutions in polynomial time. Finally, we apply our algorithm to realistic case studies for Newport Beach, CA. Our results validate the near optimality of our method with respect to the global optimum, whilst suggesting that through vehicle-to-grid operation we can enable a 100% penetration of renewable energy sources and still provide a high-quality mobility service.
@inproceedings{BoewingSchifferEtAl2020, author = {Boewing, F. and Schiffer, M. and Salazar, M. and Pavone, M.}, title = {A Vehicle Coordination and Charge Scheduling Algorithm for Electric Autonomous Mobility-on-Demand Systems}, booktitle = {{American Control Conference}}, year = {2020}, address = {Denver, CO, United States}, month = jun, url = {/wp-content/papercite-data/pdf/Boewing.ea.ACC20.pdf}, owner = {samauro}, timestamp = {2020-03-19} }
Abstract: This paper presents models and optimization methods for the design of electric vehicle propulsion systems. Specifically, we first derive a bi-convex model of a battery electric powertrain including the transmission and explicitly accounting for the impact of its components’ size on the energy consumption of the vehicle. Second, we formulate the energy-optimal sizing and control problem for a given driving cycle and solve it as a sequence of second-order conic programs. Finally, we present a real-world case study for heavy-duty electric trucks, comparing a single-gear transmission with a continuously variable transmission (CVT), and validate our approach with respect to state-of-the-art particle swarm optimization algorithms. Our results show that, depending on the electric motor technology, CVTs can reduce the energy consumption and the battery size of electric trucks by up to 10%, and shrink the electric motor by up to 50%.
@inproceedings{VerbruggenSalazarEtAl2019, author = {Verbruggen, F. J. R. and Salazar, M. and Pavone, M. and Hofman, T.}, title = {Joint Design and Control of Electric Vehicle Propulsion Systems}, booktitle = {{European Control Conference}}, year = {2020}, address = {St. Petersburg, Russia}, month = may, url = {/wp-content/papercite-data/pdf/Verbruggen.Salazar.ea.ECC2020.pdf}, owner = {samauro}, timestamp = {2020-02-27} }
Abstract: We study fundamental theoretical aspects of probabilistic roadmaps (PRM) in the finite time (non-asymptotic) regime. In particular, we investigate how completeness and optimality guarantees of the approach are influenced by the underlying deterministic sampling distribution $\mathcal{X}$ and connection radius r>0. We develop the notion of (δ,ε)-completeness of the parameters $\mathcal{X}$, r, which indicates that for every motion-planning problem of clearance at least δ>0, PRM using $\mathcal{X}$, r returns a solution no longer than (1+ε) times the shortest δ-clear path. Leveraging the concept of ε-nets, we characterize in terms of lower and upper bounds the number of samples needed to guarantee (δ,ε)-completeness. This is in contrast with previous work which mostly considered the asymptotic regime in which the number of samples tends to infinity. In practice, we propose a sampling distribution inspired by ε-nets that achieves nearly the same coverage as grids while using significantly fewer samples.
@inproceedings{TsaoSoloveyETAL2020, author = {Tsao, M. and Solovey, K. and Pavone, M.}, title = {Sample Complexity of Probabilistic Roadmaps via Epsilon-nets}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2020}, address = {Paris, France}, month = may, url = {https://ieeexplore.ieee.org/document/9196917}, timestamp = {2020-02-25} }
Abstract: RRT* is one of the most widely used sampling-based algorithms for asymptotically-optimal motion planning. This algorithm laid the foundations for optimality in motion planning as a whole, and inspired the development of numerous new algorithms in the field, many of which build upon RRT* itself. In this paper, we first identify a logical gap in the optimality proof of RRT*, which was developed in Karaman and Frazzoli (2011). Then, we present an alternative and mathematically-rigorous proof for asymptotic optimality. Our proof suggests that the connection radius used by RRT* should be increased from $\gamma \left(\frac{\log n}{n}\right)^{1/d}$ to $\gamma' \left(\frac{\log n}{n}\right)^{1/(d+1)}$ in order to account for the additional dimension of time that dictates the samples’ ordering. Here $\gamma$, $\gamma'$ are constants, and n, d are the number of samples and the dimension of the problem, respectively.
@inproceedings{SoloveyJansonETAL2020, author = {Solovey, K. and Janson, L. and Schmerling, E. and Frazzoli, E. and Pavone, M.}, title = {Revisiting the Asymptotic Optimality of RRT*}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2020}, address = {Paris, France}, month = may, url = {https://ieeexplore.ieee.org/document/9196553}, timestamp = {2020-02-25} }
Abstract: The control of constrained systems using model predictive control (MPC) becomes more challenging when full state information is not available and when the nominal system model and measurements are corrupted by noise. Since these conditions are often seen in practical scenarios, techniques such as robust output feedback MPC have been developed to address them. However, existing approaches to robust output feedback MPC are still challenged by increased complexity of the online optimization problem, increased computational requirements for controller synthesis, or both. In this work we present a simple and efficient methodology for synthesizing a tube-based robust output feedback MPC scheme for linear, discrete, time-invariant systems subject to bounded, additive disturbances. Specifically, we completely avoid the use of Minkowski addition during controller synthesis and the online optimization problem has the same complexity as in the nominal full state feedback MPC problem, enabling our approach to scale with system dimension more effectively than previously proposed schemes.
@inproceedings{LorenzettiPavone2020, author = {Lorenzetti, J. and Pavone, M.}, title = {A Simple and Efficient Tube-based Robust Output Feedback Model Predictive Control Scheme}, booktitle = {{European Control Conference}}, year = {2020}, address = {St. Petersburg, Russia}, month = may, url = {https://arxiv.org/pdf/1911.07360.pdf}, owner = {jlorenze}, timestamp = {2020-08-03} }
Abstract: Planning safe trajectories for nonlinear dynamical systems subject to model uncertainty and disturbances is challenging. In this work, we present a novel approach to tackle chance-constrained trajectory planning problems with nonconvex constraints, whereby obstacle avoidance chance constraints are reformulated using the signed distance function. We propose a novel sequential convex programming algorithm and prove that under a discrete time problem formulation, it is guaranteed to converge to a solution satisfying first-order optimality conditions. We demonstrate the approach on an uncertain 6 degrees of freedom spacecraft system and show that the solutions satisfy a given set of chance constraints.
@inproceedings{LewBonalliEtAl2020, author = {Lew, T. and Bonalli, R. and Pavone, M.}, title = {Chance-Constrained Sequential Convex Programming for Robust Trajectory Optimization}, booktitle = {{European Control Conference}}, year = {2020}, address = {St. Petersburg, Russia}, month = may, url = {/wp-content/papercite-data/pdf/Lew.Bonalli.Pavone.ECC20.pdf}, owner = {lew}, timestamp = {2020-03-16} }
Abstract: We consider the problem of controlling a large fleet of drones to deliver packages simultaneously across broad urban areas. To conserve their limited flight range, drones can seamlessly hop between and ride on top of public transit vehicles (e.g., buses and trams). We design a novel comprehensive algorithmic framework that strives to minimize the maximum time to complete any delivery. We address the multifaceted complexity of the problem through a two-layer approach. First, the upper layer assigns drones to package delivery sequences with a provably near-optimal polynomial-time task allocation algorithm. Then, the lower layer executes the allocation by periodically routing the fleet over the transit network while employing efficient bounded-suboptimal multi-agent pathfinding techniques tailored to our setting. We present extensive experiments supporting the efficiency of our approach on settings with up to 200 drones, 5000 packages, and large transit networks of up to 8000 stops in San Francisco and the Washington DC area. Our results show that the framework can compute solutions within a few seconds (up to 2 minutes for the largest settings) on commodity hardware, and that drones travel up to 450% of their flight range by using public transit.
@inproceedings{ChoudhurySoloveyETAL2020, author = {Choudhury, S. and Solovey, K. and Kochenderfer, M. and Pavone, M.}, title = {Efficient Large-Scale Multi-Drone Delivery Using Transit Networks}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2020}, address = {Paris, France}, month = may, url = {https://ieeexplore.ieee.org/document/9197313}, owner = {kirilsol}, timestamp = {2020-09-22} }
Abstract: A number of prototypical optimization problems in multi-agent systems (e.g., task allocation and network load-sharing) exhibit a highly local structure: that is, each agent’s decision variables are only directly coupled to a few other agents’ variables through the objective function or the constraints. Nevertheless, existing algorithms for distributed optimization generally do not exploit the locality structure of the problem, requiring all agents to compute or exchange the full set of decision variables. In this paper, we develop a rigorous notion of "locality" that relates the structural properties of a linearly-constrained convex optimization problem (in particular, the sparsity structure of the constraint matrix and the objective function) to the amount of information that agents should exchange to compute an arbitrarily high-quality approximation to the problem from a cold-start. We leverage the notion of locality to develop a locality-aware distributed optimization algorithm, and we show that, for problems where individual agents need to know only a small portion of the optimal solution, the algorithm requires very limited inter-agent communication. Numerical results show that the convergence rate of our algorithm is directly explained by the locality parameter proposed, and that the proposed theoretical bounds are remarkably tight.
@inproceedings{BrownRossiEtAl20, author = {Brown, R. A. and Rossi, F. and Solovey, K. and Wolf, M. T. and Pavone, M.}, title = {Exploiting Locality and Structure for Distributed Optimization in Multi-Agent Systems}, booktitle = {{European Control Conference}}, year = {2020}, address = {St. Petersburg, Russia}, month = may, url = {/wp-content/papercite-data/pdf/Brown.Rossi.ea.ECC20.pdf}, owner = {rabrown1}, timestamp = {2020-02-25} }
Abstract: Today’s robotic fleets are increasingly measuring high-volume video and LIDAR sensory streams, which can be mined for valuable training data, such as rare scenes of road construction sites, to steadily improve robotic perception models. However, re-training perception models on growing volumes of rich sensory data in central compute servers (or the "cloud") places an enormous time and cost burden on network transfer, cloud storage, human annotation, and cloud computing resources. Hence, we introduce HarvestNet, an intelligent sampling algorithm that resides on-board a robot and reduces system bottlenecks by only storing rare, useful events to steadily improve perception models re-trained in the cloud. HarvestNet significantly improves the accuracy of machine-learning models on our novel dataset of road construction sites, field testing of self-driving cars, and streaming face recognition, while reducing cloud storage, dataset annotation time, and cloud compute time by 65.7-81.3%. Further, it is 1.05-2.58x more accurate than baseline algorithms and scalably runs on embedded deep learning hardware.
@inproceedings{ChinchaliPergamentEtAl2020, author = {Chinchali, S. and Pergament, E. and Nakanoya, M. and Cidon, E. and Zhang, E. and Bharadia, D. and Pavone, M. and Katti, S.}, title = {Sampling Training Data for Distributed Learning between Robots and the Cloud}, booktitle = {{Int. Symp. on Experimental Robotics}}, year = {2020}, address = {Valletta, Malta}, month = mar, owner = {csandeep}, timestamp = {2020-11-09} }
Abstract: Mixed-integer convex programming (MICP) has seen significant algorithmic and hardware improvements with several orders of magnitude solve time speedups compared to 25 years ago. Despite these advances, MICP has been rarely applied to real-world robotic control because the solution times are still too slow for online applications. In this work, we extend the machine learning optimizer (MLOPT) framework to solve MICPs arising in robotics at very high speed. MLOPT encodes the combinatorial part of the optimal solution into a strategy. Using data collected from offline problem solutions, we train a multiclass classifier to predict the optimal strategy given problem-specific parameters such as states or obstacles. Compared to previous approaches, we use task-specific strategies and prune redundant ones to significantly reduce the number of classes the predictor has to select from, thereby greatly improving scalability. Given the predicted strategy, the control task becomes a small convex optimization problem that we can solve in milliseconds. Numerical experiments on a cart-pole system with walls, a free-flying space robot and task-oriented grasps show that our method provides not only 1 to 2 orders of magnitude speedups compared to state-of-the-art solvers but also performance close to the globally optimal MICP solution.
@inproceedings{CauligiCulbertsonEtAl2020, author = {Cauligi, A. and Culbertson, P. and Stellato, B. and Bertsimas, D. and Schwager, M. and Pavone, M.}, title = {Learning Mixed-Integer Convex Optimization Strategies for Robot Planning and Control}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2020}, address = {Jeju Island, Republic of Korea}, month = dec, url = {https://arxiv.org/pdf/2004.03736.pdf}, owner = {acauligi}, timestamp = {2020-04-05} }
Abstract: Sequential convex programming (SCP) has recently emerged as an effective tool to quickly compute locally optimal trajectories for robotic and aerospace systems alike, even when initialized with an unfeasible trajectory. In this paper, by focusing on the Guaranteed Sequential Trajectory Optimization (GuSTO) algorithm, we propose a methodology to accelerate SCP-based algorithms through warm-starting. Specifically, leveraging a dataset of expert trajectories from GuSTO, we devise a neural-network-based approach to predict a locally optimal state and control trajectory, which is used to warm-start the SCP algorithm. This approach allows one to retain all the theoretical guarantees of GuSTO while simultaneously taking advantage of the fast execution of the neural network and reducing the time and number of iterations required for GuSTO to converge. The result is a faster and theoretically guaranteed trajectory optimization algorithm.
@inproceedings{BanerjeeEtAl2020, author = {Banerjee, S. and Lew, T. and Bonalli, R. and Alfaadhel, A. and Alomar, I. A. and Shageer, H. M. and Pavone, M.}, title = {Learning-based Warm-Starting for Fast Sequential Convex Programming and Trajectory Optimization}, booktitle = {{IEEE Aerospace Conference}}, year = {2020}, address = {Big Sky, Montana}, month = mar, url = {https://ieeexplore.ieee.org/abstract/document/9172293/}, owner = {lew}, timestamp = {2020-01-09} }
Abstract: The design of Autonomous Vehicles (AVs) and the design of AVs-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AVs-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. Moreover, it only requires very general monotonicity assumptions and it naturally handles multiple objectives, delivering the rational solutions on the Pareto front and thus enabling policy makers to select a solution through “political” criteria. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess the costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.
@inproceedings{ZardiniLanzettiEtAl2020, author = {Zardini, G. and Lanzetti, N. and Salazar, M. and Censi, A. and Frazzoli, E. and Pavone, M.}, title = {Towards a Co-Design Framework for Future Mobility Systems}, booktitle = {{Annual Meeting of the Transportation Research Board}}, year = {2020}, address = {Washington D.C., United States}, month = jan, url = {https://arxiv.org/pdf/1910.07714.pdf}, owner = {samauro}, timestamp = {2019-10-22} }
Abstract: A model predictive control algorithm seeks to produce a control law that optimizes the future behavior of a deployed system over a finite time horizon, by leveraging a real-time computational model of this system. For applications involving fluid-structure interaction (FSI), this leveraging is challenging because it implies the design of an accurate and yet real-time computational model for the prediction of the time-dependent flow-induced forces and moments acting on the system. The projection-based reduction of CFD-based computational models for FSI provides one approach for addressing this issue. In this context, linear model reduction is adequate as the controller can be expected to maintain the system of interest within small perturbations around a pre-designed optimal trajectory. For the automated landing of an aircraft application considered in this paper, this requires the construction of a CFD-based projection-based reduced-order model that is linearized around a time-dependent trajectory rather than a mere steady-state equilibrium position. To this end, a computational approach for the projection-based model order reduction of linearized CFD-based computational models for FSI is presented, with the goal of application to downstream control tasks such as automated aircraft landing. The approach addresses the issues of linearization around a trajectory and construction of stable reduced-order models to achieve real-time computation, while maintaining model accuracy, and thereby enabling model predictive control. It is verified using a simple model predictive control algorithm.
@inproceedings{McClellanLorenzettiEtAl2020, author = {McClellan, A. and Lorenzetti, J. and Pavone, M. and Farhat, C.}, title = {Projection-based Model Order Reduction for Flight Dynamics and Model Predictive Control}, booktitle = {{AIAA Scitech Forum}}, year = {2020}, address = {Orlando, Florida}, month = jan, url = {https://arc.aiaa.org/doi/abs/10.2514/6.2020-1190}, owner = {jlorenze}, timestamp = {2019-12-02} }
Abstract: The control problem associated with autonomous aircraft carrier landings for UAVs is challenging due to requirements on safety, high-performance operation, and uncertain and highly dynamic environments. This work proposes a control scheme for such problems that enables safe operation of the UAV at the limits of its performance by utilizing a model predictive control (MPC) approach. While real-time computation requirements typically limit the fidelity of the models used in optimization-based control, in this work it is demonstrated that high-fidelity computational fluid dynamics (CFD) models can be used within an MPC framework via the construction of a projection-based reduced order model (ROM). An application of a CFD-based MPC scheme to the glideslope tracking problem is then developed to demonstrate the effectiveness of the proposed approach.
@inproceedings{LorenzettiMcClellanEtAl2020, author = {Lorenzetti, J. and McClellan, A. and Farhat, C. and Pavone, M.}, title = {{UAV} Aircraft Carrier Landing Using {CFD}-Based Model Predictive Control}, booktitle = {{AIAA Scitech Forum}}, year = {2020}, address = {Orlando, Florida}, month = jan, url = {/wp-content/papercite-data/pdf/Lorenzetti.McClellan.Farhat.Pavone.AIAA20.pdf}, owner = {jlorenze}, timestamp = {2019-12-02} }
Abstract:
@inproceedings{WillesHarrisonEtAl2021, author = {Willes, J. and Harrison, J. and Harakeh, A. and Finn, C. and Pavone, M. and Waslander, S.}, title = {Open-Set Incremental Learning via Bayesian Prototypical Embeddings}, year = {2020}, booktitle = {Conf. on Neural Information Processing Systems - Workshop on Meta-Learning}, keywords = {pub}, owner = {jh2}, timestamp = {2021-03-23} }
Abstract: Within a robot autonomy stack, the planner and controller are typically designed separately, and serve different purposes. As such, there is often a diffusion of responsibilities when it comes to ensuring safety for the robot. We propose that a planner and controller should share the same interpretation of safety but apply this knowledge in a different yet complementary way. To achieve this, we use Hamilton-Jacobi (HJ) reachability theory at the planning level to provide the robot planner with the foresight to avoid entering regions with possible inevitable collision. However, this alone does not guarantee safety. In conjunction with this HJ reachability-infused planner, we propose a minimally-interventional multi-agent safety-preserving controller also derived via HJ-reachability theory. The safety controller maintains safety for the robot without unduly impacting planner performance. We demonstrate the benefits of our proposed approach in a multi-agent highway scenario where a robot car is rewarded to navigate through traffic as fast as possible, and we show that our approach provides strong safety assurances yet achieves the highest performance compared to other safety controllers.
@inproceedings{WangLeungEtAl2020, author = {Wang, X. and Leung, K. and Pavone, M.}, title = {Infusing Reachability-Based Safety into Planning and Control for Multi-agent Interactions}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2020}, url = {https://arxiv.org/pdf/2008.00067.pdf}, address = {Las Vegas, United States}, owner = {karenl7}, timestamp = {2020-10-19} }
Abstract:
@article{SalazarLanzettiEtAl2019, author = {Salazar, M. and Lanzetti, N. and Rossi, F. and Schiffer, M. and Pavone, M.}, title = {Intermodal Autonomous Mobility-on-Demand}, journal = {{IEEE Transactions on Intelligent Transportation Systems}}, volume = {21}, number = {9}, pages = {3946--3960}, year = {2020}, url = {https://ieeexplore.ieee.org/document/8894439}, owner = {samauro}, timestamp = {2019-11-11} }
Abstract: We study the interaction between a fleet of electric self-driving vehicles servicing on-demand transportation requests (referred to as autonomous mobility-on-demand, or AMoD, systems) and the electric power network. We propose a joint model that captures the coupling between the two systems stemming from the vehicles’ charging requirements, capturing time-varying customer demand, battery depreciation, and power transmission constraints. First, we show that the model is amenable to efficient optimization. Then, we prove that the socially optimal solution to the joint problem is a general equilibrium if locational marginal pricing is used for electricity. Finally, we show that the equilibrium can be computed by selfish transportation and generator operators (aided by a nonprofit independent system operator) without sharing private information. We assess the performance of the approach and its robustness to stochastic fluctuations in demand through case studies and agent-based simulations. Collectively, these results provide a first-of-a-kind characterization of the interaction between AMoD systems and the power network, and shed additional light on the economic and societal value of AMoD.
@article{RossiIglesiasEtAl2018b, author = {Rossi, F. and Iglesias, R. and Alizadeh, M. and Pavone, M.}, title = {On the Interaction Between {Autonomous Mobility-on-Demand} Systems and the Power Network: Models and Coordination Algorithms}, journal = {{IEEE Transactions on Control of Network Systems}}, year = {2020}, volume = {7}, number = {1}, pages = {384--397}, url = {https://arxiv.org/abs/1709.04906}, doi = {10.1109/TCNS.2019.2923384}, owner = {frossi2}, timestamp = {2020-03-20} }
Abstract: In systems where collisions can be tolerated, permitting and optimizing collisions in vehicle trajectories can enable a richer set of possible behaviors, allowing both better performance and determination of safest courses of action in scenarios where collision is inevitable. This paper develops an approach for optimal trajectory planning on a three degree-of-freedom free-flying spacecraft having tolerance to collisions. First, we use experimental data to formulate a physically realistic collision model for the spacecraft. We show that this model is linear over the expected operational range, enabling a piecewise affine representation of the full hybrid-vehicle dynamics. Next, we incorporate this dynamics model along with vehicle constraints into a mixed integer program. Experimental comparisons of trajectories with and without collision-avoidance requirements demonstrate the capability of the collision-tolerant strategy to achieve significant performance improvements in realistic scenarios. A simulated case study illustrates the potential for this approach to find damage-mitigating paths in online implementations.
@article{MoteEgerstedtEtAl2020, author = {Mote, M. and Egerstedt, M. and Feron, E. and Bylard, A. and Pavone, M.}, title = {Collision-Inclusive Trajectory Optimization for Free-Flying Spacecraft}, journal = {{AIAA Journal of Guidance, Control, and Dynamics}}, volume = {43}, number = {7}, pages = {1247--1258}, year = {2020}, url = {/wp-content/papercite-data/pdf/Mote.ea.JGCD.2020.preprint.pdf}, owner = {bylard}, timestamp = {2021-10-14} }
Abstract: Action anticipation, intent prediction, and proactive behavior are all desirable characteristics for autonomous driving policies in interactive scenarios. Paramount, however, is ensuring safety on the road—a key challenge in doing so is accounting for uncertainty in human driver actions without unduly impacting planner performance. This paper introduces a minimally-interventional safety controller operating within an autonomous vehicle control stack with the role of ensuring collision-free interaction with an externally controlled (e.g., human-driven) counterpart while respecting static obstacles such as a road boundary wall. We leverage reachability analysis to construct a real-time (100Hz) controller that serves the dual role of (1) tracking an input trajectory from a higher-level planning algorithm using model predictive control, and (2) assuring safety through maintaining the availability of a collision-free escape maneuver as a persistent constraint regardless of whatever future actions the other car takes. A full-scale steer-by-wire platform is used to conduct traffic weaving experiments wherein two cars, initially side-by-side, must swap lanes in a limited amount of time and distance, emulating cars merging onto/off of a highway. We demonstrate that, with our control stack, the autonomous vehicle is able to avoid collision even when the other car defies the planner’s expectations and takes dangerous actions, either carelessly or with the intent to collide, and otherwise deviates minimally from the planned trajectory to the extent required to maintain safety.
@article{LeungSchmerlingEtAl2019, author = {Leung, K. and Schmerling, E. and Zhang, M. and Chen, M. and Talbot, J. and Gerdes, J. C. and Pavone, M.}, title = {On Infusing Reachability-Based Safety Assurance within Planning Frameworks for Human-Robot Vehicle Interactions}, journal = {{Int. Journal of Robotics Research}}, year = {2020}, volume = {39}, number = {10--11}, pages = {1326--1345}, url = {https://arxiv.org/abs/2012.03390}, timestamp = {2020-10-13} }
Abstract: This paper presents a technique, named stlcg, to compute the quantitative semantics of Signal Temporal Logic (STL) formulas using computation graphs. This provides a platform which enables the incorporation of logic-based specifications into robotics problems that benefit from gradient-based solutions. Specifically, STL is a powerful and expressive formal language that can specify spatial and temporal properties of signals generated by both continuous and hybrid systems. The quantitative semantics of STL provide a robustness metric, i.e., how much a signal satisfies or violates an STL specification. In this work we devise a systematic methodology for translating STL robustness formulas into computation graphs. With this representation, and by leveraging off-the-shelf auto-differentiation tools, we are able to back-propagate through STL robustness formulas and hence enable a natural and easy-to-use integration with many gradient-based approaches used in robotics. We demonstrate, through examples stemming from various robotics applications, that the technique is versatile, computationally efficient, and capable of injecting human-domain knowledge into the problem formulation.
@inproceedings{LeungArechigaEtAl2020, author = {Leung, K. and Ar\'{e}chiga, N. and Pavone, M.}, title = {Back-propagation through signal temporal logic specifications: Infusing logical structure into gradient-based methods}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2020}, address = {Oulu, Finland}, url = {https://arxiv.org/abs/2008.00097v2}, owner = {karenl7}, timestamp = {2020-04-09} }
Abstract: During the last decade, tensegrity systems have been the focus of numerous investigations exploring the possibility of adopting them for planetary landing and exploration applications. Early approaches mainly focused on locomotion aspects related to tensegrity systems, where mobility was achieved by actuating the cable members of the system. Later efforts focused on understanding energy storage mechanisms of tensegrity systems undergoing landing events. More precisely, it was shown that under highly dynamic events, buckling of individual members of a tensegrity structure does not necessarily imply structural failure, suggesting that efficient structural design of planetary landers could be achieved by allowing its compression members to buckle. In this work, we combine both aspects of previous research on tensegrity structures, showing a possible lattice-like structural configuration able to withstand impact events, store pre-impact kinetic energy, and utilize a part of that energy for the locomotion process. Our work shows the feasibility of this proposed approach via both experimental and computational means.
@inproceedings{GarangerEtAl2020, author = {Garanger, K. and Krajewski, M. and del Valle, I. and Raheja, U. and Rimoli, J. and Rath, M. and Pavone, M.}, title = {Soft Tensegrity Systems for Planetary Landing and Exploration}, booktitle = {{Proc. ASCE Earth and Space Conference}}, year = {2020}, owner = {rdyro}, timestamp = {2022-06-14}, url = {https://arxiv.org/abs/2003.10999} }
Abstract: We present an approach for interpreting parameterized policies based on a formally-specified abstract description of the importance of certain behaviors or observed outcomes of a policy. The standard way to deploy data-driven policies usually involves sampling from the set of outcomes produced by the policy. Our approach leverages parametric signal temporal logic (pSTL) formulas to construct an interpretable view on the modeling parameters via a sequence of variational inference problems; one to solve for the pSTL parameters and another to construct a new parameterization satisfying the specification. We perform clustering using a finite set of examples, either real or simulated, and combine computational graph learning and normalizing flows to form a relationship between these parameters and pSTL formulas either derived by hand or inferred from data. We illustrate the utility of our approach to model selection for validation of the safety properties of an autonomous driving system, using a learned generative model of the surrounding agents.
@inproceedings{DeCastroLeungEtAl2020, author = {DeCastro, J. and Leung, K. and Ar\'{e}chiga, N. and Pavone, M.}, title = {Interpretable Policies from Formally-Specified Temporal Properties}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2020}, url = {/wp-content/papercite-data/pdf/DeCastro.Leung.ea.ITSC20.pdf}, address = {Rhodes, Greece}, timestamp = {2020-10-19} }
Abstract: We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly outperforms the oblivious sampling method, which is the common choice in the recent literature. The adaptive subspace can be efficiently generated by a correlated random matrix ensemble whose statistics mimic the input data. We prove that the improvement in the relative error of the solution can be tightly characterized in terms of the spectrum of the data matrix, and provide probabilistic upper-bounds. We then illustrate the consequences of our theory with data matrices of different spectral decay. Extensive experimental results show that the proposed approach offers significant speed ups in machine learning problems including logistic regression, kernel classification with random convolution layers and shallow neural networks with rectified linear units. Our analysis is based on convex analysis and Fenchel duality, and establishes connections to sketching and randomized matrix decomposition.
@inproceedings{LacottePilanciEtAl2019, author = {Lacotte, J. and Pilanci, M. and Pavone, M.}, title = {High-Dimensional Optimization in Adaptive Random Subspaces}, booktitle = {{Conf. on Neural Information Processing Systems}}, year = {2019}, address = {Vancouver, Canada}, month = dec, owner = {lacotte}, timestamp = {2021-11-21}, url = {https://arxiv.org/abs/1906.11809} }
Abstract: We study risk-sensitive imitation learning where the agent’s goal is to perform at least as well as the expert in terms of a risk profile. We first formulate our risk-sensitive imitation learning setting. We consider the generative adversarial approach to imitation learning (GAIL) and derive an optimization problem for our formulation, which we call risk-sensitive GAIL (RS-GAIL). We then derive two different versions of our RS-GAIL optimization problem that aim at matching the risk profiles of the agent and the expert w.r.t. Jensen-Shannon (JS) divergence and Wasserstein distance, and develop risk-sensitive generative adversarial imitation learning algorithms based on these optimization problems. We evaluate the performance of our algorithms and compare them with GAIL and the risk-averse imitation learning (RAIL) algorithms in two MuJoCo and two OpenAI classical control tasks.
@inproceedings{LacotteGhavamzadehEtAl2019, author = {Lacotte, J. and Ghavamzadeh, M. and Chow, Y. and Pavone, M.}, title = {Risk-Sensitive Generative Adversarial Imitation Learning}, booktitle = {{Int. Conf. on Artificial Intelligence and Statistics}}, year = {2019}, address = {Okinawa, Japan}, month = apr, owner = {lacotte}, timestamp = {2021-11-21}, url = {https://arxiv.org/abs/1808.04468} }
Abstract: This paper presents a routing algorithm for intermodal Autonomous Mobility on Demand (AMoD) systems, whereby a fleet of self-driving cars provides on-demand mobility in coordination with public transit. Specifically, we present a time-variant flow-based optimization approach that captures the operation of an AMoD system in coordination with public transit. We then leverage this model to devise a model predictive control (MPC) algorithm to route customers and vehicles through the network with the objective of minimizing customers’ travel time. To validate our MPC scheme, we present a real-world case study for New York City. Our results show that servicing transportation demands jointly with public transit can significantly improve the service quality of AMoD systems. Additionally, we highlight the differences of our time-variant framework compared to existing mesoscopic, time-invariant models.
@inproceedings{ZgraggenTsaoEtAl2019, author = {Zgraggen, J. and Tsao, M. and Salazar, M. and Schiffer, M. and Pavone, M.}, title = {A Model Predictive Control Scheme for Intermodal Autonomous Mobility-on-Demand}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2019}, address = {Auckland, New Zealand}, month = nov, url = {/wp-content/papercite-data/pdf/Zgraggen.Salazar.ea.ITSC19.pdf}, owner = {samauro}, timestamp = {2020-02-12} }
Abstract: We study route-planning for Autonomous Mobility-on-Demand (AMoD) systems that accounts for the impact of road traffic on travel time. Specifically, we develop a congestion-aware routing scheme (CARS) that captures road-utilization-dependent travel times at a mesoscopic level via a piecewise affine approximation of the Bureau of Public Roads (BPR) model. This approximation largely retains the key features of the BPR model, while allowing the design of a real-time, convex quadratic optimization algorithm to determine congestion-aware routes for an AMoD fleet. Through a real-world case study of Manhattan, we compare CARS to existing routing approaches, namely a congestion-unaware and a threshold congestion model. Numerical results show that CARS significantly outperforms the other two approaches, with improvements in terms of travel time and global cost in the order of 20%.
@inproceedings{SalazarTsaoEtAl2019, author = {Salazar, M. and Tsao, M. and Aguiar, I. and Schiffer, M. and Pavone, M.}, title = {A Congestion-aware Routing Scheme for Autonomous Mobility-on-Demand Systems}, booktitle = {{European Control Conference}}, year = {2019}, address = {Naples, Italy}, month = jun, url = {/wp-content/papercite-data/pdf/Salazar.Tsao.ea.ECC19.pdf}, owner = {samauro}, timestamp = {2020-03-08} }
Abstract: This paper presents eco-routing strategies for plug-in hybrid electric vehicles, whereby we jointly compute the routing and energy management strategy and the objective is a combination of travel time and energy consumption. Specifically, we first use Pontryagin’s principle to compute the optimal Pareto front in terms of achievable fuel and battery consumption for different types of road links. Second, we leverage these Pareto fronts to formulate a network flow optimization problem to compute the optimal routing and energy management strategy, minimizing a combination of travel time and energy consumption. Finally, we present a real-world case-study for the Eastern Massachusetts highway subnetwork. The proposed approach allows one to compute the optimal solution for different objectives, ranging from minimum time to minimum energy, revealing that by sacrificing a small amount of travel time significant improvements in fuel consumption can be achieved.
@inproceedings{SalazarHoushmandEtAl2019, author = {Salazar, M. and Houshmand, A. and Cassandras, C. G. and Pavone, M.}, title = {Optimal Routing and Energy Management Strategies for Plug-in Hybrid Electric Vehicles}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2019}, address = {Auckland, New Zealand}, month = nov, url = {/wp-content/papercite-data/pdf/Salazar.Houshmand.ea.ITSC19.pdf}, owner = {samauro}, timestamp = {2020-02-12} }
Abstract: Many robotics applications, from object manipulation to locomotion, require planning methods that are capable of handling the dynamics of contact. Trajectory optimization has been shown to be a viable approach that can be made to support contact dynamics. However, the current state-of-the-art methods remain slow and are often difficult to get to converge. In this work, we leverage recent advances in bilevel optimization to design an algorithm capable of efficiently generating trajectories that involve making and breaking contact. We demonstrate our method’s efficiency by outperforming an alternative state-of-the-art method on a benchmark problem. We moreover demonstrate the method’s ability to design a simple periodic gait for a quadruped with 15 degrees of freedom and four contact points.
@inproceedings{LandryLorenzettiEtAl2019, author = {Landry, B. and Lorenzetti, J. and Manchester, Z. and Pavone, M.}, title = {Bilevel Optimization for Planning through Contact: A Semidirect Method}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2019}, address = {Hanoi, Vietnam}, month = oct, url = {https://arxiv.org/pdf/1906.04292.pdf}, owner = {blandry}, timestamp = {2020-04-13} }
Abstract: Developing safe human-robot interaction systems is a necessary step towards the widespread integration of autonomous agents in society. A key component of such systems is the ability to reason about the many potential futures (e.g. trajectories) of other agents in the scene. Towards this end, we present the Trajectron, a graph-structured model that predicts many potential future trajectories of multiple agents simultaneously in both highly dynamic and multimodal scenarios (i.e. where the number of agents in the scene is time-varying and there are many possible highly-distinct futures for each agent). It combines tools from recurrent sequence modeling and variational deep generative modeling to produce a distribution of future trajectories for each agent in a scene. We demonstrate the performance of our model on several datasets, obtaining state-of-the-art results on standard trajectory prediction metrics as well as introducing a new metric for comparing models that output distributions.
@inproceedings{IvanovicPavone2019, author = {Ivanovic, B. and Pavone, M.}, title = {The {Trajectron}: Probabilistic Multi-Agent Trajectory Modeling with Dynamic Spatiotemporal Graphs}, booktitle = {{IEEE Int. Conf. on Computer Vision}}, year = {2019}, address = {Seoul, South Korea}, month = oct, url = {https://arxiv.org/abs/1810.05993}, owner = {borisi}, timestamp = {2019-07-22} }
Abstract: Planning under model uncertainty is a fundamental problem across many applications of decision making and learning. In this paper, we propose the Robust Adaptive Monte Carlo Planning (RAMCP) algorithm, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness. RAMCP formulates the risk-sensitive planning problem as a two-player zero-sum game, in which an adversary perturbs the agent’s belief over the models. We introduce two versions of the RAMCP algorithm. The first, RAMCP-F, converges to an optimal risk-sensitive policy without having to rebuild the search tree as the underlying belief over models is perturbed. The second version, RAMCP-I, improves computational efficiency at the cost of losing theoretical guarantees, but is shown to yield empirical results comparable to RAMCP-F. RAMCP is demonstrated on an n-pull multi-armed bandit problem, as well as a patient treatment scenario.
@inproceedings{SharmaHarrisonEtAl2019, author = {Sharma, A. and Harrison, J. and Tsao, M. and Pavone, M.}, title = {Robust and Adaptive Planning under Model Uncertainty}, booktitle = {{Int. Conf. on Automated Planning and Scheduling}}, year = {2019}, address = {Berkeley, California}, month = jul, url = {https://arxiv.org/pdf/1901.02577.pdf}, owner = {apoorva}, timestamp = {2019-04-10} }
Abstract: We consider the problem of vehicle routing for Autonomous Mobility-on-Demand (AMoD) systems, wherein a fleet of self-driving vehicles provides on-demand mobility in a given environment. Specifically, the task is to compute routes for the vehicles (both customer-carrying and empty travelling) so that travel demand is fulfilled and operational cost is minimized. The routing process must account for congestion effects affecting travel times, as modeled via a volume-delay function (VDF). Route planning with VDF constraints is notoriously challenging, as such constraints compound the combinatorial complexity of the routing optimization process. Thus, current solutions for AMoD routing resort to relaxations of the congestion constraints, thereby trading optimality for computational efficiency. In this paper, we present the first computationally-efficient approach for AMoD routing where VDF constraints are explicitly accounted for. We demonstrate that our approach is faster by at least one order of magnitude with respect to the state of the art, while providing higher quality solutions. From a methodological standpoint, the key technical insight is to establish a mathematical reduction of the AMoD routing problem to the classical traffic assignment problem (a related vehicle-routing problem where empty traveling vehicles are not present). Such a reduction allows us to extend powerful algorithmic tools for traffic assignment, which combine the classic Frank-Wolfe algorithm with modern techniques for pathfinding, to the AMoD routing problem. We provide strong theoretical guarantees for our approach in terms of near-optimality of the returned solution.
@inproceedings{SoloveySalazarEtAl2019, author = {Solovey, K. and Salazar, M. and Pavone, M.}, title = {Scalable and Congestion-aware Routing for Autonomous Mobility-on-Demand via Frank-Wolfe Optimization}, booktitle = {{Robotics: Science and Systems}}, year = {2019}, address = {Freiburg im Breisgau, Germany}, month = jun, url = {/wp-content/papercite-data/pdf/Solovey.Salazar.Pavone.RSS19.pdf}, owner = {samauro}, timestamp = {2019-02-02} }
Abstract: Despite the success of model predictive control (MPC), its application to high-dimensional systems, such as flexible structures and coupled fluid/rigid-body systems, remains a largely open challenge due to excessive computational complexity. A promising solution approach is to leverage reduced order models for designing the model predictive controller. In this paper we present a reduced order MPC scheme that enables setpoint tracking while robustly guaranteeing constraint satisfaction for linear, discrete, time-invariant systems. Setpoint tracking is enabled by designing the MPC cost function to account for the steady-state error between the full and reduced order models. Robust constraint satisfaction is accomplished by solving (offline) a set of linear programs to provide bounds on the errors due to bounded disturbances, state estimation, and model approximation. The approach is validated on a synthetic system as well as a high-dimensional linear model of a flexible rod, obtained using finite element methods.
@inproceedings{LorenzettiLandryEtAl2019, author = {Lorenzetti, J. and Landry, B. and Singh, S. and Pavone, M.}, title = {Reduced Order Model Predictive Control For Setpoint Tracking}, booktitle = {{European Control Conference}}, year = {2019}, address = {Naples, Italy}, month = jun, url = {https://arxiv.org/pdf/1811.06590.pdf}, owner = {jlorenze}, timestamp = {2019-04-26} }
Abstract: Signal Temporal Logic (STL) is an expressive language used to describe logical and temporal properties of signals, both continuous and discrete. Inferring STL formulas from behavior traces can provide powerful insights into complex systems. These insights can help system designers better understand and improve the systems they develop (e.g., long-term behaviors of time series data), yet this is a very challenging and often intractable problem. This work presents a method for evaluating STL formulas using computation graphs, hence building a connection between STL and many modern machine learning frameworks that depend on computation graphs, such as deep learning. We show that this approach is particularly effective for solving parametric STL (pSTL) problems, the problem of parameter fitting for a given signal. We provide a relaxation technique that makes this method more tractable when solving general pSTL formulas. By using computation graphs, we can leverage the benefits and the computational prowess of modern day machine learning tools. Motivated by the problem of learning explanatory factors and safety assurance for complex cyber-physical systems, we demonstrate our proposed method on an autonomous driving case study.
@inproceedings{LeungArechigaEtAl2019, author = {Leung, K. and Ar\'{e}chiga, N. and Pavone, M.}, title = {Backpropagation for Parametric {STL}}, booktitle = {{IEEE Intelligent Vehicles Symposium: Workshop on Unsupervised Learning for Automated Driving}}, year = {2019}, address = {Paris, France}, month = jun, url = {/wp-content/papercite-data/pdf/Leung.Arechiga.ea.ULAD19.pdf}, owner = {karenl7}, timestamp = {2021-07-12} }
Abstract: Many problems in modern robotics can be addressed by modeling them as bilevel optimization problems. In this work, we leverage augmented Lagrangian methods and recent advances in automatic differentiation to develop a general-purpose nonlinear optimization solver that is well suited to bilevel optimization. We then demonstrate the validity and scalability of our algorithm with two representative robotic problems, namely robust control and parameter estimation for a system involving contact. We stress the general nature of the algorithm and its potential relevance to many other problems in robotics.
@inproceedings{LandryManchesterEtAl2019, author = {Landry, B. and Manchester, Z. and Pavone, M.}, title = {A Differentiable Augmented Lagrangian Method for Bilevel Nonlinear Optimization}, booktitle = {{Robotics: Science and Systems}}, year = {2019}, address = {Freiburg im Breisgau, Germany}, month = jun, url = {https://arxiv.org/pdf/1902.03319.pdf}, owner = {blandry}, timestamp = {2019-05-18} }
Abstract: Today’s robotic systems are increasingly turning to computationally expensive models such as deep neural networks (DNNs) for tasks like localization, perception, planning, and object detection. However, resource-constrained robots, like low-power drones, often have insufficient on-board compute resources or power reserves to scalably run the most accurate, state-of-the-art neural network compute models. Cloud robotics allows mobile robots the benefit of offloading compute to centralized servers if they are uncertain locally or want to run more accurate, compute-intensive models. However, cloud robotics comes with a key, often understated cost: communicating with the cloud over congested wireless networks may result in latency or loss of data. In fact, sending high data-rate video or LIDAR from multiple robots over congested networks can lead to prohibitive delay for real-time applications, which we measure experimentally. In this paper, we formulate a novel Robot Offloading Problem - how and when should robots offload sensing tasks, especially if they are uncertain, to improve accuracy while minimizing the cost of cloud communication? We formulate offloading as a sequential decision making problem for robots, and propose a solution using deep reinforcement learning. In both simulations and hardware experiments using state-of-the-art vision DNNs, our offloading strategy improves vision task performance by 1.3-2.6x over benchmark offloading strategies, allowing robots the potential to significantly transcend their on-board sensing accuracy but with limited cost of cloud communication.
@inproceedings{ChinchaliSharmaEtAl2019, author = {Chinchali, S. and Sharma, A. and Harrison, J. and Elhafsi, A. and Kang, D. and Pergament, E. and Cidon, E. and Katti, S. and Pavone, M.}, title = {Network Offloading Policies for Cloud Robotics: a Learning-based Approach}, booktitle = {{Robotics: Science and Systems}}, year = {2019}, address = {Freiburg im Breisgau, Germany}, month = jun, url = {https://arxiv.org/pdf/1902.05703.pdf}, owner = {apoorva}, timestamp = {2019-02-07} }
Abstract: Sequential Convex Programming (SCP) has recently gained popularity as a tool for trajectory optimization, due to its sound theoretical properties and practical performance. Yet, most SCP-based methods for trajectory optimization are restricted to Euclidean settings, which precludes their application to problem instances where one needs to reason about manifold-type constraints (that is, constraints, such as loop closure, which restrict the motion of a system to a subset of the ambient space). The aim of this paper is to fill this gap by extending SCP-based trajectory optimization methods to a manifold setting. The key insight is to leverage geometric embeddings to lift a manifold-constrained trajectory optimization problem into an equivalent problem defined over a space enjoying Euclidean structure. This insight allows one to extend existing SCP methods to a manifold setting in a fairly natural way. In particular, we present an SCP algorithm for manifold problems with theoretical guarantees that resemble those derived for the Euclidean setting, and demonstrate its practical performance via numerical experiments.
@inproceedings{BonalliBylardEtAl2019, author = {Bonalli, R. and Bylard, A. and Cauligi, A. and Lew, T. and Pavone, M.}, title = {Trajectory Optimization on Manifolds: {A} Theoretically-Guaranteed Embedded Sequential Convex Programming Approach}, booktitle = {{Robotics: Science and Systems}}, year = {2019}, address = {Freiburg im Breisgau, Germany}, month = jun, url = {https://arxiv.org/pdf/1905.07654.pdf}, owner = {bylard}, timestamp = {2019-05-01} }
Abstract: This paper presents a model predictive control (MPC) approach to optimize routes for Ride-sharing Autonomous Mobility-on-Demand (RAMoD) systems, whereby self-driving vehicles provide coordinated on-demand mobility, possibly allowing multiple customers to share a ride. Specifically, we first devise a time-expanded network flow model for RAMoD. Second, leveraging this model, we design a real-time MPC algorithm to optimize the routes of both empty and customer-carrying vehicles, with the goal of optimizing social welfare, namely, a weighted combination of customers’ travel time and vehicles’ mileage. Finally, we present a real-world case study for the city of San Francisco, CA, by using the microscopic traffic simulator MATSim. The simulation results show that a RAMoD system can significantly improve social welfare with respect to a single-occupancy AMoD system, and that the predictive structure of the proposed MPC controller allows it to outperform existing reactive ride-sharing coordination algorithms for RAMoD.
@inproceedings{TsaoMilojevicEtAl2019, author = {Tsao, M. and Milojevic, D. and Ruch, C. and Salazar, M. and Frazzoli, E. and Pavone, M.}, title = {Model Predictive Control of Ride-sharing Autonomous Mobility on Demand Systems}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2019}, address = {Montreal, Canada}, month = may, url = {/wp-content/papercite-data/pdf/Tsao.ea.ICRA19.pdf}, owner = {samauro}, timestamp = {2020-02-12} }
Abstract: Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance. BaRC is general, in that it can accelerate training of any model-free RL algorithm on a broad class of goal-directed continuous control MDPs. Its curriculum strategy is physically intuitive, easy-to-tune, and allows incorporating physical priors to accelerate training without hindering the performance, flexibility, and applicability of the model-free RL algorithm. We evaluate our approach on two representative dynamic robotic learning problems and find substantial performance improvement relative to previous curriculum generation techniques and naïve exploration strategies.
@inproceedings{IvanovicHarrisonEtAl2019, author = {Ivanovic, B. and Harrison, J. and Sharma, A. and Chen, M. and Pavone, M.}, title = {{BaRC:} Backward Reachability Curriculum for Robotic Reinforcement Learning}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2019}, address = {Montreal, Canada}, month = may, url = {https://arxiv.org/pdf/1806.06161.pdf}, owner = {borisi}, timestamp = {2018-09-05} }
Abstract: Sequential Convex Programming (SCP) has recently seen a surge of interest as a tool for trajectory optimization. Yet, most available methods lack rigorous performance guarantees and are often tailored to specific optimal control setups. In this paper, we present GuSTO (Guaranteed Sequential Trajectory Optimization), an algorithmic framework to solve trajectory optimization problems for control-affine systems with drift. GuSTO generalizes earlier SCP-based methods for trajectory optimization (by addressing, for example, goal region constraints and problems with either fixed or free final time), and enjoys theoretical convergence guarantees in terms of convergence to, at least, a stationary point. The theoretical analysis is further leveraged to devise an accelerated implementation of GuSTO, which originally infuses ideas from indirect optimal control into an SCP context. Numerical experiments on a variety of trajectory optimization setups show that GuSTO generally outperforms current state-of-the-art approaches in terms of success rates, solution quality, and computation times.
@inproceedings{BonalliCauligiEtAl2019, author = {Bonalli, R. and Cauligi, A. and Bylard, A. and Pavone, M.}, title = {{GuSTO:} Guaranteed Sequential Trajectory Optimization via Sequential Convex Programming}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2019}, address = {Montreal, Canada}, month = may, url = {https://arxiv.org/pdf/1903.00155.pdf}, owner = {bylard}, timestamp = {2018-10-04} }
Abstract: Quadcopters have been used as hovering encountered-type haptic devices in virtual reality. We suggest that quadcopters can facilitate rich haptic interactions beyond force feedback by appropriating physical objects and the environment. We present HoverHaptics, an autonomous safe-to-touch quadcopter and its integration with a virtual shopping experience. HoverHaptics highlights three affordances of quadcopters that enable these rich haptic interactions: (1) dynamic positioning of passive haptics, (2) texture mapping, and (3) animating passive props. We identify inherent challenges of hovering encountered-type haptic devices, such as their limited speed, inadequate control accuracy, and safety concerns. We then detail our approach for tackling these challenges, including the use of display techniques, visuo-haptic illusions, and collision avoidance. We conclude by describing a preliminary study (n = 9) to better understand the subjective user experience when interacting with a quadcopter in virtual reality using these techniques.
@inproceedings{AbtahiLandryEtAl2019, author = {Abtahi, P. and Landry, B. and Yang, J. J. and Pavone, M. and Follmer, S. and Landay, J. A.}, title = {Beyond The Force: Using Quadcopters to Appropriate Objects and the Environment for Haptics in Virtual Reality}, booktitle = {{ACM CHI Conf. on Human Factors in Computing Systems}}, year = {2019}, address = {Glasgow, UK}, month = may, url = {https://dl.acm.org/doi/10.1145/3290605.3300589}, owner = {blandry}, timestamp = {2020-04-13} }
Abstract: Satellite servicing is a rapidly developing industry which requires a number of advances in semi- and fully-automated space robotics to unlock many key servicing capabilities. One upcoming mission example is the NASA Restore-L Robotic Servicing spacecraft, which is equipped with two 7-joint robotic manipulators used to capture a satellite and perform a complex series of refueling tasks, including swapping between various end-effector tools stored on board. In this scenario, planning of the manipulator motions must account for a number of constraints, such as collision avoidance and the potential need for uninterrupted visual tracking of objects or of the end-effector. Such complex constraints in a cluttered environment, such as the interface between two spacecraft, are time-consuming to incorporate into hand-designed trajectories. Thus, in this work we present a software tool which uses robot motion planning and path refinement algorithms for automated, real-time computation of near-optimal, collision-free trajectories which satisfy the aforementioned perception constraints. The tool is built on the ROS MoveIt! framework, which can simulate and visualize trajectories as well as seamlessly switch between motion planning and refinement algorithms depending on task requirements. Additionally, we performed experimental campaigns to benchmark a number of available algorithms for performance in handling such perception constraints. Although the framework is applied to a mock-up of the Restore-L satellite servicer in this paper, the tool can be applied to any fixed-base manipulator planning scenario with a similar class of constraints.
@inproceedings{ZahroofBylardEtAl2019, author = {Zahroof, T. and Bylard, A. and Shageer, H. and Pavone, M.}, title = {Perception-Constrained Robot Manipulator Planning for Satellite Servicing}, booktitle = {{IEEE Aerospace Conference}}, year = {2019}, address = {Big Sky, Montana}, month = mar, url = {/wp-content/papercite-data/pdf/Zahroof.Bylard.Shageer.Pavone.AeroConf19.pdf}, owner = {bylard}, timestamp = {2019-01-14} }
Abstract: This paper presents Latent Sampling-based Motion Planning (L-SBMP), a methodology towards computing motion plans for complex robotic systems by learning a plannable latent representation. Recent works in control of robotic systems have effectively leveraged local, low-dimensional embeddings of high-dimensional dynamics. In this paper we combine these recent advances with techniques from sampling-based motion planning (SBMP) in order to design a methodology capable of planning for high-dimensional robotic systems beyond the reach of traditional approaches (e.g., humanoids, or even systems where planning occurs in the visual space). Specifically, the learned latent space is constructed through an autoencoding network, a dynamics network, and a collision checking network, which mirror the three main algorithmic primitives of SBMP, namely state sampling, local steering, and collision checking. Notably, these networks can be trained through only raw data of the system’s states and actions along with a supervising collision checker. Building upon these networks, an RRT-based algorithm is used to plan motions directly in the latent space - we refer to this exploration algorithm as Learned Latent RRT (L2RRT). This algorithm globally explores the latent space and is capable of generalizing to new environments. The overall methodology is demonstrated on two planning problems, namely a visual planning problem, whereby planning happens in the visual (pixel) space, and a humanoid robot planning problem.
@article{IchterPavone2019, author = {Ichter, B. and Pavone, M.}, title = {Robot Motion Planning in Learned Latent Spaces}, journal = {{IEEE Robotics and Automation Letters}}, volume = {4}, number = {3}, pages = {2407--2414}, year = {2019}, month = jan, url = {https://arxiv.org/abs/1807.10366}, owner = {ichter}, timestamp = {2019-02-01} }
Abstract: This paper presents a queueing-theoretical approach to the analysis, control, and evaluation of mobility-on-demand (MoD) systems for urban personal transportation. A MoD system consists of a fleet of vehicles providing one-way car sharing service and a team of drivers to rebalance such vehicles. The drivers then rebalance themselves by driving select customers similar to a taxi service. We model the MoD system as two coupled closed Jackson networks with passenger loss. We show that the system can be approximately balanced by solving two decoupled linear programs and exactly balanced through nonlinear optimization. The rebalancing techniques are applied to a system sizing example using taxi data in three neighborhoods of Manhattan. Lastly, we formulate a real-time closed-loop rebalancing policy for drivers and perform case studies of two hypothetical MoD systems in Manhattan and Hangzhou, China. We show that the taxi demand in Manhattan can be met with the same number of vehicles in a MoD system while requiring only 1/3 to 1/4 the number of drivers; in Hangzhou, where customer demand is highly unbalanced, higher driver-to-vehicle ratios are required to achieve good quality of service.
@article{ZhangPavone2018, author = {Zhang, R. and Rossi, F. and Pavone, M.}, title = {Analysis, Control, and Evaluation of Mobility-on-Demand Systems: a Queueing-Theoretical Approach}, journal = {{IEEE Transactions on Control of Network Systems}}, volume = {6}, number = {1}, pages = {115--126}, year = {2019}, doi = {10.1109/TCNS.2018.2800403}, url = {/wp-content/papercite-data/pdf/Zhang.Rossi.Pavone.TCNS18.pdf}, owner = {frossi2}, timestamp = {2017-12-30} }
Abstract:
@incollection{SchmerlingPavone2019, author = {Schmerling, E. and Pavone, M.}, title = {Kinodynamic Planning}, booktitle = {Encyclopedia of Robotics}, publisher = {{Springer}}, edition = {First}, year = {2019}, url = {/wp-content/papercite-data/pdf/Schmerling.Pavone.EOR19.pdf}, owner = {bylard}, timestamp = {2019-10-07} }
Abstract: In this paper we present a queuing network approach to the problem of routing and rebalancing a fleet of self-driving vehicles providing on-demand mobility within a capacitated road network. We refer to such systems as autonomous mobility-on-demand systems, or AMoD. We first cast an AMoD system into a closed, multi-class BCMP queuing network model capable of capturing the passenger arrival process, traffic, the state-of-charge of electric vehicles, and the availability of vehicles at the stations. Second, we propose a scalable method for the synthesis of routing and charging policies, with performance guarantees in the limit of large fleet sizes. Third, we validate the theoretical results on a case study of New York City. Collectively, this paper provides a unifying framework for the analysis and control of AMoD systems, which provides a large set of modeling options (e.g., the inclusion of road capacities and charging constraints), and subsumes earlier Jackson and network flow models.
@article{IglesiasRossiEtAl2017, author = {Iglesias, R. and Rossi, F. and Zhang, R. and Pavone, M.}, title = {A {BCMP} Network Approach to Modeling and Controlling Autonomous Mobility-on-Demand Systems}, journal = {{Int. Journal of Robotics Research}}, year = {2019}, volume = {38}, number = {2--3}, pages = {357--374}, url = {/wp-content/papercite-data/pdf/Iglesias.Rossi.Zhang.Pavone.IJRR18.pdf}, owner = {rdit}, timestamp = {2018-05-06} }
Abstract: The operation of today’s robots entails interactions with humans, e.g., in autonomous driving amidst human-driven vehicles. To effectively do so, robots must proactively decode the intent of humans and concurrently leverage this knowledge for safe, cooperative task satisfaction—a problem we refer to as proactive decision making. However, simultaneous intent decoding and robotic control requires reasoning over several possible human behavioral models, resulting in high-dimensional state trajectories. In this paper, we address the proactive decision making problem using a novel combination of formal methods, control, and data mining techniques. First, we distill high-dimensional state trajectories of human-robot interaction into concise, symbolic behavioral summaries that can be learned from data. Second, we leverage formal methods to model high-level agent goals, safe interaction, and information-seeking behavior with temporal logic formulae. Finally, we design a novel decision-making scheme that maintains a belief distribution over models of human behavior, and proactively plans informative actions. After showing several desirable theoretical properties, we apply our framework to a dataset of humans driving in crowded merging scenarios. For it, temporal logic models are generated and used to synthesize control strategies using tree-based value iteration and deep reinforcement learning (RL). Additionally, we illustrate how data-driven models of human responses to informative robot probes, such as from generative models like Conditional Variational Autoencoders (CVAEs), can be clustered with formal specifications. Results from simulated self-driving car scenarios demonstrate that data-driven strategies enable safe interaction, correct model identification, and significant dimensionality reduction.
@article{ChinchaliLivingstonEtAl2018, author = {Chinchali, S. P. and Livingston, S. C. and Chen, M. and Pavone, M.}, title = {Multi-objective optimal control for proactive decision-making with temporal logic models}, journal = {{Int. Journal of Robotics Research}}, volume = {38}, number = {12--13}, pages = {1490--1512}, year = {2019}, url = {/wp-content/papercite-data/pdf/Chinchali.Livingston.Chen.Pavone.IJRR18.pdf}, owner = {SCL}, timestamp = {2020-11-09} }
Abstract: The objective of this paper is to present a full-stack, real-time motion planning framework for kinodynamic robots and then show how it is applied and demonstrated on a physical quadrotor system operating in a laboratory environment. The proposed framework utilizes an offline-online computation paradigm, neighborhood classification through machine learning, sampling-based motion planning with an optimal cost distance metric, and trajectory smoothing to achieve real-time planning for aerial vehicles. This framework accounts for dynamic obstacles with an event-based replanning structure and a locally reactive control layer that minimizes replanning events. The approach is demonstrated on a quadrotor navigating moving obstacles in an indoor space and stands as, arguably, one of the first demonstrations of full-online kinodynamic motion planning, with execution cycles of 3 Hz to 5 Hz. For the quadrotor, a simplified dynamics model is used during the planning phase to accelerate online computation. A trajectory smoothing phase, which leverages the differentially flat nature of quadrotor dynamics, is then implemented to guarantee a dynamically feasible trajectory.
@article{AllenPavone2018, author = {Allen, R. and Pavone, M.}, title = {A Real-Time Framework for Kinodynamic Planning in Dynamic Environments with Application to Quadrotor Obstacle Avoidance}, journal = {{Robotics and Autonomous Systems}}, volume = {115}, pages = {174--193}, year = {2019}, doi = {10.1016/j.robot.2018.11.017}, url = {/wp-content/papercite-data/pdf/Allen.Pavone.RAS18.pdf}, owner = {bylard}, timestamp = {2019-01-07} }
Abstract: Reach-avoid games are excellent proxies for studying many problems in robotics and related fields, with applications including multi-robot systems, human-robot interactions, and safety-critical systems. Solving reach-avoid games is however difficult due to the conflicting and asymmetric goals of agents, and trade-offs between optimality, computational complexity, and solution generality are commonly required. This paper seeks to find attacker strategies in reach-avoid games that reduce computational complexity while retaining solution quality by using a receding horizon strategy. To solve for the open-loop strategy fast enough to enable a receding horizon approach, the problem is formulated as a mixed-integer second-order cone program. This formulation leverages the use of sums-of-squares optimization to provide guarantees that the strategy is robust to all possible defender policies. The method is demonstrated through numerical and hardware experiments.
@inproceedings{LorenzettiChenEtAl2018, author = {Lorenzetti, J. and Chen, M. and Landry, B. and Pavone, M.}, title = {Reach-Avoid Games Via Mixed-Integer Second-Order Cone Programming}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2018}, address = {Miami Beach, Florida}, month = dec, url = {/wp-content/papercite-data/pdf/Lorenzetti.Chen.Landry.Pavone.CDC18.pdf}, owner = {jlorenze}, timestamp = {2019-09-25} }
Abstract: This paper presents a stochastic, model predictive control (MPC) algorithm that leverages short-term probabilistic forecasts for dispatching and rebalancing Autonomous Mobility-on-Demand systems (AMoD), i.e., fleets of self-driving vehicles. We first present the core stochastic optimization problem in terms of a time-expanded network flow model. Then, to ameliorate its tractability, we present two key relaxations. First, we replace the original stochastic problem with a Sample Average Approximation, and provide its performance guarantees. Second, we divide the controller into two submodules. The first submodule assigns vehicles to existing customers and the second redistributes vacant vehicles throughout the city. This enables the problem to be solved as two totally unimodular linear programs, allowing the controller to scale to large problem sizes. Finally, we test the proposed algorithm in two scenarios based on real data and show that it outperforms prior state-of-the-art algorithms. In particular, in a simulation using customer data from the ridesharing company DiDi Chuxing, the algorithm presented here exhibits a 62.3 percent reduction in customer waiting time compared to state-of-the-art non-stochastic algorithms.
@inproceedings{TsaoIglesiasEtAl2018, author = {Tsao, M. and Iglesias, R. and Pavone, M.}, title = {Stochastic Model Predictive Control for Autonomous Mobility on Demand}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2018}, note = {{Extended version available} at \url{https://arxiv.org/pdf/1804.11074}}, address = {Maui, Hawaii}, month = nov, url = {https://arxiv.org/pdf/1804.11074.pdf}, owner = {rdit}, timestamp = {2018-07-12} }
Abstract: In this paper we study models and coordination policies for intermodal Autonomous Mobility-on-Demand (AMoD), wherein a fleet of self-driving vehicles provides on-demand mobility jointly with public transit. Specifically, we first present a network flow model for intermodal AMoD, where we capture the coupling between AMoD and public transit and the goal is to maximize social welfare. Second, leveraging such a model, we design a pricing and tolling scheme that achieves the social optimum under the assumption of a perfect market with selfish agents. Finally, we present a real-world case study for New York City. Our results show that the coordination between AMoD fleets and public transit can yield significant benefits compared to an AMoD system operating in isolation.
@inproceedings{SalazarRossiEtAl2018, author = {Salazar, M. and Rossi, F. and Schiffer, M. and Onder, C. H. and Pavone, M.}, title = {On the Interaction between Autonomous Mobility-on-Demand and the Public Transportation Systems}, booktitle = {{Proc. IEEE Int. Conf. on Intelligent Transportation Systems}}, year = {2018}, note = {{Extended version available} at \url{https://arxiv.org/abs/1804.11278}}, address = {Maui, Hawaii}, month = nov, url = {https://arxiv.org/pdf/1804.11278.pdf}, owner = {frossi2}, timestamp = {2019-07-02} }
Abstract: We propose a novel framework for learning stabilizable nonlinear dynamical systems for continuous control tasks in robotics. The key idea is to develop a new control-theoretic regularizer for dynamics fitting rooted in the notion of stabilizability, which guarantees that the learnt system can be accompanied by a robust controller capable of stabilizing any trajectory that the system can generate. By leveraging tools from contraction theory, statistical learning, and convex optimization, we provide a general and tractable algorithm to learn stabilizable dynamics, which can be applied to complex underactuated systems. We validate the proposed algorithm on a simulated planar quadrotor system and observe that the control-theoretic regularized dynamics model is able to consistently generate and accurately track reference trajectories, whereas the model learnt using standard regression techniques, e.g., ridge-regression (RR), does extremely poorly on both tasks. Furthermore, in aggressive flight regimes with high velocity and bank angle, the tracking controller fails to stabilize the trajectory generated by the ridge-regularized model, whereas no instabilities were observed using the control-theoretic learned model, even with a small number of demonstration examples. The results presented illustrate the need to infuse standard model-based reinforcement learning algorithms with concepts drawn from nonlinear control theory for improved reliability.
@inproceedings{SinghSindhwaniEtAl2018, author = {Singh, S. and Sindhwani, V. and Slotine, J.-J. E. and Pavone, M.}, title = {Learning Stabilizable Dynamical Systems via Control Contraction Metrics}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2018}, address = {Merida, Mexico}, month = oct, url = {https://arxiv.org/abs/1808.00113}, owner = {ssingh19}, timestamp = {2019-07-27} }
Abstract: In the pursuit of real-time motion planning, a commonly adopted practice is to compute a trajectory by running a planning algorithm on a simplified, low-dimensional dynamical model, and then employ a feedback tracking controller that tracks such a trajectory by accounting for the full, high-dimensional system dynamics. While this strategy of planning with model mismatch generally yields fast computation times, there are no guarantees of dynamic feasibility, which hampers application to safety-critical systems. Building upon recent work that addressed this problem through the lens of Hamilton-Jacobi (HJ) reachability, we devise an algorithmic framework whereby one computes, offline, for a pair of "planner" (i.e., low-dimensional) and "tracking" (i.e., high-dimensional) models, a feedback tracking controller and associated tracking bound. This bound is then used as a safety margin when generating motion plans via the low-dimensional model. Specifically, we harness the computational tool of sum-of-squares (SOS) programming to design a bilinear optimization algorithm for the computation of the feedback tracking controller and associated tracking bound. The algorithm is demonstrated via numerical experiments, with an emphasis on investigating the trade-off between the increased computational scalability afforded by SOS and its intrinsic conservativeness. Collectively, our results enable scaling the appealing strategy of planning with model mismatch to systems that are beyond the reach of HJ analysis, while maintaining safety guarantees.
@inproceedings{SinghChenEtAl2018, author = {Singh, S. and Chen, M. and Herbert, S. L. and Tomlin, C. J. and Pavone, M.}, title = {Robust Tracking with Model Mismatch for Fast and Safe Planning: an {SOS} Optimization Approach}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2018}, address = {Merida, Mexico}, month = oct, url = {https://arxiv.org/abs/1808.00649}, owner = {ssingh19}, timestamp = {2019-07-27} }
Abstract: Reach-avoid problems involve driving a system to a set of desirable configurations while keeping it away from undesirable ones. Providing mathematical guarantees for such scenarios is challenging but has numerous potential practical applications. Due to the challenges, analysis of reach-avoid problems involves making trade-offs between generality of system dynamics, generality of problem setups, optimality of solutions, and computational complexity. In this paper, we combine sum-of-squares optimization and dynamic programming to address the reach-avoid problem, and provide a conservative solution that maintains reaching and avoidance guarantees. Our method is applicable to polynomial system dynamics and to general problem setups, and is more computationally scalable than previous related methods. Through a numerical example involving two single integrators, we validate our proposed theory and compare our method to Hamilton-Jacobi reachability. Having validated our theory, we demonstrate the computational scalability of our method by computing the reach-avoid set of a system with two kinematic cars.
@inproceedings{LandryChenEtAl2018, author = {Landry, B. and Chen, M. and Hemley, S. and Pavone, M.}, title = {Reach-Avoid Problems via Sum-of-Squares Optimization and Dynamic Programming}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2018}, address = {Madrid, Spain}, month = oct, url = {https://arxiv.org/pdf/1807.11553.pdf}, owner = {blandry}, timestamp = {2018-03-03} }
Abstract: This work presents a methodology for modeling and predicting human behavior in settings with N humans interacting in highly multimodal scenarios (i.e. where there are many possible highly-distinct futures). A motivating example includes robots interacting with humans in crowded environments, such as self-driving cars operating alongside human-driven vehicles or human-robot collaborative bin packing in a warehouse. Our approach to model human behavior in such uncertain environments is to model humans in the scene as nodes in a graphical model, with edges encoding relationships between them. For each human, we learn a multimodal probability distribution over future actions from a dataset of multi-human interactions. Learning such distributions is made possible by recent advances in the theory of conditional variational autoencoders and deep learning approximations of probabilistic graphical models. Specifically, we learn action distributions conditioned on interaction history, neighboring human behavior, and candidate future agent behavior in order to take into account response dynamics. We demonstrate the performance of such a modeling approach in modeling basketball player trajectories, a highly multimodal, multi-human scenario which serves as a proxy for many robotic applications.
@inproceedings{IvanovicSchmerlingEtAl2018, author = {Ivanovic, B. and Schmerling, E. and Leung, K. and Pavone, M.}, title = {Generative Modeling of Multimodal Multi-Human Behavior}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2018}, address = {Madrid, Spain}, month = oct, url = {https://arxiv.org/pdf/1803.02015.pdf}, owner = {borisi}, timestamp = {2018-10-14} }
Abstract: Gaussian Process (GP) regression has seen widespread use in robotics due to its generality, simplicity of use, and the utility of Bayesian predictions. In particular, the predominant implementation of GP regression is kernel-based, as it enables fitting of arbitrary nonlinear functions by leveraging kernel functions as infinite-dimensional features. While incorporating prior information has the potential to drastically improve data efficiency of kernel-based GP regression, expressing complex priors through the choice of kernel function and associated hyperparameters is often challenging and unintuitive. Furthermore, the computational complexity of kernel-based GP regression scales poorly with the number of samples, limiting its application in regimes where a large amount of data is available. In this work, we propose ALPaCA, an algorithm for efficient Bayesian regression which addresses these issues. ALPaCA uses a dataset of sample functions to learn a domain-specific, finite-dimensional feature encoding, as well as a prior over the associated weights, such that Bayesian linear regression in this feature space yields accurate online predictions of the posterior density. These features are neural networks, which are trained via a meta-learning approach. ALPaCA extracts all prior information from the dataset, rather than relying on the choice of arbitrary, restrictive kernel hyperparameters. Furthermore, it substantially reduces sample complexity, and allows scaling to large systems. We investigate the performance of ALPaCA on two simple regression problems, two simulated robotic systems, and on a lane-change driving task performed by humans. We find our approach outperforms kernel-based GP regression, as well as state of the art meta-learning approaches, thereby providing a promising plug-in tool for many regression tasks in robotics where scalability and data-efficiency are important.
@inproceedings{HarrisonSharmaEtAl2018, author = {Harrison, J. and Sharma, A. and Pavone, M.}, title = {Meta-Learning Priors for Efficient Online Bayesian Regression}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2018}, address = {Merida, Mexico}, month = oct, url = {https://arxiv.org/pdf/1807.08912.pdf}, owner = {apoorva}, timestamp = {2018-10-07} }
Abstract: Signal temporal logic (STL) and Hamilton-Jacobi (HJ) reachability analysis are effective mathematical tools for formally analyzing the behavior of robotic systems. STL is a specification language that uses a combination of logic and temporal operators to precisely express real-valued and time-dependent requirements on system behaviors. While recursively defined STL specifications are extremely expressive and controller synthesis methods exist, so far there has not been work that quantifies the set of states from which STL formulas can be satisfied. HJ reachability, on the other hand, is a method for computing the reachable set, that is the set of states from which a system is able to reach a goal while satisfying state and control constraints. While reasoning about system requirements through sets of states is useful for predetermining whether it is possible to satisfy desired system properties as well as obtaining state feedback controllers, so far the applicability of HJ reachability has been limited to relatively simple reach-avoid specifications. In this paper, we merge STL and HJ reachability into a single framework that combines the key advantage of both methods: expressiveness of specifications and set quantification. To do this, we establish a correspondence between temporal and reachability operators, and utilize the idea of least-restrictive feasible controller sets (LRFCSs) to break down controller synthesis for complex STL formulas into a sequence of reachability and elementary set operations. LRFCSs are crucial for avoiding controller conflicts among the different reachability operations. In addition, the synthesized state feedback controllers are guaranteed to satisfy STL specifications if determined to be possible by our framework, and violate specifications minimally if not. We demonstrate our method through numerical simulations and robotic experiments.
@inproceedings{ChenTamEtAl2018, author = {Chen, M. and Tam, Q. and Livingston, S. C. and Pavone, M.}, title = {Signal Temporal Logic meets {Hamilton-Jacobi} Reachability: Connections and Applications}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2018}, address = {Merida, Mexico}, month = oct, url = {/wp-content/papercite-data/pdf/Chen.Tam.Livingston.Pavone.WAFR18.pdf}, owner = {bylard}, timestamp = {2021-03-25} }
Abstract: We study the interaction between a fleet of electric, self-driving vehicles servicing on-demand transportation requests (referred to as Autonomous Mobility-on-Demand, or AMoD, system) and the electric power network. We propose a joint linear model that captures the coupling between the two systems stemming from the vehicles’ charging requirements. The model subsumes existing network flow models for AMoD systems and linear models for the power network, and it captures time-varying customer demand and power generation costs, road congestion, and power transmission and distribution constraints. We then leverage the linear model to jointly optimize the operation of both systems. We propose an algorithmic procedure to losslessly reduce the problem size by bundling customer requests, allowing it to be efficiently solved by state-of-the-art linear programming solvers. Finally, we study the implementation of a hypothetical electric-powered AMoD system in Dallas-Fort Worth, and its impact on the Texas power network. We show that coordination between the AMoD system and the power network can reduce the overall energy expenditure compared to the case where no cars are present (despite the increased demand for electricity) and yield savings of $78M per year compared to an uncoordinated scenario. Collectively, the results of this paper provide a first-of-a-kind characterization of the interaction between electric-powered AMoD systems and the electric power network, and shed additional light on the economic and societal value of AMoD.
@inproceedings{RossiIglesiasEtAl2018, author = {Rossi, F. and Iglesias, R. and Alizadeh, M. and Pavone, M.}, title = {On the Interaction Between {Autonomous Mobility-on-Demand} Systems and the Power Network: Models and Coordination Algorithms}, booktitle = {{Robotics: Science and Systems}}, year = {2018}, note = {{Extended version available at }\url{https://arxiv.org/abs/1709.04906}}, address = {Pittsburgh, Pennsylvania}, month = jun, url = {/wp-content/papercite-data/pdf/Rossi.Iglesias.Alizadeh.Pavone.RSS18.pdf}, owner = {frossi2}, timestamp = {2018-06-30} }
Abstract: In this paper, we review multi-agent collective behavior algorithms in the literature and classify them according to their underlying mathematical structure. For each mathematical technique, we identify the multi-agent coordination tasks it can be applied to, and we analyze its scalability, bandwidth use, and demonstrated maturity. We highlight how versatile techniques such as artificial potential functions can be used for applications ranging from low-level position control to high-level coordination and task allocation, we discuss possible reasons for the slow adoption of complex distributed coordination algorithms in the field, and we highlight areas for further research and development.
@inproceedings{RossiBandyopadhyayEtAl2018, author = {Rossi, F. and Bandyopadhyay, S. and Wolf, M. and Pavone, M.}, title = {Review of Multi-Agent Algorithms for Collective Behavior: a Structural Taxonomy}, booktitle = {{IFAC Workshop on Networked \& Autonomous Air \& Space Systems}}, year = {2018}, address = {Santa Fe, New Mexico}, month = jun, url = {https://arxiv.org/abs/1803.05464}, owner = {frossi2}, timestamp = {2018-02-01} }
Abstract: This paper addresses the problem of planning a safe (i.e., collision-free) trajectory from an initial state to a goal region when the obstacle space is a priori unknown and is incrementally revealed online, e.g., through line-of-sight perception. Despite its ubiquitous nature, this formulation of motion planning has received relatively little theoretical investigation, as opposed to the setup where the environment is assumed known. A fundamental challenge is that, unlike motion planning with known obstacles, it is not even clear what an optimal policy to strive for is. Our contribution is threefold. First, we present a notion of optimality for safe planning in unknown environments in the spirit of comparative (as opposed to competitive) analysis, with the goal of obtaining a benchmark that is, at least conceptually, attainable. Second, by leveraging this theoretical benchmark, we derive a pseudo-optimal class of policies that can seamlessly incorporate any amount of prior or learned information while still guaranteeing the robot never collides. Finally, we demonstrate the practicality of our algorithmic approach in numerical experiments using a range of environment types and dynamics, including a comparison with a state-of-the-art method. A key aspect of our framework is that it automatically and implicitly weighs exploration versus exploitation in a way that is optimal with respect to the information available.
@inproceedings{JansonHuEtAl2018, author = {Janson, L. and Hu, T. and Pavone, M.}, title = {Safe Motion Planning in Unknown Environments: Optimality Benchmarks and Tractable Policies}, booktitle = {{Robotics: Science and Systems}}, year = {2018}, address = {Pittsburgh, Pennsylvania}, month = jun, url = {https://arxiv.org/pdf/1804.05804.pdf}, owner = {bylard}, timestamp = {2018-04-12} }
Abstract: In this paper we explore notions of traversability for hopping rovers on small solar system bodies, such as asteroids and comets, with a focus on developing actionable tools for mission planning. We start with a discussion of hopping dynamics and the inherent differences between notions of “traversability” for hopping and traditional wheeled rovers. We then discuss various map-based tools for understanding the surface gravity environment and propose an algorithm that partitions the surface into locally traversable regions. Finally, we leverage dynamic simulations to estimate k-hop backwards reachable sets—that is the surface regions from which a particular point can be reached within k hops. A case study of comet 67P demonstrates that even extremely irregular bodies may be largely traversable with an appropriate hopper design.
@inproceedings{Hockman2018, author = {Hockman, B. and Pavone, M.}, title = {Traversability of Hopping Rovers on Small Solar System Bodies}, booktitle = {{Int. Symp. on Artificial Intelligence, Robotics and Automation in Space}}, year = {2018}, address = {Madrid, Spain}, month = jun, url = {/wp-content/papercite-data/pdf/Hockman.Pavone.ISAIRAS18.pdf}, owner = {bhockman}, timestamp = {2018-05-25} }
Abstract: We present a framework to enable a fleet of rigidly attached quadrotor aerial robots to transport heavy objects along a known reference trajectory without inter-robot communication or centralized coordination. Leveraging a distributed wrench controller, we provide exponential stability guarantees for the entire assembly, under a mild geometric condition. This is achieved by each quadrotor independently solving a local optimization problem to counteract the biased torque effects from each robot in the assembly. We rigorously analyze the controllability of the object, design a distributed compensation scheme to address these challenges, and show that the resulting strategy collectively guarantees full group control authority. To ensure feasibility for online implementation, we derive bounds on the net desired control wrench, characterize the output wrench space of each quadrotor, and perform subsequent trajectory optimization under these input constraints. We thoroughly validate our method in simulation with eight quadrotors transporting a heavy object in a cluttered environment subject to various sources of uncertainty, and demonstrate the algorithm’s resilience.
@inproceedings{WangSinghEtAl2018, author = {Wang, Z. and Singh, S. and Pavone, M. and Schwager, M.}, title = {Cooperative Object Transport in {3D} with Multiple Quadrotors using No Peer Communication}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2018}, address = {Brisbane, Australia}, month = may, url = {/wp-content/papercite-data/pdf/Wang.Singh.Pavone.ea.ICRA18.pdf}, owner = {ssingh19}, timestamp = {2018-01-16} }
Abstract: This paper presents a method for constructing human-robot interaction policies in settings where multimodality, i.e., the possibility of multiple highly distinct futures, plays a critical role in decision making. We are motivated in this work by the example of traffic weaving, e.g., at highway onramps/offramps, where entering and exiting cars must swap lanes in a short distance — a challenging negotiation even for experienced drivers due to the inherent multimodal uncertainty of who will pass whom. Our approach is to learn multimodal probability distributions over future human actions from a dataset of human-human exemplars and perform real-time robot policy construction in the resulting environment model through massively parallel sampling of human responses to candidate robot action sequences. Direct learning of these distributions is made possible by recent advances in the theory of conditional variational autoencoders (CVAEs), whereby we learn action distributions simultaneously conditioned on the present interaction history, as well as candidate future robot actions in order to take into account response dynamics. We demonstrate the efficacy of this approach with a human-in-the-loop simulation of a traffic weaving scenario.
@inproceedings{SchmerlingLeungEtAl2018, author = {Schmerling, E. and Leung, K. and Vollprecht, W. and Pavone, M.}, title = {Multimodal Probabilistic Model-Based Planning for Human-Robot Interaction}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2018}, address = {Brisbane, Australia}, month = may, url = {/wp-content/papercite-data/pdf/Schmerling.Leung.Vollprecht.Pavone.ICRA18.pdf}, owner = {schmrlng}, timestamp = {2017-09-18} }
Abstract: The goal of this paper is to present an end-to-end, data-driven framework to control Autonomous Mobility-on-Demand systems (AMoD, i.e. fleets of self-driving vehicles). We first model the AMoD system using a time-expanded network, and present a formulation that computes the optimal rebalancing strategy (i.e., preemptive repositioning) and the minimum feasible fleet size for a given travel demand. Then, we adapt this formulation to devise a Model Predictive Control (MPC) algorithm that leverages short-term demand forecasts based on historical data to compute rebalancing strategies. We test the end-to-end performance of this controller with a state-of-the-art LSTM neural network to predict customer demand and real customer data from DiDi Chuxing: we show that this approach scales very well for large systems (indeed, the computational complexity of the MPC algorithm does not depend on the number of customers and of vehicles in the system) and outperforms state-of-the-art rebalancing strategies by reducing the mean customer wait time by up to 89.6%.
@inproceedings{IglesiasRossiEtAl2018, author = {Iglesias, R. and Rossi, F. and Wang, K. and Hallac, D. and Leskovec, J. and Pavone, M.}, title = {Data-Driven Model Predictive Control of Autonomous Mobility-on-Demand Systems}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2018}, address = {Brisbane, Australia}, month = may, url = {/wp-content/papercite-data/pdf/Iglesias.Rossi.Wang.ea.ICRA18.pdf}, owner = {frossi2}, timestamp = {2018-01-14} }
Abstract: A defining feature of sampling-based motion planning is the reliance on an implicit representation of the state space, which is enabled by a set of probing samples. Traditionally, these samples are drawn either probabilistically or deterministically to uniformly cover the state space. Yet, the motion of many robotic systems is often restricted to "small" regions of the state space, due to e.g. differential constraints or collision-avoidance constraints. To accelerate the planning process, it is thus desirable to devise non-uniform sampling strategies that favor sampling in those regions where an optimal solution might lie. This paper proposes a methodology for non-uniform sampling, whereby a sampling distribution is learnt from demonstrations, and then used to bias sampling. The sampling distribution is computed through a conditional variational autoencoder, allowing sample generation from the latent space conditioned on the specific planning problem. This methodology is general, can be used in combination with any sampling-based planner, and can effectively exploit the underlying structure of a planning problem while maintaining the theoretical guarantees of sampling-based approaches. Specifically, on several planning problems, the proposed methodology is shown to effectively learn representations for the relevant regions of the state space, resulting in an order of magnitude improvement in terms of success rate and convergence to the optimal cost.
@inproceedings{IchterHarrisonEtAl2018, author = {Ichter, B. and Harrison, J. and Pavone, M.}, title = {Learning Sampling Distributions for Robot Motion Planning}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2018}, address = {Brisbane, Australia}, month = may, url = {https://arxiv.org/pdf/1709.05448.pdf}, owner = {frossi2}, timestamp = {2018-01-16} }
Abstract: We present a collaborative visual localization method for rovers designed to hop and tumble over the surface of small Solar System bodies, such as comets and asteroids. In a two-phase approach, an orbiting primary spacecraft first maps the surface of a body by capturing images from various poses and illumination angles; these images are processed to create a prior map of 3D landmarks. In the second phase, a hopping rover is deployed to the surface where it uses the prior map and a camera to perform on-board visual simultaneous localization and mapping (SLAM). Small bodies present new challenges to existing visual SLAM algorithms. Rotation periods as short as 1-12 hours, in the absence of atmospheric scattering, create high-contrast shadows that move over the surface. The constantly changing illumination angles cause landmark outliers and increase pose uncertainty. Furthermore, in this collaborative visual SLAM problem, the scene scale between spacecraft and hopping rover varies by several orders of magnitude (kilometers to centimeters). In this work, we describe how to augment ORB-SLAM2 - a state of the art visual SLAM implementation - so that it combines prior images with multiple illumination angles to handle large illumination variations. We also demonstrate how a wide field of view (FOV) camera (e.g. on a hopping rover) can relocalize to prior maps captured by a narrow FOV camera (e.g. a spacecraft navcam) to handle large scale variations. To reduce pose and scale errors accumulated while exploring the surface, we show how the rover can perform large hops to capture views of the surface that it can match to the prior map. After relocalizing, the rover’s on-board estimates are updated with a pose graph optimization and bundle adjustment. 
We evaluate the proposed method with sequences of images captured around a mock asteroid; illumination angles are varied while narrow and wide FOV cameras are steered along trajectories representative of orbital and hopping motions. Trajectory estimates are compared and found to be consistent with ground truth data. Evaluations suggest this method is robust to large illumination variations, scene scale changes and off-nadir camera pointing angles.
@inproceedings{ChiodiniReidEtAl2018, author = {Chiodini, S. and Reid, R. G. and Hockman, B. and Nesnas, I. A. D. and Debei, S. and Pavone, M.}, title = {Robust Visual Localization for Hopping Rovers on Small Bodies}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2018}, address = {Brisbane, Australia}, month = may, url = {/wp-content/papercite-data/pdf/Chiodini.Reid.Hockman.ea.ICRA18.pdf}, owner = {bhockman}, timestamp = {2018-01-16} }
Abstract: The localization of landers on the surface of small bodies has traditionally relied on observations from a mothership (e.g. Rosetta’s Philae lander and Hayabusa 2’s MASCOT and MINERVA landers). However, when line-of-sight with the mothership is not always available, or for surface rovers that travel large distances, alternative mothership-independent localization techniques may be required. On-board vision-based techniques have demonstrated effective localization in terrestrial applications as well as for Mars rovers, but may be unreliable on small bodies where rovers must contend with fast-moving shadows, difficulties observing absolute scale, and issues acquiring images such as dust, sun blinding, tumbling and low albedo. We investigate the feasibility of an entirely new localization approach based on surface gravimetry, where a rover can constrain its location on the surface by precisely measuring the local gravity vector. This mothership-independent localization technique is well-suited to a class of hybrid rovers that can bounce and tumble over the surface of small bodies; it is insensitive to surface illumination, and even works at night. We develop a Bayesian framework for computing localization “likelihood maps” from gravimetry (and gradiometry) data, accounting for all sensor and model uncertainties. We then propose a method for deriving landing distributions of a bouncing rover from simulation data to serve as a prior for the localization estimate. Finally, we conduct a case study on the Philae lander, where we show how this approach could have helped reject localization hypotheses and significantly narrow the areas searched for the Philae lander.
@inproceedings{HockmanReidEtAl2018, author = {Hockman, B. and Reid, R. G. and Nesnas, I. A. D. and Pavone, M.}, title = {Gravimetric Localization on the Surface of Small Bodies}, booktitle = {{IEEE Aerospace Conference}}, year = {2018}, address = {Big Sky, Montana}, month = mar, url = {/wp-content/papercite-data/pdf/Hockman.Reid.Nesnas.Pavone.AeroConf18.pdf}, owner = {bhockman}, timestamp = {2018-01-16} }
Abstract: Modern mobile networks are facing unprecedented growth in demand due to a new class of traffic from Internet of Things (IoT) devices such as smart wearables and autonomous cars. Future networks must schedule delay-tolerant software updates, data backup, and other transfers from IoT devices while maintaining strict service guarantees for conventional real-time applications such as voice-calling and video. This problem is extremely challenging because conventional traffic is highly dynamic across space and time, so its performance is significantly impacted if all IoT traffic is scheduled immediately when it originates. In this paper, we present a reinforcement learning (RL) based scheduler that can dynamically adapt to traffic variation, and to various reward functions set by network operators, to optimally schedule IoT traffic. Using 4 weeks of real network data from downtown Melbourne, Australia spanning diverse traffic patterns, we demonstrate that our RL scheduler can enable mobile networks to carry 14.7% more data with minimal impact on existing traffic, and outperforms heuristic schedulers by more than 2x. Our work is a valuable step towards designing autonomous, "self-driving" networks that learn to manage themselves from past data.
@inproceedings{ChinchaliHuEtAl2018, author = {Chinchali, S. and Hu, P. and Chu, T. and Sharma, M. and Bansal, M. and Misra, R. and Pavone, M. and Katti, S.}, title = {Cellular Network Traffic Scheduling with Deep Reinforcement Learning}, booktitle = {{Proc. AAAI Conf. on Artificial Intelligence}}, year = {2018}, address = {New Orleans, Louisiana}, month = feb, url = {/wp-content/papercite-data/pdf/Chinchali.ea.AAAI18.pdf}, owner = {frossi2}, timestamp = {2018-04-10} }
Abstract: In a pullout maneuver an initially diving aircraft is returned to level flight. Depending on the initial condition, aircraft characteristics and control inputs, altitude loss may be significant and minimizing it can be important to avoid collision with the ground. A motivating example is that of stall/spin recoveries, where the pullout represents a majority of the total altitude lost. This paper presents a solution of the minimal altitude loss pullout maneuver by posing it as an infinite horizon optimal control problem and solving it using dynamic programming techniques on a reduced-order point mass model for a low-wing general aviation aircraft. The computed optimal policy results in a "bang-bang" type controller, typical of shortest path problems, with maximum lift coefficient and bank rate applied at each point in time. The effect of maximum lift coefficient on the minimum altitude loss is analyzed, showing that attaining the highest lift coefficient possible throughout the pullout is critical. Based on these results a pullout flight control system is designed, with the optimal policy acting as an outer loop issuing commands to two inner loops that track lift coefficient and roll rate, respectively. The proposed pullout controller is tested on 6 DOF simulations, and shown to be effective at recovering the aircraft with close to optimal altitude loss.
@inproceedings{BungePavoneEtAl2018, author = {Bunge, R.A. and Pavone, M. and Kroo, I.}, title = {Minimal Altitude Loss Pullout Maneuvers}, booktitle = {{AIAA Conf. on Guidance, Navigation and Control}}, year = {2018}, address = {Kissimmee, Florida}, doi = {10.2514/6.2018-1339}, month = jan, url = {/wp-content/papercite-data/pdf/Bunge.Pavone.Kroo.AIAAGNC18.pdf}, owner = {frossi2}, timestamp = {2018-01-21} }
Abstract: The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for a human’s risk sensitivity. To this end, we propose a flexible class of models based on coherent risk measures, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient non-parametric algorithms based on linear programming and semi-parametric algorithms based on maximum likelihood for inferring a human’s underlying risk measure and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively, especially in scenarios where catastrophic outcomes such as collisions can occur.
@article{SinghLacotteEtAl2018, author = {Singh, S. and Lacotte, J. and Majumdar, A. and Pavone, M.}, title = {Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods}, journal = {{Int. Journal of Robotics Research}}, volume = {37}, number = {13}, pages = {1713--1740}, year = {2018}, url = {https://arxiv.org/pdf/1711.10055.pdf}, owner = {ssingh19}, timestamp = {2019-08-21} }
Abstract: In this paper we present a framework for risk-sensitive model predictive control (MPC) of linear systems affected by stochastic multiplicative uncertainty. Our key innovation is to consider a time-consistent, dynamic risk evaluation of the cumulative cost as the objective function to be minimized. This framework is axiomatically justified in terms of time-consistency of risk assessments, is amenable to dynamic optimization, and is unifying in the sense that it captures a full range of risk preferences from risk-neutral (i.e., expectation) to worst case. Within this framework, we propose and analyze an online risk-sensitive MPC algorithm that is provably stabilizing. Furthermore, by exploiting the dual representation of time-consistent, dynamic risk measures, we cast the computation of the MPC control law as a convex optimization problem amenable to real-time implementation. Simulation results are presented and discussed.
@article{SinghChowEtAl2018b, author = {Singh, S. and Chow, Y.-L. and Majumdar, A. and Pavone, M.}, title = {A Framework for Time-Consistent, Risk-Sensitive Model Predictive Control: Theory and Algorithms}, journal = {{IEEE Transactions on Automatic Control}}, volume = {64}, number = {7}, pages = {2905--2912}, year = {2018}, note = {{Extended version available at:} \url{http://arxiv.org/abs/1703.01029}}, url = {http://arxiv.org/pdf/1703.01029.pdf}, owner = {ssingh19}, timestamp = {2019-07-29} }
Abstract: This paper considers the problem of routing and rebalancing a shared fleet of autonomous (i.e., self-driving) vehicles providing on-demand mobility within a capacitated transportation network, where congestion might disrupt throughput. We model the problem within a network flow framework and show that under relatively mild assumptions the rebalancing vehicles, if properly coordinated, do not lead to an increase in congestion (in stark contrast to common belief). From an algorithmic standpoint, such theoretical insight suggests that the problem of routing customers and rebalancing vehicles can be decoupled, which leads to a computationally-efficient routing and rebalancing algorithm for the autonomous vehicles. Numerical experiments and case studies corroborate our theoretical insights and show that the proposed algorithm outperforms state-of-the-art point-to-point methods by avoiding excess congestion on the road. Collectively, this paper provides a rigorous approach to the problem of congestion-aware, system-wide coordination of autonomously driving vehicles, and to the characterization of the sustainability of such robotic systems.
@article{RossiZhangEtAl2017, author = {Rossi, F. and Zhang, R. and Hindy, Y. and Pavone, M.}, title = {Routing Autonomous Vehicles in Congested Transportation Networks: Structural Properties and Coordination Algorithms}, journal = {{Autonomous Robots}}, volume = {42}, number = {7}, pages = {1427--1442}, year = {2018}, doi = {10.1007/s10514-018-9750-5}, url = {/wp-content/papercite-data/pdf/Rossi.Zhang.Hindy.Pavone.AURO17.pdf}, owner = {frossi2}, timestamp = {2018-08-07} }
@inproceedings{LeungSchmerlingEtAl2018, author = {Leung, K. and Schmerling, E. and Chen, M. and Talbot, J. and Gerdes, J. C. and Pavone, M.}, title = {On Infusing Reachability-Based Safety Assurance within Probabilistic Planning Frameworks for Human-Robot Vehicle Interactions}, booktitle = {{Int. Symp. on Experimental Robotics}}, year = {2018}, address = {Buenos Aires, Argentina}, url = {/wp-content/papercite-data/pdf/Leung.Schmerling.Chen.ea.ISER18.pdf}, owner = {mochen72}, timestamp = {2018-10-13} }
Abstract: We study the following multi-robot coordination problem: given a graph, where each edge is weighted by the probability of surviving while traversing it, find a set of paths for K robots that maximizes the expected number of nodes collectively visited, subject to constraints on the probabilities that each robot survives to its destination. We call this the Team Surviving Orienteers (TSO) problem, which is motivated by scenarios where a team of robots must traverse a dangerous environment, such as aid delivery after disasters. We present the TSO problem formally along with several variants, which represent “survivability-aware” counterparts for a wide range of multi-robot coordination problems such as vehicle routing, patrolling, and informative path planning. We propose an approximate greedy approach for selecting paths, and prove that the value of its output is within a factor 1-exp(-p_s/lambda) of the optimum where p_s is the per-robot survival probability threshold, and 1/lambda < 1 is the approximation factor of an oracle routine for the well-known orienteering problem. We also formalize an on-line update version of the TSO problem, and a generalization to heterogeneous teams where both robot types and paths are selected. Using numerical simulations, we verify that our approach works well in practice and that it scales to problems with hundreds of nodes and tens of robots.
@article{JorgensenChenEtAl2017b, author = {Jorgensen, S. and Chen, R.H. and Milam, M.B. and Pavone, M.}, title = {The Team Surviving Orienteers Problem: Routing Teams of Robots in Uncertain Environments with Survival Constraints}, journal = {{Autonomous Robots}}, volume = {42}, number = {4}, pages = {927--952}, year = {2018}, url = {/wp-content/papercite-data/pdf/Jorgensen.Chen.Milam.Pavone.AURO2017.pdf}, owner = {stefantj}, timestamp = {2018-04-10} }
Abstract: Probabilistic sampling-based algorithms, such as the probabilistic roadmap (PRM) and the rapidly-exploring random tree (RRT) algorithms, represent one of the most successful approaches to robotic motion planning, due to their strong theoretical properties (in terms of probabilistic completeness or even asymptotic optimality) and remarkable practical performance. Such algorithms are probabilistic in that they compute a path by connecting independently and identically distributed (i.i.d.) random points in the configuration space. Their randomization aspect, however, makes several tasks challenging, including certification for safety-critical applications and use of offline computation to improve real-time execution. Hence, an important open question is whether similar (or better) theoretical guarantees and practical performance could be obtained by considering deterministic, as opposed to random sampling sequences. The objective of this paper is to provide a rigorous answer to this question. Specifically, we first show that PRM, for a certain selection of tuning parameters and deterministic low-dispersion sampling sequences, is deterministically asymptotically optimal, i.e., it returns a path whose cost converges deterministically to the optimal one as the number of points goes to infinity. Second, we characterize the convergence rate, and we find that the factor of sub-optimality can be very explicitly upper-bounded in terms of the l2-dispersion of the sampling sequence and the connection radius of PRM. Third, we show that an asymptotically optimal version of PRM exists with computational and space complexity arbitrarily close to O(n) (the theoretical lower bound), where n is the number of points in the sequence. This is in stark contrast to the O(n log n) complexity results for existing asymptotically-optimal probabilistic planners. Fourth, we show that our theoretical results and insights extend to other batch-processing algorithms such as FMT*, to non-uniform sampling strategies, to k-nearest-neighbor implementations, and to differentially-constrained problems. Finally, through numerical experiments, we show that planning with deterministic low-dispersion sampling generally provides superior performance in terms of path cost and success rate.
@article{JansonIchterEtAl2015, author = {Janson, L. and Ichter, B. and Pavone, M.}, title = {Deterministic Sampling-Based Motion Planning: Optimality, Complexity, and Performance}, journal = {{Int. Journal of Robotics Research}}, volume = {37}, number = {1}, pages = {46--61}, year = {2018}, doi = {10.1177/0278364917714338}, url = {/wp-content/papercite-data/pdf/Janson.Ichter.Pavone.IJRR18.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective of this paper is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented via a chance constraint or a constraint on the conditional value-at-risk (CVaR) of the cumulative cost. We collectively refer to such problems as percentile risk-constrained MDPs. Specifically, we first derive a formula for computing the gradient of the Lagrangian function for percentile risk-constrained MDPs. Then, we devise policy gradient and actor-critic algorithms that (1) estimate such gradient, (2) update the policy parameters in the descent direction, and (3) update the Lagrange multiplier in the ascent direction. For these algorithms we prove convergence to locally-optimal policies. Finally, we demonstrate the effectiveness of our algorithms in an optimal stopping problem and an online marketing application.
@article{ChowGhavamzadehEtAl2018, author = {Chow, Y. and Ghavamzadeh, M. and Janson, L. and Pavone, M.}, title = {Risk-Constrained Reinforcement Learning with Percentile Risk Criteria}, journal = {{Journal of Machine Learning Research}}, volume = {18}, number = {167}, pages = {1--51}, year = {2018}, url = {/wp-content/papercite-data/pdf/Chow.Ghavamzadeh.Janson.Pavone.JMLR18.pdf}, owner = {bylard}, timestamp = {2018-06-03} }
Abstract: Endowing robots with the capability of assessing risk and making risk-aware decisions is widely considered a key step toward ensuring safety for robots operating under uncertainty. But, how should a robot quantify risk? A natural and common approach is to consider the framework whereby costs are assigned to stochastic outcomes - an assignment captured by a cost random variable. Quantifying risk then corresponds to evaluating a risk metric, i.e., a mapping from the cost random variable to a real number. Yet, the question of what constitutes a "good" risk metric has received little attention within the robotics community. The goal of this paper is to explore and partially address this question by advocating axioms that risk metrics in robotics applications should satisfy in order to be employed as rational assessments of risk. We discuss general representation theorems that precisely characterize the class of metrics that satisfy these axioms (referred to as distortion risk metrics), and provide instantiations that can be used in applications. We further discuss pitfalls of commonly used risk metrics in robotics, and discuss additional properties that one must consider in sequential decision making tasks. Our hope is that the ideas presented here will lead to a foundational framework for quantifying risk (and hence safety) in robotics applications.
@inproceedings{MajumdarPavone2017, author = {Majumdar, A. and Pavone, M.}, title = {How Should a Robot Assess Risk? {Towards} an Axiomatic Theory of Risk in Robotics}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2017}, address = {Puerto Varas, Chile}, month = dec, url = {/wp-content/papercite-data/pdf/Majumdar.Pavone.ISRR17.pdf}, owner = {anirudha}, timestamp = {2018-01-16} }
Abstract: Consider a scenario where robots traverse a graph, but crossing each edge bears a risk of failure. A team operator seeks a set of paths for the smallest team which guarantee that the probabilities that at least one robot visits each node satisfy specified per-node visit thresholds, and that the probabilities that each robot reaches its destination satisfy a per-robot survival threshold. We present the Risk-Sensitive Coverage (RSC) problem formally as an instance of the submodular set cover problem and propose an efficient cost-benefit greedy algorithm for finding a feasible set of paths. We prove that the number of robots deployed by our algorithm is no more than (lambda/p_s)(1+log(lambda Delta_K/p_s)) times the smallest team, where Delta_K quantifies the relative benefit of the first and last paths, p_s is the per-robot survival probability threshold and 1/lambda < 1 is the approximation factor of an oracle routine for the well-known orienteering problem. We demonstrate the quality of our solutions by comparing to optimal solutions computed for special cases of the RSC and the efficiency of our approach by applying it to a search and rescue scenario where 225 sites must be visited, each with probability at least 0.95.
@inproceedings{JorgensenChenEtAl2017d, author = {Jorgensen, S. and Chen, R.H. and Milam, M.B. and Pavone, M.}, title = {The Risk-Sensitive Coverage Problem: Multi-Robot Routing Under Uncertainty with Service Level and Survival Constraints}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2017}, address = {Melbourne, Australia}, month = dec, url = {/wp-content/papercite-data/pdf/Jorgensen.Chen.Milam.Pavone.CDC17.pdf}, owner = {stefantj}, timestamp = {2018-04-10} }
Abstract: In this paper we approach the robust motion planning problem through the lens of perception-aware planning, whereby we seek a low-cost motion plan subject to a separate constraint on perception localization quality. To solve this problem we introduce the Multiobjective Perception-Aware Planning (MPAP) algorithm which explores the state space via a multiobjective search, considering both cost and a perception heuristic. This perception-heuristic formulation allows us to both capture the history dependence of localization drift and represent complex modern perception methods. The solution trajectory from this heuristic-based search is then certified via Monte Carlo methods to be robust. The additional computational burden of perception-aware planning is offset through massive parallelization on a GPU. Through numerical experiments the algorithm is shown to find robust solutions in about a second. Finally, we demonstrate MPAP on a quadrotor flying perception-aware and perception-agnostic plans using Google Tango for localization, finding the quadrotor safely executes the perception-aware plan every time, while crashing over 20% of the time on the perception-agnostic plan due to loss of localization.
@inproceedings{IchterLandryEtAl2017, author = {Ichter, B. and Landry, B. and Schmerling, E. and Pavone, M.}, title = {Perception-Aware Motion Planning via Multiobjective Search on {GPUs}}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2017}, address = {Puerto Varas, Chile}, month = dec, url = {https://arxiv.org/pdf/1705.02408.pdf}, owner = {ichter}, timestamp = {2018-01-16} }
Abstract: Hopping rovers have emerged as a promising platform for the future surface exploration of small Solar System bodies, such as asteroids and comets. However, hopping dynamics are governed by nonlinear gravity fields and stochastic bouncing on highly irregular surfaces, which pose several challenges for traditional motion planning methods. This paper presents the first ever discussion of motion planning for hopping rovers that explicitly accounts for various sources of uncertainty. We first address the problem of planning a single hopping trajectory by developing (1) an algorithm for robustly solving Lambert’s orbital boundary value problems in irregular gravity fields, and (2) a method for computing landing distributions by propagating control and model uncertainties—from which, a time/energy-optimal hop can be selected using a (myopic) policy gradient. We then cast the sequential planning problem as a Markov decision process and apply a sample-efficient, off-line, off-policy reinforcement learning algorithm—namely, a variant of least squares policy iteration (LSPI)—to derive approximately optimal control policies that are safe, efficient, and amenable to real-time implementation on computationally-constrained rover hardware. These policies are demonstrated in simulation to be robust to modelling errors and outperform previous heuristics.
@inproceedings{HockmanPavone2017, author = {Hockman, B. and Pavone, M.}, title = {Stochastic Motion Planning for Hopping Rovers on Small Solar System Bodies}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2017}, address = {Puerto Varas, Chile}, month = dec, url = {/wp-content/papercite-data/pdf/Hockman.Pavone.ISRR17.pdf}, owner = {bhockman}, timestamp = {2018-01-16} }
Abstract: Model-free policy learning has enabled robust performance of complex tasks with relatively simple algorithms. However, this simplicity comes at the cost of requiring an Oracle and arguably very poor sample complexity. This renders such methods unsuitable for physical systems. Variants of model-based methods address this problem through the use of simulators; however, this gives rise to the problem of policy transfer from simulated to the physical system. Model mismatch due to systematic parameter shift and unmodelled dynamics error may cause suboptimal or unsafe behavior upon direct transfer. We introduce the Adaptive Policy Transfer for Stochastic Dynamics (ADAPT) algorithm that achieves provably safe and robust, dynamically-feasible zero-shot transfer of RL-policies to new domains with dynamics error. ADAPT combines the strengths of offline policy learning in a black-box source simulator with online tube-based MPC to attenuate bounded model mismatch between the source and target dynamics. ADAPT allows online transfer of policy, trained solely in a simulation offline, to a family of unknown targets without fine-tuning. We also formally show that (i) ADAPT guarantees state and control safety through state-action tubes under the assumption of Lipschitz continuity of the divergence in dynamics and (ii) ADAPT results in a bounded loss of reward accumulation in case of direct transfer with ADAPT as compared to a policy trained only on target. We evaluate ADAPT on 2 continuous, non-holonomic simulated dynamical systems with 4 different disturbance models, and find that ADAPT performs between 50% and 300% better on mean reward accrual than direct policy transfer.
@inproceedings{HarrisonGargEtAl2017, author = {Harrison, J. and Garg, A. and Ivanovic, B. and Zhu, Y. and Savarese, S. and Li, F.-F. and Pavone, M.}, title = {{ADAPT:} Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2017}, address = {Puerto Varas, Chile}, month = dec, url = {https://arxiv.org/pdf/1707.04674.pdf}, owner = {pavone}, timestamp = {2018-01-16} }
Abstract: The operation of today’s robots increasingly entails interactions with humans, in settings ranging from autonomous driving amidst human-driven vehicles to collaborative manufacturing. To effectively do so, robots must proactively decode the intent or plan of humans and concurrently leverage such knowledge for safe, cooperative task satisfaction, a problem we refer to as proactive decision making. However, the problem of proactive intent decoding coupled with robotic control is computationally intractable as a robot must reason over several possible human behavioral models and resulting high-dimensional state trajectories. In this paper, we address the proactive decision making problem using a novel combination of algorithmic and data mining techniques. First, we distill high-dimensional state trajectories of human-robot interaction into concise, symbolic behavioral summaries that can be learned from data. Second, we leverage formal methods to model high-level agent goals, safe interaction, and information-seeking behavior with temporal logic formulae. Finally, we design a novel decision-making scheme that simply maintains a belief distribution over high-level, symbolic models of human behavior, and proactively plans informative control actions. Leveraging a rich dataset of real human driving data in crowded merging scenarios, we generate temporal logic models and use them to synthesize control strategies using tree-based value iteration and reinforcement learning (RL). Results from two simulated self-driving car scenarios, one cooperative and the other adversarial, demonstrate that our data-driven control strategies enable safe interaction, correct model identification, and significant dimensionality reduction.
@inproceedings{ChinchaliLivingstonEtAl2017, author = {Chinchali, S. P. and Livingston, S. C. and Pavone, M.}, title = {Multi-objective optimal control for proactive decision-making with temporal logic models}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2017}, address = {Puerto Varas, Chile}, month = dec, url = {/wp-content/papercite-data/pdf/Chinchali.Livingston.Pavone.ISRR17.pdf}, owner = {pavone}, timestamp = {2018-01-16} }
Abstract: Spacecraft equipped with gecko-inspired dry adhesive grippers can dynamically grasp objects having a wide variety of featureless surfaces. In this paper we propose an optimization-based control strategy to exploit the dynamic robustness of such grippers for the task of grasping a free-floating, spinning object. First, we extend previous work characterizing the dynamic grasping capabilities of these grippers to the case where both object and spacecraft are free-floating and comparably sized. We then formulate the acquisition problem as a two-phase optimal control problem, which is amenable to real-time implementation and can handle constraints on velocity and control, as well as integer timing constraints for grasping a specific target location on the surface of a spinning object. Conservative analytical bounds on the set of initial states that guarantee persistent feasibility are derived.
@inproceedings{MacPhersonHockmanEtAl2017, author = {MacPherson, R. and Hockman, B. and Bylard, A. and Estrada, M. A. and Cutkosky, M. R. and Pavone, M.}, title = {Trajectory Optimization for Dynamic Grasping in Space using Adhesive Grippers}, booktitle = {{Field and Service Robotics}}, year = {2017}, address = {Zurich, Switzerland}, month = sep, url = {/wp-content/papercite-data/pdf/MacPherson.Hockman.Bylard.ea.FSR17.pdf}, owner = {bylard}, timestamp = {2018-01-16} }
Abstract: Consider a setting where robots must visit nodes in a graph, but each robot may fail when traversing an edge. The goal is to find a set of paths for a team of robots which maximizes the expected number of nodes collectively visited, while guaranteeing that the paths satisfy a notion of "independence" formalized by a matroid (e.g. limits on team size, number of visits to regions), and that the probabilities that each robot survives to its destination are above a given threshold. We call this problem the Matroid Team Surviving Orienteers (MTSO) problem, which has broad applications such as environmental monitoring in risky regions and search and rescue in dangerous conditions. We present the MTSO formally and detail numerous examples of matroids in a path planning context. We then propose an approximate greedy algorithm for selecting a feasible set of paths and prove that the value of the output is within a factor p_s/(p_s+lambda) of the optimum, where p_s is the per-robot survival probability threshold and 1/lambda < 1 is the approximation factor of an oracle routine for the well known orienteering problem. We demonstrate the efficiency of our approach by applying it to a scenario where a team of robots must gather information while avoiding pirates in the Coral Triangle.
@inproceedings{JorgensenChenEtAl2017c, author = {Jorgensen, S. and Chen, R.H. and Milam, M.B. and Pavone, M.}, title = {The Matroid Team Surviving Orienteers Problem: Constrained Routing of Heterogeneous Teams with Risky Traversal}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2017}, address = {Vancouver, Canada}, month = sep, url = {/wp-content/papercite-data/pdf/Jorgensen.Chen.Milam.Pavone.IROS2017.pdf}, owner = {stefantj}, timestamp = {2018-04-10} }
Abstract: This paper presents a tool for addressing a key component in many algorithms for planning robot trajectories under uncertainty: evaluation of the safety of a robot whose actions are governed by a closed-loop feedback policy near a nominal planned trajectory. We describe an adaptive importance sampling Monte Carlo framework that enables the evaluation of a given control policy for satisfaction of a probabilistic collision avoidance constraint which also provides an associated certificate of accuracy (in the form of a confidence interval). In particular this adaptive technique is well-suited to addressing the complexities of rigid-body collision checking applied to non-linear robot dynamics. As a Monte Carlo method it is amenable to parallelization for computational tractability, and is generally applicable to a wide gamut of simulatable systems, including alternative noise models. Numerical experiments demonstrating the effectiveness of the adaptive importance sampling procedure are presented and discussed.
@inproceedings{SchmerlingPavone2017, author = {Schmerling, E. and Pavone, M.}, title = {Evaluating Trajectory Collision Probability through Adaptive Importance Sampling for Safe Motion Planning}, booktitle = {{Robotics: Science and Systems}}, year = {2017}, address = {Cambridge, Massachusetts}, month = jul, url = {https://arxiv.org/pdf/1609.05399.pdf}, owner = {schmrlng}, timestamp = {2018-01-16} }
Abstract: The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for an expert’s risk sensitivity. To this end, we propose a flexible class of models based on coherent risk metrics, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient algorithms based on Linear Programming for inferring an expert’s underlying risk metric and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively.
@inproceedings{MajumdarSinghEtAl2017, author = {Majumdar, A. and Singh, S. and Mandlekar, A. and Pavone, M.}, title = {Risk-sensitive Inverse Reinforcement Learning via Coherent Risk Models}, booktitle = {{Robotics: Science and Systems}}, year = {2017}, address = {Cambridge, Massachusetts}, month = jul, url = {/wp-content/papercite-data/pdf/Majumdar.Singh.Mandlekar.Pavone.RSS17.pdf}, owner = {ssingh19}, timestamp = {2017-04-28} }
Abstract: We present a framework for online generation of robust motion plans for robotic systems with nonlinear dynamics subject to bounded disturbances, control constraints, and online state constraints such as obstacles. In an offline phase, one computes the structure of a feedback controller that can be efficiently implemented online to track any feasible nominal trajectory. The offline phase leverages contraction theory and convex optimization to characterize a fixed-size “tube” that the state is guaranteed to remain within while tracking a nominal trajectory (representing the center of the tube). In the online phase, when the robot is faced with obstacles, a motion planner uses such a tube as a robustness margin for collision checking, yielding nominal trajectories that can be safely executed (i.e., tracked without collisions under disturbances). In contrast to recent work on robust online planning using funnel libraries, our approach is not restricted to a fixed library of maneuvers computed offline and is thus particularly well-suited to applications such as UAV flight in densely cluttered environments where complex maneuvers may be required to reach a goal. We demonstrate our approach through simulations of a 6-state planar quadrotor navigating cluttered environments in the presence of a cross-wind. We also discuss applications of our approach to Tube Model Predictive Control (TMPC) and compare the merits of our method with state-of-the-art nonlinear TMPC techniques.
@inproceedings{SinghMajumdarEtAl2017, author = {Singh, S. and Majumdar, A. and Slotine, J.-J. E. and Pavone, M.}, title = {Robust Online Motion Planning via Contraction Theory and Convex Optimization}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2017}, note = {{Extended version available at }\url{http://asl.stanford.edu/wp-content/papercite-data/pdf/Singh.Majumdar.Slotine.Pavone.ICRA17.pdf}}, address = {Singapore}, month = may, url = {/wp-content/papercite-data/pdf/Singh.Majumdar.Slotine.Pavone.ICRA17.pdf}, owner = {bylard}, timestamp = {2018-06-30} }
Abstract: In this paper we present the PUMP (Parallel Uncertainty-aware Multiobjective Planning) algorithm for addressing the stochastic kinodynamic motion planning problem, whereby we seek a low-cost, dynamically-feasible motion plan subject to a constraint on collision probability (CP). As a departure from previous methods for chance-constrained motion planning, PUMP directly considers both CP and the optimization objective at equal priority when planning through the free configuration space, achieving an unprecedented combination of cost performance, certified safety, and speed. Planning is conducted through a massively parallel multiobjective search, here implemented with a particular application focus on GPU hardware. PUMP explores the configuration space while maintaining a Pareto optimal front of motion plans, considering cost and approximate collision probability. We introduce a novel particle-based CP approximation scheme, designed for efficient GPU implementation, which accounts for dependencies over the history of a trajectory execution. Upon termination of the exploration phase, PUMP performs a search over the Pareto optimal set of solution motion plans to identify the lowest cost motion plan that is certified to satisfy the CP constraint (according to an asymptotically exact estimator). We present numerical experiments for quadrotor planning wherein PUMP identifies solutions in 100 ms, evaluating over one hundred thousand partial plans through the course of its exploration phase. The results show that this multiobjective search achieves a lower motion plan cost, for the same collision probability constraint, compared to a safety buffer-based search heuristic and repeated RRT trials.
@inproceedings{IchterSchmerlingEtAl2017, author = {Ichter, B. and Schmerling, E. and Agha-mohammadi, A. and Pavone, M.}, title = {Real-Time Stochastic Kinodynamic Motion Planning via Multiobjective Search on {GPUs}}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2017}, address = {Singapore}, month = may, url = {http://arxiv.org/pdf/1607.06886.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: Free-flying robots have the potential to autonomously fulfill a wide range of tasks involving manipulation of objects in space. In this paper we study the design of a wrist mechanism for free-flying robots that are equipped with an adhesive gripper for attaching to objects and surfaces. The wrist and gripper allow the robots to apply moments in addition to forces, which increases their versatility for object manipulation. We apply grasp optimization to establish limitations on the forces/moments that the wrist can impart, subject to adhesion capabilities. Building on these results, we present an approach for tuning a passive wrist mechanism, or controlling an active wrist, to broaden the range of forces and moments that the robot can exert. Our theoretical insights and wrist designs are validated in simulations and on a planar micro-gravity test bed.
@inproceedings{EstradaJiangEtAl2017, author = {Estrada, M. A. and Jiang, H. and Noll, B. and Hawkes, E. W. and Pavone, M. and Cutkosky, M. R.}, title = {Force and Moment Constraints of a Curved Surface Gripper and Wrist for Assistive Free Flyers}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2017}, address = {Singapore}, month = may, url = {/wp-content/papercite-data/pdf/Estrada.Jiang.Noll.ea.ICRA17.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: In this paper we study the following multi-robot coordination problem: given a graph, where each edge is weighted by the probability of surviving while traversing it, find a set of paths for K robots that maximizes the expected number of nodes collectively visited, subject to constraints on the probability that each robot survives to its destination. We call this problem the Team Surviving Orienteers (TSO) problem. The TSO problem is motivated by scenarios where a team of robots must traverse a dangerous, uncertain environment, such as aid delivery in disaster or war zones. We present the TSO problem formally along with several variants, which represent "survivability-aware" counterparts for a wide range of multi-robot coordination problems such as vehicle routing, patrolling, and informative path planning. We propose an approximate greedy approach for selecting paths, and prove that the value of its output is bounded within a factor 1 - exp(-p_s/lambda) of the optimum where p_s is the per-robot survival probability threshold, and 1/lambda < 1 is the approximation factor of an oracle routine for the well-known orienteering problem. Our approach has linear time complexity in the team size and polynomial complexity in the graph size. Using numerical simulations, we verify that our approach is close to the optimum in practice and that it scales to problems with hundreds of nodes and tens of robots.
@inproceedings{JorgensenChenEtAl2017, author = {Jorgensen, S. and Chen, R.H. and Milam, M.B. and Pavone, M.}, title = {The Team Surviving Orienteers Problem: Routing Robots in Uncertain Environments with Survival Constraints}, booktitle = {{IEEE Int. Conf. on Robotic Computing}}, year = {2017}, address = {Taichung, Taiwan}, month = apr, url = {/wp-content/papercite-data/pdf/Jorgensen.Chen.Milam.Pavone.ICRC2017.pdf}, owner = {bylard}, timestamp = {2018-04-10} }
Abstract: This paper presents a novel approach, named the Group Marching Tree (GMT*) algorithm, to planning on GPUs at rates amenable to application within control loops, allowing planning in real-world settings via repeated computation of near-optimal plans. GMT*, like the Fast Marching Tree (FMT*) algorithm, explores the state space with a “lazy” dynamic programming recursion on a set of samples to grow a tree of near-optimal paths. GMT*, however, alters the approach of FMT* with approximate dynamic programming by expanding, in parallel, the group of all active samples with cost below an increasing threshold, rather than only the minimum cost sample. This group approximation enables low-level parallelism over the sample set and removes the need for sequential data structures, while the “lazy” collision checking limits thread divergence, all contributing to a very efficient GPU implementation. While this approach incurs some suboptimality, we prove that GMT* remains asymptotically optimal up to a constant multiplicative factor. We show solutions for complex planning problems under differential constraints can be found in 10 ms on a desktop GPU and 30 ms on an embedded GPU, representing a significant speed up over the state of the art, with only small losses in performance. Finally, we present a scenario demonstrating the efficacy of planning within the control loop (100 Hz) towards operating in dynamic, uncertain settings.
@inproceedings{IchterSchmerlingEtAl2017b, author = {Ichter, B. and Schmerling, E. and Pavone, M.}, title = {Group Marching Tree: Sampling-Based Approximately Optimal Motion Planning on {GPUs}}, booktitle = {{IEEE Int. Conf. on Robotic Computing}}, year = {2017}, address = {Taichung, Taiwan}, month = apr, url = {/wp-content/papercite-data/pdf/Ichter.Schmerling.Pavone.ICRC17.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: High-altitude balloons in near space offer the possibility of affordable scientific experimentation and hardware testing for outer space missions. In this paper we present a novel, low-cost, high-altitude balloon system that achieves multi-day flight using inexpensive latex balloons by automatically venting lifting gas and dispensing ballast to maintain altitude. Traditionally, superpressure balloons have been used for high-altitude scientific missions; however, despite their long endurance and payload capacity in the tens of kilograms, their cost is in excess of tens of thousands of dollars. Latex balloons are significantly less expensive, typically costing little more than a hundred dollars, but in normal use fly for only a couple of hours, rising until reduced atmospheric pressure causes the balloon to stretch beyond its limits. Precision-weighted latex balloons have demonstrated multi-day flights, but such systems cannot change altitude while aloft and offer minimal payload capacity (measuring in tens of grams). Our system, known as ValBal, offers altitude control capabilities exceeding those of a superpressure balloon at a two-order-of-magnitude reduction in cost. ValBal can stabilize anywhere in its operational range of 10-25 km altitude, and can execute scheduled or remotely commanded altitude transitions during flight. In its current iteration, ValBal can be configured to accommodate payloads on the order of 10000 cubic centimeters and 2 kilograms. In June 2016, a ValBal demonstration mission flew for over 70 hours continuously, surpassing the previous world record of 57 hours for the longest duration of a latex balloon flight. ValBal has flown twice more since then, including a flight of almost 80 hours. Planned developments will seek to improve the endurance to a week and increase the payload interface capabilities for scientific missions.
@inproceedings{SushkoTedjaratiEtAl2017, author = {Sushko, A. and Tedjarati, A. and Creus-Costa, J. and Maldonado, S. and Marshland, K. and Pavone, M.}, title = {Low cost, high endurance, altitude-controlled latex balloon for near-space research ({ValBal})}, booktitle = {{IEEE Aerospace Conference}}, year = {2017}, address = {Big Sky, Montana}, month = mar, url = {/wp-content/papercite-data/pdf/Suskho.Tedjarati.ea.AERO2017.pdf}, owner = {frossi2}, timestamp = {2017-10-05} }
Abstract: Removing large orbital debris in a safe, robust, and cost-effective manner is a long-standing challenge, having serious implications for LEO satellite safety and access to space. Many studies have focused on the deorbit of spent rocket bodies (R/Bs) as an achievable and high-priority first step. However, major difficulties arise from the R/Bs’ residual tumble and lack of traditional docking/grasping fixtures. Previously investigated docking strategies often require complex and risky approach maneuvers or have a high chance of producing additional debris. To address this challenge, this paper investigates the use of controllable dry adhesives (CDAs), also known as gecko-inspired adhesives, as an alternative approach to R/B docking and deorbiting. CDAs are gathering interest for in-space grasping and manipulation due to their ability to controllably attach to and detach from any smooth, clean surface, including flat and curved surfaces. Such capability significantly expands the number and types of potential docking locations on a target. CDAs are also inexpensive, are space-qualified (performing well in a vacuum, in extreme temperatures, and under radiation), and can attach and detach while applying minimal force to a target surface, all important considerations for space deployment. In this paper, we investigate a notional strategy for initial capture and stabilization of a R/B having multi-axis tumble, exploiting the unique properties of CDA grippers to reduce maneuver complexity, and we propose alternatives for rigidly attaching deorbiting kits to a R/B. Simulations based on experimentally verified models of CDA grippers indicate that these approaches show promise as robust alternatives to previously explored methods.
@inproceedings{BylardMacPhersonEtAl2017, author = {Bylard, A. and MacPherson, R. and Hockman, B. and Cutkosky, M. R. and Pavone, M.}, title = {Robust Capture and Deorbit of Rocket Body Debris Using Controllable Dry Adhesion}, booktitle = {{IEEE Aerospace Conference}}, year = {2017}, address = {Big Sky, Montana}, month = mar, url = {/wp-content/papercite-data/pdf/Bylard.MacPherson.Hockman.ea.AeroConf17.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: This paper presents a sampling-based motion planning algorithm for real-time and propellant-optimized autonomous spacecraft trajectory generation in near-circular orbits. Specifically, this paper leverages recent algorithmic advances in the field of robot motion planning to the problem of impulsively-actuated, propellant-optimized rendezvous and proximity operations under the Clohessy-Wiltshire-Hill (CWH) dynamics model. The approach calls upon a modified version of the Fast Marching Tree (FMT*) algorithm to grow a set of feasible trajectories over a deterministic, low-dispersion set of sample points covering the free state space. To enforce safety, the tree is only grown over the subset of actively-safe samples, from which there exists a feasible one-burn collision avoidance maneuver that can safely circularize the spacecraft orbit along its coasting arc under a given set of potential thruster failures. Key features of the proposed algorithm include: (i) theoretical guarantees in terms of trajectory safety and performance, (ii) amenability to real-time implementation, and (iii) generality, in the sense that a large class of constraints can be handled directly. As a result, the proposed algorithm offers the potential for widespread application, ranging from on-orbit satellite servicing to orbital debris removal and autonomous inspection missions.
@article{StarekSchmerlingEtAl2016, author = {Starek, J. A. and Schmerling, E. and Maher, G. D. and Barbee, B. W. and Pavone, M.}, title = {Fast, Safe, Propellant-Efficient Spacecraft Motion Planning Under {Clohessy}-{Wiltshire}-{Hill} Dynamics}, journal = {{AIAA Journal of Guidance, Control, and Dynamics}}, volume = {40}, number = {2}, pages = {418--438}, year = {2017}, doi = {10.2514/1.g001913}, url = {/wp-content/papercite-data/pdf/Starek.Schmerling.ea.JGCD16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: We propose an experimental method for studying mobility and surface operations of microgravity robots on zero-gravity parabolic flights - a test bed traditionally used for experiments requiring strictly zero gravity. By strategically exploiting turbulence-induced gravity fluctuations, our technique enables a new experimental approach for testing surface interactions of robotic systems in micro- to milli-gravity environments. This strategy is used to evaluate the performance of internally-actuated hopping rovers designed for controlled surface mobility on small Solar System bodies. In experiments, these rovers demonstrated a range of maneuvers on various surfaces, including both rigid and granular. Results are compared with analytical predictions and numerical simulations, yielding new insights into the dynamics and control of hopping rovers.
@inproceedings{HockmanReidEtAl2016, author = {Hockman, B. and Reid, R. G. and Nesnas, I. A. D. and Pavone, M.}, title = {Experimental Methods for Mobility and Surface Operations of Microgravity Robots}, booktitle = {{Int. Symp. on Experimental Robotics}}, year = {2016}, address = {Tokyo, Japan}, month = oct, url = {/wp-content/papercite-data/pdf/Hockman.Reid.ea.ISER16.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: In early 2011, NASA’s Office of the Chief Technologist released a set of technology roadmaps with the aim of fostering the development of concepts and cross-cutting technologies addressing NASA’s needs for the 2011-2021 decade and beyond. NASA reached out to the National Research Council (NRC) to review the program objectives and prioritize the list of technologies. In January 2012, the NRC released its report entitled “Restoring NASA’s Technological Edge and Paving the Way for a New Era in Space.” While the NRC report provides a systematic and thorough ranking of the future technology needs for NASA, it does not discuss in detail the technical aspects of its prioritized technologies (which lie beyond its scope). This chapter, building upon this framework, aims at providing such technical details for a selected number of high-priority technologies in the autonomous systems area. Specifically, this chapter focuses on technology area TA04 “Robotics, Tele-Robotics, and Autonomous Systems” and discusses in some detail the technical aspects and challenges associated with three high-priority TA04 technologies: “Relative Guidance Algorithms,” “Extreme Terrain Mobility,” and “Small Body/Microgravity Mobility.” Each of these technologies is discussed along four main dimensions: scope, need, state-of-the-art, and challenges/future directions. The result is a unified presentation of key autonomy challenges for next-generation space missions.
@incollection{StarekAcikmeseEtAl2016, author = {Starek, J. A. and Acikmese, B. and Nesnas, I. A. D. and Pavone, M.}, title = {Spacecraft Autonomy Challenges for Next Generation Space Missions}, booktitle = {Advances in Control System Technology for Aerospace Applications}, publisher = {{Springer}}, year = {2016}, chapter = {1}, doi = {10.1007/978-3-662-47694-9_1}, month = sep, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Starek.Acikmese.ea.ACSTAA16.pdf} }
Abstract: This paper considers the problem of routing and rebalancing a shared fleet of autonomous (i.e., self-driving) vehicles providing on-demand mobility within a capacitated transportation network, where congestion might disrupt throughput. We model the problem within a network flow framework and show that under relatively mild assumptions the rebalancing vehicles, if properly coordinated, do not lead to an increase in congestion (in stark contrast to common belief). From an algorithmic standpoint, such theoretical insight suggests that the problem of routing customers and rebalancing vehicles can be decoupled, which leads to a computationally-efficient routing and rebalancing algorithm for the autonomous vehicles. Numerical experiments and case studies corroborate our theoretical insights and show that the proposed algorithm outperforms state-of-the-art point-to-point methods by avoiding excess congestion on the road. Collectively, this paper provides a rigorous approach to the problem of congestion-aware, system-wide coordination of autonomously driving vehicles, and to the characterization of the sustainability of such robotic systems.
@inproceedings{ZhangRossiEtAl2016, author = {Zhang, R. and Rossi, F. and Pavone, M.}, title = {Routing Autonomous Vehicles in Congested Transportation Networks: Structural Properties and Coordination Algorithms}, booktitle = {{Robotics: Science and Systems}}, year = {2016}, doi = {10.15607/rss.2016.xii.032}, month = jul, url = {http://arxiv.org/pdf/1603.00939.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This work describes a new method for autonomous mode-matching and quadrature nulling of a microelectromechanical system (MEMS) wineglass mode gyroscope, utilizing particle swarm optimization. Use of this derivative-free optimization scheme allows for multi-objective optimization of gyroscopic performance parameters. Modal frequency split and both mode shapes’ quadrature and amplitude were optimized through this method. Optimal parameters for frequency split, quadratures, and principal axis amplitudes were found to be 0.71 Hz, 13.9 and 10.5 mV, and 284.6 and 299.6 mV, respectively. Autonomous calibration greatly increased the scale factor of the sensor and enhanced the noise performance to levels typically achieved by diligent hand tuning.
@inproceedings{FladerAhnEtAl2016, author = {Flader, I. B. and Ahn, C. H. and Gerrard, D. D. and Ng, E. J. and Yang, Y. and Hong, V. A. and Pavone, M. and Kenny, T. W.}, title = {Autonomous calibration of {MEMS} disk resonating gyroscope for improved sensor performance}, booktitle = {{American Control Conference}}, year = {2016}, address = {Boston, Massachusetts}, doi = {10.1109/ACC.2016.7526579}, month = jul, url = {/wp-content/papercite-data/pdf/Flader.Ahn.Gerrard.etal.ACC16.pdf}, owner = {bylard}, timestamp = {2018-04-08} }
Abstract: In this paper we present a model predictive control (MPC) approach to optimize vehicle scheduling and routing in an autonomous mobility-on-demand (AMoD) system. In AMoD systems, robotic, self-driving vehicles transport customers within an urban environment and are coordinated to optimize service throughout the entire network. Specifically, we first propose a novel discrete-time model of an AMoD system and we show that this formulation allows the easy integration of a number of real-world constraints, e.g., electric vehicle charging constraints. Second, leveraging our model, we design a model predictive control algorithm for the optimal coordination of an AMoD system and prove its stability in the sense of Lyapunov. At each optimization step, the vehicle scheduling and routing problem is solved as a mixed integer linear program (MILP) where the decision variables are binary variables representing whether a vehicle will 1) wait at a station, 2) service a customer, or 3) rebalance to another station. Finally, by using real-world data, we show that the MPC algorithm can be run in real-time for moderately-sized systems and outperforms previous control strategies for AMoD systems.
@inproceedings{ZhangRossiEtAl2016b, author = {Zhang, R. and Rossi, F. and Pavone, M.}, title = {Model Predictive Control of {Autonomous} {Mobility-on-Demand} Systems}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2016}, address = {Stockholm, Sweden}, doi = {10.1109/ICRA.2016.7487272}, month = may, url = {http://arxiv.org/pdf/1509.03985.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: Safely integrating unmanned aerial vehicles into civil airspace is contingent upon development of a trustworthy collision avoidance system. This paper proposes an approach whereby a parameterized resolution logic that is considered trusted for a given range of its parameters is adaptively tuned online. Specifically, to address the potential conservatism of the resolution logic with static parameters, we present a dynamic programming approach for adapting the parameters dynamically based on the encounter state. We compute the adaptation policy offline using simulation-based approximate dynamic programming which can handle the high dimensionality of the problem. Numerical experiments show that this approach improves safety and operational performance compared to the baseline resolution logic, while retaining trustworthiness.
@inproceedings{SunbergKochenderferEtAl2016, author = {Sunberg, Z. and Kochenderfer, M. and Pavone, M.}, title = {Optimized and Trusted Collision Avoidance for Unmanned Aerial Vehicles using Approximate Dynamic Programming}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2016}, address = {Stockholm, Sweden}, doi = {10.1109/ICRA.2016.7487280}, month = may, url = {/wp-content/papercite-data/pdf/Sunberg.Pavone.Kochenderfer.ICRA16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: We explore the use of grippers with gecko-inspired adhesives for spacecraft docking and acquisition of tumbling objects in microgravity. Towards the goal of autonomous object manipulation in space, adhesive grippers mounted on planar free-floating platforms are shown to be tolerant of a range of incoming linear and angular velocities. Through modeling, simulations, and experiments, we characterize the dynamic “grasping envelope” for successful acquisition and derive insights to inform future gripper designs and grasping strategies for motion planning.
@inproceedings{EstradaHockmanEtAl2016, author = {Estrada, M. A. and Hockman, B. and Bylard, A. and Hawkes, E. W. and Cutkosky, M. R. and Pavone, M.}, title = {Free-Flyer Acquisition of Spinning Objects with Gecko-Inspired Adhesives}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2016}, address = {Stockholm, Sweden}, doi = {10.1109/ICRA.2016.7487696}, month = may, url = {/wp-content/papercite-data/pdf/Estrada.Hockman.Bylard.ea.ICRA16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: Recent proliferation of cyber-physical systems, ranging from autonomous cars to nuclear hazard inspection robots, has exposed several challenging research problems on automated fault detection and recovery. This paper considers how recently developed formal synthesis and model verification techniques may be used to automatically generate information-seeking trajectories for anomaly detection. In particular, we consider the problem of how a robot could select its actions so as to maximally disambiguate between different model hypotheses that govern the environment it operates in or its interaction with other agents whose prime motivation is a priori unknown. The identification problem is posed as selection of the most likely model from a set of candidates, where each candidate is an adversarial Markov decision process (MDP) together with a linear temporal logic (LTL) formula that constrains robot-environment interaction. An adversarial MDP is an MDP in which transitions depend on both a (controlled) robot action and an (uncontrolled) adversary action. States are labeled, thus allowing interpretation of satisfaction of LTL formulae, which have a special form admitting satisfaction decisions in bounded time. An example where a robotic car must discern whether neighboring vehicles are following its trajectory for a surveillance operation is used to illustrate the problem and demonstrate our approach.
@inproceedings{ChinchaliLivingstonEtAl2016, author = {Chinchali, S. P. and Livingston, S. C. and Pavone, M. and Burdick, J. W.}, title = {Simultaneous Model Identification and Task Satisfaction in the Presence of Temporal Logic Constraints}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2016}, address = {Stockholm, Sweden}, doi = {10.1109/ICRA.2016.7487553}, month = may, url = {/wp-content/papercite-data/pdf/Chinchali.Livingston.ea.ICRA16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: In this paper we present an algorithm to compute risk averse policies in Markov Decision Processes (MDP) when the total cost criterion is used together with the average value at risk (AVaR) metric. Risk averse policies are needed when large deviations from the expected behavior may have detrimental effects, and conventional MDP algorithms usually ignore this aspect. We provide conditions for the structure of the underlying MDP ensuring that approximations for the exact problem can be derived and solved efficiently. Our findings are novel inasmuch as average value at risk has not previously been considered in association with the total cost criterion. Our method is demonstrated in a rapid deployment scenario, whereby a robot is tasked with the objective of reaching a target location within a temporal deadline where increased speed is associated with increased probability of failure. We demonstrate that the proposed algorithm not only produces a risk averse policy reducing the probability of exceeding the expected temporal deadline, but also provides the statistical distribution of costs, thus offering a valuable analysis tool.
@inproceedings{CarpinChowEtAl2016, author = {Carpin, S. and Chow, Y. and Pavone, M.}, title = {Risk Aversion in Finite {Markov} {Decision} {Processes} Using Total Cost Criteria and Average Value at Risk}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2016}, address = {Stockholm, Sweden}, doi = {10.1109/ICRA.2016.7487152}, month = may, url = {/wp-content/papercite-data/pdf/Carpin.Chow.Pavone.ICRA16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This paper presents a sampling-based motion planning algorithm for real-time, propellant-optimized autonomous spacecraft trajectory generation in near-circular orbits. Specifically, this paper leverages recent algorithmic advances in the field of robot motion planning to the problem of impulsively-actuated, propellant-optimized rendezvous and proximity operations under the Clohessy-Wiltshire-Hill (CWH) dynamics model. The approach calls upon a modified version of the Fast Marching Tree (FMT*) algorithm to grow a set of feasible and actively-safe trajectories over a deterministic, low-dispersion set of sample points covering the free state space. Key features of the proposed algorithm include: (i) theoretical guarantees of trajectory safety and performance, (ii) real-time implementability, and (iii) generality, in the sense that a large class of constraints can be handled directly. As a result, the proposed algorithm offers the potential for widespread application, ranging from on-orbit satellite servicing to orbital debris removal and autonomous inspection missions.
@inproceedings{StarekSchmerlingEtAl2016b, author = {Starek, J. A. and Schmerling, E. and Maher, G. D. and Barbee, B. W. and Pavone, M.}, title = {Real-Time, Propellant-Optimized Spacecraft Motion Planning under {Clohessy-Wiltshire-Hill} Dynamics}, booktitle = {{IEEE Aerospace Conference}}, year = {2016}, address = {Big Sky, Montana}, doi = {10.1109/aero.2016.7500704}, month = mar, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Starek.Schmerling.ea.AeroConf16.pdf} }
Abstract: The objective of this paper is to present a full-stack, real-time kinodynamic planning framework and demonstrate it on a quadrotor for collision avoidance. Specifically, the proposed framework utilizes an offline-online computation paradigm, neighborhood classification through machine learning, sampling-based motion planning with an optimal control distance metric, and trajectory smoothing to achieve real-time planning for aerial vehicles. The approach is demonstrated on a quadrotor navigating obstacles in an indoor space and stands as, arguably, one of the first demonstrations of full-online kinodynamic motion planning, exhibiting execution times under 1/3 of a second. For the quadrotor, a simplified dynamics model is used during the planning phase to accelerate online computation. A trajectory smoothing phase, which leverages the differentially flat nature of quadrotor dynamics, is then implemented to guarantee a dynamically feasible trajectory.
@inproceedings{AllenPavone2016b, author = {Allen, R. and Pavone, M.}, title = {A Real-Time Framework for Kinodynamic Planning with Application to Quadrotor Obstacle Avoidance}, booktitle = {{AIAA Conf. on Guidance, Navigation and Control}}, year = {2016}, address = {San Diego, CA}, doi = {10.2514/6.2016-1374}, month = jan, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Allen.Pavone.AIAAGNC16.pdf} }
Abstract: In this paper we present and analyze a queueing-theoretical model for autonomous mobility-on-demand (MOD) systems where robotic, self-driving vehicles transport customers within an urban environment and rebalance themselves to ensure acceptable quality of service throughout the entire network. We cast an autonomous MOD system within a closed Jackson network model with passenger loss. It is shown that an optimal rebalancing algorithm minimizing the number of (autonomously) rebalancing vehicles and keeping vehicle availabilities balanced throughout the network can be found by solving a linear program. The theoretical insights are used to design a robust, real-time rebalancing algorithm, which is applied to a case study of New York City. The case study shows that the current taxi demand in Manhattan can be met with about 8,000 robotic vehicles (roughly 60% of the size of the current taxi fleet). Finally, we extend our queueing-theoretical setup to include congestion effects, and we study the impact of autonomously rebalancing vehicles on overall congestion. Collectively, this paper provides a rigorous approach to the problem of system-wide coordination of autonomously driving vehicles, and provides one of the first characterizations of the sustainability benefits of robotic transportation networks.
@article{ZhangPavone2016, author = {Zhang, R. and Pavone, M.}, title = {Control of Robotic {Mobility-on-Demand} Systems: A Queueing-Theoretical Perspective}, journal = {{Int. Journal of Robotics Research}}, year = {2016}, volume = {35}, number = {1--3}, pages = {186--203}, doi = {10.1177/0278364915581863}, url = {/wp-content/papercite-data/pdf/Zhang.Pavone.IJRR15.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: Confident predictions of human driving behaviours are necessary in designing safe and efficient control policies for autonomous vehicles. A better understanding of how human drivers react to their surroundings may avoid the design of overly-conservative control policies which require greater cost (e.g., time, traffic flow disruption) to achieve their objective. In this paper, we explore ways to learn distributions over human driver actions that are typical of a highway setting. We use actions filtered from Next Generation SIMulation (NGSIM) vehicle trajectory data gathered on the US 101 highway as training data for a Recurrent Neural Network. In particular, we use a Mixture Density Network (MDN) model to represent predicted driver actions as a Gaussian Mixture Model. We present and discuss exploratory results on the filtering of the raw NGSIM data and design of the MDN model.
@techreport{LeungSchmerlingEtAl2016, author = {Leung, K. and Schmerling, E. and Pavone, M.}, title = {Distributional Prediction of Human Driving Behaviours using Mixture Density Networks}, institution = {{Stanford University}}, year = {2016}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Leung.Schmerling.Pavone.2016.pdf} }
Abstract: In this paper, we present a queueing network approach to the problem of routing and rebalancing a fleet of self-driving vehicles providing on-demand mobility within a capacitated road network. We refer to such systems as autonomous mobility-on-demand systems, or AMoD. We first cast an AMoD system into a closed, multi-class BCMP queueing network model. Second, we present analysis tools that allow the characterization of performance metrics for a given routing policy, in terms, e.g., of vehicle availabilities and second-order moments of vehicle throughput. Third, we propose a scalable method for the synthesis of routing policies, with performance guarantees in the limit of large fleet sizes. Finally, we validate our theoretical results on a case study of New York City. Collectively, this paper provides a unifying framework for the analysis and control of AMoD systems, which subsumes earlier Jackson and flow network models, provides a large set of modeling options (e.g., the inclusion of road capacities and general travel time distributions), and allows the analysis of second and higher-order moments for the performance metrics.
@inproceedings{IglesiasRossiEtAl2016, author = {Iglesias, R. and Rossi, F. and Zhang, R. and Pavone, M.}, title = {A {BCMP} Network Approach to Modeling and Controlling {Autonomous} {Mobility-on-Demand} Systems}, booktitle = {{Workshop on Algorithmic Foundations of Robotics}}, year = {2016}, address = {San Francisco, California}, url = {/wp-content/papercite-data/pdf/Iglesias.Rossi.Zhang.Pavone.WAFR16.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: In this paper we discuss the design, control, and experimentation of internally-actuated rovers for the exploration of low-gravity (micro-g to milli-g) planetary bodies, such as asteroids, comets, or small moons. The actuation of the rover relies on spinning three internal flywheels, which allows all subsystems to be packaged in one sealed enclosure and enables the platform to be minimalistic, thereby reducing its cost. By controlling flywheels’ spin rate, the rover is capable of achieving large surface coverage by attitude-controlled hops, fine mobility by tumbling, and coarse instrument pointing by changing orientation relative to the ground. We discuss the dynamics of such rovers, their control, and key design features (e.g., flywheel design and orientation, geometry of external spikes, and system engineering aspects). We then discuss the design and control of a first-of-a-kind test bed, which allows the accurate emulation of a microgravity environment for mobility experiments and consists of a 3 DoF gimbal attached to an actively controlled gantry crane. Finally, we present experimental results on the test bed that provide key insights for control and validate the theoretical analysis.
@article{HockmanFrickEtAl2016, author = {Hockman, B. and Frick, A. and Nesnas, I. A. D. and Pavone, M.}, title = {Design, Control, and Experimentation of Internally-Actuated Rovers for the Exploration of Low-Gravity Planetary Bodies}, journal = {{Journal of Field Robotics}}, volume = {34}, number = {1}, pages = {5--24}, year = {2016}, doi = {10.1002/rob.21656}, url = {/wp-content/papercite-data/pdf/Hockman.Pavone.ea.JFR15.pdf}, owner = {bylard}, timestamp = {2017-08-11} }
Abstract: Lightweight, highly autonomous drones that can actively interact with the world are emerging as the next step-change in consumer electronic technology, much in the same way that smart phones revolutionized personal computing.
@article{AllenPavoneEtAl2016, author = {Allen, R. and Pavone, M. and Schwager, M.}, title = {Flying Smartphones: When Portable Computing Sprouts Wings}, journal = {{IEEE Pervasive Computing}}, volume = {15}, number = {3}, pages = {83--88}, year = {2016}, doi = {10.1109/MPRV.2016.43}, url = {/wp-content/papercite-data/pdf/Allen.Pavone.Schwager.PC16.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: In the recent past, several sampling-based algorithms have been proposed to compute trajectories that are collision-free and dynamically-feasible. However, the outputs of such algorithms are notoriously jagged. In this paper, by focusing on robots with car-like dynamics, we present a fast and simple heuristic algorithm, named Convex Elastic Smoothing (CES) algorithm, for trajectory smoothing and speed optimization. The CES algorithm is inspired by earlier work on elastic band planning and iteratively performs shape and speed optimization. The key feature of the algorithm is that both optimization problems can be solved via convex programming, making CES particularly fast. A range of numerical experiments show that the CES algorithm returns high-quality solutions in a few hundred milliseconds and hence appears amenable to a real-time implementation.
@inproceedings{ZhuSchmerlingEtAl2015, author = {Zhu, Z. and Schmerling, E. and Pavone, M.}, title = {A Convex Optimization Approach to Smooth Trajectories for Motion Planning with Car-Like Robots}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2015}, address = {Osaka, Japan}, doi = {10.1109/CDC.2015.7402333}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {http://arxiv.org/pdf/1506.01085.pdf} }
Abstract: In this paper we provide a thorough, rigorous theoretical framework to assess optimality guarantees of sampling-based algorithms for drift control systems: systems that, loosely speaking, cannot stop instantaneously due to momentum. We exploit this framework to design and analyze a sampling-based algorithm (the Differential Fast Marching Tree algorithm) that is asymptotically optimal, that is, it is guaranteed to converge, as the number of samples increases, to an optimal solution. In addition, our approach allows us to provide concrete bounds on the rate of this convergence. The focus of this paper is on mixed time/control energy cost functions and on linear affine dynamical systems, which encompass a range of models of interest to applications (e.g., double-integrators) and represent a necessary step to design, via successive linearization, sampling-based and provably-correct algorithms for non-linear drift control systems. Our analysis relies on an original perturbation analysis for two-point boundary value problems, which could be of independent interest.
@inproceedings{SchmerlingJansonEtAl2015b, author = {Schmerling, E. and Janson, L. and Pavone, M.}, title = {Optimal Sampling-Based Motion Planning under Differential Constraints: the Drift Case with Linear Affine Dynamics}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2015}, address = {Osaka, Japan}, doi = {10.1109/CDC.2015.7402604}, month = dec, url = {http://arxiv.org/pdf/1405.7421.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: A drag-free satellite is a spacecraft composed of an internal test mass shielded by an external satellite that compensates all dominant disturbance forces encountered in the space environment such as aerodynamic drag and solar radiation pressure. By minimizing all non-gravitational disturbances on the test mass, the trajectory of the spacecraft is a near perfect geodesic. In concert with precise orbit determination techniques, drag-free satellites allow us to investigate topics in geodesy, aeronomy, and gravitational physics and conduct challenging experiments in low-disturbance environments to unprecedented accuracy. This paper addresses the development of a high-fidelity simulator and control system design for the Modular Gravitational Reference Sensor (MGRS) drag-free satellite. MGRS is a 100 kg microsatellite due to launch in 2018 into a Sun-synchronous orbit with a mean altitude of 657 km that aims to demonstrate three-axis drag-free operations with residual non-gravitational acceleration of a test mass under 10^-12 m/s^2/√Hz in the frequency range 0.01 to 1 Hz. The drag-free performance goal reflects a substantial improvement upon past drag-free missions such as TRIAD I, GPB, and GOCE, and will be accomplished at a fraction of the cost. Additionally, this mission represents a key technology demonstration within a larger research endeavour that aims to develop a multi-purpose distributed drag-free architecture based on microsatellite platforms. Our modeling framework allows us to gain a comprehensive insight into the range of expected disturbances, derive sizing constraints for a suitable micropropulsion system, and formulate a preliminary drag-free translational and attitude control system using H_∞ control techniques.
@inproceedings{SinghDAmicoEtAl2015, author = {Singh, S. and D'Amico, S. and Pavone, M.}, title = {High-Fidelity Modeling and Control System Synthesis for a Drag-Free Microsatellite}, booktitle = {{Int. Symp. on Space Flight Dynamics}}, year = {2015}, address = {Munich, Germany}, month = oct, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Singh.Damico.Pavone.ISSFD15.pdf} }
Abstract: Bi-directional search is a widely used strategy to increase the success and convergence rates of sampling-based motion planning algorithms. Yet, few results are available that merge both bi-directional search and asymptotic optimality into existing optimal planners, such as PRM*, RRT*, and FMT*. The objective of this paper is to fill this gap. Specifically, this paper presents a bi-directional, sampling-based, asymptotically-optimal algorithm named Bi-directional FMT* (BFMT*) that extends the Fast Marching Tree (FMT*) algorithm to bidirectional search while preserving its key properties, chiefly lazy search and asymptotic optimality through convergence in probability. BFMT* performs a two-source, lazy dynamic programming recursion over a set of randomly-drawn samples, correspondingly generating two search trees: one in cost-to-come space from the initial configuration and another in cost-to-go space from the goal configuration. Numerical experiments illustrate the advantages of BFMT* over its unidirectional counterpart, as well as a number of other state-of-the-art planners.
@inproceedings{StarekGomezEtAl2015, author = {Starek, J. A. and Gomez, J. V. and Schmerling, E. and Janson, L. and Moreno, L. and Pavone, M.}, title = {An Asymptotically-Optimal Sampling-Based Algorithm for Bi-directional Motion Planning}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2015}, address = {Hamburg, Germany}, doi = {10.1109/IROS.2015.7353652}, month = sep, url = {/wp-content/papercite-data/pdf/Starek.Gomez.ea.IROS15.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: This article presents a novel approach, named MCMP (Monte Carlo Motion Planning), to the problem of motion planning under uncertainty, i.e., to the problem of computing a low-cost path that fulfills probabilistic collision avoidance constraints. MCMP estimates the collision probability (CP) of a given path by sampling via Monte Carlo the execution of a reference tracking controller (in this paper we consider LQG). The key algorithmic contribution of this paper is the design of statistical variance-reduction techniques, namely control variates and importance sampling, to make such a sampling procedure amenable to real-time implementation. MCMP applies this CP estimation procedure to motion planning by iteratively (i) computing an (approximately) optimal path for the deterministic version of the problem (here, using the FMT* algorithm), (ii) computing the CP of this path, and (iii) inflating or deflating the obstacles by a common factor depending on whether the CP is higher or lower than a target value. The advantages of MCMP are threefold: (i) asymptotic correctness of CP estimation, as opposed to most current approximations, which, as shown in this paper, can be off by large multiples and hinder the computation of feasible plans; (ii) speed and parallelizability; and (iii) generality, i.e., the approach is applicable to virtually any planning problem provided that a path tracking controller and a notion of distance to obstacles in the configuration space are available. Numerical results illustrate the correctness (in terms of feasibility), efficiency (in terms of path cost), and computational speed of MCMP.
@inproceedings{JansonSchmerlingEtAl2015b, author = {Janson, L. and Schmerling, E. and Pavone, M.}, title = {{Monte} {Carlo} Motion Planning for Robot Trajectory Optimization Under Uncertainty}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2015}, address = {Sestri Levante, Italy}, month = sep, owner = {bylard}, timestamp = {2017-01-28}, url = {http://arxiv.org/pdf/1504.08053.pdf} }
Abstract: Probabilistic sampling-based algorithms, such as the probabilistic roadmap (PRM) and the rapidly-exploring random tree (RRT) algorithms, represent one of the most successful approaches to robotic motion planning, due to their strong theoretical properties (in terms of probabilistic completeness or even asymptotic optimality) and remarkable practical performance. Such algorithms are probabilistic in that they compute a path by connecting independently and identically distributed (i.i.d.) random points in the configuration space. Their randomization aspect, however, makes several tasks challenging, including certification for safety-critical applications and use of offline computation to improve real-time execution. Hence, an important open question is whether similar (or better) theoretical guarantees and practical performance could be obtained by considering deterministic, as opposed to random, sampling sequences. The objective of this paper is to provide a rigorous answer to this question. The focus is on the PRM algorithm; our results, however, generalize to other batch-processing algorithms such as FMT*. Specifically, we first show that PRM, for a certain selection of tuning parameters and deterministic low-dispersion sampling sequences, is deterministically asymptotically optimal, i.e., it returns a path whose cost converges deterministically to the optimal one as the number of points goes to infinity. Second, we characterize the convergence rate, and we find that the factor of sub-optimality can be very explicitly upper-bounded in terms of the ℓ2-dispersion of the sampling sequence and the connection radius of PRM. Third, we show that an asymptotically optimal version of PRM exists with computational and space complexity arbitrarily close to O(n) (the theoretical lower bound), where n is the number of points in the sequence. This is in stark contrast to the O(n log n) complexity results for existing asymptotically-optimal probabilistic planners. Finally, through numerical experiments, we show that planning with deterministic low-dispersion sampling generally provides superior performance in terms of path cost and success rate.
@inproceedings{JansonIchterEtAl2015b, author = {Janson, L. and Ichter, B. and Pavone, M.}, title = {Deterministic Sampling-Based Motion Planning: Optimality, Complexity, and Performance}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2015}, address = {Sestri Levante, Italy}, month = sep, url = {http://arxiv.org/pdf/1505.00023.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This tutorial paper examines the operational and economic aspects of autonomous mobility-on-demand (AMoD) systems, a rapidly emerging mode of personal transportation wherein robotic, self-driving vehicles transport customers in a given environment. We address AMoD systems along three dimensions: (1) modeling - analytical models capable of capturing the salient dynamic and stochastic features of customer demand, (2) control - coordination algorithms for the vehicles aimed at stability and subsequently throughput maximization, and (3) economic - fleet sizing and financial analyses for case studies of New York City and Singapore. Collectively, the models and algorithms presented in this paper enable a rigorous assessment of the value of AMoD systems. In particular, the case study of New York City shows that the current taxi demand in Manhattan can be met with about 8,000 robotic vehicles (roughly 70% of the size of the current taxi fleet), while the case study of Singapore suggests that an AMoD system can meet the personal mobility need of the entire population of Singapore with a number of robotic vehicles that is less than 40% of the current number of passenger vehicles. Directions for future research on AMoD systems are presented and discussed.
@inproceedings{ZhangSpieserEtAl2015, author = {Zhang, R. and Spieser, K. and Frazzoli, E. and Pavone, M.}, title = {Models, Algorithms and Evaluation for {Autonomous} {Mobility-on-Demand} Systems}, booktitle = {{American Control Conference}}, year = {2015}, address = {Chicago, Illinois}, doi = {10.1109/ACC.2015.7171122}, month = jul, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.ea.ACC15.pdf} }
Abstract: This paper presents a queueing network approach to the analysis and control of mobility-on-demand (MoD) systems for urban personal transportation. A MoD system consists of a fleet of vehicles providing one-way car sharing service and a team of drivers to rebalance such vehicles. The drivers then rebalance themselves by driving select customers similar to a taxi service. We model the MoD system as two coupled closed Jackson networks with passenger loss. We show that the system can be approximately balanced by solving two decoupled linear programs and exactly balanced through nonlinear optimization. The rebalancing techniques are applied to a system sizing example using taxi data in three neighborhoods of Manhattan, which suggests that the optimal vehicle-to-driver ratio in a MoD system is between 3 and 5. Lastly, we formulate a real-time closed-loop rebalancing policy for drivers and demonstrate its stability (in terms of customer wait times) for typical system loads.
@inproceedings{ZhangPavone2015, author = {Zhang, R. and Pavone, M.}, title = {A Queueing Network Approach to the Analysis and Control of {Mobility-on-Demand} Systems}, booktitle = {{American Control Conference}}, year = {2015}, address = {Chicago, Illinois}, doi = {10.1109/ACC.2015.7172070}, month = jul, owner = {bylard}, timestamp = {2017-01-28}, url = {http://arxiv.org/pdf/1409.6775v2.pdf} }
Abstract: This paper presents distributed algorithms for formation control of multiple robots in three dimensions. In particular, we leverage the mathematical properties of cyclic pursuit along with results from contraction and partial contraction theory to design distributed control algorithms ensuring global convergence to symmetric formations. As a base case we consider regular polygons as desired formations and then provide extensions to Johnson solid formations. Finally, we analyze the robustness of the control algorithms under bounded additive disturbances and provide performance bounds with respect to the formation error.
@inproceedings{SinghSchmerlingEtAl2015, author = {Singh, S. and Schmerling, E. and Pavone, M.}, title = {Decentralized Algorithms for {3D} Symmetric Formations in Robotic Networks - A Contraction Theory Approach}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2015}, address = {Seattle, Washington}, doi = {10.1109/ICRA.2015.7139355}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Singh.Pavone.ICRA2015.pdf} }
Abstract: Motion planning under differential constraints is a classic problem in robotics. To date, the state of the art is represented by sampling-based techniques, with the Rapidly-exploring Random Tree algorithm as a leading example. Yet, the problem is still open in many aspects, including guarantees on the quality of the obtained solution. In this paper we provide a thorough theoretical framework to assess optimality guarantees of sampling-based algorithms for planning under differential constraints. We exploit this framework to design and analyze two novel sampling-based algorithms that are guaranteed to converge, as the number of samples increases, to an optimal solution (namely, the Differential Probabilistic RoadMap algorithm and the Differential Fast Marching Tree algorithm). Our focus is on driftless control-affine dynamical models, which accurately model a large class of robotic systems. In this paper we use the notion of convergence in probability (as opposed to convergence almost surely): the extra mathematical flexibility of this approach yields convergence rate bounds - a first in the field of optimal sampling-based motion planning under differential constraints. Numerical experiments corroborating our theoretical results are presented and discussed.
@inproceedings{SchmerlingJansonEtAl2015, author = {Schmerling, E. and Janson, L. and Pavone, M.}, title = {Optimal Sampling-Based Motion Planning under Differential Constraints: the Driftless Case}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2015}, note = {{Extended version available at }\url{http://arxiv.org/abs/1403.2483/}}, address = {Seattle, Washington}, doi = {10.1109/ICRA.2015.7139514}, month = may, url = {http://arxiv.org/pdf/1403.2483.pdf}, owner = {bylard}, timestamp = {2018-06-30} }
Abstract: In this paper we propose a framework combining techniques from sampling-based motion planning, machine learning, and trajectory optimization to address the kinodynamic motion planning problem in real-time environments. This framework relies on a look-up table that stores precomputed optimal solutions to boundary value problems (assuming no obstacles), which form the directed edges of a precomputed motion planning roadmap. A sampling-based motion planning algorithm then leverages such a precomputed roadmap to compute online an obstacle-free trajectory. Machine learning techniques are employed to minimize the number of online solutions to boundary value problems required to compute the neighborhoods of the start state and goal regions. This approach is demonstrated to reduce online planning times up to six orders of magnitude. Simulation results are presented and discussed. Problem-specific framework modifications are then discussed that would allow further computation time reductions.
@inproceedings{AllenPavone2015, author = {Allen, R. and Pavone, M.}, title = {Toward A Real-Time Framework for Solving the Kinodynamic Motion Planning Problem}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2015}, address = {Seattle, Washington}, doi = {10.1109/ICRA.2015.7139288}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Allen.Pavone.ICRA15.pdf} }
Abstract: This paper presents a method for safe spacecraft autonomous maneuvering that leverages robotic motion planning techniques to spacecraft control. Specifically, the scenario we consider is an in-plane rendezvous of a chaser spacecraft in proximity to a target spacecraft at the origin of the Clohessy-Wiltshire-Hill frame. The trajectory for the chaser spacecraft is generated in a receding-horizon fashion by executing a sampling-based robotic motion planning algorithm named Fast Marching Trees (FMT*), which efficiently grows a tree of trajectories over a set of probabilistically-drawn samples in the state space. To enforce safety, the tree is only grown over actively safe samples, from which there exists a one-burn collision avoidance maneuver that circularizes the spacecraft orbit along a collision-free coasting arc and that can be executed under potential thruster failures. The overall approach establishes a provably-correct framework for the systematic encoding of safety specifications into the spacecraft trajectory generation process and appears promising for real-time implementation on orbit. Simulation results are presented for a two-fault tolerant spacecraft during autonomous approach to a single client in Low Earth Orbit.
@inproceedings{StarekBarbeeEtAl2015, author = {Starek, J. A. and Barbee, B. W. and Pavone, M.}, title = {A Sampling-Based Approach to Spacecraft Autonomous Maneuvering with Safety Specifications}, booktitle = {{AAS GN\&C Conference}}, year = {2015}, address = {Breckenridge, Colorado}, month = feb, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Starek.Barbee.ea.AASGNC15.pdf} }
Abstract: This chapter discusses the operational and economic aspects of autonomous mobility-on-demand (AMoD) systems, a transformative and rapidly developing mode of transportation wherein robotic, self-driving vehicles transport passengers in a given environment. Specifically, AMoD systems are addressed along three dimensions: (1) modeling, that is analytical models capturing salient dynamic and stochastic features of customer demand, (2) control, that is coordination algorithms for the vehicles aimed at throughput maximization, and (3) economic, that is fleet sizing and financial analyses for case studies of New York City and Singapore. Collectively, the models and methods presented in this chapter enable a rigorous assessment of the value of AMoD systems.
@incollection{Pavone2015, author = {Pavone, M.}, title = {{Autonomous} {Mobility-on-Demand} Systems for Future Urban Mobility}, booktitle = {Autonomes Fahren}, publisher = {{Springer}}, year = {2015}, doi = {10.1007/978-3-662-45854-9_19}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.DAIMLER14.pdf} }
Abstract: Existing approaches to constrained dynamic programming are limited to formulations where the constraints share the same additive structure of the objective function (that is, they can be represented as an expectation of the summation of one-stage costs). As such, these formulations cannot handle joint probabilistic (chance) constraints, whose structure is not additive. To bridge this gap, this paper presents a novel algorithmic approach for joint chance-constrained dynamic programming problems, where the probability of failure to satisfy given state constraints is explicitly bounded. Our approach is to (conservatively) reformulate a joint chance constraint as a constraint on the expectation of a summation of indicator random variables, which can be incorporated into the cost function by considering a dual formulation of the optimization problem. As a result, the primal variables can be optimized by standard dynamic programming, while the dual variable is optimized by a root-finding algorithm that converges exponentially. Error bounds on the primal and dual objective values are rigorously derived. We demonstrate algorithm effectiveness on three optimal control problems, namely a path planning problem, a Mars entry, descent and landing problem, and a Lunar landing problem. All Mars simulations are conducted using real terrain data of Mars, with four million discrete states at each time step. The numerical experiments are used to validate our theoretical and heuristic arguments that the proposed algorithm is both (i) computationally efficient, i.e., capable of handling real-world problems, and (ii) near-optimal, i.e., its degree of conservatism is very low.
@article{OnoPavoneEtAl2015, author = {Ono, M. and Pavone, M. and Kuwata, Y. and Balaram, J.}, title = {Chance-Constrained Dynamic Programming with Application to Risk-Aware Robotic Space Exploration}, journal = {{Autonomous Robots}}, volume = {39}, number = {4}, pages = {555--571}, year = {2015}, doi = {10.1007/s10514-015-9467-7}, owner = {bylard}, timestamp = {2017-01-28}, url = {http://web.stanford.edu/~pavone/papers/Ono.Pavone.ea.AURO14.pdf} }
Abstract: In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a "lazy" dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds - the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order O(n^{-1/d+rho}), where n is the number of sampled points, d is the dimension of the configuration space, and rho is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius. Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive.
@article{JansonSchmerlingEtAl2015, author = {Janson, L. and Schmerling, E. and Clark, A. and Pavone, M.}, title = {{Fast} {Marching} {Tree:} A Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions}, journal = {{Int. Journal of Robotics Research}}, volume = {34}, number = {7}, pages = {883--921}, year = {2015}, doi = {10.1177/0278364915577958}, url = {http://arxiv.org/pdf/1306.3532.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper we discuss the design, control, and experimentation of internally-actuated rovers for the exploration of low-gravity (micro-g to milli-g) planetary bodies, such as asteroids, comets, or small moons. The actuation of the rover relies on spinning three internal flywheels, which allows all subsystems to be packaged in one sealed enclosure and enables the platform to be minimalistic, thereby reducing its cost. By controlling flywheels’ spin rate, the rover is capable of achieving large surface coverage by attitude-controlled hops, fine mobility by tumbling, and coarse instrument pointing by changing orientation relative to the ground. We discuss the dynamics of such rovers, their control, and key design features (e.g., flywheel design and orientation, geometry of external spikes, and system engineering aspects). The theoretical analysis is validated on a first-of-a-kind 6 degree-of-freedom (DoF) microgravity test bed, which consists of a 3 DoF gimbal attached to an actively controlled gantry crane.
@inproceedings{HockmanFrickEtAl2015, author = {Hockman, B. and Frick, A. and Nesnas, I. A. D. and Pavone, M.}, title = {Design, Control, and Experimentation of Internally-Actuated Rovers for the Exploration of Low-Gravity Planetary Bodies}, booktitle = {{Field and Service Robotics}}, year = {2015}, address = {Toronto, Canada}, doi = {10.1007/978-3-319-27702-8_19}, url = {/wp-content/papercite-data/pdf/Hockman.Pavone.ea.FSR15.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such a problem as a CVaR MDP. Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget. This result, which is of independent interest, motivates CVaR MDPs as a unifying framework for risk-sensitive and robust decision making. Our second contribution is to present a value-iteration algorithm for CVaR MDPs, and analyze its convergence rate. To our knowledge, this is the first solution algorithm for CVaR MDPs that enjoys error guarantees. Finally, we present results from numerical experiments that corroborate our theoretical findings and show the practicality of our approach.
@inproceedings{ChowTamarEtAl2015, author = {Chow, Y. and Tamar, A. and Mannor, S. and Pavone, M.}, title = {Risk-Sensitive and Robust Decision-Making: a {CVaR} Optimization Approach}, booktitle = {{Conf. on Neural Information Processing Systems}}, year = {2015}, address = {Montreal, Canada}, url = {/wp-content/papercite-data/pdf/Chow.Tamar.Mannor.Pavone.NIPS15.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This paper studies fundamental limitations of performance for distributed decision-making in robotic networks. The class of decision-making problems we consider encompasses a number of prototypical problems such as average-based consensus as well as distributed optimization, leader election, majority voting, MAX, MIN, and logical formulas. We first propose a formal model for distributed computation on robotic networks that is based on the concept of I/O automata and is inspired by the Computer Science literature on distributed computing clusters. Then, we present a number of bounds on time, message, and byte complexity, which we use to discuss the relative performance of a number of approaches for distributed decision-making. From a methodological standpoint, our work sheds light on the relation between the tools developed by the Computer Science and Controls communities on the topic of distributed algorithms.
@inproceedings{RossiPavone2014, author = {Rossi, F. and Pavone, M.}, title = {On the Fundamental Limitations of Performance for Distributed Decision-Making in Robotic Networks}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2014}, address = {Los Angeles, California}, doi = {10.1109/CDC.2014.7039760}, month = dec, url = {http://arxiv.org/pdf/1409.4863.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper we consider the problem of multirobot deployment under temporal deadlines. The objective is to compute strategies trading off safety for speed to maximize the probability of reaching a given set of target locations within a pre-assigned temporal deadline. We formulate this problem using the theory of Constrained Markov Decision Processes and we show that thanks to this framework it is possible to determine deployment strategies maximizing the probability of success while satisfying the deadline. Moreover, the formulation allows one to compute exactly the failure probability of complex deployment tasks. Simulation results illustrate how the proposed method works in different scenarios and show how informed decisions can be made regarding the size of the robot team.
@inproceedings{CarpinPavoneEtAl2014, author = {Carpin, S. and Pavone, M. and Sadler, B. M.}, title = {Rapid Multirobot Deployment with Time Constraints}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2014}, address = {Chicago, Illinois}, doi = {10.1109/IROS.2014.6942702}, month = sep, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Carpin.Pavone.ea.IROS14.pdf} }
Abstract: Assessing reachability for a dynamical system, that is deciding whether a certain state is reachable from a given initial state within a given cost threshold, is a central concept in controls, robotics, and optimization. Direct approaches to assess reachability involve the solution to a two-point boundary value problem (2PBVP) between a pair of states. Alternative, indirect approaches involve the characterization of reachable sets as level sets of the value function of an appropriate optimal control problem. Both methods solve the problem accurately, but are computationally intensive and do not appear amenable to real-time implementation for all but the simplest cases. In this work, we leverage machine learning techniques to devise query-based algorithms for the approximate, yet real-time solution of the reachability problem. Specifically, we show that with a training set of pre-solved 2PBVP problems, one can accurately classify the cost-reachable sets of a differentially-constrained system using either (1) locally-weighted linear regression or (2) support vector machines. This novel, query-based approach is demonstrated on two systems: the Dubins car and a deep-space spacecraft. Classification errors on the order of 10% (and often significantly less) are achieved with average execution times on the order of milliseconds, representing 4 orders-of-magnitude improvement over exact methods. The proposed algorithms could find application in a variety of time-critical robotic applications, where the driving factor is computation time rather than optimality.
@inproceedings{AllenClarkEtAl2014, author = {Allen, R. and Clark, A. and Starek, J. A. and Pavone, M.}, title = {A Machine Learning Approach for Real-time Computation of Dynamical System Reachability Sets}, booktitle = {{IEEE/RSJ Int. Conf. on Intelligent Robots \& Systems}}, year = {2014}, address = {Chicago, Illinois}, doi = {10.1109/IROS.2014.6942859}, month = sep, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Allen.Clark.Starek.ea.IROS2014.pdf} }
Abstract: In this paper we present and analyze a queueing-theoretical model for autonomous mobility-on-demand (MOD) systems where robotic, self-driving vehicles transport customers within an urban environment and rebalance themselves to ensure acceptable quality of service throughout the entire network. We cast an autonomous MOD system within a closed Jackson network model with passenger loss. It is shown that an optimal rebalancing algorithm minimizing the number of (autonomously) rebalancing vehicles and keeping vehicle availabilities balanced throughout the network can be found by solving a linear program. The theoretical insights are used to design a robust, real-time rebalancing algorithm, which is applied to a case study of New York City. The case study shows that the current taxi demand in Manhattan can be met with about 8,000 robotic vehicles (roughly 70% of the size of the current taxi fleet operating in Manhattan). Finally, we extend our queueing-theoretical setup to include congestion effects, and we study the impact of autonomously rebalancing vehicles on overall congestion. Collectively, this paper provides a rigorous approach to the problem of system-wide coordination of autonomously driving vehicles, and provides one of the first characterizations of the sustainability benefits of robotic transportation networks.
@inproceedings{ZhangPavone2014, author = {Zhang, R. and Pavone, M.}, title = {Control of Robotic {Mobility-on-Demand} Systems: a Queueing-Theoretical Perspective}, booktitle = {{Robotics: Science and Systems}}, year = {2014}, note = {Best Paper Award Finalist}, address = {Berkeley, California}, doi = {10.15607/rss.2014.x.026}, month = jul, owner = {bylard}, timestamp = {2017-01-28}, url = {http://web.stanford.edu/~pavone/papers/Zhang.Pavone.RSS14.pdf} }
Abstract: Directed in-situ science and exploration on the surface of small Solar System bodies requires controlled mobility. In the microgravity environment of small bodies such as asteroids, comets or small moons, the low gravitational and frictional forces at the surface make typical wheeled rovers ineffective. Through a joint collaboration, the Jet Propulsion Laboratory together with Stanford University have been studying microgravity mobility approaches using hopping/tumbling platforms. They have developed an internally-actuated spacecraft/rover hybrid platform, known as "Hedgehog", that uses flywheels and brakes to impart mobility. This paper presents a model of the platform’s mobility, analyzing its three main states of motion (pivoting, slipping and hopping) and the contact dynamics between the platform’s spikes and various regolith simulants. To experimentally validate the model, an Atwood machine (pulley and counterbalance) was used to emulate microgravity. Experiments were performed with a range of torques on both rigid and granular surfaces while a high-speed camera tracked the platform’s motion. Using parameters measured during the experiments, the platform was simulated numerically and its motion compared. Within the limits of the experimental setup, the model is consistent with observations; it indicates the ability to perform controlled forward motions in microgravity on a range of rigid and granular regolith simulants.
@inproceedings{ReidRovedaEtAl2014, author = {Reid, R. G. and Roveda, L. and Nesnas, I. A. D. and Pavone, M.}, title = {Contact Dynamics of Internally-Actuated Platforms for the Exploration of Small {Solar} {System} Bodies}, booktitle = {{Int. Symp. on Artificial Intelligence, Robotics and Automation in Space}}, year = {2014}, address = {Montr\'{e}al, Quebec}, month = jun, url = {/wp-content/papercite-data/pdf/Reid.Roveda.ea.iSAIRAS14.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper we present a framework for risk-averse model predictive control (MPC) of linear systems affected by multiplicative uncertainty. Our key innovation is to consider time-consistent, dynamic risk metrics as objective functions to be minimized. This framework is axiomatically justified in terms of time-consistency of risk preferences, is amenable to dynamic optimization, and is unifying in the sense that it captures a full range of risk assessments from risk-neutral to worst case. Within this framework, we propose and analyze an online risk-averse MPC algorithm that is provably stabilizing. Furthermore, by exploiting the dual representation of time-consistent, dynamic risk metrics, we cast the computation of the MPC control law as a convex optimization problem amenable to implementation on embedded systems. Simulation results are presented and discussed.
@inproceedings{ChowPavone2014, author = {Chow, Y. and Pavone, M.}, title = {A Framework for Time-Consistent, Risk-Averse Model Predictive Control: Theory and Algorithms}, booktitle = {{American Control Conference}}, year = {2014}, address = {Portland, Oregon}, doi = {10.1109/ACC.2014.6859437}, month = jun, url = {/wp-content/papercite-data/pdf/Chow.Pavone.ACC14.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: The in-situ exploration of small Solar System bodies (such as asteroids or comets) is becoming a central objective for future planetary exploration. Such bodies are characterized by very weak gravitational fields, which make hopping mobility platforms one of the preferred mobility strategies for microgravity surface exploration, as recognized by space agencies worldwide. However, little is known about the dynamical behavior of hopping platforms in low gravity environments, where small bodies’ rotational dynamics can have a critical effect. Accordingly, the objective of this paper is to study in detail the dynamic envelope of hopping microgravity rovers, with a focus on internal actuation. Specifically, we first perform a static analysis with the goal of determining regions of a small body where an internally-actuated hopping rover can stably remain at rest. Then, we perform a dynamic analysis and discuss the actuation and instrument pointing performance of hopping microgravity platforms as a function of a number of system and environmental parameters (e.g., rover shape, body rotation rate). Finally, we tailor our analysis to a potential mission to Mars’ moon Phobos. Collectively, our results show that internally-actuated rovers, from an actuation standpoint, are a viable mobility solution for a vast class of small Solar System bodies. Also, our analysis represents a key first step to develop path planning algorithms for microgravity explorers to safely explore dynamically feasible regions.
@inproceedings{KoenigPavoneEtAl2014, author = {Koenig, Adam W. and Pavone, M. and Castillo-Rogez, Julie C. and Nesnas, I. A. D.}, title = {A Dynamical Characterization of Internally-Actuated Microgravity Mobility Systems}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2014}, address = {Hong Kong}, doi = {10.1109/ICRA.2014.6907836}, month = may, url = {/wp-content/papercite-data/pdf/Koenig.Pavone.ea.ICRA14.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: The objective of this work is to provide analytical guidelines and financial justification for the design of shared-vehicle mobility-on-demand systems. Specifically, we consider the fundamental issue of determining the appropriate number of vehicles to field in the fleet, and estimate the financial benefits of several models of car sharing. As a case study, we consider replacing all modes of personal transportation in a city such as Singapore with a fleet of shared automated vehicles, able to drive themselves, e.g., to move to a customer’s location. Using actual transportation data, our analysis suggests a shared-vehicle mobility solution can meet the personal mobility needs of the entire population with a fleet whose size is approximately one third of the total number of passenger vehicles currently in operation.
@incollection{SpieserTreleavenEtAl2014, author = {Spieser, K. and Treleaven, K. and Zhang, R. and Frazzoli, E. and Morton, D. and Pavone, M.}, title = {Toward a Systematic Approach to the Design and Evaluation of {Autonomous} {Mobility-on-Demand} Systems: A Case Study in {Singapore}}, booktitle = {Road Vehicle Automation}, year = {2014}, doi = {10.1007/978-3-319-05990-7_20}, url = {http://dspace.mit.edu/handle/1721.1/82904}, owner = {bylard}, publisher = {{Springer}}, timestamp = {2017-06-15} }
Abstract: In this paper we study the inherent trade-off between time and communication complexity for the distributed consensus problem. In our model, communication complexity is measured as the maximum data throughput (in bits per second) sent through the network at a given instant. Such a notion of communication complexity, referred to as bandwidth complexity, is related to the frequency bandwidth a designer should collectively allocate to the agents if they were to communicate via a wireless channel, which represents an important constraint for dense robotic networks. We prove a lower bound on the bandwidth complexity of the consensus problem and provide a consensus algorithm that is bandwidth-optimal for a wide class of consensus functions. We then propose a distributed algorithm that can trade communication complexity versus time complexity as a function of a tunable parameter, which can be adjusted by a system designer as a function of the properties of the wireless communication channel. We rigorously characterize the tunable algorithm’s worst-case bandwidth complexity and show that it compares favorably with the bandwidth complexity of well-known consensus algorithms.
@inproceedings{RossiPavone2014b, author = {Rossi, F. and Pavone, M.}, title = {Distributed Consensus with Mixed Time/Communication Bandwidth Performance Metrics}, booktitle = {{Allerton Conf. on Communications, Control and Computing}}, year = {2014}, address = {Champaign, Illinois}, doi = {10.1109/ALLERTON.2014.7028468}, url = {http://arxiv.org/pdf/1410.0956.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: Multi-vehicle routing problems in systems and control theory are concerned with the design of control policies to coordinate several vehicles moving in a metric space, in order to complete spatially localized, exogenously generated tasks, in an efficient way. Control policies depend on several factors, including the definition of the tasks, of the task generation process, of the vehicle dynamics and constraints, of the information available to the vehicles, and of the performance objective. Ensuring the stability of the system, i.e., the uniform boundedness of the number of outstanding tasks, is a primary concern. Typical performance objectives include measures of quality of service, such as, e.g., the average or worst-case time a task spends in the system before being completed, or the percentage of tasks that are completed before certain deadlines. The scalability of the control policies to large groups of vehicles often drives the choice of the information structure, requiring distributed computation
@incollection{FrazzoliPavone2014, author = {Frazzoli, E. and Pavone, M.}, title = {Multi-Vehicle Routing}, booktitle = {Encyclopedia of Systems and Control}, publisher = {{Springer}}, year = {2014}, doi = {10.1007/978-1-4471-5102-9_218-1}, owner = {bylard}, timestamp = {2017-01-28}, url = {http://web.stanford.edu/~pavone/papers/Frazzoli.Pavone.ESC13.pdf} }
Abstract: In this paper we consider a stochastic deployment problem, where a robotic swarm is tasked with the objective of positioning at least one robot at each of a set of pre-assigned targets while meeting a temporal deadline. Travel times and failure rates are stochastic but related, inasmuch as failure rates increase with speed. To maximize chances of success while meeting the deadline, a control strategy has therefore to balance safety and performance. Our approach is to cast the problem within the theory of constrained Markov Decision Processes, whereby we seek to compute policies that maximize the probability of successful deployment while ensuring that the expected duration of the task is bounded by a given deadline. To account for uncertainties in the problem parameters, we consider a robust formulation and we propose efficient solution algorithms, which are of independent interest. Numerical experiments confirming our theoretical results are presented and discussed.
@article{ChowPavoneEtAl2014, author = {Chow, Y. and Pavone, M. and Sadler, B. M. and Carpin, S.}, title = {Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints}, journal = {{ASME Journal of Dynamic Systems, Measurement, and Control}}, volume = {137}, number = {3}, pages = {031005.1--031005.11}, year = {2014}, doi = {10.1115/1.4028117}, owner = {bylard}, timestamp = {2017-01-28}, url = {http://web.stanford.edu/~pavone/papers/Chow.Pavone.ea.ASME14.pdf} }
Abstract: In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a "lazy" dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds, the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order O(n^{-1/d+ρ}), where n is the number of sampled points, d is the dimension of the configuration space, and ρ is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius. Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive.
@inproceedings{JansonPavone2013, author = {Janson, L. and Pavone, M.}, title = {{Fast} {Marching} {Trees:} A Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions}, booktitle = {{Int. Symp. on Robotics Research}}, year = {2013}, address = {Singapore}, month = dec, url = {/wp-content/papercite-data/pdf/Janson.Pavone.ISRR13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper, we present a discretization algorithm for the solution of stochastic optimal control problems with dynamic, time-consistent risk constraints. Previous works have shown that such problems can be cast as Markov decision problems (MDPs) on an augmented state space where a constrained form of Bellman’s recursion can be applied. However, even if both the state space and action spaces for the original optimization problem are finite, the augmented state in the induced MDP problem contains state variables that are continuous. Our approach is to apply a uniform-grid discretization scheme for the augmented state. To prove the correctness of this approach, we develop novel Lipschitz bounds for constrained dynamic programming operators. We show that convergence to the optimal value functions is linear in the step size, which is the same convergence rate for discretization algorithms for unconstrained dynamic programming operators. Simulation experiments are presented and discussed.
@inproceedings{ChowPavone2013b, author = {Chow, Y. and Pavone, M.}, title = {A Uniform-Grid Discretization Algorithm for Stochastic Optimal Control with Risk Constraints}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2013}, address = {Firenze, Italy}, doi = {10.1109/CDC.2013.6760250}, month = dec, url = {/wp-content/papercite-data/pdf/Chow.Pavone.CDC13.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: The past decade has witnessed a rapidly growing interest in decentralized algorithms for collective decision-making in cyber-physical networks. For a large variety of settings, control strategies are now known that either minimize time complexity (i.e., convergence time) or optimize communication complexity (i.e., number and size of exchanged messages). Yet, little attention has been paid to the problem of studying the inherent trade-off between time and communication complexity. Generally speaking, time-optimal algorithms are fast and robust, but require a large (and sometimes impractical) number of exchanged messages; in contrast, communication-optimal algorithms minimize the amount of information routed through the network, but are slow and sensitive to link failures. In this paper we address this gap by focusing on a generalized version of the decentralized consensus problem (that includes voting and mediation) on undirected network topologies and in the presence of "infrequent" link failures. We present and rigorously analyze a tunable, semi-hierarchical algorithm, where the tuning parameter allows a graceful transition from time-optimal to communication-optimal performance (hence, allowing hybrid performance metrics), and determines the algorithm’s robustness, measured as the time required to recover from a failure. An interesting feature of our algorithm is that it leads the decision-making agents to self-organize into a semi-hierarchical structure with variable-size clusters, among which information is flooded. Our results make use of a novel connection between the consensus problem and the theory of gamma synchronizers. Simulation experiments are presented and discussed.
@inproceedings{RossiPavone2013, author = {Rossi, F. and Pavone, M.}, title = {Decentralized Decision-Making on Robotic Networks with Hybrid Performance Metrics}, booktitle = {{Allerton Conf. on Communications, Control and Computing}}, year = {2013}, address = {Champaign, Illinois}, doi = {10.1109/Allerton.2013.6736546}, month = oct, url = {/wp-content/papercite-data/pdf/Rossi.Pavone.Allerton13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: Recent years have witnessed great advancements in the science and technology for unmanned aerial vehicles (UAVs), e.g., in terms of autonomy, sensing, and networking capabilities. This chapter surveys algorithms on task assignment and scheduling for one or multiple UAVs in a dynamic environment, in which targets arrive at random locations at random times, and remain active until one of the UAVs flies to the target’s location and performs an on-site task. The objective is to minimize some measure of the targets’ activity, e.g., the average amount of time during which a target remains active. The chapter focuses on a technical approach that relies upon methods from queueing theory, combinatorial optimization, and stochastic geometry. The main advantage of this approach is its ability to provide analytical estimates of the performance of the UAV system on a given problem, thus providing insight into how performance is affected by design and environmental parameters, such as the number of UAVs and the target distribution. In addition, the approach provides provable guarantees on the system’s performance with respect to an ideal optimum. To illustrate this approach, a variety of scenarios are considered, ranging from the simplest case where one UAV moves along continuous paths and has unlimited sensing capabilities, to the case where the motion of the UAV is subject to curvature constraints, and finally to the case where the UAV has a finite sensor footprint. Finally, the problem of cooperative routing algorithms for multiple UAVs is considered, within the same queueing-theoretical framework, and with a focus on control decentralization.
@incollection{EnrightFrazzoliEtAl2013, author = {Enright, J. J. and Frazzoli, E. and Pavone, M. and Savla, K.}, title = {{UAV} Routing and Coordination in Stochastic, Dynamic Environments}, booktitle = {Handbook of Unmanned Aerial Vehicles}, publisher = {{Springer}}, year = {2013}, doi = {10.1007/978-90-481-9707-1_28}, month = aug, url = {/wp-content/papercite-data/pdf/Enright.Frazzoli.ea.Springer13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper we study rebalancing strategies for a mobility-on-demand urban transportation system blending customer-driven vehicles with a taxi service. In our system, a customer arrives at one of many designated stations and is transported to any other designated station, either by driving themselves, or by being driven by an employed driver. The system allows for one-way trips, so that customers do not have to return to their origin. When some origins and destinations are more popular than others, vehicles will become unbalanced, accumulating at some stations and becoming depleted at others. This problem is addressed by employing rebalancing drivers to drive vehicles from the popular destinations to the unpopular destinations. However, with this approach the rebalancing drivers themselves become unbalanced, and we need to "rebalance the rebalancers" by letting them travel back to the popular destinations with a customer. Accordingly, in this paper we study how to optimally route the rebalancing vehicles and drivers so that stability (in terms of boundedness of the number of waiting customers) is ensured while minimizing the number of rebalancing vehicles traveling in the network and the number of rebalancing drivers needed; surprisingly, these two objectives are aligned, and one can find the optimal rebalancing strategy by solving two decoupled linear programs. Leveraging our analysis, we determine the minimum number of drivers and minimum number of vehicles needed to ensure stability in the system. Interestingly, our simulations suggest that, in Euclidean network topologies, one would need between 1/3 and 1/4 as many drivers as vehicles.
@inproceedings{SmithPavoneEtAl2013, author = {Smith, S. L. and Pavone, M. and Schwager, M. and Frazzoli, E. and Rus, D.}, title = {Rebalancing the Rebalancers: Optimally Routing Vehicles and Drivers in {Mobility-on-Demand} Systems}, booktitle = {{American Control Conference}}, year = {2013}, address = {Washington, D.C.}, doi = {10.1109/ACC.2013.6580187}, month = jun, url = {/wp-content/papercite-data/pdf/Smith.Pavone.ea.ACC13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: In this paper we present a dynamic programming approach to stochastic optimal control problems with dynamic, time-consistent risk constraints. Constrained stochastic optimal control problems, which naturally arise when one has to consider multiple objectives, have been extensively investigated in the past 20 years; however, in most formulations, the constraints are formulated as either risk-neutral (i.e., by considering an expected cost), or by applying static, single-period risk metrics with limited attention to "time-consistency" (i.e., to whether such metrics ensure rational consistency of risk preferences across multiple periods). Recently, significant strides have been made in the development of a rigorous theory of dynamic, time-consistent risk metrics for multi-period (risk-sensitive) decision processes; however, their integration within constrained stochastic optimal control problems has received little attention. The goal of this paper is to bridge this gap. First, we formulate the stochastic optimal control problem with dynamic, time-consistent risk constraints and we characterize the tail subproblems (which requires the addition of a Markovian structure to the risk metrics). Second, we develop a dynamic programming approach for its solution, which allows one to compute the optimal costs by value iteration. Finally, we discuss both theoretical and practical features of our approach, such as generalizations, construction of optimal control policies, and computational aspects. A simple, two-state example is given to illustrate the problem setup and the solution approach.
@inproceedings{ChowPavone2013, author = {Chow, Y. and Pavone, M.}, title = {Stochastic Optimal Control with Dynamic, Time-Consistent Risk Constraints}, booktitle = {{American Control Conference}}, year = {2013}, address = {Washington, D.C.}, doi = {10.1109/ACC.2013.6579868}, month = jun, url = {/wp-content/papercite-data/pdf/Chow.Pavone.ACC13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: The future exploration of small Solar System bodies will, in part, depend on the availability of mobility platforms capable of performing both large surface coverage and short traverses to specific locations. Weak gravitational fields, however, make the adoption of traditional mobility systems difficult. In this paper we present a planetary mobility platform (called "spacecraft/rover hybrid") that relies on internal actuation. A hybrid is a small ( 5 kg), multifaceted robot enclosing three mutually orthogonal flywheels and surrounded by external spikes or contact surfaces. By accelerating/decelerating the flywheels and by exploiting the low-gravity environment, such a platform can perform both long excursions (by hopping) and short, precise traverses (through controlled "tumbles"). This concept has the potential to lead to small, quasi-expendable, yet maneuverable rovers that are robust as they have no external moving parts. In the first part of the paper we characterize the dynamics of such platforms (including fundamental limitations of performance) and we discuss control and planning algorithms. In the second part, we discuss the development of a prototype and present experimental results both in simulations and on physical test stands emulating low-gravity environments. Collectively, our results lay the foundations for the design of internally-actuated rovers with controlled mobility (as opposed to random hopping motion).
@inproceedings{AllenPavoneEtAl2013, author = {Allen, R. and Pavone, M. and McQuin, C. and Nesnas, Issa and {Castillo-Rogez}, Julie C. and Nguyen, {Tam-Nguyen} and Hoffman, Jeffrey A.}, title = {Internally-Actuated Rovers for All-Access Surface Mobility: Theory and Experimentation}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2013}, address = {Karlsruhe, Germany}, doi = {10.1109/ICRA.2013.6631363}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Allen.Pavone.ea.ICRA13.pdf} }
Abstract: In this paper we present a mission architecture for the systematic and affordable in-situ exploration of small Solar System bodies (such as asteroids, comets, and Martian moons). At a general level, a mother spacecraft would deploy on the surface of a small body one, or several, spacecraft/rover hybrids, which are small (<= 5 kg, 15 Watts), multi-faceted robots enclosing three mutually orthogonal flywheels and surrounded by external spikes (in particular, there is no external propulsion). By accelerating/decelerating the flywheels and by exploiting the low gravity environment, the hybrids would be capable of performing both long excursions (by hopping) and short traverses to specific locations (through a sequence of controlled "tumbles"). Their control would rely on synergistic operations with the mother spacecraft (where most of the hybrids’ perception and localization functionalities would be hosted), which would make the platforms minimalistic and in turn the entire mission architecture affordable. Specifically, in the first part of the paper we present preliminary models and laboratory experiments for the hybrids, first-order estimates for critical subsystems, and a preliminary study for synergistic mission operations. In the second part, we tailor our mission architecture to the exploration of Mars’ moon Phobos. The mission aims at exploring Phobos’ Stickney crater, whose spectral similarities with C-type asteroids and variety of terrain properties make it a particularly interesting exploration target to address both high-priority science for the Martian system and strategic knowledge gaps for the future human exploration of Mars.
@inproceedings{PavoneCastilloEtAl2013, author = {Pavone, M. and Castillo, J. and Nesnas, I. and Hoffman, J. A. and Strange, N.}, title = {Spacecraft/Rover Hybrids for the Exploration of Small {Solar} {System} Bodies}, booktitle = {{IEEE Aerospace Conference}}, year = {2013}, address = {Big Sky, Montana}, doi = {10.1109/AERO.2013.6497160}, month = mar, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Castillo.ea.Aero13.pdf} }
Abstract: Pickup and delivery problems (PDPs), in which objects or people have to be transported between specific locations, are among the most common combinatorial problems in real-world logistical operations. A widely-encountered type of PDP is the Stacker Crane Problem (SCP), where each commodity/customer is associated with a pickup location and a delivery location, and the objective is to find a minimum-length tour visiting all locations with the constraint that each pickup location and its associated delivery location are visited in immediate, consecutive order. The SCP is NP-Hard and the best known approximation algorithm only provides a 9/5 approximation ratio. In this paper, we examine an embedding of the SCP within a stochastic framework, and our objective is three-fold: First, we describe a large class of algorithms for the SCP, where every member is asymptotically optimal, i.e., it produces, almost surely, a solution approaching the optimal one as the number of pickups/deliveries goes to infinity; moreover, one can achieve computational complexity O(n^{2+ε}) within the class, where n is the number of pickup/delivery pairs and ε is an arbitrarily small positive constant. Second, we characterize the length of the optimal SCP tour asymptotically. Finally, we study a dynamic version of the SCP, whereby pickup and delivery requests arrive according to a Poisson process, and which serves as a model for large-scale demand-responsive transport (DRT) systems. For such a dynamic counterpart of the SCP, we derive a necessary and sufficient condition for the existence of stable vehicle routing policies, which depends only on the workspace geometry, the distributions of pickup and delivery points, the arrival rate of requests, and the number of vehicles.
Our results leverage a novel connection between the Euclidean Bipartite Matching Problem and the theory of random permutations, and, for the dynamic setting, exhibit novel features that are absent in traditional spatially-distributed queueing systems.
@article{TreleavenPavoneEtAl2013, author = {Treleaven, K. and Pavone, M. and Frazzoli, E.}, title = {Asymptotically Optimal Algorithms for One-to-One Pickup and Delivery Problems with Applications to Transportation Systems}, journal = {{IEEE Transactions on Automatic Control}}, volume = {58}, number = {9}, pages = {2261--2276}, year = {2013}, doi = {10.1109/TAC.2013.2259993}, url = {/wp-content/papercite-data/pdf/Treleaven.Pavone.ea.TAC13.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: Future planetary explorations envisioned by the National Research Council’s (NRC’s) Vision and Voyages for Planetary Science in the Decade 2013-2022, developed at the request of the NASA Science Mission Directorate (SMD) Planetary Science Division (PSD), seek to reach targets of broad scientific interest across the solar system. This goal can be achieved by missions with next-generation capabilities such as innovative interplanetary trajectory solutions, highly accurate landings, the ability to be in close proximity to targets of interest, advanced pointing precision, multiple spacecraft in collaboration, multitarget tours, and advanced robotic surface exploration. Advancements in guidance, navigation, and control (GN&C) and mission design, ranging from software and algorithm development to new sensors, will be necessary to enable these future missions. Spacecraft GN&C technologies have been evolving since the launch of the first rocket. Guidance is defined to be the onboard determination of the desired path of travel from the vehicle’s current location to a designated target. Navigation is defined as the science behind transporting ships, aircraft, or spacecraft from place to place; particularly, the method of determining position, course, and distance traveled as well as the determination of the time reference. Control is defined as the onboard manipulation of vehicle steering controls to track guidance commands while maintaining vehicle pointing with the required precision. As missions become more complex, technological demands on GN&C increase, and so continuous technology progress is necessary. Recognizing the significance of this research, the NRC of the National Academies listed many GN&C technologies as top priorities in the recently released NASA Space Technology Roadmaps and Priorities: Restoring NASA’s Technological Edge and Paving the Way for a New Era in Space.
This document (Part III, Surface Guidance, Navigation, and Control) is the third, and last, in a series of technology assessments evaluating the capabilities and technologies needed for future missions pursuing SMD PSD’s scientific goals. These reports cover the status of technologies and provide findings and recommendations to NASA PSD for future needs in GN&C and mission design technologies. Part I covers planetary mission design in general, as well as the estimation and control of vehicle flight paths when flight path and attitude dynamics may be treated as decoupled or only loosely coupled (as is the case the majority of the time in a typical planetary mission). Part II, Onboard Guidance, Navigation, and Control, covers attitude estimation and control in general, as well as the estimation and control of vehicle flight paths when flight path and attitude dynamics are strongly coupled (as is the case during certain critical phases, such as entry, descent, and landing, in some planetary missions). Part III, Surface Guidance, Navigation, and Control, examines GN&C for vehicles that are not in free flight, but that operate on or near the surface of a natural body of the solar system. It should be noted that this is the first time that surface GN&C has been assessed and requirements given for future missions. Together, these documents provide the PSD with a roadmap for achieving science missions in the next decade.
@techreport{QuadrelliMcHenryEtAl2013, author = {Quadrelli, M. and McHenry, M. and Wilcox, B. and Hall, J. and Volpe, R. and Nesnas, I. and Nayar, H. and Backes, P. and Mukherjee, R. and Matthies, L. and Zimmerman, W. and Mittman, D. and Pavone, M. and Elfes, A.}, title = {Guidance, Navigation, and Control Technology Assessment for Future Planetary Science Missions Part {III}, Surface Guidance, Navigation, and Control}, institution = {{Planetary Science Division, NASA Science Mission Directorate}}, year = {2013}, owner = {bylard}, timestamp = {2017-01-28}, url = {https://solarsystem.nasa.gov/scitech/display.cfm?ST_ID=2547} }
Abstract: One of the most common combinatorial problems in logistics and transportation, after the Traveling Salesman Problem, is the Stacker Crane Problem (SCP), where commodities or customers are associated each with a pickup location and a delivery location, and the objective is to find a minimum-length tour ‘picking up’ and ‘delivering’ all items, while ensuring the number of items on-board never exceeds a given capacity. While vastly many SCPs encountered in practice are embedded in road or road-like networks, very few studies explicitly consider such environments. In this paper, first, we formulate an environment model capturing the essential features of a "small-neighborhood" road network, along with models for omni-directional vehicles and directed vehicles. Then, we formulate a stochastic version of the unit-capacity SCP, on our road network model, where pickup/delivery sites are random points along segments of the network. Our main contribution is a polynomial-time algorithm for the problem that is asymptotically constant-factor; i.e., it produces a solution no worse than κ+o(1) times the length of the optimal one, where o(1) goes to zero as the number of items grows large, almost surely. The constant κ is at most 3, and for omni-directional vehicles it is provably 1, i.e., optimal. Simulations show that with a number of pickup/delivery pairs as low as 50, the proposed algorithm delivers a solution whose cost is consistently within 10% of that of an optimal solution.
@inproceedings{TreleavenPavoneEtAl2012b, author = {Treleaven, K. and Pavone, M. and Frazzoli, E.}, title = {Models and Efficient Algorithms for Pickup and Delivery Problems on Roadmaps}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2012}, address = {Maui, Hawaii}, doi = {10.1109/CDC.2012.6426164}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Treleaven.Pavone.ea.CDC12.pdf} }
Abstract: This paper presents a novel risk-constrained multi-stage decision making approach to the architectural analysis of planetary rover missions. In particular, focusing on a 2018 Mars rover concept, which was considered as part of a potential Mars Sample Return campaign, we model the entry, descent, and landing (EDL) phase and the rover traverse phase as four sequential decision-making stages. The problem is to find a sequence of divert and driving maneuvers so that the rover drive is minimized and the probability of a mission failure (e.g., due to a failed landing) is below a user-specified bound. By solving this problem for several different values of the model parameters (e.g., divert authority), this approach enables rigorous, accurate and systematic trade-offs for the EDL system vs. the mobility system, and, more generally, cross-domain trade-offs for the different phases of a space mission. The overall optimization problem can be seen as a chance-constrained dynamic programming problem, with the additional complexity that 1) in some stages the disturbances do not have any probabilistic characterization, and 2) the state space is extremely large (i.e., hundreds of millions of states for trade-offs with high-resolution Martian maps). To this end, we solve the problem by performing an unconventional combination of average and minimax cost analysis and by leveraging highly efficient computation tools from the image processing community. Preliminary trade-off results are presented.
@inproceedings{KuwataPavoneEtAl2012, author = {Kuwata, Y. and Pavone, M. and Balaram, J.}, title = {A Risk-Constrained Multi-Stage Decision Making Approach to the Architectural Analysis of {M}ars Missions}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2012}, address = {Maui, Hawaii}, doi = {10.1109/CDC.2012.6426090}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Kuwata.Pavone.ea.CDC12.pdf} }
Abstract: Demand-responsive transport (DRT) systems, where users generate requests for transportation from a pickup point to a delivery point, are expected to increase in usage dramatically as the inconvenience of privately-owned cars in metropolitan areas becomes excessive. However, despite the increasing role of DRT systems, there are very few rigorous results characterizing achievable performance (in terms, e.g., of stability conditions). In this paper, our aim is to bridge this gap for a rather general model of DRT systems, which takes the form of a generalized Dynamic Pickup and Delivery Problem. The key strategy is to develop analytical bounds for the optimal cost of the Euclidean Stacker Crane Problem (ESCP), which represents a general static model for DRT systems. By leveraging such bounds, we characterize a necessary and sufficient condition for the stability of DRT systems; the condition depends only on the workspace geometry, the stochastic distributions of pickup and delivery points, customers’ arrival rate, and the number of vehicles. Our results exhibit some surprising features that are absent in traditional spatially-distributed queueing systems.
@inproceedings{TreleavenPavoneEtAl2012, author = {Treleaven, K. and Pavone, M. and Frazzoli, E.}, title = {Cost Bounds for Pickup and Delivery Problems with Application to Large-Scale Transportation Systems}, booktitle = {{American Control Conference}}, year = {2012}, address = {Montr\'{e}al, Canada}, doi = {10.1109/ACC.2012.6315329}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Treleaven.Pavone.ea.ACC12.pdf} }
Abstract: Command of support robots by the warfighter requires intuitive interfaces to quickly communicate high degree-of-freedom (DOF) information while leaving the hands unencumbered. Stealth operations rule out voice commands and vision-based gesture interpretation techniques, as they often entail silent operations at night or in other low visibility conditions. Targeted at using bio-signal inputs to set navigation and manipulation goals for the robot (say, simply by pointing), we developed a system based on an electromyography (EMG) "BioSleeve", a high-density sensor array for robust, practical signal collection from forearm muscles. The EMG sensor array data is fused with inertial measurement unit (IMU) data. This paper describes the BioSleeve system and presents initial results of decoding robot commands from the EMG and IMU data using a BioSleeve prototype with up to sixteen bipolar surface EMG sensors. The BioSleeve is demonstrated on the recognition of static hand positions (e.g., palm facing front, fingers upwards) and on dynamic gestures (e.g., hand wave). In preliminary experiments, over 90% correct recognition was achieved on five static and nine dynamic gestures. We use the BioSleeve to control a team of five LANdroid robots in individual and group/squad behaviors. We define a gesture composition mechanism that allows the specification of complex robot behaviors with only a small vocabulary of gestures/commands, and we illustrate it with a set of complex orders.
@inproceedings{StoicaAssadEtAl2012, author = {Stoica, A. and Assad, C. and Wolf, M. and You, K. S. and Pavone, M. and Huntsberger, T. and Iwashita, Y.}, title = {Using Arm and Hand Gestures to Command Robots during Stealth Operations}, booktitle = {{Proc. of SPIE}}, year = {2012}, address = {Baltimore, Maryland}, doi = {10.1117/12.923690}, month = jun, url = {/wp-content/papercite-data/pdf/Stoica.Assad.ea.SPIE12.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: The recent decadal survey report for planetary science (compiled by the National Research Council) has prioritized three main areas for planetary exploration: (1) the characterization of the early Solar system history, (2) the search for planetary habitats, and (3) an improved understanding of the nature of planetary processes. A growing number of ground and space observations suggest that small bodies are ideally suited for addressing all three of these priorities. In parallel, several technological advances have been recently made for microgravity rovers, penetrators, and MEMS-based instruments. Motivated by these findings and new technologies, the objective of this paper is to study the expected science return of spatially-extended in-situ exploration at small bodies, as a function of surface covered and in the context of the key science priorities identified by the decadal survey report. Specifically, targets within the scope of our analysis belong to three main classes: main belt asteroids and irregular satellites, Near Earth Objects, and comets. For each class of targets, we identify the corresponding science objectives for potential future exploration, we discuss the types of measurements and instruments that would be required, and we discuss mission architectures (with an emphasis on spatially-extended in-situ exploration) to achieve such objectives. Then, we characterize (notionally) how the science return for two reference targets would scale with the amount (and type) of surface that is expected to be covered by a robotic mobile platform. The conclusion is that spatially-extended in-situ information about the chemical and physical heterogeneity of small bodies has the potential to lead to a much improved understanding of their origin, evolution, and astrobiological relevance.
@inproceedings{CastilloPavoneEtAl2012, author = {Castillo, J. and Pavone, M. and Nesnas, I. and Hoffman, J. A.}, title = {Expected Science Return of Spatially-Extended In-Situ Exploration at Small {Solar} {System} Bodies}, booktitle = {{IEEE Aerospace Conference}}, year = {2012}, address = {Big Sky, Montana}, doi = {10.1109/AERO.2012.6187034}, month = mar, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Castillo.Pavone.ea.Aero12.pdf} }
Abstract: In this paper we develop methods for maximizing the throughput of a mobility-on-demand urban transportation system. We consider a finite group of shared vehicles, located at a set of stations. Users arrive at the stations, pick up vehicles, and drive (or are driven) to their destination station where they drop off the vehicle. When some origins and destinations are more popular than others, the system will inevitably become out of balance: Vehicles will build up at some stations, and become depleted at others. We propose a robotic solution to this rebalancing problem that involves empty robotic vehicles autonomously driving between stations. Specifically, we develop a rebalancing policy that lets every station reach an equilibrium in which there are excess vehicles and no waiting customers and that minimizes the number of robotic vehicles performing rebalancing trips. To do this, we utilize a fluid model for the customers and vehicles in the system. We then show that the optimal rebalancing policy can be found as the solution to a linear program. We use this solution to develop a real-time rebalancing policy which can operate in highly variable environments. We verify policy performance in a simulated mobility-on-demand environment and in hardware experiments.
@article{PavoneSmithEtAl2012, author = {Pavone, M. and Smith, S. L. and Frazzoli, E. and Rus, D.}, title = {Robotic Load Balancing for {Mobility-on-Demand} Systems}, journal = {{Int. Journal of Robotics Research}}, volume = {31}, number = {7}, pages = {839--854}, year = {2012}, doi = {10.1177/0278364912444766}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Smith.ea.IJRR12.pdf} }
Abstract: This study investigated a novel mission architecture for the systematic and affordable in-situ exploration of small Solar System bodies. Specifically, a mother spacecraft would deploy over the surface of a small body one, or several, spacecraft/rover hybrids, which are small, multi-faceted enclosed robots with internal actuation and external spikes. They would be capable of 1) long excursions (by hopping), 2) short traverses to specific locations (through a sequence of controlled tumbles), and 3) high-altitude, attitude-controlled ballistic flight (akin to spacecraft flight). Their control would rely on synergistic operations with the mother spacecraft (where most of the hybrids’ perception and localization functionalities would be hosted), which would make the platforms minimalistic and, in turn, the entire mission architecture affordable. The Phase I study was aimed at providing an initial feasibility assessment of the proposed architecture and had, in particular, four main objectives: 1) to characterize the expected science return of spatially-extended in-situ exploration at small Solar System bodies; 2) to demonstrate that a hybrid can achieve both large surface coverage via hopping and fine mobility via tumbling in low gravity environments (specifically, for a boulder-free environment with a gravity level on the order of mm/s^2, the requirement was 20%-30% motion accuracy with an average speed on the order of cm/s); 3) to provide first-order estimates for the critical subsystems; and 4) to study mission operations and a mission scenario to Phobos.
@techreport{PavoneCastilloEtAl2012, author = {Pavone, M. and Castillo, J. and Hoffman, J. A. and Nesnas, I.}, title = {Spacecraft/Rover Hybrids for the Exploration of Small {Solar} {System} Bodies}, institution = {{NASA NIAC Program}}, year = {2012}, note = {Final report}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.ea.NIAC.Final.Report.2012.pdf} }
Abstract: The Stacker Crane Problem (SCP) is NP-Hard and the best known approximation algorithm only provides a 9/5 approximation ratio. The objective of this paper is threefold. First, by embedding the problem within a stochastic framework, we present a novel algorithm for the SCP that: (i) is asymptotically optimal, i.e., it produces, almost surely, a solution approaching the optimal one as the number of pickups/deliveries goes to infinity; and (ii) has computational complexity O(n^{2+ε}), where n is the number of pickup/delivery pairs and ε is an arbitrarily small positive constant. Second, we asymptotically characterize the length of the optimal SCP tour. Finally, we study a dynamic version of the SCP, whereby pickup and delivery requests arrive according to a Poisson process, and which serves as a model for large-scale demand-responsive transport (DRT) systems. For such a dynamic counterpart of the SCP, we derive a necessary and sufficient condition for the existence of stable vehicle routing policies, which depends only on the workspace geometry, the stochastic distributions of pickup and delivery points, the arrival rate of requests, and the number of vehicles. Our results leverage a novel connection between the Euclidean Bipartite Matching Problem and the theory of random permutations, and, for the dynamic setting, exhibit novel features that are absent in traditional spatially-distributed queueing systems.
@inproceedings{TreleavenPavoneEtAl2011, author = {Treleaven, K. and Pavone, M. and Frazzoli, E.}, title = {An Asymptotically Optimal Algorithm for Pickup and Delivery Problems}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2011}, address = {Orlando, Florida}, doi = {10.1109/CDC.2011.6161406}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Treleaven.Pavone.ea.CDC11.pdf} }
Abstract: In this paper we develop methods for maximizing the throughput of a mobility-on-demand urban transportation system. We consider a finite group of shared vehicles, located at a set of stations. Users arrive at the stations, pick up vehicles, and drive (or are driven) to their destination station where they drop off the vehicle. When some origins and destinations are more popular than others, the system will inevitably become out of balance: Vehicles will build up at some stations, and become depleted at others. We propose a robotic solution to this rebalancing problem that involves empty robotic vehicles autonomously driving between stations. We develop a rebalancing policy that minimizes the number of vehicles performing rebalancing trips. To do this, we utilize a fluid model for the customers and vehicles in the system. The model takes the form of a set of nonlinear time-delay differential equations. We then show that the optimal rebalancing policy can be found as the solution to a linear program. By analyzing the dynamical system model, we show that every station reaches an equilibrium in which there are excess vehicles and no waiting customers. We use this solution to develop a real-time rebalancing policy which can operate in highly variable environments. We verify policy performance in a simulated mobility-on-demand environment with stochastic features found in real-world urban transportation networks.
@inproceedings{PavoneSmithEtAl2011, author = {Pavone, M. and Smith, S. L. and Frazzoli, E. and Rus, D.}, title = {Load Balancing for {Mobility-on-Demand} Systems}, booktitle = {{Robotics: Science and Systems}}, year = {2011}, address = {Los Angeles, California}, doi = {10.15607/rss.2011.vii.034}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Smith.ea.RSS11.pdf} }
Abstract: In this paper, we present adaptive and distributed algorithms for motion coordination of a group of m vehicles. The vehicles must service demands whose time of arrival, spatial location, and service requirement are stochastic; the objective is to minimize the average time demands spend in the system. The general problem is known as the m-vehicle Dynamic Traveling Repairman Problem (m-DTRP). The best previously known control algorithms rely on centralized task assignment and are not robust against changes in the environment. In this paper, we first devise new control policies for the 1-DTRP that: i) are provably optimal both in light-load conditions (i.e., when the arrival rate for the demands is small) and in heavy-load conditions (i.e., when the arrival rate for the demands is large), and ii) are adaptive, in particular, they are robust against changes in load conditions. Then, we show that specific partitioning policies, whereby the environment is partitioned among the vehicles and each vehicle follows a certain set of rules within its own region, are optimal in heavy-load conditions. Building upon the previous results, we finally design control policies for the m-DTRP that i) are adaptive and distributed, and ii) have strong performance guarantees in heavy-load conditions and stabilize the system in any load condition.
@article{PavoneFrazzoliEtAl2011, author = {Pavone, M. and Frazzoli, E. and Bullo, F.}, title = {Adaptive and Distributed Algorithms for Vehicle Routing in a Stochastic and Dynamic Environment}, journal = {{IEEE Transactions on Automatic Control}}, volume = {56}, number = {6}, pages = {1259--1274}, year = {2011}, doi = {10.1109/TAC.2010.2092850}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.ea.TAC11.pdf} }
Abstract: A widely applied strategy for workload sharing is to equalize the workload assigned to each resource. In mobile multiagent systems, this principle directly leads to equitable partitioning policies whereby: 1) the environment is equitably divided into subregions of equal measure; 2) one agent is assigned to each subregion; and 3) each agent is responsible for service requests originating within its own subregion. The current lack of distributed algorithms for the computation of equitable partitions limits the applicability of equitable partitioning policies to limited-size multiagent systems operating in known, static environments. In this paper, first we design provably correct and spatially distributed algorithms that allow a team of agents to compute a convex and equitable partition of a convex environment. Second, we discuss how these algorithms can be extended so that a team of agents can compute, in a spatially distributed fashion, convex and equitable partitions with additional features, e.g., equitable and median Voronoi diagrams. Finally, we discuss two application domains for our algorithms, namely dynamic vehicle routing for mobile robotic networks and wireless ad hoc networks. Through these examples, we show how one can couple the algorithms presented in this paper with equitable partitioning policies to make these amenable to distributed implementation. More generally, we illustrate a systematic approach to devise spatially distributed control policies for a large variety of multiagent coordination problems. Our approach is related to the classic Lloyd algorithm and exploits the unique features of power diagrams.
@article{PavoneArsieEtAl2011, author = {Pavone, M. and Arsie, A. and Frazzoli, E. and Bullo, F.}, title = {Distributed Algorithms for Environment Partitioning in Mobile Robotic Networks}, journal = {{IEEE Transactions on Automatic Control}}, volume = {56}, number = {8}, pages = {1834--1848}, year = {2011}, doi = {10.1109/TAC.2011.2112410}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Arsie.ea.TAC11.pdf} }
Abstract: Recent years have witnessed great advancements in the science and technology of autonomy, robotics and networking. This paper surveys recent concepts and algorithms for dynamic vehicle routing (DVR), that is, for the automatic planning of optimal multi-vehicle routes to perform tasks that are generated over time by an exogenous process. We consider a rich variety of scenarios relevant for robotic applications. We begin by reviewing the basic DVR problem: demands for service arrive at random locations at random times and a vehicle travels to provide on-site service while minimizing the expected wait time of the demands. Next, we treat different multi-vehicle scenarios based on different models for demands (e.g., demands with different priority levels and impatient demands), vehicles (e.g., motion constraints, communication and sensing capabilities), and tasks. The performance criterion used in these scenarios is either the expected wait time of the demands or the fraction of demands serviced successfully. In each specific DVR scenario, we adopt a rigorous technical approach that relies upon methods from queueing theory, combinatorial optimization and stochastic geometry. First, we establish fundamental limits on the achievable performance, including limits on stability and quality of service. Second, we design algorithms, and provide provable guarantees on their performance with respect to the fundamental limits.
@article{BulloFrazzoliEtAl2011, author = {Bullo, F. and Frazzoli, E. and Pavone, M. and Savla, K. and Smith, S. L.}, title = {Dynamic Vehicle Routing for Robotic Systems}, journal = {{Proc. of the IEEE}}, volume = {99}, number = {9}, pages = {1482--1504}, year = {2011}, doi = {10.1109/JPROC.2011.2158181}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Bullo.Frazzoli.ea.IEEEProc11.pdf} }
Abstract: Transportation-On-Demand (TOD) systems, where users generate requests for transportation from a pick-up point to a delivery point, are already very popular and are expected to increase in usage dramatically as the inconvenience of privately-owned cars in metropolitan areas becomes excessive. Routing service vehicles through customers is usually accomplished with heuristic algorithms. In this paper we study TOD systems in a formal setting that allows us to characterize fundamental performance limits and devise dynamic routing policies with provable performance guarantees. Specifically, we study TOD systems in the form of a unit-capacity, multiple-vehicle dynamic pick-up and delivery problem, whereby pick-up requests arrive according to a Poisson process and are randomly located according to a general probability density. Corresponding delivery locations are also randomly distributed according to a general probability density, and a number of unit-capacity vehicles must transport demands from their pick-up locations to their delivery locations. We derive insightful fundamental bounds on the steady-state waiting times for the demands, and we devise constant-factor optimal dynamic routing policies. Simulation results are presented and discussed.
@inproceedings{PavoneTreleavenEtAl2010, author = {Pavone, M. and Treleaven, K. and Frazzoli, E.}, title = {Fundamental Performance Limits and Efficient Policies for {Transportation-On-Demand} Systems}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2010}, address = {Atlanta, Georgia}, doi = {10.1109/CDC.2010.5717552}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Treleaven.ea.CDC10.pdf} }
Abstract: In this paper we study a dynamic vehicle routing problem where demands have stochastic deadlines on their waiting times. Specifically, a network of robotic vehicles must service demands whose time of arrival, location and on-site service are stochastic; moreover, once a demand arrives, it remains active for a stochastic amount of time, and then expires. An active demand is successfully serviced when one of the vehicles visits its location before its deadline and provides the required on-site service. The aim is to find the minimum number of vehicles needed to ensure that the steady-state probability that a demand is successfully serviced is larger than a desired value, and to determine the policy the vehicles should execute to ensure that this objective is attained. First, we carefully formulate the problem, and we show its well-posedness by providing some novel ergodic results. Second, we provide a lower bound on the optimal number of vehicles; finally, we analyze two service policies, and we show that one of them is optimal in light load. Simulation results are presented and discussed.
@inproceedings{PavoneFrazzoli2010, author = {Pavone, M. and Frazzoli, E.}, title = {Dynamic Vehicle Routing with Stochastic Time Constraints}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2010}, address = {Anchorage, Alaska}, doi = {10.1109/ROBOT.2010.5509222}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.ICRA10.pdf} }
Abstract: In this paper we introduce a dynamic vehicle routing problem in which there are multiple vehicles and multiple priority classes of service demands. Service demands of each priority class arrive in the environment randomly over time and require a random amount of on-site service that is characteristic of the class. To service a demand, one of the vehicles must travel to the demand location and remain there for the required on-site service time. The quality of service provided to each class is given by the expected delay between the arrival of a demand in the class and that demand’s service completion. The goal is to design a routing policy for the service vehicles which minimizes a convex combination of the delays for each class. First, we provide a lower bound on the achievable values of the convex combination of delays. Then, we propose a novel routing policy and analyze its performance under heavy-load conditions (i.e., when the fraction of time the service vehicles spend performing on-site service approaches one). The policy performs within a constant factor of the lower bound, where the constant depends only on the number of classes, and is independent of the number of vehicles, the arrival rates of demands, the on-site service times, and the convex combination coefficients.
@article{SmithPavoneEtAl2010, author = {Smith, S. L. and Pavone, M. and Bullo, F. and Frazzoli, E.}, title = {Dynamic Vehicle Routing with Priority Classes of Stochastic Demands}, journal = {{SIAM Journal on Control and Optimization}}, volume = {48}, number = {5}, pages = {3224--3245}, year = {2010}, doi = {10.1137/090749347}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Smith.Pavone.ea.SIAM10.pdf} }
Abstract: In this paper, distributed control policies for spacecraft formations that draw inspiration from the simple idea of cyclic pursuit are studied. First studied are cyclic-pursuit control laws for both single- and double-integrator models in three dimensions. In particular, control laws are developed that only require relative measurements of position and velocity with respect to the two leading neighbors in the ring topology of cyclic pursuit and that allow convergence to a variety of symmetric formations, including evenly spaced circular and elliptic formations and evenly spaced Archimedes spirals. Second, potential applications are discussed, including spacecraft formation for interferometric imaging. Finally, experimental results obtained by implementing the aforementioned control laws on the Synchronized Position Hold Engage Reorient Experimental Satellite testbed onboard the International Space Station are presented and discussed.
@article{RamirezPavoneEtAl2010, author = {Ramirez, J. L. and Pavone, M. and Frazzoli, E. and Miller, D. W.}, title = {Distributed Control of Spacecraft Formations via Cyclic Pursuit: Theory and Experiments}, journal = {{AIAA Journal of Guidance, Control, and Dynamics}}, volume = {33}, number = {5}, pages = {1655--1669}, year = {2010}, doi = {10.2514/1.46511}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Ramirez.Pavone.ea.JGCD10.pdf} }
Abstract: In this paper we study distributed control policies for spacecraft formations that draw inspiration from the simple idea of cyclic pursuit. First, we extend existing cyclic-pursuit control laws devised for single-integrator models in two dimensions to the case of double-integrator models in three dimensions. In particular, we develop control laws that only require relative measurements of position and velocity with respect to the two leading neighbors in the ring topology of cyclic pursuit, and allow the spacecraft to converge to a variety of symmetric formations, including evenly spaced circular formations and evenly spaced Archimedes’ spirals. Second, we discuss potential applications, including spacecraft coordination for interferometric imaging and convergence to zero-effort orbits. Finally, we present and discuss experimental results obtained by implementing the aforementioned control laws on three nanospacecraft on board the International Space Station.
@inproceedings{RamirezPavoneEtAl2009, author = {Ramirez, J. L. and Pavone, M. and Frazzoli, E. and Miller, D. W.}, title = {Distributed Control of Spacecraft Formation via Cyclic Pursuit: Theory and Experiments}, booktitle = {{American Control Conference}}, year = {2009}, address = {St. Louis, Missouri}, doi = {10.1109/ACC.2009.5160735}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Ramirez.Pavone.ea.ACC09.pdf} }
Abstract: In this paper we study a dynamic vehicle routing problem in which there are multiple vehicles and multiple classes of demands. Demands of each class arrive in the environment randomly over time and require a random amount of on-site service that is characteristic of the class. To service a demand, one of the vehicles must travel to the demand location and remain there for the required on-site service time. The quality of service provided to each class is given by the expected delay between the arrival of a demand in the class, and that demand’s service completion. The goal is to design a routing policy for the service vehicles which minimizes a convex combination of the delays for each class. First, we provide a lower bound on the achievable values of the convex combination of delays. Then, we propose a novel routing policy and analyze its performance under heavy load conditions (i.e., when the fraction of time the service vehicles spend performing on-site service approaches one). The policy performs within a constant factor of the lower bound (and thus the optimal), where the constant depends only on the number of classes, and is independent of the number of vehicles, the arrival rates of demands, the on-site service times, and the convex combination coefficients.
@inproceedings{PavoneSmithEtAl2009, author = {Pavone, M. and Smith, S. L. and Bullo, F. and Frazzoli, E.}, title = {Dynamic Multi-Vehicle Routing with Multiple Classes of Demands}, booktitle = {{American Control Conference}}, year = {2009}, address = {St. Louis, Missouri}, doi = {10.1109/ACC.2009.5160557}, month = jun, url = {/wp-content/papercite-data/pdf/Pavone.Smith.ea.ACC09.pdf}, owner = {bylard}, timestamp = {2017-02-20} }
Abstract: The most widely applied resource allocation strategy is to balance, or equalize, the total workload assigned to each resource. In mobile multi-agent systems, this principle directly leads to equitable partitioning policies in which (i) the workspace is divided into subregions of equal measure, (ii) there is a bijective correspondence between agents and subregions, and (iii) each agent is responsible for service requests originating within its own subregion. In this paper, we provide the first distributed algorithm that provably allows m agents to converge to an equitable partition of the workspace, from any initial configuration, i.e., globally. Our approach is related to the classic Lloyd algorithm, and provides novel insights into the properties of power diagrams. Simulation results are presented and discussed.
@inproceedings{PavoneArsieEtAl2009, author = {Pavone, M. and Arsie, A. and Frazzoli, E. and Bullo, F.}, title = {Equitable Partitioning Policies for Robotic Networks}, booktitle = {{Proc. IEEE Conf. on Robotics and Automation}}, year = {2009}, address = {Kobe, Japan}, doi = {10.1109/ROBOT.2009.5152809}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Arsie.ea.ICRA09.pdf} }
Abstract: In this article, we discussed the use of various spatial tessellations to determine, in the framework of partitioning policies, optimal workload share in a mobile robotic network. We also proposed efficient and spatially distributed algorithms for achieving some of these tessellations with minimum or no communication between the agents. Because of space limitations, we have not reported results of numerical experiments in this article but provided bibliographic references to publications containing such results and further details. It is interesting to note that these tessellations appear while considering different variations of the same basic problem (DTRP). It is then natural to investigate the existence of a single objective function, whose optima correspond to the various tessellations under these different variations. The game theory approach seems to be a promising one.
@article{PavoneSavlaEtAl2009, author = {Pavone, M. and Savla, K. and Frazzoli, E.}, title = {Sharing the Load}, journal = {{IEEE Robotics and Automation Magazine}}, volume = {16}, number = {2}, pages = {52--61}, year = {2009}, doi = {10.1109/MRA.2009.932528}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Savla.ea.RAM09.pdf} }
Abstract: In this paper, we study the problem of designing motion strategies for a team of mobile agents, required to fulfill requests for on-site service in a given planar region. In our model, each service request is generated by a spatio-temporal stochastic process; once a service request has been generated, it remains active for a certain deterministic amount of time, and then expires. An active service request is fulfilled when one of the mobile agents visits the location of the request. Specific problems we investigate are the following: what is the minimum number of mobile agents needed to ensure that a certain fraction of service requests is fulfilled before expiration? What strategy should they use to ensure that this objective is attained? This problem can be viewed as the stochastic and dynamic version of the well-known vehicle routing problem with time windows. We also extend our analysis to the case in which the time service requests remain active is itself a random variable, describing customer impatience. The customers’ impatience is only known to the mobile agents via prior statistics. In this case, it is desired to minimize the fraction of service requests missed because of impatience. Finally, we show how the routing strategies presented in the paper can be executed in a distributed fashion.
@article{PavoneBisnikEtAl2009, author = {Pavone, M. and Bisnik, N. and Frazzoli, E. and Isler, V.}, title = {A Stochastic and Dynamic Vehicle Routing Problem with Time Windows and Customer Impatience}, journal = {{Journal of Mobile Networks and Applications}}, volume = {14}, number = {3}, pages = {350--364}, year = {2009}, doi = {10.1007/s11036-008-0101-1}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Bisnik.ea.MONE09.pdf} }
Abstract: In this paper we study a variation of the Dynamic Traveling Repairperson Problem (DTRP) in which there are two classes of demands: high priority and low priority. In the problem, demands arrive in the environment randomly over time and assume a random location and on-site service requirement. A service vehicle must travel to each demand location and provide the required on-site service. The quality of service provided to each class of demands is measured by the expected delay between a demand’s arrival and its service completion. The goal is to design policies for the service vehicle which minimize a convex combination of the delays for each class. We provide a lower bound on the achievable delay for this problem, and propose a policy which performs within a known constant factor of the optimal in heavy load (i.e., when the fraction of time the service vehicle spends performing on-site service approaches one). The problem studied in this paper is analogous to the multi-class queuing problem in classical queuing theory.
@inproceedings{SmithPavoneEtAl2008, author = {Smith, S. L. and Pavone, M. and Bullo, F. and Frazzoli, E.}, title = {Dynamic Vehicle Routing with Heterogeneous Demands}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2008}, address = {Cancun, Mexico}, doi = {10.1109/CDC.2008.4739284}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Smith.Pavone.ea.CDC08.pdf} }
Abstract: The most widely applied resource allocation strategy is to balance, or equalize, the total workload assigned to each resource. In mobile multi-agent systems, this principle directly leads to equitable partitioning policies in which (i) the workspace is divided into subregions of equal measure, (ii) each agent is assigned to a unique subregion, and (iii) each agent is responsible for service requests originating within its own subregion. In this paper, we design distributed and adaptive policies to allow a team of agents to achieve a convex and equitable partition of a convex workspace. Our approach is related to the classic Lloyd algorithm, and exploits the unique features of Power Diagrams. We discuss possible applications to routing of vehicles in stochastic and dynamic environments, and to wireless networks. Simulation results are presented and discussed.
@inproceedings{PavoneFrazzoliEtAl2008, author = {Pavone, M. and Frazzoli, E. and Bullo, F.}, title = {Distributed Policies for Equitable Partitioning: Theory and Applications}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2008}, address = {Cancun, Mexico}, doi = {10.1109/CDC.2008.4739483}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.ea.CDC08.pdf} }
Abstract: We present decentralized algorithms for a class of stochastic and dynamic vehicle routing problems, known as the multiple-vehicle dynamic traveling repairperson problem (m-DTRP), in which demands arrive randomly over time and their locations have an arbitrary distribution, and the objective is to minimize the average waiting time between the appearance of a demand and the time it is visited by a vehicle. The best previously known control algorithms rely on centralized, a-priori task assignment, and are therefore of limited applicability in scenarios involving large ad-hoc networks of autonomous vehicles. By combining results from geometric probability and locational optimization, we provide a policy that solves, providing a constant-factor approximation to the optimal achievable performance, the decentralized version of the m-DTRP; such policy (i) does not rely on centralized and a priori task assignment, (ii) is spatially distributed, scalable to large networks, and adaptive to network changes. Simulation results are presented and discussed.
@inproceedings{PavoneFrazzoliEtAl2007, author = {Pavone, M. and Frazzoli, E. and Bullo, F.}, title = {Decentralized Algorithms for Stochastic and Dynamic Vehicle Routing with General Demand Distribution}, booktitle = {{Proc. IEEE Conf. on Decision and Control}}, year = {2007}, address = {New Orleans, Louisiana}, doi = {10.1109/CDC.2007.4434989}, month = dec, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.ea.CDC07.pdf} }
Abstract: Consider the following scenario: a spatio-temporal stochastic process generates service requests, localized at points in a bounded region on the plane; these service requests are fulfilled when one of a team of mobile agents visits the location of the request. For example, a service request may represent the detection of an event in a sensor network application, which needs to be investigated on site. Once a service request has been generated, it remains active for an amount of time which is itself a random variable, and then expires. The problem we investigate is the following: What is the minimum number of mobile agents needed to ensure that each service request is fulfilled before expiring, with probability at least 1 - ε? What strategy should they use to ensure this objective is attained? Formulating the probability of successfully servicing requests before expiration as a performance metric, we derive bounds on the minimum number of agents required to ensure a given performance level, and present decentralized motion coordination algorithms that approximate the optimal strategy.
@inproceedings{PavoneBisnikEtAl2007, author = {Pavone, M. and Bisnik, N. and Frazzoli, E. and Isler, V.}, title = {Decentralized Vehicle Routing in a Stochastic and Dynamic Environment with Customer Impatience}, booktitle = {{Int. Conf. on Robot Communication and Coordination}}, year = {2007}, address = {Athens, Greece}, doi = {10.4108/icst.robocomm2007.2220}, month = oct, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Bisnik.ea.Robocomm07.pdf} }
Abstract: This paper presents a decentralized control policy for symmetric formations in multi-agent systems. It is shown that n agents, each one pursuing its leading neighbor along the line of sight rotated by a common offset angle α, eventually converge to a single point, a circle or a logarithmic spiral pattern, depending on the value of α. Simulation results are presented and discussed.
@inproceedings{PavoneFrazzoli2007b, author = {Pavone, M. and Frazzoli, E.}, title = {Decentralized Policies for Geometric Pattern Formation}, booktitle = {{American Control Conference}}, year = {2007}, address = {New York, New York}, doi = {10.1109/ACC.2007.4283108}, month = jul, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.ACC07.pdf} }
Abstract: This paper presents a decentralized control policy for symmetric formations in multiagent systems. It is shown that n agents, each one pursuing its leading neighbor along the line of sight rotated by a common offset angle α, eventually converge to a single point, a circle or a logarithmic spiral pattern, depending on the value of α. In the final part of the paper, we present a strategy to make the agents totally anonymous, and we discuss a potential application to coverage path planning.
@article{PavoneFrazzoli2007, author = {Pavone, M. and Frazzoli, E.}, title = {Decentralized Policies for Geometric Pattern Formation and Path Coverage}, journal = {{ASME Journal of Dynamic Systems, Measurement, and Control}}, volume = {129}, number = {5}, pages = {633--643}, year = {2007}, doi = {10.1115/1.2767658}, url = {/wp-content/papercite-data/pdf/Pavone.Frazzoli.JDSMC07.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This paper describes a general approach for the adaptive supervised learning of behaviors in a behavior-based robot. The key idea is to formalize a behavior produced by a Motor Map driven by an internal adaptive reward function. The aim of the adaptive reward function is to select the most significant sensory inputs and to use them in the best way. The greatest challenge is to keep the search space small. Motor map learning relies on the classical Kohonen algorithm, while the structure of the reward function is learnt through a non-associative reinforcement learning algorithm. Simulation results on a six-legged biologically-inspired robot confirm the suitability of the approach. This methodology allows the human designer to easily embody all the a priori knowledge on the robot controller, while providing at the same time a high degree of adaptability and robustness against sensory malfunctions.
@inproceedings{ArenaFortunaEtAl2006, author = {Arena, P. and Fortuna, L. and Frasca, M. and Patan\`e, L. and Pavone, M.}, title = {Towards Autonomous Adaptive Behavior in a Bio-inspired {CNN}-controlled Robot}, booktitle = {{Int. Symp. on Circuits and Systems}}, year = {2006}, address = {Kos, Greece}, doi = {10.1109/ISCAS.2006.1692549}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Arena.Fortuna.ea.ISCAS06a.pdf} }
Abstract: This paper describes the implementation of a bio-inspired six-legged robot: Gregor I. Both structure and locomotion control are inspired by biological observations in cockroaches. Robot mechanics attempts to emulate main structural features in cockroaches, like self-stabilizing posture and specializing legged function; in turn, locomotion control is based on the theory of the central pattern generator implemented on a VLSI chip. The final aim is to artificially replicate the fundamental principles that guarantee the cockroach’s extraordinary agility. Our major concern was on the implementation of rear legs, which seem to play a crucial role in obstacle overcoming and payload capability, and on the locomotion control, performed in this work by a cellular neural network playing the role of an artificial central pattern generator. Experimental tests showed that Gregor I is able to walk at the travel speed of 0.1 body length per second and to successfully negotiate obstacles more than 170% of the height of its mass center.
@inproceedings{ArenaFortunaEtAl2006b, author = {Arena, P. and Fortuna, L. and Frasca, M. and Patan\`e, L. and Pavone, M.}, title = {Realization of a {CNN}-Driven Cockroach-Inspired Robot}, booktitle = {{Int. Symp. on Circuits and Systems}}, year = {2006}, address = {Kos, Greece}, doi = {10.1109/ISCAS.2006.1693168}, month = may, url = {/wp-content/papercite-data/pdf/Arena.Fortuna.ea.ISCAS06b.pdf}, owner = {bylard}, timestamp = {2017-01-28} }
Abstract: This paper addresses the design of a six-legged robot for planetary exploration. The robot is specifically designed for uneven terrains and is biologically inspired on different levels: mechanically as well as in control. A novel structure is developed based on a (careful) emulation of the cockroach, whose extraordinary agility and speed are principally due to its self-stabilizing posture and specializing legged function. Structure design enhances these properties, in particular with an innovative piston-like scheme for rear legs, while avoiding an excessive and useless complexity. Locomotion control is designed following an analog electronics approach, which in space applications could hold many benefits. In particular, the locomotion control is based on a Cellular Neural Network playing the role of an artificial Central Pattern Generator. Several dynamical simulations were carried out to test the structure and the locomotion control. Simulation results led to the implementation of the first prototype: Gregor I. Experimental tests showed that Gregor I is able to walk at the travel speed of 0.1 body length per second and to successfully negotiate obstacles more than 170% of the height of its center of mass.
@article{PavoneArenaEtAl2006b, author = {Pavone, M. and Arena, P. and Patan\`e, L.}, title = {An Innovative Mechanical and Control Architecture for a Biomimetic Hexapod for Planetary Exploration}, journal = {{Space Technology}}, volume = {26}, number = {1-2}, pages = {13--24}, year = {2006}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Arena.ea.ST06.pdf} }
Abstract: A control system based on the principles used by cockroaches to climb obstacles is introduced and applied to a bio-inspired hexapod robot. Cockroaches adaptively use different strategies as functions of the ground morphology and obstacle characteristics. The control system introduced in this paper consists of two parts working in parallel. Locomotion control is performed by a cellular neural network (CNN) playing the role of an artificial central pattern generator (CPG) for the robot, while a new attitude control system has been designed. In order to reproduce the adaptive capabilities of the biological model, the attitude control system is based on a motor map and is aimed at regulating the posture of the robot to allow it to overcome obstacles. In fact, high obstacles require the locomotion gait to be reorganized by changing the posture of the robot to be more effective during the overcoming of the obstacle. Both proprioceptive and exteroceptive information are needed to solve this problem; they constitute the input of the adaptive attitude control. Simulation results illustrating the suitability of the control system are also shown.
@article{PavoneArenaEtAl2006, author = {Pavone, M. and Arena, P. and Fortuna, L. and Frasca, M. and Patan\`e, L.}, title = {Climbing Obstacle in Bio-robots via {CNN} and Adaptive Attitude Control}, journal = {{Int. Journal of Circuit Theory and Applications}}, volume = {34}, number = {1}, pages = {109--125}, year = {2006}, doi = {10.1109/ISCAS.2005.1465810}, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Arena.ea.IJCTA06.pdf} }
Abstract: This paper addresses the design of a six-legged robot for planetary exploration. The robot is specifically designed for uneven terrains and is biologically inspired on different levels: mechanically as well as in control. A novel structure is developed based on a (careful) emulation of the cockroach, whose extraordinary agility and speed are principally due to its self-stabilizing posture and specializing legged function. Structure design enhances these properties, in particular with an innovative piston-like scheme for rear legs, while avoiding an excessive and useless complexity. Locomotion control is designed following an analog electronics approach, which in space applications could hold many benefits. In particular, the locomotion control is based on a Cellular Neural Network playing the role of an artificial Central Pattern Generator. Several dynamical simulations were carried out to test the structure and the locomotion control. Simulation results led to the implementation of the first prototype: Gregor I. Experimental tests showed that Gregor I is able to walk at the travel speed of 0.1 body length per second and to successfully negotiate obstacles more than 170% of the height of its center of mass.
@inproceedings{PavoneArenaEtAl2005, author = {Pavone, M. and Arena, P. and Patan\`e, L.}, title = {An Innovative Mechanical and Control Architecture for a Biomimetic Hexapod for Planetary Exploration}, booktitle = {{Int. Astronautical Congress}}, year = {2005}, note = {Paper \# IAC-05-A3.2.B.09}, address = {Fukuoka, Japan}, doi = {10.2514/6.IAC-05-A3.2.B.09}, month = oct, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Arena.IAC05.pdf} }
Abstract: A control system based on the principles used by cockroaches to climb obstacles is introduced and applied to a bio-inspired hexapod robot. Cockroaches adaptively use different strategies as functions of the ground morphology and obstacle characteristics. The control system introduced in this paper consists of two parts working in parallel. Locomotion control is performed by a cellular neural network (CNN) playing the role of an artificial central pattern generator (CPG) for the robot, while a new attitude control system has been designed. In order to reproduce the adaptive capabilities of the biological model, the attitude control system is based on a motor map and is aimed at regulating the posture of the robot to allow it to overcome obstacles. In fact, high obstacles require the locomotion gait to be reorganized by changing the posture of the robot to be more effective during the overcoming of the obstacle. Both proprioceptive and exteroceptive information are needed to solve this problem; they constitute the input of the adaptive attitude control. Simulation results illustrating the suitability of the control system are also shown.
@inproceedings{ArenaFortunaEtAl2005, author = {Arena, P. and Fortuna, L. and Frasca, M. and Patan\`e, L. and Pavone, M.}, title = {Climbing Obstacles via Bio-Inspired {CNN-CPG} and Adaptive Attitude Control}, booktitle = {{Int. Symp. on Circuits and Systems}}, year = {2005}, address = {Kobe, Japan}, doi = {10.1109/ISCAS.2005.1465810}, month = may, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Arena.Fortuna.ea.ISCAS05.pdf} }
Abstract: Autonomous systems are increasingly nearing widespread adoption, with new robotic platforms constantly being tested and deployed alongside humans in domains such as autonomous driving, service robotics, and surveillance. Accordingly, human-robot interaction will soon be present in many everyday scenarios. However, there are still many challenges preventing autonomous systems from safely and smoothly navigating interactions with humans. For example, while merging into traffic is one of the most common day-to-day maneuvers we perform as drivers, it poses a major problem for state-of-the-art self-driving vehicles. The reason humans can naturally navigate through many social interaction scenarios, such as merging in traffic, is that humans have an intrinsic capacity to reason about other people’s intents, beliefs, and desires, applying this reasoning to predict what might happen in the future and make corresponding decisions. As a result, imbuing autonomous systems with the ability to reason about other agents’ potential future actions is critical to enabling informed decision making and proactive actions to be taken in human-robot interaction scenarios. Indeed, the ability to predict other agents’ behaviors (also known as "trajectory forecasting") has already become a core component of modern robotic systems, especially so in safety-critical applications such as autonomous vehicles. Towards this end, this dissertation tackles the development of trajectory forecasting methods, their effective integration within the robotic autonomy stack, and the injection of task-awareness in their performance evaluation.
@phdthesis{Ivanovic2021, author = {Ivanovic, B.}, title = {Trajectory Forecasting in the Modern Robotic Autonomy Stack}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = dec, url = {https://stacks.stanford.edu/file/druid:nw436bv8593/IvanovicPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Advances in sensing and actuation capabilities have allowed for the proliferation of robots across many fields, including aerial, industrial, and automotive applications. A driving factor in being able to deploy such robots in everyday applications is algorithms that imbue real-time decision-making capabilities. Such decision-making capabilities can be formulated using the modeling framework of optimization programs. However, such optimization-based approaches are still limited by computational resources available on robot platforms. For example, in many aerospace applications, spacecraft robotic systems are equipped with embedded computers much less capable than the hardware typically used to solve such optimization problems. Thus, there is a pressing need to be able to scale and extend optimization-based planning and control algorithms to robotics applications with severely constrained computational resources. In this work, we turn towards recent advances in nonlinear optimization, supervised learning, and control theory to accelerate solving optimization-based controllers for online deployment. We then show how data-driven approaches can exploit powerful computational resources offline to learn the underlying structure of optimization problems such that the online decision-making problem can be reduced to an approximate problem that is much easier to solve on embedded computers. In the first part of this dissertation, we present a local trajectory optimization framework known as Guaranteed Sequential Trajectory Optimization (GuSTO) that provides a theoretically-motivated algorithm that iteratively solves a series of convex optimization problems until convergence. We demonstrate how this framework can accommodate a broad class of trajectory optimization problems, including free final-time problems, free final-state problems, and problems on a manifold. 
We further discuss how GuSTO enables new applications, specifically in the domain of spacecraft robotic manipulation, and discuss the development of a novel gecko-inspired adhesive robot gripper design for the Astrobee assistive free-flying robot. In the second part of this dissertation, we turn towards global trajectory optimization problems, specifically those that can be formulated as mixed-integer convex programs (MICPs). MICPs are a popular modeling framework that can be used to model planning and control problems that are inherently combinatorial or discrete. However, existing algorithms fall short in being able to provide reliable solution approaches that can be deployed for real-time applications (i.e., 10-100 Hz computation rates) on embedded systems. In this work, we turn towards data-driven approaches that can be used to find high-quality feasible solutions to such MICPs and present Combinatorial Offline, Convex Online (CoCo). We demonstrate how such approaches can leverage the underlying structure of optimal control problems and compare our proposed approach against state-of-the-art commercial solvers. Numerical simulations are provided throughout this work to demonstrate the efficacy of our proposed approach, and we present hardware results on a free-flying spacecraft robotic test bed.
@phdthesis{Cauligi2021, author = {Cauligi, A.}, title = {Data-Driven Approaches for Mixed Integer Convex Programming in Robot Control}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = dec, url = {https://stacks.stanford.edu/file/druid:mx142wx7479/Cauligi-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Robots operating in the unstructured environments of the real world must contend with at least two sources of geometric complexity: (1) the differential geometric complexity of robot configuration spaces and task spaces, which can in practice be general non-Euclidean manifolds, and (2) the complexity of the geometric shape of robot links and obstacles in the environment, which have infinite variability and are often highly nonconvex. Mature robot autonomy requires algorithms that can tackle these sources of geometric complexity with precision and at real-time control and planning speeds. This thesis focuses on bridging existing gaps in previous methods to meet these needs. In particular, to address differential geometric complexity in motion design, we first present a framework called Multi-Task Pullback Bundle Dynamical Systems (PBDS), which is a geometric control methodology for forming fast composable geometric motion policies, respecting simultaneous robotic tasks on non-Euclidean robot and task manifolds. Second, we present an embedded sequential convex programming approach which exploits differential geometric structure to eliminate explicit manifold-type constraints in trajectory optimization while still guaranteeing final satisfaction of these constraints. Together, these approaches correctly enforce the differential geometric structure of robot planning and control problems while maintaining similar or greater computational efficiency compared to past approaches. Finally, we address the shape complexity of robot links and obstacles by repurposing hardware-accelerated ray-tracing (i.e., ray-tracing cores) for rapidly forming collision avoidance constraints in trajectory optimization, particularly targeting complex robot and obstacle triangle mesh representations. 
This enables robot motion designers to leverage the full complexity of robot and obstacle geometries at speeds orders of magnitude faster than currently available collision-checking libraries, while leveraging recently-developed ray-tracing cores which have previously had little utility in the robot autonomy stack.
@phdthesis{Bylard2021, author = {Bylard, A.}, title = {Leveraging the Geometric Structure of Robotic Tasks for Motion Design}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = dec, url = {https://stacks.stanford.edu/file/druid:jw453mh1486/BylardPhDThesis-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Many physical systems are modeled by finite-dimensional sets of ordinary differential equations (ODEs). Others have dynamics that evolve over a continuum (i.e. are infinite-dimensional) and are best modeled by partial differential equations (PDEs), including systems with fluid flows, deformable/flexible structures, or fluid-structure interaction. In practice, PDE models are generally semi-discretized to produce high-fidelity finite-dimensional ODE models. Since the dimension of these models can range from thousands to millions, computational challenges severely limit the use of standard approaches to model-based controller design. In this thesis, we propose an approach for efficiently designing high-performing controllers based on high-dimensional models. Specifically, we develop a model predictive control (MPC) algorithm for solving constrained optimal control problems that leverages high-fidelity, but low-dimensional, reduced order approximations of the original model to satisfy practical computational requirements. In the linear setting, we combine existing ideas from tube MPC with novel approaches for controller synthesis and analysis to develop a reduced order MPC (ROMPC) scheme for solving robust, output feedback control problems, and we provide theoretical closed-loop performance guarantees that explicitly account for model reduction error. We also extend the ROMPC scheme to the nonlinear setting by exploiting piecewise-affine reduced order models. We motivate and validate the proposed approach through two case studies. First, we use a linear, coupled rigid-body/fluid dynamics model for aircraft control, where the high-dimensional computational fluid dynamics (CFD) model has over one million dimensions. Second, we use a nonlinear finite element model (FEM) with over ten thousand dimensions to control a soft robot. Simulation and hardware experiments are used in both studies to demonstrate the practicality and performance of ROMPC.
@phdthesis{Lorenzetti2021, author = {Lorenzetti, J.}, title = {Reduced Order Model Predictive Control of High-Dimensional Systems}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = aug, url = {https://stacks.stanford.edu/file/druid:xb656xk9170/LorenzettiPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Advances in the fields of artificial intelligence and machine learning have unlocked a new generation of robotic systems—"learning-enabled" robots that are designed to operate in unstructured, uncertain, and unforgiving environments, especially settings where robots are required to interact in close proximity with humans. However, as learning-enabled methods, especially "deep" learning, continue to become more pervasive throughout the autonomy stack, it also becomes increasingly difficult to ascertain the performance and safety of these robotic systems and explain their behavior, necessary prerequisites for their deployment in safety-critical settings. This dissertation develops methods drawing upon techniques from the field of formal methods, namely Hamilton-Jacobi (HJ) reachability and Signal Temporal Logic (STL), to complement a learning-enabled robot autonomy stack, thereby leading to safer and more robust robot behavior. The first part of this dissertation investigates the problem of providing safety assurance for human-robot interactions, safety-critical settings wherein robots must reason about the uncertainty in human behavior to achieve seamless interactions with humans. Specifically, we develop a two-step approach where we first develop a learning-based human behavior prediction model tailored towards proactive robot planning and decision-making, which we then couple with a reachability-based safety controller that minimally intervenes whenever the robot is near safety violation. The approach is validated through human-in-the-loop simulation as well as on an experimental vehicle platform, demonstrating clear connections between theory and practice. The second part of this dissertation examines the use of STL as a formal language to incorporate logical reasoning into robot learning. In particular, we develop a technique, named STLCG, that casts STL into the same computational language as deep neural networks. 
Consequently, by using STLCG to express designers’ domain expertise into a form compatible with neural networks, we can embed domain knowledge into learned components within the autonomy stack to provide additional levels of robustness and interpretability.
@phdthesis{Leung2021, author = {Leung, K.}, title = {On Using Formal Methods for Safe and Robust Robot Autonomy}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = aug, url = {https://stacks.stanford.edu/file/druid:kk533yj7758/Leung_PhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Autonomous robots have the potential to free humans from dangerous or dull work. To achieve truly autonomous operation, robots must be able to understand unstructured environments and make safe decisions in the face of uncertainty and non-stationarity. As such, robots must be able to learn about, and react to, changing operating conditions or environments continuously, efficiently, and safely. While the last decade has seen rapid advances in the capabilities of machine learning systems driven by deep learning, these systems are limited in their ability to adapt online, learn with small amounts of data, and characterize uncertainty. The desiderata of learning robots therefore directly conflict with the weaknesses of modern deep learning systems. This thesis aims to remedy this conflict and develop robot learning systems that are capable of learning safely and efficiently. In the first part of the thesis we develop tools for efficient learning in changing environments. In particular, we develop tools for the meta-learning problem setting—in which data from a collection of environments may be used to accelerate learning in a new environment—in both the regression and classification setting. These algorithms are based on exact Bayesian inference on meta-learned features. This approach enables characterization of uncertainty in the face of small amounts of within-environment data, and efficient learning via exact conditioning. We extend these approaches to time-varying settings beyond episodic variation, including continuous gradual environmental variation and sharp, changepoint-like variation. In the second part of the thesis we adapt these tools to the problem of robot modeling and control. In particular, we investigate the problem of combining our neural network-based meta-learning models with prior knowledge in the form of a nominal dynamics model, and discuss design decisions to yield better performance and parameter identification. 
We then develop a strategy for safe learning control. This strategy combines methods from modern constrained control—in particular, robust model predictive control—with ideas from classical adaptive control to yield a computationally efficient, simple to implement, and guaranteed safe control strategy capable of learning online. We conclude the thesis with a discussion of short, intermediate, and long-term next steps in extending the ideas developed herein toward the goal of true robot autonomy.
@phdthesis{Harrison2021, author = {Harrison, J.}, title = {Uncertainty and Efficiency in Adaptive Robot Learning and Control}, school = {{Stanford University, Dept. of Mechanical Engineering}}, year = {2021}, address = {Stanford, California}, month = aug, url = {https://stacks.stanford.edu/file/druid:hh754jn1534/James_Harrison_Thesis-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: In this dissertation, we investigate an up-and-coming class of mathematical programs, bilevel optimization, and how it can be leveraged to tackle the most pressing algorithmic challenges of control in robotics. We give an overview of our work on bilevel optimization, where two mathematical programs are nested within one another, and our progress on leveraging this class of problems to move us closer to computationally tractable control of nonlinear systems. Specifically, we demonstrate how it is possible to design novel solution methods that utilize advances in automatic differentiation while retaining the benefits of state-of-the-art constrained nonlinear optimization solvers. We also demonstrate how particularly challenging problems of nonlinear control such as planning through contact, adversarial learning of value functions, and Lyapunov synthesis can all surprisingly be tackled by explicitly addressing them as bilevel optimization problems.
@phdthesis{Landry2021, author = {Landry, B.}, title = {Differentiable and Bilevel Optimization for Control in Robotics}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2021}, address = {Stanford, California}, month = jun, url = {https://stacks.stanford.edu/file/druid:bw199zy3697/LandryPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Integrating autonomous robots into safety-critical settings requires reasoning about uncertainty at all levels of the autonomy stack. This thesis presents novel algorithmic tools for imbuing robustness within two hierarchically complementary areas, namely motion planning and decision-making. In Part I of the thesis, by harnessing the theories of contraction and semi-infinite convex optimization and the computational tool of sum-of-squares programming, we present a unified framework for robust real-time motion planning for complex underactuated nonlinear systems. Broadly, the approach entails pairing open-loop motion planning algorithms that neglect uncertainty and are optimized for generating trajectories for simple kinodynamic models in real-time, with robust nonlinear trajectory-tracking feedback controllers. We demonstrate how to systematically synthesize these controllers and integrate them within planning to generate and execute certifiably safe trajectories that are robust to the closed-loop effects of disturbances and planning with simplified models. In Part II of the thesis, we demonstrate how to embed the control-theoretic advancements developed in Part I as constraints within a novel semi-supervised algorithm for learning dynamical systems from user demonstrations. The constraints act as a form of context-driven hypothesis pruning to yield learned models that jointly balance regression performance and stabilizability, ultimately resulting in generated trajectories for the robot that are conditioned for feedback control. Experimental results on a quadrotor testbed illustrate the efficacy of the proposed algorithms in Parts I and II of the thesis, and clear connections between theory and hardware. Finally, in Part III of the thesis, we describe a framework for lifting notions of robustness from low-level motion planning to higher-level sequential decision-making using the theory of risk measures. 
Leveraging a class of risk measures with favorable axiomatic foundations, we demonstrate how to formulate decision-making algorithms with tunable robustness properties. In particular, we focus on a novel application of this framework to inverse reinforcement learning where we learn predictive motion models for humans in safety-critical scenarios, and illustrate their effectiveness within a commercial driving simulator featuring humans in the loop. The contributions within this thesis constitute an important step towards endowing modern robotic systems with the ability to systematically and hierarchically reason about safety and efficiency in the face of uncertainty, which is crucial for safety-critical applications.
@phdthesis{Singh2019, author = {Singh, S.}, title = {Robust Control, Planning, and Inference for Safe Robot Autonomy}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2019}, address = {Stanford, California}, month = aug, url = {https://stacks.stanford.edu/file/druid:pr731qc2534/Singh-PhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: The last decade saw the rapid development of two major mobility paradigms: Mobility-on-Demand (MoD) systems (e.g. ridesharing, carsharing) and self-driving vehicles. While individually impactful, together they present a major paradigm shift in modern mobility. Autonomous Mobility-on-Demand (AMoD) systems, wherein a fleet of self-driving vehicles serves on-demand travel requests, present a unique opportunity to alleviate many of our transportation woes. Specifically, by combining fully-compliant vehicles with central coordination, AMoD systems can achieve system-level optimal strategies via, e.g., coordinated routing and preemptive dispatch. This thesis presents methods to model, analyze and control AMoD systems. In particular, special emphasis is given to developing stochastic algorithms that can cope with the uncertainty inherent to travel demand. In the first part, we present a steady-state modeling framework built on queueing networks and network flow theory. By casting the system as a multi-class BCMP network, the framework provides analysis tools that allow the characterization of performance metrics for a given routing policy, in terms, e.g., of vehicle availabilities, and first and second order moments of vehicle throughput. Moreover, we present a scalable method for the synthesis of routing policies, with performance guarantees in the limit of large fleet sizes. The framework provides a large set of modeling options, and specifically addresses cases where the operational concerns of congestion and battery charge level are considered. We validate our theoretical results on a case study of New York City. In the second part, we leverage the insights provided by the steady-state models to present real-time control algorithms. Specifically, we cast the real-time control problem within a stochastic model predictive control framework. The control loop consists of a forecasting generative model and a stochastic optimization subproblem. 
At each time step, the generative model first forecasts a finite number of travel demand scenarios over a finite horizon, and then we solve the stochastic subproblem via Sample Average Approximation. We show via simulation that this approach is more robust to uncertain demand and vastly outperforms state-of-the-art fleet-level control algorithms. Finally, we validate the presented frameworks by deploying a fleet control application in a carsharing system in Japan. The application uses the aforementioned algorithms to provide, in real time, tasks to the carsharing employees regarding actions to be taken to better meet customer demand. Results show significant improvement over human-based decision making.
@phdthesis{Iglesias2019, author = {Iglesias, R. D.}, title = {Stochastic Modeling and Control of Autonomous Mobility-on-Demand Systems}, school = {{Stanford University, Inst. for Civil and Environmental Engineering}}, year = {2019}, address = {Stanford, California}, month = aug, url = {https://stacks.stanford.edu/file/druid:mm997fz9077/IglesiasPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Advances in mobile robot autonomy are poised to transform society: there is enormous demand for robots that can handle our commutes, manage our homes, provide assistance to our loved ones, and explore places too dangerous or too distant for us humans to set foot. Before this potential may be realized, however, robots must be able to contend with uncertainties arising from the unstructured, unpredictable, and often unforgiving world they aim to enter. This dissertation develops concepts, algorithms, and modeling frameworks for quantifying uncertainty in how a robot plans its course of action, how it carries out that plan, and how it interacts with its environment (in particular, with the humans around it). In each of these cases, these tools are motivated by the purpose of yielding actionable insights that improve the quality and computational efficiency of robot planning and decision making.
@phdthesis{Schmerling2019, author = {Schmerling, E.}, title = {Multimodal Modeling and Uncertainty Quantification for Robot Planning and Decision Making}, school = {{Stanford University, Inst. for Computational \& Mathematical Engineering}}, year = {2019}, address = {Stanford, California}, month = mar, url = {https://stacks.stanford.edu/file/druid:xx848nq8857/SchmerlingPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Motion planning is a fundamental problem in robotics, whereby one seeks to compute a low-cost trajectory from an initial state to a goal region that avoids any obstacles. Sampling-based motion planning algorithms have emerged as an effective paradigm for planning with complex, high-dimensional robotic systems. These algorithms maintain only an implicit representation of the state space, constructed by sampling the free state space and locally connecting samples (under the supervision of a collision checking module). This thesis presents approaches towards enabling real-time and robust sampling-based motion planning with improved sampling strategies and massive parallelism. In the first part of this thesis, we discuss algorithms to leverage massively parallel hardware (GPUs) to accelerate planning and to consider robustness during the planning process. We present an algorithm capable of planning at rates amenable to application within control loops (10 ms). This algorithm uses approximate dynamic programming to explore the state space in a massively-parallel, near-optimal manner. We further present two algorithms capable of real-time, uncertainty-aware and perception-aware motion planning that exhaustively explore the state space via a multiobjective search. This search identifies a Pareto set of promising paths (in terms of cost and robustness) and certifies their robustness via Monte Carlo methods. We demonstrate the effectiveness of these algorithms in numerical simulations and a physical experiment on a quadrotor. In the second part of this thesis, we examine sampling strategies for probing the state space; traditionally this has been uniform, independent, and identically distributed (i.i.d.) random points. We present a methodology for biasing the sample distribution towards regions of the state space in which the solution trajectory is likely to lie. 
This distribution is learned via a conditional variational autoencoder, allowing a general methodology, which can be used in combination with any sampling-based planner and can effectively exploit the underlying structure of a planning problem while maintaining the theoretical guarantees of sampling-based approaches. We also analyze the use of deterministic, low-dispersion samples instead of i.i.d. random points. We show that this allows deterministic asymptotic optimality (as opposed to probabilistic), a convergence rate bound in terms of the sample dispersion, reduced computational complexity, and improved practical performance. The technical approaches in this work are applicable to general robotic systems and lay the foundations of robustness and algorithmic speed required for robotic systems operating in the world.
@phdthesis{Ichter2018, author = {Ichter, B.}, title = {Massive Parallelism and Sampling Strategies for Robust and Real-Time Robotic Motion Planning}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2018}, address = {Stanford, California}, month = sep, url = {https://stacks.stanford.edu/file/druid:xm179nc3440/IchterSubmitPhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Robots are often designed for dangerous environments such as severe storms, but routing algorithms rarely are. This dissertation introduces a new class of routing problems with "risky traversal" where a robot may fail when travelling between two sites. Our key insight is that many objectives in the risky traversal model satisfy a diminishing returns property known as submodularity. We develop a set of tools based on submodular optimization which lead to efficient solutions for a wide variety of problems: (1) The "Team Surviving Orienteers" (TSO) problem, where the size of the team is fixed and we seek routes which maximize the expected rewards collected at nodes, subject to survival probability constraints on each robot. (2) The "On-line TSO" problem, where observations are incorporated to update the paths on-line in a parallel fashion (in response to survival events). (3) The "Heterogeneous TSO" problem, which allows robots to have different capabilities such as sensors (affecting rewards collected), actuation (affecting ability to traverse between sites), and robustness (affecting survival probabilities). (4) The "Matroid TSO" problem, where the set of routes must satisfy an "independence" constraint represented by a matroid, for example limits on the number of each type of robot available, traffic through regions of the environment, total risk budgets for the team, or combinations of these limits. (5) The "Risk Sensitive Coverage" problem, which is the dual to the TSO where the team must satisfy a coverage constraint (e.g., ensure that nodes are visited with specified probabilities) while using minimum resources (e.g., number of robots deployed, distance travelled, or expected number of failures). Our algorithms are based on the approximate greedy algorithm, where we iteratively select a path which approximately maximizes the expected incremental rewards subject to some constraints. 
Due to the submodular structure of the problems considered in this dissertation, we can prove bounds on the suboptimality of our algorithms. The approach developed in this dissertation provides a foundational set of tools for routing large scale teams in dangerous environments while explicitly planning for robot failure.
@phdthesis{Jorgensen2018, author = {Jorgensen, S.}, title = {Submodular Optimization for Risk-Aware Coordination of Multi-Robot Systems}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2018}, address = {Stanford, California}, month = jun, url = {/wp-content/papercite-data/pdf/Jorgensen.PhD18.pdf}, owner = {bylard}, timestamp = {2019-08-02} }
Abstract: In this thesis, we investigate the mobility challenges associated with robotic exploration of small solar system bodies, such as comets and asteroids. We open with a discussion on the surface environment of small bodies, and in particular, how their extremely weak gravity motivates hopping as a promising form of locomotion for long-distance traverses. We then propose an adaptable rover architecture called “Hedgehog”: a minimalistic, internally-actuated, hopping rover designed for targeted mobility in such low-gravity environments. By applying internal torques to three mutually-orthogonal flywheels, the rover’s chassis rotates, giving rise to ground reaction forces and various motion primitives, including long-range hopping, short precise tumbling, and small pose adjustments. We propose various models for analyzing the dynamics of these motion primitives and derive control laws for achieving desired motions. We then discuss various methods for conducting experiments in a reduced-gravity environment, including a custom six-degree-of-freedom laboratory test bed and parabolic flights. We validate our control laws in these test beds and demonstrate unprecedented motion accuracy for internally-actuated hoppers. Finally, we broaden our focus to general hopping platforms and consider various algorithmic tools for autonomous exploration. Specifically, we develop a suite of tools for motion planning, localization, and traversability analysis, with careful attention to the various sources of model uncertainty and the complex dynamics of hopping trajectories. Despite the stochastic nature of bouncing dynamics, we demonstrate through high-fidelity simulations that a hopping rover can efficiently traverse highly irregular bodies that would otherwise be inaccessible to traditional rovers.
@phdthesis{Hockman2018b, author = {Hockman, B. J.}, title = {Robotic Mobility on Small Solar System Bodies: {Design,} Control, and Autonomy}, school = {{Stanford University, Dept. of Mechanical Engineering}}, year = {2018}, address = {Stanford, California}, month = jun, url = {https://stacks.stanford.edu/file/druid:zg590cd2343/Hockman_PhD-augmented.pdf}, owner = {bylard}, timestamp = {2021-12-06} }
Abstract: Autonomous Mobility-on-Demand systems (that is, fleets of self-driving cars offering on-demand transportation) hold promise to reshape urban transportation by offering high quality of service at lower cost compared to private vehicles. However, the impact of such systems on the infrastructure of our cities (and in particular on traffic congestion and the electric power network) is an active area of research. In particular, Autonomous Mobility-on-Demand (AMoD) systems could greatly increase traffic congestion due to additional "rebalancing" trips required to re-align the distribution of available vehicles with customer demand; furthermore, charging of large fleets of electric vehicles can induce significant stress in the electric power network, leading to high electricity prices and potential network instability. In this thesis, we build analytical tools and algorithms to model and control the interaction between AMoD systems and our cities. We open our work by exploring the interaction between AMoD systems and urban congestion. Leveraging the theory of network flows, we devise models for AMoD systems that capture endogenous traffic congestion and are amenable to efficient optimization. These models allow us to show the key theoretical result that, under mild assumptions that are substantially verified for U.S. cities, AMoD systems do not increase congestion compared to privately-owned vehicles for a given level of customer demand if empty-traveling vehicles are properly routed. We leverage this insight to design a real-time congestion-aware routing algorithm for empty vehicles; microscopic agent-based simulations with New York City taxi data show that the algorithm significantly reduces congestion compared to a state-of-the-art congestion-agnostic rebalancing algorithm, resulting in 22% lower wait times for AMoD customers. 
We then devise a randomized congestion-aware routing algorithm for customer-carrying vehicles and prove rigorous analytical bounds on its performance. Preliminary results based on New York City taxi data show that the algorithm could yield a further reduction in congestion and, as a result, 5% lower service times for AMoD customers. We then turn our attention to the interaction between AMoD fleets with electric vehicles and the power network. We extend the network flow model developed in the first part of the thesis to capture the vehicles’ state-of-charge and their interaction with the power network (including charging and the ability to inject power in the network in exchange for a payment, denoted as "vehicle-to-grid"). We devise an algorithmic procedure to losslessly reduce the size of the resulting model, making it amenable to efficient optimization, and test our models and optimization algorithms on a hypothetical deployment of an AMoD system in Dallas-Fort Worth, TX with the goal of maximizing social welfare. Simulation results show that coordination between the AMoD system and the power network can reduce electricity prices by over $180M/year, with savings of $120M/year for local power network customers and $35M/year for the AMoD operator. In order to realize such benefits, the transportation operator must cooperate with the power network: we prove that a pricing scheme can be used to enforce the socially optimal solution as a general equilibrium, aligning the interests of a self-interested transportation operator and self-interested power generators with the goal of maximizing social welfare. We then design privacy-preserving algorithms to compute such coordination-promoting prices in a distributed fashion. Finally, we propose a receding-horizon implementation that trades off optimality for speed and demonstrate that it can be deployed in real-time with microscopic simulations in Dallas-Fort Worth. 
Collectively, these results lay the foundations for congestion-aware and power-aware control of AMoD systems; in particular, the models and algorithms in this thesis provide tools that will enable transportation network operators and urban planners to foster the positive externalities of AMoD and avoid the negative ones, thus fully realizing the benefits of AMoD systems in our cities.
@phdthesis{Rossi2018, author = {Rossi, F.}, title = {On the Interaction between {Autonomous Mobility-on-Demand} Systems and the Built Environment: Models and Large Scale Coordination Algorithms}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2018}, address = {Stanford, California}, month = mar, url = {/wp-content/papercite-data/pdf/Rossi.PhD18.pdf}, owner = {frossi2}, timestamp = {2018-03-19} }
Abstract: Markov decision processes (MDPs) provide a mathematical framework for modeling sequential decision making where system evolution and cost/reward depend on uncertainties and control actions of a decision maker. MDP models have been widely adopted in numerous domains such as robotics, control systems, finance, economics, and manufacturing. At the same time, optimization theories of MDPs serve as the theoretical underpinnings to numerous dynamic programming and reinforcement learning algorithms in stochastic control problems. While the study in MDPs is attractive for several reasons, there are two main challenges associated with its practicality: (1) An accurate MDP model is oftentimes not available to the decision maker. Affected by modeling errors, the resultant MDP solution policy is non-robust to system fluctuations. (2) The most widely-adopted optimization criterion for MDPs is represented by the risk-neutral expectation of a cumulative cost. This does not take into account the notion of risk, i.e., increased awareness of events of small probability but high consequences. In this thesis we study multiple important aspects in risk-sensitive sequential decision making where the variability of stochastic costs and robustness to modeling errors are taken into account. First, we address a special type of risk-sensitive decision making problems where the percentile behaviors are considered. Here risk is either modeled by the conditional value-at-risk (CVaR) or the value-at-risk (VaR). VaR measures risk as the maximum cost that might be incurred with respect to a given confidence level, and is appealing due to its intuitive meaning and its connection to chance-constraints. The VaR risk measure has many fundamental engineering applications such as motion planning, where a safety constraint is imposed to upper bound the probability of maneuvering into dangerous regimes.
Despite its popularity, VaR suffers from being unstable, and its singularity often introduces mathematical issues to optimization problems. To alleviate this problem, an alternative measure that addresses most of VaR's shortcomings is CVaR. CVaR is a risk-measure that is rapidly gaining popularity in various financial applications, due to its favorable computational properties (i.e., CVaR is a coherent risk) and superior ability to safeguard a decision maker from the "outcomes that hurt the most". As a risk that measures the conditional expected cost given that such cost is greater than or equal to VaR, CVaR accounts for the total cost of undesirable events (it corresponds to events whose associated probability is low, but the corresponding cost is high) and is therefore preferable in financial applications such as portfolio optimization. Second, we consider optimization problems in which the objective function involves a coherent risk measure of the random cost. Here the term coherent risk [7] denotes a general class of risks that satisfies convexity, monotonicity, translational-invariance and positive homogeneity. These properties not only guarantee that the optimization problems are mathematically well-posed, but they are also axiomatically justified. Therefore modeling risk-aversion with coherent risks has already gained widespread acceptance in engineering, finance and operations research applications, among others. On the other hand, when the optimization problem is sequential, another important property of a risk measure is time consistency. A time consistent risk metric satisfies the "dynamic-programming" style property which ensures rational decision making, i.e., the strategy that is risk-optimal at the current stage will also be deemed optimal in subsequent stages. To get the best of both worlds, the recently proposed Markov risk measures [119] satisfy both the coherent risk properties and time consistency.
Thus to ensure rationality in risk modeling and algorithmic tractability, this thesis will focus on risk-sensitive sequential decision making problems modeled by Markov risk measures.
@phdthesis{Chow2017, author = {Chow, Y.}, title = {Risk-Sensitive and Data-Driven Sequential Decision Making}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2017}, address = {Stanford, California}, month = mar, url = {/wp-content/papercite-data/pdf/Chow.PhD17.pdf}, owner = {frossi2}, timestamp = {2018-03-19} }
Abstract: Urban mobility in the 21st century faces significant challenges, as the unsustainable trends of urban population growth, congestion, pollution, and low vehicle utilization worsen in large cities around the world. As autonomous vehicle technology draws closer to realization, a solution is beginning to emerge in the form of autonomous mobility-on-demand (AMoD), whereby fleets of self-driving vehicles transport customers within an urban environment. This dissertation introduces a systematic approach to the design, control, and evaluation of these systems. In the first part of the dissertation, a stochastic queueing-theoretical model of AMoD is developed, which allows both the analysis of quality-of-service metrics as well as the synthesis of control policies. This model is then extended to one-way car sharing systems, or human-driven mobility-on-demand (MoD) systems. Based on these models, closed-loop control algorithms are designed to efficiently route empty (rebalancing) vehicles in very large systems with thousands of vehicles. The performance of the algorithms and the potential societal benefits of AMoD and MoD are evaluated through case studies of New York City and Singapore using real-world data. In the second part of the dissertation, additional structural and operational constraints are considered for AMoD systems. First, the impact of AMoD on traffic congestion with respect to the underlying structural properties of the road network is analyzed using a network flow model. In particular, it is shown that empty rebalancing vehicles in AMoD systems will not increase congestion, in stark contrast to popular belief. Finally, the control of AMoD systems with additional operational constraints is studied under a model predictive control framework, with a focus on range and charging constraints of electric vehicles. 
The technical approach developed in this dissertation allows us to evaluate the societal benefits of AMoD systems and lays the foundation for the design and control of future urban transportation networks.
@phdthesis{Zhang2016, author = {Zhang, R.}, title = {Models and Large-Scale Coordination Algorithms for {Autonomous} {Mobility-on-Demand}}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2016}, address = {Stanford, California}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Zhang.PhD16.pdf} }
Abstract: Autonomy has demonstrated success in many vehicle control problems, but has yet to show significant breakthroughs for spacecraft guidance during proximity operations. In part due to a costly verification and validation process as well as from limited access to formally-safe guidance algorithms, mission planners have instead had to rely on maneuver plans with straightforward, easily-verified trajectories and extensive human oversight. Unfortunately, this strategy often introduces propellant inefficiencies, adds significant labor overhead, and limits missions to Earth proximity where two-way communication times are short. This dissertation seeks to remedy these issues by developing a provably-safe and propellant-efficient sampling-based motion planning framework for fully-autonomous spacecraft proximity operations. The framework is designed for a wide range of hazardous guidance scenarios, including autonomous orbital rendezvous and inspection, pinpoint small-body descent, and on-orbit satellite servicing. Due to the dangers associated with operating near other objects, special care is taken to enable real-time guidance as well as ensure the availability of safe abort trajectories so that spacecraft can respond quickly and safely to control failures and sudden environmental changes. Through its generality, efficiency, and speed, the proposed approach offers the potential to enable entirely new capabilities for next-generation space missions, while also increasing the frequency, flexibility, and reliability of present-day operations in space.
@phdthesis{Starek2016, author = {Starek, J. A.}, title = {Sampling-Based Motion Planning for Safe and Efficient Spacecraft Proximity Operations}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2016}, address = {Stanford, California}, month = jun, url = {/wp-content/papercite-data/pdf/Starek.PhD16.pdf}, owner = {bylard}, timestamp = {2017-03-07} }
Abstract: This thesis presents a full-stack, real-time planning framework for kinodynamic robots that is enabled by a novel application of machine learning for reachability analysis. As products of this work, three contributions are discussed in detail in this thesis. The first contribution is the novel application of machine learning for rapid approximation of reachable sets for dynamical systems. The second contribution is the synthesis of machine learning, sampling-based motion planning, and optimal control into a cohesive planning framework that is built on an offline-online computation paradigm. The final contribution is the application of this planning framework on a quadrotor system to produce, arguably, one of the first demonstrations of fully-online kinodynamic motion planning. During physical experiments, the framework is shown to execute planning cycles at a rate of 3 Hz to 5 Hz, a significant improvement over existing techniques. For the quadrotor, a simplified dynamics model is used during the planning phase to accelerate online computation. A trajectory smoothing phase, which leverages the differentially flat nature of quadrotor dynamics, is then implemented to guarantee a dynamically feasible trajectory. An event-based replanning structure is implemented to handle the case of dynamic, even adversarial, obstacles. A locally reactive control layer, inspired by potential fields methods, is added to the framework to help minimize replanning events and produce graceful avoidance maneuvers in the presence of high speed obstacles.
@phdthesis{Allen2016, author = {Allen, R.}, title = {A Real-Time Framework for Kinodynamic Planning with Application to Quadrotor Obstacle Avoidance}, school = {{Stanford University, Dept. of Aeronautics and Astronautics}}, year = {2016}, address = {Stanford, California}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Allen.PhD16.pdf} }
Abstract: Recent years have witnessed great advancements in the sciences and technology of autonomy, robotics and networking. This dissertation develops concepts and algorithms for dynamic vehicle routing (DVR), that is, for the automatic planning of optimal multi-vehicle routes to provide service to demands (or more generally to perform tasks) that are generated over time by an exogenous process. We consider a rich variety of scenarios relevant for robotic applications. We begin by reviewing some of the approaches available to tackle DVR problems. Next, we study different multi-vehicle scenarios based on different models for demands (in particular, demands with time constraints, demands with different priority levels, and demands that must be transported from a pick-up to a delivery location). The performance criterion used in these scenarios is either the expected waiting time of the demands or the fraction of demands serviced successfully. In each specific DVR scenario we adopt a rigorous technical approach, which we call algorithmic queueing theory and which relies upon methods from queueing theory, combinatorial optimization, and stochastic geometry. Algorithmic queueing theory consists of three basic steps: 1) queueing model of the DVR problem and analysis of its structure; 2) establishment of fundamental limitations on performance, independent of algorithms; and 3) design of algorithms that are either optimal or constant-factor away from optimal. In the second part of the dissertation, we address problems concerning the implementation of routing policies in large-scale robotic networks, such as adaptivity and decentralized computation. We first present distributed algorithms for environment partitioning, and then we apply them to devise routing policies for DVR problems that (i) are spatially distributed, scalable to large networks, and adaptive to network changes, and (ii) have remarkably good performance guarantees.
The technical approach developed in this dissertation is applicable to a wide variety of DVR problems: several possible extensions are discussed throughout the thesis.
@phdthesis{Pavone2010, author = {Pavone, M.}, title = {Dynamic Vehicle Routing for Robotic Networks}, school = {{Massachusetts Inst. of Technology, Dept. of Aeronautics and Astronautics}}, year = {2010}, address = {Cambridge, MA}, month = jun, owner = {bylard}, timestamp = {2017-01-28}, url = {/wp-content/papercite-data/pdf/Pavone.Thesis.PHD.AA10.pdf} }