James Harrison

Contacts:

Personal Webpage

James is a graduate student in the Department of Mechanical Engineering at Stanford University. He received a B.Eng. in Mechanical Engineering from McGill University in 2015, and an M.S. in Mechanical Engineering from Stanford University in 2017. James’ research interests include control theory, robotics, and machine learning. In particular, his current work focuses on verifiably safe and robust methods for reinforcement learning, as well as unsupervised learning and representation learning in robotic task and motion planning.

Awards:

  • Office of Technology Licensing Stanford Graduate Fellowship
  • Natural Sciences and Engineering Research Council of Canada (NSERC) Doctoral Scholarship

ASL Publications

  1. T. Lew, A. Sharma, J. Harrison, A. Bylard, and M. Pavone, “Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework,” 2021. (Submitted)

    Abstract: To safely deploy learning-based systems in highly uncertain environments, one must ensure that they always satisfy constraints. In this work, we propose a practical and theoretically justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation: the expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty where reaching the goal safely is initially infeasible by first exploring to gather data and reduce uncertainty, before autonomously exploiting the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework provides safety guarantees in the form of a single joint chance constraint. Furthermore, we use this theoretical analysis to motivate regularization of the model to improve performance. We extensively demonstrate our approach in simulation and on hardware.

    @inproceedings{LewEtAl2021,
      author = {Lew, T. and Sharma, A. and Harrison, J. and Bylard, A. and Pavone, M.},
      title = {Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework},
      year = {2021},
      note = {Submitted},
      month = mar,
      url = {https://arxiv.org/pdf/2008.11700.pdf},
      keywords = {sub},
      owner = {lewt},
      timestamp = {2021-03-05}
    }
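
    The abstract above centers on last-layer Bayesian adaptation with confidence sets that contract as data arrive. The Python sketch below is not the authors' implementation: the scalar system, feature map, constraint, and fallback control are all hypothetical placeholders. It only illustrates the idea that a Bayesian last-layer model of the unknown dynamics term can be updated online, with a control applied only if its worst case over the current confidence set satisfies a state constraint.

      import numpy as np

      rng = np.random.default_rng(0)

      # Scalar system x+ = x + dt * (u + g(x)) with unknown g; we keep a Bayesian
      # last-layer model of g over fixed features and only apply a control whose
      # worst case over the confidence set keeps the state below a constraint.
      dt, x_max, sigma2 = 0.1, 2.0, 0.01
      g_true = lambda x: 1.5 * np.sin(x)                 # the "true dynamics" to be learned

      freqs = rng.normal(scale=2.0, size=16)
      shifts = rng.uniform(0, 2 * np.pi, size=16)
      phi = lambda x: np.cos(freqs * x + shifts)         # stand-in for offline-trained features

      Lambda = np.eye(16)                                # posterior precision of last-layer weights
      q = np.zeros(16)                                   # precision-weighted posterior mean

      def g_bounds(x, beta=3.0):
          """Mean estimate of g(x) and a confidence interval that contracts as data arrive."""
          f = phi(x)
          mean = f @ np.linalg.solve(Lambda, q)
          half_width = beta * np.sqrt(f @ np.linalg.solve(Lambda, f))
          return mean - half_width, mean + half_width

      x = 0.0
      for _ in range(100):
          u_des = 1.0                                    # task: push the state upward
          _, hi = g_bounds(x)
          # Certify the desired control against the worst case in the confidence set;
          # otherwise cancel the worst-case drift so the state cannot grow this step.
          u = u_des if x + dt * (u_des + hi) <= x_max else -hi
          x_next = x + dt * (u + g_true(x)) + dt * np.sqrt(sigma2) * rng.normal()

          f = phi(x)                                     # last-layer update from the transition
          y = (x_next - x) / dt - u
          Lambda += np.outer(f, f) / sigma2
          q += f * y / sigma2
          x = x_next

      lo, hi = g_bounds(x)
      print(f"final state {x:.3f} (constraint x <= {x_max}); "
            f"g({x:.2f}) in [{lo:.2f}, {hi:.2f}], true {g_true(x):.2f}")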
    
  2. D. Gammelli, K. Yang, J. Harrison, F. Rodrigues, F. C. Pereira, and M. Pavone, “Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems,” 2021. (Submitted)

    Abstract: Autonomous mobility-on-demand (AMoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of robotic, self-driving vehicles. Given a graph representation of the transportation network - one where, for example, nodes represent areas of the city, and edges the connectivity between them - we argue that the AMoD control problem is naturally cast as a node-wise decision-making problem. In this paper, we propose a deep reinforcement learning framework to control the rebalancing of AMoD systems through graph neural networks. Crucially, we demonstrate that graph neural networks enable reinforcement learning agents to recover behavior policies that are significantly more transferable, generalizable, and scalable than policies learned through other approaches. Empirically, we show how the learned policies exhibit promising zero-shot transfer capabilities when faced with critical portability tasks such as inter-city generalization, service area expansion, and adaptation to potentially complex urban topologies.

    @inproceedings{GammelliYangEtAl2021,
      author = {Gammelli, D. and Yang, K. and Harrison, J. and Rodrigues, F. and Pereira, F. C. and Pavone, M.},
      title = {Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems},
      year = {2021},
      note = {Submitted},
      url = {https://arxiv.org/abs/2104.11434},
      keywords = {sub},
      owner = {jh2},
      timestamp = {2021-03-23}
    }
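
    As an illustration of the node-wise decision-making view in the abstract above, the following sketch (numpy only; the toy graph, node features, and weights are invented, and in the paper the weights are trained with reinforcement learning) applies one graph-convolution layer to per-area features and turns the node scores into a rebalancing distribution. Because the same weights apply to any adjacency matrix, such a policy can in principle be evaluated on a different city graph without retraining, which is the transferability the abstract emphasizes.

      import numpy as np

      rng = np.random.default_rng(1)

      # Toy transportation graph: 4 city areas, adjacency matrix with self-loops.
      A = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 1],
                    [1, 1, 0, 1],
                    [0, 1, 1, 0]], dtype=float) + np.eye(4)
      D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
      A_hat = D_inv_sqrt @ A @ D_inv_sqrt        # symmetric normalization, as in GCNs

      # Node features, e.g. [idle vehicles, open requests] per area (illustrative values).
      X = np.array([[5.0, 1.0],
                    [0.0, 4.0],
                    [2.0, 2.0],
                    [1.0, 3.0]])

      # Hypothetical learned weights; in the paper these are trained with RL.
      W1 = rng.normal(size=(2, 16))
      w2 = rng.normal(size=(16,))

      H = np.maximum(A_hat @ X @ W1, 0.0)        # one message-passing layer + ReLU
      scores = H @ w2                            # scalar score per node
      pi = np.exp(scores - scores.max())
      pi /= pi.sum()                             # node-wise rebalancing distribution

      print("fraction of idle fleet to send toward each area:", np.round(pi, 3))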
    
  3. J. Willes, J. Harrison, A. Harakeh, C. Finn, M. Pavone, and S. Waslander, “Open-Set Incremental Learning via Bayesian Prototypical Embeddings,” 2021. (Submitted)

    Abstract:

    @inproceedings{WillesHarrisonEtAl2021,
      author = {Willes, J. and Harrison, J. and Harakeh, A. and Finn, C. and Pavone, M. and Waslander, S.},
      title = {Open-Set Incremental Learning via Bayesian Prototypical Embeddings},
      year = {2021},
      note = {Submitted},
      keywords = {sub},
      owner = {jh2},
      timestamp = {2021-03-23}
    }
    
  4. R. Sinha, J. Harrison, S. M. Richards, and M. Pavone, “Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty,” 2021. (Submitted)

    Abstract: We propose a learning-based robust predictive control algorithm that can handle large uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear dynamics component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. Motivated by an inability of existing learning-based predictive control algorithms to achieve safety guarantees in the presence of uncertainties of large magnitude in this setting, we achieve significant performance improvements by optimizing over a novel class of nonlinear feedback policies inspired by certainty equivalent “estimate-and-cancel” control laws pioneered in classical adaptive control. In contrast with previous work in robust adaptive MPC, this allows us to take advantage of the structure in the a priori unknown dynamics that are learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints even when an additive uncertain function cannot directly be canceled from the dynamics. Moreover, our approach allows us to apply contemporary statistical estimation techniques to certify the safety of the system through persistent constraint satisfaction with high probability. We show that our method allows us to consider larger unknown terms in the dynamics than existing methods through simulated examples.

    @inproceedings{SinhaHarrisonEtAl2021,
      author = {Sinha, R. and Harrison, J. and Richards, S. M. and Pavone, M.},
      title = {Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty},
      year = {2021},
      note = {Submitted},
      keywords = {sub},
      url = {https://arxiv.org/pdf/2104.08261.pdf},
      owner = {rhnsinha},
      timestamp = {2021-06-04}
    }
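
    A minimal sketch of the "estimate-and-cancel" idea from the abstract, under strong simplifying assumptions (scalar system, matched uncertainty, no state or input constraints, and recursive least squares in place of the paper's estimation machinery): the additive unknown term is fit online on a known feature basis and subtracted from a nominal stabilizing control.

      import numpy as np

      rng = np.random.default_rng(2)

      # Scalar system with matched uncertainty: x+ = a*x + b*(u + d(x)), d unknown.
      a, b = 1.0, 0.5
      d_true = lambda x: 0.8 * np.sin(x)          # unknown environment effect

      # Linear-in-features model of d: d(x) ~ theta^T phi(x), fit by recursive least squares.
      phi = lambda x: np.array([np.sin(x), np.cos(x), x])
      P = 100.0 * np.eye(3)                        # RLS covariance
      theta = np.zeros(3)

      k = -1.2                                     # nominal stabilizing gain for (a, b)
      x = 3.0
      for t in range(60):
          d_hat = theta @ phi(x)
          u = k * x - d_hat                        # certainty-equivalent "estimate and cancel"
          x_next = a * x + b * (u + d_true(x)) + 0.01 * rng.normal()

          # RLS update of theta from the observed residual (x_next - a*x - b*u) / b.
          y = (x_next - a * x - b * u) / b
          f = phi(x)
          Pf = P @ f
          gain = Pf / (1.0 + f @ Pf)
          theta = theta + gain * (y - theta @ f)
          P = P - np.outer(gain, Pf)
          x = x_next

      print(f"final |x| = {abs(x):.4f}, estimated vs true d(1.0): "
            f"{theta @ phi(1.0):.3f} vs {d_true(1.0):.3f}")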
    
  5. R. Dyro, J. Harrison, A. Sharma, and M. Pavone, “Particle MPC for Uncertain and Learning-Based Control,” 2021. (Submitted)

    Abstract: Autonomous decision-making in novel or changing environments requires quantification and consideration of uncertainties in the system or environment dynamics that impact downstream control performance. Thus, as robotic systems move from highly structured environments to open worlds, incorporating uncertainty in learning or estimation into the control pipeline is essential for robust and efficient performance. In this paper we present a nonlinear particle model predictive control (PMPC) approach to control under uncertainty. This approach, due to the particle representation of uncertainty, is capable of handling arbitrary uncertainty specifications. We implement our nonlinear PMPC scheme with a sequential convex programming non-convex optimization scheme, and we discuss practical implementation of such a framework. We investigate our approach for two robotic systems across three problem settings: time-varying, partially observed dynamics; sensing uncertainty; and model-based reinforcement learning, and show that our approach improves performance over baselines in all settings.

    @inproceedings{DyroHarrisonEtAl2021,
      author = {Dyro, R. and Harrison, J. and Sharma, A. and Pavone, M.},
      title = {Particle MPC for Uncertain and Learning-Based Control},
      year = {2021},
      note = {Submitted},
      keywords = {sub},
      owner = {jh2},
      timestamp = {2021-03-23}
    }
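
    The particle idea in the abstract can be illustrated in a few lines of numpy: dynamics uncertainty is represented by sampled parameters, and each candidate control sequence is scored by its average cost across the particles. The sketch below uses random shooting purely for brevity; the paper optimizes the controls with sequential convex programming, and the system, cost, and particle distribution here are invented.

      import numpy as np

      rng = np.random.default_rng(3)

      # Uncertain scalar dynamics x+ = theta * x + u; uncertainty represented by particles.
      particles = rng.normal(loc=1.1, scale=0.15, size=64)   # samples of the unknown theta

      def rollout_cost(theta, u_seq, x0=2.0):
          """Quadratic cost of one open-loop control sequence under one dynamics particle."""
          x, cost = x0, 0.0
          for u in u_seq:
              cost += x**2 + 0.1 * u**2
              x = theta * x + u
          return cost + 10.0 * x**2                           # terminal cost

      H, n_candidates = 8, 500
      best_u, best_cost = None, np.inf
      for _ in range(n_candidates):                           # random shooting over control sequences
          u_seq = rng.uniform(-2.0, 2.0, size=H)
          expected = np.mean([rollout_cost(th, u_seq) for th in particles])
          if expected < best_cost:
              best_u, best_cost = u_seq, expected

      print("first control of best sequence:", round(best_u[0], 3),
            "expected cost:", round(best_cost, 2))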
    
  6. J. Harrison, A. Sharma, C. Finn, and M. Pavone, “Continuous Meta-Learning without Tasks,” in Conf. on Neural Information Processing Systems, 2020.

    Abstract: Meta-learning is a promising strategy for learning to efficiently learn within new tasks, using data gathered from a distribution of tasks. However, the meta-learning literature thus far has focused on the task segmented setting, where at train-time, offline data is assumed to be split according to the underlying task, and at test-time, the algorithms are optimized to learn in a single task. In this work, we enable the application of generic meta-learning algorithms to settings where this task segmentation is unavailable, such as continual online learning with a time-varying task. We present meta-learning via online changepoint analysis (MOCA), an approach which augments a meta-learning algorithm with a differentiable Bayesian changepoint detection scheme. The framework allows both training and testing directly on time series data without segmenting it into discrete tasks. We demonstrate the utility of this approach on a nonlinear meta-regression benchmark as well as two meta-image-classification benchmarks.

    @inproceedings{HarrisonSharmaEtAl2020,
      author = {Harrison, J. and Sharma, A. and Finn, C. and Pavone, M.},
      booktitle = {{Conf. on Neural Information Processing Systems}},
      title = {Continuous Meta-Learning without Tasks},
      year = {2020},
      month = dec,
      url = {https://arxiv.org/abs/1912.08866},
      owner = {apoorva},
      timestamp = {2020-05-05}
    }
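
    A simplified sketch of the changepoint machinery described in the abstract: a Bayesian online changepoint detection recursion over run lengths, combined with a conjugate Gaussian model of the per-task mean. Everything here (1-D data, known noise, a fixed Gaussian prior standing in for the meta-learned prior) is a placeholder chosen so the recursion runs standalone; MOCA differentiates through this kind of recursion and meta-trains the underlying model.

      import numpy as np

      rng = np.random.default_rng(4)

      # Piecewise-constant regression target: the latent "task" (mean) switches without labels.
      data = np.concatenate([rng.normal(m, 0.3, size=40) for m in (0.0, 2.0, -1.0)])

      hazard = 1.0 / 40.0            # prior probability of a task switch at each step
      sigma2 = 0.3**2                # known observation noise
      mu0, var0 = 0.0, 4.0           # prior over the per-task mean (the "meta-learned prior")

      # Run-length posterior and per-run-length conjugate statistics.
      log_belief = np.array([0.0])   # log p(run length = 0) = 1 initially
      means = np.array([mu0])
      variances = np.array([var0])

      detected = []
      for t, y in enumerate(data):
          # Posterior predictive of y under each run-length hypothesis.
          pred_var = variances + sigma2
          log_pred = -0.5 * (np.log(2 * np.pi * pred_var) + (y - means) ** 2 / pred_var)

          log_growth = log_belief + log_pred + np.log(1 - hazard)              # run continues
          log_cp = np.logaddexp.reduce(log_belief + log_pred) + np.log(hazard)  # task switch

          log_belief = np.concatenate([[log_cp], log_growth])
          log_belief -= np.logaddexp.reduce(log_belief)                        # normalize

          # Conjugate Gaussian update of each run's mean; run length 0 restarts at the prior.
          post_var = 1.0 / (1.0 / variances + 1.0 / sigma2)
          post_mean = post_var * (means / variances + y / sigma2)
          means = np.concatenate([[mu0], post_mean])
          variances = np.concatenate([[var0], post_var])

          if np.argmax(log_belief) == 0 and t > 0:
              detected.append(t)

      print("changepoints detected near steps:", detected, "(true switches at 40 and 80)")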
    
  7. S. Banerjee, J. Harrison, P. M. Furlong, and M. Pavone, “Adaptive Meta-Learning for Identification of Rover-Terrain Dynamics,” in Int. Symp. on Artificial Intelligence, Robotics and Automation in Space, Pasadena, California, 2020.

    Abstract: Rovers require knowledge of terrain to plan trajectories that maximize safety and efficiency. Terrain type classification relies on input from human operators or machine learning-based image classification algorithms. However, high level terrain classification is typically not sufficient to prevent incidents such as rovers becoming unexpectedly stuck in a sand trap; in these situations, online rover-terrain interaction data can be leveraged to accurately predict future dynamics and prevent further damage to the rover. This paper presents a meta-learning-based approach to adapt probabilistic predictions of rover dynamics by augmenting a nominal model affine in parameters with a Bayesian regression algorithm (P-ALPaCA). A regularization scheme is introduced to encourage orthogonality of nominal and learned features, leading to interpretable probabilistic estimates of terrain parameters in varying terrain conditions.

    @inproceedings{BanerjeeHarrisonEtAl2020,
      author = {Banerjee, S. and Harrison, J. and Furlong, P. M. and Pavone, M.},
      title = {Adaptive Meta-Learning for Identification of Rover-Terrain Dynamics},
      booktitle = {{Int. Symp. on Artificial Intelligence, Robotics and Automation in Space}},
      year = {2020},
      address = {Pasadena, California},
      month = oct,
      url = {https://arxiv.org/abs/2009.10191},
      owner = {somrita},
      timestamp = {2020-09-18}
    }
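
    The regularization idea in the abstract, namely encouraging orthogonality between nominal and learned features, can be written down directly. The sketch below (with invented features; the paper instead adds this kind of penalty to the meta-learning loss of P-ALPaCA) computes a cross-Gram penalty and shows that projecting the learned features onto the orthogonal complement of the nominal ones drives it to zero.

      import numpy as np

      rng = np.random.default_rng(5)

      N = 200
      x = rng.uniform(-1, 1, size=N)

      # Nominal, physically interpretable features (e.g. terms of a parameter-affine rover model).
      Phi_nom = np.column_stack([x, x**2])

      # Hypothetical learned features, initially correlated with the nominal ones.
      Phi_learn = np.column_stack([np.sin(3 * x) + 0.5 * x, np.tanh(x) + 0.3 * x**2])

      def orthogonality_penalty(A, B):
          """Squared Frobenius norm of the cross-Gram matrix between two feature sets."""
          return np.linalg.norm(A.T @ B, ord="fro") ** 2 / len(A) ** 2

      print("penalty before:", round(orthogonality_penalty(Phi_nom, Phi_learn), 4))

      # One way to drive the penalty to zero: project the learned features onto the
      # orthogonal complement of the nominal features (the paper instead lets the
      # penalty shape the learned features softly during training).
      proj = Phi_nom @ np.linalg.lstsq(Phi_nom, Phi_learn, rcond=None)[0]
      Phi_learn_orth = Phi_learn - proj

      print("penalty after projection:", round(orthogonality_penalty(Phi_nom, Phi_learn_orth), 6))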
    
  8. A. Sharma, J. Harrison, M. Tsao, and M. Pavone, “Robust and Adaptive Planning under Model Uncertainty,” in Int. Conf. on Automated Planning and Scheduling, Berkeley, California, 2019.

    Abstract: Planning under model uncertainty is a fundamental problem across many applications of decision making and learning. In this paper, we propose the Robust Adaptive Monte Carlo Planning (RAMCP) algorithm, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness. RAMCP formulates the risk-sensitive planning problem as a two-player zero-sum game, in which an adversary perturbs the agent’s belief over the models. We introduce two versions of the RAMCP algorithm. The first, RAMCP-F, converges to an optimal risk-sensitive policy without having to rebuild the search tree as the underlying belief over models is perturbed. The second version, RAMCP-I, improves computational efficiency at the cost of losing theoretical guarantees, but is shown to yield empirical results comparable to RAMCP-F. RAMCP is demonstrated on an n-pull multi-armed bandit problem, as well as a patient treatment scenario.

    @inproceedings{SharmaHarrisonEtAl2019,
      author = {Sharma, A. and Harrison, J. and Tsao, M. and Pavone, M.},
      title = {Robust and Adaptive Planning under Model Uncertainty},
      booktitle = {{Int. Conf. on Automated Planning and Scheduling}},
      year = {2019},
      address = {Berkeley, California},
      month = jul,
      url = {https://arxiv.org/pdf/1901.02577.pdf},
      owner = {apoorva},
      timestamp = {2019-04-10}
    }
    
  9. S. Chinchali, A. Sharma, J. Harrison, A. Elhafsi, D. Kang, E. Pergament, E. Cidon, S. Katti, and M. Pavone, “Network Offloading Policies for Cloud Robotics: a Learning-based Approach,” in Robotics: Science and Systems, Freiburg im Breisgau, Germany, 2019.

    Abstract: Today’s robotic systems are increasingly turning to computationally expensive models such as deep neural networks (DNNs) for tasks like localization, perception, planning, and object detection. However, resource-constrained robots, like low-power drones, often have insufficient on-board compute resources or power reserves to scalably run the most accurate, state-of-the-art neural network compute models. Cloud robotics allows mobile robots the benefit of offloading compute to centralized servers if they are uncertain locally or want to run more accurate, compute-intensive models. However, cloud robotics comes with a key, often understated cost: communicating with the cloud over congested wireless networks may result in latency or loss of data. In fact, sending high data-rate video or LIDAR from multiple robots over congested networks can lead to prohibitive delay for real-time applications, which we measure experimentally. In this paper, we formulate a novel Robot Offloading Problem - how and when should robots offload sensing tasks, especially if they are uncertain, to improve accuracy while minimizing the cost of cloud communication? We formulate offloading as a sequential decision making problem for robots, and propose a solution using deep reinforcement learning. In both simulations and hardware experiments using state-of-the-art vision DNNs, our offloading strategy improves vision task performance by 1.3-2.6x over benchmark offloading strategies, allowing robots the potential to significantly transcend their on-board sensing accuracy but with limited cost of cloud communication.

    @inproceedings{ChinchaliSharmaEtAl2019,
      author = {Chinchali, S. and Sharma, A. and Harrison, J. and Elhafsi, A. and Kang, D. and Pergament, E. and Cidon, E. and Katti, S. and Pavone, M.},
      title = {Network Offloading Policies for Cloud Robotics: a Learning-based Approach},
      booktitle = {{Robotics: Science and Systems}},
      year = {2019},
      address = {Freiburg im Breisgau, Germany},
      month = jun,
      url = {https://arxiv.org/pdf/1902.05703.pdf},
      owner = {apoorva},
      timestamp = {2019-02-07}
    }
    
  10. B. Ivanovic, J. Harrison, A. Sharma, M. Chen, and M. Pavone, “BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning,” in Proc. IEEE Conf. on Robotics and Automation, Montreal, Canada, 2019.

    Abstract: Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance. BaRC is general, in that it can accelerate training of any model-free RL algorithm on a broad class of goal-directed continuous control MDPs. Its curriculum strategy is physically intuitive, easy-to-tune, and allows incorporating physical priors to accelerate training without hindering the performance, flexibility, and applicability of the model-free RL algorithm. We evaluate our approach on two representative dynamic robotic learning problems and find substantial performance improvement relative to previous curriculum generation techniques and naïve exploration strategies.

    @inproceedings{IvanovicHarrisonEtAl2019,
      author = {Ivanovic, B. and Harrison, J. and Sharma, A. and Chen, M. and Pavone, M.},
      title = {{BaRC:} Backward Reachability Curriculum for Robotic Reinforcement Learning},
      booktitle = {{Proc. IEEE Conf. on Robotics and Automation}},
      year = {2019},
      address = {Montreal, Canada},
      month = may,
      url = {https://arxiv.org/pdf/1806.06161.pdf},
      owner = {borisi},
      timestamp = {2018-09-05}
    }
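
    A toy version of the curriculum loop described in the abstract, on a double integrator with a fixed placeholder controller standing in for the model-free policy being trained: start states are generated by stepping an approximate dynamics model backward from the goal under random actions, and the backward horizon is expanded only once the policy is reliable from the current start distribution.

      import numpy as np

      rng = np.random.default_rng(6)

      dt, goal = 0.1, np.zeros(2)                  # double integrator state: [position, velocity]

      def step(x, u):
          return np.array([x[0] + dt * x[1], x[1] + dt * u])

      def step_backward(x, u):
          """Approximate backward dynamics: invert one Euler step under action u."""
          v_prev = x[1] - dt * u
          return np.array([x[0] - dt * v_prev, v_prev])

      def sample_start(n_back):
          """BaRC-style start state: walk n_back random actions backward from the goal."""
          x = goal.copy()
          for _ in range(n_back):
              x = step_backward(x, rng.uniform(-1, 1))
          return x

      def policy(x):
          return np.clip(-2.0 * x[0] - 2.5 * x[1], -1, 1)   # placeholder for the learned policy

      def succeeds(x0, horizon=200):
          x = x0
          for _ in range(horizon):
              x = step(x, policy(x))
              if np.linalg.norm(x - goal) < 0.05:
                  return True
          return False

      # Curriculum: expand the backward horizon once the policy is reliable from the current one.
      n_back = 5
      for _ in range(6):
          success_rate = np.mean([succeeds(sample_start(n_back)) for _ in range(50)])
          print(f"backward steps {n_back:3d}: success rate {success_rate:.2f}")
          if success_rate > 0.8:
              n_back *= 2                           # grow the initial-state distribution backwards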
    
  11. J. Harrison, A. Sharma, and M. Pavone, “Meta-Learning Priors for Efficient Online Bayesian Regression,” in Workshop on Algorithmic Foundations of Robotics, Merida, Mexico, 2018. (In Press)

    Abstract: Gaussian Process (GP) regression has seen widespread use in robotics due to its generality, simplicity of use, and the utility of Bayesian predictions. In particular, the predominant implementation of GP regression is kernel-based, as it enables fitting of arbitrary nonlinear functions by leveraging kernel functions as infinite-dimensional features. While incorporating prior information has the potential to drastically improve data efficiency of kernel-based GP regression, expressing complex priors through the choice of kernel function and associated hyperparameters is often challenging and unintuitive. Furthermore, the computational complexity of kernel-based GP regression scales poorly with the number of samples, limiting its application in regimes where a large amount of data is available. In this work, we propose ALPaCA, an algorithm for efficient Bayesian regression which addresses these issues. ALPaCA uses a dataset of sample functions to learn a domain-specific, finite-dimensional feature encoding, as well as a prior over the associated weights, such that Bayesian linear regression in this feature space yields accurate online predictions of the posterior density. These features are neural networks, which are trained via a meta-learning approach. ALPaCA extracts all prior information from the dataset, rather than relying on the choice of arbitrary, restrictive kernel hyperparameters. Furthermore, it substantially reduces sample complexity, and allows scaling to large systems. We investigate the performance of ALPaCA on two simple regression problems, two simulated robotic systems, and on a lane-change driving task performed by humans. We find our approach outperforms kernel-based GP regression, as well as state-of-the-art meta-learning approaches, thereby providing a promising plug-in tool for many regression tasks in robotics where scalability and data-efficiency are important.

    @inproceedings{HarrisonSharmaEtAl2018,
      author = {Harrison, J. and Sharma, A. and Pavone, M.},
      title = {Meta-Learning Priors for Efficient Online Bayesian Regression},
      booktitle = {{Workshop on Algorithmic Foundations of Robotics}},
      year = {2018},
      address = {Merida, Mexico},
      month = oct,
      url = {https://arxiv.org/pdf/1807.08912.pdf},
      keywords = {press},
      owner = {apoorva},
      timestamp = {2018-10-07}
    }
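
    The core of ALPaCA's online phase is Bayesian linear regression on top of a learned feature map. The sketch below substitutes a random-feature placeholder for the meta-learned network and a zero prior mean, so it is only a stand-in for the algorithm, but it shows the recursive posterior update and the posterior predictive variance described in the abstract.

      import numpy as np

      rng = np.random.default_rng(7)

      # Bayesian linear regression on a fixed feature map; in ALPaCA the features
      # (a neural network) and the prior are meta-learned from sample functions.
      n_feat, sigma2 = 20, 0.05**2
      W = rng.normal(scale=3.0, size=n_feat)
      b = rng.uniform(0, 2 * np.pi, size=n_feat)
      phi = lambda x: np.cos(W * x + b)              # placeholder "meta-learned" features

      Lambda = np.eye(n_feat)                         # prior precision
      Q = np.zeros(n_feat)                            # precision-weighted weights

      def observe(x, y):
          """Rank-one recursive posterior update, O(n_feat^2) per sample."""
          global Lambda, Q
          Lambda = Lambda + np.outer(phi(x), phi(x))
          Q = Q + phi(x) * y

      def predict(x):
          """Posterior predictive mean and variance at a query point."""
          f = phi(x)
          K = np.linalg.solve(Lambda, Q)
          var = sigma2 * (1.0 + f @ np.linalg.solve(Lambda, f))
          return f @ K, var

      target = lambda x: np.sin(3 * x)               # one "task" encountered at test time
      for x in rng.uniform(-1, 1, size=30):
          observe(x, target(x) + 0.05 * rng.normal())

      mean, var = predict(0.5)
      print(f"prediction at 0.5: {mean:.3f} +/- {np.sqrt(var):.3f} (true {target(0.5):.3f})")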
    
  12. B. Ichter, J. Harrison, and M. Pavone, “Learning Sampling Distributions for Robot Motion Planning,” in Proc. IEEE Conf. on Robotics and Automation, Brisbane, Australia, 2018.

    Abstract: A defining feature of sampling-based motion planning is the reliance on an implicit representation of the state space, which is enabled by a set of probing samples. Traditionally, these samples are drawn either probabilistically or deterministically to uniformly cover the state space. Yet, the motion of many robotic systems is often restricted to "small" regions of the state space, due to e.g. differential constraints or collision-avoidance constraints. To accelerate the planning process, it is thus desirable to devise non-uniform sampling strategies that favor sampling in those regions where an optimal solution might lie. This paper proposes a methodology for non-uniform sampling, whereby a sampling distribution is learnt from demonstrations, and then used to bias sampling. The sampling distribution is computed through a conditional variational autoencoder, allowing sample generation from the latent space conditioned on the specific planning problem. This methodology is general, can be used in combination with any sampling-based planner, and can effectively exploit the underlying structure of a planning problem while maintaining the theoretical guarantees of sampling-based approaches. Specifically, on several planning problems, the proposed methodology is shown to effectively learn representations for the relevant regions of the state space, resulting in an order of magnitude improvement in terms of success rate and convergence to the optimal cost.

    @inproceedings{IchterHarrisonEtAl2018,
      author = {Ichter, B. and Harrison, J. and Pavone, M.},
      title = {Learning Sampling Distributions for Robot Motion Planning},
      booktitle = {{Proc. IEEE Conf. on Robotics and Automation}},
      year = {2018},
      address = {Brisbane, Australia},
      month = may,
      url = {https://arxiv.org/pdf/1709.05448.pdf},
      owner = {frossi2},
      timestamp = {2018-01-16}
    }
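
    The sampling strategy in the abstract can be approximated in a few lines: draw a fraction of a planner's samples from a distribution fit to demonstrations and the rest uniformly, so the learned bias focuses search while uniform samples preserve coverage and the planner's guarantees. The sketch below substitutes a single Gaussian fit for the paper's conditional variational autoencoder, and the workspace, demonstrations, and corridor are invented.

      import numpy as np

      rng = np.random.default_rng(8)

      # Workspace [0,1]^2; "demonstration" states concentrated along a corridor that
      # successful plans for this problem class tend to pass through.
      demos = np.column_stack([rng.uniform(0.2, 0.8, 500),
                               0.5 + 0.05 * rng.normal(size=500)])

      # Stand-in for the conditional VAE: a Gaussian fit to the demonstrations
      # (the paper instead decodes latent samples conditioned on the planning problem).
      mu, cov = demos.mean(axis=0), np.cov(demos.T)

      def sample_states(n, learned_fraction=0.5):
          """Non-uniform sampling for a sampling-based planner: mix learned and uniform samples."""
          n_learned = int(learned_fraction * n)
          learned = rng.multivariate_normal(mu, cov, size=n_learned)
          uniform = rng.uniform(0.0, 1.0, size=(n - n_learned, 2))
          return np.clip(np.vstack([learned, uniform]), 0.0, 1.0)

      samples = sample_states(1000)
      in_corridor = np.mean(np.abs(samples[:, 1] - 0.5) < 0.1)
      print(f"fraction of samples in the corridor: {in_corridor:.2f} (uniform would give ~0.20)")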
    
  13. J. Harrison, A. Garg, B. Ivanovic, Y. Zhu, S. Savarese, F.-F. Li, and M. Pavone, “ADAPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems,” in Int. Symp. on Robotics Research, Puerto Varas, Chile, 2017.

    Abstract: Model-free policy learning has enabled robust performance of complex tasks with relatively simple algorithms. However, this simplicity comes at the cost of requiring an oracle and arguably very poor sample complexity. This renders such methods unsuitable for physical systems. Variants of model-based methods address this problem through the use of simulators, however, this gives rise to the problem of policy transfer from simulated to the physical system. Model mismatch due to systematic parameter shift and unmodelled dynamics error may cause suboptimal or unsafe behavior upon direct transfer. We introduce the Adaptive Policy Transfer for Stochastic Dynamics (ADAPT) algorithm that achieves provably safe and robust, dynamically-feasible zero-shot transfer of RL policies to new domains with dynamics error. ADAPT combines the strengths of offline policy learning in a black-box source simulator with online tube-based MPC to attenuate bounded model mismatch between the source and target dynamics. ADAPT allows online transfer of a policy, trained solely in simulation offline, to a family of unknown targets without fine-tuning. We also formally show that (i) ADAPT guarantees state and control safety through state-action tubes under the assumption of Lipschitz continuity of the divergence in dynamics and, (ii) ADAPT results in a bounded loss of reward accumulation in case of direct transfer with ADAPT as compared to a policy trained only on target. We evaluate ADAPT on 2 continuous, non-holonomic simulated dynamical systems with 4 different disturbance models, and find that ADAPT performs between 50%-300% better on mean reward accrual than direct policy transfer.

    @inproceedings{HarrisonGargEtAl2017,
      author = {Harrison, J. and Garg, A. and Ivanovic, B. and Zhu, Y. and Savarese, S. and Li, F.-F. and Pavone, M.},
      title = {{ADAPT:} Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems},
      booktitle = {{Int. Symp. on Robotics Research}},
      year = {2017},
      address = {Puerto Varas, Chile},
      month = dec,
      url = {https://arxiv.org/pdf/1707.04674.pdf},
      owner = {pavone},
      timestamp = {2018-01-16}
    }
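
    A stand-in for the transfer mechanism sketched in the abstract: a nominal trajectory is rolled out in the source (simulator) model under the offline policy, and on the target system the nominal controls are replayed with an ancillary feedback term that attenuates the source/target mismatch, a simple proxy for ADAPT's tube-based MPC. The dynamics, mismatch, and placeholder LQR policy are all invented for the example; scipy is assumed only for the Riccati solve.

      import numpy as np
      from scipy.linalg import solve_discrete_are

      # Source (simulator) model: a double integrator. The target system adds an
      # unmodeled drag term, i.e. bounded model mismatch between source and target.
      dt = 0.1
      A = np.array([[1.0, dt], [0.0, 1.0]])
      B = np.array([[0.0], [dt]])

      def target_step(x, u):
          drag = -0.4 * x[1] * abs(x[1])                       # dynamics error absent from the source
          return A @ x + B[:, 0] * u + np.array([0.0, dt]) * drag

      # Placeholder for the policy trained offline in the simulator: saturated LQR to the origin.
      P = solve_discrete_are(A, B, np.eye(2), np.array([[0.1]]))
      K = np.linalg.solve(B.T @ P @ B + 0.1, B.T @ P @ A).ravel()
      policy = lambda x: float(np.clip(-K @ x, -1.0, 1.0))

      # Nominal trajectory generated in the source model by the offline policy.
      xs_nom, us_nom = [np.array([3.0, 0.0])], []
      for _ in range(80):
          us_nom.append(policy(xs_nom[-1]))
          xs_nom.append(A @ xs_nom[-1] + B[:, 0] * us_nom[-1])

      def replay(feedback_gain):
          """Replay the nominal controls on the target, optionally with ancillary feedback."""
          x, max_dev = xs_nom[0].copy(), 0.0
          for k, u_nom in enumerate(us_nom):
              u = u_nom - float(feedback_gain @ (x - xs_nom[k]))
              x = target_step(x, u)
              max_dev = max(max_dev, float(np.linalg.norm(x - xs_nom[k + 1])))
          return max_dev

      print(f"max deviation from nominal: open loop {replay(np.zeros(2)):.3f}, "
            f"with ancillary feedback {replay(K):.3f}")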