This course introduces advanced, state-of-the-art deep learning methods used in robotics. These methods are studied in the context of robotic reinforcement learning, behavior cloning, learning from demonstration, imitation learning, predictive world modeling, and hybrid reasoning tasks. Students are expected to learn the details of these methods, together with their motivations and limitations in different robotic problems, and to gain hands-on experience with them. A simulated robotic environment in PyBullet with an arm-hand robot and relevant sensors is provided to the students. Using relevant deep learning libraries, students implement robot learning frameworks in which the robot learns to acquire the desired skills.
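To give a flavor of the hands-on work, the sketch below shows a generic simulate-observe-act loop in PyBullet with a placeholder random policy. The course's arm-hand environment is provided separately; the bundled KUKA iiwa model and all numeric settings here are illustrative assumptions, not the course setup.

```python
# A minimal simulate-observe-act loop in PyBullet (a sketch, not the course env).
import numpy as np
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                   # headless; use p.GUI to visualize
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
arm = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)   # placeholder arm
n_joints = p.getNumJoints(arm)

def observe():
    """Return joint positions and velocities as a flat state vector."""
    states = [p.getJointState(arm, j) for j in range(n_joints)]
    return np.array([s[0] for s in states] + [s[1] for s in states])

for step in range(240):
    action = np.random.uniform(-1.0, 1.0, n_joints)   # random policy as a stand-in
    for j in range(n_joints):
        p.setJointMotorControl2(arm, j, p.POSITION_CONTROL, targetPosition=action[j])
    p.stepSimulation()
    state = observe()                                 # would feed a learning algorithm

p.disconnect()
```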
Textbook
Readings are drawn from the textbook chapters and research papers listed in the tentative outline below.
Homeworks
Homeworks may be completed in pairs; however, partners must change for each homework. The robot simulation environment or a robot interaction dataset is provided to the students for each homework.
Group Project
A group of up to three students extends one of the methods implemented in the course and submits this extension to a local journal. Submission to the journal is required for the project to be eligible for evaluation.
Tentative Outline
- Introduction to deep learning, backpropagation, convolutional neural networks
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016. (Chapters 6 & 9)
- Deep Q-learning / CNNs (a minimal DQN update sketch follows the readings below)
- Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An introduction. MIT press, 2018. (Chapters 6, 9, 10 & 11.1.)
- Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
- Zeng, Andy, et al. "Tossingbot: Learning to throw arbitrary objects with residual physics." IEEE Transactions on Robotics 36.4 (2020): 1307-1319.
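For orientation, here is a minimal sketch of the DQN temporal-difference update in PyTorch; the MLP sizes, hyperparameters, and the random batch standing in for replay-buffer samples are assumptions, not the settings used in the papers above.

```python
# Sketch of one DQN update: squared TD error against a frozen target network.
import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 8, 4, 0.99                 # placeholder dimensions
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_update(s, a, r, s_next, done):
    """One gradient step on (Q(s,a) - (r + gamma * max_a' Q_target(s',a')))^2."""
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Random batch standing in for replay-buffer samples:
B = 32
dqn_update(torch.randn(B, obs_dim), torch.randint(n_actions, (B,)),
           torch.randn(B), torch.randn(B, obs_dim), torch.zeros(B))
```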
- Deep policy gradients, actor-critic algorithms (GAE, PPO, TRPO, SAC); a minimal PPO loss sketch follows the readings below
- Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An introduction. MIT press, 2018. (Chapter 13)
- Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017).
- Haarnoja, Tuomas, et al. "Soft actor-critic algorithms and applications." arXiv preprint arXiv:1812.05905 (2018).
- Schulman, John, et al. "High-dimensional continuous control using generalized advantage estimation." arXiv preprint arXiv:1506.02438 (2015).
- Schulman, John, et al. "Trust region policy optimization." International conference on machine learning. PMLR, 2015.
- Offline Deep-RL
- Levine, Sergey, et al. "Offline reinforcement learning: Tutorial, review, and perspectives on open problems." arXiv preprint arXiv:2005.01643 (2020).
- Singh, Avi, et al. "Cog: Connecting new skills to past experience with offline reinforcement learning." arXiv preprint arXiv:2010.14500 (2020).
- Learning from Demonstration and RL with CNMPs (a minimal CNMP-style sketch follows the readings below):
- Gaussian Process Regression
- Garnelo, Marta, et al. "Conditional neural processes." International conference on machine learning. PMLR, 2018.
- Seker, Muhammet Yunus, et al. "Conditional Neural Movement Primitives." Robotics: Science and Systems. Vol. 10. 2019.
- Akbulut, Mete, et al. "Acnmp: Skill transfer and task extrapolation through learning from demonstration and reinforcement learning via representation sharing." Conference on Robot Learning. PMLR, 2021.
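The sketch below shows a conditional-neural-process-style model in the spirit of CNMPs: context (time, value) pairs are encoded, mean-aggregated, and a decoder predicts a Gaussian over values at query times. The layer sizes and single-trajectory usage are assumptions, not the papers' exact architecture.

```python
# Conditional-neural-process-style sketch: condition on observed points, predict a
# Gaussian over the trajectory value at query times.
import torch
import torch.nn as nn

d_y, d_hid = 2, 128   # assumed trajectory and hidden dimensions
encoder = nn.Sequential(nn.Linear(1 + d_y, d_hid), nn.ReLU(), nn.Linear(d_hid, d_hid))
decoder = nn.Sequential(nn.Linear(d_hid + 1, d_hid), nn.ReLU(), nn.Linear(d_hid, 2 * d_y))

def cnp_forward(context_t, context_y, target_t):
    """Encode context pairs, mean-aggregate, and decode mean/std at target times."""
    r = encoder(torch.cat([context_t, context_y], dim=-1)).mean(dim=0)   # aggregation
    r = r.expand(target_t.shape[0], -1)
    mean, log_std = decoder(torch.cat([r, target_t], dim=-1)).chunk(2, dim=-1)
    return mean, log_std.exp()

# Condition on 5 observed points of a 2-D trajectory and query 50 time steps;
# training would minimize the Gaussian negative log-likelihood at the targets.
ctx_t, ctx_y = torch.rand(5, 1), torch.randn(5, d_y)
mean, std = cnp_forward(ctx_t, ctx_y, torch.linspace(0, 1, 50).unsqueeze(-1))
```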
- Learning from Demonstration (LSTMs)
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016. (Chapter 10)
- Seker, M. Yunus, Ahmet E. Tekden, and Emre Ugur. "Deep effect trajectory prediction in robot manipulation." Robotics and Autonomous Systems 119 (2019): 173-184.
- Rahmatizadeh, Rouhollah, et al. "From virtual demonstration to real-world manipulation using LSTM and MDN." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. No. 1. 2018.
- Behavior Cloning, Transformers in Robotics (a minimal sketch follows the readings below):
- Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).
- Dasari, Sudeep, and Abhinav Gupta. "Transformers for one-shot visual imitation." Conference on Robot Learning. PMLR, 2021.
- Chen, Lili, et al. "Decision transformer: Reinforcement learning via sequence modeling." Advances in neural information processing systems 34 (2021): 15084-15097.
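Below is a rough sketch of behavior cloning with a causal transformer encoder over short state histories, loosely in the spirit of sequence models such as the Decision Transformer; the dimensions, context length, and MSE objective are assumptions for illustration only.

```python
# Behavior-cloning sketch: a causal transformer maps a state history to actions.
import torch
import torch.nn as nn

state_dim, act_dim, d_model, ctx = 8, 3, 64, 10        # placeholder sizes

embed = nn.Linear(state_dim, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(d_model, act_dim)

def predict_actions(states):
    """Map a (batch, ctx, state_dim) history to per-step action predictions."""
    mask = nn.Transformer.generate_square_subsequent_mask(states.shape[1])  # causal mask
    return head(encoder(embed(states), mask=mask))

# One behavior-cloning step on dummy demonstration data:
states = torch.randn(16, ctx, state_dim)
expert_actions = torch.randn(16, ctx, act_dim)
loss = nn.functional.mse_loss(predict_actions(states), expert_actions)
loss.backward()
```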
- Predictive Coding with GNNs
- Tekden, Ahmet E., et al. "Belief regulated dual propagation nets for learning action effects on groups of articulated objects." 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020.
- Adversarial Models
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016. (Chapter 20.10.4)
- Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Advances in neural information processing systems 29 (2016).
- Contrastive Learning (a minimal contrastive loss sketch follows the readings below)
- Srinivas, Aravind, Michael Laskin, and Pieter Abbeel. "Curl: Contrastive unsupervised representations for reinforcement learning." arXiv preprint arXiv:2004.04136 (2020).
- Li, Yunzhu, et al. "3d neural scene representations for visuomotor control." Conference on Robot Learning. PMLR, 2022.
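Below is a sketch of an InfoNCE-style contrastive loss in the spirit of CURL, where two encoded augmentations of the same observation are matched within the batch; the bilinear similarity, feature dimension, and random features are simplified placeholders.

```python
# Contrastive (InfoNCE-style) loss sketch: diagonal entries are the positive pairs.
import torch
import torch.nn as nn

feat_dim = 50
W = nn.Parameter(0.01 * torch.randn(feat_dim, feat_dim))    # bilinear similarity

def contrastive_loss(z_anchor, z_positive):
    """Cross-entropy over pairwise similarities; sample i matches positive i."""
    logits = z_anchor @ W @ z_positive.t()                   # (B, B) similarities
    logits = logits - logits.max(dim=1, keepdim=True).values # numerical stability
    labels = torch.arange(z_anchor.shape[0])
    return nn.functional.cross_entropy(logits, labels)

# Random features standing in for encoded image augmentations:
z_q, z_k = torch.randn(32, feat_dim), torch.randn(32, feat_dim)
contrastive_loss(z_q, z_k).backward()
```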
- Large Language Models in Robotics
- Ahn, Michael, et al. "Do as i can, not as i say: Grounding language in robotic affordances." arXiv preprint arXiv:2204.01691 (2022).
- Shridhar, Mohit, Lucas Manuelli, and Dieter Fox. "Cliport: What and where pathways for robotic manipulation." Conference on Robot Learning. PMLR, 2022.
- Fan, Linxi, et al. "Minedojo: Building open-ended embodied agents with internet-scale knowledge." arXiv preprint arXiv:2206.08853 (2022).
- Neuro-symbolic Methods: Symbol Emergence
- Konidaris, George, Leslie Pack Kaelbling, and Tomas Lozano-Perez. "From skills to symbols: Learning symbolic representations for abstract high-level planning." Journal of Artificial Intelligence Research 61 (2018): 215-289.
- LatPlan: Asai, Masataro, and Alex Fukunaga. "Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. No. 1. 2018.
- Ahmetoglu, Alper, et al. "DeepSym: Deep Symbol Generation and Rule Learning for Planning from Unsupervised Robot Interaction." Journal of Artificial Intelligence Research 75 (2022): 709-745.
- Diffusion Methods in Robotics
- Lambert, Alexander, et al. "Learning Implicit Priors for Motion Optimization." arXiv preprint arXiv:2204.05369 (2022).
- Project presentations