I study Reinforcement Learning (RL) with the goal of creating general-purpose robots that can learn from raw sensorimotor data. I believe that the ability to discover representations that permit decision making over various timescales is a hallmark of intelligence and will allow AI agents to solve hard problems in harsh environments. As a result, I am currently working on the skill discovery problem: how can agents automatically break down complex problems into simpler sub-problems simply by interacting with their environment?
I graduated from Harvey Mudd College in 2016, where I was part of the Lab for Autonomous and Intelligent Robotics (LAIR), advised by Professor Chris Clark. I then worked at Apple in Cupertino, CA for two years on the Multitouch Algorithms team under the leadership of Wayne Westerman.
- Dec 2019: Our paper Option Discovery using Deep Skill Chaining was accepted at ICLR 2020! Can’t wait to go to Ethiopia!
- Dec 2019: I presented a poster on skill discovery at the Deep Reinforcement Learning Workshop at NeurIPS in Vancouver, Canada.
- Oct 2019: I proposed my research comps on Skill Discovery for Long-Horizon Problems. Thanks to my committee members George Konidaris, Michael Littman, and Stefanie Tellex for their valuable feedback. Looking forward to making progress on this problem and defending next semester!
Replication of a Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment
Akhil Bagaria, Seungchan Kim, Alessio Mazzetto, Rafael Rodriguez-Sanchez
Accepted, NeurIPS 2019 Replication Challenge