Abhinav Bhatia

Publications

RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$
Bhatia, A., Nashed, S. B., & Zilberstein, S. (2023). In NeurIPS Workshop on Generalization in Planning.
TL;DR: Incorporating task-specific Q-value estimates as inputs to a meta-RL policy can lead to improved generalization and better performance over longer adaptation periods. PDF

Selecting the Partial State Abstractions of MDPs: A Metareasoning Approach with Deep Reinforcement Learning
Nashed, S. B., Svegliato, J., Bhatia, A., Russell, S., & Zilberstein, S. (2022). In IEEE/RSJ International Conference on Intelligent Robots and Systems.
TL;DR: Deep RL to select the most informative partial state abstractions for MDPs at runtime, optimizing the time-dependent utility of the final solution. Good results across a variety of domains. PDF

Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL
Bhatia, A., Thomas, P. S., & Zilberstein, S. (2022). arXiv preprint arXiv:2206.02380.
TL;DR: Meta-level deep RL to adapt the rollout length in model-based RL non-myopically, based on feedback from the learning process such as model accuracy, learning progress, and sample scarcity. PDF

Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning
Bhatia, A., Svegliato, J., Nashed, S. B., & Zilberstein, S. (2022). In Proceedings of the International Conference on Automated Planning and Scheduling.
TL;DR: Deep RL to determine the optimal stopping point and hyperparameters of anytime algorithms at runtime to optimize the utility of the final solution. Good results on the Anytime A* search algorithm and the RRT* motion planning algorithm. PDF

Tuning the Hyperparameters of Anytime Planning: A Deep Reinforcement Learning Approach
Bhatia, A., Svegliato, J., & Zilberstein, S. (2021). In ICAPS Workshop on Heuristics and Search for Domain-independent Planning.
TL;DR: Deep RL to control the hyperparameters of anytime algorithms at runtime to optimize the quality of the final solution. Good results on the Anytime A* search algorithm. PDF

On the Benefits of Randomly Adjusting Anytime Weighted A*
Bhatia, A., Svegliato, J., & Zilberstein, S. (2021). In Proceedings of the International Symposium on Combinatorial Search.
TL;DR: Randomized Weighted A* tunes the weight in Anytime Weighted A* randomly at runtime and outperforms every static-weight baseline. PDF

Resource Constrained Deep Reinforcement Learning
Bhatia, A., Varakantham, P., & Kumar, A. (2019). In Proceedings of the International Conference on Automated Planning and Scheduling.
TL;DR: Deep RL to optimize constrained resource allocation at city scale. Good results on realistic datasets. PDF