Stairway to Autonomy: Hierarchical Decision-Making for LLM-Guided Planning, Bandit-Driven Exploration, and Multi-Agent Navigation

Author(s)

Nayak, Siddharth Nagar

Download Thesis PDF (27.13Mb)

Advisor

Balakrishnan, Hamsa

Terms of use

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/

Abstract

Autonomous multi-agent systems must efficiently plan, explore, and navigate in dynamic and unknown environments, particularly for tasks like search and rescue and environmental monitoring. These settings are often characterized by partial observability, limited communication, and dynamic objectives that require flexible coordination across agents. Designing autonomy that scales with team size and task complexity requires modular decision-making systems capable of high-level reasoning, information-driven exploration, and robust decentralized execution. This dissertation presents a hierarchical decision-making framework that addresses these challenges across three complementary levels of autonomy: high-level planning, adaptive exploration, and decentralized scalable navigation. At the highest level, LLaMAR (Language Model-based Long-Horizon Planner for Multi-Agent Robotics) leverages large language models (LLMs) to decompose long-horizon tasks into structured subtasks, enabling agents to adapt their strategies dynamically. However, the effective execution of these plans requires knowledge about the environment. Our mid-level exploration strategy, BaTMaN (Bandit-based Tracking and Monitoring and Navigation), systematically prioritizes waypoints that maximize information gain while balancing real-world constraints such as energy efficiency and sensor reliability. Finally, InforMARL provides scalable, decentralized navigation by leveraging graph-based local information aggregation, improving sample efficiency and demonstrating transferability to unseen team sizes.

This dissertation develops each of these modules to address a distinct level of the autonomy stack. LLaMAR functions as the high-level planner, translating natural language goals into structured sequences of subtasks and incorporating real-time corrections through a plan-act-correct-verify cycle. BaTMaN serves as the mid-level exploration engine, guiding sensor-equipped agents to prioritize informative regions based on uncertainty. InforMARL operates at the execution level, enabling decentralized agents to navigate through dynamic environments using graph-based local information aggregation and reactive control policies. Each module is independently deployable and optimized for different challenges: strategic reasoning, data-efficient monitoring, and scalable navigation, respectively. When combined, the three modules form a coherent autonomy stack for multi-agent systems operating under uncertainty.

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/162935

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses