dc.contributor.advisor: Youcef-Toumi, Kamal
dc.contributor.author: Palleiko, Andrew
dc.date.accessioned: 2025-10-29T17:42:53Z
dc.date.available: 2025-10-29T17:42:53Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-06-26T14:15:20.902Z
dc.identifier.uri: https://hdl.handle.net/1721.1/163461
dc.description.abstract: Imitation learning is a popular approach for obtaining intelligent robotic policies by learning from human demonstrations. Within this field, there is significant interest in the development of multi-task architectures that can efficiently learn diverse sets of tasks. Skill-based imitation learning methods, which abstract action sequences into "skill" representations for planning, offer structural advantages for handling the challenges of multi-task imitation learning that make them an attractive option for this problem. This work presents a novel skill-based imitation learning architecture formulation, with a causal transformer VAE skill-abstraction network paired with an autoregressive transformer planning policy. We find that our skill-abstraction network shows promise in identifying meaningful skills, but that the chosen planning architecture is poorly suited for predicting these skills due to multimodality in the resulting latent space. This is followed by a set of evaluations applied to an existing skill-based method with comparisons to a non-skill-based network on a multi-task dataset. We systematically investigate the performance impacts of six different policy and dataset conditions: data quantity, task variety, retry behavior, control precision, goal representations, and zero-shot transfer. Our experiments reveal limited increases in skill-based policy performance with more demonstrations or task variety, but improvements across architectures through exposure to demonstration retry behavior. Overall, the skill-based architecture demonstrates greater robustness to goal representation variations and low-level process noise than the non-skill-based policy, while neither architecture achieves meaningful zero-shot generalization to novel task combinations. These findings provide insights into the current state of IL methods, with the additional goal of establishing a framework for the evaluation of future multi-task IL architectures.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Design and Evaluation of Skill-Based Imitation Learning Policies for Robotic Manipulation
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Mechanical Engineering
dc.identifier.orcid: https://orcid.org/0000-0001-5805-7358
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Mechanical Engineering