Complex robotic tasks like assembling objects or stacking blocks often involve multiple, smaller steps or phases. Humans naturally split these complex actions into clear sub-steps ("pick up block", "place block on stack"), but explicitly labeling and teaching robots each step individually is tedious and costly. This project explores unsupervised methods, based on pretrained visual representations, to automatically identify sub-task boundaries from unlabeled video demonstrations. You will collect simulated demonstrations of a three-block stacking task, identify potential phase boundaries based on visual embedding similarities, and integrate these discovered phases into a reset-free reinforcement learning framework. The goal is to significantly reduce external resets during training and speed up policy convergence.
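To make the idea of finding phase boundaries from embedding similarities concrete, here is a minimal sketch. It is not the project's prescribed method. It assumes each video frame has already been mapped to an embedding vector (for example by a pretrained encoder), and it flags a boundary wherever the cosine similarity between consecutive frames drops below a threshold. The function names, the threshold value, and the synthetic embeddings standing in for real encoder features are all illustrative assumptions.

```python
import numpy as np


def consecutive_cosine_similarity(embeddings):
    """Cosine similarity between each frame embedding and the next one."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normed = embeddings / np.maximum(norms, 1e-12)  # guard against zero vectors
    return np.sum(normed[:-1] * normed[1:], axis=1)


def find_phase_boundaries(embeddings, threshold=0.5):
    """Return frame indices at which a new phase is assumed to start.

    A boundary is reported at frame i + 1 whenever the similarity between
    frames i and i + 1 falls below `threshold` (a hypothetical setting that
    would need tuning on real data).
    """
    sims = consecutive_cosine_similarity(embeddings)
    return [int(i) + 1 for i in np.flatnonzero(sims < threshold)]


# Synthetic stand-in for encoder features: three phases, each clustered
# around its own direction in embedding space, plus a little noise.
rng = np.random.default_rng(0)
dim = 64
phase_lengths = [20, 15, 25]
centers = np.eye(dim)[:3]  # three mutually orthogonal phase centers
frames = np.concatenate([
    center + 0.01 * rng.standard_normal((n, dim))
    for center, n in zip(centers, phase_lengths)
])

boundaries = find_phase_boundaries(frames, threshold=0.5)
print(boundaries)  # [20, 35]
```

On real demonstrations the similarity signal is typically much noisier than in this toy data, so smoothing over a window or clustering the embeddings may be needed before thresholding.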
This project is particularly valuable for students interested in task decomposition, representation learning, and hierarchical RL. It provides practical experience in a highly relevant research area and a solid foundation for future thesis or project work. Where applicable, your results can be published as a short benchmark note or as an appendix to an existing paper.
For more details or to apply, feel free to contact me directly by email or in person.
You should be familiar with a Python-based simulation framework and with visual embeddings (e.g., CLIP or ResNets). Experience with PyTorch and basic RL knowledge will be beneficial. The project provides you with a prebuilt, 3D-printed WidowX arm, ready-to-use MuJoCo simulation environments, baseline RL implementations, and all necessary computational resources (though bringing your own GPU is a plus).
Frequency | Day | Time | Format / Location | Period | |
---|---|---|---|---|---|
by arrangement | by arrangement | | | 13.10.2025-06.02.2026 | |
Module | Course | Requirements | |
---|---|---|---|
39-M-Inf-P Projekt | Project | ungraded examination | Student information |
The binding module descriptions contain further information, including on the requirements ("Leistungen") and what they entail. Where several forms of assessment are possible, the respective teaching staff decide which one applies.
Automatically discover sub-task boundaries within complex manipulation tasks to enable efficient long-horizon Reinforcement Learning.