392241 Project: What do Robots Actually See? Comparing Visual Representations for Reinforcement Learning with a 3D-printed Arm (Pj) (WiSe 2025/2026)

Contents, comment

Robots learn to interact with the world through visual input. But how exactly does the choice of visual representation, the way a robot "sees" its environment, impact its learning ability? This project aims to systematically evaluate several state-of-the-art pretrained encoders (e.g., ResNet-50, DINOv2, R3M, VIP, CLIP) by analyzing their visual attention patterns and correlating these with RL learning speed and success rate in a robotic manipulation task. You will embed images from a simulated WidowX pick-and-place task, generate attention heat maps, and measure how well attention overlaps with relevant objects. Then, you will plug these frozen encoders into an off-policy RL algorithm and empirically evaluate their downstream performance.
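As a rough illustration of what "measuring how well attention overlaps with relevant objects" could mean, here is a minimal sketch of two possible overlap scores. It is not the project's prescribed metric. The function names, the use of plain nested lists for the heat map, and the binary object mask are all assumptions made for the example; in practice the heat map would come from an encoder and the mask from the simulator.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]
Mask = Sequence[Sequence[int]]


def _check_same_shape(heatmap: Grid, mask: Mask) -> None:
    """Raise if the heat map and the mask do not have identical shapes."""
    if len(heatmap) != len(mask) or any(
        len(h_row) != len(m_row) for h_row, m_row in zip(heatmap, mask)
    ):
        raise ValueError("heatmap and mask must have the same shape")


def attention_mass_in_mask(heatmap: Grid, mask: Mask) -> float:
    """Fraction of the total (non-negative) attention that lies on object pixels."""
    _check_same_shape(heatmap, mask)
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        return 0.0  # no attention anywhere, so nothing overlaps
    inside = sum(
        a
        for h_row, m_row in zip(heatmap, mask)
        for a, m in zip(h_row, m_row)
        if m
    )
    return inside / total


def pointing_game_hit(heatmap: Grid, mask: Mask) -> bool:
    """True if the single most-attended pixel lies on the object."""
    _check_same_shape(heatmap, mask)
    _, row, col = max(
        (a, r, c) for r, h_row in enumerate(heatmap) for c, a in enumerate(h_row)
    )
    return bool(mask[row][col])


# Hypothetical 2x2 heat map and object mask (toy values, not project data).
toy_heatmap = [[1.0, 1.0], [2.0, 4.0]]
toy_mask = [[0, 0], [0, 1]]
print(attention_mass_in_mask(toy_heatmap, toy_mask))  # 0.5
print(pointing_game_hit(toy_heatmap, toy_mask))       # True
```

The mass-based score rewards attention that is concentrated on the object, while the pointing-game check only looks at the peak; reporting both is one way to avoid relying on a single number.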

This project is a valuable opportunity for students interested in computer vision and robot learning, offering hands-on experience with modern deep learning frameworks, reinforcement learning, and simulation environments. You will directly contribute to understanding how visual representation quality impacts robot learning performance. Where applicable, your results may be published as a short benchmark note or as an appendix to an existing paper.

For more details or to apply, feel free to contact me directly via email or in person.

Requirements for participation, required level

You should have strong familiarity with PyTorch and a Python-based simulation framework. Experience with OpenCV and basic reinforcement learning concepts (e.g., SAC or similar) will be beneficial. The project provides you with a prebuilt 3D-printed WidowX arm, ready-to-use MuJoCo simulation environments, baseline RL implementations, and all necessary computational resources (though bringing your own GPU is a plus).

Teaching staff

Dates

Frequency Weekday Time Format / Place Period  
by appointment n.V.   13.10.2025-06.02.2026

Subject assignments

Module Course Requirements  
39-M-Inf-P Projekt Projekt Ungraded examination
Student information

The binding module descriptions contain further information, including specifications on the "types of assignments" students need to complete. In cases where a module description mentions more than one kind of assignment, the respective member of the teaching staff will decide which task(s) they assign the students.


Analyze and compare how different pretrained visual encoders affect robotic reinforcement learning performance, efficiency, and interpretability for a simulated manipulation task.

No eLearning offering available
Address:
WS2025_392241@ekvv.uni-bielefeld.de
This address can be used by teaching staff, their secretary's offices as well as the individuals in charge of course data maintenance to send emails to the course participants. IMPORTANT: All sent emails must be activated. Wait for the activation email and follow the instructions given there.
If the reference number is used for several courses during the semester, use the following alternative address to reach the participants of exactly this course: VST_568328324@ekvv.uni-bielefeld.de
Last update basic details/teaching staff:
Tuesday, June 17, 2025 
Last update times:
Sunday, June 15, 2025 
Last update rooms:
Sunday, June 15, 2025 
Type(s) / SWS (hours per week per semester)
project (Pj) / 2
Language
This lecture is taught in English
Department
Faculty of Technology
Links to this course
If you want to set links to this course page, please use one of the following links. Do not use the link shown in your browser!
The following link includes the course ID and is always unique:
https://ekvv.uni-bielefeld.de/kvv_publ/publ/vd?id=568328324
ID
568328324