The goal of multimodal affect recognition is to predict affect (i.e., a basic sense of feeling) from multimodal data such as videos, which contain audio, visual, and textual modalities. State-of-the-art methods for affect recognition increasingly rely on deep learning. These methods, however, often use pre-extracted features for each of the modalities. By pre-extracting features, models miss out on the power of representation learning afforded by deep learning. In this project, you will implement multimodal models using an end-to-end approach (i.e., methods that operate on raw data instead of pre-extracted features), as sketched below, and compare this approach to existing multimodal methods that use pre-extracted features.
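To make the end-to-end idea concrete, here is a minimal PyTorch sketch of a three-branch model that consumes raw waveforms, video frames, and token ids directly, with no pre-extracted features. Everything in it — the class name EndToEndAffectModel, the layer sizes, and the simple late-fusion-by-concatenation step — is an illustrative assumption, not a prescribed architecture for the project.

```python
# A minimal sketch of an end-to-end multimodal affect model (assumed
# architecture, for illustration only). Each branch encodes raw input;
# the modality vectors are concatenated and classified jointly.
import torch
import torch.nn as nn


class EndToEndAffectModel(nn.Module):
    def __init__(self, vocab_size=10000, hidden=128, num_classes=7):
        super().__init__()
        # Audio branch: 1D convolutions applied to the raw waveform.
        self.audio_enc = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=80, stride=16), nn.ReLU(),
            nn.Conv1d(32, hidden, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Visual branch: 2D convolutions applied to each raw video frame.
        self.visual_enc = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, hidden, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Text branch: learned embeddings + GRU over raw token ids.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.text_enc = nn.GRU(hidden, hidden, batch_first=True)
        # Late fusion: concatenate the three modality representations.
        self.classifier = nn.Linear(3 * hidden, num_classes)

    def forward(self, wave, frames, tokens):
        # wave: (B, 1, T_audio); frames: (B, T_vid, 3, H, W); tokens: (B, T_txt)
        a = self.audio_enc(wave).squeeze(-1)          # (B, hidden)
        b, t = frames.shape[:2]
        v = self.visual_enc(frames.flatten(0, 1))     # (B*T_vid, hidden, 1, 1)
        v = v.view(b, t, -1).mean(dim=1)              # average over frames
        _, h = self.text_enc(self.embed(tokens))      # h: (1, B, hidden)
        x = torch.cat([a, v, h.squeeze(0)], dim=-1)   # (B, 3*hidden)
        return self.classifier(x)


# Smoke test with random raw inputs.
model = EndToEndAffectModel()
logits = model(torch.randn(2, 1, 16000),              # 1 s of 16 kHz audio
               torch.randn(2, 8, 3, 64, 64),          # 8 RGB frames
               torch.randint(0, 10000, (2, 20)))      # 20 token ids
print(logits.shape)  # torch.Size([2, 7])
```

In the project itself, each toy branch would be replaced by a stronger trainable encoder, and the same classifier head fed with pre-extracted features would serve as the baseline for comparison.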
Required skills:
• Python
• PyTorch preferred (TensorFlow is also OK)
• Experience implementing deep learning models
If this proposal does not attract enough students for a team project, I would adapt it (in reduced/modified form) into
• an individual project, or
• a project for only 2 students.
Module | Course | Assessment |
---|---|---|
39-M-Inf-GP Grundlagenprojekt Intelligente Systeme | further project | ungraded examination |
Study information
The binding module descriptions contain further information, including on the "assessments" and their requirements. Where several forms of assessment are possible, the respective instructors decide which applies.