392264 Project: SemanticSpeak: Development of a framework for image/video generation from speech using generative AI (Pj) (SoSe 2026)

Contents, comment

Current AI systems can generate images and videos from text prompts. However, generating visual content directly from speech remains a challenging problem, as speech contains not only linguistic information but also tone, emotion, and prosody.
This project explores how semantic representations extracted from speech can drive visual generation using generative AI models.
The goal is to design and implement a prototype pipeline that maps speech features to semantic embeddings compatible with visual generative models. The speech-to-image or speech-to-video generation pipeline will be trained and evaluated using multimodal datasets.
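
As a rough illustration of the mapping described above, the following minimal sketch pools frame-level speech features into one utterance vector, projects it linearly, and normalizes the result into a conditioning embedding. It is only a sketch under stated assumptions: the function names, the dimensions, and the random weights are hypothetical placeholders, and the real project would use a learned speech encoder and a trained projection.

```python
import math
import random


def mean_pool(frames):
    """Average frame-level feature vectors into one utterance-level vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / n for i in range(dim)]


def project(vec, weights):
    """Apply a linear map; each row of `weights` yields one output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def speech_to_condition(frames, weights):
    """Hypothetical mapping: speech frames -> conditioning embedding."""
    return l2_normalize(project(mean_pool(frames), weights))


# Assumed toy sizes and random stand-ins for real features and trained weights.
rng = random.Random(0)
SPEECH_DIM, COND_DIM = 8, 4
frames = [[rng.gauss(0, 1) for _ in range(SPEECH_DIM)] for _ in range(20)]
weights = [[rng.gauss(0, 1) for _ in range(SPEECH_DIM)] for _ in range(COND_DIM)]

condition = speech_to_condition(frames, weights)
print(len(condition))  # one value per conditioning dimension
```

In a real pipeline the conditioning vector would have to match the embedding space expected by the chosen visual generative model; this sketch does not address that alignment.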

Depending on the number of students and project scope, the project can also include:
• Evaluation of the alignment between generated visuals and spoken input
• Analysis of the influence of prosody and emotion
• Comparison of direct speech-based vs. speech-to-text-based pipelines
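
One simple way to approach the alignment evaluation and the pipeline comparison listed above is to score each generated visual against a reference embedding of the spoken input using cosine similarity. The sketch below assumes both sides have already been embedded into a shared space; the vectors are made-up toy values, not results, and the pipeline names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_alignment(reference_embeddings, visual_embeddings):
    """Average cosine similarity over paired (spoken input, visual) embeddings."""
    scores = [
        cosine_similarity(ref, vis)
        for ref, vis in zip(reference_embeddings, visual_embeddings)
    ]
    return sum(scores) / len(scores)


# Toy embeddings, invented purely for illustration.
reference = [[1.0, 0.0], [0.0, 1.0]]          # spoken-input embeddings
direct_outputs = [[1.0, 0.0], [0.0, 2.0]]     # hypothetical direct speech-based pipeline
cascaded_outputs = [[1.0, 1.0], [1.0, 0.0]]   # hypothetical speech-to-text-based pipeline

direct_score = mean_alignment(reference, direct_outputs)
cascaded_score = mean_alignment(reference, cascaded_outputs)
print(direct_score, cascaded_score)
```

Cosine similarity ignores vector length, so it only measures direction in the shared space. Whether that tracks perceived alignment, or captures prosody and emotion at all, would itself need to be checked in the project.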

Requirements for participation, required level

• Good programming skills in Python
• Basic knowledge of machine learning/deep learning
• Interest in generative AI
• Preferably experience with, or a very strong interest in, speech processing (e.g. speech recognition, speech-to-text, ...)
Upon completion of the project, we will work together with the participants to publish the results at a well-established conference or in a journal in Human–Computer Interaction (HCI) or Computer Vision (CV).

Teaching staff

Dates

Frequency	Weekday	Time	Format / Place	Period
by appointment	by appointment			13.04.-24.07.2026	By appointment; online, CITEC, or R.1

Subject assignments

Module	Course	Requirements
39-M-Inf-AI-app-foc_a Applied Artificial Intelligence (focus)	Applied Artificial Intelligence (focus): Project	Study requirement
39-M-Inf-INT-app-foc_a Applied Interaction Technology (focus)	Applied Interaction Technology (focus): Project	Study requirement

The binding module descriptions contain further information, including specifications on the "types of assignments" students need to complete. In cases where a module description mentions more than one kind of assignment, the respective member of the teaching staff will decide which task(s) they assign the students.

No further requirements

No eLearning offering available

Address: SS2026_392264@ekvv.uni-bielefeld.de
Teaching staff, their secretaries' offices, and the individuals in charge of course data maintenance can use this address to send emails to the course participants. IMPORTANT: Every email sent must be activated. Wait for the activation email and follow the instructions given there.
If the reference number is used for several courses during the semester, use the following alternative address to reach the participants of exactly this course: VST_720232759@ekvv.uni-bielefeld.de

Last update basic details/teaching staff: Tuesday, May 26, 2026
Last update times: Saturday, April 25, 2026
Last update rooms: Saturday, April 25, 2026

Type(s) / SWS (hours per week per semester): project (Pj) / 2
Department: Faculty of Technology
Link to this course (includes the course ID and is always unique): https://ekvv.uni-bielefeld.de/kvv_publ/publ/vd?id=720232759
ID: 720232759
