Lecture topic
The lecture series provides a comprehensive introduction to the concepts, technologies, and applications of knowledge representation, graph databases, and graph analysis, with a particular focus on applications in the life sciences and biomedicine. The aim is to teach students both theoretical principles and practical methods for modelling, analysing, and using complex, networked data.
Course content
At the beginning, basic concepts of knowledge and knowledge representation are introduced. Established standards such as RDF (Resource Description Framework) and OWL (Web Ontology Language) are discussed, and their significance for the construction and use of knowledge graphs is explained. Building on this, the properties, structures, and possible applications of knowledge graphs are presented.
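As a rough illustration of the triple model that underlies RDF, the sketch below stores a tiny knowledge graph as (subject, predicate, object) tuples in plain Python and answers a simple pattern query. It deliberately avoids any RDF library; the `ex:` identifiers, the `TRIPLES` set, and the `match` helper are hypothetical names chosen for this example only.

```python
# Toy knowledge graph as a set of (subject, predicate, object) triples.
# All identifiers are hypothetical and serve only as an illustration.
TRIPLES = {
    ("ex:TP53", "rdf:type", "ex:Gene"),
    ("ex:TP53", "ex:encodes", "ex:p53"),
    ("ex:p53", "rdf:type", "ex:Protein"),
    ("ex:p53", "ex:interactsWith", "ex:MDM2"),
    ("ex:MDM2", "rdf:type", "ex:Protein"),
}


def match(triples, s=None, p=None, o=None):
    """Return the sorted triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources are typed as proteins?
proteins = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="ex:Protein")]
print(proteins)  # ['ex:MDM2', 'ex:p53']
```

A pattern with wildcards loosely mirrors a single triple pattern in a SPARQL query; real RDF stores add IRIs, literals, and named graphs on top of this basic idea.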
The next part provides an introduction to graph databases. Students learn the basic concepts of graph-based data storage and relate these to classic relational database systems. In addition, the lecture series offers an overview of the basics of graph theory and graph algorithms, such as the shortest-path algorithms used in route planning.
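Route planning is commonly framed as a shortest-path problem on a weighted graph. The following minimal sketch uses Dijkstra's algorithm on an adjacency dictionary; the `ROADS` graph and the `shortest_path` function are hypothetical examples and not taken from the course material.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; graph maps node -> {neighbour: non-negative weight}."""
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None  # goal is not reachable from start


# Hypothetical road network with travel costs on the edges.
ROADS = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

print(shortest_path(ROADS, "A", "D"))  # (4, ['A', 'C', 'B', 'D'])
```

The priority queue always expands the cheapest known partial route first, which is why the detour via C and B (cost 4) is found before the direct-looking alternatives (cost 5 and 6).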
Another focus is on query languages for graph databases. The lecture covers common query languages such as Cypher and SPARQL and shows how complex queries can be efficiently formulated and optimised. In addition, basic graph traversal and pattern recognition algorithms are presented in order to identify relevant structures and relationships, for example, in biological networks.
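To give a sense of what a basic graph traversal looks like, the sketch below runs a breadth-first search that collects all nodes within a fixed number of hops of a start node. The `INTERACTIONS` network and the `within_hops` function are hypothetical; a graph database would typically express the same idea as a variable-length path query.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each node within max_hops to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Hypothetical undirected interaction network stored as adjacency lists.
INTERACTIONS = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P4"],
    "P3": ["P1"],
    "P4": ["P2", "P5"],
    "P5": ["P4"],
}

print(within_hops(INTERACTIONS, "P1", 2))  # {'P1': 0, 'P2': 1, 'P3': 1, 'P4': 2}
```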
Building on this, the topic of graph machine learning is covered. Students gain an overview of the connection between graph databases and machine learning, as well as methods of graph learning. Using a concrete application example, the prediction of sepsis based on complete blood count data, the lecture shows how graph-based learning methods can be used in medical practice.
Finally, the lecture series focuses on the applications of graphs in the life sciences. Topics covered include the storage and analysis of biological sequences (DNA, RNA, proteins), the identification and visualisation of molecular interactions and biological signalling pathways, and the prediction of drug-target interactions in drug discovery. It also shows how knowledge graphs can contribute to the integration of heterogeneous patient data (genomic, clinical, lifestyle-related) and support personalised therapy approaches. Challenges relating to data protection, privacy, and data security are also critically discussed.
The lecture is accompanied by programming exercises that reinforce the lecture content on RDF, graph databases, and graph queries.
Teaching goals:
• Students understand the fundamental concepts of knowledge representation, knowledge graphs, and graph databases, and can apply them to model complex, interconnected data.
• Students can use graph databases efficiently with suitable query languages (e.g., Cypher, SPARQL), apply graph algorithms, and identify relevant patterns and relationships in real data sets.
• Students can critically evaluate graph-based machine learning methods and graph applications in the life sciences and apply them conceptually to specific use cases, such as personalised medicine or drug discovery.
Recommended prior knowledge: familiarity with the Python programming language is helpful.
• Graph Databases: New Opportunities for Connected Data, Ian Robinson, Jim Webber, Emil Eifrem, Publisher: O'Reilly
• A Comprehensive Guide to Graph Algorithms in Neo4j, Mark Needham & Amy E. Hodler, Publisher: Neo4j
• Building Knowledge Graphs: A Practitioner's Guide, Jesus Barrasa, Jim Webber, Publisher: O'Reilly
• Santos et al. (https://www.nature.com/articles/s41587-021-01145-6)
• Walke et al. (https://academic.oup.com/database/article/doi/10.1093/database/baad045/7222237)
• https://open.hpi.de/courses/knowledgegraphs2020/overview
| Module | Course | Requirements | |
|---|---|---|---|
| 39-Inf-WP-CLS Computational Life Sciences (Basis) | Introductory lecture | | Student information |
| | - | Graded examination | Student information |
| 39-Inf-WP-DS Data Science (Basis) | Introductory lecture | | Student information |
| | - | Graded examination | Student information |
| 39-M-MBT12_a Differentiation 1 in Science for M.Sc. Wahlpflicht 1 Molekulare Biotechnologie Master | - | Graded examination | Student information |
| 39-M-MBT13_a Differentiation 1 in Science for M.Sc. Wahlpflicht 2 Molekulare Biotechnologie Master | - | Graded examination | Student information |
The binding module descriptions contain further information, including specifications on the "types of assignments" students need to complete. In cases where a module description mentions more than one kind of assignment, the respective member of the teaching staff will decide which task(s) they assign the students.