In comparative genomics, a common preprocessing step is to represent each genome as a sequence of blocks, usually encoded as numbers, where each number corresponds to a gene family. This gene family detection step, typically performed with algorithms based on sequence similarity, must be completed before the subsequent comparative analysis can take place.
Recently, a different approach called family-free was proposed, in which the separate family detection step is skipped and folded entirely into the comparative analysis. This approach can be applied in several areas; in this project we are interested in genome rearrangements, large-scale evolutionary events that change the genome by reversing large DNA blocks or moving them to different positions or even different chromosomes.
The Double Cut and Join (DCJ) operation has been the most studied model for genome rearrangements since its introduction in 2005: it can simulate several rearrangement operations, yet it gives rise to a simple combinatorial model in which the distance between two genomes can be computed in linear time.
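To illustrate why the DCJ model is considered combinatorially simple, the following is a minimal sketch of the distance computation in the most restricted setting: two circular, single-chromosome genomes with identical gene content and exactly one copy of each gene. Under these assumptions the distance is n - c, where n is the number of genes and c is the number of cycles formed by alternating between adjacencies of the two genomes. The function names and the signed-integer encoding are illustrative assumptions, not part of the project; linear chromosomes, unequal gene content and the family-free setting are not covered.

```python
def extremities(gene):
    """Return (left, right) extremities of a signed gene as it is read."""
    g = abs(gene)
    tail, head = (g, "t"), (g, "h")
    # A gene in forward orientation is read tail -> head, reversed head -> tail.
    return (tail, head) if gene > 0 else (head, tail)


def adjacency_partners(chromosome):
    """Map each extremity to the extremity it is adjacent to (circular genome)."""
    partner = {}
    n = len(chromosome)
    for i in range(n):
        _, right = extremities(chromosome[i])
        left, _ = extremities(chromosome[(i + 1) % n])  # wrap around
        partner[right] = left
        partner[left] = right
    return partner


def dcj_distance_circular(a, b):
    """DCJ distance n - c for two circular genomes over the same genes.

    Assumes (hypothetical input format) lists of signed nonzero integers,
    each gene occurring exactly once in each genome.
    """
    if sorted(map(abs, a)) != sorted(map(abs, b)):
        raise ValueError("genomes must have identical gene content")
    pa, pb = adjacency_partners(a), adjacency_partners(b)
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Alternate: follow an adjacency of a, then an adjacency of b.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return len(a) - cycles


if __name__ == "__main__":
    # Reversing gene 2 is a single DCJ operation.
    print(dcj_distance_circular([1, 2, 3], [1, -2, 3]))
```

Each extremity is visited once, so the running time is linear in the number of genes, apart from the sorting used only for the input check.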
In 2014, Martinez et al. proposed the Family-Free DCJ (FFDCJ) model, which applies the family-free approach to the DCJ model. The resulting problem was shown to be NP-hard, but medium-sized instances can be solved with an integer linear program (ILP).
In this project, we want to develop the FFDCJ model further and test it on simulated and real data, in order to assess how well it works as a tool for phylogenetic reconstruction, orthology assignment and ancestral reconstruction.
This project has two main objectives: i) extend the theoretical FFDCJ model with new operations, namely insertions, deletions and duplications, for which results are already known in the original (non-family-free) DCJ model; ii) design and run tests on real and simulated data to evaluate these extensions.
Module | Course | Requirements
---|---|---
39-M-Inf-P_BI Projekt Bioinformatik | Project | Ungraded examination
The binding module descriptions contain further information, including specifications on the "types of assignments" students need to complete. In cases where a module description mentions more than one kind of assignment, the respective member of the teaching staff will decide which task(s) they assign the students.