Paid internship, Research Master's, 2009-2010

Rearrangement scenarios with tandem duplication events


Supervisor: Jean-Stéphane Varré (jean-stephane.varre [@] lifl.fr, 03 59 57 79 18)

Themes: bioinformatics, genome comparison, graphs, permutations

Duration: 4 to 6 months (paid), starting in 2010.

Requirements: good skills in algorithms; no background in biology or computational biology is required.

Scientific context

One of the problems in comparative genomics is to recover the history of gene organization along the genome over the course of evolution. Genomes are modelled as sequences of numbers, each number representing a gene. The same number may appear more than once when the gene has been duplicated.
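
As an illustration of this model, here is a minimal sketch in Python, assuming a genome is stored as a plain list of integers. The function name `duplicated_genes` and the example genome are hypothetical and chosen only for this illustration.

```python
from collections import Counter

def duplicated_genes(genome):
    """Return a dict mapping each gene that occurs more than once to its copy count."""
    counts = Counter(genome)
    return {gene: n for gene, n in counts.items() if n > 1}

# Hypothetical genome: genes 2 and 3 each appear twice.
example_genome = [1, 2, 3, 2, 3, 4]
print(duplicated_genes(example_genome))
```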

Gene order is altered by evolutionary events: a segment of the genome can be reversed, transposed to another place in the genome, or duplicated. Typical problems are to find the shortest sequence of evolutionary events that transforms one genome into another (a scenario), or the set of events that explains a phylogenetic tree for a given set of genomes.
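
The events and the notion of a shortest scenario can be sketched as follows. This is only an illustration under simplifying assumptions: genes are unsigned (no orientation), the genome is linear, and the scenario search uses reversals only, by exhaustive breadth-first search. It is exponential and usable only on very small genomes; it is not one of the algorithms from the references. All function names are hypothetical.

```python
from collections import deque

def reverse(genome, i, j):
    """Reverse the segment genome[i:j] (unsigned genes, so no sign change)."""
    return genome[:i] + genome[i:j][::-1] + genome[j:]

def transpose(genome, i, j, k):
    """Cut the segment genome[i:j] and reinsert it at index k of the remaining genes."""
    segment = genome[i:j]
    rest = genome[:i] + genome[j:]
    return rest[:k] + segment + rest[k:]

def tandem_duplicate(genome, i, j):
    """Insert a copy of genome[i:j] immediately after the original segment."""
    return genome[:j] + genome[i:j] + genome[j:]

def shortest_reversal_scenario(source, target):
    """Return a shortest list of genomes from source to target using reversals,
    or None if target cannot be reached. Brute-force breadth-first search."""
    source, target = tuple(source), tuple(target)
    parents = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(list(current))
                current = parents[current]
            return path[::-1]
        n = len(current)
        for i in range(n):
            for j in range(i + 2, n + 1):  # segments of length >= 2
                nxt = reverse(current, i, j)
                if nxt not in parents:
                    parents[nxt] = current
                    queue.append(nxt)
    return None

# Each consecutive pair of genomes in the result differs by one reversal.
for step in shortest_reversal_scenario([3, 1, 2], [1, 2, 3]):
    print(step)
```

The number of events in the scenario is the length of the returned list minus one.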

Proposal

The study of genomic rearrangements with duplicated genes leads to difficult problems [1]. These problems are nevertheless important, because duplications occur in most genomes. We are interested in genomes whose duplicated genes are assumed to come from a tandem duplication event, that is, a duplication in which the copied genes are placed next to the original ones.
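
To make the notion of tandem duplication concrete, the following sketch looks for every segment that is immediately followed by an identical copy of itself. It assumes unsigned genes stored in a list; the function name `find_tandem_duplications` is hypothetical, and the quadratic scan is chosen for clarity rather than efficiency.

```python
def find_tandem_duplications(genome):
    """Return (start, length) for every segment directly followed by an identical copy."""
    n = len(genome)
    hits = []
    for length in range(1, n // 2 + 1):
        for start in range(n - 2 * length + 1):
            left = genome[start:start + length]
            right = genome[start + length:start + 2 * length]
            if left == right:
                hits.append((start, length))
    return hits

# The segment [2, 3] starting at index 1 is followed by its own copy.
print(find_tandem_duplications([1, 2, 3, 2, 3, 4]))
```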

The work will first focus on understanding algorithms that compute a scenario between two genomes without duplicates [2], and then algorithms that handle duplicates, such as [3].

The applicant will then design an algorithm that transforms a sequence of genes into one in which the duplicates resulting from a tandem duplication event are grouped together. The algorithm will have to be as parsimonious as possible, i.e. use the minimum number of events. The next problem will be to transform this sequence into a perfect duplicated segment, in which the two duplicated parts lie side by side. Lastly, these two algorithms could be integrated into a heuristic method for finding a scenario with tandem duplications and inversions.
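
As a point of comparison for such an algorithm, a brute-force baseline can compute the minimum number of events on very small instances. The sketch below is not the algorithm to be designed. It assumes unsigned genes, reversals (inversions) as the only event, and exactly two copies of each duplicated gene, and it interprets a "perfect duplicated segment" as a segment containing each duplicated gene once, immediately followed by an identical copy. The names `has_perfect_tandem` and `min_reversals_to_tandem` are hypothetical.

```python
from collections import Counter, deque

def has_perfect_tandem(genome):
    """True if all duplicated genes form one segment directly followed by its copy.
    Assumes each duplicated gene has exactly two copies."""
    duplicated = {g for g, c in Counter(genome).items() if c > 1}
    length = len(duplicated)
    if length == 0:
        return True
    for start in range(len(genome) - 2 * length + 1):
        left = genome[start:start + length]
        right = genome[start + length:start + 2 * length]
        if left == right and set(left) == duplicated:
            return True
    return False

def min_reversals_to_tandem(genome):
    """Breadth-first search over reversals; returns (number of reversals, genome reached),
    or None if no such genome is reachable. Exponential: tiny genomes only."""
    start = tuple(genome)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if has_perfect_tandem(current):
            return distance[current], list(current)
        n = len(current)
        for i in range(n):
            for j in range(i + 2, n + 1):  # reverse current[i:j]
                nxt = current[:i] + current[i:j][::-1] + current[j:]
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
    return None

print(min_reversals_to_tandem([1, 2, 3, 3, 2, 4]))
```

Such a baseline is only useful for checking a faster algorithm on small inputs, since the number of genomes explored grows factorially with the genome length.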

References

  [1] G. Blin, C. Chauve, G. Fertin, R. Rizzi, S. Vialette. Comparing Genomes with Duplications: a Computational Complexity Point of View. 2007. IEEE/ACM Transactions on Computational Biology and Bioinformatics.
  [2] R. Friedberg, A. Darling, S. Yancopoulos. Genome rearrangement by the double cut and join operation. 2008. Methods in Molecular Biology.
  [3] M. Bader. Sorting by reversals, block interchanges, tandem duplications, and deletions. 2009. BMC Bioinformatics.