Hub members have a wide range of expertise, covering most fields of bioinformatics and biostatistics. Below is a non-exhaustive list of these areas of expertise.
Searched keyword: Comparative Genomics
Related people (1)
Related projects (24)
Candida albicans is responsible for the majority of life-threatening fungal infections occurring in hospitalized patients and is also the most frequently isolated fungal commensal of humans. Microevolution of C. albicans isolates has been observed in a number of instances and is characterized in particular by loss-of-heterozygosity events. Yet most studies that have investigated such microevolution have not used whole-genome sequencing. In this project, we aim to characterize C. albicans microevolution at the genome-wide level. To this end, we will take advantage of multiple isolates collected at the same time from healthy individuals and sharing the same molecular type, thus providing information on the extent of genetic diversity of commensal isolates. We will also take advantage of series of isolates collected from patients with different forms of candidiasis and/or who have received antifungal therapy, thus providing information on the impact of pathogenic interaction and antifungal treatment on genome dynamics.
Legionella pneumophila is, from a genomic point of view, a very diverse species; however, only a few clones are responsible for over 50% of all human disease cases. We therefore aim to understand the evolution and emergence of these 5 major disease-related clones. We have sequenced a large number of L. pneumophila strains belonging to these clones and are undertaking comparative and phylogenetic genome analyses.
A major program of evolutionary and comparative genomics of yeasts has been in progress in my laboratory for many years (see publications). In the next few months (before summer 2015) I need to finish a few comparisons about a new clade to publish as soon as possible.
Comparative genomics approaches in microbiology now use thousands of genomes to analyze a given species in different environmental or medical contexts. By collecting and comparing these genomic sequences, many studies focus on the overall gene content of a species (i.e. the pan-genome) to understand its evolution in terms of core and accessory parts. Variable regions are of primary importance for understanding the adaptive potential of bacteria and contain genomic regions that are exchanged between strains by horizontal gene transfer (i.e. the mobilome). As recently suggested (Chan et al., Genome Biology 2015), a consensus representation of multiple genomes would provide a better analysis framework than individual reference genomes. Here, we propose to extend this concept by giving it a formal mathematical representation using a graph model and by applying statistical methods to cluster gene families.
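To make the core/accessory distinction concrete, here is a minimal Python sketch. It is not the graph model or the statistical clustering proposed in this project: it simply assumes that gene families have already been assigned, and the function name `partition_pangenome`, the `core_fraction` parameter and the toy genomes are all hypothetical.

```python
from collections import Counter


def partition_pangenome(genomes, core_fraction=1.0):
    """Split gene families into core and accessory sets.

    genomes: dict mapping a genome name to the set of gene-family
             identifiers found in that genome (assumed precomputed).
    core_fraction: minimum fraction of genomes that must carry a family
                   for it to be called core (1.0 = strict core).
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    # Count in how many genomes each gene family occurs.
    counts = Counter(
        family for families in genomes.values() for family in set(families)
    )
    threshold = core_fraction * len(genomes)
    core = {family for family, n in counts.items() if n >= threshold}
    # The pan-genome is the union of all families; the rest is accessory.
    accessory = set(counts) - core
    return core, accessory


# Toy example with three hypothetical genomes.
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2"},
    "strain_C": {"g1", "g4"},
}
core, accessory = partition_pangenome(toy)
```

Lowering `core_fraction` (for example to 0.95) gives a "soft core", which is a common way to tolerate missing genes in draft assemblies.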
The genome of C. tetani consists of a chromosome of approximately 2.8 Mb and a large plasmid (74 kbp) harboring the tetanus toxin gene. The genome of strain E88 was sequenced and annotated (PNAS 2003, 100:1316-1321). We have sequenced and performed a first comparison of the genomes of three additional strains (Res Microbiol 2015, 166:326-331). Fourteen additional C. tetani strains were sequenced, including historical strains (1952-1968) and recent French clinical isolates. We have the raw data obtained by Illumina sequencing. The chromosome and plasmid sequences will be compared as follows: first, the sequence reads of each strain will be assembled; second, the chromosomes and plasmids of these 14 strains will be compared using a BLAST approach. Finally, a phylogenetic tree will be generated, allowing us to follow the evolution of this bacterium.
SNP-based analysis of French Bordetella pertussis isolates: comparison of isolates producing all the vaccine antigens to isolates producing only some of them.
Whooping cough is a vaccine-preventable disease caused by Bordetella pertussis. Although vaccination has brought the disease under control, isolates are still circulating and cyclic increases in incidence are observed every 3 to 5 years, even in vaccinated countries. Most developed countries now use acellular vaccines containing 3 to 5 vaccine antigens (pertussis toxin (PT), filamentous hemagglutinin (FHA), pertactin (PRN), fimbrial proteins (FIM2/FIM3)), which have replaced whole-cell vaccines. In regions with high acellular vaccine coverage, isolates that no longer produce some vaccine antigens (mainly PRN) have been reported in recent years. The Bordetella pertussis reference genome was fully annotated in 2003 by the Sanger Institute. Analysis and comparison of different B. pertussis genomic sequences showed that circulating B. pertussis isolates differ from vaccine and reference strains. Genome evolution is characterized by gene deletions, antigenic divergence, SNP accumulation, and so on. A recent genomic analysis of isolates from different countries showed that the worldwide B. pertussis population has evolved over the last 60 years. Gene categories under selection were identified, highlighting that Bvg-activated genes and genes coding for surface-exposed proteins were important for adaptation. However, these analyses concerned only isolates producing all vaccine antigens. The PTMMH Unit includes the National Reference Center for Bordetellosis. In recent years, some particular French B. pertussis isolates that no longer produce PRN, and in some cases FHA or PT, have been collected, analyzed and sequenced. We would like to analyze these genomic data further, with a focus on the vaccine-antigen-deficient isolates, through a SNP-based comparison of these isolates against co-circulating isolates producing all vaccine antigens and against a reference strain.
One project of our laboratory is to characterize the function of small proteins specifically expressed in non-growing Salmonella cells under the tight control of SigmaS. Our current work focuses on small proteins that are secreted or targeted to the membrane, since their characterization might reveal novel aspects of membrane functions and secretion in persistent bacteria, including pathogens. The proposed project is an extension of a former project on the phylogenetic distribution of the small ORFs, and aims at analyzing the genomic context of orthologs.
We submitted an article about the diversity of Clostridium botulinum strains based on MLST analysis. One reviewer of the article asked for further analysis of our NGS data.
We are interested in studying the evolution of bacteria in the presence of bacteriophages within the gastrointestinal tract of animals.
Bacillus cereus is ubiquitous in nature, and while most isolates appear to be harmless, some are associated with food-borne illnesses, periodontal diseases, and more serious infections. Moreover, emerging B. cereus strains that cause anthrax-like disease have been isolated in Cameroon and Côte d’Ivoire. These strains are unusual because, although they belong genetically to the B. cereus species, they harbor two plasmids, pBCXO1 and pBCXO2, that are very similar to the pXO1 and pXO2 plasmids of Bacillus anthracis, which encode the toxins and the polyglutamate capsule, respectively. Around one hundred strains of Bacillus cereus from different origins (environmental, clinical and food) deposited at the Collection de l’Institut Pasteur (CIP) are being studied. Whole-genome sequencing has been performed in order to characterize the pathogenicity of the strains by looking for virulence genes and by comparing the strains (genes of interest) according to their origin.
Identification of SNPs between two bacterial strains, the wild-type strain and the revertant (Pseudomonas aeruginosa)
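As an illustration of what such a SNP identification involves, here is a minimal Python sketch. It assumes the two sequences have already been aligned to the same length (with `-` for gaps); in practice this step would come after read mapping or whole-genome alignment, which is not shown. The function name `find_snps` and the toy sequences are hypothetical and do not describe the pipeline used in this project.

```python
def find_snps(wild_type, revertant):
    """List single-nucleotide differences between two aligned sequences.

    Both sequences must have the same length (pre-aligned, '-' for gaps).
    Returns a list of (position, wild_type_base, revertant_base) tuples,
    with 1-based positions in the alignment. Columns containing a gap or
    an ambiguous base (anything other than A, C, G, T) are skipped, since
    they are not SNPs in the strict sense.
    """
    if len(wild_type) != len(revertant):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    snps = []
    for pos, (a, b) in enumerate(zip(wild_type.upper(), revertant.upper()), start=1):
        if a != b and a in bases and b in bases:
            snps.append((pos, a, b))
    return snps


# Toy example: one substitution (position 3) and one gap column (position 7).
example = find_snps("ACGTAC-T", "ACCTACGT")
```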
The microbiota of the gastrointestinal tract is a diverse mixture of different species and strains of bacteria. We aim to identify regions of dissimilarity between two different bacterial strains in order to quantify their relative abundance and thus obtain information about their ability to cohabit in a common environment. In addition, we are interested in assessing the surface-localised protein repertoire of specific species through a combination of bioinformatic analysis and proteomics.
Escherichia coli is one of the major bacterial pathogens responsible for numerous nosocomial infections. While most E. coli infections are related to colonisation of the urinary tract, there are rare, but difficult to resolve, cases of E. coli infection of central venous catheters. To determine whether the E. coli strains that colonize central venous catheters have specific properties, we have started a preliminary project in which a few E. coli strains responsible for such infections have been collected (4 strains) and a comparative phenotypic analysis has been initiated. To deepen this analysis, the genomes of these 4 strains have been sequenced, and we plan to analyse them to identify potentially interesting features that can be linked to the clinical context of these strains. Notably, these E. coli strains are expected to carry specific adhesion factors, virulence genes and potential antibiotic resistance determinants. This project could lead to the identification of factors that explain their tropism for central venous catheters. It is envisioned that the project can be expanded to the analysis of more E. coli strains responsible for central venous catheter-related infections, and could thus open the way for the development of either biological markers of such infections or strategies to fight catheter-related infections.
Trichosporon asahii is a yeast responsible for invasive human infections worldwide. Currently, no genotyping method is available to determine the relationship between clinical isolates. At the NRCMA we have more than 40 clinical isolates and 2 collection strains associated with clinical data. Thanks to the P2M facility, the whole genomes of 33 isolates were sequenced. The aim of this project is to study the genetic diversity of Trichosporon asahii and its potential relationship with clinical and/or phenotypic data, and finally to propose a new genotyping method that could be useful to clinicians in the event of a local or national outbreak.
Alterations of the cellular proteome over time due to spontaneous or enzymatic deamidation of glutamine (Gln) and asparagine (Asn) residues are a probable source of aging-related diseases. In particular, deamidation of a conserved glutamine in all GTPases of the Ras superfamily, which are essential for cellular GTP turnover, confers on these molecular switches gain-of-function properties that can stimulate oncogenic signaling pathways. The CNF1 toxin produced by pathogenic Escherichia coli, a prevalent resident of the human gut microbiota, is an example of a bacterial factor with the potential to cause somatic “oncogenic mutation” of Rho GTPases through deamidation. Development of holistic approaches for personalized medicine to monitor healthy microbiota could lead to improved public health and increased lifespans.
- The Institut Pasteur genomic taxonomy database of microbial strains (“Pasteur MLST”) is a free, publicly accessible resource that hosts nucleotide sequence-based definitions of microbial strains, along with information on bacterial isolates (provenance data) and their genomic sequences. The Pasteur MLST database provides universal nomenclatures that are widely adopted for important pathogens (Klebsiella, Listeria, …) and represent a unifying language on strains for microbial population biology.
- Unified strain taxonomies facilitate the coordinated international surveillance of bacterial pathogens. Several hundred research laboratories and public health agencies worldwide have deposited novel strain types, sequences and provenance data on their bacterial isolates.
- Pasteur MLST is powered by the open-source (GPL3) BIGSdb web application developed at Oxford University (Keith Jolley & Martin Maiden) (http://bigsdb.pasteur.fr). Its evolution in terms of functionality is tightly linked to the development of the software at Oxford University. Its evolution in terms of content is managed by dedicated international teams of curators for each bacterial pathogenic species, coordinated by the Pasteur MLST team.
- The genomic taxonomies hosted at Pasteur MLST represent unique, authoritative resources that are highly valued by the community, as shown by the routine use of Pasteur MLST strain tags (e.g., K. pneumoniae ST258) in the scientific literature. Several labs (National Reference Centers or Units) of Institut Pasteur are coordinating the curation of genomic taxonomies (Klebsiella, Listeria, Corynebacteria, Bordetella, Leptospira, Yersinia, ...).
The aim of the project is to obtain support from the C3BI HUB for the maintenance of the BIGSdb instance at Pasteur: deployment, upgrades, installation of API functionality developed by our partner, coping with future IT evolutions, ...
Helicobacter pylori is a Gram-negative pathogen whose infection results in various gastric diseases, including gastric cancer, in humans. Current drug therapy against the bacterium involves a combination of two antibiotics, a proton-pump inhibitor and a bismuth salt. The introduction of bismuth has resulted in an increased success rate compared to traditional therapies without a bismuth salt. H. pylori is a genetically variable bacterium with high rates of mutation and recombination. Interestingly, there has so far been no report of bismuth resistance in H. pylori clinical isolates. Very limited data are available about the mechanism of entry and the antibacterial action of bismuth in H. pylori. We selected a laboratory strain of H. pylori for bismuth resistance in order to determine the effect of bismuth in H. pylori. The resistant strains exhibit different levels of susceptibility to bismuth compared to the parent strain. Comparative genomics will shed light on the mode(s) of resistance acquired by these H. pylori strains. Identification of the genes and pathways involved in bismuth resistance will be useful for determining the mechanistic details of bismuth's action.
Members of the genus Yersinia include environmental as well as pathogenic bacteria. Pathogenic species (Y. pestis, Y. pseudotuberculosis and Y. enterocolitica) have historically been targets for research aimed at understanding how bacteria evolve into diverse mammalian pathogens. The advent of large-scale population genomic studies has greatly accelerated progress in this field, revealing how gene gain, gene loss and genome rearrangement events have shaped the evolution of enteropathogenic bacteria. In the context of the sequencing of many novel genomes of Yersinia spp., we are interested in developing a bioinformatic tool that will allow us to compare and visualize single nucleotide polymorphisms as well as genome organization in members of the genus Yersinia, in order to better understand the genomic events that drive evolution within this bacterial genus.
Genome organization and synteny analysis of Yersinia pseudotuberculosis strains responsible for the Far-East Scarlatine-Like Fever
The Yersinia pseudotuberculosis complex comprises the plague pathogen Yersinia pestis, the enteropathogen Yersinia pseudotuberculosis, and a specific subgroup of Y. pseudotuberculosis strains responsible for the Far-East Scarlatine-Like Fever (FESLF). We have sequenced the genomes of FESLF strains and we are interested in investigating the genome organization of these bacteria.
The LeiSHield-MATI consortium: Investigating genomic adaptation of Leishmania parasites in endemic areas
Leishmania causes devastating human diseases – leishmaniases - representing an important public health problem in the Mediterranean basin and declared as emerging diseases in the EU due to climate change and population displacement. The LeiSHield-MATI consortium will for the first time investigate in an integrative fashion the complex parasite-vector-host interplay in cutaneous leishmaniasis affecting Morocco, Algeria, Tunisia, and Iran (MATI), using field isolates and human clinical samples. The ultimate goal of our project is to identify genetic factors selected during natural infection and to understand how the complex parasite-vector-animal interaction impacts clinical outcome in infected patients. This goal will be achieved through a highly ambitious secondment plan between all partners, and the organization of courses and workshops to train the next generation of scientists generating a long-term impact on the research capacities in endemic areas. Capitalizing on complementary infrastructures of its EU, African and Asian partners and their expertise in molecular parasitology, epidemiology, systems level analyses, bioinformatics, computational biology, immunology, dermatology, field studies, and public health, our project will drive important innovation in clinical research, strengthen capacities in disease endemic regions, inform authorities on control measures, and raise awareness in all partner countries on this emerging EU public health problem. The highly inter-disciplinary and inter-sectorial structure of LeiSHield-MATI, and its powerful integrative and comparative approach is novel in parasitic systems and will drive a unique bio-marker discovery pipeline for the future development of new prognostic and diagnostic tools, as well as novel preventive and therapeutic measures that will ensure long-term collaboration, promote scientific and commercial self-sustainability of its partners, and will have an important impact to improve public health.
Because of the fast-growing number of assembled genomes available in public repositories (GenBank/RefSeq), it is now quite common to end up with many genome assemblies that belong to the same species. In this context, determining the type strain/species of any new species/genus requires identifying the representative one(s) within a large collection of genomes. The medoid of a genome collection is the genome (or genomes) whose sum of distances to all other genomes is minimal, which makes it an excellent candidate for the representative genome of the collection. We therefore aim to obtain a bioinformatic tool able to quickly and accurately identify the medoid of any genome collection.
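The medoid definition above can be written down directly. The following Python sketch is only a naive illustration under stated assumptions: it takes a precomputed, symmetric pairwise distance matrix (how genome distances are estimated is not specified here) and scans all rows, so it does not address the speed requirement of the project. The function name `find_medoids`, the `tol` parameter and the toy distances are hypothetical.

```python
import math


def find_medoids(names, dist, tol=1e-12):
    """Return the name(s) of the medoid(s) of a genome collection.

    names: list of genome names.
    dist:  square matrix (list of lists) where dist[i][j] is the distance
           between genomes i and j (assumed precomputed).
    tol:   absolute tolerance used to treat near-equal sums as ties.
    """
    n = len(names)
    if n == 0:
        raise ValueError("the collection is empty")
    if len(dist) != n or any(len(row) != n for row in dist):
        raise ValueError("dist must be an n x n matrix matching names")
    # Sum of distances from each genome to all genomes in the collection.
    sums = [sum(row) for row in dist]
    best = min(sums)
    # Several genomes can tie for the minimum, hence "medoid(s)".
    return [
        names[i]
        for i, s in enumerate(sums)
        if math.isclose(s, best, rel_tol=0.0, abs_tol=tol)
    ]


# Toy example with three hypothetical genomes.
toy_names = ["genome_1", "genome_2", "genome_3"]
toy_dist = [
    [0.0, 0.1, 0.4],
    [0.1, 0.0, 0.2],
    [0.4, 0.2, 0.0],
]
medoids = find_medoids(toy_names, toy_dist)
```

This exhaustive version needs all n² pairwise distances, which is the main cost for large collections.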
Finely tuned sensory systems enable bacteria to sense and respond to fluctuating environments, coordinating adaptive changes in metabolic pathways and physiological outputs. For pathogenic Leptospira, signaling pathways allow the timely expression of virulence factors during the successive steps of infection of a mammalian host. As the bacterium is excreted by its host, signaling pathways enable it to switch expression towards factors promoting survival in the environment. A unifying theme across bacterial species is that biofilm formation coincides with the synthesis of the cellular signaling molecule bis-(3′-5′)-cyclic dimeric guanosine monophosphate (c-di-GMP), and this feature seems to be conserved in Leptospira. Our current work shows that the c-di-GMP regulation pathway is a major regulatory network involved in biofilm formation, virulence and motility in the pathogen Leptospira interrogans. Biofilm production and virulence expression are quite variable across the Leptospira genus (highly virulent species, low-virulence species, and saprophytic species showing increased biofilm production). We would like to explore how c-di-GMP metabolism, and the many genes associated with its synthesis and degradation, have evolved across the Leptospira genus. We believe that understanding the evolutionary relationships of the c-di-GMP metabolism genes would help us understand the contribution of this second messenger to pathogenesis and biofilm formation in the Leptospira genus.