Expertise

Hub members have expertise covering most fields of bioinformatics and biostatistics. Below is a non-exhaustive list of these areas of expertise.

Searched keyword : Genetics

Related people (15)

Anne BITON


I received a Ph.D. in Biostatistics and Bioinformatics applied to Cancer Research in 2011 from the University Paris Sud XI, while working at the Institut Curie under the supervision of Emmanuel Barillot and François Radvanyi. My Ph.D. was about the unsupervised analysis of cancer transcriptome. During my postdoctoral time, I worked on the computational and statistical analysis of NGS data. My areas of interest and expertise include:
- functional genomics
- human genetics
- statistical analysis of high-dimensional data
- normalization, batch-correction, meta-analysis of high-throughput data
- unsupervised learning, independent component analysis
- NGS data analysis (RNA-Seq, DNA-Seq, …)
- analysis of the non-coding genome, transposable elements


Keywords
Machine learning, Modeling, Genetics
Organisms

Projects (11)

Pascal CAMPAGNE

Group : SABER - Hub Core

Initially trained in evolutionary and environmental sciences, I studied population genetics and micro-evolutionary processes in a number of postdoctoral research projects. I recently joined the C3BI-Hub at the Institut Pasteur, where I work on various aspects involving Biostatistics and the analysis of genetic data.


Keywords
Association studies, Genomics, Genotyping, Biostatistics, Genetics, Evolution, Population genetics
Organisms
Bacteria, Parasite, Human, Insect or arthropod, Other animal
Projects (9)

Claudia CHICA


As a computational biologist I have been involved in various projects seeking to answer different biological questions. Those projects have allowed me to define my main research interest, namely the evolutionary study of the emergence, storage and modulation of information in biological systems, assisted by computational methods. During my research career I have acquired extensive experience in the analysis of sequence data at the DNA and protein level. I’m trained in both NGS bioinformatic protocols (ChIP-seq, ATAC-seq, RNA-seq, genome assembly) and fine-detail sequence analysis. Most importantly, I have gained proficiency in the use of the statistical models that underlie the quantitative analysis of low- and high-throughput sequence data. Additionally, my experience as a lecturer and instructor has taught me that training researchers in the formal basis of bioinformatic methodologies is the key to successful collaboration between wet and dry labs. Likewise, I have gained valuable skills by working within two international consortia (TARA Oceans project and TRANSNET): the ability to collaborate with multidisciplinary groups and to coordinate younger researchers.


Keywords
Algorithmics, Genomics, Sequence analysis, Transcriptomics, Genome analysis, Genetics, Evolution, Interactomics
Organisms

Projects (16)

Freddy CLIQUET


One of my projects consists of developing GRAVITY, a Java tool based on Cytoscape that integrates genetic variants within protein-protein interaction networks to allow the visual and statistical interpretation of next-generation sequencing data, ultimately helping geneticists and clinicians to identify causal variants and better diagnose their patients. I’m also involved in several other projects in the lab, taking part in the design of pipelines for the processing and analysis of genomics data, including SNP arrays, whole-exome and whole-genome sequencing data. This means confronting big-data challenges, as the unit has to manage hundreds of terabytes of genomics data. Finally, I am now analysing these data in order to identify possible causes of autism, to help clinicians with their diagnosis but also to better understand the biological mechanisms at play in this complex disease. This is done through a project aiming at understanding the genetic architecture of autism in the Faroe Islands, and also through the newly started IMI2 European project AIMS2-Trials.


Keywords
Algorithmics, Data management, Data Visualization, Genomics, Machine learning, Proteomics, Genome analysis, Biostatistics, Program development, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, Software development and engineering, Data and text mining, Genetics
Organisms

Projects (0)

    Alexis CRISCUOLO

    Group : GIPHY - Embedded : PIBnet

    I work as a research engineer in the Bioinformatics and Biostatistics HUB of the Institut Pasteur. Holder of a PhD in bioinformatics, my main interest is in fast but robust phylogenetic inference algorithms and methods for large genome-scale datasets. As a consequence, I am often involved in related bioinformatics projects, such as performing de novo or ab initio genome assemblies, designing and processing core genome typing schemes, building and analysing phylogenomics datasets, or implementing and distributing novel tools and methods.


    Keywords
    Algorithmics, Clustering, Genome assembly, Genomics, Genotyping, Phylogenetics, Taxonomy, Genome analysis, Program development, Evolution, Sequence homology analysis
    Organisms

    Projects (19)

    Julien GUGLIELMINI


    After a PhD in Microbiology on bacterial toxin-antitoxin systems at the Free University of Brussels, I joined the Institut Pasteur for a 3-year postdoc in Eduardo Rocha’s lab. During this period, I performed comparative genomics and phylogenetic analyses on bacterial conjugation and type IV secretion systems. Then, I worked for 2 years in Olivier Tenaillon’s team on the modelling and evolution of organismal complexity. I joined the HUB in 2015, and I am involved in phylogenetic and comparative genomics projects.


    Keywords
    Genomics, Phylogenetics, Sequence analysis, Genome analysis, Genetics, Evolution, Population genetics
    Organisms
    Archaea, Bacteria, Virus
    Projects (10)

    Kenzo-Hugo HILLION

    Group : WINTER - Hub Core

    After a Master's degree in Genetics at the Magistère Européen de Génétique, Paris Diderot, I did a second Master's in bioinformatics at the University of Nantes, during which I studied mapping strategies for allele-specific analysis at the bioinformatics platform of Institut Curie. I then joined Institut Pasteur to work on an ELIXIR project related to the bio.tools registry, through the development of a dedicated tool and participation in several workshops and hackathons. As an engineer of the Bioinformatics and Biostatistics Hub, I am involved in several projects, from differential analysis of RNA-seq data to metagenomics. I am also in charge of the maintenance of the Galaxy Pasteur instance.


    Keywords
    ChIP-seq, Epigenomics, Genomics, Sequence analysis, Program development, Databases and ontologies, Software development and engineering, Genetics, Data integration, Read mapping, Workflow and pipeline development, Confocal Microscopy
    Organisms

    Projects (4)

    Hanna JULIENNE

    Group : DETACHED - Detached : Statistical Genetics

    I am seeking to apply my knowledge in computer science and statistics to understand real-world data. I have an interdisciplinary background spanning complex systems, Big Data, machine learning, biostatistics and genomics. I completed a PhD in which I applied clustering and PCA to epigenomics data and gained new insights into the coupling between replication and epigenetics. I worked at Dataiku, a dynamic start-up where I helped clients build their Big Data strategy and draw value from their data. I then studied the human microbiota for two years at MetaGenoPolis (MGP), an innovative research center that aims to improve human health by developing strategies (e.g. nutritional, therapeutic, preventive…) to restore dysbiosed microbiota with its industrial and academic partners. I currently work in the statistical genetics group at the Institut Pasteur, where I apply my software development and data science skills to quantify the impact of human genome variation on diverse health parameters.


    Keywords
    Clustering, Data management, Genomics, Genome analysis, Exploratory data analysis, Genetics, Comparative metagenomics, Dimensional reduction, Multidimensional data analysis
    Organisms

    Projects (2)

    Etienne KORNOBIS

    Group : FUNGEN - Embedded : Epigenetic regulation

    After a PhD in Biology in 2011 on the population genetics and phylogeography of amazing little amphipods (Crangonyx, Crymostygius) at the University of Reykjavik (Iceland), I pursued my interest in Bioinformatics and Evolutionary Biology in various post-docs in Spain (MNCN Madrid, UB Barcelona). During this time, I investigated transcriptomic landscapes for various non-model species (groups Conus, Junco and Caecilians) using de novo assemblies and participated in the development of TRUFA, a web platform for de novo RNA-seq analysis. In July 2016, I joined the Revive Consortium and the Epigenetic Regulation unit at Institut Pasteur, where my main focus was transcriptomic and epigenetic analyses on various topics using short- and long-read technologies, with a special interest in the detection of alternative splicing events. I joined the Bioinformatics and Biostatistics Hub in January 2018. My latest interests are long-read technologies, alternative splicing and achieving reproducibility in Bioinformatics using workflow managers, container technologies and literate programming.


    Keywords
    Data management, Data Visualization, Sequence analysis, Transcriptomics, Web development, Genome analysis, Program development, Exploratory data analysis, Software development and engineering, Genetics, Evolution, Read mapping, Workflow and pipeline development, Population genetics, Motifs and patterns detection, Grid and cloud computing
    Organisms
    Human, Insect or arthropod, Other animal, Anopheles gambiae (African malaria mosquito), Mouse
    Projects (2)

    Frédéric LEMOINE


    After a Master's degree in bioinformatics and biostatistics, I did a PhD in computer science / bioinformatics at University Paris-Sud (now part of University Paris-Saclay), where I worked on the integration and analysis of comparative genomics data. After a postdoc in Lausanne, Switzerland, where I worked on small-RNA sequencing data, I joined GenoSplice, where I was responsible for the development of bioinformatics projects related to next-generation sequencing. I joined Institut Pasteur in Nov. 2015 to work in the Evolutionary Bioinformatics Unit and participate in the development of new tools and algorithms able to efficiently handle the ever-increasing amount of sequencing data.


    Keywords
    Algorithmics, Data management, Phylogenetics, Sequence analysis, Database, Genome analysis, Program development, Scientific computing, Databases and ontologies, Sequencing, Workflow and pipeline development
    Organisms

    Projects (0)

      Blaise LI


      I obtained a PhD in phylogeny in 2008 at the Muséum National d’Histoire Naturelle in Paris, then worked as a post-doc in Torino (Italy, 2009 – 2011) and Faro (Portugal, 2011 – 2013) on methodological aspects of phylogeny. In 2013, I was hired as a research engineer in bioinformatics at the Institut de Génétique Humaine in Montpellier, where I wrote tools to analyse high-throughput sequencing data, especially small RNA-seq. This is also the kind of work I have been doing at Institut Pasteur since 2016. I enjoy programming in Python, I’m interested in evolutionary biology, and I find teaching the UNIX command-line a rewarding activity. My published work is available here: http://www.normalesup.org/~bli/useful.html


      Keywords
      Genomics, Non coding RNA, Transcriptomics, Software development and engineering, Genetics, Workflow and pipeline development
      Organisms
      Insect or arthropod, Other animal, Drosophila melanogaster (Fruit fly), C. elegans
      Projects (4)

      Natalia PIETROSEMOLI

      Group : FUNGEN - Hub Core

      Dr. Natalia Pietrosemoli is an Engineer with an M.Sc. in Modeling and Simulation of Complex Realities from the International Center for Theoretical Physics, ICTP and the International School of Advanced Studies, SISSA (Trieste, Italy). During her M.Sc. internships she mostly worked in modeling, optimization, combinatorics and information theory applied to medical imaging. In 2012 she got a Ph.D. in Computational Biology from the School of Bioengineering of Rice University (Houston, TX, US), where she specialized in computational structural biology and functional genomics. Her doctoral thesis, “Protein functional features extracted from primary sequences: a focus on disordered regions”, contributed to a better understanding of the functional and evolutionary role of intrinsic disorder in protein plasticity, complexity and adaptation to stress conditions. As part of her Ph.D., Natalia was a visiting scholar in two labs in Madrid: the Structural Computational Biology Group at the Spanish National Cancer Research Centre (CNIO), where she mainly worked on sequence analysis and the functional-structural relationships of proteins, and the Computational Systems Biology Group at the Spanish National Centre for Biotechnology (CNB-CSIC), where she studied the functional implications of intrinsically disordered proteins at the genomic level for several organisms, collaborating with different experimental and theoretical groups. In 2013, she joined the Swiss Institute of Bioinformatics as a postdoctoral fellow in the Bioinformatics Core Facility. Her main project consisted of the molecular classification of a rare type of lymphoma, which involved the integration of transcriptomic, clinical and mutational data for the identification of molecular markers for classification, diagnosis and prognosis. This work was performed in collaboration with the Pathology Institute at the University Hospital of Lausanne (CHUV).

      In November 2015 Natalia joined the Hub Team @ Pasteur C3BI as a Senior Bioinformatician. Natalia is especially interested in the integrative analysis of different omics data, both at large scale and for small datasets, and loves collaborating in interdisciplinary environments and getting feedback from her fellow experimental colleagues. Currently, she’s coordinating several projects performing functional and pathway analysis at the genomic level. By grouping genes, proteins and other biological molecules into the pathways they are involved in, the complexity of the analyses is significantly reduced, while the explanatory power increases compared with a plain list of differentially expressed genes or proteins.


      Keywords
      Algorithmics, Data management, Genomics, Image analysis, Machine learning, Modeling, Proteomics, Sequence analysis, Structural bioinformatics, Transcriptomics, Database, Genome analysis, Biostatistics, Scientific computing, Databases and ontologies, Application of mathematics in sciences, Data and text mining, Genetics, Graphics and Image Processing, Biosensors and biomarkers, Clinical research, Cell biology and developmental biology, Interactomics, Bioimage analysis
      Organisms

      Projects (24)

      Related projects (33)

      Genotype to phenotype analysis of immune responses in chronic inflammatory diseases



      Project status : In Progress

      Evolutionary relationships between giant viruses and eukaryotes

      The phylogenetic position and status of “giant viruses”, formerly called NucleoCytoplasmic Large DNA viruses (NCLDV) or putative order Megavirales, are controversial. Many preliminary phylogenetic analyses have been published, but their presentations are usually highly biased by the prejudice of the authors concerning the nature of giant viruses. Our own preliminary analyses suggest that giant viruses are indeed ancient (they predate the last universal eukaryotic ancestor) and have possibly provided important functions to emerging eukaryotic cells (e.g. DNA topoisomerase activities). The number of giant virus genomes has recently increased dramatically, opening new opportunities to study their position in the “universal tree of life” and their evolutionary relationships with eukaryotes. The aim of the project is to perform an exhaustive phylogenetic analysis of all giant virus proteins with eukaryotic (archaeal/bacterial) homologues to (i) test the monophyly of giant viruses, (ii) determine their contribution to early eukaryotic evolution, and (iii) determine whether some giant virus proteins can be useful to root the eukaryotic tree. We need the help of a bioinformatics colleague with good expertise in building phylogenetic trees from large data sets using different methods of tree construction and robustness evaluation. This work will be complemented by a systematic search for significant indels (insertions/deletions) in the alignments obtained by two members of the BMGE team (Patrick Forterre and Morgan Gaia).



      Project status : In Progress

      Phylogenetic analysis of the Leishmania HSP70 protein family



      Project status : Closed

      N/A



      Project status : In Progress

      Comparative genomic and phylogenetic analysis of Clostridium baratii strains



      Project status : Closed

      Identification of new or unexpected pathogens, including viruses, bacteria, fungi and parasites associated with acute or progressive diseases

      Microbial discovery remains a challenging task with many unmet medical and public health needs. Deep sequencing has profoundly modified this field, which can be summarized in two questions: (i) which pathogens or associations of pathogens are associated with diseases of unknown etiology, and (ii) among microbes infecting animal (including arthropod) reservoirs, which ones are able to infect large vertebrates, including humans. We are currently addressing these two questions, and our current request comes with the willingness of Institut Pasteur to increase its contribution to and visibility in this field, in particular in relation with hospitals and the Institut Pasteur International network (IPIN). We expect to identify new microbes associated with human diseases, which is expected to pave the way for basic research programs focusing on virulence mechanisms and host specificity, and will also lead to phylogenetic and epidemiological studies (frequency of host infection, mode of transmission, etc.), as well as the development of improved diagnostic tests for human infections. Our objective is also to contribute to the efforts of Institut Pasteur in the field of infectious diseases by building a pipeline, from sample to microbial identification, able to manage large cohorts of samples. This project is currently supported by the LABEX IBEID and the CITECH, and critically requires bioIT support, justifying this application. Partners include different hospitals, including Necker-Enfants malades University Hospital for patients with progressive disease, different IPIN laboratories, as well as INRA and CIRAD for animal/arthropod reservoirs.



      Project status : In Progress

      JASS: an online tool for the joint analysis of GWAS summary statistics

      In recent years, large genome-wide association studies (GWAS) have been successful in identifying thousands of significant genetic associations for multiple traits and diseases [1]. In the course of this endeavor, sample size has proven to be the key factor for identifying new variants. For example, GWAS of body mass index (BMI), now including up to 350,000 individuals from more than 100 cohorts, have been able to identify genetic variants that explain as little as 0.02% of BMI variance [2]. While standard approaches for detecting new genetic variants associated with traits and diseases will go on as sample size increases, multivariate analyses have been proposed as an alternative strategy for both improving detection of new variants and exploring the multidimensional components of complex traits and diseases. Intuitively, multivariate analysis can be used to improve detection of variants displaying a pleiotropic effect [3] by accumulating moderate evidence of association across multiple traits and diseases. Several recent examples have been published not only of GWAS hit overlap across related traits [4], but also of genome-wide shared genetic effects [5]. Multivariate analyses of GWAS have also proven useful for understanding shared genetics between diseases [5] and potential causal relationships between phenotypes using Mendelian randomization (MR) [6]. Importantly, most existing multivariate methods are based on GWAS summary statistics, while approaches based on individual-level data have seldom been considered because of major practical and ethical issues. In the continuity of ongoing work on multi-phenotype analysis (Aschard et al. 2014 [7], Aschard et al. 2015 [8]), we developed an effective and robust multivariate approach to GWAS summary statistics that addresses the major barriers of existing approaches, i.e. the presence of correlation between studies that exists when the GWAS analyzed share samples [9-16].
Our approach consists in a robust omnibus multivariate test of GWAS summary statis
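The description above is cut off before it specifies the test, so the following is only a generic illustration of how an omnibus test can accumulate moderate evidence across traits; it is not the JASS implementation. It is a minimal sketch assuming two traits, per-trait z-scores for a single variant, and a known null correlation `r` between them (for instance due to shared samples). The function names and inputs are hypothetical.

```python
import math


def omnibus_two_traits(z1: float, z2: float, r: float) -> tuple[float, float]:
    """Joint (omnibus) test of one variant against two traits.

    z1, z2: per-trait GWAS z-scores for the variant (hypothetical inputs).
    r: assumed correlation between the two z-scores under the null,
       e.g. induced by overlapping samples; must satisfy |r| < 1.
    Returns (statistic, p_value).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # Quadratic form z^T R^{-1} z with R = [[1, r], [r, 1]], inverted in closed form.
    stat = (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)
    # Under the null the statistic is chi-square with 2 degrees of freedom,
    # whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a single z-score under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

With `z1 = z2 = 3.0` and `r = 0.0`, each trait alone gives a two-sided p-value of about 0.0027, while the joint statistic is 18 and its p-value is about 1.2e-4. Setting `r = 0.5` lowers the statistic to 12, which shows why ignoring correlation between studies would overstate the evidence.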



      Project status : Awaiting Publication

      Genetic profile of patients with dyslexia

      Background: Dyslexia is characterized by difficulty with learning to read fluently and with accurate comprehension despite normal intelligence. It affects 5–10% of school-age children. Familial studies repeatedly showed that first-degree relatives of affected individuals have a 30–50% risk of developing the disorder. Twin studies showed that heritability was approximately 50%, with a higher concordance rate for monozygotic twins compared to dizygotic twins. Although genetic factors contribute to dyslexia, very little is known about the genes associated with the condition. Preliminary data: Our project consists of the complementary analysis of (i) a cohort of 209 patients with dyslexia, 89 relatives and 95 very well-phenotyped controls and (ii) an extended pedigree (Nantaise family) with 12 members diagnosed with dyslexia in three generations. For all the individuals of the project, we genotyped >600K SNPs in order to detect SNP associations and copy-number variants (CNVs). For the extended pedigree, we also used linkage analysis and whole-genome sequencing (WGS). Our preliminary results indicate that a single region on chromosome 7q36 is segregating with dyslexia in the Nantaise family. The region is located within CNTNAP2, a gene previously proposed as a susceptibility gene, but without formal proof of its association. WGS of three affected and three unaffected individuals of the pedigree was performed to detect all the variants in the linkage region. Project: We propose to use this unique resource in France to characterize the genetic profile of patients with dyslexia. We will (i) detect the CNVs present in the patients and (ii) detect the variants in the linkage region.



      Project status : In Progress