Hub members have a wide range of expertise, covering most fields in bioinformatics and biostatistics. Below is a non-exhaustive list of these areas of expertise.


Searched keyword : Exploratory data analysis

Related people (9)


One of my projects is the development of GRAVITY, a Java tool based on Cytoscape that integrates genetic variants into protein-protein interaction networks to allow the visual and statistical interpretation of next-generation sequencing data, ultimately helping geneticists and clinicians to identify causal variants and better diagnose their patients. I’m also involved in several other projects in the lab, taking part in the design of pipelines for processing and analysing genomics data, including SNP arrays, whole-exome and whole-genome sequencing data. This means dealing with big data challenges, as the unit has to manage hundreds of terabytes of genomics data. Finally, I am now analysing these data to identify possible causes of autism, to help clinicians with their diagnoses and to better understand the biological mechanisms at play in this complex disease. This work is carried out through a project that aims to understand the genetic architecture of autism in the Faroe Islands, and through the newly started IMI2 European project AIMS2-Trials.

Algorithmics, Data management, Data Visualization, Genomics, Machine learning, Proteomics, Genome analysis, Biostatistics, Program development, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, Software development and engineering, Data and text mining, Genetics

Projects (0)

    Marie-Agnès DILLIES

    Group : HEAD - Hub Core

    I obtained an engineering degree in Biomedical Engineering from Université de Technologie de Compiègne (UTC) in 1989, a master's degree in Control of Complex Systems from UTC in 1990, a PhD in Control of Complex Systems from UTC in 1993, a university degree in Human Genetics from the University of Rennes 1 in 2001 and a master's degree in Functional Genomics from University Paris Diderot (Paris 7) in 2002. I worked as a statistician at the Transcriptome and Epigenome Platform from 2002 to 2017, where I was responsible for the statistical analysis of the data and was heavily involved in training, both on campus and outside. Since 2015 I have been co-head of the Bioinformatics and Biostatistics Hub within the Center of Bioinformatics, Biostatistics and Integrative Biology (C3BI). I am co-director of the Pasteur course Introduction to Data Analysis and co-organiser of the sincellTE summer school (a school dedicated to single-cell transcriptome and epigenome data analysis). I also co-manage the StatOmique group, which brings together more than 60 statisticians from France.

    RNA-seq, Statistical inference, Transcriptomics, Biostatistics, Application of mathematics in sciences, Exploratory data analysis, Illumina HiSeq, Statistical experiment design, Sequencing

    Projects (4)

    Amine GHOZLANE

    Group : SINGLE - Detached : Biomics

    After a PhD in informatics on graph analysis (metabolic networks and sRNA-mRNA interaction graphs) at the LaBRI (Université de Bordeaux), I joined the DSIMB team (INTS) for a post-doc on structural modeling. I then did a second post-doc at Metagenopolis – INRA Jouy-en-Josas, where I was introduced to the analysis of metagenomic data. I was recruited at the HUB in 2015, and since then I have been developing methods for processing metagenomic data that combine sequencing data processing, statistics, protein structural modeling and graph analysis.

    Algorithmics, Clustering, Genome assembly, Genomics, Metabolomics, Modeling, Non coding RNA, Sequence analysis, Structural bioinformatics, Targeted metagenomics, Database, Genome analysis, Biostatistics, Program development, Scientific computing, Databases and ontologies, Exploratory data analysis, Data and text mining, Illumina HiSeq, Comparative metagenomics, Read mapping, Illumina MiSeq, Sequence homology analysis, Gene prediction, Multidimensional data analysis, Sequencing, Shotgun metagenomics

    Projects (21)

    Hanna JULIENNE

    Group : DETACHED - Detached : Statistical Genetics

    I am seeking to apply my knowledge of computer science and statistics to understand real-world data. I have an interdisciplinary background spanning complex systems, Big Data, machine learning, biostatistics and genomics. I completed a PhD in which I applied clustering and PCA to epigenomics data and gained new insights into the coupling between replication and epigenetics. I worked at Dataiku, a dynamic start-up, where I helped clients build their Big Data strategy and draw value from their data. I studied the human microbiota for two years at MetaGenoPolis (MGP), an innovative research center that aims to improve human health by developing strategies (e.g. nutritional, therapeutic, preventive…) to restore dysbiosed microbiota with its industrial and academic partners. I currently work in the statistical genetics group at the Institut Pasteur, where I apply my software development and data science skills to quantify the impact of human genome variation on diverse health parameters.

    Clustering, Data management, Genomics, Genome analysis, Exploratory data analysis, Genetics, Comparative metagenomics, Dimensional reduction, Multidimensional data analysis

    Projects (2)

    Etienne KORNOBIS

    Group : GORE - Embedded : Epigenetic regulation

    After a PhD in Biology in 2011 on the population genetics and phylogeography of amazing little amphipods (Crangonyx, Crymostygius) at the University of Reykjavik (Iceland), I pursued my interest in Bioinformatics and Evolutionary Biology in various post-docs in Spain (MNCN Madrid, UB Barcelona). During this time, I investigated transcriptomic landscapes for various non-model species (groups Conus, Junco and Caecilians) using de novo assemblies and participated in the development of TRUFA, a web platform for de novo RNA-seq analysis. In July 2016, I joined the Revive Consortium and the Epigenetic Regulation unit at Pasteur Institute, where my main focus was transcriptomic and epigenetic analyses on various topics using short- and long-read technologies, with a special interest in the detection of alternative splicing events. I joined the Bioinformatics and Biostatistics Hub in January 2018. My latest interests are long-read technologies, alternative splicing and achieving reproducibility in Bioinformatics using workflow managers, container technologies and literate programming.

    Data management, Data Visualization, Sequence analysis, Transcriptomics, Web development, Genome analysis, Program development, Exploratory data analysis, Software development and engineering, Genetics, Evolution, Read mapping, Workflow and pipeline development, Population genetics, Motifs and patterns detection, Grid and cloud computing
    Human, Insect or arthropod, Other animal, Anopheles gambiae (African malaria mosquito), Mouse
    Projects (3)

    Christophe MALABAT

    Group : HEAD - Hub Core

    After a PhD in the biochemistry of rapeseed proteins, during which I developed my first automated scripts for data processing and analysis, I joined the Danone research center to develop multivariate models for predicting milk protein composition using infrared spectrometry.
    As I was already developing my own informatics tools, I decided to join the informatics for biology course of the Institut Pasteur in 2007. At the end of the course I was recruited by the Institute and joined the unit of “génétique des interactions macromoléculaires” of Alain Jacquier. Within this group, I learned to handle sequencing data and developed processing and analysis tools using Python and R. I also created a genome browser and database system for storing, retrieving and visualizing microarray data. After 8 years in Alain Jacquier’s lab, I joined the Hub of bioinformatics and biostatistics as co-head of the team.

    Clustering, Data management, Sequence analysis, Transcriptomics, Web development, Database, Genome analysis, Program development, Scientific computing, Exploratory data analysis, Data and text mining, Illumina HiSeq, Read mapping, LIMS, Illumina MiSeq, High Throughput Screening, Multidimensional data analysis, Workflow and pipeline development, Ribosome profiling, Motifs and patterns detection

    Projects (9)

    Emeline PERTHAME

    Group : Stats - Hub Core

    Since February 2017: Research engineer, Hub of Bioinformatics and Biostatistics of the C3BI, Institut Pasteur
    2015-2017: Postdoctoral position, team MISTIS, INRIA Grenoble. Topic: Robust clustering and robust nonlinear regression in high dimension. Collaboration with Florence Forbes (INRIA).
    2012-2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes. Topic: Stability of variable selection in regression and classification issues for correlated data in high dimension. Supervisor: David Causeur (Agrocampus-Ouest, IRMAR).
    Education
    2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes
    2012: ISUP degree (Institut de Statistique de l’UPMC), Université Pierre et Marie Curie, Paris
    2012: Master 2 of Statistics, Université Pierre et Marie Curie, Paris

    Clustering, Modeling, Statistical inference, Transcriptomics, Biostatistics, Exploratory data analysis, Dimensional reduction, Statistical experiment design, Multidimensional data analysis

    Projects (16)

    Hugo VARET

    Group : PLATEFORM - Detached : Biomics

    Hugo Varet is a biostatistics engineer who graduated from the Ensai (Ecole Nationale de la Statistique et de l’Analyse de l’Information) and was recruited by the hub of the C3BI (Center of Bioinformatics, Biostatistics and Integrative Biology) to work at the Transcriptome & Epigenome Platform. He is in charge of the statistical analysis of the RNA-Seq data produced by the platform and develops R pipelines that help with this task. One of them, SARTools, is available on GitHub.

    Modeling, Sequence analysis, Statistical inference, Transcriptomics, Biostatistics, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, High Throughput Screening, Clinical research

    Projects (17)