Hub members have many areas of expertise, covering most fields in bioinformatics and biostatistics. Below is a non-exhaustive list of these areas.
Searched keyword: Exploratory data analysis
Related people (9)
One of my projects consists of developing GRAVITY, a Java tool based on Cytoscape that integrates genetic variants within protein-protein interaction networks to allow the visual and statistical interpretation of next-generation sequencing data, ultimately helping geneticists and clinicians to identify causal variants and better diagnose their patients. I’m also involved in several other projects in the lab, taking part in the design of pipelines for the processing and analysis of genomics data, including SNP arrays and whole-exome and whole-genome sequencing data. This means dealing with big data challenges, as the unit has to manage hundreds of terabytes of genomics data. Finally, I am now analysing these data to identify possible causes of autism, to help clinicians with their diagnosis but also to better understand the biological mechanisms at play in this complex disease. This is done through a project aiming at understanding the genetic architecture of autism in the Faroe Islands, and also through the newly started IMI2 European project AIMS2-Trials.
Algorithmics, Data management, Data Visualization, Genomics, Machine learning, Proteomics, Genome analysis, Biostatistics, Program development, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, Software development and engineering, Data and text mining, Genetics
I obtained an engineering degree in Biomedical Engineering from Université de Technologie de Compiègne (UTC) in 1989, a master's degree in Control of Complex Systems from UTC in 1990, a PhD in Control of Complex Systems from UTC in 1993, a University Degree in Human Genetics from the University of Rennes 1 in 2001 and a master's degree in Functional Genomics from University Paris Diderot (Paris 7) in 2002. I worked as a statistician at the Transcriptome and Epigenome Platform from 2002 to 2017, where I was responsible for the statistical analyses of the data and was heavily involved in training (on campus and elsewhere). Since 2015 I have been co-head of the Bioinformatics and Biostatistics Hub within the Center of Bioinformatics, Biostatistics and Integrative Biology (C3BI). I am co-director of the Pasteur course Introduction to Data Analysis and co-organiser of the sincellTE summer school (a school dedicated to single-cell transcriptome and epigenome data analysis). I also co-manage the StatOmique group, which gathers more than 60 statisticians from France.
RNA-seq, Statistical inference, Transcriptomics, Biostatistics, Application of mathematics in sciences, Exploratory data analysis, Illumina HiSeq, Statistical experiment design, Sequencing
- Biomarkers for the early identification of sepsis in the emergency department (BIPS) (Jean-Marc CAVAILLON - Cytokines and Inflammation) - Awaiting Publication
- Study of the early pathogenesis during Lassa fever in cynomolgus monkeys and its correlation with the outcome (Sylvain BAIZE - Biology of Viral Emerging Infections) - In Progress
- Host microbiota modification by the pathogen Listeria monocytogenes (Javier PIZARRO-CERDA - Bacteria-Cell Interactions) - Closed
After a PhD in informatics on graph analysis (metabolic networks and sRNA-mRNA interaction graphs) at the LaBRI (Université de Bordeaux), I joined the DSIMB team (INTS) for a post-doc on structural modeling. I then completed a second post-doc at Metagenopolis – INRA Jouy-en-Josas, where I was introduced to the analysis of metagenomic data. I was recruited at the Hub in 2015, and since then I have pursued the development of methods dedicated to the analysis of metagenomic data, combining sequencing data processing, statistics, protein structural modeling and graph analysis.
Algorithmics, Clustering, Genome assembly, Genomics, Metabolomics, Modeling, Non coding RNA, Sequence analysis, Structural bioinformatics, Targeted metagenomics, Database, Genome analysis, Biostatistics, Program development, Scientific computing, Databases and ontologies, Exploratory data analysis, Data and text mining, Illumina HiSeq, Comparative metagenomics, Read mapping, Illumina MiSeq, Sequence homology analysis, Gene prediction, Multidimensional data analysis, Sequencing, Shotgun metagenomics
- Research of orphan LytTR DNA-binding domain (Alexis PROUTIÈRE - Biology of Gram-Positive Pathogens) - Pending
- Analysis of microbiota in Anopheles mosquitoes (Patricia BALDACCI - Center for Production and Infection of Anopheles) - Pending
- Assessing the role of gut microbiota in spondyloarthritis patients and impact of anti-TNF treatment on its composition (Corinne RICHARD-MICELI - Immunoregulation) - Closed
I am seeking to apply my knowledge of computer science and statistics to understand real-world data. I have an interdisciplinary background spanning complex systems, Big Data, machine learning, biostatistics and genomics. I completed a PhD in which I applied clustering and PCA to epigenomics data and gained new insights into the coupling between replication and epigenetics. I worked at Dataiku, a dynamic start-up where I helped clients build their Big Data strategy and draw value from their data. I studied the human microbiota for two years at MetaGenoPolis (MGP), an innovative research center that aims to improve human health by developing strategies (e.g. nutritional, therapeutic, preventive…) to restore dysbiosed microbiota with its industrial and academic partners. I currently work in the statistical genetics group at the Institut Pasteur, where I apply my software development and data science skills to quantify the impact of human genome variation on diverse health parameters.
Clustering, Data management, Genomics, Genome analysis, Exploratory data analysis, Genetics, Comparative metagenomics, Dimensional reduction, Multidimensional data analysis
After a PhD in Biology in 2011 on population genetics and phylogeography of amazing little amphipods (Crangonyx, Crymostygius) at the University of Reykjavik (Iceland), I pursued my interest in Bioinformatics and Evolutionary Biology in various post-docs in Spain (MNCN Madrid, UB Barcelona). During this time, I investigated transcriptomic landscapes of various non-model species (groups Conus, Junco and Caecilians) using de novo assemblies and participated in the development of TRUFA, a web platform for de novo RNA-seq analysis. In July 2016, I joined the Revive Consortium and the Epigenetic Regulation unit at the Pasteur Institute, where my main focus was transcriptomic and epigenetic analyses on various topics using short- and long-read technologies, with a special interest in the detection of alternative splicing events. I joined the Bioinformatics and Biostatistics Hub in January 2018. My latest interests are long-read technologies, alternative splicing and achieving reproducibility in Bioinformatics using workflow managers, container technologies and literate programming.
Data management, Data Visualization, Sequence analysis, Transcriptomics, Web development, Genome analysis, Program development, Exploratory data analysis, Software development and engineering, Genetics, Evolution, Read mapping, Workflow and pipeline development, Population genetics, Motifs and patterns detection, Grid and cloud computing
Human, Insect or arthropod, Other animal, Anopheles gambiae (African malaria mosquito), Mouse
- Build software to decipher Gephyrin alternative transcripts obtained with long read sequencing (Eric ALLEMAND - Epigenetic Regulation) - In Progress
- Transcriptomics of Anopheles – Plasmodium vivax interactions towards identification of malaria transmission blocking targets (Catherine BOURGOUIN - Functional Genetics of Infectious Diseases) - In Progress
- Mapping of Enhancers from transcriptome data (Christian MUCHARDT - Epigenetic Regulation) - In Progress
Data management, Machine learning, Statistical inference, Scientific computing, Exploratory data analysis, Software development and engineering, Parallel computing, Neuroimaging and computational neuroscience, Grid and cloud computing
After a PhD in the biochemistry of rapeseed proteins, during which I developed my first automated scripts for data processing and analysis, I joined the Danone research center to develop multivariate models for predicting milk protein composition using infrared spectrometry.
As I was already developing my own informatics tools, I decided to join the informatics for biology course of the Institut Pasteur in 2007. At the end of the course I was recruited by the Institute and joined the unit of “génétique des interactions macromoléculaires” of Alain Jacquier. Within this group, I learned to handle sequencing data and developed processing and analysis tools using Python and R. I also created a genome browser and database system for storing, retrieving and visualizing microarray data. After 8 years in Alain Jacquier’s lab, I joined the Hub of bioinformatics and biostatistics as co-head of the team.
Clustering, Data management, Sequence analysis, Transcriptomics, Web development, Database, Genome analysis, Program development, Scientific computing, Exploratory data analysis, Data and text mining, Illumina HiSeq, Read mapping, LIMS, Illumina MiSeq, High Throughput Screening, Multidimensional data analysis, Workflow and pipeline development, Ribosome profiling, Motifs and patterns detection
- SHERLOCK4HAT - WP1.1 (Brice ROTUREAU - Group: Trypanosome transmission) - In Progress
- Bring the Genolist servers such as LegioList, TuberclListe, Colibri, etc. back into service (Carmen BUCHRIESER - Biology Of Intracellular Bacteria) - Closed
- Identification of eukaryotic 5'UTRs (Arnaud ECHARD - Membrane Traffic and Cell Division) - Closed
Since February 2017: Research engineer, Hub of Bioinformatics and Biostatistics of the C3BI, Institut Pasteur
2015-2017: Postdoctoral position, team MISTIS, INRIA Grenoble. Topic: Robust clustering and robust non-linear regression in high dimension. Collaboration with Florence Forbes (INRIA).
2012-2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes. Topic: Stability of variable selection in regression and classification issues for correlated data in high dimension. Supervisor: David Causeur (Agrocampus-Ouest, IRMAR).
Education
2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes
2012: ISUP degree (Institut de Statistique de l’UPMC), Université Pierre et Marie Curie, Paris
2012: Master 2 of Statistics, Université Pierre et Marie Curie, Paris
Clustering, Modeling, Statistical inference, Transcriptomics, Biostatistics, Exploratory data analysis, Dimensional reduction, Statistical experiment design, Multidimensional data analysis
- Modulation of Flu transmission in Niger, according to climate variations over the past ten years (Ronan JAMBOU - Other) - In Progress
- Left-right patterning of heart precursors (Tobias BØNNELYKKE - Heart Morphogenesis) - In Progress
- Collaboration between CETEA, C2RA and Hub for optimization of experimental designs (3R) (Myriam MATTEI - Center for Animal Resources and Research) - In Progress
Hugo Varet is a biostatistics engineer who graduated from the Ensai (Ecole Nationale de la Statistique et de l’Analyse de l’Information) and was recruited in 2013 by the Transcriptome & Epigenome Platform of the Biomics Pole. In late 2014 he obtained a permanent position at the Bioinformatics & Biostatistics Hub and was seconded to the platform to continue the statistical analyses of RNA-Seq data and to develop R pipelines and Shiny applications that help with this task. One of them is named SARTools and is available on GitHub: https://github.com/PF2-pasteur-fr/SARTools. In December 2019 he left the Biomics Platform and joined the Bioinformatics & Biostatistics Hub as a core member.
Modeling, Sequence analysis, Statistical inference, Transcriptomics, Biostatistics, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, High Throughput Screening, Clinical research
- Effect of metabolic disturbance on the host transcriptional profile during Chlamydia infection (Agathe SUBTIL - Cellular Biology of Microbial Infection) - Pending
- Correlative analysis between lipid droplets number and volume in hepatocytes infected by hepatitis C virus variants (Emeline SIMON - Molecular Genetics of RNA Viruses) - In Progress
- Analysis of the transcriptome during lyssavirus infection in torpid bat: an in vitro model. Act 1 (Laurent DACHEUX - Lyssavirus Dynamics and Host Adaptation) - Pending
Related projects (3)
Asymptomatic pathogen carriage in stunted and non-stunted children living in Antananarivo, Madagascar
This project is integrated in the analysis of the gut ecosystem of children implicated in the AFRIBIOTA project, a translational project performed within a consortium of researchers and medical doctor
The provision of human biological material collected, processed and stored under optimal conditions is crucial to ensure the quality of research carried out downstream. These optimal conditions must b
We are interested in spondyloarthritis, a chronic inflammatory rheumatism. Currently 2 biologic treatments are available: anti-TNF and anti-IL-17A. We are analyzing how these