Hub members have many areas of expertise, covering most fields of bioinformatics and biostatistics. Below is a non-exhaustive list of these areas.
Searched keyword: Machine learning
Related people (10)
CV
- Senior Bioinformatician, August 2015 – Present: Institut Pasteur, Paris
- PostDoc fellow, 2011 – 2015: Pascale Cossart’s laboratory, Unité des Interactions Bactéries-Cellules, Institut Pasteur, Paris
- PhD fellow, 2007 – 2010: Institut des Hautes Etudes Scientifiques and Ecole Normale Supérieure, Paris
- Magister of Science, Theoretical Physics, 2003 – 2007: Dynamical systems and statistics of complex matter, Université Paris 7 and Université Paris 6
Biophysics, Machine learning, Modeling, Proteomics, Biostatistics, Databases and ontologies, Host-pathogen interactions
- Analysis of DNA methylation in the presence and absence of antibiotics in wt and mutant V. cholerae (Baharoglu ZEYNEP - Bacterial Genome Plasticity) - Closed
- Finding and Predicting CRISPR-Cas9 Efficiency (Jerome WONG NG - Synthetic Biology) - Closed
- Characterization of a Salmonella mutant carrying a single amino-acid substitution in the stress sigma factor RpoS (Françoise NOREL - Biochemistry of Macromolecular Interactions) - Closed
I received a Ph.D. in Biostatistics and Bioinformatics applied to Cancer Research in 2011 from the University Paris Sud XI, while working at the Curie Institute under the supervision of Emmanuel Barillot and François Radvanyi. My Ph.D. focused on the unsupervised analysis of the cancer transcriptome. During my postdoctoral work, I focused on the computational and statistical analysis of NGS data. My areas of interest and expertise include:
- functional genomics
- human genetics
- statistical analysis of high-dimensional data
- normalization, batch-correction, meta-analysis of high-throughput data
- unsupervised learning, independent component analysis
- NGS data analysis (RNA-Seq, DNA-Seq, …)
- analysis of the non-coding genome, transposable elements
- Characterization of Yolk Sac Derived Progenitors in the Fetal Liver (Laina FREYER - Macrophages and Endothelial Cells) - Pending
- Gene expression and its regulation during and after inpatient detoxification of cocaine: a link to relapse? (Romain ICICK - Integrative Neurobiology of Cholinergic Systems) - Pending
One of my projects consists of developing GRAVITY, a Java tool based on Cytoscape that integrates genetic variants within protein-protein interaction networks to allow the visual and statistical interpretation of next-generation sequencing data, ultimately helping geneticists and clinicians to identify causal variants and better diagnose their patients. I’m also involved in several other projects in the lab, taking part in the design of pipelines for the processing and analysis of genomics data, including SNP arrays, whole-exome and whole-genome sequencing data. This means facing big-data challenges, as the unit has to manage hundreds of terabytes of genomics data. Finally, I am now analysing these data to identify possible causes of autism, both to help clinicians with their diagnoses and to better understand the biological mechanisms at play in this complex disease. This is done through a project aiming to understand the genetic architecture of autism in the Faroe Islands, and through the newly started IMI2 European project AIMS2-Trials.
Algorithmics, Data management, Data Visualization, Genomics, Machine learning, Proteomics, Genome analysis, Biostatistics, Program development, Scientific computing, Application of mathematics in sciences, Exploratory data analysis, Software development and engineering, Data and text mining, Genetics
I joined the Bioinformatics and Biostatistics Hub at Institut Pasteur in 2016, where I am currently developing NGS-related pipelines for the Biomics Pôle. I have interdisciplinary research experience: after a PhD in Astronomy (gravitational wave data analysis), I joined several research institutes to work in the fields of plant modelling (INRIA, Montpellier, 2008-2011), Systems Biology, in particular logical modelling (EMBL-EBI, Cambridge, U.K., 2011-2015), and drug discovery (Sanger Institute, Cambridge, U.K., 2015). On a daily basis, I use data analysis and machine learning techniques within high-quality software to tackle scientific problems.
Algorithmics, Data management, Data Visualization, Genome assembly, Genomics, Machine learning, Modeling, Scientific computing, Databases and ontologies, Software development and engineering, Data and text mining, Illumina HiSeq, Graph theory and analysis, Illumina MiSeq
Since September 2016, I have been a research engineer in the Bioinformatics and Biostatistics HUB of the Institut Pasteur, seconded to the Proteomics facility. I have a PhD in Signal Processing from the Ecole Nationale Supérieure des Télécommunications de Bretagne (Telecom Bretagne) and a Master's degree in Mathematics with a specialty in Statistical Engineering from Rennes 1 University. After my PhD, I was a research and teaching assistant in Mathematics at the Institut National des Sciences Appliquées (INSA) of Rennes, then I worked as a consultant for local public authorities at the company Ressources Consultants Finances. I started working in the field of Proteomics in October 2014 in the EDyP laboratory located in Grenoble (http://www.edyp.fr/), where I worked on improving the statistical analysis of bottom-up proteomics data. Today, most of the projects I work on consist of detecting changes in protein abundances using discovery-driven mass spectrometry. I am interested in the development of new methodologies to optimize proteomics data analysis pipelines, from the identification of peptides/proteins to their quantification and the interpretation of results. For this purpose, I have worked on several R packages which can be downloaded from CRAN and Bioconductor: cp4p (https://cran.r-project.org/web/packages/cp4p/index.html), imp4p (https://cran.r-project.org/web/packages/imp4p/index.html), DAPAR (http://bioconductor.org/packages/release/bioc/html/DAPAR.html) and its GUI ProStar.
Machine learning, Modeling, Pathway Analysis, Proteomics, Statistical inference, Biostatistics, Application of mathematics in sciences, Data and text mining, Data integration, Statistical experiment design, Multidimensional data analysis
Data Visualization, Machine learning, Statistical inference, Biostatistics, Application of mathematics in sciences, Dimensional reduction, Multidimensional data analysis
- Optimisation of the freezing and preservation method for peripheral blood mononuclear cells (SORDOILLET VALLIER - Other) - Pending
- Afribiota-Neuro (Pascale VONAESCH - Molecular Microbial Pathogenesis) - In Progress
- Genetic and statistical analysis of data produced with the Collaborative Cross at the Institut Pasteur (Xavier MONTAGUTELLI - Mouse Genetics) - In Progress
Bernd Jagla received his PhD in bioinformatics (department of Biology, Chemistry, and Pharmacy) from the Free University in Berlin, Germany in 1999. Before joining the Institut Pasteur, he worked for almost ten years in New York City, including as an associate research scientist in the Joint Centers for Systems Biology (Columbia University) and at the Columbia University Screening Center led by Dr J.E. Rothman. He joined the Institut Pasteur in 2009 to take charge of the bioinformatics needs of the Transcriptome et Epigenome platform, focusing on Next Generation Sequencing. Since 2016 he has been a member of the C3BI – HUB Team, seconded to the Human immunology center (CIH), where he provides support for cytometry, next generation sequencing, and microarray data analysis. His areas of interest include quality assurance, data analysis and visualization at the facility. He also has strong expertise in developing algorithms for function prediction from sequence data, image analysis, analysis of mass spectrometry data, and workflow management systems. While at Pasteur he developed:
- KNIME extensions for Next Generation Sequencing
- Post Alignment Visualization and Characterization of High-Throughput Sequencing Experiments
- Post Alignment statistics of Illumina reads
Algorithmics, ChIP-seq, Data management, Data Visualization, Image analysis, Machine learning, Sequence analysis, Database, Genome analysis, Biostatistics, Program development, Scientific computing, Data and text mining, Illumina HiSeq, Graphics and Image Processing, Illumina MiSeq, High Throughput Screening, Flow cytometry/cell sorting, Pac Bio
Data management, Machine learning, Statistical inference, Scientific computing, Exploratory data analysis, Software development and engineering, Parallel computing, Neuroimaging and computational neuroscience, Grid and cloud computing
Dr. Natalia Pietrosemoli is an Engineer with an M.Sc. in Modeling and Simulation of Complex Realities from the International Center for Theoretical Physics, ICTP and the International School of Advanced Studies, SISSA (Trieste, Italy). During her M.Sc. internships she mostly worked on modeling, optimization, combinatorics and information theory applied to medical imaging. In 2012 she received a Ph.D. in Computational Biology from the School of Bioengineering of Rice University (Houston, TX, US), where she specialized in computational structural biology and functional genomics. Her doctoral thesis, “Protein functional features extracted from primary sequences: a focus on disordered regions”, contributed to a better understanding of the functional and evolutionary role of intrinsic disorder in protein plasticity, complexity and adaptation to stress conditions. As part of her Ph.D., Natalia was a visiting scholar in two labs in Madrid: the Structural Computational Biology Group at the Spanish National Cancer Research Centre (CNIO), where she mainly worked on sequence analysis and the functional-structural relationships of proteins, and the Computational Systems Biology Group at the Spanish National Centre for Biotechnology (CNB-CSIC), where she studied the functional implications of intrinsically disordered proteins at the genomic level for several organisms, collaborating with different experimental and theoretical groups. In 2013, she joined the Swiss Institute of Bioinformatics as a postdoctoral fellow in the Bioinformatics Core Facility. Her main project consisted of the molecular classification of a rare type of lymphoma, which involved the integration of transcriptomic, clinical and mutational data for the identification of molecular markers for classification, diagnosis and prognosis. This work was performed in collaboration with the Pathology Institute at the University Hospital of Lausanne (CHUV).
In November 2015 Natalia joined the Hub Team @ Pasteur C3BI as a Senior Bioinformatician. Natalia is especially interested in the integrative analysis of different omics data, both at large scale and for small datasets, and loves collaborating in interdisciplinary environments and getting feedback from her experimental colleagues. Currently, she’s coordinating several projects performing functional and pathway analysis at the genomic level. Grouping genes, proteins and other biological molecules into the pathways they are involved in significantly reduces the complexity of the analyses, while increasing their explanatory power compared with a plain list of differentially expressed genes or proteins.
Algorithmics, Data management, Genomics, Image analysis, Machine learning, Modeling, Proteomics, Sequence analysis, Structural bioinformatics, Transcriptomics, Database, Genome analysis, Biostatistics, Scientific computing, Databases and ontologies, Application of mathematics in sciences, Data and text mining, Genetics, Graphics and Image Processing, Biosensors and biomarkers, Clinical research, Cell biology and developmental biology, Interactomics, Bioimage analysis
- Determination of the transcriptome controlled by the two-component system BvrR/BvrS using dominant positive and negative BvrR mutants (Javier PIZARRO-CERDA - Yersinia) - Pending
- Transcriptional analysis of intestinal cancer cells vs normal cells after co-culture with the cancer-associated bacterium Streptococcus gallolyticus (Ewa PASQUEREAU - Biology of Gram-Positive Pathogens) - Pending
- Functional interactomics of SKAP2 (Jean-François BUREAU - Functional Genetics of Infectious Diseases) - Pending
After earning an engineering degree in statistics from ENSAI (Ecole Nationale de la Statistique et de l’Analyse de l’Information) and a Ph.D. in applied mathematics in the Statistics & Genome lab (AgroParisTech), I worked as a developer for the XLSTAT software, where I implemented statistical methods such as mixture models, log-linear regression, the Mood test, Bayesian hierarchical modeling (CBC/HB), … I then worked as a head teacher in statistics for one year. I was recruited to the Bioinformatics and Biostatistics Hub of the C3BI (Center of Bioinformatics, Biostatistics and Integrative Biology) in 2014, where I am in charge of statistical analysis and the development of R/R Shiny pipelines.
Machine learning, Statistical inference, Targeted metagenomics, Biostatistics, Application of mathematics in sciences, Statistical experiment design
Related projects (3)
Chronic inflammatory diseases such as rheumatoid arthritis, inflammatory bowel disease, spondyloarthritis (SpA) and psoriasis cause significant morbidity and are a substantial burden for affected individuals and for society. An important obstacle to early diagnosis and to the development of more specific and effective therapies is the very limited understanding of the pathogenesis of these diseases. In recent years, genome-wide association studies have identified many genes that were not previously known to be involved in pathogenesis, and have linked several genes in immune pathways to inflammatory diseases, indicating that the immune system plays an essential role in the pathogenesis of these diseases. The current challenge is to correlate these genetic variants with the effector mechanisms implicated in pathogenesis, to allow translation of the genetic data into relevant diagnostics and innovative treatment strategies. To meet this challenge, we have designed a clinical study that examines the immune signaling pathways, the transcriptional networks and the genotype in the same SpA patient, in order to establish a link between genetic variation, cellular phenotype and function, and pathology. This approach will advance our understanding of the pathogenic mechanisms, and identify novel and relevant diagnostic tools, therapeutic targets and biomarkers. The long-term outcome of this strategy will be the rational design of specific therapies tailored to the genotype of the patient.
Identification of immune response signatures that correlate with therapeutic responses to TNF inhibitors using machine-learning algorithms
Anti-TNF therapy has revolutionized treatment of many chronic inflammatory diseases, including rheumatoid arthritis, Crohn’s disease and spondyloarthritis (SpA). However, clinical efficacy of TNF-inhi
Mitochondria are double-membrane bound organelles that are essential in every tissue of the body. They are metabolic hubs and signalling platforms that are deeply integrated into cellular homeostasis.