Hub members have expertise covering most fields of bioinformatics and biostatistics. Below is a non-exhaustive list of these areas of expertise.
Searched keyword: Clustering
Related people (5)
I work as a research engineer in the Bioinformatics and Biostatistics Hub of the Institut Pasteur. I hold a PhD in bioinformatics, and my main interest is in fast but robust phylogenetic inference algorithms and methods for large genome-scale datasets. Consequently, I am often involved in related bioinformatics projects, such as performing de novo or ab initio genome assemblies, designing and processing core genome typing schemes, building and analysing phylogenomics datasets, or implementing and distributing novel tools and methods.
Algorithmics, Clustering, Genome assembly, Genomics, Genotyping, Phylogenetics, Taxonomy, Genome analysis, Program development, Evolution, Sequence homology analysis
- High-throughput sequencing (NGS) and processing of DNA sequences from the variable domains of alpaca single-chain antibodies (VHH domains or Nanobodies®) (Margarida GOMES - Antibody Engineering) - Pending
- Antimalarial drug resistance in Africa: A comprehensive molecular analysis of the emergence of artemisinin-resistant parasites in Africa (Didier MENARD - Biology of Host-parasite Interactions) - In Progress
- Implementation of a fast cgMLST genotyping algorithm (Valérie BOUCHEZ - Molecular Prevention and Therapy of Human Diseases) - In Progress
After a PhD in informatics on graph analysis (metabolic networks and sRNA-mRNA interaction graphs) at the LaBRI (Université de Bordeaux), I joined the DSIMB team (INTS) for a post-doc on structural modeling. I then did a second post-doc at Metagenopolis – INRA Jouy-en-Josas, where I was introduced to the analysis of metagenomic data. I joined the Hub in 2015 and have since been developing methods for metagenomic data analysis that combine sequencing data processing, statistics, protein structural modeling, and graph analysis.
Algorithmics, Clustering, Genome assembly, Genomics, Metabolomics, Modeling, Non coding RNA, Sequence analysis, Structural bioinformatics, Targeted metagenomics, Database, Genome analysis, Biostatistics, Program development, Scientific computing, Databases and ontologies, Exploratory data analysis, Data and text mining, Illumina HiSeq, Comparative metagenomics, Read mapping, Illumina MiSeq, Sequence homology analysis, Gene prediction, Multidimensional data analysis, Sequencing, Shotgun metagenomics
- Characterization of the bacterial and fungal microbiota in Aedes aegypti natural breeding sites and larvae (Louis LAMBRECHTS - Insect-Virus Interactions) - Pending
- Targeted search of specific commensals in 16S databases (Pamela SCHNUPF - Molecular Microbial Pathogenesis) - In Progress
- Microbiota dysbiosis in human colon cancer (Iradj SOBHANI - Other) - Pending
I seek to apply my knowledge of computer science and statistics to understand real-world data. I have an interdisciplinary background spanning complex systems, Big Data, machine learning, biostatistics, and genomics. During my PhD, I applied clustering and PCA to epigenomics data and gained new insights into the coupling between replication and epigenetics. I then worked at Dataiku, a dynamic start-up, where I helped clients build their Big Data strategy and draw value from their data. I studied the human microbiota for two years at MetaGenoPolis (MGP), an innovative research center that aims to improve human health by developing strategies (e.g. nutritional, therapeutic, preventive…) to restore dysbiosed microbiota with its industrial and academic partners. I currently work in the statistical genetics group at the Institut Pasteur, where I apply my software development and data science skills to quantify the impact of human genome variation on diverse health parameters.
Clustering, Data management, Genomics, Genome analysis, Exploratory data analysis, Genetics, Comparative metagenomics, Dimensional reduction, Multidimensional data analysis
After a PhD on the biochemistry of rapeseed proteins, during which I developed my first automated scripts for data processing and analysis, I joined the Danone research center to develop multivariate models for predicting milk protein composition using infrared spectrometry.
As I was already developing my own informatics tools, I decided to take the Informatics for Biology course of the Institut Pasteur in 2007. At the end of the course I was recruited by the Institut and joined the “Génétique des interactions macromoléculaires” unit of Alain Jacquier. Within this group, I learned to handle sequencing data and developed processing and analysis tools in Python and R. I also created a genome browser and a database system for storing, retrieving, and visualizing microarray data. After 8 years in Alain Jacquier’s lab, I joined the Hub of Bioinformatics and Biostatistics as co-head of the team.
Clustering, Data management, Sequence analysis, Transcriptomics, Web development, Database, Genome analysis, Program development, Scientific computing, Exploratory data analysis, Data and text mining, Illumina HiSeq, Read mapping, LIMS, Illumina MiSeq, High Throughput Screening, Multidimensional data analysis, Workflow and pipeline development, Ribosome profiling, Motifs and patterns detection
- Identification of eukaryotic 5'UTRs (Arnaud ECHARD - Membrane Traffic and Cell Division) - Closed
- Super-resolution imaging and reconstructions of human cell chromosome architecture (Xian HAO - Imaging and Modeling) - In Progress
- Utilize mouse models to study infection by HIV-1 (Valentina LIBRI - Center for Translational Science) - Closed
Since February 2017: Research engineer, Hub of Bioinformatics and Biostatistics of the C3BI, Institut Pasteur
2015-2017: Postdoctoral position, MISTIS team, INRIA Grenoble. Topic: Robust clustering and robust nonlinear regression in high dimension. Collaboration with Florence Forbes (INRIA).
2012-2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes. Topic: Stability of variable selection in regression and classification issues for correlated data in high dimension. Supervisor: David Causeur (Agrocampus-Ouest, IRMAR).
Education
2015: PhD thesis in Statistics, Applied Mathematics Department of Agrocampus-Ouest, IRMAR UMR 6625 CNRS, Rennes
2012: ISUP degree (Institut de Statistique de l’UPMC), Université Pierre et Marie Curie, Paris
2012: Master 2 of Statistics, Université Pierre et Marie Curie, Paris
Clustering, Modeling, Statistical inference, Transcriptomics, Biostatistics, Exploratory data analysis, Dimensional reduction, Statistical experiment design, Multidimensional data analysis
- Analysis of the clinical manifestations of Lyme borreliosis in France from 2003 to 2011 (Valerie CHOUMET - Environment and Infectious Risks) - In Progress
- Distribution of Cytotoxic necrotizing factor 1 among the sequence type 131 emergent multidrug resistant lineage of Escherichia coli (TSOUMTSA MEDA LANDRY LAURE - Bacterial Toxins) - In Progress
- Quality controls for human plasmas and serums stored in biobanks (HELENE LAUDE - Biological Resource Center of Institut Pasteur (CRBIP)) - In Progress
Related projects (3)
Non-human primates are an important reservoir for zoonotic diseases. Here we analyze how human activities in the forest in Cameroon influence contact with non-human primates, to better understand the processes of disease emergence.
Mood disorders such as bipolar and major depressive illnesses are among the most severe psychiatric disorders. They have a high prevalence and a chronic course, and are associated with significant mental and somatic comorbidities and high personal and societal costs (lost productivity and increased medical expenses). Patients with bipolar disorder (BD), for example, have a reduced lifespan compared with the general population, a finding that cannot be explained solely by high suicide risk, reduced access to medical care, and lifestyle factors. However, the pathophysiological mechanisms of BD are poorly understood, and patients often respond incompletely to treatment. Advanced mathematical approaches such as machine learning techniques are increasingly being used to generate predictions from complex data, and they have been used successfully to detect a number of clinical outcomes and to predict behaviours. In combination with mobile technologies (e.g. smartphones, wearables) that collect behavioural, physiological, and environmental data, these big data predictive approaches may provide a much richer and deeper understanding of the phenomenology and pathophysiological mechanisms of mood and bipolar disorders. By taking advantage of the high-standard bioinformatics expertise offered by the C3BI, this multidisciplinary, collaborative project aims to explore how clinical and biological factors may contribute to better characterizing BD patients and to identifying predictors of treatment response in BD. Our project also aims to explore how daily behavioural and physiological parameters may influence mood and behaviour in individuals at risk of or suffering from mood disorders.