Data analysis ENS T109 – 2018/2019

Dates

October 8th to 12th, 2018

 

Course objectives and description 

Biological data are often complex and challenging to analyse due to non-normal distributions, nonlinear relationships, spatial/temporal structures and high dimensionality. This course will introduce students to key concepts and statistical tools for the experimental design and analysis of biological data. After a brief refresher on basic elements of statistics, students will become familiar with hypothesis testing, univariate statistical tests (e.g. ANOVA), linear models, and descriptive multivariate analyses such as Principal Component Analysis (PCA) and clustering. The course will alternate between theory and computer exercises on small datasets using RStudio. Students will be assigned a small project involving the different concepts and tools covered by the course. Students are expected to bring their own laptops on the afternoons of Wednesday 10th and Friday 12th October.

 

Course materials are available below

           

Correction of MCQs

MCQ1 MCQ2 MCQ3

Introduction to data analysis 2018-19

EVENT : C3BI Training


Main speaker : C3BI Team
Autumn session:
Date : 19-10-2018 at 09:00 am
Location : Retrovirus room – LWOFF (14), Institut Pasteur, Paris
Winter session:
Date : 11-01-2019 at 09:00 am
Location : BFJ 28-01-01A, Institut Pasteur, Paris


This course is intended for first-year Ph.D. students from the Institut Pasteur: registration is automatic upon joining the institute. Depending on availability, second- and third-year Ph.D. students and postdocs may also apply. First-year Ph.D. students with a background in mathematics or physics may request an exemption.

The course will closely combine theory and practice. It will last four weeks, four days a week, with a three-hour lecture per day. We organize two sessions, the first one starting October 19th, 2018 and the second one starting January 11th, 2019. Each session will start with an Introduction to Computer Science to ensure that all students are familiar with essential computer science notions such as computer architecture, file system organization, file formats and programming languages. Following the statistics classes, an optional introduction to Image Analysis and Processing will be offered by the Image Analysis Hub (2 lectures).
Description
Introduction to Computer Science module : This one-lecture module will provide students with essential computer science notions such as computer architecture, file system organization, file format and programming languages. At the end of this lecture, there will be time left for questions regarding the needed configuration of students’ personal laptops for the Data and Image Analysis modules.
Data analysis module : The course covers a broad range of concepts that are needed for experiment design, data exploration and analysis, interpreting results and generating figures for publications. It will provide fundamental knowledge in statistics, including uni- and multivariate descriptive analyses, usual probability distributions and their application in biology, estimation, sampling and hypothesis testing. R and RStudio will be used for practice. Students are expected to install these tools before the beginning of the course: installation instructions are provided in the first part of the R course material.
Introduction to Image analysis module : The optional two-lecture image analysis module will introduce the basic principles of image analysis, that is, how to extract quantitative information from microscopy images. The course is designed for people who have no or very little experience in the field. It will be oriented towards practical use, and short lectures will be followed by hands-on sessions and tutorials. It should help experienced microscopists and beginners who have never had any formal training in image quantification.

Schedule
The detailed program of each session is online: fall 2018, winter 2019

Requirements
In order to follow the course all students need to bring a laptop and install R on it. Please check that your computer meets the minimum requirements listed below.
  • PC – Windows based : Intel i3 / Windows 7 / 4 GB RAM / 256 GB HD
  • Apple Macintosh : mid-2010 MacBook / OS X 10.10 / 4 GB RAM / 256 GB HD
  • PC – Linux based : Intel i3 / Any distribution (supporting R >= 3.5.1, if possible) / 4 GB RAM / 256 GB HD
Instructions to install R are provided at the beginning of the R course material. The week before the course, students are invited to get their laptop checked by the C3BI teaching team if necessary.

Application/Exemption

The form below has to be filled out either to request an exemption or to apply to the course.

  • Exemptions will be granted to students already trained in biostatistics (attach a CV and a letter from the supervisor).
  • Second- and third-year PhD students, as well as postdocs working at Pasteur Paris, may also apply.
Documents
    • Introduction – Computer science 101 – slides in pdf
    • Lectures 1 and 2 – First steps with R and RStudio – online slides (Read the beginning and install R before coming.) – full course archive, including exercise data
    • Lecture 3 – Random Variables – slides – R code
    • Lecture 4 – Estimation – slides – R code – data
    • Lecture 5-6 – Confidence intervals & Hypothesis testing – Slides and Exercises
    • Practical Session 1 – Exercises – Supplementary data – Answers
    • Lecture 7-9 – Introduction to statistical modelling – Slides – data – Rcode – code Anova
    • Practical Session 2 – Archive – Correction
    • Lecture 10 – Principal Component Analysis – Slides – data – Rcode
    • Lecture 11 – Clustering – Slides – Rcode
    • Lecture 12
      • Experimental design and statistical power – Slides and R data and script
      • Introduction to R Markdown – Slides and R data and script
    • Practical Session 3 – Archive
    • Image Analysis module – Course material

An 18-month post-doctoral position is available in the “Chemoinformatics and Proteochemometrics”

EVENT : C3BI Available position


Contact : Olivier Sperandio
Date : 18-09-2018
Location : Institut Pasteur, Paris


An 18-month post-doctoral position is available immediately in the “Chemoinformatics and Proteochemometrics” group (Dr O. Sperandio) of the Structural Bioinformatics unit (Pr M. Nilges) within the Structural Biology and Chemistry department.

Research project: Molecular modeling and protein-protein docking to characterize key molecular mechanisms that underlie the pathophysiology of osteoporosis.

The position is offered in the framework of the ANR-funded Targetbone collaborative project, which brings together the complementary expertise of the groups of Professor Martine Cohen-Solal (Hôpital Lariboisière, project coordinator), Professor Giovanni Levi (Museum National d’Histoire Naturelle) and the “Chemoinformatics and Proteochemometrics” group of Dr Olivier Sperandio at Institut Pasteur. The overall goal of the project is to provide an integrated understanding of the cellular and molecular mechanisms that underlie the pathophysiology of osteoporosis, focusing on the differentiation process of Bone Marrow Mesenchymal Stem Cells (BM-MSC) and bone marrow progenitors towards the osteoblastic lineage. Key transcription factors, playing an important role in osteogenesis, are expressed by BM-MSC and are upstream regulators of master genes involved in the induction of osteoblast differentiation. The general aim of the project is to characterize the cellular and molecular factors that promote BM-MSC differentiation by directly modifying the function of transcription factors in BM-MSCs or in more differentiated progenitors, in vivo and in vitro.

The contribution of our group to this project is to use molecular modeling and protein-protein docking to characterize the molecular interactions that those key transcription factors have with their known partners to promote BM-MSC differentiation at the molecular level. A tight collaboration is ongoing with the Pole Protein of Institut Pasteur for this project. This will bring valuable crystal structures to validate the modeling approach with one or several generated structures. The expected results are the functional and structural characterization of the interactions that those transcription factors make with some of their key partners in the context of osteoporosis. This opens new perspectives to identify druggable binding cavities, which will pave the way for future drug design projects.

Who are we looking for: The candidate must have a strong background in structural bioinformatics, homology modeling and protein-protein docking, ideally using techniques based on evolutionary information. The candidate should be familiar with the concept of druggable pockets and the various software that can profile them. The candidate must be highly motivated, have good communication skills in English, and be willing and able to work with team spirit in a highly interactive research consortium.

What are we offering: Funding for 18 months, with the possibility to extend the contract by applying for further funding. The possibility to be involved in other protein-protein docking projects, a topic in high demand on the Pasteur campus. A fruitful and highly cooperative environment with the rest of the department, the structural bioinformatics unit, and the bioinformatics center (C3BI), which include numerous talented structural biologists and bioinformaticians. Salary will be commensurate with experience according to the Institut Pasteur guidelines.

A first contact is usually established through a Skype interview, followed by an invitation to give an informal 30-minute talk to the team at the Institut Pasteur, and half a day of discussion with the members of the lab. A decision to hire is then taken after discussion with the team. Qualified applicants should send their CV, a statement of research interests and two letters of recommendation to olivier.sperandio@pasteur.fr

Hands-on microbiome data analysis: tools for understanding microbial communities in health and disease

EVENT : C3BI Training


Main speaker : Gregorio Iraola, from Institut Pasteur de Montevideo
Date : 03-12-2018 at 09:00 am
Location : Institut Pasteur de Montevideo


This course aims to provide the theoretical and practical concepts for standard bioinformatic analysis in the field of microbiome research. The course will focus on the application of state-of-the-art software tools for the analysis of environmental and host-associated microbiomes, with particular emphasis on understanding how they change or constitute a risk for human health. The course will have expert lectures and theoretical/practical data analysis sessions with real datasets.

 

STUDENT’S PRE-REQUISITES
  • Aimed at postgraduate (M.Sc. or Ph.D.) students.
  • Basic concepts of high-throughput sequencing technologies.
  • Basic understanding of metagenomics and microbial ecology.
  • Basic skills in the Linux terminal.

 

TEACHERS

Institut Pasteur Montevideo

  • Chair: Gregorio Iraola
  • Pablo Fresia
  • Daniela Costa
  • Cecilia Salazar
  • Verónica Antelo
  • Ignacio Ferrés
  • Matias Giménez
Institut Pasteur Paris
  • Marie Lopez
  • Amine Ghozlane
  • Angèle Benard
INVITED SPEAKERS
  • Gianfranco Grompone, Discovery Microbiome, Nutrition & Health Science Lead, Lesaffre, France.
  • David Danko, Director of Bioinformatics, MetaSUB International Consortium, Weill Cornell Medicine, US

APPLICATION DEADLINE: October 19, 2018. Send your CV (one page) and letter of motivation to: antonio.borderia@pasteur.fr

Flyer_Microbiome_health-course_Montevideo_2018

Integrated and spatial-temporal multiscale modeling of liver guide in vivo experiments in healthy & chronic disease states: a blue print for systems medicine?

EVENT : C3BI Seminars


Main speaker : Dirk Drasdo, from INRIA / IZBI Joint Research Group
Date : 20-09-2018 at 02:00 pm
Location : Retrovirus room – LWOFF building, Institut Pasteur, Paris


Background and Aims: Hyperammonemia after drug-induced peri-central liver lobule damage, as from an acetaminophen (paracetamol) overdose, can lead to encephalopathy and death of the patient. Guided by mathematical models, the consensus set of chemical reactions for detoxification of liver from ammonia has recently been shown to fail in explaining ammonia detoxification after drug-induced peri-central damage (Schliess et al., 2014). Our aim is to demonstrate how integrated and spatial-temporal models mimicking detoxification of the blood from ammonia in virtual tissue samples can assist in guiding identification of missing molecular mechanisms, or predicting the impact of micro-architectural alterations due to acute or chronic damage on ammonia detoxification. Our modeling methodology is very general.

Method: The consensus and alternative detoxification mechanisms have been implemented within integrated and spatial-temporal multi-scale mathematical models to test various hypotheses on potentially missing mechanisms in ammonia detoxification during liver regeneration after drug-induced pericentral damage in silico in a virtual liver lobule (Drasdo et al., J. Hepat. 2014). The multi-scale model simulates blood flow and molecular transport in the spatial lobule micro-architecture and displays each individual hepatocyte in space and time. Detoxification reactions are executed in each virtual hepatocyte. This makes in silico testing of hypothesized mechanisms feasible from the molecular up to the tissue scale. The results are directly compared to experiments in mouse. Finally, fibrotic streets have been added to the model to predict the possible impact of architectural distortions and micro-shunts.

Results: We demonstrate how multiscale and multilevel models guided experiments towards identification of a previously unrecognized ammonia detoxification mechanism that has the potential to improve treatment of hyperammonemia (Ghallab et al., J. Hepat. 2016). The same model predicts for CCl4-induced fibrosis a reduced detoxification capacity for ammonia. Finally, we outline how the whole-body scale can be included to arrive at a model spanning the molecular up to the whole-body scale, permitting study of the effect of molecular changes and micro-architecture on whole-body blood circulation, and briefly summarize results of integration of APAP toxic pathway as HGF signaling.

Conclusion: Refined multi-scale models increasingly permit realistic prediction of liver function as well as of toxic injury in acute and chronic damage states. Those models can integrate data from various sources: in vitro, different animal models or human data. The direct representation of liver micro-architecture in those models will open up the future perspective to feed these models with patient-specific data, hence generating a virtual twin of a patient’s liver to guide personalized diagnosis and therapy planning.


Due to the security policy at Institut Pasteur, please register beforehand if you plan to attend this meeting

Viral phylodynamic inference: from ancient evolutionary histories to contemporary outbreaks

EVENT : C3BI Seminars


Main speaker : Philippe Lemey, from KU Leuven – Department of Microbiology and Immunology
Date : 06-09-2018 at 02:00 pm
Location : Auditorium Jacques Monod – MONOD (66), Institut Pasteur, Paris


The field of computational phylodynamics has witnessed a rich development of statistical inference tools with increasing levels of sophistication that can be applied to address a variety of questions about the evolution and epidemiology of viruses. The central premise of the field is that viruses generally evolve so rapidly that epidemic processes leave an imprint in their genomes. When focusing on deep phylogenies, a rich substitution history may confound time-measured evolutionary analyses, whereas for short-term outbreaks, it may be questioned whether the imprint provides the necessary resolution for insightful evolutionary reconstructions.

Here, I will illustrate these aspects on different viral examples. The Hepatitis B virus represents an example of a deep evolutionary history that has been difficult to date accurately using sequences sampled over the last decades. Recently, ancient DNA work has resulted in the first HBV samples dating back thousands of years. Using molecular clock modeling that accommodates time-dependent evolutionary rates, I will show how recent rapid evolutionary rate estimates can be reconciled with the long-term evolutionary dynamics of the virus.

The 2013-2016 West African Ebola epidemic marked the start of real-time genomic sequencing. Using this example, I will illustrate that short-term outbreak dynamics can be investigated using viral genome sequences, but that integrating various sources of information with genomic data promises to deliver more precise insights into infectious diseases. Finally, using recent work on Lassa virus in West Africa, I will further highlight how in-field, real-time molecular epidemiology may impact outbreak responses.


Due to the security policy at Institut Pasteur, please register beforehand if you plan to attend this meeting

Signatures of ecological processes in microbial community time series

EVENT : C3BI Seminars


Main speaker : Karoline Faust, from KU Leuven
Date : 04-10-2018 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


A number of densely sampled microbial community time series are now available, in which the abundance of community members is tracked over several months through sequencing. These data allow exploring community dynamics by investigating signatures of underlying ecological processes that are present in the community time series. In this seminar, I will present our work on the exploitation of time series properties to distinguish between different ecological processes behind the observed dynamics.

  http://psbweb05.psb.ugent.be/conet/karoline/

Due to the security policy at Institut Pasteur, please register beforehand if you plan to attend this meeting

Profiling epitranscriptomic RNA modifications by Next-Generation Sequencing

EVENT : C3BI Seminars


Main speaker : Yuri Motorin, from Ingénierie Moléculaire et Physiopathologie Articulaire (IMoPA), Université de Lorraine, Nancy
Date : 14-06-2018 at 11:00 am
Location : Auditorium Jacques Monod – MONOD (66), Institut Pasteur, Paris


RNA modifications are emerging players in the field of posttranscriptional regulation of gene expression, and are attracting a degree of research interest comparable to that of DNA and histone modifications in the field of epigenetics. The true potential of only a handful out of more than 100 RNA modifications is currently emerging as the consequence of a leap in detection technology, principally associated with high-throughput sequencing. In the seminar I will outline the major developments in this field with a thorough discussion of detection principles, advantages and drawbacks of new high-throughput approaches, with particular focus on 2′-O-methylations in rRNA and tRNA.


Due to the security policy at Institut Pasteur, please register beforehand if you plan to attend this meeting

Computational microbial genomics

EVENT : C3BI Seminars


Main speaker : Zamin Iqbal, from Royal Society/Wellcome Trust Sir Henry Dale Fellow, EMBL-EBI
Date : 07-03-2019 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


TBA


Due to the security policy at Institut Pasteur, please register beforehand if you plan to attend this meeting

C3BI Courses: Introduction to Molecular Phylogenetics – Hong Kong 2018

EVENT : C3BI Training


Main speaker : Olivier Gascuel, from C3BI, Institut Pasteur (France)
Date : 22-10-2018 at 09:00 am
Location : Institut Pasteur International Network – HKU Pasteur – Hong Kong


General Information:

This introductory course aims to give the basic theoretical and practical concepts, best practices, and software necessary to start working on molecular phylogenetics and its applications to epidemiology. The course will have theoretical morning sessions followed by small-group practice for a few selected students with their own data.

Topics:

  • Introduction to phylogeny: General principles for the inference, interpretation of trees, and application to infectious diseases
  • Introduction to the math behind the trees and evolutionary models
  • Distance and parsimony methods
  • Maximum likelihood methods
  • Bayesian methods, phylodynamics
  • Branch supports, bootstrapping
  • How to select the best method and evolutionary model
  • Tree dating, reconstructing and using character evolution
  • Molecular epidemiology

Teachers:

  • Chair: Olivier Gascuel, C3BI, Institut Pasteur (France)
  • Anna Zhukova, C3BI, Institut Pasteur (France)
  • Frédéric Lemoine, C3BI, Institut Pasteur (France)
  • Hein Min Tun, School of Public Health, The University of Hong Kong
  • Julien Guglielmini, C3BI, Institut Pasteur (France)
  • Sebastian Duchene, University of Melbourne (Australia)
  • Tim Vaughan, ETH Zürich (Switzerland)
  • Tommy Lam, School of Public Health, The University of Hong Kong
  • Veronika Boskova, ETH Zürich (Switzerland)

Course dates:

Monday, October 22nd to Saturday, October 27th

Pre-requisites:

  • Basic knowledge of how to use sequence databanks
  • Basic knowledge of BLAST and multiple-alignment software
  • Basic knowledge of statistics (tests, distributions, parameter estimation)

Applications:

Open to postgraduate students, MD, DVM, postdoctoral fellows and young scientists from Hong Kong and overseas. The course fees are 500HK for the theory sessions and 1000HK for the full course. Students coming from the Institut Pasteur International Network will have the fees waived. Please fill in the following application form before August 20th, midnight (HK time). Use the link if you can’t see the embedded form: https://goo.gl/forms/rgYrUNrEz6rqgELP2