Introduction to data analysis 2018-19

EVENT : C3BI Training


Main speaker : C3BI Team
Autumn session:
Date : 19-10-2018 at 09:00 am
Location : Retrovirus room – LWOFF (14), Institut Pasteur, Paris
Winter session:
Date : 11-01-2019 at 09:00 am
Location : BFJ 28-01-01A, Institut Pasteur, Paris


This course is intended for first-year PhD students at the Institut Pasteur, who are registered automatically upon joining the institute. Depending on availability, second- and third-year PhD students and postdocs may also apply. First-year PhD students with a background in mathematics or physics may request an exemption.

The course closely combines theory and practice. It lasts four weeks, four days a week, with one three-hour lecture per day. We organize two sessions, the first starting October 19th, 2018 and the second starting January 11th, 2019. Each session starts with an Introduction to Computer Science to ensure that all students are familiar with essential computer science notions such as computer architecture, file system organization, file formats and programming languages. Following the statistics classes, an optional introduction to Image Analysis and Processing (2 lectures) will be offered by the Image Analysis Hub.
Description
Introduction to Computer Science module : This one-lecture module will provide students with essential computer science notions such as computer architecture, file system organization, file formats and programming languages. At the end of this lecture, there will be time for questions about how to configure students’ personal laptops for the Data and Image Analysis modules.
Data analysis module : The course covers a broad range of concepts needed for experiment design, data exploration and analysis, interpreting results and generating figures for publications. It will provide fundamental knowledge in statistics, including univariate and multivariate descriptive analyses, common probability distributions and their application in biology, estimation, sampling and hypothesis testing. R and RStudio will be used for practice. Students are expected to install these tools before the beginning of the course: installation instructions are provided in the first part of the R course material.
Introduction to Image analysis module : This optional two-lecture module will introduce the basic principles of image analysis, that is, how to extract quantitative information from microscopy images. The course is designed for people who have little or no experience in the field. It is oriented towards practical use: short lectures will be followed by hands-on sessions and tutorials. It should help both experienced microscopists and beginners who have never had any formal training in image quantification.

Schedule
The detailed program of each session is available online: fall 2018, winter 2019.

Requirements
To follow the course, all students need to bring a laptop with R installed. Please check that your computer meets the minimum requirements listed below.
  • PC – Windows based : Intel i3 / Windows 7 / 4 GB RAM / 256 GB HD
  • Apple Macintosh : mid-2010 MacBook / OS X 10.10 / 4 GB RAM / 256 GB HD
  • PC – Linux based : Intel i3 / Any distribution (supporting R >= 3.5.1, if possible) / 4 GB RAM / 256 GB HD
Instructions for installing R are provided at the beginning of the R course material. During the week before the course, students may have their laptop checked by the C3BI teaching team if necessary.

Application/Exemption

The form below must be filled out either to request an exemption or to apply to the course.

  • Exemptions will be granted to students already trained in biostatistics (attach a CV and a letter from your supervisor).
  • Second- and third-year PhD students, as well as postdocs working at Pasteur Paris, may also apply.
Documents
    • Introduction – Computer science 101 – slides in pdf
    • Lectures 1 and 2 – First steps with R and RStudio – online slides (Read the beginning and install R before coming.) – full course archive, including exercise data
    • Lecture 3 – Random Variables – slides – R code
    • Lecture 4 – Estimation – slides – R code – data
    • Lectures 5-6 – Confidence intervals & Hypothesis testing – Slides and Exercises
    • Practical Session 1 – Exercises – Supplementary data – Answers
    • Lectures 7-9 – Introduction to statistical modelling – Slides – data – R code – ANOVA code
    • Practical Session 2 – Archive – Correction
    • Lecture 10 – Principal Component Analysis – Slides – data – R code
    • Lecture 11 – Clustering – Slides – R code
    • Lecture 12
      • Experimental design and statistical power – Slides and R data and script
      • Introduction to R Markdown – Slides and R data and script
    • Practical Session 3 – Archive
    • Image Analysis module – Course material