Training – Introduction to data analysis


Date : 16/01/2017 – 17/02/2017 Location : Amphithéâtre Jacques Monod – MONOD (66), Institut Pasteur, Paris


Description
Course description :
This five-week, graduate-level course gives participants basic skills and hands-on training in biostatistics. It covers every step of an analysis workflow: design, collection, curation, hypothesis testing and data mining. The course is organised around four topics, each of which must be practised in order to implement a sound analysis workflow: Unix, text file management, R and biostatistics. We will focus on the topics most useful to scientists working in a wet lab: experimental design, hypothesis testing and the production of good scientific plots.

IDAprocess

Course schedule :
The course will take place from January 16th to February 17th 2017, with two-hour lectures every day.
Two sessions are provided:
  • session 1 from 9 AM to 11 AM
  • session 2 from 11 AM to 1 PM.
Course location :
Lectures will take place in the Amphithéâtre Jacques Monod. Please note: on Fridays, lectures will take place in the Bâtiment François Jacob, room 28-01-01A. On February 10th and 17th, they will take place in the afternoon, from 1:30 PM to 3:30 PM for session 1 and from 3:30 PM to 5:30 PM for session 2.

Requirements
To follow the course, all students need to bring their own laptop. Please check that your computer meets the minimum requirements listed below:
  • PC (Windows-based): Intel i3 / Windows 7 / 4 GB RAM / 256 GB HD
  • Apple Macintosh: mid-2010 MacBook / OS X 10.10 / 4 GB RAM / 256 GB HD
Teachers will be available on Thursday January 12th from 9 to 12 in the Hub meeting room (26-06-06 A) to help you if you have doubts about the configuration of your machine or if you need help installing the necessary software.
  • On a Windows-based PC, you will need to install PuTTY: http://www.putty.org/
  • Everyone needs to install R version 3.3.2: https://cran.univ-paris1.fr/ and RStudio: https://www.rstudio.com/products/rstudio/ (RStudio Desktop, free version)

Documents
-> Programme -> Course documents -> Evaluation

Programme

ProgramNGSSessions

Introduction

NGSTraining_Mapping

Session 1: Unix

Unix

Session 2: Unix2

Day 1
Unix

Day 2
Unix

Day 3
Unix

Day 4
Unix

Day 5
Unix

Day 5 – end
Unix

Session 3 : R

R

Session 4 : Statistics

Stats

Session 5 : Experimental Design

ExperimentalDesign

Files: BransfordJohnsonExperiment_solution.pdf

Session 6 : Estimation & Tests

ExperimentalDesign

Files: Examples.R Internal.R lungA.csv Exercices.R

Session 7 : Linear Regression

RegressionLineaire

Files: TP lungA.csv lungB.csv lungC.csv

Session 8 : PCA

RegressionLineaire

Files: dat_pca_clus_counts.rda MultivariateAnalysis.Rmd

Session 9 : Clustering

RegressionLineaire

Session 10 : Final Project

RegressionLineaire

Files: IntroRmarkdown.pdf toStart.Rmd project.Rmd famuss.bib