List of Frequently Asked Questions in Statistics
Analyzing data in biological sciences often involves statistical tests which notably return a p-value.
When the p-value is at or below 0.05 (p ≤ 0.05), the observed data would be unlikely if the tested null hypothesis were true, so you may reject it. Using this threshold means accepting a 5% risk of rejecting a null hypothesis that is in fact true (the risk α, or type I error).
Conversely, when the p-value is higher than the chosen threshold (say p > 0.05), all you can conclude is that your data do not show any significant difference. This does not mean that no difference exists; it may well be that your test is not powerful enough to detect it (see the "to go further" section below).
A p-value above 0.05 does not mean there is no difference. Without an estimate of the probability of missing a genuine difference, i.e. the risk β or type II error, all you can conclude is: "we cannot show a significant difference between parameters." To estimate the risk β, one may use statistical software such as G*Power (http://www.gpower.hhu.de/en.html).
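To make the risk β concrete, here is a minimal sketch that approximates the power (1 − β) of a two-sided comparison of two group means. It relies on a normal approximation and is not the method G*Power uses internally; the function name `approx_power_two_sample`, the standardized effect size of 0.5 and the group sizes are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Assumption: normal approximation with equal group sizes.
    effect_size is the standardized difference between means (Cohen's d).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical scenario: a medium effect (d = 0.5) with growing group sizes.
for n in (10, 50, 200):
    power = approx_power_two_sample(effect_size=0.5, n_per_group=n)
    print(f"n = {n:>3} per group: power = {power:.2f}, beta = {1 - power:.2f}")
```

With small groups, β is large, which is why a non-significant result on its own says little about the absence of a difference.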
Getting a p-value lower than 0.05 is not necessarily the point you want to make. The size of the effect matters crucially. For example, in a large sample (say, n > 100,000), p < 0.05 may correspond to very subtle differences that are of little practical relevance. Conversely, when p > 0.05 in such a large sample, one may reasonably think that any differences the test failed to detect are likely to be very weak.
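The interplay between sample size and effect size can be illustrated with a short sketch. It assumes a two-sample z-test with equal group sizes and an observed standardized difference (Cohen's d) of 0.02; both the function name `two_sided_p_value` and the numbers are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test.

    Assumption: equal group sizes, and effect_size is the observed
    standardized difference between the two group means.
    """
    z = effect_size * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


tiny_effect = 0.02  # a difference of 2% of one standard deviation

p_small_sample = two_sided_p_value(tiny_effect, n_per_group=100)
p_large_sample = two_sided_p_value(tiny_effect, n_per_group=100_000)

print(f"n = 100 per group:     p = {p_small_sample:.3g}")
print(f"n = 100,000 per group: p = {p_large_sample:.3g}")
```

The same tiny effect is far from significant with 100 observations per group and clearly significant with 100,000, although its practical relevance has not changed.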
It is common for a biologist to perform numerous tests simultaneously, e.g., comparing the expression of thousands of genes and/or comparing multiple experimental groups. Keep in mind that if you perform, for instance, 100 tests in which no genuine difference exists, you expect about 5 of them to give p < 0.05 by chance alone. Therefore the general answer is "Yes, you need to account for multiple testing".
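The arithmetic behind this warning can be sketched as follows. The calculation assumes that all null hypotheses are true and, for the second function, that the tests are independent; the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests with p < alpha when every null is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives.

    Assumption: the tests are independent and every null is true.
    """
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 10, 100):
    expected = expected_false_positives(n_tests)
    at_least_one = prob_at_least_one_false_positive(n_tests)
    print(f"{n_tests:>3} tests: expected false positives = {expected:.2f}, "
          f"P(at least one) = {at_least_one:.3f}")
```

With 100 tests, at least one false positive is almost certain under these assumptions.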
Depending on the scope of your analysis, you might want to be more or less conservative in your conclusions (i.e., detect more or fewer significant differences). A distinction is drawn between two different cases:
Case 1: imagine you want to make a list of potential candidate markers. Then, as a selection criterion, you could pick the k lowest raw p-values and use the adjusted p-values as additional information.
Case 2: if the aim of your study is to provide a reliable shortlist of biomarkers, you might want to be conservative and base your conclusions on the adjusted p-values. Doing so reduces the number of false positives among the multiple tests.
Choosing a suitable adjustment method depends on the type of data as well as on the number of false discoveries you are ready to accept; see Figure 1.
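As an illustration of what adjusted p-values look like, the sketch below implements two widely used corrections: Bonferroni (conservative, controls the chance of any false positive) and Benjamini-Hochberg (less conservative, controls the false discovery rate). These two are chosen here as common examples and are not necessarily the methods shown in Figure 1; the raw p-values are made up for the demonstration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from five tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.27]

for p, p_bonf, p_bh in zip(raw, bonferroni(raw), benjamini_hochberg(raw)):
    print(f"raw = {p:.3f}  Bonferroni = {p_bonf:.3f}  BH = {p_bh:.4f}")
```

In this made-up example, four raw p-values fall below 0.05, but only two remain below 0.05 after either adjustment.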