1 Before we begin
1.1 Learning objectives
The one thing you should take away from the Workshop is to plan the analysis along with the methodological details of your experiment, rather than after gathering the data you want to analyse.
This document should help you do the following:
- Get a basic introduction to the idea of probability distributions, extreme values and the random chance of observing outliers in the normal distribution
- Appreciate how statistical comparisons between groups rely on \(signal/noise\) ratios such as the \(z\), \(t\) and \(F\) ratios
- Learn how to design experiments, e.g. block designs, that reduce \(noise\)
- Understand the idea behind linear models and mixed effects analysis which are powerful even when there is high within-group variance (experiment-to-experiment variance)
- Know the difference between technical and statistically/biologically independent replicates
- Get started with R/RStudio using resources listed in Chapter 2 and Chapter 11.
- Perform exploratory plots of data using the ggplot2 package [1].
- Compare two groups with Student’s t tests in Chapter 4.
- Compare more than two groups with ANOVA using linear mixed-effects models (lme4 and lmerTest) [2,3] (in Chapter 5, Chapter 6, Chapter 7).
- Use ggResidpanel for model diagnostics, such as plotting residuals [4].
- Use emmeans for post-hoc comparisons between groups as in Chapter 5, Chapter 6, Chapter 7.
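To make the \(signal/noise\) idea concrete, here is a minimal sketch in base R. The numbers are simulated and the variable names are made up for illustration; they are not data from the Workshop. The sketch computes Student's \(t\) by hand as the difference between group means (signal) divided by the pooled standard error of that difference (noise), and checks the result against R's built-in t.test().

```r
# Simulated measurements for two groups (hypothetical values, illustration only)
set.seed(1)
control   <- rnorm(5, mean = 10, sd = 2)
treatment <- rnorm(5, mean = 13, sd = 2)

n1 <- length(control)
n2 <- length(treatment)

# Signal: the difference between the two group means
signal <- mean(treatment) - mean(control)

# Noise: the standard error of that difference, using the pooled variance
# (this is the equal-variance assumption of Student's t test)
pooled_var <- ((n1 - 1) * var(control) + (n2 - 1) * var(treatment)) / (n1 + n2 - 2)
noise <- sqrt(pooled_var * (1 / n1 + 1 / n2))

# The t ratio is simply signal divided by noise
t_ratio <- signal / noise

# The same statistic from the built-in Student's t test
t_builtin <- unname(t.test(treatment, control, var.equal = TRUE)$statistic)

# TRUE if the hand calculation matches the built-in result
all.equal(t_ratio, t_builtin)
```

Anything that shrinks the noise term (less variable measurements or more independent replicates) makes the ratio larger for the same difference between means, which is why experimental design matters as much as the test itself.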
1.2 Your role
You should focus on the analysis itself. Become more familiar with R/RStudio by using it often in ‘dummy’ analyses and exercises, so that you are comfortable with the software and it does not get in the way: software is only one aspect of the analysis. Try not to get distracted or put off by the learning curve for R. Think more about which test to use and why, and whether the assumptions about data distributions are met.
In addition to this document, many online resources describe how to do a particular test. But you must know which test to use based on how you designed your experiments and gathered data/made measurements. Chapters 4-7 provide R commands (formula & syntax) to execute common tests. In almost all cases, your data table should be in long format (see the Appendix). Remember that the result of a test does not tell you whether it is the appropriate one for your data!
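As a rough illustration of what long format means, the sketch below rearranges a small made-up table in base R. The column names (experiment, control, treated, group, value) and the numbers are hypothetical. In the wide table each group has its own column; in the long table each row holds one measurement, with separate columns saying which experiment and which group it belongs to.

```r
# A hypothetical wide table: one row per experiment, one column per group
wide <- data.frame(
  experiment = c("exp1", "exp2", "exp3"),
  control    = c(10.1, 11.4, 9.8),
  treated    = c(12.9, 14.2, 12.5)
)

# The same data in long format: one row per measurement
long <- data.frame(
  experiment = rep(wide$experiment, times = 2),                  # which experiment
  group      = rep(c("control", "treated"), each = nrow(wide)),  # which group
  value      = c(wide$control, wide$treated)                     # the measurement
)

long
```

Packages such as tidyr offer functions like pivot_longer() for the same rearrangement on larger tables; the manual version above is only meant to show the shape of the result.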
1.3 Help from experts
I am neither a mathematician nor a statistician! Refer to the excellent ICL Statistics Advisory Service (SAS). Remember to contact them BEFORE doing experiments!