Statistical inference
Formal statistical inference is widely used in the social sciences and elsewhere as a method for testing conjectures about patterns in data. Earlier experience of exploratory data analysis is thought to help students understand formal statistical inference better.
Two components
Data can be thought of as comprising two components (each of which could be further subdivided):
- Component 1 might be described, amongst other things, as the signal, the main effect or the explained variation.
- Component 2 is then described, correspondingly, as the noise, the error (or deviation), or the residual.
Formal statistical inference is made up of a (very large) set of tools and methods for deciding whether the second component is sufficiently small, relative to the first, to justify the hypothesis that the first component is in some sense real and not just a result of the vagaries of chance.
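As a rough illustration of the two components, the sketch below splits some invented test scores into group means (the signal) and residuals (the noise), then uses a permutation test to estimate how often chance alone would produce a difference in means at least as large as the one observed. The data, the function names `decompose` and `permutation_p_value`, and the choice of a permutation test are all assumptions made for this example; it is only one of the many tools that formal inference offers.

```python
import random
import statistics

def decompose(groups):
    """Split each observation into a group mean (signal) and a residual (noise)."""
    fitted = {name: statistics.mean(values) for name, values in groups.items()}
    residuals = {
        name: [v - fitted[name] for v in values] for name, values in groups.items()
    }
    return fitted, residuals

def permutation_p_value(a, b, n_shuffles=5000, seed=0):
    """Share of random relabellings whose mean difference is at least the observed one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        first, second = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(first) - statistics.mean(second)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical scores for two teaching groups (invented for illustration).
groups = {
    "group_a": [52, 55, 49, 60, 58, 54],
    "group_b": [61, 66, 59, 70, 64, 63],
}

fitted, residuals = decompose(groups)
p_value = permutation_p_value(groups["group_a"], groups["group_b"])

print("signal (group means):", fitted)
print("noise (residuals):", residuals)
print("estimated p-value:", p_value)
```

A small p-value here means that the difference between the group means is large compared with the scatter of the residuals, which is the sense in which the first component is judged to be "real".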
As well as a strong grasp of the ideas of randomness, hypothesis testing and the computation of confidence intervals, the use of formal statistical inference requires a good appreciation of statistical modelling, probability distributions, sampling and the Law of Large Numbers.
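The Law of Large Numbers can be made concrete with a short simulation. The sketch below, which assumes a fair six-sided die and uses a hypothetical helper `mean_of_die_rolls`, shows the sample mean settling towards the expected value of 3.5 as the number of rolls grows.

```python
import random

def mean_of_die_rolls(n_rolls, rng):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls

rng = random.Random(42)  # fixed seed so the run is repeatable
sample_sizes = [10, 100, 1_000, 10_000, 100_000]
sample_means = {n: mean_of_die_rolls(n, rng) for n in sample_sizes}

for n, m in sample_means.items():
    print(f"{n:>7} rolls: mean = {m:.3f} (expected value 3.5)")
```

Small samples can stray some way from 3.5, while the largest samples should lie very close to it.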
Early encounters
Students’ early encounters with formal statistical inference typically begin with reinforcing representations of location (with most emphasis on the mean), dispersion (standard deviation) and covariation (scattergraphs). The focus on representations is likely to continue through the introduction of various types of probability distribution such as binomial, uniform, Poisson and, most importantly, the Normal distribution.
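The summary measures named above can be computed in a few lines. The sketch below uses invented height and weight data and a hypothetical helper `sample_covariance`; it reports the mean and sample standard deviation of one variable, the sample covariance of the two, and the proportion of a Normal distribution lying within one standard deviation of its mean.

```python
import statistics

# Hypothetical paired measurements (invented for illustration).
heights_cm = [150, 160, 165, 170, 180]
weights_kg = [50, 56, 61, 68, 75]

def sample_covariance(xs, ys):
    """Sample covariance with the usual n - 1 divisor."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

mean_height = statistics.mean(heights_cm)   # location
sd_height = statistics.stdev(heights_cm)    # dispersion (sample standard deviation)
cov_height_weight = sample_covariance(heights_cm, weights_kg)  # covariation

# Proportion of a Normal distribution within one standard deviation of the mean.
standard_normal = statistics.NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print("mean height:", mean_height)
print("standard deviation of height:", round(sd_height, 2))
print("covariance of height and weight:", cov_height_weight)
print("Normal proportion within one SD:", round(within_one_sd, 4))
```

A positive covariance corresponds to the upward-sloping cloud of points that a scattergraph of these data would show.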
Finally, more formal methods for measuring correlation are likely to be taught before hypothesis testing and confidence intervals are introduced. The difficulties students have in understanding formal statistical inference suggest that prevailing pedagogic methods may be failing. An emphasis on exploratory data analysis and informal inference at a younger age (pre-16) might result in a more robust understanding of formal inference, but there is no research evidence to confirm this.