**What is it?**

Last week, we touched on one popular method for comparing two independent groups (the means of two populations), namely the T-test. That section introduced some principles rather than describing a specific procedure.

To make a long story short, analysis of variance (ANOVA) is a statistical method for comparing several population means. In other words, using the language of inferential statistics, we use ANOVA to assess whether the observed differences among sample means are statistically significant. Is the variation among the sample means plausibly due to chance, or is it good evidence of a real difference among the population means?

**Basic Logic**

- The variation **between (or among)** groups / the variation **within** groups
- Since we compare variances, we call the method ‘analysis of variance’
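
To make the between/within idea concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand. The scores and the function name `f_statistic` are made up for illustration. The sketch only produces the F value and its degrees of freedom; turning F into a p-value requires the F-distribution, which statistical software normally handles for you.

```python
from statistics import mean

def f_statistic(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)                           # number of groups
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation BETWEEN groups: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation WITHIN groups: how far each score is from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between      # mean square between
    ms_within = ss_within / df_within         # mean square within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three groups (illustration only)
group_a = [4, 5, 6]
group_b = [6, 7, 8]
group_c = [8, 9, 10]

f_value, df1, df2 = f_statistic([group_a, group_b, group_c])
print(f"F = {f_value:.2f} with df = ({df1}, {df2})")  # F = 12.00 with df = (2, 6)
```

A large F means the group means are spread out far more than the scatter inside each group would lead us to expect.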

**What Should I know?**

- Again, the distribution matters. ANOVA uses the F-distribution, because its test statistic is a ratio of two variances. See the explanation in the ‘T-test and experiment’ post (https://etapdocs.sunycreate.cloud/blog/t-test-and-experiment/)
- The null hypothesis is ‘all means are equal’. Therefore, the alternative hypothesis is ‘at least one mean differs from the others’, not ‘all means differ from each other’. THIS IS IMPORTANT!
- ANOVA does not tell us where the difference comes from. We have to conduct a post-hoc analysis to find out which group differs from which.

**Where can I learn it?**

- EPSY 530 Statistics 1. **Yes, we address basic statistics, but many PhD students overlook principles.**

**A little more explanation**

*F-distribution?*

The shape of the F-distribution may look weird at first. You usually hear a lot about the ‘normal distribution’ shape when studying the Z-test or T-test (two-group comparison). However, there are many types of distributions, each with a different shape. Depending on your research topic, you may encounter the Pareto or Weibull distribution later.

Again, knowing the logic behind each distribution is crucial for understanding what is going on in statistical inference. You can read this blog post for an overview of different distributions (https://medium.com/mytake/understanding-different-types-of-distributions-you-will-encounter-as-a-data-scientist-27ea4c375eec)
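
One way to get a feel for the shape is to simulate it. The sketch below draws three groups from the same population again and again (so the null hypothesis is true by construction) and collects the F statistic each time. The group sizes, the number of repetitions, and the function names are assumptions chosen only for illustration.

```python
import random
from statistics import mean, median

def f_statistic(groups):
    """One-way ANOVA F statistic: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

random.seed(0)  # fixed seed so the simulation is repeatable

def simulate_null_f(k=3, n=10):
    """Draw k groups of size n from the SAME normal population, return F."""
    groups = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]
    return f_statistic(groups)

simulated = [simulate_null_f() for _ in range(2000)]

print("smallest F:", min(simulated))   # never negative: F is a ratio of variances
print("median F:  ", median(simulated))
print("mean F:    ", mean(simulated))  # pulled above the median by the long right tail
```

The simulated values pile up near the low end and stretch out to the right, which is why the F-distribution does not look like the symmetric bell shape of the normal distribution.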

*Post-hoc analysis?*

It means comparing the groups included in the ANOVA two at a time. For example, if there are three groups (A, B, C), you can conduct three pairwise comparisons: A vs B, B vs C, and A vs C. In general, N groups allow N(N-1)/2 comparisons. The process of each comparison is the same as the T-test. ANOVA lets you know whether there is a statistically significant difference among the groups you compare. Post-hoc analysis reveals **‘where’** the difference exists.
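
The pairwise logic can be sketched as follows. The scores are made up, and `pooled_t` is a hypothetical helper that computes the ordinary two-sample t statistic with a pooled variance. Note that in practice, post-hoc procedures usually also adjust for the fact that several comparisons are made at once (for example, Tukey or Bonferroni); that adjustment is left out here to keep the sketch short.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for three groups (illustration only)
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances (pooled)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

# Every unordered pair of group names: N groups give N(N-1)/2 pairs
pairs = list(combinations(groups, 2))
n = len(groups)
print(f"{n} groups -> {len(pairs)} pairwise comparisons")

t_values = {}
for first, second in pairs:
    t_values[(first, second)] = pooled_t(groups[first], groups[second])
    print(f"{first} vs {second}: t = {t_values[(first, second)]:.3f}")
```

Each t value would then be checked against the T-distribution, just as in a regular two-group comparison.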

**Recommendation: Moore, D. S., McCabe, G. P., & Craig, B. A. (2016). Introduction to the practice of statistics (Ninth Edition). W. H. Freeman.**

Written by YangHyun Kim (ykim39@albany.edu)