2-Sample T-test - Vivek's Digital Garden

# 2-sample test

The independent 2-sample t-test is used to compare the means of two independent groups.
- The X variable is categorical with two independent levels
- The Y variable is numeric

Cons
- The two "independent" groups may differ in more ways than the one X variable studied.

How to deal with it
- Matching
- Random assignment
- Multivariable analysis

#### Hypothesis

$H_0$: $(\mu_A-\mu_B)=0$

$H_A$: $(\mu_A-\mu_B)\ne0$

#### t-statistic

$t_{stat}=\frac{(\bar{x}_A-\bar{x}_B)}{SE_{(\bar{x}_A-\bar{x}_B)}}$

Here $\bar{x}_A$ and $\bar{x}_B$ are the sample means, which estimate $\mu_A$ and $\mu_B$.

### Standard Error for difference in means

$SE_{(\bar{x}_A-\bar{x}_B)}$

The standard error measures how far the estimated difference in means is expected to be from the population difference in means.

To calculate the SE we must assume either
1. At the population level the SD of the two groups is roughly equal (equal variance)
	- ANOVA and linear regression assume equal variance
2. At the population level the SD of the two groups is not the same (non-equal variance)

![[Pasted image 20211213131101.png]]

Are they equal?
1. Eyeball test (look at a box plot)
2. Compare the standard deviations ($SD_A/SD_B>2$ => non-equal)
3. Formal test $H_0:\sigma_A=\sigma_B; H_A:\sigma_A\ne\sigma_B$
	1. Levene's test
	2. Bartlett's test (assumes both distributions are normal)

Properties

Standard error of a mean: $SD_{mean}=\frac{SD}{\sqrt{n}}$

Assuming non-equal variances, with $\bar{X}_1$ and $\bar{X}_2$ the two independent sample means:

$VAR(\bar{X}_1-\bar{X}_2) = VAR(\bar{X}_1) + VAR(\bar{X}_2)$

$VAR(\bar{X}_1-\bar{X}_2)=\frac{SD_1^2}{n_1}+\frac{SD_2^2}{n_2}$

$SD_{\bar{X}_1-\bar{X}_2}=\sqrt{\frac{SD_1^2}{n_1}+\frac{SD_2^2}{n_2}}$

Assuming equal variances, the two sample variances are pooled, weighted by their degrees of freedom:

$S_{pooled}^2 = \frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{(n_1-1)+(n_2-1)}$

$SD_{\bar{X}_1-\bar{X}_2}=\sqrt{\frac{S^2_{pooled}}{n_1}+ \frac{S^2_{pooled}}{n_2}}$
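
The two standard-error calculations can be sketched in plain Python as a sanity check. This is a minimal sketch, not a reference implementation: the function names (`se_unequal`, `se_pooled`, `t_stat`) and the sample data are made up for illustration. The pooled variance uses the standard denominator $(n_1-1)+(n_2-1)$, and each group's SE contribution divides the variance by $n$, not $n^2$.

```python
import math
import statistics


def se_unequal(a, b):
    """SE of the difference in means, not assuming equal variances."""
    var_a = statistics.variance(a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(b)
    return math.sqrt(var_a / len(a) + var_b / len(b))


def se_pooled(a, b):
    """SE of the difference in means, assuming equal population variances."""
    n1, n2 = len(a), len(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / ((n1 - 1) + (n2 - 1))
    return math.sqrt(pooled_var / n1 + pooled_var / n2)


def t_stat(a, b, equal_var=False):
    """t-statistic for H0: mu_A - mu_B = 0."""
    se = se_pooled(a, b) if equal_var else se_unequal(a, b)
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical data, for illustration only.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.4]

print("t (non-equal variance):", t_stat(group_a, group_b, equal_var=False))
print("t (equal variance):    ", t_stat(group_a, group_b, equal_var=True))
```

When the two groups have the same size, both formulas give the same standard error, so the two t-statistics agree. They differ only when $n_1 \ne n_2$. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=...)` should give matching t-statistics and also reports a p-value.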