
Analysis of Covariance
Linear Model
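For reference, the one-factor, one-covariate model is conventionally written as below. This is the standard textbook form and not necessarily the notation used in this course; every symbol in it is an assumption introduced here.

```latex
% Standard one-way ANCOVA model (notation assumed, not taken from the course notes)
Y_{ij} = \mu + \alpha_j + \beta\,(X_{ij} - \bar{X}_{..}) + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here Y_ij is the response of subject i in group j, X_ij is that subject's covariate score, alpha_j is the group effect adjusted for the covariate, and beta is the common within-group slope.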

Hypotheses
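A sketch of the usual null and alternative hypotheses for a one-factor ANCOVA with k groups, stated in assumed notation where alpha_j is the effect of group j after adjusting for the covariate:

```latex
% Hypotheses on the adjusted group effects (assumed notation, k groups)
H_0:\ \alpha_1 = \alpha_2 = \dots = \alpha_k = 0
\quad\text{versus}\quad
H_1:\ \alpha_j \neq 0 \ \text{for at least one } j
```

Equivalently, the null hypothesis says the group means are equal once each has been adjusted to the same covariate value.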

Assumptions

Selecting a Covariate
Schematic with Example Data
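| a1 | | a2 | | a3 | | a4 | |
| Y | X | Y | X | Y | X | Y | X |
| 3 | 42 | 4 | 47 | 7 | 61 | 7 | 65 |
| 6 | 57 | 5 | 49 | 8 | 65 | 8 | 74 |
| 3 | 33 | 4 | 42 | 7 | 64 | 9 | 80 |
| 3 | 47 | 3 | 41 | 6 | 56 | 8 | 73 |
| 1 | 32 | 2 | 38 | 5 | 52 | 10 | 85 |
| 2 | 35 | 3 | 43 | 6 | 58 | 10 | 82 |
| 2 | 33 | 4 | 48 | 5 | 53 | 9 | 78 |
| 2 | 39 | 3 | 45 | 6 | 54 | 11 | 89 |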
ANCOVA Summary Table
| | Source | SS | df | MS | F | Error Term |
| 1 | Covariate | 33.950 | 1 | 33.950 | 130.09 | [3] |
| 2 | A | 1.793 | 3 | 0.598 | 2.29 | [3] |
| 3 | Error | 7.047 | 27 | 0.261 | | |
| | Adj Total | 8.840 | 30 | | | |
| | Grand Total | 235.500 | 31 | | | |
ANOVA Summary Table (same data, covariate ignored)
| Source | SS | df | MS | F |
| A | 194.5 | 3 | 64.833 | 44.28 |
| Error | 41.0 | 28 | 1.464 | |
| Total | 235.5 | 31 | | |
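The entries in both summary tables follow from MS = SS/df and F = MS/MS(error). A minimal sketch that recomputes them with Stata's display calculator; the scalar names are hypothetical and the sums of squares are copied from the tables above.

```stata
* ANCOVA table: both F-ratios use the adjusted error term (row 3)
scalar ms_err_ancova = 7.047/27
scalar f_cov = (33.950/1)/ms_err_ancova
scalar f_a   = (1.793/3)/ms_err_ancova

* ANOVA table: same data, covariate ignored
scalar ms_err_anova = 41.0/28
scalar f_a_anova = (194.5/3)/ms_err_anova

display "ANCOVA  MS error = " %6.3f ms_err_ancova
display "ANCOVA  F covariate = " %7.2f f_cov "   F for A = " %5.2f f_a
display "ANOVA   MS error = " %6.3f ms_err_anova "   F for A = " %6.2f f_a_anova
```

The covariate shrinks the error mean square from about 1.46 to about 0.26, yet the F for A drops from about 44 to about 2.3, because most of the between-group variation in y is accounted for by group differences in x.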
Comparing ANCOVA with Randomized Block Designs
Some Stata Tricks
| One Factor Design with one Covariate: | |
| anova y a | analysis of variance |
| anova y x a | analysis of covariance |
| anova y x a x*a | tests homogeneity of slopes |
| Two Factor Design with One Covariate: | |
| anova y a b a*b | analysis of variance |
| anova y x a b a*b | analysis of covariance |
| anova y x a b a*b x*a*b | tests homogeneity of slopes |
| One Factor Design with Two Covariates: | |
| anova y a | analysis of variance |
| anova y x z a | analysis of covariance |
| anova y x a x*a | homogeneity of x slopes |
| anova y z a z*a | homogeneity of z slopes |
| Two Factor Design with Two Covariates: | |
| anova y a b a*b | analysis of variance |
| anova y x z a b a*b | analysis of covariance |
| anova y x a b a*b x*a*b | homogeneity of x slopes |
| anova y z a b a*b z*a*b | homogeneity of z slopes |
| Note: Don't forget the cont() option in the ancova, e.g. anova y x a, cont(x) | |
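The commands in this table use the older anova syntax, in which cont() marks the covariate and * denotes an interaction. In Stata 11 and later the usual way to write the same models is factor-variable notation, with a c. prefix on the covariate and # for the interaction. A minimal sketch for the one-factor, one-covariate case, assuming variables named y, x and a as in the table:

```stata
* Assumes a dataset with response y, covariate x and factor a is in memory.
anova y a              /* analysis of variance                 */
anova y a c.x          /* analysis of covariance               */
anova y a c.x a#c.x    /* tests homogeneity of slopes          */
```

The c. prefix plays the role of the cont() option, so there is no separate option to forget.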
Stata Example
input x y a x1 x2 x3
42 3 1 1 1 1
57 6 1 1 1 1
33 3 1 1 1 1
47 3 1 1 1 1
32 1 1 1 1 1
35 2 1 1 1 1
33 2 1 1 1 1
39 2 1 1 1 1
47 4 2 -1 1 1
49 5 2 -1 1 1
42 4 2 -1 1 1
41 3 2 -1 1 1
38 2 2 -1 1 1
43 3 2 -1 1 1
48 4 2 -1 1 1
45 3 2 -1 1 1
61 7 3 0 -2 1
65 8 3 0 -2 1
64 7 3 0 -2 1
56 6 3 0 -2 1
52 5 3 0 -2 1
58 6 3 0 -2 1
53 5 3 0 -2 1
54 6 3 0 -2 1
65 7 4 0 0 -3
74 8 4 0 0 -3
80 9 4 0 0 -3
73 8 4 0 0 -3
85 10 4 0 0 -3
82 10 4 0 0 -3
78 9 4 0 0 -3
89 11 4 0 0 -3
end
anova y a x, cont(x)
Number of obs = 32 R-squared = 0.9701
Root MSE = .510876 Adj R-squared = 0.9656
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 228.453154 4 57.1132885 218.83 0.0000
|
a | 1.79283521 3 .597611737 2.29 0.1010
x | 33.9531542 1 33.9531542 130.09 0.0000
|
Residual | 7.04684582 27 .26099429
-----------+----------------------------------------------------
Total | 235.50 31 7.59677419
adjust x, by(a) gen(adjy)
-------------------------------------------------------------------------------
Dependent variable: y Command: anova
Created variable: adjy
Covariate set to mean: x = 55
-------------------------------------------------------------------------------
----------+-----------
a | xb
----------+-----------
1 | 5.31013
2 | 5.32566
3 | 5.76735
4 | 5.09686
----------+-----------
Key: xb = Linear Prediction
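The adjusted means in this table are the group means of y moved along the pooled within-group slope to the grand mean of x: adjusted mean = ybar_j - b_w*(xbar_j - xbar). A minimal sketch that reproduces them by hand, assuming the 32 observations from the input block are in memory; the scalar names are hypothetical.

```stata
* Pooled within-group slope: coefficient on x with the group contrasts in the model
quietly regress y x x1 x2 x3
scalar bw = _b[x]

* Grand mean of the covariate (55 in these data)
quietly summarize x
scalar xgrand = r(mean)

* Adjust each group mean of y to the grand mean of x
forvalues j = 1/4 {
    quietly summarize y if a == `j'
    scalar ybar = r(mean)
    quietly summarize x if a == `j'
    scalar adj`j' = ybar - bw*(r(mean) - xgrand)
    display "a = `j'   adjusted mean = " %8.5f adj`j'
}
```

Group 4, for example, has the highest raw mean of y (9.0) but also the highest mean of x (78.25), so its adjusted mean falls to the lowest of the four.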
/* fhcomp & tukeyhsd require an extra step */
quietly anova adjy a
fhcomp a, nu(27) mse(.26099429) /* mse is from the original ancova */
Fisher-Hayter pairwise comparisons for variable a
studentized range critical value(.05, 3, 27) = 3.5065705
mean critical
grp vs grp group means dif dif
-------------------------------------------------------
1 vs 2 5.3101 5.3257 0.0155 0.6334
1 vs 3 5.3101 5.7674 0.4572 0.6334
1 vs 4 5.3101 5.0969 0.2133 0.6334
2 vs 3 5.3257 5.7674 0.4417 0.6334
2 vs 4 5.3257 5.0969 0.2288 0.6334
3 vs 4 5.7674 5.0969 0.6705* 0.6334
tukeyhsd a, nu(27) mse(.26099429) /* mse is from the original ancova */
Tukey HSD pairwise comparisons for variable a
studentized range critical value(.05, 4, 27) = 3.8701974
uses harmonic mean sample size = 8.000
mean critical
grp vs grp group means dif dif
-------------------------------------------------------
1 vs 2 5.3101 5.3257 0.0155 0.6990
1 vs 3 5.3101 5.7674 0.4572 0.6990
1 vs 4 5.3101 5.0969 0.2133 0.6990
2 vs 3 5.3257 5.7674 0.4417 0.6990
2 vs 4 5.3257 5.0969 0.2288 0.6990
3 vs 4 5.7674 5.0969 0.6705 0.6990
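The critical differences in the two tables can be checked by hand: each is the studentized range critical value times sqrt(MSE/n), with n = 8 per group and the MSE taken from the ancova. A minimal sketch using the values printed above; the scalar names are hypothetical.

```stata
* Standard error of a group mean, using the ancova error mean square
scalar se_mean = sqrt(.26099429/8)

* Fisher-Hayter uses q for k-1 = 3 means; Tukey HSD uses q for k = 4 means
scalar cd_fh    = 3.5065705*se_mean
scalar cd_tukey = 3.8701974*se_mean

display "Fisher-Hayter critical dif = " %6.4f cd_fh
display "Tukey HSD critical dif     = " %6.4f cd_tukey
```

The 3 vs 4 difference of 0.6705 falls between the two critical values, which is why it is starred in the Fisher-Hayter table but not in the Tukey table.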
/* test for homogeneity of regression slopes */
anova y a x a*x, cont(x)
Number of obs = 32 R-squared = 0.9719
Root MSE = .525009 Adj R-squared = 0.9637
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 228.884782 7 32.6978259 118.63 0.0000
|
a | .355072259 3 .11835742 0.43 0.7338
x | 25.8488494 1 25.8488494 93.78 0.0000
a*x | .431627333 3 .143875778 0.52 0.6713
|
Residual | 6.61521849 24 .275634104
-----------+----------------------------------------------------
Total | 235.50 31 7.59677419
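The a*x line is a comparison of two models: the ancova with a common slope (residual SS 7.04684582 on 27 df) and the model with a separate slope for each group (residual SS 6.61521849 on 24 df). A minimal sketch of that comparison, with hypothetical scalar names and the sums of squares copied from the two anova outputs:

```stata
* Drop in residual SS from allowing separate slopes, on 27 - 24 = 3 df
scalar ss_slopes = 7.04684582 - 6.61521849
scalar f_slopes  = (ss_slopes/3)/(6.61521849/24)

display "SS for a*x = " %9.6f ss_slopes
display "F(3, 24)   = " %5.2f f_slopes
display "Prob > F   = " %6.4f Ftail(3, 24, f_slopes)
```

A small, nonsignificant F here is what justifies fitting a single common slope in the ancova.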
Stata Example Continued
regress y x x1 x2 x3
Source | SS df MS Number of obs = 32
---------+------------------------------ F( 4, 27) = 218.83
Model | 228.453154 4 57.1132885 Prob > F = 0.0000
Residual | 7.04684582 27 .26099429 R-squared = 0.9701
---------+------------------------------ Adj R-squared = 0.9656
Total | 235.50 31 7.59677419 Root MSE = .51088
[remainder of output omitted]
regress y x
Source | SS df MS Number of obs = 32
---------+------------------------------ F( 1, 30) = 769.24
Model | 226.660319 1 226.660319 Prob > F = 0.0000
Residual | 8.83968103 30 .294656034 R-squared = 0.9625
---------+------------------------------ Adj R-squared = 0.9612
Total | 235.50 31 7.59677419 Root MSE = .54282
[remainder of output omitted]
regress y x1 x2 x3
Source | SS df MS Number of obs = 32
---------+------------------------------ F( 3, 28) = 44.28
Model | 194.50 3 64.8333333 Prob > F = 0.0000
Residual | 41.00 28 1.46428571 R-squared = 0.8259
---------+------------------------------ Adj R-squared = 0.8072
Total | 235.50 31 7.59677419 Root MSE = 1.2101
[remainder of output omitted]
Regression Results Summarized
Model: M0 (regress y x x1 x2 x3) R-square 0.9701
Model: M1 (regress y x) R-square 0.9625
Model: M2 (regress y x1 x2 x3) R-square 0.8259
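F-ratios Using Regression

Covariate, comparing the full model M0 with M2 (group contrasts only):

$$F = \frac{(R^2_{M0} - R^2_{M2})/1}{(1 - R^2_{M0})/27} \approx 130.09$$

with 1 and 27 degrees of freedom

Factor A, comparing the full model M0 with M1 (covariate only):

$$F = \frac{(R^2_{M0} - R^2_{M1})/3}{(1 - R^2_{M0})/27} \approx 2.29$$

with 3 and 27 degrees of freedom

Both values are computed from the unrounded R-squares and agree with the anova output.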
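The two F-ratios for this section can be recomputed from the R-squares of M0 (y on x x1 x2 x3), M1 (y on x) and M2 (y on x1 x2 x3). A minimal sketch, with hypothetical scalar names, that rebuilds each R-square as Model SS / Total SS from the regress output so that rounding does not distort the result:

```stata
* R-squares from the three regressions (Model SS / Total SS)
scalar r2_m0 = 228.453154/235.50
scalar r2_m1 = 226.660319/235.50
scalar r2_m2 = 194.50/235.50

* Covariate: M0 versus M2, 1 and 27 df
scalar f_cov = ((r2_m0 - r2_m2)/1)/((1 - r2_m0)/27)
* Factor A: M0 versus M1, 3 and 27 df
scalar f_a   = ((r2_m0 - r2_m1)/3)/((1 - r2_m0)/27)

display "F covariate (1, 27) = " %7.2f f_cov
display "F for A     (3, 27) = " %7.2f f_a
```

Using the four-decimal R-squares from the summary instead gives roughly 130.2 for the covariate, so the small discrepancy from 130.09 is rounding only.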
ANCOVA Using Regression Residuals
regress y x
Source | SS df MS Number of obs = 32
---------+------------------------------ F( 1, 30) = 769.24
Model | 226.660319 1 226.660319 Prob > F = 0.0000
Residual | 8.83968103 30 .294656034 R-squared = 0.9625
---------+------------------------------ Adj R-squared = 0.9612
Total | 235.50 31 7.59677419 Root MSE = .54282
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x | .1642466 .005922 27.735 0.000 .1521523 .1763409
_cons | -3.658563 .3395497 -10.775 0.000 -4.352016 -2.96511
------------------------------------------------------------------------------
predict resy, resid
anova resy a
Number of obs = 32 R-squared = 0.2010
Root MSE = .502235 Adj R-squared = 0.1154
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 1.77695557 3 .592318522 2.35 0.0940
|
a | 1.77695557 3 .592318522 2.35 0.0940
|
Residual | 7.06272558 28 .252240199
-----------+----------------------------------------------------
Total | 8.83968115 31 .285151005
Manual Adjustment for ANCOVA Using Residuals
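The residual analysis gives F = 2.35 on 3 and 28 df, while the ancova gave 2.29 on 3 and 27 df. The sketch below shows one adjustment that can be made by hand; it is an assumption about what this section covered, not a reconstruction of it. The idea is to charge the error term one degree of freedom for the slope estimated in regress y x. Scalar names are hypothetical and the sums of squares are copied from the anova resy a output.

```stata
* Between-groups mean square of the residuals
scalar ms_a = 1.77695557/3

* F as printed by anova resy a: error df = 28
scalar f_raw = ms_a/(7.06272558/28)

* F with the error df reduced by one for the estimated slope: error df = 27
scalar f_adj = ms_a/(7.06272558/27)

display "unadjusted F(3, 28) = " %5.2f f_raw
display "adjusted   F(3, 27) = " %5.2f f_adj
display "Prob > F            = " %6.4f Ftail(3, 27, f_adj)
```

The adjusted ratio still does not equal the ancova F exactly, because the residuals were formed with the total regression slope (.1642) rather than the pooled within-group slope that the ancova estimates.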
Example with Two Covariates
input id y c1 c2 grp
1 6 1 6 1
2 9 1 7 1
3 8 2 15 1
4 8 3 13 1
5 12 3 18 1
6 12 4 9 1
7 10 4 16 1
8 8 5 10 1
9 12 5 16 1
10 13 6 18 1
11 13 4 12 2
12 16 4 12 2
13 15 5 17 2
14 16 6 9 2
15 19 6 20 2
16 17 8 18 2
17 19 8 16 2
18 23 9 20 2
19 19 10 10 2
20 22 10 17 2
21 20 7 8 3
22 22 7 14 3
23 24 9 11 3
24 26 9 11 3
25 24 10 16 3
26 25 11 20 3
27 28 11 19 3
28 27 12 19 3
29 29 13 12 3
30 26 13 16 3
31 27 7 16 4
32 28 8 10 4
33 25 8 13 4
34 27 9 7 4
35 31 9 15 4
36 29 10 20 4
37 32 10 16 4
38 30 12 21 4
39 32 12 15 4
40 33 14 21 4
end
tabstat y c1 c2, by(grp) stat(n mean sd) col(stat)
Summary for variables: y c1 c2
by categories of: grp
grp | N mean sd
---------+------------------------------
1 | 10 9.8 2.347576
| 10 3.4 1.712698
| 10 12.8 4.491968
---------+------------------------------
2 | 10 17.9 3.107339
| 10 7 2.309401
| 10 15.1 4.040077
---------+------------------------------
3 | 10 25.1 2.726414
| 10 10.2 2.20101
| 10 14.6 4.060651
---------+------------------------------
4 | 10 29.4 2.633122
| 10 9.9 2.18327
| 10 15.4 4.599517
---------+------------------------------
Total | 40 20.55 7.977372
| 40 7.625 3.439495
| 40 14.475 4.260658
----------------------------------------
anova y grp /* 0 covariates */
Number of obs = 40 R-squared = 0.8929
Root MSE = 2.71723 Adj R-squared = 0.8840
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2216.1 3 738.7 100.05 0.0000
|
grp | 2216.1 3 738.7 100.05 0.0000
|
Residual | 265.8 36 7.38333333
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
anova y c1 grp, cont(c1) /* 1 covariate */
Number of obs = 40 R-squared = 0.9594
Root MSE = 1.69598 Adj R-squared = 0.9548
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2381.22741 4 595.306852 206.97 0.0000
|
c1 | 165.127408 1 165.127408 57.41 0.0000
grp | 415.841199 3 138.613733 48.19 0.0000
|
Residual | 100.672592 35 2.87635976
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
anova y c1 c2 grp, cont(c1 c2) /* 2 covariates */
Number of obs = 40 R-squared = 0.9624
Root MSE = 1.65656 Adj R-squared = 0.9569
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2388.59757 5 477.719513 174.08 0.0000
|
c1 | 98.974038 1 98.974038 36.07 0.0000
c2 | 7.37015734 1 7.37015734 2.69 0.1105
grp | 420.189396 3 140.063132 51.04 0.0000
|
Residual | 93.3024343 34 2.74418925
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
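The partial SS for c2 in the two-covariate model is the drop in residual SS when c2 is added to the one-covariate model. A minimal sketch of that check, with hypothetical scalar names and the residual sums of squares copied from the three anova outputs above:

```stata
* Error mean squares with 0, 1 and 2 covariates
scalar mse0 = 265.8/36
scalar mse1 = 100.672592/35
scalar mse2 = 93.3024343/34

* Contribution of c2 over and above c1 and grp, on 1 df
scalar ss_c2 = 100.672592 - 93.3024343
scalar f_c2  = (ss_c2/1)/mse2

display "MS error: " %6.3f mse0 "  " %6.3f mse1 "  " %6.3f mse2
display "SS for c2 = " %8.5f ss_c2 "   F(1, 34) = " %5.2f f_c2
```

Most of the gain in precision comes from c1; adding c2 lowers the error mean square only slightly, which matches its nonsignificant F.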
adjust c1 c2, by(grp) gen(adjy)
---------------------------------------------------------------------------------------------------------------
Dependent variable: y Command: anova
Covariates set to mean: c1 = 7.625, c2 = 14.475
---------------------------------------------------------------------------------------------------------------
----------------------
grp | xb
----------+-----------
1 | 13.7834
2 | 18.3846
3 | 22.7797
4 | 27.2523
----------------------
Key: xb = Linear Prediction
quietly anova adjy grp
fhcomp grp, nu(34) mse(2.74418925)
Fisher-Hayter pairwise comparisons for variable grp
studentized range critical value(.05, 3, 34) = 3.4655934
mean critical
grp vs grp group means dif dif
-------------------------------------------------------
1 vs 2 13.7834 18.3846 4.6012* 1.8155
1 vs 3 13.7834 22.7797 8.9964* 1.8155
1 vs 4 13.7834 27.2523 13.4690* 1.8155
2 vs 3 18.3846 22.7797 4.3952* 1.8155
2 vs 4 18.3846 27.2523 8.8678* 1.8155
3 vs 4 22.7797 27.2523 4.4726* 1.8155
anova y c1 grp c1*grp, cont(c1) /* check homogeneity of regression for c1 */
Number of obs = 40 R-squared = 0.9598
Root MSE = 1.76482 Adj R-squared = 0.9511
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2382.23359 7 340.319084 109.27 0.0000
|
c1 | 152.279387 1 152.279387 48.89 0.0000
grp | 70.1635717 3 23.3878572 7.51 0.0006
c1*grp | 1.00618243 3 .335394144 0.11 0.9550
|
Residual | 99.6664092 32 3.11457529
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
anova y c2 grp c2*grp, cont(c2) /* check homogeneity of regression for c2 */
Number of obs = 40 R-squared = 0.9228
Root MSE = 2.44624 Adj R-squared = 0.9060
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2290.40886 7 327.201265 54.68 0.0000
|
c2 | 73.6130056 1 73.6130056 12.30 0.0014
grp | 182.057287 3 60.6857623 10.14 0.0001
c2*grp | .785330753 3 .261776918 0.04 0.9876
|
Residual | 191.491142 32 5.98409817
-----------+----------------------------------
Linear Statistical Models Course
Phil Ender, 13may06, 11apr06, 25May00