
Regression Diagnostics
Descriptive Statistics and Exploratory Data Analysis
Any data analysis should begin with descriptive statistics and exploratory data analysis.
These analyses include distribution plots (histograms, stem plots, or kdensity plots).
Plot Dependent Variable vs Predicted Values
One visual check of the goodness-of-fit of the model is to plot the values of
the dependent variable versus the predicted values. When prediction is
perfect, the points fall on a diagonal line.
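
As a rough illustration of this check, the sketch below fits a one-predictor least-squares line and lists the observed and predicted values that would be plotted against each other. The data and the helper name `fit_line` are invented for this example; any plotting tool can take the pairs from here.

```python
# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * xi for xi in x]

# Pairs to plot: observed on one axis, predicted on the other.
# A case on the diagonal (observed == predicted) is predicted perfectly.
for obs, pred in zip(y, predicted):
    print(f"observed={obs:6.2f}  predicted={pred:6.2f}  gap={obs - pred:+.2f}")
```

The closer the points lie to the diagonal, the smaller the gaps printed in the last column.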
Residual Analysis
Residuals are the differences between the observed scores and the predicted scores:
residual = observed score - predicted score.

Residuals come in three varieties:
- Raw Residuals: The difference between the raw observed score and the predicted score,
as given in the formula above. Often denoted e or resid.
- Standardized Residuals: These are the raw residuals divided by the standard error of estimate.
Can be denoted rstan or zresid.
- Studentized Residuals: These are raw residuals divided by the standard error of the
residual with that case deleted. These are sometimes called studentized deleted residuals
or studentized jackknifed residuals. Can be denoted rstu.
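
The three varieties can be computed by hand for a one-predictor regression, as in the sketch below. The data are invented, and the leverage and case-deleted formulas are the usual textbook ones for simple regression rather than anything specific to these notes. Note that some packages also divide their "standardized" residual by sqrt(1 - leverage); the version here follows the definition given above.

```python
import math

# Hypothetical data; case 6 is deliberately placed off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 9.0, 7.1, 7.9]

n = len(x)
p = 2  # parameters estimated: intercept and slope
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# Raw residuals: observed minus predicted.
raw = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse = sum(e * e for e in raw)
s = math.sqrt(sse / (n - p))  # standard error of estimate

# Standardized residuals: raw residual divided by the standard error of estimate.
standardized = [e / s for e in raw]

# Leverage of each case (closed form for a single predictor).
leverage = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Studentized (deleted) residuals: the error variance is re-estimated
# with the case removed, so an outlier cannot inflate its own yardstick.
studentized = []
for e, h in zip(raw, leverage):
    s_deleted = math.sqrt((sse - e * e / (1 - h)) / (n - p - 1))
    studentized.append(e / (s_deleted * math.sqrt(1 - h)))

for case, (e, z, t) in enumerate(zip(raw, standardized, studentized), start=1):
    print(f"case {case}: raw={e:+.3f}  standardized={z:+.2f}  studentized={t:+.2f}")
```

For the off-trend case the studentized residual comes out much larger than the standardized one, because deleting the case shrinks the estimated error variance.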
Outliers
Outliers are cases with large residuals.
- Look for studentized residuals greater than 2.5 in absolute value but don't become
overly alarmed until residuals are greater than 3 or 4.
- An outlier indicates a peculiarity -- the data point is not typical of the rest of the data.
- These points should be examined carefully to try to find out why they are there. Perhaps there
was an error in data entry or subjects did not understand instructions, etc.
- It is possible that observations with large residuals may have little
effect on the estimation of b.
- What to do?
  - Automatic rejection -- not wise.
  - Possible data entry error -- correct error or delete case.
  - May be a "real" data point.
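
The screening rule above can be sketched as a small helper. The function name `screen_outliers`, its labels, and the example values are hypothetical; the cutoffs 2.5 and 3 come from the guidelines above. The helper only flags cases for inspection and never deletes anything, in keeping with the advice against automatic rejection.

```python
def screen_outliers(studentized, look=2.5, alarm=3.0):
    """Return (case number, residual, note) for cases worth examining.

    Cases are numbered from 1. Nothing is removed from the data.
    """
    flagged = []
    for case, r in enumerate(studentized, start=1):
        size = abs(r)
        if size > alarm:
            flagged.append((case, r, "large: check carefully"))
        elif size > look:
            flagged.append((case, r, "worth a look"))
    return flagged


# Hypothetical studentized residuals for five cases.
example = [0.4, -1.1, 2.7, -3.6, 0.2]
for case, r, note in screen_outliers(example):
    print(f"case {case}: {r:+.1f}  {note}")
```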
Plotting Residuals
- Plot shape of residuals (histogram, kdensity, normal probability).
- Plot of residuals by case (index plot).
- Plot of residuals in time sequence (if applicable).
- Plot of residual vs predicted, aka, residual vs fitted.
- Plot of residual vs each predictor variable.
In General: Residual Plots
- You should get the impression of a horizontal band with points that
vary at random.
- There should be no relation between residuals and predicted (fitted) score.
[Figure: a horizontal band of residuals scattered at random around zero.]

DV vs Predictors
- Plot DV vs IVs to check on linearity and association.
Overall Plot of Residuals
- Histogram
- Stem plot
- Normal probability plot
Index Plot -- Plot of Residuals by Case
- If the sample is too large, list only cases with residuals greater than 2.0 in absolute value.
Time Sequence Plot
- You should get impression of a horizontal band with points that vary at random.
1. Watch out for situations in which variance increases with time; try
Weighted Least Squares (W.L.S.)

2. A steady upward or downward trend in the residuals could indicate that a
linear term in time is missing from the model.

3. A curved band could indicate that both a linear and a quadratic term in time
are missing from the model.
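
Weighted least squares, suggested in item 1, can be sketched for a single predictor as below. The data are invented, and the weights assume that the error standard deviation grows in proportion to time, so each case gets weight 1/t^2. That assumption is made only for this example and would need checking against real residuals.

```python
def weighted_fit(x, y, w):
    """Weighted least squares with one predictor: returns (intercept, slope).

    Each case contributes in proportion to its weight, so noisy
    (low-weight) cases pull less on the fitted line.
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical series whose scatter widens as time goes on.
t = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.2, 3.7, 5.8, 5.0, 8.5, 6.2]

ols = weighted_fit(t, y, [1.0] * len(t))            # equal weights = ordinary LS
wls = weighted_fit(t, y, [1 / ti ** 2 for ti in t])  # assumed weights: 1 / t^2

print(f"OLS: intercept={ols[0]:.3f} slope={ols[1]:.3f}")
print(f"WLS: intercept={wls[0]:.3f} slope={wls[1]:.3f}")
```

With equal weights the function reduces to ordinary least squares, which gives a convenient baseline for comparison.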

Plot Residuals versus Predicted (Fitted) Scores
- You should get the impression of a horizontal band with points that vary at random.
- There should be no relation between residuals and predicted scores.
- A plot of the square root of the absolute value of the residuals versus fitted is good for
checking for heterogeneity of variance.
1. Watch out for situations in which variance is not constant as assumed (may need W.L.S. or
a transformation of Y).

2. A linear trend in the residuals could indicate that a variable is missing from the model
(it can also be caused by wrongly omitting the intercept term in the model).

3. A curved band could indicate that an additional term is needed in the model, such as the
square of a variable or an interaction (or, again, a transformation of Y).
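
The square-root-of-absolute-residuals check mentioned above can be sketched numerically. The data are invented so that the scatter widens with the fitted value. Comparing the lower and upper halves of the fitted range is only a crude stand-in for looking at the plot, not a formal test.

```python
import math

# Hypothetical data whose spread grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.2, 3.7, 5.8, 5.0, 8.5, 6.2]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Values for the vertical axis of the plot: sqrt(|residual|).
scale = [math.sqrt(abs(e)) for e in resid]

# Crude summary: average the values in the lower and upper halves of
# the fitted range. Similar averages suggest constant variance.
pairs = sorted(zip(fitted, scale))
half = n // 2
lower_mean = sum(s for _, s in pairs[:half]) / half
upper_mean = sum(s for _, s in pairs[half:]) / (n - half)

print(f"mean sqrt|resid|, low fitted:  {lower_mean:.2f}")
print(f"mean sqrt|resid|, high fitted: {upper_mean:.2f}")
```

An upward drift in these values as the fitted score increases is the kind of pattern that points toward W.L.S. or a transformation of Y.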

Residual Plot versus Predictors
- You should get the impression of a horizontal band with points that vary at random.
- There should be no relation between residuals and IVs.
- Take care in inferring heterogeneity.
1. May need W.L.S. or a transformation of Y.

2. Perhaps errors in calculation?

3. Need an additional term in X (X^2) or a transformation of Y.

UCLA Department of Education
Phil Ender, 15Jun98