Assignment 4 in SPSS

230B Computer Lab

Jeff Forrest

Revised 3/5/05

 

I. Selecting your variables

II. Creating the dummy code variables

III. Creating the interaction variables

IV. Running the analysis

V. Assumptions checks

 

           

I. Selecting Your Variables

Using one of the datasets provided by the instructor or a dataset of your own choosing (with approval of the instructor), conduct an analysis that includes both qualitative and quantitative predictor variables (at least one predictor of each type). The qualitative predictor you choose must have at least three categories.

Look in the Codebook for the data set if you need category labels for the variables. In your write-up, be sure to back up your choice of variables with a hypothesis of what you expect to find and why (just like in Assignment 2). Run descriptive statistics and assumption checks as usual.

 

 

For this example, we are using the data set HSB1 and the following variables:

 

Continuous Dependent Variable: WRITE (Writing score)

Continuous Independent Variable: READ (Reading score)

Categorical Independent Variable: SES (three possible values of 1=Low, 2=Middle and 3=High)

 

Note: You must choose different variables for your own analysis.

 

II. Creating the Dummy Code Variables

 

            A. Why do we need dummy variables?

 

For this analysis, we will not consider SES to be a quantitative predictor. Instead we will consider SES to be a qualitative variable with 3 categories. To identify group membership without giving inherent meaning to the value assigned to a group (here a value of 1, 2, or 3 on SES), we must create dummy codes that identify group membership.        

 

B. Planning your dummy coding scheme

 

Since SES has three possible values, we will need to create two dummy variables (vectors) to capture the information. We will call them SESDUM1 and SESDUM2. These are not reserved names and you can call your dummy codes anything that makes sense to you. Here is the most straightforward way to create the dummy codes for the categorical variable of SES:

 

Table 1: Dummy Coding “Key” with High SES as the reference group

                  Code SESDUM1 as:    Code SESDUM2 as:
When SES = 1             1                   0
When SES = 2             0                   1
When SES = 3             0                   0

 

In any study, the group coded with zeroes on all vectors is the reference group. In Table 1, the choice of SES = 3 (High SES) as our reference group was arbitrary. If you think making Middle SES the reference group will make your interpretation easier, you could just as easily use the coding scheme in Table 2:

 

                                   

Table 2: Dummy Coding “Key” with Middle SES as the reference group

                  Code SESDUM1 as:    Code SESDUM3 as:
When SES = 1             1                   0
When SES = 2             0                   0
When SES = 3             0                   1
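Both coding keys follow one rule: each non-reference category gets its own vector, coded 1 for that category and 0 otherwise, and the reference category is coded 0 on every vector. The Python sketch below is only an illustration of that rule (the assignment itself is done in SPSS); the function name dummy_key is hypothetical.

```python
def dummy_key(categories, reference):
    """Return {category: tuple of 0/1 codes}, one code per non-reference category."""
    if reference not in categories:
        raise ValueError("reference must be one of the categories")
    # One dummy vector for every category except the reference group.
    coded = [c for c in categories if c != reference]
    return {c: tuple(1 if c == d else 0 for d in coded) for c in categories}


# Table 1: High SES (3) is the reference; vectors stand for SES = 1 and SES = 2.
table1 = dummy_key([1, 2, 3], reference=3)
# Table 2: Middle SES (2) is the reference; vectors stand for SES = 1 and SES = 3.
table2 = dummy_key([1, 2, 3], reference=2)

for ses, codes in table2.items():
    print("When SES =", ses, "codes are", codes)
```

A variable with k categories always yields k - 1 vectors, which is why three SES categories need two dummy codes.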

 

 

            C. Creating the dummy codes in SPSS

 

                        For our example, we will use Middle SES as our reference group and corresponding coding “key” from Table 2.

 

                        Step 1) Handle missing data before creating your dummy codes.

           

Either delete cases with missing values, or make sure that whatever code designates a missing value (often the number 9) is replaced with the SPSS “system missing” symbol, a period (“.”), in the data spreadsheet.
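The two options above can be sketched in Python as follows. This is an illustration only, not SPSS syntax: the missing-value code 9 is an assumption (check the Codebook for your data set), the data values are made up, and the helper names to_missing and drop_missing are hypothetical. Python's None plays the role of the SPSS system-missing value.

```python
MISSING_CODE = 9  # assumption: 9 marks a missing value; check your Codebook


def to_missing(values, missing_code=MISSING_CODE):
    """Option 2: replace the missing-value code with None (system missing)."""
    return [None if v == missing_code else v for v in values]


def drop_missing(rows):
    """Option 1: keep only the cases (rows) that have no missing value."""
    return [row for row in rows if all(v is not None for v in row)]


ses_raw = [1, 2, 9, 3, 2]  # made-up values for illustration
ses = to_missing(ses_raw)
print(ses)

cases = list(zip(ses, [52, 47, 60, 63, 55]))  # made-up (SES, READ) pairs
print(drop_missing(cases))
```

Either way, the point is that a code such as 9 must never reach the recode step as if it were a real SES category.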

 

                        Step 2) Create the first dummy variable SESDUM1 and give it the values identified in Table 2.

 

a. Go to the TRANSFORM menu and select the RECODE option, then the INTO DIFFERENT VARIABLES option:

 

 

b. Select your variable (SES, for this example) and click it into the window labeled Numeric Variable->Output Variable.

 

c. Over on the right, in the window called Output Variable Name: type in a name for your first dummy code, in this case SESDUM1

 

d. Click the [CHANGE] button.

 

 

e. In the Label box type in whatever you want as a label. (optional)

 

f. Click on the [OLD AND NEW VALUES…] button.

 

 

 

 

           

Remember, we are using the coding scheme from Table 2:

Table 2: Dummy Coding “Key” with Middle SES as the reference group

                  Code SESDUM1 as:    Code SESDUM3 as:
When SES = 1             1                   0
When SES = 2             0                   0
When SES = 3             0                   1

 

                                    g. On the “Old Value” side of the window, type “1” in the value box

 

                                    h. On the “New Value” side of the window, type a “1” in the value box

 

                                    i. Click on the [ADD] Button

 

 

                                                Notice that “1 --> 1” now appears in the large box. This reads “when SES equals 1, then code SESDUM1 as 1”.

 

                                    j. Repeat for all possible values of SES (for SESDUM1, the old values 2 and 3 both become 0), including system-missing values if you have them in your data.

 

 

                                    k. Click on the [CONTINUE] button. This takes you back to the ‘RECODE INTO DIFFERENT VARIABLES’ window.

 

 

                                    l. Click on the [OK] button to create the new variable SESDUM1 with the appropriate values based on the original SES variable.

 

                                    m. Repeat for all remaining dummy variables (here, SESDUM3). Again, use Table 2 to determine which values are appropriate.
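The recode steps above amount to looking up each old SES value in an "old value to new value" list and writing the result into a new variable. A minimal Python sketch of that lookup, using the Table 2 key, is shown below. It is an illustration rather than SPSS syntax; the function name recode is hypothetical and the SES values are made up.

```python
def recode(values, old_to_new):
    """Recode into a new list; a missing value (None) stays missing."""
    return [None if v is None else old_to_new[v] for v in values]


ses = [1, 2, 3, None, 2, 1]  # made-up values; None stands for system missing

# Old --> New pairs taken from Table 2 (Middle SES is the reference group).
sesdum1 = recode(ses, {1: 1, 2: 0, 3: 0})
sesdum3 = recode(ses, {1: 0, 2: 0, 3: 1})

print(sesdum1)
print(sesdum3)
```

Because every old value must appear in the lookup, a forgotten category raises an error here instead of silently producing a wrong code, which is the same check you should make by eye in the Old and New Values window.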

 

 

III. Creating the Interaction Variables

 

Now, you must code for the interactions between predictor variables. For this example, we will use the following coding scheme:

 

INTER1 = READ*SESDUM1

 

INTER3 = READ*SESDUM3

 

Step 1) Select the TRANSFORM menu, then the COMPUTE option

 

           

                        Step 2) Type INTER1 in the TARGET VARIABLE box and create the expression ‘read*sesdum1’ in the NUMERIC EXPRESSION box.

 

                        Step 3) Click on [OK] to create the new variable INTER1 that holds the value of READ x SESDUM1.

 

                        Step 4) Repeat for all remaining interaction variables.
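Each interaction variable is simply the case-by-case product of the continuous predictor and one dummy code, so it equals READ for cases in that dummy's group and 0 for everyone else. The Python sketch below illustrates this with made-up values; the function name product is hypothetical, and None stands for a system-missing value.

```python
def product(xs, ys):
    """Case-by-case product; a missing value (None) in either input gives None."""
    return [None if x is None or y is None else x * y for x, y in zip(xs, ys)]


read = [47, 52, 63, 57]    # made-up reading scores
sesdum1 = [1, 0, 0, 1]     # 1 = Low SES
sesdum3 = [0, 0, 1, 0]     # 1 = High SES

inter1 = product(read, sesdum1)  # READ for Low SES cases, 0 otherwise
inter3 = product(read, sesdum3)  # READ for High SES cases, 0 otherwise

print(inter1)
print(inter3)
```

Cases in the reference group (Middle SES) are 0 on both interaction variables, just as they are 0 on both dummy codes.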

 

                       

 

IV. Running the Analysis

 

Choose the ANALYZE menu, then the REGRESSION option, then the LINEAR option.

 

 

 

 

We want the data to be entered into our model in three separate blocks. This allows us to see how much impact each new block has on our model. For instance, if the block of dummy codes makes a significant impact, then we conclude that the original categorical variable significantly contributes to the prediction of our Dependent Variable.
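The impact of a block is usually judged by how much R-squared changes when the block is added. As a sketch in standard notation (these symbols are not from the handout): let R_k^2 be the R-squared after block k is entered, p_k the number of predictors in the model after block k, and n the number of cases. The F statistic for the change is then:

```latex
F_{\text{change}} =
  \frac{\left(R_k^2 - R_{k-1}^2\right) / \left(p_k - p_{k-1}\right)}
       {\left(1 - R_k^2\right) / \left(n - p_k - 1\right)},
\qquad
df = \left(p_k - p_{k-1},\; n - p_k - 1\right)
```

For the second block in this example, p_k - p_{k-1} = 2, because the two dummy codes are added together and tested as a set.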

 

Step 1) Create the first block with your Dependent Variable and continuous Independent Variable:

 

a. Click the dependent variable WRITE from the list on the left into the Dependent Window.

 

b. Click your continuous predictor variable READ over into the Independent(s) Window.

 

                                    c. Click on the [NEXT] button.

 

Step 2) Create the second block to add the categorical variable dummy codes into the model:

 

a. Click the dummy codes SESDUM1 and SESDUM3 over into the Independent(s) Window.

 

b. Click on the [NEXT] button.

 

Step 3) Create the third block to add the interaction variables into the model:

 

a. Click the interaction terms INTER1 and INTER3 over into the Independent(s) window.

 

Step 4) Click on the [STATISTICS…] button at the bottom and request R-square change and collinearity statistics by checking the box next to each item.

 

Step 5) Click on the [PLOTS…] button and request a normal probability plot (P-P plot).

 

Step 6) Click on the [SAVE…] button and choose the options to save your standardized residuals and unstandardized predicted values.

 

Step 7) Click [OK] to run the analysis.
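To make the three-block entry concrete, the sketch below fits the three nested models by ordinary least squares in plain Python and prints R-squared and its change at each block. It is an illustration under stated assumptions, not a substitute for the SPSS output: the data values are made up and are NOT from HSB1, and the helper names solve and r_squared are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]  # fails if predictors are perfectly collinear
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, predictors):
    """R-squared from an OLS fit of y on an intercept plus the given predictor columns."""
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up illustrative data (NOT from HSB1): three cases per SES group.
read = [40, 50, 60, 45, 55, 65, 42, 52, 62]
ses = [1, 1, 1, 2, 2, 2, 3, 3, 3]
write = [42, 47, 55, 48, 56, 61, 50, 60, 66]

# Dummy codes from Table 2 (Middle SES = reference) and their interactions with READ.
sesdum1 = [1 if s == 1 else 0 for s in ses]
sesdum3 = [1 if s == 3 else 0 for s in ses]
inter1 = [r * d for r, d in zip(read, sesdum1)]
inter3 = [r * d for r, d in zip(read, sesdum3)]

# Block 1: READ; Block 2: adds the dummy codes; Block 3: adds the interactions.
blocks = [[read], [sesdum1, sesdum3], [inter1, inter3]]
entered, r2 = [], []
for block in blocks:
    entered = entered + block
    r2.append(r_squared(write, entered))

previous = 0.0
for i, value in enumerate(r2, start=1):
    print(f"Block {i}: R-squared = {value:.3f}, change = {value - previous:.3f}")
    previous = value
```

Because each block only adds predictors to the previous model, R-squared can never decrease from one block to the next; the question the analysis answers is whether each increase is large enough to be significant.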

 

 

 


 


 

Graduate School of Education & Information Studies