Ed230B/C

Interactions With Categorical Predictors


So Far...

  • We have considered interactions between continuous variables, which some authors call product variables.
  • We have also considered the interaction of dummy variables with continuous variables.
  • Now it is time to consider the interaction of two categorical variables.
  • When the levels of the categorical variables are fixed, this model is an analysis of variance (ANOVA) type model.

    Example Using hsb2

    We will look at a model that uses write as the response variable and female and prog as predictors.

    use http://www.gseis.ucla.edu/courses/data/hsb2, clear
     
    tab1 female prog
     
    -> tabulation of female  
    
         female |      Freq.     Percent        Cum.
    ------------+-----------------------------------
           male |         91       45.50       45.50
         female |        109       54.50      100.00
    ------------+-----------------------------------
          Total |        200      100.00
     
    -> tabulation of prog  
    
        type of |
        program |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        general |         45       22.50       22.50
       academic |        105       52.50       75.00
       vocation |         50       25.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00
      
    table prog female, cont(mean write sd write freq)
     
    ------------------------------
    type of   |       female      
    program   |     male    female
    ----------+-------------------
      general | 49.14286     53.25
              | 10.36478  8.205248
              |       21        24
              | 
     academic | 54.61702  57.58621
              | 8.656622  7.115672
              |       47        58
              | 
     vocation | 41.82609  50.96296
              | 8.003705  8.341193
              |       23        27
    ------------------------------
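
    Each cell of the table lists the mean, standard deviation, and frequency of write.
    The means already hint at what an interaction term would test. As a rough sketch
    that uses only the means printed above, the female minus male gap in write can be
    computed program by program. The gaps work out to about 4.11 (general), 2.97
    (academic), and 9.14 (vocation); an interaction is present to the extent that
    these gaps are not equal.

    /* female minus male gap in mean write, by program (means copied from the table) */
    display 53.25    - 49.14286     /* general  */
    display 57.58621 - 54.61702     /* academic */
    display 50.96296 - 41.82609     /* vocation */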
     
    /* 1st model -- no interaction */
     
    anova write female prog
    
                               Number of obs =     200     R-squared     =  0.2408
                               Root MSE      = 8.32211     Adj R-squared =  0.2291
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |  4304.40272     3  1434.80091      20.72     0.0000
                             |
                      female |  1128.70487     1  1128.70487      16.30     0.0001
                        prog |  3128.18888     2  1564.09444      22.58     0.0000
                             |
                    Residual |  13574.4723   196  69.2575116   
                  -----------+----------------------------------------------------
                       Total |   17878.875   199   89.843593   
     
    xi: regress write i.female i.prog
    i.female          _Ifemale_0-1        (naturally coded; _Ifemale_0 omitted)
    i.prog            _Iprog_1-3          (naturally coded; _Iprog_1 omitted)
    
          Source |       SS       df       MS              Number of obs =     200
    -------------+------------------------------           F(  3,   196) =   20.72
           Model |  4304.40272     3  1434.80091           Prob > F      =  0.0000
        Residual |  13574.4723   196  69.2575116           R-squared     =  0.2408
    -------------+------------------------------           Adj R-squared =  0.2291
           Total |   17878.875   199   89.843593           Root MSE      =  8.3221
    
    ------------------------------------------------------------------------------
           write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      _Ifemale_1 |   4.771211   1.181876     4.04   0.000     2.440385    7.102037
        _Iprog_2 |   4.832929   1.482956     3.26   0.001     1.908331    7.757528
        _Iprog_3 |  -4.605141   1.710049    -2.69   0.008      -7.9776   -1.232683
           _cons |   48.78869   1.391537    35.06   0.000     46.04438      51.533
    ------------------------------------------------------------------------------
     
    test _Ifemale_1
    
     ( 1)  _Ifemale_1 = 0
    
           F(  1,   196) =   16.30
                Prob > F =    0.0001
     
    test _Iprog_2 _Iprog_3
    
     ( 1)  _Iprog_2 = 0
     ( 2)  _Iprog_3 = 0
    
           F(  2,   196) =   22.58
                Prob > F =    0.0000
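
    The model without the interaction forces the female minus male gap to be the same
    4.771211 in every program, so its fitted cell means need not equal the observed
    ones. A small sketch, assuming the xi: regress model above is still the active
    estimation result: lincom rebuilds two fitted cell means from the coefficients,
    roughly 58.39 for females in academic (observed 57.58621) and roughly 44.18 for
    males in vocation (observed 41.82609).

    lincom _cons + _Ifemale_1 + _Iprog_2     /* fitted mean: female, academic */
    lincom _cons + _Iprog_3                  /* fitted mean: male, vocation   */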
     
    /* 2nd model -- interaction */
     
    anova write female prog female*prog
    
                               Number of obs =     200     R-squared     =  0.2590
                               Root MSE      = 8.26386     Adj R-squared =  0.2399
    
                      Source |  Partial SS    df       MS           F     Prob > F
                 ------------+----------------------------------------------------
                       Model |  4630.36091     5  926.072182      13.56     0.0000
                             |
                      female |  1261.85329     1  1261.85329      18.48     0.0000
                        prog |  3274.35082     2  1637.17541      23.97     0.0000
                 female*prog |  325.958189     2  162.979094       2.39     0.0946
                             |
                    Residual |  13248.5141   194  68.2913097   
                 ------------+----------------------------------------------------
                       Total |   17878.875   199   89.843593   
     
    xi: regress write i.female*i.prog
    i.female          _Ifemale_0-1        (naturally coded; _Ifemale_0 omitted)
    i.prog            _Iprog_1-3          (naturally coded; _Iprog_1 omitted)
    i.fem~e*i.prog    _IfemXpro_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     200
    -------------+------------------------------           F(  5,   194) =   13.56
           Model |  4630.36091     5  926.072182           Prob > F      =  0.0000
        Residual |  13248.5141   194  68.2913097           R-squared     =  0.2590
    -------------+------------------------------           Adj R-squared =  0.2399
           Total |   17878.875   199   89.843593           Root MSE      =  8.2639
    
    ------------------------------------------------------------------------------
           write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      _Ifemale_1 |   4.107143   2.469299     1.66   0.098    -.7629756    8.977261
        _Iprog_2 |   5.474164   2.169095     2.52   0.012     1.196128      9.7522
        _Iprog_3 |   -7.31677   2.494224    -2.93   0.004    -12.23605   -2.397493
    _IfemXpro_~2 |  -1.137957   2.954299    -0.39   0.701    -6.964625     4.68871
    _IfemXpro_~3 |   5.029733    3.40528     1.48   0.141     -1.68639    11.74586
           _cons |   49.14286   1.803321    27.25   0.000     45.58623    52.69949
    ------------------------------------------------------------------------------
     
    test _IfemXpro_1_2 _IfemXpro_1_3
    
     ( 1)  _IfemXpro_1_2 = 0
     ( 2)  _IfemXpro_1_3 = 0
    
           F(  2,   194) =    2.39
                Prob > F =    0.0946
     
    test _Ifemale_1
    
     ( 1)  _Ifemale_1 = 0
    
           F(  1,   194) =    2.77
                Prob > F =    0.0979
     
    test _Iprog_2 _Iprog_3
    
     ( 1)  _Iprog_2 = 0
     ( 2)  _Iprog_3 = 0
    
           F(  2,   194) =   18.69
                Prob > F =    0.0000
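
    With dummy coding and the interaction in the model, each coefficient is a
    comparison among cell means rather than a main effect. The sketch below assumes
    the xi: regress model with i.female*i.prog is still the active estimation result.
    The first lincom should reproduce the observed mean for females in academic
    (57.58621). The remaining three give the female minus male gap within each
    program; _Ifemale_1 by itself is only the gap in the general program
    (53.25 - 49.14286), which is why its test gives F = 2.77 rather than the 18.48
    reported by anova.

    lincom _cons + _Ifemale_1 + _Iprog_2 + _IfemXpro_1_2   /* cell mean: female, academic */
    lincom _Ifemale_1                       /* gap in general  */
    lincom _Ifemale_1 + _IfemXpro_1_2       /* gap in academic */
    lincom _Ifemale_1 + _IfemXpro_1_3       /* gap in vocation */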
     
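    /* with dummy coding the last two tests are simple effects at the reference */
    /* level of the other factor, not the anova main effects (F = 18.48, 23.97) */
    /* to reproduce the main effects we need to use effect coding               */
    /* xi3 will produce effect coding --  findit xi3 */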
     
    xi3: regress write e.female*e.prog
    
    This is an experimental version of xi3
    Please view results with some caution
    d.female          _Ifemale_0-1        (naturally coded; _Ifemale_0 omitted)
    d.prog            _Iprog_1-3          (naturally coded; _Iprog_1 omitted)
    
          Source |       SS       df       MS              Number of obs =     200
    -------------+------------------------------           F(  5,   194) =   13.56
           Model |  4630.36091     5  926.072182           Prob > F      =  0.0000
        Residual |  13248.5141   194  68.2913097           R-squared     =  0.2590
    -------------+------------------------------           Adj R-squared =  0.2399
           Total |   17878.875   199   89.843593           Root MSE      =  8.2639
    
    ------------------------------------------------------------------------------
           write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      _Ifemale_1 |   2.702201   .6286312     4.30   0.000     1.462372     3.94203
        _Iprog_2 |   4.870758   .7838244     6.21   0.000     3.324847     6.41667
        _Iprog_3 |  -4.836331   .9237884    -5.24   0.000    -6.658289   -3.014373
       _Ife1Xpr2 |  -1.217608   .7838244    -1.55   0.122    -2.763519    .3283035
       _Ife1Xpr3 |   1.866237   .9237884     2.02   0.045     .0442794    3.688195
           _cons |   51.23086   .6286312    81.50   0.000     49.99103    52.47068
    ------------------------------------------------------------------------------
     
    tablist prog _Iprog_2 _Iprog_3  /* findit tablist */
    
      +---------------------------------------+
      |     prog   _Iprog_2   _Iprog_3   Freq |
      |---------------------------------------|
      | academic          1          0    105 |
      | vocation          0          1     50 |
      |  general         -1         -1     45 |
      +---------------------------------------+
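
    Under effect coding the coefficients are deviations from the unweighted mean of
    the six cell means. As a sketch that uses the cell means from the table command
    earlier: the first display should come to about 51.23086 (the _cons above) and
    the second to about 2.7022 (the _Ifemale_1 above). Because general is coded -1
    on both prog variables, its own effect is not printed; assuming the xi3 model is
    still the active estimation result, lincom recovers it as minus the sum of the
    other two prog effects.

    /* unweighted grand mean of the six cell means */
    display (49.14286 + 53.25 + 54.61702 + 57.58621 + 41.82609 + 50.96296)/6
    /* unweighted female mean minus the grand mean */
    display (53.25 + 57.58621 + 50.96296)/3 - 51.23086
    /* implied effect for the general program */
    lincom -_Iprog_2 - _Iprog_3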
     
    test _Ife1Xpr2 _Ife1Xpr3
    
     ( 1)  _Ife1Xpr2 = 0
     ( 2)  _Ife1Xpr3 = 0
    
           F(  2,   194) =    2.39
                Prob > F =    0.0946
     
    test _Ifemale_1
    
     ( 1)  _Ifemale_1 = 0
    
           F(  1,   194) =   18.48
                Prob > F =    0.0000
     
    test _Iprog_2 _Iprog_3
    
     ( 1)  _Iprog_2 = 0
     ( 2)  _Iprog_3 = 0
    
           F(  2,   194) =   23.97
                Prob > F =    0.0000
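
    The xi and xi3 prefixes predate Stata's factor-variable notation. As a hedged
    sketch only, assuming Stata 12 or later (not the version that produced the output
    above), the same model can be fit without generating the _I variables. The
    contrast command reports ANOVA-style tests for each term, which should line up
    with the anova F values of 18.48, 23.97, and 2.39 even though the regression
    itself is dummy coded, and margins lists the six cell means. These commands were
    not part of the original notes, so the correspondence is worth checking against
    the tables above.

    regress write i.female##i.prog      /* main effects plus interaction          */
    contrast female##prog               /* joint tests for female, prog, female#prog */
    margins female#prog                 /* predicted cell means                    */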


    UCLA Department of Education

    Phil Ender, 18dec99