Ed230B/C

Dichotomous Variables


  • A categorical variable with two levels.
  • Observations can be classed into two groups: male/female, group 1/group 2, true/false, yes/no, etc.
  • Any coding system that uses two different values will work: 1/0, 1/-1, or even 1/2 (see below).

    Interpreting Coefficients

  • Dummy Coding
  • Effect Coding

    Consider the following two-group design:

    Level      a1      a2   Total
                1       5
                3       6
                2       4
                2       5
                2      10
                3      10
                4       9
                3      11
    Mean      2.5     7.5     5.0
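
    As a sketch of how the two codings map onto these means, write the fitted line as y-hat = b0 + b1*x. The symbols b0, b1 and x are notation assumed here, not taken from the handout, and the choice of which group gets which code is also an assumption made for illustration.

```latex
% Assumed notation: \hat{y} = b_0 + b_1 x, with group means 2.5 (a1) and 7.5 (a2).
\begin{align*}
\text{Dummy coding } (x = 0 \text{ for } a_1,\ x = 1 \text{ for } a_2):\quad
  b_0 &= \bar{y}_{a_1} = 2.5, &
  b_1 &= \bar{y}_{a_2} - \bar{y}_{a_1} = 7.5 - 2.5 = 5 \\
\text{Effect coding } (x = 1 \text{ for } a_1,\ x = -1 \text{ for } a_2):\quad
  b_0 &= \tfrac{1}{2}\left(\bar{y}_{a_1} + \bar{y}_{a_2}\right) = 5.0, &
  b_1 &= \bar{y}_{a_1} - b_0 = 2.5 - 5.0 = -2.5
\end{align*}
```

    With equal group sizes, as here, the effect-coded constant is also the overall mean of y (the 5.0 in the Total column).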

    Example Using Dummy Coding

    input y  grp x1 x2 x3 x4 onetwo
     1   1  1  0   1   326   1
     3   1  1  0   1   326   1
     2   1  1  0   1   326   1
     2   1  1  0   1   326   1
     2   1  1  0   1   326   1
     3   1  1  0   1   326   1
     4   1  1  0   1   326   1
     3   1  1  0   1   326   1
     5   2  0  1  -1 -11814  2
     6   2  0  1  -1 -11814  2
     4   2  0  1  -1 -11814  2
     5   2  0  1  -1 -11814  2
    10   2  0  1  -1 -11814  2
    10   2  0  1  -1 -11814  2
     9   2  0  1  -1 -11814  2
    11   2  0  1  -1 -11814  2
    end
    
    regress y grp, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
         grp |          5   1.035098      4.830   0.000                   .7905694
       _cons |       -2.5   1.636634     -1.528   0.149                          .
    ------------------------------------------------------------------------------
    
    regress y x1, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
          x1 |         -5   1.035098     -4.830   0.000                  -.7905694
       _cons |        7.5   .7319251     10.247   0.000                          .
    ------------------------------------------------------------------------------
    
    regress y x2, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
          x2 |          5   1.035098      4.830   0.000                   .7905694
       _cons |        2.5   .7319251      3.416   0.004                          .
    ------------------------------------------------------------------------------
    
    regress y x3, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
          x3 |       -2.5   .5175492     -4.830   0.000                  -.7905694
       _cons |          5   .5175492      9.661   0.000                          .
    ------------------------------------------------------------------------------
    
    regress y x4, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
          x4 |  -.0004119   .0000853     -4.830   0.000                  -.7905694
       _cons |   2.634267   .7125415      3.697   0.002                          .
    ------------------------------------------------------------------------------
    
    regress y x1 x2, beta
    
      Source |       SS       df       MS                  Number of obs =      16
    ---------+------------------------------               F(  1,    14) =   23.33
       Model |      100.00     1      100.00               Prob > F      =  0.0003
    Residual |       60.00    14  4.28571429               R-squared     =  0.6250
    ---------+------------------------------               Adj R-squared =  0.5982
       Total |      160.00    15  10.6666667               Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|                       Beta
    ---------+--------------------------------------------------------------------
          x1 |         -5   1.035098     -4.830   0.000                  -.7905694
          x2 |  (dropped)
       _cons |        7.5   .7319251     10.247   0.000                          .
    ------------------------------------------------------------------------------
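
    Every model above has the same ANOVA table because each coding separates the same two groups; only the scale of the predictor changes. A minimal sketch of one way to check this, assuming the data entered above are still in memory (the names yhat_x1, yhat_x3 and yhat_x4 are made up for illustration):

```stata
* Assumes the 16 observations from the input block are in memory.
* yhat_x1, yhat_x3 and yhat_x4 are hypothetical names, not part of the handout.
quietly regress y x1
predict yhat_x1
quietly regress y x3
predict yhat_x3
quietly regress y x4
predict yhat_x4

* Each set of fitted values should match the group means of y.
tabstat y yhat_x1 yhat_x3 yhat_x4, by(grp) statistics(mean)
```

    If the codings are equivalent, each column of the table should show 2.5 for group 1 and 7.5 for group 2.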
    

    Well, why not just use 1's and 2's? Why all this 0/1 or 1/-1 coding?

    regress y onetwo
    
          Source |       SS       df       MS              Number of obs =      16
    -------------+------------------------------           F(  1,    14) =   23.33
           Model |         100     1         100           Prob > F      =  0.0003
        Residual |          60    14  4.28571429           R-squared     =  0.6250
    -------------+------------------------------           Adj R-squared =  0.5982
           Total |         160    15  10.6666667           Root MSE      =  2.0702
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          onetwo |          5   1.035098     4.83   0.000     2.779935    7.220065
           _cons |       -2.5   1.636634    -1.53   0.149    -6.010231    1.010231
    ------------------------------------------------------------------------------

    As you can see, the coefficient for the groups is the same as for dummy coding. However, the constant is less informative: it is the predicted value of y when onetwo equals zero, and no group is coded zero. With dummy coding, the constant is the mean of the group coded zero, so in this respect dummy coding is much more informative.
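
    To see what the constant under 1/2 coding means, the fitted line can be evaluated at the codes that actually occur, and the variable can be shifted to 0/1. A sketch, assuming the data above are still in memory and using the hypothetical variable name d2:

```stata
* Assumes the data entered above are in memory; d2 is a hypothetical name.
quietly regress y onetwo
display "predicted y at onetwo = 0: " _b[_cons]
display "predicted y at onetwo = 1: " _b[_cons]+_b[onetwo]
display "predicted y at onetwo = 2: " _b[_cons]+2*_b[onetwo]

* Shifting the codes to 0/1 leaves the slope unchanged and
* turns the constant into the mean of the first group.
generate d2 = onetwo - 1
regress y d2
```

    From the coefficients reported above, the three displayed values work out to -2.5, 2.5 and 7.5, and the final regression should reproduce the x2 model.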


    UCLA Department of Education

    Phil Ender, 11Feb99