Ed231A

Multivariate Analysis

Factor Analysis


Factor Analysis Model

In matrix form,

    Z = FA'

where

Z -> (nxm) standard score matrix
A -> (mxp) factor pattern matrix
F -> (nxp) factor score matrix

Factor Pattern

The factor pattern matrix, A, is the matrix of coefficients which, when applied to the factor scores, reproduces the standard score matrix.

Factor Structure

S -> (mxp) factor structure matrix

The factor structure matrix, S, is the matrix of correlations between factors and variables. It is sometimes called the loading matrix. With orthogonal solutions the structure matrix and the pattern matrix are the same.

Factor Correlations

Φ -> (pxp) factor correlation matrix

If the factor analysis solution is orthogonal then Φ = I.

Fun with Math

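With standardized variables and factors, the structure matrix is S = (1/n) Z'F and the factor correlation matrix is Φ = (1/n) F'F. Substituting Z = FA' into the expression for S,

    S = (1/n) (FA')'F = A (1/n) F'F

we get

    S = AΦ

thus, when Φ = I, S = AI or S = A.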

Reproduced Correlations
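
For an orthogonal solution the correlations implied by the factors are the elements of AA' (AΦA' in the oblique case): the off-diagonal entries are the reproduced correlations and the diagonal holds the communalities. The following is a minimal Python sketch rather than Stata, assuming the unrotated principal-axis loadings printed later in these notes for the Harman data. The function name `reproduced_correlations` is illustrative only.

```python
# Unrotated principal-axis loadings for the five Harman variables,
# copied from the Stata output later in these notes.
NAMES = ["pop", "medsch", "employ", "profser", "medhouse"]
LOADINGS = [
    [0.62533, 0.76621],
    [0.71370, -0.55515],
    [0.71447, 0.67936],
    [0.87899, -0.15846],
    [0.74215, -0.57806],
]


def reproduced_correlations(a):
    """Return A A', the correlations implied by an orthogonal solution."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a]
            for row_i in a]


r_hat = reproduced_correlations(LOADINGS)
for name, row in zip(NAMES, r_hat):
    print(f"{name:>8} " + " ".join(f"{value:7.4f}" for value in row))
```

Subtracting the off-diagonal entries of this matrix from the observed correlations gives the residual correlations, which is one way to judge whether two factors are enough.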

Variance to be Factored

Total variance         = h_j^2 + b_j^2 + e_j^2
Reliability            = h_j^2 + b_j^2
Communality      h_j^2 = h_j^2
Uniqueness       d_j^2 =         b_j^2 + e_j^2
Specificity      b_j^2 =         b_j^2
Error            e_j^2 =                 e_j^2
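
The partition above can be sketched numerically for a single standardized variable (total variance of one). This Python sketch assumes orthogonal factors, so the communality is the sum of squared loadings. It uses the medsch loadings from the principal-axis output later in these notes, while the reliability of 0.90 is a made-up value for illustration only; the function name `partition_variance` is likewise hypothetical.

```python
def partition_variance(loadings, reliability):
    """Split the unit variance of a standardized variable.

    loadings    -- the variable's loadings on orthogonal common factors
    reliability -- reliability estimate (communality + specificity)
    """
    communality = sum(a * a for a in loadings)
    if not communality <= reliability <= 1.0:
        raise ValueError("expected communality <= reliability <= 1")
    return {
        "communality": communality,
        "specificity": reliability - communality,
        "error": 1.0 - reliability,
        "uniqueness": 1.0 - communality,
    }


# medsch loadings from the notes; reliability=0.90 is an assumed value.
parts = partition_variance([0.71370, -0.55515], reliability=0.90)
for label, value in parts.items():
    print(f"{label:<12} {value:.5f}")
```

Without a reliability estimate only the communality and the uniqueness can be computed; the split of uniqueness into specificity and error needs that extra piece of information.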

PCA vs FA

  • Principal Components Analysis analyzes the total variance. That is, it analyzes the correlation matrix with ones on the diagonal.
  • Factor Analysis analyzes the common variance. That is, it analyzes the correlation matrix with communality estimates on the diagonal. It is sometimes called common factor analysis.

    The Concept of Simple Structure

  • Each row in the factor pattern matrix should contain at least one zero.
  • Each column of the factor pattern matrix should contain at least p zeros, where p is the number of factors.
  • Every pair of columns should contain several rows whose loadings are zero in one column but non-zero in the other.
  • Every pair of columns should contain a large proportion of rows whose loadings are zero in both columns.
  • Every pair of columns should have only a few rows with non-zero loadings in both columns.

    Simple Structure Example

                Initial          Rotated
               Solution          Solution
              I  II III         F1  F2  F3
    var1      X  X   0           0   0   X
    var2      X  X   0           0   0   X
    var3      X  X   0           0   0   X
    var4      X -X   0           0   X   0
    var5      X -X   0           0   X   0
    var6      X  0   X           X   0   0
    var7      X  0  -X           X   0   0
    var8      X  0  -X           X   0   0
              G  B   B           P   P   P            
    

    Decisions in Factor Analysis

    1. Method of initial factor solution
    2. Method of communality estimation
    3. Number of factors to retain
    4. Method of rotation

    Method of Initial Factor Solution

    1. Principal Components Analysis
    2. Principal Axis Factor Analysis
    3. Iterated Principal Axis Factor Analysis
    4. Image Factor Analysis
    5. Alpha Factor Analysis
    6. Maximum Likelihood Factor Analysis
    7. Unweighted Least Squares Factor Analysis
    8. Generalized Least Squares Factor Analysis

    Estimation of Communalities

    1. Ones (Principal Components Analysis)
    2. Reliabilities
    3. Squared multiple correlations (SMCs)
    4. Largest off-diagonal correlation
    5. Average Correlations
    6. Centroid
    7. Averoid

    Number of Factors to Retain

    1. Number of Eigenvalues greater than or equal to one (Principal Components Analysis)
    2. Scree Test/Scree Plot
    3. Percent of variance
    4. Hypothesis testing (ML)
    5. Parallel analysis -- Monte Carlo (Humphreys & Ilgen, 1969; Montanelli & Humphreys, 1976)
    6. Linn Method -- Monte Carlo (Linn, 1968)
    7. Ender Method -- Quasi-Monte Carlo
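
As a small illustration of the percent-of-variance rule, the Python sketch below recomputes the Proportion and Cumulative columns from a list of eigenvalues. The eigenvalues are the principal-axis values shown later in these notes for the Harman data; the helper names and the 90 percent target are assumptions made for this example, not a recommended cutoff.

```python
def variance_proportions(eigenvalues):
    """Return (proportions, cumulative proportions) of the eigenvalue total."""
    total = sum(eigenvalues)
    proportions = [value / total for value in eigenvalues]
    cumulative = []
    running = 0.0
    for proportion in proportions:
        running += proportion
        cumulative.append(running)
    return proportions, cumulative


def factors_needed(eigenvalues, target):
    """Smallest number of leading factors whose cumulative share reaches target."""
    _, cumulative = variance_proportions(eigenvalues)
    for count, share in enumerate(cumulative, start=1):
        if share >= target:
            return count
    return len(eigenvalues)


# Eigenvalues of the reduced correlation matrix (principal factors, Harman data).
eigenvalues = [2.73430, 1.71607, 0.03956, -0.02452, -0.07261]
proportions, cumulative = variance_proportions(eigenvalues)
n_keep = factors_needed(eigenvalues, target=0.90)
print(n_keep, [round(share, 4) for share in cumulative])
```

Because the reduced correlation matrix can have negative eigenvalues, the cumulative proportion can rise above one before falling back to exactly one at the last factor.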

    Scree Plot

    Methods of Rotation

  • Orthogonal Rotations
    1. Varimax*
    2. Varimax via the GPF algorithm
    3. Quartimax
    4. Orthomax
    5. Equamax
    6. Parsimax
    7. Minimum entropy
    8. Comrey's tandem 1
    9. Comrey's tandem 2
  • Oblique Rotations
    1. Promax*
    2. Oblimin
    3. Oblimax
    4. Quartimin
    5. Biquartimin
    6. Crawford-Ferguson
    7. Bentler's invariant pattern simplicity
    8. Binormamin
    9. Maxplane
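
One property shared by every orthogonal rotation listed above is that it leaves the communalities unchanged, because the loading matrix is only post-multiplied by an orthogonal matrix T. The Python sketch below illustrates this for two factors, using the unrotated principal-axis loadings from the Harman example later in these notes. The angle of 38.5 degrees is an assumption chosen by eye to land near the varimax output; it is not the angle Stata computed, and `rotate_two_factors` is a hypothetical helper.

```python
import math

# Unrotated loadings: pop, medsch, employ, profser, medhouse.
UNROTATED = [
    (0.62533, 0.76621),
    (0.71370, -0.55515),
    (0.71447, 0.67936),
    (0.87899, -0.15846),
    (0.74215, -0.57806),
]


def rotate_two_factors(loadings, degrees):
    """Post-multiply a two-factor loading matrix by T = [[c, s], [-s, c]]."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(a1 * c - a2 * s, a1 * s + a2 * c) for a1, a2 in loadings]


def communalities(loadings):
    """Row sums of squared loadings (valid for orthogonal solutions)."""
    return [a1 * a1 + a2 * a2 for a1, a2 in loadings]


rotated = rotate_two_factors(UNROTATED, 38.5)
before = communalities(UNROTATED)
after = communalities(rotated)
for (r1, r2), h_before, h_after in zip(rotated, before, after):
    print(f"{r1:8.4f} {r2:8.4f}   h2 before {h_before:.5f}   after {h_after:.5f}")
```

Oblique rotations do not have this simple property: once the factors correlate, the communality also depends on the factor correlation matrix.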

    Rotating Example

    Unrotated Factor Solution

    Orthogonal Factor Solution

    Oblique Factor Solution

    The following example uses data for five socio-economic variables for 12 different locations. The variables are total population, median schooling, total employed, misc. professional services, and median housing value. The data are from Harman (1976).

    Sample Size for Factor Analysis

    There are a number of different guidelines given in the literature as to the appropriate sample size needed for factor analysis. I was taught that you needed at least 10 times as many observations as variables with a minimum of 200 observations. Pedhazur & Schmelkin (1991) suggest at least 50 observations per factor. Guadagnoli and Velicer (1988) have suggested a minimum sample size of 100 to 200 observations. Tabachnick & Fidell (1996) recommend at least 300 cases. And Comrey and Lee (1992) give the following guide for sample sizes: 50 as very poor, 100 as poor, 200 as fair, 300 as good, 500 as very good, and 1,000 as excellent.

    Just remember, as with all statistical rules of thumb, your mileage may vary.

    Principal Axis Factor Analysis

    use http://www.gseis.ucla.edu/courses/data/harman1, clear
    
    factor pop medsch employ profser medhouse, pf fac(2)
    (obs=12)
    
                (principal factors; 2 factors retained)
      Factor     Eigenvalue     Difference    Proportion    Cumulative
    ------------------------------------------------------------------
         1        2.73430         1.01823      0.6225         0.6225
         2        1.71607         1.67651      0.3907         1.0131
         3        0.03956         0.06409      0.0090         1.0221
         4       -0.02452         0.04808     -0.0056         1.0165
         5       -0.07261               .     -0.0165         1.0000
    
                Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |   0.62533    0.76621    0.02189
       medsch |   0.71370   -0.55515    0.18244
       employ |   0.71447    0.67936    0.02800
      profser |   0.87899   -0.15846    0.20226
     medhouse |   0.74215   -0.57806    0.11505
    
    /* communality = 1 - uniqueness; e(Psi) holds the uniquenesses */
    mat psi = e(Psi)'
    mat com = J(rowsof(psi),1,1)   /* column vector of ones */
    mat com = com - psi
    mat colnames com=communalities
    mat list com
    
    com[5,1]
              communalities
         pop      .97811334
      medsch      .81756393
      employ      .97199928
     profser      .79774303
    medhouse      .88495002
    
    
    rotate
    
                (varimax rotation)
                Rotated Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |   0.01319    0.98891    0.02189
       medsch |   0.90415    0.00911    0.18244
       employ |   0.13701    0.97633    0.02800
      profser |   0.78689    0.42255    0.20226
     medhouse |   0.94068    0.00887    0.11505
    
    rotate, promax(2)
    
                (promax rotation)
                   Rotated Factor Loadings
        Variable |      1          2    Uniqueness
    -------------+--------------------------------
             pop |  -0.06530    0.99749    0.02189
          medsch |   0.91274   -0.06808    0.18244
          employ |   0.06080    0.97421    0.02800
         profser |   0.76140    0.35944    0.20226
        medhouse |   0.94966   -0.07145    0.11505
    
    estat common
    
    Correlation matrix of the promax(2) rotated common factors
    
        ----------------------------------
             Factors |  Factor1   Factor2 
        -------------+--------------------
             Factor1 |        1           
             Factor2 |    .1623         1 
        ----------------------------------
    
    
    rotate, promax(3)
    
                (promax rotation)
                Rotated Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |  -0.09090    1.00505    0.02189
       medsch |   0.92105   -0.10087    0.18244
       employ |   0.03670    0.97716    0.02800
      profser |   0.75785    0.33428    0.20226
     medhouse |   0.95833   -0.10556    0.11505
    
    estat common
    
    Correlation matrix of the promax(3) rotated common factors
    
        ----------------------------------
             Factors |  Factor1   Factor2 
        -------------+--------------------
             Factor1 |        1           
             Factor2 |    .2204         1 
        ----------------------------------
    
    
    rotate, promax(4)
    
                (promax rotation)
                   Rotated Factor Loadings
        Variable |      1          2    Uniqueness
    -------------+--------------------------------
             pop |  -0.10777    1.01050    0.02189
          medsch |   0.92625   -0.11581    0.18244
          employ |   0.02078    0.98049    0.02800
         profser |   0.75527    0.32365    0.20226
        medhouse |   0.96375   -0.12112    0.11505
    
    estat common
    
    Correlation matrix of the promax(4) rotated common factors
    
        ----------------------------------
             Factors |  Factor1   Factor2 
        -------------+--------------------
             Factor1 |        1           
             Factor2 |    .2507         1 
        ----------------------------------
    

    The correlation between the factors increases as the promax power increases. promax(1) is the same as varimax. promax(3) is the default. Powers between 2 and 4 are recommended and non-integer powers can be used. You can try several different powers to get the solution that is most interpretable. In this example, the interpretations are all about the same.

    Remember, with oblique rotations you can get loadings greater than one.
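
A pattern loading above one is not a correlation above one. The correlations between variables and factors are in the structure matrix, S = AΦ. The Python sketch below works this out for the promax(3) principal-axis solution shown above, taking the pattern loadings and the factor correlation of .2204 from that output; the helper names are illustrative only.

```python
# promax(3) pattern loadings: pop, medsch, employ, profser, medhouse.
PATTERN = [
    (-0.09090, 1.00505),
    (0.92105, -0.10087),
    (0.03670, 0.97716),
    (0.75785, 0.33428),
    (0.95833, -0.10556),
]
FACTOR_CORRELATION = 0.2204  # from "estat common" after rotate, promax(3)


def structure_matrix(pattern, r):
    """S = A Phi for two factors whose correlation is r."""
    return [(a1 + a2 * r, a1 * r + a2) for a1, a2 in pattern]


def oblique_communality(row, r):
    """Diagonal element of A Phi A' for one variable."""
    a1, a2 = row
    return a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * r


structure = structure_matrix(PATTERN, FACTOR_CORRELATION)
for (a1, a2), (s1, s2) in zip(PATTERN, structure):
    h2 = oblique_communality((a1, a2), FACTOR_CORRELATION)
    print(f"pattern {a1:8.4f} {a2:8.4f}   structure {s1:7.4f} {s2:7.4f}   h2 {h2:.4f}")
```

For pop the pattern loading on the second factor is 1.00505, yet the corresponding structure loading stays below one and the communality agrees with the value of about .978 reported for the unrotated solution.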

    Iterated Principal Axis Factor Analysis

    factor pop medsch employ profser medhouse, ipf fac(2)
    (obs=12)
    
                (iterated principal factors; 2 factors retained)
      Factor     Eigenvalue     Difference    Proportion    Cumulative
    ------------------------------------------------------------------
         1        2.75653         1.01187      0.6124         0.6124
         2        1.74466         1.71387      0.3876         1.0000
         3        0.03079         0.03118      0.0068         1.0068
         4       -0.00039         0.03002     -0.0001         1.0068
         5       -0.03041               .     -0.0068         1.0000
    
                Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |   0.63002    0.79452   -0.02819
       medsch |   0.70064   -0.52408    0.23444
       employ |   0.69731    0.67102    0.06349
      profser |   0.88077   -0.14704    0.20262
     medhouse |   0.77891   -0.60568    0.02645
    
    mat psi = e(Psi)'
    mat com = J(rowsof(psi),1,1)
    mat com = com - psi
    mat colnames com=communalities
    mat list com
    
    com[5,1]
              communalities
         pop      1.0281865
      medsch      .76556374
      employ      .93651122
     profser       .7973836
    medhouse      .97355023
    
    
    rotate
    
                (varimax rotation)
                Rotated Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |   0.01056    1.01394   -0.02819
       medsch |   0.87483    0.01557    0.23444
       employ |   0.13944    0.95764    0.06349
      profser |   0.78595    0.42387    0.20262
     medhouse |   0.98669   -0.00090    0.02645
    
    rotate, promax(3)
    
                (promax rotation)
                Rotated Factor Loadings
     Variable |      1          2    Uniqueness
    ----------+--------------------------------
          pop |  -0.09906    1.03143   -0.02819
       medsch |   0.89062   -0.09031    0.23444
       employ |   0.03850    0.95844    0.06349
      profser |   0.75574    0.33634    0.20262
     medhouse |   1.00650   -0.12066    0.02645
    
    estat common
    
    Correlation matrix of the promax(3) rotated common factors
    
        ----------------------------------
             Factors |  Factor1   Factor2 
        -------------+--------------------
             Factor1 |        1           
             Factor2 |    .2225         1 
        ----------------------------------
    

    Maximum Likelihood Factor Analysis

    factor pop medsch employ profser medhouse, ml fac(2)
    (obs=12)
    
    Factor analysis/correlation                        Number of obs    =       12
        Method: maximum likelihood                     Retained factors =        2
        Rotation: (unrotated)                          Number of params =        9
                                                       Schwarz's BIC    =  26.0449
        Log likelihood =  -1.84039                     (Akaike's) AIC   =  21.6808
    
        Beware: solution is a Heywood case
                (i.e., invalid or boundary values of uniqueness)
    
        --------------------------------------------------------------------------
             Factor  |   Eigenvalue   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      2.13887     -0.22952            0.4745       0.4745
            Factor2  |      2.36839            .            0.5255       1.0000
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(10) =   60.63 Prob>chi2 = 0.0000
        LR test:   2 factors vs. saturated:  chi2(1)  =    2.50 Prob>chi2 = 0.1135
        (tests formally not valid because a Heywood case was encountered)
    
    Factor loadings (pattern matrix) and unique variances
    
        -------------------------------------------------
            Variable |  Factor1   Factor2 |   Uniqueness 
        -------------+--------------------+--------------
                 pop |   1.0000   -0.0000 |      0.0000  
              medsch |   0.0098    0.9000 |      0.1900  
              employ |   0.9725    0.1179 |      0.0404  
             profser |   0.4389    0.7892 |      0.1844  
            medhouse |   0.0224    0.9600 |      0.0779  
        -------------------------------------------------
    
    faform  /* Available from ATS via the Internet */
    
    Factor Loadings in Canonical Form
    
                     1         2
         pop   0.62151   0.78340
      medsch   0.71109  -0.55170
      employ   0.69679   0.68853
     profser   0.89106  -0.14672
    medhouse   0.76602  -0.57911
    
    mat psi = e(Psi)'
    mat com = J(rowsof(psi),1,1)
    mat com = com - psi
    mat colnames com=communalities
    mat list com
    
    com[5,1]
              communalities
         pop      .99999969
      medsch      .81003767
      employ      .95956448
     profser      .81555395
    medhouse      .92206406
    
    
    rotate
    
    Factor analysis/correlation                        Number of obs    =       12
        Method: maximum likelihood                     Retained factors =        2
        Rotation: orthogonal varimax (Horst off)       Number of params =        9
                                                       Schwarz's BIC    =  26.0449
        Log likelihood =  -1.84039                     (Akaike's) AIC   =  21.6808
    
        Beware: solution is a Heywood case
                (i.e., invalid or boundary values of uniqueness)
    
        --------------------------------------------------------------------------
             Factor  |     Variance   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      2.38260      0.25793            0.5286       0.5286
            Factor2  |      2.12467            .            0.4714       1.0000
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(10) =   60.63 Prob>chi2 = 0.0000
        LR test:   2 factors vs. saturated:  chi2(1)  =    2.50 Prob>chi2 = 0.1135
        (tests formally not valid because a Heywood case was encountered)
    
    Rotated factor loadings (pattern matrix) and unique variances
    
        -------------------------------------------------
            Variable |  Factor1   Factor2 |   Uniqueness 
        -------------+--------------------+--------------
                 pop |   0.0145    0.9999 |      0.0000  
              medsch |   0.9000   -0.0033 |      0.1900  
              employ |   0.1320    0.9706 |      0.0404  
             profser |   0.7955    0.4274 |      0.1844  
            medhouse |   0.9602    0.0085 |      0.0779  
        -------------------------------------------------
    
    Factor rotation matrix
    
        --------------------------------
                     | Factor1  Factor2 
        -------------+------------------
             Factor1 |  0.0145   0.9999 
             Factor2 |  0.9999  -0.0145 
        --------------------------------
    
    rotate, promax(3)
    
    Factor analysis/correlation                        Number of obs    =       12
        Method: maximum likelihood                     Retained factors =        2
        Rotation: oblique promax (Horst off)           Number of params =        9
                                                       Schwarz's BIC    =  26.0449
        Log likelihood =  -1.84039                     (Akaike's) AIC   =  21.6808
    
        Beware: solution is a Heywood case
                (i.e., invalid or boundary values of uniqueness)
    
        --------------------------------------------------------------------------
             Factor  |     Variance   Proportion    Rotated factors are correlated
        -------------+------------------------------------------------------------
            Factor1  |      2.49044       0.5525
            Factor2  |      2.22530       0.4937
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(10) =   60.63 Prob>chi2 = 0.0000
        LR test:   2 factors vs. saturated:  chi2(1)  =    2.50 Prob>chi2 = 0.1135
        (tests formally not valid because a Heywood case was encountered)
    
    Rotated factor loadings (pattern matrix) and unique variances
    
        -------------------------------------------------
            Variable |  Factor1   Factor2 |   Uniqueness 
        -------------+--------------------+--------------
                 pop |  -0.0886    1.0153 |      0.0000  
              medsch |   0.9171   -0.1091 |      0.1900  
              employ |   0.0341    0.9717 |      0.0404  
             profser |   0.7662    0.3412 |      0.1844  
            medhouse |   0.9772   -0.1042 |      0.0779  
        -------------------------------------------------
    
    Factor rotation matrix
    
        --------------------------------
                     | Factor1  Factor2 
        -------------+------------------
             Factor1 |  0.1292   0.9963 
             Factor2 |  0.9916   0.0865 
        --------------------------------
    
    estat common
    
    Correlation matrix of the promax(3) rotated common factors
    
        ----------------------------------
             Factors |  Factor1   Factor2 
        -------------+--------------------
             Factor1 |        1           
             Factor2 |    .2145         1 
        ----------------------------------
     

    How to Do It

  • Create the reduced correlation matrix, R1, by replacing the diagonal elements of the correlation matrix, R, with SMCs (squared multiple correlations).
  • SMCs are obtained as follows: SMC_j = 1 - 1/r^jj, where each r^jj is the jth diagonal element of R^-1, the inverse of R.
  • Now do the same as in Principal Components Analysis using R1 instead of R.
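
The first two steps can be sketched in Python as follows. This assumes the usual identity that the squared multiple correlation of variable j with the remaining variables equals one minus the reciprocal of the jth diagonal element of the inverse correlation matrix. The three-variable matrix is just the meals, ell and avg_ed corner of the api99g correlations shown later in these notes, used here for illustration; `invert`, `smc` and `reduced_matrix` are hypothetical helper names.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def smc(r):
    """Squared multiple correlation of each variable with all the others."""
    r_inv = invert(r)
    return [1.0 - 1.0 / r_inv[j][j] for j in range(len(r))]


def reduced_matrix(r):
    """Copy of r with SMCs in place of the ones on the diagonal."""
    h2 = smc(r)
    return [[h2[i] if i == j else r[i][j] for j in range(len(r))]
            for i in range(len(r))]


# meals, ell, avg_ed correlations (a subset chosen for illustration).
R = [
    [1.0, 0.7716, -0.8392],
    [0.7716, 1.0, -0.6818],
    [-0.8392, -0.6818, 1.0],
]
R1 = reduced_matrix(R)
for row in R1:
    print(" ".join(f"{value:8.4f}" for value in row))
```

The eigenvalues and eigenvectors of R1 then give the principal-axis solution, in the same way that those of R give the principal components.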

    Factor Scores

  • In common-factor analysis the scores on the common factors are estimated rather than determined from the scores on the observed variables.
  • It is mathematically impossible to determine uniquely or exactly the common-factor scores even if the population correlations are known.
  • This is known as factor indeterminacy.

  • Factor score estimation is done by a variation of the multiple regression procedure, F = Z Rzz^-1 A Rxx, where
    Z -> Standard scores
    A -> Factor pattern matrix
    Rzz -> Correlations among the variables
    Rxx -> Correlations among the factors
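
A minimal Python sketch of the regression approach follows, assuming the common form in which the estimated scores are Z times the weight matrix Rzz^-1 A Rxx. Everything numeric here is hypothetical: a single factor with loadings .8 and .6 on two variables (so the implied correlation between them is .48), and three made-up rows of standard scores.

```python
def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


# Hypothetical one-factor example.
r_zz = [[1.0, 0.48], [0.48, 1.0]]   # correlations among the variables
a = [[0.8], [0.6]]                  # factor pattern matrix (2 variables x 1 factor)
r_xx = [[1.0]]                      # correlations among the factors

# Regression weights applied to the standard scores.
weights = matmul(matmul(inverse_2x2(r_zz), a), r_xx)

z = [[1.0, 0.5], [-0.5, -1.0], [0.0, 0.0]]  # made-up standard scores
scores = matmul(z, weights)
for row in scores:
    print(f"{row[0]:8.4f}")
```

The variable with the larger loading receives the larger weight, and a case at the mean on every variable gets an estimated factor score of zero.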

    Types of Factor Analysis

  • R Factor Analysis
  • Q Factor Analysis
  • P Factor Analysis
  • O Factor Analysis
  • S Factor Analysis
  • T Factor Analysis

    Stata Example

    Here is an example using the api99g dataset.

    use http://www.gseis.ucla.edu/courses/data/api99g, clear
    
    keep if stype==1 /* use only elementary schools */
    (1773 observations deleted)
    
    summarize meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll
    
        Variable |     Obs        Mean   Std. Dev.       Min        Max
    -------------+-----------------------------------------------------
           meals |    4421    51.88102   31.07313          0        100
             ell |    4421    25.19204   22.91157          0         95
          yr_rnd |    4421    1.178919   .3833277          1          2
          acs_k3 |    4359    19.29571   1.539583         12         31
          acs_46 |    4294    28.90452    3.21889         14         50
          avg_ed |    4257    2.749298   .7542556          1          5
            full |    4420    87.86357   13.35186         13        100
          enroll |    4397    426.9616   175.8747        101       1570
    
    univar meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll
    
                                            -------------- Quantiles --------------
    Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
    -------------------------------------------------------------------------------
       meals    4421    51.88    31.07     0.00    24.00    53.00    79.00   100.00
         ell    4421    25.19    22.91     0.00     6.00    18.00    40.00    95.00
      yr_rnd    4421     1.18     0.38     1.00     1.00     1.00     1.00     2.00
      acs_k3    4359    19.30     1.54    12.00    19.00    19.00    20.00    31.00
      acs_46    4294    28.90     3.22    14.00    27.00    29.00    31.00    50.00
      avg_ed    4257     2.75     0.75     1.00     2.17     2.71     3.26     5.00
        full    4420    87.86    13.35    13.00    81.00    92.00   100.00   100.00
      enroll    4397   426.96   175.87   101.00   303.00   403.00   523.00  1570.00
    -------------------------------------------------------------------------------
    
    corr meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll
    (obs=4059)
    
                 |    meals      ell   yr_rnd   acs_k3   acs_46   avg_ed     full   enroll
    -------------+------------------------------------------------------------------------
           meals |   1.0000
             ell |   0.7716   1.0000
          yr_rnd |   0.3027   0.3158   1.0000
          acs_k3 |  -0.0251   0.0275   0.0016   1.0000
          acs_46 |  -0.0274   0.0077   0.0522   0.2788   1.0000
          avg_ed |  -0.8392  -0.6818  -0.2842  -0.0193   0.0288   1.0000
            full |  -0.5145  -0.5146  -0.2592   0.0344  -0.0304   0.4036   1.0000
          enroll |   0.1984   0.3092   0.5125   0.1374   0.2017  -0.1645  -0.2696   1.0000
    
    factor meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll, pf
    (obs=4059)
    
    Factor analysis/correlation                        Number of obs    =     4059
        Method: principal factors                      Retained factors =        4
        Rotation: (unrotated)                          Number of params =       26
    
        --------------------------------------------------------------------------
             Factor  |   Eigenvalue   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      2.82884      2.08409            0.8482       0.8482
            Factor2  |      0.74475      0.44056            0.2233       1.0715
            Factor3  |      0.30419      0.23495            0.0912       1.1627
            Factor4  |      0.06924      0.14128            0.0208       1.1834
            Factor5  |     -0.07204      0.04088           -0.0216       1.1618
            Factor6  |     -0.11292      0.07997           -0.0339       1.1280
            Factor7  |     -0.19289      0.04100           -0.0578       1.0701
            Factor8  |     -0.23389            .           -0.0701       1.0000
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(28) = 1.3e+04 Prob>chi2 = 0.0000
    
    Factor loadings (pattern matrix) and unique variances
    
        ---------------------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
        -------------+----------------------------------------+--------------
               meals |   0.8969   -0.2277    0.0614    0.0141 |      0.1398  
                 ell |   0.8240   -0.0428    0.0292   -0.0927 |      0.3097  
              yr_rnd |   0.4422    0.3837   -0.2378    0.0969 |      0.5913  
              acs_k3 |   0.0214    0.2651    0.3521    0.0123 |      0.8051  
              acs_46 |   0.0303    0.3445    0.2947   -0.0304 |      0.7926  
              avg_ed |  -0.8195    0.2249   -0.1193   -0.1402 |      0.2440  
                full |  -0.5728   -0.0453    0.0832    0.1734 |      0.6328  
              enroll |   0.3856    0.5497   -0.1052    0.0157 |      0.5378  
        ---------------------------------------------------------------------
    
    /*  don't display small loadings */
    
    factor meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll, pf blanks(.35)
    (obs=4059)
    
    Factor analysis/correlation                        Number of obs    =     4059
        Method: principal factors                      Retained factors =        4
        Rotation: (unrotated)                          Number of params =       26
    
        --------------------------------------------------------------------------
             Factor  |   Eigenvalue   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      2.82884      2.08409            0.8482       0.8482
            Factor2  |      0.74475      0.44056            0.2233       1.0715
            Factor3  |      0.30419      0.23495            0.0912       1.1627
            Factor4  |      0.06924      0.14128            0.0208       1.1834
            Factor5  |     -0.07204      0.04088           -0.0216       1.1618
            Factor6  |     -0.11292      0.07997           -0.0339       1.1280
            Factor7  |     -0.19289      0.04100           -0.0578       1.0701
            Factor8  |     -0.23389            .           -0.0701       1.0000
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(28) = 1.3e+04 Prob>chi2 = 0.0000
    
    Factor loadings (pattern matrix) and unique variances
    
        ---------------------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
        -------------+----------------------------------------+--------------
               meals |   0.8969                               |      0.1398  
                 ell |   0.8240                               |      0.3097  
              yr_rnd |   0.4422    0.3837                     |      0.5913  
              acs_k3 |                       0.3521           |      0.8051  
              acs_46 |                                        |      0.7926  
              avg_ed |  -0.8195                               |      0.2440  
                full |  -0.5728                               |      0.6328  
              enroll |   0.3856    0.5497                     |      0.5378  
        ---------------------------------------------------------------------
        (blanks represent abs(loading)<.35)
        
    /* parallel analysis for eigenvalues 
       compare the eigenvalues of the factor analysis 
       with eigenvalues of randomly generated variables 
       to assist in determining the number of factors. 
    */
    
    fapara, seed(123456789)   /* Available from ATS via the Internet */
    (obs=4421)
    
    Parallel Analysis for Eigenvalues
    
          Eigen   Random      Dif
    c1   2.8288   0.0646   2.7642
    c2   0.7448   0.0407   0.7041
    c3   0.3042   0.0257   0.2785
    c4   0.0692   0.0202   0.0490
    c5  -0.0720  -0.0167  -0.0553
    c6  -0.1129  -0.0338  -0.0792
    c7  -0.1929  -0.0417  -0.1512
    c8  -0.2339  -0.0468  -0.1870
    
    rotate, varimax fact(3) blanks(.35)
    
    Factor analysis/correlation                        Number of obs    =     4059
        Method: principal factors                      Retained factors =        4
        Rotation: orthogonal varimax (Horst off)       Number of params =       26
    
        --------------------------------------------------------------------------
             Factor  |     Variance   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      2.57781      1.66607            0.7729       0.7729
            Factor2  |      0.91173      0.52349            0.2734       1.0463
            Factor3  |      0.38825      0.31901            0.1164       1.1627
            Factor4  |      0.06924            .            0.0208       1.1834
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(28) = 1.3e+04 Prob>chi2 = 0.0000
    
    Rotated factor loadings (pattern matrix) and unique variances
    
        ---------------------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
        -------------+----------------------------------------+--------------
               meals |   0.9226                               |      0.1398  
                 ell |   0.7920                               |      0.3097  
              yr_rnd |             0.5731                     |      0.5913  
              acs_k3 |                       0.4327           |      0.8051  
              acs_46 |                       0.4158           |      0.7926  
              avg_ed |  -0.8570                               |      0.2440  
                full |  -0.5128                               |      0.6328  
              enroll |             0.6391                     |      0.5378  
        ---------------------------------------------------------------------
        (blanks represent abs(loading)<.35)
    
    Factor rotation matrix
    
        --------------------------------------------------
                     | Factor1  Factor2  Factor3  Factor4 
        -------------+------------------------------------
             Factor1 |  0.9400   0.3410   0.0119   0.0000 
             Factor2 | -0.3117   0.8443   0.4359   0.0000 
             Factor3 |  0.1386  -0.4134   0.8999   0.0000 
             Factor4 |  0.0000   0.0000   0.0000   1.0000 
        --------------------------------------------------
    
    rotate, promax fact(3) blanks(.35)
    
    Factor analysis/correlation                        Number of obs    =     4059
        Method: principal factors                      Retained factors =        4
        Rotation: oblique promax (Horst off)           Number of params =       26
    
        --------------------------------------------------------------------------
             Factor  |     Variance   Proportion    Rotated factors are correlated
        -------------+------------------------------------------------------------
            Factor1  |      2.74173       0.8220
            Factor2  |      1.66111       0.4980
            Factor3  |      0.48782       0.1463
            Factor4  |      0.06924       0.0208
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(28) = 1.3e+04 Prob>chi2 = 0.0000
    
    Rotated factor loadings (pattern matrix) and unique variances
    
        ---------------------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness 
        -------------+----------------------------------------+--------------
               meals |   0.9492                               |      0.1398  
                 ell |   0.7535                               |      0.3097  
              yr_rnd |             0.6371                     |      0.5913  
              acs_k3 |                       0.4472           |      0.8051  
              acs_46 |                       0.4222           |      0.7926  
              avg_ed |  -0.9137                               |      0.2440  
                full |  -0.4183                               |      0.6328  
              enroll |             0.6794                     |      0.5378  
        ---------------------------------------------------------------------
        (blanks represent abs(loading)<.35)
    
    Factor rotation matrix
    
        --------------------------------------------------
                     | Factor1  Factor2  Factor3  Factor4 
        -------------+------------------------------------
             Factor1 |  0.9793   0.6754  -0.0530   0.0000 
             Factor2 | -0.1914   0.6827   0.6330   0.0000 
             Factor3 |  0.0652  -0.2789   0.7723   0.0000 
             Factor4 |  0.0000   0.0000   0.0000   1.0000 
        --------------------------------------------------
    
    predict f1 f2 f3
    (regression scoring assumed)
    
    Scoring coefficients (method = regression; based on promax(3) rotated factors)
    
        ------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3   Factor4 
        -------------+----------------------------------------
               meals |  0.52599   0.06864  -0.20729  -0.07627 
                 ell |  0.19849   0.20361   0.03846  -0.27115 
              yr_rnd |  0.02016   0.32112  -0.03006   0.11481 
              acs_k3 | -0.00472   0.02699   0.31607   0.00345 
              acs_46 | -0.00596   0.06858   0.31676  -0.02782 
              avg_ed | -0.23978   0.03342  -0.05150  -0.43295 
                full | -0.06266  -0.12827   0.04101   0.21073 
              enroll |  0.03506   0.37733   0.18618   0.04654 
        ------------------------------------------------------
    
    corr f1 f2 f3
    (obs=4059)
    
                 |       f1       f2       f3
    -------------+---------------------------
              f1 |   1.0000
              f2 |   0.6289   1.0000
              f3 |  -0.1909   0.2558   1.0000
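
    The three rotated factors can be interpreted as follows. Factor 1 seems to reflect socioeconomic variables: meals and ell load positively on it, while avg_ed and full load negatively. Factor 2 (yr_rnd and enroll) appears to be related to the size of the population in the school neighborhoods, while Factor 3 (acs_k3 and acs_46) is concerned with classroom size. Because promax is an oblique rotation, the factor scores are correlated; f1 and f2, for example, correlate 0.63.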
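
    With an oblique rotation the pattern loadings are no longer the correlations between variables and factors, and the factors themselves may correlate. The following is a minimal sketch of how one might inspect both, assuming the promax rotation shown above is still the active rotation in memory. It uses only the estat subcommands listed for the factor command in this handout.

```stata
* Correlation matrix of the promax-rotated common factors (Phi)
estat common

* Factor structure matrix: correlations between variables and common factors
estat structure
```

    The correlations reported by estat common will generally differ from those produced by corr f1 f2 f3, since the latter are correlations among regression-scored estimates of the factors rather than among the factors themselves.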

    Stata 9 allows for the following methods for initial factor extraction:

          pf      principal-axis factor analysis; the default
          pcf     principal-components factor analysis
          ipf     iterated principal-axis factor analysis
          ml      maximum-likelihood factor analysis
    The following options are allowed with the factor command:
          factors(#)     maximum number of factors to be retained
          mineigen(#)    minimum value of eigenvalues to be retained
          citerate(#)    communality re-estimation iterations (ipf only)
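
    As an illustration of how the extraction methods and options combine, here is a hedged sketch. It assumes the school dataset used earlier in this handout is in memory. The variable names are taken from the loading tables above, and the particular numbers of factors are arbitrary choices for illustration, not recommendations.

```stata
* Assumes the dataset analyzed earlier in this handout is already loaded.

* Iterated principal-axis factoring, retaining at most 3 factors
factor meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll, ipf factors(3)

* Principal-components factoring, retaining eigenvalues with minimum value 1
factor meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll, pcf mineigen(1)

* Maximum-likelihood factoring with 2 factors
factor meals ell yr_rnd acs_k3 acs_46 avg_ed full enroll, ml factors(2)
```

    Each call replaces the previous estimation results, so any rotate or predict command that follows applies to the most recent factor fit.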
    The factor command has the following post-estimation procedures:
          estat anti           anti-image correlation and covariance matrices
          estat common         correlation matrix of the common factors
          estat factors        AIC and BIC model selection criteria for different numbers of
                                 factors
          estat kmo            Kaiser-Meyer-Olkin measure of sampling adequacy
          estat residuals      matrix of correlation residuals
          estat rotatecompare  compare rotated and unrotated loadings
          estat smc            squared multiple correlations between each variable and the rest
          estat structure      correlations between variables and common factors
          estat summarize      estimation sample summary
          loadingplot          plot factor loadings
          rotate               rotate factor loadings
          scoreplot            plot score variables
          screeplot            plot eigenvalues
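
    A short sketch of how a few of these procedures might be used to judge whether the data are suitable for factoring and how many factors to keep. It assumes a factor command, such as the principal-factor fit shown earlier, has just been run. The selection of procedures is illustrative only.

```stata
* Kaiser-Meyer-Olkin measure of sampling adequacy
estat kmo

* Squared multiple correlation of each variable with all the others
estat smc

* Plot of the eigenvalues, to help judge how many factors to retain
screeplot

* Matrix of correlation residuals
estat residuals
```

    The squared multiple correlations are of particular interest with the pf method, which uses them as the communality estimates placed in the diagonal of the correlation matrix.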
    The following factor rotation procedures are available in Stata 9 using the rotate command:
          varimax          varimax (orthogonal only); the default
          vgpf             varimax via the GPF algorithm (orthogonal only)
          quartimax        quartimax (orthogonal only)
          equamax          equamax (orthogonal only)
          parsimax         parsimax (orthogonal only)
          entropy          minimum entropy (orthogonal only)
          tandem1          Comrey's tandem 1 principle (orthogonal only)
          tandem2          Comrey's tandem 2 principle (orthogonal only)
    
          promax[(#)]      promax power # (implies oblique); default is promax(3)
          oblimin[(#)]     oblimin with gamma=#; default is oblimin(0)
          cf(#)            Crawford-Ferguson family with kappa=#, 0<=#<=1
          bentler          Bentler's invariant pattern simplicity
          oblimax          oblimax
          quartimin        quartimin
          target(Tg)       rotate towards matrix Tg
          partial(Tg W)    rotate towards matrix Tg, weighted by matrix W
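
    To make the orthogonal/oblique distinction concrete, here is a minimal sketch that applies one rotation of each kind to the same factor solution. It assumes a factor command has already been run. The blanks(.35) display option is borrowed from the rotate command used earlier in this handout.

```stata
* Orthogonal varimax rotation (the default criterion)
rotate, varimax blanks(.35)

* Oblique oblimin rotation; oblique has to be requested explicitly
rotate, oblimin(0) oblique blanks(.35)

* Compare the rotated loadings with the unrotated loadings
estat rotatecompare
```

    Only the most recent rotation is kept, so estat rotatecompare here compares the oblimin solution with the unrotated one.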
    The rotate command has the following options:
          orthogonal         restrict to orthogonal rotations; default, except with promax()
          oblique            allow oblique rotations
          rotation_methods   rotation criterion
          normalize          rotate Horst normalized matrix
          horst              synonym for normalize
          factors(#)         rotate # factors or components; default all
          components(#)      synonym for factors()
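
    As a sketch of how these options are combined with a rotation criterion, the command below rotates only the first three factors and turns on Horst normalization. It assumes a factor solution with at least three retained factors is in memory, as in the four-factor fit shown earlier.

```stata
* Varimax on the first 3 factors, applied to the Horst normalized loadings
rotate, varimax normalize factors(3) blanks(.35)
```

    The rotation outputs earlier in this handout report "Horst off" in their headers because normalize was not specified there.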


    UCLA Department of Education

    Phil Ender, 16nov05, 15oct05, 29Jan98