Parallel Analysis on PCA

In [1]:
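require 'statsample'

samples    = 150  # number of cases (rows)
variables  = 30   # number of observed variables (columns)
iterations = 50   # random data sets generated by the parallel analysis

Statsample::Analysis.store(Statsample::Factor::ParallelAnalysis) do
  Daru.lazy_update = true

  # Three latent factors, plus a generator of normal noise.
  rng = Distribution::Normal.rng
  f1  = rnorm(samples)
  f2  = rnorm(samples)
  f3  = rnorm(samples)

  vectors = {}

  # Each observed variable mixes the three factors with weights that depend
  # on i; only the f3 term is multiplied by a fresh normal draw.
  variables.times do |i|
    name = "v#{i}".to_sym
    vectors[name] = Daru::Vector.new(
      samples.times.collect { |nv| f1[nv]*i + (f2[nv]*(15-i)) + ((f3[nv]*(30-i))*1.5)*rng.call }
    )
    vectors[name].rename "Vector #{i}"
  end

  ds = Daru::DataFrame.new(vectors)

  # Parallel analysis on the data set, and a plain PCA on its correlation matrix.
  pa  = Statsample::Factor::ParallelAnalysis.new(ds, :iterations => iterations, :debug => true)
  pca = pca(cor(ds))

  echo "There are 3 real factors on data"
  summary pca
  echo "Traditional Kaiser criterion (k>1) returns #{pca.m} factors"
  summary pa
  echo "Parallel Analysis returns #{pa.number_of_factors} factors to preserve"
  Daru.lazy_update = false
end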

Statsample::Analysis.run_batch
Parallel Analysis: Iteration 0
Parallel Analysis: Iteration 1
Parallel Analysis: Iteration 2
Parallel Analysis: Iteration 3
Parallel Analysis: Iteration 4
Parallel Analysis: Iteration 5
Parallel Analysis: Iteration 6
Parallel Analysis: Iteration 7
Parallel Analysis: Iteration 8
Parallel Analysis: Iteration 9
Parallel Analysis: Iteration 10
Parallel Analysis: Iteration 11
Parallel Analysis: Iteration 12
Parallel Analysis: Iteration 13
Parallel Analysis: Iteration 14
Parallel Analysis: Iteration 15
Parallel Analysis: Iteration 16
Parallel Analysis: Iteration 17
Parallel Analysis: Iteration 18
Parallel Analysis: Iteration 19
Parallel Analysis: Iteration 20
Parallel Analysis: Iteration 21
Parallel Analysis: Iteration 22
Parallel Analysis: Iteration 23
Parallel Analysis: Iteration 24
Parallel Analysis: Iteration 25
Parallel Analysis: Iteration 26
Parallel Analysis: Iteration 27
Parallel Analysis: Iteration 28
Parallel Analysis: Iteration 29
Parallel Analysis: Iteration 30
Parallel Analysis: Iteration 31
Parallel Analysis: Iteration 32
Parallel Analysis: Iteration 33
Parallel Analysis: Iteration 34
Parallel Analysis: Iteration 35
Parallel Analysis: Iteration 36
Parallel Analysis: Iteration 37
Parallel Analysis: Iteration 38
Parallel Analysis: Iteration 39
Parallel Analysis: Iteration 40
Parallel Analysis: Iteration 41
Parallel Analysis: Iteration 42
Parallel Analysis: Iteration 43
Parallel Analysis: Iteration 44
Parallel Analysis: Iteration 45
Parallel Analysis: Iteration 46
Parallel Analysis: Iteration 47
Parallel Analysis: Iteration 48
Parallel Analysis: Iteration 49
Analysis 2016-03-26 02:42:37 +0000
= Statsample::Factor::ParallelAnalysis
  There are 3 real factors on data
  == Principal Component Analysis
    Number of factors: 8
    Communalities
+----------+---------+------------+--------+
| Variable | Initial | Extraction |   %    |
+----------+---------+------------+--------+
| v0       | 1.000   | 0.718      | 71.814 |
| v1       | 1.000   | 0.806      | 80.592 |
| v10      | 1.000   | 0.742      | 74.220 |
| v11      | 1.000   | 0.672      | 67.184 |
| v12      | 1.000   | 0.767      | 76.672 |
| v13      | 1.000   | 0.525      | 52.483 |
| v14      | 1.000   | 0.613      | 61.319 |
| v15      | 1.000   | 0.767      | 76.689 |
| v16      | 1.000   | 0.580      | 58.006 |
| v17      | 1.000   | 0.614      | 61.435 |
| v18      | 1.000   | 0.571      | 57.060 |
| v19      | 1.000   | 0.606      | 60.624 |
| v2       | 1.000   | 0.745      | 74.486 |
| v20      | 1.000   | 0.735      | 73.461 |
| v21      | 1.000   | 0.835      | 83.501 |
| v22      | 1.000   | 0.874      | 87.361 |
| v23      | 1.000   | 0.830      | 82.958 |
| v24      | 1.000   | 0.900      | 90.029 |
| v25      | 1.000   | 0.930      | 93.029 |
| v26      | 1.000   | 0.940      | 94.045 |
| v27      | 1.000   | 0.957      | 95.748 |
| v28      | 1.000   | 0.975      | 97.489 |
| v29      | 1.000   | 0.979      | 97.931 |
| v3       | 1.000   | 0.675      | 67.468 |
| v4       | 1.000   | 0.676      | 67.614 |
| v5       | 1.000   | 0.639      | 63.899 |
| v6       | 1.000   | 0.707      | 70.699 |
| v7       | 1.000   | 0.702      | 70.152 |
| v8       | 1.000   | 0.593      | 59.331 |
| v9       | 1.000   | 0.774      | 77.365 |
+----------+---------+------------+--------+

    Total Variance Explained
+--------------+---------+---------+---------+
|  Component   | E.Total |    %    | Cum. %  |
+--------------+---------+---------+---------+
| Component 1  | 11.635  | 38.784% | 38.784  |
| Component 2  | 2.228   | 7.425%  | 46.209  |
| Component 3  | 1.868   | 6.225%  | 52.434  |
| Component 4  | 1.781   | 5.936%  | 58.371  |
| Component 5  | 1.503   | 5.009%  | 63.380  |
| Component 6  | 1.275   | 4.250%  | 67.630  |
| Component 7  | 1.149   | 3.830%  | 71.460  |
| Component 8  | 1.009   | 3.362%  | 74.822  |
| Component 9  | 0.948   | 3.162%  | 77.984  |
| Component 10 | 0.813   | 2.709%  | 80.692  |
| Component 11 | 0.776   | 2.585%  | 83.278  |
| Component 12 | 0.688   | 2.292%  | 85.570  |
| Component 13 | 0.584   | 1.945%  | 87.515  |
| Component 14 | 0.516   | 1.719%  | 89.235  |
| Component 15 | 0.490   | 1.633%  | 90.868  |
| Component 16 | 0.454   | 1.512%  | 92.380  |
| Component 17 | 0.416   | 1.388%  | 93.768  |
| Component 18 | 0.345   | 1.149%  | 94.916  |
| Component 19 | 0.322   | 1.073%  | 95.990  |
| Component 20 | 0.276   | 0.919%  | 96.908  |
| Component 21 | 0.240   | 0.800%  | 97.708  |
| Component 22 | 0.206   | 0.685%  | 98.394  |
| Component 23 | 0.132   | 0.439%  | 98.832  |
| Component 24 | 0.098   | 0.327%  | 99.159  |
| Component 25 | 0.095   | 0.315%  | 99.475  |
| Component 26 | 0.064   | 0.215%  | 99.690  |
| Component 27 | 0.050   | 0.167%  | 99.857  |
| Component 28 | 0.030   | 0.100%  | 99.957  |
| Component 29 | 0.010   | 0.034%  | 99.990  |
| Component 30 | 0.003   | 0.010%  | 100.000 |
+--------------+---------+---------+---------+

    Component matrix
+-----+-------+-------+-------+-------+-------+-------+-------+-------+
|     | PC_1  | PC_2  | PC_3  | PC_4  | PC_5  | PC_6  | PC_7  | PC_8  |
+-----+-------+-------+-------+-------+-------+-------+-------+-------+
| v0  | .029  | .599  | .211  | -.365 | -.017 | -.128 | -.379 | .145  |
| v1  | -.011 | .431  | .139  | .104  | -.148 | .729  | -.163 | -.096 |
| v10 | -.309 | .211  | -.302 | -.367 | -.453 | .220  | .102  | .335  |
| v11 | -.386 | .373  | .258  | .394  | -.326 | -.091 | -.001 | .217  |
| v12 | -.351 | .063  | .544  | -.128 | .016  | .019  | .567  | -.070 |
| v13 | -.450 | .432  | .243  | .120  | .114  | -.130 | -.172 | -.046 |
| v14 | -.539 | -.069 | .166  | .273  | .024  | .395  | .018  | .242  |
| v15 | -.576 | -.032 | -.210 | .192  | -.494 | .202  | .215  | -.147 |
| v16 | -.546 | -.378 | .264  | -.110 | .074  | -.006 | .060  | .218  |
| v17 | -.627 | .258  | .005  | .151  | .093  | .131  | -.271 | -.181 |
| v18 | -.716 | -.171 | -.084 | .120  | .020  | .049  | -.050 | .045  |
| v19 | -.718 | .088  | -.063 | .054  | -.249 | .111  | .007  | .047  |
| v2  | -.064 | -.591 | .400  | .197  | -.086 | .126  | -.361 | -.199 |
| v20 | -.826 | -.133 | .017  | .069  | .132  | .099  | -.008 | -.053 |
| v21 | -.882 | -.147 | .001  | -.034 | .014  | .091  | -.141 | -.081 |
| v22 | -.904 | .033  | -.012 | -.147 | .040  | -.149 | -.059 | .078  |
| v23 | -.890 | .023  | .109  | -.112 | .085  | -.012 | .041  | -.056 |
| v24 | -.924 | .125  | -.011 | -.025 | .097  | -.089 | .101  | .054  |
| v25 | -.937 | .054  | -.084 | -.060 | .083  | -.176 | -.006 | -.025 |
| v26 | -.961 | -.054 | -.052 | -.028 | .076  | -.061 | .006  | -.022 |
| v27 | -.965 | -.016 | -.070 | .006  | .122  | -.048 | .016  | -.058 |
| v28 | -.970 | .015  | -.106 | -.009 | .126  | -.070 | .029  | -.012 |
| v29 | -.974 | -.007 | -.108 | .002  | .117  | -.057 | .004  | -.035 |
| v3  | -.036 | .422  | .209  | -.062 | -.345 | -.231 | .368  | -.374 |
| v4  | -.107 | -.546 | .428  | -.164 | -.320 | .069  | .046  | -.217 |
| v5  | .010  | .055  | .656  | .203  | .004  | -.067 | .015  | .400  |
| v6  | .078  | .131  | .503  | -.612 | .130  | .076  | -.140 | -.115 |
| v7  | -.091 | -.236 | .063  | .227  | -.528 | -.439 | -.230 | .240  |
| v8  | -.318 | .010  | -.048 | -.248 | -.445 | -.162 | -.337 | -.300 |
| v9  | -.203 | -.302 | -.115 | -.713 | -.142 | .164  | .011  | .269  |
+-----+-------+-------+-------+-------+-------+-------+-------+-------+

  Traditional Kaiser criterion (k>1) returns 8 factors
  == Parallel Analysis
    Bootstrap Method: random
    Uses SMC: No
    Correlation Matrix type : correlation_matrix
    Number of variables: 30
    Number of cases: 150
    Number of iterations: 50
    Number or factors to preserve: 4
    Eigenvalues
+----+-----------------+----------------------+--------+-----------+
| n  | data eigenvalue | generated eigenvalue |  p.95  | preserve? |
+----+-----------------+----------------------+--------+-----------+
| 1  | 11.6353         | 1.9397               | 2.0744 | Yes       |
| 2  | 2.2275          | 1.7961               | 1.8770 | Yes       |
| 3  | 1.8675          | 1.6885               | 1.7637 | Yes       |
| 4  | 1.7809          | 1.6032               | 1.6780 | Yes       |
| 5  | 1.5027          | 1.5281               | 1.5856 |           |
| 6  | 1.2750          | 1.4573               | 1.5253 |           |
| 7  | 1.1491          | 1.3892               | 1.4417 |           |
| 8  | 1.0086          | 1.3263               | 1.3981 |           |
| 9  | 0.9485          | 1.2711               | 1.3060 |           |
| 10 | 0.8126          | 1.2174               | 1.2501 |           |
| 11 | 0.7756          | 1.1585               | 1.2080 |           |
| 12 | 0.6877          | 1.1053               | 1.1429 |           |
| 13 | 0.5836          | 1.0553               | 1.0955 |           |
| 14 | 0.5158          | 1.0083               | 1.0468 |           |
| 15 | 0.4899          | 0.9635               | 1.0054 |           |
| 16 | 0.4537          | 0.9161               | 0.9573 |           |
| 17 | 0.4164          | 0.8739               | 0.9153 |           |
| 18 | 0.3446          | 0.8270               | 0.8561 |           |
| 19 | 0.3220          | 0.7909               | 0.8310 |           |
| 20 | 0.2756          | 0.7524               | 0.7944 |           |
| 21 | 0.2400          | 0.7086               | 0.7465 |           |
| 22 | 0.2056          | 0.6678               | 0.7227 |           |
| 23 | 0.1316          | 0.6309               | 0.6700 |           |
| 24 | 0.0982          | 0.5889               | 0.6263 |           |
| 25 | 0.0946          | 0.5534               | 0.5885 |           |
| 26 | 0.0645          | 0.5173               | 0.5562 |           |
| 27 | 0.0501          | 0.4795               | 0.5195 |           |
| 28 | 0.0300          | 0.4405               | 0.4791 |           |
| 29 | 0.0101          | 0.3978               | 0.4305 |           |
| 30 | 0.0029          | 0.3470               | 0.3938 |           |
+----+-----------------+----------------------+--------+-----------+
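
The "preserve?" column can be reproduced from the two eigenvalue columns next to it. The following is a minimal sketch, not statsample's own code. It copies the first ten rows of the table and assumes the retention rule is "keep leading components for as long as the data eigenvalue exceeds the p.95 value". The method names kaiser_count and parallel_count are hypothetical and exist only for this illustration.

```ruby
# Values copied from the first ten rows of the eigenvalue table.
data_eigenvalues = [11.6353, 2.2275, 1.8675, 1.7809, 1.5027,
                    1.2750, 1.1491, 1.0086, 0.9485, 0.8126]
p95_generated    = [2.0744, 1.8770, 1.7637, 1.6780, 1.5856,
                    1.5253, 1.4417, 1.3981, 1.3060, 1.2501]

# Kaiser criterion: count every component whose eigenvalue exceeds 1.
def kaiser_count(eigenvalues)
  eigenvalues.count { |ev| ev > 1.0 }
end

# Assumed parallel-analysis rule: walk the components in order and stop at
# the first one whose eigenvalue does not exceed its random-data threshold.
def parallel_count(eigenvalues, thresholds)
  eigenvalues.zip(thresholds).take_while { |ev, limit| ev > limit }.size
end

kaiser   = kaiser_count(data_eigenvalues)
parallel = parallel_count(data_eigenvalues, p95_generated)
puts "Kaiser: #{kaiser}, parallel analysis: #{parallel}"
# prints "Kaiser: 8, parallel analysis: 4"
```

Under that assumption the fifth component is the first to be dropped (1.5027 is below 1.5856), while the fixed cutoff of 1 used by the Kaiser criterion lets components 5 to 8 through.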

  Parallel Analysis returns 4 factors to preserve
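
The "generated eigenvalue" and "p.95" columns come from random data sets with the same shape as the real one. The sketch below shows one way such thresholds could be computed using only Ruby's standard library. It is an illustration under stated assumptions, not statsample's implementation: it assumes independent standard-normal variables, the eigenvalues of their correlation matrix, and a nearest-rank 95th percentile. All method names (standard_normal, correlation_matrix, sorted_eigenvalues, random_thresholds) are hypothetical, and the sizes are kept small so that it runs quickly.

```ruby
require 'matrix'

# One standard-normal draw via the Box-Muller transform.
def standard_normal(rng)
  Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2.0 * Math::PI * rng.rand)
end

# Pearson correlation matrix of a list of equally long columns.
def correlation_matrix(columns)
  n = columns.first.size
  centered = columns.map do |col|
    mean = col.sum / n
    col.map { |x| x - mean }
  end
  norms = centered.map { |col| Math.sqrt(col.sum { |x| x * x }) }
  k = columns.size
  rows = Array.new(k) { Array.new(k, 1.0) }
  k.times do |i|
    ((i + 1)...k).each do |j|
      r = centered[i].zip(centered[j]).sum { |a, b| a * b } / (norms[i] * norms[j])
      rows[i][j] = r
      rows[j][i] = r # mirror the value so the matrix is exactly symmetric
    end
  end
  Matrix.rows(rows)
end

# Eigenvalues of a symmetric matrix, largest first.
def sorted_eigenvalues(matrix)
  matrix.eigen.eigenvalues.sort.reverse
end

# For each component position, the nearest-rank 95th percentile of the
# eigenvalues observed across `iterations` random data sets.
def random_thresholds(cases, variables, iterations, rng)
  runs = Array.new(iterations) do
    columns = Array.new(variables) { Array.new(cases) { standard_normal(rng) } }
    sorted_eigenvalues(correlation_matrix(columns))
  end
  runs.transpose.map do |values|
    sorted = values.sort
    sorted[(0.95 * sorted.size).ceil - 1]
  end
end

thresholds = random_thresholds(60, 6, 20, Random.new(42))
puts thresholds.map { |t| t.round(4) }.inspect
```

Because the variables are uncorrelated in the population, any first threshold above 1 reflects sampling error alone. This is why a fixed cutoff of 1 tends to keep too many components at this sample size.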