Simplified calculation of principal components

The resolution of a set of $n$ tests or other variates into components $\gamma_i$, each of which accounts for the greatest possible portion $k_1, k_2, \ldots$ of the total variance of the tests unaccounted for by the previous components, has been dealt with by the author in a previous paper (2). Such "factors," on account of their analogy with the principal axes of a quadric, have been called principal components. The present paper describes a modification of the iterative scheme of calculating principal components there presented, in a fashion that materially accelerates convergence.

The application of the iterative process is not confined to statistics; it may also be used to obtain the magnitudes and orientations of the principal axes of a quadric or hyper-quadric in a manner which will ordinarily be far less laborious than those given in books on geometry. This is true whether the quadrics are ellipsoids or hyperboloids; the proof of convergence given in an earlier paper is applicable to all kinds of central quadrics. For hyperboloids some of the roots $k_i$ of the characteristic equation would be negative, while for ellipsoids all are positive. If in a statistical problem some of the roots should come out negative, this would indicate either an error in calculation, or that, if correlations corrected for attenuation had been used, the same type of inconsistency had crept in that sometimes causes such correlations to exceed unity.

Another method of calculating principal components has been discovered by Professor Truman L. Kelley, which involves less labor than the original iterative method, at least in the examples to which he has applied it (5). How it would compare with the present accelerated method is not clear, except that some experience at Columbia University has suggested that the method here set forth is the more efficient.
It is possible that Kelley's method is more suitable when all the characteristic roots are desired, but not the corresponding correlations of the variates with the components. The present method seems to the computers who have tried both to be superior when the components themselves, as well as their contributions to the total variance, are to be specified. The advantage of the present method is enhanced when, as will often be the case in dealing with numerous variates, not all the characteristic roots but only a few of the largest are required.
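
The iterative scheme itself is set out in the earlier paper (2) and is not reproduced in this passage. Purely as an illustration of how a few of the largest roots and their components might be obtained one at a time, the following is a minimal sketch of a plain power iteration with deflation. It is an assumption about the general form of such a scheme, not the accelerated method of the present paper. The function names, the starting vector, and the tolerance are hypothetical choices made for the example.

```python
import math


def mat_vec(a, v):
    """Multiply the square matrix a (a list of rows) by the vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def largest_root(a, tol=1e-12, max_iter=10000):
    """Return (root, unit vector) for the root of largest magnitude.

    Plain power iteration on a symmetric matrix; the starting vector
    (1, 2, ..., n) is an arbitrary, hypothetical choice.
    """
    n = len(a)
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    root = 0.0
    for _ in range(max_iter):
        w = mat_vec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # v lies in the null space; the root is zero
            return 0.0, v
        v = [x / norm for x in w]
        # Rayleigh quotient v'Av serves as the current estimate of the root.
        new_root = sum(x * y for x, y in zip(v, mat_vec(a, v)))
        if abs(new_root - root) < tol:
            return new_root, v
        root = new_root
    return root, v


def leading_components(r, count):
    """Return the `count` leading (root, unit vector, loadings) triples.

    After each root k with unit vector v is found, k * v v' is subtracted
    from the working matrix (deflation) so that the next root can be found.
    Loadings sqrt(k) * v are the correlations of the variates with the
    component when r is a correlation matrix; they are None if k <= 0.
    """
    a = [row[:] for row in r]  # work on a copy, leave r untouched
    n = len(a)
    found = []
    for _ in range(count):
        k, v = largest_root(a)
        loadings = [math.sqrt(k) * x for x in v] if k > 0 else None
        found.append((k, v, loadings))
        for i in range(n):
            for j in range(n):
                a[i][j] -= k * v[i] * v[j]
    return found
```

For the two-variate correlation matrix `[[1.0, 0.5], [0.5, 1.0]]`, `leading_components(r, 2)` yields roots close to 1.5 and 0.5, whose sum equals the total variance 2. Because the iteration converges to the root of largest magnitude, a negative root of an indefinite matrix would be found in the same way, though no loadings are formed for it in this sketch.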