easier R than SPSS with Rcmdr : Contents
ch.24 Principal Component Analysis (PCA)
Let’s practice by loading the States data set from the carData package.
It holds education-related figures for each U.S. state. Here’s a rough look at the variables:
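● SATV: Average SAT verbal score of graduating high school students
● SATM: Average SAT math score of graduating high school students
● percent: Percentage of graduating high school students who took the SAT
● dollars: State spending on public education per student (in $1,000s)
● pay: Average teacher’s salary (in $1,000s)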
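If you prefer to look at the data from the script window, a minimal sketch like the following should work. It assumes the carData package is already installed; nothing else is needed.

```r
# Load the States data from carData (assumes the package is installed)
data("States", package = "carData")

# Variable names and types
str(States)

# Basic summary statistics for every column
summary(States)
```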
From ‘Original menu’, select ‘Principal Component Analysis’.
Select all variables.
Let’s check all the ‘Options’ as well. The last option, ‘Add principal~’, opens a further dialog that asks how many components to add to the data set.
Here, the result is that PC1 and PC2 are added to the data.
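For reference, the menu steps can be approximated by hand with the script below. This is only a sketch, not the exact script Rcmdr generates: it assumes that ‘all variables’ means every numeric column of States, and the names num_vars and pc are made up for this example.

```r
data("States", package = "carData")

# Assumption: the dialog offers only numeric variables, so keep those
num_vars <- names(States)[sapply(States, is.numeric)]

# cor = TRUE analyzes the correlation matrix instead of the covariance matrix
pc <- princomp(States[, num_vars], cor = TRUE)

print(unclass(loadings(pc)))  # component loadings
print(pc$sdev^2)              # component variances
print(summary(pc))            # importance of components

# Add the scores of the first two components to the data as PC1 and PC2
States$PC1 <- pc$scores[, 1]
States$PC2 <- pc$scores[, 2]
head(States[, c("PC1", "PC2")])
```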
For each component, you can see its characteristics in the output. The second result above, the Component variances, is graphed as follows:
In the ‘screeplot’ above, the variances of Comp.2 and Comp.3 are far smaller than that of Comp.1. Scree is the loose rubble that piles up at the foot of a steep slope. The plot flattens out from Comp.2 onwards, so by this reading only Comp.1 counts as a principal component. If you instead judge that the flat part starts at Comp.4, then the components up to Comp.3 are treated as principal components.
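The scree plot can also be redrawn from the script window. The sketch below assumes the same setup as before (all numeric columns of States, correlation matrix). The names pc and variances are only illustrative.

```r
data("States", package = "carData")
pc <- princomp(States[, sapply(States, is.numeric)], cor = TRUE)

# Component variances: the values the scree plot displays
variances <- pc$sdev^2
print(round(variances, 3))

# Default bar version, then a line version that makes the "elbow" easier to see
screeplot(pc)
screeplot(pc, type = "lines")

# Drop in variance from each component to the next
print(round(-diff(variances), 3))
```

Where the drops become small, the plot has reached the flat part.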
In the ‘Proportion of Variance’ row of ‘Importance of components’, you can see that
Comp.1 is the largest at 0.6516807, followed by Comp.2 at 0.1560831 and
Comp.3 at 0.1485106.
In other words, the first principal component alone accounts for about 65% of the total variance.
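The ‘Proportion of Variance’ row is simply each component variance divided by the total. A small sketch, again assuming all numeric columns of States are analyzed, with prop as an illustrative name:

```r
data("States", package = "carData")
pc <- princomp(States[, sapply(States, is.numeric)], cor = TRUE)

# Proportion of variance = component variance / sum of all component variances
prop <- pc$sdev^2 / sum(pc$sdev^2)
print(prop)

# Cumulative proportion, as in the last row of summary(pc)
print(cumsum(prop))
```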
Meanwhile, to draw a biplot, copy the part of the script that creates .PC and paste it verbatim, then type biplot(.PC) on the line below it. Select these two lines and click ‘Submit’.
Alternatively, you can shorten it to a single line like this:
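Since the original screenshots are not reproduced here, the following is only a sketch of what the two forms might look like. It passes the numeric columns as a data frame rather than the formula Rcmdr writes, and the name num is made up for this example.

```r
data("States", package = "carData")
num <- States[, sapply(States, is.numeric)]

# Two-line form: fit the model, then draw the biplot
.PC <- princomp(num, cor = TRUE)
biplot(.PC)

# One-line form: fit and plot in a single call
biplot(princomp(num, cor = TRUE))
```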
You can then see at a glance how closely each variable’s arrow runs parallel to the Comp.1 axis or to the Comp.2 axis.
If you standardize all the original scores with the standardization menu, as shown above, and then run the principal component analysis again, you get the same result. In other words, this principal component analysis standardizes the variables automatically before the analysis (this is what analyzing the correlation matrix amounts to), which is the typical practice.
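You can check this from the script window as well. The sketch below uses scale() to standardize, which is assumed here to match what the standardization menu does. The names num, z, pc_raw and pc_std are only illustrative.

```r
data("States", package = "carData")
num <- States[, sapply(States, is.numeric)]

# PCA on the original scores, analyzing the correlation matrix
pc_raw <- princomp(num, cor = TRUE)

# Standardize every variable (mean 0, sd 1), then repeat the same analysis
z <- as.data.frame(scale(num))
pc_std <- princomp(z, cor = TRUE)

# Compare component standard deviations
same_sdev <- isTRUE(all.equal(pc_raw$sdev, pc_std$sdev))
print(same_sdev)

# Compare loadings, ignoring the arbitrary sign of each component
same_load <- isTRUE(all.equal(abs(unclass(loadings(pc_raw))),
                              abs(unclass(loadings(pc_std)))))
print(same_load)
```

The correlation matrix does not change when variables are rescaled, which is why the two runs agree.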