Excel and Statistics, everybody should know. : Contents
2-way ANOVA
Open ‘example1.xlsx’, choose ‘Anova: Two-Factor With Replication’, and specify the input range as shown. Before we analyze it, let’s look at the nature of this data.
| | Fertilizer1 | Fertilizer2 | Fertilizer3 |
| High humidity | | | |
| Low humidity | | | |
Suppose you grow beans and divide them into 6 groups according to fertilizer and humidity conditions. Two types of conditions make the design two-way: there are 3 fertilizer conditions and 2 humidity conditions, for a total of 6 combinations. Say you used 11 beans for each combination, and the result is the data above. Treat fertilizer and humidity each as one factor. When two factors are set up in combination like this, the analysis is called ‘2-way ANOVA’ or ‘2-factor ANOVA’. Each bean is exposed to 2 conditions, one from each factor.
Here’s the result. The colored part is the summary data for the six groups, and the ANOVA table is below it. ‘Sample’ (the rows) has 2 conditions and therefore 1 degree of freedom; its p-value is 0.62, which is not statistically significant. ‘Columns’ has 3 conditions and therefore 2 degrees of freedom; its p-value is 0.047, which is statistically significant. ‘Interaction’ has a p-value of 0.96, so there appears to be no interaction.
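To make the rows of the ANOVA table less of a black box, here is a minimal sketch of the balanced two-factor calculation (sums of squares, degrees of freedom, mean squares and F ratios). The numbers are hypothetical and are not the values in ‘example1.xlsx’; the function name `two_way_anova` is made up for illustration, and p-values are left out because the Python standard library has no F distribution.

```python
# Hypothetical measurements, NOT the values in example1.xlsx.
# data[row][col] is the list of replicates in one cell.
data = [
    [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]],  # high humidity
    [[11.0, 13.0], [19.0, 21.0], [33.0, 35.0]],  # low humidity
]


def two_way_anova(cells):
    """Balanced two-factor ANOVA with replication (equal cell sizes only)."""
    r = len(cells)        # number of row conditions
    c = len(cells[0])     # number of column conditions
    n = len(cells[0][0])  # replicates per cell
    if any(len(cell) != n for row in cells for cell in row):
        raise ValueError("this sketch requires equal cell sizes")

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "rows": c * n * sum((m - grand) ** 2 for m in row_means),
        "cols": r * n * sum((m - grand) ** 2 for m in col_means),
        "interaction": n * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(r) for j in range(c)
        ),
        "within": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(r) for j in range(c) for x in cells[i][j]
        ),
    }
    df = {
        "rows": r - 1,
        "cols": c - 1,
        "interaction": (r - 1) * (c - 1),
        "within": r * c * (n - 1),
    }
    ms = {key: ss[key] / df[key] for key in ss}
    # Each effect is tested against the within-cell mean square.
    f = {key: ms[key] / ms["within"] for key in ("rows", "cols", "interaction")}
    return ss, df, ms, f


ss, df, ms, f = two_way_anova(data)
for key in ("rows", "cols", "interaction", "within"):
    print(key, ss[key], df[key], ms[key], f.get(key))
```

With the layout described in the text (2 rows, 3 columns, 11 beans per cell), the same formulas give 1, 2, 2 and 60 degrees of freedom for rows, columns, interaction and within.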
Plus Alpha
Visit http://vassarstats.net/anova2u.html, select a structure of 2 rows and 3 columns, and click Setup.
Empty yellow cells appear, laid out in three rows and two columns.
Copy the values from Excel and paste them in, then press Backspace at the end so that the values are entered correctly. Click ‘calculate’.
The page shows the number and sum of the data in each cell, along with the mean, sum of squares, variance, standard deviation, and standard error.
It also shows an ANOVA table.
The table is essentially the same as the one from Excel.
The p-value for the columns is 0.0474, so a post-test applies to the columns: looking only at the columns, a difference of 21.9 or more between two column means is considered statistically significant.
Looking at the previous summary data again, there is a difference of more than 21 between C1 and C2, so we judge that the two differ statistically.
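The comparison described above can be sketched as a small check of every pair of column means against the critical difference. The value 21.9 is the one quoted in the text; the column means below are hypothetical placeholders, not the ones in the summary table, and `significant_pairs` is a made-up helper name.

```python
from itertools import combinations

CRITICAL_DIFFERENCE = 21.9  # value quoted in the text for the columns

# Hypothetical column means, NOT the ones in the summary table.
column_means = {"C1": 120.0, "C2": 145.0, "C3": 130.0}


def significant_pairs(means, critical):
    """Return the pairs whose absolute mean difference reaches the critical value."""
    flagged = []
    for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
        difference = abs(mean_a - mean_b)
        if difference >= critical:
            flagged.append((name_a, name_b, difference))
    return flagged


print(significant_pairs(column_means, CRITICAL_DIFFERENCE))
```

With these placeholder means only the C1 and C2 pair is flagged, because the other two differences (10 and 15) fall below 21.9.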
Let’s upload ‘2way_ANOVA.csv’ to https://tinyurl.com/multifactor-ANOVA. The data is ‘example1.xlsx’ rearranged into long form.
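For readers unfamiliar with the term, long form means one row per bean, with separate columns for the two conditions and the measured value. The sketch below shows one way to make that conversion from a wide layout. The column names (`humidity`, `fertilizer`, `value`) and the numbers are assumptions for illustration; the actual headers in ‘2way_ANOVA.csv’ may differ.

```python
import csv
import io

# Hypothetical wide layout: one column per fertilizer, one row per replicate,
# with the humidity label in the first column. Not the real spreadsheet values.
wide = [
    ["humidity", "Fertilizer1", "Fertilizer2", "Fertilizer3"],
    ["High", "10", "20", "30"],
    ["High", "12", "22", "32"],
    ["Low", "11", "19", "33"],
    ["Low", "13", "21", "35"],
]


def wide_to_long(rows):
    """Return one (humidity, fertilizer, value) record per measurement."""
    header = rows[0]
    records = []
    for row in rows[1:]:
        humidity = row[0]
        for fertilizer, value in zip(header[1:], row[1:]):
            records.append((humidity, fertilizer, float(value)))
    return records


def to_csv_text(records):
    """Serialize the long-form records as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["humidity", "fertilizer", "value"])
    writer.writerows(records)
    return buffer.getvalue()


long_records = wide_to_long(wide)
print(to_csv_text(long_records))
```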
The site shows the results in the ‘Result’ tab and the ‘Interaction’ tab. The former assumes there is no interaction, and the latter assumes there is one.
If we compare the latter with the Excel result, we can see that the sums of squares (Sum Sq) differ, and so do the results.
Why?
Excel calculates according to the method initially proposed by Fisher[1]. Yates expanded this further in 1934 with his paper “The analysis of multiple classifications with unequal numbers in the different classes”[2], and research has continued to develop ever since.
One overview that is suitable for the general reader is [3], which I introduce briefly here. It also explains the advantages and disadvantages of each method. With several methods to choose from, researchers can be very perplexed. It would be nice if there were only one way, but that’s not the case.
SAS says that all 3 methods (actually 4 methods) are possible, and SPSS has 4 methods, of which type II is the default in version 10. In R, the method probably varies from package to package. In principle you should cite the first paper that proposed the method you used, but I usually end up stating only which statistical program I used.
In practical research, the number of observations in each cell is often not the same. If you intentionally make the number of observations equal in every cell, you can use Excel’s method, in which case you can cite Fisher’s original book as a reference.
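A small sketch may help show why unequal cell sizes matter. It compares two sums of squares for factor A in a main-effects-only model: A entered first (ignoring B) and A entered after B (adjusted for B). With equal cell sizes the two agree; with unequal cell sizes they do not, which is where the competing methods come from. The data are hypothetical, the helper names are made up, and this does not claim to reproduce what Excel or the ‘ez’ and ‘HH’ packages compute.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * p for x, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def rss(design, y):
    """Residual sum of squares of the least-squares fit of y on the design."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (yi - sum(b * x for b, x in zip(beta, r))) ** 2
        for r, yi in zip(design, y)
    )


def ss_for_a(obs):
    """Return (SS of A entered first, SS of A entered after B)."""
    y = [value for _, _, value in obs]
    only_mean = [[1.0] for _ in obs]
    with_a = [[1.0, float(a)] for a, _, _ in obs]
    with_b = [[1.0, float(b)] for _, b, _ in obs]
    with_a_and_b = [[1.0, float(a), float(b)] for a, b, _ in obs]
    a_first = rss(only_mean, y) - rss(with_a, y)
    a_after_b = rss(with_b, y) - rss(with_a_and_b, y)
    return a_first, a_after_b


# Hypothetical observations as (level of A, level of B, value), levels coded 0/1.
balanced = [(0, 0, 10.0), (0, 0, 12.0), (0, 1, 20.0), (0, 1, 22.0),
            (1, 0, 14.0), (1, 0, 16.0), (1, 1, 24.0), (1, 1, 26.0)]
unbalanced = [(0, 0, 10.0), (0, 0, 12.0), (0, 0, 11.0), (0, 1, 20.0),
              (1, 0, 14.0), (1, 1, 24.0), (1, 1, 26.0), (1, 1, 25.0)]

print("balanced:  ", ss_for_a(balanced))
print("unbalanced:", ss_for_a(unbalanced))
```

In the unbalanced set, most low-A observations also have low B, so the sum of squares for A entered first absorbs part of the B effect and comes out far larger than the adjusted one.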
If you used https://tinyurl.com/Plot-with-error-bar or https://tinyurl.com/multifactor-ANOVA, you can indicate that you used packages named ‘ez’ and ‘HH’ respectively.
‘Anova: Two-Factor Without Replication’ is not commonly used in actual research, so we will omit it. It can be used when there is only one observation in each cell.
[1] Fisher, Ronald. “Statistical Methods for Research Workers” (1925). Two-way ANOVA is mentioned in this monumental book.
[2] Yates, Frank. “The analysis of multiple classifications with unequal numbers in the different classes”. Journal of the American Statistical Association 29, Issue 185 (1934): 51–66.
[3] Shaw, Ruth G., Thomas Mitchell-Olds. “ANOVA for unbalanced data: an overview”. Ecology 74, Issue 6 (1993): 1638–1645.
=================================================
- R statistics portal https://tinyurl.com/stat-portal
- R data visualization book 1 https://tinyurl.com/R-plot-I (chart)
- R data visualization book 2
https://tinyurl.com/R-plot-II-3-4 many variables / map
https://tinyurl.com/R-plot-II-5-6 time related / statistics related
https://tinyurl.com/R-plot-II-7-8 others / reactive chart
- R data visualization book 3 https://tinyurl.com/R-data-Vis3
- R data visualization book 4 (R Data Visualization, Volume 4)
- Meta Analysis book 1 https://tinyurl.com/MetaA-portal
- Meta Analysis book 2 https://tinyurl.com/MetaA-portal(2)
- Prediction Model and Machine Learning https://tinyurl.com/Machine-Learning-EZ
- Sample Size Calculations https://tinyurl.com/MY-sample-size
- Sample data https://tinyurl.com/data4edu