Thursday, November 17, 2022

Excel and Statistics(21), 2-way ANOVA


2-way ANOVA

Open ‘example1.xlsx’, choose ‘Anova: Two-Factor With Replication’, and specify the input range as shown. Before we analyze it, let’s look at what kind of data this is.

 

 

Humidity         Fertilizer1    Fertilizer2    Fertilizer3
High humidity
Low humidity

Suppose we grow beans and divide them into 6 groups according to fertilizer and humidity conditions. Having two types of conditions is what makes the analysis two-way. There are 3 fertilizer conditions and 2 humidity conditions, for a total of 6 combinations. Let’s say 11 beans were used for each combination and that the results are the data above. Think of fertilizer and humidity each as a factor. When two factors are applied in combination like this, the analysis is called ‘2-way ANOVA’ or ‘2-factor ANOVA’. Each bean is exposed to both conditions.
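The design described above can be enumerated in a few lines. This is only a sketch of the layout: the level names come from the table and the count of 11 comes from the text, while the variable names are made up for illustration.

```python
from itertools import product

# Factor levels taken from the table above.
humidity_levels = ["High humidity", "Low humidity"]
fertilizer_levels = ["Fertilizer1", "Fertilizer2", "Fertilizer3"]
beans_per_condition = 11  # as stated in the text

# Each pairing of one humidity level with one fertilizer level is one condition (cell).
conditions = list(product(humidity_levels, fertilizer_levels))
total_beans = len(conditions) * beans_per_condition

print(len(conditions))  # 6
print(total_beans)      # 66
```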

 

Here’s the result. The colored part is the summary data for the six groups, and the ANOVA table is below it. For ‘Sample’ (the rows), with 2 conditions and 1 degree of freedom, the p-value is 0.62, which is not statistically significant. For ‘Columns’, with 3 conditions and 2 degrees of freedom, the p-value is 0.047, which is statistically significant. ‘Interaction’, on the other hand, has a p-value of 0.96, so there appears to be no interaction.
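To see where the degrees of freedom and sums of squares in such a table come from, here is a minimal sketch of the textbook calculation for a balanced design (the same number of values in every cell). It is not Excel's own code, the function name `two_way_anova` is made up, and the numbers are invented rather than taken from ‘example1.xlsx’. It stops at the F ratios, because turning them into p-values needs the F distribution, which the Python standard library does not provide.

```python
def two_way_anova(cells):
    """cells[i][j] is the list of replicate values for row level i, column level j.

    Assumes a balanced design: every cell holds the same number of values.
    """
    a = len(cells)        # number of row levels ("Sample")
    b = len(cells[0])     # number of column levels ("Columns")
    n = len(cells[0][0])  # replicates per cell

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "Sample": b * n * sum((m - grand) ** 2 for m in row_means),
        "Columns": a * n * sum((m - grand) ** 2 for m in col_means),
        "Interaction": n * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "Within": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for x in cells[i][j]
        ),
    }
    df = {
        "Sample": a - 1,
        "Columns": b - 1,
        "Interaction": (a - 1) * (b - 1),
        "Within": a * b * (n - 1),
    }
    ms_within = ss["Within"] / df["Within"]
    # F ratio = mean square of the effect / mean square within cells.
    f = {k: (ss[k] / df[k]) / ms_within for k in ("Sample", "Columns", "Interaction")}
    return df, ss, f


# Made-up numbers (NOT the data in example1.xlsx): 2 rows x 3 columns x 2 replicates.
example = [
    [[10, 12], [20, 22], [14, 16]],
    [[11, 13], [21, 25], [15, 15]],
]
df, ss, f = two_way_anova(example)
print(df)
print(ss)
print(f)
```

With 2 rows, 3 columns, and 11 values per cell, as in the bean example, the same formulas give 1, 2, 2, and 60 degrees of freedom.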

Plus Alpha

Visit http://vassarstats.net/anova2u.html, select a layout of 2 rows and 3 columns, and click Setup.

 

Empty yellow cells appear, arranged in three rows and two columns.

Copy the values from Excel and paste them in, then press Backspace at the end of the pasted text so that the values are entered correctly. Click ‘calculate’.

For each cell, the page shows the number of values, the sum, the mean, the sum of squares, the variance, the standard deviation, and the standard error.

It also shows an ANOVA table.

 

It is essentially the same as the Excel result.

 

For the post-hoc test, the p-value for the columns is 0.0474, so if we look at the columns alone, a difference of 21.9 or more between column means is considered statistically significant.

 

 


Looking at the earlier summary data again, the difference between C1 and C2 is more than 21, so we judge that the two differ significantly.
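The comparison described above amounts to checking each pair of column means against the critical difference. The sketch below assumes that reading. Only the 21.9 threshold comes from the text: the column means are invented, and the helper name `significant_pairs` is hypothetical.

```python
from itertools import combinations

def significant_pairs(column_means, critical_difference):
    """Return the pairs of columns whose means differ by at least the critical difference."""
    flagged = []
    for (name_a, mean_a), (name_b, mean_b) in combinations(column_means.items(), 2):
        if abs(mean_a - mean_b) >= critical_difference:
            flagged.append((name_a, name_b))
    return flagged

# Made-up column means for illustration; only the 21.9 threshold is from the text.
means = {"C1": 100.0, "C2": 125.0, "C3": 110.0}
print(significant_pairs(means, 21.9))  # [('C1', 'C2')]
```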

 

Let’s upload ‘2way_ANOVA.csv’ to https://tinyurl.com/multifactor-ANOVA. The file holds the same data as ‘example1.xlsx’, but in long form.
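In long form, each observation gets its own row, with one column per factor and one for the measured value. The sketch below shows the idea with made-up numbers. The column names `humidity`, `fertilizer`, and `value` are assumptions and may not match the headers in ‘2way_ANOVA.csv’.

```python
import csv
import io

def wide_to_long(cells, row_labels, col_labels):
    """Turn a grid of cells (each a list of replicates) into one record per observation."""
    records = []
    for i, row_label in enumerate(row_labels):
        for j, col_label in enumerate(col_labels):
            for value in cells[i][j]:
                records.append(
                    {"humidity": row_label, "fertilizer": col_label, "value": value}
                )
    return records

# Made-up numbers (NOT the data in example1.xlsx): 2 x 3 cells, 2 replicates each.
cells = [
    [[10, 12], [20, 22], [14, 16]],
    [[11, 13], [21, 25], [15, 15]],
]
long_rows = wide_to_long(
    cells,
    ["High humidity", "Low humidity"],
    ["Fertilizer1", "Fertilizer2", "Fertilizer3"],
)

# Write the records as CSV text (to an in-memory buffer here, instead of a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["humidity", "fertilizer", "value"])
writer.writeheader()
writer.writerows(long_rows)
print(buffer.getvalue())
```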

 

The results appear in the ‘Result’ tab and the ‘Interaction’ tab. The former assumes there is no interaction, and the latter assumes there is one.

 

If we compare the latter with the Excel result, the sums of squares (Sum Sq) differ, and so do the results.

Why?

Excel calculates according to the method initially proposed by Fisher[1]. Yates expanded this further in 1934 with his paper “The analysis of multiple classifications with unequal numbers in the different classes”[2], and research has continued to develop ever since.

I found a paper[3] that is suitable for the general reader, so I will briefly introduce it. The paper also explains the advantages and disadvantages of each method. As a result, researchers can be left quite perplexed. It would be nice if there were only one method, but that is not the case.

SAS says that all 3 methods (actually 4 methods) are possible, and SPSS has 4 methods, of which type II is the default in version 10. R probably offers many different options. In principle, you should cite the paper that first proposed the method you used, but in practice I usually end up just stating which statistical program I used.

 

In practical research, the number of observations in each cell is often not the same. If you deliberately equalize the number of observations per cell, you can use Excel’s method, in which case you can cite Fisher’s original work as a reference.

If you used https://tinyurl.com/Plot-with-error-bar or https://tinyurl.com/multifactor-ANOVA, you can state that you used the R packages ‘ez’ and ‘HH’, respectively.

‘Anova: Two-Factor Without Replication’ is not commonly used in actual research, so we will omit it. It can be used when each cell holds only one observation.

 


 



[1] Fisher, Ronald. “Statistical Methods for Research Workers” (1925). Two-way ANOVA is mentioned in this monumental book.

[2] Yates, Frank. “The analysis of multiple classifications with unequal numbers in the different classes”. Journal of the American Statistical Association 29, Issue 185 (1934): 51–66.

[3] Shaw, Ruth G., Thomas Mitchell-Olds. “ANOVA for unbalanced data: an overview”. Ecology 74, Issue 6 (1993): 1638–1645.

 

 
