Statistics for everyone: Excel and Statistics(22), ANCOVA


ANCOVA

Download the ‘ancovaW.xls’ file from http://www.vassarstats.net/. There are separate versions for Windows and Mac, but recent Macs can also open the Excel file made for Windows, so this should not be a problem.

Enter your data. DV is the dependent variable, the outcome we want to analyze. CV (covariate) is a confounding variable that may affect the result.

ANOVA and the t-test use only the DV (dependent variable). ANCOVA extends ANOVA or the t-test so that the influence of a confounding variable can be controlled.

This Excel file can hold up to 1000 data points in total and can analyze up to 10 groups. Below the data area you can see the intermediate results of the calculations. If you are studying the method, you can also learn from the formulas in the cells.

The third sheet shows the results of the calculation. Some cells display ### because the numbers do not fit at this font size, but if you copy and paste them into a new file, the values become visible.

The observed means differ, but the CV means differ as well, so let’s adjust for the influence of the CV and compare the adjusted means. After adjustment, the first group went up and the third group went down. Below that is the ANOVA table, including the p-value, after adjustment.
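The adjustment of the means can be sketched in a few lines of Python. This is only a minimal sketch and not the spreadsheet's own formulas: the data are made-up numbers, the names `data`, `slope`, `observed` and `adjusted` are hypothetical, and it assumes the usual textbook adjustment, in which each group's mean is shifted along the pooled within-group slope to the grand mean of the CV.

```python
# Hypothetical illustration data: {group: (CV values, DV values)}.
data = {
    "A": ([1, 2, 3], [3, 4, 8]),
    "B": ([2, 3, 4], [4, 7, 7]),
    "C": ([3, 4, 5], [5, 6, 10]),
}


def mean(values):
    return sum(values) / len(values)


# Pooled within-group sums: SS of the CV, and cross-products of CV and DV.
ss_x_within = 0.0
sc_within = 0.0
for x, y in data.values():
    mx, my = mean(x), mean(y)
    ss_x_within += sum((xi - mx) ** 2 for xi in x)
    sc_within += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

# Common (pooled within-group) slope of DV on CV.
slope = sc_within / ss_x_within

# Grand mean of the CV over all observations.
grand_mean_x = mean([xi for x, _ in data.values() for xi in x])

observed = {g: mean(y) for g, (x, y) in data.items()}
# Shift each observed mean to where it would be at the grand CV mean.
adjusted = {
    g: mean(y) - slope * (mean(x) - grand_mean_x)
    for g, (x, y) in data.items()
}

for g in data:
    print(f"{g}: observed {observed[g]:.3f}, adjusted {adjusted[g]:.3f}")
```

With these made-up numbers the first group moves up and the third moves down after adjustment, because their CV means lie below and above the grand CV mean respectively.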

Plus Alpha

Visit https://tinyurl.com/ANCOVA-plot. A basic example is already loaded, and it is the same data as the Excel data we saw earlier. There are four sort options, and each one changes both the plot and the result.

Select ‘for groups’ to compare the mean scores of the groups. Beef and Meat are almost the same, while Poultry differs considerably. Meat is slightly higher than Beef.

‘ANCOVA’ fits a trend line to each of the three groups, all with the same slope, and draws them together. Now Beef is narrowly higher than Meat. Why the difference? First, each trend line shows that the value increases as the covariate increases. This is what happens when cholesterol or blood sugar rises with age. If cholesterol also depends on food, and the ages of the three groups are not similar to each other, the effect of age and the effect of food are mixed together.

So ANCOVA tries to analyze only the effect of food, adjusted for the effect of age, and this plot illustrates that well.
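The same idea can be written out as a calculation. The sketch below uses the classic sums-of-squares formulas for a one-covariate ANCOVA with a common slope; the data are made-up numbers and the names `sums` and `ancova` are hypothetical helpers, not taken from the Excel file or the web page. It stops at the F ratio, because turning F into a p-value needs the F distribution, which the Python standard library does not provide.

```python
# Hypothetical illustration data: {group: (covariate values, DV values)}.
data = {
    "A": ([1, 2, 3], [3, 4, 8]),
    "B": ([2, 3, 4], [4, 7, 7]),
    "C": ([3, 4, 5], [5, 6, 10]),
}


def sums(x, y):
    """Return (SS of x, SS of y, sum of cross-products), all about the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    sc = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return ss_x, ss_y, sc


def ancova(data):
    k = len(data)
    n = sum(len(x) for x, _ in data.values())

    # Total sums, ignoring group membership.
    all_x = [a for x, _ in data.values() for a in x]
    all_y = [b for _, y in data.values() for b in y]
    ss_x_t, ss_y_t, sc_t = sums(all_x, all_y)

    # Within-group sums, pooled over the groups.
    parts = [sums(x, y) for x, y in data.values()]
    ss_x_w = sum(p[0] for p in parts)
    ss_y_w = sum(p[1] for p in parts)
    sc_w = sum(p[2] for p in parts)

    # Remove the part of the DV variation explained by the covariate.
    adj_total = ss_y_t - sc_t ** 2 / ss_x_t
    adj_within = ss_y_w - sc_w ** 2 / ss_x_w
    adj_between = adj_total - adj_within

    df_between = k - 1
    df_within = n - k - 1  # one extra df is spent on the common slope
    f = (adj_between / df_between) / (adj_within / df_within)
    return {"SS_between": adj_between, "SS_within": adj_within,
            "df": (df_between, df_within), "F": f}


result = ancova(data)
print(result)
```

For these made-up numbers the adjusted F ratio comes out at 1.75 with 2 and 5 degrees of freedom.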

This is the ‘for null’ plot. All groups share a single straight line. You can compare which points lie above this line and which lie below it. Suppose the dots of one group shift to the right, parallel to the horizontal axis. Their average stays the same, but relative to the rising line they end up progressively lower.

This is the ‘for intersection’ plot, which lets us check the assumption that the straight lines have the same slope. Do they? Judging visually from the picture, the slopes look almost the same.

This is the result of ‘for intersection’. The p-value of the intersection (the interaction between group and covariate) is 0.5808, which is not significant, so we assume that the slopes are the same and that there is no interaction.
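A check of this kind can be sketched as a comparison of two fits: one where every group gets its own slope and one where all groups share a pooled slope. The code below is a minimal sketch under that assumption, with made-up data and hypothetical names (`sums`, `slope_homogeneity`); it is not the web page's own code, and it returns only the F ratio and its degrees of freedom, not a p-value.

```python
# Hypothetical illustration data: {group: (covariate values, DV values)}.
data = {
    "A": ([1, 2, 3], [3, 4, 8]),
    "B": ([2, 3, 4], [4, 7, 7]),
    "C": ([3, 4, 5], [5, 6, 10]),
}


def sums(x, y):
    """Return (SS of x, SS of y, sum of cross-products), all about the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    sc = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return ss_x, ss_y, sc


def slope_homogeneity(data):
    k = len(data)
    n = sum(len(x) for x, _ in data.values())
    parts = [sums(x, y) for x, y in data.values()]

    # Residual SS when every group is fitted with its own slope.
    ss_separate = sum(ss_y - sc ** 2 / ss_x for ss_x, ss_y, sc in parts)

    # Residual SS when all groups share one pooled within-group slope.
    ss_x_w = sum(p[0] for p in parts)
    ss_y_w = sum(p[1] for p in parts)
    sc_w = sum(p[2] for p in parts)
    ss_common = ss_y_w - sc_w ** 2 / ss_x_w

    # The drop in residual SS is what the separate slopes buy.
    df_slopes = k - 1
    df_error = n - 2 * k  # each group uses an intercept and a slope
    f = ((ss_common - ss_separate) / df_slopes) / (ss_separate / df_error)
    return f, df_slopes, df_error


f_slopes, df1, df2 = slope_homogeneity(data)
print(f"F = {f_slopes:.4f} on {df1} and {df2} df")
```

A small F here means the separate slopes explain little beyond the common slope, which is the situation in which the equal-slope assumption is kept.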

The ANCOVA result on the web page matches what we saw in Excel. The p-values look slightly different, but notice that SS, df, and MS are the same.

The result of ‘for groups’ is an ordinary ANOVA, which ignores the effect of the covariate.
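For comparison, an ordinary one-way ANOVA that ignores the covariate can be sketched as follows. The DV values are made-up numbers and `one_way_anova` is a hypothetical helper name, used here only to show which sums of squares go into the table.

```python
# Hypothetical illustration data: DV values only, the covariate is ignored.
groups = {
    "A": [3, 4, 8],
    "B": [4, 7, 7],
    "C": [5, 6, 10],
}


def one_way_anova(groups):
    all_y = [v for ys in groups.values() for v in ys]
    grand_mean = sum(all_y) / len(all_y)

    ss_between = 0.0
    ss_within = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        # Variation of the group mean around the grand mean.
        ss_between += len(ys) * (group_mean - grand_mean) ** 2
        # Variation of the observations around their own group mean.
        ss_within += sum((v - group_mean) ** 2 for v in ys)

    df_between = len(groups) - 1
    df_within = len(all_y) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return {"SS_between": ss_between, "SS_within": ss_within,
            "df": (df_between, df_within), "F": f}


anova_result = one_way_anova(groups)
print(anova_result)
```

Because nothing is removed for the covariate, all the variation that the covariate could have explained stays in the within-group term.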

The ‘ANCOVA’ result is computed on the assumption that there is no intersection, that is, that the slopes are equal.

This is what I showed earlier in Excel, copied and pasted into a new file so that the numbers are easy to read.

In this way, we have seen what the Excel output means and also learned how to show it with an appropriate plot.

However, there is not necessarily just one covariate at work. Examining the effects of several covariates at once is the multiple regression we learned earlier. Moreover, even if the effects of measured covariates are corrected statistically, many covariates have not been measured or cannot be measured, so they cannot be adjusted for. That is why randomization is needed to control the effects of covariates.
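Random assignment itself is simple to carry out. The sketch below shows one possible way, assuming subjects are identified by number and dealt into equally sized groups after shuffling; the function name `randomize`, the subject numbers and the fixed seed are all hypothetical and only there to make the example repeatable.

```python
import random


def randomize(subjects, group_names, seed=None):
    """Shuffle the subjects, then deal them into the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        assignment[group_names[i % len(group_names)]].append(subject)
    return assignment


# Twelve hypothetical subjects assigned to the three diet groups.
assignment = randomize(range(1, 13), ["Beef", "Meat", "Poultry"], seed=0)
for name, members in assignment.items():
    print(name, sorted(members))
```

Because chance alone decides who goes where, measured and unmeasured covariates alike tend to balance out across the groups as the sample grows.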