How to run a one-way (single-factor) ANOVA in R with samples organized by column?

I have a dataset where the samples are grouped by column. The following sample dataset is similar to my data format:

    a = c(1,3,4,6,8)
    b = c(3,6,8,3,6)
    c = c(2,1,4,3,6)
    d = c(2,2,3,3,4)
    mydata = data.frame(cbind(a,b,c,d))

When I run a single-factor ANOVA in Excel on the dataset above, I get the following results:

[Image: Excel single-factor ANOVA output]

I know that a typical format in R is as follows:

    group  measurement
    a      1
    a      3
    a      4
    .      .
    .      .
    .      .
    d      4

And the command to run the ANOVA in R would be aov(measurement ~ group, data = mydata). How do I run a single-factor ANOVA in R when the samples are arranged by column rather than by row? In other words, how do I reproduce the Excel results in R? Many thanks for the help.

1 answer

You stack them into long format:

    > mdat <- stack(mydata)
    > mdat
      values ind
    1      1   a
    2      3   a
    3      4   a
    4      6   a
    5      8   a
    6      3   b
    7      6   b
    snipped output

    > aov(values ~ ind, mdat)
    Call:
       aov(formula = values ~ ind, data = mdat)

    Terms:
                     ind Residuals
    Sum of Squares  18.2      65.6
    Deg. of Freedom    3        16

    Residual standard error: 2.024846
    Estimated effects may be unbalanced

Given the warning, it would be safer to use lm:

    > anova(lm(values ~ ind, mdat))
    Analysis of Variance Table

    Response: values
              Df Sum Sq Mean Sq F value Pr(>F)
    ind        3   18.2  6.0667  1.4797 0.2578
    Residuals 16   65.6  4.1000

    > summary(lm(values ~ ind, mdat))

    Call:
    lm(formula = values ~ ind, data = mdat)

    Residuals:
       Min     1Q Median     3Q    Max
     -3.40  -1.25   0.00   0.90   3.60

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   4.4000     0.9055   4.859 0.000174 ***
    indb          0.8000     1.2806   0.625 0.540978
    indc         -1.2000     1.2806  -0.937 0.362666
    indd         -1.6000     1.2806  -1.249 0.229491
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.025 on 16 degrees of freedom
    Multiple R-squared:  0.2172, Adjusted R-squared:  0.07041
    F-statistic:  1.48 on 3 and 16 DF,  p-value: 0.2578
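If you prefer to stay with aov, calling summary() on the fitted object should also give the full table with the F statistic and p-value. The following is a minimal, self-contained sketch using the question's data; the object names fit_aov, tab_aov and tab_lm are only illustrative.

```r
# Rebuild the question's data: one column per group.
mydata <- data.frame(
  a = c(1, 3, 4, 6, 8),
  b = c(3, 6, 8, 3, 6),
  c = c(2, 1, 4, 3, 6),
  d = c(2, 2, 3, 3, 4)
)

# stack() returns a long data frame with `values` (numeric) and `ind` (factor).
mdat <- stack(mydata)

# summary() on an aov fit returns a list; its first element is the ANOVA table.
fit_aov <- aov(values ~ ind, data = mdat)
tab_aov <- summary(fit_aov)[[1]]
print(tab_aov)

# The same table via the lm route.
tab_lm <- anova(lm(values ~ ind, data = mdat))

# Both routes should agree on the F statistic for the grouping factor.
print(all.equal(tab_aov[["F value"]][1], tab_lm[["F value"]][1]))
```

Both routes fit the same linear model, so the choice between them is mostly a matter of which printed summary you find more convenient.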

And please don't ask me why Excel gives a different answer. Excel has generally been shown to be very unreliable when it comes to statistics. The burden is on Excel to explain why it does not give answers comparable to R's.
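If you want to check any tool's output independently, the one-way ANOVA quantities can be computed directly from their definitions. The sketch below assumes a balanced design (the same number of observations in every column, as in the question's data); the variable names are illustrative.

```r
# The question's data: one column per group, 5 observations each.
mydata <- data.frame(
  a = c(1, 3, 4, 6, 8),
  b = c(3, 6, 8, 3, 6),
  c = c(2, 1, 4, 3, 6),
  d = c(2, 2, 3, 3, 4)
)
m <- as.matrix(mydata)

k <- ncol(m)  # number of groups
n <- nrow(m)  # observations per group (assumes a balanced design)

group_means <- colMeans(m)
grand_mean  <- mean(m)

# Between-group sum of squares: n * sum over groups of (group mean - grand mean)^2.
ss_between <- n * sum((group_means - grand_mean)^2)

# Within-group sum of squares: squared deviations from each column's own mean.
ss_within <- sum(sweep(m, 2, group_means)^2)

df_between <- k - 1
df_within  <- k * (n - 1)

f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

# These should reproduce the 18.2, 65.6, 1.4797 and 0.2578 in the anova() table.
print(c(ss_between = ss_between, ss_within = ss_within,
        F = f_stat, p = p_value))
```

If a spreadsheet reports different sums of squares for the same cells, a hand computation like this makes it easier to see which figure is off.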

Edit in response to comments: the ANOVA procedure in Excel's Data Analysis add-in creates an output block, but it does not use an Excel function to do so. When you change the data in the cells it was derived from and then press F9 (or use the equivalent menu recalculation command), the output section does not change. This and other sources of user error and numerical problems are documented on various pages of David Heiser's assessment of Excel's problems with statistical calculations (http://www.daheiser.info/excel/frontpage.html). Heiser began this effort, now at least a decade old, expecting Microsoft to take responsibility for these errors, but Microsoft has consistently ignored his and others' efforts to identify errors and propose better procedures. In addition, a six-section special issue of Computational Statistics and Data Analysis from June 2008, edited by B. D. McCullough, covers various statistical problems with Excel.
