Means for each level of a factor

A person on the R-Help list wanted to find the mean of each group, based on ID. Their data looked like:

data<-data.frame(ID=rep(letters[1:4], 5), size=runif(20))

The person was trying to write a for loop, as most people do when first encountering R (I sure did). As with most procedures in R, though, what takes a for loop in other languages can usually be done without one.

Had they gone the loop route, it would have looked something like:

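# Distinct group labels; factor() is needed in R >= 4.0, where ID is a character column
id <- levels(factor(data$ID))
average <- data.frame(id=id, mean=rep(NA, length(id)))
for(i in seq_along(id)){
	# Mean of the size values that belong to the i-th group
	average$mean[i] <- mean(data$size[data$ID==id[i]])
}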

However, the idiomatic R solution is either:

tapply(data$size, data$ID, mean)

or

aggregate(data$size, list(data$ID), mean)
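The two calls return different kinds of objects: tapply gives a named array with one mean per group, while aggregate gives a data frame. As a rough sketch of how they compare, the snippet below runs both on example data of the same shape and checks that the means agree. The seed and the names by_tapply and by_aggregate are assumptions made for illustration, and the formula form of aggregate is used here only because it keeps the original column names.

```r
# Example data with the same shape as above; the seed is an assumption for reproducibility
set.seed(1)
data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

# tapply returns a named array: one mean per level of ID
by_tapply <- tapply(data$size, data$ID, mean)

# aggregate returns a data frame; the formula interface keeps the column names ID and size
by_aggregate <- aggregate(size ~ ID, data = data, FUN = mean)

# Both approaches should give the same means, in the same sorted group order
all.equal(as.vector(by_tapply), by_aggregate$size)
```

If the result is going to be merged or plotted later, the data frame from aggregate is often the more convenient form. For a quick look at the console, the tapply output is more compact.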

For further reference, section 4.2 in An Introduction to R describes using tapply in this way.