Compute Column-by-Column Summaries of Groups of Observations

aggregate.data.frame

Description

Given a data frame, one or more grouping vectors, and a summary function, returns a data frame containing the summary of each column, computed separately for each group.

Usage

aggregate.data.frame(x, by, FUN, ..., simplify = TRUE)

Arguments

x	a data frame. If x is not a data frame, it is converted to one using the data.frame function. A 0-row data frame is not allowed.
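by	a list of grouping vectors, each with one element per row of x. The list elements should be named so that the result can use those names for its corresponding columns; unnamed elements get default names such as Group.1 (as in the Examples).
FUN	a function that can be applied to any column of x and that typically returns a single value (see Details for summaries of length greater than one).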
...	any other arguments are passed to FUN.
simplify	a logical flag that determines how the results of FUN are handled. If TRUE, the results are simplified to a vector or matrix where possible.
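The by and ... arguments can be illustrated with a minimal sketch. The data frame scores and the vector site are hypothetical, and the sketch assumes that na.rm = TRUE reaches mean through ... as described above.

```r
# Hypothetical data: two measurements, one of them with a missing value.
scores <- data.frame(height = c(150, 160, NA, 170),
                     weight = c(50, 60, 70, 80))
site <- c("north", "south", "north", "south")

# Naming the list element labels the grouping column "site" rather than
# a default name; na.rm = TRUE is passed on to mean() through "...".
res <- aggregate(scores, by = list(site = site), FUN = mean, na.rm = TRUE)
print(res)
```

Without na.rm = TRUE, the mean of height for the group containing the NA would itself be NA.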

Details

This is the "data.frame" method for the generic function aggregate. If simplify is TRUE, the summaries for a column are simplified to a vector if they all have length one, or to a matrix if they have a common length greater than one.
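The effect of simplify can be sketched with a summary function that returns more than one value. The objects d and g are hypothetical, and the sketch assumes the behavior described above, which matches that of aggregate in R.

```r
# Hypothetical data: six values in two groups of three.
d <- data.frame(v = c(1, 2, 3, 4, 5, 6))
g <- list(grp = c("a", "a", "a", "b", "b", "b"))

# range() returns two values per group, so with simplify = TRUE the
# summary column v is expected to be a matrix with one row per group.
r_simple <- aggregate(d, g, range)

# With simplify = FALSE the per-group summaries are left unsimplified,
# so the summary column is expected to be a list.
r_list <- aggregate(d, g, range, simplify = FALSE)

print(is.matrix(r_simple$v))
print(is.list(r_list$v))
```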

Value

a data frame with one column for each element of by and each column of x. The columns arising from by contain each unique combination of values in the grouping vectors (excluding combinations not seen in the data). These columns have the data class "factor". The columns arising from x contain the value of FUN applied to the partitions induced by the grouping vectors on each column of x.
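A short sketch with two grouping vectors shows how unobserved combinations are left out of the result. The names obs, sex, and stage are hypothetical illustrations.

```r
# Hypothetical data: four counts classified by two grouping vectors.
obs <- data.frame(count = c(1, 2, 3, 4))
sex <- c("f", "f", "m", "m")
stage <- c("young", "old", "young", "young")

# Three of the four possible sex/stage combinations occur in the data;
# the combination "m"/"old" never occurs, so it gets no row in the result.
res <- aggregate(obs, list(sex = sex, stage = stage), sum)
print(res)
```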

Note

If x has columns of various types, it might be difficult to find a summary function that works on all columns. Instead, it might be easier to use aggregate.data.frame on only certain columns of x.
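One way to follow this advice is to select the numeric columns before aggregating. This is a minimal sketch; the data frame mixed and the grouping list grp are hypothetical.

```r
# Hypothetical data frame mixing a character column with numeric columns.
mixed <- data.frame(id = c("p1", "p2", "p3", "p4"),
                    dose = c(1, 2, 3, 4),
                    resp = c(10, 20, 30, 40),
                    stringsAsFactors = FALSE)
grp <- list(arm = c("x", "y", "x", "y"))

# mean() is not meaningful for the character column "id", so keep only
# the numeric columns; drop = FALSE keeps the selection a data frame.
num_cols <- sapply(mixed, is.numeric)
res <- aggregate(mixed[, num_cols, drop = FALSE], grp, mean)
print(res)
```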

See Also

apply, by, merge, tapply.

Examples

# Mean of each of the four measurement columns, grouped by column 5 (species).
aggregate(Sdatasets::iris[,1:4], list(Sdatasets::iris[,5]), mean)
##      Group.1 Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1     setosa        5.006       3.428        1.462       0.246
## 2 versicolor        5.936       2.770        4.260       1.326
## 3  virginica        6.588       2.974        5.552       2.026
# Each row is its own group, so the group means equal the original values.
da <- data.frame(col1=c(1,2,3,4,5), col2=c(6,7,8,9,10))
by <- list(c("a", "b", "c", "d", "e"))
aggregate(da, by, mean)
##   Group.1 col1 col2
## 1       a    1    6
## 2       b    2    7
## 3       c    3    8
## 4       d    4    9
## 5       e    5   10
# Rows 1 and 4 share group "a"; rows 2 and 5 share group "b".
by <- list(c("a", "b", "c", "a", "b"))
aggregate(da, by, mean)
##   Group.1 col1 col2
## 1       a  2.5  7.5
## 2       b  3.5  8.5
## 3       c  3.0  8.0

Package stats version 6.1.9-33