aggregate
Compute Summary Statistics of Subsets of Data

Description

Splits data into subsets by time period or other factors and computes a summary statistic for each subset.
This function is an S Version 3 generic (see Methods). Methods can be written to handle specific S Version 3 classes of data. Classes that already have methods for this function include ts, data.frame, and formula.

Usage

aggregate(x, ...)
## Default S3 method:
aggregate(x, ...)
## S3 method for class 'ts':
aggregate(x, nfrequency = 1, FUN = sum, ndeltat = 1, ts.eps = getOption("ts.eps"), ...)
## S3 method for class 'formula':
aggregate(formula, data, FUN, ..., subset, na.action = na.omit)

Arguments

x a time series or a data frame. Currently, anything that is not a time series is converted to a data frame.
... arguments to pass to the specific method used. See aggregate.data.frame for acceptable arguments.
nfrequency an integer. The new frequency, that is, the new number of observations per unit of time.
FUN a function that can be applied to any column of x and that returns a single value.
ndeltat the new fraction of the sampling period between successive observations. It is used when nfrequency is not supplied.
ts.eps the tolerance used to decide whether nfrequency is a sub-multiple of the original frequency.
formula a modeling formula. Must have both left and right hand sides: the left side is either an expression (including a variable) or a list of expressions in a call to cbind.
na.action a function to filter missing data. The default (na.omit) deletes observations that contain one or more missing values. One possible alternative is na.fail, which reports an error if any missing values are found.
data a data frame containing the variables named in formula.
subset an expression, evaluated as in the modelling functions, that selects a set of rows from data.
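
The formula-method arguments can be illustrated with a small sketch. The data frame df and its columns (group, height, weight) are invented for illustration, and na.action is passed explicitly so the example does not depend on the default.

```r
# Hypothetical data: two groups, one missing height.
df <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  height = c(1, 3, 2, 4, NA),
  weight = c(10, 20, 30, 40, 50)
)

# Single variable on the left-hand side: mean weight per group.
one <- aggregate(weight ~ group, data = df, FUN = mean, na.action = na.omit)

# Several variables on the left-hand side, combined with cbind.
# na.omit drops the row whose height is NA before the means are computed.
two <- aggregate(cbind(height, weight) ~ group, data = df, FUN = mean,
                 na.action = na.omit)

# subset selects rows of data before aggregation.
heavy <- aggregate(weight ~ group, data = df, FUN = mean,
                   subset = weight > 10, na.action = na.omit)
```

Passing na.action = na.fail instead would be expected to stop the cbind call with an error, because height contains a missing value.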

Details

Each method accepts a function for computing the summary statistic. This function should always return a scalar.
In aggregate.ts, if nfrequency is not provided, it is set to 1/ndeltat. If nfrequency is the same as the old frequency of x, x is returned with no change. The original x is split into several blocks of length frequency(x) / nfrequency, and then FUN is applied to each block. A new time series is returned with the frequency nfrequency. For multivariate time series, each column is processed independently.
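
The block-splitting rule for aggregate.ts can be checked on a small series. This is a minimal sketch, assuming the method behaves as described above; the series x and m and their values are made up for illustration.

```r
# Hypothetical monthly series: 24 observations, 12 per year.
x <- ts(1:24, frequency = 12, start = c(2000, 1))

# frequency(x) / nfrequency = 12 / 4 = 3, so each block holds 3 months.
q <- aggregate(x, nfrequency = 4, FUN = sum)

# The same block sums computed by hand: one 3-row column per block.
by_hand <- colSums(matrix(1:24, nrow = 3))

# With nfrequency omitted it is taken as 1 / ndeltat, here 1 / 0.25 = 4.
q2 <- aggregate(x, ndeltat = 0.25, FUN = sum)

# FUN may be any function that returns a single value per block,
# for example the difference between the largest and smallest value.
spread <- aggregate(x, nfrequency = 4, FUN = function(b) max(b) - min(b))

# Multivariate series: each column is aggregated on its own.
m <- ts(cbind(a = 1:24, b = 24:1), frequency = 12, start = c(2000, 1))
yearly <- aggregate(m, nfrequency = 1, FUN = sum)
```

Here q and q2 should hold the same eight quarterly sums as by_hand, and yearly should hold two annual sums per column.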

Value

returns a time series or a data frame containing the summary statistics.

See Also

aggregate.data.frame, by, tapply.

Examples

# compute 50 year averages of sunspot numbers from monthly numbers
aggregate(sunspots, ndeltat=50, FUN=mean)

# Compute regional averages of demographic data
aggregate(Sdatasets::state.x77[,2:4], list(Region=Sdatasets::state.region), FUN=mean)

aggregate(weight ~ feed, data = chickwts, mean)

Package stats version 6.0.0-69