sweep
Sweep Out Array Summaries
Description
Returns an array like the input x with STATS swept out.
Usage
sweep(x, MARGIN, STATS, FUN = "-", check.margin = TRUE, ...)
Arguments
x | an array. Missing values (NAs) are allowed. |
MARGIN | a numeric or character vector identifying the dimensions of x that correspond to STATS. If character strings are given, they must be a subset of names(dimnames(x)). |
STATS | a vector or an array giving the summary statistic of the array x that is to be swept out. Missing values (NAs) are allowed. |
FUN | a function, or a character string naming a function, to be used in the sweep operation. |
check.margin | a logical flag. If TRUE (the default), sweep checks whether the length or dim of STATS matches the dimensions of x selected by MARGIN, and generates a warning if it does not. If FALSE, no check is made. |
... | additional arguments to FUN, if any. |
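As a small illustration of the MARGIN argument, the sketch below sweeps column means out of a matrix twice, once selecting the column dimension by index and once by name. The matrix and the dimension names "row" and "col" are made up for this example, and the by-name call assumes the implementation in use accepts dimension names for MARGIN as described above.

```r
# Hypothetical example data: a 2 x 3 matrix with named dimensions.
x <- matrix(1:6, nrow = 2,
            dimnames = list(row = c("r1", "r2"), col = c("a", "b", "c")))
col_means <- colMeans(x)

by_index <- sweep(x, 2, col_means)      # MARGIN given as a dimension index
by_name  <- sweep(x, "col", col_means)  # MARGIN given as a dimension name

# Both calls should select the same dimension and give the same result.
identical(by_index, by_name)
```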
Details
sweep creates an array with the same dimensions as x by
copying the data in STATS across the dimensions in x
not specified by MARGIN.
Then, the function FUN is applied to the two arguments, x,
and the new array constructed from STATS,
followed by any additional arguments passed as the ... argument
to sweep.
sweep returns the value returned by FUN.
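The two steps described above can be reproduced by hand for a matrix. This is only a sketch of the idea for the two-dimensional case with MARGIN = 2, not the actual implementation; the data and the variable names are invented for the illustration.

```r
x <- matrix(1:12, nrow = 3)    # 3 x 4 example matrix
stats <- c(10, 20, 30, 40)     # one value per column (MARGIN = 2)

# Step 1: copy STATS down the rows so it has the same dimensions as x.
expanded <- matrix(stats, nrow = nrow(x), ncol = ncol(x), byrow = TRUE)

# Step 2: apply FUN (here "-") to x and the expanded array.
manual <- x - expanded

# The same operation done by sweep.
swept <- sweep(x, 2, stats, FUN = "-")
identical(manual, swept)
```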
Usually FUN is a function that operates element-by-element
on each value in x and in the constructed array,
so sweep returns an array with the same dimensions as x,
but FUN (and thus sweep) could return anything.
Missing values in x and in the constructed array are passed to FUN,
so FUN is responsible for handling them.
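The sketch below contrasts an element-by-element FUN, where a missing value simply propagates into the result, with a FUN that handles missing values itself and returns a single number instead of an array. The data and the na.rm argument of the anonymous function are assumptions made for this illustration; na.rm reaches FUN through the ... argument of sweep.

```r
x <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)   # 2 x 3, one missing value
col_stats <- c(1, 2, 3)

# Element-by-element FUN: the result has the dimensions of x,
# and the NA in x stays NA in the result.
centered <- sweep(x, 2, col_stats)

# A FUN that collapses its input: sweep returns whatever FUN returns.
# na.rm is passed on to FUN through the ... argument.
total <- sweep(x, 2, col_stats,
               FUN = function(a, b, na.rm) sum(a - b, na.rm = na.rm),
               na.rm = TRUE)

dim(centered)
total
```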
If check.margin is TRUE, a warning is generated if the length or dim of
STATS does not match the dimensions of the subarray of x specified by
MARGIN.
sweep checks for three types of mismatches:
- length(STATS) is larger than prod(dim(x)[MARGIN]).
- length(STATS) is smaller than prod(dim(x)[MARGIN]) and the elements of STATS cannot be evenly recycled across MARGIN.
- dim(STATS) is not NULL, and it does not match dim(x)[MARGIN]
(ignoring dimensions of length 1).
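Each of the three mismatches can be triggered on a small matrix. In the sketch below, warns is a hypothetical helper written for this illustration (it is not part of sweep); it reports whether evaluating an expression signals a warning. The exact warning messages are not shown because they may differ between implementations.

```r
x <- matrix(1:12, nrow = 3)   # dim(x)[2] is 4

# Hypothetical helper: TRUE if evaluating expr signals a warning.
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

too_long   <- warns(sweep(x, 2, 1:5))                # length 5 exceeds 4
no_recycle <- warns(sweep(x, 2, 1:3))                # 3 does not divide 4
bad_dim    <- warns(sweep(x, 2, matrix(1:4, 2, 2)))  # dim c(2, 2), not 4
matching   <- warns(sweep(x, 2, 1:4))                # lengths agree
unchecked  <- warns(sweep(x, 2, 1:3, check.margin = FALSE))  # check skipped

c(too_long, no_recycle, bad_dim, matching, unchecked)
```

The first three calls are expected to warn. The last two are not: in one the lengths agree, and in the other the check is switched off, so the mismatched STATS is recycled silently.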
In the most common case, STATS is computed by calling the apply function
on an array, and FUN is "-" or "/", which subtracts these statistics from
the array elements or divides the elements by them.
For example, colmeans <- apply(z,2,mean) computes the column means of
array z, and zcenter <- sweep(z,2,colmeans) subtracts these column means
from the elements of z.
sweep constructs an array from colmeans so that the column mean for each
column is subtracted from all of the elements of z in that column.
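The centering example can be run on a concrete matrix and extended with a "/" step. The matrix z and the names colsds and zscaled are made up for this sketch. Dividing by the column standard deviations is just one possible use of "/".

```r
z <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)   # hypothetical 3 x 2 data

# Subtract the column means (FUN defaults to "-").
colmeans <- apply(z, 2, mean)
zcenter  <- sweep(z, 2, colmeans)

# Divide each centered column by its standard deviation.
colsds  <- apply(z, 2, sd)
zscaled <- sweep(zcenter, 2, colsds, FUN = "/")

apply(zscaled, 2, mean)   # column means after centering and scaling
apply(zscaled, 2, sd)     # column standard deviations after scaling
```

After both steps, each column of zscaled has mean 0 and standard deviation 1.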
Value
Returns the value returned by FUN. With the usual element-by-element FUN,
this is an array like x, but with marginal statistics swept out, as
defined by the other arguments.
Examples
# Calculate and subtract column medians
z <- array(c(2,3,7,11,13,17,19,23,29,31,31,31), dim=c(3,4))
colmeds <- apply(z, 2, median)
centered <- sweep(z, 2, colmeds)
centered
# Now standardize columns to median absolute deviation 1, if possible
sweep(centered, 2, apply(centered, 2, mad),
      FUN = function(x, y) ifelse(y > 0, x / y, x))