cut
Create Factor Object from Numeric Vector
Description
Creates a factor object by dividing a numeric vector into a
certain number of intervals or defined ranges.
Usage
cut(x, ...)
cut.default(x, breaks, labels = NULL, include.lowest = FALSE,
right = TRUE, dig.lab = 3, ordered_result = FALSE, ...)
.bincode(x, breaks, right = TRUE, include.lowest = FALSE)
Arguments
x |
a numeric vector to divide into a factor. Missing values (NAs)
are allowed.
|
breaks |
For the default method of cut,
an integer or numeric vector that defines the breakpoints. A single
number, 2 or more, specifies the number of equal-width
intervals for x. If a vector of breakpoints is specified, the
result has length(breaks)-1 groups, which correspond to
data in the intervals between successive values in breaks.
For .bincode, only a numeric vector of breakpoints,
of length more than 1, is accepted.
|
labels |
a character vector that specifies a label for each interval.
- If the labels argument is not specified (the default),
the breakpoints are encoded to create the interval names in the
form "(lower.limit, upper.limit]" or, if right = FALSE,
"[lower.limit, upper.limit)".
- If the labels argument is specified, the number of
labels must be one fewer than the number of breakpoints.
- If FALSE, the return value is a vector of integers
instead of a factor.
- If TRUE, it is interpreted as one label "TRUE".
|
include.lowest |
a logical value that specifies the inclusion or exclusion of an
endpoint in the lowest or highest interval depending on the value
of right.
- When right = TRUE: if include.lowest is FALSE (the default), all
intervals include the upper endpoint but not the lower endpoint.
If include.lowest is TRUE, the lowest interval also includes its
lower endpoint.
- When right = FALSE: if include.lowest is FALSE (the default),
all intervals include the lower endpoint but not the upper endpoint.
If include.lowest is TRUE, the highest interval also includes its
upper endpoint.
|
right |
a logical value. If TRUE (the default for cut.default and .bincode,
as shown in Usage), each interval includes its upper endpoint but
not its lower endpoint. If FALSE, each interval includes its lower
endpoint instead of its upper endpoint.
The value of right also determines which end include.lowest
affects: when right = FALSE, include.lowest = TRUE makes the
highest interval include its upper endpoint.
For timeDate related classes such as positionsCalendar
and timeSpan, the default is TRUE.
|
dig.lab |
an integer that is used when labels are not given. It is used
to specify the number of digits to use when formatting the break
numbers. The default value is 3.
|
ordered_result |
a logical value. If TRUE, the result is turned into an ordered
factor before returning. The default is FALSE.
|
... |
further arguments for other methods.
|
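As a rough illustration of the labels and ordered_result arguments, the following sketch cuts a small made-up vector. The data in x and the label names "low" and "high" are assumptions chosen only for this example.

```r
# Hypothetical data: four values split at 5 into two intervals, (0,5] and (5,10]
x <- c(1, 4, 6, 9)

# Two intervals, so exactly two labels (one fewer than the number of breakpoints)
f <- cut(x, breaks = c(0, 5, 10), labels = c("low", "high"))
levels(f)        # "low" "high"
as.character(f)  # "low" "low" "high" "high"

# Same call, but returning an ordered factor
o <- cut(x, breaks = c(0, 5, 10), labels = c("low", "high"),
         ordered_result = TRUE)
is.ordered(f)    # FALSE
is.ordered(o)    # TRUE
```

An ordered result is mainly useful when the groups will later be compared or sorted by level.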
Details
- If right = TRUE (the default), then values that are less than
or equal to the first breakpoint or greater than the last breakpoint
are returned as missing values (NAs). Each interval consists
of values that are greater than one breakpoint and less
than or equal to the next breakpoint. However, if
include.lowest = TRUE, then the lowest group also includes
values equal to the lowest breakpoint.
- If right = FALSE, then values that are less than the first
breakpoint or greater than or equal to the last breakpoint are
returned as missing values (NAs). Each interval consists of
values that are greater than or equal to one breakpoint and less
than the next breakpoint. However, if include.lowest = TRUE,
the upper group also includes data equal to the highest breakpoint.
In either case, missing values in x create missing values in the result.
The cut function is generic.
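The endpoint rules above can be checked with a small sketch in which every value of x sits exactly on a breakpoint. The vectors x and b are made up for this example, and labels = FALSE is used only so that the interval codes are easy to read.

```r
# Hypothetical data: each value lies exactly on a breakpoint
x <- c(1, 5, 10)
b <- c(1, 5, 10)

# right = TRUE (default): intervals are (1,5] and (5,10]
r1 <- cut(x, b, labels = FALSE)                          # NA 1 2
r2 <- cut(x, b, labels = FALSE, include.lowest = TRUE)   # 1 1 2

# right = FALSE: intervals are [1,5) and [5,10)
l1 <- cut(x, b, labels = FALSE, right = FALSE)           # 1 2 NA
l2 <- cut(x, b, labels = FALSE, right = FALSE,
          include.lowest = TRUE)                         # 1 2 2

# A missing value in x gives a missing value in the result
m <- cut(c(NA, 3), b, labels = FALSE)                    # NA 1
```

Note that include.lowest only changes the classification of a value equal to the outermost breakpoint on the open side; values strictly outside the range of breaks are still returned as NA.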
Value
cut returns either a factor or an integer vector.
- If labels is not FALSE, returns a factor or ordered
factor of the same length as x that identifies which
group each point in x belongs to, along with an attribute,
levels, which is a vector of character names for each group.
- If labels is FALSE, returns an integer vector (instead of a factor)
of the same length as x that identifies which group each point in x
belongs to.
.bincode returns an integer vector, the same as what
cut(labels=FALSE, ...) would return.
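A minimal sketch of the three kinds of return value described above, using made-up data (x and b are assumptions for this example):

```r
# Hypothetical data: the last value falls outside the breakpoints
x <- c(2, 7, 12)
b <- c(0, 5, 10)

# Default: a factor of the same length as x, with one level per interval
f <- cut(x, b)
is.factor(f)           # TRUE
length(f) == length(x) # TRUE
nlevels(f)             # 2
is.na(f)               # FALSE FALSE TRUE

# labels = FALSE: integer group codes instead of a factor
codes <- cut(x, b, labels = FALSE)   # 1 2 NA

# .bincode called with the same breakpoints and the same defaults
bin <- .bincode(x, b, right = TRUE, include.lowest = FALSE)
all.equal(codes, bin)  # TRUE
```

Calling .bincode directly may be convenient when only the integer codes are needed and the breakpoints are already known.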
See Also
Examples
x <- 1:10
cut(x, 3) # cut into 3 groups
cut(x, c(0,5,11)) # cut based on given breakpoints
cut(x, pretty(x)) # approx 5 "pretty" intervals
cut(x, c(1,5,10), right = FALSE, include.lowest = TRUE) # left-closed intervals; 10 falls in the highest one