factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)
is.factor(x)
as.factor(x)
x | the data, considered as taking values from a finite set (the levels). Missing values (NAs) are allowed. |
levels | an optional vector of levels for the factor. Any data value not matching a value in levels is assigned to NA in the factor. |
labels | an optional vector of values to use as labels for the levels of the factor. The default is levels. |
exclude | a vector of values to exclude from forming levels. Any value that appears in both x and exclude is NA in the result, and it does not appear in the default levels attribute. |
ordered | a logical value. If TRUE, the result is an ordered factor (class "ordered","factor"). The default is is.ordered(x). |
nmax | a positive integer or NA. If it is a positive integer and the levels argument is omitted, it is an error if x contains more than nmax distinct values. |
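The interaction of levels, labels, and ordered described above can be sketched as follows. This is a minimal illustration, not part of the original page: the `sizes` vector and its values are made up, and the variable names `f`, `g`, and `o` are hypothetical.

```r
# Hypothetical data, invented for illustration only.
sizes <- c("small", "large", "medium", "small", "huge")

# A value that is not listed in `levels` ("huge") becomes NA.
f <- factor(sizes, levels = c("small", "medium", "large"))
is.na(f)

# `labels` renames the levels position by position.
g <- factor(sizes, levels = c("small", "medium", "large"),
            labels = c("S", "M", "L"))
levels(g)

# With ordered = TRUE, the order of `levels` drives comparisons.
o <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)
o[1] < o[2]   # compares "small" with "large"
```

Listing the levels explicitly is what fixes their order. When levels is omitted, the levels are taken from the sorted distinct values of x.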
factor | returns an object of class "factor" or class "ordered","factor" representing values taken from the finite set given by levels. A factor should not be treated as numeric. Comparisons and other operations behave as if they operated on values from the levels set, which is always of mode character. NAs can appear, indicating that the corresponding value is undefined. |
is.factor | returns TRUE if x is a factor. Otherwise, it returns FALSE. |
as.factor | returns x if x is a factor. Otherwise, it returns factor(x). |
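The return values described above can be checked with a short sketch. The data here is hypothetical, chosen to show why a factor should not be treated as numeric even when its levels look like numbers.

```r
# Hypothetical data: numbers stored as character strings in a factor.
f <- factor(c("10", "20", "10", "30"))

is.factor(f)               # TRUE
is.factor(c("10", "20"))   # FALSE: a character vector is not a factor

# as.factor() returns a factor unchanged; other input goes through factor().
identical(as.factor(f), f)
levels(as.factor(c("b", "a", "b")))

# as.numeric() on a factor yields the internal level codes,
# not the numbers spelled out in the levels.
codes  <- as.numeric(f)
values <- as.numeric(as.character(f))   # convert via the character levels

# Comparisons work against the character levels.
f == "10"
```

Here `codes` holds the positions of each value in `levels(f)`, while `values` recovers the original numbers by converting through character first.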
occupation <- c("doctor", "lawyer", "mechanic", "engineer")
income <- c(150000, 100000, 30000, 60000)
factor(occupation)
factor(cut(income, breaks = c(0, 30000, 70000, 200000)),
       labels = c("low", "mid", "high"))
# Make readable labels (levels must match the values in occupation):
occ <- factor(occupation,
              levels = c("doctor", "lawyer", "mechanic", "engineer"),
              labels = c("Doctor", "Lawyer", "Mechanic", "Engineer"))
color <- c("red", "red", "red", "green", "blue")
colors <- factor(color, c("red", "green", "blue"))
table(colors) # table counting occurrences of colors
# Treat word "Unknown" as a missing value flag:
colors <- factor(c("red", "green", "Unknown", "blue"), exclude = "Unknown")
is.na(colors) # 3rd value will be TRUE, the rest FALSE
# Function to create a factor if there are many repeats,
# otherwise return the input x
asFactorIfManyRepeats <- function(x, nmax = max(length(x)/5, 2)) {
  tryCatch(factor(x, nmax = nmax), error = function(e) x)
}
levels(asFactorIfManyRepeats(letters))
levels(asFactorIfManyRepeats(rep(letters[1:3], length.out = 100)))