nchar
Lengths of Character Strings

Description

nchar returns a vector of the sizes (in characters, bytes, or display width) of the character strings in the input.
nzchar is a convenience function that tests whether each string in a character vector is non-empty.

Usage

nchar(x, type = "chars", allowNA = FALSE, keepNA = FALSE)
nzchar(x, keepNA = FALSE)

Arguments

x a character vector or another object that can be converted to a character vector.
type a character string specifying how to calculate the result. type should be either "bytes", "chars", or "width", or it should be a prefix of one of these strings.
allowNA a logical flag. If TRUE and nchar cannot calculate a result, nchar returns NA instead of generating an error. The default is FALSE.
keepNA a logical flag. If TRUE, nchar and nzchar will map missing values (NA) in x to missing values. If FALSE, the default, nchar will map missing values to 2 (the same as nchar("NA")) and nzchar will map them to TRUE.
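
The effect of keepNA described above can be illustrated with a small sketch. The vector s is a hypothetical input chosen for illustration, and keepNA is passed explicitly in every call so that the result does not depend on the default.

```r
# Hypothetical input: one ordinary string, one missing value, one empty string.
s <- c("abc", NA, "")

# keepNA = FALSE: the missing value is counted as the two characters of "NA".
len_default <- nchar(s, keepNA = FALSE)
# [1] 3 2 0

# keepNA = TRUE: the missing value stays missing in the result.
len_keep <- nchar(s, keepNA = TRUE)
# [1]  3 NA  0

# nzchar follows the same rule: NA maps to TRUE unless keepNA = TRUE.
nz_default <- nzchar(s, keepNA = FALSE)
# [1]  TRUE  TRUE FALSE
nz_keep <- nzchar(s, keepNA = TRUE)
# [1]  TRUE    NA FALSE
```

Setting keepNA = TRUE is usually the safer choice when missing values must not be mistaken for two-character strings.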

Details

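The type argument selects how the size of each character string is measured: "bytes" counts the bytes in the string, "chars" counts its characters, and "width" counts the columns the string occupies when printed in a monospaced font. These three types usually produce the same value; they differ only when the string contains multibyte characters or wide Unicode characters.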
If a string has the 'bytes' encoding (see Encoding), it cannot be interpreted as a sequence of characters when the type argument is "chars" or "width". In this case, nchar generates an error (if allowNA is FALSE), or produces an NA value for the size (if allowNA is TRUE). If type is "bytes", it always produces the number of bytes in the string, whether or not the bytes can be interpreted as characters.
Note: x is coerced to character, regardless of what it currently contains. This coercion always works, but in the case of non-atomic data the result might be surprising.
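
As a minimal sketch of the coercion described in the note, the calls below pass atomic non-character values. The inputs are hypothetical and restricted to simple numeric and logical vectors, where the character representation is unambiguous.

```r
# A number is converted to its character representation before counting.
n_num <- nchar(12345)            # "12345"
# [1] 5

# Logical values are counted as the strings "TRUE" and "FALSE".
n_lgl <- nchar(c(TRUE, FALSE))
# [1] 4 5
```

Because the count applies to the converted text, it is often clearer to call as.character on the input first and inspect the result before measuring it.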

Value

nchar returns a numeric vector the same length as x that contains the size of each element of x, converted to a character string.
nzchar returns a logical vector the same length as x. Each element is TRUE if the corresponding element of x is a non-empty string, and FALSE otherwise.

Differences between TIBCO Enterprise Runtime for R and Open-source R

In open-source R, nchar generates an error (or returns NA if allowNA is TRUE) if the string contains an invalid multi-byte character sequence and the type argument is "chars" or "width".

See Also

character, Encoding.

Examples

# example strings, including a wide Japanese character
x <- c('a','bb','hello', '\u30A4')
nchar(x)
# [1] 1 2 5 1

nchar(x, "bytes")
# [1] 1 2 5 3
# '\u30A4' is represented as 3 bytes in UTF-8

nchar(x, "width")
# [1] 1 2 5 2
# '\u30A4' is a double-wide character

z <- rawToChar(as.raw(1:255))
Encoding(z) <- 'bytes'
Encoding(z)
# [1] "bytes"
nchar(z)
# Error in nchar(z) : cannot determine character count for
# 'bytes' encoding
nchar(z, allowNA=TRUE)
# [1] NA
nchar(z, 'bytes')
# [1] 255

nzchar(c("12.34", "test", "", "\u30A4", z))
# [1] TRUE TRUE FALSE TRUE TRUE

Package base version 6.0.0-69