Lengths of Character Strings

nchar

Description

nchar returns a vector containing the size of each character string in the input, measured in characters, bytes, or display width.
nzchar tests whether each element of a character vector is a non-empty string.

Usage

nchar(x, type = "chars", allowNA = FALSE, keepNA = FALSE)
nzchar(x, keepNA = FALSE)

Arguments

x	a character vector or another object that can be converted to a character vector.
type	a character string specifying how to calculate the result. type should be either "bytes", "chars", or "width", or it should be a prefix of one of these strings.
allowNA	a logical flag. If TRUE, nchar returns NA instead of generating an error when it cannot calculate a result. The default is FALSE.
keepNA	a logical flag. If TRUE, nchar and nzchar will map missing values (NA) in x to missing values. If FALSE, the default, nchar will map missing values to 2 (the same as nchar("NA")) and nzchar will map them to TRUE.
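
The effect of keepNA described above can be sketched as follows. This is a minimal illustration, not part of the original examples; the variable names are arbitrary, and keepNA is passed explicitly so the result does not depend on the default.

```r
# A character vector with a missing value and an empty string
x <- c("abc", NA, "")

# keepNA = FALSE: the missing value is counted like the string "NA"
len_drop <- nchar(x, keepNA = FALSE)   # 3 2 0
# keepNA = TRUE: the missing value stays missing
len_keep <- nchar(x, keepNA = TRUE)    # 3 NA 0

# nzchar follows the same rule
nz_drop <- nzchar(x, keepNA = FALSE)   # TRUE TRUE FALSE
nz_keep <- nzchar(x, keepNA = TRUE)    # TRUE NA FALSE
```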

Details

The type argument specifies how the size of a character string is measured. It can be one of the following three types:

"chars": the number of human-readable characters.
"bytes": the number of bytes needed to store the string.
"width": the number of columns cat would use to print the string in a monospaced font.

These three types produce the same value unless the string contains multibyte characters or wide Unicode characters.

If a string has the 'bytes' encoding (see Encoding), it cannot be interpreted as a sequence of characters when the type argument is "chars" or "width". In this case, nchar generates an error (if allowNA is FALSE), or produces an NA value for the size (if allowNA is TRUE). If type is "bytes", it always produces the number of bytes in the string, whether or not the bytes can be interpreted as characters.
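
One way to use this behavior is to ask for the character count with allowNA = TRUE and fall back to the byte count where no character count is available. The sketch below assumes a hypothetical helper named safe_nchar, which is not part of the base package.

```r
# Hypothetical helper: character count where it can be computed,
# byte count otherwise.
safe_nchar <- function(x) {
  # NA marks elements whose character count cannot be determined
  n <- nchar(x, type = "chars", allowNA = TRUE, keepNA = FALSE)
  fallback <- is.na(n)
  # type = "bytes" works even for strings with the 'bytes' encoding
  n[fallback] <- nchar(x[fallback], type = "bytes")
  n
}

# A two-byte string marked with the 'bytes' encoding
z <- rawToChar(as.raw(c(0xff, 0xfe)))
Encoding(z) <- "bytes"

sizes <- safe_nchar(c("hello", z))
sizes
# expected: 5 2
```

The result mixes units (characters for "hello", bytes for z), so a helper like this only suits cases where an approximate size is acceptable.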

Note: x is coerced to character, regardless of what it currently contains. This coercion always works, but in the case of non-atomic data the result might be surprising.
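
The coercion in the note above can be illustrated with a few non-character inputs. This is a sketch only: the values are arbitrary, and the list case is printed without an expected result because the converted text is exactly the part that might be surprising.

```r
# Atomic inputs are measured after conversion to character
n_num <- nchar(12345)   # "12345" has 5 characters
n_lgl <- nchar(TRUE)    # "TRUE" has 4 characters

# For non-atomic data, inspect the converted text before trusting the count
lst <- list(1:3, "ab")
as.character(lst)
nchar(lst)
```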

Value

nchar	returns a numeric vector of the same length as x. Each element is the size of the corresponding element of x after it is converted to a character string.
nzchar	returns a logical vector of the same length as x. Each element is TRUE if the corresponding element of x is a non-empty string, and FALSE otherwise.
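
Because nzchar returns a logical vector, it can be used directly as a subscript or summed. The following is a small sketch with made-up data; the variable names are illustrative.

```r
# Made-up input with two empty strings
fields <- c("alpha", "", "beta", "")

# Keep only the non-empty strings
non_empty <- fields[nzchar(fields)]

# Count the empty strings
n_empty <- sum(!nzchar(fields))

# For inputs without missing values, nzchar(x) agrees with a byte-count test
same <- identical(nzchar(fields), nchar(fields, type = "bytes") > 0)
```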

Differences between TIBCO Enterprise Runtime for R and Open-source R

In open-source R, nchar generates an error (or returns NA if allowNA is TRUE) if the string contains an invalid multi-byte character sequence and the type argument is "chars" or "width".

See Also

character, Encoding.

Examples

# example strings, including a wide Japanese character
x <- c('a','bb','hello', '\u30A4')
nchar(x)
# [1] 1 2 5 1
nchar(x, "bytes")
# [1] 1 2 5 3
# '\u30A4' is represented as 3 bytes in UTF-8
nchar(x, "width")
# [1] 1 2 5 2
# '\u30A4' is a double-wide character
# a string marked with the 'bytes' encoding
z <- rawToChar(as.raw(1:255))
Encoding(z) <- 'bytes'
Encoding(z)
# [1] "bytes"
nchar(z)
# Error in nchar(z) : cannot determine character count for 
# 'bytes' encoding
nchar(z, allowNA=TRUE)
# [1] NA
nchar(z, 'bytes')
# [1] 255
nzchar(c("12.34", "test", "", "\u30A4", z))
# [1]  TRUE  TRUE FALSE  TRUE  TRUE

Package base version 6.0.0-69