nchar
Lengths of Character Strings
Description
nchar returns a vector containing the size (in characters, bytes, or
display width) of each character string in the input.
nzchar is a convenience function that tests whether each element of a
character vector is a non-empty string.
Usage
nchar(x, type = "chars", allowNA = FALSE, keepNA = FALSE)
nzchar(x, keepNA = FALSE)
Arguments
x |
a character vector or another object that can be converted to a character
vector.
|
type |
a character string specifying how to calculate the result. type
should be either "bytes", "chars", or "width",
or it should be a prefix of one of these strings.
|
allowNA |
a logical flag. If TRUE and nchar cannot calculate a result,
nchar returns NA rather than generating an error. The
default is FALSE.
|
keepNA |
a logical flag. If TRUE, nchar and nzchar will map
missing values (NA) in x to missing values. If FALSE,
the default, nchar will map missing values to 2 (the same
as nchar("NA")) and nzchar will map them to TRUE.
|
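The effect of keepNA can be sketched with a short vector that contains a missing value. This is a minimal illustration that passes keepNA explicitly in every call; the variable names are arbitrary.

```r
# A character vector with a missing value and an empty string
x_na <- c("abc", NA, "")

# keepNA = FALSE: NA is counted like the two-character string "NA"
n_drop <- nchar(x_na, keepNA = FALSE)    # 3 2 0
z_drop <- nzchar(x_na, keepNA = FALSE)   # TRUE TRUE FALSE

# keepNA = TRUE: NA is propagated to the result
n_keep <- nchar(x_na, keepNA = TRUE)     # 3 NA 0
z_keep <- nzchar(x_na, keepNA = TRUE)    # TRUE NA FALSE
```

Setting keepNA = TRUE is usually the safer choice when missing values must stay distinguishable from the literal string "NA".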
Details
The argument type specifies how the size of a character string is
calculated. It takes one of the following three values:
- "chars": the number of human-readable characters.
- "bytes": the number of bytes needed to store the string.
- "width": the number of columns cat would use to
print the string in a monospaced font.
Usually these three types produce the same value,
unless the string contains multibyte characters or wide
Unicode characters.
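As a small sketch of when the types diverge, the following compares an ASCII letter with an accented letter written as a Unicode escape. It assumes the escaped string is stored as UTF-8, and the abbreviation "b" relies on the prefix matching described under Arguments.

```r
s <- c("e", "\u00E9")   # plain 'e' and 'e' with an acute accent

n_chars <- nchar(s, type = "chars")   # 1 1: one character each
n_bytes <- nchar(s, type = "b")       # 1 2: "b" abbreviates "bytes"
n_width <- nchar(s, type = "width")   # display width in columns
```

The accented letter is a single character but needs two bytes in UTF-8, so only the "bytes" result differs from the character count here.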
If a string has the 'bytes' encoding (see Encoding), it cannot be
interpreted as a sequence of characters when the type argument is
"chars" or "width". In this case, nchar generates an error (if
allowNA is FALSE), or produces an NA value for the size (if
allowNA is TRUE). If type is "bytes", nchar always produces the
number of bytes in the string, whether or not the bytes can be
interpreted as characters.
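One way to use allowNA in practice is to fall back to the byte count whenever the character count cannot be computed. The helper below is a hypothetical sketch (safe_nchar is not part of the API), and the two-byte test string is an arbitrary fragment chosen for illustration.

```r
# Hypothetical helper: character count where it can be computed,
# byte count for strings that cannot be read as characters.
safe_nchar <- function(x) {
  n <- nchar(x, type = "chars", allowNA = TRUE, keepNA = FALSE)
  unknown <- is.na(n)
  n[unknown] <- nchar(x[unknown], type = "bytes")
  n
}

# Two raw bytes marked with the 'bytes' encoding
b <- rawToChar(as.raw(c(0xE3, 0x82)))
Encoding(b) <- "bytes"

res <- safe_nchar(c("abc", b))   # 3 for "abc", 2 (bytes) for b
```

Because keepNA is FALSE in the first call, an NA in the intermediate result can only mean that the count was not computable, which is what makes the fallback unambiguous.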
Note: x is coerced to character, regardless of what it currently
contains.
This coercion always works, but in the case of non-atomic data the
result might be surprising.
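The coercion can be previewed with as.character, which shows the string whose size is measured. The values below are a sketch that assumes the usual R conversion rules for numbers and lists.

```r
# Numbers are measured by their character representation
n_int <- nchar(123456)             # "123456" has 6 characters
s_sci <- as.character(1e5)         # "1e+05" under the usual formatting rules
n_sci <- nchar(1e5)                # size of that string, not the digit count

# A list element is converted to a deparsed text form
s_list <- as.character(list(c(1, 2, 3)))   # "c(1, 2, 3)"
n_list <- nchar(s_list)
```

When the size of a specific printed form matters, it is safer to build the string explicitly (for example with format or as.character) and pass that string to nchar.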
Value
nchar | returns a numeric vector of the same length as x, containing the
size of each element of x after the element is converted to a character string. |
nzchar | returns a logical vector of the same length as x. An element is
TRUE if the corresponding element of x is a non-empty string, and FALSE otherwise. |
Differences between TIBCO Enterprise Runtime for R and Open-source R
In open-source R, nchar generates an error (or returns NA if
allowNA is TRUE) if the string contains an invalid
multi-byte character sequence and the type argument is "chars" or "width".
See Also
Examples
# example strings, including a wide Japanese character
x <- c('a','bb','hello', '\u30A4')
nchar(x)
# [1] 1 2 5 1
nchar(x, "bytes")
# [1] 1 2 5 3
# '\u30A4' is represented as 3 bytes in UTF-8
nchar(x, "width")
# [1] 1 2 5 2
# '\u30A4' is a double-wide character
# a string built from raw bytes and marked with the 'bytes' encoding
z <- rawToChar(as.raw(1:255))
Encoding(z) <- 'bytes'
Encoding(z)
# [1] "bytes"
nchar(z)
# Error in nchar(z) : cannot determine character count for
# 'bytes' encoding
nchar(z, allowNA=TRUE)
# [1] NA
nchar(z, 'bytes')
# [1] 255
nzchar(c("12.34", "test", "", "\u30A4", z))
# [1] TRUE TRUE FALSE TRUE TRUE