Strings and Character Encodings

Rendezvous software uses strings in several roles:

String data inside message fields
Field names
Subject names (and other associated strings that are not strictly inside the message)
Certified delivery (CM) correspondent names
Group names (fault tolerance)

Encodings and Translation

Java programs represent all these strings in Unicode (specifically UTF-16, which uses 2-byte code units).

Before sending an outbound message, Rendezvous software translates these strings into the character encoding appropriate to the ISO locale.
Conversely, when extracting a string from an inbound message, Rendezvous software translates it to Unicode.

Rendezvous translates its strings as if the message used the default encoding (see Default Encoding, below). This assumption is not always correct (see Inbound Translation, below).
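The two directions of translation can be illustrated with standard Java alone. The following is a minimal sketch, not the Rendezvous implementation: the class name EncodingRoundTrip and the methods toWire and fromWire are hypothetical, and the wire encoding is assumed to be ISO 8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; it only mimics the two translation directions
// described above, using the standard java.nio.charset API.
class EncodingRoundTrip {
    // Outbound direction: Unicode string to bytes in the given encoding.
    static byte[] toWire(String s, Charset encoding) {
        return s.getBytes(encoding);
    }

    // Inbound direction: bytes in the given encoding back to a Unicode string.
    static String fromWire(byte[] wire, Charset encoding) {
        return new String(wire, encoding);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] wire = toWire(original, StandardCharsets.ISO_8859_1);
        // Latin-1 uses one byte per character, so this prints 4.
        System.out.println(wire.length);
        // Decoding with the same encoding restores the original string.
        System.out.println(fromWire(wire, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

The round trip is lossless here only because both directions use the same encoding, which is the assumption that automatic translation relies on.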

Default Encoding

The default encoding depends on the locale where Java is running. That is, the locale determines the value of the Java system property file.encoding, which in turn determines the translation scheme.

For example, the United States is locale en_US, and uses the Latin-1 character encoding (also called ISO 8859-1); Japan is locale ja_JP, and uses the Shift-JIS character encoding.

When the system property file.encoding is inaccessible, the default encoding is ISO 8859-1 (Latin-1). Programs can override the encoding that this system property would otherwise select; for details, see TibrvMsg.setStringEncoding().
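The fallback rule can be sketched in plain Java as follows. This is an illustration only, not the Rendezvous source: the class DefaultEncodingLookup and its methods are hypothetical, and falling back to Latin-1 when the property names an unknown charset is an added assumption (the text above covers only the inaccessible case).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the fallback rule described above.
class DefaultEncodingLookup {
    // Map an encoding name to a Charset, falling back to Latin-1.
    static Charset resolve(String name) {
        if (name == null) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Assumption: treat an illegal or unsupported name like a missing one.
            return StandardCharsets.ISO_8859_1;
        }
    }

    // Read file.encoding, tolerating environments that forbid access to it.
    static Charset defaultEncoding() {
        String name;
        try {
            name = System.getProperty("file.encoding");
        } catch (SecurityException e) {
            name = null; // property is inaccessible
        }
        return resolve(name);
    }

    public static void main(String[] args) {
        System.out.println(defaultEncoding().name());
    }
}
```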

Note 

Some browsers (for example, Microsoft Internet Explorer) do not permit programs to access the system property file.encoding. When programs attempt to access it, the browser throws a SecurityException. Although this is normal, and the program continues to run, the browser may nonetheless print a stack trace, indicating that the program cannot access that system property.

Outbound Translation

Outbound translation from Unicode to a specified encoding occurs when adding a string to a message.

Exotic Characters

A wire-format string can contain only characters that are valid in the encoding of the surrounding message. The translation procedure detects exotic characters (characters that have no representation in that encoding), and throws an exception with TibrvStatus.INVALID_ENCODING.
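A program can check a string before adding it to a message. The sketch below uses the standard CharsetEncoder.canEncode() method; the class ExoticCharacterCheck and the method isEncodable are hypothetical names, and the check is an approximation of what Rendezvous does internally, not its actual procedure.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check for characters that the target encoding cannot represent.
class ExoticCharacterCheck {
    // True when every character of s can be encoded in the given charset.
    static boolean isEncodable(String s, Charset encoding) {
        return encoding.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String latin = "caf\u00e9";       // representable in Latin-1
        String kanji = "\u65e5\u672c";    // two kanji, not representable in Latin-1
        System.out.println(isEncodable(latin, StandardCharsets.ISO_8859_1));
        System.out.println(isEncodable(kanji, StandardCharsets.ISO_8859_1));
        System.out.println(isEncodable(kanji, StandardCharsets.UTF_8));
    }
}
```

Note that String.getBytes(Charset) does not report this condition; it silently substitutes a replacement byte for each unmappable character, so an explicit check of this kind is the only way to detect the problem with the standard library alone.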

Inbound Translation

Inbound translation occurs before the program receives the data.

Automatic inbound translation is correct when two programs exchange messages within the same locale.

Warning 

In contrast, the automatic translation might be incorrect when the sender and receiver use different character encodings.

In this situation, the receiver must explicitly retranslate to the local encoding.
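One common way to retranslate is to recover the original bytes using the encoding that was wrongly assumed, then decode them with the sender's actual encoding. The sketch below shows this with standard Java only; the class Retranslate and its method are hypothetical, the choice of UTF-8 as the sender's encoding is an assumption made for illustration, and the receiver is assumed to know the sender's encoding by some other means.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for repairing a string decoded with the wrong encoding.
class Retranslate {
    // Re-encode with the encoding that was wrongly assumed, which recovers the
    // original wire bytes, then decode those bytes with the actual encoding.
    static String retranslate(String misdecoded, Charset assumed, Charset actual) {
        return new String(misdecoded.getBytes(assumed), actual);
    }

    public static void main(String[] args) {
        String sent = "\u65e5\u672c"; // two kanji
        // Assumed sender encoding: UTF-8.
        byte[] wire = sent.getBytes(StandardCharsets.UTF_8);
        // The receiver's automatic translation assumes Latin-1 and garbles the text.
        String garbled = new String(wire, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(sent));
        String repaired = retranslate(garbled, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(repaired.equals(sent));
    }
}
```

This recovery is lossless only when the wrongly assumed encoding maps every byte value to a distinct character, as ISO 8859-1 does in Java. With other assumed encodings, some bytes may already have been replaced during the first decoding and cannot be recovered.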

See Also

TibrvMsg.getStringEncoding()

TibrvMsg.setStringEncoding()