Strings and Character Encodings
Rendezvous software uses strings in several roles:
• String data inside message fields
• Field names
• Subject names (and other associated strings that are not strictly inside the message)
• Certified delivery (CM) correspondent names
• Group names (fault tolerance)
.NET programs represent all these strings in Unicode (UTF-16, which uses 2-byte code units). Before sending a message, Rendezvous software translates these strings into the character encoding appropriate to the ANSI code page. Conversely, when extracting these strings from inbound messages, Rendezvous software translates them into Unicode, treating the inbound bytes as if they used the encoding appropriate to the ANSI code page.
For example, the United States is code page us-ascii, and uses the Latin-1 character encoding (also called ISO 8859-1); Japan is code page shift-jis, and uses the Shift-JIS character encoding.
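The translation described above can be pictured with the standard System.Text.Encoding class. The following is only a sketch of the idea, not the Rendezvous implementation: the class and method names (CodePageRoundTrip, ToWire, FromWire) are hypothetical, and Latin-1 stands in for whatever code page is in effect locally.

```csharp
using System;
using System.Text;

public static class CodePageRoundTrip
{
    // Translate a .NET (UTF-16) string into the bytes of the named encoding.
    public static byte[] ToWire(string text, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetBytes(text);
    }

    // Translate bytes back into a .NET string, assuming the named encoding.
    public static string FromWire(byte[] bytes, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(bytes);
    }

    public static void Main()
    {
        const string latin1 = "iso-8859-1"; // assumed stand-in for the local code page

        byte[] wire = ToWire("caf\u00E9", latin1);                 // "café"
        Console.WriteLine(BitConverter.ToString(wire));            // 63-61-66-E9
        Console.WriteLine(FromWire(wire, latin1) == "caf\u00E9");  // True
    }
}
```

Because both directions use the same encoding, the round trip returns the original string. This is the situation in which two programs share a code page.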
When two programs exchange messages using the same code page, the translation is correct. However, when a message sender and receiver use different character encodings, the receiving program must retranslate between encodings as needed.
The default translation depends on the code page where the program is running. Programs can override this default encoding; for details, see the environment property StringEncoding.
Outbound Translation
Outbound translation from Unicode to the local code page occurs when the program sends the message (for example, using Transport.Send or a related method), or converts the message to a byte array.
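A code page can represent only a subset of Unicode, so a program may want to check a string before handing it to the outbound translation. The sketch below uses only the standard System.Text.Encoding class. The helper name CanEncode is hypothetical, and Latin-1 is an assumed stand-in for the local code page. How Rendezvous itself treats characters that the code page cannot represent is not stated here.

```csharp
using System;
using System.Text;

public static class OutboundCheck
{
    // Returns true when every character of text can be represented in the named encoding.
    public static bool CanEncode(string text, string encodingName)
    {
        // With the exception fallbacks, GetBytes throws instead of
        // silently substituting a replacement character.
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanEncode("caf\u00E9", "iso-8859-1"));          // True
        Console.WriteLine(CanEncode("\u65E5\u672C\u8A9E", "iso-8859-1")); // False (Japanese text)
    }
}
```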
Inbound Translation
Inbound translation occurs before the program receives the data.
Automatic inbound translation is correct when two programs exchange messages using the same code page.
Warning
In contrast, the automatic translation might be incorrect when the sender and receiver use different character encodings. In this situation, the receiver must explicitly retranslate to the local encoding.
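One way a receiver might retranslate is to reverse the automatic decoding to recover the bytes as sent, and then decode those bytes with the sender's actual encoding. The sketch below assumes a sender that used UTF-8 and a receiver whose code page is Latin-1. Both choices, and the names Retranslate and Fix, are illustrative only. The code uses just the standard System.Text.Encoding class and no Rendezvous calls.

```csharp
using System;
using System.Text;

public static class Retranslate
{
    // Undo a decode performed with the wrong encoding, then redo it with the right one.
    public static string Fix(string garbled, Encoding assumed, Encoding actual)
    {
        byte[] original = assumed.GetBytes(garbled); // recover the bytes as sent
        return actual.GetString(original);           // decode them as the sender intended
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1"); // receiver's encoding (assumed)
        Encoding utf8 = Encoding.UTF8;                        // sender's encoding (assumed)

        byte[] wire = utf8.GetBytes("caf\u00E9");  // sender encodes "café"
        string garbled = latin1.GetString(wire);   // what a Latin-1 decode of those bytes yields

        Console.WriteLine(garbled == "caf\u00C3\u00A9");              // True: the text is garbled
        Console.WriteLine(Fix(garbled, latin1, utf8) == "caf\u00E9"); // True: the text is restored
    }
}
```

This recovery is exact here because Latin-1 maps every byte value to a character, so the wrong decode loses nothing. With an encoding that leaves some byte sequences undefined, the wrong decode may already have discarded information, and retranslation from the resulting string may not restore the original text.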