Character Maps

Associated with every text field of an in-memory table is a character map. Each text field can be independently associated with either one of the built-in character maps or a custom character map that you define. A character map defines how every possible character occurring in a field must be mapped before inexact matching is performed. This mapping of characters can be used to accomplish letter case folding, removal of punctuation, translating the various types of whitespace to a common value, reducing accented versions of a character to the unaccented version, and other special character mappings that your application requires.

The character maps associated with the fields of an in-memory table are established at the time of table creation and cannot be changed.

Each thesaurus created also has a character map associated with it.

If no character map is explicitly defined for a field or thesaurus, a standard default character map is used. The default character map performs the following character mappings:

Letter case folding: all alphabetic characters are folded to a common letter case according to the rules defined by the Unicode Consortium.
Diacritics folding: letters are stripped of their diacritic marks, and other character “normalizations” are applied, according to rules defined by the Unicode Consortium.
Character class mappings: all characters belonging to the “whitespace” class, and all characters belonging to the “punctuation” class except the ampersand (&), are mapped to the blank character (Unicode code point 0x20). Whitespace is as defined by the Unicode standard. For characters outside the standard ASCII range, punctuation is as defined by the Unicode standard; for characters within the ASCII range, anything that is not a letter, digit, or whitespace is considered punctuation.

Besides the default character map, a second predefined character map is available that does not map "punctuation" characters (that is, all punctuation characters remain unchanged).
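
The effect of the two predefined character maps can be approximated with a short Python sketch. This is an illustration only, not the product's implementation. The function names apply_char_map and is_punctuation and the map_punctuation flag are hypothetical. Python's str.casefold and NFKD normalization stand in for the Unicode folding rules, which may differ in detail from the rules the product applies. The sketch also assumes that the second predefined map still maps whitespace to the blank character.

```python
import unicodedata


def is_punctuation(ch: str) -> bool:
    """Approximate the punctuation class described above (hypothetical helper)."""
    if ord(ch) < 128:
        # ASCII range: anything that is not a letter, digit, or whitespace.
        return not (ch.isalnum() or ch.isspace())
    # Outside ASCII: rely on the Unicode general category (Pc, Pd, Ps, ...).
    return unicodedata.category(ch).startswith("P")


def apply_char_map(text: str, map_punctuation: bool = True) -> str:
    """Sketch of the default map; map_punctuation=False mimics the second map."""
    # Case folding, then compatibility decomposition so that diacritics
    # become separate combining marks that can be dropped.
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # strip the diacritic mark
        if ch.isspace():
            out.append(" ")  # all whitespace becomes the blank (0x20)
        elif map_punctuation and ch != "&" and is_punctuation(ch):
            out.append(" ")  # punctuation, except the ampersand, becomes the blank
        else:
            out.append(ch)
    return "".join(out)


print(repr(apply_char_map("Café, Inc.")))                         # 'cafe  inc '
print(repr(apply_char_map("AT&T")))                               # 'at&t'
print(repr(apply_char_map("Café, Inc.", map_punctuation=False)))  # 'cafe, inc.'
```

Each punctuation character is replaced by its own blank, so the comma and the space that follows it yield two consecutive blanks in the first result.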