Internationalization

The mathematical matching algorithms of TIBCO Patterns are, by their very nature, independent of particular languages. As a practical matter, TIBCO Patterns expects textual data in in-memory tables to be UTF-8 encoded. Once decoded internally, however, the core matching algorithms perform intelligent inexact matching upon this text as strings of abstract symbols, regardless of the particular alphabet or writing system from which the symbols are drawn.
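The idea of matching over abstract symbols can be illustrated with a minimal sketch. It is not the TIBCO Patterns algorithm or API: the names to_symbols and edit_distance are hypothetical, and a plain Levenshtein distance stands in for the product's matching. It shows only that once UTF-8 bytes are decoded to code points, the comparison logic does not depend on the script.

```python
def to_symbols(raw: bytes) -> list[int]:
    """Decode UTF-8 bytes into a list of Unicode code points (abstract symbols)."""
    return [ord(ch) for ch in raw.decode("utf-8")]


def edit_distance(a: list[int], b: list[int]) -> int:
    """Levenshtein distance over symbol sequences (illustrative stand-in only)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# The same function handles Latin and Cyrillic input without any special casing.
latin = edit_distance(to_symbols("kitten".encode("utf-8")),
                      to_symbols("sitting".encode("utf-8")))
cyrillic = edit_distance(to_symbols("Москва".encode("utf-8")),
                         to_symbols("Москво".encode("utf-8")))
```

Note that "Москва" occupies 12 bytes in UTF-8 but decodes to 6 symbols; the comparison operates on the symbols, not on the bytes.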

Various writing systems have features that differentiate characters in ways that are irrelevant to the kind of matching you want to do. The most obvious example is letter case in alphabets such as the Roman alphabet. By default, the matching of Roman letters is case-insensitive. Similarly, matching is insensitive to the difference between accented and unaccented versions of a character, to differences in punctuation, and so on. Equating different versions of a character to one essential version is one function of the character maps in TIBCO Patterns. Another function is to strip out characters or classes of characters that are irrelevant to matching.
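The two functions of a character map described above (folding variants into one essential character, and stripping irrelevant characters) can be sketched as follows. This is an illustration built on Python's standard unicodedata module, not the TIBCO Patterns character map implementation. The function name apply_char_map and its strip_punctuation parameter are hypothetical, and dropping punctuation outright is an assumption made here for simplicity.

```python
import unicodedata


def apply_char_map(text: str, strip_punctuation: bool = True) -> str:
    """Fold case and accents, optionally dropping punctuation (illustrative only)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = []
    for ch in decomposed:
        category = unicodedata.category(ch)
        if category.startswith("M"):
            # Combining mark (accent): drop it so "é" and "e" become equal.
            continue
        if strip_punctuation and category.startswith("P"):
            # Punctuation: treated here as irrelevant to matching.
            continue
        kept.append(ch)
    # casefold() removes letter-case distinctions.
    return "".join(kept).casefold()


folded = apply_char_map("Café, RÉSUMÉ!")
```

With a map of this kind, "Müller" and "MULLER" reduce to the same string before any comparison takes place, so the matching step itself never sees the case or accent difference.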