NetricsCharmap (TIBCO Patterns Java API)

java.lang.Object
- com.netrics.likeit.NetricsCharmap

```
public class NetricsCharmap
extends java.lang.Object
```
Define a character map.
When matching data, it is usually advantageous to normalize characters before matching: for example, mapping all letters to a common letter case, normalizing whitespace, or stripping out certain special characters. A character map defines the set of rules to be applied.
The TIBCO Patterns Engine has a default character map that is appropriate for most cases. It maps all letters to lower case, strips all diacritic marks from letters, maps all punctuation and special characters to a blank, maps all whitespace to blanks, and collapses runs of repeated blanks into a single blank. A second predefined character map is similar to the default map, except that punctuation and special characters are left unchanged. Where neither predefined map is appropriate, users can define their own character mapping rules. This class is the means by which those rules are defined.
Terms such as "letter" and "whitespace" are as defined in the Unicode standard. The exception is "punctuation" (or "special characters"), which is somewhat broader than the Unicode definition: within the ASCII range it includes every character that is not a letter, a digit, or whitespace.
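As a rough illustration of the default rules described above, the following standalone sketch applies the same steps using only standard Java library calls. It is not the engine's implementation: the class name `DefaultMapSketch` and the method `normalize` are hypothetical, and it assumes that every character that is neither a letter nor a digit can be treated as a blank, which is broader than the ASCII-only punctuation rule stated above.

```java
import java.text.Normalizer;

// Illustrative stand-in only; this is NOT the TIBCO Patterns implementation.
final class DefaultMapSketch {

    // Approximates the rules described for the default character map.
    static String normalize(String input) {
        // Decompose accented letters so each diacritic becomes a separate combining mark.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) == Character.NON_SPACING_MARK) {
                continue; // strip the diacritic mark
            }
            char mapped;
            if (Character.isLetter(c)) {
                mapped = Character.toLowerCase(c); // fold letter case
            } else if (Character.isDigit(c)) {
                mapped = c;                        // digits pass through unchanged
            } else {
                mapped = ' ';                      // assumption: everything else becomes a blank
            }
            // Collapse runs of blanks into a single blank.
            boolean previousIsBlank = out.length() > 0 && out.charAt(out.length() - 1) == ' ';
            if (mapped == ' ' && previousIsBlank) {
                continue;
            }
            out.append(mapped);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Caf\u00e9" is "Cafe" with an acute accent on the final letter.
        System.out.println(normalize("Caf\u00e9--Noir,  42")); // prints: cafe noir 42
    }
}
```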
Only the first 16 bits of the full Unicode character set are supported, that is, code points U+0000 through U+FFFF (the Basic Multilingual Plane). This covers the characters of all commonly used code pages, including Chinese and Japanese characters.
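Java strings store characters outside the 16-bit range as surrogate pairs, so input can be screened before it reaches the engine. The sketch below shows one possible check; `BmpCheck` and `isBmpOnly` are hypothetical names, and this page does not say how the engine treats characters outside the supported range.

```java
// Illustrative helper; not part of the TIBCO Patterns API.
final class BmpCheck {

    // Returns true when every code point in the string fits in 16 bits (U+0000..U+FFFF).
    static boolean isBmpOnly(String s) {
        return s.codePoints().allMatch(Character::isBmpCodePoint);
    }

    public static void main(String[] args) {
        System.out.println(isBmpOnly("\u6771\u4eac"));  // two CJK ideographs inside the 16-bit range
        System.out.println(isBmpOnly("\uD83D\uDE00"));  // a surrogate pair encoding U+1F600
    }
}
```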

See Also:

NetricsServerInterface.cmapcreate(com.netrics.likeit.NetricsCharmap)

Field Summary

Fields

| Modifier and Type | Field and Description |
| --- | --- |
| `static java.lang.String` | `Punctuation` The name of the predefined punctuation-sensitive character map. |
| `static java.lang.String` | `Standard` The name of the predefined default character map. |

Constructor Summary

Constructors

| Constructor and Description |
| --- |
| `NetricsCharmap(java.lang.String name)` Specify the name of the character map. |

Method Summary

Methods

| Modifier and Type | Method and Description |
| --- | --- |
| `void` | `foldCase()` Equate all letter cases to lower case. |
| `void` | `foldDiacritics()` Equate characters that carry diacritics with their unaccented base characters. |
| `void` | `mapChars(char[] fromchars, char[] tochars)` Use this method to equate any character with any other. |
| `void` | `mapPunctuation(char c)` Map all punctuation characters to a specific character. |
| `void` | `mapWhitespace(char c)` Map all whitespace characters to a specific character. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - Standard
```
public static final java.lang.String Standard
```
    The name of the predefined default character map.
    
    See Also:
    
    Constant Field Values
  - Punctuation
```
public static final java.lang.String Punctuation
```
    The name of the predefined punctuation-sensitive character map.
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - NetricsCharmap
```
public NetricsCharmap(java.lang.String name)
```
    Specify the name of the character map.
    
    The name will be used when creating a table or thesaurus to tie a character map to a field. It must be unique among all currently defined character maps.
    
    Parameters:
    
    name - Name of the character map.
- Method Detail
  - foldCase
```
public void foldCase()
```
    Equate all letter cases to lower case. This adds the rule that all letters should be mapped to a common letter case (lower case is used).
  - foldDiacritics
```
public void foldDiacritics()
```
    Equate characters that carry diacritics with their unaccented base characters. This adds a rule that strips all diacritic marks from letters.
  - mapWhitespace
```
public void mapWhitespace(char c)
```
    Map all whitespace characters to a specific character.
    This adds a rule that maps all whitespace characters to a common character.
    Note: Mapping whitespace to any character other than the standard blank character will significantly alter the scoring of data, because all data will then be considered a single token.
    
    Parameters:
    
    c - The character to map to.
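    The effect described in the note can be seen with a small standalone sketch. It assumes, as the note implies, that the blank is what separates tokens. `WhitespaceMapSketch` and its methods are hypothetical stand-ins, not engine code, and `Character.isWhitespace` only approximates the Unicode whitespace definition (it excludes non-breaking spaces, for example).

```java
// Illustrative stand-in only; not the NetricsCharmap implementation.
final class WhitespaceMapSketch {

    // Replaces every whitespace character with the given target character.
    static String mapWhitespace(String s, char target) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(Character.isWhitespace(c) ? target : c);
        }
        return out.toString();
    }

    // Counts tokens, assuming the blank is the only token separator.
    static int tokenCount(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split(" +").length;
    }

    public static void main(String[] args) {
        String raw = "john  smith\tjr";
        System.out.println(tokenCount(mapWhitespace(raw, ' '))); // prints: 3
        System.out.println(tokenCount(mapWhitespace(raw, '_'))); // prints: 1
    }
}
```

    With the blank as the target, the value still splits into three tokens. With an underscore, the whole value becomes one token.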
  - mapPunctuation
```
public void mapPunctuation(char c)
```
    Map all punctuation characters to a specific character.
    This is often used to equate hyphen, slash, period, comma, etc. with the space character. Punctuation and special characters are then ignored for matching, except that they still act as token separators. If desired, they can be mapped to a different character instead. In that case all punctuation and special characters are considered equivalent to one another, but they do not act as token separators.
    
    Parameters:
    
    c - The character to map to.
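    The two behaviours described above can be compared in a standalone sketch. It uses the ASCII-range definition of punctuation given in the class description (anything that is not a letter, digit, or whitespace). `PunctuationMapSketch` and its methods are hypothetical stand-ins, not engine code.

```java
// Illustrative stand-in only; not the NetricsCharmap implementation.
final class PunctuationMapSketch {

    // ASCII-range rule from the class description: not a letter, digit, or whitespace.
    static boolean isAsciiPunctuation(char c) {
        return c < 128 && !Character.isLetterOrDigit(c) && !Character.isWhitespace(c);
    }

    // Replaces every ASCII punctuation character with the given target character.
    static String mapPunctuation(String s, char target) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(isAsciiPunctuation(c) ? target : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mapped to a blank, punctuation separates tokens.
        System.out.println(mapPunctuation("555-1234/ext.9", ' ')); // prints: 555 1234 ext 9
        // Mapped to another character, all punctuation marks become equivalent.
        System.out.println(mapPunctuation("555/1234", '-'));       // prints: 555-1234
    }
}
```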
  - mapChars
```
public void mapChars(char[] fromchars,
                     char[] tochars)
```
    Use this method to equate any character with any other.
    A character at index x in the from array will be mapped to the character at index x in the to array. Each pair of characters will be considered equivalent for the purposes of matching.
    Mappings defined here supersede the character class mappings defined by foldCase, foldDiacritics, mapWhitespace and mapPunctuation. Thus, to map all punctuation marks except the pound (#) character, call mapPunctuation and also call mapChars with the pound character mapped to itself.
    
    Parameters:
    
    fromchars - The characters to map from.
    
    tochars - The characters to map to.
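    The precedence rule can be pictured as a two-step lookup: explicit pairs first, class rules second. The sketch below is a local stand-in that mirrors the documented `mapPunctuation` and `mapChars` signatures. The class `MapCharsSketch`, the `apply` helper, and the rejection of arrays with unequal lengths are assumptions made for illustration, not behaviour confirmed by this page.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in only; not the NetricsCharmap implementation.
final class MapCharsSketch {
    private final Map<Character, Character> explicit = new HashMap<>();
    private boolean punctuationMapped = false;
    private char punctuationTarget;

    // Mirrors the documented signature: map all punctuation to one character.
    void mapPunctuation(char c) {
        punctuationMapped = true;
        punctuationTarget = c;
    }

    // Mirrors the documented signature: pair characters by array index.
    void mapChars(char[] fromchars, char[] tochars) {
        // Assumption: arrays of unequal length are rejected.
        if (fromchars.length != tochars.length) {
            throw new IllegalArgumentException("fromchars and tochars must have the same length");
        }
        for (int i = 0; i < fromchars.length; i++) {
            explicit.put(fromchars[i], tochars[i]);
        }
    }

    // Hypothetical helper; the documented class exposes no such method.
    String apply(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character direct = explicit.get(c);
            if (direct != null) {
                out.append(direct.charValue());   // explicit pair supersedes class rules
            } else if (punctuationMapped && isAsciiPunctuation(c)) {
                out.append(punctuationTarget);    // class rule for punctuation
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    private static boolean isAsciiPunctuation(char c) {
        return c < 128 && !Character.isLetterOrDigit(c) && !Character.isWhitespace(c);
    }

    public static void main(String[] args) {
        MapCharsSketch cmap = new MapCharsSketch();
        cmap.mapPunctuation(' ');
        cmap.mapChars(new char[] {'#'}, new char[] {'#'}); // keep the pound character as-is
        System.out.println(cmap.apply("apt.#4-b"));        // prints: apt #4 b
    }
}
```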
