public class NetricsThesaurus
extends java.lang.Object
| Constructor and Description |
|---|
| `NetricsThesaurus(java.lang.String name)` A thesaurus is used to equate terms which are not typographically similar. |
| `NetricsThesaurus(java.lang.String name, java.lang.String filename, java.lang.String encoding)` Create a thesaurus of synonyms from a CSV file. |
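The file-based constructor reads a CSV file in either "UTF-8" or "LATIN1" encoding. As a rough illustration of preparing such a file, the sketch below writes the same rows in both encodings using only the standard Java library. The class name `ThesaurusCsvSketch`, the helper `writeRows`, the sample terms, and the row layout (one group of synonyms per row) are assumptions made for this example and are not taken from the Netrics documentation; LATIN1 is written here with Java's ISO-8859-1 charset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Netrics API.
class ThesaurusCsvSketch {

    // Writes one row per line in the given charset and returns the file path.
    static Path writeRows(Path file, List<String> rows, Charset charset) throws IOException {
        return Files.write(file, rows, charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: each row lists terms that should be treated as synonyms.
        List<String> rows = Arrays.asList("Bill,William", "caf\u00e9,coffee shop");

        Path latin1 = Files.createTempFile("synonyms-latin1", ".csv");
        Path utf8 = Files.createTempFile("synonyms-utf8", ".csv");

        // ISO-8859-1 stores the accented character in one byte; UTF-8 uses two.
        writeRows(latin1, rows, StandardCharsets.ISO_8859_1);
        writeRows(utf8, rows, StandardCharsets.UTF_8);

        System.out.println(Files.size(latin1) + " bytes as LATIN1");
        System.out.println(Files.size(utf8) + " bytes as UTF-8");
    }
}
```

The encoding argument passed to the constructor would need to match the charset actually used to write the file; a mismatch typically corrupts any non-ASCII terms.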
| Modifier and Type | Method and Description |
|---|---|
| `int` | `addClassesFrom(NetricsFieldedReader rsrc)` Add a set of classes from a fielded source. |
| `void` | `addEquivalenceClass(java.lang.String[] terms)` Add an array of synonyms. |
| `void` | `setCharmap(java.lang.String name)` Set the character map for this thesaurus. |
| `void` | `setExactMatchMode()` Select exact match mode. |
public NetricsThesaurus(java.lang.String name)
name - Name of the thesaurus.

public NetricsThesaurus(java.lang.String name,
java.lang.String filename,
java.lang.String encoding)
name - The name of the thesaurus to be created.
filename - The name of the file from which to read the thesaurus.
encoding - This defines the character encoding used in the file. Currently supported encodings are: "UTF-8" or "LATIN1". DEFAULT: "LATIN1"

public void addEquivalenceClass(java.lang.String[] terms)
terms - All Strings which are elements of the array are considered to be equal for the purpose of record scoring.

public int addClassesFrom(NetricsFieldedReader rsrc) throws NetricsFileFormatException, NetricsException
rsrc - a NetricsFieldedReader object that provides the equivalence classes.
NetricsFileFormatException - if there was an error reading records from the source.
NetricsException - if an equivalence class has fewer than 1 entry.
See Also: NetricsFieldedReader, NetricsCSVReader

public void setCharmap(java.lang.String name)
name - The name of an existing character map.

public void setExactMatchMode()
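
To make the idea of an equivalence class concrete, the following is a minimal stand-alone model of the behavior described for addEquivalenceClass: every term in one array is treated as equal to every other term in that array. It is only a sketch and not the Netrics implementation. The class `EquivalenceSketch` and its method `areEquivalent` are hypothetical names, the comparison is an exact string comparison, and the use of `IllegalArgumentException` for an empty class is an assumption standing in for the library's own exception.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of equivalence classes; not part of the Netrics API.
class EquivalenceSketch {

    // Maps each term to the id of the class it was most recently added with.
    private final Map<String, Integer> classOf = new HashMap<>();
    private int nextId = 0;

    // Registers all terms in the array as one class of synonyms.
    void addEquivalenceClass(String[] terms) {
        if (terms == null || terms.length < 1) {
            throw new IllegalArgumentException("an equivalence class needs at least 1 entry");
        }
        int id = nextId++;
        for (String term : terms) {
            // Simplification: a term that is added again moves to the newer class.
            classOf.put(term, id);
        }
    }

    // Two terms are equivalent if they are identical or share a class.
    boolean areEquivalent(String a, String b) {
        if (a.equals(b)) {
            return true;
        }
        Integer classA = classOf.get(a);
        return classA != null && classA.equals(classOf.get(b));
    }

    public static void main(String[] args) {
        EquivalenceSketch sketch = new EquivalenceSketch();
        sketch.addEquivalenceClass(new String[] {"Bill", "William", "Will"});
        sketch.addEquivalenceClass(new String[] {"Bob", "Robert"});

        System.out.println(sketch.areEquivalent("Bill", "William")); // true
        System.out.println(sketch.areEquivalent("Bill", "Robert"));  // false
    }
}
```

With the real class, the corresponding call would be addEquivalenceClass on a NetricsThesaurus instance; how the resulting equivalences affect record scores is determined by the library and is not modeled here.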