public class NetricsThesaurus
extends java.lang.Object
Constructor and Description |
---|
NetricsThesaurus(java.lang.String name): A thesaurus is used to equate terms which are not typographically similar. |
NetricsThesaurus(java.lang.String name, java.lang.String filename, java.lang.String encoding): Create a thesaurus of synonyms from a CSV file. |
Modifier and Type | Method and Description |
---|---|
int | addClassesFrom(NetricsFieldedReader rsrc): Add a set of classes from a fielded source. |
void | addEquivalenceClass(java.lang.String[] terms): Add an array of synonyms. |
void | setCharmap(java.lang.String name): Set the character map for this thesaurus. |
void | setExactMatchMode(): Select exact match mode. |
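
The idea of an equivalence class, as used by addEquivalenceClass, can be illustrated with a small in-memory model. The sketch below is only an illustration: EquivalenceClassSketch is a hypothetical stand-in class, not part of the Netrics API, and it assumes plain exact string comparison with no character map applied. It shows how terms that are not typographically similar can still be treated as equal once they share a class.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for illustration only; this is NOT NetricsThesaurus.
 * It models "all strings in the array are considered equal".
 */
public class EquivalenceClassSketch {
    // Maps each term to the id of the equivalence class it belongs to.
    private final Map<String, Integer> classOf = new HashMap<>();
    private int nextClassId = 0;

    /** Registers every term in the array as a member of one new class. */
    public void addEquivalenceClass(String[] terms) {
        if (terms == null || terms.length < 1) {
            throw new IllegalArgumentException("an equivalence class needs at least 1 entry");
        }
        int id = nextClassId++;
        for (String term : terms) {
            // Simplification: a term added again later is reassigned to the newer class.
            classOf.put(term, id);
        }
    }

    /** Two terms are equivalent if they are identical or share a class. */
    public boolean areEquivalent(String a, String b) {
        if (a.equals(b)) {
            return true;
        }
        Integer classA = classOf.get(a);
        return classA != null && classA.equals(classOf.get(b));
    }

    public static void main(String[] args) {
        EquivalenceClassSketch sketch = new EquivalenceClassSketch();
        sketch.addEquivalenceClass(new String[] {"Robert", "Bob", "Rob"});
        sketch.addEquivalenceClass(new String[] {"William", "Bill"});

        System.out.println(sketch.areEquivalent("Bob", "Robert"));  // prints "true"
        System.out.println(sketch.areEquivalent("Bob", "William")); // prints "false"
    }
}
```

In this model "Bob" and "Robert" compare as equal even though they share few characters, which is the kind of match a thesaurus is meant to supply.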
public NetricsThesaurus(java.lang.String name)
Parameters:
name - Name of the thesaurus.

public NetricsThesaurus(java.lang.String name, java.lang.String filename, java.lang.String encoding)
Parameters:
name - The name of the thesaurus to be created.
filename - The name of the file from which to read the thesaurus.
encoding - The character encoding used in the file. Currently supported encodings are "UTF-8" and "LATIN1". Default: "LATIN1".

public void addEquivalenceClass(java.lang.String[] terms)
Parameters:
terms - All strings in the array are considered equal for the purpose of record scoring.

public int addClassesFrom(NetricsFieldedReader rsrc) throws NetricsFileFormatException, NetricsException
Parameters:
rsrc - A NetricsFieldedReader object that provides the equivalence classes.
Throws:
NetricsFileFormatException - if there was an error reading records from the source.
NetricsException - if an equivalence class has fewer than 1 entry.
See Also:
NetricsFieldedReader, NetricsCSVReader
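
The contract described for addClassesFrom (one equivalence class per record, an error when a class has fewer than 1 entry) can be sketched without the library. Everything in the block below is hypothetical: ClassLoadingSketch and its addClassesFrom method are stand-ins, the rows are plain comma-separated strings with no quoting support, IllegalArgumentException stands in for NetricsException, and the returned int is assumed to be the number of classes added, which the reference above does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical stand-in for illustration only; this is NOT the Netrics API.
 * Each input row is treated as one equivalence class.
 */
public class ClassLoadingSketch {

    /**
     * Splits each row on commas and appends the resulting class to target.
     * Returns the number of classes added (an assumption about the int result).
     */
    public static int addClassesFrom(Iterable<String> rows, List<String[]> target) {
        int added = 0;
        for (String row : rows) {
            List<String> terms = new ArrayList<>();
            for (String field : row.split(",")) {
                String term = field.trim();
                if (!term.isEmpty()) {
                    terms.add(term);
                }
            }
            // Mirrors the documented failure case: a class with fewer than 1 entry.
            if (terms.isEmpty()) {
                throw new IllegalArgumentException("equivalence class has fewer than 1 entry");
            }
            target.add(terms.toArray(new String[0]));
            added++;
        }
        return added;
    }

    public static void main(String[] args) {
        List<String[]> classes = new ArrayList<>();
        int count = addClassesFrom(Arrays.asList("Robert, Bob, Rob", "William, Bill"), classes);
        System.out.println(count);                 // prints "2"
        System.out.println(classes.get(0).length); // prints "3"
    }
}
```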

public void setCharmap(java.lang.String name)
Parameters:
name - The name of an existing character map.

public void setExactMatchMode()