Package com.tibco.patterns.learn.rlink
Class RLink
- java.lang.Object
-
- com.tibco.patterns.learn.rlink.RLink
-
public final class RLink extends java.lang.Object
Performs communication with the rlink_jni library using JNI to create, load, save, train, and evaluate RLink models. Can manage several models at the same time. When a new model is created or read from file, a model ID is returned. It is then passed as modelId to all other methods to refer to that specific model. Model training and evaluation are done with individual feature vectors. Most methods are wrappers around native functions. They throw exceptions when the native functions return error codes.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | RLink.SubsetTrainMode | Defines methods used to generate training examples for subsets of a given training example.
static class | RLink.ThermometerType | Type of thermometers used in an RLink model.
-
Field Summary
Fields
Modifier and Type | Field | Description
static double | EMPTY_SCORE | The value -1 used for the empty feature score (when it cannot be calculated).
static double | MAX_SCORE | The maximum non-empty feature score, and the maximum model score (1).
static double | MIN_SCORE | The minimum non-empty feature score, and the minimum model score (0).
static double | NO_CONFIDENCE_MEASURE | The value returned if no confidence measure was calculated or if the requested confidence measure is not supported by the given model.
-
Method Summary
Modifier and Type | Method | Description
static void | beginIteration(int modelId) | Must be called before each training iteration with the entire training dataset.
static int | createModel(int nFeatures, double p, double learningRate, int precisionBits, RLink.ThermometerType thermometerType, int maxSubsetSize, int[] falseSubsets) | Creates a new untrained RLink model.
static void | destroyModels() | Destroys all RLink models created by the RLink static methods.
static void | endIteration(int modelId) | Must be called after each training iteration with the entire training dataset.
static double[] | getFalseInsertScores(int modelId) |
static double[] | getFalseRemoveLimits(int modelId) |
static int | getFeatureCount(int modelId) | Gets the number of model features.
static java.lang.String | getID(int modelId) | Return the unique ID for this model.
static double | getInitialLearningRate(int modelId) | Returns the initial learning rate, or -1 for file versions below RFV5.
static java.lang.String | getMetadata(int modelId) | Returns the notes (meta-data) of the model.
static double | getMissingInfoLimit(int modelId) |
static double | getNegativeTaper(int modelId) |
static double | getNorm(int modelId) |
static double | getPositiveTaper(int modelId) |
static int | getPrecisionBits(int modelId) | Returns number of precision bits, or -1 for file versions below RFV5.
static int | getSkippedCount(int modelId) | Skipped count is applicable only for TAPER subset training mode.
static RLink.SubsetTrainMode | getSubsetTrainMode(int modelId) |
static RLink.ThermometerType | getThermometerType(int modelId) | Returns the thermometer type, or throws exception for file versions below RFV5 (thermometer type is -1, which is invalid).
static double | getThreshold(int modelId) | Return the threshold for this model.
static int | getTrainedCount(int modelId) | Trained count is applicable only for TAPER subset training mode.
static double[] | getTrueInsertScores(int modelId) |
static double[] | getTrueRemoveLimits(int modelId) |
static java.lang.String | getVersion(int modelId) | Return the version ID for this model.
static void | learn(int modelId, double[] featureValues, boolean label) | Trains the existing model with the given training example (feature vector and label).
static RLinkOut | predict(int modelId, double[] featureValues) | Predicts the score (and label) using default options.
static RLinkOut | predict(int modelId, double[] featureValues, PredictOptions predictOpts) | Predicts the score (and label), calculates the requested confidence and significance.
static int | read(java.lang.String fileName) | Loads a model from file.
static void | setAnnealingRate(int modelId, double value) | Sets the annealing rate (the speed of the learning rate decrease with each training iteration).
static void | setDynamic(int modelId, double[] trueInsertScores, double[] falseInsertScores, double[] trueRemoveLimits, double[] falseRemoveLimits) | Sets parameters for removal and insertion of scores to generate related training vectors when using SubsetTrainMode.DYNAMIC.
static void | setHeader(java.lang.String inFile, java.lang.String outFile, java.lang.String metaData, java.lang.String version, double threshold) | Copy a model file, updating the header values.
static void | setID(int modelId, java.lang.String id) | Set the ID of the model.
static void | setMetadata(int modelId, java.lang.String metadata) | Sets the notes (meta-data) field for an existing model.
static void | setMissingInfoLimit(int modelId, double value) | Sets the percentage of values that may be missing in a generated training example for a subset.
static void | setSubsetTrainMode(int modelId, RLink.SubsetTrainMode stMode) | Set the subset training mode of an existing model.
static void | setTaper(int modelId, double positiveTaper, double negativeTaper) | Set parameters for the tapering function that generates examples for subsets.
static void | setThreshold(int modelId, double threshold) | Set the cutoff threshold for the model.
static void | setVersion(int modelId, java.lang.String version) | Set the version ID of the model.
static void | verifyMissingInfoLimit(double value) |
static void | verifyModelId(int modelId) | Verifies that the model with the given modelId actually exists.
static void | verifyTaper(double positiveTaper, double negativeTaper) |
static void | write(int modelId, java.lang.String fileName) | Saves the existing model to file.
-
-
-
Field Detail
-
EMPTY_SCORE
public static final double EMPTY_SCORE
The value -1 used for the empty feature score (when it cannot be calculated).
- See Also:
- Constant Field Values
-
MIN_SCORE
public static final double MIN_SCORE
The minimum non-empty feature score, and the minimum model score (0).
- See Also:
- Constant Field Values
-
MAX_SCORE
public static final double MAX_SCORE
The maximum non-empty feature score, and the maximum model score (1).
- See Also:
- Constant Field Values
-
NO_CONFIDENCE_MEASURE
public static final double NO_CONFIDENCE_MEASURE
The value returned if no confidence measure was calculated or if the requested confidence measure is not supported by the given model.
- See Also:
- Constant Field Values
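As an illustration of how the three score constants above relate, the following sketch checks that every entry of a feature vector is either the empty score or lies within the non-empty range. The constant values are taken from the field descriptions; the class `ScoreRangeSketch` and its helpers are hypothetical and not part of RLink, and whether RLink itself rejects out-of-range values is not stated in this documentation.

```java
// Hypothetical helper; the constants mirror the documented values of
// RLink.EMPTY_SCORE, RLink.MIN_SCORE and RLink.MAX_SCORE.
class ScoreRangeSketch {
    static final double EMPTY_SCORE = -1.0;
    static final double MIN_SCORE = 0.0;
    static final double MAX_SCORE = 1.0;

    // A score is acceptable if it is empty or within [MIN_SCORE, MAX_SCORE].
    static boolean isValidScore(double s) {
        return s == EMPTY_SCORE || (s >= MIN_SCORE && s <= MAX_SCORE);
    }

    // Checks every entry of a feature vector.
    static boolean isValidVector(double[] featureValues) {
        for (double s : featureValues) {
            if (!isValidScore(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The second feature could not be calculated, so it is marked empty.
        double[] features = {0.92, EMPTY_SCORE, 0.4};
        System.out.println(isValidVector(features));
    }
}
```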
-
-
Method Detail
-
destroyModels
public static void destroyModels()
Destroys all RLink models created by the RLink static methods. No error codes are used, so this native method is public. Client code must call this method when the RLink class will no longer be used.
-
verifyModelId
public static void verifyModelId(int modelId)
Verifies that the model with the given modelId actually exists.
- Parameters:
modelId - ID of the model that was created or read from file.
- Throws:
java.lang.ArrayIndexOutOfBoundsException - if the model with the given modelId does not exist.
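The integer-handle pattern behind modelId can be illustrated with a small stand-in registry. This is a minimal sketch, not RLink's implementation: the class `HandleRegistrySketch`, its methods, and the use of a list index as the ID are assumptions made for illustration; only the exception type mirrors the documented behavior of verifyModelId.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ID-based model registry; not part of RLink.
class HandleRegistrySketch {
    private final List<String> models = new ArrayList<>();

    // Registers a model and returns its integer handle (assumed: list index).
    int create(String name) {
        models.add(name);
        return models.size() - 1;
    }

    // Mirrors the documented contract of verifyModelId: throws if the ID is unknown.
    void verify(int modelId) {
        if (modelId < 0 || modelId >= models.size()) {
            throw new ArrayIndexOutOfBoundsException("No model with ID " + modelId);
        }
    }

    // Drops every registered model, analogous in spirit to destroyModels().
    void destroyAll() {
        models.clear();
    }
}
```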
-
createModel
public static int createModel(int nFeatures, double p, double learningRate, int precisionBits, RLink.ThermometerType thermometerType, int maxSubsetSize, int[] falseSubsets)
Creates a new untrained RLink model. ModelConfig.build() must be used from other packages.
- Parameters:
nFeatures - number of model features.
p - Minkowski norm value for combining feature values.
learningRate - the learning rate.
precisionBits - defines precision of internal model weights.
thermometerType - thermometers used by model. Use ARRAY.
maxSubsetSize - if it equals nFeatures, a large model is created that is able to take subsets into account. If 1, a small model is created that can support more features. Other values are not supported.
falseSubsets - subsets that are always false. May be null.
- Returns:
- the ID of the new model
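For readers unfamiliar with the Minkowski norm named by parameter p, the sketch below computes the standard p-norm of a vector. It shows only the general mathematical definition; how RLink combines feature values internally is not documented here, and the class `MinkowskiSketch` is hypothetical.

```java
// Hypothetical illustration of the general Minkowski p-norm; not RLink code.
class MinkowskiSketch {
    // Standard definition: (sum of |x_i|^p)^(1/p), valid as a norm for p >= 1.
    static double norm(double[] x, double p) {
        if (!(p >= 1.0)) {
            throw new IllegalArgumentException("p must be >= 1, got " + p);
        }
        double sum = 0.0;
        for (double v : x) {
            sum += Math.pow(Math.abs(v), p);
        }
        return Math.pow(sum, 1.0 / p);
    }
}
```

With p = 1 this is the sum of absolute values, and with p = 2 it is the Euclidean length.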
-
setSubsetTrainMode
public static void setSubsetTrainMode(int modelId, RLink.SubsetTrainMode stMode)
Set the subset training mode of an existing model. If this call is not used, the default subset training mode is TAPER. For model creation, this is typically set in ModelConfig. If the model is read from file, the subset training mode is set to the default. This method can be used to change it before any further training of the loaded model.
- Parameters:
modelId - ID of the model that was created or read from file.
stMode - the new subset training mode. Not null.
-
getSubsetTrainMode
public static RLink.SubsetTrainMode getSubsetTrainMode(int modelId)
- Returns:
- the subset training mode of an existing model.
-
verifyTaper
public static void verifyTaper(double positiveTaper, double negativeTaper)
- Throws:
java.lang.IllegalArgumentException - if any taper value is not between 0 and 1.
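A range check of the kind described for verifyTaper might look like the following. This is a sketch under assumptions: the bounds are treated as inclusive and NaN is rejected, neither of which the documentation states, and the class `TaperCheckSketch` is hypothetical.

```java
// Hypothetical re-creation of the documented taper check; not RLink code.
class TaperCheckSketch {
    static void verifyTaper(double positiveTaper, double negativeTaper) {
        checkUnitInterval("positiveTaper", positiveTaper);
        checkUnitInterval("negativeTaper", negativeTaper);
    }

    private static void checkUnitInterval(String name, double value) {
        // The negated form also rejects NaN, which fails every comparison.
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException(
                    name + " must be between 0 and 1, got " + value);
        }
    }
}
```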
-
setTaper
public static void setTaper(int modelId, double positiveTaper, double negativeTaper)
Set parameters for the tapering function that generates examples for subsets. This is only relevant when SubsetTrainMode.TAPER is used. Larger taper values result in more generated examples being skipped. For model creation, this is typically set in ModelConfig. If the model is read from file, taper values are set to their defaults. This method can be used to change them before any further training of the loaded model.
- Parameters:
modelId - ID of the model that was created or read from file.
positiveTaper - steepness of tapering function for examples with True labels.
negativeTaper - steepness of tapering function for examples with False labels.
- Throws:
java.lang.IllegalArgumentException - if any taper value is not between 0 and 1.
-
getPositiveTaper
public static double getPositiveTaper(int modelId)
- Returns:
- the tapering value for true-labeled items
-
getNegativeTaper
public static double getNegativeTaper(int modelId)
- Returns:
- the tapering value for false-labeled items
-
verifyMissingInfoLimit
public static void verifyMissingInfoLimit(double value)
- Throws:
java.lang.IllegalArgumentException - if the missing info limit is not between 0 and 1.
-
setMissingInfoLimit
public static void setMissingInfoLimit(int modelId, double value)
Sets the percentage of values that may be missing in a generated training example for a subset. This is only relevant when SubsetTrainMode.FIXED is used. For model creation, this is typically set in ModelConfig. If the model is read from file, the missing info limit is set to the default. This method can be used to change it before any further training of the loaded model.
- Parameters:
modelId - ID of the model that was created or read from file.
value - percentage of values that may be missing in a generated example.
- Throws:
java.lang.IllegalArgumentException - if the missing info limit is not between 0 and 1 or the modelId is invalid.
-
getMissingInfoLimit
public static double getMissingInfoLimit(int modelId)
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the percentage of values that may be missing in a generated training example for a subset. This is only relevant when the model was trained in SubsetTrainMode.FIXED mode.
- Throws:
java.lang.IllegalArgumentException - if the modelId is invalid.
java.lang.IllegalStateException - if model was not trained in SubsetTrainMode.FIXED mode.
-
setDynamic
public static void setDynamic(int modelId, double[] trueInsertScores, double[] falseInsertScores, double[] trueRemoveLimits, double[] falseRemoveLimits)
Sets parameters for removal and insertion of scores to generate related training vectors when using SubsetTrainMode.DYNAMIC. For model creation, this is typically set in STDynamic. This method can be used to change the parameters before any further training of the loaded model.
- Parameters:
modelId - ID of the model that was created or read from file.
trueInsertScores - Score to insert into true-labeled vectors when filling in a missing score. This score should be close to 1.0. Must be between 0.0 and 1.0, or RL_DYN_NO_INSERT, or RL_DYN_INSERT_DEFAULT.
falseInsertScores - Score to insert into false-labeled vectors when filling in a missing score. This score should be close to 0.0. Must be between 0.0 and 1.0, or RL_DYN_NO_INSERT, or RL_DYN_INSERT_DEFAULT.
trueRemoveLimits - Scores above this value will not be removed from true-labeled vectors. It is recommended that this be no higher than 0.75. Must be between 0.0 and 1.0, or RL_DYN_NO_REMOVE.
falseRemoveLimits - Scores below this value will not be removed from false-labeled vectors. It is recommended that this be no lower than 0.50. Must be between 0.0 and 1.0, or RL_DYN_NO_REMOVE.
- Throws:
java.lang.IllegalArgumentException - if the length of any given array does not match the number of features for this model, or an array contains an invalid value.
java.lang.ArrayIndexOutOfBoundsException - if modelId is invalid.
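Because each array passed to setDynamic must have one entry per model feature, it can help to build all four from a single feature count. The sketch below fills uniform arrays and range-checks the value. The class `DynamicParamsSketch`, the `uniform` helper, and the example values 0.95 and 0.05 are illustrative assumptions; the sentinel constants (RL_DYN_NO_INSERT and the others) are left out because their values are not given in this documentation.

```java
import java.util.Arrays;

// Hypothetical helper for preparing setDynamic arguments; not part of RLink.
class DynamicParamsSketch {
    // Builds an array with one identical entry per feature, checking the 0..1 range.
    static double[] uniform(int nFeatures, double value) {
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException("value must be between 0.0 and 1.0, got " + value);
        }
        double[] result = new double[nFeatures];
        Arrays.fill(result, value);
        return result;
    }

    public static void main(String[] args) {
        int nFeatures = 4; // in real use this would come from getFeatureCount(modelId)
        double[] trueInsertScores = uniform(nFeatures, 0.95);  // close to 1.0 (assumed value)
        double[] falseInsertScores = uniform(nFeatures, 0.05); // close to 0.0 (assumed value)
        double[] trueRemoveLimits = uniform(nFeatures, 0.75);  // documented recommended maximum
        double[] falseRemoveLimits = uniform(nFeatures, 0.50); // documented recommended minimum
        // These four arrays are what setDynamic expects after the modelId argument.
        System.out.println(trueInsertScores.length + " " + falseInsertScores.length + " "
                + trueRemoveLimits.length + " " + falseRemoveLimits.length);
    }
}
```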
-
getTrueInsertScores
public static double[] getTrueInsertScores(int modelId)
- Returns:
- the scores to insert into true-labeled vectors when filling in a missing score.
-
getFalseInsertScores
public static double[] getFalseInsertScores(int modelId)
- Returns:
- the scores to insert into false-labeled vectors when filling in a missing score.
-
getTrueRemoveLimits
public static double[] getTrueRemoveLimits(int modelId)
- Returns:
- the scores above which values will not be removed from true-labeled vectors
-
getFalseRemoveLimits
public static double[] getFalseRemoveLimits(int modelId)
- Returns:
- the scores below which values will not be removed from false-labeled vectors
-
write
public static void write(int modelId, java.lang.String fileName) throws java.io.IOException
Saves the existing model to file. Uses the same format as the file that was read.
- Parameters:
modelId - ID of the model that was created or read from file.
fileName - name of model binary file.
- Throws:
java.lang.IllegalArgumentException - if fileName is null.
java.io.IOException
-
read
public static int read(java.lang.String fileName) throws java.io.FileNotFoundException
Loads a model from file. The subset train mode, positive and negative taper, and missing info limit are set to their default values. See the description of the ModelConfig class for the default values.
- Parameters:
fileName - name of model binary file.
- Returns:
- the ID (index) of the newly loaded model
- Throws:
java.io.FileNotFoundException - if the specified model file does not exist.
java.lang.IllegalArgumentException - if fileName is null.
-
setHeader
public static void setHeader(java.lang.String inFile, java.lang.String outFile, java.lang.String metaData, java.lang.String version, double threshold)
Copy a model file, updating the header values. This copies a model file from one location to another, updating one or more of the header values. It is much faster than loading the model into memory and then writing the model out. The fastest way to update header values is to use this method to copy the file, delete the old file, and then rename the new file to the old name.
- Parameters:
inFile - path to input file.
outFile - path to output file.
metaData - new meta data value for header. If null, meta data is not updated.
version - new version value for header. If null, version is not updated.
threshold - new threshold value for header. If negative, threshold is not updated.
- Throws:
java.lang.IllegalArgumentException - on any errors reading or writing the files.
java.lang.NullPointerException - if a required argument is null or on other errors.
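The copy, delete, and rename sequence recommended for setHeader could be sketched as follows. To keep the example runnable without the native library, the header-updating copy is hidden behind a hypothetical `HeaderCopier` interface; in real use it would delegate to setHeader. The class `HeaderSwapSketch`, the ".tmp" suffix, and the use of Files.move with REPLACE_EXISTING (which folds the delete and rename steps into one call) are assumptions of this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical wrapper around the copy-then-rename workflow; not part of RLink.
class HeaderSwapSketch {
    // Stand-in for a call such as setHeader(inFile, outFile, ...).
    interface HeaderCopier {
        void copy(String inFile, String outFile) throws IOException;
    }

    static void updateInPlace(Path model, HeaderCopier copier) throws IOException {
        // Write the updated copy next to the original file.
        Path tmp = model.resolveSibling(model.getFileName() + ".tmp");
        try {
            copier.copy(model.toString(), tmp.toString());
            // Replace the old file with the updated copy under the old name.
            Files.move(tmp, model, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up after a failed copy; does nothing after a successful move.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Writing the copy into the same directory keeps the final move on one file system, so it does not degrade into a second full copy.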
-
setMetadata
public static void setMetadata(int modelId, java.lang.String metadata)
Sets the notes (meta-data) field for an existing model.
- Parameters:
modelId - ID of the model that was created or read from file.
metadata - the text of meta-data. May be null to clear meta-data.
-
getMetadata
public static java.lang.String getMetadata(int modelId)
Returns the notes (meta-data) of the model.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- notes (meta-data) from an existing model.
-
getThreshold
public static double getThreshold(int modelId)
Return the threshold for this model. If no threshold was set for this model, -1.0 is returned.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the threshold value for this model.
-
setThreshold
public static void setThreshold(int modelId, double threshold)
Set the cutoff threshold for the model.
- Parameters:
modelId - ID of the model that was created or read from file.
threshold - the threshold value for this model. The value -1.0 indicates that the threshold from the model should not be used.
- Throws:
java.lang.IllegalArgumentException - if threshold is greater than 1.0.
-
getVersion
public static java.lang.String getVersion(int modelId)
Return the version ID for this model.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the version string for this model, or null if it is not set.
-
setVersion
public static void setVersion(int modelId, java.lang.String version)
Set the version ID of the model.
- Parameters:
modelId - ID of the model that was created or read from file.
version - version ID as a string. This should not be null.
-
getID
public static java.lang.String getID(int modelId)
Return the unique ID for this model.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the ID string for this model, or null if it is not set.
-
setID
public static void setID(int modelId, java.lang.String id)
Set the ID of the model.
- Parameters:
modelId - ID of the model that was created or read from file.
id - unique ID. This should not be null.
-
getSkippedCount
public static int getSkippedCount(int modelId)
Skipped count is applicable only for TAPER subset training mode. This parameter is reset to 0 after a model has been loaded from file.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the number of generated examples that were skipped during learning.
-
getTrainedCount
public static int getTrainedCount(int modelId)
Trained count is applicable only for TAPER subset training mode. This parameter is reset to 0 after a model has been loaded from file.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the number of generated examples that were used (not skipped) during learning.
-
getFeatureCount
public static int getFeatureCount(int modelId)
Gets the number of model features.
- Parameters:
modelId - ID of the model that was created or read from file.
- Returns:
- the number of features in the specified model.
-
getNorm
public static double getNorm(int modelId)
- Returns:
- the Minkowski norm value for combining feature values used by the specified model.
-
getPrecisionBits
public static int getPrecisionBits(int modelId)
Returns the number of precision bits, or -1 for file versions below RFV5.
- Returns:
- the precision of internal model weights of the specified model.
-
getInitialLearningRate
public static double getInitialLearningRate(int modelId)
Returns the initial learning rate, or -1 for file versions below RFV5.
- Returns:
- the initial learning rate of the specified model.
-
setAnnealingRate
public static void setAnnealingRate(int modelId, double value)
Sets the annealing rate (the speed of the learning rate decrease with each training iteration).
- Parameters:
value - the annealing rate. Larger values decrease the learning rate faster. 0 means the learning rate stays the same.
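The documentation does not give the formula RLink uses for annealing. Purely to illustrate a schedule with the two stated properties (larger rates decay faster, and 0 leaves the learning rate unchanged), the sketch below uses inverse-time decay. The formula, the class `AnnealingSketch`, and its method are assumptions and should not be read as RLink's actual schedule.

```java
// Hypothetical annealing schedule for illustration only; not RLink's formula.
class AnnealingSketch {
    // Assumed inverse-time decay: rate_t = initialRate / (1 + annealingRate * t).
    static double learningRateAt(double initialRate, double annealingRate, int iteration) {
        return initialRate / (1.0 + annealingRate * iteration);
    }

    public static void main(String[] args) {
        // Show how the assumed schedule shrinks the rate over five iterations.
        for (int t = 0; t < 5; t++) {
            System.out.println(t + ": " + learningRateAt(0.1, 0.5, t));
        }
    }
}
```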
-
getThermometerType
public static RLink.ThermometerType getThermometerType(int modelId)
Returns the thermometer type, or throws exception for file versions below RFV5 (thermometer type is -1, which is invalid).
- Returns:
- the thermometer type used by the specified model.
- Throws:
java.lang.IllegalArgumentException - for file versions below RFV5.
-
predict
public static RLinkOut predict(int modelId, double[] featureValues, PredictOptions predictOpts)
Predicts the score (and label), calculates the requested confidence and significance.
- Parameters:
modelId - ID of the model that was created or read from file.
featureValues - the feature vector.
predictOpts - the options to use for this prediction. If null, uses default options.
- Returns:
- prediction from the existing model of the given feature vector.
-
predict
public static RLinkOut predict(int modelId, double[] featureValues)
Predicts the score (and label) using default options. See predict(int, double[], PredictOptions).
-
learn
public static void learn(int modelId, double[] featureValues, boolean label)
Trains the existing model with the given training example (feature vector and label).
- Parameters:
modelId - ID of the model to be trained.
featureValues - the feature vector.
label - the actual label to be learned for this feature vector.
-
beginIteration
public static void beginIteration(int modelId)
Must be called before each training iteration with the entire training dataset.
- Parameters:
modelId - ID of the model being trained.
-
endIteration
public static void endIteration(int modelId)
Must be called after each training iteration with the entire training dataset.
- Parameters:
modelId - ID of the model being trained.
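The call order implied by beginIteration, learn, and endIteration can be sketched as a loop over training iterations. To stay runnable without the native library, the sketch talks to a hypothetical `Trainer` interface whose methods mirror the three RLink static methods (minus the modelId argument); the `Example` class, `TrainingLoopSketch`, and the iteration count are likewise illustrative assumptions.

```java
import java.util.List;

// Hypothetical training driver showing the documented call order; not part of RLink.
class TrainingLoopSketch {
    // Stand-in mirroring beginIteration, learn and endIteration for one model.
    interface Trainer {
        void beginIteration();
        void learn(double[] featureValues, boolean label);
        void endIteration();
    }

    // One labeled training example.
    static final class Example {
        final double[] featureValues;
        final boolean label;

        Example(double[] featureValues, boolean label) {
            this.featureValues = featureValues;
            this.label = label;
        }
    }

    // Runs the given number of passes over the entire training dataset.
    static void train(Trainer trainer, List<Example> dataset, int iterations) {
        for (int i = 0; i < iterations; i++) {
            trainer.beginIteration();          // before each full pass
            for (Example ex : dataset) {
                trainer.learn(ex.featureValues, ex.label);
            }
            trainer.endIteration();            // after each full pass
        }
    }
}
```

In real use, a `Trainer` implementation would hold a modelId and forward each call to the corresponding RLink method.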
-
-