Class COErrorRateMin
- java.lang.Object
  - com.tibco.patterns.learn.training.COIterationCount
    - com.tibco.patterns.learn.training.COErrorRate
      - com.tibco.patterns.learn.training.COErrorRateMin
- All Implemented Interfaces:
ConvergenceObserver, TrainingObserver
public class COErrorRateMin extends COErrorRate
Stops training once a minimum error rate for the validation dataset has been found and has not improved during the specified number of subsequent iterations. This observer class is recommended in most cases.
The iteration with the minimum validation error rate is the best iteration. If several iterations have the same validation error rate, the one with the lower training error rate is considered the best. If the training error rates are also identical, this class selects the iteration with the lower mean distance between the model prediction score and the actual label over all examples in the validation dataset. Iterations where the training error rate is significantly higher than the validation error rate are not selected, to prevent underfitting.
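The tie-break order described above can be sketched in plain Java. This is only an illustration under stated assumptions: `IterationStats`, `BestIterationSketch`, and their members are hypothetical names, not part of the library, and the sketch leaves out both the underfitting exclusion and the minimum-iteration rule.

```java
// Hypothetical stand-in for one iteration's results (not COErrorRate.IterationResult).
final class IterationStats {
    final int iteration;       // 1-based iteration number
    final double trainError;   // training error rate
    final double vldError;     // validation error rate
    final double vldMeanDist;  // mean distance between score and label on validation data

    IterationStats(int iteration, double trainError, double vldError, double vldMeanDist) {
        this.iteration = iteration;
        this.trainError = trainError;
        this.vldError = vldError;
        this.vldMeanDist = vldMeanDist;
    }
}

final class BestIterationSketch {
    // True if candidate beats best: validation error first, then training
    // error, then mean distance on the validation dataset.
    static boolean isBetter(IterationStats candidate, IterationStats best) {
        if (candidate.vldError != best.vldError) {
            return candidate.vldError < best.vldError;
        }
        if (candidate.trainError != best.trainError) {
            return candidate.trainError < best.trainError;
        }
        return candidate.vldMeanDist < best.vldMeanDist;
    }

    // Returns the best entry, or null for an empty history.
    // On a complete tie the earlier iteration is kept (an assumption of this sketch).
    static IterationStats selectBest(IterationStats[] history) {
        IterationStats best = null;
        for (IterationStats s : history) {
            if (best == null || isBetter(s, best)) {
                best = s;
            }
        }
        return best;
    }
}
```

In this sketch, iterations that share the lowest validation error rate are separated by training error rate, and only then by mean distance.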
After the best number of iterations is determined, a new model should be created and trained with this exact number of iterations. Use createTrainCO() and createVldCO() to obtain observers for such retraining. If a random order of training examples is used, the order of examples during retraining must be the same as in the first training pass to ensure the same result.
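One way to meet the same-order requirement is to derive the example order from a fixed seed, so that both training passes see an identical permutation. The sketch below assumes the caller controls shuffling itself; `ReproducibleOrderSketch` and `shuffledOrder` are hypothetical names, and how the order is actually supplied to the trainer is not covered here.

```java
final class ReproducibleOrderSketch {
    // Returns the indices 0..n-1 in a pseudo-random order determined only by the seed.
    static int[] shuffledOrder(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        java.util.Random rnd = new java.util.Random(seed);
        // Fisher-Yates shuffle: swap each position with a random earlier-or-equal one.
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }
}
```

Calling `shuffledOrder` with the same seed for the first pass and for retraining yields the same permutation both times.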
This observer can only be used with the validation dataset. It is used together with a COErrorRate observer that is created for the training dataset. The target error rate inherited from COErrorRate is not used.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.tibco.patterns.learn.training.COErrorRate
COErrorRate.IterationResult
-
-
Field Summary
Fields
| Modifier and Type | Field | Description |
|---|---|---|
| protected COErrorRate.IterationResult | bestRes | |
Fields inherited from class com.tibco.patterns.learn.training.COErrorRate
currRes, prevRes
-
-
Constructor Summary
Constructors
| Constructor | Description |
|---|---|
| COErrorRateMin(COErrorRate trainCO) | Creates object with default parameters. |
| COErrorRateMin(COErrorRate trainCO, int nIterAfterMin) | Creates object with specified parameters. |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| COErrorRate | createTrainCO() | Creates a COErrorRate object for the training dataset to be used for retraining the model with the exact best number of iterations. |
| COErrorRate | createVldCO() | Creates a COErrorRate object for the validation dataset to be used for retraining the model with the exact best number of iterations. |
| void | endIteration() | If the last iteration was the best, then remembers its results. |
| int | getBestIteration() | |
| COErrorRate.IterationResult | getBestResult() | |
| static double | getDftGoodFitDistance() | |
| static int | getDftMaxIterations() | Get default maximum iterations for COErrorRateMin. |
| static int | getDftNIterAfterMin() | |
| double | getGoodFitDistance() | |
| int | getNIterAfterMin() | |
| COErrorRate | getTrainCO() | |
| boolean | isCompareVldFirst() | |
| boolean | isConverged() | Determines if further training is unlikely to improve the best result that was found. |
| boolean | needRetrain() | Returns true if the model needs to be retrained. |
| boolean | needStopTraining() | Determines if training needs to be stopped. |
| void | setCompareVldFirst(boolean value) | Sets a flag whether to compare the Validation or the Training error rates first. |
| void | setGoodFitDistance(double goodFitDistance) | Sets the maximum distance that the training error rate can be above the validation error rate and the result is deemed a good fit, i.e. the model does not underfit the training data. |
| void | setNIterAfterMin(int nIterAfterMin) | Sets the number of iterations to explore after the best iteration. |
| java.lang.String | toString() | |
Methods inherited from class com.tibco.patterns.learn.training.COErrorRate
beginIteration, evaluatePrediction, getDftTargetErrorRate, getMinIterations, getProgressEstimate, getResult, getTargetErrorRate, isPerfectResult, printHeaderLine, setMaxIterations, setMinIterations, setPrintOptions
-
Methods inherited from class com.tibco.patterns.learn.training.COIterationCount
getMaxIterations, getNIterations, hasIterations, verifyIterationBegun, verifyIterationEnded
-
-
-
-
Field Detail
-
bestRes
protected COErrorRate.IterationResult bestRes
-
-
Constructor Detail
-
COErrorRateMin
public COErrorRateMin(COErrorRate trainCO, int nIterAfterMin)
Creates object with specified parameters. This observer must be created only for the validation dataset.
- Parameters:
trainCO - observer of the training dataset. Not null. Use COErrorRate. Must be used to train the same model.
nIterAfterMin - the number of iterations that will be performed after the current best result (with the lowest error rate) is found to search for a better result.
- Throws:
java.lang.IllegalArgumentException - if nIterAfterMin is less than 1.
-
COErrorRateMin
public COErrorRateMin(COErrorRate trainCO)
Creates object with default parameters.
- See Also:
getDftNIterAfterMin()
-
-
Method Detail
-
toString
public java.lang.String toString()
- Overrides:
toString in class COErrorRate
-
getDftNIterAfterMin
public static int getDftNIterAfterMin()
- Returns:
- default number of iterations that will be performed after the current best result is found to search for a better result (35).
-
getDftGoodFitDistance
public static double getDftGoodFitDistance()
- Returns:
- default maximum distance that the training error rate can be above the validation error rate and the result is deemed a good fit, i.e. the model does not underfit the training data (0.01).
-
getDftMaxIterations
public static int getDftMaxIterations()
Gets the default maximum number of iterations for COErrorRateMin. It allows "unlimited" iterations to find the minimum error rate. The default differs from that of COIterationCount. Hides COIterationCount.getDftMaxIterations().
- Returns:
- default maximum iterations used in COErrorRateMin (Integer.MAX_VALUE).
-
getTrainCO
public COErrorRate getTrainCO()
- Returns:
- the observer of the training dataset that was provided when this object was created.
-
getNIterAfterMin
public int getNIterAfterMin()
- Returns:
- the number of iterations that will be performed after the current best result (with lowest error rate) is found to search for a better result.
-
setNIterAfterMin
public final void setNIterAfterMin(int nIterAfterMin)
Sets the number of iterations to explore after the best iteration.
- Parameters:
nIterAfterMin - the number of iterations that will be performed after the current best result (with the lowest error rate) is found to search for a better result.
- Throws:
java.lang.IllegalArgumentException - if nIterAfterMin is less than 1.
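The role of nIterAfterMin can be illustrated by replaying a series of validation error rates and reporting where training would stop. This is a simplified sketch, not the class's actual implementation: `PatienceSketch` and `stopIteration` are hypothetical names, only validation error rates are compared, and the minimum-iteration and underfitting rules are ignored.

```java
final class PatienceSketch {
    // Returns the 1-based iteration at which training stops: either nIterAfterMin
    // iterations after the best one without a strictly lower validation error,
    // or the iteration limit (maxIterations, or the end of the recorded series).
    static int stopIteration(double[] vldErrors, int nIterAfterMin, int maxIterations) {
        if (nIterAfterMin < 1) {
            throw new IllegalArgumentException("nIterAfterMin must be at least 1");
        }
        int limit = Math.min(vldErrors.length, maxIterations);
        int bestIter = 0;
        double bestErr = Double.POSITIVE_INFINITY;
        for (int iter = 1; iter <= limit; iter++) {
            double err = vldErrors[iter - 1];
            if (err < bestErr) {       // strictly better: new best iteration
                bestErr = err;
                bestIter = iter;
            }
            if (iter - bestIter >= nIterAfterMin) {
                return iter;           // no improvement for nIterAfterMin iterations
            }
        }
        return limit;                  // stopped by the iteration limit
    }
}
```

With the series 0.30, 0.20, 0.15, 0.16, 0.15, 0.17, 0.18 and nIterAfterMin set to 3, the best iteration is the third, and the sketch stops at the sixth. In that situation the best iteration is not the last one, which is the case where retraining with the exact number of iterations applies.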
-
getGoodFitDistance
public double getGoodFitDistance()
- Returns:
- maximum distance that the training error rate can be above the validation error rate and the result is deemed a good fit, i.e. the model does not underfit the training data.
-
setGoodFitDistance
public void setGoodFitDistance(double goodFitDistance)
Sets the maximum distance that the training error rate can be above the validation error rate and the result is deemed a good fit, i.e. the model does not underfit the training data. If there are such iterations, then the training does not stop at iterations where there is underfitting.
- Parameters:
goodFitDistance - the value to set. Should be slightly above 0 (e.g. 0.005-0.01). Using 1.0 or greater ignores underfitting (not recommended).
- Throws:
java.lang.IllegalArgumentException - if the given value is negative.
- See Also:
getDftGoodFitDistance()
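The good-fit condition can be read as a simple comparison of the two error rates. The sketch below is an interpretation of the description, not the library's code: `GoodFitSketch` and `isGoodFit` are hypothetical names, and treating the boundary as inclusive is an assumption.

```java
final class GoodFitSketch {
    // True if the training error rate exceeds the validation error rate
    // by no more than goodFitDistance (boundary treated as inclusive here).
    static boolean isGoodFit(double trainError, double vldError, double goodFitDistance) {
        if (goodFitDistance < 0) {
            throw new IllegalArgumentException("goodFitDistance must not be negative");
        }
        return trainError - vldError <= goodFitDistance;
    }
}
```

Because error rates lie between 0 and 1, their difference never exceeds 1.0, which is why a distance of 1.0 or greater effectively disables the underfitting check.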
-
isCompareVldFirst
public boolean isCompareVldFirst()
- Returns:
- true (default) if the Validation error rates are compared first, and the Training error rates are compared only if the Validation error rates are equal. Returns false if the Training error rates are compared first.
-
setCompareVldFirst
public void setCompareVldFirst(boolean value)
Sets a flag whether to compare the Validation or the Training error rates first. If this is true (default), Validation error rates are compared first, and the Training error rates are compared only if the Validation error rates are equal.
If false, the Training error rates are compared first, so the training stops at an iteration where the Training error is the lowest. This can be used to identify mislabeled examples in the Training dataset. It should not be used for model comparisons or to train the final production model.
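The effect of the flag can be shown on a small set of results, where swapping the comparison order changes which iteration wins. This is a sketch under stated assumptions: `CompareOrderSketch` and `bestIteration` are hypothetical names, each result is reduced to a pair of training and validation error rates, and the input must contain at least one result.

```java
final class CompareOrderSketch {
    // Each row is {trainError, vldError}. Returns the 1-based number of the best row.
    static int bestIteration(double[][] results, boolean compareVldFirst) {
        int first = compareVldFirst ? 1 : 0;   // column compared first
        int second = compareVldFirst ? 0 : 1;  // column used only to break ties
        int best = 0;
        for (int i = 1; i < results.length; i++) {
            double[] c = results[i];
            double[] b = results[best];
            boolean better = c[first] < b[first]
                    || (c[first] == b[first] && c[second] < b[second]);
            if (better) {
                best = i;
            }
        }
        return best + 1;
    }
}
```

For a model whose training error keeps falling while its validation error rises, comparing validation first picks an early iteration, and comparing training first picks the last one.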
-
getBestResult
public final COErrorRate.IterationResult getBestResult()
- Returns:
- a copy of the best result, or null if no training was performed.
-
getBestIteration
public int getBestIteration()
- Returns:
- the number of the best iteration. 0 if no iterations have ended.
-
needRetrain
public boolean needRetrain()
Returns true if the model needs to be retrained. Should be called after completing the first training pass.
- Returns:
- true if the best iteration is not the last iteration and thus the model needs to be retrained with the exact number of iterations.
- Throws:
java.lang.IllegalStateException - if the iteration has not ended or no iterations were performed.
-
endIteration
public void endIteration()
If the last iteration was the best, then remembers its results. Until the minimum number of iterations is reached, the last iteration is considered the best. Overriding methods must call this method.
- Specified by:
endIteration in interface TrainingObserver
- Overrides:
endIteration in class COErrorRate
- Throws:
java.lang.IllegalStateException - if the iteration has not begun.
-
isConverged
public boolean isConverged()
Determines if further training is unlikely to improve the best result that was found.
- Specified by:
isConverged in interface ConvergenceObserver
- Overrides:
isConverged in class COErrorRate
- Returns:
- true if the best iteration was found and the number of iterations specified in this object were then performed without improving on that result. Until the minimum number of iterations is reached, the last iteration is considered the best.
- Throws:
java.lang.IllegalStateException - if the iteration has not ended or no iterations were performed.
-
needStopTraining
public boolean needStopTraining()
Determines if training needs to be stopped. Uses the same criteria as isConverged(), and the maximum number of iterations.
- Specified by:
needStopTraining in interface ConvergenceObserver
- Overrides:
needStopTraining in class COErrorRate
- Returns:
- true if the best iteration was found and the number of iterations specified in this object were then performed without improving on that result. Also returns true if the maximum number of iterations was performed. Until the minimum number of iterations is reached, the last iteration is considered the best.
- Throws:
java.lang.IllegalStateException - if the iteration has not ended or no iterations were performed.
-
createVldCO
public COErrorRate createVldCO()
Creates a COErrorRate object for the validation dataset to be used for retraining the model with the exact best number of iterations. Retraining must use the same order of training examples.
- Returns:
- the newly created observer. Its print options are the same as for this object.
- Throws:
java.lang.IllegalStateException - if the iteration has not ended or no iterations were performed.
-
createTrainCO
public COErrorRate createTrainCO()
Creates a COErrorRate object for the training dataset to be used for retraining the model with the exact best number of iterations.
- Returns:
- the newly created observer. Its print options are the same as for the trainCO parameter that was passed to the constructor.
- Throws:
java.lang.IllegalStateException - if the iteration has not ended or no iterations were performed.
-
-