create
public static HintAddUnderrepr create(FeatureQuery fq,
SubsetFamily trainSubsets,
double nStdDevBelowMean,
int minNPairs)
Creates a hint object containing all underrepresented subsets, selected from the subsets
in the training dataset. A subset is underrepresented if the number of pairs for
that subset in the training dataset is:
1. < minNPairs, OR
2. < (mean number of pairs) - nStdDevBelowMean * (standard deviation of number of pairs), if the standard deviation
is small enough, OR
3. < a number that approaches the mean, if the standard deviation is very large
Assumes a trained model was saved (uses feature scores).
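The first two conditions can be sketched as follows. This is a minimal illustration, not the library's implementation: the class UnderreprSketch, the method underrepresented, and the map from subset name to pair count are hypothetical stand-ins for SubsetFamily. The sketch assumes a population standard deviation and applies condition 2 unconditionally. Condition 3 is omitted because the formula for large standard deviations is not given here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of the documented API.
final class UnderreprSketch {

    /**
     * Returns the names of subsets whose pair count is below minNPairs
     * or below mean - nStdDevBelowMean * stdDev (conditions 1 and 2 only).
     */
    static List<String> underrepresented(Map<String, Integer> pairCounts,
                                         double nStdDevBelowMean,
                                         int minNPairs) {
        if (nStdDevBelowMean < 0 || minNPairs < 0) {
            throw new IllegalArgumentException("parameters must be non-negative");
        }
        List<String> result = new ArrayList<>();
        if (pairCounts.isEmpty()) {
            return result;
        }

        // Mean number of pairs per subset.
        double mean = 0.0;
        for (int n : pairCounts.values()) {
            mean += n;
        }
        mean /= pairCounts.size();

        // Population standard deviation (an assumption of this sketch).
        double sumSq = 0.0;
        for (int n : pairCounts.values()) {
            sumSq += (n - mean) * (n - mean);
        }
        double stdDev = Math.sqrt(sumSq / pairCounts.size());

        // Condition 2 threshold, applied here regardless of how large stdDev is.
        double statThreshold = mean - nStdDevBelowMean * stdDev;

        for (Map.Entry<String, Integer> e : pairCounts.entrySet()) {
            int n = e.getValue();
            if (n < minNPairs || n < statThreshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 100);
        counts.put("b", 100);
        counts.put("c", 100);
        counts.put("d", 20);
        // mean = 80, stdDev is about 34.64, so the threshold is about 45.36.
        System.out.println(underrepresented(counts, 1.0, 5)); // prints [d]
    }
}
```

With nStdDevBelowMean = 2.0 the statistical threshold drops to about 10.72, so no subset in this example is flagged unless minNPairs catches it.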
- Parameters:
trainSubsets - all subsets in the training dataset. There is no need to search the validation
dataset; it is already analyzed for untrained subsets
nStdDevBelowMean - the limit (expressed in standard deviations below the mean)
minNPairs - the minimum number of pairs required in the training dataset for any subset.
- Returns:
the new hint object, or null if no underrepresented subsets can be found
(or the model was never trained).
- Throws:
java.lang.IllegalArgumentException - if nStdDevBelowMean or minNPairs is negative.