Class HintAddUnderrepr

  • All Implemented Interfaces:
    Hint

    public final class HintAddUnderrepr
    extends HintAddPairs
    Hint to add pairs to subsets that are underrepresented in the training dataset.
    • Method Detail

      • create

        public static HintAddUnderrepr create(FeatureQuery fq,
                                              SubsetFamily trainSubsets,
                                              double nStdDevBelowMean,
                                              int minNPairs)
        Creates a hint object with all underrepresented subsets, selecting them from the subsets in the training dataset. A subset is underrepresented if the number of pairs for that subset in the training dataset is:
        1. < minNPairs, OR
        2. < (mean number of pairs) - nStdDevBelowMean * (standard deviation of the number of pairs), if the standard deviation is small enough, OR
        3. < a number that approaches the mean, if the standard deviation is very large.
        Assumes a trained model was saved (uses feature scores).
        Parameters:
        trainSubsets - all subsets in the training dataset. There is no need to search the validation dataset; it is already analyzed for untrained subsets.
        nStdDevBelowMean - the limit, expressed in standard deviations below the mean.
        minNPairs - the minimum number of pairs required in the training dataset for any subset.
        Returns:
        the new hint object. Returns null if no underrepresented subsets can be found (or the model was never trained).
        Throws:
        java.lang.IllegalArgumentException - if any of the parameters are negative.
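
        The first two conditions in the description of create can be illustrated with a minimal sketch. It is not the library's implementation: the class UnderreprSketch, the method underrepresented, and the map of subset names to pair counts are hypothetical names introduced for illustration, and the use of the population standard deviation is an assumption. The fallback for a very large standard deviation (condition 3) is not specified in this documentation, so the sketch leaves it out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the documented API.
class UnderreprSketch {

    /**
     * Returns the names of subsets whose pair count is below minNPairs
     * or below (mean - nStdDevBelowMean * stdDev).
     * Assumption: population standard deviation over all subsets.
     */
    static List<String> underrepresented(Map<String, Integer> pairCounts,
                                         double nStdDevBelowMean,
                                         int minNPairs) {
        if (nStdDevBelowMean < 0 || minNPairs < 0) {
            throw new IllegalArgumentException("parameters must not be negative");
        }
        List<String> result = new ArrayList<>();
        if (pairCounts.isEmpty()) {
            return result;
        }

        // Mean number of pairs per subset.
        double mean = 0.0;
        for (int count : pairCounts.values()) {
            mean += count;
        }
        mean /= pairCounts.size();

        // Population standard deviation of the pair counts.
        double sumSq = 0.0;
        for (int count : pairCounts.values()) {
            sumSq += (count - mean) * (count - mean);
        }
        double stdDev = Math.sqrt(sumSq / pairCounts.size());

        // Condition 2 threshold; condition 3 is intentionally not modeled.
        double threshold = mean - nStdDevBelowMean * stdDev;

        for (Map.Entry<String, Integer> e : pairCounts.entrySet()) {
            int count = e.getValue();
            if (count < minNPairs || count < threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 100);
        counts.put("b", 90);
        counts.put("c", 110);
        counts.put("d", 10);
        counts.put("e", 3);
        // One standard deviation below the mean, at least 5 pairs per subset.
        System.out.println(underrepresented(counts, 1.0, 5)); // prints [d, e]
    }
}
```

        In this example the mean is 62.6 pairs and the standard deviation is roughly 46.3, so the threshold is about 16.3: subset d falls below it, and subset e is also below minNPairs. Unlike create, the sketch returns an empty list instead of null when nothing is underrepresented.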