Class DatasetStats
- java.lang.Object
-
- com.tibco.patterns.learn.api.hint.DatasetStats
-
public final class DatasetStats extends java.lang.Object
Stores statistics about existing pairs in a dataset, grouped by subset: the number of pairs with each label and the total number of labeled pairs. Used to determine whether to add another pair to the dataset.
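To illustrate the kind of bookkeeping described above, here is a minimal, self-contained sketch of per-subset label counts. It is not the DatasetStats implementation: the class SubsetLabelCounts, its methods, and the use of a plain string as the subset key are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the TIBCO class: tracks label counts per subset key.
class SubsetLabelCounts {
    // For each subset key: counts[0] = pairs labeled false, counts[1] = pairs labeled true.
    private final Map<String, int[]> countsBySubset = new HashMap<>();

    // Records one labeled pair as belonging to the given subset.
    void record(String subsetKey, boolean label) {
        int[] counts = countsBySubset.computeIfAbsent(subsetKey, k -> new int[2]);
        counts[label ? 1 : 0]++;
    }

    // Number of pairs in the subset that carry the given label.
    int count(String subsetKey, boolean label) {
        int[] counts = countsBySubset.get(subsetKey);
        return counts == null ? 0 : counts[label ? 1 : 0];
    }

    // Total number of labeled pairs in the subset.
    int totalLabeled(String subsetKey) {
        return count(subsetKey, true) + count(subsetKey, false);
    }
}
```

A caller could compare totalLabeled for a subset against some threshold of its own choosing before adding another pair; the actual criterion used by the library is not documented here.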
-
-
Constructor Summary
Constructors
- DatasetStats(FeatureQuery featureQuery, RecPairMap recPairMap, DataPartition partition): Calculates and stores all dataset statistics for the given dataset.
-
Method Summary
Modifier and Type, Method, Description
- static SubsetFamily calcDatasetSubsets(FeatureQuery fq, RecPairMap recPairMap, DataPartition partition, boolean boolLabelsOnly): Finds all subsets present in the specified dataset.
- SubsetFamily getFamilyAllSubsets()
- DataPartition getPartition()
- java.lang.String toString()
- void update(RecPair newRecPair): Updates statistics with the info about the pair that has been added to this dataset.
-
-
-
Constructor Detail
-
DatasetStats
public DatasetStats(FeatureQuery featureQuery, RecPairMap recPairMap, DataPartition partition)
Calculates and stores all dataset statistics for the given dataset.
- Parameters:
recPairMap - stores all pairs in the dataset.
featureQuery - stores all features.
-
-
Method Detail
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
getPartition
public DataPartition getPartition()
- Returns:
- the data partition that this object was created for.
-
getFamilyAllSubsets
public SubsetFamily getFamilyAllSubsets()
- Returns:
- the internal subset family that stores all subsets in the dataset.
-
calcDatasetSubsets
public static SubsetFamily calcDatasetSubsets(FeatureQuery fq, RecPairMap recPairMap, DataPartition partition, boolean boolLabelsOnly)
Finds all subsets present in the specified dataset. Only pairs that have feature values are counted as belonging to a subset.
- Parameters:
boolLabelsOnly - if true, only pairs with boolean labels are counted.
- Returns:
- a new subset family that contains these subsets.
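The filtering rules above can be sketched in isolation. The example below is only an illustration under stated assumptions: SketchPair and SubsetFinder are hypothetical stand-ins rather than the RecPair or SubsetFamily API, and it assumes that a subset is identified by the set of feature indices that have values, which this page does not confirm.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a record pair; NOT the RecPair API.
class SketchPair {
    final Double[] featureValues; // null array = no feature values; null entry = missing feature
    final Boolean boolLabel;      // null = the pair has no boolean label

    SketchPair(Double[] featureValues, Boolean boolLabel) {
        this.featureValues = featureValues;
        this.boolLabel = boolLabel;
    }
}

class SubsetFinder {
    // ASSUMPTION: a subset is keyed by the indices of the features that have values.
    static Set<Set<Integer>> findSubsets(List<SketchPair> pairs, boolean boolLabelsOnly) {
        Set<Set<Integer>> subsets = new LinkedHashSet<>();
        for (SketchPair pair : pairs) {
            if (pair.featureValues == null) {
                continue; // pairs without feature values belong to no subset
            }
            if (boolLabelsOnly && pair.boolLabel == null) {
                continue; // skip pairs without a boolean label when requested
            }
            Set<Integer> present = new TreeSet<>();
            for (int i = 0; i < pair.featureValues.length; i++) {
                if (pair.featureValues[i] != null) {
                    present.add(i);
                }
            }
            if (!present.isEmpty()) {
                subsets.add(present); // duplicates collapse because the outer collection is a set
            }
        }
        return subsets;
    }
}
```

Like calcDatasetSubsets, the sketch returns a new collection on every call instead of modifying shared state.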
-
update
public void update(RecPair newRecPair)
Updates statistics with information about the pair that has been added to this dataset. Must be called after adding each pair to keep statistics current.
- Parameters:
newRecPair - the pair that has just been added to this dataset.
- Throws:
java.lang.IllegalArgumentException - if newRecPair does not contain a boolean label or feature scores.
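The calling contract described above (add the pair, then update the statistics, with invalid pairs rejected) might look like the following sketch. All types here (PairSketch, RunningStats, DatasetSketch) are hypothetical stand-ins written for this example; they only mirror the documented behavior and are not the library's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a record pair; NOT the RecPair API.
class PairSketch {
    final Boolean boolLabel;      // null = no boolean label
    final double[] featureScores; // null = no feature scores

    PairSketch(Boolean boolLabel, double[] featureScores) {
        this.boolLabel = boolLabel;
        this.featureScores = featureScores;
    }
}

// Hypothetical stand-in that mirrors the documented update contract.
class RunningStats {
    private int labeledPairs;

    void update(PairSketch newPair) {
        // Reject pairs that lack a boolean label or feature scores, as the Throws clause describes.
        if (newPair.boolLabel == null || newPair.featureScores == null) {
            throw new IllegalArgumentException("pair must have a boolean label and feature scores");
        }
        labeledPairs++;
    }

    int totalLabeled() {
        return labeledPairs;
    }
}

// Keeps the pair list and the statistics in step by updating right after each add.
class DatasetSketch {
    final List<PairSketch> pairs = new ArrayList<>();
    final RunningStats stats = new RunningStats();

    void add(PairSketch pair) {
        pairs.add(pair);
        stats.update(pair); // called after adding, so the statistics stay current
    }
}
```

Routing every addition through a single method such as add is one way to make sure the update call is never forgotten.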
-
-