The data science operator sample group provides one small EventFlow application for each operator and most data constructs in the Palette view in StreamBase Studio.
- An operator is a StreamBase processing unit that performs predefined work on streaming data, such as aggregating windows of data streams, merging streams, or retrieving shared data from a table.
- A data construct is a component that can store information from a stream or from an external data source that can then be used by an associated Spotfire Streaming operator.

Each sample has a separate README that describes the steps to run that sample.

Component Sample | Description |
---|---|
Data Science Operator Samples | |
ANOVA Operator Sample | Uses an ANOVA Operator to compute the analysis of variance, which is a generalization of the t-test for comparing two or more groups with respect to equality of means. |
Chi-square Test Operator Sample | Uses a Chi-square Test of Independence Operator to compute the chi-square test of independence between two categorical/discrete random variables, along with other relevant summary information such as crosstabulation frequencies and relative frequencies, as well as the Cramer's V statistic. |
Predictive Modeling Sample: Classification Trees | Uses a Classification Trees Operator to build classification tree models. It uses the IRIS Flower data (irisdat.csv). The SEPALLEN, SEPALLWID, PETALLEN, and PETALWID features are selected as predictors. IRISTYPE is selected as the response. |
Correlations Operator Sample | Uses a Correlations Operator to gather tuples over various styles of output types, such as over time or by selected values. The operator creates a matrix (a list of tuples) in which the tuple fields are the columns of the matrix. |
Descriptive Statistics Operator Sample | Uses a Descriptive Statistics Operator to provide basic statistical information for each specified variable, including measures of central tendency (for example, mean) and of dispersion (for example, standard deviation). |
Frequency Tables Operator Sample | Uses a Frequency Tables Operator to compute a contingency table that shows item and combination counts. |
Kolmogorov-Smirnov Two Sample Test | Uses a Kolmogorov-Smirnov Test Operator to compute the two-sample Kolmogorov-Smirnov test. This is the nonparametric analogue of the two-sample t-test. Instead of comparing means between two groups, the test can be used to assess any differences between the two distributions. |
Predictive Modeling Sample: Linear Regression | Uses a Linear Regression Operator to build linear regression models. Ordinary least squares, ridge regression, and lasso regression models are supported. |
Predictive Modeling Sample: Logistic Regression | Uses a Logistic Regression Operator to build binary logistic regression models. |
Predictive Modeling Sample: Multilayer Perceptron Classification | Uses a Multilayer Perceptron Classification Operator to build multilayer perceptron neural networks. It uses the IRIS Flower data (irisdat.csv). The SEPALLEN, SEPALLWID, PETALLEN, and PETALWID features are selected as predictors. IRISTYPE is selected as the response. |
Predictive Modeling Sample: Multilayer Perceptron Regression | Uses a Multilayer Perceptron Regression Operator to build multilayer perceptron neural networks. It uses the Boston Housing 2 data (BostonHousing2.csv). ValueofOccupiedHomes is selected as the response. The remaining fields are selected as predictors. |
Paired T-test Sample | Uses a Paired T-Test Operator to compute the two-sample dependent t-test, which tests the null hypothesis that the population means of two dependent groups, as measured on a single variable, are equal. |
Predictive Modeling Sample: Regression Trees | Uses a Regression Trees Operator to build regression tree models. The operator starts taking data from the feed simulation and emits results after 300 rows have been collected. |
Single Sample T-Test Operator | Uses a Single Sample T-Test Operator to compute the single sample t-test. |
Predictive Modeling Sample: Support Vector Machine Classifier | Uses an SVM Classification Operator to build support vector machine classification models. |
Predictive Modeling Sample: Support Vector Machine Regression | Uses an SVM Regression Operator to build support vector machine regression models. |
Two Sample T-test Sample | Uses a Two Sample T-Test Operator to compute the two-sample independent t-test, which tests the null hypothesis that the population means of two groups, as measured on a single variable, are equal. |
Two Sample T-Test by Groups Operator | Uses a T-Test By Groups Operator to compute the two-sample independent t-test, which tests the null hypothesis that the population means of two groups, as measured on a single variable, are equal. |
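
For readers who want to see what separates the paired and independent t-tests listed in the table, here is a minimal Python sketch of the two test statistics. It is an illustration only: it is not EventFlow code and not the operators' implementation. The function names `paired_t` and `independent_t` are hypothetical, and the pooled-variance form of the independent test is an assumption, since this page does not say which variant the operators use.

```python
import math
import statistics


def paired_t(a, b):
    """t statistic for a paired (dependent) t-test on equal-length samples."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    # The paired test works on the per-observation differences.
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n))


def independent_t(a, b):
    """Pooled-variance t statistic for two independent samples.

    Assumption: equal population variances (Student's t-test).
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    std_err = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# Made-up measurements, for illustration only.
before = [1, 2, 3, 4, 5]
after = [2, 2, 4, 4, 7]
print(paired_t(before, after))
print(independent_t(before, after))
```

In both tests the null hypothesis is that the two population means are equal. A t statistic far from zero is evidence against it. On the same data, the paired statistic is typically larger in magnitude when the two measurements of each subject are positively correlated, because differencing removes between-subject variation.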