This sample demonstrates the use of the Spotfire Streaming Two Sample KS Test operator. The operator tests whether two independent samples come from the same distribution. The test is sensitive to any difference between the two distributions, not only a difference in means as in the two-sample t-test. The Kolmogorov-Smirnov (KS) test is nonparametric: it makes no assumptions about the underlying distribution, such as normality.
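To make the idea concrete, the following minimal Python sketch computes the two-sample KS statistic, the largest vertical gap between the two empirical cumulative distribution functions. It is an illustration only and is not the operator's implementation; the function name ks_statistic and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return D = max |F_a(x) - F_b(x)| over all observed values x.

    Illustrative helper (hypothetical name), not part of the sample project.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for x in a + b:
        # Empirical CDF at x: fraction of the sample that is <= x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Both samples have mean 0, so a comparison of means sees no difference,
# but the spreads differ and the KS statistic is clearly non-zero.
narrow = [-1, -0.5, 0, 0.5, 1]
wide = [-10, -5, 0, 5, 10]
print(ks_statistic(narrow, narrow))
print(ks_statistic(narrow, wide))
```

A statistic near 0 means the two empirical distributions almost coincide; values closer to 1 indicate they are far apart. Turning the statistic into a p-value is a separate step that this sketch does not cover.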
The provided StreamBase module uses a randomly generated data set with two columns: X (the response) and Y (the group code, A for Group 1 and B for Group 2). X is drawn from a uniform distribution ranging from 1 to 100. Y takes the value A or B with equal weight. The data set is fed into the Matrix operator, which collects rows and emits them in batches of 100. The Kolmogorov Two Sample Test operator takes the collected data and options (from proposed schema) as inputs.
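The data flow described above can be approximated outside StreamBase with a short Python sketch. It rests on several assumptions that the sample does not confirm: X is treated as a real number (the feed simulation may generate integers), incomplete batches are discarded, and the names generate_rows, batches, and split_groups are hypothetical.

```python
import random


def generate_rows(n, seed=None):
    """Yield n (X, Y) rows: X uniform on [1, 100], Y is 'A' or 'B' with equal weight."""
    rng = random.Random(seed)
    for _ in range(n):
        yield (rng.uniform(1, 100), rng.choice("AB"))


def batches(rows, size=100):
    """Collect rows and emit a list each time `size` rows have accumulated.

    Assumption: a trailing partial batch is dropped.
    """
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []


def split_groups(batch):
    """Split one batch into the X values of group A and of group B."""
    group_a = [x for x, y in batch if y == "A"]
    group_b = [x for x, y in batch if y == "B"]
    return group_a, group_b


# 250 rows produce two complete batches of 100; the last 50 rows are not emitted.
for i, batch in enumerate(batches(generate_rows(250, seed=1))):
    a, b = split_groups(batch)
    print(f"batch {i}: {len(a)} rows in A, {len(b)} rows in B")
```

Each pair of lists returned by split_groups corresponds to the two samples that a two-sample KS test would compare for one batch.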
In StreamBase Studio, import this sample with the following steps:
- From the top-level menu, click > .
- In the search field, type kstest to narrow the list of options.
- Select Kolmogorov-Smirnov test from the Streaming Datascience Operators category.
- Click .
StreamBase Studio creates a single project containing the sample files.
- In the Project Explorer view, expand the sample_datascience_kstest project and double-click KSTest.sbapp to open the application. Make sure the application is the currently active tab in the EventFlow Editor.
- Click the Run button. This opens the SB Test/Debug perspective and starts the application.
- Click the Feed Simulations tab, select KSTest.sbfs, then click the Run button to start feeding the data.
- The Kolmogorov Two Sample Test operator starts taking data from the feed simulation and emits results each time 100 rows have been collected.
- When done, press F9 or click the Stop Running Application button.