This sample demonstrates how to use a capture field in the schema of a Query Table to make that table reusable in different copies of its containing module. Each instance of the table holds key-value data with different value data types for different instances.
This sample also includes a Java operator (defined in CaptureFieldsAwareOperator.java) that demonstrates the use of Operator API methods that affect how operators handle data from streams that include capture fields.
The schema for a Query Table that holds key-value pairs would normally have two fields: a key field and a data field of a particular data type. The schema for such a table might be, for example, {key long, data string}. Before capture fields, to store a key-value pair with a different data type, you would need a separate Query Table with a different schema, such as {key long, data double}.
The Query Table in this sample instead uses a capture field for the data field. Capture fields are only active in the context of a module, so the Query Table, GenericDataStore, is placed in the module GenericDataStore.sbapp. Notice that its schema is {id long, data capture}. This capture field's field name is data, while its defined data type is dataFields. This schema means: expect the first field to be named id and to have data type long; then expect any number of fields with different types thereafter.
The inner module is completed with a Query operator to insert values, another to report the count of rows accumulated in the table so far, and one to read all rows so you can confirm its contents. It also contains two instances of the CaptureFieldsAwareOperator Java operator. See the Java comments in the source file for details on that operator.
The inner module, GenericDataStore.sbapp, is referenced twice in the outer module, TopLevel.sbapp. In the first reference, an input stream, Points, with schema {id long, x int, y int} feeds into the inner module's DataIn stream. The first field, id, matches the Query Table's requirement for a first field of type long named id. The x and y fields are captured by the Query Table's capture field, which adapts the first instance of the GenericDataStore Query Table to have the same schema as the Points input stream.
In the second reference, the input stream, Names, with schema {id long, name string} feeds into another instance of the inner module's DataIn stream. In this case, the Query Table in the second instance of the inner module adapts its schema to match the Names input stream.
Reusing the inner module does not mean there is a single Query Table that changes its schema. It means there are two instances of the Query Table, one per module reference, with different schemas that match the two input streams. Notice that the same abstract Query Table schema definition is used without change in both module references, yet each instance of the Query Table ends up with a different concrete schema at runtime.
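The way one abstract schema yields two concrete schemas can be pictured with a small plain-Java model. This is only an illustrative sketch and does not use the StreamBase API: the class CaptureSchemaSketch and its methods schema and capturedFields are hypothetical names, and schemas are modeled as ordered maps from field name to type name. It assumes, as described above, that the fixed id field must come first and that every remaining field is absorbed by the capture field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these names come from the StreamBase API.
public class CaptureSchemaSketch {

    // Hypothetical helper: builds an ordered schema from name/type pairs.
    public static Map<String, String> schema(String... nameTypePairs) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            fields.put(nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return fields;
    }

    // Checks that the input schema starts with the fixed fields, then returns
    // the remaining fields, which this model treats as the captured fields.
    public static Map<String, String> capturedFields(Map<String, String> fixed,
                                                     Map<String, String> input) {
        List<Map.Entry<String, String>> f = new ArrayList<>(fixed.entrySet());
        List<Map.Entry<String, String>> in = new ArrayList<>(input.entrySet());
        if (in.size() < f.size()) {
            throw new IllegalArgumentException("input schema is missing required fields");
        }
        for (int i = 0; i < f.size(); i++) {
            if (!f.get(i).equals(in.get(i))) {
                throw new IllegalArgumentException(
                    "expected " + f.get(i) + " but found " + in.get(i));
            }
        }
        Map<String, String> captured = new LinkedHashMap<>();
        for (int i = f.size(); i < in.size(); i++) {
            captured.put(in.get(i).getKey(), in.get(i).getValue());
        }
        return captured;
    }

    public static void main(String[] args) {
        Map<String, String> fixed = schema("id", "long");
        // Points reference: the capture field absorbs x and y.
        System.out.println(capturedFields(fixed, schema("id", "long", "x", "int", "y", "int")));
        // Names reference: the capture field absorbs name.
        System.out.println(capturedFields(fixed, schema("id", "long", "name", "string")));
    }
}
```

In this model the same fixed part, {id long}, is checked against both input schemas, and only the captured remainder differs between the Points and Names references.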
A simple feed simulation is provided that feeds generated values to all four input streams as follows:
Input Stream | Feed Simulation Action |
---|---|
Names | 100 tuples of generated id, string value pairs at 10 tuples per second. |
Points | 50 tuples of generated id, x, y values at 10 tuples per second. |
CountNamesRows | One no-fields tuple every second, which triggers a count of all rows in the Names instance of the Query Table. |
CountPointsRows | One no-fields tuple every second, which triggers a count of all rows in the Points instance of the Query Table. |
The result of running the feed simulation is that the two instances of the inner module's Query Table are populated with generated values, each of the appropriate data type. Meanwhile, once per second, Studio's Output Streams view reports the number of rows accumulated so far in each of the two tables. You can send an empty tuple at any time to the ReadTable stream for each module reference, and see the contents of the table at that time on the respective TableContents output stream.
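As a rough guide to the counts you can expect to see, the following sketch computes the number of rows each table would hold after a given number of seconds. It assumes both tables start empty, that the feed simulation delivers tuples at exactly the stated rates, and that it runs only once. The class FeedCountSketch and the method expectedRows are hypothetical names used for illustration; actual reported counts can differ slightly depending on when each count trigger arrives relative to the inserts.

```java
// Illustrative arithmetic only; not part of the sample's source files.
public class FeedCountSketch {

    // Rows expected after `seconds` of a constant-rate feed that stops
    // once `total` tuples have been sent.
    public static long expectedRows(long total, long perSecond, long seconds) {
        return Math.min(total, perSecond * Math.max(0, seconds));
    }

    public static void main(String[] args) {
        for (long t = 0; t <= 10; t += 5) {
            System.out.println("t=" + t + "s"
                + " Names=" + expectedRows(100, 10, t)
                + " Points=" + expectedRows(50, 10, t));
        }
    }
}
```

Under these assumptions, the Points table stops growing at 50 rows after about 5 seconds, and the Names table stops at 100 rows after about 10 seconds.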
While the application is running, watch the console output from the Java operator (which emits on its logger at INFO level) to see runtime output demonstrating the Operator API features for capture fields.
In StreamBase Studio, import this sample with the following steps:
- From the top-level menu, select > .
- Enter capture field to narrow the list of options.
- Select Capture fields with Query Table from the Data Constructs and Operators category.
- Click . StreamBase Studio creates a single project containing the sample files.
- In the Project Explorer view, open this sample's folder. Keep an eye on the bottom right status bar of the Studio window. Make sure any Updating, Downloading, Building, or Rebuild project messages finish before you proceed.
- Open the src/main/eventflow/packageName folder.
- Double-click to open the TopLevel.sbapp module. Make sure the module is the currently active tab in the EventFlow Editor.
- Click the Run button. This opens the SB Test/Debug perspective and starts the module.
- Wait for the Waiting for fragment to initialize message to clear.
- Select the Feed Simulations view, select the TopLevel.sbfs feed simulation, and click .
- View the results in the Output Streams view.
- In the Console view, for each tuple enqueued by the feed simulation, the two Java operators in the GenericDataStore.sbapp module emit two messages (for a total of four messages per enqueued tuple). The messages show the different effects of using the FLATTEN and NEST strategies for Java operators accessing streams with capture fields.
- You can rerun the feed simulation to continue adding values to the sample's Query Table.
- When done, press F9 or click the Terminate EventFlow Fragment button.
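The FLATTEN and NEST strategies mentioned in the Console step can be pictured with a small plain-Java model. This sketch does not call the Operator API. It assumes that FLATTEN presents the captured fields at the top level of the tuple, and that NEST presents them as a single nested tuple field named after the capture field. The class CaptureViewSketch and its methods are hypothetical; the Java comments in CaptureFieldsAwareOperator.java remain the authoritative description of the operator's behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only; tuples are modeled as ordered maps, not StreamBase tuples.
public class CaptureViewSketch {

    // Assumed FLATTEN view: captured fields appear beside id at the top level.
    public static Map<String, Object> flatten(long id, Map<String, Object> captured) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", id);
        tuple.putAll(captured);
        return tuple;
    }

    // Assumed NEST view: captured fields appear as one nested field
    // named after the capture field.
    public static Map<String, Object> nest(long id, String captureName,
                                           Map<String, Object> captured) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", id);
        tuple.put(captureName, new LinkedHashMap<>(captured));
        return tuple;
    }

    public static void main(String[] args) {
        // A Points-style tuple: the capture field holds x and y.
        Map<String, Object> captured = new LinkedHashMap<>();
        captured.put("x", 3);
        captured.put("y", 4);
        System.out.println(flatten(1L, captured));      // {id=1, x=3, y=4}
        System.out.println(nest(1L, "data", captured)); // {id=1, data={x=3, y=4}}
    }
}
```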
When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.
Important
Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.
Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:
studio-workspace/sample_CaptureGenericDataStore
See Default Installation Directories for the default location of studio-workspace on your system.