Feed Simulation Custom Reader Sample

This sample demonstrates how to implement a custom file reader to read non-standard, proprietary, or binary files as the source of a stream of input tuples for feed simulations. Your Java file-reading code must extend one of the classes in the com.streambase.sb.feedsim package in the StreamBase Client Library. StreamBase provides a way to use your custom class instead of its internal CSV-reading code in conjunction with the Feed Simulation Editor's Data File option.

Note

To run this sample from the command prompt, or to edit its feed simulations in StreamBase Studio, first configure your environment as described in Essential Prerequisite Setup.

For more information about configuring Feed Simulations to use custom readers, see Feed Simulation with Custom File Reader.

Sample Overview

This sample demonstrates and provides Java source code for two implementations of custom file readers for feed simulations:

  • A custom CSV file reader

  • A custom tuple reader

The custom CSV file reader implementation forces all characters to lowercase if the first character in the file is a pound sign (#), and forces all characters to uppercase otherwise. This sample's implementation class, MyFeedSimCSVPlugin, extends the required FeedSimCSVInputStream class. Because only the read() method needs to be overridden, this implementation works for CSV files with any schema.
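
The case-forcing rule described above can be sketched in plain Java. This is a minimal illustration, not the sample's MyFeedSimCSVPlugin source: the class name CaseForcingInputStream and the convert() helper are hypothetical, and the class extends java.io.FilterInputStream as a stand-in because the constructor signatures of FeedSimCSVInputStream are not shown here. It also assumes that only ASCII letters need converting.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the sample's reader: it extends FilterInputStream,
// not the StreamBase FeedSimCSVInputStream class.
final class CaseForcingInputStream extends FilterInputStream {
    private boolean firstByteSeen = false;
    private boolean forceLower = false;

    CaseForcingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b < 0) {
            return b; // end of stream
        }
        if (!firstByteSeen) {
            // The first byte of the file selects the conversion for the whole stream.
            firstByteSeen = true;
            forceLower = (b == '#');
        }
        // Assumption: only ASCII letters are converted; other bytes pass through.
        if (forceLower && b >= 'A' && b <= 'Z') {
            return b + ('a' - 'A');
        }
        if (!forceLower && b >= 'a' && b <= 'z') {
            return b - ('a' - 'A');
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // FilterInputStream forwards bulk reads straight to the wrapped stream,
        // so route them through read() to convert every byte.
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int b = read();
            if (b < 0) {
                break;
            }
            buf[off + n] = (byte) b;
            n++;
        }
        return n == 0 ? -1 : n;
    }

    // Helper for experiments: pushes a whole string through the stream.
    static String convert(String csv) {
        byte[] input = csv.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new CaseForcingInputStream(new ByteArrayInputStream(input))) {
            int b;
            while ((b = in.read()) >= 0) {
                out.write(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because the conversion works on the byte stream before any CSV parsing happens, the logic never needs to know the field layout, which is why one read() override can serve any schema.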

The custom tuple reader implementation parses the lines of a sample file that contains web server log data stored in the NCSA Common Log Format. The getSchema() method returns a schema that assigns appropriate field names and types to the log data, and each readTuple() call returns the Tuple for the next line of log data.
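
The per-line parsing step can be sketched without the StreamBase classes. The following is an illustration only: LogEntry is a hypothetical holder class standing in for the Tuple that the sample returns, and the regular expression assumes the standard Common Log Format layout (host, identity, user, bracketed timestamp, quoted request, status, size), which may differ in detail from what MyFeedSimTuplePlugin accepts.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for one parsed log line; the real sample returns a StreamBase Tuple.
final class LogEntry {
    final String ipAddress;       // IP_address
    final String userIdentifier;  // user_identifier
    final String userId;          // user_ID
    final OffsetDateTime time;    // time
    final String request;         // request
    final int httpStatus;         // HTTP_status
    final long resultSize;        // result_size

    // Assumed layout: host ident user [timestamp] "request" status size
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    // Example timestamp: 12/Jan/1996:19:37:55 -0500
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    private LogEntry(String ipAddress, String userIdentifier, String userId,
                     OffsetDateTime time, String request, int httpStatus, long resultSize) {
        this.ipAddress = ipAddress;
        this.userIdentifier = userIdentifier;
        this.userId = userId;
        this.time = time;
        this.request = request;
        this.httpStatus = httpStatus;
        this.resultSize = resultSize;
    }

    // Parses one log line, or throws IllegalArgumentException if it does not match.
    static LogEntry parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a Common Log Format line: " + line);
        }
        // A size of "-" means no bytes were returned; treat it as zero here.
        long size = m.group(7).equals("-") ? 0L : Long.parseLong(m.group(7));
        return new LogEntry(m.group(1), m.group(2), m.group(3),
                OffsetDateTime.parse(m.group(4), TIME), m.group(5),
                Integer.parseInt(m.group(6)), size);
    }
}
```

In a tuple reader, logic like parse() would run once per readTuple() call, with the parsed values copied into the fields of the schema that getSchema() declares.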

See Feed Simulation with Custom File Reader for a discussion of the classes in the com.streambase.sb.feedsim package in the StreamBase Client Library.

Essential Prerequisite Setup

This sample's EventFlow applications load and run normally in Studio without special configuration steps. If your goal is only to see the as-shipped samples running, you do not need this section's setup steps.

However, to run this sample's applications from the command prompt, or to use the Feed Simulation Editor's Data File Options dialog in Studio to make changes to one of the feed simulations, you must configure your environment as shown in the steps below.

  1. Start Studio and load this sample as described in Importing This Sample into StreamBase Studio. Studio automatically and silently builds the Java source files in the java-src directory and places the resulting class files in the java-bin directory. (Studio does not display the java-bin directory by default in the Package Explorer view.)

    Of course, you can also build your custom feed simulation reader classes from the command prompt with javac.

  2. Exit Studio and set the environment variable STREAMBASE_FEEDSIM_PLUGIN_CLASSPATH to the path to the java-bin subdirectory of this sample's directory in your Studio workspace. For example, for Windows:

    set STREAMBASE_FEEDSIM_PLUGIN_CLASSPATH=C:\Users\sbuser\Documents\StreamBase Studio n.m Workspace\sample_feedsim-plugin\java-bin
    

    For Bash on OS X:

    export STREAMBASE_FEEDSIM_PLUGIN_CLASSPATH="/Users/sbuser/StreamBase Studio n.m Workspace/sample_feedsim-plugin/java-bin"
    

    For Bash on Linux:

    export STREAMBASE_FEEDSIM_PLUGIN_CLASSPATH="/home/sbuser/StreamBase Studio n.m Workspace/sample_feedsim-plugin/java-bin"
    

    As an alternative, you can configure the Java system property streambase.feedsim.plugin-classpath.

  3. Restart Studio. If you set the environment variable in a terminal window or StreamBase Command Prompt, then start Studio from the same window with the sbstudio command.

This environment variable affects two subsystems:

  • It places the classes that implement your custom reader on the classpath of the JVM that hosts the sbfeedsim command run from the command prompt.

  • In Studio, it adds those classes to the classloader constructed for your custom feed simulation reader plug-in.

Without the above steps, when you open the Data File Options dialog from the Feed Simulation Editor, the dialog opens with a Class not found error in the File preview text area of the dialog.
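
If the Class not found error appears, a small standalone check can help confirm whether the configured location actually contains your reader class. The sketch below is a diagnostic aid under stated assumptions, not part of the sample: the class name PluginClasspathCheck is hypothetical, the choice to let the system property take precedence over the environment variable is an assumption (the precedence is not documented here), and treating the value as a list separated by the platform path separator is also an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: checks whether a class can be loaded from the
// location named by the feed simulation plug-in classpath setting.
final class PluginClasspathCheck {
    static final String ENV_VAR = "STREAMBASE_FEEDSIM_PLUGIN_CLASSPATH";
    static final String SYS_PROP = "streambase.feedsim.plugin-classpath";

    // Assumption: a non-empty system property wins over the environment variable.
    static String resolve(String propValue, String envValue) {
        if (propValue != null && !propValue.isEmpty()) {
            return propValue;
        }
        return envValue;
    }

    // Returns true if className can be loaded from the given classpath entries.
    static boolean canLoad(String classpath, String className) {
        try {
            List<URL> urls = new ArrayList<>();
            for (String entry : classpath.split(File.pathSeparator)) {
                if (!entry.isEmpty()) {
                    urls.add(new File(entry).toURI().toURL());
                }
            }
            try (URLClassLoader loader = new URLClassLoader(urls.toArray(new URL[0]))) {
                loader.loadClass(className);
                return true;
            } catch (ClassNotFoundException e) {
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java PluginClasspathCheck <fully.qualified.ClassName>");
            return;
        }
        String classpath = resolve(System.getProperty(SYS_PROP), System.getenv(ENV_VAR));
        if (classpath == null || classpath.isEmpty()) {
            System.out.println("Neither " + SYS_PROP + " nor " + ENV_VAR + " is set");
            return;
        }
        String verdict = canLoad(classpath, args[0]) ? " was found on " : " was NOT found on ";
        System.out.println(args[0] + verdict + classpath);
    }
}
```

Run it from the same terminal window in which you set the environment variable, passing the fully qualified name of your reader class as the only argument.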

This Sample's Files

This sample includes the following files that demonstrate a custom CSV file reader plug-in.

File                      Description
MyFeedSimCSVPlugin.sbapp  A simple, pass-through EventFlow application that demonstrates the custom CSV file reader plug-in.
feedSimForCsvPlugin.sbfs  A feed simulation file set up to read timestamp-simple.csv as its input data source, using the custom file reader MyFeedSimCSVPlugin.
timestamp-simple.csv      A simple three-field CSV file of stock symbol, price, and timestamp data.
MyFeedSimCSVPlugin.java   In the java-src directory, the Java source file for the MyFeedSimCSVPlugin class.

This sample includes the following files that demonstrate a custom tuple reader plug-in.

File                        Description
MyFeedSimTuplePlugin.sbapp  A simple, pass-through EventFlow application that demonstrates the custom tuple reader plug-in.
feedSimForTuplePlugin.sbfs  A feed simulation file set up to read log-data.txt as its input data source, using the custom file reader MyFeedSimTuplePlugin.
log-data.txt                A text file containing a few lines of sample web server log data.
MyFeedSimTuplePlugin.java   In the java-src directory, the Java source file for the MyFeedSimTuplePlugin class.

Running This Sample in StreamBase Studio

The steps to run this sample in Studio are as follows:

  1. Optional for Studio. Configure the classpath with a special environment variable as described in Essential Prerequisite Setup.

  2. In the Package Explorer view, double-click to open the MyFeedSimCSVPlugin.sbapp module.

  3. Make sure the module is the currently active tab in the EventFlow Editor, then click the Run button. This opens the SB Test/Debug perspective and starts the application.

  4. The Feed Simulation adapter in this module is configured to be self-running. Thus, soon after the module starts, look in the Application Output view for data emitted on the CsvPluginOutputStream similar to the following:

    symbol=IBM, price=89.57, date=2005-04-01 16:00:02.000-0500
    symbol=IBM, price=89.0, date=2005-04-01 16:00:03.000-0500
    symbol=IBM, price=88.44, date=2005-04-01 16:00:04.000-0500
    symbol=IBM, price=36.0, date=2005-04-01 16:00:05.000-0500
    symbol=NYT, price=35.8, date=2005-04-01 16:00:15.000-0500
    symbol=NYT, price=35.77, date=2005-04-01 16:00:20.000-0500
    symbol=NYT, price=35.78, date=2005-04-01 16:00:21.000-0500
    symbol=DELL, price=38.03, date=2005-04-01 16:00:22.000-0500
    

    Notice that the stock symbols are all uppercase as sent to the StreamBase application. In the CSV file, the symbols are mixed case.

  5. To re-read the same data file, in the Manual Input view, select the FeedSimControl stream, enter the string start in the command field, and click Send Data.

  6. When done, press F9 or click the Stop Running Application button.

  7. Next, open the second module in this sample, MyFeedSimTuplePlugin.sbapp.

  8. Make sure the new module is the currently active tab in the EventFlow Editor, then click the Run button to start the module.

  9. The second sample is also self-running. In the Application Output view, look for data emitted on the TuplePluginOutputStream similar to the following:

    IP_address=127.0.0.1, user_identifier=sampleUserIdentifier, user_ID=john, 
      time=1996-01-12 19:37:55.000-0500, request=GET index.htm HTTP/1.0, 
      HTTP_status=200, result_size=215
    IP_address=987.65.43.21, user_identifier=-, user_ID=fred, 
      time=1996-01-12 19:37:56.000-0500, request=GET products.htm HTTP/1.0, 
      HTTP_status=200, result_size=215
    IP_address=987.65.43.21, user_identifier=-, user_ID=susan, 
      time=1996-01-12 19:37:57.000-0500, request=GET sales.htm HTTP/1.0, 
      HTTP_status=200, result_size=215
    IP_address=123.45.67.89, user_identifier=sampleUserIdentifier, user_ID=anna, 
      time=1996-01-12 19:37:58.000-0500, request=GET /images/log.gif HTTP/1.0, 
      HTTP_status=200, result_size=215
    IP_address=127.0.0.1, user_identifier=-, user_ID=-, 
      time=1996-01-12 19:37:59.000-0500, request=GET /buttons/form.gif HTTP/1.0, 
      HTTP_status=200, result_size=215
    
  10. To re-read the same data file, in the Manual Input view, select the FeedSimControl stream, enter the string start in the command field, and click Send Data.

  11. When done, press F9 or click the Stop Running Application button.

Running This Sample in Terminal Windows

This section describes how to run this sample in UNIX terminal windows or Windows command prompt windows. On Windows, be sure to use the StreamBase Command Prompt from the Start menu as described in the Test/Debug Guide, not the default command prompt.

  1. Make sure the environment is configured as described in Essential Prerequisite Setup.

  2. Open three terminal windows on UNIX, or three StreamBase Command Prompts on Windows. In each window, navigate to your workspace copy of the sample.

  3. In window 1, start StreamBase Server running the sample application.

    sbd MyFeedSimCSVPlugin.sbapp

  4. In window 2, type:

    sbc dequeue CsvPluginOutputStream

    This window is to display tuples dequeued from the application's primary output port.

  5. In window 3, type:

    sbfeedsim feedSimForCsvPlugin.sbfs

  6. In window 2, look for tuples like the following:

    IBM,89.57,2005-04-01 16:00:02.000-0500
    IBM,89,2005-04-01 16:00:03.000-0500
    IBM,88.44,2005-04-01 16:00:04.000-0500
    IBM,36,2005-04-01 16:00:05.000-0500
    NYT,35.8,2005-04-01 16:00:15.000-0500
    NYT,35.77,2005-04-01 16:00:20.000-0500
    NYT,35.78,2005-04-01 16:00:21.000-0500
    DELL,38.03,2005-04-01 16:00:22.000-0500
    
  7. In window 3, re-run the sbfeedsim command as often as you want, or manually enqueue tuples to the CsvPluginInputStream input stream.

  8. To exit the first sample, in window 3, type the following command to shut down the server and close the dequeuer session:

    sbadmin shutdown

  9. To run the other sample, start over in window 1:

    sbd MyFeedSimTuplePlugin.sbapp

  10. In window 2, type:

    sbc dequeue TuplePluginOutputStream

    This window is to display tuples dequeued from the application's primary output port.

  11. In window 3, type:

    sbfeedsim feedSimForTuplePlugin.sbfs

  12. In window 2, look for tuples like the following:

    127.0.0.1,sampleUserIdentifier,john,1996-01-12 19:37:55.000-0500,
      GET index.htm HTTP/1.0,200,215
    987.65.43.21,-,fred,1996-01-12 19:37:56.000-0500,GET products.htm 
      HTTP/1.0,200,215
    987.65.43.21,-,susan,1996-01-12 19:37:57.000-0500,GET sales.htm 
      HTTP/1.0,200,215
    123.45.67.89,sampleUserIdentifier,anna,1996-01-12 19:37:58.000-0500,
      GET /images/log.gif HTTP/1.0,200,215
    127.0.0.1,-,-,1996-01-12 19:37:59.000-0500,GET /buttons/form.gif HTTP/1.0,200,215
    
  13. In window 3, re-run the sbfeedsim command as often as you want, or manually enqueue tuples to the TuplePluginInputStream input stream.

  14. To exit the second sample, in window 3, type the following command to shut down the server and close the dequeuer session:

    sbadmin shutdown

Importing This Sample into StreamBase Studio

In StreamBase Studio, import this sample with the following steps:

  • From the top menu, select File → Load StreamBase Sample.

  • Type feedsim in the Search field to narrow the list of samples.

  • Select the Feed Simulation Custom Reader sample.

  • Click OK.

StreamBase Studio creates a project for the sample in your current Studio workspace.

Sample Location

When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.

Important

Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.

Using the workspace copy of the sample avoids the permission problems that can occur when trying to work with the initially installed location of the sample. The default workspace location for this sample is:

studio-workspace/sample_feedsim-plugin

See Default Installation Directories for the location of studio-workspace on your system.

In the default TIBCO StreamBase installation, this sample's files are initially installed in:

streambase-install-dir/sample/feedsim-plugin

See Default Installation Directories for the location of streambase-install-dir on your system. This location may require administrator privileges for write access, depending on your platform.