This sample shows how to use the TIBCO StreamBase® XML Normalizer operator.
The XML Normalizer operator is a global Java operator that parses a designated field containing a string in XML format, and emits one tuple for each top-level element extracted from the XML string field. Each emitted tuple contains a user-defined set of string fields parsed from the input XML string, plus an optional field that reports any XML parsing errors. All fields in the input tuple other than the XML string field are optionally passed through unchanged to each emitted tuple, except input fields of type tuple or list, which are not supported and are emitted as null.
See the description of the operator's Properties view in Using the XML Normalizer Operator.
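The behavior described above can be approximated outside StreamBase. The following Python sketch is not the operator's implementation; it only mimics the described behavior using the standard library's `xml.etree.ElementTree.XMLPullParser`. It assumes that the records are the direct children of a single root element. The `normalize` function, the element names (`trades`, `symbol`, `price`), and the pass-through field names (`batch_id`, `session`) are hypothetical and are not taken from the sample files.

```python
import xml.etree.ElementTree as ET


def normalize(xml_text, field_names, passthrough):
    """Return one dict per child of the root element, plus one error row
    if the XML parser reports a problem (illustrative sketch only)."""
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed(xml_text)  # a parse error is queued here and raised by read_events()
    rows = []
    depth = 0
    try:
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                continue
            depth -= 1
            if depth == 1:  # a direct child of the root element just closed
                values = {name: elem.findtext(name) for name in field_names}
                rows.append({**values, "XMLErrorMessage": None, **passthrough})
        parser.close()  # raises ParseError if the document is truncated
    except ET.ParseError as exc:
        # Error row: every extracted field is null and the parser message is reported.
        nulls = dict.fromkeys(field_names)
        rows.append({**nulls, "XMLErrorMessage": str(exc), **passthrough})
    return rows


# Hypothetical input: one complete record followed by an incomplete one.
sample = (
    "<trades>"
    "<trade><symbol>MSFT</symbol><price>25.48</price></trade>"
    "<trade><symbol>IBM</symbol>"
    "</trades>"
)
for row in normalize(sample, ["symbol", "price"], {"batch_id": 4456, "session": "After Hours"}):
    print(row)
```

Every extracted value stays a string, and the pass-through values are copied unchanged into each row, including the error row.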
The XMLSimple.sbapp sample EventFlow module illustrates the following aspects of using the XML Normalizer operator:
- The operator emits all extracted XML fields as strings, including XML fields that hold numeric data. The sample module includes a Map operator, ConvertToNumbers, that converts two extracted numeric fields to StreamBase int and double data types.
- The operator emits all extracted XML fields as a set of string fields. The sample module includes a second Map operator, ConvertToTuple, that converts the extracted XML fields to a single tuple field for further processing downstream.
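The two conversions described above can be pictured with plain dictionaries. This Python sketch is only an analogy for what the two Map operators do, not their actual expressions; the field names `price` and `quantity` and both function names are hypothetical, and null values are assumed to pass through unconverted.

```python
def convert_to_numbers(row):
    """Return a copy of row with two string fields converted to numbers."""
    out = dict(row)
    if out.get("price") is not None:
        out["price"] = float(out["price"])      # analogous to a double field
    if out.get("quantity") is not None:
        out["quantity"] = int(out["quantity"])  # analogous to an int field
    return out


def convert_to_tuple(row, trade_fields):
    """Group the extracted fields under a single nested 'Trade' entry."""
    trade = {name: row.get(name) for name in trade_fields}
    rest = {key: value for key, value in row.items() if key not in trade_fields}
    return {"Trade": trade, **rest}


flat = {"symbol": "MSFT", "price": "25.48", "quantity": "2000", "XMLErrorMessage": None}
nested = convert_to_tuple(convert_to_numbers(flat), ["symbol", "price", "quantity"])
print(nested)
```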
The XML Normalizer sample includes the following files:
File | Purpose |
---|---|
XMLSimple.sbapp | Sample EventFlow module to illustrate using the XML Normalizer operator. |
XMLSimple.sblayout | The layout file associated with the sample EventFlow module. |
TradeHist.xml | An example of trade data formatted as standard, indented XML. |
TradeHist.csv | A CSV file with three fields, the first of which is the XML content in TradeHist.xml, but flattened to a single string with all line endings removed. This CSV file has one deliberate error at the end of the XML string field, which demonstrates how the operator handles XML parsing errors. This file serves as input for the TradeHist.sbfs feed simulation. |
TradeHist-unflattened.csv | Same contents as TradeHist.csv, but with the first field shown as standard indented XML with embedded line ending characters. You can experiment with loading this version of the CSV file as input for the feed simulation, using a custom file reader you write that removes the line ending characters from the XML string field. |
TradeHist.sbfs | Feed simulation file that loads TradeHist.csv as its input file. |
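As one way to picture the flattening step mentioned for TradeHist-unflattened.csv, the following Python sketch removes line endings from the first field of each CSV record. It is a standalone preprocessing script, not a StreamBase custom file reader, and it assumes the XML field is enclosed in double quotes so that the `csv` module reads the embedded line endings as part of one record. The function name `flatten_first_field` and the sample record are hypothetical.

```python
import csv
import io


def flatten_first_field(src, dst):
    """Copy CSV records from src to dst, removing line endings from field 1."""
    writer = csv.writer(dst, lineterminator="\n")
    for record in csv.reader(src):
        if record:
            # Joining the split lines drops the line endings; indentation is kept.
            record[0] = "".join(record[0].splitlines())
        writer.writerow(record)


# Hypothetical record with a quoted, multi-line XML field.
source = io.StringIO('"<a>\n  <b>1</b>\n</a>",4456,After Hours\n', newline="")
target = io.StringIO()
flatten_first_field(source, target)
print(target.getvalue())
```

With real files, open both with `newline=""` as the `csv` module documentation recommends, and pass the file objects in place of the `StringIO` buffers.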
- In the Package Explorer view, in project sample_xml-normalizer, double-click to open the XMLSimple.sbapp application.
- With the XMLSimple.sbapp application selected and active, click the button. This opens the SB Test/Debug perspective and starts the application.
- In the Feed Simulations view, select the TradeHist.sbfs feed simulation file and click .
- In the Application Output view, observe four tuples emitted. Select each tuple in sequence to see its contents in the Details Pane. Click the arrow on the left of the Trade subtuple to see the values of that tuple field.
  - The first three tuples show a Trade tuple generated from fields extracted from the XML input field in the feed simulation.
  - The first three tuples show null for the XMLErrorMessage field, and each has the input tuple's two non-XML fields appended verbatim.
  - The last emitted tuple shows null for all fields of the Trade subtuple, and shows error text from the XML parser. The last emitted tuple is generated from an incomplete <trade> element in the CSV file read by the feed simulation.
- Experiment with edits to the TradeHist.csv input file. For example, you can eliminate the fourth tuple, which is the error tuple, by removing the incomplete <trade> element from the end of the XML string field. As an alternative, you can generate an XML error earlier in the sequence by creating a deliberate XML error in the first, second, or third <trade> element in the XML string field. Re-run the feed simulation each time to see the results of your experiments.
- When done, press F9 or click the Stop Running Application button.
This section describes how to run the sample in UNIX terminal windows or Windows command prompt windows. On Windows, be sure to use the StreamBase Command Prompt from the Start menu as described in the Test/Debug Guide, not the default command prompt.
- Open three terminal windows on UNIX, or three StreamBase Command Prompts on Windows. In each window, navigate to the directory where the sample is installed, or to your workspace copy of the sample, as described above.
- In window 1, start StreamBase Server with this command:
  sbd XMLSimple.sbapp
- In window 2, start the StreamBase dequeuer. Enter:
  sbc dequeue OutputStream
  No output is displayed at this point, but the dequeuer is prepared to receive output. This window will eventually show all of the application's output.
- In window 3, enqueue data to your application with the following command:
  sbfeedsim TradeHist.sbfs
- In window 2, observe three emitted tuples and one error tuple like the following example:
  "MSFT,25.48,USD,2000,NASDAQ",null,4456,After Hours
  "IBM,164.25,USD,5000,NYSE",null,4456,After Hours
  "DELL,14.26,USD,20000,NASDAQ",null,4456,After Hours
  "null,null,null,null,null","Sax ParsingError: The element type ""trade"" must be terminated by the matching end-tag ""</trade>"".",4456,After Hours
- In window 2, type Ctrl+C to exit the sbc session.
- In window 3, type the following command to terminate the server and dequeuer:
  sbadmin shutdown
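The dequeuer output shown in window 2 uses CSV-style quoting, with the Trade subtuple as one quoted field and embedded double quotes doubled. As a hedged sketch for inspecting such lines in a script, the Python `csv` module can split them. The function name `parse_output_line` is hypothetical, and the code assumes the layout shown in the example output: Trade subtuple, error message, then the pass-through fields, with no commas inside individual Trade values.

```python
import csv


def parse_output_line(line):
    """Split one dequeued line into Trade values, error text, and the rest."""
    trade_text, error_text, *passthrough = next(csv.reader([line]))
    # The literal text "null" marks a null value in the example output.
    trade_values = [None if v == "null" else v for v in trade_text.split(",")]
    error = None if error_text == "null" else error_text
    return trade_values, error, passthrough


good = '"MSFT,25.48,USD,2000,NASDAQ",null,4456,After Hours'
bad = ('"null,null,null,null,null","Sax ParsingError: The element type ""trade"" '
       'must be terminated by the matching end-tag ""</trade>"".",4456,After Hours')
for line in (good, bad):
    print(parse_output_line(line))
```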
In StreamBase Studio, import this sample with the following steps:
- From the top menu, click → .
- Select xml-normalizer from the Data Constructs and Operators category.
- Click OK.
StreamBase Studio creates a single project for the operator samples.
When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.
Important
Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.
Using the workspace copy of the sample avoids the permission problems that can occur when trying to work with the initially installed location of the sample. The default workspace location for this sample is:
studio-workspace/sample_xml-normalizer
See Default Installation Directories for the location of studio-workspace on your system.
In the default TIBCO StreamBase installation, this sample's files are initially installed in:
streambase-install-dir/sample/xml-normalizer
See Default Installation Directories for the default location of streambase-install-dir on your system.