This sample shows how to use the TIBCO StreamBase® XML Normalizer operator.
The XML Normalizer operator is a global Java operator that parses a designated field containing a string in XML format, and emits one tuple for each top-level element extracted from the XML string field. Each emitted tuple contains a user-defined set of string fields parsed from the input XML string, plus an optional field that reports any XML parsing errors. All fields in the input tuple other than the XML string field are optionally passed through unchanged to each emitted tuple, except input fields of type tuple or list, which are not supported and are emitted as null.
See the description of the operator's Properties view in Using the XML Normalizer Operator.
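To make the behavior described above concrete, the following Python sketch mimics it outside StreamBase. It is not the operator's implementation. It assumes that "top-level elements" are the direct children of the document root, and it uses Python dictionaries in place of tuples. The function name `normalize` and the field names `xml`, `symbol`, `price`, `source`, and `legs` are hypothetical; only the `XMLErrorMessage` field name is taken from this sample.

```python
import xml.etree.ElementTree as ET


def normalize(record, xml_field, element_fields, error_field="XMLErrorMessage"):
    """Yield one dict per top-level element found in record[xml_field].

    Assumption: "top-level" means a direct child of the document root.
    """
    # Pass through every other input field. Nested values stand in for the
    # unsupported tuple and list types and are emitted as None (null).
    passthrough = {
        name: None if isinstance(value, (list, tuple, dict)) else value
        for name, value in record.items()
        if name != xml_field
    }
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    try:
        parser.feed(record[xml_field])
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                continue
            depth -= 1
            if depth == 1:  # a direct child of the root has just closed
                row = {name: elem.findtext(name) for name in element_fields}
                row[error_field] = None
                row.update(passthrough)
                yield row
        parser.close()  # raises ParseError if the document is incomplete
    except ET.ParseError as exc:
        # Report the parser's message in a row whose XML fields are all null.
        row = {name: None for name in element_fields}
        row[error_field] = str(exc)
        row.update(passthrough)
        yield row


# Invented input: one complete <trade> element followed by a truncated one.
record = {
    "xml": "<trades><trade><symbol>IBM</symbol><price>101.5</price></trade>"
           "<trade><symbol>ABC",
    "source": "NYSE",
    "legs": [1, 2],
}
for row in normalize(record, "xml", ["symbol", "price"]):
    print(row)
```

In this sketch every extracted value is a string, and a truncated document still yields the rows that were complete before the error, followed by one row that carries the parser's error text.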
The XMLSimple.sbapp sample EventFlow module illustrates the following aspects of using the XML Normalizer operator:
- The operator emits all extracted XML fields as strings, including XML fields that hold numeric data. The sample module includes a Map operator, ConvertToNumbers, that converts two extracted numeric fields to StreamBase int and double data types.
- The operator emits all extracted XML fields as a set of string fields. The sample module includes a second Map operator, ConvertToTuple, that converts the extracted XML fields to a single tuple field for further processing downstream.
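The two Map operators amount to simple per-tuple transformations. The Python sketch below illustrates the idea only; it is not written in the EventFlow expression language, and it uses dictionaries in place of tuples. The field names `symbol`, `shares`, `price`, and `source` are hypothetical stand-ins for the sample's fields, and chaining the two conversions is a choice made for this sketch. The `Trade` name matches the tuple field that the sample emits.

```python
def convert_to_numbers(row, int_field, double_field):
    """Return a copy of row with two string fields converted to numbers."""
    out = dict(row)
    # In this sketch a null (None) input simply stays null.
    if row[int_field] is not None:
        out[int_field] = int(row[int_field])
    if row[double_field] is not None:
        out[double_field] = float(row[double_field])
    return out


def convert_to_tuple(row, xml_fields, tuple_name="Trade"):
    """Return a copy of row with the XML fields grouped into one nested field."""
    out = {name: value for name, value in row.items() if name not in xml_fields}
    out[tuple_name] = {name: row[name] for name in xml_fields}
    return out


# Invented example row, as it might look after XML extraction (all strings).
extracted = {"symbol": "IBM", "shares": "200", "price": "101.5", "source": "NYSE"}
numeric = convert_to_numbers(extracted, "shares", "price")
nested = convert_to_tuple(numeric, ["symbol", "shares", "price"])
print(nested)
```

Both helpers return new dictionaries rather than modifying their input, which keeps each step easy to inspect on its own.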
In StreamBase Studio, import this sample with the following steps:
- From the top-level menu, click → .
- Type xml to narrow the list of options.
- Select xml-normalizer from the Data Constructs and Operators category.
- Click .
StreamBase Studio creates a project for the sample.
- In the Project Explorer, open the sample you just loaded.
- Open the src/main/eventflow folder.
- Open the package folder. Most samples contain a single package folder; if your sample contains more than one, open the top-level package folder.
- Open the named application file and click the Run button. This opens the SB Test/Debug perspective and starts the application.
  If you see red marks, wait a moment for the project in Studio to load its features.
  If the red marks do not resolve themselves in a moment, select the project and right-click → from the context menu.
- In the Feed Simulations view, select the TradeHist.sbfs feed simulation file and click .
- In the Output Streams view, observe four tuples emitted. Select each tuple in sequence to see its contents in the Details Pane. Click the arrow on the left of the Trade subtuple to see the values of that tuple field.
  - The first three tuples show a Trade tuple generated from fields extracted from the XML input field in the feed simulation.
  - The first three tuples show null for the XMLErrorMessage field, and each has the input tuple's two non-XML fields appended verbatim.
  - The last emitted tuple shows null for all fields of the Trade subtuple, and shows error text from the XML parser. The last emitted tuple is generated from an incomplete <trade> element in the CSV file read by the feed simulation.
- Experiment with edits to the src/main/resources/TradeHist.csv input file. For example, you can eliminate the fourth tuple, which reports the error, by removing the incomplete <trade> element from the end of the XML string field. As an alternative, you can generate an XML error earlier in the sequence by creating a deliberate XML error in the first, second, or third <trade> element in the XML string field. Re-run the feed simulation each time to see the results of your experiments.
- When done, press F9 or click the Stop Running Application button.
The XML Normalizer sample includes the following files:
- XMLSimple.sbapp
  Sample EventFlow module to illustrate using the XML Normalizer operator.
- XMLSimple.sblayout
  The layout file associated with the sample EventFlow module.
- TradeHist.xml
  An example of trade data formatted as standard, indented XML.
- TradeHist.csv
  A CSV file with three fields, the first of which is the XML content in TradeHist.xml, but flattened to a single string with all line endings removed. This CSV file has one deliberate error at the end of the XML string field, which demonstrates how the operator handles XML parsing errors. This file serves as input for the TradeHist.sbfs feed simulation.
- TradeHist-unflattened.csv
  Same contents as TradeHist.csv, but with the first field shown as standard indented XML with embedded line ending characters. You can experiment with loading this version of the CSV file as input for the feed simulation, using a custom file reader you write that removes the line ending characters from the XML string field.
- TradeHist.sbfs
  Feed simulation file that loads TradeHist.csv as its input file.
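A custom reader for TradeHist-unflattened.csv needs to do little more than strip the line endings from the XML field. The Python sketch below shows one possible approach, written outside StreamBase. It assumes that the multi-line XML field is enclosed in double quotes, so that a standard CSV parser treats it as a single field; check the actual file before relying on this. The function name `flatten_xml_field` and the sample row are invented for illustration.

```python
import csv
import io


def flatten_xml_field(lines, xml_column=0):
    """Yield CSV rows with line endings removed from the XML column.

    Assumption: the XML field is quoted, so csv.reader keeps its embedded
    line endings inside a single field instead of splitting the row.
    """
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines between records
        row[xml_column] = row[xml_column].replace("\r", "").replace("\n", "")
        yield row


# Invented sample: one record whose first field is indented, multi-line XML.
unflattened = (
    '"<trades>\n'
    '  <trade>\n'
    '    <symbol>IBM</symbol>\n'
    '  </trade>\n'
    '</trades>",NYSE,2020\n'
)
flattened = list(flatten_xml_field(io.StringIO(unflattened)))
print(flattened[0][0])
```

Only the line endings are removed here. The indentation spaces between elements remain, which an XML parser normally treats as insignificant whitespace between tags.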
When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.
Important
Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.
Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:
studio-workspace/sample_xml-normalizer
See Default Installation Directories for the default location of studio-workspace on your system.