XML Normalizer Operator Sample

This sample shows how to use the TIBCO StreamBase® XML Normalizer operator.

About This Guide

The XML Normalizer operator is a global Java operator that parses a designated field containing a string in XML format, and emits one tuple for each top-level element extracted from the XML string field. Each emitted tuple contains a user-defined set of string fields parsed from the input XML string, plus an optional field that reports any XML parsing errors. All fields in the input tuple other than the XML string field are optionally passed through unchanged to each emitted tuple, except input fields of type tuple or list, which are not supported and are emitted as null.
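The behavior described above can be approximated outside StreamBase with a minimal Python sketch. This is not the operator's implementation. The function name normalize, the trades, trade, symbol, and price element names, and the choice to treat each child of the root element as one record are assumptions made for illustration. Only the XMLErrorMessage field name is taken from this sample's output. The handling of tuple and list input fields is not modeled.

```python
import io
import xml.etree.ElementTree as ET

ERROR_FIELD = "XMLErrorMessage"  # field name as shown in this sample's output


def normalize(xml_text, field_names, passthrough=None):
    """Return one dict per child of the root element; all values are strings.

    Hypothetical helper: it mimics the described behavior, not the real operator.
    """
    passthrough = dict(passthrough or {})
    rows = []
    depth = 0
    try:
        # iterparse reports elements incrementally, so records that were
        # completed before a parse error are still emitted.
        for event, elem in ET.iterparse(io.StringIO(xml_text),
                                        events=("start", "end")):
            if event == "start":
                depth += 1
                continue
            depth -= 1
            if depth == 1:  # an element directly under the root just closed
                row = {name: elem.findtext(name) for name in field_names}
                row[ERROR_FIELD] = None
                row.update(passthrough)
                rows.append(row)
    except ET.ParseError as err:
        # Emit one extra row with null fields and the parser's error text.
        row = dict.fromkeys(field_names)
        row[ERROR_FIELD] = str(err)
        row.update(passthrough)
        rows.append(row)
    return rows


# Hypothetical input: one complete record followed by an incomplete one.
sample = ("<trades><trade><symbol>IBM</symbol><price>101.5</price></trade>"
          "<trade><symbol>MSFT</symbol></trades>")
for row in normalize(sample, ["symbol", "price"], {"source": "feed"}):
    print(row)
```

Every extracted value stays a string, and the malformed record produces a row whose extracted fields are None and whose error field carries the parser message.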

See the description of the operator's Properties view in Using the XML Normalizer Operator.

The XMLSimple.sbapp sample EventFlow module illustrates the following aspects of using the XML Normalizer operator:

  • The operator emits all extracted XML fields as strings, including XML fields that hold numeric data. The sample module includes a Map operator, ConvertToNumbers, that converts two extracted numeric fields to StreamBase int and double data types.

  • The operator emits all extracted XML fields as a set of string fields. The sample module includes a second Map operator, ConvertToTuple, that converts the extracted XML fields to a single tuple field for further processing downstream.
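The two conversions in the list above can be pictured with a short Python sketch. In the sample they are performed by Map operators inside the EventFlow module, so this is only an analogy. The function names and the symbol, shares, and price field names are hypothetical. The Trade and XMLErrorMessage names come from the sample's output.

```python
def convert_to_numbers(row, int_fields=(), float_fields=()):
    """Return a copy of row with the named string fields parsed as numbers."""
    out = dict(row)
    for name in int_fields:
        if out.get(name) is not None:  # leave null fields as null
            out[name] = int(out[name])
    for name in float_fields:
        if out.get(name) is not None:
            out[name] = float(out[name])
    return out


def convert_to_tuple(row, xml_fields, tuple_name="Trade"):
    """Group the extracted XML fields under a single nested field."""
    nested = {name: row.get(name) for name in xml_fields}
    rest = {key: value for key, value in row.items() if key not in xml_fields}
    return {tuple_name: nested, **rest}


# Hypothetical row, as a normalizer might emit it: every value is a string.
raw = {"symbol": "IBM", "shares": "100", "price": "101.5",
       "XMLErrorMessage": None}
typed = convert_to_numbers(raw, int_fields=("shares",), float_fields=("price",))
print(convert_to_tuple(typed, ("symbol", "shares", "price")))
```

Null values are skipped during conversion so that an error row, whose extracted fields are all null, passes through without raising an exception.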

Importing This Sample into StreamBase Studio

In StreamBase Studio, import this sample with the following steps:

  • From the top-level menu, click File > Load StreamBase Sample.

  • Type xml to narrow the list of options.

  • Select xml-normalizer from the Data Constructs and Operators category.

  • Click OK.

StreamBase Studio creates a project for the sample.

Running XMLSimple.sbapp in StreamBase Studio

  1. In the Project Explorer, open the sample you just loaded.

  2. Open the src/main/eventflow folder.

  3. Open the package folder. Most samples contain a single package folder; if your sample contains more than one, open the top-level package folder.

  4. Open the XMLSimple.sbapp application file and click the Run button. This opens the SB Test/Debug perspective and starts the application.

    If you see red marks, wait a moment for the project in Studio to load its features.

    If the red marks do not resolve themselves in a moment, right-click the project and select Maven > Update Project from the context menu.

  5. In the Feed Simulations view, select the TradeHist.sbfs feed simulation file and click Run.

  6. In the Output Streams view, observe four tuples emitted. Select each tuple in sequence to see its contents in the Details Pane. Click the arrow on the left of the Trade subtuple to see the values of that tuple field.

    1. The first three tuples show a Trade tuple generated from fields extracted from the XML input field in the feed simulation.

    2. The first three tuples show null for the XMLErrorMessage field, and each has the input tuple's two non-XML fields appended verbatim.

    3. The last emitted tuple shows null for all fields of the Trade subtuple, and shows error text from the XML parser. The last emitted tuple is generated from an incomplete <trade> element in the CSV file read by the feed simulation.

  7. Experiment with edits to the src/main/resources/TradeHist.csv input file. For example, you can eliminate the fourth error tuple by removing the incomplete <trade> element from the end of the XML string field. As an alternative, you can generate an XML error earlier in the sequence by creating a deliberate XML error in the first, second, or third <trade> element in the XML string field. Re-run the feed simulation each time to see the results of your experiments.

  8. When done, press F9 or click the Stop Running Application button.

This Sample's Files

The XML Normalizer sample includes the following files:

XMLSimple.sbapp

Sample EventFlow module to illustrate using the XML Normalizer operator.

XMLSimple.sblayout

The layout file associated with the sample EventFlow module.

TradeHist.xml

An example of trade data formatted as standard, indented XML.

TradeHist.csv

A CSV file with three fields, the first of which is the XML content in TradeHist.xml, but flattened to a single string with all line endings removed. This CSV file has one deliberate error at the end of the XML string field, which demonstrates how the operator handles XML parsing errors. This file serves as input for the TradeHist.sbfs feed simulation.

TradeHist-unflattened.csv

Same contents as TradeHist.csv, but with the first field shown as standard indented XML with embedded line ending characters. You can experiment with loading this version of the CSV file as input for the feed simulation, using a custom file reader you write that removes the line ending characters from the XML string field.

TradeHist.sbfs

Feed simulation file that loads TradeHist.csv as its input file.
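The custom file reader suggested for TradeHist-unflattened.csv could start from a sketch like the following, written in Python for illustration. It assumes the XML field is the first column and is enclosed in double quotes so that a standard CSV parser keeps the embedded line endings inside one field. The actual quoting in the sample file may differ, and the function name is hypothetical.

```python
import csv
import io


def flatten_xml_field(csv_text, xml_column=0):
    """Parse CSV text and strip line endings from the XML column of each record."""
    # newline="" preserves embedded line endings so the csv module can
    # handle quoted multi-line fields itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    records = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        xml_value = record[xml_column]
        # Remove only the line ending characters; indentation is left in place.
        record[xml_column] = xml_value.replace("\r", "").replace("\n", "")
        records.append(record)
    return records


# Hypothetical three-field record with an indented, multi-line XML field.
text = '"<trades>\n  <trade/>\n</trades>",alpha,1\n'
print(flatten_xml_field(text))
```

The indentation that remains between elements is whitespace-only text, which an XML parser reading element content can normally ignore.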

Sample Location

When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.

Important

Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.

Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:

studio-workspace/sample_xml-normalizer

See Default Installation Directories for the default location of studio-workspace on your system.