Database Change Data Capture Input Adapter

Introduction

The TIBCO StreamBase® Database Change Data Capture Input adapter allows a StreamBase application to monitor changes to an Oracle or SQL Server database (2005,2008,2010,2012,2014,2016).

The adapter monitors the database's transaction log to detect when records are inserted, updated, or deleted in a database table. Each captured change results in a tuple emitted on the adapter's Capture output port. The tuple contains a set of adapter-supplied metadata indicating the type of operation (insert, update, or delete), the transaction log sequence number, and a bit mask indicating which table fields changed, along with the changed fields themselves. The application author expresses interest in a particular table column by adding a corresponding field in the adapter's Edit Schema tab.

The adapter is configured through a collection of properties set in the adapter's Properties view within StreamBase Studio. Properties specify, among other things, the database host name or IP address and TCP port number, the database username and password, and the table for which activity is being captured.

Setting Properties

This section describes the properties you can set for this adapter, using the Adapter Properties tab of the Properties view in StreamBase Studio.

General Tab

Name: Use this required field to specify or change the name of this instance of this component, which must be unique in the current EventFlow module. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.

Adapter: A read-only field that shows the formal name of the adapter.

Class name: Shows the fully qualified class name that implements the functionality of this adapter. If you need to reference this class name elsewhere in your application, you can right-click this field and select Copy from the context menu to place the full class name in the system clipboard.

Start options: This field provides a link to the Cluster Aware tab, where you configure the conditions under which this adapter starts.

Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.

Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow Editor canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Adapter Properties Tab

This section describes the properties on the Adapter Properties tab in the Properties view for the Change Data Capture adapter.

Property Description
Database Type The type of database: Oracle or SQL Server.
JDBC URL Entry Type Controls the type of URL connection information. Select the URL type from the drop down box to enable different properties:
  • Host, Port, Database Name, Username, and Password are only enabled if Individual Components type is selected.

  • JDBC URL, Username, and Password are only enabled if JDBC URL type is selected.

  • JDBC Data Source is only enabled if JDBC Data Source (HOCON) type is selected.

Host The host name or IP address of the machine on which the database is running.
Port The TCP port number on which the database is listening.
Database Name The name of the database.
JDBC URL

Optionally the full JDBC connection URL which overrides the above connection information.

Example MSSQL: jdbc:sqlserver://{host}:{port};databaseName={databaseName}

Example Oracle: jdbc:oracle:thin:{host}:{port}:{databaseName}

Note

The CDC adapter does not use a HOCON configuration file to set the JDBC URL value. The URL value for the CDC adapter should be:

jdbc:sqlserver://MyHost\MyNamedInstance;databaseName=myDB
Username The username to use in connecting to the database.
Password The password to use in connecting to the database.
JDBC Data Source List of available JDBC data sources defined in HOCON.
Table Name The name of the database table for which changes are being captured. For Oracle, this is the name of the capture table.
Capture Instance Name The capture instance name of the for which changes are being captured. This value is only valid for SQL Server; if left blank the value will be generated using {tablename}_capture_instance.
Perform Enable CDC Calls If enabled the adapter will try to enable CDC on the given database and table. You can disable this option if you have already manually enabled CDC on the given database and table.
Start SQL Server Agent Service If enabled the adapter will try to also start the SQL server Agent Service. If you are connecting to a SQL Server database from a non-windows machine, or a machine without SQL Server installed then disable this option.
Table Name The name of the database table for which changes are being captured. For Oracle, this is the name of the capture table.
Log Level Controls the level of verbosity the adapter uses to send notifications to the console. This setting can be higher than the containing application's log level. If set lower, the system log level is used. Available values, in increasing order of verbosity, are: OFF, ERROR, WARN, INFO, DEBUG, TRACE.

Edit Schema Tab

Use the Edit Schema tab to express interest in one or more columns of the target table. Each field present in this tab must have a column with an identical name and compatible type in the target table. Fields present in the Edit Schema tab, along with several additional metadata fields supplied by the adapter, appear in the schema of the adapter's Capture output port.

For general instructions on using the Edit Schema tab, see Defining Input Streams.

Cluster Aware Tab

Use the settings in this tab to allow this operator or adapter to start and stop based on conditions that occur at runtime in a cluster with more than one node. During initial development of the fragment that contains this operator or adapter, and for maximum compatibility with TIBCO Streaming releases before 10.5.0, leave the Cluster start policy control in its default setting, Start with module.

Cluster awareness is an advanced topic that requires an understanding of StreamBase Runtime architecture features, including clusters, quorums, availability zones, and partitions. See Cluster Awareness Tab Settings on the Using Cluster Awareness page for instructions on configuring this tab.

Concurrency Tab

Use the Concurrency tab to specify parallel regions for this instance of this component, or multiplicity options, or both. The Concurrency tab settings are described in Concurrency Options, and dispatch styles are described in Dispatch Styles.

Caution

Concurrency settings are not suitable for every application, and using these settings requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Using the Adapter in a StreamBase Application

This section demonstrates how to use the Change Data Capture input adapter within a StreamBase application and describes the use of the adapter's ports. As shown in the EventFlow diagram below (from this adapter's sample application), the adapter uses two output ports to communicate with the surrounding application. As with other StreamBase adapters and operators, you can optionally enable an Error Output Port, as described in Using Error Ports and Error Streams.

Note

In this adapter's sample application, the adapter's output ports are connected directly to externally-visible output streams. However, in more complex applications these ports will typically be connected to internal StreamBase operators.

Description of This Adapter's Ports

The Change Data Capture input adapter's ports are used as follows:

  • Status (output): The adapter emits tuples from this port when significant events occur, such as when the adapter connects to the database, when CDC is enabled, or when the adapter fails to populate a tuple field in response to a captured change. The schema for this port has the following fields:

    • Type, string: returns one of the following values to convey the type of event:

      • Capture

      • Connection

      • Service

      • Suspend/Resume

      • UserInput

    • Action, string: returns one of the following values to convey the action that resulted in the event:

      • Already Enabled — CDC was already enabled for the database or table when the adapter starts

      • Already Running — For SQL Server, the SQLSERVERAGENT service is already running when the adapter starts

      • Connected — The adapter connected to the database

      • Enabled — CDC has been enabled for the database or table

      • Failed — An attempt failed to connect to the database, enable CDC, start the SQLSERVERAGENT service (SQL Server only), or populate a tuple field from a captured change.

      • Resumed — An adapter instance was resumed

      • Suspending — An adapter instance is being suspended

      • Started — The SQLSERVERAGENT service (for SQL Server)

      • Timeout — A timeout occurred waiting for the SQLSERVERAGENT service to start (SQL Server only)

    • Object, string: returns an event type-specific value, such as the database URL being connected to, the database or table name for which CDC has been enabled, the name of the adapter instance being suspended or resumed, or the name of the table column whose captured change could not be written to a tuple field.

    • Message, string: Returns a human-readable description of the event.

  • Capture (output): This is the adapter's primary output port. The adapter emits tuples on this port when changes are captured for the target database table. The port's schema contains the user-supplied fields from the Edit Schema tab along with the following metadata fields supplied by the adapter:

    • _Operation, string: contains one of the following values to convey the type of captured activity:.

      • Insert — A database record was inserted

      • Update-Old — A database record was inserted. The tuple contains the old record values.

      • Update-New — A database record was inserted. The tuple contains the new record values.

      • Delete — A database record was deleted

    • _CommitSeqNum, string: contains the transaction commit sequence number.

    • _SubSeqNum, string: contains the sequence number within the transaction.

    • UpdateMask, string: contains a bitmask indicating which columns changed.

Typechecking and Error Handling

The Change Data Capture input adapter uses typecheck messages to help you configure the adapter within your StreamBase application. In particular, the adapter generates typecheck messages for the following reasons:

  • A database host name or IP address is not specified.

  • A database TCP port number is not specified.

  • An invalid database TCP port number is specified.

  • A database username not specified.

  • A database password not specified.

  • A database name not specified.

  • A table name not specified.

The adapter generates warning messages during runtime under various conditions, including:

  • The adapter is shutting down because it failed to enable CDC.

  • A tuple field cannot be populated from a captured change.

  • A field specified in the Edit Schema tab has no corresponding column in the target database table.

  • The SQL Server agent service is not present (SQL Server only).

  • The adapter failed to start the SQL Server agent service (SQL Server only).

  • The adapter failed to read a record from the capture table.

Suspend and Resume Behavior

When suspended, the adapter stops processing change data capture activity and no longer emits tuples on its primary output port.

When resumed, the adapter resumes processing change data capture activity and starts emitting tuples on its primary output port.

Related Topics