Oracle CDC LogMiner Parser Adapter

Introduction

The Spotfire Streaming Adapter For Oracle CDC LogMiner Parser works with the Spotfire Streaming Adapter For Oracle CDC LogMiner Connection adapter to allow a StreamBase application to monitor changes to an Oracle 12c and above databases using the Oracle LogMiner.

Note

Oracle Corporation does not include its LogMiner feature on Oracle 19c, which means this adapter cannot work with that Oracle release.

The adapters together monitor the database's LogMiner REDO log to detect when records are inserted, updated, or deleted in a database table. The tuple contains a set of adapter-supplied metadata indicating the type of operation (insert, update, or delete), along with various other CDC information.

The Parser adapter is used to define one or more tables to be monitored and the schema to be used against the matched tables or a generic name-value pair schema. The Connection adapter is linked to this Parser adapter to create the connection to the database; you can link more than one Parser adapter to a Connection adapter to monitor multiple tables from a single connection.

The Parser adapter is configured through a collection of properties set in the adapter's Properties view within StreamBase Studio. Properties specify, among other things, the linked connection adapter and the schema and table regular expressions to define tables which activity is being captured.

Setting up Oracle LogMiner

  1. Make sure your database is set up correctly and has supplemental logging enabled.

    1. Enable LogMiner:

      1. Log onto your database using a user with DBA privileges.

      2. Check the current log level by running select log_mode from v$database; If the command returns LOG_MODE of ARCHIVELOG then logging is already enabled and you can skip the following steps. If the command returns NOARCHIVELOG then you must enable logging.

      3. To enable logging, first shut down the database using shutdown immediate;.

      4. Then start up and mount the database using startup mount;.

      5. Now configure the database to enable archive log using alter database archivelog; and alter database open;.

    2. Enabled Supplemental Logging using ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

    3. If you require primary key information, you can enable logging for each table with ALTER TABLE 'schema'.'table name' ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS; or for the entire database ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

    4. You can also enable full logging for each table with ALTER TABLE 'schema'.'table name' ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS; or for the entire database ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

    5. Submit the changes using ALTER SYSTEM SWITCH LOGFILE;

  2. This sample provides scripts to create a SAMPLE table and perform insert/update/delete tasks on the table. The adapters in the sample are set up for the SAMPLE table. The Oracle CDC LogMiner Connection adapter properties do need to be updated with the correct database information in order to connect.

    The following scripts are provided:

    • CreateSampleTable.sql — This SQL script creates the SAMPLE table which the adapters are set up to capture CDC information.

    • InsertSampleData.sql — This SQL script inserts 10 demo records into the SAMPLE table.

    • UpdateSampleData.sql — This SQL script updates the 10 demo records in the SAMPLE table with new values.

    • DeleteSampleData.sql — This SQL script deletes the 10 demo records from the SAMPLE table.

  3. Load the JAR file containing the JDBC driver for your database into the Maven dependency system as described in Using External JAR Files.

Adapter Properties

This section describes the properties you can set for this adapter, using the Properties view in StreamBase Studio.

General Tab

Name: Use this required field to specify or change the name of this instance of this component. The name must be unique within the current EventFlow module. The name can contain alphanumeric characters, underscores, and escaped special characters. Special characters can be escaped as described in Identifier Naming Rules. The first character must be alphabetic or an underscore.

Operator: A read-only field that shows the formal name of the operator.

Class name: Shows the fully qualified class name that implements the functionality of this adapter. If you need to reference this class name elsewhere in your application, you can right-click this field and select Copy from the context menu to place the full class name in the system clipboard.

Start options: This field provides a link to the Cluster Aware tab, where you configure the conditions under which this adapter starts.

Enable Error Output Port: Select this checkbox to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.

Description: Optionally, enter text to briefly describe the purpose and function of the component. In the EventFlow Editor canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Operator Properties Tab

This section describes the properties on the Adapter Properties tab in the Properties view for the Change Data Capture adapter.

Property Description
Connection Adapter The name of the linked connection adapter that is used as the LogMiner connection for this adapter. The linked adapter must be in the same module as this adapter.
Schema Regex The regular expression used to filter on schema for the table to get changes for. If no value is supplied, the system matches all schema names.
Table Regex The regular expression used to filter on table name for the table to get changes for. If no value is supplied, the system matches all table names.
Insert When enabled, insert statements are tracked.
Update When enabled, update statements are tracked.
Delete When enabled, delete statements are tracked.
Upsert When enabled, upsert statements are tracked.
Unsupported When enabled unsupported operations will be tracked. Note: This should only be used for debugging purposes to determine if the current tables structure will not support LogMiner.
Compact Updates When enabled, only a single tuple is output for updates that span multiple log updates determined by the Start SCN and Commit SCN values.
Log Level Controls the level of verbosity the adapter uses to send notifications to the console. This setting can be higher than the containing application's log level. If set lower, the system log level is used. Available values, in increasing order of verbosity, are: OFF, ERROR, WARN, INFO, DEBUG, TRACE.

Edit Schema Tab

Use the Edit Schema tab to express interest in one or more columns of the target table(s). Each field present in this tab must have a column with an identical name and compatible type in the target table(s). Fields present in the Edit Schema tab, along with several additional metadata fields supplied by the adapter, appear in the schema of the adapter's output port.

If no fields are supplied, a generic list of column name to value schema is used with all values being converted to a string.

For general instructions on using the Edit Schema tab, see Defining Input Streams.

Cluster Aware Tab

Use the settings in this tab to enable this operator or adapter for runtime start and stop conditions in a multi-node cluster. During initial development of the fragment that contains this operator or adapter, and for maximum compatibility with releases before 10.5.0, leave the Cluster start policy control in its default setting, Start with module.

Cluster awareness is an advanced topic that requires an understanding of StreamBase Runtime architecture features, including clusters, quorums, availability zones, and partitions. See Cluster Awareness Tab Settings on the Using Cluster Awareness page for instructions on configuring this tab.

Concurrency Tab

Use the Concurrency tab to specify parallel regions for this instance of this component, or multiplicity options, or both. The Concurrency tab settings are described in Concurrency Options, and dispatch styles are described in Dispatch Styles.

Caution

Concurrency settings are not suitable for every application, and using these settings requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Ports

This section describes the adapters ports.

Data Port

The data port outputs tuples when a valid change has been produced from the database that matches the setup for this parser.

  • Record (Tuple) or List(Tuple)—If no schema is supplied on the Edit Schema properties tab the following generic List(Tuple) schema is used.

    • Column (String)—The name of the column.

    • Value (String)—The current value of the field as a string.

  • MetaData (Tuple) or List(Tuple)—If compact updates are enabled a list of tuples will be supplied which include all the update records that have been combined to form a single output.

    • SCN (String)—The name of the column.

    • SCN (Double)—System change number (SCN) when the database change was made.

    • START_SCN (Double)—System change number (SCN) when the transaction that contains this change started. This column may also be NULL if the query is run over a time/SCN range that does not contain the start of the transaction.

    • COMMIT_SCN (Double)—System change number (SCN) when the transaction committed.

    • TIMESTAMP (Timestamp)—Timestamp when the database change was made.

    • START_TIMESTAMP (Timestamp)—Timestamp when the transaction that contains this change started. This column may also be NULL if the query is run over a time/SCN range that does not contain the start of the transaction.

    • COMMIT_TIMESTAMP (Timestamp)—Timestamp when the transaction committed.

    • XIDUSN (Double)—Transaction ID undo segment number of the transaction that generated the change.

    • XIDSLT (Double)—Transaction ID slot number of the transaction that generated the change.

    • XIDSQN (Double)—Transaction ID sequence number of the transaction that generated the change.

    • XID (Blob) - Raw representation of the transaction identifier.

    • PXIDUSN (Double)—Parent transaction ID undo segment number of a parallel transaction.

    • PXIDSLT (Double)—Parent transaction ID slot number of a parallel transaction.

    • PXIDSQN (Double)—Parent transaction ID sequence number of a parallel transaction.

    • PXID (Blob)—Raw representation of the parent transaction identifier.

    • TX_NAME (String)—Name of the transaction that made the change; only meaningful if the transaction is a named transaction.

    • OPERATION (String)—User level SQL operation that made the change.

      • INTERNAL—Change was caused by internal operations initiated by the database.

      • INSERT—Change was caused by an insert statement.

      • DELETE—Change was caused by a delete statement.

      • UPDATE—Change was caused by an update statement.

      • DDL—Change was caused by a DDL statement.

      • START—Change was caused by the start of a transaction.

      • COMMIT—Change was caused by the commit of a transaction.

      • SEL_LOB_LOCATOR—Operation was a SELECT statement that returned a LOB locator.

      • LOB_WRITE—Change was caused by an invocation of DBMS_LOB.WRITE.

      • LOB_TRIM—Change was caused by an invocation of DBMS_LOB.TRIM.

      • SELECT_FOR_UPDATE—Operation was a SELECT FOR UPDATE statement.

      • LOB_ERASE—Change was caused by an invocation of DBMS_LOB.ERASE.

      • MISSING_SCN—LogMiner encountered a gap in the redo records. This is most likely because not all redo logs were registered with LogMiner.

      • ROLLBACK—Change was caused by a full rollback of a transaction.

      • XML DOC BEGIN—Beginning of a change to an XMLType column or table.

      • XML DOC WRITE—Data for an XML document.

      • XML DOC END—End of the Data for an XML document.

      • UNSUPPORTED—Change was caused by operations not currently supported by LogMiner (for example, changes made to nested tables).

    • OPERATION_CODE (Int)—Number of the operation code.

      • 0—INTERNAL

      • 1—INSERT

      • 2—DELETE

      • 3—UPDATE

      • 5—DDL

      • 6—START

      • 7—COMMIT

      • 9—SELECT_LOB_LOCATOR

      • 10—LOB_WRITE

      • 11—LOB_TRIM

      • 25—SELECT_FOR_UPDATE

      • 29—LOB_ERASE

      • 34—MISSING_SCN

      • 36—ROLLBACK

      • 68—XML DOC BEGIN

      • 70 = XML DOC WRITE

      • 71 = XML DOC END

      • 255—UNSUPPORTED

    • ROLLBACK (Int)

      • 1 = If the redo record was generated because of a partial or a full rollback of the associated transaction.

      • 0 = Otherwise.

    • SEG_OWNER (String)—Owner of the modified data segment.

    • SEG_NAME (String)—Name of the modified data segment.

    • TABLE_NAME (String)—Name of the modified table (in case the redo pertains to a table modification).

    • SEG_TYPE (Int)—Type of the modified data segment:

      • 0—UNKNOWN

      • 1—INDEX

      • 2—TABLE

      • 19—TABLE PARTITION

      • 20—INDEX PARTITION

      • 34—TABLE SUBPARTITION

      • All other values—UNSUPPORTED

    • SEG_TYPE_NAME (String)—Segment type name:

      • UNKNOWN

      • INDEX

      • TABLE

      • TABLE PARTITION

      • INDEX PARTITION

      • TABLE SUBPARTITION

      • UNSUPPORTED

    • TABLE_SPACE (String)—Name of the tablespace containing the modified data segment. This column is not populated for rows where the value of the OPERATION column is DDL. This is because DDL may operate on more than one tablespace.

    • ROW_ID (String)—Row ID of the row modified by the change (only meaningful if the change pertains to a DML). This is NULL if the redo record is not associated with a DML.

    • USERNAME (String)—Name of the user who executed the transaction.

    • OS_USERNAME (String)—Name of the operating system user.

    • MACHINE_NAME (String)—Machine from which the user connected to the database.

    • AUDIT_SESSIONID (Double)—Audit session ID associated with the user session making the change.

    • SESSION_NUMBER (Double)—Session number of the session that made the change.

    • SERIAL_NUMBER (Double)—Serial number of the session that made the change.

    • SESSION_INFO (String)—Information about the database session that executed the transaction. Contains process information, machine name from which the user logged in, and so on. A possible SESSION_INFO column may contain the following:

      • login_username = HR

      • client_info =

      • OS_username = jkundu

      • Machine_name = nirvan

      • OS_terminal = pts/31

      • OS_program_name = sqlplus@nirvan (TNS V1-V3)

    • THREAD_NUMBER (Double)—Number of the thread that made the change to the database.

    • SEQUENCE_NUMBER (Int)—Sequence number of the SQL statement within the transaction.

    • RBASQN (Double)—Sequence# associated with the Redo Block Address (RBA) of the redo record associated with the change.

    • RBABLK (Double)—RBA block number within the log file.

    • RBABYTE (Double)—RBA byte offset within the block.

    • UBAFIL (Double)—Undo Block Address (UBA) file number identifying the file containing the undo block.

    • UBABLK (Double)—UBA block number for the undo block.

    • UBAREC (Double)—UBA record index within the undo block.

    • UBASQN (Double)—UBA undo block sequence number.

    • ABS_FILE_NUMBER (Double)—Data block absolute file number of the block changed by the transaction.

    • REL_FILE_NUMBER (Double)—Data block relative file number. The file number is relative to the tablespace of the object.

    • DATA_BLK_NUMBER (Double)—Data block number within the file.

    • DATA_OBJ_NUMBER (Double)—Data block object number identifying the object.

    • DATA_OBJV_NUMBER (Double)—Version number of the table being modified.

    • DATA_OBJD_NUMBER (Double)—Data block data object number identifying the object within the tablespace.

    • SQL_REDO (String)—Reconstructed SQL statement that is equivalent to the original SQL statement that made the change.

    • SQL_UNDO (String)—Reconstructed SQL statement that can be used to undo the effect of the original statement that made the change.

    • RS_ID (String)—Record set ID.

    • SSN (Double)—SQL sequence number. Used in conjunction with RS_ID, this uniquely identifies a logical row change, shown as one or more rows from the V$LOGMNR_CONTENTS view.

    • CSF (Int) - Continuation SQL flag. Possible values are:

      • 0—Indicates SQL_REDO and SQL_UNDO is contained within the same row

      • 1—Indicates that either SQL_REDO or SQL_UNDO is greater than 4000 bytes in size and is continued in the next row returned by the view

    • INFO (String)—Informational message about the row. For instance, the string USER DDL indicates that the DDL statement returned in the SQL_REDO column was the top-level DDL executed by the user and the string INTERNAL DDL indicates that the DDL statement returned in the SQL_REDO column was executed internally by the RDBMS.

    • STATUS (Double)—A value of 0 indicates that the reconstructed SQL statements as shown in the SQL_REDO and SQL_UNDO columns are valid executable SQL statements. Otherwise, the reconstructed SQL statements are not executable. This may be because no data dictionary was provided to LogMiner for the analysis, or that the data dictionary provided did not have the definition of the object being mined. A value of 5 indicates that this row is part of a change to an XMLType column or table and the XML document must be assembled before being applied.

    • REDO_VALUE (Double)—Used as input to the DBMS_LOGMNR.MINE_VALUE() and DBMS_LOGMNR.COLUMN_PRESENT() functions.

    • UNDO_VALUE (Double)—Used as input to the DBMS_LOGMNR.MINE_VALUE() and DBMS_LOGMNR.COLUMN_PRESENT() functions.

    • SAFE_RESUME_SCN (Double)—Reserved for future use.

    • OBJECT_ID (Blob)—Object identifier for DMLs to XMLType tables. For changes to non-typed tables, this column is NULL.

    • EDITION_NAME (String)—Identifies the edition in which a DDL statement was executed.

    • CLIENT_ID (String)—Client identifier in the session that performed the operation, if available.

    • SRC_CON_NAME (String)—Contains the pluggable database (PDB) name. This information will only be available when mining with a current LogMiner dictionary.

    • SRC_CON_ID (Double)—Contains the PDB ID (the PDB_ID column from the DBA_PDBS view). This information will be available only with a current LogMiner dictionary.

    • SRC_CON_UID (Double)—Contains the PDB UID (the CON_UID column from the DBA_PDBS view). This information will be available with or without a current LogMiner dictionary.

    • SRC_CON_DBID (Double)—Contains the PDB identifier (the DBID column from the DBA_PDBS view). This information will only be available when mining with a current LogMiner dictionary.

    • SRC_CON_GUID (Blob)—Contains the GUID associated with the PDB (the GUID column from the DBA_PDBS view). This information is available when mining with a current LogMiner dictionary.

    • CON_ID (Double)—The ID of the container to which the data pertains. Possible values include:

      • 0: This value is used for rows containing data that pertain to the entire CDB. This value is also used for rows in non-CDBs.

      • 1: This value is used for rows containing data that pertain to only the root.

Status Port

The status port outputs various status information tuples to provide insight into the adapter's state.

  • Status (String)—The name of the status.

    Time (Timestamp)—The date and time the status information was produced.

    Info (List Tuple)—A list of Name-Value pairs of details for the status, such as database name or error message.

All status messages that may be produced and the info list they provide.

  • Monitoring Table—The adapter is now monitoring a given table. Info list contains:

    • Table—The table being monitored.

  • Parse Error—The adapter was unable to parse an entry. Info list contains:

    • Error—The actual error message.

  • Parse Field Error—The adapter was unable to parse a field in an entry. Info list contains:

    • Column—The column in the table.

    • DB Type—The database column type.

    • StreamBase Type—The StreamBase field type.

    • Value—The string representation of the field from the database given by the REDO log.

    • MetaTuple—The meta data tuple in JSON format.

    • Error—The actual error message.

Suspend and Resume Behavior

When suspended, the adapter stops processing CDC activity and no longer emits tuples on its primary output port.

When resumed, the adapter resumes processing CDC activity and starts emitting tuples on its primary output port.