The TIBCO StreamBase® Adapter for Apache HBase is implemented as a suite of five global Java operators: the HBase Admin, Delete, Get, Put, and Scan operators.
This page describes the HBase Get operator, which allows a StreamBase application to extract rows by ID from the connected HBase database. The operator uses property values in the project's server configuration file to set up the connection to the HBase database, as described in Configuration File Settings. Multiple HBase operators can share a single instance of an HBase connection by selecting the same HBase configuration setting ID.
This section describes the configuration for an HBase database connection instance that you specify in your project's HOCON configuration file. This configuration is the same for all HBase operator types.
The adapter configuration type (com.tibco.ep.streambase.configuration.adapter) of a project's HOCON file, despite its name, is used to specify configuration value groups for either operators or adapters.
The HBase configuration starts with an hbase root section under the adapters object. Its sections array contains one section with name = "hbase". This section, in turn, contains a settings object with one or more settings.
Each section named "hbase" must contain one setting in the form id = "HBaseConfigName", where HBaseConfigName is the name you assign to a group of settings that uniquely define an individual HBase database connection. All other settings are optional.
The example configuration below shows a basic configuration to connect to an HBase server. You can have as many configurations as your application requires, but each configuration must have a unique id.
              
Example 1. Example Adapter Configuration Section for HBase
name = "HBase.conf"
type = "com.tibco.ep.streambase.configuration.adapter"
version = "1.0.0"
configuration = {

  // An adapter group type defines a collection of EventFlow adapter configurations,
  // indexed by adapter type.
  AdapterGroup = {

    // A collection of EventFlow adapter configurations, indexed by adapter type.
    // This object is required and must contain at least one configuration.
    adapters = {

      // The root section for an EventFlow adapter configuration.
      hbase = {

        // Section list. This array is optional and has no default value.
        sections = [

          // A configuration for an EventFlow adapter named section.
          {

            // Section name. The value does not have to be unique; that is, you can have
            // multiple sections with the same name in the same array of sections.
            // This property is required.
            name = "hbase"

            // Section for setting adapter properties. All values must be strings.
            // This object is optional and has no default value.
            settings = {
              connectAtStartup = "true"
              "hbase.client.retries.number" = "5"
              "hbase.master" = "127.0.0.1:60000"
              "hbase.zookeeper.property.clientPort" = "2181"
              "hbase.zookeeper.quorum" = "127.0.0.1"
              id = "HBase Demo"
              "zookeeper.recovery.retry" = "5"
              "zookeeper.session.timeout" = "5000"
            }
          }
        ]
      }
    }
  }
}
| Setting | Type | Description | 
|---|---|---|
| id | string | The value of the id setting displays in the drop-down list in the adapter's Properties view, and is used to uniquely identify this section of the configuration file. | 
| connectAtStartup | true or false | If true, this operator instance connects to HBase on startup of this operator's containing module. | 
| *** | string | All other values are directly sent to the HBaseConfiguration class, which is responsible for setting up a connection to the HBase server. See the Apache HBase documentation for the available client configuration options and for further information on setting up a connection to HBase. | 
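The split described in the table above, between settings the operator interprets itself and settings handed on to the HBase client configuration, can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (HBaseSettingsSketch, isOperatorKey, passThrough) are hypothetical, and the adapter's real implementation is not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not the adapter's actual code.
public class HBaseSettingsSketch {

    // The two keys the settings table describes as interpreted by the operator itself.
    static boolean isOperatorKey(String key) {
        return key.equals("id") || key.equals("connectAtStartup");
    }

    // Returns the remaining entries, which the table says are sent on
    // unchanged to the HBase client configuration.
    static Map<String, String> passThrough(Map<String, String> settings) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (!isOperatorKey(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("id", "HBase Demo");
        settings.put("connectAtStartup", "true");
        settings.put("hbase.zookeeper.quorum", "127.0.0.1");
        settings.put("hbase.zookeeper.property.clientPort", "2181");
        // prints {hbase.zookeeper.quorum=127.0.0.1, hbase.zookeeper.property.clientPort=2181}
        System.out.println(passThrough(settings));
    }
}
```

In practice this means a misspelled HBase property name is likely to be passed along silently rather than rejected, so it is worth checking key names against the Apache HBase client documentation.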
This section describes the properties you can set for an HBase Get operator, using the various tabs of the Properties view in StreamBase Studio.
Name: Use this required field to specify or change the name of this instance of this component, which must be unique in the current EventFlow module. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.
Operator: A read-only field that shows the formal name of the operator.
Class name: Shows the fully qualified class name that implements the functionality of this operator. If you need to reference this class name elsewhere in your application, you can right-click this field and select Copy from the context menu to place the full class name in the system clipboard.
Start options: This field provides a link to the Cluster Aware tab, where you configure the conditions under which this operator starts.
Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.
Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow Editor canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.
| Property | Type | Description | 
|---|---|---|
| HBase Configuration | Edit Button | Shortcut to the StreamBase Configuration File Editor, used for adapter configuration or converting an existing application's adapter-configurations.xml file to HOCON format. | 
| HBase Config | drop-down list | The name of the HBase configuration to use with this operator. The value selected by this drop-down list determines the database connection this operator works against. The values that populate this list are stored in the project's adapter configuration file, as described in the Configuration File Settings section. | 
| Table Name | string | The HBase table that this operation is to be performed against. | 
| Max Versions | int | The maximum number of versions to fetch for each row from the database. If this value is 1, the output schema represents a schema for a single version output. If, however, this value is greater than one, the output schema will be modified to represent a list of values with their corresponding timestamp values. | 
| Enable Pass Through Fields | check box | If enabled, the fields passed into the operator are to be copied into a user-defined field in the output schema. | 
| Pass Through Field Name | string | The name of the field in the output schema that is to hold the pass-through fields. | 
| Filter Factory Class | string | The fully qualified name of the class that implements com.streambase.sb.operator.hbase.IHBaseFilterFactory to create the org.apache.hadoop.hbase.filter.FilterList to perform filtering on the rows returned by this operation. Leave blank for no filtering. See the Filter Factory Interface section for more on creating a FilterList class. | 
| Row Id Field Name | string | The field in the inbound schema that contains the Row ID to get. | 
| Enable Status Port | check box | If enabled, a status port is made available for this operator instance, which will emit status tuples for various events from this operator. | 
| Log Level | INFO | Controls the level of verbosity the adapter uses to issue informational traces to the console. This setting is independent of the containing application's overall log level. Available values, in increasing order of verbosity, are: OFF, ERROR, WARN, INFO, DEBUG, TRACE. | 
| Property | Type | Description | 
|---|---|---|
| Time Range Min Field Name | string | Specifies the field in the incoming schema that supplies the starting timestamp of a time range. Only columns and versions of columns within the specified time range are retrieved; a value of null or empty means the bound is not used. The Max Versions setting defaults to 1. If your time range spans more than one version and you want all versions within the time range returned, increase the Max Versions setting to a value greater than one. | 
| Time Range Max Field Name | string | Specifies the field in the incoming schema that supplies the ending timestamp of a time range. Only columns and versions of columns within the specified time range are retrieved; a value of null or empty means the bound is not used. The Max Versions setting defaults to 1. If your time range spans more than one version and you want all versions within the time range returned, increase the Max Versions setting to a value greater than one. | 
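The interaction between the time range and Max Versions can be illustrated with a small stand-alone sketch. It assumes the operator follows the semantics of the HBase client's Get.setTimeRange, which treats the range as half-open (minimum inclusive, maximum exclusive), and that versions are returned newest first; this page does not confirm either point for the operator. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of time range plus Max Versions selection.
public class TimeRangeSketch {

    // Picks the version timestamps that fall in [minStamp, maxStamp),
    // newest first, keeping at most maxVersions of them.
    static List<Long> select(List<Long> versionTimestamps,
                             long minStamp, long maxStamp, int maxVersions) {
        List<Long> sorted = new ArrayList<>(versionTimestamps);
        sorted.sort(Collections.reverseOrder());
        List<Long> out = new ArrayList<>();
        for (long ts : sorted) {
            if (ts >= minStamp && ts < maxStamp && out.size() < maxVersions) {
                out.add(ts);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> versions = List.of(100L, 200L, 300L, 400L);
        System.out.println(select(versions, 150L, 400L, 1)); // prints [300]
        System.out.println(select(versions, 150L, 400L, 5)); // prints [300, 200]
    }
}
```

With Max Versions left at 1, only the newest version inside the range comes back even though two versions qualify, which is why the property descriptions recommend raising Max Versions when the range spans several versions.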
| Property | Type | Description | 
|---|---|---|
| Column Filter Type | "Range" or "Columns" | Indicates the type of column filtering to perform. | 
| Column Filter Min Range | string | The minimum value for the column range. If left blank, there is no lower bound. | 
| Column Filter Min Range Inclusive | check box | If enabled, the value of the minimum column itself is to be included in the range. | 
| Column Filter Max Range | string | The maximum value for the column range. If left blank, there is no upper bound. | 
| Column Filter Max Range Inclusive | check box | If enabled, the value of the maximum column itself is to be included in the range. | 
| Columns | field grid | The columns to return with this query. The Family column of the field grid must have a value, but the Column column can be blank, which directs the operator to include all columns of the specified Family. | 
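The four Range properties above can be read as one predicate over column names. The sketch below is an assumption-laden illustration, not the operator's code: it compares names as unsigned bytes, which is how HBase orders column qualifiers, and treats a blank bound as unbounded, as the table states. HBase's own ColumnRangeFilter takes the same four values, but whether the operator uses it is not stated on this page. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the column range check.
public class ColumnRangeSketch {

    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True if the column name lies within the configured range.
    // A null or empty bound means that side of the range is open.
    static boolean inRange(String column, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        byte[] c = column.getBytes(StandardCharsets.UTF_8);
        if (min != null && !min.isEmpty()) {
            int cmp = compareUnsigned(c, min.getBytes(StandardCharsets.UTF_8));
            if (cmp < 0 || (cmp == 0 && !minInclusive)) {
                return false;
            }
        }
        if (max != null && !max.isEmpty()) {
            int cmp = compareUnsigned(c, max.getBytes(StandardCharsets.UTF_8));
            if (cmp > 0 || (cmp == 0 && !maxInclusive)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(inRange("b", "a", true, "c", false)); // prints true
        System.out.println(inRange("c", "a", true, "c", false)); // prints false
        System.out.println(inRange("z", "", true, "", true));    // prints true
    }
}
```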
| Property | Type | Description | 
|---|---|---|
| Row Id Type | drop-down list | The data type used for the Row ID field. | 
| Convert Value To String | check box | This option only applies if no user-defined schema is provided. If enabled, this option tries to convert all cell values to a string; if not enabled, all values will be a blob. | 
| HBase Schema | schema grid | The schema used for scan data results. If no schema is provided, the schema is a list of tuples containing the family column value sets. | 
Use the settings in this tab to allow this operator or adapter to start and stop based on conditions that occur at runtime in a cluster with more than one node. During initial development of the fragment that contains this operator or adapter, and for maximum compatibility with TIBCO Streaming releases before 10.5.0, leave the Cluster start policy control in its default setting, Start with module.
Cluster awareness is an advanced topic that requires an understanding of StreamBase Runtime architecture features, including clusters, quorums, availability zones, and partitions. See Cluster Awareness Tab Settings on the Using Cluster Awareness page for instructions on configuring this tab.
Use the Concurrency tab to specify parallel regions for this instance of this component, or multiplicity options, or both. The Concurrency tab settings are described in Concurrency Options, and dispatch styles are described in Dispatch Styles.
Caution
Concurrency settings are not suitable for every application, and using these settings requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.
The filter factory interface (com.streambase.sb.operator.hbase.IHBaseFilterFactory) is used to create a FilterList for an HBase Scan or Get operation. The filter list allows HBase to return only those rows that match the criteria provided by the specified filters. The interface allows an interaction between the inbound tuple and the Scan or Get operations. See the HBase API documentation for further information about the org.apache.hadoop.hbase.filter.FilterBase class and its sub-classes, and for how-to information on filtering data.
When you have compiled a Java class to serve as a filter, provide the fully qualified name of your class in the Filter Factory Class field in the Operator Properties tab in your HBase Scan or Get operator's Properties view.
The next section provides an example class file that takes the inbound tuples and uses some of the fields of each tuple to create a filter to use for the read operation.
package com.streambase.sb.hbase.filterfactory;

import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

import com.streambase.sb.Tuple;
import com.streambase.sb.operator.hbase.IHBaseFilterFactory;

// Builds a filter list from fields of the inbound tuple.
public class DemoFilterFactory implements IHBaseFilterFactory {

  @Override
  public FilterList createFilterList(Tuple tuple) throws Exception {
    // MUST_PASS_ALL: a row is returned only if it passes every filter in the list.
    FilterList filterList =
      new FilterList(FilterList.Operator.MUST_PASS_ALL);

    // Keep rows whose value in the given family and column contains
    // the substring supplied in the tuple's matchSubString field.
    filterList.addFilter(new SingleColumnValueFilter(
      Bytes.toBytes(tuple.getString("family")),
      Bytes.toBytes(tuple.getString("column")),
      CompareOp.EQUAL,
      new SubstringComparator(tuple.getString("matchSubString"))));

    return filterList;
  }
}