Spotfire® User Guide

Connector for Cloudera Impala — Features and settings

You can connect to and access data from Cloudera Impala databases with the connector for Cloudera Impala. On this page, you can find information about the capabilities, available settings, and things to keep in mind when you work with data connections to Cloudera Impala.

Page content

Connector features

The following functionality is available when you access data with the connector for Cloudera Impala.

Feature Supported?
Load methods
  • Import (in-memory)
  • External (in-database)
  • On-demand
Custom queries Yes
Stored procedures No
Custom connection properties Yes
Single sign-on with identity provider No
Authoring in web client Yes
Included in Spotfire distribution on Linux Yes

Data source properties

The following are the supported data source properties that you can configure when you create a data connection with the connector for Cloudera Impala.

Option Description
Server

The name of the server where your data is located. To include a port number, add it directly after the name preceded by colon.

Example: MyDatabaseServer:20011

Default port number: 21050

Authentication method The authentication method to use when logging into the database. The following options are available:
  • No authentication
  • Kerberos
  • Username
  • Username and password
Use Spotfire username as delegated user

Use the username of the current Spotfire user as the delegated user (DelegationUID) in connections to the data source.

To use a connection with delegation, a user corresponding to your Spotfire username must be available in the Impala system.

Note: If you are running Spotfire offline, without connecting to a server, this setting is ignored.
Tip: You can specify a different delegated user by setting the DelegationUID as a custom property. If you specify a fixed value for DelegationUID, you cannot use the Use Spotfire username as delegated user setting.
Realm [Only available for Kerberos authentication.]

Realm in Kerberos is similar to a domain name. The Kerberos realm name is case-sensitive and conventionally uppercase.

Leave the field blank if a default realm has been configured for your Kerberos setup.

For details and further help, contact your Cloudera Impala system administrator.

Host FQDN [Only available for Kerberos authentication.]

The Fully Qualified Domain Name (FQDN) is the complete URL or path to a server.

For example:

myServer.myDomain.com

To use the Impala server host name as the FQDN, type _HOST.

For details and further help, contact your Cloudera Impala system administrator.

Service name [Only available for Kerberos authentication.]

The service name of the Impala server. For example, “impala”.

For details and further help, contact your Cloudera Impala system administrator.

Use SSL Select to connect using Secure Sockets Layer (SSL).
Allow common name host name mismatch

When Use SSL is selected, allow connecting to an Impala server where the host name does not match the common name in the SSL certificate.

Allow self-signed certificate When Use SSL is selected, allow connecting to an Impala server with a self-signed SSL certificate.
Use SASL

Select to connect using Simple Authentication and Security Layer (SASL).

Note: If you select the Transport mode option HTTP (under More properties), that option takes precedence over your selection for SASL. You cannot use both SASL and Transport mode HTTP at the same time.
Connection timeout (s)

Specify the maximum number of seconds allowed for a connection to the database to be established.

The default value is 120.

Note: If you set the connection timeout to zero, it will be interpreted as no timeout. That is, there will be no upper limit for trying to connect. This is generally not recommended.
Command timeout (s)

Specify the maximum number of seconds allowed for a command to be executed.

The default value is 1800.

Note: If you set the command timeout to zero, it will be interpreted as no timeout. That is, there will be no upper limit for trying to execute the command. This is generally not recommended.
Transport mode

Select the thrift transport mode to use in the thrift layer when connecting to the data source.

The following options are available:

  • Binary
    Binary is the default transport mode. Binary is in use if you do not select Use SASL, nor transport mode HTTP.
    Note: If you have selected the option Use SASL, that option takes precedence over the Transport mode option Binary.
  • HTTP
    The transport mode HTTP is often required when a proxy is used between client and server, for example for load balancing. When you select HTTP, you must also configure the HTTP Path.
    Note: If you select HTTP, this option takes precedence over the option Use SASL.
HTTP Path

Specify the partial URL that corresponds to the Impala server you are connecting to.

The partial URL is appended to the server host and port specified in the Server field. Together, they form the full address.

For example, if you want to connect to https://example.com:21050/gateway/sandbox/impala/version, in the Server field you enter https://example.com:21050 and in the HTTP Path field you enter /gateway/sandbox/impala/version.

Custom properties for Cloudera Impala connection data sources

The following is the default list of driver settings that are allowed as custom properties in Cloudera Impala connection data sources. To learn how to change the allowed custom properties, see Controlling what properties are allowed.

Default allowed custom properties

AsyncExecPollInterval, AutoReconnect,DefaultKeytabFile, DelegationUID, DelegationUserIDCase, 
DisableOptimizedEncodingConverter, EnableQueryRetry, GlobalResultSetCache, HTTPAuthCookies, http.header., 
IgnoreTransactions, LCaseSspKeyName, MaxCatalogNameLen, MaxColumnNameLen, MaxNumQueryRetries, 
MaxSchemaNameLen, MaxTableNameLen, Min_TLS, ProxyHost, ProxyPort, ProxyPWD, ProxyUID, QueryRetryInterval,
ResultSetCacheSize, RowsFetchedPerBlock, ServicePrincipalCanonicalization, SocketTimeout, 
SSOWebServerTimeout, StringColumnLength, TSaslTransportBufSize, TrustedCerts, UPNKeytabMappingFile, 
UseKeytab, UseOnlySSPI, UseProxy, UseSQLUnicodeTypes

Supported data types

When you are setting up a connection to an external data source, Spotfire needs to map the data types in the data source to data types in Spotfire. The following are the data types that the Cloudera Impala connector supports, as well as their corresponding data types in a Cloudera Impala database.

Database data type Spotfire data type
BOOLEAN Boolean/Bool
TIMESTAMP DateTime
INT Integer/Int
SMALLINT Integer/Int
TINYINT Integer/Int
BIGINT LongInteger/Long
DOUBLE Real/Double
FLOAT SingleReal/Float
CHAR String
STRING String
VARCHAR String
DECIMAL(p,s)

When p <= 9 and s = 0: Integer

When 9 < p <= 18 and s = 0: LongInteger

Else: Real

Supported functions

Supported functions are the functions that you can use when you work with in-database data tables, for example for calculated columns and custom expressions.

Note: Some supported functions might not be possible to use with your database. This depends on what functions are available in the database, which often differs between database versions and types.

The following are the functions that the Cloudera Impala connector supports.

Function type Functions supported
Date and Time DATE_ADD, DATE_SUB, DATEDIFF, DAYOFMONTH, FROM_UTC_TIMESTAMP, HOUR, MINUTE, MONTH, NOW, SECOND, TO_DATE, TO_UTC_TIMESTAMP, WEEKOFYEAR, YEAR, QUARTER
Conversion SN
Math ABS, ACOS, ASIN, ATAN, BIN, CEIL, CEILING, COS, DEGREES, E, EXP, HEX, LOG, LOG10, LN, LOG2, NEGATIVE, PI, PMOD, POSITIVE, POW, POWER, QUOTIENT, RADIANS, RAND, ROUND, SIGN, SIN, SQRT, TAN, UNHEX
Operators %, -, +, *, /
Statistical SUM, AVG, MIN, MAX, COUNT
Text ASCII, CONCAT, CONCAT_WS, INSTR, LENGTH, LOCATE, LOWER, LCASE, LTRIM, REGEXP_EXTRACT, REGEXP_REPLACE, REVERSE, RTRIM, TRIM, UPPER, UCASE

The following functions are also supported, but with a lower performance:

SPACE, RPAD, LPAD, REPEAT, SUBSTR, SUBSTRING

Non-supported functions
A few functions may be shown in the analysis even though they are not supported by the connector. These are:
INITCAP, TRANSLATE, CONV, GREATEST, LEAST, DAYNAME, DAYOFWEEK
Binning is also not supported by this connector.