Spotfire® User Guide

Connector for Cloudera Hive — Features and settings

You can connect to and access data from Cloudera Hive databases with the connector for Cloudera Hive. On this page, you can find information about the capabilities, available settings, and things to keep in mind when you work with data connections to Cloudera Hive.

Page content

Connector features

The following functionality is available when you access data with the connector for Cloudera Hive.

Feature Supported?
Load methods
  • Import (in-memory)
  • External (in-database)
  • On-demand
Custom queries Yes
Stored procedures No
Custom connection properties No
Single sign-on with identity provider No
Authoring in web client No
Included in Spotfire distribution on Linux Yes, if a driver exists

Data source properties

The following are the supported data source properties that you can configure when you create a data connection with the connector for Cloudera Hive.

Option Description
Server

The name of the server where your data is located. To include a port number, add it directly after the name preceded by colon. Default port is 10000.

Example with port number: MyDatabaseServer:10011

Note: The Cloudera Hive Connector supports only HiveServer2.
Authentication method

The authentication method to use when logging into the database. Choose from

  • No authentication
  • Kerberos authentication
  • Username authentication
  • Username and password authentication
  • Username and password authentication with SSL
Realm [Only available for Kerberos authentication.]

Realm in Kerberos is similar to a domain name. The Kerberos realm name is case-sensitive and conventionally uppercase.

Leave the field blank if a default realm has been configured for your Kerberos setup.

For details and further help, contact your Cloudera Hive system administrator.

Host FQDN [Only available for Kerberos authentication.]

The Fully Qualified Domain Name (FQDN) is the complete URL or path to a server.

For example:

myServer.myDomain.com

To use the Impala server host name as the FQDN, type _HOST.

For details and further help, contact your Cloudera Hive system administrator.

Service name [Only available for Kerberos authentication.]

The service name of the Hive server. For example, “hive”.

For details and further help, contact your Cloudera Hive system administrator.

Connect Connect to the specified server and populate the list of available databases in the Database drop-down list.
Database Select the database of interest from the drop-down list.
Transport mode

Select the thrift transport mode to use in the thrift layer when connecting to the database. The following options are available:

  • Binary
  • SASL is the default transport mode.
  • HTTP If you select HTTP, you must also enter a partial URL in the HTTP Path field.
HTTP Path

[Only applicable when you select transport mode HTTP.]

Specify the partial URL that corresponds to the Hive server you are connecting to.

The partial URL is appended to the host and port that you specified in the Server field. Together, they form the full address.

As an example, if you want to connect to the address https://example.com:10002/gateway/sandbox/hive/version: for Server, enter https://example.com:10002, and for HTTP Path, enter /gateway/sandbox/hive/version.

Supported data types

When you are setting up a connection to an external data source, Spotfire needs to map the data types in the data source to data types in Spotfire. The following are the data types that the Cloudera Hive connector supports.

Database data type Spotfire data type
BOOLEAN Boolean/Bool
TIMESTAMP DateTime
INT Integer/Int
SMALLINT Integer/Int
TINYINT Integer/Int
BIGINT LongInteger/Long
DECIMAL Real/Double
DOUBLE Real/Double
FLOAT SingleReal/Float
ARRAY String
STRING String

Supported functions

Supported functions are the functions that you can use when you work with in-database data tables, for example for calculated columns and custom expressions.

Note: Some supported functions might not be possible to use with your database. This depends on what functions are available in the database, which often differs between database versions and types.

The following are the functions that the Cloudera Hive connector supports.

Function type Functions supported
Date and Time YEAR, QUARTER, MONTH, WEEK, WEEKOFYEAR, DAY, DAYOFMONTH, HOUR, MINUTE, SECOND, DATEDIFF, DATE_ADD, DATE_SUB, FROM_UTC_TIMESTAMP, TO_UTC_TIMESTAMP, TO_DATE
Conversion SN
Math ABS, ACOS, ASIN, ATAN, BIN, CEIL, CEILING, CONV, COS, DEGREES, E, EXP, FLOOR, HEX, LN, LOG, LOG10, LOG2, NEGATIVE, PI, POW, POWER, POSITIVE, PMOD, RADIANS, RAND, ROUND, SIGN, SIN, SQRT, TAN
Operators %, -, +, *, /
Statistical SUM, AVG, MIN, MAX, COUNT, UNIQUECOUNT, VARIANCE, VAR_POP, VAR_SAMP, STDDEV_POP, STDDEV_SAMP, CORR, COVAR_POP, COVAR_SAMP, PERCENTILE
Text ASCII, CONCAT, CONCAT_WS, INSTR, FIND_IN_SET, GET_JSON_OBJECT, IN_FILE, INSTR, LENGTH, LOWER, LCASE, LPAD, LTRIM, RPAD, RTRIM, TRIM, UPPER, LOCATE, REPEAT, SPACE, REGEXP_REPLACE, REGEXP_EXTRACT, SUBSTR, SUBSTRING, REVERSE, TRANSLATE, PARSE_URL, UCASE

Non-supported functions

The following functions are not supported in this version of the connector.

Date and Time: FROM_UNIXTIME, UNIXTIMESTAMP

Statistical: HISTOGRAM_NUMERIC, COLLECT_SET, COLLECT_LIST

Text:

BASE64, CONTEXT_NGRAMS, DECODE, ENCODE, NGRAMS, PRINTF, SENTENCES, SPLIT, STR_TO_MAP, UNBASE64, FORMAT_NUMBER

Collection Functions, Type Conversion Functions, Conditional Functions, Miscellaneous Functions and Table-Generating Functions are not supported by this connector.

Binning is also not supported by Cloudera Hive.