Accessing data from Apache Spark SQL
You can access data from Apache Spark SQL systems in Spotfire.
Before you begin
- The Apache Spark SQL connector requires a driver on the computer running Spotfire. See Drivers and data sources in Spotfire.
- To make sure that your database is supported, see the system requirements for the Apache Spark SQL connector.
Procedure
- Open the Files and data flyout, and click Connect to.
- In the list of data sources, select Apache Spark SQL.
- In the panel on the right, choose whether to create a new connection or to add data from a shared data connection.
- Connector for Apache Spark SQL — Features and settings
You can connect to and access data from Spark SQL databases with the data connector for Apache Spark SQL. On this page, you can find information about the capabilities, available settings, and things to keep in mind when you work with data connections to Apache Spark SQL.
Working with and troubleshooting Apache Spark SQL data connections
About this task
Prerequisite: Spark Thrift Server
To access data in Apache Spark SQL with the Spotfire connector for Apache Spark SQL, the Spark Thrift Server must be installed on your cluster. Spark Thrift Server provides access to Spark SQL via ODBC, and it might not be included by default on some Hadoop distributions.
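Before configuring the connection, it can help to confirm that something is listening where you expect the Spark Thrift Server to be. The following is a minimal sketch in Python, not part of Spotfire. The function name is hypothetical, and the default port 10000 is an assumption that commonly applies but may differ on your cluster. A successful connection only shows that the port is open, not that the listener is actually the Spark Thrift Server.

```python
import socket


def thrift_port_reachable(host: str, port: int = 10000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    The default port (10000) is an assumption; check your cluster configuration.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `thrift_port_reachable("spark-host.example.com")` (a placeholder host name) returns False if the host cannot be reached or the port is closed.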
Prerequisite: spark.shuffle.service.enabled
If you use the in-database load method when connecting to Apache Spark 2.1 or later, and you encounter errors in your analysis, you might have to enable the spark.shuffle.service.enabled option on the Spark server.
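One way to check the option is to inspect the Spark configuration text on the server. The sketch below is illustrative only: it assumes the setting is kept in a spark-defaults.conf style file (one property per line, key and value separated by whitespace or "="), which may not match how your cluster is managed. The function name and the sample content are hypothetical.

```python
def shuffle_service_enabled(conf_text: str) -> bool:
    """Return True if spark.shuffle.service.enabled is set to true in conf_text."""
    enabled = False
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Accept both "key value" and "key=value" forms.
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) == 2 and parts[0] == "spark.shuffle.service.enabled":
            # If the key appears more than once, the last occurrence wins.
            enabled = parts[1].strip().lower() == "true"
    return enabled


# Hypothetical configuration content, for illustration only.
sample = """
# Example spark-defaults.conf content
spark.master                   yarn
spark.shuffle.service.enabled  true
"""
print(shuffle_service_enabled(sample))
```

If the function returns False for your configuration, the option is either missing or not set to true in the text that was checked.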
Apache Spark SQL temporary views and tables in custom queries
If you are creating a custom query and you want to use data from an Apache Spark SQL temporary table or view, you must refer to those objects using their qualified names, specifying both the name and the location of the object. The qualified names required have the following format:
databaseName.tempViewName
By default, global temporary views are stored in the global_temp database. The database name can vary, and you can see it in the hierarchy of available database tables in Spotfire.
To select all columns from a global temporary view named myGlobalTempView that is stored in the global_temp database:
SELECT * FROM global_temp.myGlobalTempView
Temporary views and tables (listed in Spotfire under 'Temporary views' or 'Temporary tables') are always located in the #temp database.
To select all columns from a temporary view named myTempView:
SELECT * FROM #temp.myTempView
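The two cases above follow the same databaseName.tempViewName pattern, so the query text can be generated. The following Python sketch is not a Spotfire API; the function and constant names are hypothetical. It does no quoting or validation of identifiers, so it is suitable only for names you control.

```python
GLOBAL_TEMP_DB = "global_temp"  # default location of global temporary views; can vary
LOCAL_TEMP_DB = "#temp"         # location of temporary views and tables


def select_all(view_name: str, database: str) -> str:
    """Build a SELECT * custom query using the qualified databaseName.viewName form."""
    if not view_name or not database:
        raise ValueError("both the database and the view name are required")
    return f"SELECT * FROM {database}.{view_name}"


print(select_all("myGlobalTempView", GLOBAL_TEMP_DB))
print(select_all("myTempView", LOCAL_TEMP_DB))
```

The printed strings match the two example queries above and can be pasted into a custom query.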
User agent tagging
If the ODBC driver that you use supports the UserAgentEntry option, Spotfire includes the following string as the UserAgentEntry in queries:
TIBCOSpotfire/<ProductVersion>
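If your platform exposes the user agent value (for example, in query logs), the string format above can be used to identify queries that come from Spotfire. The Python sketch below is an illustration under that assumption. Where and how the value is recorded depends on your driver and server, and the function name is hypothetical.

```python
import re

# Matches the documented format TIBCOSpotfire/<ProductVersion>.
USER_AGENT_PATTERN = re.compile(r"TIBCOSpotfire/(\S+)")


def spotfire_version(user_agent_entry: str):
    """Return the product version from a UserAgentEntry value, or None if absent."""
    match = USER_AGENT_PATTERN.search(user_agent_entry)
    return match.group(1) if match else None
```

A value that does not contain the TIBCOSpotfire/ prefix yields None, which makes it easy to filter out queries from other clients.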