Accessing data from Google BigQuery
You can access and analyze your Google BigQuery data in Spotfire. This page introduces how to create a data connection to Google BigQuery, and provides information about how to work with BigQuery data in Spotfire.
Prerequisites
- Preparations in
Spotfire
- Before you start, read the System Requirements for Spotfire Connectors. Make sure your version of Spotfire includes the connector for Google BigQuery.
- If localhost is not accessible due to firewall settings, a proxy, or for other reasons, you must define a local redirect address template. For more information, see Defining a local redirect address template for your version of the Spotfire® Server and Environment - Installation and Administration manual.
- Google account
requirements
- To be able to access data from Google BigQuery in Spotfire, you need a Google account. You must be a member of a Google Cloud Platform project where BigQuery is enabled.
- In the Google project, make sure you have roles that enable you to view and query the BigQuery data.
- For more information on how to use BigQuery in your Google Cloud Platform project, see the official documentation from Google.
When you analyze your BigQuery data in Spotfire, remember that there is a cost for processing queries in Google BigQuery. When you create a connection, you select a Google Cloud Platform project as the Billing project. The project you select is charged for the data that Google BigQuery processes for each query.
The number, size, and complexity of queries that Spotfire sends to Google BigQuery depends on many factors. The way you configure your connection or analysis impacts how Spotfire queries the external database. For example, you can try to limit your data selection in the connection. If you select only the data that you want to analyze and leave out data that is not of interest, you can reduce the size of queries. Another factor is the load method for data tables. If you import a data table, only one query for the entire table is sent when you open or refresh the connection. You can also keep the data table external, for in-database analysis. Then Spotfire sends queries to the database continuously based on your actions in the analysis.
When you select
Import as the load method for your data from
BigQuery, Spotfire tries to access the data using the BigQuery Storage Read
API. This loading mechanism provides much faster import performance compared to
BigQuery's other retrieval methods. To use the Storage Read API, you need the
permission
bigquery.readsessions.create. When loading data
where you lack this permission, for example for public data, Spotfire will fall
back to BigQuery's other, slower retrieval methods.
To learn more about the BigQuery Storage Read API, see BigQuery Storage Read API overview from the Google Cloud documentation.
- Connector for Google BigQuery — Features and settings
You can connect to and access data from Google BigQuery. On this page, you can find information about the capabilities, available settings, and things to keep in mind when you work with data connections to Google BigQuery. - Repeated columns from Google BigQuery
You can access BigQuery data with repeated columns. A repeated column is a column that can contain multiple values per row. In Spotfire, data tables with repeated columns are flattened. - Credentials profiles for authentication in Google BigQuery data connections
For automatic authentication in Google BigQuery connections, set up a credentials profile with the credentials of a Google service account. - Private endpoints for Google BigQuery
Spotfire supports using private endpoints in data connections to Google BigQuery. Private endpoints are internal IP addresses in a Virtual Private Cloud (VPC) network that can be accessed directly and securely. Using private endpoints increases security as the traffic does not leave the VPC network.
Creating a new connection to Google BigQuery
To access your BigQuery data in Spotfire, create a connection to BigQuery. When you have created your connection, you can share it in the Spotfire library. Then you and your team members can reuse the connection for convenient access to the data you selected in the connection.
- Open the
Files and data
flyout, and click
Connect to.
- In the list of connectors, select Google BigQuery. Then click New connection.
- In the Google BigQuery Connection dialog, click to launch the Google login procedure in your web browser.
- In your web browser, follow the instructions to log in with your Google account. In the last step, click to allow the Spotfire application access to Google BigQuery. When you have logged in, return to Spotfire.
- In the Google BigQuery connection dialog, select the Billing project.
- Optionally, if you want to access datasets that you have access to in projects that you are not a member of, enter the project ID(s) under Include data from additional projects. Optionally, select if you want to Enable repeated columns and Show public data.
- To connect, click OK.
- In the Views in Connection
dialog, select the data you want to analyze in Spotfire. When you are happy
with your data selection, click
OK.
Tip: For more information, see Selecting data from Google BigQuery.
- The final step is a
summary view of the data you are about to add to your analysis. Here, you can
configure if you want the data tables to be loaded as
Imported,
External, or
On-demand.
Tip: For more information, see Choosing how to load the data.
- To add the selected data to your analysis, click OK.
Selecting data from Google BigQuery
When you create or edit a connection to Google BigQuery, you select the data you want to analyze in the Views in Connection dialog. The following is information, tips, and limitations that are specific to selecting data from BigQuery in Spotfire.
Accessing different datasets
You might have access to BigQuery datasets in Google Cloud Platform projects that you are a member of, and in projects that you are not a member of.
Datasets that you have access to in projects that you are a member of are always available in Spotfire, when you have signed in with Google.
To access datasets in projects that you are not a member of, you must specify that you want to include data from those projects. Enter the Google Cloud Platform project ID(s) in the Include data from additional projects field, in the Google BigQuery Connection dialog.
Standard SQL and Legacy SQL
In Google BigQuery, you might have worked in two different dialects
of SQL;
Standard SQL and
Legacy SQL. In Spotfire, only the
Standard SQL dialect of SQL is supported.
When you select data from Google BigQuery, this means that:
- You cannot select data
from BigQuery views that were defined using
Legacy SQL. Such views are not listed when you select data. - You can only write
custom queries in
Standard SQL.
Custom queries in BigQuery connections
- When you select data
from Google BigQuery with a custom query, you must write your query in
Standard SQL. - If the data you want to
select with a custom query contains nested columns, you cannot write
SELECT *. The columns returned from a custom query cannot be of the typeSTRUCTorARRAY. If there are nested columns in the data, you must unnest them in the query. For more information about unnesting, see the official documentation from Google. - Repeated columns from Google BigQuery are not supported in custom queries.
Repeated columns
You can analyze BigQuery data with repeated columns. To enable repeated columns, select Enable repeated columns when you create the connection
In Spotfire, the names of repeated columns from BigQuery are shown
inside brackets
[]:
[column_name]
Nested columns (records)
A record column in BigQuery is a column that has other columns nested inside it.
You can recognize nested columns and their relationships from the
column names in Spotfire. For example, if you have a record column
Person, that has the columns
Name and
Age with related information about the person nested
inside it, the column names in Spotfire would be:
Person.Name
Person.Age
The record column
Person itself is not listed in Spotfire. The reason
is that it is only a container for the nested columns inside it, and does not
contain any data on its own.
Partitioned tables
In Spotfire, you see the partitioning columns in partitioned tables as parameters. When you select a partitioned table, each partitioning column is represented by two parameters. The two parameters correspond to a to and from value for the partitioning column.

To input values for the parameters, you use prompts or on-demand. With prompts, you can input values for the parameters when you open the data connection. This way each user can choose to limit the data selection to only data within the range of interest.
If a partitioned table has mandatory partitioning columns, prompts are automatically created for the corresponding parameters in Spotfire. If you want to use on-demand as input for the parameters, you can remove the prompts for the parameters.
When you define a prompt for a parameter based on a partitioning column, only the prompt type single selection is available. For the end user, it is mandatory to enter a value in any prompts you have defined.
If you do not define a prompt for a parameter, the data is not limited based on that parameter.
Public data
You can decide if you want BigQuery public datasets to be available for selection when you connect to Google BigQuery. To enable public datasets in your connection, select the checkbox Show public data in the Google BigQuery connection dialog.
When you enable this option, you might see very long loading times for listing data tables the Views in Connection dialog. This is due to the large amounts of BigQuery public datasets.
Accessing Google Analytics data with BigQuery
You can access and analyze Google Analytics data in BigQuery.
To be able to access your Google Analytics data with the Google BigQuery connector in Spotfire, you must enable repeated columns in your data connection. Select Enable repeated columns in the Google BigQuery Connection dialog when you create or edit your connection.
To learn more, see Repeated columns from Google BigQuery.