Connect to a BigQuery Data Source

You can connect Team Studio to a GCP BigQuery data source.

Perform this task on the computer where Team Studio server is installed.

Prerequisites

You must have write access to the Team Studio server. You must also have access to the GCP BigQuery configuration files.

Procedure

Copy the BigQuery driver to the following directories. and then change the ownership of these copies to the user who runs Team Studio (usually user 'chorus').
- $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc_driver/Public
- $CHORUS_HOME/shared/libraries
Change the ownership of these copies to the user who runs Team Studio.

Usually, that user name is 'chorus'.
Create a new bigquery directory named $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/bigquery.
Copy the file driver.properties from the directory $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/default to the newly created bigquery directory.

Edit the contents of the file $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/bigquery/driver.properties as follows.

# Specify the JDBC class driver for the desired database type.
# Examples:
# Oracle = oracle.jdbc.driver.OracleDriver
# Greenplum = org.postgres#ql.Driver
# DB2 = com.ibm.db2.jcc.DB2Driver
# Netezza = org.netezza.Driver
# PostgreSQL = org.postgresql.Driver
# SQLServer = com.microsoft.sqlserver.jdbc.SQLServerDriver
# MySQL = com.mysql.jdbc.Driver
# Teradata = com.teradata.jdbc.TeraDriver
# Vertica = com.vertica.jdbc.Driver
# Sybase = com.sybase.jdbc2.jdbc.SybDriver
# Informix = com.informix.jdbc.IfxDriver
# SAPDB = com.sap.dbtech.jdbc.DriverSapDB
# InterBase = interbase.interclient.Driver
# HSqlDB = org.hsqldb.jdbcDriver
# MariaDB = org.mariadb.jdbc.Driver
# MySQL = com.mysql.jdbc.Driver
driverClass=com.simba.googlebigquery.jdbc42.Driver

# BigQuery (like Hive) does not support "schema"."tablename".
# For BigQuery, this entry must be empty string: identiferQuotation=
# with no whitespace (except newline) or characters after the equals sign
identifierQuotation=

Locate and open for editing the file additional_jdbc_drivers.rb.

The path is similar to path similar to /usr/chorus/install/releases/6.3.2.0.8068-7ac910ae3/components/core/app/mixins/sequel/extensions/additional_jdbc_drivers.rb

Add a line for the bigquery class so that the content resembles the following.

module Sequel
  module AdditionalJdbcDrivers
    MAP =  {
        mariadb: ->(db) { org.mariadb.jdbc.Driver },
        teradata: ->(db) { com.teradata.jdbc.TeraDriver },
        vertica: ->(db) { com.vertica.jdbc.Driver },
        hive2: ->(db) { org.apache.hive.jdbc.HiveDriver },
        hive: ->(db) { org.apache.hadoop.hive.jdbc.HiveDriver },
        bigquery: ->(db) { com.simba.googlebigquery.jdbc42.Driver }
        }
 
    MAP.each do |key, driver|
      ::Sequel::JDBC::DATABASE_SETUP[key] = driver
    end
  end
end

Important: You must apply this change to the file additional_jdbc_drivers.rb again after upgrading Team Studio.

Open the file $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/configuration/alpine.conf and add the following configuration parameter.
```
database.bigquery.batchSize (default = 1000)
```
Note: Due to idiosyncrasies in the way BigQuery handles batch updates, uploading a narrow table can result in an error. This configuration parameter addresses possible batch size issues.
Restart Team Studio.
Open the Add Data Source dialog box.
Provide the Data Source Type, the Data Source Name, and (optionally), the Description.
Set the data connection (JDBC URL) using a URL similar to the following.
Tip: You can copy your BigQuery URL from your GCP BigQuery page
```
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=teamstudio-user@teamstudio-alpine.iam.gserviceaccount.com
```
Specify Workspace Visibility.
A data source with Limited visibility must be manually associated with a workspace for members of that workspace to use the data source. By default, this option is set to Public.
Optional: Select Set database credentials as a shared account if you intend to allow all users to access the data source without using their own credentials.

Users access the database with your credentials as the data source owner. If you do not select this check box, each user must provide credentials for that data source to access it. You can check the box later if you change your mind.

BigQuery Data Source Connection Tests and Troubleshooting
If you encounter error messages or other problems with your data source connection, check this topic for suggestions for troubleshooting.

Connect to a Pivotal HAWQ Database

Connect to a Greenplum Database

Connect to an Oracle Database

Connect to a JDBC Data Source

Contents

Index

Search Results

Connect to a BigQuery Data Source

Prerequisites

Procedure