Hadoop Data Sources
These topics show you how to add a Hadoop data source from the command line or through the Team Studio user interface, and how to connect to various data sources.
- Adding a Hadoop Data Source from the Command Line
To add an HDFS data source, first make sure the Team Studio server can connect to the hosts, and then use the Add Data Source dialog box to add it to Team Studio.
- Adding a Hadoop Data Source from the User Interface
To add an HDFS data source, first make sure the Team Studio server can connect to the hosts, and then use the Add Data Source dialog box to add it to Team Studio.
- Connecting to a Hive Data Source on Hadoop
You can create a Hive data source natively on Hadoop, without using JDBC. A native connection is much faster than connecting to Hive over JDBC, and it supports running HiveQL queries with the HQL Execute operator.
- Connect to a MapR 4.x Data Source
This topic describes how to configure Team Studio to connect to a MapR 4.x data source.
- Connect to a Pivotal Hadoop (PHD) Data Source
You might be required to add an additional parameter to configure a Team Studio data source to connect to PHD 3.0.
- Connect to a YARN-Enabled Data Source
To connect to a YARN-enabled cluster, Team Studio must have access to the following ports on each node of the cluster:
- Hadoop Data Source Connection Tests and Troubleshooting
Team Studio provides a variety of tests that verify connectivity to a Hadoop data source and help you troubleshoot connection problems.
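
Several of the topics above ask you to confirm that the Team Studio server can reach the cluster hosts before you add the data source. The following is a minimal sketch of one way to check that from the server, assuming Python 3 is available there. It is not part of Team Studio; the function names, host names, and ports are hypothetical placeholders, so substitute the values your cluster actually uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it is reachable."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}


if __name__ == "__main__":
    # Hypothetical hosts and ports; replace them with your cluster's values.
    endpoints = [
        ("namenode.example.com", 8020),
        ("resourcemanager.example.com", 8032),
    ]
    for (host, port), ok in check_endpoints(endpoints).items():
        print(f"{host}:{port} -> {'reachable' if ok else 'NOT reachable'}")
```

A successful TCP connection only shows that the port is open from the Team Studio server. It does not confirm that the Hadoop service behind it is healthy or that credentials are valid, so the built-in connection tests remain the authoritative check.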
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.