Connecting Team Studio to Data Sources

Review and follow these steps to connect your installation of Team Studio to your data sources.

Perform this task on the computer where you have installed Team Studio.

Prerequisites

Test network connectivity and configure the Team Studio server.

Procedure

  1. Enable web sockets.
    Verify that web sockets are correctly enabled by using a web socket test.
  2. Access the cluster nodes, including the NameNode and DataNodes for Hadoop.
    Verify that you can connect to each node by running the command $ telnet hostname port, where hostname and port identify the node and the service port you need to reach.
  3. Enable read and write permissions for the appropriate directories, including /tmp for Hadoop.
    Verify this step by writing to a file in one of those directories and running a MapReduce job, if applicable.
  4. Ensure that the appropriate agent is enabled for your data source.
  5. Configure the necessary ports in $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/configuration/alpine.conf.
  6. If you are using Spark, ensure the following:
    • The Spark host is added in $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/configuration/alpine.conf.
      alpine.spark.sparkAkka.akka.remote.netty.tcp.hostname = IP address for Team Studio Server
    • Full communication is open between the Team Studio server and all cluster nodes.
  7. If applicable, ensure that the Team Studio server can access the LDAP server.
    Verify that you can connect by using $ telnet hostname port.
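
The telnet checks in the steps above can also be scripted when there are many nodes to test. The following is a minimal sketch, not part of Team Studio: the function names can_connect and check_endpoints are hypothetical, and it assumes that a successful TCP connection is an adequate stand-in for a successful telnet session. It confirms only that the port accepts connections, not that the service behind it is configured correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: roughly equivalent to checking that
    `telnet host port` connects.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(endpoints, timeout: float = 3.0) -> dict:
    """Map each (host, port) pair to True (reachable) or False (unreachable)."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host, port in endpoints
    }
```

Run it on the Team Studio server with your own host names and ports, for example check_endpoints([("namenode.example.com", 8020), ("ldap.example.com", 389)]). These hosts and ports are placeholders; use the values that apply to your cluster and LDAP server.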

What to do next

Connect to either a database data source or a Hadoop data source.