Adding the Data Source to Team Studio

After configuring the Team Studio server, add the data source to Team Studio.

Perform this task on a computer where Team Studio is installed.

Prerequisites

To perform this task, you must be either a data administrator or an application administrator. If you do not have administrator permissions, talk to your administrator to obtain credentials before continuing.

Procedure

  1. In the Team Studio user interface, click Data in the menu.
  2. Click Add Data Source.
    The Add Data Source dialog box is displayed.
  3. From the Data Source Type list box, select Hadoop Cluster.
  4. Specify the required properties.
    • Data Source Name
    • Name Node Host
    • Name Node Port
    • Resource Manager Host
    • Resource Manager Port
  5. Specify the Data Source User as serviceuser.
    You can specify another name as needed.
  6. Specify the Group to which serviceuser (or the user you specified in the previous step) belongs.
  7. Click Configuration Connection Parameters, and then specify additional parameters, as needed.
    Specify these parameters as key-value pairs, using values that match your cluster configuration.
    Note: In the following examples, MYREALM is the Kerberos realm, and /home/chorus/thisismy.keytab refers to the location of your keytab file on the Team Studio host.
    Key                                     Value (to modify)
    yarn.app.mapreduce.am.staging-dir       /user
    yarn.resourcemanager.scheduler.address  123.45.6.7:8030
    mapreduce.jobhistory.principal          mapred/_HOST@MYREALM
    hadoop.security.authentication          kerberos
    dfs.datanode.kerberos.principal         hdfs/_HOST@MYREALM
    dfs.namenode.kerberos.principal         hdfs/_HOST@MYREALM
    yarn.resourcemanager.principal          yarn/_HOST@MYREALM
    alpine.principal                        chorus/[fully qualified domain name of the host where the keytab was generated]@MYREALM
    alpine.keytab                           /home/chorus/thisismy.keytab
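
    Because most of these values differ only by realm, host, and keytab path, it can help to generate the pairs before typing them into the dialog box. The following Python sketch is one way to do that. It is not part of Team Studio: the function name build_connection_params and the host name teamstudio.example.com are hypothetical, and the keys are simply the ones listed above.

```python
def build_connection_params(realm, keytab_path, keytab_host,
                            scheduler_address, staging_dir="/user"):
    """Return the key-value pairs listed above, filled in for one cluster.

    Hypothetical helper for illustration only; it does not call Team Studio.
    """
    return {
        "yarn.app.mapreduce.am.staging-dir": staging_dir,
        "yarn.resourcemanager.scheduler.address": scheduler_address,
        "mapreduce.jobhistory.principal": f"mapred/_HOST@{realm}",
        "hadoop.security.authentication": "kerberos",
        "dfs.datanode.kerberos.principal": f"hdfs/_HOST@{realm}",
        "dfs.namenode.kerberos.principal": f"hdfs/_HOST@{realm}",
        "yarn.resourcemanager.principal": f"yarn/_HOST@{realm}",
        # keytab_host is the fully qualified domain name of the host
        # where the keytab was generated.
        "alpine.principal": f"chorus/{keytab_host}@{realm}",
        "alpine.keytab": keytab_path,
    }


params = build_connection_params(
    realm="MYREALM",
    keytab_path="/home/chorus/thisismy.keytab",
    keytab_host="teamstudio.example.com",  # assumed host name, replace with yours
    scheduler_address="123.45.6.7:8030",
)

# Print one "key value" pair per line, ready to copy into the dialog box.
for key, value in params.items():
    print(f"{key} {value}")
```

    Replace the example arguments with the realm, keytab location, and Resource Manager scheduler address for your own cluster before copying the output.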