Step-by-Step Guide to Configuring Kerberos in the Team Studio Client

After you complete the prerequisites, follow these steps to configure Kerberos in the Team Studio client.

Perform this task on the server where Team Studio is installed.

Prerequisites

Make sure you have met all of the Prerequisites for Configuring Kerberos in the Team Studio Client.

Procedure

  1. On the Team Studio server, install the Kerberos client packages by running the following commands as root. Note that this step is completed by Cloudera Manager if the Team Studio machine is running on a cluster node.
    yum install krb5-devel.x86_64
    yum install krb5-libs krb5-workstation
  2. Copy the krb5.conf file from the Kerberos server (not from a Kerberos client) and save it as /etc/krb5.conf on the Team Studio server; the same path is used for both MIT KDC and AD KDC. The following krb5.conf example uses AD KDC:
    [logging]
    default = FILE:/var/log/krb5libs.log
    kdc = FILE:/var/log/krb5kdc.log
    admin_server = FILE:/var/log/kadmind.log
    
    [libdefaults]
    default_realm = ALPINENOW.LOCAL
    dns_lookup_kdc = true
    dns_lookup_realm = false
    ticket_lifetime = 86400
    renew_lifetime = 604800
    forwardable = true
    default_tgs_enctypes = RC4-HMAC
    default_tkt_enctypes = RC4-HMAC
    permitted_enctypes = RC4-HMAC
    udp_preference_limit = 1
    
    [realms]
    ALPINENOW.LOCAL = {
      kdc = 10.0.0.109
      admin_server = 10.0.0.109
    }
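    A quick sanity check of the copied file can catch a mismatched realm early. The sketch below is our own illustration, not part of the product: it writes a trimmed copy of the example above to a temporary file (on a real server, point CONF at /etc/krb5.conf instead) and confirms that default_realm has a matching [realms] stanza:

```shell
# Illustration only: use a temp file so the check is self-contained.
# On the Team Studio server, set CONF=/etc/krb5.conf instead.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
[libdefaults]
default_realm = ALPINENOW.LOCAL
[realms]
ALPINENOW.LOCAL = {
kdc = 10.0.0.109
}
EOF

# Extract the realm and verify it appears under [realms].
REALM=$(awk '/^[[:space:]]*default_realm/ {print $3}' "$CONF")
echo "default_realm: $REALM"
if grep -q "^[[:space:]]*$REALM[[:space:]]*=[[:space:]]*{" "$CONF"; then
    echo "found a [realms] entry for $REALM"
else
    echo "WARNING: no [realms] entry for $REALM"
fi
```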
  3. Ensure that the user has full permissions on the HDFS directories /tmp, /tmp/tsds_out, /user, /user/chorus, and any other HDFS directories you plan to use.
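    The permission grants in this step can be scripted. The following is a hedged sketch, not a product tool: the directory list comes from this step, and chmod 777 is the bluntest possible grant (a production cluster would usually scope permissions to the job user instead). It is guarded so it does nothing on machines without the hadoop CLI:

```shell
# Hypothetical helper: open up the HDFS directories listed in this step.
# Run it as an HDFS superuser. Guarded so it is a no-op where the
# hadoop CLI is not installed.
if command -v hadoop >/dev/null 2>&1; then HAVE_HADOOP=yes; else HAVE_HADOOP=no; fi

if [ "$HAVE_HADOOP" = "yes" ]; then
    for DIR in /tmp /tmp/tsds_out /user /user/chorus; do
        hadoop fs -mkdir -p "$DIR"
        hadoop fs -chmod 777 "$DIR"   # blunt; prefer user-scoped permissions in production
    done
else
    echo "hadoop CLI not found; run this on the Team Studio server"
fi
```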
  4. View the current ticket status using:
    klist -e
  5. Generate the principal for your client using option A, B, or C:
    • Option A: Run kadmin.local from the Kerberos server to generate principal for your client:
      kadmin.local
      #addprinc -randkey [username]/[servername]@ALPINE
      #xst -norandkey -k client.keytab [username]/[servername]@ALPINE

      The user who runs kinit should be the same user who runs the jobs. For example, if Hadoop jobs run as the user jenkins on host.local, run the following:

      kadmin.local
      #addprinc -randkey jenkins/host.local@ALPINE
      #xst -norandkey -k /root/keytab/client/myjenkinskeytab.keytab jenkins/host.local@ALPINE
      Caution: If the kadmin.local command is run from any machine other than the KDC server, repeat these steps:

      • Run kinit as the root user to generate a ticket.

      • Run klist to view the ticket status.

      • Run kadmin.local from the KDC server to generate the principal.

    • Option B: Use the existing keytab file if one exists. This is the file you use to authenticate to the Kerberos server.
    • Option C: Use kadmin in place of kadmin.local. Keep in mind that for some KDC versions, kadmin does not support -norandkey; in that case the keytab file is /etc/krb5.keytab and the password changes every time you run xst within kadmin.
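    For scripting, Option A can also be run non-interactively: kadmin.local accepts a single query through its -q flag. The principal and keytab path below are the example values from Option A; the guard is our own addition so the sketch is a no-op off the KDC:

```shell
# Non-interactive form of Option A (run on the KDC server as root).
PRINC=jenkins/host.local@ALPINE
KEYTAB=/root/keytab/client/myjenkinskeytab.keytab

if command -v kadmin.local >/dev/null 2>&1; then
    kadmin.local -q "addprinc -randkey $PRINC"
    kadmin.local -q "xst -norandkey -k $KEYTAB $PRINC"
else
    echo "kadmin.local not found; run this on the KDC server"
fi
```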
  6. Copy the keytab files to your client server.
    scp myjenkinskeytab.keytab root@host.local:/home/jenkins/keytab/
  7. View the principal of your keytab file. You can do this with any user who has permission to access the keytab file.
    klist -e -k -t /home/jenkins/keytab/myjenkinskeytab.keytab
  8. Run kinit with your created credential:
    kinit -k -t /home/jenkins/keytab/myjenkinskeytab.keytab jenkins/host.local@ALPINE
  9. Add the kinit command to the crontab of the user who runs the jobs, so that the ticket is renewed every day:
    crontab -e
    0 6 * * * kinit -k -t /home/jenkins/keytab/myjenkinskeytab.keytab jenkins/host.local@ALPINE
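    A bare kinit in cron fails silently if the keytab moves or the principal changes. As a hedged alternative (the script name, log path, and values below are our own examples, not product defaults), the crontab entry can call a small wrapper that logs each outcome:

```shell
#!/bin/sh
# Hypothetical renewal wrapper: save it, for example, as
# /usr/local/bin/renew-ticket.sh and point the crontab entry at it.
# Keytab path and principal are the example values from this step.
KEYTAB=/home/jenkins/keytab/myjenkinskeytab.keytab
PRINCIPAL=jenkins/host.local@ALPINE
LOG=${LOG:-/tmp/kinit-renew.log}

if ! command -v kinit >/dev/null 2>&1; then
    echo "$(date): kinit not found; install krb5-workstation" >>"$LOG"
elif kinit -k -t "$KEYTAB" "$PRINCIPAL" 2>>"$LOG"; then
    echo "$(date): renewed ticket for $PRINCIPAL" >>"$LOG"
else
    echo "$(date): kinit failed for $PRINCIPAL" >>"$LOG"
fi
```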
  10. Verify user permissions:
    • Hadoop account users should have read/write permissions to the /tmp directory in HDFS:
      hadoop fs -ls /
      drwxrwxrwx   - jenkins    superuser     0 2014-08-09 17:53 /tmp
    • Take note of the user running the kinit command. In this example, the user jenkins has permission to access the keytab file.
    • jenkins should have read/write permissions to the /user and /tmp/tsds_out directories in HDFS.
    • jenkins is configured in the later steps for job tracker in the Hadoop data source.
  11. Consider special instructions for specific Hadoop distributions. To run Team Studio on MapR, the Alpine host must have the MapR client installed.
  12. Verify the connection between the Team Studio server and each node of the cluster.
    Use telnet:
    telnet namenode 8020
    telnet: connect to address 10.0.0.xx: Connection refused # namenode is down or a firewall is blocking the port
    telnet namenode 8020 # after starting the namenode
    Connected to namenode. # the Team Studio server can communicate with the namenode

    If the connections fail, consider removing firewall restrictions by using the iptables service.
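    When telnet is not installed, the same reachability check can be done with bash's built-in /dev/tcp device. The node names below are placeholders for your cluster hosts; this is our own sketch, not a product tool:

```shell
#!/bin/bash
# Hypothetical probe of the HDFS RPC port on each node using /dev/tcp
# (a bash built-in path; no telnet required).
check_port() {
    # $1 = host, $2 = port; succeeds only if a TCP connection opens
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

for HOST in namenode datanode1 datanode2; do
    if check_port "$HOST" 8020; then
        echo "$HOST:8020 reachable"
    else
        echo "$HOST:8020 NOT reachable"
    fi
done
```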

  13. Troubleshoot the following:
    • Log locations:
      [root@node3 hadoop]# pwd  # this node contains both datanode and secondary namenode
      /var/log/gphd/hadoop
      [root@node3 hadoop]# ls
      hadoop-hdfs-datanode-node3.host.local.log   
      hadoop-hdfs-secondarynamenode-node3.host.local.log 

      The file ending in .log is the most recent log on the node.

    • Time sync: Verify that the time on the Team Studio server is the same as that for the kdc server and Kerberos clients.
    • Kerberos with HA (high availability): For kerberized clusters with HA enabled, try connecting to the active namenode before configuring both namenodes.
  14. Configure a Kerberos-Enabled Hadoop data source.