Step-by-Step Guide to Configuring Kerberos in the Team Studio Client
After you complete the prerequisites, follow these steps to configure Kerberos in the Team Studio client.
Prerequisites
Procedure
- On the Team Studio server, install the Kerberos client packages by running the following commands as root. Note that this step is completed by Cloudera Manager if the Team Studio machine is running on a cluster node.
yum install krb5-devel.x86_64
yum install krb5-libs krb5-workstation
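Optionally, you can confirm that the packages installed correctly and that the client tools are on the PATH. This is a minimal sanity check, not part of the official procedure:
rpm -q krb5-devel krb5-libs krb5-workstation
which kinit klist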
- Copy the krb5.conf file from the Kerberos server (not the Kerberos client) and save it as /etc/krb5.conf on the Team Studio server. The same location applies whether you use an MIT KDC or an AD KDC. The following krb5.conf example uses an AD KDC:
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log
[libdefaults]
 default_realm = ALPINENOW.LOCAL
 dns_lookup_kdc = true
 dns_lookup_realm = false
 ticket_lifetime = 86400
 renew_lifetime = 604800
 forwardable = true
 default_tgs_enctypes = RC4-HMAC
 default_tkt_enctypes = RC4-HMAC
 permitted_enctypes = RC4-HMAC
 udp_preference_limit = 1
[realms]
 ALPINENOW.LOCAL = {
  kdc = 10.0.0.109
  admin_server = 10.0.0.109
 }
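One quick way to verify the copied configuration is to obtain a ticket for a known principal in the realm; here testuser is a hypothetical principal used only for illustration:
kinit testuser@ALPINENOW.LOCAL
klist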
- Ensure that the user has full permissions on the HDFS directories /tmp, /tmp/tsds_out, /user, /user/chorus, and any other required HDFS directories.
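One way to grant these permissions is sketched below. Run it as the HDFS superuser; the chorus user and group names are assumptions based on the directory name, and you should tighten the modes to whatever your security policy allows:
hadoop fs -mkdir -p /tmp/tsds_out
hadoop fs -chmod 777 /tmp /tmp/tsds_out
hadoop fs -chown chorus:chorus /user/chorus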
- View the current ticket status:
klist -e
- Generate the principal for your client using option A, B, or C:
- Option A: Run kadmin.local on the Kerberos server to generate the principal for your client:
kadmin.local
#addprinc -randkey [username]/[servername]@ALPINE
#xst -norandkey -k client.keytab [username]/[servername]@ALPINE
The user who runs kinit should be the user who runs the jobs. For example, if we run Hadoop jobs as the jenkins user on host.local, we run the following:
kadmin.local
#addprinc -randkey jenkins/host.local@ALPINE
#xst -norandkey -k /root/keytab/client/myjenkinskeytab.keytab jenkins/host.local@ALPINE
- Option B: Use the existing keytab file, if one exists. You use this file to connect to the Kerberos server for authentication.
- Option C: Use kadmin in place of kadmin.local. Keep in mind that for some KDC versions, kadmin does not support -norandkey; in that case the keytab file is /etc/krb5.keytab, and the password changes every time you run xst within kadmin.
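To confirm that the principal was created, you can list matching principals on the KDC; the jenkins* pattern matches the example above:
kadmin.local -q "listprincs jenkins*"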
- Copy the keytab file to your client server:
scp myjenkinskeytab.keytab root@host.local:/home/jenkins/keytab/
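Because the keytab grants access to the principal's credentials, it is good practice to restrict it to the job user after copying. A minimal sketch, assuming the jenkins user from the example:
chown jenkins:jenkins /home/jenkins/keytab/myjenkinskeytab.keytab
chmod 600 /home/jenkins/keytab/myjenkinskeytab.keytab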
- View the principal of your keytab file. You can do this as any user who has permission to access the keytab file.
klist -e -k -t /home/jenkins/keytab/myjenkinskeytab.keytab
- Run kinit with your created credential:
kinit -k -t /home/jenkins/keytab/myjenkinskeytab.keytab jenkins/host.local@ALPINE
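To confirm that a ticket-granting ticket was obtained, run klist and check for a krbtgt entry for your realm. If the principal looks wrong, run kdestroy and repeat the kinit:
klist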
- Add your kinit command to crontab, under the user who runs the jobs, so that the ticket is renewed every day:
crontab -e
0 6 * * * kinit -k -t /home/jenkins/keytab/myjenkinskeytab.keytab jenkins/host.local@ALPINE
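If you prefer to script this rather than editing the crontab interactively, a non-interactive equivalent (run as the job user) is:
( crontab -l 2>/dev/null; echo '0 6 * * * kinit -k -t /home/jenkins/keytab/myjenkinskeytab.keytab jenkins/host.local@ALPINE' ) | crontab -
crontab -l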
- Verify user permissions:
- Hadoop account users should have read/write permissions to the /tmp directory in HDFS:
hadoop fs -ls /
drwxrwxrwx   - jenkins superuser          0 2014-08-09 17:53 /tmp
- Take note of the user running the kinit command. In this example, the user jenkins has permission to access the keytab file.
- jenkins should have read/write permissions to the /user and /tmp/tsds_out directories in HDFS (see the write-permission check after this list).
- In later steps, jenkins is configured as the job tracker user in the Hadoop data source.
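A quick write-permission check, run as the job user (jenkins in this example), is to create and remove an empty file in each directory:
hadoop fs -touchz /tmp/tsds_out/.perm_check
hadoop fs -rm /tmp/tsds_out/.perm_check
hadoop fs -touchz /user/.perm_check
hadoop fs -rm /user/.perm_check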
- Consider special instructions for specific Hadoop distributions. To run Team Studio on MapR, the Alpine host must have the MapR client installed.
- Verify the connection between the Team Studio server and each node of the cluster using telnet:
telnet namenode 8020
telnet: connect to address 10.0.0.xx: Connection refused   # namenode is down or firewall is up
telnet namenode 8020                                       # after turning on namenode
Connected to namenode.                                     # Team Studio server can communicate with namenode
If the connections are not accessible, consider relaxing firewall restrictions in the iptables service, as sketched below.
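A minimal sketch for opening the NameNode RPC port on a server that uses iptables; 8020 is the port from the telnet example above, and your security policy might call for narrower rules:
iptables -L -n
iptables -I INPUT -p tcp --dport 8020 -j ACCEPT
service iptables save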
- Troubleshoot the following:
- Log locations:
[root@node3 hadoop]# pwd     # this node contains both datanode and secondary namenode
/var/log/gphd/hadoop
[root@node3 hadoop]# ls
hadoop-hdfs-datanode-node3.host.local.log
hadoop-hdfs-secondarynamenode-node3.host.local.log
The .log file is the most recent log on the node.
- Time sync: Verify that the time on the Team Studio server matches that of the KDC server and the Kerberos clients (a quick check is sketched after this list).
- Kerberos with HA (high availability): For kerberized clusters with HA enabled, try connecting to the active namenode before configuring both namenodes.
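A quick clock comparison, run from the Team Studio server; kdc-host is a placeholder for your KDC host name, and ntpdate is one of several ways to force a one-off sync:
date; ssh kdc-host date
ntpdate -u pool.ntp.org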
- Configure a Kerberos-Enabled Hadoop data source.