HDFS Connection
The HDFS Connection shared resource contains all the parameters required to connect to HDFS. It can be used by the HDFS Operation, ListFileStatus, Read, and Write activities, and by the HCatalog Connection shared resource.
General
In the General panel, you can specify the package that stores the HDFS Connection shared resource and the shared resource name.
The following table lists the fields in the General panel of the HDFS Connection shared resource:
HDFSConnection
In the HDFSConnection Configuration panel, you can provide the information required to connect the plug-in to HDFS. You can also connect to a Kerberized HDFS server.
The following table lists the fields in the HDFSConnection panel of the HDFS Connection shared resource:
Field | Module Property? | Description |
---|---|---|
HDFS Url | Yes | The WebHDFS URL that is used to connect to HDFS. The default value is http://localhost:50070. The plug-in supports HttpFS and HttpFS with SSL. You can enter an HttpFS URL with HTTP or HTTPS in this field, for example http://httpfshostname:14000 or https://httpfshostname:14000. |
User Name | Yes | The unique user name that is used to connect to HDFS. |
Enable Kerberos | No | Select this check box to connect to a Kerberized HDFS server. Note: If your server uses AES-256 encryption, you must install the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files on your machine. For more details, see Installing JCE Policy Files. |
Kerberos Method | No | The Kerberos authentication method that is used to authorize access to HDFS. Select an authentication method from the list, which includes Password and Keytab. This field is displayed only when you select the Enable Kerberos check box. |
Kerberos Principal | Yes | The Kerberos principal name that is used to connect to HDFS. This field is displayed only when you select the Enable Kerberos check box. |
Kerberos Password | Yes | The password for the Kerberos principal. This field is displayed only when you select Password from the Kerberos Method list. |
Kerberos Keytab | Yes | The keytab that is used to connect to HDFS. This field is displayed only when you select Keytab from the Kerberos Method list. |
Test Connection
You can click Test Connection to test whether the specified configuration fields result in a valid connection.
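How the plug-in performs this check is not described here. As a rough, independent way to probe the same settings from outside the plug-in, the sketch below builds a WebHDFS `GETFILESTATUS` request for the root directory from the HDFS Url and User Name values. It assumes the endpoint exposes the standard `/webhdfs/v1` REST path and uses simple (non-Kerberos) authentication. The function names `build_status_url` and `probe` are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_status_url(hdfs_url: str, user_name: str, path: str = "/") -> str:
    """Build a WebHDFS GETFILESTATUS URL from the HDFS Url and User Name fields."""
    base = hdfs_url.rstrip("/")
    # user.name is the WebHDFS query parameter for simple authentication.
    query = urlencode({"op": "GETFILESTATUS", "user.name": user_name})
    return f"{base}/webhdfs/v1{quote(path)}?{query}"


def probe(hdfs_url: str, user_name: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers the status request with HTTP 200."""
    try:
        with urlopen(build_status_url(hdfs_url, user_name), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so an unreachable
        # host, a refused connection, or an error response all count as failure.
        return False
```

For example, `probe("http://localhost:50070", "hdfs")` would query the default URL listed in the table; the user name `hdfs` is only a placeholder.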
Setting up High Availability
You can set up high availability for your cluster in this panel. To do so, enter two URLs as comma-separated values, with no space between the comma and the second URL, in the HDFS Url field under the HDFS Connection section of this panel. The plug-in treats the first entry as the primary node and the second entry as the secondary node. If the primary node goes down, the plug-in automatically connects and routes requests to the secondary node.
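The primary/secondary behavior can be illustrated with a minimal sketch. This is not the plug-in's implementation; it only shows the general pattern of splitting the comma-separated field value and falling back to the second URL when the first is unreachable. The names `split_ha_urls` and `call_with_failover`, and the host names in the usage note, are hypothetical.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def split_ha_urls(field_value: str) -> List[str]:
    """Split the HDFS Url field into [primary] or [primary, secondary]."""
    # Stray whitespace is tolerated here for convenience only; the field
    # itself expects no space after the comma.
    urls = [u.strip() for u in field_value.split(",") if u.strip()]
    if not 1 <= len(urls) <= 2:
        raise ValueError("expected one URL or two comma-separated URLs")
    return urls


def call_with_failover(field_value: str, request: Callable[[str], T]) -> T:
    """Try the primary URL first, then the secondary if the primary fails."""
    last_error = None
    for url in split_ha_urls(field_value):
        try:
            return request(url)
        except OSError as exc:  # connection-level failures
            last_error = exc
    raise ConnectionError("no configured node is reachable") from last_error
```

With a field value such as `http://nn1:50070,http://nn2:50070`, the `request` callable is invoked with the first URL and, only if that raises a connection error, with the second.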
To check the status of a node, use the API <HDFS URL>/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus. For example: http://cdh571.na.tibco.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
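The status check can also be scripted. The sketch below builds the query URL shown above and extracts the node state from the JSON response. It assumes the response has the usual Hadoop JMX shape, a `beans` array whose `NameNodeStatus` entry carries a `State` field such as `active` or `standby`; verify this against your cluster's actual output. The function names are hypothetical.

```python
import json
from typing import Optional
from urllib.request import urlopen

BEAN_NAME = "Hadoop:service=NameNode,name=NameNodeStatus"


def status_url(hdfs_url: str) -> str:
    """Append the NameNodeStatus JMX query to a node's base URL."""
    return hdfs_url.rstrip("/") + "/jmx?qry=" + BEAN_NAME


def parse_state(jmx_json: str) -> Optional[str]:
    """Return the State value of the NameNodeStatus bean, or None if absent."""
    # Assumed response shape: {"beans": [{"name": ..., "State": ...}]}
    for bean in json.loads(jmx_json).get("beans", []):
        if bean.get("name") == BEAN_NAME:
            return bean.get("State")
    return None


def fetch_state(hdfs_url: str, timeout: float = 5.0) -> Optional[str]:
    """Query a node over HTTP and return its reported state."""
    with urlopen(status_url(hdfs_url), timeout=timeout) as resp:
        return parse_state(resp.read().decode("utf-8"))
```

Calling `fetch_state` on each of the two URLs configured for high availability shows which node currently reports itself as active.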