Importing Hadoop Datasets

Follow this procedure to import a Hadoop dataset into a workspace's sandbox.

Prerequisites

You must be a member of at least one workspace that has a sandbox.

Procedure

  1. Browse a Hadoop data source to list its directories and files.
  2. Select the CSV file that you want to import, then choose Create as an External Table in the contextual sidebar.
  3. Use the Select workspace drop-down menu to browse a list of workspaces you are a member of. Only workspaces with sandboxes are displayed.
  4. In the Table name box, enter a name for the new table. The name must be valid for your database provider.
  5. Team Studio attempts to determine which delimiter your CSV file uses. If the file uses a non-standard delimiter, or if the detected delimiter is wrong, use the Delimiter command to choose the correct one.
  6. A preview of the data in tabular format appears. Verify that this format is correct, then click Create External Table.
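
If you want to check a file before importing it, steps 4 through 6 can be approximated locally. The following Python sketch is not part of Team Studio and does not reproduce its detection logic: it uses the standard csv.Sniffer heuristic to guess a delimiter, prints nothing on its own, and returns the first rows for inspection. The function names, the candidate delimiter list, and the table-name rule (a conservative subset of PostgreSQL-style unquoted identifiers) are assumptions made for illustration; check your database provider's own naming rules.

```python
import csv
import io
import re

# Assumption: a conservative subset of PostgreSQL-style unquoted identifiers
# (letter or underscore first, then letters, digits, or underscores, at most
# 63 characters). Other database providers may apply different rules.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def is_plausible_table_name(name):
    """Return True if name matches the assumed identifier pattern."""
    return _IDENTIFIER.fullmatch(name) is not None


def guess_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter of a CSV text sample with csv.Sniffer.

    Raises csv.Error when no candidate delimiter fits the sample.
    """
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


def preview(sample, delimiter, limit=5):
    """Return up to `limit` parsed rows so the column split can be checked."""
    reader = csv.reader(io.StringIO(sample), delimiter=delimiter)
    return [row for _, row in zip(range(limit), reader)]
```

Pass the first few lines of the file to guess_delimiter, then compare the rows returned by preview with the preview that Team Studio shows in step 6. If the two disagree, set the delimiter explicitly in step 5.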

Result

The new external table is created in the sandbox schema of the workspace you chose.