Working with the Sample Projects
The plug-in installer includes two sample projects that show how ActiveMatrix BusinessWorks Plug-in for Big Data works.
After installing the plug-in, you can find the samples.zip file in the TIBCO_HOME/bw/palettes/bigdata/version_number/Samples directory.
The following list shows the processes in the sample projects. The activities of each process are described in the table that follows the process description:
- DemoWorkflow.bwp
The DemoWorkflow.bwp process is the main process of the DataCollectionandQuery project. It shows how to operate on HDFS files, transfer files between HDFS and a local directory, create a Hive table, and query data.
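The Hive steps of this process can be pictured as two HiveQL statements: one that creates an external table over the directory that holds customers.csv, and one that queries 100 records. The following Python sketch only builds those statements as strings. The column layout is hypothetical (this document does not give the schema of customers.csv), the helper names are illustrative, and the leading slash in the HDFS path is an assumption; the statements that the sample project actually issues may differ.

```python
# Hypothetical column layout: the real schema of customers.csv is not
# given in this document.
ASSUMED_COLUMNS = [("id", "INT"), ("name", "STRING"), ("city", "STRING")]


def create_external_table_sql(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement over comma-separated files."""
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({column_list}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"LOCATION '{location}'"
    )


def query_first_rows_sql(table, limit=100):
    """Build a query that returns at most `limit` rows from the table."""
    return f"SELECT * FROM {table} LIMIT {limit}"


# The absolute form of the HDFS directory is an assumption.
create_sql = create_external_table_sql("customers", ASSUMED_COLUMNS, "/user/hdfs/bwdemo")
query_sql = query_first_rows_sql("customers")
print(create_sql)
print(query_sql)
```

Because the table is external, dropping it in Hive leaves the underlying files in HDFS in place, which is why the sample can remove and recopy customers.csv independently of the table definition.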
The DemoWorkflow.bwp process contains the following activities:
| Activity | Description |
| --- | --- |
| Timer | Starts the process when the specified time interval expires. |
| RemoveExistingFiles | Deletes the customers.csv file from the user/hdfs/bwdemo directory in HDFS. |
| CopyDataToHDFS | Copies the customers.csv file from the TIBCO_HOME/bw/palettes/bigdata/version_number/Samples/samples.zip/SampleData directory to the user/hdfs/bwdemo directory in HDFS. |
| CreateHiveTableRefertToData | Creates an external Hive table named customers in HDFS. |
| CheckError | Runs the ErrorHandler.bwp subprocess. If no error occurs while the ErrorHandler.bwp process is running, the process goes to the HiveQuery activity. If any error occurs, the process goes to the Generate-Error activity. |
| HiveQuery | Queries 100 records from the customers Hive table created by the CreateHiveTableRefertToData activity. |
| Generate-Error | Generates error messages from the Output tab of the ErrorHandler.bwp subprocess. |
| Catch | Catches error messages and displays them in the Output tab when any error occurs in the DemoWorkflow.bwp process. |
| End | Ends the process. |

- ErrorHandler.bwp
The ErrorHandler.bwp process is a subprocess of the DemoWorkflow.bwp process.
The ErrorHandler.bwp subprocess contains the following activities:
| Activity | Description |
| --- | --- |
| Start | Starts the process. |
| WaitForJobCompletion | Waits for the CreateHiveTableRefertToData activity to complete the job. |
| ReadResult | Reads the output of the job executed by the CreateHiveTableRefertToData activity, which is in the user/hdfs/hive/createstatus/stdout directory. |
| ReadError | Reads the error messages of the job executed by the CreateHiveTableRefertToData activity, which are in the user/hdfs/hive/createstatus/stderr directory. |
| Reply | Sends a message in response to the ErrorHandler.bwp process and the ErrorHandler operation. |
| Reply1 | Sends a message in response to the ErrorHandler.bwp process and the ErrorHandler operation. |
| Generate Error | Generates error messages from the Output tab of the ReadError activity. |
| Catch | Catches error messages and displays them in the Output tab when any error occurs in the ErrorHandler.bwp subprocess. |
| RenderXML | Renders the error messages as an XML string. |
| End | Ends the process. |

- ProxyMapReduceWordCount.bwp
The ProxyMapReduceWordCount process shows how to submit a proxy MapReduce job to the Oozie server and wait until the job completes. Based on the job status, the process writes the job execution result to the log.
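The submit-then-wait pattern described here can be sketched as a polling loop. The Python example below is a minimal illustration, not the plug-in's implementation: `wait_for_job`, `log_line`, and the `get_status` callback are hypothetical names, and the status source is a stand-in that replays a fixed sequence instead of contacting a server. The terminal state names (SUCCEEDED, KILLED, FAILED) are the ones Oozie uses for workflow jobs.

```python
import time

# Terminal states of an Oozie workflow job.
TERMINAL_STATES = {"SUCCEEDED", "KILLED", "FAILED"}


def wait_for_job(get_status, job_id, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a terminal state.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current status string for the job.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


def log_line(job_id, status):
    """Choose the log message, mirroring the SuccessLog / FailedLog split."""
    if status == "SUCCEEDED":
        return f"Job {job_id} completed successfully"
    return f"Job {job_id} failed with status {status}"


# Stand-in status source: replays a fixed sequence instead of calling Oozie.
_fake_statuses = iter(["PREP", "RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_job(
    lambda job_id: next(_fake_statuses),
    "demo-job-id",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(log_line("demo-job-id", final_status))
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely; whether the plug-in applies a comparable timeout is not stated in this document.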
The ProxyMapReduceWordCount.bwp process contains the following activities:
| Activity | Description |
| --- | --- |
| Timer | Starts the process after a specified time interval. |
| SubmitMapReduceJob | Submits a proxy MapReduce job to the Oozie server with the provided job properties. Returns the job ID as the response. |
| GetJobInfo | Accepts the job ID from the SubmitMapReduceJob activity output as input and returns the job information for that job ID. It waits until the job completes its execution. |
| SuccessLog | If the job succeeds, writes a success message to the log. |
| FailedLog | If the job fails, writes the error message to the log. |

- WorkflowJobSample.bwp
The WorkflowJobSample process shows how to submit a long-running workflow job to the Oozie server and retrieve the current job status without waiting for the job to finish.
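For orientation, a workflow job submitted to Oozie through its REST interface carries its job properties as a Hadoop-style configuration XML document. This document does not show how the plug-in formats the request, so the Python sketch below is only an assumption-based illustration of that payload: `build_job_configuration` is a hypothetical helper, and the user name, host name, and application path are placeholder values.

```python
import xml.etree.ElementTree as ET


def build_job_configuration(properties):
    """Serialize job properties as <configuration><property>... XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; user.name and oozie.wf.application.path are the
# standard Oozie properties for a workflow submission.
job_properties = {
    "user.name": "hdfs",
    "oozie.wf.application.path": "hdfs://namenode.example.com:8020/user/hdfs/apps/demo-workflow",
}
config_xml = build_job_configuration(job_properties)
print(config_xml)
```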
The WorkflowJobSample.bwp process contains the following activities:
| Activity | Description |
| --- | --- |
| Timer | Starts the process after a specified time interval. |
| SubmitWorkflowJob | Submits a workflow job to the Oozie server with the provided job properties. Returns the job ID as the response. |
| GetJobInfo | Accepts the job ID from the SubmitWorkflowJob activity output as input and returns the job information for that job ID. It returns the current job status without waiting for the job to finish. |
| CurrentLog | Writes the current job status to the log. |
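
The non-blocking status check can be sketched as a single lookup rather than a loop. The Python example below assumes the Oozie REST job-information endpoint (`/v1/job/<job-id>?show=info`), which returns a JSON document containing a `status` field. The helper names, the host name, the job ID, and the sample response are all hypothetical, and no request is actually sent.

```python
import json
from urllib.parse import quote


def job_info_url(oozie_base_url, job_id):
    """Build the job-information URL for one job (no request is sent)."""
    return f"{oozie_base_url.rstrip('/')}/v1/job/{quote(job_id)}?show=info"


def current_status(job_info_json):
    """Extract the status field from a job-information JSON document."""
    info = json.loads(job_info_json)
    return info.get("status", "UNKNOWN")


# Hypothetical job ID and response body, used only for illustration.
job_id = "0000001-000000000000000-oozie-W"
sample_response = json.dumps({"id": job_id, "appName": "demo", "status": "RUNNING"})

url = job_info_url("http://oozie.example.com:11000/oozie/", job_id)
print(url)
print(current_status(sample_response))
```

Unlike the waiting variant used by ProxyMapReduceWordCount, this reads the status once and returns whatever state the job is in at that moment, which matches the CurrentLog step writing an intermediate status.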
