How to deploy and run an Oozie workflow?

I am trying to run a simple job using Oozie.
It consists of a single Pig action.

I have a file, FirstScript.pig, containing:

    dual = LOAD 'default.dual' USING org.apache.hcatalog.pig.HCatLoader();
    store dual into 'dummy_file.txt' using PigStorage();

and a workflow.xml file containing:

    <workflow-app name="FirstWorkFlow" xmlns="uri:oozie:workflow:0.2">
        <start to="FirstJob"/>
        <action name="FirstJob">
            <pig>
                <job-tracker>hadoop:50300</job-tracker>
                <name-node>hdfs://hadoop:8020</name-node>
                <script>/FirstScript.pig</script>
            </pig>
            <ok to="okjob"/>
            <error to="errorjob"/>
        </action>
        <ok name='okjob'>
            <message>job OK, message[${wf:errorMessage()}]</message>
        </ok>
        <error name='errorjob'>
            <message>job error, error message[${wf:errorMessage()}]</message>
        </error>
    </workflow-app>

I created a structure:

    FirstScript
    |- lib
    |---FirstScript.pig
    |- workflow.xml

What now? How do I deploy and run it using Oozie?
Can anyone with more experience help?

Yours faithfully
Pawel

hadoop oozie apache-pig
2 answers

I do it like this:

    hadoop fs -put workflow.xml some_dir/
    oozie job --oozie http://your_host:11000/oozie -config cluster_conf.xml -run

My cluster_conf.xml file looks like this (check the ports first, as they depend on your Hadoop distribution):

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <configuration>
        <property>
            <name>nameNode</name>
            <value>hdfs://my_nn:8020</value>
        </property>
        <property>
            <name>jobTracker</name>
            <value>my_jt:8050</value>
        </property>
        <property>
            <name>oozie.wf.application.path</name>
            <value>/user/my_user/some_dir/workflow.xml</value>
        </property>
    </configuration>
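
Putting the upload and submit steps together, a minimal sketch of a deploy script could look like the one below. It reuses the placeholder host and paths from this answer. The `run` helper, the `DRY_RUN` switch, and the `deploy_and_run` function are hypothetical names chosen for illustration and are not part of Oozie. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: host name, port, and HDFS directory are placeholders.
# -oozie is the documented spelling of the URL option; the oozie CLI also
# falls back to the OOZIE_URL environment variable when the option is omitted.
OOZIE_URL="${OOZIE_URL:-http://your_host:11000/oozie}"
APP_DIR="some_dir"          # HDFS directory that will hold workflow.xml
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy_and_run() {
    # 1. Copy the workflow definition to HDFS.
    run hadoop fs -put workflow.xml "$APP_DIR/"
    # 2. Submit and start the job; on success Oozie prints the new job ID.
    run oozie job -oozie "$OOZIE_URL" -config cluster_conf.xml -run
}

deploy_and_run
```

Once the commands look right, run the script with `DRY_RUN=0`. The job ID printed by the submit step can then be passed to `oozie job -oozie <url> -info <job-id>` to check the status.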

The -config option should point to job.properties rather than an .xml file, since job.properties contains the path to workflow.xml:

    oozie job --oozie http://your_host:11000/oozie -config /job.properties -run
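
For illustration, a job.properties carrying the same settings as the cluster_conf.xml from the other answer might be created as sketched below. The host names, ports, and HDFS path are the placeholders used there, not real values, and pointing `oozie.wf.application.path` at the directory that contains workflow.xml is an assumption about where the file was uploaded.

```shell
#!/bin/sh
# Sketch only: host names, ports, and the HDFS path are placeholders.

# The quoted 'EOF' stops the shell from expanding ${nameNode};
# Oozie resolves that reference itself when it reads the file.
cat > job.properties <<'EOF'
nameNode=hdfs://my_nn:8020
jobTracker=my_jt:8050
oozie.wf.application.path=${nameNode}/user/my_user/some_dir
EOF

# Show the result so it can be checked before submitting.
cat job.properties
```

The file is read from the local file system, so the path given to -config is a local path, while the value of `oozie.wf.application.path` is an HDFS path.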