Python client support for running Hive on Amazon EMR

I noticed that neither mrjob nor boto provides a Python interface for submitting and running Hive jobs on Amazon Elastic MapReduce (EMR). Are there any other Python client libraries that support running Hive on EMR?

python elastic-map-reduce hive boto
1 answer
With boto, you can do something like this:
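    from boto.emr.connection import EmrConnection
    from boto.emr.step import JarStep

    # name (the job flow name), s3_query_file_uri, s3_log_uri, master_instance_type,
    # slave_instance_type and num_instances must be defined by the caller.

    # Arguments for the step that installs Hive on the cluster
    args1 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
             u'--base-path', u's3://us-east-1.elasticmapreduce/libs/hive/',
             u'--install-hive',
             u'--hive-versions', u'0.7']

    # Arguments for the step that runs the Hive script stored in S3
    args2 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
             u'--base-path', u's3://us-east-1.elasticmapreduce/libs/hive/',
             u'--hive-versions', u'0.7',
             u'--run-hive-script',
             u'--args', u'-f', s3_query_file_uri]

    steps = []
    # The loop variable is step_name so that it does not overwrite the job flow name
    for step_name, args in zip(('Setup Hive', 'Run Hive Script'), (args1, args2)):
        step = JarStep(step_name,
                       's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
                       step_args=args,
                       # action_on_failure="CANCEL_AND_WAIT"
                       )
        steps.append(step)  # must be inside the loop so both steps are added

    # Kick off the job
    jobid = EmrConnection().run_jobflow(name, s3_log_uri,
                                        steps=steps,
                                        master_instance_type=master_instance_type,
                                        slave_instance_type=slave_instance_type,
                                        num_instances=num_instances,
                                        hadoop_version="0.20")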
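The two argument lists above share the same hive-script path and base path and differ only in their trailing flags, so they could be generated by one small helper. The following is a minimal sketch, not part of boto: the function name `hive_step_args`, the constant `HIVE_LIB`, and the example bucket URI are all assumptions made for illustration. It only builds the lists and does not contact AWS.

```python
# Base S3 location of the EMR Hive libraries, taken from the answer above.
HIVE_LIB = 's3://us-east-1.elasticmapreduce/libs/hive/'


def hive_step_args(hive_version, script_uri=None):
    """Build the script-runner argument list for a Hive step.

    hive_step_args is a hypothetical helper, not a boto function.
    Without script_uri the arguments install Hive; with it, they run
    the Hive script stored at that S3 URI.
    """
    args = [HIVE_LIB + 'hive-script', '--base-path', HIVE_LIB]
    if script_uri is None:
        args += ['--install-hive', '--hive-versions', hive_version]
    else:
        args += ['--hive-versions', hive_version,
                 '--run-hive-script', '--args', '-f', script_uri]
    return args


install_args = hive_step_args('0.7')
# The bucket and file name below are placeholders, not real resources.
run_args = hive_step_args('0.7', 's3://my-bucket/queries/report.q')
```

Each returned list can then be passed as `step_args` to `JarStep` in the same way as `args1` and `args2`.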