Workflow Tool Comparison: Oozie Vs Cascading

I am looking for a workflow tool to run complex MapReduce jobs. I have Oozie in mind, but I would also like to explore Cascading. Is there a code sample or example that chains existing MapReduce jobs using the Cascading API? Also, can you provide a comparison of Oozie vs. Cascading?

2 answers

Cascading and Oozie are not in the same category.

Oozie is a workflow scheduler.

Cascading is an API for creating workflows. It is scheduler-agnostic; that is, it should work with whatever scheduler system is in use.

There may be some confusion because the Oozie docs mention "DAG" and both run on top of Hadoop.

In addition, Cascading has a notion of "data availability" as part of its checkpoint support. Oozie supports this as well, albeit in a different way.


Personally, I have worked with both. Here is what I found interesting about Cascading:

1) concise and expressive, built on a few simple concepts such as flow, tap, pipe, etc.

2) an excellent TDD-based approach for local development and exploration

3) a good view of the planner's output (.dot file), which becomes useful once the project has grown and so simplifies maintenance


4) a DSL-based approach with Groovy, Scala, and Clojure, so there is no need to worry about learning a new language

5) easy cloud deployment (for example, deployment on Amazon is supported)

6) you can call existing Pig, Hive, or plain MapReduce jobs, as long as they expose a Java API

7) great for ML- and NLP-related jobs
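
Regarding point 6 and the original question about chaining existing MapReduce jobs: as far as I know, Cascading wraps an existing job in a MapReduceFlow and lets a CascadeConnector work out the run order from each job's input and output paths. The sketch below is not Cascading code. It is a plain-Java illustration of that ordering idea, and the JobChain and Job names and the example paths are made up for this answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class JobChain {

    /** Hypothetical stand-in for an existing MapReduce job: a name plus input and output paths. */
    static final class Job {
        final String name;
        final String input;
        final String output;

        Job(String name, String input, String output) {
            this.name = name;
            this.input = input;
            this.output = output;
        }
    }

    /** Returns job names ordered so that each job runs after the job that writes its input. */
    static List<String> order(List<Job> jobs) {
        // Index jobs by the path they write, so a reader of that path can find its producer.
        Map<String, Job> producerOf = new HashMap<>();
        for (Job job : jobs) {
            producerOf.put(job.output, job);
        }
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (Job job : jobs) {
            visit(job, producerOf, done, inProgress, ordered);
        }
        return ordered;
    }

    // Depth-first walk: schedule the upstream producer (if any) before the job itself.
    private static void visit(Job job, Map<String, Job> producerOf,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(job.name)) {
            return;
        }
        if (!inProgress.add(job.name)) {
            throw new IllegalStateException("Cycle detected at job " + job.name);
        }
        Job upstream = producerOf.get(job.input);
        if (upstream != null) {
            visit(upstream, producerOf, done, inProgress, ordered);
        }
        inProgress.remove(job.name);
        done.add(job.name);
        ordered.add(job.name);
    }

    public static void main(String[] args) {
        // Jobs are listed out of order on purpose; the paths decide the sequence.
        List<Job> jobs = Arrays.asList(
            new Job("aggregate", "/data/clean", "/data/report"),
            new Job("clean", "/data/raw", "/data/clean"));
        System.out.println(order(jobs)); // prints [clean, aggregate]
    }
}
```

This only computes the order in which jobs would run. Deciding when the whole chain starts (on a timer, or when input data arrives) is the scheduler's job, which is where Oozie comes in.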
