Which key class is suitable for secondary sorting?

Question

In Hadoop, you can use the secondary sorting mechanism to sort the values before sending them to the reducer.

The way this is done in Hadoop is that you add the value to sort by to the key, and then you supply custom comparators for grouping and for key ordering that plug into the sort system.
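The interplay of the two comparators can be sketched in plain Java, without Hadoop on the classpath. This is only an illustration under assumptions: `PairKey`, `SecondarySortSketch`, and `sortAndGroup` are hypothetical names, and the loop merely imitates what the framework does between map and reduce.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical composite key: the natural key plus the value to sort by.
final class PairKey {
    final String naturalKey;
    final int sortValue;

    PairKey(String naturalKey, int sortValue) {
        this.naturalKey = naturalKey;
        this.sortValue = sortValue;
    }
}

final class SecondarySortSketch {
    // Sort comparator: orders by natural key first, then by the sort value.
    static final Comparator<PairKey> SORT_ORDER =
            Comparator.comparing((PairKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.sortValue);

    // Grouping comparator: looks at the natural key only, so keys that differ
    // only in sortValue land in the same reduce group.
    static final Comparator<PairKey> GROUP_ORDER =
            Comparator.comparing((PairKey k) -> k.naturalKey);

    // Imitates the framework: sort all keys, then open a new group whenever
    // the grouping comparator reports a different key than the previous one.
    static List<List<Integer>> sortAndGroup(List<PairKey> keys) {
        List<PairKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        List<List<Integer>> groups = new ArrayList<>();
        PairKey previous = null;
        for (PairKey key : sorted) {
            if (previous == null || GROUP_ORDER.compare(previous, key) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(key.sortValue);
            previous = key;
        }
        return groups;
    }
}
```

In a real job the two comparators would be registered through `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass` rather than applied by hand.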

So you need a key that essentially consists of both the real key and the value to sort on. To make this fast enough, I need a way to create a composite key that is also easy to decompose into the separate parts needed by the grouping and key comparison methods.
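For illustration, such a composite key might look like the sketch below. It is an assumption, not a ready-made Hadoop class: `CompositeKey` and its field names are hypothetical, and the `write`, `readFields`, and `compareTo` methods only mirror the shape of Hadoop's `WritableComparable` so that the example compiles with the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical composite key: natural key plus the value to sort on.
// In a real job this class would declare "implements WritableComparable<CompositeKey>".
final class CompositeKey implements Comparable<CompositeKey> {
    private String naturalKey = "";
    private long sortValue;

    CompositeKey() { }  // no-arg constructor so an empty instance can be filled by readFields

    CompositeKey(String naturalKey, long sortValue) {
        this.naturalKey = naturalKey;
        this.sortValue = sortValue;
    }

    // The parts stay separately accessible for grouping and partitioning.
    String getNaturalKey() { return naturalKey; }
    long getSortValue() { return sortValue; }

    // Serializes both parts in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(naturalKey);
        out.writeLong(sortValue);
    }

    // Reads the parts back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        naturalKey = in.readUTF();
        sortValue = in.readLong();
    }

    // Full ordering: natural key first, then the sort value.
    @Override
    public int compareTo(CompositeKey other) {
        int byKey = naturalKey.compareTo(other.naturalKey);
        return byKey != 0 ? byKey : Long.compare(sortValue, other.sortValue);
    }

    // Demo helper: serialize to bytes and read back into a fresh instance.
    static CompositeKey roundTrip(CompositeKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            CompositeKey copy = new CompositeKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the parts as separate fields, rather than concatenating them into one string, means the grouping and partitioning code can read the natural key directly instead of parsing it out again.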

What is the smartest way to do this? Is there a “ready-made” Hadoop class that can help me with this, or do I need to create a separate key class for each map-reduce step?

How do I do this if the key itself is a composite consisting of several parts (which are also needed separately because of the partitioner)?
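The multi-part case can be sketched as follows, assuming a natural key with two parts. All names here (`MultiPartKey`, `country`, `city`, `timestamp`, `partitionFor`) are hypothetical; the point is only that the partition is computed from the natural-key parts and ignores the sort value, so every record of a group reaches the same reducer.

```java
import java.util.Objects;

// Hypothetical key with a two-part natural key and a separate sort value.
final class MultiPartKey {
    final String country;   // natural key, part 1
    final String city;      // natural key, part 2
    final long timestamp;   // value to sort on, not part of the natural key

    MultiPartKey(String country, String city, long timestamp) {
        this.country = country;
        this.city = city;
        this.timestamp = timestamp;
    }

    // Hash over the natural-key parts only; the sort value is deliberately left out.
    int naturalHash() {
        return Objects.hash(country, city);
    }
}

final class NaturalKeyPartitioning {
    // Masks off the sign bit and takes the remainder, the same idiom used by
    // Hadoop's default HashPartitioner, but applied to the natural key alone.
    static int partitionFor(MultiPartKey key, int numPartitions) {
        return (key.naturalHash() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

In a real job this logic would sit in a subclass of `org.apache.hadoop.mapreduce.Partitioner`, inside its `getPartition(key, value, numPartitions)` method, and be registered with `Job.setPartitionerClass`.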

What do you guys recommend?

PS: I wanted to add the "secondary-sort" tag, but I don’t have enough reputation yet.

+5

java sorting mapreduce hadoop

Niels Basjes Jul 19 '10 at 10:11

4 answers

Pranab · Answer 1 · 2012-02-03T03:05:14+0000

See Tuple, a Java class that implements WritableComparable:

https://github.com/pranab/chombo/blob/master/src/main/java/org/chombo/util/Tuple.java

Kapil D · Answer 2 · 2011-07-07T01:20:30+0000

See the SecondarySort example:

https://github.com/kapild/hadoop-examples/tree/master/src/SecondarySort

jayunit100 · Answer 3 · 2011-10-10T19:03:54+0000

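(Most of this answer did not survive extraction. The remaining fragments mention beans and the "#" character, and introduce the following link:)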

http://pkghosh.wordpress.com/2011/04/13/map-reduce-secondary-sort-does-it-all/

Rajen Raiyarela · Answer 4 · 2014-06-23T07:12:36+0000

Implement WritableComparable and put the ordering logic in compareTo.
