Hadoop "chunks" data into blocks of configured size. The default value is 64 MB. You can see where this causes problems for your approach; Each handler can receive only part of the file. If the file is less than 64 MB (or any other value is configured), then each cartographer will receive only 1 file.
The block size is configurable; 64 MB is just the default. In MR, the input is divided into splits and each split goes to one map task, so a file smaller than 64 MB fits in a single split and gets exactly one map task (1 mapper per file).
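For the opposite case, a file larger than one block, the split size can be capped from the driver so the block size is not the only thing deciding how many mappers run. A sketch under that assumption; the input path and job name are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "split-size-demo");
            FileInputFormat.addInputPath(job, new Path("/data/in")); // hypothetical path
            // Cap the split size below the block size: a 64 MB block then
            // yields ~4 splits, and therefore ~4 map tasks.
            FileInputFormat.setMaxInputSplitSize(job, 16L * 1024 * 1024);
            // ... set mapper/reducer classes and the output path, then submit.
        }
    }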
In Hadoop Map/Reduce you can control the number of reduce tasks, and hence the number of output files, via the mapred.reduce.tasks property. In the driver, set it with job.setNumReduceTasks(numberOfFiles); (the method takes a single int, where numberOfFiles is a count you compute yourself).
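A minimal driver sketch of that idea, assuming the newer org.apache.hadoop.mapreduce API; the input path and job name are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    public class ReducerCountDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path input = new Path("/data/in"); // hypothetical path

            // Count the input files so the job can emit one output per file.
            FileStatus[] files = FileSystem.get(conf).listStatus(input);

            Job job = Job.getInstance(conf, "one-reducer-per-file");
            // Equivalent to setting mapred.reduce.tasks in the configuration.
            job.setNumReduceTasks(files.length);
            // ... configure input/output formats and submit as usual.
        }
    }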
For input/output, the requirement is a 1:1 correspondence (1 in : 1 out); each input file should produce exactly one output file.
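One way to get that 1:1 layout (not spelled out in the answers above) is MultipleOutputs, naming each record's output after the file it came from. A sketch assuming text input read through a FileInputFormat, so the split can be cast to FileSplit; the class name is mine:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class OneToOneMapper extends Mapper<LongWritable, Text, Text, Text> {
        private MultipleOutputs<Text, Text> out;

        @Override
        protected void setup(Context context) {
            out = new MultipleOutputs<>(context);
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Name the output after the source file so results stay 1:1.
            String name = ((FileSplit) context.getInputSplit()).getPath().getName();
            out.write(new Text(name), value, name);
        }

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            out.close();
        }
    }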