I have two Mapper classes that take files from the same folder as input. Based on the timestamp in each file's name, I decide which Mapper the file should be given to. From time to time the same input file must be given to both Mappers. I have tested that it works when the two Mappers receive different inputs, but when I give them the same input, one of the Mapper classes does not generate the result that is used for comparison in the reducer.
The code is huge, so instead of posting it here I will describe what I did. I scan the files in the directory and, based on the timestamps in their names, put them into two lists, then add each list to a different Mapper. The two Mappers perform different calculations, and their results are then compared in the reducer. But when the same input file ends up in both lists (the time criteria for the two Mappers are almost the same), one of the Mappers does not generate any result. Is this because one Mapper cannot access the file while the other is using it? If so, is there some way around it?
Here MapPath1 is one list and MapPath2 is the other:
    for (i = 0; i < MapPath1.size(); i++)
        MultipleInputs.addInputPath(job, new Path(MapPath1.get(i)), TextInputFormat.class, Map1.class);
    if (type.equals("comparative"))
        for (i = 0; i < MapPath2.size(); i++)
            MultipleInputs.addInputPath(job, new Path(MapPath2.get(i)), TextInputFormat.class, Map2.class);
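To confirm that the missing output really coincides with a file appearing in both lists, a small check along these lines could be run before the addInputPath loops. This is only a sketch in plain Java: the class name InputOverlap, the method sharedPaths, and the sample file names are hypothetical and not part of the actual job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reports which paths are present in both mapper input lists.
class InputOverlap {
    static List<String> sharedPaths(List<String> mapPath1, List<String> mapPath2) {
        // LinkedHashSet keeps the order of the first list and drops duplicates.
        Set<String> shared = new LinkedHashSet<>(mapPath1);
        // Keep only the paths that also occur in the second list.
        shared.retainAll(mapPath2);
        return new ArrayList<>(shared);
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only.
        List<String> mapPath1 = Arrays.asList("/data/in/file_20130101.txt", "/data/in/file_20130102.txt");
        List<String> mapPath2 = Arrays.asList("/data/in/file_20130102.txt", "/data/in/file_20130103.txt");
        System.out.println("Paths given to both mappers: " + sharedPaths(mapPath1, mapPath2));
    }
}
```

If the list printed here is non-empty exactly in the runs where one Mapper produces nothing, that would narrow the problem down to the same path being registered twice.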
Update
I just found this question (Several mappers in hadoop), which is similar to mine, but I do not want to duplicate the input file, since it can be large. Can someone point me to how I can create two separate jobs using different mappers and feed their output to a single reducer?
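For the single-reducer part, one way the reducer could tell the two mappers' results apart is for each mapper to prefix its output value with a tag. The following is a plain-Java sketch of that idea without any Hadoop classes. The tags M1 and M2 and the names TaggedValues, tag, and splitByTag are hypothetical, and it assumes every value reaching the reducer carries a tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of tagging mapper output so one reducer can separate it again.
class TaggedValues {
    // What a mapper would emit as its value: "<tag>\t<payload>".
    static String tag(String mapperTag, String payload) {
        return mapperTag + "\t" + payload;
    }

    // What the reducer would do with all values of one key: group payloads by tag.
    // Assumes every value contains a tab separating tag and payload.
    static Map<String, List<String>> splitByTag(Iterable<String> values) {
        Map<String, List<String>> byTag = new HashMap<>();
        for (String value : values) {
            int sep = value.indexOf('\t');
            String mapperTag = value.substring(0, sep);
            String payload = value.substring(sep + 1);
            byTag.computeIfAbsent(mapperTag, k -> new ArrayList<>()).add(payload);
        }
        return byTag;
    }

    public static void main(String[] args) {
        // Made-up values standing in for what the two mappers would emit for one key.
        List<String> valuesForKey = Arrays.asList(tag("M1", "42"), tag("M2", "40"));
        System.out.println(splitByTag(valuesForKey));
    }
}
```

With the values grouped this way, the reducer can compare the M1 results against the M2 results for the same key, regardless of which job or mapper produced them.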