I was looking for a Hadoop application that stresses the hard drive so I could check I/O activity in Hadoop, but I could not find one that keeps disk usage above, say, 50%, or that actually forces the drives under load. I tried randomwriter, but surprisingly it is not disk-I/O intensive.
So, I wrote a small program that creates a file in the mapper and writes some text to it. This application works well, but disk usage is high only on the master node, which also runs the name node, the job tracker, and one of the slaves. Disk utilization is nil or negligible on the other task trackers. I cannot understand why disk I/O is so low on the task trackers. Can someone push me in the right direction if I am doing something wrong? Thanks in advance.
Here is the code segment I wrote in the WordCount.java file (inside the mapper) to create a file and write a UTF string to it:
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path outFile;

while (itr.hasMoreTokens()) {
    word.set(itr.nextToken());
    context.write(word, one);

    // Create a throwaway file per token, write a short UTF string, then delete it
    outFile = new Path("./dummy" + context.getTaskAttemptID());
    FSDataOutputStream out = fs.create(outFile);
    out.writeUTF("helloworld");
    out.close();
    fs.delete(outFile, false); // non-recursive delete; the one-arg overload is deprecated
}
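For comparison, here is a minimal standalone sketch (not part of the original question) of what "forcing the drive under load" can look like outside Hadoop: it writes large buffers through a FileChannel and calls force() so the data actually reaches the device, instead of tiny writeUTF() calls that the OS page cache absorbs. The class name, buffer size, and file path are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: generate sustained disk writes by repeatedly writing
// a large buffer and forcing it to the physical device each iteration.
public class DiskLoadSketch {
    // Writes `count` buffers of `bufMiB` MiB each to `target`; returns total bytes written.
    public static long writeLoad(Path target, int bufMiB, int count) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(bufMiB * 1024 * 1024);
        long written = 0;
        try (FileChannel ch = FileChannel.open(target,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            for (int i = 0; i < count; i++) {
                buf.rewind();
                while (buf.hasRemaining()) {
                    written += ch.write(buf);
                }
                ch.force(false); // flush to disk so the page cache cannot hide the I/O
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get("dummy-load.bin");
        long n = writeLoad(p, 1, 8); // 8 MiB here; scale up to keep the disk busy
        System.out.println("bytes written: " + n);
        Files.deleteIfExists(p); // clean up the scratch file
    }
}
```

A mapper-based load generator would do the same kind of large, flushed writes per task; with only small per-token writes, most of the work never leaves memory.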