I am trying to create 300M files from a Java program. I switched from the old File API to the new Java 7 NIO package, but the new package is even slower than the old one.
I see lower CPU utilization than with the old File API, but when I run this simple code I get a file transfer rate of only 0.5 MB/s. The Java process reads from one disk and writes to another (the write is the only process accessing that disk).
Files.write(FileSystems.getDefault().getPath(filePath), fiveToTenKBytes, StandardOpenOption.CREATE);
Is there any hope of getting reasonable bandwidth here?
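To make the number reproducible, here is a minimal benchmark sketch built around the same `Files.write` call. The class name, the file naming scheme (`img-N.bin`), and the random payload are assumptions for illustration only and are not part of my real program; throughput is reported with 1 MB taken as 10^6 bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Hypothetical benchmark class; it only isolates the per-file write cost.
class SmallFileWriteBenchmark {

    // Writes fileCount files of fileSize bytes into dir and returns MB/s (1 MB = 10^6 bytes).
    static double measureMegabytesPerSecond(Path dir, int fileCount, int fileSize) throws IOException {
        byte[] payload = new byte[fileSize];
        new Random(42).nextBytes(payload); // arbitrary bytes standing in for one image

        long start = System.nanoTime();
        for (int i = 0; i < fileCount; i++) {
            Path file = dir.resolve("img-" + i + ".bin");
            // Same call as in the question: create the file and write it in one step.
            Files.write(file, payload, StandardOpenOption.CREATE);
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);

        double totalBytes = (double) fileCount * fileSize;
        return totalBytes / 1e6 / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        // First argument: target directory on the disk under test (defaults to a temp directory).
        Path dir = args.length > 0 ? Paths.get(args[0]) : Files.createTempDirectory("small-files");
        double mbPerSec = measureMegabytesPerSecond(dir, 10_000, 7_500);
        System.out.printf("%.2f MB/s%n", mbPerSec);
    }
}
```

Pointing the first argument at a directory on each of the disks should show whether the slowdown comes from the per-file overhead on that device rather than from the read side.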
Update:
I am unpacking 300 million image files of 5-10 KB each from large archive files. I have 3 disks, 1 local and 2 SAN-attached (all have a typical throughput of ~20 MB/s on large files).
I also tried this code, which improved the throughput, but still to less than 2 MB/s (9 days to unpack these files).
ByteBuffer byteBuffer = ByteBuffer.wrap(imageBinary, 0, ((BytesWritable) value).getLength());
FileOutputStream fos = new FileOutputStream(imageFile);
fos.getChannel().write(byteBuffer);
fos.close();
I read from the local disk and write to the SAN-attached disk. The input is in Hadoop SequenceFile format; Hadoop can usually read these files at 20 MB/s using basically the same code.
The only thing that looks out of place, apart from the extreme slowness, is that I see about twice as much read I/O as write I/O (roughly 2:1). The sequence file is gzipped, but the images compress at almost a 1:1 ratio, so the compressed input should be roughly 1:1 with the output.
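For reference, the 9-day figure above is a back-of-envelope estimate that can be sketched as follows. The class and method names are made up for illustration, and 1 MB is taken as 10^6 bytes. With 300 million files at the lower bound of 5,000 bytes each and 2 MB/s, it comes to roughly 8.7 days; at 10,000 bytes per file it doubles.

```java
// Hypothetical helper for a rough estimate; not part of the unpacking program.
class UnpackTimeEstimate {

    // Days needed to write fileCount files of bytesPerFile bytes at the given rate (1 MB = 10^6 bytes).
    static double daysToWrite(long fileCount, long bytesPerFile, double megabytesPerSecond) {
        double totalBytes = (double) fileCount * bytesPerFile;
        double seconds = totalBytes / (megabytesPerSecond * 1e6);
        return seconds / 86_400.0; // seconds per day
    }

    public static void main(String[] args) {
        long files = 300_000_000L;
        // Observed rate (about 2 MB/s) versus the ~20 MB/s the disks reach on large files.
        System.out.printf("5 KB files at 2 MB/s:   %.1f days%n", daysToWrite(files, 5_000L, 2.0));
        System.out.printf("10 KB files at 2 MB/s:  %.1f days%n", daysToWrite(files, 10_000L, 2.0));
        System.out.printf("5 KB files at 20 MB/s:  %.1f days%n", daysToWrite(files, 5_000L, 20.0));
        System.out.printf("10 KB files at 20 MB/s: %.1f days%n", daysToWrite(files, 10_000L, 20.0));
    }
}
```

At the ~20 MB/s the disks manage on large files, the same job would take on the order of one to two days, which is the gap I am trying to close.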
2nd UPDATE
Looking at iostat, I see some odd numbers. The device to look at here is xvdf: I have one Java process that reads from xvdb and writes to xvdf, and there are no other active processes on xvdf.
iostat -d 30
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvdap1            1.37         5.60         4.13        168        124
xvdb             14.80       620.00         0.00      18600          0
xvdap3            0.00         0.00         0.00          0          0
xvdf            668.50      2638.40       282.27      79152       8468
xvdg           1052.70      3751.87      2315.47     112556      69464
Reads on xvdf are about 10x the writes, which is unbelievable.
fstab
/dev/xvdf /mnt/ebs1 auto defaults,noatime,nodiratime 0 0
/dev/xvdg /mnt/ebs2 auto defaults,noatime,nodiratime 0 0