Test code:
import java.io.BufferedReader; import java.io.FileReader; import java.io.IOException; import java.util.Map.Entry; public class ReadLine { public static void main(String[] args) throws IOException { System.getenv(); System.getProperties(); BufferedReader br = new BufferedReader(new FileReader(args[0]), 2 << 17); int lineTotal = 0; int lineDone = 0; long start = System.currentTimeMillis(); long totalSincePre = System.currentTimeMillis(); while (br.readLine() != null) { lineTotal++; if (lineTotal % 100000 == 0) { long end = System.currentTimeMillis(); System.out.println("total speed=" + lineTotal / (end - totalSincePre) + "k/s. curr speed=" + (lineTotal - lineDone) / (end - start)); start = end; lineDone = lineTotal; } } printEnv(); } static void printEnv() { for (Entry<?, ?> e : System.getenv().entrySet()) { System.out.println(e.getKey() + ":" + e.getValue()); } for (Entry<?, ?> e : System.getProperties().entrySet()) { System.out.println(e.getKey() + ":" + e.getValue()); } } }
Test Env: OS: Linux, File size: 7.2G (csv text file, more than 1k per line), java version "1.6.0_32"
Test:
The code is packed in one jar. I performed the following two tests on the same host using the same jar reading the same file.
1) Run ./java -cp=my.jar ReadLine TestFile.txt Performance stabilized at a speed of about 150 thousand lines / s.
The console output looks like this:
total speed=251k/s. curr speed=251 total speed=304k/s. curr speed=384 total speed=323k/s. curr speed=371 total speed=337k/s. curr speed=387 total speed=350k/s. curr speed=414 total speed=358k/s. curr speed=401 total speed=363k/s. curr speed=395 total speed=349k/s. curr speed=277 total speed=304k/s. curr speed=150 total speed=277k/s. curr speed=153 total speed=258k/s. curr speed=154 total speed=244k/s. curr speed=152 total speed=233k/s. curr speed=152 total speed=225k/s. curr speed=154 total speed=218k/s. curr speed=153 total speed=196k/s. curr speed=149 total speed=193k/s. curr speed=146 ......stabled....... total speed=163k/s. curr speed=150 total speed=162k/s. curr speed=155 total speed=162k/s. curr speed=155 total speed=162k/s. curr speed=149 total speed=162k/s. curr speed=147 total speed=162k/s. curr speed=150 total speed=161k/s. curr speed=148 total speed=161k/s. curr speed=145 total speed=161k/s. curr speed=151 total speed=161k/s. curr speed=154 total speed=161k/s. curr speed=163 total speed=161k/s. curr speed=179
2) No code changes, but only added JARs (used in the production environment) in the classpath (necessary for production, but not for this program), for example ./java -cp=my.jar:hundreds_of_other_jars ReadLine TestFile.txt . Productivity fell to about 90 thousand lines / s
total speed=312k/s. curr speed=383 total speed=335k/s. curr speed=393 total speed=348k/s. curr speed=395 total speed=361k/s. curr speed=423 total speed=369k/s. curr speed=414 total speed=374k/s. curr speed=404 total speed=342k/s. curr speed=214 total speed=264k/s. curr speed=93 total speed=224k/s. curr speed=95 total speed=200k/s. curr speed=95 total speed=182k/s. curr speed=94 total speed=170k/s. curr speed=94 total speed=161k/s. curr speed=95 total speed=154k/s. curr speed=95 total speed=148k/s. curr speed=93 .....stabled..... total speed=139k/s. curr speed=92 total speed=135k/s. curr speed=92 total speed=132k/s. curr speed=92 total speed=129k/s. curr speed=92 total speed=127k/s. curr speed=92 total speed=125k/s. curr speed=90 total speed=123k/s. curr speed=91 total speed=121k/s. curr speed=92 total speed=120k/s. curr speed=89 total speed=118k/s. curr speed=92 total speed=117k/s. curr speed=91 total speed=116k/s. curr speed=91 total speed=115k/s. curr speed=91 total speed=114k/s. curr speed=90 total speed=113k/s. curr speed=91
My analysis:
The only difference is the path to the classes. The second test class path contains hundreds of JARs in the classpath. But not one of them is used in this program.
- Not related to the environment. This code is packaged in a JAR, and both tests used the same JAR. Both tests run on the same host, read the same file, use the same code. I also compared System.getEnv and System.getProperties, no differences other than the classpath.
- No operating system cache. This can be reproduced. After many tests, the results are the same. No matter which test was run first.
- jmap shows that memory usage does not make much difference, and all generations are not used very much.
- jstack shows that the call stack for both tests is likely to be
this is
at java.io.FileInputStream.available(Native Method) at sun.nio.cs.StreamDecoder.inReady(StreamDecoder.java:343) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:304) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) - locked <0xb4220388> (a java.io.FileReader) at java.io.InputStreamReader.read(InputStreamReader.java:167) at java.io.BufferedReader.fill(BufferedReader.java:136) at java.io.BufferedReader.readLine(BufferedReader.java:299) - locked <0xb4220388> (a java.io.FileReader) at java.io.BufferedReader.readLine(BufferedReader.java:362) at com.amazon.invhealth.metrics.transform.topasin.RL.main(RL.java:24)
or
at sun.nio.cs.UTF_8$Decoder.decodeArrayLoop(UTF_8.java:240) at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:305) at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:298) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) - locked <0xb4220388> (a java.io.FileReader) at java.io.InputStreamReader.read(InputStreamReader.java:167) at java.io.BufferedReader.fill(BufferedReader.java:136) at java.io.BufferedReader.readLine(BufferedReader.java:299) - locked <0xb4220388> (a java.io.FileReader) at java.io.BufferedReader.readLine(BufferedReader.java:362) at com.amazon.invhealth.metrics.transform.topasin.RL.main(RL.java:24)
Looking at the call stack, these test cases probably use the same code.
- This is not caused by a specific JAR in the classpath. I tried to remove the first 50% of the class path to run the test, the performance is about 110 thousand lines / s. Then I delete the last 50% to run the test, the performance is also about 110 thousand lines per second. If you remove more than 2/3 of the cans from the class, the performance is about 120 thousand lines / s. Therefore, from the test, this performance problem is only related to how many JARs are in the classpath.
- Then I tried to pack all these JARs into one big JAR. Unfortunately, performance dropped from 90 thousand to 60 thousand .... More precisely, on the basis of my test, this drop in productivity is caused by how many classes are on the way to classes.
- I conducted two tests on another machine with a different file (file size and format are similar), the results are the same. So it can definitely be reproduced.
BUT I think this does not make sense. Did I miss something? If this is true, what is the main reason?
---------- More debugging -----------------
GC and Perm Size
Added -Xmx2432m -Xms256m -XX:MaxNewSize=700m -XX:MaxPermSize=1024m -XX:+PrintGCDetails for both tests. Everyone has only PSYoungGen GC. And the performance is the same for both.
Exit Information
Long CP one:
total speed=114k/s. curr speed=91 [GC [PSYoungGen: 247888K->16K(238272K)] 248810K->938K(413056K), 0.0003290 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=113k/s. curr speed=92 [GC [PSYoungGen: 238096K->16K(228864K)] 239018K->938K(403648K), 0.0003840 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=113k/s. curr speed=92 [GC [PSYoungGen: 228816K->16K(220096K)] 229738K->938K(394880K), 0.0006030 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=112k/s. curr speed=92 [GC [PSYoungGen: 219984K->16K(211584K)] 220906K->938K(386368K), 0.0004380 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=111k/s. curr speed=93 [GC [PSYoungGen: 211536K->16K(203584K)] 212458K->938K(378368K), 0.0005160 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=111k/s. curr speed=92 [GC [PSYoungGen: 203472K->16K(195840K)] 204394K->938K(370624K), 0.0005920 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=110k/s. curr speed=94 [GC [PSYoungGen: 195792K->16K(188608K)] 196714K->938K(363392K), 0.0004010 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 188496K->16K(181568K)] 189418K->938K(356352K), 0.0004440 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] ...................... Heap PSYoungGen total 145984K, used 81767K [0xc8560000, 0xd7780000, 0xf4160000) eden space 145920K, 56% used [0xc8560000,0xcd535d18,0xd13e0000) from space 64K, 25% used [0xd7760000,0xd7764000,0xd7770000) to space 64K, 0% used [0xd7770000,0xd7770000,0xd7780000) PSOldGen total 174784K, used 922K [0x5c160000, 0x66c10000, 0xc8560000) object space 174784K, 0% used [0x5c160000,0x5c246ae8,0x66c10000) PSPermGen total 16384K, used 2032K [0x1c160000, 0x1d160000, 0x5c160000) object space 16384K, 12% used [0x1c160000,0x1c35c260,0x1d160000)
One Jar CP:
total speed=180k/s. curr speed=148 [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005300 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0004950 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005020 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=179k/s. curr speed=150 [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005360 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005190 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=178k/s. curr speed=151 [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005360 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005400 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0003510 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] total speed=177k/s. curr speed=150 Heap PSYoungGen total 87296K, used 83826K [0xc8580000, 0xcdad0000, 0xf4180000) eden space 87232K, 96% used [0xc8580000,0xcd758928,0xcdab0000) from space 64K, 25% used [0xcdab0000,0xcdab4000,0xcdac0000) to space 64K, 0% used [0xcdac0000,0xcdac0000,0xcdad0000) PSOldGen total 174784K, used 656K [0x5c180000, 0x66c30000, 0xc8580000) object space 174784K, 0% used [0x5c180000,0x5c224080,0x66c30000) PSPermGen total 16384K, used 2022K [0x1c180000, 0x1d180000, 0x5c180000) object space 16384K, 12% used [0x1c180000,0x1c379bb0,0x1d180000)
JVM Initial
I really expect this to be the reason because it is reasonable. But after using the following code:
String filepath = args[0]; while (true) { BufferedReader br = new BufferedReader(new FileReader(filepath), 2 << 17); System.out.println("Press Enter to start..."); while (System.in.read() != '\n') ; int lineTotal = 0; int linePre = 0; long start = System.currentTimeMillis(); long totalStart = System.currentTimeMillis(); while (br.readLine() != null) { lineTotal++; if (lineTotal % 100000 == 0) { long end = System.currentTimeMillis(); System.out.println("total speed=" + lineTotal / (end - totalStart) + "k/s. curr speed=" + (lineTotal - linePre) / (end - start)); start = end; linePre = lineTotal; } } }
Run and press enter the main times, the performance will not change.