Why is this method a hot spot?

I am writing a (simple!) linear algebra library. While implementing matrix multiplication, the VisualVM performance sampler tells me that the algorithm spends 85% of its time ("self time" specifically) in the following method when multiplying large matrices (5k x 120k):

    public double next() {
        double result;
        if(hasNext())
            result = vis[i++].next();
        else
            throw new IllegalStateException("No next value");
        return result;
    }

Without going into details (sorry, I can't share more code), this method is the next() method of an "iterator" over a matrix. (You can think of the class this method lives in as a row iterator made up of individual column iterators, which are stored in vis.) I'm not surprised that this method gets called a lot, since it is an iterator, but I am surprised that the program spends so much time in it. The method doesn't do much, so why is the time being spent here?

Here are my specific questions:

  • Are there any VisualVM "gotchas" that I'm running into? For example, could the JIT confuse VisualVM in some way, causing it to attribute time to the wrong method?
  • Why would the program spend its time here? The method just isn't doing much. In particular, I don't think cache effects explain the problem, because the vis array is much smaller than the data of the matrices being multiplied.

In case it is useful, here is the jad disassembly of the method pasted above:

    public double next()
    {
        double result;
        if(hasNext())
    //*   0    0:aload_0
    //*   1    1:invokevirtual   #88  <Method boolean hasNext()>
    //*   2    4:ifeq            32
            result = vis[i++].next();
    //    3    7:aload_0
    //    4    8:getfield        #42  <Field VectorIterator[] vis>
    //    5   11:aload_0
    //    6   12:dup
    //    7   13:getfield        #28  <Field int i>
    //    8   16:dup_x1
    //    9   17:iconst_1
    //   10   18:iadd
    //   11   19:putfield        #28  <Field int i>
    //   12   22:aaload
    //   13   23:invokeinterface #72  <Method double VectorIterator.next()>
    //   14   28:dstore_1
        else
    //*  15   29:goto            42
            throw new IllegalStateException("No next value");
    //   16   32:new             #89  <Class IllegalStateException>
    //   17   35:dup
    //   18   36:ldc1            #91  <String "No next value">
    //   19   38:invokespecial   #93  <Method void IllegalStateException(String)>
    //   20   41:athrow
        return result;
    //   21   42:dload_1
    //   22   43:dreturn
    }

Thank you in advance for your help!

+8
java performance optimization visualvm
3 answers

I've figured out that this method appeared to be a hot spot because VisualVM was instructed to ignore methods from the JRE in its profiling. The time spent in those "ignored" methods was (apparently) rolled into the self time of the topmost non-ignored entry on the call stack.

Below is the settings screen in VisualVM, including the "Do not profile packages" option that made the data misleading. To adjust the "ignore class" settings, you must (1) check the "Settings" checkbox highlighted in red, then (2) adjust the class settings highlighted in blue.

[Screenshot: VisualVM settings screen]

Depending on what you are doing, it probably makes sense to at least not ignore the java.* and javax.* packages.
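To make the roll-up effect concrete, here is a minimal sketch of how a sampler could attribute a stack sample when some packages are excluded. It is only a model of the behavior described in this answer, not VisualVM's actual implementation; the class SelfTimeAttribution, the method attribute, and the frame names under mylib are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of self-time attribution with excluded packages.
// This is NOT VisualVM's code; it only illustrates the roll-up idea.
final class SelfTimeAttribution {

    // `stack` is ordered innermost frame first, like Thread.getStackTrace().
    // Returns the innermost frame that is not in an ignored package,
    // i.e. the frame that would be charged with the sample's self time.
    static String attribute(List<String> stack, List<String> ignoredPrefixes) {
        for (String frame : stack) {
            boolean ignored = false;
            for (String prefix : ignoredPrefixes) {
                if (frame.startsWith(prefix)) {
                    ignored = true;
                    break;
                }
            }
            if (!ignored) {
                return frame;
            }
        }
        return "<all frames ignored>";
    }

    public static void main(String[] args) {
        // Made-up sample: the thread is currently inside a JRE method.
        List<String> sample = Arrays.asList(
                "java.util.ArrayList$Itr.next",
                "mylib.RowIterator.next",
                "mylib.Matrix.multiply");

        // With java.* and javax.* ignored, the caller gets the blame.
        System.out.println(attribute(sample, Arrays.asList("java.", "javax.")));
        // With nothing ignored, the JRE method itself is reported.
        System.out.println(attribute(sample, Collections.<String>emptyList()));
    }
}
```

Under this model, a small method that spends most of its time inside excluded JRE calls ends up with a large self time even though its own bytecode is cheap.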

+9

I have no hands-on experience with VisualVM.

First, determine whether bytecode instrumentation is used to collect the statistics. If so, look no further: instrumenting a short method always overestimates its time (measuring the time and incrementing the statistics counter costs more than the method itself).
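As a rough illustration of that overhead, the sketch below hand-instruments a trivial method with System.nanoTime() and a call counter and compares it with the plain version. The class and method names are made up for the example, a real profiler injects its bookkeeping differently, and the measured numbers will vary with the JVM and JIT.

```java
import java.util.Arrays;

// Hypothetical hand-rolled "instrumentation" of a tiny method.
final class InstrumentationOverhead {
    static long calls = 0;       // statistics counter kept by the "profiler"
    static long totalNanos = 0;  // accumulated time charged to the method

    // The method under test: a single array read.
    static double tiny(double[] a, int i) {
        return a[i];
    }

    // Same work, wrapped in timing and counting code.
    static double tinyInstrumented(double[] a, int i) {
        long start = System.nanoTime();
        double r = a[i];
        totalNanos += System.nanoTime() - start;
        calls++;
        return r;
    }

    static double sumPlain(double[] a) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += tiny(a, i);
        return s;
    }

    static double sumInstrumented(double[] a) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += tinyInstrumented(a, i);
        return s;
    }

    public static void main(String[] args) {
        double[] a = new double[1_000_000];
        Arrays.fill(a, 1.0);
        // Warm up so the JIT has a chance to compile both paths.
        for (int w = 0; w < 20; w++) { sumPlain(a); sumInstrumented(a); }

        long t0 = System.nanoTime();
        double p = sumPlain(a);
        long t1 = System.nanoTime();
        double q = sumInstrumented(a);
        long t2 = System.nanoTime();

        System.out.println("plain:        " + (t1 - t0) / 1e6 + " ms (sum " + p + ")");
        System.out.println("instrumented: " + (t2 - t1) / 1e6 + " ms (sum " + q + ")");
    }
}
```

If the instrumented run is many times slower, most of what a profiler of this kind reports for the method is its own measurement cost.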

But it is always possible that an iterator consumes more time than the calculation itself. Imagine just summing a matrix. Adding a float value to a local sum variable costs a lot less time than invoking a method, checking an invariant, and finally accessing an array.
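The matrix-summing thought experiment can be sketched like this. ArrayIterator is a hypothetical stand-in for the question's iterator (the real class was not shared), written so that next() re-checks hasNext() just as the posted method does.

```java
// Hypothetical comparison of iterator-based and index-based summation.
final class IteratorVsLoop {

    // Stand-in iterator: checks its invariant, then reads the array.
    static final class ArrayIterator {
        private final double[] data;
        private int i = 0;

        ArrayIterator(double[] data) {
            this.data = data;
        }

        boolean hasNext() {
            return i < data.length;
        }

        double next() {
            if (!hasNext()) {
                throw new IllegalStateException("No next value");
            }
            return data[i++];
        }
    }

    // Per element: hasNext() from the loop, hasNext() again inside next(),
    // a field update, an array read, and finally the addition.
    static double sumWithIterator(double[] data) {
        ArrayIterator it = new ArrayIterator(data);
        double sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // Per element: one bounds-checked array read and the addition.
    static double sumWithLoop(double[] data) {
        double sum = 0;
        for (int k = 0; k < data.length; k++) {
            sum += data[k];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};
        System.out.println(sumWithIterator(data)); // prints 10.0
        System.out.println(sumWithLoop(data));     // prints 10.0
    }
}
```

Both methods compute the same result; the difference is only in how much bookkeeping surrounds each addition. The JIT may inline much of the iterator away, so how large the gap is in practice has to be measured.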

+1

Forget the profiler. Just pause the darn thing a few times and inspect the stack. If 85% of the time is spent in this routine, then on each pause there is an 85% chance that you will see exactly where in the routine it is, and exactly where it was called from. You can even see where it is in the process of multiplying the matrices. Thousands of profiler samples won't tell you that.

My own hunch is that calling this function, then executing hasNext and then next for every single element, will be much slower than i++.
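Pausing in a debugger or running the JDK's jstack tool against the process are the usual ways to do this. If you would rather script it, the following is a minimal sketch using Thread.getStackTrace(); the class ManualStackSampler and the busyWork workload are hypothetical stand-ins for the real multiplication thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that takes a handful of full stack snapshots.
final class ManualStackSampler {

    // Takes up to `count` snapshots of `target`, `pauseMillis` apart.
    // Snapshots that come back empty (thread not running) are skipped.
    static List<StackTraceElement[]> sample(Thread target, int count, long pauseMillis) {
        List<StackTraceElement[]> snapshots = new ArrayList<>();
        for (int n = 0; n < count; n++) {
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                snapshots.add(stack);
            }
        }
        return snapshots;
    }

    static volatile double sink; // keeps the loop from being optimized away

    // Stand-in workload; replace with the real computation.
    static void busyWork() {
        double s = 0;
        while (!Thread.currentThread().isInterrupted()) {
            s += Math.sqrt(s + 1.0);
        }
        sink = s;
    }

    public static void main(String[] args) {
        Thread worker = new Thread(ManualStackSampler::busyWork, "worker");
        worker.setDaemon(true);
        worker.start();

        // Print every frame, innermost first, so the call path is visible.
        for (StackTraceElement[] stack : sample(worker, 5, 50)) {
            System.out.println("--- snapshot ---");
            for (StackTraceElement frame : stack) {
                System.out.println("  at " + frame);
            }
        }
        worker.interrupt();
    }
}
```

A frame that shows up in most of the snapshots is where the time is going, and the frames beneath it show how the program got there.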

+1
