Delphi SamplingProfiler: How is this code calling into ntdll.dll?

I profiled part of my application using the Delphi Sampling Profiler. Like most people, I see most of the time being spent inside ntdll.dll.

Note: I turned on the options to ignore Application.Idle and calls from System.pas. So the time is not being spent inside ntdll because the application is idle.

Over multiple runs, most of the time consistently appears to be spent inside ntdll.dll, but what is odd is who the caller is.

The caller is in Virtual Treeview:

 PrepareCell(PaintInfo, Window.Left, NodeBitmap.Width); 

Note: the application is not inside ntdll.dll because it is idle; the caller is not Application.Idle.

What bothers me is that this line itself (i.e. not something inside PrepareCell) is reported as the caller into ntdll. Even more confusing:

  • not only is it not something inside PrepareCell()
  • it's not even the setup of the call to PrepareCell (pushing stack variables, setting up implicit exception frames, etc.) that is the caller. Those things would show up in the profiler as a hotspot on the begin inside PrepareCell.

VirtualTrees.pas:

 procedure TBaseVirtualTree.PrepareCell(var PaintInfo: TVTPaintInfo; WindowOrgX, MaxWidth: Integer);
 begin
   ...
 end;

So I am trying to work out how this line:

 PrepareCell(PaintInfo, Window.Left, NodeBitmap.Width); 

is calling into ntdll.dll.


The only other ways in are the three parameters:

  • PaintInfo
  • Window.Left
  • NodeBitmap.Width

Maybe one of them is a function or a property getter that calls into ntdll. So I set a breakpoint on the line and looked at the CPU window at runtime.

There is a line that could be the culprit:

 call dword ptr [edx+$2c] 

But when I follow that call, it ends up not in ntdll.dll but in TBitmap.GetWidth, which does not call anything else, and certainly not into ntdll.dll.


So, how is the line:

 PrepareCell(PaintInfo, Window.Left, NodeBitmap.Width); 

calling into ntdll.dll?


Note: I know full well that this line is not really calling into ntdll.dll. So any valid answer must include the words "Sampling Profiler is misleading; that line is not calling into ntdll.dll." The answer must also say either that most of the time is not spent in ntdll.dll, or that the highlighted line is not the caller. Finally, any answer must explain why Sampling Profiler is wrong and how this can be fixed.

Update 2

What is ntdll.dll? Ntdll is the native API of Windows NT. The Win32 API is a wrapper around ntdll.dll that looks like the Windows API that existed in Windows 1/2/3/9x. To actually get into ntdll, you must call a function that uses ntdll directly or indirectly.

For example, when my Delphi application is idle, it waits for a message by calling the user32.dll function:

 WaitMessage; 

When you actually look at that function:

 USER32.WaitMessage
 mov eax,$00001226
 mov edx,$7ffe0300
 call dword ptr [edx]
 ret

Calling the function whose address is stored at $7ffe0300 is how Windows transitions into Ring 0, invoking the function ID given in EAX. In this case, system function 0x1226 is being called. On my Windows Vista system, 0x1226 corresponds to the system function NtUserWaitMessage.

This is how you get into ntdll.dll: you call it.

I was desperately trying to avoid a hand-waving non-answer when I worded the original question. By being very specific and carefully pointing out what I am actually seeing, I was trying to stop people from ignoring the facts and falling back on a hand-waving argument.


Update 3

I converted two parameters:

 PrepareCell(PaintInfo, Window.Left, NodeBitmap.Width); 

to stack variables:

 _profiler_WindowLeft := Window.Left;
 _profiler_NodeBitmapWidth := NodeBitmap.Width;
 PrepareCell(PaintInfo, _profiler_WindowLeft, _profiler_NodeBitmapWidth);

This was to confirm that the bottleneck is not the call to

  • Window.Left, or
  • NodeBitmap.Width

The profiler still indicates that the line

 PrepareCell(PaintInfo, _profiler_WindowLeft, _profiler_NodeBitmapWidth); 

itself is the bottleneck, and not anything inside PrepareCell. That must mean the time is in the setup of the call to PrepareCell, or at the very start of PrepareCell:

 VirtualTrees.pas.15746: PrepareCell(PaintInfo, _profiler_WindowLeft, _profiler_NodeBitmapWidth);
 mov eax,[ebp-$54]
 push eax
 mov edx,esi
 mov ecx,[ebp-$50]
 mov eax,[ebp-$04]
 call TBasevirtualTree.PrepareCell

Nothing in there calls into ntdll. Now the preamble in PrepareCell itself:

 VirtualTrees.pas.15746: begin
 push ebp
 mov ebp,esp
 add esp,-$44
 push ebx
 push esi
 push edi
 mov [ebp-$14],ecx
 mov [ebp-$18],edx
 mov [ebp-$1c],eax
 lea esi,[ebp-$1c]
 mov edi,[ebp-$18]

Nothing in there calls into ntdll.dll either.


The questions still remain:

  • why is pushing one variable onto the stack and loading two others into registers the bottleneck?
  • why is nothing inside PrepareCell itself the bottleneck?
delphi delphi-5
2 answers

Well, this problem was actually my main reason for creating my own sampling profiler:
http://code.google.com/p/asmprofiler/wiki/AsmProfilerSamplingMode

It may not be perfect, but you can give it a try. Let me know what you think of it.

By the way, I think this happens because almost all calls end up in kernel calls (memory requests, paint events, etc.). Only pure calculations need no kernel call. Most calls end up waiting for a result from the kernel:

 ntdll.dll!KiFastSystemCallRet 

You can see this in the thread stack view of Process Explorer, in Delphi, or by using the StackWalk64 API in the "Live view" of my AsmProfiler:
http://code.google.com/p/asmprofiler/wiki/ProcessStackViewer


There are probably two things going on.

The first is that SamplingProfiler identifies the caller by walking up the stack until it encounters what looks like a valid call point into Delphi code from Delphi code.

The catch is that some procedures reserve a large amount of stack at once without initializing it, so old data, including old return addresses, can still be sitting there. This can lead to a false positive. The only clue then would be that your false positive was called recently.
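To make the false positive concrete, here is a toy simulation in Python. It is only a sketch of the general "scan the stack for a plausible return address" idea described above, not SamplingProfiler's actual implementation; every address, the code range, and the helper names (guess_caller, build_stack) are made up for illustration.

```python
# Toy model of a "scan the stack for a plausible return address" heuristic.
# All addresses and ranges below are hypothetical.

DELPHI_CODE = range(0x00401000, 0x00500000)  # assumed address range of the EXE's code

STALE_RETURN = 0x0045A3C0   # left behind by an earlier, already finished call
REAL_RETURN = 0x0046B100    # return address of the code that is really waiting
NTDLL_RETURN = 0x77F01234   # return address inside ntdll.dll
USER32_RETURN = 0x77D05678  # return address inside another system DLL


def guess_caller(stack_words, code_range=DELPHI_CODE):
    """Scan from the top of the stack and return the first word that lies
    inside code_range, or None if no word does."""
    for word in stack_words:
        if word in code_range:
            return word
    return None


def build_stack(buffer_words):
    """Stack as sampled while the thread waits in ntdll (index 0 = top).
    buffer_words models a local buffer reserved by some procedure."""
    return [NTDLL_RETURN, USER32_RETURN, *buffer_words, REAL_RETURN]


# Buffer reserved but never written: it still holds an old return address.
dirty = build_stack([0x00000000, STALE_RETURN, 0x00000000])
# Same layout, but the buffer happens to contain only zeros.
clean = build_stack([0x00000000, 0x00000000, 0x00000000])

print(hex(guess_caller(dirty)))  # 0x45a3c0, the stale address: a false positive
print(hex(guess_caller(clean)))  # 0x46b100, the real caller
```

In the first case the scan stops at the leftover address inside the uninitialized buffer, so the sample is charged to a line that merely ran recently, not to the code that is actually waiting.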

The second thing is the attribution to ntdll itself, which is reliable; however, ntdll is your wait point in user space, and as user197220 said, ntdll is where you end up most of the time whenever you call into system code and wait for the result.

In your case, unless you reduced the sampling rate, you are looking at 247 ms of CPU time, which could quite possibly pass as idle if those 247 samples were gathered over many seconds of real time. Since the false positive points at the tree painting code, my bet is that the ntdll time is painting time (driver or OS software). To be sure, you can try commenting out the code that actually does the painting.
