Win32 Application Debugging

I am having trouble finding the cause of a hang in a Win32 application. The software renders some data to an OpenGL visual in a tight loop:

    std::vector<uint8_t> indices;

    glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_DOUBLE, 0, vertexDataBuffer);

    while (...) {
        // get index type (1, 2, 4) and index count
        indices.resize(indexType * count);

        // get indices into "indices" buffer
        getIndices(indices.data(), indices.size()); //< seems to hang here!

        // draw (I'm using the correct parameters)
        glDrawElements(GL_TRIANGLES_*, count, GL_UNSIGNED_*);
    }

    glDisableClientState(GL_VERTEX_ARRAY);

The code is compiled with VC11 Update 1 (CTP 3). When I run the optimized binary, it hangs inside the call to getIndices() (more on this below) after several iterations of this loop. I have already:

  • triple-checked all buffers, and even added CRCs, to make sure I don't have buffer overflows
  • added a HeapValidate() call inside the loop to make sure that the heap is not corrupted
  • used Application Verifier
  • enabled heap allocation monitoring using GFlags and PageHeap
  • broken into WinDbg when the application hangs
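The buffer checks listed above can be approximated with guard regions around the payload. The following is only a minimal sketch of that idea, not the code from the question: GuardedBuffer, kGuardSize and kGuardByte are hypothetical names, and a fixed byte pattern stands in for the CRC mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constants: size and fill pattern of each guard region.
constexpr std::size_t kGuardSize = 16;
constexpr std::uint8_t kGuardByte = 0xA5;

// Hypothetical helper (not from the original post): a byte buffer with a
// guard region before and after the payload, so that a stray write just
// outside the payload can be detected afterwards.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t payloadSize)
        : storage_(payloadSize + 2 * kGuardSize, kGuardByte),
          payloadSize_(payloadSize) {}

    // Pointer to the usable payload, located between the two guard regions.
    std::uint8_t* data() { return storage_.data() + kGuardSize; }
    std::size_t size() const { return payloadSize_; }

    // Returns true if both guard regions still hold the guard pattern.
    bool guardsIntact() const {
        for (std::size_t i = 0; i < kGuardSize; ++i) {
            if (storage_[i] != kGuardByte) return false;
            if (storage_[kGuardSize + payloadSize_ + i] != kGuardByte) return false;
        }
        return true;
    }

    // Debug-build check, intended to be called after every fill of the payload.
    void verify() const { assert(guardsIntact() && "write outside the payload"); }

private:
    std::vector<std::uint8_t> storage_;
    std::size_t payloadSize_;
};
```

Calling verify() right after each getIndices() call would catch an overrun or underrun of up to kGuardSize bytes. A check like this only sees writes made through this one buffer; it says nothing about corruption elsewhere on the heap.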

I did not find any problems with the code accessing the allocated buffer, and no heap corruption. However, if I turn off the low-fragmentation heap, the problem disappears. It also disappears if I use a separate (low-fragmentation) heap for the indices buffer.

Anyway, here is the stack trace leading to the deadlock:

    0:000> kb
    ChildEBP RetAddr  Args to Child
    0034e328 77b039c3 00000000 0034e350 00000000 ntdll!ZwWaitForKeyedEvent+0x15
    0034e394 77b062bc 77b94724 080d36a8 0034e464 ntdll!RtlAcquireSRWLockExclusive+0x12e
    0034e3c0 77aeb652 0034e464 0034e4b4 00000000 ntdll!RtlpCallVectoredHandlers+0x58
    0034e3d4 77aeb314 0034e464 0034e4b4 77b94724 ntdll!RtlCallVectoredExceptionHandlers+0x12
    0034e44c 77aa0133 0034e464 0034e4b4 0034e464 ntdll!RtlDispatchException+0x19
    0034e44c 77b062c5 0034e464 0034e4b4 0034e464 ntdll!KiUserExceptionDispatcher+0xf
    0034e7bc 77aeb652 0034e860 0034e8b0 00000000 ntdll!RtlpCallVectoredHandlers+0x61
    0034e7d0 77aeb314 0034e860 0034e8b0 0034ec28 ntdll!RtlCallVectoredExceptionHandlers+0x12
    0034e848 77aa0133 0034e860 0034e8b0 0034e860 ntdll!RtlDispatchException+0x19
    0034e848 1c43c666 0034e860 0034e8b0 0034e860 ntdll!KiUserExceptionDispatcher+0xf
    0034ebe8 1c43c4e5 0034ec28 080d35d0 080d35d6 lcdb4!lc::db::PackedIndices::unpackIndices<unsigned char>+0x86
    0034ec14 1c45922d 0034ec28 080d35d0 00000006 lcdb4!lc::db::PackedIndices::unpack+0xb5
    ...
    xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx getIndices

For completeness, I posted the code of lc::db::PackedIndices::unpackIndices(), including all the code added for debugging, at http://ideone.com/sVVXX7 .

The statement that ends up in KiUserExceptionDispatcher is (*p++) = static_cast<T>(index); ( mov dword ptr [esp+10h],eax ).

I just can't understand what is happening. It seems that an exception has been raised, but none of my exception handlers is called. The application just freezes. I checked for locked critical sections ( !locks ), but found none. I also don't understand why an exception would be raised at all, since all the memory locations are valid. Can someone give me some advice?

Update

I tried to find out which exception was raised:

    0:000> s -d esp L1000 1003f
    0028ebdc  0001003f 00000000 00000000 00000000  ?...............
    0028efd8  0001003f 00000000 00000000 00000000  ?...............
    0:000> .cxr 0028ebdc
    eax=77b94724 ebx=0804be30 ecx=00000002 edx=00000004 esi=77b94724 edi=0804be28
    eip=77b062c5 esp=0028eec4 ebp=0028eee4 iopl=0 nv up ei ng nz na pe cy
    cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010287
    ntdll!RtlpCallVectoredHandlers+0x61:
    77b062c5 ff03 inc dword ptr [ebx] ds:002b:0804be30=00000001
    0:000> .cxr 0028efd8
    eax=0000003b ebx=00000001 ecx=0804bd98 edx=0028f340 esi=0028f340 edi=04b77580
    eip=1c43c296 esp=0028f2c0 ebp=0028f2fc iopl=0 nv up ei pl nz na po nc
    cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
    lcdb4!lc::db::PackedIndices::unpackIndices<unsigned char>+0x36:
    1c43c296 8801 mov byte ptr [ecx],al ds:002b:0804bd98=3e
3 answers

The thread is hung waiting for an exclusive SRW (slim reader/writer) lock that belongs to the OS exception-dispatching code. The exception itself is raised by your code. The exact exception and its details can be found from the following stack frame:

    0034e848 77aa0133 0034e860 0034e8b0 0034e860 ntdll!RtlDispatchException+0x19

The first argument to RtlDispatchException is a pointer to an EXCEPTION_RECORD. So if you type .exr 0034e860, you can see the exception record. From the exception record you can tell which address was being accessed when the exception occurred (if the exception is an access violation).
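To illustrate how an access-violation record is read, here is a minimal sketch. ExceptionRecordView is a simplified, hypothetical mirror of the few EXCEPTION_RECORD fields involved (the real structure comes from <windows.h>), and describe() is an assumed helper name. For EXCEPTION_ACCESS_VIOLATION, the first ExceptionInformation entry gives the kind of access (0 read, 1 write, 8 DEP violation) and the second gives the inaccessible address.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Simplified, hypothetical mirror of the EXCEPTION_RECORD fields used here.
struct ExceptionRecordView {
    std::uint32_t code;            // ExceptionCode
    std::uint64_t address;         // ExceptionAddress (faulting instruction)
    std::uint32_t parameterCount;  // NumberParameters
    std::uint64_t information[2];  // first two ExceptionInformation entries
};

constexpr std::uint32_t kAccessViolation = 0xC0000005u;  // EXCEPTION_ACCESS_VIOLATION

// Formats a record roughly the way a debugger summarises it.
std::string describe(const ExceptionRecordView& rec) {
    assert(rec.parameterCount <= 15);  // EXCEPTION_MAXIMUM_PARAMETERS is 15
    std::ostringstream out;
    out << std::hex;
    if (rec.code != kAccessViolation || rec.parameterCount < 2) {
        out << "exception 0x" << rec.code << " at 0x" << rec.address;
        return out.str();
    }
    const char* kind = "unknown access to";
    switch (rec.information[0]) {
        case 0: kind = "read from"; break;
        case 1: kind = "write to"; break;
        case 8: kind = "DEP violation at"; break;
    }
    out << "access violation: " << kind << " 0x" << rec.information[1]
        << " (instruction at 0x" << rec.address << ")";
    return out.str();
}
```

In a real vectored or structured exception handler the same fields would be read from the EXCEPTION_RECORD that the OS passes in; the struct above exists only so the sketch compiles without the Windows headers.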

Since, after these steps, you found that the access violation was caused by a write to an address that you had legitimately allocated on the heap, you can look up the protection attributes of the virtual page containing that address with the !vprot <address> command (or !address <address>).

Now that you have learned that the protection attributes of the pages at these heap addresses were changed (by some code) to PAGE_READONLY, and having seen the call stacks of the other threads, I have the following hypothesis, which I think may help you find the root cause.

I suspect that the Windows heap manager changes the page attributes before raising an exception to signal heap corruption. Judging by the call stacks of the other threads that you showed, there seems to be some corruption in the OLE heap. The root cause is probably code that corrupts the heap. The heap manager later detects the corruption and raises an exception; the OS exception-dispatching code then runs and hangs on the SRW lock before it can call an exception handler in your code or in any other library. After that, another, unsuspecting thread in your code legitimately touches heap memory that the heap manager has already protected because it detected the corruption. That raises a second exception, and its dispatch runs into the same deadlock. Given that you said the problem does not reproduce when the program runs under the debugger, it would be worth checking whether the problem is timing-related or a race condition.


The stack trace tells the story. Your program crashed; odds are good that it is an access violation exception, the typical failure mode for C++ code and usually caused by heap corruption. Windows then tries to call exception filters to look for any code that is willing to handle the exception. First in line are the handlers installed with AddVectoredExceptionHandler(). Windows has to take a lock here to guard against re-entrancy in case one of these handlers crashes in turn.
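The re-entrancy problem can be shown with a toy model. This is only a sketch under loud assumptions: it is not how ntdll implements exception dispatch, and HandlerListModel and its members are hypothetical names. It models an exclusive, non-recursive lock around the handler list; when a handler faults and triggers a nested dispatch on the same thread, the nested attempt finds the lock already held. Where the model merely reports this, a blocking acquire would wait forever, which is one possible reading of the two RtlpCallVectoredHandlers frames in the question's stack trace.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model only, not the real dispatcher. All names are hypothetical.
class HandlerListModel {
public:
    using Handler = std::function<void(HandlerListModel&)>;

    void addHandler(Handler h) {
        assert(h && "handler must be callable");
        handlers_.push_back(std::move(h));
    }

    // Models dispatching one exception: take the exclusive, non-recursive
    // lock, then walk the handler list. Returns false when the lock is
    // already held; a real blocking acquire would never return in that case
    // if the holder is the calling thread itself.
    bool dispatch() {
        if (locked_.test_and_set(std::memory_order_acquire)) {
            log.push_back("nested dispatch: lock already held, would block");
            return false;
        }
        log.push_back("dispatch: lock acquired");
        for (auto& h : handlers_) h(*this);
        locked_.clear(std::memory_order_release);
        log.push_back("dispatch: lock released");
        return true;
    }

    std::vector<std::string> log;  // trace of what happened, in order

private:
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
    std::vector<Handler> handlers_;
};
```

Registering a handler that itself calls dispatch() stands in for a handler (or the handler-list walk) faulting while the lock is held. The outer dispatch completes in the model only because the nested one gives up instead of blocking.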

And that is where the buck stops. Why is not clear from the stack trace. It could be that another thread also crashed on the heap corruption, is busy handling the exception, and holds the lock. Use Debug + Windows + Threads to look at the other threads. But it could also be that the process state is so badly corrupted that the lock object itself is damaged as well. That is rare, but it does happen.

And yes, turning off the low-fragmentation heap is quite capable of hiding heap corruption. The memory layout will be very different, so whatever code causes the corruption may now stomp on something innocent. That, of course, is not a fix.
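The effect of layout on whether a bug shows up can be simulated without any real heap. The sketch below is a toy, with hypothetical names (Arena, writeBytes, neighbourIntact) and made-up offsets; it does not reflect how the low-fragmentation heap actually places blocks. All writes stay inside one array, so the "overflow" is simulated and well defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy arena standing in for a heap; zero-initialised.
struct Arena {
    std::array<std::uint8_t, 64> bytes{};
};

// Writes `count` bytes of 0xFF starting at `offset`, modelling code that
// writes more bytes than the block at `offset` was meant to hold.
void writeBytes(Arena& arena, std::size_t offset, std::size_t count) {
    assert(offset + count <= arena.bytes.size());  // stay inside the toy arena
    for (std::size_t i = 0; i < count; ++i) arena.bytes[offset + i] = 0xFF;
}

// Returns true if the block [offset, offset + size) is still all zero.
bool neighbourIntact(const Arena& arena, std::size_t offset, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        if (arena.bytes[offset + i] != 0) return false;
    }
    return true;
}
```

With an 8-byte block at offset 0 and a neighbour directly behind it at offset 8, a 10-byte write damages the neighbour. With the neighbour placed at offset 16 instead, the same buggy write lands in unused padding and nothing visibly breaks, although the bug is still there.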

Use Debug + Exceptions and tick the Thrown checkbox for Win32 Exceptions. The debugger will now stop when the exception is raised. At least you will find out which exception it is. Ultimately, you need to find out where the heap corruption occurs. That is never in the code that crashed; good luck debugging it.


If you are using an ATI graphics card (with ATI drivers), this is a known issue, provided that there is no other memory corruption elsewhere in your code.

Try disabling every client state you can (glDisableClientState), and use APITrace to find out which one you forgot.

One easy way to check for memory corruption in the graphics driver is either to test with another card/driver, or to force software rendering.
