Windows SuspendThread doesn't? (GetThreadContext fails)

We have a Win32 application in which one thread can stop another to inspect its state [PC, etc.] by doing SuspendThread / GetThreadContext / ResumeThread.

    if (SuspendThread((HANDLE)hComputeThread[threadId]) == (DWORD)-1) // freeze thread; DWORD is unsigned, so "< 0" can never be true
        ThreadOperationFault("SuspendThread","InterruptGranule");
    CONTEXT Context, *pContext;
    Context.ContextFlags = (CONTEXT_INTEGER | CONTEXT_CONTROL);
    if (!GetThreadContext((HANDLE)hComputeThread[threadId],&Context))
        ThreadOperationFault("GetThreadContext","InterruptGranule");

Extremely rarely, on a multi-core system, GetThreadContext fails with error code 5 (the Windows system error “Access Denied”).

The SuspendThread documentation explicitly states that the target thread is suspended if no error is returned. We check the return status of SuspendThread and ResumeThread; they never complain.
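A detail that matters for those status checks: SuspendThread and ResumeThread return a DWORD, which is unsigned, and signal failure with (DWORD)-1, so a comparison such as `< 0` can never detect an error. The following is a minimal sketch of the check, not the application's actual code; `dword_t`, `SUSPEND_CALL_FAILED` and `suspend_call_failed` are hypothetical names, with `dword_t` standing in for DWORD so the snippet builds without `<windows.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Win32 DWORD type (unsigned, 32 bits). */
typedef uint32_t dword_t;

/* SuspendThread and ResumeThread report failure as (DWORD)-1. */
#define SUSPEND_CALL_FAILED ((dword_t)-1)

/* Returns true when a SuspendThread/ResumeThread result signals failure.
   Any other value is the thread's previous suspend count. */
bool suspend_call_failed(dword_t result)
{
    return result == SUSPEND_CALL_FAILED;
}
```

With a real handle this would read `if (suspend_call_failed(SuspendThread(h))) ...`, followed by GetLastError() to find out why the call failed.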

How can I suspend a thread, yet be unable to access its context?

This blog post, http://www.dcl.hpi.uni-potsdam.de/research/WRK/2009/01/what-does-suspendthread-really-do/, suggests that SuspendThread, when it returns, may only have started the suspension of the other thread, which has not actually been suspended yet. In that case I can see how GetThreadContext would be problematic, but it seems like a dumb way to define SuspendThread. (How would the caller of SuspendThread find out when the target thread was actually suspended?)

EDIT: I lied. I said this was for Windows.

Well, the strange truth is that I don't see this behavior on Windows XP 64 (at least not in the last week, and I don't really know what happened before that) ... but we have been testing this Windows application under Wine on Ubuntu 10.x. The Wine source for the guts of GetThreadContext contains an “Access Denied” return on line 819, taken when an attempt to capture the thread state fails for some reason. I am guessing, but it appears that Wine believes a thread may simply not be accessible on repeated attempts. Why that would be true after a SuspendThread is beyond me, but there is the code. Thoughts?

EDIT2: I lied again. I said that we only saw the behavior under Wine. No ... we have now found a Vista Ultimate system that seems to produce the same error (again, rarely). So it looks like Wine and Windows agree on an obscure case. It also appears that merely enabling the Sysinternals Process Monitor program aggravates the situation and makes the problem appear on Windows XP 64; I suspect a Heisenbug. (Process Monitor does not even exist on the Wine-tasting (:-) machine or on the XP 64 system that I use for development.)

What on earth is going on?

EDIT3: September 15, 2010. I added careful checking of the error return status, without otherwise disturbing the code, for SuspendThread, ResumeThread and GetThreadContext. I have not seen any hint of this behavior on Windows systems since I did that. I have not gotten back to the Wine experiment.

November 2010: Strange. It seems that if I compile this under VisualStudio 2005, it fails on Windows Vista and 7, but not on earlier OSs. If I compile under VisualStudio 2010, it does not fail anywhere. One might point a finger at VisualStudio 2005, but I suspect a location-sensitive problem, and the different optimizers in VS 2005 and VS 2010 place the code in slightly different places.

November 2012: The saga continues. We see this failure on a number of XP and Windows 7 machines at a fairly low rate (once every several thousand runs). Our suspend operations are applied to threads that mostly execute pure compute code, but that sometimes make calls into Windows. I do not recall seeing this problem when the thread's PC was in our compute code. Of course, I cannot see the PC of the thread when it hangs, because GetThreadContext will not give it to me, so I cannot directly confirm that the problem only occurs while executing system calls. But all of our system calls are channeled through one point, and so far the evidence is that this point was being executed whenever we get the hang. So the indirect evidence suggests that GetThreadContext on a thread only fails if that thread is executing a system call. I have not yet had the energy to build a critical experiment to test this hypothesis.

+4
5 answers

Let me quote from Richter/Nasarre, “Windows via C/C++, 5th Edition,” which may shed some light:

DWORD SuspendThread (HANDLE hThread);

Any thread can call this function to suspend another thread (as long as you have the thread's handle). It goes without saying (but I'll say it anyway) that a thread can suspend itself but cannot resume itself. Like ResumeThread, SuspendThread returns the thread's previous suspend count. A thread can be suspended as many as MAXIMUM_SUSPEND_COUNT times (defined as 127 in WinNT.h). Note that SuspendThread is asynchronous with respect to kernel-mode execution, but user-mode execution does not occur until the thread is resumed.

In real life, an application must be careful when it calls SuspendThread because you have no idea what the thread might be doing when you attempt to suspend it. If the thread is attempting to allocate memory from a heap, for example, the thread will have a lock on the heap. As other threads attempt to access the heap, their execution will be halted until the first thread is resumed. SuspendThread is safe only if you know exactly what the target thread is (or might be) doing, and you take extreme measures to avoid problems or deadlocks caused by suspending the thread.

...

Windows actually lets you look inside a thread's kernel object and grab its current set of CPU registers. To do this, you simply call GetThreadContext:

BOOL GetThreadContext (HANDLE hThread, PCONTEXT pContext);

To call this function, just allocate a CONTEXT structure, initialize some flags (the structure's ContextFlags member) indicating which registers you want to get back, and pass the address of the structure to GetThreadContext. The function then fills in the members you requested.

You should call SuspendThread before calling GetThreadContext; otherwise, the thread might be scheduled, and the thread's context might differ from what you get back. A thread actually has two contexts: user mode and kernel mode. GetThreadContext can return only the user-mode context of a thread. If you call SuspendThread to stop a thread but that thread is currently executing in kernel mode, its user-mode context is stable even though SuspendThread has not actually suspended the thread yet. But the thread cannot execute any more user-mode code until it is resumed, so you can safely consider the thread suspended and GetThreadContext will work.

My guess is that GetThreadContext can fail if you have just called SuspendThread while the thread is in kernel mode, and the kernel is locking the thread's context block at that moment.

Perhaps on multi-core systems, one core is handling the kernel-mode execution of the thread whose user-mode execution was just suspended, and it still holds a lock on the thread's CONTEXT structure at exactly the moment the other core calls GetThreadContext.

Since this behavior is not documented, I suggest contacting Microsoft.

+3

There are some special issues associated with suspending a thread that owns a CriticalSection. I can't find a good link right now, but Raymond Chen mentions it, and there is another mention on Chris Brumme's blog. Basically, if you are unlucky enough to call SuspendThread while the thread is holding an OS lock (for example, the heap lock, the DllMain lock, etc.), then really strange things can happen. I would guess that this is the case you are hitting, extremely rarely.

Does retrying the GetThreadContext call work after yielding the processor, for example with Sleep(0)?
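The retry suggested above could be sketched as follows. This is a minimal sketch under stated assumptions, not a confirmed fix: `retry_with_yield`, `flaky_attempt`, `context_request`, `try_get_context` and `yield_cpu` are hypothetical names, and the attempt limit is arbitrary. The generic loop is portable; only the `_WIN32` section uses the real GetThreadContext and Sleep(0) calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*attempt_fn)(void *arg);   /* returns true on success */
typedef void (*yield_fn)(void);          /* gives up the processor  */

/* Calls attempt(arg) up to max_attempts times, yielding between failures.
   Returns the 1-based number of the successful attempt, or 0 if all failed. */
int retry_with_yield(attempt_fn attempt, void *arg, yield_fn yield, int max_attempts)
{
    assert(attempt != NULL && max_attempts > 0);
    for (int i = 1; i <= max_attempts; ++i) {
        if (attempt(arg))
            return i;
        if (yield != NULL && i < max_attempts)
            yield();
    }
    return 0;
}

/* Illustration-only attempt: fails until the counter it points at reaches 0. */
bool flaky_attempt(void *arg)
{
    int *remaining_failures = arg;
    if (*remaining_failures > 0) {
        --*remaining_failures;
        return false;
    }
    return true;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical bundle of the arguments GetThreadContext needs.
   The caller must set ctx->ContextFlags before the first attempt. */
struct context_request { HANDLE thread; CONTEXT *ctx; };

bool try_get_context(void *arg)
{
    struct context_request *req = arg;
    return GetThreadContext(req->thread, req->ctx) != 0;
}

void yield_cpu(void) { Sleep(0); }
#endif
```

On Windows, `retry_with_yield(try_get_context, &req, yield_cpu, 10)` would return nonzero if the context was eventually obtained. A result greater than 1 would itself be informative, since it would show that the failure is transient.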

+2

An old question, but it is nice to see that you have kept updating it with status changes for more than two years after the problem first appeared.

The reason for your problem is a bug in the WoW64 translation layer of 64-bit Windows, as described in:

http://social.msdn.microsoft.com/Forums/en/windowscompatibility/thread/1558e9ca-8180-4633-a349-534e8d51cf3a

There is a rather critical bug in GetThreadContext under WoW64 that makes it return stale contents, which makes it unusable in many situations. The contents are stored in user mode. This is why you think the value is not null, but in the stale contents it is still null.

That is why it fails on newer operating systems but not on older ones; try running it on 32-bit Windows 7.

As to why this error seems to happen less often with solutions built with Visual Studio 2010/2012, it is likely that the compiler is doing something that mitigates most of the problem. To find out, you should compare the code generated by 2005 and by 2010 and see what the differences are. For example, does the problem still occur if the project is built without optimization?

Finally, some further reading:

http://www.nynaeve.net/?p=129

+2

Perhaps a thread safety issue. Are you sure that the hComputeThread structure is not changing out from under you? Maybe the thread was exiting when you called suspend? That could let the suspend succeed, but by the time you call get context the thread is gone and the handle is invalid.
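One way to test this theory is to probe the thread handle with a zero-timeout wait, since a thread handle becomes signaled when the thread terminates. The following is a minimal sketch under stated assumptions: `classify_zero_timeout_wait`, `probe_thread`, the `thread_state` enum and the `WAIT_RESULT_*` constants are hypothetical names, with the constants mirroring the documented values of WAIT_OBJECT_0, WAIT_TIMEOUT and WAIT_FAILED so that the classifier builds without `<windows.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins mirroring the documented Win32 wait results. */
#define WAIT_RESULT_SIGNALED 0x00000000u   /* WAIT_OBJECT_0 */
#define WAIT_RESULT_TIMEOUT  0x00000102u   /* WAIT_TIMEOUT  */
#define WAIT_RESULT_FAILED   0xFFFFFFFFu   /* WAIT_FAILED   */

typedef enum { THREAD_ALIVE, THREAD_EXITED, THREAD_HANDLE_BAD } thread_state;

/* Interprets the result of WaitForSingleObject(thread, 0) on a thread handle. */
thread_state classify_zero_timeout_wait(uint32_t wait_result)
{
    switch (wait_result) {
    case WAIT_RESULT_SIGNALED:
        return THREAD_EXITED;      /* handle signaled: the thread has terminated */
    case WAIT_RESULT_TIMEOUT:
        return THREAD_ALIVE;       /* not signaled: still running or suspended */
    default:
        return THREAD_HANDLE_BAD;  /* WAIT_FAILED or an unexpected value */
    }
}

#ifdef _WIN32
#include <windows.h>

/* Probes a real thread handle without blocking. */
thread_state probe_thread(HANDLE thread)
{
    return classify_zero_timeout_wait((uint32_t)WaitForSingleObject(thread, 0));
}
#endif
```

Calling `probe_thread` right after a failed GetThreadContext would distinguish "the thread already exited" from "the thread is alive but its context was refused", which are the two cases this answer and the others disagree on.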

0

“Calling SuspendThread on a thread that owns a synchronization object, such as a mutex or a critical section, can lead to a deadlock if the calling thread tries to obtain a synchronization object owned by a suspended thread.” - MSDN

0
