Generating a crash report when my application hangs on the client machine

I work with a somewhat unreliable application (Qt / Windows), partially written for us by a third party (just trying to shift the blame there). Their latest version is more stable. Sort of. We get fewer crash reports, but we get a lot of reports that it just hangs and never returns. The circumstances vary, and with the little information we can collect, we have been unable to reproduce the problems.

Ideally, I would like to create a kind of watchdog that notices that the application has hung and offers to send a crash report back to us. Good idea, but there are problems:

  • How does the watchdog know that the process has hung? Presumably we instrument the application to periodically tell the watchdog "everything is OK", but where do we put that call so that it is hit often enough, yet is unlikely to be on the code path the application is stuck in when it hangs?

  • What information should the watchdog report when a hang occurs? Windows has a decent debugging API, so I'm sure all the interesting data is available, but I'm not sure which of it would be useful for tracking down the problems.

+4
4 answers

You need a combination of minidumps (use Dr. Watson to create them if you do not want to add your own minidump generation code) and userdump to trigger creation of a minidump when the application hangs.

The thing about automatic hang detection is that it is difficult to decide when something has really hung and when it is just slow or blocked on an I/O wait. I personally prefer to have the user trigger the dump deliberately when they think the application has hung. Besides being much simpler (my applications don't hang often, if at all :)), it also helps them "be part of the solution". They like that.

First, check out the classic Bugslayer article on crash dumps and symbols, which also has some excellent background on what is going on with these things.

Second, get userdump, which lets you create the dumps, along with the instructions for configuring it to generate them.

When you have a dump, open it in WinDbg and you can inspect the entire state of the program, including threads and call stacks, registers, memory, and function parameters. I think you will be particularly interested in the "~*kp" command in WinDbg, which gets the call stack of every thread, and the "!locks" command, which shows all locking objects. I think you will find that the hang is due to a deadlock on synchronization objects. That will be difficult to track down, because all the threads tend to be waiting in WaitForSingleObject, but look further down the call stacks to find the application threads (rather than framework threads such as background notifications and network routines). Once you have narrowed them down, you can see what calls were being made, and perhaps add some logging to the application to give you more information the next time it fails.

Good luck.

PS. A quick Google search reminded me of this: Debugging Deadlocks. (CDB is the command-line equivalent of WinDbg.)

+5

You can use ADPlus from Microsoft's Debugging Tools for Windows to catch hangs. It attaches to your process and creates a dump (mini or full) when the process hangs or crashes.

WinDbg is portable and does not need to be installed (you do need to configure symbols, though). You can create a custom installation that launches your application from a batch file which also starts ADPlus once your application is running (ADPlus is a command-line tool, so you should be able to find a way to integrate it somehow).

By the way, if you find a way to detect the hang from inside the application and are able to crash the process, you can register with Windows Error Reporting so that the crash dump is sent to you (if the user allows it).

+2

I think that a separate application to do the watchdogging will most likely cause more problems than it solves. I would suggest that you instead first create handlers that generate minidumps when the application crashes, and then add a watchdog thread to the application which will DELIBERATELY crash it if the application goes off the rails. An advantage of a watchdog thread (versus another application) is that it should be easier for the watchdog to know that the application has gone off the rails.
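A minimal sketch of such an in-process watchdog thread, using only the C++ standard library. The class name HangWatchdog, the kick() method and the onHang callback are hypothetical names chosen for illustration, not part of Qt or the Windows API. The default action is std::abort(), standing in for whatever deliberately crashes the process so that the crash handler gets a chance to write a dump; whether abort actually reaches your dump handler depends on how that handler is installed.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: fires onHang if kick() is not called within `timeout`.
class HangWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    explicit HangWatchdog(std::chrono::milliseconds timeout,
                          std::function<void()> onHang = [] { std::abort(); })
        : timeout_(timeout),
          onHang_(std::move(onHang)),
          lastKick_(Clock::now().time_since_epoch().count()),
          worker_([this] { run(); })  // started last, after all other members
    {
        assert(timeout.count() > 0);
    }

    ~HangWatchdog() {
        stop_.store(true);
        worker_.join();
    }

    HangWatchdog(const HangWatchdog&) = delete;
    HangWatchdog& operator=(const HangWatchdog&) = delete;

    // Heartbeat: call this regularly from the thread being monitored.
    void kick() { lastKick_.store(Clock::now().time_since_epoch().count()); }

private:
    void run() {
        // Poll a few times per timeout period so a hang is noticed promptly.
        const auto poll = std::max(timeout_ / 4, std::chrono::milliseconds(1));
        while (!stop_.load()) {
            std::this_thread::sleep_for(poll);
            const Clock::duration idle =
                Clock::now().time_since_epoch() - Clock::duration(lastKick_.load());
            if (idle > timeout_) {
                onHang_();  // by default this aborts the process
                return;
            }
        }
    }

    const std::chrono::milliseconds timeout_;
    const std::function<void()> onHang_;
    std::atomic<Clock::rep> lastKick_;
    std::atomic<bool> stop_{false};
    std::thread worker_;  // must be declared last (see constructor)
};
```

In a Qt application, one plausible place to call kick() is a slot driven by a QTimer on the GUI thread, so that a stalled event loop stops the heartbeats. The timeout has to be generous enough that a legitimately slow operation or a long I/O wait is not mistaken for a hang.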

Once you have MiniDumps, you can poke around to find out the state of the application when it died. This should give you enough information to figure out the problem, or at least where to look next.

CodeProject has an article on MiniDumps that can serve as a useful example. MSDN also has more information about them.

+1

Do not bother with the watchdog. Sign up for Microsoft Windows Error Reporting (winqual.microsoft.com). They collect the stack traces for you. In fact, it is likely that they are already collecting them today; they just don't share them with you until you sign up.

+1
