How to record or play back lines or instructions executed just before a crash

I often have to debug crashing C++ programs on Windows where I can reproduce the crash, but it is hard to determine which sequence of instructions caused it (for example, another thread overwriting the memory of the crashing thread). In such cases even the call stack does not help. I usually resort to narrowing down the cause by commenting out sections of the source code, but this is very tedious.

Does anyone know of a tool for Windows that can report or replay the last few source lines or machine instructions executed in all threads immediately before the crash? That is, something like gdb's reverse debugging or like Mutek BugTrapper (which is no longer available). I am looking for a released and stable tool (I know of SoftwareVerify Error Checker and Hexray IDA Pro 6.3 Trace Replayer, both of which are still in closed beta programs).

What I have already tried are the WinDbg trace commands wt and ta @$ra, but both have the disadvantage that they stop automatically after a few seconds. I need trace commands that keep running until the crash occurs and that trace all threads of the running program.

NOTE: I am not looking for a debugging tool designed to find one specific kind of problem, like gflags, pageheap, Memory Validator, Purify, etc. I am looking for a released and stable tool for tracing or replaying at the instruction level.

+8
c++ debugging windows windbg
6 answers

I found a solution: "replay debugging" with VMware Workstation and Visual Studio 2010. Setting it up takes a lot of time, but you get a Visual Studio C++ debugger that can debug backwards in time. Here is a video demonstrating how replay debugging works: http://blogs.vmware.com/workstation/2010/01/replay-debugging-try-it-today.html

The drawback of this solution is that VMware seems to have discontinued replay debugging in recent versions of VMware Workstation. Moreover, only some processor types support replay. I did not find a comprehensive list of supported processors; I tested the replay feature on my three PCs: replay did not work on the Core i7 200, but it did work on the Core2 6700 and on the Core2 Q9650.

I really hope that VMware reconsiders and reintroduces replay debugging in future versions of VMware Workstation, because it really adds a new dimension to debugging.

For those who are interested, here is a description of how to set up your environment for replay debugging:

In the description below, “local debugging” means that Visual Studio and VMware are installed on the same PC. “Remote debugging” means that Visual Studio and VMware are installed on different computers.

  • Install Visual Studio 2010 Service Pack 1 (SP1) on the host system.

  • Verify that Visual Studio is configured to use the Microsoft symbol servers (under "Tools | Options | Debugging | Symbols").

  • On the host system, install "Debugging Tools for Windows."

  • Install VMware Workstation 7.1. (Version 8.0 no longer includes the replay debugging feature.) This also installs the plug-in for Visual Studio.

  • Create a virtual machine (VM) in VMware and install Windows XP SP3 in it.

  • If the application under test is a debug build, install the Visual Studio debug DLLs in the virtual machine. (See http://msdn.microsoft.com/en-us/library/dd293568.aspx for instructions on how to do this, but use the "Debug" configuration instead of "Release".)

  • Copy "gflags.exe" from the "Debugging Tools for Windows" directory to the virtual machine, run gflags.exe in the virtual machine, select "Disable paging of kernel stacks" on the "System Registry" tab, and click OK. Restart the virtual machine.

  • Copy all EXE and DLL files of the application under test to the virtual machine and make sure that you can run the application and reproduce the problem.

  • Shut down the virtual machine and create a snapshot (via the "Take Snapshot" context menu item in VMware Workstation).

  • (Remote debugging only:) Run the following command on the Visual Studio PC and enter an arbitrary passcode:

    C:\Program Files\VMware\VMware Workstation\Visual Studio Integrated Debugger\dclProxy.exe hostname

    Replace hostname with the name of the PC.

  • (Remote debugging only:) Create a recording of the virtual machine manually. That is, log in to the VM's operating system, start recording (via the "Record" context menu item), launch the application under test, and perform the steps needed to reproduce the problem. Then stop and save the recording.

  • Launch Visual Studio and go to "VMware | Options | Replay Debugging in VM | General" and set the following values:

    • "Local or Remote" should be set to "Local" for local debugging or to "Remote" for remote debugging.
    • "Virtual Machine" must be set to the path of the VM's .vmx file.
    • "Remote Machine Passcode" must be set to the passcode you entered above (remote debugging only).
    • "Playback Record" should be set to the name of the recording that you previously created with VMware.
    • "Host Executable Search Path" must be set to the directory in which you keep the DLLs that the application under test requires and that Visual Studio needs in order to display correct stack traces.

    Click "Apply."

  • Go to "VMware | Options | Replay Debugging in VM | Pre-Record Event" and set the following value:

    • “Basic snapshot for recording”: name of the previously created snapshot.

    Click OK.

  • (Local debugging only:) In Visual Studio, select "VMware | Create Recording for Replay"; this restarts the virtual machine. Log in to the virtual machine, launch the application under test, and perform the steps needed to reproduce the problem. Then stop and save the recording.

  • Select "VMware | Start Replay Debugging". VMware now automatically restarts the virtual machine and the application under test and replays the recorded actions. Wait until the application crashes; the Visual Studio debugger is then activated automatically.

  • In the Visual Studio debugger, set a breakpoint at a location you believe the application passed through before the crash. Then select "VMware | Reverse Continue". The debugger now runs backwards to the breakpoint. This can take a while, because the virtual machine is restarted and replayed until the breakpoint is reached. (You can speed this up by adding a snapshot a few seconds before the crash occurs. You can add further snapshots during replay debugging.)

  • Once VMware has replayed the virtual machine up to the breakpoint, you can use "Step Over" and "Step Into" to move forward from there, i.e. you replay the recorded history of events until you reach a point where you can determine why your application crashed.
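To see why "Reverse Continue" is slow and why additional snapshots help, here is a minimal C++ sketch of the general record-and-replay idea. It is a toy model and not VMware's implementation: `Machine`, `Replayer`, `add_snapshot` and `go_to` are hypothetical names, and the "machine" is simply a deterministic function of its recorded inputs. Going back in time means restoring the nearest earlier snapshot and replaying forward from it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy deterministic "machine": its state depends only on the inputs consumed.
struct Machine {
    long state = 0;
    std::size_t step = 0;  // number of recorded inputs consumed so far
    void apply(int input) { state = state * 31 + input; ++step; }
};

// Hypothetical replayer (illustration only, not how VMware does it).
class Replayer {
public:
    explicit Replayer(std::vector<int> recording)
        : recording_(std::move(recording)) {
        snapshots_[0] = Machine{};  // the base snapshot taken before recording
    }

    // Remember a copy of the machine so later requests can start from it.
    void add_snapshot(const Machine& m) { snapshots_[m.step] = m; }

    // Reach the state after `target` steps: restore the nearest snapshot at or
    // before `target`, then replay the recorded inputs forward.
    // `replayed` (optional) receives the number of steps that had to be replayed.
    Machine go_to(std::size_t target, std::size_t* replayed = nullptr) const {
        assert(target <= recording_.size());
        auto it = snapshots_.upper_bound(target);  // first snapshot after target
        --it;                                      // nearest one at or before it
        Machine m = it->second;
        std::size_t count = 0;
        while (m.step < target) {
            m.apply(recording_[m.step]);
            ++count;
        }
        if (replayed) *replayed = count;
        return m;
    }

private:
    std::vector<int> recording_;
    std::map<std::size_t, Machine> snapshots_;
};
```

With only the base snapshot, going to step 6 of a recording replays six inputs. After `add_snapshot(go_to(5))`, the same request replays just one, which is the effect of adding a snapshot shortly before the crash.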

Further information: http://www.replaydebugging.com/

+4

If you are dealing with another thread overwriting the memory of the crashing thread, it is useful to use gflags (GFlags and PageHeap). Instead of telling you the last few lines that were executed before the crash, it tells you exactly where your code overwrites a correctly allocated memory block.

First, enable this type of checking:

    gflags /p /enable your_app.exe /full

or

    gflags /p /enable your_app.exe /full /backwards

Verify that it was enabled correctly:

    gflags /p

Run the application and collect the dump files, then turn the checking off again:

    gflags /p /disable your_app.exe

Update 1

It does not detect every problem of the form *p = 0; where p is an invalid pointer, but at least some of them are detected.
For example:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int *p = new int;
        printf("1) p=%p\n",p);
        *p = 1;
        delete p;
        printf("2) p=%p\n",p);
        *p = 2;
        printf("Done\n");
        return 0;
    }

When I run this with gflags enabled, I get a dump file, and the problem is identified correctly:

    STACK_TEXT:
    0018ff44 00401215 00000001 03e5dfb8 03dfdf48 mem_alloc_3!main+0x5b [c:\src\tests\test.cpp\mem_alloc\mem_alloc\mem_alloc.3.cpp @ 11]
    0018ff88 75f8339a 7efde000 0018ffd4 77bb9ef2 mem_alloc_3!__tmainCRTStartup+0x10f [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 586]
    0018ff94 77bb9ef2 7efde000 2558d82c 00000000 kernel32!BaseThreadInitThunk+0xe
    0018ffd4 77bb9ec5 004013bc 7efde000 00000000 ntdll!__RtlUserThreadStart+0x70
    0018ffec 00000000 004013bc 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b

    STACK_COMMAND: ~0s; .ecxr ; kb

    FAULTING_SOURCE_CODE:
         7: printf("1) p=%p\n",p);
         8: *p = 1;
         9: delete p;
        10: printf("2) p=%p\n",p);
    >   11: *p = 2;
        12: printf("Done\n");
        13: return 0;
        14:
        15: }

Update 2

Another example from @fmunkert:

    #include <stdio.h>

    int main()
    {
        int *p = new int;
        printf("1) p=%p\n",p);
        *p = 1;
        p++;
        printf("2) p=%p\n",p);
        *p = 2; // <==== Illegal memory access
        printf("Done\n");
        return 0;
    }

gflags /p /enable mem_alloc.3.exe /full /unaligned

    STACK_TEXT:
    0018ff44 00401205 00000001 0505ffbe 04ffdf44 mem_alloc_3!main+0x52 [c:\src\tests\test.cpp\mem_alloc\mem_alloc\mem_alloc.3.cpp @ 12]
    0018ff88 75f8339a 7efde000 0018ffd4 77bb9ef2 mem_alloc_3!__tmainCRTStartup+0x10f [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 586]
    0018ff94 77bb9ef2 7efde000 2577c47c 00000000 kernel32!BaseThreadInitThunk+0xe
    0018ffd4 77bb9ec5 004013ac 7efde000 00000000 ntdll!__RtlUserThreadStart+0x70
    0018ffec 00000000 004013ac 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b

    STACK_COMMAND: ~0s; .ecxr ; kb

    FAULTING_SOURCE_CODE:
         8: printf("1) p=%p\n",p);
         9: *p = 1;
        10: p++;
        11: printf("2) p=%p\n",p);
    >   12: *p = 2; // <==== Illegal memory access
        13: printf("Done\n");
        14: return 0;
        15:
        16: }

Unfortunately, the /unaligned option may cause the program to stop working properly (see "How to use Pageheap.exe"):

Some programs make assumptions about 8-byte alignment, and they stop working correctly with the /unaligned option. Microsoft Internet Explorer is one such program.
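Why the second example needs /unaligned can be illustrated with a little address arithmetic. This is a simplified model written for illustration, not the actual page heap code: `place_block` and `write_faults` are hypothetical helpers, and the page address and size used with them are arbitrary example values. It assumes that full page heap places each block as close to the end of a page as the alignment allows, followed by an inaccessible guard page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Where a block sits relative to the guard page in this simplified model.
struct Placement {
    std::uintptr_t block_start;  // address returned to the program
    std::uintptr_t guard_start;  // first byte of the inaccessible guard page
};

// Put a block of `size` bytes as close to the end of the page as `alignment`
// allows. An alignment of 1 models the /unaligned case.
Placement place_block(std::uintptr_t page_base, std::size_t page_size,
                      std::size_t size, std::size_t alignment) {
    assert(alignment > 0 && size <= page_size);
    std::uintptr_t guard = page_base + page_size;
    std::uintptr_t start = guard - size;
    start -= start % alignment;  // round down to the required alignment
    return {start, guard};
}

// True if writing `len` bytes at `addr` touches the guard page, i.e. the
// overrun would raise an access violation immediately.
bool write_faults(const Placement& p, std::uintptr_t addr, std::size_t len) {
    return addr + len > p.guard_start;
}
```

For a 4-byte block with 8-byte alignment on a page at 0x10000 of size 0x1000, the block starts at 0x10FF8, so the 4-byte write to p + 1 at 0x10FFC still lands inside the accessible page and goes unnoticed. With an alignment of 1 the block starts at 0x10FFC and the same write hits the guard page.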

+9

I would attach WinDbg when the program starts, and create a minidump when it breaks into the debugger on a crash or exception:

 .dump /ma c:\mem.dmp // c:\mem.dmp could be any other location you desire 

I would enable gflags for your application, either from the command line or from inside WinDbg:

 !gflag +ust 

Remember to remove this flag afterwards!

Then you can run the automatic exception analysis:

 !analyze -v 

This may tell you what it thinks caused the crash. You can also dump the call stacks of all threads:

 ~* kb 

If you see something suspicious, you can switch to that thread (replace x with the thread number) and investigate further:

 ~xs 

You can display the exception context record:

 .ecxr 

There is a good article on how to recover the call stack from a catch block: http://blogs.msdn.com/b/slavao/archive/2005/01/30/363428.aspx, and also this one: http://blogs.msdn.com/b/jmstall/archive/2005/01/18/355697.aspx

The main point is that with WinDbg attached you should check the state and call stacks of all threads. You can also open the minidump in Visual Studio: http://msdn.microsoft.com/en-us/library/windows/desktop/ee416349%28v=vs.85%29.aspx#Analysis_of_a_minidump. If you prefer Visual Studio for navigation, you can open the same dump in WinDbg to use its analysis tools and in Visual Studio to navigate the code. Hope this helps.

+2

Does gdb offer this functionality out of the box?

It has been a while since I used it, but I remember that it could run the program until it crashed and then replay the steps for you in the debugger.

Alternatively, wouldn't it be easy to build your own logging into the application, which could output whatever data you choose and could be activated by a command-line option to the exe?

You could tailor it to the crash at hand, or just start with the basics and extend it as you fix bugs or add new features. The advantage would be that you could output exactly the data you find useful, and you could even specify logging levels to avoid being swamped with noise.
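A minimal sketch of such a logger, assuming you only care about the last few events before the crash: a fixed-size ring buffer kept in memory, with a minimum logging level to filter out noise. All names here (`TraceRing`, `Level`, `Entry`, `log`, `snapshot`) are hypothetical and chosen for illustration. Because the buffer lives in the process memory, it would also be visible in a full memory dump taken at the crash.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

enum class Level { Debug, Info, Error };

struct Entry {
    std::thread::id thread;  // which thread logged the event
    Level level;
    std::string text;
};

// Hypothetical helper: retains only the most recent `Capacity` entries.
template <std::size_t Capacity>
class TraceRing {
    static_assert(Capacity > 0, "ring buffer needs at least one slot");

public:
    explicit TraceRing(Level min_level) : min_level_(min_level) {}

    // Record an event; entries below the minimum level are dropped.
    void log(Level level, std::string text) {
        if (level < min_level_) return;
        std::lock_guard<std::mutex> lock(mutex_);
        entries_[next_ % Capacity] =
            Entry{std::this_thread::get_id(), level, std::move(text)};
        ++next_;
    }

    // Return the retained entries, oldest first.
    std::vector<Entry> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t count = next_ < Capacity ? next_ : Capacity;
        std::vector<Entry> out;
        for (std::size_t i = next_ - count; i < next_; ++i)
            out.push_back(entries_[i % Capacity]);
        assert(out.size() == count);
        return out;
    }

private:
    Level min_level_;
    mutable std::mutex mutex_;
    std::array<Entry, Capacity> entries_{};
    std::size_t next_ = 0;  // total number of entries ever stored
};
```

A single mutex keeps the sketch simple but serializes all logging threads, which can change the timing of a race; a per-thread buffer would disturb the program less at the cost of more bookkeeping.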

+1

How about using BMC AppSight?

We used it at a previous company (sorry, it took me a while to remember the name) to investigate crashes and the like. ISTR you started it, then ran your software, and it recorded everything that happened in a log file that you could view later.

It definitely works on Windows, since that is where I used it.

Could this be what you are looking for?

+1

I am not quite sure this is what you want, but "u" will unassemble the most recent instructions at the current IP register in the current thread. This shows you the last instructions that were run, and you can usually work out what values the various registers held by tracing your way through the disassembled code. In most cases this is a slow and laborious process, but it gives you almost 100% accuracy (barring some odd hardware problems or really strange code issues) about what just happened. I have used this method in the past to find out why something was crashing when I did not have the source code.

If you check the windbg help file, you will find more information about this.

-1
