Dump call stack on error? - c

I'm debugging a program written in plain C (no C++, MFC, .NET, etc.) to the WIN32API. It must compile in both VS2005 (to run under Win 2K/XP) and VS2010 (to run under Win7.) I've been unable to duplicate a bug that my customer seems able to duplicate fairly reliably, so I'm looking for ways to have my program "debug itself" as-it-were. It is monitoring all of the key values that are changing, but what I'd really like to see is a stack dump when a value changes. Oh, I cannot run a "true" debug build (using the debug libraries) without installing the compiler on the customer's machine and that is not an option, so this must be built into my release build.
Is there any way to do this other than just adding my own function entry/exit calls to my own stack monitor? I'd especially like to be able to set a hardware breakpoint when a specific memory address changes unexpectedly (so I'd need to be able to disable/enable it around the few EXPECTED change locations.) Is this possible? In a Windows program?
I'd prefer something that doesn't require changing several thousand lines of code, if possible. And yes, I'm very underprivileged when it comes to development tools -- I consider myself lucky to have a pro version of the Visual Studio IDEs.
--edit--
In addition to the excellent answers provided below, I've found some info about using hardware breakpoints in your own code at http://www.codereversing.com/blog/?p=76. I think it was written with the idea of hacking other programs, but it looks like it might work fine for my needs, allowing me to create a mini dump when an unexpected location writes to a variable. That would be cool and really useful, especially if I can generalize it. Thanks for the answers, now I'm off to see what I can create using all this new information!

You can use the MiniDumpWriteDump function, which creates a dump file that can be used for post-mortem debugging. If the application crashes, you can call MiniDumpWriteDump from an unhandled exception handler installed with SetUnhandledExceptionFilter. If the bug you are talking about is not a crash, you can call MiniDumpWriteDump from anywhere in the program when some unexpected situation is detected. More about crash dumps and post-mortem debugging here: http://www.codeproject.com/Articles/1934/Post-Mortem-Debugging-Your-Application-with-Minidu
The main idea of this technique is that minidump files produced at the client site are sent to the developer, who can then debug them: thread, stack, and variable information is available (with the obvious restrictions caused by code optimizations).
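A minimal sketch of the approach described above, assuming a Windows build that links against dbghelp.lib. The helper names write_minidump and install_dump_handler and the file name crash.dmp are hypothetical; the non-Windows branch is only a stub so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>   /* MiniDumpWriteDump; link with dbghelp.lib */

/* Write a minidump of the current process to `path` (hypothetical helper).
   `info` may be NULL when called outside an exception handler. */
int write_minidump(const char *path, EXCEPTION_POINTERS *info)
{
    MINIDUMP_EXCEPTION_INFORMATION mei;
    BOOL ok;
    HANDLE file = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                              FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE)
        return 0;

    mei.ThreadId = GetCurrentThreadId();
    mei.ExceptionPointers = info;
    mei.ClientPointers = FALSE;

    ok = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                           MiniDumpNormal, info ? &mei : NULL, NULL, NULL);
    CloseHandle(file);
    return ok ? 1 : 0;
}

/* Called by the system when no other handler claims the exception. */
static LONG WINAPI dump_filter(EXCEPTION_POINTERS *info)
{
    write_minidump("crash.dmp", info);   /* file name is an assumption */
    return EXCEPTION_EXECUTE_HANDLER;    /* let the process terminate */
}

int install_dump_handler(void)
{
    SetUnhandledExceptionFilter(dump_filter);
    return 1;
}
#else
/* Stubs for non-Windows builds: nothing is installed or written. */
int write_minidump(const char *path, void *info)
{
    (void)path; (void)info;
    return 0;
}
int install_dump_handler(void) { return 0; }
#endif
```

Calling install_dump_handler() once at startup covers crashes; for a non-crash symptom, write_minidump("symptom.dmp", NULL) could be called at the point where the unexpected value is detected. Microsoft's documentation suggests writing the dump from a separate process where possible, so treat the in-process call as the simple variant.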

There are a bunch of Win32 functions in dbghelp.dll that can be used to produce a stack trace for a given thread; for an example, see this code.
You can also look up the StackWalk64() and related functions on MSDN.
To get useful information out, you should turn on PDB file generation in the compiler for your release build: if you set up your installer so that on the customer's computer the PDB files are in the same place as the DLL, then you can get an intelligible stack trace out with function names, etc. Without that, you'll just get DLL names and hex addresses for functions.
I'm not sure how practical it would be to set up hardware breakpoints: you could write some sort of debugger that uses the Win32 debugging API, but that's probably more trouble than it's worth.

If you can add limited instrumentation to raise an identifiable exception when the symptom recurs, you can use Process Dumper to generate a full process dump on any instance of that exception.
I find I cite this tool very frequently; it's a real godsend for hard-to-debug production problems but seems little-known.

Related

How to track down exceptional bugs in application when released?

When an application hits a serious segmentation fault that is hard to find or track, I can use a debug version and generate a core dump file when the issue happens, then debug the app with that core dump file.
But how do I track down exceptional bugs in a released application? There seems to be no core dump file in the release version. Although logging is an option, it is of little use for hard-to-track bugs.
So my question is: how do I track down those hard-to-track bugs in a release version? Are there any suggestions or technologies available?
The following references may help the discussion.
[1] Core dump in Linux
[2] generate a core dump in linux
[3] Solaris Core dump analysis
You can compile a release version with gcc -g -O2 ...
The lack of core dump is related to your user's setting of resource limits (unless the application is explicitly calling setrlimit or is setuid; then you should offer a way to avoid that call). You might teach your users how to get core dumps (with the appropriate bash ulimit builtin).
(and there is some obscure way to put the debugging information outside of the executable)
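As a sketch of the resource-limit point above: instead of relying on the user's ulimit setting, the program itself could raise its soft core-file limit to the hard limit at startup. The function name enable_core_dumps is hypothetical; getrlimit, setrlimit, and RLIMIT_CORE are standard POSIX.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit.
   Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    lim.rlim_cur = lim.rlim_max;   /* soft limit may always go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

If the hard limit itself is 0, this changes nothing and the user (or administrator) still has to raise it.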
The distributions provide -dbg packages that provide debugging symbols for programs. They are built along with the binary packages and can provide your users the ability to generate meaningful backtraces from core dumps. If you build your packages using the same utilities, you can get these -dbg packages for your own software "nearly free".
I suggest using a crash reporting system. We use Google's Breakpad project for our Windows client program; of course, you can also write your own.
Google Breakpad is an open-source, multi-platform crash reporting system. It can make a mini or full memory dump when an exception or crash happens, and you can configure it to upload the dump file and any additional files to a specific FTP or HTTP server, which is very helpful for finding bugs.
Here is the link:
Google Breakpad
Ask the "customer" for a description of what he or she did to make it crash, and try to replicate it yourself with your own version that has debug information.
The hard part is getting correct information from the customer. Often they will say they did nothing special or nothing different than before. If possible, go see the person having the problem, and ask them to do what they do to make the program crash, writing down every step.

listing all calls to my library

I'm building a shared library in C, which other programs use. Sometimes, these other programs crash because of some error in my shared library. While reproducing these sort of bugs, it is very useful for me to know which functions of my library are being called, with what arguments and in what order. Of course I can add printf() calls to all my functions, or add breakpoints to all of them, but I figure there just has to be a better way to determine this.
Edit: since I'm doing this on OSX, dtrace and the related script dapptrace seem promising. However, after digging through some documentation I'm still a bit lost.
Say, my library is /path/to/libmystuff.so and I've got a program test which links to this library. Using dtrace, how would I bring up a list of all the function calls that reside in libmystuff.so?
You could use ltrace for that purpose if you work on a Linux system. The original poster shows, in the comments below, a solution that works on Mac OS X using dtrace.
I am assuming that you are working on Unix.
Use gdb for debugging.
If your program has crashed, you can use the core file it generated to look at the stack trace.
It will show the functions that were on the stack at the time of the crash, along with their arguments.
For more information on examining the stack trace using gdb with the core file, see here.
You can also log function calls to a file with all the details, such as function name, arguments, etc.
(Logging is usually helpful in client-server applications, but I am not sure about your application.)
This way you can trace all calls. You can also enable logging in debug mode only. I hope this reply is useful to you.
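One way to get the call logging described above without editing every function, assuming the library is built with GCC (or Clang) and the -finstrument-functions option: the compiler then calls the two hooks below on every function entry and exit. The hook names and signatures are the ones GCC defines; call_depth and current_call_depth are hypothetical helpers, and this sketch is not thread-safe.

```c
#include <stdio.h>

static int call_depth = 0;   /* nesting level, used only for indentation */

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *fn, void *call_site)
    __attribute__((no_instrument_function));
int current_call_depth(void) __attribute__((no_instrument_function));

/* Called on entry to every instrumented function. */
void __cyg_profile_func_enter(void *fn, void *call_site)
{
    fprintf(stderr, "%*senter %p (called from %p)\n",
            call_depth * 2, "", fn, call_site);
    call_depth++;
}

/* Called just before every instrumented function returns. */
void __cyg_profile_func_exit(void *fn, void *call_site)
{
    (void)call_site;
    call_depth--;
    fprintf(stderr, "%*sexit  %p\n", call_depth * 2, "", fn);
}

int current_call_depth(void)
{
    return call_depth;
}
```

The log contains addresses rather than names, and no arguments; the addresses can be translated afterwards with a tool such as addr2line, provided the library was built with debug information.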

Tools/techniques for diagnosing C app crash on Windows

I have written an application in C, which runs as a Windows service. Most users can run the app without any problems, but a significant minority experience crashes caused by an Access Violation, so I know I have a bug somewhere. I have tried setting up virtual machines to mirror the users' configurations as closely as possible, but cannot reproduce the issue.
My background is in Java - when a Java app crashes it will produce a stack trace showing exactly where the problem occurred, but native applications aren't so helpful. What techniques are normally used by C developers for tracking down this type of problem? I have no physical access to the users' machines that experience the crash, but I could send them additional tools to install, to capture information. I also have Windows error reports showing Exception Code/Offset etc but these don't mean much to me. I have compiled my application using gcc - are there some compiler options I can use to generate more information in the event of a crash?
You could try asking the users to run ProcDump to capture a core dump when the program crashes. Unlike using something like Visual Studio it's a single, simple command-line utility so there should be no problem getting the users to run it.
On most modern operating systems your app can install a crash handler that'll walk the stack(s) in the event of a crash. I have no experience doing this on Windows, but this article walks through how to do it.

Call tree for embedded software [closed]

Does anyone know some tools to create a call tree for C application that will run on a microcontroller (Cortex-M3)? It could be generated from source code (not ideal), object code (prefered solution), or at runtime (acceptable). I've looked at gprof, but there's still a lot missing to get it to work on an embedded system.
An added bonus would be that the tool also gives the maximum stack depth.
Update: solution is preferably free.
One good way to achieve this is by using the --callgraph option to the ARM linker (armlink) that is part of RVCT (not free).
For more details - callgraph documentation.
I realize from one of the comments that you are looking for a gcc-based solution, which this isn't. But it may still be helpful.
From source code, you can use Doxygen and GraphViz even if you don't already use Doxygen to document your code. It is possible to configure it so that it will include all functions and methods whether or not they have documentation comments. With AT&T Graphviz installed, Doxygen will include call and caller graphs for most functions and methods.
From object code, I don't have a ready answer. I would imagine that this would be highly target dependent since even with the debug information present, it would have to parse the object code to find calls and other stack users. In the worst case, that approach seems like it would require effectively simulating the target.
At runtime on the target hardware your choices are going to depend in part on what kind of embedded OS is present, and how it manages stacks for each thread.
A common approach is to initialize each stack to a known value that seems unlikely to be commonly stored in automatic variables. An interrupt handler or a thread can then inspect the stack(s) and measure an approximate high water mark.
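A minimal sketch of this fill-and-inspect technique, assuming a full-descending stack (as on Cortex-M3) held in an array of 32-bit words. The pattern value and the function names are arbitrary choices for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u   /* arbitrary pattern unlikely to occur in real data */

/* Fill the whole stack area with the pattern; do this before the task starts. */
void stack_paint(uint32_t *stack, size_t words)
{
    size_t i;
    for (i = 0; i < words; i++)
        stack[i] = STACK_FILL;
}

/* Return the approximate high water mark in words.
   The stack grows down from stack[words - 1], so untouched pattern words
   remain at the low-address end; count them and subtract. */
size_t stack_high_water(const uint32_t *stack, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && stack[untouched] == STACK_FILL)
        untouched++;
    return words - untouched;
}
```

The result is approximate: a task that happens to store the pattern value itself, or that skips over words without writing them, will be under-reported.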
Even without pre-filling the stack and later walking it to look for footprints, an interrupt could just sample the current value of the stack pointer (for each thread) and keep a record of its greatest observed extent. That would require storage for a copy of each thread's SP, and the interrupt handler wouldn't have very much work to do to maintain the information. It would have to access the saved states of all the active threads, of course.
I don't know of a tool that does this explicitly.
If you happen to be using µC/OS-II from Micrium as your OS, you might take a look at their µC/Probe product. I haven't used it myself, but it claims to allow a connected PC to observe program and OS state information in near real time. I wouldn't be surprised if it is adaptable to another RTOS if needed.
Generating call graphs from source code is no problem: as mentioned above, your compiler or Doxygen can generate this information. Most modern compilers can generate a call graph as part of the compile process.
On a previous embedded project I filled the stack with a pattern and ran a task, then checked up to which point the task had overwritten my pattern. I then reloaded the stack with the pattern and ran the next task. This makes your code very ssslloowww .... but is free. It is not fully accurate, because everything keeps timing out and the code spends a lot of time in error handlers.
On some processors you can get a trace pod, so that you can monitor code coverage and so on when your processor needs to run at full speed during testing and you cannot use instrumented code. Unfortunately, these types of tools are very expensive. Look at Green Hills Time machine if you have the money. It makes all types of debugging easier.
Check out StackAnalyzer.
I haven't used these, but are you aware of:
calltree
cflow
Since they analyse the source code, they don't calculate stack depth.
Note, Doxygen can do "call graphs" and "caller graphs" but I believe these are per-function and only show the tree up to a certain number of "hops" from each function.
Stack depth and/or call tree generation may be supported by compiler tools. For example, for Renesas micros there is a utility called Call Walker.
My calltree graph generator, implemented in bash, using cscope and dot.
It can generate graphs of upstream callers, downstream callees, and call associations between functions. You can set it up to view graphs in a number of ways, including xfig, .png viewers, and the dynamic dot visualization tool "zgrviewer".
http://toolchainguru.blogspot.com/2011/03/c-calltrees-in-bash-revisited.html
Just a thought. Is it possible to run it in a virtual machine (like Valgrind) and take stack samples ?
Eclipse with CDT has C/C++ indexing and will show you call graphs. As far as I know, you don't need to be able to build in Eclipse to get the indexer to work; just make sure all your source files are in the project.
It works pretty nicely.
Visual Studio will do similar (but it's not free). I use Visual Studio to work on embedded projects; using a makefile project I can do all the work except debugging in the VS IDE.
I have suggested this approach already in another discussion about embedded development, but if you really need a call graph, as well as stack use info, and all this for free, I would personally consider using an open source emulator to simulate the whole thing, while instrumenting the object code by adding a handful of hooks to the emulator itself to get this data.
I am not familiar with this particular target, but there is a whole number of open source ARM emulators available (freshmeat, sourceforge, google), and you are probably mostly interested in opcodes related to call/ret and push/pop?
For example check out skyeye.
So, even if you find that it's not straightforward to extend a compiler or an emulator to provide this information, it should still be possible to create a simple script in order to look for the entrypoint and all calls/rets, as well as opcodes related to stack usage.
Of course, the only reliable information on stack usage is going to come from runtime instrumentation, preferably exercising all important code paths.
A pretty light tool: Egypt
Use Understand: http://www.scitools.com/
It's not free, and runs on source (not runtime), but it works, it works well, and it's well supported.
It will tell you much more than you could ever want to know about your code.
I know this is responding to a very old question, but someone might stumble upon this with the same question...
I recently experimented with a Python script that analyses the assembler version of the application, extracts the stack usage and the call tree, and reports the maximum stack use. In my build system I then use this to create a stack of exactly that size.
I used it only on small applications, but it seems to work OK for AVR8, MSP430, and Cortex-M3. Obviously, there are strict limitations: no indirect calls (no function pointers, no virtual functions), no recursion, and stack-using assembler instruction patterns that are used are limited to what I found in GCC's output. If these limitations are not met, the script will report an error.
The Python source is 24k, free (boost license), not very fast, and still under development. Contact me if you are interested.
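The core of such an analysis can be sketched in a few lines. This is not the script mentioned above, only an illustration in C of the same idea: once the frame size and callees of every function are known, the worst-case stack use of a function is its own frame plus the deepest of its callees. The struct layout and names are hypothetical, and the sketch assumes the same limitations (no recursion, no indirect calls); unlike the script, it does not detect violations and would recurse forever on a cyclic graph.

```c
#include <stddef.h>

#define MAX_CALLEES 4

/* One entry per function, as it might be extracted from the assembler output. */
struct func_info {
    const char *name;
    unsigned frame_bytes;         /* stack used by this function alone */
    int callees[MAX_CALLEES];     /* indices into the table, -1 ends the list */
};

/* Worst-case stack use in bytes when entering function `idx`.
   Assumes the call graph is acyclic. */
unsigned max_stack(const struct func_info *table, int idx)
{
    unsigned deepest = 0;
    int i;

    for (i = 0; i < MAX_CALLEES && table[idx].callees[i] >= 0; i++) {
        unsigned d = max_stack(table, table[idx].callees[i]);
        if (d > deepest)
            deepest = d;
    }
    return table[idx].frame_bytes + deepest;
}
```

For a made-up table where main (16 bytes) calls read (32 bytes, which calls crc, 8 bytes) and write (24 bytes), the deepest path is main, read, crc, so max_stack for main is 16 + 32 + 8 = 56 bytes.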

Does attaching to a process make it behave differently?

While I am aware of the differences between debug and release builds, I am curious whether attaching the debugger to a process (built release or debug) changes that process's behaviour?
For reference, I'm developing on HP 11.31 Itanium but still am curious for the general case.
http://en.wikipedia.org/wiki/Heisenbug#Heisenbug
Of course, attaching a debugger will change the timing (which can change e.g. thread race conditions), and also some system calls can detect if a debugger is attached.
It certainly can, depending on the platform and the method of debugging. For example, when debugging on Windows, there is actually the IsDebuggerPresent function. As noted that function can be circumvented, but then there are other means. So basically, it's complicated.
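To illustrate how a program can notice a debugger, here is a small sketch. On Windows it uses the IsDebuggerPresent function mentioned above; on Linux it reads the TracerPid field from /proc/self/status, which is an addition for illustration and not something the answers here discuss. The function name debugger_attached is hypothetical, and on other platforms the sketch simply reports "unknown".

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 if a debugger appears to be attached, 0 if not,
   and -1 if this sketch cannot tell on the current platform. */
int debugger_attached(void)
{
#ifdef _WIN32
    return IsDebuggerPresent() ? 1 : 0;
#else
    /* Linux: a non-zero TracerPid means some process is ptrace-attached. */
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    int tracer = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "TracerPid: %d", &tracer) == 1)
            break;
    }
    fclose(f);

    if (tracer < 0)
        return -1;
    return tracer != 0;
#endif
}
```

Any code path that branches on such a check is, by construction, a place where behaviour differs with and without a debugger.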
Yep, lots of things inside the Windows data structures change when a debugger is attached. It changes how memory is allocated and freed, and it adds additional housekeeping code and "markers" on the stack. (Ever noticed the F00D values in newly allocated memory?) In fact, many of the changes are used by anti-debugging code to detect whether an application is being debugged.
In interpreted languages (Java, .NET) the runtime will often generate different machine instructions when running under a debugger to help it trap and display exceptions, show the original code, etc. It will usually generate unoptimized code as well when a debugger is attached.
Some of these changes affect the way the software behaves, and can complicate tracking down transient bugs that are caused by optimizations or extremely fine timing dependencies.
Yes, I've often found that attaching a debugger to a process instantly makes bugs disappear, only to have them reappear when I compile my app in release mode. Unfortunately I usually can't really ask all my users to open a debugger just to run my app, so it can be quite frustrating.
Another thing to keep in mind is that for multithreaded apps attaching the debugger definitely can yield very different results. These are the kind of things referred to as "Heisenbugs."
Sure, in multithreaded apps, attaching a debugger can yield different results.
However, what about code that is not related to threads?
I have seen a release build that has no problems when it is started with a debugger attached, but has problems when no debugger is attached.
If it is launched first and a debugger is attached to it afterwards, it also shows the same problems.

Resources