While I am aware of the differences between debug and release builds, I am curious whether attaching a debugger to a process (built release or debug) changes that process's behaviour.
For reference, I'm developing on HP-UX 11.31 on Itanium, but I am still curious about the general case.
http://en.wikipedia.org/wiki/Heisenbug#Heisenbug
Of course, attaching a debugger will change the timing (which can change e.g. thread race conditions), and also some system calls can detect if a debugger is attached.
It certainly can, depending on the platform and the method of debugging. For example, when debugging on Windows there is actually the IsDebuggerPresent function. As noted, that function can be circumvented, but then there are other means. So basically, it's complicated.
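For what it's worth, a minimal sketch of that check on Windows (the program just reports the result; what you do with it is up to you):

```c
/* Minimal sketch: querying IsDebuggerPresent() on Windows. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    if (IsDebuggerPresent())
        printf("A user-mode debugger is attached to this process.\n");
    else
        printf("No debugger detected (which proves nothing, of course).\n");
    return 0;
}
```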
Yep, lots of things inside the Windows data structures change when a debugger is attached. It changes how memory is allocated and freed, and it adds additional housekeeping code and "markers" on the stack and heap (ever noticed the 0xBAADF00D fill values in newly allocated memory?). In fact, many of these changes are exactly what anti-debugging code uses to detect whether an application is being debugged.
In managed runtimes (Java, .NET) the runtime will often generate different machine instructions when running under a debugger, to help it trap and display exceptions, show the original code, and so on. It will usually generate unoptimized code as well when a debugger is attached.
Some of these changes affect the way the software behaves and can result in complicated transient bugs caused by optimizations or extremely fine timing dependencies.
Yes, I've often found that attaching a debugger to a process instantly makes bugs disappear, only to have them reappear when I compile my app in release mode. Unfortunately I usually can't really ask all my users to open a debugger just to run my app, so it can be quite frustrating.
Another thing to keep in mind is that for multithreaded apps attaching the debugger definitely can yield very different results. These are the kind of things referred to as "Heisenbugs."
Sure, in multithreaded apps, attaching a debugger can yield different results.
However, what about code that is not related to threads?
I have seen a release build that shows no problems when it is launched under a debugger, yet has problems when no debugger is attached.
If it is launched first and a debugger is attached to it afterwards, it still shows the same problems.
Related
We are currently developing cluster manager software in C. If several nodes connect to the manager, it works perfectly, but if we use a tool to simulate 1000 nodes connecting to the manager, it sometimes works in unexpected ways.
How can one debug this kind of bug, which only appears when the load (connections/nodes) is large?
If I use gdb to debug step by step, the app never malfunctions.
How do you debug this kind of bug?
In general, you want to use at least these techniques:
Make sure the code compiles and links without warnings; -Wall is a good start, but -Wextra is better.
Make sure the application has designed-in logging and tracing that can be turned on or off, has sufficient detail to debug these kinds of issues, and has low overhead.
Make sure the code has good unit-test coverage.
Make sure the tests are sanitizer-clean.
There's also no warning in the Valgrind check.
It's not clear whether you've simply run the target application under Valgrind, or whether you also ran the unit tests under it and they are Valgrind-clean. It's also not clear whether you've observed the application misbehaving under Valgrind or not.
Valgrind used to be the best tool available for heap and uninitialized-memory problems, but as of 2017 this is no longer the case.
Compiler-based Address, Thread and Memory sanitizers catch a significantly wider class of errors (e.g. global and stack overflows, and data races), and you should run your unit tests under all of them.
When all of the above still fails to find the problem, you may be able to run the real application instrumented with sanitizers.
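As an illustration of what the sanitizers catch, here is a deliberately racy toy program (the file name and counts are made up); built with -fsanitize=thread it produces a data-race report pointing at both conflicting accesses:

```c
/* Illustrative sketch: a deliberate data race that ThreadSanitizer reports.
 * Build (gcc or clang):  cc -g -O1 -fsanitize=thread race.c -lpthread
 * For heap/stack errors: cc -g -O1 -fsanitize=address ...
 */
#include <pthread.h>
#include <stdio.h>

static int counter;                     /* shared, unsynchronized */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        counter++;                      /* racy read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %d\n", counter);  /* result varies; TSan flags the race */
    return 0;
}
```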
Lastly, there are tools like GDB tracing and systemtap -- they are harder to learn, but give you significant power. Overview here.
Sadly the debugger is less useful for debugging concurrency/load issues.
Keep adding logs/printfs, trigger the issue with load testing, then try to narrow it down with more logs/printfs. Repeat.
The faster it is to trigger the bug, the faster this will converge. Also, prefer the classic "bisection" / "binary search" technique when adding logs - try to narrow down the area you're looking at by at least half every time.
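A tiny sketch of the kind of cheap, switchable tracing that works well for this (the macro name and format are just placeholders; ##__VA_ARGS__ is a GNU extension):

```c
/* Switchable, low-overhead tracing for narrowing down a load-dependent bug.
 * TRACE compiles to nothing unless ENABLE_TRACE is defined, so the shipped
 * binary pays no cost. */
#include <stdio.h>
#include <time.h>

#ifdef ENABLE_TRACE
#define TRACE(fmt, ...)                                              \
    do {                                                             \
        struct timespec ts;                                          \
        clock_gettime(CLOCK_MONOTONIC, &ts);                         \
        fprintf(stderr, "[%ld.%09ld] %s:%d " fmt "\n",               \
                (long)ts.tv_sec, (long)ts.tv_nsec,                   \
                __FILE__, __LINE__, ##__VA_ARGS__);                  \
    } while (0)
#else
#define TRACE(fmt, ...) do { } while (0)
#endif

int main(void)
{
    TRACE("accepted connection, active=%d", 42);   /* example call site */
    return 0;
}
```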
I'm debugging a program written in plain C (no C++, MFC, .NET, etc.) against the Win32 API. It must compile in both VS2005 (to run under Win 2K/XP) and VS2010 (to run under Win7). I've been unable to duplicate a bug that my customer seems able to duplicate fairly reliably, so I'm looking for ways to have my program "debug itself", as it were. It is monitoring all of the key values that are changing, but what I'd really like to see is a stack dump when a value changes. Oh, and I cannot run a "true" debug build (using the debug libraries) without installing the compiler on the customer's machine, which is not an option, so this must be built into my release build.
Is there any way to do this other than just adding my own function entry/exit calls to my own stack monitor? I'd especially like to be able to set a hardware breakpoint when a specific memory address changes unexpectedly (so I'd need to be able to disable/enable it around the few EXPECTED change locations.) Is this possible? In a Windows program?
I'd prefer something that doesn't require changing several thousand lines of code, if possible. And yes, I'm very underprivileged when it comes to development tools -- I consider myself lucky to have a pro version of the Visual Studio IDEs.
--edit--
In addition to the excellent answers provided below, I've found some info about using hardware breakpoints in your own code at http://www.codereversing.com/blog/?p=76. I think it was written with the idea of hacking other programs, but it looks like it might work fine for my needs, allowing me to create a minidump when an unexpected location writes to a variable. That would be cool and really useful, especially if I can generalize it. Thanks for the answers; now I'm off to see what I can create using all this new information!
You can use the MiniDumpWriteDump function, which creates a dump that can be used for post-mortem debugging. In case the application crashes, you can call MiniDumpWriteDump from an unhandled exception handler set by SetUnhandledExceptionFilter. If the bug you are talking about is not a crash, you can call MiniDumpWriteDump from any place in the program when some unexpected situation is detected. More about crash dumps and post-mortem debugging here: http://www.codeproject.com/Articles/1934/Post-Mortem-Debugging-Your-Application-with-Minidu
The main idea of this technique is that minidump files produced at a client site are sent to the developer, where they can be debugged - thread, stack and variable information is available (with obvious restrictions caused by code optimizations).
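To make the shape of that concrete, a rough sketch of the crash-handler variant (the file name, dump type and minimal error handling are illustrative only):

```c
/* Rough sketch: write a minidump from an unhandled-exception filter.
 * Link with dbghelp.lib. */
#include <windows.h>
#include <dbghelp.h>

#pragma comment(lib, "dbghelp.lib")

static LONG WINAPI write_dump(EXCEPTION_POINTERS *info)
{
    HANDLE file = CreateFileA("crash.dmp", GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;
        mei.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, NULL, NULL);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;    /* let the process terminate */
}

int main(void)
{
    SetUnhandledExceptionFilter(write_dump);
    /* ... rest of the program ... */
    return 0;
}
```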
There are a bunch of Win32 functions in dbghelp.dll that can be used to produce a stack trace for a given thread; for an example of this, see this code.
You can also look up the StackWalk64() and related functions on MSDN.
To get useful information out, you should turn on PDB file generation in the compiler for your release build: if you set up your installer so that the PDB files end up in the same place as the DLLs on the customer's computer, then you can get an intelligible stack trace with function names, etc. Without that, you'll just get DLL names and hex addresses for functions.
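Not the code from that link, but a minimal sketch of the same idea using CaptureStackBackTrace and SymFromAddr (assuming a reasonably recent SDK; the function name and buffer sizes are arbitrary):

```c
/* Capture the current thread's stack and resolve symbols with dbghelp.
 * Needs dbghelp.lib and, for readable names, the matching PDB files
 * next to the EXE/DLLs on the target machine. */
#include <windows.h>
#include <dbghelp.h>
#include <stdio.h>

#pragma comment(lib, "dbghelp.lib")

void dump_current_stack(void)
{
    void *frames[62];
    HANDLE process = GetCurrentProcess();
    SymInitialize(process, NULL, TRUE);   /* in real code, initialize once */

    USHORT count = CaptureStackBackTrace(0, 62, frames, NULL);
    for (USHORT i = 0; i < count; i++) {
        char buf[sizeof(SYMBOL_INFO) + 256];
        SYMBOL_INFO *sym = (SYMBOL_INFO *)buf;
        DWORD64 displacement = 0;
        sym->SizeOfStruct = sizeof(SYMBOL_INFO);
        sym->MaxNameLen = 255;
        if (SymFromAddr(process, (DWORD64)(ULONG_PTR)frames[i],
                        &displacement, sym))
            printf("%2d: %s + 0x%llx\n", (int)i, sym->Name,
                   (unsigned long long)displacement);
        else
            printf("%2d: %p\n", (int)i, frames[i]);
    }
    SymCleanup(process);
}

int main(void)
{
    dump_current_stack();
    return 0;
}
```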
I'm not sure how practical it would be to set up hardware breakpoints: you could write some sort of debugger that uses the Win32 debugging API, but that's probably more trouble than it's worth.
If you can add limited instrumentation to raise an identifiable exception when the symptom recurs, you can use Process Dumper to generate a full process dump on any instance of that exception.
I find I cite this tool very frequently; it's a real godsend for hard-to-debug production problems, but it seems little-known.
I'm new to embedded programming but I have to debug a quite complex application running on an embedded platform. I use GDB through a JTAG interface.
My program crashes at some point in an unexpected way. I suspect this happens due to some memory-related issue. Does GDB allow me to inspect the memory after the system has crashed and become completely unresponsive?
It depends on your setup a bit. In particular, since you're using JTAG, you may be able to set your debugger up to halt the processor when it detects an exception (for example, illegal access to protected memory and so forth). If not, you can replace your exception handlers with infinite loops. Then you can manually unwind the exception to see what the processor was doing that caused the crash. Normally, you'll still have access to memory in that situation, and you can either use GDB to look around directly, or just dump everything to a file so you can look at it later.
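For the "infinite loop in the exception handler" trick, a rough sketch (the handler name is a Cortex-M-style assumption; your vector table and fault registers will differ):

```c
/* Sketch: spin in the fault handler so the core can be halted over JTAG
 * and its memory inspected after the crash. */
volatile unsigned long last_fault_marker;   /* lives in RAM for post-crash inspection */

void HardFault_Handler(void)
{
    last_fault_marker = 0xDEADBEEF;         /* breadcrumb you can find later */
    for (;;) {
        /* Spin forever; attach GDB via JTAG, halt, then e.g.
         *   (gdb) bt
         *   (gdb) x/32xw $sp
         */
    }
}
```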
It depends on what has crashed. If the system is only unresponsive (in some infinite loop, deadlock or similar), then it will normally respond to GDB and you will be able to see a backtrace (call stack), etc.
If the system/bus/CPU has actually crashed (at a lower level), then it probably will not respond. In this case you can try setting breakpoints at suspicious places/variables and observing what happens. A simulator (ISS, RTL - if applicable) could also come in handy for comparing behavior with the hardware.
I'm using MinGW, which is gcc for Windows. My program involves multiple windows, two different main threads, and several worker threads in a thread pool for overlapped network I/O.
It works perfectly fine without compiler optimization.
A) Is compiler optimization even necessary? My program's already very fast. Is it at all likely that it will provide a significant improvement?
B) Are there any articles on how to properly build a multithreaded program so compiler optimization can do its job?
“Imploded aggressively” is a bit weird (is your program a controller for a fission bomb?), but I understand that your program behaved as desired without compiler optimizations and mysteriously with compiler optimizations.
The technical term for this is that your program is buggy.
Multithreaded programming is intrinsically hard. Multithreaded programming when the threads share memory is very hard; it's the masochist way of concurrent programming (message passing is a lot easier to get right). You don't just need to read an article or two, you need to read several books and get a few years' programming experience.
You were unlucky that your program seemed to work without optimizations. It probably wouldn't work on a different machine where the timings are a bit different, or with a different compiler, or on a different operating system, either. So you ended up wasting your time thinking your program worked. But it doesn't. A compiler transforms correct source code into correct executables, no matter what optimization level you choose.¹
¹ Barring compiler bugs, sure. But the odds are very strongly stacked against you.
99.9% of failures that appear in one optimization mode and not another are due to serious bugs. Multithreading races etc. are very sensitive to code performance. An instruction reorder or loop shortcut can turn a test pass into a debugging nightmare.
I'm assuming that the server runs up OK and detonates under load in apparently different places, making conventional debugging useless?
You are going to have to rely on logging and changing the test conditions to narrow down the point of ignition. My guess is this is going to be a Heisenbug that mutates with changes to the code, optimization, options, load profile, buffer sizes etc.
Not fixing the problem is not a good plan, since it will just show up in another form on next year's boxes with more cores, etc. Even with optimization off, it's still there, lurking, waiting for the opportunity to strike.
I hope I'm providing some comfort.
Seriously - log everything you can with a good logger - one that queues up the log entries so as to keep disk latency out of the main app. Change things around to try and make the bug mutate and perhaps show up in the non-optimized build too. Write down (type in) absolutely everything that you do and what happens after each change, good or bad. Making the bug worse is actually better than making its symptoms go away (without knowing exactly why). Try the server on various hardware configs, if you can.
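If you don't already have a queue-based logger, the shape is roughly this (the names, sizes and drop-when-full policy are all illustrative, not from any particular library):

```c
/* Sketch of a queued logger: app threads only push a line onto an in-memory
 * ring buffer; a background thread does the blocking file I/O. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define QCAP 1024
#define QMSG 256

static char queue[QCAP][QMSG];
static int  qhead, qtail;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

void log_line(const char *msg)          /* called from application threads */
{
    pthread_mutex_lock(&qlock);
    int next = (qtail + 1) % QCAP;
    if (next != qhead) {                /* drop the message if queue is full */
        strncpy(queue[qtail], msg, QMSG - 1);
        queue[qtail][QMSG - 1] = '\0';
        qtail = next;
        pthread_cond_signal(&qcond);
    }
    pthread_mutex_unlock(&qlock);
}

void *log_writer(void *arg)             /* dedicated logging thread */
{
    FILE *f = (FILE *)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (qhead == qtail)
            pthread_cond_wait(&qcond, &qlock);
        char line[QMSG];
        strcpy(line, queue[qhead]);
        qhead = (qhead + 1) % QCAP;
        pthread_mutex_unlock(&qlock);
        fprintf(f, "%s\n", line);       /* slow I/O happens off the hot path */
        fflush(f);
    }
    return NULL;
}

int main(void)
{
    pthread_t writer;
    pthread_create(&writer, NULL, log_writer, stderr);
    log_line("cluster manager started");
    log_line("node 17 connected");
    sleep(1);                           /* give the writer a chance to drain */
    return 0;
}
```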
Eventually, you will find the bug!
You have one thing going for you - it seems that you can reliably reproduce the problem. That, in itself, is a massive plus.
Forgot to ask - apart from the nuclear explosive metaphor, what is the main symptom? Is it AV'ing/segfaulting all over the place, or is it deadlocked or livelocked?
To answer part "A" of your question, the unoptimized version of your code still has the concurrency bugs in it, but the timing of how the threads run is such that the bugs have not yet been exposed with your test workloads. The current version of the unoptimized program will eventually fail in use, so you will need to fix the concurrency bugs before using the program for real work.
How do I detect in my program that it is being run under a debugger? I am aware that this seems to indicate I am trying to do something I shouldn't, and that it is messy. I still think it is an interesting question, though. Specifically, is there a way to do it in a POSIX environment, for example using sigaction(2) to detect that some handler is installed? An even worse thought: is there some inline assembly code I could use on an x86 architecture?
While we're discussing it - would it even be possible to launch a debugger, such as gdb(1), and break at the place where the program performs this possible hack? Thanks for any dirty one-liners or unlikely references to standards related to this.
Does this article (archive link) help?
It suggests, amongst other things:
file descriptors leaking from the parent process
environment variables ($_)
process state (getsid(), etc).
Note that most (if not all) of these rely on the debugger spawning the process that's being debugged. They're not so effective if the debugger is attached to a process that's already running.
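One approach not mentioned in the article - Linux-specific rather than portable POSIX - is reading TracerPid from /proc/self/status, which also catches a debugger that attached to an already-running process:

```c
/* Linux-only sketch: TracerPid is 0 when no tracer (debugger/strace) is attached. */
#include <stdio.h>

static int tracer_pid(void)
{
    char line[256];
    int pid = 0;
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;                      /* unknown: /proc not available */
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "TracerPid: %d", &pid) == 1)
            break;
    }
    fclose(f);
    return pid;                         /* 0 means no tracer attached */
}

int main(void)
{
    printf("TracerPid = %d\n", tracer_pid());
    return 0;
}
```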
There is no reliable way to detect that you are running under a debugger. That's because a debugger may use any number of methods to actually debug your code, some of which will almost certainly not be caught by your method.
My question is, why would you care? Unless you're trying to hide something :-)