I am trying to write a small app that debugs another process by its ID and monitors the app until it crashes.
For now I've written a small piece of code, most of it taken from the MS example for writing a debugger.
My target application never passes the if(!de.u.Exception.dwFirstChance), even after the target has crashed.
I am able to see the exceptions coming if I put a bp on if(!de.u.Exception.dwFirstChance), but no exception meets the condition.
P.S : Too many edits :/
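#include "stdafx.h"
#include <windows.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    DWORD pid = (DWORD)strtoul(argv[1], NULL, 10); // target process ID
    DEBUG_EVENT de;

    if (!DebugActiveProcess(pid))
        return 1;

    while (true)
    {
        // Only continue an event that was actually received.
        if (!WaitForDebugEvent(&de, (DWORD)1000))
            continue;

        if (de.dwDebugEventCode == EXCEPTION_DEBUG_EVENT)
        {
            if (!de.u.Exception.dwFirstChance)
            {
                // Second-chance exception: this branch is never reached.
                DWORD excep = de.u.Exception.ExceptionRecord.ExceptionCode;
                (void)excep;
            }
        }
        // Every event, exceptions included, is continued as handled.
        ContinueDebugEvent(de.dwProcessId, de.dwThreadId, DBG_CONTINUE);
    }
}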
The article "Writing a Plug-in for Sysinternals ProcDump v4.0" indicates in the pseudo-code that the dump of a monitored process is generated when (and only when) a "Second Chance Exception" occurs.
// (extract altered for brevity)
Else "Second Chance Exception"
WriteDump(..)
Done = True
And "Writing a basic Windows debugger" indicates that EXCEPTION_DEBUG_INFO.dwFirstChance, with a guard for STATUS_BREAKPOINT/EXCEPTION_BREAKPOINT, can be used to detect this case.
"First and second chance exception handling" (KB105676) explains the difference between the exception chance types:
However, if the application is being debugged, the debugger sees all [first chance] exceptions before the program does. This is the distinction between the first and second chance exception: the debugger gets the first chance to see the exception (hence the name).
It is these First Chance Exceptions ("managed" or not) which are being detected, but they are almost all recoverable - i.e. they are caught by the application/run-time code and dealt with appropriately.
If the debugger allows the program execution to continue and does not handle the exception, the program will see the exception as usual. If the program does not handle the exception, the debugger gets a second chance to see the exception. In this latter case, the program normally would crash if the debugger were not present.
Thus, procdump likely generates the dump for a second chance exception with the assumption that any process-fatal exception will not be suppressed (by another debugger, as the program gave up its chance).
(EXIT_PROCESS_DEBUG_EVENT occurs after the process is terminated and is thus too late to generate an appropriate dump, although it does signal to end monitoring.)
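One detail worth spelling out, as a hedged sketch rather than ProcDump's or the asker's actual code: a second-chance event is only delivered if the debugger continues the first-chance event with DBG_EXCEPTION_NOT_HANDLED. Continuing every event with DBG_CONTINUE reports the exception as handled, so dwFirstChance is never seen as 0. The helper name choose_continue_status is hypothetical. The constants are redefined with their Windows SDK values only so the sketch compiles without <windows.h>, and treating every breakpoint as handled is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Windows SDK values, redefined only so this compiles without <windows.h>. */
#ifndef DBG_CONTINUE
#define DBG_CONTINUE              0x00010002UL
#endif
#ifndef DBG_EXCEPTION_NOT_HANDLED
#define DBG_EXCEPTION_NOT_HANDLED 0x80010001UL
#endif
#ifndef EXCEPTION_DEBUG_EVENT
#define EXCEPTION_DEBUG_EVENT     1UL
#endif
#ifndef EXCEPTION_BREAKPOINT
#define EXCEPTION_BREAKPOINT      0x80000003UL
#endif

/* Hypothetical helper: picks the status to pass to ContinueDebugEvent and
   reports through is_crash whether this event is a second-chance exception. */
unsigned long choose_continue_status(unsigned long event_code,
                                     unsigned long exception_code,
                                     int first_chance,
                                     int *is_crash)
{
    assert(is_crash != NULL);
    *is_crash = 0;

    if (event_code != EXCEPTION_DEBUG_EVENT)
        return DBG_CONTINUE;            /* not an exception: just resume */

    if (exception_code == EXCEPTION_BREAKPOINT)
        return DBG_CONTINUE;            /* assumption: swallow breakpoints */

    if (!first_chance)
        *is_crash = 1;                  /* the target did not handle it */

    /* Hand the exception back so the target's own handlers get to run. */
    return DBG_EXCEPTION_NOT_HANDLED;
}
```

In the debug loop, the fields of DEBUG_EVENT would be passed in, the returned value would replace the hard-coded DBG_CONTINUE, and a dump would be written when is_crash comes back as 1.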
YMMV: All information and observations come from the articles and resources listed, without actual experience in using such techniques.
There is a previous question/answer about the single step exception: What is a single step exception?
Every exception you enumerated will crash your application if unhandled. There is a lot of information on the web about each.
Simple newbie advice: it is very unlikely that you are the first person ever to encounter a problem or wonder what something means. Google is a more appropriate tool for things like this. Google first, StackExchange if you can't find the answer.
Outside of logging the failure to stderr and a log file, how should I deal with a fatal error?
example:
VkResult result = vkCreateInstance(&createInfo, NULL, &vulkanInfo->instance);
if (result != VK_SUCCESS) {
// ???
}
If vkCreateInstance fails, it's all over. The app cannot continue. What should I do?
Targets are Windows, Mac, Linux, Switch, and more.
I realize this is a very open ended question. I’m just curious how the great minds here deal with it.
how should I deal with a fatal error?
The app cannot continue
Because the app cannot continue, you should stop your application. A properly written application would:
print an error message to stderr
stop and join and synchronize all threads
free all dynamically allocated memory
close all open files
generally clean up all shared resources if needed (I think of shared memory)
exit the application with an error
To be (almost extremely unnecessarily) portable, you can use exit(EXIT_FAILURE) to notify the system that your application exited with an error. On the platforms you target, exit() with any value other than 0 signals failure, since exit(0) means the application succeeded (but better use exit(EXIT_FAILURE) for readability). Many applications also use specific exit values to tell the calling application which error happened. For example, grep exits with 0 if at least one line was selected, 1 if no lines were selected, and a value greater than 1 if an error occurred (for example, the file does not exist).
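The steps listed above can be sketched as a single failure path. This is a minimal illustration under stated assumptions, not a prescribed design: struct app, app_cleanup and app_fail are hypothetical names, and a real application would add its own threads, files and shared resources to the cleanup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical application state: whatever must be released before exit. */
struct app {
    FILE *log;      /* optional log file, may be NULL */
    void *scratch;  /* some heap allocation, may be NULL */
};

/* Release everything the app owns. Safe on a partially initialised app. */
void app_cleanup(struct app *a)
{
    assert(a != NULL);
    if (a->log != NULL) {
        fclose(a->log);
        a->log = NULL;
    }
    free(a->scratch);
    a->scratch = NULL;
}

/* Report the failure, clean up, and return the status for main to return. */
int app_fail(struct app *a, const char *what, int code)
{
    assert(a != NULL);
    fprintf(stderr, "fatal: %s (code %d)\n", what, code);
    if (a->log != NULL)
        fprintf(a->log, "fatal: %s (code %d)\n", what, code);
    app_cleanup(a);
    return EXIT_FAILURE;
}
```

With this in place, the check from the question could become `if (result != VK_SUCCESS) return app_fail(&app, "vkCreateInstance failed", (int)result);` inside main, so that cleanup and the exit status live in one place.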
Best cross platform practices for dealing with a fatal error in C?
Your code is dealing with Vulkan, so it's reasonable to assume that almost everyone using your software will be using a GUI and will never see anything sent to stdout or stderr. Instead, they will expect a GUI-specific notification (a dialog box).
There are several cross-platform GUI toolkit libraries online to choose from (if you don't feel like writing a minimal wrapper for a dialog box and nothing else).
I need to get code coverage information for a number of C programs. I only need to know whether or not each line is executed. However, some of them never terminate because of infinite loops. Most tools, e.g. gcov and llvm-cov, can only produce the information after the program ends.
I set a time limit for all the programs. If a program doesn't end before its execution time exceeds the limit, its process is killed. However, when the process is killed, all the information held in memory is lost, so I can't get the code coverage information for those programs. How can I do that?
A simple solution would be to add a little timer interrupt inside your program that fires after a few seconds of "bad behaviour" and causes the program to terminate nicely.
When searching and probing in the temporary fashion you seem to be after, it is completely legitimate to "hack it", as long as you remove the hack as soon as you have found the error.
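A minimal sketch of such a timer on POSIX systems, assuming the coverage runtime writes its data on a normal exit (which, as far as I know, gcov-instrumented programs do when exit() is called). The names install_time_limit and on_time_limit are hypothetical. Calling exit() from a signal handler is not async-signal-safe, so this really is a temporary hack of the kind described above.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Runs when the time limit expires. exit() is not async-signal-safe, which
   is tolerable only because this is a temporary measurement hack. */
static void on_time_limit(int sig)
{
    (void)sig;
    exit(EXIT_FAILURE);   /* normal exit path, so exit-time hooks still run */
}

/* Arm a one-shot timer. Returns 0 on success, -1 on failure. */
int install_time_limit(unsigned int seconds)
{
    assert(seconds > 0);
    if (signal(SIGALRM, on_time_limit) == SIG_ERR)
        return -1;
    alarm(seconds);
    return 0;
}
```

Calling install_time_limit at the top of main makes the program end itself instead of being killed from outside. The same handler could also be installed for SIGTERM if the external supervisor should stay in charge of the time limit.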
When we write C programs we make calls to malloc or printf. But do we need to check every call? What guidelines do you use?
e.g.
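char error_msg[BUFFER_SIZE];
if (fclose(file) == EOF) {
    /* snprintf cannot overflow error_msg; perror appends ": <reason>" and a newline itself */
    snprintf(error_msg, sizeof error_msg, "Error closing %s", filename);
    perror(error_msg);
}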
The answer to your question is "do whatever you want": there is no written rule. But the right question is "what do users want in case of failure?"
Let me explain. If you are a student writing a test program, for example, there is no absolute need to check for errors: it may be a waste of time.
Now, if your code may be distributed or used by other people, that's quite different: put yourself in the shoes of future users. Which message would you prefer when something goes wrong with an application:
Core was generated by `./cut --output-d=: -b1,1234567890- /dev/fd/63'.
Program terminated with signal SIGSEGV, Segmentation fault.
or
MySuperApp failed to start MySuperModule because there is not enough space on the disk.
Try to free space on disk, then relaunch the app.
If this error persists contact us at support#mysuperapp.com
As it has already been addressed in the comment, you have to consider two types of error:
A fatal error is one that kills your program (app / server / site / whatever it is). It renders it unusable, either by crashing or by putting it in a state where it can't do its useful work, e.g. a failed memory allocation or running out of disk space.
A non-fatal error is one where something goes wrong but the program can continue to do what it's supposed to do, e.g. a file is not found, and the server keeps serving the other users who did not request the thing that caused the error.
Source : https://www.quora.com/What-is-the-difference-between-an-error-and-a-fatal-error
Only check for errors if your program has to behave differently when an error is detected. Let me illustrate this with an example. Assume your program uses a temporary file and calls unlink(2) to erase it at the end. Do you have to check whether the file was successfully erased? Let's analyse the problem with some common sense: if you check for errors, will the program be able to do something different to cope with one? A failure here is uncommon (if you created the file, it's rare that you will be unable to erase it, although something can happen in between, for example a change in directory permissions that forbids you from writing to the directory anymore). But what can you do in that case? Is there a different way to erase the temporary file? Probably not, so checking for an error from unlink(2) is almost useless in that case.
Of course, this doesn't always apply; you have to use common sense while programming. Errors when writing to files should always be considered, as they usually stem from access permissions or, most often, from full filesystems. (In that case even trying to generate a log message can be useless, as you have filled your disk, or not; that depends.) Do you always know the environment precisely enough to decide that a full-filesystem error can be ignored? Suppose your program has to connect to a server. Should a failure of the connect(2) system call be acted upon? Most of the time, probably: at the very least the user should get a message with the protocol error (or the cause of the failure). Assuming everything goes OK can save you time in a prototype, but production programs have to cope with what can happen.
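The distinction drawn above can be sketched as follows: the write path checks every step because the caller can react to a failure, while removal of a temporary file is best-effort. The names save_text and discard_temp are hypothetical, and remove() stands in for unlink(2) to keep the sketch in standard C.

```c
#include <assert.h>
#include <stdio.h>

/* Write text to path. Returns 0 on success, -1 if any step failed. */
int save_text(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;                 /* e.g. missing directory, no permission */
    int ok = (fputs(text, f) != EOF);
    /* fclose flushes buffered data, so a full disk may only show up here. */
    if (fclose(f) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}

/* Best-effort removal of a temporary file: nothing useful to do on failure. */
void discard_temp(const char *path)
{
    assert(path != NULL);
    (void)remove(path);
}
```

The caller of save_text can tell the user that the data was not saved, which is a different behaviour worth coding for. The caller of discard_temp has no alternative action, so the result is deliberately ignored.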
When you want to use the return value of a function, it is advisable to check it before using it.
For example, a function that returns a pointer may return NULL, so add a null check before using the pointer.
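A short sketch of that advice, using hypothetical helpers dup_string and copied_length: the first can return NULL, and the second checks for it before touching the pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of s, or NULL if s is NULL or allocation fails. */
char *dup_string(const char *s)
{
    if (s == NULL)
        return NULL;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n);
    return copy;
}

/* Length of the copy, or 0 when the copy could not be made. */
size_t copied_length(const char *s)
{
    char *copy = dup_string(s);
    if (copy == NULL)              /* check before use */
        return 0;
    size_t len = strlen(copy);
    assert(len == strlen(s));
    free(copy);
    return len;
}
```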
I have a program which produces a fatal error with a testcase, and I can locate the problem by reading the log and the stack trace of the fatal - it turns out that there is a read operation upon a null pointer.
But when I try to attach gdb to it and set a breakpoint around the suspicious code, the null pointer just cannot be observed! The program works smoothly without any error.
This is a single-process, single-thread program, I didn't experience this kind of thing before. Can anyone give me some comments? Thanks.
Appended: I also tried to call pause() syscall before the fatal-trigger code, and expected to make the program sleep before fatal point and then attach the gdb on it on-the-fly, sadly, no fatal occurred.
It's only guesswork without looking at the code, but debuggers sometimes do this:
They initialize certain stuff for you
The timing of the operations is changed
I don't have a quote on GDB, but I do have one on valgrind (granted the two do wildly different things..)
My program crashes normally, but doesn't under Valgrind, or vice versa. What's happening?
When a program runs under Valgrind, its environment is slightly different to when it runs natively. For example, the memory layout is different, and the way that threads are scheduled is different.
Same would go for GDB.
Most of the time this doesn't make any difference, but it can, particularly if your program is buggy.
So the true problem is likely in your program.
There can be several things happening. The timing of the application can change: in a multi-threaded application, for example, one thread might set the ready flag first and only then copy the data into the buffer. Without a debugger attached, the other thread might then access the buffer before it is filled or before some pointer is set.
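For the ready-flag scenario, the ordering that avoids the race can be sketched with C11 atomics: fill the buffer first and publish the flag last. This is only an illustration of the general pattern, with struct mailbox, mailbox_put and mailbox_try_get as hypothetical names. It says nothing about the asker's program, which is described as single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical mailbox shared by a producer and a consumer thread. */
struct mailbox {
    char data[64];
    atomic_int ready;   /* 0 = empty, 1 = data is complete */
};

/* Producer: copy the data first, publish the flag last. */
void mailbox_put(struct mailbox *m, const char *msg)
{
    assert(strlen(msg) < sizeof m->data);
    strcpy(m->data, msg);
    /* Release: the write to data becomes visible before ready reads as 1. */
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Consumer: returns 1 and fills out only once the flag is visible. */
int mailbox_try_get(struct mailbox *m, char *out, size_t out_size)
{
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return 0;       /* nothing published yet */
    assert(out_size >= sizeof m->data);
    strcpy(out, m->data);
    return 1;
}
```

Setting the flag before the copy, or using a plain int for the flag, is the kind of bug that a debugger's changed timing can hide.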
It could also be that the application has anti-debug functionality. Maybe the piece of code is never touched when running inside a debugger.
One way to analyze it is with a core dump. You can create one by running ulimit -c unlimited and then starting the application; when the core is dumped, you can load it into gdb with gdb ./application ./core. You can find a useful write-up here: http://www.ffnn.nl/pages/articles/linux/gdb-gnu-debugger-intro.php
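If changing the shell's limits is inconvenient, roughly the same effect as ulimit -c can be requested from inside the program on POSIX systems. The sketch below raises the soft core-size limit as far as the hard limit allows; enable_core_dumps is a hypothetical name, and whether a core file is actually written still depends on system settings outside the program's control.

```c
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft core-size limit to the hard limit. Returns 0 on success. */
int enable_core_dumps(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    assert(lim.rlim_cur <= lim.rlim_max);   /* sanity check on the limits */
    lim.rlim_cur = lim.rlim_max;            /* as high as this process may go */
    if (setrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    return 0;
}
```

Calling this early in main means the crash outside the debugger leaves a core file that can then be opened in gdb as described above.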
If it is an invalid read on a pointer, then unpredictable behaviour is possible. Since you already know what is causing the fault, you should get rid of it asap. In general, expect the unexpected when dealing with faulty pointer operations.
I have some image processing code that runs on a background thread and updates an Image control on the UI thread when it's done processing using Dispatcher.BeginInvoke(). When I'm running my application outside of the debugger, it crashes quite often. As soon as I run it in the debugger, I can't get it to happen at all. Apparently the timing difference is enough to make my life miserable right now ;-)
I've tried putting try/catch blocks around any code that seems relevant and logging any errors that come up, but to no avail - it somehow keeps slipping past me, and I'm not sure where else to look.
My hope for using the debugger was to set the debugger's exception catching behavior to break whenever any exception was thrown, but since I can't get the exception to happen while debugging, I can't find out where my code is throwing.
I can attach to my crashed process (since it stays on screen, just is completely unresponsive), pause the debugger, and see where each thread is in the code, but that doesn't really help me - I have no idea what the actual exception being thrown is.
Any suggestions on how to proceed?
Edit:
I've been using System.Diagnostics.Trace.WriteLine() with DbgView in as many places as I can think of. I can track down where the exception appears to be occurring, but I can't find out what the exception is, which is what matters.
I've used WinDBG+SOS before, to track down memory leaks, but not to track down hard-to-find-exceptions. Can anyone suggest resources for using WinDBG+SOS in this capacity?
With only few exceptions every BeginInvoke should have a corresponding EndInvoke counterpart (see here for more details why, one exception is e.g. Control.BeginInvoke).
A missing EndInvoke might be the reason that an exception is not caught by the main thread and your application terminates.
Since in your special case you are dealing with Dispatcher (which does not implement EndInvoke) you will have to handle the Dispatcher.UnhandledException event to catch any exception thrown during execution of a delegate.
By the way, a good tool to monitor the System.Diagnostics.Trace messages is DbgView from Sysinternals.
How can you be sure it's an exception, when you've never caught it?
Also, rather than placing try/catch blocks all over the place, place them at boundaries, especially at thread boundaries. Put them around any ThreadStart methods or other code invoked by BeginInvoke, or callback methods, etc.
Can you put one try/catch in the static Main called on application start? Then, when the app crashes, you can output the stack trace and exception info somewhere.