All my local variables are deleted - c

In my C program, after I call a function, all the variables in the outer function are disappearing. The program no longer recognizes that they exist, and trying to access them causes an error.
void outer_function()
{
    int x = 0;
    inner_function();
    printf("%d\n", x); // Throws an error because x does not exist
}
I'm not sure what in inner_function() is causing it, and the function is too long to paste here. What sort of behavior could cause the local variables in outer_function() to disappear? The only thing I can think of is that inner_function() is writing over outer_function()'s memory, but it seems like that would only change the contents of the variables, not delete them.
Edit: I don't think there's really a whole lot more I can tell you. gcc said EXC_BAD_ACCESS and then "warning: Unable to restore previously selected frame," and then crashed. I know it's difficult for you to say what's actually causing it without seeing the whole function, which is why I initially just asked what sort of bug could cause behavior like this.

Without seeing a complete, compilable code snippet, it's impossible to say. The only thing I can think of is that inner_function() is actually some perverse macro that's screwing things up.

Are you 100% sure that printf("%d\n", x); is the line that is causing the error? Have you stepped through this? I would add some lines to print the output of x before, during, and after the inner_function() to see exactly where the problem lies. I have a feeling that you have a problem inside the inner_function().

Once you enter the realm of undefined behaviour all bets are off, so if there is any undefined behaviour at all inside inner_function() the subsequent behaviour of your entire program and hence outer_function() is also undefined.

Maybe you declare and define inner_function in different ways (cdecl and stdcall).

You should still go back and edit your question to say how your program is failing and what "local variables are being deleted" actually means. That said, this is the type of thing that could cause a program to lose the value of a variable from a different scope.
void inner_function(void) {
    int x[1];
    /* Deliberate bug: zeroes ten ints' worth of memory, but x has room for one. */
    memset(x, 0, 10 * sizeof(x));
}
This should actually fail when the function tries to return. This is called a buffer overflow because you have a buffer (a range of memory used to hold something) that you have permission (from the C programming language) to edit, but you edit that and a lot more. That "a lot more" data is other memory that the compiler expected that you would not edit like the return address and variables in other scopes.
This example is deliberately simple so that it is easy to follow; if your inner_function does suffer from this type of error, it probably won't be as obvious. It is also possible for a buffer overflow to miss the return address, so that inner_function returns without failing but local variables in outer_function end up changed (which is what I think you were saying is happening in your code). To write a usable example of that on purpose, I would need to know a lot more about your platform, compiler, and compiler options, so that I could make educated guesses about where things sit on the stack relative to the top of the stack (which is the current function's stack frame).

Related

How does C manage stack with pointers?

So I was messing around with dynamic memory and pointers, and I was wondering how C manages the stack when it comes to pointers that point to local variables.
I came up with this simple function:
int* dummy(){
    int test = 4;
    int *t2;
    t2 = &test;
    return t2;
}
This function initializes a pointer and an int as local variables (the int should not be accessible outside of my function, as the stack state will be restored once I leave the function). However, I am returning the pointer as the result of my function.
I can get the pointer back and print the value of my local variable with :
#include <stdio.h>
int main(void){
    int* p = dummy();
    // some other calls to other functions to mess up the stack below,
    // where my local variable "test" was supposed to be landing
    printf("%d\n", *p); // printing the value of "test" (which is 4)
}
Result
$ ./a.out
4
Why is this printing the correct result? Isn't the pointer pointing at a variable in a stack from an other state? I am confused.
If the memory stays somewhere without dynamic allocation, where does it stay? Is it lost forever? (no way to "free" it)
EDIT after the comments
The behavior is undefined. Compiling with warning options such as -pedantic prints a warning that I am returning a pointer to a local variable, and the resulting executable is buggy.
The reason is that dummy's stack state is lost when the program exits the function, so the values of its local variables are no longer guaranteed, because they are... local.
One of the possible outcomes of undefined behavior is - behaving as expected.
Once dummy returns, test no longer exists - logically speaking. However, the region of the stack it occupied may not be immediately overwritten, so that value may persist in that (virtual) location for some time afterwards.
The pointer is invalid - we’re using it to access an object outside of that object’s lifetime - so the behavior is definitely undefined. But that doesn’t mean that the value must be something other than 4.
Undefined behavior is undefined. There is no value this program can print that is, in any sense, "correct". What's confusing you is that you think there is some "correct" value this program can print and therefore you wonder why it's printing the "correct" value.
The problem is entirely in your incorrect understanding that some value is more "correct" than some other value. All values are equally "correct" for this program.
I was wondering how C was managing the stack when it comes to pointers that points to local variables
It does not. C does not even require a stack. What you are doing is undefined behavior which means that the C standard does not impose ANY requirements on the compiler.
So the answer to your question has nothing to do with the C standard. It can be boiled down to "how it's usually handled". But when you do stuff like this, it is, as I mentioned, undefined behavior. So this kind of code is likely to break as soon as you turn on any compiler optimizations.
It's possible to make educated guesses about the behavior, but you have no guarantees.
Actually, it's kind of like looking into your neighbour's window and being surprised that it's the same neighbour you've had for years. You have no guarantee that your neighbour hasn't suddenly moved out and someone else moved in. In your case, the neighbour simply hasn't moved out yet.

Stack overflow happening when changing a line which is never reached - why and how to prevent it?

I'm developing something in an embedded context with Zephyr.
Essentially I'm dealing with a boot-loop caused by a stack overflow. The stack overflow goes away when I change an unused parameter of a function call deep inside my main. To make sure that the problem is not with the inside of the function, I hard-coded its implementation to be return 0;.
Written like this, the offending line causes a boot loop:
uint8_t port;
ret = foo(&port, NULL, NULL);
But with the &port argument replaced by NULL, the code runs normally:
uint8_t port;
ret = foo(NULL, NULL, NULL);
Mind you, as I've already said, the implementation of foo is hard-coded to return 0. The parameters are at no point used. Furthermore, I'm sure the line is never actually reached at runtime (in this case) as it lives behind some conditionals requiring my interaction to actually go through.
I was starting to give up and blame faulty memory or ESD damage, but when I tried the same code with the same changes on a spare piece of hardware I had lying around, the same thing happened. What am I missing? I genuinely don't know what else I could do to find out why this is happening and how to fix it. I don't have access to a debugger for this microcontroller (SAMD21), so I'm at a bit of a loss... Any ideas (or at least sympathy)?
When you remove that parameter does it run without any errors or are there other errors? If you are writing to the wrong memory (e.g. memory that was allocated with a size of zero) somewhere in your program, changes to unrelated parts of the program's code, such as changing the size of a struct, or the parameters of a function, could change where a fatal error occurs and what kind of fatal error it is.
Nevermind, I've found the culprit - a simple stack overflow. I was one byte away from it before the addition of the uint8_t port variable declaration into main. The variable when not used as a parameter in foo() was being optimised away by the compiler. Having one fewer byte on the call stack apparently was enough to prevent the overflow.
Solution: increase stack size and be more careful with clogging it up with unnecessary items.

Why is the compiler OK with this?

I spent an embarrassing amount of time last night tracking down a segfault in my application. Ultimately, it turned out I'd written:
ANNE_SPRITE_FRAME *desiredFrame;
*desiredFrame = anne_sprite_copy_frame(&sprite->current);
instead of:
ANNE_SPRITE_FRAME desiredFrame;
desiredFrame = anne_sprite_copy_frame(&sprite->current);
In line 1 I created a typed pointer, and in line 2 I set the value of the dereferenced pointer to the struct returned by anne_sprite_copy_frame().
Why was this a problem? And why did the compiler accept this at all? All I can figure is that the problem in example 1 is either:
I'm reserving space for the pointer but not the contents that it points to, or
(unlikely) it's trying to store the return value in the memory of the pointer itself
In line 1 I've created a typed pointer, and in line 2 I set the value of the dereferenced pointer to the struct returned by anne_sprite_copy_frame().
Both of these are allowed in C, which is why this is perfectly acceptable by the compiler.
The compiler doesn't check to make sure your pointer actually points to anything meaningful - it just dereferences and assigns.
One of the best and worst features of C is that the compiler does very little sanity checking for you - it follows your instructions, and does exactly what you tell it to do. You told it to do two legal operations - even though the variables were not initialized properly. As such, you get runtime issues, not compile time problems.
I'm reserving space for the pointer but not the contents that it points to
Yeah, exactly. But the compiler (unless it does some static analysis) can't infer that. It only sees that the syntax is valid and the types match, so it compiles your program. Dereferencing an uninitialized pointer is undefined behavior, though, so your program will most likely work erroneously.
The pointer is uninitialized, but it still has a value so it points somewhere. Writing the return value to that memory address overwrites whatever happens to be there, invoking undefined behavior.
Technically the compiler is not in the business of telling you that a syntactically valid construct will result in undefined (or even likely unexpected) behavior, but I would be surprised if there was no warning issued about this particular usage.
C is weakly typed. You can assign anything to anything with the obvious consequences. You have to be very careful and disciplined if you do not want to spend nights uncovering bugs that turn out "stupid". I mean no offense. I went through the same issues due to an array bound overflow that overwrote other variables and only showed up in some other part of the code trying to use these variables. Nightmare! That's why Java is so much easier to deal with. With C you are an acrobat without a net, with Java, you can afford to fall. That said, I do not mean to say Java is better. C has its raison d'etre.

Float value suddenly becoming huge

I would rather not dump code, but explain my problem. After hours of debugging I managed to understand that at some point in my code, a float value that is not explicitly modified turns HUGE (more than 1e15). I do use a lot of memory in my program (a string array containing 800+ words), other than that though, I have no idea what could cause this.
If anyone has any ideas regarding this, please share. Otherwise, I'll post a pastebin of the code soon.
EDIT:
Here is the code: http://pastebin.com/vgiZweNq. The problem rests in the next_generation() function, where the sumfit variable goes nuts at random times in the loop.
Also, I've compiled this on linux using -fno-stack-limit and -fstack-check, to avoid stack overflows.
EDIT 2:
I've changed the program to use a dynamically allocated linked list, to further avoid stack overflows. Still, sumfit gets changed to Floatzilla at random points, usually pretty early on.
Cheers!
Since the variable is obviously being modified from an unexpected point, you might want to check some possibilities:
Is it being modified from a different thread or from an interrupt / event handler? If so, is the access properly synchronized to prevent a data race?
Are you doing pointer arithmetic that might be buggy and cause access outside the intended buffer?
Are you casting pointers between types of different sizes?
Especially if you are working on an embedded device: Maybe the memory is full and your stack is overlapping the heap, or the global variables.
More information about the platform this happens on would be helpful.
You're using strcpy on the chrom array, but i don't see where they ever get null terminated.
Maybe I'm just missing it, though.
You've got a huge string array. I reckon you're probably going off the end of it. Keep track of the size of data going into that array.

Bizarre bug in C [closed]

So I have a C program. And I don't think I can post any code snippets due to complexity issues. But I'll outline my error, because it's weird, and see if anyone can give any insights.
I set a pointer to NULL. If, in the same function where I set the pointer to NULL, I printf() the pointer (with "%p"), I get 0x0, and when I print that same pointer a million miles away at the end of my program, I get 0x0. If I remove the printf() and make absolutely no other changes, then when the pointer is printed later, I get 0x1, and other random variables in my structure have incorrect values as well. I'm compiling it with GCC at -O2, but it has the same behavior if I take off optimization, so that's not the problem.
This sounds like a Heisenbug, and I have no idea why it's happening, nor how to fix it. Does anyone who has dealt with something like this in the past have advice on how they approached this kind of problem? I know this may sound kind of vague.
EDIT: Somehow, it works now. Thank you, all of you, for your suggestions.
The debugger told me interesting things - that my variable was getting optimized away. So I rewrote the function so it didn't need the intermediate variable, and now it works with and without the printf(). I have a vague idea of what might have been happening, but I need sleep more than I need to know what was happening.
Are you using multiple threads? I've often found that the act of printing something out can be enough to effectively suppress a race condition (i.e. not remove the bug, just make it harder to spot).
As for how to diagnose/fix it... can you move the second print earlier and earlier until you can see where it's changing?
Do you always see 0x1 later on when you don't have the printf in there?
One way of avoiding the delay/synchronization of printf would be to copy the pointer value into another variable at the location of the first printf and then print out that value later on - so you can see what the value was at that point, but in a less time-critical spot. Of course, as you've got odd value "corruption" going on, that may not be as reliable as it sounds...
EDIT: The fact that you're always seeing 0x1 is encouraging. It should make it easier to track down. Not being multithreaded does make it slightly harder to explain, admittedly.
I wonder whether it's something to do with the extra printf call making a difference to the size of stack. What happens if you print the value of a different variable in the same place as the first printf call was?
EDIT: Okay, let's take the stack idea a bit further. Can you create another function with the same sort of signature as printf and with enough code to avoid it being inlined, but which doesn't actually print anything? Call that instead of printf, and see what happens. I suspect you'll still be okay.
Basically I suspect you're screwing with your stack memory somewhere, e.g. by writing past the end of an array on the stack; changing how the stack is used by calling a function may be disguising it.
If you're running on a processor that supports hardware data breakpoints (like x86), just set a breakpoint on writes to the pointer.
Do you have a debugger available to you? If so, what do the values look like in that? Can you set any kind of memory/hardware breakpoint on the value? Maybe there's something trampling over the memory elsewhere, and the printf moves things around enough to move or hide the bug?
Probably worth looking at the asm to see if there's anything obviously wrong there. Also, if you haven't already, do a full clean rebuild. If the definition of the struct has changed recently, there's a vague chance that the compiler could be getting it wrong if the dependency checking failed to correctly rebuild everything it needed to.
Have you tried setting a condition in your debugger which notifies you when that value is modified? Or running it through Valgrind? These are the two major things that I would try, especially Valgrind if you're using Linux. There's no better way to figure out memory errors.
Without code, it's a little hard to help, but I understand why you don't want to foist copious amounts on us.
Here's my first suggestion: use a debugger and set a watchpoint on that pointer location.
If that's not possible, or the bug disappears again, here's my second suggestion.
1/ Start with the buggy code, the one where you print the pointer value and you see 0x1.
2/ Insert another printf a little way back from there (in terms of code execution path).
3/ If it's still 0x1, go back to step 2, moving a little back through the execution path each time.
4/ If it's 0x0, you know where the problem lies.
If there's nothing obvious between the 0x0 printf and the 0x1 printf, it's likely to be corruption of some sort. Without a watchpoint, that'll be hard to track down - you need to check every single stack variable to ensure there's no possibility of overrun.
I'm assuming that pointer is a global since you set it and print it "a million miles away". If it is, look at the variables you define on either side of it (in the source). They're the ones most likely to be causing overrun.
Another possibility is to turn off the optimization to see if the problem still occurs. We've occasionally had to ship code like that in cases where we couldn't fix the bug before deadlines (we'll always go back and fix it later, of course).

Resources