C : Passing auto variable by reference - c

I came across a piece of code which is working fine as of now,but in my opinion its undefined behavior and might introduce a bug in future.
Pseudo code :
void OpertateLoad(int load_id)
{
int value = 0;
/* code to calculate value */
SetLoadRequest(load_id,&value);
/*some processing not involving value**/
}
void SetLoadRequest(int load_id, int* value)
{
/**some processing**/
LoadsArray[load_id] = *value;
/**some processing**/
}
In my understanding C compiler will not guarantee where Auto variables will be stored. It could be stack/register(if available and suitable for processing).
I am suspecting that if compiler decides to store value onto general purpose register then, SetLoadRequest function might refer to wrong data.
Am I getting it right?or I am overthinking it?
I am using IARARM compiler for ARM CORTEX M-4 processor.
----------:EDIT:----------
Answers Summarize that " Compiler will ensure that data is persisted between the calls, no matter where the variable is stored ".
Just want to confirm : Is this behavior also true 'if a function is returning the address of local auto variable and caller is de-referencing it?'.
If NO then is there anything in C standard which guarantees the behavior in both cases? Or As I stated earlier its undefined behavior?

You are overthinking it. The compiler knows that, if value is in a register, that it must be stored to memory before passing a pointer to that memory to SetLoadRequest.
More generally, don't think about stack and registers at all. The language says there's a variable (without saying how it's implemented), and that you can take its address and use that in another function to refer to the variable. So you can!
The language also says that local variables cease to exist when leaving a block, so this permission does not extend to returning pointers to local variables (which causes undefined behavior if the caller does anything at all with the pointer).

I am overthinking it?
Yes!
The compiler will take care of this. If it stores it in a register, it will know how to handle it (with a memory load).

C11 draft standard n1570:
6.2.4 Storage durations of objects
6 For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the
object is created each time. The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration or
compound literal is reached in the execution of the block; otherwise, the value becomes
indeterminate each time the declaration is reached.

To my understanding, the scope of value is the body of OpertateLoad. However, SetLoadRequest assigns the value pointed to, so the actual value of value is copied. No undefined behaviour is involved.

Related

What value does a const variable take when it's initialized by a non-const and non-initialized variable?

#include <stdio.h>
int main()
{
int a;
const int b = a;
printf("%d %d\n", a, b);
return 0;
}
The same code I tried to execute on onlinegdb.com compiler and on Ubuntu WSL. In onlinegdb.com, I got both a and b as 0 with every run, whereas in WSL it was a garbage value. I am not able to understand why garbage value is not coming onlinegdb.com
Using int a; inside a function is described by C 2018 6.2.4 5:
An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals…
Paragraph 6 continues:
… The initial value of the object is indeterminate…
An indeterminate value is not an actual specific value but is a theoretical state used to describe the semantics of C. 3.19.2 says it is:
… either an unspecified value or a trap representation…
and 3.19.3 says an unspecified value is:
… valid value of the relevant type where this document imposes no requirements on which value is chosen in any instance…
That “any instance” part means that the program may behave as if a has a different value each time it is used. For example, these two statements may print different values for a:
printf("%d\n", a);
printf("%d\n", a);
Regarding const int b = a;, this is not covered explicitly by the C standard, but I have seen a committee response: When an indeterminate value is assigned (or initialized into) another object, the other object is also said to have an indeterminate value. So, after this declaration, b has an indeterminate value. The const is irrelevant; it means the source code of the program is not supposed to change b, but it cannot remedy the fact that b does not have a determined value.
Since the C standard permits any value to be used in each instance, onlinegdb.com conforms when it prints zero, and WSL conforms when it prints other values. Any int values printed for printf("%d %d\n", a, b); conform to the C standard.
Further, another provision in the C standard actually renders the entire behavior of the program undefined. C 2018 6.3.2.1 2 says:
… If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
This applies to a: It could have been declared register because its address is never taken, and it is not initialized or assigned a value. So, using a has undefined behavior, and that extends to the entire behavior of the program on the code branch where a is used, which is the entire program execution.
I am not able to understand why garbage value is not coming
This is a very strange statement. I wonder: what kind of answer or explanation do you expect you might get? Something like:
Everyone who said that "uninitialized local variables start out containing random values" lied to you. WSL was wrong for giving you random values. You should have gotten 0, like you did with onlinegdb.com.
onlinegdb.com is buggy. It should have given truly random values.
The rules for const variables are special. When you say const int b = a;, it magically makes a's uninitialized value more predictable.
Are you expecting to get an answer like any of those? Because, no, none of those is true, none of those can possibly be true.
I'm sorry if it sounds like I'm teasing you here. I agree, it's surprising at first if an uninitialized local variable always starts out containing 0, because that's not very random, is it?
But the point is, the value of an uninitialized local variable is not defined. It is unspecified, indeterminate, and/or undefined. You cannot know what it is going to be. But that means that no value — no possible value — that it contains can ever be "wrong". In particular, onlinegdb.com is not wrong for not giving you random values: remember, it's not obligated to give you anything!
Think about it like this. Suppose you buy a carton of milk. Suppose it's printed on the label, "Contents may spoil — keep refrigerated." Suppose you leave the carton of milk on the counter overnight. That is, suppose you fail to properly refrigerate it. Suppose that, a day later, you realize your mistake. Horrified, you carefully open the milk carton and take a small taste, to see if it has spoiled. But you got lucky! It's still okay! It didn't spoil!
Now, at this point, what do you do?
Hastily put the milk in the refrigerator, and vow to be more careful next time.
March back to the store where you bought the milk, and accuse the shopkeeper of false advertising: "The label says 'contents may spoil', but it didn't!!" What do you think the shopkeeper is going to say?
This may seem like a silly analogy, but really, it's just like your C/C++ coding situation. The rules say you're supposed to initialize your local variables before you use them. You failed. Yet, somehow, you got predictable values anyway, at least under one compiler. But you can't complain about this, because it's not causing you a problem. And you can't depend on it, because as your experience with the other compiler showed you, it's not a guaranteed result.
Typically, local variables are stored on the stack. The stack gets used for all sorts of stuff. When you call a function, it receives a new stack frame where it can store its local variables, and where other stuff pertaining to that function call is stored, too. When a function returns, its stack frame is popped, but the memory is typically not cleared. That means that, when the next function is called, its newly-allocated stack frame may end up containing random bits of data left over from the previous function whose stack frame happened to occupy that part of stack memory.
So the question of what value an uninitialized local variable contains ends up depending on what the previous function might have been and what it might have left lying around on the stack.
In the case of main, it's quite possible that since it's the first function to be called, and the stack might start out empty, that main's stack frame always ends up being built on top of virgin, untouched, all-0 memory. That would mean that uninitialized variables in main might always seem to start out containing 0.
But this is not, not, not, not, not guaranteed!!!
Nobody said the stack was guaranteed to start out containing 0. Nobody said that there wasn't some startup code that ran before main that might have left some random garbage lying around on the stack.
If you want to enumerate possibilities, I can think of 3:
The function you're wondering about is one that always gets called first, or always gets called at a "deep leaf" of the call stack, meaning that it always gets a brand-new stack frame, and on a machine where the stack always starts out containing 0. Under these circumstances, uninitialized variables might always seem to start out containing 0.
The function you're wondering about does not always get a brand-new stack frame. It always gets a "dirty" stack frame with some previous function's random data lying around, and the program is such that, during every run, that previous function was doing something different and left something different on the stack, such that the next function's uninitialized local variables always seem to start out containing different, seemingly random values.
The function you're wondering about is always called right after a previous function that always does the same thing, meaning that it always leaves the same values lying around, meaning that the next function's uninitialized local variables always seem to start out with the same values every time, which aren't zero but aren't random.
But I hope it's obvious that you absolutely can't depend on any of this! And of course there's no reason to depend on any of this. If you want your local variables to have predictable values, you can simply initialize them. But if you're curious what happens when you don't, hopefully this explanation has helped you understand that.
Also, be aware that the explanation I've given here is somewhat of a simplification, and is not complete. There are systems that don't use a conventional stack at all, meaning that none of those possibilities 1, 2, or 3 could apply. There are systems that deliberately randomize the stack every time, either to help new programs not to accidentally become dependent on uninitialized variables, or to make sure that attackers can't exploit certain predictable results of a badly written program's undefined behavior.
When your operating system gives your program memory to work with, it will likely be zero to start (though not guaranteed). As your program calls functions it creates stack frames, and your program will effectively go from the .start assembly function to the int main() c function, so when main is called, no stack frame has written the memory that local variables are placed at. Therefore, a and b are both likely to be 0 (and b is guaranteed to be the same as a). However, it's not guaranteed to be 0, especially if you call some functions that have local variables or lots of parameters. For instance, if your code was instead
#include <stdio.h>
void foo()
{
int x = 42;
}
int main()
{
foo();
int a;
const int b = a;
printf("%d %d\n", a, b);
return 0;
}
then a would PROBABLY have the value 42 (in unoptimized builds), but that would depend on the ABI (https://en.wikipedia.org/wiki/Application_binary_interface) that your compiler uses and probably a few other things.
Basically, don't do that.

Array is not erased after end of function call

in school the teacher told me that when the call to a function reaches an end everything declared inside the function's block will be erased.
But I wrote the following code:
int * secret()
{
int arr[10]={0};
arr[0]=9999;
return arr;
}
int main() {
printf("%d",secret()[0]);
return 0;
}
and the output was 9999 which doesn't suit what I was taught.
in school the teacher told me that when the call to a function reaches an end everything declared inside the function's block will be erased.
That is a misleading characterization. If your instructor used that specific wording then they did you a disservice. It is not altogether wrong, though, depending on how one interprets it.
What the language specification says is that the lifetime of an object declared, without a storage-class specifier, inside a block (such as the block serving as the function body in a function definition) ends when execution of the innermost containing block ends. You might characterize that as such objects being "erased" in the sense of erasure from existence, but not in the sense of having their contents cleared.
the output was 9999 which doesn't suit what I was taught.
Trying to be as charitable towards the instructor as possible, I do suggest considering that you may have misunderstood what they were trying to tell you.
In any event, attempting to access an object whose lifetime has ended produces undefined behavior. Moreover, when an object's lifetime ends, the values of any pointers to that object become indeterminate. This means that your program's output does not, indeed cannot, contradict what the language specification says about the situation, because the program output is undefined. Any output or none would be equally consistent with C language semantics. If we suppose that your instructor was trying to convey a characterization consistent with the specification, then they are not contradicted either.
In your example, the local variable "arr" gets allocated on the stack. When you return from secret(), the program releases that portion of the stack for use by the next function that gets called, but the pointer you returned is still pointing to that memory location. It still exists as it was until another function comes along and uses that portion of the stack.
Your teacher was correct if you assume they meant that the memory should not be used after returning, but specifically, the stack is not erased as part of the return. This is because by definition the local variables are no longer in scope, and cannot be expected to be in any particular state. There is no reason to use processor cycles to erase it.
The C language requires the designer to manage their use of pointers, which can be both useful and terrible. It gives the designer a lot of freedom, but it is totally up to the designer to know what is on the other end of that pointer and what types of operations they should being doing with it.
In most implementations function epilogue just changes the stack pointer but does not purge the actual data. If you need to purge the data you
need to do it yourself.
void foo()
{
char verySecret[5000];
char verySecret2[5000];
char verySecret4[5000];
/* do something */
/* now purge the data */
purge(verySecret,0, sizeof(verySecret));
purge(verySecret2,0, sizeof(verySecret2));
purge(verySecret4,0, sizeof(verySecret4));
}

why garbage value is stored during declaration?

I heard several times that if you do not initialise a variable then garbage value is stored in it.
Say
int i;
printf("%d",i);
The above code prints any garbage value, but I want to know that what is the need for storing garbage value if uninitialized?
The value of an uninitialized value is not simply unknown or garbage, but indeterminate and evaluating that variable may invoke undefined behavior or implementation-defined behavior.
One possible scenario (which is probably the scenario you are seeing) is that the variable, when evaluated, will return the value that was previously present in that memory address. Therefore, it's not like garbage is explicitly written to that variable.
It's worth noting that languages (or even C implementations) that do not exhibit the behavior you're seeing, do so by explicitly writing zeroes (or other initial values) to that area, before allowing you to use it.
It is not storing garbage, it prints whatever happens to be there in memory at that address when it is running. This is in the name of efficiency. You don't pay for what you didn't ask for.
EDIT
To answer why there is something in memory. All sort of program runs and need to share memory. When memory is allocated to your process, it is not reset, again for performance reason. Since the variable we are observing is declared on the stack, it could even be your program that put the value there in a previous function call.
C only does what you tell it to. The standard defines reading an uninitialized variable as undefined behavior.
This question elaborates: (Why) is using an uninitialized variable undefined behavior?
The accepted answer has a very good explanation.
EDIT:
A funny sidenote though, if you declare the variable static it is guaranteed to be initialized to zero per the standard. Can't find a quote right now, working on it..
EDIT2:
I left my C reference at work and CBA to download one. This answer elaborates on the initial values of variables, whether they be local/auto, global, static or indeterminate: https://stackoverflow.com/a/1597491/700170
The other answers point out (correctly) that what's being printed is whatever's already in memory in the memory location that happens to have been assigned to i.
They don't, however, clarify why there are any values stored in these locations in the first place, which is perhaps what you're really asking.
There are two reasons for this: first, upon startup, we can't be sure exactly how the memory circuits will initialize themselves. So they could be set to any arbitrary value. The second (and, in general, more likely reason, unless you just restarted your computer) is that before you started your program, that memory location had been used by another program, which stored something there--something that wasn't garbage at the time, since it was stored intentionally. From the perspective of your program, however, it is garbage, since your program has no way of knowing why that particular value was stored there.
EDIT: As I mentioned in a comment on another answer, even if the value stored in memory under some uninitialized variable is actually 0, that's not the same thing as "not having a value stored." The value stored is 0, which is to say, the physical hardware that represents one bit of memory is faithfully storing the value 0. As long as a circuit is active (i.e. turned on), the memory cells must store something; for an explanation of why this is, look into flip-flop gates. (There's a decent overview here, assuming you already understand a little bit about NAND gates: http://computer.howstuffworks.com/boolean4.htm)
It's happens only in case of local varibales. As memory for local variables are allocated on stack and while allocating the memory the runtime system does not clear the memory before allocating it to the variable unlike in case of allocating memory in heap for global and static variables. Hence the default value of local varibles beomes the content of its memory on stack while that of constant and static variables is 0.

Local variable still exists after function returns

I thought that once a function returns, all the local variables declared within (barring those with static keyword) are garbage collected. But when I am trying out the following code, it still prints the value after the function has returned. Can anybody explain why?
int *fun();
main() {
int *p;
p = fun();
printf("%d",*p); //shouldn't print 5, for the variable no longer exists at this address
}
int *fun() {
int q;
q = 5;
return(&q);
}
There's no garbage collection in C. Once the scope of a variable cease to exist, accessing it in any means is illegal. What you see is UB(Undefined behaviour).
It's undefined behavior, anything can happen, including appearing to work. The memory probably wasn't overwritten yet, but that doesn't mean you have the right to access it. Yet you did! I hope you're happy! :)
If you really want it to loose the value, perhaps call another function with at least a few lines of code in it, before doing the printf by accessing the location. Most probably your value would be over written by then.
But again as mentioned already this is undefined behavior. You can never predict when (or if at all) it crashes or changes. But you cannot rely upon it 'changing or remaining the same' and code an application with any of these assumptions.
What i am trying to illustrate is, when you make another function call after returning from previous one, another activation record is pushed on to the stack, most likely over writing the previous one including the variable whose value you were accessing via pointer.
No body is actually garbage collecting or doing a say memset 0 once a function and it's data goes out of scope.
C doesn't support garbage collection as supported by Java. Read more about garbage collection here
Logically, q ceases to exist when fun exits.
Physically (for suitably loose definitions of "physical"), the story is a bit more complicated, and depends on the underlying platform. C does not do garbage collection (not that garbage collection applies in this case). That memory cell (virtual or physical) that q occupied still exists and contains whatever value was last written to it. Depending on the architecture / operating system / whatever, that cell may still be accessible by your program, but that's not guaranteed:
6.2.4 Storage durations of objects
2 The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address,33)
and retains
its last-stored value throughout its lifetime.34)
If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to (or just past) reaches the end of its lifetime.
33) The term ‘‘constant address’’ means that two pointers to the object constructed at possibly different
times will compare equal. The address may be different during two different executions of the same
program.
34) In the case of a volatile object, the last store need not be explicit in the program.
"Undefined behavior" is the C language's way of dealing with problems by not dealing with them. Basically, the implementation is free to handle the situation any way it chooses to, up to ignoring the problem completely and letting the underlying OS kill the program for doing something naughty.
In your specific case, accessing that memory cell after fun had exited didn't break anything, and it had not yet been overwritten. That behavior is not guaranteed to be repeatable.

Why we must initialize a variable before using it? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
What happens to a declared, uninitialized variable in C? Does it have a value?
Now I'm reading Teach Yourself C in 21 Days. In chapter 3, there is a note like this:
DON'T use a variable that hasn't been initialized. Results can be
unpredictable.
Please explain to me why this is the case. The book provide no further clarification about it.
Because, unless the variable has static storage space, it's initial value is indeterminate. You cannot rely on it being anything as the standard does not define it. Even statically allocated variables should be initialized though. Just initialize your variables and avoid a potential headache in the future. There is no good reason to not initialize a variable, plenty of good reasons to do the opposite.
On a side note, don't trust any book that claims to teach you X programming language in 21 days. They are lying, get yourself a decent book.
When a variable is declared, it will point to a piece of memory.
Accessing the value of the variable will give you the contents of that piece of memory.
However until the variable is initialised, that piece of memory could contain anything. This is why using it is unpredictable.
Other languages may assist you in this area by initialising variables automatically when you assign them, but as a C programmer you are working with a fairly low-level language that makes no assumptions about what you want to do with your program. You as the programmer must explicitly tell the program to do everything.
This means initialising variables, but it also means a lot more besides. For example, in C you need to be very careful that you de-allocate any resources that you allocate once you're finished with them. Other languages will automatically clean up after you when the program finished; but in C if you forget, you'll just end up with memory leaks.
C will let you get do a lot of things that would be difficult or impossible in other languages. But this power also means that you have to take responsibility for the housekeeping tasks that you take for granted in those other languages.
You might end up using a variable declared outside the scope of your current method.
Consider the following
int count = 0;
while(count < 500)
{
doFunction();
count ++;
}
...
void doFunction() {
count = sizeof(someString);
print(count);
}
In C, using the values of uninitialized automatic variables (non-static locals and parameter variables) that satisfy the requirement for being a register variable is undefined behavior, because such variables values may have been fetched from a register and certain platforms may abort your program if you read such uninitialized values. This includes variables of type unsigned char (this was added to a later C spec, to accommodate these platforms).
Using values of uninitialized automatic variables that do not satisfy the requirement for being a register variable, like variables that have their addresses taken, is fine as long as the C implementation you use hasn't got trap representations for the variable type you use. For example if the variable type is unsigned char, the Standard requires all platforms not to have trap representations stored in such variables, and a read of an indeterminate value from it will always succeed and is not undefined behavior. Types like int or short don't have such guarantees, so your program may crash, depending on the C implementation you use.
Variables of static storage duration are always initialized if you don't do it explicitly, so you don't have to worry here.
For variables of allocated storage duration (malloc ...), the same applies as for automatic variables that don't satisfy the requirements for being a register variable, because in these cases the affected C implementations need to make your program read from memory, and won't run into the problems in which register reads may raise exceptions.
In C the value of an uninitialized variable is indeterminate. This means you don't know its value and it can differ depending on platform and compiler. The same is also true for compounds such as struct and union types.
Why? Sometimes you don't need to initialize a variable as you are going to pass it to a function that fills it and that doesn't care about what is in the variable. You don't want the overhead of initialization imposed upon you.
How? Primitive types can be initialized from literals or function return values.
int i = 0;
int j = foo();
Structures can be zero intialized with the aggregate intializer syntax:
struct Foo { int i; int j; }
struct Foo f = {0};
At the risk of being pedantic, the statement
DON'T use a variable that hasn't been initialized.
is incorrect. If would be better expressed:
Do not use the value of an uninitialised variable.
The language distinguishes between initialisation and assignment, so the first warning is, in that sense over cautious - you need not provide an initialiser for every variable, but you should have either assigned or initialised a variable with a useful and meaningful value before you perform any operations that subsequently use its value.
So the following is fine:
int i ; // uninitialised variable
i = some_function() ; // variable is "used" in left of assignment expression.
some_other_function( i ) ; // value of variable is used
even though when some_function() is called i is uninitialised. If you are assigning a variable, you are definitely "using" it (to store the return value in this case); the fact that it is not initialised is irrelevant because you are not using its value.
Now if you adhere to
Do not use the value of an uninitialised variable.
as I have suggested, the reason for this requirement becomes obvious - why would you take the value of a variable without knowing that it contained something meaningful? A valid question might then be "why does C not initialise auto variables with a known value. And possible answers to that would be:
Any arbitrary compiler supplied value need not be meaningful in the context of the application - or worse, it may have a meaning that is contrary to the actual state of the application.
C was deliberately designed to have no hidden overhead due to its roots as systems programming language - initialisation is only performed when explicitly coded, because it required additional machine instructions and CPU cycles to perform.
Note that static variables are always initialised to zero. .NET languages, such as C# have the concept of an null value, or a variable that contains nothing, and that can be explicitly tested and even assigned. A variable in C cannot contain nothing, but what it contains can be indeterminate, and therefore code that uses its value will behave non-deterministically.

Resources