Does gcc initialize auto variables to 0? - c

Why am I getting 0? i is an auto variable, so it should print some garbage value, right? I am using the gcc compiler.
#include "stdio.h"
void main() {
    int i;
    printf("%d\n", i);
}

Does gcc initialize auto variables to 0?
Yes and no!
Uninitialized auto variables actually get an indeterminate value (either an unspecified value or a trap representation [1]).
Using such a variable in a program invokes undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which the C standard imposes no requirements. (C11 §3.4.3)
Once UB is invoked you may get either the expected or an unexpected result. The result may vary from run to run of the program, from compiler to compiler, between versions of the same compiler, or even with the temperature of your system!
[1] An automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it. (C11 §6.2.6, Representations of types, footnote 50)

No, I get random values with gcc (Debian 4.9.2-10) 4.9.2.
ofd#ofd-pc:~$ gcc '/home/ofd/Desktop/test.c'
ofd#ofd-pc:~$ '/home/ofd/Desktop/a.out'
-1218415715
ofd#ofd-pc:~$ '/home/ofd/Desktop/a.out'
-1218653283
ofd#ofd-pc:~$ '/home/ofd/Desktop/a.out'
-1218845795

Variables declared inside a function are uninitialized unless you give them an initializer, and you cannot predict what will show up if you print them. In your example main is a function too, so i is uninitialized; it merely happens to be zero.
When you declare a variable static or at global scope, the compiler initializes it to zero.

It has become standard security practice for freshly allocated memory to be cleared (usually to 0) before being handed over by the OS. Don't want to be handing over memory that may have contained a password or private key! So there's no guarantee what you'll get, since the compiler does not guarantee to initialize the variable either, but on modern systems it will typically be a value that's consistent across a particular OS at least.

Related

Is there a safe way to specify that the value of an object may be uninitialized because it is never used?

Disclaimer: The following is a purely academic question; I keep this code at least 100 m away from any production system. The problem posed here is something that cannot be measured in any “real life” case.
Consider the following code (godbolt link):
#include <stdlib.h>
typedef int (*func_t)(int *ptr); // functions must conform to this interface
extern int uses_the_ptr(int *ptr);
extern int doesnt_use_the_ptr(int *ptr);
int foo(void) {
    // actual selection is complex, there are multiple functions,
    // but I know `func` will point to a function that doesn't use the argument
    func_t func = doesnt_use_the_ptr;
    // Three alternative declarations; only one can be active at a time:
    int *unused_ptr_arg = NULL; // (a) I pay a zeroing (e.g. `xor reg reg`) in every compiler
    // int *unused_ptr_arg; // (b) UB, gcc zeroes (thanks for saving me from myself, gcc), clang doesn't
    // int *unused_ptr_arg __attribute__((__unused__)); // (c) Neither zeroing, nor UB, this is what I want
    return (*func)(unused_ptr_arg);
}
The compiler has no reasonable way to know that unused_ptr_arg is unneeded (and so the zeroing is wasted time), but I do, so I want to inform the compiler that unused_ptr_arg may have any value, such as whatever happens to be in the register that would be used for passing it to func.
Is there a way to do this? I know I’m way outside the standard, so I’ll be fine with compiler-specific extensions (especially for gcc & clang).
Using GCC/Clang `asm` Construct
In GCC and Clang, and other compilers that support GCC’s extended assembly syntax, you can do this:
int *unused_ptr_arg;
__asm__("" : "=X" (unused_ptr_arg));
return (*func)(unused_ptr_arg);
That __asm__ construct says “Here is some assembly code to insert into the program at this point. It writes a result to unused_ptr_arg in whatever location you choose for it.” (The X constraint means the compiler may choose memory, a processor register, or anything else the machine supports. It has to be the capital X; on x86, lowercase x would restrict the operand to an SSE register.) But the actual assembly code is empty (""). So no assembly code is generated, but the compiler believes that unused_ptr_arg has been initialized. In Clang 6.0.0 and GCC 7.3 (latest versions currently at Compiler Explorer) for x86-64, this generates a jmp with no xor.
Using Standard C
Consider this:
int *unused_ptr_arg;
(void) &unused_ptr_arg;
return (*func)(unused_ptr_arg);
The purpose of (void) &unused_ptr_arg; is to take the address of unused_ptr_arg, even though the address is not used. This disables the rule in C 2011 [N1570] 6.3.2.1 2 that says behavior is undefined if a program uses the value of an uninitialized object of automatic storage duration that could have been declared with register. Because its address is taken, it could not have been declared with register, and therefore using the value is no longer undefined behavior according to this rule.
In consequence, the object has an indeterminate value. Then there is an issue of whether pointers may have a trap representation. If pointers do not have trap representations in the C implementation being used, then no trap will occur due to merely referring to the value, as when passing it as an argument.
The result with Clang 6.0.0 at Compiler Explorer is a jmp instruction with no setting of the parameter register, even if -Wall -Werror is added to the compiler options. In contrast, if the (void) line is removed, a compiler error results.
int *unused_ptr_arg = NULL;
This is what you should be doing. You don't pay for anything. Zeroing a pointer is practically a no-op. Technically it is not, but you will never see the time of this operation in your program. I don't mean that it's so small that you won't notice it; I mean that it's so small that many other factors and operations that are orders of magnitude longer will "swallow" it.
This is not actually possible across all architectures for a very good reason.
A call to a function may need to spill its arguments to the stack, and on IA64, spilling an uninitialized register to the stack can crash because the register's previous contents may be the result of a speculative load from an address that wasn't mapped.
To avoid zeroing on every call of foo(), simply make unused_ptr_arg static.
int foo() {
    func_t func = doesnt_use_the_ptr;
    static int *unused_ptr_arg;
    return (*func)(unused_ptr_arg);
}

Why does the following code give different results when compiling with gcc and g++?

#include <stdio.h>
int main()
{
    const int a = 1;
    int *p = (int *)&a;
    (*p)++;
    printf("%d %d\n", *p, a);
    if (a == 1)
        printf("No\n");  // "No" in g++.
    else
        printf("Yes\n"); // "Yes" in gcc.
    return 0;
}
The above code gives No as output in g++ compilation and Yes in gcc compilation. Can anybody please explain the reason behind this?
Your code triggers undefined behaviour because you are modifying a const object (a). It doesn't have to produce any particular result, not even on the same platform, with the same compiler.
Although the exact mechanism for this behaviour isn't specified, you may be able to figure out what is happening in your particular case by examining the assembly produced for the code (you can get it with the -S flag). Note that compilers are allowed to make aggressive optimizations by assuming the code has well-defined behaviour. For instance, a could simply be replaced by 1 wherever it is used.
From the C++ Standard (1.9 Program execution)
4 Certain other operations are described in this International
Standard as undefined (for example, the effect of attempting to
modify a const object). [ Note: This International Standard imposes
no requirements on the behavior of programs that contain undefined
behavior. —end note ]
Thus your program has undefined behaviour.
In your code, notice the following two lines:
const int a=1; // a is of type const int
int *p=(int *)&a; // p is of type int *
You are storing the address of a const int variable in an int * and then trying to modify the value, which should have been treated as const. This is not allowed and invokes undefined behaviour.
For your reference, the C11 standard, section 6.7.3, paragraph 6 says:
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.
So, to cut a long story short, you cannot rely on the outputs for comparison. They are the result of undefined behaviour.
Okay, we have here 'identical' code passed to "the same" compiler, once with a C flag and once with a C++ flag. As far as any reasonable user is concerned nothing has changed. The code should be interpreted identically by the compiler because nothing significant has happened.
Actually, that's not true. While I would be hard pressed to point to it in a standard, the precise interpretation of 'const' differs slightly between C and C++. In C it's very much an add-on: the 'const' flag says that this normal variable 'a' should not be written to by the code round here, but there is a possibility that it will be written to elsewhere. With C++ the emphasis is much more on the immutable-constant concept, and the compiler knows that this constant is more akin to an 'enum' than a normal variable.
So I expect this slight difference means that slightly different parse trees are generated, which eventually leads to different assembler.
This sort of thing is actually fairly common: code that's in the C/C++ subset does not always compile to exactly the same assembler even with 'the same' compiler. It tends to be caused by other language features meaning that there are some things you can't prove about the code in one of the languages but can in the other.
Usually C is the performance winner (as was re-discovered by the Linux kernel devs) because it's a simpler language, but in this example C++ would probably turn out faster (unless the C dev switches to a macro or enum and catches the unreasonable act of taking the address of an immutable constant).

Simple C code snippets

I have great difficulty understanding the values printed by these two programs.
#include <stdio.h>
void a(void) {
    int a;
    a++;
    printf("%d\n", a);
}
int main(void) {
    a();
    a();
    a();
    return 0;
}
Why does this code print "1 2 3", while the second one:
#include <stdio.h>
void a(void) {
    int a;
    a++;
    printf("%d\n", a);
}
int main(void) {
    int b;
    printf("%d\n", b);
    a();
    a();
    a();
    return 0;
}
prints "0, garbage value, same garbage value + 1, same garbage value + 2"?
Shouldn't any uninitialized object in the main function (or any other function) be assigned a random (garbage) value?
UPDATE: I feel that the explanation "the variables are uninitialized so they can have any remaining values from other programs, so UB" is not sufficient. I can copy-paste the same source code 100 times and still get the printed value of 0. I am using gcc 4.4.3.
The value of an uninitialized automatic variable is indeterminate, so unless you are talking about a specific compiler on a specific machine with a specific set of flags, it is really unpredictable. Even on a very specific platform with fixed settings you still may not get reproducible results.
In some very specific situations you can make predictions; the presentation Deep C talks about this in general and covers this specific case around slide 71.
On modern systems automatic variables will often be allocated on the stack, and successive calls may get the same memory location, which is why you would then see three consecutive values. But you should not rely on this: using uninitialized variables is undefined behavior and the results are unpredictable.
The C99 draft standard tells us in section 6.7.8 Initialization, paragraph 10, that:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
and tells us in the definition of indeterminate value:
either an unspecified value or a trap representation
Update
What is undefined behavior? In the strictest sense it is behavior the C standard does not impose requirements on; it is a construct of the standard. It is defined in the draft standard in section 3.4.3:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
and has the following note:
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
At the end of the day it is a trade-off between designing an efficient language and a safe one, which is paraphrased from What Every C Programmer Should Know About Undefined Behavior #1/3.
Here are more links for better understanding undefined behavior:
Philosophy behind Undefined Behavior
Why Language Designers Tolerate Undefined Behavior
int a; In the first program you haven't initialized a, which means it contains a garbage value. Both of the snippets above could behave differently and print garbage values.
An uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have some value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.
Since all values are uninitialized they could contain anything.
So your output is unpredictable because you cannot know what value the variables have.
This is what is called undefined behavior.
Why does this code print out "1 2 3"
It invokes undefined behavior. Anything could happen. Using an uninitialized variable invokes UB.
Shouldn't any uninitialized object in the main function (or any other function) be assigned a random (garbage) value?
Yes. But that garbage value could be anything; you can't predict it.
Both are undefined behaviour. The value just happens to be 0 for the local variable in main and also 0 for the local variable in the first example. In the second example, a sits at a shifted stack location (shifted because of the local variable in main) that happens to hold a random value.
Memory pages are usually zeroed when obtained from the OS, so the garbage is most likely the stack footprint of the language runtime or some other pre-main code. That is why it seems predictable to you across repeated runs. It may be completely different on different machines, compilers, etc.

Turbo C++: Why does printf print expected values, when no variables are passed to it?

A question was asked in a multiple choice test: What will be the output of the following program:
#include <stdio.h>
int main(void)
{
int a = 10, b = 5, c = 2;
printf("%d %d %d\n");
return 0;
}
and the choices were various permutations of 10, 5, and 2. For some reason, it works in Turbo C++, which we use in college. However, it doesn't when compiled with gcc (which gives a warning when -Wall is enabled) or clang (which has -Wformat enabled and gives a warning by default) or in Visual C++. The output is, as expected, garbage values. My guess is that it has something to do with the fact that either Turbo C++ is 16-bit, and running on 32-bit Windows XP, or that TCC is terrible when it comes to standards.
The code has undefined behaviour.
In Turbo C++, it just so happens that the three variables live at the exact positions on the stack where the missing printf() argument would be. This results in the undefined behaviour manifesting itself by having the "correct" values printed.
However, you can't reasonably rely on this to be the case. Even the slightest change to your build environment (e.g. different compiler options) could break things in an arbitrarily nasty way.
The answer here is that the program could do anything -- this is undefined behavior. According to printf()'s documentation (emphasis mine):
By default, the arguments are used in the order given, where each '*' and each conversion specifier asks for the next argument (and it is an error if insufficiently many arguments are given).
If your multiple-choice test does not have a choice for "undefined behavior" then it is a flawed test. Under the influence of undefined behavior, any answer on such a multiple-choice test question is technically correct.
It is undefined behaviour, so it could be anything.
Try using
printf("%d %d %d\n", a, b, c);
Reason: Local variables are allocated on the stack, and printf in Turbo C++ sees them in the same order in which they were placed on the stack.
SUGGESTION (from comments):
Understanding why it behaves in a particular way with a particular compiler can be useful in diagnosing problems, but don't make any other use of the information.
What's actually going on is that arguments are normally passed on the call stack. Local variables are also stored on the call stack, and so printf() sees those values, in whatever order the compiler decided to store them there.
This behavior, as well as many others, is allowed under the umbrella of undefined behavior.
No, it's not related to architecture. It is related to how Turbo C++ handles the stack. Variables a, b, and c are locals and as such are allocated on the stack. printf also expects its values on the stack. Apparently, Turbo C++ does not add anything else to the stack after the locals, so printf is able to take them as parameters. Just coincidence.

What will be the value of uninitialized variable? [duplicate]

I tried the following code:
#include <stdio.h>
void main()
{
    int i;
    printf("%d", i);
}
The result was a garbage value in VC++, while in tc it was zero.
What is the correct value?
Will an uninitialized variable have the value zero by default, or will it contain a garbage value?
A follow-up on the same topic:
#include <stdio.h>
void main()
{
    int i, j, num;
    j = (num > 0 ? 0 : num * num);
    printf("\n%d", j);
}
What will be the output of the code above?
Technically, the value of an uninitialized non-static local variable is indeterminate [Ref 1].
In short, it can be anything. Accessing such an uninitialized variable leads to undefined behavior [Ref 2].
[Ref 1]
C99 section 6.7.8 Initialization:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
[Ref 2]
C90 definition of undefined behavior (C99 §3.4.3 omits the mention of indeterminately valued objects):
behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements.
Note: Emphasis mine.
Accessing an uninitialized variable is undefined behavior in both C and C++, so reading any value is possible.
It is also possible that your program crashes: once you get into undefined behavior territory, all bets are off [1].
[1] I have never seen a program crash from accessing an uninitialized variable, unless it's a pointer.
It's indeterminate. The compiler can do what it wants.
The value is indeterminate; using the variable before initialization results in undefined behavior.
It's undefined. It might be different between different compilers, different operating systems, different runs of the program, anything. It might not even be a particular value: the compiler is allowed to do whatever it likes to this code, because the effect isn't defined. It might choose to optimize away your whole program. It might even choose to replace your program with one that installs a keylogger and steals all of your online banking login details.
If you want to know the value, the only way is to set it.
As others have noted, the value can be anything.
This sometimes leads to hard-to-find bugs, e.g. because you happen to get one value in a debug build and get a different value in a release build, or the initial value that you get depends on previous program execution.
Lesson: ALWAYS initialize variables. There's a reason that C# defines values for fields and requires initialization for local variables.
