Does scanf() update constant variables? - c

It seems that scanf() updates constant variables.
As far as I know, constant variables are supposed to have fixed values.
In the following code...
#include <stdio.h>
int main () {
const int testInteger = 0;
printf("Before 'scanf', the value of the variable 'testInteger' is %d.\n", testInteger);
// Does 'scanf' update constant variables?
printf("Enter an integer: ");
scanf("%d", &testInteger);
printf("After 'scanf', the value of the variable 'testInteger' is %d.\n", testInteger);
return 0;
}
...by entering the value of '50', it gives the following output:
Before 'scanf', the value of the variable 'testInteger' is 0.
Enter an integer: 50
After 'scanf', the value of the variable 'testInteger' is 50.
I would like to know why scanf() updates the value of the 'testInteger' constant variable.

Yes, it's supposed to have a constant value, but const doesn't magically guarantee it.
You took the address of this variable (of type const int *), which is nothing more than a number pointing to some memory. Then, by passing it to scanf, you violated the contract of const-ness: scanf used it as a regular int * pointer. (Didn't you get any warnings? Do you compile with -Wall?)

In practice, C implementations rely on the hardware to enforce the read-only-ness of const variables; the language itself only says that modifying one is undefined behavior. Most computer hardware can only enforce read-only-ness of large blocks of memory (known as "pages"), not individual int-sized variables.
You declared testInteger as a const int, but because it is otherwise an ordinary local variable -- it has "automatic storage duration", in the language of the C standard -- storage for it is allocated in a location (known as "the stack") whose large blocks have to remain writable for normal operation of the program, so the hardware cannot enforce its constness.
If you change the declaration of testInteger to static const int testInteger, or if you declare it as a global, then on typical platforms it will be allocated in a special region of memory reserved for const variables (the "read-only data segment") whose blocks of memory are unwritable, and your program will most likely crash inside the guts of scanf when it tries to write to testInteger.
It is almost always the Right Thing to declare const data objects as globals / with static storage duration anyway, so this hardware limitation is not a big deal in practice. (Well, actually, it is a big deal, but for completely unrelated reasons.)

Attempting to write a const object is undefined behavior (UB). Might work, might not, might crash, might ...
const int testInteger = 0;
scanf("%d", &testInteger); // UB

There are a couple of issues here.
First, const only means "this thing may not be the target of an assignment operator", not "this thing must be stored in read-only memory." The compiler is only required to issue a diagnostic if a const-qualified variable is the target of an assignment operator.
Secondly, the %d conversion specifier for scanf expects an argument of type int *, but we're passing an argument of type const int *. However, this isn't a constraint violation the way assignment is, so no diagnostic is required. Since we're passing the wrong type of argument to scanf for that conversion specifier, the behavior is undefined, which means that neither the compiler nor the run-time environment is required to handle the situation in any particular way. In this case, your implementation updated the variable. In an implementation where testInteger was stored in read-only memory, you might get a segfault or other runtime error, or the variable might simply not be updated.

VC doesn't warn you; you have to add -Wall (I'm using either Dev C or Geany). It's not a compiler error, though; it's just undesirable output.

I believe there are OSes and compilers that can store const variables in read-only memory, but I also believe that most implementations don't do that. It's more a way to tell the compiler "I want you to warn me if I try to change this variable by mistake", but the compiler doesn't go so far as to check pointers to it, only direct references.

Related

C variable initialization and execution [duplicate]

In C, we know that a variable that has not been initialized holds a garbage value. Yet in online compilers, and also in an IDE, when I tried this program it compiled and produced perfect output. When I tried to print the same variable without the while loop, it printed a garbage value. So, is it fine not to initialize?
#include <stdio.h>
int main() {
int j;
while(j<=10){
printf("\n %d",j);
j=j+1;
}
}
Try looking at the generated assembly, for example with "gcc -S test.c"; you will see that there is no instruction dedicated to initializing the integer "j", with or without the loop.
… it holds a garbage value.
When anybody says an object has a garbage value, they are being imprecise with language.
There is no value that is a garbage value. For 32-bit two’s complement integers, there are the values −2,147,483,648 to +2,147,483,647. Each of them is a valid value. None of them is a garbage value.
What it actually means to say something has a garbage value, if the speaker understands C semantics, is that the value of the object is uncontrolled. It has not been set to any specific value, and therefore whatever value you get from using it is a happenstance of circumstances. It may be some value that was in the memory of the object before it was reserved to be the memory for that object.
However, it might be other things. When an object is uninitialized, the C standard not only has no requirement that the memory of the object have any particular value, it has no requirement that the object behave as if it had any fixed value at all. This frees the compiler for purposes that are useful for optimization in other situations. But it means that, if you have int x; printf("%d\n", x); int y = x+3; printf("%d\n", y);, the compiler does not have to load x from memory each time it is used. Because x is uninitialized, the compiler is not required to do any work to load it from memory. For the printf("%d\n", x);, the compiler might let x be whatever value is in the register that would be used to pass the second argument to printf. For the int y = x+3;, the compiler might let x be whatever is in some other register that it would use to hold the value of x if x were defined. This could be a different register. So printf("%d\n", x); might print “47” while int y = x+3; printf("%d\n", y); prints “−372”, not “50”.
Sometimes an uninitialized object might behave as if it started with the value zero. This is not uncommon in short programs such as the one in the question, where nothing has used the stack much yet, so the part of it reserved for j is still in the initial state the program loader put it in, filled with zeros. But that is happenstance. When you change the program, the compiler might use a different part of the stack for j, and that part might not have zeros in it, because it was used by some of the initial program start-up code that runs before main starts.
Always avoid using uninitialized variables, because reading them causes undefined behavior.
Uninitialized variables
Unlike some programming languages, C/C++ does not initialize most variables to a given value (such as zero) automatically. Thus when a variable is assigned a memory location by the compiler, the default value of that variable is whatever (garbage) value happens to already be in that memory location! A variable that has not been given a known value (usually through initialization or assignment) is called an uninitialized variable.
For your reference:
https://wiki.sei.cmu.edu/confluence/display/c/EXP33-C.+Do+not+read+uninitialized+memory
https://www.learncpp.com/cpp-tutorial/uninitialized-variables-and-undefined-behavior/

What will happen if '&' is not put in a 'scanf' statement?

I had gone to an interview in which I was asked the question:
What do you think about the following?
int i;
scanf ("%d", i);
printf ("i: %d\n", i);
I responded:
The program will compile successfully.
It will print the number incorrectly but it will run till the end
without crashing
The response that I made was wrong. I was overwhelmed.
After that they dismissed me:
The program would crash in some cases and lead to an core dump.
I could not understand why the program would crash? Could anyone explain me the reason? Any help appreciated.
When a variable is defined, the compiler allocates memory for that variable.
int i; // The compiler will allocate sizeof(int) bytes for i
The i defined above is not initialized and has an indeterminate value.
To write data to that memory location allocated for i, you need to specify the address of the variable. The statement
scanf("%d", &i);
will write the int entered by the user to the memory location allocated for i.
If & is not placed before i, then scanf will treat the value of i as an address and try to write the input there instead of to &i. Since i contains an indeterminate value, that value may happen to look like a valid memory address, or it may lie outside the range of valid addresses.
In either case the behavior is undefined and the program may behave erratically. Anything could happen.
Because it invokes undefined behavior. The scanf() family of functions expects a pointer to an int when the "%d" specifier is found. You are passing an integer, which may be interpreted as the address of some integer, but it is not one. The standard does not define the behavior in this case. It will indeed compile (probably with a warning), but it will almost certainly work in an unexpected way.
In the code as is, there is yet another problem. The i variable is never initialized so it will have an indeterminate value, yet another reason for Undefined Behavior.
Note that the standard doesn't say anything about what happens when you pass a given type when some other type was expected, it's simply undefined behavior no matter what types you swap. But this particular situation falls under a special consideration because pointers can be converted to integers, though the behavior is only defined if you convert back to a pointer and if the integer type is capable of storing the value correctly. This is why it compiles, but it surely does not work correctly.
You passed data having the wrong type (int* is expected, but int is passed) to scanf(). This will lead to undefined behavior.
Anything can happen with undefined behavior. The program may or may not crash.
In a typical environment, I would guess the program crashes when the "address" passed to scanf() points to a location the operating system does not allow the program to write to: the write makes the OS terminate the program, which is observed as a crash.
One thing that the other answers haven't mentioned yet is that on some platforms, sizeof (int) != sizeof (int*). If the arguments are passed in a certain way*, scanf could gobble up part of another variable, or of the return address. Changing the return address could very well lead to a security vulnerability.
* I'm no assembly language expert, so take this with a grain of salt.
I could not understand why the program would crash? Could anyone explain me the reason. Any help appreciated.
Maybe a little more applied:
int i = 123;
scanf ("%d", &i);
With the first command you allocate memory for one integer value and write 123 into this memory block. For this example, let's say this memory block has the address 0x0000ffff. With the second command you read your input, and scanf writes the input to memory block 0x0000ffff, because you are passing not the value of the variable i but its address.
If you use the command scanf ("%d", i); instead you are writing the input to the memory address 123 (because that's the value stored inside this variable). Obviously that can go terribly wrong and cause a crash.
Since there is no & (ampersand) in the scanf call, as the standard requires, the program terminated abruptly as soon as I entered the value, no matter how many lines of code followed.
-->> I ran it and observed this in Code::Blocks.
If we run the same program with the Turbo C compiler, it runs all the lines, even those after scanf; the only difference is that the value of i printed is garbage.
Conclusion: since it runs with some compilers and not with others, this is not a valid program.

Impact of the type qualifiers on storage locations

As mentioned in the title, I am a little confused about whether type qualifiers affect the storage location (stack, bss, etc.) of the declared object. To describe this further, I am considering the following declarations.
int main()
{
const int value=5;
const char *str= "Constant String";
}
In the above code, the default storage-class-specifier is auto.
Hence it is assumed that these constants will be allocated in the stack-frame of main when it is created.
Generally, the pointers to various memory locations in stack have the freedom to modify the values contained in it.
Hence, from the above points, I understand that either the type qualifier adds some logic to preserve the constant nature of the stored element (if so, what is it?) or the constants are stored in a read-only portion of memory. Please elaborate on this.
More detailed example
#include <stdio.h>
int main(void)
{
int val=5;
int *ptr=&val;
const int *cptr=ptr;
*ptr=10; //Allowed
//*cptr=10; Not allowed
//Both ptr and cptr are pointing to same locations. But why the following error?
//"assignment of read-only location ‘*cptr’"
printf("ptr: %08X\n",ptr);
printf("cptr: %08X\n",cptr);
printf("Value: %d\n",*ptr);
}
In the above example, both cptr and ptr point to the same location, but cptr is a pointer to a const-qualified integer. When I try to modify the value through cptr, the compiler reports the error "assignment of read-only location ‘*cptr’". But I am able to modify the same location through ptr, as in the output below. Please explain.
ptr: BFF912D8
cptr: BFF912D8
Value: 10
In your first example:
int main()
{
const int value=5;
const char *str= "Constant String";
}
Where the variable value and the string literal are stored is left to the implementation. The C standard does not guarantee that they are placed in read-only memory; it only says that attempting to modify them is undefined behaviour. They may end up in the text segment, on the stack, or anywhere else.
In your second case:
int *ptr=&val;
const int *cptr=ptr;
When you try to modify *cptr, the compiler doesn't care whether the location cptr actually points to is read-only or writable. All it cares about is the const type qualifier, which tells it to treat the location pointed to by cptr as read-only.
Another variant:
const int i = 5;
int *p = (int *)&i; /* the cast discards const */
*p = 99; /* undefined behaviour */
In this case, the compiler allows modifying a const object through a pointer, but doing so is undefined behaviour.
I'm not going into the details of your example, but would like to make some general remarks:
C language semantics are enforced by compiler and hardware only up to a certain degree: it's the responsibility of the programmer to avoid undefined behaviour.
Case in point, it's very possible to modify a const-qualified variable of automatic storage duration (stack allocation) by casting away constness from a pointer without getting a segfault (hardware enforcement) or a compiler error (the cast tells it to shut up because you know what you're doing).
Language constraints however will be violated, and code will break in practice because the optimizer will make assumptions which no longer hold.
Then, there's the misconception that the type of the expression used to access the object is relevant to the definedness of an operation - it is not.
The effective typing rules (C99 6.5 §6) essentially make C a strongly typed language with very unsound type system. The type information (among it mutability) is carried by the object (storage location) itself, irrespective of how the location is accessed.
This makes storing into a const-qualified storage location illegal (undefined behaviour) but technically possible (unsound type system). Arbitrary type punning through pointers falls in the same category of operations which violate language semantics, but aren't enforced and thus can lead to strange bugs, eg under the assumption of strict aliasing.

Initializing variables in C

I know that sometimes if you don't initialize an int, you will get a random number if you print the integer.
But initializing everything to zero seems kind of silly.
I ask because I'm commenting up my C project and I'm pretty straight on the indenting and it compiles fully (90/90 thank you Stackoverflow) but I want to get 10/10 on the style points.
So, the question: when is it appropriate to initialize, and when should you just declare a variable:
int a = 0;
vs.
int a;
There are several circumstances where you should not initialize a variable:
When it has static storage duration (static keyword or global var) and you want the initial value to be zero. Some compilers will actually store zeros in the binary if you explicitly initialize, which is usually just a waste of space (possibly a huge waste for large arrays).
When you will be immediately passing the address of the variable to another function that fills its value. Here, initializing is just a waste of time and may be confusing to readers of the code who wonder why you're storing something in a variable that's about to be overwritten.
When a meaningful value for the variable can't be determined until subsequent code has completed execution. In this case, it's actively harmful to initialize the variable with a dummy value such as zero/NULL, as this prevents the compiler from warning you if you have some code paths where a meaningful value is never assigned. Compilers are good at warning you about accessing uninitialized variables, but can't warn you about "still contains dummy value" variables.
Aside from these issues, I'd say it's generally good practice to initialize your non-static variables when possible.
A rule that hasn't been mentioned yet is this: when the variable is declared inside a function it is not initialised, and when it is declared in static or global scope it's set to 0:
int a; // is set to 0
void foo() {
int b; // set to whatever happens to be in memory there
}
However - for readability I would usually initialise everything at declaration time.
If you're interested in learning this sort of thing in detail, I'd recommend this presentation and this book
I can think of a couple of reasons off the top of my head:
When you're going to be initializing it later on in your code.
int x;
if(condition)
{
func();
x = 2;
}
else
{
x = 3;
}
anotherFunc(x); // x will have been set a value no matter what
When you need some memory to store a value set by a function or another piece of code:
int x; // Would be pointless to give x a value here
scanf("%d", &x);
If the variable is in the scope of a function and not a member of a class, I always initialize it because otherwise you will get warnings. Even if the variable will only be used later, I prefer to assign it at declaration.
As for member variables, you should initialize them in the constructor of your class.
For pointers, always initialize them to some default, particularly NULL, even if they are to be used later, they are dangerous when uninitialized.
Also it is recommended to build your code with the highest level of warnings that your compiler supports, it helps to identify bad practices and potential errors.
Static and global variables will be initialized to zero for you so you may skip initialization. Automatic variables (e.g. non-static variables defined in function body) may contain garbage and should probably always be initialized.
If there is a non-zero specific value you need at initialization then you should always initialize explicitly.
It's always good practice to initialize your variables, but sometimes it's not strictly necessary. Consider the following:
int a;
for (a = 0; a < 10; a++) { } // a is initialized later
or
void myfunc(int *num) {
*num = 10; // write through the pointer
}
int a;
myfunc(&a); // myfunc sets, but does not read, the value in a
or
char a;
cin >> a; // perhaps the most common example in code of where
// initialization isn't strictly necessary
These are just a couple of examples where it isn't strictly necessary to initialize a variable, since it's set later (but not accessed between declaration and initialization).
In general though, it doesn't hurt to always initialize your variables at declaration (and indeed, this is probably best practice).
In general, there's no need to initialize a variable, with 2 notable exceptions:
You're declaring a pointer (and not assigning it immediately) - you should always set these to NULL as good style and defensive programming.
If, when you declare the variable, you already know what value is going to be assigned to it. Further assignments use up more CPU cycles.
Beyond that, it's about getting the variables into the right state that you want them in for the operation you're going to perform. If you're not going to be reading them before an operation changes their value (and the operation doesn't care what state it is in), there's no need to initialize them.
Personally, I always like to initialize them anyway; if you forgot to assign it a value, and it's passed into a function by mistake (like a remaining buffer length) 0 is usually cleanly handled - 32532556 wouldn't be.
There is absolutely no reason why variables shouldn't be initialised; the compiler is clever enough to ignore the first assignment if a variable is being assigned twice. It is easy for code to grow in size to the point where things you took for granted (such as a variable being assigned before it is used) are no longer true. Consider:
int MyVariable;
void Simplistic(int aArg){
MyVariable=aArg;
}
//Months later:
int MyVariable;
void Simplistic(int aArg){
MyVariable+=aArg; // Unsafe, since MyVariable was never initialized.
}
One is fine, the other lands you in a heap of trouble. Occasionally you'll have issues where your application will run in debug mode, but release mode will throw an exception, one reason for this is using an uninitialised variable.
As long as I have not read from a variable before writing to it, I have not had to bother with initializing it.
Reading before writing can cause serious and hard to catch bugs. I think this class of bugs is notorious enough to gain a mention in the popular SICP lecture videos.
Initializing a variable, even if it is not strictly required, is ALWAYS a good practice. The few extra characters (like "= 0") typed during development may save hours of debugging time later, particularly when it is forgotten that some variables remained uninitialized.
In passing, I feel it is good to declare a variable close to its use.
The following is bad:
int a; // line 30
...
a = 0; // line 40
The following is good:
int a = 0; // line 40
Also, if the variable is to be overwritten right after initialization, like
int a = 0;
a = foo();
it is better to write it as
int a = foo();

Resources