What will happen if '&' is not put in a 'scanf' statement? - c

I went to an interview where I was asked this question:
What do you think about the following?
int i;
scanf ("%d", i);
printf ("i: %d\n", i);
I responded:
The program will compile successfully.
It will print the number incorrectly but it will run till the end
without crashing
The response that I made was wrong. I was overwhelmed.
After that they dismissed me:
The program would crash in some cases and lead to a core dump.
I could not understand why the program would crash. Could anyone explain the reason to me? Any help is appreciated.

When a variable is defined, the compiler allocates memory for that variable.
int i; // The compiler will allocate sizeof(int) bytes for i
i, as defined above, is not initialized and has an indeterminate value.
To write data to the memory location allocated for i, you need to pass the address of the variable. The statement
scanf("%d", &i);
will write an int entered by the user to the memory location allocated for i.
If & is not placed before i, then scanf will treat the value of i as an address and try to write the input there instead of at &i. Since i holds an indeterminate value, that value may happen to correspond to a valid memory address, or it may lie outside the range of addresses the program is allowed to use.
In either case the behavior is undefined: the program may behave erratically, and anything could happen.

Because it invokes undefined behavior. The scanf() family of functions expects a pointer to int when the "%d" specifier is found. You are passing an int, which scanf will interpret as the address of some int even though it is not one. The standard does not define the behavior in this case. It will compile (most likely with a warning), but it will very likely behave in an unexpected way.
In the code as written there is another problem. The variable i is never initialized, so it has an indeterminate value, and reading it is yet another source of undefined behavior.
Note that the standard doesn't say what happens when you pass one type where another was expected; it is simply undefined behavior, no matter which types you swap. This particular situation gets special consideration because pointers can be converted to integers, though the behavior is only defined if you convert back to a pointer and the integer type can store the value correctly. That, together with scanf() being variadic (its arguments after the format string are not checked against a prototype), is why it compiles, but it does not work correctly.

You passed data having the wrong type (int* is expected, but int is passed) to scanf(). This will lead to undefined behavior.
With undefined behavior, anything can happen. The program may or may not crash.
In a typical environment, I would guess the program crashes when the "address" passed to scanf() points to a location the operating system does not allow the program to write to. The OS then terminates the program when scanf() writes there, which is observed as a crash.

One thing that the other answers haven't mentioned yet is that on some platforms, sizeof (int) != sizeof (int*). If the arguments are passed in a certain way*, scanf could gobble up part of another variable, or of the return address. Changing the return address could very well lead to a security vulnerability.
* I'm no assembly language expert, so take this with a grain of salt.

I could not understand why the program would crash? Could anyone explain me the reason. Any help appreciated.
Maybe a little more applied:
int i = 123;
scanf ("%d", &i);
The first statement allocates memory for one int and writes 123 into it. For this example, let's say this memory block has the address 0x0000ffff. The second statement reads your input, and scanf writes it to memory block 0x0000ffff, because you are passing not the value of the variable i but its address.
If you use scanf ("%d", i); instead, you are writing the input to memory address 123 (because that is the value stored in the variable). Obviously that can go terribly wrong and cause a crash.

Since there is no & (ampersand) in the scanf call, as the standard requires, the program terminated abruptly as soon as I entered the value, no matter how many lines of code followed it.
I ran it and observed this in Code::Blocks.
If the same program is run with the Turbo C compiler, it executes every line, including those after scanf; the only problem is that, as expected, the value printed for i is garbage.
Conclusion: since it runs with some compilers and not with others, this is not a valid program.

Related

Char and strcpy in C

I came across part of a question for which I get output, but I need an explanation of why it works.
char arr[4];
strcpy(arr,"This is a link");
printf("%s",arr);
When I compile and execute, I get the following output.
Output:
This is a link
The short answer to why it worked (that time) is that you got lucky. Writing beyond the end of an array is undefined behavior. Undefined is just that: it could just as easily have caused a segmentation fault as produced output (though generally stack corruption is the result).
When handling character arrays in C, you are responsible for ensuring you have allocated sufficient storage. When you intend to use the array as a character string, you must allocate storage for each character plus 1 for the nul-terminating character at the end (which is the very definition of a nul-terminated string in C).
Why did it work? Generally, when you request, say, char arr[4]; the compiler only guarantees that you have 4 bytes allocated for arr. However, depending on the compiler, the alignment, etc., the compiler may actually allocate whatever it uses as a minimum allocation unit to arr. This means that while you requested only 4 bytes and are only guaranteed 4 usable bytes, the compiler may actually have set aside 8, 16, 32, 64, or 128 bytes, etc.
Or, again, you were just lucky that arr was the last allocation requested and nothing has yet requested or written to the memory starting at the fifth byte of arr.
The point is that you requested 4 bytes and are only guaranteed to have 4 bytes available. Yes, it may work in that one printf before anything else takes place in your code, but your code is wholly unreliable and you are playing Russian roulette with stack corruption (if it has not already taken place).
In C, the responsibility falls to you to ensure that your code, storage, and memory use are all well-defined and that you do not wander off into the realm of the undefined, because if you do, all bets are off, and your code isn't worth the bytes it is stored in.
How could you make your code well-defined? Appropriately limit and validate each required step in your code. For your snippet, you could use strncpy instead of strcpy and then affirmatively nul-terminate arr before calling printf, e.g.
char arr[4] = ""; /* initialize all values */
strncpy(arr,"This is a link", sizeof arr); /* limit copy to bytes available */
arr[sizeof arr - 1] = 0; /* affirmatively nul-terminate */
printf ("%s\n",arr);
Now, you can rely on the contents of arr throughout the remainder of your code.
Your code has a memory issue (buffer overrun). The function strcpy copies bytes until it reaches the null character. The function printf prints until the null character.
There is no guarantee on the behavior of this piece of code.
It's just like this: you told me "I'll pick you up at 5:00 p.m.", and when you come I will be there (guaranteed). But I can't guarantee that I'll have grabbed you a cup of coffee, because you didn't tell me you wanted one. Maybe I'm very nice and bought two cups of coffee, or maybe I'm a cheapskate and bought just one for myself.
It may work. It may not. It may fail immediately and obviously. It may fail at some arbitrary future time and in subtle ways that will drive you insane.
That is the often-insidious nature of undefined behaviour. Don't do it.
If it works at all, it's totally by accident and in no way guaranteed. It's possible that you're overwriting stuff on the stack or in other memory (depending on the implementation and how/where the actual variable arr is defined(a)) but that the memory being overwritten is not used after that point (given the simple nature of the code).
That possibility of it working accidentally in no way makes it a good idea.
For the language lawyers among us, section J.2 (instances of undefined behaviour) of C11 clearly states:
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]).
That informative section references 6.5.6, which is normative, and which states when discussing pointer/integer addition (of which a[b] is an example):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
(a) For example, on my system, declaring the variable inside main causes the program to crash because the buffer overflow trashes the return address on the stack.
However, if I put the declaration at file level (outside of main), it seems to run just fine, printing the message then exiting the program.
But I assure you that's only because the memory you've trashed is not important for the continuation of the program in this case. It will almost certainly be important in anything more substantial than this example.
Your code may appear to work as long as the printf is placed right after the strcpy, but it is still wrong.
Try the following, and it will likely not work:
int j;
char arr[4];
int i;
strcpy(arr,"This is a link");
i=0;
j=0;
printf("%s",arr);
To understand why, you need to understand the idea of the stack. All local variables are allocated on the stack. In your code, 4 bytes are allocated for arr, and when you copy a string longer than 4 bytes you overwrite and corrupt some other memory. Because you read arr right after the strcpy, the area you overwrote, which may belong to other variables, has not yet been updated by the program, and that is why your printf appears to work. In the example above, other variables that fall into the overwritten region are updated afterwards, so you will likely not get the desired output.
Your code also appears to work because the stack grows downwards; if it grew the other way, you would likely not get the desired output either.

what happens to array elements after the original array is reallocated?

#include <stdio.h>
#include <stdlib.h>
int main()
{
int *a;
a = (int *)malloc(100*sizeof(int));
int i=0;
for (i=0;i<100;i++)
{
a[i] = i+1;
printf("a[%d] = %d \n " , i,a[i]);
}
a = (int*)realloc(a,75*sizeof(int));
for (i=0;i<100;i++)
{
printf("a[%d] = %d \n " , i,a[i]);
}
free(a);
return 0;
}
In this program I expected a segmentation fault because I'm trying to access elements of an array that were freed by realloc(). But the output is pretty much the same except for the last few elements!
So my doubt is whether the memory is actually getting freed. What exactly is happening?
The way realloc works is that it guarantees that a[0]..a[74] will have the same values after the realloc as they did before it.
However, the moment you try to access a[75] after the realloc, you have undefined behaviour. This means that the program is free to behave in any way it pleases, including segfaulting, printing out the original values, printing out some random values, not printing anything at all, launching a nuclear strike, etc. There is no requirement for it to segfault.
So my doubt is whether the memory is actually getting freed?
There is absolutely no reason to think that realloc is not doing its job here.
What exactly is happening?
Most likely, the memory is getting freed by shrinking the original memory block without wiping out the now unused final 25 array elements. As a result, the undefined behaviour manifests itself by printing out the original values. It is worth noting that even the slightest change to the code, the compiler, the runtime library, the OS, etc. could make the undefined behaviour manifest itself differently.
You may get a segmentation fault, but you may not. The behaviour is undefined, which means anything can happen, but I'll attempt to explain what you might be experiencing.
There's a mapping between your virtual address space and physical pages, and that mapping is usually done in pages of at least 4096 bytes (well, there's virtual memory also, but let's ignore that for the moment).
You get a segmentation fault if you attempt to access virtual address space that doesn't map to a physical page. Your call to realloc may not have resulted in a physical page being returned to the system, so it's still mapped to your program and can be used. However, a following call to malloc could use that space, or it could be reclaimed by the system at any time. In the former case you'd possibly overwrite another variable; in the latter case you'll segfault.
Accessing an array beyond its bounds is undefined behaviour. You might encounter a runtime error. Or you might not. The memory manager may well have decided to re-use the original block of memory when you re-sized. But there's no guarantee of that. Undefined behaviour means that you cannot reason about or predict what will happen. There's no grounds for you to expect anything to happen.
Simply put, don't access beyond the end of the array.
Some other points:
The correct main declaration here is int main(void).
Casting the value returned by malloc is not needed and can mask errors. Don't do it.
Always store the return value of realloc into a separate variable so that you can detect NULL being returned and so avoid losing and leaking the original block.

scanf() does not read input string when first string of earlier defined array of strings is null

I defined an array of strings. It works fine if I define it so that the first element is not an empty string. When it's an empty string, the next scanf() for the other string stops reading the input string and the program stops executing.
I don't understand how the definition of the array of strings can affect how scanf() reads input.
char *str_arr[] = {"","abc","","","b","c","","",""}; // if first element is "abc" instead of "" then works fine
int size = sizeof(str_arr)/sizeof(str_arr[0]);
int i;
printf("give string to be found %d\n",size);
char *str;
scanf("%s",str);
printf("OK\n");
Actually, you have it wrong. The initialization of str_arr doesn't affect how scanf() works; it may look that way to you, but it isn't so. As described in the other answers, this is called undefined behavior. Undefined behavior in C is, by its nature, very loosely specified.
The C FAQ defines “undefined behavior” like this:
Anything at all can happen; the Standard imposes no requirements. The
program may fail to compile, or it may execute incorrectly (either
crashing or silently generating incorrect results), or it may
fortuitously do exactly what the programmer intended.
It basically means anything can happen. When you do it like this :
char *str;
scanf("%s",str);
It's UB. Sometimes you get results you are not supposed to get, and you think it's working. That's where debuggers come in handy. Use them almost every time, especially in the beginning. Other recommendations for your program:
Instead of scanf(), use fgets() to read strings. If you want to use scanf, use it like scanf("%ws",name); where name is a character array and w is a literal field width of at most the array size minus one, leaving room for the terminating null character.
Compile with the -Wall option to enable most common warnings; had you used it, you might have been warned that you are using str uninitialized.
Go on reading THIS ARTICLE, it has sufficient information to clear your doubts.
Declaring a pointer does not allocate a buffer for it in memory and does not initialize it, so you are trying to dereference an uninitialized pointer (str), which results in undefined behavior.
Note that scanf will cause a potential buffer overflow if not used carefully when reading strings. I recommend you read this page for some ideas on how to avoid it.
You are passing scanf a pointer that is not initialized to anything in particular, so scanf will try to write the characters provided by the user to some random memory location; whether this results in a crash or something else depends mostly on luck (and on how the compiler decides to set up the stack, which we may also regard as "luck"). Technically, that's called "undefined behavior", i.e. as far as the C standard is concerned, anything can happen.
To fix your problem, you have to pass to scanf a buffer big enough for the string you plan to receive:
char str[101];
scanf("%100s",str); /* the "100" tells scanf to read at most 100 chars; more would overflow the buffer */
printf("OK\n");
And remember that char * in C is not the equivalent of string in other languages - char * is just a pointer to char, that knows nothing about allocation.

Pointer assignment Problem

When I run the program below with the gcc compiler (www.codepad.org), I get the output
Disallowed system call: SYS_socketcall
Could anyone please explain why this error/output occurs?
int main() {
int i=8;
int *p=&i;
printf("\n%d",*p);
*++p=2;
printf("\n%d",i);
printf("\n%d",*p);
printf("\n%d",*(&i+1));
return 0;
}
What I have observed is that i becomes inaccessible after I execute *++p=2;. Why?
When you do int *p = &i, you make p point to the single integer i. ++p increments p to point to the "next" integer, but since i is not an array, there is no next integer, and dereferencing the result is undefined.
What you are observing is undefined behavior. Specifically, dereferencing p in *++p=2 is forbidden as i is not an array with at least two members. In practice, your program is most likely attempting to write to whatever memory lies at &i + 1, that is, sizeof(int) bytes past the start of i.
You are invoking undefined behaviour by writing to undefined areas on the stack. codepad.org has protection against programs that try to do disallowed things, and your undefined behaviour program appears to have triggered that.
If you try to do that on your own computer, your program will probably end up crashing in some other way (such as segmentation fault or bus error).
The expression *++p first moves the pointer p one int forward (so it no longer points to a valid object), then dereferences the resulting pointer and tries to store the number 2 there, thus writing to invalid memory.
You might have meant *p = 2 or (*p)++.
Your code accesses memory it does not own, and the results of that are undefined.
All your code, as currently written, has the right to do is read and write an area of memory of size sizeof(int) at &i, and another of size sizeof(int*) at &p.
The following lines all violate those constraints, by using memory addresses outside the range you are allowed to read or write data.
*++p=2;
printf("\n%d",*p);
printf("\n%d",*(&i+1));
The ++ operator modifies its operand, so the line *++p=2; first increments the pointer p and then assigns 2 to the location it now points to, a location on the stack that probably belongs to the call frame. Once you have messed up the call frame, all bets are off: you end up in a corrupt state.
