#include <stdio.h>

int main(void)
{
    int i, j, k;
    i = 1; j = 2; k = 3;
    int *p = &k;
    *(p - 1) = 0;   /* out-of-bounds write: this is the undefined behavior discussed below */
    printf("%d%d%d", i, j, k);
    getch();        /* non-standard; needs <conio.h> on the compilers that provide it */
}
The output is 1 2 3.
Your program exhibits undefined behavior; the pointer arithmetic you're doing is invalid.
You can only do pointer arithmetic on pointers that actually point into an array, and the result of the addition or subtraction must still point inside the array (or one past its end, if you don't intend to dereference it).
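A minimal sketch of what the rule does and does not allow (the names a, p, and q here are just for illustration):

#include <stdio.h>

int main(void)
{
    int a[3] = { 1, 2, 3 };
    int *p = &a[1];

    printf("%d\n", *(p - 1));   /* valid: p - 1 points at a[0] */
    printf("%d\n", *(p + 1));   /* valid: p + 1 points at a[2] */

    int *end = a + 3;           /* valid: one past the end of a... */
    (void)end;                  /* ...but dereferencing it would be UB */

    int k = 3;
    int *q = &k;
    /* *(q - 1) = 0;   UB: q - 1 does not point into (or one past) any array */
    (void)q;

    return 0;
}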
So anything could happen; the compiler may generate whatever code it likes for the invalid construct.
You are not allowed to dereference p-1 after assigning p the address &k; p-1 is an invalid pointer for you, and the behavior of using it is undefined.
A run-time error occurs only if your stray pointer happens to hit something that raises one, such as protected memory, or a location whose value will later be used as a divisor (0) in some calculation, for example.
I have a question about this code below:
#include <stdio.h>

char abcd(char array[]);

int main(void)
{
    char array[4] = { 'a', 'b', 'c', 'd' };
    printf("%c\n", abcd(array));
    return 0;
}

char abcd(char array[])
{
    char *p = array;
    while (*p) {
        putchar(*p);
        p++;
    }
    putchar(*p);
    putchar(p[4]);
    return *p;
}
Why isn't a segmentation fault generated when this program comes across putchar(*p) right after exiting the while loop? I thought that after *p went beyond array[3] there would be no values assigned to the following memory locations, so that, for example, trying to access p[4] would be illegal because it is out of bounds. On the contrary, this program runs with no errors. Is this because any memory to which no value has been assigned (in this case anything beyond array[4]) is supposed to be null, i.e. to hold the value '\0'?
The OP seems to think that, on accessing an array out of bounds, something special should happen.
Accessing outside array bounds is undefined behavior (UB). Anything may happen.
Let's clarify what undefined behavior is.
The C standard is a contract between the developer and the compiler as to what the code means. However, it just so happens that you can write things that are just outside what is defined by the standard.
One of the most common cases is trying to do out-of-bounds access. Other languages say that this should result in an exception or another error. C does not. An argument is that it would imply adding costly checks at every array access.
The compiler does not know that what you are writing is undefined behavior¹. Instead, it assumes that what you write contains no undefined behavior, and translates your code to assembly accordingly.
If you want an example, compile the code below with or without optimizations:
#include <stdio.h>

int table[4] = {0, 0, 0, 0};

int exists_in_table(int v)
{
    for (int i = 0; i <= 4; i++) {   /* off-by-one: table[4] is out of bounds */
        if (table[i] == v) {
            return 1;
        }
    }
    return 0;
}

int main(void) {
    printf("%d\n", exists_in_table(3));
}
Without optimizations, the assembly I get from gcc does what you might expect: it just goes too far in the memory, which might cause a segfault if the array is allocated right before a page boundary.
With optimizations, however, the compiler looks at your code and notices that the loop cannot exit normally (otherwise it would access table[4], which is out of bounds), so the function exists_in_table necessarily returns 1. And we get the following, valid, implementation:
exists_in_table(int):
        mov     eax, 1
        ret
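For comparison, here is a sketch of a corrected loop (using the same table array declared above); with the bound fixed, every access is in bounds and the compiler can no longer prove the function always returns 1:

int exists_in_table_fixed(int v)
{
    for (int i = 0; i < 4; i++) {   /* i stops at 3, the last valid index */
        if (table[i] == v) {
            return 1;
        }
    }
    return 0;
}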
Undefined behavior means exactly that: undefined. Instances of it are very tricky to detect, since they can be virtually invisible after compiling. You need an advanced static analyzer that interprets the C source code and understands whether what it does can be undefined behavior.
¹ in the general case, that is; modern compilers use some basic static analysis to detect the most common errors
C does no bounds checking on array accesses; because of how arrays and array subscripting are implemented, it can't do any bounds checking. It simply doesn't know that you've run past the end of the array. The operating environment will throw a runtime error if you cross a page boundary, but up until that point you can read or clobber any memory following the end of the array.
The behavior on subscripting past the end of the array is undefined - the language definition does not require the compiler or the operating environment to handle it any particular way. You may get a segfault, you may get corrupted data, you may clobber a frame pointer or return instruction address and put your code in a bad state, or it may work exactly as expected.
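Since the language will not check bounds for you, the usual remedy is to carry the length alongside the array and check it yourself. A minimal sketch (checked_get is a hypothetical helper written for this answer, not a standard function):

#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the element in *out
   if idx is in range, 0 otherwise. */
int checked_get(const char *arr, size_t len, size_t idx, char *out)
{
    if (idx >= len)
        return 0;          /* caller must handle the failure */
    *out = arr[idx];
    return 1;
}

int main(void)
{
    char array[4] = { 'a', 'b', 'c', 'd' };
    char c;

    if (checked_get(array, sizeof array, 4, &c))   /* index 4 is rejected */
        putchar(c);
    else
        puts("out of bounds");
    return 0;
}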
There are a few points worth noting about your program:

array inside main and array inside the abcd function are different things. In main, it is an array of 4 elements; in abcd, it is a parameter of array type, which is really a pointer. If you wrote something like array[4] inside main, the compiler would warn about it, but there is no such warning inside abcd.

p is a pointer to array, or in other words it points to the first element of array. In C there is no boundary check on p. Your program is lucky: the memory right after array happens to contain a 0 that stops the while (*p) loop. If you check the value of p after the loop, it may well not equal &array[4], because the loop stops only at the first zero byte, wherever that is.
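A sketch of how abcd could be made safe, assuming the caller simply passes the element count instead of relying on a terminator the array does not contain:

#include <stdio.h>

char abcd(const char array[], size_t n)
{
    for (size_t i = 0; i < n; i++)   /* never runs past the last element */
        putchar(array[i]);
    return array[n - 1];             /* last valid element; assumes n > 0 */
}

int main(void)
{
    char array[4] = { 'a', 'b', 'c', 'd' };
    printf("%c\n", abcd(array, sizeof array));
    return 0;
}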
In C, inside a function, if we declare a variable and don't initialise it, it holds an indeterminate (garbage) value.
Whereas in Java, you are not allowed to use a local variable in a method without initialising it.
But when the code below is compiled and run on online C compilers, I don't know why, instead of printing garbage values, it prints "123" (without quotes).
#include <stdio.h>

void f(void);

int main(void)
{
    f();
    f();
    f();
}

void f(void)
{
    int i;              /* uninitialized: holds an indeterminate value */
    ++i;
    printf("%d", i);
}
"I do not know why instead of generating garbage values, it is printing "123"."
When any expression in the program invokes undefined behavior, as incrementing and printing the indeterminate value of i does, the output does not have to be wrong, but there is never a guarantee that it will be correct. That is a reason never to rely on such code, no matter whether it prints the right values in one specific case.
Thus, you do not need to smash your head searching for a reason for that behavior.
Because of common implementations of C, an uninitialized value near the start of a C program is likely to be 0, so your subsequent ++i operations change it to 1, 2, and 3 (each call to f() reuses the same stack slot, so the value left by the previous call is still there).
However, take good note of this:
Just because it is likely to be zero does not guarantee it will be zero.
This is undefined behavior, and the values could correctly come out to be anything.
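For contrast, a version with an explicit initializer has fully defined behavior; a minimal sketch:

#include <stdio.h>

void f(void)
{
    int i = 0;          /* defined: i starts at 0 on every call */
    ++i;
    printf("%d", i);    /* always prints 1 */
}

int main(void)
{
    f();
    f();
    f();                /* guaranteed output: 111 */
    return 0;
}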
Somewhere on the forums I encountered this:
Any attempt to evaluate an uninitialized pointer variable
invokes undefined behavior. For example:
int *ptr; /* uninitialized */
if (ptr == NULL) ...; /* undefined behavior */
What is meant here?
Is it meant that if I ONLY write:
if (ptr == NULL) { int t; };
this statement is already UB?
Why? I am not dereferencing the pointer, right?
(I noticed there maybe terminology issue, by UB in this case, I referred to: will my code crash JUST due to the if check?)
Using uninitialized variables invokes undefined behavior; it doesn't matter whether the variable is a pointer or not.
int i;
int j = 7 * i;
is undefined as well. Note that "undefined" means that anything can happen, including a possibility that it will work as expected.
In your case:
int *ptr;
if (ptr == NULL) { int i = 0; /* this line does nothing at all */ }
ptr might contain anything; it can be some random trash, but it can be NULL too. This code will most likely not crash, since you are just comparing the value of ptr to NULL. But we don't know whether execution enters the condition's body or not, and we can't even be sure the value can be read successfully; therefore, the behavior is undefined.
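The fix is simply to give the pointer a definite value before comparing it; a minimal sketch:

#include <stdio.h>

int main(void)
{
    int *ptr = NULL;    /* initialized: no indeterminate value involved */
    if (ptr == NULL)    /* well-defined comparison, guaranteed true here */
        puts("ptr is NULL");
    return 0;
}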
Your pointer is not initialized. Your statement would be the same as:
int a;
if (a == 3) { int t; }
Since a is not initialized, its value can be anything, so you have undefined behavior. It doesn't matter whether you dereference your pointer or not. If you did dereference it, you might well get a segfault.
The C99 draft standard says it is undefined clearly in Annex J.2 Undefined behavior:
The value of an object with automatic storage duration is used while it is
indeterminate (6.2.4, 6.7.8, 6.8).
and the normative text has an example that also says the same thing in section 6.5.2.5 Compound literals paragraph 17 which says:
Note that if an iteration statement were used instead of an explicit goto and a labeled statement, the lifetime of the unnamed object would be the body of the loop only, and on entry next time around p would have an indeterminate value, which would result in undefined behavior.
and the draft standard defines undefined behavior as:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
and notes that:
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
As Shafik has pointed out, the C99 standard draft declares any use of uninitialized variables with automatic storage duration undefined behaviour. That amazes me, but that's how it is. My rationale for the pointer case comes below, but similar reasons must hold for other types as well.
After int *pi; if (pi == NULL){} your program is allowed to do arbitrary things. In reality, on PCs, nothing will happen. But there are architectures out there which have illegal address values, much like NaN floats, and loading such a value into an address register causes a hardware trap. These architectures, unheard of by us modern PC users, are the reason for this provision. Cf. e.g. How does a hardware trap in a three-past-the-end pointer happen even if the pointer is never dereferenced?.
The behavior of this is undefined because of how the stack is used for various function calls. When a function is called the stack grows to make space for variables within the scope of that function, but this memory space is not cleared or zeroed out.
This can be shown to be unpredictable in code like the following:
#include <stdio.h>

void test(void)
{
    int *ptr;                           /* uninitialized */
    printf("ptr is %p\n", (void *)ptr);
}

void another_test(void)
{
    test();
}

int main(void)
{
    test();
    test();
    another_test();
    test();
    return 0;
}
This simply calls the test() function multiple times, and each call prints whatever leftover value happens to occupy ptr's stack slot. You might expect to get the same result each time, but as the stack is manipulated, the location where ptr lives changes, and the data found there is unknown in advance.
On my machine running this program results in this output:
ptr is 0x400490
ptr is 0x400490
ptr is 0x400575
ptr is 0x400585
To explore this a bit more, consider the possible security implications of using pointers that you have not intentionally set yourself:
#include <stdio.h>

void test(void)
{
    int *ptr;                           /* uninitialized */
    printf("ptr is %p\n", (void *)ptr);
}

void something_different(void)
{
    int *not_ptr_or_is_it = (int *)0xdeadbeef;   /* value left behind on the stack */
}

int main(void)
{
    test();
    test();
    something_different();
    test();
    return 0;
}
This results in something that is undefined even though it is predictable here. It is undefined because on some machines this will work the same way and on others it might not work at all; it's part of the magic that happens when your C code is converted to machine code:
ptr is 0x400490
ptr is 0x400490
ptr is 0xdeadbeef
Some implementations may be designed in such a way that an attempted rvalue conversion of an invalid pointer may cause arbitrary behavior. Other implementations are designed in such a way that an attempt to compare any pointer object with null will never do anything other than yield 0 or 1.
Most implementations target hardware where pointer comparisons simply compare bits without regard for whether those bits represent valid pointers. The authors of many such implementations have historically considered it so obvious that a pointer comparison on such hardware should never have any side-effect other than to report that pointers are equal or report that they are unequal that they seldom bothered to explicitly document such behavior.
Unfortunately, it has become fashionable for implementations to aggressively "optimize" Undefined Behavior by identifying inputs that would cause a program to invoke UB, assuming such inputs cannot occur, and then eliminating any code that would be irrelevant if such inputs were never received. The "modern" viewpoint is that because the authors of the Standard refrained from requiring side-effect-free comparisons on implementations where such a requirement would
impose significant expense, there's no reason compilers for any platform should guarantee them.
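To make that concrete, here is a sketch of a common pattern (not from this question) where such optimization bites: because *p is read unconditionally, the compiler may assume p is non-null and delete the later check.

#include <stdio.h>

int read_with_check(int *p)
{
    int x = *p;         /* UB if p is NULL, so the compiler may assume p != NULL... */
    if (p == NULL)      /* ...and remove this branch entirely */
        return -1;
    return x;
}

int main(void)
{
    int v = 42;
    printf("%d\n", read_with_check(&v));
    return 0;
}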
You're not dereferencing the pointer, so you don't end up with a segfault. It will not crash. I don't understand why anyone thinks that comparing two numbers will crash. It's nonsense. So again:
IT WILL NOT CRASH. PERIOD.
But it's still UB. You don't know what memory address the pointer contains. It may or may not be NULL. So your condition if (ptr == NULL) may or may not evaluate to true.
Back to my IT WILL NOT CRASH statement. I've just tested the pointer going from 0 to 0xFFFFFFFF on the 32-bit x86 and ARMv6 platforms. It did not crash.
I've also tested the 0..0xFFFFFFFF and 0xFFFFFFFF00000000..0xFFFFFFFFFFFFFFFF ranges on an amd64 platform. Checking the full range would take a few thousand years, I guess.
Again, it did not crash.
I challenge the commenters and downvoters to show a platform and value where it crashes. Until then, I'll probably be able to survive a few negative points.
There is also an SO link to trap representation, which also indicates that it will not crash.
With MinGW 4.6.2 (4.7.x does not seem to be the "latest" on sourceforge, so this one got installed)
void test(int *in)
{
    *in = 0;
}

int main()
{
    int dat;
    test(dat);   /* passes dat's value where a pointer is expected */
    return dat;
}
As you are probably aware, this will give a warning in a C project.
dirpath\fileName.c|8|warning: passing argument 1 of 'test' makes pointer from integer without a cast [enabled by default]
And two errors in a C++ project.
dirpath\fileName.cpp|8|error: invalid conversion from 'int' to 'int*' [-fpermissive]|
dirpath\fileName.cpp|1|error: initializing argument 1 of 'void test(int*)' [-fpermissive]|
My question is: what exactly happens (in memory) during the two following scenarios? Assume -fpermissive is enabled, or that the code is compiled as a C program.
dat is uninitialized and the program proceeds (and no segmentation fault occurs).
dat is initialized to 42, and the program proceeds (and does seg-fault).
Why does leaving dat uninitialized lead to no seg-fault (perhaps by chance?) while case 2 causes a seg-fault (perhaps attempting to assign a value to a memory location)?
Curiosity: what does the f stand for in -fpermissive, flag perhaps? (seems redundant)
The program has undefined behavior as-is, so it's pointless to try to reason its behavior, but anyway...
The test() function expects a pointer to int. That pointer will be dereferenced and used to set the int it points to. However, you don't pass it a pointer to int but an uninitialized int - so it will try to interpret whatever garbage value is in that variable as a memory address and then access the object behind it - and boom.
If you wanted to call the function correctly, you would need to write
test(&dat);
instead.
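Putting it together, a corrected version of the whole program might look like this (a sketch; the initializer 42 is kept from the question):

#include <stdio.h>

void test(int *in)
{
    *in = 0;
}

int main(void)
{
    int dat = 42;
    test(&dat);     /* pass the address, not the value */
    return dat;     /* well-defined: returns 0 */
}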
what does the f stand for in -fpermissive, flag perhaps?
No, as far as I know, it stands for "feature". (But in the case of -fpermissive, I'd say it stands for "your code is f..ked if you use this flag"...)
As the warning says, passing argument 1 of 'test' makes a pointer from an integer: you are trying to access something at the address given by the value of the passed integer, and that value may be anything.
When you pass the value 42, the compiler is forced to access memory at address 42, which is not reserved for user code, so you get a segfault. When dat is left uninitialized, it holds some arbitrary value that then becomes the address, and you are simply lucky that this particular value does not produce a segfault.
In C, arguments are passed by value by default.
void test(int *in)
{
    *in = 0;
}

test(dat);   /* passing the value of dat, not its address */
Here you are passing dat, which is uninitialized, so a garbage value is used. You are making that garbage value act as a memory address inside the test function. That is undefined behaviour. Instead you can try this:
test(&dat);
Coming to your questions.
Q. dat is uninitialized and the program proceeds (and no segmentation fault occurs).
A. This is undefined behaviour, because you are passing a garbage value. If that garbage value happens to be a valid memory address, it will not cause a segmentation fault; if it is not, it will. So the outcome is decided dynamically at run time, and the program can either segfault or run.
Q. dat is initialized to 42, and the program proceeds (and does seg-fault).
A. Here you have initialized dat to 42. C passes arguments by value by default, so you are passing 42 to test. test treats 42 as a memory location, which is not a valid address for user code, so it causes a segmentation fault.
When I run the program below in the gcc compiler (www.codepad.org), I get the output
Disallowed system call: SYS_socketcall
Could anyone please explain why this error/output appears?
#include <stdio.h>

int main() {
    int i = 8;
    int *p = &i;
    printf("\n%d", *p);
    *++p = 2;                  /* increments p past i, then writes: UB */
    printf("\n%d", i);
    printf("\n%d", *p);
    printf("\n%d", *(&i + 1));
    return 0;
}
What I have observed is that i becomes inaccessible after I execute *++p=2;. Why?
When you do p = &i, you make p point to the single integer i. ++p increments p to point to the "next" integer, but since i is not an array, the result is undefined.
What you are observing is undefined behavior. Specifically, dereferencing p in *++p=2 is forbidden, as i is not an array with at least two members. In practice, your program is most likely attempting to write to whatever memory sits sizeof(int) bytes past &i.
You are invoking undefined behaviour by writing to undefined areas on the stack. codepad.org has protection against programs that try to do disallowed things, and your undefined behaviour program appears to have triggered that.
If you try to do that on your own computer, your program will probably end up crashing in some other way (such as segmentation fault or bus error).
The expression *++p first moves the pointer p to point one int forward (i.e. the pointer becomes invalid), then dereferences the resulting pointer and tries to store the number 2 there, thus writing to invalid memory.
You might have meant *p = 2 or (*p)++.
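If incrementing the pointer really was the intent, the object it walks over must be an array with room for the next element; a minimal sketch:

#include <stdio.h>

int main(void)
{
    int a[2] = { 8, 0 };            /* two elements, so a + 1 is valid */
    int *p = a;

    printf("%d\n", *p);             /* prints 8 */
    *++p = 2;                       /* well-defined: p now points at a[1] */
    printf("%d %d\n", a[0], a[1]);  /* prints 8 2 */
    return 0;
}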
Your code accesses memory it does not own, and the results of that are undefined.
All your code has the right to do, as currently written, is read and write an area of memory of size sizeof(int) at &i, and another of size sizeof(int*) at &p.
The following lines all violate those constraints, by using memory addresses outside the range you are allowed to read or write data.
*++p=2;
printf("\n%d",*p);
printf("\n%d",*(&i+1));
Operator ++ modifies its argument, so the line *++p=2; increments the pointer p and then assigns 2 to a location on the stack that is probably part of the call frame. Once you have messed up the call frame, all bets are off; you end up in a corrupt state.