//this code should give segmentation error....but it works fine ....how is it possible.....i just got this code by hit and trail whle i was trying out some code of topic ARRAY OF POINTERS....PLZ can anyone explain
int main()
{
int i,size;
printf("enter the no of names to be entered\n");
scanf("%d",&size);
char *name[size];
for(i=0;i<size;i++)
{
scanf("%s",name[i]);
}
printf("the names in your array are\n");
for(i=0;i<size;i++)
{
printf("%s\n",&name[i]);
}
return 0
The problem in your code (which is incomplete, BTW; you need #include <stdio.h> at the top and a closing } at the bottom) can be illustrated in a much shorter chunk of code:
char *name[10]; // make the size an arbitrary constant
scanf("%s", name[0]); // Read into memory pointed to by an uninitialized pointer
(name could be a single pointer rather than an array, but I wanted to preserve your program's structure for clarity.)
The pointer name[0] has not been initialized, so its value is garbage. You pass that garbage pointer value to scanf, which reads characters from stdin and stores them in whatever memory location that garbage pointer happens to point to.
The behavior is undefined.
That doesn't mean that the program will die with a segmentation fault. C does not require checking for invalid pointers (nor does it forbid it, but most implementations don't do that kind of checking). So the most likely behavior is that your program will take whatever input you provide and attempt to store it in some arbitrary memory location.
If the garbage value of name[0] happens to point to a detectably invalid memory location, your program might die with a segmentation fault. That's if you're luck. If you're not, it might happen to point to some writable memory location that your program is able to modify. Storing data in that location might be harmless, or it might clobber some critical internal data structure that your program depends on.
Again, your program's behavior is undefined. That means the C standard imposes no requirements on its behavior. It might appear to "work", it might blow up in your face, or it might do anything that it's physically possible for a program to do. Apparently to behave correctly is probably the worst consequence of undefined behavior, since it makes it difficult to diagnose the problem (which will probably appear during a critical demo).
Incidentally, using scanf with a %s format specifier is inherently unsafe, since there's no way to limit the amount of data it will attempt to read. Even with a properly initialized pointer, there's no way to guarantee that it points to enough memory to hold whatever input it receives.
You may be accustomed to languages that do run-time checking and can reliably detect (most) problems like this. C is not such a language.
I'm not sure what's your test case (No enough reputation to post a comment). I just try to input it with 0 and 1\n1\n2\n.
It's a little complex to explain the detail. However, Let's start it :-). There are two things you should know. First, main() is a function. Second, you use a C99 feature, variable-length array or gnu extension, zero-length array (supported by gcc), on char *name[size];.
main() is a function, so all the variable declared in this function is local variables. Local variables locate at stack section. You must know about it first.
If you input 1\n1\n2\n, the variable-length array is used. The implementation of it is also to allocate it on stack. Notice that value of each element in array is not initialized as 0. That is the possible answer for you to execute without segmentation fault. You cannot make sure that it'll point to the address which isn't writable (At least failed on me).
If the input is 0\n, you will use extension feature, zero-length array, supported by GNU. As you saw, it means no element in array. The value of name is equal to &size, because size is the last local variable you declared before you declared name[0] (Consider stack pointer). The value of name[0] is equal to dereference to &size, that's zero (='\0') , so it will work fine.
The simple answer to your question is that a segmentation fault is:
A segmentation fault (aka segfault) are caused by a program trying to read or write an illegal memory location.
So it all depends upon what is classed as illegal. If the memory in question is a part of the valid address space, e.g. the stack, for the process the program is running, it may not cause a segfault.
When I run this code in a debugger the line:
scanf("%s, name[i]);
over writes the content of the size variable, clearly not the intended behaviour, and the code essentially goes into an infinite loop.
But that is just what happens on my 64 bit Intel linux machine using gcc 5.4. Another environment will probably do something different.
If I put the missing & in front of name[i] it works OK. Whether that is luck, or expertly exploiting the intended behaviour of C99 variable length arrays, as suggested. I'm afraid I don't know.
So welcome to the world of subtle memory overwriting bugs.
Related
I'm looking at this simple program. My understanding is that trying to modify the values at memory addresses past the maximum index should result in a segmentation fault. However, the following code runs without any problems. I am even able to print out all 6 of the array indexes 0->6
main()
{
int i;
int a[3];
for(i=0; i<6; i++)
a[i] = i;
}
However, when I change the for loop to
for(i=0; i<7; i++)
and executing the program, it will segfault.
This almost seems to me like it is some kind of extra padding done by malloc. Why does this happen only after the 6th index (s+6)? Will this behavior happen with longer/shorter arrays? Excuse my foolishness as lowly java programmer :)
This almost seems to me like it is some kind of extra padding done by malloc.
You did not call malloc(). You declared a as an array of 3 integers in stack memory, whereas malloc() uses heap memory. In both cases, accessing any element past the last one (the third one, a[2], in this case) is undefined behaviour. When it's done with heap memory, it usually causes a segmentation fault.
Well, malloc didn't do it, because you didn't call malloc. My guess is, the extra three writes were enough to chew through your frame pointer and your stack pointer, but not through your return address (but the seventh one hit that). Your program is not guaranteed to crash when you access out-of-bounds memory (though there are other languages which do guarantee it), any more than it is guaranteed not to crash. That's what undefined behavior is: unpredictable.
As others said, accessing an array beyond its limits is undefined behaviour. So what happens? If you access memory that you should not access, it depends on where that memory is.
Generally (but very generally, specifics can vary between systems and compilers), the following main things can happen:
It can be that you simply access other variables of your process that lie directly "behind" the array. If you write to that memory, you simply modify the values of the other variables. You will probably not get a segfault, so you may never notice why your program produces bad results or acts so weird or why, during debugging, your variables have values you never (knowlingly) assigned to them. This is, IMO, really bad, because you think everything is fine while it isn't.
It can be, especially on a stack, that you access other data on the stack, like saved processor registers or even the address to which the processor should return if the function ends. If you overwrite these with some other data, it is hard to tell what happens. In that case, a segfault is probably the lesser of all possible evils. You simply don't know what can happen.
If the memory beyond your array does not belong to your process then, on most modern computers, you will get a segfault or similar exception (e.g. an access violation, or whatever your OS calls it).
I may have forgotten a few more possible problems that can occur, but those are, IMO, the most usual things that happen if you write beyond array bounds.
It depends on the free memory available, if free memory available is less then it will give segmentation fault otherwise it will use the extra memory to store the data and it will not be giving segmentation fault.There is no need for malloc because array itself allocates memory.
In your system memory is available only for 6 integers and when you are trying to access to next memory(which is not accessible or say not free)it is giving segmentation fault.
int *array; //it allocate a pointer to an int right?
array=malloc(sizeof(int); //allocate the space for ONE int right?
scanf("%d", &array[4]); //must generate a segmentation fault cause there is not allocated space enough right?
printf("%d",array[4]); //it shouldn't print nothing...
but it print 4! why?
Reading or writing off the end of an array in C results in undefined behavior, meaning that absolutely anything can happen. It can crash the program, or format the hard drive, or set the computer on fire. In your case, it just so happens to be the case that this works out perfectly fine, probably because malloc pads the length of the allocated memory for efficiency reasons (or perhaps because you're trashing memory that malloc wants to use later on). However, this isn't at all portable, isn't guaranteed to work, and is a disaster waiting to happen.
Hope this helps!
Cause the operating system happens to have this memory you have asked. This code is by no means guaranteed to run on another machine or another time.
C doesn't check your code for getting of boundaries of an array size, that depends to the programmer. I guess you can call it an undefined behavior (although it's not exactly what they mean when they say an undefined behavior because mostly the memory would be in a part of the stack or the heap so you can still get to it.
When you say array[4] you actually say *(array + 4) which is of course later translated to *(array + 4*sizeof(int)) and you actually go to a certain space in the memory, the space exists, it could be maybe a read-only, maybe a place of another array or variable in your program or it might just work perfectly. No guarantee it'll be an error, but it's not what undefined behavior.
To understand more about undefined behavior you can go to this article (which I find very interesting).
(GLOBAL DECLARATION OF CHAR ARRAY)
#include<stdio.h>
char name[10]; /* though allocated memory is 10 bytes still it accepts more then 10 chars*/
void main()
{
printf("\n ENter name :\t");
scanf("%s",name);
}
Second case:(LOCAL DECLARATION OF CHAR array)
#include<stdio.h>
void main()
{
char name[10];/* Now it will only accepts 10 chars NOT MORE */
printf("\n ENter name :\t");
scanf("%s",name);
}
why there is difference in acceptance of the chars in 1st case it accepts more than 10 but in 2nd exactly 10 but not more.I dont know why,but it happens???
In both case entering string more than 10 chars (including \0) will invoke undefined behavior because you are writing past to array bound (allocated memory).
In this case, saying that first worked and second doesn't for more than 10 chars is meaningless. Any expected or unexpected behavior can be seen.
C does not do bounds checking on array accesses, so it won't automatically raise an exception if you write past the end of the array. The behavior of writing past the end of the array is undefined; the compiler is not required to handle that error in any particular way. A really smart compiler may issue a diagnostic and halt translation. Or it may compile the code as though nothing's wrong (which is the usual behavior). When run, your code may crash immediately, or it may continue to run in a bad state and crash later (been in that movie before; not fun), or it may continue to run without any issues at all.
The C language assumes you know how big your arrays are, and that you are smart enough to not wander outside of their boundaries. It won't prevent you from doing so, but at that point it makes no guaratees about your program's behavior.
When you write more than 10 characters into name, you are stepping on unauthorized memory.
When n is defined in the global namespace you are corrupting some memory locations but bad side effects are not visible right away in your particular case. Since the behavior is undefined, for a different use case, you might see the bad side effects right away or in a delayed fashion.
When n is defined in the function, you are corrupting local stack memory and the bad side effects are visible right away in your particular case.
I have this piece of code, and it runs perfectly fine, and I don't why:
int main(){
int len = 10;
char arr[len];
arr[150] = 'x';
}
Seriously, try it! It works (at least on my machine)!
It doesn't, however, work if I try to change elements at indices that are too large, for instance index 20,000. So the compiler apparently isn't smart enough to just ignore that one line.
So how is this possible? I'm really confused here...
Okay, thanks for all the answers!
So I can use this to write into memory consumed by other variables on the stack, like so:
#include <stdio.h>
main(){
char b[4] = "man";
char a[10];
a[10] = 'c';
puts(b);
}
Outputs "can". That's a really bad thing to do.
Okay, thanks.
C compilers generally do not generate code to check array bounds, for the sake of efficiency. Out-of-bounds array accesses result in "undefined behavior", and one
possible outcome is that "it works". It's not guaranteed to cause a crash or other
diagnostic, but if you're on an operating system with virtual memory support, and your array index points to a virtual memory location that hasn't yet been mapped to physical memory, your program is more likely to crash.
So how is this possible?
Because the stack was, on your machine, large enough that there happened to be a memory location on the stack at the location to which &arr[150] happened to correspond, and because your small example program exited before anything else referred to that location and perhaps crashed because you'd overwritten it.
The compiler you're using doesn't check for attempts to go past the end of the array (the C99 spec says that the result of arr[150], in your sample program, would be "undefined", so it could fail to compile it, but most C compilers don't).
Most implementations don't check for these kinds of errors. Memory access granularity is often very large (4 KiB boundaries), and the cost of finer-grained access control means that it is not enabled by default. There are two common ways for errors to cause crashes on modern OSs: either you read or write data from an unmapped page (instant segfault), or you overwrite data that leads to a crash somewhere else. If you're unlucky, then a buffer overrun won't crash (that's right, unlucky) and you won't be able to diagnose it easily.
You can turn instrumentation on, however. When using GCC, compile with Mudflap enabled.
$ gcc -fmudflap -Wall -Wextra test999.c -lmudflap
test999.c: In function ‘main’:
test999.c:3:9: warning: variable ‘arr’ set but not used [-Wunused-but-set-variable]
test999.c:5:1: warning: control reaches end of non-void function [-Wreturn-type]
Here's what happens when you run it:
$ ./a.out
*******
mudflap violation 1 (check/write): time=1362621592.763935 ptr=0x91f910 size=151
pc=0x7f43f08ae6a1 location=`test999.c:4:13 (main)'
/usr/lib/x86_64-linux-gnu/libmudflap.so.0(__mf_check+0x41) [0x7f43f08ae6a1]
./a.out(main+0xa6) [0x400a82]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7f43f0538ead]
Nearby object 1: checked region begins 0B into and ends 141B after
mudflap object 0x91f960: name=`alloca region'
bounds=[0x91f910,0x91f919] size=10 area=heap check=0r/3w liveness=3
alloc time=1362621592.763807 pc=0x7f43f08adda1
/usr/lib/x86_64-linux-gnu/libmudflap.so.0(__mf_register+0x41) [0x7f43f08adda1]
/usr/lib/x86_64-linux-gnu/libmudflap.so.0(__mf_wrap_alloca_indirect+0x1a4) [0x7f43f08afa54]
./a.out(main+0x45) [0x400a21]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7f43f0538ead]
number of nearby objects: 1
Oh look, it crashed.
Note that Mudflap is not perfect, it won't catch all of your errors.
Native C arrays do not get bounds checking. That would require additional instructions and data structures. C is designed for efficiency and leanness, so it doesn't specify features that trade performance for safety.
You can use a tool like valgrind, which runs your program in a kind of emulator and attempts to detect such things as buffer overflows by tracking which bytes are initialized and which aren't. But it's not infallible, for example if the overflowing access happens to perform an otherwise-legal access to another variable.
Under the hood, array indexing is just pointer arithmetic. When you say arr[ 150 ], you are just adding 150 times the sizeof one element and adding that to the address of arr to obtain the address of a particular object. That address is just a number, and it might be nonsense, invalid, or itself an arithmetic overflow. Some of these conditions result in the hardware generating a crash, when it can't find memory to access or detects virus-like activity, but none result in software-generated exceptions because there is no room for a software hook. If you want a safe array, you'll need to build functions around the principle of addition.
By the way, the array in your example isn't even technically of fixed size.
int len = 10; /* variable of type int */
char arr[len]; /* variable-length array */
Using a non-const object to set the array size is a new feature since C99. You could just as well have len be a function parameter, user input, etc. This would be better for compile-time analysis:
const int len = 10; /* constant of type int */
char arr[len]; /* constant-length array */
For the sake of completeness: The C standard doesn't specify bounds checking but neither is it prohibited. It falls under the category of undefined behavior, or errors that need not generate error messages, and can have any effect. It is possible to implement safe arrays, various approximations of the feature exist. C does nod in this direction by making it illegal, for example, to take the difference between two arrays in order to find the correct out-of-bounds index to access an arbitrary object A from array B. But the language is very free-form, and if A and B are part of the same memory block from malloc it is legal. In other words, the more C-specific memory tricks you use, the harder automatic verification becomes even with C-oriented tools.
Under the C spec, accessing an element past the end of an array is undefined behaviour. Undefined behaviour means that the specification does not say what would happen -- therefore, anything could happen, in theory. The program might crash, or it might not, or it might crash hours later in a completely unrelated function, or it might wipe your harddrive (if you got unlucky and poked just the right bits into the right place).
Undefined behaviour is not easily predictable, and it should absolutely never be relied upon. Just because something appears to work does not make it right, if it invokes undefined behaviour.
Because you were lucky. Or rather unlucky, because it means it's harder to find the bug.
The runtime will only crash if you start using the memory of another process (or in some cases unallocated memory). Your application is given a certain amount of memory when it opens, which in this case is enough, and you can mess about in your own memory as much as you like, but you'll give yourself a nightmare of a debugging job.
We know that automatic variables are destroyed upon the return of the function.
Then, why is this C program returning correct value?
#include <stdio.h>
#include <process.h>
int * ReturningPointer()
{
int myInteger = 99;
int * ptrToMyInteger = &myInteger;
return ptrToMyInteger;
}
main()
{
int * pointerToInteger = ReturningPointer();
printf("*pointerToInteger = %d\n", *pointerToInteger);
system("PAUSE");
}
Output
*pointerToInteger = 99
Edit
Then why is this giving garbage values?
#include <stdio.h>
#include <process.h>
char * ReturningPointer()
{
char array[13] = "Hello World!";
return array;
}
main()
{
printf("%s\n", ReturningPointer());
system("PAUSE");
}
Output
xŤ
There is no answer to that question: your code exhibits undefined behavior. It could print "the right value" as you are seeing, it could print anything else, it could segfault, it could order pizza online with your credit card.
Dereferencing that pointer in main is illegal, it doesn't point to valid memory at that point. Don't do it.
There's a big difference between you two examples: in the first case, *pointer is evaluated before calling printf. So, given that there are no function calls between the line where you get the pointer value, and the printf, chances are high that the stack location pointer points to will not have been overwritten. So the value that was stored there prior to calling printf is likely to be output (that value will be passed on to printf's stack, not the pointer).
In the second case, you're passing a pointer to the stack to printf. The call to printf overwrites (a part of) that same stack region the pointer is pointing to, and printf ends up trying to print its own stack (more or less) which doesn't have a high chance of containing something readable.
Note that you can't rely on getting gibberish either. Your implementation is free to use a different stack for the printf call if it feels like it, as long as it follows the requirements laid out by the standard.
This is undefined behavior, and it could have launched a missile instead. But it just happened to give you the correct answer.
Think about it, it kind of make sense -- what else did you expect? Should it have given you zero? If so, then the compiler must insert special instructions at the scope end to erase the variable's content -- waste of resources. The most natural thing for the compiler to do is to leave the contents unchanged -- so you just got the correct output from undefined behavior by chance.
You could say this behavior is implementation defined. For example. Another compiler (or the same compiler in "Release" mode) may decide to allocate myInteger purely in register (not sure if it actually can do this when you take an address of it, but for the sake of argument...), in that case no memory would be allocated for 99 and you would get garbage output.
As a more illustrative (but totally untested) example -- if you insert some malloc and exercise some memory usage before printf you may find the garbage value you were looking for :P
Answer to "Edited" part
The "real" answer that you want needs to be answered in disassembly. A good place to start is gcc -S and gcc -O3 -S. I will leave the in-depth analysis for wizards that will come around. But I did a cursory peek using GCC and it turns out that printf("%s\n") gets translated to puts, so the calling convention is different. Since local variables are allocated on the stack, calling a function could "destroy" previously allocated local variables.
Destroying is the wrong word imho. Locals reside on the stack, if the function returns the stack space may be reused again. Until then it is not overwritten and still accessible by pointers which you might not really want (because this might never point to something valid)
Pointers are used to address space in memory, for local pointers the same as I described in 1 is valid. However the pointer seems to be passed to the main program.
If it really is the address storing the former integer it will result in "99" up until that point in the execution of your program when the program overwrite this memory location. It may also be another 99 by coincidence. Any way: do not do this.
These kind of errors will lead to trouble some day, may be on other machines, other OS, other compiler or compiler options - imagine you upgrade your compiler which may change the behaviour the memory usage or even a build with optimization flags, e.g. release builds vs default debug builds, you name it.
In most C/C++ programs their local variables live on the stack, and destroyed means overwritten with something else. In this case that particular location had not been overwritten yet when it was passed as a parameter to printf().
Of course, having such code is asking for trouble because per the C and C++ standards it exhibits undefined behavior.
That is undefined behavior. That means that anything can happen, even what you would expect.
The tricky part of UB is when it gives you the result you expect, and so you think that you are doing it right. Then, any change in an unrelated part of the program changes that...
Answering your question more specifically, you are returning a pointer to an automatic variable, that no longer exists when the function returns, but since you call no other functions in the middle, it happens to keep the old value.
If you call, for example printf twice, the second time it will most likely print a different value.
The key idea is that a variable represents a name and type for value stored somewhere in memory. When it is "destroyed", it means that a) that value can no longer be accessed using that name, and b) the memory location is free to be overwritten.
The behavior is undefined because the implementation of the compiler is free to choose what time after "destruction" the location is actually overwritten.