So I have this little tricky question I need to answer:
On which segment in memory is c+9 pointing to if the function is:
void f()
{
int *c=(int*)malloc(10);
}
I think I know how malloc works, and I looked up other questions, so this should allocate 10 bytes of memory and return the address of the first one, cast to int*.
So because sizeof(int) is 4 bytes, I thought that there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory and it would return an error. But no, the program works just fine, as if 10*sizeof(int) bytes had been allocated. So where am I making a mistake?
Your mistake is believing that it works just fine.
Yes, it probably runs, yes, it probably doesn't segfault, but no, it is not correct.
So because sizeof(int) is 4 bytes, I thought that there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory
This is correct
and it would return an error
But this is unfortunately not correct in every case. The OS can only hand out memory in whole pages, which means your process gets space in multiples of the page size (typically 4096 bytes). So even though malloc (which is implemented in userspace) gives you 10 bytes, your program has at least one full page from the OS. BUT: malloc will eventually hand out other parts of that same page to later allocations, and then your out-of-bounds accesses will start corrupting them, which is where the bug appears.
TL;DR: This is UB; even though it looks like it works, never do this.
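As an illustration only, here is a sketch of what can happen on a typical allocator that places two small chunks near each other; nothing here is guaranteed, since the whole point is that the behaviour is undefined:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int  *c = malloc(10);        /* room for 2 ints (with 4-byte ints), not 10 */
    char *d = malloc(16);        /* another allocation, possibly right after c's chunk */
    if (c == NULL || d == NULL)
        return 1;

    d[0] = 'A';
    c[9] = 42;                   /* undefined behaviour: may overwrite d or malloc's bookkeeping */

    printf("d[0] = %c\n", d[0]); /* may print 'A', may print garbage, may crash later */
    free(d);                     /* heap corruption, if any, often shows up here or in free(c) */
    free(c);
    return 0;
}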
You're making the mistake of assuming that undefined behavior implies that something "bad" is guaranteed to happen. Accessing c[9] in your case is undefined behavior, because you haven't allocated enough memory, and it is something you should not do.
Undefined behavior means that the standard allows any behavior. For this particular error you would often get non-localized misbehavior: accessing c[9] appears to work fine and nothing odd happens when you do it, but then an unrelated piece of code accessing an unrelated piece of data runs into an error. Mistakes of this kind also frequently corrupt the data used by the memory allocation system, which can make malloc and free misbehave.
C programs will not return an error if you poke outside the assigned memory range. The result is not defined; it may hang or (apparently) work fine. But it is not fine.
You are right that malloc gives you 10 characters (usually 8-bit bytes). Allocating an area for ints that is not a multiple of the int size is in itself fishy, but not illegal. The resulting address is interpreted as a pointer to an int (typically 32 bits), and you are asking for the address 9 ints beyond the start of the allocated area. That in itself is fine, but trying to access it is undefined behaviour: anything might happen, including whatever you naïvely expect, nothing whatsoever, a crash, or the end of the universe. What will usually happen is that you get memory from a bigger area containing other objects and free space (and the extra data malloc uses to keep track of the whole mess). Reading there causes no harm; writing could damage other data or mess up malloc's data structures, leading to mysterious behaviour or crashes later on. If you are lucky, the new space is allocated at a boundary, and the out-of-limits access gives a segmentation fault, pointing at the culprit.
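For reference, a minimal sketch of the allocation the function presumably intended, sized in elements rather than bytes, so that c+9 stays inside the block:

#include <stdlib.h>

void f(void)
{
    /* Allocate room for 10 ints, not 10 bytes; then c + 9 points at the
     * last valid element and c[9] may be read and written safely. */
    int *c = malloc(10 * sizeof *c);
    if (c == NULL)
        return;            /* allocation failure */

    c[9] = 123;            /* fine: within the allocated block */
    free(c);
}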
Related
when I try the code below it works fine. Am I missing something?
main()
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
return 0;
}
I tried it with malloc(0*sizeof(int)) or anything else, and it still works just fine. The program only crashes when I don't use malloc at all. So even if I allocate 0 bytes for the array p, it still stores values properly. So why am I even bothering with malloc, then?
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
@David made a good comment. Compare the results of running your code to running the following code. Note that the latter results in a runtime error (with pretty much no useful output!) on ideone.com, whereas the former succeeds as you experienced.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
p[500000]=42;
printf("p[0]=%d",p[500000]);
return 0;
}
If you don't allocate memory, p contains garbage, so writing through it will likely fail. Once you have made a valid malloc call, p points to a valid memory location and you can write to it. You are still overwriting memory that you shouldn't write to, but nobody is going to hold your hand and tell you about it. If you run your program under a memory debugger such as valgrind, it will tell you.
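To make the program actually well-defined, the allocation has to cover index 500; a minimal corrected sketch:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Allocate room for 501 ints so that index 500 is actually valid. */
    int *p = malloc(501 * sizeof *p);
    if (p == NULL)
        return 1;

    p[500] = 999999;                 /* now within bounds */
    printf("p[500]=%d\n", p[500]);
    free(p);
    return 0;
}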
Welcome to C.
Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen, including your program operating as if what you just did was perfectly legal. The reasons for your program running as if you had done malloc(501*sizeof(int)) are completely implementation-specific, and can indeed depend on anything, including the phase of the moon.
This is because p will be assigned some address no matter what size you pass to malloc(). With a zero size you would be referencing memory that hasn't been allocated, but it may be at a location that doesn't make the program crash; the behavior is still undefined.
Now if you do not use malloc() at all, p points to a garbage location, and trying to access that is likely to crash the program.
I tried it with malloc(0*sizeof(int))
According to C99, if the size passed to malloc is 0, the C runtime may return either a NULL pointer or a pointer that behaves as if a non-zero size had been requested, except that it must not be dereferenced. So the result is implementation-defined (e.g. some implementations return a zero-length buffer); in your case you do not get a NULL pointer back, but you are using a pointer you should not be using. On a different runtime you could get a NULL pointer back.
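Since the standard permits either result for a zero-size request, portable code has to cope with both; a small sketch:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p = malloc(0);   /* may be NULL or a valid pointer that must not be dereferenced */

    if (p == NULL)
        printf("this runtime returns NULL for malloc(0)\n");
    else
        printf("this runtime returns a non-NULL pointer; still do not dereference it\n");

    free(p);              /* free(NULL) is a no-op, so this is safe either way */
    return 0;
}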
When you call malloc() a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
This does not actually allocate just 4 bytes on a 32-bit machine: the allocator pads the request up to a minimum chunk size and adds the heap metadata used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). See http://en.wikipedia.org/wiki/Malloc, or more specifically http://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p[500] you are not writing outside the bounds of the page that contains your initial chunk, so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=heap+overflows
Our CheckPointer tool can detect this error. It knows that the allocation for p was a chunk of 4 bytes, so when the assignment is made, it is outside the area allocated for p. It will tell you that the p[500] assignment is wrong.
Consider the following:
int* x = calloc(3,sizeof(int));
x[3] = 100;
which is located inside of a function.
I get no error when I compile and run the program, but when I run it with valgrind I get an "Invalid write of size 4".
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
Does some address in the stack(?) still have the value 100? Because there must certainly be more available memory than what I have allocated with calloc. Is the valgrind error more of a "Hey, you probably did not mean to do that"?
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
"What actually happens" is not well-defined; it depends entirely on what gets overwritten. As long as you don't overwrite anything important, your code will appear to run as expected.
You could wind up corrupting other data that was allocated dynamically. You could wind up corrupting some bit of heap bookkeeping.
The language does not enforce any kind of bounds-checking on array accesses, so if you read or write past the end of the array, there are no guarantees on what will happen.
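Because the language will not check indices for you, one common defensive pattern is to carry the element count around with the pointer and check it yourself. A minimal sketch (checked_store is a hypothetical helper, not a standard function):

#include <stdlib.h>

/* Hypothetical helper: the language will not check the index, so do it yourself. */
static int checked_store(int *arr, size_t len, size_t idx, int value)
{
    if (arr == NULL || idx >= len)
        return -1;          /* out of bounds: refuse instead of invoking UB */
    arr[idx] = value;
    return 0;
}

int main(void)
{
    size_t len = 3;
    int *x = calloc(len, sizeof *x);
    if (x == NULL)
        return 1;

    checked_store(x, len, 2, 100);   /* ok */
    checked_store(x, len, 3, 100);   /* rejected instead of writing past the end */
    free(x);
    return 0;
}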
Does some address in the stack(?) still have the value 100?
First of all, calloc allocates memory on the heap, not on the stack.
Now, regarding the error.
Sure, most of the time there is plenty of memory available when your program is running. However, when you allocate memory for x bytes, the memory manager looks for a free chunk of that size (plus perhaps a bit more, if calloc requested extra memory to store some auxiliary info). There are no guarantees about what the bytes after that chunk are used for, and no guarantee that they are writable, or even accessible by your program at all.
So anything can happen. If that memory was simply sitting there waiting to be used by your program, nothing horrible happens; but if it was being used by something else in your program, those values get messed up, or, worst of all, the program could crash because it accessed something it wasn't supposed to access.
So the valgrind error should be treated very seriously.
The C language doesn't require bounds checking on array accesses, and most C compilers don't implement it. Besides, if you used a variable size instead of the constant 3, the array size could be unknown at compile time, and there would be no way to check whether the access is out of bounds.
There are no guarantees about what is allocated in the space past x[3] or what will be written there in the future. As alinsoar mentioned, forming the address x + 3 does not itself cause undefined behavior, but you should not attempt to fetch or store a value there. Often you will be able to write to and read from that memory location without apparent problems, but writing code that relies on reaching outside your allocated arrays is setting yourself up for very hard-to-find errors later.
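A short sketch of that distinction: the one-past-the-end address x + 3 may be formed and compared, but not dereferenced:

#include <stdlib.h>

int main(void)
{
    int *x = calloc(3, sizeof *x);
    if (x == NULL)
        return 1;

    int *end = x + 3;   /* legal: a "one past the end" pointer may be formed and compared */

    for (int *p = x; p != end; ++p)   /* common idiom: use it only as a loop bound */
        *p = 100;

    /* *end = 100;  <-- this dereference would be the undefined behaviour */

    free(x);
    return 0;
}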
Does some address in the stack(?) still have the value 100?
When you use calloc or malloc, the values of the array are not actually on the stack. These calls perform dynamic memory allocation, meaning the memory comes from a separate area known as the "heap". This lets you access the array from different parts of the program as long as you have a pointer to it. If the array were on the stack, writing past its bounds would risk overwriting other information in your function's stack frame (in the worst case, the return address).
The act of doing that is what is called undefined behavior.
Literally anything can happen, or nothing at all.
I give you extra points for testing with Valgrind.
In practice, it is likely you will find the value 100 in the memory space after your array.
Beware of nasal demons.
You're allocating memory for 3 integer elements but accessing the 4th element (x[3]). Hence the warning message from valgrind. The compiler will not complain about it.
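If the fourth element is actually wanted, the fix is simply to allocate four elements; a minimal sketch:

#include <stdlib.h>

int main(void)
{
    /* If the 4th element is really needed, allocate 4 elements, not 3. */
    int *x = calloc(4, sizeof *x);
    if (x == NULL)
        return 1;

    x[3] = 100;     /* now a valid, valgrind-clean write */
    free(x);
    return 0;
}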
int *array; //it declares a pointer to an int, right?
array=malloc(sizeof(int)); //it allocates the space for ONE int, right?
scanf("%d", &array[4]); //this must generate a segmentation fault because there isn't enough allocated space, right?
printf("%d",array[4]); //it shouldn't print anything...
but it prints 4! Why?
Reading or writing off the end of an array in C results in undefined behavior, meaning that absolutely anything can happen. It can crash the program, or format the hard drive, or set the computer on fire. In your case, it just so happens that this works out fine, probably because malloc pads the length of the allocated memory for efficiency reasons (or perhaps because you're trashing memory that malloc wants to use later on). However, this isn't at all portable, isn't guaranteed to work, and is a disaster waiting to happen.
Hope this helps!
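If array[4] is genuinely needed, the allocation has to cover five ints; a minimal corrected sketch of the question's snippet:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Allocate room for 5 ints so that array[4] is a valid element. */
    int *array = malloc(5 * sizeof *array);
    if (array == NULL)
        return 1;

    if (scanf("%d", &array[4]) == 1)
        printf("%d\n", array[4]);

    free(array);
    return 0;
}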
Because the operating system happens to have mapped the memory you touched. This code is by no means guaranteed to work on another machine, or even on another run.
C doesn't check whether your code stays within the bounds of an array; that is left to the programmer. This is undefined behaviour, even though in practice the memory you hit is usually part of the stack or the heap, so you can often still get at it.
When you write array[4] you are really writing *(array + 4), which refers to the address 4*sizeof(int) bytes past the start of the array. That address exists: it could be read-only, it could belong to another array or variable in your program, or the access might just appear to work perfectly. There is no guarantee you'll get an error, but that doesn't make the behaviour any less undefined.
To understand more about undefined behavior you can go to this article (which I find very interesting).
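A small sketch of that arithmetic: array[4] and *(array + 4) are the same access, and the element sits 4 * sizeof(int) bytes past the start of the block:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *array = malloc(5 * sizeof *array);
    if (array == NULL)
        return 1;

    /* array[4] and *(array + 4) are the same expression ... */
    array[4] = 7;
    printf("array[4] = %d, *(array + 4) = %d\n", array[4], *(array + 4));

    /* ... and the element lives 4 * sizeof(int) bytes past the start. */
    printf("byte offset: %zu\n",
           (size_t)((char *)&array[4] - (char *)array));   /* prints 16 when sizeof(int) == 4 */

    free(array);
    return 0;
}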
I have an array that's declared as char buff[8]. That should only be 8 bytes, but looking at the assembly and testing the code, I get a segmentation fault when I input something larger than 32 characters into that buff, whereas I would expect it for anything larger than 8 characters. Why is this?
What you're saying is not a contradiction:
You have space for 8 characters.
You get an error when you input more than 32 characters.
So what?
The point is that nobody told you that you would be guaranteed to get an error if you input more than 8 characters. That's simply undefined behaviour, and anything can (and will) happen.
You absolutely mustn't think that the absence of obvious misbehaviour is proof of the correctness of your code. Code correctness can only be verified by checking the code against the rules of the language (though some automated tools such as valgrind are an immense help).
Writing beyond the end of the array is undefined behavior. Undefined behavior means nothing (including a segmentation fault) is guaranteed.
In other words, it might do anything. More practically, it's likely the write didn't touch anything protected, so from the point of view of the OS everything is still OK until you go past 32.
This raises an interesting point. What is "totally wrong" from the point of view of C might be OK with the OS. The OS only cares about which pages you access:
Is the address mapped for your process ?
Does your process have the rights ?
You shouldn't count on the OS slapping you if anything goes wrong. A useful tool for this (slapping) is valgrind, if you are using Unix. It will warn you if your process is doing nasty things, even if those nasty things are technically OK with the OS.
C arrays have no bound checking.
As others have said, you are hitting undefined behaviour; as long as you stay inside the bounds of the array, everything works fine. If you cheat, as far as the standard is concerned anything can happen, including your program seeming to work right as well as the explosion of the Sun.
What happens in practice is that with stack-allocated variables you are likely to overwrite other variables on the stack, getting "impossible" bugs, or, if you hit a canary value put by the compiler, it may detect the buffer overflow on return from the function. For variables allocated in the so-called heap, the heap allocator may have given some more room than requested, so the mistake may be less easy to spot, although you may easily mess up the internal structures of the heap.
In both cases you can also hit a protected memory page, which will result in your program being terminated forcibly (for the stack this happens less often because usually you have to overwrite the entire stack to get to a protected page).
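To make the stack case concrete, here is a sketch of deliberately wrong code; the relative layout of locals is not guaranteed, this is undefined behaviour shown only for illustration, and a compiler with stack protection may abort instead of showing corruption:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int  sentinel = 42;
    char buff[8];

    strcpy(buff, "this string is far longer than 8 bytes");   /* buffer overrun (UB) */

    /* On many builds sentinel is no longer 42; with -fstack-protector the
     * program may instead abort with a "stack smashing detected" message on return. */
    printf("sentinel = %d\n", sentinel);
    return 0;
}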
Your declaration char buff[8] sounds like a stack-allocated variable, although it could be heap-allocated if it is part of a dynamically allocated struct. Accessing out of bounds of an array is undefined behaviour and is known as a buffer overrun. Buffer overruns on stack-allocated memory may corrupt the current stack frame and possibly other stack frames in the call stack. With undefined behaviour, anything could happen, including no apparent error. You would not expect a seg fault immediately, because the stack is typically allocated when the thread starts.
For heap allocated memory, memory managers typically allocate large blocks of memory and then sub-allocate from those larger blocks. That is why you often don't get a seg fault when you access beyond the end of a block of memory.
It is undefined behaviour to access beyond the end of a memory block. And it is perfectly valid, according to the standard, for such out of bounds accesses to result in seg faults or indeed an apparently successful read or write. I say apparently successful because if you are writing then you will quite possibly produce a heap corruption by writing out of bounds.
Unless you are not telling us something, you answered your own question.
declaring
char buff[8] ;
means that the compiler grabs 8 bytes of memory. If you try to stuff 32 chars into it you may well get a seg fault; that's called a buffer overflow.
Each char is one byte (wide characters such as wchar_t are larger), so you are trying to put four times as many chars as will fit in your buffer.
Is this your first time coding in C?
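For what it's worth, the usual way to read input into a small fixed buffer like char buff[8] without overrunning it is to give the reading function the buffer size, for example with fgets (a minimal sketch):

#include <stdio.h>

int main(void)
{
    char buff[8];

    /* fgets never writes more than sizeof buff bytes (including the '\0'),
     * so long input is truncated instead of overrunning the buffer. */
    if (fgets(buff, sizeof buff, stdin) != NULL)
        printf("read: %s\n", buff);

    return 0;
}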