I was reading this post and the OP says that he was worried that "appending A's allocated memory will corrupt the heap", so he instead allocated new memory, copied A into it with memcpy, and then used memset.
I was wondering whether that is the case. Is it not possible to append without allocating new memory?
I made a lame attempt, and I got an error:
int a[3] = {14, 2, 7}; // initialize array
int M = 2;
int size = sizeof a / sizeof a[0]; // size of array
// print contents of array
for (int i=0;i<size;i++) {
printf("%d\n", a[i]);
}
// add (post pad) 2 zeros to existing array , a
memset(a+M, 0, M * sizeof(int));
int size2 = sizeof a / sizeof a[0];
printf("%d\n", size2);
// print updated array
for (int i=0;i<size2;i++) {
printf("%d\n", a[i]);
}
"*** stack smashing detected ***: terminated"
C has no built-in resizable arrays. Once an array is defined, its size is fixed for its whole lifetime and cannot change.
There is also no bounds checking, so that is up to you as the programmer.
The array you created has a fixed size, with space for 3 integers.
Your call to memset() is erroneous.
memset requires the following:
void *memset(void *s, int c, size_t n);
s = starting address of memory to be filled
c = value to be filled into memory
n = number of bytes to fill, starting from s
Your code passes a+M, the address of a[2], as s and asks for 2 * sizeof(int) bytes. The first int written is a[2], the last valid element; the second lies beyond the end of a, so memset() writes past the array, which is a buffer overflow bug.
Secondly, it can cause a problem on the stack. Since the array a has a fixed size, writing beyond that size means you are accessing memory which does not belong to you, potentially overwriting important data in the stack frame.
To put it simply, compilers with stack protection enabled place a known value called a "stack canary" between a function's local buffers and its saved return address. Before the function returns, compiler-inserted code checks the canary; if the value has been overwritten, the program reports that stack smashing was detected and terminates.
C does not allow you to write to an array out-of-bounds, no matter where it is allocated. Doing so is undefined behavior and anything can happen.
// add (post pad) 2 zeros to existing array , a
memset(a+M, 0, M * sizeof(int));
This code doesn't do what the comment claims: it starts by overwriting item a[2] with zero, then continues to write 1 x sizeof(int) bytes out of bounds from there. What happened in your specific case ("stack smashing") is likely that the code killed a so-called "stack canary", and so the stack corruption was detected. That is a nice service by the compiler, but by no means guaranteed to happen. You might just as well corrupt other variables or cause the program to crash.
In the example below I have allocated 20 bytes of memory to extend an array by 5 integers. After that I have set the last element to 15 and then reallocated the pointer to 4 bytes (1 integer). Then I print the first 10 elements of the array (it only consists of 6 at this point) and the 9th (which I've previously set to 15) is printed without warnings or errors.
The code :
#include <stdlib.h>
#include <stdio.h>
int main()
{
int arr[5] = {0};
int *ptr = &arr[0];
ptr = malloc(5 * sizeof(int));
arr[9] = 15;
ptr = realloc(ptr, 1 * sizeof(int));
for (int i = 0; i < 10; ++i)
{
printf("%d\n", arr[i]);
}
free(ptr);
return 0;
}
The result after compiling and running :
0
0
0
0
0
32766
681279744
-1123562100
-1261131712
15
My question is as follows : Why is the 9th element of the array still 15? (why am I able to access it?; Shouldn't the allocated memory be at the first free block of memory my compiler finds and not connected to the array's buffer whatsoever?)
The behaviour of malloc()/realloc() is irrelevant in this case, because the code in the question modifies and displays the content of arr rather than ptr, and arr is not dynamically allocated or reallocated. So there is no out-of-bounds access in the dynamic memory. The out-of-bounds access to arr[] has undefined behaviour: you are stomping on memory not allocated to arr. In some cases that will modify adjacent variables, but in this case you have none, so, since stacks most often grow downward, you may be modifying the local variables of the calling function or corrupting the return address of the current function. Since this is main(), even that might not cause any noticeable error. In other cases it will lead to a crash.
However, had you written to ptr[9] instead, then reallocated and displayed the content through ptr, you would most likely see a similar result: to avoid an unnecessary data move, realloc() typically reuses the same memory block when the allocation is reduced, and simply reduces its size, returning the remainder to the heap.
Returning memory to the heap, does not change its content or make it inaccessible, and C does not perform any bounds checking, so if you code to access memory that is not part of the allocation it will let you. It simply makes the returned block available for allocation.
Strictly it is undefined behaviour, so other behaviour is possible, but generally C does not generate code to do anything other than the bare minimum required - except possibly in some cases to support debugging.
Your description of what the program is doing is all wrong.
In the example below I have allocated 20 bytes of memory to extend an array by 5 integers
No, you don't. You can't extend arr. It's just impossible.
After that I have set the last element to 15
No - because you didn't extend the array so index 9 does not represent the last element. You simply write outside the array.
Look at these lines:
int *ptr = &arr[0];
ptr = malloc(5 * sizeof(int));
First you make ptr point to the first element in arr, but right after that you make ptr point to some dynamically allocated memory which has absolutely no relation to arr. In other words, the first line can simply be deleted (and the compiler will probably drop it).
In the rest of your program you never use ptr to read or write anything. In other words, you can simply remove all code using ptr; it has no effect on the output.
So the program could simply be:
int main()
{
int arr[5] = {0};
arr[9] = 15;
for (int i = 0; i < 10; ++i)
{
printf("%d\n", arr[i]);
}
return 0;
}
And it has undefined behavior because you access arr out of bounds.
Why is the 9th element of the array still 15?
The "most likely reality" is that the OS provides a way to allocate area/s of virtual pages (which aren't necessarily real memory and should be considered "pretend/fake memory"), and malloc() carves up the allocated "pretend/fake memory" (and allocates more area/s of virtual pages if/when necessary, and deallocates areas of virtual pages if/when convenient).
Freeing "pretend/fake memory that was carved up by malloc()" probably does no more than alter some meta-data used to manage the heap; and is unlikely to cause "pretend/fake memory" to be deallocated (and is even less likely to affect actual real physical RAM).
Of course all of this depends on the environment the software is compiled for, and it can be completely different; so as far as C is concerned (at the "C abstract machine" level) it's all undefined behavior (that might work like I've described, but may not); and even if it does work like I've described there's no guarantee that something you can't know about (e.g. a different thread buried in a shared library) won't allocate the same "pretend/fake memory that was carved up by malloc()" immediately after you free it and won't overwrite the data you left behind.
why am I able to access it?
This is partly because C isn't a managed (or "safe") language - for performance reasons; typically there are no checks for "array index out of bounds" and no checks for "used after it was freed". Instead, bugs cause undefined behavior (and may be critical security vulnerabilities).
int arr[5] = {0}; // these 5 integers are kept on the stack of the function
int *ptr = &arr[0]; // the pointer ptr is also on the stack and points to the address of arr[0]
ptr = malloc(5 * sizeof(int)); // malloc creates heap of size 5 * sizeof int and returns a ptr which points to it
// the ptr now points to the heap and not to the arr[] any more.
arr[9] = 15; //the array is of length 5 and arr[9] is out of the border of maximum arr[4] !
ptr = realloc(ptr, 1 * sizeof(int)); //does nothing here, since the allocated size is already larger than 1 - but it depends on implementation if the rest of 4 x integer will be free'd.
for (int i = 0; i < 10; ++i) // undefined behavior!
{
printf("%d\n", arr[i]);
}
free(ptr);
return 0;
In short:
Whatever you do with/to a copy of the address of an array inside a pointer variable, it has no influence on the array.
The address copy creates no relation whatsoever between the array and memory allocated (and referenced by the pointer) by a later malloc.
The allocation will not be right after the array.
Calling realloc on a pointer that holds a copy of an array's address does not work. realloc only works with pointers returned by a successful malloc, calloc, or realloc (or with a null pointer). (Which is probably why you inserted the malloc.)
Longer:
Here are some important facts on your code, see my comments:
#include <stdlib.h>
#include <stdio.h>
int main()
{
int arr[5] = {0}; /* size 5 ints, nothing will change that */
int *ptr = &arr[0]; /* this value will be ignored in the next line */
ptr = malloc(5 * sizeof(int)); /* overwrite value from previous line */
arr[9] = 15; /* arr still has only 5 elements; this writes beyond the array */
ptr = realloc(ptr, 1 * sizeof(int)); /* irrelevant, no influence on arr */
for (int i = 0; i < 10; ++i) /* 10 is larger than 5 ... */
{
printf("%d\n", arr[i]); /* from i == 5 on, this reads beyond the array */
}
free(ptr);
return 0;
}
Now let us discuss your description:
In the example below I have allocated 20 bytes of memory ....
True, in the line ptr = malloc(5 * sizeof(int)); (assuming that an int has 4 bytes; not guaranteed, but let's assume it).
... to extend an array by 5 integers.
No. No attribute of the array is affected by this line. Especially not the size.
Note that with the malloc, the line int *ptr = &arr[0]; is almost completely ignored. Only the part int *ptr; remains relevant. The malloc determines the value in the pointer and there is no relation to the array whatsoever.
After that I have set the last element to 15 ...
No, you access memory beyond the array. The last usable array element is arr[4]; no code until now has changed that. Judging from the output, which still contains "15", you got "lucky": the value has not killed anything and is still in memory. But it is practically unrelated to the array and is also practically guaranteed to be outside of the allocated memory referenced by ptr.
... and then reallocated the pointer to 4 bytes (1 integer).
True. But I do not really get the point you try to make.
Then I print the first 10 elements of the array ...
No, you print the first 5 elements of the array, i.e. all of them.
Then you print 4 values which happen to be inside memory which you should not access at all. Afterwards you print a fifth value outside of the array, which you also should not access, but which happens to still be the 15 you wrote there earlier (and should not have written in the first place either).
... (it only consists of 6 at this point) ...
You probably mean 5 values from the array and 1 from ptr, but they are unrelated and unlikely to be consecutive.
... and the 9th (which I've previously set to 15) is printed without warnings or errors.
There is no 9th, see above. Concerning the lack of errors, well, you are not always lucky enough to be told by the compiler or the runtime that you made a mistake. Life would be so much easier if they could reliably notify you of all mistakes.
Let us go on with your comments:
But isn't arr[9] part of the defined heap?
No. I am not sure what you mean by "the defined heap", but it is surely neither part of the array nor the allocated memory referenced by the pointer. The chance that the allocation is right after the array is as close to zero as it gets - maybe not precisely 0, but you simply are not allowed to assume that.
I have allocated 20 bytes, ...
That holds on many current machines, but assuming that an int has four bytes is also not a safe assumption. However, yes, let's assume that 5 ints have 20 bytes.
... so arr should now consist of 10 integers, instead of 5.
Again no, whatever you do via ptr, it has no influence on the array and there is practically no chance that the ptr-referenced memory is right after the array by chance. It seems that you assume that copying the address of the array into the pointer has an influence on the array. That is not the case. The pointer once held a copy of the array's address, but even that was overwritten one line later. And even if it had not been overwritten, reallocating through ptr would be an error (that is why you inserted the malloc line, isn't it?) and would still not have any effect on the array or its size.
... But I don't think I am passing the barrier of the defined heap.
Again, let's assume that by "the defined heap" you mean either the array or the allocated memory referenced by ptr. Neither can be assumed to contain the arr[9] you access. So yes, you ARE accessing outside of any memory you are allowed to access.
I shouldn't be able to access arr[9], right?
Yes and no. Yes, you are not allowed to do that (with or without the realloc to 1).
No, you cannot expect to get any helpful error message.
Let's look at your comment to another answer:
My teacher in school told me that using realloc() with a smaller size than the already allocated memory frees it until it becomes n bytes.
Not wrong. It is freed, which means you are not allowed to use it anymore.
It is also theoretically freed so that it could be used by the next malloc. That does not mean, however, that the next malloc will use it. Freeing memory never implies any change to the content of that freed memory. The content definitely could change, but you cannot expect it or rely on it. Tom Kuschel's answer to this comment is also right.
I am new to C and I have a question about malloc. Here is the code:
int *array = malloc(3 * sizeof(int));
if (array != NULL) {
printf("success \n");
}
array[0] = 1;
array[1] = 1;
array[2] = 1;
array[3] = 2; // I assume this should fail ?
array[4] = 1; // I assume this should fail ?
printf(" %d \n", array[3]);
Does it mean the malloc is only a memory allocation hint but not upper limit ? If yes, how do I enforce the upper limit in C ?
C doesn't mandate any bounds checking - the behavior on writing past the end of the array is undefined. Depending on what you overwrite, your code may crash immediately, or it may corrupt other data, or it may work as expected.
Neither the compiler nor the runtime environment are required to issue any warning or throw any exception on writing past the end of the array. You are expected to simply Not Do That.
In this context of memory manipulation, it is important for you to understand what it is you're actually doing when you are declaring this dynamic memory allocation.
Your "array" pointer is located on the stack and points to the first item (the first 4 bytes, assuming a 4-byte int) of a contiguous block of memory (12 bytes, i.e. 3 x sizeof(int)) which is located ON THE HEAP.
When you access this heap memory through the [N] operator, you're asking the compiler to perform pointer arithmetic: array[N] means *(array + N), which refers to the data at the byte address of array plus N x sizeof(int) ON THE HEAP.
So when you access array[3], you are touching a 4-byte memory location directly after the last element you allocated for your array. That location could hold many things, including your own data from a previous allocation!
As the other responses to this post have stated, this is undefined behaviour, and should be avoided, as you cannot control what data you manipulate when you access "out of bounds" data.
I have this code segment:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int ** ar;
int i;
ar = malloc( 2 * sizeof(int*));
for(i=0; i<2; i++)
ar[i] = malloc ( 3 * sizeof(int) );
ar[0][0]=1;
ar[0][1]=2;
ar[0][2]=5;
ar[1][0]=3;
ar[1][1]=4;
ar[1][2]=6;
for(i=0; i<2; i++)
free(ar[i]);
free(ar);
printf("%d" , ar[1][2]);
return 0;
}
I went through some threads on this topic
(how to free c 2d array)
but they are quite old and no one is active.
I had the following queries with respect to the code:
Is this the correct way to free memory for a 2D array in C?
If this is the correct way then why am I still getting the corresponding array value when I try to print ? Does this mean that memory is not getting freed properly ?
What happens to the memory when it gets freed? Do all values which I have stored get erased or they just stay there in the memory which is waiting for reallocation?
Is this undefined behaviour expected?
Yes you have two levels or layers (so to speak) of memory to free.
The inner memory allocations (I like how you do those first)
The outer memory allocation for the topmost int** pointer.
Even after you freed the memory, nothing was done to overwrite it (so yes, it's expected). That is why you can still print the values to the console. Reading freed memory is still undefined behavior, though, so you cannot rely on it. It's a good idea to always NULL your pointers after you are done with them. Kind of the polite thing to do. I've fixed many bugs and crashes in the past because the code did not null the pointers after freeing them.
In Microsoft's Visual Studio, the Debug C runtime can overwrite newly freed memory with a garbage pattern that will quickly raise an access violation if the stale values are used or dereferenced. That's useful for flushing out bugs.
It looks like you are new to C (Student?). Welcome and have a fun time.
Trying to allocate a char array of N elements.
#include <stdio.h>
#include <malloc.h>
int main()
{
int N = 2;
char *array = malloc(N * sizeof(char));
array[0] = 'a';
array[1] = 'b';
array[2] = 'c'; // why can i do that??
printf("%c", array[0]);
printf("%c", array[1]);
printf("%c", array[2]); //shouldn't I get a seg fault here??
return 0;
}
The question is:
Since I am allocating 2 * 1 = 2 bytes of memory, that means I can have 2 chars in my array. How is it possible that I have more? I also printed sizeof(*array) and it prints 8 bytes. What am I missing here?
A segmentation fault occurs when a program tries to access a memory address which has not been mapped by the operating system into its virtual memory address space.
Memory allocation occurs in pages (usually 4k or 8k, but you can get larger pages too). So the malloc() call gets a memory page from the OS and carves off a piece of it for the array and returns a pointer to that. In this specific case, there is still a large piece of the page remaining after your array (unallocated but already available for use with subsequent calls to malloc()) - array[2] references a valid address within the page, so no segmentation fault.
However, you are accessing memory beyond the array and, as mentioned in the comments, that is undefined behaviour and would probably cause memory corruption in a larger program by overwriting the value of unrelated variables.
Elements 0 and 1 are inside the valid memory allocation. With element 2 you have trespassed into memory that was not allocated to you. It will appear to work fine until that part of the memory gets allocated for something else; then element 2 will start having crazy values and your code will go nuts. As @jon pointed out, a compiler may warn about this if you have not silenced its warnings, but it is not required to.
I'm in the process of learning how to use pointers and structs in C. Naturally, I'm trying to deliberately break my code to further understand how the language works. Here is some test code that works as I expected it to work:
#include <stdio.h>
#include <stdlib.h>
struct pair {
int x;
int y;
};
typedef struct pair pair;
void p_struct( pair ); //prototype
int main( int argc, char** argv ) {
pair *s_pair;
int size, i;
printf( "Enter the number of pair to make: " );
scanf( "%d", &size );
getchar();
printf( "\n" );
s_pair = (pair*)malloc( size * sizeof(pair) );
for( i = 0; i < size; i++ ) {
s_pair[i].x = i;
s_pair[i].y = i;
p_struct( s_pair[i] );
}
getchar();
return (EXIT_SUCCESS);
}
void p_struct( pair s_pair ) {
printf( "\n%d %d\n", s_pair.x, s_pair.y );
}
As previously stated, this code is functional as far as I can tell.
I then decided to modify a part of the code like so:
for( i = 0; i < size + 3; i++ ) {
s_pair[i].x = i;
s_pair[i].y = i;
p_struct( s_pair[i] );
}
This modification did not produce the seg fault error that I expected it would. All of the "pairs" were printed despite me exceeding the buffer I explicitly set when assigning a value to my variable size using the scanf function.
As I understand pointers (correct me if I'm wrong), a contiguous block of memory of size size*sizeof(pair) is reserved by the memory manager in the heap when I called the malloc function for my pointer of type pair s_pair. What I did was I exceeded the last assigned address of memory when I modified my for loop to the condition i < size + 3.
If I'm understanding this correctly, did my pointer exceed its reserved memory limit and just so happen to be in the clear because nothing adjacent and to the right of it was occupied by other data? Is this normal behaviour when overflowing a buffer?
To add, I did receive a seg fault when I tested with a for loop condition of i < size + 15. The thing is, it still prints the output. As in, it prints the pair "0 0" to pair "24 24" when size = 10 on the screen as per the p_struct function I made. The program crashes by seg fault only after it gets to one of those getchar()s at the bottom. How on earth could my program assign values to pairs that exceed the buffer, print them on the screen, and then all of a sudden decide to crash on seg fault when it gets to getchar()? It seemed to have no issue with i < size + 3 (despite it still being wrong).
For the record, I also tested this behaviour with a regular pointer array:
int size, i, *ptr;
scanf( "%d", &size );
ptr = (int*)malloc( size * sizeof(int) );
for( i = 0; i < size + 15; i++ )
ptr[i] = i;
This produces the exact same result as above. At i < size + 3 there doesn't seem to be any issue with seg faults.
Finally, I tested with an array, too:
int i, array[10];
for( i = 0; i < 25; i++ )
array[i] = i;
For the condition i < 25, I get a seg fault without fail. When I change it to i < 15, I receive no seg fault.
If I remember correctly, the only difference between an array of pointers and an array is that the memory allocated to an array is located on the stack as opposed to the heap (not sure about this). With that in mind, and considering the fact that i < 15 when array[10] doesn't produce any seg faults, why would i < 25 be an issue? Isn't the array at the top of the stack during that for loop? Why would it care about 100 extra bytes when it didn't care about 60 extra bytes? Why isn't the ceiling for that array buffer all the way to the end of whatever arbitrary chunk of memory is reserved for the whole stack?
Hopefully all of this made sense to whoever decides to read a slightly inebriated man's ramblings.
If I'm understanding this correctly, did my pointer exceed its reserved memory limit and just so happen to be in the clear because nothing adjacent and to the right of it was occupied by other data?
Pretty much. Except that you're not "in the clear" because adjacent things probably were occupied by other data and your code simply stomped on that memory and changed the values. You might never notice a problem, or you might notice a problem much later. Either way, it's undefined behaviour.
Welcome to the glorious world of C!
The memory allocation functions (malloc, calloc, realloc, etc) give you memory that's on the heap. When you call one of them and your program doesn't have enough space, it makes a system call to get more. It doesn't do this in precise increments though (it often will do so in some number of whole page increments). When you're indexing past the end of your array (or even before the beginning of it) you are still within the bounds of your program's legal address space. Only when you leave the segment your program owns will you get a Segmentation Violation.
I highly recommend using Valgrind to inspect your program, especially if you are deliberately trying to learn about memory by breaking things. Among other things, its Memcheck tool surrounds heap allocations with red zones and tracks which bytes are addressable, which helps you figure out when you're accessing out of bounds, and it warns you about double frees and memory leaks.
When you call malloc, you might be given more memory than you asked for, because allocators typically hand out memory in multiples of a common block size. If the block size is 64 bytes and you ask for only 10 bytes, the allocator may reserve 64 bytes for you, so accesses slightly beyond your requested range may not fault, which is the behavior your program is observing. Relying on this is still undefined behavior.
As others said, undefined behaviour doesn't mean your program will crash under all circumstances.
It completely depends on what is supposed to be at the location where you overwrite the data:
There may be nothing, because the C library hasn't handed out that memory to anything yet,
you may have overwritten important administrative information which is used later and only then leads to a crash,
or whatever else.
To help you understand what really happens under the hood, printing addresses (such as printf("%p\n", (void *)s_pair); or anything like that) may be helpful, as well as compiling the program to readable assembly (such as gcc -S filename.c -o-).