I'm in the process of learning how to use pointers and structs in C. Naturally, I'm trying to deliberately break my code to further understand how the language works. Here is some test code that works as I expected it to work:
#include <stdio.h>
#include <stdlib.h>
struct pair {
int x;
int y;
};
typedef struct pair pair;
void p_struct( pair ); //prototype
int main( int argc, char** argv ) {
pair *s_pair;
int size, i;
printf( "Enter the number of pair to make: " );
scanf( "%d", &size );
getchar();
printf( "\n" );
s_pair = (pair*)malloc( size * sizeof(pair) );
for( i = 0; i < size; i++ ) {
s_pair[i].x = i;
s_pair[i].y = i;
p_struct( s_pair[i] );
}
getchar();
return (EXIT_SUCCESS);
}
void p_struct( pair s_pair ) {
printf( "\n%d %d\n", s_pair.x, s_pair.y );
}
As previously stated, this code is functional as far as I can tell.
I then decided to modify a part of the code like so:
for( i = 0; i < size + 3; i++ ) {
s_pair[i].x = i;
s_pair[i].y = i;
p_struct( s_pair[i] );
}
This modification did not produce the seg fault error that I expected it would. All of the "pairs" were printed despite me exceeding the buffer I explicitly set when assigning a value to my variable size using the scanf function.
As I understand pointers (correct me if I'm wrong), a contiguous block of memory of size size*sizeof(pair) is reserved by the memory manager in the heap when I called the malloc function for my pointer of type pair s_pair. What I did was I exceeded the last assigned address of memory when I modified my for loop to the condition i < size + 3.
If I'm understanding this correctly, did my pointer exceed its reserved memory limit and just so happen to be in the clear because nothing adjacent and to the right of it was occupied by other data? Is this normal behaviour when overflowing a buffer?
To add, I did receive a seg fault when I tested with a for loop condition of i < size + 15. The thing is, it still prints the output. As in, it prints the pair "0 0" to pair "24 24" when size = 10 on the screen as per the p_struct function I made. The program crashes by seg fault only after it gets to one of those getchar()s at the bottom. How on earth could my program assign values to pairs that exceed the buffer, print them on the screen, and then all of a sudden decide to crash on seg fault when it gets to getchar()? It seemed to have no issue with i < size + 3 (despite it still being wrong).
For the record, I also tested this behaviour with a regular pointer array:
int size, i, *ptr;
scanf( "%d", &size );
ptr = (int*)malloc( size * sizeof(int) );
for( i = 0; i < size + 15; i++ )
ptr[i] = i;
This produces the exact same result as above. At i < size + 3 there doesn't seem to be any issue with seg faults.
Finally, I tested with an array, too:
int i, array[10];
for( i = 0; i < 25; i++ )
array[i] = i;
For the condition i < 25, I get a seg fault without fail. When I change it to i < 15, I receive no seg fault.
If I remember correctly, the only difference between an array of pointers and an array is that the memory allocated to an array is located on the stack as opposed to the heap (not sure about this). With that in mind, and considering the fact that i < 15 when array[10] doesn't produce any seg faults, why would i < 25 be an issue? Isn't the array at the top of the stack during that for loop? Why would it care about 100 extra bytes when it didn't care about 60 extra bytes? Why isn't the ceiling for that array buffer all the way to the end of whatever arbitrary chunk of memory is reserved for the whole stack?
Hopefully all of this made sense to whoever decides to read a slightly inebriated man's ramblings.
If I'm understanding this correctly, did my pointer exceed its reserved memory limit and just so happen to be in the clear because nothing adjacent and to the right of it was occupied by other data?
Pretty much. Except that you're not "in the clear" because adjacent things probably were occupied by other data and your code simply stomped on that memory and changed the values. You might never notice a problem, or you might notice a problem much later. Either way, it's undefined behaviour.
Welcome to the glorious world of C!
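Since C will not stop the loop from walking past the allocation, one hedged way to stay inside it is to pass the element count along with the pointer and check every index against it. The helper names `pair_at` and `fill_pairs` below are invented for illustration; this is a minimal sketch, not the only way to do it.

```c
#include <stddef.h>

typedef struct pair {
    int x;
    int y;
} pair;

/* Hypothetical helper: returns a pointer to element i, or NULL when i
   lies outside the count elements that were actually allocated. */
pair *pair_at(pair *arr, size_t count, size_t i)
{
    return (arr != NULL && i < count) ? &arr[i] : NULL;
}

/* Hypothetical helper: tries to fill `wanted` elements but never writes
   past `count`. Returns how many elements were really written. */
size_t fill_pairs(pair *arr, size_t count, size_t wanted)
{
    size_t written = 0;
    for (size_t i = 0; i < wanted; i++) {
        pair *p = pair_at(arr, count, i);
        if (p == NULL)
            break;              /* index is out of bounds: stop here */
        p->x = (int)i;
        p->y = (int)i;
        written++;
    }
    return written;
}
```

With an allocation of 10 pairs, asking `fill_pairs` for 13 elements writes only the first 10 and reports that number back, instead of silently stomping on whatever follows the block.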
The memory allocation functions (malloc, calloc, realloc, etc) give you memory that's on the heap. When you call one of them and your program doesn't have enough space, it makes a system call to get more. It doesn't do this in precise increments though (it often will do so in some number of whole page increments). When you're indexing past the end of your array (or even before the beginning of it) you are still within the bounds of your program's legal address space. Only when you leave the segment your program owns will you get a Segmentation Violation.
I highly recommend using Valgrind to inspect your program, especially if you are deliberately trying to learn about memory by breaking things. Among other things, it will store canary values on either side of allocations to help you figure out when you're accessing out of bounds and warn you about double frees and memory leaks.
When you call malloc, you might be given more memory than you asked for, because allocators typically hand out memory in multiples of some block size. If the block size is 64 bytes and you ask for only 10 bytes, the allocator may still set aside 64 bytes, so accesses just beyond your requested range can appear to work, which is the behavior your program is showing. Relying on that is still undefined behaviour.
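The rounding described above can be observed on some systems. The sketch below assumes glibc (or another C library that provides the non-standard `malloc_usable_size` in `<malloc.h>`); the function name `usable_bytes` is made up for this example. The number it reports is an implementation detail and will differ between platforms.

```c
#include <malloc.h>   /* malloc_usable_size: glibc extension, not standard C */
#include <stdlib.h>

/* Returns the number of bytes the allocator reports as usable for a block
   obtained with malloc(request), or 0 if the allocation failed. */
size_t usable_bytes(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Calling `usable_bytes(10)` will typically return something larger than 10. Portable code should still treat the requested size as the limit.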
As others said, undefined behaviour doesn't mean your program will crash under all circumstances.
It depends entirely on what lives at the place you overwrite:
There may be nothing there, because the C library hasn't handed that memory out to anything yet.
You may have overwritten important bookkeeping information that is used later, and only then leads to a crash.
Or something else entirely.
To understand what really happens under the hood, it may help to print addresses (for example printf("%p\n", (void *)s_pair); or anything like that) and to compile the program to readable assembler mnemonics (for example gcc -S filename.c -o-).
Related
I have this code segment:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int ** ar;
int i;
ar = malloc( 2 * sizeof(int*));
for(i=0; i<2; i++)
ar[i] = malloc ( 3 * sizeof(int) );
ar[0][0]=1;
ar[0][1]=2;
ar[0][2]=5;
ar[1][0]=3;
ar[1][1]=4;
ar[1][2]=6;
for(i=0; i<2; i++)
free(ar[i]);
free(ar);
printf("%d" , ar[1][2]);
return 0;
}
I went through some threads on this topic
(how to free c 2d array)
but they are quite old and no one is active.
I had the following queries with respect to the code:
Is this the correct way to free memory for a 2D array in C?
If this is the correct way then why am I still getting the corresponding array value when I try to print ? Does this mean that memory is not getting freed properly ?
What happens to the memory when it gets freed? Do all values which I have stored get erased or they just stay there in the memory which is waiting for reallocation?
Is this undefined behaviour expected?
Yes you have two levels or layers (so to speak) of memory to free.
The inner memory allocations (I like how you do those first)
The outer memory allocation for the topmost int** pointer.
Even after you freed the memory, nothing was done to overwrite it, so yes, seeing the old value is expected. That is why you can still print it to the console. Reading freed memory is still undefined behaviour, though, so don't rely on it. It's a good idea to always set your pointers to NULL after you are done with them. Kind of the polite thing to do. I've fixed many bugs and crashes in the past because the code did not null the pointers after freeing them.
In Microsoft's Visual Studio, the Debug C runtime can overwrite newly freed memory with a garbage pattern that will immediately raise an access violation if it is used as a pointer and dereferenced. That's useful for flushing out bugs.
It looks like you are new to C (Student?). Welcome and have a fun time.
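The two-level free plus the "NULL your pointers" advice above can be folded into one small helper. This is only a sketch: `free_2d` is a hypothetical name, and it assumes the table was built the way the question does it, with one malloc per row.

```c
#include <stdlib.h>

/* Hypothetical helper: frees a table that was built with one malloc per
   row, then sets the caller's pointer to NULL. */
void free_2d(int ***table, size_t rows)
{
    if (table == NULL || *table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free((*table)[i]);      /* inner allocations first */
    free(*table);               /* then the outer array of row pointers */
    *table = NULL;              /* stale uses can now be caught with a NULL check */
}
```

In the question's program this would be called as `free_2d(&ar, 2);`. Afterwards `ar` is NULL, so a later `ar[1][2]` is an obvious bug rather than a read that happens to return the old value.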
I am a beginner with C, and I am wondering how malloc works.
Here is some sample code I wrote while trying to understand how it works.
CODE:
#include<stdio.h>
#include<stdlib.h>
int main() {
int i;
int *array = malloc(sizeof *array);
for (i = 0; i < 5; i++) {
array[i] = i+1;
}
printf("\nArray is: \n");
for (i = 0; i < 5; i++) {
printf("%d ", array[i]);
}
free(array);
return 0;
}
OUTPUT:
Array is:
1 2 3 4 5
In the program above, I have only allocated space for 1 element, but the array now holds 5 elements. Since the program runs smoothly without any error, what is the purpose of realloc()?
Could anybody explain why?
Thanks in advance.
The fact that the program runs smoothly does not mean it is correct!
Try increasing the 5 in the for loops to something much larger (500000, for instance, should suffice). At some point it will stop working and give you a segfault.
This is called Undefined Behaviour.
valgrind would also warn you about the issue with something like the following.
==16812== Invalid write of size 4
==16812== at 0x40065E: main (test.cpp:27)
If you don't know what valgrind is, check this out: How do I use valgrind to find memory leaks? (BTW, it's a fantastic tool.)
This should give you some more clarification: Accessing unallocated memory C++
This is typical undefined behavior (UB).
You are not allowed to code like that. As a beginner, think of it as a mistake, a fault, a sin, something very dirty, etc.
Could anybody explain why?
If you need to understand what is really happening (and the details are complex) you need to dive into your implementation details (and you don't want to). For example, on Linux, you could study the source code of your C standard library, of the kernel, of the compiler, etc. And you need to understand the machine code generated by the compiler (so with GCC compile with gcc -S -O1 -fverbose-asm to get an .s assembler file).
See also this (which has more references).
Read Lattner's blog post What Every C Programmer Should Know About Undefined Behavior as soon as possible. Everyone should read it!
The worst thing about UB is that sadly, sometimes, it appears to "work" like you want it to (but in fact it does not).
So learn as quickly as possible to avoid UB systematically.
BTW, enabling all warnings in the compiler might help (but perhaps not in your particular case). Get into the habit of compiling with gcc -Wall -Wextra -g if using GCC.
Notice that your program doesn't have any arrays. The array variable is a pointer (not an array), so it is very badly named. You need to read more about pointers and C dynamic memory allocation.
int *array = malloc(sizeof *array); //WRONG
is very wrong. The name array is very poorly chosen (it is a pointer, not an array; you should spend days reading about the difference, and about what "arrays decay into pointers" means). You allocate sizeof(*array) bytes, which is exactly the same as sizeof(int) (generally 4 bytes, at least on my machine). So you allocate space for only one int element. Any access beyond that (i.e. with any positive index, however small, e.g. array[1] or array[i] with some positive i) is undefined behavior. And you don't even test against failure of malloc (which can happen).
If you want to allocate memory space for (let's say) 8 int-s, you should use:
int* ptr = malloc(sizeof(int) * 8);
and of course you should check against failure, at least:
if (!ptr) { perror("malloc"); exit(EXIT_FAILURE); }
and you need to initialize that array (the memory you get contains unpredictable junk), e.g.
for (int i=0; i<8; i++) ptr[i] = 0;
or you could clear all bits (with the same result on all machines I know of) using
memset(ptr, 0, sizeof(int)*8);
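As an alternative to malloc followed by a loop or memset, calloc (used elsewhere in these threads) allocates and zero-fills in one call. A minimal sketch, with `make_zeroed_ints` as a made-up helper name:

```c
#include <stdlib.h>

/* Hypothetical helper: returns n ints whose bits are all zero,
   or NULL if the allocation fails. */
int *make_zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}
```

The caller still has to check the result for NULL and eventually pass it to free, exactly as with malloc.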
Notice that after such a malloc, whether it succeeded or failed, sizeof(ptr) is always the same (on my Linux/x86-64 box, it is 8 bytes), since it is the size of a pointer (even if you malloc-ed a memory zone for a million int-s).
In practice, when you use C dynamic memory allocation you need to keep track of the allocated size behind that pointer yourself, by convention. In the code above, I used 8 in several places, which is poor style. It would have been better to at least
#define MY_ARRAY_LENGTH 8
and use MY_ARRAY_LENGTH everywhere instead of 8, starting with
int* ptr = malloc(MY_ARRAY_LENGTH*sizeof(int));
In practice, allocated memory often has a size defined at run time, and you would keep that size somewhere (in a variable, a parameter, etc.).
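One common way to keep the size next to the pointer is to put both in a struct. The type and function names below (`int_buffer`, `int_buffer_init`, `int_buffer_set`, `int_buffer_free`) are hypothetical, chosen only for this sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical type: a heap buffer that remembers its own length. */
struct int_buffer {
    int    *data;
    size_t  len;
};

/* Allocates len zeroed ints; returns false (and an empty buffer) on failure. */
bool int_buffer_init(struct int_buffer *b, size_t len)
{
    b->data = calloc(len, sizeof *b->data);
    b->len  = (b->data != NULL) ? len : 0;
    return b->data != NULL;
}

/* Stores value at index i only if i is inside the allocation. */
bool int_buffer_set(struct int_buffer *b, size_t i, int value)
{
    if (i >= b->len)
        return false;
    b->data[i] = value;
    return true;
}

/* Releases the memory and resets the buffer to the empty state. */
void int_buffer_free(struct int_buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->len  = 0;
}
```

Because every write goes through `int_buffer_set`, an index of 8 into an 8-element buffer is rejected instead of becoming undefined behavior.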
Study the source code of some existing free software project (e.g. on GitHub); you'll learn very useful things.
Also read (perhaps in a week or two) about flexible array members. Sometimes they are very useful.
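For a first taste of flexible array members, here is a small sketch in which the length and the elements share a single allocation. `int_vec` and `int_vec_new` are invented names, and the size computation does not guard against overflow for enormous lengths.

```c
#include <stdlib.h>

/* Hypothetical type: the length and the elements live in one allocation. */
struct int_vec {
    size_t len;
    int    items[];     /* flexible array member (C99), must come last */
};

/* Allocates a vector of len zeroed ints, or returns NULL on failure. */
struct int_vec *int_vec_new(size_t len)
{
    struct int_vec *v = calloc(1, sizeof *v + len * sizeof v->items[0]);
    if (v != NULL)
        v->len = len;
    return v;
}
```

A single `free(v)` releases both the header and the elements, and `v->len` travels with the data wherever the pointer goes.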
So as the programs runs smoothly without any error
That's just because you were lucky. Keep running this program and you might segfault soon. You were relying on undefined behaviour (UB), which is always A Bad Thing™.
What is the purpose of realloc()?
From the man pages:
void *realloc(void *ptr, size_t size);
The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), calloc() or realloc(). If the area pointed to was moved, a free(ptr) is done.
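So the purpose of realloc is to grow (or shrink) a block legitimately instead of writing past its end. A hedged sketch of the usual pattern follows; `resize_ints` is a hypothetical helper name, and the key point is to assign the result to a temporary so the old block is not lost if realloc fails.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *arr to hold new_count ints.
   Returns 0 on success; on failure returns -1 and leaves *arr untouched. */
int resize_ints(int **arr, size_t new_count)
{
    if (new_count == 0 || new_count > (size_t)-1 / sizeof **arr)
        return -1;                       /* avoid size-0 and overflow cases */
    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;                       /* old block is still valid */
    *arr = tmp;
    return 0;
}
```

In the question's program, calling `resize_ints(&array, 5)` before the loop would make the five writes legal, and the first element's value would be preserved across the resize.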
I'm trying to create a dynamic array of 1000 character long strings using calloc:
int i;
char** strarr =(char**)calloc(argc,sizeof(char)*1000);
if(strarr == NULL)
return 0;
strarr[0][0] ='a';
printf("%c\n",strarr[0][0]);
Every time I try to run this code I get a segmentation fault on the printf line, and I don't understand why this happens (you can assume that argc is greater than 0).
Thanks
P.S. I'm sorry that the code is in text format, but I'm on mobile so I don't have the code feature.
Try this:
const int num_of_strings = 255; //argc ?
const int num_of_chars = 1000;
int i;
char** strarr =(char**)malloc(sizeof(char*)*num_of_strings);
if(strarr == NULL)
return 0;
for (i = 0; i < num_of_strings; i++) strarr[i] = (char*)malloc(sizeof(char)*num_of_chars);
Hello, and welcome to the world of undefined behaviour, one of the darkest territories of the C language. Your code has several problems that cause undefined behaviour, but execution carries on until you reach the printf line, where you access memory you have not allocated. That is finally caught by your system, and a segmentation fault is produced.
But I think it would be better to walk through your code.
The variable i, declared in the int i; line, is not used anywhere in the code you posted, but I guess you need it later.
The first piece of code that is not right is the second line, where you declare an array of strings, i.e. a char**. That means you have a pointer to pointers to chars. So what you really want to do there is allocate memory for those pointers, not for the chars they will point to. Note that a char usually occupies a different amount of memory than a char*. This line is, thus, the one to go with.
char** strarr = (char**) calloc(argc, sizeof(char*));
This allocates argc blocks of memory, each typically 4 or 8 bytes in size, depending on whether your system is 32-bit or 64-bit.
You are doing a very good job of checking whether the calloc function returned NULL or not, which is a very good practice overall.
Next, you will want to allocate memory for the strings themselves, which the pointers you allocated in the previous line will point to. These lines will do it.
for (int i = 0; i < argc; i++) {
strarr[i] = (char*) calloc(1000, sizeof(char));
}
This now allocates a 1000-character string for every element of our argc-sized string array.
After that, you can continue with your code as it is, and I think no errors will be produced. Please accept one more piece of advice from me: learn to love valgrind. It is a very helpful program that you can run your code under in order to analyse its memory use. It is my first step whenever I get a segmentation fault.
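Putting the two allocation steps together, with cleanup if any row fails, might look like the sketch below. `alloc_strings` and `free_strings` are hypothetical helper names, not library functions.

```c
#include <stdlib.h>

/* Hypothetical helper: frees count strings and the array holding them. */
void free_strings(char **strs, size_t count)
{
    if (strs == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(strs[i]);          /* free(NULL) is a harmless no-op */
    free(strs);
}

/* Hypothetical helper: allocates count zero-filled strings of len bytes each.
   Returns NULL, with nothing leaked, if any allocation fails. */
char **alloc_strings(size_t count, size_t len)
{
    char **strs = calloc(count, sizeof *strs);
    if (strs == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        strs[i] = calloc(len, sizeof **strs);
        if (strs[i] == NULL) {
            free_strings(strs, i);   /* release the rows made so far */
            return NULL;
        }
    }
    return strs;
}
```

With `char **strarr = alloc_strings(argc, 1000);` and a NULL check, the assignment `strarr[0][0] = 'a';` from the question writes into memory that really belongs to the program.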
The following code, when tested, gives this output:
1
0
0
2
0
which is surprising, because ptr[3] and ptr[4] were never allocated, yet values are stored in them and printed. I tried the same code with somewhat larger values of i in ptr[i]; it again compiled and produced a result, but for very large values of i, on the order of 100000, the program crashes. If calloc() allocates such a large amount of memory on a single call, that does not seem efficient. So how does calloc() work? Where does this discrepancy come from?
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int * ptr = (int *)calloc(3,sizeof(int));//allocates memory to 3 integer
int i = 0;
*ptr = 1;
*(ptr+3) = 2;//although memory is not allocated but get initialized
for( i =0 ; i<5 ; ++i){
printf("%d\n",*ptr++);
}
}
After that I tried this code, which runs forever without any output:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int * ptr = (int *)calloc(3,sizeof(int));
int i = 0;
*ptr = 1;
*(ptr+3) = 2;
//free(ptr+2);
for( ; ptr!=NULL ;)
{
//printf("%d\n",*ptr++);
i++;
}
printf("%d",i);
}
You are confounding two kinds of memory allocation: the kernel's and libc's.
When you ask malloc or calloc for 5 bytes, it does not turn around and ask the kernel for 5 bytes. That would take forever. Instead, the libc heap system obtains larger blocks of memory from the kernel, and subdivides it.
Therefore, when you ask for a small amount of memory, there is usually plenty of more accessible memory right after it that has not been allocated yet by libc, and you can access it. Of course, accessing it is an instant recipe for bugs.
The only time that referencing off the end will get a SIGnal is if you happen to be at the very end of the region acquired from the kernel.
I recommend that you try running your test case under valgrind for additional insight.
The code that you have has undefined behavior. However, you do not get a crash because malloc and calloc indeed often allocate more memory than you ask.
One way to tell how much memory you've got is to call realloc with increasing size, until the pointer that you get back is different from the original. Although the standard does not guarantee that this trick is going to work, very often it would produce a good result.
Here is how you can run this experiment:
int *ptr = calloc(1,sizeof(int));
// Prevent expansion of the original block
int *block = calloc(1, sizeof(int));
int *tmp;
int k = 1;
do {
tmp = realloc(ptr, k*sizeof(int));
k++;
} while (tmp == ptr);
printf("%d\n", k-1);
This prints 4 on my system and on ideone (demo on ideone). This means that when I requested 4 bytes (i.e. one sizeof(int)) from calloc, I got enough space for 16 bytes (i.e. 4*sizeof(int)). This does not mean that I can freely write up to four ints after requesting memory for a single int, though: writing past the boundary of the requested memory is still undefined behavior.
calloc is allocating memory for 3 int's in the given snippet. Actually you are accessing unallocated memory. Accessing unallocated memory invokes undefined behavior.
calloc guarantees only the amount of memory that you asked for, which in your case is space for 3 int variables,
but it doesn't put a bound on the pointer it returns (ptr in your case). So you can reach unallocated memory just by incrementing the pointer, and that's exactly what's happening in your case.
Here is the code that i want to understand:
#include <stdio.h>
#include <stdlib.h>
#define MAX 100
int main()
{
int *ptr = (int *)malloc(5 * sizeof(int)),i;
for(i=0;i<MAX;i++)
{
ptr[i] = i;
}
for(i=0;i<MAX;i++)
{
printf("%d\n",ptr[i]);
}
return 0;
}
My question: I allocated memory for 5 ints, so why does it accept more than 5 integers?
Thanks
You reserved space for 5 integers. For the other 95 integers, you're writing into space that is reserved for other purposes. Your program may or may not crash, but you should expect that it will fail one way or another.
It doesn't "take" more than 5 integers; you are just invoking undefined behavior. You can't expect the code to "succeed" even if you are seeing it work on your implementation.
It's not 'taking' more than 5 integers : you allocated 5 * sizeof(int) and invoke undefined behavior by accessing memory beyond this size.
There's no question as whether you should set MAX to 10, 1024 or 100000 : the code is fundamentally wrong, and the fact that it didn't fail when you ran it doesn't make it less wrong. Tools like valgrind may help you detect such mistakes.
You are allocating 5 integers; anything you write or read beyond that is incorrect.
OS protection boundaries are 1 page, which generally means 4 KB.
Even if you have allocated only 5 integers, the rest of the page is still unprotected. That is how buffer overflows and many program misbehaviors happen.
I am betting that if your MAX is set to 1025, you will get a seg fault (assuming this is your program).
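The page size mentioned above does not have to be guessed. On POSIX systems it can be queried at run time; the sketch below assumes such a system, and `page_size_bytes` is a made-up wrapper name.

```c
#include <unistd.h>   /* sysconf: POSIX, not standard C */

/* Returns the system page size in bytes, or -1 if it cannot be determined. */
long page_size_bytes(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

On many desktop systems this reports 4096, but other values exist, so the exact index at which an overrun finally faults is not something to count on.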
C doesn't perform bounds checking on arrays. If you have a 5-element array, C will happily let you assign to arr[5], arr[100], or even arr[-1].
If you're lucky, this will merely overwrite unused memory and your program will work anyway.
If you're unlucky, you'll overwrite other variables in your program, the metadata for malloc, or the OS, and Bad Things will happen. Get used to seeing the phrase "segmentation fault".