malloc vs array, cannot understand why this works [duplicate] - c

This question already has answers here:
Array index out of bound behavior
(10 answers)
Why don't I get a segmentation fault when I write beyond the end of an array?
(4 answers)
Closed 6 years ago.
I am trying to implement a simple sieve, and to help me I found the following code:
#include <stdio.h>
#include <stdlib.h>

void sieve(int *a, int n);

int main(int argc, char *argv[])
{
    int *array, n = 10;
    array = (int *)malloc(sizeof(int));
    sieve(array, n);
    return 0;
}

void sieve(int *a, int n)
{
    int i = 0, j = 0;
    for (i = 2; i <= n; i++) {
        a[i] = 1;
    }
    ...
For some reason this works, but I think it should not! The space that is allocated for the variable array is only enough to hold one integer, yet a[i] for i = 2...10 is written in the function sieve. Shouldn't this cause problems?
I tried to change the implementation to
int array[10], n = 10;
which caused "Abort trap: 6" at runtime. This I understand, since array[10] will be outside of the space allocated. But shouldn't the same be true for the code where malloc is used?
Truly confusing.

You are correct in some ways. For example this line:
array =(int *)malloc(sizeof(int));
only allocates space for one integer, not the eleven the sieve loop needs (it writes a[2] through a[10]). It should be:
array = (int *)malloc(sizeof(int) * (n + 1));
However, that does not mean the code will fail. At least not immediately. When you get back a pointer from malloc and then start writing beyond the bounds of what you have allocated, you might be corrupting something in memory.
Perhaps you are writing over a structure that malloc uses to keep track of what it was asked for, perhaps you are writing over someone else's memory allocation. Perhaps nothing at all - malloc usually allocates more than it is asked for in order to keep the chunks it gives out manageable.
If you want something to crash, you usually have to scribble beyond an operating system page boundary. If you are using Windows or Linux or whatever, the OS will give you (or malloc, in this case) memory in a set of blocks (pages), usually 4096 bytes each. If you scribble within that block, the operating system will not care. If you go outside it, you will cause a page fault and the operating system will usually destroy your process.
It was much more fun in the days of MS-DOS. This was not a "protected mode" operating system - it did not have hardware enforced page boundaries like Windows or Linux. Scribbling beyond your area could do anything!

Related

Working of malloc in C

I am a beginner with C, and I am wondering how malloc works.
Here is a sample program I wrote while trying to understand its behaviour.
CODE:
#include <stdio.h>
#include <stdlib.h>

int main() {
    int i;
    int *array = malloc(sizeof *array);
    for (i = 0; i < 5; i++) {
        array[i] = i + 1;
    }
    printf("\nArray is: \n");
    for (i = 0; i < 5; i++) {
        printf("%d ", array[i]);
    }
    free(array);
    return 0;
}
OUTPUT:
Array is:
1 2 3 4 5
In the program above, I have only allocated space for 1 element, but the array now holds 5 elements. Since the program runs smoothly without any error, what is the purpose of realloc()?
Could anybody explain why?
Thanks in advance.
The fact that the program runs smoothly does not mean it is correct!
Try increasing the 5 in the for loop (500000, for instance, should suffice). At some point it will stop working and give you a SEGFAULT.
This is called Undefined Behaviour.
valgrind would also warn you about the issue with something like the following.
==16812== Invalid write of size 4
==16812== at 0x40065E: main (test.cpp:27)
If you don't know what valgrind is, check this out: How do I use valgrind to find memory leaks?. (BTW, it's a fantastic tool.)
This should help give you some more clarification: Accessing unallocated memory C++
This is typical undefined behavior (UB).
You are not allowed to code like that. As a beginner, think of it as a mistake, a fault, a sin, something very dirty, etc.
Could anybody explain why?
If you need to understand what is really happening (and the details are complex), you need to dive into your implementation's details (and you probably don't want to). For example, on Linux, you could study the source code of your C standard library, of the kernel, of the compiler, etc. And you would need to understand the machine code generated by the compiler (so with GCC, compile with gcc -S -O1 -fverbose-asm to get an .s assembler file).
See also this (which has more references).
Read, as soon as possible, Lattner's blog post What Every C Programmer Should Know About Undefined Behavior. Everyone should have read it!
The worst thing about UB is that sadly, sometimes, it appears to "work" like you want it to (but in fact it does not).
So learn as quickly as possible to avoid UB systematically.
BTW, enabling all warnings in the compiler might help (but perhaps not in your particular case). Get in the habit of compiling with gcc -Wall -Wextra -g when using GCC.
Notice that your program doesn't have any arrays. The array variable is a pointer (not an array), so it is very badly named. You need to read more about pointers and C dynamic memory allocation.
int *array = malloc(sizeof *array); //WRONG
is very wrong. The name array is very poorly chosen (it is a pointer, not an array; you should spend some time reading about the difference, and about what "arrays decay into pointers" means). You allocate sizeof(*array) bytes, which is exactly the same as sizeof(int) (generally 4 bytes, at least on my machine). So you allocate space for only one int element. Any access beyond that (i.e. with any positive index, e.g. array[1] or array[i] with some positive i) is undefined behavior. And you don't even test for failure of malloc (which can happen).
If you want to allocate memory space for (let's say) 8 int-s, you should use:
int* ptr = malloc(sizeof(int) * 8);
and of course you should check against failure, at least:
if (!ptr) { perror("malloc"); exit(EXIT_FAILURE); }
and you need to initialize that array (the memory you've got contains unpredictable junk), e.g.
for (int i=0; i<8; i++) ptr[i] = 0;
or you could clear all bits (with the same result on all machines I know of) using
memset(ptr, 0, sizeof(int)*8);
Notice that even after such a successful malloc (or a failed one), sizeof(ptr) is always the same (on my Linux/x86-64 box, 8 bytes), since it is the size of a pointer (even if you malloc-ed a memory zone for a million ints).
In practice, when you use C dynamic memory allocation, you need to keep track of the allocated size yourself, by convention. In the code above, I used 8 in several places, which is poor style. It would have been better to at least
#define MY_ARRAY_LENGTH 8
and use MY_ARRAY_LENGTH everywhere instead of 8, starting with
int* ptr = malloc(MY_ARRAY_LENGTH*sizeof(int));
In practice, allocated memory often has a runtime-defined size, and you would keep that size somewhere (in a variable, a parameter, etc.).
Study the source code of some existing free software project (e.g. on github), you'll learn very useful things.
Read also (perhaps in a week or two) about flexible array members. Sometimes they are very useful.
So as the programs runs smoothly without any error
That's just because you were lucky. Keep running this program and you might segfault soon. You were relying on undefined behaviour (UB), which is always A Bad Thing™.
What is the purpose of realloc()?
From the man pages:
void *realloc(void *ptr, size_t size);
The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), calloc() or realloc(). If the area pointed to was moved, a free(ptr) is done.

I am unable to run the following piece of code on my windows system, however it works on a linux system [duplicate]

This question already has answers here:
Segmentation fault on large array sizes
(7 answers)
Closed 3 years ago.
Program with large global array:
int ar[2000000];
int main()
{
}
Program with large local array:
int main()
{
int ar[2000000];
}
When I declare an array with large size in the main function, the program crashes with "SIGSEGV (Segmentation fault)".
However, when I declare it as global, everything works fine. Why is that?
Declaring the array globally causes the compiler to include the space for the array in the data section of the compiled binary. In this case you have increased the binary size by 8 MB (2000000 * 4 bytes per int). However, this does mean that the memory is available at all times and does not need to be allocated on the stack or heap.
EDIT: #Blue Moon rightly points out that an uninitialized array will most likely be allocated in the bss data segment and may, in fact, take up no additional disk space. An initialized array will be allocated statically.
When you declare an array that large in your program, you have probably exceeded the stack size of the program (and, ironically, caused a stack overflow).
A better way to allocate a large array dynamically is to use a pointer and allocate the memory on the heap like this:
#include <stdio.h>
#include <stdlib.h>

int main() {
    int *ar = malloc(2000000 * sizeof(int));
    if (ar != NULL) {
        // Do something
        free(ar);
    }
    return 0;
}
A good tutorial on the Memory Layout of C Programs can be found here.

function to free memory of 1D Array [duplicate]

This question already has answers here:
How do malloc() and free() work?
(13 answers)
Closed 8 years ago.
I am new at programming and I just don't get this. I am supposed to write a function which takes a 1D array as an argument and frees this array.
I've got this:
void destroy(double A[])
{
free(A);
}
and my main:
void main()
{
    double *swrmeg = (double *)malloc(10 * sizeof(double));
    swrmeg[0] = 3.2;
    destroy(swrmeg);
    printf("%lf\n", swrmeg[0]);
}
This is supposed to give a segmentation fault, but it does not; it prints the first double of the array. This means the array has not been freed. Any ideas why this happens?
Any proper ways to do the freeing in a function?
You're freeing it correctly.
Doing something wrong, like accessing a piece of memory after it's been freed, doesn't necessarily mean you'll get a segmentation fault, any more than driving on the wrong side of the road means you'll necessarily have an accident.
Segfaults cannot be guaranteed when doing undefined operations, they just sometimes occur when doing undefined operations.
What is actually occurring in your case is that the memory was assigned to your program by malloc, and then your program decided it no longer needed it in the free call; however, the operating system has decided not to move its memory fences in a way that would cause a segfault.
Why it doesn't do so includes a lot of reasons:
It could be far more expensive to move the fence rather than just to let your program get away with having a few extra bytes for a little while.
It could be that you'll ask for some memory in a few minutes, and if you do (and it's small enough) then the same memory will be returned, without the need to move memory fences.
It could be that until you hit some hardware dependent limit (like a full page of memory) the OS can't reset the memory fence.
It could be ...
That's the reason why it is undefined: it depends on so many things that implementations do not need to agree. Only the defined parts need to agree.
It appears you are being asked to investigate undefined behavior (UB) ("This is supposed to give a segmentation fault"). What you are doing is not guaranteed to produce a seg fault, but you can increase your chances by writing to places you do not own:
void main()
{
    double *swrmeg = (double *)malloc(10 * sizeof(double));
    swrmeg[0] = 3.2;
    destroy(swrmeg);
    printf("%lf\n", swrmeg[0]);
}

void destroy(double *A)
{
    int i;
    // choose a looping index with a higher likelihood of encroaching on illegal memory
    for (i = 0; i < 3000; i++) {
        A[i] = i * 2.0; // make assignments to places you have not allocated memory for
    }
    free(A);
}
Regarding using freed memory: this post is an excellent description of why freed memory will sometimes still work (albeit dealing directly with stack as opposed to heap memory, the concepts discussed are still illustrative of using freed memory in general).

malloc non-deterministic behaviour

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *arr = (int *)malloc(10);
    int i;
    for (i = 0; i < 100; i++) {
        arr[i] = i;
        printf("%d", arr[i]);
    }
    return 0;
}
I am running the above program, where the call to malloc allocates 10 bytes of memory. Since each int variable takes up 2 bytes, in a way I can store 5 int variables of 2 bytes each, making up the total of 10 bytes I dynamically allocated.
But the for loop lets me enter values even up to the 99th index, and it stores all of these values as well. So if I am storing 100 int values, that means 200 bytes of memory, whereas I allocated only 10 bytes.
So where is the flaw in this code, and how does malloc behave? If the behaviour of malloc is non-deterministic in such a manner, then how do we achieve proper dynamic memory handling?
The flaw is in your expectations. You lied to the compiler: "I only need 10 bytes" when you actually wrote 100*sizeof(int) bytes. Writing beyond an allocated area is undefined behavior and anything may happen, ranging from nothing to what you expect to crashes.
If you do silly things expect silly behaviour.
That said malloc is usually implemented to ask the OS for chunks of memory that the OS prefers (like a page) and then manages that memory. This speeds up future mallocs especially if you are using lots of mallocs with small sizes. It reduces the number of context switches that are quite expensive.
First of all, on most operating systems the size of int is 4 bytes. You can check that with:
printf("the size of int is %zu\n", sizeof(int));
When you call the malloc function, you allocate memory on the heap. The heap is a region set aside for dynamic allocation. There's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time. Because your program is small and there is no collision in the heap, you can run this loop with even more than 100 values and it still runs.
When you know what you are doing with malloc, you build programs with proper dynamic memory handling. When your code does improper malloc allocation, the behaviour of the program is "unknown". But you can use the gdb debugger to find where the segmentation fault is revealed and how things look in the heap.
malloc behaves exactly as stated: it allocates n bytes of memory, nothing more. Your code might run on your PC, but operating on non-allocated memory is undefined behavior.
A small note...
int might not be 2 bytes; its size varies across architectures/SDKs. When you want to allocate memory for n integer elements, you should use malloc( n * sizeof( int ) ).
In short, you manage dynamic memory with the other tools that the language provides (sizeof, realloc, free, etc.).
C doesn't do any bounds-checking on array accesses; if you define an array of 10 elements, and attempt to write to a[99], the compiler won't do anything to stop you. The behavior is undefined, meaning the compiler isn't required to do anything in particular about that situation. It may "work" in the sense that it won't crash, but you've just clobbered something that may cause problems later on.
When doing a malloc, don't think in terms of bytes, think in terms of elements. If you want to allocate space for N integers, write
int *arr = malloc( N * sizeof *arr );
and let the compiler figure out the number of bytes.

what causes segmentation fault in below program [duplicate]

This question already has answers here:
Segmentation fault on large array sizes
(7 answers)
Closed 9 years ago.
If I keep the value of rows at 100000 the program works fine, but if I make rows one million (1000000), the program gives me a segmentation fault. What is the reason? I am running the code below on a Linux 2.6.x RHEL kernel.
#include<stdio.h>
#define ROWS 1000000
#define COLS 4
int main(int args, char **argv)
{
    int matrix[ROWS][COLS];
    for (int col = 0; col < COLS; col++)
        for (int row = 0; row < ROWS; row++)
            matrix[row][col] = row * col;
    return 0;
}
The matrix is a local variable inside your main function. So it is "allocated" on the machine call stack.
This stack has some limits.
You should make your matrix a global or static variable or make it a pointer and heap-allocate (with e.g. calloc or malloc) the memory zone. Don't forget that calloc or malloc may fail (by returning NULL).
A better reason to heap-allocate such a thing is that the dimensions of the matrix should really be a variable or some input. There are few reasons to wire-in the dimensions in the source code.
Heuristic: don't have a local frame (cumulated sum of local variables' sizes) bigger than a kilobyte or two.
[of course, there are valid exceptions to that heuristic]
You are allocating a stack variable, the stack of each program is limited.
When you try to allocate too much stack memory, your kernel will kill your program by sending it a SEGV signal, aka segmentation fault.
If you want to allocate bigger chunks of memory, use malloc, this function will get memory from the heap.
Your system must not allow you to make a stack allocation that large. Make matrix global or use dynamic allocation (via malloc and free) and you should be ok.
