I have researched this as thoroughly as I could, but I find it hard to accept that malloc, i.e. malloc(sizeof(10)),
and calloc, i.e. calloc(2,sizeof(5)), allocate the same contiguous memory (setting aside that calloc initializes the memory to zero and is somewhat slower than malloc). Here is what I think.
I think that on a 32-bit system, if we call malloc(sizeof(10)), malloc will go to the heap and allocate 12 bytes, because on a 32-bit system memory is arranged in groups of 4 bytes, so allocating 10 bytes needs 3 blocks, with 2 bytes of padding in the last block.
Similarly, if we call calloc(2,sizeof(5)), it will allocate 2 blocks of 8 bytes each, 16 bytes in total, for the same reason: memory comes in groups of 4 bytes, so allocating 5 bytes takes two 4-byte blocks, with 3 bytes of padding in the second one.
So this is what I think of malloc and calloc. I may be right or wrong but please tell me either way.
calloc allocates "memory for an array of nmemb elements of size bytes each" (Linux man page), and arrays in C cannot have padding between elements; they must be contiguous in memory. malloc, on the other hand, allocates "size bytes", so either malloc(10) or calloc(2,5) will give you those ten bytes.
What happens behind the scenes is another matter: the C library might decide to allocate 12, 16, or 42 bytes instead. But you can't and must not count on that. If you ask for 10 bytes, assume you got 10.
malloc(sizeof(10)) is different: sizeof(10) is the size in memory of an int (since 10 is an int), so that call allocates sizeof(int) bytes, typically 4, not 10. Likewise, calloc(2,sizeof(5)) allocates 2 * sizeof(int) bytes.
With either function you get at least the amount of memory you request. It can be more. One reason could be the one you mention. Another is that the allocator may hand out a bigger chunk in case you want to grow it later.
There's no portable way to tell exactly how much you got. You are only guaranteed at least what you requested; if that cannot be provided, you get a null pointer.
Note, though, that you can NEVER count on more than what you asked for. Accessing memory beyond the requested size is undefined behavior.
Since arrays are just contiguous data of the same type, and you don't need to explicitly put [] somewhere (e.g. you can write int *p1 = malloc(sizeof(int) * 4);), how is it that when you call realloc(p1, ...), it knows to move (if it has to move) exactly 4 ints' worth of space, even if there are potentially other ints in memory?
To clarify the question: if you allocate an array in this way, and also a single, separate int, does that mean that these 4+1 ints are never contiguous in memory? Or is the "it's an array" information stored in the memory block somehow (e.g. with some sort of delimiter)? Or does the compiler infer and remember it from the malloc parameter? Or something else?
Basically, how does it ensure it moves only and exactly those 4, even when there are other blocks of memory of the same size that might be also contiguous?
realloc is defined only when passed a pointer to memory allocated by a member of the malloc family of routines (or a null pointer). These routines keep records of the memory they have allocated. When you call realloc, it uses these records to know how long the allocated block is.
Often, the primary record for a block of memory is put into the bytes just before that block, so all realloc has to do is take the pointer you give it, subtract a known number of bytes from it, and look at the data at that new address, where it will find information about the size of the allocated block. However, other methods are possible too.
When I run the following 32-bit application (debug mode) under Windows, memory usage reaches the 2GB limit and the loop breaks when i equals 42885988:
for (int i = 0; i < 104857600; ++i)
{
    uint8_t* ptr = (uint8_t*)malloc(1);
    if (!ptr)
    {
        break;
    }
    *ptr = 0;
}
104857600 bytes is only 100 MB, so how can the behavior of the above program be explained?
malloc(1) doesn't allocate one byte.
The malloc man page notes that the memory returned "is suitably aligned for any built-in type." So if the first call to malloc returns address 0x1000, the second call probably can't return 0x1001, because that address might not be "suitably aligned for any built-in type." (Some processors can't access words at odd addresses, or generally N-byte values at addresses not evenly divisible by N, and some of those that can do so less efficiently.) So the second malloc call will have to return at least 0x1004 or even 0x1008.
Also, malloc has to allocate extra memory to store information about the buffer it returns to you. When you later call free, that function has to know the size of the buffer, for example. On a 64-bit machine that's at least another 8 bytes. Depending on how the runtime manages the heap, it may have to store additional information.
If you assume that each malloc actually allocates at least 8 bytes (for alignment) plus another 8 or 16 for housekeeping, you can see that 100 million calls to malloc of one byte each can get you over 2GB.
I'm not sure if each of your calls is actually using 16 or 24 bytes or whatever; the point is that it's a lot more than one.
2GB/42885988 is a shade over 50 bytes per allocation.
This is more than would be expected from a simple Windows heap allocation, so I suspect you are running a DEBUG build, in which case there is extra overhead from guard bytes around your allocated memory. More details can be found in this article - http://www.nobugs.org/developer/win32/debug_crt_heap.html .
int* ptr;
ptr=(int*)malloc(sizeof(int)); //(A)
ptr=(int*)malloc(5*sizeof(int)); //(B)
At line (A), a block of 4 bytes is created dynamically. That's fine. But my question is about line (B): is it going to create a single block of 20 (5*4) bytes, or 5 separate blocks of 4 bytes each? If it creates separate blocks, will they be contiguous? Are ptr=(int*)malloc(5*sizeof(int)); and ptr=(int*)calloc(5,sizeof(int)); equivalent?
They are practically equivalent. malloc will allocate a contiguous block.
The difference is that calloc zero-initializes the memory, while malloc doesn't.
Of course, we are talking about virtual memory. The block will be contiguous from your program's point of view; it may not be contiguous in physical memory. That does not matter in most cases, unless you write kernel modules or drivers, which run in ring 0. But that is a different story.
But my question is at line B is it going to create a single 20(5*4) byte block?
ptr = malloc(5*sizeof(int)) will allocate 5*sizeof(int) bytes of space. Yes, the allocated space will be contiguous; if no contiguous space of that size is available, the allocation will fail.
Is ptr=(int*)malloc(5*sizeof(int)); and ptr=(int*)calloc(5,sizeof(int)); equivalent?
They are equivalent except that calloc sets the allocated memory to 0.
NOTE: In C, you should not cast the return value of malloc, calloc and realloc.
malloc() tries to allocate the amount of memory that you asked for. The function doesn't know how the argument was computed, only its value.
For example, if sizeof(int) is 4, then:
int* ptr = malloc(sizeof(int) * 5);
and
int* ptr = malloc(20);
are essentially the same. In both the function will get a value of 20 as a parameter. The same will happen if you call malloc like this:
size_t a = 20;
int* ptr = malloc(a);
Therefore, if it succeeds (i.e. doesn't return NULL), it will allocate a contiguous block of memory, with at least the size that you asked for.
All of that is true of virtual memory, meaning you access the memory through a contiguous range of addresses. The layout in physical memory depends on how the OS manages your memory.
If, for example, your OS holds memory in page frames (blocks of physical memory) of only 4 KB, and you ask malloc for more than that, your virtual memory will be contiguous but the physical memory might not be.
All of that has to do with a wider subject that is called memory management. You can read about the way that linux chose to deal with it here.
malloc takes the requested size of the block to be allocated, expressed in bytes. It does not (and really cannot) determine how you arrived at that number, i.e. whether the size was written as 5*sizeof(int), as in your example, or as 20 or 30-10. It allocates a single block of 20 bytes in each case (assuming an int is 4 bytes).
Yes, it does. malloc will allocate a contiguous block, or fail if no large enough contiguous block is available. (On failure, malloc returns a NULL pointer.)
malloc and calloc both allocate memory, but calloc initializes it to zero while malloc does not.
First of all, it's bad to cast the return value of malloc(). This function takes as input the number of bytes that you want to allocate. It doesn't matter whether you pass (5*sizeof(int)) or 20 (assuming an int is 4 bytes on your machine). A chunk with the given number of bytes will be allocated. When I say chunk, I mean that you can access each byte like this:
char *ptr = malloc(5*sizeof(int));
ptr++; // now ptr points to the second byte of the chunk.
Or, you could do something like this:
int *ptr = malloc(5*sizeof(int));
ptr++; // now ptr points to the second element (that's sizeof (int) bytes away) from the start of the chunk.
Is ptr=(int*)malloc(5*sizeof(int)); and
ptr=(int*)calloc(5,sizeof(int)); equivalent?
When using malloc, you should not assume anything about the contents of the chunk of memory. When you use calloc, all the bytes in the chunk are guaranteed to be set to 0. malloc might be faster than calloc on some implementations. Other than that there is no difference.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct Person
{
    unsigned long age;
    char name[20];
};

struct Array
{
    struct Person someone;
    unsigned long used;
    unsigned long size;
};

int main()
{
    //pointer to array of structs
    struct Array** city;
    //creating heap space for one struct Array
    struct Array* people=malloc(sizeof(struct Array));
    city=&people;
    //initializing a person
    struct Person Rob;
    Rob.age=5;
    strcpy(Rob.name,"Robert");
    //putting Rob into the array
    people[0].someone=Rob;
    //prints Robert
    printf("%s\n",people[0].someone.name);
    //another struct
    struct Person Dave;
    Dave.age=19;
    strcpy(Dave.name,"Dave");
    //creating more space on the heap for people.
    people=realloc(people,sizeof(struct Array)*2);
    //How do I know that this data is safe in memory from being overwritten?
    people[1].someone=Dave;
    //prints Dave
    printf("%s\n",people[1].someone.name);
    //accessing memory on the heap I do not own?
    people[5].someone=Rob;
    //prints "Robert" why is this okay? Am I potentially overwriting memory?
    printf("%s\n",people[5].someone.name);
    return 0;
}
In the above code I attempt to make a pointer to a dynamic array of structs. I'm unsure whether I succeeded in that part, but my main concern is this: I use malloc to create space on the heap for the array 'people'. Later in the code I create another struct Person and use realloc to create more space on the heap for 'people'. I then write to memory outside of what I thought I had allocated by doing 'people[5].someone=Rob;'. This still works, as I can read the value back from that memory location. My question is: why does this work? Am I potentially overwriting memory by writing to memory I did not specifically allocate for people? Am I actually using malloc and realloc correctly? I read in another post that there are ways of testing whether they succeeded. I am new to C, so if my assumptions or terminology are off, please correct me.
I'm not an expert in C, not even an intermediate programmer; most of the time I program in C#, so there may be some mistakes here.
Modern operating systems have a special mechanism called the memory manager. Using that mechanism we can ask the OS to give us some amount of memory. On Windows there's a special function for that, VirtualAlloc. It's a really powerful function; you can read more about it on MSDN.
It works really well and gives us all the memory we require, but there's a little problem: it hands out memory only in whole pages (4 KB). That by itself is not a big problem, since you can use this memory the same way as if it had been allocated with malloc. There'll be no error.
But it is wasteful. If we, for example, allocate a 10-byte chunk using VirtualAlloc, it will actually give us a 4096-byte chunk, because the size is rounded up to a page boundary. So VirtualAlloc allocates a 4 KB chunk, but we use only 10 bytes of it. The remaining 4086 bytes are "gone". If we create a second 10-byte array, VirtualAlloc will give us another 4096-byte chunk, so two 10-byte arrays will actually take 8 KB of RAM.
To solve this problem, C programs normally use the malloc function, which is part of the C runtime library. It obtains some space using VirtualAlloc and returns pointers to parts of it. Let's return to our previous arrays. If we allocate a 10-byte array using malloc, the runtime library will call VirtualAlloc to get some space, and malloc will return a pointer to the beginning of it. But if we allocate a 10-byte array a second time, malloc won't call VirtualAlloc. Instead, it will use the free space in the page it already has. After the first allocation we have 4086 bytes of unused space in our chunk, and malloc uses this space wisely. In this simplified picture, for the second array it returns "address of chunk" + 10 (that's a memory address); a real allocator also rounds up for alignment and keeps bookkeeping data, so the actual offset is larger.
Now we can allocate about 400 ten-byte arrays and they will take only 4096 bytes if we use malloc. The naive way using VirtualAlloc would take 400 * 4096 bytes = 1600 KB, a rather big figure compared to 4096 bytes with malloc.
There's another reason: performance, as VirtualAlloc is a really expensive operation. malloc only does some pointer math as long as there is free space in the chunks it already holds; it calls VirtualAlloc only when it has run out. In reality it's much more complicated than this, but I think it is enough to explain the cause.
Okay, let's return to the question. You allocate memory for an array of struct Array. Let's calculate its size, assuming a 4-byte unsigned long as on Windows: sizeof(struct Person) = sizeof(unsigned long) + sizeof(char[20]) = 4 + 20 = 24 bytes; sizeof(struct Array) = sizeof(struct Person) + 2 * sizeof(unsigned long) = 24 + 8 = 32 bytes. An array of 2 elements takes 32 * 2 = 64 bytes.
As I said before, malloc will call VirtualAlloc to get some memory, and that returns a 4096-byte page. For example, let's assume that the chunk begins at address 0. The application can modify any byte from 0 to 4095, because the page is allocated and we won't get a page fault. What is array indexing, array[n]? It's just the array's base plus an offset: array + n * sizeof(*array). For people[5] that is 0 + 5 * sizeof(struct Array) = 0 + 5 * 32 = 160 bytes. Gotcha! We're still within the chunk's boundary, that is, we access an existing page. A page fault would happen if we tried to access a virtual page that doesn't exist, but in our case address 160 lies inside the page we assumed (0 to 4095). It's dangerous to access unallocated space, as it can lead to all sorts of unknown consequences, but we can actually do it!
That's why you don't get any Access Violation at ****. But it's actually MUCH WORSE. If you, for example, dereference a null pointer, you get a page fault and your app just crashes, so you WILL find the cause of the problem with the help of a debugger or some other tool. But if you overrun a buffer and don't get any error, you will go crazy looking for the cause, because this kind of error is REALLY HARD to find. You may not even be AWARE of it. So NEVER OVERRUN A BUFFER ALLOCATED ON THE HEAP. Microsoft's C runtime has a special "debug" version of malloc that can find these errors at runtime, but you need to compile the application in the "DEBUG" configuration. There are also dedicated tools such as Valgrind, but I have little experience with them.
Well, I have written a lot; sorry for my English, I'm still learning it. I hope this helps you.
First, never forget to free your memory.
// NEVER FORGET TO FREE YOUR MEMORY
free(people);
As for this part
//accessing memory on the heap I do not own?
people[5].someone=Rob;
//prints "Robert" why is this okay? Am I potentially overwriting memory?
printf("%s\n",people[5].someone.name);
you are just lucky (or unlucky, in my opinion, since you don't get to see the logical mistake you are making).
This is undefined behaviour: you allocated two cells but access a sixth one, so you go out of bounds.
I am having an issue with allocating the right size of memory in my program. I do the following:
void * ptr = sbrk(sizeof(void *)+sizeof(unsigned int));
When I do this, I think it is adding too much memory to the heap because it is allocating it in units of void* instead of bytes. How do I tell it that I want sizeof(whatever) to mean whatever bytes instead of whatever other units?
EDIT:
I have seen other people cast things to char so that the compiler takes the size in bytes. If sizeof(unsigned int) is 4 bytes, but the type that I am using is void *, will the break move by 4 times the size of a void * instead of by 4 bytes?
Pass a number of bytes as the argument of sbrk.
In Linux, the prototype of sbrk is:
void *sbrk(intptr_t increment);
http://www.kernel.org/doc/man-pages/online/pages/man2/brk.2.html
sbrk() increments the program's data space by increment bytes.
But as some people added in the comments, if you want to allocate memory dynamically you are looking for the malloc function and not sbrk. brk and sbrk are low-level interfaces that are usually used internally to implement the user-level malloc function.
The kernel manages process memory at page granularity. This means the memory actually mapped into the process grows (or shrinks) by a whole number of pages.
So even though sbrk takes a number of bytes and moves the program break by exactly that many, a small request can still cause a whole page to be mapped into the process.