Proper argument for malloc - c

I have always used the malloc function like this, for example:
int size = 10000;
int *a;
a = malloc(size * sizeof(int));
I recently ran into a piece of code that drops the sizeof(int) part, i.e.
int size = 10000;
int *a;
a = malloc(size);
This second code seems to be working fine.
My question is then, which form is correct? If the second form is, am I allocating needless space with the first form?

The argument to malloc is the number of bytes to be allocated. If you need space for an array of n elements of type T, call malloc(n * sizeof(T)). malloc does not know about types, it only cares about bytes.
The only exception is that when you allocate space for (byte/char) strings, the sizeof can be omitted because sizeof(char) == 1 by definition in C. Doing something like
int *a = malloc(10000);
a[9000] = 0;
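may seem to work now, but actually invokes undefined behavior: 10000 bytes hold fewer than 9001 ints whenever sizeof(int) is greater than 1, so a[9000] lies outside the allocation.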
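A minimal sketch of the n * sizeof(T) rule described above, for T = int. The helper names bytes_for_ints and make_int_array are made up for this illustration; the overflow check is an extra precaution that the answer does not mention.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: bytes needed for n ints, or 0 if n is 0 or
   n * sizeof(int) would overflow size_t. */
size_t bytes_for_ints(size_t n)
{
    if (n > SIZE_MAX / sizeof(int))
        return 0;
    return n * sizeof(int);
}

/* Hypothetical helper: allocates room for n ints and sets each element
   to its index. Returns NULL if n is 0 or the allocation fails; the
   caller must free() the result. */
int *make_int_array(size_t n)
{
    size_t bytes = bytes_for_ints(n);
    if (bytes == 0)
        return NULL;
    int *a = malloc(bytes);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}
```

With make_int_array(10000), writing to a[9000] is within bounds because the block really holds 10000 ints, not 10000 bytes.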

malloc allocates a given number of bytes worth of memory, suitably aligned for any type. If you want to store N elements of type T, you need N * sizeof(T) bytes of aligned storage. Typically, T * p = malloc(N * sizeof(T)) provides that and lets you index the elements as p[i] for i in [0, N).

From the man page:
The malloc() function allocates size bytes and returns a pointer to the allocated memory.
The first form is correct.
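Even if sizeof(int) is 1 on the machine you are targeting (which can happen on some DSPs whose bytes are 16 or more bits wide; on 8-bit microcontrollers int is still at least 16 bits, so sizeof(int) is typically 2), you still want your code to be readable.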
The reason the "second code seems to be working fine" is that you are lucky.
The version of malloc you are using might be returning a pointer to an area of memory that is larger than what you requested. No matter what is happening behind the scenes, the behavior may change if you switch to a different compiler, so you do not want to rely on it.

Related

Why malloc( ) is used ? And why the size of the variable isn't increasing?

According to my faculty, malloc allocates memory dynamically. Then why does the output show the same size for both the normal variable and the one assigned from malloc()? I am new to programming, so please answer in a way that a beginner can understand.
#include<stdio.h>
int main()
{
int a,b;
a = (int *) malloc(sizeof(int)*2);
printf("The size of a is:%d \n",sizeof(a));
printf("The size of b is:%d \n",sizeof(b));
return 0;
}
Output:
The size of a is:4
The size of b is:4
malloc is used with a pointer. You are declaring an integer, int a. This needs to be changed to int *a.
The sizeof operator will not give the number of bytes allocated by malloc. That count has to be tracked by the programmer and typically cannot be determined from the pointer.
For int *a, sizeof(a) will always return the size of the pointer,
int *a;
printf("%zu\n",sizeof(a)); // gives the size of the pointer e.g. 4
a = malloc(100 * sizeof(int));
printf("%zu\n",sizeof(a)); // also gives the size of the pointer e.g. 4
You should always remember to free the memory you have allocated with malloc
free(a);
Edit: The printf format specifier for a sizeof result should be %zu, because sizeof yields a size_t.
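Since sizeof on the pointer cannot recover the allocation size, one common approach is to store the element count next to the pointer. The following is only a sketch; the struct int_buffer type and its two functions are hypothetical names invented for this example.

```c
#include <stdlib.h>

/* Hypothetical wrapper that keeps the element count next to the pointer,
   because sizeof on the pointer cannot recover it. */
struct int_buffer {
    int   *data;
    size_t count;
};

/* Allocates count ints; on failure data is NULL and count is 0. */
struct int_buffer int_buffer_create(size_t count)
{
    struct int_buffer buf = { NULL, 0 };
    buf.data = malloc(count * sizeof *buf.data);
    if (buf.data != NULL)
        buf.count = count;
    return buf;
}

/* Releases the storage and resets the fields. */
void int_buffer_destroy(struct int_buffer *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->count = 0;
}
```

After int_buffer_create(100), buf.count is 100 while sizeof buf.data is still just the size of an int pointer.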
You declare and define both variables as int. Nothing else has an influence on the value of sizeof().
int a,b;
This assigns a very special value to one of those ints, but it does not change the fact that a remains an int (and your cast is misleading and does nothing at all, least of all change anything about a).
a = (int *) malloc(sizeof(int)*2);
In order to change above line to something sensible (i.e. a meaningful use of malloc) it should be like this:
int* a;
a= malloc(sizeof(int)*2);
I.e. a is now a pointer to int and gets the address of an area which can store two ints. No cast needed.
That way, sizeof(a) (on many machines) will still be 4, which is often the size of a pointer. The size of what it is pointing to is irrelevant.
The actual reason for using malloc() is determined by the goal of the larger scope of the program it is used for. That is not visible in this artificially short example. Work through some pointer-related tutorials. Looking for "linked list" or "binary tree" will get you on the right track.
What programs which meaningfully use malloc have in common is that they are dealing with data structures which are not known at compile time and can change during runtime. The unknown attributes could simply be the total size, but especially in the case of trees, the larger structure is usually unknown, too.
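As a rough illustration of a structure whose size is not known at compile time, here is a sketch of a singly linked list that grows one malloc call at a time. The struct node type and the list_* function names are assumptions made for this example, not something from the question.

```c
#include <stdlib.h>

/* Hypothetical singly linked list node. */
struct node {
    int          value;
    struct node *next;
};

/* Pushes value onto the front of the list. Returns the new head,
   or NULL if the allocation failed (the old list is left untouched). */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next  = head;
    return n;
}

/* Counts the nodes in the list. */
size_t list_length(const struct node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Frees every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The number of nodes is decided entirely at run time, which is the kind of situation where malloc earns its keep.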
There is an interesting aspect to note when using malloc():
Do I cast the result of malloc?

(int*) when dynamically allocating array of ints in c

So I'm a bit confused on how to make a function that will return a pointer to an array of ints in C. I understand that you cannot do:
int* myFunction() {
int myInt[aDefinedSize];
return myInt; }
because this is returning a pointer to a local variable.
So, I thought about this:
int* myFunction(){
int* myInt = (int) malloc(aDefinedSize * sizeof(int));
return myInt; }
This gives the error: warning cast from pointer to integer of different size
This implies to use this, which works:
int* myFunction(){
int* myInt = (int*) malloc(aDefinedSize * sizeof(int));
return myInt; }
What I'm confused by though is this:
the (int*) before the malloc was explained to me to do this: it tells the compiler what the datatype of the memory being allocated is. This is then used when, for example, you are stepping through the array and the compiler needs to know how many bytes to increment by.
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints? Thus, isn't myInt a pointer to an array of pointers to ints?
Some help in understanding this would be wonderful. Thanks!!
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints?
No, you asked malloc for aDefinedSize * sizeof(int) bytes, not
aDefinedSize * sizeof(int *) bytes. That's the size of memory you get, the type depends on the pointer used to access the memory.
Thus, isn't myInt a pointer to an array of pointers to ints?
No, since you defined it as a int *, a pointer-to-an-int.
Of course the pointer has no knowledge of how large the allocated memory area is; it only points at the first int that fits there. It's up to you as the programmer to keep track of the size.
Note that you shouldn't use that explicit typecast. malloc returns a void *, that can be silently assigned to any pointer, as in here:
int* myInt = malloc(aDefinedSize * sizeof(int));
Arithmetic on the pointer works in strides of the pointed-to type, i.e. with int *p, p[3] is the same as *(p+3), which means roughly "go to p, go forward three times sizeof(int) in bytes, and access that location".
int **q would be a pointer-to-a-pointer-to-an-int, and might point to an array of pointers.
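The stride rule can be checked directly. This is a small sketch with made-up helper names (stride_in_bytes, element_at) that compares pointer arithmetic on an int * with the underlying byte distance.

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes between p and p + k,
   which is k * sizeof(int) because int * arithmetic moves in
   units of sizeof(int). */
size_t stride_in_bytes(const int *p, size_t k)
{
    const char *start = (const char *)p;
    const char *end   = (const char *)(p + k);
    return (size_t)(end - start);
}

/* Hypothetical helper: reads element i using explicit pointer
   arithmetic; equivalent to p[i]. */
int element_at(const int *p, size_t i)
{
    return *(p + i);
}
```

For an array int v[4], element_at(v, 3) reads the same object as v[3], and stride_in_bytes(v, 3) equals 3 * sizeof(int).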
malloc allocates an array of bytes and returns void* pointing to the first byte. Or NULL if the allocation failed.
To treat this block as an array of a different data type, the pointer must be converted to a pointer to that data type.
In C, void* converts implicitly to any object pointer type, so no explicit cast is required:
int* allocateIntArray(unsigned number_of_elements) {
int* int_array = malloc(number_of_elements * sizeof(int)); // <--- no cast is required here.
return int_array;
}
Arrays in C
In C, you want to remember that an array is just an address in memory, plus a length and an object type. When you pass it as an argument to a function or a return value from a function, the length gets forgotten and it’s treated interchangeably with the address of the first element. This has led to a lot of security bugs in programs that either read or write past the end of a buffer.
The name of an array automatically converts to the address of its first element in most contexts, so you can for example pass either arrays or pointers to memmove(), but there are a few exceptions where the fact it also has a length matters. The sizeof() operator on an array is the number of bytes in the array, but sizeof() a pointer is the size of a pointer variable. So if we declare int a[SIZE];, sizeof(a) is the same as sizeof(int)*(size_t)(SIZE), whereas sizeof(&a[0]) is the same as sizeof(int*). Another important one is that the compiler can often tell at compile time if an array access is out of bounds, whereas it does not know which accesses to a pointer are safe.
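The sizeof difference between an array and a pointer to its first element can be sketched as follows. The function names and the ARRAY_LEN macro are illustrative assumptions; SIZE is given an arbitrary value of 8 here.

```c
#include <stddef.h>

#define SIZE 8

/* Inside a function, an array argument arrives as a pointer, so sizeof
   reports the pointer size, not the array size. */
size_t size_seen_by_callee(int *a)
{
    return sizeof a;
}

/* At the point of declaration the array type is still known, so sizeof
   reports the full size in bytes. */
size_t size_seen_at_declaration(void)
{
    int a[SIZE];
    return sizeof a;
}

/* Element count of a real array (not a pointer), computed at compile time. */
#define ARRAY_LEN(arr) (sizeof (arr) / sizeof (arr)[0])
```

ARRAY_LEN gives a meaningful answer only where the array type is still visible; applied to a pointer it silently computes the wrong thing.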
How to Return an Array
If you want to return a pointer to the same, static array, and it’s fine that you’ll get the same array each time you call the function, you can do this:
#define ARRAY_SIZE 32U
int* get_static_array(void)
{
static int the_array[ARRAY_SIZE];
return the_array;
}
You must not call free() on a static array.
If you want to create a dynamic array, you can do something like this, although it is a contrived example:
#include <stdlib.h>
int* make_dynamic_array(size_t n)
// Returns an array that you must free with free().
{
return calloc( n, sizeof(int) );
}
The dynamic array must be freed with free() when you no longer need it, or the program will leak memory.
Practical Advice
For anything that simple, you would actually write:
int * const p = calloc( n, sizeof(int) );
Unless for some reason the array pointer would change, such as:
int* p = calloc( n, sizeof(int) );
/* ... */
p = realloc( p, new_size );
I would recommend calloc() over malloc() as a general rule, because it initializes the block of memory to zeroes, and malloc() leaves the contents unspecified. That means, if you have a bug where you read uninitialized memory, using calloc() will always give you predictable, reproducible results, and using malloc() could give you different undefined behavior each time. In particular, if you allocate a pointer and then dereference it on an implementation where 0 is a trap value for pointers (like typical desktop CPUs), a pointer created by calloc() will always give you a segfault immediately, while a garbage pointer created by malloc() might appear to work, but corrupt any part of memory. That kind of bug is a lot harder to track down. It’s also easier to see in the debugger that memory is or is not zeroed out than whether an arbitrary value is valid or garbage.
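One caveat about the realloc() line above: if realloc() fails it returns NULL and leaves the old block allocated, so assigning the result straight back to p loses the only pointer to it. A common way around this is a temporary pointer, sketched here with a hypothetical helper named resize_int_array that takes an element count rather than a byte count.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *p to hold new_count ints.
   Returns 1 on success and updates *p. Returns 0 if new_count is 0,
   if the byte count would overflow, or if realloc fails; in all of
   those cases *p and the block it points to are left untouched. */
int resize_int_array(int **p, size_t new_count)
{
    if (new_count == 0 || new_count > (size_t)-1 / sizeof **p)
        return 0;
    int *tmp = realloc(*p, new_count * sizeof **p);
    if (tmp == NULL)
        return 0;   /* *p is still valid and must still be freed */
    *p = tmp;
    return 1;
}
```

Note that realloc() does not zero the newly added part of the block, so the calloc() guarantee only covers the original elements.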
Further Discussion
In the comments, one person objects to some of the terminology I used. In particular, C++ offers a few different kinds of ways to return a reference to an array that preserve more information about its type, for example:
#include <array>
#include <cstdlib>
using std::size_t;
constexpr size_t size = 16U;
using int_array = int[size];
int_array& get_static_array()
{
static int the_array[size];
return the_array;
}
std::array<int, size>& get_static_std_array()
{
static std::array<int, size> the_array;
return the_array;
}
So, one commenter (if I understand correctly) objects that the phrase “return an array” should only refer to this kind of function. I use the phrase more broadly than that, but I hope that clarifies what happens when you return the_array; in C. You get back a pointer. The relevance to you is that you lose the information about the size of the array, which makes it very easy to write security bugs in C that read or write past the block of memory allocated for an array.
There was also some kind of objection that I shouldn’t have told you that using calloc() instead of malloc() to dynamically allocate structures and arrays that contain pointers will make almost all modern CPUs segfault if you dereference those pointers before you initialize them. For the record: this is not true of absolutely all CPUs, so it’s not portable behavior. Some CPUs will not trap. Some old mainframes will trap on a special pointer value other than zero. However, it’s come in very handy when I’ve coded on a desktop or workstation. Even if you’re running on one of the exceptions, at least your pointers will have the same value each time, which should make the bug more reproducible, and when you debug and look at the pointer, it will be immediately obvious that it’s zero, whereas it will not be immediately obvious that a pointer is garbage.

Should I change the pointer to an array?

for (int a=0; a<10; ++a) {
printf ("%d", a);
}
char *foo;
foo = (char*)malloc(a);
I want to store more than one char value in foo variable.
Should I change it to an array, since the buffer is only allocating 1 char length?
Is 1 the longest length that can be stored in this buffer?
Well, foo now points to the start of a usable block of a bytes, because this is how malloc() works. It doesn't matter whether its type is char *, void * or anything else; you can only use a bytes.
Here, you increment a to 10. That means you can store 10 bytes, i.e. 10 chars (because in C, 1 char = 1 byte), starting at the address that foo points to. For storing those characters, a pointer to this block and an array of 10 chars are used the same way.
Since the buffer is only allocating 1 char length...
No, it is not the case here.
Quoting from the C11 standard, chapter §7.22.3.4, The malloc function
void *malloc(size_t size);
The malloc function allocates space for an object whose size is specified by size and
whose value is indeterminate.
So, in case of
foo = malloc(a); //yes, the cast is not required
a memory of size same as the value of a will be allocated, considering malloc() is successful.
Simply put, if I write a snippet like
int * p = malloc(10 * sizeof*p);
then, I can also write
for (int i = 0; i < 10; i++)
p[i] = i;
because, I have allocated the required memory for 10 ints.
That said, please see this discussion on why not to cast the return value of malloc() and family in C.
There are a couple of things you could do in a case like this.
If you know at compile time how many chars you want to store, for example that there will always be 10 or fewer, you could make it an array: char foo[10];
If you do not know at compile time how many chars it needs to hold, you would typically allocate memory dynamically with malloc. With malloc you specify how many bytes of memory you want, so for 12 chars you would call malloc(12) or malloc(12 * sizeof(char)). You need to free the memory manually when you are done with it, so the benefit of being able to ask for arbitrary (within limits) sizes of memory comes at the cost of making memory management harder.
As a side note: you typically do not want to cast the return value of malloc, since the cast can hide some types of bugs, and the void * that malloc returns is implicitly converted to any object pointer type anyway.

C memory allocating - char* and char sizeof

What form is correct in allocating string in C?
char *sample;
sample = malloc ( length * sizeof(char) );
or
sample = malloc ( length * sizeof(char*) );
Why does char* take 4 bytes when char takes 1 byte?
Assuming the goal is to store a string of length characters, the correct allocation is:
sample = malloc(length + 1);
Notes:
Don't use sizeof (char), since it's always 1 it doesn't add any value.
Remember the terminator, I assumed (based on name) that length is the length in visible characters of the string, i.e. the return of strlen() will be length.
I know you didn't, but it's worth pointing out that there should be no cast of the return value from malloc(), either.
The reason char * is larger is that it's a pointer type, and pointers are almost always larger than a single character. On many systems (such as yours, it seems) they are 32 bit, while characters are just 8 bits. The larger size is needed since the pointer needs to be able to represent any address in the machine's memory. On 64-bit computers, pointers are often 64 bits, i.e. 8 characters.
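To make the length + 1 rule concrete, here is a sketch of copying a string into freshly allocated storage. The function name copy_string is hypothetical; it behaves much like the strdup function found on many systems.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL on failure.
   The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t length = strlen(src);          /* visible characters only */
    char *sample = malloc(length + 1);    /* +1 for the terminating '\0' */
    if (sample == NULL)
        return NULL;
    memcpy(sample, src, length + 1);      /* copies the terminator too */
    return sample;
}
```

Allocating only length bytes here would leave no room for the terminator, and the copy would write one byte past the end of the block.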
Why does char* take 4 bytes when char takes 1 byte?
Because you are on a 32-bit system, meaning that pointers take four bytes; char* is a pointer.
char always takes exactly one byte, so you do not need to multiply by sizeof(char):
sample = malloc (length);
I am assuming that length is already padded for null termination.
sample = malloc ( length * sizeof(char) );
First is the correct one if you want to allocate memory for length number of characters.
char* is of type pointer which happens to be 4 bytes on your platform. So sizeof(char*) returns 4.
But sizeof(char) is always 1, and this is guaranteed by the C standard.
In the given cases you are doing two different things:
In the first case : sample = malloc ( length * sizeof(char) );
You are allocating length multiplied by the size of type char which is 1 byte
While in the second case : sample = malloc ( length * sizeof(char*) );
You are allocating length multiplied by the size of a pointer to char, which is 4 bytes on your machine.
Note that the size in the first case is the same on every platform, whereas the size in the second case depends on the platform's pointer size.
sample = malloc(length);
is the right one
char* is a pointer, a pointer uses 4 bytes (say on a 32-bit platform)
char is a char, a char uses 1 byte
In your case, you want to alloc an array of length characters. You will store in sample a pointer to an array of length times the size of what you point to. The sizeof(char*) is the size of a pointer to char. Not the size of a char.
A good practice is
sample = malloc(length * sizeof(*sample));
Using that, you will reserve length times the size of what you want to point to. This gives you the ability to change the data type at any time, simply by declaring sample to be another kind of data.
int *sample;
sample = malloc(length * sizeof(*sample)); // length * 4
char *sample;
sample = malloc(length * sizeof(*sample)); // length * 1
Provided the length already accounts for the nul terminator, I would write either:
sample = malloc(length);
or:
sample = malloc(length * sizeof(*sample));
sizeof(char*) is the size of the pointer, and it is completely irrelevant to the size that the allocated buffer needs to be. So definitely don't use that.
My first snippet is IMO good enough for string-manipulation code. C programmers know that memory and string lengths in C are both measured in multiples of sizeof(char). There's no real need to put a conversion factor in there that everybody knows is always 1.
My second snippet is the One True Way to write allocations in general. So if you want all your allocations to look consistent, then string allocations should use it too. I can think of two possible reasons to make all your allocations look consistent (both fairly weak IMO, but not actually wrong):
some people will find it easier to read them that way, only one visual pattern to recognise.
you might want to use the code in future as the basis for code that handles wide strings, and a consistent form would remind you to get the allocation right when the length is no longer measured in bytes but in wide chars. Using sizeof(*sample) as the consistent form means you don't need to change that line of code at all, assuming that you update the type of sample at the same time as the units in which length is measured.
Other options include:
sample = calloc(length, 1);
sample = calloc(length, sizeof(char));
sample = calloc(length, sizeof(*sample));
They're probably fairly pointless here, but as well as the trifling secondary effect of zeroing the memory, calloc has an interesting difference from malloc that it explicitly separates the number and size of objects that you're planning to use, whereas malloc just wants the total size.
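A small sketch of the calloc form, showing both the separate count and size arguments and the zeroing side effect. The helper names alloc_zeroed and all_bytes_zero are invented for this illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates count zero-initialized elements of
   elem_size bytes each. calloc takes the two factors separately, so
   the multiplication is left to the library. */
void *alloc_zeroed(size_t count, size_t elem_size)
{
    return calloc(count, elem_size);
}

/* Hypothetical helper: returns 1 if the first n bytes at p are all zero. */
int all_bytes_zero(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    for (size_t i = 0; i < n; i++)
        if (bytes[i] != 0)
            return 0;
    return 1;
}
```

For a char buffer the zeroing also means the block already reads as an empty string.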
For any type T, the usual form is
T *p = malloc(N * sizeof *p);
or
T *p;
...
p = malloc(N * sizeof *p);
where N is the number of elements of type T you wish to allocate. The expression *p has type T, so sizeof *p is equivalent to sizeof (T).
Note that sizeof is an operator like & or *, not a library function; parentheses are only necessary if the operand is a type name like int or char *.
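The same pattern applied to a struct type, as a sketch. struct point and make_points are hypothetical names used only here; the point is that the malloc line never mentions the type.

```c
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
struct point {
    double x;
    double y;
};

/* Allocates n points; sizeof *p follows the declared type of p,
   so the malloc line stays correct if the type of p changes. */
struct point *make_points(size_t n)
{
    struct point *p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        p[i].x = (double)i;
        p[i].y = 0.0;
    }
    return p;
}
```

Here sizeof *p is sizeof(struct point), written without parentheses because the operand is an expression rather than a type name.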
Please see https://www.codesdope.com/c-dynamic-memory/ to understand how memory is allocated dynamically at run time. It may help in understanding the concept of malloc and how it allocates memory for a variable.
In your example;
char *sample;
sample = malloc ( length * sizeof(char) );
Here, you declare sample as a pointer to char without saying how much memory it needs. In the next line, length * sizeof(char) bytes of memory are allocated and their address is assigned to sample. A (char*) cast in front of malloc would typecast the returned pointer to a pointer to char.

What causes this integer pointer reassignment to crash?

I am new to C and I have this question. Why does the following code crash:
int *a = 10;
*a = 100;
Because you are trying to write 100 to the memory location 0x0000000A which is probably not allocated to your program. That is,
int *a = 10;
does not mean that the pointer 'a' will point to a location in memory having the value of 10. It means it is pointing to address 10 (0x0000000A) in memory. Then, you want to write something to that address, but you don't have the "rights" to do so, since it is not allocated.
You can try the following:
int *a = malloc(sizeof(int));
*a = 100;
This would work, although it is horribly inefficient. If you only need a single int, you should just put it on the stack, not the heap. On a 32-bit architecture, a pointer is 32 bits long, and an int is 32 bits long too, so your pointer-to-an-int structure takes up (at least) 8 bytes of memory space this way instead of 4. And we haven't even mentioned caching issues.
You need to assign the pointer to a memory location, not arbitrary value (10).
int cell = 10;
int *a = &cell; // a points to address of cell
*a = 100; // content of cell changed
See my answer to another question, about being careful with C.
I would like to propose a slight change in the use of malloc(), for all the answers that suggest using it to allocate memory for the int. Instead of:
a = malloc(sizeof(int));
I would suggest not repeating the type of the variable, since that is known by the compiler and repeating it manually both makes the code more dense, and introduces an error risk. If you later change the declaration to e.g.
long *a;
without changing the allocation, you would end up allocating the wrong amount of memory (in the general case; on 32-bit machines int and long are often the same size). It's, IMO, better to use:
a = malloc(sizeof *a);
This simply means "the size of the type pointed at by a", in this case int, which is of course exactly right. If you change the type in the declaration as above, this line is still correct. There is still a risk, if you change the name of the variable on the left hand side of the assignment, but at least you no longer repeat information needlessly.
Also note that no parentheses are needed with sizeof when using it on actual objects (i.e. variables), only with type names, which look like cast expressions. sizeof is not a function, it's an operator.
Because you've never allocated any memory for a to point to. You've just allocated some stack space for the pointer a itself.
int *a = NULL;
a = malloc (sizeof (int));
if (a != NULL)
{
*a = 10;
}
Will work.
Alternatively you could give a the address of some existing variable, which would work as well.
i.e.
int *a = NULL;
int b = 10;
a = &b;
This will now mean that doing something like
*a = 100;
will also set b to be == 100
Check out this:
http://home.netcom.com/~tjensen/ptr/pointers.pdf
The following line,
int *a = 10;
defines a pointer to an integer a. You then point the pointer a to the memory location 10.
The next line,
*a = 100;
Puts the value 100 in the memory location pointed to by a.
The problem is:
You don't know where a points to. (You don't know the value of memory location 10)
Wherever a points to, you probably have no right changing that value. It's probably some other program/process's memory. You thief!
Because You declare a pointer to int, initialize the pointer to 10 (an address) and then try to assign a value to an int at this address. Since the memory at address 10 does not belong to your process, You get a crash. This should work:
int *a;
a = malloc(sizeof(int));
*a = 10;
printf("a=%i\n", *a);
free(a);
Does this code even compile? 10 isn't convertible to an int *, unless you cast it like so:
int *a = (int *) 10;
*a = 100;
In that case, you're trying to write 100 into the memory address at 10. This isn't usually a valid memory address, hence your program crashes.
It's probably crashing because you are assigning the pointer to some part of memory which you don't have access to and then you're assigning some value to that memory location (which you're not allowed to do!).
You could also write it as:
int* a = 10;
*a = 100;
Note the different spacing on the first line. It's not a popular style, but I personally think it's clearer. It has exactly the same meaning to the compiler.
Then, read it out loud:
"Pointer-to-int 'a' becomes 10"
"Value-pointed-to-by 'a' becomes 100"
Substituting the actual value:
"Value-pointed-to-by 10 becomes 100"
... at which you realise that 10 is unlikely to point to a piece of memory you can use.
You would pretty much never assign to a pointer with a literal:
int* ptr = (int*)10; // You've guessed at a memory address, and probably got it wrong
int* ptr = malloc(sizeof(int)); // OS gives you a memory address at runtime
I guess there might be some very low level jobs where you directly specify absolute memory addresses. Kernel implementation for example?
Okay, I'll try to give the simplest explanation today, while also giving you a more detailed picture of it all. Let's add some parentheses, shall we?
(int*) a = 10;
(*a) = 100;
You attempt to write four bytes into the address-range [10-13]. The memory layout of your program starts usually higher, so your application does not accidentally overwrite anything from where it could and still function (from .data, .bss, and stack for instance). So it just ends up crashing instead, because the address-range has not been allocated.
Pointer points to a memory location and C static typing defines a type for a pointer. Though you can override the pointer easily. Simply:
(void*) v = NULL;
Here we go a bit further. What is a null pointer? It's simply a pointer that points to address 0 (on typical platforms).
You can also give a struct type for your pointer:
struct Hello {
int id;
char* name;
};
...
struct Hello* hello_ptr = malloc(sizeof(struct Hello));
hello_ptr->id = 5;
hello_ptr->name = "Cheery";
Ok, what is malloc? Malloc allocates memory and returns a pointer to the allocated memory. It has a following type signature:
void* malloc(size_t size);
If you do not have a conservative garbage collector, it is likely that your memory won't end up being freed automatically. Therefore, if you want to get the memory back into use from what you just allocated, you must do:
free(hello_ptr);
Each block returned by malloc has its size recorded by the allocator (typically in a size tag next to the block), so you do not need to pass the size of the chunk to the free routine.
Ok, yet one thing, what does a character string look like in memory? The one similar to "Cheery" for instance. Simple answer. It's a zero-terminated array of bytes.
0.1.2.3.4.5. 6
C h e e r y \0
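The layout above can be confirmed with a few lines of C. This is only a sketch; terminator_index is a made-up helper that scans for the zero byte the way strlen does conceptually.

```c
#include <string.h>

/* Hypothetical helper: returns the index of the terminating '\0' in s,
   found by scanning bytes one at a time (what strlen does conceptually). */
size_t terminator_index(const char *s)
{
    size_t i = 0;
    while (s[i] != '\0')
        i++;
    return i;
}
```

For const char name[] = "Cheery"; the terminator sits at index 6, strlen(name) is 6, and sizeof name is 7 because the array includes the zero byte.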

Resources