Pointer arithmetic and dynamic memory? - c

What does the following piece of code mean?
int* pointer = malloc (sizeof(int) + 3);
pointer++;
The allocated piece of memory can't be broken down into chunks of sizeof(int). So what happens when pointer is asked to jump to the next "block"? Is it defined?

The code is valid but maybe unusual without more context.
Line 1: The malloc allocates 3 bytes larger than the size of an int. This is valid.
Line 2: The pointer++ is valid. It's just an address.
Further references to pointer (e.g. addition or subtraction or comparison) are valid. Dereferences (i.e. *pointer) will result in undefined behaviour.
Note that those 3 "extra" bytes are valid storage space and can be addressed with a char *, for example.

pointer can be used for pointer comparison (C standard allows pointers to be one element past the last one). Read or write access is undefined.

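After the increment, *pointer would span the 3 remaining allocated bytes, which are uninitialized because malloc does not zero memory, plus sizeof(int) - 3 bytes past the end of the allocation. Which bytes (in terms of significance in your int) fall outside the allocation is platform dependent (it depends on the system's endianness), so in terms of your C program the whole access is undefined.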

Related

Pointer layout in memory in C

I've recently been messing around with pointers and I would like to know a bit more about them, namely how they are organized in memory after using malloc for example.
So this is my understanding of it so far.
int **pointer = NULL;
Since we explicitly set the pointer to NULL it now points to the address 0x00.
Now let's say we do
pointer = malloc(4*sizeof(int*));
Now we have pointer pointing to an address in memory - let's say pointer points to the address 0x0010.
Let's say we then run a loop:
for (i = 0; i<4; i++) pointer[i] = malloc(3*sizeof(int));
Now, this is where it starts getting confusing to me. If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]? And if so, what is pointer[0]?
Continuing, now supposedly pointer[i] contains stored in it an address. And this is where it really starts confusing me and I will use images to better describe what I think is going on.
In the image you see, if it is correct, is pointer[0] referring to the box that has the address 0x0020 in it? What about pointer[1]?
If I were to print the contents of pointer would it show me 0x0010? What about pointer[0]? Would it show me 0x0020?
Thank you for taking the time to read my question and helping me understand the memory layout.
Pointer Refresher
A pointer is just a numeric value that holds the address of a value of type T. This means that T can also be a pointer type, thus creating pointers-to-pointers, pointers-to-pointers-to-pointers, and crazy things like char********** - which is simply a pointer (T*) where T is a pointer to something else (T = E*) where E is a pointer to something else (and so on...).
Something to remember here is that a pointer itself is a value and thus takes space. More specifically, it's (usually) the size of the addressable space the CPU supports.
So for example, the 6502 processor (commonly found in old gaming consoles like the NES and Atari, as well as the Apple II, etc.) had a 16-bit address bus and could only address 64 KiB of memory, and thus its "pointers" were 16 bits in size.
So regardless of the underlying type, a pointer will (usually) be as large as the addressable space.
Keep in mind that a pointer doesn't guarantee that it points to valid memory - it's simply a numeric value that happens to specify a location in memory.
Array Refresher
An array is simply a series of T elements in contiguously addressable memory. The fact it's a "double pointer" (or pointer-to-a-pointer) is innocuous - it is still a regular pointer.
For example, allocating an array of 3 T's will result in a memory block that is 3 * sizeof(T) bytes long.
When you malloc(...) that memory, the pointer returned simply points to the first element.
T *array = malloc(3 * sizeof(T));
printf("%d\n", (&array[0] == &(*array))); // 1 (true)
Keep in mind that the subscript operator (the [...]) is basically just syntactic sugar for pointer arithmetic, which the compiler already scales by sizeof(*array):
(*(array + n)) // array[n]
Arrays of Pointers
To sum all of this up, when you do
E **array = malloc(3 * sizeof(E*));
You're doing the same thing as
T *array = malloc(3 * sizeof(T));
where T is really E*.
Two things to remember about malloc(...):
It doesn't initialize the memory with any specific values (use calloc for that)
It's not guaranteed (nor really even common) for the memory to be contiguous or adjacent to the memory returned by a previous call to malloc
Therefore, when you fill the previously created array-of-pointers with subsequent calls to malloc(), they might be in arbitrarily random places in memory.
All you're doing with your first malloc() call is simply creating the block of memory required to store n pointers. That's it.
To answer your questions...
If we dereference pointer, by doing *pointer what do we get? Do we get pointer[0]?
Since pointer is just an int**, and remembering that malloc(...) returns the address of the first byte in the block of memory you allocated, *pointer will indeed evaluate to pointer[0].
And if so, what is pointer[0]?
Again, since pointer has the type int**, pointer[0] will yield a value of type int* whose numeric contents are the first sizeof(int*) bytes in the memory block pointed to by pointer.
If I were to print the contents of pointer would it show me 0x0010?
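If by "printing the contents" you mean printf("%p\n", (void*) pointer), then yes, under your assumption that the block lives at 0x0010 (the exact text that %p produces is implementation-defined).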
Since you malloc()'d the memory block that pointer points to, pointer itself is just a value with the size of sizeof(int**), and thus will hold the address (as a numeric value) where the block of memory you malloc()'d resides.
So the above printf() call will simply print that value out.
What about pointer[0]?
Again assuming you mean printf("%p\n", (void*) pointer[0]), then you'll get a slightly different output.
Since pointer[0] is the equivalent of *pointer, and thus causes pointer to be dereferenced, you'll get a value of int* and thus the pointer value that is stored in the first element.
You would need to further dereference that pointer to get the numeric value stored in the first integer that you allocated; for example:
printf("%d\n", **pointer);
// or
printf("%d\n", *pointer[0]);
// or even
printf("%d\n", pointer[0][0]); // also works: pointer[0] points to the
// first of the three ints you allocated.
// Note that malloc() leaves those ints
// uninitialized, so assign them before reading.
If I dereference pointer, by doing *pointer what do I get? pointer[0]?
Yes.
And if so, what is pointer[0]?
With your definitions: 0x0020.
In the image you see, if it is correct
It seems correct to me.
is pointer[0] referring to the box that has the address 0x0020 in it?
Still yes.
What about pointer[1]?
At this point, I think you can guess that it would show: 0x002c.
To go further
If you want to check how memory is managed and what pointers look like you can use gdb. It allows running a program step by step and performing various operations such as showing the content of variables. Here is the main page for GNU gdb. A quick internet search should let you find numerous gdb tutorials.
You can also show the address held by a pointer in C by using a printf line:
int *plop = NULL;
fprintf(stdout, "%p\n", (void *)plop);
Note: don't forget to include <stdio.h>

How to limit the size allocated memory?

While studying pointers I found that if I allocate memory for a 2 digit int I can give that int a higher value. Can someone explain me why it works like that and how to work around it?
Here is the code:
int *p;
p = (int *) malloc(2*sizeof(int));
p = 123456;
First of all, please see this discussion on why not to cast the return value of malloc() and family in C.
That said, here you're just overwriting the pointer returned by malloc(). Don't do that. It will cause a memory leak. Also, if you try to dereference this pointer later, you may face undefined behavior, as there is no guarantee that this pointer points to a valid memory location. Accessing invalid memory leads to UB.
Finally, to address
I allocate memory for a 2 digit int [...]
let me tell you, you are allocating memory to hold two integers, not a two-digit integer. Also, to store an integer value into the memory area pointed to by the pointer, you need to dereference the pointer, like *p, and store (assign) the value there, like *p = 123456;.
malloc() allocates as many bytes as the value passed as its argument. Check sizeof(int) (and 2*sizeof(int), if you want) to make it more clear.
Regarding this, quoting C11, chapter
void *malloc(size_t size);
The malloc function allocates space for an object whose size is specified by size [...]
So, here, malloc() returns a pointer to a block the size of two integers (i.e., capable of holding two integer values), not two digits or two bytes.
Also, it's always a good practice to check the success of malloc() before using the returned pointer to avoid the possible UB by dereferencing NULL in case of malloc() failure.
if i allocate memory for a 2 digit int i can give that int a higher value.
No, you cannot. You have allocated memory for two ints, with full int range. The standard guarantees that int can hold values up to at least 32767 (2^15-1), so any four-digit value always fits; on most modern systems int goes up to 2147483647 (2^31-1), so any nine-digit value is safe.
Back to your question, there is no way in C to ensure that your program does not write past the end of the memory that it allocates. It is the responsibility of your program to not do that, including any checks it must perform in order to catch potentially unsafe access.
This is called undefined behavior. The system may crash when you access memory past the end of the allocated area, but it does not have to.
allocate memory for a 2 digit int
malloc(2*sizeof(int)) allocates memory for two integers, not for a single two-digit integer.
you do not need the cast in the first place
You have allocated memory for two integers - typically 32 bits each
You are giving a pointer the value 123456
Perhaps reread the book

Compatibility of data size while using malloc()

I recently studied about malloc() in C with declaration as follows:
void *malloc(size_t size)
where size_t is an unsigned integer type and size defines the number of bytes to be reserved.
My question is this: on my system float values occupy 4 bytes of memory. So if I make a memory pointer (of float type) using malloc of 2 bytes,
float *p;
p = (float *)malloc(2);
then how come it does not give any error? What I think is that float data requires 4 bytes, so if I give it only 2 bytes it may lead to some data loss.
Or is it that I am understanding malloc() incorrectly?
In the example you give, if you only allocate 2 bytes for a float * and then attempt to write to that location by dereferencing the pointer, you'll be writing to memory that hasn't been allocated. This results in undefined behavior. That means it might work, it might core dump, or it might behave in unpredictable ways.
If you want to allocate memory for one or more floats, you would do it like this:
// allocates space for an array of 5 floats
// don't cast the result of malloc
int arrayLen = 5;
float *f = malloc(sizeof(float) * arrayLen);
You're encountering an implementation-specific result of the requirements of the C Standard:
7.22.3 Memory management functions
The order and contiguity of storage allocated by successive calls to
the aligned_alloc, calloc, malloc, and realloc functions
is unspecified. The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type
of object with a fundamental alignment requirement and then used to
access such an object or an array of such objects in the space
allocated (until the space is explicitly deallocated).
In order to provide storage "suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement", an implementation has to return memory from malloc() et al at specific offsets that are multiples of the most restrictive alignment requirement for the system. That's usually something like 8 or 16 bytes.
Given that each and every block returned has to be aligned that way, most implementations internally create blocks of memory in multiples of the alignment requirement.
So if your system has an 8-byte alignment requirement, your malloc() implementation is likely to actually give you an 8-byte block of memory even though you requested two bytes. Likewise, ask for 19 bytes and you'll likely get something like 24 in reality.
It's still undefined behavior to go beyond what you asked for, though. And undefined behavior does unfortunately include "works just fine".
This can be problematic if you try to use that pointer. Actually the pointer itself is fine; it's the allocated memory it points to that's too small. The compiler doesn't flag it as an error because the pointer isn't "aware" of what it points to. Pointers are variables that contain memory addresses, so basically they're just numbers, and in most cases (as user694733 pointed out) the size of the pointer is the same whether it points to a short or a float.
What the compiler sees is a cast from (void*) to (float*), and to the compiler that's a totally valid cast.
Your question actually has nothing to do with malloc, but rather with type casting. In this case you have cast the address returned by malloc to a pointer to float. Thus, if you later say *p = 0.0f; you will actually write 4 bytes to that memory area, but only 2 bytes are legal for you to use since you have allocated only 2 bytes. Therefore, your code will compile and run with memory corruption (this will either result in a crash or in unexpected runtime behavior later).

Dynamic memory allocation questions with realloc and calloc

See the following function:
int go(void) {
int *p, *q;
p = calloc(10,sizeof(int));
q = realloc(p, 20 * sizeof(int));
<<X>>
}
Assuming that both memory allocation function calls are successful, which of the following statements are true at the point marker <<X>>.
The values of p and q are the same.
p points to 10 integers each with the value of 0.
q points to at least 80 bytes of memory.
This question is in my C test paper. Except for (2) which is obviously true. I'm quite confused about (1) and (3). Can anybody explain me this?
Check out the documentation. Specifically (emphasis added):
RETURN VALUE
Upon successful completion with a size not equal to 0, realloc()
returns a pointer to the (possibly moved) allocated space. If size is
0, either a null pointer or a unique pointer that can be successfully
passed to free() is returned. If there is not enough available memory,
realloc() returns a null pointer and sets errno to [ENOMEM].
So, the p and q may be the same (if realloc is able to resize the existing block of memory), but it's not guaranteed (and so you shouldn't rely on it).
According to the C standard, an int must be at least 16 bits (2 bytes), so 20 * sizeof(int) is at least 40, so (3) is not necessarily true.
As Brendan said, (1) is not necessarily true (and probably not true).
In C, generally an "int" is 4 bytes, so (3) should be true. It is true on all systems that I know of, although I'm not positive that the C standard says that an "int" must be four bytes long.
1) Both p and q are pointers. Look up the documentation on what realloc does for the answer. 3) That statement is ambiguous. In C, the values of pointers are typically scalar addresses of a byte location. The size of the memory block allocated is unknown. The type of the pointer is used to determine the size of a stride when doing pointer arithmetic, but if you allocated a buffer that is a multiple of the size of some type, that is still unknown from the variable itself.
Check this link. It says that (1) may be true or false; you can't say for certain: http://www.thinkage.ca/english/gcos/expl/c/lib/reallo.html
I hope this information helps, as many people have already explained the third case.
(1) may be true or false,
because q points to the reallocated memory. This may be the same as "p" if the old block of memory could be grown (or shrunk) to the new size; otherwise, it will be a different block. The space begins at an alignment boundary that is suitable for storing objects of any data type. If memory cannot be acquired or if an argument is improperly specified, the NULL pointer is returned.
The amount of memory pointed to by q depends on the size of the int type, which might not be four bytes. Statement one is also not necessarily true. From the realloc(3) man page on my system:
If there is not enough room to
enlarge the memory allocation pointed to by ptr, realloc() creates a new
allocation, copies as much of the old data pointed to by ptr as will fit
to the new allocation, frees the old allocation, and returns a pointer to
the allocated memory.
As this is an exam question, we have to refer to the specification. All three are to be considered false.
realloc can return a pointer that is different to p.
realloc will free the memory pointed to by p if the newly allocated space is in a different location. This means that p may still point to ten zeros, but this is not something the specification guarantees.
As explained well in other answers, we don't know the size of an int (that is why we use sizeof(int)).

Need assistance in understanding this code using malloc and pointers

Here is a little snippet of code from Wikipedia's article on malloc():
int *ptr;
ptr = malloc(10 * sizeof (*ptr)); // Without a cast
ptr = (int*)malloc(10 * sizeof (int)); // With a cast
I was wondering if someone could help me understand what is going on here. So, from what I know, it seems like this is what's happening:
1) initialize an integer pointer that points to NULL. It is a pointer so its size is 4-bytes. Dereferencing this pointer will return the value NULL.
2) Since C allows for this type of automatic casting, it is safe not to include a cast-to-int-pointer. I am having trouble deciphering what exactly is being fed into the malloc function though (and why). It seems like we are getting the size of the dereferenced value of ptr. But isn't this NULL? So the size of NULL is 0, right? And why are we multiplying by 10??
3) The last line is just the same thing as above, except that a cast is explicitly declared. (cast from void pointer to int pointer).
I'm assuming we're talking about C here. The answer is different for C++.
1) is entirely off. ptr is a pointer to an int, that's all. It's uninitialized, so it has no deterministic value. Dereferencing it is undefined behaviour -- you will most certainly not get 0 out! The pointer also will most likely not point to 0. The size of ptr is sizeof(ptr), or sizeof(int*); nothing else. (At best you know that this is no larger than sizeof(void*).)
2/3) In C, never cast the result of malloc: int * p = malloc(sizeof(int) * 10);. The code allocates enough memory for 10 integers, i.e. 10 times the size of a single integer; the return value of the call is a pointer to that memory.
The first line declares a pointer to an integer, but doesn't initialize it -- so it points at some random piece of memory, probably invalid. The size of ptr is whatever size pointers to int are, likely either 4 or 8 bytes. The size of what it points at, which you'd get by dereferencing it when it points somewhere valid, is whatever size an int has.
The second line allocates enough memory for 10 ints from the heap, then assigns it to ptr. No cast is used, but the void * returned by malloc() is automatically converted to whatever type of pointer is needed when assigned. The sizeof (*ptr) gives the size of the dereferenced ptr, i.e. the size of what ptr points to (an int). For sizeof, it doesn't matter whether ptr actually points to a valid memory, just what the type would be.
The third line is just like the second, but with two changes: It explicitly casts the void * return from malloc() to an int *, to match the type of ptr; and it uses sizeof with the type name int rather than an expression of that type, like *ptr. The explicit cast is not necessary, and some people strongly oppose its use, but in the end it comes down to preference.
After either of the malloc()s ptr should point to a valid location on the heap and can be dereferenced safely, as long as malloc was successful.
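For line 2, malloc() is allocating enough memory to hold 10 ints, because sizeof (*ptr) is the size of an int, not of a pointer.
malloc() is a general-purpose function that returns void *. In C that value converts implicitly to whatever pointer type you assign it to, in the above example pointer to int; an explicit cast is only required in C++.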

Resources