Is casting a pointer to a double pointer acceptable within C? - c

I was just curious if this was correct in assigning the value 888 to c and if it is not then why. I haven't found anything saying it was not and when I looked inside the c language specifications it appeared as if it was correct.
int** ppi;
int c = 6;
ppi = (int**)(&c);
*ppi = 888;
I have used it within several IDEs and with several compilers, but none have given me an error. However, some of my friends have said that this code should throw an error.
I was trying to change the value of c without adding in an intermediate pointer.
I know the following will work, but I was not sure if doing it the above way would work as well.
int** ppi;
int* pi;
int c = 6;
pi = &c;
ppi = &pi;
**ppi = 888;

The code causes undefined behaviour in 4 different ways; it is certainly not "correct" or "acceptable" as some of the other answers seem to be suggesting.
Firstly, *ppi = 888; attempts to assign an int to an lvalue of type int * . This violates the constraint 6.5.16.1/1 of the assignment operator which lists the types that may be assigned to each other; integer to pointer is not in the list.
Being a constraint violation, the compiler must issue a diagnostic and may refuse to compile the program. If the compiler does generate a binary then that is outside the scope of the C Standard, i.e. completely undefined.
Some compilers, in their default mode of operation, will issue the diagnostic and then proceed as if you had written *ppi = (int *)888;. This brings us to the next set of issues.
The behaviour of casting 888 to int * is implementation-defined. It might not be correctly aligned (causing undefined behaviour), and it might be a trap representation (also causing undefined behaviour). Furthermore, even if those conditions pass, there is no guarantee that (int *)888 has the same size or representation as (int)888 as your code relies on.
The next major issue is that the code violates the strict aliasing rule. The object declared as int c; is written using the lvalue *ppi which is an lvalue of type int *; and int * is not compatible with int.
Yet another issue is that the write may write out of bounds. If int is 4 bytes and int * is 8 bytes, you tried to write 8 bytes into a 4-byte allocation.
Another problem from earlier in the program is that ppi = (int**)(&c); will cause undefined behaviour if c is not correctly aligned for int *, e.g. perhaps the platform has 4-byte alignment for int and 8-byte alignment for pointers.

This is not acceptable. Unless you have some really good reason to know that there's an int being stored at the memory address 888, this is invalid code which will lead to either crashes or undefined behavior if you dereference the pointer twice (and if you don't plan to do that, there's little point in using an int **).

ppi contains a pointer that points to a memory location that itself contains a pointer to an int.
int c=6; creates storage for an int and puts the value 6 into that storage giving:
ppi : [ some pointer ]
c : [ 6 ]
The line
ppi = (int**)(&c)
is telling the compiler "never mind that &c is a pointer to int; assume it's a pointer that holds a pointer to int; then store that in ppi." So at this point, ppi will contain the address of c (whatever that may be). So we have
ppi : [ &c ]
c : [ 6 ]
The next line
*ppi = 888;
is telling the compiler: "Store the value 888 at the location pointed to by ppi."
ppi points at c, which contains 6, so we'd expect the value of c to be modified to 888. But wait: c is an int, and the store through *ppi is the size of a pointer, so depending on how much space an int takes, c may not be big enough to hold it. This is the biggest problem here.

int** ppi;
int c = 6;
ppi = (int**)(&c); // Cast from int* to int** may be lossy or trap due to alignment issues
*ppi = 888; // 888 is not an int* nor implicitly convertible. Whether casting
// is allowed, and what that means, depends on the implementation
Regarding compilation giving you an error:
While the last assignment forces the compiler to give a diagnostic message, any single diagnostic is enough. Whether it is called an error or a warning, how much detail it contains, and whether it breaks the build are at the discretion of the implementation. There are probably options to control this.

Related

Assigning a short to int * fails

I understand that I can reassign a variable to a bigger type if it fits, and it's OK to do it. For example:
short s = 2;
int i = s;
long l = i;
long long ll = l;
When I try to do it with pointers it fails and I don't understand why. I have integers that I pass as arguments to functions expecting a pointer to a long long. And it hasn't failed, yet..
The other day I was going from short to int, and something weird happened. I hope someone can explain it to me. This would be the minimal code to reproduce.
short s = 2;
int* ptr_i = &s; // here ptr_i is the pointer to s, ok , but *ptr_i is definitely not 2
When I try to do it with pointers it fails and I don't understand why.
A major purpose of the type system in C is to reduce programming mistakes. A default conversion may be disallowed or diagnosed because it is symptomatic of a mistake, not because the value cannot be converted.
In int *ptr_i = &s;, &s is the address of a short, typically a 16-bit integer. If ptr_i is set to point to the same memory and *ptr_i is used, it attempts to refer to an int at that address, typically a 32-bit integer. This is generally an error; loading a 32-bit integer from a place where there is a 16-bit integer, and we do not know what is beyond it, is not usually a desired operation. The C standard does not define the behavior when this is attempted.
In fact, there are multiple things that can go wrong with this:
As described above, using *ptr_i when we only know there is a short there may produce undesired results.
The short object may have alignment that is not suitable for an int, which can cause a problem either with the pointer conversion or with using the converted pointer.
The C standard does not define the result of converting short * to int * except that, if it is properly aligned for int, the result can be converted back to short * to produce a value equal to the original pointer.
Even if short and int are the same width, say 32 bits, and the alignment is good, the C standard has rules about aliasing that allow the compiler to assume that an int * never accesses an object that was defined as short. In consequence, optimization of your program may transform it in unexpected ways.
I have integers that I pass as arguments to functions expecting a pointer to a long long.
C does allow default conversions of integers to integers that are the same width or wider, because these are not usually mistakes.

Do C pointers (always) start with a valid address memory?

Do C pointers (always) start with a valid memory address? For example, if I have the following piece of code:
int *p;
*p = 5;
printf("%i",*p); //shows 5
Why does this piece of code work? The books I have read say a pointer always needs a valid memory address, and give the following and similar examples:
int *p;
int v = 5;
p = &v;
printf("%i",*p); //shows 5
Do C pointer (always) start with a valid address memory?
No.
Why does this code work?
The code invokes undefined behavior. If it appears to work on your particular system with your particular compiler options, that's merely a coincidence.
No. Uninitialized local variables have indeterminate values, and using them in expressions where they get evaluated causes undefined behavior.
The behaviour is undefined. A C compiler can optimize the pointer access away, noting that in fact the p is not used, only the object *p, and replace the *p with q and effectively produce the program that corresponds to this source code:
#include <stdio.h>
int main(void) {
int q = 5;
printf("%i", q); //shows 5
}
Such is the case when I compile the program with GCC 7.3.0 and -O3 switch - no crash. I get a crash if I compile it without optimization. Both programs are standard-conforming interpretations of the code, namely that dereferencing a pointer that does not point to a valid object has undefined behaviour.
No.
In the past, it was common to initialize a pointer to a specific memory address (e.g. one tied to hardware).
char *start_memory_buffer = (char *)0xffffb000;
The compiler has no way to tell whether this is a valid address. This involves a cast, so it is cheating.
Consider
static int *p;
p will have the value NULL, which doesn't point to a valid object (on Linux, the kernel makes that address invalid; other operating systems could use the memory at that address to store some data).
But you may also create uninitialized variables, which have indeterminate initial values (using them is probably a mistake).

Pointer to integer and back again

First, let me emphasize that this question is legalistic in nature. I am not asking whether the following program will work, in practice, on real implementations, I am asking whether it is legal (:= not producing an undefined behavior) according to the strictest legalistic interpretation of the ISO-9899 standards (:1999 and :2011).
The question is whether it is permissible to convert a pointer to an uintptr_t integer, perform some arithmetic on that integer, return it to the same value, and convert the integer back to a pointer.
So, is the following program legal (in the sense that it does not produce an undefined behavior)?
#include <stdint.h>
#include <stdio.h>
int
main(void)
{
int answer = 42; void *ptr;
uintptr_t deepthought;
ptr = &answer;
deepthought = (uintptr_t)ptr;
ptr = 0;
deepthought ^= 0xdeadbeef;
printf("I'm thinking about it...\n");
deepthought ^= 0xdeadbeef;
ptr = (void *)deepthought;
printf("The answer is: %d\n", *((int *)ptr));
return 0;
}
Again, I am aware that this code will cause no difficulty on any real system. The question is whether it lives up to the legalese in the C standard, esp. §7.18.1.4 in ISO-9899:1999 / §7.20.1.4 in ISO-9899:2011, in which the phrase "an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer": it isn't clear whether "then converted back" allows for intermediate arithmetic computations.
To make the question a little less theoretical, here is a reason why one might wish for this kind of processing to be forbidden. If we change the example just a little bit so that the pointer is mallocated instead of pointing to a local variable, and if it happens to run on an implementation with a (conservative) garbage-collector, the memory could conceivably be reclaimed during the printf call because, at that moment, there is nothing pointing to that region of memory. So if the C standard makes the above example illegal (e.g., if nothing can be written to a pointer that did not hold a legal pointer value all the time), this provides a legalistic justification for the assumptions made by garbage-collectors.
But I repeat that the question is about the hermeneutics of the C standard, not about any practical or real-world outcome.
Yes, it must work.
From how I read the Standardese, you could write the value of deepthought to a file (say with fwrite), destroy any copies of the value in the program, then read the value from the file again (fread). The value so read and converted to a pointer must compare equal to the original pointer. I don't find any wording that forbids this.
A garbage collector which could move the address of an object would have to take such possibilities into account.

malloc return typecasting confusion

I was going through here and found that malloc can cause unwanted behaviour if we don't include stdlib.h, cast the return value and if pointer and integer size differs on the system.
Below is the code snippet given in that SO question. This was tried on 64 bit machine where pointer and integer size are different.
int main()
{
int* p;
p = (int*)malloc(sizeof(int));
*p = 10;
return 0;
}
If we don't include stdlib.h, compiler will assume malloc return type as int, and casting it and assigning to different size pointer can cause unwanted behaviour. But my question is why casting int to int* and assigning it to different size pointer can cause the problem.
Under C99 and C2011 rules, the call to malloc with no visible declaration is a constraint violation, meaning that a conforming compiler must issue a diagnostic. (This is about as close as C comes to saying that something is "illegal".) If your compiler doesn't warn about the call, you should find out what options to use to make it do so.
Under C90 rules, calling a function with no visible declaration causes the compiler to assume that the function actually returns a result of type int. Since malloc is actually defined with a return type of void*, the behavior is undefined; the compiler is not required to diagnose it, but the standard says exactly nothing about what happens when the call is evaluated.
What typically happens in practice is that the compiler generates code as if malloc were defined to return an int result. For example, malloc might put its 64-bit void* result in some particular CPU register, and the calling code might assume that that register contains a 32-bit int. (This is not a type conversion; it's just bad code that incorrectly treats a value of one type as if it were of a different type.) That (possibly garbage) int value is then converted to int* and stored in p. You might lose the high-order or low-order 32 bits of the returned pointer -- but that's only one out of arbitrarily many ways it can go wrong.
Or malloc might push its 64-bit result onto the stack, and the caller might pop only 32 bits off the stack, resulting in a stack misalignment that will cause all subsequent execution to be incorrect. For historical reasons, C compilers typically don't use this kind of calling convention, but the standard permits it.
If int, void*, and int* all happen to be the same size (as they often are on 32-bit systems), the code is likely to work -- but even that's not guaranteed. For example, a calling convention might use one register to return int results and a different one to return pointer results. Again, most existing C calling conventions allow for old bad code that makes assumptions like this.
Calling malloc requires #include <stdlib.h>, even though some compilers might not enforce that requirement. It's much easier to add the #include (and drop the cast) than to spend time thinking about what might happen if you don't.
Calling almost any function without a prototype in scope causes undefined behavior; an exception is, for example, a function whose prototype is int function(int x);.
If a pointer is larger than an int and malloc() is treated as returning int because of the implicit declaration, the returned value might not be the real address, because the address may not be representable in fewer bits.
Dereferencing it would be undefined behavior, which, by the way, you can't test for: since it's undefined, what would you expect to happen?
So, nothing to test there?

How does the OS know how much to increment different pointers?

With a 32-bit OS, we know that the pointer size is 4 bytes, so sizeof(char*) is 4 and sizeof(int*) is 4, etc. We also know that when you increment a char*, the byte address (offset) changes by sizeof(char); when you increment an int*, the byte address changes by sizeof(int).
My question is:
How does the OS know how much to increment the byte address for sizeof(YourType)?
The compiler only knows how to increment a pointer of type YourType * if it knows the size of YourType, which is the case if and only if the complete definition of YourType is known to the compiler at this point.
For example, if we have:
struct YourType *a;
struct YourOtherType *b;
struct YourType {
int x;
char y;
};
Then you are allowed to do this:
a++;
but you are not allowed to do this:
b++;
..since struct YourType is a complete type, but struct YourOtherType is an incomplete type.
The error given by gcc for the line b++; is:
error: arithmetic on pointer to an incomplete type
The OS doesn't really have anything to do with that - it's the compiler's job (as @zneak mentioned).
The compiler knows because it just compiled that struct or class - the size is, in the struct case, pretty much the sum of the sizes of all the struct's contents, plus any padding the compiler inserts for alignment.
It is primarily an issue for the C (or C++) compiler, and not primarily an issue for the OS per se.
The compiler knows its alignment rules for the basic types, and applies those rules to any type you create. It can therefore establish the alignment requirement and size of YourType, and it will ensure that it increments any YourType* variable by the correct value. The alignment rules vary by hardware (CPU), and the compiler is responsible for knowing which rules to apply.
One key point is that the size of YourType must be such that when you have an array:
YourType array[20];
then &array[1] == &array[0] + 1. The byte address of &array[1] must be incremented by sizeof(YourType), and (assuming YourType is a structure), each of the elements of array[1] must be properly aligned, just as the elements of array[0] must be properly aligned.
Also remember that types are defined in your compiled code to match the hardware you are targeting. It is entirely up to the source code to work this out.
So a C program targeting a low-end 16-bit chipset might need to define its types differently from one targeting a 32-bit system.
The programming language and compiler are what govern your types, not the OS or hardware.
Although, of course, trying to stick a 32-bit number into a 16-bit register could be a problem!
C pointers are typed, unlike in some older languages such as PL/1. This not only lets the compiler know the size of the object, but also lets it carry out widening operations and formatting. For example, when getting the data at *p, is that a float, a double, or a char? The compiler needs to know (think of division, for example).
Of course we do have a typeless pointer, a void *, which you cannot do any arithmetic with simply because the compiler has no idea how much to add to the address.

Resources