Cast pointer to larger int


I have a pointer. On a 32-bit system it's 32 bits. On a 64-bit system it's 64 bits.
I have a long long integer field used as an identifier, and sometimes I want to use the pointer value in there. (I never cast back to a pointer - once I've cast it to the integer field, I only ever compare it for equality).
On both 32-bit and 64-bit systems this seems safe (though not on systems whose pointers are wider than long long). Is that true?
And then, is there a way to make GCC not give the following warning only when building on platforms where this is safe (which is, at the moment, all target platforms)?
error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]

According to the standard, there is no guarantee that a pointer fits in an integer type. In practice, too, several memory models exist, mostly on personal computers, and pointer and integer types do not always have the same size (even on "conventional" computers).
You should rather use the optional types intptr_t and uintptr_t, from C99.
C11 (n1570), § 7.20.1.4
The following type designates a signed integer type with the property that any valid
pointer to void can be converted to this type, then converted back to pointer to void,
and the result will compare equal to the original pointer: intptr_t.
The following type designates an unsigned integer type with the property that any valid
pointer to void can be converted to this type, then converted back to pointer to void,
and the result will compare equal to the original pointer: uintptr_t.
Here is a small example:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int n = 42;
    int *p = &n;
    intptr_t i = (intptr_t)(void *)p;  /* pointer -> integer */
    int *q = (void *)i;                /* integer -> pointer */
    printf("%d\n", *q);
    return 0;
}

If you want an integer type that is guaranteed to be big enough to hold a pointer, consider intptr_t (signed) or uintptr_t (unsigned). Where they are provided, the standard guarantees that these types can hold a pointer. Nothing else may be assumed, and in particular not that long int is wide enough.

Related

What is the appropriate arithmetic type for a pointer?

In my code I have two char pointers, one to a string and the other as an error indicator for strtoumax() and strtod(). I am currently using the type size_t (aka unsigned long) to calculate the difference between them. Is there any type designed specifically to match the pointer size on every machine? Or do I have to check it myself with macros?

For pointer difference use ptrdiff_t. If you're just trying to store a pointer as an integer, use uintptr_t (or intptr_t).

In my code I have two char pointers, one to a string and the other as
an error indicator for strtoumax() and strtod(). I am currently using
the type size_t (aka unsigned long) to calculate the difference
between them.
Don't do that. If you want a pointer difference then compute a pointer difference:
#include <stddef.h>
// ...
ptrdiff_t difference = p2 - p1;
Note the type ptrdiff_t: it is the type of the result of subtracting two pointers.
If you want a difference in bytes instead of in units the size of the pointed-to type (including if the pointed-to type is incomplete, such as void) then first convert to pointers to char:
ptrdiff_t difference_in_bytes = (char *) p2 - (char *) p1;
(char is the smallest addressable unit of storage, but technically, it might be larger than 8 bits on some C implementations. CHAR_BIT will help you figure that out if you're concerned about such cases.)
Do not compute a pointer difference by converting to integer and performing integer arithmetic, because although the behavior of that is defined (+/- signed integer overflow), the meaning of the result is not.
Is there any type designed to specifically match pointer
type size on every machine? Or do I have to check it myself with
macros?
Yes. In stdint.h there are definitions of uintptr_t and intptr_t, which can support round-trip pointer to integer to pointer conversions without data loss. But C does not define the meaning of the value resulting from converting a pointer to an integer, so these are best used as opaque types.

To store pointers as integers you can use intptr_t and uintptr_t declared in the header <stdint.h>.
From the C Standard (7.20.1.4 Integer types capable of holding object pointers)
1 The following type designates a signed integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to pointer to void, and the result will compare
equal to the original pointer:
intptr_t
The following type designates an unsigned integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to pointer to void, and the result will compare
equal to the original pointer:
uintptr_t
These types are optional.
To store difference between two pointers you can use ptrdiff_t declared in the header <stddef.h>.
Note that you may compute the difference between two pointers only if they both point to elements of the same array, or one past its last element. Otherwise you will get undefined behavior.

Is there any difference between uintptr_t and unsigned int when unsigned int can hold any address?

Description of uintptr_t:
The following type designates an unsigned integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to pointer to void, and the result will compare
equal to the original pointer:
uintptr_t
And since any pointer can be converted to void pointer and vice versa:
A pointer to void may be converted to or from a pointer to any object
type. A pointer to any object type may be converted to a pointer to
void and back again; the result shall compare equal to the original
pointer.
Any pointer can be converted to uintptr_t and vice versa, OK.
Now, description of integers and pointers:
[Integer -> Pointer]
An integer may be converted to any pointer type. Except as previously
specified, the result is implementation-defined, might not be
correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation
[Pointer -> Integer]
Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
OK. Now, since in my system's ABI (Procedure call standard for ARM architecture) both unsigned int and pointers have same size and alignment, and my system uses plain 32bit continuous values starting from 0x0 for memory addresses, it seems that the implementation-defined gap in the conversion of
Integer -> Pointer and Pointer -> Integer
has been filled in my system, and I can safely convert pointers to unsigned integers, and there is no difference between converting a pointer into uintptr_t and converting a pointer to unsigned int in my system (both will yield same value). Am I right with my assumption? or there is something I'm missing?

Even given that unsigned int has enough bits to represent all addresses in the C implementation, the C standard does not guarantee this means that, given a void * pointer p, the expression (void *) (unsigned) p == p evaluates to true. Because the conversion from void * to an integer is implementation-defined, it might do more than simply reproduce the address as an unsigned value. It might include some bits describing the provenance of the address or a checksum, and unsigned might be insufficient to contain the necessary information to restore the original value.
Most implementations are likely to simply convert the address in the obvious way, reinterpreting the bits of the virtual memory address as an unsigned value, and no problems will arise. However, this is a feature of the implementation; it is not a requirement of the C standard.

can (u)intmax_t hold a function pointer?

Is uintmax_t guaranteed to be large enough to hold a function pointer?
I know this:
The following type designates an unsigned integer type capable of
representing any value of any unsigned integer type:
uintmax_t
and
The following type designates an unsigned integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to a pointer to void, and the result will compare
equal to the original pointer: uintptr_t
and that a void pointer may not be large enough to hold a function pointer, so uintptr_t may also not be large enough to hold one.

There is no such guarantee in the C standard.
First, there is no guarantee that any pointer can be losslessly converted to an integer type (except a null pointer). It is true that uintptr_t must be able to losslessly represent a void pointer (and thus any object pointer). However, there is no guarantee that an implementation has uintptr_t, since it and intptr_t are optional (last sentence of § 7.20.1.4).
Second, a function pointer is not an object pointer, and it is not necessarily possible to convert one to a void pointer and back. So even if uintptr_t does exist, it might not be big enough to hold a function pointer.
On an X/Open System Interface (XSI) compatible implementation (most Posix systems), you must be able to convert between void pointers and function pointers, and uintptr_t must exist. So in that case, you do have the guarantee. (The convertibility between void and function pointers is required by the dlsym system interface, which was moved from XSI to base Posix in Issue 7 (2008). However, the existence of uintptr_t continues to be an XSI extension.)

No, not in general. void* and in consequence [u]intptr_t are only guaranteed to be wide enough to hold object pointers, that is pointers to object types. Function pointers may, on some platforms, be wider and comprise more information than just an entry point to the function. So on such platforms a void* or uintptr_t has not enough bits to represent all information that would be needed.
On many platforms, function pointers have the same width as object pointers, though, and they may even allow to convert from one to another. But this is an extension of the C standard and you'd have to check with your platform documentation.

Is uintmax_t guaranteed to be large enough to hold a function pointer?
No, there is no guaranteed way, as others have already answered well.
Yet a function pointer does have a size and a bit pattern. If uintmax_t is sufficiently large, it can hold the pointer's bit pattern (and maybe more). This size check could be done at compile time.
#include <assert.h>
#include <stdio.h>
#include <stdint.h>

int foo(int x) {
    return x+x;
}

int main(void) {
    union {
        uintmax_t um;
        int (*fp)(int);
    } u = {0};
    assert(sizeof u.um >= sizeof u.fp); // This assertion may fail
    u.fp = foo;
    uintmax_t save = u.um;
    printf("%ju\n", save);
    u.um = save;
    printf("%d\n", (*u.fp)(42));
    return 0;
}
Output
4198816
84

Cast int to pointer - why cast to long first? (as in p = (void*) 42; )

In the GLib documentation, there is a chapter on type conversion macros.
In the discussion on converting an int to a void* pointer it says (emphasis mine):
Naively, you might try this, but it's incorrect:
gpointer p;
int i;
p = (void*) 42;
i = (int) p;
Again, that example was not correct, don't copy it. The problem is
that on some systems you need to do this:
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
(source: GLib Reference Manual for GLib 2.39.92, chapter Type Conversion Macros ).
Why is that cast to long necessary?
Should any required widening of the int not happen automatically as part of the cast to a pointer?

The GLib documentation is wrong, both for its (freely chosen) example and in general.
gpointer p;
int i;
p = (void*) 42;
i = (int) p;
and
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
will both lead to identical values of i and p on all conforming C implementations.
The example is poorly chosen, because 42 is guaranteed to be representable by int and long (C11 draft standard n1570: 5.2.4.2.1 Sizes of integer types).
A more illustrative (and testable) example would be
int f(int x)
{
    void *p = (void*) x;
    int r = (int)p;
    return r;
}
This will round-trip the int-value iff void* can represent every value that int can, which practically means sizeof(int) <= sizeof(void*) (theoretically: padding bits, yadda, yadda, doesn't actually matter). For other integer types, same problem, same actual rule (sizeof(integer_type) <= sizeof(void*)).
Conversely, the real problem, properly illustrated:
void *p(void *x)
{
    char c = (char)x;
    void *r = (void*)c;
    return r;
}
Wow, that can't possibly work, right? (actually, it might).
In order to round-trip a pointer (which software has done unnecessarily for a long time), you also have to ensure that the integer type you round-trip through can unambiguously represent every possible value of the pointer type.
Historically, much software was written by monkeys that assumed that pointers could round-trip through int, possibly because of K&R C's implicit-int "feature" and lots of people forgetting to #include <stdlib.h> and then casting the result of malloc() to a pointer type, thus accidentally round-tripping through int. On the machines the code was developed for, sizeof(int) == sizeof(void*), so this worked. When the switch to 64-bit machines with 64-bit addresses (pointers) happened, a lot of software expected two mutually exclusive things:
1) int is a 32-bit 2's complement integer (typically also expecting signed overflow to wrap around)
2) sizeof(int) == sizeof(void*)
Some systems (cough Windows cough) also assumed sizeof(long) == sizeof(int), most others had 64-bit long.
Consequently, on most systems, changing the round-tripping intermediate integer type to long fixed the (unnecessarily broken) code:
void *p(void *x)
{
    long l = (long)x;
    void *r = (void*)l;
    return r;
}
except of course, on Windows. On the plus side, for most non-Windows (and non 16-bit) systems sizeof(long) == sizeof(void*) is true, so the round-trip works both ways.
So:
the example is wrong
the type chosen to guarantee round-trip doesn't guarantee round-trip
Of course, the C standard has a (naturally standard-conforming) solution in intptr_t/uintptr_t (C11 draft standard n1570: 7.20.1.4 Integer types capable of holding object pointers), which are specified to guarantee the
pointer -> integer type -> pointer
round-trip (though not the reverse).

Quoting C99, 6.3.2.3:
5 An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.
6 Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
According to the documentation at the link you mentioned:
Pointers are always at least 32 bits in size (on all platforms GLib
intends to support). Thus you can store at least 32-bit integer values
in a pointer value.
Furthermore, long is guaranteed to be at least 32 bits wide.
So, the code
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
is safer, more portable, and well defined only for integers of up to 32 bits, as advertised by GLib.

As I understand it, the code (void*)(long)42 is "better" than (void*)42 because it gets rid of this warning for gcc:
cast to pointer from integer of different size [-Wint-to-pointer-cast]
on environments where void* and long have the same size, but different from int. According to C99, §6.4.4.1 ¶5:
The type of an integer constant is the first of the corresponding list in which its value can be represented.
Thus, 42 is interpreted as int. Had this constant been converted directly to a void* (when sizeof(void*)!=sizeof(int)), the above warning would pop up, and everyone wants clean compilations. This is the problem (issue?) the GLib doc is pointing to: it happens on some systems.
So, two issues:
Assign integer to pointer of same size
Assign integer to pointer of different size
Curiously, even though both cases have the same status in the C standard and in the gcc implementation notes, gcc only shows the warning for case 2.
On the other hand, it is clear that casting to long is not always the solution (still, on modern ABIs sizeof(void*)==sizeof(long) most of the time); there are many possible combinations depending on the sizes of int, long, long long and void*, for 64-bit architectures and in general. That is why the GLib developers try to find the matching integer type for pointers and assign glib_gpi_cast and glib_gpui_cast accordingly for the Meson build system. Later, these Meson variables are used to generate those conversion macros the right way. Eventually, those macros first cast an integer to another integer type of the same size as void* for the target architecture (such a conversion conforms to the standard and produces no warnings).
This solution to get rid of that warning is arguably a bad design that is nowadays solved by intptr_t and uintptr_t, but it is possibly there for historical reasons: intptr_t and uintptr_t have been available since C99, and GLib started its development earlier, in 1998, so its developers found their own solution to the same problem. It seems that there were some attempts to change it:
GLib depends on various parts of a valid C99 toolchain, so it's time to
use C99 integer types wherever possible, instead of doing configure-time
discovery like it's 1997.
They did not succeed, however; it seems the change never got into the main branch.
In short, as I see it, the original question has changed from why this code is better to why this warning is bad (and whether it is a good idea to silence it). The latter has been answered elsewhere, but this could also help:
Converting from pointer to integer or vice versa results in code that is not portable and may create unexpected pointers to invalid memory locations.
But, as I said above, this rule doesn't seem to qualify for a warning for issue number 1 above. Maybe someone else could shed some light on this topic.
My guess for the rationale behind this behaviour is that gcc decided to throw a warning whenever the original value is changed in some way, even if subtle. As gcc doc says (emphasis mine):
A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.
So, if sizes match there is no change on the bits (no extension, no truncation, no filling with zeros) and no warning is thrown.
Also, [u]intptr_t is just a typedef of the appropriate qualified integer: it is not justifiable to throw a warning when assigning [u]intptr_t to void* since it is indeed its purpose. If the rule applies to [u]intptr_t, it has to apply to typedefed integer types.

I think it is because this conversion is implementation-dependent. It is better to use uintptr_t for this purpose, because it matches the size of the pointer type in the particular implementation.

As explained in Askmish's answer, the conversion from an integer type to a pointer is implementation defined (see e.g. N1570 6.3.2.3 Pointers §5 §6 and the footnote 67).
The conversion from a pointer to an integer is implementation defined too and if the result cannot be represented in the integer type, the behavior is undefined.
On most general-purpose architectures nowadays, sizeof(int) is less than sizeof(void *), so even these lines
int n = 42;
void *p = (void *)n;
would generate a warning when compiled with clang or gcc:
warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Since C99, the header <stdint.h> has provided some optional types. Two of them in particular should be used here (n1570 7.20.1.4 Integer types capable of holding object pointers):
The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
These types are optional.
So, while a long may be better than int, to avoid undefined behaviour the most portable (but still implementation defined) way is to use one of those types(1).
Gcc's documentation specifies how the conversion takes place.
4.7 Arrays and Pointers
The result of converting a pointer to an integer or vice versa (C90 6.3.4, C99 and C11 6.3.2.3).
A cast from pointer to integer discards most-significant bits if the pointer representation is larger than the integer type, sign-extends(2) if the pointer representation is smaller than the integer type, otherwise the bits are unchanged.
A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.
When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.
[...]
(2) Future versions of GCC may zero-extend, or use a target-defined ptr_extend pattern. Do not rely on sign extension.
Others, well...
The conversions between different integer types (int and intptr_t in this case) are mentioned in n1570 6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So, if we start from an int value and the implementation provides an intptr_t type and sizeof(int) <= sizeof(intptr_t) or INTPTR_MIN <= n && n <= INTPTR_MAX, we can safely convert it to an intptr_t and then convert it back.
That intptr_t can be converted to a void * and then converted back to the same (1)(2) intptr_t value.
The same doesn't hold in general for a direct conversion between an int and a void *, even if in the example provided, the value (42) is small enough not to cause undefined behaviour.
Personally, I find the reasons given for those type conversion macros in the linked GLib documentation quite debatable (emphasis mine)
Many times GLib, GTK+, and other libraries allow you to pass "user data" to a callback, in the form of a void pointer. From time to time you want to pass an integer instead of a pointer. You could allocate an integer [...] But this is inconvenient, and it's annoying to have to free the memory at some later time.
Pointers are always at least 32 bits in size (on all platforms GLib intends to support). Thus you can store at least 32-bit integer values in a pointer value.
I'll let the reader decide whether their approach makes more sense than a simple
#include <stdio.h>

void f(void *ptr)
{
    int n = *(int *)ptr;
    //      ^ Yes, here you may "pay" the indirection
    printf("%d\n", n);
}

int main(void)
{
    int n = 42;
    f((void *)&n);
}
(1) I'd like to quote a passage from Steve Jessop's answer about those types
Take this to mean what it says. It doesn't say anything about size.
uintptr_t might be the same size as a void*. It might be larger. It could conceivably be smaller, although such a C++ implementation approaches perverse. For example on some hypothetical platform where void* is 32 bits, but only 24 bits of virtual address space are used, you could have a 24-bit uintptr_t which satisfies the requirement. I don't know why an implementation would do that, but the standard permits it.
(2) Actually, the standard explicitly mentions the void* -> intptr_t/uintptr_t -> void* conversion, requiring those pointers to compare equal. It doesn't explicitly mandate that in the case intptr_t -> void* -> intptr_t the two integer values compare equal. It just mentions in footnote 67 that "The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.".

What is intptr_t, is it a type for integer or pointer?

It's defined in /usr/include/stdint.h:
typedef long int intptr_t;
Is it supposed to be a type for integer or pointer?

It is a signed integer type that is big enough to hold a pointer.

It is a signed integer type that is guaranteed to be able to hold a void* value.
And why do [u]intptr_t exist at all? Because:
Any valid pointer to void can be converted to intptr_t or uintptr_t
and back with no change in value. The C Standard
guarantees that a pointer to void may be converted to or from a
pointer to any object type and back again and that the result must
compare equal to the original pointer. Consequently, converting
directly from a char * pointer to a uintptr_t is allowed on implementations that support the uintptr_t.