Which format specifier should I be using to print the address of a variable? I am confused by the options below.
%u - unsigned integer
%x - hexadecimal value
%p - void pointer
Which would be the optimum format to print an address?
The simplest answer, assuming you don't mind the vagaries and variations in format between different platforms, is the standard %p notation.
The C99 standard (ISO/IEC 9899:1999) says in §7.19.6.1 ¶8:
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
(In C11 — ISO/IEC 9899:2011 — the information is in §7.21.6.1 ¶8.)
On some platforms the output includes a leading 0x and on others it doesn't; the letters may be in lower-case or upper-case; and the C standard doesn't even require the output to be hexadecimal, though I know of no implementation where it is not.
It is somewhat open to debate whether you should explicitly convert the pointers with a (void *) cast. It is being explicit, which is usually good (so it is what I do), and the standard says 'the argument shall be a pointer to void'. On most machines, you would get away with omitting an explicit cast. However, it would matter on a machine where the bit representation of a char * address for a given memory location is different from the 'anything else pointer' address for the same memory location. This would be a word-addressed, instead of byte-addressed, machine. Such machines are not common (probably not available) these days, but the first machine I worked on after university was one such (ICL Perq).
If you aren't happy with the implementation-defined behaviour of %p, then use C99 <inttypes.h> and uintptr_t instead:
printf("0x%" PRIXPTR "\n", (uintptr_t)your_pointer);
This allows you to fine-tune the representation to suit yourself. I chose to have the hex digits in upper-case so that the number is uniformly the same height and the characteristic dip at the start of 0xA1B2CDEF appears thus, not like 0xa1b2cdef which dips up and down along the number too. Your choice though, within very broad limits. The (uintptr_t) cast is unambiguously recommended by GCC when it can read the format string at compile time. I think it is correct to request the cast, though I'm sure there are some who would ignore the warning and get away with it most of the time.
Kerrek asks in the comments:
I'm a bit confused about standard promotions and variadic arguments. Do all pointers get standard-promoted to void*? Otherwise, if int* were, say, two bytes, and void* were 4 bytes, then it'd clearly be an error to read four bytes from the argument, non?
I was under the illusion that the C standard says that all object pointers must be the same size, so void * and int * cannot be different sizes. However, what I think is the relevant section of the C99 standard is not so emphatic (though I don't know of an implementation where object pointers actually differ in size):
§6.2.5 Types
¶26 A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.39) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
39) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
(C11 says exactly the same in the section §6.2.5, ¶28, and footnote 48.)
So, all pointers to structures must be the same size as each other, and must share the same alignment requirements, even though the structures the pointers point at may have different alignment requirements. Similarly for unions. Character pointers and void pointers must have the same size and alignment requirements. Pointers to variations on int (meaning unsigned int and signed int) must have the same size and alignment requirements as each other; similarly for other types. But the C standard doesn't formally say that sizeof(int *) == sizeof(void *). Oh well, SO is good for making you inspect your assumptions.
The C standard definitively does not require function pointers to be the same size as object pointers. That was necessary not to break the different memory models on DOS-like systems. There you could have 16-bit data pointers but 32-bit function pointers, or vice versa. This is why the C standard does not mandate that function pointers can be converted to object pointers and vice versa.
Fortunately (for programmers targeting POSIX), POSIX steps into the breach and does mandate that function pointers and data pointers are the same size:
§2.12.3 Pointer Types
All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.
Note:
The ISO C standard does not require this, but it is required for POSIX conformance.
So, it does seem that explicit casts to void * are strongly advisable for maximum reliability in the code when passing a pointer to a variadic function such as printf(). On POSIX systems, it is safe to cast a function pointer to a void pointer for printing. On other systems, it is not necessarily safe to do that, nor is it necessarily safe to pass pointers other than void * without a cast.
p is the conversion specifier to print pointers. Use this.
int a = 42;
printf("%p\n", (void *) &a);
Remember that omitting the cast is undefined behavior and that printing with p conversion specifier is done in an implementation-defined manner.
Use %p, for "pointer", and don't use anything else*. You aren't guaranteed by the standard that you are allowed to treat a pointer like any particular type of integer, so you'd actually get undefined behaviour with the integral formats. (For instance, %u expects an unsigned int, but what if void* has a different size or alignment requirement than unsigned int?)
*) [See Jonathan's fine answer!] Alternatively to %p, you can use pointer-specific macros from <inttypes.h>, added in C99.
All object pointers are implicitly convertible to void* in C, but in order to pass the pointer as a variadic argument, you have to cast it explicitly (since arbitrary object pointers are only convertible, but not identical to void pointers):
printf("x lives at %p.\n", (void*)&x);
As an alternative to the other (very good) answers, you could cast to uintptr_t or intptr_t (from stdint.h/inttypes.h) and use the corresponding integer conversion specifiers. This would allow more flexibility in how the pointer is formatted, but strictly speaking an implementation is not required to provide these typedefs.
You can use %x, %X or %p; all of them show the address on many common platforms, although only %p is specified for pointers (%x and %X formally expect an unsigned int).
If you use %x, the address is given in lowercase, for example: a3bfbc4
If you use %X, the address is given in uppercase, for example: A3BFBC4
Both forms show the same value.
If you use %x or %X, six positions are considered for the address, and if you use %p, eight positions are considered. For example:
What standard-defined integer type should I use for holding pointer to functions? Is there a (void*)-like type for functions that can hold any functions?
It's very certain that it's not [u]intptr_t because the standard said explicitly it's for pointers to objects and the standard makes a clear distinction between pointer to objects and pointer to functions.
There is no specified type for an integer type that is sufficient to encode a function pointer.
Alternatives:
Change code to negate the need for that integer type. Rarely is such an integer type truly needed.
Use an array: put unsigned char[sizeof(int (*)(int))] and int (*f)(int) within a union to allow some examination of the integer-ness of the pointer. Still, there may be padding issues. It comes down to what the code wants to do with such an integer.
Use uintmax_t and hope it is sufficient. A _Static_assert(sizeof (uintmax_t) >= sizeof (int (*)(int)), "uintmax_t too narrow"); is a reasonable precaution, though not a guarantee of success.
The limiting spec
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type. C17dr § 6.3.2.3 6
Note that even [u]intptr_t for object pointer types is not a guarantee either, as they are optional types.
It's very certain that it's not [u]intptr_t
It's true. But the distinction is only made to guarantee the correctness of some essential conversions. If you look at the 2nd edition of "The C Programming Language", then the distinction is more subtle to notice than the current wording in the standard.
If you target platforms with operating systems, then [u]intptr_t is probably exactly what you want, as:
POSIX specifies the dlsym function, which returns void * with the requirement that the result can be cast to a function pointer and be usable.
Win32 GetProcAddress is defined in such a way that, if the return value is not NULL, it will be a valid memory address that's valid for functions and/or objects.
Let's say I want to move a void* pointer by 4 bytes. Are the following equivalent:
A:
void* new_address(void* in_ptr) {
intptr_t tmp = (intptr_t)in_ptr;
intptr_t new_address = tmp + 4;
return (void*)new_address;
}
B:
void* new_address(void* in_ptr) {
char* tmp = (char*)in_ptr;
char* new_address = tmp + 4;
return (void*)new_address;
}
Are both defined behavior? Is one a more popular/accepted convention? Is there any other reason to use one over the other?
Let's only consider 64bit systems. If intptr_t is not available we can use int64_t instead.
The context is a custom memory allocator which needs to move the pointer before allocating new block of memory to a specific address (for alignment purposes). We don't know what object the resulting pointer is going to point to yet but we know we need to move it to a specific location which in the examples above is 4 bytes.
Michael Kerrisk says on page 1415 that,
The C standards make one exception to the rule that pointers of
different types need not have the same representation: pointers of the
types char * and void * are required to have the same internal
representation.
All the C standard guarantees (7.18.1.4) is that you can convert void* values to intptr_t (or uintptr_t) and back again and end up with an equal value for the pointer.
The nuance here is that we cannot apply arithmetic operations while void* is in use.
Is casting a pointer to intptr_t [...] defined behavior?
Converting a pointer to any integer type is defined and the result is implementation defined, except when result can't be represented in integer type, then it's undefined behavior. See C11 6.3.2.3p6. But intptr_t has to be able to represent void* - the behavior is defined.
, doing arithmetic on it and then casting back, defined behavior?
Any integer may be converted to any pointer type. The resulting pointer is implementation defined - there is no guarantee that adding 4 to intptr_t will increment the pointer value by 4. See C11 6.3.2.3p5.
Are both defined behavior?
Yes, however the result is implementation defined.
Is one more popular/accepted convention?
Subjective: I would say using uintptr_t is more popular than intptr_t. Converting a pointer to uintptr_t or to char* to do some arithmetic happens in some code; I can't say which is more popular.
Any other reason to use one over the other?
Not really, but I would go with char*.
When it comes to actually accessing the data behind the resulting pointer, it depends. If the resulting pointer points within the same object then you're fine (remember, the conversion is implementation defined). If the resulting pointer does not point to the same object, I believe the best interpretation comes from reading N2263 Clarifying Pointer Provenance v4 2.2.3 Q5, and I think that's: the current C11 standard does not clearly specify that, which would make the behavior not defined.
Because you tagged gcc, both code snippets should compile to equivalent code - I believe on all architectures pointers are converted 1:1 to (u)intptr_t on gcc. Gcc docs implementation defined behavior 4.7 arrays and pointers states casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined - so you're safe as long as the resulting pointer points to the same object.
The context is a custom memory allocator
See implementations of container_of and offsetof macros. Do not hardcode + 4 in your code, and if you do, do not depend on alignment requirements on accessing the resulting pointers - remember to use memcpy to safely copy the context or handle alignment properly. Do not reinvent the wheel - when in doubt see other implementations like glibc malloc.c or newlib malloc.c - they both calculate on char* in mem2chunk macro, but also happen to do calculations on uintptr_t integers.
No 'strictly conforming program' uses A. Using the result may be Undefined Behaviour, as there is no requirement for addition on an intptr_t to be reflected in the pointer value when that intptr_t is converted back to a pointer.
It is both unspecified behaviour and implementation-defined.
If the optional type intptr_t is defined all you are guaranteed is that you can convert void * to intptr_t and then convert that value back to void * and the two values will compare equal (==).
The strictly conforming way to perform pointer arithmetic is B. B is guaranteed to work if and only if the pointer in_ptr is valid and, within the largest enclosing object, there are 3 or more bytes beyond the byte it points to. It's 3 because it's valid to point to (but not dereference) the address that is (logically) one byte beyond the end of an object.
Object includes a declared object (including array) or block of memory such as returned by malloc().
All good practice is to prefer to write 'strictly conforming' programs where possible. So all good practice is to prefer B over A.
According to the standard, using the resulting pointer (as a pointer) may result in Undefined Behaviour, because the implementation may define the converted value to be a trap representation.
A strictly conforming program is defined as "A strictly conforming program shall use only those features of the language and library specified in this International Standard.3) It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit."
There's some disagreement about whether the code offered for A is unspecified or implementation defined. The standard says both because implementation-defined behaviour is a sub-category of unspecified. However because the implementation may document it as a trap representation using the value may result in Undefined Behaviour.
But I hope that is swept aside by the fact that 'strictly conforming programs' don't depend on unspecified, undefined or implementation defined behaviour.
So good practice here is certainly B.
Consider a secure environment that encrypts pointer values to deliberately confound the de-referencing of arbitrary pointer values. In principle it could provide intptr_t and be conformant.
Though I still maintain that if A doesn't work then intptr_t being an optional type it would be better to not provide it. Whether it is defined is unspecified and implementation dependent. That's because no 'strictly conforming program' uses it and it has no practical use other than to manipulate a pointer as an arithmetic type in a way not supported by pointer arithmetic on a compatible pointer type char *. The snippet in A falls into that category.
To store a void * declare a void * or char[sizeof(void*)] or malloc() or similar. To overlay a void * over an arithmetic type, declare a union and benefit that the union will be aligned for a void *.
But according to the specification, the result is unspecified and implementation-defined, no 'strictly conforming program' can rely on it, and using it may result in Undefined Behaviour.
A very long winded way of saying the answer, here, is B.
I've read a lot of answers about the %p format specifier usage in C language here in Stack Overflow, but none seems to give an explanation as to why explicit cast to void* is needed for all types but char*.
I'm of course aware about the fact that this requirement to cast to or from void* is tied with the use of variadic functions (see first comment of this answer) while non-mandatory otherwise.
Here's an example :
int i;
printf ("%p", &i);
Yields a warning about type incompatibility, saying that &i should be cast to void* (as required by the standard, see again here).
Whereas this chunk of code compiles smoothly with no complaint about type casting whatsoever:
char * m = "Hello";
printf ("%p", m);
How come char* is "relieved" from this imperative?
PS: It's maybe worth adding that I work on x86_64 architecture, as pointer type size depends on it, and using gcc as compiler on linux with -W -Wall -std=c11 -pedantic compiling options.
There is no explicit cast needed for arguments of type char*, as char * has the same representation and alignment requirement as void *.
Quoting C11, chapter §6.2.5
A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. (48) [...]
and the footnote 48)
The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
The C11 standard 6.2.5/28 says:
A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. 48)
with footnote 48 being:
The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
However 7.21.6.1 ("The fprintf function") says about %p:
The argument shall be a pointer to void.
This is apparently a contradiction. In my opinion, a sensible interpretation is to say that the intent of 6.2.5/28 is that void * and char * are in fact interchangeable as the types for function arguments which do not correspond to a prototype. (i.e. arguments to non-prototyped functions, or matching the ellipsis of a prototype of variadic function).
Apparently the compiler you're using takes a similar view.
To back this up, the specification of argument types in 7.21.6.1, if taken literally without regard to intent, has a lot of other inconsistencies that have to be disregarded in practice (e.g. it says that printf("%lx", -1); is well-defined, but printf("%u", 1); is undefined behaviour).
The reason for this requirement is the C Standard allows for different representations for pointers to different types, with 2 notable constraints:
pointers to void and char or unsigned char and their qualified versions shall have the same representation.
all pointers to structures must have the same representation as each other, and likewise all pointers to unions.
Hence on some architectures, int * and char * might have different representations, for example a different size, and they could be passed in different ways to vararg functions, causing int i = 1; printf("%p", &i); and int i = 1; printf("%p", (void*)&i); to behave differently.
Note however that the POSIX standards mandate that all pointer types have the same size and representation. Hence on a POSIX system printf("%p", &i); should behave as expected.
From C Standard#6.2.5p28
A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.48) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements. [emphasis mine]
In the GLib documentation, there is a chapter on type conversion macros.
In the discussion on converting an int to a void* pointer it says (emphasis mine):
Naively, you might try this, but it's incorrect:
gpointer p;
int i;
p = (void*) 42;
i = (int) p;
Again, that example was not correct, don't copy it. The problem is
that on some systems you need to do this:
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
(source: GLib Reference Manual for GLib 2.39.92, chapter Type Conversion Macros ).
Why is that cast to long necessary?
Should any required widening of the int not happen automatically as part of the cast to a pointer?
The glib documentation is wrong, both for their (freely chosen) example, and in general.
gpointer p;
int i;
p = (void*) 42;
i = (int) p;
and
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
will both lead to identical values of i and p on all conforming C implementations.
The example is poorly chosen, because 42 is guaranteed to be representable by int and long (C11 draft standard n1570: 5.2.4.2.1 Sizes of integer types).
A more illustrative (and testable) example would be
int f(int x)
{
void *p = (void*) x;
int r = (int)p;
return r;
}
This will round-trip the int-value iff void* can represent every value that int can, which practically means sizeof(int) <= sizeof(void*) (theoretically: padding bits, yadda, yadda, doesn't actually matter). For other integer types, same problem, same actual rule (sizeof(integer_type) <= sizeof(void*)).
Conversely, the real problem, properly illustrated:
void *p(void *x)
{
char c = (char)x;
void *r = (void*)c;
return r;
}
Wow, that can't possibly work, right? (actually, it might).
In order to round-trip a pointer (which software has done unnecessarily for a long time), you also have to ensure that the integer type you round-trip through can unambiguously represent every possible value of the pointer type.
Historically, much software was written by monkeys that assumed that pointers could round-trip through int, possibly because of K&R C's implicit int-"feature" and lots of people forgetting to #include <stdlib.h> and then casting the result of malloc() to a pointer type, thus accidentally round-tripping through int. On the machines the code was developed for, sizeof(int) == sizeof(void*), so this worked. When the switch to 64-bit machines with 64-bit addresses (pointers) happened, a lot of software expected two mutually exclusive things:
1) int is a 32-bit 2's complement integer (typically also expecting signed overflow to wrap around)
2) sizeof(int) == sizeof(void*)
Some systems (cough Windows cough) also assumed sizeof(long) == sizeof(int), most others had 64-bit long.
Consequently, on most systems, changing the round-tripping intermediate integer type to long fixed the (unnecessarily broken) code:
void *p(void *x)
{
long l = (long)x;
void *r = (void*)l;
return r;
}
except of course, on Windows. On the plus side, for most non-Windows (and non 16-bit) systems sizeof(long) == sizeof(void*) is true, so the round-trip works both ways.
So:
the example is wrong
the type chosen to guarantee round-trip doesn't guarantee round-trip
Of course, the C standard has a (naturally standard-conforming) solution in intptr_t/uintptr_t (C11 draft standard n1570: 7.20.1.4 Integer types capable of holding object pointers), which are specified to guarantee the
pointer -> integer type -> pointer
round-trip (though not the reverse).
As according to the C99: 6.3.2.3 quote:
5 An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.56)
6 Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
According to the documentation at the link you mentioned:
Pointers are always at least 32 bits in size (on all platforms GLib
intends to support). Thus you can store at least 32-bit integer values
in a pointer value.
Furthermore, long is guaranteed to be at least 32 bits.
So, the code
gpointer p;
int i;
p = (void*) (long) 42;
i = (int) (long) p;
is safer, more portable and well defined for up to 32-bit integers only, as advertised by GLib.
As I understand it, the code (void*)(long)42 is "better" than (void*)42 because it gets rid of this warning for gcc:
cast to pointer from integer of different size [-Wint-to-pointer-cast]
on environments where void* and long have the same size, but different from int. According to C99, §6.4.4.1 ¶5:
The type of an integer constant is the first of the corresponding list in which its value can be represented.
Thus, 42 is interpreted as int. Had this constant been assigned directly to a void* (when sizeof(void*)!=sizeof(int)), the above warning would pop up, and everyone wants clean compilations. This is the problem (issue?) the GLib doc is pointing to: it happens on some systems.
So, two issues:
Assign integer to pointer of same size
Assign integer to pointer of different size
Curiously enough for me is that, even though both cases have the same status on the C standard and in the gcc implementation notes (see gcc implementation notes), gcc only shows the warning for 2.
On the other hand, it is clear that casting to long is not always the solution (still, on modern ABIs sizeof(void*)==sizeof(long) most of the time); there are many possible combinations depending on the size of int, long, long long and void*, for 64-bit architectures and in general. That is why GLib developers try to find the matching integer type for pointers and assign glib_gpi_cast and glib_gpui_cast accordingly for the meson build system. Later, these meson variables are used in here to generate those conversion macros the right way (see also this for basic GLib types). Eventually, those macros first cast an integer to another integer type of the same size as void* (such conversion conforms to the standard, no warnings) for the target architecture.
This solution to get rid of that warning is arguably a bad design that is nowadays solved by intptr_t and uintptr_t, but it is possible it is there for historical reasons: intptr_t and uintptr_t are available since C99 and GLib started its development earlier, in 1998, so they found their own solution to the same problem. It seems that there were some attempts to change it:
GLib depends on various parts of a valid C99 toolchain, so it's time to
use C99 integer types wherever possible, instead of doing configure-time
discovery like it's 1997.
no success however, it seems it never got in the main branch.
In short, as I see it, the original question has changed from why this code is better to why this warning is bad (and is it a good idea to silence it?). The latter has been answered somewhere else, but this could also help:
Converting from pointer to integer or vice versa results in code that is not portable and may create unexpected pointers to invalid memory locations.
But, as I said above, this rule doesn't seem to qualify for a warning for issue number 1 above. Maybe someone else could shed some light on this topic.
My guess for the rationale behind this behaviour is that gcc decided to throw a warning whenever the original value is changed in some way, even if subtle. As gcc doc says (emphasis mine):
A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.
So, if sizes match there is no change on the bits (no extension, no truncation, no filling with zeros) and no warning is thrown.
Also, [u]intptr_t is just a typedef of the appropriate qualified integer: it is not justifiable to throw a warning when assigning [u]intptr_t to void* since it is indeed its purpose. If the rule applies to [u]intptr_t, it has to apply to typedefed integer types.
I think it is because this conversion is implementation-dependent. It is better to use uintptr_t for this purpose, because it is of the size of the pointer type in the particular implementation.
As explained in Askmish's answer, the conversion from an integer type to a pointer is implementation defined (see e.g. N1570 6.3.2.3 Pointers §5 §6 and the footnote 67).
The conversion from a pointer to an integer is implementation defined too and if the result cannot be represented in the integer type, the behavior is undefined.
On most general-purpose architectures nowadays, sizeof(int) is less than sizeof(void *), so even these lines
int n = 42;
void *p = (void *)n;
would generate a warning when compiled with clang or gcc (see e.g. here)
warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Since C99, the header <stdint.h> introduces some optional fixed-sized types. A couple, in particular, should be used here n1570 7.20.1.4 Integer types capable of holding object pointers:
The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
These types are optional.
So, while a long may be better than int, to avoid undefined behaviour the most portable (but still implementation defined) way is to use one of those types(1).
Gcc's documentation specifies how the conversion takes place.
4.7 Arrays and Pointers
The result of converting a pointer to an integer or vice versa (C90 6.3.4, C99 and C11 6.3.2.3).
A cast from pointer to integer discards most-significant bits if the pointer representation is larger than the integer type, sign-extends(2) if the pointer representation is smaller than the integer type, otherwise the bits are unchanged.
A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.
When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.
[...]
(2) Future versions of GCC may zero-extend, or use a target-defined ptr_extend pattern. Do not rely on sign extension.
Others, well...
The conversions between different integer types (int and intptr_t in this case) are mentioned in n1570 6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So, if we start from an int value n, the implementation provides an intptr_t type, and either sizeof(int) <= sizeof(intptr_t) or INTPTR_MIN <= n && n <= INTPTR_MAX holds, we can safely convert it to an intptr_t and then convert it back.
That intptr_t can be converted to a void * and then converted back to the same (1)(2) intptr_t value.
The same doesn't hold in general for a direct conversion between an int and a void *, even if in the example provided, the value (42) is small enough not to cause undefined behaviour.
I personally find the reasons given for those type conversion macros in the linked GLib documentation quite debatable (emphasis mine):
Many times GLib, GTK+, and other libraries allow you to pass "user data" to a callback, in the form of a void pointer. From time to time you want to pass an integer instead of a pointer. You could allocate an integer [...] But this is inconvenient, and it's annoying to have to free the memory at some later time.
Pointers are always at least 32 bits in size (on all platforms GLib intends to support). Thus you can store at least 32-bit integer values in a pointer value.
I'll let the reader decide whether their approach makes more sense than a simple
#include <stdio.h>
void f(void *ptr)
{
int n = *(int *)ptr;
// ^ Yes, here you may "pay" the indirection
printf("%d\n", n);
}
int main(void)
{
int n = 42;
f((void *)&n);
}
(1) I'd like to quote a passage from Steve Jessop's answer about those types:
Take this to mean what it says. It doesn't say anything about size.
uintptr_t might be the same size as a void*. It might be larger. It could conceivably be smaller, although such a C++ implementation approaches perverse. For example on some hypothetical platform where void* is 32 bits, but only 24 bits of virtual address space are used, you could have a 24-bit uintptr_t which satisfies the requirement. I don't know why an implementation would do that, but the standard permits it.
(2) Actually, the standard explicitly mentions the void* -> intptr_t/uintptr_t -> void* conversion, requiring those pointers to compare equal. It doesn't explicitly mandate that, in the case intptr_t -> void* -> intptr_t, the two integer values compare equal. It just mentions in footnote 67 that "The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.".
Is the size of the datatype "int" always equal to the size of a pointer in the C language?
I'm just curious.
Not at all, there is no guarantee that sizeof(int) == sizeof(void*). And on Linux/AMD64 sizeof(int) is 4 bytes, and sizeof(void*) is 8 bytes (same as sizeof(long) on that platform).
Recent C standards (C99 and later) define a standard header <stdint.h>, which may provide, among others, an optional integer type intptr_t that is guaranteed to be reversibly castable to and from void * (it usually, though not necessarily, has the size of a pointer).
I think that the standard does not guarantee that all pointers have the same size; in particular, pointers to functions can be "bigger" than data pointers (though I cannot name a platform where this is true). I believe that recent POSIX requires function pointers to be convertible to void * and back (e.g. for dlsym(3)).
See also this C reference and the n1570 draft C11 standard (or better)
PS. On most common platforms in 2021, sizeof(long) == sizeof(void*); 64-bit Windows (LLP64, with a 4-byte long and 8-byte pointers) is a notable exception. In the previous century, the old Intel 286 could also have been such a platform.
No. for example, in most 64bit systems, int is 4 bytes, and void* is 8.
It is not guaranteed.
And for example, in most 64-bit systems both sizes are usually different.
Even sizeof (int *) is not guaranteed to be equal to sizeof (void *).
The only guarantee for void * size is
sizeof (void *) == sizeof (char *)
== sizeof (signed char *) == sizeof (unsigned char *)
No. Some (mostly older, VAX-era) code assumes this, but it's definitely not required, and assuming it is not portable. There are real implementations where the two differ (e.g., some current 64-bit environments use a 64-bit pointer and 32-bit int).
The C language gives no guarantees of anything when it comes to integer or pointer sizes.
The size of int is typically the same as the data bus width, but not necessarily. The size of a pointer is typically the same as the address bus width, but not necessarily.
Many compilers use non-standard extensions, like the far keyword, to access data beyond the range of the default pointer type.
In addition to 64-bit systems, there are also plenty of microcontroller/microprocessor architectures where the size of int and the size of a pointer are different. Windows 3.1 and DOS are other examples.
There's no guarantee of any relation between the sizes of these two types, nor that either can be faithfully represented in the other via round-trip casts. It's all implementation-defined.
With that said, in the real world, unless you're dealing with really obscure legacy 16-bit systems or odd DSPs or such, sizeof(int) is going to be less than or equal to sizeof(void *), and you can faithfully convert int values to void * to pass them to interfaces (like pthread_create) that take a generic void * argument to avoid wasteful allocation and freeing of memory to store a single int. In particular, if you're using POSIX or Windows interfaces already, this is definitely a safe real-world assumption to make.
You should never assume void * can be faithfully represented in int (i.e. casting a pointer to int and back). This does not work on any popular real-world 64-bit systems, and the percentage of systems it works on is sure to plummet in the near future.
No. Pointer types do not have to be the same size or representation as integer types. Here are a few relevant sections from the C language standard (online draft available here):
6.2.5 Types
...
27 A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type.39) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements
as each other. All pointers to union types shall have the same representation and
alignment requirements as each other. Pointers to other types need not have the same
representation or alignment requirements.
...
39) The same representation and alignment requirements are meant to imply interchangeability as
arguments to functions, return values from functions, and members of unions.
...
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.56)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
...
56) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
No, it doesn't have to be, but it's usually the case that sizeof(long) == sizeof(void*).