I've been working through some C fundamentals lately to build a basic understanding of lower-level languages.
In one of the documents I've encountered (A tutorial on pointers and arrays in C)
the author uses a void pointer in a printf statement:
int var = 2;
printf("var has the value %d and is stored at %p\n", var, (void *) &var);
And states the reason:
I cast the pointers to integers into void
pointers to make them compatible with the %p conversion specification.
However, omitting the (void *) does not produce an error or warning, whether I compile and run the program normally or run it under valgrind.
int var = 2;
printf("var has the value %d and is stored at %p\n", var, &var);
Is casting to void here considered a best practice or standard, or is there something more sinister afoot?
Since printf is a variadic function, its declaration only specifies the type of the first parameter (the format string). The number and types of any remaining parameters are required to match the format string, but it's up to you, the programmer, to make sure they actually match. If they don't, the behavior is undefined, but the compiler isn't obliged to warn you about it. (Some compilers, including gcc, can do some checking if the format string is a literal.)
The %p format specifier requires an argument of type void*. If you pass a pointer of a different type, you have undefined behavior. In many implementations, all pointer types have the same size and representation, and are passed the same way as function arguments -- but the language doesn't guarantee that. By explicitly converting the pointer to void*, you guarantee that it will work correctly. By omitting the cast, you have code that will probably work as you expect it to on almost all implementations.
100% correct is better than 99% correct, especially if the only cost of that extra 1% is typing a few characters.
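A minimal sketch of the difference between a prototyped parameter and a variadic argument, assuming a hypothetical helper named format_address (it is not part of the tutorial or of any library). Because its last parameter is declared as void *, the compiler converts any object pointer for you; the value later handed to snprintf already has the type that %p requires.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper. The prototype declares p as void *, so an int *
   (or any other object pointer) passed here is converted implicitly. */
int format_address(char *buf, size_t size, void *p)
{
    assert(buf != NULL && size > 0);
    /* p is already a void *, which is exactly what %p expects,
       so no cast is needed at this variadic call. */
    return snprintf(buf, size, "%p", p);
}
```

A call such as format_address(buf, sizeof buf, &var) needs no cast, whereas a direct printf("%p\n", (void *) &var) does, because printf has no prototype information for that argument.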
Which format specifier should I be using to print the address of a variable? I am confused between the below lot.
%u - unsigned integer
%x - hexadecimal value
%p - void pointer
Which would be the optimum format to print an address?
The simplest answer, assuming you don't mind the vagaries and variations in format between different platforms, is the standard %p notation.
The C99 standard (ISO/IEC 9899:1999) says in §7.19.6.1 ¶8:
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
(In C11 — ISO/IEC 9899:2011 — the information is in §7.21.6.1 ¶8.)
On some platforms, that will include a leading 0x and on others it won't, and the letters could be in lower-case or upper-case, and the C standard doesn't even define that it shall be hexadecimal output though I know of no implementation where it is not.
It is somewhat open to debate whether you should explicitly convert the pointers with a (void *) cast. It is being explicit, which is usually good (so it is what I do), and the standard says 'the argument shall be a pointer to void'. On most machines, you would get away with omitting an explicit cast. However, it would matter on a machine where the bit representation of a char * address for a given memory location is different from the 'anything else pointer' address for the same memory location. This would be a word-addressed, instead of byte-addressed, machine. Such machines are not common (probably not available) these days, but the first machine I worked on after university was one such (ICL Perq).
If you aren't happy with the implementation-defined behaviour of %p, then use C99 <inttypes.h> and uintptr_t instead:
printf("0x%" PRIXPTR "\n", (uintptr_t)your_pointer);
This allows you to fine-tune the representation to suit yourself. I chose to have the hex digits in upper-case so that the number is uniformly the same height and the characteristic dip at the start of 0xA1B2CDEF appears thus, not like 0xa1b2cdef which dips up and down along the number too. Your choice though, within very broad limits. The (uintptr_t) cast is unambiguously recommended by GCC when it can read the format string at compile time. I think it is correct to request the cast, though I'm sure there are some who would ignore the warning and get away with it most of the time.
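As a rough sketch of how that format can be pinned down further, the hypothetical helper below (format_address_hex is an invented name) zero-pads the value to two hex digits per byte of uintptr_t. It assumes the implementation provides the optional uintptr_t type and that bytes are 8 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: writes "0x" followed by upper-case hex digits,
   zero-padded to a fixed width of 2 * sizeof(uintptr_t) digits. */
int format_address_hex(char *buf, size_t size, const void *p)
{
    assert(buf != NULL && size > 0);
    /* The * takes the field width from the int argument that follows. */
    return snprintf(buf, size, "0x%0*" PRIXPTR,
                    (int)(2 * sizeof(uintptr_t)), (uintptr_t)p);
}
```

The fixed width keeps addresses aligned in log output, which plain %p does not promise.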
Kerrek asks in the comments:
I'm a bit confused about standard promotions and variadic arguments. Do all pointers get standard-promoted to void*? Otherwise, if int* were, say, two bytes, and void* were 4 bytes, then it'd clearly be an error to read four bytes from the argument, non?
I was under the illusion that the C standard says that all object pointers must be the same size, so void * and int * cannot be different sizes. However, what I think is the relevant section of the C99 standard is not so emphatic (though I don't know of an implementation where object pointers actually differ in size):
§6.2.5 Types
¶26 A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.39) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
39) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
(C11 says exactly the same in the section §6.2.5, ¶28, and footnote 48.)
So, all pointers to structures must be the same size as each other, and must share the same alignment requirements, even though the structures the pointers point at may have different alignment requirements. Similarly for unions. Character pointers and void pointers must have the same size and alignment requirements. Pointers to variations on int (meaning unsigned int and signed int) must have the same size and alignment requirements as each other; similarly for other types. But the C standard doesn't formally say that sizeof(int *) == sizeof(void *). Oh well, SO is good for making you inspect your assumptions.
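If you want to see what your own implementation does, something like the sketch below can help. The function name and struct point are made up for illustration. Only the char * versus void * comparison is guaranteed by the standard; the rest is simply reported.

```c
#include <assert.h>
#include <stdio.h>

struct point { int x, y; };  /* sample struct, only used for its pointer type */

/* Prints a few pointer sizes and returns 1 if the sampled object pointer
   types all have the same size as void * on this implementation, else 0. */
int object_pointer_sizes_agree(void)
{
    /* Guaranteed: void * and char * share a representation. */
    assert(sizeof(char *) == sizeof(void *));

    printf("sizeof(void *)         = %zu\n", sizeof(void *));
    printf("sizeof(char *)         = %zu\n", sizeof(char *));
    printf("sizeof(int *)          = %zu\n", sizeof(int *));
    printf("sizeof(struct point *) = %zu\n", sizeof(struct point *));
    printf("sizeof(void (*)(void)) = %zu\n", sizeof(void (*)(void)));

    return sizeof(int *) == sizeof(void *)
        && sizeof(struct point *) == sizeof(void *);
}
```

A result of 1 only describes the machine you ran it on; it says nothing about what the standard requires.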
The C standard definitively does not require function pointers to be the same size as object pointers. That was necessary not to break the different memory models on DOS-like systems. There you could have 16-bit data pointers but 32-bit function pointers, or vice versa. This is why the C standard does not mandate that function pointers can be converted to object pointers and vice versa.
Fortunately (for programmers targeting POSIX), POSIX steps into the breach and does mandate that function pointers and data pointers are the same size:
§2.12.3 Pointer Types
All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.
Note:
The ISO C standard does not require this, but it is required for POSIX conformance.
So, it does seem that explicit casts to void * are strongly advisable for maximum reliability in the code when passing a pointer to a variadic function such as printf(). On POSIX systems, it is safe to cast a function pointer to a void pointer for printing. On other systems, it is not necessarily safe to do that, nor is it necessarily safe to pass pointers other than void * without a cast.
p is the conversion specifier to print pointers. Use this.
int a = 42;
printf("%p\n", (void *) &a);
Remember that omitting the cast is undefined behavior and that printing with p conversion specifier is done in an implementation-defined manner.
Use %p, for "pointer", and don't use anything else*. You aren't guaranteed by the standard that you are allowed to treat a pointer like any particular type of integer, so you'd actually get undefined behaviour with the integral formats. (For instance, %u expects an unsigned int, but what if void* has a different size or alignment requirement than unsigned int?)
*) [See Jonathan's fine answer!] Alternatively to %p, you can use pointer-specific macros from <inttypes.h>, added in C99.
All object pointers are implicitly convertible to void* in C, but in order to pass the pointer as a variadic argument, you have to cast it explicitly (since arbitrary object pointers are only convertible, but not identical to void pointers):
printf("x lives at %p.\n", (void*)&x);
As an alternative to the other (very good) answers, you could cast to uintptr_t or intptr_t (from stdint.h/inttypes.h) and use the corresponding integer conversion specifiers. This would allow more flexibility in how the pointer is formatted, but strictly speaking an implementation is not required to provide these typedefs.
You can use %x or %X or %p; all of them are correct.
If you use %x, the address is given as lowercase, for example: a3bfbc4
If you use %X, the address is given as uppercase, for example: A3BFBC4
Both of these are correct.
If you use %x or %X it's considering six positions for the address, and if you use %p it's considering eight positions for the address. For example:
When I compile this code with gcc 7.3.0, I get an "assignment from incompatible pointer type".
int intVar = 1;
char* charPointer;
charPointer = &intVar;
printf("%d", *charPointer);
So far so good. I can deal with it by doing the pointer assignment this way:
charPointer = (char*)&intVar;
Now my question is: how is the second case different? I can still mess things up if I don't cast charPointer to int* when I, for example, increment it by n or dereference it. So why does the compiler act differently in those two cases? Why should it care if the pointer type does not match during an assignment? I would just like to understand the logic behind it.
Because of the casual type change int * to char *, the compiler warns of potential pitfalls. With a cast, the compiler assumes the coders know what they are doing.
In C, the various type of pointers can live in different places, have different sizes and be encoded differently.
It is very common for all object pointers (int*, char *, struct foo *) to share the same size and encoding, but, in general, that is not required.
It is not uncommon for function pointers to be of a different size than object pointers.
char * and void * share the same size and encoding, and character pointers point to data with a minimal alignment requirement. Converting a non-char * object pointer to char * always "works". The reverse is not always true, leading to undefined behavior (UB).
Converting to a non-character pointer also risks aliasing problems (the compiler is not required to keep track that changes to data made via one pointer type are reflected in accesses via another). Very strange undefined behavior can result.
Better code avoids pointer type changes [1] and, when needed, is explicit with a cast.
[1] Exceptions include changing an object pointer to void *, or from void * when there are no alignment concerns.
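For the risky direction (from a character pointer back to some other object type), one common way to sidestep both the alignment and the aliasing concerns is to copy the bytes instead of converting the pointer. A small sketch, with read_int_from_bytes as a made-up name:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reads an int stored at an arbitrary byte address.
   Casting bytes to int * and dereferencing could be misaligned or break
   the aliasing rules; copying into a real int object avoids both. */
int read_int_from_bytes(const unsigned char *bytes)
{
    int value;

    assert(bytes != NULL);
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

The caller is still responsible for making sure at least sizeof(int) bytes are readable at that address.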
Pointer conversions are kind of dangerous.
If the pointer type you're converting from is insufficiently aligned for the target type, you get UB (undefined behavior; read up on what it is unless you have already) already at the conversion.
Otherwise, you usually get UB on dereferencing, because C's strict aliasing rules require that you access objects through lvalue types sufficiently compatible with their effective type.
While the last paragraph doesn't quite apply to conversions to char pointers as char pointers can alias any type, the warning (compilers could make it a hard error too) is still useful because the conversion is still kind of dangerous.
printf("%d", *(char*)&(int){0xFFFF});
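will get you only the first byte (it is endianness dependent whether that is the most significant one or the least significant one). On a little-endian machine that byte is 0xFF, so this prints 255 (if the implementation's char type is unsigned) or -1 (if it is signed); on a big-endian machine with a 32-bit int that byte is 0x00, so this prints 0.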
printf("%d", *&(int){0xFFFF});
will get you all the bytes that are in an int.
If the compiler lets you assign to a char * from an int * with just a warning, it should behave the same as with the cast, but you do need the cast for your C to be conformant (and for the compiler to be silent about the conversion).
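To make the "first byte" point concrete, here is a small sketch (first_byte_of is an invented name) that looks at the lowest-addressed byte of any object through unsigned char, which is allowed to alias every type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the byte stored at the lowest address of
   an object. Reading through unsigned char is always permitted. */
unsigned char first_byte_of(const void *object, size_t size)
{
    assert(object != NULL && size > 0);
    return *(const unsigned char *)object;
}
```

For an int holding 0xFFFF, assuming int is at least 32 bits wide, this yields 0xFF on a little-endian machine and 0x00 on a big-endian one.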
As @Mike Holt says, it's not actually different, except that you have told the compiler "Don't worry, I mean to do this".
The compiler does worry because assigning to a pointer of a different type is usually not what you want to do. Your code is telling the compiler "Treat the memory holding this variable as if it were holding a variable of a different type". This is almost certainly platform specific behavior, and possibly undefined behavior, depending on the types.
I have heard that pointers should first be cast to void to ensure consistency of values across different platforms and should use %p format specifier. Why is it and what exactly are the problems?
int x=100;
int *pi=&x;
printf("value of pi is: %p",(void*)pi);
printf is a variadic function and must be passed arguments of the right types. The standard says %p takes void *.
Implicit conversion to void * doesn't take place for the variadic arguments of a function.
Quoting from N1570 7.21.6.1 The fprintf function
p : The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
The internal representation or size of different pointer types is not necessarily the same.
For example, on one system sizeof(void*) may be 2, but sizeof(int*) is 1.
Since printf is a variable-argument function, it cannot check the types of incoming arguments. If you passed an int * to it, it could read the wrong number of bytes, because it expects void *.
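The same constraint shows up in any variadic function you write yourself. The sketch below uses a hypothetical count_non_null function to show the callee's side: it has no type information and simply reads each trailing argument as void *, much as printf does for %p.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts the non-null pointers among
   `count` trailing arguments. */
int count_non_null(int count, ...)
{
    va_list args;
    int found = 0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* Each argument must really have been passed as void *
           (the standard also permits a character pointer here). */
        void *p = va_arg(args, void *);
        if (p != NULL)
            found++;
    }
    va_end(args);
    return found;
}
```

A caller would write count_non_null(2, (void *)&a, (void *)&b); dropping the casts for, say, an int * would make the va_arg read undefined, even if it happens to work on common platforms.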
The p conversion specification in printf requires an argument of type void *. C says that if you pass an argument of another type, the call invokes undefined behavior.
Besides that, pointer objects of different types are not required to have the same representation: C does not guarantee that sizeof (void *) == sizeof (int *) for example. C only guarantees that void * has the same representation as pointers to character types.
Why does the code run without error?
#include <stdio.h>
int main() {
int i="string"; //the base of string can be stored in a character pointer
printf("%s\n",i);
printf("%d",i);
return 0;
}
//compiling on ideone.com language c
OUTPUT:
string
134513984 //some garbage(address of "string")
Please explain if there is some flexibility in the pointer in c. I tried it for c++ which gives error: cannot convert ‘const char*’ to ‘int*’ in initialization
No, you cannot assume this in general. In part, this is because int may not be the same size as char * (in fact, on many 64-bit compilers it will not be the same size).
If you want to store a pointer as an integer, the appropriate type to use is actually intptr_t, from <stdint.h>. This is an integer which is guaranteed to be able to hold a pointer's value.
However, the circumstances when you'd actually want to do this are somewhat rare, and when you do do this you should also include an explicit cast:
intptr_t i=(intptr_t)"string"; //the base of string can be stored in a character pointer
This also complicates printing its value, you'll need to use a macro to be portable:
printf("%"PRIiPTR,i);
To print the original string, you should also cast:
printf("%s", (char *)i);
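Putting those pieces together, a minimal sketch of the round trip might look like this. The function name is invented, and the code assumes the implementation provides the optional intptr_t type. The conversion goes through void * in both directions because that is the case the standard describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: stores a pointer in an intptr_t and recovers it. */
const char *round_trip_through_intptr(const char *s)
{
    intptr_t as_integer = (intptr_t)(const void *)s;
    const char *recovered = (const char *)(const void *)as_integer;

    /* The recovered pointer compares equal to the original. */
    assert(recovered == s);
    return recovered;
}
```

Doing the same with a plain int in place of intptr_t is where the truncation risk comes from.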
In general, no: the C standard states that conversions from pointers to integers are implementation defined. Further, this can be problematic on systems where sizeof(char *) and sizeof(int) are different (e.g. x86-64), for two reasons:
int i = "string"; can lose information if the pointer (64 bits wide, say) cannot fit in a 32-bit integer.
printf expects a pointer to be passed in, but gets a smaller integer. It winds up reading some garbage into the full pointer, and can crash your code (or worse).
Often times, however, compilers are "smart" enough to "fix" arguments to printf. Further, you seem to be running on a platform where pointers and integers are the same size, so you got lucky.
If you compiled this program with warnings (which you should) you'd get the following complaints:
main.c:3:9: warning: incompatible pointer to integer conversion initializing 'int' with an expression of type 'char [7]' [-Wint-conversion]
int i="string"; //the base of string can be stored in a character pointer
^ ~~~~~~~~
main.c:4:19: warning: format specifies type 'char *' but the argument has type 'int' [-Wformat]
printf("%s\n",i);
~~ ^
%d
2 warnings generated.
Warnings generally mean you're doing something that could cause unexpected results.
Most C compilers will let you do this, but that doesn't make it a good idea. Here, the address of the character array "string" gets stored in i. The printf options are determining how the integer is interpreted (as an address or an integer). This can be problematic when char* is not the same size as an int (e.g. on most 64 bit machines).
The C++ compiler is more picky and won't let you compile code like this. C compilers are much more willing, although they will usually generate warnings letting the programmer know it is a bad idea.
Your code is ill-formed in both C and C++. It is illegal to do
int i = "string";
in both languages. In both languages conversion from a pointer to an integer requires an explicit cast.
The only reason your C compiler accepted it is that it was configured by default for rather loose error checking. (A rather typical situation with C compilers.) Tighten up your C compiler settings and it should issue an error for the above initialization. I.e. you can use an explicit conversion
int i = (int) "string";
with implementation-dependent results, but you can't legally do it implicitly.
In any case, the warning your compiler emitted for the above initialization is already a sufficient form of a diagnostic message for this violation.
OK, I have heard that you should cast a pointer to a generic one, i.e. void *, before printing it, and that you should use the %p placeholder instead of %d in printf when printing pointers.
I just have a feeling that it might be done to prevent truncation of large addresses while printing, or is it something else? But the point is that if you are on a machine with 64-bit pointers and 32-bit integers, using %p instead of %d will do the job by itself.
Is anyone aware of any practical situations where this casting technique is helpful?
Because the specification of the %p specifier is that it prints a void *. It doesn't know how to print any other type of pointer.
With printf the caller must convert the argument to the right type ; the printf function cannot perform any conversions because it does not have access to the type information about what arguments you actually passed in. It can only assume that you passed the right ones.
C99 7.19.6.1#8
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
C99 7.19.6.1#9
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
"Undefined behaviour" means anything can happen. In practice, you probably get away with it if the pointer you pass has the same size and representation as void * does. But you should not rely on undefined behaviour.
On modern systems, all object pointers (i.e. not function pointers) have the same size and representation. If you know you are on such a system, and it is not really important if the program misbehaves, then you probably get away with passing the wrong pointer type.
One place where it would be important is if you try to print a function pointer; as there are modern systems which have function pointers a different size to object pointers. It's not guaranteed that casting a function pointer to void * will be allowed by the compiler, but at least you'll get a compiler error message when you try it.
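If you need to print a function pointer without relying on a conversion to void *, one option that stays within ISO C is to dump the bytes of the pointer object itself. This is only a sketch: callback_fn, example_callback and format_function_pointer are invented names, the code assumes 8-bit bytes, and the digits appear in memory order rather than as a conventional address.

```c
#include <assert.h>
#include <stdio.h>

typedef void (*callback_fn)(void);

/* Sample function, present only so there is something to point at. */
void example_callback(void) { }

/* Hypothetical helper: writes the object representation of fn as hex,
   two digits per byte. Returns the number of characters written, or -1
   if the buffer is too small. */
int format_function_pointer(char *buf, size_t size, callback_fn fn)
{
    const unsigned char *bytes = (const unsigned char *)&fn;
    size_t needed = 2 * sizeof fn + 1;  /* digits plus terminating NUL */

    assert(buf != NULL);
    if (size < needed)
        return -1;
    for (size_t i = 0; i < sizeof fn; i++)
        snprintf(buf + 2 * i, 3, "%02X", (unsigned)bytes[i]);
    return (int)(2 * sizeof fn);
}
```

Note that the cast here is applied to the address of the pointer variable, which is an ordinary object, not to the function pointer value.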