I have heard that pointers should first be cast to void * to ensure consistent values across different platforms, and that they should be printed with the %p format specifier. Why is that, and what exactly are the problems?
int x=100;
int *pi=&x;
printf("value of pi is: %p",(void*)pi);
printf is a variadic function and must be passed arguments of the right types. The standard says %p takes void *.
No implicit conversion takes place for arguments passed through the ellipsis of a variadic function.
Quoting from N1570 7.21.6.1 The fprintf function
p : The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
The internal representation or size of different pointer types is not necessarily the same.
For example, on one system sizeof(void*) may be 2 while sizeof(int*) is 1.
Since printf is a variadic function, it cannot check the types of its incoming arguments. If you passed an int * to it, it could read the wrong number of bytes, because it expects a void *.
The p conversion specification in printf requires an argument of type void *. C says that if you pass an argument of another type, the call invokes undefined behavior.
Besides that, pointer objects of different types are not required to have the same representation: C does not guarantee that sizeof (void *) == sizeof (int *) for example. C only guarantees that void * has the same representation as pointers to character types.
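As a minimal sketch of the advice above, the following helper converts an int * to void * before handing it to a printf-family function. The name format_pointer is hypothetical and not taken from the posts. The text produced by %p is implementation-defined, so the sketch makes no claim about what the string looks like.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the %p rendering of `p` into `buf`.
   Returns the number of characters written, or -1 if it did not fit. */
int format_pointer(char *buf, size_t size, int *p)
{
    assert(buf != NULL && size > 0);

    /* %p requires a pointer to void, so convert explicitly. */
    int n = snprintf(buf, size, "%p", (void *)p);

    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

A caller would declare something like char buf[64], call format_pointer(buf, sizeof buf, &x), and then print buf with puts.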
I stumbled upon a question with this code
int arr[10];
scanf("%s", arr);
printf("%s", arr);
I don't know the purpose, and I'm perfectly aware of how much this code smells. But my question here is whether this is legal, and whether I can expect it to reprint the string I enter. (Provided that the input string isn't so long that it causes a buffer overflow.)
And I also wonder if there's any example where this is actually useful.
From the C standard about fprintf https://port70.net/%7Ensz/c/c11/n1570.html#7.21.6.1p8
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type.
and fscanf https://port70.net/%7Ensz/c/c11/n1570.html#7.21.6.2p12
If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence and a terminating null character, which will be added automatically.
And in this case, it's not a character array. But I seem to recall that there are some special rules about implicit conversions to and from char pointers, but I also have a memory that these do not apply to variadic functions.
For the language lawyer I would say it is illegal:
From C11: 7.21.6.
s If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. 280)
and
9 If a conversion specification is invalid, the behavior is undefined. 282) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
For any practical use I would say that I cannot think of a scenario where it might fail as long as you don't enter more characters than fit into the array.
Update after question was extended:
But I seem to recall that there are some special rules about implicit conversions to and from char pointers, but I also have a memory that these do not apply to variadic functions.
Correct, implicit conversions can only happen if the type of the destination is known. For the variable arguments of a variadic function, no such conversion is done for pointers.
And also in those cases where an implicit conversion would be done, it is only possible if one of them is void *, not between int * and char *.
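The difference between a known destination type and the ellipsis can be sketched as follows. Both function names are hypothetical and exist only for illustration. With a prototype, an int * argument is converted to const void * automatically; through the ellipsis the caller has to supply exactly the type that va_arg will read.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Prototyped parameter: the destination type is known, so an int *
   argument is converted to const void * as if by assignment. */
const void *through_prototype(const void *p)
{
    return p;
}

/* Ellipsis: only the default argument promotions apply, and they do
   not touch pointers. The caller must pass a const void * (or void *)
   because that is the type read by va_arg below. */
const void *through_ellipsis(int count, ...)
{
    va_list ap;
    const void *p = NULL;

    assert(count >= 1);
    va_start(ap, count);
    p = va_arg(ap, const void *);
    va_end(ap);
    return p;
}
```

So through_prototype(&x) is fine as written, while the second function needs through_ellipsis(1, (const void *)&x).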
... that there are some special rules about implicit conversions to and from char pointers
No, that's not correct.
For void pointers you have implicit conversion but not for char pointers. In the very old K&R days (i.e. before "void" was introduced) char pointers were used as "the generic pointer" type but still there was no implicit conversion.
Maybe you are confusing conversion with aliasing. For aliasing char pointers are special as a char pointer is allowed to alias other object types.
The conclusion is that the code has undefined behavior. Passing an int pointer when a char pointer is expected is not standard compliant.
You are allowed to access an int or an array of int using char references; C 2018 6.5 7 says character types may be used to access an object.
However, for %s, scanf and printf should be passed a char *, not an int *. Even though a char * may be used (with dereferencing) to access an int or array of int, that does not mean an int * will serve in place of a char *. When scanf or printf attempts to get the char * they expect (as by using va_arg), the fact that an int * was passed makes the behavior not defined by the C standard.
If instead you convert the int * to a char *:
scanf("%s", (char *) arr);
printf("%s", (char *) arr);
then the behavior is arguably defined by the C standard. (“Arguably” because the internals of scanf and printf are not formally specified. Presumably they align with the aliasing rules in 6.5 7.)
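A small sketch of that cast, using sscanf on an in-memory string instead of scanf so that it can be exercised without user input. The names read_word_into_ints, ints_as_text and INT_COUNT are hypothetical. The field width is an addition that is not in the question; it is there so the write can never run past the array.

```c
#include <assert.h>
#include <stdio.h>

enum { INT_COUNT = 10 };

/* Reads one whitespace-delimited word from `input` into the bytes of
   `arr`. Returns 1 on success and 0 otherwise. */
int read_word_into_ints(const char *input, int arr[INT_COUNT])
{
    assert(input != NULL && arr != NULL);

    /* Ten ints occupy at least ten bytes, so a width of 9 always
       leaves room for the terminating null character. */
    return sscanf(input, "%9s", (char *)arr) == 1;
}

/* Views the same storage as a string; character types may be used to
   access any object. */
const char *ints_as_text(const int arr[INT_COUNT])
{
    return (const char *)arr;
}
```

The string can then be printed with printf("%s", ints_as_text(arr)), which passes a pointer to a character type as %s expects.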
The program will typically compile and run, but with a warning that %s expects the argument arr to be of type char * (that is, a pointer into an array of char).
So, on such implementations, the output will be the string that the user enters.
In the code below, I have declared a pointer to int, and we all know that memcpy returns a void pointer to the destination. So if ptr is a pointer to int, why is printf("%s",ptr); totally valid? After all, ptr is not a pointer to char.
#include <stdio.h>
#include <string.h>
//Compiler version gcc 6.3.0
int main()
{
char a1[20] ={0} , a2[20] ={0};
int *ptr;
fgets(a1,20,stdin);
fgets(a2,20,stdin);
ptr = memcpy(a1,a2,strlen(a2)-1);
printf("%s \n",ptr);
if(ptr)
printf("%s",a1);
return 0;
}
First consider ptr = memcpy(a1,a2,strlen(a2)-1);. memcpy is declared as void *memcpy(void * restrict, const void * restrict, size_t), so it accepts the a1 and a2 passed to it because pointers to any unqualified object type may be converted to void * or to const void *. (Pointers to object types qualified with const may also be converted to const void *.) This follows from the rules for function calls in C 2018 6.5.2.2 7 (arguments are converted to the parameter types as if by assignment) and 6.5.16 1 (one operand is a possibly-qualified void * and the left has all the qualifiers of the right) and 6.5.16 2 (the right operand is converted to the type of the left).
Then memcpy returns a void * that is its first argument (after conversion to void *), and we attempt to assign this to ptr. This satisfies the constraints of the assignment (one of the operands is a void *), so it converts the pointer to the type of ptr, which is int *. This is governed by 6.3.2.3 7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer…
Since a1 is a char array with no alignment requested, it could have any alignment. It might not be suitable for an int. If so, then the C standard does not define the behavior of the program, per the above.
If a1 happens to be suitably aligned for an int or the C implementation successfully converts it anyway, we go on to printf("%s \n",ptr);.
printf is declared as int printf(const char * restrict, ...). For arguments corresponding to ..., there is no parameter type to convert to. Instead, the default argument promotions are performed. These affect integer and float arguments but not pointer arguments. So ptr is passed to printf unchanged, as an int *.
For a %s conversion, the printf rules in 7.21.6.1 8 say “the argument shall be a pointer to the initial element of an array of character type.” While ptr is pointing to the same place in memory as the initial element, it is a pointer to an int, not a pointer to the initial element. Therefore, it is the wrong type of argument.
7.21.6.1 9 says “… If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.” Therefore, the C standard does not define the behavior of this program.
In many C implementations, pointers are simple addresses in memory, int * and char * have the same representation, and the compiler will tolerate passing an int * for a %s conversion. In this case, printf receives the address it is expecting and will print the string in a1. That is why you observed the result you did. The C standard does not require this behavior. Because printf is part of the standard C library, the C standard permits a compiler to treat it specially when it is called with external linkage. The compiler could, hypothetically, treat the argument as having the correct type (even though it does not) and change the printf call into a loop that used ptr as if it were a char *. I am not aware of any compilers that would generate undesired code in this case, but the point is the C standard does not prohibit it.
why printf("%s",ptr); is totally valid
It isn’t - it may work as expected, but it isn’t guaranteed to. By passing an argument of the wrong type to printf, you’ve invoked undefined behavior, which simply means the compiler isn’t required to handle the situation in any particular way. You may get the expected output, you may get garbage output, you may get a runtime error, you may corrupt the state of your system, you may open a black hole to the other side of the universe.
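One way to avoid both problems (the possibly misaligned int * and the wrong argument type for %s) is to keep the result of memcpy in a char *. The sketch below is an illustration only: copy_without_newline is a hypothetical name, and it uses strcspn to find the newline instead of the strlen(a2)-1 from the question, which is a deliberate substitution so that an empty line does not underflow.

```c
#include <assert.h>
#include <string.h>

/* Copies `src` up to (not including) the first newline into `dst`,
   truncating to fit, and returns `dst` as a char *.
   `dst` and `src` must not overlap, as required by memcpy. */
char *copy_without_newline(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strcspn(src, "\n");
    if (len >= dst_size)
        len = dst_size - 1;

    /* memcpy returns void *; converting that to char * raises no
       alignment concerns, unlike converting it to int *. */
    char *result = memcpy(dst, src, len);
    result[len] = '\0';
    return result;
}
```

The returned pointer has the type that %s requires, so printf("%s\n", copy_without_newline(a1, sizeof a1, a2)) is well defined.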
Which format specifier should I be using to print the address of a variable? I am confused between the below lot.
%u - unsigned integer
%x - hexadecimal value
%p - void pointer
Which would be the optimum format to print an address?
The simplest answer, assuming you don't mind the vagaries and variations in format between different platforms, is the standard %p notation.
The C99 standard (ISO/IEC 9899:1999) says in §7.19.6.1 ¶8:
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
(In C11 — ISO/IEC 9899:2011 — the information is in §7.21.6.1 ¶8.)
On some platforms, that will include a leading 0x and on others it won't, and the letters could be in lower-case or upper-case, and the C standard doesn't even define that it shall be hexadecimal output though I know of no implementation where it is not.
It is somewhat open to debate whether you should explicitly convert the pointers with a (void *) cast. It is being explicit, which is usually good (so it is what I do), and the standard says 'the argument shall be a pointer to void'. On most machines, you would get away with omitting an explicit cast. However, it would matter on a machine where the bit representation of a char * address for a given memory location is different from the 'anything else pointer' address for the same memory location. This would be a word-addressed, instead of byte-addressed, machine. Such machines are not common (probably not available) these days, but the first machine I worked on after university was one such (ICL Perq).
If you aren't happy with the implementation-defined behaviour of %p, then use C99 <inttypes.h> and uintptr_t instead:
printf("0x%" PRIXPTR "\n", (uintptr_t)your_pointer);
This allows you to fine-tune the representation to suit yourself. I chose to have the hex digits in upper-case so that the number is uniformly the same height and the characteristic dip at the start of 0xA1B2CDEF appears thus, not like 0xa1b2cdef which dips up and down along the number too. Your choice though, within very broad limits. The (uintptr_t) cast is unambiguously recommended by GCC when it can read the format string at compile time. I think it is correct to request the cast, though I'm sure there are some who would ignore the warning and get away with it most of the time.
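For reference, here is a self-contained sketch of the uintptr_t approach wrapped in a helper. The name format_address is hypothetical, and the sketch assumes the implementation provides uintptr_t, which is an optional type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: renders `p` as "0x" followed by upper-case hex
   digits. Returns the length written, or -1 if `buf` is too small. */
int format_address(char *buf, size_t size, const void *p)
{
    assert(buf != NULL && size > 0);

    /* The cast makes the argument type match PRIXPTR exactly. */
    int n = snprintf(buf, size, "0x%" PRIXPTR, (uintptr_t)p);

    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

Because the conversion goes through an integer type, the usual width and padding flags can be added to the format string if a fixed-width layout is wanted.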
Kerrek asks in the comments:
I'm a bit confused about standard promotions and variadic arguments. Do all pointers get standard-promoted to void*? Otherwise, if int* were, say, two bytes, and void* were 4 bytes, then it'd clearly be an error to read four bytes from the argument, non?
I was under the illusion that the C standard says that all object pointers must be the same size, so void * and int * cannot be different sizes. However, what I think is the relevant section of the C99 standard is not so emphatic (though I don't know of an implementation where what I suggested is true is actually false):
§6.2.5 Types
¶26 A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.39) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
39) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
(C11 says exactly the same in the section §6.2.5, ¶28, and footnote 48.)
So, all pointers to structures must be the same size as each other, and must share the same alignment requirements, even though the structures the pointers point at may have different alignment requirements. Similarly for unions. Character pointers and void pointers must have the same size and alignment requirements. Pointers to variations on int (meaning unsigned int and signed int) must have the same size and alignment requirements as each other; similarly for other types. But the C standard doesn't formally say that sizeof(int *) == sizeof(void *). Oh well, SO is good for making you inspect your assumptions.
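The guarantees and non-guarantees listed above can be probed on a given implementation with a sketch like the following. The struct and function names are made up for illustration. Equal sizes are only a necessary consequence of equal representation, so this checks less than the standard promises, and the int * comparison merely reports what the current platform happens to do.

```c
#include <assert.h>
#include <stddef.h>

struct small { char c; };
struct large { double d[8]; };

/* Required: void * and char * share representation and alignment,
   so their sizes must agree on every conforming implementation. */
static_assert(sizeof(char *) == sizeof(void *),
              "char * and void * must have the same size");

/* Required: all pointers to structure types look alike, however
   different the structures themselves are. */
int struct_pointers_match(void)
{
    return sizeof(struct small *) == sizeof(struct large *);
}

/* Not required by the C standard: reports 1 or 0 depending on the
   implementation being used. */
int int_pointer_matches_void_pointer(void)
{
    return sizeof(int *) == sizeof(void *);
}
```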
The C standard definitively does not require function pointers to be the same size as object pointers. That was necessary not to break the different memory models on DOS-like systems. There you could have 16-bit data pointers but 32-bit function pointers, or vice versa. This is why the C standard does not mandate that function pointers can be converted to object pointers and vice versa.
Fortunately (for programmers targeting POSIX), POSIX steps into the breach and does mandate that function pointers and data pointers are the same size:
§2.12.3 Pointer Types
All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.
Note:
The ISO C standard does not require this, but it is required for POSIX conformance.
So, it does seem that explicit casts to void * are strongly advisable for maximum reliability in the code when passing a pointer to a variadic function such as printf(). On POSIX systems, it is safe to cast a function pointer to a void pointer for printing. On other systems, it is not necessarily safe to do that, nor is it necessarily safe to pass pointers other than void * without a cast.
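Where the POSIX guarantee is not available, one alternative that stays within ISO C is to print the bytes of the function pointer object instead of converting it to void *. This technique is not from the answer above; it is a sketch with hypothetical names (compare_fn, format_function_pointer), and it assumes 8-bit bytes so that each byte fits in two hex digits. The digits appear in memory order, so on a little-endian machine they will not read like a conventional address.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int (*compare_fn)(const char *, const char *);

/* Writes the object representation of `fn` as hex digits, two per
   byte. Returns the number of digits written, or 0 if `buf` is too
   small to hold them plus the terminating null character. */
size_t format_function_pointer(char *buf, size_t size, compare_fn fn)
{
    unsigned char bytes[sizeof fn];

    assert(buf != NULL);
    if (size < 2 * sizeof fn + 1)
        return 0;

    /* Any object may be copied out as an array of unsigned char. */
    memcpy(bytes, &fn, sizeof fn);
    for (size_t i = 0; i < sizeof fn; i++)
        snprintf(buf + 2 * i, 3, "%02x", (unsigned)bytes[i]);

    return 2 * sizeof fn;
}
```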
p is the conversion specifier to print pointers. Use this.
int a = 42;
printf("%p\n", (void *) &a);
Remember that omitting the cast is undefined behavior and that printing with p conversion specifier is done in an implementation-defined manner.
Use %p, for "pointer", and don't use anything else*. You aren't guaranteed by the standard that you are allowed to treat a pointer like any particular type of integer, so you'd actually get undefined behaviour with the integral formats. (For instance, %u expects an unsigned int, but what if void* has a different size or alignment requirement than unsigned int?)
*) [See Jonathan's fine answer!] Alternatively to %p, you can use pointer-specific macros from <inttypes.h>, added in C99.
All object pointers are implicitly convertible to void* in C, but in order to pass the pointer as a variadic argument, you have to cast it explicitly (since arbitrary object pointers are only convertible, but not identical to void pointers):
printf("x lives at %p.\n", (void*)&x);
As an alternative to the other (very good) answers, you could cast to uintptr_t or intptr_t (from stdint.h/inttypes.h) and use the corresponding integer conversion specifiers. This would allow more flexibility in how the pointer is formatted, but strictly speaking an implementation is not required to provide these typedefs.
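You can use %x, %X, or %p, but only %p is correct for a pointer; %x and %X expect an unsigned int, so passing a pointer to them is undefined behavior even where it appears to work.
If you use %x, the digits are printed in lowercase, for example: a3bfbc4
If you use %X, the digits are printed in uppercase, for example: A3BFBC4
Both can truncate the address on platforms where pointers are wider than unsigned int.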
If you use %x or %X it's considering six positions for the address, and if you use %p it's considering eight positions for the address. For example:
I've read a lot of answers about the %p format specifier usage in C language here in Stack Overflow, but none seems to give an explanation as to why explicit cast to void* is needed for all types but char*.
I'm of course aware about the fact that this requirement to cast to or from void* is tied with the use of variadic functions (see first comment of this answer) while non-mandatory otherwise.
Here's an example :
int i;
printf ("%p", &i);
Yields a warning about type incompatibility, saying that &i should be cast to void * (as required by the standard, see again here).
Whereas this chunk of code compiles smoothly with no complaint about type casting whatsoever:
char * m = "Hello";
printf ("%p", m);
How come char * is "relieved" from this requirement?
PS: It's maybe worth adding that I work on x86_64 architecture, as pointer type size depends on it, and using gcc as compiler on linux with -W -Wall -std=c11 -pedantic compiling options.
There is no explicit cast needed for arguments of type char*, as char * has the same representation and alignment requirement as void *.
Quoting C11, chapter §6.2.5
A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. (48) [...]
and the footnote 48)
The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
The C11 standard 6.2.5/28 says:
A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. 48)
with footnote 48 being:
The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
However 7.21.6.1 ("The fprintf function") says about %p:
The argument shall be a pointer to void.
This is apparently a contradiction. In my opinion, a sensible interpretation is to say that the intent of 6.2.5/28 is that void * and char * are in fact interchangeable as the types for function arguments which do not correspond to a prototype. (i.e. arguments to non-prototyped functions, or matching the ellipsis of a prototype of variadic function).
Apparently the compiler you're using takes a similar view.
To back this up, the specification of argument types in 7.21.6.1, if taken literally without regard to intent, has a lot of other inconsistencies that have to be disregarded in practice (e.g. it says that printf("%lx", -1); is well-defined, but printf("%u", 1); is undefined behaviour).
The reason for this requirement is the C Standard allows for different representations for pointers to different types, with 2 notable constraints:
pointers to void and char or unsigned char and their qualified versions shall have the same representation.
all pointers to structures must have the same representation as each other, and likewise all pointers to unions.
Hence on some architectures, int * and char * might have different representations, for example a different size, and they could be passed in different ways to vararg functions, causing int i = 1; printf("%p", &i); and int i = 1; printf("%p", (void*)&i); to behave differently.
Note, however, that the POSIX standards mandate that all pointer types have the same size and representation. Hence on a POSIX system printf("%p", &i); should behave as expected.
From C Standard#6.2.5p28
A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.48) Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements. [emphasis mine]
The last parameter of this line in main() makes me lost
// declaration
void qsort(char *linep[], int left, int right, int (*compare)(void *, void *));
// use
main(){
qsort((void**) lineptr, 0, nlines-1, (int (*)(void*,void*))(numeric ?
numcmp : strcmp));
}
I understand the ternary operator but let's say numeric == 0 then what does this mean?
(int (*)(void *, void*))strcmp;
Do datatypes of function parameters mismatch?
int strcmp(const char*, const char*);
void qsort( , , , int (*)(void *, void *));
Can I typecast a function pointer?
In your code, using the cast
(int (*)(void *, void*))strcmp;
converts strcmp to the type "pointer to a function that takes two void * arguments and returns an int".
Usually, for function pointers, casting is a very bad idea, as quoting from C11, chapter §6.3.2.3
[...] If a converted
pointer is used to call a function whose type is not compatible with the referenced type,
the behavior is undefined.
but, in your case, char * and void * have the same representation and alignment requirements, so on typical implementations the later call through the converted pointer works as intended, even though the two function types are not strictly compatible.
Yes, you can cast a function pointer to a pointer to a function with a different signature. Depending on your calling convention (who cleans the stack up? The caller or the callee?) calling that function will be bad if there is a different number of arguments or their sizes differ.
Neither is the case here: On your standard architecture (Sun workstations, Linux PCs, Raspberry Pi) the argument pointers to different data types are represented identically, so no harm is expected. The function will read the 4 or 8 byte value from the stack and interpret the memory pointed to as data of the expected type (which it should have though, e.g. don't use a float compare function on strings; it may fail because arbitrary bit patterns can be NaNs etc.).
I wanted to alert you to the fact that today's standard library qsort has a different function signature (and semantics) than K&R's example. Today's qsort gets a pointer to the beginning of an element vector and calls the compare function with pointers to the elements in the array; in the case of an array of string pointers, the arguments are pointers to pointers, which are not suitable for strcmp(). The arguments have to be dereferenced first. The Linux man page for qsort has an example of a strcmp wrapper which does just that. (The man page web export appears somewhat garbled, but is still readable.)
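A minimal sketch of such a wrapper for the standard library qsort, written from scratch and not copied from the man page. The names compare_string_pointers and sort_strings are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array elements. The
   elements here are themselves const char *, so each argument is
   really a pointer to a pointer and must be dereferenced once. */
static int compare_string_pointers(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;

    return strcmp(*left, *right);
}

/* Sorts an array of `count` string pointers into strcmp order. */
void sort_strings(const char **strings, size_t count)
{
    if (count == 0)
        return;
    assert(strings != NULL);
    qsort(strings, count, sizeof *strings, compare_string_pointers);
}
```

Because the comparator already has the exact type qsort expects, no function pointer cast is needed at the call site.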