Why does printing a null char ('\0', 0) with %s actually print the "(null)" string?
Like this code:
char null_byte = '\0';
printf("null_byte: %s\n", null_byte);
...printing:
null_byte: (null)
...and it even runs without errors under Valgrind. All I get is the compiler warning: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Wformat]. (Note: I'm using gcc 4.6.3 on 32-bit Ubuntu.)
It's undefined behavior, but it happens that on your implementation:
the int value of 0 that you pass is read by %s as a null pointer
the handling of %s by printf has special-case code to identify a null pointer and print (null).
Neither of those is required by the standard. The part that is required[*], is that a char used in varargs is passed as an int.
[*] Well, it's required given that on your implementation all values of char can be represented as int. If you were on some funny implementation where char is unsigned and the same width as int, it would be passed as unsigned int. I think that funny implementation would conform to the standard.
Well, for starters, you're doing it wrong. '\0' is a character and should be printed with %c and not %s. I don't know if this is intentional for experimentation purposes.
The actual binary value of \0 is, well, 0. You're passing the value 0 where printf expects a char * pointer, so it ends up being treated as a null pointer; dereferencing it would be an invalid reference and would likely crash. Your C library's printf is preventing that with a special treatment of null pointers passed to %s.
Valgrind won't catch it because it checks the memory accesses of the running binary, not the source code (you'd need a static analyzer instead). Since printf substitutes the text "(null)" instead of dereferencing the null pointer, no invalid access takes place and Valgrind won't see anything amiss.
null_byte contains 0. When you use %s in printf, you are trying to print a string, which is the address of a char (a char *). What you do in your code is pass the address 0 (NULL) as the address of your string, which is why the output is (null).
The compiler warned you that you passed the wrong type to the %s modifier. Try printf("null_byte: %s\n", &null_byte);
Your printf statement is trying to print a string and therefore interprets the value of null_byte as a char * that has the value null. Take heed of the warning. Either do this
printf("null_byte: %s\n", &null_byte);
or this
printf("null_byte: %c\n", null_byte);
Because printf is variadic, the default argument promotions are performed on null_byte, so it gets promoted (converted) to int, value 0.
printf then reads a char * pointer, and the 0 int is interpreted as a null pointer. Your C standard library has a feature that null strings are printed as (null).
Adding an implementation example to other answers, in XV6, which is an educational re-implementation of Unix v6, if you pass a zero value to %s, it prints (null):
void printf(int fd, const char *fmt, ...) {
  uint *ap = (uint*)(void*)&fmt + 1;
  ...
  if(state == '%') {
    ...
    if(c == 's') {
      s = (char*)*ap;
      ap++;
      if(s == 0)
        s = "(null)";
      while(*s != 0){
        putc(fd, *s);
        s++;
      }
    }
  }
}
I came across an example of non-portable C code where a char pointer is an argument to a variadic C function. The example is described in the image below. The part highlighted in blue is not necessarily clear, and appears wrong. In particular, I have two questions:
Assuming that NULL was a 32-bit int 0 on a system, wouldn't the compiler implicitly convert the 32-bit int 0 to a 64-bit 0 when it encounters char *string = NULL? If not, then are we saying that each expression like char *string = NULL is non-portable and must be always replaced with an explicit cast like char *string = (char *)NULL for portable C?
If NULL was a 32-bit int 0, and char *string was 64-bit, then why would printf run out of bits to print, as suggested in the blue highlight? Shouldn't printf get the full 64 bits, since it was passed string and not NULL?
Source of the screenshot: https://wiki.sei.cmu.edu/confluence/plugins/servlet/mobile?contentId=87152357#content/view/87152357
The referenced article is wrong and should be disregarded.
Assuming that NULL was a 32-bit int 0 on a system, wouldn't the compiler implicitly convert the 32-bit int 0 to a 64-bit 0 when it encounters char *string = NULL?
An assignment automatically converts the right operand to the type of the left operand. So char *string = NULL will convert the NULL value to char *, not to “64-bit 0”.
If not, then are we saying that each expression like char *string = NULL is non-portable and must be always replaced with an explicit cast like char *string = (char *)NULL for portable C?
No, char *string = NULL is portable C code; it is strictly conforming.
If NULL was a 32-bit int 0, and char *string was 64-bit, then why would printf run out of bits to print, as suggested in the blue highlight? Shouldn't printf get the full 64 bits, since it was passed string and not NULL?
The code referenced, char* string = NULL; followed by printf("%s %d\n", string, 1);, does not pass NULL to printf. It passes string to printf, and the prior assignment converts NULL to char *. So printf is passed a char * that has the value of a null pointer. This will not cause any problem in interpreting the variable arguments to printf. (It is, however, improper to pass a null pointer for the %s conversion.)
If the call were instead printf("%s", NULL);, then there is a problem. Arguments corresponding to the ... part of a variable-argument function are not automatically converted to a parameter type. They are processed by the default argument promotions, which largely promote narrow integer types to int and promote float to double, but they will not convert an int to any type of pointer. Thus, if NULL is defined as 0, then printf("%s", NULL); passes an int where a char * is expected, and this may cause various misinterpretations of the arguments.
In consequence, never use the NULL macro as a direct argument to a function with a variable argument list. Using a pointer variable that has been assigned from NULL is okay.
Why does Code::Blocks crash when I try to run this code:
#include <stdio.h>
#include <stdlib.h>
void main(void)
{
char *ch= "Sam smith";
printf("%s\n",*ch);
}
but it works fine when I remove *ch in printf and replace it with just ch like this
void main(void)
{
char *ch= "Sam smith";
printf("%s\n",ch);
}
Shouldn't *ch mean the content of the address pointed to by ch, which is the string itself, and ch mean the address of the first element in the string? So I expected the opposite!
I use Code::Blocks 17.12, gcc version 5.1.0, on Windows 8.1.
*ch dereferences the pointer to yield the data it points to: here, the first character of the string.
Providing the wrong specifier to printf is undefined behaviour (here, passing a value where printf expects a pointer).
printf("%c\n",*ch);
would print the first character, without crashing. If you want to print the whole string using %c (or putchar), loop over the characters. But that's what %s does.
As a side note, it is better to use const when referencing string literals, so there is no risk of attempting to modify them:
const char *ch= "Sam smith";
Why does Code::Blocks crash when I try to run this code:
char *ch= "Sam smith";
printf("%s\n",*ch);
ch is of type char *.
The %s conversion specifier expects a char *.
You pass *ch, which is the dereferenced ch, i.e. of type char.
If conversion specifiers do not match the types of the arguments, bad things (undefined behavior) happen.
shouldn't *ch mean the content of the address pointed to by ch which is the string itself
There is no data type "string" in C, and thus no "pointer to string".
A "string", in C parlance, is an array of characters, or a pointer to an array of characters, with a null-byte terminator.
ch is a char *, a pointer to the first character of that array - a string, so to speak.
*ch is a char, the first character of that array.
In C, you do not dereference a pointer to a string when you want to use the whole string the pointer points to. Dereferencing it yields only the first character of the string, which is not what you want.
Furthermore, it is undefined behavior if a conversion specifier does not match the type of the corresponding argument.
Thus,
printf("%s\n",*ch);
is wrong. *ch is the first character in the string, in this case S, of type char, but the %s conversion specifier expects an argument of type char *.
printf("%s\n",ch);
is correct. ch is of type char *, as required by the conversion specifier %s.
So, Codeblocks "crashes" apparently because of the mismatching argument type.
In the following code:
#include <stdio.h>
int main(void) {
char* message = "Hello C Programmer!";
printf("%s", message);
return 0;
}
I don't fully understand why it wasn't necessary to prepend an '*' to message in the printf call. I was under the assumption that message, since it is a pointer to a char, the first letter in the double quoted string, would display the address of the 'H'.
The %s format operator requires its corresponding argument to be a char * pointer. It prints the entire string that begins at that address. A string is a sequence of characters ending with a null byte. That's why this prints the entire message.
If you supply an array as the corresponding argument, it's automatically converted to a pointer to the first character of the array. In general, whenever an array is used as an r-value, it undergoes this conversion.
You don't need to use the * operator because the argument is supposed to be a pointer. If you used *message you would only pass the 'H' character to printf(). You would do this if you were using the %c format instead of %s -- its corresponding argument should be a char.
You're right that message is a pointer, of type "pointer to char", or char *. So if you were trying to print a character (an individual character), you would certainly have needed a *:
printf("first character: %c\n", *message);
In a printf format specifier, %c expects a character, which is what *message gives you.
But you weren't trying to print a single character, you were trying to print the whole string. And in a printf format specifier, %s expects a pointer to a character, the first of usually several characters to print. So
printf("entire string: %s\n", message);
is correct.
The conversion specifier %s is used to output strings (sequences of characters terminated by the zero character '\0'). A pointer to the first character of the string is passed to printf as the argument.
You can imagine that the function internally executes the following loop
for ( ; *message != '\0'; ++message )
{
    printf( "%c", *message );
}
If you supply the expression *message, it has type char, and printf will try to interpret its value, the character 'H', as the value of a pointer. As a result the function call will have undefined behavior.
To output the value of the pointer message you should use the conversion specifier %p like
printf( "%p", message );
Or as an integer value, for example:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
char *message = "Hello C Programmer!";
printf( "The value of the pointer message is %" PRIiPTR "\n", ( intptr_t )message );
return 0;
}
In addition, a pointer must be used here.
If you passed a character, it would be passed by value, and printf would lose the original address of the character. That means it would no longer be possible to access the characters that come after the first.
You need to pass a pointer by value to ensure you're retaining the original address of the head of the string.
I compiled the following code:
#include <stdio.h>
int main(void) {
// your code goes here
char *consta = "ABC";
printf("Use of just const: %c\n", consta );
printf("Use of const[1]: %c\n", consta[1]);
printf("Use of whole string: %s", consta);
return 0;
}
However, the output that I get is:
Use of just const: P
Use of const[1]: B
Use of whole string: ABC
The second and third printf calls work as expected; however, I was expecting 'A' to be printed instead of 'P' in the first call to printf.
consta is a pointer to a character. The formatting specifier %c expects an argument of type char (character)†, not char* (pointer to character). Your code exhibits undefined behavior. Try to dereference consta instead:
printf("Use of just const: %c\n", *consta);
where *consta is equivalent to consta[0].
† Actually, the argument is of type int and is converted to unsigned char by printf(). This has to do with the argument promotion rules that apply to functions with variable arguments: an argument to printf() of type char is promoted to int before being passed, which is why printf() has to convert it back. For most programs the difference does not matter.
consta is a pointer containing the address of a string.
You're telling printf to treat that as a character, which is undefined behavior. Pointers are usually implemented as storing the address as a number, so in practice it will typically print the character whose code is the low byte of that address.
You want to pass the value at that address (which the pointer points to) by writing *consta.
char *consta = "ABC";
is a pointer to char pointing to "ABC", to be exact to the 1st element of "ABC", that is 'A'.
"Pointing to" means consta contains an address, here the address of the 'A'.
To print a pointer, that is, an address, use the conversion specifier %p:
printf("Use of just const: %p\n", (void*) consta);
I can spot three problems with your code:
The variable consta points to a constant string, so you should make it a pointer to const:
const char *consta = "ABC";
The second argument to the first print statement should be the first character in the string rather than a pointer:
printf("Use of just const: %c\n", consta[0]);
In the last print statement there is no final newline. This implies that there may be no output from it.
If you want your code to be standard ANSI C conformant you also need to change the line-comment (//...) to a block comment (/*...*/).
It is also a good idea to enable all warnings in your compiler. With the popular GCC compiler I use the following options:
-ansi -fsanitize=address -g -pedantic -Wall -Wfatal-errors
I'm quite new to C and I'm having a bit of trouble with these pieces of code:
char word[STRING_LEN];
while(num_words < ARRAY_SIZE && 1 == fscanf(infile, "%79s", &word))
When I try to compile, I get the warning:
format '%s' expects argument of type char *, but argument 3 has type char (*)[80].
Now this is remedied by using &word[0]. But shouldn't both of these point to the address at the start of the array? What am I missing here?
Cheers!
When you use %s format in fscanf, it is expected that the argument is a char* that can hold the characters being read from the stream. That explains the warning message.
In your case, &word has the same numerical value as &word[0]. However, that is not true all the time. For example, if you have:
char* word = malloc(20);
then the numerical value of &word is not equal to that of &word[0]. The compiler does not take responsibility for dealing with such distinctions; it simply expects a char * as the argument.