Characters & Strings in Printf - c

I'm going through an exercise book on C and came across the statement
printf("%c", "\n");
Which when run in a console still works but displays the "$" symbol.
Why did this statement not crash the console like
printf("%s", '\n');
does?

A double-quoted string like that produces a pointer to char (char*), while single quotes produce a character value (the character code, such as the ASCII value, of what is in the quotes). Some compilers also let you put multiple characters inside single quotes.
printf("%c", *("\n") );
would print your linefeed, as the * operator dereferences the pointer.
(You could also write *"\n"; I just tend to be conservative in writing expressions.)
printf("%s", '\n');
crashes because %s expects a pointer, and a linefeed cast into a pointer is pointing off in the weeds and most likely causes an invalid memory access

This invokes undefined behavior by passing data of the wrong type. It just happened not to crash.
In some implementations, the pointer converted from the string literal is passed as the argument. Unlike %s, which interprets the argument as a pointer and reads from there, %c just takes the argument as a number and prints it, so it has less chance of crashing.

Because %s expects a NUL terminated string where %c only wants a char. The string will read past the end of your buffer (the single char) looking for that NUL and quite likely cause a memory exception. Or not - hence undefined behavior.

Only the latter statement asked the implementation to dereference an invalid pointer. An invalid value will typically show as garbage. But most possible memory location values are inaccessible and attempting to access them will cause a crash on modern operating systems.

%s prints until it reaches a '\0'. Because '\n' is a single character value and not a terminated string, %s will read into arbitrary memory. %c only needs a char.

Related

String Unexpected Behavior in C

I was writing a simple program to read a string and then print it.
#include<stdio.h>
// void printString(char *str){
// while(*str!='\0'){
// printf("%c",*str);
// str++;
// }
// printf("\n");
// }
int main(){
char str[50];
printf("Enter Your Name:-");
scanf("%s",&str); // scanf adds '\0' automatically at the end.
char *ptr = str;
// printString(ptr);
printf("Your Name is %s%s",ptr);
return 0;
}
I wrote the above code. At first I didn't know that scanf stops when encounters a space. So I thought that I need to have two %s for printing two words but when I got to know the former part about scanf. I understood the problem but in the process, I encountered another problem which I think is unexpected behavior.
OUTPUT:-
Enter Your Name:-Rachit Mittal
Your Name is Rachit Mittal
This Output, from my understanding is unexpected.
If scanf only reads till it encounter space or new line then string contains only "RACHIT" So in the output screen even after using double %s, It should only print "Rachit" instead of "Rachit Mittal"
I wanna ask why is this so? Is this unexpected behavior because of double %s or something else?
You are invoking undefined behavior by not passing enough arguments to printf. %s%s contains two format specifiers, so printf expects two string arguments to be passed, but you only gave one. In this case, the result could be anything.
Consider enabling warnings to help diagnose these errors (e.g. compiling with -Wall).
You have two %s in the format string, but only one additional argument. So there's no argument corresponding to the second %s. This causes undefined behavior. This means that anything can happen, so nothing should be "unexpected".
It's a bit surprising that this causes it to print the second name from the input. The register or memory location that would have contained an additional argument just happened to point to the remainder of the input buffer used by stdio. This is purely coincidence.
For starters the second argument expression in this call of scanf
scanf("%s",&str);
is incorrect. The type of the expression &str is char ( * )[50] while the conversion specifier %s expects an argument of the type char *. You have to write
scanf( "%s", str );
As for your problem then according to the C Standard (7.21.6.1 The fprintf function)
2 The fprintf function writes output to the stream pointed to by
stream, under control of the string pointed to by format that
specifies how subsequent arguments are converted for output. If
there are insufficient arguments for the format, the behavior is
undefined. If the format is exhausted while arguments remain, the
excess arguments are evaluated (as always) but are otherwise ignored.
The fprintf function returns when the end of the format string is
encountered.
To read several words, you could write
scanf( "%49[^\n]", str );

Why can i not use %s instead of %c?

The whole function the question is about is about giving a two dimensional array initialized with {0} as output and making a user able to move a 1 over the field with
char wasd;
scanf("%c", &wasd);
(the function to move by changing the value of the variable wasd is not important i think)
now my question is why using
scanf("%s", &wasd);
does only work partly(sometimes the 1 keeps being at a field and appears a 2nd time at the new place though it actually should be deleted)
and
scanf("%.1s", &wasd);
leads to the field being printed out without stop until closing the execution program. I came up with using %.1s after researching the difference between %c and %s here Why does C's printf format string have both %c and %s?? If one can figure out the answer by reading through that, i am not clever or far enough with c learning to get it.
I also found this fscanf() in C - difference between %s and %c but i do not know anything about EOF which one answer says is the cause of the problem so i would prefer getting an answer without it.
Thank you for an answer
Simple as that, %s is the conversion for a (non-empty) string. A string in C always ends with a 0 byte, so any non-empty string needs at least two bytes. If you pass a pointer to a single char variable, scanf() will just overwrite whatever is in memory after that variable -- you cause undefined behavior and anything can happen.
Side note, scanf("%s", ..), even if you give it an array of char, will always overflow the buffer if something longer is entered, therefore causing undefined behavior. You have to include a field width like
char str[10];
scanf("%9s", str);
Best is not to use scanf() at all. For your single character input, you can just use getchar() (be aware it returns an int). You might also want to read my beginners' guide away from scanf.
A char variable occupies only one byte of memory, enough to hold a single character. A string (an array of characters) is different because it always ends with a null character, \0 (numeric 0). In scanf you state explicitly whether you are reading a character or a string so that scanf knows whether to add a null character at the end. So you are not supposed to use %s to read a value into a char variable.

Why output length is coming 6?

I have written a simple program to calculate length of string in this way.
I know that there are other ways too. But I just want to know why this program is giving this output.
#include <stdio.h>
int main()
{
char str[1];
printf( "%d", printf("%s", gets(str)));
return 0;
}
OUTPUT :
(null)6
Unless you always pass empty strings from the standard input, you are invoking undefined behavior, so the output could be pretty much anything, and it could crash as well. str cannot be a well-formed C string of more than zero characters.
char str[1] allocates storage room for one single character, but that character needs to be the NUL character to satisfy C string constraints. You need to create a character array large enough to hold the string that you're writing with gets.
"(null)6" as the output could mean that gets returned NULL because it failed for some reason or that the stack was corrupted in such a way that the return value was overwritten with zeroes (per the undefined behavior explanation). 6 following "(null)" is expected, as the return value of printf is the number of characters that were printed, and "(null)" is six characters long.
There's several issues with your program.
First off, you're defining a char buffer way too short, a 1 char buffer for a string can only hold one string, the empty one. This is because you need a null at the end of the string to terminate it.
Next, you're using the gets function, which is very unsafe (as your compiler almost certainly warned you), as it blindly copies input into a buffer without checking its size. As your buffer is 0+terminator characters long, any non-empty input writes past the end of your string into other areas of memory, which could and probably does contain important information, such as the saved return address on the stack. This is the classic method of smashing the stack.
Third, you're passing the output of a printf function to another printf. printf isn't designed for formatting strings and returning them; there are other functions for that. Generally the one you will want is sprintf, which writes into a string that you pass in.
Please read the documentation on this sort of thing, and if you're unsure about any specific thing read up on it before just trying to program it in. You seem confused on the basic usage of many important C functions.
It invokes undefined behavior. In this case you may get anything. str should be at least 2 bytes if you are not passing an empty string.
When you declare a variable, some space is reserved to store its value. The reserved space may have been used previously by other code and may still hold old values. When a variable goes out of scope or is freed, the value is not necessarily erased (anything goes); only the program's access to that variable is revoked.
When you read from an uninitialised location you can get anything. This is undefined behaviour, and that is what you are doing.
Output on gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 is 0
For the above program, the string handed to the inner printf is printed as "(null)", so you get "(null)6". Here "6" is the return value of printf (the number of characters successfully printed).

Why does C's printf format string have both %c and %s?

Why does C's printf format string have both %c and %s?
I know that %c represents a single character and %s represents a null-terminated string of characters, but wouldn't the string representation alone be enough?
Probably to distinguish between null terminated string and a character. If they just had %s, then every single character must also be null terminated.
char c = 'a';
In the above case, c must be null terminated. This is my assumption though :)
%s prints out chars until it reaches a 0 (or '\0', same thing).
If you just have a char x;, printing it with printf("%s", &x); - you'd have to provide the address, since %s expects a char* - would yield unexpected results, as &x + 1 might not be 0.
So you couldn't just print a single character unless it was null-terminated (very inefficient).
EDIT: As others have pointed out, the two expect different things in the var args parameters - one a pointer, the other a single char. But that difference is somewhat clear.
The issue mentioned by others, that a single character would have to be null terminated, isn't a real one. This could be dealt with by providing a precision in the format: %.1s would do the trick.
What is more important in my view is that for %s in any of its forms you'd have to provide a pointer to one or several characters. That would mean that you wouldn't be able to print rvalues (computed expressions, function returns etc) or register variables.
Edit: I am really pissed off by the reaction to this answer, so I will probably delete this, this is really not worth it. It seems that people react on this without even having read the question or knowing how to appreciate the technicality of the question.
To make that clear: I don't say that you should prefer %.1s over %c. I only say that the reasons why %c cannot be replaced by it are different from what the other answers claim. These other answers are just technically wrong. Null termination is not an issue with %s.
The printf function is a variadic function, meaning that it has variable number of arguments. Arguments are pushed on the stack before the function (printf) is called. In order for the function printf to use the stack, it needs to know information about what is in the stack, the format string is used for that purpose.
e.g.
printf( "%c", ch ); tells the function the argument 'ch'
is to be interpreted as a character and sizeof(char)
whereas
printf( "%s", s ); tells the function the argument 's' is a pointer
to a null terminated string sizeof(char*)
it is not possible inside the printf function to otherwise determine stack contents e.g. distinguishing between 'ch' and 's' because in C there is no type checking during runtime.
%s says print all the characters until you find a null (treat the variable as a pointer).
%c says print just one character (treat the variable as a character code)
Using %s for a character doesn't work because the character is going to be treated like a pointer, then it's going to try to print all the characters following that place in memory until it finds a null
Stealing from the other answers to explain it in a different way.
If you wanted to print a character using %s, you could use the following to properly pass it an address of a char and to keep it from writing garbage on the screen until finding a null.
char c = 'c';
printf("%.1s", &c);
For %s, we need to provide the address of the string, not its value.
For %c, we provide the value of characters.
If we used the %s instead of %c, how would we provide a '\0' after the characters?
I'd like to add another perspective on this fun question.
Really this comes down to data typing. I have seen answers on here that state that you could provide a pointer to the char, and provide a
"%.1s"
This could indeed be true. But the answer lies in the C designers trying to provide flexibility to the programmer, and indeed an (albeit small) way of decreasing the footprint of your application.
Sometimes a programmer might like to run a series of if-else statements or a switch-case, where the need is simply to output a character based upon the state. For this, hard coding the characters could indeed take less actual space in memory, as a single character is 8 bits versus a pointer, which is 32 or 64 bits (for 64 bit computers). A pointer will take up more space in memory.
If you would like to decrease the size through using actual chars versus pointers to chars, then there are two ways one could think to do this within printf types of operators. One would be to key off of the .1s, but how is the routine supposed to know for certain that you are truly providing a char type versus a pointer to a char or pointer to a string (array of chars)? This is why they went with the "%c", as it is different.
Fun Question :-)
C has the %c and %s format specifiers because they handle different types.
A char and a string are about as different as night and 1.
%c expects a char, which is an integer value and prints it according to encoding rules.
%s expects a pointer to a location of memory that contains char values, and prints the characters in that location according to encoding rules until it finds a 0 (null) character.
So you see, under the hood, the two cases while they look alike they have not much in common, as one works with values and the other with pointers. One is instructions for interpreting a specific integer value as an ascii char, and the other is iterating the contents of a memory location char by char and interpreting them until a zero value is encountered.
I have done an experiment with printf("%.1s", &c) and printf("%c", c).
I used the code below to test, and bash's time utility to get the running time.
#include<stdio.h>
int main(){
char c = 'a';
int i;
for(i = 0; i < 40000000; i++){
//printf("%.1s", &c); get a result of 4.3s
//printf("%c", c); get a result of 0.67s
}
return 0;
}
The result shows %c to be roughly six times faster than %.1s in this test (0.67s versus 4.3s). So, although %s can do the job of %c, %c is still needed for performance.
Since no one has provided an answer with ANY reference whatsoever, here is a printf specification from pubs.opengroup.org, which is similar to the format definition from IBM
%c
The int argument shall be converted to an unsigned char, and the resulting byte shall be written.
%s
The argument shall be a pointer to an array of char. Bytes from the array shall be written up to (but not including) any terminating null byte. If the precision is specified, no more than that many bytes shall be written. If the precision is not specified or is greater than the size of the array, the application shall ensure that the array contains a null byte.

NULL arg allowed to sscanf?

Is a NULL pointer allowed as the string to store result in in a call to sscanf?
I don't find anything about it in any documentation but it seems to be working fine. Same thing with scanf.
Example:
int main(int arc, char* argv[])
{
char* s = NULL;
sscanf("Privjet mir!", "%s", s);
printf("s: %s\n", s);
return 0;
}
Output: s: (null)
No:
Matches a sequence of non-white-space
characters; the next pointer must be a
pointer to character array that is
long enough to hold the input sequence
and the terminating null character
('\0'), which is added automatically.
The input string stops at white space
or at the maximum field width,
whichever occurs first.
(http://linux.die.net/man/3/sscanf)
As is mentioned by the other answers NULL is not valid to pass to sscanf as an additional argument.
http://www.cplusplus.com/reference/cstdio/sscanf says of additional arguments:
Depending on the format string, the function may expect a sequence of additional arguments, each containing a pointer to allocated storage where the interpretation of the extracted characters is stored with the appropriate type.
For the %s specifier these extracted characters are:
Any number of non-whitespace characters, stopping at the first whitespace character found. A terminating null character is automatically added at the end of the stored sequence.
So when the "non-whitespace characters" and the "terminating null character" are stored, there will be a segfault, which is exactly what Visual Studio will yield (you can test that this fails at http://webcompiler.cloudapp.net/).
As for non-Visual Studio compilers, libc's extraction code for the %s specifier (https://github.com/ffainelli/uClibc/blob/master/libc/stdio/_scanf.c#L1376) has the leading comment /* We might have to handle the allocation ourselves */. This is because:
The GNU C library supported the dynamic allocation conversion specifier (as a nonstandard extension) via the a character. This feature seems to be present at least as far back as glibc 2.0.
Since version 2.7, glibc also provides the m modifier for the same purpose as the a modifier.
[Source]
So because libc extracts to a buffer constructed internally to sscanf and subsequently checks that the buffer parameter has no flags set before assigning it, it will never write characters to a NULL buffer parameter.
I can't stress enough that this is non-standard, and is not guaranteed to be preserved even between minor library updates. A far better way to do this is to use the * sub-specifier which:
Indicates that the data is to be read from the stream but ignored (i.e. it is not stored in the location pointed by an argument).
[Source]
This could be accomplished like this for example:
s == NULL ? sscanf("Privjet mir!", "%*s") : sscanf("Privjet mir!", "%s", s);
Obviously the true-branch of the ternary is a no-op, but I've included it with the expectation that other data was expected to be read from the string.
The manpage says that, when using %s, the argument must be a pointer with enough space for the string and \0. So my guess would be that the behaviour in your case is undefined. It may work, it may also crash or corrupt memory and cause issues later.
No, this is not allowed.
sscanf %s expects a char* pointing to a sufficiently large buffer; printf %s wants a char* pointing to a NUL-terminated buffer. Anything else results in undefined behavior. (And that means some implementations might detect and handle a null pointer in a certain way, while other implementations might not.)
I didn't find anything in the standard explicitly concerning NULL and *printf/*scanf.
I suppose that this is undefined behavior [1], since it counts as passing an argument that is not coherent with the format specifier (§7.19.6.1 ¶13, §7.19.6.2 ¶13): %s means that you're going to pass a pointer to the first element of a character array (large enough for the acquired string for *scanf, containing a NUL-terminated string for *printf), and passing NULL doesn't satisfy this requirement.
[1] In this case UB shows as "just ignoring the acquisition" and "printing (null)"; on other platforms it may result in planes falling from the sky or the usual nasal demons.
Allocate memory for s, for example by making s a character array, then run the program.
The following will work.
#include <stdio.h>
int main(void)
{
char s[100];
/* %99[^\t] reads up to 99 characters, stopping at a tab; a scanset takes no trailing 's' */
sscanf("Privjet mir!", "%99[^\t]", s);
printf("s: %s\n", s);
return 0;
}

Resources