Is is possible to convert a char with %s conversion specifier - c

char c;
scanf("%s",&c);
Is this correct or wrong ? What happening exactly when typing EOF character in stdin ?
I know there is something like in SO but I cannot find it ?
I already know the exact right form which should be:
scanf("%hhd",&c);

No, this is not correct.
While it might appear that sending the address to the char (&c) would be equivalent, it is not the same as a "string."
Of course, C doesn't really have a string type; instead, we find an abstraction in the standard library string and printf style functions that use the convention of a character pointer to a set of bytes with a null byte (null terminator) at the end of the string.
If you tell scanf to read into a "%s" and then provide a char &, the allocation will only be for one byte. Even if the input is only a single byte, it will be null-terminated (an extra byte appended to the end), automatically causing a buffer overflow.

It's clearly wrong to use char c;scanf("%s",&c).. Format specifier "%s" reads in a sequence of non-white-space characters and appends a '\0' at the end (cf. cppreference concerning scanf). So it will most likely exceed the memory provided by a single character char c (unless the user just presses return or ctrl-d).
Use char c;scanf("%c",&c). instead.

This is wrong. You can't put a string in a char. This is undefined behavior and will probably result in memory stomping.
scanf("%c", &c);
is how you put take a char from scanf.
Better yet, avoid scanf() entirely and use getchar():
int c = getchar();
Just be aware of the return type.

Related

Why can i not use %s instead of %c?

The whole function the question is about is about giving a two dimensional array initialized with {0} as output and making a user able to move a 1 over the field with
char wasd;
scanf("%c", &wasd);
(the function to move by changing the value of the variable wasd is not important i think)
now my question is why using
scanf("%s", &wasd);
does only work partly(sometimes the 1 keeps being at a field and appears a 2nd time at the new place though it actually should be deleted)
and
scanf("%.1s", &wasd);
leads to the field being printed out without stop until closing the execution program. I came up with using %.1s after researching the difference between %c and %s here Why does C's printf format string have both %c and %s?? If one can figure out the answer by reading through that, i am not clever or far enough with c learning to get it.
I also found this fscanf() in C - difference between %s and %c but i do not know anything about EOF which one answer says is the cause of the problem so i would prefer getting an answer without it.
Thank you for an answer
Simple as that, %s is the conversion for a (non-empty) string. A string in C always ends with a 0 byte, so any non-empty string needs at least two bytes. If you pass a pointer to a single char variable, scanf() will just overwrite whatever is in memory after that variable -- you cause undefined behavior and anything can happen.
Side note, scanf("%s", ..), even if you give it an array of char, will always overflow the buffer if something longer is entered, therefore causing undefined behavior. You have to include a field width like
char str[10];
scanf("%9s", str);
Best is not to use scanf() at all. For your single character input, you can just use getchar() (be aware it returns an int). You might also want to read my beginners' guide away from scanf.
A char variable can hold only one byte of memory to hold a single character. But a string (array of characters) is different from a char variable as it is always ended with a null character \0 or numeric 0. So in scanf you specifically mentioned whether you are reading a character or a string so that scanf can add a null character at the end of a string. So you are not suppose to use a %s to read a value for a char variable

can't read first character of array in c

I'm trying to read the characters of a character array by number in c. I've stripped the program down to isolate the problem. I know this is about my misunderstanding of how arrays and memory works and I am ready to be called clueless but I would like to know what I am misunderstanding here. Here is what I have:
#include <stdio.h>
int main(int argc, char **argv) {
char buffer[] = "stuff";
printf("buffer is %s\n", buffer);
printf("first character of buffer is %s", (char)buffer[0]);
return 0;
}
You have to write the correct format specifier. Now you used %s ...what happens?
It looks for an string which is null terminated. But it doesn't find one. So it simply cant put anything in the output . That's it.
Use %c instead.
In C there is a very big difference between a character and a string.
A character is simply a number in a range of 256 different options.
A string is not really a type of its own, it is merely an array of chars (which, in C, is simply evaluated as a pointer to the first character of the string).
Now, when you type buffer[0], this is evaluated to the value at the beginning of the string (first value in the array). Indeed, this is of char type (and therefore you do not need the (char) casting, because this will not do anything in the case of your code).
What you need is to tell printf() how to evaluate the input that you give it. %s is for a string (an array of chars). But note and remember that buffer[0] is not an array of chars, but rather a char.
So you actually want to use %c, instead of %s. This tells printf() to evaluate the parameter as a char type.
What your code currently does is take the value buffer[0] (which is just a number) and consider it as a pointer to a location in memory where a string is kept, and printf() tries to print this string. But this memory location is simply invalid. It is not a location you've accessed before.
In conclusion you want:
printf("first character of buffer is %c", (char)buffer[0]);
or even simpler:
printf("first character of buffer is %c", buffer[0]);
For other specifiers of the printf() function, look here:
http://www.tutorialspoint.com/c_standard_library/c_function_printf.htm
If you want to print only a single char use %c format.
printf("first character of buffer is %c\n", (char)buffer[0]);

Why storing Unicode Characters in char works?

I have a program I made to test I/O from a terminal:
#include <stdio.h>
int main()
{
char *input[100];
scanf("%s", input);
printf("%s", input);
return 0;
}
It works as it should with ASCII characters, but it also works with Unicode characters and emoji.
Why is this?
Your code works because the input and output stream have the same encoding, and you do not do anything with c.
Basically, you type something, which is converted into a sequence of bytes, which are then stored in c, then you send back that sequence of bytes to stdout which convert them back to readable characters.
As long as the encoding and decoding process are compatible, you will get the "expected" result.
Now, what happens if you try to use standard "string" C functions? Let's assume you typed "♠Hello" in your terminal, you will get the expected output but:
strlen(c) -> 8
c[0] -> Some strange character
c[3] -> H
You see? You may be able to store whatever you want in a char array, it does not mean you should. If you want to deal with extended character sets, use wchar_t instead.
You're probably running on Linux, with your terminal set to UTF-8 so scanf produces UTF-8, and printf can output it. UTF-8 is designed such that char[] can store it. I explicitly use char[] and not char because non-ASCII characters need more than one byte.
Your program is undefined as it has undefined behavior.
scanf("%s", input);
expects a pointer to string, but
char *input[100];
input is pointer to pointer to char, char *.
Your program may work because the buffer you pass to scanf is of sufficient size to store unicode character and a characters you pass don't have a NULL byte in between them, but it may not work as well because the implementation of C on your (and any other) machine is allowed to do anything in cases of UB.

Why does C's printf format string have both %c and %s?

Why does C's printf format string have both %c and %s?
I know that %c represents a single character and %s represents a null-terminated string of characters, but wouldn't the string representation alone be enough?
Probably to distinguish between null terminated string and a character. If they just had %s, then every single character must also be null terminated.
char c = 'a';
In the above case, c must be null terminated. This is my assumption though :)
%s prints out chars until it reaches a 0 (or '\0', same thing).
If you just have a char x;, printing it with printf("%s", &x); - you'd have to provide the address, since %s expects a char* - would yield unexpected results, as &x + 1 might not be 0.
So you couldn't just print a single character unless it was null-terminated (very inefficent).
EDIT: As other have pointed out, the two expect different things in the var args parameters - one a pointer, the other a single char. But that difference is somewhat clear.
The issue that is mentioned by others that a single character would have to be null terminated isn't a real one. This could be dealt with by providing a precision to the format %.1s would do the trick.
What is more important in my view is that for %s in any of its forms you'd have to provide a pointer to one or several characters. That would mean that you wouldn't be able to print rvalues (computed expressions, function returns etc) or register variables.
Edit: I am really pissed off by the reaction to this answer, so I will probably delete this, this is really not worth it. It seems that people react on this without even having read the question or knowing how to appreciate the technicality of the question.
To make that clear: I don't say that you should prefer %.1s over %c. I only say that reasons why %c cannot be replaced by that are different than the other answer pretend to tell. These other answers are just technically wrong. Null termination is not an issue with %s.
The printf function is a variadic function, meaning that it has variable number of arguments. Arguments are pushed on the stack before the function (printf) is called. In order for the function printf to use the stack, it needs to know information about what is in the stack, the format string is used for that purpose.
e.g.
printf( "%c", ch ); tells the function the argument 'ch'
is to be interpreted as a character and sizeof(char)
whereas
printf( "%s", s ); tells the function the argument 's' is a pointer
to a null terminated string sizeof(char*)
it is not possible inside the printf function to otherwise determine stack contents e.g. distinguishing between 'ch' and 's' because in C there is no type checking during runtime.
%s says print all the characters until you find a null (treat the variable as a pointer).
%c says print just one character (treat the variable as a character code)
Using %s for a character doesn't work because the character is going to be treated like a pointer, then it's going to try to print all the characters following that place in memory until it finds a null
Stealing from the other answers to explain it in a different way.
If you wanted to print a character using %s, you could use the following to properly pass it an address of a char and to keep it from writing garbage on the screen until finding a null.
char c = 'c';
printf('%.1s', &c);
For %s, we need provide the address of string, not its value.
For %c, we provide the value of characters.
If we used the %s instead of %c, how would we provide a '\0' after the characters?
Id like to add another point of perspective to this fun question.
Really this comes down to data typing. I have seen answers on here that state that you could provide a pointer to the char, and provide a
"%.1s"
This could indeed be true. But the answer lies in the C designer's trying to provide flexibility to the programmer, and indeed a (albeit small) way of decreasing footprint of your application.
Sometimes a programmer might like to run a series of if-else statements or a switch-case, where the need is to simply output a character based upon the state. For this, hard coding the the characters could indeed take less actual space in memory as the single characters are 8 bits versus the pointer which is 32 or 64 bits (for 64 bit computers). A pointer will take up more space in memory.
If you would like to decrease the size through using actual chars versus pointers to chars, then there are two ways one could think to do this within printf types of operators. One would be to key off of the .1s, but how is the routine supposed to know for certain that you are truly providing a char type versus a pointer to a char or pointer to a string (array of chars)? This is why they went with the "%c", as it is different.
Fun Question :-)
C has the %c and %s format specifiers because they handle different types.
A char and a string are about as different as night and 1.
%c expects a char, which is an integer value and prints it according to encoding rules.
%s expects a pointer to a location of memory that contains char values, and prints the characters in that location according to encoding rules until it finds a 0 (null) character.
So you see, under the hood, the two cases while they look alike they have not much in common, as one works with values and the other with pointers. One is instructions for interpreting a specific integer value as an ascii char, and the other is iterating the contents of a memory location char by char and interpreting them until a zero value is encountered.
I have done a experiment with printf("%.1s", &c) and printf("%c", c).
I used the code below to test, and the bash's time utility the get the runing time.
#include<stdio.h>
int main(){
char c = 'a';
int i;
for(i = 0; i < 40000000; i++){
//printf("%.1s", &c); get a result of 4.3s
//printf("%c", c); get a result of 0.67s
}
return 0;
}
The result says that using %c is 10 times faster than %.1s. So, althought %s can do the job of %c, %c is still needed for performance.
Since no one has provided an answer with ANY reference whatsoever, here is a printf specification from pubs.opengroup.com which is similar to the format definition from IBM
%c
The int argument shall be converted to an unsigned char, and the resulting byte shall be written.
%s
The argument shall be a pointer to an array of char. Bytes from the array shall be written up to (but not including) any terminating null byte. If the precision is specified, no more than that many bytes shall be written. If the precision is not specified or is greater than the size of the array, the application shall ensure that the array contains a null byte.

NULL arg allowed to sscanf?

Is a NULL pointer allowed as the string to store result in in a call to sscanf?
I don't find anything about it in any documentation but it seems to be working fine. Same thing with scanf.
Example:
int main(int arc, char* argv[])
{
char* s = NULL;
sscanf("Privjet mir!", "%s", s);
printf("s: %s\n", s);
return 0;
}
Output: s: (null)
No:
Matches a sequence of non-white-space
characters; the next pointer must be a
pointer to character array that is
long enough to hold the input sequence
and the terminating null character
('\0'), which is added automatically.
The input string stops at white space
or at the maximum field width,
whichever occurs first.
(http://linux.die.net/man/3/sscanf)
As is mentioned by the other answers NULL is not valid to pass to sscanf as an additional argument.
http://www.cplusplus.com/reference/cstdio/sscanf says of additional arguments:
Depending on the format string, the function may expect a sequence of additional arguments, each containing a pointer to allocated storage where the interpretation of the extracted characters is stored with the appropriate type.
For the %s specifier these extracted characters are:
Any number of non-whitespace characters, stopping at the first whitespace character found. A terminating null character is automatically added at the end of the stored sequence.
So when the "non-whitespace characters" and "terminating null character" is stored, there will be a segfault. Which is exactly what Visual Studio will yield (you can test that this fails at http://webcompiler.cloudapp.net/):
Now as far as non-Visual Studio compilers, libc's extraction code for the %s specifier: https://github.com/ffainelli/uClibc/blob/master/libc/stdio/_scanf.c#L1376 has the leading comment: /* We might have to handle the allocation ourselves */ this is because:
The GNU C library supported the dynamic allocation conversion specifier (as a nonstandard extension) via the a character. This feature seems to be present at least as far back as glibc 2.0.
Since version 2.7, glibc also provides the m modifier for the same purpose as the a modifier.
[Source]
So because libc extracts to a buffer constructed internally to sscanf and subsequently checks that the buffer parameter has no flags set before assigning it, it will never write characters to a NULL buffer parameter.
I can't stress enough that this is non-standard, and is not guaranteed to be preserved even between minor library updates. A far better way to do this is to use the * sub-specifier which:
Indicates that the data is to be read from the stream but ignored (i.e. it is not stored in the location pointed by an argument).
[Source]
This could be accomplished like this for example:
s == NULL ? sscanf("Privjet mir!", "%*s") : sscanf("Privjet mir!", "%s", s);
Obviously the true-branch of the ternary is a no-op, but I've included it with the expectation that other data was expected to be read from the string.
The manpage says that, when using %s, the argument must be a pointer with enough space for the string and \0. So my guess would be that the behaviour in your case is undefined. It may work, it may also crash or corrupt memory and cause issues later.
No, this is not allowed.
sscanf %s expects a char* pointing to a sufficient large buffer, printf %s wants a nul char* buffer. Anything else results in undefined behavior. (And that means some implementations might detect and handle a null pointer in a certain way, other implementations might not)
I didn't find anything in the standard explicitly concerning NULL and *printf/*scanf.
I suppose that this is undefined behavior1, since it counts as passing an argument that is not coherent with the format specifier (§7.19.6.1 ¶13, §7.19.6.2 ¶13): %s means that a you're going to pass a pointer to the first element of a character array (large enough for the acquired string for *scanf, containing a NUL-terminated string for *printf) - and passing NULL doesn't satisfy this requirement.
1. In this case UB shows as "just ignoring the acquisition" and "printing (null)", on other platforms it may result in planes falling down the sky or the usual nasal demons.
Allocate the memory to s . Assign s to character array. Then run the program.
Following will work.
int main(int arc, char* argv[])
{
char s[100];
sscanf("Privjet mir!", "%[^\t]s", s);
printf("s: %s\n", s);
return 0;
}

Resources