I am trying to convert a string read using wscanf to an integer using wcstol, both from header file wchar.h on Linux. While wcstol works on constant wide-char strings (e.g. L"23") it does not work on wscanf input, which puzzles me. I always get 0, even if the input is actually numeric (e.g. 23). Why?
$ ./test
23
s=23
0
Here is my test program:
#include <stdio.h>
#include <wchar.h>
int main() {
    wchar_t s[100];
    if (wscanf(L"%s", s) == 1) {
        wprintf(L"s=%s\n", s);
        wprintf(L"%ld\n", wcstol(s, NULL, 10));
    }
}
If instead of wcstol I use strtol, it works but I get this warning:
/usr/include/stdlib.h:183:17: note: expected ‘const char * restrict’ but argument is of type ‘wchar_t * {aka int *}’
which I could silence using a type cast. I thought wcstol was the right way to parse a wide-char string to an integer. Since on my machine chars are actually ints, strtol happens to work, but that leaves me still unsure whether this is the right solution. What's going on here? Why does wcstol not do its job?
Your problem is with the wscanf() format. A %s field descriptor designates a pointer to char, just like for scanf(). These two functions differ a bit in how they convert the input, but they agree on the meaning of the field descriptors.
For reading into an array of wchar_t, you want %ls. Moreover, whether you should use wscanf() or scanf() is primarily a function of how the input is encoded, not of the data type into which you want to scan its contents.
Your problem is the wscanf format.
As the wscanf man page describes:
s
Matches a sequence of non white-space wide characters. [...] The application shall ensure that the corresponding argument is a pointer to a character array large enough to accept the sequence and the terminating null character, which shall be added automatically.
A plain "%s" must be used for a non-wide (char) string, as usual in printf, scanf and so on.
The man page also says:
l (ell)
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long or unsigned long; [...] that a following c, s, or [ conversion specifier applies to an argument with type pointer to wchar_t.
That means you must use "%ls" as format string to read a wide-char string.
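There is also a solution outside ISO C: POSIX (as an XSI extension) accepts "%S" as an equivalent of "%ls". Microsoft's C runtime also has "%S", but there its meaning depends on whether the narrow or the wide function is called, so check the documentation before relying on it.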
This is a simplified version just for the question. I have this program:
wchar_t c;
wprintf(L"input\n");
wscanf(L"%d", &c);
wprintf(L"output\n");
wprintf(L"%lc", towlower(c));
and this input/output:
If I input "W", the output is "?". I get the same result with other characters.
In the line wscanf(L"%d", &c); you passed %d as the format specifier, so wscanf() looks for a decimal integer, but you are typing a character instead. Because c is a wchar_t, changing the specifier to %lc will solve it (a plain %c expects a pointer to char).
See the specification in the wscanf(3) manpage:
d Matches an optionally signed decimal integer, whose format is the same
as expected for the subject sequence of wcstol() with the value 10 for
the base argument. In the absence of a size modifier, the application
shall ensure that the corresponding argument is a pointer to int.
...
c Matches a sequence of wide characters of exactly the number specified
by the field width (1 if no field width is present in the conversion
specification).
Why am I not getting an error even though I am not passing any format specifier, only a string literal? There is no error in the case of a string literal, but there is an error in the case of a character or an integer. Why?
#include <stdio.h>
int main()
{
    printf("Hello World");
    return 0;
}
The first argument to printf should be a format string [1]. "Hello World" is a format string [2].
Per clause 7.21.6.1, paragraph 3 of the C 2018 standard, the format string is composed of zero or more directives:
A % character starts a directive for a conversion specification.
Any other character is a directive to output that character unchanged.
So "Hello World" is a format string that says to print “H”, “e”, “l”, “l”, “o”, “ ”, “W”, “o”, “r”, “l”, and “d”. It is simply a format string with only ordinary characters and no conversion specifications. It has the proper type and content for the first parameter of printf, so no error occurs.
In contrast, when a char or int is passed as the first argument to printf, the compiler knows it is the wrong type for the argument and issues a warning or error message.
Footnotes
1 Technically, the argument should be a pointer to the first character of the format string.
2 "Hello World" is passed as a pointer because, while it is an array of characters, it is automatically converted to a pointer to its first character.
The documentation on printf is as follows:
int printf ( const char * format, ... );
Print formatted data to stdout
Writes the C string pointed by format to the standard output (stdout). If format includes format specifiers (subsequences beginning with %), the additional arguments following format are formatted and inserted in the resulting string replacing their respective specifiers.
Putting a char or int variable in place of format above will fail.
With the help of the preprocessor and generic selection (_Generic), introduced in C11, it is possible to select a proper format string based on the argument's type.
Try:
#define FMT(X) _Generic(X, int: "%d", char*: "%s", float: "%f")
#define print(X) printf(FMT(X), (X))
Now the program:
int main() {
print(1);
print("abc");
print(3.0f);
return 0;
}
This produces the expected output 1abc3.000000 (%f prints six digits after the decimal point by default).
How to use a scanf width specifier of 0?
1) unrestricted width (as seen with cygwin gcc version 4.5.3)
2) UB
3) something else?
My application (not shown) dynamically forms the width specifier as part of a larger format string for scanf(). Rarely it would create a "%0s" in the middle of the format string. In this context, the destination string for that %0s has just 1 byte of room for scanf() to store a \0 which with behavior #1 above causes problems.
Note: The following test cases use constant formats.
#include <stdio.h>
#include <string.h>   /* memset() is declared here, not in <memory.h> */
void scanf_test(const char *Src, const char *Format) {
    char Dest[10];
    int NumFields;
    /* Clear the whole buffer so Dest is always null-terminated. */
    memset(Dest, '\0', sizeof(Dest));
    NumFields = sscanf(Src, Format, Dest);
    printf("scanf:%d Src:'%s' Format:'%s' Dest:'%s'\n", NumFields, Src, Format, Dest);
}
int main(void) {
    scanf_test("1234", "%s");
    scanf_test("1234", "%2s");
    scanf_test("1234", "%1s");
    scanf_test("1234", "%0s");
    return 0;
}
Output:
scanf:1 Src:'1234' Format:'%s' Dest:'1234'
scanf:1 Src:'1234' Format:'%2s' Dest:'12'
scanf:1 Src:'1234' Format:'%1s' Dest:'1'
scanf:1 Src:'1234' Format:'%0s' Dest:'1234'
My question is about the last line. It seems that a width of 0 results in no width limitation rather than a width of 0. Whether this is correct behavior or UB, I'll have to approach the zero-width situation another way. Or are there other scanf() formats to consider?
The maximum field width specifier must be non-zero. C99, 7.19.6.2:
The format shall be a multibyte character sequence, beginning and ending in its initial
shift state. The format is composed of zero or more directives: one or more white-space
characters, an ordinary multibyte character (neither % nor a white-space character), or a
conversion specification. Each conversion specification is introduced by the character %.
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional nonzero decimal integer that specifies the maximum field width (in
characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.
So, if you use 0, the behavior is undefined.
This came from 7.21.6.2 of n1570.pdf (C11 standard draft):
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional decimal integer greater than zero that specifies the
maximum field width (in characters).
...
It's undefined behaviour, because the C standard states that your maximum field width must be greater than zero.
An input item is defined as the longest sequence of input characters
which does not exceed any specified field width and ...
What is it you wish to achieve by reading a field of width 0 and assigning it as a string (an empty string) into Dest? Which actual problem are you trying to solve? It seems clearer to just assign directly, as in *Dest = '\0';.
What is the function definition of the printf() function as defined in the standard C library?
I need the definition to solve the following question:
Give the output of the following:
int main()
{
    int a = 2;
    int b = 5;
    int c = 10;
    printf("%d ",a,b,c);
    return 0;
}
The C language standard declares printf as follows:
int printf(const char *format, ...);
It returns an integer and takes a first parameter of a pointer to a constant character and an arbitrary number of subsequent parameters of arbitrary type.
If you happen to pass in more parameters than are required by the format string you pass in, then the extra parameters are ignored (though they are still evaluated). From the C89 standard §4.9.6.1:
If there
are insufficient arguments for the format, the behavior is undefined.
If the format is exhausted while arguments remain, the excess
arguments are evaluated (as always) but are otherwise ignored.
You pass an array of chars (or pointer) as the first argument (which includes format placeholders) and additional arguments to be substituted into the string.
The output of your example would be "2 " (the digit 2 followed by a space) [1], written to standard output. %d is the placeholder for a signed decimal integer. The space after it is printed literally because it is not part of a placeholder. a is passed as the argument for the first placeholder, and it has been assigned 2. The extra arguments won't be examined (see below).
printf() is a variadic function and only knows its number of additional arguments by counting the placeholders in the first argument.
1 Markdown does not allow trailing spaces in inline code examples. I had to use an alternate space, but the space you will see will be a normal one (ASCII 0x20).
It's
int printf(const char *format, ...);
format is a pointer to the format string.
... is the ellipsis (a piece of declaration syntax, not an operator), which lets you pass a variable number of arguments; how many are needed depends on how many placeholders the format string has.
The return value is the number of characters that were printed.
Have a look here about the ellipsis operator: http://bobobobo.wordpress.com/2008/01/28/how-to-use-variable-argument-lists-va_list/
man 3 printf gives...
int printf(const char *restrict format, ...);
Writes to the standard output (stdout) a sequence of data formatted as the format argument specifies. After the format parameter, the function expects at least as many additional arguments as specified in format.
%d = Signed decimal integer
printf("%d ",a,b,c);
For every %(something) you need to add one corresponding argument, therefore
printf("%d ",a+b+c); // works: prints the sum a+b+c (already an int, so no cast is needed)
printf("%d %d %d",a,b,c); //would print all 3 integers.
I was reading a book and came across a program to read entries from a /proc file.
The program which they mentioned has following line
printf("%.*s", (int) n, line);
I am not clear about the meaning of the line above.
What does it print when "%.*s" is used instead of "%s"?
The code can be read here
Abstract from here:
.* - The precision is not specified in
the format string, but as an
additional integer value argument
preceding the argument that has to be
formatted.
So this prints up to n characters from the string line.
The cast expression (int) n converts the value of n to type int. This is because the formatting specifier requires a plain int, and I assume (since you didn't include it) the variable n has a different type.
Since a different type, like size_t might have another size, it would create problems with the argument passing to printf() if it wasn't explicitly converted to int.