What does a # sign after a % sign in a scanf() function mean? - c

What does the following code mean in C?
scanf("%d%#d%d",&a,&b,&c);
If given the values 1 2 3, it gives the output 1 0 0.
P.S. I know # is used with printf(), but here in scanf() it gives random behaviour.

TL;DR: A # after a % sign in the format string of scanf() is wrong code.
Explanation:
The # here is a flag character, which is allowed in fprintf() and family, not in fscanf() and family.
In your code, the # after % makes the conversion specification invalid. As per C11 §7.21.6.2,
If a conversion specification is invalid, the behavior is undefined
So, your code produces undefined behaviour.
Hint: you can check the return value of scanf() to check how many elements were "scanned" successfully.
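The hint about the return value can be sketched as follows. This is only an illustration, not the asker's program: parse_three_ints is a hypothetical helper, and it uses sscanf() on a string instead of scanf() on stdin so that the result is easy to inspect. The format contains only valid conversion specifications (no #).

```c
#include <stdio.h>

/* Hypothetical helper: parses three whitespace-separated integers from text.
   Returns the number of successful conversions (0 to 3), or EOF if the
   input ends before the first conversion. */
int parse_three_ints(const char *text, int *a, int *b, int *c)
{
    /* Only valid scanf conversion specifications are used here. */
    return sscanf(text, "%d%d%d", a, b, c);
}
```

A caller would compare the result with 3 and treat anything else as a failed or partial read; a, b and c are only reliable for the conversions that were counted.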
However, FWIW, using # with %d in printf() is also undefined behaviour.
Just for reference: as per the C11 standard, chapter §7.21.6.1, flag characters part (emphasis mine),
#
The result is converted to an ‘‘alternative form’’. For o conversion, it increases the precision, if and only if necessary, to force the first digit of the result to be a zero (if the value and precision are both 0, a single 0 is printed). For x (or X) conversion, a nonzero result has 0x (or 0X) prefixed to it. For a, A, e, E, f, F, g, and G conversions, the result of converting a floating-point number always contains a decimal-point character, even if no digits follow it. (Normally, a decimal-point character appears in the result of these conversions only if a digit follows it.) For g and G conversions, trailing zeros are not removed from the result. For other conversions, the behavior is undefined.

According to the Standard, # is not valid in a scanf() conversion specification.
Its use makes your program invoke undefined behaviour.
Of course, if your implementation defines it, it is defined behaviour for your implementation and it does what your documentation says.

Related

How # flag in printf works?

#include <stdio.h>
int main()
{
float x;
x=(int)(float)(double)(5.5);
printf("%#u",x);
return 0;
}
How is the # flag in printf working here?
Every time I run this code I get different (garbage) values.
I know that the # flag works only with o, x, X, e, E, f, g, G, and that it is not defined for other integer conversions.
So is this undefined behaviour? I get correct values when I use the flag with the conversions above.
So tell me whether I am right or wrong.
From the C11 standard:
7.21.6.1. p6:
#:
The result is converted to an ‘‘alternative form’’. For o conversion, it increases
the precision, if and only if necessary, to force the first digit of the result to be a
zero (if the value and precision are both 0, a single 0 is printed). For x (or X)
conversion, a nonzero result has 0x (or 0X) prefixed to it. For a, A, e, E, f, F, g,
and G conversions, the result of converting a floating-point number always
contains a decimal-point character, even if no digits follow it. (Normally, a
decimal-point character appears in the result of these conversions only if a digit
follows it.) For g and G conversions, trailing zeros are not removed from the
result. For other conversions, the behavior is undefined.
So, to clarify, using # with u is undefined.
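Separately from the # flag, the program in the question passes a float (promoted to double) where %u expects an unsigned int, and a mismatched argument type is undefined as well. A minimal sketch of a well-defined variant, assuming the intent was to print the truncated value 5, might look like this. format_truncated is a hypothetical helper, and snprintf() into a buffer is used only to make the result easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: writes the truncated value of v as an unsigned
   decimal number into buf. Assumes v is non-negative and fits in an
   unsigned int. Returns what snprintf returns (the length written). */
int format_truncated(char *buf, size_t size, double v)
{
    unsigned int u = (unsigned int)v;     /* 5.5 becomes 5 */
    /* No '#' flag: it is undefined with u. The argument type matches %u. */
    return snprintf(buf, size, "%u", u);
}
```

Called with 5.5 and a large enough buffer, this produces the text 5.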
From the manual page:
#
The value should be converted to an "alternate form" [...] For other conversions, the result is undefined.
So yes, it's undefined.
Using this flag with any conversion other than those listed is undefined behaviour. Don't use it with other conversions.
The value should be converted to an "alternate form".
For o conversions, the first character of the output string is made zero (by prefixing a 0 if it was not zero already).
For x and X conversions, a nonzero result has the string "0x" (or "0X" for X conversions) prepended to it.
For a, A, e, E, f, F, g, and G conversions, the result will always contain a decimal point, even if no digits follow it (normally, a decimal point appears in the results of those conversions only if a digit follows).
For g and G conversions, trailing zeros are not removed from the result as they would otherwise be.
For other conversions, the result is undefined.
(taken from the printf(3)-manpage. Wording is essentially the same as in the standard. Emphasis mine)
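For comparison, here is a small sketch of the # flag with conversions for which it is defined. The helper names are made up for illustration, and the output goes to a buffer via snprintf() so it can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: the same value with and without '#' for o, x and X. */
int format_alternate_int(char *buf, size_t size, unsigned int value)
{
    return snprintf(buf, size, "%o %#o %x %#x %X %#X",
                    value, value, value, value, value, value);
}

/* Hypothetical helper: the same value with and without '#' for g and f. */
int format_alternate_float(char *buf, size_t size, double value)
{
    return snprintf(buf, size, "%g %#g %.0f %#.0f",
                    value, value, value, value);
}
```

With the value 10, the first helper yields "12 012 a 0xa A 0XA". With 1.0 in the default "C" locale, the second yields "1 1.00000 1 1.": # keeps the trailing zeros for g and forces the decimal point for f.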

(GCC) Dollar sign in printf format string

I've seen the following line in a source code written in C:
printf("%2$d %1$d", a, b);
What does it mean?
It's an extension to the language added by POSIX (C11-compliant behaviour should be as described in an answer by @chux). The notation %2$d means the same as %d (output a signed integer), except that it formats the argument with the given 1-based index (in your case the second argument, b).
So, when you run the following code:
#include <stdio.h>
int main() {
int a = 3, b = 2;
printf("%2$d %1$d", a, b);
return 0;
}
you'll get 2 3 in standard output.
More info can be found on printf man pages.
Per the C spec C11dr 7.21.6.1
As part of a print format, the first % in "%2$d %1$d" introduces a directive. This directive may have various flags, width, precision, length modifier and finally a conversion specifier. In this case 2 is a width. The next character $ is neither a precision, length modifier nor conversion specifier. Thus since the conversion specification is invalid,
... the behavior is undefined. C11dr 7.21.6.1 9
The C spec discusses future library directions. Lower case letters may be added in the future and other characters may be used in extensions. Of course $ is not a lower case letter, so that is good for the future. It certainly fits the "other character" role as $ is not even part of the C character set.
In various *nix implementations, $ is used as described in the Linux Programmer's Manual PRINTF(3). The $, together with the preceding integer, selects which argument the directive converts.
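A minimal sketch of the numbered-argument form, assuming a C library that implements the POSIX extension (for example glibc). Under plain ISO C the same format string is undefined, so this is not portable code. format_swapped is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: prints the second argument first.
   Relies on the POSIX "%n$" extension, which ISO C does not define. */
int format_swapped(char *buf, size_t size, int first, int second)
{
    /* %2$d converts the second variadic argument, %1$d the first. */
    return snprintf(buf, size, "%2$d %1$d", first, second);
}
```

On such a library, calling it with 3 and 2 writes "2 3" into the buffer, matching the output described above.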

%.#s format specifier in printf statement in c

Please explain the output. What does %.#s in printf() mean?
#include<stdio.h>
#include <stdlib.h>
int main(int argc,char*argv[]){
char *A="HELLO";
printf("%.#s %.2s\n",A,A);
return 0;
}
OUTPUT:
#s HE
It's undefined behavior. # in printf format specifier means alternative form, but according to the standard, # is only used together with o, a, A, x, X, e, E, f, F, g, G, not including s.
C11 §7.21.6.1 The fprintf function Section 6
# The result is converted to an ‘‘alternative form’’. For o conversion, it increases
the precision, if and only if necessary, to force the first digit of the result to be a
zero (if the value and precision are both 0, a single 0 is printed). For x (or X)
conversion, a nonzero result has 0x (or 0X) prefixed to it. For a, A, e, E, f, F, g, and G conversions, the result of converting a floating-point number always
contains a decimal-point character, even if no digits follow it. (Normally, a decimal-point character appears in the result of these conversions only if a digit follows it.) For g and G conversions, trailing zeros are not removed from the result. For other conversions, the behavior is undefined.
For example, on my machine, output is different: %.0#s HE
%.1s is used to print the first character of the string
%.2s is used to print the first two characters of the string
%.3s is used to print the first three characters of the string and so on
The # flag requests the alternative form of a conversion; it is an optional part of a conversion specification in printf(), fprintf() and related functions.
But as @Yu Hao said, # is only used together with o, a, A, x, X, e, E, f, F, g, G, not including s.
In your case, the %.#s usage is wrong.
Example usage from the reference given by @WhozCraig:
printf("Hexadecimal:\t%x %x %X %#x\n", 5, 10, 10, 6);
printf("Octal:\t%o %#o %#o\n", 10, 10, 4);
I agree with Yu Hao's answer that it is undefined behavior, but I think the reason is different. Yes, the # character works as a flag to convert the result to an alternative format. Yes, the # flag is undefined for strings. But in this case, the # is not a flag, it's a precision. It's still undefined, but the reason is different.
The C11 standard at §7.21.6.1 says that the % sign is followed in sequence by:
Zero or more flags (including #)
An optional minimum field width
An optional precision
An optional length modifier
A conversion specifier character
Except for the conversion specifier, these are all optional. But the order in which they appear is always as above. So flag, if present, has to be first, immediately after the % character. Here what follows the % is not something indicating a flag, it is a period: %., indicating precision.
When you have %.# in your format string for printf(), the period indicates that the following character is the precision for the conversion specification that follows. I.e., the # in your code specifies the precision for the string s, not a flag. To be a flag, it would have to directly follow the % character, without the intervening period.
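To illustrate the ordering, here is a sketch of one directive that uses every part in the required sequence: flag, width, precision, length modifier, conversion specifier. The helper name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats value with a single directive that has
   all five parts in order:
     '#'  flag (alternative form, adds the 0x prefix)
     10   minimum field width
     .4   precision (at least four hex digits)
     l    length modifier (argument is unsigned long)
     x    conversion specifier */
int format_full_directive(char *buf, size_t size, unsigned long value)
{
    return snprintf(buf, size, "%#10.4lx", value);
}
```

For the value 255 this gives "    0x00ff": four hex digits from the precision, the 0x prefix from the flag, and leading spaces to reach the width of 10. Moving the # anywhere after the width or the period would no longer make it a flag.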
With regard to precision, the C standard §7.21.6.1 says this:
The precision takes the form of a period (.) followed either by an asterisk *
(described later) or by an optional decimal integer; if only the period is specified,
the precision is taken as zero. If a precision appears with any other conversion
specifier, the behavior is undefined.
Since in your format string you have %.#s, and # is neither an asterisk * nor a decimal integer, the result is undefined.
So to be extremely exact about why your code is undefined, I think it's because the # character appears in place of a legal precision, not because it is an illegal flag for the %s conversion. It would be illegal as a flag, of course, but that's not what is precisely (har har) happening here.
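For contrast, the two legal forms of a precision can be sketched like this: a decimal integer written in the format (as in %.2s), or an asterisk that takes the precision from an int argument. format_prefix is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: writes at most count characters of text into buf.
   The '*' after the period takes the precision from the int argument
   that precedes the string in the argument list. */
int format_prefix(char *buf, size_t size, const char *text, int count)
{
    return snprintf(buf, size, "%.*s", count, text);
}
```

With "HELLO" and a count of 2 this writes "HE", the same as %.2s would; a count of 0 writes an empty string.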

What compiler does not support the style of "%#x" in printf flags?

I understand %#x gives the same effect as 0x%x and that it meets the POSIX standard. But people mention that some compilers do not support it. Is that true? Any examples?
Aside from perhaps some broken embedded-systems C libraries, the # modifier should be universally supported. However %#x and 0x%x are not the same. They yield different results for the value 0, and the # modifier will always print the x in the same case as the hex digits (e.g. %#x gives 0xa and %#X gives 0XA) while using 0x%X would allow you to have a lowercase x and capital hex digits (much more visually pleasing, at least to me). As such, I find the # modifier is rarely useful in practice.
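The difference for the value 0 can be checked with a short sketch. hex_forms_agree is a hypothetical helper that formats the same value both ways and compares the text.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if "%#x" and "0x%x" produce identical
   text for value, 0 otherwise. */
int hex_forms_agree(unsigned int value)
{
    char with_flag[32];
    char with_literal[32];

    snprintf(with_flag, sizeof with_flag, "%#x", value);
    snprintf(with_literal, sizeof with_literal, "0x%x", value);
    return strcmp(with_flag, with_literal) == 0;
}
```

For a nonzero value such as 10 both forms give "0xa", so the helper returns 1. For 0 the flag form gives "0" while the literal form gives "0x0", so it returns 0.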
%#x is a valid conversion specification in printf format string in C89, C99 and C11.
The # flag character is not from POSIX, but rather the C standard (§7.21.6.1). If a compiler or library does not support it then it is not a C compiler / standard library.
This is perfectly valid as per C Specification - 7.21.6.1 The fprintf function - point #6
#
The result is converted to an ‘‘alternative form’’. For o conversion, it increases
the precision, if and only if necessary, to force the first digit of the result to be a
zero (if the value and precision are both 0, a single 0 is printed). For x (or X)
conversion, a nonzero result has 0x (or 0X) prefixed to it. For a, A, e, E, f, F, g,
and G conversions, the result of converting a floating-point number always
contains a decimal-point character, even if no digits follow it. (Normally, a
decimal-point character appears in the result of these conversions only if a digit
follows it.) For g and G conversions, trailing zeros are not removed from the
result. For other conversions, the behavior is undefined.

Resources