sscanf supports %n to count how many bytes are read.
Why does sscanf sometimes read additional bytes?
#include <stdio.h>
int main()
{
char *data = "X\n \n\x09\n \x10\n";
int len = 0;
sscanf(data, "X%n", &len);
printf("%i\n", len);
sscanf(data, "X\n%n", &len);
printf("%i\n", len);
return 0;
}
This program prints:
1
7
I would expect:
1
2
(1 for X and 2 for X\n.)
Why does it read more bytes than expected?
From cppreference:
The format string consists of
non-whitespace multibyte characters except %: each such character in the format string consumes exactly one identical character from the
input stream, or causes the function to fail if the next character on
the stream does not compare equal.
whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from
the input (determined as if by calling isspace in a loop). Note that
there is no difference between "\n", " ", "\t\t", or other whitespace
in the format string.
Thus, your \n in the second format string will cause the function to consume all remaining whitespace characters – which is actually all 6 characters following the X and preceding the 0x10.
Related
I have this code:
#include<stdio.h>
int main(){
char arr[10][80];
int n;
scanf_s("%d",&n);
for(int i=0;i<n;i++){
fgets(arr[i],80,stdin);
}
for(int i=0;i<n;i++)
printf("%s",arr[i]);
}
For the most part it works well except for one thing:
It only reads n-1 strings. So for n=3, it will only let me input 2 strings before it prints them and the program ends. Why is this?
As everybody noticed, scanf_s("%d",&n); leaves the newline in the input stream so the first fgets reads this and stores it in arr[0].
The solution is now not to wildly do some extra reading into undefined memory locations, but to look at the format specifier of scanf_s. In particular:
The format argument specifies the interpretation of the input and can contain one or more of the following:
White-space characters: blank (' '); tab ('\t'); or newline ('\n'). A white-space character causes scanf to read, but not store, all consecutive white-space characters in the input up to the next non–white-space character. One white-space character in the format matches any number (including 0) and combination of white-space characters in the input.
...
So all that is required is to change the format specifier to scanf_s("%d ",&n);, in which only a space is added. The space tells scanf to consume the white space that follows the number, including the newline character. QED.
How to get the string length without using strlen function or counters, like:
scanf("???",&len);
printf("%d",len);
Input: abcde
Expected output: 5
You can use assignment-suppression (character *) and %n which will store the number of characters consumed into an int value:
int count;
scanf( "%*s%n", &count );
printf( "string length: %d\n", count );
Explanation:
%*s will parse a string (up to the first whitespace character) but will not store it because of the *.
Then %n will store the number of characters consumed (which is the length of the string parsed) into count.
Please note that %n is not necessarily counted for the return value of scanf():
The C standard says: "Execution of a %n directive does not increment the assignment count returned at the completion of execution" but the Corrigendum seems to contradict this. Probably it is wise not to make any assumptions on the effect of %n conversions on the return value.
quoted from the man page where you will find everything else about scanf() too.
Use the %n format specifier to get the number of characters consumed so far and write it to len of type int:
char buf[50];
int len;
if ( scanf("%49s%n", buf, &len) != 1 )
{
// error routine.
}
printf("%d", len);
You can do this with the n specifier:
%n stores the number of characters read so far.
char str[20];
int len;
scanf("%19s%n", str, &len); // pass str, not &str; the width keeps the input within the buffer
Explanation: Here [] is the scanset conversion, and [^\n] accepts every character, spaces included, until a newline is encountered. To get the length of the whole string, use %n: it consumes no input; instead, the number of characters consumed so far from the input is stored through the next pointer, which must be a pointer to int. This is not a conversion, although it can be suppressed with the * flag. In other words, the variable pointed to by the corresponding argument is loaded with the number of characters that scanf() has scanned before reaching %n.
With this technique the length comes out of the scan itself, so no separate strlen() call or O(n) counting loop is needed.
scanf("%[^\n]%n",str,&len);
printf("%s %i",str,len);
I have this code in C language
#include <stdio.h>
#include <stdlib.h>
int foo () {printf ("foo \n");return 1;}
int bar () {printf ("bar \n");return 1;}
typedef struct {
char buf[20];
int (*func)();
} Object ;
int main()
{
Object *o1 , *o2;
o1 = (Object*) malloc (sizeof(Object));
o2 = (Object*) malloc (sizeof(Object));
if(o1==NULL || o2==NULL)
return -1;
o1-> func =&foo;
o2-> func =&bar;
scanf ( "%s " , o1->buf);
scanf ( "%s " , o2->buf);
(* ( o1->func ))();
(* ( o2->func ))();
return 0;
}
When I run this code, it crashes.
The problem is in these two lines:
scanf ( "%s " , o1->buf);
scanf ( "%s " , o2->buf);
There is a vulnerability in these two lines.
Update:
This is an example of the result:
The program runs without error, but I think a vulnerability exists in the scanf lines.
From man scanf:
int scanf(const char *format, ...);
The format string consists of a sequence of directives which describe how to process the sequence of input characters. <snip> A directive is one of the following:
A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
An ordinary character (i.e., one other than white space or '%'). This character must exactly match the next character of input.
A conversion specification, which commences with a '%' (percent) character. A sequence of characters from the input is converted according to this specification
Conversion specification: s: Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte (\0), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
So when you write:
scanf("%s ",buf)
The function will scan for a string and absorb all following white-space characters (including your newlines). This implies that an input from stdin can only terminate by a non-white-space character followed by a new-line:
foo < string conversion specification + absorbed newline
< absorbed newline
< absorbed newline
b < end of white-space directive
So in short if you want to ensure that your scanf works as expected, you can do:
scanf("%s",buf)
You have to make sure that the entered string fits in buf, including the terminating null character.
If you hardcoded the size of buf, you can use something like:
scanf("%19s",buf)
if buf has a size of 20. This way it will read at most 19 characters and store them in buf with an additional null.
What does %[^\n] mean in C?
I saw it in a program which uses scanf for taking multiple word input into a string variable. I don't understand though because I learned that scanf can't take multiple words.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char line[100];
scanf("%[^\n]",line);
printf("Hello,World\n");
printf("%s",line);
return 0;
}
[^\n] is a scanset, similar to a bracket expression in a regular expression.
[...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
^ means that the scanset is "negated": it is given by its complement.
^\n: the scanset is all characters except \n.
Furthermore fscanf (and scanf) will read the longest sequence of input characters matching the format.
So scanf("%[^\n]", s); will read all characters until you reach \n (or EOF) and put them in s. It is a common idiom to read a whole line in C.
See also §7.21.6.2 The fscanf function.
scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().
C defines line as:
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.
The scanf("%[^\n]", line) has the specifier "%[^\n]". It scans for unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved and a null character is appended.
The scan-set ^\n implies all characters that are not (due to the '^') '\n'.
'\n' is not read
scanf("%[^\n]",.... fails to read a new line character '\n'. It remains in stdin. The entire line is not read.
Buffer overflow
The below leads to undefined behavior (UB) should more than 99 characters get read.
char line[100];
scanf("%[^\n]",line); // buffer overflow possible
Does nothing on empty line
When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.
Failure to check the return value
scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.
Recommendation
Do not use scanf() and instead use fgets() to read a line of input.
#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1 + 1 + 1];
// \n + \0 + extra to detect overly long lines
if (fgets(line, sizeof line, stdin)) {
size_t len = strlen(line); // strlen() needs <string.h>
// Lop off potential trailing \n if desired.
if (len > 0 && line[len-1] == '\n') {
line[--len] = '\0';
}
if (len > EXPECTED_INPUT_LENGTH_MAX) {
// Handle error
// Usually includes reading rest of line if \n not found.
}
}
The fgets() approach has its limitations too, e.g. reading embedded null characters.
Handling user input, possibly hostile, is challenging.
scanf("%[^\n]",line);
means: scan till \n or an enter key.
scanf("%[^\n]",line);
Will read user input until enter is pressed or a newline character is added (\n) and store it into a variable named line.
Question: what does %[^\n] mean in C?
In a printf format, \n moves the output to the next line. In a scanf format,
[^\n] is a complemented (negated) scanset: it matches every character except
the newline. So %[^\n] reads all characters up to the end of the current line
and stores them in the given array. Without that scanf call the array is never
filled, and printing it shows whatever garbage was already in memory.
EX:
char number[100]; // character array for the input, e.g. StackOverflow
scanf("%99[^\n]",number); // fills number; without this statement, printing number gives unwanted garbage
printf("HI\n"); // normal use of the printf statement
printf("%s",number); // printing the input
return 0;
The input consists a string and an integer, which are separated by a '/', like this:
hello/17
And I want to read the input into a string and an int, like this:
char str[20];
int num;
scanf("%s/%d", str, &num); // this how I tried to do it.
I can't seem to make it, any advice?
scanf awaits a whitespace terminated string when it tries to read %s.
Try to specify the forbidden character set directly:
scanf("%19[^/]/%d", str, &num);
You can read more about the formatting codes here
You only need to run the following program:
#include <stdio.h>
int main (void) {
char str[20] = {'\0'};
int count, num = 42;
count = sscanf ("hello/17", "%s/%d", str, &num);
printf ("String was '%s'\n", str);
printf ("Number was %d\n", num);
printf ("Count was %d\n", count);
return 0;
}
to see why this is happening. The output is:
String was 'hello/17'
Number was 42
Count was 1
The reason has to do with the %s format specifier. From C99 7.19.6.2 The fscanf function (largely unchanged in C11, and the italics are mine):
s: matches a sequence of non-white-space characters.
Since / is not white space, it gets included in the string bit, as does the 17 for the same reason. That's also indicated by the fact that sscanf returns 1, meaning that only one item was scanned.
What you'll then be looking for is something that scans any characters other than / into the string (including white space). The same section of the standard helps out there as well:
[: matches a nonempty sequence of characters from a set of expected characters (the scanset). The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket (]). The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.
In other words, something like:
#include <stdio.h>
int main (void) {
char str[20] = {'\0'};
int count, num = 42;
count = sscanf ("hello/17", "%[^/]/%d", str, &num);
printf ("String was '%s'\n", str);
printf ("Number was %d\n", num);
printf ("Count was %d\n", count);
return 0;
}
which gives you:
String was 'hello'
Number was 17
Count was 2
One other piece of advice: never ever use scanf with an unbounded %s or %[; you're asking for a buffer overflow attack. If you want a robust user input function, see this answer.
Once you have it in as a string, you can sscanf it to your heart's content without worrying about buffer overflow (since you've limited the size on input).
It could be done like this:
char str[20];
int num;
scanf("%19[^/]%*c%d", str, &num);
%*c reads one character (here the /) and discards it.