Why doesn't sscanf work well in this code? - c

#include <stdio.h>
int main()
{
char string[80]="abcdef";
char buffer[80];
int num;
sscanf(string,"%*[^0-9a-fA-F]%n%s",&num,buffer);
printf("%d\n",num);
puts(buffer);
return 0;
}
Output:
-149278720
And what I expect is
0
abcdef
I believe that the regex %*[^0-9a-fA-F] discards all characters other than "xdigits", however, when the first character in the string is a "xdigit", sscanf seems to return instantly.
How can I fix this?

%*[^0-9a-fA-F] matches a non-empty sequence of characters that aren't in the character set. Since you don't have any non-hexdigits at the beginning of the string, this conversion fails and sscanf returns immediately.
As far as I can tell, there's no way to make this optional in sscanf. If you just want to skip over the non-hexdigits, use strcspn().
num = strcspn(string, "0123456789abcdefABCDEF");
strcpy(buffer, string+num);

Related

Ignoring "=" character with fscanf

I am trying to read a file that contains lines in this format abc=1234. How can I make fscanf ignore the = and store str1="abc" and str2="1234"?
I tried this:
fscanf(fich1, "%[^=]=%[^=]" , palavra, num_char)
I'd recommend using fgets to read lines and then parse them with sscanf. But you can use the same principle for just fscanf if you want.
#include <stdio.h>
int main(void) {
char buf[100];
char str1[100];
char str2[100];
if(! fgets(buf, sizeof buf, stdin)) return 1;
if(sscanf(buf, "%[^=]=%s", str1, str2) != 2) return 1;
puts(str1);
puts(str2);
}
So what does %[^=]=%s do? First, %[^=] reads everything up to the first = and stores it in str1. Then the literal = is read and discarded. Finally, %s reads a whitespace-delimited string into str2. This shows the problem with your format string: the second %[^=] keeps reading until it meets another = or the end of the input, so it does not stop at the end of the line. It would only stop cleanly on input such as abc=1234=; on your file it also swallows the newline and the start of the next line.
Note that %[^=] and %s treat white space a little differently: %s skips leading white space and stops at the next white space, while %[^=] does neither. So if that's a concern, you need to account for it, for example with %[^=]=%[^\n].
And in order to avoid buffer overflow, you also might want to do %99[^=]=%99[^\n].

Tell sscanf to consider \0 as a valid character to read

I am parsing a file with formatted strings. Need some help with parsing.
Consider below example.
int main()
{
char value[32], name[32];
int buff_ret1, buff_ret2;
char *buff = "1000000:Hello";
char *buff_other ="200000:\0";
buff_ret1 = sscanf(buff,"%[^:]:%s", value, name);
printf("buff_ret1 is %d\n", buff_ret1);
buff_ret2 = sscanf(buff_other,"%[^:]:%s", value, name);
printf("buff_ret2 is %d\n", buff_ret2);
return 0;
}
I am expecting the values of buff_ret1 and buff_ret2 to be 2, but buff_ret2 comes out as 1. I understand that it is not considering NUL. Is there a way to tell sscanf to treat NUL as a character to read?
No, this is not possible. From the sscanf() specification:
Reaching the end of the string in sscanf() shall be equivalent to encountering end-of-file for fscanf().
This means \0 (end of string) is interpreted as end of file.

Input/Output scanset in c

#include<stdio.h>
int main()
{
char str[50]={'\0'};
scanf("%[A-Z]s",str);
printf("%s",str);
return 0;
}
1)
Input:
helloWORLD
output:
2)
Input:
HELLoworlD
output:
HELL
In case 1, I expected the output to be "WORLD", but it didn't give any output.
From case 2, I understood that this works only if the first few characters are in upper case.
Can you please explain how it actually works?
Interpretation of scansets
When it is given helloWORLD, the conversion specification %[A-Z] fails immediately because the h is not an upper-case letter. Therefore, scanf() returns 0, indicating that it did not successfully convert anything. If you tested the return value, you'd know that.
When it is given HELLoworlD, the scanset matches the HELL and stops at the first o. The format string also attempts to match a literal s, but there's no way for scanf() to report that it fails to match that after matching HELL.
Buffer overflow
Note that %[A-Z] is in general dangerous (as is %s) because there is no constraint on the number of characters read. If you have:
char str[50];
then you should use:
if (scanf("%49[A-Z]", str) != 1)
...some problem in the scan...
Also note that there is a 'difference by one' between the declared length of str and the number in the format string. This is awkward; there's no way to provide that number as an argument to scanf() separate from the format string (unlike printf()), so you may end up creating the format string on the fly:
int scan_upper(char *buffer, size_t buflen)
{
char format[16];
if (buflen < 2)
return EOF; // Or other error indication
snprintf(format, sizeof(format), "%%%zu[A-Z]", buflen-1); // Check this too!?
return scanf(format, buffer);
}
When you do
scanf("%[A-Z]s",str);
it consumes input only for as long as the characters are upper-case letters.
And since you initialized the whole array to '\0', printf() stops printing when it meets the first one.
Therefore, the first output is blank, and the second contains only the leading run of upper-case letters.

Does sscanf() support "recursive" buffer?

I did some research looking for the sscanf() source code, but I could not find the answer to my question.
When we use sscanf() in this way:
char str[50] = "5,10,15";
int x;
sscanf(str,"%d,%s",&x,str);
Does sscanf() support "recursive" buffer str ?
It doesn't break from self modifying buffer. But to make it (tail)recursive, you'd have to read to the end of the string.
The code fragment:
char str[]="5,10,15";
int a[10]={0},x = 0;
while (sscanf(str,"%d,%s",a+ x++,str)>1);
reads all the integers.
Since this is not really recursive and the string to be read doesn't overwrite the asciiz in the string, I believe this is "safe" in the meaning: only try this at home.
scanf family functions read the format specifiers and do the conversion one-by-one. In your example code:
char str[50] = "5,10,15";
int x;
sscanf(str,"%d,%s",&x,str);
It may work because it reads in an integer first and another c-string. No problem here as original str is over-written with the new value.
But consider the following:
char str[50] = "5 10 15";
int x;
sscanf(str,"%s %d",str, &x);
It first reads 5 from the original value and overwrites str and subsequent format specifier %d will have nothing to read from str as the end of str has been reached due to the nul-termination for the previous %s read.
It's just a counter example to show it's a bad idea and won't work. So you can say this is going to invoke undefined behaviour at some point or cause other problems.
Generally I would advise against this. If it works once, that doesn't mean it will work in all corner cases. To be really sure, you'd have to check the sources of the very implementation you are using. If you want it to be portable, forget about it right away: the C standard says the behaviour of sscanf() is undefined if copying takes place between objects that overlap. Use strchr() to find the next delimiter and update the string pointer to point at the next character.
Using strtok() solved an almost identical problem I had, namely, using sscanf() recursively to extract the contents of a char* (a list of doubles separated by spaces, e.g. "1.00 2.01 ...") into a double array (e.g. array[1]=1.00, array[2]=2.01, ...).
The following might help, if I understand your issue correctly:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str[] = "5,10,15";
    char *token;
    int x[3] = {0};
    /* strtok() modifies str, replacing each ',' with '\0' */
    token = strtok(str, ",");
    /* stop early if the string holds fewer than 3 fields */
    for (int i = 0; i < 3 && token != NULL; i++) {
        sscanf(token, "%d", &x[i]);
        token = strtok(NULL, ",");
    }
    for (int i = 0; i < 3; i++) printf("%d\n", x[i]);
    return 0;
}
On running, it produces:
5
10
15
See http://www.cplusplus.com/reference/cstring/strtok/ for a useful description of strtok().

Printing all characters in a string in C

I have a very simple program to print the chars in a string but for some reason it is not working:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void * print_chars(char *process_string) {
int i;
int string_len;
string_len = strlen(process_string);
printf("String is %s, and its length is %d", process_string, string_len);
for(i = 0; i < string_len; i++) {
printf(process_string[i]);
}
printf("\n");
}
int main(void) {
char *process_string;
process_string = "This is the parent process.";
print_chars(process_string);
return 0;
}
When I run it in Netbeans, I get the following:
RUN FAILED (exit value 1, total time: 98ms)
If I remove the line
printf(process_string[i]);
the program runs but nothing prints out to the console (obviously).
Any ideas what I'm missing here?
You need a format in the line
printf(process_string[i]);
i.e.
printf("%c", process_string[i]);
There are a couple of problems.
One is that you're not seeing any output from the printf("String is %s, and its length is %d", ...). This is because standard output is usually line buffered when it goes to a terminal, and the format has no newline, so the text is still sitting in the buffer when the program crashes on the next printf() call. If you change the format string to add a \n, you should see the output from this call.
The second is that you are passing a char into the first argument of printf(), where it expects a char *. This causes it to crash, as it tries to interpret that character as a pointer. You want to pass something like printf(process_string) instead. However, it's generally a bad idea to pass a variable string directly into the first argument of printf(); instead, you should pass a format string that includes %s, and pass the string in as the corresponding argument: printf("%s\n", process_string). Or, if you want to print it character by character, printf("%c", process_string[i]), followed by a printf("\n") to flush the buffer and actually see the output. Or if you're doing it character by character, putchar(process_string[i]) will be simpler than printf().
printf() expects, as first parameter, a pointer to char. What you are passing is a char, not a pointer to one.
Anyway, printf() is not the function to use here. Try putchar(), or putc() with stdout.
