How to read string separated by / with scanf - c

I've been trying to do this for quite a while now, and after some research I've had no success, so my last resort is asking a question. My input looks like this:
1.0.0.0/255.0.0.0/127.0.0.1/1112
1.2.0.0/255.255.0.0/2.4.6.9/1112
1.2.3.0/255.255.255.0/1.2.3.1/111
I need to extract 4 strings from each line, so for example the first line would give me
s1 = 1.0.0.0
s2 = 255.0.0.0
s3 = 127.0.0.1
s4 = 1112
Here is what I have tried:
scanf("%s/%s/%s/%s", str1, str2, str3, str4); // This doesn't consume the "/"
scanf("%[^/]s%[^/]s%[^/]s%[^/]s", str1, str2, str3, str4); // This only gets the first string
scanf("%[^\n]s%*c%s%*c%s%*c%s", str1, str2, str3, str4); // Here %*c was supposed to consume the "/" without storing it, but that doesn't happen
How can I get the 4 strings from each input line using a single scanf inside a while (!feof(fileIn)) ? Thank you.

There are a few issues with the posted code. The scanset directive is %[...]; it takes no trailing s. A format such as %[^/]s therefore tries to match a literal s in the input after the scanset. That match always fails here: %[^/] consumes every character up to, but not including, the next /, so the character waiting in the input is / rather than s. The / is left in the input stream, and it must be consumed before continuing on to the next input field.
Also, note that while(!feof(file)){} is always wrong. Instead, try fetching input by lines using fgets(), and parsing with sscanf(). The fgets() function returns a null pointer when end-of-file is reached.
Further, you should always specify a maximum width when reading strings with scanf() family functions to avoid buffer overflow.
Here is an example program:
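#include <stdio.h>
int main(void)
{
    char input[4096];
    char str1[100];
    char str2[100];
    char str3[100];
    char str4[100];
    while (fgets(input, sizeof input, stdin)) {
        /* Print a line only if all four fields were converted.
           The last scanset also excludes '\n' so that the newline
           kept by fgets() does not end up in str4. */
        if (sscanf(input, " %99[^/]/ %99[^/]/ %99[^/]/ %99[^/\n]",
                   str1, str2, str3, str4) == 4) {
            puts(str1);
            puts(str2);
            puts(str3);
            puts(str4);
        }
    }
    return 0;
}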
Sample interaction using sample input from the question:
λ> ./a.out < readstring_test.txt
1.0.0.0
255.0.0.0
127.0.0.1
1112
1.2.0.0
255.255.0.0
2.4.6.9
1112
1.2.3.0
255.255.255.0
1.2.3.1
111

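You already got quite close: your second approach just never consumes the delimiter:
scanf(" %[^/]/%[^/]/%[^/]/%[^/\n]", str1, str2, str3, str4);
should do the job. The leading space skips the newline left over from the previous line, and the last scanset also stops at a newline because no / follows the fourth field.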

Related

How to get each string within a buffer fetched with "getline" from a file in C

I'm trying to read every string separated with commas, dots or whitespaces from every line of a text from a file (I'm just receiving alphanumeric characters with scanf for simplicity). I'm using the getline function from <stdio.h> library and it reads the line just fine. But when I try to "iterate" over the buffer that was fetched with it, it always returns the first string read from the file. Let's suppose I have a file called "entry.txt" with the following content:
test1234 test hello
another test2
And my "main.c" contains the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_WORD 500
int main()
{
    FILE *fp;
    int currentLine = 1;
    size_t characters, maxLine = MAX_WORD * 500;
    /* Buffer can keep up to 500 words of 500 characters each */
    char *word = (char *)malloc(MAX_WORD * sizeof(char)), *buffer = (char *)malloc((int)maxLine * sizeof(char));
    fp = fopen("entry.txt", "r");
    if (fp == NULL) {
        return 1;
    }
    for (currentLine = 1; (characters = getline(&buffer, &maxLine, fp)) != -1; currentLine++)
    {
        /* This line gets "test1234" onto "word" variable, as expected */
        sscanf(buffer, "%[a-zA-Z_0-9]", word);
        printf("%s", word); // As expected
        /* This line should get "test" string, but again it obtains "test1234" from the buffer */
        sscanf(buffer, "%[a-zA-Z_0-9]", word);
        printf("%s", word); // Not intended...
        // Do some stuff with the "word" and "currentLine" variables...
    }
    return 0;
}
What happens is that I'm trying to get every alphanumeric string (called a word from here on) in sequence from the buffer, but the sscanf function just gives me the first occurrence of a word within the buffer. Also, every line of the entry file can contain an unknown number of words separated by whitespace, commas, dots, special characters, etc.
I'm obtaining every line from the file separately with "getline" because I need to get every word from every line and store it in other place with the "currentLine" variable, so I'll know from which line a given word would've come. Any ideas of how to do that?
fscanf has an input stream argument. A stream can change its state, so that the second call to fscanf reads a different thing. For example:
fscanf(stdin, "%s", str1); // str1 contains some string; stdin advances
fscanf(stdin, "%s", str2); // str2 contains some other string
scanf does not have a stream argument, but it has a global stream to work with, so it works exactly like fscanf(stdin, ...).
sscanf does not have a stream argument, nor is there any global state to keep track of what was read. There is an input string. You scan it, some characters get converted, and... nothing else changes. The string remains the same string (how could it possibly be otherwise?) and no information about how far the scan has advanced is stored anywhere.
sscanf(buffer, "%s", str1); // str1 contains some string; nothing else changes
sscanf(buffer, "%s", str2); // str2 contains the same string
So what does a poor programmer do?
Well, I lied. No information about how far the scan has advanced is stored anywhere only if you don't request it. The %n directive stores the number of characters consumed so far.
int nchars;
sscanf(buffer, "%s%n", str1, &nchars); // str1 contains some string;
// nchars contains number of characters consumed
sscanf(buffer+nchars, "%s", str2); // str2 contains some other string
Error handling and %s field widths omitted for brevity. You should never omit them in real code.

Ignoring "=" character with fscanf

I am trying to read a file that contains lines in this format abc=1234. How can I make fscanf ignore the = and store str1="abc" and str2="1234"?
I tried this:
fscanf(fich1, "%[^=]=%[^=]" , palavra, num_char)
I'd recommend using fgets to read lines and then parse them with sscanf. But you can use the same principle for just fscanf if you want.
#include <stdio.h>
int main(void) {
    char buf[100];
    char str1[100];
    char str2[100];
    if (!fgets(buf, sizeof buf, stdin)) return 1;
    if (sscanf(buf, "%[^=]=%s", str1, str2) != 2) return 1;
    puts(str1);
    puts(str2);
}
So what does %[^=]=%s do? First %[^=] reads everything until the first occurrence of = and stores it in str1. Then it reads a = and discards it. Then it reads a string to str2. And here you can see the problem with your format string. %[^=] expects the string to end with =, and you have another one at the end. So you would have a successful read of the string abc=1234=.
Note that %[^=] and %s treat white space a little differently: %s skips leading white space and stops at the next white space, while a scanset does neither. So if that's a concern, you need to account for it, for example with %[^=]=%[^\n].
And in order to avoid buffer overflow, you also might want to do %99[^=]=%99[^\n].

Parsing a C string and storing parts of the string in variables

This post might be marked as a duplicate, but I did search online for this specific case and I couldn't find any examples similar to this. The following is a simplified version of my code.
I have the following lines of data stored with a text file named test.txt:
12345|This is a sentence|More words here
24792|This is another sentence|More words here again
The text in the test.txt file will always follow the format of <int>|<string>|<string>
I now want to store each of the sections separated by the delimiter | in a variable.
The following is my attempt:
uint32_t num;
char* str1, str2;
// the data variable is a char pointer to a single line retrieved from test.txt
sscanf(data, "%d|%s|%s", &num, str1, str2);
This code above would retrieve the correct value for num but would insert the first word from section two into str1, leaving the variable str2 as null. To my understanding, this was the case because the sscanf() function stops when it hits a space.
Is there an efficient way of storing each section into a variable?
As you noted, %s uses whitespace as the delimiter. To use | as the delimiter, use %[^|]. This matches any sequence of characters not including |.
And since num is unsigned, you should use %u, not %d.
sscanf(data, "%u|%[^|]|%[^|]", &num, str1, str2);
Don't forget to allocate memory for str1 and str2 to point to; scanf() won't do that automatically.
Your variable declarations are also wrong. It needs to be:
char *str1, *str2;
Your declaration is equivalent to:
char *str1;
char str2;

putting 3 strings together using strcpy, strcat, sprintf at least once each

My homework problem is to make a program putting 3 strings together using strcpy, strcat, sprintf at least once each.
I'm wondering if I can use all those three without any garbage code. I've tried using strchr to use sprintf for putting strings together, but the pointer location changed so couldn't print out the whole thing.
char str1[MAX];
char str2[MAX];
char str3[MAX];
char str4[MAX];
gets(str1);
gets(str2);
gets(str3);
strcat(str1, str2);
strchr(str1, '\0');
sprintf(str1, "%s", str3);
strcpy(str4, str1);
puts(str4);
I also want to know if there is any difference in their use between strcpy and sprintf in this case.
Let's say str1 = "ab", str2 = "cd", str3 = "ef".
strcat(str1, str2);
This concatenates str2 onto str1, so now
str1 = "abcd"
strchr(str1, '\0'); // this has no effect on its own: it returns a pointer to the terminating '\0' of str1, and the result is discarded
sprintf(str1, "%s", str3);
This prints "ef" into str1 from the start, so the old content is lost.
I believe you wanted to do
sprintf(<pointer returned from strchr>, "%s", str3);
strcpy(str4, str1);
This will just copy str1 to str4.
puts(str4);
This will print the string str4
The problem with your code is that you call strchr but do not keep its return value, which is the position where you could then append.
In this case strcpy and sprintf are similar, but sprintf gives you a lot of formatting options; see the documentation.
http://www.cplusplus.com/reference/cstdio/sprintf/
Also, MAX must be large enough to hold the concatenated result.
This doesn't do anything: strchr(str1,'\0'). Read the documentation of strchr carefully. But you don't need strchr here anyway; you probably just want this:
...
gets(str1);
gets(str2);
gets(str3);
strcpy(str4, str1); // copy str1 into str4
strcat(str4, str2); // append str2 to str4
strcat(str4, str3); // append str3 to str4
puts(str4); // print str4
As you can see, you don't need sprintf either.
But you can do the same thing using only sprintf
...
gets(str1);
gets(str2);
gets(str3);
sprintf(str4, "%s%s%s", str1, str2, str3);
puts(str4);
but then you need neither strcpy nor strcat.
Using all strcpy, strcat and sprintf is a somewhat pointless requirement, but now you should be able to do it.

Reading multiple strings using sscanf() based on a delimiter

I have a string containing multiple words separated by commas, like
char str[]="K&R,c89,c99,c11";
I am trying to read the first 2 words into separate character arrays using sscanf().
sscanf(str, "%[^,] s%[^,]s", str1, str2);
I intended sscanf() to scan through str till reaching a ,, store it to str1, continue scanning till another , and store into str2.
But a value is stored only in str1, while str2 seems to contain garbage.
I tried removing the space in the format string in case it was significant, but it made no difference to the output.
What am I doing wrong? Or is this not possible for multiple words?
I've heard of doing something like this with strtok() but I was wondering if sscanf() could be used for this.
Duh.. It took me a while to see it. Get rid of the s in your format string. The character class [...] takes the place of s and by putting s in there, you are forcing sscanf to look for a literal s in str, e.g.
#include <stdio.h>
#define MAX 8
int main (void) {
    char str[]="K&R,c89,c99,c11";
    char str1[MAX] = "";
    char str2[MAX] = "";
    if (sscanf(str, "%[^,],%[^,]", str1, str2) == 2)
        printf ("str1 : %s\nstr2 : %s\n", str1, str2);
    return 0;
}
Example Use/Output
$ ./bin/sscanfcomma
str1 : K&R
str2 : c89
Also, consider protecting your arrays from overflow with, e.g.
if (sscanf(str, "%7[^,],%7[^,]", str1, str2) == 2)
