Meaning of [^:]: format string in scanf - c

I found this in some code and could not understand what it does:
scanf("%[^:]:%[^:]:%[^:\n]", a, b, c);
There was no mention of the [^:]: format string in the C documentation and I am quite confused.

The format string %[..] is for specifying the possible characters. For example %[A-DF] is for A, B, C, D, and F. And the ^ at the beginning is for any character excluding the characters specified. Hence, the first format string is for reading characters excluding colon. And the next is colon, and so on. You may test the scanf for the following input:
Adam:And:Apple

An important thing to remember is that scanf() expects the input to match the format string exactly as it is written.
For example:
scanf("%d %d",&a,&b);
assuming a and b are integers.
On the command line you enter a, then whitespace, then b, because there is a space between the two %d's. (A space in the format matches any amount of whitespace, including none, and %d skips leading whitespace anyway.) If you put other characters between the conversions, the input must contain those same characters in the same places before the next value is accepted.
Hence in your case:
each field you enter is a string that does not contain ":".
Now consider what happens if the input does contain ":".
For example:
if the input is "some:init", the first %[^:] stores "some", then the ":" is matched, and the second %[^:] stores "init". scanf then waits for one more ":" followed by a string containing neither ":" nor a newline.
Ultimately, accepting the input in this format does nothing beyond splitting it into the three strings.
What matters afterwards is how you print them to the console.

Related

What does scanf("%d/%d/%d") mean in C?

int currD, currM, currY;
scanf("%d/%d/%d", &currD, &currM, &currY);
I saw this code reading a birth date in the format DD/MM/YYYY, but I wonder what the '/' in the format string means. I know that without it the '/' characters in the input lead to bad input. So what does it actually mean?
When encountering code that you don’t understand, and which is calling a function from a library, your first order of business is to research the documentation for that function. For C standard functions it’s enough to google the function name.
A good reference in this case is cppreference (don’t be misled by the website name, this is the C reference, not the C++ reference). It gives the function’s definition as
int scanf( const char *format, ... );
Now look for the parameter description of the format parameter:
pointer to a null-terminated character string specifying how to read the input.
The subsequent text explains how to read the format string. In particular:
[…] character [except %] in the format string consumes exactly one identical character from the input stream, or causes the function to fail if the next character on the stream does not compare equal.
conversion specifications [in] the following format
introductory % character
conversion format specifier
d — matches a decimal integer.
In other words:
scanf parses a textual input based on the format string. Inside the format string, / matches a slash in the user input literally. %d matches a decimal integer.
Therefore, scanf("%d/%d/%d", …) will match a string consisting of three integers separated by slashes, and store the number values inside the pointed-to variables.
The / is just the separator in the date format. The error arises when scanf looks for those / characters and does not find them.
The first parameter of scanf is a string that specifies the format of the input whose values are to be stored through the remaining arguments. You can see this format string as a pattern: %d means an integer, and any character without a '%' has to match the input exactly.
The input is expected to look like 04/07/2019.
If the input is only 04072019, currD alone receives the value 4072019; currM and currY are not assigned and may hold garbage values, since they were not initialised.
It expects the input to be in the format three integers separated by two slashes ("/"). For example: 10/11/1999.

Why there should not be type specifier like s or c after [0-9A-Z^%]?

For example consider the following code -
fscanf(fp,"%d:%d:%[^:]:%[^\n]\n",&pow->no,&pow->seen,pow->word,pow->means);
printf("\ntthis is what i read--\n%d:%d:%s:%s:\n",pow->no,pow->seen,pow->word,pow->means);
here pow is pointer to an object declared before,
when I put s as in fscanf(fp,"%d:%d:%[^:]s:%[^\n]\n" the 3rd one is read but not the last one
output is --
4:0:Abridge::
but when i do fscanf(fp,"%d:%d:%[^:]:%[^\n]s\n" all are read
output is --
4:0:Abridge:To condense:
AND without s anywhere fscanf(fp,"%d:%d:%[^:]:%[^\n]\n" all are read
output is --
4:0:Abridge:To condense:
WHY??
To answer your question about the meaning of %[^\n]s: it is two directives, the conversion %[^\n] followed by a literal character s.
The first one scans anything other than \n; when it reaches a \n it stops and leaves it unread in the input stream. But the format doesn't stop there: scanf then tries to match the letter s, and if it doesn't find it, it fails. (The explanation for %[^:]s is the same.)
Now decide if this is what you really want. %[^\n] on its own is the right one; it scans until \n is found (and, unlike %s, it doesn't skip leading whitespace). The scanset already covers letters, including s. More than that, %[^\n]s is self-contradictory: the character after %[^\n] stops can only be \n, so the s can never match. It only appeared to work because the failure comes after all four values have been stored. So there is no use for it either.
%d:%d:%[^:]s:%[^\n]
%d - Matches an optionally signed decimal integer. (Ignore whitespace)
: - Then looks for ':'
%d - Matches an optionally signed decimal integer. (Ignore whitespace)
: - Then looks for ':'
%[^:] - No white space ignored - everything is taken into input except `:`
':' is unread.
s - Tries to match 's'. No white space ignored.
%[^\n] - Everything except '\n' inputted. `\n` left unread.
The specifier IS "%[]", you don't need the "s" there.
Read the manual page for scanf()
Your format string doesn't match the input because the "s" is not part of the specifier, and it's not present in the input where the format is expecting it.
By reading the documentation in the link above, you will find out — if you don't already know — that you should also check the return value of scanf() before calling printf() or otherwise your code will invoke undefined behavior, because some of the passed pointers don't get initialized.

splitting string in c

I have a file where each line looks like this:
cc ssssssss,n
where the two first 'c's are individual characters, possibly spaces, then a space after that, then the 's's are a string that is 8 or 9 characters long, then there's a comma and then an integer.
I'm really new to C and I'm trying to figure out how to put this into 4 separate variables per line (each of the first two characters, the string, and the number).
Any suggestions? I've looked at fscanf and strtok but i'm not sure how to make them work for this.
Thank you.
I'm assuming this is a C question, as the question suggests, not C++ as the tags perhaps suggest.
Read the whole line in.
Use strchr to find the comma.
Do whatever you want with the first two characters.
Switch the comma for a zero, marking the end of a string.
Call strcpy from the fourth character on to extract the sssssss part.
Call atoi on one character past where the comma was to extract the integer.
A string is a sequence of characters that ends at the first '\0'. Keep this in mind. What you have in the file you described isn't a string.
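I presume n is an integer that could span multiple decimal digits and could be negative. If that's the case, I believe the format string you require is "%2[^ ] %9[^,\n],%d". (Note that %2[^ ] does not accept spaces, so if either of the first two characters can be a space, as the question allows, %2c would be needed for that field instead; %2c does not append a '\0'.) You'll want to pass fscanf the following expressions: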
Your FILE *,
The format string,
An array of 3 chars silently converted to a pointer,
An array of 10 chars silently converted to a pointer (9 characters plus the terminating '\0'),
... and a pointer to int.
Store the return value of fscanf into an int. If fscanf returns negative, you have a problem such as EOF or some other read error. Otherwise, fscanf tells you how many objects it assigned values into. The "success" value you're looking for in this case is 3. Anything else means incorrectly formed input.
I suggest reading the fscanf manual for more information, and/or for clarification.
fscanf function is very powerful and can be used to solve your task:
We need to read two chars - the format is "%c%c".
Then skip a space (just add it to the format string) - "%c%c ".
Then read a string until we hit a comma. Don't forget to specify max string size. So, the format is "%c%c %10[^,]". 10 - max chars to read. [^,] - list of allowed chars. ^, - means all except a comma.
Then skip a comma - "%c%c %10[^,],".
And finally read an integer - "%c%c %10[^,],%d".
The last step is to be sure that all 4 tokens are read - check fscanf return value.
Here is the complete solution:
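FILE *f = fopen("input_file", "r");
if (f != NULL)
{
    char c1 = 0;
    char c2 = 0;
    char str[11] = {0};
    int d = 0;
    int ch;
    /* Stop at end of file or at the first line that does not match. */
    while (4 == fscanf(f, "%c%c %10[^,],%d", &c1, &c2, str, &d))
    {
        // successfully got 4 values from the file

        /* Discard the rest of the line, including the newline,
           so that the next %c does not read it. */
        while ((ch = fgetc(f)) != '\n' && ch != EOF)
            ;
    }
    fclose(f);
}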

Invalid output with fscanf()

The language I am using is C
I am trying to scan data from a file, and the code segment is like:
char lsm;
long unsigned int address;
int objsize;
while(fscanf(mem_trace,"%c %lx,%d\n",&lsm,&address,&objsize)!=EOF){
printf("%c %lx %d\n",lsm,address,objsize);
}
The file which I read from has the first line as follows:
S 00600aa0,1
I 004005b6,5
I 004005bb,5
I 004005c0,5
S 7ff000398,8
The results that show in stdout is:
8048350 134524916
S 600aa0 1
I 4005b6 5
I 4005bb 5
I 4005c0 5
S 7ff000398,8
Obviously, the results have an extra line that comes from nowhere. Does anybody know how this could happen?
Thx!
This works for me on the data you supply:
#include <stdio.h>
int main(void)
{
    char lsm[2];
    long unsigned int address;
    int objsize;
    while (scanf("%1s %lx,%d\n", lsm, &address, &objsize) == 3)
        printf("%s %9lx %d\n", lsm, address, objsize);
    return 0;
}
There are multiple changes. The simplest and least consequential is the change from fscanf() to scanf(); that's for my convenience.
One important change is the type of lsm, from a single char to an array of two characters. The format string then uses %1s, which reads one character (plus the terminating NUL '\0') into the string, but it also (and this is crucial) skips leading blanks.
Another change is the use of == 3 instead of != EOF in the condition. If something goes wrong, scanf() returns the number of successful matches. Suppose that it managed to read a letter but what followed was not a hex number; it would return 1 (not EOF). Further, it would return 1 on each iteration until it could find something that matched a hex number. Always test for the number of values you expect.
The output format was tidied up with the %9lx. I was testing on a 64-bit system, so the 9-digit hex converts fine. One problem with scanf() is that if you get an overflow on a conversion, the behaviour is undefined.
Output:
S 600aa0 1
I 4005b6 5
I 4005bb 5
I 4005c0 5
S 7ff000398 8
Why did you get the results you got?
The first conversion read a space into lsm, but then failed to convert S into a hex number, so it was left behind for the next cycle. So, you got the left-over garbage printed in the address and object size columns. The second iteration read the S and was then in synchrony with the data until the last line. The newline at the end of the format (like any other white space in the format string) eats white space, which is why the last line worked despite the leading blank.
A directive that is a conversion specification defines a set of
matching input sequences, as described below for each specifier. A
conversion specification is executed in the following steps:
Input white-space characters (as specified by the isspace function)
are skipped, unless the specification includes a [, c, or n specifier.
An input item is read from the stream, unless the specification
includes an n specifier.
[...]
The first time you call fscanf, your %c reads the first blank space in the file. Your white-space character reads zero or more characters of white-space, this time zero of them. Your %lx fails to match the S character in the file, so fscanf returns. You don't check the result. Your variables contain values that they had from earlier operations.
The second time you call fscanf, your %c reads the first S character in the file. From that point on, everything else succeeds too.
Added in editing, here is the simplest change to your format string to solve your problem:
" %c %lx,%d\n"
The space at the beginning will read zero or more characters of white-space and then %c will read the first non-white-space character in the file.
Here is another format string that will also solve your problem:
" %c %lx,%d"
The reason is that if you read and discard zero or more white-space characters twice in a row, the result is the same as doing it just once.
I think that fscanf reads the first character (a space) into lsm and then fails to read address and objsize, because the rest of the format doesn't match the rest of the line.
It then prints a space followed by whatever happened to be in address and objsize, which were never initialised.
EDIT--
fscanf consumes the whitespaces after each call, if you call ftell you'll see
printf("%c %lx %d %ld\n",lsm,address,objsize,ftell(mem_trace));

Parsing a line in C

I have a file, in which each line contains several words that are separated by variable amount of whitespace characters (spaces and tabs). For example:
do that param1 param2 param3
do this param1
(The number of words in a line is unknown in advance and is unbounded)
I'm looking for a way to parse such a line in plain C, so that I'll have a pointer to string containing the first word, a pointer to a string containing the second word, and a pointer to a string containing everything else (that is - all of the line, except the first two words). The idea is that the "rest of the line" string will be further parsed by a callback function, determined by the first two words).
Getting the first two words is easy enough (a simple sscanf), but I have no idea how to get the "rest of the line" pointer (As sscanf stops at whitespace, and I don't know the amount of whitespace before the first word, and between the first and the second word).
Any idea will be greatly appreciated.
You can use sscanf for the rest of the line as well. You just use a "scanset" conversion instead of a string conversion:
char word1[256], word2[256], remainder[1024];
sscanf(input_line, "%255s %255s %1023[^\n]", word1, word2, remainder);

Resources