I have an application written in C that reads text messages from a modem using AT commands. A typical AT response from the modem looks like this:
+CMGL: 1,"REC READ","+31612123738",,"08/12/22,11:37:52+04"
The code is currently set up to only retrieve the id from this line, which is the first number, and it does so using the following code:
sscanf(line, "+CMGL: %d,", &entry);
Here, "line" is a character array containing a line from the modem, and "entry" is an integer in which the id is stored. I tried extending this code like this:
sscanf(line, "+CMGL: %d,\"%*s\",\"%s\",", &entry, phonenr);
I figured I would use the %*s to scan for the text in the first pair of quotes and skip it, and read the text in the next pair of quotes (the phone number) into the phonenr character array.
This doesn't work (%*s apparently reads "REC" and the next %s doesn't read anything).
An extra challenge is that the text isn't restricted to "REC READ"; it could in fact be many things, including text without a space in it.
sscanf is not very good for parsing; use strchr instead. Without error handling:
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *CGML_text = "+CMGL: 1,\"REC READ\",\"+31612123738\",,\"08/12/22,11:37:52+04\"";
    const char *comma, *phone_number_start, *phone_number_end;

    comma = strchr(CGML_text, ',');          /* comma after the id */
    comma = strchr(comma + 1, ',');          /* comma after the status text */
    phone_number_start = comma + 2;          /* skip the comma and the opening quote */
    phone_number_end = strchr(phone_number_start, '"') - 1;  /* last character of the number */
    /* %.*s takes an int precision, so cast the pointer difference */
    printf("Phone number is '%.*s'\n", (int)(phone_number_end + 1 - phone_number_start), phone_number_start);
    return 0;
}
(updated with tested, working code)
The way I solved it now is with the following code:
sscanf(line, "+CMGL: %d,\"%*[^\"]\",\"%[^\"]", &entry, phonenr);
This would first scan for a number (%d), then for an arbitrary string of characters that are not double quotes (and skip them, because of the asterisk), and for the phone number it does the same.
However, I'm not sure yet how robust this is.
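One way to make the scan-set version a little more robust is to bound the phone number field and check the return value of sscanf. The following is only a minimal sketch: the helper name parse_cmgl and the 32-byte buffer size are assumptions for illustration, not something from the original code.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the message id and the phone number from a
 * "+CMGL:" line. Returns 0 on success, -1 if the line does not match.
 * phonenr must have room for at least 32 bytes (an assumed limit). */
int parse_cmgl(const char *line, int *entry, char *phonenr)
{
    /* %*[^\"] skips the status text; %31[^\"] stores at most 31 characters
     * plus the terminating null byte. */
    int n = sscanf(line, "+CMGL: %d,\"%*[^\"]\",\"%31[^\"]", entry, phonenr);

    return n == 2 ? 0 : -1;
}
```

Note that a scan set has to match at least one character, so a line with an empty status field ("") would be rejected by this sketch rather than parsed.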
You can use strchr() to find the position of '+' in the string, and extract the phone number after it. You may also try to use strtok() to split the string with '"', and analyze the 3rd part.
%s in scanf() reads until whitespace.
You're very close to a solution.
To read this:
+CMGL: 1,"REC READ"
You need:
"+CMGL: %d,\"%*s %*s"
I am working on this piece of code that reads a file with records delimited by percent signs (%) and then saves the values in a node struct. The input is as follows:
2%c1%d3%33445.000000%2016%4%11
1%c2%d2%234.500000%2016%4%11
0%c1%d1%123.400000%2016%4%11
Each line will be a node containing the data separated by the percent signs. I am using fscanf to read the formatted input and save the values in the specific variables. It works well if the delimiter is any character but '%'.
I tried escaping the percent sign by writing '%%', but it doesn't work and fscanf returns -1. I have looked everywhere for a way of doing this, but can't find anything. Any help would be greatly appreciated. The following is a snippet from my code.
int recordID;
char category[255];
char detail[255];
float amount;
int year;
int month;
int day;
while(fscanf(pFile, "%d%%%s%%%s%%%f%%%d%%%d%%%d", &recordID, category, detail, &amount, &year, &month, &day) == 7) {
struct node* p = (struct node*) malloc(sizeof(struct node));
p->recordID = recordID;
copy_array(category, p->category, 255);
copy_array(detail, p->detail, 255);
p->amount = amount;
p->year = year;
p->month = month;
p->day = day;
add_node(p);
}
The pFile is the file containing the input specified above.
Thank you!
The problem is that %s reads a string, and you're not telling it to stop at the delimiter, so it gobbles up the % and everything past it. Use %[^%] instead of %s:
fscanf(pFile, "%d%%%[^%]%%%[^%]%%%f%%%d%%%d%%%d", ...
If you've never seen the scanf specifier %[, it works like this: %[abc] scans any combination of a's, b's, and c's. %[^abc] scans any string not containing an a, b, or c. You can also use ranges, like %[0-9]. Otherwise it's mostly like %s, writing to a char * destination buffer.
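To make the scan-set forms concrete, here is a small sketch that uses a negated set and a range together. The function name and the 16-byte buffers are made up for the example, and it assumes the implementation treats 0-9 inside the brackets as a range, which is common but not strictly guaranteed by the C standard.

```c
#include <stdio.h>

/* Hypothetical demo: splits input such as "abc123" into the leading run of
 * non-digits and the run of digits that follows. Both buffers must hold at
 * least 16 bytes. Returns the number of fields converted (2 on success). */
int split_text_digits(const char *s, char *text, char *digits)
{
    /* %15[^0-9] reads up to 15 characters that are NOT digits;
     * %15[0-9] then reads up to 15 characters that ARE digits. */
    return sscanf(s, "%15[^0-9]%15[0-9]", text, digits);
}
```

A scan set must match at least one character, so input that starts with a digit makes the first conversion fail and the call returns 0.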
(As an aside, whoever chose % as a delimiter should be shot. I changed all the %'s to |'s, both in your code and your data file, so I could debug it without losing my mind, and then I changed them all back to % at the end, after I got it working.)
Addendum: John Bollinger is absolutely right, you need to worry about buffer overflow, also, as his solution shows.
[f]scanf() does not do pattern matching the way you hoped. In particular, the %s field descriptor matches a whitespace-delimited string, whereas you need to match a string delimited by '%' characters. You might have gotten a clue about this if you had examined the actual scanf() return value (but good on you for at least comparing it with the one you expected!).
You can match a string of characters from a given set via the %[ field descriptor, as @SteveSummit explained in his answer. Moreover, it is a good idea to specify maximum field widths in your format, so as to avoid overrunning the bounds of your arrays. That would be particularly effective with the format you are scanning, as an overlength input field for either of your strings will cause a matching failure to occur with the subsequent delimiter:
fscanf(pFile, "%d%%%254[^%]%%%254[^%]%%%f%%%d%%%d%%%d", &recordID, category,
detail, &amount, &year, &month, &day)
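For a version that is easy to try in isolation, the same format can be applied to a single line with sscanf instead of fscanf. This is a sketch under assumptions: struct record and parse_record are hypothetical names that mirror the variables in the question, not the asker's actual node type.

```c
#include <stdio.h>

/* Hypothetical record type mirroring the variables in the question. */
struct record {
    int recordID;
    char category[255];
    char detail[255];
    float amount;
    int year, month, day;
};

/* Parses one '%'-delimited line. Returns 0 if all seven fields were
 * converted, -1 otherwise. */
int parse_record(const char *line, struct record *r)
{
    /* "%%" matches a literal '%'; inside a scan set, '%' needs no escaping.
     * The width 254 leaves room for the terminating null byte. */
    int n = sscanf(line, "%d%%%254[^%]%%%254[^%]%%%f%%%d%%%d%%%d",
                   &r->recordID, r->category, r->detail, &r->amount,
                   &r->year, &r->month, &r->day);

    return n == 7 ? 0 : -1;
}
```

Reading each line with fgets and passing it to a function like this keeps a malformed line from affecting how the following lines are scanned.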
I was wondering if it is possible to only read in particular parts of a string using scanf.
For example, since I am reading from a file, I use fscanf.
Suppose I want to read the name and number (where the number is 111-2222) when they are in a string such as:
Bob Hardy:sometext:111-2222:sometext:sometext
I use this, but it's not working:
(fscanf(read, "%23[^:] %27[^:] %10[^:] %27[^:] %d\n", name,var1, number, var2, var3))
Your initial format string fails because it does not consume the : delimiters.
If you want scanf() to read a portion of the input, but you don't care what is actually read, then you should use a field descriptor with the assignment-suppression flag (*):
char nl;
fscanf(read, "%23[^:]:%*[^:]:%10[^:]%*[^\n]%c", name, number, &nl);
As a bonus, you don't need to worry about buffer overruns for fields with assignment suppressed.
You should not attempt to match a single newline via a trailing newline character in the format, because a literal newline (or space or tab) in the format will match any run of whitespace. In this particular case, it would consume not just the line terminator but also any leading whitespace on the next line.
The last field is not suppressed, even though it will almost always receive a newline, because that way you can tell from the return value if you've scanned the last line of the file and it is not newline-terminated.
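The assignment-suppression idea can be tried on a string with sscanf. The sketch below is an illustration only: parse_name_number is a hypothetical helper, and the buffer sizes (24 and 11 bytes) are assumed to match the field widths 23 and 10 used above.

```c
#include <stdio.h>

/* Hypothetical helper: pulls the first and third ':'-separated fields out
 * of a line. name needs 24 bytes and number needs 11 bytes.
 * Returns 0 on success, -1 otherwise. */
int parse_name_number(const char *line, char *name, char *number)
{
    /* Field 1 is stored, field 2 is matched but discarded because of the
     * '*', and field 3 is stored. Suppressed fields are not counted in
     * the return value, so success means 2 conversions. */
    int n = sscanf(line, "%23[^:]:%*[^:]:%10[^:]", name, number);

    return n == 2 ? 0 : -1;
}
```

With the sample line from the question, name receives "Bob Hardy" and number receives "111-2222".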
Check fscanf() return value.
fscanf(read, "%23[^:] %27[^:] ... is failing because after scanning the first field with %23[^:], fscanf() encounters a ':'. The space in the format matches zero or more white-space characters, so it consumes nothing here, and the next directive, %27[^:], cannot match a ':', so scanning stops.
Had the code checked the return value of fscanf(), which was certainly 1, the source of the problem might have been evident. The scanning needs to consume the ':', so add it to the format: "%23[^:]: %27[^:]: ...
Better to use fgets()
Using fscanf() to read data and detect both properly and improperly formatted input is very challenging. It can be done correctly for the expected input, yet it rarely copes well with incorrectly formatted input.
Instead, simply read a line of data and then parse it. Using '%n' is an easy way to detect complete conversion, as it saves the count of characters scanned, if scanning gets that far.
char buffer[200];
if (fgets(buffer, sizeof buffer, read) == NULL) {
return EOF;
}
int n = 0;
sscanf(buffer, " %23[^:]: %27[^:]: %10[^:]: %27[^:]:%d %n",
name, var1, number, var2, &var3, &n);
if (n == 0) {
return FAIL; // scan incomplete
}
if (buffer[n]) {
return FAIL; // Extra data on line
}
// Success!
Note: sample input ended with text, but original format used "%d". Unclear on OP's intent.
I have the following problem:
sscanf is not returning the way I want it to.
This is the sscanf:
sscanf(naru,
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]",
&jokeri, &paiva1, &keskilampo1, &minlampo1, &maxlampo1,
&paiva2, &keskilampo2, &minlampo2, &maxlampo2, &paiva3,
&keskilampo3, &minlampo3, &maxlampo3, &paiva4, &keskilampo4,
&minlampo4, &maxlampo4, &paiva5, &keskilampo5, &minlampo5,
&maxlampo5, &paiva6, &keskilampo6, &minlampo6, &maxlampo6,
&paiva7, &keskilampo7, &minlampo7, &maxlampo7);
The string it's scanning:
const char *str = "city;"
"2014-04-14;7.61;4.76;7.61;"
"2014-04-15;5.7;5.26;6.63;"
"2014-04-16;4.84;2.49;5.26;"
"2014-04-17;2.13;1.22;3.45;"
"2014-04-18;3;2.15;3.01;"
"2014-04-19;7.28;3.82;7.28;"
"2014-04-20;10.62;5.5;10.62;";
All of the variables are stored as char paiva1[22] etc; however, the sscanf isn't storing anything except the city correctly. I've been trying to stop each variable at ;.
Any help how to get it to store the dates etc correctly would be appreciated.
Or if there's a smarter way to do this, I'm open to suggestions.
There are multiple problems, but BLUEPIXY hit the first one — the scan-set notation doesn't follow %s.
Your first line of the format is:
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
As it stands, it looks for a space separated word, followed by a [, a ^, a ;, and a ] (which is self-contradictory; the character after the string is a space or end of string).
The first fixup would be to use scan-sets properly:
"%[^;]%[^;]%[^;]%[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
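Now you have a problem: the first %[^;] scans everything up to the end of the string or the first semicolon, and the second %[^;] then fails because the next character is the semicolon, which it cannot match. The format has to consume the semicolons too: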
"%[^;]; %[^;]; %[^;]; %[^;]; %f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
This looks for a string up to a semicolon, then for the semicolon, then optional white space, then repeats for three items. Apart from adding a length to limit the size of string, preventing overflow, these are fine. The %f is OK. The following material looks for an odd sequence of characters again.
However, when the data is looked at, it seems to consist of a city, and then seven sets of 'a date plus three numbers'.
You'd do better with an array of structures (if you've worked with those yet), or a set of 4 parallel arrays, and a loop:
char jokeri[30];
char paiva[7][30];
float keskilampo[7];
float minlampo[7];
float maxlampo[7];
int eoc; // End of conversion
int offset = 0;
char sep;
// str is a string, not a FILE *, so the call is sscanf(), not fscanf()
if (sscanf(str + offset, "%29[^;]%c%n", jokeri, &sep, &eoc) != 2 || sep != ';')
    ...report error...
offset += eoc;
for (int i = 0; i < 7; i++)
{
    if (sscanf(str + offset, "%29[^;];%f;%f;%f%c%n", paiva[i],
               &keskilampo[i], &minlampo[i], &maxlampo[i], &sep, &eoc) != 5 ||
        sep != ';')
        ...report error...
    offset += eoc;
}
See also How to use sscanf() in loops.
Now you have data that can be managed. The set of 29 separately named variables is a ghastly thought; the code using them will be horrid.
Note that the scan-set conversion specifications limit the string to a maximum length one shorter than the size of jokeri and the paiva array elements.
You might legitimately be wondering about why the code uses %c%n and &sep before &eoc. There is a reason, but it is subtle. Suppose that the sscanf() format string is:
"%29[^;];%f;%f;%f;%n"
Further, suppose there's a problem in the data that the semicolon after the third number is missing. The call to sscanf() will report that it made 4 successful conversions, but it doesn't count the %n as an assignment, so you can't tell that sscanf() didn't find a semicolon and therefore did not set &eoc at all; the value is left over from a previous call to sscanf(), or simply uninitialized. By using the %c to scan a value into sep, we get 5 returned on success, and we can be sure the %n was successful too. The code checks that the value in sep is in fact a semicolon and not something else.
You might want to consider a space before the semi-colons, and before the %c. They'll allow some other data strings to be converted that would not be matched otherwise. Spaces in a format string (outside a scan-set) indicate where optional white space may appear.
I would use strtok function to break your string into pieces using ; as a delimiter. Such a long format string may be a source of problems in future.
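A rough sketch of the strtok approach might look like the following. The struct day type and the parse_forecast function are hypothetical names introduced for the example. Two caveats apply: strtok modifies the buffer, so it needs a writable copy rather than the string literal from the question, and it silently skips empty fields (two semicolons in a row), which may or may not be what you want.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for one date plus its three temperatures. */
struct day {
    char paiva[30];
    float keskilampo, minlampo, maxlampo;
};

/* Splits a writable buffer on ';'. Stores the city name and up to maxdays
 * complete day entries. Returns the number of days parsed, or -1 if the
 * city field is missing. */
int parse_forecast(char *buf, char *city, size_t citysz,
                   struct day *days, int maxdays)
{
    char *tok = strtok(buf, ";");
    int n = 0;

    if (tok == NULL)
        return -1;
    snprintf(city, citysz, "%s", tok);      /* bounded copy of the city */

    while (n < maxdays) {
        char *date = strtok(NULL, ";");
        char *avg = strtok(NULL, ";");
        char *min = strtok(NULL, ";");
        char *max = strtok(NULL, ";");

        if (!date || !avg || !min || !max)
            break;                          /* stop at an incomplete group */
        snprintf(days[n].paiva, sizeof days[n].paiva, "%s", date);
        days[n].keskilampo = strtof(avg, NULL);
        days[n].minlampo = strtof(min, NULL);
        days[n].maxlampo = strtof(max, NULL);
        n++;
    }
    return n;
}
```

This sketch does not validate the numbers; passing an end pointer to strtof and checking it would be the place to add that.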
Here's my code:
FILE* fp,*check;
fp=fopen("file.txt","r");
check=fp;
char polyStr[10];
while(fgetc(check)!='\n')
{
fscanf(fp,"%s",polyStr);
puts(polyStr);
check=fp;
}
while(fgetc(check)!=EOF)
{
fscanf(fp,"%s",polyStr);
puts(polyStr);
check=fp;
}
Now if my file.txt is:
3,3, 4,4, 5,5
4,1, 5,5, 12,2
Now output is:
,3,
4,4,
5,5,
,1,
5,5,
12,2,
Now why is the first character of both the lines not getting read?
Your fgetc call is eating the character.
You should read entire lines with fgets and then parse them with the strtol family. You should never use any of the *scanf functions.
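As a sketch of what parsing a line with strtol could look like, the function below walks one line of "coef,exp" pairs. The name parse_pairs and the array-based interface are assumptions made for the example. It accepts an optional comma after each pair, since the sample output suggests the pairs may be followed by one.

```c
#include <stdlib.h>

/* Hypothetical helper: reads "coef,exp" pairs from one line into the two
 * arrays, storing at most max pairs. Returns the number of pairs read. */
size_t parse_pairs(const char *line, long *coef, long *expo, size_t max)
{
    const char *p = line;
    size_t n = 0;

    while (n < max) {
        char *end;
        long c = strtol(p, &end, 10);   /* skips leading white space */

        if (end == p || *end != ',')
            break;                      /* no number, or no comma after it */
        p = end + 1;

        long e = strtol(p, &end, 10);
        if (end == p)
            break;                      /* exponent is missing */

        coef[n] = c;
        expo[n] = e;
        n++;

        p = end;
        if (*p == ',')
            p++;                        /* optional comma between pairs */
    }
    return n;
}
```

Each line would first be read with fgets into a buffer and then handed to this function, so no character is consumed by a separate fgetc call.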
Let's talk about the format of the input data first. Your list would seem to be better formatted if you only had <coef>,<exp> without the trailing comma. In this way, you would have a nice pattern with which to match. So you could do something like:
fscanf(filep, "%d,%d", &coef, &exp)
to get the values. You should check the return value from fscanf to be sure that you are reading 2 fields. So if the format of a line was a set of the following '<coef>,<exp><white-space>' (where white-space is either one blank or one newline), then you would be able to do the following:
do {
fscanf(filep, "%d,%d", &coef, &exp);
} while (fgetc(filep) != '\n');
This code allows you to get the pairs until you eat the end of line. The while conditional will eat either the blank or the newline. You can wrap this in another loop for processing several lines.
Note that I have NOT tested this code, but the gist of it should be clear. Comment if you have any more questions.
I have a semi xml formatted file that contains line with the following format:
<param name="Distance" value="1000Km" />
The first char in the string is usually a TAB or spaces.
I've been using the following to try to parse the two strings out (from name and value):
if(sscanf(lineread, "\t<param name=\"%s\" value=\"%s\" />", name, value) == 1)
{
//do something
}
name and value are char*
Now, the result is always the same: name gets parsed (I need to remove the quotes) and value is always empty.
What am I doing wrong?
Thanks, code is appreciated.
Jess.
As usual, a scanset is probably your best answer:
sscanf(lineread, "%*[^\"]\"%[^\"]\"%*[^\"]\"%[^\"]\"", name, value);
Of course, for real code you also want to limit the lengths of the conversions:
#include <stdio.h>
int main() {
char lineread[] = "<param name=\"Distance\" value=\"1000Km\" />";
char name[256], value[256];
sscanf(lineread, "%*[^\"]\"%255[^\"]\"%*[^\"]\"%255[^\"]\"", name, value);
printf("%s\t%s", name, value);
return 0;
}
Edit: BTW, sscanf returns the number of successful conversions, so in your original code, you probably wanted to compare to 2 instead of 1.
Edit2: This much: %*[^\"]\" means "read and ignore characters other than a quote mark, then read and skip across a quote mark". The next %255[^\"]\" means "read up to 255 characters other than a quote mark, then read and skip across a quote mark". That whole pattern is then repeated to read the second string.
The problem with the original code was that %s stops only after seeing a space. Hence, name gets Distance" not Distance as expected.