Format Checking in C

I'm trying to write a program in C that reads a large database of ice rink activities from a text file. Does anyone know how to check for lines that are not in the expected format?
For example, the text document will have something like this:
sample
---------------------------------------------
date startT endT END
_______________________________________________
Ice Rink 1
1/13/2014 1:50 3:50 PM Public Skating
1/13/2014 1:50 3:50 PM Game
ice rink 2
1/13/2014 1:50 3:50 PM OPEN
I can already successfully read in one line of the event, date time and description
but how do I skip or detect the lines that don't match my scan in style of
fscanf(ifp,"%d/%d/%d\t%d:%d%s\t%d:%d%s\t\t %20c",
&e1[i].month,&e1[i].day,&e1[i].year,&e1[i].startH,&e1[i].startM,e1[i].MER1,&e1[i].endH,&e1[i].endM,e1[i].MER2,e1[i].event);
In short: how do I detect cases that don't match this exactly?
Thanks in advance

As others have already said, you can check the return value of fscanf to find out whether a line is in the given format. That isn't the ideal approach, however. First, your data is organised line-wise, but fscanf treats the newline character like any other whitespace. You could read the line with fgets first and then apply sscanf on the line, but you'd still have one big monolithic format specifier that is easy to lose track of.
I'd like to propose another approach. Your data lines seem to be organised in fields, which are separated from each other with tab characters. You could read the lines with fgets, then split them with strtok and finally scan the separate fields with sscanf. If you write custom wrapper functions around your sscanf statements, you can run a sanity check on the data when it's read.
/*
* Return true if str has format "hh:min AM/PM"
*/
int scan_time(const char *str, int *hh, int *mm)
{
char buf[4] = {0};
int n;
char c;
n = sscanf(str, "%d:%d%3s %c", hh, mm, buf, &c); /* %3s: 3 chars + '\0' fit in buf[4] */
if (n == 4) return 0; /* trailing extra chars */
if (n < 2) return 0; /* missing minutes */
if (n == 3) {
int key = (buf[0] << 16) + (buf[1] << 8) + buf[2];
#define KEY(a, b) ((a << 16) + (b << 8))
switch (key) {
case KEY('a', 'm'):
case KEY('A', 'M'):
if (*hh == 12) *hh = 0; /* 12 AM is 00:xx */
break;
case KEY('p', 'm'):
case KEY('P', 'M'):
if (*hh < 12) *hh += 12; /* 12 PM stays 12:xx */
break;
default:
return 0; /* invalid am/pm spec */
}
}
if (*hh < 0 || *hh >= 24) return 0; /* invalid hours */
if (*mm < 0 || *mm >= 60) return 0; /* invalid minutes */
return 1;
}
/*
* Return true, if str has format "mm/dd/year"
*/
int scan_date(const char *str, int *yy, int *mm, int *dd)
{
static const int mdays[] = {
0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};
int n;
char c;
n = sscanf(str, "%d/%d/%d %c", mm, dd, yy, &c);
if (n == 4) return 0; /* trailing extra chars */
if (n < 2) return 0; /* missing day */
if (n == 2) *yy = 2014; /* set default value */
if (*yy < 100) *yy += 2000; /* allow 1/1/14 */
if (*mm < 1 || *mm > 12) return 0; /* invalid month */
if (*dd < 1 || *dd > mdays[*mm]) return 0; /* invalid day */
if (*mm == 2 && *dd == 29 && *yy % 4) return 0; /* Feb 29 in a non-leap year (simple 4-year rule) */
return 1;
}
/*
* Return true if line is "date \t time \t time \t text"
*/
int scan_line(char *str, struct Event *ev)
{
char *token;
token = strtok(str, "\t");
if (token == NULL) return 0;
if (!scan_date(token, &ev->year, &ev->month, &ev->day)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
if (!scan_time(token, &ev->startH, &ev->startM)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
if (!scan_time(token, &ev->endH, &ev->endM)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
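strncpy(ev->event, token, 39); /* assumes event holds at least 40 chars */
ev->event[39] = '\0'; /* strncpy does not terminate a truncated copy */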
return 1;
}
/*
* Remove trailing newline
*/
void chomp(char *str)
{
int l = strlen(str);
if (l && str[l - 1] == '\n') str[l - 1] = '\0';
}
/*
* Scan file with events
*/
int scan_file(const char *fn)
{
FILE *f = fopen(fn, "r");
if (f == NULL) return -1;
for (;;) {
struct Event ev;
char line[200];
if (fgets(line, 200, f) == NULL) break;
chomp(line);
if (scan_line(line, &ev)) {
printf("%s on %d/%d/%d\n",
ev.event, ev.month, ev.day, ev.year);
}
}
fclose(f);
return 0;
}
Here, the scan_xxx functions scan a piece of data, check the format, assign the data and run a basic check on the data, so that you'll never get an event on the 32nd of January or at 35:00h.
This makes the scanning functions more complicated than a single call to sscanf, but there are some benefits. First, the checks are done when reading the format. That means you don't have to check your data in the client code, because you can rely on sensible values. That also means that you don't have to duplicate code: Note how the checks for the time are coded only once, namely in scan_time, although they are applied twice per line, for the start and end times.
Treating the data field-wise in encapsulated functions allows you to change the format. For example, you could allow "1pm" as valid shortcut for "1:00 pm". You'd just have to re-scanf your time field with a second format string when the first format fails. You can also do that with your long single-line format, but since you have two time fields, that wouldn't be so easy.
Also note how the code above accepts 14 as shortcut for 2014 and interprets a missing year as 2014. All this might seem a bit too complicated for a simple data scanning tool, but you can re-use your functions in similar projects. Also, writing these tidy functions is more fun than wrangling longish scanf formats.
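The "1pm" shortcut mentioned above could be sketched roughly as follows. This is only an illustration: scan_time_flex is a hypothetical name, it is a separate function rather than a patch to scan_time, and it assumes that a time without minutes must carry an am/pm suffix. It tries the full format first and falls back to a second format string only when the first one fails.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical variant of scan_time: accepts "h:mm", "h:mm am/pm" and the
 * shortcut "h am/pm" (e.g. "1pm"). Stores a 24-hour time in *hh and *mm.
 * Returns 1 on success, 0 on any format or range error.
 */
int scan_time_flex(const char *str, int *hh, int *mm)
{
    char mer[3] = "";   /* "am" or "pm", if present */
    char c;             /* catches trailing garbage */
    int n;

    /* first attempt: the full format with minutes */
    n = sscanf(str, "%d:%d %2s %c", hh, mm, mer, &c);
    if (n < 2) {
        /* second attempt: hour plus mandatory am/pm, no minutes */
        if (sscanf(str, "%d %2s %c", hh, mer, &c) != 2) return 0;
        *mm = 0;
        n = 3;          /* same state as a full time with am/pm */
    }
    if (n == 4) return 0;               /* trailing extra chars */

    if (n == 3) {                       /* convert 12-hour to 24-hour */
        mer[0] = (char)tolower((unsigned char)mer[0]);
        mer[1] = (char)tolower((unsigned char)mer[1]);
        if (strcmp(mer, "am") != 0 && strcmp(mer, "pm") != 0) return 0;
        if (*hh < 1 || *hh > 12) return 0;
        if (*hh == 12) *hh = 0;         /* 12 am is 0:xx, 12 pm is 12:xx */
        if (mer[0] == 'p') *hh += 12;
    }
    if (*hh < 0 || *hh > 23) return 0;  /* invalid hours */
    if (*mm < 0 || *mm > 59) return 0;  /* invalid minutes */
    return 1;
}
```

Because the fallback lives inside one function, both the start and the end time pick it up without any change to the code that splits the line.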

You can check the return value of fscanf: "On success, the function returns the number of items of the argument list successfully filled. This count can match the expected number of items or be less (even zero) due to a matching failure, a reading error, or the reach of the end-of-file." If you know how many items you want to match, you can check how many were actually matched. Keep in mind that a subsequent call picks up where the previous one stopped: the next fscanf starts either after a completed match or at the first character that did not fit the format.
Just brainstorming, but one way to get around this is to use some form of fgets to read the rest of the line, up to the '\n', and do nothing with it.
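A minimal sketch of that idea, under some assumptions: struct Slot, skip_line and read_dates are hypothetical names, only the date part of each line is scanned, and the rest of the line is discarded with fgetc rather than fgets so that no buffer size is needed.

```c
#include <stdio.h>

/* Hypothetical record type; the field names are assumptions. */
struct Slot { int month, day, year; };

/* Discard the remainder of the current line. Returns EOF at end of file. */
static int skip_line(FILE *fp)
{
    int c;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;
    return c;
}

/*
 * Read dates of the form m/d/y, one per line, into out[]. Lines that do
 * not start with a complete date are skipped. Returns the number stored.
 */
size_t read_dates(FILE *fp, struct Slot *out, size_t max)
{
    size_t n = 0;
    while (n < max) {
        struct Slot s;
        int got = fscanf(fp, "%d/%d/%d", &s.month, &s.day, &s.year);
        if (got == EOF) break;          /* nothing left to read */
        if (got == 3) out[n++] = s;     /* full match: keep it */
        /* matched or not, drop whatever is left on this line */
        if (skip_line(fp) == EOF) break;
    }
    return n;
}
```

Note that %d skips leading whitespace, newlines included, so a line that ends right after a '/' would continue matching on the next line. Reading whole lines first avoids that corner case.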

Note that whatever you do, because you have more than one "type" of input line, you aren't going to be fscanfing the data straight into variables in your program. You have to know whether you have a rink name or an activity entry before you can decide what to do with the line.
So you will first read a whole line in, then process it (and dump empty lines as you go).
You can use sscanf to see if the line is of an acceptable format. You will want to test it against the format of the activity entry first, because you will conclude that if the first element doesn't match (the first digit of a time) then you must have a rink-name. Then see if you can scan the result into a suitable rink name (you might want to check something about these).
If the sscanf for the activity entry fails on anything other than the first entry, you can tell your user which one it was and thus what it is that is wrong (i.e. if sscanf returns 3, then you know that the date scanned in properly and the start time did not).
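One way to sketch this classification, assuming the sample layout from the question (date, start time, end time, AM/PM, description): the enum and the name classify_line are made up for this illustration, and a rink name that starts with a digit would be misread as a broken entry.

```c
#include <stdio.h>
#include <string.h>

enum LineKind { LINE_EMPTY, LINE_RINK, LINE_ACTIVITY, LINE_BAD_ACTIVITY };

/*
 * Classify one input line. *matched receives the number of fields that
 * sscanf converted, which tells the caller where a bad entry went wrong.
 */
enum LineKind classify_line(const char *line, int *matched)
{
    int mon, day, year, sh, sm, eh, em;
    char mer[3];
    int n;

    /* a line holding only whitespace is empty */
    if (line[strspn(line, " \t\r\n")] == '\0') {
        *matched = 0;
        return LINE_EMPTY;
    }

    /* test against the activity format first */
    n = sscanf(line, "%d/%d/%d %d:%d %d:%d %2s",
               &mon, &day, &year, &sh, &sm, &eh, &em, mer);
    *matched = n < 0 ? 0 : n;

    if (n == 8) return LINE_ACTIVITY;
    if (n <= 0) return LINE_RINK;   /* no leading number: treat as rink name */
    return LINE_BAD_ACTIVITY;       /* started like an entry, then failed */
}
```

With the sample data, "Ice Rink 1" yields LINE_RINK with 0 fields matched, and a truncated entry such as "1/13/2014 1:50" yields LINE_BAD_ACTIVITY with 5 fields matched, so the caller can report that the end time is missing.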

Related

How to get the strings from a file and store in a 2D char array and compare that 2D char array with a string in C?

I have a text file, it has values(I usually call them as upc_values) of
01080006210
69685932764
40000114485
40000114724
07410855329
72908100004
66484101000
04000049163
43701256600
99999909001
07726009493
78732510053
78732510063
78732510073
78732510093
02842010109
02842010132
78732510213
02410011035
73999911110
char *UPC_val = "99999909001";
char upcbuf[100][12];
char buf[12];
memset(buf,0,sizeof(buf));
memset(upcbuf,0,sizeof(upcbuf));
When I tried to fgets, I stored that in a 2D buffer.
while ( fgets(buf, sizeof(buf), f) != NULL ) {
strncpy(upcbuf[i], buf, 11);
i++;
}
I tried to print the data in the buffer.
puts(upcbuf[0]);
upcbuf[0] has the whole data in a continuous stream,
0108000621069685932764400001144854000011472407410855329729081000046648410100004000049163437012566009999990900107726009493787325100537873251006378732510073787325100930284201010902842010132787325102130241001103573999911110
and I want to compare this upc values(11 digit) with another string(11 digit). I used,
if(strncmp(UPC_Val,upcbuf[i],11) == 0)
{
//do stuff here
}
It didn't work properly, I used strstr() too like,
if(strstr(upcbuf[0],UPC_val) != NULL)
{
//do stuff here
}
I am totally unaware of what it is doing, am I doing the comparison properly?
How to do this, any help please?
Thanks in advance.

To read a line of text of 11 digits and a '\n' into a string, you need an array of at least 13 characters (11 digits, the '\n' and the terminating '\0'). There is little reason to be so tight; I suggest about twice the expected maximum size.
char upcbuf[100][12]; // large enough for 100 * (11 digits and a \0)
...
#define BUF_SIZE (13*2)
char buf[BUF_SIZE];
while (i < 100 && fgets(buf, sizeof buf, f) != NULL ) {
Lop off the potential trailing '\n'
size_t len = strlen(buf);
if (len && buf[len-1] == '\n') buf[--len] = '\0';
Check length and handle that somehow.
if (len != 11) exit(EXIT_FAILURE);
Save/print the data
// strncpy(upcbuf[i], buf, 11); // fails to ensure a null character at the end
strcpy(upcbuf[i], buf);
puts(upcbuf[i]); // print the entry just stored, before advancing i
i++;
To compare strings
if(strcmp(UPC_val,upcbuf[i]) == 0) {
// strings match
}

If you are still having trouble getting the logic to work after @chux's answer, then here is a short example implementing his suggestions that takes the filename to read as the first argument, and optionally the upc to search for as the second argument (it will search for "99999909001" by default [and in that case you can just read the file in on stdin]).
Note the use of an enum to define global constants for your row and column values. (you can use independent #define ROW 128 and #define COL 32 if you like) If you need constants in your code, define them once, at the top, so if they ever need to change, you have a single convenient place to change the values, rather than having to pick through your code, or perform a global search/replace to change them.
For example, you could put the logic together as follows:
#include <stdio.h>
#include <string.h>
enum { COL = 32, ROW = 128 }; /* an enum is convenient for constants */
int main (int argc, char **argv) {
char buf[COL] = "", /* buffer to read each line */
upcbuf[ROW][COL] = { "" }, /* 2D array of ROW x COL chars */
*upcval = argc > 2 ? argv[2] : "99999909001";
size_t n = 0; /* index/counter */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin; /* file */
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
/* fill upcbuf (you could search at same time, but let's fill) */
while (n < ROW && fgets (buf, COL, fp)) {
size_t len = strlen (buf); /* get length */
/* test last char '\n', overwrite w/nul-terminating char */
if (len && buf[len - 1] == '\n')
buf[--len] = 0;
strcpy (upcbuf[n++], buf); /* copy to upcbuf */
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
/* step through upcbuf - search for upcval */
for (size_t i = 0; i < n; i++)
if (strcmp (upcbuf[i], upcval) == 0) {
printf ("upcval: '%s' found at line '%zu'.\n", upcval, i + 1);
return 0;
}
printf ("upcval: '%s' not found in file.\n", upcval);
return 0;
}
Example Use/Output
$ ./bin/upcbuf dat/upcfile.txt
upcval: '99999909001' found at line '10'.
$ ./bin/upcbuf dat/upcfile.txt 01080006210
upcval: '01080006210' found at line '1'.
$ ./bin/upcbuf dat/upcfile.txt 02410011035
upcval: '02410011035' found at line '19'.
$ ./bin/upcbuf dat/upcfile.txt "not there!"
upcval: 'not there!' not found in file.
Also note that if you were simply searching for a single upc, then you could combine read and search in a single loop, but since you often read as a separate function, and then operate on the data elsewhere in your code, this example simply reads all upc values from the file into your array, and then searches through the array in a separate loop. Look things over, look at all answers, and let us know if you have any further questions.
As a final note, you have checked if the last char is '\n', but what happens if it isn't? You should check if the length is COL-1 indicating that additional characters remain unread in that line and handle the error (or just read and discard the remaining chars). You can do that with an addition similar to the following:
/* test last char '\n', overwrite w/nul-terminating char */
if (len && buf[len - 1] == '\n')
buf[--len] = 0;
else if (len == COL - 1) { /* if no '\n' & len == COL - 1 */
fprintf (stderr, "error: line exceeds %d chars.\n", COL - 1);
return 1;
}
And, you need to use the else if and check the COL - 1 and not simply use an else there because you may be reading from a file that does not have a POSIX end-of-line (e.g. a new-line character) after the final line of the file. fgets properly reads the final line, even without a POSIX line ending, but there will be no '\n' in buf. So even without the POSIX line ending, the line can be a valid line, and you are guaranteed to have a complete read, so long as the number of characters read (+ the nul-terminating char) does not equal your buffer size.

Reading multiple lines with different data types in C

I have a very strange problem, I'm trying to read a .txt file with C, and the data is structured like this:
%s
%s
%d %d
Since I have to read the strings all the way to \n I'm reading it like this:
while(!feof(file)){
fgets(s[i].title,MAX_TITLE,file);
fgets(s[i].artist,MAX_ARTIST,file);
char a[10];
fgets(a,10,file);
sscanf(a,"%d %d",&s[i].time.min,&s[i++].time.sec);
}
However, the very first integer I read in s.time.min shows a random big number.
I'm using the sscanf right now since a few people had a similar issue, but it doesn't help.
Thanks!
EDIT: The integers represent time, they will never exceed 5 characters combined, including the white space between.

Note, I take your post to be reading values from 3 different lines, e.g.:
%s
%s
%d %d
(primarily evidenced by your use of fgets, a line-oriented input function, which reads a line of input (up to and including the '\n') each time it is called.) If that is not the case, then the following does not apply (and can be greatly simplified)
Since you are reading multiple values into a single element in an array of struct, you may find it better (and more robust), to read each value and validate each value using temporary values before you start copying information into your structure members themselves. This allows you to (1) validate the read of all values, and (2) validate the parse, or conversion, of all required values before storing members in your struct and incrementing your array index.
Additionally, you will need to remove the trailing '\n' from both title and artist to prevent having embedded newlines dangling off the end of your strings (which will cause havoc with searching for either a title or artist). For instance, putting it all together, you could do something like:
void rmlf (char *s);
....
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char a[10] = "";
int min, sec;
...
while (fgets (title, MAX_TITLE, file) && /* validate read of values */
fgets (artist, MAX_ARTIST, file) &&
fgets (a, 10, file)) {
if (sscanf (a, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
rmlf (title); /* remove trailing newline */
rmlf (artist);
s[i].time.min = min; /* copy to struct members & increment index */
s[i].time.sec = sec;
strncpy (s[i].title, title, MAX_TITLE);
strncpy (s[i++].artist, artist, MAX_ARTIST);
}
/** remove trailing newline from 's'. */
void rmlf (char *s)
{
if (!s || !*s) return;
for (; *s && *s != '\n'; s++) {}
*s = 0;
}
(note: this will also read all values until an EOF is encountered without using feof (see Related link: Why is “while ( !feof (file) )” always wrong?))
Protecting Against a Short-Read with fgets
Following on from Jonathan's comment, when using fgets you should really check to ensure you have actually read the entire line, and not experienced a short read where the maximum character value you supply is not sufficient to read the entire line (e.g. a short read because characters in that line remain unread).
If a short read occurs, that will completely destroy your ability to read any further lines from the file, unless you handle the failure correctly. This is because the next attempt to read will NOT start reading on the line you think it is reading and instead attempt to read the remaining characters of the line where the short read occurred.
You can validate a read by fgets by validating the last character read into your buffer is in fact a '\n' character. (if the line is longer than the max you specify, the last character before the nul-terminating character will be an ordinary character instead.) If a short read is encountered, you must then read and discard the remaining characters in the long line before continuing with your next read. (unless you are using a dynamically allocated buffer where you can simply realloc as required to read the remainder of the line, and your data structure)
Your situation complicates the validation by requiring data from 3 lines from the input file for each struct element. You must always maintain your 3-line read in sync reading all 3 lines as a group during each iteration of your read loop (even if a short read occurs). That means you must validate that all 3 lines were read and that no short read occurred in order to handle any one short read without exiting your input loop. (you can validate each individually if you just want to terminate input on any one short read, but that leads to a very inflexible input routine.)
You can tweak the rmlf function above to a function that validates each read by fgets in addition to removing the trailing newline from the input. I have done that below in a function called, surprisingly, shortread. The tweaks to the original function and read loop could be coded something like this:
int shortread (char *s, FILE *fp);
...
for (idx = 0; idx < MAX_SONGS;) {
int t, a, b;
t = a = b = 0;
/* validate fgets read of complete line */
if (!fgets (title, MAX_TITLE, fp)) break;
t = shortread (title, fp);
if (!fgets (artist, MAX_ARTIST, fp)) break;
a = shortread (artist, fp);
if (!fgets (buf, MAX_MINSEC, fp)) break;
b = shortread (buf, fp);
if (t || a || b) continue; /* if any shortread, skip */
if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
s[idx].time.min = min; /* copy to struct members & increment index */
s[idx].time.sec = sec;
strncpy (s[idx].title, title, MAX_TITLE);
strncpy (s[idx].artist, artist, MAX_ARTIST);
idx++;
}
...
/** validate complete line read, remove trailing newline from 's'.
* returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
* if shortread, read/discard remainder of long line.
*/
int shortread (char *s, FILE *fp)
{
if (!s || !*s) return -1;
for (; *s && *s != '\n'; s++) {}
if (*s != '\n') {
int c;
while ((c = fgetc (fp)) != '\n' && c != EOF) {}
return 1;
}
*s = 0;
return 0;
}
(note: in the example above, the result of the shortread check is saved for each of the lines that make up a title, artist, time group, so all three lines are always consumed before the group is accepted or skipped.)
To validate the approach I put together a short example that will help put it all in context. Look over the example and let me know if you have any further questions.
#include <stdio.h>
#include <string.h>
/* constant definitions */
enum { MAX_MINSEC = 10, MAX_ARTIST = 32, MAX_TITLE = 48, MAX_SONGS = 64 };
typedef struct {
int min;
int sec;
} stime;
typedef struct {
char title[MAX_TITLE];
char artist[MAX_ARTIST];
stime time;
} songs;
int shortread (char *s, FILE *fp);
int main (int argc, char **argv) {
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char buf[MAX_MINSEC] = "";
int i, idx, min, sec;
songs s[MAX_SONGS] = {{ .title = "", .artist = "" }};
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
for (idx = 0; idx < MAX_SONGS;) {
int t, a, b;
t = a = b = 0;
/* validate fgets read of complete line */
if (!fgets (title, MAX_TITLE, fp)) break;
t = shortread (title, fp);
if (!fgets (artist, MAX_ARTIST, fp)) break;
a = shortread (artist, fp);
if (!fgets (buf, MAX_MINSEC, fp)) break;
b = shortread (buf, fp);
if (t || a || b) continue; /* if any shortread, skip */
if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
s[idx].time.min = min; /* copy to struct members & increment index */
s[idx].time.sec = sec;
strncpy (s[idx].title, title, MAX_TITLE);
strncpy (s[idx].artist, artist, MAX_ARTIST);
idx++;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (i = 0; i < idx; i++)
printf (" %2d:%2d %-32s %s\n", s[i].time.min, s[i].time.sec,
s[i].artist, s[i].title);
return 0;
}
/** validate complete line read, remove trailing newline from 's'.
* returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
* if shortread, read/discard remainder of long line.
*/
int shortread (char *s, FILE *fp)
{
if (!s || !*s) return -1;
for (; *s && *s != '\n'; s++) {}
if (*s != '\n') {
int c;
while ((c = fgetc (fp)) != '\n' && c != EOF) {}
return 1;
}
*s = 0;
return 0;
}
Example Input
$ cat ../dat/titleartist.txt
First Title I Like
First Artist I Like
3 40
Second Title That Is Way Way Too Long To Fit In MAX_TITLE Characters
Second Artist is Fine
12 43
Third Title is Fine
Third Artist is Way Way Too Long To Fit in MAX_ARTIST
3 23
Fourth Title is Good
Fourth Artist is Good
32274 558212 (too long for MAX_MINSEC)
Fifth Title is Good
Fifth Artist is Good
4 27
Example Use/Output
$ ./bin/titleartist <../dat/titleartist.txt
3:40 First Artist I Like First Title I Like
4:27 Fifth Artist is Good Fifth Title is Good

Instead of sscanf(), I would use strtok() and atoi().
Just curious, why only 10 bytes for the two integers? Are you sure they are always that small?
By the way, I apologize for such a short answer. I'm sure there is a way to get sscanf() to work for you, but in my experience sscanf() can be rather finicky so I'm not a big fan. When parsing input with C, I have just found it a lot more efficient (in terms of how long it takes to write and debug the code) to just tokenize the input with strtok() and convert each piece individually with the various ato? functions (atoi, atof, atol, strtod, etc.; see stdlib.h). It keeps things simpler, because each piece of input is handled individually, which makes debugging any problems (should they arise) much easier. In the end I typically spend a lot less time getting such code to work reliably than I did when I used to try to use sscanf().
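A rough sketch of the tokenize-then-convert approach for the "min sec" line could look like this. Note one deliberate swap: it uses strtol instead of atoi, because atoi cannot report a failed conversion. The name parse_min_sec is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a "min sec" line on whitespace and convert both tokens.
 * Modifies line (strtok writes '\0' into it). Returns 1 on success,
 * 0 if a token is missing, is not a clean number, or there are extras.
 */
int parse_min_sec(char *line, int *min, int *sec)
{
    int *dest[2];
    char *tok;
    int i;

    dest[0] = min;
    dest[1] = sec;
    tok = strtok(line, " \t\r\n");
    for (i = 0; i < 2; i++) {
        char *end;
        long v;

        if (tok == NULL) return 0;                  /* too few tokens */
        errno = 0;
        v = strtol(tok, &end, 10);
        if (end == tok || *end != '\0') return 0;   /* not a clean number */
        if (errno == ERANGE || v < 0 || v > INT_MAX) return 0;
        *dest[i] = (int)v;
        tok = strtok(NULL, " \t\r\n");
    }
    return tok == NULL;                             /* reject extra tokens */
}
```

Each token is checked on its own, so a failure can be traced to one specific field.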

Use "%*s %*s %d %d" as your format string, instead...
You seem to be expecting sscanf to automagically skip the two tokens leading up to the decimal digit fields. It doesn't do that unless you explicitly tell it to (hence the pair of %*s).
You can't expect the people who designed C to have designed it the same way as you would. You NEED to check the return value, as iharob said.
That's not all. You NEED to read (and understand relatively well) the entire scanf manual (the one written by OpenGroup is okay). That way you know how to use the function (including all of the subtle nuances of format strings) and what to do with the return value.
As a programmer, you need to read. Remember that well.
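To see what %*s does in isolation, here is a small sketch. It assumes that all four fields sit on one line and that the two strings are single words, which is not the three-line layout from the question; parse_after_two_words is a hypothetical name.

```c
#include <stdio.h>

/*
 * Skip two whitespace-delimited words, then read two integers.
 * Returns 1 only if both integers were converted.
 */
int parse_after_two_words(const char *line, int *min, int *sec)
{
    /* %*s matches a word but stores nothing and is not counted
       in the return value, so a full match returns 2, not 4 */
    return sscanf(line, "%*s %*s %d %d", min, sec) == 2;
}
```

Since %s stops at whitespace, this breaks as soon as a title or artist contains a space, which is one reason the line-by-line approach is the more robust choice for that data.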

strtol not changing errno

I'm working on a program that performs calculations given a char array that represents a time in the format HH:MM:SS. It has to parse the individual time units.
Here's a cut down version of my code, just focusing on the hours:
unsigned long parseTime(const char *time)
{
int base = 10; //base 10
long hours = 60; //defaults to something out of range
char localTime[BUFSIZ]; //declares a local array
strncpy(localTime, time, BUFSIZ); //copies parameter array to local
errno = 0; //sets errno to 0
char *par; //pointer
par = strchr(localTime, ':'); //parses to the nearest ':'
localTime[par - localTime] = '\0'; //sets the ':' to null character
hours = strtol(localTime, &par, base); //updates hours to parsed numbers in the char array
printf("errno is: %d\n", errno); //checks errno
errno = 0; //resets errno to 0
par++; //moves pointer past the null character
}
The problem is that if the input is invalid (e.g. aa:13:13), strtol() apparently doesn't detect an error because it's not updating errno to 1, so I can't do error handling. What am I getting wrong?

strtol is not required to produce an error code when no conversion can be performed. Instead you should use the second argument which stores the final position after conversion and compare it to the initial position.
BTW there are numerous other errors in your code that do not affect the problem you're seeing but which should also be fixed, such as incorrect use of strncpy.
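The end-pointer comparison described above could be wrapped up like this. It is a sketch only; parse_long and its parameters are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Convert the leading decimal number in s. On success store it in *out,
 * point *rest at the first unconverted character and return 1.
 * Return 0 if no digits were converted or the value overflows a long.
 */
int parse_long(const char *s, long *out, const char **rest)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s) return 0;         /* no conversion: errno may still be 0 */
    if (errno == ERANGE) return 0;  /* out of range for long */
    *out = v;
    *rest = end;
    return 1;
}
```

For "aa:13:13" the end pointer never moves, so the function returns 0 even though errno is untouched. For "12:13:14" it stores 12 and leaves *rest pointing at the first ':'.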

As others have explained, strtol may not update errno in case it cannot perform any conversion. The C Standard only documents that errno be set to ERANGE in case the converted value does not fit in a long integer.
Your code has other issues:
Copying the string with strncpy is incorrect: in case the source string is longer than BUFSIZ, localTime will not be null terminated. Avoid strncpy, a poorly understood function that almost never fits the purpose.
In this case, you do not need to clear the : to '\0'; strtol will stop at the first non-digit character. localTime[par - localTime] = '\0'; is a complicated way to write *par = '\0';
A much simpler version is this:
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
long parseTime(const char *time) {
char *par;
long hours;
if (!isdigit((unsigned char)*time)) {
/* invalid format */
return -1;
}
errno = 0;
hours = strtol(time, &par, 10);
if (errno != 0) {
/* overflow */
return -2;
}
/* you may want to check that hour is within a decent range... */
if (*par != ':') {
/* invalid format */
return -3;
}
par++;
/* now you can parse further fields... */
return hours;
}
I changed the return type to long so you can easily check for invalid format and even determine which error from a negative return value.
For an even simpler alternative, use sscanf:
long parseTime(const char *time) {
unsigned int hours, minutes, seconds;
char c;
if (sscanf(time, "%u:%u:%u%c", &hours, &minutes, &seconds, &c) != 3) {
/* invalid format */
return -1;
}
if (hours > 1000 || minutes > 59 || seconds > 59) {
/* invalid values */
return -2;
}
return hours * 3600L + minutes * 60 + seconds;
}
This approach still accepts incorrect strings such as 1: 1: 1 or 12:00000002:1. Parsing the string by hand seems the most concise and efficient solution.
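A hand-written parser along those lines might look like the sketch below. It makes two assumptions that are not in the question: each field has exactly two digits, and the string is a time of day, so hours are capped at 23. The names two_digits and parse_hms are hypothetical.

```c
#include <ctype.h>

/* Read exactly two decimal digits at *p. Returns their value and
   advances *p past them, or returns -1 and leaves *p unchanged. */
static int two_digits(const char **p)
{
    const unsigned char *s = (const unsigned char *)*p;

    if (!isdigit(s[0]) || !isdigit(s[1])) return -1;
    *p += 2;
    return (s[0] - '0') * 10 + (s[1] - '0');
}

/*
 * Strictly parse "HH:MM:SS" (two digits per field, nothing else).
 * Returns the time in seconds, or -1 on any format or range error.
 */
long parse_hms(const char *time)
{
    const char *p = time;
    int h, m, s;

    if ((h = two_digits(&p)) < 0 || *p++ != ':') return -1;
    if ((m = two_digits(&p)) < 0 || *p++ != ':') return -1;
    if ((s = two_digits(&p)) < 0 || *p != '\0') return -1;
    if (h > 23 || m > 59 || s > 59) return -1;
    return h * 3600L + m * 60 + s;
}
```

Unlike the sscanf version, this rejects both 1: 1: 1 and 12:00000002:1, because whitespace and extra digits are never skipped silently.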

A useful trick with sscanf() is that code can do multiple passes to detect errant input:
// HH:MM:SS
int parseTime(const char *hms, unsigned long *secs) {
int n = 0;
// Check for valid text
sscanf(hms, "%*1[0-2]%*1[0-9]:%*1[0-5]%*1[0-9]:%*1[0-5]%*1[0-9]%n", &n); // width 1: each scanset matches one digit
if (n == 0) return -1; // fail
// Scan and convert to integers
unsigned h,m,s;
sscanf(hms, "%u:%u:%u", &h, &m, &s);
// Range checks as needed
if (h >= 24 || m >= 60 || s >= 60) return -1;
*secs = (h*60 + m)*60L + s;
return 0;
}

After the hours = strtol(localTime, &par, base); statement, save the value of errno right away, because any later library call, including printf(), may change errno.
printf("errno is: %d\n", errno);
Once other library calls have run, errno may no longer describe strtol(). So save errno before calling any other library function, because many library functions modify errno.
The correct use is :
hours = strtol(localTime, &par, base);
int saved_error = errno; // Saving the error...
printf("errno is: %d\n", saved_error);
Now check the saved value. One more thing: to convert this errno value into a meaningful message, use the strerror() function (declared in <string.h>):
printf("Error is: %s\n", strerror(saved_error));

fgetc() can't read float

I have a big problem using fgetc() and I can't figure it out... I am trying to parse a text file. Everything compiles, but at execution I get either an infinite loop or a segfault (Code::Blocks). My text file looks like this: {"USD_EUR": "0.8631364", "EUR_USD": "1.3964719"}, with 16 exchange rates. I am trying to put all my floats in rate[16]...
void read(float change[4][4], char* myFile)
{
FILE* file = NULL;
file = fopen(myFile, "r+");
int value,i;
float rate[16];
char* str = "";
if (file != NULL)
{
do
{
value = fgetc(file);
printf("%c \n",value);
while(value > 48 && value < 57)
{
value = fgetc(file);
strcat(str, value);
//printf("%s \n", str);
}
rate[i] = atof(str);
i++;
str = "";
}while(value != EOF);// 125 = }
change[0][1] = rate[5];
change[0][2] = rate[0];
change[0][3] = rate[15];
change[1][0] = rate[6];
change[1][1] = rate[14];
change[1][2] = rate[7];
change[1][3] = rate[10];
change[2][0] = rate[8];
change[2][1] = rate[2];
change[2][2] = rate[12];
change[2][3] = rate[4];
change[3][0] = rate[3];
change[3][1] = rate[13];
change[3][2] = rate[11];
change[3][3] = rate[9];
fclose(file);
}
else
{
printf("Unable to read the file!\n");
}
}
I also try with EOF but i only have the char before numbers then that goes out of the loop ex: {"USD_EUR": "

I suggest that you simply use fscanf.
E.g
FILE *file;
int i = 0, status;
float value;
float rate[16];
file = fopen(myFile, "r");
if(file == NULL){
printf("Unable to read the file!\n");
return ;
}
while((status=fscanf(file, "%f", &value))!=EOF){
if(status==1){
rate[i++] = value;
if(i==16)//full
break;
} else {
fgetc(file);//one character drop
}
}
fclose(file);

Problem 1:
char* str = "";
Declares str as a pointer to a static string. This creates a literal "" in memory and points str to it, which isn't anything you can safely change. You want something like
char str[30] = "";
Problems 2 and 3:
strcat(str, value);
Attempts to append to str, which isn't safe or right. Also, as guest notes, you are trying to strcat(char *, int), which isn't the correct usage. strcat(char *, char *) is correct. Note - this doesn't mean that you should strcat(str, (char *) &value); - you need to understand how strings are implemented as char arrays in C, particularly with regard to zero termination.
Problem 4:
str = "";
See user3629249's comment above. Given a proper declaration,
str[0] = '\0';
Would be correct.
Problem 5:
Again, with credit to user3629249,
in 'change', the position change[0][0] is being skipped.
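To make the points about writable buffers and zero termination concrete, here is a minimal sketch of appending a single character to a char array. The helper name append_char is hypothetical; it is not a standard library function.

```c
#include <string.h>

/*
 * Append one character to the string in buf (capacity size, including
 * the terminating '\0'). Returns 1 on success, 0 if buf is full.
 */
int append_char(char *buf, size_t size, int c)
{
    size_t len = strlen(buf);

    if (len + 1 >= size) return 0;  /* no room for c plus '\0' */
    buf[len] = (char)c;
    buf[len + 1] = '\0';            /* keep the string terminated */
    return 1;
}
```

With a declaration such as char str[30] = "";, a call like append_char(str, sizeof str, value) takes the place of strcat(str, value), and str[0] = '\0'; empties the buffer for the next number.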

In addition to the solutions provided in the other answers, when faced with a messy line of input to read, it may be easier to use the line-oriented input functions provided by libc (e.g. fgets or getline). Reading the data one line at a time into a buffer, often (not always) allows greater flexibility in parsing the data with the other tools provided by libc (e.g. strtok, strsep, etc..)
With other data, character-oriented input is a better choice. In your case, the lines were interlaced with numerous '"', ':', ' ' and ',''s. This made it difficult to construct a fscanf format string to read both exchange rates in a single call or use any of the string parsing tools like strtok. So this was truly a tough call. I agree, BluePixyes' solution for parsing a single float in a fscanf call is a good solution. The line-oriented alternative is to read a line at a time, and then using strtof to convert the float values found in the line. The only advantage that strtof provides is error checking on the conversion that allows you to verify a good float conversion. This is one approach for a line-oriented solution:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>
int main () {
FILE* file = NULL; /* always initialize variables */
float rate[16] = {0.0}; /* market rates 1st & 2nd */
char myFile[50] = {0}; /* input filename */
char line[128] = {0}; /* input line buffer */
char *p = NULL; /* pointer to parse line */
char *ep = NULL; /* end pointer set by strtof */
size_t idx = 0; /* index for rate array values */
size_t it = 0; /* general index iterator */
/* prompt for filename */
printf ("\n Please enter filename to read rates from: ");
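if (scanf ("%49[^\n]%*c", myFile) != 1) { /* limit input to buffer size */
fprintf (stderr, "error: no filename given.\n");
return 1;
}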
/* open & validate file */
file = fopen (myFile, "r");
if (!file) {
fprintf(stderr, "error: Unable to read the file!\n");
return 1;
}
/* using line-oriented input to read line, then parse */
while (fgets (line, sizeof line, file) != NULL)
{
if (idx == 16) {
fprintf (stderr, "warning: array full.\n");
break;
}
p = line; /* parse line for floats */
while (*p && idx < 16) { /* find first digit or end, stay within array */
while (*p && (*p < '0' || *p > '9') ) p++;
if (!*p) break; /* validate not null */
errno = 0; /* reset before conversion */
rate[idx++] = strtof (p, &ep); /* get float, set end-ptr */
if (errno != 0 || p == ep) /* validate conversion */
fprintf (stderr, "discarding: rate[%zu] invalid read\n", --idx);
p = ep; /* set ptr to end-ptr */
}
}
fclose (file);
printf ("\n The exchange rates read from file:\n\n");
for (it = 0; it < idx; it++)
printf (" rate[%2zu] = %9.7f\n", it, rate[it]);
printf ("\n");
return 0;
}
sample input:
$ cat dat/rates.txt
"USD_EUR": "0.8631364", "EUR_USD": "1.3964719"
"USD_AGT": "0.9175622", "EUR_USD": "1.0975372"
"USD_BRZ": "0.8318743", "EUR_USD": "1.1713074"
"USD_COL": "0.9573478", "EUR_USD": "1.0537964"
"USD_GIA": "0.7904234", "EUR_USD": "1.5393454"
output:
$ ./bin/read_xchgrates
Please enter filename to read rates from: dat/rates.txt
The exchange rates read from file:
rate[ 0] = 0.8631364
rate[ 1] = 1.3964719
rate[ 2] = 0.9175622
rate[ 3] = 1.0975372
rate[ 4] = 0.8318743
rate[ 5] = 1.1713074
rate[ 6] = 0.9573478
rate[ 7] = 1.0537964
rate[ 8] = 0.7904234
rate[ 9] = 1.5393454
Note: check your strtof man page for any additional #define's your compiler may require.
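The error checking that strtof makes possible can be isolated in a small wrapper. The sketch below is one way to do it; the name parse_float and its return convention are assumptions made for this example and are not part of the program above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert the text at s to a float.
 * Returns 1 and stores the value in *out on success.
 * Returns 0 if no characters were converted or the value is out of
 * range. *end is set to the first unconverted character either way. */
int parse_float(const char *s, float *out, char **end)
{
    float v;

    errno = 0;              /* strtof sets errno but never clears it */
    v = strtof(s, end);
    if (*end == s)          /* nothing was converted */
        return 0;
    if (errno == ERANGE)    /* overflow or underflow */
        return 0;
    *out = v;
    return 1;
}
```

Resetting errno before the call matters because a stale value left over from an earlier library call would otherwise be mistaken for a conversion failure.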

the code has the following sequence:
fgetc results in 'U',
that is not a value inside the range 0...9 exclusive,
so drops through to try and convert str to rate[i]
(where 'i' has not been initialized to a known value)
Since no digits have been saved where str points,
some unknown offset from rate[] gets set to 0
(this is undefined behaviour)
then the unknown value 'i' gets incremented
and the following line: str = "" is executed
which has no effect on string
(unless each literal is at a different location in the .const section)
and the outer loop is repeated.
Eventually, a char in the range 1...8 is input
Then, in the inner loop, that first digit is SKIPPED
and another char is read.
from your example that next char is a '.'
Which could cause the inner loop to be exited
However,
the line: strcat(str, value);
should cause a seg fault event
due to trying to write to the .const section of the executable
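For comparison, a character-oriented reader that avoids the problems listed above (the skipped first digit, and appending to a string literal) might look like the sketch below. The original program is not reproduced here, so this is not a corrected version of it; the function name read_next_float and the 32-byte buffer are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read characters from fp until one number has
 * been collected. Returns 1 and stores the value in *out, or returns 0
 * if end of file is reached before any digit is seen. */
int read_next_float(FILE *fp, float *out)
{
    char buf[32];       /* writable array, not a string literal */
    size_t len = 0;
    int ch;             /* int, so EOF can be told apart from a char */

    /* skip everything up to the first digit */
    while ((ch = fgetc(fp)) != EOF && (ch < '0' || ch > '9'))
        ;
    if (ch == EOF)
        return 0;

    /* keep that first digit, then collect further digits and '.' */
    do {
        if (len + 1 < sizeof buf)
            buf[len++] = (char)ch;
        ch = fgetc(fp);
    } while (ch != EOF && ((ch >= '0' && ch <= '9') || ch == '.'));

    buf[len] = '\0';    /* terminate before converting */
    *out = strtof(buf, NULL);
    return 1;
}
```

The character that ends the number (a '"' in the sample data) is consumed and discarded, which is harmless for that input.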

How to do conditional parsing with fscanf?

I have some lines I want to parse from a text file. Some lines start with x and continue with several y:z pairs, and others are composed entirely of y:z pairs, where x, y and z are numbers. I tried the following code, but it does not work: the first fscanf also reads in the y of a y:z pair.
...
if (fscanf(stream,"%d ",&x))
else if (fscanf(stream,"%d:%g",&y,&z))
...
Is there a way to tell scanf to only read a character if it is followed by a space?

The *scanf family of functions does not allow you to do that natively. Of course, you can work around the problem by reading in the minimum number of elements that you know will be present per input line, validating the return value of *scanf, and then proceeding incrementally, one item at a time, each time checking the return value for success or failure.
if (1 == fscanf(stream, "%d", &x) && (x == desired_value)) {
/* we need to read in some more : separated numbers */
while (2 == fscanf(stream, "%d:%d", &y, &z)) { /* loop till we fail */
printf("read: %d %d\n", y, z);
} /* note we do not handle the case where only one of y and z is present */
}
Your best bet to handle this is to read in a line using fgets and then parse the line yourself using sscanf.
if (NULL != fgets(line, MAX_BUF_LEN, stream)) { /* read line */
size_t nitems = tokenize(line, tokens); /* parse */
}
}
...
size_t tokenize(const char *buf, char **tokens) {
size_t idx = 0;
while (buf[ idx ] != '\0') {
/* parse an int */
...
}
}
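The elided parsing step could be filled in with sscanf and its %n conversion, which reports how many characters were consumed so far. The sketch below is one possible approach under assumed conventions: struct pair, parse_line and the choice of float for z (suggested by the %g in the question) are all hypothetical.

```c
#include <stdio.h>

struct pair { int y; float z; };    /* hypothetical record for one y:z item */

/* Hypothetical helper: parse "[x] y:z y:z ..." from one line.
 * Sets *has_x (and *x) when the line starts with a lone number.
 * Returns the number of pairs stored, or -1 if unparsed text remains
 * (malformed input, or more than max_pairs pairs). */
int parse_line(const char *line, int *has_x, int *x,
               struct pair *pairs, int max_pairs)
{
    int n = 0, used = 0, first;
    char rest;

    *has_x = 0;
    /* a leading number directly followed by ':' belongs to a pair */
    if (sscanf(line, " %d%n", &first, &used) == 1 && line[used] != ':') {
        *has_x = 1;
        *x = first;
        line += used;               /* continue after the x field */
    }
    while (n < max_pairs &&
           sscanf(line, " %d:%g%n", &pairs[n].y, &pairs[n].z, &used) == 2) {
        line += used;               /* advance past the pair just read */
        n++;
    }
    if (sscanf(line, " %c", &rest) == 1)    /* anything left is an error */
        return -1;
    return n;
}
```

Because the whole line is already in memory, looking one character past the first number costs nothing, which is exactly what a single fscanf call on the stream cannot do.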

char line[MAXLEN];
while( fgets(line,MAXLEN,stream) )
{
char *endptr;
strtol(line,&endptr,10);
if( *endptr==':' )
printf("only y:z <%s>",line);
else
printf("beginning x <%s>",line);
}

I found a crude way to do what I wanted without having to switch to fgets (which would probably be safer in the long run).
if (fscanf(stream,"%d ",&x)){...}
else if (fscanf(stream,"%d:%g",&y,&z)){...}
else if (fscanf(stream,":%g",&z)){
y=x;
x=0;
}
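A variant of the same idea that stays with fscanf is to read a number and then peek at the next character to decide what it was. This is only a sketch: the names read_item and enum item are made up for the example, and it compares the fscanf result against 1 because EOF (a negative value) would otherwise count as true.

```c
#include <stdio.h>

enum item { ITEM_NONE, ITEM_X, ITEM_PAIR };   /* hypothetical result codes */

/* Hypothetical helper: read the next item from stream.
 * A number directly followed by ':' is the y of a y:z pair;
 * any other number is treated as a lone x. */
enum item read_item(FILE *stream, int *num, float *z)
{
    int ch;

    if (fscanf(stream, "%d", num) != 1)     /* also rejects EOF */
        return ITEM_NONE;
    ch = fgetc(stream);                     /* look at the next character */
    if (ch == ':')
        return fscanf(stream, "%g", z) == 1 ? ITEM_PAIR : ITEM_NONE;
    if (ch != EOF)
        ungetc(ch, stream);                 /* not a ':', put it back */
    return ITEM_X;
}
```

Calling read_item in a loop until it returns ITEM_NONE would walk through a line such as "7 1:2.5 3:4", although it cannot tell where one line ends and the next begins.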
