strtol not changing errno - c

I'm working on a program that performs calculations given a char array that represents a time in the format HH:MM:SS. It has to parse the individual time units.
Here's a cut down version of my code, just focusing on the hours:
unsigned long parseTime(const char *time)
{
int base = 10; //base 10
long hours = 60; //defaults to something out of range
char localTime[BUFSIZ]; //declares a local array
strncpy(localTime, time, BUFSIZ); //copies parameter array to local
errno = 0; //sets errno to 0
char *par; //pointer
par = strchr(localTime, ':'); //parses to the nearest ':'
localTime[par - localTime] = '\0'; //sets the ':' to null character
hours = strtol(localTime, &par, base); //updates hours to parsed numbers in the char array
printf("errno is: %d\n", errno); //checks errno
errno = 0; //resets errno to 0
par++; //moves pointer past the null character
}
The problem is that if the input is invalid (e.g. aa:13:13), strtol() apparently doesn't detect an error because it's not updating errno to 1, so I can't do error handling. What am I getting wrong?

strtol is not required to produce an error code when no conversion can be performed. Instead you should use the second argument which stores the final position after conversion and compare it to the initial position.
BTW there are numerous other errors in your code that do not affect the problem you're seeing but which should also be fixed, such as incorrect use of strncpy.
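The end-pointer comparison suggested in this answer can be sketched as follows. This is only a minimal illustration: the helper name parse_long and its return codes are made up here and are not part of the asker's code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 on success, -1 if no digits were
   converted, -2 if the value does not fit in a long. */
int parse_long(const char *s, long *out, char **end)
{
    char *p;

    errno = 0;                  /* strtol only ever sets errno, never clears it */
    long v = strtol(s, &p, 10);
    if (p == s)
        return -1;              /* nothing converted, e.g. "aa:13:13" */
    if (errno == ERANGE)
        return -2;              /* result was clamped to LONG_MIN or LONG_MAX */
    *out = v;
    if (end)
        *end = p;               /* first character after the number */
    return 0;
}
```

For the input aa:13:13 the first check fires, because strtol leaves the end pointer at the start of the string.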

As others have explained, strtol may leave errno unchanged when it cannot perform any conversion. The C Standard only specifies that errno is set to ERANGE when the converted value does not fit in a long.
Your code has other issues:
Copying the string with strncpy is incorrect: if the source string is BUFSIZ characters or longer, localTime will not be null terminated. Avoid strncpy, a poorly understood function that almost never fits the purpose.
In this case, you do not need to overwrite the : with '\0'; strtol will stop at the first non-digit character. Also, localTime[par - localTime] = '\0'; is a complicated way to write *par = '\0';
A much simpler version is this:
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
long parseTime(const char *time) {
char *par;
long hours;
if (!isdigit((unsigned char)*time)) {
/* invalid format */
return -1;
}
errno = 0;
hours = strtol(time, &par, 10);
if (errno != 0) {
/* overflow */
return -2;
}
/* you may want to check that hour is within a decent range... */
if (*par != ':') {
/* invalid format */
return -3;
}
par++;
/* now you can parse further fields... */
return hours;
}
I changed the return type to long so you can easily check for invalid format and even determine which error from a negative return value.
For an even simpler alternative, use sscanf:
long parseTime(const char *time) {
unsigned int hours, minutes, seconds;
char c;
if (sscanf(time, "%u:%u:%u%c", &hours, &minutes, &seconds, &c) != 3) {
/* invalid format */
return -1;
}
if (hours > 1000 || minutes > 59 || seconds > 59) {
/* invalid values */
return -2;
}
return hours * 3600L + minutes * 60 + seconds;
}
This approach still accepts incorrect strings such as 1: 1: 1 or 12:00000002:1. Parsing the string by hand seems the most concise and efficient solution.
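A hand-written parser along those lines might look like the sketch below. It assumes a strict two-digit HH:MM:SS layout with hours limited to 00..23, which is a narrower range than the sscanf version above accepts; the names two_digits and parse_hms are invented for this example.

```c
#include <ctype.h>

/* Hypothetical helper: reads exactly two decimal digits at *p, advances
   *p past them, and returns their value, or -1 if they are not digits. */
static int two_digits(const char **p)
{
    const unsigned char *s = (const unsigned char *)*p;

    if (!isdigit(s[0]) || !isdigit(s[1]))
        return -1;
    *p += 2;
    return (s[0] - '0') * 10 + (s[1] - '0');
}

/* Returns the number of seconds, or -1 if the text is not exactly
   HH:MM:SS with HH in 00..23 (an assumption) and MM, SS in 00..59. */
long parse_hms(const char *time)
{
    int h, m, s;

    if ((h = two_digits(&time)) < 0 || *time++ != ':')
        return -1;
    if ((m = two_digits(&time)) < 0 || *time++ != ':')
        return -1;
    if ((s = two_digits(&time)) < 0 || *time != '\0')
        return -1;
    if (h > 23 || m > 59 || s > 59)
        return -1;
    return h * 3600L + m * 60 + s;
}
```

Because every character is checked explicitly, inputs such as 1: 1: 1 and 12:00000002:1 are rejected.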

A useful trick with sscanf() is that code can do multiple passes to detect errant input:
// HH:MM:SS
int parseTime(const char *hms, unsigned long *secs) {
int n = 0;
// Check for valid text
// A width of 1 makes each scanset match exactly one character
sscanf(hms, "%*1[0-2]%*1[0-9]:%*1[0-5]%*1[0-9]:%*1[0-5]%*1[0-9]%n", &n);
if (n == 0) return -1; // fail
// Scan and convert to integers
unsigned h,m,s;
sscanf(hms, "%u:%u:%u", &h, &m, &s);
// Range checks as needed
if (h >= 24 || m >= 60 || s >= 60) return -1;
*secs = (h*60 + m)*60L + s;
return 0;
}

After the statement hours = strtol(localTime, &par, base); save the value of errno before doing anything else, because later library calls such as printf() may set errno themselves.
printf("errno is: %d\n", errno);
In this particular line the argument errno is read before printf() runs, so the value printed is still the one left by strtol(). The risk appears as soon as another library call sits between strtol() and the point where you inspect errno, because most library functions are allowed to modify it. Saving errno right after the call avoids the problem.
The correct use is:
hours = strtol(localTime, &par, base);
int saved_error = errno; // Saving the error...
printf("errno is: %d\n", saved_error);
Now check it again. Also, to convert this errno value into a meaningful message, use the strerror() function (declared in <string.h>):
printf("Error is: %s\n", strerror(saved_error));

Related

list conversion in C

I am trying to put command-line arguments supplied by the user into an array, but I am unsure how to approach it.
For example say I ran my program like this.
./program 1,2,3,4,5
How would I store 1 2 3 4 5 without the commas, and allow it to be passed to other functions to be used. I'm sure this has to do with using argv.
PS: NO space-separated, I want the numbers to parse into integers, I have an array of 200, and I want these numbers to be stored in the array as, arr[0] = 1, arr[1] = 2....
store 1 2 3 4 5 without the commas, and allow it to be passed to other functions to be used.
PS: NO space-separated, I want the numbers to parse into integers
Space or comma-separated doesn't matter. Arguments always come in as strings. You will have to do the work to turn them into integers using atoi (Ascii-TO-Integer).
Using spaces between arguments is the normal convention: ./program 1 2 3 4 5. They come in already separated in argv.
Loop through argv (skipping argv[0], the program name) and run them through atoi.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
for(int i = 1; i < argc; i++) {
int num = atoi(argv[i]);
printf("%d: %d\n", i, num);
}
}
Using commas is going to make that harder. You first have to split the string using the kind of weird strtok (STRing TOKenizer). Then again call atoi on the resulting values.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
char *token = strtok(argv[1], ",");
while(token) {
int num = atoi(token);
printf("%d\n", num);
token = strtok(NULL, ",");
}
}
This approach is also more fragile than taking them as individual arguments. If the user types ./program 1, 2, 3, 4, 5 only 1 will be read.
One of the main disadvantages to using atoi() is it provides no check on the string it is processing and will happily accept atoi ("my-cow"); and silently fail returning 0 without any indication of a problem. While a bit more involved, using strtol() allows you to determine what failed, and then recover. This can be as simple or as in-depth a recovery as your design calls for.
As mentioned in the comment, strtol() was designed to work through a string, converting sets of digits found in the string to a numeric value. On each call it will update the endptr parameter to point to the next character in the string after the last digit converted (to each ',' in your case -- or the nul-terminating character at the end). man 3 strtol provides the details.
Since strtol() updates endptr to the character after the last digit converted, you check if nptr == endptr to catch the error when no digits were converted. You check errno for a numeric conversion error such as overflow. Lastly, since the return type is long you need to check if the value returned is within the range of an int before assigning to your int array.
Putting it all together with a very minimal bit of error handling, you could do something like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
#define NELEM 200 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
int arr[NELEM] = {0}, ndx = 0; /* array and index */
char *nptr = argv[1], *endptr = nptr; /* nptr and endptr */
if (argc < 2) { /* if no argument, handle error */
fputs ("error: no argument provided.\n", stderr);
return 1;
}
else if (argc > 2) { /* warn on more than 2 arguments */
fputs ("warning: more than one argument provided.\n", stdout);
}
while (ndx < NELEM) { /* loop until all ints processed or arr full */
int error = 0; /* flag indicating error occurred */
long tmp = 0; /* temp var to hold strtol return */
char *onerr = NULL; /* pointer to next comma after error */
errno = 0; /* reset errno */
tmp = strtol (nptr, &endptr, 0); /* attempt conversion to long */
if (nptr == endptr) { /* no digits converted */
fputs ("error: no digits converted.\n", stderr);
error = 1;
onerr = strchr (endptr, ',');
}
else if (errno) { /* overflow in conversion */
perror ("strtol conversion error");
error = 1;
onerr = strchr (endptr, ',');
}
else if (tmp < INT_MIN || INT_MAX < tmp) { /* check in range of int */
fputs ("error: value outside range of int.\n", stderr);
error = 1;
onerr = strchr (endptr, ',');
}
if (!error) { /* error flag not set */
arr[ndx++] = tmp; /* assign integer to arr, advance index */
}
else if (onerr) { /* found next ',' update endptr to next ',' */
endptr = onerr;
}
else { /* no next ',' after error, break */
break;
}
/* if at end of string - done, break loop */
if (!*endptr) {
break;
}
nptr = endptr + 1; /* update nptr to 1-past ',' */
}
for (int i = 0; i < ndx; i++) { /* output array content */
printf (" %d", arr[i]);
}
putchar ('\n'); /* tidy up with newline */
}
Example Use/Output
This will handle your normal case, e.g.
$ ./bin/argv1csvints 1,2,3,4,5
1 2 3 4 5
It will warn on bad arguments in list while saving all good arguments in your array:
$ ./bin/argv1csvints 1,my-cow,3,my-cat,5
error: no digits converted.
error: no digits converted.
1 3 5
As well as handling completely bad input:
$ ./bin/argv1csvints my-cow
error: no digits converted.
Or no argument at all:
$ ./bin/argv1csvints
error: no argument provided.
Or more than the expected 1 argument:
$ ./bin/argv1csvints 1,2,3,4,5 6,7,8
warning: more than one argument provided.
1 2 3 4 5
The point is that, with a little extra code, you can make your argument-parsing routine as robust as need be. While your use of a single argument with comma-separated values is unusual, it is doable. You can manually tokenize (split) the string on the commas with strtok() (or strchr(), or a combination of strspn() and strcspn()); loop with sscanf() using something similar to the "%d%n" format string to get a minimal succeed/fail indication along with the offset of the next number; or use strtol() and take advantage of its error reporting. It's up to you.
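As a rough illustration of the sscanf() option with "%d%n", here is a minimal sketch. The function name parse_csv_ints and its return convention are assumptions made for this example. Unlike the strtol() version above, it cannot detect overflow, since an out-of-range value gives sscanf() undefined behavior.

```c
#include <stdio.h>

/* Hypothetical helper: parses "1,2,3" into arr (at most max values).
   Returns the number of values stored, or -1 on malformed input. */
int parse_csv_ints(const char *s, int *arr, int max)
{
    int count = 0;

    while (count < max) {
        int value, used = 0;

        if (sscanf(s, "%d%n", &value, &used) != 1)
            return -1;          /* no number at this position */
        arr[count++] = value;
        s += used;              /* %n gave the offset just past the number */
        if (*s == '\0')
            return count;       /* clean end of the list */
        if (*s != ',')
            return -1;          /* unexpected character after a number */
        s++;                    /* step over the comma */
    }
    return -1;                  /* more values than arr can hold */
}
```

This version stops at the first bad field instead of skipping it, which is the "minimal succeed/fail indication" end of the spectrum.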
Look things over and let me know if you have questions.
This is how I'd deal with your requirement using strtol(). This does not damage the input string, unlike solutions using strtok(). It also handles overflows and underflows correctly, unlike solutions using atoi() or its relatives. The code assumes you want to store an array of type long; if you want to use int, you can add testing to see if the value converted is larger than INT_MAX or less than INT_MIN and report an appropriate error if it is not a valid int value.
Note that handling errors from strtol() is a tricky business, not least because every return value (from LONG_MIN up to LONG_MAX) is also a valid result. See also Correct usage of strtol(). This code does not allow spaces before a comma; it permits them after a comma (so you could run ./csa43 '1, 2, -3, 4, 5' and it would work). It allows leading spaces, but not trailing spaces. These issues could be fixed with more work, probably mostly in the read_val() function. It may be that the validation work in the main loop should be delegated to read_val(), which would give a better separation of duties. OTOH, what's here works within limits. It would be feasible to allow trailing spaces, or spaces before commas, if that's what you choose. It would be equally feasible to prohibit leading spaces and spaces after commas, if that's what you choose.
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
static int read_val(const char *str, char **eov, long *value)
{
errno = 0;
char *eon;
if (*str == '\0')
return -1;
long val = strtol(str, &eon, 0);
if (eon == str || (*eon != '\0' && *eon != ',') ||
((val == LONG_MIN || val == LONG_MAX) && errno == ERANGE))
{
fprintf(stderr, "Could not convert '%s' to an integer "
"(the leftover string is '%s')\n", str, eon);
return -1;
}
*value = val;
*eov = eon;
return 0;
}
int main(int argc, char **argv)
{
if (argc != 2)
{
fprintf(stderr, "Usage: %s n1,n2,n3,...\n", argv[0]);
exit(EXIT_FAILURE);
}
enum { NUM_ARRAY = 200 };
long array[NUM_ARRAY];
size_t nvals = 0;
char *str = argv[1];
char *eon;
long val;
while (read_val(str, &eon, &val) == 0 && nvals < NUM_ARRAY)
{
array[nvals++] = val;
str = eon;
if (str[0] == ',' && str[1] == '\0')
{
fprintf(stderr, "%s: trailing comma in number string\n", argv[1]);
exit(EXIT_FAILURE);
}
else if (str[0] == ',')
str++;
}
for (size_t i = 0; i < nvals; i++)
printf("[%zu] = %ld\n", i, array[i]);
return 0;
}
Output (program csa43 compiled from csa43.c):
$ csa43 1,2,3,4,5
[0] = 1
[1] = 2
[2] = 3
[3] = 4
[4] = 5
$

How I can handle integer overflow?

I am trying to handle integer overflow. My code is :
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<errno.h>
#include<limits.h>
int isInt (char *s)
{
char *ep = NULL;
long i = strtol (s, &ep, 10);
if ((*ep == 0) || (!strcmp(ep,"\n")))
return 1; // it's an int
return 0;
}
int main()
{
char *buffer = NULL;
size_t count = 0;
ssize_t ret;
//AMINO *a_acid;
int num;
for(;;)
{
printf("Please enter an integer:");
if((ret = getline(&buffer, &count, stdin)) < 0)
{
perror("getline: error\n");
free(buffer);
exit(EXIT_FAILURE);
}
if(!isInt(buffer))
{
perror("you are not entering int , Try again:");
continue;
}
sscanf(buffer, "%d",&num);
printf("%d\n", num);
if ((num > INT_MAX)|| (num < 0))
{
perror("you overflowed int variable , Try again:\n ");
continue;
}
break;
}
}
Then I checked how this code responds, and I saw something weird. When I enter a very big number, the overflow is sometimes detected and sometimes not.
Here is my terminal view:
> nazmul#nazmul-Lenovo-G50-80:~/2nd_sem/biophysics$ gcc torson.c
> nazmul#nazmul-Lenovo-G50-80:~/2nd_sem/biophysics$ ./a.out
> Please enter an integer:ksdjfjklh
> you are not entering int , Try again:: Success
> Please enter an integer:338479759475637465765
> -1
> you overflowed int variable , Try again: : Numerical result out of
> range
> Please enter an integer:58678946895785
> 1103697833
> nazmul#nazmul-Lenovo-G50-80:~/2nd_sem/biophysics$
Why is it working for the number 338479759475637465765 but not for 58678946895785? The logic I used in my program is that when the value is out of bounds, the int variable gets -1 or some other negative value. I have read many articles, but it is still not quite clear.
strtol converts the value to a long int, whose range may differ from that of int. Furthermore, it returns LONG_MAX or LONG_MIN if the value could be converted but is outside the range of long int. In that case, errno will be set to ERANGE (but not otherwise!). Also, in the case of a matching failure the value returned is 0 and errno is not set, but ep points to the beginning of the string.
int isInt (char *s)
{
char *ep = NULL;
// zero errno first!
errno = 0;
long i = strtol (s, &ep, 10);
if (errno) {
return 0;
}
// matching failure.
if (ep == s) {
return 0;
}
// garbage follows
if (! ((*ep == 0) || (!strcmp(ep,"\n")))) {
return 0;
}
// it is outside the range of `int`
if (i < INT_MIN || i > INT_MAX) {
return 0;
}
return 1;
}
What dbush says about the use of perror is correct, though. strtol sets an error only in case of long overflow, which is not the only possible failing case in your function, so perror could print anything like Is a directory or Multihop attempted.
sscanf(buffer, any_format_without_width, &anytype); is not sufficient to detect overflow.
if the result of the conversion cannot be represented in the object, the behavior is undefined. C11dr §7.21.6.2 10
Do not use *scanf() family to detect overflow. It may work in select cases, but not in general.
Instead, use the strto*() functions. Yet even OP's isInt() is mis-coded, as it incorrectly assesses isInt("\n"), isInt(""), and isInt("999..various large values ...999") as good ints.
Alternative:
bool isint_alt(const char *s) {
char *endptr;
errno = 0;
long y = strtol(s, &endptr, 10);
if (s == endptr) {
return false; // No conversion
}
if (errno == ERANGE) {
return false; // Outside long range
}
if (y < INT_MIN || y > INT_MAX) {
return false; // Outside int range
}
// Ignore trailing white space
while (isspace((unsigned char)*endptr)) {
endptr++;
}
if (*endptr) {
return false; // Trailing junk
}
return true;
}
You're getting your types mixed up.
In the isInt function you use strtol, which returns a long, to check the value. Then in your main function you use sscanf with %d, which reads into an int.
On your system, it seems that a long is 64 bits while an int is 32 bits. So strtol fails to fully convert 338479759475637465765 because it is larger than a 64 bit variable can hold. Then you try to convert 58678946895785 which will fit in a 64 bit variable but not a 32 bit variable.
You should instead have sscanf read into a long. Then you can compare the value against INT_MAX:
long num;
...
sscanf(buffer, "%ld", &num);
printf("%ld\n", num);
if ((num > INT_MAX)|| (num < INT_MIN))
{
printf("you overflowed int variable , Try again:\n ");
continue;
}
Also note that it doesn't make sense to call perror here. You only use it right after calling a function which sets errno.
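The value 1103697833 printed in the question is consistent with the 64-bit versus 32-bit explanation in this answer: it is 58678946895785 reduced modulo 2^32. The sketch below reproduces that arithmetic with unsigned types, where wraparound is well defined. It only illustrates what the asker's system apparently did; the C standard leaves the result of sscanf() with %d on an out-of-range value undefined. The function name low_32_bits is made up for this example.

```c
#include <stdint.h>

/* Keeps only the low 32 bits of a 64-bit value. Unsigned arithmetic is
   used because its modular behavior is well defined in C. */
uint32_t low_32_bits(uint64_t value)
{
    return (uint32_t)(value & 0xFFFFFFFFu);
}
```

Calling low_32_bits(58678946895785) gives 1103697833, the number shown in the terminal session.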
If one must use sscanf() to detect int overflow rather than the robust strtol(), there is a cumbersome way.
Use a wider type and a width limit to prevent overflow when scanning.
bool isint_via_sscanf(const char *s) {
long long y;
int n = 0;
if (sscanf(s, "%18lld %n", &y, &n) != 1) { // Overflow not possible
return false; // Conversion failed
}
if (y < INT_MIN || y > INT_MAX) {
return false; // Outside int range
}
if (s[n]) {
return false; // Trailing junk
}
return true;
}
It is insufficient on rare platforms where INT_MAX > 1e18.
It also incorrectly rejects valid input that has many leading zeros, such as a long run of zeros followed by 123, because the zeros count against the width limit.
More complex code using sscanf() can address these short-comings, yet the best approach is strto*().

How to use `strtoul` to parse string where zero may be valid?

According to the documentation for strtoul, regarding its return value...
This function returns the converted integral number as a long int value. If no valid conversion could be performed, a zero value is returned.
What if I'm parsing a user-supplied string of "0" where, for my application, "0" may be a valid entry? In that case it seems that I have no way to determine from using strtoul if a valid conversion was performed. Is there another way to handle this?
Read further the man page:
Since strtoul() can legitimately return 0 or ULONG_MAX (ULLONG_MAX for strtoull()) on both success and failure, the calling program should set errno to 0 before the call, and then determine if an error occurred by checking whether errno has a nonzero value after the call.
Also handle another scenario, where no digits were read from the input. If this happens, strtoul() sets *endptr to the value of nptr. So you should also check whether the two pointers compare equal.
How to use strtoul to parse string where zero may be valid?
Any value returned from strtoul() may be from an expected string input or from other not so expected strings. Further tests are useful.
The following strings all return 0 from strtoul()
OK "0", "-0", "+0"
Not OK "", "abc"
Usually considered OK: " 0"
OK or not OK depending on goals: "0xyz", "0 ", "0.0"
strtoul() offers several ways to detect these cases.
int base = 10;
char *endptr; // Store the location where conversion stopped
errno = 0;
unsigned long y = strtoul(s, &endptr, base);
if (s == endptr) puts("No conversion"); // "", "abc"
else if (errno == ERANGE) puts("Overflow");
else if (*endptr) puts("Extra text after the number"); // "0xyz", "0 ", "0.0"
else puts("Mostly successful");
What is not yet detected.
Negative input. strtoul() effectively wraps around such that strtoul("-1", 0, 10) == ULONG_MAX. This issue is often missed in a cursory documentation review.
Leading white space allowed. This may or may not be desired.
To also detect negative values:
// find sign
while (isspace((unsigned char) *s)) {
s++;
}
char sign = *s;
int base = 10;
char *endptr; // Store the location where conversion stopped
errno = 0;
unsigned long y = strtoul(s, &endptr, base);
if (s == endptr) puts("No conversion");
else if (errno == ERANGE) puts("Overflow");
else if (*endptr) puts("Extra text after the number");
else if (sign == '-' && y != 0) puts("Negative value");
else puts("Successful");
One solution would be to pass the address of a char pointer and check if it is pointing to the beginning of the string:
char *str = "0";
char *endptr;
unsigned long x = strtoul(str, &endptr, 10);
if(endptr == str)
{
//Nothing was read
}
Consider the following function:
#include <stdlib.h>
#include <errno.h>
/* SPDX-Identifier: CC0-1.0 */
const char *parse_ulong(const char *src, unsigned long *to)
{
const char *end;
unsigned long val;
if (!src) {
errno = EINVAL;
return NULL;
}
end = src;
errno = 0;
val = strtoul(src, (char **)(&end), 0);
if (errno)
return NULL;
if (end == src) {
errno = EINVAL;
return NULL;
}
if (to)
*to = val;
return end;
}
This function parses the unsigned long in the string src, returning a pointer to the first unparsed character in src, with the unsigned long saved to *to. If there is an error, the function will return NULL with errno set to indicate the error.
If you compare the function to man 3 strtoul, you'll see it handles all error cases correctly, and only returns non-NULL when src yields a valid unsigned long. Especially see the Notes section. Also pay attention to how negative numbers are handled.
This same pattern works for strtol(), strtod(), strtoull().
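To illustrate that last remark, here is how the pattern might carry over to strtod(). This is a sketch rather than a drop-in replacement; the name parse_double is hypothetical and simply mirrors parse_ulong() above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical counterpart of parse_ulong() for double values. Returns a
   pointer to the first unparsed character, or NULL with errno set. */
const char *parse_double(const char *src, double *to)
{
    char *end;
    double val;

    if (!src) {
        errno = EINVAL;
        return NULL;
    }
    errno = 0;
    val = strtod(src, &end);
    if (errno)
        return NULL;            /* ERANGE: result out of range */
    if (end == src) {
        errno = EINVAL;         /* no conversion was performed */
        return NULL;
    }
    if (to)
        *to = val;
    return end;
}
```

Using a plain char *end here avoids the cast that parse_ulong() needs, at the cost of an implicit conversion to const char * on return.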

Comparing an input string with a string that has a integer variable in C?

I'm trying to compare an input of characters with a string that can be of the format "!x" where x is any integer.
What's the easiest way to do this? I tried
int result = strcmp(input,"!%d");
which did not work.
Here's one way to do it:
int is_bang_num(const char *s) {
if (*s != '!') {
return 0;
}
size_t n = strspn(s + 1, "0123456789");
return n > 0 && s[1 + n] == '\0';
}
This verifies that the first character is !, that it is followed by more characters, and that all of those following characters are digits.
The scanf() family of functions returns a value indicating how many items were converted.
Even books often ignore this value, which leads programmers to forget that it exists. One consequence is undefined behavior: when scanf() fails, it does not store a value, so if the variable was not initialized beforehand, reading it afterwards is undefined.
You can use the value returned by sscanf() to check for success, like this:
#include <stdio.h>
int
main(void)
{
const char *string;
int value;
int result;
string = "!12345";
result = sscanf(string, "!%d", &value);
if (result == 1)
fprintf(stderr, "the value was: %d\n", value);
else
fprintf(stderr, "the string did not match the pattern\n");
return 0;
}
As you can see, if one item was successfully scanned, the string matched the pattern; otherwise it didn't.
With this approach you also extract the integer value, but be careful: the scanf() functions are not a substitute for regular expressions, so this only works in very simple situations.
Since the string must begin with a ! followed by an integer, use a qualified strtol(), which allows a leading sign character. As OP did not specify the range of the integer, let us allow any range.
int is_sc_num(const char *str) {
if (*str != '!') return 0;
str++;
// Ensure no spaces, which strtol() would otherwise skip.
if (isspace((unsigned char) *str)) return 0;
char *endptr;
// errno = 0;
// By using base 0, input like "0xABC" allowed
strtol(str, &endptr, 0);
// no check for errno as code allows any range
// if (errno == ERANGE) return 0
if (str == endptr) return 0; // no digits
if (*endptr) return 0; // Extra character at the end
return 1;
}
If you want to test that a string consists of an exclamation point followed by a series of digits, the regex "!\d+" will match that. It also accepts a number whose first digit is zero; if that should be rejected, use "![1-9]\d*" instead.
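Standard C has no regular-expression facility, so the regex idea needs a library. On POSIX systems, <regex.h> is one option. The sketch below assumes a POSIX platform and assumes that a leading zero should be rejected, as this answer suggests; the function name matches_bang_number is invented for the example.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if s is '!' followed by a decimal number without a leading
   zero, 0 otherwise (including when the pattern fails to compile). */
int matches_bang_number(const char *s)
{
    regex_t re;
    int ok;

    /* POSIX extended syntax has no \d, so [0-9] is spelled out.
       ^ and $ force the whole string to match. */
    if (regcomp(&re, "^![1-9][0-9]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

Compiling the pattern on every call keeps the sketch short; code that checks many strings would compile it once and reuse it.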

Format Checking in C

I'm trying to make a program in c to read in from text a large database of ice rink activities. Does anyone know how to check for something that is not in the format
ie the text document will have something like this
sample
---------------------------------------------
date startT endT END
_______________________________________________
Ice Rink 1
1/13/2014 1:50 3:50 PM Public Skating
1/13/2014 1:50 3:50 PM Game
ice rink 2
1/13/2014 1:50 3:50 PM OPEN
I can already successfully read in one line of the event, date time and description
but how do I skip or detect the lines that don't match my scan in style of
fscanf(ifp,"%d/%d/%d\t%d:%d%s\t%d:%d%s\t\t %20c",
&e1[i].month,&e1[i].day,&e1[i].year,&e1[i].startH,&e1[i].startM,e1[i].MER1,&e1[i].endH,&e1[i].endM,e1[i].MER2,e1[i].event);
In short: how do I detect cases that don't match this exactly?
Thanks in advance
As others have already said, you can check the return value of fscanf to find out whether a line is in the given format. That isn't the ideal approach, however. First, your data is organised line-wise, but fscanf treats the newline character like any other whitespace. You could read the line with fgets first and then apply sscanf on the line, but you'd still have one big monolithic format specifier that is easy to lose track of.
I'd like to propose another approach. Your data lines seem to be organised in fields, which are separated from each other with tab characters. You could read the lines with fgets, then split them with strtok and finally scan the separate fields with sscanf. If you write custom wrapper functions around your sscanf statements, you can run a sanity check on the data when it's read.
/*
* Return true if str has format "hh:min AM/PM"
*/
int scan_time(const char *str, int *hh, int *mm)
{
char buf[4] = {0};
int n;
char c;
n = sscanf(str, "%d:%d%3s %c", hh, mm, buf, &c); /* %3s leaves room for the terminating '\0' in buf[4] */
if (n == 4) return 0; /* trailing extra chars */
if (n < 2) return 0; /* missing minutes */
if (n == 3) {
int key = (buf[0] << 16) + (buf[1] << 8) + buf[2];
#define KEY(a, b) ((a << 16) + (b << 8))
switch (key) {
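case KEY('a', 'm'):
case KEY('A', 'M'):
if (*hh == 12) *hh = 0; /* 12:xx am is just after midnight */
break;
case KEY('p', 'm'):
case KEY('P', 'M'):
if (*hh != 12) *hh += 12; /* 12:xx pm is noon, not 24:xx */
break;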
default:
return 0; /* invalid am/pm spec */
}
}
if (*hh < 0 || *hh >= 24) return 0; /* invalid hours */
if (*mm < 0 || *mm >= 60) return 0; /* invalid minutes */
return 1;
}
/*
* Return true, if str has format "mm/dd/year"
*/
int scan_date(const char *str, int *yy, int *mm, int *dd)
{
static const int mdays[] = {
0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};
int n;
char c;
n = sscanf(str, "%d/%d/%d %c", mm, dd, yy, &c);
if (n == 4) return 0; /* trailing extra chars */
if (n < 2) return 0; /* missing day */
if (n == 2) *yy = 2014; /* set default value */
if (*yy < 100) *yy += 2000; /* allow 1/1/14 */
if (*mm < 1 || *mm > 12) return 0; /* invalid month */
if (*dd < 1 || *dd > mdays[*mm]) return 0;
if (*mm == 2 && *dd == 29 && *yy % 4) return 0;
/* invalid day */
return 1;
}
/*
* Return true if line is "date \t time \t time \t text"
*/
int scan_line(char *str, struct Event *ev)
{
char *token;
token = strtok(str, "\t");
if (token == NULL) return 0;
if (!scan_date(token, &ev->year, &ev->month, &ev->day)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
if (!scan_time(token, &ev->startH, &ev->startM)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
if (!scan_time(token, &ev->endH, &ev->endM)) return 0;
token = strtok(NULL, "\t");
if (token == NULL) return 0;
strncpy(ev->event, token, 40);
return 1;
}
/*
* Remove trailing newline
*/
void chomp(char *str)
{
int l = strlen(str);
if (l && str[l - 1] == '\n') str[l - 1] = '\0';
}
/*
* Scan file with events
*/
int scan_file(const char *fn)
{
FILE *f = fopen(fn, "r");
if (f == NULL) return -1;
for (;;) {
struct Event ev;
char line[200];
if (fgets(line, 200, f) == NULL) break;
chomp(line);
if (scan_line(line, &ev)) {
printf("%s on %d/%d/%d\n",
ev.event, ev.month, ev.day, ev.year);
}
}
fclose(f);
return 0;
}
Here, the scan_xxx functions scan a piece of data, check the format, assign the data and run a basic check on the data, so that you'll never get an event on the 32nd of January or at 35:00h.
This makes the scanning functions more complicated than a single call to sscanf, but there are some benefits. First, the checks are done when reading the format. That means you don't have to check your data in the client code, because you can rely on sensible values. That also means that you don't have to duplicate code: note how the checks for the time are coded only once, namely in scan_time, although they are applied twice per line, for the start and end times.
Treating the data field-wise in encapsulated functions allows you to change the format. For example, you could allow "1pm" as valid shortcut for "1:00 pm". You'd just have to re-scanf your time field with a second format string when the first format fails. You can also do that with your long single-line format, but since you have two time fields, that wouldn't be so easy.
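The "1pm" shortcut described above could be sketched like this: try the full format first and fall back to a second format string if it fails. This is a simplified, hypothetical variant (scan_time_short is not the scan_time function from this answer). It returns the hour as written plus the am/pm suffix and leaves range checks and the 24-hour conversion to the caller.

```c
#include <ctype.h>
#include <stdio.h>

/* Returns 1 if m is "am" or "pm" in any letter case. */
static int is_meridiem(const char *m)
{
    int first = tolower((unsigned char)m[0]);

    return (first == 'a' || first == 'p')
        && tolower((unsigned char)m[1]) == 'm' && m[2] == '\0';
}

/* Hypothetical variant: accepts "h:mm am/pm" and, as a fallback, "h am/pm"
   (for example "1pm"). mer must have room for 3 bytes. Returns 1 on
   success, 0 on failure. */
int scan_time_short(const char *str, int *hh, int *mm, char mer[3])
{
    char c;

    /* First attempt: full form such as "1:00 pm"; %c catches extra text. */
    if (sscanf(str, "%d:%d %2s %c", hh, mm, mer, &c) == 3)
        return is_meridiem(mer);
    /* Second attempt: short form; the minutes default to 0. */
    if (sscanf(str, "%d %2s %c", hh, mer, &c) == 2 && is_meridiem(mer)) {
        *mm = 0;
        return 1;
    }
    return 0;
}
```

The explicit is_meridiem() check matters in the fallback, because %2s would otherwise accept the ":0" of a truncated field such as "1:0".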
Also note how the code above accepts 14 as shortcut for 2014 and interprets a missing year as 2014. All this might seem a bit too complicated for a simple data scanning tool, but you can re-use your functions in similar projects. Also, writing these tidy functions is more fun than wrangling longish scanf formats.
You can check the return value of fscanf: "On success, the function returns the number of items of the argument list successfully filled. This count can match the expected number of items or be less (even zero) due to a matching failure, a reading error, or the reach of the end-of-file." If you know how many items you want to match, you can check how many were matched successfully. Keep in mind that a subsequent fscanf call picks up where the previous one stopped, either after a complete match or at the first input that did not fit the format.
Just brainstorming, but one way around this is to use fgets to read the rest of the line up to the '\n' and do nothing with it.
Note that whatever you do, because you have more than one type of input line, you aren't going to be fscanfing the data straight into variables in your program. You have to know whether you have a rink name or an activity entry before you can decide what to do with the line.
So you will first read a whole line in, then process it (and dump empty lines as you go).
You can use sscanf to see if the line is of an acceptable format. You will want to test it against the format of the activity entry first, because if the first element (the first number of the date) doesn't match, you can conclude that you must have a rink name. Then see if you can scan the result into a suitable rink name (you might want to check something about these).
If the sscanf for the activity entry fails on anything other than the first entry, you can tell your user which one it was and thus what is wrong (for example, if sscanf returns 3, you know that the date scanned but the start hour did not).
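The line-by-line classification described in this answer might be sketched as follows. It assumes each line has already been read with fgets and that an activity entry always starts with a date in m/d/yyyy form; the enum and the function name classify_line are made up for illustration.

```c
#include <stdio.h>

enum line_kind { LINE_BLANK, LINE_ACTIVITY, LINE_OTHER };

/* Hypothetical classifier: a line that starts with a date is treated as
   an activity entry; any other non-blank line (rink name, header, ruler)
   is reported as LINE_OTHER and left to the caller. */
enum line_kind classify_line(const char *line)
{
    int month, day, year;
    char c;

    if (sscanf(line, " %c", &c) != 1)
        return LINE_BLANK;      /* empty or whitespace only */
    if (sscanf(line, "%d/%d/%d", &month, &day, &year) == 3)
        return LINE_ACTIVITY;
    return LINE_OTHER;
}
```

A caller would then run the full activity format only on LINE_ACTIVITY lines and use the sscanf return value there to report which field failed.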

Resources