Ignoring integers that are next to characters using sscanf() - c

Sorry for the simple question, but I'm trying to find an elegant way to avoid my program seeing input like "14asdf" and accepting it just as 14.
if (sscanf(sInput, "%d", &iAssignmentMarks[0]) != 0)
Is there an easy way to prevent sscanf from pulling integers out of mangled strings like that?

You can't directly stop sscanf() from doing what it is designed and specified to do. However, you can use a little-known and seldom-used feature of sscanf() to make it easy to find out that there was a problem:
int i;
if (sscanf(sInput, "%d%n", &iAssignmentMarks[0], &i) != 1)
...failed to recognize an integer...
else if (!isspace(sInput[i]) && sInput[i] != '\0')
...character after integer was not a space character (including newline) or EOS...
The %n directive reports on the number of characters consumed up to that point, and does not count as a conversion (so there is only one conversion in that format). The %n is standard in sscanf() since C89.
For extracting a single integer, you could also use strtol() - carefully (detecting error conditions with it is surprisingly hard, but it is better than sscanf() which won't report or detect overflows). However, this technique can be used multiple times in a single format, which is often more convenient.
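As a minimal sketch of the %n technique described above, the check can be wrapped in a small helper. The name parse_int_strict is hypothetical, and this version is slightly stricter than the snippet above: it assumes only whitespace may follow the number, so "14 asdf" is rejected as well.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the value in *out only when
   `s` holds one integer followed by nothing but whitespace. */
int parse_int_strict(const char *s, int *out)
{
    int value;
    int consumed = 0;

    /* %n stores the number of characters read so far; it is not counted
       in the return value, so a successful parse returns 1. */
    if (sscanf(s, "%d%n", &value, &consumed) != 1)
        return 0;                       /* no integer at the start */

    /* Allow trailing whitespace (e.g. the newline left by fgets). */
    while (isspace((unsigned char)s[consumed]))
        consumed++;

    if (s[consumed] != '\0')
        return 0;                       /* junk such as "asdf" follows */

    *out = value;
    return 1;
}
```

Called as parse_int_strict(sInput, &iAssignmentMarks[0]), it leaves the destination untouched when the input is rejected. Overflow is still not detected, as noted above.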

You want to read integers from strings. It is easier to do this with strtol than with sscanf. strtol returns, indirectly via endptr, the address just after the last character that was successfully read into the number. If, and only if, the whole string was a number, at least one character was converted and endptr points to the end of the string, i.e. endptr != sInput and *endptr == '\0'.
char *endptr = NULL;
errno = 0; /* strtol does not clear errno on success */
long n = strtol(sInput, &endptr, 10);
bool isNumber = endptr != sInput && *endptr == '\0' && errno == 0;
(Initial whitespace is ignored. See a strtol man page for details.)
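The strtol checks can be collected into one function. This is only a sketch: the name str_to_int is hypothetical, and the extra INT_MIN/INT_MAX test assumes the result has to fit in an int (as iAssignmentMarks[0] in the question appears to be).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out only when
   the whole string `s` is a decimal integer that fits in an int. */
int str_to_int(const char *s, int *out)
{
    char *endptr;
    long n;

    errno = 0;                          /* strtol never clears errno itself */
    n = strtol(s, &endptr, 10);

    if (endptr == s)                    /* no digits were converted */
        return 0;
    if (*endptr != '\0')                /* trailing characters, e.g. "14asdf" */
        return 0;
    if (errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return 0;                       /* out of range for long or int */

    *out = (int)n;
    return 1;
}
```

Leading whitespace and a sign are accepted because strtol accepts them; a trailing newline is not, so strip it first if the string came from fgets.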

This is easy. No fancy C++ required! Just do:
char unusedChar;
if (sscanf(sInput, "%d%c", &iAssignmentMarks[0], &unusedChar) == 1)
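To see why comparing against 1 works, it may help to spell out the possible return values. The sketch below uses a hypothetical helper name, is_bare_int, and assumes the string has already had any trailing newline removed, since a newline counts as "another character" here.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 only if `s` is an integer with nothing
   at all after it (not even a newline). */
int is_bare_int(const char *s)
{
    int value;
    char extra;

    /* 1        -> integer read, then end of string: accept
       2        -> integer read, then another character: reject
       0 or EOF -> no integer at all: reject */
    return sscanf(s, "%d%c", &value, &extra) == 1;
}
```

If the input comes straight from fgets, "14\n" makes sscanf return 2, so either strip the newline first or also accept a result of 2 when the extra character is '\n'.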

scanf isn't that smart. You'll have to read the input as text and use strtol to convert it. One of the arguments to strtol is a char * that will point to the first character that isn't converted; if that character isn't whitespace or 0, then the input string wasn't a valid integer:
char input[SIZE]; // where SIZE is large enough for the expected values plus
                  // a sign, newline character, and 0 terminator
...
if (fgets(input, sizeof input, stdin))
{
    char *chk;
    long val = strtol(input, &chk, 10);
    // reject if nothing was converted, or if the first unconverted
    // character is neither whitespace nor the 0 terminator
    if (chk == input || (!isspace((unsigned char)*chk) && *chk != 0))
    {
        // input wasn't an integer string
    }
}

If you can use C++-specific capabilities, there are clearer ways to test input strings using streams.
Check here:
http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.2
If you're wondering, yes, this did come from another Stack Overflow post, which answers this question.

Related

Float Checking from Char Array for Limit, Character Checking Front and Back

I am creating a simple console application whose char *argv[] arguments are expected to be floating-point numbers (such as 5.234, 7.197, and so on).
To ensure that the program only receives user input that is a valid float, I created a function which combines the results of sscanf (ref: character array to floating point conversion) and a valid range check (ref: How can I check if a string can be converted to a float?).
//buffer comes from argv[n]
char MyFloatCheck(char* buffer)
{
    float f;
    char result;
    result = sscanf(buffer, "%f", &f);
    result &= isRangeValid(buffer);
    return result;
}
Then I tested the function above with:
1. Valid input: 12.15
2. Very large input: 4 x 10^40
3. Invalid inputs: (a) ab19.114, (b) 19.114ab
The results for tests 1, 2, and 3(a) are as expected:
1. 1
2. 0 (because it is too large)
3. (a) 0 (because it contains invalid characters in front of the number)
However, the result for 3(b) is unexpected:
3. (b) 1 (??)
My questions are:
1. Why is that so?
2. Is there any built-in way to check this kind of input error?
3. Is there any well established workaround?
I am thinking of making my own function which checks the character from the right end to see if it contains invalid characters, but if there is any available built-in way, I would rather use it.
As you noticed, sscanf consumes characters one by one and writes the number that has been read in %f regardless of whether the reading stopped because of the end of the input string, a space, a newline, or a letter.
You would get the same behavior from strtof, a simpler substitute for sscanf(buffer, "%f", &f);:
char *endptr;
f = strtof(buffer, &endptr);
The above two lines give you a simple way to check that the entire string has been consumed after the call to strtof:
if (endptr != buffer && *endptr == 0) …
The condition endptr != buffer means that a floating-point number has been read. Otherwise, f is zero but that doesn't mean anything since no character was consumed. *endptr == 0 means that the entire input buffer was consumed in reading the floating-point number, which appears to be what you are looking for.
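Putting the two conditions together with a range check gives a drop-in replacement for the function in the question. This is a sketch under assumptions: parse_float_strict is a hypothetical name, and the errno == ERANGE test stands in for the asker's isRangeValid, whose definition is not shown.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out only when
   the whole of `buffer` is one floating-point number in float range. */
int parse_float_strict(const char *buffer, float *out)
{
    char *endptr;
    float f;

    errno = 0;                          /* strtof does not clear errno */
    f = strtof(buffer, &endptr);

    if (endptr == buffer)               /* nothing converted: "ab19.114" */
        return 0;
    if (*endptr != '\0')                /* junk after number: "19.114ab" */
        return 0;
    if (errno == ERANGE)                /* overflow such as "4e40"; some
                                           libraries also report underflow */
        return 0;

    *out = f;
    return 1;
}
```

Note that strtof also accepts forms such as "inf", "nan" and hexadecimal floats, so add an explicit test if those should be rejected.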

Converting input numerical strings to integers in C

I am trying to convert numerical string inputs to integers. When I type 2, my output is 0.
scanf( "%d", betNumber );
printf("What number do you want to bet? \n");
printf("Here is your number: %d\n" ,atoi(betNumber));
User input = 2 or 3 or 5
Output = always 0
You should never use the *scanf family, nor atoi and friends. The correct way to write this code is
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main(void)
{
    char inbuf[80], *endp;
    unsigned long bet; /* negative bets don't make sense */
    puts("What number do you want to bet?");
    if (!fgets(inbuf, sizeof inbuf, stdin))
        return 1; /* EOF - just quit */
    errno = 0;
    bet = strtoul(inbuf, &endp, 10);
    if (endp == inbuf || *endp != '\n' || errno) {
        fputs("invalid number entered, or junk after number\n", stderr);
        return 1;
    }
    printf("Here is your number: %lu\n", bet);
    return 0;
}
Some exposition, perhaps:
The only sane way to read input from the user in C is an entire line at a time. If you don't, you are very likely to get in trouble when, not if, the user types more input than you expected. The ideal way to do this is with getline, but many C libraries do not have it, and it does make you remember to free the line-buffer. Failing that, fgets is good enough for many purposes, as shown here.
The only correct way to convert text to machine numbers in C is with the strto* family of functions. I use the word correct very deliberately. All the alternatives either silently ignore numeric overflow (ato*) or, worse, trigger undefined behavior on numeric overflow (*scanf). The usage pattern for strto* is a little tricky, but once you get used to it, it is just boilerplate to be memorized and typed. I'll take it apart for you:
errno = 0;
It is necessary to clear errno manually before calling strto*, because a syntactically valid number that overflows the range of the return value is reported only by setting errno, but success does not clear errno. (The manpage says that a particular numeric value is returned, but that value could have resulted from correct input, so that's no help.)
bet = strtoul(inbuf, &endp, 10);
When this function call returns, bet will be the number you wanted, and endp will point to the first character in inbuf that was not consumed as part of the number.
if (endp == inbuf || *endp != '\n' || errno) { /* error */ }
If endp equals inbuf, that means there were no digits, which is usually not considered a valid number.
If *endp is not equal to '\n', that means either there was something else on the line after the number, or fgets did not read the whole line; in either case, again, the input is not as expected. (All 64-bit unsigned numbers fit in fewer than 80 characters.)
And if errno is nonzero, numeric overflow occurred.
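One caveat when using this pattern with strtoul specifically: strtoul accepts a leading minus sign and negates the value in the unsigned type, so "-1\n" passes all three checks above and yields a very large number. The sketch below, with the hypothetical name parse_bet_line, adds an explicit test for that on top of the same boilerplate.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses one line as read by fgets(). Returns 1 and
   stores the value in *out on success, 0 otherwise. */
int parse_bet_line(const char *line, unsigned long *out)
{
    const char *p = line;
    char *endp;
    unsigned long v;

    while (isspace((unsigned char)*p))  /* skip what strtoul would skip */
        p++;
    if (*p == '-')                      /* strtoul would silently accept it */
        return 0;

    errno = 0;                          /* must be cleared by the caller */
    v = strtoul(p, &endp, 10);
    if (endp == p || *endp != '\n' || errno)
        return 0;                       /* no digits, junk, or overflow */

    *out = v;
    return 1;
}
```

In the program above, this would replace the strtoul call and the if that follows it, with inbuf passed as the line.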
I suggest you not use scanf, nor any of its disreputable coterie of insidious companions. Use strtol. Easier to get your head around and less error prone. atoi is a less-safe equivalent of strtol.
But... atoi takes a pointer to char. So if this compiles...
printf("Here is your number: %d\n" ,atoi(betNumber));
...then betNumber must be:
char betNumber[100];
...or something like that. Which explains what's wrong with the call to scanf.
scanf("%d", p) expects that you'll be passing it a pointer to int, not a pointer to char. It reads text from standard input, converts it to an integer, and assigns it to the int variable -- or what it assumes must be an int variable -- that the pointer is pointing to. There is no type checking there, there can't be. The compiler can't tell that you passed the wrong thing (a C compiler could be written to validate scanf's arguments against the format string, but the general tendency with C compilers has always been to politely let you shoot yourself in the foot (modulo gcc -Wall, thanks @Zack), and rightly so).
If you use scanf() properly, such as
int betNumber;
scanf("%d", &betNumber);
then after scanf() returns successfully, the value of betNumber already is an integer, you do not need to convert it anymore.
scanf and atoi both take a pointer to some previously allocated storage. Here's a working example:
int betNumber;
printf("What number do you want to bet?\n");
scanf("%d", &betNumber);
printf("Here is your number: %d\n", betNumber);

using sscanf to check string format

I want to compare my string to a given format.
The format that I want to use in the check is:
"xxx://xxx:xxx#xxxxxx" // all the xxx are with variable length
so I used sscanf() as follows:
if (sscanf(stin,"%*[^:]://%*[^:]:%*[^#]#") == 0) { ... }
Is it correct to compare the return value of sscanf to 0 in this case?
You will get zero back if all the fields match, but you will also get zero back if matching fails partway through, because suppressed conversions are never counted; so the return value tells you nothing in practice. It might have failed with a colon as the first character and it would still return 0.
You need at least one conversion in there that is counted (%n is not counted), and that occurs at the end so you know that what went before also matched. You can never tell if trailing context (data after the last conversion specification) matched, and sscanf() won't back up if it has converted data, even if backing up would allow the trailing context to match.
For your scenario, that might be:
char c;
int n;
if (sscanf(stin, "%*[^:]://%*[^:]:%*[^#]#%n%c", &n, &c) == 1)
This requires at least one character after the #. It also tells you how many characters there were up to and including the #.
OP's suggestion is close.
@Jonathan Leffler is correct in that comparing the result of a specifier-less sscanf() against 0 does not distinguish between a match and no-match.
To test against "xxx://xxx:xxx#xxxxxx" (and assuming every part marked "x" needs at least one matching character), use
int n = 0;
sscanf(stin, "%*[^:]://%*[^:]:%*[^#]#%*c%n", &n);
if (n > 0) {
match();
}
There is an obscure hole when using this method with fscanf(): a stream of data containing a \0 is a problem.
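Wrapped in a function, the %n-based test looks like this. It is a sketch only: matches_format is a hypothetical name, and it assumes the string is already in memory, so the fscanf() caveat does not apply.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if `s` looks like "xxx://xxx:xxx#xxxxxx"
   with every xxx part at least one character long. */
int matches_format(const char *s)
{
    int n = 0;

    /* Every conversion is suppressed, so the return value is useless.
       n is only written if scanning reaches %n, which means everything
       before it matched, including one character after the '#'. */
    sscanf(s, "%*[^:]://%*[^:]:%*[^#]#%*c%n", &n);
    return n > 0;
}
```

Only the first character after the '#' is checked; anything further along the string is accepted as-is.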

Can I use fscanf to get only digits from text that contain chars and ints?

I want to extract digits from a file that contains characters and digits.
For example:
+ 321 chris polanco 23
I want to skip the '+' and get only the 321.
Here's the code I have so far.
while(fscanf(update, "%d", &currentIn->userid) != EOF){
    currentIn->index = index;
    rootIn = sort(rootIn, currentIn);
    index = index + 1;
    currentIn = malloc(sizeof(Index));
}
I was thinking that since I had %d that it would get the first digits that it finds but I was wrong. I'm open to better ways of doing this if you guys have any.
Instead of struggling with fscanf() (and running into format problems later), I recommend using a combination of fgets() and sscanf() to process each line.
If you know that the integer you are interested in starts at the 3rd position in each line of the file, then you can pass line+2 to sscanf() to read it. Otherwise, you can modify the sscanf() format string according to the format of your input file.
char line[MAX_LINE_LEN + 1];
while ( fgets(line, sizeof line, update) )
{
    if(sscanf(line+2, "%d", &currentIn->userid) != 1)
    {
        /* handle failure */
    }
    ...
}
while (fscanf(update, "%*[^0-9]%d", &currentIn->userid) == 1)
{
...
}
This skips over non-digits (that's the %*[^0-9] part) followed by an integer. The suppressed assignment isn't counted, so the == 1 ensures that you got a number.
Unfortunately, it runs into a problem if the first character in the file is a digit — as pointed out by Chris Dodd. There are multiple possible solutions to that:
ungetc('a', update); will give a non-digit to read first.
while ((fscanf(update, "%*[^0-9]"), fscanf(update, "%d", &currentIn->userid)) == 1)
Or:
while (fscanf(update, "%*[^0-9]%d", &currentIn->userid) == 1 ||
fscanf(update, "%d", &currentIn->userid) == 1)
{
...
}
Depending on which you think is more likely, you could reverse the order of these two fscanf() policies. With the scanf() family of functions, there's always a problem if the string of digits is so long that the number cannot be represented in an int; you get undefined behaviour. I don't attempt to address that.
This will pick up multiple numbers per line, one per invocation. If you want a single number per line, or otherwise want control over how each line is handled, then use fgets() or readline() to read the line, and then sscanf() to do the analysis. One advantage of this is that if you so choose, you can use careful functions like strtol() to convert digits to numbers.
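For the one-number-per-line case, a sketch of the fgets() plus strtol() route might look like the following. The name first_number_in_line is hypothetical, and the function assumes that only the first run of digits on the line matters and that any sign in front of it is ignored, as with the %*[^0-9] format above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: finds the first run of digits in `line` and
   converts it. Returns 1 and stores the value in *out on success. */
int first_number_in_line(const char *line, long *out)
{
    /* strcspn gives the length of the prefix that contains no digits */
    const char *p = line + strcspn(line, "0123456789");
    long v;

    if (*p == '\0')
        return 0;                       /* the line has no digits */

    errno = 0;
    v = strtol(p, NULL, 10);
    if (errno == ERANGE)
        return 0;                       /* too large for a long */

    *out = v;
    return 1;
}
```

Each line read with fgets(line, sizeof line, update) can be passed to it directly; for "+ 321 chris polanco 23" it stores 321.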

strcmp not working

I know this may be a totally newbie question (I haven't touched C in a long while), but can someone tell me why this isn't working?
printf("Enter command: ");
bzero(buffer,256);
fgets(buffer,255,stdin);
if (strcmp(buffer, "exit") == 0)
return 0;
If I enter "exit" it doesn't enter the if, does it have to do with the length of "buffer"?
Any suggestions?
You want to do this:
strcmp(buffer, "exit\n")
That is, when you enter your string and press "enter", the newline becomes a part of buffer.
Alternatively, use strncmp(), which only compares the first n characters of the string.
fgets() is returning the string "exit\n" -- unlike gets(), it preserves newlines.
As others have said, comparing with "exit" fails because fgets() includes the newline in the buffer. One of its guarantees is that the buffer will end with a newline, unless the entered line is too long for the buffer or the input ends without one, in which case it does not end with a newline. fgets() also guarantees that the buffer is NUL-terminated, so you don't need to zero all 256 bytes or limit fgets() to 255; passing the full buffer size is safe.
The easy answer of comparing to exactly "exit\n" requires that the user did not accidentally add whitespace before or after the word. That may not matter if you want to force the user to be careful with the exit command, but might be a source of user annoyance in general.
Using strncmp() potentially allows "exited", "exit42", and more to match where you might not want them. That might work against you, especially if some valid commands are prefix strings of other valid commands.
In the general case, it is often a good idea to separate I/O, tokenization, parsing, and action into their own phases.
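As a small illustration of separating tokenization from comparison, the sketch below trims whitespace first and then does an exact match. Both function names, trim and is_command, are hypothetical, and the buffer is assumed to be writable (as a buffer filled by fgets is).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trims leading and trailing whitespace from `line`
   in place and returns a pointer to the first non-space character. */
char *trim(char *line)
{
    char *end;

    while (isspace((unsigned char)*line))
        line++;
    end = line + strlen(line);
    while (end > line && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return line;
}

/* Exact match after trimming, so "exit42" and "exited" do not match. */
int is_command(char *line, const char *command)
{
    return strcmp(trim(line), command) == 0;
}
```

With this, is_command(buffer, "exit") accepts "exit\n" and "  exit \n" alike, without the prefix-matching problem of strncmp().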
Agree with Dave. Also you may wish to use strncmp() instead. Then you can set a length for the comparison.
http://www.cplusplus.com/reference/clibrary/cstdio/fgets/
http://www.cplusplus.com/reference/clibrary/cstring/strncmp/
I'd recommend that you strip the \n from the end of the string, like this.
/* needs <stdio.h>, <stdlib.h> and <string.h> */
char buf[256];
size_t len;
/* get the string; fgets itself leaves room for the null byte */
if ( fgets(buf, sizeof(buf), stdin) == NULL )
{
    printf("error\n");
    exit(1);
}
/* redundant after a successful fgets, but harmless */
buf[sizeof(buf) - 1] = '\0';
/* compute the length, and truncate the \n if any */
len = strlen(buf);
while ( len > 0 && buf[len - 1] == '\n' )
{
    buf[len - 1] = '\0';
    --len;
}
That way, if you have to compare the inputted string against several constants, you're not having to add the \n to all of them.

Resources