I have a dynamic string like "/users/5/10/fnvfnvdjvndfvjvdklchsh" and also a dynamic format like "/users/%u/%d/%s". How can I check whether the string matches the format?
By string I mean char[255] or char* str = malloc(x).
I tried to use sscanf, but I don't know the number of arguments or their types. If I do:
int res = sscanf(input, format);
I get a stack overflow. Can I allocate a buffer to prevent this?
For example:
void* buffer = malloc(1024);
int res = sscanf(input, format, buffer);
I would like to have a function like this:
bool stringMatches(const char* format, const char* input);
stringMatches("/users/%u/%d/%s", "/users/5/10/fnvfnvdjvndfvjvdklchsh"); //true
stringMatches("/users/%u/%d/%s", "/users/5/10"); //false
stringMatches("/users/%u/%d/%s", "/users/-10/10/aaa"); //false %u is unsigned
Do you see any solution?
Thanks in advance.
I don't think that there is a scanf-like matching function in the standard lib, so you will have to write your own. Replicating all details of the scanf behaviour is difficult, but it's probably not necessary.
If you allow only % and a limited selection of single format identifiers without size, width and precision information, the code isn't terribly complex:
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
bool stringMatches(const char *format, const char *input)
{
while (*format) {
if (*format == '%') {
format++;
switch(*format++) {
case '%': {
if (*input++ != '%') return false;
}
break;
case 'u':
if (*input == '-') return false;
// continue with 'd' case
case 'd': {
char *end;
strtol(input, &end, 0);
if (end == input) return false;
input = end;
}
break;
case 's': {
if (isspace((uint8_t) *input)) return false;
while (*input && !isspace((uint8_t) *input)) input++;
}
break;
default:
return false;
}
} else {
if (*format++ != *input++) return false;
}
}
return (*input == '\0');
}
Some notes:
I've parsed the numbers with strtol. If you want to include floating-point number formats, you could use strtod for that, if your embedded system provides it. (You could also parse stretches of isdigit() chars as valid numbers.)
The 'u' case falls through to the 'd' case here, so both are parsed with strtol. Switching to strtoul would not help on its own, because it parses an unsigned long but still allows a minus sign, so that case is caught explicitly. (But the way it is caught, it won't allow leading white space.)
You could implement your own formats or re-interpret existing ones. For example you could decide that you don't want leading white space for numbers or that a string ends with a slash.
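To illustrate the note about parsing stretches of isdigit() characters, here is a minimal sketch of a stricter number matcher. The helper name skip_digits is made up for this example; it accepts no sign, no leading white space and no hex or octal prefix, which may or may not be what you want.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer just past a run of one or more
   decimal digits at the start of input, or NULL if input does not
   start with a digit. */
const char *skip_digits(const char *input)
{
    const char *p = input;

    /* The cast keeps isdigit() well defined when char is signed. */
    while (isdigit((unsigned char) *p))
        p++;

    return (p == input) ? NULL : p;
}
```

In the 'u' case of the switch above, something like `input = skip_digits(input); if (!input) return false;` could then take the place of the strtol call.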
It's a rather tricky one. I don't think C has very useful built in functions that will help you.
What you could do is using a regex. Something like this:
#include <sys/types.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    regex_t regex;
    /* REG_EXTENDED is needed for '+' to mean "one or more" */
    if (regcomp(&regex, "/users/[[:digit:]]+", REG_EXTENDED)) {
        fprintf(stderr, "Error\n");
        exit(1);
    }
    const char *mystring = "/users/5/10/fnvfnvdjvndfvjvdklchsh";
    if (regexec(&regex, mystring, 0, NULL, 0) == 0)
        printf("Match\n");
    regfree(&regex);
    return 0;
}
The regex in the code above does not suit your example. I just used something to show the idea. I think it would correspond to the format string "/users/%u" but I'm not sure. Nevertheless, I think this is one of the easiest ways to tackle this problem.
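If you want to go further with the regex idea, the format string itself can be translated into a pattern. The following is only a sketch under assumptions: the names format_to_regex and regex_matches are invented here, only %u, %d, %s and %% are handled, and the choice of what each specifier should match (for example no sign for %u) is mine, not something scanf defines.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: translate a tiny scanf-like format into an
   anchored POSIX extended regular expression. */
bool format_to_regex(const char *format, char *out, size_t outsize)
{
    size_t n = 0;

    if (outsize < 3)
        return false;
    out[n++] = '^';

    while (*format) {
        char literal[3];
        const char *piece;

        if (*format == '%') {
            format++;
            switch (*format) {
            case 'u': piece = "[0-9]+";        break;
            case 'd': piece = "[+-]?[0-9]+";   break;
            case 's': piece = "[^[:space:]]+"; break;
            case '%': piece = "%";             break;
            default:  return false;   /* unsupported or missing specifier */
            }
        } else if (strchr("^.[$()|*+?{\\", *format)) {
            /* Escape characters that are special in extended regexes. */
            literal[0] = '\\';
            literal[1] = *format;
            literal[2] = '\0';
            piece = literal;
        } else {
            literal[0] = *format;
            literal[1] = '\0';
            piece = literal;
        }

        size_t len = strlen(piece);
        if (n + len + 2 > outsize)    /* keep room for '$' and '\0' */
            return false;
        memcpy(out + n, piece, len);
        n += len;
        format++;
    }

    out[n++] = '$';
    out[n] = '\0';
    return true;
}

/* Hypothetical wrapper: true if input matches the translated format. */
bool regex_matches(const char *format, const char *input)
{
    char pattern[512];
    regex_t regex;

    if (!format_to_regex(format, pattern, sizeof pattern))
        return false;
    if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    bool ok = (regexec(&regex, input, 0, NULL, 0) == 0);
    regfree(&regex);
    return ok;
}
```

Compiling the pattern on every call is wasteful; if the same format is checked often, you would probably keep the compiled regex_t around instead.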
The easiest is to just try parsing it with sscanf, and see if the scan succeeded.
char * str = "/users/5/10/fnvfnvdjvndfvjvdklchsh";
unsigned int tmp_u;
int tmp_d;
char tmp_s[256];
int n = sscanf (str, "/users/%u/%d/%s", &tmp_u, &tmp_d, tmp_s);
if (n!=3)
{
/* Match failed */
}
Just remember that you don't have to match everything in one go. You can use the %n format specifier to get the number of bytes parsed, and increment the string for the next parse.
This example abuses the fact that bytes_parsed will not be modified if the parsing doesn't reach the %n specifier:
char * str = "/users/5/10/fnvfnvdjvndfvjvdklchsh";
int bytes_parsed = 0;
/* parse prefix */
sscanf(str, "/users/%n", &bytes_parsed);
if (bytes_parsed == 0)
{
/* Parse error */
}
str += bytes_parsed; /* str = "5/10/fnvfnvdjvndfvjvdklchsh"; */
bytes_parsed = 0;
/* Parse next num */
unsigned int tmp_u;
sscanf(str, "%u%n", &tmp_u, &bytes_parsed);
if (bytes_parsed)
{
/* Number was an unsigned, do something */
}
else
{
/* First number was not an `unsigned`, so we try parsing it as signed */
int tmp_d;
sscanf(str, "%d%n", &tmp_d, &bytes_parsed);
if (bytes_parsed)
{
/* Number was a signed int, do something */
}
}
if (!bytes_parsed)
{
/* failed parsing number */
}
str += bytes_parsed; /* str = "/10/fnvfnvdjvndfvjvdklchsh"; */
......
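When the format is known at compile time, the %n trick can also confirm that the whole string was consumed in a single call. This is a minimal sketch; the function name matches_users_path is made up for illustration. Be aware that %u follows strtoul rules and so accepts a leading minus sign, which would have to be rejected separately if that matters.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if str looks like "/users/<u>/<d>/<word>"
   with nothing left over after the word. */
bool matches_users_path(const char *str)
{
    unsigned int tmp_u;
    int tmp_d;
    char tmp_s[256];
    int end = 0;

    /* %255s limits the word to the buffer size; %n is only reached
       (and end only set) when all three conversions succeeded. */
    if (sscanf(str, "/users/%u/%d/%255s%n", &tmp_u, &tmp_d, tmp_s, &end) != 3)
        return false;

    return end > 0 && str[end] == '\0';
}
```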
Hi folks, thanks in advance for any help. I'm doing the CS50 course and I'm at the very beginning of programming.
I'm trying to check if the string from the main function parameter string argv[] is indeed a number. I searched for multiple ways.
In another topic, "How can I check if a string has special characters in C++ effectively?", I found this in the solution posted by the user Jerry Coffin:
char junk;
if (sscanf(str, "%*[A-Za-z0-9_]%c", &junk))
/* it has at least one "special" character */
else
/* no special characters */
It seems to me it may work for what I'm trying to do, but I'm not familiar with the sscanf function and I'm having a hard time integrating and adapting it to my code. I came this far, but I can't understand the logic of my mistake:
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
int numCheck(string[]);
int main(int argc, string argv[]) {
//Function to check for user "cooperation"
int key = numCheck(argv);
}
int numCheck(string input[]) {
int i = 0;
char junk;
bool usrCooperation = true;
//check for user "cooperation" check that key isn't a letter or special sign
while (input[i] != NULL) {
if (sscanf(*input, "%*[A-Za-z_]%c", &junk)) {
printf("test fail");
usrCooperation = false;
} else {
printf("test pass");
}
i++;
}
return 0;
}
check if the string from the main function parameter string argv[] is indeed a number
A direct way to test if the string converts to an int is to use strtol(). This nicely handles "123", "-123", "+123", "1234567890123", "x", "123x", "".
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
int numCheck(const char *s) {
char *endptr;
errno = 0; // Clear error indicator
long num = strtol(s, &endptr, 0);
if (s == endptr) return 0; // no conversion
if (*endptr) return 0; // Junk after the number
if (errno) return 0; // Overflow
if (num > INT_MAX || num < INT_MIN) return 0; // int Overflow
return 1; // Success
}
int main(int argc, string argv[]) {
// Call each arg[] starting with `argv[1]`
for (int a = 1; a < argc; a++) {
int success = numCheck(argv[a]);
printf("test %s\n", success ? "pass" : "fail");
}
}
sscanf(*input, "%*[A-Za-z_]%c", &junk) is the wrong approach for testing numerical conversion.
You pass argv to numCheck and test all strings in it: this is incorrect as argv[0] is the name of the running executable, so you should skip this argument. Note also that you should pass input[i] to sscanf(), not *input.
Furthermore, let's analyze the return value of sscanf(input[i], "%*[A-Za-z_]%c", &junk):
it returns EOF if the input string is empty,
it returns 0 if %*[A-Za-z_] fails,
it also returns 0 if the conversion %c fails after the %*[A-Za-z_] succeeds,
it returns 1 if both conversions succeed.
This test is insufficient to check for non digits in the string, it does not actually give useful information: the return value will be 0 for the string "1" and also for the string "a"...
sscanf() is very tricky, full of quirks and traps. Definitely not the right tool for pattern matching.
If the goal is to check that the strings contain only digits (at least one), use this instead, using the often overlooked standard function strspn():
#include <stdio.h>
#include <string.h>
int numCheck(char *input[]) {
int i;
int usrCooperation = 1;
//check for user "cooperation" check that key isn't a letter or special sign
for (i = 1; input[i] != NULL; i++) {
// count the number of matching character at the beginning of the string
size_t ndigits = strspn(input[i], "0123456789");
// check for at least 1 digit and no characters after the digits
if (ndigits > 0 && input[i][ndigits] == '\0') {
printf("test passes: %zu digits\n", ndigits);
} else {
printf("test fails\n");
usrCooperation = 0;
}
}
return usrCooperation;
}
Let's try this again:
This is still your problem:
if (sscanf(*input, "%*[A-Za-z_]%c", &junk))
but not for the reason I originally said - *input is equal to input[0]. What you want to have there is
if ( sscanf( input[i], "%*[A-Za-z_]%c", &junk ) )
what you're doing is cycling through all your command line arguments in the while loop:
while( input[i] != NULL )
but you're only actually testing input[0].
So, quick primer on sscanf:
The first argument (input) is the string you're scanning. The type of this argument needs to be char * (pointer to char). The string typedef name is an alias for char *. CS50 tries to paper over the grosser parts of C string handling and I/O and the string typedef is part of that, but it's unique to the CS50 course and not a part of the language. Beware.
The second argument is the format string. %[ and %c are format specifiers and tell sscanf what you're looking for in the string. %[ specifies a set of characters called a scanset - %[A-Za-z_] means "match any sequence of upper- and lowercase letters and underscores". The * in %*[A-Za-z_] means don't assign the result of the scan to an argument. %c matches any character.
Remaining arguments are the input items you want to store, and their type must match up with the format specifier. %[ expects its corresponding argument to have type char * and be the address of an array into which the input will be stored. %c expects its corresponding argument (in this case junk) to also have type char *, but it's expecting the address of a single char object.
sscanf returns the number of items successfully read and assigned - in this case, you're expecting the return value to be either 0 or 1 (because only junk gets assigned to).
Putting it all together,
sscanf( input, "%*[A-Za-z_]%c", &junk )
will read and discard characters from input up until it either sees the string terminator or a character that is not part of the scanset. If it sees a character that is not part of the scanset (such as a digit), that character gets written to junk and sscanf returns 1, which in this context is treated as "true". If it doesn't see any characters outside of the scanset, then nothing gets written to junk and sscanf returns 0, which is treated as "false".
EDIT
So, chqrlie pointed out a big error of mine - this test won't work as intended.
If there are no non-letter and non-underscore characters in input[i], then nothing gets assigned to junk and sscanf returns 0 (nothing assigned). If input[i] starts with a letter or underscore but contains a non-letter or non-underscore character later on, that bad character will be converted and assigned to junk and sscanf will return 1.
So far so good, that's what you want to happen. But...
If input[i] starts with a non-letter or non-underscore character, then you have a matching failure and sscanf bails out, returning 0. So it will erroneously match a bad input.
Frankly, this is not a very good way to test for the presence of "bad" characters.
A potentially better way would be to use something like this:
while ( input[i] )
{
bool good = true;
/**
* Cycle through each character in input[i] and
* check to see if it's a letter or an underscore;
* if it isn't, we set good to false and break out of
* the loop.
*/
for ( char *c = input[i]; *c; c++ )
{
if ( !isalpha( (unsigned char) *c ) && *c != '_' )
{
good = false;
break;
}
}
if ( !good )
{
puts( "test fails" );
usrCooperation = 0;
}
else
{
puts( "test passes" );
}
i++; /* move on to the next argument, otherwise the loop never ends */
}
I followed the solution by the user "chux - Reinstate Monica". Thanks, everybody, for helping me solve this problem. Here is my final program; maybe it can help another learner in the future. I decided to avoid using the non-standard library "cs50.h".
//#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>
void keyCheck(int);
int numCheck(char*);
int main(int argc, char* argv[])
{
//Error code == 1;
int key = 0;
keyCheck(argc); //check that two parameters were sent to main.
key = numCheck(argv[1]); //Check for user "cooperation".
return 0;
}
//check for that main received two parameters.
void keyCheck(int key)
{
if (key != 2) //check that main argc only has two parameter. if not terminate program.
{
exit(1);
}
}
//check that the key (main parameter (argv [])) is a valid number.
int numCheck(char* input)
{
char* endptr;
errno = 0;
long num = strtol(input, &endptr, 0);
if (input == endptr) //no conversion is possible.
{
printf("Error: No conversion possible");
return 1;
}
else if (errno == ERANGE) //Input out of range
{
printf("Error: Input out of range");
return 1;
}
else if (*endptr) //Junk after numeric text
{
printf("Error: data after main parameter");
return 1;
}
else //conversion succesfull
{
//verify that the long int is in the integer limits.
if (num >= INT_MIN && num <= INT_MAX)
{
return num;
}
//if the main parameter is bigger than an int, terminate program
else
{
printf("Error key out of integer limits");
exit(1);
}
}
/* else
{
printf("Success: %ld", num);
return num;
} */
}
Input: AA:BB:CC:DD:EE:FF Output expected: 0xaabbccddeeff.
Input: AA:BB:65:F0:E4:D4 Output expected:0xaabb65f0e4d4
char arr[20]="AA:BB:CC:DD:EE:FF";
char t[20]="0x";
char *token=strtok(arr[i], ":");
while(token !=NULL){
printf("%s\n", token);
token = strtok(NULL, ":");
strcat(t, token);
}
printf("The modified string is %s\n", t);
I am seeing a segmentation fault.
You're attempting the final strcat with a null token. Try moving your conditional to check for that before making the strcat call:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
void lower(char *c) {
for (; *c; c++) *c = (char) tolower((unsigned char) *c);
}
int main() {
char s[] = "AA:BB:CC:DD:EE:FF";
char t[15] = "0x";
char *token = strtok(s, ":");
if (token) {
lower(token);
strcat(t, token);
while ((token = strtok(NULL, ":")) != NULL) {
lower(token);
strcat(t, token);
}
}
printf("The modified string is %s\n", t);
}
Output:
The modified string is 0xaabbccddeeff
Use the 64-bit unsigned integer type uint64_t (declared in <inttypes.h>) to store the 48-bit value (HH:HH:HH:HH:HH:HH → 0xHHHHHHHHHHHH).
You could use sscanf(), but it does not detect overflow; it would consider only the two rightmost hexadecimal characters in each part, so F11:E22:D33:C44:B55:A66 would yield the same result as 11:22:33:44:55:66.
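For comparison, here is roughly what the sscanf() route could look like. This is a sketch only, and the function name mac_from_sscanf is invented for it. Limiting each field to two characters with %2x sidesteps the overflow problem for inputs like F11:E22:..., but %x still skips leading white space and accepts a sign, so it stays more permissive than the hand-written parser further down.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: parse "HH:HH:HH:HH:HH:HH" into *dst using sscanf.
   Returns false if the pattern does not match or text is left over. */
bool mac_from_sscanf(const char *src, uint64_t *dst)
{
    unsigned int b[6];
    int end = 0;

    /* %2x reads at most two characters per field; %n records how far
       the scan got once all six fields have been converted. */
    if (sscanf(src, "%2x:%2x:%2x:%2x:%2x:%2x%n",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &end) != 6)
        return false;
    if (src[end] != '\0')
        return false;

    uint64_t value = 0;
    for (int i = 0; i < 6; i++)
        value = 256 * value + b[i];   /* shift left by one byte, add next */

    *dst = value;
    return true;
}
```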
First, we need a function to convert a hexadecimal digit to its numerical value. Here is the simplest, most easy to read, and also most portable way to write it:
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include <ctype.h>
#include <stdio.h>
static inline int hex_digit(const int c)
{
switch (c) {
case '0': return 0;
case '1': return 1;
case '2': return 2;
case '3': return 3;
case '4': return 4;
case '5': return 5;
case '6': return 6;
case '7': return 7;
case '8': return 8;
case '9': return 9;
case 'A': case 'a': return 10;
case 'B': case 'b': return 11;
case 'C': case 'c': return 12;
case 'D': case 'd': return 13;
case 'E': case 'e': return 14;
case 'F': case 'f': return 15;
default: return -1;
}
}
The function will return a nonnegative (0 or positive) integer corresponding to the character, or -1 if the character is not a hexadecimal digit.
The static inline means that the function is only visible in this translation unit (file; or if put in a header file, each file that #includes that header file). It was standardized in C99 as a way for programmers to write functions that are as fast as (incur no runtime overhead compared to) preprocessor macros.
Next, we need a function to carefully parse the string. Here is one:
/* Parse a string "HH:HH:HH:HH:HH:HH" to 0x00HHHHHHHHHHHH,
and return a pointer to the character following it,
or NULL if an error occurs. */
static const char *parse_mac(const char *src, uint64_t *dst)
{
uint64_t value = 0;
int i, hi, lo;
/* No string specified? */
if (!src)
return NULL;
/* Skip leading whitespace. */
while (isspace((unsigned char)(*src)))
src++;
/* End of string? */
if (!*src)
return NULL;
/* First pair of hex digits. */
if ((hi = hex_digit(src[0])) < 0 ||
(lo = hex_digit(src[1])) < 0)
return NULL;
value = 16*hi + lo;
src += 2;
/* The next five ":HH" */
for (i = 0; i < 5; i++) {
if (src[0] != ':' || (hi = hex_digit(src[1])) < 0 ||
(lo = hex_digit(src[2])) < 0 )
return NULL;
value = 256*value + 16*hi + lo;
src += 3;
}
/* Successfully parsed. */
if (dst)
*dst = value;
return src;
}
Above, we marked the function static, meaning it too is only visible in this compilation unit. It is not marked inline, because it is not a trivial function; it does proper work, so we do not suggest the compiler should inline it.
Note the cast to unsigned char in the isspace() call. This is because isspace() takes either an unsigned char, or EOF. If we supply it a char, and char type happens to be a signed type (it varies between architectures), some characters do get incorrectly classified. So, using the cast with the character-type functions (isspace(), isblank(), tolower(), toupper(), et cetera) is important, if you want your code to work right on all systems that support standard C.
You might not be familiar with the idiom if ((variable = subexpression) < 0). For each (variable = subexpression) < 0, the subexpression gets evaluated, then assigned to the variable. If the value is less than zero, the entire expression is true; otherwise it is false. The variable will retain its new value afterwards.
In C, logical AND (&&) and OR (||) are short-circuiting. This means that if you have A && B, and A is false, then B is not evaluated at all. If you have A || B, and A is true, then B is not evaluated at all. So, in the above code,
if ((hi = hex_digit(src[0])) < 0 ||
(lo = hex_digit(src[1])) < 0)
return NULL;
is exactly equivalent to
hi = hex_digit(src[0]);
if (hi < 0)
return NULL;
lo = hex_digit(src[1]);
if (lo < 0)
return NULL;
Here, we could have written those two complicated if statements more verbosely, but I wanted to include it in this example, to make this answer into something you must "chew" a bit in your mind, before you can use it in e.g. homework.
The main "trick" in the function is that we build value by shifting its digits leftward. If we are parsing 12:34:56:78:9A:BC, the first assignment to value is equivalent to value = 0x12;. Multiplying value by 256 shifts the hexadecimal digits by two places (because 256 = 0x100), so in the first iteration of the for loop, the assignment to value is equivalent to value = 0x1200 + 0x30 + 0x4; i.e. value = 0x1234;. This goes on for four more assignments, so that the final value is 0x123456789ABC;. This "shifting digits via multiplication" is very common, and works in all numerical bases (for decimal numbers, the multiplier is a power of 10; for octal numbers, a power of 8; for hexadecimal numbers, a power of 16; always a power of the base).
You can, for example, use this approach to reverse the digits in a number (so that one function converts 0x123456 to 0x654321, and another converts 8040201 to 1020408).
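A minimal sketch of that digit-reversal idea, using the same "multiply by the base, then add the next digit" step. The function name reverse_digits and its base parameter are my own choices for the example; overflow for very long inputs is not checked.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the digits of value as written in the
   given base (10 for decimal, 16 for hexadecimal, and so on). */
uint64_t reverse_digits(uint64_t value, unsigned int base)
{
    uint64_t result = 0;

    if (base < 2)
        return 0;   /* no meaningful digits in such a base */

    while (value > 0) {
        /* Shift the result one digit to the left, then append the
           lowest digit of value. */
        result = result * base + value % base;
        value /= base;
    }
    return result;
}
```

With this, reverse_digits(0x123456, 16) yields 0x654321 and reverse_digits(8040201, 10) yields 1020408. Trailing zeros in the input are lost, since they become leading zeros in the result.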
To test the above, we need a main(), of course. I like my example programs to tell me what they do if I run them without arguments. When they work on strings or numbers, I like to provide them on the command line, rather than having the program ask for input:
int main(int argc, char *argv[])
{
const char *end;
uint64_t mac;
int arg;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s HH:HH:HH:HH:HH:HH ...\n", argv[0]);
fprintf(stderr, "\n");
fprintf(stderr, "This program parses the hexadecimal string(s),\n");
fprintf(stderr, "and outputs them in both hexadecimal and decimal.\n");
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
for (arg = 1; arg < argc; arg++) {
end = parse_mac(argv[arg], &mac);
if (!end) {
fprintf(stderr, "Cannot parse '%s'.\n", argv[arg]);
return EXIT_FAILURE;
}
if (*end)
printf("%s: 0x%012" PRIx64 " = %" PRIu64 " in decimal; '%s' unparsed.\n",
argv[arg], mac, mac, end);
else
printf("%s: 0x%012" PRIx64 " = %" PRIu64 " in decimal.\n",
argv[arg], mac, mac);
fflush(stdout);
}
return EXIT_SUCCESS;
}
The first if clause checks if there are any command-line parameters. (argv[0] is the program name itself, and is included in argc, the number of strings in argv[] array. In other words, argc == 1 means only the program name was supplied on the command line, argc == 2 means the program name and one parameter (in argv[1]) was supplied, and so on.)
Because it is often nice to supply more than one item to work on, we have a for loop over all command-line parameters; from argv[1] to argv[argc-1], inclusive. (Remember, because argc is the number of strings in the argv[] array, and numbering starts from 0, the last is argc-1. This is important to remember in C, in all array use!)
Within the for loop, we use our parse function. Because it returns a pointer to the string following the part we parsed, and we store that to end, (*end == '\0') (which is equivalent to the shorter form (!*end)) is true if the string ended there. If (*end) (equivalent to (*end != '\0')) is true, then there are additional characters in the string following the parsed part.
To output any of the integer types specified in <inttypes.h>, we must use preprocessor macros. For uint64_t, we can use "%" PRIu64 to print one in decimal; or "%" PRIx64 to print one in hexadecimal. "%012" PRIx64 means "Print a uint64_t in hexadecimal, zero-padded (on the left) to 12 digits".
Remember that in C, string literals are concatenated; "a b", "a " "b", "a" " " "b" are all equivalent. (So, the PRI?## macros all expand to strings that specify the exact conversion type. They are macros, because they vary between systems. In 64-bit Windows PRIu64 is usually "llu", but in 64-bit Linux it is "lu".)
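As a small illustration of how the PRI macros combine with ordinary format text through string literal concatenation, here is a sketch that formats the value into a buffer instead of printing it. The function name format_mac is made up for this example.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write mac as "0x" followed by 12 zero-padded
   lowercase hex digits. Returns what snprintf returns, that is, the
   length the full result needs (14 for any 48-bit value). */
int format_mac(uint64_t mac, char *buf, size_t size)
{
    /* "0x%012" and PRIx64 are adjacent string literals, so the
       compiler joins them into a single format string. */
    return snprintf(buf, size, "0x%012" PRIx64, mac);
}
```

A buffer of 15 chars is enough for a 48-bit value: two for the prefix, twelve digits, and the terminating '\0'.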
The fflush(stdout); at the end should do nothing, because standard output is by default line buffered. However, because I explicitly want the C library to ensure the output is output to standard output before next loop iteration, I added it. It would matter if one changed standard output to fully buffered. As it is, it is an "insurance" (against oddly behaving C library implementations), and a reminder to us human programmers that the intent is to have the output flushed, not cached by the C library, at that point.
(Why do we want that? Because if an error occurs during the next iteration, and we print errors to standard error, and standard output and error are both usually directed to the terminal, we want the standard output to be visible before the standard error is, to avoid user confusion.)
If you compile the above to say example (I use Linux, so I run it as ./example; in Windows, you probably run it as example.exe), you can expect the following outputs:
./example 12:34:56:07:08:09 00:00:00:00:00:00foo bad
12:34:56:07:08:09: 0x123456070809 = 20015990900745 in decimal.
00:00:00:00:00:00foo: 0x000000000000 = 0 in decimal; 'foo' unparsed.
Cannot parse 'bad'.
If you run it without parameters, or with just -h or --help, you should see
Usage: ./example [ -h | --help ]
./example HH:HH:HH:HH:HH:HH ...
This program parses the hexadecimal string(s),
and outputs them in both hexadecimal and decimal.
Obviously, there are other ways to achieve the same. If you are only interested in the string representation, you could use e.g.
#include <stdlib.h>
#include <ctype.h>
char *mac_to_hex(const char *src)
{
char *dst, *end;
int i;
if (!src)
return NULL;
/* Skip leading whitespace. */
while (isspace((unsigned char)(*src)))
src++;
/* The next two characters must be hex digits. */
if (!isxdigit((unsigned char)(src[0])) ||
!isxdigit((unsigned char)(src[1])))
return NULL;
/* Dynamically allocate memory for the result string.
"0x112233445566" + '\0' = 15 chars total. */
dst = malloc(15);
if (!dst)
return NULL;
/* Let end signify the position of the next char. */
end = dst;
/* Prefix, and the first two hex digits. */
*(end++) = '0';
*(end++) = 'x';
*(end++) = *(src++);
*(end++) = *(src++);
/* Loop over the five ":HH" parts left. */
for (i = 0; i < 5; i++) {
if (src[0] == ':' &&
isxdigit((unsigned char)(src[1])) &&
isxdigit((unsigned char)(src[2])) ) {
*(end++) = src[1];
*(end++) = src[2];
src += 3;
} else {
free(dst);
return NULL;
}
}
/* All strings need a terminating '\0' at end.
We allocated enough room for it too. */
*end = '\0';
/* Ignore trailing whitespace in source string. */
while (isspace((unsigned char)(*src)))
src++;
/* All of source string processed? */
if (*src) {
/* The source string contains more stuff; fail. */
free(dst);
return NULL;
}
/* Success! */
return dst;
}
I consider this approach much less useful, because the source string must contain exactly HH:HH:HH:HH:HH:HH (although leading and trailing whitespace is allowed). Parsing it to an unsigned integer lets you e.g. read a line, and parse all such patterns on it, with a simple loop.
If you find any bugs or issues in the above, let me know in a comment so I can verify and fix if necessary.
int main()
{
int f;
printf("Type your age");
scanf("%d", &f);
if(!isdigit(f))
{
printf("Digit");
}
else
{
printf("Is not a digit");
}
return 0;
}
No matter whether I type 6 or a, it always shows me the "Digit" message.
isdigit() should be passed a character, not the int value parsed by %d. And your if-else logic is reversed:
#include <ctype.h>
#include <stdio.h>
int main() {
char f;
printf("Type your age");
scanf("%c", &f);
if (isdigit((unsigned char) f)) {
printf("Digit");
} else {
printf("Is not a digit");
}
return 0;
}
As mentioned in the comments, this will only work for a single digit age. Validating input is a major topic under the 'C' tag, a search will reveal many approaches to more robust validation.
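%d is an integer specifier. Change int f to char f and parse as a character. You are passing the parsed number (for example 6) into isdigit rather than a character code such as '6', so the test never sees a digit character and !isdigit(f) is always true.
There's actually no need to use isdigit at all here since scanf with the %d format specifier already guarantees that the characters will be digits with an optional leading sign. There's also a separate specifier for unsigned values, %u, although it still accepts a leading sign (a negative input wraps around to a large value), so range-check the result.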
If what you input isn't of the correct format, scanf will tell you (since it returns the number of items successfully scanned).
So, for a simple solution, you can just use something like:
unsigned int age;
if (scanf("%u", &age) != 1) {
puts("Not a valid age");
return 1;
}
// Now it's a valid uint, though you may want to catch large values.
If you want robust code, you may have to put in a little more effort than a one-liner scanf("%d") - it's fine for one-time or throw-away programs but it has serious shortcomings for code intended to be used in real systems.
First, I would use the excellent string input routine in this answer(a) - it pretty much provides everything you need for prompted and checked user input.
Once you have the input as a string, strtoul allows you to do the same type of conversion as scanf but with the ability to also ensure there's no trailing rubbish on the line as well. This answer (from the same author) provides the means for doing that.
Tying that all together, you can use something like:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
// Code to robustly get input from user.
#define OK 0 // Return codes - okay.
#define NO_INPUT 1 // - no input given.
#define TOO_LONG 2 // - input was too long.
static int getLine (
char *prmpt, // The prompt to use (NULL means no prompt).
char *buff, // The buffer to populate.
size_t sz // The size of the buffer.
) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
// Code to check string is valid unsigned integer and within range.
// Returns true if it passed all checks, false otherwise.
static int validateStrAsUInt(
char *str, // String to evaluate.
unsigned int minVal, // Minimum allowed value.
unsigned int maxVal, // Maximum allowed value.
unsigned int *pResult // Address of item to take value.
) {
char *nextChar;
unsigned long retVal = strtoul (str, &nextChar, 10);
// Ensure we used the *whole* string and that it wasn't empty.
if ((nextChar == str) || (*nextChar != '\0'))
return 0;
// Ensure it's within range.
if ((retVal < minVal) || (retVal > maxVal))
return 0;
// It's okay, send it back to caller.
*pResult = retVal;
return 1;
}
// Code for testing above functions.
int main(void) {
int retCode;
unsigned int age;
char buff[20];
// Get it as string, detecting input errors.
retCode = getLine ("Enter your age> ", buff, sizeof(buff));
if (retCode == NO_INPUT) {
printf ("\nError, no input given.\n");
return 1;
}
if (retCode == TOO_LONG) {
printf ("Error, input too long [%s]\n", buff);
return 1;
}
// Check string is valid age.
if (! validateStrAsUInt(buff, 0, 150, &age)) {
printf("Not a valid age (0-150)\n");
return 1;
}
// It's okay, print and exit.
printf("Age is valid: %u\n", age);
return 0;
}
(a) I'm reliably informed the author is actually quite clever, and very good looking :-)
I would like to understand how to validate a string input and check whether the entered string is numeric. I believe the isdigit() function is the right way to do it. I am able to try it out with one char, but when it comes to a string the function isn't helping me. This is what I have so far. Could anyone please guide me to validate a full string like
char *Var1 ="12345" and char *var2 ="abcd"
#include <stdio.h>
#include <ctype.h>
int main()
{
char *var1 = "hello";
char *var2 = "12345";
if( isdigit(var1) )
{
printf("var1 = |%s| is a digit\n", var1 );
}
else
{
printf("var1 = |%s| is not a digit\n", var1 );
}
if( isdigit(var2) )
{
printf("var2 = |%s| is a digit\n", var2 );
}
else
{
printf("var2 = |%s| is not a digit\n", var2 );
}
return(0);
}
The program seems to be working fine when the variables are declared and initialized as below,
int var1 = 'h';
int var2 = '2';
But I would like to understand how to validate a full string like *var =" 12345";
Try to make a loop on each string and verify each char alone
isdigit takes a single char, not a char*. If you want to use isdigit, add a loop to do the checking. Since you are planning to use it in several places, make it into a function, like this:
int all_digits(const char* str) {
while (*str) {
if (!isdigit((unsigned char) *str++)) {
return 0;
}
}
return 1;
}
The loop above will end when null terminator of the string is reached without hitting the return statement in the middle, in other words, when all characters have passed the isdigit test.
Note that passing all_digits does not mean that the string represents a value of any supported numeric type, because the length of the string is not taken into account. Therefore, a very long string of digits would return true for all_digits, but if you try converting it to int or long long you would get an overflow.
Use this
int isNumber(const char *const text)
{
char *endptr;
if (text == NULL)
return 0;
strtol(text, &endptr, 10);
/* reject strings where nothing was converted, such as "" */
return (endptr != text && *endptr == '\0');
}
then
if (isNumber(var1) == 0)
printf("%s is NOT a number\n", var1);
else
printf("%s is number\n", var1);
The strtol() function will ignore leading whitespace characters.
If a character that cannot be converted is found, the conversion stops, and endptr will point to that character after return, thus checking for *endptr == '\0' will tell you if you are at the end of the string, meaning that all characters were successfully converted.
If you want to treat leading whitespace as invalid too, then you could write this instead
int isNumber(const char *text)
{
char *endptr;
if (text == NULL)
return 0;
/* Reject empty input and any leading whitespace before strtol() can skip it. */
if (*text == '\0' || isspace((unsigned char)*text))
return 0;
strtol(text, &endptr, 10);
return (*endptr == '\0');
}
Which one you use depends on what you need. Skipping leading whitespace interprets the number the way a human reader would, since humans "don't see" whitespace characters.
I can have strings containing random 10 digit numbers e.g.
"abcgfg1234567890gfggf" or
"fgfghgh3215556890ddf" etc
basically any combination of 10 digits plus chars together in a string, so I need to check the string to determine whether a 10 digit number is present. I use strspn but it returns 0
char str_in[] = "abcgfg1234567890gfggf";
char cset[] = "1234567890";
int result;
result = strspn(str_in, cset); // returns 0 need it to return 10
The fact that the code above returns 0 instead of 10 highlights the problem. I asked this previously but most replies were for checking against a known 10 digit number. In my case the number will be random. Any better way than strspn?
It returns 0 because there are no digits at the start of the string.
The strspn() function calculates the length (in bytes) of the
initial segment of s which consists entirely of bytes in accept.
You need to skip non-digits - strcspn - and then call strspn on the string + that offset. You could try:
/* Count chars to skip. */
skip = strcspn(str_in, cset);
/* Measure all-digit portion. */
length = strspn(str_in + skip, cset);
EDIT
I should mention this must be done in a loop. For example if your string is "abcd123abcd1234567890" the first strspn will only match 3 characters and you need to look further.
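The loop could look roughly like the sketch below. has_digit_run is a name invented for this example, and it assumes you want a run of exactly n digits (an 11-digit run does not count); change the comparison to len >= n if longer runs should match too.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if s contains a maximal run of exactly
   n decimal digits, 0 otherwise. */
int has_digit_run(const char *s, size_t n)
{
    static const char digits[] = "0123456789";
    while (*s) {
        s += strcspn(s, digits);           /* skip the non-digit stretch */
        size_t len = strspn(s, digits);    /* measure the digit run that follows */
        if (len == n)
            return 1;
        s += len;                          /* run had the wrong length: keep looking */
    }
    return 0;
}
```

For "abcd123abcd1234567890" the first pass measures the 3-digit run, moves past it, and the second pass finds the 10-digit run.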
Just use sscanf():
unsigned long long value;
const char *str_in = "abcgfg1234567890gfggf";
if(sscanf(str_in, "%*[^0-9]%llu", &value) == 1)
{
if(value >= 1000000000ull) /* At least 10 digits, ignoring leading zeros. */
{
/* do magic here */
}
}
The above relies on unsigned long long being large enough to hold a 10-digit decimal number. A 32-bit type is not, but unsigned long long is guaranteed to be at least 64 bits wide.
The %*[^0-9] conversion specifier tells sscanf() to skip a run of initial characters that are not (decimal) digits, then convert an unsigned long long (%llu) directly after that. The trailing characters are ignored. Note that %*[^0-9] must match at least one character, so the call returns 0 if the string starts with a digit.
How about using a regex?
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>
int
main(int argc, char **argv)
{
char str_in[] = "abcgfg1234567890gfggf";
int result = 0;
const char *pattern = "[0-9]{10}";
regex_t re;
char msg[256];
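/* regcomp() reports errors through its return value, not errno. */
result = regcomp(&re, pattern, REG_EXTENDED|REG_NOSUB);
if (result != 0) {
regerror(result, &re, msg, sizeof(msg));
fprintf(stderr, "regcomp failed: %s\n", msg);
return(EXIT_FAILURE);
}
result = regexec(&re, str_in, (size_t)0, NULL, 0);
if (!result) {
printf("Regex got a match.\n");
} else if (result == REG_NOMATCH) {
printf("Regex got no match.\n");
} else {
/* Call regerror() before regfree(): re must still be valid here. */
regerror(result, &re, msg, sizeof(msg));
fprintf(stderr, "Regex match failed: %s\n", msg);
regfree(&re);
return(EXIT_FAILURE);
}
regfree(&re);
return(EXIT_SUCCESS);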
}
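If you also need to know where the digits are, a variation of the regex approach can ask regexec() for the match offsets. This is only a sketch assuming a POSIX system with regex.h; find_ten_digits is a hypothetical name. Note that [0-9]{10} also matches the first 10 digits of a longer run.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: offset of the first run of 10 digits in s,
   or -1 if there is none (or the pattern fails to compile). */
long find_ten_digits(const char *s)
{
    regex_t re;
    regmatch_t m[1];
    long pos = -1;

    /* No REG_NOSUB here, because the match offsets are wanted. */
    if (regcomp(&re, "[0-9]{10}", REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, s, 1, m, 0) == 0)
        pos = (long)m[0].rm_so;     /* start offset of the whole match */
    regfree(&re);
    return pos;
}
```

For "abcgfg1234567890gfggf" the returned offset is 6, the index of the '1'.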
strspn seems handy for this, but you would have to include it in a loop and search several times. Given the specific requirements, the easiest way is probably to make your own custom function.
int find_digits (const char* str, int n);
/* Searches [str] for a sequence of [n] adjacent digits.
Returns the index of the first valid substring containing such a sequence,
otherwise returns -1.
*/
#include <ctype.h>
int find_digits (const char* str, int n)
{
int result = -1;
int substr_len = 0;
for(int i=0; str[i] != '\0'; i++)
{
if(isdigit((unsigned char)str[i]))
{
substr_len++;
}
else
{
substr_len=0;
}
if(substr_len == n)
{
result = i - n + 1; /* index where the run of n digits starts */
break;
}
}
return result;
}
(I just hacked this down here and now, not tested, but you get the idea. This is most likely the fastest algorithm for the task, that is, if performance matters at all)
Alternative use of sscanf()
(blatant variation of @unwind's answer)
const char *str_in = "abcgfg0123456789gfggf";
int n1 = 0;
int n2 = 0;
// %*[^0-9] Scan any non-digits. Do not store result.
// %n Store number of characters read so far.
// %*[0-9] Scan digits. Do not store result.
sscanf(str_in, "%*[^0-9]%n%*[0-9]%n", &n1, &n2);
if (n2 == 0) return 0;
return n2 - n1;
Counts leading 0 characters as part of digit count.
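One thing to watch with this format: %*[^0-9] has to match at least one character, so a string that begins with a digit makes sscanf() stop before either %n is stored. A possible way around that, sketched here with the made-up name first_digit_run_length, is to try the digits-first case separately. Like the snippet above, it relies on the 0-9 range syntax inside [ ], which is implementation-defined but widely supported.

```c
#include <stdio.h>

/* Hypothetical helper: length of the first run of decimal digits in s,
   or 0 if s contains no digits. */
int first_digit_run_length(const char *s)
{
    int n1 = 0;
    int n2 = 0;

    /* Case 1: the string starts with digits. */
    sscanf(s, "%*[0-9]%n", &n2);
    if (n2 > 0)
        return n2;

    /* Case 2: skip non-digits, note the position, then scan the digits. */
    sscanf(s, "%*[^0-9]%n%*[0-9]%n", &n1, &n2);
    if (n2 == 0)
        return 0;               /* the second %n was never reached */
    return n2 - n1;
}
```

The return values of sscanf() are deliberately ignored, since suppressed conversions and %n do not add to the count; the two position counters carry all the information.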
Should one wish to avoid sscanf()
char str_in[] = "abcgfg1234567890gfggf";
const char *p1 = str_in;
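while (*p1 && !isdigit((unsigned char)*p1)) p1++; /* skip non-digits */
const char *p2 = p1;
while (isdigit((unsigned char)*p2)) p2++; /* walk to the end of the digit run */
int result = (int)(p2 - p1); /* length of the first digit run, 0 if none */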
To test for a known digit sequence such as "1234567890" inside a string, you can do something like this:
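#include <string.h>
int main()
{
char str_in[] = "abcgfg1234567890gfggf";
char cset[] = "1234567890";
size_t i;
size_t f;
i = 0;
f = 0;
while (str_in[i])
{
if (str_in[i] == cset[f])
{
f++;
if (f == strlen(cset))
return (int)f; /* whole sequence found */
}
else
{
i -= f; /* go back to where this attempt began; i++ below moves one past it */
f = 0;
}
i++;
}
return 0; /* sequence not found */
}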