Check if a string (char *) can be enumerated - c

I'm trying to find a way to check if a read-in char * can be represented as a number (with or without decimal places) or not. Essentially I am asking, if I have a text file containing the following:
-1.9e-3
e9
1e9
1ee9
-1-.9e3
.9e3
.9.e3
It would be able to recognize that line 1, line 3, and line 6 can be "enumerated" into valid numbers, whereas all of the other lines contain erroneous inputs. I know this could be done with brute force, but there is potentially an unlimited number of possibilities that could be wrong. It would be much easier if there was a function that read in the entire char * and can just say, "Yes that string of characters can be represented as an actual number" or "No that string of characters cannot be turned into the number that it intends to be."
And by enumerated I mean that the string (char *) can be the number that it wishes to represent.

Just use strtold(), it will tell you if it succeeds, and also give you the converted number.

Just try to convert it with strtold() and use the possibilities to check for errors, e.g.
char *x = "-1.9e-3";
errno = 0;
char *endptr;
long double xnum = strtold(x, &endptr);
if (endptr == x || *endptr) // nothing converted, or extra characters (also allow *endptr == '\n' if you read with `fgets()`)
{
// extra / invalid characters;
}
else if (errno == ERANGE)
{
// out of range;
}
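The checks above can be wrapped into one yes/no helper. This is a minimal sketch, not a definitive implementation: the name is_number() is made up for illustration, and it assumes that leading white space, "inf", "nan" and hexadecimal forms are acceptable, because strtold() accepts all of them.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if the whole string s is a numeral that strtold() accepts
   and that fits in a long double, 0 otherwise. (Hypothetical helper.) */
int is_number(const char *s)
{
    char *end;

    errno = 0;
    (void)strtold(s, &end);   /* only the validity is of interest here */

    if (end == s)             /* nothing was converted, e.g. "e9" or "" */
        return 0;
    if (*end != '\0')         /* leftover characters, e.g. "1ee9" stops at "ee9" */
        return 0;
    if (errno == ERANGE)      /* overflow (and, on some systems, underflow) */
        return 0;
    return 1;
}
```

With the seven lines from the question, this accepts "-1.9e-3", "1e9" and ".9e3" and rejects the other four.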

Related

How to check if input is numeric(float) or it is some character?

I was asked to write a program to find sum of two inputs in my college so I should first check whether the input is valid.
For example, if I input 2534.11s35 the program should detect that it is not a valid input for this program because of s in the input.
to check input is numeric(float)
1) Take input as a string, char buf[Big_Enough]. I'd expect 160 digits will handle all but the most arcane "float" strings [1].
#define N 160
char buf[N];
if (fgets(buf, sizeof buf, stdin)) {
2) Apply strtof() for float (strtod() for double, strtold() for long double).
char *endptr;
errno = 0;
float d = strtof(buf, &endptr);
// endptr now points to the end of the conversion, if any.
3) Check results.
if (buf == endptr) return "No_Conversion";
// Recommend to tolerate trailing white-space.
// as leading white-spaces are already allowed by `strtof()`
while (isspace((unsigned char)*endptr)) {
endptr++;
}
if (*endptr) return "TrailingJunkFound";
return "Success";
4) Tests for extremes, if desired.
At this point, the input is numeric. The question remains whether the "finite string" can be well represented by a finite float: whether the |result| is 0 or in the range [FLT_TRUE_MIN...FLT_MAX].
This involves looking at errno.
The conversion "succeeds", yet finite string values outside the float range become HUGE_VALF, which may be infinity or FLT_MAX.
Wee |values| close to 0.0, but not 0.0, become something in the range [0.0 ... FLT_MIN].
Since the goal is to detect whether a conversion succeeded (it did), I'll leave these details for a question that wants to get into the gory bits of what value.
An alternative is to use fscanf() to directly read and convert, yet the error handling there has its troubles too and is hard to control portably.
[1] Typical float range is +/- 10^38. So allowing for 40 or so characters makes sense. An exact print of FLT_TRUE_MIN can take ~150 characters. To distinguish an arbitrary "float" string for FLT_TRUE_MIN from the next larger one needs about that many digits.
If "float" strings are not arbitrary, but only come from the output of a printed float, then far fewer digits are needed - about 40.
Of course it is wise to allow for extra leading/trailing spaces and zeros.
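For step 4, a rough sketch of how the range test could be layered on top of the earlier checks. The enum and the name classify_float() are invented for this example. Note that whether errno is set to ERANGE on underflow is implementation-defined, so the underflow branch may never trigger on some platforms.

```c
#include <ctype.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

enum float_status {
    FS_OK, FS_NO_CONVERSION, FS_TRAILING_JUNK, FS_OVERFLOW, FS_UNDERFLOW
};

/* Convert s to float and report what happened. (Hypothetical helper.)
   *out is written only when FS_OK is returned. */
enum float_status classify_float(const char *s, float *out)
{
    char *end;

    errno = 0;
    float f = strtof(s, &end);
    int err = errno;                 /* save before calling anything else */

    if (end == s)
        return FS_NO_CONVERSION;
    while (isspace((unsigned char)*end))   /* tolerate trailing white space */
        end++;
    if (*end != '\0')
        return FS_TRAILING_JUNK;
    if (err == ERANGE)               /* too large, or too close to zero */
        return (f == HUGE_VALF || f == -HUGE_VALF) ? FS_OVERFLOW
                                                   : FS_UNDERFLOW;
    *out = f;
    return FS_OK;
}
```

For example, "2534.11s35" yields FS_TRAILING_JUNK and "1e100" yields FS_OVERFLOW, since 1e100 is outside the float range.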
You need to take the input as a string and then, make use of strtod() to parse the input.
Regarding the return values, from the man page:
double strtod(const char *nptr, char **endptr);
These functions return the converted value, if any.
If endptr is not NULL, a pointer to the character after the last character used in the conversion is stored in the location referenced by endptr.
If no conversion is performed, zero is returned and the value of nptr is stored in the location referenced by endptr.
Getting to the point of detection of errors, couple of points:
Ensure the errno is set to 0 before the call and it still is 0 after the call.
The return value is not HUGE_VAL.
endptr is not equal to nptr (otherwise no conversion has been performed), and the character it points to, *endptr, is the null terminator (otherwise some trailing characters were left unconverted).
The above checks, combined together, will ensure a successful conversion.
In your case, the last point is essential: if there is an invalid character present in the input, *endptr would not be the null character; endptr would instead hold the address of that (first) invalid character in the input.
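#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(void){
char num1[15];
float number1;
int dot_check1=0,check=0,i;
printf("enter the numbers :\n");
/* gets() is unsafe and was removed in C11; fgets() bounds the read */
if(!fgets(num1,sizeof num1,stdin)) return 1;
num1[strcspn(num1,"\n")]='\0'; /* strip the trailing newline */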
i=0;
while(num1[i]){
if(num1[i]>'/' && num1[i]<':')
;
else { if(dot_check1==0){
if(num1[i]=='.')
dot_check1=1;
else {
check=1;
break;
}
}
else {
check=1;
break;
}
}
i++;
}
if(check){
printf("please check the number you have entered");
}
else{
number1=atof(num1);
printf("you entered number is %f",number1);
}
}
Here is untested code to check whether a string meets the requested specification.
#include <ctype.h>
/* IsFloatNumeral returns true (1) if the string pointed to by p contains a
valid numeral and false (0) otherwise. A valid numeral:
Starts with optional white space.
Has an optional hyphen as a minus sign.
Contains either digits, a period followed by digits, or both.
Ends with optional white space.
Notes:
It is unusual not to accept "3." for a float literal, but this was
specified in a comment, so the code here is written for that.
The question does not state that leading or trailing white space
should be accepted (and ignored), but that is included here. To
exclude such white space, simply delete the relevant lines.
*/
_Bool IsFloatNumeral(const char *p)
{
_Bool ThereAreInitialDigits = 0;
_Bool ThereIsAPeriod = 0;
// Skip initial spaces. (Not specified in question; remove if undesired.)
while (isspace(*p))
++p;
// Allow an initial hyphen as a minus sign.
if (*p == '-')
++p;
// Allow initial digits.
if (isdigit(*p))
{
ThereAreInitialDigits = 1;
do
++p;
while (isdigit(*p));
}
// Allow a period followed by digits. Require at least one digit to follow the period.
if (*p == '.')
{
++p;
if (!isdigit(*p))
return 0;
ThereIsAPeriod = 1;
do
++p;
while (isdigit(*p));
}
/* If we did not see either digits or a period followed by digits,
reject the string (return 0).
*/
if (!ThereAreInitialDigits && !ThereIsAPeriod)
return 0;
// Skip trailing spaces. (Not specified in question; remove if undesired.)
while (isspace(*p))
++p;
/* If we are now at the end of the string (the null terminating
character), accept the string (return 1). Otherwise, reject it (return
0).
*/
return *p == 0;
}

Good C Coding practice/etiquette for integrity checks

Two part question;
I'm Coming from a high level Language, so this is a question about form not function;
I've written an isnumeric() function that takes a char[] and returns 1 if the string is a number taking advantage of the isdigit() function in ctype. Similar functions are builtin to other languages and I have always used something like that to integrity check the data before converting it to a numeric type. Mostly because some languages conversion functions fail badly if you try to convert a non-number string to an integer.
But it seems like a kludge having to do all that looping to compensate for the lack of strings in C, which poses the first part of the question;
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()? The way atoi() (and other ascii to xx functions) works seems to lend itself well to eliminating the integrity check altogether. It would certainly seem more efficient to just skip the check.
The second part of the question is;
Is there a C function or common library function for a numeric integrity check
on a string? (by string, I of course mean char[])
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()?
Never ever trap on error unless the error indicates a programming error that can't happen if there isn't a bug in the code. Always return some sort of error result in case of an error. Look at the OpenBSD strtonum function for how you could design such an interface.
The second part of the question is; Is there a C function or common library function for a numeric integrity check on a string? (by string, I of course mean char[])
Never use atoi unless you are writing a program without error checking, as atoi doesn't do any error checking. The strtol family of functions allows you to check for errors. Here is a simple example of how you could use them:
int check_is_number(const char *buf)
{
char *endptr; /* strtol expects a char **, so this must not be const char * */
int errsave = errno, errval;
long result;
errno = 0;
result = strtol(buf, &endptr, 0);
errval = errno;
errno = errsave;
if (errval != 0)
return 0; /* an error occurred */
if (buf[0] == '\0' || *endptr != '\0')
return 0; /* not a number */
return 1;
}
See the manual page linked before for how the third argument to strtol (base) affects what it does.
errno is set to ERANGE if the value is out of range for the desired type (i.e. long). In this case, the return value is LONG_MAX or LONG_MIN.
If the conversion method returns an error indication (as distinct from going bananas if an error occurs, or not providing a definitive means to check if an error has occurred) then there is actually no need to check if a string is numeric before trying to convert it.
With that in mind, atoi() is not a particularly good function to use if you need to check for errors on conversion. Zero will be returned for zero input, as well as for an error, and there is no way to check which. A better function to use (assuming you want to read an integral value) is strtol(). Although strtol() also returns zero on error, it returns additional information that can be used to check for failure. For example:
long x;
char *end;
x = strtol(your_string, &end, 10);
if (end == your_string)
{
/* nothing was read due to invalid character or the first
character marked the end of string */
}
else if (*end != '\0')
{
/* an integral value was read, but there is following non-numeric data */
}
Second, there are alternatives to using strtol(), albeit involving more overhead. The return values from sscanf() (and, in fact, all functions in the scanf() family) can be checked for error conditions.
There is no standard function for checking if a string is numeric, but it can be easily rolled using the above.
int IsNumeric(char *your_string)
{
/* This has undefined behaviour if your_string is not a C-style string
It also deems that a string like "123AB" is non-numeric
*/
long x;
char *end;
x = strtol(your_string, &end, 10);
return !(end == your_string || *end != '\0');
}
No (explicit) loops in any of the above options.
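The sscanf() alternative mentioned above could look roughly like this. It is only a sketch: the function name is made up, "%n" is used to learn how many characters were consumed, and it inherits a weakness of the scanf() family, namely that the behaviour is undefined if the number does not fit in a long.

```c
#include <stdio.h>

/* Returns 1 if s is a decimal integer with nothing but optional white
   space around it, 0 otherwise. (Hypothetical helper; the behaviour for
   values that overflow a long is undefined with the scanf() family.) */
int is_long_sscanf(const char *s)
{
    long value;
    int consumed = 0;

    /* "%ld" skips leading white space and reads the number, the space in
       the format skips any trailing white space, and "%n" records how
       many characters were consumed so far. */
    if (sscanf(s, "%ld %n", &value, &consumed) != 1)
        return 0;
    return s[consumed] == '\0';
}
```

Like the strtol() version, this treats "123AB" as non-numeric, because consumed stops at the 'A'.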
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()?
No. @FUZxxl answers that well.
Is there a C function or common library function for a numeric integrity check on a string?
In C, the conversion of a string to a number and the check to see if the conversion is valid is usually done together. The function used depends on the type of number sought. "1.23" would make sense for a floating point type, but not an integer.
// No error handle functions
int atoi(const char *nptr);
long atol(const char *nptr);
long long atoll(const char *nptr);
double atof(const char *nptr);
// Some error detection functions
if (sscanf(buffer, "%d", &some_int) == 1) ...
if (sscanf(buffer, "%lf", &some_double) == 1) ...
// Robust methods use
long strtol( const char *nptr, char ** endptr, int base);
long long strtoll( const char *nptr, char ** endptr, int base);
unsigned long strtoul( const char *nptr, char ** endptr, int base);
unsigned long long strtoull( const char *nptr, char ** endptr, int base);
intmax_t strtoimax(const char *nptr, char ** endptr, int base);
uintmax_t strtoumax(const char *nptr, char ** endptr, int base);
float strtof( const char *nptr, char ** endptr);
double strtod( const char *nptr, char ** endptr);
long double strtold( const char *nptr, char ** endptr);
These robust methods use char ** endptr to store the string location where scanning stopped. If no numeric data was found, then *endptr == nptr. So a common test is
char *endptr;
y = strto...(buffer, &endptr, ...); // a base argument follows only for the integer functions
if (buffer == endptr) puts("No conversion");
if (*endptr != '\0') puts("Extra text");
If the range was exceeded, these functions all set the global variable errno = ERANGE and return a minimum or maximum value for the type.
errno = 0;
double y = strtod("1.23e10000000", &endptr);
if (errno == ERANGE) puts("Range exceeded");
The integer functions allow a radix selection from base 2 to 36. If 0 is used, the leading part of the string "0x", "0X", "0", other --> base 16, 16, 8, 10.
long y = strtol(buffer, &endptr, 10);
Read the specification or help page for more details.
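To make the base-0 rule concrete, here is a small sketch. The wrapper name parse_auto_base() is an assumption of this example; only strtol() itself is standard.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse s with base 0, so that a "0x"/"0X" prefix selects hexadecimal,
   a leading "0" selects octal, and anything else is decimal.
   Returns 1 and stores the value on success, 0 otherwise.
   (Hypothetical helper.) */
int parse_auto_base(const char *s, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 0);
    if (end == s || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}
```

One consequence worth knowing: with base 0, "017" is fifteen, and "08" is rejected by this wrapper because the leading 0 selects octal and 8 is not an octal digit. Pass 10 as the base if user input should always be read as decimal.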
You probably don't need a function to check whether a string is numeric. You will most likely need to convert the string to a number, so just do that. Then check if the conversion is successful.
long number;
char *end;
number = strtol(string, &end, 10);
if ((*string == '\0') || (*end != '\0'))
{
// empty string or invalid number
}
the second argument of strtol is used to indicate where the parsing ended (the first non-numeric character). That character will be \0 if we've reached the end of the string. If you want to permit other characters after the number (like ), you can use switch to check for it.
strtol works with long integers. If you need some other type, you should consult the man page: man 3 strtol. For floating-point numbers you can use strtod.
Don't trap if the program logic permits that the string is not numeric (e.g. if it comes from the user or a file).
OP later commented:
I'm looking for a way to determine if the string contains ONLY base 10 digits or a decimal or a comma. So if the string is 100,000.01 I want a positive return from func. Any other ascii characters anywhere in the string would result in a negative return value.
If that is all you are interested in, use:
if (buffer[strspn(buffer, "0123456789.,")] == '\0') return 0; // Success
else return -1; // Failure

C language reading columnated text file

First of all let me ask for your forgiveness if this is too trivial, I am not a C developer, usually I program in Fortran.
I need to read some columnated text files. The problem I have is that some columns can be blank (value not filled in) or hold a field that is not fully filled.
Let me use a short example of the problem. Lets say I have a generator program like:
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("xxxx%4d%4.2f\n",99,3.14);
}
When I execute this program I get:
$ ./t1
xxxx  993.14
If I get it into a text file and try to read using (e.g.) sscanf with the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *fmt = "%*4c%4d%4f";
char *line = "xxxx  993.14";
int ival;
float fval;
sscanf(line,fmt,&ival,&fval);
printf(">>>>%d|%f\n",ival,fval);
}
The result is:
$ ./t2
>>>>993|0.140000
What is the problem here? sscanf seems to think that all white space is meaningless and should be discarded. So the "%*4c" does what it is meant to do: it counts 4 characters without skipping any blank space, and discards them because of the "*". Next, the %4d starts by skipping all blank spaces and only starts counting the 4 characters of the field upon finding the first valid character for the conversion. So the value meant to be 99 becomes 993, and the 3.14 becomes 0.14.
In Fortran the reading code would be:
program t3
implicit none
integer :: ival
real :: fval
character(len=30) :: fmt="(4x,i4,f4.0)"
character(len=30) :: line="xxxx  993.14"
read(line,fmt) ival, fval
write(*,"('>>>>',i4,'|',f4.2)") ival,fval
end program t3
and the result would be:
$ ./t3
>>>>  99|3.14
That is, the format specification states the field width and nothing is discarded in conversion, except if instructed to by the "nX" specification.
Some final remarks to help the helpers:
The format to be read is an international standard and there is no
way to change it.
The number of existing files is too big to think of intervention or
format change.
It is not a CSV or similar format.
The code has to be in C for integration in a free software package.
Sorry to be too long, trying to state the problem as completely as possible.
The question is: Is there a way to tell sscanf not to skip the blank spaces? If not, is there a simple way to do it in C, or will it be necessary to write a specialized parser for each record type?
Thank you in advance.
When reading fixed-length fields with sscanf, it is best to parse the values as character strings (which you could do a number of ways), and then perform independent conversion of each of the fields. This allows you to handle conversion/error detection on a per-field basis. For example, you could use a format string of:
char *fmt = "%*4s%2[^0-9]%s";
which would read/discard the 4 leading characters, then read 2-chars as your integer, followed by the remainder of line (or up until the next whitespace) as a string containing your float value.
To handle the storage and parsing of line as fixed length fields, you could use temporary character arrays to hold each of the strings and then use sscanf to fill them much as you have attempted to do with the integer and float directly. e.g.:
char istr[8] = {0};
char fstr[16] = {0};
...
sscanf (line,fmt,istr,fstr);
(note: you could use minimum storage of istr[3] and fstr[7] in this given case, adjust the storage length as required, but providing space for the nul-terminating character)
You can then use strtol and strtof to provide conversion with error checking on each value. For example:
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
and
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
Putting all the pieces together in your example, you could use something like:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main() {
char *fmt = "%*4s%2[^0-9]%s";
char *line = "xxxx  993.14";
char istr[8] = {0};
char fstr[16] = {0};
int ival;
float fval;
sscanf (line,fmt,istr,fstr);
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
printf(">>>>%d|%6.2f\n",ival,fval);
return 0;
}
Example/Output
$ >>>>0|993.14
*scanf() is not designed to handle fixed column width with non-intervening white-space.
With sscanf(), to not skip spaces, code must use "%c", "%n", "%[]" as all other specifiers skip leading white-space and those skipped characters do not contribute to a width limit.
To scan the printed line, which is now in buffer, take advantage of the fact that the only use of '\n' is at the end of the line.
char str_int[5];
char str_float[5];
int n = 0;
sscanf(buffer, "%*4c%4[^\n]%4[^\n]%n", str_int, str_float, &n);
if (n != 12 || buffer[n] != '\n') Fail();
// Now convert str_int, str_float as needed.
Another way to use sscanf() would be to parse buffer as
int ival;
float fval;
if (strlen(buffer) != 13) Fail();
if (sscanf(&buffer[8], "%f", &fval) != 1) Fail();
buffer[8] = '\0';
if (sscanf(&buffer[4], "%d", &ival) != 1) Fail();
Note: The 4s in the format below do not specify an output width of exactly 4 characters; 4 is the minimum width to print.
printf("xxxx%4d%4.2f\n",ival, fval);
Code could use the following to detect problems.
if (13 != printf("xxxx%4d%4.2f\n",ival, fval)) Fail();
Watch out for
printf("xxxx%4d%4.2f\n",123, 9.995000001f); // "xxxx 12310.00\n"
First off, I dunno. There might be some way to wrangle sscanf to recognize the whitespace towards your integer count. But I just don't think scanf was made for this sort of format in mind. The tool's trying to be smart of helpful and it's biting you in the ass.
But if it's columnated data and you know the position of the various fields, there's a really easy work around. Just extract the field you want.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
char line[] = "xxxx  893.14";
char tmp[100];
int thatDamnNumber;
float myfloatykins;
//Get that field
memcpy(tmp, line+4, 4);
tmp[4] = '\0'; //memcpy does not nul-terminate, and sscanf needs a string
sscanf(tmp, "%d", &thatDamnNumber);
//Kill that field so it doesn't goober-up the float
memset(line+4, ' ', 4);
sscanf(line, "%*4c%f", &myfloatykins);
printf("%d %f\n", thatDamnNumber, myfloatykins);
return 0;
}
If there is a lot of this, you could make some generalized functions: integerExtract(int positionStart, int sizeInCharacters), floatExtract(), etc.
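Such generalized extractors might be sketched as follows. All names here (extract_long, extract_double, the field_status enum) are invented for illustration, columns are counted from 0, and an all-blank field is reported separately so the caller can decide what a missing value means. Blanks inside a number are treated as an error, which may differ from what Fortran does.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum field_status { FIELD_OK, FIELD_BLANK, FIELD_BAD };

/* Copy the `width` characters starting at column `start` (0-based) into
   buf and NUL-terminate. Returns 0 if the line is too short or buf is
   too small. */
static int copy_field(const char *line, size_t start, size_t width,
                      char *buf, size_t bufsize)
{
    if (width + 1 > bufsize || strlen(line) < start + width)
        return 0;
    memcpy(buf, line + start, width);
    buf[width] = '\0';
    return 1;
}

/* True if s holds nothing but white space up to its terminator. */
static int only_space(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return *s == '\0';
}

/* Extract an integer field. (Hypothetical helper.) */
enum field_status extract_long(const char *line, size_t start,
                               size_t width, long *out)
{
    char buf[32], *end;

    if (!copy_field(line, start, width, buf, sizeof buf))
        return FIELD_BAD;
    if (only_space(buf))
        return FIELD_BLANK;
    errno = 0;
    long v = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE || !only_space(end))
        return FIELD_BAD;
    *out = v;
    return FIELD_OK;
}

/* Extract a floating-point field. (Hypothetical helper.) */
enum field_status extract_double(const char *line, size_t start,
                                 size_t width, double *out)
{
    char buf[32], *end;

    if (!copy_field(line, start, width, buf, sizeof buf))
        return FIELD_BAD;
    if (only_space(buf))
        return FIELD_BLANK;
    errno = 0;
    double v = strtod(buf, &end);
    if (end == buf || errno == ERANGE || !only_space(end))
        return FIELD_BAD;
    *out = v;
    return FIELD_OK;
}
```

For the line from the question, extract_long(line, 4, 4, &i) would give 99 and extract_double(line, 8, 4, &d) would give 3.14, because each conversion only ever sees its own four columns.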
If each element is of fixed width you don't really need scanf(), try this
char copy[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
copy[0] = line[4];
copy[1] = line[5];
copy[2] = line[6];
copy[3] = line[7];
copy[4] = '\0'; // nul terminate for `atoi' to work
ival = atoi(copy);
fval = atof(&line[8]);
fprintf(stdout, "%d -- %f\n", ival, fval);
If you want (probably should) you can use strtol() instead of atoi() and strtof() instead of atof() to check for malformed data.
Both these functions take a parameter to store the unconverted/invalid characters, you can check the passed pointer in order to verify that there was a problem with conversion.
Or if you really want scanf() do the same, capture the integer + whitespaces to a char array and then convert it to int later, like this
char integer[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
if (sscanf(line, "%*4c%4[0-9 ]%f", integer, &fval) != 2)
return -1;
ival = atoi(integer);
fprintf(stdout, "%d -- %f\n", ival, fval);
The format "%*4c%4[0-9 ]%f" will
Skip the first four characters including white spaces.
Scan the next four characters if they consist only of digits or white spaces.
Scan the rest of the input string searching for a matching float value.
I am posting what I think is a final conclusion from the answers I have got so far and from other sources.
What is a very trivial task in Fortran is not so trivial a task in other languages. I guess (not sure) that the same task could be as easy as in Fortran in some other languages. I think that in Cobol, Pascal, PL/I and others from the time of punched cards it could probably be trivial.
I think that most languages nowadays are more comfortable with different data structure and inherited its I/O structure from C. I think that Java, Python, Perl(?) and others could serve as examples.
From what I saw in this thread there are two main problems to read / convert fixed column length text data with C.
The first problem is that, as Philip said in his answer: “The tool’s trying to be smart of helpful and it’s biting you in the ass.” Quite right! The point is that it seems that C text I/O thinks that “white space” is something like a NULL character and should be thrown away, completely disregarding any information of the start of field. The only exception to that seems to be the %nc that get exactly n chars, even blanks.
The second problem is that the conversion “tag” (how is that called?) %nf will keep converting while it finds a valid character, even if you say stop at the 4th character.
If we join those two problems with a field completely filled with white space, depending on the conversion tool used, it throws an error or keeps going madly looking for something meaningful.
At the end of the day, it seems that the only way is to extract the field to another memory area, dynamically allocated or not (we can have an area for each column length), and try to parse this separate area, taking into account the possibility of a field that is entirely white space in order to catch the error.

Using strtol to validate integer input in ANSI C

I am new to programming and to C in general and am currently studying it at university. This is for an assignment so I would like to avoid direct answers but are more after tips or hints/pushes in the right direction.
I am trying to use strtol to validate my keyboard input, more specifically, test whether the input is numeric. I have looked over other questions on here and other sites and I have followed instructions given to other users but it hasn't helped me.
From what I have read/ understand of strtol (long int strtol (const char* str, char** endptr, int base);) if the endptr is not a null pointer the function will set the value of the endptr to the first character after the number.
So if I were to enter 84948ldfk, the endptr would point to 'l', telling me there are characters other than numbers in the input, which would make it invalid.
However in my case, what is happening, is that no matter what I enter, my program is returning an Invalid input. Here is my code:
void run_perf_square(int *option_stats)
{
char input[MAX_NUM_INPUT + EXTRA_SPACES]; /*MAX_NUM_INPUT + EXTRA_SPACES are defined
*in header file. MAX_NUM_INPUT = 7
*and EXTRA_SPACES
*(for '\n' and '\0') = 2. */
char *ptr;
unsigned num=0; /*num is unsigned as it was specified in the start up code for the
*assignment. I am not allow to change it*/
printf("Perfect Square\n");
printf("--------------\n");
printf("Enter a positive integer (1 - 1000000):\n");
if(fgets(input, sizeof input, stdin) != NULL)
{
num=strtol(input, &ptr, 10);
if( num > 1000001)
{
printf("Invalid Input! PLease enter a positive integer between 1
and 1000000\n");
read_rest_of_line(); /*clears buffer to avoid overflows*/
run_perf_square(option_stats);
}
else if (num <= 0)
{
printf("Invalid Input! PLease enter a positive integer between 1
and 1000000\n");
run_perf_square(option_stats);
}
else if(ptr != NULL)
{
printf("Invalid Input! PLease enter a positive integer between 1
and 1000000\n");
run_perf_square(option_stats);
}
else
{
perfect_squares(option_stats, num);
}
}
}
Can anyone help me in the right direction? Obviously the error is with my if(ptr != NULL) condition, but as I understand it seems right. As I said, I have looked at previous questions similar to this and took the advice in the answers but it doesn't seem to work for me. Hence, I thought it best to ask for my help tailored to my own situation.
Thanks in advance!
You're checking the outcome of strtol in the wrong order: check ptr first. Also, don't check ptr against NULL; dereference it and check that it points to the NUL ('\0') string terminator.
if (*ptr == '\0') {
// this means all characters were parsed and converted to `long`
}
else {
// this means either no characters were parsed correctly in which
// case the return value is completely invalid
// or
// there was a partial parsing of a number to `long` which is returned
// and ptr points to the remaining string
}
num > 1000001 also needs to be num > 1000000
num < 0 also needs to be num < 1
You can also with some reorganising and logic tweaks collapse your sequence of if statements down to only
a single invalid branch and a okay branch.
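For readers who do want to see the pieces together, here is one possible shape, written as a sketch rather than as the assignment's answer. The name parse_in_range() is invented. Two facts drive it: strtol() always stores a non-NULL pointer through endptr, so a test against NULL can never detect bad input, and fgets() keeps the '\n', so the character after the digits is normally a newline rather than '\0'.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Validate a line as read by fgets(): a decimal integer in [lo, hi],
   optionally surrounded by white space (including the trailing '\n').
   Returns 1 and stores the value on success, 0 otherwise.
   (Hypothetical helper.) */
int parse_in_range(const char *line, long lo, long hi, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(line, &end, 10);   /* keep the result signed */

    if (end == line)                   /* no digits at all */
        return 0;
    while (isspace((unsigned char)*end))   /* skip the '\n' from fgets() */
        end++;
    if (*end != '\0')                  /* e.g. "84948ldfk" */
        return 0;
    if (errno == ERANGE || v < lo || v > hi)
        return 0;
    *out = v;
    return 1;
}
```

Keeping the result in a long until the range test has passed also sidesteps the problem that an unsigned variable can never compare below zero.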
OP would like to avoid direct answers ....
validate integer input
Separate I/O from validation - 2 different functions.
I/O: Assume hostile input. (Text, too much text, too little text. I/O errors.) Do you want to consume leading spaces as part of I/O? Do you want to consume leading 0 as part of I/O? (suggest not)
Validate the string (NULL, lead space OK?, digits after a trailing space, too short, too long, under-range, over-range, Is 123.0 an OK integer)
strtol() is your friend to do the heavy conversion lifting. Check how errno should be set and tested afterward. Use the endptr. Should its value be set before? How to test it afterward? It consumes leading spaces; is that OK? It converts text to a long, but OP wants the nebulous "integer".
Qapla'
The function strtol returns long int, which is a signed value. I suggest that you use another variable (entry_num), which you could test for <0, thus detecting negative numbers.
I would also suggest that regex could test string input for digits and valid input, or you could use strtok and anything but digits as the delimiter ;-) Or you could scan the input string using validation, something like:
int validate_input ( char* input )
{
char *p = input;
if( !input ) return 0;
for( p=input; *p && (isdigit((unsigned char)*p) || isspace((unsigned char)*p)); ++p )
{
}
if( *p ) return 0;
return 1;
}

Best way to do binary arithmetic in C?

I am learning C and writing a simple program that will take 2 string values assumed to each be binary numbers and perform an arithmetic operation according to user selection:
Add the two values,
Subtract input 2 from input 1, or
Multiply the two values.
My implementation assumes each character in the string is a binary bit, e.g. char bin5[] = "0101";, but it seems too naive an approach to parse through the string a character at a time. Ideally, I would want to work with the binary values directly.
What is the most efficient way to do this in C? Is there a better way to treat the input as binary values rather than scanf() and get each bit from the string?
I did some research but I didn't find any approach that was obviously better from the perspective of a beginner. Any suggestions would be appreciated!
Advice:
There's not much that's obviously better than marching through the string a character at a time and making sure the user entered only ones and zeros. Keep in mind that even though you could write a really fast assembly routine if you assume everything is 1 or 0, you don't really want to do that. The user could enter anything, and you'd like to be able to tell them if they screwed up or not.
It's true that this seems mind-bogglingly slow compared to the couple cycles it probably takes to add the actual numbers, but does it really matter if you get your answer in a nanosecond or a millisecond? Humans can only detect 30 milliseconds of latency anyway.
Finally, it already takes far longer to get input from the user and write output to the screen than it does to parse the string or add the numbers, so your algorithm is hardly the bottleneck here. Save your fancy optimizations for things that are actually computationally intensive :-).
What you should focus on here is making the task less manpower-intensive. And, it turns out someone already did that for you.
Solution:
Take a look at the strtol() manpage:
long strtol(const char *nptr, char **endptr, int base);
This will let you convert a string (nptr) in any base to a long. It checks errors, too. Sample usage for converting a binary string:
#include <stdlib.h>
char buf[MAX_BUF];
get_some_input(buf);
char *err;
long number = strtol(buf, &err, 2);
if (*err) {
// bad input: try again?
} else {
// number is now a long converted from a valid binary string.
}
Supplying base 2 tells strtol to convert binary literals.
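To round this out, a sketch of the whole round trip: parse both operands with strtol() in base 2, do the arithmetic on ordinary integers, and turn the result back into a string of bits. The helper names parse_binary() and to_binary() are made up for this example, and overflow of the arithmetic itself is not checked.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a non-empty string of '0'/'1' characters into *out.
   Returns 1 on success, 0 on bad input. (Hypothetical helper.) */
int parse_binary(const char *s, long *out)
{
    char *end;

    if (*s != '0' && *s != '1')   /* reject "", signs and leading spaces */
        return 0;
    errno = 0;
    long v = strtol(s, &end, 2);
    if (*end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}

/* Write value as binary digits into buf, which must hold at least
   sizeof(unsigned long) * CHAR_BIT + 1 bytes. (Hypothetical helper.) */
void to_binary(unsigned long value, char *buf)
{
    char tmp[sizeof value * CHAR_BIT];
    size_t n = 0;

    do {                          /* collect bits, least significant first */
        tmp[n++] = (char)('0' + (value & 1u));
        value >>= 1;
    } while (value != 0);
    while (n > 0)                 /* reverse into the output buffer */
        *buf++ = tmp[--n];
    *buf = '\0';
}
```

With inputs "0101" and "0011", parse_binary() gives 5 and 3, and to_binary() of their sum produces "1000".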
First off, I do recommend that you use stuff like strtol as recommended by tgamblin;
it's better to use the things that the library gives you instead of reinventing the wheel over and over again.
But since you are learning C, I did a little version without strtol.
It's neither fast nor safe, but I did play a little with the bit manipulation as an example.
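#include <stdio.h>
#include <string.h>
int main()
{
unsigned int data = 0;
int i = 0;
char str[] = "1001";
size_t n = strlen(str);
// Walk from the last character (least significant bit) toward the first,
// stopping at the start of the string or at the first non-binary character.
while(n > 0 && (str[n-1] == '0' || str[n-1] == '1'))
{
data += (unsigned int)(str[n-1] - '0') << i;
i++;
n--;
}
printf("data %u\n", data);
return 0;
}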
In order to get the best performance, you need to distinguish between trusted and untrusted input to your functions.
For example, a function like getBinNum() which accepts input from the user should be checked for valid characters and compressed to remove leading zeroes. First, we'll show a general purpose in-place compression function:
// General purpose compression removes leading zeroes.
void compBinNum (char *num) {
char *src, *dst;
// Find first non-'0' and move chars if there are leading '0' chars.
for (src = dst = num; *src == '0'; src++);
if (src != dst) {
while (*src != '\0')
*dst++ = *src++;
*dst = '\0';
}
// Make zero if we removed the last zero.
if (*num == '\0')
strcpy (num, "0");
}
Then provide a checker function that returns either the passed in value, or NULL if it was invalid:
// Check untested number, return NULL if bad.
char *checkBinNum (char *num) {
char *ptr;
// Check for valid number.
for (ptr = num; *ptr != '\0'; ptr++)
if ((*ptr != '1') && (*ptr != '0'))
return NULL;
return num;
}
Then the input function itself:
#define MAXBIN 256
// Get number from (untrusted) user, return NULL if bad.
char *getBinNum (char *prompt) {
char *num, *ptr;
// Allocate space for the number.
if ((num = malloc (MAXBIN)) == NULL)
return NULL;
// Get the number from the user.
printf ("%s: ", prompt);
if (fgets (num, MAXBIN, stdin) == NULL) {
free (num);
return NULL;
}
// Remove newline if there.
if (num[strlen (num) - 1] == '\n')
num[strlen (num) - 1] = '\0';
// Check for valid number then compress.
if (checkBinNum (num) == NULL) {
free (num);
return NULL;
}
compBinNum (num);
return num;
}
Other functions to add or multiply should be written to assume the input is already valid since it will have been created by one of the functions in this library. I won't provide the code for them since it's not relevant to the question:
char *addBinNum (char *num1, char *num2) {...}
char *mulBinNum (char *num1, char *num2) {...}
If the user chooses to source their data from somewhere other than getBinNum(), you could allow them to call checkBinNum() to validate it.
If you were really paranoid, you could check every number passed in to your routines and act accordingly (return NULL), but that would require relatively expensive checks that aren't necessary.
Wouldn't it be easier to parse the strings into integers, and then perform your maths on the integers?
I'm assuming this is a school assignment, but i'm upvoting you because you appear to be giving it a good effort.
Assuming that a string is a binary number simply because it consists only of digits from the set {0,1} is dangerous. For example, when your input is "11", the user may have meant eleven in decimal, not three in binary. It is this kind of carelessness that gives rise to horrible bugs. Your input is ambiguously incomplete and you should really request that the user specifies the base too.

Resources