atoi vs atol vs strtol vs strtoul vs sscanf - c

I'm trying to figure out which function would be best for converting a command-line argument that may be a decimal, hexadecimal, or octal number to an int, without knowing the input format beforehand.
The goal then is to use a single function that recognizes the different types of inputs and assign that to its integer (int) value which can then be used so:
./a.out 23 0xC4 070
could print
23
196 /*hexadecimal*/
56 /*octal*/
The only issue that I can see is the parsing to find the difference between a decimal integer and an octal.
Side question, is this stable for converting the string to an integer for use?

which function would be best to convert either a decimal, hexadecimal, or octal number to an int the best (?)
To convert such text to int, recommend long strtol(const char *nptr, char **endptr, int base); with additional tests when converting to int, if needed.
Use 0 as the base to assess early characters in steering conversion as base 10, 16 or 8.
#Mike Holt
Convert text per:
Step 1: Optional whitespaces like `' '`, tab, `'\n'`, ... .
Step 2: Optional sign: `'-'` or `'+'`.
Step 3:
0x or 0X followed by hex digits--> hexadecimal
0 --> octal
else --> decimal
Sample code
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
// Handle_Error() is a placeholder for the caller's own error handling
int mystrtoi(const char *str) {
    char *endptr;
    errno = 0;
    //                                   v--- determine conversion base
    long long_var = strtol(str, &endptr, 0);
    // out of range, extra junk at end, no conversion at all
    if (errno == ERANGE || *endptr != '\0' || str == endptr) {
        Handle_Error();
    }
    // Needed when `int` and `long` have different ranges
#if LONG_MIN < INT_MIN || LONG_MAX > INT_MAX
    if (long_var < INT_MIN || long_var > INT_MAX) {
        errno = ERANGE;
        Handle_Error();
    }
#endif
    return (int) long_var;
}
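As a rough sketch of the same idea without the undefined Handle_Error(), the variant below reports failure through its return value. The name parse_int_auto and the bool-plus-out-parameter convention are assumptions made for illustration; they are not part of the answer above.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as decimal, hex (0x...) or octal (0...).
   Returns true and stores the value in *out only if the whole string
   is a number that fits in an int. */
bool parse_int_auto(const char *s, int *out) {
    char *end;
    errno = 0;
    long v = strtol(s, &end, 0);   /* base 0: the prefix selects the base */
    if (end == s || *end != '\0')
        return false;              /* no digits at all, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return false;              /* does not fit in long, or in int */
    *out = (int)v;
    return true;
}
```

With the question's arguments, "23", "0xC4" and "070" yield 23, 196 and 56. An input such as "08" is rejected, because the octal parse stops at the 8 and leaves it as trailing junk.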
atoi vs atol vs strtol vs strtoul vs sscanf ... to int
atoi()
Pro: Very simple.
Pro: Converts to an int.
Pro: In the C standard library.
Pro: Fast.
Con: On out of range errors, undefined behavior. #chqrlie
Con: Handles neither hexadecimal nor octal.
atol()
Pro: Simple.
Pro: In the C standard library.
Pro: Fast.
Con: Converts to a long, not int, which may differ in size.
Con: On out of range errors, undefined behavior.
Con: Handles neither hexadecimal nor octal.
strtol()
Pro: Simple.
Pro: In the C standard library.
Pro: Good error handling.
Pro: Fast.
Pro: Can handle binary. (base 2 to base 36)
Con: Converts to a long, not int, which may differ in size.
strtoul()
Pro: Simple.
Pro: In the C standard library.
Pro: Good error handling.
Pro: Fast.
Pro: Can handle binary.
---: Does not complain about negative numbers.
Con: Converts to an unsigned long, not int which may differ in size.
sscanf(..., "%i", ...)
Pro: In the C standard library.
Pro: Converts to int.
---: Middle-of-the-road complexity.
Con: Potentially slow.
Con: OK error handling (overflow is not defined).
All suffer/benefit from locale settings. §7.22.1.4 6 “In other than the "C" locale, additional locale-specific subject sequence forms may be accepted.”
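For comparison, here is a minimal sketch of the sscanf() route. The helper name scan_int_auto is made up for this example. It uses "%n" to find out how much of the string was consumed, so trailing junk can be rejected. As noted in the list above, overflow is not detected this way.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if s is entirely one integer accepted by "%i".
   Out-of-range input is NOT detected (behavior is undefined in that case). */
bool scan_int_auto(const char *s, int *out) {
    int consumed = 0;
    /* %i picks base 8, 10 or 16 from the prefix;
       %n stores the number of characters read so far */
    if (sscanf(s, "%i%n", out, &consumed) != 1)
        return false;
    return s[consumed] == '\0';   /* reject anything left over */
}
```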
Additional credits:
#Jonathan Leffler: errno test against ERANGE, atoi() decimal-only, discussion about errno multi-thread concern.
#Marian: Speed issue.
#Kevin: Library inclusiveness.
For converting short, signed char, etc., consider strto_subrange().

It is only sensible to consider strtol() and strtoul() (or strtoll() or strtoull() from <stdlib.h>, or perhaps strtoimax() or strtoumax() from <inttypes.h>) if you care about error conditions. If you don't care about error conditions on overflow, any of them could be used. Neither atoi() nor atol() nor sscanf() gives you control if the values overflow. Additionally, neither atoi() nor atol() provides support for hex or octal inputs (so in fact you can't use those to meet your requirements).
Note that calling the strtoX() functions is not entirely trivial. You have to set errno to 0 before calling them, and pass a pointer to get the end location, and analyze carefully to know what happened. Remember, all possible return values from these functions are valid outputs, but some of them may also indicate invalid inputs — and errno and the end pointer help you distinguish between them all.
If you need to convert to int after reading the value using, say, strtoll(), you can check the range of the returned value (stored in a long long) against the range defined in <limits.h> for int: INT_MIN and INT_MAX.
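The checks described above might be arranged roughly as follows. The enum and the function name to_int are invented for this sketch; only strtoll(), errno, ERANGE, INT_MIN and INT_MAX come from the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch. */
enum conv_status { CONV_OK, CONV_NO_DIGITS, CONV_TRAILING, CONV_OUT_OF_RANGE };

enum conv_status to_int(const char *s, int base, int *out) {
    char *end;
    errno = 0;                          /* must be cleared before the call */
    long long v = strtoll(s, &end, base);
    if (end == s)
        return CONV_NO_DIGITS;          /* nothing was converted */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return CONV_OUT_OF_RANGE;       /* too big for long long, or for int */
    if (*end != '\0')
        return CONV_TRAILING;           /* a number followed by junk */
    *out = (int)v;
    return CONV_OK;
}
```

Separate status codes let the caller decide whether trailing text is an error (as for a command-line argument) or expected (as when parsing one field of a longer line).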
For full details, see my answer at: Correct usage of strtol().
Note that none of these functions tells you which conversion was used. You'll need to analyze the string yourself. Quirky note: did you know that there is no decimal 0 in C source; when you write 0, you are writing an octal constant (because its first digit is a 0). There are no practical consequences to this piece of trivia.
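If the base does matter to the caller, the prefix can be inspected by hand. The sketch below uses a hypothetical helper, detect_base, and mirrors the pre-C23 rules for base 0. C23 adds a 0b binary prefix, which this sketch ignores. It does not check that the rest of the string is a valid number.

```c
#include <ctype.h>

/* Hypothetical helper: the base that strtol(s, ..., 0) would choose,
   judged from the prefix after optional whitespace and sign. */
int detect_base(const char *s) {
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '+' || *s == '-')
        s++;
    if (s[0] == '0') {
        /* "0x" counts as hex only when a hex digit follows it */
        if ((s[1] == 'x' || s[1] == 'X') && isxdigit((unsigned char)s[2]))
            return 16;
        return 8;   /* includes a lone "0", as in the trivia above */
    }
    return 10;
}
```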

Related

Integer validation through conversion from char * to int

Say I have an invalid integer input to a char * where,
char *ch = "23 45";
Using atoi(ch) gives 23 as the converted output, ignoring the space and 45.
I'm trying to do testing on this input. What can I do to flag it as an invalid input?
Either check the string before passing it to atoi() or use strtol(), though the latter will return long int.
With strtol(), you can check for errors:
RETURN VALUE
The strtol() function returns the result of the conversion, unless the value would underflow or overflow. If an underflow occurs, strtol() returns LONG_MIN. If an overflow occurs, strtol() returns LONG_MAX. In both cases, errno is set to ERANGE. Precisely the same holds for strtoll() (with LLONG_MIN and LLONG_MAX instead of LONG_MIN and LONG_MAX).
ERRORS
EINVAL (not in C99) The given base contains an unsupported value.
ERANGE The resulting value was out of range.
The implementation may also set errno to EINVAL in case no conversion was performed (no digits seen, and 0 returned).
The lack of error detection is one of the main shortcomings of the atoi() function. If that's something you need, then the basic answer is "don't use atoi()."
The strtol() function is a better alternative in pretty much every way. For your particular purpose, you can pass to it a pointer to a char *, wherein it will record a pointer to the first character in the input that was not converted. If the whole string is successfully converted then a pointer to the string terminator will be stored, so you might write
_Bool is_valid_int(const char *to_test) {
    // assumes to_test is not NULL
    char *end;
    long int result = strtol(to_test, &end, 10);
    return (*to_test != '\0' && *end == '\0');
}
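The same end pointer can also tell you where the input went wrong, which helps when reporting the error. This is a small sketch; the helper name unconverted_offset is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: index of the first character strtol() did not consume.
   Equals strlen(s) when the whole string was converted, 0 when nothing was. */
size_t unconverted_offset(const char *s) {
    char *end;
    (void)strtol(s, &end, 10);   /* value ignored; only the stop position matters */
    return (size_t)(end - s);
}
```

For "23 45" this gives 2, the position of the space where the conversion stopped.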

Difference in the values of atoi

I have the following code:
char* input = (char*)malloc(sizeof(char) * BUFFER); // BUFFER is defined as 100
int digit = atoi(input); // convert char into a digit
int digit_check = 0;
digit_check += digit % 10; // get last value of digit
When I enter 1234567896, I get digit = 1234567896 and digit_check = 6, as expected.
However when I run the input 9999999998, digit = 1410065406 and therefore digit_check = 6 when it should be 8.
For the second example, why is there a difference between input and digit when it should be the same value?
Probably because 9999999998 is bigger than the maximum (signed) integer representation, so you get an overflow.
In fact this is the binary representation of 9999999998 and 1410065406:
10 01010100 00001011 11100011 11111110
01010100 00001011 11100011 11111110
As you can see, 1410065406 is the low 32 bits of 9999999998.
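The arithmetic can be checked with a short sketch. Strictly speaking, atoi() on an out-of-range value is undefined behavior, so this only reproduces the truncation that commonly happens in practice. The function name low32 is made up for this example.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a wider value. */
uint32_t low32(long long v) {
    return (uint32_t)((unsigned long long)v & 0xFFFFFFFFULL);
}
```

Here low32(9999999998LL) is 1410065406, since 9999999998 - 2 * 4294967296 = 1410065406, and its last decimal digit is 6 rather than 8.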
atoi is limited to the size of an int (32 bits on most current platforms).
If you want to handle large numbers, you can use atol or scanf("%ld").
Don't forget to type your variable to long int (or long).
You could also take just the last character of your input (read as a string rather than as an int) and call atoi on that, so it can never overflow.
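A minimal sketch of that last-character approach follows, without atoi at all. The helper name last_digit and the -1 error value are assumptions for illustration. Trailing whitespace is skipped because fgets() usually leaves a newline in the buffer.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: value of the last decimal digit in s, or -1 if
   s does not end in a digit (trailing whitespace is skipped). */
int last_digit(const char *s) {
    size_t n = strlen(s);
    while (n > 0 && isspace((unsigned char)s[n - 1]))
        n--;
    if (n == 0 || !isdigit((unsigned char)s[n - 1]))
        return -1;
    return s[n - 1] - '0';
}
```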
On many platforms int is 4 bytes, which limits digit to the range [-2**31, 2**31 - 1].
Use long (or long long) with strtol (or strtoll) depending on platform you build for. For example, GCC on x86 will have 64-bit long long, and for amd64 it will have 64-bit long and long long types.
So:
long long digit = strtoll(input, NULL, 10);
NOTE: strtoll() has been standard since C99 (and C++11) and is common on Unix-like systems, but not all VC++ implementations have it. There, use _strtoi64() instead:
__int64 digit = _strtoi64(input, NULL, 10);
You probably want to use the atoll function, which returns a long long int, that is twice as big as int (most likely 64 bits in your case).
It is declared in stdlib.h
http://linux.die.net/man/3/atoll
You should avoid calling atoi on an uninitialized string: if the buffer contains no \0, atoi reads past the end of it, which is undefined behavior and may cause a segmentation fault.
Use strtoimax instead; it is safer.
9999999998 is bigger than the maximum value that an int can represent. Use either atol() or atoll().
You should stop using the atoi function or any other functions from the ato... group. These functions are not officially deprecated, but they have been effectively abandoned since 1995 and exist only for legacy code compatibility purposes. Forget about these functions as if they do not exist. These functions provide no usable feedback in case of error or overflow. And overflow is what apparently happens in your example.
In order to convert strings to numbers, C standard library provides strtol function and other functions from strto... group. These are the functions you should use to perform the conversion. And don't forget to check the result for overflow: strto... functions provide this feedback through the return value and errno variable.

Parse a string as a (long long) integer

I am writing a code in which I need to parse a string to a "long long int"
I used to use atoi to convert a string to an int, but I don't think it will work here. What can I use instead?
--Thanks
Use strtoll() (man page):
#include <stdlib.h>
long long int n = strtoll(s, NULL, 0);
(This is only available in C99 and C11, not in C89.) The third argument is the number base for the conversion, and 0 means "automatic", i.e. decimal, octal or hexadecimal are selected depending on the usual conventions (10, 010, 0x10). Just be mindful of that in case your string starts with 0.
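The effect of the base argument on a leading 0 can be seen in a small sketch. The wrapper names parse_auto and parse_decimal are invented here, and error checking is left out to keep the contrast visible.

```c
#include <stdlib.h>

/* Base 0: the prefix decides (0x... hex, 0... octal, otherwise decimal). */
long long parse_auto(const char *s)    { return strtoll(s, NULL, 0); }

/* Base 10: leading zeros are just digits. */
long long parse_decimal(const char *s) { return strtoll(s, NULL, 10); }
```

With these, parse_auto("010") is 8 while parse_decimal("010") is 10, so base 10 is the safer choice if users may type zero-padded decimal numbers.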

atoi function in C not working properly (when exceeding a certain value)

Might someone explain why the atoi function doesn't work for numbers with more than 9 digits?
For example:
When I enter: 123456789,
The program returns: 123456789,
However,when I enter: 12345678901
the program returns: -519403114...
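#include <conio.h>  /* non-standard header, needed for getch() */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i;
    char szinput[256];
    printf("Enter a Card Number:");
    fgets(szinput, 256, stdin);
    i = atoi(szinput);
    printf("%d\n", i);
    getch();
    return 0;
}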
Don't use atoi(), or any of the atoi*() functions, if you care about error handling. These functions provide no way of detecting errors; neither atoi("99999999999999999999") nor atoi("foo") has any way to tell you that there was a problem. (I think that one or both of those cases actually has undefined behavior, but I'd have to check to be sure.)
The strto*() functions are a little tricky to use, but they can reliably tell you whether a string represents a valid number, whether it's in range, and what its value is. (You have to deal with errno to get full checking.)
If you just want an int value, you can use strtol() (which, after error checking, gives you a long result) and convert it to int after also checking that the result is in the representable range of int (see INT_MIN and INT_MAX in <limits.h>). strtoul() gives you an unsigned long result. strtoll() and strtoull() are for long long and unsigned long long respectively; they're new in C99, and your compiler implementation might not support them (though most non-Microsoft implementations probably do).
Because you are overflowing an int with such a large value.
Moreover, atoi is deprecated and thread-unsafe on many platforms, so you'd better ditch it in favour of strto(l|ll|ul|ull).
Consider using strtoull instead. Since unsigned long long is a 64-bit type on most modern platforms, you'll be able to convert a number as big as 2 ^ 64 - 1 (18446744073709551615).
To print an unsigned long long, use the %llu format specifier.
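A sketch of how the card-number input might be converted with strtoull() is shown below. The helper name parse_card_number is hypothetical. It rejects a leading minus sign explicitly, because strtoull() accepts one and negates the result, and it tolerates the newline that fgets() leaves behind.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse an unsigned decimal number such as a card number. */
bool parse_card_number(const char *s, unsigned long long *out) {
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-')
        return false;                   /* strtoull would silently negate it */
    char *end;
    errno = 0;
    unsigned long long v = strtoull(s, &end, 10);
    if (end == s || errno == ERANGE)
        return false;                   /* no digits, or too large */
    if (*end != '\0' && *end != '\n')
        return false;                   /* allow only the fgets() newline */
    *out = v;
    return true;
}
```

On success the value can be printed with printf("%llu\n", value).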
If you are writing a Win32 application, you can use the Windows implementation of atoi; for details, see the page below.
http://msdn.microsoft.com/en-us/library/czcad93k%28v=vs.80%29.aspx

C Compatibility Between Integers and Characters

How does C handle converting between integers and characters? Say you've declared an integer variable and ask the user for a number but they input a string instead. What would happen?
The user input is treated as a string that needs to be converted to an int using atoi or another conversion function. Atoi will return 0 if the string cannot be interpreted as a number because it begins with letters or other non-numeric characters.
You can read a bit more at the atoi documentation on MSDN - http://msdn.microsoft.com/en-us/library/yd5xkb5c(VS.80).aspx
Uh?
You always input a string. Then you parse (convert) this string to a number, with various ways (asking again, taking a default value, etc.) of handling various errors (overflow, incorrect chars, etc.).
Another thing to note is that in C, characters and integers are "compatible" to some degree. Any character can be assigned to an int. The reverse also works, but you'll lose information if the integer value doesn't fit into a char.
char foo = 'a'; // The ASCII value for lower-case 'a' is 97
int bar = foo; // bar now contains the value 97
bar = 255; // 255 is 0x000000ff in hexadecimal
foo = bar; // where char is signed and 8 bits wide, foo typically becomes -1 (0xff); the result is implementation-defined
unsigned char foo2 = foo; // foo2 now contains 255 (0xff)
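The conversions that are fully defined by the standard can be isolated in a short sketch. The function names to_uchar and char_to_int are made up for illustration.

```c
#include <limits.h>

/* Converting an out-of-range int to unsigned char is well defined: the
   value is reduced modulo UCHAR_MAX + 1 (256 when CHAR_BIT is 8). */
unsigned char to_uchar(int v) { return (unsigned char)v; }

/* Widening a char to int always preserves the value. */
int char_to_int(char c) { return c; }
```

Going through unsigned char avoids the implementation-defined result of storing 255 in a plain char that happens to be signed.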
As other people have noted, the data is normally entered as a string -- the only question is which function is used for doing the reading. If you're using a GUI, the function may already deal with conversion to integer and reporting errors and so on in an appropriate manner. If you're working with Standard C, it is generally easier to read the value into a string (perhaps with fgets()) and then convert it. Although atoi() can be used, it is seldom the best choice; the trouble is determining whether the conversion succeeded (and produced zero because the user entered a legitimate representation of zero) or not.
Generally, use strtol() or one of its relatives (strtoul(), strtoll(), strtoull()); for converting floating point numbers, use strtod() or a similar function. The advantage of the integer conversion routines include:
optional base selection (for example, base 10, or base 16 - hex, or base 8 - octal, or any of the above using standard C conventions: 007 for octal, 0x07 for hex, 7 for decimal).
optional error detection (by knowing where the conversion stopped).
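As a sketch of the second point, knowing where the conversion stopped is enough to tell a real zero from a failed conversion, which atoi() cannot do. The helper name is_real_zero is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

/* True only when s really starts with a number whose value is zero.
   atoi() would return 0 for "abc" as well as for "0". */
bool is_real_zero(const char *s) {
    char *end;
    long v = strtol(s, &end, 10);
    return end != s && v == 0;   /* end == s means nothing was converted */
}
```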
The place I go for many of these function specifications (when I don't look at my copy of the actual C standard) is the POSIX web site (which includes C99 functions). It is Unix-centric rather than Windows-centric.
The program would crash, you need to call atoi function.

Resources