Convert string to long long in C?

How can I convert a string to a long long in C?
I've got
char* example = "123";
I'd like to convert example to a long long so I'd want something like
long long n = example;
How can I do this?

Use the function strtoll:
#include <stdlib.h>
#include <errno.h>
char const * example = "123";
char * e;
errno = 0;
long long int n = strtoll(example, &e, 0);
if (e == example || *e != 0 || errno != 0) { /* error (no digits, trailing junk, or out of range), don't use n! */ }
In fact, e will point to the next character after the converted sequence, so you can do even more sophisticated parsing with this. As it stands, we just check that the entire sequence has been converted. You can also inspect errno to see if an overflow occurred. See the manual for details.
(For historical interest: long long int and strtoll were introduced in C99. They're not available in C89/90. Equivalent functions strtol / strtoul / strtod exist, though.)
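The checks above can be bundled into one small function. This is only a sketch: the name parse_ll and its bool-returning interface are made up for illustration, not part of any library.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true and stores the value in *out only if
   the whole string is a valid long long in the given base. */
bool parse_ll(const char *s, int base, long long *out)
{
    char *end;

    errno = 0;
    long long v = strtoll(s, &end, base);

    if (end == s)        /* no digits were consumed (e.g. "" or "abc") */
        return false;
    if (*end != '\0')    /* trailing characters after the number */
        return false;
    if (errno == ERANGE) /* value does not fit in long long */
        return false;

    *out = v;
    return true;
}
```

A caller would write something like `long long n; if (parse_ll(example, 10, &n)) { /* use n */ }`.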

Related

Is it better to implement strtol() without errno?

Traditional strtol() is usually used like this:
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    errno = 0;
    const char *s = "12345678912345678900";
    char *endptr;
    long i = strtol(s, &endptr, 10);
    if (i == LONG_MAX && errno == ERANGE)
        printf("overflow");
    return 0;
}
We need to access errno twice, and nowadays errno is usually a macro that expands to a function call. That seems a little expensive, considering that parsing a string to an integer isn't heavy work.
So, is it better to implement strtol without errno, using some other way to indicate overflow?
like:
long strtol(const char *nptr, char **endptr, int base, bool *is_overflow);
instead of
long strtol(const char *nptr, char **endptr, int base);
is it better to implement strtol without errno ...
No.
... but using some other way to indicate overflow?
No.
long int strtol(const char * restrict nptr, char ** restrict endptr, int base);
strtol() is a standard C library function and any implementation must adhere to proper use of the 3 inputs and errno to be compliant.
Of course OP can implement some other my_strtol() as desired.
Any performance concerns around avoiding errno are a micro-optimization yet a reasonable design goal.
It really comes down to how to convey the problems of string-to-long conversion:
Overflow "12345678912345678901234567890"
No conversions "abc"
Excess junk "123 abc"
Leading space allowed, trailing space allowed?
Allow various bases?
Once the functionality for all exceptional cases is defined, not just overflow, then coding concerns about errno become useful, even if they are unlikely to yield any meaningful performance improvement.
IMO, coding to one base only is likely a more productive path to speed improvements than avoiding errno.
OP's code is not a robust use of strtol(). Suggest:
char *s = "12345678912345678900";
char *endptr;
errno = 0;
long i = strtol(s, &endptr, 10);
if (errno == ERANGE) printf("Overflow %ld\n", i);
else if (s == endptr) printf("No conversion %ld\n", i);
else if (*endptr) printf("Extra Junk %ld\n", i);
else printf("Success %ld\n", i);
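If the goal is an interface that reports every case through the return value, one option is a thin wrapper around strtol(). The sketch below is only an illustration: the enum, its constants, and the name parse_long are invented here, and it still uses errno internally.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes, one per outcome handled above. */
enum parse_status {
    PARSE_OK,
    PARSE_OVERFLOW,
    PARSE_NO_CONVERSION,
    PARSE_EXTRA_JUNK
};

/* Wraps strtol() and reports the outcome in the return value.
   The caller's errno is saved and restored. */
enum parse_status parse_long(const char *s, int base, long *out)
{
    char *end;
    int saved = errno;

    errno = 0;
    long v = strtol(s, &end, base);
    int err = errno;
    errno = saved;

    *out = v;   /* LONG_MAX/LONG_MIN on overflow, 0 if nothing converted */
    if (err == ERANGE) return PARSE_OVERFLOW;
    if (end == s)      return PARSE_NO_CONVERSION;
    if (*end != '\0')  return PARSE_EXTRA_JUNK;
    return PARSE_OK;
}
```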
There is actually some overhead besides errno in strtol(), like skipping spaces, taking care of the base (10 or hexa), check characters ...
In a specific environment where speed is critical and you know the string provided is a number base 10 that fits in a long, you could make your own quick function, like
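#include <ctype.h>
/* Assumes s is a well-formed base-10 number that fits in a long. */
long mystrtol(const char *s) {
    long res = 0;
    int minus = (*s == '-');
    if (minus || *s == '+') s++;
    /* the cast avoids undefined behavior for negative char values */
    while (isdigit((unsigned char)*s)) {
        res = res * 10 + (*s++ - '0');
    }
    return minus ? -res : res;
}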
and choose to inline it.
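If the "fits in a long" assumption cannot be guaranteed, overflow can still be detected without touching errno. The following is a sketch under my own assumptions (base 10 only, no leading whitespace, the whole string must be a number); the name fast_atol is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical variant of the quick parser: base 10 only, and overflow is
   reported through the return value, not errno. */
bool fast_atol(const char *s, long *out)
{
    bool neg = (*s == '-');
    if (neg || *s == '+')
        s++;
    if (*s < '0' || *s > '9')
        return false;                 /* need at least one digit */

    long res = 0;                     /* accumulated as a negative number */
    for (; *s >= '0' && *s <= '9'; s++) {
        int d = *s - '0';
        if (res < (LONG_MIN + d) / 10)
            return false;             /* res * 10 - d would overflow */
        res = res * 10 - d;
    }
    if (*s != '\0')
        return false;                 /* trailing junk */
    if (!neg && res == LONG_MIN)
        return false;                 /* -LONG_MIN is not representable */

    *out = neg ? res : -res;
    return true;
}
```

Accumulating on the negative side is a design choice: the negative range of long is at least as large as the positive one, so LONG_MIN itself can be parsed without a special case.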

Good C Coding practice/etiquette for integrity checks

Two part question;
I'm Coming from a high level Language, so this is a question about form not function;
I've written an isnumeric() function that takes a char[] and returns 1 if the string is a number taking advantage of the isdigit() function in ctype. Similar functions are builtin to other languages and I have always used something like that to integrity check the data before converting it to a numeric type. Mostly because some languages conversion functions fail badly if you try to convert a non-number string to an integer.
But it seems like a kludge having to do all that looping to compensate for the lack of strings in C, which poses the first part of the question;
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()? The way atoi() (and other ascii to xx functions) works seems to lend itself well to eliminating the integrity check altogether. It would certainly seem more efficient to just skip the check.
The second part of the question is;
Is there a C function or common library function for a numeric integrity check
on a string? (by string, I of course mean char[])
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()?
Never ever trap on error unless the error indicates a programming error that can't happen if there isn't a bug in the code. Always return some sort of error result in case of an error. Look at the OpenBSD strtonum function for how you could design such an interface.
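For systems that lack strtonum, an interface in the same spirit can be sketched on top of strtoll(). This is not OpenBSD's implementation: my_strtonum is a hypothetical name, and unlike the original it assumes errstr is never NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical portable stand-in modeled on OpenBSD's strtonum().
   On success returns the value and sets *errstr to NULL; on failure
   returns 0 and points *errstr at a short description.
   Assumption: errstr itself must not be NULL. */
long long my_strtonum(const char *s, long long minval, long long maxval,
                      const char **errstr)
{
    char *end;
    int saved = errno;

    errno = 0;
    long long v = strtoll(s, &end, 10);
    int err = errno;
    errno = saved;                    /* leave the caller's errno alone */

    *errstr = NULL;
    if (minval > maxval || end == s || *end != '\0')
        *errstr = "invalid";
    else if ((err == ERANGE && v < 0) || v < minval)
        *errstr = "too small";
    else if ((err == ERANGE && v > 0) || v > maxval)
        *errstr = "too large";

    return *errstr ? 0 : v;
}
```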
The second part of the question is; Is there a C function or common library function for a numeric integrity check on a string? (by string, I of course mean char[])
Never use atoi unless you are writing a program without error checking, as atoi doesn't do any error checking. The strtol family of functions allows you to check for errors. Here is a simple example of how you could use them:
#include <errno.h>
#include <stdlib.h>

int check_is_number(const char *buf)
{
    char *endptr;   /* strtol() expects a char **, not a const char ** */
    int errsave = errno, errval;

    errno = 0;
    (void)strtol(buf, &endptr, 0);   /* only the error state matters here */
    errval = errno;
    errno = errsave;   /* restore the caller's errno */
    if (errval != 0)
        return 0; /* an error occurred */
    if (endptr == buf || *endptr != '\0')
        return 0; /* not a number */
    return 1;
}
See the manual page linked before for how the third argument to strtol (base) affects what it does.
errno is set to ERANGE if the value is out of range for the desired type (i.e. long). In this case, the return value is LONG_MAX or LONG_MIN.
If the conversion method returns an error indication (as distinct from going bananas if an error occurs, or not providing a definitive means to check if an error has occurred) then there is actually no need to check if a string is numeric before trying to convert it.
With that in mind, atoi() is not a particularly good function to use if you need to check for errors on conversion. Zero is returned for a zero input as well as for an error, and there is no way to tell which occurred. A better function to use (assuming you want to read an integral value) is strtol(). Although strtol() also returns zero when no conversion can be performed, it provides additional information that can be used to check for failure. For example:
long x;
char *end;
x = strtol(your_string, &end, 10);
if (end == your_string)
{
/* nothing was read due to invalid character or the first
character marked the end of string */
}
else if (*end != '\0')
{
/* an integral value was read, but there is following non-numeric data */
}
Second, there are alternatives to using strtol(), albeit involving more overhead. The return values from sscanf() (and, in fact, all functions in the scanf() family) can be checked for error conditions.
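As a rough illustration of the sscanf() route, the return value can be combined with %n to also catch trailing characters. The helper name scan_int is made up for this sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true only if s holds one integer and nothing else.
   Notes: *out may be modified even when false is returned, and unlike
   strtol(), sscanf() gives no defined way to detect overflow. */
bool scan_int(const char *s, int *out)
{
    int consumed = 0;

    /* %n stores the number of characters read so far; it does not count
       toward the return value, so success is still exactly 1. */
    if (sscanf(s, "%d%n", out, &consumed) != 1)
        return false;
    return s[consumed] == '\0';
}
```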
There is no standard function for checking if a string is numeric, but it can be easily rolled using the above.
int IsNumeric(char *your_string)
{
/* This has undefined behaviour if your_string is not a C-style string
It also deems that a string like "123AB" is non-numeric
*/
long x;
char *end;
x = strtol(your_string, &end, 10);
return !(end == your_string || *end != '\0');
}
No (explicit) loops in any of the above options.
Is it acceptable practice in C to trap for a 0 return from atoi() in lieu of doing an integrity check on the data before calling atoi()?
No. @FUZxxl answers that well.
Is there a C function or common library function for a numeric integrity check on a string?
In C, the conversion of a string to a number and the check to see if the conversion is valid is usually done together. The function used depends on the type of number sought. "1.23" would make sense for a floating point type, but not an integer.
// No error handle functions
int atoi(const char *nptr);
long atol(const char *nptr);
long long atoll(const char *nptr);
double atof(const char *nptr);
// Some error detection functions
if (sscanf(buffer, "%d", &some_int) == 1) ...
if (sscanf(buffer, "%lf", &some_double) == 1) ...
// Robust methods use
long strtol( const char *nptr, char ** endptr, int base);
long long strtoll( const char *nptr, char ** endptr, int base);
unsigned long strtoul( const char *nptr, char ** endptr, int base);
unsigned long long strtoull( const char *nptr, char ** endptr, int base);
intmax_t strtoimax(const char *nptr, char ** endptr, int base);
uintmax_t strtoumax(const char *nptr, char ** endptr, int base);
float strtof( const char *nptr, char ** endptr);
double strtod( const char *nptr, char ** endptr);
long double strtold( const char *nptr, char ** endptr);
These robust methods use char ** endptr to store the string location where scanning stopped. If no numeric data was found, then *endptr == nptr. So a common test is
char *endptr;
y = strto...(buffer, &endptr, ...);
if (buffer == endptr) puts("No conversion");
if (*endptr != '\0') puts("Extra text");
If the range is exceeded, these functions all set errno to ERANGE and return a minimum or maximum value for the type.
errno = 0;
double y = strtod("1.23e10000000", &endptr);
if (errno == ERANGE) puts("Range exceeded");
The integer functions allow a radix selection from base 2 to 36. If 0 is used, the base is chosen from the leading part of the string: "0x" or "0X" selects base 16, a leading "0" selects base 8, and anything else selects base 10.
long y = strtol(buffer, &endptr, 10);
Read the specification or help page for more details.
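To make the base 0 rule concrete, here is a tiny sketch. The wrapper name parse_auto_base is hypothetical, and it does no error checking at all.

```c
#include <stdlib.h>

/* Hypothetical helper: parse with base 0 so that the prefix of the string
   selects the radix (hexadecimal, octal or decimal). */
long parse_auto_base(const char *s)
{
    return strtol(s, NULL, 0);
}
```

With this, "0x1F" yields 31, "017" yields 15, and "42" yields 42.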
You probably don't need a function to check whether a string is numeric. You will most likely need to convert the string to a number, so just do that. Then check whether the conversion was successful.
long number;
char *end;
number = strtol(string, &end, 10);
if ((*string == '\0') || (*end != '\0'))
{
// empty string or invalid number
}
The second argument of strtol is used to indicate where the parsing ended (the first non-numeric character). That character will be \0 if we've reached the end of the string. If you want to permit other characters after the number (like ), you can use switch to check for it.
strtol works with long integers. If you need some other type, you should consult the man page: man 3 strtol. For floating-point numbers you can use strtod.
Don't trap if the program logic permits that the string is not numeric (e.g. if it comes from the user or a file).
OP later commented:
I'm looking for a way to determine if the string contains ONLY base 10 digits or a decimal or a comma. So if the string is 100,000.01 I want a positive return from func. Any other ascii characters anywhere in the string would result in a negative return value.
If that is all you are interested in, use:
if (buffer[strspn(buffer, "0123456789.,")] == '\0') return 0; // Success
else return -1; // Failure
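Wrapped into a function, the same test might look like the sketch below. The name only_number_chars is made up, and rejecting the empty string is my own assumption (the fragment above would accept it).

```c
#include <string.h>

/* Hypothetical wrapper: returns 0 if s is non-empty and contains only
   decimal digits, '.' and ',', otherwise -1. It makes no claim that the
   characters form a well-formed number ("1..2,,3" also passes). */
int only_number_chars(const char *s)
{
    if (*s == '\0')
        return -1;   /* assumption: an empty string is not a number */
    /* strspn() returns the length of the leading run of allowed characters */
    return s[strspn(s, "0123456789.,")] == '\0' ? 0 : -1;
}
```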

String conversion to int

I have a pointer lpBegin pointing to a string "1234". Now I want to compare this string to a uint. How can I convert this string to an unsigned integer without using scanf? I know the number in the string is 4 characters long.
You will have to use the atoi function. This takes a pointer to a char and returns an int.
const char *str = "1234";
int res = atoi(str); //do something with res
As others have said (and something I didn't know), atoi is not recommended because it is undefined what happens when a formatting error occurs. So it is better to use strtoul, as others have suggested.
Definitely atoi() which is easy to use.
Don't forget to include stdlib.h.
You can use the strtoul() function. strtoul stands for "String to unsigned long":
#include <stdio.h>
#include <stdlib.h>
int main()
{
char lpBegin[] = "1234";
unsigned long val = strtoul(lpBegin, NULL, 10);
printf("The integer is %lu.", val);
return 0;
}
You can find more information here: http://www.cplusplus.com/reference/clibrary/cstdlib/strtoul/
You could use strtoul(lpBegin, NULL, 10), but this only works with zero-terminated strings.
If you don't want to use stdlib for whatever reason and you're absolutely sure about the target system(s), you could do the number conversion manually.
This one should work on most systems as long as they are using single byte encoding (e.g. Latin, ISO-8859-1, EBCDIC). To make it work with UTF-16, just replace the 'char' with 'wchar_t' (or whatever you need).
unsigned long sum = (lpBegin[0] - '0') * 1000 +
(lpBegin[1] - '0') * 100 +
(lpBegin[2] - '0') * 10 +
(lpBegin[3] - '0');
or for numbers with unknown length:
char* c = lpBegin;
unsigned long sum = 0;
while (*c >= '0' && *c <= '9') {
sum *= 10;
sum += *c - '0';
++c;
}
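Since the question says the length is known, the manual approach can also be written for a fixed digit count with validation, which works even when the buffer is not zero-terminated. This is only a sketch; parse_fixed_digits is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: convert exactly n decimal digits starting at p.
   The buffer need not be zero-terminated. Returns false if any of the n
   characters is not a digit. Overflow is not checked, so n must be small
   enough for unsigned long (9 digits always fit). */
bool parse_fixed_digits(const char *p, size_t n, unsigned long *out)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (p[i] < '0' || p[i] > '9')
            return false;
        sum = sum * 10 + (unsigned long)(p[i] - '0');
    }
    *out = sum;
    return true;
}
```

For the question's case, the call would be `parse_fixed_digits(lpBegin, 4, &value)`.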
I think you look for atoi()
http://www.elook.org/programming/c/atoi.html
strtol is better than atoi with better error handling.
You should use the strtoul function, "string to unsigned long". It is found in stdlib.h and has the following prototype:
unsigned long int strtoul (const char * restrict nptr,
char ** restrict endptr,
int base);
nptr is the character string.
endptr is an optional parameter giving the location where the function stopped reading valid numbers. If you aren't interested in this, pass NULL for this parameter.
base is the number format you expect the string to be in. In other words, 10 for decimal, 16 for hex, 2 for binary and so on.
Example:
#include <stdlib.h>
#include <stdio.h>
int main()
{
const char str[] = "1234random rubbish";
unsigned long i;
char *endptr;
i = strtoul(str,
&endptr,
10);
printf("Integer: %lu\n", i);
printf("Function stopped when it found: %s\n", endptr);
getchar();
return 0;
}
Regarding atoi().
atoi() behaves like (int)strtol(nptr, NULL, 10), except for how errors are handled. atoi() is however not recommended: it has no way to report a format error (it returns 0, which is indistinguishable from a valid "0"), and the C standard leaves the behavior undefined when the result cannot be represented. It is therefore better practice to always use strtoul() (and the other similar strto... functions).
If you're really certain the string is 4 digits long, and don't want to use any library function (for whatever reason), you can hardcode it:
const char *lpBegin = "1234";
const unsigned int lpInt = 1000 * (lpBegin[0] - '0') +
100 * (lpBegin[1] - '0') +
10 * (lpBegin[2] - '0') +
1 * (lpBegin[3] - '0');
Of course, using e.g. strtoul() is vastly superior so if you have the library available, use it.

Getting gibberish instead of numbers using memcpy and strtoul

I have the following piece of code compiling under gcc:
int parseMsg(const char *msg_to_parse, unsigned long *exp_input, unsigned long *sysTicks )
{
int l_msg_size = strlen(msg_to_parse);
if(l_msg_size <10)
return -1;
char l_exp_input_arr[10];
char l_sys_ticks_arr[10];
memcpy(l_sys_ticks_arr,msg_to_parse+12,10);
memcpy(l_exp_input_arr,msg_to_parse,10);
//l_msg_size = strlen(msg_to_parse);
*sysTicks = strtoul(l_sys_ticks_arr,NULL,10);
*exp_input = strtoul(l_exp_input_arr,NULL,10);
return 0;
}
And I'm trying to test it in the following manner:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int parseMsg(const char *msg_to_parse, unsigned long *exp_input, unsigned long *sysTicks );
int main(void) {
char msg[] = "1234567890 59876543213";
unsigned long along1, along2;
along1 =0;
along2=0;
parseMsg(msg,&along1, &along2 );
printf("result of parsing: \n \t Along 1 is %lu \n \t Along 2 is %lu \n",along1, along2);
return 0;
}
But, I'm getting the following result:
result of parsing:
Along 1 is 1234567890
Along 2 is 4294967295
Why is the second unsigned long wrong?
Thanks
The second integer you provide is too big to be represented as an unsigned long on your architecture. So, according to its API, strtoul just returns ULONG_MAX (= 4294967295 on your architecture) and sets errno to ERANGE.
strtoul API is here : http://www.cplusplus.com/reference/clibrary/cstdlib/strtoul/
But it may also fail with a smaller integer, because strtoul only stops parsing when it encounters a non-numeric character. Your arrays are not terminated, so strtoul may read past their end (which is undefined behavior) and parse whatever happens to follow them in memory. (Assuming random contents, there is a 10 in 256 chance that the next byte is a digit and causes a conversion error.)
Terminate your strings with \0 and it will be fine:
char l_exp_input_arr[11]; // +1 for \0
char l_sys_ticks_arr[11];
memcpy(l_sys_ticks_arr, msg_to_parse+12, 10);
l_sys_ticks_arr[10] = '\0';
memcpy(l_exp_input_arr, msg_to_parse, 10);
l_exp_input_arr[10] = '\0';
You need to make your two temporary char[] variables one char longer and then set the last character to the null character '\0'.
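An alternative that avoids the temporary copies altogether is to let strtoul() report where each number ends. The sketch below assumes the two fields are separated by whitespace, which differs from the fixed offsets in the question; parse_two_fields is a hypothetical name.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical rewrite: read two whitespace-separated unsigned decimal
   fields directly from the message. Returns 0 on success, -1 on a missing
   field or a value that does not fit in unsigned long. */
int parse_two_fields(const char *msg, unsigned long *first,
                     unsigned long *second)
{
    char *end;

    errno = 0;
    unsigned long a = strtoul(msg, &end, 10);
    if (end == msg || errno == ERANGE)
        return -1;

    const char *rest = end;   /* strtoul() skips the separating whitespace */
    unsigned long b = strtoul(rest, &end, 10);
    if (end == rest || errno == ERANGE)
        return -1;

    *first = a;
    *second = b;
    return 0;
}
```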

how to convert hex string to unsigned 64bit (uint64_t) integer in a fast and safe way?

I tried
sscanf(str, "%016llX", &int64 );
but seems not safe. Is there a fast and safe way to do the type casting?
Thanks~
Don't bother with functions in the scanf family. They're nearly impossible to use robustly. Here's a general safe use of strtoull:
char *str, *end;
unsigned long long result;
errno = 0;
result = strtoull(str, &end, 16);
if (result == 0 && end == str) {
/* str was not a number */
} else if (result == ULLONG_MAX && errno) {
/* the value of str does not fit in unsigned long long */
} else if (*end) {
/* str began with a number but has junk left over at the end */
}
Note that strtoull accepts an optional 0x prefix on the string, as well as optional initial whitespace and a sign character (+ or -). If you want to reject these, you should perform a test before calling strtoull, for instance:
if (!isxdigit((unsigned char)str[0]) || (str[1] && !isxdigit((unsigned char)str[1])))
If you also wish to disallow overly long representations of numbers (leading zeros), you could check the following condition before calling strtoull:
if (str[0]=='0' && str[1])
One more thing to keep in mind is that "negative numbers" are not considered outside the range of conversion; instead, a prefix of - is treated the same as the unary negation operator in C applied to an unsigned value, so for example strtoull("-2", 0, 16) will return ULLONG_MAX-1 (without setting errno).
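Putting these points together, a strict parser for the question's uint64_t case could look like the following sketch. The name parse_hex_u64 is hypothetical, and scanning every character up front is my own choice for rejecting whitespace, signs, and the 0x prefix.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical strict parser: accepts only hex digits, so no whitespace,
   sign, or "0x" prefix gets through to strtoull(). */
bool parse_hex_u64(const char *s, uint64_t *out)
{
    if (s[0] == '\0')
        return false;                       /* empty string */
    for (const char *p = s; *p; p++)
        if (!isxdigit((unsigned char)*p))
            return false;                   /* anything but 0-9, a-f, A-F */

    errno = 0;
    unsigned long long v = strtoull(s, NULL, 16);
    if (errno == ERANGE || v > UINT64_MAX)
        return false;                       /* too many significant digits */

    *out = (uint64_t)v;
    return true;
}
```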
Your title (at present) contradicts the code you provided. If you want to do what your title was originally (convert a string to an integer), then you can use this answer.
You could use the strtoull function, which unlike sscanf is a function specifically geared towards reading textual representations of numbers.
const char *test = "123456789abcdef0";
char *end;
errno = 0;
unsigned long long result = strtoull(test, &end, 16);
if (end == test || *end != '\0')
{
// not a valid number (strtoull is not required to set errno to EINVAL here)
}
else if (errno == ERANGE)
{
// does not fit in an unsigned long long
}
At the time I wrote this answer, your title suggested you'd want to write an uint64_t into a string, while your code did the opposite (reading a hex string into an uint64_t). I answered "both ways":
The <inttypes.h> header has conversion macros to handle the ..._t types safely:
#include <stdio.h>
#include <inttypes.h>
sprintf( str, "%016" PRIx64, uint64 );
Or (if that is indeed what you're trying to do), the other way round:
#include <stdio.h>
#include <inttypes.h>
sscanf( str, "%" SCNx64, &uint64 );
Note that the scanf() function family lets you set only a maximum field width, not require an exact one. It parses what it gets, which can yield undesired results when the input does not adhere to the expected formatting. Also, <inttypes.h> defines only the lowercase SCNx64 macro for scanning, with no uppercase SCNX64 counterpart; the x conversion accepts both uppercase and lowercase hex digits anyway.
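For completeness, both directions can be wrapped up as small functions that check what the library calls report. This is a sketch with made-up names (u64_to_hex, hex_to_u64); the parsing side inherits the scanf() limitations mentioned above.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format value as 16 lowercase hex digits.
   buf must have room for 17 bytes (16 digits plus the terminator). */
void u64_to_hex(uint64_t value, char buf[17])
{
    snprintf(buf, 17, "%016" PRIx64, value);
}

/* Hypothetical helper: true if sscanf() converted one item.
   Trailing characters after the number are not detected. */
bool hex_to_u64(const char *s, uint64_t *out)
{
    return sscanf(s, "%" SCNx64, out) == 1;
}
```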
