Conversion of string constant to numeric value using C

I have written a C program which uses two different algorithms to convert a string constant representing a numeric value to its integer value. For some reason, the first algorithm, atoi(), doesn't execute properly on large values, while the second algorithm, atoi_imp(), works fine. Is this an optimization issue or some other error? The problem is that the first function causes the program's process to terminate with an error.
#include <stdio.h>
#include <string.h>
unsigned long long int atoi(const char[]);
unsigned long long int atoi_imp(const char[]);
int main(void) {
printf("%llu\n", atoi("9417820179"));
printf("%llu\n", atoi_imp("9417820179"));
return 0;
}
unsigned long long int atoi(const char str[]) {
unsigned long long int i, j, power, num = 0;
for (i = strlen(str) - 1; i >= 0; --i) {
power = 1;
for (j = 0; j < strlen(str) - i - 1; ++j) {
power *= 10;
}
num += (str[i] - '0') * power;
}
return num;
}
unsigned long long int atoi_imp(const char str[]) {
unsigned long long int i, num = 0;
for (i = 0; str[i] >= '0' && str[i] <= '9'; ++i) {
num = num * 10 + (str[i] - '0');
}
return num;
}

atoi is part of the C standard library, with the signature int atoi(const char *);.
You are declaring that a function with that name exists, but giving it a different return type. Note that in C, the function name is the only thing that identifies a function (there is no overloading), and the toolchain can only trust what you tell it in the source code. If you lie to the compiler, as here, all bets are off.
You should choose a different name for your own implementation to avoid such issues.
As researched by @pmg, the C standard (C99 7.1.3) says that using names from the C standard library for your own global symbols (functions or global variables) is explicitly undefined behavior. Beware of nasal demons!

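OK, there is at least one problem with your function atoi.
You are counting down with an unsigned variable and checking whether it is greater than or equal to zero. That condition is always true: decrementing an unsigned value past zero wraps around to the largest value of its type.
The easiest fix is to shift the index by one, i.e.: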
unsigned long long int my_atoi(const char str[]) {
unsigned long long int i, j, power, num = 0;
for (i = strlen(str); i != 0; --i) {
power = 1;
for (j = 0; j < strlen(str) - i; ++j) {
power *= 10;
}
num += (str[i-1] - '0') * power;
}
return num;
}

Too late, but this may help. I wrote it for base 10; if you change the base, you need to take care of how the digit value is computed in *p - '0'.
I would use Horner's rule to compute the value.
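#include <stdio.h>
int main(void)
{
    const char *a = "5363", *p = a;
    unsigned int base = 10;
    unsigned long x = 0;
    while (*p) {
        x *= base;                      /* shift the digits read so far one place up */
        x += (unsigned long)(*p - '0'); /* add the value of the current digit */
        p++;
    }
    printf("%lu\n", x);
    return 0;
}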
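The remark about changing the base can be made concrete with a small sketch. It assumes bases from 2 to 36, with letters standing for the digit values 10 and up (the same convention strtoul() uses). The names digit_value and parse_base are made up for this illustration, and overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one digit character, or -1 if it is not
   a valid digit in the given base (2..36). */
static int digit_value(char c, unsigned base)
{
    unsigned char uc = (unsigned char)c;
    int v;

    if (isdigit(uc))
        v = uc - '0';
    else if (isalpha(uc))
        v = toupper(uc) - 'A' + 10;   /* assumes A..Z are contiguous, as in ASCII */
    else
        return -1;
    return (unsigned)v < base ? v : -1;
}

/* Horner's rule: x = ((d0 * base + d1) * base + d2) ...
   Stops at the first character that is not a digit in this base. */
unsigned long parse_base(const char *s, unsigned base)
{
    unsigned long x = 0;
    int d;

    assert(s != NULL && base >= 2 && base <= 36);
    while ((d = digit_value(*s, base)) >= 0) {
        x = x * base + (unsigned long)d;
        s++;
    }
    return x;
}
```

Calling parse_base("5363", 10) follows exactly the loop of the program above; only the digit computation changes when the base does.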

Your function has an infinite loop: as i is unsigned, i >= 0 is always true.
It can be improved in different ways:
you should compute the length of str just once. strlen() is not cheap, it must scan the string until it finds the null terminator. The compiler is not always capable of optimizing away redundant calls for the same argument.
power could be computed incrementally, avoiding the need for a nested loop.
you should not use the name atoi as it is a standard function in the C library. Unless you implement its specification exactly and correctly, you should use a different name.
Here is a corrected and improved version:
unsigned long long int atoi_power(const char str[]) {
size_t i, len = strlen(str);
unsigned long long int power = 1, num = 0;
for (i = len; i-- > 0; ) {
num += (str[i] - '0') * power;
power *= 10;
}
return num;
}
Modified this way, the function should perform similarly to the atoi_imp version. Note, however, that they do not implement the same semantics: atoi_power must be given a string consisting only of digits, whereas atoi_imp accepts trailing characters.
As a matter of fact, neither atoi_imp nor atoi_power implements the specification of atoi extended to handle larger unsigned integers:
atoi ignores any leading white space characters.
atoi accepts an optional sign, either '+' or '-'.
atoi consumes all following decimal digits; the behavior on overflow is undefined.
atoi ignores any trailing characters that are not decimal digits.
Given these semantics, the natural implementation of atoi is that of atoi_imp with extra tests. Note that even strtoull(), which you could use to implement your function, handles white space and an optional sign, although the conversion of negative values may give surprising results.
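As a rough sketch of "atoi_imp with extra tests", the four rules listed above could be put together as follows. The name my_atoull is hypothetical, and the choice to negate modulo 2^N on a '-' sign is an assumption made here to mirror strtoull(); overflow simply wraps around.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch of atoi-like parsing into unsigned long long (hypothetical name). */
unsigned long long my_atoull(const char *str)
{
    unsigned long long num = 0;
    int negative = 0;

    assert(str != NULL);
    while (isspace((unsigned char)*str))      /* skip leading white space */
        str++;
    if (*str == '+' || *str == '-') {         /* optional sign */
        negative = (*str == '-');
        str++;
    }
    while (isdigit((unsigned char)*str)) {    /* consume decimal digits */
        num = num * 10 + (unsigned long long)(*str - '0');
        str++;
    }
    /* Trailing characters are ignored. A '-' sign negates the value using
       unsigned wrap-around, which is where the surprising results come from. */
    return negative ? 0 - num : num;
}
```

With this sketch, my_atoull("-1") returns the largest unsigned long long value, which illustrates the surprising result for negative input mentioned above.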

Related

issue convert double range number to binary

I have a problem converting a number that is outside the range of int to binary, as shown below:
void intToBin(int digit) {
int b;
int k = 0;
char *bits;
int i;
bits= (char *) malloc(sizeof(char));
while (digit) {
b = digit % 2;
digit = digit / 2;
bits[k] = b;
k++;
}
for ( i = k - 1; i >= 0; i--) {
printf("%d", bits[i]);
}
}
But as you can see, the function's argument is an int.
I ran into an error when I tried intToBin(10329216702565230),
because 10329216702565230 is outside the range of int.
How can I extend this function to convert numbers beyond the int range to binary?
Update
I've updated the code as shown below:
void intToBin(uint64_t digit) {
int b;
int k = 0;
char *bits;
int i;
bits = malloc(sizeof digit * 64);
while (digit) {
b = digit % 2;
digit = digit / 2;
bits[k] = b;
k++;
}
for ( i = k - 1; i >= 0; i--) {
printf("%d", bits[i]);
}
}
But I don't understand what I should do to get the two's complement.

The solution is to use a type which supports that range of numbers.
Use unsigned long long or uint64_t (assuming you are passing non-negative integers; otherwise use long long or int64_t). unsigned long long is at least 64 bits wide and can be wider. Since the OP commented that 64 bits of output are expected, it is better to use (u)int64_t. Then you call the function like this:
intToBin(10329216702565230ULL)
In case you want to use negative numbers, use long long and call it like this:
intToBin(10329216702565230LL)
You didn't allocate enough memory: you allocated a single char and then accessed memory beyond it, which is undefined behavior. You could solve this by reallocating inside the loop (growing the buffer by one char at a time), but instead of calling realloc many times, it is simpler to allocate memory for 64 chars up front and store the result there. In the end, the leftover space can be released with another realloc call.
You don't need to cast the return value of malloc (the void * to char * conversion is done implicitly).
You didn't check the return value of malloc. malloc may return NULL, and you have to handle that case separately. For example:
#define NBITS 64
...
...
bits = malloc(NBITS);
if( bits == NULL ){
perror("malloc failed");
exit(EXIT_FAILURE);
}
Note: The magic number 64 comes from the assumption that unsigned long long is at least 64 bits wide. If the type is wider than 64 bits, the buffer has to grow accordingly. A better choice is to use what chux suggested: sizeof digit * CHAR_BIT.
Also, store each digit as a character:
bits[k] = b+'0';
This stores the ASCII code of the digit, which you can then print like this:
printf("%c", bits[i]);
You forgot to free the allocated memory. Without freeing it (free(bits)), you have a memory leak (as pointed out in David C. Rankin's comment).
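Putting the points above together, a minimal sketch could look like this. It returns the digits as a string instead of printing them, which is a design choice made for this illustration; the name uint64_to_bin is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns a malloc'ed string with the binary digits of value, most
   significant bit first and without leading zeros, or NULL if the
   allocation fails. The caller must free() the result. */
char *uint64_to_bin(uint64_t value)
{
    size_t nbits = sizeof value * CHAR_BIT;   /* 64 for uint64_t */
    char *bits = malloc(nbits + 1);           /* +1 for the terminator */
    size_t k = 0, i;

    if (bits == NULL)
        return NULL;
    do {                                      /* do/while so that 0 gives "0" */
        bits[k++] = (char)('0' + (value % 2));
        value /= 2;
    } while (value != 0);
    assert(k <= nbits);
    for (i = 0; i < k / 2; i++) {             /* digits were produced LSB first */
        char tmp = bits[i];
        bits[i] = bits[k - 1 - i];
        bits[k - 1 - i] = tmp;
    }
    bits[k] = '\0';
    return bits;
}
```

As for the two's complement asked about in the question: converting a negative int64_t to uint64_t is defined to wrap modulo 2^64, so passing (uint64_t)n for a negative n yields its 64-bit two's complement pattern.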

void intToBin(int digit)
{
int b;
int k = 0;
char *bits;
int i;
bits= (char *) malloc(sizeof(char));
while (digit) {
b = digit % 2;
digit = digit / 2;
bits[k] = b;
k++;
}
for ( i = k - 1; i >= 0; i--) {
printf("%d", bits[i]);
}
}
The answer is simple,
Replace int with int64_t to use 64 bits instead of 32.
Please try it and let us know

Comparing unsigned and signed int

I guess this is one of the classic questions.
As far as I know, a comparison between an unsigned int and a signed int is performed using unsigned arithmetic, which means that when length is 0, length - 1 becomes the maximum 32-bit unsigned value instead of -1.
The code can be fixed either by declaring length to be an int, or by changing the test of the for loop to i < length.
Declaring length to be an int is easy to understand, but I don't really see why changing the loop test to i < length works.
Suppose we have the comparison 5 < -1: performed using unsigned arithmetic, on my computer it becomes 5 < 4294967295. How can this be a solution? It seems like it will access elements outside the array.
Code
float sum_elements(float a[], unsigned length)
{
int i;
float result = 0;
for (i = 0; i <= length-1; i++)
result += a[i];
return result;
}

Consider the condition:
i <= length-1
As you mentioned, if length is zero, length-1 wraps around and you end up with a comparison like 5 <= 4294967295, which is true.
Changing the condition to "i < length" prevents this, because no subtraction takes place.
Changing the type of the variable "i" to "unsigned" also makes sense because (a) it is an array index and (b) you are comparing it with an "unsigned".
So I would prefer this code.
float sum_elements(float a[], unsigned length)
{
unsigned i = 0;
//float result = 0.0; //Refer comment section.
double result = 0.0;
for (i = 0; i < length; i++)
result += (double)a[i];
return result;
}

Option #1:
for (i = 0; i <= (int)length-1; i++)
Option #2:
for (i = 0; i+1 <= length; i++)
Option #3:
for (i = 0; i < length; i++)

This is the compiler's job: when the operands of an expression have different types, it inserts implicit conversions. If it sees something like:
float a = b + 60
and b is a float, the 60 is converted to 60.0 by the compiler.
I think the same thing happens here. In
i <= length - 1
the int operands are converted to unsigned int, so the expression is evaluated as:
(unsigned int)i <= length - (unsigned int)1
If you want the compiler to warn about such mixed signed/unsigned comparisons, use the flag -Wextra (in GCC it enables -Wsign-compare for C).

A pedantic <= comparison of an int against an unsigned would test for negativity first.
for (i = 0; i < 0 || ((unsigned) i) <= length-1; i++)
Removing the -1 avoids the wrap-around when length is 0.
for (i = 0; i < 0 || ((unsigned) i) < length; i++)
A good compiler will likely optimize the code so that the two comparisons do not both end up in the executable.
If -Wsign-conversion or its equivalent compiler option is not used, drop the cast for cleaner code, as suggested by @R..
for (i = 0; i < 0 || i < length; i++)
As well commented by @chqrlie, the comparison may work, but subsequent operations on i may be a problem. In particular, when i == INT_MAX, the i++ is UB.
Better to use size_t (an unsigned type) for array size computation and indexing.
float sum_elements(float a[], size_t length) {
float result = 0;
size_t i;
for (i = 0; i < length; i++)
result += a[i];
return result;
}

Your code will not perform as expected in 2 cases:
if length == 0, length - 1, computed using unsigned arithmetic, is a very large number, and the comparison i <= length - 1 is always true because it is also performed using unsigned arithmetic.
if length is larger than the maximum int value, i can never reach such a value: although the comparison performed using unsigned arithmetic works as expected, the indexing a[i] will be incorrect on 64-bit systems, where the negative index points outside the array.
The compiler correctly diagnoses a real problem. Using a signed type for i and comparing that to an unsigned length expression can lead to unexpected behavior. Correct the problem this way:
float sum_elements(float a[], unsigned length) {
double result = 0.0;
for (unsigned i = 0; i < length; i++) {
result += a[i];
}
return result;
}
Notes:
the types for length and i really should be size_t as this may be a larger type than unsigned.
the sum should be computed using double arithmetic to achieve better precision than with float. Precision will be better, but still limited: summing the array elements in a different order can produce a different result.
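Both notes can be applied at once. The following is only a sketch of that variant: it keeps the float return type of the original function (an assumption, since the caller may prefer a double), and the const qualifier and the assert are additions made here.

```c
#include <assert.h>
#include <stddef.h>

/* Variant of sum_elements using size_t for the length and the index,
   and a double accumulator that is narrowed to float once at the end. */
float sum_elements(const float a[], size_t length)
{
    double result = 0.0;
    size_t i;

    assert(a != NULL || length == 0);
    /* i < length involves no subtraction, so length == 0 simply skips the loop */
    for (i = 0; i < length; i++)
        result += a[i];
    return (float)result;
}
```

Because both i and length are size_t, there is no signed/unsigned comparison left for the compiler to warn about.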

Lose the i variable, to save a little stack space and make the function faster.
float sum_elements(float a[], unsigned length)
{
float result = 0;
while (length--)
result += *a++;
return result;
}

How I can Gave The Variable To The Array in c [closed]

I am trying to solve a problem, I have one integer variable such as
unsigned int x = 456;
Now I want to decompose my integer to an array of its digits, like so:
unsigned int i[] = {4,5,6};
Then I want to convert each element of the array to a string or char.
Any ideas?
I use Avr studio

#include <stdio.h>
int main(){
unsigned int x = 456;
int len = snprintf(NULL, 0, "%u", x);
unsigned int i[len];
unsigned int wk = x;
for(int k=len-1;k>=0;--k, wk/=10)
i[k]=wk % 10;
for(int k=0;k<len;++k)
printf("%u", i[k]);
char string[len+1];
for(int k=0;k<len;++k)
sprintf(string+k, "%u", i[k]);
printf("\n%s\n", string);
return 0;
}

The easiest way to convert an integer to a string is to use a library function such as snprintf().
If you don't have the standard C library, you can use the classic remainder/division trick:
void uint_to_string(char *buf, unsigned int x, unsigned int digits)
{
buf[digits] = '\0';
while(digits > 0)
{
buf[--digits] = '0' + (x % 10);
x /= 10;
}
}
Note that the above builds the string "backwards" (right to left), since that's easiest. It generates a zero-padded result; you can avoid the padding by adding code that breaks out of the loop when x == 0 (after the digit has been generated on the first line of the loop's body). In that case the digits are right-aligned in buf, starting at the index where the loop stopped.
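One way to get an unpadded, left-aligned result is to collect the digits in a small temporary buffer and copy them out in reverse. This is a sketch of that alternative rather than the break-out fix described above; the name uint_to_string_nopad and the buffer-size bound are assumptions made for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Writes the decimal digits of x into buf without zero padding and returns
   the number of digits written. buf must have room for the digits plus the
   terminator (11 bytes suffice for a 32-bit unsigned int). */
size_t uint_to_string_nopad(char *buf, unsigned int x)
{
    /* A number never has more decimal digits than octal digits,
       and it has at most ceil(bits / 3) octal digits. */
    char tmp[(sizeof(unsigned int) * CHAR_BIT + 2) / 3];
    size_t n = 0, i;

    assert(buf != NULL);
    do {                                   /* do/while so that 0 yields "0" */
        tmp[n++] = (char)('0' + (x % 10));
        x /= 10;
    } while (x != 0);
    for (i = 0; i < n; i++)                /* tmp holds the digits reversed */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Unlike uint_to_string, this version does not need to be told the number of digits in advance; it reports it through the return value.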

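#include <stdio.h>
int main(void)
{
    unsigned int x = 456;
    char i[10]; /* enough for the 10 decimal digits of a 32-bit unsigned int */
    int j, k;
    /* x % 10 yields the digits from right to left, so i[] holds them reversed */
    for (j = 0; x != 0; j++) {
        i[j] = x % 10 + '0';
        x /= 10;
    }
    /* print from the last stored digit back to the first */
    for (k = j - 1; k >= 0; k--)
        printf("%c ", i[k]);
    return 0;
}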

The answer to this is slightly dependent on your actual problem. Do you need the array of digits, or is this merely the intermediate step you yourself came up with to convert an unsigned integer to a string?
If all you need is the string, it would be much simpler to use a function such as sprintf or snprintf.
#include <stdio.h>
//...
unsigned int x = 456;
char digits[50]; // 50 is chosen arbitrarily
snprintf(digits, 50, "%u", x);
//...
This yields a null-terminated string in digits that looks exactly like the string representation of x, with the caveat that a representation needing more than 49 characters would be truncated. (That cannot happen here: a 32-bit unsigned int has at most 10 decimal digits.)
If you want the char* to be exactly the correct size to hold the number, it's only a little more difficult.
#include <stdio.h>
// ...
unsigned int x = 456;
int numDigits = snprintf(NULL, 0, "%u", x); // snprintf returns the number of characters that would be written, not counting the terminator.
char digits[numDigits + 1]; // The +1 is for null-termination
sprintf(digits, "%u", x);
// ...
Without the standard library available, it gets a bit more hairy, but not unmanageably so. Unfortunately, you're going to need two passes that do almost exactly the same things: one to count the digits and one to actually assign them to your array.
int countDigits(unsigned int x);
void fillDigitArray(char *digits, unsigned int x, int numDigits);
int main( void ) {
    // ...
    unsigned int x = 456;
    int numDigits = countDigits(x);
    char digits[numDigits+1]; // The +1 is for null-termination
    fillDigitArray(digits, x, numDigits);
    // ...
}
void fillDigitArray(char *digits, unsigned int x, int numDigits) {
    int i;
    // This requires perhaps a little explaining
    // By far the easiest way to get individual digits of a number is with
    // x % 10, but this gives us the righthand-most digits
    // Thus by counting DOWN, we're filling our buffer from the RIGHT
    // making up for the "backwards" nature.
    digits[numDigits] = 0;
    for (i = numDigits-1; i >= 0; i--) {
        digits[i] = '0' + (x%10);
        x /= 10;
    }
}
int countDigits(unsigned int x) {
    // Special case
    if( x == 0 ) {
        return 1;
    }
    int numDigits = 0; // must start at 0 before counting
    while(x > 0) {
        x /= 10;
        numDigits++;
    }
    return numDigits;
}
Extracting it into an array of unsigned ints is similar, just make digits an unsigned int * rather than a char *, and instead of making digits[i] = '0' + x%10 make it digits[i] = x%10.
Edit: In the interest of fully explaining the example, x%10 is "x mod 10", which can roughly be stated as "give me the rightmost digit of x". x /= 10, while dividing x by 10 and overwriting x with the new value, is essentially just our way of saying "make the right-most digit of x what is currently in the 10's place".
The '0' + x%10 part is admittedly a bit of magic. The ASCII character value for the digit "0" isn't actually 0, but the digits 0-9 are laid out in order. So if the rightmost digit of x is 0, we get '0'+0, which is '0', and if the rightmost digit is 9, '0'+9 becomes '9'. Using this allows us to bypass an ugly if or switch statement to map the number to the right character.

Getting each digit is a math/logic problem. You need to use the modulus operator which gives you the remainder of the division of the operands.
#include <stdio.h>
static char digits[10];
int main(void) {
int number = 4056;
int remainder = 0;
int i = 0;
while(number > 0 && i < (int)sizeof digits) { /* stop before the buffer is full */
remainder = number % 10;
number /= 10;
digits[i] = '0' + remainder; /* '0' is 48 in ASCII */
i++;
}
for(i--; i >= 0; i--) {
printf("%c", digits[i]);
}
printf("\n");
}

Is there a strtol equivalent that does not require a null-terminated string?

Is there a standard C function similar to strtol which will take a char* and a length for a non-null-terminated string?
I know that I could copy out the string into a null-terminated region, but for efficiency reasons that is undesirable.

No such function in the standard library. You will either have to use the temporary buffer method, or write your own function from scratch.

To answer your question: no, there is no standard function, but it is simple enough to write your own:
#include <stdio.h>
#include <ctype.h>
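int natoi(const char *s, int n)
{
int x = 0;
/* test the remaining length first, so s[0] is never read beyond the n-th character */
while(n-- > 0 && isdigit((unsigned char)s[0]))
{
x = x * 10 + (s[0] - '0');
s++;
}
return x;
}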
int main(int argc, char*argv[])
{
int i;
for(i = 1; i < argc; i++)
printf("%d: %d\n", i, natoi(argv[i], 5));
}

strntol is probably what you're after... it's not standard C, though.

If you're that pressed for efficiency, you can probably justify the time to write and debug your own.
But: just do it with a copy first; you probably have an upper bound for how long the string can be (a decimal numeral that fits in a long has a strict upper bound on its maximum length), so you can use a static buffer. Then profile your entire application, and see if the copying/conversion really is a bottleneck. If it really is, then you know you need to write your own.
Here's a rough (untested, browser-written) starting point:
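#include <ctype.h>
#include <stddef.h>
long limited_strtol(const char *string, size_t len)
{
    long sign = 1;
    long value = 0;
    /* each leading '-' flips the sign */
    for(; len > 0 && *string == '-'; string++, len--)
        sign *= -1;
    /* the loop header already advances string and len, so the body must not */
    for(; len > 0 && isdigit((unsigned char)*string); string++, len--)
    {
        value *= 10;
        value += *string - '0';
    }
    return sign * value;
}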
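For comparison, the copy-based approach recommended first can be sketched as follows. The name strtol_copy and the 32-byte buffer are assumptions made for this illustration; longer inputs are silently truncated here, which a real implementation would probably want to report.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parses at most len characters of s as a base-10 long by copying them
   into a small NUL-terminated buffer and handing that to strtol(). */
long strtol_copy(const char *s, size_t len)
{
    char buf[32];   /* assumed bound: room for a sign and the digits of a 64-bit long */

    assert(s != NULL);
    if (len > sizeof buf - 1)
        len = sizeof buf - 1;     /* truncate overly long input */
    memcpy(buf, s, len);
    buf[len] = '\0';
    return strtol(buf, NULL, 10);
}
```

This keeps all of strtol()'s behavior (white space, sign, overflow reporting through errno) for the price of one small memcpy, which is what makes it a reasonable baseline to profile against.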

K&R Exercise 2-3 "Hex to int converter" Problem

The program I wrote works for input consisting of only a single hexadecimal digit. (Probably not the most elegant solution, but I'm a new programmer.) My question is, how would I go about handling multiple hexadecimal digits, such as 0xAF or 0xFF? I'm not exactly sure, and I seem to have confused myself greatly in the attempt. I'm not asking for someone to hold my hand, but for a tip on where I've gone wrong in this code and thoughts on how to fix it.
Thanks :)
/* Exercise 2-3. Write the function htoi(s), which converts a string of
* hexadecimal digits (including an optional 0x or 0X) into it's equivalent
* integer value. The allowable digits are 0...9 - A...F and a...f.
*
*/
#include <stdio.h>
#include <string.h>
#define NL '\n'
#define MAX 24
int htoi(char *hexd);
int
main(void)
{
char str[MAX] = {0};
char hex[] = "0123456789ABCDEFabcdef\0";
int c;
int i;
int x = 0;
while((c = getchar()) != EOF) {
for(i = 0; hex[i] != '\0'; i++) {
if(c == hex[i])
str[x++] = c;
}
if(c == NL) {
printf("%d\n", htoi(str));
x = 0, i = x;
}
}
return 0;
}
int
htoi(char *hexd)
{
int i;
int n = 0;
for(i = 0; isdigit(hexd[i]); i++)
n = (16 * i) + (hexd[i] - '0');
for(i = 0; isupper(hexd[i]); i++) /* Let's just deal with lowercase characters */
hexd[i] = hexd[i] + 'a' - 'A';
for(i = 0; islower(hexd[i]); i++) {
hexd[i] = hexd[i] - 'a';
n = (16 + i) + hexd[i] + 10;
n = hexd[i] + 10;
}
return n;
}

Someone has already asked this (hex to int, K&R 2.3).
Take a look; there are many good answers, but you have to fill in the blanks.
Hex to Decimal conversion [K&R exercise]
Edit:
In
char hex[] = "0123456789ABCDEFabcdef\0";
the \0 is not necessary: a string literal is already NUL-terminated. Without the explicit \0 the array is 22 characters + 1 terminator = 23 bytes long; with it, the array holds two terminators and is 24 bytes long.

I'll pick on one loop, and leave it to you to rethink your implementation. Specifically this:
for(i = 0; isdigit(hexd[i]); i++)
n = (16 * i) + (hexd[i] - '0');
doesn't do what you probably think it does...
It only processes the first span of characters where isdigit() is TRUE.
It stops on the first character where isdigit() is FALSE.
It doesn't run past the end because isdigit('\0') is known to be FALSE. I'm concerned that might be accidentally correct, though.
It does correctly convert a hex number that can be expressed solely with digits 0-9.
Things to think about for the whole program:
Generally, prefer to not modify input strings unless the modification is a valuable side effect. In your example code, you are forcing the string to lower case in-place. Modifying the input string in-place means that a user writing htoi("1234") is invoking undefined behavior. You really don't want to do that.
Only one of the loops over digits is going to process a non-zero number of digits.
What happens if I send 0123456789ABCDEF0123456789ABCDEF to stdin?
What do you expect to get for 80000000? What did you get? Are you surprised?
Personally, I wouldn't use NL for '\n'. C usage pretty much expects to see \n in a lot of contexts where the macro is not convenient, so it is better to just get used to it now...
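For readers who want to compare notes after rethinking their own loops, here is one possible shape of the result. It is only a sketch under a few assumptions: the input is not modified, the result is unsigned (so a value like 80000000 does not overflow a signed int), overflow of longer inputs is not checked, and the name htoi_sketch is made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Accepts an optional 0x/0X prefix and stops at the first character
   that is not a hexadecimal digit. */
unsigned int htoi_sketch(const char *s)
{
    unsigned int n = 0;

    assert(s != NULL);
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        s += 2;
    for (; isxdigit((unsigned char)*s); s++) {
        int c = tolower((unsigned char)*s);
        /* assumes a..f are contiguous, as in ASCII */
        unsigned int digit = isdigit(c) ? (unsigned int)(c - '0')
                                        : (unsigned int)(c - 'a' + 10);
        n = 16 * n + digit;   /* move the previous digits up one hex place */
    }
    return n;
}
```

The key difference from the loop picked apart above is that every digit, whether 0-9 or a-f, goes through the same single loop, and the running value n (not the index i) is multiplied by 16.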

I think that the MAX size of the string should be either 10 or 18 instead of 24. (If you have already checked the size of int on your machine and followed the reasoning below, it would be beneficial to include it as a comment in your code.)
10: since htoi() returns an int, and an int is usually 4 bytes (check your system's too), the hexadecimal number can be at most 8 digits long (4 bits per hex digit, 8 bits per byte), and we want to allow for the optional 0x or 0X.
18: would be better if htoi() returned a long and a long is 8 bytes (again, check your system's), so the hexadecimal number can be at most 16 digits long, and we want to allow for the optional 0x or 0X.
Please note that the sizes of int and long are machine dependent; please look at exercise 2.1 in the K&R book to find them.

Here is my version of a classic htoi() function to convert multi-digit hexadecimal values into decimal integers. It's a full working program; compile it and run it.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
int htoi(const char*);
int getRawInt(char);
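int main(int argc, char **argv) {
char hex[32]; /* room for 31 characters plus the terminator */
printf("Enter a hexadecimal number (e.g. 33A)\n");
if (scanf("%31s", hex) != 1) /* the width limit prevents a buffer overflow */
return 1;
printf("Hexadecimal %s in decimal is %d\n", hex, htoi(hex)); // for 33A the result is 826
return 0;
}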
int htoi(const char *hex) {
const int LEN = strlen(hex) -1;
int power = 1;
int dec = 0;
for(int i = LEN; i >= 0; --i) {
dec += getRawInt(hex[i]) * power;
power *= 16;
}
return dec;
}
int getRawInt(char c) {
if(isalpha(c)) {
return toupper(c) - 'A' + 10;
} return c-'0';
}
