Algorithm to parse an int from a string in one pass - c

I'm trying to write a function that parses an integer from a string representation.
My problem is that I don't know how to do this with one pass through the string. If I knew ahead of time that the input contained only characters in the range '0', '1', ..., '9' and that the string was of length n, I could of course calculate
character_1 * 10^(n-1) + character_2 * 10^(n-2) + .... + character_n * 10^0
but I want to deal with the general scenario as I've presented it.
I'm not looking for a library function, but an algorithm to achieve this in "pure C".
Here's the code I started from:
int parse_int (const char * c1, const char * c2, int * i)
{
/*
[c1, c2): Half-open range of characters in the string
i: Integer whose string representation will be converted
Returns the number of characters parsed.
Exs. "2342kjsd32" returns 4, since the first 4 characters were parsed.
"hhsd3b23" returns 0
*/
int n = 0;
*i = 0;
while (c1!= c2)
{
char c = *c1;
if (c >= '0' && c <= '9')
{
}
}
return n;
}

As some of the comments and answers suggest, and to put it a bit more clearly: you have to "shift" the result left by multiplying it by 10 in every iteration before adding the new digit.
Indeed, this should remind us of Horner's method. As you have recognized, the result can be written like a polynomial:
result = c1 * 10^(n-1) + c2 * 10^(n-2) + ... + cn * 10^0
This can be rewritten as:
result = cn + 10*(... + 10*(c2 + 10*c1))
This is the form the approach is based on. The formula already shows that you don't need to know up front which power of 10 the first digit will be multiplied by. For "2342", the nested form evaluates as ((2*10 + 3)*10 + 4)*10 + 2 = 2342.
Here's an example:
#include <stdio.h>
int parse_int(const char * begin, const char * end, int * result) {
int d = 0;
for (*result = 0; begin != end; d++, begin++) {
int digit = *begin - '0';
if (digit >= 0 && digit < 10) {
*result *= 10;
*result += digit;
}
else break;
}
return d;
}
int main() {
char arr[] = "2342kjsd32";
int result;
int ndigits = parse_int(arr, arr + sizeof(arr) - 1, &result); /* end excludes the terminating '\0' */
printf("%d digits parsed, got: %d\n", ndigits, result);
return 0;
}
The same can be achieved with sscanf() for anyone who is fine with using the C standard library (this also handles negative numbers):
#include <stdio.h>
int main() {
char arr[] = "2342kjsd32";
int result, ndigits;
sscanf(arr, "%d%n", &result, &ndigits);
printf("%d digits parsed, got: %d\n", ndigits, result);
return 0;
}
The output is (both implementations):
$ gcc test.c && ./a.out
4 digits parsed, got: 2342
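The hand-written loop above only accepts digits, whereas the sscanf() version also accepts a sign. As a rough sketch of how the loop might be extended, here is a variant that takes an optional '+' or '-' first. The name parse_signed_int is made up for illustration, and overflow is deliberately not checked.

```c
/* Hypothetical helper, not from the original answer: parses an optional
   '+' or '-' followed by decimal digits from the range [begin, end).
   Returns the number of characters consumed (0 if no digit was found).
   Overflow is NOT checked in this sketch. */
int parse_signed_int(const char *begin, const char *end, int *result)
{
    const char *p = begin;
    int sign = 1;
    int value = 0;

    if (p != end && (*p == '+' || *p == '-')) {
        if (*p == '-')
            sign = -1;
        p++;
    }

    const char *first_digit = p;
    while (p != end && *p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');   /* Horner step */
        p++;
    }

    if (p == first_digit) {   /* a lone sign, or no digits at all */
        *result = 0;
        return 0;
    }
    *result = sign * value;
    return (int)(p - begin);
}
```

Called on the five characters of "-17xy", this should consume 3 characters and store -17. Note that, like sscanf() with "%d%n", the returned count includes the sign character.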

I think this is a good way to count the parsed characters:
int parse(char *str)
{
int k = 0;
while(*str)
{
if(*str < '0' || *str > '9') /* stop at the first non-digit */
break;
str++;
k++;
}
return k;
}

Here's a working version:
#include <stdio.h>
int parse_int (const char * c1, const char * c2, int * i)
{
/*
[c1, c2): Half-open range of characters in the string
i: Integer whose string representation will be converted
Returns the number of characters parsed.
Exs. "2342kjsd32" returns 4, since the first 4 characters were parsed.
"hhsd3b23" returns 0
*/
int n = 0;
*i = 0;
for (; c1 != c2; c1++)
{
char c = *c1;
if (c >= '0' && c <= '9')
{
++n;
*i = *i * 10 + c - '0';
}
else
{
break;
}
}
return n;
}
int main()
{
int i;
char const* c1 = "2342kjsd32";
int n = parse_int(c1, c1+10, &i);
printf("n: %d, i: %d\n", n, i);
return 0;
}
Output:
n: 4, i: 2342

Related

Strict atoi/strtol function

In C when I use:
int x = 0;
char str[] = " \t123abc";
x = atoi(str);
printf("%d\n", x); //123
123 is printed. I would like to know if there is a 'strict' C function that returns 0 if the string isn't fully an integer. (I don't care about the number sign (always positive) and the base (always 10)).
Some examples:
" \t123abc" -> 0
" 123abc" -> 0
"123abc" -> 0
" 123 " -> 0
"123" -> 123
"123\n" -> 0
So far I have written an int sstoi (char *str) function to do it:
static const unsigned int pow10[10] = {1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000};
int sstoi (char *str) {
int result = 0, length = strlen(str);
char c;
if (length > 10) return 0;
for (int i = 0; i < length; i++) {
c = str[i];
if (c < 48 || c > 57) return 0;
result += (c-48)*pow10[length-i-1];
}
return result;
}
I believe that it should be improved: can someone help me out?
You can first check that every character in the string is a digit, and then use atoi:
if(strspn(str, "0123456789") == strlen(str))
{
x = atoi(str);
}
else
{
x = 0;
}
strtol() can set a pointer to the first invalid character in the string it scans, which would be the first character after the number. If you check that the first character is valid, you can use this pointer to see whether there were characters following the number or not.
#include <ctype.h>
#include <stdlib.h>
int sstoi(char *s) {
char *ep; // to point to first char after the number
if (isdigit((unsigned char)*s)) { // make sure first char is a digit
// convert, and find first invalid char
int x = (int)strtol(s, &ep, 10);
// return conversion if first invalid char was the
// terminating null
if (!(*ep))
return x;
}
return 0; // otherwise return 0
}
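One thing the sstoi() above does not cover is a digit string that is too large for an int. As a sketch under that assumption, the variant below also checks errno and INT_MAX and returns 0 in that case as well. The name sstoi_checked is hypothetical; strtol(), errno and ERANGE are standard.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical variant of sstoi(): returns 0 unless the whole string is
   a run of decimal digits whose value fits in an int. */
int sstoi_checked(const char *s)
{
    char *ep;
    long x;

    if (!isdigit((unsigned char)*s))    /* rejects "", whitespace, signs */
        return 0;

    errno = 0;
    x = strtol(s, &ep, 10);
    if (*ep != '\0')                    /* trailing characters */
        return 0;
    if (errno == ERANGE || x > INT_MAX) /* value does not fit */
        return 0;
    return (int)x;
}
```

With the examples from the question, "123" converts to 123, while " 123", "123abc" and "123\n" all give 0.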
I could not find a built-in function that directly returns 0 if the input is invalid.
You can use the standard function isdigit(int c) to check that every character of the input is a digit, and return 0 if any of them is not.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdbool.h>
bool isValidNumber(char digitsArray[]){
bool isNumber = true;
for(int i=0; i < strlen(digitsArray); i++){
if(!isdigit((unsigned char)digitsArray[i])){
isNumber = false;
break;
}
}
return isNumber;
}
int getNumber(char digitsArray[]){
if(isValidNumber(digitsArray)){
return atoi(digitsArray);
}
return 0;
}
int main()
{
char str[] = "123\n";
int number = getNumber(str);
printf("Number is %d", number);
return 0;
}
Edit: isValidNumber rejects negative numbers, because the '-' sign is not a digit.

Manipulating Big Numbers as strings

I have a problem that goes:
Create a C program that inputs large integers as strings.
Then every character is converted into the corresponding digit.
After that I have to create a function addBigNumbers() that takes 3 arrays.
addBigNumbers(char *a1, char *a2, char *res)
a1 and a2 will contain the 2 large numbers that I want to add; res will contain their sum as a sequence of digits. The function should also check that the strings contain only digits.
If they contain only digits, then res equals 1 and the sum of the numbers is printed; otherwise res equals 0 (the maximum number length is 1000).
After that first function we want to create a function for subtraction.
So far I haven't gotten to subtraction, since I'm stuck on the first one and need your help.
This is the code that I have so far:
#include <stdio.h>
#include <stdlib.h>
#define N 1000
/* run this program using the console pauser or add your own getch, system("pause") or input loop */
int addHugeNumbers(char *a1, char *a2, char *res){
int y=0, u=0, h=0;
res=strcat(a1,a2);
if(strlen(a1)>strlen(a2)){
y=atoi(a1);
u=atoi(a2);
h=y+u;
}
else{
y=atoi(a1);
u=atoi(a2);
h=u+y;
}
printf("%d", h);
}
int main(int argc, char *argv[]) {
char res[N];
char a1[N/2];
char a2[N/2];
scanf("%s", &a1);
scanf("%s", &a2);
addHugeNumbers(a1, a2, res);
return 0;
}
The problem I have is that if I input e.g. 23 23, it outputs 2346, which is obviously wrong even though the 46 part is correct; when I input 1234 123, it outputs 1234246, which is all wrong.
Where it gets weird is that if I input something like 1234r 123, or anything else that has a letter in it, it outputs the correct sum.
The problem is res=strcat(a1,a2), which does something very different from what you think: it appends a2 to a1 in place, and it does not create a new string. See, for example, the strcat definition at cppreference.com:
char *strcat( char *dest, const char *src )
Appends a copy of the null-terminated byte string pointed to by src
to the end of the null-terminated byte string pointed to by dest. The
character src[0] replaces the null terminator at the end of dest. The
resulting byte string is null-terminated.
So you are modifying your input before calculating anything, and that is what you will observe when stepping through with a debugger. With the input 23 23, a1 becomes "2323", so the program prints 2323 + 23 = 2346.
Further, scanf("%s", &a1) looks suspicious; it should be scanf("%s", a1);. Your compiler should have warned you.
You should probably rethink addBigNumbers and add the digits in a loop, rather than converting the strings to integral data types, which are always limited in range. This is not really a beginner's task in C; here is a fragment to study:
#include <stdio.h>
#include <string.h>
#define N 1000
int addHugeNumbers(char *a1, char *a2, char *res){
char resultBuffer[N];
int i1 = (int)strlen(a1);
int i2 = (int)strlen(a2);
int carryOver = 0;
int ri = 0;
while (i1 > 0 || i2 > 0) { // until both inputs have been read to their beginning
i1--;
i2--;
// read single digits and consider that a string might have already
// been read to its beginning
int d1 = i1 >= 0 ? a1[i1] - '0' : 0;
int d2 = i2 >= 0 ? a2[i2] - '0' : 0;
// check for invalid input
if (d1 < 0 || d1 > 9 || d2 < 0 || d2 > 9) {
return 0;
}
// calculate result digit, taking previous carryOver into account
int digitSum = d1 + d2 + carryOver;
carryOver = digitSum / 10;
digitSum %= 10;
resultBuffer[ri++] = digitSum + '0';
}
// write the last carryOver, if any
if (carryOver > 0) {
resultBuffer[ri++] = carryOver + '0';
}
// copy resultBuffer into res in reverse order:
while(ri--) {
*res++ = resultBuffer[ri];
}
// terminate res-string
*res = '\0';
return 1;
}
int main(int argc, char *argv[]) {
char res[N];
char a1[N/2] = "123412341234";
char a2[N/2] = "1231";
if (addHugeNumbers(a1, a2, res)) {
printf("result: %s\n", res);
} else {
printf("invalid number.\n");
}
return 0;
}
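The question also asks for subtraction, which the fragment above does not cover. A minimal sketch along the same lines, working from the last digit with a borrow instead of a carry, might look like this. The name subHugeNumbers and the MAX_DIGITS limit are assumptions for illustration, and the sketch only handles the case where the first number is at least as large as the second.

```c
#include <string.h>

#define MAX_DIGITS 1000   /* assumed limit, mirroring N in the fragment above */

/* Hypothetical companion to addHugeNumbers(): computes a1 - a2 for
   digit-only strings, assuming a1 >= a2 numerically.
   Returns 1 on success, 0 on invalid input or if a1 < a2. */
int subHugeNumbers(const char *a1, const char *a2, char *res)
{
    char buf[MAX_DIGITS];
    int i1 = (int)strlen(a1);
    int i2 = (int)strlen(a2);
    int borrow = 0;
    int ri = 0;

    if (i1 == 0 || i2 == 0 || i1 > MAX_DIGITS || i2 > MAX_DIGITS)
        return 0;

    while (i1 > 0 || i2 > 0) {
        /* read digits from the right; an exhausted string counts as 0 */
        int d1 = i1 > 0 ? a1[--i1] - '0' : 0;
        int d2 = i2 > 0 ? a2[--i2] - '0' : 0;
        if (d1 < 0 || d1 > 9 || d2 < 0 || d2 > 9)
            return 0;

        int diff = d1 - d2 - borrow;
        borrow = diff < 0;
        if (borrow)
            diff += 10;
        buf[ri++] = (char)(diff + '0');
    }
    if (borrow)                            /* a2 was larger than a1 */
        return 0;

    while (ri > 1 && buf[ri - 1] == '0')   /* drop leading zeros */
        ri--;
    while (ri--)                           /* copy in reverse order */
        *res++ = buf[ri];
    *res = '\0';
    return 1;
}
```

For example, subtracting "1" from "1000" should leave "999" in res. Supporting a negative result would need an extra comparison step and a sign, which is left out here.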

Calculating hamming distance by appending '0' to lesser length string

I have to find the hamming distance between two codes.
For example if I input:
a= 10
b= 1010
Automatically a should be made equal to the length of the string b by appending 0's.
So the input should become:
a=0010
b=1010
But I'm getting instead:
a = 001010
b = 1010
Here is my code:
#include<stdio.h>
#include<string.h>
void main()
{
char a[20],b[20],len1,len2,i,diff,count=0,j;
printf("Enter the first binary string\n");
scanf("%s",a);
printf("Enter the second binary string\n");
scanf("%s",b);
len1 = strlen(a);
len2 = strlen(b);
if(len1>len2)
{
diff = len1-len2;
for(i=0;i<len1;i++)
{
b[i+diff]=b[i];
}
j=i+diff;
b[j]='\0';
for(i=0;i<diff;i++)
{
b[i]='0';
}
}
else
{
diff = len2-len1;
for(i=0;i<len2;i++)
{
a[i+diff]=a[i];
}
j=i+diff;
a[j]='\0';
for(i=0;i<diff;i++)
{
a[i]='0';
}
}
printf("\nCodes are\n");
printf("a=%s\n",a);
printf("\nb=%s\n",b);
for(i=0;a[i]!='\0';i++)
{
if(a[i]!=b[i])
{
count++;
}
}
printf("hammung distance between two code word is %d\n",count);
}
Can anyone help me to fix this issue?
In the two for loops that shift the contents of the shorter array to the right to make room for the zeros, you swapped the lengths.
The first loop should be:
for(i=0;i<len2;i++)
{
b[i+diff]=b[i];
}
And second:
for(i=0;i<len1;i++)
{
a[i+diff]=a[i];
}
After trying it:
Codes are
a=0010
b=1010
hammung distance between two code word is 1
Also, the main function should return an int, not void. As stated in the comments, you should also change the type of your len1, len2, i, diff, count and j because you use them as number values, not as characters. You can for instance either use the int or size_t types for that.
int main()
{
char a[20],b[20];
int len1, len2, i, diff, count=0, j;
// Rest of your code
}
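One caveat with the shifting loops above: they copy from left to right inside the same array, which only works when the shift distance is at least the length of the string being moved. Padding "1010" by one position that way would give "01111". A sketch that avoids this uses memmove(), which is specified to handle overlapping regions. The helper name pad_left_zeros is made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: left-pads the string in s with '0' characters until
   it is `width` characters long. The buffer behind s must have room for
   width + 1 bytes. Returns 0 if s is already longer than width, else 1. */
int pad_left_zeros(char *s, size_t width)
{
    size_t len = strlen(s);
    if (len > width)
        return 0;

    size_t diff = width - len;
    memmove(s + diff, s, len + 1);   /* overlap-safe; also moves the '\0' */
    memset(s, '0', diff);
    return 1;
}
```

In the program from the question, this could be called as pad_left_zeros(a, len2) or pad_left_zeros(b, len1), depending on which string is shorter, provided the arrays are large enough.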
Here is a method that does not prepend zeros to the shortest binary string, and avoids the limitations of strtol() by comparing the elements of the string directly, starting with the last characters. The intricacies of using strtol() are traded for more complexity in handling the array indices. Note that care must be taken to avoid counting down to a negative value since size_t types are used. This method is not limited by the capacity of long types, but rather by size_t.
#include <stdio.h>
#include <string.h>
int main(void)
{
char a[20], b[20];
printf("Enter first binary string: ");
scanf("%19s", a);
printf("Enter second binary string: ");
scanf("%19s", b);
size_t a_len = strlen(a);
size_t b_len = strlen(b);
size_t max_len = a_len > b_len ? a_len : b_len;
size_t hamming_dist = 0;
for (size_t i = 0; i < max_len; i++) {
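// compare i against the lengths directly: with size_t, a_len - i
// wraps around to a huge value once i exceeds a_len
if (i < a_len && i < b_len) {
if (a[a_len - i - 1] == b[b_len - i - 1]) {
continue;
}
}
if ((i < a_len && a[a_len - i - 1] == '1') ||
(i < b_len && b[b_len - i - 1] == '1')) {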
++hamming_dist;
}
}
printf("bstring_1: %s\n", a);
printf("bstring_2: %s\n", b);
printf("Hamming distance: %zu\n", hamming_dist);
return 0;
}
A way that doesn't need to pad one of the parameters with zeroes:
#include <stdio.h>
#include <stdlib.h>
int main ()
{
char *a = "1010";
char *b = "10";
long unsigned int xorab;
unsigned int hammingDistance = 0;
xorab = strtoul(a, NULL, 2) ^ strtoul(b, NULL, 2);
while (xorab) {
hammingDistance += xorab & 1;
xorab >>= 1;
}
printf("%u\n", hammingDistance);
}
It uses strtoul to convert the binary strings to unsigned long int in base 2; after that you only need bitwise operators (xor, and, shift) to calculate the Hamming distance, without having to worry about the difference in length.
Obviously, this approach stops working if you want to test binary strings whose values do not fit in an unsigned long int.

Parsing a string into an array of integers in C

So now that I've figured out how to get what I want, I'm just hoping somebody can let me know a cleaner, less ridiculous way of achieving the same thing. I'm just learning C. Here was my approach.
int main()
{
// String of positive and negative integer values
// Numbers are never more than 2 digits
char TEMPS[256] = "1 -22 -8 14 5";
int N = 5;
int ints[N];
int i = 0;
int mult;
// Arbitrary number to identify that num is not yet in use
int num = 999;
int c = 0;
mult = (TEMPS[c] == 45) ? -1 : 1;
while(strcmp(&TEMPS[c], "\0") != 0)
{
if(TEMPS[c] == 32)
{
ints[i] = mult * num;
i++;
num = 999;
mult = (TEMPS[c + 1] == 45) ? -1 : 1;
}
else if((TEMPS[c] != 45) && (TEMPS[c] != 32))
{
if(num == 999)
{
num = TEMPS[c] - '0';
}
else
{
num = num * 10 + (TEMPS[c] - '0');
}
}
c++;
}
ints[i] = mult * num;
}
I would use strtol - here's a good site for how it works and examples
http://www.tutorialspoint.com/c_standard_library/c_function_strtol.htm
long int strtol(const char *str, char **endptr, int base)
Parameters
str -- This is the string containing the representation of an integral number.
endptr -- This is the reference to an object of type char*, whose value is set by the function to the next character in str after the numerical value.
base -- This is the base, which must be between 2 and 36 inclusive, or be the special value 0.
Return Value
This function returns the converted integral number as a long int value; if no valid conversion could be performed, zero is returned.
I've included one example from the site.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char str[30] = "2030300 This is test";
char *ptr;
long ret;
ret = strtol(str, &ptr, 10);
printf("The number(long integer) is %ld\n", ret);
printf("String part is |%s|", ptr);
return(0);
}
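Applied to the string from the question, the endptr mechanism can be used in a loop: each call converts one number and reports where it stopped, and the next call continues from there. The following is only a sketch; the function name parse_ints is hypothetical, and range errors are not checked.

```c
#include <stdlib.h>

/* Hypothetical helper: reads up to max whitespace-separated integers from s
   into out and returns how many were stored. Stops at the first position
   where strtol() cannot convert anything. Range errors are not checked. */
size_t parse_ints(const char *s, int *out, size_t max)
{
    size_t count = 0;
    char *end;

    while (count < max) {
        long v = strtol(s, &end, 10);   /* skips leading whitespace itself */
        if (end == s)                   /* nothing consumed: end or junk */
            break;
        out[count++] = (int)v;
        s = end;                        /* continue after the number */
    }
    return count;
}
```

For "1 -22 -8 14 5" this should store 1, -22, -8, 14 and 5 and return 5, with no need to track the sign or the digit count by hand.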

how to add char type integer in c

This is sample code from my program, in which I have to add two integers stored as strings (e.g. "23568" and "23674"). So I was trying it with single chars first.
char first ='2';
char second ='1';
I tried it like this:
i=((int)first)+((int)second);
printf("%d",i);
and I'm getting the output 99, because it adds the ASCII values of both. Can anyone suggest an approach for adding numbers stored as chars in C?
Since your example adds two single chars together, you can be confident of two things:
The total will never be more than 18.
You can avoid any conversions via library calls entirely. The standard requires that '0' through '9' be sequential (in fact it is the only character sequence that is mandated by the standard).
Therefore:
char a = '2';
char b = '3';
int i = (int)(a-'0') + (int)(b-'0');
will always work. Even in EBCDIC (and if you don't know what that is, consider yourself lucky).
If you actually intend to add two multi-digit numbers that are currently in string form ("12345", "54321"), then strtol() is your best option.
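As a small sketch of that suggestion (the helper name add_decimal_strings is made up, and it assumes both values and their sum fit in a long):

```c
#include <stdlib.h>

/* Hypothetical helper: converts both decimal strings with strtol() and
   returns their sum. Assumes both values and the sum fit in a long;
   a string with no leading number converts as 0. */
long add_decimal_strings(const char *a, const char *b)
{
    return strtol(a, NULL, 10) + strtol(b, NULL, 10);
}
```

With the strings from the question, add_decimal_strings("23568", "23674") should give 47242.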
i=(first-'0')+(second-'0');
No need for casting char to int.
If you want to add the numeric values that the characters represent, I would use "(first - '0') + (second - '0');"
The question seemed interesting. I thought it would be easier than it is; adding "string numbers" is a little bit tricky (even more so with the ugly approach I used).
This code will add two strings of any length. They don't need to be the same length, as the addition begins from the back. You provide both strings and a buffer of sufficient length, and you ensure the strings contain only digits:
#include <stdio.h>
#include <string.h>
char * add_string_numbers(char * first, char * second, char * dest, int dest_len)
{
char * res = dest + dest_len - 1;
*res = 0;
if ( ! *first && ! *second )
{
puts("Those numbers are less than nothing");
return 0;
}
int first_len = strlen(first);
int second_len = strlen(second);
if ( ((first_len+2) > dest_len) || ((second_len+2) > dest_len) )
{
puts("Possibly not enough space on destination buffer");
return 0;
}
char *first_back = first+first_len;
char *second_back = second+second_len;
char sum;
char carry = 0;
while ( (first_back > first) || (second_back > second) )
{
sum = ((first_back > first) ? *(--first_back) : '0')
+ ((second_back > second) ? *(--second_back) : '0')
+ carry - '0';
carry = sum > '9';
if ( carry )
{
sum -= 10;
}
if ( sum > '9' )
{
sum = '0';
carry = 1;
}
*(--res) = sum;
}
if ( carry )
{
*(--res) = '1';
}
return res;
}
int main(int argc, char** argv)
{
char * a = "555555555555555555555555555555555555555555555555555555555555555";
char * b = "9999999999999666666666666666666666666666666666666666666666666666666666666666";
char r[100] = {0};
char * res = add_string_numbers(a,b,r,sizeof(r));
printf("%s + %s = %s", a, b, res);
return (0);
}
Well... you are already adding char types; as you noted, that's 49 and 50 (decimal), which should give you 99.
If you're asking how to add the represented values of two characters, i.e. '1' + '2' == 3, you can subtract the base '0':
printf("%d",('2'-'0') + ('1'-'0'));
This gives 3 as an int because:
'0' = ASCII 48 (decimal)
'1' = ASCII 49 (decimal)
'2' = ASCII 50 (decimal)
So you're doing:
printf("%d",(50-48) + (49-48));
If you want to do a longer number, you can use atoi(), but you have to use strings at that point:
const char * first = "123";
const char * second = "100";
printf("%d", atoi(first) + atoi(second));
>> 223
In fact, you don't need to even type cast the chars for doing this with a single char:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char f1 = '9';
char f2 = '7';
int v = (f1 - '0') - (f2 - '0');
printf("%d\n", v);
return 0;
}
This will print 2.
But beware: it won't work for hexadecimal digits such as 'a' to 'f'.
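For hexadecimal digits, one portable option is to look the character up in a digit table rather than subtracting '0'. This is only a sketch, and the name hex_digit_value is hypothetical:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns the value 0..15 of a hexadecimal digit,
   or -1 if c is not one. */
int hex_digit_value(char c)
{
    static const char digits[] = "0123456789abcdef";
    if (c == '\0')
        return -1;   /* strchr would otherwise match the terminator */
    const char *p = strchr(digits, tolower((unsigned char)c));
    return p ? (int)(p - digits) : -1;
}
```

So hex_digit_value('F') - hex_digit_value('7') should give 8, and any character that is not a hex digit gives -1.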
This will add the corresponding characters of any two given number strings using the ASCII codes.
Given two number strings 'a' and 'b', we can compute the sum of a and b using their ASCII values without type casting or trying to convert them to int data type before addition.
Let
char *a = "13784", *b = "94325";
int max_len, carry = 0, i, j; /*( Note: max_len is the length of the longest string)*/
char sum, *result;
Adding corresponding digits in a and b.
sum = a[i] + (b[i] - 48) + carry; /* '0' is 48 in ASCII */
if (sum > 57) { /* '9' is 57 in ASCII, so anything above it needs a carry */
result[max_len - j] = sum - 10;
carry = 1;
} else {
result[max_len - j] = sum;
carry = 0;
}
/* where (0 < i <= max_len and 0 <= j <= max_len) */
NOTE:
The above solution only takes account of single character addition starting from the right and moving leftward.
If you want to convert a whole number at a time, the simple atoi() function will do it:
#include <stdio.h>
#include <stdlib.h>
int main(void){
char f[] = {"1"};
char s[] = {"2"};
int i, k;
i = atoi(f);
k = atoi(s);
printf("%d", i + k);
getchar();
}
Hope I answered your question

Resources