c-changing string to int - c

I'm trying to convert a string of digit characters into a number.
For example, the characters '5', '3', '9' should become 539.
What I did is:
for (j = 0; j < len_of_str; j++)
num = num + ((str[j] - 48) * (10 ^ (len_of_str - j)))
printf("%d", num);
num holds the result as an int. Subtracting 48 converts a digit's ASCII code to its numeric value,
and (10 ^ (len_of_str - j)) is meant to scale each digit to hundreds, thousands, etc.

Several issues:
First, ^ is not an exponentiation operator in C - it's a bitwise exclusive-OR operator. Instead of getting 10 to the power N, you'll get 10 XOR N, which is not what you want. C does not have an exponentiation operator (ironic for a language that defines eleventy billion operators, but there you go) - you'll need to use the library function pow instead. Or you can avoid the whole issue and do this instead:
num = 0;
for ( j = 0; j < len_of_str; j++ )
{
    num *= 10;
    num += str[j] - 48;
}
Second, str[j]-48 assumes ASCII encoding. To make that portable, use str[j] - '0' instead (the C standard requires the digit characters '0' through '9' to have consecutive values, so '9' - '0' always equals 9).
Finally, is there a reason you're not using one of the built-in library functions such as atoi or strtol?
num = (int) strtol( str, NULL, 10 ); /* base 10; base 0 would treat a leading 0 as octal */
printf( "num = %d\n", num );
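If the input may be malformed, strtol can also report where parsing stopped and whether the value was out of range. The following is a minimal sketch, not a definitive implementation; the name parse_int and its return convention are made up for illustration, and it assumes the whole string must be a base-10 number.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 int from str.
 * Returns 1 and stores the value in *out on success, 0 on failure. */
int parse_int(const char *str, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(str, &end, 10);

    if (end == str)        /* no digits were consumed */
        return 0;
    if (*end != '\0')      /* trailing characters after the number */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;          /* does not fit in long or in int */

    *out = (int)value;
    return 1;
}
```

A caller would check the return value before using the result, for example if (parse_int(str, &num)) printf("num = %d\n", num);.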

As pointed out in the comments above, ^ does not calculate a power; it performs a bitwise XOR (see Wikipedia). For instance, 0101 ^ 0111 == 0010 in binary, because XOR sets only those bits in which the inputs differ.
To calculate 10 to the power of something in C, use pow(double x, double y) from <math.h>. See this post for more information.
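For small integer exponents, a multiplication loop avoids the round trip through double that pow involves. The sketch below is one way to keep the asker's positional approach; ipow10 and digits_to_long are hypothetical names, there is no overflow check, and the input is assumed to contain only digits. Note that the exponent for the digit at index j is len - j - 1, so that the last digit is multiplied by 1.

```c
#include <stddef.h>

/* Hypothetical helper: 10 raised to a small non-negative exponent,
 * computed with integer multiplication (no overflow check). */
long ipow10(unsigned exp)
{
    long result = 1;
    while (exp-- > 0)
        result *= 10;
    return result;
}

/* Positional conversion: each digit is scaled by its place value.
 * Assumes str[0..len-1] holds only the characters '0'..'9'. */
long digits_to_long(const char *str, size_t len)
{
    long num = 0;
    for (size_t j = 0; j < len; j++)
        num += (str[j] - '0') * ipow10((unsigned)(len - j - 1));
    return num;
}
```

By contrast, with int a = 10 and int b = 2, the expression a ^ b evaluates to 8 (binary 1010 XOR 0010 = 1000), not 100.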

Converting a sequence of digits into an integer is a special case of the more general problem of parsing a number (integer or real) into a binary integer or double value.
One approach is to describe the number using a pattern, which you can either describe iteratively, or recursively as follows,
An integer_string is composed of:
  an optional '+' or '-' (sign)
  followed by a digit_sequence
A digit_sequence is composed of:
  a digit ('0', '1', '2', '3', ..., '9')
  followed by an optional (recursive) digit_sequence
This can be written using Backus-Naur formalism as,
integer_string := { '+' | '-' } digit_sequence
digit_sequence := digit { digit_sequence }
digit := [ '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ]
Should you desire, you can extend the above to recognize a real number,
real_number := integer_string { '.' { digit_sequence } }
{ [ 'e' | 'E' ] integer_string }
The above is not quite correct, though, as it forces a digit before the decimal point (the fix is left as an exercise for the reader).
Once you have the Backus-Naur formalism, it is easy to recognize the symbols that comprise the pattern, and the semantic action of the actual conversion to an integer:
#include <ctype.h>

long int
atol_self(char* str)
{
    if(!str) return(0);
    //accumulator for value
    long int accum=0; //nothing yet
    //index through the string
    int ndx=0;
    //handle the optional sign
    int sign=1;
    if ( str[ndx] == '+' ) { ndx+=1; }
    else if ( str[ndx] == '-' ) { sign=-1; ndx+=1; }
    //accumulate digits, most significant first
    for( ; isdigit((unsigned char)str[ndx]); ) {
        int digval = str[ndx] - '0';
        accum = accum*10 + digval;
        ++ndx;
    }
    return(accum*sign);
}

Related

C program that sums a char with int

I have an exercise that asks me to find the uppercase letter that is K places after a given letter, stored in a char variable named C. The range is the uppercase letters from A to Z.
For example, if the input is B 3 the output should be E. For this input it's simple: you just sum the values and you get the answer. But what if we go out of range? For example, for F 100 the program should output B, because once the value passes Z the program starts again from A.
If anything is unclear I will try to explain further. Here are some test cases and my code, which only works if we don't cross the range.
Input      Output
B 3        E
X 12345    S
F 100      B
T 0        T
#include <stdio.h>
int main(){
int K;
char C,rez;
scanf("%c %d",&C,&K);
int ch;
for(ch = 'A';ch <= 'Z';ch++){
if(C>='A' && C<='Z'){
rez = C+K;
}
}
printf("%c",rez);
return 0;
}
Think of the letters [A-Z] as base-26 digits where A is zero, B is one, and Z is 25.
When we add the offset to the letter (in base 26), only the least significant base-26 digit of the sum matters, so use % 26 to extract it, much like one uses % 10 to find the least significant decimal digit.
scanf(" %c %d",&C,&K);
//     ^ space added to consume any white-space
if (C >= 'A' && C <= 'Z') {
int base26 = C - 'A';
base26 = base26 + K;
base26 %= 26;
int output = base26 + 'A';
printf("%c %-8d %c\n", C, K, output);
}
For negative offsets we need to do a little more work, as % is not the mod operator but the remainder operator. The two differ when an operand is negative.
base26 %= 26;
if (base26 < 0) base26 += 26; // shift a negative remainder into [0, 26)
int output = base26 + 'A';
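The remainder-versus-modulo difference can be isolated in a small helper. This is a sketch under the assumption that the divisor is positive; the name mod_floor is made up for illustration.

```c
/* Hypothetical helper: mathematical modulo for a positive divisor m.
 * The result is always in the range [0, m). */
int mod_floor(int a, int m)
{
    int r = a % m;   /* since C99, r is zero or has the sign of a */
    if (r < 0)
        r += m;      /* move a negative remainder into [0, m) */
    return r;
}
```

For example, -1 % 26 is -1 in C99 and later, while mod_floor(-1, 26) is 25, which maps back to 'Z'.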
Pedantically, base26 + K may overflow with extreme K values. To account for that, reduce K before adding.
// base26 = base26 + K;
base26 = base26 + K%26;
We could be a little sneaky and add 26 to ensure the sum is not negative.
if (C >= 'A' && C <= 'Z') {
int base26 = C - 'A';
base26 = base26 + K%26 + 26; // base26 >= 0, even when K < 0
base26 %= 26; // base26 >= 0 and < 26
int output = base26 + 'A';
printf("%c %-8d %c\n", C, K, output);
}
... or make it a complex one-liner:
printf("%c %-8d %c\n", C, K, (C - 'A' + K%26 + 26)%26 + 'A');
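The same expression can be wrapped in a function so that it is easy to check against the test cases from the question. This is only a sketch: shift_letter is a hypothetical name, returning 0 for invalid input is an arbitrary choice, and it assumes a character set in which 'A' through 'Z' are contiguous (true for ASCII, not for EBCDIC).

```c
/* Hypothetical helper: the uppercase letter k places after c,
 * wrapping around from 'Z' to 'A'. Returns 0 if c is not in 'A'..'Z'. */
char shift_letter(char c, int k)
{
    if (c < 'A' || c > 'Z')
        return 0;
    /* k % 26 is in (-26, 26); adding 26 keeps the sum non-negative */
    return (char)((c - 'A' + k % 26 + 26) % 26 + 'A');
}
```

With this, shift_letter('F', 100) gives 'B' and shift_letter('B', -3) gives 'Y'.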
This can be accomplished using two concepts:
ASCII values
The modulus operator (%)
On an ASCII system, every character has a numeric code in the range 0-127.
The character 'A' has the value 65
The character 'B' has the value 66 (65 + 1)
and so on...
Until 'Z', which is 65 + 25 = 90
The second concept is modular arithmetic: if you always want to map a number into a certain range, you can use the modulus operator.
The modulus is the remainder that you get after dividing one number by another.
In our case, we have 26 letters, so we can always get a number between 0 and 25.
For the example you gave:
100 % 26 = 22
But you have to consider the starting point too.
So we always subtract the value of 'A' (65) from the initial letter, so that 'A' maps to 0 and 'Z' maps to 25.
So, if we start with 'F' and need to go 100 places:
Subtract the value of 'A' from the value of 'F'. Characters behave like numbers, so you can store 'F' - 'A' in an integer.
In this case 'F' - 'A' = 5
Next, add the offset to this.
5 + 100 = 105
Then take the result modulo 26.
105 % 26 = 1
Finally, add the value of 'A' back to the result.
'A' + 1 = 'B'
And you are done.
Get the remainder of the input number divided by 26 using the modulo operator. If the sum of the input character and the remainder is less than or equal to 'Z', then that is the answer; otherwise, take the remainder of the sum modulo 26 again and that will be the answer (take care of the offset, because the ASCII value of the letter A is 65).
Roughly, the implementation will be:
#include <stdio.h>
int main(){
    int K;
    char C, rez;
    scanf("%c %d",&C,&K);
    // Validation of the user input is omitted here
    int rem = K % 26;   // reduce the offset first (assumes K >= 0)
    if ((rem + C) - 'A' < 26) {
        rez = rem + C;  // still within 'A'..'Z'
    } else {
        rez = ((rem + C - 'A') % 26) + 'A';  // wrap around past 'Z'
    }
    printf("%c\n",rez);
    return 0;
}
Note that there is scope for improvement in this implementation; it is just meant to give the OP an idea of how it can be done.
Output:
# ./a.out
B 3
E
# ./a.out
X 12345
S
# ./a.out
F 100
B
# ./a.out
T 0
T

K&R book exercise 4-2

I'm studying K&R book. I'm currently at chapter 4. I was reading the atof() function on page 71. Function atof(s) converts string to its double precision floating point equivalent.
The code of atof() is as following:
//atof: convert string s to double
double atof2(char s[])
{
double val, power;
int i, sign;
for (i = 0; isspace(s[i]); ++i) //skip white space
;
sign = (s[i] == '-') ? -1: 1;
if (s[i] == '+' || s[i] == '-')
++i;
for (val = 0.0; isdigit(s[i]); i++)
val = 10.0 * val + (s[i] - '0');
if (s[i] == '.')
++i;
for (power = 1.0; isdigit(s[i]); i++) {
val = 10.0 * val + (s[i] - '0');
power *= 10.0;
}
return sign * val / power;
}
My question is about the variable power. What do we need it for?
I understand the use of the variable val, but I'm not sure about power. Why do we divide val by power?
The variable power is the divisor that puts the decimal point back in place, so that the result is a floating-point number.
Let your string be -12.83. The first for loop checks for leading white space; there is none, so i stays at 0.
sign will be -1, because s[i] = s[0] = '-'.
In the next two loops the digits of the string are converted and accumulated in val (the '.' is skipped; work out how for yourself).
After both loops val will be 1283. The last loop iterates twice, so power becomes 100.0 (10 * 1.0 in the first iteration and 10 * 10.0 in the second).
To get the floating-point value, val is divided by power and multiplied by sign.
So what it returns is -1 * 1283 / 100, and thus -12.83 is your floating-point number.
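To see the two intermediate values directly, the two digit loops can be pulled out into a helper that reports val and power instead of dividing them. This is a sketch for illustration only; split_decimal is a hypothetical name, and it assumes an unsigned decimal string with no leading white space.

```c
#include <ctype.h>

/* Hypothetical helper: accumulate every digit of a string such as "12.83"
 * into *val (as if there were no decimal point) and store the matching
 * divisor in *power. */
void split_decimal(const char *s, double *val, double *power)
{
    int i = 0;

    /* digits before the decimal point */
    for (*val = 0.0; isdigit((unsigned char)s[i]); i++)
        *val = 10.0 * *val + (s[i] - '0');
    if (s[i] == '.')
        i++;
    /* digits after the point: keep accumulating, and count them in power */
    for (*power = 1.0; isdigit((unsigned char)s[i]); i++) {
        *val = 10.0 * *val + (s[i] - '0');
        *power *= 10.0;
    }
}
```

For "12.83" this yields val = 1283 and power = 100, and dividing the first by the second places the decimal point.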

Explanation of atof code from K&R

I understand what we are doing, before we converted a string to an int, now we are converting a string to a double. I don't understand the logic behind this code though. Could someone clarify this a little for me? Best regards.
#include <ctype.h>
#include <stdio.h>
//atof: convert string s to double
double atof(char s[])
{
double val, power;
int i, sign;
for (i = 0; isspace(s[i]); i++) //skip whitespace
;
sign = (s[i] == '-') ? -1 : 1;
if (s[i] == '+' || s[i] == '-')
i++;
for (val = 0.0; isdigit(s[i]); i++)
val = 10.0 * val + (s[i] - '0');
if (s[i] == '.')
i++;
for (power = 1.0; isdigit(s[i]); i++) {
val = 10.0 * val + (s[i] - '0');
power *= 10.0;
}
return sign * val / power;
}
int main()
{
char s[] = "78.93"; //output is 78.930000
printf("atof equals %f\n", atof(s));
return 0;
}
This part is pretty easy, just skips to the first non-whitespace character:
for (i = 0; isspace(s[i]); i++) //skip whitespace
;
Now we check whether the first non-whitespace character is a - to make the result negative, then skip over the character whether it's a - or a +:
sign = (s[i] == '-') ? -1 : 1;
if (s[i] == '+' || s[i] == '-')
i++;
Now it starts to get tricky. Let's use an example of 1234.5678. First we handle the part before the decimal point. It's handled by looking at each digit in turn: multiply val so far by 10 to shift it left one place, then add the new digit. For example, with 1234.5678 we first see the digit 1 and add it to val (0) for a val of 1. The next digit is 2, so we multiply the current val (1) by 10 to get 10, then add 2 to get 12. The next digit is 3, so we multiply the current val (12) by 10 to get 120, then add 3 to get 123. The next digit is 4, so we multiply the current val (123) by 10 to get 1230, then add 4 to get 1234. Then the '.' is not a digit, so we've finished the left side of the number.
for (val = 0.0; isdigit(s[i]); i++)
val = 10.0 * val + (s[i] - '0');
This part just moves past the dot.
if (s[i] == '.')
i++;
Now we do the same with the right side of the decimal point as we did with the left, but we also track how many digits are past the decimal point (with the power variable). In the example of 1234.5678, the first digit we see is 5. So we multiply the current val (1234) by 10 and add 5 to get 12345. We also increase power to 10.0. This continues until we get a val of 12345678 and a power of 10000.0.
for (power = 1.0; isdigit(s[i]); i++) {
val = 10.0 * val + (s[i] - '0');
power *= 10.0;
}
Finally, we divide by power to put the decimal point in the correct spot (12345678 / 10000.0):
return sign * val / power;
double atof(char s[])
{
double val, power;
int i, sign;
// if there is any leading 'white space', step index past it
// keep stepping index until other than white space encountered
for (i = 0; isspace(s[i]); i++)
;
// if there is a '-' char
// then indicate value is negative
// else assume value is positive
// format is: result = (condition)? true value : false value
sign = (s[i] == '-') ? -1 : 1;
// if there is a sign byte, step index past it
if (s[i] == '+' || s[i] == '-')
i++;
// initialize the result 'val'
// then loop through following characters
for (val = 0.0; isdigit(s[i]); i++)
// digits are in the range 0x30 through 0x39
// make them integers by subtracting 0x30 ('0')
// and update the result 'val'
// remembering that each successive digit pushes the current result 'val'
// to 10 times the old value then add the new 'converted' digit
val = 10.0 * val + (s[i] - '0');
// this ends the 'for' code block
// when execution gets here, encountered something other than a digit
// when a '.' encountered, step the index past it
if (s[i] == '.')
i++;
// the 'power' value is indicating how much to divide the resulting
// 'val' by to place the decimal point (if there was a decimal point)
// into the correct position
// if other than a digit encountered, exit loop
for (power = 1.0; isdigit(s[i]); i++)
{
val = 10.0 * val + (s[i] - '0'); // see above comment about a similar line of code
power *= 10.0;
} // end for
// calculate the actual value by allowing for any sign (+ or -)
// then dividing that result by 'power' to properly place the decimal point
return sign * val / power;
} // end function: atof
Skip the white space; handle a leading sign; compute the integer part (in val); skip the decimal point; handle the fractional part (by updating val as if there were no decimal point, and also power to account for it).
This code consists of three loops:
the first loop skips white space until something meaningful is found (a sign or a digit)
the second loop calculates the value of the part left of the decimal point (the value of xxx in -xxx.545)
the last loop continues from the value of the previous loop with the part right of the decimal point,
while calculating power, which is 10 raised to the number of digits after the '.'
At that point we have the sign, the combined value of both parts, and the power.
As a simple example, take -12.345:
sign = -1
val = 12345
power = 1000 (10 to the power of the number of digits after the '.')
The result is -1 * 12345 / 1000 = -12.345

Why is '0' subtracted when converting string to number?

I am new to C and I was looking for a custom function in C that would convert a string to an integer and I came across this algorithm which makes perfect sense except for one part. What exactly is the -'0' doing on this line n = n * 10 + a[c] - '0';?
int toString(char a[]) {
int c, sign = 1, offset, n; // sign defaults to positive so it is never read uninitialized
if (a[0] == '-') { // Handle negative integers
sign = -1;
}
if (sign == -1) { // Set starting position to convert
offset = 1;
}
else {
offset = 0;
}
n = 0;
for (c = offset; a[c] != '\0'; c++) {
n = n * 10 + a[c] - '0';
}
if (sign == -1) {
n = -n;
}
return n;
}
The algorithm did not have an explanation from where I found it, here.
The reason subtracting '0' works is that character code points for decimal digits are arranged sequentially starting from '0' up, without gaps. In other words, the character code for '5' is greater than the character code for '0' by 5; character code for '6' is greater than the character code for '0' by 6, and so on. Therefore, subtracting the code of zero '0' from a code of another digit produces the value of the corresponding digit.
This arrangement holds for the decimal digits in ASCII, EBCDIC, Unicode, and so on. For ASCII, the numeric codes look like this:
'0' 48
'1' 49
'2' 50
'3' 51
'4' 52
'5' 53
'6' 54
'7' 55
'8' 56
'9' 57
Assuming x has a value in the range between '0' and '9', x - '0' yields a value between 0 and 9. So x - '0' basically converts a decimal digit character to its numerical integer value (e.g., '5' to 5).
The C standard leaves the values of '0' to '9' implementation-defined, but it guarantees that they are sequential.
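Because the subtraction is only meaningful for digit characters, it can help to pair it with a range check. A minimal sketch, with digit_value as a made-up name and -1 as an arbitrary "not a digit" marker:

```c
/* Hypothetical helper: numeric value of a decimal digit character,
 * or -1 if c is not one of '0'..'9'. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';   /* digits are guaranteed to be consecutive */
    return -1;
}
```

Here digit_value('5') is 5, while digit_value('a') is -1 rather than some meaningless difference of character codes.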

what does putchar('0' + num); do?

I am trying to understand how putchar('0' + r); works. The function below takes an integer and prints it in binary.
void to_binary(unsigned long n)
{
int r;
r = n % 2;
if (n >= 2)
to_binary(n / 2);
putchar('0' + r);
}
I googled the definition of putchar but didn't find an explanation of this. To test it, I added a printf to see the value of r:
void to_binary(unsigned long n)
{
int r;
r = n % 2;
if (n >= 2)
to_binary(n / 2);
printf("r = %d and putchar printed ", r);
putchar('0' + r);
printf("\n");
}
and I ran it (typed 5) and got this output:
r = 1 and putchar printed 1
r = 0 and putchar printed 0
r = 1 and putchar printed 1
So I suppose that the putchar('0' + r); prints 0 if r=0, else prints 1 if r=1, or something else happens?
In C '0' + digit is a cheap way of converting a single-digit integer into its character representation, like ASCII or EBCDIC. For example if you use ASCII think of it as adding 0x30 ('0') to a digit.
The one assumption is that the character encoding has a contiguous area for digits - which holds for both ASCII and EBCDIC.
As pointed out in the comments this property is required by both the C++ and C standards. The C standard says:
5.2.1 - 3
In both the source and execution basic character sets, the value of
each character after 0 in the above list of decimal digits shall be
one greater than the value of the previous.
'0' represents an integer equal to 48 in decimal and is the ASCII code for the character 0 (zero). The ASCII code for the character for 1 is 49 in decimal.
'0' + r is the same as 48 + r. When r = 0, the expression evaluates to 48 so a 0 is outputted. On the other hand, when r = 1, the expression evaluates to 49 so a 1 is outputted. In other words, '0' + 1 == '1'
Basically, it's a nice way to convert decimal digits to their ASCII character representations easily. It also works with the alphabet (i.e. 'A' + 2 is the same as 'C').
It's a common technique used for char handling.
char a = '0' + r (r in [0,9]) converts an integer to its char form relative to a given base character ('0' in this case), so you get '0'...'9'.
Similarly, char a = 'a' + r or char a = 'A' + r (r in [0,25]) converts an integer to a letter, giving 'a'...'z' or 'A'...'Z' (except on EBCDIC systems, where the letters are not contiguous).
Edit:
You can also do the other way around, for example:
char myChar = 'c';
int b = myChar - 'a'; // b will be 2
Similar idea is used to convert a lowercase char to uppercase:
char myChar = 'c';
char newChar = myChar - 'a' + 'A'; // newChar will be 'C'
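The same '0' + digit idea can build a string instead of printing character by character, which makes the result easier to inspect. The sketch below is an iterative variant of the binary conversion, not the code from the question; to_binary_str and its return convention are assumptions made for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: write the binary digits of n into buf as a
 * NUL-terminated string. Returns the number of digits written,
 * or 0 if buf (of the given size) is too small. */
size_t to_binary_str(unsigned long n, char *buf, size_t size)
{
    char tmp[sizeof(unsigned long) * CHAR_BIT];  /* one slot per bit */
    size_t len = 0;

    /* collect digits least significant first */
    do {
        tmp[len++] = (char)('0' + n % 2);  /* 0 -> '0', 1 -> '1' */
        n /= 2;
    } while (n > 0);

    if (len + 1 > size)
        return 0;

    /* reverse into buf so the most significant digit comes first */
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return len;
}
```

Calling it with n = 5 and a large enough buffer fills the buffer with "101", matching the three putchar calls in the question's trace.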
You are adding to the ASCII value of the character '0'.
The ASCII value of '0' is 48,
'1' is 49, and so on (see an ASCII table for the complete list).
So when you add one to 48 you get 49, and the putchar function prints the character whose code is passed to it. When you do
putchar('0' + r)
with r = 1, it becomes putchar(48 + 1) (using the ASCII value),
i.e. putchar(49), which prints 1.

Resources