Convert single Character (hex number) to Integer in C - c

So for an assignment I have to convert a character (0-F) to an integer (0-15). 0-9 works fine, but if any letter is given, it prints a seemingly random number: for C, for instance, it gives 19, and for D it returns 20.
This is my method:
int char2int(char digit) {
int i = 0;
if (digit == 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7 || 8 || 9)
i = digit - '0';
else
if (digit == 'A' || 'B' || 'C' || 'D' || 'E' || 'F')
i = digit - '9';
else
i = -1;
return i;
}
At first my if statements were like this:
if (digit => 0 && =< 9)
if (digit => A && =< F)
But that gave a number of errors. You can tell I don't know C very well. My current If statement works but I'm sure it's unnecessarily long.

if (digit == 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7 || 8 || 9)
This is not how conditional expressions work in C.
You either need to compare digit against each of the numbers individually
if (digit == '0' || digit == '1' || digit == '2' ...
or do it the clever way:
if(digit >= '0' && digit <= '9')
^^ not =<
Notice that I put ' around the digits because you want to compare digit with the character '0' and not the number 0 (they are not the same; see an ASCII table for the character values).
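To make the difference concrete, here is a minimal sketch (the function names are made up for illustration, and ASCII is assumed for the numbers in the comments). C groups the original test as (digit == 0) || 1 || 2 ..., and since 1 is nonzero the whole expression is always true, so every character takes the digit - '0' branch. That is where 19 comes from for 'C': 67 - 48.

```c
/* Hypothetical helper: how C actually groups the original test.
   The first nonzero operand (1) already makes the result true,
   so the remaining "|| 2 || 3 ..." are never even evaluated. */
int original_test(char digit) {
    return (digit == 0) || 1;
}

/* Hypothetical helper: the range test the asker was aiming for. */
int is_decimal_digit(char digit) {
    return digit >= '0' && digit <= '9';
}

/* What the original first branch computes for any input (ASCII assumed):
   'C' is 67 and '0' is 48, so the result is 19 rather than 12. */
int first_branch_value(char digit) {
    return digit - '0';
}
```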

You were on the right path when you started, but wandered off a bit. Try this:
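#include <ctype.h>
int char2int(char d) {
    /* the <ctype.h> functions expect a value representable as unsigned char */
    unsigned char c = (unsigned char)d;
    if (!isxdigit(c)) {
        return -1;
    }
    if (isdigit(c)) {
        return c - '0';
    }
    return (tolower(c) - 'a') + 10;
}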
If you'd prefer an approach closer to your range testing, you could do it like this:
int char2int(char d) {
if (d >= '0' && d <= '9') {
return d - '0';
}
d = (char)tolower((unsigned char)d);
if (d >= 'a' && d <= 'f') {
return (d - 'a') + 10;
}
return -1;
}

Assuming ASCII, the following converts a character (0-9, a-f, A-F) to the associated unsigned integer (0-15). Any other character will also be converted to some arbitrary value in the 0-15 range. Garbage in, garbage out.
unsigned hexToUnsigned(char ch) {
    /* unsigned arithmetic, so the multiplication wraps instead of overflowing a signed int */
    return ((((unsigned)ch | 432u) * 239217992u) & 0xffffffffu) >> 28;
}
CPUs with 32-bit integers will generally be able to elide the 0xffffffff masking. On my machine the compiler turns this function into:
hexToUnsigned PROC
movsx eax, cl
or eax,1B0h
imul eax, eax, 0E422D48h
shr eax, 1ch
ret 0
hexToUnsigned ENDP
Another common way to do this has fewer apparent operations (just three), returns total garbage on invalid characters (which is probably okay), but also requires division (which takes it out of the top spot):
return ((ch | ('A' ^ 'a')) - '0') % 39;
To illustrate how compilers feel about division, they (at least on x64) change it into a multiply by the reciprocal to get the quotient, and then one more multiply and a subtract if you need the remainder:
hexToUnsigned PROC
; return ((ch | ('A' ^ 'a')) - '0') % 39;
movsx r8d, cl
mov eax, -770891565
or r8d, 32
sub r8d, 48
imul r8d
add edx, r8d
sar edx, 5
mov ecx, edx
shr ecx, 31
add edx, ecx
imul ecx, edx, 39
sub r8d, ecx
mov eax, r8d
ret 0
hexToUnsigned ENDP

The return value is not random. Every ASCII character is represented in memory by a number, and the value of each character can be found in an ASCII table. For example, 'C' is 67 and '0' is 48, so digit - '0' yields 19.
The other responses tell you what you are doing wrong with the conditional expressions, but another mistake is the conversion for A, B, C, D, E or F. You need i = (digit - 'A') + 10: take the value of the character, subtract the minimum value, which is 'A', and add 10.
Moreover, if you don't need the exact value of a character you can do without the ASCII table by using the property that the letters are contiguous.
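If contiguity of the letters is not something you want to rely on (the C standard only guarantees it for the digits '0' to '9'), one alternative is a lookup in a string of the sixteen hex digits. This is a sketch of a different technique from the one in the question, and hex_value_lookup is a made-up name:

```c
#include <ctype.h>
#include <string.h>

/* Returns 0-15 for a hex digit, -1 otherwise. The value is simply the
   character's position in the digits string, so no assumption about
   the character encoding is needed. */
int hex_value_lookup(char digit) {
    static const char digits[] = "0123456789abcdef";
    const char *pos;

    if (digit == '\0') {
        return -1;            /* strchr would otherwise match the terminator */
    }
    pos = strchr(digits, tolower((unsigned char)digit));
    return pos ? (int)(pos - digits) : -1;
}
```

The trade-off is a short linear search instead of a subtraction, which rarely matters for a single character.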

If you are willing to make assumptions, such as that char is encoded as ASCII and integers are 2's complement, the following is quite efficient.
This code is not meant for readability. Use other solutions if that is a concern. This is for tight encoding. With a given processor, it is about 10 instructions. Your results will vary.
Subtract 1. This shifts the char values down by 1. In particular, A-Z is now 64-89 and a-z is now 96-121.
Test if a bit (the 64's place) is clear: the value is in the '0' - '9' area. If so, add 7 and mask to keep that bit (the 64's place) cleared.
Otherwise, mask a bit (the 32's place) to fold a-z into the A-Z range.
Now '0' to '9' and 'A' to 'Z' are in a contiguous range. Just subtract 54. All unsigned char values other than 0-9, A-Z and a-z will have a value > 35. This is useful for any base up to base 36.
int Value(char ch) {
if (!(--ch & 64)) { // decrement, if ch in the '0' to '9' area ...
ch = (ch + 7) & (~64); // move 0-9 next to A-Z codes
} else {
ch &= ~32;
}
ch -= 54; // -= 'A' - 10 - 1
if ((unsigned char)ch > 15) {
; // handle error
}
return (unsigned char)ch;
}

In redis
https://github.com/antirez/redis/blob/3.2.8/src/sds.c#L892
int hex_digit_to_int(char c) {
switch(c) {
case '0': return 0;
case '1': return 1;
case '2': return 2;
case '3': return 3;
case '4': return 4;
case '5': return 5;
case '6': return 6;
case '7': return 7;
case '8': return 8;
case '9': return 9;
case 'a': case 'A': return 10;
case 'b': case 'B': return 11;
case 'c': case 'C': return 12;
case 'd': case 'D': return 13;
case 'e': case 'E': return 14;
case 'f': case 'F': return 15;
default: return 0;
}
}

Related

About atoi function

#include <unistd.h>
int ft_atoi(char *str)
{
int c;
int sign;
int result;
c = 0;
sign = 1;
result = 0;
while ((str[c] >= '\t' && str[c] <= '\r') || str[c] == ' ')
{
c++;
}
while (str[c] == '+' || str[c] == '-')
{
if (str[c] == '-')
sign *= -1;
c++;
}
while (str[c] >= '0' && str[c] <= '9')
{
result = (str[c] - '0') + (result * 10);
c++;
}
return (result * sign);
}
#include <stdio.h>
int main(void)
{
char *s = " ---+--+1234ab567";
printf("%d", ft_atoi(s));
}
This line: result = (str[c] - '0') + (result * 10);
Why do we subtract '0' and multiply by 10? How does it convert ASCII to int with these operations?
Thanks...
Some detail before answering your question:
Internally everything is a number, and a char is no exception.
In C, char is an integer type, so characters are integers. Each character is mapped to its corresponding ASCII value.
For example:
Capital Letter Range
65 => 'A' to 90 => 'Z'
Small Letter Range
97 => 'a' to 122 => 'z'
Number Range
48 => '0' to 57 => '9'
To answer your question:
Subtracting the ASCII character '0' from any ASCII character that is a digit (0-9) yields the actual integer.
For example:
'9'(57) - '0'(48) = 9 (int)
'8'(56) - '0'(48) = 8 (int)
Remember that chars are integers in C, as described above.
Likewise, adding the ASCII character '0' to any integer in the range 0-9 yields the corresponding ASCII character.
For example:
9 + '0'(48) = '9'(57) (char)
8 + '0'(48) = '8'(56) (char)
Please see an ASCII table.
The ASCII code for '0' is 48, not zero. Therefore, to convert a digit character to its decimal value you need to subtract 48.
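The multiplication by 10 is the other half of the question. Each time a new digit is read, the digits seen so far move one decimal place to the left, which is what result * 10 does, and the new digit is added in the ones place. A minimal sketch with a trace in the comments (digits_to_int is a hypothetical name; sign handling, whitespace skipping and overflow checks are left out):

```c
/* Converts the leading decimal digits of s, e.g. "1234ab" -> 1234.
   Trace for "1234":
     result = 0
     '1': 0   * 10 + 1 = 1
     '2': 1   * 10 + 2 = 12
     '3': 12  * 10 + 3 = 123
     '4': 123 * 10 + 4 = 1234 */
int digits_to_int(const char *s) {
    int result = 0;
    while (*s >= '0' && *s <= '9') {
        result = result * 10 + (*s - '0');   /* shift left one place, add digit */
        s++;
    }
    return result;
}
```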

how to shift down alphabet characters in a cyclic way?

I want to find the alphabet character that is 7 characters before mine, so I wrote this function to do so and it works fine:
char find7_before(char letter){
switch (letter){
case 'g':
return 'z';
break;
case 'f':
return 'y';
break;
case 'e':
return 'x';
break;
case 'd':
return 'w';
break;
case 'c':
return 'v';
break;
case 'b':
return 'u';
break;
case 'a':
return 't';
break;
default:
return (char)(((int)letter) - 7);
}
}
But I think I can do it in a smarter way without all of these cases, and I just can't figure it out! (I figured out how to find 7 letters after in a cyclic way.) Any help, idea or hint?
Thank you :)
Assuming ASCII with continuous ['a' , 'z']...
Simply "mod 26".
letter = ((letter - 'a' - 7) mod 26) + 'a';
Yet C does not have a Euclidean mod operator: % is a remainder operator, and its result takes the sign of the left operand.
See What's the difference between “mod” and “remainder”?
So create a Euclidean mod function - save for later use.
int modulo_Euclidean(int a, int b) {
int m = a % b;
if (m < 0) {
// m += (b < 0) ? -b : b; // avoid this form: it is UB when b == INT_MIN
m = (b < 0) ? m - b : m + b;
}
return m;
}
letter = modulo_Euclidean(letter - 'a' - 7, 26) + 'a';
Alternatively, code can take advantage of 'a' having the value 97 and subtract only 'a'%26 (which is 19), so that letter - ('a'%26) - 7 never becomes negative.
letter = (letter - ('a'%26) - 7)%26 + 'a';
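Pedantic code would not assume contiguous ['a' , 'z'] and would need something more elaborate.
Subtract 'a' (so now it's in 0-25), subtract 7, and take the result mod 26. Then add 'a' again so it's back to a char. Note that C's % can return a negative result here, so add 26 before applying % 26.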
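One detail matters when turning that recipe into C: the % operator keeps the sign of its left operand, so (-7) % 26 is -7, not 19. A sketch that adds 26 before reducing, assuming lowercase input and contiguous 'a'..'z' as in ASCII (seven_before is a made-up name):

```c
/* Precondition: 'a' <= letter <= 'z', with the letters contiguous (true in ASCII). */
char seven_before(char letter) {
    int offset = letter - 'a';          /* 0 .. 25 */
    offset = (offset - 7 + 26) % 26;    /* +26 keeps the left operand non-negative */
    return (char)('a' + offset);
}
```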
In my opinion, the clearest and simplest way is to use an if statement:
char find7_before(char letter) {
char value = letter - 7;
if (value < 'a') {
value += 26;
}
return value;
}
The precondition here is the letter is between 'a' and 'z', inclusive.
This technique generalizes as well:
char findn_before(char letter, int n) {
char value = letter - n;
if (value < 'a') {
value += 26;
}
return value;
}
Precondition on letter is the same as before; n must be between 0 and 26, inclusive.

c-changing string to int

I'm trying to change a string of chars into a number.
For example the string '5','3','9' into 539.
what I did is:
for (j = 0; j < len_of_str; j++)
num = num + ((str[j] - 48) * (10 ^ (len_of_str - j)))
printf("%d", num);
num is the variable that will hold the number as an int; the minus 48 changes the ASCII value of each character to the digit it represents,
and the (10 ^ (len_of_str - j)) is meant to scale the values to hundreds, thousands, etc...
Several issues:
First, ^ is not an exponentiation operator in C - it's a bitwise exclusive-OR operator. Instead of getting 10 to the power N, you'll get 10 XOR N, which is not what you want. C does not have an exponentiation operator (ironic for a language that defines eleventy billion operators, but there you go) - you'll need to use the library function pow instead. Or you can avoid the whole issue and do this instead:
num = 0;
for ( j = 0; j < len_of_str; j++ )
{
num *= 10;
num += str[j] - 48;
}
Second, str[j]-48 assumes ASCII encoding. To make that a bit more generic, use str[j] - '0' instead (the C standard guarantees that the digit characters are sequential, so '9' - '0' always equals 9).
Finally, is there a reason you're not using one of the built-in library functions such as atoi or strtol?
num = (int) strtol( str, NULL, 10 ); /* base 10; base 0 would treat a leading 0 as octal */
printf( "num = %d\n", num );
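When using strtol it is worth checking that the whole string was consumed and that the value fits in an int. A hedged sketch (parse_int is a hypothetical wrapper, not a library function); it relies only on the standard end-pointer and errno behaviour of strtol:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *out on success, 0 on any failure. */
int parse_int(const char *str, int *out) {
    char *end;
    long value;

    errno = 0;
    value = strtol(str, &end, 10);      /* base 10 only */
    if (end == str || *end != '\0') {
        return 0;                       /* no digits, or trailing junk */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return 0;                       /* out of range for long or int */
    }
    *out = (int)value;
    return 1;
}
```

Returning a status and writing through a pointer keeps 0 available as a legitimate parsed value, which a bare atoi call cannot distinguish from an error.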
As pointed out by the comments above, ^ does not actually calculate a power, but instead does a bitwise XOR. For instance, 0101 ^ 0111 == 0010, as XOR only sets the bits in which the inputs differ.
To calculate 10 to the power of something in C, use pow(double x, double y) from <math.h>. Note that pow works in floating point and returns a double.
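To see what the original loop actually computed, and how to get powers of ten without floating point, here is a small sketch (both function names are made up). With a three-character string the first iteration multiplied by 10 ^ 3, which is 9. (Even with a real power, the exponent would need to be len_of_str - j - 1 so that the first of three digits is scaled by 100.)

```c
/* What 10 ^ n means in C: bitwise XOR of 10 (binary 1010) with n. */
int ten_xor(int n) {
    return 10 ^ n;
}

/* Integer power of ten by repeated multiplication.
   Assumes n >= 0 and that the result fits in a long. */
long ten_to_the(int n) {
    long result = 1;
    while (n-- > 0) {
        result *= 10;
    }
    return result;
}
```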
Converting a sequence of digits into an integer is a special case of the more general problem of parsing a number (integer or real) into a binary integer or double value.
One approach is to describe the number using a pattern, which you can describe either iteratively or recursively, as follows:
An integer_string is composed of:
an optional '+' or '-' (sign)
followed by a digit_sequence
A digit_sequence is composed of:
a digit ('0', '1', '2', '3', ..., '9')
followed by an optional (recursive) digit_sequence
This can be written using Backus-Naur formalism as,
integer_string := { '+' | '-' } digit_sequence
digit_sequence := digit { digit_sequence }
digit := [ '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ]
Should you desire, you can extend the above to recognize a real number,
real_number := integer_string { '.' { digit_sequence } }
{ [ 'e' | 'E' ] integer_string }
Although the above is not quite correct, as it forces a digit before the decimal (fix is left as an exercise for the reader).
Once you have the Backus-Naur formalism, it is easy to recognize the symbols that comprise the pattern, and the semantic action of the actual conversion to integer:
#include <ctype.h>
long int
atol_self(char* str)
{
    if(!str) return(0);
    //accumulator for value
    long int accum=0; //nothing yet
    //index through the string
    int ndx=0;
    //handle the optional sign
    int sign=1;
    if ( str[ndx] == '+' ) { sign=1; ndx+=1; }
    else if ( str[ndx] == '-' ) { sign=-1; ndx+=1; }
    //accumulate digits until the first non-digit
    for( ; str[ndx] && isdigit((unsigned char)str[ndx]); ) {
        int digval = str[ndx] - '0';
        accum = accum*10 + digval;
        ++ndx;
    }
    return(accum*sign);
}

Why is '0' subtracted when converting string to number?

I am new to C and I was looking for a custom function in C that would convert a string to an integer and I came across this algorithm which makes perfect sense except for one part. What exactly is the -'0' doing on this line n = n * 10 + a[c] - '0';?
int toString(char a[]) {
int c, sign = 1, offset, n; // sign defaults to positive so it is never read uninitialized
if (a[0] == '-') { // Handle negative integers
sign = -1;
}
if (sign == -1) { // Set starting position to convert
offset = 1;
}
else {
offset = 0;
}
n = 0;
for (c = offset; a[c] != '\0'; c++) {
n = n * 10 + a[c] - '0';
}
if (sign == -1) {
n = -n;
}
return n;
}
The algorithm did not come with an explanation where I found it.
The reason subtracting '0' works is that character code points for decimal digits are arranged sequentially starting from '0' up, without gaps. In other words, the character code for '5' is greater than the character code for '0' by 5; character code for '6' is greater than the character code for '0' by 6, and so on. Therefore, subtracting the code of zero '0' from a code of another digit produces the value of the corresponding digit.
This arrangement holds for the decimal digits in ASCII, EBCDIC, Unicode, and so on. For ASCII, the numeric codes look like this:
'0' 48
'1' 49
'2' 50
'3' 51
'4' 52
'5' 53
'6' 54
'7' 55
'8' 56
'9' 57
Assuming x has a value in the range between '0' and '9', x - '0' yields a value between 0 and 9. So x - '0' basically converts a decimal digit character to its numerical integer value (e.g., '5' to 5).
C says the values of '0' to '9' are implementation-defined, but C also guarantees that '0' to '9' are sequential.

Shift a letter down the alphabet?

I.e., you enter the number 5 and the character A, and the output would be F. I have no idea how to even start to go about this; can anyone give me a push in the right direction?
Individual characters are represented by numbers according to the ASCII code (usually). In C, if you add a number to a character, you're shifting the character down. Try:
char c = 'A';
int n = 5;
printf("%c\n", c + n);
Look at the ASCII table and note the values of the characters.
Try this:
#include <assert.h>
#include <ctype.h>
char shift_char(char val, char shift)
{
    val = toupper((unsigned char)val);
    assert(isupper((unsigned char)val));
    char arr[26] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; /* exactly 26 chars, no terminating '\0' */
    return arr[ ( (val - 'A' + shift) % 26) ];
}
You can get a little fancier if you want to preserve the case of the character. It also assumes, but does not verify shift is non-negative. That case may cause problems with the modulus operation you will need to guard against... or better yet prevent. Still, since this is tagged as homework, that's the sort of thing you should work through.
If you can assume ASCII, it is easier.
Characters are no more than simple numbers: only the interpretation of said numbers changes. In ASCII all letters are sequential; so the number for 'A' + 5 is the number for 'F'; 'F' - 1 is 'E' ..., ...
int ch = 'J';
ch -= 2; putchar(ch);
ch -= 3; putchar(ch);
ch += 7; putchar(ch); putchar(ch);
ch += 3; putchar(ch);
puts("");
Just pay attention to wrapping!
If you can't assume ASCII, you need to convert characters yourself. Something like:
char charr[26] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int ndx = 9; /* charr[9] is 'J' */
ndx -= 2; putchar(charr[ndx]);
ndx -= 3; putchar(charr[ndx]);
ndx += 7; putchar(charr[ndx]); putchar(charr[ndx]);
ndx += 3; putchar(charr[ndx]);
puts("");
Do not forget the wrapping
Other people have pointed out that you can use ASCII.
An easy way to handle wrapping is with modulus arithmetic:
char result, ch;
int offset;
... // Populate ch with the letter to be changed and offset with the number.
result = ch - 'a';
result = (result + offset) % 26; // 26 letters in the alphabet
result += 'a';
#include <ctype.h>
char shift_char(char c, char shift)
{
if(isalpha(c)) {
if (c>='A' && c<='Z') {
return 'A' + ( (c - 'A' + shift) % 26);
} else if(c>='a' && c<='z') {
return 'a' + ( (c - 'a' + shift) % 26);
}
}
return c;
}
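The % 26 versions above work for non-negative shifts. If negative shifts should also wrap, the offset can be normalised first. A sketch assuming contiguous letters as in ASCII (shift_letter is a hypothetical name):

```c
/* Shifts a letter by any amount, wrapping within its own case.
   Non-letters are returned unchanged. Assumes 'A'..'Z' and 'a'..'z'
   are contiguous, which holds for ASCII. */
char shift_letter(char c, int shift) {
    char base;

    if (c >= 'A' && c <= 'Z') {
        base = 'A';
    } else if (c >= 'a' && c <= 'z') {
        base = 'a';
    } else {
        return c;
    }
    shift %= 26;                 /* now in -25 .. 25 */
    if (shift < 0) {
        shift += 26;             /* now in 0 .. 25 */
    }
    return (char)(base + (c - base + shift) % 26);
}
```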

Resources