Converting from hexadecimal to decimal number in C - c

I am working through the exercises in The C Programming Language, and exercise 2-3 asks us to write a function htoi that converts a hexadecimal number to a decimal number.
This is the code I wrote; however, when it runs, it always reports that my hexadecimal number is illegal.
Please help!
#include<stdio.h>
#define TRUE 1
#define FALSE 0
int htoi (char s[]);
int main() {
printf("The decimal number is %d\n", htoi("0x134"));
return 0;
}
int htoi (char s[]) {
int j; /* counter for the string */
int temp; /* temp number in between conversion */
int number; /* the converted number */
int ishex; /* if the number is a valid hexadecimal number */
char c;
number = 0;
temp = 0;
ishex = FALSE;
if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
ishex = TRUE;
}
else {
ishex = FALSE;
printf("This is not valid hexadecimal number.\n");
return number = 0;
}
if (ishex == TRUE) {
for (j = 2; (c = s[j]) != EOF; ++j) {
if (c >= '0' && c <= '9')
temp = c - '0';
else if (c >= 'a' && c <= 'f')
temp = 10 + c - 'a';
else if (c >= 'A' && c <= 'F')
temp = 10 + c - 'A';
else {
printf("This is a illegal hexadecimal number.\n");
ishex = FALSE;
return 0;
}
number = number * 16 + temp;
}
}
return number;
}

A string is a sequence of characters that terminates at the first '\0' character. That means "0x134" terminates with a '\0' character value, not an EOF value.
You are operating on a sequence of characters that you expect to be terminated by an EOF value, but that is simply not possible. I'll explain why later... Suffice to say for now, the string "0x134" contains no EOF value.
Your loop reaches the string-terminating '\0', which isn't in the range 0..9, a..f or A..F and so this branch executes:
else {
printf("This is a illegal hexadecimal number.\n");
ishex = FALSE;
return 0;
}
Perhaps you meant to write your loop like so:
for (j = 2; (c = s[j]) != '\0'; ++j) {
/* SNIP */
}
I promised to explain what is wrong with expecting EOF to exist as a character value. Assuming an unsigned char is 8 bits, getchar can return one of 256 character values, and it will return them as a non-negative unsigned char value converted to int... OR it can return the negative int value EOF, corresponding to an error or end-of-file.
Confused? In an empty file, there are no characters... Yet if you try to read a character from the file, you will get EOF every time, in spite of there being no characters. Hence, EOF is not a character value. It's an int value, and should be treated as such before you attempt to convert the value to a character, like so:
int c = getchar();
if (c == EOF) {
/* Here, c is NOT A CHARACTER VALUE! *
* It's more like an error code ... *
* XXX: Break or return or something */
}
else {
/* Here, c IS a character value, ... *
* so the following conversion is ok */
char ch = c;
}
On another note, c >= '0' && c <= '9' will evaluate to true exactly when c is one of the digits in the range 0..9. The C standard requires the ten decimal digits to have consecutive values, so this test is portable.
Neither c >= 'a' && c <= 'f' nor c >= 'A' && c <= 'F' is guaranteed to behave that way, however. It happens to work on your system because you are using ASCII, which contains all of the lowercase letters in one contiguous block and all of the uppercase letters in another contiguous block. C does not require that ASCII be the character set.
If you want this code to work portably, you might consider something like the following (strchr is declared in <string.h>):
char alpha_digit[] = "aAbBcCdDeEfF";
if (c >= '0' && c <= '9') {
c -= '0';
}
else if (strchr(alpha_digit, c)) {
c = 10 + (strchr(alpha_digit, c) - alpha_digit) / 2;
}
else {
/* SNIP... XXX invalid digit */
}
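Putting the two points above together (stop at '\0', and look digits up instead of relying on letter ranges), a minimal sketch might look like this. The names hex_digit_value and htoi_checked are hypothetical, the 0x prefix is treated as optional, and overflow is not checked:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 0..15 for a hex digit, or -1 otherwise. */
int hex_digit_value(char c)
{
    static const char lower[] = "0123456789abcdef";
    static const char upper[] = "ABCDEF";
    const char *p;

    if (c == '\0')
        return -1;                      /* strchr would match the terminator */
    if ((p = strchr(lower, c)) != NULL)
        return (int)(p - lower);
    if ((p = strchr(upper, c)) != NULL)
        return 10 + (int)(p - upper);
    return -1;
}

/* Hypothetical variant of htoi: stores the value in *out and returns 1,
 * or returns 0 if the string is not a valid hexadecimal number. */
int htoi_checked(const char *s, unsigned *out)
{
    unsigned n = 0;
    size_t j = 0;

    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        j = 2;                          /* skip the optional prefix */
    if (s[j] == '\0')
        return 0;                       /* no digits at all */
    for (; s[j] != '\0'; ++j) {         /* strings end at '\0', not EOF */
        int d = hex_digit_value(s[j]);
        if (d < 0)
            return 0;
        n = n * 16 + (unsigned)d;
    }
    *out = n;
    return 1;
}
```

Returning a status separately from the value avoids using 0 both as a result and as an error marker.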

Related

How do I get my integer arithmetic into a long long?

As part of exercise 2-3 in Ritchie and Kernighan's C programming language, I've written a program that converts hexadecimal inputs into decimal outputs. I want it to be able to handle larger numbers, but it seems to be doing integer arithmetic somewhere. When you enter something like "DECAFCAB" it spits out a large negative int. I figured out that I need to add the "LL" suffix to my literals, which I did, but it's still not working. Any help please? Sorry if this is a dumb question or a typo, but I've been at it for an hour and can't figure it out. :(
#include <stdio.h>
#define MAX_LINE 1000
void getline(char s[])
{
int i;
char c;
for(i = 0; i < MAX_LINE-1 && (c=getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
s[i] = '\0';
printf("\n%s", s);
}
long long htoi(char s[]) // convert the hex string to dec
{
long long n = 0;
int i = 0;
if(s[i] == '0') // eat optional leading Ox or OX
++i;
if(s[i] == 'x' || s[i] == 'X')
++i;
while(s[i] != '\0')
{
if((s[i] >= '0' && s[i] <= '9'))
n = 16LL * n + (s[i] - '0'); // here is the arithmetic in question
else if(s[i] >= 'A' && s[i]<= 'F')
n = 16LL * n + (s[i] - 'A' + 10LL);
else if(s[i] >= 'a' && s[i] <= 'f')
n = 16LL * n + (s[i] - 'a' + 10LL);
else {
printf("\nError: Encountered a non-hexadecimal format: the '%c' character was unexpected.", s[i]);
printf("\nHexadecimal numbers can begin with an optional 0x or 0X only, and contain 0-9, A-F, and a-f.\n\n");
return -1;
}
++i;
}
return n;
}
main()
{
char input[MAX_LINE];
long long hex_output;
while(1){
getline(input);
hex_output = htoi(input);
if(hex_output >= 0)
printf("\nThe value of the hexadecimal %s is %d in decimal.\n\n", input, hex_output);
}
}
You told printf to expect an int when you made the placeholder %d. To make it expect (and therefore read the entirety of a) long long, modify it to %lld.
The reason the output looks like a plain int is that a varargs function like printf cannot know the types of its arguments; the format string is its only source of that information. Passing a long long where %d expects an int is undefined behavior. In practice, printf typically reads only sizeof(int) bytes' worth of the argument rather than sizeof(long long), so on a little-endian system with a 4-byte int and an 8-byte long long you see (roughly) the argument with the top 4 bytes masked off.
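As a small illustration of matching the specifier to the argument type, here is a sketch that formats a long long with %lld. The helper name format_long_long is hypothetical; snprintf is used only so the result can be inspected as a string:

```c
#include <stdio.h>

/* Hypothetical helper: writes value into buf as decimal text.
 * %lld tells the printf family to expect a long long argument. */
int format_long_long(char *buf, size_t size, long long value)
{
    return snprintf(buf, size, "%lld", value);
}
```

Compilers such as GCC and Clang can usually warn about a mismatched specifier when warnings are enabled (for example with -Wall).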
The problem you are experiencing comes from treating a (conventionally) "unsigned" hexadecimal integer value as "signed". Resorting to a larger built-in data type will get you past the problem of going from 31 to 32 bits, but this masks the actual problem. (If you extend to 64 bits, you will encounter the same problem and be back asking, "why doesn't this work?")
Better is to write code that doesn't require ever wider registers. There will always be a maximum width, but the answer to this OP is to use an "unsigned long".
#include <stdio.h>
unsigned long htoi( char s[] ) { // convert the hex string to dec
unsigned long n = 0;
int i = 0;
if(s[i] == '0') // eat optional leading Ox or OX
++i;
if(s[i] == 'x' || s[i] == 'X')
++i;
for( ; s[i]; i++ ) {
unsigned int dVal = 0; // don't copy/paste complex statements.
if((s[i] >= '0' && s[i] <= '9'))
dVal = s[i] - '0'; // simple
else if(s[i] >= 'A' && s[i]<= 'F')
dVal = s[i] - 'A' + 10; // simple
else if(s[i] >= 'a' && s[i] <= 'f')
dVal = s[i] - 'a' + 10; // simple
else {
// less verbose
printf("\nError: '%c' unexpected.", s[i] );
return 0; // NB: Notice change!!
}
n = (16 * n) + dVal; // simple...
}
return n;
}
int main() {
// simplified, stripping out user input.
char *hexStr = "0xDECAFCAB";
unsigned long hex_output = htoi( hexStr );
// Notice the format specifier: %lu prints an unsigned long
printf( "\nThe value of the hexadecimal %s is %lu in decimal.\n\n", hexStr, hex_output );
return 0;
}
The value of the hexadecimal 0xDECAFCAB is 3737844907 in decimal.
When K&R wrote the original book, there was no such thing as "long long", but there was "unsigned long".

Convert a string containing a base 10 number to an integer value

I am fairly new to programming and I am trying to convert a string containing a base 10 number to an integer value following this pseudo-algorithm in C.
start with n = 0
read a character from the string and call it c
if the value of c is between '0' and '9' (48 and 57):
n = n * 10 +(c-'0')
read the next character from the string and repeat
else return n
Here is a rough version of what I wrote down; however, I am not clear on how to read a character from the string. I guess I'm asking whether I understand the pseudocode correctly.
stoi(char *string){
int n = 0;
int i;
char c;
for (i = 0;i < n ; i++){
if (c[i] <= '9' && c[i] >= '0'){
n = n *10 +(c - '0')}
else{
return n
}
}
}
You were close, you just need to traverse the string to get the value of each digit.
Basically you have two ways to do it.
Using array notation:
int stoi(const char *str)
{
int n = 0;
for (int i = 0; str[i] != '\0'; i++)
{
char c = str[i];
if ((c >= '0') && (c <= '9'))
{
n = n * 10 + (c - '0');
}
else
{
break;
}
}
return n;
}
or using pointer arithmetic:
int stoi(const char *str)
{
int n = 0;
while (*str != '\0')
{
char c = *str;
if ((c >= '0') && (c <= '9'))
{
n = n * 10 + (c - '0');
}
else
{
break;
}
str++;
}
return n;
}
Note that in both cases we iterate until the null character '\0' (which is the one that marks the end of the string) is found.
Also, prefer const char *string over char *string when the function doesn't need to modify the string (like in this case).
Congrats on starting your C journey!
One of the most important aspects of strings in C is that, technically, there are none. A string is not a primitive type like in Java. You CAN'T do:
String myString = "Hello";
In C, each string is just an array of multiple characters. That means the word Hello is just the array of [H,e,l,l,o,\0]. Here, the \0 indicates the end of the word. This means you can easily access any character in a string by using indexes (like in a normal array):
char *myString = "Hello";
printf("%c", myString[0]); //Here %c indicates to print a character
This will then print H, since H is the first character in the string. I hope you can see how you can access any character in the string.
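To make the indexing idea concrete, here is a minimal sketch that walks a string one character at a time until it reaches the terminating \0. The function name count_char is hypothetical and only serves as an illustration:

```c
#include <stddef.h>

/* Hypothetical helper: counts how many times ch occurs in the string s. */
size_t count_char(const char *s, char ch)
{
    size_t count = 0;

    /* s[i] is the i-th character; the loop stops at the '\0' terminator. */
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == ch)
            count++;
    }
    return count;
}
```

The same loop shape (index from 0, stop at '\0') is what the digit-by-digit conversion in the question needs.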

C program to capitalize a word inside quotation marks

I need to build a function that gets an input and capitalizes only the first letter, doesn't print numbers, capitalizes after a . for a new sentence, and capitalizes all words between a double quotation marks ".
This is what I got until now:
#include <stdio.h>
#define MAX 100
int main()
{
char str[MAX] = { 0 };
int i;
//input string
printf("Enter a string: ");
scanf("%[^\n]s", str); //read string with spaces
//capitalize first character of words
for (i = 0; str[i] != '\0'; i++)
{
//check first character is lowercase alphabet
if (i == 0)
{
if ((str[i] >= 'a' && str[i] <= 'z'))
str[i] = str[i] - 32; //subtract 32 to make it capital
continue; //continue to the loop
}
if (str[i] == '.')//check dot
{
//if dot is found, check next character
++i;
//check next character is lowercase alphabet
if (str[i] >= 'a' && str[i] <= 'z')
{
str[i] = str[i] - 32; //subtract 32 to make it capital
continue; //continue to the loop
}
}
else
{
//all other uppercase characters should be in lowercase
if (str[i] >= 'A' && str[i] <= 'Z')
str[i] = str[i] + 32; //subtract 32 to make it small/lowercase
}
}
printf("Capitalize string is: %s\n", str);
return 0;
}
I can't find a way to remove all numbers from the input and convert all lowercase letters to uppercase inside double quotes, plus code for not printing numbers if the user inputs them.
If I input
I am young. You are young. All of us are young.
"I think we need some help. Please" HELP. NO, NO NO,
I DO NOT
NEED HELP
WHATSOEVER.
"Today’s date is
15/2/2021"...
I am 18 years old, are you 20 years old? Maybe 30 years?
output:
I am young. You are young. All of us are young.
"I THINK WE NEED SOME HELP. PLEASE" help. No, no no,
i do not
need help
whatsoever.
"TODAY’S DATE IS
//"...
I am years old, are you years old? maybe years?
The C standard library provides a set of functions, declared in ctype.h, that will help you.
Of particular interest would be:
isdigit() - returns true if digit
isalpha() - returns true if alphabet character
isalnum() - returns true if alpha/numeric character
islower() - returns true if lower case character
isupper() - returns true if upper case character
tolower() - converts character to lower case
toupper() - converts character to upper case
So, for example, you could replace the test/modify with:
if ( islower( str[i] ) )
{
str[i] = toupper( str[i] );
}
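Pedantically, islower() and toupper() take an int argument that must be representable as an unsigned char (or be EOF) and return an int, so a plain char should be cast to unsigned char first, but that's a separate matter...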
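A minimal sketch of the islower()/toupper() replacement applied to a whole string, with the argument converted to unsigned char before it is passed to the <ctype.h> functions. The name upcase_in_place is hypothetical:

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases every lowercase letter of s in place. */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        /* Convert to unsigned char so negative char values are never
         * passed to the <ctype.h> functions. */
        unsigned char uc = (unsigned char)*s;
        if (islower(uc))
            *s = (char)toupper(uc);
    }
}
```

Digits, spaces and punctuation pass through unchanged because islower() is false for them.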
You can remove characters from a string if you keep two indices, one for reading and one for writing. The following loop will remove all digits from a string:
int j = 0; // writing index, j <= i
int i; // reading index
for (i = 0; str[i]; i++) {
int c = (unsigned char) str[i];
if (!isdigit(c)) str[j++] = c;
}
str[j] = '\0';
(I've used the character classification functions from <ctype.h> mentioned in Andrew's answer.)
This is safe, because j will always be smaller than or equal to i. Don't forget to mark the end of the filtered string with the null terminator, '\0'. You can combine this filtering with your already existing code for replacing characters.
In your code, you capitalize letters only if they are directly behind a full stop. That's usually not the case, there's a space between full stop and the next word. It's better to establish a context:
shift: capitalize the next letter (beginning or after full stop.)
lock: capitalize all letters (inside quotation marks.)
When you read a letter, decide whether to capitalize it or not depending on these two states.
Putting the filtering and the "shift context" together:
#include <stdio.h>
#include <ctype.h>
int main(void)
{
char str[] = "one. two. THREE. 4, 5, 6. \"seven\", eight!";
int shift = 1; // Capitalize next letter
int lock = 0; // Capitalize all letters
int j = 0; // writing index, j <= i
int i; // reading index
for (i = 0; str[i]; i++) {
int c = (unsigned char) str[i];
if (isdigit(c)) continue;
if (isalpha(c)) {
if (shift || lock) {
str[j++] = toupper(c);
shift = 0;
} else {
str[j++] = tolower(c);
}
} else {
if (c == '"') lock = !lock;
if (c == '.') shift = 1;
str[j++] = c;
}
}
str[j] = '\0';
puts(str);
printf("(length: %d)\n", j);
return 0;
}
In order to remove some characters, you should use 2 index variables: one for reading and one for writing back to the same array.
If you are allowed to use <ctype.h>, it is a much more portable and efficient way to test character types.
Also, do not use scanf() without protection against buffer overflow: an unbounded %[^\n] conversion is as bad as gets(). Given the difficulty of specifying the maximum number of bytes to store into str, you should use fgets() instead of scanf().
Here is a modified version:
#include <ctype.h>
#include <stdio.h>
#define MAX 100
int main() {
char str[MAX];
int i, j;
unsigned char last, inquote;
//input string
printf("Enter a string: ");
if (!fgets(str, sizeof str, stdin)) { //read string with spaces
// empty file
return 1;
}
last = '.'; // force conversion of first character
inquote = 0;
//capitalize first character of words
for (i = j = 0; str[i] != '\0'; i++) {
unsigned char c = str[i];
//discard digits
if (isdigit(c)) {
continue;
}
//handle double quotes:
if (c == '"') {
inquote ^= 1;
}
//upper case letters after . and inside double quotes
if (last == '.' || inquote) {
str[j++] = toupper(c);
} else {
str[j++] = tolower(c);
}
if (!isspace(c) && c != '"') {
// ignore spaces and quotes for the dot rule
last = c;
}
}
str[j] = '\0'; // set the null terminator in case characters were removed
printf("Capitalized string is: %s", str);
return 0;
}
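The fgets() advice above can also be packaged as a small reusable helper. This is only a sketch: the name read_line and its return convention are hypothetical. Unlike the program above, it strips the trailing newline that fgets() stores:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (at most size - 1
 * characters) and removes the trailing newline if present.
 * Returns 1 on success, 0 on end of file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the first newline, if any */
    return 1;
}
```

A call such as read_line(stdin, str, sizeof str) can never write past the end of str, which is the property the unbounded scanf() conversion lacks.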

How does getchar_unlocked() work?

My question is based on a CodeChef problem called Lucky Four.
This is my code:
int count_four() {
int count = 0;
char c = getchar_unlocked();
while (c < '0' || c > '9')
c = getchar_unlocked();
while (c >= '0' && c <= '9') {
if (c == '4')
++count;
c = getchar_unlocked();
}
return count;
}
int main() {
int i, tc;
scanf("%d", &tc);
for (i = 0; i < tc; ++i) {
printf("%d\n", count_four());
}
return 0;
}
Let's say I make a slight change to count_four():
int count_four() {
int count = 0;
char c = getchar_unlocked();
while (c >= '0' && c <= '9') {
if (c == '4')
++count;
c = getchar_unlocked();
}
while (c < '0' || c > '9') // I moved this `while` loop
c = getchar_unlocked();
return count;
}
This is my output after moving the while loop below the other one:
0
3
0
1
0
instead of:
4
0
1
1
0
The input used to test the program:
5
447474
228
6664
40
81
Why is this happening? How do getchar() and getchar_unlocked() work?
getchar_unlocked is just a lower level function to read a byte from the stream without locking it. In a single thread program, it behaves exactly like getchar().
Your change in the count_four function changes its behavior completely.
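The original function reads the standard input. It first skips non-digits (which loops forever at end of file, because EOF is never a digit), then reads consecutive digits, counting each '4', until it reaches a non-digit. The count is returned.
Your version starts by reading digits and counting occurrences of '4', then skips non-digits (with the same bug on EOF), and finally returns the count. The difference matters because scanf("%d", &tc) leaves the newline after the 5 in the input: on the first call your version sees that newline immediately, counts nothing, and returns 0. Its trailing skip loop only stops once it has read a digit, so it also swallows the first digit of the next number; the second call therefore counts the fours in 47474 rather than 447474 and returns 3, and every later result is shifted by one line in the same way.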
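A sketch of how the end-of-file bug could be avoided: keep the character in an int and test for EOF in both loops. This version is hypothetical; it takes a FILE * and uses the standard getc() instead of getchar_unlocked() purely so it is portable and easy to try on any stream:

```c
#include <stdio.h>

/* Hypothetical rewrite of count_four: counts the '4' digits in the next
 * number read from in. Returns 0 if the stream ends before any digit. */
int count_fours(FILE *in)
{
    int count = 0;
    int c = getc(in);                 /* int, so EOF stays distinguishable */

    /* Skip separators first, but stop at end of file. */
    while (c != EOF && (c < '0' || c > '9'))
        c = getc(in);

    /* Consume one run of digits, counting the fours. */
    while (c != EOF && c >= '0' && c <= '9') {
        if (c == '4')
            ++count;
        c = getc(in);
    }
    return count;
}
```

Skipping the separators before the digits, as in the original ordering, is what keeps each call aligned with one input line.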

Bizarre behavior from C program: Kernighan & Ritchie exercise 2-3

I've written a program as a solution to Kernighan & Ritchie's exercise 2-3, and its behaviour during testing is (IMHO) wildly unintuitive.
The problem spec says to write a program that converts hex values to their decimal equivalent. The code I've written works fine for smaller hex values, but for larger hex values things get a little... odd. For example, if I input 0x1234 the decimal value 4660 pops out on the other end, which happens to be the correct output (the code also works for letters, i.e. 0x1FC -> 508). If, on the other hand, I were to input a large hex value, say as a specific example 0x123456789ABCDEF, I should get 81985529216486895, though instead I get 81985529216486896 (off by one digit!).
The error in conversion is inconsistent, sometimes with the decimal value being too high and other times too low. Generally, much larger hex values result in more incorrect place values in the decimal output.
Here's my program in its entirety:
/*Kernighan & Ritchie's Exercise 2-3
Write a function 'htoi' which converts a string of hexadecimal digits (including an
optional 0x or 0X) into its equivalent integer value.
*/
#include <stdio.h>
#define MAXLINE 1000 //defines maximum size of a hex input
//FUNCTION DEFINITIONS
signed int htoi(char c); //converts a single hex digit to its decimal value
//BEGIN PROGRAM////////////////////////////////////////////////////////////
main()
{
int i = 0; //counts the length of 'hex' at input
char c; //character buffer
char hex[MAXLINE]; //string from input
int len = 0; //the final value of 'i'
signed int val; //the decimal value of a character stored in 'hex'
double n = 0; //the decimal value of 'hex'
while((c = getchar()) != '\n') //store a string of characters in 'hex'
{
hex[i] = c;
++i;
}
len = i;
hex[i] = '\0'; //turn 'hex' into a string
if((hex[0] == '0') && ((hex[1] == 'x') || (hex[1] == 'X'))) //ignore leading '0x'
{
for(i = 2; i < len; ++i)
{
val = htoi(hex[i]); //call 'htoi'
if(val == -1 ) //test for a non-hex character
{
break;
}
n = 16.0 * n + (double)val; //calculate decimal value of hex from hex[0]->hex[i]
}
}
else
{
for(i = 0; i < len; ++i)
{
val = htoi(hex[i]); //call 'htoi'
if(val == -1) //test for non-hex character
{
break;
}
n = 16.0 * n + (double)val; //calc decimal value of hex for hex[0]->hex[i]
}
}
if(val == -1)
{
printf("\n!!THE STRING FROM INPUT WAS NOT A HEX VALUE!!\n");
}
else
{
printf("\n%s converts to %.0f\n", hex, n);
}
return 0;
}
//FUNCTION DEFINITIONS OUTSIDE OF MAIN()///////////////////////////////////
signed int htoi(char c)
{
signed int val = -1;
if(c >= '0' && c <= '9')
val = c - '0';
else if(c == 'a' || c == 'A')
val = 10;
else if(c == 'b' || c == 'B')
val = 11;
else if(c == 'c' || c == 'C')
val = 12;
else if(c == 'd' || c == 'D')
val = 13;
else if(c == 'e' || c == 'E')
val = 14;
else if(c == 'f' || c == 'F')
val = 15;
else
{
;//'c' was a non-hex character, do nothing and return -1
}
return val;
}
pastebin: http://pastebin.com/LJFfwSN5
Any ideas on what is going on here?
You are probably exceeding the precision with which double can store integers.
My suggestion would be to change your code to use unsigned long long for the result; and also add in a check for overflow here, e.g.:
unsigned long long n = 0;
// ...
if ( n > (ULLONG_MAX - val) / 16 ) // would n * 16 + val overflow?
{
fprintf(stderr, "Number too big.\n");
exit(EXIT_FAILURE);
}
n = n * 16 + val;
The check has to happen before the multiplication. Unsigned integer types wrap around modulo 2^N on overflow, so a simpler test such as n * 16 + val < n looks tempting, but it misses cases where the wrapped result still ends up larger than n (for example n = 0x1800000000000000 with a 64-bit type).
If you want to add more precision than unsigned long long then you will have to get into more advanced techniques (probably beyond the scope of Ch. 2 of K&R but once you've finished the book you could revisit).
NB. You also need to #include <stdlib.h> for exit and <limits.h> for ULLONG_MAX if you take my suggestion; and don't forget to change %.0f to %llu in your final printf. Also, a safer way to get the input (which K&R covers) is:
int c;
while((c = getchar()) != '\n' && c != EOF)
The first time I ran the code on ideone I got a segfault, because I didn't put a newline at the end of stdin, so this loop kept shoving EOF into hex until the buffer overflowed.
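As a sketch of overflow detection for the unsigned long long approach, the step that appends one hex digit can be written so the test happens before the multiplication and never depends on the wrapped result. The helper name append_hex_digit is hypothetical:

```c
#include <limits.h>

/* Hypothetical helper: appends one hex digit value (0..15) to *n.
 * Returns 1 on success, or 0 (leaving *n unchanged) if the result
 * would not fit in an unsigned long long. */
int append_hex_digit(unsigned long long *n, unsigned val)
{
    /* n * 16 + val fits exactly when n <= (ULLONG_MAX - val) / 16. */
    if (*n > (ULLONG_MAX - val) / 16)
        return 0;
    *n = *n * 16 + val;
    return 1;
}
```

The caller would loop over the digits and stop with an error message as soon as the helper returns 0.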
This is a classic example of floating point inaccuracy.
Unlike most of the examples of floating point errors you'll see, this is clearly not about non-binary fractions or very small numbers; in this case, the floating point representation is approximating very big numbers, with the accuracy decreasing the higher you go. The principle is the same as writing "1.6e10" to mean "approximately 16000000000" (I think I counted the zeros right there!), when the actual number might be 16000000001.
You actually run out of accuracy sooner than with an integer of the same size because only part of the width of a floating point variable can be used to represent a whole number.
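The loss of accuracy can be observed directly by converting an integer to double and back. The sketch below assumes the common IEEE 754 64-bit double with a 53-bit significand; the helper name fits_in_double is hypothetical:

```c
/* Hypothetical helper: returns 1 if v survives a round trip through
 * double unchanged, 0 if the double can only approximate it. */
int fits_in_double(unsigned long long v)
{
    /* volatile forces the value to be rounded to double precision. */
    volatile double d = (double)v;

    /* Guard against d having rounded up to 2^64, which would be out of
     * range for the conversion back to unsigned long long. */
    return d < 18446744073709551616.0 && (unsigned long long)d == v;
}
```

Under that assumption, 0x1234 converts exactly, while 0x123456789ABCDEF (81985529216486895) does not: representable doubles near that value are 16 apart, and the nearest one is 81985529216486896, the value reported in the question.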
