Translate string to number - c

I am looking for a way to take a string and check 3 possibilities:
1. It is a number, in which case it is converted to a signed int (not a long)
2. It is a symbolic representation previously defined at runtime, in which case it is converted to a signed int
3. Neither
The "symbolic representation" will basically be an associative array that starts with 0 elements and expands as more symbols are added. For example, suppose C had associative arrays (I wish), as in this pseudocode:
symbol_array['q'] = 3;
symbol_array['five'] = 5;
symbol_array['negfive'] = -5;
symbol_array['random294'] = 28;
signed int i;
string = get_from_input();
if(!(i = convert_to_int(string))) {
if(!(i = translate_from_symbol(string))) {
printf("Invalid symbol or integer\n");
exit(1);
}
}
printf("Your number: %d\n", i);
The idea is that if they entered "5" it would convert it to 5 via convert_to_int, and if they entered "five" it would convert it to 5 via translate_from_symbol. What I feel may be hardest is that if they entered "random294" it shouldn't convert it to 294, but to 28. If they entered "foo" then it would exit(1).
My general questions are these: (Instead of making multiple posts)
When making convert_to_int I know I shouldn't use atoi because it doesn't fail right. Some people say to use strtol but it seems tedious to convert it back to a non-long int. The simplistic (read: shortest) way I've found is using sscanf:
int i;
if ((sscanf(string, "%d", &i)) == 1){
return i;
}
However, some people look down on that even. What is a better method if not sscanf or converting strtol?
Secondly, how can I not only return an integer but also know whether one was found? For example, if the user entered "0" then it would return 0, thus triggering the FALSE branch of my if statement. I had considered returning -1 if nothing was found, but since I am returning signed ints this suffers from the same problem. In PHP, I know that strpos, for example, is checked with === FALSE.
Finally, is there any short code that emulates associative arrays and/or lets me push elements onto the array at runtime?

First, you might want to revise your syntax and set the keyword apart from the operand, i.e. "neg five" instead of "negfive". Otherwise your symbol lookup for the keywords has to consider every prefix. ("random294" might be okay if your keywords aren't allowed to have digits in them.)
Sure, sscanf tells you via its return value whether it found a decimal number and writes that number to a separate int, which is nice, but you'll have to watch out for trailing characters by using the %n format to check that the number of characters read equals the length of your string. Otherwise, sscanf will consider 5x a legal decimal number. strtol also returns a pointer to the location after the parsed decimal number, but it relies too much on checking errno for my taste.
The fact that strtol uses long integers shouldn't be an issue. If the input doesn't fit into an int, return INT_MAX or INT_MIN or issue an error.
You can also easily write a wrapper function around sscanf or strtol that suits your needs better. (I know I'd like a function that returns true on success and stores the integer via a pointer argument, sscanf style, where success means: no trailing non-digit characters.)
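A minimal sketch of such a wrapper, built on strtol. The name parse_int and its signature are made up for illustration; "success" here means the whole string was consumed and the value fits in an int.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true and stores the value in *out only if
   the whole string is a decimal integer that fits in an int. */
bool parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')      /* no digits, or trailing characters */
        return false;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return false;                  /* does not fit in an int */
    *out = (int)v;
    return true;
}
```

Because the result comes back through the pointer, an input of "0" is no longer confused with failure: the caller tests the bool and reads the int separately. Note that strtol itself skips leading whitespace, so " 5" would still be accepted by this sketch.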
Finally, about the associative arrays: There is no short code, at least not in C. You'll have to implement your own hash map or use a library. As a first draft, I'd use a linear list of strings and check them one by one. This is a very naive approach, but easy to implement. I assume that you don't start out with a lot of symbols, and you're not doing a lot of checks, so speed shouldn't be an issue. (You can sort the array and use binary search to speed it up, but you'd have to re-sort after every insertion.) Once you have the logic of your program working, you can start thinking about hash maps.
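As a rough illustration of that first draft, here is a growable array of name/value pairs that is searched linearly. All names (struct symbol, symbol_set, symbol_get) are hypothetical, and the table is a single global for brevity.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol table: a growable array searched linearly. */
struct symbol { char *name; int value; };

static struct symbol *symbols = NULL;
static size_t symbol_count = 0, symbol_cap = 0;

/* Adds or updates a symbol; returns false if memory runs out. */
bool symbol_set(const char *name, int value)
{
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) == 0) {
            symbols[i].value = value;      /* existing name: overwrite */
            return true;
        }
    }
    if (symbol_count == symbol_cap) {      /* array full: double its size */
        size_t cap = symbol_cap ? symbol_cap * 2 : 8;
        struct symbol *grown = realloc(symbols, cap * sizeof *grown);
        if (grown == NULL)
            return false;
        symbols = grown;
        symbol_cap = cap;
    }
    char *copy = malloc(strlen(name) + 1); /* keep our own copy of the name */
    if (copy == NULL)
        return false;
    strcpy(copy, name);
    symbols[symbol_count].name = copy;
    symbols[symbol_count].value = value;
    symbol_count++;
    return true;
}

/* Looks a symbol up; returns true and stores its value if found. */
bool symbol_get(const char *name, int *out)
{
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) == 0) {
            *out = symbols[i].value;
            return true;
        }
    }
    return false;
}
```

Since the whole name is compared with strcmp, "random294" maps to 28 and is never mistaken for the number 294.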

Something like this should do your job:
#include <stdio.h>
#include <string.h>
struct StringToLongLookUp {
char *str;
char *num;
};
struct StringToLongLookUp table[] =
{
{ "q" , "3" },
{ "five" , "5" },
{ "negfive" , "-5" },
{ "random294", "28" }
};
int translate_from_symbol(char **str)
{
int i;
for(i = 0; i < (sizeof(table) / sizeof(struct StringToLongLookUp)); i++)
{
if(strcmp(*str, table[i].str) == 0)
{
*str = table[i].num;
return 1; // TRUE
}
}
return 0; // FALSE
}
int main()
{
char buf[100];
char *in = buf;
char *out;
int val;
scanf("%99s", in); // width limit so long input cannot overflow buf
translate_from_symbol(&in);
val = strtol(in, &out, 10);
if (in != out)
{
printf("\nValue = %d\n", val);
}
else
{
printf("\nValue Invalid\n");
}
}
Of course, you get a long, but converting that to int shouldn't be an issue as mentioned above.


C11: how to quickly convert a char array into ints, then modify ints and update char array

There are two parts of the problem that I don't know how to solve:
Input
The user can enter some inputs like 12,14y or 15m and I need to extract the two ints and the character. For now, I simply use:
char buffer[50];
scanf("%s", buffer);
switch (buffer[strlen(buffer)-1]) {
// ... I use this to read the last char
}
This can give me the information of how many ints I have to read:
one in the m,n case -> sscanf(buffer, "%d%c", &int1, &c)
two in the y,s,b case -> sscanf(buffer, "%d,%d%c", &int1, &int2, &c)
I need these numbers for the core of my program, so I need int values not only the string.
The problem is that online I read about sscanf inefficiency and I need a good way to do this task quickly.
Output
My code has to modify these numbers in just one case (y) and keep a modified copy of the user input. For example, if the user's input is 1,12y then I have to change it to 1,10y and store it as a char array, so it's not only an input. The modification of int2 is quite long to explain; I can say that the new value will be less than the original one (in my example, 12 becomes 10). The only idea I have so far is how to create the new char array: I can calculate the lengths of int1 and int2 by dividing them by increasing powers of 10 until I get a result between 1 and 9, e.g.:
int1 = 201:
201 no
20.1 no
2.01 yes
=> 3 tries, length = 3
Then I use a malloc. But then, how can I write my "output" in the new char array? e.g.:
input = "1,201y"
-> int1 = 1, int2 = 201
-> length(int1) = 1, length(int2) = 2
// if the core program sets int2 = 51, then
char *out = malloc(1+2+1);
// now I have to write "1,51y" in this char array
I've coded the "core" program already, but now I'd want to improve a fast "translation" of user input (because in the core program I need to know if it's a int1m or int1n or int1,int2y or int1,int2s or int1,int2b command) and I don't know how to modify user input to save it in a string (for strings I use char arrays dynamically allocated). Only the y command could modify int2.
I hope it's clear what I have to do.
"The problem is that online I read about sscanf inefficiency"
"Online" isn't a very trustworthy source. Inefficiency depends entirely on what you compare the function with.
If you compare with any plain C function then all of the stdio.h functions are very much inefficient. As is malloc for that matter. However, printing to the screen and waiting on the human user are by far the largest bottlenecks in this program, so you might want to re-consider why and what you are optimizing.
That being said, you can easily roll out a manual, specialized version of the string-to-integer conversion by calling the strtol family of functions. Here's a version supporting exactly 1 or 2 integers in the input string (it can easily be rewritten to use a loop instead):
#include <stdlib.h>
int parse_input (const char* input, int* i1, int* i2, char* ch)
{
char* endptr=NULL;
const char* cptr=input;
int result;
result = strtol(cptr, &endptr, 10);
if(cptr==endptr)
{
return 0;
}
*i1 = result;
if(*endptr != ',')
{
*ch = *endptr;
return 1;
}
cptr=endptr+1;
result = strtol(cptr, &endptr, 10);
if(cptr==endptr)
{
return 0;
}
*i2 = result;
*ch = *endptr;
return 2;
}
Some extra error handling might be needed too. This gives around 50 instructions when compiled for x86_64, not counting the strtol calls, and some 20 of those instructions are related to parameter stacking and the calling convention.
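For the output half of the question, one possible approach (not part of the answer above) is to let snprintf measure the string instead of counting digits by hand. This is a sketch; the function name format_command is made up, and it assumes the two-integer form of the command.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds "int1,int2<cmd>" in a freshly allocated
   string. The caller frees the result. Returns NULL on failure. */
char *format_command(int int1, int int2, char cmd)
{
    /* With a NULL buffer and size 0, snprintf only reports the length. */
    int len = snprintf(NULL, 0, "%d,%d%c", int1, int2, cmd);
    if (len < 0)
        return NULL;
    char *out = malloc((size_t)len + 1);   /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;
    snprintf(out, (size_t)len + 1, "%d,%d%c", int1, int2, cmd);
    return out;
}
```

For int1 = 1, int2 = 51 and cmd = 'y' this produces "1,51y", which needs 6 bytes including the terminator: the comma, the command character and the NUL all have to be counted along with the digits.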

Writing an array of integers into a file using C [duplicate]

I would like to write an array of integers into a file using C. However, I get some gibberish in the file.
The code is about a function that converts a decimal number into binary then stores it into a file.
int * decToBinary(int n) //function to transform the decimal numbers to binary
{
static int binaryNum[16]; // array to store binary number
int i = 0; // counter for binary array
while (n > 0) {
binaryNum[i] = n % 2; // storing remainder in binary array
n = n / 2;
i++;
}
return binaryNum;
}
int main()
{
FILE *infile;
int i;
int *p;
int decimal= 2000;
int written = 0;
infile = fopen("myfile.txt","w");
p = decToBinary(decimal);
written = fwrite(p,sizeof(int),sizeof(p),infile) ;
if (written == 0) {
printf("Error during writing to file !");
}
fclose(infile);
return 0;
}
This is what I get in my file:
This is what I get when I write some text as a test; there is no problem with the text, only with the array:
char str[] = "test text --------- \n";
infile = fopen("myfile.txt","wb");
p=decToBinary(decimal);
fwrite(str , 1 , sizeof(str) , infile);
written = fwrite(p,sizeof(int),sizeof(p),infile) ;
And this is what I get when I make this change:
written = fwrite(&p,sizeof(int),sizeof(p),infile) ;
First, be aware that there are two interpretations for 'binary':
int n = 1012;
fwrite(&n, sizeof(n), 1, file);
This writes out the data just as is; as it is represented in form of bits, output is considered "binary" (a binary file).
Your question and the code you provided, though, rather imply that you actually want to have a file containing the numbers in binary text format, i. e. 7 being represented by string "111".
Then first, be aware that 0 and 1 do not represent the characters '0' and '1' in most, if not all, encodings. Assuming ASCII or compatible, '0' is represented by value 48, '1' by value 49. As C standard requires digits [0..9] being consecutive characters (this does not apply for any other characters!), you can safely do:
binaryNum[i] = '0' + n % 2;
Be aware that, as you want strings, you chose the wrong data type; you need a character array:
static char binaryNum[X];
X??? We need to talk about required size!
If we create strings, we need to null-terminate them. So we need place for the terminating 0-character (really value 0, not 48 for character '0'), so we need at least one character more.
Currently, due to the comparison n > 0, you consider negative values as equal to 0. Do you really intend this? If so, you might consider unsigned int as data type, otherwise, leave some comment, then I'll cover handling negative values later on.
With the restriction to positive values, 16 + 1 as size is enough for values below 65536 (so 2000 fits), but a 32-bit int can need up to 31 binary digits, and the C standard allows int to be smaller or larger as well. If you want to be portable, use CHAR_BIT * sizeof(int) as the size (CHAR_BIT is defined in <limits.h>); this covers the value bits of a non-negative int plus the terminator. Add 1 if you switch to unsigned int.
There is one special case not covered: integer value 0 won't enter the loop at all, thus you'd end up with an empty string, so catch this case separately:
if(n == 0)
{
binaryNum[i++] = '0';
}
else
{
while (n > 0) { /* ... */ }
}
// now the important part:
// terminate the string!
binaryNum[i] = 0;
Now you can simply do (assuming you changed p to char*):
written = fprintf(file, "%s\n", p);
// ^^ only if you want to have each number on separate line
// you can replace with space or drop it entirely, if desired
Be aware that the algorithm, as is, prints out the least significant bits first! You might want the opposite order; then you'd either have to reverse the string afterwards or (which I would prefer) start by writing the terminating 0 at the end of the buffer and then fill in the digits one by one towards the front, returning a pointer to the last digit written (the most significant one) instead of always the start of the buffer.
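A sketch of that fill-from-the-end variant, assuming you switch to unsigned int and let the caller supply the buffer. The names to_binary and BINARY_BUF_SIZE are made up for illustration.

```c
#include <limits.h>

/* Enough room for every bit of an unsigned int plus the terminator. */
#define BINARY_BUF_SIZE (CHAR_BIT * sizeof(unsigned int) + 1)

/* Hypothetical variant of decToBinary: writes the digits from the end of
   buf toward the front and returns a pointer to the most significant digit. */
char *to_binary(unsigned int n, char buf[BINARY_BUF_SIZE])
{
    char *p = buf + BINARY_BUF_SIZE - 1;
    *p = '\0';                       /* terminator goes in first */
    do {
        *--p = (char)('0' + n % 2);  /* least significant digit first, moving left */
        n /= 2;
    } while (n > 0);                 /* do-while, so 0 still yields "0" */
    return p;
}
```

The returned pointer can be passed straight to fprintf(file, "%s\n", ...). Note that it points into the middle of buf, so the buffer has to stay alive for as long as the string is used.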
One word about your original version:
written = fwrite(p, sizeof(int), sizeof(p), infile);
sizeof(p) gives you the size of a pointer; this is system dependent, but will always be the same on the same system: most likely 8 on yours (if modern 64-bit hardware), possibly 4 (on a typical 32-bit CPU), and other values are possible on less common systems. You'd need to return the number of characters written separately (and no, sizeof(binaryNum) won't be suitable either, as it always gives the size of the whole buffer, not the number of digits actually stored).
You probably want this:
...
int main()
{
int decimal = 2000;
int *p = decToBinary(decimal);
for (int i = 0; i< 16; i++)
{
printf("%d", p[i]);
}
return 0;
}
The output goes to the terminal instead of into a file.
For writing into a file use fopen as in your code, and use fprintf instead of printf.
Concerning the decToBinary there is still room for improvement, especially you could transform the number directly into an array of char containing only chars 0 and 1 using the << and & operators.
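One way that suggestion could look, as a sketch: a fixed-width version that tests each bit with a mask built by <<. The name to_binary16 and the 16-digit width (matching the 16-element array in the question) are assumptions.

```c
/* Hypothetical helper: writes the low 16 bits of n as '0'/'1' characters,
   most significant bit first. out must have room for 17 chars. */
void to_binary16(unsigned int n, char out[17])
{
    for (int bit = 15; bit >= 0; bit--)
        out[15 - bit] = (n & (1u << bit)) ? '1' : '0';
    out[16] = '\0';
}
```

Unlike the division loop, this always produces exactly 16 characters, leading zeros included, and any bits above the low 16 are ignored.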

How to make my Hexadecimal spit out 8 digits(including leading zeros)

So, I wrote a function that converts a decimal number into a hexadecimal number using recursion, but I can't seem to figure out how to add the prefix "0x" and leading zeros to my converted hexadecimal number. Let's say I pass the number 18 to my function. The equivalent hexadecimal number should be 0x00000012. However, I only end up getting 12 as my hexadecimal number. The same applies when I pass in the hexadecimal number 0xFEEDDAD: I end up getting only FEEDDAD, without the prefix, as my answer. Can someone please help me figure this out? I've listed my code below. Also, I'm only allowed to use fputc to display my output.
const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
void hexout (unsigned long number, FILE * stream)
{
long quotient;
long remainder;
quotient = number / 16;
remainder = number % 16;
if(quotient != 0)
hexout(quotient,stream);
fputc(digits[remainder],stream);
}
void hexout (unsigned long number, FILE * stream)
{
fprintf(stream, "0x%08lX", number);
}
If you cannot use fprintf (or sprintf), you can use this kind of code (no recursion, but an 8-char array on the stack):
const char digits[] = "0123456789ABCDEF";
void hexout(unsigned long number, FILE * stream)
{
unsigned long int input = number;
unsigned long int quotient;
unsigned long int remainder;
unsigned short ndigit = 0;
char result[8] = {0};
// Compute digits
do
{
quotient = input / 16;
remainder = input % 16;
result[7-ndigit] = digits[remainder];
input = quotient;
ndigit++;
}
while (ndigit < 8);
// Display result
fputc('0', stream);
fputc('x', stream);
for (ndigit = 0; ndigit < 8; ndigit++)
{
fputc(result[ndigit], stream);
}
}
Of course, this can be improved a lot...
Add digits to a string, and print out string with zero-padding using fprintf. Or just use fprintf to begin with.
Your own hexout fails for obvious reasons. You cannot 'continue' to output a number of zeroes when the value reaches 0, because you don't know how many digits you have already emitted. Also, you don't know when to prepend "0x" -- it should be before you start to emit hex digits, but how can you know you are at the start?
The logical way¹ to do this is to not use recursion, but a simple loop instead. Then again -- unsaid, but a fair bet this is a homework assignment, and in that case any number of silly constraints are possible ("write a C program without using the character '{'" comes to mind). In your case it's "you must use recursion".
You must add a counter to your recursive function; when it reaches 0, you know you have to output 0x, and if it's not 0 you need to output a hex digit, irrespective of whether your value is 0 or not. There are a couple of ways of adding a counter to a recursive function: a global variable (which would be the easiest and utterly ugliest way, so please don't stop reading here), a static variable -- only semantically better than a global --, or a pass-by-reference argument (which some say is a myth, but then again the end result is the same).
Which method is best for you depends on how well you can defend why you used that method.
¹ So is printf("0x%08X") an "illogical" solution? Yes. It solves the problem but without any further insights. The purpose of this assignment is not to find out the existence of printf and its parameters, it's to learn how (and why) to use recursion.
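As a sketch of the counter idea, here is a fourth option not listed above: the counter travels as an extra by-value parameter of a recursive helper, which avoids both globals and pointers. The helper name hexout_digits is made up, and only fputc is used for output.

```c
#include <stdio.h>

static const char hex_digits[] = "0123456789ABCDEF";

/* Hypothetical helper: emits exactly `remaining` hex digits of number.
   The prefix is written at the deepest call, before any digit. */
static void hexout_digits(unsigned long number, int remaining, FILE *stream)
{
    if (remaining == 0) {
        fputc('0', stream);
        fputc('x', stream);
        return;
    }
    hexout_digits(number / 16, remaining - 1, stream);  /* higher digits first */
    fputc(hex_digits[number % 16], stream);
}

void hexout(unsigned long number, FILE *stream)
{
    hexout_digits(number, 8, stream);
}
```

The recursion always goes exactly 8 levels deep regardless of the value, which is what produces the leading zeros. Values that need more than 8 hex digits are cut down to their low 8 digits in this sketch.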

Getting the number of digits of an integer

bool isValidId(int* id)
{
if(log10(*id) != 6)
{
return false;
}
return true;
}
printf("Enter ID: ");
gets(input);
c.id = atoi(input);
validID= isValidId(c.id);
if(!validID)
{
printf("Invalid ID format -(Use example 123456 format). \n");
}
This is how it looks now. I ask the user to enter an ID and check whether it is valid with the isValidId function, but my program crashes when I enter an ID. Please help! Thanks
return *id >= 100000 && *id < 1000000;
I think this may be a good solution, both easy to read and efficient.
There is no need to compute the length if you just want to check whether it is a valid ID.
Program crashes because the parameter of isValidId is pointer to int, not int, so
validID = isValidId(c.id);
should be
validID = isValidId(&c.id);
First of all, I don't see any reason to pass a pointer to isValidId function. You can pass an integer and calculate the number of digits.
bool isValidId(int id) {
// count digits here
}
Now there are at least two ways to calculate the number of digits. The first is to use log10. The number of digits in a base-10 integer n is (int)(log10(n) + 1). You will need to include math.h to use log10. You should check whether n <= 0 before calling log10, because the logarithm is not defined for those values.
The second way is to loop through n.
int count = 0;
while (n > 0) {
count++;
n /= 10;
}
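Wrapped into functions, the loop approach might look like the sketch below. The helper name count_digits is made up, and a do-while is used so that 0 counts as one digit instead of zero.

```c
#include <stdbool.h>

/* Counts decimal digits of a non-negative integer; 0 counts as one digit. */
int count_digits(int n)
{
    int count = 0;
    do {
        count++;
        n /= 10;
    } while (n > 0);
    return count;
}

/* An ID is valid here if it is a positive number with exactly six digits. */
bool isValidId(int id)
{
    return id > 0 && count_digits(id) == 6;
}
```

This needs no floating point and no math.h, so it sidesteps any rounding concerns with log10.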
You've declared isValidId to take a pointer to an int, but you're passing it a plain int; in this case, there's no reason to use a pointer, so you'd be better off changing isValidId to use a regular int.
NEVER NEVER NEVER NEVER NEVER USE GETS -- IT WILL INTRODUCE A POINT OF FAILURE/MAJOR SECURITY HOLE IN YOUR CODE. Use fgets(input, sizeof input, stdin) instead.
How is input declared? Is it large enough to hold as many digits as int will allow, plus a sign, plus a 0 terminator?
log10 returns a double, not an int. To properly count digits with log10, you will need to write something like (int)floor(log10(id)) + 1.
You can simplify your isValidId function a little:
bool isValidId(int id)
{
return (int) floor(log10(id)) + 1 == 6;
}
The Boolean data type is a latecomer to the C language (introduced in C99), so a lot of us older types tend to avoid using Boolean constants in our code.
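Putting the points above together, a rough sketch of reading the ID safely could look like this. The function name read_id and the buffer size of 32 are assumptions; strtol is used in place of atoi so that non-numeric input can be rejected.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from `in` and accepts it only if it
   is a number in the six-digit range. */
bool read_id(FILE *in, int *id)
{
    char input[32];
    char *end;
    long v;

    if (fgets(input, sizeof input, in) == NULL)
        return false;                       /* EOF or read error */
    v = strtol(input, &end, 10);
    if (end == input || (*end != '\n' && *end != '\0'))
        return false;                       /* not a number, or trailing junk */
    if (v < 100000 || v > 999999)
        return false;                       /* not six digits */
    *id = (int)v;
    return true;
}
```

Called as read_id(stdin, &c.id), it replaces the gets/atoi pair from the question. The range check does the job of the digit count, so log10 is not needed at all.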
I've not done C for eons however try something like this
bool isValidId(int* id)
{
char str[15];
sprintf(str, "%d", *id);
if(strlen(str) != 6)
{
return false;
}
return true;
}
It's way easier like this:
#include <math.h>
bool isValidId(int *id)
{
return (int)log10(*id) == 5; // a 6-digit number lies in [10^5, 10^6)
}

How to manually convert decimal value to hexadecimal string in C?

n.b. I know that this question has been asked on StackOverflow before in a variety of different ways and circumstances, but the search for the answer I seek doesn't quite help my specific case. So while this initially looks like a duplicate of a question such as How can I convert an integer to a hexadecimal string in C? the answers given, are accurate, but not useful to me.
My question is how to convert a decimal integer into a hexadecimal string, manually. I know there are some neat tricks with stdlib.h and printf, but this is a college task, and I need to do it manually (professor's orders). We are, however, permitted to seek help.
I am using the good old "divide by 16, convert the remainder to hex, and reverse the values" method of obtaining the hex string, but there must be a big bug in my code, as it is not giving me back, for example, "BC" for the decimal value "188".
It is assumed that the algorithm will NEVER need to find hex values for decimals larger than 255 (or FF). While the passing of parameters may not be optimal or desirable, it's what we've been told to use (although I am allowed to modify the getHexValue function, since I wrote that one myself).
This is what I have so far:
/* Function to get the hex character for a decimal (value) between
* 0 and 16. Invalid values are returned as -1.
*/
char getHexValue(int value)
{
if (value < 0) return -1;
if (value > 16) return -1;
if (value <= 9) return (char)value;
value -= 10;
return (char)('A' + value);
}
/* Function asciiToHexadecimal() converts a given character (inputChar) to
* its hexadecimal (base 16) equivalent, stored as a string of
* hexadecimal digits in hexString. This function will be used in menu
* option 1.
*/
void asciiToHexadecimal(char inputChar, char *hexString)
{
int i = 0;
int remainders[2];
int result = (int)inputChar;
while (result) {
remainders[i++] = result % 16;
result /= (int)16;
}
int j = 0;
for (i = 2; i >= 0; --i) {
char c = getHexValue(remainders[i]);
*(hexString + (j++)) = c;
}
}
The char *hexString is the pointer to the string of characters which I need to output to the screen (eventually). The char inputChar parameter is the value that I need to convert to hex (which is why I never need to convert values over 255).
If there is a better way to do this, which still uses the void asciiToHexadecimal(char inputChar, char *hexString) function, I am all ears, other than that, my debugging seems to indicate the values are ok, but the output comes out like \377 instead of the expected hexadecimal alphanumeric representation.
Sorry if there are any terminology or other problems with the question itself (or with the code), I am still very new to the world of C.
Update:
It just occurred to me that it might be relevant to post the way I am displaying the value in case its the printing, and not the conversion which is faulty. Here it is:
char* binaryString = (char*) malloc(8);
char* hexString = (char*) malloc(2);
asciiToBinary(*(asciiString + i), binaryString);
asciiToHexadecimal(*(asciiString + i), hexString);
printf("%6c%13s%9s\n", *(asciiString + i), binaryString, hexString);
(Everything in this code snippet works except for hexString)
char getHexValue(int value)
{
if (value < 0) return -1;
if (value > 16) return -1;
if (value <= 9) return (char)value;
value -= 10;
return (char)('A' + value);
}
You might wish to print out the characters you get from calling this routine for every value you're interested in. :) (printf(3) format %c.)
When you call getHexValue() with a number between 0 and 9, you return a number between 0 and 9, in the ASCII control-character range. When you call getHexValue() with a number between 10 and 15, you return a number between 65 and 70, in the ASCII letter range.
The sermon? Unit testing can save you hours of time if you write the tests about the same time you write the code.
Some people love writing the tests first. While I've never had the discipline to stick to this approach for long, knowing that you have to write tests will force you to write code that is easier to test. And code that is easier to test is less coupled (or 'more decoupled'), which usually leads to fewer bugs!
Write tests early and often. :)
Update: After you included your output code, I had to comment on this too :)
char* binaryString = (char*) malloc(8);
char* hexString = (char*) malloc(2);
asciiToBinary(*(asciiString + i), binaryString);
asciiToHexadecimal(*(asciiString + i), hexString);
printf("%6c%13s%9s\n", *(asciiString + i), binaryString, hexString);
hexString has been allocated one byte too small to be a C-string -- you forgot to leave room for the ASCII NUL '\0' character. If you were printing hexString by the %c format specifier, or building a larger string by using memcpy(3), it might be fine, but your printf() call is treating hexString as a string.
In general, when you see a
char *foo = malloc(N);
call, be afraid -- the C idiom is
char *foo = malloc(N+1);
That +1 is your signal to others (and yourself, in two months) that you've left space for the NUL. If you hide that +1 in another calculation, you're missing an opportunity to memorize a pattern that can catch these bugs every time you read code. (Honestly, I found one of these through this exact pattern on SO just two days ago. :)
Is the target purely hexadecimal, or should the function be parameterizable? If it's constrained to hex, why not exploit the fact that a single hex digit encodes exactly four bits?
This is how I'd do it:
#include <stdio.h> /* puts, putchar */
#include <stdlib.h>
#include <limits.h> /* implementation's CHAR_BIT */
#define INT_HEXSTRING_LENGTH (sizeof(int)*CHAR_BIT/4)
/* We define this helper array in case we run on an architecture
with some crude, discontinuous charset -- THEY EXIST! */
static char const HEXDIGITS[0x10] =
{'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};
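void int_to_hexstring(int value, char result[INT_HEXSTRING_LENGTH+1])
{
int i;
/* work on an unsigned copy: >> then shifts in zeros, so the loop
also terminates for negative input */
unsigned int bits = (unsigned int)value;
result[INT_HEXSTRING_LENGTH] = '\0';
for(i=INT_HEXSTRING_LENGTH-1; bits; i--, bits >>= 4) {
unsigned int d = bits & 0xf;
result[i] = HEXDIGITS[d];
}
for(;i>=0;i--){ result[i] = '0'; }
}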
int main(int argc, char *argv[])
{
char buf[INT_HEXSTRING_LENGTH+1];
if(argc < 2)
return -1;
int_to_hexstring(atoi(argv[1]), buf);
puts(buf);
putchar('\n');
return 0;
}
I made a library for hexadecimal/decimal conversion that does not use stdio.h. It is very simple to use:
char* dechex (int dec);
This uses calloc() to return a pointer to a hexadecimal string, so only as much memory as needed is allocated; don't forget to call free() on the result.
Here is the link on GitHub: https://github.com/kevmuret/libhex/
You're very close - make the following two small changes and it will be working well enough for you to finish it off:
(1) change:
if (value <= 9) return (char)value;
to:
if (value <= 9) return '0' + value;
(you need to convert the 0..9 value to a char, not just cast it).
(2) change:
void asciiToHexadecimal(char inputChar, char *hexString)
to:
void asciiToHexadecimal(unsigned char inputChar, char *hexString)
(inputChar was being treated as signed, which gave undesirable results with %).
A couple of tips:
have getHexValue return '?' rather than -1 for invalid input (make debugging easier)
write a test harness for debugging, e.g.
int main(void)
{
char hexString[256];
asciiToHexadecimal(166, hexString);
printf("hexString = %s = %#x %#x %#x ...\n", hexString, hexString[0], hexString[1], hexString[2]);
return 0;
}
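For reference, a sketch of where those changes can lead. It goes a bit further than the two fixes above: since the input is a single byte, the remainder loop is replaced by two direct nibble lookups, the string is NUL-terminated, and the range check uses 15 instead of 16. It assumes an ASCII-compatible character set, where 'A'..'F' are consecutive.

```c
/* Returns the hex digit character for a value in 0..15, or '?' otherwise. */
char getHexValue(int value)
{
    if (value < 0 || value > 15) return '?';
    if (value <= 9) return (char)('0' + value);
    return (char)('A' + (value - 10));
}

/* Writes exactly two hex digits plus a terminating NUL, so hexString
   must have room for 3 chars. */
void asciiToHexadecimal(unsigned char inputChar, char *hexString)
{
    hexString[0] = getHexValue(inputChar / 16);  /* high nibble */
    hexString[1] = getHexValue(inputChar % 16);  /* low nibble */
    hexString[2] = '\0';
}
```

With this version the caller has to allocate at least 3 bytes for hexString (malloc(2 + 1) in the question's code), and 188 comes out as "BC".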
#include<stdio.h>
#include<string.h>
char* inttohex(int);
int main()
{
int i;
char *c;
printf("Enter the no.\n");
scanf("%d",&i);
c=inttohex(i);
printf("c=%s",c);
}
char* inttohex(int i)
{
int l1,l2,j=0,n;
static char a[100],t;
while(i!=0)
{
l1=i%16;
if(l1>=10)
{
a[j]=l1-10+'A';
}
else
sprintf(a+j,"%d",l1);
i=i/16;
j++;
}
n=strlen(a);
for(i=0;i<n/2;i++)
{
t=a[i];
a[i]=a[n-i-1];
a[n-i-1]=t;
}
//printf("string:%s",a);
return a;
//
}
To complement the other good answers:
If the numbers represented by these hexadecimal or decimal character strings are huge (e.g. hundreds of digits), they won't fit in a long long (or whatever the largest integral type your C implementation provides is). Then you'll need bignums. I would suggest not coding your own implementation (it is tricky to make an efficient one), but using an existing one such as GMPlib.
