String parsing in C - c

How would you parse the string "1234567" into individual numbers?

char mystring[] = "1234567";
Each digit is going to be mystring[n] - '0'.

What Delan said. Also, relying on ASCII-dependent tricks is probably bad practice for maintainability. Try using this one from the standard library:
int atoi ( const char * str );
EDIT: A much better idea (it has been pointed out to me that the one above is a slow way to do it): add a function like this:
int ASCIIdigitToInt(char c){
return (int) c - '0';
}
and iterate this along your string.

Don't forget that a string in C is actually an array of type 'char'. You could walk through the array, and grab each individual character by array index and subtract from that character's ascii value the ascii value of '0' (which can be represented by '0').
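As a rough sketch of the walk described above (the helper name split_digits and its parameters are made up for illustration, not from any library), something like this would copy the value of each digit into an int array:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: stores the numeric value of each leading digit
   of s in out[] and returns how many digits were stored (at most max). */
size_t split_digits(const char *s, int *out, size_t max)
{
    size_t n = 0;
    /* Stop at the first non-digit, which includes the terminating '\0'. */
    while (n < max && isdigit((unsigned char)s[n])) {
        out[n] = s[n] - '0';   /* '0'..'9' have consecutive codes in C */
        n++;
    }
    return n;
}
```

Called with "1234567" and an array of at least 7 ints, it should fill the array with 1, 2, 3, 4, 5, 6, 7 and return 7.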

Related

LOGIC of Converting a "char" into an "int" in C

Hi everyone!
Please help me understand the following problem.
So I will have a string input of a note that looks like "A5" or "G#2" or "Cb4", etc. I need to extract the octave index, which is the last digit ("5", "2" or "4"), and after extraction I need it as an int.
So I did this:
string note = get_string("Input: ");
int octave = atoi(note[strlen(note) - 1]);
printf("your octave is %i \n", octave);
But it gave me an error "error: incompatible integer to pointer conversion passing 'char' to parameter of type 'const char *'; take the address with & [-Werror,-Wint-conversion]"
Then I tried to move the math out of the function call, and did this:
int extnum = strlen(note) - 1;
int octave = atoi(note[extnum]);
It didn't work either. So I did my research on the atoi function and I don't get it:
atoi expects a string (CHECK)
It converts it to an integer, not the ASCII meaning (CHECK)
Library for the atoi function (CHECK)
What I am basically asking is "take the n-th character of that string and make it an int".
After googling for some time I found another code example where someone uses atoi with the '&' symbol. So I did this:
int octave = atoi(&note[strlen(note) - 1]);
And IT WORKED! But I can't understand WHY it worked with the & symbol and didn't work without it, because it always worked without it! A million times I passed a single-character string like '5' and just used atoi, and it worked perfectly.
Please help me: why does it act so weird in this case?
C does not have a native string type. Strings are usually represented as a char array or a pointer to char.
I am assuming that string is just a typedef for char *.
If note is an array of chars, note[strlen(note)-1] is just the last character. Since atoi expects a pointer to char (pointing at a null-terminated sequence), you have to pass the address of the char and not its value.
Converting a single digit character to int can also be done more simply:
int octave = note[strlen(note) - 1] - '0';
The function atoi takes a pointer to a character array as the input parameter (const char*). When you call note[strlen(note) - 1] this is a single character (char), in order to make atoi work you need to provide the pointer. You do that by adding & as you've done. This then works, because right after that single digit there is a null character \0 that terminates the string - because your original string was null-terminated.
Note however that doing something like this would not be a good idea:
char a = '7';
int b = atoi(&a);
as there is no way to be sure what the next byte in memory is (following the byte that belongs to a), but the function will try to read it anyway, which can lead to undefined behaviour.
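To make the difference concrete, here is a small sketch; the function names octave_of and digit_via_atoi are invented for this example. The first relies on the terminator that the original string already has, the second supplies its own terminator so that atoi never reads past the single character:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of the last character of a note such as "G#2".
   Safe because the '\0' of note follows directly after that character.
   Returns -1 for an empty string. */
int octave_of(const char *note)
{
    size_t len = strlen(note);
    if (len == 0)
        return -1;
    return atoi(&note[len - 1]);
}

/* Hypothetical helper: converts a lone char through atoi by first
   copying it into a two-element buffer with an explicit terminator. */
int digit_via_atoi(char c)
{
    char buf[2] = { c, '\0' };
    return atoi(buf);
}
```

In practice note[len - 1] - '0' does the same job for a single digit without any of this.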
The last character is, well, a character, not a string! By adding the & sign, you turned it into a pointer to a character (char *).
You can also try this code:
char a = '5';
int b = a - '0';
This gives you the ASCII code of '5' minus the ASCII code of '0', which is the integer 5.
From the manual pages, the signature of atoi is: int atoi(const char *nptr);. So, you need to pass the address of a char.
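When you do this: atoi(note[strlen(note) - 1]) you pass the char itself, not an address. That is the type mismatch the compiler reports; if the call were forced through anyway, the character's value would be treated as an address, which is undefined behaviour.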
When you use the &, you are passing what the function expects - the address. Hence, that works.
atoi expects a string (a pointer to character), not a single character.
However, you should never use atoi since that function has bad error handling. The function strtol performs the same conversion (returning a long) but lets you detect errors.
You need to do this in two steps:
1) Find the first digit in the string.
2) From there, convert it to an integer by calling strtol.
Step 1) can be solved by looping through the string, checking whether each item is a digit by calling isdigit from ctype.h. Simultaneously, check for the end of the string, the null terminator \0. When you find the first digit, save a pointer to that address.
Step 2) is solved by passing the saved pointer to strtol, such as result = strtol(pointer, NULL, 10);.
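The two steps could be sketched roughly like this. The function name first_number and its return convention are assumptions made for the example, and unlike the call above it passes an end pointer and checks errno so that failures can be reported:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: finds the first run of digits in s and stores its
   value in *out. Returns 1 on success, 0 if there is no digit or the
   value does not fit in a long. */
int first_number(const char *s, long *out)
{
    /* Step 1: skip ahead to the first digit or to the terminator. */
    while (*s != '\0' && !isdigit((unsigned char)*s))
        s++;
    if (*s == '\0')
        return 0;

    /* Step 2: let strtol convert from that position. */
    char *end;
    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s || errno == ERANGE)
        return 0;

    *out = value;
    return 1;
}
```

For a note like "G#2" this should store 2; for a string without any digit, such as "Cb", it returns 0 instead of silently producing a number.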

Differences between int/char arrays/strings

I'm still new to the forum so I apologize in advance for forum - etiquette issues.
I'm having trouble understanding the differences between int arrays and char arrays.
I recently wrote a program for a Project Euler problem that originally used a char array to store a string of numbers, and later called specific characters and tried to use int operations on them to find a product. When I used a char string I got a ridiculously large product, clearly incorrect. Even if I converted what I thought would be compiled as a character (str[n]) to an integer in-line ((int)str[n]) it did the exact same thing. Only when I actually used an integer array did it work.
Code is as follows
for the char string
char str[21] = "73167176531330624919";
This did not work. I got an answer of about 1.5 trillion for an answer that should have been about 40k.
for the int array
int str[] = {7,3,1,6,7,1,7,6,5,3,1,3,3,0,6,2,4,9,1,9};
This is what did work. I took off the in-line type casting too.
Any explanation as to why these things worked/did not work and anything that can lead to a better understanding of these ideas will be appreciated. Links to helpful stuff are as well. I have researched strings and arrays and pointers plenty on my own (I'm self taught as I'm in high school) but the concepts are still confusing.
Side question, are strings in C automatically stored as arrays or is it just possible to do so?
To elaborate on WhozCraig's answer, the trouble you are having does not have to do with strings, but with the individual characters.
Strings in C are stored by and large as arrays of characters (with the caveat that there exists a null terminator at the end).
The characters themselves are encoded in a system called ASCII, which assigns codes from 0 to 127 to the characters used in English (only). Thus '7' is not stored as 7 but as the ASCII code for '7', which is 55.
I think now you can see why your product got so large.
One elegant way to fix would be to convert
int num = (int) str[n];
to
int num = str[n] - '0';
//thanks for fixing, ' ' is used for characters, " " is used for strings
This solution subtracts the ASCII code for '0' from the ASCII code for your character, say '7'. Since the digits are encoded consecutively, this works (for single-digit numbers). For larger numbers, you should use atoi or strtol from stdlib.h.
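Since the question did not show the loop that computes the product, the following is only a guess at its shape; digit_product and its parameters are hypothetical. It multiplies a run of digits taken straight from the char array, converting each one with - '0':

```c
#include <stddef.h>

/* Hypothetical helper: product of the `count` digits starting at
   str[start]. The caller must make sure that range lies inside the
   string and contains only digit characters. */
long long digit_product(const char *str, size_t start, size_t count)
{
    long long product = 1;
    for (size_t i = 0; i < count; i++)
        product *= str[start + i] - '0';   /* digit value, not char code */
    return product;
}
```

For the first five digits of "73167176531330624919" this gives 7*3*1*6*7 = 882, whereas multiplying the raw character codes 55*51*49*54*55 gives 408211650, which is how the result blows up.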
Strings are just character arrays with a null terminating byte.
There is no separate string data type in c.
When using a char as an integer, the numeric ascii value is used. For example, saying something like printf("%d\n", (int)'a'); will result in 97 (the ascii value of 'a') being printed.
You cannot do numeric calculations directly on a string of digits; each character has to be converted to its numeric value first. To convert a digit character into its integer form, you can do something like this:
char a = '2';
int a_num = a - '0';
//a_num now stores integer 2
This causes the ascii value of '0' (48) to be subtracted from ascii value '2' (50), finally leaving 2.
char str[21] = "73167176531330624919";
This code is equivalent to
char str[21] = {'7','3','1','6','7','1','7','6','5','3',
                '1','3','3','0','6','2','4','9','1','9','\0'};
So what is stored in str is not the numbers but the characters (whose ASCII codes are different values).
Side question answer: yes and no. Strings are automatically stored as char arrays, but a string has an extra character ('\0') as its last element, whereas a plain char array need not have one.

Read all the subsequent chars and transform into an int

I'm reading a txt file and getting all the chars that aren't space, transforming them to int using (int)c-'0' and that is working.
The problem is if the number has more than 1 digit, because I'm reading char by char.
How can I read a sequence of chars and transform that whole sequence into an int?
I tried using a string, but when I try to pass this string to my other function, it treats each index as a number, but what I need is that the whole string is treated as one number.
Any ideas?
A convenient way to do the conversion is to read the whole number into a buffer (string) and then call atoi. Make triple sure that the string is properly null-terminated.
One solution (I won't say whether it's good or bad in your case, since you don't provide any code) is to do something like this (pseudo-ish code):
int i;
int val = 0;
const char *string = "5238785";
for (i = 0; string[i] != '\0'; i++) {
    /* shift what we have one decimal place left, then add the new digit */
    val = val * 10 + (string[i] - '0');
}
NOTE: I simplified it, and you should add more checks (for example that every character is a digit and that val does not overflow). Make sure the string is null-terminated (\0). The concept is that you read one digit at a time, and just move what you've read so far "one step left" to fit the next digit.
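Applied to the original question (several numbers separated by spaces), the same idea might look like the sketch below. It works on a string rather than a file to keep it short, and parse_numbers is a made-up name; the inner loop would be the same with characters coming from fgetc. It assumes non-negative numbers and does not check for overflow:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: collects each run of consecutive digits in text
   as one int. Returns how many numbers were stored (at most max). */
size_t parse_numbers(const char *text, int *out, size_t max)
{
    size_t n = 0;
    while (*text != '\0' && n < max) {
        if (isdigit((unsigned char)*text)) {
            int val = 0;
            /* Accumulate until the run of digits ends. */
            while (isdigit((unsigned char)*text)) {
                val = val * 10 + (*text - '0');
                text++;
            }
            out[n++] = val;
        } else {
            text++;   /* skip spaces and any other separator */
        }
    }
    return n;
}
```

With the input "12 7  305" this should produce the three values 12, 7 and 305 rather than seven separate digits.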

Decimal string to character ASCII conversion - C

Can someone explain how to convert a string of decimal values from ASCII table to its character 'representation' in C ? For example: user input could be 097 and the function would print 'a' on the screen, but also user could type in '097100101' and the function would have to print 'ade' etc. I have written something clunky that does the opposite operation:
char word[30];
int i = 0;
scanf("%29s", word);
while (word[i] != 0)
{
    if (word[i] < 'd')
        printf("0%d", (int)word[i]);
    if (word[i] >= 'd')
        printf("%d", (int)word[i]);
    i++;
}
but it works. Now I want to have function that works in a similar way but of course does decimal > char conversion. The point is, I cannot use any functions like 'atoi' or something like that (not sure about names, never used them ;)).
You can use this function instead of atoi:
char a3toc(const char *ptr)
{
    /* hundreds, tens and units digit of a three-character code */
    return (ptr[0]-'0')*100 + (ptr[1]-'0')*10 + (ptr[2]-'0');
}
So, a3toc("102") will return the same thing as (char) 102, which is an 'f'.
If you don't see why, substitute in the values: ptr[0] is '1', so the first part becomes ('1'-'0')*100 or 1*100 or 100, which is what that first 1 in 102 represents.
Tokenize the input string. I'm assuming you are forcing that every letter MUST be represented in 3 characters. So break the string that way. And simply use explicit type casting to get the desired character.
I don't think I should be giving you the code for this, since it is pretty easy and seems more like a Homework question.
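For what it's worth, the tokenizing approach could be sketched as follows, assuming (as above) that every character is written as exactly three decimal digits. The name decode_triplets is invented, and nothing from stdlib.h such as atoi is used:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: decodes groups of three digits ("097100101")
   into characters ("ade"). Stops at the first incomplete or non-digit
   group, always terminates out, and returns the number of characters
   written (not counting the terminator). */
size_t decode_triplets(const char *in, char *out, size_t outsize)
{
    size_t n = 0;
    if (outsize == 0)
        return 0;
    /* && short-circuits, so in[1] and in[2] are only read when the
       preceding characters were digits and thus not the terminator. */
    while (n + 1 < outsize &&
           isdigit((unsigned char)in[0]) &&
           isdigit((unsigned char)in[1]) &&
           isdigit((unsigned char)in[2])) {
        int code = (in[0] - '0') * 100 + (in[1] - '0') * 10 + (in[2] - '0');
        out[n++] = (char)code;   /* codes above 127 are not ASCII */
        in += 3;
    }
    out[n] = '\0';
    return n;
}
```

The input "097100101" should then come out as "ade".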

New to C: whats wrong with my program?

I know my way around ruby pretty well and am teaching myself C starting with a few toy programs. This one is just to calculate the average of a string of numbers I enter as an argument.
#include <stdio.h>
#include <string.h>
main(int argc, char *argv[])
{
char *token;
int sum = 0;
int count = 0;
token = strtok(argv[1],",");
while (token != NULL)
{
count++;
sum += (int)*token;
token = strtok(NULL, ",");
}
printf("Avg: %d", sum/count);
printf("\n");
return 0;
}
The output is:
mike#sleepycat:~/projects/cee$ ./avg 1,1
Avg: 49
Which clearly needs some adjustment.
Any improvements and an explanation would be appreciated.
Look for sscanf or atoi as functions to convert from a string (array of characters) to an integer.
Unlike higher-level languages, C doesn't automatically convert between string and integral/real data types.
49 is the ASCII value of '1' char.
It should be helpful to you....:D
The problem is the character "1" is 49. You have to convert the character value to an integer and then average.
In C if you cast a char to an int you just get the ASCII value of it. So, you're averaging the ascii value of the character 1 twice, and getting what you'd expect.
You probably want to use atoi().
EDIT: Note that this is generally true of all typecasts in C. C doesn't reinterpret values for you, it trusts you to know what exists at a given location.
strtok(
Please, please do not use this. Even its own documentation says never to use it. I don't know how you, as a Ruby programmer, found out about its existence, but please forget about it.
(int)*token
This is not even close to doing what you want. There are two fundamental problems:
1) A char* does not "contain" text. It points at text. token is of type char*; therefore *token is of type char. That is, a single byte, not a string. Note that I said "byte", not "character", because the name char is actually wrong - an understandable oversight on the part of the language designers, because Unicode did not exist back then. Please understand that char is fundamentally a numeric type. There is no real text type in C! Interpreting a sequence of char values as text is just a convention.
2) Casting in C does not perform any kind of magical conversions.
What your code does is to grab the byte that token points at (after the strtok() call), and cast that numeric value to int. The byte that is rendered with the symbol 1 actually has a value of 49. Again, interpreting a sequence of bytes as text is just a convention, and thus interpreting a byte as a character is just a convention - specifically, here we are using the convention known as ASCII. When you hit the 1 key on your keyboard, and later hit enter to run the program, the chain of events set in motion by the command window actually passed a byte with the value 49 to your program. (In the same way, the comma has a value of 44.)
Both of the above problems are solved by using the proper tools to parse the input. Look up sscanf(). However, you don't even want to pass the input to your program this way, because you can't put any spaces in the input - each "word" on the command line will be passed as a separate entry in the argv[] array.
What you should do, in fact, is take advantage of that, by just expecting each entry in argv[] to represent one number. You can again use sscanf() to parse each entry, and it will be much easier.
Finally:
printf("Avg: %d", sum/count)
The quotient sum/count will not give you a decimal result. Dividing an integer by another integer yields an integer in C, discarding the remainder.
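Putting those suggestions together, a sketch could look like this. The helper name average_args is made up; it expects one number per string, parses each with sscanf, and divides in floating point so the fractional part is kept:

```c
#include <stdio.h>

/* Hypothetical helper: averages n strings that each hold one decimal
   integer. Returns 1 and stores the result in *avg on success, 0 if n
   is zero or an entry does not start with a number. */
int average_args(int n, char *const args[], double *avg)
{
    long sum = 0;
    int count = 0;
    for (int i = 0; i < n; i++) {
        int value;
        /* sscanf returns the number of items it converted. */
        if (sscanf(args[i], "%d", &value) != 1)
            return 0;
        sum += value;
        count++;
    }
    if (count == 0)
        return 0;
    *avg = (double)sum / count;   /* cast first to avoid integer division */
    return 1;
}
```

From main it could be called as average_args(argc - 1, argv + 1, &avg) and the result printed with printf("Avg: %f\n", avg), so that ./avg 1 2 reports 1.5 rather than 1.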
In this line: sum += (int)*token;
Casting a char to an int takes the ASCII value of the char. for 1, this value is 49.
Use the atoi function instead:
sum += atoi(token);
Note that atoi is declared in stdlib.h, so you'll need to #include that header as well.
You can't convert a string to an integer via
sum += (int)*token;
Instead you have to call a function like atoi():
sum += atoi (token);
When you cast a char (which is what *token is) to int in C, you get its ASCII value, which is 49 here, so the average of the chars' ASCII values is in fact 49. You need to use atoi to get the value of the number represented.

Resources