I know my way around Ruby pretty well and am teaching myself C, starting with a few toy programs. This one just calculates the average of a string of numbers I enter as an argument.
#include <stdio.h>
#include <string.h>
main(int argc, char *argv[])
{
char *token;
int sum = 0;
int count = 0;
token = strtok(argv[1],",");
while (token != NULL)
{
count++;
sum += (int)*token;
token = strtok(NULL, ",");
}
printf("Avg: %d", sum/count);
printf("\n");
return 0;
}
The output is:
mike#sleepycat:~/projects/cee$ ./avg 1,1
Avg: 49
Which clearly needs some adjustment.
Any improvements and an explanation would be appreciated.
Look for sscanf or atoi as functions to convert from a string (array of characters) to an integer.
Unlike higher-level languages, C doesn't automatically convert between string and integral/real data types.
49 is the ASCII value of '1' char.
It should be helpful to you....:D
The problem is the character "1" is 49. You have to convert the character value to an integer and then average.
In C if you cast a char to an int you just get the ASCII value of it. So, you're averaging the ascii value of the character 1 twice, and getting what you'd expect.
You probably want to use atoi().
EDIT: Note that this is generally true of all typecasts in C. C doesn't reinterpret values for you, it trusts you to know what exists at a given location.
strtok(
Please, please do not use this. Even its own documentation says never to use it. I don't know how you, as a Ruby programmer, found out about its existence, but please forget about it.
(int)*token
This is not even close to doing what you want. There are two fundamental problems:
1) A char* does not "contain" text. It points at text. token is of type char*; therefore *token is of type char. That is, a single byte, not a string. Note that I said "byte", not "character", because the name char is actually wrong - an understandable oversight on the part of the language designers, because Unicode did not exist back then. Please understand that char is fundamentally a numeric type. There is no real text type in C! Interpreting a sequence of char values as text is just a convention.
2) Casting in C does not perform any kind of magical conversions.
What your code does is to grab the byte that token points at (after the strtok() call), and cast that numeric value to int. The byte that is rendered with the symbol 1 actually has a value of 49. Again, interpreting a sequence of bytes as text is just a convention, and thus interpreting a byte as a character is just a convention - specifically, here we are using the convention known as ASCII. When you hit the 1 key on your keyboard, and later hit enter to run the program, the chain of events set in motion by the command window actually passed a byte with the value 49 to your program. (In the same way, the comma has a value of 44.)
Both of the above problems are solved by using the proper tools to parse the input. Look up sscanf(). However, you don't even want to pass the input to your program this way, because you can't put any spaces in the input - each "word" on the command line will be passed as a separate entry in the argv[] array.
What you should do, in fact, is take advantage of that, by just expecting each entry in argv[] to represent one number. You can again use sscanf() to parse each entry, and it will be much easier.
Finally:
printf("Avg: %d", sum/count)
The quotient sum/count will not give you a decimal result. Dividing an integer by another integer yields an integer in C, discarding the remainder.
In this line: sum += (int)*token;
Casting a char to an int gives the ASCII value of the char. For '1', this value is 49.
Use the atoi function instead:
sum += atoi(token);
Note atoi is found in the stdlib.h file, so you'll need to #include it as well.
You can't convert a string to an integer via
sum += (int)*token;
Instead you have to call a function like atoi():
sum += atoi (token);
When you cast a char (which is what *token is) to int in C, you get its character code, which is 49 for '1' in ASCII. So the average of the two character codes is in fact 49. You need to use atoi to get the value of the number that the string represents.
I have written a program that takes 2 arguments from the user and adds them together so for example if the user puts ./test 12 4 it will print out the sum is: 16.
the part that is confusing me is why do I have to use the atoi and I can't just use argv[1] + argv[2]
I know that atoi is used convert a string to an integer and I found this line of code online which helped me with my program but can someone explain to me why do I need it :
sum = atoi(argv[1])+atoi(argv[2]);
code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int sum = 0;
sum = atoi(argv[1])+atoi(argv[2]);
printf("The sum is : %d \n", sum);
return 0;
}
The reason is the same thing as the difference between you and your name. The user typed "./test 12 4" so before your program ran, the command shell that the user is using prepared two sequences of numbers representing the text characters that form the names of the user's numbers and gave them to your program - the shell got those directly from the terminal where the user typed them.
In order to add the numbers that these sequences identify by name you need to use a function that converts those two names to int - a representation of the numbers that they identify for which addition is defined. This function is called 'atoi' and the two names were the sequences {49,50,0} and {52,0} (being representations of the symbol sequences {'1','2','\0'} and {'4', '\0'}). The 0 (also written '\0') is a code for a special symbol that can't be printed or typed directly (this is a lie but I don't want to get into that) and it's added to the end of the names so that atoi, as it reads the name character code by character code, knows when it reached the end. Note that these particular values depend on what platform you're using but I'm assuming it's a platform that uses ascii or utf-8 rather than something like ebcdic.
As part of printing the resulting number, printf uses the "%d" directive to accept the int way of representing the answer (which you got from adding two ints) and converts it back to the name of the answer as character codes {49,54} ({'1','6'}) ready to send back to the terminal. I left out any possible terminating 0 ('\0') in the output codes just there because printf doesn't indicate an end here unlike the end indications you receive in the inputs from the shell - the terminal can't look for an end in the same way that atoi does and printf doesn't give you the name for further usage within the C program; it just sends the name right out of the program for display to the terminal (which is the place that the command shell hooked up to the program's output stream).
Although atoi isn't the only thing you could do with the incoming names for numbers, the C language has been designed to give the ending marker for each argv element because it will typically be needed for any alternative choices you might make for handling the incoming information.
Try this to see the codes being used explicitly (still assuming your system uses ascii or utf-8):
#include <stdio.h>
#include <stdlib.h>
char const name_of_program[] = "./test";
char const name_of_first_number[] = {49,50,0}; // {'1','2','\0'} would be the same as would "12" - with the quotes
char const name_of_second_number[] = {52,0}; // {'4','\0'} would be the same as would "4" - with the quotes
int main()
{
char const *argv[] = {
name_of_program,
name_of_first_number,
name_of_second_number,
NULL,
};
int sum = 0;
sum = atoi(argv[1])+atoi(argv[2]);
printf("The sum is : %d \n", sum);
return 0;
}
"the part that is confusing me is why do i have to use the "atoi" and i can't just use argv[1] + argv[2]"
The argv argument holds a list of strings that the program can take as input. The C language does not automatically convert strings to numbers, so you have to do that yourself. The atoi function takes a string as a parameter and returns an integer, which can then be used for the arithmetic operations you want.
In other languages such as C++, adding two strings (std::string) usually concatenates them, but in C adding two char * values is a compiler error.
Your argv[i] is type C string by default:
int main(int argc, char *argv[])
and sum is type int.
Even if the user types a number, your program receives it as a char * (a string). atoi() converts that string to an int so you can do arithmetic with it.
[Answer updated thanks to comments below]
As said, I'm using C and fscanf for this task, but it makes the program crash every time, so I must be doing something wrong. I haven't dealt much with this kind of input, and even after reading several topics here I still can't find the right way. I have this array to read the 2 bytes:
char p[2];
and this line to read them. fopen was called earlier with file pointer fp; I used "rb" as the read mode but tried other options too when I noticed the crash. I'm leaving that part out to save space and focus on the problem itself.
fscanf(fp,"%x%x",p[0],p[1]);
Later, to convert into decimal, I have this line (if we have not reached EOF):
v = strtol(p, 0, 10);
v is just an integer to store the final value we are seeking. The program keeps crashing when fscanf is called, or at least I think that's the case. I'm not building a console program, so it's a pity that I can't print what has and hasn't been done, but in the debugger it seems to crash there.
I hope you can help me out with this. I'm a bit lost regarding this type of read/conversion, so any clue will help me greatly. Thanks =).
PS: I forgot to add that this is not homework. A friend wants to do some file conversion for a game and this code will manipulate the needed files on its own, so while I could use any language or environment for this, I always feel more comfortable in C.
char strings in C are really called null-terminated byte strings. That null-terminated part is important, as it means a string of two characters needs space for three characters to include the null-terminator character '\0'. Not having the terminator means string functions will go out of bounds in their search for it, leading to undefined behavior.
Furthermore, the "%x" format reads a hexadecimal integer number and stores it in an unsigned int. Mismatching format specifiers and arguments leads to undefined behavior.
Lastly, and probably what's causing the crash: the scanf family of functions expects pointers as their arguments. Not providing pointers will again lead to undefined behavior.
There are two solutions to the above problems:
Going with code similar to what you already use, first of all you must make space for the terminator in the array. Then you need to read two characters. Lastly you need to add the terminator:
char p[3] = { 0 }; // String for two characters, initialized to zero
// The initialization means that we don't need to explicitly add the terminator
// Read two characters, skipping possible leading white-space
fscanf(fp," %c%c",&p[0],&p[1]);
// Now convert the string to an integer value
// The string is in base-16 (two hexadecimal characters)
v = strtol(p, 0, 16);
Read the hexadecimal value into an integer directly:
unsigned int v;
fscanf(fp, "%2x", &v); // Read as hexadecimal
The second alternative is what I would recommend. It reads two characters and parses it as a hexadecimal value, and stores the result into the variable v. It's important to note that the value in v is stored in binary! Hexadecimal, decimal or octal are just presentation formats, internally in the computer it will still be stored in binary ones and zeros (which is true for the first alternative as well). To print it as decimal use e.g.
printf("%u\n", v);
You need to pass to fscanf() the address of the variable(s) to scan into.
Also, the conversion specifier needs to suit the variable provided. In your case those are chars. x expects an unsigned int; to scan into a char, use the appropriate length modifiers, two times h here:
fscanf(fp, "%hhx%hhx", &p[0], &p[1]);
strtol() expects a C-string as 1st parameter.
What you pass isn't a C-string, as a C-string ought to be 0-terminated, which p isn't.
To fix this you could do the following:
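char p[3];
fscanf(fp, " %c%c", &p[0], &p[1]); // read the two digit characters, passing their addresses
p[2] = '\0'; // terminate so that strtol() sees a proper C-string
long v = strtol(p, 0, 16); // base 16, since the two characters are hexadecimal digits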
I'm still new to the forum so I apologize in advance for forum - etiquette issues.
I'm having trouble understanding the differences between int arrays and char arrays.
I recently wrote a program for a Project Euler problem that originally used a char array to store a string of numbers, and later called specific characters and tried to use int operations on them to find a product. When I used a char string I got a ridiculously large product, clearly incorrect. Even if I converted what I thought would be compiled as a character (str[n]) to an integer in-line ((int)str[n]) it did the exact same thing. Only when I actually used an integer array did it work.
Code is as follows
for the char string
char str[21] = "73167176531330624919";
This did not work. I got an answer of about 1.5 trillion for an answer that should have been about 40k.
for the int array
int str[] = {7,3,1,6,7,1,7,6,5,3,1,3,3,0,6,2,4,9,1,9};
This is what did work. I took off the in-line type casting too.
Any explanation as to why these things worked/did not work and anything that can lead to a better understanding of these ideas will be appreciated. Links to helpful stuff are as well. I have researched strings and arrays and pointers plenty on my own (I'm self taught as I'm in high school) but the concepts are still confusing.
Side question, are strings in C automatically stored as arrays or is it just possible to do so?
To elaborate on WhozCraig's answer, the trouble you are having does not have to do with strings, but with the individual characters.
Strings in C are stored by and large as arrays of characters (with the caveat that there exists a null terminator at the end).
The characters themselves are encoded in a system called ASCII, which assigns codes from 0 to 127 to the characters used in the English language (only). Thus '7' is not stored as 7 but as the ASCII code of '7', which is 55.
I think now you can see why your product got so large.
One elegant way to fix would be to convert
int num = (int) str[n];
to
int num = str[n] - '0';
//thanks for fixing, ' ' is used for characters, " " is used for strings
This solution subtracts the ascii code for 0 from the ascii code for your character, say "7". Since the numbers are encoded linearly, this will work (for single digit numbers). For larger numbers, you should use atoi or strtol from stdlib.h
Strings are just character arrays with a null terminating byte.
There is no separate string data type in c.
When using a char as an integer, the numeric ascii value is used. For example, saying something like printf("%d\n", (int)'a'); will result in 97 (the ascii value of 'a') being printed.
You cannot do numeric calculations directly on a string of digits; you first have to convert each digit character to its integer value (for example, into an integer array). To convert a digit as a character into its integer form, you can do something like this:
char a = '2';
int a_num = a - '0';
//a_num now stores integer 2
This causes the ascii value of '0' (48) to be subtracted from ascii value '2' (50), finally leaving 2.
char str[21] = "73167176531330624919";
This code is equivalent to
char str[21] = {'7','3','1','6','7','1','7','6','5','3',
'1','3','3','0','6','2','4','9','1','9','\0'};
So what is stored in str is not the numbers themselves but chars, whose ASCII codes differ from the digits they represent.
Answer to the side question: yes and no. Strings are automatically stored as char arrays, but a string has an extra character ('\0') as its last element, which a plain char array need not have.
Can someone explain how to convert a string of decimal values from ASCII table to its character 'representation' in C ? For example: user input could be 097 and the function would print 'a' on the screen, but also user could type in '097100101' and the function would have to print 'ade' etc. I have written something clunky that does the opposite operation:
char word[30];
scanf("%s", word);
while(word[i]!=0)
{
if(word[i]<'d')
printf("0%d", (int)word[i]);
if(word[i]>='d')
printf("%d", (int)word[i]);
i++;
}
but it works. Now I want to have function that works in a similar way but of course does decimal > char conversion. The point is, I cannot use any functions like 'atoi' or something like that (not sure about names, never used them ;)).
You can use this function instead of atoi:
char a3toc(const char *ptr)
{
return (ptr[0]-'0')*100 + (ptr[1]-'0')*10 + (ptr[2]-'0');
}
So, a3toc("102") will return the same thing as (char) 102, which is an 'f'.
If you don't see why, substitute in the values: ptr[0] is '1', so the first part becomes ('1'-'0')*100 or 1*100 or 100, which is what that first 1 in 102 represents.
Tokenize the input string. I'm assuming you are forcing that every letter MUST be represented in 3 characters. So break the string that way. And simply use explicit type casting to get the desired character.
I don't think I should be giving you the code for this, since it is pretty easy and seems more like a Homework question.
I just thought I'd ask this question to see whether it can actually be done.
If I want to store a number like "00000000000001", what would be the best way?
Bear in mind that this also has to be incremented on a regular basis.
I'm thinking either there is a way to do this with an integer, or I have to convert to a char array somewhere along the line. That would be fine, but it's a pain to try to increment a string.
I would store it as an integer and only convert to the formatted version with leading zeros on demand when you need to produce output, for example with printf, sprintf etc.
It's far easier that way than storing a string and trying to perform arithmetic on strings. Not least because you have extra formatting requirements about your strings.
If for some reason it is awkward to store an integer as your master data do it like this.
Store the string as your master data.
Whenever you need to perform arithmetic, convert from string to integer.
When the arithmetic is complete, convert back to string and store.
You should simply store the number using an appropriate type (say, unsigned int), so that doing operations like 'increment by one' are easy - only bother worrying about leading zeros when displaying the number.
sprintf can actually do this for you:
unsigned int i = 1;
char buffer[64];
sprintf( buffer, "%014u", i );
This stores '00000000000001' in buffer, ready to be printed.
You could store it in a integer variable (provided there's an integer type that's wide enough for your needs). When printing, simply format the number to have the correct number of leading zeros.
#include <stdio.h> // for snprintf() and printf() calls
int main(void) {
int num = 123;
char buf[16];
// convert 123 to string [buf]; itoa() is not part of standard C, so snprintf() is used here
snprintf(buf, sizeof buf, "%d", num);
// print our string
printf("%s\n", buf);
return 0;
}