Working with atoi - c

I have been attacking atoi from several different angles trying to extract ints from a string 1 digit at a time.
Problem 1 - Sizing the array
Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
Problem 2 - atoi output
What am I doing wrong here?
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
int one = 0;
char aChar = fiftyNumbersOne[48];
printf("%c\n",aChar);//outputs 5 (second to last #)
one = atoi(&aChar);
printf("%d\n",one);//outputs what appears to be INT_MAX...I want 5

Problem 1
The array should be length 51. But you can avoid having to manually figure that out by simply doing char fiftyNumbersOne[] = "blahblahblah";.
Problem 2
aChar is not a pointer to the original string; it's just an isolated char floating about in memory somewhere. But atoi(&aChar) is treating it as if it were a pointer to a null-terminated string. It's simply walking through memory until it happens to find a 0 somewhere, and then interpreting everything it's found as a string.
You probably want:
one = aChar - '0';
This relies on the fact that the character values for 0 to 9 are guaranteed to be contiguous.
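The same subtraction works for every digit in the string, which is what the question was originally after. A minimal sketch, assuming a made-up helper name digits_to_ints (it is not from the original answer):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: store the value of each leading decimal digit of
   text in out (at most max values) and return how many were stored. */
size_t digits_to_ints(const char *text, int *out, size_t max)
{
    size_t n = 0;
    while (n < max && isdigit((unsigned char)text[n])) {
        out[n] = text[n] - '0';   /* '0'..'9' are contiguous, so this yields 0..9 */
        n++;
    }
    return n;
}
```

Called as digits_to_ints(fiftyNumbersOne, values, 50) with an int values[50], this should leave values[48] equal to 5.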

51.
That's because aChar is not null-terminated. If you just want to get the integer value of a char, simply use
one = aChar - '0';

Problem 1 - Sizing the array
Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
You always want an array one bigger than what you need to store in it (to account for the null terminator). So your 50 chars should be stored in an array of size 51.
What am I doing wrong here?
Try null-terminating the input you pass to atoi. The documentation says atoi expects a pointer to a string, which is different from the address of a single, non-terminated character. With the code you posted, the behavior is undefined and the result varies between platforms (I get -1 on Ubuntu/gcc).
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
int one = 0;
char aChar = fiftyNumbersOne[48];
char intChar[2];
printf("%c\n",aChar);//outputs 5 (second to last #)
sprintf(intChar, "%c", aChar); //print the char to a null terminated string
one = atoi(intChar); //pass the array itself, not &intChar
printf("%d\n",one);//outputs 5

Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
51, but you can also declare it without size.
char foo[] = "foo";
What am I doing wrong here?
Not reading the documentation for atoi, I guess. &aChar is a char *, so you're passing the right type to atoi, but atoi expects that pointer to refer to a string of characters terminated by the character '\0'. Your "string" isn't terminated.
One solution to this is
char aString[2];
aString[0] = fiftyNumbersOne[48];
aString[1] = '\0';
atoi(aString);
Another is doing fiftyNumbersOne[48] - '0' instead of calling atoi, since the character codes for '0' to '9' are consecutive and increasing. That holds in ASCII, and the C standard guarantees it for any execution character set.
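The first solution can be wrapped in a small function. This is only a sketch; the name atoi_one_char is invented for illustration:

```c
#include <stdlib.h>

/* Hypothetical helper: convert one character with atoi by first copying it
   into a two-element, null-terminated buffer. */
int atoi_one_char(char c)
{
    char tmp[2] = { c, '\0' };   /* now a real string of length 1 */
    return atoi(tmp);            /* atoi yields 0 when c is not a digit */
}
```

Note that a return value of 0 is ambiguous here: it is produced both for '0' and for any non-digit character.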

Related

Why one string writes two char in C? [duplicate]

I'm currently learning C and I'm confused with differences between char array and string, as well as how they work.
Question 1:
Why is there a difference in the outcomes of source code 1 and source code 2?
Source code 1:
#include <stdio.h>
#include <string.h>
int main(void)
{
char c[2]="Hi";
printf("%d\n", strlen(c)); //returns 3 (not 2!?)
return 0;
}
Source code 2:
#include <stdio.h>
#include <string.h>
int main(void)
{
char c[3]="Hi";
printf("%d\n", strlen(c)); //returns 2 (not 3!?)
return 0;
}
Question 2:
How is a string variable different from a char array? How to declare them with the minimum required index numbers allowing \0 to be stored if any (please read the codes below)?
char name[index] = "Mick"; //should index be 4 or 5?
char name[index] = {'M', 'i', 'c', 'k'}; //should index be 4 or 5?
#define name "Mick" //what is the size? Is there a \0?
Question 3:
Does the terminating NUL ONLY follow strings but not char arrays? So the actual value of the string "Hi" is [H][i][\0] and the actual value of the char array "Hi" is [H][i]?
Question 4:
Suppose c[2] is going to store "Hi" followed by a \0 (not sure how this is done, using gets(c) maybe?). So where is the \0 stored? Is it stored "somewhere" after c[2] to become [H][i]\0 or will c[2] be appended with a \0 to become c[3] which is [H][i][\0]?
It is quite confusing that sometimes there is a \0 following the string/char array and causes trouble when I compare two variables by if (c1==c2) as it most likely returns FALSE (0).
Detailed answers are appreciated. But keeping your answer brief helps my understanding :)
Thank you in advance!
Answer 1: In code 1 you have a char array that is not a string; in code 2 you have a char array that is also a string.
Answer 2: A string is a char array in which (at least) one element has the value 0; if you leave the size part empty, the compiler will automatically fill it with the minimum possible value.
char astring[] = "foobar"; /* compiler automagically uses 7 for size */
printf("%d\n", (int)sizeof astring);
Answer 3: a char array in which one of the elements is NUL is a string; a char array where no elements are NUL is not a string.
Answer 4: an array defined to hold two elements (char c[2];) cannot hold three elements. If it is going to be a string it can only be the empty string or a string with 1 character.
Question 1:
Why is there a difference in the outcomes of source code 1 and source
code 2?
Source code 1:
#include <stdio.h>
#include <string.h>
int main()
{
char c[2]="Hi";
printf("%d", strlen(c)); //returns 3 (not 2!?)
getchar();
}
Source code 2:
#include <stdio.h>
#include <string.h>
int main()
{
char c[3]="Hi";
printf("%d", strlen(c)); //returns 2 (not 3!?)
getchar();
}
answer:
Because in the first case, c[] is only holding "Hi". strlen looks for a zero at the end, and, depending on exactly what is behind c[] finds one sooner or later, or crashes. We can't say without knowing exactly what is in the memory behind the c[] array.
Question 2:
How is a string variable different from a char array? How to declare
them with the minimum required index numbers allowing \0 to be stored
if any (please read the codes below)?
char name[index] = "Mick"; //should index be 4 or 5?
char name[index] = {'M', 'i', 'c', 'k'}; //should index be 4 or 5?
answer
Really depends on what you want to do. Probably 5 if you want to actually use the content as a string. But there's nothing saying you can't store "Mick" in a 4 character array - you just can't use strlen to find out how long it is, because strlen will continue to 5 and quite possibly (much) further to find the length, and if there is no zero in the next several memory locations, it could lead to a crash, because eventually, there won't be valid memory addresses to read.
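If an array may or may not contain a terminator, its contents can still be measured safely by bounding the search to the array size. A minimal sketch using the standard memchr function; the helper names holds_string and bounded_length are assumptions made for this example:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if buf (cap elements) contains a '\0', i.e. is a string. */
int holds_string(const char *buf, size_t cap)
{
    return memchr(buf, '\0', cap) != NULL;
}

/* Hypothetical helper: length of the text in buf without ever reading past
   buf[cap - 1]. Returns cap when the array holds no terminator at all. */
size_t bounded_length(const char *buf, size_t cap)
{
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}
```

For char four[4] = {'M', 'i', 'c', 'k'}; the call holds_string(four, sizeof four) reports 0, while for char five[5] = "Mick"; it reports 1. bounded_length gives 4 in both cases and never reads outside the array.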
#define name "Mick" //what is the size? Is there a \0?
This has absolutely no size at all until you use name somewhere. #defines are not part of what the compiler sees - the pre-processor will replace name with "Mick" if you use name anywhere - and hopefully, that's in a place the compiler can make sense of. And then the same rules apply as in the previous answer - it depends on how you want to use the array of characters. For correct operation with strlen, strcpy, and nearly all other str... functions, you need a zero at the end.
Question 3:
Does the terminating null ONLY follow strings but not char arrays? So
the actual value of the string "Hi" is [H][i][\0] and the actual value
of the char array "Hi" is [H][i]?
Yes, no, maybe. It all depends on how you USE the "Hi" string literal (that's the technical name for 'something within double quotes'). If the compiler is "allowed", it will put a zero at the end. But if you initialize an array to a given size, it will stuff the bytes in there, and if there isn't room for a zero, that's your problem, not the compiler's.
Question 4:
Suppose c[2] is going to store "Hi" followed by a \0 (not sure how
this is done, using gets(c) maybe?). So where is the \0 stored? Is it
stored "somewhere" after c[2] to become [H][i]\0 or will c[2] be
appended with a \0 to become c[3] which is [H][i][\0]?
In c[2], beyond the 'H', 'i', there is no telling what is stored [technically, it could well be "the end of the earth" - in computer terms, that's "memory that can't be read" - in which case strlen on that WILL crash your program, because strlen reads beyond the end of the earth]. But it could also be a zero, a one, the letter 'a', the number 42, or any other 8-bit [1] value.
It is quite confusing that sometimes there is a \0 following the
string/char array and causes trouble when I compare two variables by
if (c1==c2) as it most likely returns FALSE (0).
If c1 and c2 are char arrays, that will ALWAYS be false since c1 and c2 are never going to have the same address, and when using an array in C in that way, it becomes "the address in memory of the first element in the array". So no matter what the contents of c1 and c2 are, their addresses can never be the same [because they are two different variables, and two variables cannot have the same location in memory - that's like trying to park two cars in a parking space large enough only for one car - and no, crushing either car is not allowed in our thought experiment].
[1] Char isn't guaranteed to be 8 bits. But let's ignore that for now.
Running source code one is undefined behavior because strlen() requires a NUL-terminated string, which c[2] = "Hi"; /* = { 'H', 'i' } */ is not. A string differs from a char array in that a string is a char array with at least one NUL byte somewhere in the array.
The remaining answers should follow easily from this fact.
To autosize a char array to match the size of a string literal at initialization, simply specify no array size:
char c[] = "This will automatically size the c array (including the NUL).";
Note that you cannot compare char arrays with the == operator. You have to use
if (strcmp(c1, c2) == 0) {
/* Equal. */
} else {
/* Not equal. */
}
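The difference between the two comparisons can be made explicit with two small functions. This is a sketch with invented names (same_address, same_text), not code from the answer:

```c
#include <string.h>

/* What c1 == c2 really tests: do both point at the same memory location? */
int same_address(const char *a, const char *b)
{
    return a == b;
}

/* What was intended: do both strings hold the same characters? */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For char c1[] = "Hi"; and char c2[] = "Hi"; same_address(c1, c2) is 0 because they are distinct arrays, while same_text(c1, c2) is 1.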
strlen() relies on the terminating \0 character, and in C all strings must be \0 terminated. When you reserve only 2 elements for the 2 characters H and i, there is no room for the \0. Hence you get undefined behavior in strlen().
In case of char c[3] = "Hi"; there is \0 at the third place and strlen() will calculate the actual length.
How to declare them with the minimum required index numbers allowing \0 to be stored if any ?
When you are not sure about the size of the char array, do it like this:
char c1[] = "Mike"; // strlen = 4
char c2[] = "Omkant"; // strlen = 6
NOTE:
EDIT: In the above case, where no size is given explicitly, do not confuse sizeof with strlen().
strlen() returns only the number of characters.
sizeof gives the number of characters plus one more (for the \0 character).
So for an array sized by the compiler from a string literal like these, sizeof gives exactly 1 more than the number returned by strlen(). With an explicit size, such as char c[20] = "Mike";, sizeof is 20 while strlen() is still 4.

what is the length of this array? c language

char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
int my_length = 0xFFFFFFFF;
my_length = strlen(msg);
I thought it was nine, but the answer is 4. Can anyone explain? Thanks.
strlen will stop counting as soon as it hits a null terminator (as C uses null terminated strings and expects to only find them at the end of a string).
You have four characters before your first null terminator, therefore the length is 4.
strlen returns 4 because the (first) string in msg is terminated by the \0 at msg[4]. However, the array msg has a length of 100 chars because it was declared as such.
Remember that in C, a string is simply a sequence of character values followed by a zero-valued terminator. Strings are stored in arrays of char (or wchar_t for wide strings), but not every array of char (or wchar_t) is a string. To store a string that's N characters long, you need an array with at least N + 1 elements to account for the terminator.
strlen returns the number of characters in the string starting at the specified address up to the zero terminator.
To get the size (in bytes) of the msg array, use the sizeof operator:
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
size_t my_length = strlen( msg );
size_t my_size = sizeof msg;
if ( my_length >= my_size )
// whoopsie
In this case, you're actually storing two strings in one array ("CPRE" and "288").
The size of the msg array is 100 (as given by the declaration).
The length of the string "CPRE" starting at msg[0] is 4, since you have a zero terminator in the fifth element of the array ('\0' == 0).
The length of the string "288" starting at msg[5] is 3 since you have another zero terminator in the ninth element of the array.
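To reach the second string, step over the first string and its terminator. A minimal sketch; the helper after_terminator is an invented name, and the caller is assumed to know that the result still lies inside the array:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: address of the element right after the terminator
   of the string s. The caller must make sure this is still inside the
   array that holds s. */
const char *after_terminator(const char *s)
{
    return s + strlen(s) + 1;   /* skip the characters, then the '\0' */
}
```

With the msg array above, after_terminator(msg) is msg + 5, and strlen applied to that pointer gives 3.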
Maybe it is a typo in your
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
and you wanted
char msg[100] = {'C','P','R','E','0','2','8','8','\0'};
(plainly: CPRE0288), so the binary 0 (instead of the character representation of 0, i.e. '0') prematurely finishes your string.
You cannot assume that the return value of strlen represents the size of an array.
strlen will take a pointer to the start of a string and increment the pointer while looking for a null terminator; once it finds that, it returns the counter (i.e. number of increments before the null was found).
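That description corresponds roughly to the loop below. It is an illustrative sketch named my_strlen, not the actual library implementation, which is usually far more optimized:

```c
#include <stddef.h>

/* Illustrative re-implementation of the behavior described above. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')   /* advance until the terminator is found */
        p++;
    return (size_t)(p - s);   /* number of increments performed */
}
```

Because the loop stops at the first '\0', anything stored after it (such as '2', '8', '8' in msg) is never looked at.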
You declared msg to be of length 100, but only populated 9 elements in the array. sizeof(msg) will be 100.
Are you actually asking "how can I find out how many values are initialized in an array"? There's really no answer to that.

Sizeof(char[]) in C

Consider this code:
char name[]="123";
char name1[]="1234";
And this result
The size of name (char[]):4
The size of name1 (char[]):5
Why is the size of char[] always plus one?
Note the difference between sizeof and strlen. The first is an operator that gives the size of the whole data item. The second is a function that returns the length of the string, which will be less than its sizeof (unless you've managed to get string overflow), depending how much of its allocated space is actually used.
In your example
char name[]="123";
sizeof(name) is 4, because of the terminating '\0', and strlen(name) is 3.
But in this example:
char str[20] = "abc";
sizeof(str) is 20, and strlen(str) is 3.
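One practical use of the two numbers together is working out how much room is left in a buffer. A small sketch with an invented helper name, room_left, assuming buf really holds a terminated string shorter than size:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many more characters fit into an array of size
   elements that currently holds the string buf.
   Assumes strlen(buf) < size. */
size_t room_left(const char *buf, size_t size)
{
    return size - strlen(buf) - 1;   /* the -1 keeps a slot for the '\0' */
}
```

For char str[20] = "abc"; room_left(str, sizeof str) is 16, while for char name[] = "123"; it is 0 because the compiler sized the array to fit exactly.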
As Michael pointed out in the comments the strings are terminated by a zero. So in memory the first string will look like this
"123\0"
where \0 is a single char and has the ASCII value 0. Then the above string has size 4.
If you had not this terminating character, how would one know, where the string (or char[] for that matter) ends? Well, indeed one other way is to store the length somewhere. Some languages do that. C doesn't.
In C, strings are stored as arrays of chars. With a recognised terminating character ('\0' or just 0) you can pass a pointer to the string, with no need for any further meta-data. When processing a string, you read chars from the memory pointed at by the pointer until you hit the terminating value.
As your array initialisation is using a string literal:
char name[]="123";
is equivalent to:
char name[]={'1','2','3',0};
If you want your array to be of size 3 (without the terminating character, as you are not storing a string), you will want to use:
char name[]={'1','2','3'};
or
char name[3]="123";
(thanks alk)
which will do as you were expecting.
Because there is a null character that is attached to the end of string in C.
Like here in your case
name[0] = '1'
name[1] = '2'
name[2] = '3'
name[3] = '\0'
name1[0] = '1'
name1[1] = '2'
name1[2] = '3'
name1[3] = '4'
name1[4] = '\0'
A string in C is an array of characters which is terminated by \0, the character with the value 0.
When initializing with char arr[] = "1234";, you use a string literal, which is always null-terminated (\0 is also called the null character).
To avoid a null (assuming you want just an array of chars and not a string), you can declare it the following way char arr[] = {'1', '2', '3', '4'}; and the program will behave as you wish (sizeof(arr) would be 4).
char name[] = {'1','2','3','\0'};
char name1[] = {'1','2','3','4','\0'};
So
sizeof(name) == 4
sizeof(name1) == 5
sizeof returns the size of the object and in this case the object is an array and it is defined that your array is 4 bytes long in first case and 5 bytes in second case.
In C, string literals have a null terminating character added to them.
Your strings,
char name[]="123";
char name1[]="1234";
are stored in memory as:
'1' '2' '3' '\0'
'1' '2' '3' '4' '\0'
Hence, the size is always plus one. Keep in mind when reading strings from files or from whatever source, the variable where you store your string, should always have extra space for the null terminating character.
For example if you are expected to read string, whose maximum size is 100, your buffer variable, should have size of 101.
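A minimal sketch of that sizing rule, using sscanf with a field width so that at most 100 characters are stored. The names MAX_TEXT and read_word are made up for this example, and reading from another string stands in for a file or other source:

```c
#include <stdio.h>

#define MAX_TEXT 100   /* longest string we agree to store (assumed limit) */

/* Hypothetical helper: copy the first whitespace-delimited word of src into
   dst, which must have MAX_TEXT + 1 elements. Returns 1 on success. */
int read_word(const char *src, char dst[MAX_TEXT + 1])
{
    /* %100s stores at most 100 characters and then appends the '\0',
       which is why the buffer needs 101 elements. The width in the
       format string must be kept in sync with MAX_TEXT by hand. */
    return sscanf(src, "%100s", dst) == 1;
}
```

Declaring the destination as char buf[MAX_TEXT + 1]; keeps the extra element for the terminator visible at the point of declaration.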
Every string is terminated with the null character '\0', which adds 1 to the size.

printing int array as string

I am trying to print int array with %s. But it is not working. Any ideas why?
#include<stdio.h>
main() {
int a[8];
a[0]='a';
a[1]='r';
a[2]='i';
a[3]='g';
a[4]='a';
a[5]='t';
a[6]='o';
a[7] = '\0';
printf("%s", a);
}
It prints just a.
I tried with short as well, but it also does not work.
This is because you are trying to print an int array, where each element is typically 4 bytes (4 chars) wide. printf() interprets the memory as a char array, so on a little-endian machine the first element looks like:
'a' \0 \0 \0
to printf(). As printf() stops at the first \0 it finds, it only prints the 'a'.
Use a char array instead.
Think about the way integers are represented - use a debugger if you must. Looking at the memory you will see plenty of 0 bytes, and %s stops when it reaches a 0 byte.
It prints just a.
That's why it prints just a. Afterwards it encounters a 0 byte and it stops.
Because you declared a as an array of int, each of those single characters is stored in a full int rather than in one byte, so the array is not a valid string for %s. You must change it to a char array. To save time, you can instead declare a pointer with the asterisk character (const char *a) and initialize it with a single string literal in double quotes.
int a[8] means array of 8 ints or 8*(4 bytes) - Say 32 bit architecture
a[0] = 'a' stores in the first int index as 'a''\0''\0''\0'
a[1] = 'r' as 'r''\0''\0''\0' and so on . . .
%s represents any C-style string ie. any string followed by a '\0' character
So
printf("%s", a);
searches for trailing '\0' character and just prints "a" assuming it is the entire string
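If the data has to stay in an int array, one option is to copy the character codes into a char array before printing. A minimal sketch; the helper name ints_to_string is an assumption for this example:

```c
#include <stddef.h>

/* Hypothetical helper: copy n character codes from an int array into a char
   array and terminate it. dst must have room for n + 1 elements. */
void ints_to_string(const int *src, size_t n, char *dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (char)src[i];   /* keep only the character code */
    dst[n] = '\0';               /* make the result a proper string */
}
```

After ints_to_string(a, 7, s) with a char s[8], printf("%s", s) should print the whole word instead of stopping after the first letter.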
