What is the length of this array? (C language)

char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
int my_length = 0xFFFFFFFF;
my_length = strlen(msg);
I thought it would be nine, but the answer is 4. Can anyone explain? Thanks.

strlen will stop counting as soon as it hits a null terminator (as C uses null terminated strings and expects to only find them at the end of a string).
You have four characters before your first null terminator, therefore the length is 4.

strlen returns 4 because the (first) string in msg is terminated by the \0 at msg[4]. However, the array msg has a length of 100 chars because it was declared as such.

Remember that in C, a string is simply a sequence of character values followed by a zero-valued terminator. Strings are stored in arrays of char (or wchar_t for wide strings), but not every array of char (or wchar_t) is a string. To store a string that's N characters long, you need an array with at least N + 1 elements to account for the terminator.
strlen returns the number of characters in the string starting at the specified address up to the zero terminator.
To get the size (in bytes) of the msg array, use the sizeof operator:
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
size_t my_length = strlen( msg );
size_t my_size = sizeof msg;
if ( my_length >= my_size )
// whoopsie
In this case, you're actually storing two strings in one array ("CPRE" and "288").
The size of the msg array is 100 (as given by the declaration).
The length of the string "CPRE" starting at msg[0] is 4, since you have a zero terminator in the fifth element of the array ('\0' == 0).
The length of the string "288" starting at msg[5] is 3 since you have another zero terminator in the ninth element of the array.

Maybe there is a typo in your
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
and you wanted
char msg[100] = {'C','P','R','E','0','2','8','8','\0'};
(plainly: CPRE0288). The binary zero '\0', written instead of the character '0', ends your string prematurely.

You cannot assume that the return value of strlen represents the size of an array.
strlen will take a pointer to the start of a string and increment the pointer while looking for a null terminator; once it finds that, it returns the counter (i.e. number of increments before the null was found).
You declared msg to be of length 100, but explicitly initialized only 9 elements; the remaining 91 are set to zero. sizeof(msg) will be 100.
Are you actually asking "how can I find out how many values are initialized in an array"? There's really no answer to that.

Related

Array showing random characters at the end

I wanted to test things out with arrays on C as I'm just starting to learn the language. Here is my code:
#include <stdio.h>
main(){
int i,t;
char orig[5];
for(i=0;i<=4;i++){
orig[i] = '.';
}
printf("%s\n", orig);
}
Here is my output:
.....�
It is exactly that. What are those mysterious characters? What have I done wrong?
%s with printf() expects a pointer to a string, that is, pointer to the initial element of a null terminated character array. Your array is not null terminated.
Thus, in search of the terminating null character, printf() goes out of bound, and subsequently, invokes undefined behavior.
You have to null-terminate your array, if you want that to be used as a string.
Quote: C11, chapter §7.21.6.1, (emphasis mine)
s
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. If the precision is specified, no more than that many bytes are written. If the precision is not specified or is greater than the size of the array, the array shall contain a null character.
Quick solution:
Increase the array size by 1: char orig[6];
Add a null terminator at the end: after the loop body, add orig[i] = '\0';
Then print the result.
char orig[5];//creates an array of 5 char. (with indices ranging from 0 to 4)
|?|?|?|0|0|0|0|0|?|?|?|?|
       |         ^memory you do not own (your mysterious characters)
       ^start of orig
for(i=0;i<=4;i++){ //attempts to populate array with '.'
orig[i] = '.';
}
|?|?|?|.|.|.|.|.|?|?|?|?|
       |         ^memory you do not own (your mysterious characters)
       ^start of orig
This results in a non null terminated char array, which will invoke undefined behavior if used in a function that expects a C string. C strings must contain enough space to allow for null termination. Change your declaration to the following to accommodate.
char orig[6];
Then add the null terminator after your loop:
...
for(i=0;i<=4;i++){
orig[i] = '.';
}
orig[i] = 0;
Resulting in:
|?|?|?|.|.|.|.|.|0|?|?|?|
       |           ^memory you do not own
       ^start of orig
Note: Because the null termination results in a C string, the function using it knows how to interpret its contents (i.e. no undefined behavior), and your mysterious characters are held at bay.
There is a difference between a character array and a string. In C, a string is a character array whose contents end with a null character (value 0).
The %s format specifier of printf() expects a pointer to a character array that is terminated by a null character. Your array is not null-terminated, so printf reads past the 5 characters you assigned and prints whatever garbage values follow your fifth '.'.
To solve this, declare the character array one element larger than the number of characters you want to store. In your case, a character array of size 6 will work.
#include <stdio.h>
int main(){
int i,t;
char orig[6]; // If you want to store 5 characters, allocate an array of size 6 to store null character at last position.
for (i=0; i<=4; i++) {
orig[i] = '.';
}
orig[5] = '\0';
printf("%s\n", orig);
}
There is a reason to spend one extra element on the null character. Whenever you pass an array to a function, only a pointer to its first element is passed. This makes it impossible for the function to determine where the array ends (sizeof inside the function returns the size of a pointer on your machine, not the size of the array). That is why functions like memcpy and memset take an additional argument giving the number of bytes to operate on.
With a string, however, a function can find the end by looking for a special character (the null character).
You need to add a NUL character (\0) at the end of your string.
#include <stdio.h>
int main(void)
{
int i,t;
char orig[6];
for(i=0;i<=4;i++){
orig[i] = '.';
}
orig[i] = '\0';
printf("%s\n", orig);
}
If you do not know what \0 is, I strongly recommend checking an ASCII table (https://www.asciitable.com/).
Good luck
printf takes a pointer to the start of a memory location (an array in this case) and prints until it encounters a \0 character. Strings of this kind are called null-terminated strings.
So please add a \0 at the end and fill in characters only up to index (size of array - 2), like this:
#include <stdio.h>
int main(void){
int i;
char orig[5];
for(i=0;i<4;i++){ //less than size of array - 1
orig[i] = '.';
}
orig[i] = '\0'; //terminator goes in the last element
printf("%s\n", orig);
}

Why is the entirety of this first array being added onto the second, on top of the two values (from the first) that I assign it?

I want to assign the first two values from the hash array to the salt array.
char hash[] = {"HAodcdZseTJTc"};
char salt[] = {hash[0], hash[1]};
printf("%s", salt);
However, when I attempt this, the first two values are assigned and then all thirteen values are also assigned to the salt array. So my output here is not:
HA
but instead:
HAHAodcdZseTJTC
salt is not null-terminated. Try:
char salt[] = {hash[0], hash[1], '\0'};
You are adding just two characters to the salt array and you are not adding the '\0' terminator.
Passing an array that is not null-terminated to printf() with a "%s" specifier causes undefined behavior. In your case it happened to print hash as well; in my case
HA#
was printed.
Strings in C use a special convention to mark where they end: a non-printable special character '\0' is appended to a sequence of non-'\0' bytes, and that is how a C string is built.
For example, if you were to compute the length of a string you would do something like
size_t stringlength(const char *string)
{
size_t length;
for (length = 0 ; string[length] != '\0' ; ++length);
return length;
}
there are of course better ways of doing it, but I just want to illustrate what the significance of the terminating '\0' is.
Now that you know this, you should notice that
char string[] = {'A', 'B', 'C'};
is an array of char but it's not a string, for it to be a string, it needs a terminating '\0', so
char string[] = {'A', 'B', 'C', '\0'};
would actually be a string.
Notice, then, that when you allocate space to store n characters, you need to allocate n + 1 bytes to make room for the '\0'.
In the case of printf(), it will consume the bytes that the passed pointer points at until one of them is '\0'; there it stops iterating.
That also explains the undefined behavior: printf() would be reading out of bounds, and anything could happen, depending on what is actually stored at memory addresses that do not belong to the passed data.
There are many functions in the standard library that expect strings, i.e. sequences of non-nul bytes followed by a nul byte.

Sizeof(char[]) in C

Consider this code:
char name[]="123";
char name1[]="1234";
And this result
The size of name (char[]):4
The size of name1 (char[]):5
Why is the size of char[] always one more than the number of characters?
Note the difference between sizeof and strlen. The first is an operator that gives the size of the whole data item. The second is a function that returns the length of the string, which will be less than its sizeof (unless you've managed to get string overflow), depending how much of its allocated space is actually used.
In your example
char name[]="123";
sizeof(name) is 4, because of the terminating '\0', and strlen(name) is 3.
But in this example:
char str[20] = "abc";
sizeof(str) is 20, and strlen(str) is 3.
As Michael pointed out in the comments the strings are terminated by a zero. So in memory the first string will look like this
"123\0"
where \0 is a single char and has the ASCII value 0. Then the above string has size 4.
If you had not this terminating character, how would one know, where the string (or char[] for that matter) ends? Well, indeed one other way is to store the length somewhere. Some languages do that. C doesn't.
In C, strings are stored as arrays of chars. With a recognised terminating character ('\0' or just 0) you can pass a pointer to the string, with no need for any further meta-data. When processing a string, you read chars from the memory pointed at by the pointer until you hit the terminating value.
As your array initialisation is using a string literal:
char name[]="123";
is equivalent to:
char name[]={'1','2','3',0};
If you want your array to be of size 3 (without the terminating character, as you are not storing a string), you will want to use:
char name[]={'1','2','3'};
or
char name[3]="123";
(thanks alk)
which will do as you were expecting.
Because there is a null character that is attached to the end of string in C.
Like here in your case
name[0] = '1'
name[1] = '2'
name[2] = '3'
name[3] = '\0'
name1[0] = '1'
name1[1] = '2'
name1[2] = '3'
name1[3] = '4'
name1[4] = '\0'
A string in C is an array of characters terminated by \0, the character with value 0.
When you write char arr[] = "1234";, you initialize the array from a string literal, which is always null-terminated (\0 is also called the null character).
To avoid a null (assuming you want just an array of chars and not a string), you can declare it the following way char arr[] = {'1', '2', '3', '4'}; and the program will behave as you wish (sizeof(arr) would be 4).
name = {'1','2','3','\0'};
name1 = {'1','2','3','4','\0'};
So
sizeof(name) = 4;
sizeof(name1) = 5;
sizeof returns the size of the object and in this case the object is an array and it is defined that your array is 4 bytes long in first case and 5 bytes in second case.
In C, string literals have a null terminating character added to them.
Your strings,
char name[]="123";
char name1[]="1234";
are stored as if you had written:
char name[]={'1','2','3','\0'};
char name1[]={'1','2','3','4','\0'};
Hence, the size is always plus one. Keep in mind when reading strings from files or from whatever source, the variable where you store your string, should always have extra space for the null terminating character.
For example, if you expect to read a string whose maximum length is 100, your buffer should have size 101.
Every string is terminated with the null character '\0', which adds 1 to the size of the array.

Resetting a char pointer to the top of an array

I am writing a function and I need to count the length of an array:
while(*substring){
substring++;
length++;
}
Now, when I exit the loop, will that pointer still point to the start of the array? For example:
If the array is "Hello",
when I exit the loop will the pointer be pointing at
H or the NULL?
If it is pointing at NULL, how do I make it point at H?
Strings in C are stored with a null character (denoted \0) at the end.
Thus, one might declare a string as follows.
char *str="Hello!";
In memory, this will look like Hello!0 (or rather, a string of numbers corresponding to each letter followed by a zero).
Your code looks like this:
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
When you reach the end of this loop, *substring will be equal to 0 and substring will contain the address of the 0 character mentioned above. The value of substring will not change unless you explicitly do so.
To make it point at the beginning of the string you could use substring-length, since C supports pointer arithmetic. Alternatively, you could remember the location before you begin:
beginning=str;
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
substring=beginning;
It's pointing at the null terminator of the array. Just remember the position in another variable, or subtract length from the pointer.
A pointer, once moved, will not automatically move back to any other location. So once the while loop is over, the pointer will be pointing at the null character '\0' that terminates the string.
To move back to the start of the string, subtract the string length, which you are already calculating by incrementing the length variable.
Sample code:
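#include <stdio.h>
#include <string.h>
int main(void)
{
char name1[] = "test program"; /* let the compiler size the array, terminator included */
char *name = name1;
size_t len = strlen(name);
while(*name)
{
name++; /* walk to the terminator */
}
name=name-len; /* step back to the first character */
printf("\n%s\n",name);
}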
Hope this helps...
At the end of the loop, *substring will be 0. That's the condition for the loop to end:
while(*substring)
So while( (the value pointed to by substring) is not equal to 0), do stuff
But then *substring becomes 0 (i.e. end of string), so substring will point at the null terminator.
If you want to bring it back to H, do substring - length
However, the function you are writing already exists. It's in string.h and it's size_t strlen(const char*). size_t is an unsigned integer type large enough to hold the size of any object (typically 32 bits on 32-bit platforms and 64 bits on 64-bit platforms).

Working with atoi

I have been attacking atoi from several different angles trying to extract ints from a string 1 digit at a time.
Problem 1 - Sizing the array
Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
Problem 2 - atoi output
What am I doing wrong here?
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
int one = 0;
char aChar = fiftyNumbersOne[48];
printf("%c\n",aChar);//outputs 5 (second to last #)
one = atoi(&aChar);
printf("%d\n",one);//outputs what appears to be INT_MAX...I want 5
Problem 1
The array should be length 51. But you can avoid having to manually figure that out by simply doing char fiftyNumbersOne[] = "blahblahblah";.
Problem 2
aChar is not a pointer to the original string; it's just an isolated char floating about in memory somewhere. But atoi(&aChar) is treating it as if it were a pointer to a null-terminated string. It's simply walking through memory until it happens to find a 0 somewhere, and then interpreting everything it's found as a string.
You probably want:
one = aChar - '0';
This relies on the fact that the character values for 0 to 9 are guaranteed to be contiguous.
51.
That's because aChar is not null-terminated. If you just want to get the integer value of a char, simply use
one = aChar - '0';
Problem 1 - Sizing the array: Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
You always want an array one bigger than what you need to store in it (to account for the null terminator). So your 50 chars should be stored in an array of size 51.
What am I doing wrong here?
Try null-terminating your input string to atoi. The documentation says atoi is supposed to be given a pointer to a string, which is different from a single non-terminated character. The results of the code you posted vary across platforms (I get -1 on Ubuntu/gcc).
char fiftyNumbersOne[51] = "37107287533902102798797998220837590246510135740250";
int one = 0;
char aChar = fiftyNumbersOne[48];
char intChar[2];
printf("%c\n",aChar);//outputs 5 (second to last #)
sprintf(intChar, "%c", aChar); //print the char to a null terminated string
one = atoi(intChar); //intChar already decays to a pointer to its first element
printf("%d\n",one);//outputs 5
Should this array of 50 chars be of size 50 or 51 (to account for null terminator)?
51, but you can also declare it without size.
char foo[] = "foo";
What am I doing wrong here?
Not reading the documentation for atoi, I guess. &aChar is a char *, so you're passing the right type to atoi, but atoi expects that pointer to refer to a string of characters terminated by the character '\0'. Your "string" isn't terminated.
One solution to this is
char aString[2];
aString[0] = fiftyNumbersOne[48];
aString[1] = '\0';
atoi(aString);
Another is doing fiftyNumbersOne[48] - '0' instead of calling atoi, since the character codes for the digits 0 to 9 are consecutive and increasing (C guarantees this, and ASCII satisfies it).

Resources