Sizeof(char[]) in C - c

Consider this code:
char name[]="123";
char name1[]="1234";
And this result
The size of name (char[]):4
The size of name1 (char[]):5
Why is the size of char[] always one more than the number of characters?

Note the difference between sizeof and strlen. The first is an operator that gives the size of the whole data item in bytes. The second is a function that returns the length of the string, which is less than the sizeof of the array holding it (unless you've managed to overflow the array) and depends on how much of the allocated space is actually used.
In your example
char name[]="123";
sizeof(name) is 4, because of the terminating '\0', and strlen(name) is 3.
But in this example:
char str[20] = "abc";
sizeof(str) is 20, and strlen(str) is 3.

As Michael pointed out in the comments the strings are terminated by a zero. So in memory the first string will look like this
"123\0"
where \0 is a single char and has the ASCII value 0. Then the above string has size 4.
Without this terminating character, how would one know where the string (or char[], for that matter) ends? One alternative is to store the length somewhere. Some languages do that. C doesn't.

In C, strings are stored as arrays of chars. With a recognised terminating character ('\0' or just 0) you can pass a pointer to the string, with no need for any further meta-data. When processing a string, you read chars from the memory pointed at by the pointer until you hit the terminating value.
As your array initialisation is using a string literal:
char name[]="123";
is equivalent to:
char name[]={'1','2','3',0};
If you want your array to be of size 3 (without the terminating character, as you are not storing a string), you will want to use:
char name[]={'1','2','3'};
or
char name[3]="123";
(thanks alk)
which will do as you were expecting.

Because a null character is appended to the end of a string in C.
In your case:
name[0] = '1'
name[1] = '2'
name[2] = '3'
name[3] = '\0'
name1[0] = '1'
name1[1] = '2'
name1[2] = '3'
name1[3] = '4'
name1[4] = '\0'

A string in C is an array of characters terminated by \0, which has the ASCII value 0.
When you write char arr[] = "1234";, you initialize the array from a string literal, which is null-terminated by default (\0 is also called the null character).
To avoid the terminator (assuming you want just an array of chars and not a string), declare the array as char arr[] = {'1', '2', '3', '4'}; and the program will behave as you wish (sizeof(arr) would be 4).

name = {'1','2','3','\0'};
name1 = {'1','2','3','4','\0'};
So
sizeof(name) = 4;
sizeof(name1) = 5;
sizeof returns the size of the object. In this case the object is an array, and its initializer makes it 4 bytes long in the first case and 5 bytes long in the second.

In C, string literals have a null terminating character added to them.
Your strings,
char name[]="123";
char name1[]="1234";
are stored as if you had written:
char name[]={'1','2','3','\0'};
char name1[]={'1','2','3','4','\0'};
Hence, the size is always one more than the number of characters. Keep in mind that when you read strings from files or from any other source, the variable where you store the string must always have extra space for the null terminating character.
For example, if you expect to read a string whose maximum length is 100, your buffer should have a size of 101.

Every string is terminated with the null byte '\0', which adds 1 to the size.

Related

construct string from indices in C

When I construct a string like this:
char string[1] = {'a'};
printf("%s", string);
it prints a4.
Why is there a four at the end? How can I get rid of it?
I choose this method because I need to make a string from character indexes, such as char array[4] = {string[i],string[j],string[k]};.
Your string should end with the terminating character '\0'.
You can do it by:
char string[2] = {'a','\0'};
Or:
char string[] = "a";
"strings" in C are essentially arrays of characters ending with the \0 character (null terminated).
So if you want an array of characters, what you did is fine, but it is not a "string". Don't try to print it as such.
If you would also like to print it or treat it as a "string", then increase its length by 1 and add a '\0' char at the end.
The conversion specification %s is used to output strings, that is, sequences of characters terminated by a zero character.
The array declared this way
char string[1] = {'a'};
does not contain a string.
So to output its elements you need to specify the exact number of characters you are going to output. For example
printf("%*.*s", 1, 1, string);
Otherwise reserve one more element in the array for the terminating zero and use the conversion specification %s. For example
char string[2] = {'a'};
printf( "%s", string );

what is the length of this array? c language

char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
int my_length = 0xFFFFFFFF;
my_length = strlen(msg);
I thought it would be nine, but the answer is 4. Can anyone explain? Thanks.
strlen will stop counting as soon as it hits a null terminator (as C uses null terminated strings and expects to only find them at the end of a string).
You have four characters before your first null terminator, therefore the length is 4.
strlen returns 4 because the (first) string in msg is terminated by the \0 at msg[4]. However, the array msg has a length of 100 chars because it was declared as such.
Remember that in C, a string is simply a sequence of character values followed by a zero-valued terminator. Strings are stored in arrays of char (or wchar_t for wide strings), but not every array of char (or wchar_t) is a string. To store a string that's N characters long, you need an array with at least N + 1 elements to account for the terminator.
strlen returns the number of characters in the string starting at the specified address up to the zero terminator.
To get the size (in bytes) of the msg array, use the sizeof operator:
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
size_t my_length = strlen( msg );
size_t my_size = sizeof msg;
if ( my_length >= my_size )
// whoopsie
In this case, you're actually storing two strings in one array ("CPRE" and "288").
The size of the msg array is 100 (as given by the declaration).
The length of the string "CPRE" starting at msg[0] is 4, since you have a zero terminator in the fifth element of the array ('\0' == 0).
The length of the string "288" starting at msg[5] is 3 since you have another zero terminator in the ninth element of the array.
Maybe there is a typo in your
char msg[100] = {'C','P','R','E','\0','2','8','8','\0'};
and you wanted
char msg[100] = {'C','P','R','E','0','2','8','8','\0'};
(plainly: CPRE0288). The binary 0 ('\0'), used instead of the character representation of 0 (i.e. '0'), ends your string prematurely.
You cannot assume that the return value of strlen represents the size of an array.
strlen will take a pointer to the start of a string and increment the pointer while looking for a null terminator; once it finds that, it returns the counter (i.e. number of increments before the null was found).
You declared msg to be of length 100, but only populated 9 elements in the array. sizeof(msg) will be 100.
Are you actually asking "how can I find out how many values are initialized in an array"? There's really no answer to that.

String Initialization in c

I am quite new to C programming, so feel free to correct me, I insist. My basic understanding of strings in C is that when we initialize a string, a null character is automatically placed at the end of the string, and that this null character cannot be read or written but is only used internally.
So when I create a string as char str[3], assign a word to it, say "RED", and print it using the puts function or printf("%s",str), I get an unusual output: RED(SMILEY FACE)
I then reduce the size of the string to char str[2], assign RED to it, compile it, and again receive an odd output: RE(Smiley face)
If someone can explain this to me I will be thankful. The C code is posted below.
#include <stdio.h>
int main(void)
{
    char s1[3]="RED"; /* no room for the terminating '\0' */
    char s2[]="RED";  /* 4 bytes, terminator included */
    puts(s1);
    puts(s2);
    printf("%s",s1);
    return 0;
}
char s1[3] = "RED";
Is a valid statement. It copies 3 characters from the constant string literal "RED" (which is 4 characters long) into the character array s1. There is no terminating '\0' in s1, because there is no room for it.
Note the copy, because s1 is mutable, while "RED" is not. This makes the statement different from e.g. const char *s1 = "RED";, where the string is not copied.
The behavior of both puts(s1) and printf("%s", s1) is undefined, because there is no terminating '\0' in s1. Treating it as a terminated string can lead to arbitrary behavior.
char s2[] = "RED";
Here, sizeof(s2) == 4, because "RED" has four characters: you need to count the trailing '\0' when calculating space.
The null character takes one extra character (byte). So you need one extra element in addition to the number of characters in the word you are initializing with.
char s1[4]="RED"; //3 for RED and 1 for the null character
On the other hand
char s2[3]="RED";
there is no space for the null character. "RED" is in there, but printing it with %s is undefined behavior, because no null character is stored at the end. Your data is stored fine, but printf cannot tell where it ends.
char s2[]="RED";
This works, because 4 bytes are allocated automatically, which includes space for the terminating null character.

Why is the entirety of this first array being added onto the second, on top of the two values (from the first) that I assign it?

I want to assign the first two values from the hash array to the salt array.
char hash[] = {"HAodcdZseTJTc"};
char salt[] = {hash[0], hash[1]};
printf("%s", salt);
However, when I attempt this, the first two values are assigned and then all thirteen values are also assigned to the salt array. So my output here is not:
HA
but instead:
HAHAodcdZseTJTc
salt is not null-terminated. Try:
char salt[] = {hash[0], hash[1], '\0'};
This happens because you are adding just two characters to the salt array and you are not adding the '\0' terminator.
Passing an array that is not nul-terminated to printf() with a "%s" specifier causes undefined behavior. In your case it also prints hash; in my case
HA#
was printed.
Strings in C use a special convention to mark where they end: a non-printable special character '\0' is appended to a sequence of non-'\0' bytes, and that's how a C string is built.
For example, if you were to compute the length of a string you would do something like
#include <stddef.h>
size_t stringlength(const char *string)
{
    size_t length;
    for (length = 0 ; string[length] != '\0' ; ++length)
        ; /* empty body: the loop header does the counting */
    return length;
}
there are of course better ways of doing it, but I just want to illustrate what the significance of the terminating '\0' is.
Now that you know this, you should notice that
char string[] = {'A', 'B', 'C'};
is an array of char but it's not a string, for it to be a string, it needs a terminating '\0', so
char string[] = {'A', 'B', 'C', '\0'};
would actually be a string.
Notice that then, when you allocate space to store n characters, you need to allocate n + 1 bytes, to make room for the '\0'.
In the case of printf(), it will consume the bytes that the passed pointer points at until one of them is '\0', and stop iterating there.
That also explains the undefined behavior: printf() would be reading out of bounds, and anything could happen, depending on what is actually stored at the memory addresses that lie beyond the passed data.
There are many functions in the standard library that expect strings, i.e. sequences of non-nul bytes followed by a nul byte.

How is an empty string stored in a char array?

If I have an array declared as
char arr[1] = "";
What is actually stored in memory? What will arr[0] be?
Strings are null-terminated. An empty string contains one element, the null-terminator itself, i.e, '\0'.
char arr[1] = "";
is equivalent to:
char arr[1] = {'\0'};
You can imagine how it's stored in the memory from this.
C-strings are zero-terminated. Thus, "abc" is represented as { 'a', 'b', 'c', 0 }.
Empty strings thus just have the zero.
This is also the reason why a string must always be allocated to be one char larger than the maximum possible length.
arr[0] = 0x00;
However, if you do not assign any value, as in
char arr[1];
then arr[0] holds an indeterminate (garbage) value, at least for an array with automatic storage.
arr[0] is the null character, which can be referred to as '\0' or 0.
A string is, by definition, "a contiguous sequence of characters terminated by and including the first null character". For an empty string, the terminating null character is the first one (at index 0).
It gets more interesting if the array is declared as char arr[] = "";
In this case sizeof(arr) is 1 and strlen(arr) is 0.
You can also check this yourself by adding a print such as printf("%d", arr[0]);
A string is a sequence of characters. In your case there is no character inside "", so only the '\0' character is stored in arr[0].
A C string ends with a null character, so the empty string "" is actually stored as a single '\0'. The compiler adds it for you,
so strlen("") equals 0 but sizeof("") equals 1.
