Storing of a value in an array and its initialization? - c

This may be something basic, but I still cannot figure out what happens here.
For example, if I write
char temp[3]="";
or
char temp[3]={0};
or
char temp[3]={};
or
char temp;
What will the initialization be in all four cases?
And if 0 is stored, is it stored as an ASCII value?
And if NULL is stored, is that also stored as an ASCII value?
If some elements are not explicitly initialized, which value do they have:
a garbage value or something specified?

1)
char temp[3]="";
and
char temp[3]={0};
are equivalent. The array temp will be filled with 3 zeros. It's as if you had: char temp[3] = {0, 0, 0};.
2)
char temp[3]={};
is not valid in C before C23. Empty initializers were only added to the language in C23.
3)
char temp;
This depends on where temp is declared.
If it's in a block scope then temp will be uninitialized and its value is indeterminate.
If it's at file scope then temp will be initialized to 0, provided there are no other definitions for it [1]. It's as if you had: char temp = 0;
[1] This may sound odd, but C has a concept called "tentative definitions". See: About Tentative definition.

The first three are equivalent and the array will be initialized to zero.
The last case is different, because you don't initialize the single character. How it's initialized depends on where you define the variable. If it's a global variable it will be zero-initialized. If it's a local variable then it will not be initialized at all and have an indeterminate value.
And zero is zero, i.e. 0 and not '0'.
Lastly, NULL is for pointers, not for non-pointer values. There is some confusion since the string terminator character '\0' (which is equal to 0) is also called the null character. The null character and a null pointer are two different things semantically, even if they can have the same actual value.

char temp[3]={};
isn't correct C.
char temp[3]={0};
initializes temp[0] to 0 and the rest is initialized as if they were default-initialized global variables, which for chars means that the rest will be 0 also.
char temp[3]="";
is initialization from an (empty) string, which behaves the same as if you broke the string down into character constants and initialized the elements from those.
For an empty string, the broken down version would be { '\0' }, which is the same as {0}, which makes it equivalent to the case above it.
char temp; will be default initialized (for chars, that means zeroed) if it's a global that isn't followed by a non-tentative definition, or it will have an indeterminate value if it's an automatic variable.

Related

Utility of '\0' in C string [duplicate]

#include <stdio.h>
#include <string.h>
int main()
{
char ch[20] = {'h','i'};
int k=strlen(ch);
printf("%d",k);
return 0;
}
The output is 2.
As far as I know, '\0' helps the compiler identify the end of a string, but the output here suggests that strlen can detect the end on its own, so why do we need '\0'?
Long story short: the standard requires the compiler to zero-initialize every element you did not list.
Long story:
char ch[20] = {'h','i'}
in the line above, what you are telling your compiler is:
allocate memory big enough to store 20 characters (that is, an array of 20 chars).
initialize the first two elements of the array to 'h' and 'i'.
implicitly initialize the rest to zero.
Because you supply an initializer, every element you do not list is set to zero, so the third element (and every one after it) holds the null terminator. The standard requires this; it does not depend on the compiler being smart.
If you removed the initializer and assigned each element manually as below (with ch a local variable), the remaining elements would be indeterminate and passing ch to strlen would be undefined behavior.
char ch[20];
ch[0] = 'h';
ch[1] = 'i';
Also, if there were no extra space for the null terminator, then even with an initializer the result would still be undefined behavior, as the code snippet below illustrates:
char ch[2] = { 'h','i' };
int k = strlen(ch);
printf("%d\n%s\n", k, ch);
Now, if you increase the array size of 'ch' from 2 to 3 or any higher number, the remaining elements are zero-initialized, which supplies the null terminator, so there is no more undefined behavior.
In this declaration:
char ch[20] = {'h','i'};
the first two elements are initialized explicitly and all other elements are initialized implicitly by zeroes.
The above declaration is in fact equivalent (with the one exception that the third element of the array is also explicitly initialized) to:
char ch[20] = "hi";
Note that the string literal is represented as the following array:
{ 'h', 'i', '\0' }
That is the array contains a string that is terminated by the zero character '\0' and the function strlen can successfully find the length of the stored string.
If you would write for example:
char ch[2] = "hi";
then in this case the array ch does not have a space to store the terminating zero of the string literal. In this case applying the function strlen to this array invokes undefined behavior.
A null byte (i.e. the value 0) is what defines the end of a string in C.
When you defined ch, you gave fewer initializers than there are elements in the array, so the remaining elements are set to 0. This results in a null terminated string.
The strlen function is basically looking for that value and counting how many elements it sees before it finds the null byte.
As far as I know '\0' helps compiler identify the end of string
Technically, it helps user code and the C runtime library identify the ends of strings. To the extent that the compiler needs to know where strings end, it knows without looking for a terminator.
but the output here suggests that strlen can detect the end on its own
That would be a misinterpretation. The actual fact is that your string is null-terminated even though you did not put a null terminator in it explicitly. This is a consequence of declaring your array with an initializer that specifies values for only some of the elements. As some of your other answers describe in more detail, that does not produce a partial initialization. Rather, elements for which the initializer does not specify values are default-initialized. For elements of type char, that means initialization with 0, which serves as a string terminator.
Moreover, if the array were without a terminator then the result of passing it to strlen() would be undefined. You could not then conclude anything from the result.
then why do we need '\0'?
So that user code and many standard library functions can recognize the ends of strings. You already know this.
But in many cases we do not need to provide terminators explicitly. In particular, we do not need to represent them in string literals (and it means something different than you probably intended if you do), and you don't need to represent them in the initializers for char arrays storing strings, provided that the array has more elements than you specify in the initializer.
Your array ch is zero-filled after the two explicit initializers, so the byte after 'i' is already set to zero. You can view it with a debugger or simply test it in the code. Trust me, strlen needs the zero to work.

String Initialization Declaration in C [duplicate]

I have a few questions regarding string initialization and declaration in C.
Suppose I declare a string 's' of size 10 using
char s[10];
Q 1. Is it necessary that all the elements of 's' will be initialized to '\0' or is it just pure luck that I will find other elements to be '\0'?
Q 2. If I instead use malloc to setup a string like this
char *s = malloc(10 * sizeof(char));
Again is it necessary that all the elements will be initialized to '\0'?
Q 3. Further do I need to add an '\0' while declaring the string or not?
char s[10] = "abc";
Or does it have to be
char s[10] = "abc\0";
NOTE: If possible, please take a look at the second answer by Kevin here.
Q1. No, not in general; in some contexts yes, though. Specifically, if the variable is a local variable and not static, then it is not initialized at all. If the variable is local and static, or if the variable is file scope and static, or if it is global, then it will be initialized to all bytes zero.
Q2. No. malloc() is not guaranteed to return zeroed memory. If you need it zeroed, use calloc() instead.
These comments apply to any type.
char s0[10]; // Initialized all bytes zero
static char s1[10]; // Initialized all bytes zero
void somefunc(void)
{
static char s2[10]; // Initialized all bytes zero
char s3[10]; // Not initialized to all bytes zero
char *s4 = malloc(10); // Not initialized to all bytes zero
char *s5 = calloc(10, 1); // Initialized all bytes zero
…code using s0..s5…
}
Q3. It is sufficient to use:
char s6[10] = "abc"; // 3 bytes non-zero plus 7 bytes zero
Writing this would achieve the same result because the size of the array is specified:
char s7[10] = "abc\0"; // 3 bytes non-zero plus 7 bytes zero
Writing these gives two arrays of different sizes:
char s8[] = "abc"; // sizeof(s8) == 4 – 1 null byte
char s9[] = "abc\0"; // sizeof(s9) == 5 – 2 null bytes
C automatically adds a trailing null byte.
First and foremost, your s is not a "string". Your s is a character array. The term string refers to the content of a character array. In order to qualify as a string that content must satisfy some requirements. A string is defined as a continuous sequence of characters terminated with a zero character.
Q1. If the array is declared with static storage duration it will begin its life with all zeros in it. In all other cases it will contain unpredictable garbage.
Q2. malloc does not initialize allocated memory. The memory contains unpredictable garbage. calloc allocates character array initialized with zeros.
Q3. What you have on the right-hand side of initialization is called string literal. String literal already includes a terminating zero character implicitly. There's no need to add it explicitly.
However, the C language follows an all-or-nothing approach to initialization. If you initialize just a small portion of some aggregate object, the rest of that object is implicitly initialized with zeros. In your case that means that the rest of array s will be filled with zeros anyway, all the way to the end. Consequently there's no difference between the end results of your two initialization examples. Still, there's no point in specifying that zero character explicitly.
If you declare the string using char s[10]; or malloc, the contents will not be initialized to \0 or anything. It will contain garbage values. So if you need \0 in your string, you need to explicitly store that.
Further, if you do something like
char s[10] = "abc";
then you don't need to add \0.
A note: If you use calloc instead of malloc to allocate memory, the contents will be initialized to 0.
Q1. If you don't explicitly initialize a local variable then it can contain any values. Often the bytes will just happen to contain zeroes.
But static variables (declared outside any function or prefixed with the static keyword) are guaranteed to be initialized to zeroes.
Q2. Again, malloc does not clear the memory, but it will often happen to be filled with zeroes. To explicitly get zero-filled memory use calloc().
Q3. You don't need to add \0 inside the double-quotes. The string "abc" means 4 bytes are created somewhere containing the 3 characters then a string-terminator (byte with value zero).

Something I don't get about C strings

A few questions regarding C strings:
both char* and char[] are pointers?
I've learned about pointers and I can tell that char* is a pointer, but why is it automatically a string and not just a char pointer that points to 1 char; why can it hold strings?
Why, unlike with other pointers, when you assign a new value to a char* pointer are you actually allocating new space in memory to store the new value, rather than just replacing the value stored at the memory address the pointer points to?
A pointer is not a string.
A string is the content of an object of type array of char with the property that it ends with the null character '\0' which, in turn, is an int constant (converted to char type) with the integer value 0. A string literal such as "abc" is additionally an object that must not be modified.
char* is a pointer, but char[] is not. The type char[] is not a "real" type, but an incomplete type. The C language is specified in such a way that, at the moment you define a concrete variable (object) of array of char type, the size of the array is determined one way or another. Thus, no variable has type char[], because this is not a complete type for an object.
However, in most expressions an object of type array of N char is automatically converted to char *, that is, a pointer to char pointing to the first element of the array.
On the other hand, this conversion is not always performed. For example, the sizeof operator gives different results for a char* than for an array of N chars. In the former case it gives the size of a pointer to char (which is in general the same for every pointer type), and in the latter case it gives the value N, that is, the size of the array.
The behaviour is different when you declare function parameters as char* and char[]. A parameter declared as char[] is adjusted to char*, so the function cannot know the size of the array and both declarations are equivalent.
Actually, you are right here: char * is a pointer to just 1 character object. However, it can be used to access strings, as I will now explain. In paragraph 1 I showed that strings are objects in memory of type array of N chars for some N. This value N is big enough to hold a terminating null character (as every "string" in C is supposed to have).
So, what's the deal here?
The key point for understanding this issue is the concept of an object (in memory).
When you have a string or, more generally, an array of char, this means that you have figured out some manner to hold an array object in memory.
This object determines a portion of RAM memory that you can access safely, because C has assigned enough memory for it.
Thus, when you point to the first byte of this object with a char* variable, actually you have guaranteed access to all the adjacent elements to the "right" of that memory place, because those places are well defined by C as having the bytes of the array above.
Briefly: the adjacent (to the right) bytes of the byte pointed by a char* variable can be accessed, they are valid places to access, so the pointer can be "iterated" to walk through these bytes, up to the end of the string, without "risks", since all the bytes in an array are contiguous well defined positions in memory.
This is a complicated question, but it reveals that you have not yet understood the relationship between pointers, arrays, and string literals in C.
A pointer is just a variable pointing to a position in memory.
A pointer to char points to just 1 object having type char.
If the adjacent bytes of the pointed position belong to an array of chars, they will be accessible through the pointer, so the pointer can "walk" over the memory bytes occupied by the array object.
A string literal is treated as an array of char object, to which an ending byte with value 0 (the null character) is implicitly added.
In any case, an array of T object has a well defined "size".
A string literal has an additional property: it must not be modified (its type is not const-qualified in C, but writing to it is undefined behavior).
Try to fit and gather these concepts in your mind to figure out what's going on.
And ask me for clarification.
ADDITIONAL REMARKS:
Consider the following piece of code:
#include <stdio.h>
int main(void)
{
char *s1 = "not modifiable";
char s2[] = "modifiable";
printf("%s ---- %s\n\n", s1, s2);
printf("Size of array s2: %d\n\n", (int)sizeof(s2));
s2[1] = '0', s2[3] = s2[5] = '1', s2[4] = '7',
s2[6] = '4', s2[7] = '8', s2[9] = '3';
printf("New value of s2: %s\n\n",s2);
//s1[0] = 'X'; // Attempting to modify s1
}
In the definition and initialization of s1 we have the string literal "not modifiable", which has constant content and constant address. Its address is assigned to the pointer s1 as initialization.
Any attempt to modify the bytes of the string is undefined behavior and will typically cause some kind of error, because the array content is often placed in read-only memory.
In the definition and initialization of s2, we have the string literal "modifiable", which has, again, constant content and constant address. However, what happens now is that, as part of the initialization, the content of the string is copied to the array of char s2. The size of the array s2 is not specified (the declaration char s2[] gives an incomplete type), but after initialization the size of the array is well determined and defined as the exact size of the copied string (plus 1 character used to hold the null character, or end-of-string mark).
So, the string literal "modifiable" is used to initialize the bytes of the array s2, which is modifiable.
The basic way to do that is by changing one character at a time.
For handier ways of modifying and assigning strings, use the functions declared in the standard header <string.h>.
char *s is a pointer, char s[] is an array of characters. Ex.
char *s = "hello";
char c[] = "world";
s = c; //Legal
c = "other"; //Illegal: an array cannot be assigned to
char *s is not a string; it points to an address. Ex
char c[] = "hello";
char *s = &c[3];
Assigning a pointer is not creating memory; you are pointing to memory. Ex.
char *s = "hello";
In this example when you type "hello" you are creating special memory to hold the string "hello" but that has nothing to do with the pointer, the pointer simply points to that spot.

pointers and strings in C, diverse issues

version 1
char *cad1="hell";
char *cad2="home";
int j;
cad2=cad1;
for (j=0;j<4;j++){
printf("%c",cad1[j]);
}
cad2[0]='a';
for (j=0;j<4;j++){
printf("%c",cad2[j]);
}
version 2
char cad1[]="hell";
char cad2[]="home";
int j;
cad2=cad1;
for (j=0;j<4;j++){
printf("%c",cad1[j]);
}
cad2[0]='a';
for (j=0;j<4;j++){
printf("%c",cad2[j]);
}
version 3
char cad1[]="hell";
char *cad2="home";
int j;
cad2=cad1;
for (j=0;j<4;j++){
printf("%c",cad1[j]);
}
cad2[0]='a';
for (j=0;j<4;j++){
printf("%c",cad2[j]);
}
My question is: why does version 1 hang Dev-C++, why does version 2 report an incompatible assignment at cad2=cad1, and why does version 3 work normally?
When you declare pointer like,
char *cad1="hell";
"hell" is a string literal and so may be stored in read-only memory; the compiler is free to choose where it goes.
But when you declare it as,
char cad2[]="hell";
"hell" is stored as the array's elements, i.e. it will be stored as
cad2[0] = 'h', cad2[1] = 'e', cad2[2] = 'l', cad2[3] = 'l', cad2[4] = '\0'
C doesn't guarantee any defined behavior for changing string literals. It may crash, hang, or corrupt other valid data. This is called undefined behavior.
Since you are writing through cad2, which (after cad2=cad1) points to a string literal, your application hangs.
In version 2, both cad1 and cad2 are of array type. Direct array assignment in C is illegal, so you get an error. Refer to this link for all the details, as mentioned by others.
To answer why version 3 works,
cad1 is an array and cad2 is a pointer here. With the statement cad2 = cad1 you make cad2 point to memory that can be modified (the size is still fixed). So changes made through cad1 and cad2 are the same, as they refer to the same modifiable memory.
In version 1, cad2 is equal to cad1 which points to the constant string "hell". Later on, you attempt to modify that constant string, which is unpredictable. Version 3, in contrast, has cad1 declared as a char array, so you get a non-constant copy of the string, so modifying it will work.
For version 2, both are arrays (not pointers), and C does not allow assigning one array to another, hence the error.
If cad is declared as char* cad="hell"; then that is a string literal (of length 4 plus 1 for a null terminator) and any attempt to modify a string literal is undefined behaviour. Anything could happen.
char cad[]="home"; will allocate 5 chars on the stack, cad[4] is '\0' - the null terminator; used by many string functions in C in modelling a set of chars as a string to mark the string end. You are free to modify these data although changing cad[4] will cause you trouble when using C string library functions as you will have removed their stopping condition.
Throughout your code you have cad2=cad1; Note that this does not copy the string, just the pointer; use strcpy in the C standard library to copy strings.
Really you should write const char* cad="hell";. Newer C++ compilers will insist on it.
why version 1 hangs the dev c++?
Read comments:
char *cad1="hell"; // pointer to constant string "hell"
char *cad2="home";
cad2=cad1; // now cad2 points to constant string "hell" too
cad2[0]='a'; // modifying a string literal causes undefined behaviour.
version 2 says there is an incompatible assignment in cad2=cad1?
Read comments:
char cad2[]="home"; // cad2 is array
cad2=cad1; // error because you can not assign to arrays in C.
why version 3 works normal?
Read comments:
char cad1[]="hell"; // cad1 is array
char *cad2="home";
cad2=cad1; // now cad2 points to the first element of array cad1
cad2[0]='a'; // you can modify arrays in C
Note that you can not assign to arrays but you can modify them by copying or assigning values to their elements.

String initialization with and without explicit trailing terminator

What is the difference between
char str1[32] = "\0";
and
char str2[32] = "";
Since you already declared the sizes, the two declarations are exactly equal. However, if you do not specify the sizes, you can see that the first declaration makes a larger string:
char a[] = "a\0";
char b[] = "a";
printf("%zu %zu\n", sizeof(a), sizeof(b));
prints
3 2
This is because a ends with two nulls (the explicit one and the implicit one) while b ends only with the implicit one.
Well, assuming the two cases are as follows (to avoid compiler errors):
char str1[32] = "\0";
char str2[32] = "";
As people have stated, str1 is initialized with two null characters:
char str1[32] = {'\0','\0'};
char str2[32] = {'\0'};
However, according to both the C and C++ standards, if part of an array is initialized, then the remaining elements of the array are default initialized. For a character array, the remaining characters are all zero initialized (i.e. null characters), so the arrays are really initialized as:
char str1[32] = {'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0'};
char str2[32] = {'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0'};
So, in the end, there really is no difference between the two.
As others have pointed out, "" implies one terminating '\0' character, so "\0" actually initializes the array with two null characters.
Some other answerers have implied that this is "the same", and for arrays with a declared size that is indeed the end result. The two initializers are not identical as written: "\0" supplies two characters and "" supplies one. But every element not covered by the initializer is implicitly initialized to zero rather than left uninitialized, so Str[1] is definitely zero in both cases and both arrays hold 32 zero bytes. A difference appears only when the size is omitted, where the two forms produce arrays of different lengths.
Unless I'm mistaken, the first explicitly initializes 2 chars to 0 (the '\0' and the terminator that's always there) and the last explicitly initializes only 1 char (the terminator); in both cases the rest of the array is implicitly set to zero as well.

Resources