Why would someone initialize unallocated memory in C?

Say I do initialize an array like this:
char a[]="test";
What's the purpose of this? We know that the content might immediately get changed, as it is not allocated, and thus why would someone initialize the array like this?

To clarify, this code is wrong for the reasons stated by the OP:
char* a;
strcpy(a, "test");
As noted by other responses, the syntax "char a[] = "test"" does not actually do this. The actual effect is more like this:
char a[5];
strcpy(a, "test");
The first statement allocates a fixed-size character array (on the stack, when it appears inside a function), and the second initializes the data in it. The size is determined from the length of the string literal, plus one for the terminating '\0'. Like all automatic variables, the array is deallocated automatically when execution leaves the enclosing scope.

The purpose of this is to allocate five bytes on the stack or the static data segment (depending on where this snippet occurs), then set those bytes to the array {'t','e','s','t','\0'}.

This syntax allocates an array of five characters on the stack, equivalent to this:
char a[5] = "test";
The elements of the array are initialized to the characters in the string given as an initializer. The size of the array is determined to fit the size of the initializer.

It is allocated. That code is equivalent to
char a[5]="test";
When you leave the number out, the compiler simply calculates the length of the character-array for you by counting the characters in the literal string. It then adds 1 to the length in order to include the necessary terminating nul '\0'. Hence, the length of the array is 5 while the length of the string is 4.

The array is allocated; its size is inferred from the string literal being used to initialize it (5 chars total).
Had you written
char *a = "test";
then all that would get allocated would be a pointer variable, not an array (the string literal "test" lives in memory such that it's allocated at program startup and held until the program exits).

Initializing strings using pointers

What is the difference between:
char arr[20]="I am a string"
and
char *arr="I am a string"
How is it possible to initialize an array just by using a pointer?
The first one is clear: it is an array initialisation. The second one means that the character pointer arr points to the unnamed static array that stores the string "I am a string".
One difference is in allocated storage size. The first declaration allocates 20 chars. The second allocates only a pointer; the string literal it points to occupies 14 chars (13 characters plus the terminating '\0').
The second difference is in how these variables are allocated.
In the first case you are initializing a stack-allocated array with the 14 chars taken from the buffer represented by the "I am a string" string literal; the remaining 6 elements are set to zero.
In the second case you are initializing a stack-allocated pointer with a pointer to a buffer with static storage duration represented by the "I am a string" string literal. Also notice that in the second case you should use const char *arr.

I want to know why this works without having to bind memory for the string

Hello guys I recently picked up C programming and I am stuck at understanding pointers. As far as I understand to store a value in a pointer you have to bind memory (using malloc) the size of the value you want to store. Given this, the following code should not work as I have not allocated 11 bytes of memory to store my string of size 11 bytes and yet for some reason beyond my comprehension it works perfectly fine.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
char *str = NULL;
str = "hello world\0";
printf("filename = %s\n", str);
return 0;
}
In this case
str = "hello world\0";
str points to the address of the first element of an array of chars, initialized with "hello world\0". In other words, str points to a "string literal".
By definition, the array is allocated and the address of the first element has to be "valid".
Quoting C11, chapter §6.4.5, String literals
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals.78) The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence. [....]
Memory allocation still happens, just not explicitly by you (via memory allocator functions).
That said, the "...\0" at the end is redundant: as mentioned in the first sentence of the quote above, the array is null-terminated by default.
Assigning a string literal to a char pointer without malloc means the pointer refers to the literal itself, which must be treated as read-only. In other words, you are creating a pointer to a string constant. "hello world\0" is stored somewhere, possibly in a read-only part of memory, and you are just pointing to it.
Now suppose you want to change the string, say changing the h to H with str[0]='H'. That is not allowed on the literal; you need writable storage holding a copy of the string, such as a char array or memory obtained from malloc.
When you declare a string literal in a C program, it is typically stored in a read-only section of the program image. A statement of the form char *str = "hello"; will assign the address of this string to the char* pointer. However, the string itself (i.e., the characters h, e, l, l and o, plus the \0 string terminator) is still located in that memory, so you must not try to change it.
Note that there's no need for you to explicitly add a zero byte terminator to your string declarations. The C compiler will do this for you.
Right. But in this case you are just pointing to a string literal, which is placed in the constant memory area. Your pointer is created in the stack area. So you are just pointing to another address, i.e., the starting address of the string literal.
Try copying the string literal into your pointer variable instead (for example with strcpy while str is still NULL). That fails, typically with a crash, because you have not allocated memory for the destination. Hope you understand now.
Storage for string literals is set aside at program startup and held until the program exits. This storage may be read-only, and attempting to modify the contents of a string literal results in undefined behavior (it may work, it may crash, it may do something in between).

String Initialization Declaration in C [duplicate]

I have a few questions regarding string initialization and declaration in C.
Suppose if a I declare a string 's' of size 10 using
char s[10];
Q 1. Is it necessary that all the elements of 's' will be initialized to '\0' or is it just pure luck that I will find other elements to be '\0'?
Q 2. If I instead use malloc to setup a string like this
char *s = malloc(10 * sizeof(char));
Again is it necessary that all the elements will be initialized to '\0'?
Q 3. Further do I need to add an '\0' while declaring the string or not?
char s[10] = "abc";
OR is it has to be
char s[10] = "abc\0";
NOTE: If possible, please take a look at the second answer by Kevin here.
Q1. No, not in general. In some contexts yes, though. Specifically, if the variable is a local variable and not static, then it is not initialized at all. If the variable is local and static, or if the variable is file scope and static, or if it is global, then it will be initialized to all bytes zero.
Q2. No. malloc() is not guaranteed to return zeroed memory. If you need it zeroed, use calloc() instead.
These comments apply to any type.
char s0[10]; // Initialized all bytes zero
static char s1[10]; // Initialized all bytes zero
void somefunc(void)
{
static char s2[10]; // Initialized all bytes zero
char s3[10]; // Not initialized to all bytes zero
char *s4 = malloc(10); // Not initialized to all bytes zero
char *s5 = calloc(10, 1); // Initialized all bytes zero
…code using s0..s5…
}
Q3. It is sufficient to use:
char s6[10] = "abc"; // 3 bytes non-zero plus 7 bytes zero
Writing this would achieve the same result because the size of the array is specified:
char s7[10] = "abc\0"; // 3 bytes non-zero plus 7 bytes zero
Writing these gives two arrays of different sizes:
char s8[] = "abc"; // sizeof(s8) == 4 – 1 null byte
char s9[] = "abc\0"; // sizeof(s9) == 5 – 2 null bytes
C automatically adds a trailing null byte.
First and foremost, your s is not a "string". Your s is a character array. The term string refers to the content of a character array. In order to qualify as a string that content must satisfy some requirements. A string is defined as a continuous sequence of characters terminated with a zero character.
Q1. If the array is declared with static storage duration it will begin its life with all zeros in it. In all other cases it will contain unpredictable garbage.
Q2. malloc does not initialize allocated memory. The memory contains unpredictable garbage. calloc allocates character array initialized with zeros.
Q3. What you have on the right-hand side of initialization is called string literal. String literal already includes a terminating zero character implicitly. There's no need to add it explicitly.
However, the C language follows an all-or-nothing approach to initialization. If you initialize just a small portion of some aggregate object, the rest of that object is implicitly initialized with zeros. In your case that means that the rest of array s will be filled with zeros all the way to the end anyway. Consequently there's no difference between the end results of your two initialization examples. Still, there's no point in specifying that zero character explicitly.
If you declare the string using char s[10]; as a non-static local variable, or allocate it with malloc, the contents will not be initialized to \0 or anything else. They will be garbage values. So if you need \0 in your string, you need to store it explicitly.
Further, if you do something like
char s[10] = "abc";
then you don't need to add \0.
A note: if you use calloc instead of malloc to allocate memory, the contents will be initialized to 0.
Q1. If you don't explicitly initialize a local variable then it can contain any values. Often the bytes will just happen to contain zeroes.
But static variables (declared outside any function or prefixed with the static keyword) are guaranteed to be initialized to zeroes.
Q2. Again, malloc does not clear the memory, but it will often happen to be filled with zeroes. To explicitly get zero-filled memory use calloc().
Q3. You don't need to add \0 inside the double-quotes. The string "abc" means 4 bytes are created somewhere containing the 3 characters then a string-terminator (byte with value zero).

Why can't I change char array later?

char myArray[6]="Hello"; //declaring and initializing char array
printf("\n%s", myArray); //prints Hello
myArray="World"; //Compiler says "Error: expression must be a modifiable lvalue"
Why can't I change myArray later? I did not declare it as const modifier.
When you write char myArray[6]="Hello"; you are allocating 6 chars on the stack (including a null-terminator).
Yes, you can change individual elements; e.g. myArray[4] = '\0' will transform your string to "Hell" (as far as the C library string functions are concerned), but you can't assign to the array itself: an array is not a modifiable lvalue, so it cannot appear on the left-hand side of an assignment.
Note that [const] char* myArray = "Hello"; is an entirely different beast: that string may be in read-only memory, and any change to it is undefined behaviour.
Array is a non modifiable lvalue. So you cannot modify it.
If you wish to modify the contents of the array, use strcpy.
Because the name of an array cannot be modified, just use strcpy:
strcpy(myArray, "World");
You can't assign to an array (except when initializing it in its declaration). Instead you have to copy to it. This you do using strcpy.
But be careful not to copy more than five characters to the array, as that's the longest string it can contain. And using strncpy in this case may be dangerous, as it does not add the terminating '\0' character if the source string is too long.
You can't assign strings to variables in C except in initializations. Use the strcpy() function to change values of string variables in C.
Well myArray is the name of the array which you cannot modify. It is illegal to assign a value to it.
Arrays in C are non-modifiable lvalues. There are no operations in C that can modify the array itself (only individual elements can be modifiable).
Well, myArray is of size 6, so care must be taken with strcpy(myArray,"World"): it would overflow if the source string, including its terminating '\0', needed more than the destination's size (6 in this case).
A possible and safe method would be
char *ptr = "Hello";
If you want to change
ptr = strdup("World");
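NOTE:
Make sure that you free(ptr) at the end, but only once it points to memory returned by strdup (never while it still points to the string literal); otherwise it would result in a memory leak.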
You cannot assign naked arrays in C. However you can assign pointers:
char const *myPtr = "Hello";
myPtr = "World";
Or you can assign to the elements of an array:
char myArray[6] = "Hello";
myArray[0] = 'W';
strcpy(myArray, "World");

C strings pointer vs. arrays [duplicate]

Why is:
char *ptr = "Hello!"
different than:
char ptr[] = "Hello!"
Specifically, I don't see why you can use (*ptr)++ to change the value of 'H' in the array, but not the pointer.
Thanks!
You can (in general) use the expression (*ptr)++ to change the value that ptr points to when ptr is a pointer and not an array (ie., if ptr is declared as char* ptr).
However, in your first example:
char *ptr = "Hello!"
ptr is pointing to a literal string, and literal strings are not permitted to be modified (they may actually be stored in memory area which are not writable, such as ROM or memory pages marked as read-only).
In your second example,
char ptr[] = "Hello!";
The array is declared and the initialization actually copies the data in the string literal into the allocated array memory. That array memory is modifiable, so (*ptr)++ works.
Note: for your second declaration, the ptr identifier itself is an array identifier, not a pointer, and it is not a modifiable lvalue, so it can't be assigned to or incremented (even though it converts readily to a pointer in most situations). For example, the expression ++ptr would be invalid. I think this is the point that some other answers are trying to make.
When pointing to a string literal, you should not declare the chars to be modifiable, and some compilers will warn you for this:
char *ptr = "Hello!" /* WRONG, missing const! */
The reason is as noted by others that string literals may be stored in an immutable part of the program's memory.
The correct "annotation" for you is to make sure you have a pointer to constant char:
const char *ptr = "Hello!"
And now you see directly that you can't modify the text stored at the pointer.
Arrays automatically allocate space and they can't be relocated or resized while pointers are explicitly assigned to point to allocated space and can be relocated.
Array names are read only!
If you use a string literal "Hello!", the literal itself becomes an array of 7 characters and gets stored somewhere in data memory. That memory may be read-only.
The statement
char *ptr = "Hello!";
defines a pointer to char and initializes it by storing in it the address of the beginning of the literal (that array of 7 characters mentioned earlier). Changing the contents of the memory pointed to by ptr is undefined behavior.
The statement
char ptr[] = "Hello!";
defines a char array (char ptr[7]) and initializes it, by copying characters from the literal to the array. The array can be modified.
In C, strings are arrays of characters.
A pointer is a variable that contains the memory location of another variable.
An array is a set of ordered data items.
When you use (*ptr)++ with the pointer version, you are getting a segmentation fault.
That is because you are adding 1 to the first character of the string literal itself (with the pointer), which may live in read-only memory, instead of adding 1 to the first character of your own writable copy (with the array).
