Say I do this:
const char *myvar = NULL;
Then later
*myval = “hello”;
And then again:
*myval = “world”;
I’d like to understand what happens to the memory where “hello” was stored?
I understand it is in the read-only stack space, but does that memory stay reserved for as long as the program runs, so that no other process can use it?
Thanks
Assuming you meant
myval = "world";
instead, then
I’d like to understand what happens to the memory where “hello” was stored?
Nothing.
You just modify the pointer itself to point to some other string literal.
And string literals in C programs are static, fixed (non-modifiable) arrays of characters, with a lifetime of the full program. The assignment really makes the pointer point to the first element of such an array.
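A minimal sketch of the point above, assuming a hypothetical helper named reassign_demo (it is not part of the question): reassigning the pointer only changes where it points, so the array for "hello" is untouched and still reachable through any other pointer to it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the "hello" array is still intact
   after the pointer that referred to it has been reassigned. */
int reassign_demo(void)
{
    const char *myval = NULL;
    const char *saved;

    myval = "hello";   /* myval points to the first char of the "hello" array */
    saved = myval;     /* keep a second pointer to the same array */
    myval = "world";   /* only the pointer changes; nothing is freed or overwritten */

    printf("myval = %s, saved = %s\n", myval, saved);
    return strcmp(saved, "hello") == 0 && strcmp(myval, "world") == 0;
}
```

Calling reassign_demo() from main should print both strings, since both arrays exist for the whole run of the program.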
String literals have static storage duration. They are usually placed by the compiler in a string pool. So string literals are created before the program runs.
In these statements
myval = "hello";
myval = "world";
the pointer myval is assigned, in turn, the addresses of the first characters of these two string literals.
Note that you may not change string literals. Whether equal string literals are stored as one character array or as distinct character arrays with static storage duration depends on compiler options.
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
Related
I understand that char pointers initialized to string literals are stored in read-only memory. But what about arrays of string literals?
In the array:
int main(void) {
char *str[] = {"hello", "world"};
}
Are "hello" and "world" stored as string literals in read-only memory? Or on the stack?
Technically, a string literal is a quoted string in source code. Colloquially, people use “string literal” to refer to the array of characters created for a string literal. Often we can overlook this informality, but, when asking about storage, we should be clear.
The array created for a string literal has static storage duration, meaning it exists (notionally, in the abstract computer the C standard uses as a model of computing) for the entire execution of the program. Because the behavior of attempting to modify the elements of this array is not defined by the C standard, the C implementation may treat them as constants and may place them in read-only memory. It is not required to do so by the C standard, but this is common practice in C implementations for general-purpose multi-user operating systems.
In the code you show, string literals are used as initializers for an array of pointers. In this use, the array of each string literal is converted to a pointer to its first element, and that address is used as the initial value for the corresponding element of the array of pointers.
The array of the string literal is the same as for any string literal; the C implementation may place it in read-only memory, and common practice is to do so.
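As a rough illustration of the difference, here is a sketch assuming a hypothetical helper named array_demo. An array of pointers refers to the literals' own arrays, while an array of char arrays only uses the literals as initial values and owns modifiable copies.

```c
#include <string.h>

/* Hypothetical helper illustrating the two kinds of initialization. */
int array_demo(void)
{
    /* Array of pointers: each element points to a string literal's array,
       which must not be modified. */
    const char *ptrs[] = {"hello", "world"};

    /* Array of arrays: the literals only supply initial values; the
       characters are copied into storage owned by 'copies'. */
    char copies[][6] = {"hello", "world"};

    copies[0][0] = 'H';   /* fine: modifies the local copy */

    return strcmp(ptrs[0], "hello") == 0 && strcmp(copies[0], "Hello") == 0;
}
```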
Here is what the C17 standard says:
String literals [...] It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined. (6.4.5 p 7)
Like string literals, const-qualified compound literals can be placed into read-only memory and can even be shared. (6.5.2.5 p 13).
Hello guys I recently picked up C programming and I am stuck at understanding pointers. As far as I understand to store a value in a pointer you have to bind memory (using malloc) the size of the value you want to store. Given this, the following code should not work as I have not allocated 11 bytes of memory to store my string of size 11 bytes and yet for some reason beyond my comprehension it works perfectly fine.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
char *str = NULL;
str = "hello world\0";
printf("filename = %s\n", str);
return 0;
}
In this case
str = "hello world\0";
str points to the address of the first element of an array of chars, initialized with "hello world\0". In other words, str points to a "string literal".
By definition, the array is allocated and the address of the first element has to be "valid".
Quoting C11, chapter §6.4.5, String literals
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals.78) The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence. [....]
Memory allocation still happens, just not explicitly by you (via memory allocator functions).
That said, the "...\0" at the end is redundant: as mentioned in the first sentence of the quote above, the array is null-terminated by default.
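The effect of the extra "\0" can be checked with sizeof. This is only a sketch, and the three helper names are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Sizes of the arrays created for the two literals. */
size_t size_plain(void)    { return sizeof "hello world"; }     /* 11 chars + implicit '\0' */
size_t size_explicit(void) { return sizeof "hello world\0"; }   /* 11 chars + explicit '\0' + implicit '\0' */

/* strlen stops at the first '\0', so the extra terminator is never seen. */
size_t len_explicit(void)  { return strlen("hello world\0"); }
```

The explicit terminator only makes the array one byte larger; functions that work on strings behave the same either way.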
Assigning a string literal to a char pointer without malloc means that the string you are pointing to must be treated as read-only. You are creating a pointer to a string constant: "hello world\0" is somewhere in the (typically read-only) part of memory and you are just pointing to it.
Now suppose you want to make changes to the string, say changing the h to H with str[0]='H'. Through a pointer to a string literal that is not allowed (the behavior is undefined); you need modifiable storage, such as a buffer obtained from malloc or a char array, and must copy the string into it.
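One way to get a modifiable string is to copy the literal into memory you own. The sketch below assumes a hypothetical helper named copy_with_first_char; a plain char array initialized from the literal would work just as well.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, modifiable copy of 'src'
   with its first character replaced. The caller must free() the result.
   Returns NULL if allocation fails. */
char *copy_with_first_char(const char *src, char first)
{
    size_t n = strlen(src) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);

    if (copy == NULL)
        return NULL;
    memcpy(copy, src, n);
    if (n > 1)
        copy[0] = first;          /* allowed: this memory belongs to us */
    return copy;
}
```

For example, copy_with_first_char("hello world", 'H') yields a buffer holding "Hello world", while the literal itself is never written to.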
When you write a string literal in a C program, it is typically stored in a read-only section of the program image. A statement of the form char *str = "hello"; will assign the address of this string to the char* pointer. However, the string itself (i.e., the characters h, e, l, l and o, plus the \0 string terminator) is still located in read-only memory, so you must not change it.
Note that there's no need for you to explicitly add a zero byte terminator to your string declarations. The C compiler will do this for you.
Right. But in this case you are just pointing to a string literal, which is placed in the constant memory area. Your pointer is created in the stack area, so you are just making it point to another address, i.e., the starting address of the string literal.
Try copying the string literal into the memory your pointer refers to (for example with strcpy). That will go wrong (typically with a crash), because you have not allocated any memory for the copy. Hope you understand now.
Storage for string literals is set aside at program startup and held until the program exits. This storage may be read-only, and attempting to modify the contents of a string literal results in undefined behavior (it may work, it may crash, it may do something in between).
I know that a string literal used in a program gets storage in a read-only area, e.g.
//global
const char *s="Hello World \n";
Here the string literal "Hello World\n" gets storage in a read-only area of the program.
Now suppose I declare some literal in body of some function like
void func1(char *name)
{
const char *s="Hello World\n";
}
As local variables to function are stored on activation record of that function, is this the
same case for string literals also? Again assume I call func1 from some function func2 as
void func2(void)
{
//code
char *s="Mary\n";
//call1
func1(s);
//call2
func1("Charles");
//code
}
Here, in the 1st call of func1 from func2, the starting address of 's' is passed, i.e. the address of s[0], while in the 2nd call I am not sure what actually happens. Where does the string literal "Charles" get storage? Is some temporary created by the compiler and its address passed, or does something else happen?
I found literals get storage in "read-only-data" section from
String literals: Where do they go?
but I am unclear about whether that happens only for global literals or also for literals local to some function. Any insight will be appreciated. Thank you.
A C string literal represents an array object of type char[len+1], where len is the length, plus 1 for the terminating '\0'. This array object has static storage duration, meaning that it exists for the entire execution of the program. This applies regardless of where the string literal appears.
The literal itself is an expression of type char[len+1]. (In most but not all contexts, it will be implicitly converted to a char* value pointing to the first character.)
Compilers may optimize this by, for example, storing identical string literals just once, or by not storing them at all if they're never referenced.
If you write this:
const char *s="Hello World\n";
inside a function, the literal's meaning is as I described above. The pointer object s is initialized to point to the first character of the array object.
For historical reasons, string literals are not const in C, but attempting to modify the corresponding array object has undefined behavior. Declaring the pointer const, as you've done here, is not required, but it's an excellent idea.
Where string literals (or rather, the character arrays they are compiled to) are located in memory is an implementation detail in the compiler, so if you're thinking about what the C standard guarantees, they could be in a number of places, and string literals used in different ways in the program could end up in different places.
But in practice most compilers will treat all string literals the same, and they will probably all end up in a read-only segment. So string literals used as function arguments, or used inside functions, will be stored in the same place as the "global" ones.
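To illustrate that a literal used as a function argument is not a temporary, here is a sketch with hypothetical names (remember, remember_demo, last_name). The pointer received by the callee stays valid after both functions have returned.

```c
#include <stddef.h>

static const char *last_name = NULL;   /* outlives any single call */

/* Hypothetical variant of func1 that remembers its argument. */
void remember(const char *name)
{
    last_name = name;
}

/* Hypothetical caller, standing in for func2. */
const char *remember_demo(void)
{
    remember("Charles");   /* no temporary: the address of the static array is passed */
    return last_name;      /* still valid here and after this function returns */
}
```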
Wouldn't the pointer returned by the following function be inaccessible?
char *foo(int rc)
{
switch (rc)
{
case 1:
return("one");
case 2:
return("two");
default:
return("whatever");
}
}
So the lifetime of a local variable in C/C++ is practically only within the function, right? Which means, after char* foo(int) terminates, the pointer it returns no longer means anything, right?
I'm a bit confused about the lifetime of a local variable. What is a good clarification?
Yes, the lifetime of a local variable is within the scope({,}) in which it is created.
Local variables have automatic or local storage. Automatic because they are automatically destroyed once the scope within which they are created ends.
However, what you have here is a string literal, which is typically allocated in implementation-defined read-only memory. String literals are different from local variables: they remain alive throughout the program's lifetime, because they have static storage duration [Ref 1].
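The contrast can be sketched as follows. The name number_name is a hypothetical stand-in for foo with a const return type, and the broken variant is shown only inside a comment because using its result would be undefined behavior.

```c
/* Valid: the returned pointer refers to an array with static storage duration. */
const char *number_name(int rc)
{
    switch (rc) {
    case 1:  return "one";
    case 2:  return "two";
    default: return "whatever";
    }
}

/* Invalid, shown only as a comment: 'buf' is an automatic array, so the
   returned pointer would dangle as soon as the function returns.

   const char *broken_name(void)
   {
       char buf[] = "one";
       return buf;
   }
*/
```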
A word of caution!
However, note that any attempt to modify the contents of a string literal is an undefined behavior (UB). User programs are not allowed to modify the contents of a string literal.
Hence, it is always encouraged to use const when declaring a pointer to a string literal.
const char*p = "string";
instead of,
char*p = "string";
In fact, in C++ initializing a non-const char* from a string literal is deprecated (and no longer allowed since C++11), though C still permits it. Declaring the pointer as const char* has the advantage that compilers will usually warn you if you attempt to modify the string literal through it, which they will not do for a plain char*.
Sample program:
#include<string.h>
int main()
{
char *str1 = "string Literal";
const char *str2 = "string Literal";
char source[]="Sample string";
strcpy(str1,source); // No warning or error just Undefined Behavior
strcpy(str2,source); // Compiler issues a warning
return 0;
}
Output:
cc1: warnings being treated as errors
prog.c: In function ‘main’:
prog.c:9: error: passing argument 1 of ‘strcpy’ discards qualifiers from pointer target type
Notice the compiler warns for the second case, but not for the first.
To answer the question being asked by a couple of users here:
What is the deal with integral literals?
In other words, is the following code valid?
int *foo()
{
return &(2);
}
The answer is, no this code is not valid. It is ill-formed and will give a compiler error.
Something like:
prog.c:3: error: lvalue required as unary ‘&’ operand
String literals are l-values, i.e., you can take the address of a string literal, but you cannot change its contents.
However, any other literals (int, float, char, etc.) are r-values (the C standard uses the term the value of an expression for these) and their address cannot be taken at all.
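A small sketch of taking the address of a string literal in C, using a hypothetical helper named literal_array_size. The result of & is a pointer to the whole array, not to its first character.

```c
#include <stddef.h>

/* Hypothetical helper: takes the address of a string literal. */
size_t literal_array_size(void)
{
    /* In C, "abc" has type char[4], so &"abc" has type char (*)[4]. */
    char (*p)[4] = &"abc";

    return sizeof *p;   /* 3 characters + the terminating '\0' */
}
```

The array is only read here; writing through p would still be undefined behavior.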
[Ref 1]C99 standard 6.4.5/5 "String Literals - Semantics":
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
It's valid. String literals have static storage duration, so the pointer is not dangling.
For C, that is mandated in section 6.4.5, paragraph 6:
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.
And for C++ in section 2.14.5, paragraphs 8-11:
8 Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).
9 A string literal that begins with u, such as u"asdf", is a char16_t string literal. A char16_t string literal has type “array of n const char16_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters. A single c-char may produce more than one char16_t character in the form of surrogate pairs.
10 A string literal that begins with U, such as U"asdf", is a char32_t string literal. A char32_t string literal has type “array of n const char32_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters.
11 A string literal that begins with L, such as L"asdf", is a wide string literal. A wide string literal has type “array of n const wchar_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters.
String literals are valid for the whole program (and are not allocated on the stack), so it will be valid.
Also, string literals are read-only, so (for good style) maybe you should change foo to const char *foo(int)
Yes, it is valid code, see case 1 below. You can safely return C strings from a function in at least these ways:
1. const char* to a string literal. It can't be modified and must not be freed by the caller. It is rarely useful for the purpose of returning a default value, because of the freeing problem described below. It might make sense if you actually need to pass a function pointer somewhere, so you need a function returning a string.
2. char* or const char* to a static char buffer. It must not be freed by the caller. It can be modified (either by the caller if not const, or by the function returning it), but a function returning this can't (easily) have multiple buffers, so it is not (easily) threadsafe, and the caller may need to copy the returned value before calling the function again.
3. char* to a buffer allocated with malloc. It can be modified, but it must usually be explicitly freed by the caller and has the heap allocation overhead. strdup is of this type.
4. const char* or char* to a buffer which was passed as an argument to the function (the returned pointer does not need to point to the first element of the argument buffer). It leaves responsibility for buffer/memory management to the caller. Many standard string functions are of this type.
One problem is, mixing these in one function can get complicated. The caller needs to know how it should handle the returned pointer, how long it is valid, and if caller should free it, and there's no (nice) way of determining that at runtime. So you can't, for example, have a function, which sometimes returns a pointer to a heap-allocated buffer which caller needs to free, and sometimes a pointer to a default value from string literal, which caller must not free.
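The four approaches listed above might look roughly like this. All function names, the "item-%d" format, and the 32-byte buffer sizes are assumptions made for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pointer to a string literal: never freed, never modified. */
const char *default_name(void)
{
    return "unknown";
}

/* Pointer to a static buffer: overwritten by the next call, not thread-safe. */
const char *static_name(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Pointer to a malloc'd buffer: the caller must free() it. May return NULL. */
char *heap_name(int id)
{
    char *buf = malloc(32);

    if (buf != NULL)
        snprintf(buf, 32, "item-%d", id);
    return buf;
}

/* Pointer into a caller-supplied buffer: the caller owns the memory. */
char *fill_name(char *dst, size_t size, int id)
{
    snprintf(dst, size, "item-%d", id);
    return dst;
}
```

Each function sticks to exactly one ownership rule, so a caller can tell from the function alone whether free() is required.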
Good question. In general, you would be right, but your example is the exception. The compiler statically allocates global memory for a string literal. Therefore, the address returned by your function is valid.
That this is so is a rather convenient feature of C, isn't it? It allows a function to return a precomposed message without forcing the programmer to worry about the memory in which the message is stored.
See also @asaelr's correct observation re const.
Local variables are only valid within the scope they're declared, however you don't declare any local variables in that function.
It's perfectly valid to return a pointer to a string literal from a function, as a string literal exists throughout the entire execution of the program, just as a static or a global variable would.
If you're worried that what you're doing might be invalid or undefined, you should turn up your compiler warnings to see if there is in fact anything you're doing wrong.
str will never be a dangling pointer, because it points to a static address where the string literal resides.
That storage is usually read-only and exists for as long as the program is loaded.
If you try to free or modify it, the behavior is undefined; on platforms with memory protection this typically results in a segmentation fault.
A local variable is allocated on the stack. After the function finishes, the variable goes out of scope and is no longer accessible in the code. However, if you have a global (or simply - not yet out of scope) pointer that you assigned to point to that variable, it will point to the place in the stack where that variable was. It could be a value used by another function, or a meaningless value.
In the example you show, you are returning pointers to string literals to the calling function, so what is returned is not a pointer to a local variable. Moreover, the memory for those string literals is allocated in the global (static) data segment, not on the stack.
I'm a little bit confused about this expression:
char *s = "abc";
Does the string literal get created on the stack?
I know that this expression
char *s = (char *)malloc(10 * sizeof(char));
allocates memory on the heap and this expression
char s[] = "abc";
allocates memory on the stack, but I'm totally unsure what the first expression does.
Typically, the string literal "abc" is stored in a read only part of the executable. The pointer s would be created on the stack(or placed in a register, or just optimized away) - and point to that string literal which lives "elsewhere".
"abc"
String literals are stored in the __TEXT,__cstring section of your program (or .rodata, or whatever the object format uses), if string pooling is enabled. That means the literal is neither on the stack nor in the heap, but sits in a read-only memory region near your code.
char *s = "abc";
This statement assigns the memory location of the string literal "abc" to s, i.e. s points to a read-only memory region.
"Stacks" and "heaps" are implementation details and depend on the platform (all the world is not x86). From the language POV, what matters is storage class and extent.
String literals have static extent; storage for them is allocated at program startup and held until the program terminates. It is also assumed that string literals cannot be modified (attempting to do so invokes undefined behavior). Contrast this with local, block-scope (auto) variables, whose storage is allocated on block entry and released on block exit. Typically, this means that string literals are not stored in the same memory as block-scope variables.
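The difference in storage duration can be observed without knowing anything about stacks or heaps. In this sketch the names literal_address and local_array_is_fresh are hypothetical.

```c
/* Returns the address of the array created for the literal "abc". */
const char *literal_address(void)
{
    const char *s = "abc";   /* s is automatic; the array it points to is static */

    return s;                /* still valid after the function returns */
}

/* Returns 1 if the local array starts out as "abc" on this call. */
int local_array_is_fresh(void)
{
    char s[] = "abc";        /* automatic array, re-initialized on every entry */
    int fresh = (s[0] == 'a');

    s[0] = 'X';              /* modifies only this call's copy */
    return fresh;
}
```

Every call to literal_address returns the same address, while every call to local_array_is_fresh starts from a freshly initialized array.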