Stack memory fundamentals - c

Consider this code:
char* foo(int myNum) {
char* StrArray[5] = {"TEST","ABC","XYZ","AA","BB"};
return StrArray[4];
}
When I return StrArray[4] to the caller, is this supposed to work?
Since the array is defined on the stack, when the caller gets the pointer, that part of memory has gone out of scope. Or will this code work?

This code will work. You are returning the value of the pointer in StrArray[4], which points to a constant string "BB". Constant strings have a lifetime equal to that of your entire program.
The important thing is the lifetime of what the pointer points to, not where the pointer is stored. For example, the following similar code would not work:
char* foo(int myNum) {
char bb[3] = "BB";
char* StrArray[5] = {"TEST","ABC","XYZ","AA",bb};
return StrArray[4];
}
This is because the bb array is a temporary value on the stack of the foo() function, and disappears when you return.

Beware: you're lying to the compiler.
Each element of StrArray points to a read-only char *;
You're telling the compiler the return value of your function is a modifiable char *.
Lie to the compiler and it will get its revenge sooner or later.
In C, the way to declare a pointer to read-only data is to qualify it with const.
I'd write your code as:
const char* foo(int myNum) {
const char* StrArray[5] = {"TEST","ABC","XYZ","AA","BB"};
return StrArray[4];
}

The code will work. The point you are returning (StrArray[4]) points to a string literal "BB". String literals in C are anonymous array objects with static storage duration, which means that they live as long as your program lives (i.e forever). It doesn't matter where you create that sting literal. Even if it is introduced inside a function, it still has static storage duration.
Just remember, that string literals are not modifiable, so it is better to use const char* pointers with string literals.

C uses arrays with indicies beginning at 0. So the first element is StrArray[0]
Thus StrArray[5] was not declared in your code.
C will allow you to write code to return StrArray[5] but wht happens is undefined and will differ on OS and compiler but often will crash the program.

Only the array of pointers is on the stack, not the string-literal-constants.

Related

Returning array declared in function returns address of local variable in C?

I realize that variables declared in a function call are pushed onto the stack and once the function reaches its end the local variables declared onto the stack are popped off and go into lala land.
The thing I don't undertstand is that if I declare a pointer in a function I can return the pointer with no compiler complaint, whereas with an array it does give me a warning.
Here is an example of what I'm referring to:
char * getStringArray();
char * getStringPointer();
int main(){
char * temp1 = getStringPointer();
char * temp2 = getStringArray();
return 0;
}
char * getStringPointer(){
char * retString = "Fred";
return retString;
}
char * getStringArray(){
char retString[5] = {'F', 'r','e','d','\0'};
return retString;
}
During compilation it throws a "returning address of local variable" warning about getStringArray(). What confuses me is that I've been told that referencing an array solely by its name(like retString, no[])refers to its address in memory, acting like a pointer.
Arrays can be handy and there are many times I would like to use an array in a function and return it, is there any quick way around this?
As I referred to in a comment, will this work? I'm guessing malloc allocates to heap but so its fine. I'm still a little unclear of the difference between static data and data on the heap.
char * getStringPointer(){
char * retString = (char *)malloc(sizeof(char)*4+1);
*(retString+0) = 'F';
*(retString+1) = 'r';
*(retString+2) = 'e';
*(retString+3) = 'd';
*(retString+4) = '\0';
return retString;
}
getStringPointer is allocating a pointer on the stack, then returning that pointer, which points to the string 'Fred\0' somewhere in your executable (not on the stack).
getStringArray allocates space for 5 chars on the stack, assigns them 'F' 'r' 'e' 'd' and '\0', then returns a pointer to the address of 'F', which is on the stack (and therefore invalid after the function returns).
There are two ways round it: either you can malloc some space for your array on the heap, or you can make a struct that contains an appropriately sized array and return that. You can return numerical, pointer and struct types from functions, but not arrays.
"Fred" is an unnamed global, so it's not going away, and returning a pointer to it is safe. But getStringArray() is de-allocating everything that pointer points to. That's the difference.
The array is unambiguously a pointer to stack-local memory so the compiler is certain that's what you are leaking, so that's why you get the warning.
Just make the array static and both the warning and the lifetime restriction will vanish.
Now, the pointer could be pointing to something static or in the heap (in your case it actually was static) and so it's not obvious from looking at just that one expression whether you are leaking something stack-local or not. That's why you don't get the warning when you return a pointer, regardless of whether it's a safe return or not. Again, the static storage class will save you. Your string constant happens to be static, so that works fine.
By adding a static modifier to the variable retString and returning back a pointer will resolve the issue.
char * getStringArray(){
static char retString[5] = {'F', 'r','e','d','\0'};
return retString;
}
That will return the appropriate value.
Notice how its still treated as a pointer, because the first element of the array is decayed into a pointer of type T, in this case, char. This will make the compiler happy as it matches the prototype declaration of the function itself as defined at the start of the source.
See ideone sample

Why can't we assign int* x=12 or int* x= "12" when we can assign char* x= "hello"?

What is the correct way to use int* x?
Mention any related link if possible as I was unable to find one.
Because the literal "hello" evaluates to a pointer to constant memory initialised with the string "hello" (and a nul terminator), i.e. the value you get is of char* type.
If you want a pointer to number 12 then you'll need to store the value 12 somewhere, e.g. in another int, and then take a pointer to that:
int x_value = 12;
int* x = &x_value;
However in this case you're putting the 12 on the stack, and so that pointer will become invalid once you leave this function.
You can at a pinch abuse that mechanism to make yourself a pointer to 12; depending on endianness that would probably be
int* x = (int*)("\x0c\x00\x00");
Note that this is making assumptions about your host's endianness and size of int, and that you would not be able to modify that 12 either (but you can change x to point to something else), so this is a bad idea in general.
Because the compiler creates a static (constant) string "hello" and lets x point to that, where it doesn't create a static (constant) int.
A string literal creates an array object. This object has static storage duration (meaning it exists for the entire execution of the program), and is initialized with the characters in the string literal.
The value of a string literal is the value of the array. In most contexts, there is an implicit conversion from char[N] to char*, so you get a pointer to the initial (0th) element of the array. So this:
char *s = "hello";
initializes s to point to the initial 'h' in the implicitly created array object. A pointer can only point to an object; it does not point to a value. (Incidentally, that really should be const char *s, so you don't accidentally attempt to modify the string.)
String literals are a special case. An integer literal does not create an object; it merely yields a value. This:
int *ptr = 42; // INVALID
is invalid, because there is no implicit conversion of 42 from int* to int. This:
int *ptr = &42; // INVALID
is also invalid, because the & (address-of) operator can only be applied to an object (an "lvalue"), and there is no object for it to apply to.
There are several ways around this; which one you should use depends on what you're trying to do. You can allocate an object:
int *ptr = malloc(sizeof *ptr); // allocation an int object
if (ptr == NULL) { /* handle the error */ }
but a heap allocation can always fail, and you need to deallocate it when you're finished with it to avoid a memory leak. You can just declare an object:
int obj = 42;
int *ptr = &obj;
You just have to be careful with the object's lifetime. If obj is a local variable, you can end up with a dangling pointer. Or, in C99 and later, you can use a compound literal:
int *ptr = &(int){42};
(int){42} is a compound literal, which is similar in some ways to a string literal. In particular, it does create an object, and you can take that object's address.
But unlike with string literals, the lifetime of the (anonymous) object created by a compound literal depends on the context in which it appears. If it's inside a function definition, the lifetime is automatic, meaning that it ceases to exist when you leave the block containing it -- just like an ordinary local variable.
That answers the question in your title. The body of your question:
What is the correct way to use int* x?
is much more general, and it's not a question we can answer here. There are a multitude of ways to use pointers correctly -- and even more ways to use them incorrectly. Get a good book or tutorial on C and read the section that discusses pointers. Unfortunately there are also a lot of bad books and tutorials. Question 18.10 of the comp.lang.c FAQ is a good starting point. (Bad tutorials can often be identified by the casual use of void main(), and by the false assertion that arrays are really pointers.)
Q1. Why can't we assign int *x=12? You can provided that 12 is a valid memory address which holds an int. But with a modern OS specifying a hard memory address is completely wrong (perhaps except embedded code). The usage is typically like this
int y = 42; // simple var
int *x = &y; // address-of: x is pointer to y
*x = 12; // write a new value to y
This looks the same as what you asked, but it is not, because your original declaration assigns the value 12 to x the pointer itself, not to *x its target.
Q2. Why can't we assign int *x = "12"? Because you are trying to assign an incompatible type - a char pointer to int pointer. "12" is a string literal which is accessed via a pointer.
Q3. But we can assign char* x= "hello"
Putting Q1 and Q2 together, "hello" generates a pointer which is assigned to the correct type char*.
Here is how it is done properly:
#include <stdio.h>
#include <stdlib.h>
int main() {
int *x;
x = malloc(sizeof(int));
*x = 8;
printf("%d \n", *x);
}

where does string literal passed to function call gets stored in c

I know that string literal used in program gets storage in read only area for eg.
//global
const char *s="Hello World \n";
Here string literal "Hello World\n" gets storage in read only area of program .
Now suppose I declare some literal in body of some function like
func1(char *name)
{
const char *s="Hello World\n";
}
As local variables to function are stored on activation record of that function, is this the
same case for string literals also? Again assume I call func1 from some function func2 as
func2()
{
//code
char *s="Mary\n";
//call1
func1(s);
//call2
func1("Charles");
//code
}
Here above,in 1st call of func1 from func2, starting address of 's' is passed i.e. address of s[0], while in 2nd call I am not sure what does actually happens. Where does string literal "Charles" get storage. Whether some temperory is created by compiler and it's address is passed or something else happens?
I found literals get storage in "read-only-data" section from
String literals: Where do they go?
but I am unclear about whether that happens only for global literals or for literals local to some function also. Any insight will be appreciable. Thank you.
A C string literal represents an array object of type char[len+1], where len is the length, plus 1 for the terminating '\0'. This array object has static storage duration, meaning that it exists for the entire execution of the program. This applies regardless of where the string literal appears.
The literal itself is an expression type char[len+1]. (In most but not all contexts, it will be implicitly converted to a char* value pointing to the first character.)
Compilers may optimize this by, for example, storing identical string literals just once, or by not storing them at all if they're never referenced.
If you write this:
const char *s="Hello World\n";
inside a function, the literal's meaning is as I described above. The pointer object s is initialized to point to the first character of the array object.
For historical reasons, string literals are not const in C, but attempting to modify the corresponding array object has undefined behavior. Declaring the pointer const, as you've done here, is not required, but it's an excellent idea.
Where string literals (or rather, the character arrays they are compiled to) are located in memory is an implementation detail in the compiler, so if you're thinking about what the C standard guarantees, they could be in a number of places, and string literals used in different ways in the program could end up in different places.
But in practice most compilers will treat all string literals the same, and they will probably all end up in a read-only segment. So string literals used as function arguments, or used inside functions, will be stored in the same place as the "global" ones.

Is char pointer address initialization necessary in C?

I'm learning C programming in a self-taught fashion. I know that numeric pointer addresses must always be initialized, either statically or dynamically.
However, I haven't read about the compulsory need of initializing char pointer addresses yet.
For example, would this code be correct, or is a pointer address initialization needed?
char *p_message;
*p_message = "Pointer";
I'm not entirely sure what you mean by "numeric pointer" as opposed to "char pointer". In C, a char is an integer type, so it is an arithmetic type. In any case, initialization is not required for a pointer, regardless of whether or not it's a pointer to char.
Your code has the mistake of using *p_message instead of p_message to set the value of the pointer:
*p_message = "Pointer" // Error!
This wrong because given that p_message is a pointer to char, *p_message should be a char, not an entire string. But as far as the need for initializing a char pointer when first declared, it's not a requirement. So this would be fine:
char *p_message;
p_message = "Pointer";
I'm guessing part of your confusion comes from the fact that this would not be legal:
char *p_message;
*p_message = 'A';
But then, that has nothing to do with whether or not the pointer was initialized correctly. Even as an initialization, this would fail:
char *p_message = 'A';
It is wrong for the same reason that int *a = 5; is wrong. So why is that wrong? Why does this work:
char *p_message;
p_message = "Pointer";
but this fail?
char *p_message;
*p_message = 'A';
It's because there is no memory allocated for the 'A'. When you have p_message = "Pointer", you are assigning p_message the address of the first character 'P' of the string literal "Pointer". String literals live in a different memory segment, they are considered immutable, and the memory for them doesn't need to be specifically allocated on the stack or the heap.
But chars, like ints, need to be allocated either on the stack or the heap. Either you need to declare a char variable so that there is memory on the stack:
char myChar;
char *pChar;
pChar = &myChar;
*pChar = 'A';
Or you need to allocate memory dynamically on the heap:
char* pChar;
pChar = malloc (1); // or pChar = malloc (sizeof (char)), but sizeof(char) is always 1
*pChar = 'A';
So in one sense char pointers are different from int or double pointers, in that they can be used to point to string literals, for which you don't have to allocate memory on the stack (statically) or heap (dynamically). I think this might have been your actual question, having to do with memory allocation rather than initialization.
If you are really asking about initialization and not memory allocation: A pointer variable is no different from any other variable with regard to initialization. Just as an uninitialized int variable will have some garbage value before it is initialized, a pointer too will have some garbage value before it is initialized. As you know, you can declare a variable:
double someVal; // no initialization, will contain garbage value
and later in the code have an assignment that sets its value:
someVal = 3.14;
Similarly, with a pointer variable, you can have something like this:
int ary [] = { 1, 2, 3, 4, 5 };
int *ptr; // no initialization, will contain garbage value
ptr = ary;
Here, ptr is not initialized to anything, but is later assigned the address of the first element of the array.
Some might say that it's always good to initialize pointers, at least to NULL, because you could inadvertently try to dereference the pointer before it gets assigned any actual (non-garbage) value, and dereferencing a garbage address might cause your program to crash, or worse, might corrupt memory. But that's not all that different from the caution to always initialize, say, int variables to zero when you declare them. If your code is mistakenly using a variable before setting its value as intended, I'm not sure it matters all that much whether that value is zero, NULL, or garbage.
Edit. OP asks in a comment: You say that "String literals live in a different memory segment, they are considered immutable, and the memory for them doesn't need to be specifically allocated on the stack or the heap", so how does allocation occur?
That's just how the language works. In C, a string literal is an element of the language. The C11 standard specifies in ยง6.4.5 that when the compiler translates the source code into machine language, it should transform any sequence of characters in double quotes to a static array of char (or wchar_t if they are wide characters) and append a NUL character as the last element of the array. This array is then considered immutable. The standard says: If the program attempts to modify such an array, the behavior is undefined.
So basically, when you have a statement like:
char *p_message = "Pointer";
the standard requires that the double-quoted sequence of characters "Pointer" be implemented as a static, immutable, NUL-terminated array of char somewhere in memory. Typically implementations place such string literals in a read-only area of memory such as the text block (along with program instructions). But this is not required. The exact way in which a given implementation handles memory allocation for this array / NUL terminated sequence of char / string literal is up to the particular compiler. However, because this array exists somewhere in memory, you can have a pointer to it, so the above statement does work legally.
An analogy with function pointers might be useful. Just as the code for a function exists somewhere in memory as a sequence of instructions, and you can have a function pointer that points to that code, but you cannot change the function code itself, so also the string literal exists in memory as a sequence of char and you can have a char pointer that points to that string, but you cannot change the string literal itself.
The C standard specifies this behavior only for string literals, not for character constants like 'A' or integer constants like 5. Setting aside memory to hold such constants / non-string literals is the programmer's responsibility. So when the compiler comes across statements like:
char *charPtr = 'A'; // illegal!
int *intPtr = 5; // illegal!
the compiler does not know what to do with them. The programmer has not set aside such memory on the stack or the heap to hold those values. Unlike with string literals, the compiler is not going to set aside any memory for them either. So these statements are illegal.
Hopefully this is clearer. If not, please comment again and I'll try to clarify some more.
Initialisation is not needed, regardless of what type the pointer points to. The only requirement is that you must not attempt to use an uninitialised pointer (that has never been assigned to) for anything.
However, for aesthetic and maintenance reasons, one should always initialise where possible (even if that's just to NULL).
First of all, char is a numeric type, so the distinction in your question doesn't make sense. As written, your example code does not even compile:
char *p_message;
*p_message = "Pointer";
The second line is a constraint violation, since the left-hand side has arithmetic type and the right-hand side has pointer type (actually, originally array type, but it decays to pointer type in this context). If you had written:
char *p_message;
p_message = "Pointer";
then the code is perfectly valid: it makes p_message point to the string literal. However, this may or may not be what you want. If on the other hand you had written:
char *p_message;
*p_message = 'P';
or
char *p_message;
strcpy(p_message, "Pointer");
then the code would be invoking undefined behavior by either (first example) applying the * operator to an invalid pointer, or (second example) passing an invalid pointer to a standard library function which expects a valid pointer to an object able to store the correct number of characters.
not needed, but is still recommended for a clean coding style.
Also the code you posted is completely wrong and won't work, but you know that and only wrote that as a quick example, right?

Returning strings in C function, explanation needed

I understand that in order to return a string from a function I have to return a pointer. What I don't understand is why a char array is treated somewhat different from, say, integer, and gets destroyed when you exit the function. That's probably because I come from a high-level languages world, but this seems to be equally valid to me:
int x = 1;
return x;
char x[] = "hello";
return x;
The reason is both simple, yet subtle: C functions cannot be declared to return arrays.
When you return a pointer, like this:
char *foo(void)
{
char x[] = "hello";
return x;
}
The return x; is not actually returning x. It is returning a pointer to the first element of x1 - it is exactly the same as saying:
return &x[0];
It should be more clear why this isn't correct - it is exactly analagous to this:
int *foo(void)
{
int x = 100;
return &x;
}
It is, however, possible to return structures from functions - so you can return an array as long as it wrapped inside a struct. The following is quite OK:
struct xyzzy {
char x[10];
};
struct xyzzy foo(void)
{
struct xyzzy x = { "hello" };
return x;
}
1. This is a consequence of a special case rule for array types. In an expression, if an array isn't the subject of either the unary & or sizeof operators, it evaluates to a pointer to its first element. Trying to pin down actual an array in C is a bit like trying to catch fog in your hands - it just disappears, replaced by a pointer.
An integer is a primitive type, so it's value is returned as such. However, in C, a string is not a primitive type, which is why you declare it as an array of characters. And when you return an array, you are actually returning a pointer to the original array, so it is a kind of return-by-reference, if you will. But as you can see, the pointer to char x[] is invalid once you exit the function.
Please see this detailed answer I have written about this. Not alone that, in C, variables that are created within the scope of functions are using an automatic stack heap where the variables will resides in and when the function returns, that heap is destroyed, this is where the usage of local variables especially of type string i.e. char [] or a pointer to string must be allocated on the heap outside of the function scope otherwise garbage will be returned. The workaround on that is to use a static buffer or use malloc.
That char array is allocated within the scope of the function (rather than dynamically allocated eg by malloc()), so it will be unavailable outside of that scope.
Two ways around it:
Declare that char array to be static
Allocate the buffer in the calling function, passing it as an argument, then this function just fills it.

Resources