Why local array value isn't destroyed in function?

Why local array value isn't destroyed in function? - c

void func(){
int i;
char str[100];
strcat(str, "aa");
printf("%s\n", str);
}
int main(){
func();
func();
func();
return 0;
}
this code prints:
?#aa
?#aaaa
?#aaaaaa
I don't understand why trash value(?#) is created, and why "aa" is continuously appended. Theoretically local values should be destroyed upon function termination. But this code doesn't.

There's nothing in the standard that says data on the stack needs to be "destroyed" when a function returns. It's undefined behavior, which means anything can happen.
Some implementations may choose to write zeros to all bytes that were on the stack, some might choose to write random data there, and some may chose to not touch it at all.
On this particular implementation, it appears that it doesn't attempt to sanitize what was on the stack.
After the first call to func returns. The data that was on the stack is still physically there because it hasn't been overwritten yet. When you then call func again immediately after the first call, the str variable happens to reside in the same physical memory that it did before. And since no other functions calls were made in the calling function, this data is unchanged from the first call.
If you were to call some other function in between the calls to func, then you'd most likely see different behavior, as str would contain data that was used by whatever other function was called last.

To use strcat() you need to have a valid string, an uninitialized array of char is not a valid string.
To make it a valid "empty" string, you need to do this
char str[100] = {0};
this creates an empty string because it only contains the null terminator.
Be careful when using strcat(). If you intend two just concatenate to valid c strings it's ok, if they are more than 2 then it's not the right function because the way it wroks.
Also, if you want your code to work, declare the array in main() instead and pass it to the function, like this
void
func(char *str)
{
strcat(str, "aa");
printf("%s\n", str);
}
int
main()
{
char str[100] = {0};
func(str);
func(str);
func(str);
return 0;
}

When you declare:
char str[100];
all you are doing is declaring a character array, but it's random garbage.
So you are effectively appending to garbage.
If you did this instead:
char str[100] = {0};
Then you will get different results since the string will be null to begin with. Anytime you allocate memory in C, unless you initialize it when you allocate it, it will simply be whatever was in memory in that spot at the time. You cannot assume it will be NULL and zero'ed for you.

Function strcat means to concatenate two strings. So the both arguments of the function must to represent strings. Moreover the first argument is changed.
That a character array would represent a string you can either to initialize it with a string literal (possibly an "empty" string literal) or implicitly to specify a zero value either when the array is initialized or after its initialization.
For example
void func(){
char str1[100] = "";
char str2[100] = {'\0'};
char str3[100];
str3[0] ='\0';
//...
}
Take into account that objects with the automatic storage duration are not initialized implicitly. They have indeterminate values.
As for your question
Why local array value isn't destroyed in function?
then the reason for this is that the memory occupied by the array was not overwritten by other function. However in any case it is undefined behavior.

Related

Non static pointer variable declared inside function seems to be preserved even after function returns

This question is not an exact duplicate of this. The most popular answer here says that it was not guaranteed that the memory location would be preserved but by accident it did. Where I got this piece of code from, it says clearly that the string variable will be preserved.
Consider the following code:
#include<stdio.h>
char *getString()
{
char *str = "abc";
return str;
}
int main()
{
printf("%s", getString());
getchar();
return 0;
}
This code compiles and runs without any errors.
The pointer char* str is not defined as a static variable. Yet the variable seems to be preserved even after the function returns.
Please explain this behavior.
I do not understand why this question got a negative vote even though it's not answered yet or marked as duplicate.

char *getString() ... char *str = "abc";
You are making some confusion. getString() creates a variable in the stack, and makes it point to a literal string. That literal can be anywhere, and will stay always there; depending on the compiler, it can be in RAM initialized at startup, or it can be in read-only memory. OK: If it is in RAM, perhaps you can even modify it, but not in normal (legal) ways - we should ignore this and think at the literal as immutable data.
The point above is THE point. It is true that the string "is preserved". But it is not true that an automatic variable is preserved - it can happen, but you should not rely on that, it is an error if you do it. By definition, an automatic variable gets destroyed when the function returns, and be sure that it happens.
That said, I don't see how you can say that the variable is preserved. There is no mechanism in C to peek at local variables (outside the current scope); you can do it with a debugger, especially when the active frame is that of getString(); maybe some debugger lets you to look where you shouldn't, but this is another matter.
EDIT after the comment
Many compilers create auto (local) variables in the stack. When the function returns, the data on the stack remains untouched, because clearing/destroying the stack is made by simply moving the stack pointer elsewhere. So the affirmation "the variable seems to be preserved" is correct. It seems because actually the variable is still there. It seems because there is no legal way to use it. But even in this situation, where the variable is still there but hidden, the compiler can decide to use that room for something else, or an interrupt can arrive and use that same room. In other words, while the auto variable is in scope it is guaranteed to be "preserved", but when it falls out of scope it must be considered gone.
There are situation where one can write code that refers to variables fallen out of scope: these are errors that the compiler should (and sometimes can) detect.
I said that often auto variables go in the stack. This is not always true, it depends on architecture and compiler, but all the rest is true and it is dictated by the language rules.

The address of str may change but the thing it points to (once initialised) will not.
Note that in this simple example you probably won't see changes. If you call getString a few times from different stack depths and show &str you will see what I mean.
char *getString()
{
char *str = "abc";
printf("%p", &str);
return str;
}
int main()
{
printf("%s", getString());
stackTest();
getchar();
return 0;
}
void stackTest()
{
char blob[200];
int x=0;
printf("%s", getString());
}
*(I have not tested this and my unused stack variables might get optimised away depending on your compiler & settings)

C 2011 (draft N1570) clause 6.4.5, paragraph 6 describes how a string literal in source code becomes a static object:
The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.
Thus, in char *str = "abc";, str is initialized to point to a static array containing the characters a, b, and c, and a null character.
Then return str; returns the value of this pointer.
Finally, printf("%s", getString()); passes the value of the pointer to printf. At this point, the str object is gone (in the C model of computation). It no longer exists. We have the value it was used to hold, but the str object itself is gone. However, the value points to a static object, the array of characters, and printf reads that array and prints it.
So, the title of your question is incorrect. The non-static pointer variable declared inside the function was not preserved. All that was preserved was the static array of characters and its address.

Return values to pointers initialized in function

Starting to re-learn C after many, many years, and I'm a little confused why something I'm doing is working:
char *test_foo() {
char *foo = "foo";
return foo;
}
int main(int argc, char *argv[])
{
char *foo = test_foo();
printf(foo);
return 0;
}
Maybe I'm misunderstanding my book, but I'm under the impression that the character pointer initialized in the test_foo function should have its memory block released after the function returns, which should make my printf not print out "foo" like it does in this example.
Is this just a case of the kernel not releasing this memory in time, or am I misunderstanding why this happens?

In C, a string literal is an anonymous array of constant characters with static storage duration. The memory for a string literal is not released when the surrounding function returns. The memory for the pointer foo is, but its value is copied to the caller before it's released. What you are doing is perfectly well defined and sane.

you end up returning a pointer to a literal ("foo")
your code is the same as
char *test_foo() {
return "foo";
}
Note however that you make a bad assumption. You assumed that because your code printed "foo" that everything you did was good (in this case it is), but even in a 'bad' example your code many times would succeed in a simple test but explode in a more real-world example.
To see the case you are expecting try
char *test_foo() {
char foo[10];
strcpy(foo,"foo")
return foo;
}
now your string is on the stack and will be released at function exit

The reason it works, is because foo() returns a pointer to initialised data - a string literal. That string data is not on the local stack, so the pointer that foo() returns is good.

Pointer and Function ambiguity in C

Please look at the following code:
char* test ( )
{
char word[20];
printf ("Type a word: ");
scanf ("%s", word);
return word;
}
void main()
{
printf("%s",test());
}
When the function returns, the variable word is destroyed and it prints some garbage value. But when I replace
char word[20];
by char *word;
it prints the correct value. According to me, the pointer variable should have been destroyed similar to the character array and the output should be some garbage value. Can anyone please explain the ambiguity?

Undefined behavior is just that - undefined. Sometimes it will appear to work, but that is just coincidence. In this case, it's possible that the uninitialized pointer just happens to point to valid writeable memory, and that memory is not used for anything else, so it successfully wrote and read the value. This is obviously not something you should count on.

You have undefined behavior either way, but purely from a "what's going on here" viewpoint, there's still some difference between the two.
When you use an array, the data it holds is allocated on the stack. When the function returns, that memory will no longer be part of the stack, and almost certainly will be overwritten in the process of calling printf.
When you use the pointer, your data is going to be written to whatever random location that pointer happens to have pointed at. Though writing there is undefined behavior, simple statistics says that if you have (for example) a 32-bit address space of ~4 billion locations, the chances of hitting one that will be overwritten in the new few instructions is fairly low.
You obviously shouldn't do either one, but the result you got isn't particularly surprising either.

Because the char array is defined and declared in the function, it is a local variable and no longer exists after the function returns. If you use a char pointer and ALLOCATE MEMORY FOR IT then it will remain, and all you need is the pointer (aka a number).
int main(int argc, char* argv[]) {
printf("%s", test());
return 0;
}
char* test(void) {
char* str = (char*)malloc(20 * sizeof(char));
scanf("%19s", str);
return str;
}
Notice how I used %19s instead of %s. Your current function can easily lead to a buffer overflow if a user enters 20 or more characters.

During program execution first it will create activation records for the function main in stack segment of the process memory. In that main activation records it will allocate memory for the local variable of that function(main) and some more memory for internal purpose. In your program main doesn't has any local variable, so it will not allocate any memory for local variables in main activation records.
Then while executing the statement for calling the function test, it will create one more activation records for the calling function(test) and it will allocate 20 bytes for the local variable word.
Once the control exits the function test, activation record created for that function will be poped out of that stack. Then it will continue to execute the remaining statment (printf) of the called function main. Here printf is trying to print the characters in the test function's local variable which is already poped out of the stack. So this behaviour is undefined, sometimes it may print the proper string or else it will print some junk strings.
So in this situation only dynamic memory comes into picture. With the help of dynamic memory we can control the lifetime(or scope) of a variable. So use dynamic memory like below.
char *word = NULL:
word = (char *) malloc(sizeof(char) * 20);
Note : Take care of NULL check for the malloc return value and also dont forget to free the allocated memory after printf in main function.

If referencing constant character strings with pointers, is memory permanently occupied?

I'm trying to understand where things are stored in memory (stack/heap, are there others?) when running a c program. Compiling this gives warning: function return adress of local variable:
char *giveString (void)
{
char string[] = "Test";
return string;
}
int main (void)
{
char *string = giveString ();
printf ("%s\n", string);
}
Running gives various results, it just prints jibberish. I gather from this that the char array called string in giveString() is stored in the stack frame of the giveString() function while it is running. But if I change the type of string in giveString() from char array to char pointer:
char *string = "Test";
I get no warnings, and the program prints out "Test". So does this mean that the character string "Test" is now located on the heap? It certainly doesn't seem to be in the stack frame of giveString() anymore. What exactly is going on in each of these two cases? And if this character string is located on the heap, so all parts of the program can access it through a pointer, will it never be deallocated before the program terminates? Or would the memory space be freed up if there was no pointers pointing to it, like if I hadn't returned the pointer to main? (But that is only possible with a garbage collector like in Java, right?) Is this a special case of heap allocation that is only applicable to pointers to constant character strings (hardcoded strings)?

You seem to be confused about what the following statements do.
char string[] = "Test";
This code means: create an array in the local stack frame of sufficient size and copy the contents of constant string "Test" into it.
char *string = "Test";
This code means: set the pointer to point to constant string "Test".
In both cases, "Test" is in the const or cstring segment of your binary, where non-modifiable data exists. It is neither in the heap nor stack. In the former case, you're making a copy of "Test" that you can modify, but that copy disappears once your function returns. In the latter case, you are merely pointing to it, so you can use it once your function returns, but you can never modify it.
You can think of the actual string "Test" as being global and always there in memory, but the concept of allocation and deallocation is not generally applicable to const data.

No. The string "Test" is still on the stack, it's just in the data portion of the stack which basically gets set up before the program runs. It's there, but you can think of it kind of like "global" data.
The following may clear it up a tad for you:
char string[] = "Test"; // declare a local array, and copy "Test" into it
char* string = "Test"; // declare a local pointer and point it at the "Test"
// string in the data section of the stack

It's because in the second case you are creating a constant string :
char *string = "Test";
The value pointed by string is a constant and can never change, so it's allocated at compile time like a static variable(but it's still stack not heap).

Proper way to initialize a string in C

I've seen people's code as:
char *str = NULL;
and I've seen this is as well,
char *str;
I'm wonder, what is the proper way of initializing a string? and when are you supposed to initialize a string w/ and w/out NULL?

You're supposed to set it before using it. That's the only rule you have to follow to avoid undefined behaviour. Whether you initialise it at creation time or assign to it just before using it is not relevant.
Personally speaking, I prefer to never have variables set to unknown values myself so I'll usually do the first one unless it's set in close proximity (within a few lines).
In fact, with C99, where you don't have to declare locals at the tops of blocks any more, I'll generally defer creating it until it's needed, at which point it can be initialised as well.
Note that variables are given default values under certain circumstances (for example, if they're static storage duration such as being declared at file level, outside any function).
Local variables do not have this guarantee. So, if your second declaration above (char *str;) is inside a function, it may have rubbish in it and attempting to use it will invoke the afore-mentioned, dreaded, undefined behaviour.
The relevant part of the C99 standard 6.7.8/10:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.

I'm wonder, what is the proper way of initializing a string?
Well, since the second snippet defines an uninitialized pointer to string, I'd say the first one. :)
In general, if you want to play it safe, it's good to initialize to NULL all pointers; in this way, it's easy to spot problems derived from uninitialized pointers, since dereferencing a NULL pointer will yield a crash (actually, as far as the standard is concerned, it's undefined behavior, but on every machine I've seen it's a crash).
However, you should not confuse a NULL pointer to string with an empty string: a NULL pointer to string means that that pointer points to nothing, while an empty string is a "real", zero-length string (i.e. it contains just a NUL character).
char * str=NULL; /* NULL pointer to string - there's no string, just a pointer */
const char * str2 = ""; /* Pointer to a constant empty string */
char str3[] = "random text to reach 15 characters ;)"; /* String allocated (presumably on the stack) that contains some text */
*str3 = 0; /* str3 is emptied by putting a NUL in first position */

this is a general question about c variables not just char ptrs.
It is considered best practice to initialize a variable at the point of declaration. ie
char *str = NULL;
is a Good Thing. THis way you never have variables with unknown values. For example if later in your code you do
if(str != NULL)
doBar(str);
What will happen. str is in an unknown (and almost certainly not NULL) state
Note that static variables will be initialized to zero / NULL for you. Its not clear from the question if you are asking about locals or statics

Global variables are initialized with default values by a compiler, but local variables must be initialized.

an unitialized pointer should be considered as undefined so to avoid generating errors by using an undefined value it's always better to use
char *str = NULL;
also because
char *str;
this will be just an unallocated pointer to somewhere that will mostly cause problems when used if you forget to allocate it, you will need to allocate it ANYWAY (or copy another pointer).
This means that you can choose:
if you know that you will allocate it shortly after its declaration you can avoid setting it as NULL (this is a sort of rule to thumb)
in any other case, if you want to be sure, just do it. The only real problem occurs if you try to use it without having initialized it.

It depends entirely on how you're going to use it. In the following, it makes more sense not to initialize the variable:
int count;
while ((count = function()) > 0)
{
}

Don't initialise all your pointer variables to NULL on declaration "just in case".
The compiler will warn you if you try to use a pointer variable that has not been initialised, except when you pass it by address to a function (and you usually do that in order to give it a value).
Initialising a pointer to NULL is not the same as initialising it to a sensible value, and initialising it to NULL just disables the compiler's ability to tell you that you haven't initialised it to a sensible value.
Only initialise pointers to NULL on declaration if you get a compiler warning if you don't, or you are passing them by address to a function that expects them to be NULL.
If you can't see both the declaration of a pointer variable and the point at which it is first given a value in the same screen-full, your function is too big.

static const char str[] = "str";
or
static char str[] = "str";

Because free() doesn't do anything if you pass it a NULL value you can simplify your program like this:
char *str = NULL;
if ( somethingorother() )
{
str = malloc ( 100 );
if ( NULL == str )
goto error;
}
...
error:
cleanup();
free ( str );
If for some reason somethingorother() returns 0, if you haven't initialized str you will
free some random address anywhere possibly causing a failure.
I apologize for the use of goto, I know some finds it offensive. :)

Your first snippet is a variable definition with initialization; the second snippet is a variable definition without initialization.
The proper way to initialize a string is to provide an initializer when you define it. Initializing it to NULL or something else depends on what you want to do with it.
Also be aware of what you call "string". C has no such type: usually "string" in a C context means "array of [some number of] char". You have pointers to char in the snippets above.
Assume you have a program that wants the username in argv[1] and copies it to the string "name". When you define the name variable you can keep it uninitialized, or initialize it to NULL (if it's a pointer to char), or initialize with a default name.
int main(int argc, char **argv) {
char name_uninit[100];
char *name_ptr = NULL;
char name_default[100] = "anonymous";
if (argc > 1) {
strcpy(name_uninit, argv[1]); /* beware buffer overflow */
name_ptr = argv[1];
strcpy(name_default, argv[1]); /* beware buffer overflow */
}
/* ... */
/* name_uninit may be unusable (and untestable) if there were no command line parameters */
/* name_ptr may be NULL, but you can test for NULL */
/* name_default is a definite name */
}

By proper you mean bug free? well, it depends on the situation. But there are some rules of thumb I can recommend.
Firstly, note that strings in C are not like strings in other languages.
They are pointers to a block of characters. The end of which is terminated with a 0 byte or NULL terminator. hence null terminated string.
So for example, if you're going to do something like this:
char* str;
gets(str);
or interact with str in any way, then it's a monumental bug. The reason is because as I have just said, in C strings are not strings like other languages. They are just pointers. char* str is the size of a pointer and will always be.
Therefore, what you need to do is allocate some memory to hold a string.
/* this allocates 100 characters for a string
(including the null), remember to free it with free() */
char* str = (char*)malloc(100);
str[0] = 0;
/* so does this, automatically freed when it goes out of scope */
char str[100] = "";
However, sometimes all you need is a pointer.
e.g.
/* This declares the string (not intialized) */
char* str;
/* use the string from earlier and assign the allocated/copied
buffer to our variable */
str = strdup(other_string);
In general, it really depends on how you expect to use the string pointer.
My recommendation is to either use the fixed size array form if you're only going to be using it in the scope of that function and the string is relatively small. Or initialize it to NULL. Then you can explicitly test for NULL string which is useful when it's passed into a function.
Beware that using the array form can also be a problem if you use a function that simply checks for NULL as to where the end of the string is. e.g. strcpy or strcat functions don't care how big your buffer is. Therefore consider using an alternative like BSD's strlcpy & strlcat. Or strcpy_s & strcat_s (windows).
Many functions expect you to pass in a proper address as well. So again, be aware that
char* str = NULL;
strcmp(str, "Hello World");
will crash big time because strcmp doesn't like having NULL passed in.
You have tagged this as C, but if anyone is using C++ and reads this question then switch to using std::string where possible and use the .c_str() member function on the string where you need to interact with an API that requires a standard null terminated c string.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight