Starting to re-learn C after many, many years, and I'm a little confused why something I'm doing is working:
char *test_foo() {
char *foo = "foo";
return foo;
}
int main(int argc, char *argv[])
{
char *foo = test_foo();
printf("%s\n", foo);
return 0;
}
Maybe I'm misunderstanding my book, but I'm under the impression that the character pointer initialized in the test_foo function should have its memory block released after the function returns, which should make my printf not print out "foo" like it does in this example.
Is this just a case of the kernel not releasing this memory in time, or am I misunderstanding why this happens?
In C, a string literal is an anonymous array of characters with static storage duration, and the array must not be modified. The memory for a string literal is not released when the surrounding function returns. The memory for the pointer foo is, but its value is copied to the caller before it's released. What you are doing is perfectly well defined and sane.
You end up returning a pointer to a literal ("foo").
Your code is the same as
char *test_foo() {
return "foo";
}
Note, however, that you are making a bad assumption. You assumed that because your code printed "foo", everything you did was good (in this case it is). But even a 'bad' example will often succeed in a simple test and then explode in a more realistic program.
To see the case you are expecting, try
char *test_foo() {
char foo[10];
strcpy(foo, "foo");
return foo; /* returns the address of a local array */
}
Now your string is on the stack and will be released at function exit, so the returned pointer dangles.
The reason it works is that test_foo() returns a pointer to initialised data, a string literal. That string data is not on the local stack, so the pointer that test_foo() returns is good.
Related
void func(){
int i;
char str[100];
strcat(str, "aa");
printf("%s\n", str);
}
int main(){
func();
func();
func();
return 0;
}
this code prints:
?#aa
?#aaaa
?#aaaaaa
I don't understand why the garbage value (?#) appears, or why "aa" keeps being appended. In theory, local variables should be destroyed when the function returns, but this code suggests otherwise.
There's nothing in the standard that says data on the stack needs to be "destroyed" when a function returns. Reading the uninitialized array is undefined behavior, which means anything can happen.
Some implementations may choose to write zeros to all bytes that were on the stack, some might choose to write random data there, and some may choose not to touch it at all.
This particular implementation doesn't appear to sanitize what was on the stack.
After the first call to func returns, the data that was on the stack is still physically there because it hasn't been overwritten yet. When you then call func again immediately after the first call, the str variable happens to reside in the same physical memory that it did before. And since no other function calls were made in the calling function, this data is unchanged from the first call.
If you were to call some other function in between the calls to func, then you'd most likely see different behavior, as str would contain data that was used by whatever other function was called last.
To use strcat(), the destination must already hold a valid string; an uninitialized array of char is not a valid string.
To make it a valid "empty" string, you need to do this
char str[100] = {0};
This creates an empty string because the first element is the null terminator.
Be careful when using strcat(). If you just intend to concatenate two valid C strings, it's fine. If there are more than two, it's not the right function, because every call has to scan the destination from the start to find its end.
Also, if you want your code to work, declare the array in main() instead and pass it to the function, like this
void
func(char *str)
{
strcat(str, "aa");
printf("%s\n", str);
}
int
main()
{
char str[100] = {0};
func(str);
func(str);
func(str);
return 0;
}
When you declare:
char str[100];
all you are doing is declaring a character array; its contents are indeterminate, in practice whatever garbage was already in that memory.
So you are effectively appending to garbage.
If you did this instead:
char str[100] = {0};
Then you will get different results since the string will be empty to begin with. When you define an automatic variable or malloc() memory in C without initializing it, it simply holds whatever was in that spot at the time. You cannot assume it will be zeroed for you.
The strcat function concatenates two strings, so both arguments must represent strings. Moreover, the first argument is modified.
For a character array to represent a string, you can either initialize it with a string literal (possibly an empty one) or explicitly store a zero value, either when the array is initialized or afterwards.
For example
void func(){
char str1[100] = "";
char str2[100] = {'\0'};
char str3[100];
str3[0] ='\0';
//...
}
Take into account that objects with automatic storage duration are not initialized implicitly. They have indeterminate values.
As for your question
Why local array value isn't destroyed in function?
the reason is that the memory occupied by the array was not overwritten by another function call. In any case, it is undefined behavior.
The following program prints "Hello\nWorld\n" ('\n' = newline), as it is supposed to. But as I learned, something here isn't done as it should be. The "Hello" and "World" strings are defined inside a function (and are therefore local, with their memory released at the end of the function's scope, right?). We don't malloc them as we are supposed to (to keep the memory alive after the scope ends). So when a() is done, doesn't the stack pointer move back, so that "World" is placed in memory where "Hello" was? (It looks like that doesn't happen here and I don't understand why. And if the memory block is kept and not released after the scope, why do I usually need malloc at all?)
Thanks.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *a()
{
char *str1 = "Hello";
return str1;
}
char *b()
{
char *str2 = "World";
return str2;
}
int main()
{
char *main_str1 = a();
char *main_str2 = b();
puts(main_str1);
puts(main_str2);
return 0;
}
Edit: So what you are saying is that my "Hello" string occupies a fixed place in memory, and even though it's written inside a function, I can read it from anywhere if I have its address (so it behaves like malloc'd memory that you can't free), right?
Constant strings are not allocated on the stack. Only the pointer is allocated on the stack. The pointers returned from a() and b() point into a constant data area of the executable image, typically a read-only section.
In this case everything works because string literals are stored in data memory that is available for the whole lifetime of the program.
Your code is equivalent to the following (I mean it produces the same result):
char *a()
{
return "Hello";
}
This code doesn't work
char* a()
{
char array[6];
strcpy(array,"Hello");
return array;
}
because array[] is created on the stack and destroyed when the function returns, so the returned pointer dangles.
String literals (strings that are defined with "quotes") are created statically in the program's memory space at compile-time. When you go char *str1 = "Hello";, you aren't creating new memory at run-time like you would with a malloc call.
C does not oblige the compiler to reuse or move stack memory the way the OP suggests. More to the point, nothing here lives on the stack except the pointers: the literals have static storage duration, so the OP's program is well defined.
By contrast, a program that really does have undefined behavior (UB), such as one returning the address of a local array, may still appear to work without side effects like corrupted memory or segfaults, depending on the compiler and its optimizations. Another compiler may compile the same code with very different results.
Version with allocated memory follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *a() {
return strdup("Hello"); // POSIX, and ISO C since C23: strlen, malloc and memcpy in one call
}
char *b() {
return strdup("World");
}
int main() {
char *main_str1 = a();
char *main_str2 = b();
puts(main_str1);
puts(main_str2);
free(main_str1);
free(main_str2);
return 0;
}
I have the following code:
const char * func_journey ()
{
const char * manner = "Hello";
manner = "World";
return manner;
}
int main()
{
const char * Temp;
Temp = func_journey();
return 0;
}
I ran it in debug just to see what happens, somehow manner changed from "Hello" to "World" and also the pointer changed even though I have declared it a const.
Another thing is that at the end of the run Temp was "World". How can that be? manner was an automatic variable inside func_journey, so shouldn't it get destroyed at the end?
Thanks a lot.
I ran it in debug just to see what happens, somehow manner changed from "Hello" to "World"
That's precisely what your code told it to do, so there's no surprise that it did what you asked for.
and also the pointer changed even though I have declared it a const.
You declared it a pointer to const, not a const pointer (I know, this may sound confusing). When you write const char *, it means that what's pointed to is const. If you want to say that the pointer itself is const, you need
char * const manner = "Hello";
The answer to your question has two parts:
1) You have declared a pointer to const, not a const pointer (which I guess is what you wanted).
2) The memory for both "Hello" and "World" is not in func_journey's local stack frame but in a global, read-only memory location (look into how string literals are stored). If you had declared manner as a local char array instead, returning it would hand Temp a dangling pointer, not "World".
I'm trying to understand where things are stored in memory (stack/heap, are there others?) when running a C program. Compiling this gives the warning "function returns address of local variable":
char *giveString (void)
{
char string[] = "Test";
return string;
}
int main (void)
{
char *string = giveString ();
printf ("%s\n", string);
}
Running it gives varying results; it just prints gibberish. I gather from this that the char array called string in giveString() is stored in the stack frame of the giveString() function while it is running. But if I change the type of string in giveString() from char array to char pointer:
char *string = "Test";
I get no warnings, and the program prints out "Test". So does this mean that the character string "Test" is now located on the heap? It certainly doesn't seem to be in the stack frame of giveString() anymore. What exactly is going on in each of these two cases? And if this character string is located on the heap so that all parts of the program can access it through a pointer, will it never be deallocated before the program terminates? Or would the memory be freed if there were no pointers pointing to it, for example if I hadn't returned the pointer to main? (But that is only possible with a garbage collector like in Java, right?) Is this a special case of heap allocation that only applies to pointers to constant character strings (hardcoded strings)?
You seem to be confused about what the following statements do.
char string[] = "Test";
This code means: create an array in the local stack frame of sufficient size and copy the contents of constant string "Test" into it.
char *string = "Test";
This code means: set the pointer to point to constant string "Test".
In both cases, "Test" is in the const or cstring segment of your binary, where non-modifiable data exists. It is neither in the heap nor stack. In the former case, you're making a copy of "Test" that you can modify, but that copy disappears once your function returns. In the latter case, you are merely pointing to it, so you can use it once your function returns, but you can never modify it.
You can think of the actual string "Test" as being global and always there in memory, but the concept of allocation and deallocation is not generally applicable to const data.
No. The string "Test" is not on the heap, and it's not on the stack either. It's in the data portion of the program image, which basically gets set up before the program runs. It's there, but you can think of it kind of like "global" data.
The following may clear it up a tad for you:
char string[] = "Test"; // declare a local array, and copy "Test" into it
char* string = "Test"; // declare a local pointer and point it at the "Test"
// string in the data section of the program
It's because in the second case you are creating a constant string:
char *string = "Test";
The value pointed to by string is a constant and must never be changed, so it's laid out at compile time like a static variable (it's in neither the stack nor the heap).
Coming from a PHP background, I'm used to writing small functions that return a string (or the response from another function) like so:
function get_something(){
return "foo";
}
However, I'm new to C and am trying to figure out how to do some really fundamental things like this.
Can people review the following similar functions and tell me how they differ and which one is the best/cleanest to use?
char *get_foo(){
char *bar;
bar = "bar";
return bar;
}
char *get_foo(){
char *bar = "bar";
return bar;
}
char *get_foo(){
char *bar = NULL;
bar = "bar";
return bar;
}
char *get_foo(){
return "bar";
}
Is there any difference between these functions or is this a style issue?
One other thing. If I have two functions and one calls the other, is this alright to do?
char *get_foo(){
return "bar";
}
char *get_taz(){
return get_foo();
}
UPDATE: How would these functions need to change if get_foo() did not return a const char*? What if get_foo() calls another function that has a char* of different lengths?
The four are equivalent, especially the first three ones - the compiler is likely to compile them to exactly the same code. So I'd go for the last one, for being smaller.
Having said that, you're returning a pointer to a string literal, which should be treated as a const char*, not a char*, so this particular code could break everything, depending on how you use it (if it compiles at all, which you can force anyway). The thing is, you're returning a pointer to a string that isn't dynamically allocated but is part of the executable image, so modifying it is undefined behavior.
As a more general rule, never return a pointer to something allocated on the stack (i.e. a local variable, as opposed to memory obtained from malloc), because as soon as the function ends, that variable's lifetime ends too, and you are left with a pointer to invalid memory.
Differences like this will usually be optimized out by the compiler anyway ... I would vote for :
char *get_foo(){
char *bar = "bar";
return bar;
}
or
const char *get_foo(){
return "bar";
}
or something along the lines of (but obviously more defensive, and on a POSIX system, since strdup is not in ISO C before C23):
char *get_foo(){
return strdup("bar");
}
Depending on future use and expansion of the function. Really, due to optimizations, it is a readability issue, and how you want the string (mutable/not) for future use.
Because you are initializing the variable to a constant in the data of the program. I would do things differently if I were creating a string dynamically.
As others have already stated, the compiler will likely produce the same code for the alternatives. But are you forced to use C? Why not use C++, where you can use the std::string class? I haven't declared new char arrays for ages; it's too error-prone. You don't need to learn or master C before going to C++!
I'm always wary of returning a pointer to a variable that exists at a lower scope level. When I first learned C some X-teen years ago, I remember returning a pointer to a variable that was declared with local scope. Before I called printf, the debugger told me everything was normal, but it never printed the right value. What was happening was this: the variable was correct BEFORE the printf call, but when you call a function, its local variables get allocated on the stack and deallocated upon return. So the variable I had a pointer to still existed on the stack BEFORE calling printf, and its memory was reused by printf when printf was invoked, overwriting the previous contents.
In your case, the example you've given assigns a pointer into the constants table that is loaded as part of the executable, and that is fine as long as nobody writes through the pointer. Still, I would recommend trying to keep the string at a higher scope level to prevent an easy bug from sneaking into your code as you tweak it. Based on the example you've given, you could probably have a string table allocated at the scope above this call, and just assign the variable instead of calling a function.
I.e.
#define FOO 0
#define BAR 1
#define FOOBAR 2
#define BARFOO 3
char *MyFooStrings[4] = {"Foo","Bar","FooBar","BarFoo"};
// Instead: myFoo = get_foo();
myFoo = MyFooStrings[FOO];
Pete
Is there any difference between these functions or is this a style issue?
There is no difference in terms of output. As mentioned by others, the compiler will likely optimize the code anyway. My preference is to use:
char *get_foo(){
char *bar = "bar";
return bar;
}
If your return value gets to be more complex than a simple assignment, it helps to have the intermediate variable if you need to step through the code.
One other thing. If I have two functions and one calls the other, is this alright to do?
This is not a problem as long as you ensure that the return types of the two functions are compatible.
UPDATE: How would these functions need to change if get_foo() did not return a const char*? What if get_foo() calls another function that has a char* of different lengths?
get_taz() just has to have a return type that is assignment-compatible with get_foo(). For example, if get_foo() returns an int, then get_taz() has to return something that you can assign an int to - like int, long int, or similar.
A "char* of different lengths" doesn't really mean anything, because a "char *" doesn't really mean "string" - it means "the location of some chars". Whether that location holds three chars or thirty, a "char *" is still a "char *", so this is perfectly OK:
const char *get_zero(void)
{
return "Zero";
}
const char *get_nonzero(void)
{
return "The number is non-zero";
}
const char *get_n(int n)
{
if (n == 0)
{
return get_zero();
}
else
{
return get_nonzero();
}
}
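First off, none of those will crash as written, but be clear about what each line allocates.
The function:
char *get_foo(){
char *bar;
bar = "bar";
return bar;
}
is valid C, but it only works because "bar" is a string literal with static storage duration.
char *bar
allocates one pointer's worth of memory on the stack, and nothing more; there is no buffer behind it that you could write into.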
Personally, I would do it like this.
1.
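char *get_foo1(void) {
char *bar = malloc(strlen("bar") + 1); /* +1 for the terminator; needs <stdlib.h> and <string.h> */
if (bar != NULL)
strcpy(bar, "bar");
return bar; /* caller must free() the result */
}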
2. Or pass your allocated variable in.
void get_foo2(char *bar, size_t size) {
snprintf(bar, size, "%s", "bar"); /* never writes more than size bytes */
}
You can combine 1 and 2 to give the user options.
When working with strings in C, you almost always need to malloc() memory for them, unless the lengths of the strings are known ahead of time or are very small.
In that case, you can use #2 above to avoid memory allocation, like this
int main(int argc, char *argv[]) {
char bar[4]; /* room for "bar" plus the terminator */
get_foo2(bar, sizeof bar);
return 0;
}