Locality of variables in functions and memory - c

I've done the following:
char * copyact(char * from)
{
return ++from;
}
int main()
{
char *string = "school";
char *copy;
copy = copyact(string);
printf("%s", copy);
}
This prints chool, but I expected the application to crash when we try to print it in main(). By scope rules, the parameter from is a variable local to the copyact function. I'm effectively doing from = from + 1; and returning the address of that place. So when we get back to main, shouldn't the memory at that location now be invalid, because all local variables must be destroyed? Why does this still work?
Clarification: Isn't a memory location assigned to the pointer itself (&from), in which it stores the address of the string? When the function exits, isn't that pointer, which holds the valid address, destroyed as well? Or does it work because, by the time return is executed, the address it points to has already been handed over to copy = ?

1. Undefined behavior is not a crash
First of all, please remember that when you do bad things with memory (like using a variable after it has been destroyed) the result is undefined behavior, and this means something completely different from a "crash".
Undefined behavior means that anything can happen (including a crash), but "anything" may also mean "nothing". Actually, the worst kinds of bugs are those in which undefined behavior doesn't do anything apparent immediately, and only provokes crazy behavior in some other unrelated and innocent part of the code a million instructions later. Or only when you are showing your program in front of a vast audience.
So please remember that undefined behavior is not a crash. It's a crash only when you're lucky.
The sooner you understand the difference between a bug and a crash, the better. Bugs are your enemies; crashes are your friends (because they reveal a bug).
2. This code is not doing anything bad
The function returns a char *, and this value (a pointer) is computed by pre-incrementing a local variable. When the function returns, the local variable is destroyed, but because the function returns its value (a copy of the pointer) and not the variable itself, the code is perfectly safe.
It would have been unsafe instead if the function had been defined (in C++, since C has no references) as
char *& copyact(char * from)
{
return ++from;
}
because in this case the return value is a reference to a pointer to char: the function would return a reference to from, which has already been destroyed by the time the caller can use the returned reference.
By the way, the g++ compiler, for example, emits a warning when you compile the modified version:
vref.cpp: In function ‘char*& copyact(char*)’:
vref.cpp:3:9: warning: reference to local variable ‘from’ returned
Note however that even in this case you cannot expect that running the code will crash for sure. For example, on my computer running the modified, buggy version just prints "school" instead of "chool".
It doesn't make much sense, but this is quite normal once you enter the realm of undefined behavior.

It works because your function receives a pointer to an object that already exists outside of it, and the result it returns is just a value. Judging by the code, though, it returns a pointer to the string shifted by one character from the start. I am not sure that was the idea. Also, if the original were an empty string, the returned pointer would point one past its terminating null character, and printing it would be undefined behavior.

char * copyact(char * from)
{
return ++from;
}
char *string = "school";
char *copy;
copy = copyact(string);
You are making from point to "school", which already exists in memory, and you return from + 1, which points to "chool".
Here is an example of a case in which you should not return the pointer:
char * copyact(char * from)
{
char a[10]; // local array with automatic storage duration
return a; // wrong: a ceases to exist when the function returns, so the caller must not use this pointer
}

Related

using a pointer without referencing?

I have a problem with this function: it uses a pointer without making it refer to anything (ptr is never initialized), yet to my surprise it works, and I don't know why. If anyone can explain why it does not generate an error, I'll be very grateful.
#include<stdio.h>
int * Ret(int *x)
{
int *ptr;
*ptr = (-1*(*x));
return ptr;
}
int main(void)
{
int val = 5,op;
op = *Ret(&val);
printf("%d",op);
}
The output is -5, but I think it should generate a run-time error?
It's undefined behaviour.
Anything can happen, and that includes appearing to work correctly.
If you want to catch these problems, use external tools, e.g. valgrind, or an instrumenting compiler, e.g. clang with address sanitizer.
You are right that the function is doing something wrong.
The function writes through, and then returns, a pointer that was never initialized, so it does not point at any object the program owns.
In your run, the garbage address in ptr happened to be writable memory that nothing else modified before main read it back, which is why you got the expected value. (The language does not guarantee this; your implementation merely let it happen.)
If more code had run in between, it might have overwritten this memory and the value of op might have changed.
Bottom line: don't do it!
int *ptr is on the stack. Stack variables are not initialized, so it can have any value. The *ptr = assignment dereferences ptr, that is, the "any value" is taken as an address and the right-hand side is stored there. If "any value", as an address, is outside the program's assigned memory, a run-time error will occur. Otherwise some memory of the program is overwritten; this error can become manifest at any later moment, may never manifest itself at all, or can give (noticed or unnoticed) wrong results. "Bad code", in summary.
The compiler could catch the error by flagging the use of ptr as use of an uninitialized variable.

why the second printf prints garbage value

this is the source code
#include <stdio.h>
#include <stdlib.h>
int *fun();
int main()
{
int *j;
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
return 0;
}
int *fun()
{
int k=35;
return &k;
}
output-
35
1637778
The first printf() prints 35, which is the value of k, but the second printf() in main() prints a garbage value rather than 35. Why?
The problem here is that fun returns the address of a local variable. That address becomes invalid the moment the function returns. You are simply getting lucky on the first call to printf.
Even though the local is technically destroyed when fun returns, the C runtime does nothing to actively wipe it. Hence your first use of *j works because the memory for the local hasn't been overwritten yet. The implementation of printf, though, is likely overwriting it simply by using its own locals. Hence in the second use of *j you're reading whatever printf left there, not k.
In order to make this work you need to return an address that points to a value that lives longer than the call to fun. Typically in C this is achieved with malloc:
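int *fun() {
int *pValue = malloc(sizeof(int)); // heap storage outlives the call to fun
if (pValue != NULL) {
*pValue = 35; // same value as k in the question
}
return pValue; // NULL if the allocation failed
}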
Because the memory returned by malloc lives until you call free, this will be valid across multiple uses of printf. The one catch is that the calling function now has to tell the program when it is done with the return value of fun. To do this, call free after the second call to printf:
j=fun();
printf("%d\n",*j);
printf("%d\n",*j);
free(j);
The program invokes undefined behavior. You can't return a pointer to an automatic local variable: the variable no longer exists once fun returns. In this case the result you get may or may not be what you expect.
Never return a pointer to an automatic local variable.
You are returning the address of a local variable, which is stored on the stack. When you leave the function, the variable ceases to exist (its memory is released for reuse, not actively erased). You are getting undefined behaviour.
In your case the stack had not changed right after the function returned, so the first time you got the correct value. This will not always be the case.
Both are wrong, since you print a value that no longer exists: the memory to store int k in the function is ok only while the function is executing; you can't return a reference (pointer) to it, since it will no longer reference anything meaningful.
The following, instead, would work:
int *fun()
{
static int k=35;
return &k;
}
The static keyword "says" that the memory must "survive" even if the function is not running, thus the pointer you return will be valid.
As others have already said, your program invokes undefined behavior.
That means anything can happen, because the behaviour is not defined.
In your case, the following happens: The address of the variable, sitting on the stack, is returned. After returning from the function, the next function call can - and will - reuse that space.
Between the function call erroneously returning this address and the call using the value, nothing happens - in your case. Be aware that even this might be different on systems where interrupts may occur, and as well on systems with signals being able to interrupt the normal program run.
The first printf() call now uses the stack for its own purpose - maybe it is even the call itself which overwrites the old value. So the second printf() call receives the value now written into that memory.
On undefined behaviour, anything may happen.

scope rules in C

I recently read about scope rules in C. A local (auto) variable is available only inside the block of the function in which it is declared; once outside the function it is no longer visible. Its lifetime also lasts only until the closing brace of the function body.
Now here is the problem. What happens when the address of a local variable is returned from the function to the calling function ?
For example :-
main()
{
int *p=fun();
}
int * fun()
{
int localvar=0;
return (&localvar);
}
Once control returns from the function fun, the variable localvar is no longer alive. So how will main be able to access the contents at this address?
The address can be returned, but the value stored at the address cannot reliably be read. Indeed, it is not even clear that you can safely assign it, though the chances are that on most machines there wouldn't be a problem with that.
You can often read through the address, but the behaviour is undefined (read 'bad: to be avoided at all costs!'). In particular, the memory may be reused for other variables in other functions, so if you access it after calling other functions, you are unlikely to see the last value stored in the variable by the function that returned the pointer to it.
Why then is a function returning a pointer ever required?
One reason is often 'dynamic memory'. The malloc() family of functions return a pointer to new (non-stack) memory.
Another reason is 'found something at this location in a value passed to me'. Consider strchr() or strstr().
Another reason is 'returning pointer to a static object, either hidden in the function or in the file containing the source for the function'. Consider asctime() et al (and worry about thread-safety).
There are probably a few others, but those are probably the most common.
Note that none of these return a pointer to a local (stack-based) variable.
The variable is gone, but the memory location still exists and might even still contain the value you set. It will however probably get overwritten pretty fast as more functions are called and the memory address gets reused for another function's local variables. You can learn more by reading about the Call Stack, which is where local variables of functions are stored.
Referencing that location in memory after the function has returned is dangerous. Of course the location still exists (and it may still contain your value), but you no longer have any claim to that memory region and it will likely be overwritten with new data as the program continues and new local variables are allocated on the stack.
gcc gives me the following warning:
t.c: In function ‘test’:
t.c:3:2: warning: function returns address of local variable [enabled by default]
Consider this test program:
int * test(int p) {
int loc = p;
return &loc;
}
int main(void) {
int *c = test(4);
test(5);
printf("%d\n", *c);
return 0;
}
What do you think this prints?

What could be the possible reason behind the warning which comes up when the following piece of code is compiled

This is a simple piece of code which I wrote to check whether it is legitimate to return the address of a local variable. My assumption that it is not was confirmed by the compiler, which gives a warning saying as much:
warning: function returns address of local variable
But the correct address is printed when executed... Seems strange!
#include<stdio.h>
char * returnAddress();
main()
{
char *ptr;
ptr = returnAddress();
printf("%p\n",ptr);
}
char * returnAddress()
{
int x;
printf("%p\n",&x);
return &x;
}
The behaviour is undefined.
Anything is allowed to happen when you invoke undefined behaviour - including behaving semi-sanely.
The address of a local variable is returned. It remains an address; it might even be a valid address if you're lucky. What you get if you access the data that it points to is anyone's guess - though you're best off not knowing. If you call another function, the space pointed at could be overwritten by new data.
You should be getting warnings about the conversion between int pointer and char pointer - as well as warnings about returning the address of a local variable.
What you are trying to do is usually dangerous:
In returnAddress() you declare a local, non-static variable x on the stack. Then you return its address, which will be invalid once the function has returned.
Additionally you try to return a char * while you actually have an int *.
To get rid of the warning caused by returning a pointer to a local var, you could use this code:
void *p = &x;
return p;
Of course printing the address is harmless in practice, but dereferencing it (e.g. int x = *ptr;) is undefined behavior: it may crash your program, or it may silently read garbage.
However, what you are doing is a great way to break things - other people might not know that you return an invalid pointer that must never be dereferenced.
Yes, the same address is printed both times. Except that, when the address is printed in main(), it no longer points to any valid memory address. (The variable x was created in the stack frame of returnAddress(), which was scrapped when the function returned.)
That's why the warning is generated: Because you now have an address that you must not use.
Just because you can access the memory of the local variable doesn't mean it is a correct thing to do. After the end of a function call, the stack pointer moves back to its previous position in memory, so you may still be able to access the function's local variables, as they are not erased. But there is no guarantee that such an access won't fail (with a segmentation fault, say), or that you won't read garbage.
Which warning? I get a type error (you're returning an int* but the type says char*) and a warning about returning the address of a local variable.
The type error is because the type you've declared for the function is a lie (or statistics?).
The second is because that is a crazy thing to do. That address is going to be smack in the middle (or rather, near the top) of the stack. If you use it you'll be stomping on data (or have your data stomped on by subsequent function calls).
It's not strange. The local variables of a function are allocated in the stack frame of that function. Once control leaves the function, the local variables are invalid. You may still hold the address, but the same memory can be overwritten by other values. This is why the behavior is undefined. If you want to reference memory throughout your program, allocate it using malloc. This allocates the memory on the heap instead of the stack, and you can safely reference it until you free it explicitly.
#include<stdio.h>
#include<stdlib.h>
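char *returnAddress(void);
int main(void)
{
char *ptr;
ptr = returnAddress();
printf("%p\n", (void *)ptr); // same address as printed inside returnAddress
free(ptr); // the caller owns the allocation and must release it
return 0;
}
char *returnAddress(void)
{
char *x = malloc(sizeof(char)); // heap storage stays valid after the function returns
printf("%p\n", (void *)x);
return x;
}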

How does temporary storage work in C when a function returns?

I know C pretty well, however I'm confused about how temporary storage works.
For instance, when a function returns, all the storage allocated inside that function is freed (from the stack, or however the implementation decides to do this).
For example:
void f() {
int a = 5;
} // a's value doesn't exist anymore
However we can use the return keyword to transfer some data to the outside world:
int f() {
int a = 5;
return a;
} // a's value exists because it's transferred to the outside world
Please stop me if any of this is wrong.
Now here's the weird thing, when you do this with arrays, it doesn't work.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
I know arrays are only accessible by pointers, and you can't pass arrays around like another data structure without using pointers. Is this the reason you can't return arrays and use them in the outside world? Because they're only accessible by pointers?
I know I could be using dynamic allocation to keep the data to the outside world, but my question is about temporary allocation.
Thanks!
When you return something, its value is copied. a does not exist outside the function in your second example; its value does. (It exists as an rvalue.)
In your last example, you implicitly convert the array a to an int*, and that copy is returned. a's lifetime ends, and you're pointing at garbage.
No automatic variable outlives the block it is declared in, ever.
In the first example the data itself is copied and returned to the calling function. The second returns a pointer, so the pointer is copied and returned, but the data it points to is cleaned up.
In implementations of C I use (primarily for embedded 8/16-bit microcontrollers), space is allocated for the return value in the stack when the function is called.
Before calling the function, assume the stack is this (the lines could represent various lengths, but are all fixed):
[whatever]
...
When the routine is called (e.g. sometype myFunc(arg1,arg2)), C throws the parameters for the function (arguments and space for the return value, which are all of fixed length) on to the stack, followed by the return address to continue code execution from, and possibly backs up some processor registers.
[myFunc local variables...]
[return address after myFunc is done]
[myFunc argument 1]
[myFunc argument 2]
[myFunc return value]
[whatever]
...
By the time the function fully completes and returns to the code it was called from, all of its variables have been deallocated off the stack (they might still be there in theory, but there is no guarantee).
In any case, in order to return the array, you would need to allocate space for it somewhere else, then return the address to the 0th element.
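Some compilers store return values in processor registers rather than on the stack. Among the compilers I use I have only seen this on some AVR compilers, but it is standard on mainstream desktop ABIs such as x86-64 System V, where small scalar return values come back in a register.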
When you attempt to return a locally allocated array like that, the calling function gets a pointer to where the array used to live on the stack. This can make for some spectacularly gruesome crashes when, later on, something else writes to the array and clobbers a stack frame, which may not manifest itself until much later if the corrupted frame is deep in the calling sequence. The maddening thing about debugging this type of error is that the real error (returning a local array) can make some other, absolutely perfect function blow up.
You still return a memory address, and you can inspect the address itself, but the contents it points to are not valid beyond the scope of the function, so don't confuse a value with a reference.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
First, the compiler wouldn't know what size of array to return. I just got syntax errors when I used your code, but with a typedef I was able to get an error that said that functions can't return arrays, which I knew.
typedef int ia[1];
ia h(void) { // error: a function cannot return an array type
ia a = {5};
return a;
}
Secondly, you can't do that anyway. You also can't do
int a[1] = {4};
int b[1];
b = a; // error: arrays are not assignable (a converts to a pointer here, and b is not a modifiable lvalue)
While you don't write it out like that, and the compiler really wouldn't even have to generate any code for it, this operation would have to happen semantically for returning an array to be possible, so that a new variable name could be used to refer to the value returned by the function. If you enclosed the array in a struct, the compiler would be just fine with copying the data.
Also, outside of its declaration and a few operators such as sizeof and unary & (and possibly typeof, if the compiler has that extension), whenever an array name appears in an expression the compiler converts it to a pointer to the array's first element. This means that the return statement would end up looking like you were returning the wrong type: a pointer rather than an array.
If you want to know why this can't be done -- it just can't. A compiler could implicitly think about the array as though it were in a struct and make it happen, but that's just not how the C standard says it is to be done.
