What's wrong with the following C code- Struct and pointers - c

Something's wrong with the following function:
typedef struct Data1{
float result;
struct Data1* next;
} Data;
Data* f(Data* info){
Data item;
item.result=info->result;
item.next=info->next;
return &item;
}
I notice two things here:
The returned value is a pointer of local value. However it's still a pointer- the compiler gives a warning: function returns address of local variable. but would it really be a problem? ( I don't return a local value itself)
I believe that the main problem here is that this function suppose to copy the Data struct. it would be OK for the results value, but regarding the 'next' pointers, I believe that at the end of the call to the function the pointers would not be changed, Am I correct? It's like equalize two ints in a outside function, should *(item.next)=*(info->next); solve the problem?
So what's the main problem here? is it both 1 and 2?

The returned value is a pointer of local value. However it's still a pointer- the compiler gives a warning: function returns address of local variable. but would it really be a problem? ( I don't return a local value itself)
That is the main problem. After the function returns, the local variable doesn't exist anymore. The space it occupied may be overwritten immediately or later, but you can't count on ever reading meaningful data from that address.
If you want to copy things, you have to return a pointer to malloced memory.
Data* f(Data* info){
Data *item = malloc(sizeof *item);
item->result=info->result;
item->next=info->next;
return item;
}
But that has the drawback that now the caller has to free the memory allocated by f, so
Data* f(Data* info, Data* item){
item->result=info->result;
item->next=info->next;
return item;
}
with a pointer allocated by the caller.

The problem with returning pointers to local variables is that the space the local variables occupies will be reclaimed when the function returns, so the pointer no longer points to valid memory, or even memory used by other functions called later.

Yes, it would be a problem, since the returned pointer is useless: it's pointing at an object which no longer exists. Hence the warning.
Not sure I follow your reasoning here ... You are not changing anything in the Data passed in, so that's a problem if you expected it to.

1) Yes, it's a problem, because your pointer now points to what used to be on your stack, but is no longer managed memory, which means another function call (or an interrupt) will, with almost 100% certainty, begin mangling that memory.
2) I have no idea what you're asking here.
The main problem is that you're unclear on how memory works in C programs, which leads to constructs like this; not a ding, just an honest observation: http://www.geeksforgeeks.org/archives/14268 gives a relatively good overview and should serve you well.

Related

Mallocing in a recursive function

I have a function that is called recursively a number of times. Inside this function I malloc memory for a struct and pass it as an argument into the recursive call of this function. I am confused whether I can keep the name of the variable I am mallocing the same. Or is this going to be a problem?
struct Student{
char *studentName;
int studentAge;
};
recursiveFunction(*struct){ //(Whoever calls this function sends in a malloced struct)
Student *structptr = malloc(sizeof(Student));
<Do some processing>
.
.
if(condition met){
return;
}
else{
recursiveFunction(structptr);
}
}
free(){} // All malloced variables are free'd in another function
Would this be a problem since the name of the variable being malloced doesnt change in each recursive call.
The short answer is no. When you declare a variable it is scoped at the level where it is declared, in your case within this function. Each successive recursive call creates a new scope and allocates that memory within that scope so the name of your variable will not cause problems. However, you do want to be very careful that you free any memory that you malloc() before returning from your function as it will not be accessible outside the scope of your function unless you pass back a pointer to it. This question provides a lot of helpful information on using malloc() within functions. I also recommend reading more about scope here.
Each malloc() must have a matching free(). Either you need to free the record inside recursiveFunction (e.g. immediately before it exits), or in a function called by recursiveFunction or you need to maintain a list of them and free them elsewhere.
The name of the 'variable being malloced' being the same is irrelevant. In any case, it is not the variable that is being malloc()d; rather it is memory that is being malloc()d and the address stored in a variable. Each recursive iteration of recursiveFunction has a different stack frame and thus a different instance of this variable. So all you need to do is ensure that each malloc() is paired with a free() that is passed the address returned by malloc().
If you want to check you've done your malloc() / free() right, run valgrind on the code.
Can keep the name of the variable I am mallocing the same?
Yes, in a recursive function this is fine. As the function is called recursively, each variable holding the malloc'd pointer (it doesn't hold the memory itself) will be allocated on a new stack frame.
However, you're going to have to free that memory somehow. Only the pointer to the memory is on the stack, so only the pointer is freed when the function exits. The malloc'd memory lives on. Either at the end of each call to the function, or all that memory will have to be returned as part of a larger structure and freed later.
I am confused whether I can keep the name of the variable I am mallocing the same.
You seem to be confused about the concept of scope. Functions in C define scopes for the (local) variables you declare within them. That means that when you declare a local variable bar inside some function foo, then when you reference bar inside that function you reference whatever you declared it to be.
int bar = 21;
void foo(void) {
int bar = 42;
// ...
bar; // This is the bar set to 42
}
Now scope is only the theoretical concept. It's implemented using (among other details that I skip over here) so called stack frames:
When you call foo, then a new stack frame is created on the call stack, containing (this is highly dependent on the target architecture) things like return address (i.e. the address of the instruction that will be executed after foo), parameters (i.e. the values that you pass to a function) and, most importantly, space for the local variables (bar).
Accessing the variable bar in foo is done using addresses relative to the current stack frame. So accessing bar could mean access byte 12 relative to the current stack frame.
When in a recursive function the function calls itself, this is handled (mostly, apart from possible optimizations) like any other function call, and thus a new stack frame is created. Accessing the same (named) variable from within different stack frames will (because, as said, the access is using a relative address) thus access a different entities.
[Note: I hope this rather rough descriptions helps you, this is a topic that is - when talked about in depth - extremely depending on actual implementations (compilers), used optimizations, calling convention, operating system, target architecture, ... ]
I put together a simple stupid example, which hopefully shows that what you want to do should be possible, given that you appropriately free whatever you allocated:
unsigned int crazy_factorial(unsigned int const * const n) {
unsigned int result;
if (*n == 0) {
result = 1;
} else {
unsigned int * const nextN = malloc(sizeof(unsigned int));
*nextN = *n - 1;
result = *n * crazy_factorial(nextN);
free(nextN);
}
return result;
}
Running this with some output shows what's going on.

What does free do to a pointer passed by value to a function?

It is known that if we pass a pointer by value to a function, it cannot be freed inside the function, like so:
void func(int *p)
{
free(p);
p = NULL;
}
p holds a copy of a (presumably valid) address, so free(p) tries to, well, free it. But since it is a copy, it cannot really free it. How does the call to free() know that it cannot really free it ?
The code above does not produce an error. Does that mean free() just fails silently, "somehow" knowing that address passed in as argument cannot be worked upon ?
p holds a copy of a (presumably valid) address, so free(p) tries to, well, free it. But since it is a copy, it cannot really free it.
It's not true. free() can work just fine if p is a valid address returned by malloc() (or NULL).
In fact, this is a common pattern for implementing custom "destructor" functions (when writing OO-style code in C).
What you probably mean is that p won't change to NULL after this - but that's natural, since you're passing it by value. If you want to free() and null out the pointer, then pass it by pointer ("byref"):
void func(int **p)
{
if (p != NULL) {
free(*p);
*p = NULL;
}
}
and use this like
int *p = someConstructor();
func(&p);
// here 'p' will actually be NULL
The only problem is if this function is in a different DLL (Windows). Then, it may be linked with a different version of the standard library and have different ideas on how the heap is built.
Otherwise no problem.
Passing p to func() by value, which will copy the pointer and creates the local copy to func() which frees the memory. func() then sets it's own instance of the pointer p to NULL but which is useless. Once the function is complete the parameter p come to end of existence. In calling function you still have pointer p holding an address, but the block is now on the free list and not useful for storage until allocated again.
What everybody is saying is that your memory will be freed by free(p);, but your original pointer (which you use to call the function with) will still hold the (now invalid) address. If a new block of memory including your address is allocated at a later stage than your original pointer will become valid (for memory manager) again, but will now point to completely different data causing all sorts of problems and confusion.
No you really free the block of memory. After the function call, the pointer passed to this function is pointing to nowhere : same address but the MMU don't know anymore what to do with this address

Returning struct in a c function

I have something like the folowing C code:
struct MyStruct MyFunction (something here)
{
struct MyStruct data;
//Some code here
return data;
}
would the returning value be a reference or a copy of the memory block for data?
Should MyFunction return struct MyStruct* (with the corresponding memory allocation) instead of struct MyStruct?
There is no such thing as a reference in C. So semantically speaking, you are returning a copy of the struct. However, the compiler may optimise this.
You cannot return the address of a local variable, as it goes out of scope when the function returns. You could return the address of something that you've just malloc-ed, but you'll need to make it clear that someone will need to free that pointer at some point.
It would return a copy. C is a pass-by-value language. Unless you specify that you are passing pointers around, structures get copied for assignments, return statements, and when used as function parameters.
It is returned as copy. BTW, you should not return it's reference because it has automatic storage duration ( i.e., data resides on stack ).
I have had problems with this (a function in a DLL returning a struct) and have investigated it. Returning a struct from a DLL to be used by people who might have a different compiler is not good practice, because of the following.
How this works depends on the implementation. Some implementations return small records in registers, but most get an invisible extra argument that points to a result struct in the local frame of the caller. On return, the pointer is used to copy data to the struct in the local frame of the caller. How this pointer is passed depends on the implementation again: as last argument, as first argument or as register.
As others said, returning references is not a good idea, as the struct you return might be in your local frame. I prefer functions that do not return such structs at all, but take a pointer to one and fill it up from inside the function.

Freeing pointers from inside other functions in C

Consider the c code:
void mycode() {
MyType* p = malloc(sizeof(MyType));
/* set the values for p and do some stuff with it */
cleanup(p);
}
void cleanup(MyType* pointer) {
free(pointer);
pointer = NULL;
}
Am I wrong in thinking that after cleanup(p); is called, the contents of p should now be NULL? Will cleanup(MyType* pointer) properly free the memory allocation?
I am coding my college assignment and finding that the debugger is still showing the pointer to have a memory address instead of 0x0 (or NULL) as I expect.
I am finding the memory management in C to be very complicated (I hope that's not just me). can any shed some light onto what's happening?
Yes that will free the memory correctly.
pointer inside the cleanup function is a local variable; a copy of the value passed in stored locally for just that function.
This might add to your confusion, but you can adjust the value of the variable p (which is local to the mycode method) from inside the cleanup method like so:
void cleanup(MyType** pointer) {
free(*pointer);
*pointer = NULL;
}
In this case, pointer stores the address of the pointer. By dereferencing that, you can change the value stored at that address. And you would call the cleanup method like so:
cleanup(&p);
(That is, you want to pass the address of the pointer, not a copy of its value.)
I will note that it is usually good practice to deal with allocation and deallocation on the same logical 'level' of the software - i.e. don't make it the callers responsibility to allocate memory and then free it inside functions. Keep it consistent and on the same level.
cleanup will properly free p, but it won't change its value. C is a pass-by-value language, so you can't change the caller's variable from the called function. If you want to set p from cleanup, you'll need to do something like:
void cleanup(MyType **pointer) {
free(*pointer);
*pointer = NULL;
}
And call it like:
cleanup(&p);
Your code is a little bit un-idiomatic, can you explain a bit better why you want to write this cleanup function?
Yes
Yes
Yes: There is a block of memory magically produced by malloc(3). You have assigned the address of this memory, but not the memory itself in any meaningful way, to the pointer p which is an auto variable in mycode().
Then, you pass p to cleanup(), by value, which will copy the pointer and, using the copy local to cleanup(), free the block. cleanup() then sets it's own instance of the pointer to NULL, but this is useless. Once the function is complete the parameter pointer ceases to exist.
Back in mycode(), you still have pointer p holding an address, but the block is now on the free list and not terribly useful for storage until allocated again.
You may notice that you can even still store to and read back from *p, but various amounts of downstream lossage will occur, as this block of memory now belongs to the library and you may corrupt its data structures or the data of a future owner of a malloc() block.
Carefully reading about C can give you an abstract idea of variable lifetime, but it's far easier to visualize the near-universal (for compiled languages, anyway) implementation of parameter passing and local variable allocation as stack operations. It helps to take an assembly course before the C course.
This won't work as the pointer in cleanup() is local, and thus assigning it NULL is not seen by the calling function. There are two common ways of solving this.
Instead of sending cleanup the pointer, send it a pointer to the pointer. Thus change cleanup() as follows:
void cleanup(MyType** pointer)
{
free(*pointer);
*pointer = NULL;
}
and then just call cleanup(&p).
A second option which is quite common is to use a #define macro that frees the memory and cleans the pointer.
If you are using C++ then there is a third way by defining cleanup() as:
void cleanup(MyType& *pointer)
{
// your old code stays the same
}
There are two questions are here:
Am I wrong in thinking that after
cleanup(p); is called, the contents of
p should now be NULL?
Yes, this is wrong. After calling free the memory pointed by the pointer is deallocated. That doesn't mean that the content pointed by the pointer is set to NULL. Also, if you are expecting the pointer p to become NULL in mycode it doesn't happen because you are passing copy of p to cleanup. If you want p to be NULL in mycode, then you need a pointer to pointer in cleanup, i.e. the cleanup signature would be cleanup(MyType**).
Second question:
Will cleanup(MyType* pointer) properly
free the memory allocation?
Yes, since you are doing free on a pointer returned by malloc the memory will be freed.
It's not just you.
cleanup() will properly clean up your allocation, but will not set the pointer to NULL (which should IMHO be regarded as separate from cleanup.) The data the pointer points to is passed to cleanup() by pointer, and is free()ed properly, but the pointer itself is passed by value, so when you set it to NULL you're only affecting the cleanup() function's local copy of the pointer, not the original pointer.
There are three ways around this:
Use a pointer to a pointer.
void cleanup(struct MyType **p) { free(*p); *p = NULL; }
Use a macro.
#define cleanup(p) do { free(p); p = NULL; } while(0)
or (probably better):
void cleanup_func(struct MyType *p) { /* more complicated cleanup */ }
#define cleanup(p) do { cleanup_func(p); p = NULL; } while(0)
Leave the responsibility of setting pointers to NULL to the caller. This can avoid unnecessary assignments and code clutter or breakage.

How does temporary storage work in C when a function returns?

I know C pretty well, however I'm confused of how temporary storage works.
Like when a function returns, all the allocation happened inside that function is freed (from the stack or however the implementation decides to do this).
For example:
void f() {
int a = 5;
} // a's value doesn't exist anymore
However we can use the return keyword to transfer some data to the outside world:
int f() {
int a = 5;
return a;
} // a's value exists because it's transfered to the outside world
Please stop me if any of this is wrong.
Now here's the weird thing, when you do this with arrays, it doesn't work.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
I know arrays are only accessible by pointers, and you can't pass arrays around like another data structure without using pointers. Is this the reason you can't return arrays and use them in the outside world? Because they're only accessible by pointers?
I know I could be using dynamic allocation to keep the data to the outside world, but my question is about temporary allocation.
Thanks!
When you return something, its value is copied. a does not exist outside the function in your second example; it's value does. (It exists as an rvalue.)
In your last example, you implicitly convert the array a to an int*, and that copy is returned. a's lifetime ends, and you're pointing at garbage.
No variable lives outside its scope, ever.
In the first example the data is copied and returned to the calling function, however the second returns a pointer so the pointer is copied and returned, however the data that is pointed to is cleaned up.
In implementations of C I use (primarily for embedded 8/16-bit microcontrollers), space is allocated for the return value in the stack when the function is called.
Before calling the function, assume the stack is this (the lines could represent various lengths, but are all fixed):
[whatever]
...
When the routine is called (e.g. sometype myFunc(arg1,arg2)), C throws the parameters for the function (arguments and space for the return value, which are all of fixed length) on to the stack, followed by the return address to continue code execution from, and possibly backs up some processor registers.
[myFunc local variables...]
[return address after myFunc is done]
[myFunc argument 1]
[myFunc argument 2]
[myFunc return value]
[whatever]
...
By the time the function fully completes and returns to the code it was called from, all of it's variables have been deallocated off the stack (they might still be there in theory, but there is no guarantee)
In any case, in order to return the array, you would need to allocate space for it somewhere else, then return the address to the 0th element.
Some compilers will store return values in temporary registers of the processor rather than using the stack, but it's rare (only seen it on some AVR compilers).
When you attempt to return a locally allocated array like that, the calling function gets a pointer to where the array used to live on the stack. This can make for some spectacularly gruesome crashes, when later on, something else writes to the array, and clobbers a stack frame .. which may not manifest itself until much later, if the corrupted frame is deep in the calling sequence. The maddening this with debugging this type of error is that real error (returning a local array) can make some other, absolutely perfect function blow up.
You still return a memory address, you can try to check its value, but the contents its pointing are not valid beyond the scope of function,so dont confuse value with reference.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
First, the compiler wouldn't know what size of array to return. I just got syntax errors when I used your code, but with a typedef I was able to get an error that said that functions can't return arrays, which I knew.
typedef int ia[1];
ia h(void) {
ia a = 5;
return a;
}
Secondly, you can't do that anyway. You also can't do
int a[1] = {4};
int b[1];
b = a; // Here both a and b are interpreted as pointer literals or pointer arithmatic
While you don't write it out like that, and the compiler really wouldn't even have to generate any code for it this operation would have to happen semantically for this to be possible so that a new variable name could be used to refer the value that was returned by the function. If you enclosed it in a struct then the compiler would be just fine with copying the data.
Also, outside of the declaration and sizeof statements (and possibly typeof operations if the compiler has that extension) whenever an array name appears in code it is thought of by the compiler as either a pointer literal or as a chunk of pointer arithmetic that results in a pointer. This means that the return statement would end looking like you were returning the wrong type -- a pointer rather than an array.
If you want to know why this can't be done -- it just can't. A compiler could implicitly think about the array as though it were in a struct and make it happen, but that's just not how the C standard says it is to be done.

Resources