I am learning the basic of C programming and find it quite strange and difficult. Especially with dynamic memory allocation, pointer and similar things. I came upon this function and don't quite understand what is wrong with it.
char *strdup(const char *p)
{
char *q;
strcpy(q, p);
return q;
}
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
When it is appropriate to use malloc? As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true? Is there any other situation where malloc is needed?
The type of q is a pointer, and pointers hold addresses -- so what you are returning is the address that pointer holds.
Until you give that pointer a valid address, it points off to who-knows-where, memory that you may or may not own and have the right to access. So, the strdup call will copy a string from the address held in p into some location you probably don't own.
If you had done a malloc first, and given q the results of the malloc, then q would hold a valid address, and your strdup would put the copy into memory that you did own (assuming you malloc'd enough space for the string -- a strlen on p would tell you how much you needed).
Then, when you returned q, you would be giving the caller the address as well. Any code with that address can see the string you put there. If some future code were to free that address, then what it holds is up in the air -- it could be anything at all.
So, you don't want to free q before you return the address that it holds -- you need to let the caller free the address it gets from you, when it is ready to do so.
In terms of when you malloc, yes, if you want to return an address that will remain viable after your function completes, you need to malloc it -- giving the caller the address of a local variable, for example, would be bad: the memory is freed when the function returns, you don't own it anymore.
Another typical use of malloc is for building up dynamic data structures like trees and lists -- you can't know how much memory you need up front, so you build the list or tree up as you need to, malloc'ing more memory for each node in the structure.
My personal rules are: use malloc() when an object is too big to put on the stack or/and when it has to live outside the scope of the current block. In your case, I believe, you should do something like the following:
char *strdup(const char *p)
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
malloc(X) creates space (of size X bytes) on the heap for you to play with. The data that you write to these X byes stays put when your function returns, as a result, you can read what your strdup() function wrote to that space on the heap.
free() is used for freeing space on the heap. You can pass a pointer to this function that you obtained only as a result of a malloc() or realloc() call. This function may not clear out the data, it just means that a subsequent call to malloc() may return the address of the same space that you just freed. I say "may" because these are all implementaion defined and should not be relied upon.
In the piece of code you wrote, the strcpy() function copies the bytes one by one from p to q until it finds a \0 at the location pointed to by q (and then it copies the \0 as well). In order to write data somewhere, you need to allocate space first for the data to be written, hence, one option is that you call malloc() to create some space and then write data there.
Well, calling free() is not mandatory as your OS will reclaim the space allocated by malloc() when you program ends, but as long as your program runs, you may be ocupying more space than you need to - and that's bad for your program, for other programs, for the OS and the universe as a whole.
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
No, your data won't be saved. In fact your pointer q is being used without allocating it's size can cause problems. Also, once this function execution complete, variable char* q will be destroyed.
You need to allocate memory to pointer q before copying data as suggested by #Michael's answer.But once you finish using the data return by this function, you will need to manually free() the memory you allocated or else it will cause memory leak (a situation where the memory is allocated but there is no pointer refer to that chunk of memory you allocated and hence will be inaccessible throughout program execution)
char *strdup(const char *p) // From #Michael's answer
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
void someFunction()
{
char* aDupString = strdup("Hello World!!!");
/*... somecode use aDupString */
free(aDupString); // If not freed, will cause memory leaks
}
When it is appropriate to use malloc? Is there any other situation where malloc is needed?
It is appropriate to use in following situations:
1> The usage size of array are unknown at compile time.
2> You need size flexibility. For example, your function need to work with small data size and large data size. (like Data structure such as Link list,Stacks, Queues, etc.)
As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true?
I think this one is partially true. depending on what you are trying to achive, there might be a way to get around using malloc though. For example, your strdup can also be rewrite in following way:
void strdup2(const char *p, char* strOut)
{
// malloc not require
strcpy(strOut, p);
}
void someFunction()
{
char aString[15] = "Hello World!!!";
char aDupStr[sizeof(aString)];
strdup2(aString, aDupStr);
// free() memory not required. but size is not dynamic.
}
Related
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
free(t);
return t;
};
int main(void) {
printf("%s\n", test());
return 0;
};
I would like to allocate and de-allocate memory inside the function. I tested this code and works, but I am wondering:
Why does this work?
Is it good practice to use the value of a freed pointer in main ?
Once you call free on a pointer, the memory it pointed to is no longer valid. Attempting to use a pointer to freed memory triggers undefined behavior. In this particular case it happened to work, but there's no guarantee of that.
If the function returns allocated memory, it is the responsibility of the caller to free it:
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
return t;
};
int main(void) {
char *t = test();
printf("%s\n", t);
free(t);
return 0;
};
malloc reserves memory for use.
free releases that reservation. In general, it does not make the memory go away, it does not change the contents of that memory, and it does not alter the value of the pointer that held the address.
After free(t), the bytes of t still contain the same bit settings they did before the free. Then return t; returns those bits to the caller.
When main passes those bits to printf, printf uses them as the address to get the characters for %s. Since nothing has changed them, they are printed.
That is why you got the behavior you did with this program. However, none of it is guaranteed. Once free was called with t, the memory reservation was gone. Something else in your program could have used that memory. For example, printf might have allocated a buffer for its own internal use, and that could have used the same memory.
For the most part, malloc and free are just methods of coordinating use of memory, so that different parts of your program do not try to use the same memory at the same time for different purposes. When you only have one part of your program using allocated memory, there are no other parts of your program to interfere with that. So the lack of coordination did not cause your program to fail. If you had multiple routines in your program using allocated memory, then attempting to use memory after it has been released is more likely to encounter problems.
Additionally, once the memory has been freed, the compiler may treat a pointer to it as if it has no fixed value. The return t; statement is not required to return any particular value.
It doesn't matter where do you free() a pointer. Once it is free()d, the pointer is not deferrenciable anymore (neither inside nor ouside the function where it was free()d)
The purpose of free() is to return the memory allocated with malloc() so the semantics are that, once you have freed a chunk of memory, it is not anymore usable.
In C, all parameters are passed by value, so free() cannot change the value expression you passed to it, and this is the reason the pointer is not changed into an invalid pointer value (like NULL) but you are advised that no more uses of the pointer can be done without incurring in Undefined Behaviour.
There could be a solution in the design of free() and it is to pass the pointer variable that holds the pointer by address, and so free() would be able to turn the pointer into a NULL. But this not only takes more work to do, but free() doesn't know how many copies you have made of the value malloc() gave to you... so it is impossible to know how many references you have over there to be nullified. That approach makes it impossible to give free() the responsibility of nullifying the reference to the returned memory.
So, if you think that free doesn't turn the pointer into NULL and for some strange reason you can still use the memory returned, don't do it anymore, because you'll be making mistakes.
You are adviced! :)
As my long title says: I am trying to return a pointer in c that has been dynamically allocated, I know, I have to free it, but I do not know how to myself, my search has showed that it can only be freed in main, but I cannot leave it up to the user to free the int.
My code looks like this right now,
int *toInt(BigInt *p)
{
int *integer = NULL;
integer = calloc(1, sizeof(int));
// do some stuff here to make integer become an int from a passed
// struct array of integers
return integer;
}
I've tried just making a temp variable and seeing the integer to that then freeing integer and returning the temp, but that hasn't worked. There must be a way to do this without freeing in main?
Program design-wise, you should always let the "module" (translation unit) that did the allocation be responsible for freeing the memory. Expecting some other module or the caller to free() memory is indeed bad design.
Unfortunately C does not have constructors/destructors (nor "RAII"), so this has to be handled with a separate function call. Conceptually you should design the program like this:
#include "my_type.h"
int main()
{
my_type* mt = my_type_alloc();
...
my_type_free(mt);
}
As for your specific case, there is no need for dynamic allocation. Simply leave allocation to the caller instead, and use a dedicated error type for reporting errors:
err_t toInt (const BigInt* p, int* integer)
{
if(bad_things())
return ERROR;
*integer = p->stuff();
return OK;
}
Where err_t is some custom error-handling type (likely enum).
Your particular code gains nothing useful from dynamic allocation, as #unwind already observed. You can save yourself considerable trouble by just avoiding it.
In a more general sense, you should imagine that with each block of allocated memory is associated an implicit obligation to free. There is no physical or electronic representation of that obligation, but you can imagine it as a virtual chit associated at any given time with at most one copy of the pointer to the space during the lifetime of the allocation. You can transfer the obligation between copies of the pointer value at will. If the pointer value with the obligation is ever lost through going out of scope or being modified then you have a leak, at least in principle; if you free the space via a copy of the pointer that does not at that time hold the obligation to free, then you have a (possibly virtual) double free.
I know I have to free it, but I do not know how to myself
A function that allocates memory and returns a copy of the pointer to it without making any other copies, such as your example, should be assumed to associate the obligation to free with the returned pointer value. It cannot free the allocated space itself, because that space must remain allocated after the function returns (else the returned pointer is worse than useless). If the obligation to free were not transferred to the returned pointer then a (virtual) memory leak would occur when the function's local variables go out of scope at its end, leaving no extant copy of the pointer having obligation to free.
I cannot leave it up to the user to free the int.
If you mean you cannot leave it up to the caller, then you are mistaken. Of course you can leave it up to the caller. If in fact the function allocates space and returns a pointer to it as you describe, then it must transfer the obligation to free to the caller along with the returned copy of the pointer to the allocated space. That's exactly what the calloc() function does in the first place. Other functions do similar, such as POSIX's strdup().
Because there is no physical or electronic representation of obligation to free, it is essential that your functions document any such obligations placed on the caller.
Just stop treating it as a pointer, there's no need for a single int.
Return it directly, and there will be no memory management issues since it's automatically allocated:
int toInt(const BigInt *p)
{
int x;
x = do some stuff;
return x;
}
The caller can just do
const int my_x = toInt(myBigInt);
and my_x will be automatically cleaned away when it does out of scope.
I am pretty new to C programming and I have several functions returning type char *
Say I declare char a[some_int];, and I fill it later on. When I attempt to return it at the end of the function, it will only return the char at the first index. One thing I noticed, however, is that it will return the entirety of a if I call any sort of function on it prior to returning it. For example, my function to check the size of a string (calling something along the lines of strLength(a);).
I'm very curious what the situation is with this exactly. Again, I'm new to C programming (as you probably can tell).
EDIT: Additionally, if you have any advice concerning the best method of returning this, please let me know. Thanks!
EDIT 2: For example:
I have char ret[my_strlen(a) + my_strlen(b)]; in which a and b are strings and my_strlen returns their length.
Then I loop through filling ret using ret[i] = a[i]; and incrementing.
When I call my function that prints an input string (as a test), it prints out how I want it, but when I do
return ret;
or even
char *ptr = ret;
return ptr;
it never supplies me with the full string, just the first char.
A way not working to return a chunk of char data is to return it in memory temporaryly allocated on the stack during the execution of your function and (most probably) already used for another purpose after it returned.
A working alternative would be to allocate the chunk of memory ont the heap. Make sure you read up about and understand the difference between stack and heap memory! The malloc() family of functions is your friend if you choose to return your data in a chunk of memory allocated on the heap (see man malloc).
char* a = (char*) malloc(some_int * sizeof(char)) should help in your case. Make sure you don't forget to free up memory once you don't need it any more.
char* ret = (char*) malloc((my_strlen(a) + my_strlen(b)) * sizeof(char)) for the second example given. Again don't forget to free once the memory isn't used any more.
As MByD correctly pointed out, it is not forbidden in general to use memory allocated on the stack to pass chunks of data in and out of functions. As long as the chunk is not allocated on the stack of the function returning this is also quite well.
In the scenario below function b will work on a chunk of memory allocated on the stackframe created, when function a entered and living until a returns. So everything will be pretty fine even though no memory allocated on the heap is involved.
void b(char input[]){
/* do something useful here */
}
void a(){
char buf[BUFFER_SIZE];
b(buf)
/* use data filled in by b here */
}
As still another option you may choose to leave memory allocation on the heap to the compiler, using a global variable. I'd count at least this option to the last resort category, as not handled properly, global variables are the main culprits in raising problems with reentrancy and multithreaded applications.
Happy hacking and good luck on your learning C mission.
I am working with my first straight C project, and it has been a while since I worked on C++ for that matter. So the whole memory management is a bit fuzzy.
I have a function that I created that will validate some input. In the simple sample below, it just ignores spaces:
int validate_input(const char *input_line, char** out_value){
int ret_val = 0; /*false*/
int length = strlen(input_line);
out_value =(char*) malloc(sizeof(char) * length + 1);
if (0 != length){
int number_found = 0;
for (int x = 0; x < length; x++){
if (input_line[x] != ' '){ /*ignore space*/
/*get the character*/
out_value[number_found] = input_line[x];
number_found++; /*increment counter*/
}
}
out_value[number_found + 1] = '\0';
ret_val = 1;
}
return ret_val;
}
Instead of allocating memory inside the function for out_value, should I do it before I call the function and always expect the caller to allocate memory before passing into the function? As a rule of thumb, should any memory allocated inside of a function be always freed before the function returns?
I follow two very simple rules which make my life easier.
1/ Allocate memory when you need it, as soon as you know what you need. This will allow you to capture out-of-memory errors before doing too much work.
2/ Every allocated block of memory has a responsibility property. It should be clear when responsibility passes through function interfaces, at which point responsibility for freeing that memory passes with the memory. This will guarantee that someone has a clearly specified requirement to free that memory.
In your particular case, you need to pass in a double char pointer if you want the value given back to the caller:
int validate_input (const char *input_line, char **out_value_ptr) {
: :
*out_value_ptr =(char*) malloc(length + 1); // sizeof(char) is always 1
: :
(*out_value_ptr)[number_found] = input_line[x];
: :
As long as you clearly state what's expected by the function, you could either allocate the memory in the caller or the function itself. I would prefer outside of the function since you know the size required.
But keep in mind you can allow for both options. In other words, if the function is passed a char** that points to NULL, have it allocate the memory. Otherwise it can assume the caller has done so:
if (*out_value_ptr == NULL)
*out_value_ptr =(char*) malloc(length + 1);
You should free that memory before the function returns in your above example. As a rule of thumb you free/delete allocated memory before the scope that the variable was defined in ends. In your case the scope is your function so you need to free it before your function ends. Failure to do this will result in leaked memory.
As for your other question I think it should be allocated going in to the function since we want to be able to use it outside of the function. You allocate some memory, you call your function, and then you free your memory. If you try and mix it up where allocation is done in the function, and freeing is done outside it gets confusing.
The idea of whether the function/module/object that allocates memory should free it is somewhat of a design decision. In your example, I (personal opinion here) think it is valid for the function to allocate it and leave it up to the caller to free. It makes it more usable.
If you do this, you need to declare the output parameter differently (either as a reference in C++ style or as char** in C style. As defined, the pointer will exist only locally and will be leaked.
A typical practice is to allocate memory outside for out_value and pass in the length of the block in octets to the function with the pointer. This allows the user to decide how they want to allocate that memory.
One example of this pattern is the recv function used in sockets:
ssize_t recv(int socket, void *buffer, size_t length, int flags);
Here are some guidelines for allocating memory:
Allocate only if necessary.
Huge objects should be dynamically
allocated. Most implementations
don't have enough local storage
(stack, global / program memory).
Set up ownership rules for the
allocated object. Owner should be
responsible for deleting.
Guidelines for deallocating memory:
Delete if allocated, don't delete
objects or variables that were not
dynamically allocated.
Delete when not in use any more.
See your object ownership rules.
Delete before program exits.
In this example you should be neither freeing or allocating memory for out_value. It is typed as a char*. Hence you cannot "return" the new memory to the caller of the function. In order to do that you need to take in a char**
In this particular scenario the buffer length is unknown before the caller makes the call. Additionally making the same call twice will produce different values since you are processing user input. So you can't take the approach of call once get the length and call the second time with the allocated buffer. Hence the best approach is for the function to allocate the memory and pass the responsibility of freeing onto the caller.
First, this code example you give is not ANSI C. It looks more like C++. There is not "<<" operator in C that works as an output stream to something called "cout."
The next issue is that if you do not free() within this function, you will leak memory. You passed in a char * but once you assign that value to the return value of malloc() (avoid casting the return value of malloc() in the C programming language) the variable no longer points to whatever memory address you passed in to the function. If you want to achieve that functionality, pass a pointer to a char pointer char **, you can think of this as passing the pointer by reference in C++ (if you want to use that sort of language in C, which I wouldn't).
Next, as to whether you should allocate/free before or after a function call depends on the role of the function. You might have a function whose job it is to allocate and initialize some data and then return it to the caller, in which case it should malloc() and the caller should free(). However, if you are just doing some processing with a couple of buffers like, you may tend to prefer the caller to allocate and deallocate. But for your case, since your "validate_input" function looks to be doing nothing more than copying a string without the space, you could just malloc() in the function and leave it to the caller. Although, since in this function, you simply allocate the same size as the whole input string, it almost seems as if you might as well have the caller to all of it. It all really depends on your usage.
Just make sure you do not lose pointers as you are doing in this example
Some rough guidelines to consider:
Prefer letting the caller allocate the memory. This lets it control how/where that memory is allocated. Calling malloc() directly in your code means your function is dictating a memory policy.
If there's no way to tell how much memory may be needed in advance, your function may need to handle the allocation.
In cases where your function does need to allocate, consider letting the caller pass in an allocator callback that it uses instead of calling malloc directly. This lets your function allocate when it needs and as much as it needs, but lets the caller control how and where that memory is allocated.
Say you have the following function:
char *getp()
{
char s[] = "hello";
return s;
}
Since the function is returning a pointer to a local variable in the function to be used outside, will it cause a memory leak?
P.S. I am still learning C so my question may be a bit naive...
[Update]
So, if say you want to return a new char[] array (ie maybe for a substring function), what do you return exactly? Should it be pointer to an external variable ? ie a char[] that is not local to the function?
It won't cause a memory leak. It'll cause a dangling reference. The local variable is allocated on the stack and will be freed as soon as it goes out of scope. As a result, when the function ends, the pointer you are returning no longer points to a memory you own. This is not a memory leak (memory leak is when you allocate some memory and don't free it).
[Update]:
To be able to return an array allocated in a function, you should allocate it outside stack (e.g. in the heap) like:
char *test() {
char* arr = malloc(100);
arr[0] = 'M';
return arr;
}
Now, if you don't free the memory in the calling function after you finished using it, you'll have a memory leak.
No, it wont leak, since its destroyed after getp() ends;
It will result in undefined behaviour, because now you have a pointer to a memory area that no longer holds what you think it does, and that can be reused by anyone.
A memory leak would happen if you stored that array on the heap, without executing a call to free().
char* getp(){
char* p = malloc(N);
//do stuff to p
return p;
}
int main(){
char* p = getp();
//free(p) No leak if this line is uncommented
return 0;
}
Here, p is not destroyed because its not in the stack, but in the heap. However, once the program ends, allocated memory has not been released, causing a memory leak ( even though its done once the process dies).
[UPDATE]
If you want to return a new c-string from a function, you have two options.
Store it in the heap (as the example
above or like this real example that returns a duplicated string);
Pass a buffer parameter
for example:
//doesnt exactly answer your update question, but probably a better idea.
size_t foo (const char* str, size_t strleng, char* newstr);
Here, you'd have to allocate memory somewhere for newstr (could be stack OR heap) before calling foo function. In this particular case, it would return the amount of characters in newstr.
It's not a memory leak because the memory is being release properly.
But it is a bug. You have a pointer to unallocated memory. It is called a dangling reference and is a common source of errors in C. The results are undefined. You wont see any problems until run-time when you try to use that pointer.
Auto variables are destroyed at the end of the function call; you can't return a pointer to them. What you're doing could be described as "returning a pointer to the block of memory that used to hold s, but now is unused (but might still have something in it, at least for now) and that will rapidly be filled with something else entirely."
It will not cause memory leak, but it will cause undefined behavior. This case is particularly dangerous because the pointer will point somewhere in the program's stack, and if you use it, you will be accessing random data. Such pointer, when written through, can also be used to compromise program security and make it execute arbitrary code.
No-one else has yet mentioned another way that you can make this construct valid: tell the compiler that you want the array "s" to have "static storage duration" (this means it lives for the life of the program, like a global variable). You do this with the keyword "static":
char *getp()
{
static char s[] = "hello";
return s;
}
Now, the downside of this is that there is now only one instance of s, shared between every invocation of the getp() function. With the function as you've written it, that won't matter. In more complicated cases, it might not do what you want.
PS: The usual kind of local variables have what's called "automatic storage duration", which means that a new instance of the variable is brought into existence when the function is called, and disappears when the function returns. There's a corresponding keyword "auto", but it's implied anyway if you don't use "static", so you almost never see it in real world code.
I've deleted my earlier answer after putting the code in a debugger and watching the disassembly and the memory window.
The code in the question is invalid and returns a reference to stack memory, which will be overwritten.
This slightly different version, however, returns a reference to fixed memory, and works fine:
char *getp()
{
char* s = "hello";
return s;
}
s is a stack variable - it's automatically de-referenced at the end of the function. However, your pointer won't be valid and will refer to an area of memory that could be overwritten at any point.