multiple calls to strdup() with the same lvalue - c

Throughout the programs I inherited from my predecessors, there are functions of the following format:
somefunc(some_type some_parameter, char ** msg)
In other words, the last parameter is a char **, which is used to return messages.
That is: somefunc() will "change" msg.
In some cases the changing in question is of the form:
sprintf(txt,"some text. Not fixed but with a format and variables etc");
LogWar("%s",txt); //call to some logging function that uses txt
*msg = strdup(txt);
I know that each call to strdup() should have a related call to free() to release the memory it allocated.
Since that memory is used to return something, it should obviously not be freed at the end of somefunc().
But then where?
If somefunc() is called multiple times with the same msg, then that pointer will move around, I presume. So the space allocated by the previous call will be lost, right?
Somewhere before the end of the program I should certainly free(*msg). (In this case *msg is the version that is used as parameter in the calls to somefunc().)
But I think that call would only release the last allocated memory, not the memory allocated in earlier calls to somefunc(), right?
So, am I correct in saying that somefunc() should look like this:
sprintf(txt,"some text. Not fixed like here, but actually with variables etc");
LogWar("%s",txt); //call to some logging function that uses txt
free(*msg); //free up the memory that was previously assigned to msg, since we will be re-allocating it immediately hereafter
*msg = strdup(txt);
So with a free() before the strdup().
Am I correct?

Yes, you're correct. Any old pointer returned from strdup() must be free()d before you overwrite it, or you will leak memory.
I'm sure you were simplifying for clarity, but I would of course vote for something like this:
const char * set_error(char **msg, const char *text)
{
    free(*msg);
    *msg = strdup(text);
    return *msg;
}
and then:
LogWar("%s",txt); //call to some logging function that uses txt
set_error(msg, txt);
See how I used encapsulation to make this pretty important sequence more well-defined, and even named?
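For what it's worth, here is a minimal sketch of how such a helper could be made complete. It assumes the caller's message pointer starts out as NULL (free(NULL) is a no-op, so the very first call is safe). The helper copy_text is a made-up name for this illustration; it stands in for strdup() so that the sketch also builds as strictly standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc()ed copy, or NULL. */
static char *copy_text(const char *text)
{
    size_t n = strlen(text) + 1;      /* include the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, text, n);
    return copy;
}

/* Replace *msg with a fresh copy of text, releasing the old copy first.
 * Returns the new message, or NULL if the allocation failed. */
const char *set_error(char **msg, const char *text)
{
    assert(msg != NULL);
    free(*msg);                       /* no-op when *msg is NULL */
    *msg = copy_text(text);
    return *msg;
}
```

A caller would declare char *msg = NULL; once, pass &msg to every call, and call free(msg) a single time after the last use.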

Related

Weird situation when returning char *

I am pretty new to C programming and I have several functions returning type char *
Say I declare char a[some_int];, and I fill it later on. When I attempt to return it at the end of the function, it will only return the char at the first index. One thing I noticed, however, is that it will return the entirety of a if I call any sort of function on it prior to returning it. For example, my function to check the size of a string (calling something along the lines of strLength(a);).
I'm very curious what the situation is with this exactly. Again, I'm new to C programming (as you probably can tell).
EDIT: Additionally, if you have any advice concerning the best method of returning this, please let me know. Thanks!
EDIT 2: For example:
I have char ret[my_strlen(a) + my_strlen(b)]; in which a and b are strings and my_strlen returns their length.
Then I loop through filling ret using ret[i] = a[i]; and incrementing.
When I call my function that prints an input string (as a test), it prints out how I want it, but when I do
return ret;
or even
char *ptr = ret;
return ptr;
it never supplies me with the full string, just the first char.
Returning a chunk of char data in memory that was temporarily allocated on the stack during the execution of your function does not work: once the function has returned, that memory has most probably already been reused for another purpose.
A working alternative would be to allocate the chunk of memory on the heap. Make sure you read up about and understand the difference between stack and heap memory! The malloc() family of functions is your friend if you choose to return your data in a chunk of memory allocated on the heap (see man malloc).
char* a = (char*) malloc(some_int * sizeof(char)) should help in your case. Make sure you don't forget to free the memory once you don't need it any more.
char* ret = (char*) malloc((my_strlen(a) + my_strlen(b) + 1) * sizeof(char)) for the second example given (the extra byte holds the terminating '\0'). Again, don't forget to free it once the memory isn't used any more.
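Putting that together for the concatenation case, a sketch could look like the following. The function name concat is invented here, and the standard strlen() stands in for the asker's my_strlen(); treat it as an illustration under those assumptions rather than the one right way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly malloc()ed string holding a followed by b,
 * or NULL if the allocation fails. The caller must free() the result. */
char *concat(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *ret = malloc(la + lb + 1);  /* +1 for the terminating '\0' */
    if (ret == NULL)
        return NULL;
    memcpy(ret, a, la);
    memcpy(ret + la, b, lb + 1);      /* also copies b's '\0' */
    return ret;
}
```

Because the memory comes from the heap, the pointer stays valid after concat() returns, until the caller passes it to free().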
As MByD correctly pointed out, it is not forbidden in general to use memory allocated on the stack to pass chunks of data in and out of functions. As long as the chunk is not allocated on the stack of the function that returns it, this is perfectly fine.
In the scenario below, function b works on a chunk of memory allocated in the stack frame that is created when function a is entered and that lives until a returns. So everything will be fine even though no heap memory is involved.
void b(char input[]){
/* do something useful here */
}
void a(){
char buf[BUFFER_SIZE];
b(buf);
/* use data filled in by b here */
}
As still another option you may choose to leave the memory allocation to the compiler by using a global variable (which has static storage duration and lives neither on the stack nor on the heap). I'd put this option in the last-resort category: if not handled properly, global variables are the main culprits behind problems with reentrancy and multithreaded applications.
Happy hacking and good luck on your learning C mission.

Return a string allocated with malloc?

I'm creating a function that returns a string. The size of the string is known at runtime, so I'm planning to use malloc(), but I don't want to give the user the responsibility for calling free() after using my function's return value.
How can this be achieved? How do other functions that return strings (char *) work (such as getcwd(), _getcwd(), GetLastError(), SDL_GetError())?
Your challenge is that something needs to release the resources (i.e. cause the free() to happen).
Normally, the caller frees the allocated memory either by calling free() directly (see how strdup users work, for instance), or by calling a function you provide that wraps free. You might, for instance, require callers to call a foo_destroy function. As another poster points out, you might choose to wrap that in an opaque struct, though that's not necessary, as having your own allocation and destroy functions is useful even without that (e.g. for resource tracking).
However, another way would be to use some form of clean-up function. For instance, when the string is allocated, you could attach it to a list of resources allocated in a pool, then simply free the pool when done. This is how apache2 works with its apr_pool structure. In general, you don't free() anything specifically under that model. See here and (easier to read) here.
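To make the pool idea a bit more concrete, here is a deliberately tiny sketch. It is not the APR API; the names struct pool, pool_strdup and pool_destroy are invented for this illustration, and a real pool would allocate in larger chunks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal pool: a linked list of blocks freed together. */
struct pool_node {
    struct pool_node *next;
    char data[];                      /* string bytes follow the header */
};

struct pool {
    struct pool_node *head;
};

/* Copy text into memory owned by the pool; returns NULL on failure. */
char *pool_strdup(struct pool *p, const char *text)
{
    assert(p != NULL && text != NULL);
    size_t n = strlen(text) + 1;
    struct pool_node *node = malloc(sizeof *node + n);
    if (node == NULL)
        return NULL;
    memcpy(node->data, text, n);
    node->next = p->head;             /* push onto the pool's list */
    p->head = node;
    return node->data;
}

/* Release every string handed out by the pool in one go. */
void pool_destroy(struct pool *p)
{
    struct pool_node *node = p->head;
    while (node != NULL) {
        struct pool_node *next = node->next;
        free(node);
        node = next;
    }
    p->head = NULL;
}
```

Users of the returned strings never call free() on them; they only need to make sure pool_destroy() runs once, after the last string is no longer needed.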
What you can't do in C (as there is no reference counting of malloc()d structures) is directly determine when the last 'reference' to an object goes out of scope and free it then. That's because you don't have references, you have pointers.
Lastly, you asked how existing functions return char * variables:
Some (like strdup, get_current_dir_name and getcwd under some circumstances) expect the caller to free.
Some (like strerror_r and getcwd under other circumstances) expect the caller to pass in a buffer of sufficient size.
Some do both: from the getcwd man page:
As an extension to the POSIX.1-2001 standard, Linux (libc4, libc5, glibc) getcwd() allocates the buffer dynamically
using malloc(3) if buf is NULL. In this case, the allocated buffer has the length size unless size is zero, when
buf is allocated as big as necessary. The caller should free(3) the returned buffer.
Some use an internal static buffer and are thus not reentrant / threadsafe (yuck - do not do this). See strerror and why strerror_r was invented.
Some only return pointers to constants (so reentrancy is fine), and no free is required.
Some (like libxml) require you to use a separate free function (xmlFree() in this case)
Some (like apr_palloc) rely on the pool technique above.
Many libraries force the user to deal with memory allocation. This is a good idea because every application has its own patterns of object lifetime and reuse. It's good for the library to make as few assumptions about its users as possible.
Say a user wants to call your library function like this:
for (a lot of iterations)
{
params = get_totally_different_params();
char *str = your_function(params);
do_something(str);
// now we're done with this str forever
}
If your library mallocs the string every time, it is wasting a lot of effort calling malloc, and possibly showing poor cache behavior if malloc picks a different block each time.
Depending on the specifics of your library, you might do something like this:
int output_size(/*params*/);
void func(/*params*/, char *destination);
where destination is required to be at least output_size(params) size, or you could do something like the socket recv API:
int func(/*params*/, char *destination, int destination_size);
where the return value is:
< destination_size: this is the number of bytes we actually used
== destination_size: there may be more bytes waiting to output
These patterns both perform well when called repeatedly, because the caller can reuse the same block of memory over and over without any allocations at all.
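As a rough sketch of the caller-supplied buffer pattern, something like the following could work. The function format_greeting is a made-up example, and it follows snprintf()'s return convention (the length the full output needs) rather than the recv-like convention described above, so take it as one possible variant.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: writes "Hello, <name>!" into dest.
 * Returns the length the full text needs (excluding '\0'), or a negative
 * value on error. A return value >= dest_size means the output was
 * truncated to fit, but it is still '\0'-terminated. */
int format_greeting(const char *name, char *dest, size_t dest_size)
{
    assert(name != NULL && dest != NULL);
    return snprintf(dest, dest_size, "Hello, %s!", name);
}
```

The caller can keep one buffer on its own stack and reuse it on every iteration of the loop, with no malloc() or free() involved at all.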
There is no way to do this in C. You have to either pass a parameter with size information, so that malloc() and free() can be called in the called function, or the calling function has to call free after malloc().
Many object oriented languages (eg. C++) handle memory in such a way as to do what you want to, but not C.
Edit
By size information as an argument, I mean something to let the called function know how many bytes of memory are owned by the pointer you are passing. This can be done by looking directly at the passed string if it has already been assigned a value, such as:
char test1[]="this is a test";
char *test2="this is a test";
when called like this:
readString(test1); // (or test2)
char * readString(char *abc)
{
int len = strlen(abc);
return abc;
}
Both of those arguments will result in len = 14
However if you create a non populated variable, such as:
char *test3;
And allocate the same amount of memory, but do not populate it, for example:
test3 = malloc(strlen("this is a test") +1);
There is no way for the called function to know how much memory has been allocated. Calling strlen() on the unpopulated buffer inside the first version of readString() reads uninitialized memory, so the value of len is meaningless (formally, the behavior is undefined). However, if you change the prototype of readString() to:
readString(char *abc, size_t sizeString);
then size information as an argument can be used to create memory:
void readString(char *abc, size_t sizeString)
{
char *in;
in = malloc(sizeString +1);
//do something with it
//then free it
free(in);
}
example call:
int main(void)
{
    size_t len;
    char *test3;
    len = strlen("this is a test") + 1; //allow for '\0'
    test3 = malloc(len); //allocated but not populated
    readString(test3, len);
    // more code
    free(test3);
    return 0;
}
You cannot do this in C.
Return a pointer, and it is up to the person calling the function to call free.
Alternatively, use C++ (shared_ptr etc.).
You can wrap it in an opaque struct.
Give the user access to pointers to your struct but not its internal. Create a function to release resources.
void release_resources(struct opaque *ptr);
Of course the user needs to call the function.
You could keep track of the allocated strings and free them in an atexit routine (http://www.tutorialspoint.com/c_standard_library/c_function_atexit.htm). In the following, I have used a global variable but it could be a simple array or list if you have one handy.
#include <stdlib.h>
#include <string.h>
char* freeme = NULL;
void AStringRelease(void)
{
    free(freeme); /* free(NULL) is a no-op */
}
char* AStringGet(void)
{
    if (freeme == NULL) {
        freeme = malloc(20);
        if (freeme == NULL)
            return NULL;
        atexit(AStringRelease); /* register the cleanup only once */
    }
    strcpy(freeme, "A String");
    return freeme;
}

Does char* cause memory leak if passed to a function

I have a function like this:
void readString(char* str){
str="asd";
}
Can I know if str will be dealloced? Or must I free it?
Note: I can't use string library as I am programming a microprocessor.
free() must only be called if malloc(), calloc() or realloc() was used to allocate the memory. This is not the case in the posted code, so calling free() is not required; in fact, it would be an error.
The "asd" is a string literal and exists for the lifetime of the program (has static storage duration).
Your function does nothing.
It doesn't "read" a string. All it does is assign the address of a string literal (a constant block of memory somewhere that is initialized to the text of the string) to the function's local variable str. The function then exits, causing that local variable to stop existing.
Nothing is returned, and the pointer is not de-referenced (which would in turn be wrong since it's only a char *, not a char * *), so nothing happens outside the function. The caller doesn't "get" any value, and thus has nothing to call free() on, so that problem can never even occur.
The string will not be deallocated because it is stored in static memory. You didn't allocate it, so you don't free it.
No, there is no memory leak. In your case it is statically allocated.
In general you have to make up your own rules about who can or must free memory, and you should document your code so it is clear what the requirements are.
In the example given, readString() only overwrites its own private copy of the pointer, and when it returns the caller will not see that anything has changed. Consequently the caller will have the same duty to free() its pointer as it had before it called readString(), and there will be no leak.
However, if readString() instead accepted a char **, so that it could modify the caller's copy of the pointer, then the outcome would be that it would not be legal to call free() after calling readString(), as the pointer's new value is not part of the malloc heap.
If the previous value of that pointer variable had been a malloc()ed object, then the caller should have freed it before allowing the pointer to be overwritten. It would be truly horrible to have readString() call free() in that case, because it would turn a variable which must eventually be freed into one which must never be freed, and the program flow would be very hard to follow.
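The difference between the two signatures can be sketched as follows. The function names are made up for this illustration, and const has been added because both pointers end up referring to a string literal.

```c
#include <assert.h>
#include <stddef.h>

/* Only changes the function's own copy of the pointer;
 * the caller never sees the assignment. */
void read_string_by_value(const char *str)
{
    str = "asd";
    (void)str;                        /* silence unused-value warnings */
}

/* Changes the caller's pointer. The new target is a string literal,
 * so the caller must not pass it to free(). */
void read_string_by_pointer(const char **str)
{
    assert(str != NULL);
    *str = "asd";
}
```

With the second form, a caller whose pointer previously held a malloc()ed block has to free that block before the call, or it is lost.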
This code is useless and meaningless as far as I am concerned. Here are different ways of calling your function definition and why I say this!
int main (int argc, char *argv[], char *envp[])
{
char a, *b, *c;
b = malloc (10);
readString(&a); // Case-1, Valid calling.
readString(b); // Case-2, Valid calling.
readString(c); // Case-3, Invalid calling. Unallocated location.
}
Case-1: This is the only case where what you do in your function matters to the caller. You may use the passed character as you wish, and the only meaningful assignment would be something like the one below. Note that 'str = "asd";' by itself only changes the function's local pointer, but copying "asd" through str (for example with strcpy) would overrun the single char and mess with the caller's stack or data segment memory (if the address of a global variable was passed), creating a complicated debugging nightmare!
void readString(char* str){
*str='a';
}
Case-2: There is nothing fatal here and no syntax error, but it is meaningless to do this. The only meaningful thing would be to use whatever the caller passed to your function. What is the reason for assigning to the passed parameter like this? Your definition can just have a local variable and avoid parameter passing completely. That function can be called as "readString();"...
void readString(void){
    const char *str = "asd";
}

Efficient/easy way to guard for NULL after conversion from "char x[n]" to "char* x"?

I have recently made changes in some code that makes a char name field dynamic.
So it was originally like
struct boo
{
char name[100];
...
}
and I have changed it to
struct boo
{
char *name;
...
}
so that name is dynamically allocated with only the amount of memory actually needed to store each name.
Anyway.. the result of this change will require me to add if(boo->name) null pointer check in about 1000 places in the code.
So just wondering is there any smart or efficient way (reduce programmer development time) of doing this null pointer check.
It will be far easier to ensure that the buffer is allocated when the structure is created rather than checking it wherever the structure is used. Don't ever let it be NULL in the first place!
If you need a pointer value to place in the structure before you have the relevant data, you can keep a global empty string to use specifically for this task. Compare to this pointer before trying to free the memory.
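A sketch of that empty-string idea might look like this. struct boo is reduced to the name field here, and the helper names boo_init, boo_clear_name and boo_set_name are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared placeholder so that name is never NULL. Never free() it. */
static char empty_name[] = "";

struct boo {
    char *name;
};

void boo_init(struct boo *b)
{
    assert(b != NULL);
    b->name = empty_name;
}

/* Release the name only if it was dynamically allocated. */
void boo_clear_name(struct boo *b)
{
    if (b->name != empty_name)
        free(b->name);
    b->name = empty_name;
}

/* Returns 0 on success, -1 if allocation fails (the old name is kept). */
int boo_set_name(struct boo *b, const char *name)
{
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, n);
    boo_clear_name(b);
    b->name = copy;
    return 0;
}
```

As long as every struct boo goes through boo_init(), the existing code that reads boo->name keeps working without any NULL checks.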
If this is C++ and not C, seriously consider using a std::string instead of a pointer.
if (name) works, but there is always the problem that your pointer may not be initialized to NULL to start with.
If you are dynamically allocating your structs, to make sure this happens, do:
mystruct *foo = calloc(1, sizeof *foo);
calloc zeroes the memory.
EDIT:
In addition, if you only want to check for name in debug builds, you can do:
assert(name);
This will abort the program right at that line if name is NULL, but it compiles to nothing in "release" builds where NDEBUG is defined.
Your real problem is checking that the allocation succeeded when you call malloc.
Some people like to use the xmalloc wrapper to malloc from libiberty library:
— Replacement: void* xmalloc (size_t)
Allocate memory without fail. If malloc fails, this will print a message to stderr (using the name set by xmalloc_set_program_name, if any) and then call xexit. Note that it is therefore safe for a program to contain #define malloc xmalloc in its source.
http://gcc.gnu.org/onlinedocs/libiberty/Functions.html
You can also easily write your own xmalloc function:
#include <stdio.h>
#include <stdlib.h>
void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (!p) {
        fprintf(stderr, "Error: allocation failure\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

Returning a pointer to an automatic variable

Say you have the following function:
char *getp()
{
char s[] = "hello";
return s;
}
Since the function is returning a pointer to a local variable in the function to be used outside, will it cause a memory leak?
P.S. I am still learning C so my question may be a bit naive...
[Update]
So, if say you want to return a new char[] array (ie maybe for a substring function), what do you return exactly? Should it be pointer to an external variable ? ie a char[] that is not local to the function?
It won't cause a memory leak. It'll cause a dangling reference. The local variable is allocated on the stack and will be freed as soon as it goes out of scope. As a result, when the function ends, the pointer you are returning no longer points to a memory you own. This is not a memory leak (memory leak is when you allocate some memory and don't free it).
[Update]:
To be able to return an array allocated in a function, you should allocate it outside stack (e.g. in the heap) like:
char *test() {
char* arr = malloc(100);
arr[0] = 'M';
return arr;
}
Now, if you don't free the memory in the calling function after you finished using it, you'll have a memory leak.
No, it won't leak, since it's destroyed after getp() ends.
It will result in undefined behaviour, because now you have a pointer to a memory area that no longer holds what you think it does, and that can be reused by anyone.
A memory leak would happen if you stored that array on the heap, without executing a call to free().
char* getp(){
char* p = malloc(N);
//do stuff to p
return p;
}
int main(){
char* p = getp();
//free(p) No leak if this line is uncommented
return 0;
}
Here, the memory p points to is not destroyed, because it's not on the stack but in the heap. However, when the program ends the allocated memory has not been released, which counts as a memory leak (even though the operating system reclaims it once the process dies).
[UPDATE]
If you want to return a new c-string from a function, you have two options.
Store it in the heap (as in the example above, or like this real example that returns a duplicated string);
Pass a buffer parameter
for example:
//doesnt exactly answer your update question, but probably a better idea.
size_t foo (const char* str, size_t strleng, char* newstr);
Here, you'd have to allocate memory somewhere for newstr (could be stack OR heap) before calling foo function. In this particular case, it would return the amount of characters in newstr.
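For the substring case from the update, the buffer-parameter approach could be sketched like this. The name substring and its exact parameters are assumptions made for the example; unlike the prototype above, it also takes the size of the output buffer so it can avoid overrunning it.

```c
#include <assert.h>
#include <string.h>

/* Copies up to count characters of str, starting at index start, into
 * newstr, which must have room for newstr_size bytes. The result is
 * always '\0'-terminated. Returns the number of characters written. */
size_t substring(const char *str, size_t start, size_t count,
                 char *newstr, size_t newstr_size)
{
    assert(str != NULL && newstr != NULL && newstr_size > 0);
    size_t len = strlen(str);
    if (start > len)
        start = len;                  /* nothing left to copy */
    if (count > len - start)
        count = len - start;          /* clamp to the end of str */
    if (count > newstr_size - 1)
        count = newstr_size - 1;      /* leave room for '\0' */
    memcpy(newstr, str + start, count);
    newstr[count] = '\0';
    return count;
}
```

The caller decides where newstr lives, for example a local array such as char out[8]; nothing is returned that could dangle or leak.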
It's not a memory leak because the memory is being released properly.
But it is a bug. You have a pointer to unallocated memory. It is called a dangling reference and is a common source of errors in C. The results are undefined. You won't see any problems until run-time when you try to use that pointer.
Auto variables are destroyed at the end of the function call; you can't return a pointer to them. What you're doing could be described as "returning a pointer to the block of memory that used to hold s, but now is unused (but might still have something in it, at least for now) and that will rapidly be filled with something else entirely."
It will not cause memory leak, but it will cause undefined behavior. This case is particularly dangerous because the pointer will point somewhere in the program's stack, and if you use it, you will be accessing random data. Such pointer, when written through, can also be used to compromise program security and make it execute arbitrary code.
No-one else has yet mentioned another way that you can make this construct valid: tell the compiler that you want the array "s" to have "static storage duration" (this means it lives for the life of the program, like a global variable). You do this with the keyword "static":
char *getp()
{
static char s[] = "hello";
return s;
}
Now, the downside of this is that there is now only one instance of s, shared between every invocation of the getp() function. With the function as you've written it, that won't matter. In more complicated cases, it might not do what you want.
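To illustrate the kind of more complicated case where sharing one instance bites, here is a small sketch. The function greet is a made-up example that formats into a static buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: formats a greeting into a buffer with static
 * storage duration. Every call returns the same address and overwrites
 * whatever the previous call left there. */
const char *greet(const char *name)
{
    static char buf[32];
    assert(name != NULL);
    snprintf(buf, sizeof buf, "hi, %s", name);
    return buf;
}
```

If a caller keeps the pointer from greet("ann") and then calls greet("bob"), both pointers are equal and both now read "hi, bob". The same sharing is what makes such functions unsafe to call from several threads at once.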
PS: The usual kind of local variables have what's called "automatic storage duration", which means that a new instance of the variable is brought into existence when the function is called, and disappears when the function returns. There's a corresponding keyword "auto", but it's implied anyway if you don't use "static", so you almost never see it in real world code.
I've deleted my earlier answer after putting the code in a debugger and watching the disassembly and the memory window.
The code in the question is invalid and returns a reference to stack memory, which will be overwritten.
This slightly different version, however, returns a reference to fixed memory, and works fine:
char *getp()
{
char* s = "hello";
return s;
}
s is a stack variable, so it is automatically destroyed at the end of the function. As a result, your pointer won't be valid and will refer to an area of memory that could be overwritten at any point.

Resources