I am trying to understand how to fully free memory after calls to strtok(). I read most of the answered questions here and none seemed to address the point of my confusion. If this is a duplicate, feel free to point me to something that answers my question.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char * aliteral = "Hello/world/fine/you";
char * allocatedstring;
char * token;
int i=0;
allocatedstring=(char *) malloc(sizeof(allocatedstring)*21);
allocatedstring=strcpy(allocatedstring,aliteral);
token = strtok(allocatedstring, "/");
token = strtok(NULL, "/");
token = strtok(NULL, "/");
token = strtok(NULL, "/");
printf("%s\n",allocatedstring);
printf("%s\n",token);
free(allocatedstring);
return 0;
}
Freeing allocatedstring here only frees up the string up to the first \0 character that replaced strtok's delimiter. So it only clears up until "Hello". I checked that using eclipse debugger and monitoring the memory addresses.
How do I clear the rest of it? I tried two things: having an extra pointer point to the start of allocatedstring and freeing that (didn't work), and freeing token after the calls to strtok() (didn't work either).
So how do I clean up the parts of allocatedstring that are now between \0 's ?
EDIT : To clarify, seeing the memory address blocks in eclipse debugger, I was seeing the string "HELLO WORLD FINE YOU" in the memory blocks that were initially allocated by the call to malloc. After the call to free(), the blocks containing "HELLO" and the first \0 turned to gibberish, but the rest of the blocks kept the characters "FINE YOU". I assumed that meant that they were not freed.
free has no knowledge of \0 terminated strings.
free will free what was allocated with malloc, and should work properly in this situation.
If your evidence that free is not working is simply that the string data still exists, then you misunderstand free.
It does not zero out the memory. It simply marks it as available for use.
The original data remains in that memory. But the memory may be allocated by the next caller of malloc, and that caller will be able to overwrite your data at will, because you don't own that data anymore!
If you want that memory cleared (such as, if it contains a password or a security key), you must clear it out with something like memset, before you call free.
Again, free only marks the memory as "unallocated, available for use by malloc", and does not clear out the contents.
PS Some debugging environments, such as Visual Studio's debug runtime, will overwrite freed data, but only to make it obvious in the debugger that it has been freed. That behavior is not required by the C standard and only aids in debugging. Typically, freed memory may be filled with a recognizable pattern such as 0xdeadbeef.
You have a minor issue on this line:
allocatedstring=(char *) malloc(sizeof(allocatedstring)*21);
sizeof(allocatedstring) is equal to sizeof (char *); you're allocating enough space for 21 pointers to char, not 21 characters. This isn't a problem, since a char * is going to be at least as large as a char, but indicates some confusion about what types you're dealing with.
That said, you don't have to worry about the size of allocatedstring or *allocatedstring; since you're allocating enough space to hold the literal, you can do the following:
allocatedstring = malloc( strlen( aliteral ) + 1 ); // note no cast
As for the behavior you're seeing...
free is releasing all the memory associated with allocatedstring; the fact that part of the memory hadn't yet been overwritten when you checked isn't surprising, because free doesn't affect the contents of that memory; it will contain whatever was last written to it until something else allocates and uses it.
A little more than 20 years ago I had some grasp of writing something small in C , but even at that time, I probably didn't really do things right all the time. Now I'm trying to learn C again, so I'm really a newbie.
Based on this article:
Using realloc to shrink the allocated memory
, I made this test, which works, but troubles me:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int test (char *param) {
char *s = malloc(strlen(param));
strcpy(s, param);
printf("original string : [%4d] %s \n", strlen(s), s);
// reduce size
char *tmp = realloc(s, 5);
if (tmp == NULL) {
printf("Failed\n");
free(s);
exit(1);
} else {
tmp[4] = 0;
}
s = tmp;
printf("the reduced string : [%4d] %s\n", strlen(s), s );
free(s);
}
void main(void){
test("This is a string with a certain length!");
}
If I leave out "tmp[4] = 0", then I still get back the whole string. Does this mean the rest of the string is still in memory, but not allocated anymore?
how does c free memory anyway? Does it keep track of memory by itself or is it something that is handled by the OS?
I free the s string "free(s)", do I also need to free the tmp str (it does point to the same memory block, yet the (same) address it holds is probably stored on another memory location?
These are most likely just basics, but none of what I have read so far has given me a clear answer (including mentioned article).
If I leave out "tmp[4] = 0", then I still get back the whole string.
You've invoked undefined behavior. All the string operations require the argument to be a null-terminated array of characters. If you reduce the size of the allocation so it doesn't include the null terminator, you're accessing outside the allocation when it tries to find it.
Does this mean the rest of the string is still in memory, but not allocated anymore?
In practice, many implementations don't actually re-allocate anything when you shrink the size. They simply update the bookkeeping information to say that the allocated length is shorter, and return the original pointer. So the remainder of the string stays the same unless you do another allocation that happens to use that memory.
This can even happen when you grow the size. Some designs always allocate memory in specific granularities (e.g. powers of 2), so if you grow the allocation but it doesn't exceed the granularity, it doesn't need to copy the data.
how does c free memory anyway? Does it keep track of memory by itself or is it something that is handled by the OS?
Heap management is part of the C runtime library. It can use a variety of strategies.
I free the s string "free(s)", do I also need to free the tmp str (it does point to the same memory block, yet the (same) address it holds is probably stored on another memory location?
After s = tmp;, both s and tmp point to the same allocated memory block. You only need to free one of them.
BTW, the initial allocation should be:
char *s = malloc(strlen(param)+1);
You need to add 1 for the null terminator, since strlen() doesn't count this.
I don't understand how dynamically allocated strings in C work. Below, I have an example where I think I have created a pointer to a string and allocated it 0 memory, but I'm still able to give it characters. I'm clearly doing something wrong, but what?
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
char *str = malloc(0);
int i;
str[i++] = 'a';
str[i++] = 'b';
str[i++] = '\0';
printf("%s\n", str);
return 0;
}
What you're doing is undefined behavior. It might appear to work now, but is not required to work, and may break if anything changes.
malloc normally returns a block of memory of the given size that you can use. In your case, it just so happens that there's valid memory outside of that block that you're touching. That memory is not supposed to be touched; malloc might use that memory for internal housekeeping, it might give that memory as the result of some malloc call, or something else entirely. Whatever it is, it isn't yours, and touching it produces undefined behavior.
Section 7.20.3 of the C99 standard (7.22.3 in C11) states in part:
"If the size of the space requested is zero, the behavior is
implementation defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
This is implementation-defined: malloc(0) returns either a null pointer or, as mentioned, a pointer that must not be dereferenced.
You are overwriting non-allocated memory. This might look like it works, but you are in trouble when you call free, where the heap function tries to give the memory block back.
In many implementations each chunk returned by malloc() has a header and sometimes a trailer. These structures hold at least the size of the allocated memory. Sometimes there are additional guards. You are overwriting these internal heap structures. That's why free() may complain or crash.
So you have an undefined behavior.
By doing malloc(0) you are creating a NULL pointer or a unique pointer that can be passed to free. Nothing wrong with that line. The problem lies when you perform pointer arithmetic and assign values to memory you have not allocated. Hence:
str[i++] = 'a'; // Invalid (undefined).
str[i++] = 'b'; // Invalid (undefined).
str[i++] = '\0'; // Invalid (undefined).
printf("%s\n", str); // Valid call, but the result is undefined.
It's always good to do two things:
Do not malloc 0 bytes.
Check to ensure the block of memory you malloced is valid.
... to check to see if a block of memory requested from malloc is valid, do the following:
if ( str == NULL ) exit( EXIT_FAILURE );
... after your call to malloc.
Your malloc(0) is wrong. As other people have pointed out that may or may not end up allocating a bit of memory, but regardless of what malloc actually does with 0 you should in this trivial example allocate at least 3*sizeof(char) bytes of memory.
So here we have a right nuisance. Say you allocated 20 bytes for your string, and then filled it with 19 characters and a null, thus filling the memory. So far so good. However, consider the case where you then want to add more characters to the string; you can't just put them in place because you had allocated only 20 bytes and you had already used them. All you can do is allocate a whole new buffer (say, 40 bytes), copy the original 19 characters into it, then add the new characters on the end and then free the original 20 bytes. Sounds inefficient, doesn't it? And it is inefficient: it's a whole lot of work to allocate memory, and it sounds like an especially large amount of work compared to other languages (e.g. C++) where you just concatenate strings with nothing more than str1 + str2.
Except that underneath the hood those languages are having to do exactly the same thing of allocating more memory and copying existing data. If one cares about high performance C makes it clearer where you are spending time, whereas the likes of C++, Java, C# hide the costly operations from you behind convenient-to-use classes. Those classes can be quite clever (eg allocating more memory than strictly necessary just in case), but you do have to be on the ball if you're interested in extracting the very best performance from your hardware.
This sort of problem is what lies behind the difficulties that operations like Facebook and Twitter had in growing their services. Sooner or later those convenient but inefficient class methods add up to something unsustainable.
I am just learning C (reading Sam's Teach Yourself C in 24 hours). I've gotten through pointers and memory allocation, but now I'm wondering about them inside a structure.
I wrote the little program below to play around, but I'm not sure if it is OK or not. It compiled on a Linux system with gcc and the -Wall flag with nothing amiss, but I'm not sure that is 100% trustworthy.
Is it ok to change the allocation size of a pointer as I have done below or am I possibly stepping on adjacent memory? I did a little before/after variable in the structure to try to check this, but don't know if that works and if structure elements are stored contiguously in memory (I'm guessing so since a pointer to a structure can be passed to a function and the structure manipulated via the pointer location). Also, how can I access the contents of the pointer location and iterate through it so I can make sure nothing got overwritten if it is contiguous? I guess one thing I'm asking is how can I debug messing with memory this way to know it isn't breaking anything?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct hello {
char *before;
char *message;
char *after;
};
int main (){
struct hello there= {
"Before",
"Hello",
"After",
};
printf("%ld\n", strlen(there.message));
printf("%s\n", there.message);
printf("%d\n", sizeof(there));
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
printf("%ld\n", strlen(there.message));
printf("%s\n", there.message);
printf("%s %s\n", there.before, there.after);
printf("%d\n", sizeof(there));
return 0;
}
I'm thinking something is not right because the size of my there didn't change.
Not really ok, you have a memory leak, you could use valgrind to detect it at runtime (on Linux).
You are coding:
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
The first assignment calls malloc(3). When calling malloc you should always test whether it failed, even though it usually succeeds. So at least write:
there.message = malloc(20 * sizeof(char));
if (!there.message)
{ perror("malloc of 20 failed"); exit (EXIT_FAILURE); }
The second assignment puts the address of the constant string literal "Hello, there!" into the same pointer there.message, so you have lost the first value. You probably want to copy that constant string instead:
strncpy (there.message, "Hello, there!", 20*sizeof(char));
(you could use just strcpy(3) but beware of buffer overflows)
You could get a fresh copy (in heap) of some string using strdup(3) (and GNU libc has also asprintf(3) ...)
there.message = strdup("Hello, There");
if (!there.message)
{ perror("strdup failed"); exit (EXIT_FAILURE); };
Finally, it is good practice to free heap memory before the program ends.
(The operating system reclaims the process's address space at _exit(2) time anyway.)
Read more about C programming, memory management, and garbage collection. Perhaps consider using Boehm's conservative GC.
A C pointer is just a memory address; it does not record the size of the zone it points to. Applications need to keep track of that size themselves.
PS. manual memory management in C is tricky, even for seasoned veteran programmers.
there.message = "Hello, there!" does not copy the string into the buffer. It sets the pointer to a new (generally static) buffer holding the string "Hello, there!". Thus, the code as written has a memory leak (allocated memory that never gets freed until the program exits).
But, yes, the malloc is fine in its own right. You'd generally use a strncpy, sprintf, or similar function to copy content into the buffer thus allocated.
Is it ok to change the allocation size of a pointer [...] ?
Huh? What do you mean by "changing the allocation size of a pointer"? Currently all your code does is leaking the 20 bytes you malloc()ated by assigning a different address to the pointer.
I have yet again a question about the workings of C. (ANSI-C compiled by VS2012)
I am refactoring a standalone program (.exe) into a .dll. This works fine so far, but I stumble across problems when it comes to logging. Let me explain:
The original program - while running - wrote a log-file and printed information to the screen. Since my dll is going to run on a webserver, accessed by many people simultaneously there is
no real chance to handle log-files properly (and clean up after them)
no console-window anyone would see
So my goal is to write everything that would be put in the log-file or on the screen into string-like variables (I know that there are no strings in C) which I can then pass on request to the caller (also a dll, but written in C#).
Since in C such a thing is not possible:
char z88rlog;
z88rlog="First log-entry\n";
z88rlog+="Second log-entry\n";
I have two possibilities:
char z88rlog[REALLY_HUGE];
dynamically allocating memory
In my mind the first way is to be ignored because:
The potential waste of memory is rather enormous
I still may need more memory than REALLY_HUGE, thus creating a buffer overflow
which leaves me with the second way. I have done some work on that and came up with two solutions, either of which doesn't work properly.
/* Solution 1 */
void logpr(char* tmpstr)
{
extern char *z88rlog;
if (z88rlog==NULL)
{
z88rlog=malloc(strlen(tmpstr)+1);
strcpy(z88rlog,tmpstr);
}
else
{
z88rlog=realloc(z88rlog,strlen(z88rlog)+strlen(tmpstr));
z88rlog=strcat(z88rlog,tmpstr);
}
}
In solution 1 (and likewise in solution 2, as you will see) I pass my new log entry through char tmpstr[255];. My "log-file" z88rlog is declared globally, so I need extern to access it. I then check whether memory has been allocated for z88rlog. If not, I allocate memory the size of my log entry (+1 for my \0) and copy the contents of tmpstr into z88rlog. If so, I realloc memory for z88rlog to the size it had plus the length of tmpstr (+1). Then the two "strings" are joined using strcat. Using breakpoints and the Immediate window I obtained the following output:
z88rlog
0x00000000 <Bad Ptr>
z88rlog
0x0059ef80 "start Z88R version 14OS"
z88rlog
0x0059ef80 "start Z88R version 14OS
opening file Z88.DYNÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍýýýý««««««««þîþîþîþ"
It shows three consecutive calls of logpr (breakpoint before strcpy/strcat). The unreadable gibberish at the end comes from the memory allocation. After that, VS reports that something caused the debugger to set a breakpoint in realloc.c. Because this obviously doesn't work, I concocted my wonderful solution 2:
/* Solution 2 */
void logpr(char* tmpstr)
{
extern char *z88rlog;
char *z88rlogtmp;
if (z88rlog==NULL)
{
z88rlog=malloc(strlen(tmpstr)+1);
strcpy(z88rlog,tmpstr);
}
else
{
z88rlogtmp=malloc(strlen(z88rlog)+strlen(tmpstr+1));
z88rlogtmp=strcat(z88rlog,tmpstr);
free(z88rlog);
z88rlog=malloc(strlen(z88rlogtmp)+1);
memcpy(z88rlog,z88rlogtmp,strlen(z88rlogtmp)+1);
free(z88rlogtmp);
}
}
Here my aim is to create a copy of my log-file, free the original's memory, allocate new memory for the original in the new size, and copy the contents back. And not to forget: free the temporary copy, since it's allocated via malloc. This crashes instantly when it reaches free, again telling me that the heap might be broken.
So let's comment out free for the time being. This does work better, much to my relief, but while building the log-string suddenly not all characters from z88rlogtmp get copied. But everything still works kind of properly. Until suddenly I am told again that the heap might be broken and the debugger puts a breakpoint at the end of _heap_alloc (size_t size) in malloc.c; size has, according to the debugger, the value 1041.
So I have 2 (or 3) ways I want to achieve this "string-growing" but none works. Might the error giving me the size point me to the conclusion that the array has become too big? I hope I explained well what I want to do and someone can help me :-) Thanks in advance!
irony on Maybe I should just go and buy some new heap for the computer. Does it fit in RAM slots? Can anyone recommend a good brand? irony off
This is one mistake in Solution 1:
z88rlog=realloc(z88rlog,strlen(z88rlog)+strlen(tmpstr));
as no space is allocated for the terminating null character. Note that you must store the result of realloc() to a temporary variable to avoid memory leak in the event of failure. To correct:
char* tmp = realloc(z88rlog, strlen(z88rlog) + strlen(tmpstr) + 1);
if (tmp)
{
z88rlog = tmp;
/* ... */
}
Mistakes in Solution 2:
z88rlogtmp=malloc(strlen(z88rlog)+strlen(tmpstr+1));
/*^^^^^^^^^*/
This calculates one less than the length of tmpstr. To correct:
z88rlogtmp=malloc(strlen(z88rlog) + strlen(tmpstr) + 1);
Pointer reassignment resulting in undefined behaviour:
z88rlogtmp=strcat(z88rlog,tmpstr);
/* Now, 'z88rlogtmp' and 'z88rlog' point to the same memory. */
free(z88rlog);
/* 'z88rlogtmp' now points to deallocated memory. */
z88rlog=malloc(strlen(z88rlogtmp)+1);
/* This call ^^^^^^^^^^^^^^^^^^ is undefined behaviour,
and from this point on anything can happen. */
memcpy(z88rlog,z88rlogtmp,strlen(z88rlogtmp)+1);
free(z88rlogtmp);
Additionally, if the code is executing within a Web Server it is almost certainly operating in a multi-threaded environment. As you have a global variable it will need synchronized access.
You seem to have many problems. To start with, in your realloc call you don't allocate space for the terminating '\0' character. In your second solution you have strlen(tmpstr+1), which isn't correct. In your second solution you also use strcat to append to the existing buffer z88rlog, and if it's not big enough you overwrite unallocated memory, or data allocated for something else. The first argument to strcat is the destination, and that is what the function returns as well, so you lose the newly allocated memory too.
The first solution, with realloc, should work fine, if you just remember to allocate that extra character.
In solution 1, you need to allocate space for the terminating null character. Hence, the realloc should include one more byte, i.e.
z88rlog=realloc(z88rlog,strlen(z88rlog)+strlen(tmpstr) + 1);
In the second solution, I am not sure about z88rlogtmp=strcat(z88rlog,tmpstr); because z88rlog is the destination string. If you wish to use malloc only, then:
z88rlogtmp=malloc(strlen(z88rlog)+1); // Allocate a temporary string
strcpy(z88rlogtmp,z88rlog); // Make a copy
free(z88rlog); // Free current string
z88rlog=malloc(strlen(z88rlogtmp)+strlen(tmpstr)+1); // Re-allocate memory
strcpy(z88rlog, z88rlogtmp); // Copy first string
strcat(z88rlog, tmpstr); // Concatenate the next string
free(z88rlogtmp); // Free the temporary string
Is there anything I should know about using strtok on a malloced string?
In my code I have (in general terms)
char* line=getline();
Parse(dest,line);
free(line);
where getline() is a function that returns a char * to some malloced memory.
and Parse(dest, line) is a function that does parsing online, storing the results in dest, (which has been partially filled earlier, from other information).
Parse() calls strtok() a variable number of times on line, and does some validation.
Each token (a pointer to what is returned by strtok()) is put into a queue 'til I know how many I have.
They are then copied onto a malloc'd char** in dest.
Now free(line)
and a function that free's each part of the char*[] in dest, both come up on valgrind as:
"Address 0x5179450 is 8 bytes inside a block of size 38 free'd"
or something similar.
I'm considering refactoring my code not to directly store the tokens on the char** but instead store a copy of them (by mallocing space == to strlen(token)+1, then using strcpy()).
There is a function strdup which allocates memory and then copies another string into it.
You ask:
Is there anything I should know about
using strtok on a malloced string?
There are a number of things to be aware of. First, strtok() modifies the string as it processes it, inserting nulls ('\0') where the delimiters are found. This is not a problem with allocated memory (that's modifiable!); it is a problem if you try passing a constant string to strtok().
Second, you must have as many calls to free() as you do to malloc() and calloc() (but realloc() can mess with the counting).
In my code I have (in general terms)
char* line=getline();
Parse(dest,line);
free(line);
Unless Parse() allocates the space it keeps, you cannot use the dest structure (or, more precisely, the pointers into the line within the dest structure) after the call to free(). The free() releases the space that was allocated by getline() and any use of the pointers after that yields undefined behaviour. Note that undefined behaviour includes the option of 'appearing to work, but only by coincidence'.
where getline() is a function that
returns a char * to some malloced
memory, and Parse(dest, line) is a
function that does parsing online,
storing the results in dest (which
has been partially filled earlier,
from other information).
Parse() calls strtok() a variable
number of times on line, and does some
validation. Each token (a pointer to
what is returned by strtok()) is put
into a queue 'til I know how many I
have.
Note that the pointers returned by strtok() are all pointers into the single chunk of space allocated by getline(). You have not described any extra memory allocation.
They are then copied onto a malloc'd
char** in dest.
This sounds as if you copy the pointers from strtok() into an array of pointers, but you do not attend to copying the data that those pointers are pointing at.
Now free(line) and a function that
free's each part of the char*[] in
dest,
both come up on valgrind as:
"Address 0x5179450 is 8 bytes inside a block of size 38 free'd"
or something similar.
The first free() of the 'char *[]' part of dest probably has a pointer to line and therefore frees the whole block of memory. All the subsequent frees on the parts of dest are trying to free an address not returned by malloc(), and valgrind is trying to tell you that. The free(line) operation then fails because the first free() of the pointers in dest already freed that space.
I'm considering refactoring my code
[to] store a copy of them [...].
The refactoring proposed is probably sensible; the function strdup() already mentioned by others will do the job neatly and reliably.
Note that after refactoring, you will still need to release line, but you will not release any of the pointers returned by strtok(). They are just pointers into the space managed by (identified by) line and will all be released when you release line.
Note that you will need to free each of the separately allocated (strdup()'d) strings as well as the array of character pointers that are accessed via dest.
Alternatively, do not free line immediately after calling Parse(). Have dest record the allocated pointer (line) and free that when it frees the array of pointers. You still do not release the pointers returned by strtok(), though.
They are then copied onto a malloc'd char** in dest.
The strings are copied, or the pointers are copied? The strtok function modifies the string you give it so that it can give you pointers into that same string without copying anything. When you get tokens from it, you must copy them. Either that or keep the input string around as long as any of the token pointers are in use.
Many people recommend that you avoid strtok altogether because it's error-prone. Also, strtok keeps its position in hidden static state, so unless the C runtime makes that state per-thread, using it from several threads at once can corrupt results or crash your app.
1. In your Parse(), strtok() only writes '\0' at every matching delimiter position. There is nothing special about this step, and using strtok() is easy; it just cannot be used on a read-only memory buffer.
2. For each sub-string obtained in Parse(), copy it to its own malloc()ed buffer. A simple, conceptual example of storing the sub-strings looks like the code below, though it might not be exactly the same as your real code:
char **dest = malloc(N * sizeof *dest); /* N = number of tokens */
for (size_t i = 0; i < N; i++) {
dest[i] = malloc(strlen(sub_strings[i]) + 1);
strcpy(dest[i], sub_strings[i]);
/* NOTE: the two lines above could be a single line instead:
   dest[i] = strdup(sub_strings[i]); */
}
3. Free dest, conceptually again:
for (size_t i = 0; i < N; i++) {
free(dest[i]);
}
free(dest);
4. Calling free(line) is nothing special either, and it doesn't affect your "dest" at all.
"dest" and "line" use different memory buffers, so you can perform step 4 before step 3 if preferred. If you had followed the steps above, no errors would occur. It seems you made a mistake in step 2 of your code.