I do something like this in a loop:
char* test = "\0";
test = strcat(test, somestr);
...
char* tmp = strstr(test, 0, len);
free(test);
test = tmp;
And I get a memory leak. What am I doing wrong?
You don't actually have a memory leak (in the code you posted anyway), but you do several things wrong.
char* test = "\0";
This declares a pointer named test and initializes it to point to a string literal, an array of two bytes { 0, 0 }.
test = strcat(test, somestr);
This tries to append something to the end of that string literal (and since, as a C string, it is empty, this would behave like a string copy). String literals are often stored in memory that is not writable, so copying something into this memory would likely cause an error (a segmentation fault, or SIGSEGV, on many operating systems). Additionally, you only have two bytes of storage pointed to by test, which means that unless somestr refers to a string whose strlen is less than or equal to 1, you would end up attempting to write over some other memory (whatever happens to be after the "\0" that test points to).
char* tmp = strstr(test, 0, len);
I don't know what is going on here since strstr only takes 2 arguments (both of them const char *).
free(test);
Here you are attempting to free non-heap allocated memory. The heap is where malloc, realloc, and calloc get the memory they allocate. Calling free with a memory location that was not returned by one of these functions (and a few other functions on some systems) is an error because free does not know what to do with them.
You should probably keep in mind that memory is often a huge array of bytes, and that the pointers you use are like array indexes. The system you are using may be able to distinguish between some areas of this array and determine how you can access them (readable, writable, and/or executable). But it is still just an array of bytes. When you have a string (such as "foo"), that means that somewhere in RAM there are four bytes (3 letters + the \0 terminator byte), and you can access this area by knowing its index within the array of bytes that is RAM. There are likely other things stored adjacent to your string (such as { ..., 4, 2, 'f', 'o', 'o', 0, 99, 3, 2, ...}), so you have to make sure you stay within the space of that memory without wandering into the adjacent data.
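If the goal in the question was to build up a string piece by piece inside a loop, one possible approach is to keep the string on the heap and grow it with realloc. The following is only a sketch under that assumption; append_str is a hypothetical helper name, not something from the original code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append `piece` to the heap string `*dst`,
 * growing the allocation as needed.
 * `*dst` must be NULL or a pointer returned by malloc/realloc.
 * Returns 0 on success, -1 if allocation fails (then *dst is unchanged). */
int append_str(char **dst, const char *piece)
{
    size_t old_len = (*dst != NULL) ? strlen(*dst) : 0;
    size_t add_len = strlen(piece);

    /* +1 leaves room for the terminating '\0' */
    char *grown = realloc(*dst, old_len + add_len + 1);
    if (grown == NULL)
        return -1;

    /* copy the new piece, including its '\0', after the old contents */
    memcpy(grown + old_len, piece, add_len + 1);
    *dst = grown;
    return 0;
}
```

A caller would start with char *test = NULL;, call append_str(&test, somestr) on each iteration, and call free(test) exactly once when finished.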
There are a couple of problems:
strcat will append a string to the destination buffer. You need the first parameter to be a writable buffer, not a pointer to a string literal. Here is an example of a char buffer, also called an array of chars: char test[1024];
The return value of strcat is a pointer to the destination buffer; it is not a newly allocated string on the heap. So you shouldn't call free on the return value.
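When the destination is a fixed-size array such as char test[1024], strcat itself does not check that the result fits. A minimal sketch of a bounds-checked append follows; append_bounded is a hypothetical name introduced here for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append `src` to the string already stored in `buf`,
 * whose total capacity is `cap` bytes (terminator included).
 * Returns 0 on success, or -1 if the result would not fit,
 * in which case `buf` is left unchanged. */
int append_bounded(char *buf, size_t cap, const char *src)
{
    size_t used = strlen(buf);
    size_t need = strlen(src);

    if (used + need + 1 > cap)   /* +1 for the terminating '\0' */
        return -1;

    memcpy(buf + used, src, need + 1);
    return 0;
}
```

With char test[1024] = "";, a call such as append_bounded(test, sizeof test, somestr) appends safely, and nothing needs to be freed because the array is not heap memory.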
You can't strcat to test because you are initially pointing it to a constant char *. You need to assign memory for it. strcat won't do it.
Change your code to something like:
char* test = (char*)malloc(20*sizeof(char));
test[0] = '\0'; // nothing on this string to begin with
strcat(test, "something");
free(test);
Also, this won't work:
char* tmp = strstr(test, 0, len);
Since there is no strstr function with three parameters.
Remember: 99.9% of the time, there should be exactly one free call for each malloc allocation.
Why does this not return a segmentation fault 11?
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char const *argv[])
{
char *test;
test = (char*) (malloc(sizeof(char)*3));
test = "foo";
printf("%s\n", test);
test = "foobar";
printf("%s\n", test);
return 0;
}
My outcome is
foo
foobar
I'm fairly new to C, but when I was compiling this using both gcc on mac and Windows Debugger on Windows 10, it doesn't crash like I expected.
My understanding is that by using (char*) (malloc(sizeof(char)*3)), I am creating a character array of length 3. Then, when I assign test to the string foobar I am writing to 6 array positions.
I'm left sitting here, staring at my apparently valid, runnable code scratching my head.
test = "foo";
Here you do not copy the string to the allocated memory. Instead, test no longer points to the allocated memory; it points to the string literal "foo". The same goes for "foobar". Also, as pointed out in the comments, the address of the allocated memory is lost, and therefore it is a memory leak (since there is no way to retrieve the address of the memory).
If you want to copy a string to another destination you need to use strcpy or loop over every character.
If you write or read outside the bounds of the allocated space, you are invoking undefined behavior. That means that basically anything can happen, including that it appears to work.
Your program never writes to the location pointed to by the return from malloc(). All you've done with e.g. test = "foo"; is change what test points to, which by the way is a memory leak since you've then lost what malloc() returned.
To properly use the memory you allocated with malloc(), use strcpy(), snprintf(), etc.
Also, don't forget the null terminator in your C strings. To properly store e.g. "foobar" you need at least 7 bytes, not 6.
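To illustrate the difference between reassigning the pointer and copying into the allocated memory, here is a small sketch. copy_string is a hypothetical helper name; it sizes the allocation from the source string so the terminator always fits.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of `src`,
 * or NULL if malloc fails. The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* "foobar" needs 7 bytes, not 6 */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, src, size);     /* writes INTO the allocation */
    return copy;
}
```

Unlike test = "foo";, which only changes where test points, char *test = copy_string("foo"); leaves test pointing at writable heap memory that can later be released with free(test).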
First, you waste the memory allocated by malloc by overwriting the pointer to it with the address of "foo".
If the pointer is only going to point to a string in the code section (a string literal), there is no need to allocate memory for it.
When to allocate memory for a pointer
For example, when you intend to read 'n' bytes from the keyboard into the buffer it points to:
char *ptr;
int num_char;
scanf("%d", &num_char); // %d needs an int, not a char
ptr = malloc(num_char + 1); // +1 for the terminating '\0'
scanf("%s", ptr); // note: %s does not limit the input length
Essentially, I have a structure for a linked list in my code,
struct node{
char* val;
struct node *next;
};
Then later I try to do some things to it...
...
addr = malloc(sizeof(struct node));
addr->val = malloc(72);
addr->val = "";
snprintf(addr->val, 1000, "%s", ident);
...
...
This gives me a segmentation fault at the snprintf. Valgrind says the following
Process terminating with default action of signal 11 (SIGSEGV)
==10724== Bad permissions for mapped region at address 0x40566B
==10724== at 0x4EB0A32: vsnprintf (vsnprintf.c:112)
==10724== by 0x4E8F931: snprintf (snprintf.c:33)
==10724== by 0x4016CC: id (Analyzer.c:267)
...
I am fairly new to C as opposed to C++, but I thought that calling malloc on the char* should make it valid, especially since I can initialize and print it, so I don't understand why the snprintf wouldn't work.
I also had my program print out the addresses of both variables, and the address valgrind complains about is indeed from addr->val.
I also tried using strcpy instead of snprintf but had the same result.
Thanks.
addr->val = malloc(72);
This line dynamically allocates 72 bytes and assigns the address of that memory region to addr->val.
addr->val = "";
This then sets addr->val to point to the address of a string constant, discarding the address of the malloced area of memory that it previously contained, causing a memory leak.
When you then try to use snprintf, you're attempting to write to a string literal. Since these are typically stored in a read-only section of memory, attempting to do so results in a core dump.
There's no need for addr->val = "";. It throws away allocated memory; it doesn't set that allocated memory to an empty string, which is probably what you thought it would do. Even if it did, it's useless anyway because snprintf will overwrite anything that might be there.
The code
addr->val = malloc(72);
addr->val = ""; <====
overwrites the val pointer with the address of "", a static area made of one character (of value 0). Remove this line.
And
snprintf(addr->val, 1000, "%s", ident);
accepts a length of 1000 while you only allocated 72 characters.
snprintf(addr->val, 72, "%s", ident);
is better.
addr->val = malloc(72);
addr->val = "";
The second line reassigns addr->val to an address in a read-only section (where string literals live) and discards the address of the allocation from malloc, which leaks that memory.
I know you want to clear it. To assign strings, you should use strcpy():
strcpy(addr->val, "");
But since you want to empty it, it's easiest to set the first character to zero:
addr->val[0] = '\0';
Besides, you're trying to do a potentially harmful job:
snprintf(addr->val, 1000, "%s", ident);
You intended to allocate 72 bytes, so why is the second parameter going up to 1000? Change it to a safe value:
snprintf(addr->val, 72, "%s", ident);
Everything should be fine by then.
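Putting the advice from these answers together, node creation might look roughly like the sketch below. The names node_create, node_destroy, and VAL_SIZE are assumptions made for illustration; only struct node and the size 72 come from the question.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    char *val;
    struct node *next;
};

#define VAL_SIZE 72   /* buffer size taken from the question */

/* Hypothetical helper: allocate a node whose val holds a copy of ident,
 * truncated to VAL_SIZE - 1 characters if necessary. */
struct node *node_create(const char *ident)
{
    struct node *addr = malloc(sizeof *addr);
    if (addr == NULL)
        return NULL;

    addr->val = malloc(VAL_SIZE);
    if (addr->val == NULL) {
        free(addr);
        return NULL;
    }

    /* The size argument matches the allocation, so snprintf cannot overrun. */
    snprintf(addr->val, VAL_SIZE, "%s", ident);
    addr->next = NULL;
    return addr;
}

/* Hypothetical helper: release the string first, then the node itself. */
void node_destroy(struct node *addr)
{
    if (addr != NULL) {
        free(addr->val);
        free(addr);
    }
}
```

Note that addr->val is never reassigned after malloc, so the pointer passed to free is the same one malloc returned.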
I'm told that when initializing a string like so
char str[] = "Hello world!";
The compiler will allocate an area in constant memory (read-only for the program) and then copy the string to the array, which resides on the stack. My question is: can I read or point to the original string after modifying the copy I'm given, and how? And if not, why does the string even exist outside of the stack in the first place?
It's done this way for space efficiency. When you write:
char str[] = "Hello world!";
it's compiled effectively as if you'd written:
static char str_init[] = "Hello world!";
char str[13];
strncpy(str, str_init, 13);
An alternative way to implement this might be equivalent to:
char str[13];
str[0] = 'H';
str[1] = 'e';
...
str[11] = '!';
str[12] = 0;
But for long strings, this is very inefficient. Instead of 1 byte of static data for each character of the string, it will use a full word of instruction (probably 4 bytes, but maybe more on some architectures) for each character. This will quadruple the size of the initialization data unnecessarily.
Because the program has to remember the string somewhere, i.e., your so-called "constant memory". Otherwise how can it know what values to assign when allocating the variable? Think about a variable with a given initial value. The variable is not allocated until declared. But the initial value must be stored somewhere else.
When this statement is compiled
char str[] = "Hello world!";
the compiler need not keep the string literal as a separate, accessible object in the program; it is used only to initialize the array.
If you want to keep the string literal, then you have to write it the following way:
const char *s = "Hello world!";
char str[13];
strcpy( str, s );
When the program runs, "Hello world" will be stored in the constant part of the memory as a string literal, after that, the program will reserve enough space in the stack and copy character by character from the constant part of the memory. Unfortunately, you don't have access to the constant part that stores the string literal because you are telling the program that you want the values to be modifiable (string stored in stack), so it gives what you asked.
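One way to keep the original text reachable after modifying the array is to give it a name of its own and copy from it. The sketch below assumes that approach; original and make_modified_copy are hypothetical names used only for illustration.

```c
#include <string.h>

/* The text lives in a named, read-only object, so it stays accessible. */
static const char original[] = "Hello world!";

/* Hypothetical helper: fill `out` (at least sizeof original bytes) with a
 * modifiable copy, change the copy, and leave `original` untouched. */
void make_modified_copy(char *out)
{
    memcpy(out, original, sizeof original);  /* includes the '\0' */
    out[0] = 'J';                            /* only the copy changes */
}
```

After char str[sizeof original]; make_modified_copy(str);, the array str holds the edited text while original can still be read.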
Most of your question has been addressed in the other answers. However, I did not see anyone address this one specifically:
Regarding your question: ...can I read or point to the original string after modifying the copy I'm given, and how?
The following sequence demonstrates how you can read the original after modifying a copy:
char str[] = "hello world"; // creates the original (stack memory)
char *str2 = 0; // creates a pointer (no memory allocated yet)
str2 = strdup(str); // populates the pointer with a copy of the original (POSIX strdup; memory allocated on the heap)
str2[5] = 0; // edits the copy: results in "hello" (modifies a location on the heap)
str; // still contains "hello world" (viewing the value on the stack)
free(str2); // release the heap copy when done
EDIT (answering comment question)
The answer above only addressed the specific question about accessing an original string after a copy has been modified. I just showed one possible set of steps to address that. You can edit the original string too:
char str[] = "Hello world!"; //creates location in stack memory called "str",
//and assigns space enough for literal string:
//"Hello world!", 13 spaces in all (including the \0)
strcpy(str, "new string"); //replaces original contents with "new string"
//old contents are no longer available.
So, using these steps, the original values in the variable str are changed, and are no longer available.
The method I outline in my original answer, (at top) shows a way whereby you can make an editable copy, while maintaining the original variable.
In your comment question, you are referring to things such as system memory and constant memory. Normally, system memory refers to RAM implementations on a system (i.e. how much physical memory). By constant memory, my guess is that you are referring to memory used by variables created on the stack. (read on)
First, in a development or run-time environment, there is stack memory. Its size usually defaults to some maximum value, such as 250,000 bytes perhaps. It is a pre-build settable value in most development environments, and is available for use by any variable you create on the stack. Example:
int x[10]; //creates a variable on the stack
//using enough memory space for 10 integers.
int y = 1; //same here, except uses memory for only 1 integer value
Second, there is also what is referred to as heap memory. The amount of heap memory is system dependent: the more physical memory your system has available, the more heap memory you can use for variable memory space in your application. Heap memory is used when you dynamically allocate memory, for example using malloc(), calloc(), or realloc().
int *x=0; //creates a pointer, no memory allocation yet...
x = malloc(10 * sizeof *x); //allocates enough memory for 10 integers, but the
//memory allocated is from the _heap_
//and must be freed for use by the system
//when you are done with it.
free(x);
I have marked the original post (above) with indications showing what type of memory each variable is using. I hope this helps.
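As a further illustration of heap allocation, the sketch below returns an array that outlives the function that created it, something a stack array cannot do. make_sequence is a hypothetical name; the size passed to malloc is computed in bytes from the element type.

```c
#include <stdlib.h>

/* Hypothetical helper: return a heap array holding 0, 1, ..., n-1,
 * or NULL if allocation fails. The caller must free() the result. */
int *make_sequence(size_t n)
{
    int *x = malloc(n * sizeof *x);   /* n integers, not n bytes */
    if (x == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        x[i] = (int)i;
    return x;
}
```

A caller would write int *x = make_sequence(10); and later free(x);. For n equal to 0, malloc may legitimately return NULL, so this sketch is intended for n greater than 0.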
struct TokenizerT_ {
char* separators;
char* tks;
char* cur_pos;
char* next;
};
typedef struct TokenizerT_ TokenizerT;
TokenizerT *TKCreate(char *separators, char *ts)
{
TokenizerT *tokenizer;
tokenizer = (TokenizerT*)malloc(sizeof(TokenizerT));
//some manipulation here
tokenizer->tks = (char*) malloc (strlen(str)* sizeof(char));
tokenizer->tks=str;
printf("size of tokenizer->tks is %zu\n", strlen(tokenizer->tks)); //this prints out the correct number (e.g. 7)
return tokenizer;
}
int main(int argc, char **argv)
{
TokenizerT *tk = TKCreate(argv[1], argv[2]);
printf("tk->tks: %zu\n", strlen(tk->tks)); //HOWEVER, this prints out the wrong number (e.g. 1)
}
As seen from the above code, I'm working with pointers to structs. For some reason I am not receiving back the correct length for tk->tks. I cannot understand this because it should be the same size as tks in my TKCreate function. Can someone explain this please?
I suspect that str, whose definition is not shown in your snippet, is a local variable defined in TKCreate(). If so, you are setting tokenizer->tks to the value of str, which points to a valid string only while TKCreate() is executing. Once TKCreate() returns, its stack storage (parameters and local variables) is released and may be overwritten, so when you dereference that pointer outside the scope of TKCreate(), all bets are off.
One plausible fix is to allocate the storage for tokenizer->tks dynamically, so it persists after you exit TKCreate(). I see you do that with a call to malloc, but then you overwrite the result with an explicit assignment from str. Instead, copy the contents of str into the dynamically allocated memory with strcpy(tokenizer->tks, str); and allocate strlen(str) + 1 bytes so there is room for the terminating null.
You should strcpy the contents of str to tokenizer->tks, because when you use the assign operator, you're losing the pointer malloc gave you, creating a memory leak and pointing tokenizer->tks to a local variable, which will be destroyed after the function's return.
So, the approach would be something like this:
tokenizer->tks = (char *)malloc ((strlen(str) + 1) * sizeof(char));
strcpy(tokenizer->tks, str);
Another thing:
Don't forget to free ->tks before you free tk itself.
So, after the printf, you should use:
free(tk->tks);
free(tk);
Skipping these calls is harmless in a program this small, because the operating system reclaims all of the process's memory when it exits. Note that the string lives in a separate allocation rather than inside the structure's own memory, which is why you have to free them both. If you intend to use this function in a larger, long-running program, freeing memory you no longer need is good practice.
It is not clear where str is defined, but if it is a local variable in the function, your problem is likely that it goes out of scope, so the data gets overwritten.
You're leaking memory because you've forgotten to use strcpy() or memcpy() or memmove() to copy the value in str over the allocated space, and you overwrite the only pointer to the newly allocated memory with the pointer str. If you copied, you would be writing out of bounds because you forgot to allocate enough space for the trailing null as well as the string. You should also check that the allocation succeeds.
Bogus code:
tokenizer->tks = (char*) malloc (strlen(str)* sizeof(char));
tokenizer->tks = str;
Fixed code:
size_t len = strlen(str) + 1;
tokenizer->tks = (char *)malloc(len);
if (tokenizer->tks == 0)
...error handling...
memmove(tokenizer->tks, str, len);
Using memmove() or memcpy() can outperform strcpy() handily (see Why is Python faster than C for some illustrations and timing). There are those who would excoriate you (and me) for using the cast on malloc(); I understand why they argue as they do, but I don't fully agree with them (and usually use the cast myself). Since sizeof(char) is 1 by definition, there's no particular need to multiply by it, though there's no harm done in doing so, either.
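Combining the points from these answers, a corrected constructor could look roughly like this. It is a sketch under assumptions: since str is not defined in the question, the parameter ts is used as the source string, the struct is reduced to two fields, and dup_string and TKDestroy are hypothetical helpers added for illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct TokenizerT_ {
    char *separators;
    char *tks;
} TokenizerT;

/* Hypothetical helper: heap copy of s (terminator included), or NULL. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: free the strings first, then the struct. */
void TKDestroy(TokenizerT *tk)
{
    if (tk != NULL) {
        free(tk->separators);
        free(tk->tks);
        free(tk);
    }
}

TokenizerT *TKCreate(const char *separators, const char *ts)
{
    TokenizerT *tk = malloc(sizeof *tk);
    if (tk == NULL)
        return NULL;

    /* Each field owns its own copy, so nothing points at local storage. */
    tk->separators = dup_string(separators);
    tk->tks = dup_string(ts);
    if (tk->separators == NULL || tk->tks == NULL) {
        TKDestroy(tk);   /* free(NULL) is a no-op, so this is safe */
        return NULL;
    }
    return tk;
}
```

Because the tokenizer owns copies of both strings, strlen(tk->tks) gives the same result inside and outside TKCreate, and a single TKDestroy(tk) call releases everything.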
I am getting segmentation fault when using strncpy and (pointer-to-struct)->(member) notation:
I have simplified my code. I initialise a struct and set all of its tokens to an empty string. Then I declare a pointer to a struct and assign the address of the struct to it.
I pass the pointer to a function. I can print out the contents of the struct at the beginning of the function, but if I try to use the tp -> mnemonic in a strncpy function, I get seg fault. Can anyone tell me what I am doing wrong?
typedef struct tok {
char* label;
char* mnem;
char* operand;
}Tokens;
Tokens* tokenise(Tokens* tp, char* line) {
// This prints "load"
printf("Print this - %s\n", tp -> mnem);
// This function gives me segmentation fault
strncpy(tp -> mnem, line, 4);
return tp;
}
int main() {
char* line = "This is a line";
Tokens tokens;
tokens.label = "";
tokens.mnem = "load";
tokens.operand = "";
Tokens* tp = &tokens;
tp = tokenise(tp, line);
return 0;
}
I have used printf statements to confirm that the code definitely stops executing at the strncpy function.
The problem is that tp->mnem is pointing to a string literal, which is generally allocated in a read-only segment of memory. Therefore it's illegal to overwrite it. Most likely what you need to do instead is something like this:
Tokens tokens;
tokens.label = "";
tokens.mnem = strdup("load");
tokens.operand = "";
This will give you a dynamically allocated block of memory for mnem, which you can then write into as much as you like. Of course, you have a couple of other problems too: first, you'll need to remember to release that memory with free later; second, you'll have to be aware of the size of the buffer you've allocated so that you don't overwrite it.
If you know that the contents of mnem will never exceed 4 bytes, then you might instead change your structure declaration like so:
typedef struct tok {
char* label;
char mnem[5]; // note: +1 byte for the null terminator
char* operand;
}Tokens;
Then, you'd initialize it like this:
Tokens tokens;
tokens.label = "";
strcpy(tokens.mnem, "load");
tokens.operand = "";
This relieves you of the responsibility of managing the memory for mnem, although you still have some risk of overrunning your buffer.
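With the fixed-size array in place, tokenise can copy into tp->mnem safely as long as it limits the copy and terminates the result itself. The following is a sketch of that idea; declaring label and operand as const char * is an assumption made here because they only ever point at literals.

```c
#include <string.h>

typedef struct tok {
    const char *label;
    char mnem[5];          /* 4 characters + '\0' */
    const char *operand;
} Tokens;

/* Copy at most 4 characters of line into tp->mnem and terminate it. */
Tokens *tokenise(Tokens *tp, const char *line)
{
    strncpy(tp->mnem, line, sizeof tp->mnem - 1);
    tp->mnem[sizeof tp->mnem - 1] = '\0';   /* strncpy may not terminate */
    return tp;
}
```

Using sizeof tp->mnem rather than a literal 4 keeps the limit tied to the array's declared size if it ever changes.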
The following line
tokens.mnem = "load";
assigns to mnem the address of a string literal, which is typically located in a read-only data segment, so changing this memory with strncpy() or any other function will fail.
The problem is you've assigned string literals to the members of your Tokens structure and are trying to overwrite that memory (specifically, the mnem field) in tokenise.
Most modern OSes will allocate memory for string literals from a special read-only section of your program's address space. If you try to write to that memory, then your program will die with a segfault.
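This is why a string literal is best treated as a const char *, not a char *. (In C its type is actually an array of char for historical reasons, but modifying it is undefined behavior.) Compilers often accept the assignment silently; with GCC, the -Wwrite-strings option makes the compiler warn when you assign literals to the char * fields of Tokens.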
If you want to overwrite the memory later, you need to allocate the memory dynamically using malloc or change the members of the Tokens structure to fixed-length arrays, then copy the initial value into the allocated memory. Of course if you allocate the memory dynamically you need to free it later too.
You're calling strncpy() without having allocated the buffer space, just like Shadow said.
The literal string "load" you set the mnem member to in the initializer is not overwritable.
If you want to be able to change the string stored, and the size is reasonable, it might be easiest to just change the declaration of the struct field to char mnem[5];.
Also, please note that strncpy() has quite weird semantics. Check if you have strlcpy(); it's a better function.
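Since strlcpy() is not available on every platform, a portable alternative with similar always-terminate behavior can be sketched with snprintf. copy_bounded is a hypothetical name; this is one possible substitute, not the implementation of strlcpy.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity `size` bytes),
 * always terminating the result when size > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 0;

    /* snprintf returns the length the full output would have had */
    int n = snprintf(dst, size, "%s", src);
    return n >= 0 && (size_t)n < size;
}
```

Unlike strncpy(), this never leaves dst unterminated, and the return value tells the caller whether truncation happened.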
You're getting a segmentation fault because this line:
strncpy(tp -> mnem, line, 4);
Is trying to copy four characters from 'line' into a location occupied by a string literal as assigned here:
tokens.mnem = "load";
The string literal is stored in a special text part of your program and may not be modified.
What you need to do is allocate a buffer of your own into which the string will be copied:
tokens.mnem = (char*) malloc (bufferSize);
And free the buffer when you are done using it.
This line is questionable:
strncpy(tp -> mnem, line, 4);
You are relying on a function that returns a pointer to memory that is not allocated. The return of *tokenise() is undefined. It's returning a pointer to memory that could contain all kinds of stuff, and that you don't have permission to modify.
It should return an allocated pointer.
You might malloc the tp variable. If you don't malloc there is no guarantee that the memory is actually yours. Don't forget to free the memory when you are finished.