I have a tokenizer method that returns a char**. The results are being stored in a char** called lineTokens. When I use printf() to print the first token, I get the correct result, but when I use strcmp(lineTokens[0],"Some text"), I get a seg fault. The appropriate code is below.
lineTokens = tokenize(tempString);
printf("token[0] = %s\n", lineTokens[0]);
if(strcmp(lineTokens[0], "INPUTVAR")==0){
printf("It worked\n");
}
EDIT:
My tokenize code is as follows
char** tokenize(char* input){
int i = 0;
char* tok;
char** ret;
tok = strtok(input, " ");
ret[0] = tok;
while(tok != NULL){
printf("%s\n", tok);
ret[i] = tok;
tok = strtok(NULL, " ");
i++;
}
ret[i] = NULL;
return ret;
}
It's impossible to answer this without seeing the code of tokenize(), of course.
My guess is that there is some undefined behavior in that function, which perhaps corrupts the stack, so that when printf() runs and actually uses some more stack space, things go bad. The thing with undefined behavior is that it's really undefined, anything can happen.
Run the code in Valgrind.
Your tokenize function is broken. Every pointer in your code needs to point at allocated memory, which is not the case here. Declaring a pointer allocates no memory: a pointer merely holds the address of memory allocated somewhere else. That only works if you actually set it to point somewhere; if you don't, as with ret here, it holds an indeterminate garbage address.
So you need to rewrite that function from scratch. Either pass a pointer to allocated memory as a parameter or allocate memory dynamically inside the function. But before you do, I would strongly recommend that you study arrays and pointers some more, for example by reading this chapter of the C FAQ.
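As a rough illustration of the second option (allocating inside the function), here is a minimal sketch. It keeps the tokenize name from the question; the starting capacity of 8 and the doubling strategy are assumptions made for the example, not something the original code specified.

```c
#include <stdlib.h>
#include <string.h>

/* Splits input in place on spaces and returns a NULL-terminated array of
 * pointers into input. The caller frees the array (not the tokens).
 * Returns NULL if allocation fails. */
char **tokenize(char *input)
{
    size_t count = 0;
    size_t capacity = 8;                 /* arbitrary starting size (assumption) */
    char **ret = malloc(capacity * sizeof *ret);
    if (ret == NULL)
        return NULL;

    for (char *tok = strtok(input, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (count + 1 >= capacity) {     /* always keep room for the final NULL */
            capacity *= 2;
            char **grown = realloc(ret, capacity * sizeof *grown);
            if (grown == NULL) {
                free(ret);
                return NULL;
            }
            ret = grown;
        }
        ret[count++] = tok;              /* tok points into input */
    }
    ret[count] = NULL;
    return ret;
}
```

The tokens still live inside the caller's buffer, so only the returned array needs a free() once you are done with it.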
I have the following program written in C:
...
char *answer = NULL;
char *pch = strtok(phrase, " "); // phrase is a string with possibly many words
while (pch) {
char *tmp = translate_word(pch); // returns a string based on pch
void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1000); // allocate space to answer
if (!ptr) // If realloc fails
return -1;
strcat(answer, tmp); // append tmp to answer
pch = strtok(NULL, " "); // find next word
}
...
The problem is that strtok() shows weird behavior, it returns a word that does not exist in the phrase string but is part of the answer string.
On the other hand, when I change the following line:
void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1000);
to:
void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1);
strtok() works as expected.
How is it possible that realloc() affects strtok() in this case? They do not even use the same variables. Looking forward to your insights.
The realloc function could move the memory that was previously allocated. After the call, the pointer to the allocated memory is returned and the pointer value passed to it, if it differs, is no longer valid. So when you call strcat(answer, tmp); you're potentially writing to freed memory which invokes undefined behavior, and in this case it manifests as the strange output you're seeing.
After checking the return value of realloc, assign that value back to answer.
Also, sizeof(answer) and sizeof(tmp) give you the size of the pointer, not the size of what it points to. You instead want to use strlen to get the length of the strings they contain. And while we're at it, let's just add 1 to this instead of 1000, because that's all you actually need.
void *ptr = realloc(answer, strlen(answer) + strlen(tmp) + 1);
if (!ptr)
return -1;
answer = ptr;
strcat(answer, tmp);
One more issue: the first time realloc is called, answer is NULL, so realloc behaves like malloc and the memory it returns is completely uninitialized. Subsequently calling strcat on it depends on answer containing a null-terminated string. It doesn't, so this also invokes undefined behavior.
This can be fixed by malloc-ing a single byte to start and setting it to 0, that way you start with an empty string.
char *answer = malloc(1);
if (!answer) return -1;
answer[0] = 0;
sizeof(answer) and sizeof(tmp) give you the sizes of the pointers.
You need to use strlen instead.
additionally...
char *answer = NULL;
... either:
... strlen(answer) ...
strcat(answer, tmp);
These SHOULD fail, with a segmentation violation, but may not depending on the OS. Dereferencing NULL is never a good idea.
In short, you need to either know you have assigned something to answer, or to check if answer is NULL.
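Pulling these points together, one possible shape for the append step is sketched below. The helper name append_word is hypothetical (it does not appear in the question), and treating a NULL answer as an empty string is a design choice made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Appends word to the heap string *answer, growing it as needed.
 * *answer may be NULL, which is treated as an empty string.
 * Returns 0 on success, -1 if realloc fails (*answer is left unchanged). */
int append_word(char **answer, const char *word)
{
    size_t old_len = (*answer != NULL) ? strlen(*answer) : 0;
    size_t add_len = strlen(word);

    /* Size comes from strlen, not sizeof; +1 for the terminating '\0'. */
    char *grown = realloc(*answer, old_len + add_len + 1);
    if (grown == NULL)
        return -1;

    memcpy(grown + old_len, word, add_len + 1);  /* copies the '\0' too */
    *answer = grown;                             /* the old address may be stale */
    return 0;
}
```

Inside the strtok loop from the question this would be called as append_word(&answer, tmp), with answer starting out as NULL.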
I'm trying to implement a function that concatenates two strings, but I keep getting the same error.
"pointer being realloc'd was not allocated"
When I compiled the same code on a Windows machine it worked. Is there something I'm missing?
The code below is basically what I'm trying to do.
main:
int main() {
int length = 4096;
char *string = malloc(length * sizeof(char));
createString(string, length);
realloc(string, 30);
return 0;
}
createString:
void createString(char * string, int length) {
char *copyAdress = string;
char *temp ="";
int counter2 = 0;
fflush(stdin);
fgets(string, length,stdin);
while(*string != EOF && *string != *temp ) {
string++;
counter++;
}
string = copyAdress;
realloc(string, (counter)*sizeof(char));
}
Thanks!
Edit:
I want createString to change the size of string to the length of the string that I get with fgets, while having the same address as the string that I sent in, so I can allocate more memory to it later when I want to add another string to it.
There are several issues:
realloc(string, (counter)*sizeof(char)); is wrong, you need string = realloc(string, (counter)*sizeof(char)); because realloc may return a different address.
Calling createString(string, length); won't modify string in main, because the pointer is passed by value.
If you want a more accurate answer you need to tell us what exactly createString is supposed to do. In your code there is no attempt to concatenate two strings.
Let's work through this in order of execution.
fflush(stdin); is undefined behaviour. If you really need to clear everything in the stdin you have to find another way (a loop for example). There are compilers/systems with a defined implementation but I would not count on it.
string++; is superfluous as you overwrite string after the loop.
realloc(string, (counter)*sizeof(char));
should be
char *temp = realloc(string, (counter)*sizeof(char));
if (temp != NULL)
string = temp;
This way you get the pointer to where your new string is located, but I suggest you read the reference for realloc. In essence, you do not know whether the block has been moved, and the old address might be invalid from that point on. So dereferencing it is also undefined behaviour.
After this you would have to return the new address of string or pass the address of the pointer to your function.
The same problem repeats with the second realloc. You only got to know your first call was wrong, because the second call noticed that you do not have valid data in what you thought would be your string.
In regards to your comment: It is not possible to use realloc and to be sure that the reallocated memory is in the same place as before.
If you realloc some memory, the pointer pointing to the original memory becomes invalid (unless realloc failed and returned NULL). So calling realloc twice on the same pointer should indeed not work (if it didn't return NULL the first time).
See the answers from others about what you are doing wrong. However, the error message means that on macOS, the realloc in createString deallocated the original string and allocated a new one, and now your realloc in main tries to realloc a pointer that is no longer valid (allocated). On Windows, the memory was not deallocated in createString, and so the second call of realloc (in main) is given a valid pointer.
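To make the "pass the address of the pointer" suggestion concrete, here is a small sketch. The helper name shrink_to_fit is hypothetical, and it assumes the buffer already holds a null-terminated string (for example after a successful fgets).

```c
#include <stdlib.h>
#include <string.h>

/* Shrinks the heap buffer *string to exactly fit the string it holds.
 * On success *string is updated (the address may change) and 0 is returned;
 * on failure *string is left untouched and still valid, and -1 is returned. */
int shrink_to_fit(char **string)
{
    size_t needed = strlen(*string) + 1;     /* +1 for the terminating '\0' */
    char *resized = realloc(*string, needed);
    if (resized == NULL)
        return -1;
    *string = resized;                       /* the caller now sees the new address */
    return 0;
}
```

Because the caller's pointer is updated through the char **, a later realloc in main operates on a valid address regardless of whether the block was moved.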
So I am dynamically creating an array of strings. I am then assigning each element in that array a pointer returned by calling strtok. At the end of my process when I need to redo everything I have been trying to free the pointers in the elements of said array, but I keep getting an error stating
*** glibc detected *** ./prgm: munmap_chunk(): invalid pointer: 0x00007fff600d98
Also, would it make sense to free inputStr at the end of the loop?
Where is my logical "not really logical at all" thinking wrong..
e.g code
char** argvNew = (char**)calloc(33,sizeof(char*));
char inputStr[128];
do{
scanf("%127[^\n]%*c", inputStr);
token = strtok(inputStr, delim);
/* Add tokens to array*/
varNum= 0;
for(i = 0; token != NULL; i++){
varNum++;
argvNew[i] = token;
token = strtok(NULL, delim);
}
argvNew[i] = NULL;
//Free argvNew
for(i = 0; i < varNum;i++){
printf("Deleting %i, %s\n",i,argvNew[i]);
free(argvNew[i]);
}
} while(1);
No, you should not free what strtok() returns. It returns a pointer to a character inside inputStr (or NULL when it reaches the end). It does not allocate any new memory, so there's nothing to free.
If inputStr is dynamically allocated, you should free it when you're done with it.
No, since it's not allocating new memory.
Quoting the ref of strtok():
Return Value
If a token is found, a pointer to the beginning of the token. Otherwise, a null pointer. A null pointer is always returned when the end of the string (i.e., a null character) is reached in the string being scanned.
The example in the reference doesn't free what strtok() returns, which confirms this:
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
You see, the pointer returned by strtok() is only used to access memory your program already owns (regardless of whether it was allocated dynamically or not; here it is an ordinary local array). It is never assigned newly allocated memory, so you shouldn't free it.
In general, this is what you should have in mind:
Call free() as many times as you call malloc().
You only allocated the argvNew array. And that's the only thing you should be deallocating.
You did not allocate what the pointers in argvNew are pointing to, so you are not going to free() them.
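A minimal sketch of that ownership rule follows. The helper name fill_args and the MAX_ARGS limit are assumptions for illustration (32 was chosen to match the 33-slot array in the question).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 32   /* assumed limit; leaves one slot for the final NULL */

/* Fills argv_new with pointers into input_str and returns the token count.
 * argv_new must have room for MAX_ARGS + 1 pointers. */
size_t fill_args(char **argv_new, char *input_str, const char *delim)
{
    size_t n = 0;
    for (char *tok = strtok(input_str, delim);
         tok != NULL && n < MAX_ARGS;
         tok = strtok(NULL, delim)) {
        argv_new[n++] = tok;   /* points into input_str; nothing is allocated */
    }
    argv_new[n] = NULL;
    return n;
}
```

With this shape there is exactly one calloc() for the array and therefore exactly one free(argv_new); the individual elements are never passed to free().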
I've never needed to use strdup(stringp) with strsep(&stringp_copy, token) together until recently and I think it was causing a memory leak.
(strdup() has always free'd just fine before.)
I fixed the leak, and I think I understand how, but I just can't figure out why I needed to.
Original code (summarized):
const char *message = "From: username\nMessage: basic message\n";
char *message_copy, *line, *field_name;
int colon_position;
message_copy = strdup(message);
while(line = strsep(&message_copy, "\n")) {
printf(line);
char *colon = strchr(line, ':');
if (colon != NULL) {
colon_position = colon - line;
strncpy(field_name, line, colon_position);
printf("%s\n", field_name);
}
}
free(message_copy);
New code that doesn't leak:
const char *message = "From: username\nMessage: basic message\n";
char *message_copy, *freeable_message_copy, *line, *field_name;
int colon_position;
freeable_message_copy = message_copy = strdup(message);
while(line = strsep(&message_copy, "\n")) {
printf(line);
char *colon = strchr(line, ':');
if (colon != NULL) {
colon_position = colon - line;
strncpy(field_name, line, colon_position);
printf("%s\n", field_name);
}
}
free(freeable_message_copy);
How is the message_copy pointer being overwritten in the first code? or is it?
The function strsep() takes the address of your string pointer (&message_copy), returns the current token, and updates message_copy to point at the 'next' token.
const char *message = "From: username\nMessage: basic message\n";
char *message_copy, *original_copy;
//here you have allocated new memory, a duplicate of message
message_copy = original_copy = strdup(message);
Print out the pointer here,
printf("copy %p, original %p\n", message_copy, original_copy);
Note that as you use strsep(), you are modifying the message_copy,
char* token;
//here you modify message_copy
while(token = strsep(&message_copy, "\n")) {
printf("%s\n", token);
}
This illustrates the changed message_copy, while original_copy is unchanged,
printf("copy %p, original %p\n", message_copy, original_copy);
Since message_copy does not point to the original strdup() result this would be wrong,
free(message_copy);
But keep around the original strdup() result, and this free works
//original_copy points to the results of strdup
free(original_copy);
Because strsep() modifies the message_copy argument, you were trying to free a pointer that was not returned by malloc() et al. This would generate complaints from some malloc() libraries, and from valgrind. It is also undefined behaviour, usually leading to crashes in short order (but crashes in code at an inconvenient location unrelated to the code that did the damage).
In fact, your loop iterates until message_copy is set to NULL, so you were freeing NULL, which is defined and safe behaviour, but it is also a no-op. It did not free the pointer allocated via strdup().
Summary:
Only free pointers returned by the memory allocators.
Do not free pointers into the middle or end of a block returned by the memory allocators.
Have a read of the strsep man page here.
In short, the strsep function modifies the original character pointer that's passed into the function: it overwrites each occurrence of the delimiter with a \0, and the original character pointer is then updated to point past the \0.
Your second version doesn't leak, as you created a temporary pointer to point to the start of the original char pointer returned from strdup(), so the memory was freed correctly, as you called free() with the original char pointer instead of the updated one that strsep() had modified.
From the man page,
...This token is terminated by overwriting the delimiter with a null byte ('\0') and *stringp is updated to point past the token....
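The pattern can be condensed into one small function. This is only a sketch: count_lines is a hypothetical helper, and the feature-test macro is an assumption that applies to glibc, where strsep() and strdup() may otherwise be hidden in strict standard modes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes strsep() and strdup() on glibc (assumption) */
#endif
#include <stdlib.h>
#include <string.h>

/* Counts the non-empty lines in message, working on a private copy.
 * Returns -1 if the copy cannot be allocated. */
int count_lines(const char *message)
{
    char *copy = strdup(message);   /* this is the pointer that must be freed */
    if (copy == NULL)
        return -1;

    char *cursor = copy;            /* strsep() advances this one instead */
    char *line;
    int lines = 0;
    while ((line = strsep(&cursor, "\n")) != NULL) {
        if (*line != '\0')
            lines++;
    }

    /* cursor is NULL here; copy still holds the address strdup() returned. */
    free(copy);
    return lines;
}
```

Keeping a separate cursor variable means the allocation's original address is never handed to strsep(), so there is nothing to remember to restore.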
I'm having trouble figuring out how to pass strings back through the parameters of a function. I'm new to programming, so I imagine this is probably a beginner question. Any help you could give would be most appreciated. This code seg faults, and I'm not sure why, but I'm providing my code to show what I have so far.
I have made this a community wiki, so feel free to edit.
P.S. This is not homework.
This is the original version
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void
fn(char *baz, char *foo, char *bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
foo = malloc(strlen(pch));
strcpy(foo, pch);
pch = strtok (NULL, ":");
bar = malloc(strlen(pch));
strcpy(bar, pch);
return;
}
int
main(void)
{
char *mybaz, *myfoo, *mybar;
mybaz = "hello:world";
fn(mybaz, myfoo, mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
}
UPDATE Here's an updated version with some of the suggestions implemented:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLINE 1024
void
fn(char *baz, char **foo, char **bar)
{
char line[MAXLINE];
char *pch;
strcpy(line, baz);
pch = strtok (line, ":");
*foo = (char *)malloc(strlen(pch)+1);
(*foo)[strlen(pch)] = '\n';
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = (char *)malloc(strlen(pch)+1);
(*bar)[strlen(pch)] = '\n';
strcpy(*bar, pch);
return;
}
int
main(void)
{
char *mybaz, *myfoo, *mybar;
mybaz = "hello:world";
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free(myfoo);
free(mybar);
}
First thing, those mallocs should be for strlen(whatever)+1 bytes. C strings have a 0 character to indicate the end, called the NUL terminator, and it isn't included in the length measured by strlen.
Next thing, strtok modifies the string you're searching. You are passing it a pointer to a string which you're not allowed to modify (you can't modify literal strings). That could be the cause of the segfault. So instead of using a pointer to the non-modifiable string literal, you could copy it to your own, modifiable buffer, like this:
char mybaz[] = "hello:world";
What this does is put a size 12 char array on the stack, and copy the bytes of the string literal into that array. It works because the compiler knows, at compile time, how long the string is, and can make space accordingly. This saves using malloc for that particular copy.
The problem you have with references is that you're currently passing the value of mybaz, myfoo, and mybar into your function. You can't modify the caller's variables unless you pass a pointer to myfoo and mybar. Since myfoo is a char*, a pointer to it is a char**:
void
fn(char *baz, char **foo, char **bar) // take pointers-to-pointers
*foo = malloc(...); // set the value pointed to by foo
fn(mybaz, &myfoo, &mybar); // pass pointers to myfoo and mybar
Modifying foo in the function in your code has absolutely no effect on myfoo. myfoo is uninitialised, so if neither of the first two things is causing it, the segfault is most likely occurring when you come to print using that uninitialised pointer.
Once you've got it basically working, you might want to add some error-handling. strtok can return NULL if it doesn't find the separator it's looking for, and you can't call strlen with NULL. malloc can return NULL if there isn't enough memory, and you can't call strcpy with NULL either.
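For instance, a version with those checks might look roughly like this. The int return value and the choice to set both outputs to NULL on failure are assumptions made for the sketch; the question's fn returns void.

```c
#include <stdlib.h>
#include <string.h>

/* Copies the first two ':'-separated fields of baz into newly allocated
 * strings. baz must be writable. Returns 0 on success, or -1 on failure,
 * in which case *foo and *bar are NULL and nothing needs freeing. */
int fn(char *baz, char **foo, char **bar)
{
    *foo = *bar = NULL;

    char *first = strtok(baz, ":");
    char *second = (first != NULL) ? strtok(NULL, ":") : NULL;
    if (first == NULL || second == NULL)
        return -1;                      /* fewer than two fields */

    *foo = malloc(strlen(first) + 1);   /* +1 for the terminating '\0' */
    *bar = malloc(strlen(second) + 1);
    if (*foo == NULL || *bar == NULL) {
        free(*foo);                     /* free(NULL) is a harmless no-op */
        free(*bar);
        *foo = *bar = NULL;
        return -1;
    }

    strcpy(*foo, first);
    strcpy(*bar, second);
    return 0;
}
```

The caller would check the return value before printing, and free both strings afterwards.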
One thing everyone is overlooking is that you're calling strtok on an array stored in const memory. strtok writes to the array you pass it so make sure you copy that to a temporary array before calling strtok on it or just allocate the original one like:
char mybaz[] = "hello:world";
Ooh yes, little problem there.
As a rule, if you're going to be manipulating strings from inside a function, the storage for those strings had better be outside the function. The easy way to achieve this is to declare arrays outside the function (e.g. in main()) and to pass the arrays (which automatically become pointers to their beginnings) to the function. This works fine as long as your result strings don't overflow the space allocated in the arrays.
You've gone the more versatile but slightly more difficult route: You use malloc() to create space for your results (good so far!) and then try to assign the malloc'd space to the pointers you pass in. That, alas, will not work.
The pointer coming in is a value; you cannot change it. The solution is to pass a pointer to a pointer, and use it inside the function to change what the pointer is pointing to.
If you got that, great. If not, please ask for more clarification.
In C you typically pass an array by reference by passing 1) a pointer to the first element of the array, and 2) the length of the array.
The length of the array can sometimes be omitted if you are sure about your buffer size; for a string, the length can be found by looking for the terminating null character (a character with the value 0, or '\0').
It seems from your code example though that you are trying to set the value of what a pointer points to. So you probably want a char** pointer. And you would pass in the address of your char* variable(s) that you want to set.
You're wanting to pass back 2 pointers. So you need to call it with a pair of pointers to pointers. Something like this:
void
fn(char *baz, char **foo, char **bar) {
...
*foo = malloc( ... );
...
*bar = malloc( ... );
...
}
The code most likely segfaults because you are allocating space for the string but forgetting that a string has an extra byte on the end, the null terminator.
Also, you are only passing a pointer in. Since a pointer is a 32-bit value (on a 32-bit machine), you are simply passing the value of the uninitialised pointer into "fn". In the same way that you wouldn't expect an integer passed into a function to be returned to the calling function (without explicitly returning it), you can't expect a pointer to do the same. So the new pointer values are never returned to the main function. Usually you do this by passing a pointer to a pointer in C.
Also don't forget to free dynamically allocated memory!!
void
fn(char *baz, char **foo, char **bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
*foo = malloc(strlen(pch) + 1);
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = malloc(strlen(pch) + 1);
strcpy(*bar, pch);
return;
}
int
main(void)
{
char mybaz[] = "hello:world"; /* writable array: strtok() modifies it */
char *myfoo, *mybar;
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free( myfoo );
free( mybar );
}
Other answers describe how to fix your code so that it works, but an easy way to accomplish what you mean to do is strdup(), which allocates new memory of the appropriate size and copies the characters in.
Still need to fix the business with char* vs char**, though. There's just no way around that.
The essential problem is that although storage is allocated (with malloc()) for the results you are trying to return as myfoo and mybar, the pointers to those allocations are not actually returned to main(). As a result, the later call to fprintf() is quite likely to dump core.
The solution is to declare the arguments as pointer to pointer to char, and pass the addresses of myfoo and mybar to fn. Something like this (untested) should do the trick:
void
fn(char *baz, char **foo, char **bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
*foo = malloc(strlen(pch)+1); /* include space for NUL termination */
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = malloc(strlen(pch)+1); /* include space for NUL termination */
strcpy(*bar, pch);
return;
}
int
main(void)
{
char mybaz[] = "hello:world";
char *myfoo, *mybar;
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free(myfoo);
free(mybar);
}
Don't forget to free each allocated string at some later point or you will create memory leaks.
To do both the malloc() and strcpy() in one call, it would be better to use strdup(), as it also remembers to allocate room for the terminating NUL, which you left out of your code as written. *foo = strdup(pch) is much clearer and easier to maintain than the alternative. Since strdup() is POSIX and not ANSI C, you might need to implement it yourself, but the effort is well repaid by the resulting clarity for this kind of usage.
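If strdup() is not available on your platform, a stand-in is only a few lines. The name my_strdup below is hypothetical, chosen to avoid clashing with a library-provided strdup().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a newly allocated copy of s,
 * or NULL if the allocation fails. The caller must free() the result. */
char *my_strdup(const char *s)
{
    size_t size = strlen(s) + 1;    /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);      /* copies the terminator as well */
    return copy;
}
```

With this in place, the body of fn reduces to *foo = my_strdup(pch); followed by a NULL check.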
The other traditional way to return a string from a C function is for the caller to allocate the storage and provide its address to the function. This is the technique used by sprintf(), for example. It suffers from the problem that there is no way to make such a call site completely safe against buffer overrun bugs caused by the called function assuming more space has been allocated than is actually available. The traditional repair for this problem is to require that a buffer length argument also be passed, and to carefully validate both the actual allocation and the length claimed at the call site in code review.
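A sketch of that caller-allocates style, with an explicit buffer length, could look like the following. The helper name copy_field and its error convention are assumptions for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Copies the text before the first ':' in src into dst, which has room for
 * dst_size bytes including the terminating '\0'. Returns 0 on success, or
 * -1 if src has no ':' or the field does not fit (dst is then untouched). */
int copy_field(char *dst, size_t dst_size, const char *src)
{
    const char *colon = strchr(src, ':');
    if (colon == NULL)
        return -1;

    size_t len = (size_t)(colon - src);
    if (len + 1 > dst_size)         /* need room for the field plus '\0' */
        return -1;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

A typical call passes sizeof on the destination array, e.g. copy_field(field, sizeof field, line), so the claimed length and the real allocation cannot drift apart.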
Edit:
The actual segfault you are getting is likely to be inside strtok(), not printf() because your sample as written is attempting to pass a string constant to strtok() which must be able to modify the string. This is officially Undefined Behavior.
The fix for this issue is to make sure that mybaz is declared as an initialized array, and not as a pointer to char. The initialized array will be located in writable memory, while the string constant is likely to be located in read-only memory. In many cases, string constants are stored in the same part of memory used to hold the executable code itself, and modern systems all try to make it difficult for a program to modify its own running code.
In the embedded systems I work on for a living, the code is likely to be stored in a ROM of some sort, and cannot be physically modified.