strtok and free - c

What's the problem of doing this:
void *educator_func(void *param) {
char *lineE = (char *) malloc (1024);
size_t lenE = 1024;
ssize_t readE;
FILE * fpE;
fpE = fopen(file, "r");
if (fpE == NULL) {
printf("ERROR: couldnt open file\n");
exit(0);
}
while ((readE = getline(&lineE, &lenE, fpE)) != -1) {
char *pch2E = (char *) malloc (50);
pch2E = strtok(lineE, " ");
free(pch2E);
}
free(lineE);
fclose(fpE);
return NULL;
}
If I remove the line 'pch2E = strtok(lineE, " ");' it works fine.
Why can't I call strtok() there? I also tried strtok_r(), but no luck: it gives me an invalid free (Address 0x422af10 is 0 bytes inside a block of size 1,024 free'd).

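Your code is not doing what you think it is doing. The call pch2E = strtok(lineE, " "); replaces the value of pch2E, so the 50 bytes you just got from malloc are leaked. strtok never allocates anything: it returns either NULL or a pointer into lineE (for the first token that is usually lineE itself). free(pch2E) therefore releases lineE's block, and any later use or free of lineE touches memory that has already been freed, which is the invalid free you are seeing.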
A minimal fix is to drop the malloc/free pair for pch2E, because the token lives inside lineE and is released together with it:
while ((readE = getline(&lineE, &lenE, fpE)) != -1)
{
char* pch2E = strtok(lineE, " "); /* first token of this line; points into lineE */
}
free(lineE);
I should add, the more I look at your code, the more fundamentally flawed it looks to me. You need an inner loop in your code that deals with tokens while the outer loop is loading lines...
while ((readE = getline(&lineE, &lenE, fpE)) != -1)
{
char* pch2E;
int firstPass = 1;
while( (pch2E = strtok( firstPass ? lineE : NULL, " ")) != NULL )
{
firstPass = 0;
// do something with the pch2E return value
}
}
free(lineE);

strtok returns a pointer to the token, which lies inside the string you passed in, so you must not free it: it is not a pointer that malloc gave you for that token (only the first token may happen to coincide with the start of your buffer).
Assigning the result to a freshly allocated buffer cannot work in C, because assignment replaces the pointer rather than copying the characters. If you wanted a function that copies the token into a buffer, it would look something like this:
tokenize(char* string, char* delimiter, char* token);
You would pass a valid pointer as token for the function to copy the data into. To copy data into your buffer, the function needs access to that pointer, so it cannot do this through a return value.
An alternative (but worse) strategy would be a function that allocates memory internally and returns a pointer that the caller must free.
For your problem, strtok needs to be called repeatedly until it returns NULL in order to get all the tokens, so it should be:
while ((readE = getline(&lineE, &lenE, fpE)) != -1) {
char *pch2E = strtok(lineE, " "); // first token
while (pch2E != NULL) {
// Do something with the token
pch2E = strtok(NULL, " "); // next token
}
}
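To make the "pointer into your buffer" point concrete, here is a minimal sketch. The helper names count_tokens and points_inside are hypothetical and not part of the question's code; the only library call relied on is strtok itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `line`, splitting on spaces and
 * newlines. strtok() writes '\0' over delimiters inside `line` and returns
 * pointers into it, so no token is ever passed to free(). */
size_t count_tokens(char *line)
{
    size_t n = 0;
    for (char *tok = strtok(line, " \n"); tok != NULL; tok = strtok(NULL, " \n")) {
        n++;
    }
    return n;
}

/* Hypothetical helper: returns 1 if `tok` points inside the `size`-byte
 * buffer starting at `buf`. Only meaningful for pointers derived from `buf`. */
int points_inside(const char *buf, size_t size, const char *tok)
{
    return tok >= buf && tok < buf + size;
}
```

Every token strtok hands back satisfies points_inside for the buffer it was given, which is why the only pointer that may be freed is the one getline (or malloc) returned for the whole line.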

Related

What is the proper way to use strtok()?

Just to clarify, I'm a complete novice in C programming.
I have a tokenize function and it isn't behaving like what I expect it to. I'm trying to read from the FIFO or named pipe that is passed by the client side, and this is the server side. The client side reads a file and pass it to the FIFO. The problem is that tokenize doesn't return a format where execvp can process it, as running gdb tells me that it failed at calling the execute function in main(). (append function appends a char into the string)
One bug is that tokens is neither initialized nor allocated any memory.
Here is an example on how to initialize and allocate memory for tokens:
char **tokenize(char *line){
line = append(line,'\0');
int i = 0, tlen = 0;
char **tokens = NULL, *token, *delimiter;
delimiter = " \t";
token = strtok(line,delimiter);
while (token != NULL) {
if (i + 1 >= tlen) {
// Allocate more space, always keeping one slot free for the terminating NULL
tlen += 10;
tokens = realloc(tokens, tlen * sizeof *tokens);
if (tokens == NULL) {
exit(1);
}
}
tokens[i] = token;
token = strtok(NULL, delimiter);
i += 1;
}
if (tokens == NULL) {
// No tokens at all: still return a valid, empty list
tokens = malloc(sizeof *tokens);
if (tokens == NULL) {
exit(1);
}
}
tokens[i] = NULL;
return tokens;
}
This code will allocate memory for 10 tokens at a time. If the memory allocation fails, it will end the program with a non-zero return value to indicate failure.
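For comparison, here is a sketch of the same idea that does not depend on the asker's append() helper and reports allocation failure to the caller instead of exiting. The name split_in_place and its signature are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of tokenize(). Returns a NULL-terminated array of
 * pointers into `line`, or NULL if allocation fails. The tokens are not
 * copied, so the caller frees only the array, never the tokens. */
char **split_in_place(char *line, const char *delims)
{
    size_t used = 0, cap = 0;
    char **tokens = NULL;
    char *token = strtok(line, delims);

    for (;;) {
        if (used == cap) {                  /* grow 10 slots at a time */
            char **tmp = realloc(tokens, (cap + 10) * sizeof *tmp);
            if (tmp == NULL) {
                free(tokens);
                return NULL;
            }
            tokens = tmp;
            cap += 10;
        }
        tokens[used] = token;               /* the final store is the NULL terminator */
        if (token == NULL)
            break;
        used++;
        token = strtok(NULL, delims);
    }
    return tokens;
}
```

Because the terminator goes through the same growth check as the tokens, the array always has room for it, and an empty line yields a valid list whose first entry is NULL. The result has the shape execvp expects, as long as line outlives the array.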

Error on free char** table

I have a function that takes a string and splits it into tokens. Because I want to return these tokens, I allocate a variable using malloc.
char** analyze(char* buffer)
{
int i= 0;
char* token[512];
char** final = (char**)malloc(strlen(buffer)+1);
if ( final == NULL ) { perror("Failed to malloc"); exit(10); }
token[i] = strtok(buffer, " ");
while( token[i] != NULL )
{
final[i] = malloc(strlen(token[i])+1);
if( final[i] == NULL ) { perror("Failed to malloc"); exit(11); }
final[i] = token[i];
i++;
token[i] = strtok(NULL, " ");
}
final[i] = malloc(sizeof(char));
if( final[i] == NULL ) { perror("Failed to malloc"); exit(12); }
final[i] = NULL;
return final;
}
And I try to free this table with another function:
void free_table(char** job)
{
int i = 0;
while( job[i] != NULL )
{
free(job[i]);
i++;
}
free(job[i]); //free the last
free(job);
}
In main I use:
char** job = analyze(buffer); // buffer contains the string
and
free_table(job);
when I try to free the table I get this error:
*** Error in `./a.out': free(): invalid pointer: 0x00007ffd003f62b0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fdb2e5497e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7fdb2e551e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7fdb2e55598c]
./a.out[0x4012d6]
and the error goes on...
What am I doing wrong?
To begin with:
char** final = (char**)malloc(strlen(buffer)+1);
This allocates strlen(buffer) + 1 bytes, not that number of elements. Since sizeof(char*) is larger than a single byte (typically 4 or 8), you may be allocating too little memory here; the element count has to be multiplied by sizeof(char*).
Since you don't know how many tokens there might be you should not allocate a fixed amount, but instead use realloc to reallocate as needed.
Then the second problem:
final[i] = malloc(strlen(token[i])+1);
...
final[i] = token[i];
In the first statement you allocate memory enough for the string pointed to by token[i], and assign the pointer to that memory to final[i]. But then you immediately reassign final[i] to point somewhere else, some memory that you haven't gotten from malloc. You should copy the string instead of reassigning the pointer:
strcpy(final[i], token[i]);
On an unrelated note, there's no need for token to be an array of pointers. It can be just a pointer:
char *token = strtok(...);
Example of a possible implementation:
char **analyze(char *buffer)
{
size_t current_token_index = 0;
char **tokens = NULL;
// Get the first "token"
char *current_token = strtok(buffer, " ");
while (current_token != NULL)
{
// (Re)allocate memory for the tokens array
char **temp = realloc(tokens, sizeof *temp * (current_token_index + 1));
if (temp == NULL)
{
// TODO: Better error handling
// (like freeing the tokens already allocated)
return NULL;
}
tokens = temp;
// Allocate memory for the "token" and copy it
tokens[current_token_index++] = strdup(current_token);
// Get the next "token"
current_token = strtok(NULL, " ");
}
// Final reallocation to make sure there is a terminating null pointer
char **temp = realloc(tokens, sizeof *temp * (current_token_index + 1));
if (temp == NULL)
{
// TODO: Better error handling
// (like freeing the tokens already allocated)
return NULL;
}
tokens = temp;
// Terminate the array
tokens[current_token_index] = NULL;
return tokens;
}
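Note that strdup only became part of standard C with C23; before that it was a POSIX function, though prevalent enough to assume it will exist. In the unlikely case where it doesn't exist, it's easy to implement yourself. Like malloc, it returns NULL on failure, so robust code should check its result.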
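If strdup is unavailable, a replacement could look like the sketch below. The name dup_string is hypothetical, and the free_table shown here is an illustrative counterpart for a table whose entries were each allocated separately, as in the implementation above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a heap copy of `s`,
 * or NULL if the allocation fails. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;        /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}

/* Frees a NULL-terminated table whose entries were individually allocated.
 * The terminating NULL entry needs no free() of its own. */
void free_table(char **table)
{
    if (table == NULL)
        return;
    for (size_t i = 0; table[i] != NULL; i++)
        free(table[i]);
    free(table);
}
```

This pairing only works when every entry really came from malloc (or strdup); entries that point into the original buffer must not be passed to free.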

Tokenizing a String - C

I'm trying to tokenize a string in C based upon \r\n delimiters, and want to print out each string after subsequent calls to strtok(). In a while loop I have, there is processing done to each token.
When I include the processing code, the only output I receive is the first token, however when I take the processing code out, I receive every token. This doesn't make sense to me, and am wondering what I could be doing wrong.
Here's the code:
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
int c = 0, c2 = 0;
char *tk, *tk2, *tk3, *tk4;
char buf[1024], buf2[1024], buf3[1024];
char host[1024], path[1024], file[1024];
strcpy(buf, "GET /~yourloginid/index.htm HTTP/1.1\r\nHost: remote.cba.csuohio.edu\r\n\r\n");
tk = strtok(buf, "\r\n");
while(tk != NULL)
{
printf("%s\n", tk);
/*
if(c == 0)
{
strcpy(buf2, tk);
tk2 = strtok(buf2, "/");
while(tk2 != NULL)
{
if(c2 == 1)
strcpy(path, tk2);
else if(c2 == 2)
{
tk3 = strtok(tk2, " ");
strcpy(file, tk3);
}
++c2;
tk2 = strtok(NULL, "/");
}
}
else if(c == 1)
{
tk3 = strtok(tk, " ");
while(tk3 != NULL)
{
if(c2 == 1)
{
printf("%s\n", tk3);
// strcpy(host, tk2);
// printf("%s\n", host);
}
++c2;
tk3 = strtok(NULL, " ");
}
}
*/
++c;
tk = strtok(NULL, "\r\n");
}
return 0;
}
Without those if else statements, I receive the following output...
GET /~yourloginid/index.htm HTTP/1.1
Host: remote.cba.csuohio.edu
...however, with those if else statements, I receive this...
GET /~yourloginid/index.htm HTTP/1.1
I'm not sure why I can't see the other token, because the program ends, which means that the loop must occur until the end of the entire string, right?
strtok stores "the point where the last token was found" :
"The point where the last token was found is kept internally by the function to be used on the next call (particular library implementations are not required to avoid data races)."
-- reference
That's why you can call it with NULL the second time.
So calling it again with a different pointer inside your loop makes you lose the state of the outer call: tk = strtok(NULL, "\r\n") at the end of the while loop continues from the state left by the inner loops, which is already exhausted, so it returns NULL.
So one solution is to change the last line of the while loop from:
tk = strtok(NULL, "\r\n");
to something like the following (check the bounds first: the pointer must not go past the end of the original string in buf, and strlen(tk) must be taken before any inner strtok call writes '\0' characters into tk):
tk = strtok(tk + strlen(tk) + 1, "\r\n");
Or use strtok_r, which stores the state externally.
// first call
char *saveptr1;
tk = strtok_r(buf, "\r\n", &saveptr1);
while(tk != NULL) {
//...
tk = strtok_r(NULL, "\r\n", &saveptr1);
}
strtok stores the state of the last token in internal static storage, so that the next call to strtok knows where to continue. So when you call strtok(buf2, "/"); in the if, it clobbers the saved state of the outer tokenization.
The fix is to use strtok_r instead of strtok. This function takes an extra argument that is used to store the state:
char *save1, *save2, *save3;
tk = strtok_r(buf, "\r\n", &save1);
while(tk != NULL) {
printf("%s\n", tk);
if(c == 0) {
strcpy(buf2, tk);
tk2 = strtok_r(buf2, "/", &save2);
while(tk2 != NULL) {
if(c2 == 1)
strcpy(path, tk2);
else if(c2 == 2) {
tk3 = strtok_r(tk2, " ", &save3);
strcpy(file, tk3); }
++c2;
tk2 = strtok_r(NULL, "/", &save2); }
} else if(c == 1) {
tk3 = strtok_r(tk, " ", &save2);
while(tk3 != NULL) {
if(c2 == 1) {
printf("%s\n", tk3);
// strcpy(host, tk2);
// printf("%s\n", host);
}
++c2;
tk3 = strtok_r(NULL, " ", &save2); } }
++c;
tk = strtok_r(NULL, "\r\n", &save1); }
return 0;
}
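As a self-contained illustration of nested tokenization with separate save pointers, here is a sketch that pulls the path and host out of a request like the one in the question. The struct request_info, the function parse_request, and the 256-byte field sizes are assumptions made for this example; strtok_r is a POSIX function, hence the feature-test macro.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes strtok_r() on strict compilers */
#endif
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for this example. */
struct request_info {
    char path[256];
    char host[256];
};

/* Splits `request` (modified in place) into lines with one save pointer and
 * each line into words with another, so neither loop disturbs the other.
 * Returns 0 if both a path and a host were found, -1 otherwise. */
int parse_request(char *request, struct request_info *out)
{
    char *line_state, *word_state;
    int line_no = 0;

    out->path[0] = out->host[0] = '\0';
    for (char *line = strtok_r(request, "\r\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\r\n", &line_state), line_no++) {
        char *first = strtok_r(line, " ", &word_state);
        char *second = strtok_r(NULL, " ", &word_state);
        if (first == NULL || second == NULL)
            continue;
        if (line_no == 0)                       /* "GET <path> HTTP/1.1" */
            snprintf(out->path, sizeof out->path, "%s", second);
        else if (strcmp(first, "Host:") == 0)   /* "Host: <host>" */
            snprintf(out->host, sizeof out->host, "%s", second);
    }
    return (out->path[0] != '\0' && out->host[0] != '\0') ? 0 : -1;
}
```

The inner strtok_r calls only write '\0' inside the current line, and the outer position lives in line_state, so the outer loop continues correctly after each line has been split.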
One thing that stands out to me is that unless you are doing something else with the string buffer, there is no need to copy each token to its own buffer. The strtok function returns a pointer to the beginning of the token, so you can use the token in place. The following code may work better and be easier to understand:
#define MAX_PTR 4
char buff[] = "GET /~yourloginid/index.htm HTTP/1.1\r\nHost: remote.cba.csuohio.edu\r\n\r\n";
char *ptr[MAX_PTR];
int i;
for (i = 0; i < MAX_PTR; i++)
{
if (i == 0) ptr[i] = strtok(buff, "\r\n");
else ptr[i] = strtok(NULL, "\r\n");
if (ptr[i] != NULL) printf("%s\n", ptr[i]);
}
The way that I defined the buffer is something that I call a pre-loaded buffer. You can use an array that is set equal to a string to initialize the array. The compiler will size it for you without you needing to do anything else. Now inside the for loop, the if statement determines which form of strtok is used. So if i == 0, then we need to initialize strtok. Otherwise, we use the second form for all subsequent tokens. Then the printf just prints the different tokens. Remember, strtok returns a pointer to a spot inside the buffer.
If you really are doing something else with the data and you really do need the buffer for other things, then the following code will work as well. This uses malloc to allocate blocks of memory from the heap.
#define MAX_PTR 4
char buff[] = "GET /~yourloginid/index.htm HTTP/1.1\r\nHost: remote.cba.csuohio.edu\r\n\r\n";
char *ptr[MAX_PTR];
char *bptr; /* buffer pointer */
int i;
for (i = 0; i < MAX_PTR; i++)
{
if (i == 0) bptr = strtok(buff, "\r\n");
else bptr = strtok(NULL, "\r\n");
if (bptr != NULL)
{
ptr[i] = malloc(strlen(bptr) + 2);
if (ptr[i] == NULL)
{
/* Malloc error check failed, exit program */
printf("Error: Memory Allocation Failed. i=%d\n", i);
exit(1);
}
strncpy(ptr[i], bptr, strlen(bptr) + 1);
ptr[i][strlen(bptr) + 1] = '\0';
printf("%s\n", ptr[i]);
}
else ptr[i] = NULL;
}
This code does pretty much the same thing, except that we are copying the token strings into buffers. Note that we use an array of char pointers to do this. The malloc call allocates memory, and we then check whether it failed: if malloc returns NULL, we exit the program. strncpy is used instead of strcpy because strcpy has no way to limit the number of bytes written, which is how buffer overflows happen; keep in mind that the limit only protects you when it reflects the size of the destination, as it does here because the buffer was sized from the token. The malloc was given strlen(bptr) + 2, which guarantees that the buffer is big enough to hold the token and its terminator. The strlen(bptr) + 1 length passed to strncpy makes sure that the copied data, including the terminating '\0', doesn't overrun the buffer. As an added precaution, the last byte in the buffer is set to 0x00. Then we print the string. Note the if (bptr != NULL) check: the main block of code is executed only if strtok returns a pointer to a valid string; otherwise we set the corresponding entry in the array to NULL.
Don't forget to free() the pointers in the array when you are done with them.
In your code, you are placing things in named buffers, which can be done, but it's not really good practice because then if you try to use the code somewhere else, you have to make extensive modifications to it.

C struct string error

Here are some chunks of my code to give you a view of my problem
typedef struct {
int count;
char *item;
int price;
char *buyer;
date *date;
}transaction;
transaction *open(FILE *src, char* path) {
char buffer[100], *token;
int count = 0;
transaction *tlist = (transaction*)malloc(sizeof(transaction));
tlist = alloc(tlist, count);
src = fopen(path, "r");
if (src != NULL) {
printf("\nSoubor nacten.\n");
}
else {
printf("\nChyba cteni souboru.\n");
return NULL;
}
while (fgets(buffer, sizeof(buffer), src)) {
tlist = alloc(tlist, count+1);
token = strtok(buffer, "\t"); //zahodit jméno obchodníka
tlist[count].count = strtok(NULL, "x");
tlist[count].item = strtok(NULL, "\t");
tlist[count].item++;
tlist[count].item[strlen(tlist[count].item)] = '\0';
tlist[count].price = atoi(strtok(NULL, "\t "));
token = strtok(NULL, "\t"); //zahodit md
tlist[count].buyer = strtok(NULL, "\t");
tlist[count].date = date_autopsy(strtok(NULL, "\t"));
count++;
}
fclose(src);
return tlist;
}
transaction *alloc(transaction *tlist, int count) {
if (count == 0) {
tlist[0].item = (char*)malloc(20 * sizeof(char));
tlist[0].buyer = (char*)malloc(20 * sizeof(char));
}
else {
tlist = (transaction*)realloc(tlist, count * sizeof(transaction));
tlist[count - 1].item = (char*)malloc(20 * sizeof(char));
tlist[count - 1].buyer = (char*)malloc(20 * sizeof(char));
}
return tlist;
}
First in main(), I create the list
transaction *list = (transaction*)malloc(sizeof(transaction));
Then with the right command, I call my opening function, which loads a file and splits each line into tokens that are stored in the structure. It all works fine. When I print (for testing) tlist[count].item inside the opening function, it prints the right thing. But when I try it outside (in main()), it prints garbage. It somehow works for the date and price parts of the structure. I assume the "buyer" string will be broken as well. Thanks in advance.
You are overwriting the pointers to the allocated memory with pointers into the local buffer for the item and buyer fields, so the strings are no longer valid in the calling function. Copy the tokens into the allocated memory instead:
tlist[count].item = strtok(NULL, "\t") -> strcpy(tlist[count].item, strtok(NULL, "\t"))
tlist[count].buyer = strtok(NULL, "\t") -> strcpy(tlist[count].buyer , strtok(NULL, "\t"))
The count field is an int, so convert its token with atoi, as you already do for price, instead of assigning the pointer. Make sure each token fits in the 20 bytes you allocated. Also check tlist[count].date: you should allocate memory for the date and use memcpy to copy its contents.
Since strtok returns a NUL-terminated string, what is the use of the following lines?
tlist[count].item++;
tlist[count].item[strlen(tlist[count].item)] = '\0';
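strtok does not copy anything. It keeps an internal static pointer to where it stopped, overwrites the delimiter after each token with '\0', and returns a pointer into the buffer you passed in.
Something like this:
char *strtok(...)
{
static char *next; // position saved between calls
// find the next token and terminate it in place
return token; // points into the caller's buffer
}
In your code that buffer is the local array buffer in open(). Every fgets call overwrites it, and it no longer exists once open() returns, so the pointers you stored end up pointing at stale data.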
A better use would be to allocate memory for the char * fields and use strcpy for the data returned from strtok
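A minimal sketch of that approach follows. The struct record is a simplified stand-in for the question's transaction type (an assumption: only two string fields are kept), and copy_token and parse_record are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the question's transaction struct. */
struct record {
    char *item;
    char *buyer;
};

/* Returns a heap copy of `tok` sized to fit it exactly, or NULL on failure. */
static char *copy_token(const char *tok)
{
    size_t size = strlen(tok) + 1;
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, tok, size);
    return copy;
}

/* Parses "item<TAB>buyer" through a local buffer, like the question does,
 * but stores heap copies so the fields stay valid after the function returns.
 * Returns 0 on success, -1 on malformed input or allocation failure. */
int parse_record(const char *text, struct record *out)
{
    char buffer[100];

    if (strlen(text) >= sizeof buffer)
        return -1;
    strcpy(buffer, text);

    char *item = strtok(buffer, "\t\n");
    char *buyer = strtok(NULL, "\t\n");
    if (item == NULL || buyer == NULL)
        return -1;

    out->item = copy_token(item);
    out->buyer = copy_token(buyer);
    if (out->item == NULL || out->buyer == NULL) {
        free(out->item);
        free(out->buyer);
        out->item = out->buyer = NULL;
        return -1;
    }
    return 0;
}
```

Sizing each copy from the token avoids the fixed 20-byte assumption, and the caller becomes responsible for freeing item and buyer when the record is no longer needed.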

easy c question: compare first char of a char array

How can I compare the first letter of the first element of a char**?
I have tried:
int main()
{
char** command = NULL;
while (true)
{
fgets(line, MAX_COMMAND_LEN, stdin);
parse_command(line, command);
exec_command(command);
}
}
void parse_command(char* line, char** command)
{
int n_args = 0, i = 0;
while (line[i] != '\n')
{
if (isspace(line[i++]))
n_args++;
}
for (i = 0; i < n_args+1; i++)
command = (char**) malloc (n_args * sizeof(char*));
i = 0;
line = strtok(line," \n");
while (line != NULL)
{
command[i++] = (char *) malloc ( (strlen(line)+1) * sizeof(char) );
strcpy(command[i++], line);
line = strtok(NULL, " \n");
}
command[i] = NULL;
}
void exec_command(char** command)
{
if (command[0][0] == '/')
// other stuff
}
but that gives a segmentation fault. What am I doing wrong?
Thanks.
Could you paste more code? Have you allocated memory both for your char* array and for the elements of your char* array?
The problem is that you do allocate a char* array inside parse_command, but the pointer to that array never gets out of the function, so exec_command still receives the original NULL pointer and dereferencing it crashes. The reason is that calling parse_command(line, command) passes a copy of the current value of the pointer command; the copy is overwritten inside the function, but the original value is not affected by this!
To achieve that, either you need to pass a pointer to the pointer you want to update, or you need to return the pointer to the allocated array from parse_command. Apart from char*** looking ugly (at least to me), the latter approach is simpler and easier to read:
int main()
{
char** command = NULL;
while (true)
{
fgets(line, MAX_COMMAND_LEN, stdin);
command = parse_command(line);
exec_command(command);
}
}
char** parse_command(char* line)
{
char** command = NULL;
int n_args = 0, i = 0;
while (line[i] != '\n' && line[i] != '\0')
{
if (isspace(line[i++]))
n_args++;
}
/* up to n_args + 1 tokens, plus one slot for the terminating NULL */
command = (char**) malloc ((n_args + 2) * sizeof(char*));
i = 0;
line = strtok(line," \n");
while (line != NULL)
{
command[i] = (char *) malloc ( (strlen(line)+1) * sizeof(char) );
strcpy(command[i++], line);
line = strtok(NULL, " \n");
}
command[i] = NULL;
return command;
}
Notes:
in your original parse_command, you allocate memory to command in a loop, which is unnecessary and just creates memory leaks. It is enough to allocate memory once. The line can hold up to n_args + 1 tokens and the array also needs a slot for the terminating NULL, so I allocate n_args + 2 pointers.
in the last while loop of parse_command, you increment i twice per token, so strcpy writes through a pointer that was never set, which is undefined behaviour, i.e. a possible segmentation fault. I fixed it here.
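For completeness, the other option mentioned in this answer, passing a pointer to the pointer, could look like the sketch below. The name parse_command_into, its return convention, and the way the array is sized are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical out-parameter version: the caller passes &command so the
 * function can store the new array where the caller can see it.
 * Returns the number of tokens, or -1 on allocation failure. */
int parse_command_into(char *line, char ***command_out)
{
    /* Upper bound on the token count: at best every other character
     * is a delimiter. One extra slot holds the terminating NULL. */
    size_t max_tokens = strlen(line) / 2 + 1;
    char **command = malloc((max_tokens + 1) * sizeof *command);
    if (command == NULL)
        return -1;

    int n = 0;
    for (char *tok = strtok(line, " \n"); tok != NULL; tok = strtok(NULL, " \n")) {
        command[n] = malloc(strlen(tok) + 1);
        if (command[n] == NULL) {
            while (n > 0)               /* undo the copies made so far */
                free(command[--n]);
            free(command);
            return -1;
        }
        strcpy(command[n], tok);
        n++;
    }
    command[n] = NULL;
    *command_out = command;             /* this is what the original code was missing */
    return n;
}
```

A caller would write parse_command_into(line, &command) and, since main loops, free each entry and then the array before parsing the next line.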

Resources