When running my code (shown in the first code block), I get this error:
*** Error in `./a.out': free(): invalid pointer: 0x0000000001e4c016 ***
I found a fix (which is shown in the second code block), but I don't understand why the error is happening in the first place.
I read the documentation regarding strtok_r, but I don't understand why assigning "str" to a new char* fixes the problem.
Doesn't "rest = str" mean that rest and str point to the same block of memory? How does this fix the problem?
Broken code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char* str = (char*) malloc(sizeof(char) * 128);
char* token;
printf("Enter something: ");
fgets(str, 128, stdin);
while ((token = strtok_r(str, " ", &str))) {
printf("%s\n", token);
}
free(str);
return (0);
}
Fixed code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char* str = (char*) malloc(sizeof(char) * 128);
char* token;
char* rest = str;
printf("Enter something: ");
fgets(str, 128, stdin);
while ((token = strtok_r(rest, " ", &rest))) {
printf("%s\n", token);
}
free(str);
return (0);
}
Evidently, a call to strtok_r changes the pointer str, which is passed to the call by reference as the third argument.
while ((token = strtok_r(str, " ", &str))) {
^^^^
printf("%s\n", token);
}
So after a call to the function, str can point inside the original string. It no longer holds the value it had after the call to malloc, which is the value free() needs.
Using the auxiliary variable rest keeps the initial value in the pointer str.
Note also that you are calling the function incorrectly. Here is its description:
On the first call to strtok_r(), str should point to the string to be
parsed, and the value of saveptr is ignored. In subsequent calls, str
should be NULL, and saveptr should be unchanged since the previous
call.
So for the second and subsequent calls of the function the first argument shall be NULL.
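A minimal sketch of the calling pattern that the excerpt describes, assuming a POSIX system where strtok_r is declared in <string.h>. The helper name count_tokens is hypothetical and not from the question. It passes the buffer only on the first call, passes NULL afterwards, and frees the pointer that malloc returned rather than the save pointer.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the strtok_r declaration */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the space-separated tokens in input.
   Returns 0 if the working copy cannot be allocated. */
size_t count_tokens(const char *input)
{
    size_t len = strlen(input) + 1;      /* +1 for the terminating '\0' */
    char *buf = malloc(len);             /* this exact pointer must reach free() */
    if (buf == NULL)
        return 0;
    memcpy(buf, input, len);

    size_t count = 0;
    char *save = NULL;                   /* state kept by strtok_r between calls */
    for (char *tok = strtok_r(buf, " \n", &save);   /* first call: the buffer */
         tok != NULL;
         tok = strtok_r(NULL, " \n", &save)) {      /* later calls: NULL */
        count++;
    }

    free(buf);                           /* buf was never used as the save pointer */
    return count;
}
```

Because buf is never handed to strtok_r as the third argument, it still holds the address malloc returned when free() is called.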
You should write:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char str[128];
char *token;
char *rest = str;
printf("Enter something: ");
fgets(str, sizeof str, stdin);
/* the first call gets the string; the update step passes NULL, so no token is skipped */
for (token = strtok_r(str, " ", &rest);
token != NULL;
token = strtok_r(NULL, " ", &rest))
{
printf("%s\n", token);
}
return (0);
}
First, you don't need to allocate the memory for str, as you can define a local array to store the data. You can then use the sizeof operator, so you don't run the risk of updating the size in one place and forgetting the other if you decide to change the size of str. If you do use malloc, you had better #define a constant for the size and use that constant everywhere the size of the allocated buffer is needed.
Second, never cast the returned value of malloc. Believe me, it is a very bad habit. When you cast, you tell the compiler you know what you are doing. Casting the result of malloc is a legacy from the days when C had no void type (as far back as the mid-eighties). Once upon a time, malloc() returned a char *, which normally was not the pointer type you wanted, so you had to cast it to match the one you were using. Casting malloc()'s return value in 2021 is not merely unnecessary but strongly discouraged, as many errors come from it: the compiler normally warns you when you do something bad, but a cast is interpreted as you doing something unusual on purpose, so the compiler stays silent.
Third, if you are going to extract all the tokens in a string, the first call to strtok() (or its friend strtok_r()) needs a first parameter pointing to the start of the string, but the remaining calls must pass NULL as the first parameter, or you'll be searching inside the token just returned instead of continuing after it. Your problem was not about choosing between strtok and strtok_r, as strtok_r is just a reentrant version of strtok that lets you run a nested tokenizing loop inside the first one, or call it from different threads.
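To illustrate the nested-loop point, here is a small sketch, assuming a POSIX system that provides strtok_r. The function name split_pairs and the "key=value;key=value" input format are hypothetical choices made for this example. Each loop level owns its own save pointer, which is what plain strtok cannot offer.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the strtok_r declaration */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "k1=v1;k2=v2" in place.
   Stores up to max key/value pointers and returns how many pairs were stored. */
size_t split_pairs(char *text, char *keys[], char *values[], size_t max)
{
    size_t n = 0;
    char *outer_save = NULL;             /* state of the ';' scan */
    for (char *pair = strtok_r(text, ";", &outer_save);
         pair != NULL && n < max;
         pair = strtok_r(NULL, ";", &outer_save)) {
        char *inner_save = NULL;         /* separate state for the '=' scan */
        char *key = strtok_r(pair, "=", &inner_save);
        char *value = strtok_r(NULL, "=", &inner_save);
        if (key != NULL && value != NULL) {
            keys[n] = key;
            values[n] = value;
            n++;
        }
    }
    return n;
}
```

The text argument must be a writable array, since both scans overwrite delimiters with '\0'.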
Heap memory management keeps track of base memory addresses for implementing library calls. We need to preserve those base-addresses to free/reallocate whenever necessary.
Now that you found a way to use strtok_r(), I prefer the below version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main () {
char orgStr [] = "strtok does not allow you to have 2 pointers going at once on the same string";
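/* NULL on the very first call works with glibc because rmdStr already points at the string; POSIX does not promise this */
for (char *token, *rmdStr = orgStr; token = strtok_r (NULL, " ", &rmdStr); /* empty */) {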
printf ("%s\n", token);
}
/* Original string is chopped up with NULCHAR, now unreliable */
}
I'm a C noob and I'm having problems with the following code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
void split_string(char *conf, char *host_ip[]){
long unsigned int conf_len = sizeof(conf);
char line[50];
strcpy(line, conf);
int i = 0;
char* token;
char* rest = line;
while ((token = strtok_r(rest, "_", &rest))){
host_ip[i] = token;
printf("-----------\n");
printf("token: %s\n", token);
i=i+1;
}
}
int main(){
char *my_conf[1];
my_conf[0] = "conf01_192.168.10.1";
char *host_ip[2];
split_string(my_conf[0], host_ip);
printf("%s\n",host_ip[0]);
printf("%s\n",host_ip[1]);
}
I want to modify the host_ip array inside the split_string function and then print the 2 resulting strings in the main.
However, the 2 last printf() are only printing unknown/random characters (maybe an address?). Any help?
There are 2 problems:
First, you're handing back pointers into a local variable: the tokens point into line, which no longer exists once split_string returns. You can avoid this by strdup()ing the tokens and freeing them in the caller.
Second:
On the first call to strtok_r(), str should point to the string to be parsed, and the value of saveptr is ignored. In subsequent calls, str should be NULL, and saveptr should be unchanged since the previous call.
I.e. you must pass NULL as the first argument after the first iteration of the loop. Nowhere is it said that it is OK to use the same pointer for both arguments.
This is because the strtok_r is an almost drop-in replacement to the braindead strtok, with just one extra argument, so that you could even wrap it with a macro...
Thus we get
char *start = rest;
while ((token = strtok_r(start, "_", &rest))){
host_ip[i] = strdup(token);
printf("-----------\n");
printf("token: %s\n", token);
i++;
start = NULL;
}
and in the caller:
free(host_ip[0]);
free(host_ip[1]);
You are storing the address of a local variable (line), which lives on the stack. The stack is LIFO and holds valid data for a function's local variables only during that function's lifetime. After that, the same stack memory will be given to another function's local variables. So the data stored in the memory of line[50] will be invalid after returning from the split_string function.
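An alternative that avoids both the dangling pointers and the strdup/free pair is to tokenize a buffer the caller owns. The sketch below assumes a POSIX strtok_r; the name split_in_place and its parameters are hypothetical and not taken from the question.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the strtok_r declaration */
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of split_string: tokenizes the caller's own buffer,
   so every pointer stored in parts[] stays valid after the function returns.
   Stores at most max tokens and returns how many were stored. */
size_t split_in_place(char *line, const char *delims, char *parts[], size_t max)
{
    size_t n = 0;
    char *save = NULL;
    char *start = line;                  /* non-NULL only for the first call */
    char *token;
    while (n < max && (token = strtok_r(start, delims, &save)) != NULL) {
        parts[n++] = token;
        start = NULL;                    /* continue in the same string */
    }
    return n;
}
```

The caller would pass a writable array such as char conf[] = "conf01_192.168.10.1"; rather than a string literal, and the tokens remain usable for as long as that array is in scope.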
Why do I get a segfault with the below code?
#include <stdio.h>
int main()
{
char * tmp = "0.1";
char * first = strtok(tmp, ".");
return 0;
}
Edited:
#include <stdio.h>
int main()
{
char tmp[] = "0.1";
char *first = strtok(tmp, ".");
char *second = strtok(tmp, "."); // Yes, should be NULL
printf("%s\n", first);
printf("Hello World\n");
return 0;
}
The segfault can be reproduced at the online gdb here:
https://www.onlinegdb.com/online_c_compiler
The problem with your first code is that tmp points at a string literal, which is read-only. When strtok tries to modify the string, it crashes.
The problem with your second code is a missing include:
#include <string.h>
This missing header means strtok is undeclared in your program. The C compiler assumes all undeclared functions return int. This is not true for strtok, which returns char *. The likely cause of the crash in your example is that the code is running on a 64-bit machine where pointers are 8 bytes wide but int is only 4 bytes. This messes up the return value of strtok, so first is a garbage pointer (and printf crashes when it tries to use it).
You can confirm this for yourself by doing
char *first = strtok(tmp, ".");
printf("%p %p\n", (void *)tmp, (void *)first);
The addresses printed for tmp and first should be identical (and they are if you #include <string.h>).
The funny thing is that gcc can warn you about these problems:
main.c: In function 'main':
main.c:6:19: warning: implicit declaration of function 'strtok' [-Wimplicit-function-declaration]
char *first = strtok(tmp, ".");
^
main.c:6:19: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
main.c:7:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
char *second = strtok(tmp, "."); // Yes, should be NULL
^
... and onlinegdb will show you these warnings, but only if compilation fails!
So to see compiler warnings on onlinegdb, you have to add a hard error to the code (e.g. by putting a # in the last line of the file).
The behaviour of the function strtok goes something like this:
Accept a string str or NULL and a string of delimiters characters.
The strtok function then begins to process the given string str, reading it character by character until it encounters a character present among the provided delimiter characters.
If the number of characters it has read before reaching the delimiter is > 0, it replaces the delimiter character with '\0' and returns a pointer to the first character in this iteration that was not a delimiter.
Else, if the number of characters it has read before reaching the delimiter is == 0, it continues iterating over the rest of the string without replacing this delimiter character with '\0'.
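The steps above can be sketched as a small function. This is an illustrative reimplementation, not the library's actual source: the name sketch_strtok is hypothetical, and it takes an explicit save pointer the way strtok_r does, whereas the real strtok keeps that state internally. The terminator it writes over a delimiter is '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of strtok-like behaviour with explicit state.
   Pass the string on the first call and NULL afterwards. */
char *sketch_strtok(char *str, const char *delims, char **save)
{
    if (str == NULL)
        str = *save;                     /* resume where the last call stopped */

    str += strspn(str, delims);          /* skip leading delimiters, unmodified */
    if (*str == '\0') {
        *save = str;                     /* nothing left: stay at the end */
        return NULL;
    }

    char *end = str + strcspn(str, delims);  /* first delimiter after the token */
    if (*end != '\0') {
        *end = '\0';                     /* overwrite the delimiter in place */
        end++;
    }
    *save = end;
    return str;
}
```

Because the function writes into the buffer, it needs a modifiable array, which is why a string literal passed to strtok can crash.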
I've created some code snippets which will help you better understand the nature of the function, https://ideone.com/6NCcrR and https://ideone.com/KVI5n4 (<- taking excerpts from your code)
Now to answer your question, including string.h header and setting
char tmp[] = "0.1"; should solve your issue.
With char * tmp = "0.1";, tmp points to a string literal that cannot be modified and strtok tries to modify the string by replacing . with '\0'.
Another approach, avoiding the segfault, would be to use strchr to find the dot and the precision field to print a limited number of characters. The sub-strings may be copied to other variables as well.
#include <stdio.h>
#include <string.h>
int main ( void) {
char * tmp = "0.1";
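char * first = strchr(tmp, '.');
if ( first) {
/* compute second only after checking that a dot was found */
char * second = first + 1;
/* the precision argument of %.*s must be an int */
printf ( "%.*s\n", (int)(first - tmp), tmp);
printf ( "%s\n", second);
}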
printf ( "Hello World\n");
return 0;
}
In the edited code, tmp does not point to a string literal, contrary to what a few answers and comments say.
With char *tmp = "0.1", tmp points to a string literal.
With char tmp[] = "0.1", tmp is a character array, and all array operations can be performed on it.
The segfault arises because no declaration for strtok is found, since string.h is not included, so gcc and other C compilers implicitly declare it as returning int.
Depending on the platform, the sizes of int and pointers may differ. If int is 4 bytes and a pointer is 8 bytes, the call effectively becomes
char *first = (int)strtok(tmp,".");
The pointer returned by strtok is truncated, and when you print, you dereference the address stored in first, which may lie outside any valid memory region, resulting in a segmentation fault or other undefined behavior.
In my case, casting the output of strtok to a type that is 8 bytes wide (long) avoided the segfault, but this is not a clean way to do it.
Include the proper header files to avoid undefined behavior.
I'm trying to split the input from fgets using strtok, and store the results in an array, i.e. newArgs, so I can then call execvp and essentially execute the input passed by fgets.
E.g. ls -la will map to /bin/ls -la and execute correctly.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char * argv[])
{
char buff[1024];
fgets(buff, 1024, stdin);
buff[strcspn(buff, "\n")] = 0;
printf("%s\n", buff);
printf("%d\n", strlen(buff));
char *newArgs[30];
char *token;
char delim[2] = " ";
token = strtok(buff, delim);
int i = 0;
while(token != NULL)
{
if(newArgs[i])
{
sprintf(newArgs[i], "%s", token);
printf("%s\n", newArgs[i]);
}
token = strtok(NULL, delim);
i++;
}
execvp(newArgs[0], newArgs);
return 0;
}
I keep getting a Segmentation fault, even though I'm checking the existence of newArgs[i], which is a little odd. Any ideas as to what's going wrong?
You're not allocating any memory for each element of newArgs. Try using a multi-dimensional array, like newArgs[30][100]. Don't forget to ensure they're null terminated.
Problems I see:
You are using uninitialized values of newArgs[i]. You have:
char *newArgs[30];
This is an array of uninitialized pointers. Then, you go on to use them as:
if(newArgs[i])
That is cause for undefined behavior. You can fix that by initializing the pointers to NULL.
char *newArgs[30] = {NULL};
You haven't allocated memory for newArgs[i] before calling
sprintf(newArgs[i], "%s", token);
That is also cause for undefined behavior. You can fix that by using:
newArgs[i] = strdup(token);
The list of arguments being passed to execvp must contain a NULL pointer.
From http://linux.die.net/man/3/execvp (emphasis mine):
The execv(), execvp(), and execvpe() functions provide an array of pointers to null-terminated strings that represent the argument list available to the new program. The first argument, by convention, should point to the filename associated with the file being executed. The array of pointers must be terminated by a NULL pointer.
You are missing the last requirement. You need to make sure that one of the elements of newArgs is a NULL pointer. This problem will go away if you initialize the pointers to NULL.
You are not allocating memory for newArgs[i] before storing the string in it.
Add
newArgs[i] = malloc(strlen(token) + 1); /* +1 for the terminating '\0' */
before the if statement inside the while loop.
There is absolutely no reason to copy the tokens you are finding in buff.
That won't always be the case, but it certainly is here: buff is not modified before the execvp and execvp doesn't return. Knowing when not to copy a C string is not as useful as knowing how to copy a C string, but both are important.
Not copying the strings will simplify the code considerably. All you need to do is fill in the array of strings which you will pass to execvp:
char* args[30]; /* Think about dynamic allocation instead */
char** arg = &args[0];
*arg = strtok(buff, " ");
while (*arg++) {
/* Should check for overflow of the args array */
*arg = strtok(NULL, " ");
}
execvp(args[0], args);
Note that the above code will store the NULL returned by strtok at the end of the args array. This is required by execvp, which needs to know where the last arg is.
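The overflow check mentioned in the comment could look roughly like the sketch below. The helper name build_argv and its capacity parameter are hypothetical. It uses only standard strtok and always leaves room for the NULL terminator that execvp needs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills argv with pointers into line and NULL-terminates it.
   capacity counts every slot in argv, including the one for the NULL.
   Returns the number of arguments stored; extra tokens are dropped. */
size_t build_argv(char *line, char *argv[], size_t capacity)
{
    if (capacity == 0)
        return 0;                        /* no room even for the terminator */

    size_t n = 0;
    char *token = strtok(line, " ");
    while (token != NULL && n + 1 < capacity) {
        argv[n++] = token;               /* points into line, no copy made */
        token = strtok(NULL, " ");
    }
    argv[n] = NULL;                      /* terminator required by execvp */
    return n;
}
```

A caller might then write build_argv(buff, args, 30) and, if the result is non-zero, call execvp(args[0], args). Silently dropping extra tokens is a simplification; a real program may prefer to report an error.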
So I'm new to C and the whole string manipulation thing, but I can't seem to get strtok() to work. It seems everywhere everyone has the same template for strtok being:
char* tok = strtok(source,delim);
do
{
{code}
tok=strtok(NULL,delim);
}while(tok!=NULL);
So I try to do this with the delimiter being the space key, and it seems that strtok() not only reads NULL after the first run (the first entry into the while/do-while) no matter how big the string, but it also seems to wreck the source, turning the source string into the same thing as tok.
Here is a snippet of my code:
char* str;
scanf("%ms",&str);
char* copy = malloc(sizeof(str));
strcpy(copy,str);
char* tok = strtok(copy," ");
if(strcasecmp(tok,"insert"))
{
printf(str);
printf(copy);
printf(tok);
}
Then, here is some output for the input "insert a b c d e f g"
aaabbbcccdddeeefffggg
"Insert" seems to disappear completely, which I think is the fault of strcasecmp(). Also, I would like to note that I realize strcasecmp() seems to all-lower-case my source string, and I do not mind. Anyhoo, input "insert insert insert" yields absolutely nothing in output. It's as if those functions just eat up the word "insert" no matter how many times it is present. I may* end up just using some of the C functions that read the string char by char but I would like to avoid this if possible. Thanks a million guys, i appreciate the help.
With the second snippet of code you have five problems: The first is that your format for the scanf function is non-standard: the 'm' modifier is not part of ISO C. It is a POSIX extension that asks scanf to allocate the buffer itself, so the code is not portable. Note also that "%s" stops at the first whitespace character, so at most one word is read either way.
The second problem is that you use the address-of operator on a pointer, which means that you pass a pointer to a pointer to a char (i.e. char**) to the scanf function. That is only correct for the non-standard "%ms" form. With the standard "%s" format, scanf wants a char * that points to an existing buffer, and since strings (either in pointer-to-character form or in array form) already are pointers, you don't use the address-of operator for string arguments.
The third problem, once you switch to the standard format, is that the pointer str is uninitialized. You have to remember that uninitialized local variables are truly uninitialized, and their values are indeterminate. In practice, it means that their values will be seemingly random. So str will point to some "random" memory.
The fourth problem is with the malloc call, where you use the sizeof operator on a pointer. This will return the size of the pointer and not what it points to.
The fifth problem is that the memory copy points to is far too small for what you put in it. You allocate only 4 or 8 bytes (depending on whether you're on a 32- or 64-bit platform, see the fourth problem), so the strcpy can write past the end of the allocation before strtok ever looks at it.
So, five problems in only four lines of code. That's pretty good! ;)
It looks like you're trying to print space delimited tokens following the word "insert" 3 times. Does this do what you want?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <strings.h> /* strcasecmp */
int main(int argc, char **argv)
{
char str[BUFSIZ] = {0};
char *copy;
char *tok;
int i;
// safely read a string and chop off any trailing newline
if(fgets(str, sizeof(str), stdin)) {
int n = strlen(str);
if(n && str[n-1] == '\n')
str[n-1] = '\0';
}
// copy the string so we can trash it with strtok
copy = strdup(str);
// look for the first space-delimited token
tok = strtok(copy, " ");
// check that we found a token and that it is equal to "insert"
if(tok && strcasecmp(tok, "insert") == 0) {
// iterate over all remaining space-delimited tokens
while((tok = strtok(NULL, " "))) {
// print the token 3 times
for(i = 0; i < 3; i++) {
fputs(tok, stdout);
}
}
putchar('\n');
}
free(copy);
return 0;
}
I'm having trouble figuring out how to pass strings back through the parameters of a function. I'm new to programming, so I imagine this is probably a beginner question. Any help you could give would be most appreciated. This code seg faults, and I'm not sure why, but I'm providing my code to show what I have so far.
I have made this a community wiki, so feel free to edit.
P.S. This is not homework.
This is the original version
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void
fn(char *baz, char *foo, char *bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
foo = malloc(strlen(pch));
strcpy(foo, pch);
pch = strtok (NULL, ":");
bar = malloc(strlen(pch));
strcpy(bar, pch);
return;
}
int
main(void)
{
char *mybaz, *myfoo, *mybar;
mybaz = "hello:world";
fn(mybaz, myfoo, mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
}
UPDATE Here's an updated version with some of the suggestions implemented:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLINE 1024
void
fn(char *baz, char **foo, char **bar)
{
char line[MAXLINE];
char *pch;
strcpy(line, baz);
pch = strtok (line, ":");
*foo = (char *)malloc(strlen(pch)+1);
(*foo)[strlen(pch)] = '\0';
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = (char *)malloc(strlen(pch)+1);
(*bar)[strlen(pch)] = '\0';
strcpy(*bar, pch);
return;
}
int
main(void)
{
char *mybaz, *myfoo, *mybar;
mybaz = "hello:world";
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free(myfoo);
free(mybar);
}
First thing, those mallocs should be for strlen(whatever)+1 bytes. C strings have a 0 character to indicate the end, called the NUL terminator, and it isn't included in the length measured by strlen.
Next thing, strtok modifies the string you're searching. You are passing it a pointer to a string which you're not allowed to modify (you can't modify literal strings). That could be the cause of the segfault. So instead of using a pointer to the non-modifiable string literal, you could copy it to your own, modifiable buffer, like this:
char mybaz[] = "hello:world";
What this does is put a size 12 char array on the stack, and copy the bytes of the string literal into that array. It works because the compiler knows, at compile time, how long the string is, and can make space accordingly. This saves using malloc for that particular copy.
The problem you have with references is that you're currently passing the value of mybaz, myfoo, and mybar into your function. You can't modify the caller's variables unless you pass a pointer to myfoo and mybar. Since myfoo is a char*, a pointer to it is a char**:
void
fn(char *baz, char **foo, char **bar) // take pointers-to-pointers
*foo = malloc(...); // set the value pointed to by foo
fn(mybaz, &myfoo, &mybar); // pass pointers to myfoo and mybar
Modifying foo in the function in your code has absolutely no effect on myfoo. myfoo is uninitialised, so if neither of the first two things is causing it, the segfault is most likely occurring when you come to print using that uninitialised pointer.
Once you've got it basically working, you might want to add some error-handling. strtok can return NULL if it doesn't find the separator it's looking for, and you can't call strlen with NULL. malloc can return NULL if there isn't enough memory, and you can't call strcpy with NULL either.
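A rough sketch of what that error handling might look like, using only standard C. The name fn_checked and the 0 / -1 return convention are assumptions made for this example, not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked version of fn: splits buf at ':' and hands back two
   heap copies through foo and bar.
   Returns 0 on success, -1 if a field is missing or allocation fails. */
int fn_checked(char *buf, char **foo, char **bar)
{
    char *first = strtok(buf, ":");
    char *second = (first != NULL) ? strtok(NULL, ":") : NULL;
    if (first == NULL || second == NULL)
        return -1;                       /* never call strlen on NULL */

    size_t first_len = strlen(first) + 1;    /* +1 for the terminating '\0' */
    size_t second_len = strlen(second) + 1;
    char *a = malloc(first_len);
    char *b = malloc(second_len);
    if (a == NULL || b == NULL) {
        free(a);                         /* free(NULL) is a no-op */
        free(b);
        return -1;
    }
    memcpy(a, first, first_len);
    memcpy(b, second, second_len);
    *foo = a;                            /* outputs are written only on success */
    *bar = b;
    return 0;
}
```

The caller checks the return value before printing and frees both strings afterwards. As elsewhere in this thread, buf must be a writable array.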
One thing everyone is overlooking is that you're calling strtok on an array stored in const memory. strtok writes to the array you pass it so make sure you copy that to a temporary array before calling strtok on it or just allocate the original one like:
char mybaz[] = "hello:world";
Ooh yes, little problem there.
As a rule, if you're going to be manipulating strings from inside a function, the storage for those strings had better be outside the function. The easy way to achieve this is to declare arrays outside the function (e.g. in main()) and to pass the arrays (which automatically become pointers to their beginnings) to the function. This works fine as long as your result strings don't overflow the space allocated in the arrays.
You've gone the more versatile but slightly more difficult route: You use malloc() to create space for your results (good so far!) and then try to assign the malloc'd space to the pointers you pass in. That, alas, will not work.
The pointer coming in is a value; you cannot change it. The solution is to pass a pointer to a pointer, and use it inside the function to change what the pointer is pointing to.
If you got that, great. If not, please ask for more clarification.
In C you typically pass by reference by passing 1) a pointer to the first element of the array, and 2) the length of the array.
The length of the array can sometimes be omitted if you are sure about your buffer size; for strings, one can find the length by looking for the null terminator (a character with the value 0, written '\0').
It seems from your code example though that you are trying to set the value of what a pointer points to. So you probably want a char** pointer. And you would pass in the address of your char* variable(s) that you want to set.
You're wanting to pass back 2 pointers. So you need to call it with a pair of pointers to pointers. Something like this:
void
fn(char *baz, char **foo, char **bar) {
...
*foo = malloc( ... );
...
*bar = malloc( ... );
...
}
The code most likely segfaults because you are allocating space for the string but forgetting that a string has an extra byte on the end, the null terminator.
Also you are only passing a pointer in. Since a pointer is a 32-bit value (on a 32-bit machine) you are simply passing the value of the uninitialised pointer into "fn". In the same way you wouldn't expect an integer passed into a function to be returned to the calling function (without explicitly returning it), you can't expect a pointer to do the same. So the new pointer values are never returned back to the main function. Usually you do this by passing a pointer to a pointer in C.
Also don't forget to free dynamically allocated memory!!
void
fn(char *baz, char **foo, char **bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
*foo = malloc(strlen(pch) + 1);
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = malloc(strlen(pch) + 1);
strcpy(*bar, pch);
return;
}
int
main(void)
{
char mybaz[] = "hello:world"; /* writable array: strtok modifies it */
char *myfoo, *mybar;
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free(myfoo);
free(mybar);
}
Other answers describe how to fix your code to work, but an easy way to accomplish what you mean to do is strdup(), which allocates new memory of the appropriate size and copies the correct characters in.
Still need to fix the business with char* vs char**, though. There's just no way around that.
The essential problem is that although storage is allocated (with malloc()) for the results you are trying to return as myfoo and mybar, the pointers to those allocations are not actually returned to main(). As a result, the later call to printf() is quite likely to dump core.
The solution is to declare the arguments as pointer to pointer to char, and pass the addresses of myfoo and mybar to fn. Something like this (untested) should do the trick:
void
fn(char *baz, char **foo, char **bar)
{
char *pch;
/* this is the part I'm having trouble with */
pch = strtok (baz, ":");
*foo = malloc(strlen(pch)+1); /* include space for NUL termination */
strcpy(*foo, pch);
pch = strtok (NULL, ":");
*bar = malloc(strlen(pch)+1); /* include space for NUL termination */
strcpy(*bar, pch);
return;
}
int
main(void)
{
char mybaz[] = "hello:world";
char *myfoo, *mybar;
fn(mybaz, &myfoo, &mybar);
fprintf(stderr, "%s %s", myfoo, mybar);
free(myfoo);
free(mybar);
}
Don't forget to free each allocated string at some later point or you will create memory leaks.
To do both the malloc() and strcpy() in one call, it would be better to use strdup(), as it also remembers to allocate room for the terminating NUL which you left out of your code as written. *foo = strdup(pch) is much clearer and easier to maintain than the alternative. Since strdup() is POSIX and not ANSI C, you might need to implement it yourself, but the effort is well repaid by the resulting clarity for this kind of usage.
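If strdup() is not available, a stand-in takes only a few lines. The sketch below uses the hypothetical name dup_string to avoid clashing with a library that does provide strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup() on systems that lack it.
   Returns a heap copy of src, or NULL if allocation fails. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;        /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;                     /* the caller must check for this */
    memcpy(copy, src, len);
    return copy;
}
```

With that in place, *foo = dup_string(pch); replaces the separate malloc and strcpy calls, and the caller still releases the result with free().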
The other traditional way to return a string from a C function is for the caller to allocate the storage and provide its address to the function. This is the technique used by sprintf(), for example. It suffers from the problem that there is no way to make such a call site completely safe against buffer overrun bugs caused by the called function assuming more space has been allocated than is actually available. The traditional repair for this problem is to require that a buffer length argument also be passed, and to carefully validate both the actual allocation and the length claimed at the call site in code review.
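A minimal sketch of that caller-allocates pattern with an explicit length argument. The function name copy_first_field and its error convention are hypothetical. It does not modify the source string, so it also works on string literals.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the part of src before the first sep into dst.
   dstlen is the full size of dst, including room for the '\0'.
   Returns 0 on success, -1 if the field does not fit (dst is left untouched). */
int copy_first_field(const char *src, char sep, char *dst, size_t dstlen)
{
    const char *end = strchr(src, sep);
    size_t len = (end != NULL) ? (size_t)(end - src) : strlen(src);
    if (len + 1 > dstlen)
        return -1;                       /* refuse rather than overrun dst */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

A call such as copy_first_field("hello:world", ':', out, sizeof out) with char out[8] stores "hello" in out. Passing sizeof out keeps the claimed length tied to the actual array, which is the part a code review has to verify.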
Edit:
The actual segfault you are getting is likely to be inside strtok(), not printf() because your sample as written is attempting to pass a string constant to strtok() which must be able to modify the string. This is officially Undefined Behavior.
The fix for this issue is to make sure that mybaz is declared as an initialized array, and not as a pointer to char. The initialized array will be located in writable memory, while the string constant is likely to be located in read-only memory. In many cases, string constants are stored in the same part of memory used to hold the executable code itself, and modern systems all try to make it difficult for a program to modify its own running code.
In the embedded systems I work on for a living, the code is likely to be stored in a ROM of some sort, and cannot be physically modified.