Storing the result of strtok in an array in C

I'm trying to split the input from fgets using strtok, and store the results in an array, i.e. newArgs, so I can then call execvp and essentially execute the input passed by fgets.
E.g. ls -la will map to /bin/ls -la and execute correctly.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char * argv[])
{
char buff[1024];
fgets(buff, 1024, stdin);
buff[strcspn(buff, "\n")] = 0;
printf("%s\n", buff);
printf("%d\n", strlen(buff));
char *newArgs[30];
char *token;
char delim[2] = " ";
token = strtok(buff, delim);
int i = 0;
while(token != NULL)
{
if(newArgs[i])
{
sprintf(newArgs[i], "%s", token);
printf("%s\n", newArgs[i]);
}
token = strtok(NULL, delim);
i++;
}
execvp(newArgs[0], newArgs);
return 0;
}
I keep getting a Segmentation fault, even though I'm checking the existence of newArgs[i], which is a little odd. Any ideas as to what's going wrong?

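You're not allocating any memory for each element of newArgs. Each pointer needs storage to point at, for example a row of a two-dimensional array such as char storage[30][100] (execvp still needs an array of char * whose elements point at those rows). Don't forget to ensure the strings are null-terminated.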

Problems I see:
You are using uninitialized values of newArgs[i]. You have:
char *newArgs[30];
This is an array of uninitialized pointers. Then, you go on to use them as:
if(newArgs[i])
That is cause for undefined behavior. You can fix that by initializing the pointers to NULL.
char *newArgs[30] = {NULL};
You haven't allocated memory for newArgs[i] before calling
sprintf(newArgs[i], "%s", token);
That is also cause for undefined behavior. You can fix that by using:
newArgs[i] = strdup(token);
The list of arguments being passed to execvp must contain a NULL pointer.
From http://linux.die.net/man/3/execvp (emphasis mine):
The execv(), execvp(), and execvpe() functions provide an array of pointers to null-terminated strings that represent the argument list available to the new program. The first argument, by convention, should point to the filename associated with the file being executed. The array of pointers must be terminated by a NULL pointer.
You are missing the last requirement. You need to make sure that the element after the last argument in newArgs is a NULL pointer. This problem will go away if you initialize the pointers to NULL.

You are not allocating memory for newArgs[i] before storing the string in it.
Add
newArgs[i] = malloc(strlen(token) + 1);
before the if statement inside the while loop. The + 1 leaves room for the terminating null character.

There is absolutely no reason to copy the tokens you are finding in buff.
That won't always be the case, but it certainly is here: buff is not modified before the execvp and execvp doesn't return. Knowing when not to copy a C string is not as useful as knowing how to copy a C string, but both are important.
Not copying the strings will simplify the code considerably. All you need to do is fill in the array of strings which you will pass to execvp:
char* args[30]; /* Think about dynamic allocation instead */
char** arg = &args[0];
*arg = strtok(buff, " ");
while (*arg++) {
/* Should check for overflow of the args array */
*arg = strtok(NULL, " ");
}
execvp(args[0], args);
Note that the above code will store the NULL returned by strtok at the end of the args array. This is required by execvp, which needs to know where the last arg is.

Related

invalid pointer when using strtok_r

When running my code (shown in the first code block), I get this error:
*** Error in `./a.out': free(): invalid pointer: 0x0000000001e4c016 ***
I found a fix (which is shown in the second code block), but I don't understand why the error is happening in the first place.
I read the documentation regarding strtok_r, but I don't understand why assigning "str" to a new char* fixes the problem.
Doesn't "rest = str" mean that rest and str point to the same block of memory. How does this fix the problem???
Broken code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char* str = (char*) malloc(sizeof(char) * 128);
char* token;
printf("Enter something: ");
fgets(str, 128, stdin);
while ((token = strtok_r(str, " ", &str))) {
printf("%s\n", token);
}
free(str);
return (0);
}
Fixed code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char* str = (char*) malloc(sizeof(char) * 128);
char* token;
char* rest = str;
printf("Enter something: ");
fgets(str, 128, stdin);
while ((token = strtok_r(rest, " ", &rest))) {
printf("%s\n", token);
}
free(str);
return (0);
}
Evidently, a call to strtok_r changes the pointer str, because str is passed by reference as the third parameter.
while ((token = strtok_r(str, " ", &str))) {
^^^^
printf("%s\n", token);
}
So after a call to the function, str can point somewhere inside the original string, and it no longer holds the value that malloc returned. Passing that changed pointer to free is what triggers the error.
Using the auxiliary variable rest keeps the initial value in str.
Note also that you are calling the function incorrectly. Here is its description:
On the first call to strtok_r(), str should point to the string to be
parsed, and the value of saveptr is ignored. In subsequent calls, str
should be NULL, and saveptr should be unchanged since the previous
call.
So for the second and subsequent calls of the function the first argument shall be NULL.
You should write:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char str[128];
char *token;
char *rest = str;
printf("Enter something: ");
fgets(str, sizeof str, stdin);
for (token = strtok_r(str, " ", &rest);
token != NULL;
token = strtok_r(NULL, " ", &rest))
{
printf("%s\n", token);
}
return (0);
}
First, you don't need to allocate the memory for str, as you can define a local array to store the data. Using the sizeof operator on that array means the size is written in only one place, so you don't risk forgetting to update a second copy if you decide to change the size of str. If you do use malloc, #define a constant for the size and use that constant everywhere the size of the allocated buffer is needed.
Second, never cast the returned value of malloc. Believe me, it is a very bad habit. When you cast, you tell the compiler you know what you are doing. Casting the value of malloc is a legacy from when there was no void type in C (as far back as the mid-eighties). Once upon a time, malloc() returned a char *, which normally was not the type of pointer you wanted, so you had to cast it to match the one you were using. Casting malloc()'s return value in 2021 is not only unnecessary but strongly discouraged, because it hides errors: the compiler normally warns you when you do something suspicious, but a cast is read as you doing something unusual on purpose, so the compiler stays silent.
Third, if you are going to extract all the tokens in a string, the first call to strtok() (or its friend strtok_r()) needs a first parameter pointing to the start of the string, but all later calls must pass NULL as the first parameter, or you'll be searching inside the token just returned instead of continuing after it. Your problem was not about using strtok versus strtok_r, as strtok_r is just a reentrant version of strtok that lets you run a nested loop inside the first one, or call it from different threads.
Heap memory management keeps track of base memory addresses for implementing library calls. We need to preserve those base-addresses to free/reallocate whenever necessary.
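Now that you found a way to use strtok_r(), I prefer the version below. It passes NULL even on the first call and relies on rmdStr already pointing at the string; common implementations such as glibc accept that, but POSIX only describes a first call with a non-NULL string, so treat it as non-portable: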
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main () {
char orgStr [] = "strtok does not allow you to have 2 pointers going at once on the same string";
for (char *token, *rmdStr = orgStr; token = strtok_r (NULL, " ", &rmdStr); /* empty */) {
printf ("%s\n", token);
}
/* Original string is chopped up with NULCHAR, now unreliable */
}

Wrong output after modifying an array in a function (in C)

I'm a C noob and I'm having problems with the following code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
void split_string(char *conf, char *host_ip[]){
long unsigned int conf_len = sizeof(conf);
char line[50];
strcpy(line, conf);
int i = 0;
char* token;
char* rest = line;
while ((token = strtok_r(rest, "_", &rest))){
host_ip[i] = token;
printf("-----------\n");
printf("token: %s\n", token);
i=i+1;
}
}
int main(){
char *my_conf[1];
my_conf[0] = "conf01_192.168.10.1";
char *host_ip[2];
split_string(my_conf[0], host_ip);
printf("%s\n",host_ip[0]);
printf("%s\n",host_ip[1]);
}
I want to modify the host_ip array inside the split_string function and then print the 2 resulting strings in the main.
However, the 2 last printf() are only printing unknown/random characters (maybe an address?). Any help?
There are 2 problems:
First, you're returning pointers to local variables. You can avoid this by strduping the strings and freeing in the caller.
Second:
On the first call to strtok_r(), str should point to the string to be parsed, and the value of saveptr is ignored. In subsequent calls, str should be NULL, and saveptr should be unchanged since the previous call.
I.e. you must pass NULL as the first argument after the first iteration of the loop. Nowhere is it said that it is OK to use the same pointer for both arguments.
This is because the strtok_r is an almost drop-in replacement to the braindead strtok, with just one extra argument, so that you could even wrap it with a macro...
Thus we get
char *start = rest;
while ((token = strtok_r(start, "_", &rest))){
host_ip[i] = strdup(token);
printf("-----------\n");
printf("token: %s\n", token);
i++;
start = NULL;
}
and in the caller:
free(host_ip[0]);
free(host_ip[1]);
You are storing the address of a local variable (line), which lives on the stack. The stack is LIFO, and its memory holds valid data for a function's local variables only during that function's lifetime. After that, the same stack memory will be reused for another function's local variables. So the data stored in line[50] is invalid once split_string returns.

Double pointer as argument to execvp()

I am trying to execute execvp() using a custom **tokens double pointer as input, instead of argv[] on a "create a custom shell" assignment, like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
int main(){
char *token;
char **tokens = malloc(sizeof(char*)*512); //512 is for the maximum input-command length
char *command=malloc(sizeof(char)*512);
int i = 0;
pid_t child_pid;
int status;
//***********take input from user*************************************
fgets(command,512,stdin);
//***********parse input*********************************************
token = strtok(command," \t");
while( token != NULL ) {
tokens[i]=token;
i ++;
token = strtok(NULL, " \t");
}
child_pid = fork();
if(child_pid == 0) {
/* This is done by the child process. */
execvp(tokens[0], tokens);
} else {
waitpid(child_pid, &status, WUNTRACED);
}
}
The problem is definitely on this line:
execvp(tokens[0], tokens);
and I just can't understand why it can't be executed and print to my stdout.
I have tried this:
execvp("ls", tokens);
and it works just fine.
And this:
printf("%s\n", tokens[0]);
with the output being (according to the test input: ls ):
ls
You have several problems in your code, including:
The array of argument pointers passed to execvp() must be terminated by a null pointer. You do not ensure that.
The string obtained via fgets will include all characters up to and including the line's newline, if the buffer is large enough to accommodate it. You do not include the newline among your token delimiter characters, so for a one-word command ls, the command passed to execvp() is equivalent to "ls\n", not "ls". It is unlikely (but not impossible) that ls\n is an available command on your machine.
You do not check the return value of execvp(), or of any of your other functions, nor do you handle any errors. execvp() is special in that it returns only if there is an error, but you would have saved yourself some confusion if you had handled that case by emitting an error message.
After I correct the first two of those, your program successfully runs an "ls" command for me.
You need to allocate the memory with sizeof(char *).
char **tokens = malloc(sizeof(char *)*512);
^^----------->Size of char pointer
As of now you are allocating sizeof(char) thus invoking undefined behavior.
Also consider the first comment pointed by #n.m

C String parsing errors with strtok(),strcasecmp()

So I'm new to C and the whole string manipulation thing, but I can't seem to get strtok() to work. It seems everywhere everyone has the same template for strtok being:
char* tok = strtok(source,delim);
do
{
{code}
tok=strtok(NULL,delim);
}while(tok!=NULL);
So I try to do this with the delimiter being the space key, and it seems that strtok() not only returns NULL after the first run (the first entry into the while/do-while) no matter how big the string, but it also seems to wreck the source, turning the source string into the same thing as tok.
Here is a snippet of my code:
char* str;
scanf("%ms",&str);
char* copy = malloc(sizeof(str));
strcpy(copy,str);
char* tok = strtok(copy," ");
if(strcasecmp(tok,"insert"))
{
printf(str);
printf(copy);
printf(tok);
}
Then, here is some output for the input "insert a b c d e f g"
aaabbbcccdddeeefffggg
"Insert" seems to disappear completely, which I think is the fault of strcasecmp(). Also, I would like to note that I realize strcasecmp() seems to all-lower-case my source string, and I do not mind. Anyhoo, input "insert insert insert" yields absolutely nothing in output. It's as if those functions just eat up the word "insert" no matter how many times it is present. I may* end up just using some of the C functions that read the string char by char but I would like to avoid this if possible. Thanks a million guys, i appreciate the help.
With the second snippet of code you have five problems: The first is that the m in your scanf format is not part of standard C. It is a POSIX.1-2008 extension (supported by glibc) that makes scanf allocate the buffer itself and store its address through a char ** argument. On such a system passing &str is correct, and you should free(str) when you are done; on an implementation without the extension the call has undefined behavior, because str is an uninitialized pointer.
The second problem is that %s, with or without the m, stops at the first whitespace character. For the input "insert a b c" each call reads a single word, so the string never contains a space for strtok to split on. Read the whole line with fgets instead.
The third problem is with the malloc call, where you use the sizeof operator on a pointer. This will return the size of the pointer (typically 4 or 8 bytes, depending on whether you're on a 32- or 64-bit platform) and not the length of what it points to, so the strcpy can overflow the buffer. Use strlen(str) + 1.
The fourth problem is that strcasecmp returns 0 when the strings are equal, so if(strcasecmp(tok,"insert")) is entered only when tok is not "insert". That is why "insert" seems to disappear. Note that strcasecmp does not modify either string.
The fifth problem is that printf(str) uses your input as the format string. Write printf("%s", str) instead.
So, five problems in only a few lines of code. That's pretty good! ;)
It looks like you're trying to print space delimited tokens following the word "insert" 3 times. Does this do what you want?
#include <stdio.h>
#include <string.h>
#include <strings.h> /* strcasecmp */
#include <stdlib.h>
int main(int argc, char **argv)
{
char str[BUFSIZ] = {0};
char *copy;
char *tok;
int i;
// safely read a string and chop off any trailing newline
if(fgets(str, sizeof(str), stdin)) {
int n = strlen(str);
if(n && str[n-1] == '\n')
str[n-1] = '\0';
}
// copy the string so we can trash it with strtok
copy = strdup(str);
// look for the first space-delimited token
tok = strtok(copy, " ");
// check that we found a token and that it is equal to "insert"
if(tok && strcasecmp(tok, "insert") == 0) {
// iterate over all remaining space-delimited tokens
while((tok = strtok(NULL, " "))) {
// print the token 3 times
for(i = 0; i < 3; i++) {
fputs(tok, stdout);
}
}
putchar('\n');
}
free(copy);
return 0;
}

Splitting a string with strtok() goes wrong

I'm trying to get input from the user while allocating it dynamically and then "split" it using strtok.
Main questions:
1. I'm getting an infinite loop of "a{\300_\377" and ",".
2. Why do I get a warning of "Implicitly declaring library function "malloc"/"realloc" with type void"?
Other, less important questions:
3. I want to break if the input includes "-1". How do I check for it? As you can see, it currently breaks on 1.
4. In getWordsArray() I want to return a pointer to an array of strings. Since I don't know how many strings there are, do I also need to allocate it dynamically, as in getInput()? (I don't know how many chars there are in each string.)
int main(int argc, const char * argv[])
{
char input = getInput();
getWordsArray(&input);
}
char getInput()
{
char *data,*temp;
data=malloc(sizeof(char));
char c; /* c is the current character */
int i; /* i is the counter */
printf ("\n Enter chars and to finish push new line:\n");
for (i=0;;i++) {
c=getchar(); /* put input character into c */
if (c== '1') // need to find a way to change it to -1
break;
data[i]=c; /* put the character into the data array */
temp=realloc(data,(i+1)*sizeof(char)); /* give the pointer some memory */
if ( temp != NULL ) {
data=temp;
} else {
free(data);
printf("Error allocating memory!\n");
return 0 ;
}
}
printf("list is: %s\n",data); // for checking
return *data;
}
void getWordsArray(char *input)
{
char *token;
char *search = " ,";
token = strtok (input,search);
while (token != NULL ) {
printf("%s\n",token);
token = strtok(NULL,search);
}
}
EDIT:
I noticed I forgot the strtok call, so I changed it to token = strtok(NULL,search);
I still get weird output from the printf:
\327{\300_\377
Change:
int main(int argc, const char * argv[])
{
char input = getInput();
getWordsArray(&input);
}
to:
int main(int argc, const char * argv[])
{
char *input = getInput();
getWordsArray(input);
}
with a similar change to the return type of getInput():
char *getInput()
{
// ...
return data;
}
In your code, you were only saving the first character of the input string, and then passing mostly garbage to getWordsArray().
For your malloc() question, man malloc starts with:
SYNOPSIS
#include <stdlib.h>
For your getchar() question, perhaps see I'm trying to understand getchar() != EOF, etc.
Joseph answered Q1.
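Q2: malloc and realloc return void *, which C converts to char * implicitly, so no cast is needed. The "implicitly declaring" warning means the compiler has not seen their declarations; include the header that provides them:
#include <stdlib.h>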
Q3: 1 can be interpreted as one character. -1, while converting to characters, is equivalent to string "-1" which has character '-' and '1'. In order to check against -1, you need to use strcmp or strncmp to compare against the string "-1".
Q4: If you are going to return a different copy, yes, dynamically allocate memory is a good idea. Alternatively, you can put all pointers to each token into a data structure like a linked list for future reference. This way, you avoid making copies and just allow access to each token in the string.
Things that are wrong:
Strings in C are null-terminated. The %s argument to printf means "just keep printing characters until you hit a '\0'". Since you don't null-terminate data before printing it, printf is running off the end of data and just printing your heap (which happens to not contain any null bytes to stop it).
What headers did you #include? Missing <stdlib.h> is the most obvious reason for an implicit declaration of malloc.
getInput returns the first char of data by value. This is not what you want. (getWordsArray will never work. Also see 1.)
Suggestions:
Here's one idea for breaking on -1: if ((c == '1') && (i > 0) && (data[i-1] == '-'))
To get an array of the strings you would indeed need a dynamic array of char *. You could either malloc a new string to copy each token that strtok returns, or just save each token directly as a pointer into input.
