C - split string into an array of strings

I'm not completely sure how to do this in C:
char* curToken = strtok(string, ";");
//curToken = "ls -l" we will say
//I need a array of strings containing "ls", "-l", and NULL for execvp()
How would I go about doing this?

Since you've already looked into strtok, just continue down the same path and split your string using a space (' ') as the delimiter, then use something such as realloc to grow the array holding the elements to be passed to execvp.
See the example below, but keep in mind that strtok modifies the string passed to it. If you don't want this to happen, you need to make a copy of the original string first, using strcpy or a similar function.
char str[]= "ls -l";
char ** res = NULL;
char * p = strtok (str, " ");
int n_spaces = 0, i;
/* split string and append tokens to 'res' */
while (p) {
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1); /* memory allocation failed */
res[n_spaces-1] = p;
p = strtok (NULL, " ");
}
/* realloc one extra element for the last NULL */
res = realloc (res, sizeof (char*) * (n_spaces+1));
res[n_spaces] = 0;
/* print the result */
for (i = 0; i < (n_spaces+1); ++i)
printf ("res[%d] = %s\n", i, res[i]);
/* free the memory allocated */
free (res);
res[0] = ls
res[1] = -l
res[2] = (null)
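The same idea can be wrapped in a function. This is only a sketch: split_args and its count parameter are names made up for illustration, and it assumes tokens are separated by plain spaces (no quoting or escaping as a real shell would need). It grows the array by doubling and keeps the old block when realloc fails, so nothing leaks on error.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): splits `line` in place
 * on spaces and returns a malloc'd, NULL-terminated array of pointers into
 * it. Returns NULL if allocation fails. The caller frees only the array;
 * the tokens themselves live inside `line`. */
char **split_args(char *line, size_t *count)
{
    size_t n = 0, cap = 4;
    char **args = malloc(cap * sizeof *args);
    if (args == NULL)
        return NULL;

    for (char *tok = strtok(line, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (n + 1 >= cap) {                /* keep one slot free for NULL */
            cap *= 2;
            char **tmp = realloc(args, cap * sizeof *args);
            if (tmp == NULL) {             /* the old block is still valid */
                free(args);
                return NULL;
            }
            args = tmp;
        }
        args[n++] = tok;
    }
    args[n] = NULL;                        /* terminator expected by execvp() */
    if (count != NULL)
        *count = n;
    return args;
}
```

The returned array can be handed to execvp(args[0], args) as long as `line` stays alive and at least one token was found.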

Here is an example of how to use strtok borrowed from MSDN.
The relevant bits are below; note that you need to call strtok multiple times. The token char* is the part you would store in an array (you can figure that part out).
char string[] = "A string\tof ,,tokens\nand some more tokens";
char seps[] = " ,\t\n";
char *token;
int main( void )
{
printf( "Tokens:\n" );
/* Establish string and get the first token: */
token = strtok( string, seps );
while( token != NULL )
{
/* While there are tokens in "string" */
printf( " %s\n", token );
/* Get next token: */
token = strtok( NULL, seps );
}
}

Related

Function calls between strtok?

So I'm using strtok to split a char array by " ". Then I pass each word I split off to a function that determines a value for the word based on a list. However, every time I place the function call in the middle of the while loop that splits the char array, the loop stops.
Do I have to split the array, store it in another array and then go through the second array?
p = strtok(temp, " ");
while (p != NULL) {
value = get_score(score, scoresize, p);
points = points + value;
p = strtok(NULL, " ");
}
So as long as value = get_score(score, scoresize, p); is there, the while loop breaks after the first word.
strtok() uses a hidden state variable to keep track of its position in the source string. If strtok is used again, directly or indirectly, in get_score(), this hidden state is overwritten, which makes the call p = strtok(NULL, " "); meaningless.
Do not use strtok() this way. Either use the improved version strtok_r, standardized in POSIX and available on many systems, or re-implement it with strspn and strcspn:
#include <string.h>
char *my_strtok_r(char *s, char *delim, char **context) {
char *token = NULL;
if (s == NULL)
s = *context;
/* skip initial delimiters */
s += strspn(s, delim);
if (*s != '\0') {
/* we have a token */
token = s;
/* skip the token */
s += strcspn(s, delim);
if (*s != '\0') {
/* cut the string to terminate the token */
*s++ = '\0';
}
}
*context = s;
return token;
}
...
char *state;
p = my_strtok_r(temp, " ", &state);
while (p != NULL) {
value = get_score(score, scoresize, p);
points = points + value;
p = my_strtok_r(NULL, " ", &state);
}
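To see why separate state matters, here is a small sketch of nested tokenizing with POSIX strtok_r. count_fields is a made-up stand-in for get_score() that tokenizes internally, and total_fields is a made-up outer loop; neither name comes from the question. strtok_r is POSIX rather than ISO C, so the feature-test macro on the first line is an assumption about the build environment.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r on strict compilers */
#include <string.h>

/* Hypothetical stand-in for get_score(): counts the comma-separated
 * fields in `word`, tokenizing with its own private state. */
static int count_fields(char *word)
{
    char *state;
    int n = 0;
    for (char *f = strtok_r(word, ",", &state); f != NULL;
         f = strtok_r(NULL, ",", &state))
        n++;
    return n;
}

/* Sums count_fields() over every space-separated word in `text`.
 * The outer loop keeps its position because its state is separate. */
int total_fields(char *text)
{
    char *state;
    int total = 0;
    for (char *w = strtok_r(text, " ", &state); w != NULL;
         w = strtok_r(NULL, " ", &state))
        total += count_fields(w);
    return total;
}
```

With plain strtok in both functions, the inner calls would overwrite the single hidden state and the outer loop would stop after the first word.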

Parse url path of GET request

I'm new to C and I've been working on this task for about 7 hours now - please don't say I didn't try.
I want to parse the path of a self-written webserver in C. Let's say I call
http://localhost:8080/hello/this/is/a/test.html
then the browser gets
GET /hello/this/is/a/test.html HTTP/1.1
I want to parse /hello/this/is/a/test.html, so the complete string between "GET " (note the white space after GET) and the first white space after /../../..html.
What I tried so far:
int main() {
...
char * getPathOfGetRequest(char *);
char *pathname = getPathOfGetRequest(buf);
printf("%s\n\n%s", buf, pathname);
...
}
char * getPathOfGetRequest(char *buf) {
char *startingGet = "GET ";
char buf_cpy[BUFLEN];
memcpy(buf_cpy, buf, sizeof(buf));
char *urlpath = malloc(1000);
char *path = malloc(1000);
urlpath = strstr(buf_cpy, startingGet);
char delimiter[] = " ";
path = strtok(urlpath, delimiter);
path = strtok(NULL, delimiter);
return path;
}
The pathname always has only 4 correct chars and may or may not be followed by other unrelated chars, like /hell32984cn)/$"§$. I guess it has something to do with strlen(startingGet), but I can't see the relationship. Where is my mistake?
Question code with commentary:
char * getPathOfGetRequest(char *buf) {
char *startingGet = "GET ";
char buf_cpy[BUFLEN];
memcpy(buf_cpy, buf, sizeof(buf));
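The above memcpy will likely copy only 4 bytes from buf to buf_cpy.
This is because buf is a pointer to char, not an array.
sizeof(buf) is therefore the size of a pointer (typically 4 or 8 bytes; 4 matches the 4 correct characters observed), not the length of the request.
Instead of 'sizeof()', it would have been better to copy 'strlen(buf) + 1' bytes, after checking that this fits into BUFLEN.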
char *urlpath = malloc(1000);
char *path = malloc(1000);
urlpath = strstr(buf_cpy, startingGet);
Perhaps the questioner is not clear on why urlpath was allocated 1000 bytes of memory. In any case, the above assignment will cause that 1000 bytes to be leaked, and defeats the purpose of the 'urlpath=malloc(1000)'.
The actual effect of the above statements is urlpath = buf_cpy;, as strstr() will return the position of the beginning of 'GET ' in buf_cpy (assuming the request starts with 'GET ').
char delimiter[] = " ";
path = strtok(urlpath, delimiter);
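Likewise, the above assignment will cause the 1000 bytes allocated to path to be leaked, and defeats the purpose of the 'path=malloc(1000)' above. In addition, strtok() returns pointers into buf_cpy, which is a local array, so the pointer returned below is dangling once the function returns.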
path = strtok(NULL, delimiter);
return path;
}
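The sizeof pitfall from the commentary above can be shown in isolation. This is just a sketch: size_seen_by_callee and copy_request are hypothetical names, and copy_request is one possible bounded copy, not the only way to do it.

```c
#include <stddef.h>
#include <string.h>

/* Inside a function, a char* parameter has the size of a pointer,
 * no matter how large the buffer it points to is. */
size_t size_seen_by_callee(char *buf)
{
    return sizeof(buf);   /* sizeof(char *), typically 4 or 8 */
}

/* Hypothetical bounded copy: copies at most dst_size - 1 characters
 * from src and always terminates dst. Returns the number copied. */
size_t copy_request(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    size_t n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;  /* truncate instead of overflowing */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

In the caller, sizeof applied to the array itself (for example char request[64]) still yields 64; the information is only lost once the array is passed as a pointer.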
An alternative implementation:
char *getPathOfGetRequest(const char *buf)
{
const char *start = buf;
const char *end;
char *path=NULL;
size_t pathLen;
/* Verify that there is a 'GET ' at the beginning of the string. */
if(strncmp("GET ", start, 4))
{
fprintf(stderr, "Parse error: 'GET ' is missing.\n");
goto CLEANUP;
}
/* Set the start pointer at the first character beyond the 'GET '. */
start += 4;
/* From the start position, set the end pointer to the first white-space character found in the string. */
end=start;
while(*end && !isspace((unsigned char)*end)) /* cast avoids undefined behavior for negative char values */
++end;
/* Calculate the path length, and allocate sufficient memory for the path plus string termination. */
pathLen = (end - start);
path = malloc(pathLen + 1);
if(NULL == path)
{
fprintf(stderr, "malloc() failed. \n");
goto CLEANUP;
}
/* Copy the path string to the path storage. */
memcpy(path, start, pathLen);
/* Terminate the string. */
path[pathLen] = '\0';
CLEANUP:
/* Return the allocated storage, or NULL in the event of an error, to the caller. */
return(path);
}
And, finally, if 'strtok()' must be used:
char *getPathOfGetRequest(char *buf)
{
char *path = NULL;
if(strtok(buf, " "))
{
path = strtok(NULL, " ");
if(path)
path=strdup(path);
}
return(path);
}

In C, How to split a string on \n into lines

I want to split a string by \n and place lines which contain a specific token into an array.
I have this code:
char mydata[100] =
"mary likes apples\njim likes playing\nmark hates school\nanne likes mary";
char *token = "likes";
char ** res = NULL;
char * p = strtok (mydata, "\n");
int n_spaces = 0, i;
/* split string and append tokens to 'res' */
while (p) {
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1); /* memory allocation failed */
if (strstr(p, token))
res[n_spaces-1] = p;
p = strtok (NULL, "\n");
}
/* realloc one extra element for the last NULL */
res = realloc (res, sizeof (char*) * (n_spaces+1));
res[n_spaces] = '\0';
/* print the result */
for (i = 0; i < (n_spaces+1); ++i)
printf ("res[%d] = %s\n", i, res[i]);
/* free the memory allocated */
free (res);
But then I get a segmentation fault:
res[0] = mary likes apples
res[1] = jim likes playing
Segmentation fault
How can I split a string on \n correctly in C?
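strstr just returns a pointer to the first match of its second argument, or NULL if there is no match.
Your code grows res for every line but stores p only when the line matches, so the slot for a non-matching line is left uninitialized; printing that slot with %s is what crashes.
Grow the array only for matching lines. You can also use strcpy to store a copy of each line: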
while (p) {
// Also you want string only if it contains "likes"
if (strstr(p, token))
{
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1);
res[n_spaces-1] = malloc(strlen(p) + 1); /* +1 for the terminating '\0' */
strcpy(res[n_spaces-1],p);
}
p = strtok (NULL, "\n");
}
Free res using:
for(i = 0; i < n_spaces; i++)
free(res[i]);
free(res);
Try this:
char mydata[100] = "mary likes apples\njim likes playing\nmark hates school\nanne likes mary";
char *token = "likes";
char **result = NULL;
int count = 0;
int i;
char *pch;
// split
pch = strtok (mydata,"\n");
while (pch != NULL)
{
if (strstr(pch, token) != NULL)
{
result = realloc(result, sizeof(char*)*(count+1)); /* no cast needed in C; (char*) was the wrong type */
result[count] = (char*)malloc(strlen(pch)+1);
strcpy(result[count], pch);
count++;
}
pch = strtok (NULL, "\n");
}
// show and free result
printf("%d results:\n",count);
for (i = 0; i < count; ++i)
{
printf ("result[%d] = %s\n", i, result[i]);
free(result[i]);
}
free(result);
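If the input must not be modified (strtok writes '\0' into it), the lines can be located with strcspn instead. The following is a sketch under that assumption; filter_lines is a hypothetical name. Unlike strtok, it also visits empty lines.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns malloc'd copies of every line of `text`
 * that contains `needle`; `text` itself is not modified. Returns NULL
 * (with *count == 0) if nothing matched or an allocation failed. */
char **filter_lines(const char *text, const char *needle, size_t *count)
{
    char **out = NULL;
    size_t n = 0;
    *count = 0;

    while (*text != '\0') {
        size_t len = strcspn(text, "\n");   /* length of the current line */
        char *line = malloc(len + 1);
        if (line == NULL)
            goto fail;
        memcpy(line, text, len);
        line[len] = '\0';

        if (strstr(line, needle) != NULL) {
            char **tmp = realloc(out, (n + 1) * sizeof *tmp);
            if (tmp == NULL) {
                free(line);
                goto fail;
            }
            out = tmp;
            out[n++] = line;                /* keep the matching line */
        } else {
            free(line);                     /* discard non-matching line */
        }

        text += len;
        if (*text == '\n')
            text++;                         /* step over the newline */
    }
    *count = n;
    return out;

fail:
    while (n > 0)
        free(out[--n]);
    free(out);
    return NULL;
}
```

The caller frees each returned line and then the array, the same way the loops above do.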

Tokenize an environment variable and save the resulting token in a char**

I'm attempting to create an array of strings that represent the directories stored in the PATH variable. I'm writing this code in C, but I'm having trouble getting the memory allocation parts working.
char* shell_path = getenv ("PATH");
char* tok = strtok (shell_path, SHELL_PATH_SEPARATOR);
int number_of_tokens = 0, i = 0;
while (tok != NULL)
{
number_of_tokens++;
}
Shell_Path_Directories = malloc (/* This is where I need some help */);
shell_path = getenv ("PATH");
tok = strtok (shell_path, SHELL_PATH_SEPARATOR);
while (tok != NULL)
{
Shell_Path_Directories[i++] = tok;
tok = strtok (NULL, SHELL_PATH_SEPARATOR);
}
The issue I'm having is that I can't think of how I can know exactly how much memory to allocate.
I know I'm tokenizing the strings twice, and that it's probably stupid for me to be doing that, but I'm open to improvements if someone can figure out a better way to do this.
Just to give you basically the same answer as user411313's in a different dialect:
char* shell_path = getenv ("PATH");
/* Copy the environment string */
size_t const len = strlen(shell_path)+1;
char *copyenv = memcpy(malloc(len), shell_path, len);
/* start the tokenization */
char *p=strtok(copyenv,SHELL_PATH_SEPARATOR);
/* the path should always contain at least one element */
assert(p);
char **result = malloc(sizeof result[0]);
int i = 0;
while (1)
{
result[i] = strcpy(malloc(strlen(p)+1), p);
p=strtok(0,SHELL_PATH_SEPARATOR);
if (!p) break;
++i;
result = realloc( result, (i+1)*sizeof*result );
}
You can do:
Shell_Path_Directories = malloc (sizeof(char*) * number_of_tokens);
Also, the way you are counting number_of_tokens is incorrect: tok never changes inside the loop, so the loop never ends. You need to call strtok again inside the loop, passing NULL as the first argument:
while (tok != NULL) {
number_of_tokens++;
tok = strtok (NULL, SHELL_PATH_SEPARATOR);
}
Since you've counted the number of tokens already, you can use that as the number of pointers to char to allocate:
char **Shell_Path_Directories = malloc(number_of_tokens * sizeof(char *));
Then you have one more minor issue: you're using strtok directly on the string returned by getenv, which leads to undefined behavior (strtok modifies the string you pass to it, and you're not allowed to modify the string returned by getenv, so you get undefined behavior). You probably want to duplicate the string first, then tokenize your copy instead.
You should not modify the string that getenv returns; it is safer to make a copy. With strtok you can destroy the content of your environment table.
char* shell_path = getenv ("PATH");
char *p,*copyenv = strcpy( malloc(strlen(shell_path)+1), shell_path );
char **result = 0;
int i = 0;
for( p=strtok(copyenv,SHELL_PATH_SEPARATOR); p; p=strtok(0,SHELL_PATH_SEPARATOR) )
{
result = realloc( result, ++i*sizeof*result );
strcpy( result[i-1]=malloc(strlen(p)+1), p );
}
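On the "how much memory" question, one option is to count separators up front: the number of tokens can never exceed the number of separator characters plus one, so a single allocation is enough. A sketch under that assumption, with split_copy and storage as hypothetical names; note that strtok skips empty fields, so the real count may be lower than the bound.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates `value` (for example the result of
 * getenv("PATH")), splits the copy on any character in `seps`, and
 * returns a NULL-terminated array of pointers into the copy.
 * *storage receives the copy; the caller frees the array and *storage.
 * Returns NULL on allocation failure. */
char **split_copy(const char *value, const char *seps, char **storage)
{
    size_t len = strlen(value);
    size_t max_tokens = 1;                  /* upper bound: separators + 1 */
    for (const char *p = value; *p != '\0'; p++)
        if (strchr(seps, *p) != NULL)
            max_tokens++;

    char *copy = malloc(len + 1);
    char **dirs = malloc((max_tokens + 1) * sizeof *dirs);
    if (copy == NULL || dirs == NULL) {
        free(copy);
        free(dirs);
        return NULL;
    }
    memcpy(copy, value, len + 1);           /* tokenize the copy, not the original */

    size_t n = 0;
    for (char *tok = strtok(copy, seps); tok != NULL; tok = strtok(NULL, seps))
        dirs[n++] = tok;
    dirs[n] = NULL;

    *storage = copy;
    return dirs;
}
```

Because the tokens point into one shared copy, cleanup is two free calls instead of one per directory.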

How to split a string to 2 strings in C

I was wondering how you could take 1 string, split it into 2 with a delimiter, such as space, and assign the 2 parts to 2 separate strings. I've tried using strtok() but to no avail.
#include <string.h>
char *token;
char line[] = "SEVERAL WORDS";
char *search = " ";
// Token will point to "SEVERAL".
token = strtok(line, search);
// Token will point to "WORDS".
token = strtok(NULL, search);
Update
Note that on some operating systems, the strtok man page mentions:
This interface is obsoleted by strsep(3).
An example with strsep is shown below:
char* token;
char* string;
char* tofree;
string = strdup("abc,def,ghi");
if (string != NULL) {
tofree = string;
while ((token = strsep(&string, ",")) != NULL)
{
printf("%s\n", token);
}
free(tofree);
}
For purposes such as this, I tend to use strtok_r() instead of strtok().
For example ...
int main (void) {
char str[128];
char *ptr;
strcpy (str, "123456 789asdf");
strtok_r (str, " ", &ptr);
printf ("'%s' '%s'\n", str, ptr);
return 0;
}
This will output ...
'123456' '789asdf'
If more delimiters are needed, then loop.
Hope this helps.
char *line = strdup("user name"); // don't do char *line = "user name"; see Note
char *first_part = strtok(line, " "); //first_part points to "user"
char *sec_part = strtok(NULL, " "); //sec_part points to "name"
Note: strtok modifies the string, so don't hand it a pointer to string literal.
You can use strtok() for that
Example: it works for me
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
If you have a char array allocated, you can simply put a '\0' wherever you want.
Then point a new char * pointer to the location just after the newly inserted '\0'.
This will destroy your original string, though, depending on where you put the '\0'.
If you're open to changing the original string, you can simply replace the delimiter with \0. The original pointer will point to the first string and the pointer to the character after the delimiter will point to the second string. The good thing is you can use both pointers at the same time without allocating any new string buffers.
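A minimal sketch of that in-place approach, using strchr to find the delimiter. split_once is a hypothetical name, and the buffer must be writable (a char array, not a string literal).

```c
#include <string.h>

/* Hypothetical helper: cuts `s` in place at the first `delim`.
 * Returns a pointer to the second part, or NULL if `delim` does not
 * occur, in which case `s` is left unchanged. */
char *split_once(char *s, char delim)
{
    if (delim == '\0')
        return NULL;            /* strchr would match the terminator */
    char *cut = strchr(s, delim);
    if (cut == NULL)
        return NULL;
    *cut = '\0';                /* s now ends where the delimiter was */
    return cut + 1;             /* the remainder starts right after it */
}
```

After the call, the original pointer and the returned pointer are both valid strings that share the same buffer.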
You can do:
char str[] ="Stackoverflow Serverfault";
char piece1[20] = ""
,piece2[20] = "";
char * p;
p = strtok (str," "); // call the strtok with str as 1st arg for the 1st time.
if (p != NULL) // check if we got a token.
{
strcpy(piece1,p); // save the token.
p = strtok (NULL, " "); // subsequent call should have NULL as 1st arg.
if (p != NULL) // check if we got a token.
strcpy(piece2,p); // save the token.
}
printf("%s :: %s\n",piece1,piece2); // prints Stackoverflow :: Serverfault
If you expect more than one token, it's better to make the second and subsequent calls to strtok in a while loop that runs until strtok returns NULL.
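This is how you can implement a strtok()-like function (taken from a BSD-licensed string processing library for C called zString).
The function below differs from the standard strtok() in that it recognizes consecutive delimiters and returns the delimiter itself for each empty field, whereas the standard strtok() treats a run of delimiters as one. It also compares only against the first character of delim.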
char *zstring_strtok(char *str, const char *delim) {
static char *static_str=0; /* var to store last address */
int index=0, strlength=0; /* integers for indexes */
int found = 0; /* check if delim is found */
/* delimiter cannot be NULL
* if no more char left, return NULL as well
*/
if (delim==0 || (str == 0 && static_str == 0))
return 0;
if (str == 0)
str = static_str;
/* get length of string */
while(str[strlength])
strlength++;
/* find the first occurrence of delim */
for (index=0;index<strlength;index++)
if (str[index]==delim[0]) {
found=1;
break;
}
/* if delim is not contained in str, return str */
if (!found) {
static_str = 0;
return str;
}
/* check for consecutive delimiters
*if first char is delim, return delim
*/
if (str[0]==delim[0]) {
static_str = (str + 1);
return (char *)delim;
}
/* terminate the string
* this assignment requires a writable buffer, so str has to
* be a char[] rather than a pointer to a string literal
*/
str[index] = '\0';
/* save the rest of the string */
if ((str + index + 1)!=0)
static_str = (str + index + 1);
else
static_str = 0;
return str;
}
Below is an example code that demonstrates the usage
Example Usage
char str[] = "A,B,,,C";
printf("1 %s\n",zstring_strtok(str,",")); /* pass str, the array declared above */
printf("2 %s\n",zstring_strtok(NULL,","));
printf("3 %s\n",zstring_strtok(NULL,","));
printf("4 %s\n",zstring_strtok(NULL,","));
printf("5 %s\n",zstring_strtok(NULL,","));
printf("6 %s\n",zstring_strtok(NULL,","));
Example Output
1 A
2 B
3 ,
4 ,
5 C
6 (null)
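You can even use a while loop (the standard library's strtok() would give the same result here):
char s[]="some text here";
char *tok = zstring_strtok(s," "); /* pass the string only on the first call */
while (tok) {
printf("%s\n",tok);
tok = zstring_strtok(NULL," "); /* fetch each token exactly once */
}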
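If empty fields between consecutive delimiters should come back as empty strings rather than as the delimiter, a strcspn-based cursor is one way to do it. This is a sketch, not part of zString: next_field is a hypothetical name, and it behaves much like strsep while relying only on standard C.

```c
#include <string.h>

/* Hypothetical portable analogue of strsep(): returns the next field of
 * *cursor (possibly an empty string) and advances *cursor past it, or
 * returns NULL once no fields remain. Modifies the string in place. */
char *next_field(char **cursor, const char *delims)
{
    char *start = *cursor;
    if (start == NULL)
        return NULL;                        /* previous call hit the end */

    char *end = start + strcspn(start, delims);
    if (*end != '\0') {
        *end = '\0';                        /* terminate this field */
        *cursor = end + 1;                  /* next field starts after it */
    } else {
        *cursor = NULL;                     /* this was the last field */
    }
    return start;
}
```

For "A,B,,,C" with "," as the delimiter this yields A, B, two empty strings, and C. The state lives in the caller's cursor, so several strings can be tokenized at once.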

Resources