C - Store the results of strtok? - c

I'm getting my input as file.txt,r and I have to split the string on the comma and save file.txt and r into separate strings, but I'm really confused about how to do that. I looked up strtok and this is what I have so far:
char buffer[256];
char filename[2][40];
char operation[20];
n = read(sock,buffer,255); //read the message from the client into buffer
char cinput[300];
strcpy(cinput,buffer);//now cinput has the whole thing
char *token;
token = strtok(cinput,",");
while(token)
{
printf("%s\n",token);
token = strtok(NULL,",");
}
But I'm confused... how would I store file.txt and r as separate strings once parsed?
edit: something like this?
char *token;
char *pt;
pt = strtok(cinput,","); //this will hold the value of the first one
strcpy(filename,pt);
token = strtok(cinput,",");
while(token)
{
//printf("%s\n",token);
token = strtok(NULL,",");
}
printf("%s\n",token); //this will hold the value of the second one
strcpy(operation,token);
printf("%s\n",operation);

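All you should need is separate pointers. You don't need to allocate all these buffers or use strcpy().
Just assign the return values from strtok() to separate char * pointers. They point into cinput, so they stay valid for as long as cinput is not overwritten.
Something like:
char *p1 = strtok(cinput, ","); /* pass the writable buffer; strtok must not be given a string literal */
char *p2 = strtok(NULL, ",");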
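If the two pieces have to outlive the buffer (for example because it is reused for the next message), they can be copied out after tokenizing. A minimal sketch, assuming the message always has the form name,mode; the helper name parse_request and the buffer sizes are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "file.txt,r" into two caller-supplied buffers.
 * Returns 0 on success, -1 if either part is missing. msg is modified. */
static int parse_request(char *msg, char *filename, size_t fsize,
                         char *operation, size_t osize)
{
    char *name = strtok(msg, ",");
    if (name == NULL)
        return -1;

    /* Including \r and \n in the set drops a trailing line ending. */
    char *mode = strtok(NULL, ",\r\n");
    if (mode == NULL)
        return -1;

    /* snprintf truncates if needed and always null-terminates. */
    snprintf(filename, fsize, "%s", name);
    snprintf(operation, osize, "%s", mode);
    return 0;
}
```

Note that read() does not null-terminate what it stores, so with the code from the question you would set buffer[n] = '\0' (after checking that n > 0) before handing the buffer to a function like this.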

A more compact approach may be to group the two pointers in a struct:
/* your data pattern */
typedef struct file_inputs {
char *fname;
char *fmode;
} finput_t;
and somewhere in your code:
finput_t fi;
fi.fname = strtok(cinput,",");
fi.fmode = strtok(NULL,",");

Related

How do I strtok through a str and assign each token to an array that is a global variable

In this code, I call the input function and assign the input to a global variable. Then I parse the global string using strtok() in my parseFunction, separating it on spaces, and assign each token to a global array.
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char parsed[BUFFERSIZE][50];
void getInput() {
printf("input command ");
fgets(input, BUFFERSIZE, stdin);
}
void parseFunction() {
int i = 0;
char* tok;
char* delim = " \n";
while(tok != NULL) {
parsed[i] = tok;
i++;
tok = strktok(NULL, delim);
}
parsed[i] = NULL;
}
int main() {
getInput();
parseFunction();
}
I am getting the following errors and I cannot understand what is going wrong.
error: incompatible types when assigning to type ‘char[50]’ from type ‘char *’
parsed[i] = tok;
^
shell.c:51:15: error: incompatible types when assigning to type ‘char[50]’ from type ‘void *’
parsed[i] = NULL;
The comments are helpful in terms of getting you through compilation, but there are a few more gotchas waiting for you. I'm guessing that BUFFERSIZE is the max length of a string and that you wanted an array of 50 strings that are of size BUFFERSIZE. What you specified was BUFFERSIZE strings of size 50. I think you want -
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char parsed[50][BUFFERSIZE]; /* size of each of 50 strings is the same as input now, which I think is what you wanted */
If you agree with Kaylem's comment about storing pointers (which will work), then it actually should be.
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char *parsed[50]; /* an array of 50 pointers to char */
But you may or may not want to rethink that. The next problem is how you're iterating through the input string with strtok. tok is uninitialized when you first test its value entering the while loop. The result is undefined. Any wild thing could happen there from memory access crash to simply being NULL by accident and never entering your loop. I think what you want is.
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char *parsed[50];
void parseFunction() {
int i = 0;
char* tok;
char* delim = " \n";
for (tok = strtok(input, delim); tok != NULL && i < 49; tok = strtok(NULL, delim)) { /* you have to use input the first time you call strtok */
parsed[i++] = tok; /* short-cut */
/* the i < 49 check above leaves room for the terminating NULL in the 50-element array */
}
parsed[i] = NULL;
}
That's if you really want to store pointers to places inside the input string. But suppose you put the above in a loop, read more data from the console into input, and append the new tokens to the end of parsed as if it were one big set of tokens. Someone types in 10 tokens and you store pointers to them: pointers into input, which strtok modifies by replacing delimiters with null terminators. Then you read another 5 tokens and add those 5 pointers to the end of parsed. The first 10 pointers still refer to addresses inside input, but fgets has overwritten the contents there, so they now point at fragments of the new line instead of the original tokens. The safer way to do this is to actually copy the strings. You use the first of the two declarations above and do something like -
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char parsed[50][BUFFERSIZE];
void parseFunction() {
int i = 0;
char* tok;
char* delim = " \n";
for (tok = strtok(input, delim); tok != NULL && i < 49; tok = strtok(NULL, delim)) {
strcpy(parsed[i++], tok); /* actually copy the chars into one of the strings in "parsed" */
}
parsed[i][0] = '\0'; /* use an empty string instead of a NULL pointer */
}
Now, regardless of what happens to input, the contents of parsed are safe. No cast is needed for strcpy: parsed[i] has type char[BUFFERSIZE], which decays to char * when it is passed as an argument.
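A bounds-checked variant of the copying approach can be sketched as follows. MAX_TOKENS and the function signature are my own choices for illustration, not something taken from the question:

```c
#include <stdio.h>
#include <string.h>

#define BUFFERSIZE 80
#define MAX_TOKENS 50   /* assumed upper limit on stored tokens */

/* Copies each whitespace-separated token of input into parsed[].
 * Returns the number of tokens stored. input is modified by strtok. */
static int parse_tokens(char *input, char parsed[][BUFFERSIZE], int max_tokens)
{
    const char *delim = " \n";
    int count = 0;

    for (char *tok = strtok(input, delim);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delim)) {
        /* A token cannot be longer than input, but truncate defensively. */
        snprintf(parsed[count], BUFFERSIZE, "%s", tok);
        count++;
    }
    return count;
}
```

Returning the count avoids the need for a sentinel entry, and because the characters are copied, input can be refilled by fgets without disturbing tokens that were stored earlier.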
In your code, the compiler complains at the lines marked below:
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char parsed[BUFFERSIZE][50];
void getInput() {
printf("input command ");
fgets(input, BUFFERSIZE, stdin);
}
void parseFunction() {
int i = 0;
char* tok;
char* delim = " \n";
while(tok != NULL) {
parsed[i] = tok; // error: a char * cannot be assigned to a char[50] array
i++;
tok = strktok(NULL, delim); // typo: this should be strtok, unless you wrote your own wrapper
}
parsed[i] = NULL; // error: NULL cannot be assigned to an array
}
int main() {
getInput();
parseFunction();
}
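A possible solution could be...
#include <stdio.h>
#include <string.h>
#define BUFFERSIZE 80
char input[BUFFERSIZE];
char parsed[BUFFERSIZE][50];
void getInput() {
printf("input command ");
fgets(input, BUFFERSIZE, stdin);
}
void parseFunction() {
int i = 0; //edit
char* tok = NULL; //edit
char* delim = " \n";
tok = strtok(input, delim); //edit: the first call must be given the string to parse
while(tok != NULL && i < BUFFERSIZE - 1) {
strncpy(parsed[i], tok, 49); //edit: copy the characters, since arrays cannot be assigned
parsed[i][49] = '\0';
i++;
tok = strtok(NULL, delim);
}
parsed[i][0] = '\0'; //edit: an empty string marks the end of the list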
}
int main() {
getInput();
parseFunction();
}

Segfault resulting from strdup and strtok

I've been assigned a homework from my college professor and I seem to have found some strange behavior of strtok
Basically, we have to parse a CSV file for my class, where the number of tokens in the CSV is known and the last element may have extra "," characters.
An example of a line:
Hello,World,This,Is,A lot, of Text
Where the tokens should be output as
1. Hello
2. World
3. This
4. Is
5. A lot, of Text
For this assignment we MUST use strtok. While looking into it, I found in another Stack Overflow post that calling strtok with an empty delimiter string (or passing "\n" as the second argument) makes it read until the end of the line. This is perfect for my application since the extra commas always appear in the last element.
I've created this code which works:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#define NUM_TOKENS 5
const char *line = "Hello,World,This,Is,Text";
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
// Make an array the correct size to hold num_tokens
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS; token = strtok(NULL, i < NUM_TOKENS - 1 ? ",\n" : "\n"))
{
tokens[i++] = strdup(token);
}
free(copy);
return tokens;
}
int main()
{
char **tokens = split_line(line, NUM_TOKENS);
for (int i = 0; i < NUM_TOKENS; i++)
{
printf("%s\n", tokens[i]);
free(tokens[i]);
}
}
Now this works and should get me full credit but I hate this ternary that shouldn't be needed:
token = strtok(NULL, i < NUM_TOKENS - 1 ? ",\n" : "\n");
I'd like to replace the method with this version:
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
// Make an array the correct size to hold num_tokens
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
tokens[i] = strdup(strtok(NULL, "\n"));
free(copy);
return tokens;
}
This tickles my fancy much nicer since it is much easier to see that there is a final case. You also get rid of the strange ternary operator.
Sadly though, this segfaults! I can't for the life of me figure out why.
Edit: Add some output examples:
[11:56:06] gravypod:test git:(master*) $ ./test_no_fault
Hello
World
This
Is
Text
[11:56:10] gravypod:test git:(master*) $ ./test_seg_fault
[1] 3718 segmentation fault (core dumped) ./test_seg_fault
[11:56:14] gravypod:test git:(master*) $
Please check the return value from strtok before you risk passing NULL to another function. Your loop is calling strtok one more time than you think.
It is more usual to use this return value to control your loop, so that you are not at the mercy of your data. As for the delimiters, it is best to keep it simple and not try anything fancy.
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
char *token;
char delim1[] = ",\r\n";
char delim2[] = "\r\n";
char *delim = delim1; // start with a comma in the delimiter set
token = strtok(copy, delim);
while(token != NULL && i < num_tokens) { // strtok result controls the loop; the bound protects the array
tokens[i++] = strdup(token);
if(i == num_tokens - 1) {
delim = delim2; // only the last token remains: stop splitting on commas
}
token = strtok(NULL, delim);
}
free(copy);
return tokens;
}
Note that you should also check the return values from malloc and strdup, and free your memory properly.
When you get to the last iteration of
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
the order of events is:
1. loop body
2. loop increment step, i.e. token = strtok(NULL, ",\n") (with the wrong second arg)
3. loop continuation check i < NUM_TOKENS - 1
i.e. it has still called strtok even though you're now out-of-range. You've also got an off-by-one on your array indices here: you'd want to initialise i=0 not 1.
You could avoid this by e.g.
making the initial strtok a special case outside the loop, e.g.
int i = 0;
tokens[i++] = strdup(strtok(copy, ",\n"));
then moving the strtok(NULL, ",\n") inside the loop
I'm also surprised you want the \n there at all, or even need to call the last strtok (wouldn't that already just point to the rest of the string? If you're just trying to chop a trailing newline, there are easier ways), but I haven't used strtok in years.
(As an aside you're also not freeing the malloced array you store the string pointers in. That said since it's the end of the program at that point that doesn't matter so much.)
Remember that strtok identifies a token when it finds any of the characters in the delimiter string (the second argument to strtok()) - it doesn't try to match the entire delimiter string itself.
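Thus, for the sample line above, the ternary operator was never needed in the first place - the string will be tokenized on any occurrence of , OR \n in the input string. (It only matters when the last field itself contains commas, since ",\n" would then cut that field at its first comma.) For the sample line, the following works: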
for (token = strtok(copy, ",\n"); i < NUM_TOKENS; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
The second example segfaults because it's already tokenized the input to the end of the string by the time it exits the for loop. Calling strtok() again sets token to NULL, and the segfault is generated when strdup() is called on the NULL pointer. Removing the extra call to strtok gives the expected results:
for (token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
tokens[i] = strdup(token);
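For the assignment's actual requirement, where the last field may itself contain commas, the delimiter set has to change before the final strtok call. A sketch of that, with the allocation checks mentioned earlier; copy_string, free_tokens and the extra count parameter are my own additions for illustration, and num_tokens is assumed to be at least 1:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for strdup(); returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);
    if (p != NULL)
        memcpy(p, s, len);
    return p;
}

/* Frees an array produced by split_line(). */
static void free_tokens(char **tokens, int count)
{
    if (tokens == NULL)
        return;
    for (int i = 0; i < count; i++)
        free(tokens[i]);
    free(tokens);
}

/* Splits line into at most num_tokens fields. The first num_tokens - 1
 * fields end at a comma; the last one runs to the end of the line.
 * Returns NULL on allocation failure; *count receives the fields found. */
static char **split_line(const char *line, int num_tokens, int *count)
{
    *count = 0;
    char *copy = copy_string(line);
    char **tokens = calloc((size_t)num_tokens, sizeof *tokens);
    if (copy == NULL || tokens == NULL) {
        free(copy);
        free(tokens);
        return NULL;
    }

    int i = 0;
    const char *delim = (num_tokens > 1) ? ",\r\n" : "\r\n";
    char *token = strtok(copy, delim);
    while (token != NULL && i < num_tokens) {
        tokens[i] = copy_string(token);
        if (tokens[i] == NULL) {
            free_tokens(tokens, i);
            free(copy);
            return NULL;
        }
        i++;
        if (i == num_tokens)
            break;              /* never call strtok past the last field */
        if (i == num_tokens - 1)
            delim = "\r\n";     /* last field: commas are ordinary text */
        token = strtok(NULL, delim);
    }

    free(copy);
    *count = i;
    return tokens;
}
```

With the line Hello,World,This,Is,A lot, of Text and num_tokens set to 5, the fifth entry comes out as "A lot, of Text". Keep in mind that strtok skips runs of delimiters, so empty fields (two commas in a row) are silently dropped.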

Taking apart strings in C

I have a string that includes two names and a comma. How can I take them apart and write them to separate strings?
Example
char *line="John Smith,Jane Smith";
I am thinking of using sscanf function.
sscanf(line,"%s,%s",str1,str2);
What should I do?
Note: I can change the comma to a space character.
I am thinking of using sscanf function.
Don't even think about it.
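char line[] = "John Smith,Jane Smith";
char *comma = strchr(line, ',');
if (comma != NULL) { /* strchr returns NULL when there is no comma */
*comma = '\0'; /* cut the string in two at the comma */
char *firstName = line;
char *secondName = comma + 1;
/* use firstName and secondName here */
}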
Here's how you could do it using strtok:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
// Your string.
char *line = "John Smith,10,Jane Smith";
// Let's work with a copy of your string.
char *line_copy = malloc(1 + strlen(line));
strcpy(line_copy, line);
// Get the first person.
char *pointer = strtok(line_copy, ",");
char *first = malloc(1 + strlen(pointer));
strcpy(first, pointer);
// Skip the number.
strtok(NULL, ",");
// Get the second person.
pointer = strtok(NULL, ",");
char *second = malloc(1 + strlen(pointer));
strcpy(second, pointer);
// Print.
printf("%s\n%s\n", first, second);
// Release everything that was allocated.
free(first);
free(second);
free(line_copy);
return 0;
}
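For completeness, sscanf can handle this too, provided scansets are used instead of %s (which stops at the first space, so "%s,%s" would only read "John"). A small sketch; the helper name split_pair and the 40-byte buffers are assumptions for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: splits "first,second" without modifying line.
 * Each output buffer must hold at least 40 bytes.
 * Returns 0 on success, -1 if the line does not have two parts. */
static int split_pair(const char *line, char first[40], char second[40])
{
    /* %39[^,]  reads up to 39 characters that are not a comma (spaces are
     *          allowed); the literal ',' then consumes the separator;
     * %39[^\n] reads the rest of the line. */
    int matched = sscanf(line, "%39[^,],%39[^\n]", first, second);
    return (matched == 2) ? 0 : -1;
}
```

The width limits keep sscanf from writing past the end of the buffers, and checking the return value catches input that has no comma in it.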

Comma Delimiting C

I need my program to take a series of file names (stored in a single "String" and separated by commas) and act on them.
The pseudocode would be:
for each filename in some_string
open filename
operate on contents of filename
close filename
The issue is that I'm stuck separating some_string ("filename1,filename2,...,filenamen") into [filename 1], [filename 2], ... [filename n].
Edit: to clarify, it seems simpler to keep some_string intact and extract each file name as needed, which is what I'm attempting to do.
My code, as it stands, is pretty clunky (and quite disgusting...)
int j = 0;
char *tempS = strdup(filenames);
while (strchr(tempS, ',')) {
char *ptr = strchr(tempS, ',');
*ptr++ = '.';
numFiles++;
}
for (; j < numFiles; j++) {
char *ptr = strchr(tempS, ',');
//don't know where to go from here...
fin = openFile(tempS);
if (fin != NULL) {
//do something
}
fclose(fin);
}
It's not done, obviously. I correctly find the number of files, but I'm a little lost when it comes to figuring out how to separate one at a time from the source string and operate on it.
You can use strtok for this
char *fname = strtok(tempS, ",");
while (fname != NULL) {
/* process filename */
fname = strtok(NULL, ",");
}
strtok delivers the strings separated by comma, one by one.
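Usually, for splitting a string in C, the strtok() function from the C standard library is used.
#include <string.h>
...
char *token;
char line[] = "string1,string2,string3"; /* an array, because strtok modifies the string; passing a literal is undefined behavior */
char *search = ",";
token = strtok(line, search); /* first token: "string1" */
token = strtok(NULL, search); /* next token: "string2" */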
strtok() is not thread-safe. If that matters to you, you should use strtok_r() (a POSIX function). For example:
char *savedptr = NULL; /* to be passed back to strtok_r in follow-on calls */
char *tempS = strdup( some_string ); /* to keep your original intact */
char *fname = strtok_r(tempS, ",", &savedptr); /* note the &: strtok_r takes a char ** */
while (fname != NULL) {
/* process fname ... */
fname = strtok_r(NULL, ",", &savedptr); /* pass savedptr back to strtok_r */
}
free(tempS); /* release the working copy when done */
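Put together as a self-contained function, the strtok_r approach might look like the sketch below. It assumes a POSIX system; collect_names and NAME_LEN are hypothetical names chosen for the example:

```c
#define _POSIX_C_SOURCE 200809L   /* strtok_r is POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 64   /* assumed maximum name length, terminator included */

/* Hypothetical helper: copies each comma-separated name from list into
 * names[], leaving list itself untouched. Returns the number of names
 * stored, or -1 if the working copy cannot be allocated. */
static int collect_names(const char *list, char names[][NAME_LEN], int max_names)
{
    size_t len = strlen(list) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, list, len);

    int count = 0;
    char *saveptr = NULL;   /* strtok_r keeps its position here */
    for (char *name = strtok_r(copy, ",", &saveptr);
         name != NULL && count < max_names;
         name = strtok_r(NULL, ",", &saveptr)) {
        snprintf(names[count], NAME_LEN, "%s", name);
        count++;
    }

    free(copy);
    return count;
}
```

Because the tokenizing happens on a private copy and the position is kept in a local variable, the original string stays intact and the function can be called from several threads at once.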

CSV into array in C

I'm trying to load a CSV file into a single-dimensional array. I can output the contents of the CSV file, but I'm having a bit of trouble copying it into an array.
Here is my existing code, which I realise is probably pretty bad but I'm teaching myself:
#include <stdio.h>
#include <string.h>
#define MAX_LINE_LENGTH 1024
#define MAX_CSV_ELEMENTS 1000000
int main(int argc, char *argv[])
{
char line[MAX_LINE_LENGTH] = {0};
int varCount = 0;
char CSVArray[MAX_CSV_ELEMENTS] = {0};
FILE *csvFile = fopen("data.csv", "r");
if (csvFile)
{
char *token = 0;
while (fgets(line, MAX_LINE_LENGTH, csvFile))
{
token = strtok(&line[0], ",");
while (token)
{
varCount++;
CSVArray[varCount] = *token; //This is where it all goes wrong
token = strtok(NULL, ",");
}
}
fclose(csvFile);
}
return 0;
}
Is there a better way I should be doing this? Thanks in advance!
*token means dereferencing the pointer token which is the address of the first character in a string that strtok found. That's why your code fills CSVArray with just the first characters of each token.
You should rather have an array of char pointers to point at the tokens, like:
char *CSVArray[MAX_CSV_ELEMENTS] = {NULL};
And then assign a pointer to its elements:
CSVArray[varCount] = token;
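Alternatively, you can copy the whole token each time (CSVArray still has to be an array of char pointers). This is the safer choice here, because line is overwritten by every fgets call, so pointers stored into it go stale as soon as the next line is read:
CSVArray[varCount] = malloc(strlen(token)+1);
strcpy(CSVArray[varCount], token);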
You are right about the problem line. It is because you are assigning a single character (the one token points at), not copying the text.
Try here http://boredzo.org/pointers/ for a tutorial on pointers.
It looks like you're trying to put the char * that comes back from strtok into the char array.
I think you want to declare CSVArray as:
char * CSVArray[MAX_CSV_ELEMENTS] = {0};
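Combining the suggestions above, a loader that copies every cell could be sketched like this. The function name load_cells and its signature are assumptions for the example; it takes an already opened FILE * so the caller decides which file to read:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LENGTH 1024

/* Hypothetical helper: reads every comma-separated cell from fp and stores
 * a heap-allocated copy of each in cells[]. Returns the number stored.
 * The caller is responsible for freeing each cells[i]. */
static size_t load_cells(FILE *fp, char **cells, size_t max_cells)
{
    char line[MAX_LINE_LENGTH];
    size_t count = 0;

    while (count < max_cells && fgets(line, sizeof line, fp) != NULL) {
        /* \r and \n are delimiters too, so line endings are not stored. */
        for (char *token = strtok(line, ",\r\n");
             token != NULL && count < max_cells;
             token = strtok(NULL, ",\r\n")) {
            size_t len = strlen(token) + 1;
            char *copy = malloc(len);
            if (copy == NULL)
                return count;   /* out of memory: keep what was read */
            memcpy(copy, token, len);
            cells[count++] = copy;
        }
    }
    return count;
}
```

Each cell gets its own allocation, so the line buffer can be reused safely. An array of a million pointers is several megabytes, so for the sizes in the question it is better declared static or allocated with malloc than placed on the stack.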
