get the last token of a string in C - c

what I want to do is given an input string, which I will not know it's size or the number of tokens, be able to print it's last token.
e.x.:
char* s = "some/very/big/string";
char* token;
const char delimiter[2] = "/";
token = strtok(s, delimiter);
while (token != NULL) {
printf("%s\n", token);
token = strtok(NULL, delimiter);
}
return token;
and i want my return to be
string
but I what I get is (null). Any workarounds? I've searched the web and can't seem to find an answer to this. At least for C programming language.

If you tokenize on a specific character, i.e. '/' in your example, you do not need to tokenize the string at all: call strrchr to find the position of the last '/', and add 1 to the resultant pointer to skip the delimiter, like this:
char *s = "some/very/big/string";
char *last = strrchr(s, '/');
if (last != NULL) {
printf("Last token: '%s'\n", last+1);
}
Demo.

Just use another variable to store last token before it gets null
char s[] = "some/very/big/string";
char * token, * last;
last = token = strtok(s, "/");
for (;(token = strtok(NULL, "/")) != NULL; last = token);
printf("%s\n", last);

Related

Trying to split text into lines, and then tabs. Get only first line

I have a char array (buf) that exists out of multiple lines and each line is split up by multiple tabs. I want to separate this. I use the following code for this:
char copy[4096];
char* split_request = strtok(buf, "\r\n");
strcpy(copy, split_request);
while(split_request != NULL) {
if (strchr(copy, '\t') != NULL) {
printf("We have a tab");
//If I uncomment this line I get an assertion error
//char* temp = strtok(copy, '\t');
}
printf(split_request);
split_request = strtok(NULL, "\r\n");
if (split_request != NULL) {
strcpy(copy, split_request);
}
printf("\n");
}
If I uncomment that one line of code, only the first line is processed. In addition, it is printed 5 times, and each time one tabbed column disappears. It feels like despite the strcpy, the original string is still affected...
I was experimenting with an alternative approach to your problem. I have used strtok_r for separating lines and each line is processed using strtok for tabs. The code is given below.
void lineParser(char *singleLine){
const char tab[] = "\t";
char *token = NULL;
if(strchr(singleLine, '\t') != NULL){
token = strtok(singleLine, tab);
while(token != NULL){
printf("%s\n", token);
token = strtok(NULL, tab);
}
}
}
int main()
{
char buf[] = "Stack\tOverFlow\r\nStack\tExchange\r\n";
char *rest = buf;;
char* token = NULL;
const char tab[] = "\t";
const char newline[] = "\r\n";
while ((token = strtok_r(rest, " ", &rest))) {
lineParser(token);
token = strtok_r(rest, newline, &rest);
}
}

Segfault resulting from strdup and strtok

I've been assigned a homework from my college professor and I seem to have found some strange behavior of strtok
Basically, we have to parse a CSV file for my class, where the number of tokens in the CSV is known and the last element may have extra "," characters.
An example of a line:
Hello,World,This,Is,A lot, of Text
Where the tokens should be output as
1. Hello
2. World
3. This
4. Is
5. A lot, of Text
For this assignment we MUST use strtok. Because of this I found on some other SOF post that using strtok with an empty string (or passing "\n" as the second argument) results in reading until the end of the line. This is perfect for my application since the extra commas always appear in the last element.
I've created this code which works:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#define NUM_TOKENS 5
const char *line = "Hello,World,This,Is,Text";
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
// Make an array the correct size to hold num_tokens
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS; token = strtok(NULL, i < NUM_TOKENS - 1 ? ",\n" : "\n"))
{
tokens[i++] = strdup(token);
}
free(copy);
return tokens;
}
int main()
{
char **tokens = split_line(line, NUM_TOKENS);
for (int i = 0; i < NUM_TOKENS; i++)
{
printf("%s\n", tokens[i]);
free(tokens[i]);
}
}
Now this works and should get me full credit but I hate this ternary that shouldn't be needed:
token = strtok(NULL, i < NUM_TOKENS - 1 ? ",\n" : "\n");
I'd like to replace the method with this version:
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
// Make an array the correct size to hold num_tokens
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
tokens[i] = strdup(strtok(NULL, "\n"));
free(copy);
return tokens;
}
This tickles my fancy much nicer since it is much easier to see that there is a final case. You also get rid of the strange ternary operator.
Sadly though, this segfaults! I can't for the life of me figure out why.
Edit: Add some output examples:
[11:56:06] gravypod:test git:(master*) $ ./test_no_fault
Hello
World
This
Is
Text
[11:56:10] gravypod:test git:(master*) $ ./test_seg_fault
[1] 3718 segmentation fault (core dumped) ./test_seg_fault
[11:56:14] gravypod:test git:(master*) $
Please check the return value from strtok before you risk passing NULL to another function. Your loop is calling strtok one more time than you think.
It is more usual to use this return value to control your loop, then you are not at the mercy of your data. As for the delimitors, best to keep it simple and not try anything fancy.
char **split_line(const char *line, int num_tokens)
{
char *copy = strdup(line);
char **tokens = (char**) malloc(sizeof(char*) * num_tokens);
int i = 0;
char *token;
char delim1[] = ",\r\n";
char delim2[] = "\r\n";
char *delim = delim1; // start with a comma in the delimiter set
token = strtok(copy, delim);
while(token != NULL) { // strtok result comtrols the loop
tokens[i++] = strdup(token);
if(i == NUM_TOKENS) {
delim = delim2; // change the delimiters
}
token = strtok(NULL, delim);
}
free(copy);
return tokens;
}
Note you should also check the return values from malloc and strdup and free your memory properly
When you get to the last loop, you'll get
for (char *token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
loop body
loop increment step, i.e. token = strtok(NULL, ",\n") (with the wrong second arg)
loop continuation check i < NUM_TOKENS - 1
i.e. it has still called strtok even though you're now out-of-range. You've also got an off-by-one on your array indices here: you'd want to initialise i=0 not 1.
You could avoid this by e.g.
making the initial strtok a special case outside the loop, e.g.
int i = 0;
tokens[i++] = strdup(strtok(copy, ",\n"));
then moving the strtok(NULL, ",\n") inside the loop
I'm also surprised you want the \n there at all, or even need to call the last strtok (wouldn't that already just point to the rest of the string? If you just trying to chop a trailing newline there are easier ways) but I haven't used strtok in years.
(As an aside you're also not freeing the malloced array you store the string pointers in. That said since it's the end of the program at that point that doesn't matter so much.)
Remember that strtok identifies a token when it finds any of the characters in the delimiter string (the second argument to strtok()) - it doesn't try to match the entire delimiter string itself.
Thus, the ternary operator was never needed in the first place - the string will be tokenized based on the occurrence of , OR \n in the input string, so the following works:
for (token = strtok(copy, ",\n"); i < NUM_TOKENS; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
The second example segfaults because it's already tokenized the input to the end of the string by the time it exits the for loop. Calling strtok() again sets token to NULL, and the segfault is generated when strdup() is called on the NULL pointer. Removing the extra call to strtok gives the expected results:
for (token = strtok(copy, ",\n"); i < NUM_TOKENS - 1; token = strtok(NULL, ",\n"))
{
tokens[i++] = strdup(token);
}
tokens[i] = strdup(token);

strcpy doesn't work in C with pointers

My code looks like that:
char *token;
token = strtok(myStrings, delimiter);
char newArray[MAXCHARS];
while (token != NULL) {
printf("%s\n", token);
strcpy(newArray, &token);
token = strtok(NULL, delimiter);
}
I want to add the token to my char array but it doesn't work.
It prints the right token (e.g. Martin) but I can't add it to my char array.

C - segfault when trying to use strdup

I've used strdup() in the past in the same way that I am using it here. I am passing token2 into strdup which is of type char * with a valid pointer in it, yet when I try to run the line "name = strdup(token2);" my program segfaults and I am quite unsure as to why. If anyone would be able to help me it would be greatly appreciated. I also realize that my code does not return a proper type yet, I am still working on writing all of it.
struct YelpDataBST* create_business_bst(const char* businesses_path, const char* reviews_path){
if(fopen(businesses_path,"r") == NULL || fopen(reviews_path,"r") == NULL)
return NULL;
FILE* fp_bp = fopen(businesses_path, "r");
FILE* fp_rp = fopen(reviews_path, "r");
struct YelpDataBST* yelp = malloc(sizeof(struct YelpDataBST*));
int ID = -1;
int tempID;
long int addressOffset;
long int reviewOffset;
char line[2000];
char line2[2000];
char temp[2000];
char temp2[2000];
char* token;
char* token2;
char* name;
int len;
BusList* busNode = NULL;
BusList* busList = NULL;
BusTree* busTreeNode = NULL;
BusTree* busTree = NULL;
ID = -1;
tempID = 0;
fgets(line,2000,fp_rp);
fgets(line2,2000,fp_bp);
fseek(fp_rp,0, SEEK_SET);
fseek(fp_bp,0,SEEK_SET);
int ct = 0;
while(!feof(fp_rp)){
len = strlen(line);
token = strtok(line, "\t");
//printf("line: %s\n", line);
token2 = strtok(line2, "\t");
tempID = atoi((char*)strdup(token));
if(ct == 0){
tempID = 1;
ct++;
}
if((ID != tempID || (ID < 0)) && tempID != 0){
if(tempID == 1)
tempID = 0;
token2 = strtok(NULL, "\t");
//name = strdup(token2);
reviewOffset = ftell(fp_rp);
if(tempID != 0)
reviewOffset -= len;
addressOffset = ftell(fp_bp);
ID = atoi((char*)strdup(token));
busList = BusNode_insert(busList, addressOffset, reviewOffset); //replace with create node for tree
token2 = strtok(NULL, "\t");
token2 = strtok(NULL, "\t");
token2 = strtok(NULL, "\t");
token2 = strtok(NULL, "\t");
token2 = strtok(NULL, "\t");
token2 = strtok(NULL, "\n");
fgets(line2,2000,fp_bp);
}
token = strtok(NULL, "\t");
token = strtok(NULL, "\t");
token = strtok(NULL, "\t");
token = strtok(NULL, "\t");
token = strtok(NULL, "\n");
fgets(line,2000,fp_rp);
}
//BusList_print(busList);
}
strdup(token) segfaulting is most likely explained by token being NULL. (You don't need to strdup here anyway). Change that piece of code to:
if ( token == NULL )
{
fprintf(stderr, "Invalid data in file.\n");
exit(EXIT_FAILURE); // or some other error handling
}
tempID = atoi(token);
However a greater problem with the surrounding code is that you are trying to use strtok twice at once. It maintains internal state and you can only have one strtok "in progress" at any one time. The second one cancels the first one. You'll have to redesign that section of code.
Also, while(!feof(fp_rp)) is wrong, and your yelp mallocs the wrong number of bytes (although in the code posted you never actually try to store anything in that storage, so it would not cause an error just yet).

Adding Tokens to an Array C

So I'm trying to add tokens to an array the if statement keeps verifying that the array, tokenHolder, is empty. My second while loop is where I try to input tokens into the array. However no tokens are inputted into the array and I don't understand why.
char* token;
int* bufflength = 0;
char* buffer = NULL;
char input[25000];
char *tokenHolder[2500];
int pos = 0;
while(1){
printf("repl> ");
getline(&buffer, &bufflength, stdin);
token = strtok(buffer, "");
//code to input tokens into array
while(token != NULL){
pos++;
token = strtok(NULL, "");
tokenHolder[pos] = token;
}
if(tokenHolder[0] == NULL){
printf("It's NULL");
}
}
You increment pos to 1 before you save any token, so nothing is ever assigned to tokenHolder[0].
Either use (note the use of blank rather than an empty string as the delimiter):
tokenHolder[0] = token = strtok(buffer, " ");
(or an equivalent) or do something like:
char *data = buffer;
while ((tokenHolder[pos++] = strtok(data, " ")) != NULL)
data = NULL;
char *tokenHolder[2500] = { NULL };
...
while(token != NULL){
tokenHolder[pos++] = token;
token = strtok(NULL, "");
}
if(tokenHolder[0] == NULL){//or if(pos == 0){
printf("It's NULL");
}

Resources