Tokenize a c string - c

I'm trying to get the string that comes after /xxx/. There must be two forward slashes, followed by the string that I need to extract.
Here is my code, but I don't know where to set the null terminator; my length arithmetic is off somewhere.
char str[100] = "/709/usr/datapoint/nviTemp1";
char *tmp;
char token[100];
tmp = strchr(str+1, '/');
size_t len = (size_t)(tmp-str)+1;
strncpy(token, str+len, strlen(str+len));
strcat(token,"\0");
I want to extract whatever comes after /709/, which is usr/datapoint/nviTemp1.
Note that the /709/ part is variable and could be any length, but there will always be two forward slashes.

Simple improvement:
char str[100] = "/709/usr/datapoint/nviTemp1";
char *tmp;
char token[100];
tmp = strchr(str+1, '/');
if (tmp != NULL) {
strncpy(token, tmp + 1, sizeof(token) - 1);
token[sizeof(token) - 1] = '\0'; // strncpy does not terminate when it truncates
} else {
token[0] = '\0';
}
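tmp points to the slash after "/709", so what you want starts right after it.
You don't have to calculate the length manually.
Moreover, strncpy copies at most as many characters as its third argument says, so that argument should be based on the size of the destination buffer, not on the length of the source. Note that strncpy does not write a terminating '\0' if the source fills the whole count.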
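The same idea can be wrapped in a small helper. This is only a sketch, assuming the input always begins with a slash; the name copy_after_second_slash and its return convention are hypothetical and not part of the answer above.

```c
#include <string.h>

/* Hypothetical helper: copies everything after the second '/' of src
 * into dst (capacity dst_size). Returns 0 on success, -1 if src does not
 * start with '/', has no second slash, or the result does not fit. */
int copy_after_second_slash(const char *src, char *dst, size_t dst_size)
{
    const char *second;

    if (dst_size == 0)
        return -1;
    dst[0] = '\0';                     /* leave dst empty on failure */
    if (src[0] != '/')
        return -1;
    second = strchr(src + 1, '/');     /* skip the leading slash */
    if (second == NULL)
        return -1;
    if (strlen(second + 1) >= dst_size)
        return -1;                     /* no room for text plus terminator */
    strcpy(dst, second + 1);           /* safe: length checked above */
    return 0;
}
```

Checking the length first and refusing to copy avoids silent truncation; whether truncating would be preferable depends on the caller.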

If you know for sure that the string starts with "/NNN/", then it is simple:
char str[100] = "/709/usr/datapoint/nviTemp1";
char token[100];
strcpy(token, str+5); // str+5 is the first char after the second slash.
If you need to get everything after the second slash:
char str[100] = "/709/usr/datapoint/nviTemp1";
char token[100];
char* slash;
slash = strchr(str, '/'); // Expect that slash == str
slash = strchr(slash+1, '/'); // slash is second slash.
strcpy(token, slash+1); // slash+1 is the string after the second slash.

You can use a counter and a basic while loop:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
if (argc < 2) return 0;
char *s = argv[1];
char slash = '/';
int slash_count = 0;
while (slash_count < 2 && *s)
if (*s++ == slash) slash_count++;
printf("String: %s\n", s);
return 0;
}
Then you can do whatever you want with the s pointer, like duplicating it with strdup or using strcat, strcpy, etc...
Outputs:
$ ./draft /709/usr/datapoint/nviTemp1
String: usr/datapoint/nviTemp1
$ ./draft /70905/usr/datapoint/nviTemp1
String: usr/datapoint/nviTemp1

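You can make use of sscanf:
sscanf(str,"%*[/0-9]%s",token);
/* '/709/' is read and discarded and the remaining string is stored in token */
token will contain the string "usr/datapoint/nviTemp1". Note that this format only skips a prefix made of slashes and digits (it also consumes any digits that immediately follow the second slash), that %s stops at the first whitespace, and that without a field width nothing limits how much is written to token.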
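A slightly more defensive variant of the sscanf approach might look like the sketch below. The helper name parse_after_prefix is hypothetical, and the 100-byte buffer size is an assumption carried over from the question.

```c
#include <stdio.h>

/* Hypothetical helper: parses "/<prefix>/<rest>" and stores <rest> in token.
 * token must have room for at least 100 bytes (99 characters + terminator).
 * Returns 1 on success, 0 otherwise. %s stops at the first whitespace. */
int parse_after_prefix(const char *str, char *token)
{
    /* "/" matches the first slash, %*[^/] skips the prefix without storing
     * it, the second "/" matches the second slash, %99s reads the rest. */
    return sscanf(str, "/%*[^/]/%99s", token) == 1;
}
```

Using %*[^/] instead of %*[/0-9] lets the prefix contain any characters except a slash, and checking the return value tells the caller whether token was actually filled.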

Related

Split string by a substring

I have following string:
char str[] = "A/USING=B)";
I want to split to get separate A and B values with /USING= as a delimiter
How can I do it? I know strtok(), but it only splits on single-character delimiters.
As others have pointed out, you can use strstr from <string.h> to find the delimiter in your string. Then either copy the substrings or modify the input string to split it.
Here's an implementation that returns the second part of a split string. If the string can't be split, it returns NULL and the original string is unchanged. If you need to split the string into more substrings, you can call the function on the tail repeatedly. The first part will be the input string, possibly shortened.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *split(char *str, const char *delim)
{
char *p = strstr(str, delim);
if (p == NULL) return NULL; // delimiter not found
*p = '\0'; // terminate string after head
return p + strlen(delim); // return tail substring
}
int main(void)
{
char str[] = "A/USING=B";
char *tail;
tail = split(str, "/USING=");
if (tail) {
printf("head: '%s'\n", str);
printf("tail: '%s'\n", tail);
}
return 0;
}
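The idea of calling the function on the tail repeatedly can be sketched as follows. split() is repeated so the block stands alone; split_all, its parameters and the caller-supplied pointer array are hypothetical additions, and the delimiter is assumed to be non-empty.

```c
#include <string.h>

/* Same split() as above: cuts str at the first delim, returns the tail. */
static char *split(char *str, const char *delim)
{
    char *p = strstr(str, delim);
    if (p == NULL) return NULL;        /* delimiter not found */
    *p = '\0';                         /* terminate the head */
    return p + strlen(delim);          /* tail starts after the delimiter */
}

/* Hypothetical helper: splits str in place on every occurrence of delim,
 * stores up to max_parts pointers in parts, returns how many were stored. */
size_t split_all(char *str, const char *delim, char **parts, size_t max_parts)
{
    size_t n = 0;
    char *head = str;

    while (head != NULL && n < max_parts) {
        char *tail = split(head, delim);  /* terminates head; NULL at the end */
        parts[n++] = head;
        head = tail;
    }
    return n;
}
```

The pointers stored in parts all point into the original buffer, so nothing needs to be freed, but the buffer must stay alive and writable for as long as the parts are used.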
I know strtok(), but it only splits on single-character delimiters.
Not quite.
As per the man page for strtok() (emphasis mine):
char *strtok(char *str, const char *delim);
[...] The delim argument specifies a set of bytes that delimit the tokens in the parsed string. [...] A sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. [...]
So the delimiter argument need not be a single character. You can pass a string such as "/USING=", and it gets the job done for this particular input. Keep in mind, though, that strtok treats that string as a set of delimiter bytes rather than as one substring, so a head or tail containing any of '/', 'U', 'S', 'I', 'N', 'G' or '=' will be split as well.
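To see how strtok interprets a multi-character delimiter argument, here is a small sketch that collects the tokens into an array. The helper name strtok_collect and its parameters are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: tokenizes buf in place with strtok and stores up to
 * max_tokens pointers in out. Returns the number of tokens stored. */
size_t strtok_collect(char *buf, const char *delims, char **out, size_t max_tokens)
{
    size_t n = 0;
    char *tok = strtok(buf, delims);   /* delims is a set of bytes */

    while (tok != NULL && n < max_tokens) {
        out[n++] = tok;
        tok = strtok(NULL, delims);    /* continue in the same buffer */
    }
    return n;
}
```

With the delimiter argument "/USING=", the input "abcd/USING=efgh" yields the two tokens "abcd" and "efgh", but "abUcd/USING=efgh" yields three ("ab", "cd", "efgh") because the 'U' inside the head is itself a delimiter byte. Splitting on the whole substring needs something based on strstr instead.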
Here is a little function to do this. It works exactly like strtok_r except that the delimiter is taken as a delimiting string, not a list of delimiting characters.
#include <string.h>
char *strtokstr_r(char *s, const char *delim, char **save_ptr)
{
char *end;
if (s == NULL)
s = *save_ptr;
if (s == NULL || *s == '\0')
{
*save_ptr = s;
return NULL;
}
// Skip leading delimiters.
while (strstr(s,delim)==s) s+=strlen(delim);
if (*s == '\0')
{
*save_ptr = s;
return NULL;
}
// Find the end of the token.
end = strstr (s, delim);
if (end == NULL)
{
*save_ptr = s + strlen(s);
return s;
}
// Terminate the token and make *SAVE_PTR point past it.
memset(end, 0, strlen(delim));
*save_ptr = end + strlen(delim);
return s;
}
The strtok answer is only valid for this particular input; if it were "abUcd/USING=efgh", it would not work, because strtok also splits at the 'U'.
The only answer that worked for me is the strstr-based split() function shown earlier in this thread.
In your case it will be:
#include <stdio.h>
#include <string.h>
int main (int argc, char* argv [])
{
char theString [16] = "abcd/USING=efgh";
char theCopy [16];
char *token;
strcpy (theCopy, theString);
token = strtok (theCopy, "/USING=");
while (token)
{
printf ("%s\n", token);
token = strtok (NULL, "/USING=");
}
return 0;
}
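This passes "/USING=" to strtok as the delimiter argument. strtok treats it as a set of delimiter characters, not as one substring, so this only works when neither part contains any of those characters.
The output of this was:
abcd
efgh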

how to extract a string using two forward slashes

I have a string that always has two forward slashes, such as /709/nviTemp1.
I would like to extract the /709/ from that string and return it as a char*. How would I use strstr for that purpose?
The path might also contain more forward slashes, like /709/nvitemp1/d/s/,
so I only need to get the first token, /709/.
You may try something like this:
char str[100] = "/709/nviTemp1";
char resu[100];
char *tmp;
tmp = strchr(str+1, '/');
strncpy(resu, str, (size_t)(tmp - str) + 1);
resu[(size_t)(tmp - str) + 1] = '\0';
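strchr searches for the first '/', but starting at str+1 makes it skip the leading slash, so it finds the second one. Then compute the size between the start and the '/' that was found, use strncpy to copy that many characters, and add a trailing '\0'. In real code, check that tmp is not NULL before using it.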
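The steps just described, with the missing checks added, could be sketched like this. The helper name copy_first_segment and its return convention are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copies the leading "/xxx/" of src (both slashes
 * included) into dst. Returns 0 on success, -1 if src does not start with
 * '/', has no second slash, or the segment does not fit in dst. */
int copy_first_segment(const char *src, char *dst, size_t dst_size)
{
    const char *second;
    size_t len;

    if (src[0] != '/')
        return -1;
    second = strchr(src + 1, '/');     /* start after the leading slash */
    if (second == NULL)
        return -1;
    len = (size_t)(second - src) + 1;  /* include the second slash */
    if (len >= dst_size)
        return -1;                     /* no room for the terminator */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

Because the length is known exactly, memcpy plus an explicit terminator does the same job as strncpy here without relying on its padding behavior.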
Try using strtok for this. strtok splits up a string into different tokens, based on a separator. Like this:
char str[100] = "/709/nviTemp1";
char delimiter[2] = "/";
char *result;
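char finalresult[100] = ""; // must be a real, empty buffer; strcat on an uninitialized pointer is undefined
result = strtok(str, delimiter); // returns the first token between '/' characters, e.g. "709"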
strcat(finalresult,"/");
strcat(finalresult, result);
strcat(finalresult,"/");
printf("%s",finalresult);
Please take care of the fact that strtok modifies your original string that you pass to it.
To perform the task you've asked about, the following code will suffice. If you need a more general solution, the answer will obviously differ.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *str = "/709/nviTemp1";
char *delims = "/";
char *strCopy;
char *tmpResult;
strCopy = strdup(str);
tmpResult = strtok(strCopy, delims);
// +1 for the first slash, +1 for the second slash, +1 for the terminating NUL
char *finalResult = (char*)calloc(strlen(tmpResult) + 3, 1);
strcat(finalResult, "/");
strcat(finalResult, tmpResult);
strcat(finalResult, "/");
free(strCopy);
printf("%s",finalResult);
free(finalResult);
return 0;
}
Output:
/709/
Use strchr to find first slash. Advance pointer and find the second slash. Advance the pointer and set to '\0'.
#include<stdlib.h>
#include<stdio.h>
#include<string.h>
int main (int argc , char *argv[]) {
char *tok;
char text[] = "/709/nvitemp1/d/s/";
if ( ( tok = strchr ( text, '/')) != NULL) {//find first /
tok++;
if ( ( tok = strchr ( tok, '/')) != NULL) {//find second /
tok++;
*tok = '\0';
printf ( "%s\n", text);
}
}
return 0;
}

Substrings in the middle of a String in C

I need to extract substrings that are between Strings I know.
I have something like char string[] = "abcdefg";
I know what I need is between "c" and "f", then my return should be "de".
I know the strncpy() function but do not know how to apply it in the middle of a string.
Thank you.
Here's a full, working example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
char *first = strstr(string, from);
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
char *last = strstr(first, to);
if (last == NULL) {
last = &string[strlen(string)];
}
char *sub = calloc(strlen(string) + 1, sizeof(char));
strncpy(sub, first, last - first);
printf("%s\n", sub);
free(sub);
return 0;
}
Now, the explanation:
1.
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
Declarations of strings: main string to be checked, beginning delimiter, ending delimiter. Note these are arrays as well, so from and to could be, for example, cd and fg, respectively.
2.
char *first = strstr(string, from);
Find an occurrence of the beginning delimiter in the main string. Note that it finds the first occurrence. If you need the last one (for example, if you had the string abcabc and wanted a substring starting from the second a), the search would have to be done differently.
3.
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
Handle the situation in which the first delimiter doesn't appear in the string. In that case, we make the substring start at the beginning of the entire string. If it does appear, however, we advance the pointer by the length of the from string, since the substring has to begin after the first delimiter (correction thanks to @dau_sama).
Depending on your specifications, this may or may not be needed, or another result might be expected.
4.
char *last = strstr(first, to);
Find an occurrence of the ending delimiter in the main string. Note that it finds the first occurrence after first.
As noted by @dau_sama, it's better to search for the ending delimiter starting from first, not from the beginning of the entire string. This avoids situations in which to appears earlier than from.
5.
if (last == NULL) {
last = &string[strlen(string)];
}
Handle the situation in which the second delimiter doesn't appear in the string. In that case, we make the substring run to the end of the string, so we take a pointer to the terminating null character, one past the last character.
Again, depending on your specifications, this may or may not be needed, or another result might be expected.
6.
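char *sub = calloc(strlen(string) + 1, sizeof(char));
strncpy(sub, first, last - first);
Allocate sufficient memory and extract the substring based on the pointers found earlier. We copy last - first (the length of the substring) characters beginning at first. Because calloc zero-fills the buffer, the copied substring is already null-terminated. Allocating last - first + 1 bytes would be enough as well.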
7.
printf("%s\n", sub);
Here's the result.
I hope this presents the solution in enough detail. Depending on your exact specifications, you may need to alter it. For example, if you needed to find all substrings, and not just the first one, you could put the search for first and last in a loop.
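The loop idea mentioned above could be sketched as follows. The helper next_between is hypothetical. Unlike the example above, it treats a missing delimiter as "no match" rather than falling back to the start or end of the string, and it copies into a caller-supplied buffer instead of allocating.

```c
#include <string.h>

/* Hypothetical helper: finds the next substring enclosed by from and to,
 * starting the search at *cursor. On success copies it into out, moves
 * *cursor past the closing delimiter and returns 1. Returns 0 when no
 * further pair exists or the substring does not fit in out. */
int next_between(const char **cursor, const char *from, const char *to,
                 char *out, size_t out_size)
{
    const char *first, *last;
    size_t len;

    first = strstr(*cursor, from);
    if (first == NULL)
        return 0;
    first += strlen(from);             /* start after the opening delimiter */
    last = strstr(first, to);          /* search only after first */
    if (last == NULL)
        return 0;
    len = (size_t)(last - first);
    if (len >= out_size)
        return 0;
    memcpy(out, first, len);
    out[len] = '\0';
    *cursor = last + strlen(to);       /* resume after the closing delimiter */
    return 1;
}
```

A caller would initialize a const char *p to the start of the string and then loop with while (next_between(&p, "[", "]", buf, sizeof buf)) to visit every enclosed substring in turn.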
Thanks, everyone. It worked using the form below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between_substring(char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
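int main (void){
char source[] = "abcdefg";
char *target;
target = between_substring(source, 'c', 'f'); // call the function under the name it is defined with
printf("%s\n", source);
if (target != NULL) {
printf("%s\n", target);
free(target); // the result was allocated with malloc
}
return 0;
}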
Since people seemed to not understand my approach in the comments, here's a quick hacked together stub.
const char* string = "abcdefg";
const char* b = "c";
const char* e = "f";
//look for the first pattern
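const char* begin = strstr(string, b);
if(!begin)
return NULL;
begin += strlen(b); //skip past the begin pattern
//look for the end pattern
const char* end = strstr(begin, e);
if(!end)
return NULL;
//end already points at the start of the end pattern, so no adjustment is needed
char result[MAXLENGTH]; //MAXLENGTH must be larger than end-begin
strncpy(result, begin, end-begin);
result[end-begin] = '\0'; //strncpy does not terminate a partial copy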
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between(const char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
int main(void){
const char* string = "abcdefg";
char *substr = between(string, 'c', 'f');
if(substr!=NULL){
puts(substr);
free(substr);
}
return 0;
}

How to copy front part of string up to a delimiter

I need to grab the first part of a string up to and including the last backslash in a path. I am fairly new to C. So I was wondering if the following code is a good approach? Or is there a better way?
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[]) {
char szPath[260] = {0};
strcpy(szPath, argv[0]);
char* p = szPath;
size_t len = strlen(argv[0]);
p+=len; //go to end of string
int backpos = 0;
while(*--p != '\\')
++backpos;
szPath[len-backpos] = 0;
printf("%s\n", szPath);
return 0;
}
After receiving comments changed to this:
char szPath[260];
strcpy(szPath, argv[0]);
/*Scan a string for the last occurrence of a character.*/
char *p = strrchr(szPath, '\\');
if (p) {
*(p + 1) = 0; /* retain backslash and null terminate after that */
} else {
/* handle error */
}
printf("%s\n", szPath);
I would go with strrchr. This assumes str points to writable memory:
char *p;
if ((p = strrchr(str, '\\')) != NULL)
*(p + 1) = 0; /* Since we passed it to strrchr, it's 0-terminated. */
Obviously, basename and dirname might be there if you are working with paths and might be more appropriate.
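If the original path must stay untouched, the same strrchr idea can copy into a separate buffer. This is a sketch only; the helper name copy_dir_part is hypothetical, and sep is assumed to be a non-zero character.

```c
#include <string.h>

/* Hypothetical helper: copies path up to and including the last sep
 * character into dst, leaving path untouched. Returns 0 on success,
 * -1 if sep does not occur or the result does not fit in dst. */
int copy_dir_part(const char *path, char sep, char *dst, size_t dst_size)
{
    const char *last = strrchr(path, sep);
    size_t len;

    if (last == NULL)
        return -1;
    len = (size_t)(last - path) + 1;   /* keep the separator */
    if (len >= dst_size)
        return -1;                     /* no room for the terminator */
    memcpy(dst, path, len);
    dst[len] = '\0';
    return 0;
}
```

Taking the separator as a parameter lets the same function handle '\\' for Windows paths and '/' elsewhere, and checking the length first avoids the unbounded strcpy into a fixed 260-byte buffer.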

Parsing a string with tokens for the first and last words (in C)

I'm going to try to explain the problem.
I am getting a string containing a registry key. For example:
HKEY_CURRENT_USER\Software\MyProgram\SomeOtherValue\SomeKey
now, I need to parse that string into 3 different char (or char *) variables. After the parsing it'll be something like:
string1 = HKEY_CURRENT_USER
string2 = \Software\MyProgram\SomeOtherValue\ /* with the '\' */
string3 = SomeKey
Not only do I need to group the backslashes; I also don't know how many of them are there. I could have something like:
HKEY_CURRENT_USER\Software\SomeKey
or something like:
HKEY_CURRENT_USER\Software\SomeValue\SomeOthervalue\Someblah\SomeKey
I tried strtok() and strcspn(), but I'm getting very confused here...
Any idea how to get this done?
Code is appreciated.
Thanks!
Pseudo-Code:
Step 1: Scan forward until the first "\", note the index.
Step 2: Scan Backward from the end to the last "\"
(the first "\" encountered when going backwards), note the index.
Step 3: StrCpy the relevant pieces out into 3 strings.
Code: (does not rely on strrchr, or other methods you seem to have issues with)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void ParseRegEntry(char* regKey, char** TopLevel, char** Path, char** Key);
int main(void)
{
char* regKey = "HKEY_CURRENT_USER\\Software\\MyProgram\\SomeOtherValue\\SomeKey";
char* TopLevel;
char* Path;
char* Key;
ParseRegEntry(regKey, &TopLevel, &Path, &Key);
printf("1: %s\n2: %s\n3: %s\n", TopLevel, Path, Key);
free(TopLevel);
free(Path);
free(Key);
return 0;
}
void ParseRegEntry(char* regKey, char** TopLevel, char** Path, char** Key)
{
int firstDelimiter = 0;
int lastDelimiter = strlen(regKey)-1;
int keyLen;
while(regKey[firstDelimiter] != '\\')
{
firstDelimiter++;
}
while(regKey[lastDelimiter] != '\\')
{
lastDelimiter--;
}
keyLen = strlen(regKey) - lastDelimiter-1;
*TopLevel = (char*)malloc(firstDelimiter+1);
strncpy(*TopLevel, regKey, firstDelimiter);
(*TopLevel)[firstDelimiter] = '\0';
*Path = (char*)malloc(lastDelimiter - firstDelimiter+2);
strncpy(*Path, regKey+firstDelimiter, lastDelimiter - firstDelimiter+1); /* +1 keeps the trailing backslash */
(*Path)[lastDelimiter-firstDelimiter+1] = '\0';
*Key = (char*)malloc(keyLen+1);
strncpy(*Key, regKey+lastDelimiter+1, keyLen);
(*Key)[keyLen] = '\0';
}
strchr(char*, char) : locate first occurrence of char in string
strrchr(char*, char) : locate last occurrence of char in string
char* str = "HKEY_CURRENT_USER\\Software\\MyProgram\\SomeOtherValue\\SomeKey";
char token1[SIZE], token2[SIZE], token3[SIZE];
char* first = strchr(str, '\\');
char* last = strrchr(str, '\\')+1;
strncpy(token1, str, first-str);
token1[first-str] = '\0';
strncpy(token2, first, last-first);
token2[last-first] = '\0';
strcpy(token3, last);
We use strchr to find the first '\', and strrchr to find the last '\'. We then copy to token1, token2, token3 based on those positions.
I decided to just use fixed size buffers instead of calloc-ing, because that's not so important to illustrate the point. And I kept messing it up. :)
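For reference, the same strchr/strrchr approach with bounds checks added might look like the sketch below. The helper name split_reg_key and the single shared capacity cap are hypothetical choices made for brevity.

```c
#include <string.h>

/* Hypothetical helper: splits key into root, path (with both enclosing
 * backslashes) and name. Each output buffer has capacity cap.
 * Returns 0 on success, -1 if key has no backslash or a part does not fit. */
int split_reg_key(const char *key, char *root, char *path, char *name, size_t cap)
{
    const char *first = strchr(key, '\\');
    const char *last;
    size_t root_len, path_len;

    if (first == NULL)
        return -1;
    last = strrchr(key, '\\') + 1;     /* first char of the final component */
    root_len = (size_t)(first - key);
    path_len = (size_t)(last - first); /* from first to last backslash, inclusive */
    if (root_len >= cap || path_len >= cap || strlen(last) >= cap)
        return -1;
    memcpy(root, key, root_len);
    root[root_len] = '\0';
    memcpy(path, first, path_len);
    path[path_len] = '\0';
    strcpy(name, last);                /* safe: length checked above */
    return 0;
}
```

When the key contains only one backslash, first and the last backslash coincide, so path comes out as a single "\" and the other two parts are still filled in.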
Copy the string into an allocated one and split the variable placing a '\0' in the slash where you want to truncate it.
You can "scan" the string for slashes using the strchr function.
void to_split(char *original, int first_slash, int second_slash, char **first, char **second, char **third) {
int i;
char *first_null;
char *second_null;
char *allocated;
if (first_slash >= second_slash)
return;
allocated = malloc(strlen(original) + 1);
*first = allocated;
strcpy(allocated, original);
for (i = 0, first_null = allocated; i < first_slash && (first_null = strchr(first_null,'\\')); i++);
if (first_null) {
*first_null = '\0';
*second = first_null + 1;
}
second_null = allocated + strlen(original);
i = 0;
while (i < second_slash && second_null > allocated)
i += *second_null-- == '\\';
if (++second_null > allocated) {
*second_null = '\0';
*third = second_null + 1;
}
}
Usage:
int main (int argc, char **argv) {
char *toSplit = "HKEY_CURRENT_USER\\Software\\MyProgram\\SomeOtherValue\\SomeKey";
char *first;
char *second;
char *third;
to_split(toSplit, 1, 3, &first, &second, &third);
printf("%s %s %s\n", first, second, third);
return 0;
}
It isn't the best code in the world, but it gives you an idea.
Here's an example using strchr and strrchr to scan forwards and backwards in the string for the '\'.
char str[] = "HKEY_CURRENT_USER\\Software\\MyProgram\\SomeOtherValue\\SomeKey";
char *p, *start;
char root[128], path[128], key[128];
p = strchr (str, '\\');
strncpy (root, str, p - str);
root[p - str] = '\0'; /* strncpy does not terminate a partial copy */
start = p;
p = strrchr (str, '\\') + 1;
strncpy (path, start, p - start);
path[p - start] = '\0';
strcpy (key, p);

Resources