Tokenizing multiple strings in C with the Keil compiler

I am writing a microcontroller program using the Keil compiler. The program creates several CSV-like strings (logging lines), for example "A;001;ERROR;C05;...\n".
To save space, I now want to reduce the data by logging only the differences.
To do that, I save the last logged line and compare it to the new one. If the value in a column is unchanged, I want to omit it. For example:
"A;001;ERROR;C05;...\n" <- previous Log
"A;002;ERROR;C06;...\n" <- new Log
would result in ";002;;C06;...\n"
At first I just included <string.h> and used 'strtok' to step through my CSV line. Since I need to compare two strings (lines), I would need to use it on two different strings simultaneously, which does not work. So I switched to 'strtok_r', which does not seem to work at all:
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
This just gives me strange behaviour. Typically the second call to 'strtok_r' returns NULL and the loop exits.
So is there another way of achieving the desired behaviour?
EDIT:
To clarify what I mean, this is what I am currently trying to get to work:
My Input (m_cMeasureLogLine) is "M;0001;001;01;40;1000.00;0.00;1000.00;0.00;360.00;0.00;400.00;24.90;400.00;-9999.00;-9999.00;-9999.00;0;LED;;;;;400.00;34.40;25.41;27.88;29.01;0.00;0.00;0.00;-100.00;0.00;-1000.00;-1000.00;-103.032;-70.192;19;8192.00;0.00;0;"
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* token2;
char** pointer1;
char** pointer2;
void vLogProtocolMeasureData()
{
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
}
The function is part of a bigger embedded project, so I don't have console output and use the debugger to check the contents of my variables. In the above example, after the first call to 'strtok_r' token1 is 'M', which is correct. After the second call, however (in the loop), token1 becomes 0x00000000 (NULL).
If I use 'strtok' instead:
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok(m_cActLogLine, ";");
while (token1 != NULL) {
token1 = strtok(NULL, ";");
}
the loop iterates just fine. But that way I can't process two strings at a time and compare values column-wise.
In string.h the functions are declared as:
extern _ARMABI char *strtok(char * __restrict /*s1*/, const char * __restrict /*s2*/) __attribute__((__nonnull__(2)));
extern _ARMABI char *_strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#ifndef __STRICT_ANSI__
extern _ARMABI char *strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#endif

You need to pass a pointer to a valid char* as the last parameter of strtok_r(). You're passing pointer1, which is a pointer to a pointer, but it is NULL (it's a globally scoped variable that is never assigned a value, so it is zero-initialized). When strtok_r() tries to store its iteration pointer at the address you passed in, it ends up writing to address 0x00000000.
Try...
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* pointer1;
void vLogProtocolMeasureData()
{
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok_r(m_cActLogLine, ";", &pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", &pointer1);
}
}
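For the column-wise comparison the question is ultimately after, note that strtok_r (like strtok) skips empty fields such as the ";;;;;" run in the sample line, so two tokenizers running side by side could drift out of alignment. A minimal sketch that avoids this by scanning both lines with strcspn instead; the name diff_csv_line is hypothetical, and it assumes the lines are passed without the trailing "\n" and that ';' is the only separator:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes to out the fields of cur that differ from the
   same column of prev, leaving unchanged columns empty.
   Returns 0 on success, -1 if out is too small. Neither input is modified. */
static int diff_csv_line(const char *prev, const char *cur,
                         char *out, size_t out_size)
{
    size_t used = 0;

    for (;;) {
        size_t plen = strcspn(prev, ";");   /* length of current prev field */
        size_t clen = strcspn(cur, ";");    /* length of current cur field  */
        int same = (plen == clen) && (memcmp(prev, cur, clen) == 0);
        size_t need = same ? 0 : clen;

        /* reserve room for the field, one ';' and the terminating '\0' */
        if (used + need + 2 > out_size)
            return -1;

        if (!same) {
            memcpy(out + used, cur, clen);
            used += clen;
        }
        cur += clen;
        prev += plen;

        if (*cur == '\0')
            break;                          /* last column of the new line */
        out[used++] = ';';
        cur++;                              /* step over ';' in cur */
        if (*prev == ';')
            prev++;                         /* step over ';' in prev, if any */
    }
    out[used] = '\0';
    return 0;
}
```

Because nothing is written into either input line, the previous line can be kept as-is for the next comparison; empty columns stay aligned since every ';' advances exactly one column.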

Related

How to save the remaining string from strtok_r()?

I'm trying to figure out how to get at the remaining string that still needs to be parsed (the third parameter of strtok_r()), but am lost as to how to do so.
The initial input comes from a char pointer allocated with malloc().
The code below is what I am trying to achieve.
num = strtok_r(raw_in, delim, &rest_of_nums);
while(rest_of_nums != NULL){
while(num != NULL){
//Compare num with first digit of rest_of_nums
num = strtok_r(NULL, delim, &rest_of_nums);
}
//Iterate to compare num to second digit of rest_of_nums
}
I think you are trying to mix up strtok() and strtok_r(). The syntax of strtok() is as follows:
char * strtok ( char * str, const char * delimiters );
and the syntax of strtok_r() is as follows:
char * strtok_r ( char * str, const char * delimiters, char **saveptr );
When we call strtok() for the first time, the function expects a C string as argument for str, whose first character is used as the starting location to scan for tokens. In subsequent calls, the function expects a null pointer and uses the position right after the end of the last token as the new starting location for scanning. The point where the last token was found is kept internally by the function to be used on the next call.
However, in strtok_r(), the third argument saveptr is a pointer to a char * variable that is used internally by strtok_r() in order to maintain context between successive calls that parse the same string.
A sample example for strtok_r() is as follows:
char str[] = "sample strtok_r example gcc stack overflow";
char * token;
char * raw_in = str;
char * saveptr;
//delimiter is blank space in this example
token = strtok_r(raw_in, " ", &saveptr);
while (token != NULL) {
printf("%s\n", token);
printf("%s\n", saveptr);
token = strtok_r(NULL, " ", &saveptr);
}
The output should be as follows:
sample
strtok_r example gcc stack overflow
strtok_r
example gcc stack overflow
example
gcc stack overflow
gcc
stack overflow
stack
overflow
overflow
Source:
http://www.cplusplus.com/reference/cstring/strtok/
https://www.geeksforgeeks.org/strtok-strtok_r-functions-c-examples/
Questions are welcome.
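To compare each token against everything that follows it, as the question describes, one option is a second, independent strtok_r pass over a copy of the remainder. The sketch below is an illustration only: count_equal_pairs is a hypothetical name, and it assumes that saveptr points at the unparsed remainder after each call, which matches the output shown above but is an implementation detail rather than something POSIX promises.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts pairs of equal tokens (earlier token vs. any
   later token). input is modified by strtok_r.
   Returns -1 if the scratch buffer cannot be allocated. */
static int count_equal_pairs(char *input, const char *delim)
{
    int pairs = 0;
    char *outer_save = NULL;
    char *scratch = malloc(strlen(input) + 1);

    if (scratch == NULL)
        return -1;

    for (char *num = strtok_r(input, delim, &outer_save);
         num != NULL;
         num = strtok_r(NULL, delim, &outer_save)) {
        char *inner_save = NULL;

        /* Assumption: outer_save points at the remainder. Copy it so the
           inner pass can write '\0' without disturbing the outer pass. */
        strcpy(scratch, outer_save);

        for (char *other = strtok_r(scratch, delim, &inner_save);
             other != NULL;
             other = strtok_r(NULL, delim, &inner_save)) {
            if (strcmp(num, other) == 0)
                pairs++;
        }
    }
    free(scratch);
    return pairs;
}
```

The copy matters: tokenizing the remainder in place would overwrite its delimiters with '\0', and the outer loop would then stop after the next token.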

Weird output from strtok

I was having some issues dealing with char*'s from an array of char*'s and used this for reference: Splitting C char array into words
So what I'm trying to do is read in char arrays and split them with a space delimiter so I can do stuff with it. For example if the first token in my char* is "Dog" I would send it to a different function that dealt with dogs. My problem is that I'm getting a strange output.
For example:
INPUT: *cmd = "Dog needs a vet appointment."
OUTPUT: (from print statements) "Doneeds a vet appntment."
I've checked for memory leaks using valgrind and I have none of them or other errors.
void parseCmd(char* cmd){ //passing in an individual char* from a char**
char** p_args = calloc(100, sizeof(char*));
int i = 0;
char* token;
token = strtok(cmd, " ");
while (token != NULL){
p_args[i++] = token;
printf("%s",token); //trying to debug
token = strtok(NULL, cmd);
}
free(p_args);
}
Any advice? I am new to C so please bear with me if I did something stupid. Thank you.
In your case,
token = strtok(NULL, cmd);
is not what you should be doing. You instead need:
token = strtok(NULL, " ");
As per the ISO standard:
char *strtok(char * restrict s1, const char * restrict s2);
A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a character from the string pointed to by s2.
The only difference between the first and subsequent calls (assuming, as per this case, you want the same delimiters) should be using NULL as the input string rather than the actual string. By using the input string as the delimiter list in subsequent calls, you change the behaviour.
You can see exactly what's happening if you try the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void parseCmd(char* cmd) {
char* token = strtok(cmd, " ");
while (token != NULL) {
printf("[%s] [%s]\n", cmd, token);
token = strtok(NULL, cmd);
}
}
int main(void) {
char x[] = "Dog needs a vet appointment.";
parseCmd(x);
return 0;
}
which outputs (first column will be search string to use next iteration, second is result of this iteration):
[Dog] [Dog]
[Dog] [needs a vet app]
[Dog] [intment.]
The first step worked fine since you were using space as the delimiter and it modified the string by placing a \0 at the end of Dog.
That means the next attempt (with the wrong separator) would use one of the letters from {D,o,g} to split. The first matching letter from that set is the o in appointment, which is why you see needs a vet app. The third attempt finds none of the candidate letters, so you just get back the rest of the string, intment..
token = strtok(NULL, cmd); should be token = strtok(NULL, " ");.
The second argument is for delimiter.
http://man7.org/linux/man-pages/man3/strtok.3.html
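Putting the fix together, a minimal sketch of the corrected loop that also stops before the argument array overflows. The name split_args and the caller-supplied array are assumptions for illustration, not part of the original code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits cmd on spaces, storing at most max_args token
   pointers in args. cmd is modified in place. Returns the number stored. */
static size_t split_args(char *cmd, char **args, size_t max_args)
{
    size_t n = 0;
    char *token = strtok(cmd, " ");

    while (token != NULL && n < max_args) {
        args[n++] = token;
        token = strtok(NULL, " ");   /* same delimiter list on every call */
    }
    return n;
}
```

The stored pointers point into cmd itself, so they remain valid only as long as that buffer does.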

How to store each sentence as an element of an array?

So, suppose I have an array (program asks me to write some text):
char sentences[] = "The first sentence.The second sentence.The third sentence";
And I need to store each sentence as an array, where I can have access to any word, or to store the sentences in a single array as elements.
(sentences[0] = "The first sentence"; sentences[1] = "The second sentence";)
I know how to print out each sentence separately:
char* sentence_1 = strtok(sentences, ".");
char* sentence_2 = strtok(NULL, ".");
char* sentence_3 = strtok(NULL, ".");
printf("#1 %s\n", sentence_1);
printf("#2 %s\n", sentence_2);
printf("#3 %s\n", sentence_3);
But I have no idea how to make the program store those sentences in 1 or 3 arrays.
Please, help!
If you keep it in main, the sentences array lives for as long as main runs (it is never freed), so you can simply do this:
#include <string.h>
#include <stdio.h>
int main()
{
char sentences[] = "The first sentence.The second sentence.The third sentence";
char* sentence[3];
unsigned int i;
sentence[0] = strtok(sentences, ".");
for (i=1;i<sizeof(sentence)/sizeof(sentence[0]);i++)
{
sentence[i] = strtok(NULL, ".");
}
for (i=0;i<sizeof(sentence)/sizeof(sentence[0]);i++)
{
printf("%u: %s\n",i,sentence[i]);
}
return 0;
}
In the general case, you first have to duplicate your input string:
char *sentences_dup = strdup(sentences);
sentence[0] = strtok(sentences_dup, ".");
There are several reasons for that:
you don't know the lifespan or scope of the input; it is generally a pointer or a parameter, so your sentences could become invalid as soon as the input memory is freed or goes out of scope
the passed buffer may be const: you cannot modify its memory (strtok modifies the passed buffer)
if you change sentences[] to *sentences in the example above, you are pointing at read-only memory: you have to make a copy of the buffer.
Don't forget to store the duplicated pointer, because you may need to free it at some point.
Another alternative is to also duplicate there:
for (i=1;i<sizeof(sentence)/sizeof(sentence[0]);i++)
{
char* tok = strtok(NULL, ".");
sentence[i] = tok ? strdup(tok) : NULL; // strdup(NULL) is undefined
}
so you can free your big tokenized string at once, and the sentences have their own, independent memory.
EDIT: the remaining problem here is that you still have to know in advance how many sentences there are in your input.
For that, you could count the dots, and then allocate the proper number of pointers.
int j,nb_dots=0;
char pathsep = '.';
int nb_sentences;
int len = strlen(sentences);
char** sentence;
// first count how many dots we have
for (j=0;j<len;j++)
{
if (sentences[j]==pathsep)
{
nb_dots++;
}
}
nb_sentences = nb_dots+1; // one more!!
// allocate the array of strings
sentence=malloc((nb_sentences) * sizeof(*sentence));
Now that we have the number of strings, we can perform our strtok loop. Just be careful to use nb_sentences and not sizeof(sentence)/sizeof(sentence[0]), which is now meaningless (it evaluates to 1) because sentence is now a pointer rather than an array.
But at this point you could also get rid of strtok completely, as proposed in another answer of mine.
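The pieces above (duplicate the input, count the dots, allocate the pointer array, run the strtok loop) could be combined roughly as follows. This is a sketch under assumptions: the struct and function names are hypothetical, and the dot count is used only as an upper bound, because strtok skips empty sentences such as the one between "..".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the result. */
struct sentence_list {
    char *buffer;   /* private copy of the text; items point into it */
    char **items;   /* one pointer per sentence */
    size_t count;   /* number of sentences actually found */
};

/* Splits text on '.' without modifying it. Returns 0 on success, -1 on
   allocation failure. Release the result with free_sentences(). */
static int split_sentences(const char *text, struct sentence_list *out)
{
    size_t max_items = 1;                    /* dots + 1 is an upper bound */

    for (const char *p = text; *p != '\0'; p++)
        if (*p == '.')
            max_items++;

    out->count = 0;
    out->buffer = malloc(strlen(text) + 1);
    out->items = malloc(max_items * sizeof *out->items);
    if (out->buffer == NULL || out->items == NULL) {
        free(out->buffer);
        free(out->items);
        out->buffer = NULL;
        out->items = NULL;
        return -1;
    }
    strcpy(out->buffer, text);               /* strtok modifies its input */

    for (char *tok = strtok(out->buffer, "."); tok != NULL;
         tok = strtok(NULL, "."))
        out->items[out->count++] = tok;
    return 0;
}

static void free_sentences(struct sentence_list *list)
{
    free(list->items);
    free(list->buffer);
}
```

Keeping the duplicated buffer next to the pointer array makes it hard to forget the free, since both are released together.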

Parse non-const char* to integer in C

I'm confused by the C library functions strtol etc. I am trying to use them on a char* buffer that I passed to a call to strsep (which changed the location of that pointer). However, the compiler complains that I am passing a char* to strtol, which expects a const char*.
How can I parse the string into an integer if it is not a const char*? I cannot use a constant in this case because I need, at times, to change the values in the array (and also strsep will change where the beginning of the array points to). Thanks.
EDIT: Here is my attempt, using atoi (I know this is now deprecated, but it takes the same type argument as strtol and I was going to get this to work before switching to the other function.)
char *token, *freeme;
freeme = input;
while((token = (char*)(uint64_t)strsep(&input, " ")) != NULL) {
printf("%s\n", token);
current->next = malloc(sizeof(struct fraction_node));
current = current->next;
current->num = atoi(strsep(&token, "/"));
current->denom = atoi(&token);
}
free(freeme);
(The context is that it's parsing a list of fractions.)
while((token = (char*)(uint64_t)strsep(&input, " ")) != NULL) {
is completely broken.
#define _BSD_SOURCE
#include <string.h>
while((token = strsep(&input, " ")) != NULL) {
is a trivial attempt to fix it, but does not work when input is a char const * pointer.
The
current->denom = atoi(&token);
does not make sense either; you have to write
current->denom = atoi(token);
Copy the string first, before converting.
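As for the original worry: a char * converts implicitly to const char *, so strtol can be called on a modifiable buffer without any cast. A minimal sketch of parsing one "num/denom" token with strtol and its end pointer; parse_fraction is a hypothetical name, and overflow checking via errno is left out for brevity:

```c
#include <stdlib.h>

/* Hypothetical helper: parses text of the form "<num>/<denom>" in base 10.
   Returns 0 on success, -1 if the text does not have that shape. */
static int parse_fraction(const char *text, long *num, long *denom)
{
    char *end;

    *num = strtol(text, &end, 10);
    if (end == text || *end != '/')
        return -1;                      /* no digits, or no '/' after them */

    const char *denom_start = end + 1;
    *denom = strtol(denom_start, &end, 10);
    if (end == denom_start || *end != '\0')
        return -1;                      /* no digits, or trailing junk */
    return 0;
}
```

Unlike atoi, this reports malformed input instead of silently returning 0, and it never writes into the string, so it works on tokens returned by strsep as well as on string literals.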

C: Parse empty tokens from a string with strtok

My application produces strings like the one below. I need to parse values between the separator into individual values.
2342|2sd45|dswer|2342||5523|||3654|Pswt
I am using strtok to do this in a loop. For the fifth token, I am getting 5523. However, I need to account for the empty value between the two separators || as well. 5523 should be the sixth token, as per my requirement.
token = (char *)strtok(strAccInfo, "|");
for (iLoop=1;iLoop<=106;iLoop++) {
token = (char *)strtok(NULL, "|");
}
Any suggestions?
In that case I often prefer a p2 = strchr(p1, '|') loop with a memcpy(s, p1, p2-p1) inside. It's fast, does not destroy the input buffer (so it can be used with const char *) and is really portable (even on embedded).
It's also reentrant; strtok isn't. (BTW: reentrant has nothing to do with multi-threading. strtok breaks already with nested loops. One can use strtok_r but it's not as portable.)
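The strchr/memcpy loop described above could look roughly like this. It is a sketch: next_field and its cursor parameter are hypothetical names, and the caller is assumed to supply a buffer large enough for the longest field.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the next field of *cursor into out and
   advances *cursor past the separator. Empty fields are returned as "".
   Returns 1 if a field was produced, 0 when the input is exhausted,
   -1 if out is too small. The input string is never modified. */
static int next_field(const char **cursor, char sep,
                      char *out, size_t out_size)
{
    const char *p1 = *cursor;

    if (p1 == NULL)
        return 0;

    const char *p2 = strchr(p1, sep);
    size_t len = p2 ? (size_t)(p2 - p1) : strlen(p1);

    if (len + 1 > out_size)
        return -1;
    memcpy(out, p1, len);
    out[len] = '\0';

    *cursor = p2 ? p2 + 1 : NULL;   /* NULL marks "no more fields" */
    return 1;
}
```

All state lives in the caller's cursor variable, so several strings can be walked at the same time, including in nested loops.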
That's a limitation of strtok. The designers had whitespace-separated tokens in mind. strtok doesn't do much anyway; just roll your own parser. The C FAQ has an example.
On a first call, the function expects a C string as argument for str, whose first character is used as the starting location to scan for tokens. In subsequent calls, the function expects a null pointer and uses the position right after the end of last token as the new starting location for scanning.
To determine the beginning and the end of a token, the function first scans from the starting location for the first character not contained in delimiters (which becomes the beginning of the token). And then scans starting from this beginning of the token for the first character contained in delimiters, which becomes the end of the token.
What this says is that it will skip any '|' characters at the beginning of a token, making 5523 the 5th token, which you already knew. I just thought I would explain why (I had to look it up myself). This also means that you will never get any empty tokens.
Since your data is set up this way, you have a couple of possible solutions:
1) find all occurrences of || and replace with | | (put a space in there)
2) do a strstr 5 times and find the beginning of the 5th element.
char *mystrtok(char **m,char *s,char c)
{
char *p=s?s:*m;
if( !*p )
return 0;
*m=strchr(p,c);
if( *m )
*(*m)++=0;
else
*m=p+strlen(p);
return p;
}
reentrant
thread-safe
strictly ANSI conforming
needs an unused helper pointer from the calling context
e.g.
char *p,*t,s[]="2342|2sd45|dswer|2342||5523|||3654|Pswt";
for(t=mystrtok(&p,s,'|');t;t=mystrtok(&p,0,'|'))
puts(t);
e.g.
char *p,*t,s[]="2,3,4,2|2s,d4,5|dswer|23,42||5523|||3654|Pswt";
for(t=mystrtok(&p,s,'|');t;t=mystrtok(&p,0,'|'))
{
char *p1,*t1;
for(t1=mystrtok(&p1,t,',');t1;t1=mystrtok(&p1,0,','))
puts(t1);
}
your work :)
implement char *c as parameter 3
Look into using strsep instead: strsep reference
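For reference, a minimal strsep loop that keeps empty fields. strsep is a BSD/glibc extension rather than ISO C, so availability on a given toolchain is an assumption here; split_keep_empty is a hypothetical wrapper name.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line on any character in delims, storing at
   most max_fields pointers in fields. Adjacent delimiters yield empty
   strings. line is modified in place. Returns the number of fields stored. */
static size_t split_keep_empty(char *line, const char *delims,
                               char **fields, size_t max_fields)
{
    size_t n = 0;
    char *cursor = line;
    char *field;

    /* strsep sets cursor to NULL after the last field, then returns NULL */
    while (n < max_fields && (field = strsep(&cursor, delims)) != NULL)
        fields[n++] = field;
    return n;
}
```

Like strtok, strsep overwrites delimiters with '\0', so it needs a writable buffer; unlike strtok, it keeps its position in the caller's pointer and does not merge consecutive delimiters.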
Use something other than strtok. It's simply not intended to do what you're asking for. When I've needed this, I usually used strcspn or strpbrk and handled the rest of the tokenizing myself. If you don't mind it modifying the input string like strtok does, it should be pretty simple. At least right off, something like this seems as if it should work:
// Warning: untested code. Should really use something with a less-ugly interface.
char *tokenize(char *input, char const *delim) {
static char *current; // just as ugly as strtok!
char *pos, *ret;
if (input != NULL)
current = input;
if (current == NULL)
return current;
ret = current;
pos = strpbrk(current, delim);
if (pos == NULL)
current = NULL;
else {
*pos = '\0';
current = pos+1;
}
return ret;
}
Inspired by Patrick Schlüter's answer, I made this function. It is supposed to be thread-safe, support empty tokens, and leave the original string unchanged. Each returned token is allocated with malloc, so the caller should free it.
char* strTok(char** newString, char* delimiter)
{
char* string = *newString;
char* delimiterFound = (char*) 0;
int tokLenght = 0;
char* tok = (char*) 0;
if(!string) return (char*) 0;
delimiterFound = strstr(string, delimiter);
if(delimiterFound){
tokLenght = delimiterFound-string;
}else{
tokLenght = strlen(string);
}
tok = malloc(tokLenght + 1);
memcpy(tok, string, tokLenght);
tok[tokLenght] = '\0';
*newString = delimiterFound ? delimiterFound + strlen(delimiter) : (char*)0;
return tok;
}
you can use it like
char* input = "1,2,3,,5,";
char** inputP = &input;
char* tok;
while( (tok=strTok(inputP, ",")) ){
printf("%s\n", tok);
}
This is supposed to output the following (the empty tokens after 3 and after 5 print as blank lines):
1
2
3
5
I tested it on simple strings but haven't used it in production yet. I also posted it on Code Review, so you can see what others think about it.
Below is the solution that is working for me now. Thanks to all of you who responded.
I am using LoadRunner. Hence, some unfamiliar commands, but I believe the flow can be understood easily enough.
char strAccInfo[1024], *p2;
int iLoop;
Action() { //This value would come from the wrsp call in the actual script.
lr_save_string("323|90||95|95|null|80|50|105|100|45","test_Param");
//Store the parameter into a string - saves memory.
strcpy(strAccInfo,lr_eval_string("{test_Param}"));
//Get the first instance of the separator "|" in the string
p2 = (char *) strchr(strAccInfo,'|');
//Start a loop - Set the max loop value to more than max expected.
for (iLoop = 1;iLoop<200;iLoop++) {
//Save parameter names in sequence.
lr_param_sprintf("Param_Name","Parameter_%d",iLoop);
//Get the first instance of the separator "|" in the string (within the loop).
p2 = (char *) strchr(strAccInfo,'|');
//Save the value for the parameters in sequence.
lr_save_var(strAccInfo,p2 - strAccInfo,0,lr_eval_string("{Param_Name}"));
//Save string after the first instance of p2, as strAccInfo - for looping.
strcpy(strAccInfo,p2+1);
//Start conditional loop for checking for last value in the string.
if (strchr(strAccInfo,'|')==NULL) {
lr_param_sprintf("Param_Name","Parameter_%d",iLoop+1);
lr_save_string(strAccInfo,lr_eval_string("{Param_Name}"));
iLoop = 200;
}
}
}
