Parse non-const char* to integer in C

I'm confused by the C library functions strtol etc. I am trying to use them on a char* buffer that I passed to a call to strsep (which changed the location of that pointer). However, the compiler complains that I am passing a char* to strtol, which expects a const char*.
How can I parse the string into an integer if it is not a const char*? I cannot use a constant in this case because I need, at times, to change the values in the array (and strsep will also change where the beginning of the array points to). Thanks.
EDIT: Here is my attempt, using atoi (I know atoi is discouraged in favor of strtol, but it takes the same argument type and I wanted to get this working before switching to the other function.)
char *token, *freeme;
freeme = input;
while((token = (char*)(uint64_t)strsep(&input, " ")) != NULL) {
printf("%s\n", token);
current->next = malloc(sizeof(struct fraction_node));
current = current->next;
current->num = atoi(strsep(&token, "/"));
current->denom = atoi(&token);
}
free(freeme);
(The context is that it's parsing a list of fractions.)

while((token = (char*)(uint64_t)strsep(&input, " ")) != NULL) {
is completely broken.
#define _BSD_SOURCE
#include <string.h>
while ((token = strsep(&input, " ")) != NULL) {
is a trivial attempt to fix it, but does not work when input is a char const * pointer.
The
current->denom = atoi(&token);
does not make sense either; you have to write
current->denom = atoi(token);

Copy the string first, before converting.
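For the conversion itself, a char * can be passed wherever a const char * parameter is expected; the const only promises that strtol will not write through the pointer. As a minimal sketch (parse_fraction is a hypothetical helper, not part of the question's code), one "3/4" token could be converted with strtol and its end pointer, without atoi or strsep:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses text of the form "3/4" into *num and *den.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
int parse_fraction(const char *text, long *num, long *den)
{
    char *end;

    errno = 0;
    long n = strtol(text, &end, 10);
    /* end == text means no digits were consumed at all. */
    if (end == text || errno == ERANGE || *end != '/')
        return -1;

    const char *den_text = end + 1;   /* first character after the '/' */
    errno = 0;
    long d = strtol(den_text, &end, 10);
    if (end == den_text || errno == ERANGE || *end != '\0')
        return -1;

    *num = n;
    *den = d;
    return 0;
}
```

A writable buffer such as char token[] = "3/4"; can be passed directly as the first argument. Unlike atoi, this reports input that is not a number instead of silently returning 0.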

Related

Parse comma separated string in C

Currently I'm trying this, which is printing out nothing (but there are no compilation problems):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void parseGprmc(char* gprmc) {
printf("test");
char* ptr;
ptr = strtok(gprmc, ",");
while(ptr != NULL) {
printf("%s\n", ptr);
ptr = strtok(gprmc, ",");
}
}
int main() {
char* gprmc = "$GPRMC,081836,A,3751.65,S,14507.36,E,000.0,360.0,,,E*62";
printf("%s", gprmc);
parseGprmc(gprmc);
printf("%s", gprmc);
return 1;
}
What am I doing wrong?
Ideally, parseGprmc would print out:
$GPRMC
081836
A
3751.65
S
14507.36
E
000.0
360.0
E*62
Ideally the empty fields would be treated as valid tokens as well, but I don't think strtok does that.
There are two serious issues in your code. First (as pointed out in the comments), the strtok function modifies the string given as its first argument – but you have declared gprmc as a pointer to a constant (immutable) string literal. To fix this, change the declaration to a char array that is initialized with a copy of the literal:
char gprmc[] = "$GPRMC,081836,A,3751.65,S,14507.36,E,000.0,360.0,,,E*62";
Second, only the first call to strtok for parsing a given string should have that string as its first argument; subsequent calls (on the same string) should use a NULL first argument (see this cppreference page, particularly the part beginning with "If str is not a null pointer, …").
Here is a 'fixed' version of your parseGprmc function:
void parseGprmc(char* gprmc)
{
printf("test");
char* ptr;
ptr = strtok(gprmc, ","); // First call, use string address
while (ptr != NULL) {
printf("%s\n", ptr);
ptr = strtok(NULL, ","); // Subsequent calls, use NULL
}
}
Note that it is not directly possible to extract empty (null) tokens using the strtok function, as the function searches for the first character which is not contained in delim (quoted from the same cppreference page). To do this, you have to add some 'tricks', comparing the returned address of each token with the address of the end of the previous token, and counting how many delimiter characters were in the 'gap'. Here is a possible modification of your function to do just that:
void parseGprmc(char* gprmc)
{
printf("test");
char* ptr;
char* lastend;
ptrdiff_t nBlanks; // ptrdiff_t is declared in <stddef.h>
ptr = strtok(gprmc, ","); // First call, use string address
while (ptr != NULL) {
lastend = ptr + strlen(ptr); // Address of the '\0' that terminates this token
printf("%s\n", ptr);
ptr = strtok(NULL, ","); // Subsequent calls, use NULL
nBlanks = ptr ? ptr - lastend : -1; // Number of delimiters found
while (--nBlanks >= 1) { // One empty token for each gap > 1
printf("<null token>\n");
}
}
}
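One remark about strtok: you must pass NULL as the first argument on the second and later calls. strtok remembers where it stopped, and passing gprmc again restarts the scan from the beginning of the string, where the first ',' has already been overwritten with '\0', so the loop would return the first token forever.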
ptr = strtok(gprmc, ","); // for first call before the loop
ptr = strtok(NULL, ",");// for the call following in the loop
Homework? Inside the loop you don't pass strtok the string pointer again; you pass NULL. See the documentation for that function. But perhaps what you want is to use strchr instead and change each ',' to '\n'?
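The strchr idea from the last comment could look roughly like this. It is only a sketch: commas_to_newlines is a hypothetical name, and the string must be writable (a char array, not a string literal):

```c
#include <string.h>

/* Replaces every ',' in s with '\n' in place and returns how many
 * commas were replaced. s must point to writable memory. */
size_t commas_to_newlines(char *s)
{
    size_t replaced = 0;

    /* Each strchr call resumes one character after the previous match. */
    for (char *p = strchr(s, ','); p != NULL; p = strchr(p + 1, ',')) {
        *p = '\n';
        replaced++;
    }
    return replaced;
}
```

After the call, a single printf("%s\n", gprmc) prints one field per line. Empty fields show up as blank lines, which strtok would have skipped.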

Issue when using pointer to line tokens in C

I have created a program that requires reading a CSV file that contains bank accounts and transaction history. To access certain information, I have a function getfield which reads each line token by token:
const char* getfield(char* line, int num)
{
const char *tok;
for (tok = strtok(line, ",");
tok && *tok;
tok = strtok(NULL, ",\n"))
{
if (!--num)
return tok;
}
return NULL;
}
I use this later on in my code to access the account number (at position 2) and the transaction amount(position 4):
...
while (fgets(line, 1024, fp))
{
char* tmp = strdup(line);
//check if account number already exists
char *acc = (char*) getfield(tmp, 2);
char *txAmount = (char*)getfield(tmp, 4);
printf("%s\n", txAmount);
//int n =1;
if (acc!=NULL && atoi(acc)== accNum && txAmount !=NULL){
if(n<fileSize)
{
total[n]= (total[n-1]+atof(txAmount));
printf("%f", total[n]);
n++;
}
}
free(tmp1); free(tmp2);
}
...
No issue seems to arise with char *acc = (char*) getfield(tmp, 2), but when I use getfield for char *txAmount = (char*)getfield(tmp, 4) the print statement that follows shows me that I always have NULL. For context, the file currently reads as (first line is empty):
AC,1024,John Doe
TX,1024,2020-02-12,334.519989
TX,1024,2020-02-12,334.519989
TX,1024,2020-02-12,334.519989
I had previously asked if it was required to use free(acc) in a separate part of my code (Free() pointer error while casting from const char*) and the answer seemed to be no, but I'm hoping this question gives better context. Is this a problem with not freeing up txAmount? Any help is greatly appreciated !
(Also, if anyone has a better suggestion for the title, please let me know how I could have better worded it, I'm pretty new to stack overflow)
Your getfield function modifies its input. So when you call getfield on tmp again, you aren't calling it on the right string.
For convenience, you may want to make a getfield function that doesn't modify its input. It will be inefficient, but I don't think performance or efficiency are particularly important to your code. The getfield function would call strdup on its input, extract the string to return, call strdup on that, free the duplicate of the original input, and then return the pointer to the duplicate of the found field. The caller would have to free the returned pointer.
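A non-modifying getfield could be sketched as follows. This is a variation on the idea rather than the exact steps described above: getfield_copy is a hypothetical name, it scans with strcspn instead of duplicating the whole line for strtok, and as a consequence it counts empty fields, which strtok skips.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical non-destructive variant of getfield: returns a malloc'd copy
 * of field num (1-based) of a comma-separated line, or NULL if the line has
 * no such field or allocation fails. The caller must free the result. */
char *getfield_copy(const char *line, int num)
{
    if (num < 1)
        return NULL;

    const char *start = line;
    for (int i = 1; ; i++) {
        /* Length of the current field: up to the next ',' or end of line. */
        size_t len = strcspn(start, ",\n");

        if (i == num) {
            char *copy = malloc(len + 1);
            if (copy == NULL)
                return NULL;
            memcpy(copy, start, len);
            copy[len] = '\0';
            return copy;
        }
        if (start[len] != ',')
            return NULL;          /* line ended before field num */
        start += len + 1;         /* skip the field and its ',' */
    }
}
```

With this, getfield_copy(line, 2) and getfield_copy(line, 4) can be called in either order on the same unmodified line, and each result is released with free.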
The issue is that strtok replaces the found delimiters with '\0'. You'll need to get a fresh copy of the line.
Or continue where you left off, using getfield(NULL, 2): strtok resumes after the second field, so counting two more fields yields the fourth.

issue with strtok to compare two words from a nested results of strtok function

I want to compare the words in one array with the dictionary words in another array, to find the maximum number of matching words.
I used strtok since the words in both are delimited by spaces, but it's not working. I need your help, please.
void chercherScoreMotDansDico(char msgBootforce [], int*
maxCorrepondance, char* mot, char* dicoActuel, char*
bonResultatBootforce) {
int i = 0;
char* motdico = NULL;
char tmpMsgBootForce [3000] = {0};
strcpy(tmpMsgBootForce, msgBootforce);
mot = strtok (tmpMsgBootForce, " ");
while (mot != NULL) {
motdico = strtok (dicoActuel, " ");
while (motdico != NULL) {
if (strcmp(mot,motdico) == 0) ++i;
motdico = strtok (NULL, " ");
}
mot = strtok (NULL," ");
}
if (i > *(maxCorrepondance)) {
*(maxCorrepondance) = i;
strcat(bonResultatBootforce, msgBootforce);
}
}
You can't tokenize two different strings with strtok() at the same time: strtok() has an internal pointer where it stores the address of the current string being processed. If you call strtok() with a string and then call strtok() with a different string, a later strtok(NULL, delim) will continue with the last string that was specified.
See https://en.cppreference.com/w/c/string/byte/strtok
This function is destructive: it writes the '\0' characters in the
elements of the string str. In particular, a string literal cannot be
used as the first argument of strtok. Each call to strtok modifies a
static variable: is not thread safe. Unlike most other tokenizers,
the delimiters in strtok can be different for each subsequent token,
and can even depend on the contents of the previous tokens. The
strtok_s function differs from the POSIX strtok_r function by guarding
against storing outside of the string being tokenized, and by checking
runtime constraints.
There is a newer variant of strtok(), strtok_s(), which takes the address of a pointer variable to use instead of the internal pointer variable that strtok() uses.
You can't use strtok with two different strings at the same time.
strtok(string, delim) stores its position in string internally for future calls to strtok (NULL, delim). It can only remember one at a time. strtok (tmpMsgBootForce, " ") says to look through tmpMsgBootForce and then motdico = strtok (dicoActuel, " ") overwrites that with dicoActuel.
What to use instead depends on your compiler. The C standard defines strtok_s, but only in the optional Annex K of the 2011 standard, which has proven to be controversial and is not widely implemented. POSIX defines strtok_r, which most Unix compilers will understand. Finally, Visual Studio has its own slightly different strtok_s.
They all work basically the same way. You manually store the position in each string you're iterating through.
Here it is using strtok_r. next_tmpMsgBootforce and next_dicoActuel hold the position for parsing tmpMsgBootForce and dicoActuel respectively.
char *next_tmpMsgBootforce;
char *next_dicoActuel;
strcpy(tmpMsgBootForce, msgBootforce);
mot = strtok_r(tmpMsgBootForce, " ", &next_tmpMsgBootforce);
while (mot != NULL) {
motdico = strtok_r(dicoActuel, " ", &next_dicoActuel);
while (motdico != NULL) {
if (strcmp(mot,motdico) == 0) ++i;
motdico = strtok_r(NULL, " ", &next_dicoActuel);
}
mot = strtok_r(NULL," ", &next_tmpMsgBootforce);
}
Because this is all such a mess, I recommend using a library such as GLib to smooth out these incompatibilities and unsafe functions.
As a side note, the strcpy and strcat are not safe. If their destination does not have enough space it will try to write outside its memory bounds. As with strtok the situation to do this safely is a mess. There's the non-standard but ubiquitous strlcpy and strlcat. There's the standard but not ubiquitous strcpy_s and strcat_s. Thankfully for once Visual Studio follows the standard.
On POSIX systems you can use strdup to duplicate a string. It will handle the memory allocation for you.
char *tmpMsgBootForce = strdup(msgBootForce);
The caveat is you have to free this memory at the end of the function.
Doing a strcat safely gets complicated. Let's simplify this by splitting it into two functions. One to do the searching.
int theSearching(
    const char *msgBootforce,
    const char *dicoActuel
) {
    int i = 0;
    char *next_tmpMsgBootforce;
    char *next_dicoActuel;
    char *tmpMsgBootForce = strdup(msgBootforce);
    char *mot = strtok_r(tmpMsgBootForce, " ", &next_tmpMsgBootforce);
    while (mot != NULL) {
        // strtok_r writes '\0' into the string it scans, so tokenize a
        // fresh copy of the dictionary for every word.
        char *tmpDicoActuel = strdup(dicoActuel);
        char *motdico = strtok_r(tmpDicoActuel, " ", &next_dicoActuel);
        while (motdico != NULL) {
            if (strcmp(mot,motdico) == 0) {
                ++i;
            }
            motdico = strtok_r(NULL, " ", &next_dicoActuel);
        }
        free(tmpDicoActuel);
        mot = strtok_r(NULL," ", &next_tmpMsgBootforce);
    }
    // Release the copy made by strdup.
    free(tmpMsgBootForce);
    return i;
}
And one to do the appending. This function ensures there's enough space for the concatenation. Because it uses realloc, dest must point to heap memory obtained from malloc, calloc, realloc, or strdup, not to an array.
char *tryAppend( char *dest, const char *src, int *maxCorrepondance, const int numFound ) {
    char *new_dest = dest;
    if (numFound > *maxCorrepondance) {
        // Allocate enough memory for the concatenation.
        // Don't forget space for the null byte.
        new_dest = realloc( dest, strlen(dest) + strlen(src) + 1 );
        if (new_dest == NULL) {
            // realloc failed; dest is still valid and unchanged.
            return dest;
        }
        *(maxCorrepondance) = numFound;
        strcat( new_dest, src);
    }
    // Return a pointer to the reallocated memory,
    // or just the old one if no reallocation was necessary.
    return new_dest;
}
Then use them together.
int numFound = theSearching(msgBootforce, dicoActuel);
bonResultatBootforce = tryAppend(bonResultatBootforce, msgBootforce, &maxCorrepondance, numFound);

Tokenizing multiple String in C with KEIL Compiler

I am writing a microcontroller program using the Keil compiler. The program creates several CSV-like strings (logging lines). For example "A;001;ERROR;C05;...\n"
To save space I now want to reduce the data by just logging the differences.
Therefore I am saving the last logged line and compare it to the new one. If a value in a column is the same, I want to just omit it. For example:
"A;001;ERROR;C05;...\n" <- previous Log
"A;002;ERROR;C06;...\n" <- new Log
would result in ";002;;C06;...\n"
At first I just included <string.h> and used 'strtok' to step through my CSV-line. Since I need to compare 2 Strings/Lines, I would need to use it simultaneously on 2 different Strings which does not work. So I switched to 'strtok_r' which seems to not work at all:
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
This just gives me strange behaviour. Typically the second call to 'strtok_r' just returns a NULL and the loop is left.
So is there maybe another way of achieving the desired behaviour?
EDIT:
To clarify what I mean, this is what I am currently trying to get to work:
My Input (m_cMeasureLogLine) is "M;0001;001;01;40;1000.00;0.00;1000.00;0.00;360.00;0.00;400.00;24.90;400.00;-9999.00;-9999.00;-9999.00;0;LED;;;;;400.00;34.40;25.41;27.88;29.01;0.00;0.00;0.00;-100.00;0.00;-1000.00;-1000.00;-103.032;-70.192;19;8192.00;0.00;0;"
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* token2;
char** pointer1;
char** pointer2;
void vLogProtocolMeasureData()
{
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
}
The function is part of a bigger embedded project, so I don't have console output but use the debugger to check the contents of my variables. In the above example, after the first call to 'strtok_r' token1 is 'M', which is correct. After the second call, however (in the loop), token1 becomes 0x00000000 (NULL).
If I use 'strtok' instead:
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok(m_cActLogLine, ";");
while (token1 != NULL) {
token1 = strtok(NULL, ";");
}
the loop iterates just fine. But that way I can't process two strings at a time and compare values column-wise.
In string.h the functions are declared as:
extern _ARMABI char *strtok(char * __restrict /*s1*/, const char * __restrict /*s2*/) __attribute__((__nonnull__(2)));
extern _ARMABI char *_strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#ifndef __STRICT_ANSI__
extern _ARMABI char *strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#endif
You need to pass a pointer to a valid char* for the last parameter of strtok_r(). You're passing a pointer to a pointer with pointer1, but it's NULL (because it's a globally scoped variable that isn't assigned a value), so when strtok_r() goes to store its iterating pointer at the address you pass in, it's trying to write something to address 0x00000000.
Try...
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* pointer1;
void vLogProtocolMeasureData()
{
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok_r(m_cActLogLine, ";", &pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", &pointer1);
}
}
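For the column-wise comparison the question is ultimately after, walking both lines with strcspn avoids strtok and strtok_r altogether and leaves both inputs untouched. A rough sketch, assuming ';' separates columns and an optional '\n' ends the line (diff_log_line and its parameters are hypothetical names):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes to out the line cur with every ';'-separated
 * column that equals the same column of prev left empty.
 * Returns 0 on success, -1 if out is too small. */
int diff_log_line(const char *prev, const char *cur, char *out, size_t out_size)
{
    size_t used = 0;

    for (;;) {
        size_t prev_len = strcspn(prev, ";\n");
        size_t cur_len = strcspn(cur, ";\n");
        int same = prev_len == cur_len && memcmp(prev, cur, cur_len) == 0;

        if (!same) {
            /* Keep one byte in reserve for the terminating '\0'. */
            if (used + cur_len >= out_size)
                return -1;
            memcpy(out + used, cur, cur_len);
            used += cur_len;
        }
        prev += prev_len;
        cur += cur_len;

        if (*cur != ';')
            break;                 /* no more columns in the new line */
        if (used + 1 >= out_size)
            return -1;
        out[used++] = ';';
        cur++;
        if (*prev == ';')
            prev++;                /* prev may have fewer columns than cur */
    }

    if (*cur == '\n') {            /* preserve the line terminator */
        if (used + 1 >= out_size)
            return -1;
        out[used++] = '\n';
    }
    if (used >= out_size)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Called with the previous line "A;001;ERROR;C05\n" and the new line "A;002;ERROR;C06\n", this fills out with ";002;;C06\n". Empty columns are handled naturally because nothing here skips consecutive delimiters.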

C: Parse empty tokens from a string with strtok

My application produces strings like the one below. I need to parse values between the separator into individual values.
2342|2sd45|dswer|2342||5523|||3654|Pswt
I am using strtok to do this in a loop. For the fifth token, I am getting 5523. However, I need to account for the empty value between the two separators || as well. 5523 should be the sixth token, as per my requirement.
token = (char *)strtok(strAccInfo, "|");
for (iLoop=1;iLoop<=106;iLoop++) {
token = (char *)strtok(NULL, "|");
}
Any suggestions?
In that case I often prefer a p2 = strchr(p1, '|') loop with a memcpy(s, p1, p2-p1) inside. It's fast, does not destroy the input buffer (so it can be used with const char *) and is really portable (even on embedded).
It's also reentrant; strtok isn't. (BTW: reentrant has nothing to do with multi-threading. strtok breaks already with nested loops. One can use strtok_r but it's not as portable.)
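A minimal sketch of that strchr and memcpy loop, wrapped in a function so that the caller holds the position (next_token and its parameters are hypothetical names, and tokens longer than the buffer are truncated):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the token starting at *cursor into buf
 * (truncating it to fit), then advances *cursor past the delimiter, or sets
 * it to NULL after the last token. Returns 1 if a token was produced and 0
 * once *cursor is NULL. The input string is never modified. */
int next_token(const char **cursor, char delim, char *buf, size_t buf_size)
{
    const char *p1 = *cursor;
    if (p1 == NULL || buf_size == 0)
        return 0;

    const char *p2 = strchr(p1, delim);
    size_t len = p2 ? (size_t)(p2 - p1) : strlen(p1);
    if (len >= buf_size)
        len = buf_size - 1;        /* leave room for the '\0' */

    memcpy(buf, p1, len);
    buf[len] = '\0';
    *cursor = p2 ? p2 + 1 : NULL;
    return 1;
}
```

Starting with a const char * cursor pointing at the string from the question and calling next_token(&cursor, '|', buf, sizeof buf) in a loop yields ten tokens, with the empty fifth token preserved and 5523 as the sixth. Since all state lives in the caller's cursor, two strings can be walked side by side.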
That's a limitation of strtok. The designers had whitespace-separated tokens in mind. strtok doesn't do much anyway; just roll your own parser. The C FAQ has an example.
On a first call, the function expects
a C string as argument for str, whose
first character is used as the
starting location to scan for tokens.
In subsequent calls, the function
expects a null pointer and uses the
position right after the end of last
token as the new starting location for
scanning.
To determine the beginning and the end
of a token, the function first scans
from the starting location for the
first character not contained in
delimiters (which becomes the
beginning of the token). And then
scans starting from this beginning of
the token for the first character
contained in delimiters, which becomes
the end of the token.
What this says is that it will skip any '|' characters at the beginning of a token, making 5523 the 5th token, which you already knew. Just thought I would explain why (I had to look it up myself). This also says that you will not get any empty tokens.
Since your data is set up this way, you have a couple of possible solutions:
1) find all occurrences of || and replace with | | (put a space in there)
2) do a strstr 5 times and find the beginning of the 5th element.
char *mystrtok(char **m,char *s,char c)
{
char *p=s?s:*m;
if( !*p )
return 0;
*m=strchr(p,c);
if( *m )
*(*m)++=0;
else
*m=p+strlen(p);
return p;
}
reentrant
threadsafe
strictly ANSI conforming
needs an otherwise unused helper pointer from the calling context
e.g.
char *p,*t,s[]="2342|2sd45|dswer|2342||5523|||3654|Pswt";
for(t=mystrtok(&p,s,'|');t;t=mystrtok(&p,0,'|'))
puts(t);
e.g.
char *p,*t,s[]="2,3,4,2|2s,d4,5|dswer|23,42||5523|||3654|Pswt";
for(t=mystrtok(&p,s,'|');t;t=mystrtok(&p,0,'|'))
{
char *p1,*t1;
for(t1=mystrtok(&p1,t,',');t1;t1=mystrtok(&p1,0,','))
puts(t1);
}
Left as an exercise for you :)
Accept a char * set of delimiters as parameter 3 instead of a single char.
Look into using strsep instead: strsep reference
Use something other than strtok. It's simply not intended to do what you're asking for. When I've needed this, I usually used strcspn or strpbrk and handled the rest of the tokenizing myself. If you don't mind it modifying the input string like strtok, it should be pretty simple. At least right off, something like this seems as if it should work:
// Warning: untested code. Should really use something with a less-ugly interface.
char *tokenize(char *input, char const *delim) {
static char *current; // just as ugly as strtok!
char *pos, *ret;
if (input != NULL)
current = input;
if (current == NULL)
return current;
ret = current;
pos = strpbrk(current, delim);
if (pos == NULL)
current = NULL;
else {
*pos = '\0';
current = pos+1;
}
return ret;
}
Inspired by Patrick Schlüter's answer, I made this function. It is supposed to be thread safe, supports empty tokens, and doesn't change the original string. Each returned token is allocated with malloc, so the caller has to free it.
char* strTok(char** newString, char* delimiter)
{
char* string = *newString;
char* delimiterFound = (char*) 0;
int tokLenght = 0;
char* tok = (char*) 0;
if(!string) return (char*) 0;
delimiterFound = strstr(string, delimiter);
if(delimiterFound){
tokLenght = delimiterFound-string;
}else{
tokLenght = strlen(string);
}
tok = malloc(tokLenght + 1);
memcpy(tok, string, tokLenght);
tok[tokLenght] = '\0';
*newString = delimiterFound ? delimiterFound + strlen(delimiter) : (char*)0;
return tok;
}
you can use it like
char* input = "1,2,3,,5,";
char** inputP = &input;
char* tok;
while( (tok=strTok(inputP, ",")) ){
printf("%s\n", tok);
free(tok); // strTok returns memory allocated with malloc
}
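This is supposed to output the following (the fourth and sixth tokens are empty, so they print as blank lines):
1
2
3
<empty line>
5
<empty line>
I tested it for simple strings but haven't used it in production yet. I also posted it on Code Review, so you can see what others think about it.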
Below is the solution that is working for me now. Thanks to all of you who responded.
I am using LoadRunner. Hence, some unfamiliar commands, but I believe the flow can be understood easily enough.
char strAccInfo[1024], *p2;
int iLoop;
Action() { //This value would come from the wrsp call in the actual script.
lr_save_string("323|90||95|95|null|80|50|105|100|45","test_Param");
//Store the parameter into a string - saves memory.
strcpy(strAccInfo,lr_eval_string("{test_Param}"));
//Get the first instance of the separator "|" in the string
p2 = (char *) strchr(strAccInfo,'|');
//Start a loop - Set the max loop value to more than max expected.
for (iLoop = 1;iLoop<200;iLoop++) {
//Save parameter names in sequence.
lr_param_sprintf("Param_Name","Parameter_%d",iLoop);
//Get the first instance of the separator "|" in the string (within the loop).
p2 = (char *) strchr(strAccInfo,'|');
//Save the value for the parameters in sequence.
lr_save_var(strAccInfo,p2 - strAccInfo,0,lr_eval_string("{Param_Name}"));
//Save string after the first instance of p2, as strAccInfo - for looping.
memmove(strAccInfo,p2+1,strlen(p2+1)+1); //source and destination overlap, so strcpy would be undefined behavior
//Start conditional loop for checking for last value in the string.
if (strchr(strAccInfo,'|')==NULL) {
lr_param_sprintf("Param_Name","Parameter_%d",iLoop+1);
lr_save_string(strAccInfo,lr_eval_string("{Param_Name}"));
iLoop = 200;
}
}
}

Resources