C programming strtok - c

I am having trouble with the strtok function. I keep getting a 'bus error.' I wrote a function to return all the words within a line. Could somebody please point out my error?
NOTE: I am used to higher level languages
void extract_words(char tokens[WORD_MAX][WORD_LEN], char* line, int* sizePtr)
{
printf("in extract words"); //for debugging
char* chPtr = NULL;
chPtr = strtok(line, " ");
int size = 1; //words has one element
while(chPtr != NULL)
{
strcpy(tokens[size++], chPtr);
chPtr = strtok(NULL, " "); //continue to tokenize the string
}
*sizePtr = size;
}
Thanks in advance!

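strtok modifies the string you pass to it, so it can't be a string literal; writing to a literal is undefined behavior and typically shows up as a bus error or segmentation fault. Copy the line into a writable buffer first. You should be able to do something like this: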
void extract_words(char tokens[WORD_MAX][WORD_LEN], const char* line_arg, int* sizePtr)
{
char line[(WORD_LEN+1)*WORD_MAX];
char* chPtr = NULL;
int size = 0;
strcpy(line,line_arg);
printf("in extract words"); //for debugging
chPtr = strtok(line, " ");
while(chPtr != NULL)
{
strcpy(tokens[size++], chPtr);
chPtr = strtok(NULL, " "); //continue to tokenize the string
}
*sizePtr = size;
}
Note that I also initialized size to zero, since array indices start at zero.

Well...
Do you need to split on a set of separator characters?
In that case, I have some source code:
int split(char *src, char *div, char **result,int *size)
{
int i, j, slen, dlen, key=0, start=0;
slen=strlen(src);
dlen=strlen(div);
for(i=0;i<slen;i++)
{
for(j=0;j<dlen;j++)
{
if(src[i]==div[j])
{
src[i]=0x00;
result[key] = src+start;
key++;
start=i+1;
}
}
}
result[key]=src+start;
*size=key+1;
return 0;
}
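Usage:
split(chatData, " ", cmpData, &tok);
" " : the separator characters
&tok : receives the number of words found
chatData : the original data (modified in place)
cmpData : array that receives a pointer to each word
By Dalsam, from Korea.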

Related

Unable to achieve expected parsing output with strtok

I have been working on this for a while now but I do not seem to be able to resolve this bug. Any insights will be greatly appreciated thanks.
I am writing code that will parse a string first by ";" and then by " ". The code I have below is as follows:
void arrayVis(char **arr, int size){
printf("[");
for(int i = 0; i < size; i++){
if(arr[i] == NULL || strcmp(arr[i], "") == 0){
break;
}
printf("%s,", arr[i]);
}
}
void parser2(char *line){
char *token = strtok(line, " ");
char *arr[10];
int index = 0;
while(token != NULL){
arr[index] = token;
token = strtok(NULL, " ");
index++;
}
arrayVis(arr, 10);
}
void parser1(char *line){
char *token = strtok(line, ";");
while(token != NULL){
parser2(token);
// myPrint(token);
printf("\n");
token = strtok(NULL, ";");
}
}
arrayVis just lets me visualize the array that is produced. When I pass "1 2 3;4 5 6;"
I am expecting an output of
[1,2,3
[4,5,6
but instead I just get the output
[1,2,3
Why is my output omitting the second portion of the parse? I have been thinking about this for a while now but I don't seem to understand why this happens. Any insights will be appreciated. Thank you.
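strtok keeps a single hidden position between calls, so the strtok calls in parser2 overwrite the state that the loop in parser1 depends on. When parser2 returns, strtok(NULL, ";") continues from the end of the first segment and finds nothing more. strtok_r (POSIX) keeps that position in a pointer supplied by the caller, so the two loops can be nested:
#include <stdio.h>
#include <string.h>
int main(void) {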
static const char *ROW_TOKENS=";";
static const char *COL_TOKENS=" ";
char buf[] = "data1;data2 data3;data4 data5";
char *aux_row,*cursor_row;
cursor_row = strtok_r(buf, ROW_TOKENS, &aux_row);
// printf("[A] buf: %p, cursor_row=%p, aux_row=%p\n", buf, cursor_row, aux_row);
while (cursor_row) {
char *aux_col,*cursor_col;
printf("[");
cursor_col = strtok_r(cursor_row, COL_TOKENS, &aux_col);
// printf("[B] cursor_row=%p, aux_row=%p, cursor_col=%p, aux_col=%p\n",
// cursor_row, aux_row, cursor_col, aux_col);
while (cursor_col) {
// printf("[C] cursor_row=%p, aux_row=%p, cursor_col=%p, aux_col=%p\n",
// cursor_row, aux_row, cursor_col, aux_col);
printf("%s,", cursor_col);
cursor_col = strtok_r(NULL, COL_TOKENS, &aux_col);
}
cursor_row = strtok_r(NULL, ROW_TOKENS, &aux_row);
printf("\n");
}
return 0;
}

In the C language, how do I divide a string up into substrings separated by a space character. [duplicate]

I have been trying to tokenize a string using SPACE as the delimiter but it doesn't work. Does anyone have a suggestion on why it doesn't work?
Edit: tokenizing using:
strtok(string, " ");
The code is like the following
pch = strtok (str," ");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ");
}
Do it like this:
char s[256];
strcpy(s, "one two three");
char* token = strtok(s, " ");
while (token) {
printf("token: %s\n", token);
token = strtok(NULL, " ");
}
Note: strtok modifies the string it's tokenising, so it cannot be a const char*.
Here's an example of strtok usage. Keep in mind that strtok is destructive of its input string (and therefore can't ever be used on a string constant):
char *p = strtok(str, " ");
while(p != NULL) {
printf("%s\n", p);
p = strtok(NULL, " ");
}
Basically the thing to note is that passing a NULL as the first parameter to strtok tells it to get the next token from the string it was previously tokenizing.
strtok can be very dangerous. It is not thread safe. Its intended use is to be called over and over in a loop, passing NULL after the first call so that it continues from where the previous call stopped. The strtok function has an internal variable that stores that position. This state is not unique to each thread - it is global. If any other code uses strtok in another thread, you get problems. Not the kind of problems you want to track down either!
I'd recommend looking for a regex implementation, or using sscanf to pull apart the string.
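Try this:
char strprint[256];
const char *text = "My string to test";
int used = 0;
/* %n stores how many characters sscanf has consumed so far */
while ( sscanf( text, "%255s%n", strprint, &used) == 1 ) {
printf("token: %s\n", strprint);
text += used; /* continue after the token just read */
}
Note: Having sscanf write into the same buffer it is scanning is undefined behaviour, so this version leaves the source string untouched and advances a pointer instead.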
You can simplify the code by introducing an extra variable.
#include <string.h>
#include <stdio.h>
int main()
{
char str[100], *s = str, *t = NULL;
strcpy(str, "a space delimited string");
while ((t = strtok(s, " ")) != NULL) {
s = NULL;
printf(":%s:\n", t);
}
return 0;
}
I've written some string functions to split values, using as few pointers as I could because this code is intended to run on PIC18F processors. Those processors do not handle pointers well when little free RAM is available:
#include <stdio.h>
#include <string.h>
char POSTREQ[255] = "pwd=123456&apply=Apply&d1=88&d2=100&pwr=1&mpx=Internal&stmo=Stereo&proc=Processor&cmp=Compressor&ip1=192&ip2=168&ip3=10&ip4=131&gw1=192&gw2=168&gw3=10&gw4=192&pt=80&lic=&A=A";
int findchar(char *string, int Start, char C) {
while((string[Start] != 0)) { Start++; if(string[Start] == C) return Start; }
return -1;
}
int findcharn(char *string, int Times, char C) {
int i = 0, pos = 0, fnd = 0;
while(i < Times) {
fnd = findchar(string, pos, C);
if(fnd < 0) return -1;
if(fnd > 0) pos = fnd;
i++;
}
return fnd;
}
void mid(char *in, char *out, int start, int end) {
int i = 0;
int size = end - start;
for(i = 0; i < size; i++){
out[i] = in[start + i + 1];
}
out[size] = 0;
}
void getvalue(char *out, int index) {
mid(POSTREQ, out, findcharn(POSTREQ, index, '='), (findcharn(POSTREQ, index, '&') - 1));
}
int main(void) {
char n_pwd[7];
getvalue(n_pwd, 1); /* first value in POSTREQ: pwd=123456 */
printf("Value: %s\n", n_pwd);
return 0;
}
When reading the strtok documentation, I see you need to pass in a NULL pointer after the first "initializing" call. Maybe you didn't do that. Just a guess of course.
Here is another strtok() implementation, which has the ability to recognize consecutive delimiters (standard library's strtok() does not have this)
The function is a part of BSD licensed string library, called zString. You are more than welcome to contribute :)
https://github.com/fnoyanisi/zString
char *zstring_strtok(char *str, const char *delim) {
static char *static_str=0; /* var to store last address */
int index=0, strlength=0; /* integers for indexes */
int found = 0; /* check if delim is found */
/* delimiter cannot be NULL
* if no more char left, return NULL as well
*/
if (delim==0 || (str == 0 && static_str == 0))
return 0;
if (str == 0)
str = static_str;
/* get length of string */
while(str[strlength])
strlength++;
/* find the first occurrence of delim */
for (index=0;index<strlength;index++)
if (str[index]==delim[0]) {
found=1;
break;
}
/* if delim is not contained in str, return str */
if (!found) {
static_str = 0;
return str;
}
/* check for consecutive delimiters
*if first char is delim, return delim
*/
if (str[0]==delim[0]) {
static_str = (str + 1);
return (char *)delim;
}
/* terminate the string
* this assignment requires char[], so str has to
* be char[] rather than char *
*/
str[index] = '\0';
/* save the rest of the string */
if ((str + index + 1)!=0)
static_str = (str + index + 1);
else
static_str = 0;
return str;
}
As mentioned in previous posts, since strtok(), or the one I implemented above, relies on a static char * variable to preserve the position of the last delimiter between consecutive calls, extra care should be taken when dealing with multi-threaded applications.
int not_in_delimiter(char c, char *delim){
while(*delim != '\0'){
if(c == *delim) return 0;
delim++;
}
return 1;
}
char *token_separater(char *source, char *delimiter, char **last){
char *begin, *next_token;
char *sbegin;
/*Get the start of the token */
if(source)
begin = source;
else
begin = *last;
sbegin = begin;
/*Scan through the string till we find character in delimiter. */
while(*begin != '\0' && not_in_delimiter(*begin, delimiter)){
begin++;
}
/* Check if we have reached the end of the string */
if(*begin == '\0') {
/* No delimiter left: this is the last token, so clear the saved position */
*last = NULL;
return sbegin;
}
/* Scan the string till we find a character which is not in delimiter */
next_token = begin;
while(*next_token != '\0' && !not_in_delimiter(*next_token, delimiter)) {
next_token++;
}
/* Terminate the current token at its first delimiter */
*begin = '\0';
/* Remember where the next token starts, or NULL if only delimiters remained */
*last = (*next_token != '\0') ? next_token : NULL;
return sbegin;
}
int main(void){
char string[10] = "abcb_dccc";
char delim[10] = "_";
char *token = NULL;
char *last = NULL;
token = token_separater(string, delim, &last);
printf("%s\n", token);
while(last){
token = token_separater(NULL, delim, &last);
printf("%s\n", token);
}
return 0;
}
You can read a detailed analysis on the blog mentioned in my profile :)

EXC_BAD_ACCESS error by reallocate memory

I am trying to make a dynamic two-dimensional array (an array of strings), but I get an EXC_BAD_ACCESS error.
int execute(person* person_array)
{
char** parsed_command;
if(!(parsed_command = malloc(sizeof(char*)))){
error_notification(12);
return 2;
}
parsed_command[0] = malloc(SIZE_ARG*sizeof(char));
char command[MAX_BUFFER_SIZE];
string quit = "quit\n";
do{
printf("esp> ");
if(fgets(command, MAX_BUFFER_SIZE, stdin)==NULL){ // save input in "command"
return 2;
}
parse_command_input(command, person_array, &parsed_command);
}while(strcmp(command,quit));
printf("Bye.\n");
free(&parsed_command[0]);
free(parsed_command);
return 0;
}
void parse_command_input(const char* command, person* person_array, char*** parsed_command){
char* delim = strtok(command, " ");
int counter = 0;
while (delim != NULL){
if(counter > 0) {
char **tmp = realloc(*parsed_command, (counter+1)*sizeof(char*));
if(tmp!=NULL)
*parsed_command = tmp;
else{
error_notification(12);
}
*parsed_command[counter] = malloc(SIZE_ARG*sizeof(char)); //ERROR
}
strcpy(*parsed_command[counter], delim);
counter++;
delim = strtok (NULL, " \n");
}
which_command(parsed_command, counter, person_array);
}
So, I initialise parsed_command in execute(), and then reallocate it in parse_command_input() when I have more than one word in the input.
The first reallocation of parsed_command works, but on the second reallocation the address of parsed_command changes, and I get EXC_BAD_ACCESS on the malloc call (which adds memory for the new word).
How can I fix it?
Thanks in advance
*parsed_command[counter] means the same as *(parsed_command[counter]), because [] binds more tightly than unary *. You meant (*parsed_command)[counter], so write that, in both the malloc line and the strcpy line.

Splitting C char array into words

I am trying to split a given char array into separate strings. I am doing this by putting the address of each word into an array, and then getting the string from the address to print.
So I have updated my code but now the program freezes after printing the numArgs but before "test2." I don't understand why.
----------------old code-----------------------
char* parseArgs(char* comPtr){
char *args[100] = {0};
char *token;
int i = 0;
token = strtok(comPtr, " ");
while(token != NULL){
args[i] = malloc(100);
args[i] = &token;
token = strtok(NULL, " ");
}
return *args;
}
char* args = parseArgs(comPtr);
int i = 0;
while(i < numArgs){
printf("arg%d: %s\n",i,&args[i]);
i++;
}
-----------------------end old code--------------------
------------------new code------------------------
int countArgs(char* comPtr){
char *token;
int i = 0;
token = strtok(comPtr, " ");
while(token != NULL){
i++;
token = strtok(NULL, " ");
}
return i;
}
char** parseArgs(char* comPtr){
printf("test1");
char** args = calloc(100, sizeof(char*));
char* token;
int i = 0;
while(token = strtok(comPtr, " ")){
args[i] = token;
}
printf("test2");
return args;
}
printf("ComPtr: %s\n",comPtr);
char* path = "/bin/";
//int pid = fork(); //pid always 0 so using pid = 1 to test
//printf("PID:%d",pid);
int pid = 1;
printf("PID:%d",pid);
if(pid != 0){
int numArgs = countArgs(comPtr);
printf("test1");
printf("NumArgs: %d\n",numArgs);
printf("test2");
char** args = parseArgs(comPtr);
int i = 0;
printf("test3");
while(i < numArgs){
printf("arg%d: %s\n",i,args[i]);
printf("test4");
i++;
}
}
else{
//waitpid();
}
You've lost track of where your memory is and what your pointers are pointing to. If you want to return the list of pointers to tokens, you need something like this:
char** parseArgs(char* comPtr){
char** p_args = calloc(100, sizeof(char*));
int i = 0;
char* token = strtok(comPtr, " "); /* first call gets the string */
while (token != NULL && i < 99) { /* leave the last slot NULL */
p_args[i++] = token;
token = strtok(NULL, " "); /* later calls get NULL */
}
return p_args;
}
char** p_args = parseArgs(comPtr);
int i = 0;
while(i < numArgs)
{
printf("arg%d: %s\n",i,p_args[i]);
i++;
}
free(p_args);
I haven't tested it, but it should point you in the right direction. Have a careful think about how it differs from your program, and use a debugger and/or printf() statements in the code to print out addresses and see how it works (or debug it if necessary).
Declare the pointer array 'char *args[100]' as a global variable. In your program you are storing the pointers in a local array whose lifetime ends when the function returns, so they are no longer valid in the caller. There is a memory leak too: the block returned by malloc is lost as soon as args[i] is overwritten with &token.
The freeze is due to
int i = 0;
while(token = strtok(comPtr, " ")){
args[i] = token;
}
where you repeatedly - in an infinite loop - find the first token in comPtr, token becomes &comPtr[0] in each iteration (unless the string starts with spaces), and that is assigned to args[i].
After the first call, all calls to strtok that shall find further tokens in the same string - if any - should have a NULL first argument.
Also, you should probably increment i in the loop, since presumably you don't want to overwrite args[0] with each new token.
