strtok() skips first token - c

I can't work out why this code isn't working, even though it should be really straightforward.
From my troubleshooting, the id array is assigned inside the while (token) block, but when I print all the char arrays after that block, id displays nothing while all the other arrays display their contents.
int loadCatData(char* menuFile) {
char line[MAX_READ];
char id[ID_LEN];
char hotCold[MIN_DESC_LEN];
char name[MAX_NAME_LEN];
char description[MAX_DESC_LEN];
char delim[2] = "|";
char lineTemp[MAX_READ];
int count;
FILE *mfMenu = fopen(menuFile, "r");
while (fgets(line, sizeof(line), mfMenu)!=NULL) {
count = 0;
printf(line);
strcpy(lineTemp,line);
char* token = strtok(lineTemp, delim);
while (token) {
printf(token);
if (count == 0) {
strcpy(id, token);
}
if (count == 1) {
strcpy(hotCold, token);
}
if (count == 2) {
strcpy(name, token);
}
if (count == 3) {
strcpy(description, token);
}
printf("\n");
token = strtok(NULL, delim);
count = count + 1;
}
printf(id);
printf(hotCold);
printf(name);
printf(description);
}
fclose(mfMenu);
return true;
}

You are the victim of a buffer overflow error caused by strcpy.
What is happening is that the hotCold array is too small to hold the data you're filling it with, but strcpy doesn't care, nor does it know that there isn't enough room. So it keeps on writing data into hotCold and then runs out of room, then fills up the padding bytes, then fills up id. You just have the unfortunate luck of having the terminating null byte of hotCold sitting at the start of id.
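Switch from strcpy to a bounded copy such as strncpy or strncat (I think strncat is the better choice, because strncpy does not null-terminate the destination when the source is too long). If you're skeptical of what I'm saying, add a line of code at the end that goes like this: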
assert (strlen (hotCold) < MIN_DESC_LEN);
The other alternative is that the id field is being interpreted as a special printf-format specifier. Just in case, replace printf(id) with printf("%s", id).
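As a rough illustration of the bounded copy suggested above, here is a minimal sketch. The helper name copy_field is hypothetical (it is not part of the original code), and truncating silently is only one possible policy; you may prefer to reject over-long fields instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes, truncating if needed.
 * The destination is always null-terminated.
 * Returns 1 if the whole string fit, 0 if it had to be truncated.
 * copy_field is a hypothetical helper name used for illustration only. */
int copy_field(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);   /* keep only what fits */
        dst[dst_size - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);            /* includes the terminator */
    return 1;
}
```

In the loop from the question, each strcpy(hotCold, token) would become something like copy_field(hotCold, sizeof hotCold, token), and a return value of 0 tells you which field in the file is too long for its array.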

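int loadCatData(const char* menuFile) {
/* scanf has no "*" width argument (in scanf, "*" suppresses assignment), */
/* so each field width is pasted into the format string as a literal. */
/* This assumes the length macros expand to plain decimal literals. */
#define STR_(x) #x
#define STR(x) STR_(x)
char id[ID_LEN + 1];
char hotCold[MIN_DESC_LEN + 1];
char name[MAX_NAME_LEN + 1];
char description[MAX_DESC_LEN + 1];
FILE *mfMenu = fopen(menuFile, "r");
if (mfMenu == NULL) {
return false;
}
/* %N[^|] reads at most N characters, stopping at the next '|'. */
while (fscanf(mfMenu, " %" STR(ID_LEN) "[^|]|%" STR(MIN_DESC_LEN) "[^|]|%"
STR(MAX_NAME_LEN) "[^|]|%" STR(MAX_DESC_LEN) "[^\n]",
id, hotCold, name, description) == 4) {
printf("%s %s %s %s\n", id, hotCold, name, description);
}
fclose(mfMenu);
return true;
}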
You should never pass input from outside the program to printf as the first argument. Imagine if one of the tokens is "%s" and you say printf(token)--that's undefined behavior because you didn't pass a second string to print, and your program will crash if you're lucky.

Related

Function that concatenates two pointer strings together

I am having trouble concatenating these two strings through pointers. Below is my concatenation function; I am supposed to take string 1 and append it to string 2. I also cannot use any functions from the string library, because the point of the exercise is to help us understand what those functions actually do by writing the code ourselves.
char strconcat(char *user2p, char *user1p) {
while (*user2p) {
user2p++;
}
while (*user1p) {
*user2p = *user1p;
*user2p++;
*user1p++;
}
*user2p = '\0';
printf("test: %c", *user2p);
return *user2p;
}
And here is the part of my main that is relevant to the function.
int main() {
char userString1[21], userString2[21];
char *user1p, *user2p;
user1p = userString1;
user2p = userString2;
printf("Please enter the first string: ");
gets(userString1);
printf("Please enter the second string: ");
gets(userString2);
printf("String 1 after concatenation: ");
puts(userString1);
printf("String 2 after concatenation: %c\n", strconcat(user2p, user1p));
The terminal keeps giving me the output below (I didn't include the code for the length and alphabetical-order checks). The test printf in the function prints a null, and printing the function's return value prints nothing. I'm at a loss, and any help is much appreciated!
Please enter the first string: jackhammer
Please enter the second string: jacky
The length of string 1 is: 10
The length of string 2 is: 5
String 1 comes before string 2 alphabetically.
String 1 after concatenation: jackhammer
(null)
String 2 after concatenation:
Your concat algorithm is fine, but you have to return a pointer to the original [leftmost] value, so your function needs to save it before looping:
char *
strconcat(char *user2p, char *user1p)
{
char *orig2p = user2p;
while (*user2p) {
user2p++;
}
while (*user1p) {
*user2p = *user1p;
user2p++;
user1p++;
}
*user2p = '\0';
printf("test: %s\n", orig2p);
return orig2p;
}
UPDATE:
To come up with a completely bulletproof test program for the concat function, we can use [overly] large input buffers and clip the input length to a maximum of 1/2 of the target buffer.
gets strips the newline but fgets does not. So, I've created an xgets function that is similar to gets but uses fgets and strchr to get [nearly] the same effect.
Although I believe it's okay to use standard string functions as part of the test code, I've created a hand coded version of strchr [hope that's not your next assignment :-)].
Anyway, here's the full program:
#include <stdio.h>
char *
strconcat(char *user2p, char *user1p)
{
char *orig2p = user2p;
while (*user2p) {
user2p++;
}
while (*user1p) {
*user2p = *user1p;
user2p++;
user1p++;
}
*user2p = '\0';
printf("test: %s\n", orig2p);
return orig2p;
}
char *
xstrchr(char *buf,int chrwant)
{
int chrcur;
char *res = NULL;
for (chrcur = *buf++; chrcur != 0; chrcur = *buf++) {
if (chrcur == chrwant) {
res = buf - 1;
break;
}
}
return res;
}
char *
xgets(char *buf,int maxlen)
{
char *cp;
char *res;
res = fgets(buf,maxlen,stdin);
if (res != NULL) {
cp = xstrchr(buf,'\n');
if (cp != NULL)
*cp = 0;
}
return res;
}
#define MAXLEN 800
int
main(void)
{
char userString1[MAXLEN], userString2[MAXLEN + 1];
char *user1p, *user2p;
printf("Please enter the first string: ");
user1p = xgets(userString1,MAXLEN / 2);
printf("Please enter the second string: ");
user2p = xgets(userString2,MAXLEN / 2);
if ((user2p != NULL) && (user1p != NULL))
printf("String 2 after concatenation: %s\n",strconcat(user2p, user1p));
return 0;
}
There's a number of issues. First is this.
while (*user1p) {
*user2p = *user1p;
*user2p++;
*user1p++;
}
This is working by accident. If you have compiler warnings on you should get a warning...
test.c:13:9: warning: expression result unused [-Wunused-value]
*user2p++;
^~~~~
test.c:14:9: warning: expression result unused [-Wunused-value]
*user1p++;
^~~~~~~
The reason it's unused is because C is interpreting it like so:
*(user1p++)
The pointer is incremented and its old position is dereferenced, but the value read there is thrown away. You just want to increment the pointers, no dereferencing required.
while (*user1p) {
*user2p = *user1p;
user2p++;
user1p++;
}
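To see what *p++ actually evaluates to, here is a small sketch. The function name next_char is made up for this illustration; it just wraps the expression so the result and the pointer movement can be checked separately.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the character that *p++ evaluates to and advances *pp by one.
 * next_char is a hypothetical name used only for this illustration. */
char next_char(const char **pp)
{
    assert(pp != NULL && *pp != NULL);

    const char *p = *pp;
    char c = *p++;   /* parsed as *(p++): reads the OLD position, then p moves on */
    *pp = p;
    return c;
}
```

Starting with a pointer to "ab", the first call returns 'a' and leaves the pointer on 'b'. When the value is not needed, as in the loop above, a plain p++ does the same pointer movement without the unused dereference.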
Then down here.
printf("String 2 after concatenation: %c\n", strconcat(user2p, user1p));
%c prints an individual char. You want %s which prints a char *. This reveals you have the wrong signature. strconcat should return a char * (ie. what C uses for strings) and return user2p (a char *).
char *strconcat(char *user2p, const char *user1p) {
...
return user2p;
}
And since you're not modifying the source string, its parameter should be const char * so the compiler knows and can warn you if it's accidentally changed.
Finally, when you return *user2p it's already been moved to the end of the string.
while (*user1p) {
*user2p = *user1p;
user2p++;
user1p++;
}
*user2p = '\0';
printf("test: %c", *user2p);
// This points to the null byte just set above
return user2p;
So printing the result of strconcat will print nothing. To get around this, store the original pointer for user2p and return that.
char *strconcat(char *user2p, const char *user1p) {
char *orig_user2p = user2p;
...
return orig_user2p;
}
And some tips. It's easier to follow the code with good variable names that describe what they're doing.
char *strconcat(char *orig_to, const char *from) {
char *to = orig_to;
...
}
An array declared as char foo[NN] already converts to a pointer to its first element when you pass foo to a function. There's no need to declare separate char * variables and copy the pointer.
char from[21], to[21];
Never use gets. It has no way to limit how much it reads, so it can easily overflow your buffer (it was removed from the language in C11). Use fgets, which limits the read to the buffer size you pass it.
printf("Please enter the string to concat from: ");
fgets(from, sizeof(from), stdin);
Though it's annoying that fgets keeps the newline and you have to strip it yourself. You can use scanf instead, which skips whitespace, but beware its many pitfalls; for one, %s stops reading at the first space.
printf("Please enter the string to concat from: ");
scanf("%20s", from);
printf("Please enter the string to concat to: ");
scanf("%20s", to);
Finally, be sure the string you're concatenating to can hold its own contents and the new contents.
char from[21], to[41];
printf("Please enter the string to concat to: ");
// Be sure to leave enough room in `to` to fit `from`.
fgets(to, sizeof(to) - sizeof(from), stdin);
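For the newline that fgets leaves behind, one common idiom is strcspn, which gives the index of the first '\n' (or of the terminator if there is none). This is only a sketch: read_line is a hypothetical helper name, and it uses a string-library call, so it belongs in the input-handling code rather than in the hand-written concat function the assignment asks for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf (capacity `size`) and strip the
 * trailing newline, if any. Returns buf, or NULL on end-of-file or error.
 * read_line is a hypothetical helper name. */
char *read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);

    if (fgets(buf, (int)size, in) == NULL) {
        return NULL;
    }
    /* strcspn returns the index of the first '\n', or of the final '\0'
     * when the line has no newline, so this write is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

A call such as read_line(from, sizeof from, stdin) would then replace both the gets call and the manual newline handling.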
I would have used a more dynamic memory model. This more generic version concatenates two strings into a newly allocated string containing both. Free it when done :-)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *strconcat(char *string1, char *string2) {
int lenStr1=0,lenStr2=0;
char *tmpStr1=string1,*tmpStr2=string2,*returnStr;
while (*tmpStr1++)lenStr1++;
while (*tmpStr2++)lenStr2++;
if((returnStr=(char *)malloc(lenStr1+lenStr2+1))){
memcpy(returnStr,string1,lenStr1);
memcpy(&returnStr[lenStr1],string2,lenStr2);
returnStr[lenStr1+lenStr2]=0;
return returnStr;
} else {
return 0;
}
}
int main() {
char *string1="String 1 ",*string2="String 2 ",*result;
if((result=strconcat(string1, string2))) {
printf("-> %s \n",result);
free(result);
} else {
printf("Out of memory");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}

Taking the last word in the line with strtok

Given a file with the following line:
word1 word2 word3 word4
I tried to write the following code:
FILE* f = fopen("my_file.txt", "r");
char line[MAX_BUFFER + 1];
if (fgets(line, sizeof(line), f) == NULL) {
return NULL;
}
char* word = strtok(line, " ");
for (int i = 0; i < 4; i++) {
printf("%s ", word);
word = strtok(NULL, " ");
}
to print the words.
It works, but there is something I don't understand.
How does it get the last word, word4? (I don't understand it because there is no space after "word4".)
I'm not quite sure what you're asking. Are you asking how the program was able to correctly read word4 from the file even though it wasn't followed by a space? Or are you asking why, when the program printed word4 back out, it didn't seem to print a space after it?
The answer to the first question is that strtok is designed to give you tokens separated by delimiters, not terminated by delimiters. There is no requirement that the last token be followed by a delimiter.
To see the answer to the second question, it may be more clear if we adjust the program and its printout slightly:
char* word = strtok(line, " ");
for (int i = 0; word != NULL; i++) {
printf("%d: \"%s\"\n", i, word);
word = strtok(NULL, " ");
}
I have made two changes here:
The loop runs until word is NULL, that is, as long as strtok finds another word on the line. (This is to make sure we see all the words, and to make sure we're not trying to treat the fourth word specially in any way. If you were trying to treat the fourth word specially in some way, please say so.)
The words are printed back out surrounded by quotes, so that we can see exactly what they contain.
When I run the modified program, I see:
0: "word1"
1: "word2"
2: "word3"
3: "word4
"
That last line looks very strange at first, but the explanation is straightforward. You originally read the line using fgets, which does copy the terminating \n character into the line buffer. So it ends up staying tacked onto word4; that is, the fourth "word" is "word4\n".
For this reason, it's often a good idea to include \n in the set of whitespace delimiter characters you hand to strtok -- that is, you can call strtok(line, " \n") instead. If I do that (in both of the strtok calls), the output changes to
0: "word1"
1: "word2"
2: "word3"
3: "word4"
which may be closer to what you expected.
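Put together, the loop with " \n" as the delimiter set might look like the sketch below. The helper name collect_words and the fixed-size output array are assumptions made for the example; the tokens point into the original line buffer, which strtok modifies in place.

```c
#include <assert.h>
#include <string.h>

/* Split `line` in place on spaces and newlines, storing up to max_words
 * token pointers in words[]. Returns the number of tokens stored.
 * collect_words is a hypothetical helper name. */
size_t collect_words(char *line, char *words[], size_t max_words)
{
    assert(line != NULL && words != NULL);

    size_t n = 0;
    for (char *w = strtok(line, " \n");
         w != NULL && n < max_words;
         w = strtok(NULL, " \n")) {
        words[n++] = w;   /* points into line; no copy is made */
    }
    return n;
}
```

For the line "word1 word2 word3 word4\n" this stores four tokens, and the last one is "word4" without the newline.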
Your code doesn't check the return value of strtok(), which may be unsafe in some cases. Here is a split function that does:
/* Split string
#content origin string content
#delim delimiter for splitting
#psize pointer pointing at the variable to store token size
#return tokens after splitting
*/
const char **split(char *content, const char *delim, int *psize)
{
char *token;
const char **tokens;
int capacity;
int size = 0;
token = strtok(content, delim);
if (!token)
{
return NULL;
}
// Initialize tokens
tokens = malloc(sizeof(char *) * 64);
if (!tokens)
{
exit(-1);
}
capacity = 64;
tokens[size++] = token;
while ((token = strtok(NULL, delim)))
{
if (size >= capacity)
{
tokens = realloc(tokens, sizeof(char *) * capacity * 2);
if (!tokens)
{
exit(-1);
}
capacity *= 2;
}
tokens[size++] = token;
}
// if (size < capacity)
// {
// tokens = realloc(tokens, sizeof(char *) * size);
// if (!tokens)
// {
// exit(-1);
// }
// }
*psize = size;
return tokens;
}

how to divide words with strtok in an array of chars in c

I have a struct named excuse that holds a char array, and I need to store at least 20 excuses. Then I need to split each excuse into its individual words and store them in an array.
How can I do that?
#define excuseLength 256
typedef struct{
char sentence[excuseLength];
}excuse;
excuse listExcuses[20];
for (int listExcuses_i = 0; listExcuses_i < 20; listExcuses_i++)
{
char *input;
scanf("%s", input);
strcpy(listExcuses[listExcuses_i].sentence, input);
char* token = strtok(input, " ");
while(token != NULL){
printf("token: %s\n", token);
token = strtok(NULL, " ");
}
}
Here are some things you can add to your solution:
Check fgets() for return value, as it returns NULL on error.
If you decide to still use scanf(), make sure to use scanf("%255s", input) for char input[256]. The format specifier %255s, unlike the plain %s, limits how much input is stored. Overall, it is just better to read input using fgets().
Remove '\n' character appended by fgets(). This is also good for checking that you don't enter more characters than the limit of 256 in input, and that your sentences don't have a trailing newline after each of them. If you don't remove this newline, then your strtok() delimiter would have to be " \n" instead.
#define constants in your code, and use const char* for string literals, such as the delimiter for strtok().
You can also add some code to check for empty inputs from fgets(). You could simply use a separate counter, and only increment this counter for valid strings found.
It's also strange to have a struct with one member; usually structs contain more than one member. You could simply bypass the struct and use a 2D char array declared as char listexcuses[NUMEXCUSES][EXCUSELENGTH]. This array can hold up to 20 strings, each up to 255 characters long plus the terminating null.
Here is some modified code of your approach:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define EXCUSELENGTH 256
#define NUMEXCUSES 20
typedef struct {
char sentence[EXCUSELENGTH];
} excuse;
int main(void) {
excuse listexcuses[NUMEXCUSES];
char input[EXCUSELENGTH] = {'\0'};
char *word = NULL;
const char *delim = " ";
size_t slen, count = 0;
for (size_t i = 0; i < NUMEXCUSES; i++) {
printf("\nEnter excuse number %zu:\n", count+1);
if (fgets(input, EXCUSELENGTH, stdin) == NULL) {
fprintf(stderr, "Error from fgets(), cannot read line\n");
exit(EXIT_FAILURE);
}
slen = strlen(input);
if (slen > 0 && input[slen-1] == '\n') {
input[slen-1] = '\0';
} else {
fprintf(stderr, "Too many characters entered in excuse %zu\n", count+1);
exit(EXIT_FAILURE);
}
if (*input) {
strcpy(listexcuses[count].sentence, input);
count++;
printf("\nTokens found:\n");
word = strtok(input, delim);
while (word != NULL) {
printf("%s\n", word);
word = strtok(NULL, delim);
}
}
}
return 0;
}
As you need to eventually store these tokens somewhere, you will need another form of storing this data. Since you don't know how many tokens you can get, or how long each token is, you may need to use something like char **tokens. This is not an array, but it is a pointer to a pointer. Using this would allow any number of words and any lengths of each word to be stored. You will need dynamic memory allocation for this. The answer in this post will help.
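A minimal sketch of that char **tokens idea follows. The names tokenize and free_tokens are hypothetical, the growth policy (start at 4, double when full) is an arbitrary choice, and each word is copied into its own allocation so the result does not depend on the input buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free an array returned by tokenize(). */
void free_tokens(char **tokens, size_t count)
{
    if (tokens == NULL) return;
    for (size_t i = 0; i < count; i++) free(tokens[i]);
    free(tokens);
}

/* Split a private copy of `sentence` on spaces and newlines.
 * Returns a malloc'd array of malloc'd words and stores the word count in
 * *count. Returns NULL (with *count == 0) if there are no words or if an
 * allocation fails. tokenize is a hypothetical helper name. */
char **tokenize(const char *sentence, size_t *count)
{
    assert(sentence != NULL && count != NULL);
    *count = 0;

    size_t len = strlen(sentence);
    char *copy = malloc(len + 1);          /* strtok modifies its input */
    if (copy == NULL) return NULL;
    memcpy(copy, sentence, len + 1);

    char **tokens = NULL;
    size_t capacity = 0;
    for (char *w = strtok(copy, " \n"); w != NULL; w = strtok(NULL, " \n")) {
        if (*count == capacity) {          /* grow the pointer array */
            size_t new_cap = capacity ? capacity * 2 : 4;
            char **grown = realloc(tokens, new_cap * sizeof *grown);
            if (grown == NULL) goto fail;
            tokens = grown;
            capacity = new_cap;
        }
        size_t wlen = strlen(w);
        char *dup = malloc(wlen + 1);      /* own copy of this word */
        if (dup == NULL) goto fail;
        memcpy(dup, w, wlen + 1);
        tokens[(*count)++] = dup;
    }
    free(copy);
    return tokens;

fail:
    free_tokens(tokens, *count);
    free(copy);
    *count = 0;
    return NULL;
}
```

You would call tokenize(listexcuses[i].sentence, &n) for each stored excuse and release the result with free_tokens when you are done with it.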
I replaced the scanf with fgets and declared input as a char array (char input[256]), and with that it now works!
#define excuseLength 256
#define numberExcuses 20
typedef struct{
char sentence[excuseLength];
}excuse;
excuse listExcuses[20];
for (int listExcuses_i = 0; listExcuses_i < numberExcuses; listExcuses_i++)
{
char input[excuseLength];
fgets(input, excuseLength, stdin);
strcpy(listExcuses[listExcuses_i].sentence, input);
char* token = strtok(input, " ");
while(token != NULL){
printf("token: %s\n", token);
token = strtok(NULL, " ");
}
}

(C) strtok with multiple spaces/tabs, checking for null with pointer

I am trying to split a string into two tokens using strtok() that might have spaces and tabs mixed in the string.
So I made this:
struct strstr
{
char *str,
*one,
*two;
};
typedef struct strstr *STRSTR;
void split(STRSTR);
int main()
{
STRSTR str = malloc(sizeof(struct strstr));
str->str = malloc(256);
fgets(str->str, 256, stdin);
split(str);
printf("%s, %s\n", str->one, str->two);
free(str->str);
free(str);
return 0;
}
void split(STRSTR str)
{
int i;
char *temp = str->str;
while(isspace(*(str->str)))
str->str++;
str->one = strtok(str->str, " \t");
for(i = 0; i < strlen(str->one); i++)
{
if(!isspace(str->one[i]))
str->str++;
}
str->str++;
if(str->str != NULL)
{
puts("In null if");
str->two = strtok(str->str, "");
}
str->str = temp;
}
So for example if you input Hello Earth lingss, it will print out Hello, Earth lingss, which is perfect.
However, if I input Hello only, the split function goes inside the if(str->str != NULL) statement. How do I stop it from doing that with the code that I have?
EDIT: Also another problem, if someone doesn't mind checking it out. temp will only point to the first word in str->str. How can I make it point to the whole thing?
Add the statement str->str = strtok(str->str, " \t"); before the last if block in the split function, like this:
str->str = strtok(str->str," \t");
if(str->str != NULL)
{
puts("In null if");
str->two = strtok(str->str, "");
}
You split the string using "\t" as the delimiter, but you never updated str->str. Use the snippet above and it should work fine.
strtok is a funny function that both modifies the string you pass it, and stores information about it internally. You should pass your string to strtok once, then pass in NULL on subsequent calls. For instance, if your goal is to simply break a string up into tokens (which is obviously what strtok is for), then something like:
#define BUFFER_SIZE 256
int main(void) {
char *buffer = malloc(BUFFER_SIZE);
if (!buffer) {
return -1;
}
fgets(buffer, BUFFER_SIZE, stdin);
char *word;
char *ptr = buffer;
printf("Tokens: [");
while ((word = strtok(ptr, " \t\n"))) {
printf("%s, ", word);
ptr = NULL;
}
printf("]\n");
free(buffer);
}
will work. When I run the code like this:
./quick
when in the fun apple orange
I get the following result:
Tokens: [when, in, the, fun, apple, orange, ]
The important thing is that I only passed the buffer pointer to strtok on the first time through the loop. After that it is passed NULL.
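For the original goal of getting exactly two pieces (the first word and everything after it), strtok is not strictly needed. Here is a sketch using strspn and strcspn instead; the function name split_first_rest is hypothetical, and treating a missing second part as NULL is an assumption about what the caller wants.

```c
#include <assert.h>
#include <string.h>

/* Split s in place into its first word and the remainder of the line.
 * *first is NULL if s contains only whitespace; *rest is NULL if there is
 * nothing after the first word. A trailing newline is removed.
 * split_first_rest is a hypothetical helper name. */
void split_first_rest(char *s, char **first, char **rest)
{
    assert(s != NULL && first != NULL && rest != NULL);

    const char *ws = " \t";
    *first = NULL;
    *rest = NULL;

    s[strcspn(s, "\n")] = '\0';   /* drop the newline kept by fgets */
    s += strspn(s, ws);           /* skip leading spaces and tabs */
    if (*s == '\0') return;       /* nothing but whitespace */

    *first = s;
    s += strcspn(s, ws);          /* advance to the end of the first word */
    if (*s == '\0') return;       /* only one word on the line */

    *s++ = '\0';                  /* terminate the first word */
    s += strspn(s, ws);           /* skip the gap before the remainder */
    if (*s != '\0') *rest = s;
}
```

With the input Hello Earth lingss this gives "Hello" and "Earth lingss"; with Hello alone, the second pointer stays NULL, so the caller can test it directly instead of comparing an advanced pointer against NULL.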

Search for a string in a text file and parse that line (Linux, C)

This is a "how to parse a config file" question.
Basically, I have a text file (/etc/myconfig) that holds all kinds of settings. I need to read that file and search for the string:
wants_return=yes
Once I locate that string, I need to parse it and return only whatever comes after the equals sign.
I've tried using a combination of fgets and strtok, but I'm getting confused.
Does anyone know a function that can do this?
Code is appreciated.
Thanks
This works. (Note: fgets keeps the newline character in the returned string when the whole line fits in the buffer, which is why the code checks for it and strips it.)
#include <stdio.h>
#include <string.h>
const unsigned MAXLINE=9999;
char const* FCFG="/etc/myconfig";
char const* findkey="wants_return=";
char * skip_ws(char *line)
{
return line+strspn(line," \t");
}
char * findval(char *line,char const* prefix,int prelen)
{
char *p;
p=skip_ws(line);
if (strncmp(p,prefix,prelen)==0)
return p+prelen;
else
return NULL;
}
char *findval_slow(char *line,char const* prefix)
{
return findval(line,prefix,strlen(prefix));
}
int main() {
FILE *fcfg;
char line[MAXLINE];
char *p,*pend;
int findlen;
findlen=strlen(findkey);
fcfg=fopen(FCFG,"r");
if (fcfg==NULL) { perror(FCFG); return 1; }
while (p=fgets(line,MAXLINE,fcfg)) {
printf("Looking at %s\n",p);
if (p=findval(line,findkey,findlen)) {
pend=p+strlen(p)-1; /* check last char for newline terminator */
if (*pend=='\n') *pend=0;
printf("Found %s\n",p); /* process/parse the value */
}
}
return 0;
}
Here's a quick example using strtok:
const int linelen = 256;
char line[linelen];
FILE* fp = fopen(argv[1], "r");
if (fp == NULL) {
perror("Error opening file");
} else {
while (fgets(line, linelen, fp)) {
const char* name = strtok(line, "= \r\n");
const char* value = strtok(NULL, "= \r\n");
if (name && value) {
printf("%s => %s\n", name, value);
}
}
fclose (fp);
}
Note, you'll need to put some additional error checking around it, but this works to parse the files I threw at it.
From your comment, it looks like you're already getting the appropriate line from the text file using fgets and loading it into a character buffer. You can use strtok to parse the tokens from the line.
If you run it with the string buffer as the first argument, it will return the first token from that string. If you run the same command with the first argument set to NULL it will return subsequent tokens from the same original string.
A quick example of how to retrieve multiple tokens:
#include <stdio.h>
#include <string.h>
int main() {
char buffer[17]="wants_return=yes";
char* tok;
tok = strtok(buffer, "=");
printf("%s\n", tok); /* tok points to "wants_return" */
tok = strtok(NULL, "=");
printf("%s\n", tok); /* tok points to "yes" */
return 0;
}
For the second strtok call, you can replace the "=" with "" to return everything to the end of the string, instead of breaking off at the next equal sign.
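Combining the pieces from the answers above, a lookup for one key could be sketched as follows. The function name config_lookup, the 256-byte line limit, and the choice to report an over-long value as "not found" are all assumptions made for the example, and it expects the key=value form with no spaces around the equals sign, as in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan `in` for a line of the form key=value and copy the value into out
 * (capacity out_size). Returns 1 if the key was found and the value fit,
 * 0 otherwise. config_lookup is a hypothetical helper name. */
int config_lookup(FILE *in, const char *key, char *out, size_t out_size)
{
    assert(in != NULL && key != NULL && out != NULL && out_size > 0);

    char line[256];                               /* assumed maximum line length */
    size_t key_len = strlen(key);

    while (fgets(line, sizeof line, in) != NULL) {
        char *p = line + strspn(line, " \t");     /* skip leading blanks */
        if (strncmp(p, key, key_len) != 0 || p[key_len] != '=') {
            continue;                             /* some other setting */
        }
        p += key_len + 1;                         /* step past "key=" */
        p[strcspn(p, "\r\n")] = '\0';             /* strip the line ending */
        if (strlen(p) >= out_size) {
            return 0;                             /* value would not fit */
        }
        strcpy(out, p);                           /* safe: length checked above */
        return 1;
    }
    return 0;
}
```

Typical use would be to fopen /etc/myconfig, call config_lookup(fp, "wants_return", value, sizeof value), and then compare value against "yes".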
With a POSIX shell, I'd use something like:
answer=`grep -E 'wants_return[ ]*=' /etc/myconfig | sed 's/^.*=[ ]*//'`
Of course, if you're looking for an answer that uses the C STDIO library, then you really need to review the STDIO documentation.

Resources