Cannot concatenate strtok's output variable. strcat and strtok - c

I’ve spent hours on this program and have put several hours online searching for alternatives to my methods and have been plagued with crashes and errors all evening…
I have a few things I'd like to achieve with this code. First I’ll explain my problems, then I’ll post the code and finally I’ll explain my need for the program.
The program outputs just the single words and the concatenate function does nothing. This seems like it should be simple enough to fix...
My first problem is that I cannot get the concatenate function to work. I used the standard strcat function, which didn't work, and neither did another version I found on the internet (that function is used here; it is called "mystrcat"). I want the program to read in a string and remove the "delimiters" to create a single string made up of every word in the original string. I am trying to do this with strtok and a strcat function. If there is an easier or simpler way, PLEASE tell me, I’m all ears.
Another problem, which isn’t necessarily a problem but an ugly mess: the seven lines following main. I’d prefer to initialize my variables as follows: char variable[amt]; but the code I found for strtok was using a pointer and the code for the strcat function was using pointers. A better understanding of pointers && addresses for strings would probably help me out long term. However I would like to get rid of some of those lines by any means necessary. I can’t have 6 lines dedicated to only 2 variables. When I have 10 variables I do not want 30 lines up top…
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *mystrcat(char *output, char *firstptr);
int main() {
char str[] = "now # is the time for all # good men to come to the # aid of their country";
char delims[] = "# ";
char resultOrig[70]; //was [20];
char firstOrig[20];
//char *result = NULL, *first = NULL;
char result = resultOrig; //was result = &resultOrig;
char first = firstOrig; //was first = &firstOrig;
first = strtok( str, delims );
while( first != NULL ) {
mystrcat(resultOrig, firstOrig);
printf( "%s ", first );
printf("\n %s this should be the concat\'d string so far\n", resultOrig);
first = strtok( NULL, delims );
}
system("pause");
return 0;
}
char *mystrcat(char *resultptr, char *firstptr)
{
char *output = resultptr;
while (*output != '\0')
output++;
while(*firstptr != '\0')
{
*output = *firstptr;
output++;
firstptr++;
}
*output = '\0';
return output;
}
This is just a test program right now, but I intend to use it for a list/database of files. My file names contain underscores, hyphens, periods, parentheses, and numbers, all of which I would like to set as the “delimiters”. I was planning on going through a loop, deleting one delimiter per pass (changing from _ to - to . etc.) to create a single string; I may want to replace the delimiters with a space or a period instead. Some files already have spaces in them along with the special characters I’d like to “delimit”.
I’m planning to do all this by means of scanning a text file. Within the file I also have a size in this format: “2,518,6452”. I’m hoping I can sort my database alphabetically or by size, ascending or descending. That’s just some additional information which may be useful to know for my specific questions above.
Below I have included some fictional samples of how these names could appear.
my_file(2009).ext
second.File-group1.extls
the.third.file-vol30.lmth
I am focusing this post on: the question on how to get the concatenate function working or an alternative to strcat and/or strtok. As well as asking for help to unclutter unnecessary or redundant code.
I appreciate all the help and even all those who read through my post.
Thank you so much!

strcat would work if you passed first instead of firstOrig in your loop and initialized the result buffer to an empty string. There is no need for mystrcat. The program can be simplified to:
#include <stdio.h>
#include <string.h>
int main() {
char str[] = "now # is the time for all # good men to come to the # aid of their country";
char delims[] = "# ";
char result[100] = ""; /* Original size was too small */
char* token;
token = strtok(str, delims);
while(token != NULL) {
printf("token = '%s'\n", token);
strcat(result, token);
token = strtok(NULL, delims);
}
printf("%s\n", result);
return 0;
}
Output:
token = 'now'
token = 'is'
token = 'the'
token = 'time'
token = 'for'
token = 'all'
token = 'good'
token = 'men'
token = 'to'
token = 'come'
token = 'to'
token = 'the'
token = 'aid'
token = 'of'
token = 'their'
token = 'country'
nowisthetimeforallgoodmentocometotheaidoftheircountry

There are several problems here:
Missing initializations for resultOrig and firstOrig (as codaddict pointed out).
first = &firstOrig doesn't do what you want. You later do first = strtok(str, delims), which sets first to point somewhere inside str. It doesn't read data into firstOrig.
You allocate small buffers (just 20 bytes) and try to fill them with much more than that. This overflows the buffer on the stack, causing nasty bugs.

You've not initialized the following two strings:
char resultOrig[20];
char firstOrig[20];
and you are appending characters to them. Change them to:
char resultOrig[20] = "";
char firstOrig[20] = "";
Also the name of the character array gives its starting address. So
result = &resultOrig;
first = &firstOrig;
should be:
result = resultOrig;
first = firstOrig;
Change
mystrcat(resultOrig, firstOrig);
to
mystrcat(resultOrig, first);
Also make resultOrig large enough to hold the concatenations, like:
char resultOrig[100] = "";
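The fixes above still rely on guessing a buffer size that is large enough. As a minimal sketch of the same strtok loop with a bounds check, the hypothetical helper below (join_tokens and its parameters are illustrative names, not from the original posts) refuses to write past the end of the output buffer and can optionally put a separator such as a space between the words:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: joins the tokens of `input` into `out`, optionally
   separated by `sep` (pass '\0' for no separator). `input` is modified by
   strtok. Returns 0 on success, -1 if `out` is too small. */
int join_tokens(char *input, const char *delims, char sep,
                char *out, size_t outsz)
{
    size_t used = 0;   /* characters written so far, excluding the terminator */
    char *token;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (token = strtok(input, delims); token != NULL;
         token = strtok(NULL, delims)) {
        size_t len = strlen(token);
        size_t need = len + ((sep != '\0' && used > 0) ? 1 : 0);

        if (need >= outsz - used)   /* keep one byte for the terminator */
            return -1;
        if (sep != '\0' && used > 0)
            out[used++] = sep;
        memcpy(out + used, token, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

Called with the delimiters "_()." and a space as the separator, this would turn a name like my_file(2009).ext into "my file 2009 ext", which is close to what the question describes wanting for the file list.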

Related

Splitting a sentence using STRTOK in C

I'm having a hard time splitting a sentence read from a file in C via the strtok function. I read it from a file and stored it in a variable info, from which I need to separate the words. I tried many things and eventually copied some code from the net and changed it a little. The code separates the first token nicely, but then it prints nonsense.
#include <stdio.h>
#include <string.h>
void main()
{
//int i; //brojac
char info[]=""; // sve informacije, kasnije treba da bude u strukturi
FILE *pok;
pok=fopen("C:/Users/Trajkovici/Desktop/OsobeFajl.txt","r");
if(pok==NULL)
{
printf("Greška prilikom otvaranja datoteke!");
}
fscanf(pok,"%[^\n]",&info);
puts("INFO: ");
puts(info);
//fclose(pok);
char * token = strtok(info, " ");
// loop through the string to extract all other tokens
while( token != NULL )
{
puts("\nTOKEN:");
printf( " %s\n", token ); //printing each token
token = strtok(NULL, " ");
}
}
This is the file and the result:
[image: the result]
[image: the file]
BTW, I wrote the same code, without extracting a sentence from a file, but instead declaring it manually. It works perfectly fine.
#include<stdio.h>
#include <string.h>
int main() {
char string[] = "Sladjan Jankovic 46 Vranje";
// Extract the first token
puts(string);
char * token = strtok(string, " ");
// loop through the string to extract all other tokens
while( token != NULL )
{
printf( " %s\n", token ); //printing each token
token = strtok(NULL, " ");
}
return 0;
}
And this is the result of the above code:
[image: the result]
So, the problem is that I have two programs with literally the same variables, but one of them splits into tokens fine, while the other one doesn't. Any help with the first one?
P.S. Sorry for possible bad indentation, this is my first time posting on Stack Overflow. Also, some comments and lines from the file are in Serbian.
char info[]="";
will allocate only one element. Using it in
fscanf(pok,"%[^\n]",&info);
is dangerous because it will write out of bounds when a string with positive length is read (even a one-character string is too long, because there must be room for the terminating null character).
Allocate enough elements like (for example):
char info[102400]="";
and specify the maximum length to read (the limit has to be at most the size of the buffer minus one, for the terminating null character) to prevent a buffer overrun, like this:
fscanf(pok,"%102399[^\n]",info);
Also note that you should remove & before info. Arrays in expressions (except for some exceptions) are automatically converted to pointers for their first elements. Adding & will have it pass a pointer to an array while %[ expects a pointer to a character. Passing data having wrong type to fscanf() invokes undefined behavior.
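As an alternative to a width-limited fscanf, fgets takes the buffer size directly, so the limit cannot drift out of sync with the array declaration. The following is only a sketch; read_line is a hypothetical helper name, not something from the original code:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (at most size - 1 characters),
   strips the trailing newline if one was read, and returns buf.
   Returns NULL at end of file or on a read error. */
char *read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the newline, if any */
    return buf;
}
```

With char info[256]; a call such as read_line(pok, info, sizeof info) would fill info safely, after which the strtok loop from the question can run on it unchanged. Checking the fopen result and returning early when it is NULL is still needed before reading.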

strtok() C-Strings to Array

I'm currently learning C and having some trouble passing C-string tokens into an array. Lines come in on standard input, strtok is used to split each line up, and I want to put each token into an array properly. An EOF check is required for exiting the input stream. Here's what I have, set up so that it prints the tokens back to me (these tokens will be converted to ASCII in a different code segment; I'm just trying to get this part to work first).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char string[1024]; //Initialize a char array of 1024 (input limit)
char *token;
char *token_arr[1024]; //array to store tokens.
char *out; //used
int count = 0;
while(fgets(string, 1023, stdin) != NULL) //Read lines from standard input until EOF is detected.
{
if (count == 0)
token = strtok(string, " \n"); //If first loop, Get the first token of current input
while (token != NULL) //read tokens into the array and increment the counter until all tokens are stored
{
token_arr[count] = token;
count++;
token = strtok(NULL, " \n");
}
}
for (int i = 0; i < count; i++)
printf("%s\n", token_arr[i]);
return 0;
}
This seems like proper logic to me, but then I'm still learning. The issue seems to be with streaming in multiple lines before sending the EOF signal with Ctrl-D.
For example, given an input of:
this line will be fine
the program returns:
this
line
will
be
fine
But if given:
none of this
is going to work
It returns:
is going to work
ing to work
to work
Any help is greatly appreciated. I'll keep working at it in the meantime.
There are a couple of issues here:
You never call token = strtok(string, " \n"); again once the string is "reset" to a new value, so strtok() still thinks it is tokenizing your original string.
strtok is returning pointers to "substrings" inside string. You are changing the contents of what is in string and so your second line effectively corrupts your first (since the original contents of string are overwritten).
To do what you want you need to either read each line into a different buffer or duplicate the strings returned by strtok (strdup() is one way - just remember to free() each copy...)
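Both points can be sketched together: restart strtok for every line, and store a private copy of each token so that reusing the line buffer does not corrupt earlier entries. The helper name store_tokens is hypothetical, and malloc plus memcpy is used here in place of strdup() only to stay within standard C:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `line` on spaces and newlines and appends a
   heap-allocated copy of each token to arr, starting at index `count`.
   Returns the new count. The caller must free() each stored copy. */
size_t store_tokens(char *line, char **arr, size_t count, size_t max)
{
    char *token = strtok(line, " \n");   /* restart strtok for every new line */

    while (token != NULL && count < max) {
        size_t len = strlen(token) + 1;  /* include the terminator */
        char *copy = malloc(len);

        if (copy == NULL)
            break;                       /* out of memory: stop storing */
        memcpy(copy, token, len);
        arr[count++] = copy;
        token = strtok(NULL, " \n");
    }
    return count;
}
```

In the question's program, the body of the fgets loop would become count = store_tokens(string, token_arr, count, 1024); and the final printing loop would free each token_arr[i] after printing it.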

Problem with strtok and segmentation fault

I have two helper functions to break up strings in the format of decimal prices, i.e. "23.00", "2.30"
Consider this:
char price[4] = "2.20";
unsigned getDollars(char *price)
{
return atoi(strtok(price, "."));
}
unsigned getCents(char *price)
{
strtok(price, ".");
return atoi(strtok(NULL, "."));
}
Now when I run the below I get a segmentation fault:
printf("%u\n", getDollars(string));
printf("%u\n", getCents(string));
However, when I run them separately without one following the other, they work fine. What am I missing here? Do I have to do some sort of resetting of strtok?
My solution:
With the knowledge about strtok I gained from the answer I chose below, I changed the implementation of the helper functions so that they copy the passed in string first, thus shielding the original string and preventing this problem:
#define MAX_PRICE_LEN 6 /* Assumes no price goes over 99.99 (5 characters plus the terminator) */
unsigned getDollars(char *price)
{
/* Copy the string to prevent strtok from changing the original */
char copy[MAX_PRICE_LEN];
char *tok;
/* Create a copy of the original string */
strcpy(copy, price);
tok = strtok(copy, ".");
/* Return 0 if format was wrong */
if(tok == NULL) return 0;
else return atoi(tok);
}
unsigned getCents(char *price)
{
char copy[MAX_PRICE_LEN];
char *tok;
strcpy(copy, price);
/* Skip the first part of the price */
if(strtok(copy, ".") == NULL) return 0;
tok = strtok(NULL, ".");
/* Return 0 if format was wrong */
if(tok == NULL) return 0;
else return atoi(tok);
}
Because strtok() modifies the input string, you run into problems when it fails to find the delimiter in the getCents() function after you call getDollars().
Note that strtok() returns a null pointer when it fails to find the delimiter. Your code does not check that strtok() found what it was looking for - which is always risky.
Your update to the question demonstrates that you have learned about at least some of the perils (evils?) of strtok(). However, I would suggest that a better solution would use just strchr().
First, we can observe that atoi() will stop converting at the '.' anyway, so we can simplify
getDollars() to:
unsigned getDollars(const char *price)
{
return(atoi(price));
}
We can use strchr() - which does not modify the string - to find the '.' and then process the text after it:
unsigned getCents(const char *price)
{
const char *dot = strchr(price, '.');
return((dot == 0) ? 0 : atoi(dot+1));
}
Quite a lot simpler, I think.
One more gotcha: suppose the string is 26.6; you are going to have to work harder than the revised getCents() just above does to get that to return 60 instead of 6. Also, given 26.650, it will return 650, not 65.
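One way to handle that gotcha is to read at most two digits after the '.' by hand and scale a single digit up to tens of cents. This is a sketch under the assumption that extra digits should simply be ignored (truncated, not rounded); get_cents is a hypothetical name chosen to avoid clashing with getCents() above:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns the cents of a price such as "26.6" (60),
   "26.65" (65) or "26.650" (65). Digits past the second are ignored
   (truncated, not rounded). Returns 0 if there is no '.'. */
unsigned get_cents(const char *price)
{
    const char *dot = strchr(price, '.');
    unsigned cents = 0;
    int digits = 0;

    if (dot == NULL)
        return 0;
    for (const char *p = dot + 1;
         digits < 2 && isdigit((unsigned char)*p); p++) {
        cents = cents * 10 + (unsigned)(*p - '0');
        digits++;
    }
    if (digits == 1)
        cents *= 10;    /* "26.6" means 60 cents, not 6 */
    return cents;
}
```

Like the strchr() version, this never writes to the string, so it can be called on a string literal or after getDollars() without any copying.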
This:
char price[4] = "2.20";
leaves out the nul terminator on price. I think you want this:
char price[5] = "2.20";
or better:
char price[] = "2.20";
So, you will run off the end of the buffer the second time you try to get a token out of price. You're just getting lucky that getCents() doesn't segfault every time you run it.
And you should almost always make a copy of a string before using strtok on it (to avoid the problem that Jonathan Leffler pointed out).

Remove the first part of a C String

I'm having a lot of trouble figuring this out. I have a C string, and I want to remove the first part of it. Let's say it's: "Food,Amount,Calories". I want to copy out each one of those values, but not the commas. I find the comma, and return the position of the comma to my method. Then I use
strncpy(aLine.field[i], theLine, end);
To copy "theLine" to my array at position "i", with only the first "end" characters (for the first time, "end" would be 4, because that is where the first comma is). But then, because it's in a Loop, I want to remove "Food," from the array, and do the process over again. However, I cannot see how I can remove the first part (or move the array pointer forward?) and keep the rest of it. Any help would be useful!
What you need is to chop off strings with comma as your delimiter.
You need strtok to do this. Here's an example code for you:
#include <stdio.h>
#include <string.h>
int main (int argc, const char * argv[]) {
char *s = "asdf,1234,qwer";
char str[15];
strcpy(str, s);
printf("\nstr: %s", str);
char *tok = strtok(str, ",");
printf("\ntok: %s", tok);
tok = strtok(NULL, ",");
printf("\ntok: %s", tok);
tok = strtok(NULL, ",");
printf("\ntok: %s", tok);
return 0;
}
This will give you the following output:
str: asdf,1234,qwer
tok: asdf
tok: 1234
tok: qwer
If you have to keep the original string, use strtok on a copy of it, because strtok modifies the string it is given. If not, you can replace each separator with '\0' and use the resulting strings directly:
char s_RO[] = "abc,123,xxxx", *s = s_RO;
while (s){
char* old_str = s;
s = strchr(s, ',');
if (s){
*s = '\0';
s++;
};
printf("found string %s\n", old_str);
};
The function you might want to use is strtok()
Here is a nice example - http://www.cplusplus.com/reference/clibrary/cstring/strtok/
Personally, I would use strtok().
I would not recommend removing extracted tokens from the string. Removing part of a string requires copying the remaining characters, which is not very efficient.
Instead, you should keep track of your positions and just copy the sections you want to the new string.
But, again, I would use strtok().
if you know where the comma is, you can just keep reading the string from that point on.
for example
void readTheString(const char *theLine)
{
const char *wordStart = theLine;
const char *wordEnd = theLine;
int i = 0;
while (*wordStart) // while we haven't reached the null termination character
{
while (*wordEnd != ',')
wordEnd++;
// ... copy the substring ranging from wordStart to wordEnd
wordStart = ++wordEnd; // start the next word
}
}
or something like that.
the null termination check is probably wrong, unless the string also ends with a ','... but you get the idea.
anyway, using strtok would probably be a better idea.
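For completeness, here is a sketch of the "keep track of your positions" idea with the end-of-string case handled. It uses strchr() to find each comma and copies only the requested field, leaving the source line untouched; copy_field and its parameters are hypothetical names for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies comma-separated field number `index`
   (0-based) of `line` into out. Returns 0 on success, -1 if the field
   does not exist or does not fit. `line` is not modified. */
int copy_field(const char *line, size_t index, char *out, size_t outsz)
{
    const char *start = line;
    const char *end;
    size_t len;

    while (index > 0) {
        start = strchr(start, ',');
        if (start == NULL)
            return -1;      /* fewer fields than requested */
        start++;            /* step past the comma */
        index--;
    }
    end = strchr(start, ',');
    len = (end != NULL) ? (size_t)(end - start) : strlen(start);
    if (len >= outsz)
        return -1;          /* no room for the field plus its terminator */
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

One difference from strtok worth knowing: an empty field, as in "a,,c", is returned here as an empty string, whereas strtok skips over consecutive delimiters and never reports it.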

Tips on how to read last 'word' in a character array in C

Just looking to be pointed in the right direction:
I have standard input coming into a C program; I'm taking in one line at a time and storing it in a char[].
Now that I have the char[], how do I take the last word (just assuming separated by a space) and then convert to lowercase?
I've tried this but it just hangs the program:
while (sscanf(line, "%s", word) == 1)
printf("%s\n", word);
I took what was suggested and came up with this. Is there a more efficient way of doing this?
char* last = strrchr(line, ' ')+1;
while (*last != '\0'){
*last = tolower(*last);
putchar((int)*last);
last++;
}
If I had to do this, I'd probably start with strrchr. That should get you the beginning of the last word. From there it's a simple matter of walking through characters and converting to lower case. Oh, there is the minor detail that you'd have to delete any trailing space characters first.
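A sketch of that approach which also copes with trailing spaces and with lines that contain no space at all: instead of calling strrchr, it walks backwards from the end by hand, first over any trailing spaces and then over the word itself. The name last_word_lower is hypothetical, and only the space character is treated as a separator here:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the last space-separated word of `line`
   into out in lowercase. Returns 0 on success, -1 if there is no word
   or it does not fit. `line` is not modified. */
int last_word_lower(const char *line, char *out, size_t outsz)
{
    const char *end = line + strlen(line);
    const char *start;
    size_t len, i;

    while (end > line && end[-1] == ' ')
        end--;                          /* skip trailing spaces */
    start = end;
    while (start > line && start[-1] != ' ')
        start--;                        /* back up to the start of the word */
    len = (size_t)(end - start);
    if (len == 0 || len >= outsz)
        return -1;
    for (i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)start[i]);
    out[len] = '\0';
    return 0;
}
```

If the line came from fgets, the trailing newline would need to be stripped first, since it would otherwise count as part of the last word.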
The issue with your code is that it will repeatedly read the first word of the sentence into word. It will not move to the next word each time you call it. So if you have this as your code:
char * line = "this is a line of text";
Then every single time sscanf is called, it will load "this" into word. And since it read 1 word each time, sscanf will always return 1.
This will help:
char dest[10], source [] = "blah blah blah!" ;
int sum = 0 , index =0 ;
while(sscanf(source+(sum+=index),"%s%n",dest,&index)!=-1);
printf("%s\n",dest) ;
'strtok' will split the input string based on certain delimiters; in your case the delimiter would be a space. Each call returns the next "word", so you would loop until it returns NULL and simply keep the last one.
http://www.cplusplus.com/reference/clibrary/cstring/strtok/
One could illustrate many different methods of performing this operation and then determine which has the best performance and usability characteristics, or weigh the advantages and disadvantages of each. I simply wanted to illustrate what I mentioned above with a code snippet.
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <conio.h>
int main()
{
char line[] = "This is a sentence with a last WoRd ";
char *lastWord = NULL;
char *token = strtok(line, " ");
while (token != NULL)
{
lastWord = token;
token = strtok(NULL, " ");
}
while (*lastWord)
{
printf("%c", tolower(*lastWord++));
}
_getch();
}
