How to determine the last word of each string? - C

I have this assignment:
Enter a sequence of sentences from the keyboard into the string array (the end of entering - empty string). Determine the last word of each of these sentences.
The problem is that my program outputs the last word of the last sentence, and I need the last word of each sentence to be output.
Program I have tried:
#include <stdio.h>
#include <string.h>
int main() {
char str[10][100];
int i;
printf("Enter a sequence of sentences:\n");
for (i = 0; i < 10; i++) {
if (*gets(str) == '\0')
break;
}
printf("The last word of each of these sentences is:\n");
for (i = 0; i < 10; i++) {
char *word;
word = strtok(str[i], ".");
while (word != NULL) {
char *last_word = word;
word = strtok(NULL, ".");
}
printf("%s\n", last_word);
}
return 0;
}

The delimiter in this call
word = strtok(str[i], ".");
does not make sense.
It seems you mean
word = strtok(str[i], " \t.");
provided that a sentence can be ended only with a dot and words are separated by spaces or tab characters.
Another problem is that the variable last_word must be declared before the while loop.
For example
char *last_word = NULL;
char *word;
word = strtok(str[i], " \t.");
while (word != NULL) {
last_word = word;
word = strtok(NULL, " \t.");
}
It is also better to use a for loop instead of the while loop, because it limits the scope of word to the loop:
char *last_word = NULL;
for ( char *word = strtok(str[i], " \t." );
word != NULL;
word = strtok(NULL, " \t.") )
{
last_word = word;
}
Note that the function gets is unsafe and was removed from the C Standard in C11. Use the standard C function fgets instead.
And the condition in the second for loop
for(i=0; i<10; i++)
{
char *word;
//...
is incorrect because the user can enter fewer than 10 sentences, in which case the remaining rows of str are uninitialized.

Without repeating the commentary of the accepted answer provided by @Vlad (kudos!), here is an alternative offering (with comments):
#include <stdio.h>
#include <string.h>
int main( void ) {
// A single large buffer allowing very long lines to be entered.
char buf[ 10 * 100 ], *p = buf;
size_t left = sizeof buf;
int i = 0;
// up to 10 'lines' of input, breaking on an empty line, too
while( i++ < 10 && fgets( p, left, stdin ) && p[0] != '\n' ) {
// typical invocation of strtok() to isolate "words"
// and a "wasteful" copy of each word to the current start of the buffer
for( char *tkn = p; ( tkn = strtok( tkn, " .\n" ) ) != NULL; tkn = NULL )
strcpy( p, tkn );
// having copied the last "word", append '\n' and advance the pointer
size_t len = strlen( p );
p += len;
strcpy( p++, "\n" );
left -= len + 1; // eroding the available size of the buffer
}
printf( "%s", buf ); // a single output of "word1\nword2\nword3\n..."
return 0;
}
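NB: strcpy() of overlapping buffers is fraught with hazards; strictly speaking, the C Standard leaves the behaviour undefined when source and destination overlap. It happens to work in this case, but the practice and its effects must be very well considered before using this technique.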

Related

Using strtok for word generator

I'm trying to create a word generator in C and I get a segmentation fault.
gdb output :
_GI___strtok_r (
s=0x562d88201188 "some text without comma",
delim=0x562d8820117f " ", save_ptr=0x7f570a47aa68 <olds>) at strtok_r.c:72
code with strtok function :
char **words = malloc(sizeof(char *) * NUM_WORDS);
int num_words = 0;
char *save_ptr;
char *word = strtok(text, " ");
while (word != NULL) {
// Strip leading and trailing whitespace
while (isspace(*word)) {
word++;
}
int len = strlen(word);
while (len > 0 && isspace(word[len - 1])) {
len--;
}
// Allocate memory for the word and copy it using strdup()
words[num_words] = strdup(word);
// Move to the next word
num_words++;
word = strtok(NULL, " ");
}
How can I use this function with an indeterminate number of words in the text?
Can't believe someone finally asked for this!
You may want to add verification that realloc() hasn't returned a NULL.
In brief, the string is chopped on the delimiters provided to strtok() while realloc() is used to grow an array of pointers to each of those segments.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char buf[] = "Once upon a time there lived a beautiful princess.", *p = buf;
char **t = NULL; size_t sz = sizeof *t;
int n = 0;
while(!!(t=realloc(t,(n+1)*sz))&&!!(t[n]=strtok(p," .\n"))) p=NULL, n++;
for( int i = 0; i < n; i++ )
puts( t[i] );
free( t );
return 0;
}
Once
upon
a
time
there
lived
a
beautiful
princess
EDIT
Then there is the extension that can handle multiple input lines:
int main() {
char buf[][32] = { "Once upon a time\n", "there lived\n", " a beautiful princess.\n" }; // writable arrays: strtok() must not be given string literals
char **t = NULL; size_t sz = sizeof *t;
int n = 0;
for( int ln = 0; ln < sizeof buf/sizeof buf[0]; ln++ ) {
char *p = buf[ln];
while(!!(t=realloc(t,(n+1)*sz))&&!!(t[n]=strtok(p," .\n"))) p=NULL, n++;
}
for( int i = 0; i < n; i++ )
puts( t[i] );
free( t );
return 0;
}
/* Output same as shown above */
Put the strtok() as the parameter to strdup() and you've got yourself something that will preserve words while using a single line input buffer.

String parsing in C... strtok() doesn't quite get it done

OK... first question so please forgive me if it isn't quite understandable the first go.
I am attempting to parse a string input to stdin through a couple of different conditions.
Example input string: move this into "tokens that I need" \n
I would like to parse this into tokens as:
Token 1 = move
Token 2 = this
Token 3 = into
Token 4 = tokens that I need
Where the tokens are by whitespace (easy enough) until a quote is encountered, then everything inside of the open and close quotes is treated as a single token.
I've tried several different methods, but I unfortunately feel that I may be in over my head here so any help would be greatly appreciated.
My latest attempt:
fgets(input, BUFLEN, stdin); //gets the input
input[strlen(input)-1] = '\0';//removes the new line
printf("Input string = %s\n",input);//Just prints it out for me to see
char *token = strtok(input,delim);//Tokenizes the input, which unfortunately does not do what I need. delim is just my string of delimiters which currently only has a " " in it.
I tried to scan through the string one character at a time and then place those characters into arrays so that I could have them as I wanted, but that failed miserably.
The solution below uses a customized tokenizer, my_strtok_r. Its advantage over plain strtok is that it is re-entrant:
it keeps its position in the caller-supplied save_ptr rather than in hidden static state, so it can be called from multiple threads simultaneously, in nested loops, and so on.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * my_strtok_r(char *s, const char *delim1, const char *delim2, char **save_ptr)
{
char *end;
size_t s1;
size_t s2;
int delim2found = 0;
if (s == NULL)
s = *save_ptr;
if (*s == '\0'){
*save_ptr = s;
return NULL;
}
s1 = strspn (s, delim1);
s2 = strspn (s, delim2);
if(s2 > s1){
s += s2;
delim2found = 1;
}
else{
s += s1;
}
if (*s == '\0'){
*save_ptr = s;
return NULL;
}
/* Find the end of the token. */
if(delim2found)
end = s + strcspn (s, delim2);
else
end = s + strcspn (s, delim1);
if (*end == '\0') {
*save_ptr = end;
return s;
}
/* Terminate the token and make *save_ptr point past it. */
*end = '\0';
*save_ptr = end + 1;
return s;
}
int main (void)
{
char str[] = " 123 abc \"SPLITTING WORKS\" yes! \"GREAT WE HAVE A SOLUTION\" ! ";
char *d1 = " ";
char *d2 = "\"";
char *token;
char *rest = str;
char array[20][80];
printf ("Splitting string \"%s\" into tokens:\n",str);
size_t nr_of_tokens = 0;
while ((token = my_strtok_r(rest, d1, d2, &rest)))
{
strcpy (array[nr_of_tokens], token);
nr_of_tokens++;
}
for(int i=0; i < nr_of_tokens; i++)
printf ("%s\n",array[i]);
return 0;
}
Test:
Splitting string " 123 abc "SPLITTING WORKS" yes! "GREAT WE HAVE A SOLUTION" ! " into tokens:
123
abc
SPLITTING WORKS
yes!
GREAT WE HAVE A SOLUTION
!
This is another solution (fully tested) which you can use. You can mix any number of tokens delimited by white spaces and '\"'. It can be configured to your needs. Extensive explanations are given in the code itself.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include <ctype.h>
char *get_str_segment(char *output_str, char *input_str, char extDel)
{
/*
Purpose :
To copy to output first segment.
To extract the segment two types of delimiters are used:
1. white space delimiter
2. 'extDel' -
do not put here white space or '\0'!
(typically '"' = quote!)
'extDel' allows us to put white spaces inside the segment.
Notice that 'extDel' cannot be embedded inside the segment!
It makes 'extDel' special character which will not be encountered
in the 'output_string'! First appearance of 'extDel' starts new
segment!
Notice that unbalanced 'extDel' will cause copying of the whole string to
destination from that point!
Return:
Pointer to the first character after the segment
or NULL !!!
we will not allow **empty** segments with unbalanced 'extDel'
if ('extDel' is unbalanced) it has to have at least one character!
It can be white space!
Notice!
"get_str_segment()" on strings filled with white spaces
and empty strings will return *** NULL *** to indicate that
no conclusive segment has been found!
Example:
input_str = " qwerty"123 45" "samuel" G7 "
output_str = ""
// Call:
char *ptr = get_str_segment(output_str,input_str,'"');
Result:
input_str = " qwerty"123 45" "samuel" G7 "
^
|
ptr----------------------.
output_str = "qwerty"
*/
char *s = input_str;
char *d = output_str;
char i = 0;
if(!s) return NULL; // rule #1 our code never breaks!
if(!d) return NULL;
// eliminate white spaces from front of the block
while(1)
{
if ( *s == '\0')
{
*d = '\0' ; // end the output string
return (NULL) ; // return NULL to indicate that no
// copying has been done.
//
//
// "get_str_segment()" on
// strings filled with white spaces
// and empty strings
// will return NULL to indicate that
// no conclusive segment has been found
//
}
if (isspace(*s)) ++s; // move pointer to next char
else break; // break the loop!
}
// we found first non white character!
if( *s != extDel)
{
// copy block up to end of string first white space or extDel
while( ((*s) != '\0') && (!isspace(*s)) && ((*s) != extDel) )
{
*d = *s; // copy segment characters
++s;
++d;
}
*d = '\0' ; // end the destination string
return (s); // return pointer to end of the string ||
// trailing white space ||
// 'extDel' char
}
else // It is 'extDel' character !
{
++s; // skip opening 'extDel'
while( ((*s) != '\0') && ((*s) != extDel) )
{
i=1; // we executed loop at least one time
*d = *s; // copy segment characters till '\0' or extDel
++s;
++d;
}
*d = '\0' ; // end the destination string
if( *s == extDel ) ++s; // skip *closing* 'extDel'
else
{
// unbalanced 'extDel'!
printf("WARNING:get_str_segment: unbalanced '%c' encountered!\n",extDel);
if (i==0) return NULL; // we will not allow
// **empty** unbalanced segments 'extDel'
// if ('extDel' is unbalanced) it has to have at least one character!
// It can be white space!
}
return (s); // return pointer to next char after 'extDel'
// ( It can be '\0')
// if it is '\0' next pass will return 'NULL'!
}
}
int parse_line_to_table(int firstDim, int secondDim, char *table, char * line, char separator)
{
// Purpose:
// Parse 'line' to 'table'
// Function returns: number of segments
// 'table' has to be passed from outside
char* p;
int i;
if(!table) return (-1);
// parse segments to 'table':
if(line)
{
p = line; // A necessary initialization!
for(i=0; i<firstDim; i++)
{
p = get_str_segment( table+i*secondDim , p , separator );
if(p==NULL) break;
}
}
else
return (-1);
// debug only
// for(int j=0; j<i; j++) { printf(" i=%d %s",j, table+j*secondDim ); }
// printf("\n");
return (i); // notice that i is post incremented
}
int main(void)
{
char table[20][80];
char *line = "move this into \"tokens that I need\"";
int ret = parse_line_to_table(20, 80, &table[0][0], line, '\"'); // pass a char * to the first element
for(int i = 0; i < ret; i++ )
printf("%s\n",table[i]);
return 0;
}
Output:
move
this
into
tokens that I need

how to filter the lines in a buffer after comparing with particular string

I am trying to filter only those lines from the given buffer(1200 lines), whose first token matches with a particular string. Here the strings in the lines are separated by "#" symbol and the lines are separated by "\n".
So, first of all, I split the lines using strtok() and stored the tokens in an array of pointers.
Then I compared the first token:
if(token[0]=="abc.com") print("%s",token[i])
This prints only the first token of each line that starts with abc.com, not the whole line.
So, can anyone help me print the original lines after matching the first token?
int len_of_buff;
int n;
//char tokens[1024];
int ret_code = 1;
n=0;
len_of_buff = strlen((char *)my_buffer);
//char *tokens[len_of_buff];
for(i=n;i<len_of_buff; i++) {
char tokens[1024];
tokens[i] = strtok ((char *)my_buffer,"#\n");
//if (my_buffer[i] == '\n') my_buffer[i]='\0';
ret_code=strcmp(tokens[0], "abc.com");
if (ret_code==0) {
printf("\n");
fprintf(stdout, "%s \n ",(char *)my_buffer+n);
// fprintf(stdout, "******The buffer is: %d bytes\n",len_of_buff);
n = i+1;
break; } }
if(token[0]=="abc.com")
simply compares two pointers (one to the constant string, the other to somewhere in your input line). This will always be false.
if (strcmp (token[0], "abc.com") == 0)
Would do the right thing, provided token[0] is a char pointer to your first token.
If you want to print the whole, original line, you need to save it somewhere - strtok destroys the original string during the parsing process.
I have now done it as follows, and it is working fine.
char buffer_copy[1000];
strcpy( buffer_copy, buffer );
char * tokens[1000];
size_t token_count = 0;
char * ptr = tokens[token_count] = buffer_copy;
while( *ptr )
if( ( *ptr == '#' ) || ( *ptr == '\n' ) )
*ptr++ = '\0', tokens[++token_count] = ptr;
else
++ptr;
size_t index;
for( index = 0; index < token_count; ++index )
if( !strcmp( tokens[index], "member" ) )
printf( "%zu: %s | %s", index, tokens[index], buffer ); // %zu matches size_t

Issues with Pointer Arithmetic - Trying to tokenize input String

Currently I am working on a program that allows a user to enter a string that is then tokenized, then the tokens are printed to the screen by using an array of pointers. It is "supposed" to do this by calling my tokenize function which reads the input string until the first separator ( ' ', ',', '.', '?', '!'). It then changes that separator in my string to a NULL char. It then should return a pointer to the next character in my string.
In main after the string has been input, it should keep calling the tokenize function which returns pointers which are then stored in a array of pointers to later print my tokens. Once the tokenize() returns a pointer to a NULL character which is at the end of my string it breaks from that loop. Then I print the tokens out using my array of pointers.
//trying to be detailed
#include <stdio.h>
#include <string.h>
char *tokenize ( char *text, const char *separators );
int main ( void )
{
char text[30];
char separators[6] = { ' ','.',',','?','!','\0'};
char *pch = NULL;
int tokens[15];
int i = 0;
int j = 0;
printf("Enter a string: \n");
fgets( text, 30, stdin );
printf("%s", text );
pch = tokenize ( text, separators );
do
{
pch = tokenize ( pch, separators );
//printf("%c", *pch);
tokens[i] = pch;
i++;
}
while( *pch != NULL );
i--;
while( j != i )
{
printf("%s", tokens[i] );
j++;
}
return 0;
}
char *tokenize ( char *text, const char *separators )
{
while( text != NULL )
{
if( text != NULL )
{
while( separators != NULL )
{
if( text == separators )
{
text = '\0';
}
separators++;
}
}
text++;
}
return text;
}
3 big known problems currently.
1.When I compile, it reads the string then prints it, then gets stuck in a endless loop with nothing printing, still trying to get input.
2. Im pretty sure I am using the " * " for my pointers in the wrong place.
3. My function passes in a reference to my arrays, so I assumed i could just increment them as is.
I appreciate any feedback! I will be watching this post constantly. If i left something unclear, I can respecify. Thanks.
You had the right idea for approaching the problem, but there are numerous pointer/int errors throughout your code. Make sure you compile with warnings enabled; they will tell you where the problems are (don't expect your code to run correctly until you address and eliminate all warnings). At a minimum, add the -Wall -Wextra options to your build command.
There are a lot easier ways to do this, but for the learning experience, this is a great exercise. Below is your code with the errors corrected. Where possible, I have left your original code commented so you can see where the issues were. I also include a bit of code to remove the newline included by fgets at the end of text. While this isn't required, it is good practice not to have stray newlines filter through your code.
Let me know if you have questions:
#include <stdio.h>
#include <string.h>
char *tokenize ( char *text, const char *separators );
int main ( void )
{
char text[30];
char separators[6] = { ' ','.',',','?','!','\0'};
char *pch = NULL;
char *tokens[15] = {0}; /* declare array of pointers */
int i = 0;
int j = 0;
printf("Enter a string: \n");
fgets( text, 30, stdin );
size_t len = strlen (text);
if (len > 0 && text[len-1] == '\n') /* strip newline from text; len > 0 guards empty input */
text[--len] = 0;
pch = text; /* pch pointer to next string */
char *str = text; /* str pointer to current */
do
{
pch = tokenize ( str, separators ); /* pch points to next */
tokens[i++] = str; /* save ptr to token */
str = pch; /* new start of str */
}
while (pch != NULL && *pch != 0 && i < 14); /* test both pch & *pch; keep tokens[14] NULL as the end marker */
printf ("\nTokens collected:\n\n");
while (tokens[j]) /* print each token */
{
printf(" token[%d]: %s\n", j, tokens[j] );
j++;
}
printf ("\n");
return 0;
}
char *tokenize ( char *text, const char *separators )
{
const char *s = separators; /* must use pointer to allow reset */
//while( text != NULL )
while( *text != '\0' )
{
s = separators; /* reset s */
while( *s != 0 ) /* 0 is the same as '\0' */
{
//if( text == separators )
if( *text == *s )
{
//text = '\0';
*text = '\0';
return ++text;
}
s++;
}
text++;
}
return text;
}
Example output:
$ ./bin/tokenizestr
Enter a string:
This is a test string
Tokens collected:
token[0]: This
token[1]: is
token[2]: a
token[3]: test
token[4]: string
You may also want to take a look at strsep and at the post "Split string with delimiters in C".
If you need more reference points, try searching for "split string", which is what you want to do if I understood correctly.

Function to delete all occurrences of a word in a sentence in C

I have this code which will remove the first occurrence of the word from the sentence:
#include "stdio.h"
#include "string.h"
int delete(char *source, char *word);
void main(void) {
char sentence[500];
char word[30];
printf("Please enter a sentence. Max 499 chars. \n");
fgets(sentence, 500, stdin);
printf("Please enter a word to be deleted from sentence. Max 29 chars. \n");
scanf("%s", word);
delete(sentence, word);
printf("%s", sentence);
}
int delete(char *source, char *word) {
char *p;
char temp[500], temp2[500];
if(!(p = strstr(source, word))) {
printf("Word was not found in the sentence.\n");
return 0;
}
strcpy(temp, source);
temp[p - source] = '\0';
strcpy(temp2, p + strlen(word));
strcat(temp, temp2);
strcpy(source, temp);
return 1;
}
How would I modify it to delete all occurrences of the word in the given sentence? Can i still use the strstr function in this case?
Thanks for the help!
Open to completely different ways of doing this too.
P.S. This might sound like a homework question, but it's actually a past midterm question which I'd like to resolve to prepare for my midterm!
As a side question, if I use fgets(word, 30, stdin) instead of scanf("%s", word), it no longer works and tells me that the word was not found in the sentence. Why?
Try the following
#include <stdio.h>
#include <string.h>
size_t delete( char *source, const char *word )
{
size_t n = strlen( word );
size_t count = 0;
if ( n != 0 )
{
char *p = source;
while ( ( p = strstr( p, word ) ) != NULL )
{
char *t = p;
char *s = p + n;
while ( ( *t++ = *s++ ) );
++count;
}
}
return count;
}
int main( void )
{
char s[] = "abxabyababz";
printf( "%zu\n", delete( s, "ab" ) );
puts( s );
return 0;
}
The output is
4
xyz
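As for the question about fgets: unlike scanf("%s", ...), it stores the trailing newline character in the string, so the word with the newline attached is not found in the sentence. You have to remove the newline from the string before searching.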
How would I modify it to delete all occurrences of the word in the given sentence?
There are many ways, as you have suggested, and since you are Open to completely different ways of doing this too...
Here is a different idea:
A sentence uses white space to separate words. You can use that to help solve the problem. Consider implementing these steps using fgets(), strtok() and strcat() to break apart the string, and reassemble it without the string to remove.
0) create line buffer sufficient length to read lines from file
(or pass in line buffer as an argument)
1) use while(fgets(...) to get new line from file
2) create char *buf = NULL;
3) create char *new_str; (calloc() memory to new_str >= length of line buffer)
4) loop on buf = strtok();, using " \t\n" as the delimiter
Inside loop:
a. if (strcmp(buf, str_to_remove) != 0) //approve next token for concatenation
{ strcat(new_str, buf); strcat(new_str, " ");}//if not str_to_remove,
//concatenate token, and a space
5) use new_str, then free the allocated memory
After the loop, new_str contains the sentence without occurrences of str_to_remove.
Here is a demo using this set of steps (pretty much)
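#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int delete(char *str, char *str_to_remove)
{
char *buf;
char *new_str;
// +2: a space is appended after every kept token, including the last, plus the terminator
new_str = calloc(strlen(str)+2, sizeof(char));
if(new_str == NULL)
return -1;
buf = strtok(str, " \t\n");
while(buf)
{
if(strcmp(buf, str_to_remove) != 0)
{
strcat(new_str, buf);
strcat(new_str, " ");
}
buf = strtok(NULL, " \t\n");
}
printf("%s\n", new_str);
free(new_str);
return 0;
}
int main(void)
{
// strtok() modifies its argument, so pass a writable array, not a string literal
char sentence[] = "this sentence had a withh bad withh word";
delete(sentence, "withh");
return 0;
}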

Resources