Finding the last occurrence of a string in a sentence in C - c

I am practicing and came across an exercise that asks me to write, by hand, a function that finds the index of the last occurrence of a word in a string. I realize this may have been asked before, but I cannot find the problem with my code. It works in almost all cases, but not when the last occurrence of the word is at the beginning of the string.
What I have tried: I use pointers to store the addresses of the ends of both the sentence and the word we are looking for, then iterate backwards through the string with a while loop. If the current character matches the last character of the word, we enter another while loop that compares the two character by character. If the pointer used to iterate through the word reaches the beginning of the word, the word is found.
Here is some code:
#include <stdio.h>
int find_last( char *str, char *word)
{
char *p, *q;
char *s, *t;
p=str; /* Pointer p now points to the last character of the sentence*/
while(*p!='\0') p++;
p--;
q = word;
while(*q!='\0') q++; /* Pointer q now points to the last character of the word*/
q--;
while(p != str) {
if(*p == *q) {
s=p; /* if a matching character is found, "s" and "t" are used to iterate through */
/* the string and the word, respectively*/
t=q;
while(*s == *t) {
s--;
t--;
}
if(t == word-1) return s-str+1; /* if pointer "t" is equal by address to pointer word-1, we have found our match. return s-str+1. */
}
p--;
}
return -1;
}
int main()
{
char arr[] = "Today is a great day!";
printf("%d", find_last(arr, "Today"));
return 0;
}
So this code should return 0, but it returns -1.
It works in every other case I tested! When run in CodeBlocks the output is as expected (0), but in every online IDE I could find the output is -1.

For starters the parameters of the function shall have the qualifier const and its return type should be either size_t or ptrdiff_t.
For example
ptrdiff_t find_last( const char *str, const char *word );
In any case the function shall be declared at least like
int find_last( const char *str, const char *word );
The function should emulate the behavior of the standard C function strstr. That is when the second argument is an empty string the function should return 0.
If either of the arguments is an empty string then your function has undefined behavior due to these statements
p=str; /* Pointer p now points to the last character of the sentence*/
while(*p!='\0') p++;
p--;
^^^^
q = word;
while(*q!='\0') q++; /* Pointer q now points to the last character of the word*/
q--;
^^^^
If the string pointed to by str contains only one symbol then your function returns -1 because the condition of the loop
while(p != str) {
evaluates to false regardless of whether the two strings are equal.
This loop
while(*s == *t) {
s--;
t--;
}
can also invoke undefined behavior, because s and t can be moved before the start of str and word, so the loop reads memory that precedes the strings.
And this statement
if(t == word-1) return s-str+1;
can also invoke undefined behavior for the same reason.
The function can be defined as it is shown in the demonstrative program below.
#include <stdio.h>
int find_last( const char *str, const char *word )
{
const char *p = str;
int found = !*word;
if ( !found )
{
while ( *p ) ++p;
const char *q = word;
while ( *q ) ++q;
while ( !found && !( p - str < q - word ) )
{
const char *s = p;
const char *t = q;
while ( t != word && *( s - 1 ) == *( t - 1) )
{
--s;
--t;
}
found = t == word;
if ( found ) p = s;
else --p;
}
}
return found ? p - str : -1;
}
int main(void)
{
const char *str = "";
const char *word = "";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
word = "A";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
str = "A";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
str = "ABA";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
str = "ABAB";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
str = "ABCDEF";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
str = "ABCDEF";
word = "BC";
printf( "find_last( str, word ) == %d\n", find_last( str, word ) );
return 0;
}
The program output is
find_last( str, word ) == 0
find_last( str, word ) == -1
find_last( str, word ) == 0
find_last( str, word ) == 2
find_last( str, word ) == 2
find_last( str, word ) == 0
find_last( str, word ) == 1

Related

How to count how many word in string?

I want to know how to count how many words are in a string.
I use strstr to compare and it works but only works for one time
like this
char buff = "This is a real-life, or this is just fantasy";
char op = "is";
if (strstr(buff,op)){
count++;
}
printf("%d",count);
and the output is 1 but there are two "is" in the sentence, please tell me.
For starters you have to write the declarations at least like
char buff[] = "This is a real-life, or this is just fantasy";
const char *op = "is";
Also if you need to count words you have to check whether words are separated by white spaces.
You can do the task the following way
#include <string.h>
#include <stdio.h>
#include <ctype.h>
//...
size_t n = strlen( op );
for ( const char *p = buff; ( p = strstr( p, op ) ) != NULL; p += n )
{
if ( p == buff || isblank( ( unsigned char )p[-1] ) )
{
if ( p[n] == '\0' || isblank( ( unsigned char )p[n] ) )
{
count++;
}
}
}
printf("%d",count);
Here is a demonstration program.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(void)
{
char buff[] = "This is a real-life, or this is just fantasy";
const char *op = "is";
size_t n = strlen( op );
size_t count = 0;
for ( const char *p = buff; ( p = strstr( p, op ) ) != NULL; p += n )
{
if ( p == buff || isblank( ( unsigned char )p[-1] ) )
{
if ( p[n] == '\0' || isblank( ( unsigned char )p[n] ) )
{
count++;
}
}
}
printf( "The word \"%s\" is encountered %zu time(s).\n", op, count );
return 0;
}
The program output is
The word "is" is encountered 2 time(s).
Parse the string, in a loop.
Since the OP says "but there are two "is" in the sentence", it is not enough just to search for "is": that substring occurs 4 times, twice as a word and twice inside "This"/"this". The code needs to parse the string with some notion of a "word".
Case sensitivity is also a concern.
char buff[] = "This is a real-life, or this is just fantasy";
const char *op = "is";
char *p = buff;
char *candidate;
while ((candidate = strstr(p, op)) != NULL) {
// Add code to test if candidate is a stand-alone word
// Test if candidate is beginning of buff or prior character is a white-space.
// Test if candidate is end of buff or next character is a white-space/punctuation.
p = candidate + strlen(op); // advance past this match
}
For me, I would not use strstr(), but look for "words" with isalpha().
// Concept code
// Note: strnicmp() is not standard C; POSIX provides strncasecmp() in <strings.h>.
size_t n = strlen(op);
while (*p) {
if (isalpha((unsigned char)*p)) { // Start of word
// some limited case insensitive compare
if (strnicmp(p, op, n) == 0 && !isalpha((unsigned char)p[n])) {
count++;
}
while (isalpha((unsigned char)*p)) p++; // Find end of word
} else {
p++;
}
}

Copying a character from one string argument to another in C

The following program crashes without an error message after trying to replace the first character in s with t. The purpose of the program is to test if the two strings s and t are isomorphic:
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
bool isIsomorphic(char *s, char *t);
int main()
{
isIsomorphic("egg", "add");
}
bool isIsomorphic(char *s, char *t)
{
//create two other char pointers for the characters one position before s and t.
char *preS = s;
char *preT = t;
//replace first character in s with t.
*s = *t //CRASHES HERE
//increment both pointers to their second character.
s ++;
t ++;
//run through t
while(t != NULL)
{
//if the characters in both strings are either a. both equal to their previous or b. both different to their previous:
if(((strcmp(t, preT) == 0) && (strcmp(s, preS) == 0)) || (((strcmp(t, preT) != 0) && (strcmp(s, preS) != 0))))
{
//copy t into s and shift both pointers along.
*s = *t;
s ++;
t ++;
}
else
{
printf("not isomorphic\n");
return false;
}
}
printf("isomorphic\n");
return true;
}
Why is this the case? Any help would be appreciated.
Modifying string literals is prohibited. Trying to do so invokes undefined behavior.
In this case the pointers are incremented right after the assignments and the strings are not used after the call to isIsomorphic, so you should simply remove the meaningless and harmful assignments (*s = *t;).
If you want to refer to the modified string later, store the string to be modified in a modifiable array, like this:
int main(void)
{
char str[] = "egg";
isIsomorphic(str, "add");
}
You are using string literals
isIsomorphic("egg", "add");
that you are changing within the function isIsomorphic
*s = *t //CRASHES HERE
You may not change a string literal. Any attempt to change a string literal results in undefined behavior.
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
But in any case the function is incorrect.
First, the function shall not change the strings passed to it. That is, it shall be declared like
bool isIsomorphic( const char *s, const char *t);
Also these calls of the function strcmp in the if statement
if(((strcmp(t, preT) == 0) && (strcmp(s, preS) == 0)) || (((strcmp(t, preT) != 0) && (strcmp(s, preS) != 0))))
do not make sense, if only because the strings pointed to by t and preT have different lengths. So the expression strcmp(t, preT) == 0 will always evaluate to logical false.
I can suggest the following function definition shown in the demonstrative program below.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
bool isIsomorphic( const char *s, const char *t )
{
size_t n = strlen( s );
bool isomorphic = n == strlen( t );
while ( isomorphic && n-- )
{
const char *p1 = strchr( s + n + 1, s[n] );
const char *p2 = strchr( t + n + 1, t[n] );
isomorphic = ( !p1 && !p2 ) || ( p1 && p2 && p1 - s == p2 - t );
}
return isomorphic;
}
int main(void)
{
const char *s = "egg";
const char *t = "add";
printf( "\"%s\" is isomorphic with \"%s\" is %s\n",
s, t, isIsomorphic( s, t ) ? "true" : "false" );
s = "foo";
t = "bar";
printf( "\"%s\" is isomorphic with \"%s\" is %s\n",
s, t, isIsomorphic( s, t ) ? "true" : "false" );
s = "paper";
t = "title";
printf( "\"%s\" is isomorphic with \"%s\" is %s\n",
s, t, isIsomorphic( s, t ) ? "true" : "false" );
s = "0123456789";
t = "9876543210";
printf( "\"%s\" is isomorphic with \"%s\" is %s\n",
s, t, isIsomorphic( s, t ) ? "true" : "false" );
return 0;
}
The program output is
"egg" is isomorphic with "add" is true
"foo" is isomorphic with "bar" is false
"paper" is isomorphic with "title" is true
"0123456789" is isomorphic with "9876543210" is true

How to find out if a word is inside of a line?

I am trying to write a function from an exercise in the book "Programming in C". The function should indicate whether a line contains some word and, if so, return the position of the word's first character in the line.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int substring (char a[], char b[]);
int main ()
{
char line1[15], line2[15];
printf("Print first one\n");
int i = 0;
char character;
do
{
character = getchar ();
line1[i] = character;
i++;
}
while (character != '\n');
line1[i-1] = '\0';
printf ("Print the second one\n");
scanf("%s", line2);
printf("%s, %s\n", line1, line2); // for checking lines
int index;
index = substring (line1, line2);
printf("The result is: %i\n", index);
}
int substring (char a[], char b[]) /* function to determine if a line contains a word; if yes, return its position, else return -3 */
{
int len1 = strlen(a), len2 = strlen(b);
int current1 = 0, current2 = 0;
bool found = false;
int result;
while( current1 < len1 )
{
if (a[current1] == b[current2])
{
if(!found)
{
result = current1+1;
}
else
found = true;
while ((a[current1] == b[current2]) && (a[current1] != '\0') && (b[current2] != '\0'))
{
current1++;
if(current2+1 == len2)
return result;
current2++;
}
current1 = result;
}
else
{
current2 = 0;
found = false;
current1++;
}
}
return -3;
}
The problem is somewhere in the second function (substring): when I search for "word" in the line "Here is your word", the function works properly, but when I search for "word" in the line "Here is your wwwwword", the function returns -3 (which indicates that something went wrong).
For starters, there is the standard C string function strstr, which determines whether one string is contained in another.
If you need to write such a function yourself, then your implementation looks too complicated.
For example the if-else statement
if (a[current1] == b[current2])
{
if(!found)
{
result = current1+1;
}
else
found = true;
// ...
does not make much sense, for at least two reasons. First, the variable result, which can be returned from within the while loop, does not hold the exact index of the found word (it holds current1 + 1). Second, found can never become true within the function. So the condition in the if statement
if( !found )
always evaluates to true.
Also you forgot to reset the variable current2 in the compound statement of the if statement after the while loop.
Apart from this the function parameters should have the qualifier const because passed strings are not being changed in the function. And the function return type shall be size_t.
The function can look for example the following way as it is shown in the demonstrative program below.
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
size_t substring( const char *s1, const char *s2 )
{
size_t n1 = strlen( s1 );
size_t n2 = strlen( s2 );
size_t i = 0;
bool found = false;
if (!( n1 < n2 ))
{
for ( size_t n = n1 - n2; !found && i <= n; i += !found )
{
size_t j = 0;
while (s2[j] != '\0' && s2[j] == s1[i + j]) ++j;
found = s2[j] == '\0';
}
}
return found ? i : -1;
}
int main( void )
{
const char *word = "word";
const char *s1 = "Here is your word";
const char *s2 = "Here is your wwwwword";
printf( "%zu\n", substring( s1, word ) );
printf( "%zu\n", substring( s2, word ) );
}
The program output is
13
17
If the word is not found in the source string, the function returns -1 converted to the type size_t, that is, the maximum value of that type (SIZE_MAX).
Note also that the variable character should be declared with the type int instead of char, because getchar returns an int. Otherwise, in general, the comparison with EOF can produce unexpected results if the type char behaves as the type unsigned char.

Merge two specific columns from a line in C language

I want to merge two specific columns from a line in C. The line looks like "hello world hello world": some words separated by white space. Below is my code. In this function, c1 and c2 are the column numbers, and the array key receives the merged string. But it does not run correctly.
char *LinetoKey(char *line, int c1, int c2, char key[COLSIZE]){
char *col2 = (char *)malloc(sizeof(char));
while (*line != '\0' && isspace(*line) )
line++;
while(*line != '\0' && c1 != 0){
if(isspace(*line)){
while(*line != '\0' && isspace(*line))
line++;
c1--;
c2--;
}else
line++;
}
while (*line != '\0' && *line != '\n' && (isspace(*line)==0))
*key++ = *line++;
*key = '\0';
while(*line != '\0' && c2 != 0){
if(isspace(*line)){
while(*line != '\0' && isspace(*line))
line++;
c2--;
}else
line++;
}
while (*line != '\0' && *line != '\n' && isspace(*line)==0)
*col2++ = *line++;
*col2 = '\0';
strcat(key,col2);
return key;
}
Here's a possible solution using strtok(). It can handle an arbitrary number of columns (increase the size of buf if necessary), and will still work if the order of the columns is reversed (i.e. c1 > c2). The function returns 1 on success (tokens successfully merged), 0 otherwise.
Note that strtok() modifies its argument - so I have copied input to a temporary buffer, char buf[64].
/*
* Merge space-separated 'tokens' in a string.
* Columns are zero-indexed.
*
* Return: 1 on success, 0 on failure
*/
int merge_cols(char *input, int c1, int c2, char *dest) {
char buf[64];
int col = 0;
char *tok = NULL, *first = NULL, *second = NULL, *tmp = NULL;
if (c1 == c2) {
fprintf(stderr, "Columns can not be the same !");
return 0;
}
if (strlen(input) > sizeof(buf) - 1) return 0;
/*
* strtok() is _destructive_, so copy the input to
* a buffer.
*/
strcpy(buf, input);
tok = strtok(buf, " ");
while (tok) {
if (col == c1 || col == c2) {
if (!first)
first = tok;
else if (first && !second)
second = tok;
}
if (first && second) break;
tok = strtok(NULL, " ");
col++;
}
// In case order of columns is swapped ...
if (c1 > c2) {
tmp = second;
second = first;
first = tmp;
}
if (first) strcpy(dest, first);
if (second) strcat(dest, second);
return first && second;
}
Sample usage:
char *input = "one two three four five six seven eight";
char dest[128];
// The columns can be reversed ...
int merged = merge_cols(input, 7, 1, dest);
if (merged)
puts(dest);
Note also that it is very easy to use different delimiters when using strtok() - so if you wanted to use comma- or tab-separated input instead of spaces, you just change the second argument when calling it.
It is not clear what you're trying to do. If what David Collins suggested is what you're looking for - word concatenation by word index - here's a starting point (demo):
Your function must minimize the number of string traversals. To help this, the code below is using char** instead of char* (sort of "char streams").
The function must be able to count the number of characters the result will have, prior to actual concatenation, in order to be able to allocate the destination string on free store. If catwords is called with null destination, it only counts the length of the result string.
Regarding the actual implementation, you will have to traverse the string word by word and decide whether to copy or skip the word. See the code below for the following functions:
nextword - skips white-spaces until it finds a non-white-space character.
copyword - copies the current word if destination is valid or skips it if not. It returns the number of copied/skipped characters.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
void nextword( const char** ps )
{
while ( **ps && isspace( **ps ) )
++*ps;
}
int copyword( char** const pd, const char** ps )
{
// remember the starting point
const char* b = *ps;
// actual copy
if ( pd && *pd ) while ( **ps && !isspace( **ps ) )
*( *pd )++ = *( *ps )++;
// skip the word (no destination)
else while ( **ps && !isspace( **ps ) )
( *ps )++;
// return the length
return *ps - b;
}
int catwords( char* d, const char* s, const int* c )
{
int len = 0;
int iw = 0;
int ic = 0;
const char** ps = &s;
char** pd = &d;
for ( nextword( ps ); **ps && c[ ic ] > -1; nextword( ps ), ++iw )
if ( iw == c[ ic ] )
{
len += copyword( pd, ps );
++ic;
}
else
{
copyword( 0, ps ); // just skip the current word
}
if ( d )
**pd = '\0';
return len;
}
int main()
{
// static buffer test
{
char d[ 1024 ];
int t[] = { 0, 3, -1 };
catwords( d, "Hello world. Hello world!", t );
puts( d );
}
// dynamic buffer test
{
const char* s = "The greatness of a man is not in how much wealth he acquires, but in his integrity and his ability to affect those around him positively.";
int t[] = { 1, 5, 16, -1 };
int dstcharcount = catwords( 0, s, t ) + 1;
char* d = (char*)malloc( dstcharcount * sizeof( char ) );
catwords( d, s, t );
puts( d );
free( d );
}
return 0;
}

Why doesn't pointer arithmetic work in char* functions

THE MESSAGE:
/usr/local/webide/runners/c_runner.sh: line 54: 20533 Segmentation fault
nice -n 10 valgrind --leak-check=full --log-file="$valgrindout" "$exefile"
I can't understand why I can't use pointer arithmetic when my function type is not void. Take a look at this example:
Let's say I have to write a function that would 'erase' all whitespaces before the first word in a string.
For example, if we had a char array:
" Hi everyone"
it should produce "Hi everyone" after the function's modification.
Here is my code which works fine when instead of
char* EraseWSbeforethefirstword() I have
void EraseWSbeforethefirstword.
When the function returns an object char* it can't even be compiled.
char* EraseWSbeforethefirstword(char *s) {
char *p = s, *q = s;
if (*p == ' ') { /*first let's see if I have a string that begins with a space */
while (*p == ' ') {
p++;
} /*moving forward to the first non-space character*/
while (*p!= '\0') {
*q = *p;
p++;
q++;
} /*copying the text*/
*q = '\0'; /*If I had n spaces at the beginning the new string has n characters less */
}
return s;
}
Here is a function implementation that has the return type char * as you want.
#include <stdio.h>
char * EraseWSbeforethefirstword( char *s )
{
if ( *s == ' ' || *s == '\t' )
{
char *p = s, *q = s;
while ( *p == ' ' || *p == '\t' ) ++p;
while ( ( *q++ = *p++ ) );
}
return s;
}
int main(void)
{
char s[] = "\t Hello World";
printf( "\"%s\"\n", s );
printf( "\"%s\"\n", EraseWSbeforethefirstword( s ) );
return 0;
}
The program output is
" Hello World"
"Hello World"
Take into account that you may not modify string literals. So the program will have undefined behavior if instead of the array
char s[] = "\t Hello World";
a pointer to a string literal is declared
char *s = "\t Hello World";
If you want the function to be able to deal with string literals, it has to allocate a new array dynamically and return a pointer to its first element.
If you may not use standard C string functions, the function can look the following way
#include <stdio.h>
#include <stdlib.h>
char * EraseWSbeforethefirstword( const char *s )
{
size_t blanks = 0;
while ( s[blanks] == ' ' || s[blanks] == '\t' ) ++blanks;
size_t length = 0;
while ( s[length + blanks] != '\0' ) ++length;
char *p = malloc( length + 1 );
if ( p != NULL )
{
size_t i = 0;
while ( ( p[i] = s[i + blanks] ) != '\0' ) ++i;
}
return p;
}
int main(void)
{
char *s= "\t Hello World";
printf( "\"%s\"\n", s );
char *p = EraseWSbeforethefirstword( s );
if ( p ) printf( "\"%s\"\n", p );
free( p );
return 0;
}
