Using Memmove to copy a string into itself produces wrong output - c

I'm given a string as such:
"Hello     World"
With 5 space characters in between the two words. I want to remove all but one of the spaces between the two words. However, my code only seems to work when there are 3 spaces or fewer. I'm using memmove to try to accomplish this.
Here is what I've tried:
int main(void) {
char * word = malloc(sizeof(char)*16);
strcpy(word,"Hello World");
checkWords(word);
return 0;
}
void checkWords(char * word) {
int i;
for(i=0; i < strlen(word); i++) {
if(word[i] == ' ' )
memmove(&word[i],&word[i+1],strlen(word)+1);
}
printf("The string without spaces is %s\n",word);
}
The output here is "Hello  World"
Not "Hello World"
If I try input such as:
"Hello  World" gets me "Hello World" -->correct
"Hello   World" gets me "Hello World" -->correct
Anything greater than 3 spaces gets me incorrect output. (I want to have one space between the two words.)

There are four problems. One is undefined behaviour: Take a 1,000,000 byte string where just the last character is a space. You will be moving about one megabyte after the end of the string one byte forward. That's quite fatal.
One is just a bug: You examine every character position in the string only once. If you have two consecutive spaces, you move the second space into the position of the first one and leave it there.
And one is a performance issue: If you fix the first two problems, and then you take a 1,000,000 byte string consisting of spaces only, you will do one million memmoves each moving between 0 and 1 megabyte. That's 500 gigabyte moved. That takes time.
And another performance issue is calling strlen in a loop. If your string is one million bytes, you do one million calls to strlen, and each call will go through the whole string until the end, scanning the whole megabyte string for the trailing zero byte.
PS. I misinterpreted what you are trying to achieve: Your code will delete any single space, and delete one out of any pairs of spaces. So it leaves one space for every two spaces you had. So it won't work if there is one space. It works by coincidence for two or three spaces. And if you have lots of spaces, it will keep about half of them.
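A memmove-based version that avoids the first, second and fourth problems could look roughly like the sketch below. It is only an illustration, not the asker's code: the name collapse_spaces is made up, the length is computed once and kept up to date, and each run of spaces triggers at most one move that never reaches past the terminating NUL. It does not remove the quadratic worst case for strings with very many separate runs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: collapse every run of spaces in s to a single space. */
static void collapse_spaces(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);               /* computed once, then kept up to date */

    for (size_t i = 0; i < len; i++) {
        if (s[i] != ' ')
            continue;
        size_t run = strspn(s + i, " ");  /* number of consecutive spaces at i */
        if (run > 1) {
            /* Move the tail, including the NUL, over the extra spaces. */
            memmove(s + i + 1, s + i + run, len - (i + run) + 1);
            len -= run - 1;
        }
    }
}
```

Called on a writable buffer holding "Hello", five spaces and "World", this should leave "Hello World" in place.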

Copying the "right" side of the string potentially multiple times is inefficient. Use of strcpy() or memcpy() is not an efficient approach.
Repeatedly calling strlen() is also inefficient.
Suggest using two indexes and walk them through the string.
#include <stdio.h>
void RemoveExtraSpaces(char * word) {
if (word[0]) {
size_t src = 1;
size_t dest = 1;
do {
if (word[src] != ' ' || word[src - 1] != ' ') {
word[dest] = word[src];
dest++;
}
} while (word[src++]);
}
printf("The string without spaces is `%s`\n", word);
}
Unclear what should happen with multiple spaces before the first word or after the last word. This code shrinks those to 1 space.
Post-accept variation: a slight simplification, inspired by @Joachim Pileborg's good answer.
void RemoveExtraSpaces2(char * word) {
size_t src = 0;
size_t dest = 0;
do {
while (word[src] == ' ' && word[src + 1] == ' ') { // loop, so runs of 3+ spaces also collapse
src++;
}
word[dest] = word[src];
dest++;
} while (word[src++]);
printf("The string without spaces is `%s`\n", word);
}
As this is too close to being a duplicate of Replace multiple spaces by single space in C, (BTW: which did not accept the best answer IMO), I am making this community wiki.

This works for me
for(i=0; i < strlen(word); i++) {
if(word[i] == ' ' )
{
while (word[i+1] == ' ' && word[i+1] != '\0')
memmove(&word[i],&word[i+1], strlen(word)-i);
}
}

Related

CamelCase to snake_case in C without tolower

I want to write a function that converts CamelCase to snake_case without using tolower.
Example: helloWorld -> hello_world
This is what I have so far, but the output is wrong because I overwrite a character in the string here: string[i-1] = '_';.
I get hell_world. I don't know how to get it to work.
void snake_case(char *string)
{
int i = strlen(string);
while (i != 0)
{
if (string[i] >= 65 && string[i] <= 90)
{
string[i] = string[i] + 32;
string[i-1] = '_';
}
i--;
}
}
This conversion means, aside from converting a character from uppercase to lowercase, inserting a character into the string. This is one way to do it:
iterate from left to right,
if an uppercase character is found, use memmove to shift all characters from this position to the end of the string one position to the right, and then assign the to-be-inserted value to the current character,
stop when the null-terminator (\0) has been reached, indicating the end of the string.
Iterating from right to left is also possible, but since the choice is arbitrary, going from left to right is more idiomatic.
A basic implementation may look like this:
#include <stdio.h>
#include <string.h>
void snake_case(char *string)
{
for ( ; *string != '\0'; ++string)
{
if (*string >= 65 && *string <= 90)
{
*string += 32;
memmove(string + 1U, string, strlen(string) + 1U);
*string = '_';
}
}
}
int main(void)
{
char string[64] = "helloWorldAbcDEFgHIj";
snake_case(string);
printf("%s\n", string);
}
Output: hello_world_abc_d_e_fg_h_ij
Note that:
The size of the string to move is the length of the string plus one, to also move the null-terminator (\0).
I am assuming the function isupper is off-limits as well.
The array needs to be large enough to hold the new string, otherwise memmove will perform invalid writes!
The latter is an issue that needs to be dealt with in a serious implementation. The general problem of "writing a result of unknown length" has several solutions. For this case, they may look like this:
1. First determine how long the resulting string will be, reallocate the array, and only then modify the string. Requires two passes.
2. Every time an uppercase character is found, reallocate the string to its current size + 1. Requires only one pass, but frequent reallocations.
3. Same as 2, but whenever the array is too small, reallocate the array to twice its current size. Requires a single pass, and less frequent (but larger) reallocations. Finally reallocate the array to the length of the string it actually contains.
In this case, I consider option 1 to be the best. Doing two passes is an option if the string length is known, and the algorithm can be split into two distinct parts: find the new length, and modify the string. I can add it to the answer on request.
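A two-pass version might look something like the following sketch. It is a variation on option 1 rather than the answer's own code: instead of reallocating the caller's array it returns a freshly allocated copy, the name snake_case_alloc is made up, and it keeps the ASCII assumption of the code above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated snake_case copy of s,
 * or NULL if allocation fails. The caller frees the result. */
static char *snake_case_alloc(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    size_t extra = 0;

    /* Pass 1: every uppercase letter needs one extra byte for the '_'. */
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            extra++;

    char *out = malloc(len + extra + 1);   /* +1 for the null terminator */
    if (out == NULL)
        return NULL;

    /* Pass 2: copy, inserting '_' and lowercasing as we go (ASCII assumed). */
    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 'A' && s[i] <= 'Z') {
            out[j++] = '_';
            out[j++] = (char)(s[i] + ('a' - 'A'));
        } else {
            out[j++] = s[i];
        }
    }
    out[j] = '\0';
    return out;
}
```

Because the output size is known before anything is written, no memmove is needed and the buffer can never be too small.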

Understanding this program which prints a sentence in reverse but keeps the words unchanged

I don't quite understand this program. I don't understand what is happening in the for loop. Can someone explain to me in simple words. And the site also didn't explain it well-enough. This is the link to the site. https://www.geeksforgeeks.org/print-words-string-reverse-order/
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void printReverse(char str[])
{
int length = strlen(str);
FILE *fptr;
if((fptr=fopen("Question1.txt","w"))==NULL)
{
printf("Invalid file");
exit(0);
}
int i;
for (i = length - 1; i >= 0; i--) {
if (str[i] == ' ')
{
str[i] = '\0';
printf("%s ", &(str[i]) + 1);
fprintf(fptr,"%s ", &(str[i]) + 1);
}
}
fprintf(fptr,"%s",str);
printf("%s.", str);
fclose(fptr);
}
int main()
{
char str[1000];
//clrscr();
printf("Enter string: ");
scanf("%[^\n]s", str);
printReverse(str);
//getch();
return 0;
}
In the for loop, why put &(str[i])+1? And also in printf("%s.", str)--this only has the first word; how?
Okay, let's see if I can help. I'll go through the code carefully.
I suspect you already understand this. It's just the start of the function definition.
void printReverse(char str[])
{
strlen is a standard function that returns the length of a null-terminated string. That means that str might contain Hello (5 characters), but there's one more byte with a 0 in it, which is how C has always marked the end of the string. In this case, str itself takes 6 bytes, but length will be 5.
int length = strlen(str);
This is how you open a file in C. C++ has better ways. The file is opened for writing.
FILE *fptr;
if((fptr=fopen("Question1.txt","w"))==NULL)
{
printf("Invalid file");
exit(0);
}
Here's your for-loop. Let's assume str contains Hello, so length is 5, but the indexes into string are str[0..4]. C uses the index as "offset from the beginning", so the first element is 0, not 1. Thus, when this loop starts, str[i] == o (using Hello as our example string). We then loop, decrementing i each time. Once i goes below 0, the loop ends.
int i;
for (i = length - 1; i >= 0; i--) {
Okay, remember we're printing the words in reverse order, but the letters within each word stay in normal order. So this looks for a space -- between words. So if we use Hello there as our input text, this if-statement is true when i is pointing to the space between the two words.
Now here's the trick. Remember what I said earlier about null-terminated strings? What this does is to step on that space and replace it with a 0. That makes the rest of this magic work.
if (str[i] == ' ')
{
str[i] = '\0';
And here's the magic. Now, this is a strange way to do it. I would have done it with &str[i+1], but this works. What this is doing is saying "Print the string that begins after the space we just clobbered." We do it to the terminal and the file.
printf("%s ", &(str[i]) + 1);
fprintf(fptr,"%s ", &(str[i]) + 1);
}
}
This writes whatever is left of the string (the first word) to the file that was opened as well as to your terminal, then makes sure the file is closed.
fprintf(fptr,"%s",str);
printf("%s.", str);
fclose(fptr);
}
This all works because we step on the spaces with a zero. For Hello world, we:
Start from the tail
Find the space and stick a zero in it
Print world
Keep backing up to the start of the data.
Drop out of the for-loop and print whatever is left: Hello
Answer to your specific questions
In the for loop why put? &(str[i])+1?
&str[i] is the address of the character at index i where a space has been replaced with a NUL character. With +1 you get the address of the character after it, i.e. the beginning of the word that follows the space that was just replaced. (In case of double spaces this would result in an empty string.)
And also in printf("%s.", str); this only has the first word how?
Assuming the first word is not preceded by a space, the loop will not print it.
This printf("%s.", str); will print the string from the beginning until the first NUL character that replaces a former space character, hence resulting in the first word.
Additional question from comment
So... for example if I input Hello World does the W in that get the index 0?
The W is at index 6. (H is 0, e is 1 etc.)
When i has been counted down to 5, the space at this position will be replaced with a NUL ('\0') character, and it will print the remaining string from the W up to the end of the string which is also marked by a NUL character. (As defined by the C standard.)
And what if the character is not a NULL character? Then it won't go execute if right? It'll just increment i again till it encounters another NULL right?
I don't fully understand these questions. In case there was no NUL character at the end of the string printf would read past the end of the string leading to undefined behavior.
In the case of the input string "Hello World and Universe", all spaces after World will already have been replaced with NUL characters, so when the program reaches the position of the space before World, the string will be
Hello World\0and\0Universe\0
before the replacement and
Hello\0World\0and\0Universe\0
after the replacement.
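The same NUL-replacement trick can be tried without the file I/O. The following is a minimal sketch under stated assumptions: reverse_words is a made-up name, and the caller is assumed to supply an output buffer of at least strlen(str) + 1 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: writes the words of str into out in reverse order.
 * str is modified (spaces become NULs); out must hold strlen(str) + 1 bytes. */
static void reverse_words(char *str, char *out)
{
    assert(str != NULL && out != NULL);
    out[0] = '\0';

    for (size_t i = strlen(str); i-- > 0; ) {
        if (str[i] == ' ') {
            str[i] = '\0';             /* the string now ends just before this word */
            strcat(out, &str[i + 1]);  /* the word that starts after the old space */
            strcat(out, " ");
        }
    }
    strcat(out, str);                  /* whatever is left is the first word */
}
```

After the call, str itself only "contains" the first word, because the first space has become the terminator; that is why the final print in the original program shows just that word.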

Determining the length of an array for memory efficiency

Write a function in C language that:
Takes as its only parameter a sentence stored in a string (e.g., "This is a short sentence.").
Returns a string consisting of the number of characters in each word (including punctuation), with spaces separating the numbers. (e.g., "4 2 1 5 9").
I wrote the following program:
int main()
{
char* output;
char *input = "My name is Pranay Godha";
output = numChar(input);
printf("output : %s",output);
getch();
return 0;
}
char* numChar(char* str)
{
int len = strlen(str);
char* output = (char*)malloc(sizeof(char)*len);
char* out = output;
int count = 0;
while(*str != '\0')
{
if(*str != ' ' )
{
count++;
}
else
{
*output = count+'0';
output++;
*output = ' ';
output++;
count = 0;
}
str++;
}
*output = count+'0';
output++;
*output = '\0';
return out;
}
I was just wondering that I am allocating len amount of memory for output string which I feel is more than I should have allocated hence there is some wasting of memory. Can you please tell me what can I do to make it more memory efficient?
I see lots of little bugs. If I were your instructor, I'd grade your solution at "C-". Here's some hints on how to turn it into "A+".
char* output = (char*)malloc(sizeof(char)*len);
Two main issues with the above line. For starters, you are forgetting to "free" the memory you allocate. But that's easily forgiven.
Actual real bug. If your string was only 1 character long (e.g. "x"), you would only allocate one byte. But you would likely need to copy two bytes into the string buffer. a '1' followed by a null terminating '\0'. The last byte gets copied into invalid memory. :(
Another bug:
*output = count+'0';
What happens when "count" is larger than 9? If "count" was 10, then *output gets assigned a colon, not "10".
Start by writing a function that just counts the number of words in a string. Assign the result of this function to a variable called num_of_words.
You could very well have words longer than 9 characters, so some words will need two or more digits of output. You also need to account for the "space" between each number. And don't forget the trailing "null" byte.
If you think about the case in which a 1-byte unsigned integer can have at most 3 chars in a string representation ('0'..'255') not including the null char or negative numbers, then sizeof(int)*3 is a reasonable estimate of the maximum string length for an integer representation (not including a null char). As such, the amount of memory you need to alloc is:
num_of_words = countWords(str);
num_of_spaces = (num_of_words > 0) ? (num_of_words - 1) : 0;
output = malloc(num_of_spaces + sizeof(int)*3*num_of_words + 1); // +1 for null char
So that's a generous memory allocation estimate, but it will definitely allocate enough memory in all scenarios.
I think you have a few other bugs in your program. For starters, if there are multiple spaces between each word e.g.
"my    gosh"
I would expect your program to print "2 4". But your code prints something else. Likely other bugs exist if there are leading or trailing spaces in your string. And the memory allocation estimate doesn't account for the extra garbage chars you are inserting in those cases.
Update:
Given that you have persevered and attempted to make a better solution in your answer below, I'm going to give you a hint. I have written a function that PRINTs the length of all words in a string. It doesn't actually allocate a string. It just prints it - as if someone had called "printf" on the string that your function is to return. Your job is to extrapolate how this function works - and then modify it to return a new string (that contains the integer lengths of all the words) instead of just having it print. I would suggest you modify the main loop in this function to keep a running total of the word count. Then allocate a buffer of size = (word_count * 4 *sizeof(int) + 1). Then loop through the input string again to append the length of each word into the buffer you allocated. Good luck.
void PrintLengthOfWordsInString(const char* str)
{
if ((str == NULL) || (*str == '\0'))
{
return;
}
while (*str)
{
int count = 0;
// consume leading white space
while ((*str) && (*str == ' '))
{
str++;
}
// count the number of consecutive non-space chars
while ((*str) && (*str != ' '))
{
count++;
str++;
}
if (count > 0)
{
printf("%d ", count);
}
}
printf("\n");
}
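For readers who want to check their own attempt at the exercise, one possible shape of the string-returning variant is sketched below. It is not the answerer's solution: word_lengths is a made-up name, the buffer budget is the word_count * 4 * sizeof(int) + 1 figure suggested above, and snprintf is used so that a write can never run past that budget.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string such as "4 2 1 5 9",
 * or NULL on allocation failure or overflow. The caller frees the result. */
static char *word_lengths(const char *str)
{
    assert(str != NULL);

    /* Pass 1: count the words (runs of non-space characters). */
    size_t words = 0;
    for (const char *p = str + strspn(str, " "); *p; p += strspn(p, " ")) {
        words++;
        p += strcspn(p, " ");
    }

    size_t cap = words * 4 * sizeof(int) + 1;   /* budget from the answer above */
    char *out = malloc(cap);
    if (out == NULL)
        return NULL;
    out[0] = '\0';

    /* Pass 2: append each word length, separated by single spaces. */
    size_t used = 0;
    for (const char *p = str + strspn(str, " "); *p; p += strspn(p, " ")) {
        size_t count = strcspn(p, " ");
        int n = snprintf(out + used, cap - used, used ? " %zu" : "%zu", count);
        if (n < 0 || (size_t)n >= cap - used) {  /* would not fit: give up */
            free(out);
            return NULL;
        }
        used += (size_t)n;
        p += count;
    }
    return out;
}
```

Leading, trailing and repeated spaces are skipped, so an input like "  my    gosh " should yield "2 4".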
The answer is: it depends. There are trade-offs.
Yes, it's possible to write some extra code that, before performing this action, counts the number of words in the original string and then allocates the new string based on the number of words rather than the number of characters.
But is it worth it? The extra code would make your program longer. That is, you would have more binary code, taking up more memory, which may be more than you gain. In addition, it will take more time to run.
By the way, you have a memory leak in your program, which is more of a problem.
As long as none of the words in the sentence are longer than 9 characters, the length of your output array needs only to be the number of words in the sentence, multiplied by 2 (to account for the spaces), plus an extra one for the null terminator.
So for the string
My name is Pranay Godha
...you need only an array of length 11.
If any of the words are ten characters or more, you'll need to calculate how many extra char your array will need by determining the length of the numeric required. (e.g. a word of length 10 characters clearly requires two char to store the number 10.)
The real question is, is all of this worth it? Unless you're specifically required (homework?) to use the minimal space required in your output array, I'd be minded to allocate a suitably large array and perform some bounds checking when writing to it.

How does this clean up the empty space in an array?

void empty_spaces(char array[]){
int j=0,i=0,n=0;
n=strlen(array);
while(i<n){
if(array[i]==' '){
j=i;
while(j<n){
array[j]=array[j+1];
++j;
}
--n;
}else
++i;
}
if(n>15)
n=15;
array[n]='\0';
}
Could someone explain this code to me? This function removes the spaces from the array, but could someone explain exactly how it works?
It's a rather flabby attempt at a function that removes spaces from a string. The problem with the code is that it has gratuitous iteration and it turns an O(n) algorithm into an O(n^2) algorithm.
Rather than trying to understand the code you have I feel it is best to do it the efficient and simple way. Like this.
void empty_spaces(char str[])
{
char *src = str;
char *dst = str;
while (*src)
{
if (*src != ' ')
{
*dst = *src;
dst++;
}
src++;
}
*dst = '\0';
}
We perform a single pass across the string with two pointers, src and dst. When a non-space character is encountered it is copied from source to destination. Maintaining two separate pointers into the array avoids the spurious iteration from your code.
I ignored the n>15 part of your code. The effect of that is that the string is always truncated to have length no greater than 15 characters, but quite why that would be done is mysterious to me. It surely shouldn't be mixed up in this function.
Since I've not really answered the question as asked, but since I hope this is useful to you, I have made the answer community wiki.
A rewritten and commented version of the above:
//....
n = strlen(array); // n is the number of characters in the array up to the final 0
while (i < n) {
if (array[i] != ' ') { // not a space
i++; // next char,
continue; // continue
}
j = i; // j is the current array index
while (j < n) { // while there are chars left...
array[j] = array[j+1]; // copy the next character into the current index
j++;
}
n--; // and remove one from the string len since a space is removed
}
The code after that limits the string length to 15 before returning.
So this code removes spaces and possibly truncates the string to 15 chars only.
It loops over every character in the array, removing all ' ' (space) characters. The inner loop is what does the erasing. When the outer loop finds a space character, the inner loop "shifts" the rest of the array to the left one index, overwriting the space.
Basically, as soon as the loop encounters ' ' character (space), it moves all the elements of array one place ' left ', therefore replacing the space with the following character.

Fast C comparison

As part of a protocol I'm receiving C string of the following format:
WORD * WORD
Where both WORDs are the same given string.
And, * - is any string of printable characters, NOT including spaces!
So the following are all legal:
WORD asjdfnkn WORD
WORD 234kjk2nd32jk WORD
And the following are illegal:
WORD akldmWORD
WORD asdm zz WORD
NOTWORD admkas WORD
NOTWORD admkas NOTWORD
Where (1) is missing a trailing space; (2) has 3 or more spaces; (3)/(4) do not open/end with the correct string (WORD).
Of course this could be implemented in a pretty straightforward way, but I'm not sure that what I'm doing is the most efficient.
Note: WORD is pre-set for a whole run, but could change from run to run.
Currently I'm strncmping each string against "WORD ".
If that checks out, I manually (char-by-char) run over the string to check for the second space char.
[If found] I then strcmp (all the way) with "WORD".
Would love to hear your solution, with an emphasis on efficiency, as I'll be running over millions of these in real time.
I'd say, have a look at the algorithms in Handbook of Exact String-Matching Algorithms, compare the complexities and choose the one that you like best, implement it.
Or you can use some ready-made implementations.
You have some really classical algorithms for searching strings inside another string here:
KMP(Knuth-Morris-Pratt)
Rabin-Karp
Boyer-Moore
Hope this helps :)
Have you profiled?
There's not much gain to be had here, since you're doing basic string comparisons. If you want to go for the last few percent of performance, I'd change out the str... functions for mem... functions.
char *bufp, *bufe; // pointer to buffer, one past end of buffer
if (bufe - bufp < wordlen * 2 + 2)
error();
if (memcmp(bufp, word, wordlen) || bufp[wordlen] != ' ')
error();
bufp += wordlen + 1;
char *datap = bufp;
char *datae = memchr(bufp, ' ', bufe - bufp); // search only what is left of the buffer
if (!datae || bufe - datae < wordlen + 1)
error();
if (memcmp(datae + 1, word, wordlen))
error();
// Your data is in the range [datap, datae).
The performance gains are likely less than spectacular. You have to examine each character in the buffer since each character could be a space, and any character in the delimiters could be wrong. Changing a loop to memchr is slick, but modern compilers know how to do that for you. Changing a strncmp or strcmp to memcmp is also probably going to be negligible.
There is probably a tradeoff to be made between the shortest code and the fastest implementation. Choices are:
1. The regular expression ^WORD \S+ WORD$ (requires a regex engine)
2. strchr on "WORD " and a strrchr on " WORD" with a lot of messy checks (not really recommended)
3. Walking the whole string character by character, keeping track of the state you are in (scanning first word, scanning first space, scanning middle, scanning last space, scanning last word, expecting end of string).
Option 1 requires the least code but backtracks near the end, and Option 2 has no redeeming qualities. I think you can do option 3 elegantly. Use a state variable and it will look okay. Remember to manually enter the last two states based on the length of your word and the length of your overall string and this will avoid the backtracking that a regex will most likely have.
Do you know how long the string that is to be checked is? If not, you are somewhat limited in what you can do. If you do know how long the string is, you can speed things up a bit. You have not specified for sure that the '*' part has to be at least one character. You've also not stipulated whether tabs are allowed, or newlines, or ... is it only alphanumerics (as in your examples) or are punctuation and other characters allowed? Control characters?
You know how long WORD is, and can pre-construct both the start and end markers. The function error() reports an error (however you need it to be reported) and returns false. The test function might be bool string_is_ok(const char *string, int actstrlen);, returning true on success and false when there is a problem:
// Preset variables characterizing the search
// An enum is used because C requires constant initializers at file scope.
enum {
wordlen = 4,
marklen = wordlen + 1,
minstrlen = 2 * marklen + 1 // Two blanks and one other character.
};
static char bword[] = "WORD "; // Start marker
static char eword[] = " WORD"; // End marker
static char verboten[] = " "; // Forbidden characters
bool string_is_ok(const char *string, int actstrlen)
{
if (actstrlen < minstrlen)
return error("string too short");
if (strncmp(string, bword, marklen) != 0)
return error("string does not start with WORD");
if (strcmp(string + actstrlen - marklen, eword) != 0)
return error("string does not finish with WORD");
if (strcspn(string + marklen, verboten) != actstrlen - 2 * marklen)
return error("string contains verboten characters");
return true;
}
You probably can't reduce the tests by much if you want your guarantees. The part that would change most depending on the restrictions in the alphabet is the strcspn() line. That is relatively fast for a small list of forbidden characters; it will likely be slower as the number of characters forbidden is increased. If you only allow alphanumerics, you have 62 OK and 193 not OK characters, unless you count some of the high-bit set characters as alphabetic too. That part will probably be slow. You might do better with a custom function that takes a start position and length and reports whether all characters are OK. This could be along the lines of:
#include <stdbool.h>
static bool ok_chars[256] = { false };
static void init_ok_chars(void)
{
const unsigned char *ok = (const unsigned char *)"abcdefghijklmnopqrstuvwxyz...0123456789";
int c;
while ((c = *ok++) != 0)
ok_chars[c] = true;
}
static bool all_chars_ok(const char *check, int numchars)
{
for (int i = 0; i < numchars; i++)
if (!ok_chars[(unsigned char)check[i]]) // cast: plain char may be signed
return false;
return true;
}
You can then use:
return all_chars_ok(string + marklen, actstrlen - 2 * marklen);
in place of the call to strcspn().
If your "stuffing" should contain only '0'-'9', 'A'-'Z' and 'a'-'z', and is in some encoding based on ASCII (like most Unicode-based encodings), then you can skip two comparisons in one of your loops, since only one bit differs between uppercase and lowercase letters.
Instead of
(ch>='0' && ch<='9') || (ch>='A' && ch<='Z') || (ch>='a' && ch<='z')
you get
ch2 = ch & ~('a' ^ 'A');
(ch>='0' && ch<='9') || (ch2>='A' && ch2<='Z')
But you had better look at the assembler code your compiler generates and do some benchmarking; depending on computer architecture and compiler, this trick could give slower code.
If branching is expensive compared to comparisons on your computer, you can also replace the && with &. But most modern compilers know this trick in most situations.
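The case-folding trick can be wrapped in a small function to make it easy to benchmark against a plain three-range test. This is a sketch only: the name is_ascii_alnum is made up, and an ASCII-based encoding is assumed (the assert documents that assumption).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true for '0'-'9', 'A'-'Z' and 'a'-'z' in ASCII. */
static bool is_ascii_alnum(unsigned char ch)
{
    /* The trick relies on 'a' and 'A' differing in exactly bit 0x20. */
    assert(('a' ^ 'A') == 0x20);

    /* Clearing that bit maps lowercase letters onto uppercase ones. */
    unsigned char ch2 = (unsigned char)(ch & ~('a' ^ 'A'));
    return (ch >= '0' && ch <= '9') || (ch2 >= 'A' && ch2 <= 'Z');
}
```

Digits are tested on the unmodified ch, because clearing bit 0x20 moves them out of the '0'-'9' range.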
If, on the other hand, you test for any printable glyph from some large character encoding, then it is most likely less expensive to test for white-space glyphs rather than printable glyphs.
Also, compile specifically for the computer that the code will run on, and don't forget to turn off any generation of debugging code.
Added:
Don't make subroutine calls within your scan loops, unless it is worth it.
Whatever trick you use to speed up your loops, its benefit will diminish if you have to make a subroutine call within one of them. It is fine to use built-in functions that your compiler inlines into your code, but if you use something like an external regex library and your compiler is unable to inline those functions (gcc can do that, sometimes, if you ask it to), then making that subroutine call will shuffle a lot of memory around, in the worst case between different types of memory (registers, CPU buffers, RAM, hard disk, etc.), and may mess up CPU predictions and pipelines. Unless your text snippets are very long, so that you spend much time parsing each of them, and the subroutine is effective enough to compensate for the cost of the call, don't do that. Some parsing functions use callbacks; that might be more effective than making a lot of subroutine calls from your loops (since the function can scan several pattern matches in one sweep and bunch several callbacks together outside the critical loop), but that depends on how someone else has written that function, and basically it is the same thing as you making the call.
WORD is 4 characters, with uint32_t you could do a quick comparison. You will need a different constant depending on system endianness. The rest seems to be fine.
Since WORD can change you have to precalculate the uint32_t, uint64_t, ... you need depending on the length of the WORD.
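A rough sketch of that precalculation for a 4-byte WORD follows. The helper names load32 and starts_with_key are made up. Building the key with memcpy from the word itself, on the machine that does the comparing, is one way to sidestep the endianness-specific constant, since both sides are loaded the same way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 4 bytes as one integer. memcpy avoids alignment
 * and aliasing problems; compilers typically turn it into a single load. */
static uint32_t load32(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: does buf (of length len) start with the 4-byte word
 * whose precomputed key is key? */
static bool starts_with_key(const char *buf, size_t len, uint32_t key)
{
    assert(buf != NULL);
    return len >= sizeof key && load32(buf) == key;
}
```

The key would be computed once per run, for example with load32("WORD"), and then reused for every incoming string; whether this beats memcmp is something only a benchmark on the target machine can tell.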
Not sure from the description, but if you trust the source you could just chomp the first n+1 and last n+1 characters.
bool check_legal(
const char *start, const char *end,
const char *delim_start, const char *delim_end,
const char **content_start, const char **content_end
) {
const size_t delim_len = delim_end - delim_start;
const char *p = start;
if (start + delim_len + 1 + 0 + 1 + delim_len > end) // input too short to be legal
return false;
if (memcmp(p, delim_start, delim_len) != 0)
return false;
p += delim_len;
if (*p != ' ')
return false;
p++;
*content_start = p;
while (p < end - 1 - delim_len && *p != ' ')
p++;
if (p + 1 + delim_len != end)
return false;
*content_end = p;
p++;
if (memcmp(p, delim_start, delim_len) != 0)
return false;
return true;
}
And here is how to use it:
const char *line = "who is who";
const char *delim = "who";
const char *start, *end;
if (check_legal(line, line + strlen(line), delim, delim + strlen(delim), &start, &end)) {
printf("this %.*s nice\n", (int) (end - start), start); // %.*s limits the length printed
}
(It's all untested.)
Using the STL, count the number of spaces; if there are not exactly two, the string is obviously wrong. Using find (from the algorithm header) you can get the positions of the two spaces and the middle word. Check for WORD at the beginning and the end, and you are done.
This should return the true/false condition in O(n) time
int sameWord(char *str)
{
char *word1, *word2;
word1 = word2 = str;
// Word1, Word2 points to beginning of line where the first word is found
while (*word2 && *word2 != ' ') ++word2; // skip to first space
if (*word2 == ' ') ++word2; // skip space
// Word1 points to first word, word2 points to the middle-filler
while (*word2 && *word2 != ' ') ++word2; // skip to second space
if (*word2 == ' ') ++word2; // skip space
// Word1 points to first word, word2 points to the second word
// Now just compare that word1 and word2 point to identical strings.
while (*word1 != ' ' && *word2)
if (*word1++ != *word2++) return 0; //false
return *word1 == ' ' && (*word2 == 0 || *word2 == ' ');
}

Resources