c string: put ' ' if a word found in the sentence - c

I wrote some code whose goal is to put a space wherever the input word is found in a sentence.
I need to replace the short word with a space.
For example:
Three witches watched three watches
tch
output:
Three wi es wa ed three wa es
I made this code:
#include<stdio.h>
#define S 8
#define B 50
void main() {
char small[S] = {"ol"};
char big[B] = {"my older gradmom see my older sister"};
int i = 0, j = 0;
for (i = 0; i < B; i++)
{
for(j=0;j<S;j++)
{
if(small[j]!=big[i])
{
j=0;
break;
}
if(small[j]=='\0')
{
while (i-(j-1)!=i)
{
i = i - j;
big[i] = '\n';
i++;
}
}
}
}
puts(big);
}

First of all, in your example you write a newline '\n', not a space.
Consider this simple example:
#include<stdio.h>
#define S 8
#define B 50
void main() {
char small[S] = {"ol"};
char big[B] = {"my older gradmom see my older sister"};
int i = 0, j = 0;
int cpt = 0;
int smallSize = 0;
// loop to retrieve smallSize
for (i = 0; i < S; i++)
{
if (small[i] != '\0')
smallSize++;
}
// main loop
for (i = 0; i < B; i++)
{
// stop if we hit the end of the string
if (big[i] == '\0')
break;
// increment the cpt and small index while the content of big and small are equal
if (big[i] == small[j])
{
cpt++;
j++;
}
// we did not find the full small word
else
{
j = 0;
cpt = 0;
}
// test if we found the full word, if yes replace char in big by space
if (cpt == smallSize)
{
for (int k = 0; k < smallSize; k++)
{
big[i-k] = ' ';
}
j = 0;
cpt = 0;
}
}
puts(big);
}
You first need to determine the actual length of the string stored in small (a call to strlen(small) would do the same job as the loop).
Once that is done, the next step is to scan big for the word held in small. If we find it, every character of the match is replaced by a space.
If you want to replace the whole small word with a single space, you will need to adapt this example.
I hope this helps!
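A minimal sketch of that adaptation, assuming the standard strstr() and memmove() functions are acceptable. The helper name replace_with_space is hypothetical and does not come from the answers; it collapses each match to one space and shifts the rest of the string left.

```c
#include <assert.h>
#include <string.h>

/* Replace every occurrence of `word` in `text` with a single space, in place.
   `replace_with_space` is a hypothetical helper name used for illustration. */
void replace_with_space(char *text, const char *word)
{
    assert(text != NULL && word != NULL);
    size_t n = strlen(word);
    if (n == 0)
        return;                       /* nothing to search for */

    char *p = text;
    while ((p = strstr(p, word)) != NULL) {
        *p = ' ';                     /* the match collapses to one space */
        /* move the tail, including the '\0', left over the rest of the match */
        memmove(p + 1, p + n, strlen(p + n) + 1);
        p++;                          /* resume right after the space */
    }
}
```

Calling replace_with_space(big, "tch") on the sentence from the question should give "Three wi es wa ed three wa es". The array passed in must be writable, so a string literal cannot be used directly.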

One possible way is to use two indices into the string, one for reading and one for writing. This allows replacing an arbitrary number of chars (the ones from small) with a single space. You also do not really want nested loops; a single loop that processes every char from big is enough.
Last but not least, void main() should not be used except in a freestanding environment (kernel or embedded development). The code could become (note that this simple matcher starts over after a failed partial match, so it misses a match that begins inside the failed one, such as "ol" in "ool"):
#include <stdio.h>
#define S 8
#define B 50
int main(void) { // a hosted program should declare main as returning int
char small[S] = {"ol"};
char big[B] = {"my older gradmom see my older sister"};
int i = 0, j = 0;
int k = 0; // pointer to written back big
for (i = 0; i < B; i++)
{
if (big[i] == 0) break; // do not process beyond end of string
if(small[j]!=big[i])
{
for(int l=0; l<j; l++) big[k++] = small[l]; // copy an eventual partial small
big[k++] = big[i]; // copy the incoming character
j=0; // reset pointer to small
continue;
}
else if(small[++j] == 0) // reached end of small
{
big[k++] = ' '; // replace chars from small with a single space
j = 0; // reset pointer to small
}
}
big[k] = '\0';
puts(big);
return 0;
}
or even better (no need for fixed sizes of strings):
#include <stdio.h>
int main(void) { // a hosted program should declare main as returning int
char small[] = {"ol"};
char big[] = {"my older gradmom see my older sister"};
int i = 0, j = 0;
int k = 0; // pointer to written back big
for (i = 0; i < sizeof(big); i++)
{
if(small[j]!=big[i])
...

In C, strings are terminated with a null character '\0'. Your code defines somewhat arbitrary sizes at the beginning (B and S) and iterates over that many characters instead of the number of characters the strings actually contain. You can use the fact that the string is terminated by testing the current character in the condition of a while loop.
i = 0;
while (str[i]) {
...
i = i + 1;
}
If you prefer for loops you can write it also as a for loop.
for (i = 0; str[i]; i++) {
...
}
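As a small illustration of that loop shape, here is a sketch that walks a string until the terminating '\0' and counts one character. The function name count_char is an assumption made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Count how often `c` occurs in the null-terminated string `str`.
   `count_char` is a hypothetical name used for illustration. */
size_t count_char(const char *str, char c)
{
    assert(str != NULL);
    size_t count = 0;
    for (size_t i = 0; str[i]; i++) {   /* stops at the terminating '\0' */
        if (str[i] == c)
            count++;
    }
    return count;
}
```

No size constant such as B or S is needed, because the loop condition str[i] becomes false exactly at the end of the string.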
Your code does not move the contents of the remaining string to the left. If you replace the two characters "ol" with one space character, you have to move the remaining characters to the left by one position. Otherwise you would leave a hole in the string.
#include <stdio.h>
int main() {
char small[] = "ol";
char big[] = "my older gradmom see my older sister";
int s; // index, which loops through the small string
int b; // index, which loops through the big string
int m; // index, which loops through the characters to be modified
// The following loops through the big string up to the terminating
// null character in the big string.
b = 0;
while (big[b]) {
// The following loops through the small string up to the
// terminating null character, if the character in the small
// string matches the corresponding character in the big string.
s = 0;
while (small[s] && big[b+s] == small[s]) {
// In case of a match, continue with the next character in the
// small string.
s = s + 1;
}
// If we are at the end of the small string, we found it in the
// big string.
if (small[s] == '\0') {
// Now we have to modify the big string. The modification
// starts at the current position in the big string.
m = b;
// First we have to put the space at the current position in the
// big string.
big[m] = ' ';
// And next the rest of the big string has to be moved left. The
// rest of the big string starts, where the match has ended.
while (big[b+s]) {
m = m + 1;
big[m] = big[b+s];
s = s + 1;
}
// Finally the big string has to be terminated by a null
// character.
big[m+1] = '\0';
}
// Continue at next character in big string.
b = b + 1;
}
puts(big);
return 0;
}

Related

C allocation memory error. Don't find something like this

Could you help, please?
When I execute this code I get this output:
AAAAABBBBBCCCCCBBBBBCOMP¬ıd┐╔ LENGTH 31
There are some weird characters after the letters, even though I allocated only 21 bytes.
#include <stdio.h>
#include <stdlib.h>
char * lineDown(){
unsigned short state[4] = {0,1,2,1};
char decorationUp[3][5] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
char * deco = malloc(21);
int k;
int p = 0;
for(int j = 0; j < 4; j++){
k = state[j];
for(int i = 0; i < 5; i++){
*(deco+p) = decorationUp[k][i];
p++;
}
}
return deco;
}
int main(void){
char * lineDOWN = lineDown();
int k = 0;
char c;
do{
c = *(lineDOWN+k);
printf("%c",*(lineDOWN+k));
k++;
}while(c != '\0');
printf("LENGTH %d\n\n",k);
}
The function does not build a string, because the result array does not contain the terminating zero, even though space for it was reserved when the array was allocated.
char * deco = malloc(21);
So you need to append the terminating zero to the array before exiting the function
//...
*(deco + p ) = '\0';
return deco;
}
Otherwise this do-while loop
do{
c = *(lineDOWN+k);
printf("%c",*(lineDOWN+k));
k++;
}while(c != '\0');
will have undefined behavior.
But even if you append the terminating zero, the loop will report the length of the stored string incorrectly, because it increments the variable k even when the current character is the terminating zero.
Instead you should use a while loop. The variable c then becomes redundant. The loop can look like
while ( *( lineDOWN + k ) )
{
printf("%c",*(lineDOWN+k));
k++;
}
In this case this call
printf("\nLENGTH %d\n\n",k);
^^
will output the correct length of the string equal to 20.
And you should free the allocated memory before exiting the program
free( lineDOWN );
Some other answers here say that the array decorationUp must be declared like
char decorationUp[3][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
That is not necessary as long as you do not use the elements of the array as strings, and your program does not use them as strings.
Take into account that your program is full of magic numbers. Such a program is usually error-prone. Instead you should use named constants.
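A sketch of what named constants could look like here, assuming the same three decorations and the same state table as in the question. The names PATTERN_LEN, STATE_COUNT and line_down are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PATTERN_LEN 5   /* characters per decoration (assumed name) */
#define STATE_COUNT 4   /* entries in the state table (assumed name) */

/* Returns a malloc'ed, null-terminated string; the caller must free() it.
   Returns NULL if the allocation fails. */
char *line_down(void)
{
    static const unsigned short state[STATE_COUNT] = {0, 1, 2, 1};
    static const char decoration[][PATTERN_LEN + 1] = {"AAAAA", "BBBBB", "CCCCC"};

    char *deco = malloc(STATE_COUNT * PATTERN_LEN + 1);   /* +1 for '\0' */
    if (deco == NULL)
        return NULL;

    size_t p = 0;
    for (size_t j = 0; j < STATE_COUNT; j++) {
        memcpy(deco + p, decoration[state[j]], PATTERN_LEN);
        p += PATTERN_LEN;
    }
    assert(p == STATE_COUNT * PATTERN_LEN);
    deco[p] = '\0';                    /* makes the result a real C string */
    return deco;
}
```

With the sizes derived from the constants, changing the pattern length or the number of states only requires touching one place.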
In
char decorationUp[3][5] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
each string needs 6 characters to also hold the null character. In your case you do not use them as 'standard' strings but only as arrays of char, so the code works, but to get into the habit of always reserving room for the terminating null character
you can do
char decorationUp[3][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
Note that it is unnecessary to give the first dimension; the compiler counts the initializers for you.
Because in main you stop when you read the null character, you also need to place it at the end of deco. You allocated 21 bytes, which leaves room for it, but you never wrote it, so the loop in main reads past the initialized characters and eventually past the allocated block, which is undefined behavior.
Writing *(deco+p) is hard to read; prefer deco[p].
So for instance:
char * lineDown(){
unsigned short state[] = {0,1,2,1};
char decorationUp[][6] = {
{"AAAAA"},{"BBBBB"},{"CCCCC"}
};
char * deco = malloc(4*5 + 1); /* a formula to explain why 21 is better than 21 directly */
int k;
int p = 0;
for(int j = 0; j < 4; j++){
k = state[j];
for(int i = 0; i < 5; i++){
deco[p] = decorationUp[k][i];
p++;
}
}
deco[p] = 0;
return deco;
}

Count of similar characters without repetition, in two strings

I have written a C program to find out the number of similar characters between two strings. If a character is repeated again it shouldn't count it.
Like if you give an input of
everest
every
The output should be
3
Because the four letters "ever" are identical, but the repeated "e" does not increase the count.
For the input
apothecary
panther
the output should be 6, because of "apther", not counting the second "a".
My code seems bulky for such a short task. My code is
#include<stdio.h>
#include <stdlib.h>
int main()
{
char firstString[100], secondString[100], similarChar[100], uniqueChar[100] = {0};
fgets(firstString, 100, stdin);
fgets(secondString, 100, stdin);
int firstStringLength = strlen(firstString) - 1, secondStringLength = strlen(secondString) - 1, counter, counter1, count = 0, uniqueElem, uniqueCtr = 0;
for(counter = 0; counter < firstStringLength; counter++) {
for(counter1 = 0; counter1 < secondStringLength; counter1++) {
if(firstString[counter] == secondString[counter1]){
similarChar[count] = firstString[counter];
count++;
break;
}
}
}
for(counter = 0; counter < strlen(similarChar); counter++) {
uniqueElem = 0;
for(counter1 = 0; counter1 < counter; counter1++) {
if(similarChar[counter] == uniqueChar[counter1]) {
uniqueElem++;
}
}
if(uniqueElem == 0) {
uniqueChar[uniqueCtr++] = similarChar[counter];
}
}
if(strlen(uniqueChar) > 1) {
printf("%d\n", strlen(uniqueChar));
printf("%s", uniqueChar);
} else {
printf("%d",0);
}
}
Can someone please provide me some suggestions or code for shortening this function?
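You should have two arrays to keep a count of the number of occurrences of each letter of the alphabet.
int arrayCount1[26] = {0}, arrayCount2[26] = {0};
Loop through the strings and store the occurrences.
Now, to count the number of similar characters, use:
for( int i = 0 ; i < 26 ; i++ ){
similarCharacters = similarCharacters + min( arrayCount1[i], arrayCount2[i] );
}
Note that min() is not a standard C function, so you have to define it yourself, and that this sum counts repeated letters. To ignore repetitions as the question asks, add 1 whenever both counts are nonzero instead.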
There is a simple way to go. Take an array and use the character code as an index into that array, say int arr[256]={0};
Now, for every character you see in string 1, mark it with 1: arr[string1[i]]=1; This records which characters appeared in the first string.
Then, when looping through the characters of string 2, increase arr[string2[i]] only if its current value is 1. This records that the character appeared in the second string as well, and because the value is then 2, later repetitions are ignored.
Now count how many positions of the array contain 2. That is the answer.
int arr[256]={0};
for(counter = 0; counter < firstStringLength; counter++)
arr[(unsigned char)firstString[counter]]=1; // cast: plain char may be negative
for(counter = 0; counter < secondStringLength; counter++)
if(arr[(unsigned char)secondString[counter]]==1)
arr[(unsigned char)secondString[counter]]++;
int ans = 0;
for(int i = 0; i < 256; i++)
ans += (arr[i]==2);
Here is a simplified approach to achieve your goal. Create an array to hold the characters that have been seen for the first time.
Then make two loops. The first runs over the whole second string; the second runs only if a flag variable shows that the end of the first string was not reached during the first loop, that is, the first string is longer than the second.
Of course, the check for the end of the first string belongs inside the first loop. You can use the strchr() function to count the common characters without repetition:
#include <stdio.h>
#include <string.h>
int foo(const char *s1, const char *s2);
int main(void)
{
printf("count: %d\n", foo("everest", "every"));
printf("count: %d\n", foo("apothecary", "panther"));
printf("count: %d\n", foo("abacus", "abracadabra"));
return 0;
}
int foo(const char *s1, const char *s2)
{
int condition = 0;
int count = 0;
size_t n = 0;
char buf[256] = { 0 };
// part 1
while (s2[n])
{
if (strchr(s1, s2[n]) && !strchr(buf, s2[n]))
{
buf[count++] = s2[n];
}
if (!condition && !s1[n]) { // once the end of s1 is seen, never read s1[n] again
condition = 1;
}
n++;
}
// part 2
if (!condition ) {
while (s1[n]) {
if (strchr(s2, s1[n]) && !strchr(buf, s1[n]))
{
buf[count++] = s1[n];
}
n++;
}
}
return count;
}
NOTE: You should check for buffer overflow, and you should use a dynamic approach to reallocate memory accordingly, but this is a demo.

code accounting for multiple delimiters isn't working

I have a program I wrote to take a string of words and, based on the delimiter that appears, separate each word and add it to an array.
I've adjusted it to account for either a ' ', ',' or '.'. Now the goal is to adjust for multiple delimiters appearing together (as in "the dog,,,was walking") and still only add the word. While my program works, and it doesn't print out extra delimiters, every time it encounters additional delimiters, it includes a space in the output instead of ignoring them.
int main(int argc, const char * argv[]) {
char *givenString = "USA,Canada,Mexico,Bermuda,Grenada,Belize";
int stringCharCount;
//get length of string to allocate enough memory for array
for (int i = 0; i < 1000; i++) {
if (givenString[i] == '\0') {
break;
}
else {
stringCharCount++;
}
}
// counting # of commas in the original string
int commaCount = 1;
for (int i = 0; i < stringCharCount; i++) {
if (givenString[i] == ',' || givenString[i] == '.' || givenString[i] == ' ') {
commaCount++;
}
}
//declare blank Array that is the length of commas (which is the number of elements in the original string)
//char *finalArray[commaCount];
int z = 0;
char *finalArray[commaCount] ;
char *wordFiller = malloc(stringCharCount);
int j = 0;
char current = ' ';
for (int i = 0; i <= stringCharCount; i++) {
if (((givenString[i] == ',' || givenString[i] == '\0' || givenString[i] == ',' || givenString[i] == ' ') && (current != (' ' | '.' | ',')))) {
finalArray[z] = wordFiller;
wordFiller = malloc(stringCharCount);
j=0;
z++;
current = givenString[i];
}
else {
wordFiller[j++] = givenString[i];
}
}
for (int i = 0; i < commaCount; i++) {
printf("%s\n", finalArray[i]);
}
return 0;
}
This program took me hours and hours to get together (with help from more experienced developers) and I can't help but get frustrated. I'm using the debugger to my best ability but definitely need more experience with it.
/////////
I went back to pad and paper and kind of rewrote my code. Now I'm trying to store delimiters in an array and compare the elements of that array to the current string value. If they are equal, then we have come across a new word and we add it to the final string array. I'm struggling to figure out the placement and content of the "for" loop that I would use for this.
char * original = "USA,Canada,Mexico,Bermuda,Grenada,Belize";
//creating two intialized variables to count the number of characters and elements to add to the array (so we can allocate enough mmemory)
int stringCharCount = 0;
//by setting elementCount to 1, we can account for the last word that comes after the last comma
int elementCount = 1;
//calculate value of stringCharCount and elementCount to allocate enough memory for temporary word storage and for final array
for (int i = 0; i < 1000; i++) {
if (original[i] == '\0') {
break;
}
else {
stringCharCount++;
if (original[i] == ',') {
elementCount++;
}
}
}
//account for the final element
elementCount = elementCount;
char *tempWord = malloc(stringCharCount);
char *finalArray[elementCount];
int a = 0;
int b = 0;
//int c = 0;
//char *delimiters[4] = {".", ",", " ", "\0"};
for (int i = 0; i <= stringCharCount; i++) {
if (original[i] == ',' || original[i] == '\0') {
finalArray[a] = tempWord;
tempWord = malloc(stringCharCount);
tempWord[b] = '\0';
b = 0;
a++;
}
else {
tempWord[b++] = original[i];
}
}
for (int i = 0; i < elementCount; i++) {
printf("%s\n", finalArray[i]);
}
return 0;
}
Many issues. I suggest dividing the code into small pieces and debugging those first.
--
Uninitialized data.
// int stringCharCount;
int stringCharCount = 0;
...
stringCharCount++;
Or
int stringCharCount = strlen(givenString);
Other problems too: the strings stored in finalArray[] are never given a terminating null character, yet printf("%s\n", finalArray[i]); is used.
Unclear use of char *
char *wordFiller = malloc(stringCharCount);
wordFiller = malloc(stringCharCount);
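One way to treat a run of delimiters as a single separator is to alternate between strspn() (skip delimiters) and strcspn() (measure the next word). The following is only a sketch under that assumption; the function name split_words and its parameters are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split `text` on any character in `delims`, treating runs of delimiters as one.
   Stores up to `max_words` malloc'ed copies in `words` and returns how many
   were stored. The caller frees each word. (Hypothetical helper.) */
size_t split_words(const char *text, const char *delims, char **words, size_t max_words)
{
    assert(text != NULL && delims != NULL && words != NULL);
    size_t count = 0;
    const char *p = text;

    while (count < max_words) {
        p += strspn(p, delims);            /* skip a run of delimiters */
        if (*p == '\0')
            break;
        size_t len = strcspn(p, delims);   /* length of the next word */
        char *word = malloc(len + 1);      /* +1 for the terminating '\0' */
        if (word == NULL)
            break;
        memcpy(word, p, len);
        word[len] = '\0';
        words[count++] = word;
        p += len;
    }
    return count;
}
```

For "the dog,,,was walking" with the delimiter set " ,." this should store the four words the, dog, was and walking, with no empty entries for the extra commas.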
There are more bugs than lines in your code.
I'd suggest you start with something much simpler.
Work through a basic programming book with exercises.
Edit
Or, if this is about learning to program, try another, simpler programming language:
In C# your task looks rather simple:
string givenString = "USA,Canada Mexico,Bermuda.Grenada,Belize";
string[] words = givenString.Split(new char[] {' ', ',', '.'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string word in words)
Console.WriteLine(word);
As you see, there are far fewer issues to worry about:
No memory management (alloc/free); this is handled by the garbage collector
no raw pointers in ordinary code, so nothing can go wrong with them
powerful built-in string capabilities like Split()
foreach makes loops much simpler

Parsing character array to words held in pointer array (C-programming)

I am trying to separate each word from a character array and put them into a pointer array, one word for each slot. Also, I am supposed to use isspace() to detect blanks. But if there is a better way, I am all ears. At the end of the code I want to print out the content of the parameter array.
Let's say the line is: "this is a sentence". What happens is that it prints out "sentence" (the last word in the line, and usually followed by some random character) 4 times (the number of words). Then I get "Segmentation fault (core dumped)".
Where am I going wrong?
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
for(i = 0; i < 120; i++)
{
if(line[i] == '\0')
break;
else if(!isspace(line[i]))
{
buffer[k] = line[i];
k++;
}
else if(isspace(line[i]))
{
buffer[k+1] = '\0';
param[j] = buffer; // Puts word into pointer array
j++;
k = 0;
}
else if(j == 21)
{
param[j] = NULL;
break;
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
There are many small problems in this code:
param[j] = buffer; k = 0; : every word is written at the beginning of buffer, overwriting the previous words, and all param entries point to the same place
if(!isspace(line[i])) ... else if(isspace(line[i])) ... else ... : isspace(line[i]) is either true or false, so one of the first two branches is always taken and the third is never reached.
if (line[i] == '\0') : you forget to terminate the current word with a '\0' (and in the other branch buffer[k+1] = '\0' should be buffer[k] = '\0', which explains the random trailing character)
if there are multiple white spaces, you currently (try to) add empty words in param
Here is a working version:
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
int inspace = 0;
param[j] = buffer;
for(i = 0; i < 120; i++) {
if(line[i] == '\0') {
param[j++][k] = '\0';
param[j] = NULL;
break;
}
else if(!isspace(line[i])) {
inspace = 0;
param[j][k++] = line[i];
}
else if (! inspace) {
inspace = 1;
param[j++][k] = '\0';
if(j == 20) { // keep the last slot of param for the NULL terminator
param[j] = NULL;
break;
}
param[j] = &(param[j-1][k+1]);
k = 0;
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
I only fixed the errors. I leave the following improvements to you as an exercise:
the split_line routine should not print anything itself but rather return an array of words; beware that you cannot return an automatic array, but that would be another question
you should not have magic constants in your code (120); you should at least have a #define and use symbolic constants, or better, accept a line of any size. Here again it is not simple, because you will have to malloc and free at appropriate places, and again that would be a different question
Anyway good luck in learning that good old C :-)
This line does not seem right to me
param[j] = buffer;
because you keep assigning the same value, buffer, to every param[j].
I would suggest you copy all the chars from line[120] to buffer[120], then point param[j] to buffer + Next_Word_Position.
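A sketch of that suggestion, assuming the caller supplies both the buffer and the pointer array. The constants LINE_SIZE and MAX_WORDS and this three-parameter signature are assumptions for the example, not the signature from the question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define LINE_SIZE 120   /* assumed buffer size, as in the question */
#define MAX_WORDS 20    /* assumed limit; param needs MAX_WORDS + 1 slots */

/* Copies `line` into `buffer`, cuts it into words at whitespace, and stores a
   pointer to the start of each word in `param`, followed by NULL.
   Returns the number of words found. */
size_t split_line(const char *line, char buffer[LINE_SIZE], char *param[MAX_WORDS + 1])
{
    assert(line != NULL && buffer != NULL && param != NULL);
    strncpy(buffer, line, LINE_SIZE - 1);
    buffer[LINE_SIZE - 1] = '\0';

    size_t count = 0;
    char *p = buffer;
    while (count < MAX_WORDS) {
        while (isspace((unsigned char)*p))      /* skip blanks before a word */
            p++;
        if (*p == '\0')
            break;
        param[count++] = p;                     /* the word starts here */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;
        *p++ = '\0';                            /* terminate the word in place */
    }
    param[count] = NULL;
    return count;
}
```

Each param entry points to a different position inside buffer, so the words no longer overwrite one another, and runs of blanks do not produce empty words.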
You may want to look at strtok in string.h. It sounds like this is what you are looking for, as it will separate words/tokens based on the delimiter you choose. To separate by spaces, simply use:
dest = strtok(src, " ");
Where src is the source string and dest is the destination for the first token on the source string. Looping through until dest == NULL will give you all of the separated words, and all you have to do is change dest each time based on your pointer array. It is also nice to note that passing NULL for the src argument will continue parsing from where strtok left off, so after an initial strtok outside of your loop, just use src = NULL inside. I hope that helps. Good luck!
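The strtok() loop described above might look like the following sketch. The wrapper name tokenize and the delimiter set " \t\n" are assumptions; keep in mind that strtok() modifies the source string and keeps internal state between calls.

```c
#include <assert.h>
#include <string.h>

/* Split `src` in place on spaces, tabs and newlines and store a pointer to
   each token in `tokens` (at most `max_tokens`). Returns the token count.
   strtok() overwrites each delimiter it finds with '\0'. */
size_t tokenize(char *src, char *tokens[], size_t max_tokens)
{
    assert(src != NULL && tokens != NULL);
    size_t count = 0;
    char *tok = strtok(src, " \t\n");           /* first call: pass the string */
    while (tok != NULL && count < max_tokens) {
        tokens[count++] = tok;
        tok = strtok(NULL, " \t\n");            /* later calls: pass NULL */
    }
    return count;
}
```

Consecutive delimiters are skipped by strtok() itself, so "this is  a sentence" yields four tokens rather than an empty one between the double space.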

Returning the length of a char array in C

I am new to programming in C and am trying to write a simple function that will normalize a char array. At the end I want to return the length of the new char array. I am coming from Java, so I apologize if I'm making mistakes that seem simple. I have the following code:
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c functions to analyze it */
int i;
if(isspace(buf[0])){
buf[0] = "";
}
if(isspace(buf[len-1])){
buf[len-1] = "";
}
for(i = 0;i < len;i++){
if(isupper(buf[i])) {
buf[i]=tolower(buf[i]);
}
if(isspace(buf[i])) {
buf[i]=" ";
}
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
}
return strlen(*buf);
}
How can I return the length of the char array at the end? Also does my procedure properly do what I want it to?
EDIT: I have made some corrections to my program based on the comments. Is it correct now?
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c funstions to analyze it */
int i = 0;
int j = 0;
if(isspace(buf[0])){
//buf[0] = "";
i++;
}
if(isspace(buf[len-1])){
//buf[len-1] = "";
i++;
}
for(i;i < len;i++){
if(isupper(buf[i])) {
buf[j]=tolower(buf[i]);
j++;
}
if(isspace(buf[i])) {
buf[j]=' ';
j++;
}
if(isspace(buf[i]) && isspace(buf[i+1])){
//buf[i]="";
i++;
}
}
return strlen(buf);
}
The canonical way of doing something like this is to use two indices, one for reading, and one for writing. Like this:
int normalizeString(char* buf, int len) {
int readPosition, writePosition;
bool hadWhitespace = false;
for(readPosition = writePosition = 0; readPosition < len; readPosition++) {
if(isspace(buf[readPosition])) {
if(!hadWhitespace) buf[writePosition++] = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
return writePosition;
}
Warning: This handles the string according to the given length only. While using a buffer + length has the advantage of being able to handle any data, this is not the way C strings work. C-strings are terminated by a null byte at their end, and it is your job to ensure that the null byte is at the right position. The code you gave does not handle the null byte, nor does the buffer + length version I gave above. A correct C implementation of such a normalization function would look like this:
int normalizeString(char* string) { //No length is passed, it is implicit in the null byte.
char* in = string, *out = string;
bool hadWhitespace = false;
for(; *in; in++) { //loop until the zero byte is encountered
if(isspace(*in)) {
if(!hadWhitespace) *out++ = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
*out = 0; //add a new zero byte
return out - string; //use pointer arithmetic to retrieve the new length
}
In this code I replaced the indices by pointers simply because it was convenient to do so. This is simply a matter of style preference, I could have written the same thing with explicit indices. (And my style preference is not for pointer iterations, but for concise code.)
if(isspace(buf[i])) {
buf[i]=" ";
}
This should be buf[i] = ' ', not buf[i] = " ". You can't assign a string to a character.
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
This has two problems. One is that you're not checking whether i < len - 1, so buf[i + 1] could be off the end of the string. The other is that buf[i] = "" won't do what you want at all. To remove a character from a string, you need to use memmove to move the remaining contents of the string to the left.
return strlen(*buf);
This should be return strlen(buf). *buf is a character, not a string.
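The memmove approach mentioned above for deleting one character could be sketched like this. The helper name remove_char_at is hypothetical, and the sketch assumes a null-terminated string.

```c
#include <assert.h>
#include <string.h>

/* Remove the character at index `pos` from the null-terminated string `s`.
   Does nothing if `pos` is at or beyond the end of the string. */
void remove_char_at(char *s, size_t pos)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (pos >= len)
        return;                              /* nothing to remove */
    /* len - pos bytes: the tail after `pos` plus the terminating '\0' */
    memmove(s + pos, s + pos + 1, len - pos);
}
```

memmove() is used rather than memcpy() because the source and destination ranges overlap. Removing characters one at a time like this costs a shift per removal, which is why the other answers prefer a separate read and write index.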
The notations like:
buf[i]=" ";
buf[i]="";
do not do what you think/expect. You will probably need to create two indexes to step through the array — one for the current read position and one for the current write position, initially both zero. When you want to delete a character, you don't increment the write position.
Warning: untested code.
int i, j;
for (i = 0, j = 0; i < len; i++)
{
if (isupper(buf[i]))
buf[j++] = tolower(buf[i]);
else if (isspace(buf[i]))
{
buf[j++] = ' ';
while (i+1 < len && isspace(buf[i+1]))
i++;
}
else
buf[j++] = buf[i];
}
buf[j] = '\0'; // Null terminate
You replace the arbitrary white space with a plain space using:
buf[i] = ' ';
You return:
return strlen(buf);
or, with the code above:
return j;
Several mistakes in your code:
You cannot assign a string such as "" or " " to buf[i], because the type of buf[i] is char, while a string literal is an array of char that decays to char*.
You are reading from buf and writing into buf using index i. This poses a problem, as you want to eliminate consecutive white-spaces. So you should use one index for reading and another index for writing.
In C/C++, a native string is an array of characters that ends with 0. So in essence, you can simply iterate buf until you read 0 (you don't need to use the len variable at all). In addition, since you are "truncating" the input string, you should set the new last character to 0.
Here is one optional solution for the problem at hand:
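int normalize(char* buf)
{
    int i = 0;   // read index
    int j = 0;   // write index
    while (buf[i] != 0)
    {
        unsigned char c = (unsigned char)buf[i++];
        if (isspace(c))
        {
            buf[j++] = ' ';   // one space for the whole run
            while (isspace((unsigned char)buf[i]))
                i++;          // skip the rest of the run
        }
        else
            buf[j++] = (char)tolower(c);   // non-letters are left unchanged
    }
    buf[j] = 0;
    return j;
}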
you should write:
return strlen(buf)
instead of:
return strlen(*buf)
The reason:
buf is of type char* - it's an address of a char somewhere in the memory (the one in the beginning of the string). The string is null terminated (or at least should be), and therefore the function strlen knows when to stop counting chars.
*buf dereferences the pointer, resulting in a char, which is not what strlen expects.
Not much different than the others, but this assumes an array of unsigned char and not a C string.
tolower() does not itself need the isupper() test.
int normalize(unsigned char *buf, int len) {
int i = 0;
int j = 0;
int previous_is_space = 0;
while (i < len) {
if (isspace(buf[i])) {
if (!previous_is_space) {
buf[j++] = ' ';
}
previous_is_space = 1;
} else {
buf[j++] = tolower(buf[i]);
previous_is_space = 0;
}
i++;
}
return j;
}
@OP:
The posted code implies that leading and trailing spaces should either be shrunk to 1 char or eliminated entirely.
The above answer simply shrinks leading and trailing spaces to 1 ' '.
To eliminate trailing and leading spaces:
int i = 0;
int j = 0;
while (len > 0 && isspace(buf[len-1])) len--;
while (i < len && isspace(buf[i])) i++;
int previous_is_space = 0;
while (i < len) { ...

Resources