How to Replace Leading or Trailing Blank Characters with "X" in C

I'm looking for a more efficient way to remove leading and trailing spaces (' ') from a string and prepend one 'X' for each space removed. My code seems to work OK for trailing spaces, but I'd like to know if there's a better or simpler way that I'm missing.
Example:
Passed in string: '12345     '
Desired result 'XXXXX12345'
Removed 5 empty spaces and append 5 'X's to front.
Example:
Passed in string: '  12345'
Desired result 'XX12345'
Remove 2 empty spaces and append 2 'X's to front.
void fixStr(char* str)
{
int i = 0;
int length = strlen(str);
char strCopy[10];
strcpy(strCpy, str);
for(i = 0; i < length; i++)
{
if(strCopy[i] == ' ')
{
strCopy[i] = '\0';
str[i] = '\0';
break;
}
}
for(i = 0; i < length - i + 2; i++)
{
str[i] = 'X';
str[i + 1] = '\0';
}
strcat(str, strCopy);
}

One way to achieve this is to find the positions of the first and last non-space characters, move the content between them (inclusive) to the end of the string, and then set all the freed positions at the beginning to 'X'.
This gives the expected output (function below):
void fixStr(char* str)
{
int i = 0;
int length = strlen(str);
int leadindex = length;
int tailindex = 0;
// First find the leading nonspace position
for(i = 0; i < length; i++)
{
if(str[i] != ' ')
{
leadindex = i;
break;
}
}
// if not found nonspace then no change
if( leadindex == length )
{
// all spaces, so no change required;
return;
}
// Find the trailing nonspace position
for(i = length - 1; i >= 0 ; i--)
{
if(str[i] != ' ')
{
tailindex = i;
break;
}
}
// move the buffer (in place) to exclude trailing spaces
memmove(str + (length - tailindex -1),str,(tailindex +1) );
// set all the freed leading positions to 'X' (a for loop would also work)
memset(str, 'X', length - (tailindex - leadindex + 1) );
}

To solve a problem the engineer's way:
Define the needs.
Know your tools.
Use the tools as simply as possible and as accurately as necessary to build a solution.
In your case:
Needs:
find the number of trailing spaces
move content of string to the end
set beginning to 'X's
Tools:
to measure, iterate, compare and count
to move a block of memory
to initialise a block of memory
Example for a solution:
#include <string.h> /* for strlen(), memmove () and memset() */
void fix_str(char * s)
{
if ((NULL != s) && ('\0' != *s)) /* Ignore NULL and empty string! */
{
/* Store length and initialise counter: */
size_t l = strlen(s), i = l;
/* Count space(s): */
for (; (0 != i) && (' ' == s[i-1]); --i); /* This for loop does not need a "body". */
/* Calculate the complement: */
size_t c = l - i;
/* Move content to the end overwriting any trailing space(s) counted before hand: */
memmove(s+c, s, i); /* Note that using memmove() instead of memcpy() is essential
here as the source and destination memory overlap! */
/* Initialise the new "free" characters at the beginning to 'X's:*/
memset(s, 'X', c);
}
}
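The function above only counts trailing spaces, while the question also asks about leading ones. A minimal sketch that handles both ends with the same two tools might look like this. The name fix_str_both is hypothetical, and turning an all-space string into all 'X's is an assumption, since the question does not say what should happen in that case.

```c
#include <assert.h>
#include <string.h> /* strlen(), memmove(), memset() */

/* Hypothetical helper: removes leading and trailing spaces in place and
   puts the same number of 'X's at the front of the string. */
void fix_str_both(char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    size_t first = 0;   /* index of the first non-space character */
    size_t last = len;  /* one past the last non-space character */

    while (first < len && s[first] == ' ')
        ++first;
    while (last > first && s[last - 1] == ' ')
        --last;

    size_t content = last - first; /* characters to keep */
    size_t pad = len - content;    /* spaces that become 'X's */

    /* Source and destination may overlap, so memmove(), not memcpy(). */
    memmove(s + pad, s + first, content);
    memset(s, 'X', pad);
}
```

Spaces inside the content are left alone; only the two ends are counted. The terminating '\0' never moves because the overall length is unchanged.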

I didn't fix your code, but you could use sprintf in combination with isspace, something along these lines. Also, remember to leave room for the '\0' at the end of your string. This idea should help you:
#include <ctype.h>
#include <stdio.h>
int main()
{
char buf[11];
char *s = "Hello";
int i;
sprintf(buf, "%10s", s); /* right justifies in a column of 10 in buf */
for(i = 0; i < 10; i++) {
if(isspace(buf[i])) /* replace the spaces with an x (or whatever) */
buf[i] = 'x';
}
printf("%s\n", buf);
return 0;
}

Related

How to add space between the characters if two consecutive characters are equal in c?

I need to add a space if two consecutive characters are the same.
For example:
input:
ttjjjiibbbbhhhhhppuuuu
Output:
t tjjji ibbbbhhhhhp puuuu
If exactly two consecutive characters are the same, a space needs to be printed between them. If more than two consecutive characters are the same, no space is needed.
My code:
#include <stdio.h>
#include <string.h>
int main()
{
char s[100]="ttjjjiibbbbhhhhhppuuuu";
for(int i=0;i<strlen(s);i++){
if(s[i]!=s[i-1] && s[i]==s[i+1]){
s[i+1]=' ';
}
}
printf("%s",s);
}
my output:
t j ji b b h h hp u u
What mistake did I make?
Your primary mistake is writing to your input when the string needs to grow. That's not going to work well and is hard to debug.
This is typical of C Code: measure once, process once. Same-ish code appears twice.
Variables:
int counter = 0; /* must start at zero before measuring */
char *ptr1;
char *ptr2;
char *t;
Step 1: measure
for (ptr1 = s; *ptr1; ptr1++)
{
++counter;
if (ptr1[0] == ptr1[1] && ptr1[0] != ptr1[2] && (ptr1 == s || ptr1[-1] != ptr1[0]))
++counter;
}
Step 2: copy and process
t = malloc(counter + 1);
for (ptr1 = s, ptr2 = t; *ptr1; ptr1++)
{
*ptr2++ = *ptr1;
if (ptr1[0] == ptr1[1] && ptr1[0] != ptr1[2] && (ptr1 == s || ptr1[-1] != ptr1[0]))
*ptr2++ = ' ';
}
ptr2[0] = '\0';
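Put together, the two steps above can be sketched as one function. This is only an illustration under assumptions: the names space_out_pairs and starts_pair are hypothetical, the counter starts at zero, and the caller is responsible for freeing the result.

```c
#include <assert.h>
#include <stdlib.h> /* malloc() */

/* Nonzero when p points at the first character of a run of exactly two
   equal characters; start is the beginning of the string. The && chain
   stops before p[2] is read whenever p[1] is the terminator. */
static int starts_pair(const char *start, const char *p)
{
    return p[0] == p[1] && p[0] != p[2] && (p == start || p[-1] != p[0]);
}

/* Hypothetical helper: returns a newly allocated copy of s with a space
   inserted inside every run of exactly two equal characters, or NULL if
   the allocation fails. */
char *space_out_pairs(const char *s)
{
    assert(s != NULL);

    size_t counter = 0;
    const char *in;

    /* Step 1: measure */
    for (in = s; *in; in++) {
        ++counter;
        if (starts_pair(s, in))
            ++counter;
    }

    /* Step 2: copy and process */
    char *t = malloc(counter + 1);
    if (t == NULL)
        return NULL;

    char *out = t;
    for (in = s; *in; in++) {
        *out++ = *in;
        if (starts_pair(s, in))
            *out++ = ' ';
    }
    *out = '\0';
    return t;
}
```

Moving the repeated condition into a small helper keeps the measuring pass and the copying pass in agreement, which matters because the allocation size depends on both passes making the same decisions.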
Another solution: calculate the length of each run of consecutive characters and handle the special case (length == 2).
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
char s[100] = "ttjjjiibbbbhhhhhppuuuu";
char tmp_ch = s[0];
int cnt = 1;
for (int i = 1; i < strlen(s); i++) {
while (s[i] == tmp_ch) {
cnt++;
i++;
if (i == strlen(s)) {
break;
}
}
if (cnt == 2) {
putchar(tmp_ch);
putchar(' ');
putchar(tmp_ch);
} else {
for (int j = 0; j < cnt; j++) {
putchar(tmp_ch);
}
}
tmp_ch = s[i];
cnt = 1;
}
return 0;
}
Another approach is to use strspn() to get the number of consecutive characters as you work down the string. The prototype for strspn() is:
size_t strspn(const char *s, const char *accept);
Where strspn() returns the number of bytes in the initial segment of s which consist only of bytes from accept. (e.g. using the current character in a 2-character string as accept, it gives the number of times that character appears in sequence)
Tracking the number of characters returned and updating an offset from the beginning allows you to simply loop, letting strspn() do the work as you work through your string. All you are concerned with is when strspn() returns 2, identifying where two, and only two, of the same character are adjacent to one another.
You can do:
#include <stdio.h>
#include <string.h>
int main (void) {
char *input = "ttjjjiibbbbhhhhhppuuuu";
char chstr[2] = {0}; /* 2 char string for accept parameter */
size_t nchr = 0, offset = 0; /* no. chars returned, current offset */
*chstr = input[offset]; /* initialize with 1st char */
/* while not at end, get number of consecutive character(s) */
while (*chstr && (nchr = strspn (input + offset, chstr))) {
if (nchr == 2) { /* if 2 - add space */
putchar (input[offset]);
putchar (' ');
putchar (input[offset]);
}
else { /* otherwise, loop nchr times outputting char */
size_t n = nchr;
while (n--)
putchar(input[offset]);
}
offset += nchr; /* add nchr to offset */
*chstr = input[offset]; /* store next char in string */
}
putchar ('\n'); /* tidy up with newline */
}
Example Use/Output
$ ./bin/space_between_2
t tjjji ibbbbhhhhhp puuuu
Let me know if you have further questions concerning the use of strspn().

add additional letters in a string if there are two same letters beside each other

I'm trying to add an additional letter if there are two equal letters beside each other.
That's what I was thinking, but it doesn't put in an x between the two letters; instead of that, it copies one of the double letters, and now I have, for example, MMM instead of MXM.
for (index_X = 0; new_text[index_X] != '\0'; index_X++)
{
if (new_text[index_X] == new_text[index_X - 1])
{
double_falg = 1;
}
text[index_X] = new_text[index_X];
}
if (double_falg == 1)
{
for (counter_X = 0; text[counter_X] != '\0'; counter_X++)
{
transfer_X = counter_X;
if (text[transfer_X - 1] == text[transfer_X])
{
text_X[transfer_X] = 'X';
cnt_double++;
printf("%c\n", text[transfer_X]);
}
text_X[transfer_X] = text[transfer_X - cnt_double];
}
printf("%s\n", text_X);
}
If you're trying to create the modified array in text_X, copying data from new_text and putting an X between adjacent repeated letters (ignoring the possibility that the input contains XX), then you only need:
char new_text[] = "data with appalling repeats";
char text_X[SOME_SIZE];
int out_pos = 0;
for (int i = 0; new_text[i] != '\0'; i++)
{
text_X[out_pos++] = new_text[i];
if (new_text[i] == new_text[i+1])
text_X[out_pos++] = 'X';
}
text_X[out_pos] = '\0';
printf("Input: [%s]\n", new_text);
printf("Output: [%s]\n", text_X);
When wrapped in a basic main() function (and enum { SOME_SIZE = 64 };), that produces:
Input: [data with appalling repeats]
Output: [data with apXpalXling repeats]
To deal with repeated X's in the input, you could use:
text_X[out_pos++] = (new_text[i] == 'X') ? 'Q' : 'X';
It seems that your approach is more complicated than needed - too many loops and too many arrays involved. A single loop and two arrays should do.
The code below iterates over the original string, using idx to track the position and char_added to count how many extra characters have been added to the new array.
#include <stdio.h>
#define MAX_LEN 20
int main(void) {
char org_arr[MAX_LEN] = "aabbcc";
char new_arr[MAX_LEN] = {0};
int char_added = 0;
int idx = 1;
new_arr[0] = org_arr[0];
if (new_arr[0])
{
while(org_arr[idx])
{
if (org_arr[idx] == org_arr[idx-1])
{
new_arr[idx + char_added] = '*';
++char_added;
}
new_arr[idx + char_added] = org_arr[idx];
++idx;
}
}
puts(new_arr);
return 0;
}
Output:
a*ab*bc*c
Note: The code isn't fully tested. Also it lacks out-of-bounds checking.
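As a sketch of the out-of-bounds checking mentioned in the note, the same single-loop idea can be wrapped in a function that refuses to write past the output buffer. The function name, its parameters, and the -1 error return are assumptions made for illustration, not part of the answer above.

```c
#include <assert.h>
#include <stddef.h> /* size_t */

/* Hypothetical helper: copies src to dst, inserting sep between equal
   neighbouring characters. Returns 0 on success, or -1 if dst (of
   dst_size bytes) is too small; dst is then left as an empty string. */
int insert_between_equal(const char *src, char *dst, size_t dst_size, char sep)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t out = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        /* A repeated character needs room for sep plus the character;
           one extra byte is always kept free for the terminator. */
        size_t needed = (i > 0 && src[i] == src[i - 1]) ? 2 : 1;
        if (out + needed >= dst_size) {
            dst[0] = '\0';
            return -1;
        }
        if (needed == 2)
            dst[out++] = sep;
        dst[out++] = src[i];
    }
    dst[out] = '\0';
    return 0;
}
```

With "aabbcc" and '*' as the separator, a 20-byte buffer receives a*ab*bc*c, while a 9-byte buffer is rejected because the result needs 9 characters plus the terminator.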
There is a lot left to be desired in your Minimal, Complete, and Verifiable Example (MCVE). That said, what you need to do is fairly straightforward. Take a simple example:
"ssi"
According to your statement, you need to add a character between the adjacent 's' characters. (You can use whatever you like for the separator, but if your input consists of normal ASCII characters, you can set the separator to the next ASCII character after the current one, or subtract one if the current character is the last printable ASCII char '~'.) See ASCII Table and Description.
For example, you could use memmove() to shift all characters beginning with the current character up by one and then set the current character to the replacement. You also need to track the current length so you don't write beyond your array bounds.
A simple function could be:
#include <stdio.h>
#include <string.h>
#define MAXC 1024
char *betweenduplicates (char *s)
{
size_t len = strlen(s); /* get length to validate room */
if (!len) /* if empty string, nothing to do */
return s;
for (int i = 1; s[i] && len + 1 < MAXC; i++) /* loop until end, or out of room */
if (s[i-1] == s[i]) { /* adjacent chars equal? */
memmove (s + i + 1, s + i, len - i + 1); /* move current+ up by one */
if (s[i-1] != '~') /* not last ASCII char */
s[i] = s[i-1] + 1; /* set to next ASCII char */
else
s[i] = s[i-1] - 1; /* set to previous ASCII char */
len += 1; /* add one to len */
}
return s; /* convenience return so it can be used immediately if needed */
}
A short example program taking the string to check as the first argument could be:
int main (int argc, char **argv) {
char str[MAXC];
if (argc > 1) /* if argument given */
strcpy (str, argv[1]); /* copy to str */
else
strcpy (str, "mississippi"); /* otherwise use default */
puts (str); /* output original */
puts (betweenduplicates (str)); /* output result */
}
Example Use/Output
$ ./bin/betweenduplicated
mississippi
mistsistsipqpi
or when there is nothing to replace:
$ ./bin/betweenduplicated dog
dog
dog
Or checking the extremes:
$ ./bin/betweenduplicated "two  spaces  and  alligators  ~~"
two  spaces  and  alligators  ~~
two ! spaces ! and ! almligators ! ~}~
There are a number of ways to approach it. Let me know if you have further questions.

How to rearrange array using spaces?

I'm struggling with rearranging my array. I have used from single to multiple loops trying to put spaces (white characters) between two pairs of characters, but I was constantly rewriting the original input. So there is always an input of even length, for example ABCDEFGH. And my task would be to extend the size of the array by putting spaces after every 2 chars (except the last one).
So the output would be:
AB CD EF GH
So the size of output (if I'm correct) will be (2*input_len)-1
Thanks.
EDIT:
This is my code so far
// output = "ABCDEFGHIJKL
char c1;
char c2;
char c3;
int o_len = strlen(output);
for(int i = 2; i < o_len + olen/2; i = i + 3){
if(i == 2){
c1 = output[i];
c2 = output[i+1];
c3 = output[i+2];
output[i] = ' ';
output[i+1] = c1;
output[i+2] = c2;
}
else{
c1 = output[i];
c2 = output[i+1];
output[i] = ' ';
output[i+1] = c3;
output[i+2] = c1;
c3 = c2;
}
}
So the first 3 pairs are printed correctly, then it is all a mess.
Presuming you need to store the space-separated result, probably the easiest way to insert the spaces is to use a pair of pointers, one to your input string and one to your output string. Loop continually: write a pair to your output string, increment both pointers by 2, and check whether you are out of characters in your input string. If so, break and nul-terminate your output string; otherwise write a space to your output string and repeat.
You can do it fairly simply using memcpy (or you can just copy 2 chars to the current pointer and pointer + 1, your choice, but since you already include string.h for strlen(), make it easy on yourself). You can do something similar to:
#include <stdio.h>
#include <string.h>
#define ARRSZ 128 /* constant for no. of chars in output string */
int main (int argc, char **argv) {
char *instr = argc > 1 ? argv[1] : "ABCDEFGH", /* in string */
outstr[ARRSZ] = "", /* out string */
*ip = instr, *op = outstr; /* pointers to each */
size_t len = strlen (instr); /* len of instr */
if (len < 4) { /* validate at least 2-pairs worth of input provided */
fputs ("error: less than two-pairs to separate.\n", stderr);
return 1;
}
if (len & 1) { /* validate even number of characters */
fputs ("error: odd number of characters in instr.\n", stderr);
return 1;
}
if (ARRSZ < len + len / 2) { /* validate sufficient storage in outstr */
fputs ("error: insufficient storage in outstr.\n", stderr);
return 1;
}
for (;;) { /* loop continually */
memcpy (op, ip, 2); /* copy pair to op */
ip += 2; /* increment ip by 2 for next pair */
op += 2; /* increment op by 2 for next pair */
if (!*ip) /* check if last pair written */
break;
*op++ = ' '; /* write space between pairs in op */
}
*op = 0; /* nul-terminate outstr */
printf ("instr : %s\noutstr : %s\n", instr, outstr);
}
Example Use/Output
$ ./bin/strspaceseppairs
instr : ABCDEFGH
outstr : AB CD EF GH
$ ./bin/strspaceseppairs ABCDEFGHIJLMNOPQ
instr : ABCDEFGHIJLMNOPQ
outstr : AB CD EF GH IJ LM NO PQ
Odd number of chars:
$ ./bin/strspaceseppairs ABCDEFGHIJLMNOP
error: odd number of characters in instr.
Or short string:
$ ./bin/strspaceseppairs AB
error: less than two-pairs to separate.
Look things over and let me know if you have further questions.
Edit To Simply Output Single-Pair or Empty-String
Based upon the comment by @chqrlie, it may make more sense to output a short string unchanged rather than issuing a diagnostic. Up to you. In that case you can modify the first conditional and move it after the odd character check, e.g.
if (len & 1) { /* validate even number of characters */
fputs ("error: odd number of characters in instr.\n", stderr);
return 1;
}
if (len < 4) { /* validate at least 2-pairs worth of input provided */
puts(instr); /* (otherwise output unchanged and exit) */
return 0;
}
You can decide how you want to handle any aspect of your program and make the changes accordingly.
I think you are looking for a piece of code like the one below.
This function returns the split string in newly allocated memory, since you asked to store it.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
char* split_by_space(char* str, size_t length, size_t step) {
size_t i = 0, j = 0, spaces = (length / step);
char* splitted = malloc(length + spaces + 1);
for (i = 0, j = 0; i < length; ++i, ++j) {
if (i % step == 0 && i != 0) {
splitted[j] = ' ';
++j;
}
splitted[j] = str[i];
}
splitted[j] = '\0';
return splitted;
}
int main(void) {
// Use size_t instead of int.
size_t step = 2; // Also works with odd numbers.
char str[] = "ABCDEFGH";
char* new_str;
// Works with odd and even steps.
new_str = split_by_space(str, strlen(str), step);
printf("New splitted string is [%s]", new_str);
// Don't forget to clean the memory that the function allocated.
free(new_str);
return 0;
}
When run with a step value of 2, the above code outputs:
New splitted string is [AB CD EF GH]
Inserting characters inside the array is cumbersome and cannot be done unless you know the array is large enough to accommodate the new string.
You probably want to allocate a new array and create the modified string there.
The length of the new string is not (2 * input_len) - 1. You insert a space after every 2 characters except the last 2: if the string has 2 or fewer characters, its length is unchanged; otherwise it increases by (input_len - 2) / 2. If the length is odd, you should round this value up to the next integer, which is done in integer arithmetic this way: (input_len - 2 + 1) / 2.
Here is an example:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *reformat_with_spaces(const char *str) {
size_t len = strlen(str);
size_t newlen = len > 2 ? len + (len - 2 + 1) / 2 : len;
char *out = malloc(newlen + 1);
if (out) {
for (size_t i = 0, j = 0; i < len; i++) {
if (i > 0 && i % 2 == 0) {
out[j++] = ' ';
}
out[j++] = str[i];
}
out[j] = '\0';
}
return out;
}
int main(void) {
char buf[256];
char *p;
while (fgets(buf, sizeof buf, stdin)) {
buf[strcspn(buf, "\n")] = '\0'; // strip the newline if any
p = reformat_with_spaces(buf);
if (p == NULL) {
fprintf(stderr, "out of memory\n");
return 1;
}
puts(p);
free(p);
}
return 0;
}
Try this,
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
void rearrange(char *str)
{
int len=strlen(str),n=0,i;
char *word=malloc(len+len/2+1); /* +1 so the terminating '\0' also fits when len is odd */
if(word==NULL)
{
printf("Memory Error");
exit(1);
}
for(i=0;i<len;i++)
{
if( i % 2 == 0 && i != 0)
{
word[n]=' ';
n++;
word[n]=str[i];
n++;
}
else
{
word[n]=str[i];
n++;
}
}
word[n]='\0';
strcpy(str,word);
free(word);
return;
}
int main()
{
char word[40];
printf("Enter word:");
scanf("%s",word);
rearrange(word);
printf("\n%s",word);
return 0;
}
See below:
The rearrange function copies the letters from str into word. If the current position is nonzero and divisible by 2 (i % 2 == 0), it stores a space followed by the letter in word; otherwise it stores the letter only. Finally, it copies word back into str.

Swap words of a string without using string library and use pointers

The goal of my exercise is to produce
The original string is:
silence .is a looking bird:the turning; edge, of life. e. e. cummings
Destination string after swapping:
cummings e. e. life. of edge, turning; bird:the looking a .is silence
and what I am getting is:
69The original string is:
silence .is a looking bird:the turning; edge, of life. e. e. cummings
Destination string after swapping:
my code:
'''
#include<stdio.h>
#include<stdlib.h>
#define MAX_STR_LEN 1024
// DO NOT USE the string library <string.h> for this exercise
void wordSwapper(char *source, char *destination)
{
int count = 0;
while (*(source + count) != '\0')
{
count++;
}
printf("%d", count);
for(int i = 0; i < count; i++)
{
*(destination + i) = *(source + (count - i));
}
}
int main()
{
char source[MAX_STR_LEN]="silence .is a looking bird:the turning; edge, of life. e. e. cummings";
char destination[MAX_STR_LEN]="I am a destination string and I contain lots of junk 1234517265716572#qsajdkuhasdgsahiehwjauhiuiuhdsj!";
wordSwapper(&source[0], &destination[0]);
printf("The original string is: \n%s\n",source);
printf("Destination string after swapping: \n%s\n",destination);
}
'''
My variant:
void wordSwapper(char *source, char *destination)
{
char *start, *end;
start = source;
while (*(start++) != '\0')
destination++;
// write trailing zero
*destination = '\0';
while (*source != '\0')
{
// copy spaces
while (*source == ' ')
*(--destination) = *(source++);
// find word bounds
start = end = source;
while (*end != '\0' && *end != ' ')
end++;
source = end;
// copy word
while (end > start)
*(--destination) = *(--end);
}
}
The posted code reverses the string character by character. Two issues:
Off-by-one: the terminating NUL character is copied to position 0 of the destination string, so the result is an empty string.
The requirement is to split the string into words and copy the words in reverse order to the destination string.
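To make the off-by-one concrete: with count characters, the last one sits at index count - 1, so source + (count - i) at i = 0 points at the terminator. A minimal sketch of the corrected character reversal follows. The name reverseChars is hypothetical, and the function still reverses characters rather than words, so it does not meet the exercise's requirement on its own.

```c
#include <assert.h>
#include <stddef.h> /* NULL */

/* Hypothetical helper: copies source into destination with its characters
   in reverse order. destination must have room for the terminator. */
void reverseChars(const char *source, char *destination)
{
    assert(source != NULL && destination != NULL);

    int count = 0;
    while (*(source + count) != '\0')
        count++;

    for (int i = 0; i < count; i++)
        *(destination + i) = *(source + (count - 1 - i)); /* last char is at count - 1 */

    *(destination + count) = '\0'; /* terminate explicitly; old contents may follow */
}
```

The explicit terminator matters here because the destination buffer in the question starts out filled with junk that is longer than the source string.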
Consider the following alternative
void wordSwapper2(char *source, char *destination)
{
int count = 0;
while (*(source + count) != '\0')
{
count++;
}
// Copy words in reverse order
char *dest = destination ;
int dest_pos = 0 ;
// Word End
int w_end = count ;
while ( w_end >= 0 ) {
// Find word start
int w_start = w_end ;
while ( w_start > 0 && source[w_start-1] != ' ' ) w_start-- ;
// Copy word
for (int i=w_start ; i<w_end ; i++ ) *dest++ = source[i] ;
// Add space if not first word
if ( w_start > 0 ) *dest++ = ' ' ;
// Move to previous word (skip over space)
w_end = w_start-1 ;
} ;
// Terminating NUL
*dest++ = '\0' ;
}

Returning the length of a char array in C

I am new to programming in C and am trying to write a simple function that will normalize a char array. At the end I want to return the length of the new char array. I am coming from Java, so I apologize if I'm making mistakes that seem simple. I have the following code:
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c functions to analyze it */
int i;
if(isspace(buf[0])){
buf[0] = "";
}
if(isspace(buf[len-1])){
buf[len-1] = "";
}
for(i = 0;i < len;i++){
if(isupper(buf[i])) {
buf[i]=tolower(buf[i]);
}
if(isspace(buf[i])) {
buf[i]=" ";
}
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
}
return strlen(*buf);
}
How can I return the length of the char array at the end? Also does my procedure properly do what I want it to?
EDIT: I have made some corrections to my program based on the comments. Is it correct now?
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c funstions to analyze it */
int i = 0;
int j = 0;
if(isspace(buf[0])){
//buf[0] = "";
i++;
}
if(isspace(buf[len-1])){
//buf[len-1] = "";
i++;
}
for(i;i < len;i++){
if(isupper(buf[i])) {
buf[j]=tolower(buf[i]);
j++;
}
if(isspace(buf[i])) {
buf[j]=' ';
j++;
}
if(isspace(buf[i]) && isspace(buf[i+1])){
//buf[i]="";
i++;
}
}
return strlen(buf);
}
The canonical way of doing something like this is to use two indices, one for reading, and one for writing. Like this:
int normalizeString(char* buf, int len) {
int readPosition, writePosition;
bool hadWhitespace = false;
for(readPosition = writePosition = 0; readPosition < len; readPosition++) {
if(isspace(buf[readPosition])) {
if(!hadWhitespace) buf[writePosition++] = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
return writePosition;
}
Warning: This handles the string according to the given length only. While using a buffer + length has the advantage of being able to handle any data, this is not the way C strings work. C-strings are terminated by a null byte at their end, and it is your job to ensure that the null byte is at the right position. The code you gave does not handle the null byte, nor does the buffer + length version I gave above. A correct C implementation of such a normalization function would look like this:
int normalizeString(char* string) { //No length is passed, it is implicit in the null byte.
char* in = string, *out = string;
bool hadWhitespace = false;
for(; *in; in++) { //loop until the zero byte is encountered
if(isspace(*in)) {
if(!hadWhitespace) *out++ = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
*out = 0; //add a new zero byte
return out - string; //use pointer arithmetic to retrieve the new length
}
In this code I replaced the indices by pointers simply because it was convenient to do so. This is simply a matter of style preference, I could have written the same thing with explicit indices. (And my style preference is not for pointer iterations, but for concise code.)
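For reference, here is one way the elided branch could be completed, as a sketch only. The assumptions are that every non-whitespace character is lowercased and that the whitespace flag is reset once a run ends; the name normalize_cstring and the size_t return type are choices made for this illustration.

```c
#include <assert.h>
#include <ctype.h>   /* isspace(), tolower() */
#include <stdbool.h>
#include <stddef.h>  /* size_t */

/* Hypothetical completion: collapses each whitespace run to one space and
   lowercases everything else, in place. Returns the new length. */
size_t normalize_cstring(char *string)
{
    assert(string != NULL);

    char *in = string, *out = string;
    bool hadWhitespace = false;

    for (; *in; in++) {
        /* The <ctype.h> functions require an unsigned char value. */
        unsigned char c = (unsigned char)*in;
        if (isspace(c)) {
            if (!hadWhitespace)
                *out++ = ' ';
            hadWhitespace = true;
        } else {
            *out++ = (char)tolower(c);
            hadWhitespace = false; /* assumed: reset once the run ends */
        }
    }
    *out = '\0';
    return (size_t)(out - string);
}
```

Because out never overtakes in, writing through out cannot clobber characters that have not been read yet.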
if(isspace(buf[i])) {
buf[i]=" ";
}
This should be buf[i] = ' ', not buf[i] = " ". You can't assign a string to a character.
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
This has two problems. One is that you're not checking whether i < len - 1, so buf[i + 1] could be off the end of the string. The other is that buf[i] = "" won't do what you want at all. To remove a character from a string, you need to use memmove to move the remaining contents of the string to the left.
return strlen(*buf);
This would be return strlen(buf). *buf is a character, not a string.
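The memmove approach to removing a character, mentioned above, can be sketched as follows. The helper name remove_char_at is hypothetical, and the sketch assumes a null-terminated string.

```c
#include <assert.h>
#include <string.h> /* strlen(), memmove() */

/* Hypothetical helper: deletes the character at index pos from the
   null-terminated string s by shifting the tail one place to the left. */
void remove_char_at(char *s, size_t pos)
{
    assert(s != NULL);

    size_t len = strlen(s);
    if (pos >= len)
        return; /* nothing to remove */

    /* Move everything after pos, including the terminator: that is
       (len - pos - 1) characters plus 1 for the '\0'. */
    memmove(s + pos, s + pos + 1, len - pos);
}
```

Calling this once per removed character shifts the tail repeatedly, so for collapsing many whitespace runs a single pass with separate read and write positions does less work.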
The notations like:
buf[i]=" ";
buf[i]="";
do not do what you think/expect. You will probably need to create two indexes to step through the array — one for the current read position and one for the current write position, initially both zero. When you want to delete a character, you don't increment the write position.
Warning: untested code.
int i, j;
for (i = 0, j = 0; i < len; i++)
{
if (isupper(buf[i]))
buf[j++] = tolower(buf[i]);
else if (isspace(buf[i]))
{
buf[j++] = ' ';
while (i+1 < len && isspace(buf[i+1]))
i++;
}
else
buf[j++] = buf[i];
}
buf[j] = '\0'; // Null terminate
You replace the arbitrary white space with a plain space using:
buf[i] = ' ';
You return:
return strlen(buf);
or, with the code above:
return j;
Several mistakes in your code:
You cannot assign buf[i] with a string, such as "" or " ", because the type of buf[i] is char and the type of a string is char*.
You are reading from buf and writing into buf using index i. This poses a problem, as you want to eliminate consecutive white-spaces. So you should use one index for reading and another index for writing.
In C/C++, a native string is an array of characters that ends with 0. So in essence, you can simply iterate buf until you read 0 (you don't need to use the len variable at all). In addition, since you are "truncating" the input string, you should set the new last character to 0.
Here is one optional solution for the problem at hand:
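int normalize(char* buf)
{
    unsigned char c; /* <ctype.h> functions expect unsigned char values */
    int i = 0; /* read index */
    int j = 0; /* write index */
    while (buf[i] != 0)
    {
        c = (unsigned char)buf[i++];
        if (isspace(c))
        {
            buf[j++] = ' '; /* one space for the whole run */
            while (isspace((unsigned char)buf[i]))
                i++; /* skip the rest of the run */
        }
        else
        {
            buf[j++] = (char)tolower(c); /* copy every other character */
        }
    }
    buf[j] = 0;
    return j;
}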
you should write:
return strlen(buf)
instead of:
return strlen(*buf)
The reason:
buf is of type char* - it's an address of a char somewhere in the memory (the one in the beginning of the string). The string is null terminated (or at least should be), and therefore the function strlen knows when to stop counting chars.
*buf will dereference the pointer, resulting in a char, which is not what strlen expects.
Not much different than the others, but this assumes an array of unsigned char and not a C string.
tolower() does not itself need the isupper() test.
int normalize(unsigned char *buf, int len) {
int i = 0;
int j = 0;
int previous_is_space = 0;
while (i < len) {
if (isspace(buf[i])) {
if (!previous_is_space) {
buf[j++] = ' ';
}
previous_is_space = 1;
} else {
buf[j++] = tolower(buf[i]);
previous_is_space = 0;
}
i++;
}
return j;
}
@OP:
Per the posted code, it is implied that leading and trailing spaces should either be shrunk to 1 char or eliminated entirely.
The above answer simply shrinks leading and trailing spaces to 1 ' '.
To eliminate trailing and leading spaces:
int i = 0;
int j = 0;
while (len > 0 && isspace(buf[len-1])) len--;
while (i < len && isspace(buf[i])) i++;
int previous_is_space = 0;
while (i < len) { ...
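Filled out, the trimming variant might look like the sketch below. It keeps the buffer-plus-length interface of the answer above and writes no terminator; the name normalize_trimmed is hypothetical.

```c
#include <assert.h>
#include <ctype.h> /* isspace(), tolower() */
#include <stddef.h> /* NULL */

/* Sketch: normalizes buf[0..len-1] in place, dropping leading and trailing
   whitespace, collapsing inner runs to one space, and lowercasing letters.
   Returns the new length; no terminator is written. */
int normalize_trimmed(unsigned char *buf, int len)
{
    assert(buf != NULL && len >= 0);

    int i = 0;
    int j = 0;
    while (len > 0 && isspace(buf[len - 1])) len--; /* drop trailing */
    while (i < len && isspace(buf[i])) i++;         /* skip leading */

    int previous_is_space = 0;
    while (i < len) {
        if (isspace(buf[i])) {
            if (!previous_is_space)
                buf[j++] = ' ';
            previous_is_space = 1;
        } else {
            buf[j++] = (unsigned char)tolower(buf[i]);
            previous_is_space = 0;
        }
        i++;
    }
    return j;
}
```

A caller that wants a C string afterwards can store '\0' at buf[j], provided the array has room for it.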
