How to rearrange array using spaces? - c

I'm struggling with rearranging my array. I have tried everything from a single loop to several loops to put spaces (whitespace characters) between pairs of characters, but I kept overwriting the original input. The input always has even length, for example ABCDEFGH. My task is to extend the array by putting a space after every 2 chars (except after the last pair).
So the output would be:
AB CD EF GH
So the size of output (if I'm correct) will be (2*input_len)-1
Thanks.
EDIT:
This is my code so far
// output = "ABCDEFGHIJKL"
char c1;
char c2;
char c3;
int o_len = strlen(output);
for(int i = 2; i < o_len + o_len/2; i = i + 3){
if(i == 2){
c1 = output[i];
c2 = output[i+1];
c3 = output[i+2];
output[i] = ' ';
output[i+1] = c1;
output[i+2] = c2;
}
else{
c1 = output[i];
c2 = output[i+1];
output[i] = ' ';
output[i+1] = c3;
output[i+2] = c1;
c3 = c2;
}
}
So the first 3 pairs are printed correctly, then it is all a mess.

Presuming you need to store the space-separated result, probably the easiest way to insert the spaces is to use a pair of pointers, one to your input string and one to your output string. Loop continually: write a pair to your output string, increment both pointers by 2, and check whether you are out of characters in your input string. If so, break and nul-terminate your output string; otherwise write a space to your output string and repeat.
You can do it fairly simply using memcpy (or you can just copy 2 chars to the current pointer and pointer + 1, your choice, but since you already include string.h for strlen(), make it easy on yourself). You can do something similar to:
#include <stdio.h>
#include <string.h>
#define ARRSZ 128 /* constant for no. of chars in output string */
int main (int argc, char **argv) {
char *instr = argc > 1 ? argv[1] : "ABCDEFGH", /* in string */
outstr[ARRSZ] = "", /* out string */
*ip = instr, *op = outstr; /* pointers to each */
size_t len = strlen (instr); /* len of instr */
if (len < 4) { /* validate at least 2-pairs worth of input provided */
fputs ("error: less than two-pairs to separate.\n", stderr);
return 1;
}
if (len & 1) { /* validate even number of characters */
fputs ("error: odd number of characters in instr.\n", stderr);
return 1;
}
if (ARRSZ < len + len / 2) { /* validate sufficient storage in outstr */
fputs ("error: insufficient storage in outstr.\n", stderr);
return 1;
}
for (;;) { /* loop continually */
memcpy (op, ip, 2); /* copy pair to op */
ip += 2; /* increment ip by 2 for next pair */
op += 2; /* increment op by 2 for next pair */
if (!*ip) /* check if last pair written */
break;
*op++ = ' '; /* write space between pairs in op */
}
*op = 0; /* nul-terminate outstr */
printf ("instr : %s\noutstr : %s\n", instr, outstr);
}
Example Use/Output
$ ./bin/strspaceseppairs
instr : ABCDEFGH
outstr : AB CD EF GH
$ ./bin/strspaceseppairs ABCDEFGHIJLMNOPQ
instr : ABCDEFGHIJLMNOPQ
outstr : AB CD EF GH IJ LM NO PQ
Odd number of chars:
$ ./bin/strspaceseppairs ABCDEFGHIJLMNOP
error: odd number of characters in instr.
Or short string:
$ ./bin/strspaceseppairs AB
error: less than two-pairs to separate.
Look things over and let me know if you have further questions.
Edit To Simply Output Single-Pair or Empty-String
Based upon the comment by @chqrlie, it may make more sense, rather than issuing a diagnostic for a short string, just to output it unchanged. Up to you. You can modify the first conditional and move it after the odd character check in that case, e.g.
if (len & 1) { /* validate even number of characters */
fputs ("error: odd number of characters in instr.\n", stderr);
return 1;
}
if (len < 4) { /* validate at least 2-pairs worth of input provided */
puts(instr); /* (otherwise output unchanged and exit) */
return 0;
}
You can decide how you want to handle any aspect of your program and make the changes accordingly.

I think you are looking for a piece of code like the one below:
This function returns the split string in newly allocated memory, since you asked to store the result.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
char* split_by_space(char* str, size_t length, size_t step) {
size_t i = 0, j = 0, spaces = (length / step);
char* splitted = malloc(length + spaces + 1);
for (i = 0, j = 0; i < length; ++i, ++j) {
if (i % step == 0 && i != 0) {
splitted[j] = ' ';
++j;
}
splitted[j] = str[i];
}
splitted[j] = '\0';
return splitted;
}
int main(void) {
// Use size_t instead of int.
size_t step = 2; // Also works with odd numbers.
char str[] = "ABCDEFGH";
char* new_str;
// Works with odd and even steps.
new_str = split_by_space(str, strlen(str), step);
printf("New splitted string is [%s]", new_str);
// Don't forget to clean the memory that the function allocated.
free(new_str);
return 0;
}
When run with a step value of 2, the above code outputs:
New splitted string is [AB CD EF GH]

Inserting characters inside the array is cumbersome and cannot be done unless you know the array is large enough to accommodate the new string.
You probably want to allocate a new array and create the modified string there.
The length of the new string is not (2 * input_len) - 1. You insert a space every 2 characters, except after the last 2: if the string has 2 or fewer characters, its length is unchanged; otherwise it increases by (input_len - 2) / 2. In case the length is odd, you should round this value up to the next integer, which is done in integer arithmetic this way: (input_len - 2 + 1) / 2.
Here is an example:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *reformat_with_spaces(const char *str) {
size_t len = strlen(str);
size_t newlen = len > 2 ? len + (len - 2 + 1) / 2 : len;
char *out = malloc(newlen + 1);
if (out) {
for (size_t i = 0, j = 0; i < len; i++) {
if (i > 0 && i % 2 == 0) {
out[j++] = ' ';
}
out[j++] = str[i];
}
out[j] = '\0';
}
return out;
}
int main(void) {
char buf[256];
char *p;
while (fgets(buf, sizeof buf, stdin)) {
buf[strcspn(buf, "\n")] = '\0'; // strip the newline if any
p = reformat_with_spaces(buf);
if (p == NULL) {
fprintf(stderr, "out of memory\n");
return 1;
}
puts(p);
free(p);
}
return 0;
}

Try this,
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
void rearrange(char *str)
{
int len=strlen(str),n=0,i;
char *word=malloc(len + len/2 + 1); /* letters + spaces + terminating '\0' */
if(word==NULL)
{
printf("Memory Error");
exit(1);
}
for(i=0;i<len;i++)
{
if( i % 2 == 0 && i != 0)
{
word[n]=' ';
n++;
word[n]=str[i];
n++;
}
else
{
word[n]=str[i];
n++;
}
}
word[n]='\0';
strcpy(str,word);
free(word);
return;
}
int main()
{
char word[40];
printf("Enter word:");
if(scanf("%26s",word) != 1) /* 26 letters + 12 spaces + '\0' still fit in word[40] */
return 1;
rearrange(word);
printf("\n%s",word);
return 0;
}
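Explanation:
The rearrange function copies the letters from str into word. If the current position i is non-zero and divisible by 2 (i % 2 == 0), it writes a space followed by the letter into word; otherwise it writes the letter only. The result is then copied back into str, so str must be large enough to hold the expanded string.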

Related

How to add space between the characters if two consecutive characters are equal in c?

I need to add a space if two consecutive characters are the same.
For example:
input:
ttjjjiibbbbhhhhhppuuuu
Output:
t tjjji ibbbbhhhhhp puuuu
If exactly two consecutive characters are the same, I need to print a space between them. If more than two consecutive characters are the same, no space should be added.
My code:
#include <stdio.h>
#include <string.h>
int main()
{
char s[100]="ttjjjiibbbbhhhhhppuuuu";
for(int i=0;i<strlen(s);i++){
if(s[i]!=s[i-1] && s[i]==s[i+1]){
s[i+1]=' ';
}
}
printf("%s",s);
}
my output:
t j ji b b h h hp u u
What mistake did I make?
Your primary mistake is writing to your input when the string needs to grow. That's not going to work well and is hard to debug.
This is typical of C Code: measure once, process once. Same-ish code appears twice.
Variables:
int counter = 0; /* must start at zero before counting */
char *ptr1;
char *ptr2;
char *t;
Step 1: measure
for (ptr1 = s; *ptr1; ptr1++)
{
++counter;
if (ptr1[0] == ptr1[1] && ptr1[0] != ptr1[2] && (ptr1 == s || ptr1[-1] != ptr1[0]))
++counter;
}
Step 2: copy and process
t = malloc(counter + 1);
for (ptr1 = s, ptr2 = t; *ptr1; ptr1++)
{
*ptr2++ = *ptr1;
if (ptr1[0] == ptr1[1] && ptr1[0] != ptr1[2] && (ptr1 == s || ptr1[-1] != ptr1[0]))
*ptr2++ = ' ';
}
ptr2[0] = '\0';
Another solution: Calculate the length of each run of consecutive characters and handle the special case (length == 2).
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
char s[100] = "ttjjjiibbbbhhhhhppuuuu";
char tmp_ch = s[0];
int cnt = 1;
for (int i = 1; i < strlen(s); i++) {
while (s[i] == tmp_ch) {
cnt++;
i++;
if (i == strlen(s)) {
break;
}
}
if (cnt == 2) {
putchar(tmp_ch);
putchar(' ');
putchar(tmp_ch);
} else {
for (int j = 0; j < cnt; j++) {
putchar(tmp_ch);
}
}
tmp_ch = s[i];
cnt = 1;
}
return 0;
}
Another approach is to use strspn() to get the number of consecutive characters as you work down the string. The prototype for strspn() is:
size_t strspn(const char *s, const char *accept);
Where strspn() returns the number of bytes in the initial segment of s which consist only of bytes from accept. (e.g. using the current character in a 2-character string as accept, it gives the number of times that character appears in sequence)
Tracking the number of characters returned and updating an offset from the beginning allows you to simply loop, letting strspn() do the work as you work through your string. All you are concerned with is when strspn() returns 2, identifying where two, and only two, of the same character are adjacent to one another.
You can do:
#include <stdio.h>
#include <string.h>
int main (void) {
char *input = "ttjjjiibbbbhhhhhppuuuu";
char chstr[2] = {0}; /* 2 char string for accept parameter */
size_t nchr = 0, offset = 0; /* no. chars returned, current offset */
*chstr = input[offset]; /* initialize with 1st char */
/* while not at end, get number of consecutive character(s) */
while (*chstr && (nchr = strspn (input + offset, chstr))) {
if (nchr == 2) { /* if 2 - add space */
putchar (input[offset]);
putchar (' ');
putchar (input[offset]);
}
else { /* otherwise, loop nchr times outputting char */
size_t n = nchr;
while (n--)
putchar(input[offset]);
}
offset += nchr; /* add nchr to offset */
*chstr = input[offset]; /* store next char in string */
}
putchar ('\n'); /* tidy up with newline */
}
Example Use/Output
$ ./bin/space_between_2
t tjjji ibbbbhhhhhp puuuu
Let me know if you have further questions concerning the use of strspn().

How to Replace Leading or Trailing Blank Characters with "X"

Looking for a more efficient way to remove leading and trailing spaces (' ') and prepend an 'X' to the front for each space removed. It seems to work OK for trailing spaces, but I'd like to know if there's a better / simpler way of going about this that I am missing.
Example:
Passed in string: '12345     '
Desired result 'XXXXX12345'
Removed 5 empty spaces and append 5 'X's to front.
Example:
Passed in string: '  12345'
Desired result 'XX12345'
Remove 2 empty spaces and append 2 'X's to front.
void fixStr(char* str)
{
int i = 0;
int length = strlen(str);
char strCopy[10];
strcpy(strCopy, str);
for(i = 0; i < length; i++)
{
if(strCopy[i] == ' ')
{
strCopy[i] = '\0';
str[i] = '\0';
break;
}
}
for(i = 0; i < length - i + 2; i++)
{
str[i] = 'X';
str[i + 1] = '\0';
}
strcat(str, strCopy);
}
One way to achieve this is to find the first and last non-space positions in the string, move the content between them (from the leading non-space to the trailing non-space) to the end of the string, and then set all the freed positions at the beginning to 'X'.
This way you get the expected output (function below):
void fixStr(char* str)
{
int i = 0;
int length = strlen(str);
int leadindex = length;
int tailindex = 0;
// First find the leading nonspace position
for(i = 0; i < length; i++)
{
if(str[i] != ' ')
{
leadindex = i;
break;
}
}
// if not found nonspace then no change
if( leadindex == length )
{
// all spaces, so no change required;
return;
}
// Find the trailing nonspace position
for(i = length - 1; i >= 0 ; i--)
{
if(str[i] != ' ')
{
tailindex = i;
break;
}
}
// move the buffer (in place) to exclude trailing spaces
memmove(str + (length - tailindex -1),str,(tailindex +1) );
// set the 'x' to all empty spaces at leading ( you may use for loop to set this)
memset(str, 'X', length - (tailindex - leadindex + 1) );
}
To solve a problem the engineer's way:
Define the needs.
Know your tools.
Use the tools as simple as possible, as accurate as necessary to make up a solution.
In your case:
Needs:
find the number of trailing spaces
move content of string to the end
set beginning to 'X's
Tools:
to measure, iterate, compare and count
to move a block of memory
to initialise a block of memory
Example for a solution:
#include <string.h> /* for strlen(), memmove () and memset() */
void fix_str(char * s)
{
if ((NULL != s) && ('\0' != *s)) /* Ignore NULL and empty string! */
{
/* Store length and initialise counter: */
size_t l = strlen(s), i = l;
/* Count space(s): */
for (; (0 != i) && (' ' == s[i-1]); --i); /* This for loop does not need a "body". */
/* Calculate the complement: */
size_t c = l - i;
/* Move content to the end overwriting any trailing space(s) counted before hand: */
memmove(s+c, s, i); /* Note that using memmove() instead of memcpy() is essential
here as the source and destination memory overlap! */
/* Initialise the new "free" characters at the beginning to 'X's:*/
memset(s, 'X', c);
}
}
I didn't fix your code, but you could use sprintf in combination with isspace, something along these lines. Also, remember to leave space for the '\0' at the end of your string. Use this idea and it should help you:
#include <ctype.h>
#include <stdio.h>
int main()
{
char buf[11];
char *s = "Hello";
int i;
sprintf(buf, "%10s", s); /* right justifies in a column of 10 in buf */
for(i = 0; i < 10; i++) {
if(isspace(buf[i])) /* replace the spaces with an x (or whatever) */
buf[i] = 'x';
}
printf("%s\n", buf);
return 0;
}

add additional letters in a string if there are two same letters beside each other

I'm trying to add an additional letter if there are two equal letters beside each other.
That's what I was thinking, but it doesn't put in an x between the two letters; instead of that, it copies one of the double letters, and now I have, for example, MMM instead of MXM.
for (index_X = 0; new_text[index_X] != '\0'; index_X++)
{
if (new_text[index_X] == new_text[index_X - 1])
{
double_falg = 1;
}
text[index_X] = new_text[index_X];
}
if (double_falg == 1)
{
for (counter_X = 0; text[counter_X] != '\0'; counter_X++)
{
transfer_X = counter_X;
if (text[transfer_X - 1] == text[transfer_X])
{
text_X[transfer_X] = 'X';
cnt_double++;
printf("%c\n", text[transfer_X]);
}
text_X[transfer_X] = text[transfer_X - cnt_double];
}
printf("%s\n", text_X);
}
If you're trying to create the modified array in text_X, copying data from new_text and putting an X between adjacent repeated letters (ignoring the possibility that the input contains XX), then you only need:
char new_text[] = "data with appalling repeats";
char text_X[SOME_SIZE];
int out_pos = 0;
for (int i = 0; new_text[i] != '\0'; i++)
{
text_X[out_pos++] = new_text[i];
if (new_text[i] == new_text[i+1])
text_X[out_pos++] = 'X';
}
text_X[out_pos] = '\0';
printf("Input: [%s]\n", new_text);
printf("Output: [%s]\n", text_X);
When wrapped in a basic main() function (and enum { SOME_SIZE = 64 };), that produces:
Input: [data with appalling repeats]
Output: [data with apXpalXling repeats]
To deal with repeated X's in the input, you could use:
text_X[out_pos++] = (new_text[i] == 'X') ? 'Q' : 'X';
It seems that your approach is more complicated than needed - too many loops and too many arrays involved. A single loop and two arrays should do.
The code below iterates over the original string, using idx to track the position, and uses the variable char_added to count how many extra chars have been added to the new array.
#include <stdio.h>
#define MAX_LEN 20
int main(void) {
char org_arr[MAX_LEN] = "aabbcc";
char new_arr[MAX_LEN] = {0};
int char_added = 0;
int idx = 1;
new_arr[0] = org_arr[0];
if (new_arr[0])
{
while(org_arr[idx])
{
if (org_arr[idx] == org_arr[idx-1])
{
new_arr[idx + char_added] = '*';
++char_added;
}
new_arr[idx + char_added] = org_arr[idx];
++idx;
}
}
puts(new_arr);
return 0;
}
Output:
a*ab*bc*c
Note: The code isn't fully tested. Also it lacks out-of-bounds checking.
There is a lot left to be desired in your Minimal, Complete, and Verifiable Example (MCVE). That said, what you need to do is fairly straightforward. Take a simple example:
"ssi"
According to your statement, you need to add a character between the adjacent 's' characters. (You can use whatever you like for the separator, but if your input consists of normal ASCII characters, you can set the separator to the next ASCII character after the current one, or subtract one if the current character is the last printable ASCII char, '~'.) See ASCII Table and Description.
For example, you could use memmove() to shift all characters beginning with the current character up by one and then set the current character to the replacement. You also need to track the current length so you don't write beyond your array bounds.
A simple function could be:
#include <stdio.h>
#include <string.h>
#define MAXC 1024
char *betweenduplicates (char *s)
{
size_t len = strlen(s); /* get length to validate room */
if (!len) /* if empty string, nothing to do */
return s;
for (int i = 1; s[i] && len + 1 < MAXC; i++) /* loop until end, or out of room */
if (s[i-1] == s[i]) { /* adjacent chars equal? */
memmove (s + i + 1, s + i, len - i + 1); /* move current+ up by one */
if (s[i-1] != '~') /* not last ASCII char */
s[i] = s[i-1] + 1; /* set to next ASCII char */
else
s[i] = s[i-1] - 1; /* set to previous ASCII char */
len += 1; /* add one to len */
}
return s; /* convenience return so it can be used immediately if needed */
}
A short example program taking the string to check as the first argument could be:
int main (int argc, char **argv) {
char str[MAXC];
if (argc > 1) /* if argument given */
strcpy (str, argv[1]); /* copy to str */
else
strcpy (str, "mississippi"); /* otherwise use default */
puts (str); /* output original */
puts (betweenduplicates (str)); /* output result */
}
Example Use/Output
$ ./bin/betweenduplicated
mississippi
mistsistsipqpi
or when there is nothing to replace:
$ ./bin/betweenduplicated dog
dog
dog
Or checking the extremes:
$ ./bin/betweenduplicated "two  spaces  and  alligators  ~~"
two  spaces  and  alligators  ~~
two ! spaces ! and ! almligators ! ~}~
There are a number of ways to approach it. Let me know if you have further questions.

chars are not getting copied properly

This function is supposed to expand expressions like "a-z" in s1 to "abcd....xyz" in s2,
but for some reason it does not work properly: every time I print s2, it stops at the second char that is supposed to be expanded.
For example, if s1="a-z", printing s2 gives me "ab".
Why?
void expand(char s1[], char s2[]) {
int i, j, k;
for (i = 0, j = 0; s1[i] != '\0'; i++, j++) {
if (s1[i] == '-' && s1[i-1] != ' ' && s1[i+1] != ' ') {
for (k = s1[i-1]+1; k < s1[i+1]; ++j, ++k)
s2[j] = k;
} else {
s2[j] = s1[i];
}
}
}
The function is called this way:
int caller (void) {
char des[30];
expand("a-z", des);
printf("%s\n", des);
}
Why so many complications?
Your code has some problems; nevertheless, it produces more meaningful output for me than what you are getting.
$ ./main.out
abcdefghijklmnopqrstuvwxyz
Just that, you need to modify the termination condition in the inner loop to include the last character.
Let me help you with a readable solution. I have tried to derive it out of your code's gist only, as there's nothing horribly wrong there. There are still much better ways to write code than below, but for quickness' sake... You can put your own validations.
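void expand (char s1[], char s2[])
{
int cnt = 0;
for (int i = 0; s1[i] != '\0'; ++i)
{
switch (s1[i])
{
case '-':
if (i == 0 || s1[i+1] == '\0') { /* no range around it: copy '-' literally */
s2[cnt++] = s1[i];
break;
}
/* fill in the characters strictly between the two endpoints;
the endpoints themselves are copied by the default case */
for (char k = s1[i-1]+1; k < s1[i+1]; ++k)
s2[cnt++] = k;
break;
default:
s2[cnt++] = s1[i];
}
}
s2[cnt] = '\0'; /* nul-terminate the result so it can be printed */
}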
In your function, you use the following snippet of code:
for (i = 0, j = 0; s1[i] != '\0'; i++, j++) {
if (s1[i] == '-' && s1[i-1] != ' ' && s1[i+1] != ' ') {
If you are going to read s1[i-1], then i cannot start at 0, or the first loop iteration reads s1[-1], which is outside the array bounds. This is an error that produces Undefined Behaviour. An alternative is to begin at i = 1, or to check whether strncmp(s1 + i, " - ", 3) == 0, which never reads before i and never reads past a \0.
for (i = 0, j = 0; s1[i] != '\0'; i++, j++) {
if (strncmp(s1 + i, " - ", 3) == 0) {
(But it is possible that this is not what you are looking for. Checking that the character at i is -, the character at i-1 is a space, and the character at i+1 is another space is roughly equivalent to checking whether the character sequence starting at i (note: at i, not at i-1) is the sequence " - ", whereas your condition requires the neighbours of - not to be spaces.)
The problem in your code is that you need a separate buffer to copy the string into, because something like a-z expands to a string longer than the original a-z sequence. First of all, you must recognize the subsequence of two chars (either of which may or may not be the - char) with a - in the middle. This is something you can do with this state machine:
/* expand.c -- expands ranges in the form a-b
* Date: Fri Dec 20 08:02:30 EET 2019
*/
#include <stdio.h>
#include <sys/types.h> /* for ssize_t (POSIX) */
#define ERR_ENOSPACE (-1)
#define ERR_ERANGE (-2)
ssize_t expand(
char *source, /* the source string */
char *target, /* the target string */
size_t target_length) /* the target length */
{
int ch, /* the character to copy */
first_char, /* first char in range */
last_char; /* last char in range */
size_t len = 0; /* the length of the complete range string */
ssize_t result = 0; /* the length returned */
while((ch = *source++) != 0) { /* source is the input string */
switch (len) { /* length of substring (or machine state). */
case 0:
first_char = ch; /* annotate first char */
len = 1; /* state is now 1 */
break;
case 1: switch (ch) {
case '-': /* valid range go to state 2 */
len = 2;
break;
default: /* not a valid range, store a new first char
and remain in this state. And copy the last
char to the output string. */
if (target_length < 3) {
/* not enough space (3 is needed for first_char,
* this char and the final \0 char) */
return ERR_ENOSPACE;
}
*target++ = first_char; target_length--;
first_char = ch; /* len = 1; */
result++;
} break;
case 2:
last_char = ch; /* we completed a range */
if (first_char > last_char)
return ERR_ERANGE;
ssize_t n = last_char - first_char + 1; /* number of output chars */
if (n + 1 > target_length) {
/* we need space for n characters, plus a '\0' char */
return ERR_ENOSPACE;
}
/* copy the string */
while (first_char <= last_char)
*target++ = first_char++;
target_length -= n;
result += n;
len = 0; /* state comes back to 0 */
break;
} /* switch (len) */
} /* while */
/* check state on end. */
switch (len) { /* depending on length we need to add a partial
built sequence */
case 0: break; /* nothing to append */
case 1: /* we have a spare first_char, add it */
if (target_length < 2)
return ERR_ENOSPACE;
*target++ = first_char; target_length--;
result++;
break;
case 2: if (target_length < 3)
return ERR_ENOSPACE;
*target++ = first_char; *target++ = '-';
target_length -= 2; result += 2;
break;
}
/* now fill the final \0 */
if (target_length < 1) {
return ERR_ENOSPACE;
}
*target = '\0';
return result;
} /* expand */
int main()
{
char line[1024];
char outbuf[8192];
while (fgets(line, sizeof line, stdin)) {
ssize_t n = expand(line, outbuf, sizeof outbuf);
#define CASE(err) case err: fprintf(stderr, "ERROR: " #err "\n"); break;
switch(n) {
CASE(ERR_ENOSPACE)
CASE(ERR_ERANGE)
default: printf("OUTPUT: %s\n", outbuf); break;
} /* switch */
} /* while */
} /* main */
The sample code is a full sample (with a main() routine) you can compile and test.
To add a bit of flexibility in handling your s1 string so it will accept "a-z", "az" or "a #%^#%& z", assuming the string starts with the first character in the range, you can save the first character and then iterate over s1 looking for the next alpha character as the end character for the expansion. You can use the isalpha() macro provided in ctype.h.
You also need to handle the cases where no ending character is found, or the ending character is less than the beginning from an ASCII value standpoint.
Note this only works for ASCII character sets. Other character sets do not guarantee sequential alpha-characters "A-Z" or "a-z".
You can do something similar to the following:
#include <ctype.h>
...
void expand (const char *s1, char *s2)
{
int c = *s1; /* assign 1st char in s1 to c */
if (!isalpha(*s1++)) { /* validate 1st char is alpha character */
fputs ("error: invalid format s1\n", stderr);
return;
}
*s2++ = c; /* assign c to 1st char in s2 */
*s2 = 0; /* nul-terminate in case end char in s1 not found */
while (*s1 && !isalpha(*s1)) /* loop s1 looking for next alpha char */
s1++;
if (!*s1) { /* if 2nd alpha char not found, handle error */
fputs ("error: invalid format s1\n", stderr);
return;
}
for (c++; c <= *s1; c++) /* 1st char already stored; continue from the next char up to the end char */
*s2++ = c; /* assign c to s2, increment pointer */
*s2 = 0; /* nul-terminate s2 */
}
(note: the initial nul-termination of s2 after the first character covers both cases where (1) no ending alpha-character is found or (2) the end character has an ASCII value less than the first. Consider changing the order of the s1 and s2 parameter to make them consistent with strcpy, etc..)
If you like you can iterate to find the first alpha character in s1 to handle leading non-alpha characters -- that is left to you.
Look things over and let me know if you have further questions.

Store every string that start and end with a special words into an array in C

I have a long string and I want to store every substring that starts and ends with a special word into an array, and then remove duplicate strings. In my long string, there is no space, comma, or any other separator between words, so I cannot use strtok. The start marker is start and the end marker is end. This is the code I have so far (but it doesn't work because it is using strtok()).
char buf[] = "start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end";
char *array[5];
char *x;
int i = 0, j = 0;
array[i] = strtok(buf, "start");
while (array[i] != NULL) {
array[++i] = strtok(NULL, "start");
}
//removeDuplicate(array[i]);
for (i = 0; i < 5; i++)
for (j = 0; j < 5; j++)
if (strcmp(array[i], array[j]) == 0)
x[i++] = array[i];
printf("%s", x[i]);
Example input:
start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end
Output equivalent to:
char *array[]= { "start-12-3.end", "start-12-4.end", "start-13-3.end" };
The second start-12-4.end string has been eliminated in the output.
I've also tried strstr, but it has some issues:
int main(int argc, char **argv)
{
char string[] = "This-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
while (counter < 4)
{
char *result1 = strstr(string, "this");
int start = result1 - string;
char *result = strstr(string, "test");
int end = result - string;
end += 4;
printf("\n%s\n", result);
memmove(result, result1, end += 4);
counter++;
}
}
To put the strings into an array and remove duplicates, I've tried the following code, but it has issues:
int main(void)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
const char *b_token = "this";
const char *e_token = "test";
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
char *a[50];
int i=0, j;
char *s;
while ((b_mark = strstr(buffer, b_token)) != 0 && (e_mark =strstr(b_mark, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
s = (char *) malloc(length);
strncpy(s, b_mark, length);
a[i]=s;
i++;
buffer = e_mark + e_len;
}
for (i=0; i<strlen(s); i++)
printf ("%s",a[i]);
free(s);
/*
//remove duplicate string
for (i=0; i<4; i++)
for (j=0; j<4; j++)
{
if (a[i] == NULL || a[j] == NULL || i == j)
continue;
if (strcmp (a[i], a[j]) == 0) {
free(a[i]);
a[i] = NULL;
}
printf("%s\n", a[i]);
*/
return 0;
}
This works with your provided example and was tested in Valgrind for memory leaks, but it might require further testing.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
unsigned tokens_find_amount( char const* const string, char const* const delim )
{
unsigned counter = 0;
char const* pos = string;
while( pos != NULL )
{
if( ( pos = strstr( pos, delim ) ) != NULL )
{
pos++;
counter++;
}
}
return counter;
}
void tokens_remove_duplicate( char** const tokens, unsigned tokens_num )
{
for( unsigned i = 0; i < tokens_num; i++ )
{
for( unsigned j = 0; j < tokens_num; j++ )
{
if( tokens[i] == NULL || tokens[j] == NULL || i == j )
continue;
if( strcmp( tokens[i], tokens[j] ) == 0 )
{
free( tokens[i] );
tokens[i] = NULL;
}
}
}
}
void tokens_split( char const* const string, char const* const delim, char** tokens )
{
unsigned counter = 0;
char const* pos, *lastpos;
lastpos = string;
pos = string + 1;
while( pos != NULL )
{
if( ( pos = strstr( pos, delim ) ) != NULL )
{
*(tokens++) = strndup( lastpos, (unsigned long )( pos - lastpos ));
lastpos = pos;
pos++;
counter++;
continue;
}
*(tokens++) = strdup( lastpos );
}
}
void tokens_free( char** tokens, unsigned tokens_number )
{
for( unsigned i = 0; i < tokens_number; ++i )
{
free( tokens[ i ] );
}
}
void tokens_print( char** tokens, unsigned tokens_number )
{
for( unsigned i = 0; i < tokens_number; ++i )
{
if( tokens[i] == NULL )
continue;
printf( "%s ", tokens[i] );
}
}
int main(void)
{
char const* buf = "start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end";
char const* const delim = "start";
unsigned tokens_number = tokens_find_amount( buf, delim );
char** tokens = malloc( tokens_number * sizeof( char* ) );
tokens_split( buf, delim, tokens );
tokens_remove_duplicate( tokens, tokens_number );
tokens_print( tokens, tokens_number );
tokens_free( tokens, tokens_number );
free( tokens );
return 0;
}
Basic splitting — identifying the strings
In a comment, I suggested:
Use strstr() to locate occurrences of your start and end markers. Then use memmove() (or memcpy()) to copy parts of the strings around. Note that since your start and end markers are adjacent in the original string, you can't simply insert extra characters into it — which is also why you can't use strtok(). So, you'll have to make a copy of the original string.
Another problem with strtok() is that it looks for any one of the delimiter characters; it does not look for the characters in sequence. strtok() also modifies its input string, zapping the delimiter it finds, which is clearly not what you need. Generally, IMO, strtok() is only a source of headaches and seldom an answer to a problem. If you must use something like strtok(), use POSIX strtok_r() or Microsoft's strtok_s(). Microsoft's function is essentially the same as strtok_r() except for the spelling of the function name. (The Standard C Annex K version of strtok_s() is different from both POSIX and Microsoft; see Do you use the TR 24731 'safe' functions?)
In another comment, I noted:
Use strstr() again, starting from where the start portion ends, to find the next end marker. Then, knowing the start of the whole section, and the start of the end and the length of the end, you can arrange to copy precisely the correct number of characters into the new string, and then null terminate if that's appropriate, or comma terminate. Something like:
if ((start = strstr(source, "start")) != 0 && (end = strstr(start, "end")) != 0)
then the data is between start and end + 2 (inclusive) in your source string. Repeat starting from the character after the end of 'end'.
You then said:
I've tried following code but it doesn't work fine; would u please tell me what's wrong with it?
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char string[] = "This-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
while (counter < 4)
{
char *result1 = strstr(string, "This");
int start = result1 - string;
char *result = strstr(string, "test");
int end = result - string;
end += 4;
printf("\n%s\n", result);
memmove(result, result1, end += 4);
counter++;
}
}
I observed:
The main problem appears to be searching for This with a capital T but the string only contains a single capital T. You should also look at Is there a way to specify how many characters of a string to print out using printf()?
Even assuming you fix the This vs this glitch, there are other issues.
You print the entire string.
You don't change the starting point for the search.
Your moving code adds 4 to end a second time.
You don't use start.
The code should print from result1, not result.
With those fixed, the code runs but produces:
testthis-two.testthis-three.testthis-two.test
testtestthis-three.testthis-two.test
testtthis-two.test
test?
and a core dump (segmentation fault).
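The printf() precision trick referred to above limits how many characters of a string are printed. A minimal sketch, using snprintf() so that the result can be inspected; format_prefix() is a hypothetical name used only for this illustration:

```c
#include <stdio.h>

/* Hypothetical helper: write at most n characters of s into out.
 * "%.*s" takes the precision as an int argument placed before the string. */
int format_prefix(char *out, size_t outsz, const char *s, int n)
{
    return snprintf(out, outsz, "%.*s", n, s);
}
```

Calling format_prefix(buf, sizeof(buf), "this-one.testthis-two.test", 13) stores this-one.test in buf; the same format passed to printf() prints only that part of the string.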
Code identifying the strings
This is what I created, based on a mix of your code and my commentary:
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
const char *b_token = "this";
const char *e_token = "test";
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
while ((b_mark = strstr(buffer, b_token)) != 0 &&
(e_mark = strstr(b_mark, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
printf("%d: %.*s\n", ++counter, length, b_mark);
buffer = e_mark + e_len;
}
return 0;
}
Clearly, this code does no moving of data, but being able to isolate the data to be moved is a key first step to completing that part of the exercise. Extending it to make copies of the strings so that they can be compared is fairly easy. If it is available to you, the strndup() function will be useful:
char *strndup(const char *s1, size_t n);
The strndup() function copies at most n characters from the string s1 always NUL terminating the copied string.
If you don't have it available, it is fairly straightforward to implement, and more so if you have strnlen() available:
size_t strnlen(const char *s, size_t maxlen);
The strnlen() function attempts to compute the length of s, but never scans beyond the first maxlen bytes of s.
Neither of these is a standard C library function, but they're defined as part of POSIX (strnlen() and strndup()) and are available on BSD and Mac OS X; Linux has them, and probably other versions of Unix do too. The specifications shown are quotes from the Mac OS X man pages.
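If neither function is available, fallbacks along the following lines should do. This is a sketch, not the library implementation; the names my_strnlen() and my_strndup() are my own, chosen to avoid clashing with system versions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback for strnlen(): length of s, examining at most
 * maxlen bytes. */
size_t my_strnlen(const char *s, size_t maxlen)
{
    const char *nul = memchr(s, '\0', maxlen);
    return nul ? (size_t)(nul - s) : maxlen;
}

/* Hypothetical fallback for strndup(): copy at most n characters of s
 * into freshly allocated memory and always null terminate the copy.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *my_strndup(const char *s, size_t n)
{
    size_t len = my_strnlen(s, n);
    char *copy = malloc(len + 1);

    if (copy != NULL)
    {
        memcpy(copy, s, len);
        copy[len] = '\0';
    }
    return copy;
}
```

For example, my_strndup(b_mark, length) would return a separately allocated copy of one found section, ready to be compared with strcmp() and later released with free().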
Example output:
I called the program stst (for start-stop).
$ ./stst
1: this-one.test
2: this-two.test
3: this-three.test
4: this-two.test
$
There are multiple features to observe:
Since main() ignores its arguments, I removed the arguments (my default compiler options won't allow unused arguments).
I case-corrected the string.
I set up constant strings b_token and e_token for the beginning and end markers. The names are symmetric deliberately. This could readily be transplanted into a function where the tokens are arguments to the function, for example.
Similarly I created the b_mark and e_mark variables for the positions of the begin and end markers.
The name buffer is a pointer to where to start searching.
The loop uses the test I outlined in the comments, adapted to the chosen names.
The printing code determines how long the found string is and prints only that data. It prints the counter value.
The reinitialization code skips all the previously printed material.
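The transplant into a function with the tokens as arguments might look like the sketch below. The name next_section() and its interface are assumptions of mine, and it starts the search for the end token just past the begin token, which is a choice made for this sketch.

```c
#include <string.h>

/* Hypothetical helper: locate the next section in buffer that starts with
 * b_token and ends with e_token. Returns a pointer to the start of the
 * section and stores its length in *length, or returns NULL if there is
 * no complete section left. */
const char *next_section(const char *buffer, const char *b_token,
                         const char *e_token, size_t *length)
{
    const char *b_mark = strstr(buffer, b_token);
    if (b_mark == NULL)
        return NULL;

    /* Look for the end token after the begin token (sketch's choice). */
    const char *e_mark = strstr(b_mark + strlen(b_token), e_token);
    if (e_mark == NULL)
        return NULL;

    *length = (size_t)(e_mark - b_mark) + strlen(e_token);
    return b_mark;
}
```

A caller would loop while next_section() returns non-NULL, print the section with "%.*s" (casting the length to int), and then resume the search at the returned pointer plus the length.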
Command line options for generality
You could generalize the code a bit by accepting command line arguments and processing each of those in turn if any are provided; you'd use the string you provide as a default when no string is provided. A next level beyond that would allow you to specify something like:
./stst -b beg -e end 'kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road'
and you'd get output such as:
1: beg-waffles-end
2: beg-for-mercy-end
Here's code that implements that, using the POSIX getopt().
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char **argv)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
const char *b_token = "this";
const char *e_token = "test";
int opt;
int b_len;
int e_len;
while ((opt = getopt(argc, argv, "b:e:")) != -1)
{
switch (opt)
{
case 'b':
b_token = optarg;
break;
case 'e':
e_token = optarg;
break;
default:
fprintf(stderr, "Usage: %s [-b begin][-e end] ['beginning-to-end...' ...]\n", argv[0]);
return 1;
}
}
/* Use string if no argument supplied */
if (optind == argc)
{
argv[argc-1] = string;
optind = argc - 1;
}
b_len = strlen(b_token);
e_len = strlen(e_token);
printf("Begin: (%d) [%s]\n", b_len, b_token);
printf("End: (%d) [%s]\n", e_len, e_token);
for (int i = optind; i < argc; i++)
{
char *buffer = argv[i];
int counter = 0;
char *b_mark;
char *e_mark;
printf("Analyzing: [%s]\n", buffer);
while ((b_mark = strstr(buffer, b_token)) != 0 &&
(e_mark = strstr(b_mark + b_len, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
printf("%d: %.*s\n", ++counter, length, b_mark);
buffer = e_mark + e_len;
}
}
return 0;
}
Note how this program documents what it is doing, printing out the control information. That can be very important during debugging — it helps ensure that the program is working on the data you expect it to be working on. The searching is better too; it works correctly with the same string as the start and end marker (or where the end marker is a part of the start marker), which the previous version did not (because this version uses b_len, the length of b_token, in the second strstr() call). Both versions are quite happy with adjacent end and start tokens, but they're equally happy to skip material between an end token and the next start token.
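The difference between the two searches can be checked in isolation. In this sketch, section_length() is a hypothetical helper whose skip_begin flag selects between searching for the end token from the begin marker itself (as the first program does) and from just past the begin token (as this program does).

```c
#include <string.h>

/* Hypothetical helper: length of the first b_token..e_token section in
 * buffer, or -1 if there is none. If skip_begin is non-zero, the search
 * for e_token starts after b_token; otherwise it starts at b_token. */
long section_length(const char *buffer, const char *b_token,
                    const char *e_token, int skip_begin)
{
    const char *b_mark = strstr(buffer, b_token);
    if (b_mark == NULL)
        return -1;

    const char *from = skip_begin ? b_mark + strlen(b_token) : b_mark;
    const char *e_mark = strstr(from, e_token);
    if (e_mark == NULL)
        return -1;

    return (long)(e_mark - b_mark) + (long)strlen(e_token);
}
```

With "th" as both tokens on the default string, the search without the skip matches the begin marker itself and reports a length of 2 (just th), whereas the search with the skip reports 15, which corresponds to this-one.testth in the example run.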
Example runs:
$ ./stst -b beg -e end 'kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road'
Begin: (3) [beg]
End: (3) [end]
Analyzing: [kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road]
1: beg-waffles-end
2: beg-for-mercy-end
$ ./stst -b th -e th
Begin: (2) [th]
End: (2) [th]
Analyzing: [this-one.testthis-two.testthis-three.testthis-two.test]
1: this-one.testth
2: this-th
$ ./stst -b th -e te
Begin: (2) [th]
End: (2) [te]
Analyzing: [this-one.testthis-two.testthis-three.testthis-two.test]
1: this-one.te
2: this-two.te
3: this-three.te
4: this-two.te
$
After update to question
You have to account for the trailing null byte by allocating enough space for length + 1 bytes. Using strncpy() is fine but in this context guarantees that the string is not null terminated; you must null terminate it.
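A minimal sketch of that point, using a caller-supplied buffer rather than malloc(); copy_section() is a hypothetical name used only for this illustration:

```c
#include <string.h>

/* Hypothetical helper: copy exactly length bytes from start into dst,
 * which has room for dstsz bytes, and null terminate the result.
 * Returns 0 on success, -1 if dst is too small for length + 1 bytes. */
int copy_section(char *dst, size_t dstsz, const char *start, size_t length)
{
    if (length + 1 > dstsz)
        return -1;

    /* strncpy() writes no '\0' when start has at least length characters. */
    strncpy(dst, start, length);
    dst[length] = '\0';     /* so the terminator must be added explicitly */
    return 0;
}
```

Without the explicit dst[length] = '\0', whatever happened to be in the buffer after the copied characters would be treated as part of the string.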
Your duplicate elimination code, commented out, was not particularly good — too many null checks when none should be necessary. I've created a print function; the tag argument allows it to identify which set of data it is printing. I should have put the 'free' loop into a function. The duplicate elimination code could (should) be in a function; the string extraction code could (should) be in a function — as in the answer by pikkewyn. I extended the test data (string concatenation is wonderful in contexts like this).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void dump_strings(const char *tag, char **strings, int num_str)
{
printf("%s (%d):\n", tag, num_str);
for (int i = 0; i < num_str; i++)
printf("%d: %s\n", i, strings[i]);
putchar('\n');
}
int main(void)
{
char string[] =
"this-one.testthis-two.testthis-three.testthis-two.testthis-one.test"
"this-1-testthis-1-testthis-2-testthis-1-test"
"this-1-testthis-1-testthis-1-testthis-1-test"
;
const char *b_token = "this";
const char *e_token = "test";
int b_len = strlen(b_token);
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
char *a[50];
int num_str = 0;
while (num_str < (int)(sizeof(a) / sizeof(a[0])) && // Don't overflow a[]
(b_mark = strstr(buffer, b_token)) != 0 &&
(e_mark = strstr(b_mark + b_len, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
char *s = (char *) malloc(length + 1); // Allow for null
if (s == 0)
break; // Out of memory: stop collecting strings
strncpy(s, b_mark, length);
s[length] = '\0'; // Null terminate the string
a[num_str++] = s;
buffer = e_mark + e_len;
}
dump_strings("After splitting", a, num_str);
//remove duplicate strings
for (int i = 0; i < num_str; i++)
{
for (int j = i + 1; j < num_str; j++)
{
if (strcmp(a[i], a[j]) == 0)
{
free(a[j]); // Free the higher-indexed duplicate
a[j] = a[--num_str]; // Move the last element here
j--; // Examine the new string next time
}
}
}
dump_strings("After duplicate elimination", a, num_str);
for (int i = 0; i < num_str; i++)
free(a[i]);
return 0;
}
Testing with valgrind gives this a clean bill of health: no memory faults, no leaked data.
Sample output:
After splitting (13):
0: this-one.test
1: this-two.test
2: this-three.test
3: this-two.test
4: this-one.test
5: this-1-test
6: this-1-test
7: this-2-test
8: this-1-test
9: this-1-test
10: this-1-test
11: this-1-test
12: this-1-test
After duplicate elimination (5):
0: this-one.test
1: this-two.test
2: this-three.test
3: this-1-test
4: this-2-test
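As noted, the duplicate elimination and the 'free' loop belong in functions. A sketch of how that might look; the names remove_duplicates() and free_strings() are mine, and the strings are assumed to have been allocated with malloc() as in the program above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: remove duplicate strings from the array, freeing
 * each duplicate and moving the last element into its place. Returns the
 * new number of strings; the order of the survivors may change. */
int remove_duplicates(char **strings, int num_str)
{
    for (int i = 0; i < num_str; i++)
    {
        int j = i + 1;
        while (j < num_str)
        {
            if (strcmp(strings[i], strings[j]) == 0)
            {
                free(strings[j]);                  /* Free the duplicate */
                strings[j] = strings[--num_str];   /* Move the last element here */
            }
            else
                j++;                               /* Only advance past non-duplicates */
        }
    }
    return num_str;
}

/* Hypothetical helper: release every string still in the array. */
void free_strings(char **strings, int num_str)
{
    for (int i = 0; i < num_str; i++)
        free(strings[i]);
}
```

In main(), the two loops would then reduce to num_str = remove_duplicates(a, num_str); followed, after the final dump_strings() call, by free_strings(a, num_str);.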

Resources