Remove adjacent duplicates in a string in C

How can I remove all adjacent duplicates in a string in C? For example, if "caaabbcdd" is the given string, the duplicates should be removed step by step:
1. cbbcdd
2. ccdd
3. dd
so an empty string is returned in the end. A time complexity of O(n^2) is acceptable as a starting point. Can anyone help?
This is what I have done so far:
void recursiven2(char *str)
{
int i,j,k,len;
len=strlen(str);
for(i=0;i<len-1;i++)
{
if(str[i]==str[i+1])
{
for(j=i;j<len-2;j++)
str[j]=str[j+2];
str[j]='\0';
}
}
}

You can refer to this. It has a very nice explanation.

Your code is close, but not quite.
It's easier to think about this in terms of "if this character is the same as the previous, drop it". Your code is more like "if this character is the same as the next", if you see the difference.
Also, memmove() just loves this.
Perhaps something like:
void compress(char *str)
{
size_t len = strlen(str);
if(len <= 1)
return;
for(size_t i = 1; i < len; )
{
if(str[i] == str[i - 1])
{
memmove(&str[i], &str[i + 1], len - i); /* len - i bytes: the tail plus the terminating '\0' */
--len;
}
else
++i;
}
}
There might be an obi-wan error (or two!) in the above, I haven't tested it.
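For comparison, the "drop it if it equals the previous character" idea can also be written as a single pass without memmove(). This is only a sketch, and the name collapse_runs is made up for illustration. Note that it collapses each run to one character, so "caaabbcdd" becomes "cabcd"; it does not perform the repeated removal the question describes.

```c
#include <assert.h>
#include <stddef.h>

/* Collapse every run of equal characters to a single character, in place.
   Hypothetical helper for illustration; not the code from the answer above. */
void collapse_runs(char *str)
{
    assert(str != NULL);
    if (str[0] == '\0')
        return;

    size_t out = 1;                       /* next write position */
    for (size_t in = 1; str[in] != '\0'; in++) {
        /* keep the character only if it differs from the last one kept */
        if (str[in] != str[out - 1])
            str[out++] = str[in];
    }
    str[out] = '\0';
}
```

Because each character is read and written at most once, this runs in O(n), whereas calling memmove() for every dropped character can cost O(n^2) on long runs.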

string removeDuplicates(string s) {
int done=0;
while(done==0)
{
int check=0;
for(size_t i=0;i+1<s.length();i++)
{
if(s[i]==s[i+1])
{
s.erase(i,2);
check=1;
}
}
if(check==0)
{
done=1;
}
}
return s;
}
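For a plain C version of the repeated removal asked for in the question, one possible sketch is below. It assumes that a whole run of equal characters (such as "aaa") is removed at once, as in the question's example, and it repeats passes until nothing changes, which is O(n^2) in the worst case. The function names are illustrative and not taken from any answer above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One pass: keep only characters that are not part of a run of length >= 2.
   Returns true if at least one run was removed. */
static bool remove_runs_once(char *str)
{
    size_t len = strlen(str);
    size_t out = 0;
    bool removed = false;

    for (size_t i = 0; i < len; ) {
        size_t j = i + 1;
        while (j < len && str[j] == str[i])
            j++;                          /* j is one past the end of the run */
        if (j - i == 1)
            str[out++] = str[i];          /* single character: keep it */
        else
            removed = true;               /* run of duplicates: drop it */
        i = j;
    }
    str[out] = '\0';
    return removed;
}

/* Repeat passes until the string no longer changes. */
void remove_adjacent_duplicates(char *str)
{
    assert(str != NULL);
    while (remove_runs_once(str))
        ;
}
```

With the string from the question, the first pass leaves "cc" and the second pass leaves the empty string.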

Related

Check if Char Array contains special sequence without using string library on Unix in C

Let's assume we have a char array and a sequence. We would like to check whether the char array contains the sequence WITHOUT the <string.h> library: if yes, return true; if no, return false.
bool contains(char *Array, char *Sequence) {
// CONTAINS - Function
for (int i = 0; i < sizeof(Array); i++) {
for (int s = 0; s < sizeof(Sequence); s++) {
if (Array[i] == Sequence[i]) {
// How to check if Sequence is contained ?
}
}
}
return false;
}
// in Main Function
char *Arr = "ABCDEFG";
char *Seq = "AB";
bool contained = contains(Arr, Seq);
if (contained) {
printf("Contained\n");
} else {
printf("Not Contained\n");
}
Any ideas, suggestions, websites ... ?
Thanks in advance,
Regards, from ∆

The simplest way is the naive search function:
for (i = 0; i < lenS1; i++) {
for (j = 0; j < lenS2; j++) {
if (arr[i + j] != seq[j]) {
break; // seq is not present in arr at position i!
}
}
if (j == lenS2) {
return true;
}
}
Note that you cannot use sizeof here, because sizeof is evaluated at compile time while the length of the string is only known at run time. Applied to a char * parameter, sizeof returns the size of the pointer, so almost certainly four or eight, whatever strings you pass. You need to calculate the string lengths explicitly, which in C relies on the last character of a string being a zero:
lenS1 = 0;
while (arr[lenS1]) lenS1++;
lenS2 = 0;
while (seq[lenS2]) lenS2++;
An obvious and easy improvement is to limit i between 0 and lenS1 - lenS2, and if lenS1 < lenS2, immediately return false. Obviously if you haven't found "HELLO" in "WELCOME" by the time you've gotten to the 'L', there's no chance of five-character HELLO being ever contained in the four-character remainder COME:
if (lenS1 < lenS2) {
return false; // You will never find "PEACE" in "WAR".
}
lenS1minuslenS2 = lenS1 - lenS2;
for (i = 0; i <= lenS1minuslenS2; i++) // <= so that a match at the very end is still found
Further improvements depend on your use case.
Looking for the same sequence among lots of arrays, looking for different sequences always in the same array, looking for lots of different sequences in lots of different arrays - all call for different optimizations.
The length and distribution of characters within both array and sequence also matter a lot, because if you know that there only are (say) three E's in a long string and you know where they are, and you need to search for HELLO, there's only three places where HELLO might fit. So you needn't scan the whole "WE WISH YOU A MERRY CHRISTMAS, WE WISH YOU A MERRY CHRISTMAS AND A HAPPY NEW YEAR" string. Actually you may notice there are no L's in the array and immediately return false.
A balanced option for an average use case (it does have pathological cases) might be supplied by the Boyer-Moore string matching algorithm (C source and explanation supplied at the link). This has a setup cost, so if you need to look for different short strings within very large texts, it is not a good choice (there is a parallel-search version which is good for some of those cases).
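Putting the pieces of the naive search together, a minimal sketch without <string.h> could look like the following. The helper name str_length is an assumption made for this example; the loop bound uses the lenS1 - lenS2 limit described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of a zero-terminated string, computed by hand (no <string.h>). */
static size_t str_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Naive search: returns true if seq occurs somewhere inside arr. */
bool contains(const char *arr, const char *seq)
{
    assert(arr != NULL && seq != NULL);

    size_t lenS1 = str_length(arr);
    size_t lenS2 = str_length(seq);

    if (lenS1 < lenS2)
        return false;                     /* seq cannot fit into arr */

    /* Only start positions that leave room for all of seq are tried. */
    for (size_t i = 0; i <= lenS1 - lenS2; i++) {
        size_t j = 0;
        while (j < lenS2 && arr[i + j] == seq[j])
            j++;
        if (j == lenS2)
            return true;                  /* every character of seq matched */
    }
    return false;
}
```

An empty sequence is reported as contained in any array, which matches the behaviour of strstr().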

This is not the most efficient algorithm but I do not want to change your code too much.
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
size_t mystrlen(const char *str)
{
const char *end = str;
while(*end++);
return end - str - 1;
}
bool contains(char *Array, char *Sequence) {
// CONTAINS - Function
bool result = false;
size_t s, i;
size_t arrayLen = mystrlen(Array);
size_t sequenceLen = mystrlen(Sequence);
if(sequenceLen <= arrayLen)
{
for (i = 0; i < arrayLen; i++) {
for (s = 0; s < sequenceLen; s++)
{
if (Array[i + s] != Sequence[s])
{
break;
}
}
if(s == sequenceLen)
{
result = true;
break;
}
}
}
return result;
}
int main()
{
char *Arr = "ABCDEFG";
char *Seq = "AB";
bool contained = contains(Arr, Seq);
if (contained)
{
printf("Contained\n");
}
else
{
printf("Not Contained\n");
}
}

Basically this is strstr
const char* strstrn(const char* orig, const char* pat, int n)
{
const char* it = orig;
do
{
const char* tmp = it;
const char* tmp2 = pat;
if (*tmp == *tmp2) {
while (*tmp == *tmp2 && *tmp != '\0') {
tmp++;
tmp2++;
}
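if (*tmp2 == '\0' && n-- == 0) /* count only complete matches of pat */
return it;
}
} while (*it++ != '\0');
return NULL;
}
The above returns a pointer to the n-th occurrence (counting from 0) of the substring pat in orig, or NULL if there are fewer occurrences.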

While splitting one char array into array of char arrays, first array always displayed as random chars

I am doing a school exercise that asks us to split a string (character array) into multiple character arrays. A string input like this
"asdf qwerty zxcv"
should result in an array of character arrays like this
"asdf","qwerty","zxcv"
While I am testing the code, no matter what strings I entered as the argument of my function, the first string printed out would always be some random characters, while the rest are as expected.
"02�9�","qwerty","zxcv"
Besides, my code worked fine in online compilers, which I saved here. I also tested in OnlineGDB, in which the code worked pretty well too.
This is my code with the main function:
#include <stdio.h>
#include <stdlib.h>
int is_separator(char c)
{
if (c == '\n' || c == '\t' || c == ' ' || c == '\0')
{
return (1);
}
else
{
return (0);
}
}
int ct_len(int index, char *str)
{
int i;
i = index;
while (!(is_separator(str[index])))
{
index++;
}
return (index - i);
}
int ct_wd(char *str)
{
int count;
int i;
i = 0;
count = 0;
while (str[i])
{
if (is_separator(str[i]))
count++;
i++;
}
return (count + 1);
}
char **ft_split_whitespaces(char *str)
{
char **tab;
int i;
int j;
int k;
i = 0;
j = 0;
tab = malloc(ct_wd(str));
while (str[j])
{
k = 1;
while (is_separator(str[j]))
j++;
*(tab + i) = (char *)malloc(sizeof(char) * ((ct_len(j, str) + 1)));
while (!(is_separator(str[j])))
{
tab[i][k - 1] = str[j++];
k++;
}
tab[i++][k - 1] = '\0';
}
tab[i] = 0;
return (&tab[0]);
}
int main(void)
{
char** res;
for (res = ft_split_whitespaces("asdf qwerty zxcv"); *res != 0; res++)
{
printf("'%s',", *res);
}
return (0);
}
One hint is that the output of the first array is changing, which suggests that there might be some problems with my memory allocation. However, I am not sure about it. If you can help me find out where the bug is, I would be really appreciative of your help. Thank you very much for reading.

Change this
tab = malloc(ct_wd(str));
to this
tab = malloc((ct_wd(str) + 1) * sizeof(char *));
The original call allocates only ct_wd(str) bytes, but the table needs one char * per word plus one more for the terminating null pointer stored by tab[i] = 0. You might also want to consider using valgrind, which should give a fair indication of where the corruption is. Essentially, the ct_wd(str) function together with the malloc statement after it is the main culprit, so take a closer look at how much memory you are allocating and how much you are actually using.
valgrind --tool=memcheck --leak-check=full --track-origins=yes <executable>
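To make the allocation sizes concrete, here is one way the split could be written. It is a sketch under a few assumptions: words are counted by word starts rather than by separators, '\0' is not treated as a separator, and the names split_whitespaces, count_words and is_space are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Count words by counting transitions from separator to non-separator. */
static size_t count_words(const char *str)
{
    size_t count = 0;
    int in_word = 0;
    for (; *str != '\0'; str++) {
        if (is_space(*str)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}

/* Returns a NULL-terminated table of newly allocated words, or NULL on
   allocation failure. The caller frees each word and then the table. */
char **split_whitespaces(const char *str)
{
    assert(str != NULL);
    size_t words = count_words(str);
    /* one pointer per word, plus one for the terminating NULL */
    char **tab = malloc((words + 1) * sizeof *tab);
    if (tab == NULL)
        return NULL;

    size_t i = 0;
    while (*str != '\0') {
        while (*str != '\0' && is_space(*str))
            str++;                        /* skip separators */
        if (*str == '\0')
            break;
        const char *start = str;
        while (*str != '\0' && !is_space(*str))
            str++;                        /* walk to the end of the word */
        size_t len = (size_t)(str - start);
        tab[i] = malloc(len + 1);         /* word plus terminating '\0' */
        if (tab[i] == NULL) {
            while (i > 0)
                free(tab[--i]);
            free(tab);
            return NULL;
        }
        memcpy(tab[i], start, len);
        tab[i][len] = '\0';
        i++;
    }
    tab[i] = NULL;
    return tab;
}
```

Counting word starts instead of separators keeps the table size correct even when the input has leading, trailing or repeated whitespace.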

C programming: output two strings as two columns next to each other

I have a question regarding an issue with a program in C I am making. I am going to write two different strings next to each other in two columns. I haven't found clear answers to my question since they almost always give examples of numbers with a known length or amount.
I have two strings with a maximum length of 1500 characters, but their actual length is unknown to me. For the sake of learning, let's give them these values:
char string1[] = "The independent country is not only self-governed nation with own authorities.";
char string2[] = "This status needs the international diplomatic recognition of sovereignty.";
I want to write them next to each other, with a column width of twenty characters. I have set the difference between the columns to a regular 'tab'. Like this:
The independent coun This status needs th
try is not only self e international dipl
-governed nation wit omatic recognition o
h own authorities. f sovereignty.
I have tried the following code, but it doesn't work well because I can't figure out how to adapt it to the length of the strings. It is also hard-coded to write five rows. I also get the below error.
Could someone please give me an example of how this could be done, maybe with a predefined C function in order to avoid the for loops?
void display_columns(char *string1, char *string2);
int main()
{
char string1[] = "The independent country is not only self-governed nation with own authorities.";
char string2[] = "This status needs the international diplomatic recognition of sovereignty.";
display_columns(string1,string2);
}
void display_columns(char *string1, char *string2)
{
int i,j;
for(i=0;i<5;i++)
{
for(j=0+20*i;j<20+20*i;j++)
{
printf("%c",string1[j]);
}
printf("\t");
for(j=0+20*i;j<20+20*i;j++)
{
printf("%c",string2[j]);
}
}
}

I guess this is a more generic way to do it.
void print_line(char *str, int *counter) {
for (int i = 0; i < 20; i++) {
if (str[*counter] != '\0') {
printf("%c", str[*counter]);
*counter += 1;
}
else { printf(" "); }
}
}
void display_columns(char *string1, char *string2)
{
int counter = 0, counter2 = 0;
while (1) {
print_line(string1, &counter);
printf("\t");
print_line(string2, &counter2);
printf("\n");
if (string1[counter] == '\0' && string2[counter2] == '\0') {
break;
}
}
}

To print a single character, use:
printf("%c",string1[j]);
or
putchar(string1[j]);
This is the reason for the warnings and segmentation fault.
With this fix, the program somewhat works, you just have to print a newline as the last part of the loop:
for(i=0;i<5;i++)
{
for(j=0+20*i;j<20+20*i;j++)
{
putchar(string1[j]);
}
printf("\t");
for(j=0+20*i;j<20+20*i;j++)
{
putchar(string2[j]);
}
putchar('\n');
}
Update: For the function to work with strings of variable lengths, try this:
void display_columns(char *string1, char *string2)
{
int i,j;
int len1 = strlen(string1);
int len2 = strlen(string2);
int maxlen = (len1 > len2) ? len1 : len2;
int numloops = (maxlen + 20 - 1) / 20;
for(i=0; i<numloops; i++)
{
for(j=0+20*i;j<20+20*i;j++)
{
if (j < len1)
putchar(string1[j]);
else
putchar(' '); // Fill with spaces for correct alignment
}
printf("\t");
for(j=0+20*i;j<20+20*i;j++)
{
if (j < len2)
putchar(string2[j]);
else
break; // Just exit from the loop for the right side
}
putchar('\n');
}
}
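The question also asks for a predefined function instead of explicit character loops. One option, sketched below, is the width and precision of the printf family: %-20.20s prints at most 20 characters and pads the result to 20 columns. The helper format_row and the COLUMN_WIDTH macro are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COLUMN_WIDTH 20

/* Format row number `row` (counting from 0) of both columns into buf.
   Returns what snprintf returns: the length the full row needs. */
int format_row(char *buf, size_t size, const char *s1, const char *s2, size_t row)
{
    assert(buf != NULL && s1 != NULL && s2 != NULL);
    size_t off = row * COLUMN_WIDTH;
    /* Past the end of a string, print an empty chunk instead. */
    const char *left = off < strlen(s1) ? s1 + off : "";
    const char *right = off < strlen(s2) ? s2 + off : "";

    /* "%-*.*s": left-justified, padded to the width, cut at the precision. */
    return snprintf(buf, size, "%-*.*s\t%.*s",
                    COLUMN_WIDTH, COLUMN_WIDTH, left, COLUMN_WIDTH, right);
}

void display_columns(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    size_t maxlen = len1 > len2 ? len1 : len2;
    size_t rows = (maxlen + COLUMN_WIDTH - 1) / COLUMN_WIDTH;
    char line[2 * COLUMN_WIDTH + 2];      /* two columns, a tab and '\0' */

    for (size_t row = 0; row < rows; row++) {
        format_row(line, sizeof line, s1, s2, row);
        puts(line);
    }
}
```

The row loop is still needed, but the per-character loops and the manual space padding are replaced by the format string.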

How do I check a Palindrome in C while ignoring case sensitivity and punctuation?

I am currently trying to write a palindrome checker that ignores punctuation and case, the first version using arrays and the second using pointers. My problem is that I'm unable to figure out how. The code seems to work fine other than that. I've also written a lower-case-to-upper-case function, but I don't think it works anyhow.
This is my first code using arrays.
int is_palindrome1(const char phrase[], int length)
{
int first = phrase[0];
int last = phrase[length - 1];
for (length = 0; phrase[length] != '\0'; length++)
{
while (last > first)
{
if ((phrase[first]) != (phrase[last]))
{
return 0;
}
last--;
first++;
}
break;
}
return 1;
}
This is my 2nd palindrome code using pointers.
int is_palindrome2(const char *phrase, int length)
{
int i;
length = strlen(phrase);
for (i = 0; i < length / 2; i++)
{
if (*(phrase + i) != *(phrase + length - i - 1))
{
return 0;
}
}
return 1;
}
Here is my lower case to upper case function.
char lower_to_upper(char lower, char upper)
{
if (lower >= 'a' && lower <= 'z')
{
upper = ('A' + lower - 'a');
return upper;
}
else
{
upper = lower;
return upper;
}
}

So. Let's do this in steps.
The simplest is_palindrome function:
This will look very similar to your code. Except that some syntax problems that you have are fixed. Note that s and e point to the first and last character of the string.
bool is_palindrome(const char *phrase, unsigned length) {
const char *s = phrase + 0;
const char *e = phrase + length - 1;
while (s < e) {
if (*s != *e)
return false;
s += 1;
e -= 1;
}
return true;
}
Let's add lowercase / uppercase comparisons:
The simplest way to do this is to convert all valid characters to uppercase. It looks like you had this idea as well with your talking about a lower_to_upper() function.
The only problem is that your function has a really odd signature (why is upper an argument?). An easy fix is to use the standard library function toupper() from <ctype.h>.
bool is_palindrome(const char *phrase, unsigned length) {
const char *s = phrase + 0;
const char *e = phrase + length - 1;
while (s < e) {
if (toupper(*s) != toupper(*e))
return false;
s += 1;
e -= 1;
}
return true;
}
What about those other characters (like spaces)?
Now, the last piece is that you want to ignore spaces and punctuation. Rather than wording it that way, it might be better to talk about the characters that we do want to compare. I think that you are looking to only compare alphanumeric characters. These are a-z, A-Z, and 0-9. To test if a character is one of these, we could build a custom function, or we could use the standard isalnum() from <ctype.h> to do that:
bool is_palindrome(const char *phrase, unsigned length) {
const char *s = phrase + 0;
const char *e = phrase + length - 1;
while (s < e) {
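if (!isalnum((unsigned char)*s)) { /* <ctype.h> functions expect an unsigned char value */
s++;
} else if (!isalnum((unsigned char)*e)) {
e--;
} else if (toupper((unsigned char)*s) == toupper((unsigned char)*e)) {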
s++;
e--;
} else {
return false;
}
}
return true;
}
Some final thoughts:
Note that on each pass of the loop, we move either s, e, or both one step. This ensures that we will eventually complete the loop. Our condition of s < e also ensures that once we reach the "middle" of the string, we finish. I put middle in quotes because, for the string "ab a", the middle is the second character.
Languages are complicated beasts:
English has a fairly straightforward encoding in most (all?) systems. But other languages are not always that straightforward. In a comment, chux had a recommendation about this:
A locale may have a many-to-1 mapping of lower to upper or vice versa; using the round trip if (tolower(toupper(*s)) != tolower(toupper(*e))) handles that.
I'm personally not as concerned, because I feel that around the same point that we worry about this, we should also worry about how the text is encoded. Is it UTF-8? Is it something else? This is probably beyond your instructor's expectations.
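For completeness, here is a self-contained sketch of the final version with its headers. It differs from the snippets above in three assumed details: the length is computed with strlen() instead of being passed in, the empty string is handled explicitly, and characters are converted to unsigned char before being handed to the <ctype.h> functions. The name is_palindrome_text is made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Case-insensitive palindrome test that skips everything except
   letters and digits. */
bool is_palindrome_text(const char *phrase)
{
    assert(phrase != NULL);
    size_t length = strlen(phrase);
    if (length == 0)
        return true;                      /* nothing to compare */

    const char *s = phrase;               /* first character */
    const char *e = phrase + length - 1;  /* last character */

    while (s < e) {
        unsigned char left = (unsigned char)*s;
        unsigned char right = (unsigned char)*e;

        if (!isalnum(left)) {
            s++;                          /* skip punctuation on the left */
        } else if (!isalnum(right)) {
            e--;                          /* skip punctuation on the right */
        } else if (toupper(left) == toupper(right)) {
            s++;
            e--;
        } else {
            return false;
        }
    }
    return true;
}
```

A phrase such as "A man, a plan, a canal: Panama" is accepted, while "Hello" is rejected.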

Effective way of checking if a given string is palindrome in C

I was preparing for my interview and started working through simple C programming questions. One question I came across was to check whether a given string is a palindrome. I wrote code that checks whether a user-supplied string is a palindrome using pointers. I'd like to know whether this is efficient in terms of runtime, or whether there is any enhancement I could make. It would also be nice if anyone could suggest how to remove characters other than letters (such as apostrophes and commas) when using pointers. I've added my function below. It accepts a pointer to the string as a parameter and returns an integer.
int palindrome(char* string)
{
char *ptr1=string;
char *ptr2=string+strlen(string)-1;
while(ptr2>ptr1){
if(tolower(*ptr1)!=tolower(*ptr2)){
return(0);
}
ptr1++;ptr2--;
}
return(1);
}

"how to remove other characters other than letters?"
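I think you don't want to actually remove them, just skip them, and you can use isalpha to do so. Also note that the condition ptr2 > ptr1 is enough for both even-length strings such as abba and odd-length strings such as abcba, because the middle character of an odd-length string always matches itself. The version below uses ptr2 >= ptr1, which is also correct and only adds one redundant comparison when the pointers meet: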
int palindrome(char* string)
{
size_t len = strlen(string);
// handle empty string and string of length 1:
if (len == 0) return 0;
if (len == 1) return 1;
char *ptr1 = string;
char *ptr2 = string + len - 1;
while(ptr2 >= ptr1) {
if (!isalpha(*ptr2)) {
ptr2--;
continue;
}
if (!isalpha(*ptr1)) {
ptr1++;
continue;
}
if( tolower(*ptr1) != tolower(*ptr2)) {
return 0;
}
ptr1++; ptr2--;
}
return 1;
}
You need to #include <ctype.h> for isalpha() and tolower(), and <string.h> for strlen().

How about doing like this if you want to do it using pointers only:
int main()
{
char str[100];
char *p,*t;
printf("Your string : ");
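if (fgets(str, sizeof str, stdin) == NULL) /* gets() is unsafe and was removed in C11 */
return 1;
str[strcspn(str, "\n")] = '\0'; /* drop the newline kept by fgets(); needs <string.h> */
for(p=str ; *p!='\0' ; p++);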
for(t=str, p-- ; p>=t; )
{
if(*p==*t)
{
p--;
t++;
}
else
break;
}
if(t>p)
printf("\nPalindrome");
else
printf("\nNot a palindrome");
return 0;
}

int main()
{
const char *p = "MALAYALAM";
int count = 0;
int len = strlen(p);
for(int i = 0; i < len; i++ )
{
if(p[i] == p[len - i - 1])
count++;
}
cout << "Count: " << count;
if(count == len)
cout << "Palindrome";
else
cout << "Not Palindrome";
return 0;
}

I have actually experimented quite a lot with this kind of problem.
There is one optimisation that can be done. (Rejecting strings of odd length is not a valid shortcut: odd-length strings such as abcba can be palindromes too.)
Start using vectorised compares, but this only really gives you performance if you expect a lot of palindromes. If the majority of your strings aren't palindromes, you are still best off with byte-by-byte comparisons. In fact, my vectorised palindrome checker ran 5% slower than the non-vectorised one just because palindromes were so rare in the input. The extra branch that decided vectorised vs non-vectorised made this big difference.
Here is a code draft of how you can do it vectorised:
int palindrome(char* string)
{
size_t length = strlen(string);
if (length >= sizeof(uintptr_t)) { // if the string is at least one machine word long
uintptr_t * ptr1 = (uintptr_t*)string;
size_t length_v = length / sizeof(uintptr_t);
uintptr_t * ptr2 = (uintptr_t*)(string + (length - (length_v * sizeof(uintptr_t)))) + length_v - 1;
while(ptr2>=ptr1){ // >= so that a middle word is also compared with its own reversal
if(*ptr1 != bswap(*ptr2)){ // byte swap for your word length, x86 has an instruction for it, needs to be defined separately
return(0);
}
ptr1++;ptr2--;
}
} else {
// standard byte by byte comparison
}
return(1);
}
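A more portable variant of the same word-at-a-time idea is sketched below. It is not the draft above: it assumes 64-bit words, loads them with memcpy() to avoid alignment and aliasing problems, and uses a plain-C reverse_bytes() helper in place of a bswap instruction. Both function names are made up for this example, and no performance claim is attached to it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 64-bit value (stand-in for bswap). */
static uint64_t reverse_bytes(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFF);
        v >>= 8;
    }
    return r;
}

/* Exact (case-sensitive) palindrome test, eight bytes at a time. */
int palindrome_words(const char *s)
{
    assert(s != NULL);
    size_t lo = 0;                        /* first unchecked index */
    size_t hi = strlen(s);                /* one past the last unchecked index */

    /* Compare the first 8 unchecked bytes with the last 8, reversed,
       while the two words do not overlap. */
    while (hi - lo >= 16) {
        uint64_t front, back;
        memcpy(&front, s + lo, 8);
        memcpy(&back, s + hi - 8, 8);
        if (front != reverse_bytes(back))
            return 0;
        lo += 8;
        hi -= 8;
    }

    /* Finish the remaining middle part byte by byte. */
    while (hi - lo >= 2) {
        if (s[lo] != s[hi - 1])
            return 0;
        lo++;
        hi--;
    }
    return 1;
}
```

Reversing the bytes of a value also reverses their order in memory on both little-endian and big-endian machines, so the comparison does not depend on endianness.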
