Remove duplicates in a string. What am I missing? - c

Here is my try to remove duplicates of a string, and I have two questions:
void removeDuplicates(char *original_string)
{
if(original_string == NULL) {
return;
}
int len = strlen(original_string);
if (len < 2) {
return;
}
int tail = 1;
int i;
for (i = 1; i < len; i++) {
int j;
for (j=0; j < tail; j++) {
if (original_string[i] == original_string[j]) {
break;
}
}
if (j == tail) {
original_string[tail] = original_string[i];
++tail;
}
}
}
First: What am I doing wrong that I don't see? I have found this example in a book and I believe it makes sense. Why are the duplicated characters not being deleted?
Second: When calling the function, if I do it with:
char duplicated[] = "aba";
removeDuplicates(duplicated);
I don't get an error. But if I do it with:
char *duplicated = "aba";
removeDuplicates(duplicated);
I get a Bus error: 10 at run time.

char duplicated[] = "aba";
creates an array of chars, which is writable.
char *duplicated = "aba";
creates a string literal (which is unmodifiable) then the variable duplicated is assigned to the pointer to that string literal. Since your function tries to modify the string in-place, it invokes undefined behavior when attempting to write to a string literal, hence the crash.

String literals are non-modifiable in C; attempting to modify one is undefined behaviour.
So duplicated has to be a writable array, that is, it has to be
char duplicated[] = "aba";
and not
char *duplicated = "aba";

"..." creates a constant chunk of memory holding your string.
You cannot modify it.
Therefore, modifying original_string[tail] is undefined behavior when called on a constant string.

nothing is being removed
original_string[tail] = original_string[i]
that isn't removing anything, it's replacing
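The loop compacts the unique characters into the first tail positions, but everything after that point is left in place, so the string still prints at its old length. A minimal sketch of the fix, assuming the rest of the function stays as posted (only the index types are changed to size_t), is to write a terminator at tail once the loop finishes:

```c
#include <string.h>

/* Same algorithm as in the question, plus the missing terminator. */
void removeDuplicates(char *original_string)
{
    if (original_string == NULL) {
        return;
    }
    size_t len = strlen(original_string);
    if (len < 2) {
        return;
    }
    size_t tail = 1; /* number of unique characters kept so far */
    for (size_t i = 1; i < len; i++) {
        size_t j;
        for (j = 0; j < tail; j++) {
            if (original_string[i] == original_string[j]) {
                break; /* already seen, skip it */
            }
        }
        if (j == tail) {
            original_string[tail] = original_string[i];
            ++tail;
        }
    }
    original_string[tail] = '\0'; /* cut off the leftover characters */
}
```

Called on a writable array such as char duplicated[] = "aba";, this leaves "ab" in the array.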

Related

Dynamically increasing C string's size

I'm currently creating a program that captures user's keypresses and stores them in a string. I wanted the string that stores the keypresses to be dynamic, but i came across a problem.
My current code looks something like this:
#include <stdio.h>
#include <stdlib.h>
typedef struct Foo {
const char* str;
int size;
} Foo;
int main(void)
{
int i;
Foo foo;
foo.str = NULL;
foo.size = 0;
for (;;) {
for (i = 8; i <= 190; i++) {
if (GetAsyncKeyState(i) == -32767) { // if key is pressed
foo.str = (char*)realloc(foo.str, (foo.size + 1) * sizeof(char)); // Access violation reading location xxx
sprintf(foo.str, "%s%c", foo.str, (char)i);
foo.size++;
}
}
}
return 0;
}
Any help would be appreciated, as I don't have any ideas anymore. :(
Should I maybe also allocate the Foo object dynamically?
First, in order to handle things nicely, you need to define
typedef struct Foo {
char* str;
int size;
} Foo;
Otherwise, Foo is really annoying to mutate properly - you invoke undefined behaviour by modifying foo->str after the realloc call in any way.
The seg fault is actually caused by sprintf(foo.str, "%s%c", foo.str, (char)i);, not the call to realloc. foo.str is, in general, not null-terminated.
In fact, you're duplicating work by calling sprintf at all. realloc already copies all the characters previously in foo.str, so all you have to do is add a single character via
foo.str[foo.size] = (char) i;
Edit to respond to comment:
If we wanted to append two strings (or rather, two Foos) together, we could do that as follows:
void appendFoos(Foo* const first, const Foo* const second) {
first->str = realloc(first->str, (first->size + second->size) * (sizeof(char)));
memcpy(first->str + first->size, second->str, second->size);
first->size += second->size;
}
The appendFoos function modifies first by appending second onto it.
Throughout this code, we leave Foos as non-null terminated. However, to convert to a string, you must add a final null character after reading all other characters.
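As a rough sketch of both points (appending a single character without sprintf, and adding the final null character only when a C string is needed), something like the following could work. The names fooAppendChar and fooToCString are made up for illustration, and size is declared as size_t here rather than int, which is an assumption on my part:

```c
#include <stdlib.h>
#include <string.h>

typedef struct Foo {
    char *str;   /* not null-terminated; holds exactly size bytes */
    size_t size;
} Foo;

/* Hypothetical helper: appends one character.
   Returns 0 on success, -1 if realloc fails (f is left unchanged). */
int fooAppendChar(Foo *f, char ch)
{
    char *tmp = realloc(f->str, f->size + 1);
    if (tmp == NULL) {
        return -1;
    }
    tmp[f->size] = ch;
    f->str = tmp;
    f->size++;
    return 0;
}

/* Hypothetical helper: returns a freshly allocated, null-terminated copy
   of the contents, or NULL on allocation failure. Caller must free() it. */
char *fooToCString(const Foo *f)
{
    char *out = malloc(f->size + 1);
    if (out == NULL) {
        return NULL;
    }
    if (f->size > 0) {
        memcpy(out, f->str, f->size);
    }
    out[f->size] = '\0';
    return out;
}
```

Assigning the result of realloc to a temporary first means the original buffer is not lost if the allocation fails.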
const char *str - you declare a pointer to const char. You can't write to the referenced object, as doing so invokes UB.
You use sprintf just to add the char. It makes no sense.
You do not need a pointer in the structure.
You need to set compiler options to compile as C, not C++.
I would do it a bit different way:
typedef struct Foo {
size_t size;
char str[1];
} Foo;
Foo *addCharToFoo(Foo *f, char ch)
{
if(f)
{
f = realloc(f, sizeof(*f) + f -> size + 1); // room for the new char and the terminator
}
else
{
f = realloc(f, sizeof(*f) + 1);
if(f) f-> size = 0;
}
if(f) //check if realloc did not fail
{
f -> str[f -> size++] = ch;
f -> str[f -> size] = 0;
}
return f;
}
and in the main
int main(void)
{
int i;
Foo *foo = NULL, *tmp;
for (;;)
{
for (i = 8; i <= 190; i++)
{
if (GetAsyncKeyState(i) == -32767) { // if key is pressed
if((tmp = addCharToFoo(foo, (char)i)))
{
foo = tmp;
}
else
{
/* do something - realloc failed*/
}
}
}
}
return 0;
}
sprintf(foo.str, "%s%c", foo.str, (char)i); is ill-formed: the first argument cannot be const char *. You should see a compiler error message.
After fixing this (make str be char *), then the behaviour is undefined because the source memory read by the %s overlaps with the destination.
Instead you would need to use some other method to append the character that doesn't involve overlapping read and writes (e.g. use the [ ] operator to write the character and don't forget about null termination).

Clear char array in C without any standard library

I'm working on a class project that would require me to make unique strings and I want to concatenate a number to a string. However I do NOT have access to C Standard Library (memset, malloc, etc.). I made this which works:
char* concat(char* name, int num) {
int i, j;
char newName[50], stack[5];
for(i=0; name[i]!='\0'; ++i) {
newName[i] = name[i];
}
for (j=0; num>=1 || num==0; j++) {
stack[j] = (num % 10) + '0';
num = num / 10;
if (num==0) break;
}
while (j>=0) {
newName[i++] = stack[j--];
}
name[0] = '\0';
return newName;
}
But then as I tested it with multiple strings, I realized that newName was being reused over and over. For example, this test file produces the output shown in the comments:
int main() {
char* rebecca = concat("rebecca", 1);
char* bill = concat("bill", 2);
Write(rebecca); /* bill2ca1 */
Write(bill); /* bill2ca1 */
}
It successfully appends the 1 to rebecca, but then when I call concat on bill, it overwrites the first 5 letters but keeps the same chars from before in newName.
QUESTION: How to clear a char array so the next time it's called it will be set to empty, or dynamically allocate it (without using C Standard Library)?
Without using malloc, you can simply put the memory on the stack of the calling function, to keep in the scope where it is needed. It's easier to add the buffer pointer to the argument list like so:
char* concat(char *newName, char* name, int num) {
int i, j;
char stack[5];
:
:
}
int main() {
char rebecca[50];
char bill[50];
concat(rebecca, "rebecca", 1);
concat(bill, "bill", 2);
Write(rebecca);
Write(bill);
}
Generally speaking, assign memory where it will be used. Embedded programming (which might need to run for months without a reboot) avoids malloc like the plague, just because of the risk of memory leaks. You then need to assign extra space since you may not know the size at compile time, and then ideally check for running past the end of the buffer. Here we know the string sizes and 50 chars is more than enough.
Edit:
The other issue is that you're not null terminating. The print will go until it hits 0x00. Your line
name[0] = '\0';
should be
newName[i] = '\0';
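Putting the two points together (caller-supplied buffer, explicit terminator), a possible complete version that uses no library calls might look like this. The cap parameter is my addition for bounds checking, and the sketch assumes num is non-negative and that int is at most 32 bits:

```c
/* Writes name followed by the decimal digits of num into out, which has
   room for cap chars. Returns out, or 0 (a null pointer) if the result
   would not fit. Assumes num >= 0. */
char *concat(char *out, int cap, const char *name, int num)
{
    char digits[12]; /* enough for the digits of a 32-bit int */
    int len = 0, i, j = 0;

    /* collect the digits in reverse order */
    do {
        digits[j++] = (char)('0' + num % 10);
        num /= 10;
    } while (num > 0);

    /* measure the name without strlen */
    while (name[len] != '\0') {
        ++len;
    }

    /* name + digits + terminator must fit */
    if (len + j + 1 > cap) {
        return 0;
    }

    for (i = 0; i < len; ++i) {
        out[i] = name[i];
    }
    while (j > 0) {
        out[i++] = digits[--j]; /* digits come back out in the right order */
    }
    out[i] = '\0';
    return out;
}
```

With char rebecca[50]; a call such as concat(rebecca, 50, "rebecca", 1) fills the caller's array, so separate calls no longer share one buffer.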
You've got a major issue that you're overlooking. In your function, newName is a local variable (array) and you're returning it from the function. This invokes undefined behavior. The beauty of UB is that sometimes it appears to work as expected.
You need to take a pointer and allocate memory dynamically instead, if you want to return it from your concat() function. Also, in the main(), after using it, you need to free() it.
A better alternative, maybe, if you choose to do so, is
Define the array in the caller.
Pass the array to the function.
Inside the function, memset() the array before you perform any other operation.
One thing to remember, this way, every call to the function will clean the previous result.
EDIT:
If you cannot use memset(), in the main, you can use a for loop like
for (i = 0; i < sizeof(arr)/sizeof(arr[0]); i++)
arr[i] = 0;
to clear the array before passing it on next time.
You're returning the address of a local variable. Since the variable goes out of scope when the function returns, this invokes undefined behavior.
Your function should dynamically allocate memory for the result of the concatenation, then return that buffer. You'll need to be sure to free that buffer later to prevent a memory leak:
char* concat(char* name, int num) {
int i, j;
char *newName, stack[20]; // digits are collected here in reverse, so it must hold all of them
// allocate enough space for the existing string and digits for a 64-bit number
newName = malloc(strlen(name) + 30);
for(i=0; name[i]!='\0'; ++i) {
newName[i] = name[i];
}
for (j=0; num>=1 || num==0; j++) {
stack[j] = (num % 10) + '0';
num = num / 10;
if (num==0) break;
}
while (j>=0) {
newName[i++] = stack[j--];
}
newName[i] = '\0';
return newName;
}
int main() {
char* rebecca = concat("rebecca", 1);
char* bill = concat("bill", 2);
Write(rebecca);
Write(bill);
free(rebecca);
free(bill);
}

return a space-less string from a function

I have a function in which I want to return a string (i.e. an array of chars) with no spaces at all. This is my code, which, as I understand it, is not right:
char *ignoreSpace( char helpArr[], int length ){
int i = 0; int j = 0;
char withoutSpace[length];
while ( i < length ){
/*if not a space*/
if ( isspace( helpArr[i] ) == FALSE )
withoutSpace[j] = helpArr[i];
i++;
}
return *withoutSpace;
}
My intention in the line:
return *withoutSpace;
Is to return the content of the array withoutSpace so I could parse a string with no spaces at all.
Can you please tell me how can I make it any better?
Your current solution will lose the result of withoutSpace when the function returns as it is only defined in that function's scope.
A better pattern would be to accept a third argument to the function, a pointer to a char array to write the result into, in much the same way the standard functions do (e.g. strcpy).
char* ignoreSpace(char* src, char* dst, int length) {
// copy from src to dst, ignoring spaces
// ...
// ...
return dst;
}
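One way the elided body could be filled in is sketched below. It keeps the three-parameter shape from the skeleton (with const added to src), and it assumes that length is the number of characters to examine in src and that dst has room for at least length + 1 chars:

```c
#include <ctype.h>

/* Copies up to length characters of src into dst, skipping whitespace,
   and null-terminates dst. dst must hold at least length + 1 chars. */
char *ignoreSpace(const char *src, char *dst, int length)
{
    int j = 0;
    for (int i = 0; i < length && src[i] != '\0'; ++i) {
        /* the cast avoids undefined behavior for negative char values */
        if (!isspace((unsigned char)src[i])) {
            dst[j++] = src[i];
        }
    }
    dst[j] = '\0';
    return dst;
}
```

Because the caller owns dst, nothing local to the function is returned and no free() is needed.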
Try this (assuming null terminated string)
void ignoreSpace(char *str) {
int write_pos = 0, read_pos = 0;
for (; str[read_pos]; ++read_pos) {
if (!isspace((unsigned char)str[read_pos])) {
str[write_pos++] = str[read_pos];
}
}
str[write_pos] = 0;
}
You cannot return a pointer to a local variable from a function, because as soon as you leave the function all local variables are destroyed and no longer valid.
You must either
allocate space with malloc in your function and return a pointer to that allocated memory, or
not return a pointer from the function but modify the original string directly.
First solution :
char *ignoreSpace(char helpArr[], int length)
{
int i=0; int j=0;
char *withoutSpace = malloc(length + 1); /* + 1 for the terminator copied when i == length */
while(i <= length)
{
/*if not a space*/
if(isspace(helpArr[i]) == FALSE)
withoutSpace[j++] = helpArr[i];
i++;
}
return withoutSpace;
}
Second solution:
char *ignoreSpace(char helpArr[], int length)
{
int i=0; int j=0;
while(i <= length)
{
/*if not a space*/
if(isspace(helpArr[i]) == FALSE)
helpArr[j++] = helpArr[i];
i++;
}
return helpArr;
}
There are some other small corrections in my code. Finding out which ones is left as an exercise to the reader.
You don't increment j, ever. In the case that the current character of the source string is not a space, you probably would like to store it in your output string and then also increment the j by one; so that you'd store the next possible character into the next slot instead of overwriting the 0th one again and again.
So change this:
...
withoutSpace[j] = helpArr[i];
...
into this:
...
withoutSpace[j++] = helpArr[i];
...
And then also append your withoutSpace with a 0 or '\0' (they are the same), so that any string processing function may know its end. Also return the pointer, since you should do that, not the *withoutSpace or withoutSpace[0] (they are the same):
char *ignoreSpace( char helpArr[], int length ){
int i = 0; int j = 0;
char * withoutSpace = malloc( (length + 1) * sizeof * withoutSpace ); // <-- changed this; + 1 leaves room for the terminator
while ( i < length ){
/*if not a space*/
if ( isspace( helpArr[i] ) == FALSE )
withoutSpace[j++] = helpArr[i]; // <-- replaced j with j++
i++;
}
withoutSpace[j] = 0; // <-- added this
return withoutSpace;
}
And then you should be good to go, assuming that you can have variable-length arrays.
Edit: Well, variable-length arrays or not, you better just use dynamic memory allocation by using malloc or calloc or something, because else, as per comments, you'd be returning a local pointer variable. Of course, this requires you to manually free the allocated memory in the end.

Appending a char to a char* in C?

I'm trying to make a quick function that gets a word/argument in a string by its number:
char* arg(char* S, int Num) {
char* Return = "";
int Spaces = 0;
int i = 0;
for (i; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
//Want to append S[i] to Return here.
}
else if (Spaces > Num) {
return Return;
}
}
printf("%s-\n", Return);
return Return;
}
I can't find a way to put the characters into Return. I have found lots of posts that suggest strcat() or tricks with pointers, but every one segfaults. I've also seen people saying that malloc() should be used, but I'm not sure of how I'd used it in a loop like this.
I will not claim to understand what it is that you're trying to do, but your code has two problems:
You're assigning a read-only string to Return; that string will be in your
binary's data section, which is read-only, and if you try to modify it you will get a segfault.
Your for loop is O(n^2), because strlen() is O(n)
There are several different ways of solving the "how to return a string" problem. You can, for example:
Use malloc() / calloc() to allocate a new string, as has been suggested
Use asprintf(), which is similar but gives you formatting if you need
Pass an output string (and its maximum size) as a parameter to the function
The first two require the calling function to free() the returned value. The third allows the caller to decide how to allocate the string (stack or heap), but requires some sort of contract about the minimum size needed for the output string.
In your code, Return points to the string literal "", which you are not allowed to modify, so appending to it is undefined behavior. It might work, but you should never rely on it.
Typically in C, you'd want to pass the "return" string as an argument instead, so that you don't have to free it all the time. Both require a local variable on the caller's side, but malloc'ing it will require an additional call to free the allocated memory and is also more expensive than simply passing a pointer to a local variable.
As for appending to the string, just use array notation (keep track of the current char/index) and don't forget to add a null character at the end.
Example:
int arg(char* ptr, char* S, int Num) {
int i, Spaces = 0, cur = 0;
for (i=0; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
ptr[cur++] = S[i]; // append char
}
else if (Spaces > Num) {
ptr[cur] = '\0'; // insert null char
return 0; // returns 0 on success
}
}
ptr[cur] = '\0'; // insert null char
return (cur > 0 ? 0 : -1); // returns 0 on success, -1 on error
}
Then invoke it like so:
char myArg[50];
if (arg(myArg, "this is an example", 3) == 0) {
printf("arg is %s\n", myArg);
} else {
// arg not found
}
Just make sure you don't overflow ptr (e.g.: by passing its size and adding a check in the function).
There are a number of ways you could improve your code, but let's just start by making it meet the standard. ;-)
P.S.: Don't malloc unless you need to. And in that case you don't.
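For the overflow check mentioned above, here is one possible shape. The cap parameter (the size of the output buffer) is an addition of mine, the loop tests for the terminator instead of calling strlen on every iteration, and, as in the example above, words are assumed to be separated by single spaces and counted from 0:

```c
#include <stddef.h>

/* Copies word number num of s into out, which has room for cap chars.
   Returns 0 on success, -1 if the word is missing or does not fit. */
int arg(char *out, size_t cap, const char *s, int num)
{
    int spaces = 0;
    size_t cur = 0;

    if (cap == 0) {
        return -1; /* no room even for the terminator */
    }
    /* stop at the end of s, or once the wanted word has been passed */
    for (size_t i = 0; s[i] != '\0' && spaces <= num; i++) {
        if (s[i] == ' ') {
            spaces++;
        } else if (spaces == num) {
            if (cur + 1 >= cap) { /* keep one slot for '\0' */
                out[0] = '\0';
                return -1;
            }
            out[cur++] = s[i]; /* append char */
        }
    }
    out[cur] = '\0';
    return cur > 0 ? 0 : -1;
}
```

It would be called as arg(myArg, sizeof myArg, "this is an example", 3), which copies "example" into myArg.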
char * Return; //by the way horrible name for a variable.
Return = malloc(<some size>);
......
......
*(Return + index) = *(S+i);
You can't assign anything to a string literal such as "".
You may want to use your loop to determine the offset of the start of the word in your string that you're looking for. Then find its length by continuing through the string until you encounter the end or another space. Then, you can malloc an array of chars with size equal to the length of the word + 1 (for the null terminator). Finally, copy the substring into this new buffer and return it.
Also, as mentioned above, you may want to remove the strlen call from the loop - most compilers will optimize it out but it is indeed a linear operation for every character in the array, making the loop O(n**2).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *arg(const char *S, unsigned int Num) {
char *Return = "";
const char *top, *p;
unsigned int Spaces = 0;
int i = 0;
Return=(char*)malloc(sizeof(char));
*Return = '\0';
if(S == NULL || *S=='\0') return Return;
p=top=S;
while(Spaces != Num){
if(NULL!=(p=strchr(top, ' '))){
++Spaces;
top=++p;
} else {
break;
}
}
if(Spaces < Num) return Return;
if(NULL!=(p=strchr(top, ' '))){
int len = p - top;
Return=(char*)realloc(Return, sizeof(char)*(len+1));
strncpy(Return, top, len);
Return[len]='\0';
} else {
free(Return);
Return=strdup(top);
}
//printf("%s-\n", Return);
return Return;
}
int main(){
char *word;
word=arg("make a quick function", 2);//quick
printf("\"%s\"\n", word);
free(word);
return 0;
}

Pointers to Structures in C

When trying to compile the following code, I am getting a warning that line 18 makes integer from pointer without cast and that 19 and 20 are incompatible types in assignment. I am new to structures in C, and can't seem to figure out what is wrong.
#include <stdio.h>
struct song
{ char title[70];
};
struct playlist
{ struct song songs[100];
};
void title_sort(struct playlist * list,int len)
{ int swapped = 1,i;
char hold;
while (swapped)
{ swapped = 0;
for (i = 0;i < len - 1; i++)
{ if (list->songs[i].title > list->songs[i+1].title)
{ hold = list->songs[i].title;
list->songs[i].title = list->songs[i+1].title;
list->songs[i+1].title = hold;
swapped = 1;
}
}
}
}
int main()
{ struct playlist playlist;
int i;
for (i = 0;i < 5;i++)
{ fgets(playlist.songs[i].title,70,stdin);
}
title_sort(&playlist,5);
printf("\n");
for (i = 0;i < 5;i++)
{ printf("%s",playlist.songs[i].title);
}
return 0;
}
You can't compare strings in C with >. You need to use strcmp. Also hold is char but title is char [70]. You could copy pointers to strings but arrays can't be copied with just =.
You could use strcpy like this:
void title_sort(struct playlist * list,int len)
{ int swapped = 1,i;
char hold[70];
while (swapped)
{ swapped = 0;
for (i = 0;i < len - 1; i++)
{ if (strcmp (list->songs[i].title, list->songs[i+1].title) > 0)
{ strcpy (hold, list->songs[i].title);
strcpy (list->songs[i].title, list->songs[i+1].title);
strcpy (list->songs[i+1].title,hold);
swapped = 1;
}
}
}
}
But please note that in C you need to check things like the lengths of strings, so the above code is dangerous. You need to either use strncpy or use strlen to check the lengths of the strings.
You cannot use strings like that in C. Strings in C are essentially simple arrays of characters, without specialized operators like =, < etc. You need to use string functions like strcmp and strcpy to do the string manipulations.
To be more specific : following is wrong
if (list->songs[i].title > list->songs[i+1].title)
Do it this way:
if( strcmp (list->songs[i].title , list->songs[i+1].title) > 0 )
char hold needs to be something else, perhaps char *hold, perhaps an array.
C doesn't have array assignment, although it does have structure assignment, so you will need to rework that.
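To illustrate the structure-assignment route: since each title lives inside a struct song, the whole struct can be swapped with plain =, which copies the embedded array along with it. This is only a sketch of that idea, reusing the struct definitions from the question and strcmp for the comparison:

```c
#include <string.h>

struct song {
    char title[70];
};

struct playlist {
    struct song songs[100];
};

/* Bubble sort by title. Swapping whole struct song values copies the
   title array inside them, so no strcpy or memcpy is needed. */
void title_sort(struct playlist *list, int len)
{
    int swapped = 1;
    while (swapped) {
        swapped = 0;
        for (int i = 0; i < len - 1; i++) {
            if (strcmp(list->songs[i].title, list->songs[i + 1].title) > 0) {
                struct song hold = list->songs[i];
                list->songs[i] = list->songs[i + 1];
                list->songs[i + 1] = hold;
                swapped = 1;
            }
        }
    }
}
```

This also keeps working unchanged if more fields are later added to struct song.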
Your first issue, on line 18, is caused by two problems. Firstly, the variable hold can only hold a single char value, and you're trying to assign an array of 70 chars to it.
First you'll need to make hold the correct type:
char hold[70];
Now, there's another problem - arrays can't just be assigned using the = operator. You have to use a function to explicitly copy the data from one array to another. Instead of your current line 18, you could use:
memcpy(hold, list->songs[i].title, 70);
You then need to do the same thing for lines 19 and 20:
memcpy(list->songs[i].title, list->songs[i+1].title, 70);
memcpy(list->songs[i+1].title, hold, 70);
Alternatively, you could write a loop and swap the two titles one char at a time.
In a similar fashion, you can't compare two strings with the simple > operator - you need to use a function for this, too (e.g. strcmp).
