How to remove repeating elements from an array in C

I want to write a C program that removes repeated values from an array and keeps only the last occurrence of each.
For example, I have two arrays:
char vals[6]={'a','b','c','a','f','b'};
int pos[6]={1,2,3,4,5,6};
I want to write a function so that afterwards the arrays contain:
char vals[4]={'c','a','f','b'};
int pos[4]={3,4,5,6};
I know how to delete elements in general, but in this case I am looking for a way to also delete the corresponding values in the pos array (the one associated with the vals array).

Overwriting duplicate elements, per se, isn't particularly complicated. But right here you have the additional constraint of wanting the last index of each element you find. This can be solved easily when you search for duplicates:
unsigned remove_duplicates (char * restrict array,
                            unsigned * restrict positions, unsigned count) {
    // assume positions is uninitialized
    unsigned current, insert = 0;
    for (current = 0; current < count; current ++) {
        // first, see if the value is already in the array
        unsigned search;
        for (search = 0; search < current; search ++)
            if (array[current] == array[search]) break;
        if (search < current)
            // if we found it, we just have a new position for it
            positions[search] = current + 1; // +1 because your positions are 1-based
        else {
            // otherwise, write it into the array and store its position
            // insert tracks the insertion pointer (i.e., the new end of the array)
            array[insert] = array[current];
            positions[insert] = current + 1;
            insert ++;
        }
    }
    // at this point we're done; insert will have tracked the number of
    // unique elements, which we can return as the new array size
    // the positions won't be sorted; you can sort both arrays if you want
    return insert;
}
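The function above leaves the kept values in first-seen order, with positions unsorted. If the goal is the exact output from the question (c, a, f, b with positions 3, 4, 5, 6), one possible alternative is to keep an element only when the same value does not appear again later in the array. The following is a minimal sketch, not the answerer's method: the name keep_last_occurrences is hypothetical, and it assumes pos already holds the 1-based positions as in the question.

```c
#include <stddef.h>

/* Hypothetical helper: keeps only the last occurrence of each value,
 * preserving the original order, and compacts vals and pos together.
 * Assumes pos[i] already holds the position associated with vals[i].
 * Returns the new number of elements. */
size_t keep_last_occurrences(char *vals, int *pos, size_t count)
{
    size_t write = 0;
    for (size_t read = 0; read < count; read++) {
        /* Look ahead: does the same value occur again later?
         * Indexes above read have not been overwritten yet. */
        int repeated_later = 0;
        for (size_t k = read + 1; k < count; k++) {
            if (vals[k] == vals[read]) {
                repeated_later = 1;
                break;
            }
        }
        if (!repeated_later) {
            /* This is the last occurrence: keep the value and its position. */
            vals[write] = vals[read];
            pos[write] = pos[read];
            write++;
        }
    }
    return write;
}
```

Like the function above, this is quadratic in the worst case, which is fine for small arrays such as the six-element example.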

Related

How to delete an element from an array in C?

I've tried shifting elements backwards but it is not making the array completely empty.
for(i=pos;i<N-count;i++)
{
A[i]=A[i+1];
}
Actually, I have to test for a key value in an input array, and if the key is present I have to remove it from the array. The loop should terminate when the array becomes empty. Here "count" is the number of times a key value has already been found and removed, and "pos" is the position of the element to be removed. I think dynamic memory allocation may help, but I haven't learned it yet.

From your description and code, by "delete" you probably mean shift the values to remove the given element and shorten the list by reducing the total count.
In your example, pos and count would be (or should be) similar, perhaps off by one.
The limit for your for loop isn't N - count. It is N - 1.
So, you want:
for (i = pos; i < (N - 1); i++) {
    A[i] = A[i + 1];
}
N -= 1;
To do a general delete, given some criteria (a function/macro that matches on element(s) to delete, such as match_for_delete below), you can do the match and delete in a single pass on the array:
int isrc = 0;
int idst = 0;
for (; isrc < N; ++isrc) {
    if (match_for_delete(A,isrc,...))
        continue;
    if (isrc > idst)
        A[idst] = A[isrc];
    ++idst;
}
N = idst;
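To make the single-pass delete concrete, here is a small sketch that fills in match_for_delete with one possible criterion: an element is deleted when it equals a given key. The names matches_key and remove_key are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for match_for_delete:
 * an element matches when it equals key. */
static int matches_key(const int *A, size_t i, int key)
{
    return A[i] == key;
}

/* Removes every element equal to key in one pass over the array.
 * Returns the new element count. */
size_t remove_key(int *A, size_t N, int key)
{
    size_t idst = 0;
    for (size_t isrc = 0; isrc < N; ++isrc) {
        if (matches_key(A, isrc, key))
            continue;               /* skip: the element is dropped */
        if (isrc > idst)
            A[idst] = A[isrc];      /* close the gap left by dropped elements */
        ++idst;
    }
    return idst;
}
```

A caller would write something like N = remove_key(A, N, key); and the array is empty exactly when the returned count is 0.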

removing set interval of struct

I'm having trouble "removing" entries from my array of structs. Right now I define the maximum array size as 10. I can fill the array with structs containing name, age, etc. My search function lets me search within an interval, say ages 10 to 25. What I want my remove function to do is remove all the people between ages 10 and 25. I should then be able to enter new people into the database as long as it doesn't exceed my defined limit. Right now it seems to randomly remove stuff from the array.
struct database
{
    float age,b,c,d;
    char name[WORDLENGTH];
};
typedef struct database Database;
search func();
.........
void remove(Database inv[], int *np, int *min, int *max, int *option)
{
    int i;
    if (*np == 0)
    {
        printf("The database is empty\n");
        return;
    }
    search(inv, *np, low, high, option);
    if (*option == 1)
    {
        for (i = 0; i<*np; i++)
        {
            if (inv[i].age >= *low && inv[i].age <= *high)
            {
                (*np)--;
            }
        }
    }
}

Right now it seems to randomly remove stuff from the array.
The items that your code removes are not random at all. This line
(*np)--;
removes the last item. Therefore, if the range contains two items that match the search condition at the beginning of the inv, your code would remove two items off the end. Things get a little more complicated if matching items are located in the back of the valid range of inv, so deletions start looking random.
Deleting from an array of structs is no different from deleting from an array of ints. You need to follow this algorithm:
1. Maintain a read index and a write index, both initially set to zero.
2. Run a loop that terminates when the read index goes past the end.
3. At each step, check the item at the read index.
4. If the item does not match the removal condition, copy it from the read index to the write index, and advance both indexes.
5. Otherwise, advance only the read index.
6. Set the new np to the value of the write index at the end of the loop.
This algorithm ensures that items behind the deleted ones get moved toward the front of the array. See this answer for an example implementation of the above approach.
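The read/write index steps can be sketched for the Database struct from the question as follows. This is an illustration under assumptions, not the linked implementation: WORDLENGTH is given an arbitrary value because the question does not show it, the function name remove_age_range is hypothetical, and the bounds are passed by value rather than through pointers.

```c
#define WORDLENGTH 30   /* assumed value; the question does not show it */

typedef struct database {
    float age, b, c, d;
    char name[WORDLENGTH];
} Database;

/* Hypothetical replacement for the removal loop: deletes every entry whose
 * age lies in [low, high], updates *np, and returns how many were removed. */
int remove_age_range(Database inv[], int *np, float low, float high)
{
    int write = 0;
    for (int read = 0; read < *np; read++) {
        if (inv[read].age >= low && inv[read].age <= high)
            continue;                   /* matches: advance only the read index */
        if (write != read)
            inv[write] = inv[read];     /* struct assignment copies all fields */
        write++;
    }
    int removed = *np - write;
    *np = write;                        /* surviving entries are now at the front */
    return removed;
}
```

After the call, slots from *np up to the array limit are free again, so new people can be appended there.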

You can't remove an array element simply by decreasing the count of number of elements.
If you want to remove the n'th element in the array, you have to overwrite the n'th element with the (n+1)'th element and overwrite the (n+1)'th element with the (n+2)'th element and so on.
Something like:
int arr[5] = { 1, 2, 3, 4, 5};
int np = 5;
// Remove element 3 (aka index 2)
int i;
for (i = 2; i < (np-1); ++i)
{
    arr[i] = arr[i+1];
}
--np;
This is a simple approach to explain the concept. Notice, though, that it requires a lot of copying, so in real code you should use a better algorithm if performance is an issue. The answer from @dasblinkenlight explains one good algorithm.
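For removing a single element, the shifting loop can also be written with the standard memmove function, which handles overlapping source and destination. This is a swapped-in variant of the loop above rather than a different algorithm: it is still O(n) per removal, and the helper name remove_at is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes arr[index] by shifting the tail down one slot.
 * Returns the new element count, or n unchanged if index is out of range. */
size_t remove_at(int *arr, size_t n, size_t index)
{
    if (index >= n)
        return n;
    /* Move the (n - index - 1) elements after index one position forward. */
    memmove(&arr[index], &arr[index + 1], (n - index - 1) * sizeof arr[0]);
    return n - 1;
}
```

When many elements have to go at once, the single-pass read/write approach copies each surviving element at most once, so it is usually the better fit.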

Array in JAVA., repeat?

I'm writing a function in Java that takes a 1D array and the size of the array as inputs. I want to find out how many distinct values are in the array. How would I do this?

Approach 1 (O(n log n)):
1. Sort the array.
2. Compare adjacent elements in the array.
3. Increment the count whenever adjacent elements are unequal. Take care with runs of three or more equal elements, for example by using an extra variable.
Approach 2 (O(n) time, but O(n) extra space):
1. Create a hash table for the values.
2. Insert each value if it is not already present in the hash table.
3. Count and print the values present in the hash table.
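Approach 1 can be sketched as below. The question asks about Java, but the sketch is in C to match the other code on this page; the names cmp_int and count_distinct are hypothetical. Comparing each element with its predecessor after sorting means a run of equal values is counted once, however long it is, and starting the count at 1 accounts for the first element.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparator for qsort: orders ints ascending without overflow. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts values in place and returns the number of distinct values. */
size_t count_distinct(int *values, size_t n)
{
    if (n == 0)
        return 0;
    qsort(values, n, sizeof values[0], cmp_int);
    size_t count = 1;                   /* the first element starts the first run */
    for (size_t i = 1; i < n; i++) {
        if (values[i] != values[i - 1])
            count++;                    /* a new run of equal values begins */
    }
    return count;
}
```

Note that this reorders the caller's array; copy it first if the original order matters.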

Find unique items from an array:
1. Create one new array.
2. Take each item from the existing array.
3. Check whether the item exists in the new array.
4. If it does not exist, push the item into the new array; otherwise go on to the next item.
5. After iterating over all items in the array, get the length of the new array.

#include <stdio.h>
int main ()
{
    int n[10] = {1,2,5,5,3,4,1,4,5,11};
    int count = 0; int i = 0;
    for (i=0; i< 10; i++)
    {
        int j;
        for (j=0; j<i; j++)
            if (n[i] == n[j])
                break;
        if (i == j)
            count += 1;
    }
    printf("The counts are: %d distinct elements", count);
    return 0;
}

Hash tables: double probe when collision

I am currently working on hash tables and am a little confused about double hashing. Let me start with the information I was given.
You first make an array that will hold all the data, stored at positions determined by their keys. I used the formula K % size to find the position in the array where the key goes. If you try to put a key into a spot where there is already a key, it's called a collision. Here is where the double hashing comes in: I use the formula max(1,(K/size) % size) to get a number by which to decrement from that position.
So I came up with these functions:
int hashing(table_t *hash, hashkey_t K)
{
    int item;
    item = K % hash->size;
    return item;
}
int double_hashing(table_t *hash, hashkey_t K)
{
    int item;
    item = (K / hash->size) % hash->size;
    return item;
}
// This is part of another function which involves the double hashing.
else if (hash->probing_type == 2)
{
    int dec, item;
    item = hashing(hash, K);
    if (hash->table[item] == NULL)
    {
        // note: an entry must be allocated for table[item] before these stores
        hash->table[item]->K = K;
        hash->table[item]->I = I;
    }
    else
    {
        dec = double_hashing(hash, K);
        hash->table[item-dec]->K = K;
        hash->table[item-dec]->I = I;
    }
}
So I use the two formulas to move the keys around. Now I am confused to what happens if I decrement and land on another spot in which a key is already placed. Do I decrement again by that much until I find a place?

Now I am confused to what happens if I decrement and land on another
spot in which a key is already placed. Do I decrement again by that
much until I find a place?
Yes. Provided your hash table size is prime and the table is not full, you will eventually find a free space for your new entry.
You don't just check if the entry is NULL. You need to also check that it doesn't contain the same key that is being inserted. Storing the key in a hash table is essential, so you can be sure that the key you searched on is the key you found.
Beware of modifying your table index without forcing it to be in the array bounds. For example, if item was 0 and then you subtract 1, you will have an out-of-bounds index.
You can correct this like so:
item = (item - dec + hash->size) % hash->size;
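Putting these points together, an insert with double hashing might look like the sketch below. It does not use the table_t layout from the question, which is not shown. Instead it assumes a hypothetical dh_table with parallel arrays and a used flag per slot. The home slot and decrement follow the formulas from the question, including the max(1, ...) so the step is never zero.

```c
#include <stddef.h>

/* Hypothetical table layout for illustration only. */
typedef struct {
    unsigned long *keys;    /* stored keys */
    int *items;             /* value stored with each key */
    unsigned char *used;    /* 1 if the slot holds an entry */
    size_t size;            /* number of slots, ideally prime */
} dh_table;

/* Inserts or updates key. Returns the slot used, or -1 if no slot was found. */
long dh_insert(dh_table *t, unsigned long key, int item)
{
    size_t slot = key % t->size;
    size_t dec = (key / t->size) % t->size;
    if (dec == 0)
        dec = 1;                        /* max(1, ...): never probe in place */

    for (size_t tries = 0; tries < t->size; tries++) {
        /* Take a free slot, or overwrite the entry with the same key. */
        if (!t->used[slot] || t->keys[slot] == key) {
            t->used[slot] = 1;
            t->keys[slot] = key;
            t->items[slot] = item;
            return (long)slot;
        }
        /* Collision: decrement again by dec, wrapping inside the bounds. */
        slot = (slot + t->size - dec) % t->size;
    }
    return -1;                          /* probed size slots without success */
}
```

With a table of size 7, keys 10, 3 and 17 all hash to slot 3; the second and third are moved down by their own decrements (1 and 2) to slots 2 and 1.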

Find the number of occurrence of each element in an array and update the information related to each elements

I have a big 2-D array, array[length][2], with length = 500000.
array[i][0] holds a hex number and array[i][1] holds 0 or 1, which represents some information related to that hex number. Like this:
array[i][0] array[i][1]
e05f56f8 1
e045ac44 1
e05f57fc 1
e05f57b4 1
e05ff8dc 0
e05ff8ec 0
e05ff900 1
I want to get a new array that stores the hex number, the number of occurrences, and the sum of array[i][1] over all rows with the same hex number.
I wrote the code like this:
//First Sort the array according to array[][0]
int x,y,temp1,temp2;
for (x=lines_num1-2;x>=0;x--)
{
    for (y=0;y<=x;y++)
    {
        if(array[y][0]>array[y+1][0])
        {
            temp1=array[y][0];
            array[y][0]=array[y+1][0];
            array[y+1][0]=temp1;
            temp2=array[y][1];
            array[y][1]=array[y+1][1];
            array[y+1][1]=temp2;
        }
    }
}
// generate the new_array[][]
int new_array[length][3];
int n=0;
for (n=0; n<length; n++){
    new_array[n][0]=0;
    new_array[n][1]=0;
    new_array[n][2]=0;
}
int prev = array[0][0];
new_array[0][0]=array[0][0];
new_array[0][1]=1;
new_array[0][2]=array[0][2];
for (k=1;k<length;k++)
{
    if (array[k][0] == prev)
    {
        new_array[n][1]=new_array[n][1]+1;
        new_array[n][2]=new_array[n][2]+array[k][0];
    }else{
        prev = array[k][0];
        new_array[n+1][0]=array[k][0];
        new_array[n+1][1]=new_array[n+1][1]+1;
        new_array[n+1][2]=new_array[n+1][2]+array[k][0];
        n++;
    }
}
But the code does not seem to work as I expected. First, the sorting is very slow. Second, it does not seem to generate the correct new_array. Any suggestions on how to deal with this?

Personally, I would write a hash function to index the result array with the hexadecimal value directly. Then it is simple:
struct {
    unsigned int nocc;
    unsigned int nsum;
} result[/* ... */];
/* calculate the results */
for (i = 0; i < LENGTH; ++i) {
    int *curr = array[i];   /* row i: curr[0] is the hex value, curr[1] the flag */
    unsigned int index = hash(curr[0]);
    result[index].nocc++;
    result[index].nsum += curr[1];
}
If you want to sort your array, don't reinvent the wheel: use qsort from the standard C library.

Sorting is slow because you're using bubble sort to sort the data. Bubble sort has quadratic average complexity, which means it has to perform more than 100 billion comparisons and swaps to sort your array. For this reason, never use bubble sort. Instead, learn to use the qsort library function and apply it to your problem.
Also, your sorting code has at least one bug: when exchanging values for the second column of the array, you are getting the value with the wrong column index, [3] instead of [1].
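As a sketch of how qsort could be applied here: sort the rows by their first column, then make one pass that starts a new result entry whenever the hex value changes. The names hex_stat, cmp_rows and summarize are hypothetical, and the rows are assumed to be unsigned int so that values such as e05f56f8 fit without overflow.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result record: one per distinct hex value. */
typedef struct {
    unsigned int hex;
    unsigned int count;     /* number of occurrences */
    unsigned int sum;       /* sum of the second column */
} hex_stat;

/* Comparator for qsort: orders rows by their first column. */
static int cmp_rows(const void *a, const void *b)
{
    const unsigned int *ra = a;
    const unsigned int *rb = b;
    return (ra[0] > rb[0]) - (ra[0] < rb[0]);
}

/* Sorts rows in place, fills out (room for n entries), returns entries used. */
size_t summarize(unsigned int (*rows)[2], size_t n, hex_stat *out)
{
    size_t groups = 0;
    qsort(rows, n, sizeof rows[0], cmp_rows);
    for (size_t i = 0; i < n; i++) {
        if (groups == 0 || out[groups - 1].hex != rows[i][0]) {
            /* First row of a new hex value: open a fresh entry. */
            out[groups].hex = rows[i][0];
            out[groups].count = 0;
            out[groups].sum = 0;
            groups++;
        }
        out[groups - 1].count++;
        out[groups - 1].sum += rows[i][1];
    }
    return groups;
}
```

For 500000 rows the result array is best allocated with malloc or given static storage rather than placed on the stack.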

For your scenario, insertion sort is the right solution: while doing the insertion itself you can compute the count and the sum. When the sort is finished, you will have your result array as well.
The code might look something like this:
int hex = 0, count = 0, sum = 0, iHole;
for (i = 1; i < lines_num1 - 1; i++)
{
    hex = array[i][0];
    count = array[i][1];
    sum = array[i][2];
    iHole = i;
    // keep moving the hole to the next smaller index until array[iHole - 1] is <= item
    while (iHole > 0 && array[iHole - 1][0] > hex)
    {
        // move hole to next smaller index
        array[iHole][0] = array[iHole - 1][0];
        array[iHole][1] = array[iHole - 1][1];
        array[iHole][2] = array[iHole - 1][2];
        iHole = iHole - 1;
    }
    // put item in the hole
    if (array[iHole][0] == hex)
    {
        array[iHole][1]++;
        array[iHole][2] += array[iHole][0];
    }
    else
    {
        array[iHole][0] = hex;
        array[iHole][1] = 1;
        array[iHole][2] = hex;
    }
}
So the cost of building the second array is the cost of the sort itself: O(n) best case, O(n^2) worst case, and you don't have to traverse the array again to compute the sum and count.
Remember that this is an in-place sort. If you don't want to modify your original array, that can be done as well, with iHole pointing into the new array. In that case iHole should start at the tail of the new array instead of at "i".