Finding Number of duplicates in an array - arrays

I have an array, say 1,3,3,1,2. The output of the code must be 4 (2 repetitions of 1 + 2 repetitions of 3 = 4). How can I do this in C? Here's my attempt.
#include <stdio.h>
int main(){
int n,i,j,temp;
scanf("%d",&n);
int arr[n];
for(i=0;i<n;i++){
scanf("%d",&arr[i]);
}
for(i=0;i<n;i++){
int min = i;
for(j=i+1;j<n;j++){
if(arr[j]<arr[min]) min=j;
}
temp= arr[min];
arr[min]=arr[i];
arr[i]=temp;
}
int count=1;
for(i=0;i<n;i++){
if(arr[i]==arr[i+1])count++;
else continue;
}
printf("%d",count);
}

What you need is to change this for loop.
int count=1;
for(i=0;i<n;i++){
if(arr[i]==arr[i+1])count++;
else continue;
}
For example, it could look like this:
int count = 0;
for ( i = 0; i < n; )
{
int j = i;
while ( ++i < n && arr[i-1] == arr[i] );
if ( !( i - j < 2 ) ) count += i - j;
}

It looks like your loop has a couple of problems:
1. It indexes past the end of the array (arr[i+1] when i is n-1), which is undefined behavior.
2. It doesn't know when to count the first item in a group of duplicates.
Regarding #1, it's simplest to start your loop at 1 instead of 0 and then check index i-1 against i.
Regarding #2, your code works, but only when exactly one number has duplicates. This is because you started the count at 1. When you encounter another group, that assumption breaks down. The simplest fix is to record whether you're starting a new group.
Let's put this all together:
int count = 0;
int first = 1;
for(i = 1; i < n; i++) {
if (arr[i-1] == arr[i]) {
count += first + 1;
first = 0;
} else {
first = 1;
}
}
As for the sorting step, it uses selection sort, which is very inefficient for large inputs. This is fine for small datasets, but you'll have a problem if you have a very large number of inputs. It would be wise to use something like qsort instead. There are many examples out there for how to do this.
So, your runtime right now is O(N^2). With a typical qsort implementation it becomes O(N log N) on average.
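A minimal sketch of the qsort suggestion combined with the counting loop above. The names compare_ints and count_duplicates are made up for illustration, and the function sorts the caller's array in place.

```c
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int values. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y can cause */
}

/* Sorts arr in place, then counts every element that belongs to a
   group of two or more equal values. */
int count_duplicates(int *arr, int n)
{
    int count = 0;
    int first = 1;              /* 1 until a pair is found in the current group */

    qsort(arr, (size_t)n, sizeof arr[0], compare_ints);
    for (int i = 1; i < n; i++) {
        if (arr[i - 1] == arr[i]) {
            count += first + 1; /* the first pair counts both of its elements */
            first = 0;
        } else {
            first = 1;
        }
    }
    return count;
}
```

Called on the example input 1,3,3,1,2 with n = 5, this should return 4.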
You can probably reduce runtime further with something like a hash table that simply stores how many of each value you've found, which you update as they arrive.
If your data range is well-defined and small enough, you might also benefit from using a large array instead of a hash table, storing a single bit for each possible number that records whether the number has been seen. Actually, for your case you'd need two of these because of the "first in the group" problem. Each number that arrives is checked against its "seen" bit. If it was not seen before, set the "seen" bit. If it was already seen, increment the count; and if its "duplicate" bit is not yet set, set it and increment the count once more to account for the first occurrence. Now you get O(N) runtime overall, where testing for and counting a duplicate value is O(1).
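The "seen" and "duplicate" flags could be sketched as follows. This is only an illustration: MAX_VALUE is an assumed bound on the input, count_duplicates_flags is a made-up name, and one bool per value is used instead of packed bits to keep the code short.

```c
#include <stdbool.h>

#define MAX_VALUE 1000   /* assumed upper bound on the input values */

/* Counts elements that belong to a group of two or more equal values.
   Assumes every value is in 0..MAX_VALUE; returns -1 if one is not. */
int count_duplicates_flags(const int *arr, int n)
{
    bool seen[MAX_VALUE + 1] = { false };
    bool duplicate[MAX_VALUE + 1] = { false };
    int count = 0;

    for (int i = 0; i < n; i++) {
        int v = arr[i];
        if (v < 0 || v > MAX_VALUE)
            return -1;
        if (seen[v]) {
            if (!duplicate[v]) {
                duplicate[v] = true;
                count++;        /* account for the first occurrence */
            }
            count++;            /* account for this occurrence */
        } else {
            seen[v] = true;
        }
    }
    return count;
}
```

No sorting is needed here, and the input array is left unchanged.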

Related

How to order a list of integers from greatest to least?

I have an assignment where I must use a structure to put in student information. I must then order the credit hours from greatest to least. I am focused on the integer ordering loop, I just can't figure out why my program is outputting incorrectly.
#include <stdlib.h>
#include <stdio.h>
struct Student {
char name[21];
int credits;
} s[99];
int main()
{
int students;
int tempCred = 0;
char tempName[21];
printf("How many students?: ");
scanf_s("%d", &students);
for (int i = 0; i < students; i++)
{
printf("\nStudent Name?: ");
scanf_s("%s", &s[i].name, 21);
printf("\nCredits Taken?: ");
scanf_s("%d", &s[i].credits);
}
for (int i = 0; i < students; i++) {
for (int j = 0; j < students; j++) {
if (s[j].credits > tempCred) {
tempCred = s[j].credits;
s[i].credits = s[j].credits;
s[j].credits = tempCred;
}
}
printf("\n%d", s[i].credits);
}
}
For example, if I were to enter 2,6, and 8 when asked for credit hours, the program would output '8,6,8'. I am not unfamiliar with sorting, but for some reason something isn't making sense when I look at this code. Could anyone help me order the integers from greatest to least? Thanks!
NOTE: I am aware there are better ways to do this, but my professor is making us use strictly C, no C++ at all. I just need help ordering the integers.
There are various sorting techniques, for instance bubble sort, quicksort and insertion sort. Bubble sort is the simplest, but it's not the most efficient.
In your program you have an array of structs. You've done the part that inserts the structs into the array, and that's fine. The problem lies in the second part, the sorting. You have a for loop that starts at the very first element (i.e. 0) and goes all the way up to the last element (i.e. students-1). Nested inside it is another for loop with exactly the same range, and the comparison is made against tempCred rather than against s[i].credits, so s[i].credits is overwritten instead of swapped.
That's wrong. Instead, replace the first and second for loops with this:
for (int i = 0 ; i < students-1 ; i++)
{
for (int j = i+1 ; j < students ; j++)
{
...
}
}
Here, the outer for loop begins with element 0 and goes up to the element before the last. The inner for loop starts with the element after the one the outer loop is currently at (i.e. j = i + 1). So if i = 0, j starts at 1. This loop goes all the way up to the last element of the array of structs.
Now, inside the inner for loop specify the condition. In your case you want them sorted in descending order (highest to lowest) of the credits.
for (int i = 0 ; i < students-1 ; i++)
{
for (int j = i+1 ; j < students ; j++)
{
if(s[j].credits > s[i].credits) // then swap the credits
{
tempCred = s[j].credits ;
s[j].credits = s[i].credits ;
s[i].credits = tempCred ;
}
}
}
Note that j always starts one greater than i. So if i = 0 and j = 1, then the if statement reads:
If the credits held in the struct in element 1 of the array are greater than the credits stored in the struct in element 0 of the array, then...
If the condition is met, the credits in these 2 structs are swapped. Only the credits are swapped here; to keep each name with its credits, swap the whole struct instead.
This is an exchange sort, a close relative of the "bubble sort".
Finally, you can display the credits:
for(int index = 0 ; index < students ; index++)
{
printf("\n%d", s[index].credits) ;
}
Like a lot of people in the comments have said, use a debugger. It'll help you trace the logic of your programs.
Like @Barmar said, use the qsort() function from the C standard library (declared in <stdlib.h>).
Not only is it easier than writing your own sort, it is also typically much faster, at O(N log N) on average.
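A minimal sketch of the qsort() suggestion, assuming the same struct Student layout as in the question. The names by_credits_desc and sort_students are made up for illustration. Because whole structs are moved, each name stays with its credits.

```c
#include <stdlib.h>

struct Student {
    char name[21];
    int credits;
};

/* qsort callback: orders students by credits, greatest first. */
static int by_credits_desc(const void *a, const void *b)
{
    const struct Student *x = a;
    const struct Student *y = b;
    return (y->credits > x->credits) - (y->credits < x->credits);
}

/* Sorts the first `count` students in place. */
void sort_students(struct Student *list, size_t count)
{
    qsort(list, count, sizeof list[0], by_credits_desc);
}
```

With credits 2, 6 and 8 entered in that order, calling sort_students(s, 3) should leave them as 8, 6, 2.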

checking if a array has numbers in it from 0 to length -1 in C

I have an assignment and I'd be glad if you could help me with one question.
In this assignment, I have a question that goes like this:
Write a function that receives an array and its length.
The purpose of the function is to check whether the array contains all the numbers from 0 to length-1. If it does, the function returns 1, otherwise 0. The function can go through the array only once.
You can't sort the array or use a counting array in the function.
I wrote a function that calculates the sum and the product of the array's values and of its indexes:
int All_Num_Check(int *arr, int n)
{
int i, index_sum = 0, arr_sum = 0, index_multi = 1, arr_multi = 1;
for (i = 0; i < n; i++)
{
if (i != 0)
index_multi *= i;
if (arr[i] != 0)
arr_multi *= arr[i];
index_sum += i;
arr_sum += arr[i];
}
if ((index_sum == arr_sum) && (index_multi == arr_multi))
return 1;
return 0;
}
E.g.: length = 5, arr = {0,3,4,2,1} is a proper array.
length = 5, arr = {0,3,3,4,2} is not a proper array.
Unfortunately, this function doesn't work properly for all combinations of numbers.
E.g.: length = 5, arr = {1,2,2,2,3}
Thank you for your help.
Checking the sum and product is not enough, as your counter-example demonstrates.
A simple solution would be to just sort the array and then check that at every position i, a[i] == i.
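Edit: The original question was edited such that sorting is also prohibited. Assuming all the numbers are non-negative, the following solution "marks" each number in the required range by making the cell at the corresponding index negative. A marked cell stores -(v + 1) instead of v, so that a cell holding 0 can be marked as well (plain negation would leave 0 unchanged).
If the cell for a number is already marked, it means we have a duplicate. Note that the function modifies the array.
int All_Num_Check(int *arr, int n) {
    int i, j;
    for (i = 0; i < n; i++) {
        /* A marked cell holds -(v + 1); decode it to recover the original v. */
        j = (arr[i] < 0) ? -arr[i] - 1 : arr[i];
        /* Fail if j is out of range or was already seen. */
        if ((j >= n) || (arr[j] < 0)) return 0;
        arr[j] = -arr[j] - 1;   /* mark index j as seen */
    }
    return 1;
}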
I thought about this for a while, and then I realized that it is a highly constrained problem.
Things that are not allowed:
Use of a counting array.
Use of sorting.
Use of more than one pass over the original array.
Hence, I came up with this approach, which uses the XOR operation and relies on two properties:
a ^ a = 0
a^b^c = a^c^b.
Try this:
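int All_Num_Check(const int *arr, int n)
{
    int i, temp = 0;
    /* XOR every index together with every value; equal pairs cancel out. */
    for (i = 0; i < n; i++) {
        temp = temp ^ (i ^ arr[i]);
    }
    /* temp == 0 is necessary for a valid array, but not sufficient. */
    if (temp == 0) {
        return 1;
    }
    else {
        return 0;
    }
}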
To satisfy the condition mentioned in the problem, every number has to occur exactly once.
Since the numbers lie in the range [0, n-1], the loop variable i covers the same range.
The variable temp is initially set to 0.
If the array contains every number exactly once, each number is XORed in exactly twice: once as an index and once as a value.
XORing the same number twice results in 0.
So if the array is valid, temp is 0 after the whole array has been traversed.
The converse does not hold, though. A nonzero result proves the array is invalid, but a zero result is only a necessary condition: for example, {1,1,1,1} also yields 0, so this test alone can accept an invalid array.

C - Generate random sequence with no repeats without shuffling

I want to generate an array of the sequence [0...1'000'000] in random order without shuffling.
This means that I don't want to do:
int arr[1000000];
for (int i = 0; i < 1000000; i++)
{
arr[i] = i;
}
shuffle(arr);
I want to figure out how to do it without the "black-box" shuffle function. I also don't want to randomly select an index between 1 and 1'000'000 because at number 999'999 there would be only a 1/1'000'000 chance to continue.
I've been trying to think of a solution and I think the key is parallel arrays and looping backwards then using modulus to limit only to the indexes that you haven't already been to, but then I can't guarantee that the value I get is unique.
I don't want to use a HashSet or TreeSet implementation either.
This can be done in O(n) time with two lists, one with the numbers (initially) in order, and one in the resulting order.
You start with n elements in order in your source list. Then you select a random number mod n. That gives you the index of the next element, which you place in the destination list.
Now the key part. If you were to pick a random number between 0 and n-1 each time, as you seem to think a shuffle does, you would have an increasing chance of selecting a number you selected before. So how do you handle this? By shrinking the list of numbers available to select from.
In the source list, after selecting a number, you move the last element of the list to the index that was just used. You now have a list of n-1 numbers to choose from. So on the next iteration you take a random number mod n-1. Keep going until your source list only has one element.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define LEN 10
int main()
{
int a[LEN], b[LEN];
int i, val;
int count = LEN;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN;i++) {
val = rand() % count;
b[i] = a[val];
a[val] = a[count-1];
count--;
}
for (i=0;i<LEN;i++) {
printf("%d ", b[i]);
}
printf("\n");
return 0;
}
EDIT:
Here's a slightly more efficient version that doesn't use two arrays and therefore needs only O(1) extra space:
int a[LEN];
int i, val, tmp;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN-1;i++) {
/* pick from i..LEN-1, including i itself, so that every permutation is possible */
val = (rand() % (LEN - i)) + i;
tmp = a[i];
a[i] = a[val];
a[val] = tmp;
}
for (i=0;i<LEN;i++) {
printf("%d ", a[i]);
}
printf("\n");
The O(N) answer is great, but here is an alternative way using binary search and a binary indexed tree to do this in O(N log^2 N).
arr = []
N = 1,000,000
for i from 0 to N-1
    low = 0
    high = N-1
    while low < high
        mid = (low+high)/2
        if full(low,mid)
            low = mid+1
        else if full(mid+1,high)
            high = mid
        else
            if rand() < 0.5
                low = mid+1
            else
                high = mid
    mark(low) // marking the element in binary indexed tree
    arr[i] = low
The function full is implemented using the binary indexed tree and checks whether all the elements in the given range are marked.
Both mark and full have O(log N) complexity, and the binary search calls full O(log N) times per element. Note that the fair coin flip does not weight the two halves by how many unmarked elements each one still contains, so the resulting order is not uniformly random.
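A sketch of a related technique, not the coin-flip search above: the tree stores a 1 for every unused number, and each step picks the k-th unused number directly by descending the tree, which costs O(log N) per element and weights every unused number equally. The function names are made up for illustration, and rand() is only a stand-in: its modulo bias and possibly small RAND_MAX make it unsuitable for N as large as 1,000,000.

```c
#include <stdlib.h>

/* Fenwick (binary indexed) tree over positions 1..n; tree[] has n + 1 slots.
   Each position holds 1 while its number is still unused. */
static void fenwick_add(int *tree, int n, int pos, int delta)
{
    for (; pos <= n; pos += pos & -pos)
        tree[pos] += delta;
}

/* Returns the position of the k-th unused number (k is 1-based). */
static int fenwick_find_kth(const int *tree, int n, int k)
{
    int pos = 0;
    int step = 1;
    while (step * 2 <= n)
        step *= 2;                      /* largest power of two <= n */
    for (; step > 0; step /= 2) {
        if (pos + step <= n && tree[pos + step] < k) {
            pos += step;
            k -= tree[pos];
        }
    }
    return pos + 1;
}

/* Fills out[0..n-1] with 0..n-1 in random order; returns 0 on success. */
int random_permutation(int *out, int n)
{
    int *tree = calloc((size_t)n + 1, sizeof *tree);
    if (tree == NULL)
        return -1;
    /* Linear-time build: each node passes its sum on to its parent. */
    for (int i = 1; i <= n; i++) {
        tree[i] += 1;
        int parent = i + (i & -i);
        if (parent <= n)
            tree[parent] += tree[i];
    }
    for (int i = 0; i < n; i++) {
        int k = rand() % (n - i) + 1;   /* choose among the n - i unused numbers */
        int pos = fenwick_find_kth(tree, n, k);
        out[i] = pos - 1;
        fenwick_add(tree, n, pos, -1);  /* mark the number as used */
    }
    free(tree);
    return 0;
}
```

A caller would seed the generator once with srand() and then pass an output buffer of n ints.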

While loop won't break as intended in C

I'm trying to learn how to program in C and have stumbled into a problem that seems like it should have been a simple fix, but it's giving me more issues than I anticipated. I'm trying to create a number guessing game, where you get three chances to guess the number, but my issue is that the do-while loop won't break when the right answer is guessed. Here is the function:
void Win_Search(int lucky[],const int MAX, int user_entry, int i)
{
int j=0;
do {
j++;
printf("Please enter a number between 0 and 100\n");
scanf("%d",&user_entry);
for(i = 0; i < MAX; i++)
{
if(user_entry==lucky[i])
{
printf("winner\n");
}
}
} while(user_entry==lucky[i]||j<3);
}
Basically it's supposed to loop through the array lucky and check whether user_entry equals any of the 20 numbers in the array. As of right now it loops through and recognizes when a winning number has been selected from the array, but doesn't break out of the loop.
When I change it to
}while(user_entry!=lucky[i]||j<3);
it completely ignores the counter and just loops forever.
I don't want to use break because everything I've read about it says it's poor programming practice. Is there another way to break, or have I simply made a mistake that's causing this issue?
Thanks in advance.
Consider for a second where your index variable "i" comes from. What happens to it after you've found a correct user entry? Where does the control flow go?
I would suggest having a look at the "break" keyword.
You wrote while (user_entry == lucky[i]..), which translates to "as long as user_entry is equal to lucky[i], keep looping". That is clearly not what you intend to do.
Change your condition to } while (user_entry != lucky[i] && j < 3); and the counter will work. In plain English this reads "as long as user_entry is different from lucky[i] AND j is less than 3, keep looping".
But with this condition you still test lucky[i] when i is meaningless: after the for loop, i is equal to MAX, so lucky[i] reads past the end of the array, which is undefined behavior.
If you really don't want to use the break keyword, one solution is to use a flag. Set it to 1 before you start to loop, and change it to 0 when the right answer is found. Your code will become
void Win_Search(int lucky[],const int MAX, int user_entry, int i)
{
int j=0;
char flag = 1;
do {
j++;
printf("Please enter a number between 0 and 100\n");
scanf("%d",&user_entry);
for(i = 0; i < MAX; i++)
{
if(user_entry==lucky[i])
{
printf("winner\n");
flag = 0;
}
}
} while(flag&&j<3);
}
}while(user_entry!=lucky[i]||j<3);
That is bad logic - loop while the user's entry isn't the lucky number OR j is below three? Surely you actually want this:
}while(user_entry!=lucky[i]&&j<3);
This is only the solution to your second issue of it ignoring the counter - the main problem is solved in the other answers.
The only independent condition is that the user has more guesses left. Try this while condition:
while(j < 3);
The less-than should be obvious. There is no equals sign because j is incremented at the top of the loop body, before the guess is read, so when the condition is tested j already counts the guesses made:
j = 1 => first guess
j = 2 => second guess
j = 3 => third guess
After that the user should have no more guesses.
You should find this doesn't work on its own, because we also want to exit the loop if the user guesses correctly. To do this, you can use an int as a bool (0 = false, 1 = true).
void Win_Search(int lucky[],const int MAX, int user_entry, int i)
{
int j=0;
int exitCase = 0;
do {
j++;
printf("Please enter a number between 0 and 100\n");
scanf("%d",&user_entry);
for(i = 0; i < MAX; i++)
{
if(user_entry==lucky[i])
{
exitCase = 1;
printf("winner\n");
}
}
} while(exitCase == 0 && j < 3);
}
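The flag idea can also be sketched with the search moved into its own helper, so the do-while condition never touches lucky[i]. The names is_lucky and win_search are made up for illustration, and the check on scanf's return value is an addition that stops the loop on invalid input.

```c
#include <stdio.h>

/* Returns 1 if guess appears in lucky[0..max-1], otherwise 0.
   The !found test ends the scan early without using break. */
static int is_lucky(const int lucky[], int max, int guess)
{
    int found = 0;
    for (int i = 0; i < max && !found; i++) {
        if (lucky[i] == guess)
            found = 1;
    }
    return found;
}

/* Gives the user up to three guesses; returns 1 on a win, 0 otherwise. */
int win_search(const int lucky[], int max)
{
    int tries = 0;
    int won = 0;
    int ok = 1;
    int guess;

    do {
        tries++;
        printf("Please enter a number between 0 and 100\n");
        ok = (scanf("%d", &guess) == 1);   /* 0 on end of input or non-numeric text */
        won = ok && is_lucky(lucky, max, guess);
        if (won)
            printf("winner\n");
    } while (ok && !won && tries < 3);
    return won;
}
```

Returning the result instead of passing user_entry and i as parameters keeps all loop state local to the function.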

Find integer not occurring twice in an array

I am trying to solve this problem:
In an integer array all numbers occur exactly twice, except for a single number which occurs exactly once.
A simple solution is to sort the array and then look for the element that has no equal neighbor. But I am looking for a better solution that has a time complexity of O(n).
You can use the XOR operation over the entire array. Each pair of equal numbers cancels out, leaving you with the sought value.
int get_orphan(int const * a, int len)
{
int value = 0;
for (int i = 0; i < len; ++i)
value ^= a[i];
// `value` now contains the number that occurred an odd number of times.
// Retrieve its index in the array.
for (int i = 0; i < len; ++i)
{
if (a[i] == value)
return i;
}
return -1;
}

Resources