C programming: Generate random n-combinations from given set?

Say I have a pre-specified set S of m items. I would like to generate a random combination of n (unique) items taken from S.
Is there an easy way to implement this in C? I looked into rand() but it didn't seem to do what I want.
(EDIT to add more details)
The specific problem is to randomly choose n distinct elements from an array of size m. My first instinct is to do this:
idx_array = []
int idx = rand() % m
[if idx not in idx_array, add to idx_array. Otherwise repeat above line. Repeat until idx_array has size n]
But it doesn't look like this process is truly random. I'm still new to C and really just want to know if there's a built-in function for this purpose.
Any help appreciated.

Instead of generating indices from 0 to m - 1 with the possibility of duplicates, shuffle your array and then take the first n elements:
#include <stdio.h>
#include <stdlib.h>
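// Randomly shuffle an array (Fisher-Yates).
// arc4random_uniform() is a BSD extension (also on macOS and glibc 2.36+), not standard C.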
void shuffle (int * array, int n) {
int i, j, tmp;
for (i = n - 1; i > 0; i--) {
j = arc4random_uniform(i + 1);
tmp = array[j];
array[j] = array[i];
array[i] = tmp;
}
}
int main (int argc, char const *argv[])
{
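// Enum constants are integer constant expressions, so s[] is a normal array.
// With "const int m", s[m] would be a VLA, and a VLA cannot have an initializer.
enum { m = 5, n = 3 };
int s[m] = {10, 20, 30, 40, 50};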
// Make a copy of s before shuffling it
int t[m];
for(size_t i = 0; i < m; i++)
{
t[i] = s[i];
}
shuffle(t, m);
// Now the first n elements of t are what you want
for(size_t i = 0; i < n; i++)
{
printf("%d ", t[i]);
}
return 0;
}
Credit to Roland Illig for the Fisher-Yates shuffle function.
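If arc4random_uniform() is not available on your platform, the same idea can be sketched with plain rand(). The version below is only an illustration: the names rand_below and choose_n are made up for this example, and it stops shuffling after n positions because only the first n elements are needed. It assumes you have called srand() once elsewhere and that m does not exceed RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Uniform integer in [0, bound) built on rand(). Values of rand() at or above
   the largest multiple of bound are rejected, which avoids the bias of a plain
   rand() % bound. Assumes 0 < bound <= RAND_MAX. */
static int rand_below(int bound) {
    assert(bound > 0 && bound <= RAND_MAX);
    int limit = RAND_MAX - (RAND_MAX % bound); /* a multiple of bound */
    int r;
    do {
        r = rand();
    } while (r >= limit);
    return r % bound;
}

/* Partial Fisher-Yates shuffle (hypothetical helper): after the call,
   a[0..n-1] holds n distinct elements chosen at random from the m originals. */
void choose_n(int *a, int m, int n) {
    assert(n >= 0 && n <= m);
    for (int i = 0; i < n; i++) {
        int j = i + rand_below(m - i); /* pick from the not-yet-chosen tail */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Calling choose_n(t, m, n) on the copy t from the answer above and then reading t[0] through t[n - 1] gives the combination.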

This is a sampling problem. There are a host of sampling algorithms, but a straightforward one that does the job pretty well is known as reservoir sampling. See GeeksforGeeks for more details on reservoir sampling.
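For reference, a minimal sketch of reservoir sampling (the variant usually called Algorithm R) could look like this. The function name reservoir_sample is an assumption for this example, and rand() % k is slightly biased, so treat it as an illustration rather than a finished routine.

```c
#include <assert.h>
#include <stdlib.h>

/* Copies n randomly chosen, distinct elements of src[0..m-1] into out[0..n-1]
   in a single pass over src. Hypothetical helper; assumes srand() was called
   once elsewhere. */
void reservoir_sample(const int *src, int m, int *out, int n) {
    assert(n >= 0 && n <= m);
    /* Start with the first n elements in the reservoir. */
    for (int i = 0; i < n; i++)
        out[i] = src[i];
    /* Element i replaces a random reservoir slot with probability n / (i + 1). */
    for (int i = n; i < m; i++) {
        int j = rand() % (i + 1); /* j is in [0, i] */
        if (j < n)
            out[j] = src[i];
    }
}
```

The main attraction over shuffling is that src is read once and never modified, which also works when the items arrive as a stream.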

Related

Find the most frequent elements in an array of Integers

I have to find all of the elements which have the maximum frequency. For example, if array a={1,2,3,1,2,4}, I have to print as 1, also 2. My code prints only 2. How to print the second one?
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define n 6
int main(){
int a[n]={1,2,3,1,2,4};
int counter=0,mostFreq=-1,maxcnt=0;
for(int i=0;i<n;i++){
for(int j=i+1;j<n;j++){
if(a[i]==a[j]){
counter++;
}
}
if(counter>maxcnt){
maxcnt=counter;
mostFreq=a[i];
}
}
printf("The most frequent element is: %d",mostFreq);
}
How to print the second one?
The goal is not only to print a potential 2nd one, but all of the elements which have the maximum frequency.
OP already has code that attempts to determine the maximum frequency (note that counter must be reset to 0 at the start of each outer iteration for the counts to be right). Let us build on that. Save it as int target = maxcnt;.
Instead of printing mostFreq, a simple (still O(n*n)) approach would perform the same 2-nested for() loops again. In that second pass, replace this:
if(counter>maxcnt){
maxcnt=counter;
mostFreq=a[i];
}
With:
if(counter == target){
; // TBD code: print the a[i] and counter.
}
For large n, a more efficient approach would sort a[] (research qsort()). Then walk the sorted a[] twice, the first time finding the maximum frequency and the second time printing values that match this frequency.
This is O(n log n) in time and O(n) in memory (if a copy is needed to preserve the original array). It also works well with negative values or if we change the type of a[] from int to long long, double, etc.
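A rough sketch of the sort-then-walk idea follows. The names cmp_int and most_frequent are invented for this example, and the function sorts the caller's array in place, so pass a copy if the original order matters.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort(); avoids the overflow of "a - b". */
static int cmp_int(const void *pa, const void *pb) {
    int a = *(const int *)pa, b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sorts a[0..n-1] in place, then writes every value with maximal frequency
   to out[] (which needs room for n values). Returns how many were written. */
size_t most_frequent(int *a, size_t n, int *out) {
    assert(a != NULL && out != NULL);
    size_t best = 0, count = 0;
    qsort(a, n, sizeof *a, cmp_int);
    /* Walk 1: find the length of the longest run of equal values. */
    for (size_t i = 0; i < n; ) {
        size_t j = i;
        while (j < n && a[j] == a[i]) j++;
        if (j - i > best) best = j - i;
        i = j;
    }
    /* Walk 2: collect the value of every run that has that length. */
    for (size_t i = 0; i < n; ) {
        size_t j = i;
        while (j < n && a[j] == a[i]) j++;
        if (j - i == best) out[count++] = a[i];
        i = j;
    }
    return count;
}
```

For the array {1, 2, 3, 1, 2, 4} this reports two values, 1 and 2.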
The standard student solution to such problems would be this:
Make a second array called frequency, with one more element than the maximum value occurring in your data (this assumes the values are non-negative integers).
Init this array to zero.
Each time you encounter a value in the data, use that value as an index to access the frequency array, then increment the corresponding frequency by 1. For example freq[value]++;.
When done, search through the frequency array for the largest number(s). Optionally, you could sort it.
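The steps above can be sketched as follows. MAX_VALUE is an assumed upper bound on the data chosen for this example, not something from the question, and most_frequent_counting is a made-up name. The approach only works for non-negative values in a small, known range.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_VALUE = 100 }; /* assumed bound on the data (hypothetical) */

/* Writes every value of a[0..n-1] with maximal frequency to out[] in
   ascending order and returns how many were written. out needs room for
   up to MAX_VALUE + 1 values. */
size_t most_frequent_counting(const int *a, size_t n, int *out) {
    unsigned freq[MAX_VALUE + 1] = {0}; /* one counter per possible value */
    unsigned best = 0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0 && a[i] <= MAX_VALUE);
        freq[a[i]]++;                    /* the value itself is the index */
        if (freq[a[i]] > best) best = freq[a[i]];
    }
    /* Every value whose counter equals the maximum is a winner. */
    for (int v = 0; v <= MAX_VALUE && best > 0; v++)
        if (freq[v] == best) out[count++] = v;
    return count;
}
```

This runs in O(n + MAX_VALUE) time, at the cost of memory proportional to the value range rather than to n.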
We can (potentially) save some effort in an approach with unsorted data by creating an array of boolean flags to determine whether we need to count an element at all.
For the array {1, 2, 3, 1, 2, 4} we still have nested for loops, so the worst case remains O(n^2), but we can skip the inner loop entirely for repeated numbers.
#include <stdio.h>
#include <stdbool.h>
int main(void) {
int arr[] = {1, 2, 3, 1, 2, 4};
size_t arr_size = sizeof(arr) / sizeof(*arr);
bool checked[arr_size];
for (size_t i = 0; i < arr_size; i++) checked[i] = false;
unsigned int counts[arr_size];
for (size_t i = 0; i < arr_size; i++) counts[i] = 0;
for (size_t i = 0; i < arr_size; i++) {
if (!checked[i]) {
checked[i] = true;
counts[i]++;
for (size_t j = i+1; j < arr_size; j++) {
if (arr[i] == arr[j]) {
checked[j] = true;
counts[i]++;
}
}
}
}
unsigned int max = 0;
for (size_t i = 0; i < arr_size; i++) {
if (counts[i] > max) max = counts[i];
}
for (size_t i = 0; i < arr_size; i++) {
if (counts[i] == max)
printf("%d\n", arr[i]);
}
return 0;
}

A question regarding making ordered pairs from every element in an array

I was curious about how I could possibly iterate through an array, and keep track of every single possible ordered pair.
To create a problem to illustrate this: let's say I have a function that takes in an input array, the length of that array and a "target" which is the product of 2 values, and outputs an array consisting of the indices of the input array that you need to multiply in order to get the "target".
int* multipairs(int* inputarray, int arraysize, int target){
// code
}
For example:
Given an array, arr = [2, 5, 1, 9, 1, 0, 10, 2], and target = 50
It should return output = [1,6].
In my mind, I would iterate through the arrays as follow;
(0,1) -> (0,2) -> (0,3) -> (0,4)....
In the second pass I would do:
(1,2) -> (1,3) -> (1,4)...
.
.
.
and so on
I have the idea of what I want to do, but I am unfamiliar with C programming, and have no idea how to make a proper for loop. Please help me figure this out.
Your description of the algorithm is complete - as you say, the first item in the pair iterates over all the array indices. For each of those, you want to iterate over all the indices that follow it in the array.
for (int i = 0; i < arraysize; i++)
{
for (int j = i + 1; j < arraysize; j++)
{
// operate on pair array[i] and array[j]
}
}
You can use nested for-loops to solve your problem.
int* multipairs(int* inputarray, int arraysize, int target){
int i, j, k = -1;
/*
Maximum number of such pairs can be arraysize*(arraysize-1)/2
Since, for each pair we store two indices (0-indexed),
maximum size of output array will be arraysize*(arraysize-1)
*/
int maxsize = arraysize*(arraysize-1);
int *output = (int*)malloc(sizeof(int)*maxsize);
for (i = 0; i < arraysize; i++){
for (j = i + 1; j < arraysize; j++){
if(inputarray[i] * inputarray[j] == target){
output[++k] = i;
output[++k] = j;
}
}
}
return output;
}
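One gap in returning just the array is that the caller cannot tell how many indices were stored. A possible variant is sketched below; the name find_pairs and the out_count parameter are additions made for this example and are not part of the original function.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a malloc'ed array of index pairs (i, j) with i < j and
   in[i] * in[j] == target. *out_count receives the number of ints stored
   (two per pair). The caller frees the result. Returns NULL when n < 2 or
   when the allocation fails. The product may overflow for large values. */
int *find_pairs(const int *in, int n, int target, int *out_count) {
    assert(in != NULL && out_count != NULL);
    *out_count = 0;
    if (n < 2) return NULL;
    /* At most n*(n-1)/2 pairs, two indices each. */
    size_t max_ints = (size_t)n * (size_t)(n - 1);
    int *out = malloc(max_ints * sizeof *out);
    if (out == NULL) return NULL;
    int k = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (in[i] * in[j] == target) {
                out[k++] = i;
                out[k++] = j;
            }
        }
    }
    *out_count = k;
    return out;
}
```

With arr = [2, 5, 1, 9, 1, 0, 10, 2] and target = 50 this stores the single pair 1, 6 and sets *out_count to 2.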

To speed up C-code random selection with further modification array

I'm trying to write some code in C. The main idea is that I have an input linear array that holds the radius for each pixel (the length of pix_r for a picture of size (128, 512), for instance, will be 128 * 512). For each radius I need to randomly select a fixed number of pixels and set the others to -1. What I mean:
r = 2 in pix_r = [1, 8, 2, 2, 4, 6, 7, 7, 8, 2, 8] is in the following positions: currentR = [2, 3, 9]. Let NumberOfRandomS = 2, so one possible result is pix_r = [*, *, 2, -1, *, *, *, *, *, 2, *], and the same should be done for each r. If the number of items equal to r is less than NumberOfRandomS, we should keep all of them without any modification.
I tried to write this in C, but I am a newbie and don't know all the features and tips for optimization. My first approach to writing this function is:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <math.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdint.h>
#include <stddef.h>
#include <limits.h>
#include <string.h>
#include <ctype.h>
const int NumberOfRandomS = 5;
void RandomSelected(size_t numEl, int maxRad, int *pix_r){
srand(time(NULL));
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
int lenNumPerShell1 = (maxRad) * sizeof(int);
int* numPerShell1 = (int*) malloc(lenNumPerShell1);
memset(numPerShell1, 0, lenNumPerShell1);
//Calculate the number of each pix_r per shell
for (int i=0; i<numEl; ++i){
numPerShell1[pix_r[i]]++;
}
//Main part for random selection of NumberOfRandomS items
//for each pix_r
for(int r=0; r<maxRad; ++r){
int lenShellR = numPerShell1[r];
//if number of items for this r is less than should be
//selected, skip it. It means that we selected all items
//for this r
if(lenShellR <= NumberOfRandomS){
continue;
}
int lenCurrentR = lenShellR*sizeof(int);
int* currentR = (int *) malloc(lenCurrentR); // array of indexes for this r
memset(currentR, 0, lenCurrentR);
//filling currentR array with all indexes for this r
int cInd = 0;
for(register int j=0; j<numEl; ++j){
if(pix_r[j] == r){
currentR[cInd] = j;
cInd++;
}
}
//generate random indexes without repetiotion that should be selected from currentR
//this indexes help us to save r value in these positions and others indexes for this r
//set to -1
int value[NumberOfRandomS];
for (int i=0;i<NumberOfRandomS;++i)
{
int check; //variable to check or index is already used for this r
size_t pick_index; //variable to store the random index in
do
{
pick_index = rand() % lenShellR;
//check or index is already used for this r:
check=1;
for (int j=0;j<i;++j)
if (pick_index == value[j]) //if index is already used
{
check=0; //set check to false
break; //no need to check the other elements of value[]
}
} while (check == 0); //loop until new, unique index is found
value[i]=pick_index; //store the generated index in the array
RandomIndex[i] = currentR[pick_index];
}
//set all positions for each r that are not on random selected to -1
for(register int k=0; k < lenShellR; ++k)
{
int flag = 0; // flag will be 1 if this index for this r in RandomIndex
for (register int q = 0; q < NumberOfRandomS; ++q)
{
if(RandomIndex[q] == currentR[k])
{
flag = 1; //this index is found
}
}
if(flag != 1)
{
//index for this r not in RandomIndex, so set this index for this r to -1
pix_r[currentR[k]] = -1;
}
}
}
return;
}
I tried to optimize a little bit, but different resources contradict each other and after testing it didn't show any speeding up:
void ModRandomSelected(size_t numEl, int maxRad, int *pix_r){
srand(time(NULL));
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
int lenNumPerShell1 = (maxRad) * sizeof(int);
int* numPerShell1 = (int*) malloc(lenNumPerShell1);
memset(numPerShell1, 0, lenNumPerShell1);
//Calculate the number of each pix_r per shell
for (int i=numEl-1; i>=0; --i){
numPerShell1[pix_r[i]]++;
}
//Main part for random selection of NumberOfRandomS items
//for each pix_r
for(int r=maxRad-1; r>=0; --r)
{
int lenShellR = numPerShell1[r];
//if number of items for this r is less than should be
//selected, skip it. It means that we selected all items
//for this r
if(lenShellR <= NumberOfRandomS){
continue;
}
int lenCurrentR = lenShellR*sizeof(int);
int* currentR = (int *) malloc(lenCurrentR); // array of indexes for this r
memset(currentR, 0, lenCurrentR);
//filling currentR array with all indexes for this r
int cInd = 0;
for(register int i = numEl-1; i>=0; --i)
{
if(pix_r[i] == r){
currentR[cInd] = i;
cInd++;
}
}
//generate random indexes without repetiotion that should be selected from currentR
//this indexes help us to save r value in these positions and others indexes for this r
//set to -1
int value[NumberOfRandomS];
for (int i=NumberOfRandomS-1; i>=0; --i)
{
int check; //variable to check or index is already used for this r
size_t pick_index; //variable to store the random index in
do
{
pick_index = rand() % lenShellR;
//check or index is already used for this r:
check=1;
for (int j=0;j<i;++j)
if (pick_index == value[j]) //if index is already used
{
check=0; //set check to false
break; //no need to check the other elements of value[]
}
} while (check == 0); //loop until new, unique index is found
value[i]=pick_index; //store the generated index in the array
RandomIndex[i] = currentR[pick_index];
}
//set all positions for each r that are not on random selected to -1
for(register int k=lenShellR-1; k >= 0; --k)
{
int flag = 0; // flag will be 1 if this index for this r in RandomIndex
for (register int q = NumberOfRandomS-1; q >= 0; --q)
{
if(RandomIndex[q]== currentR[k]){
flag = 1; //this index is found
}
}
if(flag != 1)
{
//index for this r not in RandomIndex, so set this index for this r to -1
pix_r[currentR[k]] = -1;
}
}
}
return;
}
I will be very thankful if you help and explain what and how I can improve this function.
The code is rather messy and hard to follow, so I can't be bothered to figure out what it actually does. The algorithm overall might be the true bottleneck. Anyway, here are some miscellaneous comments and advice on potential problems that I spotted:
Ensure to only call srand once in the whole program.
The register keyword is obsolete, from a time when compilers were bad at determining when to place variables in registers. Nowadays, compilers are more competent at this than programmers, so don't use register; it is bloat.
Similarly, replacing up-counting loops with down-counting ones for the sake of performance is an obsolete technique that nowadays falls under "premature optimization". The compiler can do that optimization for you, so write the code as readably as possible instead.
Avoid iterating over the same range/array multiple times.
Keep loop conditions as trivial as possible. This helps readability and data cache optimization both. The ideal for loop should look like for(int i=0; i<n; i++).
malloc is much slower than static or local storage. In this case you have a few items and only need to access them locally, so all malloc calls should be swapped with local arrays. You may use VLA here, to get stack allocation instead. That is, drop this code:
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
and replace with this code:
int RandomIndex [NumberOfRandomS];
You have similar situations all over the code. And you probably don't need to set it to zero, because:
Don't zero-initialize or memset arrays that you intend to fill with data as the first thing you do anyway. This is a rather big performance problem in the posted code.
Empty return ; at the end of a function returning void is just clutter.
Investigate if some of these searches could be replaced with binary search. It means sorting the data in advance but might lead to much faster code overall.
Minimize the amount of checks, particularly inside loops.
Split up your big monster functions into several. Local static functions are very certain to get inlined and they improve readability a lot. Splitting functions into several smaller also allows much easier benchmarking.
Please benchmark your code when optimizations are enabled.
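To make some of these points concrete, here is one possible restructuring, offered as a sketch rather than a drop-in replacement. It groups the pixel indices by radius once with a counting sort instead of rescanning pix_r for every r, then runs a partial shuffle inside each group. The name random_select and the keep parameter are inventions for this example; it assumes every pix_r[i] lies in [0, maxRad), that srand() was called once elsewhere, and that no group is larger than RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Keeps at most `keep` randomly chosen entries per radius in pix_r and sets
   the rest to -1. Returns 0 on success, -1 if an allocation fails. */
int random_select(size_t numEl, int maxRad, int *pix_r, size_t keep) {
    assert(maxRad > 0);
    if (numEl == 0) return 0;
    size_t *start = calloc((size_t)maxRad + 1, sizeof *start);
    size_t *next = malloc((size_t)maxRad * sizeof *next);
    size_t *idx = malloc(numEl * sizeof *idx);
    if (!start || !next || !idx) {
        free(start); free(next); free(idx);
        return -1;
    }
    /* Count the pixels per radius, then turn the counts into group offsets. */
    for (size_t i = 0; i < numEl; i++) {
        assert(pix_r[i] >= 0 && pix_r[i] < maxRad);
        start[pix_r[i] + 1]++;
    }
    for (int r = 0; r < maxRad; r++) {
        start[r + 1] += start[r];
        next[r] = start[r];
    }
    /* Scatter each pixel index into the group of its radius. */
    for (size_t i = 0; i < numEl; i++)
        idx[next[pix_r[i]]++] = i;

    for (int r = 0; r < maxRad; r++) {
        size_t *group = idx + start[r];
        size_t len = start[r + 1] - start[r];
        if (len <= keep) continue; /* few enough items: keep them all */
        /* Partial Fisher-Yates: the first `keep` slots become the sample. */
        for (size_t k = 0; k < keep; k++) {
            size_t j = k + (size_t)rand() % (len - k);
            size_t tmp = group[k];
            group[k] = group[j];
            group[j] = tmp;
        }
        for (size_t k = keep; k < len; k++)
            pix_r[group[k]] = -1;
    }
    free(start); free(next); free(idx);
    return 0;
}
```

The total work is proportional to numEl + maxRad, compared with roughly numEl * maxRad for the repeated scans in the posted code. As with any such change, benchmark it against the original before trusting it.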

Rearranging an array with respect to another array

I have 2 arrays, in parallel:
defenders = {1,5,7,9,12,18};
attackers = {3,10,14,15,17,18};
Both are sorted. What I am trying to do is rearrange the defending array's values so that they win more games (defender[i] > attacker[i]), but I am having issues with how to swap the values in the defenders array. So in reality we are only working with the defenders array with respect to the attackers.
I have this, but if anything it isn't shifting much and I'm pretty sure I'm not doing it right. It's supposed to be a brute force method.
void rearrange(int* attackers, int* defenders, int size){
int i, c, j;
int temp;
for(i = 0; i<size; i++){
c = 0;
j = 0;
if(defenders[c]<attackers[j]){
temp = defenders[c+1];
defenders[c+1] = defenders[c];
defenders[c] = temp;
c++;
j++;
}
else
c++;
j++;
}
}
Edit: I did ask this question before, but I feel as if I worded it terribly, and didn't know how to "bump" the older post.
To be honest, I didn't look at your code, since I have to wake up in less than 2.30 hours to go to work, hope you won't have hard feelings for me.. :)
I implemented the algorithm proposed by Eugene Sh. Some links you may want to read first, before digging into the code:
qsort in C
qsort and structs
shortcircuiting
My approach:
1. Create merged array by scanning both att and def.
2. Sort merged array.
3. Refill def with values that satisfy the ad pattern.
4. Complete refilling def with the remaining values (that are defeats)*.
*Steps 3 and 4 require two passes in my approach, maybe it can get better.
#include <stdio.h>
#include <stdlib.h>
typedef struct {
char c; // a for att and d for def
int v;
} pair;
void print(pair* array, int N);
void print_int_array(int* array, int N);
// function to be used by qsort()
int compar(const void* a, const void* b) {
pair *pair_a = (pair *)a;
pair *pair_b = (pair *)b;
if(pair_a->v == pair_b->v)
return pair_b->c - pair_a->c; // d has highest priority
return pair_a->v - pair_b->v;
}
int main(void) {
const int N = 6;
int def[] = {1, 5, 7, 9, 12, 18};
int att[] = {3, 10, 14, 15, 17, 18};
int i, j = 0;
// let's construct the merged array
pair merged_ar[2*N];
// scan the def array
for(i = 0; i < N; ++i) {
merged_ar[i].c = 'd';
merged_ar[i].v = def[i];
}
// scan the att array
for(i = N; i < 2 * N; ++i) {
merged_ar[i].c = 'a';
merged_ar[i].v = att[j++]; // watch out for the pointers
// 'merged_ar' is bigger than 'att'
}
// sort the merged array
qsort(merged_ar, 2 * N, sizeof(pair), compar);
print(merged_ar, 2 * N);
// scan the merged array
// to collect the patterns
j = 0;
// first pass to collect the patterns ad
for(i = 0; i < 2 * N; ++i) {
// if pattern found
if(merged_ar[i].c == 'a' && // first letter of pattern
i < 2 * N - 1 && // check that I am not the last element
merged_ar[i + 1].c == 'd') { // second letter of the pattern
def[j++] = merged_ar[i + 1].v; // fill-in `def` array
merged_ar[i + 1].c = 'u'; // mark that value as used
}
}
// second pass to collect the cases where 'def' loses
for(i = 0; i < 2 * N; ++i) {
// 'a' is for the 'att' and 'u' is already in 'def'
if(merged_ar[i].c == 'd') {
def[j++] = merged_ar[i].v;
}
}
print_int_array(def, N);
return 0;
}
void print_int_array(int* array, int N) {
int i;
for(i = 0; i < N; ++i) {
printf("%d ", array[i]);
}
printf("\n");
}
void print(pair* array, int N) {
int i;
for(i = 0; i < N; ++i) {
printf("%c %d\n", array[i].c, array[i].v);
}
}
Output:
gsamaras@gsamaras:~$ gcc -Wall px.c
gsamaras@gsamaras:~$ ./a.out
d 1
a 3
d 5
d 7
d 9
a 10
d 12
a 14
a 15
a 17
d 18
a 18
5 12 18 1 7 9
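Since both input arrays are already sorted, the same result can probably be reached without building and sorting a merged array. The sketch below is an alternative and not the code above: it walks the defenders once, gives each attacker (weakest first) the smallest defender that beats it, and puts the unused defenders into the remaining slots. The name rearrange_greedy is made up for this example, and a win is assumed to mean strictly greater.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reorders defenders[] so that defenders[i] > attackers[i] for as many i as
   possible. Both arrays must be sorted in ascending order. Returns the number
   of wins, or -1 if an allocation fails. */
int rearrange_greedy(const int *attackers, int *defenders, int size) {
    assert(attackers != NULL && defenders != NULL);
    if (size <= 0) return 0;
    int *result = malloc((size_t)size * sizeof *result);
    int *spare = malloc((size_t)size * sizeof *spare);
    if (!result || !spare) {
        free(result); free(spare);
        return -1;
    }
    int wins = 0, nspare = 0;
    for (int d = 0; d < size; d++) {
        /* wins <= d here, so attackers[wins] is always in range. */
        if (defenders[d] > attackers[wins])
            result[wins++] = defenders[d]; /* beats the next attacker */
        else
            spare[nspare++] = defenders[d]; /* too weak for any remaining one */
    }
    /* The spare defenders fill the slots that cannot be won. */
    memcpy(result + wins, spare, (size_t)nspare * sizeof *spare);
    memcpy(defenders, result, (size_t)size * sizeof *result);
    free(result); free(spare);
    return wins;
}
```

On the arrays from the question this yields 5 12 18 1 7 9 with 3 wins, the same arrangement as the output shown above.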
The problem is that you are resetting c and j to zero on each iteration of the loop. Consequently, you are only ever comparing the first value in each array.
Another problem is that you will read one past the end of the defenders array in the case that the last value of defenders array is less than last value of attackers array.
Another problem or maybe just oddity is that you are incrementing both c and j in both branches of the if-statement. If this is what you actually want, then c and j are useless and you can just use i.
I would offer you some updated code, but there is not a good enough description of what you are trying to achieve; I can only point out the problems that are apparent.

How do you break an array in to little arrays of a fixed size? (in C)

I was trying to do an exercise on HackerRank but found that my code (which is below) is too linear. To make it better, I want to know if it is possible to break an array into little arrays of a fixed size to complete this exercise.
The exercise on HackerRank
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
int N, M, Y, X;
scanf("%d %d %d %d", &N, &M, &Y, &X);
int max = 0;
int total = 0;
int data[N][M];
for(int i = 0; i < N; i++)
{
for(int j = 0; j < M; j++)
{
scanf("%d",&(data[i][j]));
}
}
for(int i = 0; i < N; i++)
{
for(int j = 0; j < M; j++)
{
total = 0;
for(int l = 0; (l < Y) && (i + Y) <= N; l++)
{
for(int k = 0; (k < X) && (j + X <= M); k++)
{
total += data[i+l][j+k];
}
if(total > max)
max = total;
}
}
}
printf("%d",max);
return 0;
}
While "breaking" it into pieces implies that we'd be moving things around in memory, you may be able to "view" the array in such a way that is equivalent.
In most expressions, the name of an array is converted to a pointer to its first element. When you index into the array, pointer arithmetic is used to locate the correct element. This is necessary because C arrays store no per-element pointer information; an element's address is computed from the base address and the index.
The nature of how arrays are stored, however, can be leveraged by you to treat the data as arbitrary arrays of whatever size you'd like. For example, if we had:
int integers[] = {1,2,3,4,5,6,7,8,9,10};
you could view this as a single array:
for(i=0;i!=10;i++){ printf("%d\n", integers[i]); }
But starting with the above array you could also do this:
int *iArray1, *iArray2;
iArray1 = integers;
iArray2 = integers + 5; // pointer arithmetic already scales by sizeof(int)
for(i=0;i!=5;i++){ printf("%d - %d\n", iArray1[i], iArray2[i]);}
In this way we are choosing to view the data as two 5 term arrays.
The problem is not that the solution is linear. The main problem is your algorithm's complexity: as written it is O(N^4). I also think your solution is not correct, since the exercise says:
The cellular tower can cover a rectangular area of Y rows and X columns.
That does not mean exactly Y rows and X columns. IMHO you could find a solution where the area's dimensions are less than X, Y.
Problems like that are solvable in reasonable time using dynamic programming. Try to optimize your program to O(N^2) using dynamic programming.
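One common dynamic programming tool for this kind of task is a 2D prefix-sum table, sketched below. This is an illustration under assumptions, not a verified solution to the exercise: it only considers windows of exactly Y rows and X columns that fit inside the grid, it takes the grid as a flat row-major array rather than the data[N][M] of the question, and the name max_window_sum is invented here. If the values are non-negative, a full-size window that fits is never worse than a smaller one.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Largest sum over all Y-by-X windows of an N-by-M grid stored row-major in
   data[]. Runs in O(N*M) time. Returns 0 if no window fits or if the
   allocation fails (error handling kept minimal for the sketch). */
long long max_window_sum(const int *data, int N, int M, int Y, int X) {
    assert(N > 0 && M > 0 && Y > 0 && X > 0);
    if (Y > N || X > M) return 0;
    size_t n = (size_t)N, m = (size_t)M, y = (size_t)Y, x = (size_t)X;
    size_t W = m + 1; /* row width of the prefix table */
    long long *P = calloc((n + 1) * W, sizeof *P);
    if (P == NULL) return 0;
    /* P[i*W + j] = sum of the first i rows restricted to the first j columns. */
    for (size_t i = 1; i <= n; i++)
        for (size_t j = 1; j <= m; j++)
            P[i * W + j] = data[(i - 1) * m + (j - 1)]
                         + P[(i - 1) * W + j] + P[i * W + (j - 1)]
                         - P[(i - 1) * W + (j - 1)];
    long long best = LLONG_MIN;
    /* (i, j) is one past the bottom-right cell of the window. */
    for (size_t i = y; i <= n; i++)
        for (size_t j = x; j <= m; j++) {
            long long s = P[i * W + j] - P[(i - y) * W + j]
                        - P[i * W + (j - x)] + P[(i - y) * W + (j - x)];
            if (s > best) best = s;
        }
    free(P);
    return best;
}
```

Each window sum is obtained from four table lookups, so the two innermost loops of the original code disappear.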
