I am very new to the concept of Dynamic Programming and CS in general. I am teaching myself by reading lectures posted online, watching videos, and solving problems posted on websites such as GeeksforGeeks and HackerRank.
Problem
Given input
3 25 30 5
where 3 = #of keys
25 = frequency of key 1
30 = frequency of key 2
5 = frequency of key 3
I am to print the minimum cost when the keys are arranged in an optimal manner. This is an optimal binary search tree problem, and I found a solution on GeeksforGeeks that does something similar.
#include <stdio.h>
#include <limits.h>
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j);
/* A Dynamic Programming based function that calculates minimum cost of
a Binary Search Tree. */
int optimalSearchTree(int keys[], int freq[], int n)
{
/* Create an auxiliary 2D matrix to store results of subproblems */
int cost[n][n];
/* cost[i][j] = Optimal cost of binary search tree that can be
formed from keys[i] to keys[j].
cost[0][n-1] will store the resultant cost */
// For a single key, cost is equal to frequency of the key
for (int i = 0; i < n; i++)
cost[i][i] = freq[i];
// Now we need to consider chains of length 2, 3, ... .
// L is chain length.
for (int L=2; L<=n; L++)
{
// i is row number in cost[][]
for (int i=0; i<=n-L; i++)
{
// Get column number j from row number i and chain length L
int j = i+L-1;
cost[i][j] = INT_MAX;
// Try making all keys in interval keys[i..j] as root
for (int r=i; r<=j; r++)
{
// c = cost when keys[r] becomes root of this subtree
int c = ((r > i)? cost[i][r-1]:0) +
((r < j)? cost[r+1][j]:0) +
sum(freq, i, j);
if (c < cost[i][j])
cost[i][j] = c;
}
}
}
return cost[0][n-1];
}
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j)
{
int s = 0;
for (int k = i; k <=j; k++)
s += freq[k];
return s;
}
// Driver program to test above functions
int main()
{
int keys[] = {0,1,2};
int freq[] = {34, 8, 50};
int n = sizeof(keys)/sizeof(keys[0]);
printf("Cost of Optimal BST is %d ", optimalSearchTree(keys, freq, n));
return 0;
}
However, this solution also takes the "keys" as input, and they seem to have no impact on the final answer, as they shouldn't. Only the frequency with which each key is searched for matters.
For simplicity's sake, and to understand this dynamic approach, I was wondering how I can modify this solution so that it takes its input in the format shown above and prints the result.
The function you presented does have a keys parameter, but it does not use it. You could remove it altogether.
Edit: in particular, since function optimalSearchTree() does not use its keys parameter at all, removing that argument requires changing only the function signature (...
int optimalSearchTree(int freq[], int n)
...) and the one call of that function. Since you don't need the keys for this particular exercise, though, you can altogether remove them from the main program, too, to give you:
int main()
{
int freq[] = {25, 30, 5};
int n = sizeof(freq)/sizeof(freq[0]);
printf("Cost of Optimal BST is %d ", optimalSearchTree(freq, n));
return 0;
}
(substituting the frequency values you specified for the ones in the original code)
The function does, however, assume that the frequencies are given in order of increasing key. It needs at least the relative key order to do its job, because otherwise you cannot construct a search tree. If you were uncomfortable with the idea that the key values are unknown, you could interpret the code to be using indices into the freq[] array as aliases for the key values. That works because a consequence of the assumption described above is that x -> keys[x] is a 1:1, order-preserving mapping from integers 0 ... n - 1 to whatever the actual keys are.
If the function could not assume the frequencies were initially given in increasing order by key, then it could first use the keys to sort the frequencies into that order, and then proceed as it does now.
Related
I need to write a function which fills array rez with the conjugate-complex pairs from the array bounded by p1 and p2. The function returns the number of conjugate-complex pairs placed in the array. Duplicates must not be placed in the sequence. Conjugate-complex pairs are pairs of forms a + bi and a - bi.
This task should be solved using structures and pointer arithmetic. Auxiliary arrays are not allowed.
#include <stdio.h>
typedef struct {
int im, re;
} complex;
void remove_duplicates(complex *rez, int *number){
int i,j,k;
for (i = 0; i < *number; i++) {
for (j = i + 1; j < *number; j++) {
if (rez[i].im == rez[j].im && rez[i].re == rez[j].re) {
for (k = j; k < *number - 1; k++) {
rez[k].im = rez[k + 1].im;
rez[k].re = rez[k + 1].re;
}
(*number)--;
j--;
}
}
}
}
int conjugate_complex(complex *p1, complex *p2, complex *rez) {
int number_of_pairs = 0;
while (p1 < p2) {
if (p1->im == p1->re||p1->im == -1*p1->re) {
number_of_pairs++;
rez->re = p1->re;
rez->im = -1*p1->im;
}
rez++;
p1++;
}
remove_duplicates(rez,&number_of_pairs);
return number_of_pairs;
}
int main() {
int i;
complex arr1[5] = {{5, 5}, {3, 3}, {-5, -5}, {5, 5}, {-3, 3}};
complex arr2[5];
int vel = conjugate_complex(arr1, arr1 + 5, arr2);
printf("%d\n", vel);
for (i=0; i<vel; i++)
printf("(%d,%d) ",arr2[i].im,arr2[i].re);
return 0;
}
OUTPUT should be:
4
(-5,5) (-3,3) (5,-5) (3,3)
My output is:
5
(-5,5) (-3,3) (5,-5) (-5,5) (3,3)
The problem with my code is that it prints duplicates.
Could you help me fix my remove_duplicates function?
If I call it in main function it would work. However, I need to call it in the function conjugate_complex.
To see why it would be nice to have some O(n) space, consider what you would do in real life with graph paper. Take each complex number and place a spot on the graph at (re, abs(im)). That way, any duplicates get merged into one. This solution is O(n) in expectation: the hash is O(n) in size, so you have to throw out some information, which will lead to collisions.
It would be better not to duplicate elements in the first place. Whether you use a Bloom filter to get around the restriction of not having an array, an O(n) hash, an O(n log n) sort, or an O(n^2) approach (arguably the simplest), it would be good to have this function (in pseudo-code; <stdbool.h> is C99, so use int otherwise, adornments aside):
boolean pair_is_equal(pair a, pair b)
Be aware that a pair is not semantically equivalent to a complex. You can use the same representation (which you've been using, and, considering the output format, the simplest,) but be aware that they represent different things. If you let a complex stand in for a pair:
boolean pair_is_equal(complex a, complex b)
then you also have to check one of a or b's complex conjugate (except when Im[a] == 0 || Im[b] == 0). It might also be useful to care about two's-complement INT_MIN, which is outside the domain of abs and has no representable negation.
How can I use loops to check that there are no repeated numbers in an n x n matrix?
Using two nested for loops twice doesn't let me compare elements that don't share a row or a column.
Example: (in the most simplified way possible):
int matrix[n][n];
/*matrix is filled*/
int current, isEqual;
for (int i=0; i<n; i++)
{
for (int j=0; j<n; j++)
{
current = matrix[i][j];
if (current == matrix[i][j+1])
{
isEqual=1;
}
else
{
isEqual=0;
}
}
}
for (int j=0; j<n; j++)
{
for (int i=0; i<n; i++)
{
current = matrix[i][j];
if (current == matrix[i+1][j])
{
isEqual=1;
}
else
{
isEqual=0;
}
}
}
I can't check numbers that don't share lines or columns.
First, think of an NxM matrix as if it were an array of length N*M. The only difference is how you access the elements (two for loops instead of one, for example).
Then, a simple algorithm would be to iterate every element (first index), and for each one, iterate every other element (second index) to check if it's the same. It's easier to do with an array; in a matrix it's the same, maybe a bit more verbose and complex. But the algorithm is the same.
As a second phase, after you have implemented the basic algorithm, you can improve its performance starting the second index in the element after the first index. This way, you avoid checking the already seen elements multiple times. This algorithm improvement is slightly harder to do in a matrix, if you iterate it with 2 fors, as it's a bit harder to know what's the "next index" (you have a "compound" index, {i,j}).
One simple way to do this is to insert each number into a data structure that makes it easy to check for duplicates. This is sort of fun to do in C, and although the following is certainly not super efficient or production ready, it's (IMO) a nice little toy:
/* Check if any integer on the input stream is a dup */
#include <stdio.h>
#include <stdlib.h>
struct node { int data; struct node *child[2]; };
static struct node *
new_node(int data)
{
struct node *e = calloc(1, sizeof *e);
if( e == NULL ){
perror("calloc");
exit(EXIT_FAILURE);
}
e->data = data;
return e;
}
/*
* Insert a value into the tree. Return 1 if already present.
* Note that this tree needs to be rebalanced. In a real
* project, we would use existing libraries. For this toy
* it is not worth the work needed to properly rebalance the
* tree.
*/
int
insert(struct node **table, int data)
{
struct node *t = *table;
if( !t ){
*table = new_node(data);
return 0;
}
if( data == t->data ){
return 1;
}
return insert(&t->child[data < t->data], data);
}
int
main(void)
{
int rv, v;
struct node *table = NULL;
while( (rv = scanf("%d", &v)) == 1 ){
if( insert(&table, v) ){
fprintf(stderr, "%d is duplicated\n", v);
return EXIT_FAILURE;
}
}
if( rv != EOF ){
fprintf(stderr, "Invalid input\n");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
The basic approach is to loop through the nxn matrix and keep a list of the numbers in it, along with a count of the number of times each number is found in the matrix.
The following is example source code for a 50 x 50 matrix. To extend this to an n x n matrix is fairly straightforward and I leave that as an exercise for you. You may need to do something such as using malloc() to create an arbitrary sized matrix. There are posts on that sort of thing.
I also do not specify how the data is put into the matrix in the first place. That is also up to you.
This is to just show a brute force approach for determining if there are duplicates in the matrix.
I've also taken the liberty of assuming the matrix elements are int but changing the type to something else should be straightforward. If the matrix elements are something other than a simple data value type such as int, long, etc. then the function findAndCount() will need changing for the equality comparison.
Here are the data structures I'm using.
typedef struct {
int nLength; // number of list elements in use
struct {
int iNumber; // number from an element of the nxn matrix
int iCount; // number of times this element was found in the matrix
} list[50 * 50];
} elementList;
elementList matrixList = {
0,
{0, 0}
};
int matrixThing[50][50];
Next we need to loop through the matrix and check whether each element is already in the list. If it is not, we add it. If it is, we increment its count.
for (unsigned short i = 0; i < 50; i++) {
for (unsigned short j = 0; j < 50; j++) {
findAndCount (matrixThing[i][j], &matrixList);
}
}
And then we need to define our function we use to check matrix values against the list.
void findAndCount (int matrixElement, elementList *matrixList)
{
for (int i = 0; i < matrixList->nLength; i++) {
if (matrixElement == matrixList->list[i].iNumber) {
matrixList->list[i].iCount++;
return;
}
}
// value not found in the list so we add it and set the initial count
// to one.
// we can then determine if there are any duplicates by checking the
// resulting list once we have processed all matrix elements to see
// if any count is greater than one.
// the initial check will be to see if the value of nLength is equal
// to the number of array elements in the matrix, n time n.
// so a 50 x 50 matrix should result in an nLength of 2500 if each
// element is unique.
matrixList->list[matrixList->nLength].iNumber = matrixElement;
matrixList->list[matrixList->nLength].iCount = 1;
matrixList->nLength++;
return;
}
Search algorithms
The above function, findAndCount(), is a brute force search algorithm that searches through an unsorted list element by element until either the thing being searched for is found or the end of the list is reached.
If the list is sorted then you can use a binary search algorithm which is much quicker than a linear search. However you then run into the overhead needed to keep the list sorted using a sorting algorithm in order to use a binary search.
If you change the data structure used to store the list of found values to a data structure that maintains values in an ordered sequence, you can also cut down on the overhead of searching though there will also be an overhead of inserting new values into the data structure.
One such data structure is a tree and there are several types and algorithms to build a tree by inserting new items as well as searching a tree. See search tree which describes several different kinds of trees and searches.
So there is a kind of balancing between the effort to do searching versus the effort to add items to the data structure.
Here is an example that checks for duplicate values the way you want to do it.
Looping is slow, and we should use a hash set or a tree instead of using loops.
I assume you are not using C++, because the C++ standard library has built-in algorithms and data structures to do it efficiently.
#include <stdio.h>
/* Search the 'array' with the specified 'size' for the value 'key'
starting from 'offset' and return 1 if the value is found, otherwise 0 */
int find(int key, int* array, int size, int offset) {
for (int x = offset; x < size; ++x)
if (key == array[x])
return 1;
return 0;
}
/* Print duplicate values in a matrix */
int main(int argc, char *argv[]) {
int matrix[3][3] = { 1, 2, 3, 4, 3, 6, 2, 8, 2 };
int size = sizeof(matrix) / sizeof(matrix[0][0]);
int *ptr = (int*)matrix;
for (int x = 0; x < size; ++x) {
/* If we already checked the number, then don't check it again */
if (find(ptr[x], ptr, x, 0))
continue;
/* Check if the number repeats and show it in the console if it does */
if (find(ptr[x], ptr, size, x + 1))
printf("%d\n", ptr[x]);
}
return 0;
}
When you become better at C, you should find or implement a "hash set" or a "red-black tree", and use that instead.
I'm trying to write some code in C. The main idea is that I have an input linear array, pix_r, that contains the radius for each pixel; for a picture of size (128, 512), for instance, the length of pix_r is 128 * 512. For each radius, I need to keep a fixed number of randomly selected pixels and set the others to -1. What I mean:
r = 2 occurs in pix_r = [1, 8, 2, 2, 4, 6, 7, 7, 8, 2, 8] at positions currentR = [2, 3, 9]. With NumberOfRandomS = 2, one possible result is pix_r = [*, *, 2, -1, *, *, *, *, *, 2, *], and the same should be done for each r. If the number of items equal to r is less than NumberOfRandomS, all of them are kept without any modification.
I tried to write this in C, but I am a newbie and don't know all the features and tips for optimization. My first approach to writing this function is:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <math.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdint.h>
#include <stddef.h>
#include <limits.h>
#include <string.h>
#include <ctype.h>
const int NumberOfRandomS = 5;
void RandomSelected(size_t numEl, int maxRad, int *pix_r){
srand(time(NULL));
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
int lenNumPerShell1 = (maxRad) * sizeof(int);
int* numPerShell1 = (int*) malloc(lenNumPerShell1);
memset(numPerShell1, 0, lenNumPerShell1);
//Calculate the number of each pix_r per shell
for (int i=0; i<numEl; ++i){
numPerShell1[pix_r[i]]++;
}
//Main part for random selection of NumberOfRandomS items
//for each pix_r
for(int r=0; r<maxRad; ++r){
int lenShellR = numPerShell1[r];
//if number of items for this r is less than should be
//selected, skip it. It means that we selected all items
//for this r
if(lenShellR <= NumberOfRandomS){
continue;
}
int lenCurrentR = lenShellR*sizeof(int);
int* currentR = (int *) malloc(lenCurrentR); // array of indexes for this r
memset(currentR, 0, lenCurrentR);
//filling currentR array with all indexes for this r
int cInd = 0;
for(register int j=0; j<numEl; ++j){
if(pix_r[j] == r){
currentR[cInd] = j;
cInd++;
}
}
//generate random indexes without repetition that should be selected from currentR
//these indexes let us keep the r value in these positions; the other indexes for this r
//are set to -1
int value[NumberOfRandomS];
for (int i=0;i<NumberOfRandomS;++i)
{
int check; //variable to check or index is already used for this r
size_t pick_index; //variable to store the random index in
do
{
pick_index = rand() % lenShellR;
//check or index is already used for this r:
check=1;
for (int j=0;j<i;++j)
if (pick_index == value[j]) //if index is already used
{
check=0; //set check to false
break; //no need to check the other elements of value[]
}
} while (check == 0); //loop until new, unique index is found
value[i]=pick_index; //store the generated index in the array
RandomIndex[i] = currentR[pick_index];
}
//set all positions for each r that are not on random selected to -1
for(register int k=0; k < lenShellR; ++k)
{
int flag = 0; // flag will be 1 if this index for this r in RandomIndex
for (register int q = 0; q < NumberOfRandomS; ++q)
{
if(RandomIndex[q] == currentR[k])
{
flag = 1; //this index is found
}
}
if(flag != 1)
{
//index for this r not in RandomIndex, so set this index for this r to -1
pix_r[currentR[k]] = -1;
}
}
}
return;
}
I tried to optimize a little bit, but different resources contradict each other and after testing it didn't show any speeding up:
void ModRandomSelected(size_t numEl, int maxRad, int *pix_r){
srand(time(NULL));
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
int lenNumPerShell1 = (maxRad) * sizeof(int);
int* numPerShell1 = (int*) malloc(lenNumPerShell1);
memset(numPerShell1, 0, lenNumPerShell1);
//Calculate the number of each pix_r per shell
for (int i=numEl-1; i>=0; --i){
numPerShell1[pix_r[i]]++;
}
//Main part for random selection of NumberOfRandomS items
//for each pix_r
for(int r=maxRad-1; r>=0; --r)
{
int lenShellR = numPerShell1[r];
//if number of items for this r is less than should be
//selected, skip it. It means that we selected all items
//for this r
if(lenShellR <= NumberOfRandomS){
continue;
}
int lenCurrentR = lenShellR*sizeof(int);
int* currentR = (int *) malloc(lenCurrentR); // array of indexes for this r
memset(currentR, 0, lenCurrentR);
//filling currentR array with all indexes for this r
int cInd = 0;
for(register int i = numEl-1; i>=0; --i)
{
if(pix_r[i] == r){
currentR[cInd] = i;
cInd++;
}
}
//generate random indexes without repetition that should be selected from currentR
//these indexes let us keep the r value in these positions; the other indexes for this r
//are set to -1
int value[NumberOfRandomS];
for (int i=NumberOfRandomS-1; i>=0; --i)
{
int check; //variable to check or index is already used for this r
size_t pick_index; //variable to store the random index in
do
{
pick_index = rand() % lenShellR;
//check or index is already used for this r:
check=1;
for (int j=0;j<i;++j)
if (pick_index == value[j]) //if index is already used
{
check=0; //set check to false
break; //no need to check the other elements of value[]
}
} while (check == 0); //loop until new, unique index is found
value[i]=pick_index; //store the generated index in the array
RandomIndex[i] = currentR[pick_index];
}
//set all positions for each r that are not on random selected to -1
for(register int k=lenShellR-1; k >= 0; --k)
{
int flag = 0; // flag will be 1 if this index for this r in RandomIndex
for (register int q = NumberOfRandomS-1; q >= 0; --q)
{
if(RandomIndex[q]== currentR[k]){
flag = 1; //this index is found
}
}
if(flag != 1)
{
//index for this r not in RandomIndex, so set this index for this r to -1
pix_r[currentR[k]] = -1;
}
}
}
return;
}
I will be very thankful if you help and explain what and how I can improve this function.
The code is rather messy and hard to follow, so I can't be bothered to figure out what it actually does. The algorithm overall might be the true bottleneck. Anyway, here are some miscellaneous comments and advice about potential problems that I spotted:
Ensure to only call srand once in the whole program.
The register keyword is obsolete, from a time when compilers were bad at determining when to place variables in registers. Nowadays, compilers are more competent at this than programmers, don't use register, it is bloat.
Similarly, replacing up-counting loops with down-counting ones for the sake of performance is an obsolete technique that nowadays falls under "premature optimization". The compiler can do that optimization for you, so write the code to be as readable as possible instead.
Avoid iterating over the same range/array multiple times.
Keep loop conditions as trivial as possible. This helps readability and data cache optimization both. The ideal for loop should look like for(int i=0; i<n; i++).
malloc is much slower than static or local storage. In this case you have a few items and only need to access them locally, so all malloc calls should be swapped with local arrays. You may use VLA here, to get stack allocation instead. That is, drop this code:
int lenRandomIndex = NumberOfRandomS*sizeof(int);
int* RandomIndex = (int*) malloc(lenRandomIndex);
memset(RandomIndex, 0, lenRandomIndex);
and replace with this code:
int RandomIndex [NumberOfRandomS];
You have similar situations all over the code. And you probably don't need to set it to zero, because:
Don't zero-initialize or memset arrays that you intend to fill with data as the first thing you do anyway. This is a rather big performance problem in the posted code.
Empty return ; at the end of a function returning void is just clutter.
Investigate if some of these searches could be replaced with binary search. It means sorting the data in advance but might lead to much faster code overall.
Minimize the amount of checks, particularly inside loops.
Split up your big monster functions into several. Local static functions are very certain to get inlined and they improve readability a lot. Splitting functions into several smaller also allows much easier benchmarking.
Please benchmark your code when optimizations are enabled.
I want to generate an array of the sequence [0...1'000'000] in random order without shuffling.
This means that I don't want to do:
int arr[1000000];
for (int i = 0; i < 1000000; i++)
{
arr[i] = i;
}
shuffle(arr);
I want to figure out how to do it without the "black-box" shuffle function. I also don't want to randomly select an index between 1 and 1'000'000 because at number 999'999 there would be only a 1/1'000'000 chance to continue.
I've been trying to think of a solution and I think the key is parallel arrays and looping backwards then using modulus to limit only to the indexes that you haven't already been to, but then I can't guarantee that the value I get is unique.
I don't want to use a HashSet or TreeSet implementation either.
This can be done in O(n) time with two lists, one with the numbers (initially) in order, and one in the resulting order.
You start with n elements in order in your source list. Then you select a random number mod n. That gives you the next element, which you place in the destination list.
Now the key part. If you were to pick a random number between 0 and n-1 each time, as you seem to think a shuffle does, you have an increasing chance of selecting a number you selected before. So how do you handle this? By shrinking the list of numbers available to select from.
In the source list, after selecting a number, you move the last element of the list to the index that was just used. You now have a list of n-1 numbers to choose from. So on the next iteration you take a random number mod n-1. Keep going until your source list has only one element.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define LEN 10
int main()
{
int a[LEN], b[LEN];
int i, val;
int count = LEN;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN;i++) {
val = rand() % count;
b[i] = a[val];
a[val] = a[count-1];
count--;
}
for (i=0;i<LEN;i++) {
printf("%d ", b[i]);
}
printf("\n");
return 0;
}
EDIT:
Here's a slightly more efficient version that doesn't use two arrays and therefore needs only O(1) extra space:
int a[LEN];
int i, val, tmp;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN-1;i++) {
val = (rand() % (LEN - i)) + i; /* pick from a[i..LEN-1], including i itself */
tmp = a[i];
a[i] = a[val];
a[val] = tmp;
}
for (i=0;i<LEN;i++) {
printf("%d ", a[i]);
}
printf("\n");
The O(N) answer is great but here is an alternative way using binary search and binary indexed tree to do this in O(NlogN).
arr = []
N = 1,000,000
for i from 0 to N-1
    low = 0
    high = N-1
    while low < high
        mid = (low+high)/2   // recomputed on every iteration
        if full(low,mid)
            low = mid+1
        else if full(mid+1,high)
            high = mid
        else
            if rand() < 0.5
                low = mid+1
            else
                high = mid
    mark(low) // marking the element in binary indexed tree
    arr[i] = low
The function full is implemented using binary indexed tree and checks whether all the elements in the range given are marked or not.
Both mark and full have O(logN) complexity.
Hello friends I need your help.
My program needs an array of size 1000 whose values are between 0 and 999. The numbers should be chosen randomly (rand in a loop) and must not repeat. The main part of the task is to count how many times I call rand().
My idea is this: one loop initializes all 1000 numbers and, within that loop, checks whether the number already appears; if it does, it is drawn again until it no longer repeats (maybe this is not the best way, but ...).
This is my attempt (here I need your help):
#include <stdio.h>
#include <stdlib.h>
int main()
{
int const arr_size = 1000;
int i, j, c;
int arr[arr_size];
int loop = 0;
for(i = 0; i<arr_size; i++)
{
arr[i] = rand() % 1000;
loop++;
if (arr[i] == arr[i - 1])
{
arr[i] = rand() % 1000;
loop++;
}
}
printf("%d\n",loop);
}
If anyone can give me advice on how to make it work, I would appreciate your help.
Thanks.
As suggested, shuffling the set will work but other indirect statistical quantities might be of interest, such as the distribution of the loop variable as a function of the array index.
This seemed interesting so I went ahead and plotted the distribution of the loop as a function of the array index, which generally increases as i increases. Indeed, as we get near the end of the array, the chance of getting a new random number that is not already in the set decreases (and hence, the value of the loop variable increases; see the code below).
Specifically, for an array size = 1000, I recorded the non-zero values generated for loop (there were around 500 duplicates) and then made a plot vs the index.
The plot looks like this: [figure not reproduced: value of loop plotted against array index]
The code below will produce an array with the unique random values, and then calculate the value for loop. The loop values could be stored in another array and then saved for later analysis, but I didn't include that in the code below.
Again, I'm not exactly sure this fits the application, but it does return information that would not necessarily be available from an approach using a shuffle algorithm.
NOTE: some folks expressed concerns about how long this might take, but it runs pretty quickly; on my 2011 MacBook Pro it took about a second for an array size of 1000. I didn't do a big-O analysis as a function of the array size, but that would be interesting too.
NOTE 2: it's more elegant to use recursion for the numberInSet() function, but it seemed best to keep it simple.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <stdbool.h> /* If C99 */
const int ARR_SIZE = 1000;
/* Check if the number is in the set up to the given position: */
bool numberInSet(int number, int* theSet, int position);
int main()
{
int* arr = malloc(sizeof(int)*ARR_SIZE);
srand((unsigned int)time(NULL));
/* Initialize array with rand entries, possibly duplicates: */
for(int i = 0; i < ARR_SIZE; i++)
arr[i] = rand() % ARR_SIZE;
/* Scan the array, look for duplicate values, replace if needed: */
for(int i = 0; i < ARR_SIZE; i++) {
int loop = 0;
while ( numberInSet(arr[i], arr, i-1) ) {
arr[i] = rand() % ARR_SIZE;
loop++;
}
/* could save the loop values here, e.g., loopVals[i] = loop; */
}
for(int i = 0; i < ARR_SIZE; i++)
printf("i = %d, %d\n",i,arr[i]);
/* Free the heap memory */
free(arr);
}
bool numberInSet(int number, int* theSet, int position) {
if (position < 0)
return false;
for(int i = 0; i <= position; i++)
if (number == theSet[i])
return true;
return false;
}
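To get a different random sequence on each run, seed the random generator exactly once, before the loop. Note that seeding by itself does not prevent rand() from returning the same value twice; the duplicate check is still needed.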
srand (time(NULL)); //seed the random generator
//in the loop, rand will use the seeded value
rand() % 1000