One thousand random selections of a Tree's node - c

I need to write code that adds nodes to a binary search tree and then picks an element of the tree at random. All the elements should have about equal probability of being selected. I use the following nodes as an example for my tree.
            60
           /  \
         41    72
        /  \
      23    57
     /  \
    1    32
I count those nodes with my function countT(nodeT *p). Then I implemented the following algorithm/pseudocode:
function random()
//returns a random element n = countT
function probability_random_of(int x, int y)
// get the probability of getting value x on the nth call of random
for(i=0;i<1000;i++)
random()
probability_of_(x)
My question is whether this is a correct approach or whether I am overthinking it. If I am wrong, please guide me toward the correct solution. I also have the idea of using either a binomial distribution or a normal distribution.
The output of the code will be,
Probabilities after 1000 random selections are
p(60) = 0.135000
p(41) = 0.135000
p(72) = 0.152000
p(23) = 0.147000
p(57) = 0.156000
p(1) = 0.147000
p(32) = 0.128000
This output bugs me because it seems to contradict the requirement I stated earlier:
all elements should have about equal probability of being selected
which I take to mean that every element should end up with exactly the same result.

Forget the tree, that's not really relevant. This code should tell you how often the random number spits out a given number between (e.g.) 0 and 1000 (just an arbitrary maximum number of values I picked):
#define MAX_VALUES 1000
int index;
float percentages[MAX_VALUES];
int counts[MAX_VALUES];
int maxiterations = 1000000;
for( index = 0; index < MAX_VALUES; ++index )
{
counts[index] = 0;
}
// Initialize the generator (needs <stdlib.h> and <time.h>).
srand((unsigned) time(NULL));
// Now test for some number of iterations.
for( index = 0; index < maxiterations; ++index )
{
int value = rand() % MAX_VALUES;
++counts[value];
}
// At this point, each entry of the "counts" array should be roughly
// the same. But there's no guarantee that they'll be exactly
// equal. This will calculate the percentages.
for( index = 0; index < MAX_VALUES; ++index )
{
percentages[index] = (float)counts[index];
percentages[index] /= (float)maxiterations;
}
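Again, there's no guarantee that the percentages will be exactly the same, but they ought to be close, within a small deviation. The higher the value of maxiterations, the closer the percentages should (theoretically) be. With only 1000 selections spread over 7 nodes, results between about 0.13 and 0.16 are consistent with an equal probability of 1/7 (about 0.143) for each node.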
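If the selection from the tree itself still needs writing, one option (not part of the answer above) is to draw an index in 0..countT-1 and walk to the node at that in-order position. The following is a minimal sketch, assuming a nodeT with value, left and right fields, since the question does not show its definition; kth_inorder and random_node are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical node layout; the question's nodeT is not shown. */
typedef struct nodeT {
    int value;
    struct nodeT *left, *right;
} nodeT;

/* Number of nodes in the subtree rooted at p (the role countT plays). */
static int countT(const nodeT *p) {
    return p ? 1 + countT(p->left) + countT(p->right) : 0;
}

/* Return the node at in-order position k (0-based), or NULL if k is out of range. */
static const nodeT *kth_inorder(const nodeT *p, int k) {
    while (p) {
        int left = countT(p->left);
        if (k < left) {
            p = p->left;            /* target is in the left subtree */
        } else if (k == left) {
            return p;               /* this node is the k-th one */
        } else {
            k -= left + 1;          /* skip the left subtree and this node */
            p = p->right;
        }
    }
    return NULL;
}

/* Pick a node with (approximately) equal probability.
   rand() % n has a slight modulo bias, negligible for small n. */
static const nodeT *random_node(const nodeT *root) {
    int n = countT(root);
    return n > 0 ? kth_inorder(root, rand() % n) : NULL;
}
```

Calling random_node(root) 1000 times and tallying the returned values should give frequencies that hover around 1/n without being exactly equal. Recounting subtrees on every step is wasteful; storing a size in each node would avoid it.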

Related

Rebuild an array of integers after summing the digits of each element

We have a strictly increasing array of length n (1 < n < 500). We sum the digits of each element to create a new array, in which each element's value is in the range 1 to 500. The task is to rebuild the old array from the new one. Since there might be more than one answer, we want the answer with the minimum value of the last element.
Example:
3 11 23 37 45 123 => 3 2 5 10 9 6
now from the second array, we can rebuild the original array in many different ways for instance:
12 20 23 37 54 60
From all the possible combinations, we need the one with the minimum last element.
My Thoughts so far:
The brute-force way is to find all possible numbers that produce each digit sum, build every combination of them across the second array, and pick the combination with the minimum last element. This is obviously not a good choice.
Using this algorithm (with exponential time!) we can create all possible permutations of digits that sum to a number in the second array. Note that we know the original elements were less than 500, so we can limit the depth of the search.
One approach I thought of that might find the answer faster:
Start from the last element of the new array and find all possible numbers whose digit sum equals this element.
Tentatively use the smallest of them for this element.
Do the same for the second-to-last element. If the smallest candidate found for the second-to-last element is not smaller than the one chosen for the last element, backtrack to the last element and try a larger candidate.
Repeat until you reach the first element.
I think this is a greedy solution, but I'm not sure about its time complexity. Is there a better solution for this problem, such as one using dp?
For simplicity, let's have our sequence 1-based and the input sequence is called x.
We will also use a utility function, which returns the sum of the digits of a given number:
int sum(int x) {
int result = 0;
while (x > 0) {
result += x % 10;
x /= 10;
}
return result;
}
Let's assume that we are at index idx and try to set there some number called value (given that the sum of digits of value is x[idx]). If we do so, then what could we say about the previous number in the sequence? It should be strictly less than value.
So we already have a state for a potential dp approach - [idx, value], where idx is the index where we are currently at and value denotes the value we are trying to set on this index.
If the dp table holds boolean values, we will know we have found an answer once we have found a suitable number for the first element of the sequence. Therefore, if there is a path that starts in the last row of the dp table and ends at row 0, we know an answer exists and we can simply restore it.
Our recurrence function will be something like this:
f(idx, value) = OR {dp[idx - 1][value'], where sumOfDigits(value) = x[idx] and value' < value}
f(0, *) = true
Also, in order to restore the answer, we need to track the path. Once we set any dp[idx][value] cell to true, we can save the value' to which we would like to jump in the previous table row.
Now let's code that one. I hope the code is self-explanatory:
boolean[][] dp = new boolean[n + 1][501];
int[][] prev = new int[n + 1][501];
for (int i = 0; i <= 500; i++) {
dp[0][i] = true;
}
for (int idx = 1; idx <= n; idx++) {
for (int value = 1; value <= 500; value++) {
if (sum(value) == x[idx]) {
for (int smaller = 0; smaller < value; smaller++) {
dp[idx][value] |= dp[idx - 1][smaller];
if (dp[idx][value]) {
prev[idx][value] = smaller;
break;
}
}
}
}
}
The prev table only keeps information about which is the smallest value', which we can use as previous to our idx in the resulting sequence.
Now, in order to restore the sequence, we can start from the last element. We would like it to be minimal, so we look for the first value that has dp[n][value] = true. Once we have such an element, we use the prev table to track the values back to the first one:
int[] result = new int[n];
int idx = n - 1;
for (int i = 0; i <= 500; i++) {
if (dp[n][i]) {
int row = n, col = i;
while (row > 0) {
result[idx--] = col;
col = prev[row][col];
row--;
}
break;
}
}
for (int i = 0; i < n; i++) {
out.print(result[i]);
out.print(' ');
}
If we apply this on an input sequence:
3 2 5 10 9 6
we get
3 11 14 19 27 33
The time complexity is O(n * m * m), where n is the number of elements we have and m is the maximum possible value that an element could hold.
The space complexity is O(n * m) as this is dominated by the size of the dp and prev tables.
We can use a greedy algorithm: proceed through the array in order, setting each element to the least value that is greater than the previous element and has digits with the appropriate sum. (We can just iterate over the possible values and check the sums of their digits.) There's no need to consider any greater value than that, because increasing a given element will never make it possible to decrease a later element. So we don't need dynamic programming here.
We can calculate the sum of the digits of an integer m in O(log m) time, so the whole solution takes O(b log b) time, where b is the upper bound (500 in your example).
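The greedy idea above can be sketched in C as follows. This is an illustration under assumptions, not the answerer's code: digit_sum and rebuild are hypothetical names, and limit stands for the upper bound b (500 in the question).

```c
#include <stddef.h>

/* Sum of the decimal digits of a non-negative integer. */
static int digit_sum(int v) {
    int s = 0;
    while (v > 0) {
        s += v % 10;
        v /= 10;
    }
    return s;
}

/* Greedy rebuild: out[i] becomes the least value greater than out[i-1]
   whose digit sum is sums[i]. Returns 1 on success, or 0 if some element
   would have to exceed limit (an assumed upper bound on the values). */
static int rebuild(const int *sums, size_t n, int limit, int *out) {
    int prev = 0;                      /* elements are assumed to be positive */
    for (size_t i = 0; i < n; i++) {
        int v = prev + 1;
        while (v <= limit && digit_sum(v) != sums[i])
            v++;                       /* scan upward for the next match */
        if (v > limit)
            return 0;                  /* no valid value left for this position */
        out[i] = v;
        prev = v;
    }
    return 1;
}
```

For the input 3 2 5 10 9 6 with limit 500 this produces 3 11 14 19 27 33, the same sequence the dp solution prints. Because the candidate value only ever moves upward, the scan visits each value up to the limit at most once across the whole array.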

Filling in random positions in a huge 2D array

Is there a neat algorithm that I can use to fill in random positions in a huge 2D n x n array with m number of integers without filling in an occupied position? Where , and
Kind of like this pseudo code:
int n;
int m;
void init(int new_n, int new_m) {
n = new_n;
m = new_m;
}
void create_grid() {
int grid[n][n];
int x, y;
for(x = 1; x <= n; x ++) {
for(y = 1; y <= n; y ++) {
grid[x][y] = 0;
}
}
populate_grid(grid);
}
void populate_grid(int grid[][]) {
int i = 1;
int x, y;
while(i <= m) {
x = get_pos();
y = get_pos();
if(grid[x][y] == 0) {
grid[x][y] = i;
i ++;
}
}
}
int get_pos() {
return random() % n + 1;
}
... but more efficient for bigger n's and m's. Especially when m is bigger and more positions are occupied, it takes longer to generate a random position that isn't occupied.
Unless the filling factor really gets large, you shouldn't worry about hitting occupied positions.
Assuming for instance that half of the cells are already filled, you have a 50% chance of first hitting a filled cell, a 25% chance of hitting two filled ones in a row, a 12.5% chance of hitting three... On average, it takes only two attempts to find an empty place! (More generally, if only a fraction 1/M of the cells is free, the average number of attempts rises to M.)
If you absolutely want to avoid having to test the cells, you can initialize an array with the indexes of the free cells. Then, instead of choosing a random cell, you choose a random entry in the array, between 1 and L (the length of the list, initially N²).
After having chosen an entry, you set the corresponding cell, move the last element in the list to the chosen position, and set L = L-1. This way, the list of free positions is kept up to date.
Note that this process is probably less efficient than blind attempts.
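The free-list variant described above might look like the sketch below. It is only an illustration: fill_random_cells is a hypothetical name, the grid is assumed to be stored row-major in a flat array, and rand() % len is a simplification that is slightly biased and cannot reach indexes above RAND_MAX, which matters for a truly huge grid.

```c
#include <stdlib.h>

/* Place the values 1..m into m distinct random cells of an n x n grid
   (stored row-major in grid, which must hold n*n ints).
   Returns 0 on success, -1 on bad arguments or allocation failure. */
static int fill_random_cells(int *grid, int n, int m) {
    if (n <= 0 || m < 0 || (long long)m > (long long)n * n)
        return -1;
    size_t total = (size_t)n * (size_t)n;
    size_t *free_cells = malloc(total * sizeof *free_cells);
    if (!free_cells)
        return -1;

    for (size_t i = 0; i < total; i++) {
        grid[i] = 0;
        free_cells[i] = i;             /* every cell starts out free */
    }

    size_t len = total;                /* L in the text: length of the free list */
    for (int value = 1; value <= m; value++) {
        size_t pick = (size_t)rand() % len;       /* simplified random entry */
        grid[free_cells[pick]] = value;
        free_cells[pick] = free_cells[len - 1];   /* move last entry into the hole */
        len--;
    }
    free(free_cells);
    return 0;
}
```

The free list costs n*n extra entries of memory, which is the main reason blind retries are usually the better choice unless the grid is nearly full.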
To generate pseudo-random positions without repeats, you can do something like this:
for (int y=0; y<n; ++y) {
for(int x=0; x<n; ++x) {
int u=x,v=y;
u = (u+hash(v))%n;
v = (v+hash(u))%n;
u = (u+hash(v))%n;
output(u,v);
}
}
For this to work properly, hash(x) needs to be a good pseudo-random hash function that produces non-negative numbers that won't overflow when added to a number between 0 and n.
This is a version of the Feistel structure (https://en.wikipedia.org/wiki/Feistel_cipher), which is commonly used to make cryptographic ciphers like DES.
The trick is that each step like u = (u+hash(v))%n; is invertible -- you can get your original u back by doing u = (u-hash(v))%n (I mean you could if the % operator worked with negative numbers the way everyone wishes it did)
Since you can invert the operations to get the original x,y back from each u,v output, each distinct x,y MUST produce a distinct u,v.
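To make the invertibility argument concrete, here is a small sketch with an explicit inverse. hash_mod is a hypothetical stand-in for hash() (any non-negative function would do), and scramble and unscramble are illustrative names. The code assumes unsigned is at least 32 bits wide and that x and y are below n.

```c
/* Hypothetical stand-in for hash(): reducing the result mod n keeps
   u + hash below 2n, so the additions cannot overflow for sensible n. */
static unsigned hash_mod(unsigned x, unsigned n) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;     /* unsigned multiplication wraps, which is well defined */
    x ^= x >> 16;
    return x % n;
}

/* Forward mapping: the same three rounds as in the loop above. */
static void scramble(unsigned n, unsigned x, unsigned y,
                     unsigned *u, unsigned *v) {
    *u = x;
    *v = y;
    *u = (*u + hash_mod(*v, n)) % n;
    *v = (*v + hash_mod(*u, n)) % n;
    *u = (*u + hash_mod(*v, n)) % n;
}

/* Inverse mapping: undo the rounds in reverse order. Adding n before
   subtracting keeps the value non-negative, which sidesteps the
   problem with % on negative numbers. */
static void unscramble(unsigned n, unsigned u, unsigned v,
                       unsigned *x, unsigned *y) {
    u = (u + n - hash_mod(v, n)) % n;
    v = (v + n - hash_mod(u, n)) % n;
    u = (u + n - hash_mod(v, n)) % n;
    *x = u;
    *y = v;
}
```

Feeding every (x, y) pair of an n x n grid through scramble visits every cell exactly once, and unscramble recovers the original pair. How random the visiting order looks depends entirely on the quality of the hash.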

Algorithm - Find pure numbers

Description:
A positive integer m is said to be a pure number if and only if m can be
expressed as q-th power of a prime p (q >= 1). Here your job is easy,
for a given positive integer k, find the k-th pure number.
Input:
The input consists of multiple test cases. For each test case, it
contains a positive integer k (k<5,000,000). Process to end of file.
Output:
For each test case, output the k-th pure number in a single line. If
the answer is larger than 5,000,000, just output -1.
Sample input:
1
100
400000
Sample output:
2
419
-1
Original page: http://acm.whu.edu.cn/learn/problem/detail?problem_id=1092
Can anyone give me some suggestion on the solution to this?
You've already figured out all the pure numbers, which is the tricky part. Sort the ones less than 5 million and then look up each input in turn in the resulting array.
To optimize, you need to efficiently find all primes up to 5 million (note q >= 1 in the problem description: every prime is a pure number), for which you will want to use some kind of sieve (the sieve of Eratosthenes will do, look it up).
You could probably adapt the sieve to leave in powers of primes, but I expect that it would not take long to sieve normally and then put the powers back in. You only have to compute powers of primes p where p <= the square root of 5 million, which is 2236, so this shouldn't take long compared with finding the primes.
Having found the numbers with a sieve, you no longer need to sort them, just copy the marked values from the sieve to a new array.
Having now looked at your actual code: your QuickSort routine is suspect. It performs badly for already-sorted data and your array will have runs of sorted numbers in it. Try qsort instead, or if you're supposed to do everything yourself then you need to read up on pivot choice for quicksort.
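The sieve-then-add-powers idea might be sketched like this in C. It is an unoptimized illustration, with pure_numbers as a hypothetical name and limit standing for the 5,000,000 bound from the problem statement.

```c
#include <stdlib.h>

/* Collect every pure number (prime power p^q, q >= 1) that is <= limit,
   in increasing order. out must have room for them all; limit + 1 ints
   is always enough. Returns the count, or 0 if limit < 2 or the
   allocation fails. */
static size_t pure_numbers(int limit, int *out) {
    if (limit < 2)
        return 0;
    /* mark[i]: 0 = untouched so far, 1 = composite and not pure, 2 = pure */
    unsigned char *mark = calloc((size_t)limit + 1, 1);
    if (!mark)
        return 0;

    for (long long p = 2; p <= limit; p++) {
        if (mark[p] != 0)
            continue;                   /* composite, or a power of a smaller prime */
        for (long long m = 2 * p; m <= limit; m += p)
            if (mark[m] == 0)
                mark[m] = 1;            /* ordinary sieve step */
        for (long long q = p; q <= limit; q *= p)
            mark[q] = 2;                /* p, p^2, p^3, ... are pure */
    }

    size_t count = 0;
    for (int i = 2; i <= limit; i++)
        if (mark[i] == 2)
            out[count++] = i;           /* already in sorted order */
    free(mark);
    return count;
}
```

With the table built once, the k-th pure number is out[k - 1] when k is at most the returned count, and -1 otherwise. No sorting is needed because the marks are read out in increasing order.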
Try the following approach:
static void Main(string[] args)
{
int max = 5000000;
int[] dp = new int[max];
for (int i = 2; i < max; i++)
{
if (dp[i] == 0)
{
long t = i;
while (t < max)
{
dp[t] = 1;
t *= i;
}
int end = max / i;
for (int j = 2; j < end; j++)
if (dp[i * j] == 0)
dp[i * j] = 2;
}
}
int[] result = new int[348978];
int pointer = 1;
for (int i = 2; i < max; i++)
{
if (dp[i] == 1)
result[pointer++] = i;
}
}
Pure numbers are marked as 1 in the dp array.
Numbers that are not pure are marked as 2.
For each input k, check whether it is within the range of result: if it is, output result[k]; if not, output -1.

Sorting an Array without changing it C

Hey guys, I have been working on this for 3 days and have come up with nothing from everywhere I have looked.
I am trying to take an array of around 250 floats and find the Kth largest value without changing the array in any way or making a new array.
I can't change it or create a new one, because other functions need the data kept in its current order and my Arduino can't hold any more values in its memory, so the 2 easiest options are out.
The values in the Array can ( and probably will ) have duplicates in them.
As an EG : if you have the array ::: 1,36,2,54,11,9,22,9,1,36,0,11;
from Max to min would be ::
1) 54
2) 36
3) 36
4) 22
5) 11
6) 11
7) 9
8) 9
9) 2
10) 1
11) 1
12) 0
Any help would be great.
It may be too much to ask for a function that would do this nicely for me :) hahaha
Here is the code I have so far, but I have not even tried to get the duplicates working yet,
and for some reason it only gives me one answer, 2, and I have no clue why.
void setup()
{
Serial.begin(9600);
}
void loop ()
{
int Array[] = {1,2,3,4,5,6,7,8,9,10};
int Kth = 6; //// just for testing putting the value as a constant
int tr = 0; /// traking threw the array to find the MAX
for (int y=0;y<10;y++) //////////// finding the MAX first so I have somewhere to start
{
if (Array[y]>Array[tr])
{
tr = y;
}
}
Serial.print("The max number is ");
int F = Array[tr];
Serial.println(F); // Prints the MAX ,,, mostly just for error checking this is done
///////////////////////////////////////////////////////// got MAX
for ( int x = 1; x<Kth;x++) //// run the below Kth times and each time lowering the "Max" making the loop run Kth times
{
for(int P=0;P<10;P++) // run threw every element
{
if (Array[P]<F)
{
for(int r=0;r<10;r++) //and then test that element against every other element to make sure
//its is bigger then all the rest but small then MAX
{
Serial.println(r);
if(r=tr) /////////////////// done so the max dosent clash with the number being tested
{
r++;
Serial.println("Max's Placeing !!!!");
}
if(Array[P]>Array[r])
{
F=Array[P]; ////// if its bigger then all others and smaller then the MAx then make that the Max
Serial.print(F);
Serial.println(" on the ");
}
}}}}
Serial.println(F); /// ment to give me the Kth largest number
delay(1000);
}
If speed isn't an issue you can take this approach (pseudocode):
current=inf,0
for i in [0,k):
max=-inf,0
for j in [0,n):
item=x[j],j
if item<current and item>max:
max=item
current=max
current will then contain the kth largest item, where an item is a pair of value and index.
The idea is simple. To find the first largest item, you just find the largest item. To find the second largest item, you find the largest item that is less than your first largest item. To find the third largest item, you find the largest item that is less than your second largest item, and so on.
The only trick here is that since there can be duplicates, the items need to include both a value and an index to make them unique.
Here is how it might be implemented in C:
void loop()
{
int array[] = {1,2,3,4,5,6,7,8,9,10};
int n = 10;
int k = 6; //// just for testing putting the value as a constant
int c = n; // start with current index being past the end of the array
// to indicate that there is no current index.
for (int x = 1; x<=k; x++) {
int m = -1; // start with the max index being before the beginning of
// the array to indicate there is no max index
for (int p=0; p<n; p++) {
int ap = array[p];
// if this item is less than current
if (c==n || ap<array[c] || (ap==array[c] && p<c)) {
// if this item is greater than max
if (m<0 || ap>array[m] || (ap==array[m] && p>m)) {
// make this item be the new max
m = p;
}
}
}
// update current to be the max
c = m;
}
Serial.println(array[c]); // prints the Kth largest number
delay(1000);
}
In the C version, I just keep track of the current and max indices, since I can always get the current and max values by looking in the array.

Starting with a 10x10 array how do I choose 10 random sites

I am trying to write C code to randomly select 10 random sites from a grid of 10x10. The way I am considering going about this is to assign every cell a random number between zero and RAND_MAX and then picking out the 10 smallest/largest values. But I have very little idea about how to actually code something like that :/
I have used pseudo-random number generators before so I can do that part.
Just generate 2 random numbers between 0 and 9 and then select the random element from the array like:
arr[rand1][rand2];
Do that 10 times in a loop. No need to make it more complicated than that.
To simplify slightly, treat the 10x10 array as an equivalent linear array of 100 elements. Now the problem becomes that of picking 10 distinct numbers from a set of 100. To get the first index, just pick a random number in the range 0 to 99.
int hits[10]; /* stow randomly selected indexes here */
hits[0] = random1(100); /* random1(n) returns a random int in range 0..n-1 */
The second number is almost as easy. Choose another number from the 99 remaining possibilities. random1(99) returns a number in the continuous range 0..98; you must then map that into the broken range 0..hits[0]-1, hits[0]+1..99.
hits[1] = random1(99);
if (hits[1] == hits[0]) hits[1]++;
Now for the third number the mapping starts to get interesting, because it takes a little extra work to ensure the new number is distinct from both existing choices.
hits[2] = random1(98);
if (hits[2] == hits[0]) hits[2]++;
if (hits[2] == hits[1]) hits[2]++;
if (hits[2] == hits[0]) hits[2]++; /* re-check, in case hits[1] == hits[0]+1 */
If you sort the array of hits as you go, you can avoid the need to re-check elements for uniqueness. Putting everything together:
int hits[10];
int i, n;
for (n = 0; n < 10; n++) {
int choice = random1( 100 - n ); /* pick a remaining index at random */
for (i = 0; i < n; i++) {
if (choice < hits[i]) /* find where it belongs in sorted hits */
break;
choice++; /* and make sure it is distinct */
/* need ++ to preserve uniform random distribution! */
}
insert1( hits, n, choice, i );
/* insert1(...) inserts choice at i in growing array hits */
}
You can use hits to fetch elements from your 10x10 array like this:
array[hits[0]/10][hits[0]%10]
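The snippet above relies on random1 and insert1 without defining them. A minimal sketch of what they could look like follows; both bodies are assumptions rather than the answerer's code, and pick_sorted simply wraps the loop from the answer so that it can be called with any sizes.

```c
#include <stdlib.h>

/* Assumed helper: random int in 0..n-1 (rand() % n is slightly biased). */
static int random1(int n) {
    return rand() % n;
}

/* Assumed helper: insert value at position pos of hits, which currently
   holds len entries, shifting the later entries up by one. */
static void insert1(int *hits, int len, int value, int pos) {
    for (int i = len; i > pos; i--)
        hits[i] = hits[i - 1];
    hits[pos] = value;
}

/* Pick k distinct indexes from 0..total-1, leaving hits sorted ascending. */
static void pick_sorted(int total, int k, int *hits) {
    for (int n = 0; n < k; n++) {
        int choice = random1(total - n);   /* rank among the remaining indexes */
        int i;
        for (i = 0; i < n; i++) {
            if (choice < hits[i])
                break;                     /* found its place in sorted hits */
            choice++;                      /* step over an index already taken */
        }
        insert1(hits, n, choice, i);
    }
}
```

Calling pick_sorted(100, 10, hits) fills hits with 10 distinct cell numbers in increasing order, which can then be split into row hits[i]/10 and column hits[i]%10.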
for (int i = 0; i < 10; i++) {
// ith random entry in the matrix
arr[rand() % 10][rand() % 10];
}
Modified this from Peter Raynham's answer - I think the idea in it is right, but his execution is too complex and isn't mapping the ranges correctly:
To simplify slightly, treat the 10x10 array as an equivalent linear array of 100 elements. Now the problem becomes that of picking 10 distinct numbers from a set of 100.
To get the first index, just pick a random number in the range 0 to 99.
int hits[10]; /* stow randomly selected indexes here */
hits[0] = random1(100); /* random1(n) returns a random int in range 0..n-1 */
The second number is almost as easy. Choose another number from the 99 remaining possibilities. random1(99) returns a number in the continuous range 0..98; you must then map that into the broken range 0..hits[0]-1, hits[0]+1..99.
hits[1] = random1(99);
if (hits[1] >= hits[0]) hits[1]++;
Note that you must map the complete range of hits[0]..98 to hits[0]+1..99
For another number you must compare to all previous numbers, so for the third number you must do
hits[2] = random1(98);
if (hits[2] >= hits[0]) hits[2]++;
if (hits[2] >= hits[1]) hits[2]++;
You don't need to sort the numbers! Putting everything together:
int hits[10];
int i, n;
for (n = 0; n < 10; n++) {
int choice = random1( 100 - n ); /* pick a remaining index at random */
for (i = 0; i < n; i++)
if (choice >= hits[i])
choice++;
hits[i] = choice;
}
You can use hits to fetch elements from your 10x10 array like this:
array[hits[0]/10][hits[0]%10]
If you want your chosen random cells from grid to be unique - it seems that you really want to construct random permutations. In that case:
Put cell number 0..99 into 1D array
Take some shuffle algorithm and toss that array with it
Read first 10 elements out of shuffled array.
Drawback: Running time of this algorithm increases linearly with increasing number of cells. So it may be better for practical reasons to do as @PaulP.R.O. says ...
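The shuffle steps listed above could be sketched as follows. This uses a partial Fisher-Yates shuffle, which is one possible choice of "some shuffle algorithm" and not necessarily the one the answer had in mind; pick_shuffled is a hypothetical name.

```c
#include <stdlib.h>

/* Choose k distinct cell numbers out of 0..total-1 by shuffling only the
   first k positions of the array 0..total-1.
   Returns 0 on success, -1 on bad arguments or allocation failure. */
static int pick_shuffled(int total, int k, int *out) {
    if (total <= 0 || k < 0 || k > total)
        return -1;
    int *cells = malloc((size_t)total * sizeof *cells);
    if (!cells)
        return -1;
    for (int i = 0; i < total; i++)
        cells[i] = i;                       /* cell numbers 0..total-1 */

    for (int i = 0; i < k; i++) {
        /* swap position i with a random position from i to the end;
           rand() % m is slightly biased, acceptable for a sketch */
        int j = i + rand() % (total - i);
        int tmp = cells[i];
        cells[i] = cells[j];
        cells[j] = tmp;
        out[i] = cells[i];                  /* first k entries are the picks */
    }
    free(cells);
    return 0;
}
```

Only k swaps are performed, but filling the array is still linear in the number of cells, which is the drawback mentioned above.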
There is a subtle bug in hjhill's solution. If you don't sort the elements in your list, then when you scan the list (inner for loop), you need to re-scan whenever you bump the choice index (choice++). This is because you may bump it into a previous entry in the list - for example with random numbers: 90, 89, 89.
The complete code:
int hits[10];
int i, j, n;
for (n = 0; n < 10; n++) {
int choice = random1( 100 - n ); /* pick a remaining index at random */
for (i = 0; i < n; i++) {
if (choice >= hits[i]) { /* find its place in partitioned range */
choice++;
for (j = 0; j < i; j++) { /* adjusted the index, must ... */
if (choice == hits[j]) { /* ... ensure no collateral damage */
choice++;
j = -1; /* restart the scan: the loop's j++ brings it back to 0 */
}
}
}
}
hits[n] = choice;
}
I know it's getting a little ugly with five levels of nesting. When selecting just a few elements (e.g., 10 of 100) it will have better performance than the sorting solution; when selecting a lot of elements (e.g., 90 of 100), performance will likely be worse than the sorting solution.

Resources