I was trying to write some code in C to simulate temperature fluctuations +/- 4 from the previous value, however I'm getting some wild jumps in either direction.
The program is multi-threaded, however, even testing in isolation produces the same wrong results.
I've tried several variations on the code, thinking that the problem was in how the expression was being evaluated, but they all end up the same. My code is as follows:
int main(){
srand(1); //Just for testing and predictability of outcome
//short int temp = 20 + rand() / (RAND_MAX / 30 - 20 + 1) + 1; Initially I was initialising it at a random value between 20-30, but chose 20 for testing purposes
short int temp = 20;
short int new_temp, last_temp, new_min, new_max;
last_temp = temp;
for(int i = 0; i < 20; i++){
//last_temp = temp; At first I believed it was because last_temp wasn't being reassigned, however, this doesn't impact the end result
new_min = last_temp - 4;
new_max = last_temp + 4;
//new_temp = (last_temp-4) + rand() / (RAND_MAX / (last_temp + 4) - (last_temp - 4) + 1) + 1; I also thought this broke because last_temp was being changed by the prior math in the equations. Still no impact
new_temp = new_min + rand() / (RAND_MAX / new_max - new_min + 1) + 1;
printf("Temperature is %d\n", new_temp);
}
return 0;
}
Produces results like this.
Temperature is 37
Temperature is 26
Temperature is 35
Temperature is 36
Temperature is 38
As you can see, the first temperature reading should be within the range of 16-24, however it increases by 17 to 37, and I can't figure out why. Any insight would be appreciated. In the alternative, can anyone provide me with a clean way to simulate a random +/- without having to use a lot of embedded if statements?
There are 2 issues in this code:
rand() usage
last_temp value is not updating in each iteration
rand usage
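rand() returns a value between 0 and RAND_MAX. You want to limit this value to [0,8] and add it to new_min, so that new_temp is limited to [last_temp-4,last_temp+4], i.e. [new_min,new_min+8]. The original expression does not do that: because division binds tighter than subtraction, RAND_MAX / new_max - new_min + 1 is evaluated as (RAND_MAX / new_max) - new_min + 1, so with last_temp = 20 the quotient rand() / (...) can reach about 24 instead of 8.
To do that, use the % operator. By doing rand() % 9, you limit your random value to between 0 and 8. So, the new_temp value should be: new_temp = new_min + rand() % 9.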
last_temp update
You need to update the last_temp value after you assign your new_temp value like this:
new_temp = new_min + rand() % 9;
last_temp = new_temp;
So, your for loop should look like this in the end:
for(int i = 0; i < 20; i++){
new_min = last_temp - 4;
new_max = last_temp + 4;
new_temp = new_min + rand() % 9;
last_temp = new_temp;
printf("Temperature is %d\n", new_temp);
}
And the code can be minimized to this:
#include <stdio.h>
#include <stdlib.h>
int main() {
srand(1); //Just for testing and predictability of outcome
short int temp = 20; //or 20 + rand()%11 for values in [20,30] range
for(int i = 0; i < 20; i++) {
temp += -4 + rand() % 9;
printf("Temperature is %hd\n", temp);
}
return 0;
}
with an outcome of:
Temperature is 23
Temperature is 25
Temperature is 22
Temperature is 21
Temperature is 18
Temperature is 21
Temperature is 19
Temperature is 19
Temperature is 16
Temperature is 17
Temperature is 15
Temperature is 14
Temperature is 12
Temperature is 11
Temperature is 10
Temperature is 12
Temperature is 12
Temperature is 10
Temperature is 10
Temperature is 6
I'm learning about the rand() function in C, as I want to use it to generate a random number in a range. However, I have a question about a part of the algorithm below.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main()
{
const int MAX = 20, MIN = 1;
srand(time(NULL));
int randNumber = rand() % (MAX - MIN + 1) + MIN;
printf("%d", randNumber);
// ask the user to enter a number
int duDoan;
printf("Please guess the number:");
scanf("%d", &duDoan);
// loop until the guess is correct
while(duDoan != randNumber) {
printf("Wrong. Please try again:");
scanf("%d", &duDoan);
}
printf("Correct. The answer is: %d ", randNumber);
return 0;
}
What confuses me here is why we have to add + MIN in this line:
rand() % (MAX - MIN + 1) + MIN;
If I leave it out, what will the result be?
rand() is a number between 0 and RAND_MAX.
rand() % n is a number between 0 and n - 1. If you want a value from 0 to n, then you need rand() % (n+1).
In your example (MAX - MIN + 1) is the span of integer values to generate, while MIN is the lower value. So for example where:
MIN = -10
MAX = 10
the span n :
n = (MAX - MIN + 1) = 21
so that:
rand() % n
yields values from 0 to 20, and
rand() % n + MIN
is -10 to +10. Without the +1, it would incorrectly be -10 to +9.
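As a small sketch of the mapping described above, the following splits the expression into its two steps: reduce to 0..span-1 with %, then shift by the lower bound. The helper names map_to_range and rand_in_range are made up for this example and are not part of the question's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Maps a raw non-negative value r (for example the result of rand())
   into [min, max]. Hypothetical helper, named for this example only. */
int map_to_range(int r, int min, int max)
{
    assert(r >= 0 && min <= max);
    int span = max - min + 1;  /* number of distinct values, e.g. 21 for -10..10 */
    return r % span + min;     /* r % span is 0..span-1; adding min shifts it */
}

/* Draws one value in [min, max] using rand(). */
int rand_in_range(int min, int max)
{
    return map_to_range(rand(), min, max);
}
```

With min = -10 and max = 10, a raw value of 0 maps to -10 and a raw value of 20 maps to 10; dropping the shift would leave the result in 0..20.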
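Note that where a statistically high-quality random number is required, restricting the span with % is flawed: it introduces a bias whenever n is not a factor of RAND_MAX + 1. In that case scaling, as in (int)(n * ((double)rand() / (RAND_MAX + 1.0))), is a better solution, because it spreads the unevenness across the range instead of concentrating it on the lowest values, so you would have:
int randNumber = (int)((MAX - MIN + 1) * ((double)rand() /
(RAND_MAX + 1.0))) + MIN ;
Note that the divisor is RAND_MAX + 1.0 rather than RAND_MAX: the quotient then lies in [0, 1), so multiplying by n and truncating gives 0 to n - 1. Dividing by RAND_MAX and dropping the +1 from the span would produce MAX only when rand() returns exactly RAND_MAX, which almost never happens.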
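If the bias has to be removed entirely rather than reduced, one common approach is rejection sampling: discard raw values that fall in the incomplete final block and draw again. The sketch below assumes rand() itself is uniform over 0..RAND_MAX; the function name uniform_in_range is invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a value in [min, max] with every value equally likely,
   assuming rand() is uniform over 0..RAND_MAX.
   Hypothetical helper, not taken from the question or answer. */
int uniform_in_range(int min, int max)
{
    assert(min <= max);
    /* unsigned arithmetic keeps max - min + 1 well defined for negative min */
    unsigned long span = (unsigned long)max - (unsigned long)min + 1UL;
    unsigned long total = (unsigned long)RAND_MAX + 1UL; /* number of possible rand() results */
    assert(span <= total);
    unsigned long limit = total - total % span;          /* largest multiple of span that fits */
    unsigned long r;
    do {
        r = (unsigned long)rand();
    } while (r >= limit);                                /* discard the incomplete final block */
    return (int)(r % span) + min;
}
```

The loop rejects fewer than half of the draws in the worst case, so the expected number of rand() calls per result stays below two.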
I'm looking for a solution to find a sum of numbers. The input is an integer n, and the problem is to find the value of sum(1) + sum(1+2) + sum(1+2+3) + ... + sum(1+2+..+n). I need a highly optimised solution using dynamic programming or a mathematical formula.
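#include <stdio.h>
// assumed helper: the question does not show findSumN; this returns 1+2+..+k
int findSumN( int k )
{
return k * ( k + 1 ) / 2;
}
int main()
{
int sum = 0;
int i = 0, n = 6;
for( i = 1; i <= n; i++ ) // <= so that sum(1+2+..+n) itself is included
sum = sum + findSumN( i );
printf( "%d",sum );
}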
You can often find a formula for series like this by calculating the first few terms and using the results to search the On-Line Encyclopedia of Integer Sequences.
1 = 1
1 + (1+2) = 4
4 + (1+2+3) = 10
10 + (1+2+3+4) = 20
20 + (1+2+3+4+5) = 35
35 + (1+2+3+4+5+6) = 56
The sequence you're trying to calculate (1, 4, 10, 20, 35, 56, ...) is A000292, which has the following formula:
a(n) = n × (n + 1) × (n + 2) / 6
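A minimal sketch of that formula in C, next to a straightforward loop it can be checked against. The function names are chosen for this example, and the bound in the assert is a conservative guard against 64-bit overflow rather than a limit from the sequence itself.

```c
#include <assert.h>
#include <stdint.h>

/* Closed form a(n) = n * (n + 1) * (n + 2) / 6. */
uint64_t tetrahedral(uint64_t n)
{
    assert(n <= 2000000);  /* keeps n*(n+1)*(n+2) below 2^64 */
    return n * (n + 1) * (n + 2) / 6;
}

/* Reference version: adds the partial sums 1, 1+2, 1+2+3, ... one by one. */
uint64_t tetrahedral_slow(uint64_t n)
{
    uint64_t total = 0, triangular = 0;
    for (uint64_t k = 1; k <= n; k++) {
        triangular += k;      /* 1 + 2 + ... + k */
        total += triangular;  /* running sum of those partial sums */
    }
    return total;
}
```

The product of three consecutive integers is always divisible by 6, so the integer division is exact.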
If you play with the numbers you can find some patterns. Start with
sum(1 + 2 + 3 ... + N) = ((1 + N) * N) /2
Then there is a relationship between the max number and the value above: the ratio of the final answer to that value starts at 1 and grows by 1/3 every time the max number increases by 1. So we get the factor:
(1 + ((1.0 / 3.0) * (max - 1)))
I am not good enough at math to explain why this pattern occurs. Perhaps someone can explain it in a math way.
The following is my solution, no iteration needed.
#include <stdio.h>
int main()
{
int min = 1;
int max = 11254;
double sum = ((min + max) * max / 2) * (1 + ((1.0 / 3.0) * (max - 1)));
printf("%.f", sum);
}
Look at the closed form of sum(n)=1+2+…+n and look up the Pascal's triangle identities. This gives immediately a very fast computation method.
As
binom(k,2) + binom(k,3) = binom(k+1,3)
binom(k,2) = binom(k+1,3) - binom(k,3)
the summation of binom(k+1,2) from k=M to N results in the sum value
binom(N+2,3)-binom(M+1,3)=(N+2)*(N+1)*N/6-(M+1)*M*(M-1)/6
= (N+1-M) * ((N+1)²+(N+1)M+M²-1)/6
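The telescoped result above can be sketched in C as follows. The names binom3 and triangular_range_sum are made up for this example, and the upper bound in the assert is only a conservative overflow guard.

```c
#include <assert.h>
#include <stdint.h>

/* binom(k, 3) = k * (k - 1) * (k - 2) / 6, which is 0 for k < 3. */
static uint64_t binom3(uint64_t k)
{
    return k < 3 ? 0 : k * (k - 1) * (k - 2) / 6;
}

/* Sum of binom(k+1, 2) for k = m..n, i.e. of the triangular numbers
   T(m) + ... + T(n), computed as binom(n+2, 3) - binom(m+1, 3). */
uint64_t triangular_range_sum(uint64_t m, uint64_t n)
{
    assert(1 <= m && m <= n && n <= 2000000);
    return binom3(n + 2) - binom3(m + 1);
}
```

For M = 3 and N = 5 this gives 35 - 4 = 31, which matches 6 + 10 + 15.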
I need to distribute a large integer budget randomly among a small array with n elements, so that all elements in the array will have the same distribution and sum up to budget and each element in the array gets at least min.
I have an algorithm that runs in O(budget):
private int[] distribute(int budget, int n, int min) {
int[] subBudgets = new int[n];
for (int i = 0; i < n; i++) {
subBudgets[i] = min;
}
budget -= n * min;
while (budget > 0) {
subBudgets[random.nextInt(n)]++;
budget--;
}
return subBudgets;
}
However, when budget increases, it can be very expensive. Is there any algorithm that runs in O(n) or even better?
First generate n random numbers x[i], sum them, and divide budget by that sum to get k. Then assign k*x[i] to each array element. It is simple and O(n).
If you want at least min in each element, you can modify the above algorithm by filling all elements with min (or using k*x[i] + min) and subtracting n*min from budget before starting.
If you need to work with integers, you can use a real-valued k and round k*x[i]. Then you have to track the accumulated rounding error and add it to (or subtract it from) the calculated value whenever it reaches a whole unit. You also have to assign the remaining value to the last element so that the total equals the whole budget.
P.S.: Note that this algorithm can be used with ease in pure functional languages. That is why I like this whole family of algorithms, which generate a random number for each member and then do some processing afterwards. Example implementation in Erlang:
-module(budget).
-export([distribute/2, distribute/3]).
distribute(Budget, N) ->
distribute(Budget, N, 0).
distribute(Budget, N, Min) when
is_integer(Budget), is_integer(N), N > 0,
is_integer(Min), Min >= 0, Budget >= N*Min ->
Xs = [random:uniform() || _ <- lists:seq(1,N) ],
Rest = Budget - N*Min,
K = Rest / lists:sum(Xs),
F = fun(X, {Bgt, Err, Acc}) ->
Y = X*K + Err,
Z = round(Y),
{Bgt - Z, Y - Z, [Z + Min | Acc]}
end,
{Bgt, _, T} = lists:foldl(F, {Rest, 0.0, []}, tl(Xs)),
[Bgt + Min | T].
The same algorithm in Java:
private int[] distribute(int budget, int n, int min) {
int[] subBudgets = new int[n];
double[] rands = new double[n];
double k, err = 0, sum = 0;
budget -= n * min;
for (int i = 0; i < n; i++) {
rands[i] = random.nextDouble();
sum += rands[i];
}
k = (double)budget/sum;
for (int i = 1; i < n; i++) {
double y = k*rands[i] + err;
int z = (int) Math.floor(y + 0.5);
subBudgets[i] = min + z;
budget -= z;
err = y - z;
}
subBudgets[0] = min + budget;
return subBudgets;
}
Sampling from the Multinomial Distribution
The way that you are currently distributing the dollars left over after min has been given to each subbudget involves performing a fixed number budget of random "trials", where on each trial you randomly select one of n categories, and you want to know how many times each category is selected. This is modeled by a multinomial distribution with the following parameters:
Number of trials (called n on the WP page): budget
Number of categories (called k on the WP page): n
Probability of category i in each trial, for 1 <= i <= n: 1/n
The way you are currently doing it is a good way if the number of trials is around the same size as the number of categories, or less. But if the budget is large, there are other, more efficient ways of sampling from this distribution. The easiest way I know of is to notice that a multinomial distribution with k categories can be repeatedly decomposed into binomial distributions by grouping categories together: instead of directly asking how many selections there are for each of the k categories, we express this as a sequence of questions: "How to split the budget between the first category and the other k-1?" We next ask "How to split the remainder between the second category and the other k-2?", etc.
So the top level binomial has category (subbudget) 1 vs. everything else. Decide the number of dollars that go to subbudget 1 by taking 1 sample from a binomial distribution with budget trials and success probability p = 1/n, where n is the number of subbudgets (how to do this is described here); this will produce some number 0 <= x[1] <= budget. To find the number of dollars that go to subbudget 2, take 1 sample from a binomial distribution on the remaining money, i.e. with budget - x[1] trials and p = 1/(n-1). After getting subbudget 2's amount x[2], subbudget 3's will be found using budget - x[1] - x[2] trials and p = 1/(n-2), and so on.
Integrating @Hynek -Pichi- Vychodil's idea and my original algorithm, I came up with the following algorithm, which runs in O(n) and distributes all rounding errors uniformly over the array:
private int[] distribute(int budget, int n, int min) {
int[] subBudgets = new int[n];
for (int i = 0; i < n; i++) {
subBudgets[i] = min;
}
budget -= n * min;
if (budget > 3 * n) {
double[] rands = new double[n];
double sum = 0;
for (int i = 0; i < n; i++) {
rands[i] = random.nextDouble();
sum += rands[i];
}
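int total = budget; // fixed amount to scale by, so every element uses the same factor
for (int i = 0; i < n; i++) {
int additionalBudget = (int) (total / sum * rands[i]); // truncate once, so the same integer is added and subtracted
subBudgets[i] += additionalBudget;
budget -= additionalBudget;
}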
}
while (budget > 0) {
subBudgets[random.nextInt(n)]++;
budget--;
}
return subBudgets;
}
Let me demonstrate my algorithm using an example:
Assume budget = 100, n = 5, min = 10
Initialize the array to:
[10, 10, 10, 10, 10] => current sum = 50
Generate a random integer ranging from 0 to 50 (50 is the result of budget - current sum):
Say the random integer is 20 and update the array:
[30, 10, 10, 10, 10] => current sum = 70
Generate a random integer ranging from 0 to 30 (30 is the result of budget - current sum):
Say the random integer is 5 and update the array:
[30, 15, 10, 10, 10] => current sum = 75
Repeat the process above and the last element is whatever is left.
Finally, shuffle the array to get the final result.
What I am trying to do is to generate some random numbers (not necessarily single digit) like
29106
7438
5646
4487
9374
28671
92
13941
25226
10076
and then count the number of digits I get:
count[0] = 3 Percentage = 6.82
count[1] = 5 Percentage = 11.36
count[2] = 6 Percentage = 13.64
count[3] = 3 Percentage = 6.82
count[4] = 6 Percentage = 13.64
count[5] = 2 Percentage = 4.55
count[6] = 7 Percentage = 15.91
count[7] = 5 Percentage = 11.36
count[8] = 3 Percentage = 6.82
count[9] = 4 Percentage = 9.09
This is the code I am using:
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
int main() {
int i;
srand(time(NULL));
FILE* fp = fopen("random.txt", "w");
// for(i = 0; i < 10; i++)
for(i = 0; i < 1000000; i++)
fprintf(fp, "%d\n", rand());
fclose(fp);
int dummy;
long count[10] = {0,0,0,0,0,0,0,0,0,0};
fp = fopen("random.txt", "r");
while(fscanf(fp, "%1d", &dummy) == 1) { // stop when no digit is read, so the last digit is not counted twice
count[dummy]++;
}
fclose(fp);
long sum = 0;
for(i = 0; i < 10; i++)
sum += count[i];
for(i = 0; i < 10; i++)
printf("count[%d] = %7ld Percentage = %5.2f\n",
i, count[i], ((float)(100 * count[i])/sum));
}
If I generate a large number of random numbers (1000000), this is the result I get:
count[0] = 387432 Percentage = 8.31
count[1] = 728339 Percentage = 15.63
count[2] = 720880 Percentage = 15.47
count[3] = 475982 Percentage = 10.21
count[4] = 392678 Percentage = 8.43
count[5] = 392683 Percentage = 8.43
count[6] = 392456 Percentage = 8.42
count[7] = 391599 Percentage = 8.40
count[8] = 388795 Percentage = 8.34
count[9] = 389501 Percentage = 8.36
Notice that 1, 2 and 3 have too many hits. I have tried running this several times and each time I get very similar results.
I am trying to understand what could cause 1, 2 and 3 to appear much more frequently than any other digit.
Taking hint from what Matt Joiner and Pascal Cuoq pointed out,
I changed the code to use
for(i = 0; i < 1000000; i++)
fprintf(fp, "%04d\n", rand() % 10000);
// pretty prints 0
// generates numbers in range 0000 to 9999
and this is what I get (similar results on multiple runs):
count[0] = 422947 Percentage = 10.57
count[1] = 423222 Percentage = 10.58
count[2] = 414699 Percentage = 10.37
count[3] = 391604 Percentage = 9.79
count[4] = 392640 Percentage = 9.82
count[5] = 392928 Percentage = 9.82
count[6] = 392737 Percentage = 9.82
count[7] = 392634 Percentage = 9.82
count[8] = 388238 Percentage = 9.71
count[9] = 388352 Percentage = 9.71
What can be the reason that 0, 1 and 2 are favored?
Thanks everyone. Using
int rand2(){
int num = rand();
return (num >= 30000? rand2():num); // keep 0..29999 only, so 30000 does not add an extra 0000
}
fprintf(fp, "%04d\n", rand2() % 10000);
I get
count[0] = 399629 Percentage = 9.99
count[1] = 399897 Percentage = 10.00
count[2] = 400162 Percentage = 10.00
count[3] = 400412 Percentage = 10.01
count[4] = 399863 Percentage = 10.00
count[5] = 400756 Percentage = 10.02
count[6] = 399980 Percentage = 10.00
count[7] = 400055 Percentage = 10.00
count[8] = 399143 Percentage = 9.98
count[9] = 400104 Percentage = 10.00
rand() generates a value from 0 to RAND_MAX. RAND_MAX is implementation-defined; the C standard only requires it to be at least 32767, and typical values are 32767 or 2147483647.
For your example given above, it appears that RAND_MAX is 32767. The values from 10000 to 32767 then put 1, 2 and 3 in the most significant position unusually often. You can observe that, to a lesser degree, digits up to 6 and 7 are also slightly favored.
Regarding the edited question,
This is because the digits are still not uniformly distributed even if you apply % 10000. Assume RAND_MAX == 32767, and rand() is perfectly uniform.
For every 10,000 numbers counting from 0, all of the digits will appear uniformly (4,000 each). However, the 32,768 possible values (0 to 32,767) do not split evenly into blocks of 10,000: the 2,768 values from 30,000 to 32,767 are left over. After % 10000 these become 0000 to 2767, which adds more leading 0s, 1s and 2s to the final count.
The exact contribution from these 2,768 numbers is:
digits count
0 1857
1 1857
2 1625
3 857
4 857
5 857
6 855
7 815
8 746
9 746
Adding 12,000 per digit for the initial 30,000 numbers (30,000 × 4 digits / 10) to each count, then dividing by the total number of digits (4×32,768 = 131,072), gives the expected distribution:
number probability (%)
0 10.5721
1 10.5721
2 10.3951
3 9.80911
4 9.80911
5 9.80911
6 9.80759
7 9.77707
8 9.72443
9 9.72443
which is close to what you get.
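These expected counts can be reproduced by brute force. The sketch below walks every possible raw value, reduces it with % 10000 and tallies its four digits; the function name and the rand_max parameter are assumptions of this example (32767 reproduces the case discussed here).

```c
#include <assert.h>

/* counts[d] receives how often digit d appears when every value
   0..rand_max is reduced with % 10000 and written as four digits
   (leading zeros included, like "%04d"). Hypothetical helper. */
void count_digits_mod_10000(long rand_max, long counts[10])
{
    assert(rand_max >= 0);
    for (int d = 0; d < 10; d++) {
        counts[d] = 0;
    }
    for (long v = 0; v <= rand_max; v++) {
        long x = v % 10000;
        for (int pos = 0; pos < 4; pos++) {  /* always four digit positions */
            counts[x % 10]++;
            x /= 10;
        }
    }
}
```

For rand_max = 32767 the tally for digit 0 is 12,000 + 1,857 = 13,857 out of 131,072 digits, which is the 10.5721% shown in the table.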
If you want a truly uniform digit distribution, you need to reject those 2,768 numbers:
int rand_4digits() {
const int RAND_MAX_4_DIGITS = RAND_MAX - RAND_MAX % 10000;
int res;
do {
res = rand();
} while (res >= RAND_MAX_4_DIGITS);
return res % 10000;
}
Looks like Benford's Law - see http://en.wikipedia.org/wiki/Benford%27s_law, or alternatively a not very good RNG.
That's because you generate numbers between 0 and RAND_MAX. The generated numbers are evenly distributed (i.e. approx. same probability for each number), however, the digits 1, 2 and 3 occur more often than others in this range. Try generating numbers between 0 and 9 instead: each digit then occurs with the same probability and you'll get a nice distribution.
If I understand what the OP (person asking the question) wants, they want to make better random numbers.
rand() and random(), quite frankly, don't make very good random numbers; they both do poorly when tested against diehard and dieharder (two packages for testing the quality of random numbers).
The Mersenne twister is a popular random number generator which is good for pretty much everything except crypto-strong random numbers; it passes all of the diehard(er) tests with flying colors.
If one needs crypto-strong random numbers (numbers that can not be guessed, even if someone knows which particular crypto-strong algorithm is being used), there are a number of stream ciphers out there. The one I like to use is called RadioGatún[32], and here’s a compact C representation of it:
/*Placed in the public domain by Sam Trenholme*/
#include <stdint.h>
#include <stdio.h>
#define p uint32_t
#define f(a) for(c=0;c<a;c++)
#define n f(3){b[c*13]^=s[c];a[16+c]^=s[c];}k(a,b
k(p *a,p *b){p A[19],x,y,r,q[3],c,i;f(3){q[c]=b[c
*13+12];}for(i=12;i;i--){f(3){b[c*13+i]=b[c*13+i-
1];}}f(3){b[c*13]=q[c];}f(12){i=c+1+((c%3)*13);b[
i]^=a[c+1];}f(19){y=(c*7)%19;r=((c*c+c)/2)%32;x=a
[y]^(a[(y+1)%19]|(~a[(y+2)%19]));A[c]=(x>>r)|(x<<
(32-r));}f(19){a[c]=A[c]^A[(c+1)%19]^A[(c+4)%19];
}a[0]^=1;f(3){a[c+13]^=q[c];}}l(p *a,p *b,char *v
){p s[3],q,c,r,x,d=0;for(;;){f(3){s[c]=0;}for(r=0
;r<3;r++){for(q=0;q<4;q++){if(!(x=*v&255)){d=x=1;
}v++;s[r]|=x<<(q*8);if(d){n);return;}}}n);}}main(
int j,char **h){p a[39],b[39],c,e,g;if(j==2){f(39
){a[c]=b[c]=0;}l(a,b,h[1]);f(16){k(a,b);}f(4){k(a
,b);for(j=1;j<3;++j){g=a[j];for(e=4;e;e--){printf
("%02x",g&255);g>>=8;}}}printf("\n");}}
There are also a lot of other really good random number generators out there.
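When you want to generate a random value from the range [0, x), instead of doing rand()%x, you should apply the formula x*((double)rand()/(RAND_MAX + 1.0)), which spreads the values evenly over the range instead of favoring the low ones. (Dividing by RAND_MAX alone would return x itself whenever rand() returns RAND_MAX.)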
Say, RAND_MAX is equal to 15, so rand will give you integers from 0 to 15. When you use modulo operator to get random numbers from [0, 10), values [0,5] will have higher frequency than [6,9], because 3 == 3%10 == 13%10.
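The RAND_MAX == 15 example can be checked by enumerating all sixteen raw values. In the sketch below, raw_max stands in for RAND_MAX and both function names are invented for this example; the scaled variant divides by raw_max + 1 so that the result stays below x, which is a choice made for this sketch.

```c
#include <assert.h>

/* Tallies how the raw values 0..raw_max fall into buckets 0..x-1
   when reduced with the modulo operator. */
void bucket_counts_modulo(int raw_max, int x, int counts[])
{
    assert(raw_max >= 0 && x > 0);
    for (int b = 0; b < x; b++) {
        counts[b] = 0;
    }
    for (int r = 0; r <= raw_max; r++) {
        counts[r % x]++;
    }
}

/* Same tally, but with the raw value scaled into [0, x) and truncated. */
void bucket_counts_scaled(int raw_max, int x, int counts[])
{
    assert(raw_max >= 0 && x > 0);
    for (int b = 0; b < x; b++) {
        counts[b] = 0;
    }
    for (int r = 0; r <= raw_max; r++) {
        counts[(int)(x * ((double)r / (raw_max + 1.0)))]++;
    }
}
```

With raw_max = 15 and x = 10, the modulo tally is 2 for buckets 0 to 5 and 1 for buckets 6 to 9. The scaled tally still contains six buckets with 2 and four with 1, since 16 values cannot be split evenly over 10 buckets, but the doubled buckets are spread across the range rather than clustered at the low end.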