If I do rand()/RAND_MAX, will it give me a random probability value?
If I do, will 50% of the values (on average) be greater than 0.5?
Never use rand() for any purpose, ever.
random() is likely suitable for your needs. (#include <stdlib.h>) It generates a uniform distribution in the range 0..2^31 - 1. random() / (double)((1L << 31) - 1) should get you close to a uniform distribution between 0.0 and 1.0.
You can use srandomdev() to seed it in order to get a different sequence every time.
Here is a histogram of one billion values returned by random() in 256 bins over the range 0..2^31 - 1:
If you look closely, you can see the expected tiny variations from uniform along the top of the histogram.
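A minimal sketch of that scaling, assuming a POSIX system (the helper name uniform01 is mine). Seed once with srandom(time(NULL)), or srandomdev() on BSD/macOS, before calling:

```c
#include <stdlib.h>

/* Uniform double in [0.0, 1.0], assuming POSIX random().
   Seed once beforehand with srandom() or srandomdev(). */
double uniform01(void)
{
    /* random() returns a long in [0, 2^31 - 1] */
    return random() / (double)((1L << 31) - 1);
}
```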
Yes the rand() / (double)RAND_MAX will give you a random value.
No. The C standard says nothing about rand() following a uniform (or any other) distribution, so there is no guarantee that 50% of the values will, on average, exceed 0.5.
Getting a uniform distribution in C is a different question. You may be interested in the SO topic Generating a uniform distribution of INTEGERS in C.
I have read so many posts on this topic:
How does rand() work? Does it have certain tendencies? Is there something better to use?
How does the random number generator work in C
and this is what I got:
1) x(n+1) depends on x(n), i.e., on the previously generated random number.
2) It is not recommended to initialize the seed more than once in the program.
3) It is a bad practice to use rand()%2 to generate either 0 or 1 randomly.
My questions are:
1) Are there any other libraries I may have missed that can generate a completely random number (either 0 or 1) without depending on the previous output?
2) Is there any workaround using the built-in rand() function to satisfy the requirement?
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier, based on other posts, I suppose this is bad practice?
So, can anyone please answer the above questions? I apologize if I completely missed an obvious thing.
Are there any other libraries I may have missed that can generate a completely random number between 0 and 1 without depending on the previous output?
Not in the standard C library. There are lots of other libraries which generate "better" pseudo-random numbers.
Is there any workaround using the built-in rand() function to satisfy the requirement?
Most standard library implementations of rand produce sequences of random numbers where the low-order bit(s) have a short sequence and/or are not as independent of each other as one would like. The high-order bits are generally better distributed. So a better way of using the standard library rand function to generate a random single bit (0 or 1) is:
(rand() > RAND_MAX / 2)
or use an interior bit:
((rand() & 0x400U) != 0)
Those will produce reasonably uncorrelated sequences with most standard library rand implementations, and impose no more computational overhead than checking the low-order bit. If that's not good enough for you, you'll probably want to research other pseudo-random number generators.
All of these (including rand() % 2) assume that RAND_MAX is odd, which is almost always the case. (If RAND_MAX were even, there would be an odd number of possible values and any way of dividing an odd number of possible values into two camps must be slightly biased.)
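The two recommended forms can be wrapped as helper functions; this is only a sketch and the names are mine. Note the parentheses around the mask test in the second one: != binds tighter than & in C, so without them the expression degenerates to testing the low-order bit.

```c
#include <stdlib.h>

/* One random bit taken from the high end of rand()'s range. */
int bit_high(void)
{
    return rand() > RAND_MAX / 2;
}

/* One random bit taken from an interior bit of rand()'s output.
   The parentheses matter: != binds tighter than & in C. */
int bit_interior(void)
{
    return (rand() & 0x400U) != 0;
}
```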
What is the side effect of initializing the seed more than once in a program?
You should think of the random number generator as producing "not very random" numbers after being seeded, with the quality improving as you successively generate new random numbers. And remember that if you seed the random number generator using some seed, you will get exactly the same sequence as you will the next time you seed the generator with the same seed. (Since time() returns a number of seconds, two successive calls in quick succession will usually produce exactly the same number, or very occasionally two consecutive numbers. But definitely not two random uncorrelated numbers.)
So the side effect of reseeding is that you get less random numbers, and possibly exactly the same ones as you got the last time you reseeded.
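A small sketch of why this matters (first_draw is a hypothetical helper): the first value out of the generator is a pure function of the seed, so two reseedings with the same time() value replay the same "random" number.

```c
#include <stdlib.h>

/* First value rand() produces after seeding with the given seed.
   Equal seeds give equal first draws, which is exactly what happens
   when srand(time(NULL)) runs twice within the same second. */
int first_draw(unsigned seed)
{
    srand(seed);
    return rand();
}
```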
1) Are there any other libraries I may have missed that can generate a completely random number between 0 and 1 without depending on the previous output?
This sub-question is off-topic for Stack Overflow, but I'll point out that POSIX and BSD systems have an alternative random number generator function named random() that you could consider if you are programming for such a platform (e.g. Linux, OS X).
2) Is there any workaround using the built-in rand() function to satisfy the requirement?
Traditional computers (as opposed to quantum computers) are deterministic machines: they cannot produce true randomness. Every purely programmatic "random number generator" is in practice a pseudo-random number generator. They generate completely deterministic sequences, but the values from a given set of calls are distributed across the generator's range in a manner approximately consistent with a target probability distribution (ordinarily the uniform distribution).
Some operating systems provide support for generating numbers that depend on something more chaotic and less predictable than a computed sequence. For instance, they may collect information from mouse movements, CPU temperature variations, or other such sources, to produce numbers that are more genuinely random. Linux, for example, has such a driver that is often exposed as the special file /dev/random. The problem with these is that they have a limited store of entropy, and therefore cannot provide numbers at a sustained high rate. If you need only a few random numbers, however, then that might be a suitable source.
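For illustration, a sketch of pulling a byte from the kernel's entropy pool on Linux; /dev/urandom is the non-blocking variant of /dev/random, and urandom_byte is my name for the helper:

```c
#include <stdio.h>

/* One byte from the kernel's entropy pool on Linux, or -1 on failure.
   /dev/urandom, unlike /dev/random, never blocks waiting for entropy. */
int urandom_byte(void)
{
    FILE *f = fopen("/dev/urandom", "rb");
    unsigned char b;

    if (f == NULL)
        return -1;
    if (fread(&b, 1, 1, f) != 1) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return b;
}
```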
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier, based on other posts, I suppose this is bad practice?
It is indeed bad if you want d1 and d2 to have a 50% probability of being different. time() returns the number of seconds since the epoch, so it is highly likely that it will return the same value when called twice so close together. The sequence of pseudorandom numbers is completely determined by the seed (this is a feature, not a bug), and when you seed the PRNG, you restart the sequence. Even if you used a higher-resolution clock to make the seeds more likely to differ, you don't escape correlation this way; you just change the function generating numbers for you. And the result does not have the same guarantees for output distribution.
Additionally, when you do rand() % 2 you use only one bit of the approximately log2(RAND_MAX) + 1 bits that it produced for you. Over the whole period of the PRNG, you can expect that bit to take each value the same number of times, but over narrow ranges you may sometimes see some correlation.
In the end, your requirement for your two random numbers to be completely independent of one another is probably way overkill. It is generally sufficient for the pseudo-random result of one call to have no apparent correlation with the results of previous calls. You probably achieve that well enough with your first code snippet, even despite the use of only one bit per call. If you prefer to use more of the bits, though, then with some care you could base the numbers you choose on the parity of the count of how many bits are set in the values returned by rand().
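That parity idea can be sketched as follows (parity_bit is a hypothetical name; this assumes rand() has already been seeded):

```c
#include <stdlib.h>

/* One random bit formed from the parity of all the set bits in a rand()
   value, rather than from the (often weak) low-order bit alone. */
int parity_bit(void)
{
    unsigned v = (unsigned)rand();
    int parity = 0;

    while (v != 0) {
        parity ^= (int)(v & 1u);  /* flip once per set bit */
        v >>= 1;
    }
    return parity;
}
```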
Use this
(double)rand() / (double)RAND_MAX
completely random number ..... without depending on previous output?
Well, in reality computers can't generate completely random numbers; there have to be some dependencies. But for almost all practical purposes, you can use rand().
side effect of initializing the seed more than once
No side effect as such, but reseeding every time largely defeats the point of using rand(): if you re-initialize the seed before every call, the "random" number depends more on the time (and the processor) than on the generator.
any other work around using the inbuilt rand() function
You can write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand(time(NULL));
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    return 0;
}
If you want to generate either a 0 or a 1, I think using rand()%2 is perfectly fine, as the probability of an even number is the same as the probability of an odd number (the probability of every value is equal for an unbiased random number generator).
I am trying to generate a random probability capped at a given maximum. How can I generate a random real value between 0 and 0.5 in C?
You cannot generate a really random number in portable C99. But you could use some PRNG (and perhaps seed it with the current time).
And computers don't know about true real numbers. Only floating point. See http://floating-point-gui.de/
Actually I believe that the universe does not know about real numbers (think about some cardinality argument similar to Cantor's diagonal argument). But ask physicists or philosophers.
Some operating systems or implementations have PRNGs, and some systems even have genuine (hardware) random generators.
Read about random(3) & drand48(3) if your system has them. If on Linux, also read random(4).
You might try
double my_random_number /* between 0 & 0.5 */ = drand48() * 0.5;
to generate an almost uniformly distributed random number >= 0 and < 0.5
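As a sketch, wrapped in a helper (half_uniform is my name; this assumes a POSIX system with drand48(), seeded once with srand48() beforehand):

```c
#include <stdlib.h>

/* Uniform double in [0.0, 0.5), assuming POSIX drand48().
   Seed once beforehand with srand48(time(NULL)). */
double half_uniform(void)
{
    return drand48() * 0.5;  /* drand48() is uniform over [0.0, 1.0) */
}
```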
See also the C++11 standard header <random> if you are willing to code in C++ ...
Like the title says, I'm using a random number generator on a Freescale Coldfire chip and it returns a 32 bit unsigned value. As far as I know, there is no way to configure the generator to limit the range. What would be the best way to manipulate the number to be in the accepted range?
I was thinking of modding the number by the high range value but I would still have to deal with the lower bound.
This C FAQ article, How can I get random integers in a certain range?, explains how to properly generate random numbers in the range [M,N]; basically, the formula you should use is:
M + (random number) / (RAND_MAX / (N - M + 1) + 1)
Stephan T. Lavavej explains why doing this is still not going to be that great:
From Going Native 2013 - rand() Considered Harmful
If you really care about even distribution, stick with a power of 2, or find some routines for dithering.
Yes, the traditional way is to MOD by the range, and this will be fine for many ordinary uses (simulating cards, dice, etc.) if the range is very small (a range of 52 compared to the 32-bit range of your generator is quite small). This will still be biased, but the bias will be nearly impossible to detect. If your range is bigger, the bias will be bigger. For example, if you want a random 9-digit number, the bias will be dreadful. If you want to eliminate bias altogether, there is no alternative but to do rejection sampling--that is, you must have a loop in which you generate random numbers and throw some away. If you do it right, you can keep those extra numbers needed to a minimum so you don't slow things down much.
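A sketch of such a rejection-sampling loop; gen32 stands in for the hardware generator (here faked from rand() so the code is self-contained), and the function names are mine:

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the hardware generator: 32 random bits built from rand().
   (Portable because the standard guarantees RAND_MAX >= 32767.) */
static uint32_t gen32(void)
{
    uint32_t hi  = (uint32_t)(rand() & 0x7FFF);  /* bits 17..31 */
    uint32_t mid = (uint32_t)(rand() & 0x7FFF);  /* bits 2..16  */
    uint32_t low = (uint32_t)(rand() & 0x3);     /* bits 0..1   */
    return (hi << 17) | (mid << 2) | low;
}

/* Unbiased integer in [lo, hi] by rejection sampling: draws falling in
   the top partial block of the 32-bit range are thrown away and redrawn.
   Assumes hi > lo (so hi - lo + 1 does not wrap). */
uint32_t uniform_range(uint32_t lo, uint32_t hi)
{
    uint32_t span  = hi - lo + 1;
    uint32_t limit = UINT32_MAX - (UINT32_MAX % span);
    uint32_t r;

    do {
        r = gen32();
    } while (r >= limit);  /* reject the biased tail */
    return lo + r % span;
}
```

The expected number of redraws is below one per call, so the loop costs almost nothing for small ranges.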
I have to write a C program to convert a uniform distribution of random numbers (say from 0 to 1) to a poisson distribution. Can anyone help?
Use GSL, the Gnu Scientific Library. There's a function called gsl_ran_poisson:
This function returns a random integer from the Poisson distribution with mean mu.
The probability distribution for Poisson variates is,
p(k) = (mu^k / k!) * exp(-mu)
for k >= 0.
Otherwise, look at the code and copy the ideas.
I am assuming you want to write a C program that can sample a random number from the Poisson Distribution, given a random number in U(0,1).
Generally, this is done by taking the inverse CDF of the number from U(0,1). For discrete distributions like Poisson, one first transforms it to a continuous distribution by assuming that the CDF function is smooth between the integer points, and then we apply appropriate approximations (floor function).
The book Numerical Recipes in C++ (3rd Ed) has the complete explanation and C++ code as well. sec 7.3.12, page 372.
For a simple simulation in C, I need to generate exponential random variables. I remember reading somewhere (but I can't find it now, and I don't remember why) that using the rand() function to generate random integers in a fixed range would generate non-uniformly distributed integers. Because of this, I'm wondering if this code might have a similar problem:
//generate u ~ U[0,1]
u = (double)rand() / (double)RAND_MAX;
//inverse of exponential CDF to get exponential random variable
expon = -log(1-u) * mean;
Thank you!
The problem with random numbers in a fixed range is that a lot of people do this for numbers between 100 and 200 for example:
100 + rand() % 100
That is not uniform. But by doing this it is (or is close enough to uniform at least):
u = 100 + 100 * ((double)rand() / (double)RAND_MAX);
Since that's what you're doing, you should be safe.
In theory, at least, rand() should give you a discrete uniform distribution from 0 to RAND_MAX... in practice, it has some undesirable properties, such as a small period, so whether it's useful depends on how you're using it.
RAND_MAX is often only 32767, while the LCG underlying rand() internally generates pseudorandom 32-bit numbers. Thus, the lack of uniformity, as well as the low periodicity, will generally go unnoticed.
If you require high quality pseudorandom numbers, you could try George Marsaglia's CMWC4096 (Complementary Multiply With Carry). This is probably the best pseudorandom number generator around, with extreme periodicity and uniform distribution (you just have to pick good seeds for it). Plus, it's blazing fast (not as fast as an LCG, but approximately twice as fast as a Mersenne Twister).
Yes and no. The problem you're thinking of arises when you're clamping the output from rand() into a range that's smaller than RAND_MAX (i.e. there are fewer possible outputs than inputs).
In your case, you're (normally) reversing that: you're taking a fairly small number of bits produced by the random number generator, and spreading them among what will usually be a larger number of bits in the mantissa of your double. That means there are normally some bit patterns in the double (and therefore, specific values of the double) that can never occur. For most people's uses that's not a problem though.
As far as the "normally" goes, it's always possible that you have a 64-bit random number generator, where a double typically has a 53-bit mantissa. In this case, you could have the same kind of problem as with clamping the range with integers.
No, your algorithm will work; it's using the modulus function that does things imperfectly.
The one problem is that because it's quantized, once in a while it will generate exactly RAND_MAX and you'll be asking for log(1-1). I'd recommend at least (rand() + 0.5)/(RAND_MAX + 1.0), if not a better source like drand48(). (Note the 1.0: RAND_MAX + 1 can overflow as an int if RAND_MAX == INT_MAX.)
There are much faster ways to compute the necessary numbers, e.g. the Ziggurat algorithm.