I am trying to generate a random probability with maximum of given probability. How could we generate a random real value between 0 and 0.5 in C?
You cannot generate a really random number in portable C99. But you could use some PRNG (and perhaps seed it with the current time).
And computers don't know about true real numbers. Only floating point. See http://floating-point-gui.de/
Actually I believe that the universe does not know about real numbers (think about some cardinality argument similar to Cantor's diagonal argument). But ask physicists or philosophers.
Some operating systems or implementations have PRNGs, and some systems (including hardware) even have genuine random number generators.
Read about random(3) & drand48(3) if your system has it. If on Linux, read also random(4)
You might try
double my_random_number /* between 0 & 0.5 */ = drand48() * 0.5;
to generate an almost uniformly distributed random number >= 0 and < 0.5
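As a minimal sketch of the one-liner above, the scaling can be wrapped in a small function. The name random_below is made up for illustration, and the code assumes a POSIX system where drand48() and srand48() are available:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600  /* asks POSIX headers to declare drand48/srand48 */
#endif
#include <stdlib.h>

/* Returns a pseudo-random double in [0, max), assuming max > 0.
   random_below is a hypothetical helper name, not a library function. */
double random_below(double max)
{
    return drand48() * max;  /* drand48() itself is in [0.0, 1.0) */
}
```

Call srand48() once at startup (for example with the current time), then use random_below(0.5) wherever a value between 0 and 0.5 is needed.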
See also the C++11 standard header <random> if you are willing to code in C++ ...
Related
I have read so many posts on this topic:
How does rand() work? Does it have certain tendencies? Is there something better to use?
How does the random number generator work in C
and this is what I got:
1) xn+1 depends on xn i.e., previous random number that is generated.
2) It is not recommended to initialize the seed more than once in the program.
3) It is a bad practice to use rand()%2 to generate either 0 or 1 randomly.
My questions are:
1) Are there any other libraries that I missed to take a look to generate a completely random number (either 0 or 1) without depending on previous output?
2) If there is any other work around using the inbuilt rand() function to satisfy the requirement?
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a bad practice I suppose?
So, can anyone please answer the above questions? I apologize if I completely missed an obvious thing.
Are there any other libraries that I missed to take a look to generate a completely random number between 0 and 1 without depending on previous output?
Not in the standard C library. There are lots of other libraries which generate "better" pseudo-random numbers.
If there is any other work around using the inbuilt rand() function to satisfy the requirement?
Most standard library implementations of rand produce sequences of random numbers where the low-order bit(s) have a short sequence and/or are not as independent of each other as one would like. The high-order bits are generally better distributed. So a better way of using the standard library rand function to generate a random single bit (0 or 1) is:
(rand() > RAND_MAX / 2)
or use an interior bit:
((rand() & 0x400U) != 0)
Those will produce reasonably uncorrelated sequences with most standard library rand implementations, and impose no more computational overhead than checking the low-order bit. If that's not good enough for you, you'll probably want to research other pseudo-random number generators.
All of these (including rand() % 2) assume that RAND_MAX is odd, which is almost always the case. (If RAND_MAX were even, there would be an odd number of possible values and any way of dividing an odd number of possible values into two camps must be slightly biased.)
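A minimal sketch of the midpoint comparison described above, wrapped in a function (random_bit is a made-up name, and the quality of the result still depends on the platform's rand()):

```c
#include <stdlib.h>

/* Returns 0 or 1 by comparing rand() against the midpoint of its range,
   so the decision depends on the high-order bits rather than the lowest one.
   random_bit is a hypothetical helper name. */
int random_bit(void)
{
    return rand() > RAND_MAX / 2;
}
```

With an odd RAND_MAX, exactly half of the possible rand() values map to 1 and half to 0.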
What is the side effect of initializing the seed more than once in a program?
You should think of the random number generator as producing "not very random" numbers after being seeded, with the quality improving as you successively generate new random numbers. And remember that if you seed the random number generator using some seed, you will get exactly the same sequence as you will the next time you seed the generator with the same seed. (Since time() returns a number of seconds, two successive calls in quick succession will usually produce exactly the same number, or very occasionally two consecutive numbers. But definitely not two random uncorrelated numbers.)
So the side effect of reseeding is that you get numbers that are less random, and possibly exactly the same ones as you got the last time you reseeded.
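The "same seed, same sequence" point can be checked with a small sketch. The helper name first_after_seed is an assumption made for illustration only:

```c
#include <stdlib.h>

/* Seeds the generator and returns the first value it then produces.
   first_after_seed is a hypothetical helper used only for illustration. */
int first_after_seed(unsigned int seed)
{
    srand(seed);
    return rand();
}
```

first_after_seed(42) returns the same value every time it is called, which is why calling srand(time(NULL)) twice within the same second makes d1 and d2 identical rather than independent.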
1) Are there any other libraries that I missed to take a look to
generate a completely random number between 0 and 1 without depending
on previous output?
This sub-question is off-topic for Stack Overflow, but I'll point out that POSIX and BSD systems have an alternative random number generator function named random() that you could consider if you are programming for such a platform (e.g. Linux, OS X).
2) If there is any other work around using the inbuilt rand() function
to satisfy the requirement?
Traditional computers (as opposed to quantum computers) are deterministic machines. They cannot do true randomness. Every completely programmatic "random number generator" is in practice a pseudo-random number generator. They generate completely deterministic sequences, but the values from a given set of calls are distributed across the generator's range in a manner approximately consistent with a target probability distribution (ordinarily the uniform distribution).
Some operating systems provide support for generating numbers that depend on something more chaotic and less predictable than a computed sequence. For instance, they may collect information from mouse movements, CPU temperature variations, or other such sources, to produce more objectively random (yet still deterministic) numbers. Linux, for example, has such a driver that is often exposed as the special file /dev/random. The problem with these is that they have a limited store of entropy, and therefore cannot provide numbers at a sustained high rate. If you need only a few random numbers, however, then that might be a suitable source.
3) What is the side effect of initializing the seed more than once in
a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each
other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a
bad practice I suppose?
It is indeed bad if you want d1 and d2 to have a 50% probability of being different. time() returns the number of seconds since the epoch, so it is highly likely that it will return the same value when called twice so close together. The sequence of pseudorandom numbers is completely determined by the seed (this is a feature, not a bug), and when you seed the PRNG, you restart the sequence. Even if you used a higher-resolution clock to make the seeds more likely to differ, you don't escape correlation this way; you just change the function generating numbers for you. And the result does not have the same guarantees for output distribution.
Additionally, when you do rand() % 2 you use only one bit of the approximately log2(RAND_MAX) + 1 bits that it produced for you. Over the whole period of the PRNG, you can expect that bit to take each value the same number of times, but over narrow ranges you may sometimes see some correlation.
In the end, your requirement for your two random numbers to be completely independent of one another is probably way overkill. It is generally sufficient for the pseudo-random result of one call to have no apparent correlation with the results of previous calls. You probably achieve that well enough with your first code snippet, even despite the use of only one bit per call. If you prefer to use more of the bits, though, then with some care you could base the numbers you choose on the parity of the count of how many bits are set in the values returned by rand().
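The parity idea mentioned above could look roughly like this. It is only a sketch; parity_of and random_parity_bit are made-up names, and whether the result is noticeably better than a single bit depends on the rand() implementation:

```c
#include <stdlib.h>

/* Parity (0 or 1) of the number of set bits in v. */
int parity_of(unsigned int v)
{
    int parity = 0;
    while (v != 0) {
        parity ^= 1;
        v &= v - 1;   /* clear the lowest set bit */
    }
    return parity;
}

/* One pseudo-random bit that depends on every bit rand() returned. */
int random_parity_bit(void)
{
    return parity_of((unsigned int)rand());
}
```

With this, d1 = random_parity_bit(); d2 = random_parity_bit(); replaces the two rand() % 2 calls after a single srand().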
Use this
(double)rand() / (double)RAND_MAX
completely random number ..... without depending on previous output?
Well in reality computers can't generate completely random numbers. There has to be some dependencies. But for almost all practical purposes, you can use rand().
side effect of initializing the seed more than once
No side effect. But that would mean you're completely defeating the point of using rand(). If you re-initialize the seed every time, the random number depends mostly on the time (and the processor).
any other work around using the inbuilt rand() function
You can write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char *argv[])
{
    srand(time(NULL));
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    return 0;
}
If you want to generate either a 0 or a 1, I think using rand()%2 is perfectly fine as the probability of an even number is same as probability of an odd number(probability of all numbers is equal for an unbiased random number generator).
I am writing a program in C that requires generating a normal distribution of positive integers with mean less than 1.
You can normalize data that is already normally distributed, for example take data for average length of human beings (180 centimeter) and scale every number by a factor so that the mean becomes less than 1 e.g. multiply every length by 1/180.
I used a Poisson random number generator function in C, which takes the mean as input. I used a combination of rand() followed by exponentiation to get the distribution, which is the normal way to do this, as in Calculation of Poisson distribution in C.
Since I generate 2^14 random numbers, by Central Limit theorem, their distribution will tend to a normal distribution, with the same mean and variance.
As far as I know rand() does not generate a uniform random distribution. What function/algorithm will allow me to do so? I have no need for cryptographic randomness, only a uniform random distribution. Lastly, what libraries provide these functions?
Thanks!
rand() does generate a uniform (pseudo-)random distribution.
The actual requirement, from the C standard (3.7 MB PDF), section 7.20.2.1, is:
The rand function computes a sequence of pseudo-random integers in
the range 0 to RAND_MAX.
where RAND_MAX is at least 32767. That's admittedly vague, but the intent is that it gives you a uniform distribution -- and in practice, that's what implementations actually do.
The standard provides a sample implementation, but C implementations aren't required to use it.
In practice, there are certainly better random number generators out there. And one specific requirement for rand() is that it must produce exactly the same sequence of numbers for a given seed (argument to srand()). Your description doesn't indicate that that would be a problem for you.
One problem is that rand() gives you uniformly distributed numbers in a fixed range. If you want numbers in a different range, you have to do some extra work. For example, if RAND_MAX is 32767, then rand() can produce 32768 distinct values; you can't get random numbers in the range 0..9 without discarding some values, since there's no way to evenly distribute those 32768 distinct values into 10 equal sized buckets.
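The "discarding some values" approach mentioned above is usually called rejection sampling. A minimal sketch, assuming 0 < n <= RAND_MAX (uniform_below is a made-up name, not a standard function):

```c
#include <stdlib.h>

/* Returns a pseudo-random integer in [0, n), assuming 0 < n <= RAND_MAX.
   Values of rand() at or above the largest multiple of n are discarded,
   so every result corresponds to the same number of raw values.
   uniform_below is a hypothetical helper name. */
int uniform_below(int n)
{
    unsigned long range = (unsigned long)RAND_MAX + 1UL; /* count of raw values */
    unsigned long limit = range - range % (unsigned long)n;
    unsigned long r;

    do {
        r = (unsigned long)rand();
    } while (r >= limit);

    return (int)(r % (unsigned long)n);
}
```

With RAND_MAX equal to 32767 and n equal to 10, the eight raw values 32760..32767 are thrown away and the remaining 32760 split evenly into ten buckets of 3276.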
Other PRNGs are likely to give you better results than rand(), but they're still probably going to be subject to the same issues.
As usual, the comp.lang.c FAQ answers this better than I did; see questions 13.15 through 13.21.
Here's an article and a stand-alone random number generator written in C#. The code is very small and easily portable to C++ etc.
Whenever this subject comes up, someone responds that you should not use your own random number generator but should leave that up to specialists. I respond that you should not come up with your own algorithm. Leave that up to specialists because it is indeed very subtle. But it's OK and even beneficial to have your own implementation. That way you know what's being done, and you could use the same method across languages or platforms.
The algorithm in that article is by George Marsaglia, a top expert in random number generation. Even though the code is tiny, the method holds up well to standard tests.
The BSD random() function (included in the XSI option of POSIX/SUS) is almost universally available and much better than rand on most systems (except some where rand actually uses random and thus they're both pretty good).
If you'd rather go outside the system libraries, here's some good information on your choices:
http://guru.multimedia.cx/category/pseudo-random-number-generators/
(From Michael Niedermayer of FFmpeg fame.)
Well, the question of whether or not an actual pseudorandom generator exists is still open. That being said, a quick search reveals that there may be some slightly better alternatives.
I am trying to generate random numbers in C, but rand() doesn't generate different numbers each time I compile and run the code. Can anyone tell me how to use srand(), or suggest another method for generating them?
In order to generate a sequence of pseudorandom numbers, the generator needs to be seeded. The seed fully determines the sequence of numbers that will be produced. In C, you seed with srand, as you indicate. According to the srand(3) man page, no explicit seeding implies that the generator will use 1 as a seed. This explains why you always see the same numbers (but do remember that the sequence itself is pretty random, with quality depending on the generator used, even though the sequence is the same each time).
User mzabsky points out that one way to get a seed that feels random to a human user is to seed with time. Another common method (which I just saw that mzabsky also points out - sorry) is to seed the generator with the contents of the system's random number generator, which draws from an entropy pool fed by things such as mouse movement, disk timings etc. You can't draw a lot of randomness from the system generator, as it won't be able to gather enough entropy. But if you just draw a seed from it, you'll have chosen at random a sequence of random numbers in your program. Here's an example of how to do that in C on Linux:
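unsigned int seed = 1; /* rand()'s default seed, kept if the read fails */
FILE *urandom = fopen("/dev/urandom", "rb");
if (urandom != NULL) {
    if (fread(&seed, sizeof seed, 1, urandom) != 1)
        seed = 1;
    fclose(urandom);
}
srand(seed);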
In light of Conrad Meyer's answer, I thought I'd elaborate a bit more. I'd divide the use of random numbers into three categories:
Variation. If you use random numbers to create seemingly random or varied behavior in for example a game, you don't need to think very hard about the topic, or about choosing a proper seed. Seed with time, and look at some other solution if this turns out not to be good enough. Even relatively bad RNGs will look random enough in this scenario.
Scientific simulations. If you use random numbers for scientific work, such as Monte Carlo calculations, you need to take care to choose a good generator. Your seed should be fixed (or user-changeable). You don't want variation (in the sense above); you want deterministic behavior but good randomness.
Cryptography. You'll want to be extremely careful. This is probably out of the scope of this thread.
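For the scientific-simulation case above, a "fixed or user-changeable" seed could be handled along these lines. This is only a sketch; seed_from_text is a made-up helper name, and taking the seed from a command-line argument is an assumption about how the program is run:

```c
#include <stdlib.h>

/* Parses a seed from text such as a command-line argument.
   Returns fallback when arg is NULL, empty, or has trailing
   characters that are not part of the number.
   seed_from_text is a hypothetical helper name. */
unsigned int seed_from_text(const char *arg, unsigned int fallback)
{
    char *end;
    unsigned long value;

    if (arg == NULL || *arg == '\0')
        return fallback;

    value = strtoul(arg, &end, 10);
    if (*end != '\0')
        return fallback;   /* trailing junk: not a valid seed */

    return (unsigned int)value;
}
```

In main you would call srand(seed_from_text(argc > 1 ? argv[1] : NULL, 1u)) once and record the seed alongside the results, so that a run can be repeated exactly.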
This is a commonly used solution:
srand ( time(NULL) );
Execution of all C code is deterministic, so you have to bring in something that is different every time you call the srand. In this case it is time.
Or you can read data from /dev/random (open it just like any other file).
If you are using an OS that does not provide /dev/random then use something like what is shown below
struct timeval t1;
gettimeofday(&t1, NULL);
srand(t1.tv_usec * t1.tv_sec);
This piece of code can be ported easily to other OS.
To improve the seed, you can combine (maybe using MD5 or a checksum algorithm) the time product shown above with the MAC address of the host machine.
struct timeval t1;
gettimeofday(&t1, NULL);
unsigned int seed = t1.tv_usec * t1.tv_sec;
unsigned char mac_addr[6];
getMAC(&mac_addr);
improveSeedWithMAC(&seed, mac_addr) ; // MD5 or checksum ...
srand(seed);
Be careful; the rand(3) manpage on Linux notes that rand() implementations on some platforms do not give good randomness in the lower-order bits. For this reason, you might want to use a library to get better random numbers. GLib provides useful functions like g_random_int_range() which may better suit your purpose.
For a simple simulation in C, I need to generate exponential random variables. I remember reading somewhere (but I can't find it now, and I don't remember why) that using the rand() function to generate random integers in a fixed range would generate non-uniformly distributed integers. Because of this, I'm wondering if this code might have a similar problem:
//generate u ~ U[0,1]
u = (double)rand() / (double)RAND_MAX;
//inverse of exponential CDF to get exponential random variable
expon = -log(1-u) * mean;
Thank you!
The problem with random numbers in a fixed range is that a lot of people do this for numbers between 100 and 200 for example:
100 + rand() % 100
That is not uniform. But by doing this it is (or is close enough to uniform at least):
u = 100 + 100 * ((double)rand() / (double)RAND_MAX);
Since that's what you're doing, you should be safe.
In theory, at least, rand() should give you a discrete uniform distribution from 0 to RAND_MAX... in practice, it has some undesirable properties, such as a small period, so whether it's useful depends on how you're using it.
RAND_MAX is usually 32k, while the LCG that rand() uses generates 32-bit pseudorandom numbers. Thus the lack of uniformity, as well as the short period, will generally go unnoticed.
If you require high quality pseudorandom numbers, you could try George Marsaglia's CMWC4096 (Complementary Multiply With Carry). This is probably the best pseudorandom number generator around, with an extremely long period and uniform distribution (you just have to pick good seeds for it). Plus, it's blazing fast (not as fast as an LCG, but approximately twice as fast as a Mersenne Twister).
Yes and no. The problem you're thinking of arises when you're clamping the output from rand() into a range that's smaller than RAND_MAX (i.e. there are fewer possible outputs than inputs).
In your case, you're (normally) reversing that: you're taking a fairly small number of bits produced by the random number generator, and spreading them among what will usually be a larger number of bits in the mantissa of your double. That means there are normally some bit patterns in the double (and therefore, specific values of the double) that can never occur. For most people's uses that's not a problem though.
As far as the "normally" goes, it's always possible that you have a 64-bit random number generator, where a double typically has a 53-bit mantissa. In this case, you could have the same kind of problem as with clamping the range with integers.
No, your algorithm will work; it's using the modulus function that does things imperfectly.
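The one problem is that because it's quantized, once in a while it will generate exactly RAND_MAX, giving u = 1, and you'll be asking for log(1-1). I'd recommend at least (rand() + 0.5)/(RAND_MAX + 1.0), if not a better source like drand48(). (Dividing by RAND_MAX + 1.0 rather than RAND_MAX + 1 avoids integer overflow when RAND_MAX equals INT_MAX.)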
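A minimal sketch of that half-step mapping, written so the endpoints can be checked directly. The name to_open_unit is made up for illustration:

```c
#include <stdlib.h>

/* Maps a raw rand() value r in [0, RAND_MAX] to a double strictly inside (0, 1).
   Dividing by RAND_MAX + 1.0 (a double) avoids the integer overflow that
   RAND_MAX + 1 could cause when RAND_MAX equals INT_MAX.
   to_open_unit is a hypothetical helper name. */
double to_open_unit(int r)
{
    return (r + 0.5) / (RAND_MAX + 1.0);
}
```

With u = to_open_unit(rand()), the expression -log(1 - u) * mean from the question never sees log(0), because u can be neither 0 nor 1.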
There are much faster ways to compute the necessary numbers, e.g. the Ziggurat algorithm.