What is a suitable replacement for rand()? - c

As far as I know rand() does not generate a uniform random distribution. What function/algorithm will allow me to do so? I have no need for cryptographic randomness, only a uniform random distribution. Lastly, what libraries provide these functions?
Thanks!

rand() does generate a uniform (pseudo-)random distribution.
The actual requirement, from the C standard (3.7 MB PDF), section 7.20.2.1, is:
The rand function computes a sequence of pseudo-random integers in the range 0 to RAND_MAX.
where RAND_MAX is at least 32767. That's admittedly vague, but the intent is that it gives you a uniform distribution -- and in practice, that's what implementations actually do.
The standard provides a sample implementation, but C implementations aren't required to use it.
In practice, there are certainly better random number generators out there. And one specific requirement for rand() is that it must produce exactly the same sequence of numbers for a given seed (argument to srand()). Your description doesn't indicate that that would be a problem for you.
One problem is that rand() gives you uniformly distributed numbers in a fixed range. If you want numbers in a different range, you have to do some extra work. For example, if RAND_MAX is 32767, then rand() can produce 32768 distinct values; you can't get random numbers in the range 0..9 without discarding some values, since there's no way to evenly distribute those 32768 distinct values into 10 equal sized buckets.
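The "discarding some values" approach is usually called rejection sampling. Here is a minimal sketch of it; the helper name rand_below is made up for illustration, and it assumes 0 < n <= RAND_MAX and that rand() itself is uniform:

```c
#include <stdlib.h>

/* Hypothetical helper (not a library function): returns a value in [0, n)
 * without modulo bias, assuming 0 < n <= RAND_MAX. */
int rand_below(int n)
{
    /* rand() can return RAND_MAX + 1 distinct values; compute that count
     * in unsigned long so it cannot overflow when RAND_MAX == INT_MAX. */
    unsigned long range = (unsigned long)RAND_MAX + 1UL;
    /* Largest multiple of n that fits inside the range. */
    unsigned long limit = range - (range % (unsigned long)n);
    unsigned long r;

    do {
        r = (unsigned long)rand();
    } while (r >= limit);   /* discard the leftover values and retry */

    return (int)(r % (unsigned long)n);
}
```

With RAND_MAX equal to 32767 and n equal to 10, limit is 32760, so the eight values 32760..32767 are thrown away and each digit 0..9 is backed by exactly 3276 values.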
Other PRNGs are likely to give you better results than rand(), but they're still probably going to be subject to the same issues.
As usual, the comp.lang.c FAQ answers this better than I did; see questions 13.15 through 13.21.

Here's an article and a stand-alone random number generator written in C#. The code is very small and easily portable to C++ etc.
Whenever this subject comes up, someone responds that you should not use your own random number generator but should leave that up to specialists. I respond that you should not come up with your own algorithm. Leave that up to specialists because it is indeed very subtle. But it's OK and even beneficial to have your own implementation. That way you know what's being done, and you could use the same method across languages or platforms.
The algorithm in that article is by George Marsaglia, a top expert in random number generation. Even though the code is tiny, the method holds up well to standard tests.

The BSD random() function (included in the XSI option of POSIX/SUS) is almost universally available and much better than rand on most systems (except some where rand actually uses random and thus they're both pretty good).
If you'd rather go outside the system libraries, here's some good information on your choices:
http://guru.multimedia.cx/category/pseudo-random-number-generators/
(From Michael Niedermayer of FFmpeg fame.)

Well, the question of whether or not an actual pseudorandom generator exists is still open. That being said, a quick search reveals that there may be some slightly better alternatives.

Related

Is it acceptable to use rand() for cryptographically insecure random numbers?

Is it acceptable to use the C standard library's rand() function for random numbers that do not have to be cryptographically secure? If so, are there still better choices? If not, what should be used?
Of course, I assume the usual caveats about skew apply.
rand() suffers from some serious drawbacks.
There is no guarantee on the quality of the random number. This will vary from implementation to implementation.
The shared state used by different calls to rand is not guaranteed to be thread safe.
As for alternatives, POSIX provides random, and glibc additionally provides random_r. OpenSSL provides more advanced ways of generating random numbers.
The C++ (C++11 and later) library also provides a number of random number functions if including C++ in your project is an option.
Cryptographic security aside, there are a lot of systems where rand() has pretty atrocious randomness properties, and the standard advice if you need something better is to use the non-Standard random().
rand's poor properties on many systems include:
nonrandomness in the low-order bits (such that e.g. rand()%2 is guaranteed to alternate 0,1,0,1...).
relatively short period, perhaps "only" 4 billion or so
So my (reluctant) advice is that if you need "good" randomness (say, for a Monte Carlo simulation), you may very well want to investigate using a nonstandard alternative to rand(). (One of my eternal questions about C is why any vendor would spend time deploying a nonstandard random() instead of simply making rand() better. And I do know the canonical answers, although they suck.)
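The low-order-bit problem is easy to reproduce. The sketch below is an illustrative linear congruential generator with modulus 2^31 that returns its whole state; it is not any particular library's rand(), and the names lcg_state and lcg_next are invented for the example:

```c
#include <stdint.h>

/* Illustrative LCG, not any real library's rand(): modulus 2^31 and the
 * whole state is returned, so the weak low-order bits are exposed. */
static uint32_t lcg_state = 1;

uint32_t lcg_next(void)
{
    lcg_state = (lcg_state * UINT32_C(1103515245) + UINT32_C(12345))
                & UINT32_C(0x7fffffff);
    return lcg_state;
}
```

Because the multiplier and the increment are both odd, the lowest bit of the state flips on every call, so lcg_next() % 2 strictly alternates. Implementations that return only the upper bits of the state avoid this particular symptom.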
See also this similar question.
Yes, it is fine to use rand() to get pseudo-random numbers. In fact, that is the whole point of rand(). For simple tasks, where it is OK to be deterministic, you can even seed it from the system clock for simplicity.
The implementation of rand in mainstream C standard libraries may be adequate for casual use of pseudorandom numbers (such as in most single-player games, or for aesthetic purposes), especially if your application doesn't care about repeatable "randomness" across time or across computers. (But note that the rand specification in the C standard doesn't specify a particular distribution that the numbers delivered by rand have to follow.)
However, for more serious use of pseudorandom numbers, such as in a scientific simulation, I refer you to another answer of mine, where I explain that the problem with rand/srand is that rand—
Uses an unspecified RNG algorithm, yet
allows that RNG to be initialized with srand for repeatable "randomness".
These two points, taken together, hamper the ability of implementations to improve on the RNG's implementation; changing that RNG will defeat the goal of repeatable "randomness", especially if an application upgrades to a newer version of the C runtime library by the same vendor or is compiled with library implementations by different vendors. The first point also means that no particular quality of pseudorandom numbers is guaranteed. Another problem is that srand allows for only relatively small seeds — namely those with the same size as an unsigned.
However, even if the application doesn't care about repeatable "randomness", the fact that rand specifies that it behaves by default as though srand(1) were called (and thus, in practice, generates the same pseudorandom sequence by default) makes using rand harder to use effectively than it could be.
A better approach for noncryptographic pseudorandom numbers is to use a PRNG library—
that uses self-contained PRNGs that maintain their own state (e.g., in a single struct) and don't touch global state, and
that implements a PRNG algorithm whose details are known to the application.
I list several examples of high-quality PRNG algorithms for noncryptographic pseudorandom numbers.
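As a rough sketch of what "self-contained state in a single struct" can look like, here is a tiny generator whose mixing step follows the SplitMix64 algorithm. The struct and function names (prng, prng_seed, prng_next) are hypothetical, and this is meant as an illustration of the shape of such an API rather than a recommendation of one algorithm over another:

```c
#include <stdint.h>

/* Hypothetical names; all generator state lives in this struct,
 * so no global state is touched. */
struct prng {
    uint64_t state;
};

void prng_seed(struct prng *g, uint64_t seed)
{
    g->state = seed;
}

/* One step of the SplitMix64 algorithm: advance the state by a fixed
 * odd constant, then scramble it with xor-shifts and multiplications. */
uint64_t prng_next(struct prng *g)
{
    uint64_t z = (g->state += UINT64_C(0x9E3779B97F4A7C15));
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}
```

Two struct prng objects seeded with the same value produce the same sequence regardless of what the rest of the program (or the C library) does, and the full 64-bit seed is used, unlike srand.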

How to generate either 0 or 1 randomly in C

I have read so many posts on this topic:
How does rand() work? Does it have certain tendencies? Is there something better to use?
How does the random number generator work in C
and this is what I got:
1) x_{n+1} depends on x_n, i.e., the previous random number that was generated.
2) It is not recommended to initialize the seed more than once in the program.
3) It is a bad practice to use rand()%2 to generate either 0 or 1 randomly.
My questions are:
1) Are there any other libraries that I missed to take a look to generate a completely random number (either 0 or 1) without depending on previous output?
2) If there is any other work around using the inbuilt rand() function to satisfy the requirement?
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a bad practice I suppose?
So, can anyone please answer the above questions? I apologize if I completely missed an obvious thing.
Are there any other libraries that I missed to take a look to generate a completely random number between 0 and 1 without depending on previous output?
Not in the standard C library. There are lots of other libraries which generate "better" pseudo-random numbers.
If there is any other work around using the inbuilt rand() function to satisfy the requirement?
Most standard library implementations of rand produce sequences of random numbers where the low-order bit(s) have a short sequence and/or are not as independent of each other as one would like. The high-order bits are generally better distributed. So a better way of using the standard library rand function to generate a random single bit (0 or 1) is:
(rand() > RAND_MAX / 2)
or use an interior bit:
((rand() & 0x400U) != 0)
Those will produce reasonably uncorrelated sequences with most standard library rand implementations, and impose no more computational overhead than checking the low-order bit. If that's not good enough for you, you'll probably want to research other pseudo-random number generators.
All of these (including rand() % 2) assume that RAND_MAX is odd, which is almost always the case. (If RAND_MAX were even, there would be an odd number of possible values and any way of dividing an odd number of possible values into two camps must be slightly biased.)
What is the side effect of initializing the seed more than once in a program?
You should think of the random number generator as producing "not very random" numbers after being seeded, with the quality improving as you successively generate new random numbers. And remember that if you seed the random number generator using some seed, you will get exactly the same sequence as you will the next time you seed the generator with the same seed. (Since time() returns a number of seconds, two successive calls in quick succession will usually produce exactly the same number, or very occasionally two consecutive numbers. But definitely not two random uncorrelated numbers.)
So the side effect of reseeding is that you get less random numbers, and possibly exactly the same ones as you got the last time you reseeded.
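The "exactly the same ones" case can be checked directly. A small sketch, with a made-up helper name first_after_seed:

```c
#include <stdlib.h>

/* Hypothetical helper: seed the generator, then return the first value
 * it produces. The result depends only on the seed. */
int first_after_seed(unsigned int seed)
{
    srand(seed);
    return rand();
}
```

Calling first_after_seed((unsigned)time(NULL)) twice within the same second returns the same value both times, which is what happens to d1 and d2 in the version of the snippet that calls srand twice.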
1) Are there any other libraries that I missed to take a look to
generate a completely random number between 0 and 1 without depending
on previous output?
This sub-question is off-topic for Stack Overflow, but I'll point out that POSIX and BSD systems have an alternative random number generator function named random() that you could consider if you are programming for such a platform (e.g. Linux, OS X).
2) If there is any other work around using the inbuilt rand() function
to satisfy the requirement?
Traditional computers (as opposed to quantum computers) are deterministic machines. They cannot do true randomness. Every completely programmatic "random number generator" is in practice a pseudo-random number generator. They generate completely deterministic sequences, but the values from a given set of calls are distributed across the generator's range in a manner approximately consistent with a target probability distribution (ordinarily the uniform distribution).
Some operating systems provide support for generating numbers that depend on something more chaotic and less predictable than a computed sequence. For instance, they may collect information from mouse movements, CPU temperature variations, or other such sources, to produce more objectively random (yet still deterministic) numbers. Linux, for example, has such a driver that is often exposed as the special file /dev/random. The problem with these is that they have a limited store of entropy, and therefore cannot provide numbers at a sustained high rate. If you need only a few random numbers, however, then that might be a suitable source.
3) What is the side effect of initializing the seed more than once in
a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each
other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a
bad practice I suppose?
It is indeed bad if you want d1 and d2 to have a 50% probability of being different. time() returns the number of seconds since the epoch, so it is highly likely that it will return the same value when called twice so close together. The sequence of pseudorandom numbers is completely determined by the seed (this is a feature, not a bug), and when you seed the PRNG, you restart the sequence. Even if you used a higher-resolution clock to make the seeds more likely to differ, you don't escape correlation this way; you just change the function generating numbers for you. And the result does not have the same guarantees for output distribution.
Additionally, when you do rand() % 2 you use only one bit of the approximately log2(RAND_MAX) + 1 bits that it produced for you. Over the whole period of the PRNG, you can expect that bit to take each value the same number of times, but over narrow ranges you may sometimes see some correlation.
In the end, your requirement for your two random numbers to be completely independent of one another is probably way overkill. It is generally sufficient for the pseudo-random result of one call to have no apparent correlation with the results of previous calls. You probably achieve that well enough with your first code snippet, even despite the use of only one bit per call. If you prefer to use more of the bits, though, then with some care you could base the numbers you choose on the parity of the count of how many bits are set in the values returned by rand().
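The bit-parity idea could be sketched like this. The function names are invented for the example, and the result is only unbiased under the assumption that RAND_MAX is one less than a power of two (so that every bit pattern of that width is a possible return value):

```c
#include <stdlib.h>

/* Hypothetical helper: parity (0 or 1) of the number of set bits in v. */
int bit_parity(unsigned int v)
{
    int parity = 0;
    while (v != 0) {
        parity ^= 1;
        v &= v - 1;   /* clear the lowest set bit */
    }
    return parity;
}

/* Returns 0 or 1, folding every bit of one rand() result into the answer. */
int rand_bit(void)
{
    return bit_parity((unsigned int)rand());
}
```

Each call still consumes one rand() value, so this costs a little more than rand() % 2 while no longer depending on the low-order bit alone.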
Use this
(double)rand() / (double)RAND_MAX
completely random number ..... without depending on previous output?
Well in reality computers can't generate completely random numbers. There has to be some dependencies. But for almost all practical purposes, you can use rand().
side effect of initializing the seed more than once
No side effect. But that would mean you're completely invalidating the point of using rand(). If you're reinitializing the seed every time, the random number is more dependent on the time (and the processor).
any other work around using the inbuilt rand() function
You can write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void)
{
    srand((unsigned)time(NULL));  /* seed once, at program start */
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    return 0;
}
If you want to generate either a 0 or a 1, I think using rand()%2 is perfectly fine, as the probability of an even number is the same as the probability of an odd number (all numbers are equally likely for an unbiased random number generator).

generate uncorrelated number in the C language using rand()

I want to generate uncorrelated random number to do a simulation... However, the numbers generated by the rand() function in the C language are correlated. Is there any possibility to use the rand() function and generate multiple random streams? I mean, if the rand() function generate for me a series of correlated numbers, can I cut this series into different streams. Then use these streams independently?
Thanks
You are indeed correct. They are normally autocorrelated as the normal generator implementation is linear congruential (although the C standard does not mandate this). As such an x-y plot of successive numbers will fail a chi square test for random 2D dispersion.
Depending on your application, you could look at the Bays-Durham shuffle which, to my knowledge, passes the diehard test for randomness: its aim is to defeat autocorrelation effects.
I direct you to www.nr.com for an implementation, and the ran1, ran2 functions in particular. A more modern way is to use a Mersenne Twister scheme, but it is a little trickier to implement (by the way, C++11 has this generator as part of its standard library).
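For a rough idea of how a Bays-Durham shuffle works, here is a minimal sketch layered over rand(). It is not the Numerical Recipes code; the names and the table size of 32 are arbitrary choices for illustration:

```c
#include <stdlib.h>

#define SHUFFLE_SIZE 32   /* table size; an arbitrary choice for this sketch */

static int shuffle_table[SHUFFLE_SIZE];
static int shuffle_last;
static int shuffle_ready = 0;

/* Hypothetical wrapper: Bays-Durham shuffle on top of rand(). */
int shuffled_rand(void)
{
    int j;

    if (!shuffle_ready) {
        /* Fill the table once, and draw one extra value to pick the
         * first slot. */
        for (j = 0; j < SHUFFLE_SIZE; j++)
            shuffle_table[j] = rand();
        shuffle_last = rand();
        shuffle_ready = 1;
    }

    /* The previous output selects which table slot is handed out next;
     * the divisor maps [0, RAND_MAX] onto [0, SHUFFLE_SIZE - 1]. */
    j = shuffle_last / (RAND_MAX / SHUFFLE_SIZE + 1);
    shuffle_last = shuffle_table[j];
    shuffle_table[j] = rand();   /* refill the slot with a fresh value */
    return shuffle_last;
}
```

The output values are the same ones rand() would have produced, only delivered in a scrambled order, which is what breaks up the correlation between successive outputs.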
If your C implementation has rand_r, you can try that. It lets you specify a location to store the state.
Or just use your own pseudo-random number generator.
You may use arc4random, or better arc4random_uniform, to increase the randomness of generated values (arc4random_uniform gives you uniformly distributed values below a given upper bound).
Generating true random numbers on a computer is impossible; you can only generate "pseudo-random" numbers, i.e. numbers that "look" random.
Usually one will use a "seed" (a small sequence of bits) with enough entropy and then "expand" it with a pseudo-random number generator.
The C rand() function often produces poor-quality randomness; try one of the PRNGs that have been proposed in other answers/comments. Some examples:
Mersenne Twister (widely used)
ANSI X9 (adopted by FIPS standard)

Logic behind the random number generator in C [duplicate]

Possible Duplicate:
How does a random number generator work?
How does the C compiler decide which number should be generated next in a random number generation function? For example, it always generates a new random number within the given range. How is that done?
It generates the next number by keeping some state and modifying the state every time you call the function. Such a function is called a pseudorandom number generator. An old method of creating a PRNG is the linear congruential generator, which is easy enough:
static unsigned int rand_state;
int rand(void)
{
    /* unsigned arithmetic wraps around; signed overflow would be undefined */
    rand_state = (rand_state * 1103515245u + 12345u) & 0x7fffffffu;
    return (int)rand_state;
}
As you can see, this method allows you to predict the next number in the series if you know the previous number. There are more sophisticated methods.
Various types of pseudorandom number generators have been designed for specific purposes. There are secure PRNGs which are slow but hard to predict even if you know how they work, and there are big PRNGs like Mersenne Twister which have nice distribution properties and are therefore useful for writing Monte Carlo simulations.
As a rule of thumb, a linear congruential generator is good enough for writing a game (how much damage does the monster deal) but not good enough for writing a simulation. There is a colorful history of researchers who have chosen poor PRNGs for their programs; the results of their simulations are suspect as a result.
It is not a compiler but a C library that has a function to produce pseudorandom (not truly random!) numbers.
Usually linear congruential generators are used for this.
Well, the C compiler doesn't take that decision. The next random number depends on the algorithm. Generating random numbers is not an easy task. Take a look at
http://www.math.utah.edu/~pa/Random/Random.html
http://computer.howstuffworks.com/question697.htm
http://en.wikipedia.org/wiki/Random_number_generation
It depends on the specific implementation of the pseudo random number generator (PRNG) in question. There are a great many variants in use.
A common example is the family of linear congruential generators (LCGs). These are defined by a recurrence relation:
X_{n+1} = (a * X_n + c) mod m
So each new sample from the PRNG is determined solely by the previous sample, and the constants a, c and m. Note that the choice of a, c and m is crucial, as discussed here.
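To see why the constants matter, here is a small sketch that measures how long a toy LCG takes to return to its starting value. The helper name lcg_cycle_length is made up, and it assumes parameters small enough that a * x + c does not overflow unsigned:

```c
/* Hypothetical helper: number of steps until the LCG
 * x -> (a * x + c) mod m returns to `seed`, or 0 if it does not
 * return within m steps. Assumes m > 0 and small parameters. */
unsigned lcg_cycle_length(unsigned a, unsigned c, unsigned m, unsigned seed)
{
    unsigned start = seed % m;
    unsigned x = start;

    for (unsigned steps = 1; steps <= m; steps++) {
        x = (a * x + c) % m;
        if (x == start)
            return steps;
    }
    return 0;
}
```

With m = 16 and c = 3, choosing a = 5 visits all 16 values before repeating, while a = 3 repeats after only 8, so half of the range is never produced.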
LCGs are very simple and efficient. They are often used for the random number generators provided by the standard library. However, they have poor statistical properties and for better randomness, more advanced PRNGs are preferred.
There are many questions about this on Stack Overflow. Here are a few that may help.
implementation of rand()
Rand function in c
Rand Implementation
This is actually a really big topic. Some of the key things:
Random number generation is done at run-time, rather than compile-time.
The strategy for providing randomness depends (or should depend) greatly on the application. For example, if you simply need a sequence of values that are evenly distributed throughout the given range, solutions such as a linear congruential generator are used. If your application is security/cryptography related, you'll want the stronger property that your values are both randomly distributed and also unpredictable.
A major challenge is acquiring "real" randomness, which you can use to seed your pseudorandom generator (which "stretches" real randomness into an arbitrary amount of usable randomness). A common technique is to use some unpredictable system state (e.g., sample the location of the mouse, or keypress timing) and then use a pseudorandom generator to provide randomness to the system as a whole.

What is a pseudo-random integer?

I am reading a C book. In the description of the rand() function, they say:
rand returns a pseudo-random integer in the range 0 to RAND_MAX.  RAND_MAX is implementation dependent but at least 32767.
I don't understand; what is a "pseudo-random integer"?
Thanks.
Informally, a pseudorandom number is a number that isn't truly random, but is "random enough" for most purposes.
Computers are inherently deterministic devices. The processor executes specific commands in a specific order, and programs control how the processor does so. Consequently, it's hard for programs to generate random numbers because no deterministic process can create a random number. Thus what many programs do is use a pseudorandom number generator, which is a function that produces numbers according to some deterministic formula that appear to be random but actually are not. Most programming languages provide some sort of pseudorandom number generator for general programming use, and when true randomness isn't needed they work just fine.
However, they have their limitations. In cryptographic settings, in many cases true randomness is required in order to prevent attackers from guessing the workings of a system and compromising it, for example. In this case, it is possible to get truly random numbers by using specialized hardware that can amplify background noise or use quantum effects. This sort of randomness is extremely hard to generate, though, and so it's not commonly used unless absolute unpredictability is required.
From wiki entry : Pseudorandom number generator
A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers that approximates the properties of random numbers.
In other words, it's approximately random (to a known extent), but not truly random in the sense of random noise from a physical source.
It means that the number may seem to be random, but the truth is that a computer can't generate a truly random number; it works out a number based on logic that tends to look random.
Like observing the time difference between keystrokes, and then using it as just one input in calculating a number. Such things combined with several other inputs tend to give random numbers, but in reality it's just an algorithm which tends to give random numbers. If the same conditions are matched, it will give the same number.
To add some pedantry to the other correct answers that you've already received, there really is no such thing as a "pseudorandom integer" (or for that matter, a random integer). This is the point that John von Neumann was making in his famous quote (which is usually misleadingly abbreviated to just the first sentence):
"Any one who considers arithmetical
methods of producing random digits is,
of course, in a state of sin. For, as
has been pointed out several times,
there is no such thing as a random
number — there are only methods to
produce random numbers, and a strict
arithmetic procedure of course is not
such a method"
Consider the number 7. Is it a random integer? If it is, is it a pseudorandom integer or a true random integer? What about the number -10? Clearly these questions don't make sense (obligatory link to xkcd 221).
So, as others have pointed out, what the book actually means is that the number is generated by a pseudorandom (i.e. deterministic) number generator as opposed to being obtained from a truly random sequence. There are several different pseudorandom number generator algorithms and the best ones generate output that is statistically indistinguishable from a true random sequence.
It means that it generates a number sequence that exhibits the properties of a random number in terms of distribution, but which is mathematically generated and deterministic in its output for any particular seed value.
You can see this by creating a program that uses such a function to generate a short sequence, and observing that each time you run it it generates the same sequence. This behaviour surprises the unsuspecting, but is also quite a useful property in testing code. When a less predictable sequence is required, one normally initialises the PRNG with a seed value that is itself unpredictable, often derived from the current system time, since it is not normally predictable when a program will be started.
Pseudo random number generators are not normally suitable for security applications such as data encryption, or on-line gambling applications, where their ultimate predictability could be a serious weakness.
In most computers, a random number is considered pseudorandom because it is not completely random.
pseudo=fake
Basically, the current time is used to generate a random number, so it can be predicted what number will be generated initially, and for most applications, this is fine.

Resources