Does C's rand() have to be random? - c

I was trying to get some info on the specification and implementation of rand() in C, and I can't find much information. As a matter of fact, I can't find anything apart from:
rand() does not advance between function calls
same seed always results in same numbers
the random numbers are between 0 and RAND_MAX
Notably, none of these things require randomness. Specifically, I don't see anything that prohibits this implementation:
int randval = 0;
void srand(unsigned int seed) {
randval = seed;
return;
}
int rand() {
return randval++;
}
This seems somewhat unrandom. Is there a bit of standard I'm missing?
(Also, is it bad to seed rand() with time(), then seed ISAAC with rand()?)

Not only does it not have to be random, it must not be random, because it must be completely deterministic:
§7.22.2.2/2:
The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any calls to srand have been made, the same sequence shall be generated as when srand is first called with a seed value of 1.
If you need true randomness—for any cryptographic purposes you do—use /dev/random on Linux (and most other unices) and CryptGenRandom on Windows.
If you are instead interested in how closely the pseudo-random sequence resembles a random one, that is, how statistically random it is, see StoryTeller's answer.
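The repeatability requirement quoted above can be checked directly. This is a minimal sketch; the helper name `replays_identically` is hypothetical and not part of any library.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if seeding twice with the same value
   makes rand() replay the same first n values (n capped at 64), else 0. */
int replays_identically(unsigned int seed, int n)
{
    int first[64];
    if (n > 64)
        n = 64;

    srand(seed);
    for (int i = 0; i < n; i++)
        first[i] = rand();

    srand(seed);                      /* same seed again */
    for (int i = 0; i < n; i++)
        if (rand() != first[i])
            return 0;                 /* sequence was not repeated */
    return 1;
}
```

On an implementation that follows the quoted paragraph, `replays_identically(1u, 10)` should return 1 for any seed you try.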

While notes in the C standard are not normative, it does say this:
7.22.2.1 The rand function / note 295
There are no guarantees as to the quality of the random sequence produced and some implementations are known to produce sequences with distressingly non-random low-order bits. Applications with particular requirements should use a generator that is known to be sufficient for their needs.
So you are correct that a completely lazy implementation of the standard library can look like this. It's not bad to seed rand with time; it's as good as any other seed, and provides a certain degree of randomness so long as it's only seeded once. But I wouldn't use rand for any serious application, precisely due to the lack of guarantees.
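To see what "non-random low-order bits" can look like, here is a sketch of a linear congruential generator with a power-of-two modulus that returns its whole state. The generator and the names `weak_lcg_next` and `low_bit_alternates` are hypothetical, chosen for illustration; this is not the rand() of any particular library.

```c
#include <stdint.h>

/* Hypothetical LCG with modulus 2^31 that returns its whole state. */
static uint32_t lcg_state = 1;

uint32_t weak_lcg_next(void)
{
    lcg_state = (lcg_state * 1103515245u + 12345u) & 0x7fffffffu;
    return lcg_state;
}

/* Returns 1 if the lowest bit simply flips on each of n consecutive outputs. */
int low_bit_alternates(int n)
{
    uint32_t prev = weak_lcg_next() & 1u;
    for (int i = 1; i < n; i++) {
        uint32_t bit = weak_lcg_next() & 1u;
        if (bit == prev)
            return 0;
        prev = bit;
    }
    return 1;
}
```

Because the multiplier and the increment are both odd, the lowest bit of the state flips on every step, so `weak_lcg_next() % 2` would just produce 0, 1, 0, 1, and so on.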

It does not have to be random, only pseudo-random (reproducible and deterministic), as stated in POSIX:
The rand() function shall compute a sequence of pseudo-random integers
in the range [0, {RAND_MAX}]
This specification says that it is aligned with the ISO C standard.
Your proposed implementation obviously does not provide a pseudo-random sequence. A quick definition of a pseudo-random generator could be "a generator that provides a uniform distribution", which yours clearly does not provide. Your values are too trivially correlated with each other.

Related

Summing a list of random numbers always gives the same result. Why? [duplicate]

I quite like being able to generate the same set of pseudo-random data repeatedly, especially with tweaking experimental code. Through observation I would say that rand() seems to give the same sequence of numbers each time*.
Is it guaranteed to do this for repeated executions on the same machine / for different machines / for different architectures?
*For the same seed obviously.
Yes, given the same environment for the program. From the C standard §7.20.2.2/2,
The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any calls to srand have been made, the same sequence shall be generated as when srand is first called with a seed value of 1.
Of course, this assumes it is using the same implementation detail (i.e. same machine, same library at the same execution period). The C standard does not mandate a standard random number generating algorithm, thus, if you run the program with a different C standard library, one may get a different random number sequence.
See the question Consistent pseudo-random numbers across platforms if you need a portable and guaranteed random number sequence with a given seed.
It is guaranteed to give the same sequence for the same seed passed to srand() - but only for the duration of a single execution of the program. In general, if an implementation has a choice in behaviour, there is no specific requirement for that choice to remain the same across subsequent executions.
It would be conforming for an implementation to pick a "master seed" at each program startup, and use that to perturb the pseudo-random number generator in a way that is different each time the program starts.
If you wish for more determinism, you should implement a PRNG with specific parameters in your program.
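One way to do that is to carry the generator in your own source. The sketch below follows the example generator printed in the C standard's description of srand; the names `my_srand` and `my_rand` are hypothetical, and holding the state in a `uint32_t` is my assumption to keep the arithmetic identical on every platform.

```c
#include <stdint.h>

static uint32_t my_state = 1;   /* same default as an unseeded rand(): seed 1 */

void my_srand(uint32_t seed)
{
    my_state = seed;
}

/* Returns a value in [0, 32767]. Arithmetic wraps modulo 2^32. */
int my_rand(void)
{
    my_state = my_state * 1103515245u + 12345u;
    return (int)((my_state / 65536u) % 32768u);
}
```

Since the algorithm and its parameters live in your program, the sequence no longer depends on which C library you link against. The statistical quality is modest, so treat this as a reproducibility aid rather than a good general-purpose generator.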
No.
The C standard says:
If srand is then called with the same
seed value, the sequence of
pseudo-random numbers shall be
repeated.
But nowhere does it say what the sequence of pseudo-random numbers actually is - so it differs across implementations.
The only guarantee made is that rand() will give the same sequence of numbers for a given seed for a given implementation. There's no guarantee that the sequence will be the same across different machines or different architectures - and it almost certainly won't be.
If you need to use the exact same set of pseudo-random numbers for experimental purposes, one thing you could do is seed once with srand, use rand to generate a long sequence of random numbers, and write them to a file or database. Then, write a portable "random number generator" function that returns values sequentially from that file. That way, you can be assured that you are using the same input data regardless of the platform, rand implementation, or seed value.
When switching to a different machine, runtime, or library, you might be out of luck. Another option is the drand48 family of functions, which are specified to use the same algorithm on all machines.
If you are in a UNIX/Linux environment, see the man pages for drand48() and srand48(); otherwise, consult an online manual.
The prototypes are declared in <stdlib.h> (/usr/include/stdlib.h).
drand48() uses a linear congruential method, which is frequently used in simulations.
If you pass the same seed to srand48(), e.g. srand48(2), and then call drand48() in a for loop, the sequence will be the same every time.
For example:
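#define _XOPEN_SOURCE 600  /* expose drand48() and srand48() */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int i;
    double rn;
    srand48(2);  /* fixed seed, so the output repeats on every run */
    for (i = 0; i < 10; i++) {
        rn = drand48();  /* uniform double in [0.0, 1.0) */
        printf("%.6f\n", rn);
    }
    return 0;
}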

Will the rand() function ALWAYS produce the same result with the same seed?

I am trying to implement a set of encode and decode steganography functions in C, where i use rand() to randomly scatter my data along an array.
I use rand to calculate a random index like so:
unsigned int getRandIndex(int size) {
return ((unsigned)(rand() * 8)) % size;
}
I seed rand like so:
unsigned long seed = time(NULL);
srand(seed);
I include the seed along with my data as a part of a header that contains a checksum and length.
The problem i have is that while decoding, when i seed the rand function again with the seed i decoded from the data, rand() tends to produce the slightest variations like so:
Index at encode: | Index at decode:
---------------------------------------------------------------------
.
.
.
At: 142568 | At: 142568
At: 155560 | At: 155552
-- --
At: 168184 | At: 168184
.
.
.
Messing up my decoded data.
Is this a limitation of the rand() function? I am 100% sure that the seed is being decoded correctly bit-for-bit as i have verified that.
C 2018 7.22.2.2 2 says:
The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated.
This does not explicitly say the sequence is the same in different executions of the program, but I take that as understood. It does not, however, extend to different C implementations, including those that result from linking in a different version of the C standard library.
OpenBSD's implementation of rand ignores all calls to srand and instead provides automatically seeded, cryptographically strong random numbers as if you were using arc4random; this is documented as an intentional deviation from the C standard. This demonstrates that even if your files are always decoded on the exact same computer that produced them, there's no guarantee rand will do what you want.
If you need reproducible sequences of pseudorandom numbers, you should do what industrial grade statistics software does, which is ship your own PRNG and document your choice of algorithm. (See for instance the ?Random help page for R.)
Also, since this is a steganography application, you need to use a cryptographically strong PRNG (also known as a stream cipher); merely statistically strong PRNGs like the Mersenne Twister are not good enough.
No. For reproducibility using rand (which is not exactly specified and inherently uses global state) is terrible for multiple reasons:
you might use a different compiler/system, which may use a different RNG,
you might use the same compiler, but updated to a new version, which uses a different RNG,
you might use the same compiler, same version, but with an updated libc, that uses a different RNG,
you use the same compiler and library version, but have any other non-deterministic call order of the RNG, including but not limited to:
a) some other source of randomness,
b) user input,
c) reordering of concurrency from run-to-run, or
d) any of the above in any of the libraries that you use.
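The call-order problems in the last item go away when each consumer owns its generator state instead of sharing rand()'s hidden global. Below is a minimal sketch using Marsaglia's xorshift32; the type and function names (`xs32`, `xs32_seed`, `xs32_next`) are hypothetical. It is not cryptographically strong, so it does not satisfy the steganography requirement mentioned in the previous answer.

```c
#include <stdint.h>

/* Hypothetical generator type: all state lives in the struct, none is global. */
typedef struct {
    uint32_t state;
} xs32;

void xs32_seed(xs32 *g, uint32_t seed)
{
    g->state = seed ? seed : 1u;   /* xorshift state must never be zero */
}

uint32_t xs32_next(xs32 *g)
{
    uint32_t x = g->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    g->state = x;
    return x;
}
```

Two `xs32` values seeded identically produce identical sequences no matter how calls to other generators are interleaved, and the algorithm does not change when the compiler or libc is updated.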

Generate random numbers based on seed value

Can someone explain to me what seed value is? For example I have this code:
int MIN=0;
int MAX=100;
srand((unsigned) time(NULL));
srand(2020);
int num = (int)(rand() / (RAND_MAX + 1.0 + MIN) * MAX);
I am required to use a seed value of 2020 to generate numbers between 0 to 100. I checked everywhere but seems like there is no tutorial that explains what seed value is, and how the code should change from the default seed value. Line 3 of my code assumes the default seed value, so it does not work with a seed value of 2020. Also, MAX is 100, and MIN is 0.
Thanks in advance!
The pseudo-random number generator is initialized using the argument passed as seed. For every different seed value used in a call to srand, the pseudo-random number generator can be expected to generate a different succession of results in the subsequent calls to rand.
http://www.cplusplus.com/reference/cstdlib/srand/
Then calling rand() will return a pseudo-random number in the range of 0 to RAND_MAX.
If you want rand() to produce different random numbers, ensure you are not calling srand(2020) each time you are calling rand(). As you only need to seed the random number generation once.
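A minimal sketch of "seed once, then call rand() repeatedly", scaled to the 0 to 100 range from the question. The helper name `fill_0_to_100` is hypothetical.

```c
#include <stdlib.h>

/* Fills out[0..n-1] with values in [0, 100], seeding exactly once. */
void fill_0_to_100(unsigned int seed, int *out, int n)
{
    srand(seed);                                  /* once, before the loop */
    for (int i = 0; i < n; i++) {
        /* rand() / (RAND_MAX + 1.0) lies in [0, 1), so the product is in [0, 101). */
        out[i] = (int)(rand() / (RAND_MAX + 1.0) * 101);
    }
}
```

Calling `fill_0_to_100(2020, values, 10)` twice fills both arrays with the same ten numbers on a given implementation, which is the point of a fixed seed.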
The seed has 2 usages. The most common one is to start a random generator with a different value. It matters when you want to produce a random value for a crypto algorithm. When you start the program, you do not want to use the same value on each run...
The other usage is to generate a reproducible pseudo-random sequence. It matters when you want reproducible tests, or when you want to exchange random data with others but want everyone to use the same sequence. It is very often used in tutorials to have a consistent sequence that the user will be able to reproduce when following the tutorial. It is also used by teachers because it is easier to see if students have found the correct result (if every student used a different sequence, all results would be different).
The rand function generates some sequence of numbers. (The sequence is calculated but is intended to serve as if the numbers were random, so they are called pseudo-random.) By itself, it will always generate the same sequence of numbers in a program. We use srand to choose which sequence of numbers it generates. (Most commonly, rand generates a cycle of the same numbers, if called enough times, and srand merely chooses where in that cycle we start.)
So, you call srand to set the starting point of the sequence for rand. The value you pass to srand is any unsigned int value. In the C standard, there is no specified meaning for the value. You just give srand a number like 1 to get one sequence, 2 to get a different sequence, 3 to get another sequence, and so on. You have no other control over the actual values generated; setting the seed to some particular value does not guarantee the first rand call will return any particular documented value.
The C standard also does not specify any method for calculating the numbers in the sequence. Each C implementation may choose its own method. Some C implementations use bad methods, in which the numbers are not very random-like at all, and some patterns can be easily observed. (For this reason, it is often recommended to use alternatives, like Unix’s srandom and random.)
When a program is being debugged or is being used as a class assignment, it is common to call srand with a fixed value, like srand(2020). This results in rand generating the same numbers each time. That makes it easier to debug the program or to check the results of student programs. When varying sequences of numbers are desired, it is common to call srand with srand(time(NULL)). Assuming the time is available (time(NULL) may return −1 for an error), this causes the program to use different sequences at different times. (Commonly, the value of time as an integer changes once per second, but this can vary depending on the C implementation.)
In C, srand takes a seed that determines the sequence of numbers rand will generate (also known as pseudorandom numbers), as long as the program doesn't change and the program uses the same implementation of rand and srand.
In C, srand and rand can't be relied on to reproduce the same sequence of pseudorandom numbers across implementations, because the standard that defines the C language doesn't specify what that sequence is, even if the seed is given. Notably, rand uses an unspecified pseudorandom number algorithm, and that algorithm can differ between C implementations, including versions of the same standard library. It's not the "compiler", the "system", or the "architecture" that decides which implementation of rand and srand is used. See also these questions:
How predictable is the result of rand() between individual systems?
Does Python have a function to mimic the sequence of C's rand()?
Why is the use of rand() considered bad?
To generate random numbers from a specific seed, call srand once to set the seed.
Then call rand to generate each number. The range is 0 to RAND_MAX, an implementation-defined constant that is at least 32767.
To limit the maximum value, reduce the result with the modulo operator %. Redefining RAND_MAX with preprocessor statements, as in the first snippet, does not work: the macro only reports the library's limit, so rand still returns values up to the original maximum.
Override RAND_MAX (no effect on rand)
#ifdef RAND_MAX
#undef RAND_MAX
#define RAND_MAX 100
#endif
Use Modulo
srand(2020);
int random_number = rand() % 100;
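Plain modulo slightly favors small results whenever RAND_MAX + 1 is not a multiple of the range. If that matters, a common remedy is rejection sampling; the following is a sketch, and the helper name `rand_below` is hypothetical.

```c
#include <stdlib.h>

/* Returns a value in [0, bound) without modulo bias.
   Requires 0 < bound <= RAND_MAX. */
int rand_below(int bound)
{
    /* Largest multiple of bound that does not exceed RAND_MAX. */
    int limit = RAND_MAX - (RAND_MAX % bound);
    int r;
    do {
        r = rand();          /* retry values from the uneven tail */
    } while (r >= limit);
    return r % bound;
}
```

After `srand(2020)`, `rand_below(100)` yields values from 0 to 99, each backed by the same number of possible rand() results.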

Does rand() indeed produce a random value when the random number generator has been seeded?

I know that in order to avoid repeating the same output of the rand() function a pseudo-random number generator must be seeded with the srand function. That means, if I try say srand(1), the output of the rand() will be one value, if I try srand(2), the output will contain another value. But when I try the first argument again like srand(1), the value will be the same as in the first output. This issue made me think that all random values would be predictable in some way. Is it possible to have different output for the same seed (say if I try the same seed tomorrow)? Or are random values predictable indeed?
With the traditional definition of a pseudorandom generator, if you know what the generator has been seeded with, then the sequence of output values is completely determined and not random. This means that if you knew the seed for a random generator, then you could predict every single output that generator would produce from that point forward. (A good random number generator is one where seeing a sequence of outputs of the generator does not let you easily reverse-engineer what the random seed is or predict other values.)
I seem to remember reading that, a while back, some popular poker websites were not doing a good job choosing their random seeds. Some people figured out that you could input the pattern of cards you were seeing, and the system could then reverse-engineer the random seed and let you predict all the future cards. Oops. These days, we have cryptographically secure pseudorandom generators based on encryption routines that, at least when it comes to what's known in the open literature, can't be predicted even if you have gigabytes of random bits of output from the generators.
If you do need to get something that really isn't predictable - that is, you want to get a bunch of truly random bits - you'll need to use something other than a pseudorandom number generator. Most operating systems have some mechanism in place to generate values that do appear to be truly random. They might, for example, look at how long it takes for different capacitors to discharge on the motherboard, or factor in timing information from a clock, or see how the user interacts with the keyboard, etc. These data can be fed into something called an entropy accumulator that slowly builds up more and more random bits. If you need a value that's truly random and can't be predicted in advance, you can check your particular OS for the mechanism used to get data from the entropy accumulator. (You can read from /dev/random on UNIX-style machines, for example.)
Often, pulling data from the entropy accumulator takes time, since the computer has to wait long enough for enough different sources to mix together to give you back high-quality random data. A common strategy, therefore, is to use the entropy accumulator to get a high-quality random seed, then "stretch" the randomness by using it as the seed of a strong pseudorandom generator.
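The "get a seed from the operating system" step can be sketched as follows on UNIX-style systems. The function name `seed_from_os` is hypothetical, and the sketch assumes the device /dev/urandom exists; on Windows a different mechanism would be needed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one unsigned int from /dev/urandom into *seed.
   Returns 0 on success, -1 if the device cannot be opened or read. */
int seed_from_os(unsigned int *seed)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(seed, sizeof *seed, 1, f);
    fclose(f);
    return got == 1 ? 0 : -1;
}
```

The value obtained this way would then be handed to the seeding routine of whatever strong generator you use. Passing it to srand() works mechanically, but rand() itself is not a strong generator, so that only makes the starting point unpredictable, not the sequence.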
Here is the language of the C Standard:
7.22.2 Pseudo-random sequence generation functions
7.22.2.1 The rand function
Synopsis
#include <stdlib.h>
int rand(void);
The rand function computes a sequence of pseudo-random integers in the range 0 to RAND_MAX inclusive.
The rand function is not required to avoid data races with other calls to pseudo-random sequence generation functions. The implementation shall behave as if no library function calls the rand function.
Returns
The rand function returns a pseudo-random integer.
Environmental limits
The value of the RAND_MAX macro shall be at least 32767.
7.22.2.2 The srand function
Synopsis
#include <stdlib.h>
void srand(unsigned int seed);
The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any calls to srand have been made, the same sequence shall be generated as when srand is first called with a seed value of 1.
The srand function is not required to avoid data races with other calls to pseudo-random sequence generation functions. The implementation shall behave as if no library function calls the srand function.
Returns
The srand function returns no value.
In other words, rand() returns a pseudo-random sequence of integers between 0 and RAND_MAX. The sequence is not random, it is predictable for every value passed to srand(), including if srand() is never called.
In order to try and get different sequences for successive runs of the program, srand() can be called with a rapidly varying value, such as the return value of clock(). Note that calling srand(time(NULL)) will produce the same sequence for multiple runs of the program during the same second.

Random Number In C Independent Of Time

Is there a way to generate random numbers in the C language independent of time?
The idea is that I want to generate an array of random numbers all at once, but since the rand() function depends on time, all the values in the array come out alike.
rand() doesn't depend on time. People typically seed their pseudo-random number generator using the current time (through the srand() function), but they don't have to. You can just pass whatever number you want to srand().
If your random numbers aren't of a high enough quality for your purposes (libc's rand is notorious for its inadequacy), you should look at other sources of randomness. On most operating systems, you can get high-quality random data just by reading from /dev/random (or /dev/urandom), and the Windows API provides CryptGenRandom. There are also a lot of cross-platform libraries that provide high-quality PRNGs; OpenSSL is one of them.
rand() generates values sequentially (in a time-sequence), but does not depend upon time (as in "time of day"), unless you seed the generator with srand(time(NULL)). If you don't do this, it behaves as if it had been seeded with 1 (one).
There's also rand_r() (POSIX), which keeps its state in an unsigned int that the caller supplies through a pointer, instead of in hidden global state. You could use it to coordinate multiple streams of random numbers, by saving and restoring the appropriate seed values.
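A sketch of the multiple-streams idea with rand_r(), which takes a pointer to the caller's own state variable. The helper name `streams_are_independent` is hypothetical. With a strict mode such as `-std=c11` you may need to define `_POSIX_C_SOURCE` before including `<stdlib.h>` for the declaration to be visible.

```c
#include <stdlib.h>

/* Returns 1 if two rand_r() streams started from the same seed stay in step
   while a third, unrelated stream is advanced between their calls. */
int streams_are_independent(unsigned int seed, int n)
{
    unsigned int a = seed, b = seed, other = seed + 1u;
    for (int i = 0; i < n; i++) {
        int va = rand_r(&a);
        (void)rand_r(&other);   /* touches only its own state */
        int vb = rand_r(&b);
        if (va != vb)
            return 0;
    }
    return 1;
}
```

Saving a stream amounts to copying its `unsigned int`; restoring it is assigning the copy back.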
For a non-deterministic seed without using time(NULL) you'll probably have to resort to a system-specific source (/dev/random on unix).
Whatever you do, don't write something like the following and use myrand() as a replacement for rand(). It returns the same value for every call made within the same clock second.
unsigned myrand() { // BAD! NO!
srand(time(NULL)); // re-seeding destroys the properties of `rand()`
return rand();
}
If you call srand(), it should be just once at the beginning of the program.
The sequential determinism of rand() is actually a very useful property for testing programs. What you get is an (almost-)random, but repeatable sequence. If you print out the seed value at the start of the program, you can re-use the same value to produce the same results (for example, to reproduce a run that failed).

Resources