What is a pseudo-random integer? - c

I am reading a C book. In the description of the rand() function, they say:
rand returns a pseudo-random integer in the range 0 to RAND_MAX.  RAND_MAX is implementation dependent but at least 32767.
I don't understand; what is a "pseudo-random integer"?
Thanks.

Informally, a pseudorandom number is a number that isn't truly random, but is "random enough" for most purposes.
Computers are inherently deterministic devices. The processor executes specific commands in a specific order, and programs control how the processor does so. Consequently, it's hard for programs to generate random numbers because no deterministic process can create a random number. Thus what many programs do is use a pseudorandom number generator, which is a function that produces numbers according to some deterministic formula that appear to be random but actually are not. Most programming languages provide some sort of pseudorandom number generator for general programming use, and when true randomness isn't needed they work just fine.
However, they have their limitations. In cryptographic settings, in many cases true randomness is required in order to prevent attackers from guessing the workings of a system and compromising it, for example. In this case, it is possible to get truly random numbers by using specialized hardware that can amplify background noise or use quantum effects. This sort of randomness is extremely hard to generate, though, and so it's not commonly used unless absolute unpredictability is required.

From wiki entry : Pseudorandom number generator
A pseudorandom number generator
(PRNG), also known as a deterministic
random bit generator (DRBG), is an
algorithm for generating a sequence of
numbers that approximates the
properties of random numbers.
In other words, it's approximately random (to a known extent), but not truly random in the sense of random noise from a physical source.

It means that the number may seem random, but in truth a computer can't generate a truly random number; it works out a number using logic whose output tends to look random.
For example, it may observe the time difference between keystrokes and use that as just one input in calculating a number. Such things, combined with several other inputs, tend to give random-looking numbers, but in reality it's just an algorithm. If the same conditions are matched, it will give the same number.

To add some pedantry to the other correct answers that you've already received, there really is no such thing as a "pseudorandom integer" (or for that matter, a random integer). This is the point that John von Neumann was making in his famous quote (which is usually misleadingly abbreviated to just the first sentence):
"Any one who considers arithmetical
methods of producing random digits is,
of course, in a state of sin. For, as
has been pointed out several times,
there is no such thing as a random
number — there are only methods to
produce random numbers, and a strict
arithmetic procedure of course is not
such a method"
Consider the number 7. Is it a random integer? If it is, is it a pseudorandom integer or a true random integer? What about the number -10? Clearly these questions don't make sense (obligatory link to xkcd 221).
So, as others have pointed out, what the book actually means is that the number is generated by a pseudorandom (i.e. deterministic) number generator as opposed to being obtained from a truly random sequence. There are several different pseudorandom number generator algorithms and the best ones generate output that is statistically indistinguishable from a true random sequence.

It means that it generates a number sequence that exhibits the properties of a random number in terms of distribution, but which is mathematically generated and deterministic in its output for any particular seed value.
You can see this by creating a program that uses such a function to generate a short sequence, and observing that each time you run it, it generates the same sequence. This behaviour surprises the unsuspecting, but is also quite a useful property in testing code. When a less predictable sequence is required, one normally initialises the PRNG with a seed value that is itself unpredictable, often derived from the current system time, since it is not normally predictable when a program will be started.
Pseudo random number generators are not normally suitable for security applications such as data encryption, or on-line gambling applications, where their ultimate predictability could be a serious weakness.
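The repeatability described above is easy to check. Here is a minimal sketch; the helper names fill_from_seed and same_seed_repeats are made up for illustration and are not part of any library:

```c
#include <stdlib.h>

/* Fill out[0..n-1] with the first n values rand() produces after seeding. */
void fill_from_seed(unsigned int seed, int *out, size_t n)
{
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        out[i] = rand();
    }
}

/* Return 1 if two runs started from the same seed give identical sequences. */
int same_seed_repeats(unsigned int seed)
{
    int first[8], second[8];

    fill_from_seed(seed, first, 8);
    fill_from_seed(seed, second, 8);
    for (size_t i = 0; i < 8; i++) {
        if (first[i] != second[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling same_seed_repeats with any seed should return 1, which is exactly why a fixed seed is handy in tests and why a guessable seed is a problem for security.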

In most computers, a random number is considered pseudorandom because it is not completely random.
pseudo=fake
Basically, the current time is typically used to seed the generator, so anyone who knows the start time can predict which number will be generated first; for most applications, this is fine.

Related

How to generate either 0 or 1 randomly in C

I have read so many posts on this topic:
How does rand() work? Does it have certain tendencies? Is there something better to use?
How does the random number generator work in C
and this is what I got:
1) xn+1 depends on xn i.e., previous random number that is generated.
2) It is not recommended to initialize the seed more than once in the program.
3) It is a bad practice to use rand()%2 to generate either 0 or 1 randomly.
My questions are:
1) Are there any other libraries that I missed to take a look to generate a completely random number (either 0 or 1) without depending on previous output?
2) If there is any other work around using the inbuilt rand() function to satisfy the requirement?
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a bad practice I suppose?
So, can anyone please answer the above questions? I apologize if I completely missed an obvious thing.
Are there any other libraries that I missed to take a look to generate a completely random number between 0 and 1 without depending on previous output?
Not in the standard C library. There are lots of other libraries which generate "better" pseudo-random numbers.
If there is any other work around using the inbuilt rand() function to satisfy the requirement?
Most standard library implementations of rand produce sequences of random numbers where the low-order bit(s) have a short sequence and/or are not as independent of each other as one would like. The high-order bits are generally better distributed. So a better way of using the standard library rand function to generate a random single bit (0 or 1) is:
(rand() > RAND_MAX / 2)
or use an interior bit:
((rand() & 0x400U) != 0)
Those will produce reasonably uncorrelated sequences with most standard library rand implementations, and impose no more computational overhead than checking the low-order bit. If that's not good enough for you, you'll probably want to research other pseudo-random number generators.
All of these (including rand() % 2) assume that RAND_MAX is odd, which is almost always the case. (If RAND_MAX were even, there would be an odd number of possible values and any way of dividing an odd number of possible values into two camps must be slightly biased.)
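Wrapped up as functions, the two approaches above might look like the sketch below. The function names are hypothetical, and bit 10 is just the interior bit picked in the text. Note the parentheses around the mask: in C, != binds more tightly than &, so leaving them out would silently test the low-order bit instead.

```c
#include <stdlib.h>

/* One pseudo-random bit taken from the top half of rand()'s range. */
int random_bit_high(void)
{
    return rand() > RAND_MAX / 2;
}

/* One pseudo-random bit taken from an interior bit (bit 10) of rand(). */
int random_bit_interior(void)
{
    return (rand() & 0x400) != 0;
}
```

Both return 0 or 1 and cost about the same as rand() % 2.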
What is the side effect of initializing the seed more than once in a program?
You should think of the random number generator as producing "not very random" numbers after being seeded, with the quality improving as you successively generate new random numbers. And remember that if you seed the random number generator using some seed, you will get exactly the same sequence as you will the next time you seed the generator with the same seed. (Since time() returns a number of seconds, two successive calls in quick succession will usually produce exactly the same number, or very occasionally two consecutive numbers. But definitely not two random uncorrelated numbers.)
So the side effect of reseeding is that you get less random numbers, and possibly exactly the same ones as you got the last time you reseeded.
1) Are there any other libraries that I missed to take a look to
generate a completely random number between 0 and 1 without depending
on previous output?
This sub-question is off-topic for Stack Overflow, but I'll point out that POSIX and BSD systems have an alternative random number generator function named random() that you could consider if you are programming for such a platform (e.g. Linux, OS X).
2) If there is any other work around using the inbuilt rand() function
to satisfy the requirement?
Traditional computers (as opposed to quantum computers) are deterministic machines. They cannot do true randomness. Every completely programmatic "random number generator" is in practice a pseudo-random number generator. They generate completely deterministic sequences, but the values from a given set of calls are distributed across the generator's range in a manner approximately consistent with a target probability distribution (ordinarily the uniform distribution).
Some operating systems provide support for generating numbers that depend on something more chaotic and less predictable than a computed sequence. For instance, they may collect information from mouse movements, CPU temperature variations, or other such sources, to produce more objectively random (yet still deterministic) numbers. Linux, for example, has such a driver that is often exposed as the special file /dev/random. The problem with these is that they have a limited store of entropy, and therefore cannot provide numbers at a sustained high rate. If you need only a few random numbers, however, then that might be a suitable source.
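As a rough sketch of reading from such a device: the code below assumes a Linux-like system and uses /dev/urandom, the non-blocking sibling of /dev/random, so that the example never stalls waiting for entropy. The function name os_random_bit is invented for this example.

```c
#include <stdio.h>

/* Read one byte from the kernel's random device and return its low bit.
 * Returns -1 if the device cannot be opened or read.
 * Assumption: the system exposes /dev/urandom (Linux and similar). */
int os_random_bit(void)
{
    unsigned char byte;
    size_t got;
    FILE *f = fopen("/dev/urandom", "rb");

    if (f == NULL) {
        return -1;
    }
    got = fread(&byte, 1, 1, f);
    fclose(f);
    if (got != 1) {
        return -1;
    }
    return byte & 1;
}
```

Opening the file for every bit is wasteful; if you need more than a handful of bits, read a larger buffer once and hand the bits out from it.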
3) What is the side effect of initializing the seed more than once in
a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each
other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But as I mentioned earlier which is based on other posts, this is a
bad practice I suppose?
It is indeed bad if you want d1 and d2 to have a 50% probability of being different. time() returns the number of seconds since the epoch, so it is highly likely that it will return the same value when called twice so close together. The sequence of pseudorandom numbers is completely determined by the seed (this is a feature, not a bug), and when you seed the PRNG, you restart the sequence. Even if you used a higher-resolution clock to make the seeds more likely to differ, you don't escape correlation this way; you just change the function generating numbers for you. And the result does not have the same guarantees for output distribution.
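To make the reseeding problem concrete, here is a small sketch. It takes one captured timestamp and uses it for both srand() calls, standing in for two time(NULL) calls that land in the same second; the function name is hypothetical.

```c
#include <stdlib.h>
#include <time.h>

/* Reseed before each draw, as in the second snippet of the question.
 * Returns 1 if the two draws came out equal. */
int reseeded_draws_match(time_t now)
{
    int d1, d2;

    srand((unsigned int)now);
    d1 = rand() % 2;
    srand((unsigned int)now);   /* same seed: the sequence restarts */
    d2 = rand() % 2;
    return d1 == d2;
}
```

With the same seed both times, d1 and d2 are always equal, so the function returns 1 no matter what timestamp you pass in.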
Additionally, when you do rand() % 2 you use only one bit of the approximately log2(RAND_MAX) + 1 bits that it produced for you. Over the whole period of the PRNG, you can expect that bit to take each value the same number of times, but over narrow ranges you may sometimes see some correlation.
In the end, your requirement for your two random numbers to be completely independent of one another is probably way overkill. It is generally sufficient for the pseudo-random result of one call to have no apparent correlation with the results of previous calls. You probably achieve that well enough with your first code snippet, even despite the use of only one bit per call. If you prefer to use more of the bits, though, then with some care you could base the numbers you choose on the parity of the count of how many bits are set in the values returned by rand().
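The parity idea could be sketched like this. The function names are made up for the example, and the result is only evenly split between 0 and 1 on the assumption that RAND_MAX has the form 2^k - 1, so that every bit pattern of that width is a possible return value.

```c
#include <stdlib.h>

/* Parity (0 or 1) of the number of set bits in v. */
int bit_parity(unsigned int v)
{
    int parity = 0;

    while (v != 0) {
        parity ^= 1;
        v &= v - 1;   /* clear the lowest set bit */
    }
    return parity;
}

/* One pseudo-random bit that depends on every bit rand() returned. */
int random_bit_parity(void)
{
    return bit_parity((unsigned int)rand());
}
```

For instance, bit_parity(7u) is 1 (three bits set) and bit_parity(3u) is 0 (two bits set).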
Use this
(double)rand() / (double)RAND_MAX
completely random number ..... without depending on previous output?
Well in reality computers can't generate completely random numbers. There has to be some dependencies. But for almost all practical purposes, you can use rand().
side effect of initializing the seed more than once
No side effect. But that would mean you're completely defeating the point of using rand(). If you re-initialize the seed every time, the random number depends more on the time (and the processor).
any other work around using the inbuilt rand() function
You can write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void)
{
    /* Seed once, then print two values scaled into the range 0 to 1. */
    srand((unsigned int)time(NULL));
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    printf("%f\n", (double)rand() / (double)RAND_MAX);
    return 0;
}
If you want to generate either a 0 or a 1, I think using rand() % 2 is perfectly fine, as the probability of an even number is the same as the probability of an odd number (all numbers are equally likely for an unbiased random number generator).

What is a suitable replacement for rand()?

As far as I know rand() does not generate a uniform random distribution. What function/algorithm will allow me to do so? I have no need for cryptographic randomness, only a uniform random distribution. Lastly, what libraries provide these functions?
Thanks!
rand() does generate a uniform (pseudo-)random distribution.
The actual requirement, from the C standard, section 7.20.2.1, is:
The rand function computes a sequence of pseudo-random integers in
the range 0 to RAND_MAX.
where RAND_MAX is at least 32767. That's admittedly vague, but the intent is that it gives you a uniform distribution -- and in practice, that's what implementations actually do.
The standard provides a sample implementation, but C implementations aren't required to use it.
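From memory, that sample implementation is along the lines of the sketch below. The functions are renamed my_rand and my_srand here (hypothetical names) so they do not clash with the library's own, and the state variable name is also mine. The sample assumes RAND_MAX is 32767.

```c
/* State shared by the two functions; the sample starts it at 1. */
static unsigned long next_state = 1;

/* Advance the state with a linear congruential step and return
 * 15 bits taken from above the low 16 bits of the state. */
int my_rand(void)
{
    next_state = next_state * 1103515245UL + 12345UL;
    return (int)((next_state / 65536UL) % 32768UL);
}

/* Restart the sequence from the given seed. */
void my_srand(unsigned int seed)
{
    next_state = seed;
}
```

It is a plain linear congruential generator: the whole "randomness" is one multiply, one add, and a shift, which is why the same seed always replays the same sequence.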
In practice, there are certainly better random number generators out there. And one specific requirement for rand() is that it must produce exactly the same sequence of numbers for a given seed (argument to srand()). Your description doesn't indicate that that would be a problem for you.
One problem is that rand() gives you uniformly distributed numbers in a fixed range. If you want numbers in a different range, you have to do some extra work. For example, if RAND_MAX is 32767, then rand() can produce 32768 distinct values; you can't get random numbers in the range 0..9 without discarding some values, since there's no way to evenly distribute those 32768 distinct values into 10 equal sized buckets.
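The "discard some values" step is usually done by rejection. A minimal sketch, with a made-up function name, assuming 0 < n <= RAND_MAX:

```c
#include <stdlib.h>

/* Uniform integer in [0, n) built on rand(); requires 0 < n <= RAND_MAX. */
int uniform_below(int n)
{
    /* How many distinct values rand() can return. */
    unsigned long range = (unsigned long)RAND_MAX + 1UL;
    /* Largest multiple of n within the range; draws at or above it are discarded. */
    unsigned long limit = range - range % (unsigned long)n;
    unsigned long r;

    do {
        r = (unsigned long)rand();
    } while (r >= limit);
    return (int)(r % (unsigned long)n);
}
```

With RAND_MAX equal to 32767 and n equal to 10, limit works out to 32760, so the eight values 32760 through 32767 are thrown away and each digit gets exactly 3276 of the remaining ones.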
Other PRNGs are likely to give you better results than rand(), but they're still probably going to be subject to the same issues.
As usual, the comp.lang.c FAQ answers this better than I did; see questions 13.15 through 13.21.
Here's an article and a stand-alone random number generator written in C#. The code is very small and easily portable to C++ etc.
Whenever this subject comes up, someone responds that you should not use your own random number generator but should leave that up to specialists. I respond that you should not come up with your own algorithm. Leave that up to specialists because it is indeed very subtle. But it's OK and even beneficial to have your own implementation. That way you know what's being done, and you could use the same method across languages or platforms.
The algorithm in that article is by George Marsaglia, a top expert in random number generation. Even though the code is tiny, the method holds up well to standard tests.
The BSD random() function (included in the XSI option of POSIX/SUS) is almost universally available and much better than rand on most systems (except some where rand actually uses random and thus they're both pretty good).
If you'd rather go outside the system libraries, here's some good information on your choices:
http://guru.multimedia.cx/category/pseudo-random-number-generators/
(From Michael Niedermayer of FFmpeg fame.)
Well, the question of whether or not an actual pseudorandom generator exists is still open. That being said, a quick search reveals that there may be some slightly better alternatives.

Generating totally random numbers without random function? [duplicate]

Possible Duplicate:
True random number generator
I was talking to a friend the other day and we were trying to figure out whether it is possible to generate completely random numbers without the help of a random function. In C, for example, "rand" generates pseudo-random numbers. Or we can use something like "srand( time( NULL ) );", which lets the computer read numbers from its clock as seed values. So if I understand everything I have read so far correctly, I am pretty sure that no random function actually produces truly random numbers. How would one write a program that generates numbers that are completely random, and what would the code look like?
Check out this question:
True random number generator
Also, from wikipedia's entry on pseudorandom numbers
As John von Neumann joked, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."
The excellent random.org website provides hardware-based random numbers as well as a number of software interfaces to retrieve these.
This can be used e.g. for genuinely unpredictable seeds or for 'true' random numbers. Being a web service, there are limits on the number of draws you can make, so don't try to use this for your graduate school monte carlo simulation.
FWIW, I wrapped one of those interface in the R package random.
It would look like:
int random = CallHardwareRandomGenerator();
Even with hardware, randomness is tricky. There are things which are physically random (atomic decay is random, but with predictable average amounts, so that can be used as a source of random information), and there are things that are physically random enough to make prediction impractical (this is how casinos make money).
There are things that are largely indeterminate (mix up information from key-stroke rate, mouse-movements, and a few things like that), which are a good-enough source of "randomness" for many uses.
Mathematically, we cannot produce randomness, but we can improve distribution and make something harder to predict. Cryptographic PRNGs do a stronger job at this than most, but are more expensive in terms of resources.
This is more of a physics question, I think. If you think about it, nothing is random; it just occurs due to events whose complexity makes them unpredictable to us. A computer is a subsystem just like any other in the universe, and by giving it unpredictable external inputs (RTC, I/O garbage) we can get the same kind of randomness that a roulette wheel gets from varying friction, air resistance, initial impulse and millions of factors that I can't wrap my head around.
There's room for a fair amount of philosophical debate about what "truly random" really even means. From a practical viewpoint, even sources we know aren't truly random can be used in ways that produce what are probably close enough for almost any practical purpose though (in particular, that at least with current technology, full knowledge of the previously produced bitstream appears to be insufficient to predict the next bit accurately). Most of those do involve a bit of extra hardware though -- for example, it's pretty easy to put a source together from a little bit of Americium out of a smoke detector.
There are quite a few more sources as well, though they're mostly pretty low bandwidth (e.g., collect one bit for each keystroke, based on whether the interval between keystrokes was an even or odd number of CPU clocks -- assuming the CPU clock and keyboard clock are derived from separate crystals). OTOH, you have to be really careful with this -- a fair number of security holes (e.g., in Netscape around v. 4.0 or so) have stemmed from people believing that such sources were a lot more random than they really were.
While there are a number of web sites that produce random numbers from hardware sources, most of them are useless from a viewpoint of encryption. Even at best, you're just trusting your SSL (or TLS) connection to be secure so nobody captured the data you got from the site.

Silverlight System.Random sequencing cross platform

I need to know the effect of different platforms on the System.Random object (Silverlight). Is the sequence created the same on Mac, PC and across 32 / 64 bit?
Excuse my "stupid" answer, but to my mind, Random numbers should be always considered random and thus the created sequences should be handled as NOT same across any "domain". I know that the .NET (or Silverlight) random number generators use a Pseudo-random algorithm depending on the seed value and will generate the same number sequence when using the same seed value, but I just wouldn't rely on this fact.
It seems that you have some kind of "expectation" when you need to have random numbers synchronized across several platforms, and using a Random Number generator for expected value sequences looks weird to me.
If you can tell us more about your use case, maybe we can find another more solid solution?
Just my opinion.
The algorithm to generate the random numbers is encoded into the runtime. Hence regardless of the platform you should see the same set of "random" numbers for a given seed value.
The exact behaviour of the default constructor for Random (where the seed value is time based) may vary slightly from platform to platform. For example, rapid creation of instances of Random may create some instances that generate the same sequence, and the distribution of these "duplicates" may vary with all sorts of conditions, including the platform.

How is the lagged fibonacci generator random?

I don't get it. If it has a fixed length, choosing the lags and the mod over and over again will give the same number, no?
To be precise, the lagged Fibonacci is a pseudo-random number generator. It's not true random, but it's much better than, say, the more commonly used linear congruential generator (the standard generator for C++, Java, etc). I'm not sure why you think it will give the same number all over again, but it's true that, like all pseudo-random number generators, it has a period after which the sequence of numbers will repeat.
The multiplicative LFG has a maximum period of (2^k - 1)*2^(M-3), where k is the longer lag and the modulus is 2^M. For practical parameters, this is actually quite huge (an LCG's period is at most its modulus).
The only catch with LFG is that the initialization procedure is very complex, and the mathematics behind it is incomplete. It's best to consult the literature for good choice of parameters and recommended procedure for proper seeding.
As an illustration, a multiplicative LFG with parameters (j=31, k=52) and modulus m=2^32 is seeded with an array of 52 32-bit numbers.
Additional references:
http://sprng.fsu.edu/Version4.0/generators.html
More details on this generator and the seeding algorithms can be found in papers by Mascagni, et al.
It's not random, it's pseudorandom
From this http://en.wikipedia.org/wiki/Lagged_Fibonacci_generator
Lagged Fibonacci generators have a maximum period of (2^k - 1)*2^(M-1) if addition or subtraction is used, and (2^k-1) if exclusive-or operations are used to combine the previous values. If, on the other hand, multiplication is used, the maximum period is (2^k - 1)*2^(M-3), or 1/4 of period of the additive case.
So, given a certain seed value, the sequence of output values is predictable and repeatable, and it has a cycle. It will repeat if you wait long enough - but the cycle is quite large.
For an observer that doesn't know the seed value, the sequence appears to be quite random so it can be useful as a source of "randomness" for simulations and other situations where true randomness isn't required.
It's random in the same way that any pseudorandom number generator is--which is to say, not at all.
However, lagged fibonacci (and all linear feedback shift register PRNGs) improve on a basic linear congruential generator by increasing the state size. That is, the next value depends on several former values, rather than just the immediate previous one. Combined with a decent seed you should be able to get fairly decent results.
Edit:
From your post, it isn't clear that you understand that the underlying state is stored in a shift register, meaning that it isn't static but updated (by shifting each value one place to the left, dropping the leftmost value, and appending the most recent value on the right side) after each draw. In this way, drawing the same number over & over again is avoided (for most seed values, at least).
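The updating state described above can be sketched as a small additive lagged Fibonacci generator. Everything named here (lfg_t, lfg_seed, lfg_next) is made up for illustration; the lags (7, 10) are a small pair chosen to keep the example short, and the seeding is an ad hoc stand-in rather than a recommended procedure. A ring buffer plays the role of the shift register, so nothing is physically shifted.

```c
#include <stdint.h>

#define LFG_J 7    /* short lag (illustrative choice) */
#define LFG_K 10   /* long lag; also the number of state words */

typedef struct {
    uint32_t state[LFG_K];  /* the last LFG_K outputs, kept as a ring buffer */
    int pos;                /* index of the oldest value, x[n-K] */
} lfg_t;

/* Fill the state from one seed with a simple LCG (an ad hoc choice),
 * and force one word odd so the state is not all even. */
void lfg_seed(lfg_t *g, uint32_t seed)
{
    for (int i = 0; i < LFG_K; i++) {
        seed = seed * 1664525u + 1013904223u;
        g->state[i] = seed;
    }
    g->state[0] |= 1u;
    g->pos = 0;
}

/* x[n] = x[n-J] + x[n-K] (mod 2^32); the new value overwrites the oldest. */
uint32_t lfg_next(lfg_t *g)
{
    int k_idx = g->pos;                            /* slot holding x[n-K] */
    int j_idx = (g->pos + LFG_K - LFG_J) % LFG_K;  /* slot holding x[n-J] */
    uint32_t x = g->state[j_idx] + g->state[k_idx];

    g->state[k_idx] = x;
    g->pos = (g->pos + 1) % LFG_K;
    return x;
}
```

Because each call replaces one word of the state, the inputs to the next call are different, which is why the same lags and modulus do not keep producing the same number.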
It all depends on the seed. Most random number generators do give the same sequence of numbers for a fixed seed value.
Random number generators are deterministic functions: for a given input (the seed) they always produce the same output. To make the result "random" you have to feed it a seed that is itself "random", like the system time or the values of computer memory locations, for example.
If you're wondering why you don't just straight up use the seed (the time, etc.), it's because the time is sequential (1,2,3,4) whereas most pseudorandom number generators spit out numbers that appear random (8, 27, 13, 1). That way if you're generating pseudorandom numbers in a loop (which happens very fast), you're not just getting {1,2,3,4}...

Resources