Silverlight System.Random sequencing cross platform - silverlight

I need to know the effect of different platforms on the System.Random object (Silverlight). Is the sequence created the same on Mac, PC and across 32 / 64 bit?

Excuse my "stupid" answer, but to my mind random numbers should always be treated as random, and the generated sequences should therefore be assumed NOT to be the same across any "domain". I know that the .NET (and Silverlight) random number generators use a pseudo-random algorithm driven by the seed value and will generate the same number sequence when given the same seed, but I just wouldn't rely on that fact.
It sounds like you have some kind of "expectation" if you need random numbers synchronized across several platforms, and using a random number generator to produce an expected sequence of values seems odd to me.
If you can tell us more about your use case, maybe we can find another more solid solution?
Just my opinion.

The algorithm to generate the random numbers is encoded into the runtime. Hence regardless of the platform you should see the same set of "random" numbers for a given seed value.
The exact behaviour of the default constructor for Random (where the seed value is time based) may vary slightly from platform to platform. For example, rapid creation of Random instances may produce some instances that generate the same sequence, and the distribution of these "duplicates" may vary with all sorts of conditions, including the platform.

Related

How to generate either 0 or 1 randomly in C

I have read so many posts on this topic:
How does rand() work? Does it have certain tendencies? Is there something better to use?
How does the random number generator work in C
and this is what I got:
1) x(n+1) depends on x(n), i.e., on the previously generated random number.
2) It is not recommended to initialize the seed more than once in the program.
3) It is a bad practice to use rand()%2 to generate either 0 or 1 randomly.
My questions are:
1) Are there any other libraries I might have overlooked that can generate a completely random number (either 0 or 1) without depending on the previous output?
2) Is there any workaround using the built-in rand() function that satisfies the requirement?
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But, as I mentioned earlier based on other posts, this is bad practice, I suppose?
So, can anyone please answer the above questions? I apologize if I completely missed an obvious thing.
Are there any other libraries I might have overlooked that can generate a completely random number between 0 and 1 without depending on the previous output?
Not in the standard C library. There are lots of other libraries which generate "better" pseudo-random numbers.
Is there any workaround using the built-in rand() function that satisfies the requirement?
Most standard library implementations of rand produce sequences of random numbers where the low-order bit(s) have a short period and/or are not as independent of each other as one would like. The high-order bits are generally better distributed. So a better way of using the standard library rand function to generate a random single bit (0 or 1) is:
(rand() > RAND_MAX / 2)
or use an interior bit:
((rand() & 0x400U) != 0)
Those will produce reasonably uncorrelated sequences with most standard library rand implementations, and impose no more computational overhead than checking the low-order bit. If that's not good enough for you, you'll probably want to research other pseudo-random number generators.
All of these (including rand() % 2) assume that RAND_MAX is odd, which is almost always the case. (If RAND_MAX were even, there would be an odd number of possible values and any way of dividing an odd number of possible values into two camps must be slightly biased.)
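For illustration, here's a minimal sketch of wrapping the high-order-bit comparison in a helper (the name random_bit is just illustrative, not a standard function):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Return 0 or 1 using a high-order comparison rather than the low-order bit. */
static int random_bit(void)
{
    return rand() > RAND_MAX / 2;
}

int main(void)
{
    srand((unsigned)time(NULL));   /* seed once, at program start */
    int d1 = random_bit();
    int d2 = random_bit();
    printf("%d %d\n", d1, d2);
    return 0;
}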
What is the side effect of initializing the seed more than once in a program?
You should think of the random number generator as producing "not very random" numbers right after being seeded, with the quality improving as you successively generate new random numbers. And remember that seeding the generator with a given seed always restarts exactly the same sequence you got the last time you used that seed. (Since time() returns a number of seconds, two calls in quick succession will usually produce exactly the same number, or very occasionally two consecutive numbers. But definitely not two random, uncorrelated numbers.)
So the side effect of reseeding is that you get less random numbers, and possibly exactly the same ones as you got the last time you reseeded.
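Here is a minimal sketch of that failure mode (assuming, as is almost always the case, that both srand calls land within the same second):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand((unsigned)time(NULL));
    int d1 = rand() % 2;

    srand((unsigned)time(NULL));   /* almost certainly the same seed again */
    int d2 = rand() % 2;

    /* d1 and d2 will nearly always be equal, because the second srand
       restarted the very same sequence from the beginning. */
    printf("%d %d\n", d1, d2);
    return 0;
}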
1) Are there any other libraries I might have overlooked that can generate a completely random number between 0 and 1 without depending on the previous output?
This sub-question is off-topic for Stack Overflow, but I'll point out that POSIX and BSD systems have an alternative random number generator function named random() that you could consider if you are programming for such a platform (e.g. Linux, OS X).
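As a rough sketch of what that looks like (random() and srandom() are declared in <stdlib.h> on most POSIX systems, though some require a feature-test macro such as _DEFAULT_SOURCE):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srandom((unsigned)time(NULL));   /* seed the POSIX generator once */
    int bit = random() % 2;          /* random()'s low-order bits are generally better behaved than rand()'s */
    printf("%d\n", bit);
    return 0;
}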
2) Is there any workaround using the built-in rand() function that satisfies the requirement?
Traditional computers (as opposed to quantum computers) are deterministic machines. They cannot do true randomness. Every completely programmatic "random number generator" is in practice a pseudo-random number generator. They generate completely deterministic sequences, but the values produced by a given set of calls are distributed across the generator's range in a manner approximately consistent with a target probability distribution (ordinarily the uniform distribution).
Some operating systems provide support for generating numbers that depend on something more chaotic and less predictable than a computed sequence. For instance, they may collect information from mouse movements, CPU temperature variations, or other such sources to produce more objectively random numbers (even though the post-processing of that entropy is still deterministic). Linux, for example, has such a driver that is often exposed as the special file /dev/random. The problem with these is that they have a limited store of entropy, and therefore cannot provide numbers at a sustained high rate. If you need only a few random numbers, however, then that might be a suitable source.
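For example, here is a minimal sketch of pulling one random bit from such a device on Linux (using /dev/urandom here, which, unlike /dev/random, does not block when the entropy pool runs low):

#include <stdio.h>

int main(void)
{
    unsigned char byte;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    if (fread(&byte, 1, 1, f) != 1) {
        fclose(f);
        return 1;
    }
    fclose(f);
    printf("%d\n", byte & 1);   /* a single 0/1 drawn from the kernel's pool */
    return 0;
}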
3) What is the side effect of initializing the seed more than once in a program?
Code snippet:
srand(time(NULL));
d1=rand()%2;
d2=rand()%2;
Here my intention is to make d1 and d2 completely independent of each other.
My initial thought is to do this:
srand(time(NULL));
d1=rand()%2;
srand(time(NULL));
d2=rand()%2;
But, as I mentioned earlier based on other posts, this is bad practice, I suppose?
It is indeed bad if you want d1 and d2 to have a 50% probability of being different. time() returns the number of seconds since the epoch, so it is highly likely that it will return the same value when called twice so close together. The sequence of pseudorandom numbers is completely determined by the seed (this is a feature, not a bug), and when you seed the PRNG, you restart the sequence. Even if you used a higher-resolution clock to make the seeds more likely to differ, you don't escape correlation this way; you just change the function generating numbers for you. And the result does not have the same guarantees for output distribution.
Additionally, when you do rand() % 2 you use only one bit of the approximately log2(RAND_MAX) + 1 bits that it produces for you. Over the whole period of the PRNG you can expect that bit to take each value the same number of times, but over narrow ranges you may sometimes see some correlation.
In the end, your requirement for your two random numbers to be completely independent of one another is probably way overkill. It is generally sufficient for the pseudo-random result of one call to have no apparent correlation with the results of previous calls. You probably achieve that well enough with your first code snippet, even though it uses only one bit per call. If you prefer to use more of the bits, though, then with some care you could base each bit on the parity of the number of set bits in the value returned by rand(), as sketched below.
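A minimal sketch of that parity idea, using nothing beyond the standard library (the plain popcount loop is the portable, if unexciting, way to count the set bits):

#include <stdlib.h>

/* 0 or 1 based on the parity of the number of set bits in one rand() value. */
static int parity_bit(void)
{
    int r = rand();
    int parity = 0;
    while (r != 0) {
        parity ^= (r & 1);
        r >>= 1;
    }
    return parity;
}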
Use this to get a floating-point value in the range [0.0, 1.0]:
(double)rand() / (double)RAND_MAX
completely random number ..... without depending on previous output?
Well, in reality computers can't generate completely random numbers. There have to be some dependencies. But for almost all practical purposes, you can use rand().
side effect of initializing the seed more than once
No harmful side effect as such, but it largely defeats the point of using rand(). If you're reinitializing the seed every time, the "random" number depends mostly on the time (and the processor).
any workaround using the built-in rand() function
You can write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand(time(NULL));
    printf("%lf\n", (double)rand() / (double)RAND_MAX);
    printf("%lf\n", (double)rand() / (double)RAND_MAX);
    return 0;
}
If you want to generate either a 0 or a 1, I think using rand()%2 is perfectly fine, as the probability of an even number is the same as the probability of an odd number (the probability of every value is equal for an unbiased random number generator).

What is a suitable replacement for rand()?

As far as I know rand() does not generate a uniform random distribution. What function/algorithm will allow me to do so? I have no need for cryptographic randomness, only a uniform random distribution. Lastly, what libraries provide these functions?
Thanks!
rand() does generate a uniform (pseudo-)random distribution.
The actual requirement, from the C standard (3.7 MB PDF), section 7.20.2.1, is:
The rand function computes a sequence of pseudo-random integers in the range 0 to RAND_MAX.
where RAND_MAX is at least 32767. That's admittedly vague, but the intent is that it gives you a uniform distribution -- and in practice, that's what implementations actually do.
The standard provides a sample implementation, but C implementations aren't required to use it.
In practice, there are certainly better random number generators out there. And one specific requirement for rand() is that it must produce exactly the same sequence of numbers for a given seed (argument to srand()). Your description doesn't indicate that that would be a problem for you.
One problem is that rand() gives you uniformly distributed numbers in a fixed range. If you want numbers in a different range, you have to do some extra work. For example, if RAND_MAX is 32767, then rand() can produce 32768 distinct values; you can't get random numbers in the range 0..9 without discarding some values, since there's no way to evenly distribute those 32768 distinct values into 10 equal sized buckets.
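The usual fix is to discard the values that cannot be distributed evenly, for example like this (a sketch for the 0..9 case; the helper name uniform_digit is just illustrative):

#include <stdlib.h>

/* Uniform value in 0..9: reject the top sliver of rand()'s range that
   cannot be split evenly into 10 buckets, then take the remainder. */
static int uniform_digit(void)
{
    const int limit = RAND_MAX - (RAND_MAX % 10);   /* a multiple of 10 */
    int r;
    do {
        r = rand();
    } while (r >= limit);
    return r % 10;
}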
Other PRNGs are likely to give you better results than rand(), but they're still probably going to be subject to the same issues.
As usual, the comp.lang.c FAQ answers this better than I did; see questions 13.15 through 13.21.
Here's an article and a stand-alone random number generator written in C#. The code is very small and easily portable to C++ etc.
Whenever this subject comes up, someone responds that you should not use your own random number generator but should leave that up to specialists. I respond that you should not come up with your own algorithm. Leave that up to specialists because it is indeed very subtle. But it's OK and even beneficial to have your own implementation. That way you know what's being done, and you could use the same method across languages or platforms.
The algorithm in that article is by George Marsaglia, a top expert in random number generation. Even though the code is tiny, the method holds up well to standard tests.
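For a flavour of what such a generator looks like, here is a sketch of one of Marsaglia's multiply-with-carry generators, close in spirit to the one the article describes (the constants are Marsaglia's; treat this as an illustration rather than a drop-in library):

#include <stdint.h>

/* Marsaglia-style multiply-with-carry: two 16-bit MWC streams combined. */
static uint32_t m_w = 521288629;   /* any nonzero seeds will do */
static uint32_t m_z = 362436069;

static uint32_t mwc32(void)
{
    m_z = 36969 * (m_z & 0xFFFF) + (m_z >> 16);
    m_w = 18000 * (m_w & 0xFFFF) + (m_w >> 16);
    return (m_z << 16) + m_w;
}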
The BSD random() function (included in the XSI option of POSIX/SUS) is almost universally available and much better than rand on most systems (except some where rand actually uses random and thus they're both pretty good).
If you'd rather go outside the system libraries, here's some good information on your choices:
http://guru.multimedia.cx/category/pseudo-random-number-generators/
(From Michael Niedermayer of FFmpeg fame.)
Well, the question of whether a provably secure pseudorandom generator even exists is still open. That being said, a quick search reveals that there may be some slightly better alternatives.

What is a pseudo-random integer?

I am reading a C book. In the description of the rand() function, they say:
rand returns a pseudo-random integer in the range 0 to RAND_MAX.  RAND_MAX is implementation dependent but at least 32767.
I don't understand; what is a "pseudo-random integer"?
Thanks.
Informally, a pseudorandom number is a number that isn't truly random, but is "random enough" for most purposes.
Computers are inherently deterministic devices. The processor executes specific commands in a specific order, and programs control how the processor does so. Consequently, it's hard for programs to generate random numbers because no deterministic process can create a random number. Thus what many programs do is use a pseudorandom number generator, which is a function that produces numbers according to some deterministic formula that appear to be random but actually are not. Most programming languages provide some sort of pseudorandom number generator for general programming use, and when true randomness isn't needed they work just fine.
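To make that concrete, here is a minimal sketch of such a deterministic formula: the simple linear congruential generator given as the portable sample implementation in the C standard (renamed here to avoid clashing with the library functions):

/* The C standard's sample rand(): a linear congruential formula whose
   output "looks" random but is fully determined by the current state. */
static unsigned long next_state = 1;

static int my_rand(void)            /* returns a value in 0..32767 */
{
    next_state = next_state * 1103515245 + 12345;
    return (unsigned int)(next_state / 65536) % 32768;
}

static void my_srand(unsigned int seed)
{
    next_state = seed;
}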
However, they have their limitations. In cryptographic settings, in many cases true randomness is required in order to prevent attackers from guessing the workings of a system and compromising it, for example. In this case, it is possible to get truly random numbers by using specialized hardware that can amplify background noise or use quantum effects. This sort of randomness is extremely hard to generate, though, and so it's not commonly used unless absolute unpredictability is required.
From the wiki entry on Pseudorandom number generator:
A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers that approximates the properties of random numbers.
In other words, it's approximately random (to a known extent), but not truly random in the sense of random noise from a physical source.
It means that the number may seem random, but the truth is that a computer can't generate a truly random number; it works out a number based on logic that tends to look random.
For example, it may observe the time difference between keystrokes and use that as just one input when calculating a number. Such things, combined with several other inputs, tend to give random-looking numbers, but in reality it's just an algorithm that tends to produce them. If the same conditions are reproduced, it will give the same number.
To add some pedantry to the other correct answers that you've already received, there really is no such thing as a "pseudorandom integer" (or, for that matter, a random integer). This is the point that John von Neumann was making in his famous quote (which is usually misleadingly abbreviated to just the first sentence):
"Any one who considers arithmetical
methods of producing random digits is,
of course, in a state of sin. For, as
has been pointed out several times,
there is no such thing as a random
number — there are only methods to
produce random numbers, and a strict
arithmetic procedure of course is not
such a method"
Consider the number 7. Is it a random integer? If it is, is it a pseudorandom integer or a true random integer? What about the number -10? Clearly these questions don't make sense (obligatory link to xkcd 221).
So, as others have pointed out, what the book actually means is that the number is generated by a pseudorandom (i.e. deterministic) number generator as opposed to being obtained from a truly random sequence. There are several different pseudorandom number generator algorithms and the best ones generate output that is statistically indistinguishable from a true random sequence.
It means that it generates a number sequence that exhibits the properties of a random number in terms of distribution, but which is mathematically generated and deterministic in its output for any particular seed value.
You can see this by creating a program that uses such a function to generate a short sequence and observing that each time you run it, it generates the same sequence. This often surprises the unsuspecting, but it is also quite a useful property when testing code. When a less predictable sequence is required, one normally initialises the PRNG with a seed value that is itself unpredictable, often derived from the current system time, since it is not normally predictable when a program will be started.
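To see that reproducibility for yourself, run the following twice; the output is identical each time, because the fixed seed fully determines the sequence:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    srand(42);                      /* fixed seed: the sequence below never changes */
    for (int i = 0; i < 5; i++)
        printf("%d ", rand());
    printf("\n");
    return 0;
}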
Pseudo random number generators are not normally suitable for security applications such as data encryption, or on-line gambling applications, where their ultimate predictability could be a serious weakness.
In most computers, a random number is considered pseudorandom because it is not completely random.
pseudo=fake
Basically, the current time is used to seed the generator, so the number that will be generated initially can be predicted, and for most applications this is fine.

Generating totally random numbers without random function? [duplicate]

Possible Duplicate:
True random number generator
I was talking to a friend the other day and we were trying to figure out if it is possible to generate completely random numbers without the help of a random function. In C, for example, "rand" generates pseudo-random numbers. Or we can use something like "srand( time( NULL ) );" to let the computer read numbers from its clock as seed values. If I understand everything I have read so far correctly, then I am pretty sure that no random function actually produces truly random numbers. How would one write a program that generates numbers that are completely random, and what would the code look like?
Check out this question:
True random number generator
Also, from wikipedia's entry on pseudorandom numbers
As John von Neumann joked, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."
The excellent random.org website provides hardware-based random numbers as well as a number of software interfaces to retrieve these.
This can be used e.g. for genuinely unpredictable seeds or for 'true' random numbers. Being a web service, there are limits on the number of draws you can make, so don't try to use this for your graduate school Monte Carlo simulation.
FWIW, I wrapped one of those interface in the R package random.
It would look like:
int random = CallHardwareRandomGenerator();
Even with hardware, randomness is tricky. There are things which are physically random (atomic decay is random, but with predictable average amounts, so it can be used as a source of random information); there are things that are physically random enough to make prediction impractical (this is how casinos make money).
There are things that are largely indeterminate (mix up information from key-stroke rate, mouse-movements, and a few things like that), which are a good-enough source of "randomness" for many uses.
Mathematically, we cannot produce randomness, but we can improve distribution and make something harder to predict. Cryptographic PRNGs do a stronger job at this than most, but are more expensive in terms of resources.
This is more of a physics question, I think. If you think about it, nothing is random; it just occurs due to events whose complexity makes them unpredictable to us. A computer is a subsystem just like any other in the universe, and by giving it unpredictable external inputs (RTC, I/O garbage) we can get the same kind of randomness that a roulette wheel gets from varying friction, air resistance, initial impulse and millions of factors that I can't wrap my head around.
There's room for a fair amount of philosophical debate about what "truly random" really even means. From a practical viewpoint, even sources we know aren't truly random can be used in ways that produce what are probably close enough for almost any practical purpose though (in particular, that at least with current technology, full knowledge of the previously produced bitstream appears to be insufficient to predict the next bit accurately). Most of those do involve a bit of extra hardware though -- for example, it's pretty easy to put a source together from a little bit of Americium out of a smoke detector.
There are quite a few more sources as well, though they're mostly pretty low bandwidth (e.g., collect one bit for each keystroke, based on whether the interval between keystrokes was an even or odd number of CPU clocks -- assuming the CPU clock and keyboard clock are derived from separate crystals). OTOH, you have to be really careful with this -- a fair number of security holes (e.g., in Netscape around v. 4.0 or so) have stemmed from people believing that such sources were a lot more random than they really were.
While there are a number of web sites that produce random numbers from hardware sources, most of them are useless from a viewpoint of encryption. Even at best, you're just trusting your SSL (or TLS) connection to be secure so nobody captured the data you got from the site.

Alternative Entropy Sources

Okay, I guess this is entirely subjective and whatnot, but I was thinking about entropy sources for random number generators. It goes that most generators are seeded with the current time, correct? Well, I was curious as to what other sources could be used to generate perfectly valid, random (The loose definition) numbers.
Would using multiple sources (Such as time + current HDD seek time [We're being fantastical here]) together create a "more random" number than a single source? What are the logical limits of the amount of sources? How much is really enough? Is the time chosen simply because it is convenient?
Excuse me if this sort of thing is not allowed, but I'm curious as to the theory behind the sources.
The Wikipedia article on hardware random number generators lists a couple of interesting sources of random numbers based on physical properties.
My favorites:
A nuclear decay radiation source detected by a Geiger counter attached to a PC.
Photons travelling through a semi-transparent mirror. The mutually exclusive events (reflection — transmission) are detected and associated to "0" or "1" bit values respectively.
Thermal noise from a resistor, amplified to provide a random voltage source.
Avalanche noise generated from an avalanche diode. (How cool is that?)
Atmospheric noise, detected by a radio receiver attached to a PC
The problems section of the Wikipedia article also describes the fragility of a lot of these sources/sensors. Sensors almost always produce decreasingly random numbers as they age/degrade. These physical sources should be constantly checked by statistical tests which can analyze the generated data, ensuring the instruments haven't broken silently.
SGI once used photos of a lava lamp at various "glob phases" as the source for entropy, which eventually evolved into an open source random number generator called LavaRnd.
I use Random.ORG; they provide free random data from atmospheric noise, which I use to periodically re-seed a Mersenne Twister RNG. It's about as random as you can get with no hardware dependencies.
Don't worry about a "good" seed for a random number generator. The statistical properties of the sequence do not depend on how the generator is seeded. There are other things, however, to worry about. See Pitfalls in Random Number Generation.
As for hardware random number generators, these physical sources have to be measured, and the measurement process has systematic errors. You might find "pseudo" random numbers to have higher quality than "real" random numbers.
The Linux kernel uses device interrupt timing (mouse, keyboard, hard drives) to generate entropy. There's a nice article on Wikipedia on entropy.
Modern RNGs are both checked against correlations in nearby seeds and run several hundred iterations after the seeding. So, the unfortunately boring but true answer is that it really doesn't matter very much.
Generally speaking, random physical processes have to be checked for conformance to a uniform distribution and otherwise detrended.
In my opinion, it's often better to use a very well understood pseudo-random number generator.
I've used an encryption program that used the user's mouse movements to generate random numbers. The only problem was that the program had to pause and ask the user to move the mouse around randomly for a few seconds to work properly, which might not always be practical.
I found HotBits several years ago - the numbers are generated from radioactive decay, genuinely random numbers.
There are limits on how many numbers you can download a day, but it has always amused me to use these as really, really random seeds for RNG.
Some TPM (Trusted Platform Module) "chips" have a hardware RNG. Unfortunately, the (Broadcom) TPM in my Dell laptop lacks this feature, but many computers sold today come with a hardware RNG that uses truly unpredictable quantum mechanical processes. Intel has implemented the thermal noise variety.
Also, don't use the current time alone to seed an RNG for cryptographic purposes, or any application where unpredictability is important. Using a few low order bits from the time in conjunction with several other sources is probably okay.
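A rough sketch of that kind of mixing on a POSIX system (time, process id and CPU clock combined; the mixing here is deliberately simple and is not suitable for cryptographic use):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    unsigned seed = (unsigned)time(NULL);
    seed ^= (unsigned)getpid() << 16;   /* process id: differs between runs started in the same second */
    seed ^= (unsigned)clock();          /* CPU time used so far, a little extra jitter */
    srand(seed);
    printf("%d\n", rand());
    return 0;
}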
A similar question may be useful to you.
Sorry I'm late to this discussion (what is it 3 1/2 years old now?), but I've a rekindled interest in PRN generation and alternate sources of entropy. Linux kernel developer Rusty Russell recently had a discussion on his blog on alternate sources of entropy (other than /dev/urandom).
But I'm not all that impressed with his choices; a NIC's MAC address never changes (although it is unique among all others), and the PID seems like too small a sample space.
I've dabbled with a Mersenne Twister (on my Linux box) which is seeded with the following algorithm. I'm asking for any comments/feedback if anyone's willing and interested:
1) Create an array buffer of 64 bits + 256 bits * the number of /proc files below.
2) Place the time stamp counter (TSC) value in the first 64 bits of this buffer.
3) For each of the following /proc files, calculate the SHA256 sum:
/proc/meminfo
/proc/self/maps
/proc/self/smaps
/proc/interrupts
/proc/diskstats
/proc/self/stat
4) Place each 256-bit hash value into its own area of the array created in (1).
5) Create a SHA256 hash of this entire buffer. NOTE: I could (and probably should) use a different hash function completely independent of the SHA functions - this technique has been proposed as a "safeguard" against weak hash functions.
Now I have 256 bits of HOPEFULLY random (enough) entropy data to seed my Mersenne Twister. I use the above to populate the beginning of the MT Array (624 32-bit integers), and then initialize the remainder of that array with the MT author's code. Also, I could use a different hash function (e.g. SHA384, SHA512), but I'd need a different size array buffer (obviously).
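A rough sketch of the seeding steps above, under a few assumptions (Linux on x86 with GCC/Clang for __rdtsc(), and OpenSSL's libcrypto for SHA256(); link with -lcrypto). It builds the seed buffer and stops short of the actual call into the MT author's init_by_array():

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <openssl/sha.h>    /* SHA256(), SHA256_DIGEST_LENGTH */
#include <x86intrin.h>      /* __rdtsc() */

/* Hash one file's contents into out[32]; /proc files are small, so one read suffices for a sketch. */
static void hash_file(const char *path, unsigned char out[SHA256_DIGEST_LENGTH])
{
    unsigned char buf[65536];
    size_t total = 0;
    FILE *f = fopen(path, "rb");
    if (f != NULL) {
        total = fread(buf, 1, sizeof buf, f);
        fclose(f);
    }
    SHA256(buf, total, out);
}

int main(void)
{
    static const char *files[] = {
        "/proc/meminfo", "/proc/self/maps", "/proc/self/smaps",
        "/proc/interrupts", "/proc/diskstats", "/proc/self/stat",
    };
    enum { NFILES = 6 };
    unsigned char pool[8 + NFILES * SHA256_DIGEST_LENGTH];
    unsigned char seed[SHA256_DIGEST_LENGTH];
    uint64_t tsc = __rdtsc();

    memcpy(pool, &tsc, 8);                                        /* steps 1-2: TSC in the first 64 bits */
    for (int i = 0; i < NFILES; i++)                              /* steps 3-4: one hash per /proc file */
        hash_file(files[i], pool + 8 + i * SHA256_DIGEST_LENGTH);
    SHA256(pool, sizeof pool, seed);                              /* step 5: hash the whole buffer */

    /* 'seed' now holds 256 bits of (hopefully) high-entropy material; it could be
       copied into eight 32-bit words and handed to the MT reference init_by_array(). */
    for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
        printf("%02x", seed[i]);
    printf("\n");
    return 0;
}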
The original Mersenne Twister code called for one single 32-bit seed, but I feel that's horribly inadequate. Running "merely" 2^32-1 different MTs in search of breaking the crypto is not beyond the realm of practical possibility in this day and age.
I'd love to read anyone's feedback on this. Criticism is more than welcome. I will defend my use of the /proc files as above because they're constantly changing (especially the /proc/self/* files), and the TSC always yields a different value (nanosecond [or better] resolution, IIRC). I've run Diehard tests on this (to the tune of several hundred billion bits), and it seems to be passing with flying colors. But that's probably more a testament to the soundness of the Mersenne Twister as a PRNG than to how I'm seeding it.
Of course, these aren't totally impervious to someone hacking them, but I just don't see all of these (and SHA*) being hacked and broken in my lifetime.
Some use keyboard input (the timing between keystrokes). I've also heard (I think in a novel) that radio static reception can be used, but of course that requires additional hardware and software...
Noise on top of the Cosmic Microwave Background spectrum. Of course you must first remove some anisotropy, foreground objects, correlated detector noise, galaxy and local group velocities, polarizations etc. Many pitfalls remain.
The source of the seed isn't that important; more important is the pseudo-random number generator algorithm. However, I heard some time ago about generating a seed for some bank operations. They took many factors together:
time
processor temperature
fan speed
cpu voltage
I don't remember more :)
Even if some of these parameters don't change much over time, you can put them through some good hashing function.
How do you generate a good random number?
Maybe we can take the infinite number of universes into account? If it is true that new parallel universes are being created all the time, we can do something like this:
int Random() {
    return Universe.object_id % MAX_INT;
}
At every moment we should be on a different branch of the parallel universes, so we should have a different id. The only problem is how to get the Universe object :)
How about spinning off a thread that manipulates some variable in a tight loop for a fixed amount of time before it is killed? What you end up with will depend on processor speed, system load, etc... Very hokey, but better than just srand(time(NULL))...
Don't worry about a "good" seed for a random number generator. The statistical properties of the sequence do not depend on how the generator is seeded.
I disagree with John D. Cook's advice. If you seed the Mersenne Twister with all bits set to zero except one, it will initially generate numbers which are anything but random. It takes a long time for the generator to churn this state into anything that would pass statistical tests. Simply setting the first 32 bits of the generator to a seed will have a similar effect. Also, if the entire state is set to zero the generator will produce endless zeroes.
Properly written RNG code will have a properly written seeding algorithm that accepts say a 64 bit value and seeds the generator so it will produce decent random numbers for each possible input. So if you are using a reliable library then any seed will do. But if you hack together your own implementation then you need to be careful.
