Comparing SFMT with Mersenne Twister and Ran2 - c

I am trying to optimise a C program used for bioinformatics. It relies on Monte Carlo iterations for most of its calculations. It originally used ran2() to generate random numbers, which made it very slow. After some research, I found that Mersenne Twister and SFMT are reported to be more efficient random number generators. However, when I tried them in my code, they made little difference to the speed. Given that the program calls the generator 10+ times in each iteration, I cannot figure out why changing the generator makes no difference.
Could anyone tell me where I might be going wrong?

Choosing a random number generator is always a trade-off between the quality of the numbers it generates and its speed. Linear congruential generators are typically the fastest, but they are not suitable for serious Monte Carlo work.
From experience I'd say that Mersenne Twister is just fine: it's not especially slow and you don't have to worry about the quality. If the bottleneck really is in the generator, I'd say there's not much more you can do on a single core.
This being said, here's a comparison of several generators:
http://www.boost.org/doc/libs/1_48_0/doc/html/boost_random/performance.html
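Before swapping generators again, it may be worth measuring how much time the generator actually takes. The following is a minimal sketch, not a definitive benchmark: lcg_next and xorshift32_next are hypothetical stand-ins for whatever generators you are comparing (ran2(), Mersenne Twister, SFMT), and time_generator is an illustrative helper name.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in generators; put your ran2(), MT or SFMT calls here. */
static uint32_t lcg_state = 1u;
static uint32_t lcg_next(void)
{
    lcg_state = lcg_state * 1103515245u + 12345u;   /* wraps modulo 2^32 */
    return lcg_state;
}

static uint32_t xs_state = 2463534242u;             /* any nonzero seed */
static uint32_t xorshift32_next(void)
{
    xs_state ^= xs_state << 13;
    xs_state ^= xs_state >> 17;
    xs_state ^= xs_state << 5;
    return xs_state;
}

/* CPU seconds spent on n calls of gen. The results are folded into *sink
   so that the compiler cannot optimise the loop away. */
static double time_generator(uint32_t (*gen)(void), unsigned long n, uint32_t *sink)
{
    uint32_t acc = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < n; i++)
        acc ^= gen();
    *sink = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_generator with the number of draws one full run makes and compare the result with the total run time. If the generator accounts for only a small fraction of it, the bottleneck is elsewhere and a faster generator cannot help much.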

Related

seed random numbers in c

I am currently trying to teach myself C programming, and I am stuck on random numbers. Many of the websites I visit use the time() function to seed the random number generator, but many posts and websites I have read say that using the system clock to produce random numbers is flawed. My question is: what exactly should I be using to generate truly random numbers? Should I just manipulate the numbers with arithmetic, or is there something else? To be specific, I'm looking for the best practices that programmers follow to generate random numbers in C.
Here is an example of a website I am talking about:
http://faq.cprogramming.com/cgi-bin/smartfaq.cgi?answer=1042005782&id=1043284385
srand(time(NULL)) is good enough for general basic use. Its shortcomings are:
It's not suitable for cryptography, since it's possible for an attacker to predict the pseudo-random sequence. Random numbers used in cryptography need to be really unpredictable.
If you run the program several times in quick succession, the RNG will be seeded with the same or similar values (since the current time hasn't changed much), so you're likely to get similar pseudo-random sequences each time.
If you generate very many random numbers with rand, you're likely to find that they're not well-distributed statistically. This can be important if you're doing something like Monte Carlo simulations.
There are more sophisticated RNG libraries available for cryptographic and statistical use.
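If you just want a random number below some limit and your compiler provides the Borland/Turbo C extensions randomize() and random() (they are not part of standard C), use:
randomize();           /* seeds the generator from the clock */
m = random(limit);     /* value in the range 0 .. limit-1 */
printf("%d", m);       /* don't forget to include stdlib.h, stdio.h and time.h */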
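The points above about seeding can be sketched in standard C as follows. This is only an illustration under stated assumptions: make_seed and rand_below are hypothetical helper names, and reading /dev/urandom assumes a Unix-like system (the function falls back to the clock elsewhere). It is not suitable for cryptography.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Returns a seed for srand(). Assumption: a Unix-like system that exposes
   /dev/urandom; otherwise the clock is used instead. */
static unsigned int make_seed(void)
{
    unsigned int seed;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f != NULL) {
        size_t got = fread(&seed, sizeof seed, 1, f);
        fclose(f);
        if (got == 1)
            return seed;
    }
    /* Fallback: wall-clock time mixed with the processor time used so far. */
    return (unsigned int)time(NULL) ^ (unsigned int)clock();
}

/* Uniform integer in [0, limit) built on rand(), for 0 < limit <= RAND_MAX.
   Raw values at or above the cutoff are rejected, which avoids the small
   bias of the common rand() % limit idiom. */
static int rand_below(int limit)
{
    int bucket = RAND_MAX / limit;   /* raw values per result */
    int cutoff = bucket * limit;     /* first raw value that is rejected */
    int r;
    do {
        r = rand();
    } while (r >= cutoff);
    return r / bucket;
}
```

Call srand(make_seed()) once at program start, then rand_below(6) gives a value from 0 to 5. The statistical quality is still whatever your library's rand() provides.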

Logic behind the random number generator in C [duplicate]

How does the C compiler decide which number should be generated next in a random number generation function? For example, it always produces a new random number within the given range. How is that done?
It generates the next number by keeping some state and modifying the state every time you call the function. Such a function is called a pseudorandom number generator. An old method of creating a PRNG is the linear congruential generator, which is easy enough:
static unsigned int rand_state = 1;
int rand(void)
{
    /* Unsigned arithmetic wraps around; signed overflow would be undefined. */
    rand_state = (rand_state * 1103515245u + 12345u) & 0x7fffffffu;
    return (int)rand_state;
}
As you can see, this method allows you to predict the next number in the series if you know the previous number. There are more sophisticated methods.
Various types of pseudorandom number generators have been designed for specific purposes. There are secure PRNGs which are slow but hard to predict even if you know how they work, and there are big PRNGs like Mersenne Twister which have nice distribution properties and are therefore useful for writing Monte Carlo simulations.
As a rule of thumb, a linear congruential generator is good enough for writing a game (how much damage does the monster deal) but not good enough for writing a simulation. There is a colorful history of researchers who have chosen poor PRNGs for their programs; the results of their simulations are suspect as a result.
It is not a compiler but a C library that has a function to produce pseudorandom (not truly random!) numbers.
Usually linear congruential generators are used for this.
Well, the C compiler doesn't make that decision. The next random number depends on the algorithm. Generating random numbers is not an easy task. Take a look at
http://www.math.utah.edu/~pa/Random/Random.html
http://computer.howstuffworks.com/question697.htm
http://en.wikipedia.org/wiki/Random_number_generation
It depends on the specific implementation of the pseudo random number generator (PRNG) in question. There are a great many variants in use.
A common example is the family of linear congruential generators (LCGs). These are defined by a recurrence relation:
X_{n+1} = (a * X_n + c) mod m
So each new sample from the PRNG is determined solely by the previous sample, and the constants a, c and m. Note that the choice of a, c and m is crucial.
LCGs are very simple and efficient. They are often used for the random number generators provided by the standard library. However, they have poor statistical properties and for better randomness, more advanced PRNGs are preferred.
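To see why the choice of a, c and m matters, here is a small sketch that counts how long an LCG takes to return to its seed. The helper names lcg_step and lcg_period are made up for illustration, and the code is meant for small m only (a * x + c must fit in 64 bits).

```c
#include <stdint.h>

/* One step of the recurrence X_{n+1} = (a * X_n + c) mod m. */
static uint64_t lcg_step(uint64_t x, uint64_t a, uint64_t c, uint64_t m)
{
    return (a * x + c) % m;
}

/* Number of steps until the sequence first returns to the seed, or 0 if it
   has not returned within m steps. Intended for small m only. */
static uint64_t lcg_period(uint64_t a, uint64_t c, uint64_t m, uint64_t seed)
{
    uint64_t start = seed % m;
    uint64_t x = start;
    for (uint64_t steps = 1; steps <= m; steps++) {
        x = lcg_step(x, a, c, m);
        if (x == start)
            return steps;
    }
    return 0;
}
```

With m = 16, the constants a = 5, c = 3 visit all 16 residues before repeating, so lcg_period(5, 3, 16, 1) is 16. With a = 9, c = 0 the sequence started at 1 just alternates between 1 and 9, so lcg_period(9, 0, 16, 1) is 2.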
There are many questions about this on Stack Overflow. Here are a few that may help:
implementation of rand()
Rand function in c
Rand Implementation
This is actually a really big topic. Some of the key things:
Random number generation is done at run-time, rather than compile-time.
The strategy for providing randomness depends (or should depend) greatly on the application. For example, if you simply need a sequence of values that are evenly distributed throughout the given range, solutions such as a linear congruential generator are used. If your application is security/cryptography related, you'll want the stronger property that your values are both randomly distributed and also unpredictable.
A major challenge is acquiring "real" randomness, which you can use to seed your pseudorandom generator (which "stretches" real randomness into an arbitrary amount of usable randomness). A common technique is to use some unpredictable system state (e.g., sample the location of the mouse, or keypress timing) and then use a pseudorandom generator to provide randomness to the system as a whole.

Random number generator that doesn't use rand()/srand() C functions

I'm developing some library in C that can be used by various user applications.
The library should be completely "transparent" - a user application can init it and finalize,
and it's not supposed to see any change in the running application.
The problem is - I'm using C srand()/rand() functions in the library initialization,
which means that the library does affect user's application - if a user generates random numbers, they will be affected by the fact that rand() was already called.
So, can anyone point to some simple non-GPL alternative to rand() random number generator in C?
It doesn't have to be really strong - I'm not doing any crypto with the numbers.
I was thinking to write some small and really simple generator (something like take time and XOR with something and do something with some prime number and bla bla bla), but I was wondering if someone has a pointer to a more decent generator.
If C++ is also acceptable for you, have a look at Boost.
http://www.boost.org/doc/libs/1_51_0/doc/html/boost_random/reference.html
It offers not just one generator but several dozen, and gives an overview of their speed, memory requirements and randomness quality.
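If you need to stay in plain C, one option is a small generator that keeps its state in a struct owned by the library, so the application's rand() sequence is never touched. The sketch below uses Marsaglia's 64-bit xorshift with the 13/7/17 shift triple; the names lib_rng, lib_rng_seed and lib_rng_next are hypothetical. It is not suitable for cryptography, and plain xorshift is known to fail some of the stricter statistical test batteries, so treat it as "simple and decent" rather than strong.

```c
#include <stdint.h>

/* Generator state owned by the library; nothing global is modified. */
typedef struct {
    uint64_t state;   /* must never be zero */
} lib_rng;

static void lib_rng_seed(lib_rng *rng, uint64_t seed)
{
    /* Zero is a fixed point of xorshift, so replace it with an arbitrary
       nonzero constant. */
    rng->state = seed ? seed : 0x9E3779B97F4A7C15ULL;
}

static uint64_t lib_rng_next(lib_rng *rng)
{
    uint64_t x = rng->state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    rng->state = x;
    return x;
}
```

Because each lib_rng is independent, the library can create one at initialization and discard it at finalization without any visible effect on the application.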

Using random numbers with GPUs

I'm investigating using nvidia GPUs for Monte-Carlo simulations. However, I would like to use the gsl random number generators and also a parallel random number generator such as SPRNG. Does anyone know if this is possible?
Update
I've played about with RNG using GPUs. At present there isn't a nice solution. The Mersenne Twister that comes with the SDK isn't really suitable for (my) Monte-Carlo simulations since it takes an incredibly long time to generate seeds.
The NAG libraries are more promising. You can generate RNs either in batches or in individual threads. However, only a few distributions are currently supported - Uniform, exponential and Normal.
The GSL manual recommends the Mersenne Twister.
The Mersenne Twister authors have a version for Nvidia GPUs. I looked into porting this to the R package gputools but found that I needed an excessively large number of draws (millions, I think) before the combination of 'generate on GPU and make available to R' was faster than just drawing in R (using only the CPU).
It really is a computation / communication tradeoff.
My colleagues and I have a preprint, to appear at the SC11 conference, that revisits an alternative technique for generating random numbers that is well-suited to GPUs. The idea is that the nth random number is:
x_n = f(n)
In contrast to the conventional approach where
x_n = f(x_{n-1})
Source code is available, which implements several different generators, offering 2^64 or more streams, each with periods of 2^128 or more. All pass a wide assortment of tests (the TestU01 Crush and BigCrush suites) of both intra-stream and inter-stream statistical independence. The library also includes adapters that allow you to use our generators in a GSL framework.
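To illustrate the x_n = f(n) idea only, here is a toy counter-based generator in C. It is not one of the generators from the library described above: mix64 is the finalizer from SplitMix64 used as a stand-in for f, counter_rng and key are hypothetical names, and no claim is made about stream independence or test-suite results.

```c
#include <stdint.h>

/* 64-bit mixing function (the SplitMix64 finalizer). It is a bijection on
   64-bit values, so distinct inputs give distinct outputs. */
static uint64_t mix64(uint64_t z)
{
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* n-th value of the stream selected by key: x_n = f(n). No state is carried
   from one call to the next. */
static uint64_t counter_rng(uint64_t key, uint64_t n)
{
    return mix64(mix64(key) + n * 0x9E3779B97F4A7C15ULL);
}
```

Since nothing depends on the previous output, each GPU thread could compute its own values from its thread index and a loop counter, with no shared state and no seeding pass.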
Massive parallel random generation as you need it for GPUs is a difficult problem. This is an active research topic. You really have to be careful not only to have a good sequential random generator (these you find in the literature) but something that guarantees that they are independent. Pairwise independence is not sufficient for a good Monte Carlo simulation. AFAIK there is no good public domain code available.
I've just found that NAG provide some RNG routines. These libraries are free for academics.
Use the Mersenne Twister PRNG, as provided in the CUDA SDK.
Here we use Sobol sequences on the GPUs.
You will have to implement them by yourself.
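As a starting point for implementing them, here is a sketch of the simplest building block: the base-2 van der Corput sequence, which the first coordinate of a Sobol sequence matches up to ordering. A full Sobol generator additionally needs a table of direction numbers for the higher dimensions, which is not shown here. The function name is made up for illustration.

```c
#include <stdint.h>

/* n-th point of the base-2 van der Corput sequence in [0, 1): the binary
   digits of n are mirrored around the binary point. */
static double van_der_corput_base2(uint32_t n)
{
    /* Reverse the 32 bits of n by swapping progressively smaller groups. */
    n = (n << 16) | (n >> 16);
    n = ((n & 0x00FF00FFu) << 8) | ((n & 0xFF00FF00u) >> 8);
    n = ((n & 0x0F0F0F0Fu) << 4) | ((n & 0xF0F0F0F0u) >> 4);
    n = ((n & 0x33333333u) << 2) | ((n & 0xCCCCCCCCu) >> 2);
    n = ((n & 0x55555555u) << 1) | ((n & 0xAAAAAAAAu) >> 1);
    return (double)n * (1.0 / 4294967296.0);   /* scale by 2^-32 */
}
```

For n = 1, 2, 3 this gives 0.5, 0.25, 0.75, filling the unit interval evenly. Like the counter-based approach, it depends only on n, so every thread can compute its own points independently.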
