Problems with C rand() - c

I'm new to C. I just came across the rand() function. The book states that using rand() returns a random number from 0 to 32767. It also states that you can narrow the random numbers by using % (modulus operator) to do so.
Here is an example: the following expression puts a random number from 1 to 6 in the variable dice
dice = (rand() % 6) + 1;
I’m wondering why you can’t use
dice = (rand() % 7);
Won’t it do the same thing?

This is more of a math question than a C question. The answer lies in modulo arithmetic. Any number x modulo n equals 0 if n divides x evenly. In fact, the modulo operator returns the remainder of integer division. Therefore the range is from 0 to n - 1. So if you want a random number 1-6 you need to perform (rand() % 6) + 1, since rand() % 6 gives you something in the range of 0-5. Simply doing rand() % 7 gives you the range 0-6, increasing the upper bound, not the lower bound.

rand() % 6 is a number in the interval 0-5.
If you add one to any number in that interval, you get a number in the interval 1-6.
On the other hand, rand() % 7 is a number in the interval 0-6.
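To make the difference concrete, here is a minimal sketch of both expressions as functions of the raw value. The helper names `shift_mod6` and `plain_mod7` are made up for illustration and are not from the book or the answers.

```c
#include <assert.h>

/* Hypothetical helper: the book's expression applied to a raw value r. */
int shift_mod6(int r)
{
    assert(r >= 0);
    return (r % 6) + 1;   /* r % 6 is 0..5, so the result is 1..6 */
}

/* Hypothetical helper: the alternative from the question. */
int plain_mod7(int r)
{
    assert(r >= 0);
    return r % 7;         /* 0..6: seven outcomes, and 0 is one of them */
}
```

Calling `shift_mod6(rand())` never produces 0, while `plain_mod7(rand())` produces 0 whenever the raw value is a multiple of 7.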

Related

<stdlib.h> rand() example code, unnecessary check for larger than max?

I've been looking into the int rand() function from <stdlib.h> in C11 when I stumbled over the following cppreference-example for rolling a six sided die.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
 
int main(void)
{
srand(time(NULL)); // use current time as seed for random generator
int random_variable = rand();
printf("Random value on [0,%d]: %d\n", RAND_MAX, random_variable);
 
// roll a 6-sided die 20 times
for (int n=0; n != 20; ++n) {
int x = 7;
while(x > 6)
x = 1 + rand()/((RAND_MAX + 1u)/6); // Note: 1+rand()%6 is biased
printf("%d ", x);
}
}
Specifically this part:
[...]
while(x > 6)
x = 1 + rand()/((RAND_MAX + 1u)/6); // Note: 1+rand()%6 is biased
[...]
Questions:
Why the addition of + 1u? Since rand() is [0,RAND_MAX] I'm guessing
that doing rand()/(RAND_MAX/6) -> [0,RAND_MAX/(RAND_MAX/6)] -> [0,6]? And
since it's integer division (LARGE/(LARGE+small)) < 1 -> 0, adding 1u gives it the required range of [0,5]?
Building on the previous question, assuming [0,5], 1 + (rand()/((RAND_MAX+1u)/6)) should only go through [1,6] and never trigger a second loop?
Been poking around to see if rand() has returned float at some point, but
that seems like a pretty huge breakage towards old code? I guess the check
makes sense if you add 1.0f instead of 1u making it a floating point
division?
Trying to wrap my head around this, have a feeling that I might be missing
something..
(P.s. This is not a basis for anything security critical, I'm just exploring
the standard library. D.s)
The code avoids bias by ensuring each possible result in [1, 6] is the output from exactly the same number of return values from rand.
By definition, rand returns int values from 0 to RAND_MAX. So there are 1+RAND_MAX possible values it can return. If 1+RAND_MAX is not a multiple of 6, then it is impossible to partition it into 6 exactly equal intervals of integers. So the code partitions it into 6 equal intervals that are as big as possible and one odd-size fragment interval. Then the results of rand are mapped into these intervals: The first six intervals correspond to results from 1 to 6, and the last interval is rejected, and the code tries again.
When we divide 1+RAND_MAX by 6, there is some quotient q and some remainder r. Now consider the result of rand() / q:
When rand produces a number in [0, q−1], rand() / q will be 0.
When rand produces a number in [q, 2q−1], rand() / q will be 1.
When rand produces a number in [2q, 3q−1], rand() / q will be 2.
When rand produces a number in [3q, 4q−1], rand() / q will be 3.
When rand produces a number in [4q, 5q−1], rand() / q will be 4.
When rand produces a number in [5q, 6q−1], rand() / q will be 5.
When rand produces a number that is 6q or greater, rand() / q will be 6.
Observe that in each of the first six intervals, there are exactly q numbers. In the seventh interval, the possible return values are in [6q, RAND_MAX]. That interval contains r numbers.
This code works by rejecting that last interval:
int x = 7;
while(x > 6)
x = 1 + rand()/((RAND_MAX + 1u)/6);
Whenever rand produces a number in that last fragmentary interval, this code rejects it and tries again. When rand produces a number in one of the whole intervals, this code accepts it and exits (after adding 1 so the results in x are 1 to 6 instead of 0 to 5).
Thus, every output from 1 to 6, inclusive, is mapped to from an exactly equal number of rand values.
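The interval argument can be checked without calling rand at all. The sketch below is only an illustration: `face_or_reject` is a hypothetical helper that takes the raw value and the generator's maximum as parameters, so the mapping can be counted exhaustively for a small maximum such as 32767.

```c
#include <assert.h>

/* Hypothetical helper: returns the face 1..6 that a raw value r in
   [0, rand_max] maps to, or 0 if r falls in the leftover fragment
   and has to be rejected. */
int face_or_reject(unsigned long r, unsigned long rand_max)
{
    assert(r <= rand_max);
    unsigned long q = (rand_max + 1ul) / 6;   /* size of each whole interval */
    assert(q > 0);
    unsigned long bucket = r / q;             /* 0..5 in a whole interval, 6 or more in the fragment */
    return bucket < 6 ? (int)bucket + 1 : 0;
}
```

With a maximum of 32767, q is 5461 and the remainder is 2, so each face is reached from 5461 raw values and 2 raw values are rejected. A loop such as `do { x = face_or_reject((unsigned long)rand(), RAND_MAX); } while (x == 0);` would then play the role of the while loop in the quoted example.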
This is the best way to produce a uniform distribution from rand in the sense that it has the fewest rejections, given we are using a scheme like this.1 The range of rand has been split into six intervals that are as big as possible. The remaining fragmentary interval cannot be used because the remainder r is less than six, so the r unused values cannot be split evenly over the six desired values for x.
Footnote
1 This is not necessarily the best way to use rand to generate random numbers in [1, 6] overall. For example, from a single rand call with RAND_MAX equal to 32767, we could view the value as a base-six numeral from 000000 to 411411. If it is under 400000, we can take the last five digits, which are each uniformly distributed in [0, 5], and adding one gets us the desired [1, 6]. If it is in [400000, 410000), we can use the last four digits. If it is in [410000, 411000), we can use the last three, and so on. Additionally, the otherwise discarded information, such as the leading digit, might be pooled over multiple rand calls to increase the average number of outputs we get per call to rand.
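The base-six idea in the footnote could be sketched roughly as follows. This is an illustration only: it hard-codes the assumption that the raw value is uniform on [0, 32767], the helper name `faces_from_value` is invented here, and it does not attempt the pooling of leftover information mentioned at the end of the footnote.

```c
#include <assert.h>

/* Hypothetical helper: extracts up to five unbiased die faces (1..6)
   from one value v assumed uniform on [0, 32767]. Writes them to
   faces[] and returns how many were written (0..5). */
int faces_from_value(unsigned v, int faces[5])
{
    assert(v <= 32767u);
    unsigned lo = 0;          /* start of the region still being examined */
    unsigned block = 7776;    /* 6^5, then 6^4, 6^3, ... */
    for (int k = 5; k >= 1; --k, block /= 6) {
        /* number of whole blocks of 6^k values that still fit */
        unsigned whole = (32768u - lo) / block;
        if (v < lo + whole * block) {
            /* v lies in a run of whole blocks, so its last k
               base-six digits are each uniform on 0..5 */
            unsigned digits = v % block;
            for (int i = 0; i < k; ++i) {
                faces[i] = (int)(digits % 6) + 1;
                digits /= 6;
            }
            return k;
        }
        lo += whole * block;
    }
    return 0;   /* v is one of the two leftover values */
}
```

Values below 31104 (400000 in base six) yield five faces, the next 1296 values yield four, and so on down to the last two values, which yield none.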

Problem with rand%100 for random number generation in C

So I have a homework assignment, and we need to generate random numbers between 1 and 100 in C. I have a working example with int i = rand()%100.
But according to the homework that is technically incorrect which I don't really get. The Homework explanation is as follows
"1.1 We use a random number generator to simulate bus arrival times. ===> the rand( ) function.The rand( ) function returns a pseudo random number 0 to RAND_MAX (2^31-1 in linux).To generate a random number, rn, between 0.0 and 1.0; rn = rand( ) / RAND_MAX.(by the way, a lot of people do below to create, say, 2 digit random numbers. r_num = rand( ) % 100; since % 100 is 0 to 99. However, this is wrong. The right way of generate 2 digit random number is: divide 0-RAND_MAX in 10 intervals and see where the random number falls. The interval time is, it = RAND_MAX / 100. Then, map it to one of 0 - 99 by the following: 0 1 2 3 ......... 99 0 it 2it 3it 99it to RAND_MAX If the rand( ) returns a number is between (12it) and (13*it), the 2 digit random number is 12.)"
I was hoping someone could take a stab at explaining what it is saying, I'm not really looking for code examples just an understanding of the problem.
There are a couple of problems there, both having to do with how the modulo operator works. a % b effectively gives you the remainder when you divide a by b. So let's suppose that we're computing numbers modulo 4. Let's also assume that RAND_MAX = 6, because I really don't want to have 32768+ rows in my table.
a | a % 4
------------
0 | 0
1 | 1
2 | 2
3 | 3
4 | 0
5 | 1
6 | 2
So if you're using your approach to generate random numbers between 1 and 4, you have two problems. First, the simple one: you're generating numbers between 0 and 3, not 1 and 4. The result of a % b for non-negative a always lies between 0 and b - 1.
The other problem is more subtle. If the number of values rand() can return (RAND_MAX + 1) is not a multiple of the modulus, you won't get the same probability of each number. In the case of our example, there are 2 ways each to make 0 through 2, but only one way to make 3. So 3 will occur ~14.3% of the time, and each other number will occur ~28.6% of the time. To get a uniform distribution, you need to find a way to deal with cases where the modulus doesn't divide RAND_MAX + 1 evenly.
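The table can be reproduced by brute force. The following sketch assumes an ideal generator that returns every value in 0..rand_max exactly once per cycle. `tally_mod` is a made-up name, and it is only meant for small maxima like the 6 used above.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: counts how many raw values in 0..rand_max
   land on each result of r % n. counts[] must hold n entries. */
void tally_mod(int rand_max, int n, int counts[])
{
    assert(rand_max >= 0 && rand_max < INT_MAX && n > 0);
    for (int i = 0; i < n; ++i)
        counts[i] = 0;
    for (int r = 0; r <= rand_max; ++r)
        counts[r % n]++;
}
```

For `tally_mod(6, 4, counts)` the counts come out as 2, 2, 2, 1, which is where the 2/7 and 1/7 probabilities come from.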
RAND_MAX is often 2^31 - 1, that is 2147483647 (this is the value the homework text gives for Linux).
But let's assume for simplicity that we have a very strange system, with RAND_MAX = 100 (so rand() can return 0 to 100, that's 101 numbers). And let's assume the rand() function has ideal uniform distribution.
Now, what is the probability of each result of rand() % 100? The numbers 1 to 99 each have the same probability, 1/101. But 0 has probability 2/101, because both when rand() returns 0 and when rand() returns 100 the expression rand() % 100 equals 0. So 0 comes up twice as often as any other number, and the distribution of 2-digit numbers from rand() % 100 is not uniform.
Now, the text proposes a solution to the problem. The proposed solution is to split 0 to RAND_MAX region into 100 even parts, so that numbers within each part have the same probability. Then roll rand() and see in which region the number ended. If RAND_MAX is 2147483647 and we for example get a number 279172968 we can see it ends in the 13th region - between RAND_MAX / 100 * 13 = 279172868 and RAND_MAX / 100 * 14 = 300647704.
The solution is also flawed, as we can see, that it is impossible to divide 0 to RAND_MAX into 100 even parts when RAND_MAX % 100 is not equal to 0.
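The flaw shows up as soon as the interval mapping is written out. Below is a minimal sketch of the homework's scheme as I read it, with the generator's maximum passed in as a parameter. `interval_index` is an invented name.

```c
#include <assert.h>

/* Hypothetical helper: the homework's mapping, where the interval
   length is rand_max / 100 and the result is r / interval. */
long interval_index(long r, long rand_max)
{
    assert(rand_max >= 100 && r >= 0 && r <= rand_max);
    long interval = rand_max / 100;   /* integer division drops the remainder */
    return r / interval;
}
```

With a maximum of 2147483647 the value 279172968 maps to 13 as described, but the largest raw values map to 100, which is outside the intended 0 to 99.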
I feel the only viable solution is to discard all numbers greater than or equal to RAND_MAX / 100 * 100 (using C integer arithmetic). The remaining numbers are uniformly distributed and their count is divisible by 100, so with them we can just use rand() % 100. So something like this:
int get_2_digit_number(void) {
    int r;
    /* RAND_MAX / 100 * 100 is the largest multiple of 100 not above RAND_MAX */
    do {
        r = rand();
    } while (r >= RAND_MAX / 100 * 100); /* reject the incomplete last block */
    return r % 100;
}
You can find relevant code on SO. For example, the rand_int() code below is based on code for integers in an answer to
Is this C implementation of the Fisher-Yates shuffle correct? (and specifically the answer by Roland Illig):
static size_t rand_int(size_t n)
{
size_t limit = RAND_MAX - RAND_MAX % n;
size_t rnd;
while ((rnd = rand()) >= limit)
;
return rnd % n;
}
The idea is that you calculate and ignore the large values returned by rand() which would lead to biassed results. When one of the large values is returned, you ignore it and try the next value. This will seldom need more than two calls to rand().
You might find some of the external references in Shuffle array in C useful too.

Why does C use modulus for random numbers?

C:
rand() % (max - min)
Let's say the random is between 0-10..
rand() % 10
0.567 % 10 = that same number. (0.567). It isn't really doing anything. a rand() is always between 0-1, and as long as max-min is always >= 1, it will do nothing at all.
Wouldn't you just use multiplication instead of modulo?
int rand = rand() * (max - min) + 1
rand() returns an integer between 0 and RAND_MAX, not a fraction between 0 and 1. You then use modulo to constrain it to a certain range: rand() % 10 gives a number from 0 to 9, so if you wanted a number between 0 and 10 inclusive, you would use rand() % 11.
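The multiplication the question has in mind does work once the integer result is first scaled down to a fraction. Here is a minimal sketch under that assumption; both helper names are hypothetical, and the raw value is passed in so the mapping is easy to inspect.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: scales a raw rand() result to a double in [0.0, 1.0). */
double to_unit_interval(int r)
{
    assert(r >= 0 && r <= RAND_MAX);
    return (double)r / ((double)RAND_MAX + 1.0);
}

/* Hypothetical helper: maps a raw result to an integer in [0, n-1]
   by multiplying the fraction by n and truncating. */
int scale_by_multiplication(int r, int n)
{
    assert(n > 0);
    return (int)(to_unit_interval(r) * n);
}
```

`scale_by_multiplication(rand(), 10)` gives 0 to 9, the same range as `rand() % 10`, just computed from the high end of the value instead of the remainder.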

A line which randomizes a number between two values

My problem this time is not using a line but understanding it.
I received this line from my teacher to randomize a number between the MIN and MAX values, and it works perfectly, but I have tried to understand how exactly, and I just couldn't.
I would be happy if anyone could explain it to me step by step (please note I'm not 100% sure how the rand() function works).
Thanks!
int number = (rand() % (DICE_MAX - DICE_MIN +1)) + DICE_MIN; // Randomizing a value between 'DICE_MAX' and 'DICE_MIN' which can be defined on the head of this program.
The function rand() generates a random (well, pseudo-random to be precise) number. The int returned from it has a large range, so you need to scale it to necessary range.
Assuming DICE_MIN to be 1 and DICE_MAX to be 6, you need to generate random integers in the range [1, 6]. There are 6 numbers in the range, and DICE_MAX - DICE_MIN + 1 = 6. So whatever integer you get from rand() the value of rand() % (DICE_MAX - DICE_MIN + 1) will be in the range [0, 5]. Adding the minimum of the required range DICE_MIN to it shifts the range to [1, 6].
This is a very widely practiced technique for generating random numbers in a given range.
rand:
Function: Random number generator.
Include: stdlib.h
syntax: int rand(void);
Return Value: The function rand returns the generated pseudo random number.
Description: The rand function generates an integer between 0 and RAND_MAX (a symbolic constant defined in stdlib.h). The C standard states that the value of RAND_MAX must be at least 32767. If rand truly produces integers at random, every number between 0 and RAND_MAX has an equal probability of being chosen each time rand is called.
How it works?
Take the example of rolling a six-sided die. The remainder operator % is used here in conjunction with rand as:
rand() % 6;
to produce integers in the range 0 to 5. This is called scaling. The number 6 is called the scaling factor. But we need to generate numbers from 1 to 6, so we shift the range of numbers produced by adding 1 to the result (1 + rand() % 6).
In general
n = a + rand() % b;
where a is the shifting value (which is equal to the first number in the desired range of consecutive integers, i.e, to lower bound) and b is equal to the width of the desired range of consecutive integers.
In your snippet
int number = (rand() % (DICE_MAX - DICE_MIN +1)) + DICE_MIN;
DICE_MAX - DICE_MIN +1 is desired width and DICE_MIN is the shifting value.
Further reading: Using rand().
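The scale-then-shift formula can be tried on fixed inputs to see how raw values wrap around. This is only a sketch: `scale_and_shift` is a made-up name, and it assumes hi - lo + 1 fits in an int.

```c
#include <assert.h>

/* Hypothetical helper: applies "shift + raw % width" to a raw value. */
int scale_and_shift(int raw, int lo, int hi)
{
    assert(raw >= 0 && lo <= hi);
    int width = hi - lo + 1;   /* count of integers in [lo, hi] */
    return lo + raw % width;   /* raw % width is 0..width-1 */
}
```

With lo = 1 and hi = 6, the raw values 0 through 5 map to 1 through 6, and raw value 6 wraps back to 1. Passing `rand()` as the raw value gives the teacher's line.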

How do I get a specific range of numbers from rand()?

srand(time(NULL));
printf("%d", rand());
Gives a high-range random number (0-32000ish), but I only need about 0-63 or 0-127, though I'm not sure how to go about it. Any help?
rand() % (max_number + 1 - minimum_number) + minimum_number
So, for 0-65:
rand() % (65 + 1 - 0) + 0
(obviously you can leave the 0 off, but it's there for completeness).
Note that this will bias the randomness slightly, but probably not anything to be concerned about if you're not doing something particularly sensitive.
You can use this:
int random(int min, int max){
return min + rand() / (RAND_MAX / (max - min + 1) + 1);
}
From the:
comp.lang.c FAQ list · Question 13.16
Q: How can I get random integers in a certain range?
A: The obvious way,
rand() % N /* POOR */
(which tries to return numbers from 0 to N-1) is poor, because the
low-order bits of many random number generators are distressingly
non-random. (See question 13.18.) A better method is something like
(int)((double)rand() / ((double)RAND_MAX + 1) * N)
If you'd rather not use floating point, another method is
rand() / (RAND_MAX / N + 1)
If you just need to do something with probability 1/N, you could use
if(rand() < (RAND_MAX+1u) / N)
All these methods obviously require knowing RAND_MAX (which ANSI #defines in <stdlib.h>), and assume that N is much less than RAND_MAX. When N is close to RAND_MAX, and if the range of the random number
generator is not a multiple of N (i.e. if (RAND_MAX+1) % N != 0), all
of these methods break down: some outputs occur more often than
others. (Using floating point does not help; the problem is that rand
returns RAND_MAX+1 distinct values, which cannot always be evenly
divvied up into N buckets.) If this is a problem, about the only thing
you can do is to call rand multiple times, discarding certain values:
unsigned int x = (RAND_MAX + 1u) / N;
unsigned int y = x * N;
unsigned int r;
do {
r = rand();
} while(r >= y);
return r / x;
For any of these techniques, it's straightforward to shift the range,
if necessary; numbers in the range [M, N] could be generated with
something like
M + rand() / (RAND_MAX / (N - M + 1) + 1)
(Note, by the way, that RAND_MAX is a constant telling you what the
fixed range of the C library rand function is. You cannot set RAND_MAX
to some other value, and there is no way of requesting that rand
return numbers in some other range.)
If you're starting with a random number generator which returns
floating-point values between 0 and 1 (such as the last version of
PMrand alluded to in question 13.15, or drand48 in question
13.21), all you have to do to get integers from 0 to N-1 is
multiply the output of that generator by N:
(int)(drand48() * N)
Additional links
References: K&R2 Sec. 7.8.7 p. 168
PCS Sec. 11 p. 172
Quote from: http://c-faq.com/lib/randrange.html
check here
http://c-faq.com/lib/randrange.html
For any of these techniques, it's straightforward to shift the range, if necessary; numbers in the range [M, N] could be generated with something like
M + rand() / (RAND_MAX / (N - M + 1) + 1)
Taking the modulo of the result, as the other posters have asserted will give you something that's nearly random, but not perfectly so.
Consider this extreme example, suppose you wanted to simulate a coin toss, returning either 0 or 1. You might do this:
isHeads = ( rand() % 2 ) == 1;
Looks harmless enough, right? Suppose that rand() can return only three values, 0, 1 and 2 (that is, RAND_MAX is 2). It's much higher of course, but the point here is that there's a bias when you use a modulus that doesn't evenly divide the number of possible values, RAND_MAX + 1. If you want high quality random numbers, you're going to have a problem.
Consider my example. The possible outcomes are:
rand() | freq. | rand() % 2
-------+-------+------------
0 | 1/3 | 0
1 | 1/3 | 1
2 | 1/3 | 0
Hence, "tails" will happen twice as often as "heads"!
Mr. Atwood discusses this matter in this Coding Horror Article
The naive way to do it is:
int myRand = rand() % 66; // for 0-65
This will likely be a very slightly non-uniform distribution (depending on your maximum value), but it's pretty close.
To explain why it's not quite uniform, consider this very simplified example:
Suppose RAND_MAX is 3 and you want a number from 0-2. The possible values you can get are shown in this table:
rand() | rand() % 3
---------+------------
0 | 0
1 | 1
2 | 2
3 | 0
See the problem? If the size of your range does not evenly divide RAND_MAX + 1, you'll be more likely to choose small values. However, since RAND_MAX is at least 32767, the bias is likely to be small enough to get away with for most purposes when the range is small.
There are various ways to get around this problem; see here for an explanation of how Java's Random handles it.
rand() will return numbers between 0 and RAND_MAX, which is at least 32767.
If you want to get a number within a range, you can just use modulo.
int value = rand() % 66; // 0-65
For more accuracy, check out this article. It discusses why modulo is not necessarily good (bad distributions, particularly on the high end), and provides various options.
As others have noted, simply using a modulus will skew the probabilities for individual numbers so that smaller numbers are preferred.
A very ingenious and good solution to that problem is used in Java's java.util.Random class:
public int nextInt(int n) {
if (n <= 0)
throw new IllegalArgumentException("n must be positive");
if ((n & -n) == n) // i.e., n is a power of 2
return (int)((n * (long)next(31)) >> 31);
int bits, val;
do {
bits = next(31);
val = bits % n;
} while (bits - val + (n-1) < 0);
return val;
}
It took me a while to understand why it works and I leave that as an exercise for the reader but it's a pretty concise solution which will ensure that numbers have equal probabilities.
The important part in that piece of code is the condition for the while loop, which rejects numbers that fall in the range of numbers which otherwise would result in an uneven distribution.
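For readers who would rather not do the exercise, here is one way to read that loop condition, written as a C sketch. It is my interpretation and not part of the Java source: the helper name `accepts` is invented, and because signed overflow is undefined in C, the sum is computed in 64 bits and compared with 2^31 instead of relying on wraparound to a negative number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a 31-bit value `bits` would be
   accepted for bound n, 0 if it would be rejected and redrawn. */
int accepts(uint32_t bits, uint32_t n)
{
    assert(n > 0 && n <= 0x7FFFFFFFu && bits <= 0x7FFFFFFFu);
    uint32_t val = bits % n;
    /* bits - val is the first member of the block of n consecutive
       values that contains bits. The block is complete only if its
       last member, bits - val + (n - 1), is still below 2^31. */
    uint64_t last_in_block = (uint64_t)bits - val + (n - 1);
    return last_in_block < 0x80000000u;
}
```

For n = 3 the values 2147483646 and 2147483647 are rejected, because the block starting at 2147483646 would need a third member that does not exist in 31 bits.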
double scale = 1.0 / ((double) RAND_MAX + 1.0);
int min, max;
...
rval = (int)(rand() * scale * (max - min + 1) + min);
Updated to not use a #define
double RAND(double min, double max)
{
return (double)rand()/(double)RAND_MAX * (max - min) + min;
}
If you don't overly care about the 'randomness' of the low-order bits, just rand() % HI_VAL.
Also:
(double)rand() / (double)RAND_MAX; // lazy way to get [0.0, 1.0]; 1.0 is included because rand() can return RAND_MAX
This answer does not focus on the randomness but on the arithmetic order.
To get a number within a range, usually we can do it like this:
// the range is between [aMin, aMax]
double f = (double)rand() / RAND_MAX;
double result = aMin + f * (aMax - aMin);
However, there is a possibility that (aMax - aMin) overflows. E.g. aMax = 1, aMin = -DBL_MAX. A safer way is to write like this:
// the range is between [aMin, aMax]
double f = (double)rand() / RAND_MAX;
double result = aMin - f * aMin + f * aMax;
Based on this concept, something like this may cause a problem.
rand() % (max_number + 1 - minimum_number) + minimum_number
// 1. max_number + 1 might overflow
// 2. max_number + 1 - min_number might overflow
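One possible way around both overflows is to compute the width in unsigned arithmetic, which wraps instead of overflowing. The sketch below is an assumption-laden illustration, not something from the answers: `map_into_range` is a made-up name, and it assumes the usual case where UINT_MAX equals 2 * INT_MAX + 1.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: maps a raw non-negative value into [min, max]
   without overflowing int, even for min = INT_MIN and max = INT_MAX. */
int map_into_range(int raw, int min, int max)
{
    assert(raw >= 0 && min <= max);
    /* width is max - min + 1 modulo 2^N; it is 0 only when the
       range covers every int value */
    unsigned width = (unsigned)max - (unsigned)min + 1u;
    unsigned offset = (width == 0) ? (unsigned)raw : (unsigned)raw % width;
    /* offset <= raw <= INT_MAX and offset <= max - min,
       so min + offset cannot exceed max */
    return min + (int)offset;
}
```

Note that when the range is wider than RAND_MAX + 1, a single raw value from rand() cannot reach every number in it, so this only fixes the arithmetic, not the coverage.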
If you care about the quality of your random numbers, don't use rand().
Use some other PRNG, such as the Mersenne Twister (http://en.wikipedia.org/wiki/Mersenne_twister) or one of the other high-quality PRNGs out there,
then just go with the modulus.
Just to add some extra detail to the existing answers.
The mod % operation will always perform a complete division and therefore yield a remainder less than the divisor. For non-negative x and positive y:
x % y = x - (y * floor((x/y)))
An example of a random range finding function with comments:
uint32_t rand_range(uint32_t n, uint32_t m) {
// size of range, inclusive
const uint32_t length_of_range = m - n + 1;
// add n so that we don't return a number below our range
return (uint32_t)(rand() % length_of_range + n);
}
Another interesting property as per the above:
x % y = x, if x < y
const uint32_t value = rand_range(1, RAND_MAX); // computes rand() % RAND_MAX + 1
// For every result x of rand() with x < RAND_MAX, value == x + 1;
// when x == RAND_MAX, x % RAND_MAX is 0 and value wraps around to 1.
2 cents (ok 4 cents):
n = rand()
x = result
l = limit
n/RAND_MAX = x/l
Refactor:
(l/1)*(n/RAND_MAX) = (x/l)*(l/1)
Gives:
x = l*n/RAND_MAX
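int randn(int limit)
{
    /* widen before multiplying: limit * rand() can overflow int */
    return (int)((long long)limit * rand() / RAND_MAX); /* equals limit only when rand() == RAND_MAX */
}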
int i;
for (i = 0; i < 100; i++) {
printf("%d ", randn(10));
if (!(i % 16)) printf("\n");
}
> test
0
5 1 8 5 4 3 8 8 7 1 8 7 5 3 0 0
3 1 1 9 4 1 0 0 3 5 5 6 6 1 6 4
3 0 6 7 8 5 3 8 7 9 9 5 1 4 2 8
2 7 8 9 9 6 3 2 2 8 0 3 0 6 0 0
9 2 2 5 6 8 7 4 2 7 4 4 9 7 1 5
3 7 6 5 3 1 2 4 8 5 9 7 3 1 6 4
0 6 5
Using rand() on its own gives you the same sequence of random numbers every time the program runs: if the first run produces x, y and z, the next run produces the same x, y and z.
The solution I found to get a different sequence on each run is to seed the generator with srand().
Here is the additional code:
#include<stdlib.h>
#include<time.h>
time_t t;
srand((unsigned) time(&t));
int rand_number = rand() % (65 + 1 - 0) + 0; // i.e. random numbers in the range 0-65
To set range you can use formula : rand() % (max_number + 1 - minimum_number) + minimum_number
Hope it helps!
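The repeatability is easy to observe directly: seeding with the same value replays the same sequence. A minimal sketch follows; the helper name `first_values` is invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: reseeds the generator with `seed` and stores
   the first n values it then produces in out[]. */
void first_values(unsigned seed, int out[], int n)
{
    assert(n >= 0);
    srand(seed);
    for (int i = 0; i < n; ++i)
        out[i] = rand();
}
```

Two calls with the same seed fill their arrays with identical values, which is why a seed that changes between runs, such as the current time, is needed to get different sequences.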
You can narrow the range by putting % and the size of the range after the call to rand().
For example:
rand() % 50
will give you a random number from 0 to 49, a range of 50 values. For your ranges of 0-63 or 0-127, replace 50 with 64 or 128.
I think the following does it semi right. It's been a while since I've touched C. The idea is to use division, since modulus doesn't always give random results. I added 1 to RAND_MAX since there are that many possible values coming from rand, including 0. And since the range also includes 0, I added 1 there too. The parentheses matter: without them, 1/(max+1) is evaluated first and is 0 in integer math.
#define MK_DIVISOR(max) ((int)(((unsigned int)RAND_MAX + 1u) / ((max) + 1)))
num = rand() / MK_DIVISOR(65); /* can yield 66 for the largest rand() values when RAND_MAX + 1 is not a multiple of 66; reroll in that case */
Simpler alternative to @Joey's answer. If you decide to go with the % method, you need to do a reroll to get the correct distribution. However, you can skip rerolls most of the time because you only need to avoid numbers that fall in the last bucket:
int rand_less_than(int max) {
int last_bucket_min = RAND_MAX - RAND_MAX % max;
int value;
do {
value = rand();
} while (last_bucket_min <= value);
return value % max;
}
See @JarosrawPawlak's article for explanation with diagrams: Random number generator using modulo
In case of RAND_MAX < max, you need to expand the generator: Expand a random range from 1–5 to 1–7
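The usual way to expand a generator is to treat two draws as two digits of a larger number. The sketch below shows just that combining step, under the assumption that rand_max is below 2^32 so the product fits in 64 bits. `combine_draws` is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: treats two raw values as the high and low
   "digits" of a number written in base (rand_max + 1). */
unsigned long long combine_draws(unsigned long long hi,
                                 unsigned long long lo,
                                 unsigned long long rand_max)
{
    assert(rand_max < 0x100000000ull);   /* keeps the product within 64 bits */
    assert(hi <= rand_max && lo <= rand_max);
    return hi * (rand_max + 1ull) + lo;
}
```

Two draws from a generator with maximum 32767 cover 0 to 1073741823 uniformly. The combined value still needs the same reroll treatment as above before it is reduced to the target range.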
#include <stdio.h>
#include <stdlib.h>
#include <time.h> // this line is necessary
int main() {
srand(time(NULL)); // this line is necessary
int random_number = rand() % 65; // [0-64]
return 0;
}
For any range between min_num and max_num:
int random_number = rand() % (max_num + 1 - min_num) + min_num;

Resources