I'm trying to use the function nanosleep to make my process sleep for a random amount of time between 0 and 1/10th of a second.
I'm using srand() to seed my random number generator with the process ID, that is, I'm calling:
srand(getpid());
then using
struct timespec delay;
delay.tv_sec = 0;
delay.tv_nsec = rand();
nanosleep(&delay, NULL);
How can I make sure I'm sleeping for 0..1/10th of a second?
I'd say you just need 100000000ULL * rand() / RAND_MAX nanoseconds; this is at most 0.1 s and at least 0 s. Alternatively, try usleep() with the argument 100000ULL * rand() / RAND_MAX. (I think usleep requires fewer CPU resources.)
(Edit: Added "unsigned long long" literal specifier to ensure that the number fits. See comments below, and thanks to caf for pointing this out!)
You need to "cap" your rand(), like:
delay.tv_nsec = rand() % 100000000; /* not 1e8: the % operator needs integer operands */
Some experts say this is not the optimal way to do it, because you use the low-order bits of the number, and on some implementations they are not as "random" as the higher bits, but this is fast and reliable.
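Putting the thread together, a minimal sketch of the whole program (assuming a POSIX system, since nanosleep and getpid are POSIX):

#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec delay;

    srand(getpid());
    delay.tv_sec = 0;
    delay.tv_nsec = rand() % 100000000;  /* 0 .. 0.1 s, expressed in nanoseconds */
    nanosleep(&delay, NULL);
    return 0;
}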
I have a problem: I want to use rand() to get a random number between 0 and 6, but it always gives me 4 on each run, even though I call srand(time(NULL)):
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
int main(void)
{
    srand(time(NULL));
    int rd = rand() % 7;
    printf("%d\n", rd);
    return (0);
}
The output is 4 on each run.
There are two fundamental problems with your code which, in combination, produce the curious result you're experiencing.
Almost everyone will warn you about the use of the rand() interface. Indeed, the macOS manpage itself starts with a warning:
$ man rand
NAME
rand, srand, sranddev, rand_r -- bad random number generator
Yep, it's a bad random number generator. Bad random number generators can be hard to seed, among other problems.
But speaking of seeding, here's another issue, perhaps less discussed but nonetheless important:
Do not use time(NULL) to seed your random number generator.
The linked answer goes into more detail about this, but the basic issue is simple: the value of time(NULL) changes infrequently (if frequently is measured in nanoseconds), and doesn't change much when it changes. So not only are you relying on the program to not be run very often (or at least less than once per second), you're also depending on the random number generator to produce radically different values from slightly different seeds. Perhaps a good random number generator would do that, but we've already established that rand() is a bad random number generator.
OK, that's all very general. The specific problem is somewhat interesting, at least for academic purposes (academic, since the practical solution is always "use a better random number generator and seed it with a good random seed"). The precise problem here is that you're using rand() % 7.
That's a problem because what the macOS / FreeBSD implementation of rand() does is to multiply the seed by a multiple of 7. Because that product is reduced modulo 2^31 - 1 (which is not a multiple of 7), the value modulo 7 of the first random number produced by slowly incrementing seeds will eventually change, but it has to wait until the amount of the overflow changes.
Here's a link to the code. The essence is in these three lines:
hi = *ctx / 127773;
lo = *ctx % 127773;
x = 16807 * lo - 2836 * hi;
which, according to a comment, "compute[s] x = (7^5 * x) mod (2^31 - 1) without overflowing 31 bits." x is the value which will eventually be returned (modulo 2^31), and it is also the next seed. *ctx is the current seed.
16807 is, as the comment says, 7^5, which is obviously divisible by 7. And 2836 mod 7 is 1. So by the rules of modular arithmetic:
x mod 7 = (16807 * lo) mod 7 - (2836 * hi) mod 7
        = 0 - hi mod 7
That value only depends on hi, which is seed / 127773. So hi changes exactly once every 127773 ticks. Since the result of time(NULL) is in seconds, that's one change in 127773 seconds, which is about a day and a half. So if you ran your program once a day, you'd notice that the first random number is sometimes the same as the previous day and sometimes one less. But you're running it quite a bit more often than that, even if you wait a few seconds between runs, so you just see the same first random number every time. Eventually it will tick down and then you'll see a series of 3s instead of 4s.
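If you want to watch this happen, here is a small sketch that prints the first rand() % 7 for a range of consecutive seeds; with the macOS / FreeBSD rand() described above the value stays the same across long runs of seeds, while with glibc it varies:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Print the first rand() % 7 for a few consecutive seeds. */
    for (unsigned int seed = 1; seed <= 20; seed++) {
        srand(seed);
        printf("seed %u -> %d\n", seed, rand() % 7);
    }
    return 0;
}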
As mentioned by @rici, the problem is caused by the poor implementation of rand(). The man page for srand() recommends using arc4random() instead. Alternatively, you could seed with a value taken directly from /dev/urandom, as follows:
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    int seed;
    FILE *f = fopen("/dev/urandom", "r");

    if (f == NULL || fread(&seed, sizeof(int), 1, f) != 1) {
        perror("/dev/urandom");
        return (1);
    }
    srand(seed);
    fclose(f);
    /* Should be a lot more unpredictable: */
    printf("%d\n", rand() % 7);
    return (0);
}
I was implementing a hashmap in C as part of a project I'm working on and using random inserts to test it. I noticed that rand() on Linux seems to repeat numbers far more often than on Mac. RAND_MAX is 2147483647/0x7FFFFFFF on both platforms. I've reduced it to this test program that makes a byte array RAND_MAX+1 bytes long, generates RAND_MAX random numbers, notes whether each is a duplicate, and checks it off the list as seen.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int main() {
    size_t size = ((size_t)RAND_MAX) + 1;
    char *randoms = calloc(size, sizeof(char));
    int dups = 0;

    if (randoms == NULL)
        return 1;  /* the ~2 GiB allocation can fail */
    srand(time(0));
    for (int i = 0; i < RAND_MAX; i++) {
        int r = rand();
        if (randoms[r]) {
            // printf("duplicate at %d\n", r);
            dups++;
        }
        randoms[r] = 1;
    }
    printf("duplicates: %d\n", dups);
    free(randoms);
}
Linux consistently generates around 790 million duplicates. Mac consistently only generates one, so it loops through every random number that it can generate almost without repeating. Can anyone please explain to me how this works? I can't tell anything different from the man pages, can't tell which RNG each is using, and can't find anything online. Thanks!
While at first it may sound like the macOS rand() is somehow better for not repeating any numbers, one should note that with this amount of numbers generated it is expected to see plenty of duplicates (in fact, around 790 million, or (2^31 - 1)/e). Likewise, iterating through the numbers in sequence would also produce no duplicates, but wouldn't be considered very random. So the Linux rand() implementation is in this test indistinguishable from a true random source, whereas the macOS rand() is not.
Another thing that appears surprising at first glance is how the macOS rand() can manage to avoid duplicates so well. Looking at its source code, we find the implementation to be as follows:
/*
* Compute x = (7^5 * x) mod (2^31 - 1)
* without overflowing 31 bits:
* (2^31 - 1) = 127773 * (7^5) + 2836
* From "Random number generators: good ones are hard to find",
* Park and Miller, Communications of the ACM, vol. 31, no. 10,
* October 1988, p. 1195.
*/
long hi, lo, x;

/* Can't be initialized with 0, so use another value. */
if (*ctx == 0)
    *ctx = 123459876;
hi = *ctx / 127773;
lo = *ctx % 127773;
x = 16807 * lo - 2836 * hi;
if (x < 0)
    x += 0x7fffffff;
return ((*ctx = x) % ((unsigned long) RAND_MAX + 1));
This does indeed result in every number between 1 and RAND_MAX - 1, inclusive, exactly once, before the sequence repeats again. Since the next state is based on a multiplication, the state can never be zero (or all future states would also be zero). Thus the repeated number you see is the first one, and zero is the one that is never returned.
Apple has been promoting the use of better random number generators in their documentation and examples for at least as long as macOS (or OS X) has existed, so the quality of rand() is probably not deemed important, and they've just stuck with one of the simplest pseudorandom generators available. (As you noted, their rand() is even commented with a recommendation to use arc4random() instead.)
On a related note, the simplest pseudorandom number generator I could find that produces decent results in this (and many other) tests for randomness is xorshift*:
#include <stdint.h>

uint64_t xorshift_star(uint64_t *ctx)  /* the state in *ctx must be non-zero */
{
    uint64_t x = *ctx;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *ctx = x;
    return (x * 0x2545F4914F6CDD1DULL) >> 33;
}
This implementation results in almost exactly 790 million duplicates in your test.
macOS's stdlib provides a rand() implementation that is easy to identify: if you leave it unseeded, the first values it outputs are 16807, 282475249, 1622650073, 984943658 and 1144108930. A quick search will show that this sequence corresponds to a very basic LCG random number generator that iterates the following formula:
x[n+1] = 7^5 * x[n] (mod 2^31 - 1)
Since the state of this RNG is described entirely by the value of a single 32-bit integer, its period is not very long. To be precise, it repeats itself every 2^31 - 2 iterations, outputting every value from 1 to 2^31 - 2.
I don't think there's a standard implementation of rand() for all versions of Linux, but there is a glibc rand() function that is often used. Instead of a single 32-bit state variable, this uses a pool of over 1000 bits, which to all intents and purposes will never produce a fully repeating sequence. Again, you can probably find out what version you have by printing the first few outputs from this RNG without seeding it first. (The glibc rand() function produces the numbers 1804289383, 846930886, 1681692777, 1714636915 and 1957747793.)
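To check which implementation you have yourself, you can print the first few outputs without seeding; the C standard requires an unseeded rand() to behave as if srand(1) had been called, so the sequence is reproducible:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* No srand(): the standard says this behaves like srand(1). */
    for (int i = 0; i < 5; i++)
        printf("%d\n", rand());
    return 0;
}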
So the reason you're getting more collisions in Linux (and hardly any in MacOS) is that the Linux version of rand() is basically more random.
rand() is defined by the C standard, and the C standard does not specify which algorithm to use. Obviously, Apple is using an algorithm inferior to your GNU/Linux implementation: the Linux one is indistinguishable from a true random source in your test, while the Apple implementation just shuffles the numbers around.
If you want random numbers of any quality, either use a better PRNG that gives at least some guarantees on the quality of the numbers it returns, or simply read from /dev/urandom or similar. The latter gives you cryptographic-quality numbers, but it is slow. Even if it is too slow by itself, /dev/urandom can provide some excellent seeds to some other, faster PRNG.
In general, the rand/srand pair has been considered sort of deprecated for a long time due to low-order bits displaying less randomness than high-order bits in the results. This may or may not have anything to do with your results, but I think this is still a good opportunity to remember that even though some rand/srand implementations are now more up to date, older implementations persist and it's better to use random(3). On my Arch Linux box, the following note is still in the man page for rand(3):
The versions of rand() and srand() in the Linux C Library use the same
random number generator as random(3) and srandom(3), so the lower-order
bits should be as random as the higher-order bits. However, on older
rand() implementations, and on current implementations on different
systems, the lower-order bits are much less random than the higher-order
bits. Do not use this function in applications intended to be portable
when good randomness is needed. (Use random(3) instead.)
Just below that, the man page actually gives very short, very simple example implementations of rand and srand that are about the simplest LC RNGs you've ever seen, with a small RAND_MAX. I don't think they match what's in the C standard library, if they ever did. Or at least I hope not.
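For reference, the example in question is essentially the sample generator given in the C standard (reproduced here from memory of the rand(3) man page; treat it as illustrative, not as what your libc actually ships):

static unsigned long next = 1;

/* RAND_MAX assumed to be 32767 */
int myrand(void) {
    next = next * 1103515245 + 12345;
    return ((unsigned)(next / 65536) % 32768);
}

void mysrand(unsigned int seed) {
    next = seed;
}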
In general, if you're going to use something from the standard library, use random if you can (the man page lists it as POSIX standard back to POSIX.1-2001, but rand goes way back to before C was even standardized). Or better yet, crack open Numerical Recipes (or look for it online) or Knuth and implement one. They're really easy, and you only really need to do it once to have a general-purpose RNG with the attributes you most often need and whose quality is known.
I've written the code below to make each iteration of a while(1) loop take a specific amount of time (in this example 10000 µs, which equals 0.01 seconds). The problem is that this code works pretty well at the start but somehow stops after less than a minute. It's as if there were a limit on accessing the Linux time. For now, I am initializing a boolean variable to make this time calculation run once instead of infinitely. Since performance varies over time, it'd be good to calculate the computation time for each loop. Is there any other way to accomplish this?
void some_function(){
    struct timeval tstart, tend;
    long diff;

    while (1){
        gettimeofday (&tstart, NULL);
        /* ... some computation ... */
        gettimeofday (&tend, NULL);
        diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
        usleep(10000-diff);
    }
}
From the man page of usleep:
#include <unistd.h>
int usleep(useconds_t usec);
usec is an unsigned int; now guess what happens when diff > 10000 in the line below:
usleep(10000-diff);
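When diff exceeds 10000, the subtraction wraps around to a huge unsigned value and the process sleeps for over an hour. A minimal guard (sketch):

if (diff < 10000)
    usleep(10000 - diff);  /* skip the sleep entirely when the work overran the period */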
Well, the computation you make to get the difference is wrong:
diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
You are mixing different integer types, and missing that tv_usec can be an unsigned quantity, which you are subtracting from another unsigned quantity, so the subtraction can overflow... after that, you get as a result a full second plus a quantity that is around 4.0E9 µsec. That is some 4000 sec, or more than an hour... approximately. It is better to check whether there is a carry and, in that case, to increment tv_sec and then subtract 1000000 from tv_usec to get a properly positive value.
I don't know the implementation you are using for struct timeval, but the most probable is that tv_sec is a time_t (this can even be 64 bit) while tv_usec is normally just an unsigned 32-bit value, as it is never going to exceed 1000000.
Let me illustrate... suppose you have spent 100 ms doing calculations, and this happens to occur in the middle of a second. You have:
tstart.tv_sec = 123456789; tstart.tv_usec = 123456;
tend.tv_sec = 123456789; tend.tv_usec = 223456;
when you subtract, it leads to:
tv_sec = 0; tv_usec = 100000;
but let's suppose you have done your computation while the second changes:
tstart.tv_sec = 123456789; tstart.tv_usec = 923456;
tend.tv_sec = 123456790; tend.tv_usec = 23456;
the time difference is again 100 msec, but now, when you calculate your expression you get, for the first part, 1000000 (one full second); but after subtracting the second part you get 23456 - 923456, which wraps around to 4294067296 because of the unsigned overflow.
So you get to usleep(4295067296), or about 4295 s, i.e. 1 h 11 min more.
I think you simply have not had enough patience to wait for it to finish... but this is something that can be happening to your program, depending on how struct timeval is defined.
A proper way to make the carry work is to reorder the summation to do all the additions first and then the subtraction. This keeps the intermediate results non-negative, so the unsigned quantities never wrap:
diff = (tend.tv_sec - tstart.tv_sec) * 1000000 + tend.tv_usec - tstart.tv_usec;
which is parsed as
diff = (((tend.tv_sec - tstart.tv_sec) * 1000000) + tend.tv_usec) - tstart.tv_usec;
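Putting both answers together, here is a sketch of a corrected loop. The diff is held in a signed 64-bit type (the 1000000LL constant forces the additions into signed 64-bit math), and the sleep is skipped when the computation overran the period:

#include <sys/time.h>
#include <unistd.h>

void some_function(void)
{
    struct timeval tstart, tend;
    long long diff;

    while (1) {
        gettimeofday(&tstart, NULL);
        /* ... some computation ... */
        gettimeofday(&tend, NULL);
        /* Additions first, then the subtraction, all in 64-bit signed math. */
        diff = (tend.tv_sec - tstart.tv_sec) * 1000000LL
             + tend.tv_usec - tstart.tv_usec;
        if (diff < 10000)
            usleep(10000 - diff);
    }
}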
I am using the code below to convert nanoseconds to microseconds.
It runs fine mostly, but sometimes I see usTick give a value far beyond the current time.
For example, if the current time in usTick is 63290061063, sometimes the value comes out as 126580061060. As you can see, it is almost exactly double.
Similarly, in one more instance the current time was 45960787154, but usTick showed 91920787152.
#include <time.h>

typedef unsigned long long TUINT64;

unsigned long long GetMonoUSTick()
{
    static unsigned long long usTick;
    struct timespec t;

    clock_gettime(CLOCK_MONOTONIC, &t);
    usTick = ((TUINT64)t.tv_nsec) / 1000;
    usTick = usTick + ((TUINT64)t.tv_sec) * 1000000;
    return usTick;
}
If multiple threads of the same process access the same variables concurrently for reading/writing or writing/writing, those variables need to be protected. This can be achieved by using a mutex.
In this case the variable usTick needs to be protected, as it is declared static and is therefore shared by all threads calling the function.
Using POSIX-threads the code could look like this:
pthread_mutex_lock(&ustick_mutex);
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
pthread_mutex_unlock(&ustick_mutex);
(error checking left out for clarity)
Take care to initialise ustick_mutex properly before using it.
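For a statically allocated mutex, that initialisation can be as simple as:

#include <pthread.h>

static pthread_mutex_t ustick_mutex = PTHREAD_MUTEX_INITIALIZER;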
Sounds like you are using threads and that once in a while two threads are calling this function at the same time. Because usTick is static, they are working with the same variable (not two different copies). They both convert the nanoseconds to microseconds and assign to usTick, one after the other, and then they both convert the seconds to microseconds and both add them to usTick (so the seconds get added twice).
EDIT - The solution I proposed below was based on two assumptions:
If two threads called the function at the same time, the difference between the current times returned by clock_gettime() would be too small to matter (the difference between the results calculated by the threads would be at most 1).
On most modern CPUs, reading/writing an integer is an atomic operation.
I think you should be able to fix this by changing:
usTick = ((TUINT64)t.tv_nsec) / 1000;
usTick = usTick +((TUINT64)t.tv_sec) * 1000000;
to:
usTick = ((TUINT64)t.tv_nsec) / 1000 + ((TUINT64)t.tv_sec) * 1000000;
The problem with my solution was that even if the latter assumption were correct, it might not hold for long long. Therefore, for example, this could happen:
Thread A and B call the function at (almost) the same time.
Thread A calculates current time in microsecond as 0x02B3 1F02 FFFF FFFF.
Thread B calculates current time in microsecond as 0x02B3 1F03 0000 0000.
Thread A writes the least significant 32 bits (0xFFFF FFFF) to usTick.
Thread B writes the least significant 32 bits (0x0000 0000) to usTick.
Thread B writes the most significant 32 bits (0x02B3 1F03) to usTick.
Thread A writes the most significant 32 bits (0x02B3 1F02) to usTick.
and then the function would in both threads return 0x02B3 1F02 0000 0000 which is off by 4294967295.
So do as @alk said: use a mutex to protect the reads and writes of usTick.
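Alternatively, note that nothing in GetMonoUSTick() actually needs the value to persist between calls, so the simplest fix (a sketch) is to drop the static: a plain local variable lives on each thread's own stack, and the race disappears without any locking:

#include <time.h>

unsigned long long GetMonoUSTick(void)
{
    struct timespec t;
    unsigned long long usTick;  /* local: each thread gets its own copy */

    clock_gettime(CLOCK_MONOTONIC, &t);
    usTick = (unsigned long long)t.tv_nsec / 1000;
    usTick += (unsigned long long)t.tv_sec * 1000000;
    return usTick;
}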
I observed that when the rand() library function is called just once within a loop, it almost always produces positive numbers.
for (i = 0; i < 100; i++) {
    printf("%d\n", rand());
}
But when I add the results of two rand() calls, the generated numbers now include negative values.
for (i = 0; i < 100; i++) {
    printf("%d = %d\n", rand(), (rand() + rand()));
}
Can someone explain why I am seeing negative numbers in the second case?
PS: I initialize the seed before the loop as srand(time(NULL)).
rand() is defined to return an integer between 0 and RAND_MAX.
rand() + rand()
could overflow. What you observe is likely a result of undefined behaviour caused by integer overflow.
The problem is the addition. rand() returns an int value of 0...RAND_MAX. So, if you add two of them, you will get up to RAND_MAX * 2. If that exceeds INT_MAX, the result of the addition overflows the valid range an int can hold. Overflow of signed values is undefined behaviour and may lead to your keyboard talking to you in foreign tongues.
As there is no gain here in adding two random results, the simple idea is to just not do it. Alternatively you can cast each result to unsigned int before the addition if that can hold the sum. Or use a larger type. Note that long is not necessarily wider than int, the same applies to long long if int is at least 64 bits!
Conclusion: Just avoid the addition. It does not provide more "randomness". If you need more bits, you might concatenate the values instead, sum = a + b * ((long long)RAND_MAX + 1), but that requires a larger data type than int (note that RAND_MAX + 1 itself can overflow int, hence the cast).
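For illustration, sketches of the two safe variants just mentioned (unsigned addition, and concatenation into a wider type):

/* Wrap-around instead of UB: unsigned overflow is well defined. */
unsigned int usum = (unsigned int)rand() + (unsigned int)rand();

/* Concatenation: b selects a "block" of RAND_MAX + 1 values, a the offset within it. */
long long wide = (long long)rand() * ((long long)RAND_MAX + 1) + rand();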
As your stated reason is to avoid a zero-result: That cannot be avoided by adding the results of two rand() calls, as both can be zero. Instead, you can just increment. If RAND_MAX == INT_MAX, this cannot be done in int. However, (unsigned int)rand() + 1 will do very, very likely. Likely (not definitively), because it does require UINT_MAX > INT_MAX, which is true on all implementations I'm aware of (which covers quite some embedded architectures, DSPs and all desktop, mobile and server platforms of the past 30 years).
Warning:
Although already sprinkled in comments here, please note that adding two random values does not give a uniform distribution, but a triangular distribution, like rolling two dice: to get 12 (with two dice) both dice have to show 6, while for 11 there are already two possible variants: 6 + 5 or 5 + 6, etc.
So, the addition is also bad from this aspect.
Also note that the results rand() generates are not independent of each other, as they are generated by a pseudorandom number generator. Note also that the standard does not specify the quality or uniform distribution of the calculated values.
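To see that triangular shape in practice, here is a small sketch that tallies the sums of two simulated dice; the counts peak at 7 and fall off toward 2 and 12:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    int counts[13] = {0};  /* indices 2..12 are used */

    srand((unsigned)time(NULL));
    for (int i = 0; i < 360000; i++) {
        int sum = (rand() % 6 + 1) + (rand() % 6 + 1);  /* two dice */
        counts[sum]++;
    }
    for (int s = 2; s <= 12; s++)
        printf("%2d: %d\n", s, counts[s]);
    return 0;
}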
This is an answer to a clarification of the question, made in a comment on this answer:
the reason i was adding was to avoid '0' as the random number in my code. rand()+rand() was the quick dirty solution which readily came to my mind.
The problem was to avoid 0. There are (at least) two problems with the proposed solution. One is, as the other answers indicate, that rand()+rand() can invoke undefined behavior. Best advice is to never invoke undefined behavior. Another issue is there's no guarantee that rand() won't produce 0 twice in a row.
The following rejects zero, avoids undefined behavior, and in the vast majority of cases will be faster than two calls to rand():
int rnum;
for (rnum = rand(); rnum == 0; rnum = rand()) {}
// or do rnum = rand(); while (rnum == 0);
Basically, rand() produces numbers between 0 and RAND_MAX, and 2 * RAND_MAX > INT_MAX in your case.
You can take each result modulo the maximum value of your data type to prevent overflow. This of course will disrupt the distribution of the random numbers, but rand is just a way to get quick random numbers.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void)
{
    int i = 0;

    for (i = 0; i < 100; i++)
        printf(" %d : %d \n", rand(), ((rand() % (INT_MAX/2)) + (rand() % (INT_MAX/2))));
    for (i = 0; i < 100; i++)
        printf(" %d : %ld \n", rand(), ((rand() % (LONG_MAX/2)) + (rand() % (LONG_MAX/2))));
    return 0;
}
Maybe you could try a somewhat tricky approach: ensure that the sum of the two rand() calls never exceeds RAND_MAX, for example sum = rand()/2 + rand()/2;. This guarantees that, for a 16-bit compiler with a RAND_MAX of 32767, even if both calls happen to return 32767, then since 32767/2 = 16383, the sum 16383 + 16383 = 32766 does not turn negative.
the reason i was adding was to avoid '0' as the random number in my code. rand()+rand() was the quick dirty solution which readily came to my mind.
A simple solution (okay, call it a "Hack") which never produces a zero result and will never overflow is:
x=(rand()/2)+1 // using divide -or-
x=(rand()>>1)+1 // using shift which may be faster
// compiler optimization may use shift in both cases
This will limit your maximum value, but if you don't care about that, then this should work fine for you.
To avoid 0, try this:
int rnumb = rand()%(INT_MAX-1)+1;
You need to include limits.h.
thx. the reason i was adding was to avoid '0' as the random number in my code. rand()+rand() was the quick dirty solution which readily came to my mind
It sounds like an XY problem to me: in order not to get a 0 from rand(), you call rand() twice, making the program slower, introducing a new setback, and the possibility of getting a 0 is still there.
Another solution is using std::uniform_int_distribution, which creates a random, uniformly distributed number in the defined interval:
https://wandbox.org/permlink/QKIHG4ghwJf1b7ZN
#include <random>
#include <array>
#include <iostream>
int main()
{
    const int MAX_VALUE = 50;
    const int MIN_VALUE = 1;
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> distrib(MIN_VALUE, MAX_VALUE);
    // One bin per possible value (MIN_VALUE..MAX_VALUE inclusive), hence the +1.
    std::array<int, MAX_VALUE - MIN_VALUE + 1> weight = {0};

    for (int i = 0; i < 50000; i++) {
        weight[distrib(gen) - MIN_VALUE]++;
    }
    for (int i = 0; i < (int)weight.size(); i++) {
        std::cout << "value: " << MIN_VALUE + i << " times: " << weight[i] << std::endl;
    }
}
While what everyone else has said about the likely overflow could very well be the cause of the negative numbers, even when you use unsigned integers, the real problem is using the time/date functionality as the seed. If you have truly become familiar with this functionality, you will know why I say this: what it really gives you is a distance (elapsed time) since a given date/time. While using the date/time as the seed to rand() is a very common practice, it really is not the best option. You should look into better alternatives, as there are many theories on the topic and I could not possibly go into all of them. Add to this equation the possibility of overflow, and this approach was doomed from the beginning.
Those who posted the rand()+1 solution are using what most people use to guarantee that they do not get a negative number. But that approach is really not the best way either.
The best thing you can do is take the extra time to write and use proper error handling, only adding to the rand() result if and when you end up with a zero, and dealing with negative numbers properly. The rand() functionality is not perfect and therefore needs to be used in conjunction with error handling to ensure that you end up with the desired result.
Taking the extra time and effort to investigate, study, and properly implement the rand() functionality is well worth it. Just my two cents. Good luck in your endeavors...