Use system implementation if found, otherwise use my own implementation - c

I'm trying to use fls in my routine. However, not every system has this function, so I ship my own version of fls. Is there any way to let the program use the system implementation if it exists and, if it is not found, use my own implementation?
#include "strings.h"
#include <stdio.h>
int fls(int mask);
int foo(int N)
{
int tmp = 1 << (fls(N));
return tmp;
}
/*
* Find Last Set bit
*/
int
fls(int mask)
{
int bit;
if (mask == 0)
return (0);
for (bit = 1; mask != 1; bit++)
mask = (unsigned int) mask >> 1;
return (bit);
}

You can use a weak function.
https://en.wikipedia.org/wiki/Weak_symbol
By default, without any annotation, a symbol in an object file is
strong. During linking, a strong symbol can override a weak symbol of
the same name.
The same question for C++, which differs slightly from the C case: "Can I re-define a function or check if it exists?"
int __attribute__((weak)) fls(int mask){ .. }
So if the system's fls is defined as a strong symbol, your fls implementation will be overridden.
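A minimal sketch of the weak fallback, assuming GCC or Clang on an ELF target (the weak attribute is a compiler extension, not standard C). The loop is a rewrite of the question's fls, not a system implementation. Which definition wins is decided among the objects being linked, and the behaviour with a shared libc may differ, so check the result on each target.

```c
/* Fallback "find last set": returns the 1-based index of the highest
 * set bit, or 0 if no bit is set. Marked weak (GCC/Clang extension) so
 * that a strong definition of fls in another object file replaces it. */
__attribute__((weak)) int fls(int mask)
{
    unsigned int m = (unsigned int)mask;
    int bit = 0;

    while (m != 0) {
        bit++;
        m >>= 1;
    }
    return bit;
}
```

Callers such as foo() in the question keep calling fls(N) unchanged; only the definition carries the attribute.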

What's the "to little endian" equivalent of htonl? [duplicate]

I need to convert a short value from the host byte order to little endian. If the target was big endian, I could use the htons() function, but alas - it's not.
I guess I could do:
swap(htons(val))
But this could potentially cause the bytes to be swapped twice, rendering the result correct but giving me a performance penalty which is not alright in my case.
Here is an article about endianness and how to determine it from IBM:
Writing endian-independent code in C: Don't let endianness "byte" you
It includes an example of how to determine endianness at run time ( which you would only need to do once )
#include <stdio.h>
#include <stdlib.h>
const int i = 1;
#define is_bigendian() ( (*(char*)&i) == 0 )
int main(void) {
int val;
char *ptr;
ptr = (char*) &val;
val = 0x12345678;
/* print the bytes most significant first, whatever the host order */
if (is_bigendian()) {
printf("%X.%X.%X.%X\n", ptr[0], ptr[1], ptr[2], ptr[3]);
} else {
printf("%X.%X.%X.%X\n", ptr[3], ptr[2], ptr[1], ptr[0]);
}
exit(0);
}
The page also has a section on methods for reversing byte order:
short reverseShort (short s) {
unsigned char c1, c2;
if (is_bigendian()) {
return s;
} else {
c1 = s & 255;
c2 = (s >> 8) & 255;
return (c1 << 8) + c2;
}
}
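Or, as an alternative implementation (not meant to be compiled alongside the first), assembling the value from a byte pointer: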
short reverseShort (char *c) {
short s;
char *p = (char *)&s;
if (is_bigendian()) {
p[0] = c[0];
p[1] = c[1];
} else {
p[0] = c[1];
p[1] = c[0];
}
return s;
}
Then you should know your endianness and call htons() conditionally. Actually, not even htons, but just swap bytes conditionally. Compile-time, of course.
Something like the following:
unsigned short swaps( unsigned short val)
{
return ((val & 0xff) << 8) | ((val & 0xff00) >> 8);
}
/* host to little endian */
#define PLATFORM_IS_BIG_ENDIAN 1
#if PLATFORM_IS_LITTLE_ENDIAN
unsigned short htoles( unsigned short val)
{
/* no-op on a little endian platform */
return val;
}
#elif PLATFORM_IS_BIG_ENDIAN
unsigned short htoles( unsigned short val)
{
/* need to swap bytes on a big endian platform */
return swaps( val);
}
#else
unsigned short htoles( unsigned short val)
{
/* the platform hasn't been properly configured for the */
/* preprocessor to know if it's little or big endian */
/* use potentially less-performant, but always works option */
return swaps( htons(val));
}
#endif
If you have a system that's properly configured (such that the preprocessor knows whether the target is little or big endian) you get an 'optimized' version of htoles(). Otherwise you get the potentially non-optimized version that depends on htons(). In any case, you get something that works.
Nothing too tricky and more or less portable.
Of course, you can further improve the optimization possibilities by implementing this with inline or as macros as you see fit.
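As a sketch of the inline variant, the version below relies on the __BYTE_ORDER__ macros that GCC and Clang predefine; that is an assumption about the compiler, not something standard C provides. The names swap16 and htoles_inline are made up for illustration. If the macros are missing, it falls back to a run-time check instead of htons().

```c
#include <stdint.h>

/* Unconditional 16-bit byte swap. */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Host to little endian for a 16-bit value. */
static inline uint16_t htoles_inline(uint16_t v)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return v;                /* no-op on a little-endian target */
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return swap16(v);        /* swap on a big-endian target */
#else
    /* Byte order unknown at compile time: look at how the host stores 1. */
    const uint16_t probe = 1;
    return (*(const unsigned char *)&probe == 1) ? v : swap16(v);
#endif
}
```

With optimization enabled, the first two branches should compile down to nothing or a single swap, but that is worth confirming in the generated assembly.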
You might want to look at something like the "Portable Open Source Harness (POSH)" for an actual implementation that defines the endianness for various compilers. Note, getting to the library requires going through a pseudo-authentication page (though you don't need to register or give any personal details): http://hookatooka.com/poshlib/
This trick should work: at startup, use ntohs with a dummy value and then compare the resulting value to the original value. If both values are the same, then the machine uses big endian, otherwise it is little endian.
Then, use a ToLittleEndian method that either does nothing (on a little-endian machine) or swaps the bytes (on a big-endian machine), depending on the result of the initial test. Note that ntohs itself cannot perform that swap, because it is a no-op on a big-endian machine.
(Edited with the information provided in comments)
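A rough sketch of that run-time test, assuming a POSIX system that provides htons() in <arpa/inet.h>. The function names are hypothetical. Because htons()/ntohs() do nothing on a big-endian host, the sketch swaps the bytes explicitly instead of calling them for the conversion.

```c
#include <arpa/inet.h>   /* htons(), POSIX */
#include <stdint.h>

/* Network byte order is big endian, so htons() leaves the value
 * unchanged exactly when the host is big endian. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return htons(probe) == probe;
}

/* Host to little endian: swap only on a big-endian host. */
static uint16_t to_little_endian16(uint16_t v)
{
    if (host_is_big_endian())
        return (uint16_t)((v << 8) | (v >> 8));
    return v;
}
```

If the test is to run only once at startup, its result can be stored in a static variable and reused.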
My rule-of-thumb performance guess is that it depends on whether you are little-endian-ising a big block of data in one go, or just one value:
If just one value, then the function call overhead is probably going to swamp the overhead of unnecessary byte-swaps, and that's even if the compiler doesn't optimise away the unnecessary byte swaps. Then you're maybe going to write the value as the port number of a socket connection, and try to open or bind a socket, which takes an age compared with any sort of bit-manipulation. So just don't worry about it.
If a large block, then you might worry the compiler won't handle it. So do something like this:
if (!is_little_endian()) {
for (int i = 0; i < size; ++i) {
vals[i] = swap_short(vals[i]);
}
}
Or look into SIMD instructions on your architecture which can do it considerably faster.
Write is_little_endian() using whatever trick you like. I think the one Robert S. Barnes provides is sound, but since you usually know for a given target whether it's going to be big- or little-endian, maybe you should have a platform-specific header file, that defines it to be a macro evaluating either to 1 or 0.
As always, if you really care about performance, then look at the generated assembly to see whether pointless code has been removed or not, and time the various alternatives against each other to see what actually goes fastest.
Unfortunately, there's not really a cross-platform way to determine a system's byte order at compile-time with standard C. I suggest adding a #define to your config.h (or whatever else you or your build system uses for build configuration).
A unit test to check for the correct definition of LITTLE_ENDIAN or BIG_ENDIAN could look like this:
#include <assert.h>
#include <limits.h>
#include <stdint.h>
void check_bits_per_byte(void)
{ assert(CHAR_BIT == 8); }
void check_sizeof_uint32(void)
{ assert(sizeof (uint32_t) == 4); }
void check_byte_order(void)
{
static const union { unsigned char bytes[4]; uint32_t value; } byte_order =
{ { 1, 2, 3, 4 } };
static const uint32_t little_endian = 0x04030201ul;
static const uint32_t big_endian = 0x01020304ul;
#ifdef LITTLE_ENDIAN
assert(byte_order.value == little_endian);
#endif
#ifdef BIG_ENDIAN
assert(byte_order.value == big_endian);
#endif
#if !defined LITTLE_ENDIAN && !defined BIG_ENDIAN
assert(!"byte order unknown or unsupported");
#endif
}
int main(void)
{
check_bits_per_byte();
check_sizeof_uint32();
check_byte_order();
}
On many Linux systems, there is a <endian.h> or <sys/endian.h> with conversion functions. man page for ENDIAN(3)
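For illustration, a small sketch using htole16() from glibc's <endian.h>. This assumes Linux with glibc; on the BSDs the header is typically <sys/endian.h>. The helper name store_le16 is made up.

```c
#define _DEFAULT_SOURCE   /* asks glibc to expose htole16() and friends */
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Store v into out[0..1] in little-endian byte order. */
static void store_le16(unsigned char out[2], uint16_t v)
{
    uint16_t le = htole16(v);   /* no-op on little endian, swap on big endian */
    memcpy(out, &le, sizeof le);
}
```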

Generating random values without time.h

I want to generate random numbers repeatedly without using the time.h library. I saw another post suggesting the use of
srand(getpid());
However, that doesn't work for me: the compiler says getpid hasn't been declared. Is this because I'm missing the library for it? If so, I need to work out how to randomly generate numbers without using any libraries other than the ones I currently have.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int minute, hour, day, month, year;
srand(getpid());
minute = rand() % (59 + 1 - 0) + 0;
hour = rand() % (23 + 1 - 0) + 0;
day = rand() % (31 + 1 - 1) + 1;
month = rand() % (12 + 1 - 1) + 1;
year = 2018;
printf("Transferred successfully at %02d:%02d on %02d/%02d/%d\n", hour,
minute, day, month, year);
return 0;
}
NB: I can only use libraries <stdio.h> and <stdlib.h> and <string.h> — strict guidelines for an assignment.
getpid hasn't been declared.
No, because you haven't included the <unistd.h> header where it is declared (and according to your comment, you cannot use it, because you're restricted to using <stdlib.h>, <string.h>, and <stdio.h>).
In that case, I would use something like
#include <stdlib.h>
#include <stdio.h>
static int randomize_helper(FILE *in)
{
unsigned int seed;
if (!in)
return -1;
if (fread(&seed, sizeof seed, 1, in) == 1) {
fclose(in);
srand(seed);
return 0;
}
fclose(in);
return -1;
}
static int randomize(void)
{
if (!randomize_helper(fopen("/dev/urandom", "r")))
return 0;
if (!randomize_helper(fopen("/dev/arandom", "r")))
return 0;
if (!randomize_helper(fopen("/dev/random", "r")))
return 0;
/* Other randomness sources (binary format)? */
/* No randomness sources found. */
return -1;
}
and a simple main() to output some pseudorandom numbers:
int main(void)
{
int i;
if (randomize())
fprintf(stderr, "Warning: Could not find any sources for randomness.\n");
for (i = 0; i < 10; i++)
printf("%d\n", rand());
return EXIT_SUCCESS;
}
The /dev/urandom and /dev/random character devices are available in Linux, FreeBSD, macOS, iOS, Solaris, NetBSD, Tru64 Unix 5.1B, AIX 5.2, HP-UX 11i v2, and /dev/random and /dev/arandom on OpenBSD 5.1 and later.
As usual, it looks like Windows does not provide any such randomness sources: Windows C programs must use proprietary Microsoft interfaces instead.
The randomize_helper() returns nonzero if the input stream is NULL, or if it cannot read an unsigned int from it. If it can read an unsigned int from it, it is used to seed the standard pseudorandom number generator you can access using rand() (which returns an int between 0 and RAND_MAX, inclusive). In all cases, randomize_helper() closes non-NULL streams.
You can add other binary randomness sources to randomize() trivially.
If randomize() returns 0, rand() should return pseudorandom numbers. Otherwise, rand() will return the same default sequence of pseudorandom numbers. (They will still be "random", but the same sequence will occur every time you run the program. If randomize() returns 0, the sequence will be different every time you run the program.)
Most standard C rand() implementations are linear congruential pseudorandom number generators, often with poor choices of parameters, and as a result, are slowish, and not very "random".
For non-cryptographic work, I like to implement one of the Xorshift family of functions, originally by George Marsaglia. They are very, very fast, and reasonably random; they pass most of the statistical randomness tests like the diehard tests.
In OP's case, the xorwow generator could be used. The C standard only guarantees that unsigned int has at least 16 bits, but it is 32 bits on the systems where the /dev/urandom approach above applies, so we can use it as the generator type and mask the results to 32 bits. Let's see what implementing one to replace the standard srand()/rand() would look like:
#include <stdlib.h>
#include <stdio.h>
/* The Xorwow PRNG state. This must not be initialized to all zeros. */
static unsigned int prng_state[5] = { 1, 2, 3, 4, 5 };
/* The Xorwow is a 32-bit linear-feedback shift generator. */
#define PRNG_MAX 4294967295u
unsigned int prng(void)
{
unsigned int s, t;
t = prng_state[3] & PRNG_MAX;
t ^= t >> 2;
t ^= t << 1;
prng_state[3] = prng_state[2];
prng_state[2] = prng_state[1];
prng_state[1] = prng_state[0];
s = prng_state[0] & PRNG_MAX;
t ^= s;
t ^= (s << 4) & PRNG_MAX;
prng_state[0] = t;
prng_state[4] = (prng_state[4] + 362437) & PRNG_MAX;
return (t + prng_state[4]) & PRNG_MAX;
}
static int prng_randomize_from(FILE *in)
{
size_t have = 0, n;
unsigned int seed[5] = { 0, 0, 0, 0, 0 };
if (!in)
return -1;
while (have < 5) {
n = fread(seed + have, sizeof seed[0], 5 - have, in);
if (n > 0 && ((seed[0] | seed[1] | seed[2] | seed[3] | seed[4]) & PRNG_MAX) != 0) {
have += n;
} else {
fclose(in);
return -1;
}
}
fclose(in);
prng_state[0] = seed[0] & PRNG_MAX;
prng_state[1] = seed[1] & PRNG_MAX;
prng_state[2] = seed[2] & PRNG_MAX;
prng_state[3] = seed[3] & PRNG_MAX;
prng_state[4] = seed[4] & PRNG_MAX;
/* Note: We might wish to "churn" the pseudorandom
number generator state, to call prng()
a few hundred or thousand times. For example:
for (n = 0; n < 1000; n++) prng();
This way, even if the seed has clear structure,
for example only some low bits set, we start
with a PRNG state with set and clear bits well
distributed.
*/
return 0;
}
int prng_randomize(void)
{
if (!prng_randomize_from(fopen("/dev/urandom", "r")))
return 0;
if (!prng_randomize_from(fopen("/dev/arandom", "r")))
return 0;
if (!prng_randomize_from(fopen("/dev/random", "r")))
return 0;
/* Other sources? */
/* No randomness sources found. */
return -1;
}
The corresponding main() to above would be
int main(void)
{
int i;
if (prng_randomize())
fprintf(stderr, "Warning: No randomness sources found!\n");
for (i = 0; i < 10; i++)
printf("%u\n", prng());
return EXIT_SUCCESS;
}
Note that PRNG_MAX has a dual purpose. On one hand, it tells the maximum value prng() can return, which is an unsigned int, not int like rand(). On the other hand, because it must be 2^32-1 = 4294967295, we also use it to ensure the temporary results when generating the next pseudorandom number in the sequence remain 32-bit. If the uint32_t type, declared in stdint.h or inttypes.h, were available, we could use that and drop the masks (& PRNG_MAX).
Note that the prng_randomize_from() function is written so that it still works, even if the randomness source cannot provide all requested bytes at once, and returns a "short count". Whether this occurs in practice is open to debate, but I prefer to be certain. Also note that it does not accept the state if it is all zeros, as that is the one single prohibited initial seed state for the Xorwow PRNG.
You can obviously use both srand()/rand() and prng()/prng_randomize() in the same program. I wrote them so that the Xorwow generator functions all start with prng.
Usually, I do put the PRNG implementation into a header file, so that I can easily test it (to verify it works) by writing a tiny test program; but also so that I can switch the PRNG implementation simply by switching to another header file. (In some cases, I put the PRNG state into a structure, and have the caller provide a pointer to the state, so that any number of PRNGs can be used concurrently, independently of each other.)
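To illustrate the "state in a structure" variant mentioned in parentheses, here is a minimal sketch. It uses Marsaglia's plain 32-bit xorshift (shifts 13, 17, 5) rather than the Xorwow generator above, purely to keep the example short. The struct and function names are hypothetical. unsigned long is used with explicit masks because it is guaranteed to hold 32 bits.

```c
/* Caller-owned generator state; must never be zero. */
struct prng32 {
    unsigned long x;
};

/* Seed the generator, replacing the forbidden all-zero state with 1. */
static void prng32_seed(struct prng32 *g, unsigned long seed)
{
    seed &= 0xFFFFFFFFul;
    g->x = seed ? seed : 1ul;
}

/* Return the next 32-bit value and advance the caller's state. */
static unsigned long prng32_next(struct prng32 *g)
{
    unsigned long x = g->x;

    x ^= (x << 13) & 0xFFFFFFFFul;
    x ^= x >> 17;
    x ^= (x << 5) & 0xFFFFFFFFul;
    g->x = x;
    return x;
}
```

Each struct prng32 is independent, so two generators seeded with the same value produce the same sequence without disturbing each other.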
however that doesn't seem to work for me getpid hasn't been declared.
That's because you need to include the headers for getpid():
#include <sys/types.h>
#include <unistd.h>
Another option is to use time() to seed (instead of getpid()):
srand((unsigned int)time(NULL));
As the other answer pointed out, you need to include the unistd.h header. If you don't want to do that, then put the declaration of getpid() above main(). Read the manual page of getpid() here: http://man7.org/linux/man-pages/man2/getpid.2.html
One approach may be
#include <stdio.h>
#include <stdlib.h>
pid_t getpid(void); /* declaration of getpid(), if you don't want to include the header; note that pid_t itself is normally declared in <sys/types.h> */
int main(void) {
/* .. some code .. */
return 0;
}
Or you can use time() like
srand((unsigned int)time(NULL));

Fisher Yates algorithm gives back same order of numbers in parallel started programs when seeded over the system time

I start several C / C++ programs in parallel, which rely on random numbers. Being fairly new to this topic, I heard that the generator should be seeded from the time.
Furthermore, I use the Fisher-Yates algorithm to get a list of unique, randomly shuffled values. However, starting the program twice in parallel gives back the same results for both lists.
How can I fix this? Can I use a different, but still reliable seed?
My simple test code for this looks like this:
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>
static int rand_int(int n) {
int limit = RAND_MAX - RAND_MAX % n;
int rnd;
do {
rnd = rand();
}
while (rnd >= limit);
return rnd % n;
}
void shuffle(int *array, int n) {
int i, j, tmp;
for (i = n - 1; i > 0; i--) {
j = rand_int(i + 1);
tmp = array[j];
array[j] = array[i];
array[i] = tmp;
}
}
int main(int argc,char* argv[]){
srand(time(NULL));
int x = 100;
int randvals[100];
for(int i =0; i < x;i++)
randvals[i] = i;
shuffle(randvals,x);
for(int i=0;i < x;i++)
printf("%d %d \n",i,randvals[i]);
}
I used the implementation for the fisher yates algorithm from here:
http://www.sanfoundry.com/c-program-implement-fisher-yates-algorithm-array-shuffling/
I started the programs in parallel like this:
./randomprogram >> a.txt & ./randomprogram >> b.txt
and then compared both text files, which had the same content.
The end application is for data augmentation in the deep learning field. The machine runs Ubuntu 16.04 with C++11.
You're getting the same results due to how you're seeding the RNG:
srand(time(NULL));
The time function returns the time in seconds since the epoch. If two instances of the program start during the same second (which is likely if you start them in quick succession) then both will use the same seed and get the same set of random values.
You need to add more entropy to your seed. A simple way of doing this is to bitwise-XOR the process ID with the time:
srand(time(NULL) ^ getpid());
As I mentioned in a comment, I like to use a Xorshift* pseudo-random number generator, seeded from /dev/urandom if present, otherwise using POSIX.1 clock_gettime() and getpid() to seed the generator.
It is good enough for most statistical work, but obviously not for any kind of security or cryptographic purposes.
Consider the following xorshift64.h inline implementation:
#ifndef XORSHIFT64_H
#define XORSHIFT64_H
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdint.h>
#include <time.h>
#ifndef SEED_SOURCE
#define SEED_SOURCE "/dev/urandom"
#endif
typedef struct {
uint64_t state[1];
} prng_state;
/* Mixes state by generating 'rounds' pseudorandom numbers,
but does not store them anywhere. This is often done
to ensure a well-mixed state after seeding the generator.
*/
static inline void prng_skip(prng_state *prng, size_t rounds)
{
uint64_t state = prng->state[0];
while (rounds-->0) {
state ^= state >> 12;
state ^= state << 25;
state ^= state >> 27;
}
prng->state[0] = state;
}
/* Returns a uniform pseudorandom number between 0 and 2**64-1, inclusive.
*/
static inline uint64_t prng_u64(prng_state *prng)
{
uint64_t state = prng->state[0];
state ^= state >> 12;
state ^= state << 25;
state ^= state >> 27;
prng->state[0] = state;
return state * UINT64_C(2685821657736338717);
}
/* Returns a uniform pseudorandom number [0, 1), excluding 1.
This carefully avoids the (2**64-1)/2**64 bias on 0,
but assumes that the double type has at most 63 bits of
precision in the mantissa.
*/
static inline double prng_one(prng_state *prng)
{
uint64_t u;
double d;
do {
do {
u = prng_u64(prng);
} while (!u);
d = (double)(u - 1u) / 18446744073709551616.0;
} while (d == 1.0);
return d;
}
/* Returns a uniform pseudorandom number (-1, 1), excluding -1 and +1.
This carefully avoids the (2**64-1)/2**64 bias on 0,
but assumes that the double type has at most 63 bits of
precision in the mantissa.
*/
static inline double prng_delta(prng_state *prng)
{
uint64_t u;
double d;
do {
do {
u = prng_u64(prng);
} while (!u);
d = ((double)(u - 1u) - 9223372036854775808.0) / 9223372036854775808.0;
} while (d == -1.0 || d == 1.0);
return d;
}
/* Returns a uniform pseudorandom integer between min and max, inclusive.
Uses the exclusion method to ensure uniform distribution.
*/
static inline uint64_t prng_range(prng_state *prng, const uint64_t min, const uint64_t max)
{
if (min != max) {
const uint64_t basis = (min < max) ? min : max;
const uint64_t range = (min < max) ? max-min : min-max;
uint64_t mask = range;
uint64_t u;
/* In mask, all bits up to the highest bit set in range must be set. */
mask |= mask >> 1;
mask |= mask >> 2;
mask |= mask >> 4;
mask |= mask >> 8;
mask |= mask >> 16;
mask |= mask >> 32;
/* In all cases, range <= mask < 2*range, so at worst case,
(mask = 2*range-1), this excludes at most 50% of generated values,
on average. */
do {
u = prng_u64(prng) & mask;
} while (u > range);
return u + basis;
} else
return min;
}
static inline void prng_seed(prng_state *prng)
{
#if _POSIX_TIMERS-0 > 0
struct timespec now;
#endif
FILE *src;
/* Try /dev/urandom. */
src = fopen(SEED_SOURCE, "r");
if (src) {
int tries = 16;
while (tries-->0) {
if (fread(prng->state, sizeof prng->state, 1, src) != 1)
break;
if (prng->state[0]) {
fclose(src);
return;
}
}
fclose(src);
}
#if _POSIX_TIMERS-0 > 0
#if _POSIX_MONOTONIC_CLOCK-0 > 0
if (clock_gettime(CLOCK_MONOTONIC, &now) == 0) {
prng->state[0] = (uint64_t)((uint64_t)now.tv_sec * UINT64_C(60834327289))
^ (uint64_t)((uint64_t)now.tv_nsec * UINT64_C(34958268769))
^ (uint64_t)((uint64_t)getpid() * UINT64_C(2772668794075091))
^ (uint64_t)((uint64_t)getppid() * UINT64_C(19455108437));
if (prng->state[0])
return;
} else
#endif
if (clock_gettime(CLOCK_REALTIME, &now) == 0) {
prng->state[0] = (uint64_t)((uint64_t)now.tv_sec * UINT64_C(60834327289))
^ (uint64_t)((uint64_t)now.tv_nsec * UINT64_C(34958268769))
^ (uint64_t)((uint64_t)getpid() * UINT64_C(2772668794075091))
^ (uint64_t)((uint64_t)getppid() * UINT64_C(19455108437));
if (prng->state[0])
return;
}
#endif
prng->state[0] = (uint64_t)((uint64_t)time(NULL) * UINT64_C(60834327289))
^ (uint64_t)((uint64_t)clock() * UINT64_C(34958268769))
^ (uint64_t)((uint64_t)getpid() * UINT64_C(2772668794075091))
^ (uint64_t)((uint64_t)getppid() * UINT64_C(19455108437));
if (!prng->state[0])
prng->state[0] = (uint64_t)UINT64_C(16233055073);
}
#endif /* XORSHIFT64_H */
If it can seed the state from SEED_SOURCE, it is used as-is. Otherwise, if POSIX.1 clock_gettime() is available, it is used (CLOCK_MONOTONIC, if possible; otherwise CLOCK_REALTIME). Otherwise, time (time(NULL)), CPU time spent thus far (clock()), process ID (getpid()), and parent process ID (getppid()) are used to seed the state.
If you wanted the above to also run on Windows, you'd need to add a few #ifndef _WIN32 guards, and either omit the process ID parts, or replace them with something else. (I don't use Windows myself, and cannot test such code, so I omitted such from above.)
The idea is that you can include the above file, and implement other pseudo-random number generators in the same format, and choose between them by simply including different files. (You can include multiple files, but you'll need to do some ugly #define prng_state prng_somename_state, #include "somename.h", #undef prng_state hacking to ensure unique names for each.)
Here is an example of how to use the above:
#include <stdlib.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include "xorshift64.h"
int main(void)
{
prng_state prng1, prng2;
prng_seed(&prng1);
prng_seed(&prng2);
printf("Seed 1 = 0x%016" PRIx64 "\n", prng1.state[0]);
printf("Seed 2 = 0x%016" PRIx64 "\n", prng2.state[0]);
printf("After skipping 16 rounds:\n");
prng_skip(&prng1, 16);
prng_skip(&prng2, 16);
printf("Seed 1 = 0x%016" PRIx64 "\n", prng1.state[0]);
printf("Seed 2 = 0x%016" PRIx64 "\n", prng2.state[0]);
return EXIT_SUCCESS;
}
Obviously, initializing two PRNGs like this is problematic in the fallback case, because it basically relies on clock() yielding different values for consecutive calls (so expects each call to take at least 1 millisecond of CPU time).
However, even a small change in the seeds thus generated is sufficient to yield very different sequences. I like to generate and discard (skip) a number of initial values to ensure the generator state is well mixed:
Seed 1 = 0x8a62585b6e71f915
Seed 2 = 0x8a6259a84464e15f
After skipping 16 rounds:
Seed 1 = 0x9895f664c83ad25e
Seed 2 = 0xa3fd7359dd150e83
The header also implements 0 <= prng_u64() < 2**64, 0 <= prng_one() < 1, -1 < prng_delta() < +1, and min <= prng_range(,min,max) <= max, which should be uniform.
I use the above Xorshift64* variant for tasks where a lot of quite uniform pseudorandom numbers are needed, so the functions also tend to use the faster methods (like max. 50% average exclusion rate rather than 64-bit modulus operation, and so on) (of those that I know of).
Additionally, if you require repeatability, you can simply save a randomly-seeded prng_state structure (a single uint64_t), and load it later, to reproduce the exact same sequence. Just remember to do the skipping (generate-and-discard) only after randomly seeding, not after loading a saved seed from a file.
Converting rather copious comments into an answer.
If two programs are started in the same second, they'll both have the same sequence of random numbers.
Consider whether you need to use a better random number generator than the rand()/srand() duo — that is usually only barely random (better than nothing, but not by a large margin). Do NOT use them for cryptography.
I asked about platform; you responded Ubuntu 16.04 LTS.
Use /dev/urandom or /dev/random to get some random bytes for the seed.
On many Unix-like platforms, there's a device /dev/random — on Linux, there's also a slightly lower-quality device /dev/urandom which won't block whereas /dev/random might. Systems such as macOS (BSD) have /dev/urandom as a synonym for /dev/random for Linux compatibility. You can open it and read 4 bytes (or the relevant number of bytes) of random data, and use that as a seed for the PRNG of your choice.
I often use the drand48() set of functions because they are in POSIX and were in System V Unix. They're usually adequate for my needs.
Look at the manuals across platforms; there are often other random number generators. C++11 provides high-quality PRNG — the header <random> has a number of different ones, such as the MT 19937 (Mersenne Twister). MacOS Sierra (BSD) has random(3) and arc4random(3) as alternatives to rand() – as well as drand48() et al.
Another possibility on Linux is simply to keep a connection to /dev/urandom open, reading more bytes when you need them. However, that gives up any chance of replaying a random sequence. The PRNG systems have the merit of allowing you to replay the same sequence again by recording and setting the random seed that you use. By default, grab a seed from /dev/urandom, but if the user requests it, take a seed from the command line, and report the seed used (at least on request).
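The "seed from /dev/urandom unless the user supplies one" policy could be sketched as follows. The function name choose_seed and its return convention are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pick a seed: parse arg if the user gave one, otherwise read one from
 * /dev/urandom. Returns 0 on success and -1 on failure. */
static int choose_seed(const char *arg, unsigned int *seed)
{
    if (arg) {
        char *end;
        unsigned long v = strtoul(arg, &end, 0);

        if (end == arg || *end != '\0')
            return -1;              /* not a number */
        *seed = (unsigned int)v;
        return 0;
    } else {
        FILE *in = fopen("/dev/urandom", "rb");
        size_t n;

        if (!in)
            return -1;
        n = fread(seed, sizeof *seed, 1, in);
        fclose(in);
        return (n == 1) ? 0 : -1;
    }
}
```

A main() would call choose_seed(argc > 1 ? argv[1] : NULL, &seed), print the seed to stderr so that the run can be replayed later, and then pass it to srand() or another PRNG.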

What's a good way to implement simple clone() based multithread library?

I'm trying to build a simple multithreading library on Linux using clone() and other kernel facilities. I've come to a point where I'm not really sure what the correct way to do things is. I tried going through the original NPTL code, but it's a bit too much.
That's how for instance I imagine the create method:
#define FIBER_STACK (1024*64) /* stack size in bytes, matching the malloc() below */
typedef int sk_thr_id;
typedef void *sk_thr_arg;
typedef int (*sk_thr_func)(sk_thr_arg);
sk_thr_id sk_thr_create(sk_thr_func f, sk_thr_arg a){
void* stack;
stack = malloc( 1024*64 );
if ( stack == 0 ){
perror( "malloc: could not allocate stack" );
exit( 1 );
}
return ( clone(f, (char*) stack + FIBER_STACK, SIGCHLD | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_VM, a ) );
}
1: I'm not really sure what the correct clone() flags should be. I just found these being used in a simple example. Any general directions here will be welcome.
Here are parts of the mutex primitives created using futexes (not my own code for now):
#define cmpxchg(P, O, N) __sync_val_compare_and_swap((P), (O), (N))
#define cpu_relax() asm volatile("pause\n": : :"memory")
#define barrier() asm volatile("": : :"memory")
static inline unsigned xchg_32(void *ptr, unsigned x)
{
__asm__ __volatile__("xchgl %0,%1"
:"=r" ((unsigned) x)
:"m" (*(volatile unsigned *)ptr), "0" (x)
:"memory");
return x;
}
static inline unsigned short xchg_8(void *ptr, char x)
{
__asm__ __volatile__("xchgb %0,%1"
:"=r" ((char) x)
:"m" (*(volatile char *)ptr), "0" (x)
:"memory");
return x;
}
int sys_futex(void *addr1, int op, int val1, struct timespec *timeout, void *addr2, int val3)
{
return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3);
}
typedef union mutex mutex;
union mutex
{
unsigned u;
struct
{
unsigned char locked;
unsigned char contended;
} b;
};
int mutex_init(mutex *m, const pthread_mutexattr_t *a)
{
(void) a;
m->u = 0;
return 0;
}
int mutex_lock(mutex *m)
{
int i;
/* Try to grab lock */
for (i = 0; i < 100; i++)
{
if (!xchg_8(&m->b.locked, 1)) return 0;
cpu_relax();
}
/* Have to sleep */
while (xchg_32(&m->u, 257) & 1)
{
sys_futex(m, FUTEX_WAIT_PRIVATE, 257, NULL, NULL, 0);
}
return 0;
}
int mutex_unlock(mutex *m)
{
int i;
/* Locked and not contended */
if ((m->u == 1) && (cmpxchg(&m->u, 1, 0) == 1)) return 0;
/* Unlock */
m->b.locked = 0;
barrier();
/* Spin and hope someone takes the lock */
for (i = 0; i < 200; i++)
{
if (m->b.locked) return 0;
cpu_relax();
}
/* We need to wake someone up */
m->b.contended = 0;
sys_futex(m, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
return 0;
}
2: The main question for me is how to implement the "join" primitive. I know it's supposed to be based on futexes too, but for now I'm struggling to come up with something.
3: I need some way to clean up stuff (like the allocated stack) after a thread has finished. I can't really think of a good way to do this either.
For these I'll probably need an additional structure in user space for every thread, with some information saved in it. Can someone point me in a good direction for solving these issues?
4: I'll want to have a way to tell how much time a thread has been running, how long it's been since it was last scheduled, and other stuff like that. Are there kernel calls providing such info?
Thanks in advance!
The idea that there can exist a "multithreading library" as a third-party library separate from the rest of the standard library is an outdated and flawed notion. If you want to do this, you'll have to first drop all use of the standard library; particularly, your call to malloc is completely unsafe if you're calling clone yourself, because:
malloc will have no idea that multiple threads exist, and therefore may fail to perform proper synchronization.
Even if it knew they existed, malloc will need to access an unspecified, implementation-specific structure located at the address given by the thread pointer. As this structure is implementation-specific, you have no way of creating such a structure that will be interpreted correctly by both the current and all future versions of your system's libc.
These issues don't apply just to malloc but to most of the standard library; even async-signal-safe functions may be unsafe to use, as they might dereference the thread pointer for cancellation-related purposes, performing optimal syscall mechanisms, etc.
If you really insist on making your own threads implementation, you'll have to abstain from using glibc or any modern libc that's integrated with threads, and instead opt for something much more naive like klibc. This could be an educational experiment, but it would not be appropriate for a deployed application.
1) You are using an example from LinuxThreads. I will not rewrite good references here, but I recommend "The Linux Programming Interface" by Michael Kerrisk, chapter 28. It explains what you need in 25 pages.
2) If you set the CLONE_CHILD_CLEARTID flag, the thread ID stored at the ctid argument of clone is cleared when the child terminates, and the kernel performs a futex wake on that address. If you treat that pointer as a futex, you can implement the join primitive. Good luck :-) If you don't want to use futexes, also have a look at wait3 and wait4.
3) I do not know what you want to clean up, but you can use the tls argument of clone. This is a thread-local storage buffer. When the thread has finished, you can clean up that buffer.
4) See getrusage.
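As a rough sketch of point 2, the waiting side of a join could look like the function below. It assumes Linux, and that the address passed in was given to clone() as the ctid argument together with CLONE_CHILD_CLEARTID (and, so that it holds the thread ID before clone() returns, also as ptid with CLONE_PARENT_SETTID). The function name is hypothetical, and freeing the stack afterwards is left to the caller.

```c
#define _GNU_SOURCE
#include <linux/futex.h>   /* FUTEX_WAIT */
#include <sys/syscall.h>   /* SYS_futex */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Block until the kernel has cleared *ctid, i.e. until the thread exited.
 * Always returns 0. */
static int sk_thr_join_wait(pid_t *ctid)
{
    pid_t tid;

    while ((tid = __atomic_load_n(ctid, __ATOMIC_ACQUIRE)) != 0) {
        /* Sleep only if *ctid still equals tid. Any return (wakeup,
         * EAGAIN, EINTR) just leads to another check of *ctid. */
        syscall(SYS_futex, ctid, FUTEX_WAIT, tid, NULL, NULL, 0);
    }
    return 0;
}
```

The non-private FUTEX_WAIT is used here because the wake the kernel issues at thread exit is, to my knowledge, not a private one.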

Run-time mocking in C?

This has been pending on my list for a long time now. In brief: I need to run mocked_dummy() in place of dummy() AT RUN TIME, without modifying factorial(). I do not care about the entry point of the software. I can add any number of additional functions (but cannot modify code within /*---- do not modify ----*/).
Why do I need this?
To do unit tests of some legacy C modules. I know there are a lot of tools available, but if run-time mocking is possible I can change my UT approach (add reusable components) and make my life easier :).
Platform / Environment?
Linux, ARM, gcc.
Approach that I'm trying with?
I know GDB uses trap/illegal instructions for adding breakpoints (gdb internals).
Make the code self modifiable.
Replace dummy() code segment with illegal instruction, and return as immediate next instruction.
Control transfers to trap handler.
Trap handler is a reusable function that reads from a unix domain socket.
Address of mocked_dummy() function is passed (read from map file).
Mock function executes.
There are problems going forward from here. I also found that the approach is tedious and requires a good amount of coding, some of it in assembly.
I also found that under gcc each function call can be hooked / instrumented, but that is not very useful here, since the function that is intended to be mocked will get executed anyway.
Is there any other approach that I could use?
#include <stdio.h>
#include <stdlib.h>
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummy(void)
{
printf("__%s__()\n",__func__);
}
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
int main(int argc, char * argv[])
{
int (*fp)(int) = (int (*)(int))strtoul(argv[1], NULL, 0); /* address of the function, e.g. read from the map file */
printf("fp = %p\n", (void *)fp);
printf("factorial of 5 is = %d\n",fp(5));
printf("factorial of 5 is = %d\n",factorial(5));
return 1;
}
test-dept is a relatively recent C unit testing framework that allows you to do runtime stubbing of functions. I found it very easy to use - here's an example from their docs:
void test_stringify_cannot_malloc_returns_sane_result() {
replace_function(&malloc, &always_failing_malloc);
char *h = stringify('h');
assert_string_equals("cannot_stringify", h);
}
Although the downloads section is a little out of date, it seems fairly actively developed - the author fixed an issue I had very promptly. You can get the latest version (which I've been using without issues) with:
svn checkout http://test-dept.googlecode.com/svn/trunk/ test-dept-read-only
the version there was last updated in Oct 2011.
However, since the stubbing is achieved using assembler, it may need some effort to get it to support ARM.
This is a question I've been trying to answer myself. I also have the requirement that the mocking method/tools be written in the same language as my application. Unfortunately this cannot be done in C in a portable way, so I've resorted to what you might call a trampoline or detour. This falls under the "Make the code self modifiable." approach you mentioned above. This is where we change the actual bytes of a function at run time to jump to our mock function.
#include <stdio.h>
#include <stdlib.h>
// Additional headers
#include <stdint.h> // for uint32_t
#include <sys/mman.h> // for mprotect
#include <errno.h> // for errno
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummy(void)
{
printf("__%s__()\n",__func__);
}
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
typedef void (*dummy_fun)(void);
void set_run_mock()
{
dummy_fun run_ptr, mock_ptr;
uint32_t off;
unsigned char * ptr, * pg;
run_ptr = dummy;
mock_ptr = mocked_dummy;
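/* rel32 is measured from the end of the 5-byte JMP; unsigned arithmetic wraps correctly for backward jumps */
off = (uint32_t)((uintptr_t)mock_ptr - (uintptr_t)run_ptr - 5);
ptr = (unsigned char *)run_ptr;
pg = (unsigned char *)(ptr - ((size_t)ptr % 4096));
/* cover all 5 patched bytes, even if they cross into the next page */
if (mprotect(pg, (size_t)(ptr - pg) + 5, PROT_READ | PROT_WRITE | PROT_EXEC)) {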
perror("Couldn't mprotect");
exit(errno);
}
ptr[0] = 0xE9; //x86 JMP rel32
ptr[1] = off & 0x000000FF;
ptr[2] = (off & 0x0000FF00) >> 8;
ptr[3] = (off & 0x00FF0000) >> 16;
ptr[4] = (off & 0xFF000000) >> 24;
}
int main(int argc, char * argv[])
{
// Run for realz
factorial(5);
// Set jmp
set_run_mock();
// Run the mock dummy
factorial(5);
return 0;
}
Portability explanation...
mprotect() - This changes the memory page access permissions so that we can actually write to memory that holds the function code. This isn't very portable, and in a WINAPI env, you may need to use VirtualProtect() instead.
The memory parameter for mprotect is aligned down to the previous 4k page boundary. The page size can change from system to system; 4k is appropriate for a vanilla Linux kernel on x86.
The method that we use to jmp to the mock function is to actually put down our own opcodes. This is probably the biggest issue with portability, because the opcode I've used will only work on a little-endian x86 (most desktops). So this would need to be updated for each arch you plan to run on (which could be semi-easy to deal with in CPP macros).
The function itself has to be at least five bytes. This is usually the case because every function normally has at least 5 bytes in its prologue and epilogue.
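To avoid hard-coding the 4k page size, the page can be queried at run time. The following is a minimal sketch, assuming a POSIX system; struct page_span, span_for and make_patchable are hypothetical names that do not appear in the code above.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: a page-aligned range covering some bytes. */
struct page_span {
    void  *start;  /* rounded down to a page boundary */
    size_t len;    /* reaches at least to the last requested byte */
};

/* Compute the page-aligned range that covers [addr, addr + nbytes). */
struct page_span span_for(const void *addr, size_t nbytes)
{
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t first = (uintptr_t)addr;
    uintptr_t start = first - (first % page);
    struct page_span s = { (void *)start, (size_t)(first - start) + nbytes };

    return s;
}

/* Make [addr, addr + nbytes) readable, writable and executable.
   Returns 0 on success, -1 on failure (errno is set by mprotect). */
int make_patchable(const void *addr, size_t nbytes)
{
    struct page_span s = span_for(addr, nbytes);

    return mprotect(s.start, s.len, PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

Passing a length that starts at the page boundary and runs to the end of the patched bytes lets mprotect cover a second page when the five bytes happen to straddle a boundary.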
Potential Improvements...
The set_run_mock() call could easily be set up to accept parameters for reuse. Also, you could save the five overwritten bytes from the original function to restore later in the code if you desire.
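A parameterized version with undo support might look something like the sketch below. It is x86-only, like the code above, and it assumes the target bytes have already been made writable with mprotect. The names struct patch, patch_install and patch_remove are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define JMP_LEN 5  /* x86 JMP rel32: opcode 0xE9 plus a 32-bit offset */

/* Hypothetical bookkeeping for one patched site. */
struct patch {
    unsigned char *site;            /* first byte that was overwritten */
    unsigned char  saved[JMP_LEN];  /* original bytes, kept for undo */
};

/* Overwrite the first JMP_LEN bytes at 'site' with a jump to 'dest'.
   The memory at 'site' must already be writable. */
void patch_install(struct patch *p, void *site, const void *dest)
{
    /* rel32 counts from the end of the jump instruction. */
    uint32_t off = (uint32_t)((uintptr_t)dest - (uintptr_t)site - JMP_LEN);

    p->site = site;
    memcpy(p->saved, site, JMP_LEN);
    p->site[0] = 0xE9;
    p->site[1] = (unsigned char)(off & 0xFF);
    p->site[2] = (unsigned char)((off >> 8) & 0xFF);
    p->site[3] = (unsigned char)((off >> 16) & 0xFF);
    p->site[4] = (unsigned char)((off >> 24) & 0xFF);
}

/* Put the original bytes back. */
void patch_remove(const struct patch *p)
{
    memcpy(p->site, p->saved, JMP_LEN);
}
```

With this, a test would call patch_install(&p, (void *)dummy, (const void *)mocked_dummy) before the code under test and patch_remove(&p) afterwards, so later tests see the real dummy() again.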
I'm unable to test, but I've read that on ARM you'd do something similar with the branch instruction. Note that the ARM (A32) branch also encodes an offset, not an absolute address: for an unconditional branch the top byte of the 32-bit instruction word is 0xEA and the remaining 24 bits hold a signed word offset relative to the address of the branch plus 8.
Chenz
An approach that I have used in the past that has worked well is the following.
For each C module, publish an 'interface' that other modules can use. These interfaces are structs that contain function pointers.
struct Module1
{
int (*getTemperature)(void);
int (*setKp)(int Kp);
};
During initialization, each module initializes these function pointers with its implementation functions.
When you write the module tests, you can dynamically change these function pointers to point at mock implementations and, after testing, restore the original implementations.
Example:
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummyFn(void)
{
printf("__%s__()\n",__func__);
}
static void (*dummy)(void) = dummyFn;
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
int main(int argc, char * argv[])
{
void (*oldDummy)(void) = dummy;
/* with the original dummy function */
printf("factorial of 5 is = %d\n",factorial(5));
/* with the mocked dummy */
oldDummy = dummy; /* save the old dummy */
dummy = mocked_dummy; /* put in the mocked dummy */
printf("factorial of 5 is = %d\n",factorial(5));
dummy = oldDummy; /* restore the old dummy */
return 1;
}
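The struct-based interface described at the start of this answer could be sketched like this. Everything apart from struct Module1 and its two members is hypothetical (the real* functions, module1_init, isOverheated and the mock), and the temperature values are arbitrary.

```c
/* Interface published by a hypothetical temperature module. */
struct Module1 {
    int (*getTemperature)(void);
    int (*setKp)(int Kp);
};

/* Real implementations, private to the module. */
static int kp;
static int realGetTemperature(void) { return 25; }  /* stand-in for reading a sensor */
static int realSetKp(int Kp) { kp = Kp; return 0; }

/* The one instance that other modules call through. */
struct Module1 module1;

/* Called once during initialization to wire in the real functions. */
void module1_init(void)
{
    module1.getTemperature = realGetTemperature;
    module1.setKp = realSetKp;
}

/* Client code only ever calls through the interface. */
int isOverheated(void)
{
    return module1.getTemperature() > 100;
}

/* Test-side mock that simulates a hot sensor. */
int mockHotTemperature(void)
{
    return 150;
}
```

A test saves module1.getTemperature, assigns mockHotTemperature, exercises isOverheated(), and then restores the saved pointer, exactly as the example above does with dummy.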
You can replace any function that is resolved by the dynamic linker by using LD_PRELOAD. You create a shared library that defines the replacement and have it loaded through LD_PRELOAD. This is a standard technique, used for example to turn programs without SOCKS support into SOCKS-aware programs. Here is a tutorial which explains it.
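As a rough illustration, a preload library that pins the libc time() function to a fixed value could look like this. The file name fake_time.c, the library name libfixedtime.so and the constant are made up for the example.

```c
/* fake_time.c: a hypothetical interposer library. */
#include <time.h>

#define FAKE_NOW ((time_t)1000000000)  /* arbitrary fixed timestamp */

/* Same signature as the libc function it replaces. */
time_t time(time_t *out)
{
    if (out != NULL)
        *out = FAKE_NOW;
    return FAKE_NOW;
}
```

It would be built with something like gcc -shared -fPIC -o libfixedtime.so fake_time.c and used as LD_PRELOAD=./libfixedtime.so ./your_program. Note that this only affects calls that go through the dynamic linker. A function such as dummy() that is defined inside the executable itself is not replaced this way, because the executable's own definition takes precedence over the preloaded library.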

Resources