Generate uniformly distributed random number in Nginx? - c

Nginx seems to have a built-in function called ngx_random that's used in various places of the source code. But it seems just to be defined as:#define ngx_random random
If I'm understanding this correctly, that means all the places Nginx calling ngx_random() it's just calling (on Linux platform) random(). From the doc it isn't clear to me that it is guaranteed uniform distribution with a given range in any way, and I'm suspecting that similar to rand(), it's not uniform at all, and will only be uniform if range n is divisible by RAND_MAX.
But the nice thing of using ngx_random is I believe the system takes care of the seeding automatically, during startup time. Whereas if I want to use something truly uniform with my range, like drand48, I believe will have to add a new line after the following in ngx_posix_init.c?
srandom(((unsigned) ngx_pid << 16) ^ tp->sec ^ tp->msec);
srand48(((unsigned) ngx_pid << 16) ^ tp->sec ^ tp->msec); //Added so that I can use drand48
So is my assumption on ngx_random correct? And if I want to use drand48 in any place of the various modules, is the above the only way doing it?

I never tried it myself with nginx, so consider it just an idea. On Linux (or similar ELF based systems, e.g. Solaris) you could, using LD_LIBRARY_PRELOAD trick, to replace and/or intercept weak symbols from standard C library. It is often used to intercept and/or replace malloc, but might work for you as well
Code sample (untested, not compiled, just to demonstrate an idea)
#define _GNU_SOURCE
#include <stdlib.h>
#include <dlfcn.h>
static void (*real_srandom)(uint32_t) = NULL;
static void srandom_init(void) {
real_srandom = dlsym(RTLD_NEXT, "srandom");
if (NULL == real_srandom) {
fprintf(stderr, "Error in `dlsym`: %s\n", dlerror());
}
}
void srandom(uint32_t seed) {
if(real_srandom == NULL) {
srandom_init();
}
real_srandom(seed);
srand48(seed);
}
You could write SO to replace calls to random(3) as well, replacing it with your own implementation. The only thing you cannot replace is RAND_MAX, as it is compiled in constant.
I would be happy to hear if this trick works for you or not

Related

2D array, prototype function and random numbers [duplicate]

I need a 'good' way to initialize the pseudo-random number generator in C++. I've found an article that states:
In order to generate random-like
numbers, srand is usually initialized
to some distinctive value, like those
related with the execution time. For
example, the value returned by the
function time (declared in header
ctime) is different each second, which
is distinctive enough for most
randoming needs.
Unixtime isn't distinctive enough for my application. What's a better way to initialize this? Bonus points if it's portable, but the code will primarily be running on Linux hosts.
I was thinking of doing some pid/unixtime math to get an int, or possibly reading data from /dev/urandom.
Thanks!
EDIT
Yes, I am actually starting my application multiple times a second and I've run into collisions.
This is what I've used for small command line programs that can be run frequently (multiple times a second):
unsigned long seed = mix(clock(), time(NULL), getpid());
Where mix is:
// Robert Jenkins' 96 bit Mix Function
unsigned long mix(unsigned long a, unsigned long b, unsigned long c)
{
a=a-b; a=a-c; a=a^(c >> 13);
b=b-c; b=b-a; b=b^(a << 8);
c=c-a; c=c-b; c=c^(b >> 13);
a=a-b; a=a-c; a=a^(c >> 12);
b=b-c; b=b-a; b=b^(a << 16);
c=c-a; c=c-b; c=c^(b >> 5);
a=a-b; a=a-c; a=a^(c >> 3);
b=b-c; b=b-a; b=b^(a << 10);
c=c-a; c=c-b; c=c^(b >> 15);
return c;
}
The best answer is to use <random>. If you are using a pre C++11 version, you can look at the Boost random number stuff.
But if we are talking about rand() and srand()
The best simplest way is just to use time():
int main()
{
srand(time(nullptr));
...
}
Be sure to do this at the beginning of your program, and not every time you call rand()!
Side Note:
NOTE: There is a discussion in the comments below about this being insecure (which is true, but ultimately not relevant (read on)). So an alternative is to seed from the random device /dev/random (or some other secure real(er) random number generator). BUT: Don't let this lull you into a false sense of security. This is rand() we are using. Even if you seed it with a brilliantly generated seed it is still predictable (if you have any value you can predict the full sequence of next values). This is only useful for generating "pseudo" random values.
If you want "secure" you should probably be using <random> (Though I would do some more reading on a security informed site). See the answer below as a starting point: https://stackoverflow.com/a/29190957/14065 for a better answer.
Secondary note: Using the random device actually solves the issues with starting multiple copies per second better than my original suggestion below (just not the security issue).
Back to the original story:
Every time you start up, time() will return a unique value (unless you start the application multiple times a second). In 32 bit systems, it will only repeat every 60 years or so.
I know you don't think time is unique enough but I find that hard to believe. But I have been known to be wrong.
If you are starting a lot of copies of your application simultaneously you could use a timer with a finer resolution. But then you run the risk of a shorter time period before the value repeats.
OK, so if you really think you are starting multiple applications a second.
Then use a finer grain on the timer.
int main()
{
struct timeval time;
gettimeofday(&time,NULL);
// microsecond has 1 000 000
// Assuming you did not need quite that accuracy
// Also do not assume the system clock has that accuracy.
srand((time.tv_sec * 1000) + (time.tv_usec / 1000));
// The trouble here is that the seed will repeat every
// 24 days or so.
// If you use 100 (rather than 1000) the seed repeats every 248 days.
// Do not make the MISTAKE of using just the tv_usec
// This will mean your seed repeats every second.
}
if you need a better random number generator, don't use the libc rand. Instead just use something like /dev/random or /dev/urandom directly (read in an int directly from it or something like that).
The only real benefit of the libc rand is that given a seed, it is predictable which helps with debugging.
On windows:
srand(GetTickCount());
provides a better seed than time() since its in milliseconds.
C++11 random_device
If you need reasonable quality then you should not be using rand() in the first place; you should use the <random> library. It provides lots of great functionality like a variety of engines for different quality/size/performance trade-offs, re-entrancy, and pre-defined distributions so you don't end up getting them wrong. It may even provide easy access to non-deterministic random data, (e.g., /dev/random), depending on your implementation.
#include <random>
#include <iostream>
int main() {
std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 eng(seed);
std::uniform_int_distribution<> dist{1,100};
for (int i=0; i<50; ++i)
std::cout << dist(eng) << '\n';
}
eng is a source of randomness, here a built-in implementation of mersenne twister. We seed it using random_device, which in any decent implementation will be a non-determanistic RNG, and seed_seq to combine more than 32-bits of random data. For example in libc++ random_device accesses /dev/urandom by default (though you can give it another file to access instead).
Next we create a distribution such that, given a source of randomness, repeated calls to the distribution will produce a uniform distribution of ints from 1 to 100. Then we proceed to using the distribution repeatedly and printing the results.
Best way is to use another pseudorandom number generator.
Mersenne twister (and Wichmann-Hill) is my recommendation.
http://en.wikipedia.org/wiki/Mersenne_twister
i suggest you see unix_random.c file in mozilla code. ( guess it is mozilla/security/freebl/ ...) it should be in freebl library.
there it uses system call info ( like pwd, netstat ....) to generate noise for the random number;it is written to support most of the platforms (which can gain me bonus point :D ).
The real question you must ask yourself is what randomness quality you need.
libc random is a LCG
The quality of randomness will be low whatever input you provide srand with.
If you simply need to make sure that different instances will have different initializations, you can mix process id (getpid), thread id and a timer. Mix the results with xor. Entropy should be sufficient for most applications.
Example :
struct timeb tp;
ftime(&tp);
srand(static_cast<unsigned int>(getpid()) ^
static_cast<unsigned int>(pthread_self()) ^
static_cast<unsigned int >(tp.millitm));
For better random quality, use /dev/urandom. You can make the above code portable in using boost::thread and boost::date_time.
The c++11 version of the top voted post by Jonathan Wright:
#include <ctime>
#include <random>
#include <thread>
...
const auto time_seed = static_cast<size_t>(std::time(0));
const auto clock_seed = static_cast<size_t>(std::clock());
const size_t pid_seed =
std::hash<std::thread::id>()(std::this_thread::get_id());
std::seed_seq seed_value { time_seed, clock_seed, pid_seed };
...
// E.g seeding an engine with the above seed.
std::mt19937 gen;
gen.seed(seed_value);
#include <stdio.h>
#include <sys/time.h>
main()
{
struct timeval tv;
gettimeofday(&tv,NULL);
printf("%d\n", tv.tv_usec);
return 0;
}
tv.tv_usec is in microseconds. This should be acceptable seed.
As long as your program is only running on Linux (and your program is an ELF executable), you are guaranteed that the kernel provides your process with a unique random seed in the ELF aux vector. The kernel gives you 16 random bytes, different for each process, which you can get with getauxval(AT_RANDOM). To use these for srand, use just an int of them, as such:
#include <sys/auxv.h>
void initrand(void)
{
unsigned int *seed;
seed = (unsigned int *)getauxval(AT_RANDOM);
srand(*seed);
}
It may be possible that this also translates to other ELF-based systems. I'm not sure what aux values are implemented on systems other than Linux.
Suppose you have a function with a signature like:
int foo(char *p);
An excellent source of entropy for a random seed is a hash of the following:
Full result of clock_gettime (seconds and nanoseconds) without throwing away the low bits - they're the most valuable.
The value of p, cast to uintptr_t.
The address of p, cast to uintptr_t.
At least the third, and possibly also the second, derive entropy from the system's ASLR, if available (the initial stack address, and thus current stack address, is somewhat random).
I would also avoid using rand/srand entirely, both for the sake of not touching global state, and so you can have more control over the PRNG that's used. But the above procedure is a good (and fairly portable) way to get some decent entropy without a lot of work, regardless of what PRNG you use.
For those using Visual Studio here's yet another way:
#include "stdafx.h"
#include <time.h>
#include <windows.h>
const __int64 DELTA_EPOCH_IN_MICROSECS= 11644473600000000;
struct timezone2
{
__int32 tz_minuteswest; /* minutes W of Greenwich */
bool tz_dsttime; /* type of dst correction */
};
struct timeval2 {
__int32 tv_sec; /* seconds */
__int32 tv_usec; /* microseconds */
};
int gettimeofday(struct timeval2 *tv/*in*/, struct timezone2 *tz/*in*/)
{
FILETIME ft;
__int64 tmpres = 0;
TIME_ZONE_INFORMATION tz_winapi;
int rez = 0;
ZeroMemory(&ft, sizeof(ft));
ZeroMemory(&tz_winapi, sizeof(tz_winapi));
GetSystemTimeAsFileTime(&ft);
tmpres = ft.dwHighDateTime;
tmpres <<= 32;
tmpres |= ft.dwLowDateTime;
/*converting file time to unix epoch*/
tmpres /= 10; /*convert into microseconds*/
tmpres -= DELTA_EPOCH_IN_MICROSECS;
tv->tv_sec = (__int32)(tmpres * 0.000001);
tv->tv_usec = (tmpres % 1000000);
//_tzset(),don't work properly, so we use GetTimeZoneInformation
rez = GetTimeZoneInformation(&tz_winapi);
tz->tz_dsttime = (rez == 2) ? true : false;
tz->tz_minuteswest = tz_winapi.Bias + ((rez == 2) ? tz_winapi.DaylightBias : 0);
return 0;
}
int main(int argc, char** argv) {
struct timeval2 tv;
struct timezone2 tz;
ZeroMemory(&tv, sizeof(tv));
ZeroMemory(&tz, sizeof(tz));
gettimeofday(&tv, &tz);
unsigned long seed = tv.tv_sec ^ (tv.tv_usec << 12);
srand(seed);
}
Maybe a bit overkill but works well for quick intervals. gettimeofday function found here.
Edit: upon further investigation rand_s might be a good alternative for Visual Studio, it's not just a safe rand(), it's totally different and doesn't use the seed from srand. I had presumed it was almost identical to rand just "safer".
To use rand_s just don't forget to #define _CRT_RAND_S before stdlib.h is included.
Assuming that the randomness of srand() + rand() is enough for your purposes, the trick is in selecting the best seed for srand. time(NULL) is a good starting point, but you'll run into problems if you start more than one instance of the program within the same second. Adding the pid (process id) is an improvement as different instances will get different pids. I would multiply the pid by a factor to spread them more.
But let's say you are using this for some embedded device and you have several in the same network. If they are all powered at once and you are launching the several instances of your program automatically at boot time, they may still get the same time and pid and all the devices will generate the same sequence of "random" numbers. In that case, you may want to add some unique identifier of each device (like the CPU serial number).
The proposed initialization would then be:
srand(time(NULL) + 1000 * getpid() + (uint) getCpuSerialNumber());
In a Linux machine (at least in the Raspberry Pi where I tested this), you can implement the following function to get the CPU Serial Number:
// Gets the CPU Serial Number as a 64 bit unsigned int. Returns 0 if not found.
uint64_t getCpuSerialNumber() {
FILE *f = fopen("/proc/cpuinfo", "r");
if (!f) {
return 0;
}
char line[256];
uint64_t serial = 0;
while (fgets(line, 256, f)) {
if (strncmp(line, "Serial", 6) == 0) {
serial = strtoull(strchr(line, ':') + 2, NULL, 16);
}
}
fclose(f);
return serial;
}
Include the header at the top of your program, and write:
srand(time(NULL));
In your program before you declare your random number. Here is an example of a program that prints a random number between one and ten:
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
//Initialize srand
srand(time(NULL));
//Create random number
int n = rand() % 10 + 1;
//Print the number
cout << n << endl; //End the line
//The main function is an int, so it must return a value
return 0;
}

c rand() function and ISAAC random number generator

I wrote a program in C that uses a number of different random number generators and one of them is ISAAC (available at http://burtleburtle.net/bob/rand/isaacafa.html). It works well but the problem is that in rand.h rand() is redefined as a macro. In my program I want to use the standard C rand() function as well. I tried changing the name of the macro to rand12() but I cannot see any other place in ISAAC that the macro is called so this doesn't work.
Could you offer some ideas how I can keep the standard rand() function and use ISAAC as well?
Given that the header rand.h contains:
#ifndef STANDARD
#include "standard.h"
#endif
#ifndef RAND
#define RAND
#define RANDSIZL (8)
#define RANDSIZ (1<<RANDSIZL)
/* context of random number generator */
struct randctx
{
ub4 randcnt;
ub4 randrsl[RANDSIZ];
ub4 randmem[RANDSIZ];
ub4 randa;
ub4 randb;
ub4 randc;
};
typedef struct randctx randctx;
/* If (flag==TRUE), then use the contents of randrsl[0..RANDSIZ-1] as the seed. */
void randinit(/*_ randctx *r, word flag _*/);
void isaac(/*_ randctx *r _*/);
/* Call rand(/o_ randctx *r _o/) to retrieve a single 32-bit random value */
#define rand(r) \
(!(r)->randcnt-- ? \
(isaac(r), (r)->randcnt=RANDSIZ-1, (r)->randrsl[(r)->randcnt]) : \
(r)->randrsl[(r)->randcnt])
#endif /* RAND */
You are going to need to do some work to the code to be able to use it alongside rand() from <stdlib.h>. The interface to the ISAAC rand() is different from the interface to rand() from <stdlib.h> too.
Create yourself a new header, "isaac.h", which defines cover functions to handle the peculiarities of the ISAAC system.
Maybe, if you aren't going to be working in a threaded context
#ifndef ISAAC_H_INCLUDED
#define ISAAC_H_INCLUDED
extern void isaac_init(unsigned long seed);
extern int isaac_rand(void);
#endif
You then implement those functions in isaac.c such that they call down onto the functions defined in rand.h, and isaac_rand() contains an invocation of the rand() macro from rand.h (providing a context from somewhere, which is where the non-threaded part comes in). You can decide what to do with the seed, or whether to change the seeding mechanism.
You can then use the isaac_init() and isaac_rand() functions in your code, as well as the normal rand() and srand().
I'd also upgrade the code in rand.h to provide full prototypes for the functions in the package. The commented prototypes is a legacy from when it was first written, back in the mid-90s, when standard C compilers were not universally accessible. The earliest date in the header is 1996; that's just on the cusp of when standard C compilers became almost universally available.
I note that the comments in the header (removed above) say the code is in the public domain; that means it is 100% legitimate to make any modifications you need.
isaac.c
#include "isaac.h"
#include "rand.h"
static randctx control;
void isaac_init(unsigned long seed)
{
assert(seed != 0);
randinit(&control, FALSE);
}
int isaac_rand(void)
{
return rand(&control);
}
This implementation ignores the seed you give, mainly because the structure expects eight 32-bit numbers to seed the randrsl member of the context structure (the one I called control). You could do something like use the seed value 8 times in a row instead of completely ignoring it, or add some number to it each time, or whatever other more complex seeding technique. You should seriously look at using /dev/urandom as a source of the seed:
#define DEV_URANDOM "/dev/urandom"
int ur = open(DEV_URANDOM, O_RDONLY);
if (ur >= 0)
{
read(ur, control.randrsl, sizeof(control.randrsl));
close(ur);
}
You'd put this code into isaac_init() before the call to randinit(), and you'd change the FALSE to TRUE. You'd probably also lose the seed argument to the isaac_init() function.
This leaves you with a problem of tracking the random seed to gain reproducibility (which can be important when debugging). That's for you to resolve, though — there are multiple ways to do that. You might have two initialize functions: void isaac_init(void) and void isaac_rsl(unsigned int *rsl) which takes an array of 8 unsigned int (or ub4) values and uses that as the seed instead of the output of /dev/urandom. Or you could pass a null pointer to mean "use output from /dev/urandom" and a non-null pointer to mean "use the values I've provided". Etc.

How can I implement a portable pointer compare and swap?

I have found this code for compareAndSwap in a StackOverflow answer:
boolean CompareAndSwapPointer(volatile * void * ptr,
void * new_value,
void * old_value) {
#if defined(_MSC_VER)
if (InterlockedCompareExchange(ptr, new_value, old_value) == old_value) return false;
else return true;
#elif (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__) > 40100
return __sync_bool_compare_and_swap(ptr, old_value, new_value);
#else
# error No implementation
#endif
}
Is this the most proper way of having portable fast code, (Except assembly inlining).
Also, one problem is that those specific builtin methods have different parameters and return values from one compiler to another, which may require some additional changes like the if then else in this example.
Also another problem would be the behavior of these builtin methods in the machine code level, do they behave exactly the same ? (e.g use the same assembly instructions)
Note: Another problem would be if there is many supported platforms not just (Windows and Linux) as in this example. The code might get very big.
I would use a Hardware Abstraction Layer, (HAL) that allows generic code to be common - and any portable source can be included and build for each platform.
In my opinion, this allows for better structured and more readable source.
To allow you to better understand this process I would suggest Google for finding examples and explanations.
Hopefully this brief answer helps.
[EDIT] I will attempt a simple example for Bionix, to show how to implement a HAL system...
Mr A wants his application to run on his 'Tianhe-2' and also his 'Amiga 500'. He has the cross compilers etc and will build both binaries on his PC. He want to read keys and print to the screen.
mrAMainApplication.c contains the following...
#include "hal.h"
// This gets called every time around the main loop ...
void mainProcessLoop( void )
{
unsigned char key = 0;
// scan key ...
key = hal_ReadKey();
if ( key != 0 )
{
hal_PrintChar( key );
}
}
He then creates a header file (Remember - this is an example, not working code! )...
He creates hal.h ...
#ifndef _HAL_H_
#define _HAL_H_
unsigned char hal_ReadKey( void );
unsigned char hal_PrintChar( unsigned char pKey );
#endif // _HAL_H_
Now Mr A needs two separate source files, one for his 'Tianhe-2' system and another for his Amiga 500...
hal_A500.c
void hal_ReadKey( void )
{
// Amiga related code for reading KEYBOARD
}
void hal_PrintChar( unsigned char pKey )
{
// Amiga related code for printing to a shell...
}
hal_Tianhe2_VERYFAST.c
void hal_ReadKey( void )
{
// Tianhe-2 related code for reading KEYBOARD
}
void hal_PrintChar( unsigned char pKey )
{
// Tianhe-2 related code for printing to a shell...
}
Mr A then - when building for the Amiga - builds mrAmainApplication.c and hal_A500.c
When building for the Tianhe-2 - he uses hal_Tianhe2_VERYFAST.c instead of hal_A500.c
Right - I've written this example with some humour, this is not ear-marked at anyone, just I feel it makes the example more interesting and hopefully aids in understanding.
Neil
In modern C, starting with C11, use _Atomic for the type qualification and atomic_compare_exchange_weak for the function.
The newer versions of gcc and clang are compliant to C11 and implement these operations in a portable way.
Take a look at ConcurrencyKit and possibly you can use higher level primitives which is probably what most of the time people really want. In contrast to HAL which somewhat OS specific, I believe CK works on Windows and with a number of non-gcc compilers.
But if you are just interested in how to implement "compare-and-swap" or atomic actions portably on a wide variety of C compilers, look and see how that code works. It is all open-source.
I suspect that the details can get messy and they are not something that in general will make for easy or interesting exposition here for the general public.

Simple C Kernel char Pointers Aren't Working

I am trying to make a simple kernel using C. Everything loads and works fine, and I can access the video memory and display characters, but when i try to implement a simple puts function for some reason it doesn't work. I've tried my own code and other's. Also, when I try to use a variable which is declared outside a function it doesn't seem to work. This is my own code:
#define PUTCH(C, X) pos = putc(C, X, pos)
#define PUTSTR(C, X) pos = puts(C, X, pos)
int putc(char c, char color, int spos) {
volatile char *vidmem = (volatile char*)(0xB8000);
if (c == '\n') {
spos += (160-(spos % 160));
} else {
vidmem[spos] = c;
vidmem[spos+1] = color;
spos += 2;
}
return spos;
}
int puts(char* str, char color, int spos) {
while (*str != '\0') {
spos = putc(*str, color, spos);
str++;
}
return spos;
}
int kmain(void) {
int pos = 0;
PUTSTR("Hello, world!", 6);
return 0;
}
The spos (starting position) stuff is because I can't make a global position variable. putc works fine, but puts doesn't. I also tried this:
unsigned int k_printf(char *message, unsigned int line) // the message and then the line #
{
char *vidmem = (char *) 0xb8000;
unsigned int i=0;
i=(line*80*2);
while(*message!=0)
{
if(*message=='\n') // check for a new line
{
line++;
i=(line*80*2);
*message++;
} else {
vidmem[i]=*message;
*message++;
i++;
vidmem[i]=7;
i++;
};
};
return(1);
};
int kmain(void) {
k_printf("Hello, world!", 0);
return 0;
}
Why doesn't this work? I tried using my puts implementation with my native GCC (without the color and spos data and using printf("%c")) and it worked fine.
Since you're having an issue with global variables in general, the problem most likely has to-do with where the linker is placing your "Hello World" string literal in memory. This is due to the fact that string literals are typically stored in a read-only portion of global memory by the linker ... You have not detailed exactly how you are compiling and linking your kernel, so I would attempt something like the following and see if that works:
int kmain(void)
{
char array[] = "Hello World\n";
int pos = 0;
puts(array, 0, pos);
return 0;
}
This will allocate the character array on the stack rather than global memory, and avoid any issues with where the linker decides to place global variables.
In general, when creating a simple kernel, you want to compile and link it as a flat binary with no dependencies on external OS libraries. If you're working with a multiboot compliant boot-loader like GRUB, you may want to look at the bare-bones sample code from the multiboot specification pages.
Since this got references outside of SO, I'll add a universal answer
There are several kernel examples around the internet, and many are in various states of degradation - the Multiboot sample code for instance lacks compilation instructions. If you're looking for a working start, a known good example can be found at http://wiki.osdev.org/Bare_Bones
In the end there are three things that should be properly dealt with:
The bootloader will need to properly load the kernel, and as such they must agree on a certain format. GRUB defines the fairly common standard that is Multiboot, but you can roll your own. It boils down that you need to choose a file format and locations where all the parts of your kernel and related metadata end up in memory before the kernel code will ever get executed. One would typically use the ELF format with multiboot which contains that information in its headers
The compiler must be able to create binary code that is relevant to the platform. A typical PC boots in 16-bit mode after which the BIOS or bootloader might often decide to change it. Typically, if you use GRUB legacy, the Multiboot standard puts you in 32-bit mode by its contract. If you used the default compiler settings on a 64-bit linux, you end up with code for the wrong architecture (which happens to be sufficiently similar that you might get something that looks like the result you want). Compilers also like to rename sections or include platform-specific mechanisms and security features such as stack probing or canaries. Especially compilers on windows tend to inject host-specific code that of course breaks when run without the presence of Windows. The example provided deliberately uses a separate compiler to prevent all sorts of problems in this category.these
The linker must be able to combine the code in ways needed to create output that adheres to the bootloader's contract. A linker has a default way of generating a binary, and typically it's not at all what you want. In pretty much all cases, choosing gnu ld for this task means that you're required to write a linker script that puts all the sections in the places where you want. Omitted sections will result in data going missing, sections at the wrong location might make an image unbootable. Assuming you have gnu ld, you can also use the bundled nm and objdump tools besides your hex editor of choice to tell you where things have appeared in your output binary, and with it, check if you've been following the contract that has been set for you.
Problems of this fundamental type are eventually tracked back to not following one or more of the steps above. Use the reference at the top of this answer and go find the differences.

How to write self modifying code in C?

I want to write a piece of code that changes itself continuously, even if the change is insignificant.
For example maybe something like
for i in 1 to 100, do
begin
x := 200
for j in 200 downto 1, do
begin
do something
end
end
Suppose I want that my code should after first iteration change the line x := 200 to some other line x := 199 and then after next iteration change it to x := 198 and so on.
Is writing such a code possible ? Would I need to use inline assembly for that ?
EDIT :
Here is why I want to do it in C:
This program will be run on an experimental operating system and I can't / don't know how to use programs compiled from other languages. The real reason I need such a code is because this code is being run on a guest operating system on a virtual machine. The hypervisor is a binary translator that is translating chunks of code. The translator does some optimizations. It only translates the chunks of code once. The next time the same chunk is used in the guest, the translator will use the previously translated result. Now, if the code gets modified on the fly, then the translator notices that, and marks its previous translation as stale. Thus forcing a re-translation of the same code. This is what I want to achieve, to force the translator to do many translations. Typically these chunks are instructions between to branch instructions (such as jump instructions). I just think that self modifying code would be fantastic way to achieve this.
You might want to consider writing a virtual machine in C, where you can build your own self-modifying code.
If you wish to write self-modifying executables, much depends on the operating system you are targeting. You might approach your desired solution by modifying the in-memory program image. To do so, you would obtain the in-memory address of your program's code bytes. Then, you might manipulate the operating system protection on this memory range, allowing you to modify the bytes without encountering an Access Violation or '''SIG_SEGV'''. Finally, you would use pointers (perhaps '''unsigned char *''' pointers, possibly '''unsigned long *''' as on RISC machines) to modify the opcodes of the compiled program.
A key point is that you will be modifying machine code of the target architecture. There is no canonical format for C code while it is running -- C is a specification of a textual input file to a compiler.
Sorry, I am answering a bit late, but I think I found exactly what you are looking for : https://shanetully.com/2013/12/writing-a-self-mutating-x86_64-c-program/
In this article, they change the value of a constant by injecting assembly in the stack. Then they execute a shellcode by modifying the memory of a function on the stack.
Below is the first code :
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
void foo(void);
int change_page_permissions_of_address(void *addr);
int main(void) {
void *foo_addr = (void*)foo;
// Change the permissions of the page that contains foo() to read, write, and execute
// This assumes that foo() is fully contained by a single page
if(change_page_permissions_of_address(foo_addr) == -1) {
fprintf(stderr, "Error while changing page permissions of foo(): %s\n", strerror(errno));
return 1;
}
// Call the unmodified foo()
puts("Calling foo...");
foo();
// Change the immediate value in the addl instruction in foo() to 42
unsigned char *instruction = (unsigned char*)foo_addr + 18;
*instruction = 0x2A;
// Call the modified foo()
puts("Calling foo...");
foo();
return 0;
}
void foo(void) {
int i=0;
i++;
printf("i: %d\n", i);
}
int change_page_permissions_of_address(void *addr) {
// Move the pointer to the page boundary
int page_size = getpagesize();
addr -= (unsigned long)addr % page_size;
if(mprotect(addr, page_size, PROT_READ | PROT_WRITE | PROT_EXEC) == -1) {
return -1;
}
return 0;
}
It is possible, but it's most probably not portably possible and you may have to contend with read-only memory segments for the running code and other obstacles put in place by your OS.
This would be a good start. Essentially Lisp functionality in C:
http://nakkaya.com/2010/08/24/a-micro-manual-for-lisp-implemented-in-c/
Depending on how much freedom you need, you may be able to accomplish what you want by using function pointers. Using your pseudocode as a jumping-off point, consider the case where we want to modify that variable x in different ways as the loop index i changes. We could do something like this:
#include <stdio.h>
void multiply_x (int * x, int multiplier)
{
*x *= multiplier;
}
void add_to_x (int * x, int increment)
{
*x += increment;
}
int main (void)
{
int x = 0;
int i;
void (*fp)(int *, int);
for (i = 1; i < 6; ++i) {
fp = (i % 2) ? add_to_x : multiply_x;
fp(&x, i);
printf("%d\n", x);
}
return 0;
}
The output, when we compile and run the program, is:
1
2
5
20
25
Obviously, this will only work if you have finite number of things you want to do with x on each run through. In order to make the changes persistent (which is part of what you want from "self-modification"), you would want to make the function-pointer variable either global or static. I'm not sure I really can recommend this approach, because there are often simpler and clearer ways of accomplishing this sort of thing.
A self-interpreting language (not hard-compiled and linked like C) might be better for that. Perl, javascript, PHP have the evil eval() function that might be suited to your purpose. By it, you could have a string of code that you constantly modify and then execute via eval().
The suggestion about implementing LISP in C and then using that is solid, due to portability concerns. But if you really wanted to, this could also be implemented in the other direction on many systems, by loading your program's bytecode into memory and then returning to it.
There's a couple of ways you could attempt to do that. One way is via a buffer overflow exploit. Another would be to use mprotect() to make the code section writable, and then modify compiler-created functions.
Techniques like this are fun for programming challenges and obfuscated competitions, but given how unreadable your code would be combined with the fact you're exploiting what C considers undefined behavior, they're best avoided in production environments.
In standard C11 (read n1570), you cannot write self modifying code (at least without undefined behavior). Conceptually at least, the code segment is read-only.
You might consider extending the code of your program with plugins using your dynamic linker. This require operating system specific functions. On POSIX, use dlopen (and probably dlsym to get newly loaded function pointers). You could then overwrite function pointers with the address of new ones.
Perhaps you could use some JIT-compiling library (like libgccjit or asmjit) to achieve your goals. You'll get fresh function addresses and put them in your function pointers.
Remember that a C compiler can generate code of various size for a given function call or jump, so even overwriting that in a machine specific way is brittle.
My friend and I encountered this problem while working on a game that self-modifies its code. We allow the user to rewrite code snippets in x86 assembly.
This just requires leveraging two libraries -- an assembler, and a disassembler:
FASM assembler: https://github.com/ZenLulz/Fasm.NET
Udis86 disassembler: https://github.com/vmt/udis86
We read instructions using the disassembler, let the user edit them, convert the new instructions to bytes with the assembler, and write them back to memory. The write-back requires using VirtualProtect on windows to change page permissions to allow editing the code. On Unix you have to use mprotect instead.
I posted an article on how we did it, as well as the sample code.
These examples are on Windows using C++, but it should be very easy to make cross-platform and C only.
This is how to do it on windows with c++. You'll have to VirtualAlloc a byte array with read/write protections, copy your code there, and VirtualProtect it with read/execute protections. Here's how you dynamically create a function that does nothing and returns.
#include <cstdio>
#include <Memoryapi.h>
#include <windows.h>
using namespace std;
typedef unsigned char byte;
int main(int argc, char** argv){
byte bytes [] = { 0x48, 0x31, 0xC0, 0x48, 0x83, 0xC0, 0x0F, 0xC3 }; //put code here
//xor %rax, %rax
//add %rax, 15
//ret
int size = sizeof(bytes);
DWORD protect = PAGE_READWRITE;
void* meth = VirtualAlloc(NULL, size, MEM_COMMIT, protect);
byte* write = (byte*) meth;
for(int i = 0; i < size; i++){
write[i] = bytes[i];
}
if(VirtualProtect(meth, size, PAGE_EXECUTE_READ, &protect)){
typedef int (*fptr)();
fptr my_fptr = reinterpret_cast<fptr>(reinterpret_cast<long>(meth));
int number = my_fptr();
for(int i = 0; i < number; i++){
printf("I will say this 15 times!\n");
}
return 0;
} else{
printf("Unable to VirtualProtect code with execute protection!\n");
return 1;
}
}
You assemble the code using this tool.
While "true" self modifying code in C is impossible (the assembly way feels like slight cheat, because at this point, we're writing self modifying code in assembly and not in C, which was the original question), there might be a pure C way to make the similar effect of statements paradoxically not doing what you think are supposed do to. I say paradoxically, because both the ASM self modifying code and the following C snippet might not superficially/intuitively make sense, but are logical if you put intuition aside and do a logical analysis, which is the discrepancy which makes paradox a paradox.
#include <stdio.h>
#include <string.h>
int main()
{
struct Foo
{
char a;
char b[4];
} foo;
foo.a = 42;
strncpy(foo.b, "foo", 3);
printf("foo.a=%i, foo.b=\"%s\"\n", foo.a, foo.b);
*(int*)&foo.a = 1918984746;
printf("foo.a=%i, foo.b=\"%s\"\n", foo.a, foo.b);
return 0;
}
$ gcc -o foo foo.c && ./foo
foo.a=42, foo.b="foo"
foo.a=42, foo.b="bar"
First, we change the value of foo.a and foo.b and print the struct. Then we change only the value of foo.a, but observe the output.

Resources