gsl_rng functions: in what order do they compile - c

I am learning the gsl_rng library and ran into an interesting question.
I understand that the environment variables (GSL_RNG_TYPE and GSL_RNG_SEED) can be used to set the library variables (gsl_rng_default and gsl_rng_default_seed) at run time, without recompiling. You just need to call gsl_rng_env_setup() and then change these two variables in the terminal before running ./a.out.
However, if I explicitly set gsl_rng_default and gsl_rng_default_seed in the code (e.g. use "taus" and "12"), then with the same compiled program I can no longer change the seed value at run time, but I can still change the generator type.
I am new to this stuff, so I have probably missed something. Can anyone help me understand why this happens? Why do these two variables behave differently? Is there an ordering or overwrite problem?
Here is my code (simple practice):
#include <stdio.h>
#include <gsl/gsl_rng.h>
int main (void)
{
    const gsl_rng_type * T; /*generator type*/
    gsl_rng * r; /*rng instance*/
    int i, n = 20;
    gsl_rng_env_setup(); /*read from environment variable*/
    T = gsl_rng_default; /*choose default generator type*/
    gsl_rng_default = gsl_rng_mt19937;
    gsl_rng_default_seed = 12;
    r = gsl_rng_alloc (T); /*create an instance*/
    for (i = 0; i < n; i++)
    {
        double u = gsl_rng_uniform (r);
        printf ("%.5f\n", u);
    }
    gsl_rng_free (r); /*free all memory associated with r*/
    return 0;
}

If we step through the code in execution order, we see what happens:
gsl_rng_env_setup(); /*read from environment variable*/
so gsl_rng_default and gsl_rng_default_seed now contain the values from the environment, or if they weren't set, the library defaults.
T = gsl_rng_default; /*choose default generator type*/
T now contains a copy of the environment value
gsl_rng_default = gsl_rng_mt19937;
gsl_rng_default_seed = 12;
now we've overwritten both the values from earlier
r = gsl_rng_alloc (T); /*create an instance*/
At this point, gsl_rng_alloc() uses the generator type we pass as its parameter, so it doesn't matter that gsl_rng_default was overwritten: we're passing T, and that still contains its copy of the value from beforehand. The seed is different. gsl_rng_alloc() seeds the new generator with the current value of gsl_rng_default_seed, so it gets the 12 that we put there.
If you were to assign your default values before calling gsl_rng_env_setup(), you'd overwrite the library defaults, then those values you set will be overwritten if the environment variables are set, or passed through if they aren't, which seems like the behaviour you really want.
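The ordering effect can be reproduced without GSL installed. The following is only a stand-in sketch: default_seed and env_setup() are hypothetical names that mimic gsl_rng_default_seed and gsl_rng_env_setup(), and they are not part of the library. The sketch assumes that the real setup function replaces the seed only when GSL_RNG_SEED is set.

```c
#include <stdlib.h>

/* Hypothetical stand-in for gsl_rng_default_seed. */
static unsigned long default_seed = 0;

/* Hypothetical stand-in for gsl_rng_env_setup(): the environment
   replaces the current value only if GSL_RNG_SEED is set. */
static void env_setup(void)
{
    const char *s = getenv("GSL_RNG_SEED");
    if (s != NULL)
        default_seed = strtoul(s, NULL, 0);
}

/* Order used in the question: read the environment, then assign. */
unsigned long seed_env_then_assign(void)
{
    default_seed = 0;
    env_setup();
    default_seed = 12;   /* overwrites whatever the environment supplied */
    return default_seed;
}

/* Suggested order: assign a fallback, then read the environment. */
unsigned long seed_assign_then_env(void)
{
    default_seed = 12;   /* fallback value */
    env_setup();         /* replaces the fallback only if the variable is set */
    return default_seed;
}
```

With GSL_RNG_SEED=99 in the environment, the first function still yields 12 while the second yields 99. In the real program the equivalent change is to move the two assignments above the gsl_rng_env_setup() call.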

Related

Calling local Julia package from C

The Julia documentation shows examples of how to call Base Julia functions from C (e.g. sqrt), which I've been successful in replicating. What I'm really interested in doing is calling locally developed Julia modules and it's not at all clear from the documentation how one would call non-Base functions. There are some discussion threads on the issue from a few years ago, but the APIs appear to have changed in the meantime. Any pointers would be appreciated.
The reason why jl_eval_string("using SomeModule") returns NULL is simply because using SomeModule returns nothing.
You can use functions from other modules by first importing the module, and then retrieving function objects from that Julia module into C. For example, let's use the package GR and its plot function. We can get the plot function with
jl_eval_string("using GR"); // this returns nothing
jl_module_t* GR = (jl_module_t *)jl_eval_string("GR"); // this returns the module
/* get `plot` function */
jl_function_t *plot = jl_get_function(GR, "plot");
Here we passed the GR module as the first argument to jl_get_function. Since things are loaded into the module Main and plot is exported from GR, we can use the following snippet instead to do the same. Note that jl_main_module holds a pointer to the module Main.
jl_eval_string("using GR");
/* get `plot` function */
jl_function_t *plot = jl_get_function(jl_main_module, "plot");
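We can also use plot's qualified name. jl_get_function looks up a single symbol in the given module, so the qualified name has to be evaluated with jl_eval_string instead.
/* get `plot` function */
jl_function_t *plot = (jl_function_t *)jl_eval_string("GR.plot");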
That said, here is the complete example plotting an array of values using GR. The example uses the first style to get the function GR.plot.
#include <julia.h>
JULIA_DEFINE_FAST_TLS() // only define this once, in an executable (not in a shared library) if you want fast code.
#include <stdio.h>
#include <stdlib.h> /* malloc */
int main(int argc, char *argv[])
{
    /* required: setup the Julia context */
    jl_init();
    /* create a 1D array of length 100 */
    size_t length = 100;
    double *existingArray = (double*)malloc(sizeof(double)*length);
    /* create a *thin wrapper* around our C array */
    jl_value_t* array_type = jl_apply_array_type((jl_value_t*)jl_float64_type, 1);
    jl_array_t *x = jl_ptr_to_array_1d(array_type, existingArray, length, 0);
    /* fill in values */
    double *xData = (double*)jl_array_data(x);
    for (size_t i = 0; i < length; i++)
        xData[i] = (double)(i * i);
    /* import `GR` into `Main` module with `using` */
    jl_eval_string("using GR");
    jl_module_t* GR = (jl_module_t *)jl_eval_string("GR");
    /* get `plot` function */
    jl_function_t *plot = jl_get_function(GR, "plot");
    /* create the plot */
    jl_value_t* p = jl_call1(plot, (jl_value_t*)x);
    /* display the plot */
    jl_function_t *disp = jl_get_function(jl_base_module, "display");
    jl_call1(disp, p);
    getchar();
    /* exit */
    jl_atexit_hook(0);
    return 0;
}
Including a Julia module from a local file and using it in C
I do not know exactly what is meant by a local Julia package, but you can include your files and then import the modules in those files to do the same. Here is an example module.
# Hello.jl
module Hello
export foo!
foo!(x) = (x .*= 2) # multiply entries of x by 2 inplace
end
To include this file you need to use jl_eval_string("Base.include(Main, \"Hello.jl\")");. For some reason, embedded Julia cannot access include directly. You need to use Base.include(Main, "/path/to/file") instead.
jl_eval_string("Base.include(Main, \"Hello.jl\")");
jl_eval_string("using Main.Hello"); // or just '.Hello'
jl_module_t* Hello = (jl_module_t *)jl_eval_string("Main.Hello"); // or just .Hello
Here is the complete example in C.
#include <julia.h>
JULIA_DEFINE_FAST_TLS() // only define this once, in an executable (not in a shared library) if you want fast code.
#include <stdio.h>
#include <stdlib.h> /* malloc */
int main(int argc, char *argv[])
{
    /* required: setup the Julia context */
    jl_init();
    /* create a 1D array of length 100 */
    size_t length = 100;
    double *existingArray = (double*)malloc(sizeof(double)*length);
    /* create a *thin wrapper* around our C array */
    jl_value_t* array_type = jl_apply_array_type((jl_value_t*)jl_float64_type, 1);
    jl_array_t *x = jl_ptr_to_array_1d(array_type, existingArray, length, 0);
    JL_GC_PUSH1(&x);
    /* fill in values */
    double *xData = (double*)jl_array_data(x);
    for (size_t i = 0; i < length; i++)
        xData[i] = (double)(i * i);
    /* import `Hello` module from file Hello.jl */
    jl_eval_string("Base.include(Main, \"Hello.jl\")");
    jl_eval_string("using Main.Hello");
    jl_module_t* Hello = (jl_module_t *)jl_eval_string("Main.Hello");
    /* get `foo!` function */
    jl_function_t *foo = jl_get_function(Hello, "foo!");
    /* call the function */
    jl_call1(foo, (jl_value_t*)x);
    /* print new values of x */
    for (size_t i = 0; i < length; i++)
        printf("%.1f ", xData[i]);
    printf("\n");
    JL_GC_POP();
    getchar();
    /* exit */
    jl_atexit_hook(0);
    return 0;
}

Can gcc/clang optimize initialization computing?

I recently wrote a parser generator tool that takes a BNF grammar (as a string) and a set of actions (as a function pointer array) and outputs a parser (= a state automaton, allocated on the heap). I then use another function to run that parser on my input data and generate an abstract syntax tree.
In the initial parser generation, there are quite a lot of steps, and I was wondering whether gcc or clang is able to optimize this, given constant inputs to the parser generation function (and never using the pointer values, only dereferencing them). Is it possible to run the function at compile time and embed the result (i.e. the allocated memory) in the executable?
(Obviously, that would require link time optimization, since the compiler would need to be able to check that the whole function does indeed have the same result with the same parameters.)
What you could do in this case is have code that generates code.
Have your initial parser generator as a separate piece of code that runs independently. The output of this code would be a header file containing a set of variable definitions initialized to the proper values. You then use this file in your main code.
As an example, suppose you have a program that needs to know the number of bits that are set in a given byte. You could do this manually whenever you need:
#include <stdint.h>
int count_bits(uint8_t b)
{
    int count = 0;
    while (b) {
        count += b & 1;
        b >>= 1;
    }
    return count;
}
Or you can generate the table in a separate program:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *header = fopen("bitcount.h", "w");
    if (!header) {
        perror("fopen failed");
        exit(1);
    }
    fprintf(header, "int bit_counts[256] = {\n");
    for (unsigned v = 0; v < 256; v++) {
        uint8_t b = v;
        int count = 0; /* reset for every byte value */
        while (b) {
            count += b & 1;
            b >>= 1;
        }
        fprintf(header, " %d,\n", count);
    }
    fprintf(header, "};\n");
    fclose(header);
    return 0;
}
This creates a file called bitcount.h that looks like this:
int bit_counts[256] = {
 0,
 1,
 1,
 2,
 ...
 8,
};
You can then include that file in your "real" code.
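To show what the consuming side might look like while staying self-contained, here is a minimal sketch that uses a hand-written 16-entry table in place of the generated header. The names nibble_counts and count_bits_lut are made up for this illustration. With the generated bit_counts table the lookup would be a single array access.

```c
#include <stdint.h>

/* Stand-in for a generated table: number of set bits in 0..15. */
static const uint8_t nibble_counts[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Count the set bits of a byte with two table lookups, one per nibble. */
int count_bits_lut(uint8_t b)
{
    return nibble_counts[b & 0x0F] + nibble_counts[b >> 4];
}
```

The table lives in the executable's read-only data, so no work is done at startup. That is the same effect the question asks about for the parser tables.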

Is it possible to exchange a C function implementation at run time?

I have implemented a facade pattern that uses C functions underneath and I would like to test it properly.
I do not really have control over these C functions. They are implemented in a header. Right now I #ifdef to use the real headers in production and my mock headers in tests. Is there a way in C to exchange the C functions at runtime by overwriting the C function address or something? I would like to get rid of the #ifdef in my code.
To expand on Bart's answer, consider the following trivial example.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int (*functionPtr)(const char *format, ...);
int myPrintf(const char *fmt, ...)
{
    char *tmpFmt = strdup(fmt);
    if (tmpFmt == NULL)
        return -1;
    size_t len = strlen(tmpFmt);
    for (size_t i = 0; i < len; i++)
        tmpFmt[i] = (char)toupper((unsigned char)tmpFmt[i]);
    // notice - we only print an upper case version of the format
    // we totally disregard all but the first parameter to the function
    fputs(tmpFmt, stdout); // fputs rather than printf: "%D" is not a valid conversion
    free(tmpFmt);
    return (int)len;
}
int main(void)
{
    functionPtr = printf;
    functionPtr("Hello world! - %d\n", 2013);
    functionPtr = myPrintf;
    functionPtr("Hello world! - %d\n", 2013);
    return 0;
}
Output
Hello world! - 2013
HELLO WORLD! - %D
It is strange that you even need an ifdef-selected header. The code-to-test and your mocks should have the exact same function signatures in order to be a correct mock of the module-to-test. The only thing that then changes between a production-compilation and a test-compilation would be which .o files you give to the linker.
It is possible with Typemock Isolator++ without creating unnecessary new levels of indirection. It can be done inside the test without altering your production code. Consider the following example:
You have the Sum function in your code:
int Sum(int a, int b)
{
return a+b;
}
And you want to replace it with Sigma for your test:
int Sigma(int a, int b)
{
int sum = 0;
for( ; 0<a ; a--)
{
sum += b;
}
return sum;
}
In your test, mock Sum before using it:
WHEN_CALLED: call the method you want to fake.
ANY_VAL: specify the argument values for which the mock will apply, in this case any 2 integers.
*DoStaticOrGlobalInstead: The alternative behavior you want for Sum.
In this example we call Sigma instead.
TEST_CLASS(C_Function_Tests)
{
public:
TEST_METHOD(Exchange_a_C_function_implementation_at_run_time_is_Possible)
{
void* context = NULL; //since Sum global it has no context
WHEN_CALLED(Sum (ANY_VAL(int), ANY_VAL(int))).DoStaticOrGlobalInstead(Sigma, context);
Assert::AreEqual(2, Sum(1,2));
}
};
*DoStaticOrGlobalInstead: It is possible to set other types of behavior instead of calling an alternative method. You can throw an exception, return a value, ignore the method, etc.
For instance:
TEST_METHOD(Alter_C_Function_Return_Value)
{
WHEN_CALLED(Sum (ANY_VAL(int), ANY_VAL(int))).Return(10);
Assert::AreEqual(10, Sum(1,2));
}
I don't think it's a good idea to overwrite functions at runtime. For one thing, the executable segment may be set as read-only and even if it wasn't you could end up stepping on another function's code if your assembly is too large.
I think you should create something like a function pointer collection for the one and the other set of implementations you want to use. Every time you want to call a function, you'll be calling from the selected function pointer collection. Having done that, you may also have proxy functions (that simply call from the selected set) to hide the function pointer syntax.
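A minimal sketch of such a function pointer collection follows. The names backend_ops, real_backend, mock_backend and backend_select are hypothetical and chosen only for this illustration. Sum is borrowed from the example above and plays the role of the proxy function.

```c
#include <stddef.h>

/* Hypothetical interface: one pointer per function the facade needs. */
struct backend_ops {
    int (*sum)(int a, int b);
};

static int real_sum(int a, int b) { return a + b; }
static int mock_sum(int a, int b) { (void)a; (void)b; return 42; }

/* The two sets of implementations. */
const struct backend_ops real_backend = { real_sum };
const struct backend_ops mock_backend = { mock_sum };

/* Currently selected set; production code starts with the real one. */
static const struct backend_ops *active = &real_backend;

/* Switch implementations at run time, e.g. from a test's setup code. */
void backend_select(const struct backend_ops *ops)
{
    active = ops;
}

/* Proxy function: callers never see the function pointer syntax. */
int Sum(int a, int b)
{
    return active->sum(a, b);
}
```

A test would call backend_select(&mock_backend) before exercising the facade and backend_select(&real_backend) afterwards. The cost is one extra indirect call per invocation.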

Using pthread_create

I am having errors when I try to use pthread_create. I understand that my use of argsRight->thread_id / argsLeft->thread_id and NULL are not correct, but I am unsure how else to make a reference to the thread id. It requires a pointer, but it seems like every way I tried (&, *), the GCC compiler would not accept.
Also, is there any reason it will not accept my use of NULL? I can't see any reason that would be wrong, but GCC says my use of the void function is invalid.
Can anyone shed some light on how to properly set up a call to pthread_create? I have included parts from my method where I am using the pthread_create function.
void pthreads_ms(struct ms_args* args)
{
int left_end = (args->end + args->start) / 2;
int right_start = left_end + 1;
int rc1, rc2;
// Create left side struct
struct ms_args* argsLeft;
argsLeft = malloc(sizeof(args));
argsLeft->thread_id = (2 * args->thread_id + 1);
argsLeft->start = args->start;
argsLeft->end = left_end;
argsLeft->array = args->array;
// Same methodology as above to create the right side
if (args->start != args->end)
{
// Print the thread id number, and start and end places
printf("[%d] start %d end %d", args->thread_id, args->start, args->end);
// Sort Left Side
rc1 = pthread_create(argsLeft->thread_id, NULL, pthreads_ms(argsLeft), argsLeft); //problem line here
//Sort right side
rc2 = pthread_create(argsRight->thread_id, NULL, pthreads_ms(argsRight), argsRight); //problem line here
}
It is not your application but pthread_create() that fills in the thread_id field. So, first of all, the thread_id field of struct ms_args should be of type pthread_t, and you should pass a pointer to that field:
pthread_create(&argsLeft->thread_id, ...
According to the pthread_create documentation, the proper call should be
rc1 = pthread_create(&(argsLeft->thread_id), NULL, &pthreads_ms, argsLeft);
Same goes for right side.
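The definition of pthreads_ms() should return void * and take a single void * argument, since that is the start routine signature pthread_create() expects:
void *pthreads_ms(void *arg) { struct ms_args *args = arg; ... }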
Besides that, your code looks pretty dangerous to me, since it creates recursively two threads for every existing one. Depending on your input, this might build a large tree of threads, which could bring your system to a halt.
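Putting the points above together, here is a small self-contained sketch of a correct pthread_create() call with a join. It sums two halves of an array instead of sorting them, and range_args, sum_range and parallel_sum are names invented for this example, not taken from the question's code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical argument struct, loosely modelled on ms_args. */
struct range_args {
    const int *array;
    size_t start;   /* inclusive */
    size_t end;     /* exclusive */
    long sum;       /* result, written by the thread */
};

/* Start routine: takes void * and returns void *, as pthread_create requires. */
static void *sum_range(void *arg)
{
    struct range_args *r = arg;
    r->sum = 0;
    for (size_t i = r->start; i < r->end; i++)
        r->sum += r->array[i];
    return NULL;
}

/* Sum array[0..n) with two threads. Returns 0 on success,
   otherwise the error code reported by pthread_create. */
int parallel_sum(const int *array, size_t n, long *out)
{
    struct range_args left  = { array, 0, n / 2, 0 };
    struct range_args right = { array, n / 2, n, 0 };
    pthread_t tl, tr;
    int rc;

    /* Pass the function itself, not the result of calling it. */
    rc = pthread_create(&tl, NULL, sum_range, &left);
    if (rc != 0)
        return rc;
    rc = pthread_create(&tr, NULL, sum_range, &right);
    if (rc != 0) {
        pthread_join(tl, NULL);
        return rc;
    }
    /* Wait for both threads before reading their results. */
    pthread_join(tl, NULL);
    pthread_join(tr, NULL);
    *out = left.sum + right.sum;
    return 0;
}
```

Because exactly two threads are created and both are joined, the thread count stays bounded. A recursive version would need a depth limit to avoid the thread explosion described above.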

In a C program, is it possible to reset all global variables to default values?

I have a legacy C Linux application that I need to reuse. This application uses a lot of global variables. I want to reuse this application's main method and invoke it in a loop. I have found that when I call the main method (renamed to callableMain) in a loop, the application behavior is not consistent, as the values of global variables set in the previous iteration impact the program flow in the new iteration.
What I would like to do is reset all the global variables to their default values before the execution of the new iteration.
For example, the original program is like this:
OriginalMain.C
#include <stdio.h>
int global = 3; /* This is the global variable. */
void doSomething(){
global++; /* Reference to global variable in a function. */
}
// i want to rename this main method to callableMain() and
// invoke it in a loop
int main(void){
if(global==3) {
printf(" All Is Well \n");
doSomething() ;
}
else{
printf(" Noooo\n");
doNothing() ;
}
return 0;
}
I want to change this program as follows:
I changed the above file to rename main() to callableMain().
And my new main method is as follows:
int main(){
for(int i=0;i<20;i++){
callableMain();
// this is where I need to reset the value of global variables
// otherwise the execution flow changes
}
}
Is it possible to reset all the global variables to the values they had before main() was invoked?
The short answer is that there is no magical api call that would reset global variables. The global variables would have to be cached and reused.
I would invoke it as a subprocess, modifying its input and output as needed. Let the operating system do the dirty work for you.
The idea is to isolate the legacy program from your new program by relegating it to its own process. Then you have a clean separation between the two. Also, the legacy program is reset to a clean state every time you run it.
First, modify the program so that it reads the input data from a file, and writes its output in a machine-readable format to another file, with the files being given on the command line.
You can then create named pipes (using the mkfifo call) and invoke the legacy program using system, passing it the named pipes on the command line. Then you feed it its input and read back its output.
I am not an expert on these matters; there is probably a better way of doing the IPC. Others here have mentioned fork. However, the basic idea of separating out the legacy code and invoking it as a subprocess is probably the best approach here.
fork() early?
You could fork(2) at some early point when you think the globals are in a good state, and then have the child wait on a pipe or something for some work to do. This would require writing any changed state or at least the results back to the parent process but would decouple your worker from your primary control process.
In fact, it might make sense to fork() at least twice, once to set up a worker controller and save the initialized (but not too initialized :-) global state, and then have this worker controller fork() again for each loop you need run.
A simpler variation might be to just modify the code so that the process can start in a "worker mode", and then use fork() or system() to start the application at the top, but with an argument that puts it into worker mode.
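A minimal sketch of the fork-per-iteration idea, assuming a POSIX system, is shown below. global and callableMain() here are tiny stand-ins for the legacy program, and run_isolated() is a hypothetical helper name. The child's exit status is the only result passed back, so a real program would need a pipe or a file for richer output.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int global = 3;   /* stands in for the legacy program's globals */

/* Stand-in for the legacy entry point: returns 0 if the globals were pristine. */
static int callableMain(void)
{
    if (global != 3)
        return 1;
    global++;     /* this change only affects the child's copy */
    return 0;
}

/* Run callableMain() in a child process.
   Returns the child's exit status, or -1 on error. */
int run_isolated(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(callableMain());   /* child: its globals die with it */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_isolated() in a loop gives every iteration a fresh copy of the parent's globals, and the parent's own copy is never modified.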
There is a way to do this on certain platforms / compilers, you'd basically be performing the same initialization your compiler performs before calling main().
I have done this for a TI DSP. In that case I had the section with globals mapped to a specific section of memory, and there were linker directives available that declared variables pointing to the start and end of this section (so you can memset() the whole area to zero before starting initialization). Then, the compiler provided a list of records, each consisting of an address, a data length and the actual data to be copied to that address. So you'd just loop through the records and memcpy() into the target address to initialize all globals.
Very compiler specific, so hopefully the compiler you're using allows you to do something similar.
In short, no. What I would do in this instance is create definitions, constants if you will, and then use those to reset the global variables.
Basically
#define var1 10
int vara = var1;
etc... basic C right?
You can then go ahead and wrap the reinitialization in a handy function =)
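As a rough sketch of that wrapper, assuming two made-up globals: global_count, global_rate, the *_INIT constants and reset_globals() are illustrative names only. The point is that each initial value is written once, in the #define, and used both for the definition and for the reset.

```c
/* Initial values live in one place so definition and reset cannot drift apart. */
#define GLOBAL_COUNT_INIT 3
#define GLOBAL_RATE_INIT  1.5

int global_count = GLOBAL_COUNT_INIT;
double global_rate = GLOBAL_RATE_INIT;

/* Restore every global to its initial value; call this between iterations. */
void reset_globals(void)
{
    global_count = GLOBAL_COUNT_INIT;
    global_rate = GLOBAL_RATE_INIT;
}
```

The drawback is maintenance: every global added to the legacy code also has to be added to reset_globals() by hand.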
I think you must change the way you see the problem.
Declare all the variables used by callableMain() inside callableMain()'s body, so they are not global anymore and are destroyed after the function is executed and created once again with the default values when you call callableMain() on the next iteration.
EDIT:
Ok, here's what you could do if you have the source code for callableMain(): at the beginning of the function, add a check to verify whether it's the first time the function is being called. Inside this check, copy the values of all the global variables used into another set of static variables (name them as you like). Then, in the function's body, replace all occurrences of the global variables with the static variables you created.
This way you will preserve the initial values of all the global variables and use them on every iteration of callableMain(). Does that make sense to you?
void callableMain()
{
    static bool first_iter = true;
    // declared at function scope: C does not allow a static variable
    // to be initialized from a non-constant expression
    static int my_global_var1;
    static float my_global_var2;
    if (first_iter)
    {
        first_iter = false;
        my_global_var1 = global_var1;
        my_global_var2 = global_var2;
        ..
    }
    // perform operations on my_global_var1 and my_global_var2,
    // which store the default values of the original global variables.
}
for (int i = 0; i < 20; i++) {
    int saved_var1 = global_var1;
    char saved_var2 = global_var2;
    double saved_var3 = global_var3;
    callableMain();
    global_var1 = saved_var1;
    global_var2 = saved_var2;
    global_var3 = saved_var3;
}
Or maybe you can find out where the global variables start and memcpy them. But I would always cringe when starting a loop ...
for (int i = 0; i < 20; i++) {
    static unsigned char global_copy[SIZEOFGLOBALDATA];
    memcpy(global_copy, STARTOFGLOBALDATA, SIZEOFGLOBALDATA);
    callableMain();
    memcpy(STARTOFGLOBALDATA, global_copy, SIZEOFGLOBALDATA);
}
If you don't want to refactor the code and encapsulate these global variables, I think the best you can do is define a reset function and then call it within the loop.
Assuming we are dealing with ELF on Linux, then the following function to reset the variables works
#include <stdio.h>
#include <stdlib.h>
// these extern variables come from glibc
// https://github.com/ysbaddaden/gc/blob/master/include/config.h
extern char __data_start[];
extern char __bss_start[];
extern char _end[];
#define DATA_START ((char *)&__data_start)
#define DATA_END ((char *)&__bss_start)
#define BSS_START ((char *)&__bss_start)
#define BSS_END ((char *)&_end)
/// first call saves globals, subsequent calls restore
void reset_static_data();
// variable for quick check
static int pepa = 42;
// writes to memory between global variables are reported as buffer overflows by asan
ATTRIBUTE_NO_SANITIZE_ADDRESS
void reset_static_data()
{
    // global variable, ok to leak it
    static char * x;
    size_t s = BSS_END - DATA_START;
    // memcpy is always sanitized, so access memory as chars in a loop
    if (x == NULL) { // store current static variables
        x = (char *) malloc(s);
        for (size_t i = 0; i < s; i++) {
            *(x+i) = *(DATA_START + i);
        }
    } else { // restore previously saved static variables
        for (size_t i = 0; i < s; i++) {
            *(DATA_START + i) = *(x+i);
        }
    }
    // quick check, see that pepa does not grow in stderr output
    fprintf(stderr, "pepa: %d\n", pepa++);
}
The general approach is based on answer in How to get the data and bss address space in run time (In Unix C program), see the linked ysbaddaden/gc GitHub repo for macOS version of the macros.
To test the above code, just call it a few times and note that the incremented global variable pepa still keeps the value of 42.
reset_static_data();
reset_static_data();
reset_static_data();
Saving current state of the globals is convenient in that it does not require rerunning __attribute__((constructor)) functions which would be necessary if I set everything in .bss to zero (which is easy) and everything in .data to the initial values (which is not so easy). For example, if you load libpython3.so in your program, it does do run-time initialization which is lost by zeroing .bss. Calling into Python then crashes.
Sanitizers
Writing into areas of memory immediately before or after a static variable will trigger buffer-overflow warning from Address Sanitizer. To prevent this, use the ATTRIBUTE_NO_SANITIZE_ADDRESS macro the way the code above does. The macro is defined in sanitizer/asan_interface.h.
Code coverage
Code coverage counters are implemented as global variables. Therefore, resetting globals will cause coverage information to be forgotten. To solve this, always dump the coverage-to-date before restoring the globals. There does not seem to be a macro to detect whether code coverage is enabled in the compiler, so use your build system (CMake, ...) to define a suitable macro yourself, such as QD_COVERAGE below.
// The __gcov_dump function writes the coverage counters to gcda files
// and the __gcov_reset function resets them to zero.
// The interface is defined at https://github.com/gcc-mirror/gcc/blob/7501eec65c60701f72621d04eeb5342bad2fe4fb/libgcc/libgcov-interface.c
extern "C" void __gcov_reset();
extern "C" void __gcov_dump();
void flush_coverage() {
#if defined(QD_COVERAGE)
__gcov_dump();
__gcov_reset();
#endif
}

Resources