Put userdata on lua stack in a thread-safe manner - c

I have a C function that calls a Lua function, which in turn initiates a long chain of async callbacks that hop between C and Lua. I want all of the C functions involved to be able to access some specific userdata that I created in the original C function. The tricky part is that all of this should be thread-safe, and I can't change the API, so passing a reference value through the callbacks is not an option.
So is there a way to somehow put userdata inside lua_State, in a way that only "my" chain of callbacks could access it?

Because Lua has no thread safety guarantees for data races within the same lua_State instance, I'll assume your C code ensures that only one piece of code is communicating with the same lua_State at any one time.
The usual way to handle this is through the Lua registry. It is a special table that is part of the lua_State, accessible from any C code. You designate some key in the registry table to be your special value.
Lua code cannot access the Lua registry unless some C code gives them access to it (BTW: don't do this). So as long as you maintain the integrity of this table, you don't have to worry about a Lua script reaching out and breaking it.

Related

Extracting the call graph with the accessed variables and called functions from C Source Code or C binary

I would like to know a reliable and convenient way (preexisting libraries, C compilers) to extract specific information from C source code or a C binary. The format does not really matter; the result is what matters ;-).
Basically the idea is to extract the read and write accesses to variables on a function level as well as called subfunctions recursively, top to bottom.
That means, I want to find out what variables does a certain parent function A access directly (Read, Write or both Read Write access), and then also recursively extract the same information from subfunctions function B, C called by the function A. Similarly, the directly accessed variables by subfunctions function B, C shall be identified, and then the same procedure shall continue recursively for all subfunctions of function B and function C.
I think of it as a call graph, where
Nodes represent a function
Arrows represent a function call
Each node contains additional information about:
name of the accessed variable
access type to a variable (Read, Write or Read/Write)
I attached a picture showing the call graph below to illustrate the idea better. Please be aware that the variable access information shall be collected for each function, and not only for the parent function A.
I am aware that there are preexisting compilers and libraries that can parse C source code or work directly on the binary, such as the Clang compiler, LLVM, the libclang Python library and pycparser. However, I am not sure whether those tools are capable of achieving my task. For example, from my understanding the combination of Clang and LLVM can extract the variable accesses from the binary application file, but not recursively, so to speak. That means all of the variable accesses done by child functions B, C, E, F, etc. would be directly assigned to the parent function A.
I definitely want to avoid writing a script (e.g. in Python) of my own that would extract this information from the C source code manually; I would much prefer a well-tested and well-documented tool/API/compiler for this task.
Thank you a lot in advance!

Multiple calls to a dll from Dymola

I am implementing a model that needs to call a DLL twice in order to retrieve specific values from it:
the first call is to set up a system inside the library (for example, a powertrain from a catalogue of components and some design parameters)
the second call is to retrieve the performance of a component from such system (let's say the efficiency of a specific electric machine when used in that powertrain)
These two calls to the dll go together, and may be repeated during simulation time.
So far, I've only managed to interface my model to the dll through separate calls from Modelica external functions (one for the first call, one for the second). However, the state of the system is reset between the first and the second call.
Is there a way in Modelica to load a dll, call the same instance of it multiple times, and eventually close it when the job is done?
Perhaps is it only possible to achieve such feature by bundling the whole functionality in an external function?
Or am I attempting something that just doesn't work, because of some technical aspects that I am not aware of? (I don't know, perhaps the way it all gets compiled during translation)
If I understand this correctly, it looks like your external DLL has some kind of object pointer that is returned when you instantiate it, and this needs to be passed in at every subsequent function call to other functions in the DLL (to preserve the state).
So to do this in Modelica, you need to create an external object class. These are used to preserve state externally and have constructors and destructors to manage their memory. You can write small wrapper C functions to interface with your DLL functions, which you can either include directly in Modelica annotations or put in a wrapper lib.
Documentation
https://build.openmodelica.org/Documentation/ModelicaReference.Classes.ExternalObject.html
Simple Example
https://www.claytex.com/tech-blog/external-object-example-detecting-initial-rising-edge/
Detailed Example
https://github.com/modelica-3rdparty/ExternData/releases
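To give an idea of the C side of such an external object, here is a minimal sketch of the three wrapper functions. Everything in it is hypothetical (the Powertrain struct, the function names, the placeholder formula); a real wrapper would store whatever handle your DLL returns from its setup call and forward to the DLL's own functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state kept alive between calls; a real wrapper would hold
   the handle returned by the DLL's setup function. */
typedef struct {
    double design_param;
    int    n_calls;
} Powertrain;

/* Constructor: called once, the returned pointer is stored by Modelica. */
void *powertrain_create(double design_param)
{
    Powertrain *p = malloc(sizeof *p);
    if (p != NULL) {
        p->design_param = design_param;
        p->n_calls = 0;
    }
    return p;
}

/* Ordinary external function: receives the same pointer on every call,
   so the state set up by the constructor is still there. */
double powertrain_efficiency(void *obj, double speed)
{
    Powertrain *p = obj;
    assert(p != NULL);
    p->n_calls++;
    /* Placeholder formula standing in for the real DLL query. */
    return p->design_param / (1.0 + speed);
}

/* Destructor: called when the external object is no longer needed. */
void powertrain_destroy(void *obj)
{
    free(obj);
}
```

On the Modelica side, the constructor and destructor functions of the ExternalObject class would map to powertrain_create and powertrain_destroy, and the second call from your model would pass the object to powertrain_efficiency.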

Thread-safe init of read-only global data

Let's imagine that I'm writing a library that has a reasonably large amount of read-only global data that needs to be initialized before the library can be used. For example, the global data might be lookup tables for various parts of the application logic that won't change during the lifetime of the program.
Now I have a few ways to initialize this data:
(1) I may require that the user call some kind of init() function before the library is used.
(2) I may lazily construct the data the first time a function is called on my library.
(3) I may include the data in an initializer statement in the source, such that variables are statically initialized to their final value.
Now if my data is read-only and should be the same for every environment the library runs in, then (3) is fairly appealing. Even in that case it has some downsides: if the data is very large (but easy to generate procedurally), the size of the binary will bloat up a lot (e.g., a library with 50K of code but 8MB of lookup tables would end up around 8050K). Similarly, the source itself may be very large, or the build system needs to handle the generation of the source at compile time.
The main reason you might not be able to use (3) is that the tables might be fixed (read-only), but require generation at runtime because they embed some information about the environment (e.g., the value of an environment variable, a configuration setting read from a file, information about the machine architecture, whatever). This data can't be embedded in the source since it depends on the runtime environment.
So we have methods (1) and (2) at least - but I can't see how to make these thread-safe in a simple way. The rest of the library can be thread-safe simply by not mutating any global state - just like the vast majority of C functions can be written in a thread-safe way w/o any explicit use of threading primitives.
I can't figure out a similar alternative for this global init, however:
(1) Is undesirable because we prefer not to require the user to call this method, and in any case it simply moves the problem up to the calling code: the calling code then needs to organize to call this init() method exactly once across all threads using the library, and before any thread uses the library.
(2) Fails since concurrent calls to the library might do a double init.
In C++ you can just initialize globals with a method call, like int data[] = loadData(). Is there any equivalent in C? Or am I stuck using threading primitives (which vary by platform, e.g., pthread_once, call_once and whatever Windows has) just to get my thread-safe init?
I don't know of any platform-independent way of initializing a library in a thread-safe manner. That's not surprising since there's no platform-independent threading model in C.
So your solution is going to be platform-specific.
@ThingyWotsit mentions in the comments using C++ to initialize your library, and that will be thread-safe. But it may very well lock you into a specific C++ run-time, so it may not be a useful solution for your C shared object/library. You may not be willing or able to add a dependency on C++ and you may especially not be willing or able to be locked into a specific C++ run-time.
For GCC, you can use __attribute__((constructor)) to have your initialization function called when the shared object is loaded:
constructor
destructor
constructor (priority)
destructor (priority)
The constructor attribute causes the function to be called automatically before execution enters main (). Similarly, the destructor attribute causes the function to be called automatically after main () has completed or exit () has been called. Functions with these attributes are useful for initializing data that will be used implicitly during the execution of the program.
You may provide an optional integer priority to control the order in which constructor and destructor functions are run. A constructor with a smaller priority number runs before a constructor with a larger priority number; the opposite relationship holds for destructors. So, if you have a constructor that allocates a resource and a destructor that deallocates the same resource, both functions typically have the same priority. The priorities for constructor and destructor functions are the same as those specified for namespace-scope C++ objects (see C++ Attributes).
For example:
static __attribute__((constructor)) void my_lib_init_func( void )
{
...
}
Your code will run before main() is called.
If your library is dynamically loaded (an explicit call to dlopen(), for example), your init function will be called when your library is loaded, and your library won't be considered loaded until it returns.
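A slightly fuller sketch of the constructor approach, assuming GCC or Clang, might look like the following. The table contents and the names (squares, squares_init, square_of) are placeholders I made up; the point is only that the table is filled before any caller can reach the accessor.

```c
#include <assert.h>
#include <stddef.h>

static unsigned squares[256];   /* treated as read-only after init */
static int squares_ready;

/* GCC/Clang extension: runs when the program or shared object is loaded,
   before main() and before any other code can call into the library. */
__attribute__((constructor))
static void squares_init(void)
{
    for (size_t i = 0; i < 256; i++)
        squares[i] = (unsigned)(i * i);
    squares_ready = 1;
}

/* Library function that relies on the table already being filled. */
unsigned square_of(unsigned char x)
{
    assert(squares_ready);
    return squares[x];
}
```

Because the loader runs the constructor on a single thread before the library is usable, no locking is needed around the table here.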
Other compilers provide the functionally-identical #pragma init():
#pragma init(my_lib_init_func)
static void my_lib_init_func( void )
{
...
}
See #pragma init and #pragma fini using gcc compiler on linux
For Windows? The Windows C++ run-time is pretty stable and ubiquitous. I'd just use a C++ solution on Windows, especially if you're compiling with MSVC. (But see the comments...)
Option 3 is always preferable when possible. Your reasoning about the cons is wrong. If you have an 8MB constant table in the executable file, it's directly mapped and shared by all instances of the program or users of the shared library on any remotely modern operating system. If you generate it at runtime, each process will have its own copy of the table.
When option 3 is not available you must use pthread_once or equivalent or implement your own version of the same (much less efficiently) using a lock. There is little reason to use weird OS-specific replacements for it; all major platforms either support POSIX threads API natively or have existing libraries which provide it on top of the platform's low-level primitives.
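For the pthread_once route, a minimal sketch could look like this. The table contents and names (lib_table, lib_table_init, lib_init_calls) are invented for the example, and the init counter is only there to show that the initializer runs once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define LIB_TABLE_SIZE 256

static unsigned lib_table_data[LIB_TABLE_SIZE];
static pthread_once_t lib_table_once = PTHREAD_ONCE_INIT;
static int lib_table_inits;   /* counts initializations, for illustration */

/* Runs exactly once, no matter how many threads race to get here. */
static void lib_table_init(void)
{
    for (size_t i = 0; i < LIB_TABLE_SIZE; i++)
        lib_table_data[i] = (unsigned)(i * 3u + 1u);
    lib_table_inits++;
}

/* Every public entry point reaches the table through this accessor. */
const unsigned *lib_table(void)
{
    int rc = pthread_once(&lib_table_once, lib_table_init);
    assert(rc == 0);
    (void)rc;
    return lib_table_data;
}

int lib_init_calls(void)
{
    return lib_table_inits;
}
```

Callers never see an init() function; the first call to lib_table() from any thread pays for the initialization and later calls only pay for the pthread_once check.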

C callbacks and non-Go threads

How does one call Go code in C from threads that weren't created by Go?
What do I assign to a C function pointer such that threads not created by Go can call that pointer and enter into Go code?
Update0
I don't want to use SWIG.
The callbacks will be coming from threads Go hasn't seen before. Neither cgo/life nor anything in pkg/runtime demonstrates this behaviour AFAICT.
You can do this, but the solution is relatively slow (about 22µs per call on my machine).
The answer is for the C code to use C thread primitives to communicate with another goroutine that will actually run the callback.
I have created a Go package that provides this functionality: rog-go.googlecode.com/hg/exp/callback.
There is an example package demonstrating its use here. The example demonstrates a call back to an arbitrary Go closure from a thread created outside of the Go runtime. Another example is here. This demonstrates a typical C callback interface and layers a Go callback on top of it.
To try out the first example:
goinstall rog-go.googlecode.com/hg/exp/example/looper
cd $GOROOT/src/pkg/rog-go.googlecode.com/hg/exp/example/looper
gotest
To try out the second example:
goinstall rog-go.googlecode.com/hg/exp/example/event
cd $GOROOT/src/pkg/rog-go.googlecode.com/hg/exp/example/event
gotest
Both examples assume that pthreads are available. Of course, this is just a stop-gap measure until cgo is fixed, but the technique for calling arbitrary Go closures in a C callback will be applicable even then.
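To illustrate just the C half of that idea, here is a rough sketch of a one-slot mailbox built on pthreads: a foreign thread posts a function and argument and blocks, and a single runner thread picks the job up and executes it. This is not the code from the callback package; all names are invented, and the runner thread here merely stands in for whatever Go-side thread would service the requests.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*job_fn)(void *);

/* One-slot mailbox shared by foreign threads and a single runner thread. */
struct mailbox {
    pthread_mutex_t callers;  /* serializes foreign callers */
    pthread_mutex_t mu;       /* protects the fields below */
    pthread_cond_t  cv;
    job_fn fn;                /* pending job, NULL when the slot is empty */
    void  *arg;
    int    done;              /* set by the runner after fn(arg) returns */
};

void mailbox_init(struct mailbox *mb)
{
    pthread_mutex_init(&mb->callers, NULL);
    pthread_mutex_init(&mb->mu, NULL);
    pthread_cond_init(&mb->cv, NULL);
    mb->fn = NULL;
    mb->arg = NULL;
    mb->done = 0;
}

/* Foreign thread: hand fn(arg) to the runner and block until it has run. */
void mailbox_call(struct mailbox *mb, job_fn fn, void *arg)
{
    assert(fn != NULL);
    pthread_mutex_lock(&mb->callers);
    pthread_mutex_lock(&mb->mu);
    mb->fn = fn;
    mb->arg = arg;
    mb->done = 0;
    pthread_cond_broadcast(&mb->cv);
    while (!mb->done)
        pthread_cond_wait(&mb->cv, &mb->mu);
    pthread_mutex_unlock(&mb->mu);
    pthread_mutex_unlock(&mb->callers);
}

/* Runner thread: wait for one job, run it, then wake the caller. */
void mailbox_serve_one(struct mailbox *mb)
{
    job_fn fn;
    void *arg;
    pthread_mutex_lock(&mb->mu);
    while (mb->fn == NULL)
        pthread_cond_wait(&mb->cv, &mb->mu);
    fn = mb->fn;
    arg = mb->arg;
    pthread_mutex_unlock(&mb->mu);

    fn(arg);                  /* executes on the runner's thread */

    pthread_mutex_lock(&mb->mu);
    mb->fn = NULL;
    mb->done = 1;
    pthread_cond_broadcast(&mb->cv);
    pthread_mutex_unlock(&mb->mu);
}

static void store_42(void *p) { *(int *)p = 42; }
static void *runner_main(void *p) { mailbox_serve_one(p); return NULL; }

/* Demo: the calling thread plays the foreign thread. */
int mailbox_demo(void)
{
    struct mailbox mb;
    pthread_t runner;
    int value = 0;
    mailbox_init(&mb);
    pthread_create(&runner, NULL, runner_main, &mb);
    mailbox_call(&mb, store_42, &value);
    pthread_join(runner, NULL);
    return value;
}
```

The extra round trip through a mutex and condition variable on every call is one plausible reason such a scheme is slow compared with a direct function call.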
Here is the documentation for the callback package:
PACKAGE
package callback
import "rog-go.googlecode.com/hg/exp/callback"
VARIABLES
var Func = callbackFunc
Func holds a pointer to the C callback function.
When called, it calls the provided function f in
a Go context with the given argument.
It can be used by first converting it to a function pointer
and then calling from C.
Here is an example that sets up the callback function:
//static void (*callback)(void (*f)(void*), void *arg);
//void setCallback(void *c){
//	callback = c;
//}
import "C"
import "rog-go.googlecode.com/hg/exp/callback"
func init() {
	C.setCallback(callback.Func)
}
I'll assume you mean from C code compiled with gcc?
IIRC, this either can't be done or can't easily be done using 6g+cgo and friends. Go uses a different calling convention (as well as the segmented stacks and such).
However, you can write C code for [685]c (or even [685]a) and call into go easily using package·function() (you can even call methods IIRC). See the Source of the runtime package for examples.
Update:
Coming back to this question after the update, and giving it some more thought: this can't be done in a standard fashion using 6c or cgo. Especially because the threads are not started by the Go runtime, the current implementation would fail. The scheduler would suddenly have a thread under its control that it does not know about; additionally, that thread would be missing some thread-local variables the Go runtime uses for managing stacks and some other things. Also, if the Go function returns a value (or several), the C code can't access it on the currently supported platforms, as Go returns values on the stack (you could access them with assembly though). With these things in mind, I do believe you could still do this using channels. It would require your C code to be a little too intimate with the inner workings of the Go runtime, but it would work for a given implementation. While using channels may not be the solution you're looking for, it could possibly fit more nicely with the concepts of Go than callbacks. If your C code reimplemented at least the sending methods in the channel implementation (that code is written for 6c, so it would most likely have to be adapted for gcc, and it calls the Go runtime, which we've determined can't be done from a non-Go thread), you should be able to lock the channel and push a value to it. The Go scheduler can continue to manage its own threads, but now it can receive data from other threads started in C.
Admittedly, it's a hack; I haven't looked closely enough, but it would probably take a few other hacks to get it working (I believe the channels themselves maintain a list of the goroutines that are waiting on them [EDIT: confirmed: runtime·ready(gp);], so you'd need something in your Go code to wake up the receiving channel or to guarantee the Go code won't receive on the channel until you've already pushed a value). However, I can't see any reason this can't work, whereas there are definite reasons that running code generated by 6g on a thread created in C can't.
My original answer still holds though: barring an addition to the language or runtime, this can't yet be done the way you'd like (I'd love to be proven wrong here).
You can find a real-world application of rog's callback package in these bindings for the PortAudio audio I/O library: http://code.google.com/p/portaudio-go/. It might make things easier to understand.
(Thanks for implementing that, rog. It's just what I needed!)

C - program structure (avoiding global variables, includes, etc.)

I'm using C (not C++) and I'm unsure how to avoid using global variables.
I have a pretty decent grasp on C, its syntax, and how to write a basic application, but I'm not sure of the proper way to structure the program.
How do really big applications avoid the use of global variables? I'm pretty sure there will always need to be at least some, but for big games and other applications written in C, what is the best way to do it?
Is there any good, open-source software written strictly in C that I could look at? I can't think of any off the top of my head, most of them seem to be in C++.
Thanks.
Edit
Here's an example of where I would use a global variable in a simple API hooking application, which is just a DLL inside another process.
This application, specifically, hooks API functions used in another application. It does this by using WriteProcessMemory to overwrite the call to the original, and make it a call to my DLL instead.
However, when unhooking the API function, I have to write back the original memory/machine code.
So, I need to maintain a simple byte array for that machine code, one for each API function that is hooked, and there are a lot.
// Global variable to store original assembly code (6 bytes)
BYTE g_MessageBoxA[6];
// Hook the API function
HookAPIFunction ( "user32.dll", "MessageBoxA", MyNewFunction, g_MessageBoxA );
// Later on, unhook the function
UnHookAPIFunction ( "user32.dll", "MessageBoxA", g_MessageBoxA );
Sorry if that's confusing.
"How do really big applications avoid the use of global variables?"
Use static variables. If a function needs to remember something between calls, use a static local variable instead of a global. Example:
int running_total (int num) {
    static int sum = 0;
    sum += num;
    return sum;
}
Pass data via parameters, so that the value is defined in one place, maybe main(), and passed to where it is needed.
If all else fails, go ahead and use a global but try and mitigate potential problems.
At a minimum, use naming conventions to minimize potential conflicts, e.g. Gbl_MyApp_DeveloperName. All global variables would then start with the Gbl_MyApp_ part, where "MyApp" is something descriptive of your app.
Try to group functions by purpose, so that everything that needs a given global is in the same file (within reason). Then globals can be defined and restricted to that file (beware the extern keyword).
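As a small sketch of restricting a global to one file: marking the variable static at file scope hides it from every other translation unit, and the rest of the program goes through a few functions. The names below (hook_count and the hooks_* functions) are made up for illustration.

```c
#include <assert.h>

/* File scope + static: visible only inside this source file, so no other
   file can read or modify it, even with an extern declaration. */
static int hook_count;

void hooks_register(void)
{
    hook_count++;
}

void hooks_unregister(void)
{
    assert(hook_count > 0);
    hook_count--;
}

int hooks_active(void)
{
    return hook_count;
}
```

Only the three functions would be declared in the header; the variable itself never appears there.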
There are some valid uses for global variables. The schools started teaching that they were evil to keep programmers from being lazy and overusing them. If you're sure that the data is really globally needed, then use them. Given that you are concerned about doing a good job, I don't think you'll go too far wrong using your own judgment.
It's easy: simply don't use them. Create what would have been the global variables in your main() function, and then pass them from there as parameters to the functions that need them.
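Applied to the hooking example from the question, passing state as a parameter might look roughly like this. It is only a sketch: HookState, hook_install and hook_remove are invented names, and the functions patch an ordinary byte buffer rather than another process's memory.

```c
#include <assert.h>
#include <string.h>

#define PATCH_LEN 6

/* State that would otherwise be a global, one instance per hooked function. */
typedef struct {
    unsigned char saved[PATCH_LEN];  /* original bytes of the patched site */
    int hooked;
} HookState;

/* Save the original bytes from target, then overwrite them with patch. */
void hook_install(HookState *st, unsigned char *target,
                  const unsigned char *patch)
{
    assert(!st->hooked);
    memcpy(st->saved, target, PATCH_LEN);
    memcpy(target, patch, PATCH_LEN);
    st->hooked = 1;
}

/* Restore the bytes saved by hook_install. */
void hook_remove(HookState *st, unsigned char *target)
{
    assert(st->hooked);
    memcpy(target, st->saved, PATCH_LEN);
    st->hooked = 0;
}
```

The caller (main() or whatever owns the hooks) would keep an array of HookState, one per API function, and hand the right element to each call.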
The most recommended open-source software for learning C at my college was Linux, so you can get examples from its source.
I think the same: globals are bad. They reduce readability and increase the likelihood of errors and other problems.
I'll explain my example. I have a main() that does very little. Most of the action comes from timers and external interrupts. These are functions with no parameters, due to the platform I'm using, a 32-bit microcontroller. I can't pass parameters and, in addition, these interrupts and timers are asynchronous. The easiest way is a global: imagine interrupts filling a buffer, this buffer being parsed inside a timer, and the parsed information being used, stored, or sent in other timers.
OK, I could raise an interrupt flag and do the processing in the main function, but for critical software that is not a good idea.
This is the only kind of global variable that doesn't make me feel bad ;-)
Global variables are almost inevitable. However, they are often overused. They are not a substitute for passing proper parameters and designing the right data structures.
Every time you want a global variable, think: what if I need one more like this?

Resources