I am trying to write a program where I have to call some functions through a (shared) library (its source is available). The C code for the library has several global variables, and many functions change the values of these global variables. What I have to do in my program requires that each function call that I make gets to work with a fresh set of variables.
For example, let this function be a part of the library:
int x = 1;
int foo()
{
int a = 0;
//do somethings to 'a'
//...
x++;
return a;
}
Now every time I invoke foo() from my program, the value of x gets updated from 1 to 2, then 3, then 4, and so on. I am trying to construct a program so that every time foo() is invoked, it sees x = 1.
I am sorry to say that my knowledge of how C/Linux treats these variable spaces is insufficient, so this question may seem vague. The above is just a small example; in reality, there are so many variables that it is practically impossible to reset their values manually.
What may be the best way to compile that library and/or use it in my program so as to refresh the variables?
(On a side note, what I am also trying to do is to parallelize calls to foo(), but because of the shared variables, I cannot do that.)
EDIT:
When working on some web dev projects, I used to encapsulate some code in webservices and then invoke those services from the main program. Does a similar framework exist in C/Linux? Please note that functions are returning data.
You have discovered one of the main reasons that global variables (or global state in general) are a really bad idea.
Since you have access to the source, I would suggest investing some time to refactor the source code.
You can achieve the ability to parallelize calls to foo with the following strategy:
Gather up all of the global variables into a single struct. Call it something like Context.
Change each function that acts on a global variable to take a pointer to a Context, and change the function to update the variables in the Context instead of updating global variables.
Now each thread that wants to use the library can create a new Context and pass that into foo and related functions.
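The steps above can be sketched roughly as follows. This is only a minimal illustration built on the foo() example from the question; the names Context and context_init are assumptions made up for the sketch, not anything the library actually provides.

```c
/* Hypothetical Context: holds what used to be the library's globals. */
typedef struct Context {
    int x;              /* formerly the global `int x = 1;` */
} Context;

/* Hypothetical helper: give a new Context the values the globals started with. */
void context_init(Context *ctx)
{
    ctx->x = 1;
}

/* foo() now updates the caller's Context instead of a global. */
int foo(Context *ctx)
{
    int a = 0;
    /* do something to 'a' (here: just record the x this call saw) */
    a = ctx->x;
    ctx->x++;
    return a;
}
```

Each thread (or each call that needs fresh state) would declare its own Context, run context_init on it and pass its address to foo, so no two callers share x.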
If it's not feasible to make such a change to the source code, you can still use more than one CPU core by starting child processes. Each child process has its own memory space, so each one works on its own copy of the globals. That option is not nearly as efficient as using multiple threads.
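As a rough sketch of the child-process option on Linux/POSIX: the wrapper below forks, lets the child call the unmodified foo(), and sends the return value back through a pipe. The wrapper name call_foo_isolated is an assumption for illustration, and x and foo() here are stand-ins for the real library code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int x = 1;                       /* stand-in for the library's global */

int foo(void)                    /* stand-in for the library function */
{
    int a = x;
    x++;
    return a;
}

/* Hypothetical wrapper: run foo() in a child process so the parent's
   globals are never modified. Returns 0 on success, -1 on failure. */
int call_foo_isolated(int *result)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {              /* child: works on its own copy of x */
        close(fds[0]);
        int r = foo();
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    close(fds[1]);               /* parent: read the result, reap the child */
    ssize_t n = read(fds[0], result, sizeof *result);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof *result)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Every call to call_foo_isolated sees x = 1, because the increment happens only in the child's copy of memory. Larger return data would need a proper serialization format instead of a raw int.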
I don't have a detailed answer, but you can try one of the following:
unload and reload the library;
try to clear the library's .bss and refill its .data section with the initial values from the library file (see the dl_iterate_phdr() call).
Related
I want to speed up a sequential C program using multithreading. My problem is that the program has a lot of global variables, which are read and written by its functions. This prevents me from running those functions in parallel, because the result would no longer match that of the sequential program.
I am using OpenMP for my C program. However, I want to refactor the program for the above purpose before using OpenMP.
Here is my example:
#include <stdio.h>
int a = 5; // global variable
void funcA(void) {
    int b;
    b = a + 5; // read a
}
void funcB(void) {
    printf("%d\n", a); // read a
}
I don't want to find a way to fully parallelize funcA and funcB; I want to reduce the dependencies caused by global variables (like the variable a in the example above).
There is no simple way to do a complicated thing. It can sometimes seem difficult to design code without global variables even when starting from scratch. In your case, the problem is significantly more difficult.
There is not (and cannot be) a generic solution about how to minimize the number of global variables.
The only thing which can be done is:
analyze the code base;
understand the purpose of the global variables and how they are used;
find a way to achieve the same behavior without using global variables.
Of course, it might be easier for some global variables to be dealt with than others. You may want to start with the former. Seeing success coming your way will help your morale during the task.
It might help you if you read about how to make code:
thread-safe;
re-entrant.
Google can help you greatly on this.
In general it is not an easy task to remove global variables.
You need to go on a case by case basis.
What you really need to do is to try to pass the variables required as function parameters rather than having them as globals.
In the example given, I cannot offer a concrete solution without seeing how funcA and funcB are called. You should try to pass the variable a as a parameter to both functions. You may need to go back up a few functions until you get to a common function which ultimately calls both of them.
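For the small example in the question, passing a as a parameter could look roughly like this. The common caller run() is a hypothetical function invented for the sketch, since the real call chain is not shown.

```c
#include <stdio.h>

/* funcA reads 'a' from its argument instead of from a global. */
int funcA(int a)
{
    int b = a + 5;
    return b;
}

/* funcB likewise receives 'a' explicitly. */
void funcB(int a)
{
    printf("%d\n", a);
}

/* Hypothetical common caller: it owns 'a' and hands it down. */
int run(void)
{
    int a = 5;          /* formerly the global */
    int b = funcA(a);
    funcB(a);
    return b;
}
```

Because neither function touches shared state any more, two calls with different values of a cannot interfere with each other.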
I have (mapped in memory) two object files, "A.o" and "B.o", with the same CPU Instruction Set (not necessarily Intel --it can be x86, x86_64, MIPS(32/64), ARM(32/64), PowerPC(32/64),..., but always the same in both object files).
Also, both object files are compiled with the same endianness (both little endian, or both big endian).
However (you knew there was a however, otherwise there wouldn't be any question), "A.o" and "B.o" can have a different function calling convention and, to make things worse, unknown to each other ("A.o" has not even the slightest idea about the calling convention for functions in "B.o", and vice versa).
"A.o" and "B.o" are obviously designed to call functions within their same object file, but there must be a (very) limited interface for communicating between them (otherwise, if execution starts at some function in "A.o", no function from "B.o" would ever be executed if there was no such interface).
The file where execution started (let's suppose it's "A.o") knows the addresses of all static symbols from "B.o" (the addresses of all functions and all global variables). But the opposite is not true (well, the limited interface I'm trying to write would overcome that, but "B.o" doesn't know any address from "A.o" before such interface is established).
Finally the question: How can execution jump from a function in "A.o" to a function in "B.o", and back, while also communicating some data?
I need it to:
Be done in standard C (no assembly).
Be portable C (not compiler-dependent, nor CPU-dependent).
Be thread safe.
Don't make any assumption about the calling conventions involved.
Be able to communicate data between the two object files.
My best idea, for the moment, seems to meet all these requirements except thread safety. For example, if I define a struct like this:
struct data_interface {
    int value_in;
    int value_out;
};
I could write a pointer to a struct like this from "A.o" into a global variable of "B.o" (knowing in advance that this global variable in "B.o" has enough space to store a pointer).
Then, the interface function would be a void interface(void) (I'm assuming that calling void(void) functions is safe across different calling conventions... if this is not true, then my idea wouldn't work). Calling such a function from "A.o" to "B.o" would communicate the data to the code in "B.o". And, fingers crossed, when the called function in "B.o" returns, it would travel back nicely (supposing the different calling convention doesn't change the behaviour when returning from void(void) functions).
However, this is not thread safe, of course.
For it to be thread safe, I guess my only option is to access the stack.
But... can the stack be accessed in a portable way in standard C?
Here are two suggestions.
Data interface
This elaborates on the struct you defined yourself. From what I've seen in the past, compilers typically use a single register (e.g. eax) for their return value (provided the return type fits in a register). My guess is, the following function prototype is likely to be unaffected by differing calling conventions.
struct data_interface *get_empty_data_interface(void);
If so, then you could use that in a way that is similar to the idea you already had about using arrays. Define the following struct and functions in B:
struct data_interface {
int ready;
int the_real_data;
};
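struct data_interface *get_empty_data_interface(void)
{
    struct data_interface *ptr = malloc(sizeof *ptr); /* needs <stdlib.h> */
    if (ptr == NULL)
        return NULL; /* allocation failed */
    ptr->ready = 0; /* mark as not ready before the list can see it */
    add_to_list_of_data_block_pointers(ptr);
    return ptr;
}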
void the_function(void)
{
execute_functionality_for_every_data_block_in_my_list_that_is_flagged_ready_and_remove_from_list();
}
To call the function, do this in A:
struct data_interface *ptr = get_empty_data_interface();
ptr->the_real_data = 12345;
ptr->ready = 1;
the_function();
For thread-safety, make sure the list of data blocks maintained by B is thread-safe.
Simultaneous calls to get_empty_data_interface should not overwrite each other's slot in the list.
Simultaneous calls to the_function should not both pick up the same list element.
Wrapper functions
You could try to expose wrapper functions with a well-known calling convention (e.g. cdecl); if necessary defined in a separate object file that is aware of the calling convention of the functions it wraps.
Unfortunately you will probably need non-portable function attributes for this.
You may be able to cheat your way out of it by declaring variadic wrapper functions (with an ellipsis parameter, like printf has); compilers are likely to fall back on cdecl for those. This eliminates non-portable function attributes, but it may be unreliable; you would have to verify my assumption for every compiler you'd like to support. When testing this, keep in mind that compiler options (in particular optimizations) may well play a role. All in all, quite a dirty approach.
The question implies that both object files are compiled differently except for the endianness, and that they are linked together into one executable.
It says that A.o knows all static symbols from B.o, but the opposite is not true.
Don't make any assumption about the calling conventions involved.
So we'll be using only void f(void) functions.
You'll declare int X, Y; in B.o and extern int X, Y; in A.o. Before you call a function in B.o, you check the Y flag; if it is raised, you wait until it falls. When a function in B.o is called, it raises the Y flag, reads the input from X, does some calculations, writes the result back to X and returns.
Then the calling function in A.o copies the value from X into its own compilation unit and clears the Y flag.
(All of this assumes that calling a void f(void) function just makes a plain jump from one point in the code to another.)
Another way to do it would be to declare static int Y = 0; in B.o and omit it entirely in A.o.
Then, when a B.o function gets called, it checks whether Y == 0 and, if so, increments Y, reads X, does the calculations, writes X, decrements Y and returns. If not, it waits for Y to become 0, blocking the calling function.
Or maybe even have a static flag in every B.o function, but I don't see the point in this waste, since the communication data is global in B.o.
Remember that there are both caller saves and callee saves conventions out there, together with variations on use of registers to pass values, use or not of a frame pointer, and even (in some architectures, in some optimisation levels) the use of the delay slot in a branch to hold the first instruction of the subroutine. You are not going to be able to do this without some knowledge of the calling conventions in play, but fortunately the linker will need that anyway. Presumably there is some higher level entity that is responsible for loading those DLLs and that knows the calling conventions for both of them?
Anything you do here is going to be at best deep into implementation-defined territory, if not technically undefined behaviour, and you will want to make a deep study of the linker and loader. (In particular, the linker must know how to resolve dynamic linkage in your unknown calling convention or you will not be able to load that shared object in a meaningful way, so you may be able to leverage it using libbfd or such, but that is outside the scope of C.)
The place this sort of thing can go very wrong is if shared resources are allocated in A and freed in B (memory springs to mind), as memory management is usually a library-based wrapper over the operating system's sbrk or similar, and these implementations of memory management are not inherently compatible in memory layout. Other places you may be bitten by this include I/O (see the shenanigans you sometimes get when mixing printf and cout in C++ for a benign example) and locking.
Is it possible to call a C program from a Stateflow chart, then copy this chart within the same model, and execute both without any conflict?
For example a C program like this:
int var; // var is global
int myfunction(int n)
{
var = var + n;
return var;
}
I mean, can they be treated as two different entities that won't interfere with each other through the global variable?
By the way, I'd also like to avoid renaming the function in the source code; I've got a big program :)
This is more of a C-related issue.
If you are using the same C function that operates on a global, then yes, all calls to this function will operate on the same variable.
What you can do instead is make this variable local to each of the calling Stateflow states and then pass it to the C function. This way you should not have conflicts and be able to reuse your code.
It's also a good design choice since you otherwise are potentially hiding a state variable in the function i.e. outside of your state machine.
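On the C side, that suggestion might look something like the sketch below. Note that it keeps the function name but changes its signature (the extra pointer parameter is an assumption of this sketch), so every call site has to pass its own state.

```c
/* The accumulator is supplied by the caller instead of living in a global. */
int myfunction(int *var, int n)
{
    *var = *var + n;
    return *var;
}
```

Each chart copy would then keep its own local data item and pass its address, so the two copies accumulate independently.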
In C you can have external static variables that are visible everywhere in the file, while internal static variables are only visible inside the function but are persistent.
For example:
#include <stdio.h>
void foo_bar( void )
{
static int counter = 0;
printf("counter is %d\n", counter);
counter++;
}
int main( void )
{
foo_bar();
foo_bar();
foo_bar();
return 0;
}
the output will be
counter is 0
counter is 1
counter is 2
My question is why would you use an internal static variable? If you don't want your static variable visible in the rest of the file shouldn't the function really be in its own file then?
This confusion usually comes about because the static keyword serves two purposes.
When used at file level, it controls the visibility of its object outside the compilation unit, not the duration of the object (visibility and duration are layman's terms I use during educational sessions, the ISO standard uses different terms which you may want to learn eventually, but I've found they confuse most beginning students).
Objects created at file level already have their duration decided by virtue of the fact that they're at file level. The static keyword then just makes them invisible to the linker.
When used inside functions, it controls duration, not visibility. Visibility is already decided since it's inside the function - it can't be seen outside the function. The static keyword in this case, causes the object to be created at the same time as file level objects.
Note that, technically, a function level static may not necessarily come into existence until the function is first called (and that may make sense for C++ with its constructors) but every C implementation I've ever used creates its function level statics at the same time as file level objects.
Also, whilst I'm using the word "object", I don't mean it in the sense of C++ objects (since this is a C question). It's just because static can apply to variables or functions at file level and I need an all-encompassing word to describe that.
Function level statics are still used quite a bit - they can cause trouble in multi-threaded programs if that's not catered for but, provided you know what you're doing (or you're not threading), they're the best way to preserve state across multiple function calls while still providing for encapsulation.
Even with threading, there are tricks you can do in the function (such as allocation of thread specific data within the function) to make it workable without exposing the function internals unnecessarily.
The only other choices I can think of are global variables and passing a "state variable" to the function each time.
In both these cases, you expose the inner workings of the function to its clients and make the function dependent on the good behavior of the client (always a risky assumption).
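One way to get the thread-specific data mentioned above, assuming a C11 compiler, is the _Thread_local storage-class specifier. This is just a minimal sketch; the function name next_call_number is made up for illustration.

```c
/* Each thread gets its own copy of 'counter', so concurrent callers
   do not share (or race on) the function's private state. */
int next_call_number(void)
{
    static _Thread_local int counter = 0;
    return ++counter;
}
```

The state is still hidden inside the function, as with an ordinary function-level static, but it is now per thread rather than per program, which may or may not be the behaviour you want.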
They are used to implement tools like strtok, and they cause problems with reentrancy...
Think carefully before fooling around with this tool, but there are times when they are appropriate.
For example, in C++, it is used as one way to get singleton instances:
SingletonObject& getInstance()
{
static SingletonObject o;
return o;
}
which is used to solve the initialization order problem (although it was not guaranteed to be thread-safe before C++11).
Ad "shouldn't the function be in its own file"
Certainly not, that's nonsense. Much of the point of programming languages is to facilitate isolation and therefore reuse of code (local variables, procedures, structures etc. all do that) and this is just another way to do that.
BTW, as others pointed out, almost every argument against global variables applies to static variables too, because they are in fact globals. But there are many cases when it's ok to use globals, and people do.
I find it handy for one-time, delayed, initialization:
int GetMagic()
{
static int magicV= -1;
if(-1 == magicV)
{
//do expensive, one-time initialization
magicV = {something here}
}
return magicV;
}
As others have said, this isn't thread-safe during its very first invocation, but sometimes you can get away with it :)
I think that people generally stay away from internal static variables. I know strtok() uses one, or something like it, and because of that is probably the most hated function in the C library.
Other languages like C# don't even support it. I think the idea used to be that it was there to provide some semblance of encapsulation (if you can call it that) before the time of OO languages.
Probably not terribly useful in C, but they are used in C++ to guarantee the initialisation of namespace-scoped statics. In both C and C++ there are problems with their use in multi-threaded applications.
I wouldn't want the existence of a static variable to force me to put the function into its own file. What if I have a number of similar functions, each with their own static counter, that I wanted to put into one file? There are enough decisions we have to make about where to put things, without needing one more constraint.
Some use cases for static variables:
you can use it for counters and you won't pollute the global namespace.
you can protect variables using a function that gets the value as a pointer and returns the internal static. This way you can control how the value is assigned (pass NULL when you just want to get the value).
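The second use case might look roughly like this. The accessor name limit_value and the 0..100 range check are assumptions invented for the sketch; the point is only that all writes go through one place.

```c
#include <stddef.h>

/* Hypothetical accessor: pass a pointer to set the value, NULL to just read it.
   The static is invisible outside, so every assignment is validated here. */
int limit_value(const int *new_value)
{
    static int value = 0;

    /* Accept only values in the (assumed) valid range 0..100. */
    if (new_value != NULL && *new_value >= 0 && *new_value <= 100)
        value = *new_value;

    return value;
}
```

Calling limit_value(NULL) returns the current value without changing it, and an out-of-range argument is silently ignored in this sketch.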
I've never heard this specific construct termed "internal static variable." A fitting label, I suppose.
Like any construct, it has to be used knowledgeably and responsibly. You must know the ramifications of using the construct.
It keeps the variable declared at the most local scope without having to create a separate file for the function. It also prevents global variable declaration.
For example -
char *GetTempFileName()
{
static int i;
char *fileName = new char[1024];
memset(fileName, 0x00, sizeof(char) * 1024);
sprintf(fileName, "Temp%.05d.tmp", ++i);
return fileName;
}
VB.NET supports the same construct.
Public Function GetTempFileName() As String
Static i As Integer = 0
i += 1
Return String.Format("Temp{0}", i.ToString("00000"))
End Function
One ramification of this is that these functions are neither reentrant nor thread-safe.
Not anymore. I've seen or heard the results of function local static variables in multithreaded land, and it isn't pretty.
In writing code for a microcontroller I would use a local static variable to hold the value of a sub-state for a particular function. For instance if I had an I2C handler that was called every time main() ran then it would have its own internal state held in a static local variable. Then every time it was called it would check what state it was in and process I/O accordingly (push bits onto output pins, pull up a line, etc).
All statics are persistent and unprotected from simultaneous access, much like globals, and for that reason must be used with caution and prudence. However, there are certainly times when they come in handy, and they don't necessarily merit being in their own file.
I've used one in a fatal error logging function that gets patched to my target's error interrupt vectors, eg. div-by-zero. When this function gets called, interrupts are disabled, so threading is a non-issue. But re-entrancy could still happen if I caused a new error while in the process of logging the first error, like if the error string formatter broke. In that case, I'd have to take more drastic action.
void errorLog(...)
{
static int reentrant = 0;
if(reentrant)
{
// We somehow caused an error while logging a previous error.
// Bail out immediately!
hardwareReset();
}
// Leave ourselves a breadcrumb so we know we're already logging.
reentrant = 1;
// Format the error and put it in the log.
....
// Error successfully logged, time to reset.
hardwareReset();
}
This approach is checking against a very unlikely event, and it's only safe because interrupts are disabled. However, on an embedded target, the rule is "never hang." This approach guarantees (within reason) that the hardware eventually gets reset, one way or the other.
A simple use for this is that a function can know how many times it has been called.
I'm refactoring a "spaghetti code" C module to work in a multitasking (RTOS) environment.
Now, there are very long functions and many unnecessary global variables.
When I try to replace global variables that exist only in one function with locals, I run into a dilemma. Every global variable behaves like a local "static", i.e. it keeps its value even when you exit and re-enter the function.
For multitasking, "static" local variables are worse than globals. They make the functions non-reentrant.
Is there a way to determine whether a function relies on a variable keeping its value between calls, without tracing all the logical flow?
Short answer: no, there isn't any way to tell automatically whether the function will behave differently according to whether the declaration of a local variable is static or not. You just have to examine the logic of each function that uses globals in the original code.
However, if replacing a global variable with a static local-scope variable means the function is not re-entrant, then it wasn't re-entrant when it was a global, either. So I don't think that changing a global to a static local-scope variable will make your functions any less re-entrant than they were to start with.
Provided that the global really was used only in that scope (which the compiler/linker should confirm when you remove the global), the behaviour should be close to the same. There may or may not be issues over when things are initialized, I can't remember what the standard says: if static initialization occurs in C the same time it does in C++, when execution first reaches the declaration, then you might have changed a concurrency-safe function into a non-concurrency-safe one.
Working out whether a function is safe for re-entrancy also requires looking at the logic. Unless the standard says otherwise (I haven't checked), a function isn't automatically non-re-entrant just because it declares a static variable. But if it uses either a global or a static in any significant way, you can assume that it's non-re-entrant. If there isn't synchronization then assume it's also non-concurrency-safe.
Finally, good luck. Sounds like this code is a long way from where you want it to be...
If your compiler warns when a variable is used before it is initialized, make a suspect variable local without assigning it a value in its declaration.
Any variable that gives a warning cannot be made local without changing other code.
Changing global variables to static local variables will help a little, since the scope for modification has been reduced. However the concurrency issue still remains a problem and you have to work around it with locks around access to those static variables.
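As a sketch of the lock-around-a-static idea: the example below uses a POSIX mutex purely for illustration. That is an assumption, since an RTOS will normally have its own mutex or critical-section API, and the function name next_id is made up.

```c
#include <pthread.h>

/* Hands out increasing IDs; the lock serializes access to the static. */
int next_id(void)
{
    static int counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    pthread_mutex_lock(&lock);
    int id = ++counter;          /* read-modify-write happens under the lock */
    pthread_mutex_unlock(&lock);

    return id;
}
```

The value is copied into a local while the lock is held, so the caller never reads counter after another task may have changed it. This makes the function safe to call from several tasks, but not from an interrupt handler that could preempt a task holding the lock.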
But what you want to be doing is pushing the definition of the variable up to the highest scope in which it is used, making it a local there, and then passing it as an argument to anything that needs it. This can obviously require a lot of work (since it has a cascading effect). You can group similarly needed variables into "context" objects and then pass those around.
See the design pattern Encapsulate Context
If your global vars are truly used only in one function, you're losing nothing by making them into static locals since the fact that they were global anyway made the function that used them non-re-entrant. You gain a little by limiting the scope of the variable.
You should make that change to all globals that are used in only one function, then examine each static local variable to see if it can be made non-static (automatic).
The rule is: if the variable is used in the function before being set, then leave it static.
An example of a variable that can be made an automatic local follows. You would put "int nplus4;" inside the function; you don't need to set it to zero since it's set before use, and the compiler should issue a warning if you actually use it before setting it, which is a useful check:
int nplus4 = 0; // used only in add5
int add5 (int n) {
nplus4 = n + 4; // set
return nplus4 + 1; // use
}
The nplus4 var is set before being used. The following is an example that should be left static by putting "static int nextn = 0;" inside the function:
int nextn = 0; // used only in getn
int getn (void) {
int n = nextn++; // use, then set
return n;
}
Note that it can get tricky: "nextn++" is not just setting, it's using and setting, since it's equivalent to "nextn = nextn + 1".
One other thing to watch out for: in an RTOS environment, stack space may be more limited than global memory so be careful moving big globals such as "char buffer[10000]" into the functions.
Please give examples of what you call 'global' and 'local' variables
int global_c; // can be used by any other file with 'extern int global_c;'
static int static_c; // cannot be seen or used outside of this file.
int foo(...)
{
int local_c; // cannot be seen or used outside of this function.
}
If you provide some code samples of what you have and what you changed we could better answer the question.
If I understand your question correctly, your concern is that global variables retain their value from one function call to the next. Obviously, when you move to using a normal local variable, that won't be the case. If you want to know whether or not it is safe to change them, I don't think you have any option other than reading and understanding the code. Simply doing a full text search for the name of the variable in question might be instructive.
If you want a quick and dirty solution that isn't completely safe, you can just change it and see what breaks. I recommend making sure you have a version you can roll back to in source control and setting up some unit tests in advance.