I would like to know the difference between static variables and global variables in terms of access speed and space consumption. (If you want to know my platform: the gcc compiler on Windows. I am using Cygwin with the Triton IDE for ARM7 embedded programming; Triton ships a gcc-based toolchain that runs on the Java platform, so it works on Windows.)
(Obviously I know in terms of file and function scope from this question)
Edit: OK, an answer for any microcontroller/processor environment is fine.
There is no difference for the space, they take the same amount.
But there is a speed difference: static is faster.
Of course the memory access to the variable is the same for global and static. But the compiler can optimize more when the variable is static. When it compiles a module it knows that no call to a function outside the module can change a static variable, so it knows exactly what happens and can, for example, keep the value in a register across function calls. When the variable is global and you call a function from a different module, the compiler can't know what that function does. It must assume that the function accesses and changes the variable, which forces a store and a reload.
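A minimal sketch of that situation (file and identifier names are made up; whether the value really stays in a register depends on what else the module does):

/* counter.c */
static int s_hits;            /* static: only code in this file can touch it       */
int g_hits;                   /* global: any other translation unit may touch it   */

void external_work(void);     /* defined in some other .c file */

void count_twice(void)
{
    s_hits++;                 /* may be kept cached in a register across the call  */
    g_hits++;                 /* must be stored to memory before the call...       */
    external_work();          /* ...because external_work() could read or write it */
    s_hits++;
    g_hits++;                 /* ...and reloaded from memory afterwards            */
}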
With gcc you can pass all .c sources at the same time, so it can also see what happens in calls to functions from different modules. To make this work you have to pass all the .c files at once together with -combine and -fwhole-program. The -fwhole-program option makes all globals static (not module static but compilation-unit static, i.e. static across all the given .c files together), and -combine performs the intermodule analysis.
Space consumption: basically no difference. The only time there'd be a space issue is if you manage to get the same chunk of static data hidden in N object files, then you get a multiplication factor of N where you might have just 1 copy if it was a single global piece of data. However, that's a mis-design issue. Information hiding is good - unless the information should not be hidden.
Access speed: no difference.
It's hard to guess or estimate. It would take some time, but I would build a sample project and measure: test both access speed and space with a loop, and run the sample project in an emulator for that architecture.
I would expect any difference would come from packing (for space) and caching (for speed) issues. Both those could also arise from just about anything else as well.
There is no difference in the environment you describe when it comes to space. A static or global variable consumes exactly the same amount of memory.
For speed (though it is not good practice) you could prefer global variables if you need access to the variable outside the one file
(i.e. accessed elsewhere via extern char my_global_char_placed_else_where;).
Better practice is to use get/set functions instead, but they are slower. You could then use macros for getting/setting a variable that is in fact global, hiding that fact from the reader of the code; that is a bit like cheating, but it can make the code more readable.
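A rough sketch of the macro idea (all identifiers invented for illustration):

/* module.h - hide the fact that the value is a plain global */
extern int g_sensor_value;                 /* defined in exactly one .c file */

#define GetSensorValue()    (g_sensor_value)
#define SetSensorValue(v)   ((void)(g_sensor_value = (v)))

Callers only ever write GetSensorValue()/SetSensorValue(x), so the macros can later be swapped for real get/set functions without touching the calling code.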
If you compare hiding a variable inside a function with placing it outside the function, there is no difference in access; the only change is that more functions can reach the variable when it is outside.
I myself use MSP430, ARM7(just for tests) and AVR32 micros for development
What Jonathan says is not exactly correct. Both static and global variables have to be stored in the ZI (or RW data) regions. The compiler can't strictly "keep" a variable in a register; what it might do is load the value into a register, use that register for all operations, and then save the value back - that's a compiler-specific optimization. And even then, there is no reason why the compiler won't also do that for global variables, unless of course you make it volatile. But then, technically you can also make a static variable volatile, so again no difference.
Edit : oh yeah - space : no difference.
I am writing DSP code in C (Windows environment). The code is to be modified by another engineer to run on a Cortex-M4. This engineer claims that, to reduce running time, many of the functions I have implemented should be merged into one function. I would prefer to avoid that, to keep clarity and testability.
Does his claim make sense? If it does, where can I read about it? Otherwise, can I show that he is wrong without a comparison of running times?
Does his claim make sense?
Depends on context. Modern compilers are perfectly able to inline function calls, but that usually means that those functions must be placed in the same translation unit (essentially the same .c file).
If your functions are in the same .c file then their claim is wrong, if you have the functions scattered across multiple files, then their claim is likely correct.
If it is, where I can read about it.
Function inlining has been around for some 30 years. C even added an inline keyword for it in 1999 (C++ had one earlier still), though during the 2000s compilers became better than programmers at determining when and what to inline. Nowadays, with modern compilers, inline is mostly considered obsolete.
Otherwise, can I show that he is wrong without a comparison of running time?
By disassembling the optimized code and seeing whether there are any function calls left. Still, function calls are relatively cheap on Cortex-M (unless there is a ton of different parameters), so manually optimizing them away would be a very tiny optimization.
As always there's a choice between code size and execution speed.
If you wish to remove the overhead of calling a separate function but want to keep your code modular, then consider using the inline function attribute suitable for your compiler, e.g.
#include <stdint.h>
#include <string.h>
#include "nrf_log.h"   /* NRF_LOG_DEBUG comes from the Nordic SDK logging module */

static inline void com_ClearMessageBuffer(uint8_t* pBuffer, uint32_t length)
{
    NRF_LOG_DEBUG("com_ClearMessageBuffer");   /* trace the call   */
    memset(pBuffer, 0, length);                /* zero the buffer  */
}
Then at compile time your inline function code will be inserted into the code flow wherever it is called.
This will speed execution, but when called multiple times increase the code size.
Recently, I have learned the existence of "backtrace" function.
This function allows one to retrieve, under some conditions, the call stack of a running ELF program compiled without debugging information.
It's perfect for me (I can't insert debugging symbols into the production program), but for backtrace to work there are (roughly) two conditions:
Tell the linker to add extra information (by passing the -rdynamic option).
Convert all "static" functions to "non-static" functions.
My worry is that if I fulfill these two conditions, my program will be slower (because the compiler can't optimize non-static functions the way it optimizes static functions?).
As far as I know, adding the extra information with -rdynamic doesn't impact the program's performance: it just adds a little weight to the ELF binary.
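For reference, the call sequence I'm talking about is roughly this (a simplified sketch, not my production code):

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void dump_callstack(void)
{
    void  *frames[32];
    int    n     = backtrace(frames, 32);           /* collect return addresses         */
    char **names = backtrace_symbols(frames, n);    /* symbolic names need -rdynamic    */

    for (int i = 0; i < n; i++)
        fprintf(stderr, "%s\n", names[i]);

    free(names);                                    /* backtrace_symbols() malloc()s it */
}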
So here's my question :
What is the effect in terms of running performance when all static functions become non-static functions?
Yes, your worries are correct: Declaring a function as static provides a good hint to the compiler, which it can turn into better optimization. The amount of speedup that you receive from static depends on your precise situation, though, so there's only truth in measurement (as always when it comes to performance).
The point about declaring a function static is that the compiler then knows for certain that it sees all of the function's call sites. And if it sees that the function is only called from one single place, it will generally inline it, no matter how long it is. The inlining may unlock further opportunities for optimization, and it avoids the function call overhead, both in terms of size and speed. In this respect, static is actually a stronger hint than inline.
Of course, the effect on the performance depends on the frequency of calls to the static function. So, as I said, you need measurement to assess how much performance you gain from the static keyword.
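A small sketch of the effect (invented names): with static, the compiler sees every call site of helper() and will typically fold it into its lone caller without emitting a standalone copy; without static it has to keep an externally callable version.

static int helper(int x)                /* only reachable from this file */
{
    return x * x + 1;
}

int process(int x)
{
    return helper(x) + helper(x + 1);   /* both calls can be inlined away */
}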
What is the effect in terms of running performance when all static functions become non-static functions?
The answer is "some", I think. The only way you can be sure what the performance issue is likely to be is to measure the performance of your program with and without static functions. The most obvious thing I can think of is that the optimiser might not be able to inline non static functions.
You actually don't need to make the static functions non static, the only issue will be the lack of symbolic names in the backtrace.
However, I think you have bigger problems. From your link to the man page:
Omission of the frame pointers (as implied by any of gcc(1)'s nonzero optimization levels) may cause these assumptions to be violated.
For backtrace to work reliably, it looks like you will have to compile without optimisation. That will certainly have a huge impact on performance. I think I'd manage without it.
What is the effect in terms of running performance when all static functions become non-static functions?
Ans - None
A static function is visible only within its file and cannot be called from outside that file, i.e. it has file scope.
Static functions are typically used in large programs when you do not want your functions to clash with other people's functions. A static function ensures that you can define a function and use it inside your own file, while somebody else can define a function with the same name in another file.
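For example (a sketch with made-up names), both of these files can be linked into the same program without a clash:

/* sensor.c */
static int clamp(int v) { return v < 0 ? 0 : v; }       /* visible only in sensor.c     */

/* display.c */
static int clamp(int v) { return v > 255 ? 255 : v; }    /* a different, unrelated clamp */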
If your code compiles after removing the static keyword, then you do not have two functions with the same name and it will work just like any normal function.
It will have no run time penalty.
I have performance-critical code written for multiple CPUs. I detect the CPU at run-time and, based on that, use the appropriate function for the detected CPU. So now I have to use function pointers and call the functions through them:
void do_something_neon(void);
void do_something_armv6(void);
void (*do_something)(void);
if (cpu == NEON) {
    do_something = do_something_neon;
} else {
    do_something = do_something_armv6;
}
// Use the function pointer:
do_something();
...
Not that it matters, but I'll mention that I have optimized functions for different CPUs: armv6, and armv7 with NEON support. The problem is that by using function pointers in many places the code becomes slower, and I'd like to avoid that.
Basically, at load time the linker resolves relocations and patches the code with function addresses. Is there a way to control that behavior better?
Personally, I'd propose two different ways to avoid function pointers: create two separate .so (or .dll) files for the CPU-dependent functions, place them in different folders, and based on the detected CPU add one of these folders to the search path (or LD_LIBRARY_PATH). Then load the main code, and the dynamic linker will pick up the required library from the search path. The other way is to compile two separate copies of the library :)
The drawback of the first method is that it forces me to have at least 3 shared objects (DLLs): two for the CPU-dependent functions and one for the main code that uses them. I need 3 because I have to be able to do CPU detection before loading the code that uses these CPU-dependent functions. The good part of the first method is that the app won't need to load multiple copies of the same code for multiple CPUs; it will load only the copy that will be used. The drawback of the second method is quite obvious, no need to talk about it.
I'd like to know if there is a way to do this without using shared objects and manually loading them at runtime. One way would be some hackery that involves patching code at run-time, but that is probably too complicated to get done properly. Is there a better way to control relocations at load time? Maybe place the CPU-dependent functions in different sections and then somehow specify which section has priority? I think the Mac's Mach-O format has something like that.
ELF-only (for arm target) solution is enough for me, I don't really care for PE (dll's).
thanks
You may want to lookup the GNU dynamic linker extension STT_GNU_IFUNC. From Drepper's blog when it was added:
Therefore I’ve designed an ELF extension which allows to make the decision about which implementation to use once per process run. It is implemented using a new ELF symbol type (STT_GNU_IFUNC). Whenever a symbol lookup resolves to a symbol with this type the dynamic linker does not immediately return the found value. Instead it is interpreting the value as a function pointer to a function that takes no argument and returns the real function pointer to use. The code called can be under control of the implementer and can choose, based on whatever information the implementer wants to use, which of the two or more implementations to use.
Source: http://udrepper.livejournal.com/20948.html
Nonetheless, as others have said, I think you're mistaken about the performance impact of indirect calls. All code in shared libraries will be called via a (hidden) function pointer in the GOT and a PLT entry that loads/calls that function pointer.
For the best performance you need to minimize the number of indirect calls (through pointers) per second and allow the compiler to optimize your code better (DLLs hamper this because there must be a clear boundary between a DLL and the main executable and there's no optimization across this boundary).
I'd suggest doing these:
moving as much as possible of the main executable's code that frequently calls DLL functions into the DLL. That will minimize the number of indirect calls per second and allow for better optimization at compile time too.
moving almost all your code into separate CPU-specific DLLs and leaving main() only the job of loading the proper DLL, OR making CPU-specific executables without DLLs.
Here's the exact answer that I was looking for.
GCC's __attribute__((ifunc("resolver")))
It requires fairly recent binutils.
There's a good article that describes this extension: Gnu support for CPU dispatching - sort of...
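A sketch of the attribute in use, reusing the names from my question (cpu_has_neon() is a placeholder for the runtime CPU detection I already do):

void do_something_neon(void);
void do_something_armv6(void);
int  cpu_has_neon(void);                 /* placeholder for the runtime CPU check */

/* the resolver runs once, when the symbol is bound, and returns the implementation to use */
static void (*resolve_do_something(void))(void)
{
    return cpu_has_neon() ? do_something_neon : do_something_armv6;
}

/* every later call to do_something() goes straight to the chosen implementation */
void do_something(void) __attribute__((ifunc("resolve_do_something")));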
Lazy loading ELF symbols from shared libraries is described in section 1.5.5 of Ulrich Drepper's DSO How To (updated 2011-12-10). For ARM it is described in section 3.1.3 of ELF for ARM.
EDIT: Regarding the STT_GNU_IFUNC extension mentioned by R.: I forgot that was an extension. GNU Binutils supports it for ARM, apparently since March 2011, according to the changelog.
If you want to call functions without the indirection of the PLT, I suggest function pointers or per-arch shared libraries inside which function calls don't go through PLTs (beware: calling an exported function is through the PLT).
I wouldn't patch the code at runtime. I mean, you can. You could add a build step: after compilation, disassemble your binaries, find all offsets of calls to functions that have multi-arch alternatives, build a table of patch locations, and link that into your code. In main(), remap the text segment writable, patch the offsets according to the table you prepared, map it back to read-only, flush the instruction cache, and proceed. I'm sure it would work. But how much performance do you expect to gain from this approach? I think loading different shared libraries at runtime is easier, and function pointers are easier still.
I'm using C (not C++) and I'm unsure how to avoid using global variables.
I have a pretty decent grasp on C, its syntax, and how to write a basic application, but I'm not sure of the proper way to structure the program.
How do really big applications avoid the use of global variables? I'm pretty sure there will always need to be at least some, but for big games and other applications written in C, what is the best way to do it?
Is there any good, open-source software written strictly in C that I could look at? I can't think of any off the top of my head, most of them seem to be in C++.
Thanks.
Edit
Here's an example of where I would use a global variable in a simple API hooking application, which is just a DLL inside another process.
This application, specifically, hooks API functions used in another application. It does this by using WriteProcessMemory to overwrite the call to the original function, making it a call into my DLL instead.
However, when unhooking the API function, I have to write back the original memory/machine code.
So, I need to maintain a simple byte array for that machine code, one for each API function that is hooked, and there are a lot.
// Global variable to store original assembly code (6 bytes)
BYTE g_MessageBoxA[6];
// Hook the API function
HookAPIFunction ( "user32.dll", "MessageBoxA", MyNewFunction, g_MessageBoxA );
// Later on, unhook the function
UnHookAPIFunction ( "user32.dll", "MessageBoxA", g_MessageBoxA );
Sorry if that's confusing.
"How do really big applications avoid the use of global variables?"
Use static variables. If a function needs to remember something between calls, use a local static variable rather than a global. Example:
int running_total (int num) {
    static int sum = 0;
    sum += num;
    return sum;
}
Pass data via parameters, so that a value is defined in one place, perhaps main(), and passed to wherever it is needed.
If all else fails, go ahead and use a global but try and mitigate potential problems.
At a minimum, use naming conventions to minimize potential conflicts. EG: Gbl_MyApp_DeveloperName. So all global variables would start with the Gbl_MyApp_ part -- where "MyApp" was something descriptive of your app.
Try to group functions by purpose, so that everything that needs a given global is in the same file (within reason). Then globals can be defined and restricted to that file (beware the extern keyword).
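For instance (names invented), a subsystem file can keep its shared state in a file-scope static and expose only functions:

/* motor.c - the state is invisible outside this file */
static int motor_speed;

void motor_set_speed(int s) { motor_speed = s; }
int  motor_get_speed(void)  { return motor_speed; }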
There are some valid uses for global variables. The schools started teaching that they were evil to keep programmers from being lazy and over using them. If you're sure that the data is really globally needed then use them. Given that you are concerned about doing a good job I don't think you'll go too far wrong using your own judgment.
It's easy - simply don't use them. Create what would have been the global variables in your main() function, and then pass them from there as parameters to the functions that need them.
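A minimal sketch of that approach (names are invented): the would-be globals live in main() and travel by pointer.

#include <stdio.h>

struct app_state {              /* everything that would otherwise be global */
    int frames;
    int running;
};

static void update(struct app_state *st)
{
    if (++st->frames >= 100)    /* arbitrary stop condition for the example */
        st->running = 0;
}

int main(void)
{
    struct app_state state = { 0, 1 };
    while (state.running)
        update(&state);
    printf("ran %d frames\n", state.frames);
    return 0;
}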
The open-source software most recommended for learning C at my college was Linux, so you can get examples from its source.
I think the same: globals are bad; they reduce readability and increase the possibility of errors, among other problems.
I'll explain my case. I have a main() that does very little; most of the action comes from timers and external interrupts. These are functions with no parameters, due to the platform I'm using, a 32-bit microcontroller. I can't pass parameters, and in addition the interrupts and timers are asynchronous. The easiest way is a global: imagine interrupts filling a buffer, the buffer being parsed inside a timer, and the parsed information being used, stored, or sent by other timers.
OK, I could raise an interrupt flag and do the processing in the main function, but when executing critical software that is not a good idea.
This is the only kind of global variable that doesn't make me feel bad ;-)
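A sketch of the pattern I mean (all names invented; read_uart_byte() and parse_buffer() stand in for the real hardware access and parsing, and interrupt locking is omitted):

#include <stdint.h>

#define BUF_SIZE 64

extern uint8_t read_uart_byte(void);                                   /* hypothetical hardware read */
extern void    parse_buffer(volatile const uint8_t *buf, uint8_t len); /* hypothetical parser        */

/* shared between the interrupt and the timer, so it has to live at file scope */
static volatile uint8_t rx_buf[BUF_SIZE];
static volatile uint8_t rx_count;

void uart_rx_isr(void)             /* interrupt handler: takes no parameters */
{
    if (rx_count < BUF_SIZE)
        rx_buf[rx_count++] = read_uart_byte();
}

void timer_tick(void)              /* periodic timer: consumes what the ISR collected */
{
    parse_buffer(rx_buf, rx_count);
    rx_count = 0;
}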
Global variables are almost inevitable. However, they are often overused. They are not a substitute for passing proper parameters and designing the right data structures.
Every time you want a global variable, think: what if I need one more like this?
I'm working on a C library which it would be nice to also have working on embedded systems,
but I'm not very deep into embedded development, so my question:
are most embedded compilers able to cope with local static variables - which I would then just assume in further development -
OR
is there a #define which I can use in an #ifdef to create a global variable instead, in case they are not?
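To make the question concrete, I mean something like this (NO_LOCAL_STATICS is just a placeholder macro name):

/* would be defined only for a toolchain that cannot handle local statics */
#ifdef NO_LOCAL_STATICS
static int call_count;                 /* fall back to a file-scope variable */
#endif

int count_calls(void)
{
#ifndef NO_LOCAL_STATICS
    static int call_count = 0;         /* the normal case: a local static */
#endif
    return ++call_count;
}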
Thanks.
They should, as local static variables are part of the C standard.
Of course, there is nothing preventing them from creating a C-like language that does not have all the features. But since that would be non-standard, the way to identify that a feature is lacking would be non-standard as well.
Since static variables are part of the standard, you should be safe.
The problem with support is probably not to be found with your compiler (most of which handle the standard pretty well), but with whatever code you have to set up your runtime environment. Make sure that when you're loading the code that you properly unpack the executable, read-only data, read-write data, and zero-init sections of the executable before jumping into the C code.
Local static variables are part of the C standard, so yes.
\pedantic{
If your code is well organized, with separate files (compilation units) for different subsystems, you might do better to have a static variable with file scope. This will make it easier to factor the code that uses it into separate functions. If the code that uses the variable is complicated, this will permit you to split it into smaller static functions, which are easier to read, understand and debug.
}
Yes. Local statics are really not much different from globals once the compiler is done chewing on your source code. I could think up exotic processors where globals would be an issue, but I doubt you will encounter many.
The truly interesting thing about globals on embedded processors is that you often have the option of having the compiler allocate them in ROM, EEPROMs, etc.