Internal static variables in C, would you use them? - c

In C you can have external static variables that are viewable every where in the file, while internal static variables are only visible in the function but is persistent
For example:
#include <stdio.h>
void foo_bar( void )
{
static counter = 0;
printf("counter is %d\n", counter);
counter++;
}
int main( void )
{
foo_bar();
foo_bar();
foo_bar();
return 0;
}
the output will be
counter is 0
counter is 1
counter is 2
My question is why would you use an internal static variable? If you don't want your static variable visible in the rest of the file shouldn't the function really be in its own file then?

This confusion usually comes about because the static keyword serves two purposes.
When used at file level, it controls the visibility of its object outside the compilation unit, not the duration of the object (visibility and duration are layman's terms I use during educational sessions, the ISO standard uses different terms which you may want to learn eventually, but I've found they confuse most beginning students).
Objects created at file level already have their duration decided by virtue of the fact that they're at file level. The static keyword then just makes them invisible to the linker.
When used inside functions, it controls duration, not visibility. Visibility is already decided since it's inside the function - it can't be seen outside the function. The static keyword in this case, causes the object to be created at the same time as file level objects.
Note that, technically, a function level static may not necessarily come into existence until the function is first called (and that may make sense for C++ with its constructors) but every C implementation I've ever used creates its function level statics at the same time as file level objects.
Also, whilst I'm using the word "object", I don't mean it in the sense of C++ objects (since this is a C question). It's just because static can apply to variables or functions at file level and I need an all-encompassing word to describe that.
Function level statics are still used quite a bit - they can cause trouble in multi-threaded programs if that's not catered for but, provided you know what you're doing (or you're not threading), they're the best way to preserve state across multiple function calls while still providing for encapsulation.
Even with threading, there are tricks you can do in the function (such as allocation of thread specific data within the function) to make it workable without exposing the function internals unnecessarily.
The only other choices I can think of are global variables and passing a "state variable" to the function each time.
In both these cases, you expose the inner workings of the function to its clients and make the function dependent on the good behavior of the client (always a risky assumption).

They are used to implement tools like strtok, and they cause problems with reentrancy...
Think carefully before fooling around with this tool, but there are times when they are appropriate.

For example, in C++, it is used as one way to get singleton istances
SingletonObject& getInstance()
{
static SingletonObject o;
return o;
}
which is used to solve the initialization order problem (although it's not thread-safe).
Ad "shouldn't the function be in its own file"
Certainly not, that's nonsense. Much of the point of programming languages is to facilitate isolation and therefore reuse of code (local variables, procedures, structures etc. all do that) and this is just another way to do that.
BTW, as others pointed out, almost every argument against global variables applies to static variables too, because they are in fact globals. But there are many cases when it's ok to use globals, and people do.

I find it handy for one-time, delayed, initialization:
int GetMagic()
{
static int magicV= -1;
if(-1 == magicV)
{
//do expensive, one-time initialization
magicV = {something here}
}
return magicV;
}
As others have said, this isn't thread-safe during it's very first invocation, but sometimes you can get away with it :)

I think that people generally stay away from internal static variables. I know strtok() uses one, or something like it, and because of that is probably the most hated function in the C library.
Other languages like C# don't even support it. I think the idea used to be that it was there to provide some semblance of encapsulation (if you can call it that) before the time of OO languages.

Probably not terribly useful in C, but they are used in C++ to guarantee the initialisation of namespace scoped statics. In both C and C++ there are problemns with their use in multi-threaded applications.

I wouldn't want the existence of a static variable to force me to put the function into its own file. What if I have a number of similar functions, each with their own static counter, that I wanted to put into one file? There are enough decisions we have to make about where to put things, without needing one more constraint.

Some use cases for static variables:
you can use it for counters and you won't pollute the global namespace.
you can protect variables using a function that gets the value as a pointer and returns the internal static. This whay you can control how the value is assigned. (use NULL when you just want to get the value)

I've never heard this specific construct termed "internal static variable." A fitting label, I suppose.
Like any construct, it has to be used knowledgeably and responsibly. You must know the ramifications of using the construct.
It keeps the variable declared at the most local scope without having to create a separate file for the function. It also prevents global variable declaration.
For example -
char *GetTempFileName()
{
static int i;
char *fileName = new char[1024];
memset(fileName, 0x00, sizeof(char) * 1024);
sprintf(fileName, "Temp%.05d.tmp\n", ++i);
return fileName;
}
VB.NET supports the same construct.
Public Function GetTempFileName() As String
Static i As Integer = 0
i += 1
Return String.Format("Temp{0}", i.ToString("00000"))
End Function
One ramification of this is that these functions are not reentrant nor thread safe.

Not anymore. I've seen or heard the results of function local static variables in multithreaded land, and it isn't pretty.

In writing code for a microcontroller I would use a local static variable to hold the value of a sub-state for a particular function. For instance if I had an I2C handler that was called every time main() ran then it would have its own internal state held in a static local variable. Then every time it was called it would check what state it was in and process I/O accordingly (push bits onto output pins, pull up a line, etc).

All statics are persistent and unprotected from simultaneous access, much like globals, and for that reason must be used with caution and prudence. However, there are certainly times when they come in handy, and they don't necessarily merit being in their own file.
I've used one in a fatal error logging function that gets patched to my target's error interrupt vectors, eg. div-by-zero. When this function gets called, interrupts are disabled, so threading is a non-issue. But re-entrancy could still happen if I caused a new error while in the process of logging the first error, like if the error string formatter broke. In that case, I'd have to take more drastic action.
void errorLog(...)
{
static int reentrant = 0;
if(reentrant)
{
// We somehow caused an error while logging a previous error.
// Bail out immediately!
hardwareReset();
}
// Leave ourselves a breadcrumb so we know we're already logging.
reentrant = 1;
// Format the error and put it in the log.
....
// Error successfully logged, time to reset.
hardwareReset();
}
This approach is checking against a very unlikely event, and it's only safe because interrupts are disabled. However, on an embedded target, the rule is "never hang." This approach guarantees (within reason) that the hardware eventually gets reset, one way or the other.

A simple use for this is that a function can know how many times it has been called.

Related

Reentrancy or not with this netbsd code

I am studying on "reading code" by reading pieces of NetBSD source code.
(for whoever is interested, it's < Code Reading: The Open Source Perspective > I'm reading)
And I found this function:
/* convert IP address to a string, but not into a single buffer
*/
char *
naddr_ntoa(naddr a)
{
#define NUM_BUFS 4
static int bufno;
static struct {
char str[16]; /* xxx.xxx.xxx.xxx\0 */
} bufs[NUM_BUFS];
char *s;
struct in_addr addr;
addr.s_addr = a;
strlcpy(bufs[bufno].str, inet_ntoa(addr), sizeof(bufs[bufno].str));
s = bufs[bufno].str;
bufno = (bufno+1) % NUM_BUFS;
return s;
#undef NUM_BUFS
}
It introduces 4 different temporary buffers to wrap inet_ntoa function since inet_ntoa is not re-entrant.
But seems to me this naddr_ntoa function is also not re-entrant:
the static bufno variable can be manipulated by other so the temporary buffers do not seem work as expected here.
So is it a potential bug?
Yes, this is a potential bug. If you want a similar function that most likely reentrant you could use e.g. inet_ntop (which incidentally handles IPv6 as well).
That code comes from src/sbin/routed/trace.c and it is not a general library routine, but just a custom hack used only in the routed program. The addrname() function in the same file makes use of the same trick, for the same reason. It's not even NetBSD code per se, but rather it comes from SGI originally, and is maintained by Vernon Schryver (see The Routed Page).
It's just a quick hack to allow use of multiple calls within the same expression, such as where the results are being used in one printf() call: E.g.:
printf("addr1->%s, addr2->%s, addr3->%s, addr4->%s\n",
naddr_ntoa(addr1), naddr_ntoa(addr2), naddr_ntoa(addr3), naddr_ntoa(addr4));
There are several examples of similar uses in the routed source files (if.c, input.c, rdisc.c).
There is no bug in this code. The routed program is not multi-threaded. Reentrancy is not being addressed at all in this hack. This trick has been done by design for a very specific purpose that has nothing to do with reentrancy. The Code Reading author(s) is wrong to associate this trick with reentrancy.
It's simply a way to hide the saving of multiple results in an array of static variables instead of having to individually copy those results from one static variable into separate storage in the calling function when multiple results are required for a single expression.
Remember that static variables have all the properties of global variables except for the limited scope of their identifier. It is of course true that unprotected use of global (or static) variables inside a function make that function non-reentrant, but that's not the only problem global variables cause. Use of a fully-reentrant function would not be appropriate in routed because it would actually make the code more complex than necessary, whereas this hack keeps the calling code clean and simple. It would though have been better for the hack to be properly documented such that future maintainers would more easily spot when NUM_BUFS has to be adjusted.

Is a global or static declaration safer in an embedded environment?

I have a choice to between declaring a variable static or global.
I want to use the variable in one function to maintain counter.
for example
void count()
{
static int a=0;
for(i=0;i<7;i++)
{
a++;
}
}
My other choice is to declare the variable a as global.
I will only use it in this function count().
Which way is the safest solution?
It matters only at compile and link-time. A static local variable should be stored and initialised in exactly the same way as a global one.
Declaring a local static variable only affects its visibility at the language level, making it visible only in the enclosing function, though with a global lifetime.
A global variable (or any object in general) not marked static has external linkage and the linker will consider the symbol when merging each of the object files.
A global variable marked static only has internal linkage within the current translation unit, and the linker will not see such a symbol when merging the individual translation units.
The internal static is probably better from a code-readability point of view, if you'll only ever use it inside that function.
If it was global, some other function could potentially modify it, which could be dangerous.
Either using global or static variable within a function both are not safe because then your function will no longer be re-entrant.
However if you are not concerned with function being re-entrant then you can have either based on your choice.
If the variable is only to be accessed within the function count() then it is by definition local, so I cannot see why the question arises. As a rule, always use the most restrictive scope possible for any symbol.
You should really read Jack Ganssle's article A Pox on Globals, it will be enlightening.
Always reduce scope as far as possible. If a variable doesn't need to be visible outside a function, it should not be declared outside it either. The static keyword should be used whenever possible. If you declare a variable at file scope, it should always be static to reduce the scope to the file it was declared in. This is C's way of private encapsulation.
The above is true for all systems. For embedded there is another concern: all variables declared as static or global must be initialized before the program is started. This is enforced by ISO C. So they are always set either to the value the programmer wants them initialized to. If the programmer didn't set any value they are initialized to zero (or NULL).
This means that before main is called, there must be a snippet executed in your program that sets all these static/global values. In an embedded system, the initialization values are copied from ROM (flash, eeprom etc) to RAM. A standard C compiler handles this by creating this snippet and adding it to your program.
However, in embedded systems this snippet is often unfortunate, as it leads to a delay at program startup, especially if there is lots of statics/globals. A common non-standard optimization most embedded compilers support, is to remove this snippet. The program will then no longer behave as expected by the C standard, but it will be faster. Once you have done this optimization, initialization must be done in runtime, roughly static int x; x=0; rather than static int x=0;.
To make your program portable to such non-standard embedded compilers, it is a good habit to always set your globals/statics in runtime. And no matter if you intend to port to such compilers or not, it is certainly a good habit not to rely on the default zero initialization of globals/statics. Because most rookie C programmers don't even know that this static zero initialization rule exists and they will get very confused if you don't init your variables explicitly before using them.
i dont think is there is anything special with static & normal global with embedded domain ...!!
in one way static is good that if you are going to initialize your counter as o in starting then if you just declare with static then there is no need to initialize with it 0 because every static varaible is by default initialized with 0.
Edit :
After Clifford's comment i have checked and get to know that globals are also statically allocated and initialised to zero, so that advantage does not exist..
Pass a pointer to a "standard" variable instead
void count(int *a) {
int i;
for (i = 0; i < 7; i++)
{
(*a)++;
}
}
This way you do not rely neither on global variables nor on static local variables, which makes your program better.
I would say static is better than global if you want only one function to access it in which you declared it . Plus global variables are more prone to be accidentally accessed by other functions.
If you do want to use globals since it can be accessed by other functions in the program, make sure you declare them as volatile .
volatile int a = 0;
volatile makes sure it is not optimised by compilers in the wrong way.

Making C module variables accessible as read-only

I would like to give a module variable a read-only access for client modules.
Several solutions:
1. The most common one:
// module_a.c
static int a;
int get_a(void)
{
return a;
}
// module_a.h
int get_a(void);
This makes one function per variable to share, one function call (I am thinking both execution time and readability), and one copy for every read. Assuming no optimizing linker.
2. Another solution:
// module_a.c
static int _a;
const int * const a = &_a;
// module_a.h
extern const int * const a;
// client_module.c
int read_variable = *a;
*a = 5; // error: variable is read-only
I like that, besides the fact that the client needs to read the content of a pointer. Also, every read-only variable needs its extern const pointer to const.
3. A third solution, inspired by the second one, is to hide the variables behind a struct and an extern pointer to struct. The notation module_name->a is more readable in the client module, in my opinion.
4. I could create an inline definition for the get_a(void) function. It would still look like a function call in the client module, but the optimization should take place.
My questions:
Is there a best way to make variables modified in a module accessible as read-only in other modules? Best in what aspect?
Which solutions above would you accept or refuse to use, and why?
I am aware that this is microoptimization - I might not implement it - but I am still interested in the possibility, and above all in the knowing.
Concerning option #4, I'm not sure you can make it inline if the variable isn't accessible outside the implementation file. I wouldn't count options #2 and #3 as truly read-only. The pointer can have the constness cast away and be modified (const is just a compiler "warning", nothing concrete). Only option #1 is read-only because it returns a copy.
For speed identical to variable access, you can define an extern variable inside an inline function:
static inline int get_a(void)
{
extern int a_var;
return a_var;
}
This is simple and clear to read. The other options seem unnecessarily convoluted.
Edit: I'm assuming that you use prefixes for your names, since you write C. So it will actually be:
extern int my_project_a;
This prevents a client from accidentally making a variable with the same name. However, what if a client makes a variable with the same name on purpose? In this situation, you have already lost, because the client is either 1) actively trying to sabotage your library or 2) incompetent beyond reasonable accommodation. In situation #1, there is nothing you can do to stop the programmer. In situation #2, the program will be broken anyway.
Try running nm /lib/libc.so or equivalent on your system. You'll see that most libc implementations have several variables that are not defined in header files. On my system this includes things like __host_byaddr_cache. It's not the responsibility of the C library implementors to babysit me and prevent me from running:
extern void *__host_byaddr_cache;
__host_byaddr_cache = NULL;
If you start down the path of thinking that you have to force clients to treat your variable as read-only, you are heading down the path of fruitless paranoia. The static keyword is really just a convenience to keep objects out of the global namespace, it is not and never was a security measure to prevent external access.
The only way to enforce read-only variables is to manage the client code — either by sandboxing it in a VM or by algorithmically verifying that it can't modify your variable.
The most common one:
There's a reason why it's the most common one. It's the best one.
I don't regard the performance hit to be significant enough to be worth worrying about in most situations.

Are nested functions a bad thing in gcc ? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I know that nested functions are not part of the standard C, but since they're present in gcc (and the fact that gcc is the only compiler i care about), i tend to use them quite often.
Is this a bad thing ? If so, could you show me some nasty examples ?
What's the status of nested functions in gcc ? Are they going to be removed ?
Nested functions really don't do anything that you can't do with non-nested ones (which is why neither C nor C++ provide them). You say you are not interested in other compilers - well this may be atrue at this moment, but who knows what the future will bring? I would avoid them, along with all other GCC "enhancements".
A small story to illustrate this - I used to work for a UK Polytechinc which mostly used DEC boxes - specifically a DEC-10 and some VAXen. All the engineering faculty used the many DEC extensions to FORTRAN in their code - they were certain that we would remain a DEC shop forever. And then we replaced the DEC-10 with an IBM mainframe, the FORTRAN compiler of which didn't support any of the extensions. There was much wailing and gnashing of teeth on that day, I can tell you. My own FORTRAN code (an 8080 simulator) ported over to the IBM in a couple of hours (almost all taken up with learning how to drive the IBM compiler), because I had written it in bog-standard FORTRAN-77.
There are times nested functions can be useful, particularly with algorithms that shuffle around lots of variables. Something like a written-out 4-way merge sort could need to keep a lot of local variables, and have a number of pieces of repeated code which use many of them. Calling those bits of repeated code as an outside helper routine would require passing a large number of parameters and/or having the helper routine access them through another level of pointer indirection.
Under such circumstances, I could imagine that nested routines might allow for more efficient program execution than other means of writing the code, at least if the compiler optimizes for the situation where there any recursion that exists is done via re-calling the outermost function; inline functions, space permitting, might be better on non-cached CPUs, but the more compact code offered by having separate routines might be helpful. If inner functions cannot call themselves or each other recursively, they can share a stack frame with the outer function and would thus be able to access its variables without the time penalty of an extra pointer dereference.
All that being said, I would avoid using any compiler-specific features except in circumstances where the immediate benefit outweighs any future cost that might result from having to rewrite the code some other way.
Like most programming techniques, nested functions should be used when and only when they are appropriate.
You aren't forced to use this aspect, but if you want, nested functions reduce the need to pass parameters by directly accessing their containing function's local variables. That's convenient. Careful use of "invisible" parameters can improve readability. Careless use can make code much more opaque.
Avoiding some or all parameters makes it harder to reuse a nested function elsewhere because any new containing function would have to declare those same variables. Reuse is usually good, but many functions will never be reused so it often doesn't matter.
Since a variable's type is inherited along with its name, reusing nested functions can give you inexpensive polymorphism, like a limited and primitive version of templates.
Using nested functions also introduces the danger of bugs if a function unintentionally accesses or changes one of its container's variables. Imagine a for loop containing a call to a nested function containing a for loop using the same index without a local declaration. If I were designing a language, I would include nested functions but require an "inherit x" or "inherit const x" declaration to make it more obvious what's happening and to avoid unintended inheritance and modification.
There are several other uses, but maybe the most important thing nested functions do is allow internal helper functions that are not visible externally, an extension to C's and C++'s static not extern functions or to C++'s private not public functions. Having two levels of encapsulation is better than one. It also allows local overloading of function names, so you don't need long names describing what type each one works on.
There are internal complications when a containing function stores a pointer to a contained function, and when multiple levels of nesting are allowed, but compiler writers have been dealing with those issues for over half a century. There are no technical issues making it harder to add to C++ than to C, but the benefits are less.
Portability is important, but gcc is available in many environments, and at least one other family of compilers supports nested functions - IBM's xlc available on AIX, Linux on PowerPC, Linux on BlueGene, Linux on Cell, and z/OS. See
http://publib.boulder.ibm.com/infocenter/comphelp/v8v101index.jsp?topic=%2Fcom.ibm.xlcpp8a.doc%2Flanguage%2Fref%2Fnested_functions.htm
Nested functions are available in some new (eg, Python) and many more traditional languages, including Ada, Pascal, Fortran, PL/I, PL/IX, Algol and COBOL. C++ even has two restricted versions - methods in a local class can access its containing function's static (but not auto) variables, and methods in any class can access static class data members and methods. The upcoming C++ standard has lamda functions, which are really anonymous nested functions. So the programming world has lots of experience pro and con with them.
Nested functions are useful but take care. Always use any features and tools where they help, not where they hurt.
As you said, they are a bad thing in the sense that they are not part of the C standard, and as such are not implemented by many (any?) other C compilers.
Also keep in mind that g++ does not implement nested functions, so you will need to remove them if you ever need to take some of that code and dump it into a C++ program.
Nested functions can be bad, because under specific conditions the NX (no-execute) security bit will be disabled. Those conditions are:
GCC and nested functions are used
a pointer to the nested function is used
the nested function accesses variables from the parent function
the architecture offers NX (no-execute) bit protection, for instance 64-bit linux.
When the above conditions are met, GCC will create a trampoline https://gcc.gnu.org/onlinedocs/gccint/Trampolines.html. To support trampolines, the stack will be marked executable. see: https://www.win.tue.nl/~aeb/linux/hh/protection.html
Disabling the NX security bit creates several security issues, with the notable one being buffer overrun protection is disabled. Specifically, if an attacker placed some code on the stack (say as part of a user settable image, array or string), and a buffer overrun occurred, then the attackers code could be executed.
update
I'm voting to delete my own post because it's incorrect. Specifically, the compiler must insert a trampoline function to take advantage of the nested functions, so any savings in stack space are lost.
If some compiler guru wants to correct me, please do so!
original answer:
Late to the party, but I disagree with the accepted answer's assertion that
Nested functions really don't do anything that you can't do with
non-nested ones.
Specifically:
TL;DR: Nested Functions Can Reduce Stack Usage in Embedded Environments
Nested functions give you access to lexically scoped variables as "local" variables without needing to push them onto the call stack. This can be really useful when working on a system with limited resource, e.g. embedded systems. Consider this contrived example:
void do_something(my_obj *obj) {
double times2() {
return obj->value * 2.0;
}
double times4() {
return times2() * times2();
}
...
}
Note that once you're inside do_something(), because of nested functions, the calls to times2() and times4() don't need to push any parameters onto the stack, just return addresses (and smart compilers even optimize them out when possible).
Imagine if there was a lot of state that the internal functions needed to access. Without nested functions, all that state would have to be passed on the stack to each of the functions. Nested functions let you access the state like local variables.
I agree with Stefan's example, and the only time I used nested functions (and then I am declaring them inline) is in a similar occasion.
I would also suggest that you should rarely use nested inline functions rarely, and the few times you use them you should have (in your mind and in some comment) a strategy to get rid of them (perhaps even implement it with conditional #ifdef __GCC__ compilation).
But GCC being a free (like in speech) compiler, it makes some difference... And some GCC extensions tend to become de facto standards and are implemented by other compilers.
Another GCC extension I think is very useful is the computed goto, i.e. label as values. When coding automatons or bytecode interpreters it is very handy.
Nested functions can be used to make a program easier to read and understand, by cutting down on the amount of explicit parameter passing without introducing lots of global state.
On the other hand, they're not portable to other compilers. (Note compilers, not devices. There aren't many places where gcc doesn't run).
So if you see a place where you can make your program clearer by using a nested function, you have to ask yourself 'Am I optimising for portability or readability'.
I'm just exploring a bit different kind of use of nested functions. As an approach for 'lazy evaluation' in C.
Imagine such code:
void vars()
{
bool b0 = code0; // do something expensive or to ugly to put into if statement
bool b1 = code1;
if (b0) do_something0();
else if (b1) do_something1();
}
versus
void funcs()
{
bool b0() { return code0; }
bool b1() { return code1; }
if (b0()) do_something0();
else if (b1()) do_something1();
}
This way you get clarity (well, it might be a little confusing when you see such code for the first time) while code is still executed when and only if needed.
At the same time it's pretty simple to convert it back to original version.
One problem arises here if same 'value' is used multiple times. GCC was able to optimize to single 'call' when all the values are known at compile time, but I guess that wouldn't work for non trivial function calls or so. In this case 'caching' could be used, but this adds to non readability.
I need nested functions to allow me to use utility code outside an object.
I have objects which look after various hardware devices. They are structures which are passed by pointer as parameters to member functions, rather as happens automagically in c++.
So I might have
static int ThisDeviceTestBram( ThisDeviceType *pdev )
{
int read( int addr ) { return( ThisDevice->read( pdev, addr ); }
void write( int addr, int data ) ( ThisDevice->write( pdev, addr, data ); }
GenericTestBram( read, write, pdev->BramSize( pdev ) );
}
GenericTestBram doesn't and cannot know about ThisDevice, which has multiple instantiations. But all it needs is a means of reading and writing, and a size. ThisDevice->read( ... ) and ThisDevice->Write( ... ) need the pointer to a ThisDeviceType to obtain info about how to read and write the block memory (Bram) of this particular instantiation. The pointer, pdev, cannot have global scobe, since multiple instantiations exist, and these might run concurrently. Since access occurs across an FPGA interface, it is not a simple question of passing an address, and varies from device to device.
The GenericTestBram code is a utility function:
int GenericTestBram( int ( * read )( int addr ), void ( * write )( int addr, int data ), int size )
{
// Do the test
}
The test code, therefore, need be written only once and need not be aware of the details of the structure of the calling device.
Even wih GCC, however, you cannot do this. The problem is the out of scope pointer, the very problem needed to be solved. The only way I know of to make f(x, ... ) implicitly aware of its parent is to pass a parameter with a value out of range:
static int f( int x )
{
static ThisType *p = NULL;
if ( x < 0 ) {
p = ( ThisType* -x );
}
else
{
return( p->field );
}
}
return( whatever );
Function f can be initialised by something which has the pointer, then be called from anywhere. Not ideal though.
Nested functions are a MUST-HAVE in any serious programming language.
Without them, the actual sense of functions isn't usable.
It's called lexical scoping.

Refactoring global to local. Should they be static or not?

I'm refactoring "spaghetti code" C module to work in multitasking (RTOS) environment.
Now, there are very long functions and many unnecessary global variables.
When I try to replace global variables that exists only in one function with locals, I get into dilemma. Every global variable is behave like local "static" - e.g. keep its value even you exit and re-enter to the function.
For multitasking "static" local vars are worst from global. They make the functions non reentered.
There are a way to examine if the function is relay on preserving variable value re-entrancing without tracing all the logical flow?
Short answer: no, there isn't any way to tell automatically whether the function will behave differently according to whether the declaration of a local variable is static or not. You just have to examine the logic of each function that uses globals in the original code.
However, if replacing a global variable with a static local-scope variable means the function is not re-entrant, then it wasn't re-entrant when it was a global, either. So I don't think that changing a global to a static local-scope variable will make your functions any less re-entrant than they were to start with.
Provided that the global really was used only in that scope (which the compiler/linker should confirm when you remove the global), the behaviour should be close to the same. There may or may not be issues over when things are initialized, I can't remember what the standard says: if static initialization occurs in C the same time it does in C++, when execution first reaches the declaration, then you might have changed a concurrency-safe function into a non-concurrency-safe one.
Working out whether a function is safe for re-entrancy also requires looking at the logic. Unless the standard says otherwise (I haven't checked), a function isn't automatically non-re-entrant just because it declares a static variable. But if it uses either a global or a static in any significant way, you can assume that it's non-re-entrant. If there isn't synchronization then assume it's also non-concurrency-safe.
Finally, good luck. Sounds like this code is a long way from where you want it to be...
If your compiler will warn you if a variable is used before initialized, make a suspected variable local without assigning it a value in its declaration.
Any variable that gives a warning cannot be made local without changing other code.
Changing global variables to static local variables will help a little, since the scope for modification has been reduced. However the concurrency issue still remains a problem and you have to work around it with locks around access to those static variables.
But what you want to be doing is pushing the definition of the variable into the highest scope it is used as a local, then pass it as an argument to anything that needs it. This obviously requires alot of work potentially (since it has a cascading effect). You can group similarly needed variables into "context" objects and then pass those around.
See the design pattern Encapsulate Context
If your global vars are truly used only in one function, you're losing nothing by making them into static locals since the fact that they were global anyway made the function that used them non-re-entrant. You gain a little by limiting the scope of the variable.
You should make that change to all globals that are used in only one function, then examine each static local variable to see if it can be made non-static (automatic).
The rule is: if the variable is used in the function before being set, then leave it static.
An example of a variable that can be made automatic local (you would put "int nplus4;" inside the function (you don't need to set it to zero since it's set before use and this should issue a warning if you actually use it before setting it, a useful check):
int nplus4 = 0; // used only in add5
int add5 (int n) {
nplus4 = n + 4; // set
return nplus4 + 1; // use
}
The nplus4 var is set before being used. The following is an example that should be left static by putting "static int nextn = 0;" inside the function:
int nextn = 0; // used only in getn
int getn (void) {
int n = nextn++; // use, then use, then set
return n;
}
Note that it can get tricky, "nextn++" is not setting, it's using and setting since it's equivalent to "nextn = nextn + 1".
One other thing to watch out for: in an RTOS environment, stack space may be more limited than global memory so be careful moving big globals such as "char buffer[10000]" into the functions.
Please give examples of what you call 'global' and 'local' variables
int global_c; // can be used by any other file with 'extern int global_c;'
static int static_c; // cannot be seen or used outside of this file.
int foo(...)
{
int local_c; // cannot be seen or used outside of this function.
}
If you provide some code samples of what you have and what you changed we could better answer the question.
If I understand your question correctly, your concern is that global variables retain their value from one function call to the next. Obviously when you move to using a normal local variable that won't be the case. If you want to know whether or not it is safe to change them I don't think you have any option other than reading and understanding the code. Simply doing a full text search for the the name of the variable in question might be instructive.
If you want a quick and dirty solution that isn't completely safe, you can just change it and see what breaks. I recommend making sure you have a version you can roll back to in source control and setting up some unit tests in advance.

Resources