Related
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
In C, when implementing functions which work on some data, I can think of at least the following commonly used styles of how to implement them:
Style 1: The simplest one. User calls functions directly and provides pointer to data as argument.
typedef struct
{
int data;
}SomeStruct;
void setData(SomeStruct *s, int value)
{
s->data = value;
}
// construction
SomeStruct s;
// usage
setData(&s, 123);
Style 2: User calls functions through function pointers, but still needs to manually provide pointer to data.
typedef struct SomeStruct SomeStruct;
struct SomeStruct
{
void (*set)(SomeStruct*, int);
int data;
};
void setData(SomeStruct *s, int value)
{
s->data = value;
}
// construction
SomeStruct s;
s.set = setData;
// usage
s.set(&s, 123);
Style 3: User calls functions through function pointers and doesn't need to provide pointer to data manually.
typedef struct SomeStruct SomeStruct;
struct SomeStruct
{
void (*set)(int);
int data;
};
#define MAX_INSTANCES 2
SomeStruct structs[MAX_INSTANCES];
void setData(SomeStruct *s, int value)
{
s->data = value;
}
void setData0(int value)
{
setData(&structs[0], value);
}
void setData1(int value)
{
setData(&structs[1], value);
}
// construction
structs[0].set = setData0;
structs[1].set = setData1;
SomeStruct *s0 = &structs[0];
// usage
s0->set(123);
In style 2 and 3 the data and setData can be hidden from the user, but I omitted this here for the sake of simplicity.
The question is: Are there established names for these styles?
Style 1 is traditional C.
Style 2 is the beginnings of Object-Oriented C. If it has a name, it is "method implementation pattern"
The problem is that you have an improperly named method for Style 2.
void SomeStruct_setData(SomeStruct *s, int value)
{
s->data = value;
}
or
void SomeStruct_data(SomeStruct *s, int value)
{
s->data = value;
}
are a bit better, because now it signals the intention that SomeStruct is being treated like an object.
Fancier approaches that emulate more of an object oriented system exist. They typically provide (in the header)
struct SomeStruct;
and place the private struct within the SomeStruct.c file. This prevents the ability to access the fields directly, forcing all who modify SomeStruct to use the SomeStruct_*(struct SomeStruct*, ...) methods. Often these approaches will use typdefs to present a slightly renamed struct struct SomeSturct_s* as SomeStruct.
Style 3 is one of the many variations that might be called "vtable implementation" in C. It is often a precursor to polymorphism
Remember that C doesn't have a true polymorphic typing system, so the actual types generally get converted to void * in implementation; but, through patterns of all the object struct fields, you can embed the type into to struct (typically done with a uint32_t or enum, sometimes done with more sophisticated type-handling routines).
In any case, eventually you will need a function that changes behavior, and that function is implemented as a function pointer with many different C-style functions providing the behavior(s), and the type assignment / handling choosing the correct functions to assign to the struct to implement the type's behavior. Basically this is a mini-vtable, if you want to translate it into C++ terms.
Some systems start off with switch statements based on the object type, typically read from the struct (in a fixed location field required by every object struct). Eventually they realize they are just a switch statement that passes the parameters to one method based on the base-class type. By using a function pointer and leveraging construction time assignment, the switch statement can be optimized away.
Object Oriented C is a thing, but it is not a standard.
There are many times when one wants to use Object-Oriented C. The maintenance conveniences and clear delineations for testing are typically the drivers that have people choose this approach. That said, the actual system will resemble a lot of "you write the Object-Oriented simulation in C" and you don't often complete all of the features of whatever OO system you are emulating.
To make this easier, there are a number of approaches:
Write a mini-language that expands into C. This was the original approach of C++, until they decided that the benefits of type checking all the way through the compilation phase was worth combining the pre-processing language with the compiler.
Use a pre-built Object Oriented C environment. GNOME has already built their GObject library, and while the syntax and macro choices may take some getting used to, the entire library is complete, robust, and tested. This "C" as the fundamental layer, allows GNOME to be ported to a few more platforms that its traditional competitor KDE, which requires a C++ compiler (available on fewer platforms).
Refactor your code into reusable components, to speed the construction of your less-than-full-featured OO system. By making the components reusable, either through macro expansion or including methods, you will speed up the creation of new objects, and spend less time debugging them to verify they work as objects.
The first steps are very easy to implement (non-dynamic methods, vtables without sophisticated built-in type checking) but it gets harder as you go along (but not too hard). I've built a system (for fun) that even correctly implements "try / catch" semantics with macros that use the setjmp and longjmp C functions to handle thrown objects with polymorphisim, complete with re-throws, re-catches, and re-handlers following a Java-like behavior.
If you like to play with these kinds of toys, I suggest you get good at standing up unit tests in C, using tools like autotools or its equivalent; because, at least in the early stages, you will be finding that the most basic approach won't capture all the behavior needed as you near completion, and having a testing suite will prevent you from breaking your prior work.
I've noticed that in a lot of library, version informations, as well as informations on the availability of special features that may differ or be absent depending on the build, are made accessible to client applications not by a constant, but by a function call returning a constant, e.g.:
const char *libversion(void) {
return "0.2";
}
bool support_ssl(void) {
return LIB_SSL_ENABLED; /* whatever */
}
instead of simply:
const char *libversion = "0.2";
bool support_ssl = LIB_SSL_ENABLED;
Is there a practical reason for doing this, or is it only some kind of convention?
Is there a practical reason for doing this, or is it only some kind of convention?
I'd say both…
A practical reason I see for this, is that when you distribute your library, your users install a compiled version of it as a shared object, and access its data using the header. If the constant is accessible through a function, its prototype is declared in the header, but the value is defined in the compilation unit, linked in the shared object file. Edit: I'm not saying it's not possible, but a good reason for doing so is to keep the possibility to keep the API stable, while switching from a constant value to a calculated value for a given function, cf reason #3.
Another practical reason I can see is that you could access that API using some sort of "middleware", like corba, that enables you to access functions, but not constants (please be kind with me if I'm wrong about that particular point, I haven't done any CORBA in 10 years…).
And in the end, it's somehow good OOP convention, the header file being a pure functional interface, and all the members being encapsulated enabling a full decoupling of the inner workings of the library and the exposed behaviour.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I know that nested functions are not part of the standard C, but since they're present in gcc (and the fact that gcc is the only compiler i care about), i tend to use them quite often.
Is this a bad thing ? If so, could you show me some nasty examples ?
What's the status of nested functions in gcc ? Are they going to be removed ?
Nested functions really don't do anything that you can't do with non-nested ones (which is why neither C nor C++ provide them). You say you are not interested in other compilers - well this may be atrue at this moment, but who knows what the future will bring? I would avoid them, along with all other GCC "enhancements".
A small story to illustrate this - I used to work for a UK Polytechinc which mostly used DEC boxes - specifically a DEC-10 and some VAXen. All the engineering faculty used the many DEC extensions to FORTRAN in their code - they were certain that we would remain a DEC shop forever. And then we replaced the DEC-10 with an IBM mainframe, the FORTRAN compiler of which didn't support any of the extensions. There was much wailing and gnashing of teeth on that day, I can tell you. My own FORTRAN code (an 8080 simulator) ported over to the IBM in a couple of hours (almost all taken up with learning how to drive the IBM compiler), because I had written it in bog-standard FORTRAN-77.
There are times nested functions can be useful, particularly with algorithms that shuffle around lots of variables. Something like a written-out 4-way merge sort could need to keep a lot of local variables, and have a number of pieces of repeated code which use many of them. Calling those bits of repeated code as an outside helper routine would require passing a large number of parameters and/or having the helper routine access them through another level of pointer indirection.
Under such circumstances, I could imagine that nested routines might allow for more efficient program execution than other means of writing the code, at least if the compiler optimizes for the situation where there any recursion that exists is done via re-calling the outermost function; inline functions, space permitting, might be better on non-cached CPUs, but the more compact code offered by having separate routines might be helpful. If inner functions cannot call themselves or each other recursively, they can share a stack frame with the outer function and would thus be able to access its variables without the time penalty of an extra pointer dereference.
All that being said, I would avoid using any compiler-specific features except in circumstances where the immediate benefit outweighs any future cost that might result from having to rewrite the code some other way.
Like most programming techniques, nested functions should be used when and only when they are appropriate.
You aren't forced to use this aspect, but if you want, nested functions reduce the need to pass parameters by directly accessing their containing function's local variables. That's convenient. Careful use of "invisible" parameters can improve readability. Careless use can make code much more opaque.
Avoiding some or all parameters makes it harder to reuse a nested function elsewhere because any new containing function would have to declare those same variables. Reuse is usually good, but many functions will never be reused so it often doesn't matter.
Since a variable's type is inherited along with its name, reusing nested functions can give you inexpensive polymorphism, like a limited and primitive version of templates.
Using nested functions also introduces the danger of bugs if a function unintentionally accesses or changes one of its container's variables. Imagine a for loop containing a call to a nested function containing a for loop using the same index without a local declaration. If I were designing a language, I would include nested functions but require an "inherit x" or "inherit const x" declaration to make it more obvious what's happening and to avoid unintended inheritance and modification.
There are several other uses, but maybe the most important thing nested functions do is allow internal helper functions that are not visible externally, an extension to C's and C++'s static not extern functions or to C++'s private not public functions. Having two levels of encapsulation is better than one. It also allows local overloading of function names, so you don't need long names describing what type each one works on.
There are internal complications when a containing function stores a pointer to a contained function, and when multiple levels of nesting are allowed, but compiler writers have been dealing with those issues for over half a century. There are no technical issues making it harder to add to C++ than to C, but the benefits are less.
Portability is important, but gcc is available in many environments, and at least one other family of compilers supports nested functions - IBM's xlc available on AIX, Linux on PowerPC, Linux on BlueGene, Linux on Cell, and z/OS. See
http://publib.boulder.ibm.com/infocenter/comphelp/v8v101index.jsp?topic=%2Fcom.ibm.xlcpp8a.doc%2Flanguage%2Fref%2Fnested_functions.htm
Nested functions are available in some new (eg, Python) and many more traditional languages, including Ada, Pascal, Fortran, PL/I, PL/IX, Algol and COBOL. C++ even has two restricted versions - methods in a local class can access its containing function's static (but not auto) variables, and methods in any class can access static class data members and methods. The upcoming C++ standard has lamda functions, which are really anonymous nested functions. So the programming world has lots of experience pro and con with them.
Nested functions are useful but take care. Always use any features and tools where they help, not where they hurt.
As you said, they are a bad thing in the sense that they are not part of the C standard, and as such are not implemented by many (any?) other C compilers.
Also keep in mind that g++ does not implement nested functions, so you will need to remove them if you ever need to take some of that code and dump it into a C++ program.
Nested functions can be bad, because under specific conditions the NX (no-execute) security bit will be disabled. Those conditions are:
GCC and nested functions are used
a pointer to the nested function is used
the nested function accesses variables from the parent function
the architecture offers NX (no-execute) bit protection, for instance 64-bit linux.
When the above conditions are met, GCC will create a trampoline https://gcc.gnu.org/onlinedocs/gccint/Trampolines.html. To support trampolines, the stack will be marked executable. see: https://www.win.tue.nl/~aeb/linux/hh/protection.html
Disabling the NX security bit creates several security issues, with the notable one being buffer overrun protection is disabled. Specifically, if an attacker placed some code on the stack (say as part of a user settable image, array or string), and a buffer overrun occurred, then the attackers code could be executed.
update
I'm voting to delete my own post because it's incorrect. Specifically, the compiler must insert a trampoline function to take advantage of the nested functions, so any savings in stack space are lost.
If some compiler guru wants to correct me, please do so!
original answer:
Late to the party, but I disagree with the accepted answer's assertion that
Nested functions really don't do anything that you can't do with
non-nested ones.
Specifically:
TL;DR: Nested Functions Can Reduce Stack Usage in Embedded Environments
Nested functions give you access to lexically scoped variables as "local" variables without needing to push them onto the call stack. This can be really useful when working on a system with limited resource, e.g. embedded systems. Consider this contrived example:
void do_something(my_obj *obj) {
double times2() {
return obj->value * 2.0;
}
double times4() {
return times2() * times2();
}
...
}
Note that once you're inside do_something(), because of nested functions, the calls to times2() and times4() don't need to push any parameters onto the stack, just return addresses (and smart compilers even optimize them out when possible).
Imagine if there was a lot of state that the internal functions needed to access. Without nested functions, all that state would have to be passed on the stack to each of the functions. Nested functions let you access the state like local variables.
I agree with Stefan's example, and the only time I used nested functions (and then I am declaring them inline) is in a similar occasion.
I would also suggest that you should rarely use nested inline functions rarely, and the few times you use them you should have (in your mind and in some comment) a strategy to get rid of them (perhaps even implement it with conditional #ifdef __GCC__ compilation).
But GCC being a free (like in speech) compiler, it makes some difference... And some GCC extensions tend to become de facto standards and are implemented by other compilers.
Another GCC extension I think is very useful is the computed goto, i.e. label as values. When coding automatons or bytecode interpreters it is very handy.
Nested functions can be used to make a program easier to read and understand, by cutting down on the amount of explicit parameter passing without introducing lots of global state.
On the other hand, they're not portable to other compilers. (Note compilers, not devices. There aren't many places where gcc doesn't run).
So if you see a place where you can make your program clearer by using a nested function, you have to ask yourself 'Am I optimising for portability or readability'.
I'm just exploring a bit different kind of use of nested functions. As an approach for 'lazy evaluation' in C.
Imagine such code:
void vars()
{
bool b0 = code0; // do something expensive or to ugly to put into if statement
bool b1 = code1;
if (b0) do_something0();
else if (b1) do_something1();
}
versus
void funcs()
{
bool b0() { return code0; }
bool b1() { return code1; }
if (b0()) do_something0();
else if (b1()) do_something1();
}
This way you get clarity (well, it might be a little confusing when you see such code for the first time) while code is still executed when and only if needed.
At the same time it's pretty simple to convert it back to original version.
One problem arises here if same 'value' is used multiple times. GCC was able to optimize to single 'call' when all the values are known at compile time, but I guess that wouldn't work for non trivial function calls or so. In this case 'caching' could be used, but this adds to non readability.
I need nested functions to allow me to use utility code outside an object.
I have objects which look after various hardware devices. They are structures which are passed by pointer as parameters to member functions, rather as happens automagically in c++.
So I might have
static int ThisDeviceTestBram( ThisDeviceType *pdev )
{
int read( int addr ) { return( ThisDevice->read( pdev, addr ); }
void write( int addr, int data ) ( ThisDevice->write( pdev, addr, data ); }
GenericTestBram( read, write, pdev->BramSize( pdev ) );
}
GenericTestBram doesn't and cannot know about ThisDevice, which has multiple instantiations. But all it needs is a means of reading and writing, and a size. ThisDevice->read( ... ) and ThisDevice->Write( ... ) need the pointer to a ThisDeviceType to obtain info about how to read and write the block memory (Bram) of this particular instantiation. The pointer, pdev, cannot have global scobe, since multiple instantiations exist, and these might run concurrently. Since access occurs across an FPGA interface, it is not a simple question of passing an address, and varies from device to device.
The GenericTestBram code is a utility function:
int GenericTestBram( int ( * read )( int addr ), void ( * write )( int addr, int data ), int size )
{
// Do the test
}
The test code, therefore, need be written only once and need not be aware of the details of the structure of the calling device.
Even wih GCC, however, you cannot do this. The problem is the out of scope pointer, the very problem needed to be solved. The only way I know of to make f(x, ... ) implicitly aware of its parent is to pass a parameter with a value out of range:
static int f( int x )
{
static ThisType *p = NULL;
if ( x < 0 ) {
p = ( ThisType* -x );
}
else
{
return( p->field );
}
}
return( whatever );
Function f can be initialised by something which has the pointer, then be called from anywhere. Not ideal though.
Nested functions are a MUST-HAVE in any serious programming language.
Without them, the actual sense of functions isn't usable.
It's called lexical scoping.
Introduction
Hello folks, I recently learned to program in C! (This was a huge step for me, since C++ was the first language, I had contact with and scared me off for nearly 10 years.) Coming from a mostly OO background (Java + C#), this was a very nice paradigm shift.
I love C. It's such a beautiful language. What surprised me the most, is the high grade of modularity and code reusability C supports - of course it's not as high as in a OO-language, but still far beyond my expectations for an imperative language.
Question
How do I prevent naming conflicts between the client code and my C library code? In Java there are packages, in C# there are namespaces. Imagine I write a C library, which offers the operation "add". It is very likely, that the client already uses an operation called like that - what do I do?
I'm especially looking for a client friendly solution. For example, I wouldn't like to prefix all my api operations like "myuniquelibname_add" at all. What are the common solutions to this in the C world? Do you put all api operations in a struct, so the client can choose its own prefix?
I'm very looking forward to the insights I get through your answers!
EDIT (modified question)
Dear Answerers, thank You for Your answers! I now see, that prefixes are the only way to safely avoid naming conflicts. So, I would like to modifiy my question: What possibilities do I have, to let the client choose his own prefix?
The answer Unwind posted, is one way. It doesn't use prefixes in the normal sense, but one has to prefix every api call by "api->". What further solutions are there (like using a #define for example)?
EDIT 2 (status update)
It all boils down to one of two approaches:
Using a struct
Using #define (note: There are many ways, how one can use #define to achieve, what I desire)
I will not accept any answer, because I think that there is no correct answer. The solution one chooses rather depends on the particular case and one's own preferences. I, by myself, will try out all the approaches You mentioned to find out which suits me best in which situation. Feel free to post arguments for or against certain appraoches in the comments of the corresponding answers.
Finally, I would like to especially thank:
Unwind - for his sophisticated answer including a full implementation of the "struct-method"
Christoph - for his good answer and pointing me to Namespaces in C
All others - for Your great input
If someone finds it appropriate to close this question (as no further insights to expect), he/she should feel free to do so - I can not decide this, as I'm no C guru.
I'm no C guru, but from the libraries I have used, it is quite common to use a prefix to separate functions.
For example, SDL will use SDL, OpenGL will use gl, etc...
The struct way that Ken mentions would look something like this:
struct MyCoolApi
{
int (*add)(int x, int y);
};
MyCoolApi * my_cool_api_initialize(void);
Then clients would do:
#include <stdio.h>
#include <stdlib.h>
#include "mycoolapi.h"
int main(void)
{
struct MyCoolApi *api;
if((api = my_cool_api_initialize()) != NULL)
{
int sum = api->add(3, 39);
printf("The cool API considers 3 + 39 to be %d\n", sum);
}
return EXIT_SUCCESS;
}
This still has "namespace-issues"; the struct name (called the "struct tag") needs to be unique, and you can't declare nested structs that are useful by themselves. It works well for collecting functions though, and is a technique you see quite often in C.
UPDATE: Here's how the implementation side could look, this was requested in a comment:
#include "mycoolapi.h"
/* Note: This does **not** pollute the global namespace,
* since the function is static.
*/
static int add(int x, int y)
{
return x + y;
}
struct MyCoolApi * my_cool_api_initialize(void)
{
/* Since we don't need to do anything at initialize,
* just keep a const struct ready and return it.
*/
static const struct MyCoolApi the_api = {
add
};
return &the_api;
}
It's a shame you got scared off by C++, as it has namespaces to deal with precisely this problem. In C, you are pretty much limited to using prefixes - you certainly can't "put api operations in a struct".
Edit: In response to your second question regarding allowing users to specify their own prefix, I would avoid it like the plague. 99.9% of users will be happy with whatever prefix you provide (assuming it isn't too silly) and will be very UNHAPPY at the hoops (macros, structs, whatever) they will have to jump through to satisfy the remaining 0.1%.
As a library user, you can easily define your own shortened namespaces via the preprocessor; the result will look a bit strange, but it works:
#define ns(NAME) my_cool_namespace_ ## NAME
makes it possible to write
ns(foo)(42)
instead of
my_cool_namespace_foo(42)
As a library author, you can provide shortened names as desribed here.
If you follow unwinds's advice and create an API structure, you should make the function pointers compile-time constants to make inlinig possible, ie in your .h file, use the follwoing code:
// canonical name
extern int my_cool_api_add(int x, int y);
// API structure
struct my_cool_api
{
int (*add)(int x, int y);
};
typedef const struct my_cool_api *MyCoolApi;
// define in header to make inlining possible
static MyCoolApi my_cool_api_initialize(void)
{
static const struct my_cool_api the_api = { my_cool_api_add };
return &the_api;
}
Unfortunately, there's no sure way to avoid name clashes in C. Since it lacks namespaces, you're left with prefixing the names of global functions and variables. Most libraries pick some short and "unique" prefix (unique is in quotes for obvious reasons), and hope that no clashes occur.
One thing to note is that most of the code of a library can be statically declared - meaning that it won't clash with similarly named functions in other files. But exported functions indeed have to be carefully prefixed.
Since you are exposing functions with the same name client cannot include your library header files along with other header files which have name collision. In this case you add the following in the header file before the function prototype and this wouldn't effect client usage as well.
#define add myuniquelibname_add
Please note this is a quick fix solution and should be the last option.
For a really huge example of the struct method, take a look at the Linux kernel; 30-odd million lines of C in that style.
Prefixes are only choice on C level.
On some platforms (that support separate namespaces for linkers, like Windows, OS X and some commercial unices, but not Linux and FreeBSD) you can workaround conflicts by stuffing code in a library, and only export the symbols from the library you really need. (and e.g. aliasing in the importlib in case there are conflicts in exported symbols)
In C you can have external static variables that are viewable every where in the file, while internal static variables are only visible in the function but is persistent
For example:
#include <stdio.h>
void foo_bar( void )
{
static counter = 0;
printf("counter is %d\n", counter);
counter++;
}
int main( void )
{
foo_bar();
foo_bar();
foo_bar();
return 0;
}
the output will be
counter is 0
counter is 1
counter is 2
My question is why would you use an internal static variable? If you don't want your static variable visible in the rest of the file shouldn't the function really be in its own file then?
This confusion usually comes about because the static keyword serves two purposes.
When used at file level, it controls the visibility of its object outside the compilation unit, not the duration of the object (visibility and duration are layman's terms I use during educational sessions, the ISO standard uses different terms which you may want to learn eventually, but I've found they confuse most beginning students).
Objects created at file level already have their duration decided by virtue of the fact that they're at file level. The static keyword then just makes them invisible to the linker.
When used inside functions, it controls duration, not visibility. Visibility is already decided since it's inside the function - it can't be seen outside the function. The static keyword in this case, causes the object to be created at the same time as file level objects.
Note that, technically, a function level static may not necessarily come into existence until the function is first called (and that may make sense for C++ with its constructors) but every C implementation I've ever used creates its function level statics at the same time as file level objects.
Also, whilst I'm using the word "object", I don't mean it in the sense of C++ objects (since this is a C question). It's just because static can apply to variables or functions at file level and I need an all-encompassing word to describe that.
Function level statics are still used quite a bit - they can cause trouble in multi-threaded programs if that's not catered for but, provided you know what you're doing (or you're not threading), they're the best way to preserve state across multiple function calls while still providing for encapsulation.
Even with threading, there are tricks you can do in the function (such as allocation of thread specific data within the function) to make it workable without exposing the function internals unnecessarily.
The only other choices I can think of are global variables and passing a "state variable" to the function each time.
In both these cases, you expose the inner workings of the function to its clients and make the function dependent on the good behavior of the client (always a risky assumption).
They are used to implement tools like strtok, and they cause problems with reentrancy...
Think carefully before fooling around with this tool, but there are times when they are appropriate.
For example, in C++, it is used as one way to get singleton istances
SingletonObject& getInstance()
{
static SingletonObject o;
return o;
}
which is used to solve the initialization order problem (although it's not thread-safe).
Ad "shouldn't the function be in its own file"
Certainly not, that's nonsense. Much of the point of programming languages is to facilitate isolation and therefore reuse of code (local variables, procedures, structures etc. all do that) and this is just another way to do that.
BTW, as others pointed out, almost every argument against global variables applies to static variables too, because they are in fact globals. But there are many cases when it's ok to use globals, and people do.
I find it handy for one-time, delayed, initialization:
int GetMagic()
{
static int magicV= -1;
if(-1 == magicV)
{
//do expensive, one-time initialization
magicV = {something here}
}
return magicV;
}
As others have said, this isn't thread-safe during it's very first invocation, but sometimes you can get away with it :)
I think that people generally stay away from internal static variables. I know strtok() uses one, or something like it, and because of that is probably the most hated function in the C library.
Other languages like C# don't even support it. I think the idea used to be that it was there to provide some semblance of encapsulation (if you can call it that) before the time of OO languages.
Probably not terribly useful in C, but they are used in C++ to guarantee the initialisation of namespace scoped statics. In both C and C++ there are problemns with their use in multi-threaded applications.
I wouldn't want the existence of a static variable to force me to put the function into its own file. What if I have a number of similar functions, each with their own static counter, that I wanted to put into one file? There are enough decisions we have to make about where to put things, without needing one more constraint.
Some use cases for static variables:
you can use it for counters and you won't pollute the global namespace.
you can protect variables using a function that gets the value as a pointer and returns the internal static. This whay you can control how the value is assigned. (use NULL when you just want to get the value)
I've never heard this specific construct termed "internal static variable." A fitting label, I suppose.
Like any construct, it has to be used knowledgeably and responsibly. You must know the ramifications of using the construct.
It keeps the variable declared at the most local scope without having to create a separate file for the function. It also prevents global variable declaration.
For example -
char *GetTempFileName()
{
static int i;
char *fileName = new char[1024];
memset(fileName, 0x00, sizeof(char) * 1024);
sprintf(fileName, "Temp%.05d.tmp\n", ++i);
return fileName;
}
VB.NET supports the same construct.
Public Function GetTempFileName() As String
Static i As Integer = 0
i += 1
Return String.Format("Temp{0}", i.ToString("00000"))
End Function
One ramification of this is that these functions are not reentrant nor thread safe.
Not anymore. I've seen or heard the results of function local static variables in multithreaded land, and it isn't pretty.
In writing code for a microcontroller I would use a local static variable to hold the value of a sub-state for a particular function. For instance if I had an I2C handler that was called every time main() ran then it would have its own internal state held in a static local variable. Then every time it was called it would check what state it was in and process I/O accordingly (push bits onto output pins, pull up a line, etc).
All statics are persistent and unprotected from simultaneous access, much like globals, and for that reason must be used with caution and prudence. However, there are certainly times when they come in handy, and they don't necessarily merit being in their own file.
I've used one in a fatal error logging function that gets patched to my target's error interrupt vectors, eg. div-by-zero. When this function gets called, interrupts are disabled, so threading is a non-issue. But re-entrancy could still happen if I caused a new error while in the process of logging the first error, like if the error string formatter broke. In that case, I'd have to take more drastic action.
void errorLog(...)
{
static int reentrant = 0;
if(reentrant)
{
// We somehow caused an error while logging a previous error.
// Bail out immediately!
hardwareReset();
}
// Leave ourselves a breadcrumb so we know we're already logging.
reentrant = 1;
// Format the error and put it in the log.
....
// Error successfully logged, time to reset.
hardwareReset();
}
This approach is checking against a very unlikely event, and it's only safe because interrupts are disabled. However, on an embedded target, the rule is "never hang." This approach guarantees (within reason) that the hardware eventually gets reset, one way or the other.
A simple use for this is that a function can know how many times it has been called.