C - program structure (avoiding global variables, includes, etc.) - c

I'm using C (not C++) and I'm unsure how to avoid using global variables.
I have a pretty decent grasp on C, its syntax, and how to write a basic application, but I'm not sure of the proper way to structure the program.
How do really big applications avoid the use of global variables? I'm pretty sure there will always need to be at least some, but for big games and other applications written in C, what is the best way to do it?
Is there any good, open-source software written strictly in C that I could look at? I can't think of any off the top of my head, most of them seem to be in C++.
Thanks.
Edit
Here's an example of where I would use a global variable in a simple API hooking application, which is just a DLL inside another process.
This application, specifically, hooks API functions used in another application. It does this by using WriteProcessMemory to overwrite the call to the original, and make it a call to my DLL instead.
However, when unhooking the API function, I have to write back the original memory/machine code.
So, I need to maintain a simple byte array for that machine code, one for each API function that is hooked, and there are a lot.
// Global variable to store original assembly code (6 bytes)
BYTE g_MessageBoxA[6];
// Hook the API function
HookAPIFunction ( "user32.dll", "MessageBoxA", MyNewFunction, g_MessageBoxA );
// Later on, unhook the function
UnHookAPIFunction ( "user32.dll", "MessageBoxA", g_MessageBoxA );
Sorry if that's confusing.

"How do really big applications avoid the use of global variables?"
Use static variables. If a function needs to remember something between calls, use a static local variable rather than a global. Example:
int running_total (int num) {
    static int sum = 0;
    sum += num;
    return sum;
}
Pass data via parameters, so that the value is defined in one place (perhaps main()) and passed to wherever it is needed.
If all else fails, go ahead and use a global, but try to mitigate potential problems.
At a minimum, use naming conventions to minimize potential conflicts, e.g. Gbl_MyApp_DeveloperName. All global variables would then start with Gbl_MyApp_, where "MyApp" is something descriptive of your app.
Try to group functions by purpose, so that everything that needs a given global is in the same file (within reason). Then the globals can be defined as static in that file and restricted to it (beware the extern keyword, which lets other files reach a non-static global).

There are some valid uses for global variables. The schools started teaching that they were evil to keep programmers from being lazy and overusing them. If you're sure that the data really is needed globally, then use them. Given that you are concerned about doing a good job, I don't think you'll go too far wrong using your own judgment.

It's easy - simply don't use them. Create what would have been the global variables in your main() function, and then pass them from there as parameters to the functions that need them.
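A minimal sketch of that approach, assuming a hypothetical AppState struct and helper names (none of them come from the question): the state lives in the caller, typically main(), and every function that needs it receives a pointer to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application state that might otherwise be a set of globals. */
typedef struct {
    int score;
    int lives;
} AppState;

/* Each function receives the state it works on explicitly. */
void app_init(AppState *app)
{
    assert(app != NULL);
    app->score = 0;
    app->lives = 3;
}

void app_add_points(AppState *app, int points)
{
    assert(app != NULL);
    app->score += points;
}

/* Returns the number of lives left after losing one (never below zero). */
int app_lose_life(AppState *app)
{
    assert(app != NULL);
    if (app->lives > 0) {
        app->lives--;
    }
    return app->lives;
}
```

In main() you would declare `AppState app;`, call `app_init(&app);`, and hand `&app` to whatever needs it. The cost is an extra parameter; the benefit is that every dependency on the state is visible in the function signatures.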

The open-source software most recommended for learning C at my college was Linux, so you can find examples in its source.

I think the same: globals are bad. They reduce readability and increase the chance of errors and other problems.
I'll explain my example. I have a main() that does very little. Most of the action comes from timers and external interrupts. On the platform I'm using, a 32-bit microcontroller, these handlers are functions with no parameters, so I can't pass anything in, and in addition the interrupts and timers are asynchronous. The easiest way is a global: imagine interrupts filling a buffer, the buffer being parsed inside a timer, and the parsed information being used, stored, and sent in other timers.
OK, I could raise a flag in the interrupt and do the processing in the main function, but for critical software that is not a good idea.
This is the only kind of global variable that doesn't make me feel bad ;-)

Global variables are almost inevitable. However, they are often overused. They are not a substitute for passing proper parameters and designing the right data structures.
Every time you want a global variable, think: what if I need one more like this?
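For the hooking example in the question, "one more like this" is exactly the situation: there are many hooked functions, and each needs its own saved bytes. One possible sketch keeps each hook's data in a record that is passed to the hook and unhook routines. The HookRecord struct and function names are hypothetical, plain unsigned char replaces the Windows BYTE type so the code compiles anywhere, and the memcpy calls merely stand in for the real WriteProcessMemory work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HOOK_PATCH_SIZE 6  /* bytes overwritten per hook, as in the question */

/* Hypothetical record: everything needed to undo one hook. */
typedef struct {
    const char   *module;                  /* e.g. "user32.dll" */
    const char   *function;                /* e.g. "MessageBoxA" */
    unsigned char saved[HOOK_PATCH_SIZE];  /* original machine code */
    int           active;
} HookRecord;

/* Stand-in for the real patching: save the original bytes, then overwrite. */
void hook_install(HookRecord *rec, unsigned char *target,
                  const unsigned char *patch)
{
    assert(rec != NULL && target != NULL && patch != NULL);
    memcpy(rec->saved, target, HOOK_PATCH_SIZE);
    memcpy(target, patch, HOOK_PATCH_SIZE);
    rec->active = 1;
}

/* Restore the bytes saved by hook_install, if the hook is still active. */
void hook_remove(HookRecord *rec, unsigned char *target)
{
    assert(rec != NULL && target != NULL);
    if (rec->active) {
        memcpy(target, rec->saved, HOOK_PATCH_SIZE);
        rec->active = 0;
    }
}
```

A single array of HookRecord, owned by the hooking module or by its initialization function, then replaces a separate global byte array per API function.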

Related

Do I really need accessor functions to access a global variable from another file?

In my code (game engine code) there are multiple source (.c) files which maintain the status of the game, status like
START
CONFIGURE
STOP
END
DEFAULT
RUNNING
For maintaining state, one global variable, gameStatus, is used, shared between multiple source files with the extern keyword. Now I have read that global variables are bad to use: they allow outside modules to change them, and as the number of components using a global variable increases, the complexity of the interactions can also increase.
So I have limited the scope of that variable to one file using the static keyword and added accessor functions (get and set APIs) in the same file. Other files now access the variable only through the accessor APIs.
I have removed the global variable, which is good, but now every other file that used it has to call the accessor APIs, which seems to add the overhead of function calls.
So now I am confused: which is better? Is there any C standard for sharing data efficiently between different source files?
The fact that global variables are "bad practice" is entirely opinion based and 100% dependent on the context. It is impossible to say whether you are applying such "bad practice" or not without looking at your code. Global variables are not bad practice per se, using them in the wrong way is. Global variables are often necessary in C. Take as an example the C standard library: errno is a global variable that is used basically everywhere in both library code and user code to check for errors. Is that bad practice? Could they have defined a function get_errno() instead (well to be honest they actually did it's just hidden... but that's for complex concurrency reasons)? I'll let you decide.
In your specific case, changing a globally visible variable to static and then creating two functions only to get and set its value is totally unnecessary. Any part of the code can still modify the variable, but now it's just more annoying to do so, and it could also lead to slower code if not optimized correctly. All in all, by creating those functions you just stripped the variable of the static qualifier.

C Handles - How to work with them?

I have in some documentation for a plugin for Dreamweaver I am making that says the following:
void **connectionData
• The connectionData argument is a
handle to the data that the agent
wants Dreamweaver to pass to it when
calling other API functions.
I have no other information than this from the manual in regard to connectionData. Thinking literally, I figured handle referred to a generic handle; however, I am not able to find documentation on working with generic handles in C.
HANDLE h = connectionData;
Does compile in my code. How exactly do I get the "secrets" inside this data structure/can someone explain how generic handles for C work?
Well, usually you are not supposed to get the secrets of handles; they are usually just a pointer to some internal structure inside the lib/API you are using and only the lib will know how to use it.
There are no generic rules about handles; you'll have to use them as indicated by your lib's docs.
The way that this is defined, connectionData is a pointer to a pointer to something. Without knowing what is assigned to connectionData, you can't know anything else. The reason why your other statement worked is that HANDLE is probably a typedef (or macro) for void*, and in C any object pointer, including a void**, converts implicitly to void*.
To know the "Secrets," you would need to find out what struct (this is a guess - it could actually be any data type) connectionData points to, then look at the definition of that struct. I don't know how familiar you are with programming in general but a debugger allows you to easily look at the struct's fields while paused at a breakpoint.
However, as other people have said, you probably don't want to muck with the internals of whatever this points to, and only use API calls.
C developers use a "handle" data type when they specifically want to hide the internal data and keep API users from monkeying around with the implementation. A handle is sometimes just a pointer, but can also be an index into an internal lookup table.
In general, you should only use provided API functions with a handle, and learn the proper way to get the handle, understand its life cycle and how to properly dispose of it when you're done.
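As an illustration only (this is not the Dreamweaver API, and every name here is hypothetical), here is a library-side sketch of how a `void **` handle parameter is commonly used: the library allocates its private structure, stores the pointer through the handle, and receives it back in later API calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Private to the library: callers never see this definition. */
typedef struct {
    int open;
    int calls;
} Connection;

/* The library stores its private data through the caller's handle. */
int lib_connect(void **connectionData)
{
    assert(connectionData != NULL);
    Connection *c = malloc(sizeof *c);
    if (c == NULL) {
        return -1;
    }
    c->open = 1;
    c->calls = 0;
    *connectionData = c;
    return 0;
}

/* Later calls get the same pointer back and convert it internally. */
int lib_ping(void *connectionData)
{
    Connection *c = connectionData;
    assert(c != NULL && c->open);
    return ++c->calls;
}

/* Disposal also goes through the API, ending the handle's life cycle. */
void lib_disconnect(void **connectionData)
{
    assert(connectionData != NULL);
    free(*connectionData);
    *connectionData = NULL;
}
```

From the caller's side the handle is just a `void *` that is obtained, passed along, and released; nothing in the calling code depends on the layout of Connection.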

Learning C coming from managed OO languages

I am fairly comfortable coding in languages like Java and C#, but I need to use C for a project (because of low level OS API calls) and I am having some difficulty dealing with pointers and memory management (as seen here)
Right now I am basically typing up code and feeding it to the compiler to see if it works. That just doesn't feel right for me. Can anyone point me to good resources for me to understand pointers and memory management, coming from managed languages?
k&r - http://en.wikipedia.org/wiki/The_C_Programming_Language_(book)
nuff said
One of the good resources you found already, SO.
Of course you are compiling with all warnings on, aren't you?
Learning by doing largely depends on the quality of your compiler and the warnings / errors it gives you. The best in that respect that I have found in the Linux / POSIX world is clang. It traces the origin of errors nicely and tells you about missing header files quite well.
Some tips:
By default, local variables are stored on the stack.
Variables are passed into functions by value.
Stick to the same process for allocating and freeing memory, e.g. allocate and free in the same function.
C's equivalent of
Integer i = new Integer();
i=5;
is
int *p;
p=malloc(sizeof(int));
*p=5;
Memory allocation (malloc) can fail, so check the pointer for NULL before you use it.
OS functions can fail and this can be detected by the return values.
Learn to use gdb to step through your code and print variable values (compile with -g to enable debugging symbols).
Use valgrind to check for memory leaks and other related problems (like heap corruption).
The C language doesn't do anything you don't explicitly tell it to do.
There are no destructors automatically called for you, which is both good and bad (since bugs in destructors can be a pain).
A simple way to get somewhat automatic destructor behavior is to use scoping to construct and destruct things. This can get ugly since nested scopes move things further and further to the right.
if ((var = malloc(SIZE)) != NULL) {  // try to keep this line
    use_var(var);
    free(var);  // and this line close and with easy to comprehend code between them
} else {
    error_action();
}
return;  // try to limit the number of return statements so that you can ensure resources
         // are freed for all code paths
Trying to make your code look like this as much as possible will help, though it's not always possible.
Making a set of macros or inline functions that initialize your objects is a good idea. Also make another set of functions that allocate your objects' memory and pass that to your initializer functions. This allows for both local and dynamically allocated objects to easily be initialized. Similar operations for destructor-like functions is also a good idea.
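A rough sketch of that split between allocation and initialization, using a hypothetical Buffer type and function names chosen only for illustration:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    char  *data;
    size_t size;
} Buffer;

/* Initializer: works on any Buffer, local or heap-allocated.
   Returns 0 on success, -1 if the inner allocation fails. */
int buffer_init(Buffer *b, size_t size)
{
    assert(b != NULL);
    b->data = malloc(size);
    b->size = (b->data != NULL) ? size : 0;
    return (b->data != NULL) ? 0 : -1;
}

/* Destructor-like counterpart of buffer_init. */
void buffer_fini(Buffer *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->size = 0;
}

/* Allocator: obtains the memory, then hands it to the initializer. */
Buffer *buffer_new(size_t size)
{
    Buffer *b = malloc(sizeof *b);
    if (b != NULL && buffer_init(b, size) != 0) {
        free(b);
        b = NULL;
    }
    return b;
}

/* Counterpart of buffer_new; accepts NULL like free does. */
void buffer_delete(Buffer *b)
{
    if (b != NULL) {
        buffer_fini(b);
        free(b);
    }
}
```

A local object pairs buffer_init with buffer_fini, while a dynamically allocated one pairs buffer_new with buffer_delete; both paths share the same initialization and cleanup code.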
Using OO techniques is good practice in many instances, and doing so in C just requires a little bit more typing (but allows for more control). Setters, getters, and other helper functions can help keep objects in consistent states and decrease the changes you have to make when you find an error, if you can keep the interface the same.
You should also look into the perror function and the errno "variable".
Usually you will want to avoid using anything like exceptions in C. I generally try to avoid them in C++ as well, and only use them for really bad errors -- ones that aren't supposed to happen. One of the main reasons for avoiding them is that there are no destructor calls magically made in C, so non-local GOTOs will often leak (or otherwise screw up) some type of resource. That being said, there are things in C which provide a similar functionality.
The main exception-like mechanism in C is the pair of functions setjmp and longjmp. setjmp is called from one location in code and passed an opaque variable (a jmp_buf) which can later be passed to longjmp. When a call to longjmp is made it doesn't actually return to its caller; instead, execution resumes as if the earlier setjmp call with that jmp_buf had just returned. In that case setjmp returns the value specified in the call to longjmp. Regular calls to setjmp return 0.
Other exception-like functionality is more platform-specific, but includes signals (which have their own gotchas).
Other things to look into are:
The assert macro, which can be used to cause program exit when the parameter (a logical test of some sort) fails. Calls to assert go away when you #define NDEBUG before you #include <assert.h>, so after testing you can easily remove the assertions. This is really good for testing for NULL pointers before dereferencing them, as well as several other conditions. If a condition fails assert attempts to print the source file name and line number of the failed test.
The abort function causes the program to exit with failure without doing all of the clean up that calling exit does. This may be done with a signal on some platforms. assert calls abort.

What methods are there to modularize C code?

What methods, practices and conventions do you know of to modularize C code as a project grows in size?
Create header files which contain ONLY what is necessary to use a module. In the corresponding .c file(s), make anything not meant to be visible outside (e.g. helper functions) static. Use prefixes on the names of everything externally visible to help avoid namespace collisions. (If a module spans multiple files, things become harder, as you may need to expose internal things and not be able to hide them with "static".)
(If I were to try to improve C, one thing I would do is make "static" the default scoping of functions. If you wanted something visible outside, you'd have to mark it with "export" or "global" or something similar.)
OO techniques can be applied to C code, they just require more discipline.
Use opaque handles to operate on objects. One good example of how this is done is the stdio library -- everything is organised around the opaque FILE* handle. Many successful libraries are organised around this principle (e.g. zlib, apr)
Because all members of structs are implicitly public in C, you need a convention + programmer discipline to enforce the useful technique of information hiding. Pick a simple, automatically checkable convention such as "private members end with '_'".
Interfaces can be implemented using arrays of pointers to functions. Certainly this requires more work than in languages like C++ that provide in-language support, but it can nevertheless be done in C.
The High and Low-Level C article contains a lot of good tips. Especially, take a look at the "Classes and objects" section.
Standards and Style for Coding in ANSI C also contains good advice of which you can pick and choose.
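The function-pointer interface idea mentioned above can be sketched as follows. This version uses a struct of function pointers, a common variant of the array described in the answer; the ShapeOps, Rect, and Square names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "interface": a table of function pointers. */
typedef struct {
    const char *name;
    int (*area)(const void *self);
} ShapeOps;

/* Every "class" keeps a pointer to its table as the first member. */
typedef struct { const ShapeOps *ops; int w, h; } Rect;
typedef struct { const ShapeOps *ops; int side; } Square;

static int rect_area(const void *self)
{
    const Rect *r = self;
    return r->w * r->h;
}

static int square_area(const void *self)
{
    const Square *s = self;
    return s->side * s->side;
}

static const ShapeOps RECT_OPS   = { "rect",   rect_area };
static const ShapeOps SQUARE_OPS = { "square", square_area };

/* Works on any object whose first member is a const ShapeOps pointer. */
int shape_area(const void *shape)
{
    const ShapeOps *const *ops = shape;
    assert(ops != NULL && *ops != NULL);
    return (*ops)->area(shape);
}
```

The dispatch relies on the C guarantee that a pointer to a struct, suitably converted, points to its first member, so the ops pointer must stay first in every implementing struct.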
1. Don't define variables in header files; instead, define the variable in the source file and add an extern statement (declaration) in the header. This will tie into #2 and #3.
2. Use an include guard on every header. This will save so many headaches.
3. Assuming you've done #1 and #2, include everything you need (but only what you need) for a certain file in that file. Don't depend on the order of how the compiler expands your include directives.
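The first two points can be sketched in one listing. The header and source are shown together here so the example compiles as a single file; in a real project they would be two files, and the names counter.h, counter_value, and counter_increment are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* ---- counter.h (hypothetical header) ---- */
#ifndef COUNTER_H
#define COUNTER_H

extern int counter_value;      /* declaration only: no storage here */
int counter_increment(void);   /* prototype for users of the module */

#endif /* COUNTER_H */

/* ---- counter.c (hypothetical source; would #include "counter.h") ---- */
int counter_value = 0;         /* the one and only definition */

int counter_increment(void)
{
    assert(counter_value < INT_MAX);  /* guard against signed overflow */
    return ++counter_value;
}
```

Because the header only declares the variable, including it from many source files creates no duplicate definitions, and the guard makes repeated inclusion harmless.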
The approach that Pidgin (formerly Gaim) uses is they created a Plugin struct. Each plugin populates a struct with callbacks for initialization and teardown, along with a bunch of other descriptive information. Pretty much everything except the struct is declared as static, so only the Plugin struct is exposed for linking.
Then, to handle loose coupling of the plugin communicating with the rest of the app (since it'd be nice if it did something between setup and teardown), they have a signaling system. Plugins can register callbacks to be called when specific signals (not standard C signals, but a custom extensible kind [identified by string, rather than set codes]) are issued by any part of the app (including another plugin). They can also issue signals themselves.
This seems to work well in practice - different plugins can build upon each other, but the coupling is fairly loose - no direct invocation of functions, everything's through the signaling system.
A function should do one thing and do this one thing well.
Lots of little functions used by bigger wrapper functions help to structure code from small, easy to understand (and test!) building blocks.
Create small modules with a couple of functions each. Only expose what you must, keep anything else static inside of the module. Link small modules together with their .h interface files.
Provide getter and setter functions for access to static file-scope variables in your module. That way, the variables are only actually written to in one place. This also helps in tracing access to these static variables, using a breakpoint in the function and the call stack.
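A minimal sketch of such a module, assuming a hypothetical GameStatus enum and accessor names:

```c
#include <assert.h>

/* Hypothetical module: game_status.c */
typedef enum {
    STATUS_DEFAULT,
    STATUS_START,
    STATUS_RUNNING,
    STATUS_STOP
} GameStatus;

/* File scope and static: invisible to other translation units. */
static GameStatus current_status = STATUS_DEFAULT;

GameStatus game_status_get(void)
{
    return current_status;
}

/* The only place the variable is written: a natural spot for a
   breakpoint, logging, or validation of the new value. */
void game_status_set(GameStatus status)
{
    assert((int)status >= STATUS_DEFAULT && (int)status <= STATUS_STOP);
    current_status = status;
}
```

Only the two prototypes and the enum would go into the module's header; the variable itself never appears there.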
One important rule when designing modular code is: Don't try to optimize unless you have to. Lots of small functions usually yield cleaner, well structured code and the additional function call overhead might be worth it.
I always try to keep variables at their narrowest scope, also within functions. For example, indices of for loops usually can be kept at block scope and don't need to be exposed at the entire function level. C is not as flexible as C++ with the "define it where you use it" but it's workable.
Breaking the code up into libraries of related functions is one way of keeping things organized. To avoid name conflicts you can also use prefixes to allow you to reuse function names, though with good names I've never really found this to be much of a problem. For example, if you wanted to develop your own math routines but still use some from the standard math library, you could prefix yours with some string: xyz_sin(), xyz_cos().
Generally I prefer the one function (or set of closely related functions) per file and one header file per source file convention. Breaking files into directories, where each directory represents a separate library is also a good idea. You'd generally have a system of makefiles or build files that would allow you to build all or part of the entire system following the hierarchy representing the various libraries/programs.
There are directories and files, but no namespaces or encapsulation. You can compile each module to a separate obj file, and link them together (as libraries).

Static vs global in terms of speed and space consumption in C

I would like to know difference between static variables and global variables in terms of access speed and space consumption. (If you want to know my platform: gcc compiler on Windows. (I am using Cygwin with Triton IDE for ARM7 embedded programming on windows. Triton comes with gcc compiler on Java platform which can be run on Windows.))
(Obviously I know in terms of file and function scope from this question)
Edit: OK give me an answer on any micro controller / processor environment.
There is no difference for the space, they take the same amount.
But there is a speed difference: static is faster.
Of course the memory access to the variable is the same for global and static. But the compiler can optimize when you have static. When it compiles a module it knows that no call to a function outside the module can change a static variable. So it knows exactly what happens and can, for example, keep the variable in a register across function calls. When the variable is global and you call a function from a different module, the compiler can't know what that function does. Hence it must assume that the function accesses the variable and changes it, resulting in a store and reload.
With gcc you can pass all .c sources at the same time, so that it can also see what happens in calls to functions from different modules. To make this work you have to pass, besides all .c files at once, -combine and -fwhole-program. -fwhole-program makes all globals static (not module static but compilation-unit static, i.e. across all the given .c files together), and -combine performs the intermodule analysis. (Note that -combine was removed in GCC 4.6; later versions provide link-time optimization via -flto for this purpose.)
Space consumption: basically no difference. The only time there'd be a space issue is if you manage to get the same chunk of static data hidden in N object files, then you get a multiplication factor of N where you might have just 1 copy if it was a single global piece of data. However, that's a mis-design issue. Information hiding is good - unless the information should not be hidden.
Access speed: no difference.
It's hard to guess, or to estimate. It would probably take time, but I would make a sample project and test for speed. Testing both access speed and space with a loop. Test the sample project with an emulator for that architecture.
I would expect any difference would come from packing (for space) and caching (for speed) issues. Both those could also arise from just about anything else as well.
There is no difference in the env you describe when it comes to space. The static or global var consume just the same amount of memory.
For speed considerations (but not good practice) you could prefer global vars, if you need access to the var outside the one file.
(See the use of extern, e.g. extern char my_global_char_placed_else_where;)
For better practice you use get/set functions instead, but they are slower. You could then use macros for get/set of a global var, to hide from the reader of the code that the var is in fact global, but that is kind of like cheating. It can make the code more readable, though.
If you compare hiding a var inside a function with placing it outside the function, there is no difference, except that outside the function more functions can have access to the var.
I myself use MSP430, ARM7(just for tests) and AVR32 micros for development
What Jonathan says is not exactly correct. Both static and global variables will be (have to be) stored in the ZI (or RW data) regions. The compiler can't strictly "keep" a variable in a register; what it might do is load the value into a register, use that register for all operations, and then store the value back. That's a compiler-specific optimization. And even then, there is no reason why the compiler won't also do that for global variables, unless of course you make them volatile. But then, technically you can also make a static variable volatile, so again no difference.
Edit : oh yeah - space : no difference.
