Recommendations for 'C' Project Architecture Guidelines? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
Now that I got my head wrapped around the 'C' language to a point where I feel proficient enough to write clean code, I'd like to focus my attention on project architecture guidelines. I'm looking for a good resource that coves the following topics:
How to create an interface that promotes code maintainability and is extensible for future upgrades.
Library creation guidelines. Example, when should I consider using static vs dynamic libraries. How to properly design an ABI to cope with either one.
Header files: what to partition out and when. Examples on when to use 1:1 vs 1:many .h to .c
Anything you feel I missed but is important when attempting to architect a new C project.
Ideally, I'd like to see some example projects ranging from small to large and see how the architecture changes depending on project size, function or customer.
What resource(s) would you recommend for such topics?

Whenever I got serious writing C code, I had to emulate C++ features in it. The main stuff worth doing is:
Think of each module like a class. The functions you expose in the header are like public methods. Only put a function in the header if it part of the module's needed interface.
Avoid circular module dependencies. Module A and module B should not call each other. You can refactor something into a module C to avoid that.
Again, following the C++ pattern, if you have a module that can perform the same operations on different instances of data, have a create and delete function in your interface that will return a pointer to struct that is passed back to other functions. But for the sake of encapsulation, return a void pointer in the public interface and cast to your struct inside of the module.
Avoid module-scope variables--the previously described pattern will usually do what you need. But if you really need module-scope variables, group them under a struct stored in a single module-scope variable called "m" or something consistent. Then in your code whenever you see "m.variable" you will know at a glance it is one of the module-scope structs.
To avoid header trouble, put #ifndef MY_HEADER_H #define MY_HEADER_H declaration that protects against double including. The header .h file for your module, should only contain #includes needed FOR THAT HEADER FILE. The module .c file can have more includes needed for the compiling the module, but don't add those includes into the module header file. This will save you from a lot of namespace conflicts and order-of-include problems.

The truths about architecting systems are timeless, and they cross language boundaries. Here is a little advice focused on C:
Every module hides a secret. Build your system around interfaces that hide information from their clients. C's only type-safe information-hiding construct is the pointer to an incomplete structure. Learn it thoroughly and use it often.
One interface to an implementation is a good rule of thumb. (Interface is .h, implementation is .c.) Sometimes you will want to provide two interfaces that relate to the same implementation: one that hides the representation and one that exposes it.
You'll need naming conventions.
A superb model of how to handle these kinds of problems in C is Dave Hanson's C Interfaces and Implementations. In it you will get to see how to design good interfaces and implementations, and how one interface can build on another to form a coherent library. You will also get an excellent set of starter interfaces you can use in your own applications. For someone in your position, I cannot recommend this book too highly. It is an archetype of well-architected systems in C.

Namespace cleanliness - especially important with libraries. Prefix your public functions with the name of the library, in some fashion. If your library was called 'happyland', make functions such as "happyland_init" or even "hl_init".
This goes for the use of static. You will write functions which are specialized - hide them from other modules by using static liberally.
For libraries, reentrant is also critical. Do not depend on global state which is not compartmentalized (you can have a typedef struct token to keep this state if required).

Separate presentation code from logic. That is very important.
Static if they're used only in your project or in few binaries, dynamic when used many times (saves very much space).
Whenever code is used more than once, split it off into a header.
Those are a few of my known tips I can give you.

How to create an interface that
promotes code maintainability and is
extensible for future upgrades.
By exposing as few implementation details as possible. For example,
Use opaque pointers.
If you need a "private" function, declare it static, and don't put it in the .h file.

Related

Tips/resources for structuring C code? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
Does anyone have tips/resources for how to, in the best way, structure your C code projects? (Different folders etc.) And how do you know when it's good to split code into separate files? And what is an example of a good Makefile?
My project is not that big, but I wanna start to structure my code at an early stage..
Structuring code needs some experience but mostly common sense.
For splitting code, you usually go for readability: conceptually coherent functions/datatypes should go in the same file. You can take c standard library as a good example. It is better to keep your data structure definitions and function declarations in separate headers. This allows you to use the data structures as part of a compilation unit even if you have not defined all the functions.
Files that provide similar functionality should go in the same directory. It is good to avoid deep directory structure (1 level deep is best) as that complicates building the project unnecessarily.
I think Makefiles are OK for small projects, but become unwieldy for bigger ones. For really serious work (if you want to distribute your code, create an installer etc) you may want to look at cmake, scons, etc.
Have a look at the GNU coding standards: http://www.gnu.org/prep/standards/standards.html
Look at the gnu make manual for a simple example Makefile. You can also pick up any opensource project and look at the Makefile. Browsing code repositories in sourceforge.net may be useful.
Read one of the many C coding standards available on the internet and follow one that looks reasonable for your requirements. A few links:
GNU Coding Standards
C Coding Standards at IRAM (pdf)
Indian Hill C Style and Coding Standards
The following books also contain effective guidelines on writing good C code:
The C Programming Language
The Practice of Programming
The Elements of Programming Style
This is sometimes overlooked, but security is an issue in big projects. Here's some advice about how to program securely.
Here is an idiom I like:
Declare structs in a header so that their size is known by client code. Then declare init and deinit functions to the following convention:
The first parameter is a struct foo*.
The return type is a struct foo*.
If they might fail, the last parameter is either int* (simplest), enum foo_error* (if there are several ways it can fail that the calling code might care about) or GError** (if you're writing GLib-style code).
foo_init() and foo_deinit() return NULL if the first parameter is NULL. They also return the first parameter.
Why do it this way? Calling code doesn't have to allocate heap space for the structure, it can go on the stack. If you are allocating it on the heap, though, the following works nicely:
struct foo* a_foo = foo_init(malloc(sizeof(*a_foo)));
if (a_foo == NULL) {
/* Ruh-oh, allocation failure... */
}
free(foo_deinit(a_foo));
Everything works nicely even if a_foo == NULL when foo_deinit is called.

best practice for delivering a C API hiding internal functions [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I have written a C library which consists in a few .h files and .c files. I compile it as a .a static library.
I would like to expose only certain functions to the user and keep the rest as "obscure" as possible to make reverse engineering reasonably difficult.
Ideally my library would consist of:
1- one .h file with only the functions exposed to the user
2- myLibrary.a: as un-reversengineerable as possible
What are the best practices for that? Where should I look, is there a good tutorial/book somewhere?
More specifically:
for - 1
I already have all my .h and .c working and I would like to avoid changing them around, moving function declarations from .h to .c and go into circular references potential pbs. Is That possible?
For instance is it a good idea to create a new .h file which I would use only for distributing with my .a? That .h would contain copies of the functions I want to expose and forward declarations of types I use. Is that a good idea?
for - 2
a) what gcc flags (or xcode) shall I be aware of (for stripping, not having debug symbols etc)
b) a good pointer to learn about how to do code obfuscation?
Any thought will help,
Thanks, baba
The usual practice is to make sure that every function and global variable that is for use only internal to some module is declared static in that module. That limits exposure of internal implementation details from a single module.
If you need internal implementation details that cross between modules, but which are not for public consumption, then declare one or more .h files that are kept private and not delivered to end users. The names of objects defined in that way will still be visible to the linker (and to tools such as objdump and nm) but their detailed signatures will not be.
If you have data structures that are delivered to the end user, but which are opaque, then consider having the API deliver them as pointers to a struct that is declared by not defined in the public API .h file. That will preserve type safety, while concealing the implementation details. Naturally, the complete struct definition is in a private .h file.
With care, you can keep a partially documented publicly known struct that is a type-pun for the real definition but which only exposes the public members. This is more difficult to keep up to date, and if you do it, I would make certain that there are some strong test cases to validate that the public version is in fact equivalent to the private version in all ways that matter.
Naturally, use strip to remove the debug segments so that the internal details are not leaked that way.
There are tools out there that can obfuscate all the names that are intended to be only internal use. If run as part of the build process, you can work with an internal debug build that has sensible names for everything, and ship a build that has named all the internal functions and global variables with names that only a linker can love.
Finally, get used to the fact that anyone that can use your library will be able to reverse engineer your library to some extent. There are anti-debugger measures that can be taken, but IMHO that way lies madness and frustration.
I don't have a quick answer other than to explore the use of "static" functions. I would recommend reading Miro Samek's work on something he calls "C+". Basically object oriented ANSI C. Great read. He owns Quantum leaps software.
Erase headers for this functions, do some obfuscation in exports table and get pack your code and apply some anti debugger algorithm.
http://upx.sourceforge.net/
http://www.oreans.com/

Good way to organize C source files? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
The way I've always organized my C source was to put struct, macro and function prototypes in header files and function implementations in .c files. However, I have recently been reading alot of other peoples code for large projects and I'm starting to see that people often define things like structs and macros in the C source itself, immediately above the functions that make use of it. I can see some benefit to this as you don't have to go searching around to find the definition of structs and macros used by particular functions, everything is right there in roughly the same place as the functions that use it. However I can also see some disadvantages to it as it means that there is not one central repository for struct/macro definitions as they're scattered through the sourcecode.
My question is, what are some good rules of thumb for deciding when to put the macro/struct definition in the C source code as opposed to the header files themselves?
Typically, everything you put in the header file is part of the interface, while everything that you put in the source file is part of the implementation.
That is, if something in the header file is only ever used by the associated source file, it's an excellent candidate for moving to that source file. This prevents "polluting" the namespace of every file that uses your header with macros and types that they were never intended to use.
Interfaces and implementations is what it's all about.
Addendum to the accepted answer: it can be useful to put an incomplete struct declaration in the header but put the definition only in the .c file. Now a pointer to that struct gives you a private type that you have complete control over. Very useful to guarantee separation of concerns; a bit like private members in C++.
For extensive examples, follow the link.
Put your public structures and interface into the .h file.
Put your private bits into the .c file.
If I have more than one .c file that implements a logical set of functionality, I'll put the things that need to be shared among those implementation files into a *p.h file ('p' for private). Client code should not include the *p.h header.
For example, if I have a set of routines that implement an XML parser, I might have the following organization:
xmlparser.h - the public structures, types, enums, and function prototypes
xmlparserp.h - private types, function prototypes, etc. that client code
doesn't and shouldn't need
xmlparser.c - implementation of the XML parser
xmlutil.c - some other implementation bits (would include xmlparserp.h)
stuff defining the external interface to the module goes in the header.
stuff just used within the module should stay in the C file.
header files should include required headers to support its declarations
headers should wrap themselves in #ifndef NAME_H to ensure a single inclusion per compilation unit
modules should include their own headers to ensure consistency
Lots of people don't know this basic stuff BTW, which is required to maintain your sanity on any sized C project.
In the book "C style: Standards and Guidelines" from David Straker (available online here), there are some good ideas on file layout, and on the division between C file and headers.
You may read the chapter 7 and specially chapter 7.4.
And as John Calsbeek said, you can based your organization on how the header parts are used.
If one structure, type, macro, ... is only used by one source, you can move your code there.
You may have header files for the prototypes and some header file for the common declarations (type definitions, etc...)
Mainly, the issue at hand is a sort of encapsulation consideration. You can think of a module's header file as not so much its "overhead", which is what you seem to be doing now, as its public interface declaration, which is presumably how the people whose code you're looking at are seeing it.
From there it follows naturally that what goes in the header file is things that other modules need to know about: prototypes for functions you anticipate being used externally, structs and macros used for interface purposes, extern variable declarations, and so on, while things that are strictly for the module's internal use go in the .c file.
If you define your structs and macros inside the .c, you won't be able to use it from other .c files
To do so, you have to put it in the .h so that #include tells the compiler where to check for your structs and macros
unless you #include "x.c", which you shouldn't do =)

How should I structure complex projects in C? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I have little more than beginner-level C skills and would like to know if there are any de facto "standards" to structure a somewhat complex application in C. Even GUI based ones.
I have been always using the OO paradigm in Java and PHP and now that I want to learn C I'm afraid that I might structure my applications in the wrong way. I'm at a loss on which guidelines to follow to have modularity, decoupling and dryness with a procedural language.
Do you have any readings to suggest? I couldn't find any application framework for C, even if I don't use frameworks I've always found nice ideas by browsing their code.
The key is modularity. This is easier to design, implement, compile and maintain.
Identify modules in your app, like classes in an OO app.
Separate interface and implementation for each module, put in interface only what is needed by other modules. Remember that there is no namespace in C, so you have to make everything in your interfaces unique (e.g., with a prefix).
Hide global variables in implementation and use accessor functions for read/write.
Don't think in terms of inheritance, but in terms of composition. As a general rule, don't try to mimic C++ in C, this would be very difficult to read and maintain.
If you have time for learning, take a look at how an Ada app is structured, with its mandatory package (module interface) and package body (module implementation).
This is for coding.
For maintaining (remember that you code once, but you maintain several times) I suggest to document your code; Doxygen is a nice choice for me. I suggest also to build a strong regression test suite, which allows you to refactor.
It's a common misconception that OO techniques can't be applied in C. Most can -- it's just that they are slightly more unwieldy than in languages with syntax dedicated to the job.
One of the foundations of robust system design is the encapsulation of an implementation behind an interface. FILE* and the functions that work with it (fopen(), fread() etc.) is a good example of how encapsulation can be applied in C to establish interfaces. (Of course, since C lacks access specifiers you can't enforce that no-one peeks inside a struct FILE, but only a masochist would do so.)
If necessary, polymorphic behaviour can be had in C using tables of function pointers. Yes, the syntax is ugly but the effect is the same as virtual functions:
struct IAnimal {
int (*eat)(int food);
int (*sleep)(int secs);
};
/* "Subclass"/"implement" IAnimal, relying on C's guaranteed equivalence
* of memory layouts */
struct Cat {
struct IAnimal _base;
int (*meow)(void);
};
int cat_eat(int food) { ... }
int cat_sleep(int secs) { ... }
int cat_meow(void) { ... }
/* "Constructor" */
struct Cat* CreateACat(void) {
struct Cat* x = (struct Cat*) malloc(sizeof (struct Cat));
x->_base.eat = cat_eat;
x->_base.sleep = cat_sleep;
x->meow = cat_meow;
return x;
}
struct IAnimal* pa = CreateACat();
pa->eat(42); /* Calls cat_eat() */
((struct Cat*) pa)->meow(); /* "Downcast" */
All good answers.
I would only add "minimize data structure". This might even be easier in C, because if C++ is "C with classes", OOP is trying to encourage you to take every noun / verb in your head and turn it into a class / method. That can be very wasteful.
For example, suppose you have an array of temperature readings at points in time, and you want to display them as a line-chart in Windows. Windows has a PAINT message, and when you receive it, you can loop through the array doing LineTo functions, scaling the data as you go to convert it to pixel coordinates.
What I have seen entirely too many times is, since the chart consists of points and lines, people will build up a data structure consisting of point objects and line objects, each capable of DrawMyself, and then make that persistent, on the theory that that is somehow "more efficient", or that they might, just maybe, have to be able to mouse over parts of the chart and display the data numerically, so they build methods into the objects to deal with that, and that, of course, involves creating and deleting even more objects.
So you end up with a huge amount of code that is oh-so-readable and merely spends 90% of it's time managing objects.
All of this gets done in the name of "good programming practice" and "efficiency".
At least in C the simple, efficient way will be more obvious, and the temptation to build pyramids less strong.
The GNU coding standards have evolved over a couple of decades. It'd be a good idea to read them, even if you don't follow them to the letter. Thinking about the points raised in them gives you a firmer basis on how to structure your own code.
If you know how to structure your code in Java or C++, then you can follow the same principles with C code. The only difference is that you don't have the compiler at your side and you need to do everything extra carefully manually.
Since there are no packages and classes, you need to start by carefully designing your modules. The most common approach is to create a separate source folder for each module. You need to rely on naming conventions for differentiating code between different modules. For example prefix all functions with the name of the module.
You can't have classes with C, but you can easily implement "Abstract Data Types". You create a .C and .H file for every abstract data type. If you prefer you can have two header files, one public and one private. The idea is that all structures, constants and functions that need to be exported go to the public header file.
Your tools are also very important. A useful tool for C is lint, which can help you find bad smells in your code. Another tool you can use is Doxygen, which can help you generate documentation.
Encapsulation is always key to a successful development, regardless of the development language.
A trick I've used to help encapsulate "private" methods in C is to not include their prototypes in the ".h" file.
I'd suggets you to check out the code of any popular open source C project, like... hmm... Linux kernel, or Git; and see how they organize it.
The number rule for complex application: it should be easy to read.
To make complex application simplier, I employ Divide and conquer.
I would suggest reading a C/C++ textbook as a first step. For example, C Primer Plus is a good reference. Looking through the examples would give you and idea on how to map your java OO to a more procedural language like C.

Code Ordering in Source Files - Forward Declarations vs "Don't Repeat Yourself"? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
If you code in C and configure your compiler to insist that all functions are declared before they are used (or if you code in C++), then you can end up with one of (at least) two organizations for your source files.
Either:
Headers
Forward declarations of (static) functions in this file
External functions (primary entry points)
Static - non-public - functions
Or:
Headers
Static - non-public - functions
External functions (primary entry points)
I recognize that in C++, the term 'static' is not preferred, but I'm primarily a C programmer and the equivalent concept exists in C++, namely functions in an anonymous namespace within the file.
Question:
Which organization do you use, and why do you prefer it?
For reference, my own code uses the second format so that the static functions are defined before they are used, so that there is no need to both declare them and define them, which saves on having the information about the function interfaces written out twice - which, in turn, reduces (marginally) the overhead when an internal interface needs to change. The downside to that is that the first functions defined in the file are the lowest-level routines - the ones that are called by functions defined later in the file - so rather than having the most important code at the top, it is nearer the bottom of the file. How much does it matter to you?
I assume that all externally accessible functions are declared in headers, and that this form of repetition is necessary - I don't think that should be controversial.
I've always used method #1, the reason being that I like to be able to quickly tell which functions are defined in a particular file and see their signatures all in one place. I don't find the argument of having to change the prototypes along with the function definition particularly convincing since you usually wind up changing all the code that calls the changed functions anyway, changing the function prototypes while you are at it seems relatively trivial.
In C code I use a simple rule:
Every C file with non-static members will have a corresponding header file defining those members.
This has worked really well for me in the past - makes it easy enough to find the definition of a function because it's in the same-named .h file if I need to look it up. It also works well with doxygen (my preferred tool) because all the cruft is kept in the header where I don't spend most of my time - the C file is full of code.
For static members in a file I insist in ordering the declarations in such a way that they are defined by instantiation before use anyway. And, I avoid circular dependency in function calls almost all of the time.
For C++ code I tried the following:
All code defined in the header file. Use #pragma interface/#pragma implementation to inform the compiler of that; kind of the same way templates put all the code in the header.
That's worked really well for me in C++. It means you end up with HUGE header files which can increase compile time in some cases. You also end up with a C++ body file where you simply include the header and compile. You can instantiate your static member variables here. It also became a nightmare because it was far too easy to change your method params and break your code.
I moved to
Header file with doxygen comments (except for templates, where code must be included in the header) and full body file, except for short methods which I know I'd prefer be inlined when used.
Separating out implementation from definition has the distinct plus that it's harder to change your method/function signatures so you're less likely to do it and break things. It also means that I can have huge doxygen blocks in the header file documenting how things work and work in the code relatively interruption free except for useful comments like "declare a variable called i" (tongue in cheek).
Ada forces the convention and the file naming scheme on you. Most dynamic languages like Ruby, Python, etc don't generally care where/if you declare things.
Number 2: because I write many short functions and refactor them freely, it'd be a significant nuisance to maintain forward declarations. If there's an Emacs extension that does that for you with no fuss, I'd be interested, since the top-down organization is a bit more readable. (I prefer top-down in e.g. Python.)
Actually not quite your Number 2, because I generally group related functions together in the .c regardless of whether they're public or private. If I want to see all the public declarations I'll look in the header.
Number 2 for me.
I think using static or other methods to make your module functions and variables private to the module is a good practice.
I prefer to have my api functions at the bottom of the module. Conversely I put the api functions at the top of my classes as classes are generally reusable. Putting the api functions at the top make it easier to find them quickly. Most IDEs, can take you to any function pretty directly.
(Talking about C code)
Number 2 for me because I always forget to update forward decls to reflect static functions changes.
But I think that the best practice should be
headers
forward declarations + comment on function behaviour for each one
exported functions + eventual comments about implementation details when code is not clear enough
static functions + eventual comments about implementation details
How much does it matter to you?
It's not.
It is important that all local function will be marked as static, but for my opinion defining how to group function in the file is too much. There is no strong reasoning for any version and i don't find any strong disadvantage ever.
In general coding convention is very important and we trying to define as much as possible, but in this case my feeling, that this is unjustified overhead.
After reading all posts again it seems like i should simply upvote (which i did) Darius answer, instead writing all of these ...

Resources