Good way to organize C source files? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The way I've always organized my C source was to put struct, macro and function prototypes in header files and function implementations in .c files. However, I have recently been reading a lot of other people's code for large projects, and I'm starting to see that people often define things like structs and macros in the C source itself, immediately above the functions that make use of them. I can see some benefit to this, as you don't have to go searching around to find the definitions of structs and macros used by particular functions; everything is right there in roughly the same place as the functions that use it. However, I can also see some disadvantages, as it means that there is no one central repository for struct/macro definitions; they're scattered through the source code.
My question is, what are some good rules of thumb for deciding when to put the macro/struct definition in the C source code as opposed to the header files themselves?

Typically, everything you put in the header file is part of the interface, while everything that you put in the source file is part of the implementation.
That is, if something in the header file is only ever used by the associated source file, it's an excellent candidate for moving to that source file. This prevents "polluting" the namespace of every file that uses your header with macros and types that they were never intended to use.
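As a rough illustration of that split, here is a minimal single-listing sketch. The module name counter, the function counter_add and the macro COUNTER_LIMIT are all made up for the example; in a real project the two halves would live in separate files.

```c
/* ---- counter.h: the interface (hypothetical module) ---- */
#ifndef COUNTER_H
#define COUNTER_H

/* Add n to the running total and return the new total. */
int counter_add(int n);

#endif /* COUNTER_H */

/* ---- counter.c: the implementation ---- */
/* #include "counter.h" would go here in a real multi-file layout. */

/* Private: only counter.c needs these, so they stay out of the header. */
#define COUNTER_LIMIT 1000

struct counter_state {
    int total;
};

static struct counter_state state = { 0 };

int counter_add(int n)
{
    state.total += n;
    if (state.total > COUNTER_LIMIT)
        state.total = COUNTER_LIMIT;   /* clamp at the private limit */
    return state.total;
}
```

Code that includes only counter.h never sees COUNTER_LIMIT or struct counter_state, so those names cannot clash with anything the client defines.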

Interfaces and implementations are what it's all about.
Addendum to the accepted answer: it can be useful to put an incomplete struct declaration in the header but put the definition only in the .c file. Now a pointer to that struct gives you a private type that you have complete control over. Very useful to guarantee separation of concerns; a bit like private members in C++.
For extensive examples, follow the link.
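A small sketch of the incomplete-struct idea, assuming a hypothetical stack module (none of these names come from the original answer). The header part only declares struct stack; the layout is known solely to the implementation part.

```c
#include <stdlib.h>

/* ---- stack.h (hypothetical): clients see only an incomplete type ---- */
struct stack;                                  /* declared, not defined */

struct stack *stack_create(void);
int  stack_push(struct stack *s, int value);   /* 0 on success, -1 if full  */
int  stack_pop(struct stack *s, int *out);     /* 0 on success, -1 if empty */
void stack_destroy(struct stack *s);

/* ---- stack.c: the only place that knows the layout ---- */
#define STACK_CAPACITY 16

struct stack {
    int items[STACK_CAPACITY];
    int count;
};

struct stack *stack_create(void)
{
    return calloc(1, sizeof(struct stack));    /* count starts at 0 */
}

int stack_push(struct stack *s, int value)
{
    if (s->count == STACK_CAPACITY)
        return -1;
    s->items[s->count++] = value;
    return 0;
}

int stack_pop(struct stack *s, int *out)
{
    if (s->count == 0)
        return -1;
    *out = s->items[--s->count];
    return 0;
}

void stack_destroy(struct stack *s)
{
    free(s);
}
```

A client that includes only the header part can hold and pass a struct stack *, but cannot write s->count or allocate one on its own stack, because the compiler does not know the size or the members.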

Put your public structures and interface into the .h file.
Put your private bits into the .c file.
If I have more than one .c file that implements a logical set of functionality, I'll put the things that need to be shared among those implementation files into a *p.h file ('p' for private). Client code should not include the *p.h header.
For example, if I have a set of routines that implement an XML parser, I might have the following organization:
xmlparser.h - the public structures, types, enums, and function prototypes
xmlparserp.h - private types, function prototypes, etc. that client code doesn't and shouldn't need
xmlparser.c - implementation of the XML parser
xmlutil.c - some other implementation bits (would include xmlparserp.h)
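To make that layout a little more concrete, here is a compressed sketch with all four files shown in one listing. The file names are the ones above; the function names xml_count_open_tags and xmlp_is_name_start are invented for illustration and are not part of any real XML library.

```c
#include <ctype.h>
#include <stddef.h>

/* ---- xmlparser.h: public interface (function name is hypothetical) ---- */
/* Count the opening tags in a NUL-terminated buffer. */
size_t xml_count_open_tags(const char *text);

/* ---- xmlparserp.h: private, shared by xmlparser.c and xmlutil.c ---- */
/* Returns nonzero if c may start a tag name. */
int xmlp_is_name_start(int c);

/* ---- xmlutil.c (would include xmlparserp.h) ---- */
int xmlp_is_name_start(int c)
{
    return isalpha((unsigned char)c) || c == '_';
}

/* ---- xmlparser.c (would include both headers) ---- */
size_t xml_count_open_tags(const char *text)
{
    size_t count = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        /* p[1] is at worst the terminating NUL, so this read is in bounds. */
        if (p[0] == '<' && xmlp_is_name_start(p[1]))
            ++count;
    }
    return count;
}
```

Note that xmlp_is_name_start cannot be static, because two .c files need it; keeping its prototype in the private header is what keeps it out of the client's view.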

Stuff defining the external interface to the module goes in the header.
Stuff just used within the module should stay in the C file.
Header files should include the headers required to support their own declarations.
Headers should wrap themselves in #ifndef NAME_H to ensure a single inclusion per compilation unit.
Modules should include their own headers to ensure consistency.
Lots of people don't know this basic stuff BTW, which is required to maintain your sanity on any sized C project.
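The rules above might look like this in practice. The module name buffer and its one function are assumptions made up for the sketch; header and source are shown in one listing.

```c
/* ---- buffer.h (hypothetical module) ---- */
#ifndef BUFFER_H      /* guard: contents are seen once per compilation unit */
#define BUFFER_H

#include <stddef.h>   /* the header pulls in what its own declarations need (size_t) */

struct buffer {
    char  *data;
    size_t len;
};

/* Number of bytes still free in a buffer of the given capacity. */
size_t buffer_space_left(const struct buffer *b, size_t capacity);

#endif /* BUFFER_H */

/* ---- buffer.c ---- */
/* #include "buffer.h" goes first here: if the prototype and the definition
 * ever drift apart, the compiler reports the conflict. */

size_t buffer_space_left(const struct buffer *b, size_t capacity)
{
    return b->len < capacity ? capacity - b->len : 0;
}
```

Because buffer.h includes stddef.h itself, a client can include buffer.h as its very first header without getting an error about size_t being undeclared.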

In the book "C style: Standards and Guidelines" by David Straker (available online here), there are some good ideas on file layout, and on the division between C files and headers.
See chapter 7, and especially section 7.4.
And as John Calsbeek said, you can base your organization on how the header parts are used.
If a structure, type, macro, etc. is only used by one source file, you can move it there.
You may have some header files for the prototypes and another header file for the common declarations (type definitions, etc.).

Mainly, the issue at hand is a sort of encapsulation consideration. You can think of a module's header file as not so much its "overhead", which is what you seem to be doing now, as its public interface declaration, which is presumably how the people whose code you're looking at are seeing it.
From there it follows naturally that what goes in the header file is things that other modules need to know about: prototypes for functions you anticipate being used externally, structs and macros used for interface purposes, extern variable declarations, and so on, while things that are strictly for the module's internal use go in the .c file.

If you define your structs and macros inside the .c file, you won't be able to use them from other .c files.
To do so, you have to put them in the .h file, so that #include tells the compiler where to find your structs and macros,
unless you #include "x.c", which you shouldn't do =)


Optimizing includes in C [closed]

Closed 6 years ago.
My project has several header files. Most of C source files include all of those, so nearly every source file contains the following lines:
#include "add.h"
#include "sub.h"
#include "mul.h"
#include "div.h"
#include "conv.h"
#include "comp.h"
// etc.
Should this be moved to some all.h or similar? Is there a better way to do this?
In each .c file it should be easy to know what includes it needs. Doing it your way, you could lose track of which includes each file needs, and a colleague would have to open all.h to find out.
In other words, you shouldn't do that because you lose clarity in your code.
There is also the third, somewhere-in-the-middle option, to have a minimum common.h for stuff which is shared by pretty much all your translation units (e.g. headers like stdint.h). Or perhaps you might decide to group add.h, sub.h and similar headers into a single ops.h header.
But having literally a single all.h header would make encapsulation/information hiding in C even harder than it is already. Does a math library in a spaceship microcontroller really need to know everything about life support systems?
Of course, C doesn't really have "first class encapsulation mechanisms", you can throw an extern function declaration pretty much everywhere, that's why it's rather important to stick to reasonable conventions and best practices.
There are pros and cons to both approaches (having one single all.h including all your headers, or not), so it is also a matter of opinion. And on a reasonably small project, having a single header file containing all your common declarations (and also definitions of static inline functions) is also possible.
See also this answer (which gives examples of both approaches).
A possible reason to have one single header (which includes other headers, even the system ones) is to enable pre-compilation of that header file. See this answer (and the links there).
A possible reason to favor several header files is readability and modularity. You might want to have one (rather small) header file per module, and in every translation unit (practically every .c file) you would include only the minimal set of header files (in the right order).
Remember that C99 & C11 (and even C++14) do not have any notion of modules; in other words, modules in C are only a matter of conventions & habits. And conventions and habits are really important in C programs, so look at what other people are doing in existing free software projects (e.g. on github or sourceforge).
Notice that preprocessing is the first phase of most C compilers. Read documentation about your C preprocessor.
You practically would use a build automation system like GNU make. You may want to automatically generate dependencies (e.g. like here).
For a single-person project (of a few tens of thousands of source lines at most), I personally prefer to have a single header file, but that is a matter of opinion and taste (so a lot of people disagree). BTW, refactoring your project (to several header files) when it becomes large enough is quite easy to do (but harder to design); you'll just copy and paste some chunks of code into several new header files.
For a project involving several developers, you may want to favor the idea that most files (header or code) have one single developer responsible for them (with other developers making occasional changes to it).
Notice that header inclusion and preprocessing is a textual operation. You might in theory even avoid having header files and copy and paste the same declarations in your .c files but that is very bad practice so you should not do that (in hand-written code). However, some projects are generating C code (some sort of metaprogramming), and their C code generator (some script, or some tool like bison) could emit the same declarations in several files.

C function headers location: .h or .c? [duplicate]

This question already has answers here:
Where to document functions in C or C++? [closed]
(10 answers)
Closed 8 years ago.
Suppose we have a function (only external ones are considered here) int foo(int a, char *b). Normally there will be a header comment that goes with it, documenting what the function does, what each parameter and the return value mean, etc. It'll probably be in doxygen format too. My habit is that such a header comment should go into the .h file, because that's where the interface is defined and the reader should have all the information in that place. But a lot of people keep such comments in the C file where the actual implementation goes. I've seen this in the Linux kernel code also. So was I wrong? Which would you prefer?
Although header files can be used in any which way, they are primarily a mechanism to enable external linkage.
You design an API that is meant for external consumption, and you put everything required to consume this API (constants, types, prototypes) in header file(s).
All the other stuff, that is a part of implementation, and doesn't need to be seen by external users, can go in the source files (if the usage is localized to one file), or private headers that can be shared between multiple files. The latter is another example of header files enabling external linkage, but for internal consumption.
The answer to this question is largely "it depends":
Depends on what? Who's reading the documentation, and how they access it.
If you're developing a program, then having the documentation inline with the implementation is probably OK, because anybody who wants to know about your program can access the source code and read about it. Your target audience is probably developers working on the program itself, so having the documentation in the C file, along with the bulk of the code they're working on, is a suitable approach.
If you're developing a library, the target audience changes (or you might have two target audiences). You still have the developers, who could make use of more detailed documentation as it relates to private implementation detail. You also have the users of the library, who only care about the interface that they're working with; from a code-browsing point of view, they typically only have access to the headers.
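For the library case, the interface documentation might sit in the header like this. The signature int foo(int a, char *b) is the one from the question; what the function actually does is purely hypothetical, chosen only so there is something to document.

```c
#include <string.h>

/* ---- foo.h: what a library user reads ---- */

/**
 * @brief Fill a buffer with a repeated marker character (hypothetical behaviour).
 *
 * @param a  Number of characters to write; negative values are treated as 0.
 * @param b  Destination buffer with room for at least a + 1 bytes.
 * @return   Number of characters written, excluding the terminating NUL.
 */
int foo(int a, char *b);

/* ---- foo.c: implementation notes stay with the code ---- */
int foo(int a, char *b)
{
    if (a < 0)
        a = 0;                 /* documented: negative counts mean "nothing" */
    memset(b, '*', (size_t)a);
    b[a] = '\0';
    return a;
}
```

A user who only receives foo.h and a compiled library still gets the full contract, while comments about how the function works internally remain next to the code that maintainers edit.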
I put them in the .h file by preference when it's my choice, if I have a .h file. If I just have a .c file, I will document the functions when they are defined, simply because if I just have a .c file, I'm probably still coding, and I want to change the documentation if I change the code.
I feel that documentation and declarations go together in a separate file in a finished C project. Documentation in the code breaks up the code and can be redundant.
If I'm contributing somewhere, I'll follow the established convention.

best practice for delivering a C API hiding internal functions [closed]

Closed 5 years ago.
I have written a C library which consists of a few .h files and .c files. I compile it as a .a static library.
I would like to expose only certain functions to the user and keep the rest as "obscure" as possible to make reverse engineering reasonably difficult.
Ideally my library would consist of:
1- one .h file with only the functions exposed to the user
2- myLibrary.a: as un-reversengineerable as possible
What are the best practices for that? Where should I look, is there a good tutorial/book somewhere?
More specifically:
for - 1
I already have all my .h and .c files working, and I would like to avoid changing them around, moving function declarations from .h to .c and running into potential circular-reference problems. Is that possible?
For instance is it a good idea to create a new .h file which I would use only for distributing with my .a? That .h would contain copies of the functions I want to expose and forward declarations of types I use. Is that a good idea?
for - 2
a) what gcc flags (or xcode) shall I be aware of (for stripping, not having debug symbols etc)
b) a good pointer to learn about how to do code obfuscation?
Any thought will help,
Thanks, baba
The usual practice is to make sure that every function and global variable that is for use only internal to some module is declared static in that module. That limits exposure of internal implementation details from a single module.
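A minimal sketch of that practice, using a made-up module called shape: the helper and the counter have internal linkage, and only the two shape_ functions are visible outside the file.

```c
/* ---- shape.c (hypothetical module) ---- */

/* Internal linkage: other translation units cannot refer to these names. */
static int call_count = 0;

static int square(int x)
{
    return x * x;
}

/* External linkage: the only symbols a client can link against. */
int shape_square_area(int side)
{
    ++call_count;
    return square(side);
}

int shape_calls(void)
{
    return call_count;
}
```

Another .c file in the same library could define its own static square or call_count without any conflict, since static names are not shared between translation units.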
If you need internal implementation details that cross between modules, but which are not for public consumption, then declare one or more .h files that are kept private and not delivered to end users. The names of objects defined in that way will still be visible to the linker (and to tools such as objdump and nm) but their detailed signatures will not be.
If you have data structures that are delivered to the end user, but which are opaque, then consider having the API deliver them as pointers to a struct that is declared but not defined in the public API .h file. That will preserve type safety, while concealing the implementation details. Naturally, the complete struct definition is in a private .h file.
With care, you can keep a partially documented publicly known struct that is a type-pun for the real definition but which only exposes the public members. This is more difficult to keep up to date, and if you do it, I would make certain that there are some strong test cases to validate that the public version is in fact equivalent to the private version in all ways that matter.
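One way to approach the validation mentioned above is to check the two layouts against each other, both at compile time and in a test. This is only a sketch: the struct names and members are invented, _Static_assert requires C11, and it checks offsets only, not every property that might matter for a given library.

```c
#include <stddef.h>

/* ---- widget.h: the documented, public view (hypothetical names) ---- */
struct widget_public {
    int id;
    int width;
};

/* ---- widget_private.h: the real layout, public members first ---- */
struct widget {
    int   id;
    int   width;
    void *internal_cache;   /* not exposed to clients */
};

/* Compile-time checks that the two views agree on the public members. */
_Static_assert(offsetof(struct widget, id) == offsetof(struct widget_public, id),
               "id offset differs between public and private structs");
_Static_assert(offsetof(struct widget, width) == offsetof(struct widget_public, width),
               "width offset differs between public and private structs");

/* The same comparison as a function, so a unit test can call it too. */
int widget_layout_matches(void)
{
    return offsetof(struct widget, id) == offsetof(struct widget_public, id)
        && offsetof(struct widget, width) == offsetof(struct widget_public, width)
        && sizeof(struct widget) >= sizeof(struct widget_public);
}
```

If someone reorders or inserts a member in the private struct, the build fails immediately instead of shipping a header that silently disagrees with the library.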
Naturally, use strip to remove the debug segments so that the internal details are not leaked that way.
There are tools out there that can obfuscate all the names that are intended to be only internal use. If run as part of the build process, you can work with an internal debug build that has sensible names for everything, and ship a build that has named all the internal functions and global variables with names that only a linker can love.
Finally, get used to the fact that anyone that can use your library will be able to reverse engineer your library to some extent. There are anti-debugger measures that can be taken, but IMHO that way lies madness and frustration.
I don't have a quick answer other than to explore the use of "static" functions. I would recommend reading Miro Samek's work on something he calls "C+". Basically object-oriented ANSI C. Great read. He owns Quantum Leaps.
Remove the headers for these functions, obfuscate the export table, pack your code, and apply some anti-debugger technique.
http://upx.sourceforge.net/
http://www.oreans.com/

What is a C header file? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
[C] Header per source file.
In C++ why have header files and cpp files?
C++ - What should go into an .h file?
Is the only reason header files exist in C so that a developer can quickly see what functions are available, and what arguments they can take? Or is it something to do with the compiler?
Why has no other language used this method? Is it just me, or does it seem that having 2 sets of function definitions will only lead to more maintenance and more room for errors? Or is knowing about header files just something every C developer must know?
Header files are needed to declare functions and variables that are available. You might not have access to the definitions (=the .c files) at all; C supports binary-only distribution of code in libraries.
The compiler needs the information in the header files to know what functions, structures, etc are available and how to use them.
All languages need this kind of information, although they retrieve it in different ways. For example, a Java compiler does this by scanning either the class file or the Java source code.
The drawback of the Java way is that the compiler potentially needs to hold much more information in memory to be able to do this. This is no big deal today, but in the seventies, when the C language was created, it was simply not possible to keep that much information in memory.
The main reason headers exist is to share declarations among multiple source files.
Say you have the function float *f(int a, int b) defined in the file a.c and reused in b.c and d.c. To allow the compiler to properly check arguments and return values, you either put the function prototype in a header file and include it in the .c source files, or you repeat the prototype in each source file.
Same goes for typedef etc.
While you could, in theory, repeat the same declaration in each source file, it would become a real nightmare to properly manage it.
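Sketched in a single listing, the shared-prototype arrangement could look like this. The prototype float *f(int a, int b) is the one from the text; the header name f.h, the body of f and the caller half_of are assumptions added for the example.

```c
/* ---- f.h: one shared declaration ---- */
#ifndef F_H
#define F_H
float *f(int a, int b);
#endif /* F_H */

/* ---- a.c: the definition (body is hypothetical) ---- */
float *f(int a, int b)
{
    static float result;            /* storage outlives the call */
    result = (float)a / (float)b;
    return &result;
}

/* ---- b.c: a caller that would see the prototype via #include "f.h" ---- */
float half_of(int n)
{
    /* With the prototype visible, a call such as f(n) with a missing
     * argument is rejected at compile time. */
    return *f(n, 2);
}
```

If the signature of f changes, only f.h and a.c need editing by hand; every caller that includes f.h is then checked against the new prototype on the next build.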
Some other languages use the same approach. I remember the Turbo Pascal units being not very different. You would put uses ... at the beginning to signal that you were going to require functions that were defined elsewhere. I can't remember if that was carried over into Delphi as well.
Know what is in a library at your disposal.
Split the program into bite-size chunks for the compiler. Compiling a megabyte of C files simultaneously will take more resources than most modern hardware can offer.
Reduce compiler load. Why should it know in screen display procedures about deep database engine? Let it learn only of functions it needs now.
Separate private and public data. This use isn't frequent but you may implement in C what C++ uses private fields for: each .c file includes two .h files, one with declarations of private stuff, the other with whatever others may require from the file. Less chance of a namespace conflict, safer due to hermetization.
Alternate configs. Makefile decides which header to use, and the same code may service two different platforms given two different header files.
probably more.

Code Ordering in Source Files - Forward Declarations vs "Don't Repeat Yourself"? [closed]

Closed 8 years ago.
If you code in C and configure your compiler to insist that all functions are declared before they are used (or if you code in C++), then you can end up with one of (at least) two organizations for your source files.
Either:
Headers
Forward declarations of (static) functions in this file
External functions (primary entry points)
Static - non-public - functions
Or:
Headers
Static - non-public - functions
External functions (primary entry points)
I recognize that in C++, the term 'static' is not preferred, but I'm primarily a C programmer and the equivalent concept exists in C++, namely functions in an anonymous namespace within the file.
Question:
Which organization do you use, and why do you prefer it?
For reference, my own code uses the second format so that the static functions are defined before they are used, so that there is no need to both declare them and define them, which saves on having the information about the function interfaces written out twice - which, in turn, reduces (marginally) the overhead when an internal interface needs to change. The downside to that is that the first functions defined in the file are the lowest-level routines - the ones that are called by functions defined later in the file - so rather than having the most important code at the top, it is nearer the bottom of the file. How much does it matter to you?
I assume that all externally accessible functions are declared in headers, and that this form of repetition is necessary - I don't think that should be controversial.
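For readers who want to see the two organizations side by side, here is a small sketch; the function names are invented and carry no meaning beyond the example. The first half follows the first layout (forward declaration, entry point, then helper), the second half follows the second layout (helper first, no forward declaration).

```c
/* Layout 1: forward declarations, entry points first, helpers last. */
static int clamp(int v, int lo, int hi);     /* forward declaration */

int percent(int part, int whole)             /* external entry point */
{
    if (whole == 0)
        return 0;
    return clamp(part * 100 / whole, 0, 100);
}

static int clamp(int v, int lo, int hi)      /* helper defined after use */
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Layout 2: helpers first, so no forward declaration is needed. */
static int is_vowel(char c)
{
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

int count_vowels(const char *s)              /* external entry point */
{
    int n = 0;
    for (; *s != '\0'; ++s)
        n += is_vowel(*s);
    return n;
}
```

In the first half the signature of clamp is written twice and must be kept in sync; in the second half it is written once, at the cost of the reader meeting is_vowel before the function that gives it a purpose.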
I've always used method #1, the reason being that I like to be able to quickly tell which functions are defined in a particular file and see their signatures all in one place. I don't find the argument of having to change the prototypes along with the function definition particularly convincing since you usually wind up changing all the code that calls the changed functions anyway, changing the function prototypes while you are at it seems relatively trivial.
In C code I use a simple rule:
Every C file with non-static members will have a corresponding header file defining those members.
This has worked really well for me in the past - makes it easy enough to find the definition of a function because it's in the same-named .h file if I need to look it up. It also works well with doxygen (my preferred tool) because all the cruft is kept in the header where I don't spend most of my time - the C file is full of code.
For static members in a file, I insist on ordering them in such a way that each one is defined before it is used anyway. And I avoid circular dependencies in function calls almost all of the time.
For C++ code I tried the following:
All code defined in the header file. Use #pragma interface/#pragma implementation to inform the compiler of that; kind of the same way templates put all the code in the header.
That's worked really well for me in C++. It means you end up with HUGE header files which can increase compile time in some cases. You also end up with a C++ body file where you simply include the header and compile. You can instantiate your static member variables here. It also became a nightmare because it was far too easy to change your method params and break your code.
I moved to
Header file with doxygen comments (except for templates, where code must be included in the header) and full body file, except for short methods which I know I'd prefer be inlined when used.
Separating out implementation from definition has the distinct plus that it's harder to change your method/function signatures so you're less likely to do it and break things. It also means that I can have huge doxygen blocks in the header file documenting how things work and work in the code relatively interruption free except for useful comments like "declare a variable called i" (tongue in cheek).
Ada forces the convention and the file naming scheme on you. Most dynamic languages like Ruby, Python, etc don't generally care where/if you declare things.
Number 2: because I write many short functions and refactor them freely, it'd be a significant nuisance to maintain forward declarations. If there's an Emacs extension that does that for you with no fuss, I'd be interested, since the top-down organization is a bit more readable. (I prefer top-down in e.g. Python.)
Actually not quite your Number 2, because I generally group related functions together in the .c regardless of whether they're public or private. If I want to see all the public declarations I'll look in the header.
Number 2 for me.
I think using static or other methods to make your module functions and variables private to the module is a good practice.
I prefer to have my API functions at the bottom of the module. Conversely, I put the API functions at the top of my classes, as classes are generally reusable. Putting the API functions at the top makes it easier to find them quickly. Most IDEs can take you to any function pretty directly.
(Talking about C code)
Number 2 for me because I always forget to update forward decls to reflect static functions changes.
But I think that the best practice should be
headers
forward declarations + comment on function behaviour for each one
exported functions + eventual comments about implementation details when code is not clear enough
static functions + eventual comments about implementation details
How much does it matter to you?
It doesn't.
It is important that all local functions are marked static, but in my opinion, prescribing how to group functions within the file is going too far. There is no strong argument for either version, and I have never found any strong disadvantage to either.
In general, coding conventions are very important and we try to define as much as possible, but in this case my feeling is that this is unjustified overhead.
After reading all the posts again, it seems I should simply have upvoted Darius's answer (which I did) instead of writing all of this ...
