Where to document functions in C or C++? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I have a C program with multiple files, so I have, for example, stuff.c which implements a few functions, and stuff.h with the function prototypes.
How should I go about documenting the functions in comments?
Should I have all the docs in the header file, all the docs in the .c file, or duplicate the docs for both? I like the latter approach, but then I run into problems where I'll update the docs on one of them and not the other (usually the one where I make the first modification, i.e. if I modify the header file first, then its comments will reflect that, but if I update the implementation, only those comments will change).
This question and its answers also apply to C++ code — see also Where should I put documentation comments?

Put the information that people using the functions need to know in the header.
Put the information that maintainers of the functions need to know in the source code.
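As a rough sketch of that split, using the stuff.h / stuff.c pair from the question (the function stuff_clamp is made up for illustration, and both files are shown in one listing for brevity):

```c
/* ---- stuff.h: what callers need to know ---- */
#ifndef STUFF_H
#define STUFF_H

/*
 * Returns value limited to the range [lo, hi].
 * The caller must ensure lo <= hi.
 * (stuff_clamp is a hypothetical function used only for this sketch.)
 */
int stuff_clamp(int value, int lo, int hi);

#endif /* STUFF_H */

/* ---- stuff.c: what maintainers need to know ---- */
#include <assert.h>
/* A real stuff.c would #include "stuff.h" here. */

int stuff_clamp(int value, int lo, int hi)
{
    /* Check the documented precondition in debug builds only. */
    assert(lo <= hi);

    /* Two plain comparisons instead of nested ternaries:
       easier to step through in a debugger. */
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```

The header comment says nothing about how the clamping is done, and the comments in the source file say nothing the caller needs in order to use it.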

I like to follow the Google C++ Style Guide.
Which says:
Function Declarations
Every function declaration should
have comments immediately preceding
it that describe what the function
does and how to use it. These
comments should be descriptive
("Opens the file") rather than
imperative ("Open the file"); the
comment describes the function, it
does not tell the function what to
do. In general, these comments do not
describe how the function performs
its task. Instead, that should be
left to comments in the function
definition.
Function Definitions
Each function definition should have
a comment describing what the
function does and anything tricky
about how it does its job. For
example, in the definition comment
you might describe any coding tricks
you use, give an overview of the
steps you go through, or explain why
you chose to implement the function
in the way you did rather than using
a viable alternative. For instance,
you might mention why it must acquire
a lock for the first half of the
function but why it is not needed for
the second half.
Note you should not just repeat the
comments given with the function
declaration, in the .h file or
wherever. It's okay to recapitulate
briefly what the function does, but
the focus of the comments should be
on how it does it.

You should use a tool like doxygen, so the documentation is generated by specially crafted comments in your source code.

I've gone back and forth on this and eventually I settled on documentation in header files. For the vast majority of APIs in C/C++ you have access to the original header file and hence all of the comments that lie within [1]. Putting comments here maximizes the chance developers will see them.
I avoid duplication of comments between header and source files though (it just feels like a waste). It's really annoying when using Vim but most IDEs will pick up the header file comments and put them into things like intellisense or parameter help.
[1] Exceptions to this rule include generated header files from certain COM libraries.

It will often depend on what is set as the coding standard. Many people prefer to put the documentation in the .h file and leave the implementation in the .c file. Many IDEs with code completion will also pick up on this more easily than on documentation in the .c file.
But I think the major point in putting the documentation in the .h file deals with writing a library or assembly that will be shared with another program. Imagine that you're writing a .dll (or .so) that contains a component that you will be distributing. Other programmers will include your .h, but they often won't have (nor need) the implementation file behind it. In this case, documentation in the .h file is invaluable.
The same can be said when you're writing a class for use in the same program. If you're working with other programmers, most often those programmers are just looking at the header file for how to interact with your code rather than how the code is implemented. How it is implemented is not the concern of the person or code that will be using the component. So once again, documentation in the header will help that person or those people figure out how to use that code.

Consider that it's possible for people to use these functions while only having the headers and a compiled version of the implementation. Make sure that anything necessary for using your functions is documented in the header. Implementation details can be documented in the source.

The comments in the header vs. the implementation file should reflect the difference in how the two are used.
If you're going to create interface documentation (e.g., to be extracted with Doxygen, on the same general order as JavaDocs) that clearly belongs in the header. Even if you're not going to extract the comments to produce separate documentation, the same general idea applies -- comments that explain the interface/how to use the code, belong primarily or exclusively in the header.
Comments in the implementation should generally relate to the implementation. Contrary to frequent practice, rather than attempting to explain how things work, most should explain why particular decisions were made. This is especially true when you make decisions that make sense, but it might not be obvious that they do (e.g., noting that you did not use a Quicksort, because you need a stable sort).
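To illustrate the stable-sort case, here is a minimal sketch of a "why" comment in an implementation file. The struct, the function name and the assumption that inputs are small are all invented for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type for this sketch. */
struct record {
    int key;
    int seq;   /* original position, kept only to make stability visible */
};

/*
 * Why insertion sort rather than qsort(): records with equal keys must
 * keep their relative order, and the C standard does not guarantee that
 * qsort() is stable. Inputs are assumed to be small, so O(n^2) is fine.
 */
void sort_records(struct record *r, size_t n)
{
    assert(r != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        struct record tmp = r[i];
        size_t j = i;

        /* Strict '>' (not '>=') is what keeps equal keys in order. */
        while (j > 0 && r[j - 1].key > tmp.key) {
            r[j] = r[j - 1];
            j--;
        }
        r[j] = tmp;
    }
}
```

The comment does not narrate the loop; it records the decision that a later maintainer might otherwise "fix" by swapping in qsort().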

It's simple really when you think about it.
The API docs absolutely must go in the header file. It's the header file that defines the external interface, so that's where the API docs go.
As a rule, implementation details should be hidden from API users. This includes documentation of implementation (except where it might affect the use e.g. time complexity etc). Thus implementation documentation should go in the implementation file.
Never ever duplicate documentation in multiple places. It will be unmaintainable and will be out of sync almost as soon as somebody has to change it.

I wrote a simple script that takes as input a template header-file with no function declarations and a source-code file with commented functions. The script extracts the commentary before a function definition from the source code file and writes it and the associated function declaration into an output header-file. This ensures that 1) there's only one place where function commentary needs to be written; and 2) the documentation in the header-file and the source code file always remain in sync. Commentary on the implementation of a function is put into the body of the function and is not extracted.

Definitely keep the docs in one place, to avoid the maintenance nightmare. You, personally, might be fastidious enough to keep two copies in sync, but the next person won't.
Use something like doxygen to create a "pretty" version of the docs.

Related

C function headers location: .h or .c? [duplicate]

This question already has answers here:
Where to document functions in C or C++? [closed]
(10 answers)
Closed 8 years ago.
Suppose we have a function (only external functions are considered here) int foo(int a, char *b). Normally there will be a header comment that goes with it, documenting what the function does, what each parameter and the return value mean, etc. It'll probably be in doxygen format too. My habit is to put such header comments in .h files, because that's where the interface is defined and the reader should have all the information in one place. But a lot of people keep such header comments in the C file where the actual implementation goes. I've seen this in the Linux kernel code also. So was I wrong? Which would you prefer?
Although header files can be used in any which way, they are primarily a mechanism to enable external linkage.
You design an API that is meant for external consumption, and you put everything required to consume this API (constants, types, prototypes) in header file(s).
All the other stuff, that is a part of implementation, and doesn't need to be seen by external users, can go in the source files (if the usage is localized to one file), or private headers that can be shared between multiple files. The latter is another example of header files enabling external linkage, but for internal consumption.
The answer to this question is largely "it depends":
Depends on what? Who's reading the documentation, and how they access it.
If you're developing a program, then having the documentation inline with the implementation is probably OK, because anybody who wants to know about your program can access the source code and read about it. Your target audience is probably developers working on the program itself, so having the documentation in the C file, along with the bulk of the code they're working on, is a suitable approach.
If you're developing a library, the target audience changes (or you might have two target audiences). You still have the developers, who could make use of more detailed documentation as it relates to private implementation detail. You also have the users of the library, who only care about the interface that they're working with; from a code-browsing point of view, they typically only have access to the headers.
I put them in the .h file by preference when it's my choice, if I have a .h file. If I just have a .c file, I will document the functions when they are defined, simply because if I just have a .c file, I'm probably still coding, and I want to change the documentation if I change the code.
I feel that documentation and declarations go together in a separate file in a finished C project. Documentation in the code breaks up the code and can be redundant.
If I'm contributing somewhere, I'll follow the established convention.

Good way to organize C source files? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The way I've always organized my C source was to put struct, macro and function prototypes in header files and function implementations in .c files. However, I have recently been reading a lot of other people's code for large projects, and I'm starting to see that people often define things like structs and macros in the C source itself, immediately above the functions that make use of them. I can see some benefit to this, as you don't have to go searching around to find the definitions of structs and macros used by particular functions; everything is right there in roughly the same place as the functions that use it. However, I can also see some disadvantages, as it means that there is no one central repository for struct/macro definitions; they're scattered through the source code.
My question is, what are some good rules of thumb for deciding when to put the macro/struct definition in the C source code as opposed to the header files themselves?
Typically, everything you put in the header file is part of the interface, while everything that you put in the source file is part of the implementation.
That is, if something in the header file is only ever used by the associated source file, it's an excellent candidate for moving to that source file. This prevents "polluting" the namespace of every file that uses your header with macros and types that they were never intended to use.
Interfaces and implementations is what it's all about.
Addendum to the accepted answer: it can be useful to put an incomplete struct declaration in the header but put the definition only in the .c file. Now a pointer to that struct gives you a private type that you have complete control over. Very useful to guarantee separation of concerns; a bit like private members in C++.
For extensive examples, follow the link.
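A minimal sketch of the incomplete-struct idea, with an invented counter type (header and source are shown in one listing; in practice they would be counter.h and counter.c):

```c
/* ---- counter.h (public): only an incomplete type is visible ---- */
struct counter;                       /* size and members are hidden */

struct counter *counter_new(void);    /* returns NULL on allocation failure */
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_free(struct counter *c);

/* ---- counter.c (private): the full definition lives here ---- */
#include <assert.h>
#include <stdlib.h>

struct counter {
    int value;
};

struct counter *counter_new(void)
{
    /* calloc zero-initializes, so value starts at 0. */
    return calloc(1, sizeof(struct counter));
}

void counter_increment(struct counter *c)
{
    assert(c != NULL);
    c->value++;
}

int counter_value(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_free(struct counter *c)
{
    free(c);   /* free(NULL) is a no-op, so NULL is accepted */
}
```

Client code that includes only the header can hold and pass a struct counter *, but it cannot read c->value or declare a struct counter by value, so the layout can change without breaking callers.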
Put your public structures and interface into the .h file.
Put your private bits into the .c file.
If I have more than one .c file that implements a logical set of functionality, I'll put the things that need to be shared among those implementation files into a *p.h file ('p' for private). Client code should not include the *p.h header.
For example, if I have a set of routines that implement an XML parser, I might have the following organization:
xmlparser.h - the public structures, types, enums, and function prototypes
xmlparserp.h - private types, function prototypes, etc. that client code
doesn't and shouldn't need
xmlparser.c - implementation of the XML parser
xmlutil.c - some other implementation bits (would include xmlparserp.h)
stuff defining the external interface to the module goes in the header.
stuff just used within the module should stay in the C file.
header files should include required headers to support its declarations
headers should wrap themselves in #ifndef NAME_H to ensure a single inclusion per compilation unit
modules should include their own headers to ensure consistency
Lots of people don't know this basic stuff BTW, which is required to maintain your sanity on any sized C project.
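Those three header rules might look like this in practice. This is only a sketch; buffer.h, struct buffer and buffer_remaining are hypothetical names:

```c
/* ---- buffer.h ---- */
#ifndef BUFFER_H        /* include guard: at most one inclusion per */
#define BUFFER_H        /* compilation unit                         */

#include <stddef.h>     /* the header pulls in what its own declarations
                           need (size_t), so users don't have to */

struct buffer {
    char  *data;
    size_t len;
};

/* Returns how many more bytes fit before len reaches capacity. */
size_t buffer_remaining(const struct buffer *b, size_t capacity);

#endif /* BUFFER_H */

/* ---- buffer.c ---- */
/* A real buffer.c would start with #include "buffer.h", so the compiler
   checks the definition below against the prototype above. */
#include <assert.h>

size_t buffer_remaining(const struct buffer *b, size_t capacity)
{
    assert(b != NULL);
    return b->len < capacity ? capacity - b->len : 0;
}
```

Including the module's own header first in its .c file also doubles as a cheap test that the header compiles on its own.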
In the book "C style: Standards and Guidelines" from David Straker (available online here), there are some good ideas on file layout, and on the division between C file and headers.
See chapter 7, and especially section 7.4.
And as John Calsbeek said, you can base your organization on how the header parts are used.
If one structure, type, macro, ... is only used by one source, you can move your code there.
You may have header files for the prototypes and some header file for the common declarations (type definitions, etc...)
Mainly, the issue at hand is a sort of encapsulation consideration. You can think of a module's header file as not so much its "overhead", which is what you seem to be doing now, as its public interface declaration, which is presumably how the people whose code you're looking at are seeing it.
From there it follows naturally that what goes in the header file is things that other modules need to know about: prototypes for functions you anticipate being used externally, structs and macros used for interface purposes, extern variable declarations, and so on, while things that are strictly for the module's internal use go in the .c file.
If you define your structs and macros inside the .c file, you won't be able to use them from other .c files.
To do so, you have to put them in the .h file, so that #include tells the compiler where to find your structs and macros,
unless you #include "x.c", which you shouldn't do =)

Using Doxygen with C, do you comment the function prototype or the definition? Or both?

I'm using Doxygen with some embedded C source. Given a .c/.h file pair, do you put Doxygen comments on the function prototype (.h file) or the function definition (.c file), or do you duplicate them in both places?
I'm having a problem in which Doxygen is warning about missing comments when I document in one place but not the other; is this expected, or is my Doxygen screwed up?
For public APIs I document at the declaration, as this is where the user usually looks first if not using the doxygen output.
I never had problems with only documenting on one place only, but I used it with C++; could be different with C, although I doubt it.
[edit] Never write it twice. Never. In-Source documentation follows DRY, too, especially concerning such copy-and-paste perversions.[/edit]
However, you can specify whether you want warnings for undocumented elements. Although such warnings look nice in theory, my experience is that they quickly become more of a burden than a help. Documenting all functions usually is not the way to go (there is such a thing as redundant documentation, or even hindering documentation, and especially too much documentation); good documentation needs a knowledgeable person spending time with it. Given that, those warnings are unnecessary.
And if you do not have the resources for writing good documentation (money, time, whatever...), then those warnings won't help either.
Quoted from my answer to this question: C/C++ Header file documentation:
I put documentation for the interface
(parameters, return value, what the
function does) in the interface file
(.h), and the documentation for the
implementation (how the function
does) in the implementation file (.c,
.cpp, .m). I write an overview of the
class just before its declaration, so
the reader has immediate basic
information.
With Doxygen, this means that the documentation describing the overview, parameters and return value (\brief, \param, \return) goes on the function prototype, and inline documentation (\details) is used for documenting the function body (you can also refer to my answer to that question: How to be able to extract comments from inside a function in doxygen?)
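A rough sketch of that split follows. \brief, \param, \return and \details are standard Doxygen commands; checksum8 and the file names are invented, and the \details block is placed just before the definition here, which is one possible placement. How Doxygen merges the two blocks depends on its version and configuration.

```c
/* ---- checksum.h: interface documentation ---- */
#include <stddef.h>

/**
 * \brief  Computes a simple additive checksum of a byte buffer.
 * \param  data  Pointer to the bytes; may be NULL only if len is 0.
 * \param  len   Number of bytes to sum.
 * \return The sum of all bytes modulo 256.
 */
unsigned char checksum8(const unsigned char *data, size_t len);

/* ---- checksum.c: implementation documentation ---- */
#include <assert.h>

/**
 * \details Accumulates in an unsigned int, whose wraparound is well
 *          defined, and masks down to 8 bits once at the end instead
 *          of on every iteration.
 */
unsigned char checksum8(const unsigned char *data, size_t len)
{
    unsigned int sum = 0;

    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
        sum += data[i];

    return (unsigned char)(sum & 0xFFu);
}
```

Nothing is written twice: the prototype carries what a caller needs, and the definition carries only the note about how the sum is kept.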
I often use Doxygen with C targeting embedded systems. I try to write documentation for any single object in one place only, because duplication will result in confusion later. Doxygen does some amount of merging of the docs, so in principle it is possible to document the public API in the .h file, and to have some notes on how it actually works sprinkled in the .c file. I've tried not to do that myself.
If moving the docs from one place to the other changes the amount of warnings it produces, that may be a hint that there may be something subtly different between the declaration and definition. Does the code compile clean with -Wall -Wextra for example? Are there macros that mutate the code in one place and not the other? Of course, Doxygen's parser is not a full language parser, and it is possible to get it confused as well.
We comment only the function definitions, but we use it with C++.
Writing it in both places is a waste of time.
As for the warning, if your documentation looks good, it may be fine to simply ignore such warnings.
I've asked myself the same question and was pleasantly surprised to see that Doxygen actually includes the same in-line documentation that is in the .c file in the corresponding .h file when browsing the generated html documentation. Hence you don't have to repeat your in-line documentation, and Doxygen is smart enough to include it in both places!
I'm running Doxygen version 1.8.10.

untangling .h dependencies

What do you do when you have a set of .h files that has fallen victim to the classic 'gordian knot' situation, where to #include one .h means you end up including almost the entire lot? Prevention is clearly the best medicine, but what do you do when this has happened before the vendor (!) has shipped the library?
Here's an extension to the question, and this is probably the more pertinent question -- should you even attempt to disentangle the dependencies in the first place?
I've done this on a C++ code base that was already split into many libraries (which was a good start).
I had to work out (or guess) which library was the most depended upon and which depended upon nothing else in the code base. I then processed each library in turn.
I looked at each module (*.cpp files) in turn and made sure that its own header was #included first and commented out the rest, then I commented out all the #includes in that header file and then re-compiled just that module to let the compiler tell me what was needed. I would un-comment the first header that seemed to be needed, and reviewed that one, recursing as necessary. It was interesting to see how many headers ended up not being needed.
Where only the name is needed (because you have a pointer or reference), use class name; or struct name; instead. This is called a forward declaration, and it avoids #including the header file.
The compiler is very helpful in telling you what the dependencies are when you comment out #includes (you need to recompile with ALL the compilers you have to maintain portability).
Sometimes I had to move modules between libraries so that no pairs or groups of libraries were mutually dependent.
As you have the opportunity, you should refactor the code to reduce includes that are too large, however that assumes you can achieve some sort of package cohesion. If you disentangle things just to discover that every user of the code has to include all the elements anyway, the end result is the same.
Another option is to use #defines to configure sections on and off. Regardless, for an existing code base the solution is to move toward package cohesion.
Read: http://ivanov.files.wordpress.com/2007/02/sedpackages.pdf and research issues related to package cohesion.
I've untangled that knot a few times, and it generally helps a lot when maintaining a system to reduce the .h dependencies as much as possible. There are decent tools for generating dependency trees (I was using Klocwork at the time).
The downside I found was with conditional compilation. Someone might remove a header file because they think we don't need it, but it turns out that we only don't need it because VxWorks has some screwed up headers... on Solaris (or any reasonable Posix system) you do need it.
There is a balance to be struck between an enormous number of finely organized headers and a single header that includes everything. Consider the Standard C library; there are some biggish headers like <stdio.h>, which declares a lot of functions, but they are all related to I/O. There are other headers that are more of a miscellany - notably <stdlib.h>.
The Goddard Space Flight Center guidelines for C are worth hunting down.
The basic rule is that each header should declare the facilities provided by a suitable (usually small) set of source files. The facilities and header should be self-contained. That is, if someone needs the code in header "something.h", then that should be the only header that must be added to the compilation. If there are facilities needed by "something.h" that are not declared in the header, then it must include the relevant headers. That can mean that headers end up including <stddef.h> because one of the functions uses size_t, for example.
As @quamrana points out, you can use forward declarations for structures (not classes, since the question is tagged C and not C++) when appropriate - which primarily means when the interface takes pointers and does not need to know the size of the structures or any of the members.

Code Ordering in Source Files - Forward Declarations vs "Don't Repeat Yourself"? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
If you code in C and configure your compiler to insist that all functions are declared before they are used (or if you code in C++), then you can end up with one of (at least) two organizations for your source files.
Either:
Headers
Forward declarations of (static) functions in this file
External functions (primary entry points)
Static - non-public - functions
Or:
Headers
Static - non-public - functions
External functions (primary entry points)
I recognize that in C++, the term 'static' is not preferred, but I'm primarily a C programmer and the equivalent concept exists in C++, namely functions in an anonymous namespace within the file.
Question:
Which organization do you use, and why do you prefer it?
For reference, my own code uses the second format so that the static functions are defined before they are used, so that there is no need to both declare them and define them, which saves on having the information about the function interfaces written out twice - which, in turn, reduces (marginally) the overhead when an internal interface needs to change. The downside to that is that the first functions defined in the file are the lowest-level routines - the ones that are called by functions defined later in the file - so rather than having the most important code at the top, it is nearer the bottom of the file. How much does it matter to you?
I assume that all externally accessible functions are declared in headers, and that this form of repetition is necessary - I don't think that should be controversial.
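To make the two layouts concrete, here is the first organization in miniature (ring_next and wrap are invented names for the sketch):

```c
#include <assert.h>

/* Forward declarations of (static) functions in this file. */
static int wrap(int i, int size);

/* External function (primary entry point), near the top of the file. */
int ring_next(int index, int size)
{
    assert(size > 0);
    return wrap(index + 1, size);
}

/* Static (non-public) function, defined after its first use.
   The prototype above is what makes that legal. */
static int wrap(int i, int size)
{
    int r = i % size;           /* may be negative for negative i */
    return r < 0 ? r + size : r;
}

/* The second organization would move the definition of wrap() above
   ring_next() and delete the forward declaration. */
```

With the first layout the signature of wrap() appears twice and both copies must change together; with the second it appears once, but the entry point ends up at the bottom of the file.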
I've always used method #1, the reason being that I like to be able to quickly tell which functions are defined in a particular file and see their signatures all in one place. I don't find the argument about having to change the prototypes along with the function definition particularly convincing, since you usually wind up changing all the code that calls the changed functions anyway; changing the function prototypes while you are at it seems relatively trivial.
In C code I use a simple rule:
Every C file with non-static members will have a corresponding header file defining those members.
This has worked really well for me in the past - makes it easy enough to find the definition of a function because it's in the same-named .h file if I need to look it up. It also works well with doxygen (my preferred tool) because all the cruft is kept in the header where I don't spend most of my time - the C file is full of code.
For static members in a file, I insist on ordering them so that each one is defined before it is used anyway. And I avoid circular dependencies in function calls almost all of the time.
For C++ code I tried the following:
All code defined in the header file. Use #pragma interface/#pragma implementation to inform the compiler of that; kind of the same way templates put all the code in the header.
That's worked really well for me in C++. It means you end up with HUGE header files which can increase compile time in some cases. You also end up with a C++ body file where you simply include the header and compile. You can instantiate your static member variables here. It also became a nightmare because it was far too easy to change your method params and break your code.
I moved to
Header file with doxygen comments (except for templates, where code must be included in the header) and full body file, except for short methods which I know I'd prefer be inlined when used.
Separating out implementation from definition has the distinct plus that it's harder to change your method/function signatures so you're less likely to do it and break things. It also means that I can have huge doxygen blocks in the header file documenting how things work and work in the code relatively interruption free except for useful comments like "declare a variable called i" (tongue in cheek).
Ada forces the convention and the file naming scheme on you. Most dynamic languages like Ruby, Python, etc don't generally care where/if you declare things.
Number 2: because I write many short functions and refactor them freely, it'd be a significant nuisance to maintain forward declarations. If there's an Emacs extension that does that for you with no fuss, I'd be interested, since the top-down organization is a bit more readable. (I prefer top-down in e.g. Python.)
Actually not quite your Number 2, because I generally group related functions together in the .c regardless of whether they're public or private. If I want to see all the public declarations I'll look in the header.
Number 2 for me.
I think using static or other methods to make your module functions and variables private to the module is a good practice.
I prefer to have my API functions at the bottom of the module. Conversely, I put the API functions at the top of my classes, as classes are generally reusable. Putting the API functions at the top makes it easier to find them quickly. Most IDEs can take you to any function pretty directly.
(Talking about C code)
Number 2 for me because I always forget to update forward decls to reflect static functions changes.
But I think that the best practice should be
headers
forward declarations + comment on function behaviour for each one
exported functions + comments, where needed, about implementation details when the code is not clear enough
static functions + comments, where needed, about implementation details
How much does it matter to you?
It doesn't.
It is important that all local functions are marked static, but in my opinion defining how to group functions in the file is too much. There is no strong argument for either version, and I have never found a strong disadvantage in either.
In general, coding conventions are very important and we try to define as much as possible, but in this case my feeling is that this is unjustified overhead.
After reading all the posts again, it seems I should simply have upvoted Darius's answer (which I did) instead of writing all of this ...
