Idiomatic way to declare local functions in C [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
My small program consists of several "modules" - translation units. Some of the entities are extern, but others are static.
For the external entities, I always put a prefix. The internal ones you can tell by the static in front. Still, I am looking for a naming convention that clearly conveys that the entity is
file-specific, not visible from the outside
can be safely changed without risk of changing api-s
if there is no static in front, this is a mistake.
Python programmers use a leading underscore. What do C-programmers do?

Many things, but avoid leading underscores as that is magic in several environments.

The static declaration conveys what you are asking about. Naturally, it misses this requirement:
if there is no static in front, this is a mistake.
But I would argue that's a silly requirement. You say you use _ in Python, which from the standpoint of the safety of building one file (in C, "compiling one translation unit"), is strictly inferior. If you omit it, Python won't notice until runtime, and the behavior won't change. Maybe you need another requirement in Python to alert the reader that a _ may be missing. How about __? But what if that is missing? And so forth. What if your program doesn't terminate and it's supposed to? You can't write a "does it stop" check.
And in any case, in my experience, C developers don't actually worry about this. If it helps you read, it would not be unreasonable to make up your own and stick with it when sensible. Maybe s_ for static, m_ for module, t_ for translation unit, etc. These sorts of prefix I think are a weak "meta-convention" in C and C++ for special classes of variables, e.g. class members.

I would advise against coding static into function names. Why? Because the name should convey what the function is used for in a clear and concise manner. Where it is stored, and where it is accessable is generally of no importance to convey.
Note that this argument is specific to static on functions. It does not apply to other things like const on variable names (I personally use a k-prefix for those), or global variables (which should be avoided by all means in the first place).

Related

Why does the standard C library feature multiple header files instead of consolidating the contents into a single header? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 months ago.
Improve this question
Why does the standard C library need to feature multiple header files? Would it not be more user friendly to consolidate it into one header file?
I understand that it is unnecessary to include function prototypes / global variables that go unused, however, won't the compiler eventually remove all of these references if they're unused?
Maybe I'm underestimating the size of contents spanning all of the header files. But it seems like I'm always googling to figure out what header I need to #include.
Edit: The top comment references a size of 50MB which appears to be untrue. The C library is relatively concise compared to other language's standard libraries hence the question.
// sarcasm ON
Build your own #include "monster.h" that includes all you want from the collection.
People write obfuscated code all the time. But, you won't be popular with your cohort.
// sarcasm OFF
The real reason? C was developed when storage was barely beyond punch card technology. A multi-user system might have 1/2Mb of (magnetic) core memory and a cycle time that is now laughable.
Yet, reams of C code was developed (and the story of the popularity of UNIX well known.)
Although new standards could change the details (and C++ has), there is an ocean of code that might/would no longer be able to be compiled.
Look up "Legacy".
There are (too many) examples of code in the world where the usual collection of, say, stdio.h, stdlib.h, etc. have been "hidden" inside #include "myApp.h". Doing so forces the reader to hunt-down a second file to verify what the &^%&%& is going on.
Get used to it. If it was a bad practice, it would not have survived for ~50 years.
EDIT: It is for a similar reason that "functions" have been "grouped together" into different libraries. By extension, pooling those libraries might make things a tiny bit "less cluttered", but the process of linking may require a supercomputer to finish the job before knock-off time.
EDIT2: Brace yourself. Cleanly written C++ code often has a .h and a .cpp for every class used in the app. Sometimes "families" (hierarchies) might live together in a single pair of files, although that can lead to boo-boo's when the coder is having a bad day...

C access control practices [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I am writing a small shell as a course exercise, which emulates bash's autocompletion and history mechanisms, resulting in a main.c file for managing user commands, and a raw.c file for managing the terminal in raw mode.
It is unlikely any file in the project will ever need call anything except raw.c's get_line() method, therefore my instinct is to only include this get_line() method in raw.h, to prevent accidental access of another raw.c method and further complexities.
Where can a good primer and or discussion on C access control techniques be found, in particular whether it is a good idea to emulate OO language's private/public concepts, and how it is usually done, if so?
First things first: "OO's private / public concepts" are not "access control". Even if something is "private" it's still there, and can still be accessed. You have protected it against accidential access, but that is a far cry from "securing" it (from an "authorization" point of view). A determined and / or malicious client can still get at it, because "security" is not what those mechanics are for.
Once you understood that, you realize that all those "visibility" things - whether you declare something in a header, or make it public vs. private, or whatever - are basically aiming at maintainability: Reducing the amount of identifiers in the current scope, reducing the amount of functions and variables you have to think about in a given context.
Then, you say that your "instinct is to only include this get_line() method in raw.h". You realize that this is faulty wording? You can declare that function in a header file, you can include that header file, but you don't include a function.
So. You implement functions that belong together in a translation unit (main.c, raw.c). You declare functions that might be called from outside that translation unit in that translation unit's header file (raw.h). All functions not to be called from the outside, you define as static inside the translation unit itself, and don't declare them in a header at all.
As for emulating another language's concepts, don't. Do things the way they are done in the language you are currently using, or use a different language.
Private functions in raw.c should of course be declared static (and omitted from the public header). Then they're only visible and callable from the same "compilation unit", i.e. from within raw.c.
Only public method should be put into the .h
Private method must be declared static at the top of the .c file.
If your module is using multi .c files, you should not put the function into the public .h. Instead you should create a second private .h, for example : mymodule_p.h instead of mymodule.h. It is like a protected function

best practice for delivering a C API hiding internal functions [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I have written a C library which consists in a few .h files and .c files. I compile it as a .a static library.
I would like to expose only certain functions to the user and keep the rest as "obscure" as possible to make reverse engineering reasonably difficult.
Ideally my library would consist of:
1- one .h file with only the functions exposed to the user
2- myLibrary.a: as un-reversengineerable as possible
What are the best practices for that? Where should I look, is there a good tutorial/book somewhere?
More specifically:
for - 1
I already have all my .h and .c working and I would like to avoid changing them around, moving function declarations from .h to .c and go into circular references potential pbs. Is That possible?
For instance is it a good idea to create a new .h file which I would use only for distributing with my .a? That .h would contain copies of the functions I want to expose and forward declarations of types I use. Is that a good idea?
for - 2
a) what gcc flags (or xcode) shall I be aware of (for stripping, not having debug symbols etc)
b) a good pointer to learn about how to do code obfuscation?
Any thought will help,
Thanks, baba
The usual practice is to make sure that every function and global variable that is for use only internal to some module is declared static in that module. That limits exposure of internal implementation details from a single module.
If you need internal implementation details that cross between modules, but which are not for public consumption, then declare one or more .h files that are kept private and not delivered to end users. The names of objects defined in that way will still be visible to the linker (and to tools such as objdump and nm) but their detailed signatures will not be.
If you have data structures that are delivered to the end user, but which are opaque, then consider having the API deliver them as pointers to a struct that is declared by not defined in the public API .h file. That will preserve type safety, while concealing the implementation details. Naturally, the complete struct definition is in a private .h file.
With care, you can keep a partially documented publicly known struct that is a type-pun for the real definition but which only exposes the public members. This is more difficult to keep up to date, and if you do it, I would make certain that there are some strong test cases to validate that the public version is in fact equivalent to the private version in all ways that matter.
Naturally, use strip to remove the debug segments so that the internal details are not leaked that way.
There are tools out there that can obfuscate all the names that are intended to be only internal use. If run as part of the build process, you can work with an internal debug build that has sensible names for everything, and ship a build that has named all the internal functions and global variables with names that only a linker can love.
Finally, get used to the fact that anyone that can use your library will be able to reverse engineer your library to some extent. There are anti-debugger measures that can be taken, but IMHO that way lies madness and frustration.
I don't have a quick answer other than to explore the use of "static" functions. I would recommend reading Miro Samek's work on something he calls "C+". Basically object oriented ANSI C. Great read. He owns Quantum leaps software.
Erase headers for this functions, do some obfuscation in exports table and get pack your code and apply some anti debugger algorithm.
http://upx.sourceforge.net/
http://www.oreans.com/

Recommendations for 'C' Project Architecture Guidelines? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
Now that I got my head wrapped around the 'C' language to a point where I feel proficient enough to write clean code, I'd like to focus my attention on project architecture guidelines. I'm looking for a good resource that coves the following topics:
How to create an interface that promotes code maintainability and is extensible for future upgrades.
Library creation guidelines. Example, when should I consider using static vs dynamic libraries. How to properly design an ABI to cope with either one.
Header files: what to partition out and when. Examples on when to use 1:1 vs 1:many .h to .c
Anything you feel I missed but is important when attempting to architect a new C project.
Ideally, I'd like to see some example projects ranging from small to large and see how the architecture changes depending on project size, function or customer.
What resource(s) would you recommend for such topics?
Whenever I got serious writing C code, I had to emulate C++ features in it. The main stuff worth doing is:
Think of each module like a class. The functions you expose in the header are like public methods. Only put a function in the header if it part of the module's needed interface.
Avoid circular module dependencies. Module A and module B should not call each other. You can refactor something into a module C to avoid that.
Again, following the C++ pattern, if you have a module that can perform the same operations on different instances of data, have a create and delete function in your interface that will return a pointer to struct that is passed back to other functions. But for the sake of encapsulation, return a void pointer in the public interface and cast to your struct inside of the module.
Avoid module-scope variables--the previously described pattern will usually do what you need. But if you really need module-scope variables, group them under a struct stored in a single module-scope variable called "m" or something consistent. Then in your code whenever you see "m.variable" you will know at a glance it is one of the module-scope structs.
To avoid header trouble, put #ifndef MY_HEADER_H #define MY_HEADER_H declaration that protects against double including. The header .h file for your module, should only contain #includes needed FOR THAT HEADER FILE. The module .c file can have more includes needed for the compiling the module, but don't add those includes into the module header file. This will save you from a lot of namespace conflicts and order-of-include problems.
The truths about architecting systems are timeless, and they cross language boundaries. Here is a little advice focused on C:
Every module hides a secret. Build your system around interfaces that hide information from their clients. C's only type-safe information-hiding construct is the pointer to an incomplete structure. Learn it thoroughly and use it often.
One interface to an implementation is a good rule of thumb. (Interface is .h, implementation is .c.) Sometimes you will want to provide two interfaces that relate to the same implementation: one that hides the representation and one that exposes it.
You'll need naming conventions.
A superb model of how to handle these kinds of problems in C is Dave Hanson's C Interfaces and Implementations. In it you will get to see how to design good interfaces and implementations, and how one interface can build on another to form a coherent library. You will also get an excellent set of starter interfaces you can use in your own applications. For someone in your position, I cannot recommend this book too highly. It is an archetype of well-architected systems in C.
Namespace cleanliness - especially important with libraries. Prefix your public functions with the name of the library, in some fashion. If your library was called 'happyland', make functions such as "happyland_init" or even "hl_init".
This goes for the use of static. You will write functions which are specialized - hide them from other modules by using static liberally.
For libraries, reentrant is also critical. Do not depend on global state which is not compartmentalized (you can have a typedef struct token to keep this state if required).
Separate presentation code from logic. That is very important.
Static if they're used only in your project or in few binaries, dynamic when used many times (saves very much space).
Whenever code is used more than once, split it off into a header.
Those are a few of my known tips I can give you.
How to create an interface that
promotes code maintainability and is
extensible for future upgrades.
By exposing as few implementation details as possible. For example,
Use opaque pointers.
If you need a "private" function, declare it static, and don't put it in the .h file.

Code Ordering in Source Files - Forward Declarations vs "Don't Repeat Yourself"? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
If you code in C and configure your compiler to insist that all functions are declared before they are used (or if you code in C++), then you can end up with one of (at least) two organizations for your source files.
Either:
Headers
Forward declarations of (static) functions in this file
External functions (primary entry points)
Static - non-public - functions
Or:
Headers
Static - non-public - functions
External functions (primary entry points)
I recognize that in C++, the term 'static' is not preferred, but I'm primarily a C programmer and the equivalent concept exists in C++, namely functions in an anonymous namespace within the file.
Question:
Which organization do you use, and why do you prefer it?
For reference, my own code uses the second format so that the static functions are defined before they are used, so that there is no need to both declare them and define them, which saves on having the information about the function interfaces written out twice - which, in turn, reduces (marginally) the overhead when an internal interface needs to change. The downside to that is that the first functions defined in the file are the lowest-level routines - the ones that are called by functions defined later in the file - so rather than having the most important code at the top, it is nearer the bottom of the file. How much does it matter to you?
I assume that all externally accessible functions are declared in headers, and that this form of repetition is necessary - I don't think that should be controversial.
I've always used method #1, the reason being that I like to be able to quickly tell which functions are defined in a particular file and see their signatures all in one place. I don't find the argument of having to change the prototypes along with the function definition particularly convincing since you usually wind up changing all the code that calls the changed functions anyway, changing the function prototypes while you are at it seems relatively trivial.
In C code I use a simple rule:
Every C file with non-static members will have a corresponding header file defining those members.
This has worked really well for me in the past - makes it easy enough to find the definition of a function because it's in the same-named .h file if I need to look it up. It also works well with doxygen (my preferred tool) because all the cruft is kept in the header where I don't spend most of my time - the C file is full of code.
For static members in a file I insist in ordering the declarations in such a way that they are defined by instantiation before use anyway. And, I avoid circular dependency in function calls almost all of the time.
For C++ code I tried the following:
All code defined in the header file. Use #pragma interface/#pragma implementation to inform the compiler of that; kind of the same way templates put all the code in the header.
That's worked really well for me in C++. It means you end up with HUGE header files which can increase compile time in some cases. You also end up with a C++ body file where you simply include the header and compile. You can instantiate your static member variables here. It also became a nightmare because it was far too easy to change your method params and break your code.
I moved to
Header file with doxygen comments (except for templates, where code must be included in the header) and full body file, except for short methods which I know I'd prefer be inlined when used.
Separating out implementation from definition has the distinct plus that it's harder to change your method/function signatures so you're less likely to do it and break things. It also means that I can have huge doxygen blocks in the header file documenting how things work and work in the code relatively interruption free except for useful comments like "declare a variable called i" (tongue in cheek).
Ada forces the convention and the file naming scheme on you. Most dynamic languages like Ruby, Python, etc don't generally care where/if you declare things.
Number 2: because I write many short functions and refactor them freely, it'd be a significant nuisance to maintain forward declarations. If there's an Emacs extension that does that for you with no fuss, I'd be interested, since the top-down organization is a bit more readable. (I prefer top-down in e.g. Python.)
Actually not quite your Number 2, because I generally group related functions together in the .c regardless of whether they're public or private. If I want to see all the public declarations I'll look in the header.
Number 2 for me.
I think using static or other methods to make your module functions and variables private to the module is a good practice.
I prefer to have my api functions at the bottom of the module. Conversely I put the api functions at the top of my classes as classes are generally reusable. Putting the api functions at the top make it easier to find them quickly. Most IDEs, can take you to any function pretty directly.
(Talking about C code)
Number 2 for me because I always forget to update forward decls to reflect static functions changes.
But I think that the best practice should be
headers
forward declarations + comment on function behaviour for each one
exported functions + eventual comments about implementation details when code is not clear enough
static functions + eventual comments about implementation details
How much does it matter to you?
It's not.
It is important that all local function will be marked as static, but for my opinion defining how to group function in the file is too much. There is no strong reasoning for any version and i don't find any strong disadvantage ever.
In general coding convention is very important and we trying to define as much as possible, but in this case my feeling, that this is unjustified overhead.
After reading all posts again it seems like i should simply upvote (which i did) Darius answer, instead writing all of these ...

Resources