We are working on a game engine written in C and currently we are using the following naming conventions.
ABClass object;
ABClassMethod(object, args)
AB being our prefix.
Our API, even though it works on objects, does not have inheritance, polymorphism or anything like that. All we have is data types and methods working on them.
Our constants are named like AB_ConstantName, and preprocessor macros are named like AB_API_BEGIN. We don't use function-like macros.
I was wondering how well this fits as a C API. Also note that the entire API is wrapped in Lua, so you can use it from either C or Lua. Most of the time the engine will be used from Lua.
Whatever the API you'll come out with, for your users' mental sanity (and for yours), ensure that it's consistent throughout the code.
Consistency, to me, includes three things:
Naming. Case and use of the underscore should be regulated. For example: ABClass() is a "public" symbol while AB_Class() is not (in the sense that it might be visible, for whatever reason, to other modules but it's reserved for internal use).
If you have "ABClass()", you should never have "abOtherClass()" or "AbYet_anotherClass()"
Nouns and verbs. If something is called "point" it must always be "point" and not "pnt" or "p" or similar.
The standard C library, for example, has both putc() and putchar() (yes, they are different, but the names don't tell you which one writes to stdout).
Also verbs should be consistent: avoid having "CreateNewPoint()", "BuildCircle()" and "NewSquareMake()" at the same time!
Argument position. If a set of related functions takes similar arguments (e.g. a string or a file), ensure they have the same position. Again the C standard library does a poor job with fwrite() and fprintf(): one has the file as the last argument, the other as the first one.
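A minimal sketch of how the three rules above might look together, using the AB prefix from the question. The ABPoint and ABCircle types and all four functions are hypothetical, invented only for illustration:

```c
#include <assert.h>

/* Hypothetical types: one noun per concept ("Point", never "Pnt"). */
typedef struct { double x, y; } ABPoint;
typedef struct { ABPoint center; double radius; } ABCircle;

/* One verb per action ("Make"), same casing everywhere. */
ABPoint ABPointMake(double x, double y)
{
    ABPoint p = { x, y };
    return p;
}

ABCircle ABCircleMake(ABPoint center, double radius)
{
    assert(radius >= 0.0);
    ABCircle c = { center, radius };
    return c;
}

/* Both "Move" functions take (object, dx, dy) in the same order. */
void ABPointMove(ABPoint *point, double dx, double dy)
{
    point->x += dx;
    point->y += dy;
}

void ABCircleMove(ABCircle *circle, double dx, double dy)
{
    ABPointMove(&circle->center, dx, dy);
}
```

Someone who has learned ABPointMove() can guess the name and argument order of ABCircleMove() without opening the documentation, which is the point of the exercise.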
The rest is much up to your taste and any other constraint you might have.
For example, you mentioned you're using Lua: following a convention similar to Lua's could be a plus if programmers are exposed to both APIs at the same time.
This seems standard enough. OpenGL did it with a gl prefix, so you can't be that far off. :)
There are a lot of C APIs. If you are creative enough to invent a new one, there's no "majority" to blame you. On the other hand, no matter which way you go there are enough zealots of other standards to get mad at you.
Is there a reliable way to prevent external code from calling inner functions of a lib that was compiled from C code?
I would like to deliver a static library with an API header file. The library has different modules, consisting of .c and .h files. I would like to prevent the recipients from using functions declared in the inner .h files.
Is this possible?
Thanks!
Is there a reliable way to prevent external code from calling inner functions of a lib ?
No, there cannot be (read about Rice's theorem; detecting statically such non-trivial properties is undecidable). The library or the code might use function pointers. A malicious user could play with function pointers and pointer arithmetic to call some private function (perhaps after having reverse-engineered your code), even if it is static.
On Linux you might play with visibility tricks.
Or you could organize your library as a single translation unit (a bit like sqlite is doing its amalgamation) and have all internal functions be static ...
In general, the library should have naming conventions about its internal functions (e.g. suffix all of them with _). This could be practically helpful (but not against malicious users).
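A rough sketch of the options mentioned above, assuming GCC or Clang for the visibility attribute. The mylib_ names and the MYLIB_INTERNAL macro are hypothetical. Note that hidden visibility only matters when the code ends up in a shared library; in a static archive the symbol can still be linked against.

```c
#include <assert.h>

/* Assumption: GCC or Clang. Other compilers get an empty macro. */
#if defined(__GNUC__)
#  define MYLIB_INTERNAL __attribute__((visibility("hidden")))
#else
#  define MYLIB_INTERNAL
#endif

/* Used by one .c file only: static gives it internal linkage, so
   other translation units cannot refer to it by name. */
static int clamp_(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

/* Shared between the library's own .c files, but not exported from
   a shared library. The trailing underscore marks it as internal. */
MYLIB_INTERNAL int mylib_scale_(int value)
{
    return value * 2;
}

/* Public, documented entry point. */
int mylib_process(int value)
{
    assert(value >= 0);
    return clamp_(mylib_scale_(value), 0, 100);
}
```

None of this stops a determined user, as noted above; it only makes accidental use of internals harder.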
Most importantly, a library should be well documented (with naming conventions being also documented), and a serious user will only use documented functions the way they are documented to be useful.
(so I don't think you should care about internal functions being called; you do need to document which public functions can be called, and how, and when...; a user calling anything else should expect undefined behavior, that is very bad things)
I would like to deliver a static library with an API header file, and would like to prevent the recipients from using the structs I define and the inner functions.
I am not sure that (at least on Linux) delivering a static library is wise. I would recommend delivering a shared library, read Drepper's How to Write Shared Libraries.
And you can't prevent the recipient (assuming a malicious, clever, and determined one) from using inner functions and internal structs. You should just discourage that in the documentation, and document your public functions and data types well.
I would like to prevent the recipients from using functions declared in the inner .h files. Is this possible?
No, that is impossible.
It looks like you seek a technical solution to a social issue. You need to trust your users (and they need to trust you), so you should document what functions can be used (and you could even add in your documentation some sentence saying that using directly any undocumented function yields undefined behavior). You can't do much more. Perhaps (in particular if you are selling your library as a proprietary software) you need a lawyer to write a good contract.
You might consider writing your own GCC plugin (or GCC MELT extension) to detect such calls. That could take you weeks of work and is not worth the trouble (and will remain imperfect).
I am not able to guess your motivations and use case (is it some life-critical software driving a nuclear reactor, a medical drug injector, an autonomous vehicle, a missile?). Please explain what would happen to you if some (malicious but clever) user called an internal undocumented function. And what could happen to that user?
I am trying to write a generic library in pure C, just some data structures like stack, queue...
I have some questions about naming the functions in my stack.h.
1. Can I use a name such as "init" for the function that initializes a stack? Will there be something wrong?
2. I know there may be other functions that do other things and are also named "init". Would the program then be confused, especially when I include both headers that declare an init?
3. I know my worry may be unnecessary, but I still want to know the principle.
Any help is appreciated, thanks.
Can I use such name, for example "init" as the function name to init a stack. Will there be something wrong?
Yes, if anyone else wants a function named init.
I know my worry may be unnecessary, but i still want to know the principle
Your worry is justified: the lack of namespaces is a serious problem in C.
Export as few functions as possible. Make everything static if you can.
Prefix function names with something. For instance, instead of init, try stack_init
You don't have namespaces in C so usually you prefix every identifier with the name or nickname of your library.
init();
becomes
fancy_lib_init();
There might be existing libraries doing what you want (e.g. Glib). At least, study them a little before writing your own.
If you claim to develop a generic reusable C library, I suggest having naming conventions. For instance, have all the identifiers (notably function names, typedef-s, struct names...) share some common prefix.
Be systematic in your naming conventions. For instance, initializers for stacks and for queues should have similar names & signatures, and end with _init. Document your naming conventions.
Define very clearly how should data be allocated and released. Who and when should call free?
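A minimal sketch of these points for a stack of ints. The ds_ prefix, the ds_stack type and the ownership rules in the comments are all assumptions made for illustration, not an existing library:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "ds_" prefix shared by every identifier in the library. */
typedef struct {
    int    *items;     /* owned by the stack, released by ds_stack_free() */
    size_t  count;
    size_t  capacity;
} ds_stack;

/* The caller owns the ds_stack object itself; the library owns items. */
void ds_stack_init(ds_stack *stack)
{
    assert(stack != NULL);
    stack->items = NULL;
    stack->count = 0;
    stack->capacity = 0;
}

/* Returns 0 on success, -1 if memory could not be allocated. */
int ds_stack_push(ds_stack *stack, int value)
{
    if (stack->count == stack->capacity) {
        size_t new_capacity = stack->capacity ? stack->capacity * 2 : 8;
        int *grown = realloc(stack->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;
        stack->items = grown;
        stack->capacity = new_capacity;
    }
    stack->items[stack->count++] = value;
    return 0;
}

/* Returns 0 and stores the top value in *out, or -1 if the stack is empty. */
int ds_stack_pop(ds_stack *stack, int *out)
{
    if (stack->count == 0)
        return -1;
    *out = stack->items[--stack->count];
    return 0;
}

/* Releases the memory owned by the stack and leaves it empty and reusable. */
void ds_stack_free(ds_stack *stack)
{
    free(stack->items);
    ds_stack_init(stack);
}
```

A queue in the same library would then be expected to offer ds_queue_init(), ds_queue_push(), ds_queue_pop() and ds_queue_free() with the same argument order and the same ownership rule.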
1. init() might be okay (if you're including your library into something else as an actual library, rather than compiling its source in), but it's better practice to use something like stack_init(), and to prefix your library's functions with stack_ or queue_, etc.
2. A program using your library may get confused, depending on the order the libraries are included, see #1.
3. As far as the principles go, the linker (on Linux, anyway) will look for symbols, and there's an ordering to how those symbols will be found. For more information, you can check out the man page for dlsym(), and specifically for RTLD_NEXT.
Function names in C are global. If two functions in a program have the same name, the program should fail to compile. (Well, sometimes it fails at link time, but the idea still holds.)
Generally, you get around this problem by using some sort of prefix or suffix on the function names in your library. "apporc_stack_init()" is much less likely to collide with something than "init()" is.
I have a "pain in the a$$" task: extract/parse all standard C functions that are called in the main() function. Ex: printf, fseek, etc...
Currently, my only plan is to read each line inside main() and check whether it contains a standard C function, using a list of standard C functions that I would also define myself (#define CFUNCTIONS "printf...").
As you know, there are so many standard C functions that defining all of them would be very annoying.
Any idea how I can check whether a string is the name of a standard C function?
If you have heard of cscope, try looking into the database it generates. There are instructions available at the cscope front end to list out all the functions that a given function has called.
If you look at the list of the calls from main(), you should be able to narrow down your work considerably.
If you have to parse by hand, I suggest starting with the included standard headers. They should give you a decent idea about which functions could you expect to see in main().
Either way, the work sounds non-trivial and interesting.
Parsing C source code seems simple at first blush, but as others have pointed out, the possibility of a programmer getting far off the leash by using #defines and #includes is rather common. Unless it is known that the specific program to be parsed is mild-mannered with respect to text substitution, the complexity of parsing arbitrary C source code is considerable.
Consider the less used, but far more effective tactic of parsing the object module. Compile the source module, but do not link it. To further simplify, reprocess the file containing main to remove all other functions, but leave declarations in their places.
Depending on the requirements, there are two ways to complete the task:
Write a program which opens the object module and iterates through the external reference symbol table. If the symbol matches one of the interesting function names, list it. Many platforms have library functions for parsing an object module.
Write a command file or script which uses the developer tools to examine object modules. For example, on Linux, the command nm lists external references with a U.
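Whichever route is taken, the "does this symbol match an interesting function name" step could be sketched as a sorted table searched with bsearch(). The table below is a deliberately tiny, hypothetical subset, and is_std_function() is an invented name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative subset only; must stay sorted for bsearch() to work. */
static const char *const std_functions[] = {
    "fclose", "fopen", "fprintf", "fseek", "malloc", "printf", "strlen"
};

/* bsearch() passes the key first and a pointer to a table element second. */
static int compare_names(const void *key, const void *element)
{
    return strcmp((const char *)key, *(const char *const *)element);
}

/* Returns 1 if symbol is in the table above, 0 otherwise. */
int is_std_function(const char *symbol)
{
    assert(symbol != NULL);
    return bsearch(symbol, std_functions,
                   sizeof std_functions / sizeof std_functions[0],
                   sizeof std_functions[0], compare_names) != NULL;
}
```

Each undefined symbol read from the object module would be passed to is_std_function(). Symbol names may be decorated on some platforms (for example with a leading underscore), so they might need normalizing first.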
The task may look simple at first, but to be really 100% sure you would need to parse the C file. It is not sufficient to just look for the name; you also need to know the context. Only once you have determined that an identifier is a function call can you check whether it is a standard C runtime function.
(plus I guess it makes the task more interesting :-)
I don't think there's any way around having to define a list of standard C functions to accomplish your task. But it's even more annoying than that -- consider macros, for example:
    #include <stdio.h>

    #define OUTPUT(foo) printf("%s\n", (foo))

    int main(void)
    {
        OUTPUT("Ha ha!\n");   /* really a call to printf() */
        return 0;
    }
So you'll probably want to run your code through the preprocessor before checking which functions are called from main(). Then you might have cases like this:
    some_func("This might look like a call to fclose(fp), but surprise!\n");
So you'll probably need a full-blown parser to do this rigorously, since string literals may span multiple lines.
I won't bring up trigraphs...that would just be pointless sadism. :-) Anyway, good luck, and happy coding!
Does the FILE type used through standard C functions fopen, etc. have an object-oriented interface?
I'm looking for opinions with reasoning rather than an absolute answer, as definitions of OO vary by who you ask. What are the important OO concepts it meets or doesn't meet?
In response to JustJeff's comment below, I am not asking whether C is an OO language, nor whether C (easily or not) allows OO programming. (Isn't that a separate issue?)
Is C an object-oriented language?
Was OOP (object-oriented-programming) anything more than a laboratory concept when C and FILE were created?
Answering these questions will answer your question.
EDIT:
Further thoughts:
Object Oriented specifically means several behaviors, including:
Inheritance: Can you derive new classes from FILE?
Polymorphism: Can you treat derived classes as FILEs?
Encapsulation: Can you put a FILE inside another object?
Methods & Properties: Does a FILE have methods and properties specific to it? (e.g. myFile.Name, myFile.Size, myFile.Delete())
Although there are well known C "tricks" to accomplish something resembling each of these behaviors, this is not built in to FILE, and is not the original intent.
I conclude that FILE is not Object Oriented.
If the FILE type were "object oriented", presumably we could derive from it in some meaningful way. I've never seen a convincing instance of such a derivation.
Let's say I have a new hardware abstraction, a bit like a socket, called a wormhole. Can I derive from FILE (or socket) to implement it? Not really - I've probably got to make some changes to tables in the OS kernel. This is not what I call object orientation.
But this whole issue comes down to semantics in the end. Some people insist that anything that uses a jump-table is object oriented, and IBM have always claimed that their AS/400 boxes are object-oriented, through & through.
For those of you that want to dip into the pit of madness and stupidity that is the USENET comp.object newsgroup, this topic was discussed quite exhaustively there a few years ago, albeit by mad and stupid people. If you want to trawl those depths, the Google Groups interface is a good place to start.
Academically speaking, certainly the actual files are objects. They have attributes and you can perform actions on them. Doesn't mean FILE is a class, just saying, there are degrees of OO-ness to think about.
The trouble with trying to say that the stdio FILE interface qualifies as OO, however, is that the stdio FILE interface doesn't represent the 'objectness' of the file very well. You could use FILEs under plain old C in an OO way, but of course you forfeit the syntactic clarity afforded by Java or C++.
It should probably be added that you can't derive anything from FILE by 'inheritance', which further disqualifies it as OO, but you could argue that's more a fault of its environment (plain C) than of the abstract idea of the file-as-object itself.
In fact .. you could probably make a case for FILE being something like a Java interface. In the Linux world, you can operate almost any kind of I/O device through the open/close/read/write/ioctl calls; the FILE functions are just covers on top of those; therefore in FILE you have something like an abstract class that defines the basic operations (open/read/etc) on an 'abstract I/O device', leaving it up to the various sorts of derived types to flesh those out with type-specific behavior.
Granted, it's very hard to see the OO in a pile of C code, and very easy to break the abstractions, which is why the actual OO languages are so much more popular these days.
It depends. How do you define an "object-oriented interface"? As the comments to abelenky's post show, it is easy to construct an argument that FILE is object-oriented. It depends on what you mean by "object-oriented". It doesn't have any member methods. But it does have functions specific to it.
It can not be derived from in the "conventional" sense, but it does seem to be polymorphic. Behind a FILE pointer, the implementation can vary widely. It may be a file, it may be a buffer in memory, it may be a socket or the standard output.
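As a small illustration of that last point, the function below (write_greeting is a hypothetical name) only ever sees a FILE pointer, and the caller decides what sits behind it:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: it neither knows nor cares what is behind out. */
int write_greeting(FILE *out, const char *name)
{
    assert(out != NULL && name != NULL);
    return fprintf(out, "hello, %s\n", name);
}
```

The same call works with stdout, with a stream returned by fopen(), or with the temporary file returned by tmpfile(); the calling code does not change.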
Is it encapsulated? Well, it is essentially implemented as a pointer. There is no access to the implementation details of where the file is located, or even the name of the file, unless you call the proper API functions on it. That sounds encapsulated to me.
The answer is basically whatever you want it to be. If you don't want FILE to be object-oriented, then define "object-oriented" in a way that FILE can't fulfill.
C has the first half of object orientation: encapsulation, i.e. you can have compound types like FILE* or structs.
But you can't inherit from them, which is the second (although less important) half.
No. C is not an object-oriented language.
I know that's an "absolute answer," which you didn't want, but I'm afraid it's the only answer. The reasoning is that C is not object-oriented, so no part of it can have an "object-oriented interface".
Clarification:
In my opinion, true object-orientation involves method dispatch through subtype polymorphism. If a language lacks this, it is not object-oriented.
Object-orientation is not a "technique" like GTK. It is a language feature. If the language lacks the feature, it is not object-oriented.
If object-orientation were merely a technique, then nearly every language could be called object-oriented, and the term would cease to have any real meaning.
There are different definitions of oo around. The one I find most useful is the following (inspired by Alan Kay):
objects hold state (ie references to other objects)
objects receive (and process) messages
processing a message may result in
messages being sent to the object itself or other objects
a change in the object's state
This means you can program in an object-oriented way in any imperative programming language - even assembler. A purely functional language has no state variables, which makes oo impossible or at least awkward to implement (remember: LISP is not pure!); the same should go for purely declarative languages.
In C, message passing is most often implemented as function calls with a pointer to a struct holding the object's state as first argument, which is the case for the file handling API. Still, C as a language can't be classified as OO as it doesn't have syntactic support for this style of programming.
Also, some other definitions of oo include things like class-based inheritance (so what about prototypal languages?) and encapsulation - which aren't really essential in my opinion - but some of them can be implemented in C with some pointer- and casting magic.
What methods, practices and conventions do you know of to modularize C code as a project grows in size?
Create header files which contain ONLY what is necessary to use a module. In the corresponding .c file(s), make anything not meant to be visible outside (e.g. helper functions) static. Use prefixes on the names of everything externally visible to help avoid namespace collisions. (If a module spans multiple files, things become harder, as you may need to expose internal things and not be able to hide them with "static".)
(If I were to try to improve C, one thing I would do is make "static" the default scoping of functions. If you wanted something visible outside, you'd have to mark it with "export" or "global" or something similar.)
OO techniques can be applied to C code, they just require more discipline.
Use opaque handles to operate on objects. One good example of how this is done is the stdio library -- everything is organised around the opaque FILE* handle. Many successful libraries are organised around this principle (e.g. zlib, apr)
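A sketch of the opaque-handle idea in the spirit of FILE*. The Counter type and its functions are hypothetical; both halves are shown in one listing, but the first part would live in a public header and the second in a private .c file:

```c
#include <assert.h>
#include <stdlib.h>

/* ---- public header part: users only ever see an incomplete type ---- */
typedef struct Counter Counter;

Counter *counter_create(int start);
void     counter_increment(Counter *counter);
int      counter_value(const Counter *counter);
void     counter_destroy(Counter *counter);

/* ---- private .c part: the layout is known only here ---- */
struct Counter {
    int value;
};

/* Returns NULL if memory could not be allocated. */
Counter *counter_create(int start)
{
    Counter *counter = malloc(sizeof *counter);
    if (counter != NULL)
        counter->value = start;
    return counter;
}

void counter_increment(Counter *counter)
{
    assert(counter != NULL);
    counter->value++;
}

int counter_value(const Counter *counter)
{
    assert(counter != NULL);
    return counter->value;
}

void counter_destroy(Counter *counter)
{
    free(counter);
}
```

Because client code never sees the struct definition, it cannot touch value directly or even declare a Counter by value, so the layout can change without recompiling the callers.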
Because all members of structs are implicitly public in C, you need a convention + programmer discipline to enforce the useful technique of information hiding. Pick a simple, automatically checkable convention such as "private members end with '_'".
Interfaces can be implemented using arrays of pointers to functions. Certainly this requires more work than in languages like C++ that provide in-language support, but it can nevertheless be done in C.
The High and Low-Level C article contains a lot of good tips. Especially, take a look at the "Classes and objects" section.
Standards and Style for Coding in ANSI C also contains good advice of which you can pick and choose.
1. Don't define variables in header files; instead, define the variable in the source file and add an extern statement (declaration) in the header. This will tie into #2 and #3.
2. Use an include guard on every header. This will save so many headaches.
3. Assuming you've done #1 and #2, include everything you need (but only what you need) for a certain file in that file. Don't depend on the order of how the compiler expands your include directives.
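A compact sketch of the first two points. tally.h and tally.c are hypothetical file names, shown here in one listing; in a real project the second half would be a separate file that includes the first:

```c
/* ---- tally.h (hypothetical header) ---- */
#ifndef TALLY_H
#define TALLY_H

extern int tally_total;      /* declaration only: no storage reserved here */
void tally_add(int amount);

#endif /* TALLY_H */

/* ---- tally.c (hypothetical source; would start with #include "tally.h") ---- */
#include <assert.h>

int tally_total = 0;         /* the one and only definition */

void tally_add(int amount)
{
    assert(amount >= 0);     /* precondition assumed for this sketch */
    tally_total += amount;
}
```

If the header defined tally_total instead of declaring it, every .c file that included it would get its own definition, which typically fails at link time with a duplicate symbol error. The guard makes a second inclusion of the header expand to nothing.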
The approach that Pidgin (formerly Gaim) uses is they created a Plugin struct. Each plugin populates a struct with callbacks for initialization and teardown, along with a bunch of other descriptive information. Pretty much everything except the struct is declared as static, so only the Plugin struct is exposed for linking.
Then, to handle loose coupling of the plugin communicating with the rest of the app (since it'd be nice if it did something between setup and teardown), they have a signaling system. Plugins can register callbacks to be called when specific signals (not standard C signals, but a custom extensible kind [identified by string, rather than set codes]) are issued by any part of the app (including another plugin). They can also issue signals themselves.
This seems to work well in practice - different plugins can build upon each other, but the coupling is fairly loose - no direct invocation of functions, everything goes through the signaling system.
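To make the idea concrete, here is a very small sketch of a string-identified signal table. It is not Pidgin's actual API: every name below (app_signal_connect, app_signal_emit, the fixed-size table, the sample callback) is invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_HANDLERS 16

typedef void (*app_signal_callback)(const char *name, void *event, void *user_data);

typedef struct {
    const char          *name;       /* signal the handler is registered for */
    app_signal_callback  callback;
    void                *user_data;
} handler_entry;

static handler_entry handlers[MAX_HANDLERS];
static int handler_count = 0;

/* Returns 0 on success, -1 if the fixed-size table is full.
   The name string must outlive the registration. */
int app_signal_connect(const char *name, app_signal_callback callback, void *user_data)
{
    assert(name != NULL && callback != NULL);
    if (handler_count == MAX_HANDLERS)
        return -1;
    handlers[handler_count].name = name;
    handlers[handler_count].callback = callback;
    handlers[handler_count].user_data = user_data;
    handler_count++;
    return 0;
}

/* Calls every handler registered under this name; returns how many ran. */
int app_signal_emit(const char *name, void *event)
{
    int called = 0;
    for (int i = 0; i < handler_count; i++) {
        if (strcmp(handlers[i].name, name) == 0) {
            handlers[i].callback(name, event, handlers[i].user_data);
            called++;
        }
    }
    return called;
}

/* Example plugin callback: counts how often it was invoked. */
static void count_calls(const char *name, void *event, void *user_data)
{
    (void)name;
    (void)event;
    ++*(int *)user_data;
}
```

A plugin would call app_signal_connect("message-received", count_calls, &hits) during setup, and any other part of the application could later call app_signal_emit("message-received", NULL) without knowing who is listening.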
A function should do one thing and do this one thing well.
Lots of little functions used by bigger wrapper functions help to structure code from small, easy to understand (and test!) building blocks.
Create small modules with a couple of functions each. Only expose what you must, keep anything else static inside of the module. Link small modules together with their .h interface files.
Provide Getter and Setter functions for access to static file scope variables in your module. That way, the variables are only actually written to in one place. This helps also tracing access to these static variables using a breakpoint in the function and the call stack.
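A minimal sketch of that pattern, with a hypothetical audio module whose only state is a volume setting:

```c
#include <assert.h>

/* File-scope state: invisible outside this translation unit. */
static int volume_percent = 50;

/* Every write goes through here, so one breakpoint catches all of them. */
void audio_set_volume(int percent)
{
    assert(percent >= 0 && percent <= 100);
    volume_percent = percent;
}

int audio_get_volume(void)
{
    return volume_percent;
}
```

Only the two function prototypes would appear in the module's header; volume_percent itself never does.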
One important rule when designing modular code is: Don't try to optimize unless you have to. Lots of small functions usually yield cleaner, well structured code and the additional function call overhead might be worth it.
I always try to keep variables at their narrowest scope, also within functions. For example, indices of for loops usually can be kept at block scope and don't need to be exposed at the entire function level. C is not as flexible as C++ with the "define it where you use it" but it's workable.
Breaking the code up into libraries of related functions is one way of keeping things organized. To avoid name conflicts you can also use prefixes to allow you to reuse function names, though with good names I've never really found this to be much of a problem. For example, if you wanted to develop your own math routines but still use some from the standard math library, you could prefix yours with some string: xyz_sin(), xyz_cos().
Generally I prefer the one function (or set of closely related functions) per file and one header file per source file convention. Breaking files into directories, where each directory represents a separate library is also a good idea. You'd generally have a system of makefiles or build files that would allow you to build all or part of the entire system following the hierarchy representing the various libraries/programs.
There are directories and files, but no namespaces or encapsulation. You can compile each module to a separate obj file, and link them together (as libraries).