Good OO naming scheme for c - c

I find myself with a lot of functions that are oo style in c(nothing with fancy macros or function pointers just struct + functions that take that struct type as a first argument.
Is my_func_my_type a good scheme for such things? If I use that should I try to be consistent? what if it one of my functions name becomes > 20 characters? > 25? Should I keep the style consistent even then?
Also what is a good naming scheme for constructors/initializers/destructors?
Is there something beter then new_my_type , init_my_type, free_my_type ?
P.S. Is there a good name for this/self ptr or should I just name the first parameter of OO like function as I would in a normal function (to_init, some_guy,ect.)

I would personally use a scheme consistent with most C libraries that put the library name as a prefix of the function. Which would give for instance void MyObject_MyFunction. For construction/destruction, you can stay consistent and use MyObject_Construct, MyObject_Destroy. I'd say the name don't matter much as long as you stay consistent.

'good' is a bit subjective, but I find it the easiest to maintain to choose a certain scheme (incorporating struct name into method) and follow it strictly. Really long function names I tend to avoid though, if a function is so complicated that it need such a long name it's probably time to refactor it, split into subfunctions etc.
I also do not use underscores, but that's a matter of taste.
A sample of what's used here:
typedef struct Discriminator
{
//members
} Discriminator;
DiscriminatorConstruct( Discriminator* p );
DiscriminatorDestruct( Discriminator* p );
DiscriminatorFunctionA( Discriminator* p, int arg1, int arg2 );

For lack of one within the C world, C++ provides a useful precedent in the naming of its functions:
[[namespace::...]struct::]function
There's nothing like namespaces in C, but you can just think of and model them as prefixes shared by related structs. For readability, it helps to have something to visually separate the distinct naming components. If you like using underscores in your functions, you might consider two underscores as a separator, otherwise perhaps one. (Technically, two underscores may be reserved for the implementation, but I've never seen any implementation identifier that wasn't prefixed with underscores put a double underscore internally).
Keeping the names close to C++ also helps programmers span both languages, and in migrating code back and forth if necessary. Similarly, C++ terminology may be adopted: constructor, destructor, possibly new and delete (though these names may become misnomers if the C code is ported to C++ but continues to use free/malloc).
IMHO, consistent and clear naming saves more hassles than long identifiers cause, but if something becomes a pain then seek a localised workaround such as a macro, inline wrapper function, function pointer etc..

Related

How to deal with functions with same name in C

So i have a header and source file containing an implementation for a vector. I want to use the vector to implement a heap. So I realized that I would want my functions to be specific to the class so I declared each one as static thinking I could do Vector::get(int n, Vector::Vector* vector) but apparently :: is not an operator in C and static just makes things private. Can anyone help me understand how I can make the correct encapsulation and not name all my get functions Heap_get or Vector_get?
C++ has namespaces and class specifiers to distinguish between things like that but, in C, names have to be unique.
It's a time honored tradition to simply use a (generally short) prefix for your code and hope you never have to integrate code from someone else that has used the same prefix.
So names like vecGet() or heapCreate() are exactly the way C developers do this.
Now you can do polymorphism in C but it's probably overkill for what you're trying to do.

Managing without Objects in C - And, why can I declare variables anywhere in a function in C?

everyone. I actually have two questions, somewhat related.
Question #1: Why is gcc letting me declare variables after action statements? I thought the C89 standard did not allow this. (GCC Version: 4.4.3) It even happens when I explicitly use --std=c89 on the compile line. I know that most compilers implement things that are non-standard, i.e. C compilers allowing // comments, when the standard does not specify that. I'd like to learn just the standard, so that if I ever need to use just the standard, I don't snag on things like this.
Question #2: How do you cope without objects in C? I program as a hobby, and I have not yet used a language that does not have Objects (a.k.a. OO concepts?) -- I already know some C++, and I'd like to learn how to use C on it's own. Supposedly, one way is to make a POD struct and make functions similar to StructName_constructor(), StructName_doSomething(), etc. and pass the struct instance to each function - is this the 'proper' way, or am I totally off?
EDIT: Due to some minor confusion, I am defining what my second question is more clearly: I am not asking How do I use Objects in C? I am asking How do you manage without objects in C?, a.k.a. how do you accomplish things without objects, where you'd normally use objects?
In advance, thanks a lot. I've never used a language without OOP! :)
EDIT: As per request, here is an example of the variable declaration issue:
/* includes, or whatever */
int main(int argc, char *argv[]) {
int myInt = 5;
printf("myInt is %d\n", myInt);
int test = 4; /* This does not result in a compile error */
printf("Test is %d\n", test);
return 0;
}
c89 doesn't allow this, but c99 does. Although it's taken a long time to catch on, some compilers (including gcc) are finally starting to implement c99 features.
IMO, if you want to use OOP, you should probably stick to C++ or try out Objective C. Trying to reinvent OOP built on top of C again just doesn't make much sense.
If you insist on doing it anyway, yes, you can pass a pointer to a struct as an imitation of this -- but it's still not a good idea.
It does often make sense to pass (pointers to) structs around when you need to operate on a data structure. I would not, however, advise working very hard at grouping functions together and having them all take a pointer to a struct as their first parameter, just because that's how other languages happen to implement things.
If you happen to have a number of functions that all operate on/with a particular struct, and it really makes sense for them to all receive a pointer to that struct as their first parameter, that's great -- but don't feel obliged to force it just because C++ happens to do things that way.
Edit: As far as how you manage without objects: well, at least when I'm writing C, I tend to operate on individual characters more often. For what it's worth, in C++ I typically end up with a few relatively long lines of code; in C, I tend toward a lot of short lines instead.
There is more separation between the code and data, but to some extent they're still coupled anyway -- a binary tree (for example) still needs code to insert nodes, delete nodes, walk the tree, etc. Likewise, the code for those operations needs to know about the layout of the structure, and the names given to the pointers and such.
Personally, I tend more toward using a common naming convention in my C code, so (for a few examples) the pointers to subtrees in a binary tree are always just named left and right. If I use a linked list (rare) the pointer to the next node is always named next (and if it's doubly-linked, the other is prev). This helps a lot with being able to write code without having to spend a lot of time looking up a structure definition to figure out what name I used for something this time.
#Question #1: I don't know why there is no error, but you are right, variables have to be declared at the beginning of a block. Good thing is you can declare blocks anywhere you like :). E.g:
{
int some_local_var;
}
#Question #2: actually programming C without inheritance is sometimes quite annoying. but there are possibilities to have OOP to some degree. For example, look at the GTK source code and you will find some examples.
You are right, functions like the ones you have shown are common, but the constructor is commonly devided into an allocation function and an initialization function. E.G:
someStruct* someStruct_alloc() { return (someStruct*)malloc(sizeof(someStruct)); }
void someStruct_init(someStruct* this, int arg1, arg2) {...}
In some libraries, I have even seen some sort of polymorphism, where function pointers are stored within the struct (which have to be set in the initializing function, of course). This results in a C++ like API:
someStruct* str = someStruct_alloc();
someStruct_init(str);
str->someFunc(10, 20, 30);
Regarding OOP in C, have you looked at some of the topics on SO? For instance, Can you write object oriented code in C?.
I can't put my finger on an example, but I think they enforce an OO like discipline in Linux kernel programming as well.
In terms of learning how C works, as opposed to OO in C++, you might find it easier to take a short course in some other language that doesn't have an OO derivative -- say, Modula-2 (one of my favorites) or even BASIC (if you can still find a real BASIC implementation -- last time I wrote BASIC code it was with the QBASIC that came with DOS 5.0, later compiled in full Quick BASIC).
The methods you use to get things done in Modula-2 or Pascal (barring the strong typing, which protects against certain types of errors but makes it more complicated to do certain things) are exactly those used in non-OO C, and working in a language with different syntax might (probably will, IMO) make it easier to learn the concepts without your "programming reflexes" kicking in and trying to do OO operations in a nearly-familiar language.

What style to use when naming types in C

According to this stack overflow answer, the "_t" postfix on type names is reserved in C. When using typedef to create a new opaque type, I'm used to having some sort of indication in the name that this is a type. Normally I would go with something like hashmap_t but now I need something else.
Is there any standard naming scheme for types in C? In other languages, using CapsCase like Hashmap is common, but a lot of C code I see doesn't use upper case at all. CapsCase works fairly nicely with a library prefix too, like XYHashmap.
So is there a common rule or standard for naming types in C?
Yes, POSIX reserves names ending _t if you include any of the POSIX headers, so you are advised to stay clear of those - in theory. I work on a project that has run afoul of such names two or three times over the last twenty or so years. You can minimize the risk of collision by using a corporate prefix (your company's TLA and an underscore, for example), or by using mixed case names (as well as the _t suffix); all the collisions I've seen have been short and all-lower case (dec_t, loc_t, ...).
Other than the system-provided (and system-reserved) _t suffix, there is no specific widely used convention. One of the mixed-case systems (camelCase or InitialCaps) works well. A systematic prefix works well too - the better libraries tend to be careful about these.
If you do decide to use lower-case and _t suffix, do make sure that you use long enough names and check diligently against the POSIX standard, the primary platforms you work on, and any you think you might work on to avoid unnecessary conflicts. The worst problems come when you release some name example_t to customers and then find there is a conflict on some new platform. Then you have to think about making customers change their code, which they are always reluctant to do. It is better to avoid the problem up front.
The Indian Hill style guidelines have some suggestions:
Individual projects will no doubt have
their own naming conventions. There
are some general rules however.
Names with leading and trailing underscores are reserved for system
purposes and should not be used for
any user-created names. Most systems
use them for names that the user
should not have to know. If you must
have your own private identifiers,
begin them with a letter or two
identifying the package to which they
belong.
#define constants should be in all CAPS.
Enum constants are Capitalized or in all CAPS
Function, typedef, and variable names, as well as struct, union, and
enum tag names should be in lower
case.
Many macro "functions" are in all CAPS. Some macros (such as getchar and
putchar) are in lower case since they
may also exist as functions.
Lower-case macro names are only
acceptable if the macros behave like a
function call, that is, they evaluate
their parameters exactly once and do
not assign values to named parameters.
Sometimes it is impossible to write a
macro that behaves like a function
even though the arguments are
evaluated exactly once.
Avoid names that differ only in case, like foo and Foo. Similarly,
avoid foobar and foo_bar. The
potential for confusion is
considerable.
Similarly, avoid names that look like each other. On many terminals and
printers, 'l', '1' and 'I' look quite
similar. A variable named 'l' is
particularly bad because it looks so
much like the constant '1'.
In general, global names (including
enums) should have a common prefix
identifying the module that they
belong with. Globals may alternatively
be grouped in a global structure.
Typedeffed names often have "_t"
appended to their name.
Avoid names that might conflict with
various standard library names. Some
systems will include more library code
than you want. Also, your program may
be extended someday.
C only reserves some uses of a _t suffix. As far as I can tell, this is only current identifiers ending with _t plus any identifier that starts int or uint (7.26.8). However, POSIX may reserve more.
It's a general problem in C, since you have extremely flat namespaces, and there's no silver bullet. If you're familiar with CapCase names and they work well for you, then you should continue to use them. Otherwise, you'll have to evaluate the goals of the current project and see which solution best meets them.
CapsCase is often used for types in C.
For instance, if you look at projects in the GNOME ecosystem (GTK+, GDK, GLib, GObject, Clutter, etc.), you'll see types like GtkButton or ClutterStageWindow. They only use CapsCase for data types; function names and variables are all lower-case with underscore separators - e.g. clutter_actor_get_geometry().
Type naming schemes are like indentation conventions - they generate religious wars with people asserting some sort of moral superiority for their preferred approach. It is certainly preferable to follow the style in existing code, or in related projects (e.g. for me, GNOME over the last few years.)
However, if you're starting from scratch and have no template, there's no hard-and-fast rule. If you're interested in coding efficiently and leaving work at reasonable hour so you can go home and have a beer or whatever, you certainly should pick a style and stick to it for your project, but it matters very little exactly which style you pick.
One alternate solution that works reasonably well is to use uppercase for all type names and macro names. Global variables may be CapCase (CamelBack) and all local variables lower case.
This technique helps to improve readability and also takes advantage of language syntax which reduces the number of pollution characters in variable names; e.g. gvar, kvar, type_t, etc. For example, data types cannot be syntatically confused with any other type.
Global variables are easily distinguished from locals by having at least one upper case letter.
I agree that prefixed or postfixed underscores should be avoided in all token names.
Lets look at the example below.
Its readily clear that InvertedCount is a global due to its case. It's equally clear that INT32U and RET_ERR are types due to their sytax. Its also clear that INVERT_VAL() is a macro due to the fact thats its on the right hand side and there is no cast so it cant be a data type.
One thing is for sure though. Whichever method you use, it should be inline with your organizations coding standard. For me, the least amount of clutter, the better.
Of course, style is a different issue.
#define INVERT_VAL(x) (~x)
#define CALIBRATED_VAL 100u
INT32U InvertedCount;
typedef enum {
ERR_NONE = 0,
...
} RET_ERR;
RET_ERR my_func (void)
{
INT32U val;
INT32U check_sum;
val = CALIBRATED_VAL; // --> Lower case local variable.
check_sum = INVERT_VAL(val); // --> Clear use of macris.
InvertedCount = checksum; // --> Upper case global variable.
// Looks different no g prefix required.
...
return (ERR_NONE);
}
There are many ideas and opinion on this subject, but there is no one universal standard for naming types. The most important thing is to be consistent. In the absence of coding standards, when maintaining code, resist the urge to use another naming convention. Introducing a new naming convention, even if it's perfect, can add unnecessary complexity.
This is actually a great topic to raise when interviewing people. I've never come across a good programmer that didn't have an opinion on this. No opinion or no passion in the answer indicates that the person isn't an experienced programmer.

C library naming conventions

Introduction
Hello folks, I recently learned to program in C! (This was a huge step for me, since C++ was the first language, I had contact with and scared me off for nearly 10 years.) Coming from a mostly OO background (Java + C#), this was a very nice paradigm shift.
I love C. It's such a beautiful language. What surprised me the most, is the high grade of modularity and code reusability C supports - of course it's not as high as in a OO-language, but still far beyond my expectations for an imperative language.
Question
How do I prevent naming conflicts between the client code and my C library code? In Java there are packages, in C# there are namespaces. Imagine I write a C library, which offers the operation "add". It is very likely, that the client already uses an operation called like that - what do I do?
I'm especially looking for a client friendly solution. For example, I wouldn't like to prefix all my api operations like "myuniquelibname_add" at all. What are the common solutions to this in the C world? Do you put all api operations in a struct, so the client can choose its own prefix?
I'm very looking forward to the insights I get through your answers!
EDIT (modified question)
Dear Answerers, thank You for Your answers! I now see, that prefixes are the only way to safely avoid naming conflicts. So, I would like to modifiy my question: What possibilities do I have, to let the client choose his own prefix?
The answer Unwind posted, is one way. It doesn't use prefixes in the normal sense, but one has to prefix every api call by "api->". What further solutions are there (like using a #define for example)?
EDIT 2 (status update)
It all boils down to one of two approaches:
Using a struct
Using #define (note: There are many ways, how one can use #define to achieve, what I desire)
I will not accept any answer, because I think that there is no correct answer. The solution one chooses rather depends on the particular case and one's own preferences. I, by myself, will try out all the approaches You mentioned to find out which suits me best in which situation. Feel free to post arguments for or against certain appraoches in the comments of the corresponding answers.
Finally, I would like to especially thank:
Unwind - for his sophisticated answer including a full implementation of the "struct-method"
Christoph - for his good answer and pointing me to Namespaces in C
All others - for Your great input
If someone finds it appropriate to close this question (as no further insights to expect), he/she should feel free to do so - I can not decide this, as I'm no C guru.
I'm no C guru, but from the libraries I have used, it is quite common to use a prefix to separate functions.
For example, SDL will use SDL, OpenGL will use gl, etc...
The struct way that Ken mentions would look something like this:
struct MyCoolApi
{
int (*add)(int x, int y);
};
MyCoolApi * my_cool_api_initialize(void);
Then clients would do:
#include <stdio.h>
#include <stdlib.h>
#include "mycoolapi.h"
int main(void)
{
struct MyCoolApi *api;
if((api = my_cool_api_initialize()) != NULL)
{
int sum = api->add(3, 39);
printf("The cool API considers 3 + 39 to be %d\n", sum);
}
return EXIT_SUCCESS;
}
This still has "namespace-issues"; the struct name (called the "struct tag") needs to be unique, and you can't declare nested structs that are useful by themselves. It works well for collecting functions though, and is a technique you see quite often in C.
UPDATE: Here's how the implementation side could look, this was requested in a comment:
#include "mycoolapi.h"
/* Note: This does **not** pollute the global namespace,
* since the function is static.
*/
static int add(int x, int y)
{
return x + y;
}
struct MyCoolApi * my_cool_api_initialize(void)
{
/* Since we don't need to do anything at initialize,
* just keep a const struct ready and return it.
*/
static const struct MyCoolApi the_api = {
add
};
return &the_api;
}
It's a shame you got scared off by C++, as it has namespaces to deal with precisely this problem. In C, you are pretty much limited to using prefixes - you certainly can't "put api operations in a struct".
Edit: In response to your second question regarding allowing users to specify their own prefix, I would avoid it like the plague. 99.9% of users will be happy with whatever prefix you provide (assuming it isn't too silly) and will be very UNHAPPY at the hoops (macros, structs, whatever) they will have to jump through to satisfy the remaining 0.1%.
As a library user, you can easily define your own shortened namespaces via the preprocessor; the result will look a bit strange, but it works:
#define ns(NAME) my_cool_namespace_ ## NAME
makes it possible to write
ns(foo)(42)
instead of
my_cool_namespace_foo(42)
As a library author, you can provide shortened names as desribed here.
If you follow unwinds's advice and create an API structure, you should make the function pointers compile-time constants to make inlinig possible, ie in your .h file, use the follwoing code:
// canonical name
extern int my_cool_api_add(int x, int y);
// API structure
struct my_cool_api
{
int (*add)(int x, int y);
};
typedef const struct my_cool_api *MyCoolApi;
// define in header to make inlining possible
static MyCoolApi my_cool_api_initialize(void)
{
static const struct my_cool_api the_api = { my_cool_api_add };
return &the_api;
}
Unfortunately, there's no sure way to avoid name clashes in C. Since it lacks namespaces, you're left with prefixing the names of global functions and variables. Most libraries pick some short and "unique" prefix (unique is in quotes for obvious reasons), and hope that no clashes occur.
One thing to note is that most of the code of a library can be statically declared - meaning that it won't clash with similarly named functions in other files. But exported functions indeed have to be carefully prefixed.
Since you are exposing functions with the same name client cannot include your library header files along with other header files which have name collision. In this case you add the following in the header file before the function prototype and this wouldn't effect client usage as well.
#define add myuniquelibname_add
Please note this is a quick fix solution and should be the last option.
For a really huge example of the struct method, take a look at the Linux kernel; 30-odd million lines of C in that style.
Prefixes are only choice on C level.
On some platforms (that support separate namespaces for linkers, like Windows, OS X and some commercial unices, but not Linux and FreeBSD) you can workaround conflicts by stuffing code in a library, and only export the symbols from the library you really need. (and e.g. aliasing in the importlib in case there are conflicts in exported symbols)

Are conflicting types always a problem in C?

I am having growing pains moving from Java to C. I have become used to having different methods with the same name, but which take different parameters. In C this creates problems?
Cell makeCell(int dim, int iterations, Cell parent);
Cell makeCell(Cell parent);
Is there some quick little work around for this problem, or should I just keep a stiff upper lip and call one of them _makeCell or makeCell2 or something equally ridiculous?
In C, you do not have overloaded functions - functions with the same name but different types for the arguments (ignoring esoterica such as <tgmath.h> in C99).
The functions must be given different names.
Or use a different language that supports overloaded function names.
There is no overloading in C as you understand it in Java or C++ (manually created overloading with function pointers notwithstanding but seriously, if you want object orientation, use C++, not C with kludges :-), so you should, as you suspect, call them different function names.
But don't call them _makeCell or makeCell2 since that's not descriptive. How about:
Cell makeCellFromDimensionAndIteration(int dim, int iterations, Cell parent);
Cell makeCell(Cell parent);
That first one could be shortened but make sure it still has meaning. makeCell2 will mean very little to someone reading the code (including yourself six months into the future).
Simply put, there is no method overloading in C, I think it would be possible to create preprocessor rules to rename the methods based on the number of arguments, but I have no idea how to do it, and it would give you far more work then simply using different names.
Welcome to C hell. There's one big flat namespace, there's no overloading, so you need conventions to help you invent and manage names. A good place to look for inspiration is Dave Hanson's C Interfaces and Implementations.
For your particular example I would suggest something along these lines:
Cell_T Cell_with_dim(int dim, int iterations, Cell_T parent);
Cell_T Cell_of_parent(Cell_T parent);
(Not super confident of these suggestions because I'm having a hard time guessing what the different overloadings are supposed to do.)
As others mentioned there's no way to create two functions with the same name, you can use something like:
Cell makeCell(Cell parent, int dim, int iterations);
for both cases and pass some "special value" to distinguish the two cases. For example:
cell = makeCell(par,-1,-1);
The other option I see is to use a variable number of arguments:
Cell makeCell(Cell parent, ...);
but then you have the problem of determining how many arguments have been passed and if you can't do it looking at "parent", you're basically back to the previous case as you have to use a "special value" to specify how many parameters are there.
If this is a sort of constructor (as the name suggests) I would rather have two functions:
Cell makeCell(Cell parent);
Cell setCell(Cell cell, int dim, int iterations);
one that creates a new empty Cell and the other that sets what needs to be set. Of course, it depends on the nature of "Cell" if this is a viable option for you.
Well there is no function overloading in C, but if you want or must use C, you can achieve something similar by creating a function that takes a variable number of arguments. You could start here for information on various techniques, some of which are fairly portable: http://en.wikipedia.org/wiki/Stdarg.h
Generally, though, I would caution against using this trick too extensively in C. The most common reason to use variable argument functions is to implement functions like printf(), not for function overloading kludges. But it's there if you want it.
If you are coming from Java, you may find C++ much easier to get to grips with than C. And as the compiler you are using is almost certainly a C++ compiler, why not give it a try?
It's a common problem, because you can't overload functions in C. Only the compiler can (as in tgmath.h). Having a look into tgmath.h of GCC already served my hack-o-meter for today. They found the right words i think
/* This is ugly but unless gcc gets appropriate builtins we have to do
something like this. Don't ask how it works. */
People have invented different ways to solve it. OpenGL for example take a course where they include parameter types and count into the function name:
glVertex3f
To say the function takes 3 floats.

Resources