Naming conventions for Ruby C extension developers

I'm interested in following the correct naming conventions when writing an extension for Ruby in C. Specifically, I'm referring to things such as adding _p to the function names of predicates and prefixing variables with m for module, c for class, etc.
For example, if we want to define a predicate method like the following in C, we should use _p as a suffix in the function that defines the method.
class MyClass
  def awesome?
    true
  end
end
In C:
static VALUE my_extension_my_class_awesome_p(VALUE self) {
    return Qtrue;
}
void Init_my_extension(void) {
    VALUE cMyClass = rb_define_class("MyClass", rb_cObject);
    rb_define_method(cMyClass,
                     "awesome?",
                     my_extension_my_class_awesome_p,
                     0);
}
Looking through the core Ruby source code I see suffixes for _p (predicate) and _m, which I'm not able to infer a meaning from. I'm sure there are a number of other conventions.
There are additional naming conventions, such as when to use underscores and when to use camel casing. It would be easy to create a mess without a guideline to follow when writing an extension with a substantial amount of C code.
Is there a definitive list somewhere? I never seem to turn up useful results when googling for Ruby C extension topics. Any quick examples that show the pure Ruby syntax and the equivalent C function named correctly?

Here are a couple more: http://geoffgarside.co.uk/2007/05/20/ruby-c-extensions-nested-modules-classes/
Geoff Garside has a couple dozen repos written in Ruby/C. He's pretty credible IMO. https://github.com/geoffgarside
I will keep looking for more and edit this post when I do find more.
EDIT
It looks like it's hard to find someone who wants to talk about Ruby extension naming conventions... Maybe you could try sending a tweet/email in Mr. Garside's direction. He looks pretty active on Twitter.

Related

C coding conventions for function signatures

I always follow the existing coding conventions of whatever language I am using and I have started doing C recently. I have noticed that some books display functions with the return value above the rest of the function signature, like this -
int
foo(int bar)
{
...
...
...
}
I haven't seen this in any other languages I've used. Is this the standard way of presenting C functions these days or is it some old convention that is not in general use anymore?
There are no universal conventions for code formatting in C. The popular styles are named by a project (such as "Linux kernel") or organization (GNU) or book (K&R), or stuff like that.
Wikipedia has a list of styles.
It is more to do with getting things like ctags to work effectively. Or, being able to find the function body itself (rather than any call to it) by simply doing a search for lines starting with funcname.
Ex: /^funcname
As long as you use any reasonable indenting style after that, it will be the only place (across an entire code base) where it appears at the very beginning of a line that way.
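To make the search argument concrete, here is a minimal sketch. The function names clamp_to_byte and scale_pixel are made up for illustration. With the return type on its own line, the definition is the only place where the function name starts a line, so a search for /^clamp_to_byte skips both the one-line prototype and the indented call.

```c
/* Prototype written on one line: the name is not at column 0. */
int clamp_to_byte(int value);

/* Definition: the return type sits on its own line, so the function
 * name starts the next line and a ^clamp_to_byte search lands here. */
int
clamp_to_byte(int value)
{
    if (value < 0)
        return 0;
    if (value > 255)
        return 255;
    return value;
}

/* A caller: the call below is indented, so it does not match the
 * ^clamp_to_byte pattern either. */
int
scale_pixel(int value, int factor)
{
    return clamp_to_byte(value * factor);
}
```

Calling scale_pixel(100, 3) goes through clamp_to_byte and is capped at 255.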
You can see this in the Indian Hill style manual (1990): http://www.cs.arizona.edu/~mccann/cstyle.html
It is just a matter of style and mostly habit. The compiler does not care about that. Use a style that you are comfortable with and that others can easily follow while reading your code.

Generate documentation for 2 languages with same code

Can I somehow generate documentation for 2 different languages from the same code? The problem is that I have a C API that is also exposed through a proprietary language that resembles VB.
So the exposed function in C is something like:
int Function (*PointerToObject)
While in VB it would be something like:
int Function (ByVal long PointerToObject)
I already opened another thread before about this same problem, but by that time I did not know anything about Doxygen. The last couple of days I have been reading the documentation and apparently it can create documentation for VB, but I have to have actual VB code for it to work and I don't. The only thing I have is the original C and the swig output also in C.
What I have in mind is some tool (doxygen, sphinx,...) that would enable me to create some kind of multi-language documentation from a single source (not valid doxygen, but explains the idea):
/*! \fn int Function(*PointerToObject)
* \brief A simple function.
* \Cparam[in] PointerToObject A pointer to the object being used.
* \VBparam[in] ByVal long PointerToObject A pointer to the object being used.
* \return An integer.
*/
It would be great if I could somehow integrate it with SWIG, since it is SWIG that identifies the correct VB type, but I guess I may be asking too much.
It is a bit complicated, if I am not clear enough please leave a comment I will try to explain it further.
I'm not positive how useful this will be as I haven't done precisely what you're looking for (and it is a bit of a kludge), but under similar circumstances I came to the conclusion that our best bet was to generate a fluff object just for doxygen to document.
In our case we have an LDMud which has a few hundred driver-published external functions which don't exist in the LPC language the rest of the MUD is written in. We could parse it in its native C code as a separate documentation run and include the docs with our primary documentation. In our case we have fairly thorough plain-text documentation including what we should consider the definition of these functions to be, so we use this documentation instead to generate an object with dummy function definitions and parse the plaintext documentation into function-head comments in doxygen style. This obviously doesn't provide us the advantages of including code or in-code comments with the documentation, but it does allow us to generate the documentation in multiple forms, push it around in guides, make references to these functions from other documentation, etc.
I'm not directly familiar with C so I'm not sure if there's any introspective support for performing a programmatic inventory of the functions you need and generating dummy declarations from that information or not. If you don't and have a small number of functions I suspect it would be the best investment of your time to simply write the dummy declarations by hand. If you have a really large number of these functions it may be worth your time to use doxygen to parse the original code into XML documentation, and generate the dummy object(s) from the XML. Doxygen's XML can be a pain to deal with, but if all you really need is the declaration and arguments it is (relatively) simple.
As a final aside, if your documentation can really be thought of as internal and external, you might also want to use two or more configuration files to generate different sets of documentation saved to different locations and publish them separately to your internal/external audiences. It will make the task of providing the right amount of detail to each less frustrating. Along those lines you may also find the INTERNAL_DOCS option and the @internal / @endinternal commands useful.
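A minimal sketch of the dummy-declaration idea applied to the question's function. It uses only standard Doxygen commands (\brief, \param, \return, \par). The void * parameter type, the stub body, and the wording of the VB-like signature are assumptions, since the question does not give the real ones. Doxygen has no per-language parameter command, so the second language is described in a titled paragraph instead.

```c
/*!
 * \brief A simple function.
 *
 * \param[in] PointerToObject A pointer to the object being used.
 * \return An integer.
 *
 * \par VB-like signature
 * int Function (ByVal long PointerToObject)
 */
int Function(void *PointerToObject)
{
    /* Stub body (assumption): this file exists only so Doxygen has
     * something to parse; the real implementation lives elsewhere. */
    (void)PointerToObject;
    return 0;
}
```

If the list of functions is long, a file like this could be generated from the SWIG output rather than written by hand.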

C Library to read configuration files with syntax based on curly brackets

For my C projects I'd like to use curly brackets based configuration files like:
account {
    name = "test@test.com";
    password = "test";
    autoconnect = true;
}
etc. or some variations.
I'm trying to find some nice C libraries to suit my needs. Can you please advise?
Your desired syntax is nearly identical to Lua, which would look like this:
account = {
    name = "test@test.com",
    password = "test",
    autoconnect = true,
}
If that suits you, I highly recommend Lua, as it's designed to be embeddable in C programs as a configuration or scripting facility. You can either use the raw Lua C API, or if you prefer C++ there are things like Luabind to make certain things prettier in that language.
Here is a trivial example using the pure C Lua API to retrieve values from a buffer which contains a Lua "chunk": http://lua-users.org/wiki/GettingValuesFromLua . You can basically read (or mmap) your configuration file in C, pass the pointer to the text to Lua, have Lua execute it, and then retrieve the bits and pieces iteratively. An alternative is to do "binding" (for which there is also an example on the Lua wiki). With binding, the flow is different: you set up C structures to represent your configuration data, bind them to Lua, and let the Lua configuration script actually populate (construct) a configuration object which is then accessible from C. Depending on your exact needs this may be better or worse, but in pure C (as opposed to C++), the learning curve may be steeper than the "get values" approach.
I would suggest using a lexer and parser for doing this, either the lex/yacc combo or flex/bison.
You basically write code in a .l and .y file to describe the layout and the lexer/parser generator creates C code that will process the file for you, calling functions to deliver the data to you.
Lexical analysis and parsing are a pain to do unless you're well versed in the art. Tools like those I've mentioned make the job a lot easier.
In the lexer, you get it to recognise the lexical elements like
e_account (account)
e_openbrace ({)
e_name (name)
e_string ("[^"]*")
e_semicolon (;)
and so on.
The lexer is used by the parser to detect the lexical elements and the parser has the higher level rules for deciding what constructs are valid. Things like an account section being e_account, e_openbrace, zero or more of e_stanza then finally e_closebrace. And also detecting e_stanza as being (among others) e_name, e_equals, e_string then e_semicolon.
Most of the intelligence is under the covers (and pretty ugly looking code at least for lex/yacc) but it's better than trying to write it yourself :-)
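To show what the lexer stage produces, here is a small hand-written tokenizer sketch in plain C. It is neither lex/flex input nor what those tools generate; it only illustrates how the sample config breaks into the token kinds listed above. The e_ names follow the answer, while e_ident, e_end, e_error, struct token, and next_token are assumptions made for this example.

```c
#include <ctype.h>
#include <string.h>

enum token_kind {
    e_account, e_name, e_ident, e_openbrace, e_closebrace,
    e_equals, e_string, e_semicolon, e_end, e_error
};

struct token {
    enum token_kind kind;
    const char *start;  /* first character of the token text */
    size_t length;      /* number of characters in the token */
};

/* Read one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token tok;

    while (isspace((unsigned char)*p))
        p++;

    tok.start = p;
    tok.length = 1;

    switch (*p) {
    case '\0': tok.kind = e_end; tok.length = 0; break;
    case '{':  tok.kind = e_openbrace; break;
    case '}':  tok.kind = e_closebrace; break;
    case '=':  tok.kind = e_equals; break;
    case ';':  tok.kind = e_semicolon; break;
    case '"': {
        /* Same idea as the "[^"]*" pattern: run to the next quote. */
        const char *end = strchr(p + 1, '"');
        if (end == NULL) {
            tok.kind = e_error;          /* unterminated string */
        } else {
            tok.kind = e_string;
            tok.length = (size_t)(end - p) + 1;
        }
        break;
    }
    default:
        if (isalpha((unsigned char)*p)) {
            const char *end = p;
            while (isalnum((unsigned char)*end) || *end == '_')
                end++;
            tok.length = (size_t)(end - p);
            if (tok.length == 7 && strncmp(p, "account", 7) == 0)
                tok.kind = e_account;
            else if (tok.length == 4 && strncmp(p, "name", 4) == 0)
                tok.kind = e_name;
            else
                tok.kind = e_ident;      /* any other bare word */
        } else {
            tok.kind = e_error;          /* character we do not know */
        }
        break;
    }

    *cursor = p + tok.length;
    return tok;
}
```

A parser would call next_token in a loop and check the sequence of kinds against its grammar rules. With lex/yacc or flex/bison, both halves are generated from the .l and .y files instead.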
A variant of what you described would be JSON:
account={
    name: "test@test.com",
    password: "test",
    autoconnect: true
}
http://www.json.org/
lists ~100 libraries to read and write JSON for every conceivable platform and language. There are seven libraries for C alone. The nice thing about JSON is interoperability, of course, and having a data format which is widely accepted (it even has an RFC: RFC 4627).
libconfuse has nearly the syntax you require:
/*
 * This is a C-style multi-line comment
 */
BackLog = 2147483647
bookmark heimdal {
    login = "anonymous"
    password = ${ANONPASS:-anonymous@} # environment variable substitution
}

What naming convention for a C API

We are working on a game engine written in C and currently we are using the following naming conventions.
ABClass object;
ABClassMethod(object, args)
AB being our prefix.
Our API, even if working on objects, does not have inheritance, polymorphism or anything. All we have is data types and methods working on them.
Our constants are named like AB_ConstantName, and preprocessor macros are named like AB_API_BEGIN. We don't use function-like macros.
I was wondering how well this fits as a C API. Also, note that the entire API is wrapped in Lua, so you can use it from either C or Lua. Most of the time the engine will be used from Lua.
Whatever API you come up with, for your users' mental sanity (and for yours), ensure that it's consistent throughout the code.
Consistency, to me, includes three things:
Naming. Case and use of the underscore should be regulated. For example: ABClass() is a "public" symbol while AB_Class() is not (in the sense that it might be visible, for whatever reason, to other modules but is reserved for internal use).
If you have "ABClass()", you should never have "abOtherClass()" or "AbYet_anotherClass()".
Nouns and verbs. If something is called "point" it must always be "point" and not "pnt" or "p" or similar.
The standard C library, for example, has both putc() and putchar() (yes, they are different, but the names don't tell you which one writes to stdout).
Verbs should be consistent too: avoid having "CreateNewPoint()", "BuildCircle()" and "NewSquareMake()" at the same time!
Argument position. If a set of related functions takes similar arguments (e.g. a string or a file), ensure they appear in the same position. Again, the C standard library does a poor job with fwrite() and fprintf(): one takes the file as the last argument, the other as the first.
The rest is largely up to your taste and any other constraints you might have.
For example, you mentioned you're using Lua: following a convention similar to Lua's could be a plus if programmers are exposed to both APIs at the same time.
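The three consistency points can be sketched with a small AB-prefixed example. Every name below (ABPoint, ABPointCreate, ABPointMove, ABPointScale) is invented for illustration and is not taken from the asker's engine.

```c
/* Hypothetical AB-prefixed type: one noun, always spelled "Point". */
typedef struct ABPoint {
    double x;
    double y;
} ABPoint;

/* Naming: prefix + noun + verb, same casing everywhere. */
ABPoint ABPointCreate(double x, double y)
{
    ABPoint point = { x, y };
    return point;
}

/* Argument position: functions that work on an existing object
 * always take that object as the first argument. */
void ABPointMove(ABPoint *point, double dx, double dy)
{
    point->x += dx;
    point->y += dy;
}

void ABPointScale(ABPoint *point, double factor)
{
    point->x *= factor;
    point->y *= factor;
}
```

A later ABCircleCreate() or ABCircleMove() would then be guessable without looking at the header, which is the practical payoff of the rules above.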
This seems standard enough. OpenGL did it with a gl prefix, so you can't be that far off. :)
There are a lot of C APIs. If you are creative enough to invent a new one, there's no "majority" to blame you. On the other hand, no matter which way you go there are enough zealots of other standards to get mad at you.

Using Doxygen with C, do you comment the function prototype or the definition? Or both?

I'm using Doxygen with some embedded C source. Given a .c/.h file pair, do you put Doxygen comments on the function prototype (.h file) or the function definition (.c file), or do you duplicate them in both places?
I'm having a problem in which Doxygen is warning about missing comments when I document in one place but not the other; is this expected, or is my Doxygen screwed up?
For public APIs I document at the declaration, as this is where the user usually looks first if not using the doxygen output.
I never had problems with documenting in one place only, but I used it with C++; it could be different with C, although I doubt it.
[edit] Never write it twice. Never. In-Source documentation follows DRY, too, especially concerning such copy-and-paste perversions.[/edit]
However, you can specify whether you want warnings for undocumented elements. Although such warnings look nice in theory, my experience is that they quickly become more of a burden than a help. Documenting all functions usually is not the way to go (there is such a thing as redundant documentation, or even hindering documentation, and especially too much documentation); good documentation needs a knowledgeable person spending time with it. Given that, those warnings are unnecessary.
And if you do not have the resources for writing good documentation (money, time, whatever...), then those warnings won't help either.
Quoted from my answer to this question: C/C++ Header file documentation:
I put documentation for the interface (parameters, return value, what the function does) in the interface file (.h), and the documentation for the implementation (how the function does) in the implementation file (.c, .cpp, .m). I write an overview of the class just before its declaration, so the reader has immediate basic information.
With Doxygen, this means that the documentation describing the overview, parameters and return values (\brief, \param, \return) goes on the function prototype, and inline documentation (\details) goes in the function body (you can also refer to my answer to this question: How to be able to extract comments from inside a function in doxygen?).
I often use Doxygen with C targeting embedded systems. I try to write documentation for any single object in one place only, because duplication will result in confusion later. Doxygen does some amount of merging of the docs, so in principle it is possible to document the public API in the .h file, and to have some notes on how it actually works sprinkled in the .c file. I've tried not to do that myself.
If moving the docs from one place to the other changes the number of warnings it produces, that may be a hint that something is subtly different between the declaration and the definition. Does the code compile clean with -Wall -Wextra, for example? Are there macros that mutate the code in one place and not the other? Of course, Doxygen's parser is not a full language parser, and it is possible to get it confused as well.
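A minimal sketch of documenting each object in one place only, shown as a single listing for brevity; in a real project the first part would live in the .h file and the second in the .c file. The function adc_to_millivolts and its 10-bit / 3300 mV scaling are invented for the example.

```c
#include <stdint.h>

/* ---- adc.h (hypothetical header): interface documentation ---- */

/**
 * \brief Convert a raw ADC reading to millivolts.
 * \param[in] raw Raw 10-bit sample, 0..1023.
 * \return The reading in millivolts, 0..3300.
 */
uint32_t adc_to_millivolts(uint16_t raw);

/* ---- adc.c (hypothetical source): implementation notes only ---- */

uint32_t adc_to_millivolts(uint16_t raw)
{
    /* Plain comment, not a Doxygen block: the prototype already
     * carries the documentation, so nothing is duplicated here. */
    if (raw > 1023u)
        raw = 1023u;                 /* clamp out-of-range input */
    return ((uint32_t)raw * 3300u) / 1023u;
}
```

Because the declaration and definition have identical signatures, Doxygen can match them up and attach the header comment to the function without a second copy in the .c file.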
We comment only the function definitions, but we use it with C++.
Writing it in both places is a waste of time.
As for the warnings, if your documentation looks good, it's probably fine to ignore them.
I've asked myself the same question and was pleasantly surprised to see that Doxygen actually includes the same in-line documentation that is in the .c file in the corresponding .h file when browsing the generated html documentation. Hence you don't have to repeat your in-line documentation, and Doxygen is smart enough to include it in both places!
I'm running Doxygen version 1.8.10.

Resources