Introduction
Hello folks, I recently learned to program in C! (This was a huge step for me, since C++ was the first language I had contact with, and it scared me off for nearly 10 years.) Coming from a mostly OO background (Java + C#), this was a very nice paradigm shift.
I love C. It's such a beautiful language. What surprised me the most is the high degree of modularity and code reusability C supports. Of course it's not as high as in an OO language, but it is still far beyond my expectations for an imperative language.
Question
How do I prevent naming conflicts between the client code and my C library code? In Java there are packages, in C# there are namespaces. Imagine I write a C library which offers the operation "add". It is very likely that the client already uses an operation with that name. What do I do?
I'm especially looking for a client-friendly solution. For example, I would rather not prefix all my API operations like "myuniquelibname_add". What are the common solutions to this in the C world? Do you put all API operations in a struct, so the client can choose its own prefix?
I'm very much looking forward to the insights I get through your answers!
EDIT (modified question)
Dear answerers, thank you for your answers! I now see that prefixes are the only way to safely avoid naming conflicts. So I would like to modify my question: what possibilities do I have to let the client choose his own prefix?
The answer Unwind posted is one way. It doesn't use prefixes in the normal sense, but one has to prefix every API call with "api->". What further solutions are there (using a #define, for example)?
EDIT 2 (status update)
It all boils down to one of two approaches:
Using a struct
Using #define (note: there are many ways to use #define to achieve what I want)
I will not accept any answer, because I think that there is no single correct answer. The solution one chooses depends on the particular case and one's own preferences. I will try out all the approaches you mentioned to find out which suits me best in which situation. Feel free to post arguments for or against certain approaches in the comments of the corresponding answers.
Finally, I would like to especially thank:
Unwind - for his sophisticated answer including a full implementation of the "struct-method"
Christoph - for his good answer and pointing me to Namespaces in C
All others - for Your great input
If someone finds it appropriate to close this question (as no further insights are to be expected), he/she should feel free to do so. I cannot decide this, as I'm no C guru.
I'm no C guru, but from the libraries I have used, it is quite common to use a prefix to set a library's functions apart.
For example, SDL uses SDL_, OpenGL uses gl, etc.
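To make the prefix convention concrete, here is a minimal sketch. The library name mycalc and all function names are made up for illustration; the three "files" are shown in one block for brevity.

```c
/* mycalc.h (hypothetical library header): every exported name starts with mycalc_ */
int mycalc_add(int x, int y);
int mycalc_sub(int x, int y);

/* mycalc.c */
int mycalc_add(int x, int y) { return x + y; }
int mycalc_sub(int x, int y) { return x - y; }

/* client.c: the client's own add() does not collide with the library */
int add(int a, int b, int c) { return a + b + c; }

int client_total(void)
{
    return add(1, 2, 3) + mycalc_add(30, 6);
}
```

The client keeps its own add() and calls the library through the prefixed name, so both can live in the same program.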
The struct way that Ken mentions would look something like this:
struct MyCoolApi
{
    int (*add)(int x, int y);
};

struct MyCoolApi * my_cool_api_initialize(void);
Then clients would do:
#include <stdio.h>
#include <stdlib.h>
#include "mycoolapi.h"
int main(void)
{
    struct MyCoolApi *api;

    if((api = my_cool_api_initialize()) != NULL)
    {
        int sum = api->add(3, 39);
        printf("The cool API considers 3 + 39 to be %d\n", sum);
    }
    return EXIT_SUCCESS;
}
This still has "namespace-issues"; the struct name (called the "struct tag") needs to be unique, and you can't declare nested structs that are useful by themselves. It works well for collecting functions though, and is a technique you see quite often in C.
UPDATE: Here's how the implementation side could look, this was requested in a comment:
#include "mycoolapi.h"
/* Note: This does **not** pollute the global namespace,
* since the function is static.
*/
static int add(int x, int y)
{
    return x + y;
}

struct MyCoolApi * my_cool_api_initialize(void)
{
    /* Since we don't need to do anything at initialize,
     * just keep a static struct ready and return it.
     * (It is not const, because the declared return type
     * is a pointer to a non-const struct.)
     */
    static struct MyCoolApi the_api = {
        add
    };

    return &the_api;
}
It's a shame you got scared off by C++, as it has namespaces to deal with precisely this problem. In C, you are pretty much limited to using prefixes. You can't put the functions themselves in a struct; at most you can store pointers to them there.
Edit: In response to your second question regarding allowing users to specify their own prefix, I would avoid it like the plague. 99.9% of users will be happy with whatever prefix you provide (assuming it isn't too silly) and will be very UNHAPPY at the hoops (macros, structs, whatever) they will have to jump through to satisfy the remaining 0.1%.
As a library user, you can easily define your own shortened namespaces via the preprocessor; the result will look a bit strange, but it works:
#define ns(NAME) my_cool_namespace_ ## NAME
makes it possible to write
ns(foo)(42)
instead of
my_cool_namespace_foo(42)
As a library author, you can provide shortened names as described here.
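One way a library author might offer short names (not necessarily the scheme the answer refers to) is an opt-in macro in the header. The macro name MY_COOL_API_SHORT_NAMES is an assumption made up for this sketch; the "files" are shown in one block.

```c
/* client.c: opt in before including the header (macro name is hypothetical) */
#define MY_COOL_API_SHORT_NAMES

/* mycoolapi.h: the canonical, prefixed name always exists */
int my_cool_api_add(int x, int y);

#ifdef MY_COOL_API_SHORT_NAMES
#define add my_cool_api_add   /* short alias, only on request */
#endif

/* mycoolapi.c */
int my_cool_api_add(int x, int y) { return x + y; }

/* back in client.c */
int client_sum(void)
{
    return add(3, 39);   /* expands to my_cool_api_add(3, 39) */
}
```

Clients that do not define the macro never see the short name, so it cannot clash with anything of theirs.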
If you follow unwind's advice and create an API structure, you should make the function pointers compile-time constants to make inlining possible, i.e. in your .h file, use the following code:
// canonical name
extern int my_cool_api_add(int x, int y);

// API structure
struct my_cool_api
{
    int (*add)(int x, int y);
};

typedef const struct my_cool_api *MyCoolApi;

// define in header to make inlining possible
static MyCoolApi my_cool_api_initialize(void)
{
    static const struct my_cool_api the_api = { my_cool_api_add };
    return &the_api;
}
Unfortunately, there's no sure way to avoid name clashes in C. Since it lacks namespaces, you're left with prefixing the names of global functions and variables. Most libraries pick some short and "unique" prefix (unique is in quotes for obvious reasons), and hope that no clashes occur.
One thing to note is that most of the code of a library can be statically declared - meaning that it won't clash with similarly named functions in other files. But exported functions indeed have to be carefully prefixed.
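A small sketch of that split between static helpers and prefixed exports. The file name vecsum.c and both function names are hypothetical.

```c
/* vecsum.c: hypothetical library source file */
#include <stddef.h>

/* Internal helper: 'static' gives it internal linkage, so another
 * source file may define its own clamp() without a link-time clash. */
static int clamp(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Exported function: external linkage, so it carries the library prefix. */
int vecsum_clamped_total(const int *values, size_t n, int lo, int hi)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += clamp(values[i], lo, hi);
    return total;
}
```

Only vecsum_clamped_total is visible to the linker; clamp stays private to this file.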
Since you are exposing functions with common names, a client cannot include your library's header files together with other header files that declare the same names. In that case, you can add the following to your header file before the function prototype; it does not affect client usage.
#define add myuniquelibname_add
Please note this is a quick fix solution and should be the last option.
For a really huge example of the struct method, take a look at the Linux kernel; 30-odd million lines of C in that style.
Prefixes are the only choice at the C level.
On some platforms (those that support separate namespaces for linkers, like Windows, OS X and some commercial Unices, but not Linux and FreeBSD) you can work around conflicts by putting the code in a library and exporting only the symbols you really need (and, e.g., aliasing in the import library in case there are conflicts among exported symbols).
Related
I recently transferred to a different school and CS program. The language used is C, as opposed to Java, which was taught at my previous school. One of my main issues, which may be the result of not having written enough C code, is that I'm having trouble finding a standard way of making Abstract Data Types.
From what I've seen, there are tons of ways these are implemented and the lack of a visible standard is making me worried I missed something while self learning C. I've seen implementations that hide the init variable from the user such as
#define createVector(vec) Vector vec; init_vector(&vec)
and another version which is what I would be more used to in which a handle is used to hold the returned pointer to struct from the createVector() function. The issue is I can't find any detailed description on handles online or in my course 2 book. The course 2 book only shows the interface and methods but not how they are grouped together in a way that hides the implementation from the user. I wanted to know if there was a "correct" way/standard for ADTs? The book in question is Robert Sedgewick "Algorithms in C - Third Edition".
Abstract Data Types
Split your sources.
The header (.h files) contains the abstract declarations like the datatypes (structs, functions, enums, constants, etc)
The actual implementation is done in the .c files.
When using such a (let's call it) module, you only include the header in your source.
The implementation you use is decided at link time. You may decide to use different .c files for the implementation, or a static library (or even a dynamic library).
If you want to hide the data you use opaque structures.
Why is this standard? Ever heard of the FILE type? This is the opaque type used for I/O in C's standard library. You only include the header stdio.h and leave the implementation to the compiler. The header, on the other hand, or at least the symbols that it defines, are well documented (and part of the C standard).
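A minimal sketch of an opaque structure, in the same spirit as FILE. The Counter type and its functions are invented for this example; header and implementation are shown in one block.

```c
#include <stdlib.h>

/* --- counter.h (hypothetical): clients see only an incomplete type --- */
typedef struct Counter Counter;

Counter *counter_create(int start);
void counter_increment(Counter *c);
int counter_value(const Counter *c);
void counter_destroy(Counter *c);

/* --- counter.c: the layout is known only in this file --- */
struct Counter {
    int value;
};

Counter *counter_create(int start)
{
    Counter *c = malloc(sizeof *c);

    if (c != NULL)
        c->value = start;
    return c;   /* NULL if the allocation failed */
}

void counter_increment(Counter *c) { c->value++; }
int counter_value(const Counter *c) { return c->value; }
void counter_destroy(Counter *c) { free(c); }
```

Client code can hold and pass a Counter * but cannot touch value directly, so the struct layout can change without recompiling the clients' logic against new member names.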
Abstract Classes
Java has the concept of an abstract class. Well, it also has the concept of a class in general. C does not. This is more a personal opinion but don't waste time on emulating language features that the language does not offer.
For non-abstract methods, use functions which take a pointer to a (probably opaque) struct containing all the data needed as their first parameter, like fprintf(FILE*,const char*,...).
For abstract methods you will need function pointers.
Use these function pointers (or maybe a struct of function pointers) like a strategy. You may define a function for registering such a strategy and delegate the normal functions to it. Take for example the atexit function, which globally (you may call it a singleton) adds an exit strategy.
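The registration idea could look roughly like this. All names (logger_set_sink, logger_write, counting_sink) are hypothetical; the point is only that the "normal" function delegates to whatever was registered.

```c
#include <stddef.h>

/* Hypothetical "abstract method": the output step of a logger is pluggable. */
typedef void (*log_sink_fn)(const char *msg, void *ctx);

static log_sink_fn current_sink = NULL;
static void *current_ctx = NULL;

/* Register a strategy, in the spirit of atexit(). */
void logger_set_sink(log_sink_fn sink, void *ctx)
{
    current_sink = sink;
    current_ctx = ctx;
}

/* The normal function delegates to the registered strategy, if any. */
void logger_write(const char *msg)
{
    if (current_sink != NULL)
        current_sink(msg, current_ctx);
}

/* One concrete strategy: count messages instead of printing them. */
void counting_sink(const char *msg, void *ctx)
{
    (void)msg;              /* unused by this strategy */
    ++*(int *)ctx;
}
```

The void *ctx parameter lets each strategy carry its own state without globals of its own.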
The XY Problem
I'm having trouble finding a standard for making Abstract Data Types
Read about this and apply it to your question.
Instead of trying to force your solution to work, rethink whether the attempted solution is applicable to the problem. Try to get comfortable with the techniques described above. This may need a bit of practice, but then you can model your solution in a more C-style way.
I just wanted to post this as I figured out the answer that would be more specific to my case however I understand that this probably doesn't apply to everyone. The thing I was looking for was the idea of "First Class ADTs" which use a handle to contain a pointer to the actual object that was created from a .c implementation file that would be hidden from the user.
For ADT using C, this approach is the standard as far as I know. You will have a header (.h) file and one or more implementation (.c) files. The header file might look something like:
typedef struct iDoodad Doodad;

Doodad * doodadInit(int);
void doodadDestroy(Doodad *);
int doodadGetData(Doodad *);
void doodadSetData(Doodad *, int);
For your implementation file(s) you might have:
typedef struct iDoodad {
int data;
} Doodad;
Doodad * doodadInit(int data) {
...
}
...
When developing and maintaining code, I add a new member to a structure and sometimes forget to add the code to initialize or free it which may later result in a memory leak, an ineffective assertion, or run-time memory corruption.
I try to maintain symmetry in the code where things of the same type are structured and named in a similar manner, which works for matching Construct() and Deconstruct() code but because structures are defined in separate files I can't seem to align their definitions with the functions.
Question: is there a way through coding to make myself more aware that I (or someone else) has changed a structure and functions need updating?
Efforts:
The simple:
-Have improved code organization to help minimize the problem
-Have worked to get into the habit of updating everything at once
-Have used comments to document struct members, but this just results in duplication
-Do use IDE's auto-suggest to take a look and compare suggested entries to implemented code, but this doesn't detect changes.
I had thought that maybe structure definitions could appear multiple times as long as they were identical, but that doesn't compile. I believe duplicate structure names can appear as long as they do not share visibility.
The most effective thing I've come up with is to use a compile time assertion:
static_assert(sizeof(struct Foobar) == 128, "Foobar structure size changed, reevaluate construct and destroy functions");
It's pretty good, definitely good enough. I don't mind updating the constant when modifying the struct. Unfortunately compile time assertions are very platform (compiler) and C Standard dependent, and I'm trying to maintain the backwards compatibility and cross platform compatibility of my code.
This is a good link regarding C Compile Time Assertions:
http://www.pixelbeat.org/programming/gcc/static_assert.html
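For compilers without static_assert, a common fallback is a typedef whose array size becomes negative when the condition is false. This is only a sketch; the macro name CT_ASSERT and the struct members are made up for illustration.

```c
/* Hypothetical macro: if cond is false the array size is -1,
 * which no conforming C compiler accepts. */
#define CT_ASSERT(cond, name) typedef char name[(cond) ? 1 : -1]

struct Foobar {
    int id;
    int flags;
};

/* Stops compiling when a member is added or removed,
 * until the expected size below is updated. */
CT_ASSERT(sizeof(struct Foobar) == 2 * sizeof(int), Foobar_size_changed);
```

The typedef name doubles as the error message: it shows up in the compiler diagnostic when the check fails.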
Edit:
I just had a thought; although a structure definition can't easily be relocated to a source file (unless it does not need to be shared with other source files), I believe a function can actually be relocated to a header file by inlining it.
That seems like a hacked way to make the language serve my unintended purpose, which is not what I want. I want to be professional. If the professional practice is not to approach this code-maintainability issue this way, then that is the answer.
I've been programming in C for almost 40 years, and I don't know of a good solution to this problem.
In some circles it's popular to use a set of carefully-contrived macro definitions so that you can write the structure once, not as a direct C struct declaration but as a sequence of these macros and then, by defining the macro differently and re-expanding, turn your "definition" into either a declaration or a definition or an initialization. Personally, I feel that these techniques are too obfuscatory and are more trouble than they're worth, but they can be used to decent effect.
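For readers who have not seen the macro technique, here is a rough sketch (often called "X-macros"). The struct foo, its members and the initial values are assumptions for the example.

```c
#include <stddef.h>

/* Single source of truth: one X(type, name, initial value) line per member. */
#define FOO_FIELDS(X)   \
    X(int,    x, 0)     \
    X(double, y, 1.5)   \
    X(void *, p, NULL)

/* Expansion 1: the struct declaration. */
#define X_DECLARE(type, name, init) type name;
struct foo { FOO_FIELDS(X_DECLARE) };

/* Expansion 2: the initializer, which cannot miss a member. */
#define X_INIT(type, name, init) f->name = (init);
void foo_init(struct foo *f)
{
    FOO_FIELDS(X_INIT)
}
```

Adding a member means adding one line to FOO_FIELDS; the declaration and foo_init pick it up automatically, at the cost of the obfuscation mentioned above.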
Otherwise, the only solution -- though it's not what you're looking for -- is "Be careful."
In an ideal project (although I realize full well there's no such thing) you can define your data structures first, and then spend the rest of your time writing and debugging the code that uses them. If you never have occasion to add fields to structs, then obviously you won't have this problem. (I'm sorry if this sounds like a facetious or unhelpful comment, but I think it's part of the reason that I, just as @CoffeeTableEspresso mentioned in a comment, tend not to have too many problems like this in practice.)
It's perhaps worth noting that C++ has more or less the same problem. My biggest wishlist feature in C++ was always that it would be possible to initialize class members in the class declaration. (Actually, I think I've heard that a recent revision to the C++ standard does allow this -- in which case another not-necessarily-helpful answer to your question is "Use C++ instead".)
C doesn't let you have benign struct redefinitions but it does let you have benign macro redefinitions.
So as long as you
save the struct body in a macro (according to a fixed naming convention)
redefine the macro at the point of your constructor
you will get a warning if the struct body changes and you haven't updated the corresponding constructor.
Example:
header.h:
#define MC_foo_bod \
int x; \
double y; \
void *p
struct foo{ MC_foo_bod; };
foo__init.c:
#include "header.h"
#ifdef MC_foo_bod
//try for a silent redefinition
//if it wasn't silent, the macro changed and so should this code
#define MC_foo_bod \
int x; \
double y; \
void *p
#else
#error ""
//oops--not a redefinition
//perhaps a typo in the macro name or a failure to include the header?
#endif
void foo__init(struct foo*X)
{
//...
}
It is possible to imitate namespaces in C like this:
#include <stdio.h>
#include <math.h>

struct math_namespace {
    double (*sin)(double);
};

const struct math_namespace math = {sin};

int main(void) {
    printf("%f\n", math.sin(3));
    return 0;
}
Are there any disadvantages to this, or just situations where a prefix makes more sense? It just seems cleaner to do it this way.
This method is already used in real projects such as the C Containers Library by Jacob Navia. C is not designed for object-oriented programming. This is not really efficient, since you have to (1) access the structure and (2) dereference the function pointer. If you really want prefixes, I think changing your identifiers remains the best solution.
I have used this style for a while now. It helps organize the program without all of the excess baggage of an OOP language. There is no performance penalty because accessing a function pointer in C is the same as directly accessing the function. I like it enough that I even wrote a very short paper about it. It can be found on http://slkpg.1eko.com under the link "C with Structs" at the bottom of the page.
The direct link is http://slkpg.1eko.com/cstructs.html.
Why reinvent the wheel? One disadvantage is all the setting up which could go out of sync, and also to add to the namespace you have to change the structure.
And there's no 'using namespace', so you always have to specify it. What about functions with different parameter types?
Well, this does allow you to export your namespace and it does allow a client module to use a static or local version of something that's named sin. So, in that sense, it does actually work.
The downside is that it's not terribly ELF-friendly. The struct initialization is buried in the middle of a writable data page, and it needs to be patched up. Unless you are statically linking, this is a load-time fix-up. On the bright side, it just duplicates what the ELF dispatch table would have done, so I bet it isn't even any slower. On Windows I think the considerations are similar.
Hello everyone. I actually have two questions, somewhat related.
Question #1: Why is gcc letting me declare variables after action statements? I thought the C89 standard did not allow this. (GCC Version: 4.4.3) It even happens when I explicitly use --std=c89 on the compile line. I know that most compilers implement things that are non-standard, i.e. C compilers allowing // comments, when the standard does not specify that. I'd like to learn just the standard, so that if I ever need to use just the standard, I don't snag on things like this.
Question #2: How do you cope without objects in C? I program as a hobby, and I have not yet used a language that does not have objects (a.k.a. OO concepts?). I already know some C++, and I'd like to learn how to use C on its own. Supposedly, one way is to make a POD struct and make functions similar to StructName_constructor(), StructName_doSomething(), etc. and pass the struct instance to each function. Is this the 'proper' way, or am I totally off?
EDIT: Due to some minor confusion, I am defining what my second question is more clearly: I am not asking How do I use Objects in C? I am asking How do you manage without objects in C?, a.k.a. how do you accomplish things without objects, where you'd normally use objects?
In advance, thanks a lot. I've never used a language without OOP! :)
EDIT: As per request, here is an example of the variable declaration issue:
/* includes, or whatever */
int main(int argc, char *argv[]) {
    int myInt = 5;
    printf("myInt is %d\n", myInt);
    int test = 4; /* This does not result in a compile error */
    printf("Test is %d\n", test);
    return 0;
}
C89 doesn't allow this, but C99 does. Although it's taken a long time to catch on, some compilers (including gcc) are finally starting to implement C99 features.
IMO, if you want to use OOP, you should probably stick to C++ or try out Objective C. Trying to reinvent OOP built on top of C again just doesn't make much sense.
If you insist on doing it anyway, yes, you can pass a pointer to a struct as an imitation of this -- but it's still not a good idea.
It does often make sense to pass (pointers to) structs around when you need to operate on a data structure. I would not, however, advise working very hard at grouping functions together and having them all take a pointer to a struct as their first parameter, just because that's how other languages happen to implement things.
If you happen to have a number of functions that all operate on/with a particular struct, and it really makes sense for them to all receive a pointer to that struct as their first parameter, that's great -- but don't feel obliged to force it just because C++ happens to do things that way.
Edit: As far as how you manage without objects: well, at least when I'm writing C, I tend to operate on individual characters more often. For what it's worth, in C++ I typically end up with a few relatively long lines of code; in C, I tend toward a lot of short lines instead.
There is more separation between the code and data, but to some extent they're still coupled anyway -- a binary tree (for example) still needs code to insert nodes, delete nodes, walk the tree, etc. Likewise, the code for those operations needs to know about the layout of the structure, and the names given to the pointers and such.
Personally, I tend more toward using a common naming convention in my C code, so (for a few examples) the pointers to subtrees in a binary tree are always just named left and right. If I use a linked list (rare) the pointer to the next node is always named next (and if it's doubly-linked, the other is prev). This helps a lot with being able to write code without having to spend a lot of time looking up a structure definition to figure out what name I used for something this time.
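As an illustration of plain functions operating on a struct with the left/right naming convention, here is a small binary search tree sketch. The names tree_node, tree_insert, tree_count and tree_free are assumptions for the example.

```c
#include <stdlib.h>

struct tree_node {
    int key;
    struct tree_node *left;
    struct tree_node *right;
};

/* Insert key and return the root of the subtree. Duplicates are ignored;
 * if malloc fails, the tree is left unchanged. */
struct tree_node *tree_insert(struct tree_node *root, int key)
{
    if (root == NULL) {
        struct tree_node *node = malloc(sizeof *node);

        if (node != NULL) {
            node->key = key;
            node->left = NULL;
            node->right = NULL;
        }
        return node;
    }
    if (key < root->key)
        root->left = tree_insert(root->left, key);
    else if (key > root->key)
        root->right = tree_insert(root->right, key);
    return root;
}

/* Number of nodes in the tree. */
size_t tree_count(const struct tree_node *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_count(root->left) + tree_count(root->right);
}

/* Free every node (post-order). */
void tree_free(struct tree_node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```

The code and the data layout are still coupled, as described above: every function needs to know about key, left and right.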
Question #1: I don't know why there is no error, but you are right: in C89, variables have to be declared at the beginning of a block. The good thing is that you can open a block anywhere you like :). E.g.:
{
    int some_local_var;
}
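Put into a complete function, the nested-block trick might look like the sketch below. The function name and the computation are arbitrary; only the placement of the declarations matters.

```c
/* C89-style: declarations come first in each block. */
int sum_of_squares_c89(int n)
{
    int total = 0;              /* declaration at the start of the block */

    total += n * n;             /* statements follow */
    {
        int half = n / 2;       /* a nested block may open with new declarations */
        total += half * half;
    }
    return total;
}
```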
Question #2: Actually, programming in C without inheritance is sometimes quite annoying, but there are ways to have OOP to some degree. For example, look at the GTK source code and you will find some examples.
You are right, functions like the ones you have shown are common, but the constructor is commonly divided into an allocation function and an initialization function. E.g.:
someStruct* someStruct_alloc(void) { return (someStruct*)malloc(sizeof(someStruct)); }
void someStruct_init(someStruct* this, int arg1, int arg2) {...}
In some libraries, I have even seen some sort of polymorphism, where function pointers are stored within the struct (which have to be set in the initializing function, of course). This results in a C++ like API:
someStruct* str = someStruct_alloc();
someStruct_init(str);
str->someFunc(10, 20, 30);
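A runnable sketch of that alloc/init split with a function pointer stored in the struct. Shape, its members and the two init functions are hypothetical names chosen for the example.

```c
#include <stdlib.h>

typedef struct Shape Shape;

struct Shape {
    double width, height;
    double (*area)(const Shape *self);   /* set by the init function */
};

static double rect_area(const Shape *self)
{
    return self->width * self->height;
}

static double triangle_area(const Shape *self)
{
    return 0.5 * self->width * self->height;
}

/* Allocation only; the caller must check for NULL. */
Shape *Shape_alloc(void)
{
    return malloc(sizeof(Shape));
}

/* Initialization picks the behaviour by storing a function pointer. */
void Shape_init_rect(Shape *self, double w, double h)
{
    self->width = w;
    self->height = h;
    self->area = rect_area;
}

void Shape_init_triangle(Shape *self, double w, double h)
{
    self->width = w;
    self->height = h;
    self->area = triangle_area;
}
```

A caller then writes s->area(s); unlike C++, the object has to be passed explicitly as the argument.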
Regarding OOP in C, have you looked at some of the topics on SO? For instance, Can you write object oriented code in C?.
I can't put my finger on an example, but I think they enforce an OO like discipline in Linux kernel programming as well.
In terms of learning how C works, as opposed to OO in C++, you might find it easier to take a short course in some other language that doesn't have an OO derivative -- say, Modula-2 (one of my favorites) or even BASIC (if you can still find a real BASIC implementation -- last time I wrote BASIC code it was with the QBASIC that came with DOS 5.0, later compiled in full Quick BASIC).
The methods you use to get things done in Modula-2 or Pascal (barring the strong typing, which protects against certain types of errors but makes it more complicated to do certain things) are exactly those used in non-OO C, and working in a language with different syntax might (probably will, IMO) make it easier to learn the concepts without your "programming reflexes" kicking in and trying to do OO operations in a nearly-familiar language.
I've been working/coding in C#/Java for some years, so the basics etc. don't give me too much of a hard time.
But I've never done anything larger than small command-line learning programs in C.
Now I'm trying to make a mobile phone emulator for Linux and I have no clue how to structure my code when it's not object oriented.
I have 3 big books that cover C in detail, but none of them cover how to code for maintainability in a bigger project.
So I was hoping some of you more experienced people could point me to a best practice or similar?
Some thoughts (and this question should be a community wiki)
Even if it's not fully-fledged object-oriented programming, try to stick to information hiding practices. Let your functions return pointers or handles to opaque structures (the Unix file handle API is a good (early) example). Use static functions where you'd otherwise use private methods.
Try to keep related functionality contained in a single file. This will make it easier to use the above-mentioned static keyword, as well as give you a good indication when it's time to split off some functionality into a separate file. For a Java programmer, this shouldn't be too strange a practice.
The standard practices on commenting apply. Use something like Doxygen if you're looking for something similar to javadoc/C# XML comments.
If you really haven't done anything serious in C for a while, coming to grips with practices for keeping your memory management clean and sensible, even though it's not hard per se, is going to hurt a lot more than establishing practices for maintainability. Just a warning.
Structure it generally the same way. Separate things into several files, each containing code which does related work.
Often with C, you can still think about objects. But instead of classes with methods, they are structures and functions which operate on a struct.
I'd recommend splitting your project into smaller components, and place each component in its own .c and .h file. You place the code in the .c file and the structures and prototypes in the .h file.
Doing this, you can program in an object-oriented way in C if you keep your functions reentrant: say a couple of functions implement a feature FOO. Rather than having global variables in foo.c, declare a structure named FOO in foo.h and have the functions take a pointer to a FOO structure as their first parameter.
If a function fred() is only used somewhere in foo.c, mark it static and don't put the prototype in foo.h.
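A sketch of such a foo.h/foo.c pair, shown in one block. The members of FOO and the function names are invented; the struct replaces what would otherwise be globals in foo.c.

```c
/* --- foo.h (hypothetical): the state lives in a struct the caller owns --- */
typedef struct FOO {
    long total;
    int count;
} FOO;

void foo_reset(FOO *foo);
void foo_add(FOO *foo, int sample);
long foo_mean(const FOO *foo);

/* --- foo.c --- */
/* Only used in this file: static, and no prototype in foo.h. */
static int is_valid(int sample)
{
    return sample >= 0;
}

void foo_reset(FOO *foo)
{
    foo->total = 0;
    foo->count = 0;
}

/* Negative samples are silently ignored. */
void foo_add(FOO *foo, int sample)
{
    if (is_valid(sample)) {
        foo->total += sample;
        foo->count++;
    }
}

/* Integer mean of the accepted samples; 0 if there are none. */
long foo_mean(const FOO *foo)
{
    return foo->count == 0 ? 0 : foo->total / foo->count;
}
```

Because no state is global, two FOO instances can be used side by side without interfering with each other.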
Also, search Google codesearch for C projects to see how it is done.
Just because it's C doesn't mean it's not object oriented. See this question or any number of questions named with some variation of "Learning C coming from Object Oriented background."
Techniques like this are still used today - GIMP is built on GTK+, which includes the GObject library for object-oriented coding in C. It may not be the best way, and may not be "idiomatic" C, but it may help you.
The other advice I have on how to code maintainability in a large project is use libraries. C doesn't have a lot built in, so the more functionality you can squeeze out of a portable (open source?) third party library, the less code you have to write (and therefore maintain).
GTK+ once again has GLib, which is a catch-all library with lots of features that people found themselves implementing and reimplementing in C. Apache has its own, the Apache Portable Runtime, which does something very similar (but with different kinds of functions). There are also a few string libraries, which will probably save you a lot of headache, and some more special purpose libraries (Readline for interactive prompts, Ncurses for textual interfaces like vi, etc) that are useful but may not play a huge role in your particular application.
The best choices depend to some degree on what you're writing. If you're writing an operating system kernel or a device driver, or any application for embedded systems, disregard all of the above advice. If you're looking to implement a programming language, look into flex and bison to get started with grammars on a few smaller test projects, but I recommend rolling your own parser and lexer for a serious project (if for no other reason than the improved error reporting).
You can do OO in C; see for example:
Object Oriented Programming in C
ANSI C and object-oriented programming
A few suggestions...
• Make use of the static modifier as much as you can; it's (mostly) the equivalent of private in OO languages.
• Try not to use too many global variables, especially if your program uses multiple threads.
• Group related info together into structs; these are your objects in C.
• C doesn't have exception handling, so check the return values of all the system functions you call.
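To illustrate the last point, here is a small sketch of a function that checks every call that can fail and cleans up on each error path. The helper name read_first_line and the 256-byte buffer size are assumptions for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns the first line of the file in a malloc'd buffer that the
 * caller must free, or NULL if anything goes wrong. */
char *read_first_line(const char *path)
{
    char *line;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return NULL;            /* could not open the file */

    line = malloc(256);
    if (line == NULL) {
        fclose(fp);             /* release what we already hold */
        return NULL;
    }

    if (fgets(line, 256, fp) == NULL) {
        free(line);             /* empty file or read error */
        line = NULL;
    }

    fclose(fp);
    return line;
}
```

Each failure is reported through the return value, since there is no exception to propagate it.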
OO is just syntactic sugar. The original C++ compilers compiled to C. Here's a basic example of roughly the same class in Java and in C:
Point.java
import java.lang.Math;
public class Point
{
    private int x, y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public int getDistance(Point p)
    {
        // Math.sqrt returns a double, so an explicit cast to int is required
        return (int) Math.sqrt((p.x - this.x) * (p.x - this.x) + (p.y - this.y) * (p.y - this.y));
    }
}
Point.h
typedef struct Point Point;

struct Point
{
    int x, y;
    int (*getDistance)(Point*, Point*);
};

Point* new_Point(int, int);
void delete_Point(Point*);
int getDistance(Point*, Point*);
Point.c
#include <math.h>
#include <stdlib.h>
#include "Point.h"

Point* new_Point(int x, int y)
{
    Point* this = malloc(sizeof(Point));
    if (this == NULL)
        return NULL;
    this->x = x;
    this->y = y;
    this->getDistance = getDistance;
    return this;
}

void delete_Point(Point* this)
{
    free(this);
}

int getDistance(Point* this, Point* p)
{
    /* sqrt returns a double; the result is truncated to int */
    return (int)sqrt((p->x - this->x) * (p->x - this->x) + (p->y - this->y) * (p->y - this->y));
}