Related
if I am developing a C shared library and I have my own structs. To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself? Is it a good practice? Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
I know it goes a lot closer to C++ classes but I wish to stick to C and learn how it would be done in a procedural language as opposed to OOP.
To give an example
typedef struct tag tag;
typedef struct my_custom_struct my_custom_struct;
struct tag
{
// ...
};
struct my_custom_struct
{
tag *tags;
my_custom_struct* (*add_tag)(my_custom_struct* str, tag *tag);
};
my_custom_struct* add_tag(my_custom_struct* str, tag *tag)
{
// ...
}
where add_tag is a helper that manages to add the tag to tag list inside *str.
I saw this pattern in libjson-c like here- http://json-c.github.io/json-c/json-c-0.13.1/doc/html/structarray__list.html. There is a function pointer given inside array_list to help free it.
To make common operations on these struct instances easier for library
consumers, can I provide function pointers to such functions inside
the struct itself?
It is possible to endow your structures with members that are function pointers, pointing to function types whose parameters include pointers to your structure type, and that are intended to be used more or less like C++ instance methods, more or less as presented in the question.
Is it a good practice?
TL;DR: no.
The first problem you will run into is getting those pointer members initialized appropriately. Name correspondence notwithstanding, the function pointers in instances of your structure will not automatically be initialized to point to a particular function. Unless you make the structure type opaque, users can (and undoubtedly sometimes will) declare instances without calling whatever constructor-analog function you provide for the purpose, and then chaos will ensue.
If you do make the structure opaque (which after all isn't a bad idea), then you'll need non-member functions anyway, because your users won't be able to access the function pointers directly. Perhaps something like this:
struct my_custom_struct *my_add_tag(struct my_custom_struct *str, tag *tag) {
return str->add_tag(str, tag);
}
But if you're going to provide for that, then what's the point of the extra level of indirection? (Answer: the only good reason for that would be that in different instances, the function pointer can point to different functions.)
And similar applies if you don't make the structure opaque. Then you might suppose that users would (more) directly call
str->add_tag(str, tag);
but what exactly makes that a convenience with respect to simply
add_tag(str, tag);
?
So overall, no, I would not consider this approach a good practice in general. There are limited circumstances where it may make sense to do something along these lines, but not as a general library convention.
Would there be issues with
respect to multithreading where a utility function is called in
parallel with different arguments and so on?
Not more so than with functions designated any other way, except if the function pointers themselves are being modified.
I know it goes a lot closer to C++ classes but I wish to stick to C
and learn how it would be done in a procedural language as opposed to
OOP.
If you want to learn C idioms and conventions then by all means do so. What you are describing is not one. C code and libraries can absolutely be designed with use of OO principles such as encapsulation, and to some extent even polymorphism, but it is not conventionally achieved via the mechanism you describe. This answer touches on some of the approaches that are used for the purpose.
Is it a good practice?
TLDR; no.
Background:
I've been programming almost exclusively in embedded C on STM32 microcontrollers for the last year and a half (as opposed to using C++ or "C+", as I'll describe below). It's been very insightful for me to have to learn C at the architectural level, like I have. I've studied C architecture pretty hard to get to where I can say I "know C". It turns out, as we all know, C and C++ are NOT the same language. At the syntax level, C is almost exactly a subset of C++ (with some key differences where C supports stuff C++ does not), hence why people (myself included before this) frequently think/thought they are pretty much the same language, but at the architectural level they are VASTLY DIFFERENT ANIMALS.
Aside:
Note that my favorite approach to embedded is to use what some colloquially know as "C+". It is basically using a C++ compiler to write C-style embedded code. You basically just write C how you'd expect to write C, except you use C++ classes to vastly simplify the (otherwise pure C) architecture. In other words, "C+" is a pseudonym used to describe using a C++ compiler to write C-like code that uses classes instead of "object-based C" architecture (which is described below). You may also use some advanced C++ concepts on occasion, like operator overloading or templates, but avoid the STL for the most part to not accidentally use dynamic allocation (behind-the-scenes and automatically, like C++ vectors do, for example) after initialization, since dynamic memory allocation/deallocation in normal run-time can quickly use up scarce RAM resources and make otherwise-deterministic code non-deterministic. So-called "C+" may also include using a mix of C (compiled with the C compiler) and C++ (compiled with the C++ compiler), linked together as required (don't forget your extern "C" usage in C header files included in your C++ code, as required).
The core Arduino source code (again, the core, not necessarily their example "sketches" or example code for beginners) does this really well, and can be used as a model of good "C+" design. <== before you attack me on this, go study the Arduino source code for dozen of hours like I have [again, NOT the example "sketches", but their actual source code, linked-to below], and drop your "arduino is for beginners" pride right now.
The AVR core (mix of C and "C+"-style C++) is here: https://github.com/arduino/ArduinoCore-avr/tree/master/cores/arduino
Some of the core libraries ("C+"-style C++) are here: https://github.com/arduino/ArduinoCore-avr/tree/master/libraries
[aside over]
Architectural C notes:
So, regarding C architecture (ie: actual C, NOT "C+"/C-style C++):
C is not an OO language, as you know, but it can be written in an "object-based" style. Notice I say "object-based", NOT "object oriented", as that's how I've heard other pedantic C programmers refer to it. I can say I write object-based C architecture, and it's actually quite interesting.
To make object-based C architecture, here's a few things to remember:
Namespaces can be done in C simply by prepending your namespace name and an underscore in front of something. That's all a namespace really is after-all. Ex: mylibraryname_foo(), mylibraryname_bar(), etc. Apply this to enums, for example, since C doesn't have "enum classes" like C++. Apply it to all C class "methods" too since C doesn't have classes. Apply to all global variables or defines as well that pertain to a particular library.
When making C "classes", you have 2 major architectural options, both of which are very valid and widely used:
Use public structs (possibly hidden in headers named "myheader_private.h" to give them a pseudo-sense of privacy)
Use opaque structs (frequently called "opaque pointers" since they are pointers to opaque structs)
When making C "classes", you have the option of wrapping up pointers to functions inside of your structs above to give it a more "C++" type feel. This is somewhat common, but in my opinion a horrible idea which makes the code nearly impossible to follow and very difficult to read, understand, and maintain.
1st option, public structs:
Make a header file with a struct definition which contains all your "class data". I recommend you do NOT include pointers to functions (will discuss later). This essentially gives you the equivalent of a "C++ class where all members are public." The downside is you don't get data hiding. The upside is you can use static memory allocation of all of your C "class objects" since your user code which includes these library headers knows the full specification and size of the struct.
2nd option: opaque structs:
In your library header file, make a forward declaration to a struct:
/// Opaque pointer (handle) to C-style "object" of "class" type mylibrarymodule:
typedef struct mylibrarymodule_s *mylibrarymodule_h;
In your library .c source file, provide the full definition of the struct mylibrarymodule_s. Since users of this library include only the header file, they do NOT get to see the full implementation or size of this opaque struct. That is what "opaque" means: "hidden". It is obfuscated, or hidden away. This essentially gives you the equivalent of a "C++ class where all members are private." The upside is you get true data hiding. The downside is you can NOT use static memory allocation for any of your C "class objects" in your user code using this library, since any user code including this library doesn't even know how big the struct is, so it cannot be statically allocated. Instead, the library must do dynamic memory allocation at program initialization, one time, which is safe even for embedded deterministic real-time safety-critical systems since you are not allocating or freeing memory during normal program execution.
For a detailed and full example of Option 2 (don't be confused: I call it "Option 1.5" in my answer linked-to here) see my other answer on opaque structs/pointers here: Opaque C structs: how should they be declared?.
Personally, I think the Option 1, with static memory allocation and "all public members", may be my preferred approach, but I am most familiar with the opaque struct Option 2 approach, since that's what the C code base I work in the most uses.
Bullet 3 above: including pointers to functions in your structs.
This can be done, and some do it, but I really hate it. Don't do it. It just makes your code so stinking hard to follow. In Eclipse, for instance, which has an excellent indexer, I can Ctrl + click on anything and it will jump to its definition. What if I want to see the implementation of a function I'm calling on a C "object"? I Ctrl + click it and it jumps to the declaration of the pointer to the function. But where's the function??? I don't know! It might take me 10 minutes of grepping and using find or search tools, digging all around the code base, to find the stinking function definition. Once I find it, I forget where I was, and I have to repeat it all over again for every single function, every single time I edit a library module using this approach. It's just bad. The opaque pointer approach above works fantastic instead, and the public pointer approach would be easy too.
Now, to directly answer your questions:
To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself?
Yes you can, but it only makes calling something easier. Don't do it. Finding the function to look at its implementation becomes really hard.
Is it a good practice?
No, use Option 1 or Option 2 above instead, where you now just have to call C "namespaced" "methods" on every C "object". You must simply pass the "members of the C class" into the function as the first argument for every call instead. This means instead of in C++ where you can do:
myclass.dosomething(int a, int b);
You'll just have to do in object-based C:
// Notice that you must pass the "guts", or member data
// (`mylibrarymodule` here), of each C "class" into the namespaced
// "methods" to operate on said C "class object"!
// - Essentially you're passing around the guts (member variables)
// of the C "class" (which guts are frequently referred to as
// "private data", or just `priv` in C lingo) to each function that
// needs to operate on a C object
mylibrarymodule_dosomething(mylibrarymodule_h mylibrarymodule, int a, int b);
Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
Yes, same as in any multithreaded situation where multiple threads are trying to access the same data. Just add a mutex to each C struct-based "object", and be sure each "method" acting on your C "objects" properly locks (takes) and unlocks (gives) the mutex as required before operating on any shared volatile members of the C "object".
Related:
Opaque C structs: how should they be declared? [use "Object-based" C architecture]
I would like to suggest you reading com specification, you will gain a lot. all these com, ole and dcom technology is based on a simple struct that incorporates its own data and methods.
https://www.scribd.com/document/45643943/Com-Spec
simplied more here
http://www.voidcn.com/article/p-fixbymia-beu.html
I recently came across a colleague's code that looked like this:
typedef struct A {
int x;
}A;
typedef struct B {
A a;
int d;
}B;
void fn(){
B *b;
((A*)b)->x = 10;
}
His explanation was that since struct A was the first member of struct B, so b->x would be the same as b->a.x and provides better readability.
This makes sense, but is this considered good practice? And will this work across platforms? Currently this runs fine on GCC.
Yes, it will work cross-platform(a), but that doesn't necessarily make it a good idea.
As per the ISO C standard (all citations below are from C11), 6.7.2.1 Structure and union specifiers /15, there is not allowed to be padding before the first element of a structure
In addition, 6.2.7 Compatible type and composite type states that:
Two types have compatible type if their types are the same
and it is undisputed that the A and A-within-B types are identical.
This means that the memory accesses to the A fields will be the same in both A and B types, as would the more sensible b->a.x which is probably what you should be using if you have any concerns about maintainability in future.
And, though you would normally have to worry about strict type aliasing, I don't believe that applies here. It is illegal to alias pointers but the standard has specific exceptions.
6.5 Expressions /7 states some of those exceptions, with the footnote:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
The exceptions listed are:
a type compatible with the effective type of the object;
some other exceptions which need not concern us here; and
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union).
That, combined with the struct padding rules mentioned above, including the phrase:
A pointer to a structure object, suitably converted, points to its initial member
seems to indicate this example is specifically allowed for. The core point we have to remember here is that the type of the expression ((A*)b) is A*, not B*. That makes the variables compatible for the purposes of unrestricted aliasing.
That's my reading of the relevant portions of the standard, I've been wrong before (b), but I doubt it in this case.
So, if you have a genuine need for this, it will work okay but I'd be documenting any constraints in the code very close to the structures so as to not get bitten in future.
(a) In the general sense. Of course, the code snippet:
B *b;
((A*)b)->x = 10;
will be undefined behaviour because b is not initialised to something sensible. But I'm going to assume this is just example code meant to illustrate your question. If anyone's concerned about it, think of it instead as:
B b, *pb = &b;
((A*)pb)->x = 10;
(b) As my wife will tell you, frequently and with little prompting :-)
I'll go out on a limb and oppose #paxdiablo on this one: I think it's a fine idea, and it's very common in large, production-quality code.
It's basically the most obvious and nice way to implement inheritance-based object oriented data structures in C. Starting the declaration of struct B with an instance of struct A means "B is a sub-class of A". The fact that the first structure member is guaranteed to be 0 bytes from the start of the structure is what makes it work safely, and it's borderline beautiful in my opinion.
It's widely used and deployed in code based on the GObject library, such as the GTK+ user interface toolkit and the GNOME desktop environment.
Of course, it requires you to "know what you're doing", but that is generally always the case when implementing complicated type relationships in C. :)
In the case of GObject and GTK+, there's plenty of support infrastructure and documentation to help with this: it's quite hard to forget about it. It might mean that creating a new class isn't something you do just as quickly as in C++, but that's perhaps to be expected since there's no native support in C for classes.
That's a horrible idea. As soon as someone comes along and inserts another field at the front of struct B your program blows up. And what is so wrong with b.a.x?
Anything that circumvents type checking should generally be avoided.
This hack rely on the order of the declarations and neither the cast nor this order can be enforced by the compiler.
It should work cross-platform, but I don't think it is a good practice.
If you really have deeply nested structures (you might have to wonder why, however), then you should use a temporary local variable to access the fields:
A deep_a = e->d.c.b.a;
deep_a.x = 10;
deep_a.y = deep_a.x + 72;
e->d.c.b.a = deep_a;
Or, if you don't want to copy a along:
A* deep_a = &(e->d.c.b.a);
deep_a->x = 10;
deep_a->y = deep_a->x + 72;
This shows from where a comes and it doesn't require a cast.
Java and C# also regularly expose constructs like "c.b.a", I don't see what the problem is. If what you want to simulate is object-oriented behaviour, then you should consider using an object-oriented language (like C++), since "extending structs" in the way you propose doesn't provide encapsulation nor runtime polymorphism (although one may argue that ((A*)b) is akin to a "dynamic cast").
I am sorry to disagree with all the other answers here, but this system is not compliant to standard C. It is not acceptable to have two pointers with different types which point to the same location at the same time, this is called aliasing and is not allowed by the strict aliasing rules in C99 and many other standards. A less ugly was of doing this would be to use in-line getter functions which then do not have to look neat in that way. Or perhaps this is the job for a union? Specifically allowed to hold one of several types, however there are a myriad of other drawbacks there too.
In short, this kind of dirty casting to create polymorphism is not allowed by most C standards, just because it seems to work on your compiler does not mean it is acceptable. See here for an explanation of why it is not allowed, and why compilers at high optimization levels can break code which does not follow these rules http://en.wikipedia.org/wiki/Aliasing_%28computing%29#Conflicts_with_optimization
Yes, it will work. And it is one of the core principle of Object Oriented using C. See this answer 'Object-orientation in C' for more examples about extending (i.e inheritance).
This is perfectly legal, and, in my opinion, pretty elegant. For an example of this in production code, see the GObject docs:
Thanks to these simple conditions, it is possible to detect the type
of every object instance by doing:
B *b;
b->parent.parent.g_class->g_type
or, more quickly:
B *b;
((GTypeInstance*)b)->g_class->g_type
Personally, I think that unions are ugly and tend to lead towards huge switch statements, which is a big part of what you've worked to avoid by writing OO code. I write a significant amount of code myself in this style --- typically, the first member of the struct contains function pointers that can be made to work like a vtable for the type in question.
I can see how this works but I would not call this good practice. This is depending on how the bytes of each data structure is placed in memory. Any time you are casting one complicated data structure to another (ie. structs), it's not a very good idea, especially when the two structures are not the same size.
I think the OP and many commenters have latched onto the idea that the code is extending a struct.
It is not.
This is and example of composition. Very useful. (Getting rid of the typedefs, here is a more descriptive example ):
struct person {
char name[MAX_STRING + 1];
char address[MAX_STRING + 1];
}
struct item {
int x;
};
struct accessory {
int y;
};
/* fixed size memory buffer.
The Linux kernel is full of embedded structs like this
*/
struct order {
struct person customer;
struct item items[MAX_ITEMS];
struct accessory accessories[MAX_ACCESSORIES];
};
void fn(struct order *the_order){
memcpy(the_order->customer.name, DEFAULT_NAME, sizeof(DEFAULT_NAME));
}
You have a fixed size buffer that is nicely compartmentalized. It sure beats a giant single tier struct.
struct double_order {
struct order order;
struct item extra_items[MAX_ITEMS];
struct accessory extra_accessories[MAX_ACCESSORIES];
};
So now you have a second struct that can be treated (a la inheritance) exactly like the first with an explicit cast.
struct double_order d;
fn((order *)&d);
This preserves compatibility with code that was written to work with the smaller struct. Both the Linux kernel (http://lxr.free-electrons.com/source/include/linux/spi/spi.h (look at struct spi_device)) and bsd sockets library (http://beej.us/guide/bgnet/output/html/multipage/sockaddr_inman.html) use this approach. In the kernel and sockets cases you have a struct that is run through both generic and differentiated sections of code. Not all that different than the use case for inheritance.
I would NOT suggest writing structs like that just for readability.
I think Postgres does this in some of their code as well. Not that it makes it a good idea, but it does say something about how widely accepted it seems to be.
Perhaps you can consider using macros to implement this feature, the need to reuse the function or field into the macro.
I read plenty of questions regarding
declaration of functions inside structure?
But I did not get a satisfactory answer.
My question is not about whether functions can be declared inside a structure or not?
Instead, my question is
WHY function can not be declared inside structure?
Well that is the fundamental difference between C and C++ (works in C++). C++ supports classes (and in C++ a struct is a special case of a class), and C does not.
In C you would implement the class as a structure with functions that take a this pointer explicitly, which is essentially what C++ does under the hood. Coupled with a sensible naming convention so you know what functions belong to which classes (again something C++ does under then hood with name-mangling), you get close to object-based if not object-oriented programming. For example:
typedef struct temp
{
int a;
} classTemp ;
void classTemp_GetData( classTemp* pThis )
{
printf( "Enter value of a : " );
scanf( "%d", &(pThis->a) );
}
classTemp T ;
int main()
{
classTemp_GetData( &T );
}
However, as you can see without language support for classes, implementing then can become tiresome.
In C, the functions and data structures are more or less bare; the language gives a minimum of support for combining data structures together, and none at all (directly) for including functions with those data structures.
The purpose of C is to have a language that translates as directly as possible into machine code, more like a portable assembly language than a higher-level language such as C++ (not that C++ is all that high-level). C let's you get very close to the machine, getting into details that most languages abstract away; the down side of this is that you have to get close to the machine in C to use the language to its utmost. It takes a completely different approach to programming from C++, something that the surface similarities between them hide.
Check out here for more info (wonderful discussions there).
P.S.: You can also accomplish the functionality by using function pointers, i.e.
have a pointer to a function (as a variable) inside the struct.
For example:
#include <stdio.h>
struct t {
int a;
void (*fun) (int * a); // <-- function pointers
} ;
void get_a (int * a) {
printf (" input : ");
scanf ("%d", a);
}
int main () {
struct t test;
test.a = 0;
printf ("a (before): %d\n", test.a);
test.fun = get_a;
test.fun(&test.a);
printf ("a (after ): %d\n", test.a);
return 0;
}
WHY function can not be declared inside structure?
Because C standard does not allow to declare function/method inside a structure. C is not object-oriented language.
6.7.2.1 Structure and union specifiers:
A structure or union shall not contain a member with incomplete or function type(hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
I suppose there were and are many reasons, here are several of them:
C programming language was created in 1972, and was influenced by pure assembly language, so struct was supposed as "data-only" element
As soon as C is NOT object oriented language - there is actually no sense to define functions inside "data structure", there are no such entity as constructor/method etc
Functions are directly translated to pure assembly push and call instructions and there are no hidden arguments like this
I guess, because it wouldn't make much sence in C. If you declare a function inside structure, you expect it to be somehow related to that structure, right? Say,
struct A {
int foo;
void hello() {
// smth
}
}
Would you expect hello() to have access to foo at least? Because otherwise hello() only got something like namespace, so to call it we would write A.hello() - it would be a static function, in terms of C++ - not much difference from normal C function.
If hello() has access to foo, there must be a this pointer to implement such access, which in C++ always implicitly passed to functions as first argument.
If a structure function has access to structure variables, it must be different somehow from access that have other functions, again, to add some sence to functions inside structures at all. So we have public, private modificators.
Next. You don't have inheritance in C (but you can simulate it), and this is something that adds lots of sence to declaring functions inside structutes. So here we'd like to add virtual and protected.
Now you can add namespaces and classes, and here you are, invented C++ (well, without templates).
But C++ is object-oriented, and C is not. First, people created C, then they wrote tons of programs, understood some improvements that could be made, and then, following reasonings that I mentioned earlier, they came up with C++. They did not change C instead to separate concepts - C is procedure-oriented, and C++ is object-oriented.
C was designed so that it could be processed with a relatively simple compilation system. To allow function definitions to appear within anything else would have required the compiler to keep track of the context in which the function appeared while processing it. Given that members of a structure, union, or enum declaration do not enter scope until the end of the declaration, there would be nothing that a function declared within a structure could do which a function declared elsewhere could not.
Conceptually, it might have been nice to allow constant declarations within a structure; a constant pointer to a literal function would then be a special case of that (even if the function itself had to be declared elsewhere), but that would have required the C compiler to keep track of more information for each structure member--not just its type and its offset, but also whether it was a constant and, if so, what its value should be. Such a thing would not be difficult in today's compilation environments, but early C compilers were expected to run with orders of magnitude less memory than would be available today.
Given that C has been extended to offer many features which could not very well be handled by a compiler running on a 64K (or smaller) system, it might reasonably be argued that it should no longer be bound by such constraints. Indeed, there are some aspect of C's heritage which I would like to lose, such as the integer-promotion rules which require even new integer types to follow the inconsistent rules for old types, rather than allowing the new types to have explicitly specified behavior [e.g. have a wrap32_t which could be converted to any other integer type without a typecast, but when added to any "non-wrap" integer type of any size would yield a wrap32_t]. Being able to define functions within a struct might be nice, but it would be pretty far down on my list of improvements.
In C if I declare a struct/union/enum:
struct Foo { int i ... }
when I want to use my structure I need to specify the tag:
struct Foo foo;
To loose this requirement, I have to alias my structure using typedef:
typedef struct Foo Foo;
Why not have all types/structs/whatever in the same "namespace" by default? What is the rationale behind the decision of requiring the declaration tag at each variable declaration (unless typdefe'd) ???
Many other languages do not make this distinction, and it seems that it's only bringing an extra level of complexity IMHO.
Structures/records were a very early pre-C addition to B, just after Dennis Ritchie added a the basic 'typed' structure. I believe that the original struct syntax did not have a tag at all, for every variable you made an anonymous struct:
struct {
int i;
char a[5];
} s;
Later, the tag was added to enable reuse of structure layout, but it wasn't really regarded as real 'type'. Also, removing the struct/union would make parsing impossible:
/* is Foo a union or a struct? */
Foo { int i; double x; };
Foo s;
or break the 'declaration syntax mimics expression syntax' paradigm that is so fundamental to C.
I suspect that typedef was added much later, possible a few years after the 'birth' of C.
The argument "C was the highest level language at the time." does not seem true. Algol-68 predates it and has records as proper types. The same holds for Pascal.
If you like to know more about the history of C you might find Ritchie's "The Development of the C Language" an interesting read.
Well, other languages also usually support namespaces. C doesn't.
It probably isn't the reason, but it makes sense to have at least this natural namespace.
Interesting question! Here are my thoughts.
When C was created, little abstraction existed over assembly language. There was FORTRAN, B, and others, but when C came to be it was arguably the highest level language in existence. It's goal was to provide functionality and syntax powerful enough to create and maintain an operating system, and it succeed remarkably.
Think that, at the time, porting a system to a new platform meant rewriting and adapting components to the platform's assembly language. With the release of C, it eventually came down to porting the C compiler, and recompiling existent code.
It was probably an asset back then that the very syntax of the language forced you to differentiate between types that could fit in a register, and types that couldn't.
Language syntax has evolved a lot since then, and most of the things we're used to see in modern languages are missing in C. User-defined namespaces is only one of them, and I don't think the concept of "syntax sugar" even existed back then. Or rather, C was the peak of syntax sugar.
We're surrounded with things like this. I mean, take a look at your keyboard: why do we ave a PAUSE/BREAK key? I don't think I've pressed that key for years.
It's inheritance from a time in which it made sense.
This question already has answers here:
Object Oriented pattern in C? [duplicate]
(4 answers)
Closed 1 year ago.
Possible Duplicates:
Can you write object oriented code in C?
Object Oriented pattern in C ?
I remember reading a while ago about someone (I think it was Linus Torvalds) talking about how C++ is a horrible language and how you can write object-oriented programs with C. On having had time to reflect, I don't really see how all object oriented concepts carry over into C. Some things are fairly obvious. For example:
To emulate member functions, you can put function pointers in structs.
To emulate polymorphism, you can write a function that takes a variable number of arguments and do some voodoo depending on, say, the sizeof the parameter(s)
How would you emulate encapsulation and inheritance though?
I suppose encapsulation could sort of be emulated by having a nested struct that stored private members. It would be fairly easy to get around, but could perhaps be named PRIVATE or something equally obvious to signal that it isn't meant to be used from outside the struct. What about inheritance though?
You can implement polymorphism with regular functions and virtual tables (vtables). Here's a pretty neat system that I invented (based on C++) for a programming exercise:
The constructors allocate memory and then call the class's init function where the memory is initialized. Each init function should also contain a static vtable struct that contains the virtual function pointers (NULL for pure virtual). Derived class init functions call the superclass init function before doing anything else.
A very nice API can be created by implementing the virtual function wrappers (not to be confused with the functions pointed to by the vtables) as follows (add static inline in front of it, if you do this in the header):
int playerGuess(Player* this) { return this->vtable->guess(this); }
Single inheritance can be done by abusing the binary layout of a struct:
Notice that multiple inheritance is messier as then you often need to adjust the pointer value when casting between types of the hierarchy.
Other type-specific data can be added to the virtual tables as well. Examples include runtime type info (e.g. type name as a string), linking to superclass vtable and the destructor chain. You probably want virtual destructors where derived class destructor demotes the object to its super class and then recursively calls the destructor of that and so on, until the base class destructor is reached and that finally frees the struct.
Encapsulation was done by defining the structs in player_protected.h and implementing the functions (pointed to by the vtable) in player_protected.c, and similarly for derived classes, but this is quite clumsy and it degrades performance (as virtual wrappers cannot be put to headers), so I would recommend against it.
Have you read the "bible" on the subject? See Object Oriented C...
How would you emulate encapsulation and inheritance though?
Actually, encapsulation is the easiest part. Encapsulation is a design philosophy, it has nothing at all to do with the language and everything to to with how you think about problems.
For example, the Windows FILE api is completely encapsulated. When you open a file, you get back an opaque object that contains all of the state information for the file 'object'. You hand this handle back to each of the file io apis. The encapsulation is actually much better than C++ because there is no public header file that people can look at and see the names of your private variables.
Inheritance is harder, but it isn't at all necessary in order for your code to be object oriented. In some ways aggregation is better than inheritance anyway, and aggregation is just as easy in C as in C++. see this for instance.
In response to Neil see Wikipedia for an explanation of why inheritance isn't necessary for polymorphism.
Us old-timers wrote object oriented code years before C++ compilers were available, it's a mind-set not a tool-set.
Apple's C-based CoreFoundation framework was actually written so that its "objects" could double as objects in Objective-C, an actual OO language. A fairly large subset of the framework is open source on Apple's site as CF-Lite. Might be a useful case study in a major OS-level framework done this way.
From a little bit higher altitude and considering the problem rather more open-minded than as the OOP mainstream may suggest, Object-Oriented Programming means thinking about objects as of data with associated functions. It does not necessarily mean an function has to be physically attached to an object as it is in popular languages which support paradigm of OOP, for instance in C++:
struct T
{
int data;
int get_data() const { return data; }
};
I would suggest to take a closer look at GTK+ Object and Type System. It is a brilliant example of OOP realised in C programming language:
GTK+ implements its own custom object
system, which offers standard
object-oriented features such as
inheritance and virtual function
The association can also be contractual and conventional.
Regarding encapsulation and data hiding techniques, popular and simple one may be Opaque Pointer (or Opaque Data Type) - you can pass it around but in order to load or store any information, you have to call associated function which knows how to talk to the object hidden behind that opaque pointer.
Another one, similar but different is Shadow Data type - check this link where Jon Jagger gives excellent explanation of this not-so-well-known-technique.
the gtk and glib libraries use macros to cast objects to various types.
add_widget(GTK_WIDGET(myButton));
I can't say how it's done but you can read their source to find out exactly how it's done.
Take a look at the way the VFS layer works in the Linux kernel for an example of an inheritance pattern. The file operations for the various filesystems "inherit" a set of generic file operations functions (eg generic_file_aio_read(), generic_file_llseek()...), but can override them with their own implementations (eg. ntfs_file_aio_write()).
Definitely look at Objective-C.
typedef struct objc_object {
Class isa;
} *id;
typedef struct objc_class {
struct objc_class *isa;
struct objc_class *super_class
const char *name;
long version;
long info
long instance_size;
struct objc_ivar_list *ivars;
struct objc_method_list **methodLists;
struct objc_cache *cache;
struct objc_protocol_list *protocols;
} *Class;
As you can see, inheritance information, along with other details is held in a class struct (conveniently the class can also be treated as an object).
Objective-C suffers in the same manner as C++ with encapsulation in that you need to declare your variables publicly. Straight C is much more flexible in that you can just return void pointers that only your module has internal access to, so in that respect encapsulation is much better.
I once wrote a basic OO style C paint program as part of a graphics course - I didn't go as far as the class declaration, I simply used a vtable pointer as the first element of the struct and implemented hand-coded inheritance. The neat thing about playing around with vtables at such a low level is that you can change class behaviour at runtime by changing a few pointers, or change on objects class dynamically. It was quite easy to create all sorts of hybrid objects, fake multiple inheritance, etc.
Nice article & discussion regarding Objective-C here:
http://cocoawithlove.com/2009/10/objective-c-niche-why-it-survives-in.html
For a great example of object-oriented programming in C, look at the source of POV-Ray from several years ago - version 3.1g is particularly good. "Objects" were struct with function pointers, of course. Macros were used to provide the core methods and data for an abstract object, and derived classes were structs that began with that macro. There was no attempt to deal with private/public, however. Things to be seen were in .h files and implementation details were in .c files, mostly, except for a lot of exceptions.
There were some neat tricks that I don't see how could be carried over to C++ - such as converting one class to a different but similar one on the fly just by reassigning function pointers. Simple for today's dynamic languages. I forgot the details; I think it might have been CSG intersection and union objects.
http://www.povray.org/
An interesting bit of history. Cfront, the original C++ implementation output C code and then requires a C compiler to actually build the final code. So, anything that could be expressed in C++ could be written as C.
One way to handle inheritance is by having nested structs:
struct base
{
...
};
void method_of_base(base *b, ...);
struct child
{
struct base base_elements;
...
};
You can then do calls like this:
struct child c;
method_of_base(&c.b, ...);
You might want to look at Objective-C, that's pretty much what it does. It's just a front-end that compiles Objective-C OO code to C.