Linux kernel: why do 'subclass' structs put base class info at end? - c

I was reading the chapter in Beautiful Code on the Linux kernel and the author discusses how the Linux kernel implements inheritance in the C language (amongst other topics). In a nutshell, a 'base' struct is defined and in order to inherit from it the 'subclass' struct places a copy of the base at the end of the subclass struct definition. The author then spends a couple of pages explaining a clever and complicated macro to figure out how many bytes to back up in order to convert from the base part of the object to the subclass part of the object.
My question: Within the subclass struct, why not declare the base struct as the first thing in the struct, instead of the last thing?
The main advantage of putting the base struct stuff first is when casting from the base to the subclass you wouldn't need to move the pointer at all - essentially, doing the cast just means telling the compiler to let your code use the 'extra' fields that the subclass struct has placed after the stuff that the base defines.
Just to clarify my question a little bit let me throw some code out:
struct device { // this is the 'base class' struct
    int a;
    int b;
    // etc
};
struct usb_device { // this is the 'subclass' struct
    int usb_a;
    int usb_b;
    struct device dev; // This is what confuses me -
                       // why put this here, rather than before usb_a?
};
If one happens to have a pointer to the "dev" field inside of a usb_device object then in order to cast it back to that usb_device object one needs to subtract 8 from that pointer. But if "dev" was the first thing in a usb_device casting the pointer wouldn't need to move the pointer at all.
Any help on this would be greatly appreciated. Even advice on where to find an answer would be appreciated - I'm not really sure how to Google for the architectural reason behind a decision like this. The closest I could find here on StackOverflow is:
why to use these weird nesting structure
And, just to be clear - I understand that a lot of bright people have worked on the Linux kernel for a long time so clearly there's a good reason for doing it this way, I just can't figure out what it is.

The Amiga OS uses this "common header" trick in a lot of places and it looked like a good idea at the time: Subclassing by simply casting the pointer type. But there are drawbacks.
Pro:
You can extend existing data structures
You can use the same pointer in all places where the base type is expected, no pointer arithmetic needed, saving precious cycles
It feels natural
Con:
Different compilers tend to align data structures differently. If the base structure ended with char a;, then you could have 0, 1 or 3 pad bytes afterwards before the next field of the subclass starts. This led to quite nasty bugs, especially when you had to maintain backwards compatibility (i.e. for some reason, you have to have a certain padding because an ancient compiler version had a bug and now, there is lots of code which expects the buggy padding).
You don't notice quickly when you pass the wrong structure around. With the code in your question, fields get trashed very quickly if the pointer arithmetic is wrong. That is a good thing since it raises the chances that a bug is discovered earlier.
It leads to an attitude "my compiler will fix it for me" (which it sometimes won't) and all the casts lead to a "I know better than the compiler" attitude. The latter one would make you automatically insert casts before understanding the error message, which would lead to all kinds of odd problems.
The Linux kernel is putting the common structure elsewhere; it can be but doesn't have to be at the end.
Pro:
Bugs will show early
You will have to do some pointer arithmetic for every structure, so you're used to it
You don't need casts
Con:
Not obvious
Code is more complex

I'm new to the Linux kernel code, so take my ramblings here with a grain of salt. As far as I can tell, there is no requirement as to where to put the "base class" struct inside the "subclass". That is exactly what the macros provide: You can cast to the "subclass" structure, regardless of its layout. This provides robustness to your code (the layout of a structure can be changed, without having to change your code).
Perhaps there is a convention of placing the "base class" struct at the end, but I'm not aware of it. I've seen lots of code in drivers, where different "base class" structs are used to cast back to the same "subclass" structure (from different fields in the "subclass" of course).

I don't have fresh experience from the Linux kernel, but from other kernels. I'd say that this doesn't matter at all.
You are not supposed to cast from one to the other. Allowing casts like that should only be done in very specific situations. In most cases it reduces the robustness and flexibility of the code and is considered quite sloppy. So the deepest "architectural reason" you're looking for might just be "because that's the order someone happened to write it in". Or alternatively, that's what the benchmarks showed would be the best for performance of some important code path in that code. Or alternatively, the person who wrote it thinks it looks pretty (I always build upside-down pyramids in my variable declarations and structs if I have no other constraints). Or someone happened to write it this way 20 years ago and since then everyone else has been copying it.
There might be some deeper design behind this, but I doubt it. There's just no reason to design those things at all. If you want to find out from an authoritative source why it's done this way, just submit a patch to linux that changes it and see who yells at you.

It's for multiple inheritance. struct dev isn't the only interface you can apply to a struct in the linux kernel, and if you have more than one, just casting the sub class to a base class wouldn't work. For example:
struct device {
int a;
int b;
// etc...
};
struct asdf {
int asdf_a;
};
struct usb_device {
int usb_a;
int usb_b;
struct device dev;
struct asdf asdf;
};
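As a minimal sketch of why the offset-based approach copes with more than one embedded "base", the following uses a simplified stand-in for the kernel's container_of macro (the real one adds type checking). The helper names usb_from_dev and usb_from_asdf are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of (no type checking). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct device { int a; int b; };
struct asdf   { int asdf_a; };

struct usb_device {
    int usb_a;
    int usb_b;
    struct device dev;   /* first embedded "base class" */
    struct asdf asdf;    /* second embedded "base class" */
};

/* Recover the enclosing usb_device from a pointer to its dev member. */
struct usb_device *usb_from_dev(struct device *d)
{
    assert(d != NULL);
    return container_of(d, struct usb_device, dev);
}

/* Recover the enclosing usb_device from a pointer to its asdf member. */
struct usb_device *usb_from_asdf(struct asdf *x)
{
    assert(x != NULL);
    return container_of(x, struct usb_device, asdf);
}
```

Only one member can sit at offset zero, so a plain cast could serve at most one of the two "bases"; subtracting the member's offset works for both.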

Related

Read low pointer bit in way that could *probably* work on as many systems as possible

It seems that the low bit of pointers being 0 is more-or-less pretty portable (where portable obviously does not mean "standard", but that people get away with it and can use it to some advantage in some cases, hopefully disable-able with a compile switch).
Projects that want to get fiddly have used it, with less luck on the second lowest bit:
How portable is using the low bit of a pointer as a flag?
But let's say one doesn't want to just poke a bit of data or not into a pointer of a known type. What you wish you could do instead is to use that low bit being 0 to allow a pointer type to do "double-duty" as a terminator.
So your items look like this:
struct Item {
uintptr_t flags; // low bit zero means "not an item"
type1 field1;
type2 field2;
...
};
Then you'd like to have a situation where some container of items looks like this:
[(flags field1 field2...) (flags field1 field2...) some-pointer stuff stuff...]
You'd be thus getting away with a "sunk-cost" (let's say some internal management pointer in the data structure for another purpose) doing your termination for you.
UPDATE: To be clearer on the situation: this is where one controls the codebase and structures. So any pointer in a structure used like this you could declare as a union type, for instance:
union Maybe_Terminator_Pointer {
uintptr_t flags;
type1* pointer1;
type2* pointer2;
...
};
...and then use that, if it helps. Excluding char*s is fine, as they of course would not count.
So an extra type punning problem here is: the pointer being used to do the test-for-termination is an Item*, and the routine doing the checking doesn't know which sort of pointer some-pointer is specifically.
I'm wondering what--if any--the best gamble is for being able to port and compile such a trick. That includes turning the pointers into unions, #ifdef'ing the endianness of the machine and getting a char* from the byte with the bit, etc. Whatever might be more likely to work, if anyone has experience or guesses.
Imagine it's worth the effort for your case, shaving off a large amount of data. And you have a backup scenario: if people compiling find the trick isn't working somewhere, an #ifdef could use full-sized items for terminators and waste the extra space. So I'm wondering if there are any tips on how to make this obviously-standards-violating trick have a better chance of working on more systems.
(Self-answering to provide more information and allow people to spot any potential problems with my alternative.)
So wondering if there are any tips on how to make this obviously-standards-violating trick have a better chance of working on more systems.
Tip One (as per comments) is don't do this if you can possibly find another way.
For example, "you" mention this layout:
[(flags field1 field2...) (flags field1 field2...) some-pointer stuff stuff...]
But is there anything in "stuff stuff" that isn't a pointer--perhaps boring old integers that are known to be even--where you could do the same trick? If so, why not reorder this like:
[(flags field1 field2...) (flags field1 field2...) even-uintptr_t stuff...]
That way, whether you read flags from an Item or from the terminator, it will be the same type. If you look around at things that aren't opaque the way pointers are, you might find plain numbers in the current code that are always even...for instance, a lot of byte counts measuring aggregates are likely guaranteed to satisfy % 2 == 0.
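A minimal sketch of that reordered layout, under the assumption that everything lives in one array of uintptr_t words: each item takes ITEM_WORDS words, its flags word always has the low bit set, and the word right after the last item is an even integer such as a byte count. The names ITEM_WORDS, make_flags and count_items are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: an item is ITEM_WORDS words long and starts with a
 * flags word whose low bit is 1. The word after the last item is an even
 * integer (for example a byte count), which doubles as the terminator. */
#define ITEM_WORDS 2
#define ITEM_FLAG  ((uintptr_t)1)

/* Build a flags word for an item; the low bit marks "this is an item". */
uintptr_t make_flags(uintptr_t payload_bits)
{
    return (payload_bits << 1) | ITEM_FLAG;
}

/* Count items by walking until a word with low bit 0 is found. */
size_t count_items(const uintptr_t *words)
{
    size_t n = 0;
    assert(words != NULL);
    while (words[n * ITEM_WORDS] & ITEM_FLAG)
        n++;
    return n;
}
```

Because every word read here is a uintptr_t, no pointer is reinterpreted and no aliasing question arises; that is the point of preferring an even integer over a pointer as the terminator.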
Tip Two applies to the above alternative--and perhaps helps the odds with the standards-breaking-pointer-version too. Be sure that when you write the value you go through an "aliasing" pointer, and do not write the field directly via . or ->.
There is no requirement for the compiler to ensure memory coherence between two different structures on a field, just because they are the same type. Say struct A begins with an uintptr_t field_a and struct B begins with an uintptr_t field_b, and you put a pointer to both at the same address. If you do some_a->field_a = value;, then reading back from a some_b->field_b pointer at that address might well not see that update, because the compiler doesn't expect you to be writing B fields via an A pointer.
Hence go through a pointer to do the write. Something like uintptr_t *alias = &some_a->field_a; and then *alias = value will enforce coherence with the successive reads of any integer (!). (Dissatisfaction with the performance consequences of this property of pointers is why restrict exists. If this trick is to work, it can only do so by exploiting the non-restrict behavior of pointers.)
(!) - I think you only have to do the write through a pointer, and not the reads, but perhaps someone can provide insight.

Use of FAR* in C++11

Firstly, I have read:
A. What does "const char far* inStrSource" mean?
From which I know that FAR pointer in a segmented architecture computer, is a pointer which includes a segment selector, making it possible to point to addresses outside of the current segment.
B. what is FAR PASCAL?
From this one I know: 'FAR is a fall back to 16-bit days when heap memory was segmented. NEAR data was limited in size and faster, FAR was allowed to be larger but more expensive.'
I have been given an old C/C++ code to translate to C++11 which uses short FAR* heavily. The short FAR* type will not resolve in VS2013 (using C++11), so my question is should I be replacing the short FAR* data type [I know the answer is probably 'definitely'], if so what with? If not, how can I get this to resolve?
Just get rid of them altogether and let the compiler deal with it.
If you wish to erase them all from your code with a context-aware IDE (any decent IDE can do this), you can:
Add #define FAR /* nothing */ (no trailing semicolon) so the code compiles in the meantime;
Replace the identifier FAR with nothing;
Delete the remaining #define.
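A minimal sketch of the first step, assuming no platform header has already defined FAR; the function sum_samples is invented purely to show a legacy-style signature compiling unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: no platform header has already defined FAR. */
#ifndef FAR
#define FAR /* expands to nothing on flat-memory targets */
#endif

/* Legacy-style signature: with FAR empty this is just `const short *`. */
long sum_samples(const short FAR *samples, int count)
{
    long total = 0;
    assert(samples != NULL || count == 0);
    for (int i = 0; i < count; i++)
        total += samples[i];
    return total;
}
```

Once every FAR has been edited out of the source, the #ifndef block can be deleted and the signature reads const short *samples.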
Yes, you should delete it since you're no longer working on a segmented memory architecture, I assume. On any modern system, the concept expressed as short FAR * is now expressed as short *, i.e. just a plain "pointer to short" type.
Don't mess around with the preprocessor, that just adds confusion.
Simply use any capable text editing system (your IDE or text editor are good first-hand choices) to delete the text.
Remember to search for the string FAR using all-upper case, with a case-sensitive search, and enabling any "whole word match only" flag to lessen the risk of false matches. Do a replace for an empty string, effectively editing out the FAR words from the code.

Why do C written libraries use so many structs?

I've looked at some open source libraries, and I've realized that they are basically a great stack of structs. I've seen few methods.
Why do libraries written in C use so many structs? What's the basis behind this? To me, this looked like an attempt to simulate object orientation, because a quick search told me that each struct is "instantiated" by the using program to make something; for example, in some desktop environments for Linux, each window was a struct in the GUI library used.
Anyway, that is the question.
Structs are a great way to organize data. And data is fundamental, as Fred Brooks knew decades ago:
"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
Object-oriented programming doesn't have to be merely simulated in C, it can be realized. For example, did you know that inside your structs you can store function pointers which operate on those same structs, and then you are a little bit closer to C++'s classes?
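To make the function-pointer idea concrete, here is a small sketch; struct shape and its functions are hypothetical names, not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "class": data plus a function pointer acting as a method. */
struct shape {
    double width;
    double height;
    double (*area)(const struct shape *self);
};

double rect_area(const struct shape *self)
{
    return self->width * self->height;
}

double triangle_area(const struct shape *self)
{
    return 0.5 * self->width * self->height;
}

/* Callers dispatch through the pointer without knowing the concrete kind. */
double shape_area(const struct shape *s)
{
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}
```

A caller would fill in the struct, for example struct shape r = { 3.0, 4.0, rect_area };, and then call shape_area(&r) without caring which function sits behind the pointer.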
Also consider extensibility: even a function taking many arguments may be improved by taking a single struct, because then its signature does not need to change when a new argument is added.
Finally, C does not have multiple return values from a single function call. But it can return a struct, which is about the same thing. C is a lot about building your own tools from the raw language, and being able to stash a bunch of related data and/or functions together in one place is a good building block.
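Both points, the extensible argument struct and the struct used as a multiple return value, can be sketched together. The names div_result, div_options and divide are made up for the example, and the rounding logic assumes non-negative operands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: quotient and remainder returned together. */
struct div_result {
    int quotient;
    int remainder;
};

/* Hypothetical options struct: new fields can be added later without
 * changing the function's signature. */
struct div_options {
    int round_up;   /* nonzero: bump the quotient when a remainder exists */
};

/* Assumes numerator >= 0 and denominator > 0. opts may be NULL. */
struct div_result divide(int numerator, int denominator,
                         const struct div_options *opts)
{
    struct div_result r;
    assert(denominator > 0);
    r.quotient = numerator / denominator;
    r.remainder = numerator % denominator;
    if (opts != NULL && opts->round_up && r.remainder != 0)
        r.quotient++;
    return r;
}
```

Passing NULL for the options keeps the default behaviour, so existing callers are unaffected when a new option field is added.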
With or without object orientation, structures are a useful way to group aggregate data into a single symbol. You can copy the structure wherever you like without having to write out all the members each time, and this makes the structure easier to change if you have to.
It also makes it easier to reference certain members using pointer arithmetic, if you're careful (see sockaddr).
Same argument as with arrays.
Simply put, there's no reason not to use structures.
Structures are useful when retrieving data through a pointer, because a single pointer is enough to reach the whole bunch of data within a structure.
One, it keeps the APIs clean. Instead of passing N separate arguments to a function, you pass a single argument containing N members.
Two, it allows the library to hide implementation details from the programmer. For example, the C FILE type abstracts away some details of stream I/O, details which vary from implementation to implementation. We don't need to know those details, so they're not exposed to us; we just use the FILE type to pass that information around.

what the author of nedtries means by "in-place"?

I just implemented a kind of bitwise trie (based on nedtries), but my code does a lot of memory allocation (one for each node).
Contrary to my implementation, nedtries are claimed to be fast, among other things, because of their small number of memory allocations (if any).
The author claims his implementation to be "in-place", but what does it really mean in this context?
And how does nedtries achieve such a small number of dynamic memory allocations?
PS: I know that the sources are available, but the code is pretty hard to follow and I cannot figure out how it works.
I'm the author, so this is for the benefit of the many according to Google who are similarly having difficulties in using nedtries. I would like to thank the people here on Stack Overflow for not making unpleasant comments about me personally, which some other discussions about nedtries do.
I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:
#include <assert.h>
#include <stdio.h>
#include "nedtrie.h"

typedef struct foo_s foo_t;
struct foo_s {
    NEDTRIE_ENTRY(foo_t) link;
    size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;

static size_t fookeyfunct(const foo_t *RESTRICT r)
{
    return r->key;
}

NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));

int main(void)
{
    foo_t a, b, c, *r;
    NEDTRIE_INIT(&footree);
    a.key=2;
    NEDTRIE_INSERT(foo_tree_s, &footree, &a);
    b.key=6;
    NEDTRIE_INSERT(foo_tree_s, &footree, &b);
    r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
    assert(r==&b);
    c.key=5;
    r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
    assert(r==&b); /* NFIND finds next largest. Invert the key function to invert this */
    NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
    NEDTRIE_FOREACH(r, foo_tree_s, &footree)
    {
        /* %p expects void *, and size_t prints with %zu */
        printf("%p, %zu\n", (void *)r, r->key);
    }
    NEDTRIE_PREV(foo_tree_s, &footree, &a);
    return 0;
}
You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it otherwise it can contain whatever you like. You also need a key generating function. Other than that, it's pretty boilerplate.
I wouldn't have chosen this system of macro based initialisation myself! But it's for compatibility with the BSD rbtree.h so nedtries is very easy to swap in to anything using BSD rbtree.h.
Regarding my usage of "in place" algorithms, well I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in place algorithm it will only touch those 64 bytes, i.e. it won't make use of extra metadata, or allocate some extra memory, or indeed write to global state. A good example is an "in place" sort implementation where only the collection being sorted (and I suppose the thread stack) gets touched.
Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.
Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you state at any given time. In fact, all debugging from my end happens in a C++ build and I meticulously transcribe the C++ changes into macroised C.
Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things and I am typically amazed what some people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them - equally, asking on Stack Overflow doesn't let me know immediately that there is a docs problem but rather relies on me to find it next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion say like here on Stack Overflow.
Niall
I took a look at the nedtrie.h source code.
It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.
You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.
So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.
At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:
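The piggybacking idea can be sketched with a much simpler structure than a trie. The following is a hypothetical intrusive singly linked list, not the nedtries API: the link lives inside each item, so the container itself never allocates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive list; this is NOT the nedtries API. */
struct link { struct link *next; };

struct item {
    struct link link;   /* bookkeeping lives inside the item itself */
    size_t key;
};

struct list_head { struct link *first; };

/* Push onto the front; no malloc, only the caller's memory is touched. */
void list_push(struct list_head *h, struct item *it)
{
    assert(h != NULL && it != NULL);
    it->link.next = h->first;
    h->first = &it->link;
}

/* Find an item by key, or return NULL. Since link is the first member of
 * struct item, a pointer to it converts back to a pointer to the item. */
struct item *list_find(const struct list_head *h, size_t key)
{
    for (struct link *l = h->first; l != NULL; l = l->next) {
        struct item *it = (struct item *)l;
        if (it->key == key)
            return it;
    }
    return NULL;
}
```

Whoever allocates the struct item (on the stack, statically, or with malloc) has already provided all the memory the list needs, which matches the sense of "in-place" described above.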
In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc, but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).
In-place means to operate on the input data and (possibly) update it. The implication is that there is no copying and/or moving of the input data. This may result in losing the original values of the input data, which you will need to consider if it is relevant for your particular case.

'Multipurpose' linked list implementation in pure C

This is not exactly a technical question, since I know C kind of enough to do the things I need to (I mean, in terms of not 'letting the language get in your way'), so this question is basically a 'what direction to take' question.
Situation is: I am currently taking an advanced algorithms course, and for the sake of 'growing up as programmers', I am required to use pure C to implement the practical assignments (it works well: pretty much any small mistake you make actually forces you to understand completely what you're doing in order to fix it). In the course of implementing, I obviously run into the problem of having to implement the 'basic' data structures from the ground up: actually not only linked lists, but also stacks, trees, et cetera.
I am focusing on lists in this topic because it's typically a structure I end up using a lot in the program, either as a 'main' structure or as a 'helper' structure for other bigger ones (for example, a hash tree that resolves conflicts by using a linked list).
This requires that the list stores elements of lots of different types. I am assuming here as a premise that I don't want to re-code the list for every type. So, I can come up with these alternatives:
Making a list of void pointers (kinda inelegant; harder to debug)
Making only one list, but having a union as 'element type', containing all element types I will use in the program (easier to debug; wastes space if elements are not all the same size)
Using a preprocessor macro to regenerate the code for every type, in the style of SGLIB, 'imitating' C++'s STL (creative solution; doesn't waste space; elements have the explicit type they actually are when they are returned; any change in list code can be really dramatic)
Your idea/solution
To make the question clear: which one of the above is best?
PS: Since I am basically in an academic context, I am also very interested in the view of people working with pure C out there in the industry. I understand that most pure C programmers are in the embedded devices area, where I don't think this kind of problem I am facing is common. However, if anyone out there knows how it's done 'in the real world', I would be very interested in your opinion.
A void * is a bit of a pain in a linked list since you have to manage its allocation separately from the list itself. One approach I've used in the past is to have a 'variable sized' structure like:
typedef struct _tNode {
struct _tNode *prev;
struct _tNode *next;
int payloadType;
char payload[1]; // or use different type for alignment.
} tNode;
Now I realize that doesn't look variable sized but let's allocate a structure thus:
typedef struct {
char Name[30];
char Addr[50];
} tPerson;
tNode *node = malloc (sizeof (tNode) - 1 + sizeof (tPerson));
Now you have a node that, for all intents and purposes, looks like this:
typedef struct _tNode {
struct _tNode *prev;
struct _tNode *next;
int payloadType;
char Name[30];
char Addr[50];
} tNode;
or, in graphical form (where [n] means n bytes):
+----------------+
|    prev[4]     |
+----------------+
|    next[4]     |
+----------------+
| payloadType[4] |
+----------------+                +----------+
|   payload[1]   | <- overlap ->  | Name[30] |
+----------------+                +----------+
                                  | Addr[50] |
                                  +----------+
That is, assuming you know how to address the payload correctly. This can be done as follows:
node->prev = NULL;
node->next = NULL;
node->payloadType = PLTYP_PERSON;
tPerson *person = (tPerson *)node->payload; // cast for easy changes to payload.
strcpy (person->Name, "Bob Smith");
strcpy (person->Addr, "7 Station St");
That cast line simply casts the address of the payload character (in the tNode type) to be an address of the actual tPerson payload type.
Using this method, you can carry any payload type you want in a node, even different payload types in each node, without the wasted space of a union. This wastage can be seen with the following:
union {
int x;
char y[100];
} u;
where 96 bytes are wasted every time you store an integer type in the list (for a 4-byte integer).
The payload type in the tNode allows you to easily detect what type of payload this node is carrying, so your code can decide how to process it. You can use something along the lines of:
#define PAYLOAD_UNKNOWN 0
#define PAYLOAD_MANAGER 1
#define PAYLOAD_EMPLOYEE 2
#define PAYLOAD_CONTRACTOR 3
or (probably better):
typedef enum {
PAYLOAD_UNKNOWN,
PAYLOAD_MANAGER,
PAYLOAD_EMPLOYEE,
PAYLOAD_CONTRACTOR
} tPayLoad;
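The same idea can also be written with a C99 flexible array member instead of a one-element array. This is only a sketch under that assumption; the names node, node_new and node_get are hypothetical, and the payload is copied with memcpy to sidestep alignment questions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PAYLOAD_UNKNOWN, PAYLOAD_PERSON } payload_type; /* hypothetical tags */

typedef struct node {
    struct node *prev;
    struct node *next;
    payload_type type;
    size_t size;               /* payload size in bytes */
    unsigned char payload[];   /* C99 flexible array member */
} node;

/* Allocate a node and copy the payload bytes in; returns NULL on failure. */
node *node_new(payload_type type, const void *data, size_t size)
{
    node *n;
    assert(data != NULL || size == 0);
    n = malloc(sizeof *n + size);
    if (n == NULL)
        return NULL;
    n->prev = NULL;
    n->next = NULL;
    n->type = type;
    n->size = size;
    if (size > 0)
        memcpy(n->payload, data, size);
    return n;
}

/* Copy the payload back out into a caller-supplied object of n->size bytes. */
void node_get(const node *n, void *out)
{
    assert(n != NULL && out != NULL);
    memcpy(out, n->payload, n->size);
}
```

Each node is still a single allocation, and the stored size lets generic list code copy or free a node without knowing the payload type.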
My $.002:
Making a list of void pointers (kinda inelegant; harder to debug)
This isn't such a bad choice, IMHO, if you must write in C. You might add API methods to allow the application to supply a print() method for ease of debugging. Similar methods could be invoked when (e.g.) items get added to or removed from the list. (For linked lists, this is usually not necessary, but for more complex data structures -- hash tables, for example -- it can sometimes be a lifesaver.)
Making only one list, but having a union as 'element type', containing all element types I will use in the program (easier to debug; wastes space if elements are not all the same size)
I would avoid this like the plague. (Well, you did ask.) Having a manually-configured, compile-time dependency from the data structure to its contained types is the worst of all worlds. Again, IMHO.
Using a preprocessor macro to regenerate the code for every type, in the style of SGLIB (sglib.sourceforge.net), 'imitating' C++'s STL (creative solution; doesn't waste space; elements have the explicit type they actually are when they are returned; any change in list code can be really dramatic)
Intriguing idea, but since I don't know SGLIB, I can't say much more than that.
Your idea/solution
I'd go with the first choice.
I've done this in the past, in our code (which has since been converted to C++), and at the time, decided on the void* approach. I just did this for flexibility - we were almost always storing a pointer in the list anyways, and the simplicity of the solution, and usability of it outweighed (for me) the downsides to the other approaches.
That being said, there was one time where it caused some nasty bug that was difficult to debug, so it's definitely not a perfect solution. I think it's still the one I'd take, though, if I was doing this again now.
Using a preprocessor macro is the best option. The Linux kernel linked list is an excellent and efficient implementation of a circularly-linked list in C. It is very portable and easy to use. There is a standalone version of the Linux kernel 2.6.29 list.h header.
The FreeBSD/OpenBSD sys/queue is another good option for a generic macro based linked list.
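For comparison, the code-generating flavour of the macro approach (closer to what the question describes than to list.h or sys/queue, which embed links in the element) might look like the sketch below. DEFINE_LIST and everything it generates are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generator: expands to a node type plus push, length and free
 * functions specialised for one element type. NAME_push returns the new
 * head, or NULL on allocation failure (the old list is left untouched). */
#define DEFINE_LIST(NAME, TYPE)                                   \
    typedef struct NAME##_node {                                  \
        TYPE value;                                               \
        struct NAME##_node *next;                                 \
    } NAME##_node;                                                \
                                                                  \
    NAME##_node *NAME##_push(NAME##_node *head, TYPE value)       \
    {                                                             \
        NAME##_node *n = malloc(sizeof *n);                       \
        if (n == NULL)                                            \
            return NULL;                                          \
        n->value = value;                                         \
        n->next = head;                                           \
        return n;                                                 \
    }                                                             \
                                                                  \
    size_t NAME##_length(const NAME##_node *head)                 \
    {                                                             \
        size_t len = 0;                                           \
        for (; head != NULL; head = head->next)                   \
            len++;                                                \
        return len;                                               \
    }                                                             \
                                                                  \
    void NAME##_free(NAME##_node *head)                           \
    {                                                             \
        while (head != NULL) {                                    \
            NAME##_node *next = head->next;                       \
            free(head);                                           \
            head = next;                                          \
        }                                                         \
    }

DEFINE_LIST(int_list, int)
DEFINE_LIST(double_list, double)

/* Example use of the generated int_list functions; returns the length. */
size_t example_length(void)
{
    int_list_node *xs = int_list_push(NULL, 1);
    int_list_node *ys;
    size_t len;
    assert(xs != NULL);   /* sketch only: real code should handle failure */
    ys = int_list_push(xs, 2);
    assert(ys != NULL);
    len = int_list_length(ys);
    int_list_free(ys);
    return len;
}
```

Each instantiation yields fully typed functions, at the price of duplicated object code and error messages that point into the macro.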
I haven't coded C in years but GLib claims to provide "a large set of utility functions for strings and common data structures", among which are linked lists.
Although it's tempting to think about solving this kind of problem using the techniques of another language, say, generics, in practice it's rarely a win. There are probably some canned solutions that get it right most of the time (and tell you in their documentation when they get it wrong), but using one might miss the point of the assignment, so I'd think twice about it. In a very small number of cases, it might be feasible to roll your own, but for a project of any reasonable size, it's not likely to be worth the debugging effort.
Rather, When programming in language x, you should use the idioms of language x. Don't write java when you're using python. Don't write C when you're using scheme. Don't write C++ when you're using C99.
Myself, I'd probably end up using something like Pax's suggestion, but actually use a union of char[1] and void* and int, to make the common cases convenient (and an enumed type flag)
(I'd also probably end up implementing a Fibonacci tree, just because that sounds neat, and you can only implement RB trees so many times before it loses its flavor, even if that is better for the common cases it'd be used for.)
edit: based on your comment, it looks like you've got a pretty good case for using a canned solution. If your instructor allows it, and the syntax it offers feels comfortable, give it a whirl.
This is a good problem. There are two solutions I like:
Dave Hanson's C Interfaces and Implementations uses a list of void * pointers, which is good enough for me.
For my students, I wrote an awk script to generate type-specific list functions. Compared to preprocessor macros, it requires an extra build step, but the operation of the system is much more transparent to programmers without a lot of experience. And it really helps make the case for parametric polymorphism, which they see later in their curriculum.
Here's what one set of functions looks like:
int lengthEL (Explist *l);
Exp* nthEL (Explist *l, unsigned n);
Explist *mkEL (Exp *hd, Explist *tl);
The awk script is a 150-line horror; it searches C code for typedefs and generates a set of list functions for each one. It's very old; I could probably do better now :-)
I wouldn't give a list of unions the time of day (or space on my hard drive). It's not safe, and it's not extensible, so you may as well just use void * and be done with it.
One improvement over making it a list of void* would be making it a list of structs that contain a void* and some meta-data about what the void* points to, including its type, size, etc.
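A rough sketch of such an element, with a type tag and size stored next to the void*; the names elem, elem_type and sum_ints are invented for the example, and the list does not own the pointed-to data here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ELEM_INT, ELEM_STRING } elem_type;  /* hypothetical tags */

struct elem {
    elem_type type;     /* what data points to */
    size_t size;        /* size of the pointed-to object in bytes */
    void *data;         /* not owned by the list in this sketch */
    struct elem *next;
};

/* Sum all ELEM_INT elements, skipping anything else. */
long sum_ints(const struct elem *head)
{
    long total = 0;
    for (; head != NULL; head = head->next) {
        if (head->type == ELEM_INT) {
            assert(head->data != NULL && head->size == sizeof(int));
            total += *(const int *)head->data;
        }
    }
    return total;
}
```

The tag lets code check what it is about to dereference, which recovers some of the debuggability a bare void* list gives up.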
Other ideas: embed a Perl or Lisp interpreter.
Or go halfway: link with the Perl library and make it a list of Perl SVs or something.
I'd probably go with the void* approach myself, but it occurred to me that you could store your data as XML. Then the list can just have a char* for data (which you would parse on demand for whatever sub elements you need)....

Resources