I'm new to Linux kernel development.
One thing that bothers me is the way variables are declared and initialized.
I'm under the impression that the code uses the C89/ANSI C rules for variable declaration placement (variables are declared at the beginning of a block), while C99 relaxes that rule.
My background is C++, and there is a lot of advice from "very clever people" to declare variables as locally as possible, preferably declaring and initializing in the same statement:
Google C++ Style Guide
C++ Coding Standards: 101 Rules, Guidelines, and Best Practices - item 18
A good discussion about it here.
What is the accepted way to declare and initialize variables in the Linux kernel?

I couldn't find a relevant passage in the Linux kernel coding style. So, follow the convention used in existing code -- declare variables at beginning of block -- or run the risk of your code seeming out-of-place.
Reasons why variables at beginning of block is a Good Thing:
the target architecture may not have a C99 compiler
... can't think of more reasons

You should always try to declare variables as locally as possible. If you're using C++ or C99, that would usually be right before the first use.
In older C, doing that doesn't fall under "possible", and there the place to declare those variables would usually be the beginning of the current block.
(I say 'usually' because of some cases with functions and loops where it's better to make them a bit more global...)
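A minimal sketch of the two styles side by side, assuming a C99-capable compiler for the second function. The function names are made up for illustration; both compute the same sum and differ only in where the variables are declared.

```c
#include <stddef.h>

/* C89 style: every declaration sits at the top of its block. */
long sum_c89(const int *values, size_t count)
{
    long total;
    size_t i;

    total = 0;
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* C99 style: each variable is declared and initialized at first use,
   and the loop counter is scoped to the loop itself. */
long sum_c99(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

Calling either function on the array {1, 2, 3, 4} with a count of 4 gives 10; the only difference is how far each name is visible.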

In most normal cases, declare them in the beginning of the function where you are using them. There are exceptions, but they are rare.
If your function is short enough, the declaration is not far away from the first use anyway. If your function is longer than that, it's a good sign your function is too long.
The reason many C++-based coding standards recommend declaring close to use is that C++ data types can be much "fatter" (e.g. think of a class with multiple inheritance etc.) and so take up a lot more space. If you define such an instance at the beginning of a function but use it only much later (and maybe not at all), you are wasting a lot of RAM. This tends to be much less of an issue in C, with its native-type-only data types.

There is an oblique reference in the Coding Style document. It says:
Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.
So while C99 style in-place initialisers are handy in certain cases the first thing you should probably be asking yourself is why it's hard to have them all at the top of the function. This doesn't prevent you from declaring stuff inside block markers, for example for in-loop calculations.

In older C it is possible to declare them locally by creating a block inside the function. Blocks can be added even without ifs/for/while:
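int foo(void)
{
    int a;
    int b = 1;          /* initialized so that reading it below is defined */
    a = 5 + b;
    {
        int c = a * 2;  /* c exists only inside this inner block */
        a = c;
    }
    return a;
}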
Although it doesn't look very neat, it still is possible, even in older C.

I can't speak to why they have done things one way in the Linux kernel, but in the systems we develop, we tend to not use C99-specific features in the core code. Individual applications tend to have stuff written for C99, because they will typically be deployed to one known platform, and the gcc C99 implementation is known good.
But the core code has to be deployable on whatever platform the customer demands (within reason). We have supplied systems on AIX, Solaris, Informix, Linux, Tru-64, OpenVMS(!) and the presence of C99 compliant compilers isn't always guaranteed.
The Linux kernel needs to be substantially more portable again - and particularly down to small footprint embedded systems. I guess the feature just isn't important enough to override these sorts of considerations.


Efficiency issues when using C99 and C11.

The other day I was converting a program written to the C99 standard into C11. Basically the motive was to use the code with MSVC, but it was written on Linux and was mostly compiled with default GCC behaviour. During the code conversion, I found out that you cannot declare variables of a function after any statement, i.e. you must declare them at the top of the function.
But my question is: wouldn't that be against the efficient programming rule that variables should be declared near their use so that it maximizes cache hits? For example, in a large function of, say, 200 LOC, I want to use some big static lookup array near the end of the function. Wouldn't declaring and initializing it just before the usage cause more cache hits? Or am I simply missing some basic point of the C11 language standard?
You seem to have some confusion for which version of the standard you are compiling your program. AFAIK, MSVC doesn't support any of the more recent C standards.
But to come to the core of your question: no, this is not an efficiency issue. The compiler is allowed to reorder statements to its liking, as long as the observable behavior of the program doesn't change. Thus a modern optimizing compiler will typically touch a new variable as late as possible before its first use.
Where the variable declaration appears has no effect on cache behavior. Just having a declaration doesn't touch memory.
You may need to separate out initialization into a separate assignment, however, in order to make sure you don't have an initializer causing a memory access at (near) the beginning of the function.
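A small sketch of that point, assuming a C99-or-later compiler for the second function. Both hypothetical functions use the same static lookup table, declared once at the top and once right before use. A static object is initialized before the program starts, so where its declaration appears adds no run-time work. I would expect a compiler to treat the two the same way, though that is worth checking on your own toolchain.

```c
/* Declaration at the top of the function (accepted by C89 compilers). */
unsigned bits_set_top(unsigned char byte)
{
    static const unsigned char nibble_bits[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    unsigned low = byte & 0x0Fu;
    unsigned high = (byte >> 4) & 0x0Fu;

    return nibble_bits[low] + nibble_bits[high];
}

/* Declaration right before use (C99 and later): same observable behavior. */
unsigned bits_set_late(unsigned char byte)
{
    unsigned low = byte & 0x0Fu;
    unsigned high = (byte >> 4) & 0x0Fu;

    static const unsigned char nibble_bits[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    return nibble_bits[low] + nibble_bits[high];
}
```

Both functions return 8 for 0xFF and 4 for 0xA5; the table is touched only when it is indexed, not when its declaration is reached.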

Why doesn't GNOME use C99?

Looking at the mutter source code and the evince source code, both still use the C89 style of declaring all variables at the very beginning of the function, instead of where they are first used (limited scope is good). Why don't they use C99? GNOME 3 was launched recently and mutter is quite new, so that could have been a good opportunity to switch, if the reason was compatibility with old code style.
Does that mean that contributing code to GNOME needs to be written in C89?
The rationale can be linked to the same rationale behind GLib and GTK+:
No C99 comments or declarations.
Rationale: we expect GLib and GTK+ to
be buildable on various compilers and
C99 support is still not yet
Speaking of scope, I guess you can still do this:
if (condition)
{
  int temporary = expression();
  /* use temporary here */
}
In other words, each actual brace-enclosed scope can contain new variable declarations, even in C89. Many people seem surprised by this; there's no difference from this perspective between a function's top-level scope and any other scope contained therein. Variables will be visible in all scopes descending from the one that declared them.
Note that I don't know if this is supported by the GNOME style guide, but it's at least supported by C89, and a recommended technique (by me) to keep things as local as possible.
Many people consider declaring variables all over the place, as opposed to at the beginning of the block, bad style. It makes it mildly more work to look for declarations, and makes it so you have to inspect the whole function to find them all. Also, for whatever reason, declarations after statements were one of the last C99 features GCC implemented, so for a long time, it was a major compatibility consideration.

Is ARPACK thread-safe?

Is it safe to use the ARPACK eigensolver from different threads at the same time from a program written in C? Or, if ARPACK itself is not thread-safe, is there an API-compatible thread-safe implementation out there? A quick Google search didn't turn up anything useful, but given the fact that ARPACK is used heavily in large scientific calculations, I'd find it highly surprising to be the first one who needs a thread-safe sparse eigensolver.
I'm not too familiar with Fortran, so I translated the ARPACK source code to C using f2c, and it seems that there are quite a few static variables. Basically, all the local variables in the translated routines seem to be static, implying that the library itself is not thread-safe.
Fortran 77 does not support recursion, and hence a standard conforming compiler can allocate all variables in the data section of the program; in principle, neither a stack nor a heap is needed [1].
It might be that this is what f2c is doing, and if so, it might be the f2c step that makes the program non-thread-safe, rather than the program itself. Of course, as others have mentioned, check for COMMON blocks as well. EDIT: Also, check for explicit SAVE directives. SAVE means that the value of the variable should be retained between subsequent invocations of the procedure, similar to static in C. Now, allocating all procedure-local data in the data section makes all variables implicitly SAVE, and unfortunately, there is a lot of old code that assumes this even though it's not guaranteed by the Fortran standard. Such code, obviously, is not thread-safe. As for ARPACK specifically, I can't promise anything, but ARPACK is generally well regarded and widely used, so I'd be surprised if it suffered from these kinds of dusty-deck problems.
Most modern Fortran compilers do use stack allocation. You might have better luck compiling ARPACK with, say, gfortran and the -frecursive option.
[1] Not because it's more efficient, but because Fortran was originally designed before stacks and heaps were invented, and for some reason the standards committee wanted to retain the option to implement Fortran on hardware with neither stack nor heap support all the way up to Fortran 90. Actually, I'd guess that stacks are more efficient on today's heavily cache-dependent hardware than accessing procedure-local data that is spread all over the data section.
I have converted ARPACK to C using f2c. Whenever you use f2c and you care about thread-safety you must use the -a switch. This makes local variables have automatic storage, i.e. be stack based locals rather than statics which is the default.
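To see why the storage class matters, here is a hypothetical pair of routines (not taken from ARPACK or from actual f2c output). The first keeps its local in static storage, the way a SAVE variable or a default-translated local would be; the second uses automatic storage.

```c
/* Static storage: one copy of 'calls' exists for the whole program, so
   its value survives between calls and is shared by every thread. */
int next_id_static(void)
{
    static int calls = 0;
    calls += 1;
    return calls;
}

/* Automatic storage: every call gets a fresh 'calls' on its own stack,
   so concurrent callers cannot interfere with each other. */
int next_id_auto(int previous)
{
    int calls = previous;
    calls += 1;
    return calls;
}
```

Two successive calls to next_id_static() return 1 and then 2, because the state is hidden inside the function. next_id_auto(0) returns 1 every time, because all of its state comes from the caller.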
Even so, ARPACK itself is decidedly not threadsafe. It uses a lot of common blocks (i.e. global variables) to preserve state between different calls to its functions. If memory serves, it uses a reverse communication interface which tends to lead developers to using global variables. And of course ARPACK probably was written long before multi-threading was common.
I ended up re-working the converted C code to systematically remove all the global variables. I created a handful of C structs and gradually moved the global variables into these structs. Finally I passed pointers to these structs to each function that needed access to those variables. Although I could just have converted each global into a parameter wherever it was needed it was much cleaner to keep them all together, contained in structs.
Essentially the idea is to convert global variables into local variables.
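A minimal sketch of that refactoring on a toy example. The names (stats_ctx and friends) are invented for illustration and have nothing to do with the real ARPACK symbols; the point is only the shape of the change: global state becomes a struct that the caller owns and passes in.

```c
/* Before: state lives in globals, much like a translated COMMON block.
   Every caller and every thread shares the same two variables. */
static double g_total = 0.0;
static int g_count = 0;

void stats_add_global(double x)
{
    g_total += x;
    g_count += 1;
}

/* After: the same state grouped in a struct owned by the caller. */
typedef struct {
    double total;
    int count;
} stats_ctx;

void stats_init(stats_ctx *ctx)
{
    ctx->total = 0.0;
    ctx->count = 0;
}

void stats_add(stats_ctx *ctx, double x)
{
    ctx->total += x;
    ctx->count += 1;
}

double stats_mean(const stats_ctx *ctx)
{
    return ctx->count > 0 ? ctx->total / ctx->count : 0.0;
}
```

With this layout each thread can keep its own stats_ctx, so two independent computations no longer overwrite each other's state.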
ARPACK uses BLAS, right? Then those libraries need to be thread-safe too.
I believe your idea to check with f2c might not be a bullet proof way of telling if the Fortran code is thread safe, I would guess it also depends on the Fortran compiler and libraries.
I don't know what strategy f2c uses in translating Fortran. Since ARPACK is written in FORTRAN 77, the first thing to do is check for the presence of COMMON blocks. These are global variables, and if used, the code is most likely not thread-safe. The ARPACK webpage says that there is a parallel version; it seems likely that that version is thread-safe.

Large C macros. What's the benefit?

I've been working with a large codebase written primarily by programmers who no longer work at the company. One of the programmers apparently had a special place in his heart for very long macros. The only benefit I can see to using macros is being able to write functions that don't need to be passed in all their parameters (which is recommended against in a best practices guide I've read). Other than that I see no benefit over an inline function.
Some of the macros are so complicated I have a hard time imagining someone even writing them. I tried creating one in that spirit and it was a nightmare. Debugging is extremely difficult, as it collapses N+ lines of code into 1 in the debugger (e.g. there was a segfault somewhere in this large block of code. Good luck!). I had to actually pull the macro out and run it un-macro-ized to debug it. The only way I could see the person having written these is by automatically generating them out of code written in a function after he had debugged it (or by being smarter than me and writing it perfectly the first time, which is always possible I guess).
Am I missing something? Am I crazy? Are there debugging tricks I'm not aware of? Please fill me in. I would really like to hear from the macro-lovers in the audience. :)
To me the best use of macros is to compress code and reduce errors. The downside is obviously in debugging, so they have to be used with care.
I tend to think that if the resulting code isn't an order of magnitude smaller and less prone to errors (meaning the macros take care of some bookkeeping details) then it wasn't worth it.
In C++, many uses like this can be replaced with templates, but not all. A simple example of Macros that are useful are in the event handler macros of MFC -- without them, creating event tables would be much harder to get right and the code you'd have to write (and read) would be much more complex.
If the macros are extremely long, they probably make the code short but efficient. In effect, he might have used macros to explicitly inline code or remove decision points from the run-time code path.
It might be important to understand that, in the past, such optimizations weren't done by many compilers, and some things that we take for granted today, like fast function calls, weren't valid then.
To me, macros are evil. With their many side effects, and the fact that in C++ you can get the same performance gains with inline, they are not worth the risk.
For example, see this short macro:
#define max(a, b) ((a)>(b)?(a):(b))
then try this call:
max(i++, j++)
More. Say you have
#define PLANETS 8
If an error is reported, it will refer to '8', not to either of its meaningful representations.
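A short sketch of what goes wrong with the max(i++, j++) call. The macro is renamed MAX_MACRO here so it cannot clash with any library max, and the inline function is an assumed C99 replacement, not something from the original code.

```c
/* Function-like macro: each argument is pasted into the expansion
   textually, so the winning argument is evaluated twice. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Returns how many times i was incremented by one MAX_MACRO call. */
int increments_with_macro(void)
{
    int i = 5, j = 3;
    int m = MAX_MACRO(i++, j++);  /* expands to ((i++) > (j++) ? (i++) : (j++)) */
    (void)m;
    return i - 5;
}

/* Returns how many times i was incremented by one max_int call. */
int increments_with_function(void)
{
    int i = 5, j = 3;
    int m = max_int(i++, j++);
    (void)m;
    return i - 5;
}
```

With the macro, i ends up incremented twice; with the function, once. That hidden second evaluation is the kind of side effect meant above.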
I only know of two reasons for doing what you describe.
First is to force functions to be inlined. This is pretty much pointless, since the inline keyword usually does the same thing, and function inlining is often a premature micro-optimization anyway.
Second is to simulate nested functions in C or C++. This is related to your "writing functions that don't need to be passed in all their parameters" but can actually be quite a bit more powerful than that. Walter Bright gives examples of where nested functions can be useful.
There are other reasons to use macros, such as using preprocessor-specific functionality (like including __FILE__ and __LINE__ in autogenerated error messages) or reducing boilerplate code in ways that functions and templates can't (the Boost.Preprocessor library excels here; see Boost.ScopeExit or this sample enum code for examples), but these reasons don't seem to apply to what you describe.
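As a sketch of the __FILE__ and __LINE__ case: the helper below is hypothetical, but it shows why this has to be a macro. Inside an ordinary function, __LINE__ would always report the line of the helper itself rather than the line of the caller.

```c
#include <stdio.h>

/* Hypothetical helper: writes "file:line: message" into buf and yields
   the snprintf result. __FILE__ and __LINE__ expand at the call site. */
#define FORMAT_ERROR(buf, size, msg) \
    snprintf((buf), (size), "%s:%d: %s", __FILE__, __LINE__, (msg))

/* Example caller: the line recorded is the line of the statement below. */
int report_failure(char *buf, size_t size)
{
    return FORMAT_ERROR(buf, size, "sensor read failed");
}
```

After calling report_failure with a reasonably sized buffer, the buffer holds the source file name, the line of the FORMAT_ERROR use, and the message.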
Very long macros will have performance drawbacks, like increased compiled binary size, and there are certainly other reasons for not using them.
For the most problematic macros, I would consider running the code through the preprocessor, and replacing the macro output with function calls (inline if possible) or straight LOC. If the macros exist for compatibility with other architectures/OSes, you might be stuck though.
Part of the benefit is code replication without the eventual maintenance cost - that is, instead of copying code elsewhere you create a macro from it and only have to edit it once...
Of course, you could also just make a method to be called but that is sort of more work... I'm against much macro use myself, just trying to present a potential rationale.
There are a number of good reasons to write macros in C.
Some of the most important are for creating configuration tables using x-macros, for making function like macros that can accept multiple parameter types as inputs and converting tables from human readable/configurable/understandable values into computer used values.
I can't really see a reason for people to write very long macros, except for the historic automatic function inlining.
I would say that when debugging complex macros, (when writing X macros etc) I tend to preprocess the source file and substitute the preprocessed file for the original.
This allows you to see the C code generated, and gives you real lines to work with in the debugger.
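For readers who have not met x-macros, here is a minimal sketch of the configuration-table use mentioned above. The color list is made-up data; the technique is that one list is expanded twice, so the enum and the name table cannot drift apart.

```c
/* Single source of truth: one X(...) entry per color (hypothetical data). */
#define COLOR_LIST           \
    X(COLOR_RED,   "red")    \
    X(COLOR_GREEN, "green")  \
    X(COLOR_BLUE,  "blue")

/* First expansion: the enum constants, plus a trailing count. */
#define X(id, name) id,
typedef enum { COLOR_LIST COLOR_COUNT } color_id;
#undef X

/* Second expansion: the matching name table, in the same order. */
#define X(id, name) name,
static const char *const color_names[] = { COLOR_LIST };
#undef X

/* Maps an enum value to its name, with a fallback for bad input. */
const char *color_name(color_id c)
{
    return ((int)c >= 0 && (int)c < COLOR_COUNT) ? color_names[c] : "unknown";
}
```

Adding a color means adding one X(...) line; the enum, the count, and the table all pick it up on the next build.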
I don't use macros at all. Inline functions serve every useful purpose a macro can serve. Macros allow you to do very weird and counterintuitive things like splitting up identifiers (how does someone search for the identifier then?).
I have also worked on a product where a legacy programmer (who thankfully is long gone) also had a special love affair with macros. His 'custom' scripting language is the height of sloppiness. This was compounded by the fact that he wrote his C++ classes in C, meaning all class functions and variables were public. Anyway, he wrote almost everything in macros and variadic functions (another hideous monstrosity foisted on the world). So instead of writing a proper template class he would use a macro instead! He also resorted to macros to create factory classes, instead of normal code... His code is pretty much unmaintainable.
From what I have seen, macros can be used when they are small, are used declaratively, and don't contain moving parts like loops and other control-flow expressions. It's OK if the macro is one or at most two lines long and it declares an instance of something, something that won't break at runtime. Also, macros should not contain class definitions or function definitions. If the macro contains code that needs to be stepped into using a debugger, then the macro should be removed and replaced with something else.
They can also be useful for wrapping custom tracing/debugging functionality. For instance you want custom tracing in debug builds but not release builds.
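A minimal sketch of such a tracing wrapper, assuming a C99 compiler (for variadic macros). ENABLE_TRACE is a hypothetical build flag that you would define only for debug builds, and safe_divide is just a stand-in for real code.

```c
#include <stdio.h>

/* ENABLE_TRACE is a hypothetical flag, e.g. set by the debug build only. */
#ifdef ENABLE_TRACE
#define TRACE(...) fprintf(stderr, __VA_ARGS__)
#else
#define TRACE(...) ((void)0)   /* expands to a no-op in release builds */
#endif

/* Divides num by den, returning fallback when den is zero. */
int safe_divide(int num, int den, int fallback)
{
    if (den == 0) {
        TRACE("safe_divide: division by zero, returning %d\n", fallback);
        return fallback;
    }
    TRACE("safe_divide: %d / %d\n", num, den);
    return num / den;
}
```

The macro is small, declarative, and has no control flow of its own, so there is nothing inside it that you would ever need to step through.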
Anyway, when you are working in legacy code like that, just be sure to remove a bit of the macro mess at a time. If you keep it up, with enough time you will eventually remove them all and make life a bit easier for yourself. I have done this in the past with especially messy macros. What I do is turn on the compiler switch to have the preprocessor generate an output file. Then I raid that file, copy the code, re-indent it, and replace the macro with the generated code. Thank goodness for that compiler feature.
Some of the legacy code I've worked with used macros very extensively in the place of methods. The reasoning was that the computer/OS/runtime had an extremely small stack, so that stack overflows were a common problem. Using macros instead of methods meant that there were fewer methods on the stack.
Luckily, most of that code was obsolete, so it is (mostly) gone now.
C89 did not have inline functions. If using a compiler with extensions disabled (which is a desirable thing to do for several reasons), then the macro might be the only option.
Although C99 came out in 1999, there was resistance to it for a long time; commercial compiler vendors didn't feel it was worth their time to implement C99. Some (e.g. MS) still haven't. So for many companies it was not a viable practical decision to use C99 conforming mode, even up to today in the case of some compilers.
I have used C89 compilers that did have an extension for inline functions, but the extension was buggy (e.g. multiple definition errors when there should not be). Things like that may dissuade a programmer from using inline functions.
Another thing is that the macro version effectively forces that the function will actually be inlined. The C99 inline keyword is only a compiler hint and the compiler may still decide to generate a single instance of the function code which is linked like a non-inline function. (One compiler that I still use will do this if the function is not trivial and returning void).

When should I use type abstraction in embedded systems

I've worked on a number of different embedded systems. They have all used typedefs (or #defines) for types such as UINT32.
This is a good technique as it drives home the size of the type to the programmer and makes you more conscious of chances for overflow etc.
But on some systems you know that the compiler and processor won't change for the life of the project.
So what should influence your decision to create and enforce project-specific types?
I think I managed to lose the gist of my question, and maybe it's really two.
With embedded programming you may need types of specific size for interfaces and also to cope with restricted resources such as RAM. This can't be avoided, but you can choose to use the basic types from the compiler.
For everything else the types have less importance.
You need to be careful not to cause overflow and may need to watch out for register and stack usage. Which may lead you to UINT16, UCHAR.
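Using types such as UCHAR can add compiler 'fluff', however. Because registers are typically larger, some compilers may add code to force the result into the type: a single arithmetic statement can become the operation plus an extra instruction that truncates the result to the narrower type, which is unnecessary.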
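To make the 'fluff' concrete, here is a small sketch using uint8_t from <stdint.h> in place of a project-specific UCHAR (an assumption on my part; the typedef names differ per project). The arithmetic is done in int and the result must then be reduced to 8 bits. Whether that costs an extra instruction depends on the compiler and the target.

```c
#include <stdint.h>

/* 8-bit result: x + 1 is computed as an int, then converted back to
   uint8_t, which wraps modulo 256. On a target with wider registers the
   compiler may need an extra truncation step to honor this. */
uint8_t next_u8(uint8_t x)
{
    return (uint8_t)(x + 1);
}

/* Native-width result: no narrowing is required. */
unsigned next_uint(unsigned x)
{
    return x + 1u;
}
```

next_u8(255) wraps around to 0, while next_uint(255) gives 256. If the wrap-around is not something you rely on, the narrower type buys nothing for a local variable.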
So I think my question should have been :-
given the constraints of embedded software what is the best policy to set for a project which will have many people working on it - not all of whom will be of the same level of experience.
I use type abstraction very rarely. Here are my arguments, sorted in increasing order of subjectivity:
Local variables are different from struct members and arrays in the sense that you want them to fit in a register. On a 32b/64b target, a local int16_t can make code slower compared to a local int since the compiler will have to add operations to /force/ overflow according to the semantics of int16_t. While C99 defines the int_fastN_t typedefs, AFAIK a plain int will fit in a register just as well, and it sure is a shorter name.
Organizations which like these typedefs almost invariably end up with several of them (INT32, int32_t, INT32_T, ad infinitum). Organizations using built-in types are thus better off, in a way, having just one set of names. I wish people used the typedefs from stdint.h or windows.h or anything existing; and when a target doesn't have that .h file, how hard is it to add one?
The typedefs can theoretically aid portability, but I, for one, never gained a thing from them. Is there a useful system you can port from a 32b target to a 16b one? Is there a 16b system that isn't trivial to port to a 32b target? Moreover, if most vars are ints, you'll actually gain something from the 32 bits on the new target, but if they are int16_t, you won't. And the places which are hard to port tend to require manual inspection anyway; before you try a port, you don't know where they are. Now, if someone thinks it's so easy to port things if you have typedefs all over the place - when time comes to port, which happens to few systems, write a script converting all names in the code base. This should work according to the "no manual inspection required" logic, and it postpones the effort to the point in time where it actually gives benefit.
Now if portability may be a theoretical benefit of the typedefs, readability sure goes down the drain. Just look at stdint.h: {int,uint}{max,fast,least}{8,16,32,64}_t. Lots of types. A program has lots of variables; is it really that easy to understand which need to be int_fast16_t and which need to be uint_least32_t? How many times are we silently converting between them, making them entirely pointless? (I particularly like BOOL/Bool/eBool/boolean/bool/int conversions. Every program written by an orderly organization mandating typedefs is littered with that).
Of course in C++ we could make the type system more strict, by wrapping numbers in template class instantiations with overloaded operators and stuff. This means that you'll now get error messages of the form "class Number<int,Least,32> has no operator+ overload for argument of type class Number<unsigned long long,Fast,64>, candidates are..." I don't call this "readability", either. Your chances of implementing these wrapper classes correctly are microscopic, and most of the time you'll wait for the innumerable template instantiations to compile.
The C99 standard has a number of standard sized-integer types. If you can use a compiler that supports C99 (gcc does), you'll find these in <stdint.h> and you can just use them in your projects.
Also, it can be especially important in embedded projects to use types as a sort of "safety net" for things like unit conversions. If you can use C++, I understand that there are some "unit" libraries out there that let you work in physical units that are defined by the C++ type system (via templates) that are compiled as operations on the underlying scalar types. For example, these libraries won't let you add a distance_t to a mass_t because the units don't line up; you'll actually get a compiler error.
Even if you can't work in C++ or another language that lets you write code that way, you can at least use the C type system to help you catch errors like that by eye. (That was actually the original intent of Simonyi's Hungarian notation.) Just because the compiler won't yell at you for adding a meter_t to a gram_t doesn't mean you shouldn't use types like that. Code reviews will be much more productive at discovering unit errors then.
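If you want the C compiler itself to catch such mix-ups, one variation (going a step beyond plain typedefs, and purely a sketch with made-up names) is to wrap each unit in its own single-member struct. Distinct struct types are not interchangeable, so passing a mass where a length is expected becomes a compile error.

```c
/* Each unit gets its own struct type; the names are hypothetical. */
typedef struct { double value; } meters_t;
typedef struct { double value; } grams_t;

/* Only lengths can be added to lengths. */
meters_t meters_add(meters_t a, meters_t b)
{
    meters_t sum = { a.value + b.value };
    return sum;
}

/* Only masses can be added to masses. */
grams_t grams_add(grams_t a, grams_t b)
{
    grams_t sum = { a.value + b.value };
    return sum;
}

/* meters_add(some_length, some_mass) would be rejected by the compiler,
   because grams_t is not compatible with meters_t. */
```

The cost is some extra typing at every operation, which is why many projects settle for plain typedefs and rely on review instead.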
My opinion is that if you are depending on a minimum/maximum/specific size, don't just assume that (say) an unsigned int is 32 bits; use uint32_t instead (assuming your compiler supports C99).
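A small example of code that genuinely depends on a specific size, assuming <stdint.h> is available. The function name is invented; it assembles a 32-bit big-endian value from four bytes, and the casts to uint32_t keep the shifts correct even where int is only 16 bits wide.

```c
#include <stdint.h>

/* Assembles a big-endian 32-bit value from four bytes. Each byte is
   widened to uint32_t before shifting, so no shift exceeds the width
   of the operand regardless of the platform's int size. */
uint32_t read_be32(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 24) |
           ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |
            (uint32_t)bytes[3];
}
```

Feeding it the bytes 0x12, 0x34, 0x56, 0x78 yields 0x12345678 on any conforming platform.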
I like using stdint.h types for defining system APIs specifically because they explicitly say how large items are. Back in the old days of Palm OS, the system APIs were defined using a bunch of wishy-washy types like "Word" and "SWord" that were inherited from very classic Mac OS. They did a cleanup to instead say Int16 and it made the API easier for newcomers to understand, especially with the weird 16-bit pointer issues on that system. When they were designing Palm OS Cobalt, they changed those names again to match stdint.h's names, making it even more clear and reducing the amount of typedefs they had to manage.
I believe that MISRA standards suggest (require?) the use of typedefs.
From a personal perspective, using typedefs leaves no confusion as to the size (in bits / bytes) of certain types. I have seen lead developers attempt both ways of developing by using standard types e.g. int and using custom types e.g. UINT32.
If the code isn't portable there is little real benefit in using typedefs. However, if like me you work on both types of software (portable and fixed environment), then it can be useful to keep a standard and use the customised types. At the very least, like you say, the programmer is then very much aware of how much memory they are using. Another factor to consider is how 'sure' you are that the code will not be ported to another environment. I've seen processor-specific code have to be translated because a hardware engineer suddenly had to change a board. This is not a nice situation to be in, but thanks to the custom typedefs it was not as bad as it could have been!
Consistency, convenience and readability. "UINT32" is much more readable and writeable than "unsigned long", which is the equivalent for some systems.
Also, the compiler and processor may be fixed for the life of a project, but the code from that project may find new life in another project. In this case, having consistent data types is very convenient.
If your embedded system is somehow a safety-critical system (or similar), it's strongly advised (if not required) to use typedefs over plain types.
As TK. has said before, MISRA-C has an (advisory) rule to do so:
Rule 6.3 (advisory): typedefs that indicate size and signedness should be used in place of the basic numerical types.
(from MISRA-C 2004; it's Rule #13 (adv) of MISRA-C 1998)
Same also applies to C++ in this area; eg. JSF C++ coding standards:
AV Rule 209 A UniversalTypes file will be created to define all standard types for developers to use. The types include: [uint16, int16, uint32_t etc.]
Using <stdint.h> makes your code more portable for unit testing on a pc.
It can bite you pretty hard when you have tests for everything but it still breaks on your target system because an int is suddenly only 16 bits wide.
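A sketch of the kind of bug meant here, with hypothetical function names. The first version passes every test on a PC where int is 32 bits, but 300 * 300 does not fit in a 16-bit int, so on such a target the multiplication overflows. The second version states its sizes explicitly and widens before multiplying.

```c
#include <stdint.h>

/* Fine on a PC with 32-bit int; overflows (undefined behavior) for
   300 * 300 on a target where int is 16 bits. */
int area_plain(int width, int height)
{
    return width * height;
}

/* Sizes are explicit: operands are widened to 32 bits first, so the
   product is computed in 32 bits on every platform. */
int32_t area_sized(int16_t width, int16_t height)
{
    return (int32_t)width * (int32_t)height;
}
```

area_sized(300, 300) is 90000 both on the PC and on the target, so a unit test on the PC actually tells you something about the embedded build.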
Maybe I'm weird, but I use ub, ui, ul, sb, si, and sl for my integer types. Perhaps the "i" for 16 bits seems a bit dated, but I like the look of ui/si better than uw/sw.
