Cool GCC built-ins - c

I've heard of a lot of cool GCC extensions and built-in functions over the years, but I always wind up forgetting about them before thinking of using them.
What are some cool GCC extensions and built-ins, and some real-life examples of how to put them to use?

GCC provides many features as compiler extensions, off the top of mind and frequently used by me are:
Statement Expressions
Designated Initializers
There are many more documented on the GCC website here.
Caveat:
However, using any form of compiler extensions renders your code non-portable across other compilers so do use them at that risk.

If you want real-life examples of how useful gcc extensions can be then GCC hacks in the Linux kernel is an interesting choice since if it is being used in the Linux kernel then it is probably a good indication it has some real-world impact. As noted before, using extensions does make your code non-portable but clang does make an effort to support gcc extensions which may mitigate some of the impact.
One extensions that is not covered but is used a lot in the Linux kernel is statement expressions, also see Are compund statements (blocks) surrounded by parens expressions in ANSI C?.
The article covers the following features:
Type discovery using typeof
Range extension which includes both Case Ranges and Designated Initializers
Zero-length arrays are flexible array members but with some additions
Determining call address using __builtin_return_addres
Constant detection using __builtin_constant_p
Function Attributes
Branch prediction hints using __builtin_expect
Pre-fetching using __builtin_prefetch
Variable attributes

I recently stumbled over quite a lot of them that are really helpful to emulate the new C11 standard. Actually many of the new features are already there, but with different syntax.
alignment attributes
thread local variables
noreturn attribute to functions
atomic operations (through their __sync_... builtins)
type generic programming
I've written some of that and how to use that with the C11 interfaces in my blog.
Two features that are not covered in functionality by C11 that are really nice, and that I'd very much like to see in future versions of the standard
statement expressions (already mentioned by Als)
__typeof__

Related

Dependency of MISRA-C coding rules checker to the compiler

I have started using a tool allowing to check the compliancy to MISRA-C 2012. The tool is Helix QAC. During the configuration it requests to select one compiler. My understanding is that MISRA-C (and coding rules in general) are not linked to a compiler toolchain, since one of their objective is portability. Moreover one rule of MISRAC is to not use language extensions (obviously this rule may be disabled or there may be exceptions to it). Helix documentation or support is rather vague about this (still trying to get more info from them) and just mention the need to know the integer type length or the path of standard includes. But the rules analysis should be independant from int size and the interface of standard includes is standard so the actual files should not be needed.
What are the dependencies between a MISRA-C rules checker and the compiler ?
Some of the Guidelines depend on knowing what the implementation is doing - this is particularly the case with the implementation defined aspects, including (but not limited to integer sizes, maximum/minimum values, method of implementing boolean etc)
MISRA C even has a section 4.2 Understanding the compiler which coupled with 4.3 Understanding the static analysis tool addresses these issues.
There is one thing every MISRA-C checker needs to know and that's what type you use as bool. This is necessary since MISRA-C:2012 still supports C90 which didn't have standard support for a boolean type. (C99 applications should use _Bool/bool, period.) It also needs to know which constants that false and true correspond to, in case stdbool.h with false and true is unavailable. This could be the reason why it asks which compiler that is used. Check Appendix D - Essential types for details.
Type sizes of int etc isn't relevant for the MISRA checker to know. Though it might be nice with some awareness of non-standard extensions. We aren't allowed to use non-standard extensions or implementation-defined behavior without documenting them. The usual suspects being inline assembler, interrupts, memory allocation at specific places and so on. But once we have documented them in our deviation to Dir 1.1/Rule 1.1, we might want to disable warnings about using those specific, allowed deviations. If the MISRA checker is completely unaware of a certain feature, then how can you disable the warning caused by it?

Compatibility of C89/C90, C99 and C11

I just read: C Wikipedia entry. As far as I know there are 3 different versions of C that are widely used: C89, C99 and C11. My question concerns the compatibility of source code of different versions.
Suppose I am going to write a program (in C11 since it is the latest version) and import a library written in C89. Are these two versions going to work together properly when compiling all files according to the C11 specification?
Question 1:
Are the newer versions of C i.e. C99, C11 supersets of older C versions? By superset I mean, that old code will compile without errors and the same meaning when compiled according to newer C specifications.
I just read, that the // has different meanings in C89 and C99. Apart from this feature, are C99 and C11 supersets of C89?
If the answer to Question 1 is no, then I got another 2 questions.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
Is it better to use C89 or C99 or C11?
Thanks for your help in advance.
EDIT: changed ISO C to C89.
Are the newer versions of C i.e. C99, C11 supersets of older C versions?
There are many differences, big and subtle both. Most changes were adding of new features and libraries. C99 and C11 are not supersets of C90, although there was a great deal of effort made to ensure backwards-compatibility. C11 is however mostly a superset of C99.
Still older code could break when porting from C90, in case the code was written poorly. Particularly various forms of "implicit int" and implicit function declarations were banned from the language with C99. C11 banned the gets function.
A complete list of changes can be found in the C11 draft page 13 of the pdf, where "the third edition" refers to C11 and "the second edition" refers to C99.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
I'm not aware about any such document. If you have good code, porting is easy. If you have rotten code, porting will be painful. As for the actual porting procedure, it will be easy if you know the basics of C99 and C11, so the best bet is to find a reliable source of learning which addresses C99/C11.
Porting from C99 to C11 should be effortless.
Is it better to use C89 or C99 or C11?
It is best to use C11 as that is the current standard. C99 and C11 both contained various "language bug fixes" and introduced new, useful features.
In most ways, the later versions are supersets of earlier versions. While C89 code which tries to use restrict as an identifier will be broken by C99's addition of a reserved word with the same spelling, and while there are some situations in which code which is contrived to exploit some corner cases with a parser will be treated differently in the two languages, most of those are unlikely to be important.
A more important issue, however, has to do with memory aliasing. C89 include
rules which restrict the types of pointers that can be used to access certain
objects. Since the rules would have made functions like malloc() useless if
they applied, as written, to the objects created thereby, most programmers and
compiler writers alike treated the rules as applying only in limited cases (I doubt C89 would have been widely accepted if people didn't believe the rules applied only narrowly). C99 claimed to "clarify" the rules, but its new rules are much more expansive in effect than contemporaneous interpretations of the old ones, breaking a lot of code that would have had defined behavior under those common interpretations of C89, and even some code which would have been unambiguously defined under C89 has no practical C99 equivalent.
In C89, for example, memcpy could be used to copy the bit pattern associated with an object of any type to an object of any other type with the same size, in any cases where that bit pattern would represent a valid value in the destination type. C99 added language which allows compilers to behave in arbitrary fashion if memcpy is used to copy an object of some type T to storage with no declared type (e.g. storage returned from malloc), and that storage is then read as object of a type that isn't alias-compatible with T--even if the bit pattern of the original object would have a valid meaning in the new type. Further, the rules that apply to memcpy also apply in cases where an object is copied as an array of character type--without clarifying exactly what that means--so it's not clear exactly what code would need to do to achieve behavior matching the C89 memcpy.
On many compilers such issues can be resolved by adding a -fno-strict-aliasing option to the command line. Note that specifying C89 mode may not be sufficient, since compilers writers often use the same memory semantics regardless of which standard they're supposed to be implementing.
The newer versions of C are definitely not strict super-sets of the older versions.
Generally speaking, this sort of problem only arises when upgrading the compiler or switching compiler vendors. You must plan for a lot of minor touches of the code to deal with this event. A lot of the time, the new compiler will catch issues that were left undiagnosed by the old compiler, in addition to minor incompatibilities that may have occurred by enforcing the newer C standard.
If it is possible to determine, the best standard to use is the one that the compiler supports the best.
The C11 wikipedia article has a lengthy description of how it differs from C99.
In general, newer versions of the standard are backward compatible.
If not, you can compile different .c files to different .o files, using different standards, and link them together. That does work.
Generally you should use the newest standard available for new code and, if it's easy, fix code that the new standard does break instead of using the hacky solution above.
EDIT: Unless you're dealing with potentially undefined behavior.

Does GCC atomic buitlins work with std=C99?

I am using this built-in atomic methods link
It is mentioned that:
The following built-in functions approximately match the requirements
for the C++11 memory model.
However I have tried compiling these methods with std=C99 and std=C89. The program compiles and I get the right results. Is there something I am missing here ?
Does C99 and C89 have a memory model as well ?
It is a compiler extension and therefore it is allowed to provide functionality outside of the what the standard allows but that page does not make it obvious that is the can be used in C.
Fortunately, gcc does have good online documents and if we check out for example the 4.9 series document on C extensions the __atomic Builtins points to the same page.
So that would indicate that it is valid to use in C and it will stick the requirements as laid out in the documentation and so it will work in the C99 as it does in C++. Usually if there is a difference between how a feature/extension is implemented between C and C++ the documents will note this, for example compound literals have significant differences.

What is the difference between the __sync and __atomic intrinsics of gcc

I'm writing a toy operating system (so I cannot use any library, including the standard one), compiled with gcc, and I want to use atomics for some of the synchronization code. After some search, I found that gcc has two sets of builtins for atomic operations, __sync_* and __atomic_*, but there is no information as to the difference between the two.
What is the difference between these two besides the latter has a parameter for memory ordering? Is the __sync_ version equivalent to __atomic_ version with the sequential ordering? Is the __sync_ version deprecated in favor of the __atomic_ one?
Disclaimer: I have not used these primitives before. The following answer is based on my reading of the documentation and previous experience with concurrency.
Is the __sync_ version deprecated in favor of the __atomic_ one?
Yes, you should use __atomic and let the compiler fall back to __sync when necessary.
Is the __sync_ version equivalent to __atomic_ version with the sequential ordering?
No, the exact ordering guarantees are specified in the documentation for __sync. If you use __atomic, and the compiler chooses to fall back to __sync, then it will add code to meet the requested ordering guarantees.
From the documentation for __atomic:
Target architectures are encouraged to provide their own patterns for each of these built-in functions. If no target is provided, the original non-memory model set of ‘__sync’ atomic built-in functions are utilized, along with any required synchronization fences surrounding it in order to achieve the proper behavior. Execution in this case is subject to the same restrictions as those built-in functions.
A final word of caution: not all the __sync or __atomic operations can be implemented inline. The compiler may implement them as a call to an external function that is (presumably) implemented in the standard library. If you don't have access to the standard library, then you'll have to implement the missing functions yourself. Here is the relevant quote from the documentation:
If there is no pattern or mechanism to provide a lock free instruction sequence, a call is made to an external routine with the same parameters to be resolved at run time.
These primitives are a low-level mechanism, and you should understand what the compiler can and cannot do.
For an example of what code the compiler generates inline, see the related question: Atomic operations and code generation for gcc

Should I use ANSI C (C89)?

It's 2012. I'm writing some code in C. Should I be still be using C89? Are there still compilers that do not support C99?
I don't mind using /* */ instead of //.
I'm not sure about C89 forbids mixing declarations and code. I'm kind of leaning towards the idea that it's actually more readable to have all the declarations in one place, and if it isn't, the function is too long.
VLAs look useful but I haven't needed them yet.
Should I stick with C89 if I don't have a compelling reason not to? Are there other things I haven't considered?
Unless you know that you cannot use a C99-compatible compiler (the Visual Studio C compiler is the most prominent candidate) there is no good reason for not using the nice things C99 gives you.
However, even if you need to support that compiler you can use some C99 features - just not all of them.
One feature of C99 that is incredibly handy is being able to do for(int i = ...) instead of having to declare your loop variable on top of the function - especially since C actually has a block scope. That's the kind of declaration where having it on top really doesn't improve the readability.
There is a reason (or many) why C89 was superseded by C99. If you know for sure that no C99 compiler is available for your particular piece of work (unlikely unless you are stuck with Visual Studio which never supported C officially anyway), then you need to stay with C89 but otherwise you should certainly put yourself in a position where you can benefit from the last 20+ years of improvement. There is nothing inherently slower about C99.
Perhaps you should even consider looking at the newest C11 standard. There has been some important fixes for dealing with Unicode that any C programmer could benefit from (other changes in the standard are absolutely minimal)...
Good code is a mixture of performance, scalability, readability, and maintainability.
In my opinion, C99 makes code easier to read and maintain. Very, very few compilers don't support C99, so I say go with it. Use the tools you have available, unless you are certain you will need to compile your project with a compiler that requires the earlier standard.
Check out these links to learn more about the advantages to C99:
http://www.kuro5hin.org/?op=displaystory;sid=2001/2/23/194544/139
http://en.wikipedia.org/wiki/C99#Design
Note that C99 also supports library functions such as snprintf, which are very useful, and have better floating point support. Also, I find macros to be extremely helpful, especially when working with math-intensive applications (like cryptographic algorithms)
I disagree with Paul R's "bottom line" comment. There are multiple cases where C89 is advantageous for portability.
Targeting embedded systems, which may or may not have compilers supporting C99:
https://groups.google.com/forum/#!topic/comp.arch.embedded/WNvhw3T_9pI%5B1-25%5D
Targeting the TinyCC compiler, as might be required in a restricted environment where installing a gigantic toolchain is either impractical or not allowed. (TCC is no longer being developed, and Bellard's last statement as to ISOC99 support was that it was "heading towards" full compliance.)
Supporting dynamic compilation via libtcc (see above).
Targeting MSVC, as others have noted.
For source-compatibility with projects that may be required by their company to use the C89 standard. This is especially relevant if you're writing an open source library, and want to maximize its application in some industry.
As cegfault noted, some of the C99 features as listed on Wikipedia can be very useful, but none I would consider indispensable if your priority is portability, or any of the above reasons apply.
It appears Microsoft hasn't budged on C99 compliance. SimonRev from Beijer Electronics commented on a related MSDN thread in November 2016:
In broad strokes, the only parts of the C99 compiler that were
implemented are those parts that they needed to keep the C++ compiler
up to date.
Microsoft has done basically nothing to the C compiler since VC6, and
they haven't made much secret that C++ is their vision of the future
of native code, not C.
In conclusion, if you want portability for embedded or restricted systems, dynamic compilation, MSVC, or compatibility with proprietary source code, I would say C89 is advantageous.

Resources