Using GCC extensions with code compiled with clang?

I have two questions regarding GCC extensions:
If I compile a library using GCC attributes/extensions, will they still work if I link it to a program compiled with, for example, clang?
Should function attributes/extensions be declared in the function declaration or prototype?

If the attributes/extensions only affect code generation and not interfaces, it should work.
Depends on the attribute.
E.g., attributes such as pure, const, or nonnull, are no good unless every translation unit that uses the functions can see them—you should put them on the prototypes in your header (and use the underscored form, e.g., __attribute__((__pure__))).
On the other hand, attributes affecting code generation or visibility should be on the implementation, or else if your library user decided to make an override of a function provided by your library, including your header would effectively force those attributes on their override.
In any case, if you put an attribute on the declaration, it affects the definition too (assuming the definition sees the declaration—thanks to Jonathan Leffler for the clarification), but definitions can take on additional attributes not present in the declaration(s).
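A minimal sketch of that split, assuming a hypothetical library function count_set_bits (the name and the file layout in the comments are illustrative, not from the question). The interface attributes go on the prototype, where every caller can see them; a code-generation attribute goes on the definition only.

```c
#include <stddef.h>

/* mylib.h (hypothetical header): interface attributes on the prototype,
 * using the underscored spellings so user macros cannot clash with them. */
size_t count_set_bits(const unsigned char *buf, size_t len)
    __attribute__((__pure__, __nonnull__(1)));

/* mylib.c (hypothetical implementation): a code-generation attribute that
 * callers do not need to know about is added on the definition only. */
__attribute__((__noinline__))
size_t count_set_bits(const unsigned char *buf, size_t len)
{
    size_t bits = 0;
    for (size_t i = 0; i < len; i++) {
        /* count the 1 bits of each byte */
        for (unsigned char v = buf[i]; v != 0; v >>= 1)
            bits += v & 1u;
    }
    return bits;
}
```

Because the definition sees the prototype, it inherits pure and nonnull, and it additionally carries noinline, which is not forced on anyone who includes the header.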

It depends on the extension.
See https://clang.llvm.org/docs/LanguageExtensions.html
This document describes the language extensions provided by Clang. In
addition to the language extensions listed here, Clang aims to support
a broad range of GCC extensions. Please see the GCC manual for more
information on these extensions.

Related

Linking object files from different C compilers

Say I have two compilers, or even a single compiler with two different option sets. Each compiler compiles some C code into an object and I try to link the two .o files with a common linker. Will this succeed?
My initial thought is: not always. If the compilers are using the same object file format and have compatible options, then it would succeed. But, if the compilers have conflicting options, or (and this is an easy one) are using two different object file formats, it would not work correctly.
Does anyone have more insight on this? What standards would the object files need to comply with to gain confidence that this will work?
Most flavors of *nix OSes have a well-defined, open ABI and mostly use the ELF object file format, so this is not a problem on *nix.
Windows is less strictly defined, and compilers may differ in some calling conventions (for example, __fastcall may not be supported by some compilers or may behave differently; see https://en.wikipedia.org/wiki/X86_calling_conventions). But the main set of calling conventions (__stdcall, _cdecl, etc.) is standard enough that a function compiled by one compiler can be called successfully from code compiled by another. Otherwise programs would not work at all, because, unlike on Linux, every system call on Windows is wrapped by a function in a DLL that you need to be able to call.
The other problem is that there is no standard common format for object files. Although most tools (MS, Intel, GCC (MinGW), Clang) use COFF format, some may use OMF (Watcom) or ELF (TinyC).
Another problem is so-called "name mangling". Although it was introduced to support overloading of C++ functions with the same name, C compilers adopted it to prevent linking functions defined with different calling conventions. For example, the function int _cdecl fun(void); gets the compiled name _fun, whilst int __stdcall fun(void); gets the name _fun@0. For more information on name mangling, see https://en.wikipedia.org/wiki/Name_mangling.
Finally, default behavior may differ between compilers, so yes, options may prevent successful linking of object files produced by different compilers or even by the same compiler. For example, TinyC uses the _cdecl convention by default, whilst Clang uses __stdcall. TinyC with default options may produce code that cannot be linked with code from other compilers because it does not prepend an underscore to names. To make it cross-linkable, it needs the -fleading-underscore option.
Even with all of the above in mind, code from different compilers can be intermixed successfully. For example, I have linked together code produced by Visual Studio, Intel Parallel Studio, GCC (MinGW), Clang, TinyC, and NASM.

Extending the C preprocessor to inject code

I am working on a project where I need to inject code to C (or C++) files given some smart comments in the source. The code injected is provided by an external file. Does anyone know of any such attempts and can point me to examples - of course I need to preserve original line numbers with #line. My thinking is to replace the cpp with a script which first does this and then calls the system cpp.
Any suggestions will be appreciated
Thanks
Danny
Providing your own modified cpp as an external program usually won't work, at least with recent GCC, where preprocessing is internal to the compiler (it is part of cc1 or cc1plus). Hence, no separate cpp program is involved in most GCC compilations any more (libcpp is an internal library of GCC).
If you are mostly using GCC, I would suggest injecting code with your own #pragmas (not comments!). You could write your own GCC plugin, or code your own MELT extension, for that purpose (GCC plugins can add pragmas and builtins but cannot currently affect preprocessing).
As Ira Baxter commented, you could simply put some weird macro invocations and define these macros in separate files.
I cannot guess exactly what kind of code injection you want.
Alternatively, you could generate your C or C++ code with your own generator (which could emit #line directives) and feed that to gcc.
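As a rough sketch of what such a generator could emit, assuming a hypothetical injected file snippet.inc and an original file main.c (both names and the line numbers are made up for illustration): a #line directive before the injected code makes diagnostics point at the snippet, and a second one restores the original numbering.

```c
#include <string.h>

/* Pretend this function body was assembled by the generator. */
int injected_line(void)
{
/* The source line after this directive is treated as line 42 of snippet.inc. */
#line 42 "snippet.inc"
    return __LINE__;
/* Back to the original file; the closing brace is now line 9 of main.c. */
#line 9 "main.c"
}

/* After the second directive, __FILE__ reports the original name again. */
int restored_to_main(void)
{
    return strcmp(__FILE__, "main.c") == 0;
}
```

Compiler errors and debugger line tables follow the same numbering, which is what preserves the original line numbers for the user.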

Using standard function name in C

I am compiling one program called nauty. This program uses a canonical function name getline which is also part of the standard GNU C library.
Is it possible to tell GCC at compile time to use this program defined function?
One solution:
Right now you have a declaration of the function in some application .h file, something like:
int getline(...); // the custom getline
Change that to:
int application_getline(...); // the custom getline
#define getline application_getline
I think that should do it. It will also fix the .c file where the function is defined, assuming it includes that .h file.
Also, use grep or your editor's "find in files" to check every place where the macro takes effect and make sure it does not cause trouble.
Important: in every file, make sure that the .h file is included after any standard headers that may use the getline symbol. You do not want the macro to take effect in those.
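A small sketch of the rename, with everything in one file for brevity. The signature of the application's getline is invented here (nauty's real one is not shown in the question), and, as the notes in this answer say, redefining a name declared by an included header is formally undefined behavior, so treat this as an illustration of the mechanism only.

```c
#include <stdio.h>   /* standard headers first, so the macro cannot touch them */

/* Contents of the hypothetical application header. */
int application_getline(const char *src, char *buf, int max);
#define getline application_getline

/* Because of the macro, this actually defines application_getline.
 * It copies the first line of src into buf and returns its length. */
int getline(const char *src, char *buf, int max)
{
    int n = 0;
    if (max <= 0)
        return 0;
    while (n < max - 1 && src[n] != '\0' && src[n] != '\n') {
        buf[n] = src[n];
        n++;
    }
    buf[n] = '\0';
    return n;
}
```

Every later call written as getline(...) in the application is rewritten by the preprocessor to application_getline(...), so the C library's getline is never referenced from this code.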
Note: this is an ugly hack. Then again, almost everything involving C pre-processor macros can be considered an ugly hack, by some criteria ;). Then again, getting existing incompatible code bases to co-operate and work together is often a case where a hack is acceptable, especially if long term maintenance is not a concern.
Note2: As per this answer and as pointed out in a comment, this is undefined behavior according to the C standard. Keep this in mind if the intention is to maintain the software for longer than it takes to get a working executable once. But I added a better solution.
Note that you may trigger undefined behavior if the GCC header where standard getline is defined is actually used in your code. These are the relevant information sources (emphasis mine):
The libc manual:
1.3.3 Reserved Names
The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:
[...]
and the C99 draft standard (N1256):
7.1.3 Reserved identifiers
1
Each header declares or defines all identifiers listed in its associated subclause, and
optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.
[...]
2
No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.
3
If the program removes (with #undef) any macro definition of an identifier in the first
group listed above, the behavior is undefined.
Thus even the macro trick suggested in another post will invoke undefined behavior if you include the header of getline in your code.
Unfortunately, in this case the only safe bet is to manually rename all getline invocations.
C demands unique function names,
but you can use the -fno-builtin or -ffreestanding GCC flags.
See the description of these flags in the GCC man page.
A common approach is to use prefixes which form some sort of namespace. Sometimes you can see macros used for this to make changing the namespace name easier, e.g.
#define MYAPP(f) myapp_##f
Which is then used like
int MYAPP(add)(int a, int b) {
    return a + b;
}
This defines a function myapp_add which you can also invoke like
MYAPP(add)(3, 5);
This standards compliance issue started to bug me, so I did a bit of experimenting. Here's a second answer, which is possibly better than the currently accepted answer of mine.
First, solution:
Just define macro _XOPEN_SOURCE with value 699, by adding this to compiler command line options
-D_XOPEN_SOURCE=699
How exactly depends on the application's build system, but one way that will probably work is to define the CFLAGS environment variable and see if it takes effect when rebuilding:
export CFLAGS="-D_XOPEN_SOURCE=699"
Other alternative would be to add #define _XOPEN_SOURCE 699 before includes in every .c file of the application, in case it uses some esoteric build system and you can't get it added to compile options, but doing it from command line is by far preferable.
Then some explanation:
The man page of getline specifies that getline is declared only under certain standards, such as when _XOPEN_SOURCE>=700. So, by defining a smaller value before including the relevant header, we exclude the library declaration. More information about these feature-test macros can be found in the GNU libc manual.
I expected there to be some linker issues too, but there weren't, and my investigation resulted in this question here. To summarize, the linker prefers symbols from the linked object files (at least with gcc) and only looks at dynamic libraries if it has not found the symbol otherwise. So, since getline is not an ISO C symbol, the GNU libc documentation quoted in this answer seems to imply that, after using the _XOPEN_SOURCE trick of this answer, it's OK to use it in an application. Still, beware of other libraries using the POSIX getline and ending up calling the application's function (probably with different parameters, resulting in undefined behaviour, probably a crash).
Here is a neat solution to your problem. The trick is LD_PRELOAD.
I did something similar in one of my own questions. See the following:
Hack the standard function in library and call the native library function afterwards
You can define getline() in a separate file, which also keeps the design clean. Compile that C file:
$ gcc -c -g -fPIC <file.c>
This creates file.o.
-g is for debugging.
-fPIC generates position-independent code, which a shared object needs. It also lets the text segment be shared between processes, which saves RAM.
Now make a shared object from file.o:
$ gcc -shared -o libfile.so file.o
Now link your main file with this shared object (-L. tells the linker to look for libfile.so in the current directory):
$ gcc -g main.c -o main.out -L. -lfile
When executing, use LD_PRELOAD so that your library is used instead of the native API:
$ LD_PRELOAD=<path to libfile.so>/libfile.so ./main.out

Generating link-time error for deprecated functions

Is there a way with gcc and GNU binutils to mark some functions such that they will generate an error at link-time if used? My situation is that I have some library functions which I am not removing for the sake of compatibility with existing binaries, but I want to ensure that no newly-compiled binary tries to make use of the functions. I can't just use compile-time gcc attributes because the offending code is ignoring my headers and detecting the presence of the functions with a configure script and prototyping them itself. My goal is to generate a link-time error for the bad configure scripts so that they stop detecting the existence of the functions.
Edit: An idea.. would using assembly to specify the wrong .type for the entry points be compatible with the dynamic linker but generate link errors when trying to link new programs?
FreeBSD 9.x does something very close to what you want with the ttyslot() function. This function is meaningless with utmpx. The trick is that there are only non-default versions of this symbol. Therefore, ld will not find it, but rtld will find the versioned definition when an old binary is run. I don't know what happens if an old binary has an unversioned reference, but it is probably sensible if there is only one definition.
For example,
__asm__(".symver hidden_badfunc, badfunc@MYLIB_1.0");
Normally, there would also be a default version, like
__asm__(".symver new_badfunc, badfunc@@MYLIB_1.1");
or via a Solaris-compatible version script, but the trick is not to add one.
Typically, the asm directive is wrapped into a macro.
The trick depends on the GNU extensions to define symbol versions with the .symver assembler directive, so it will probably only work on Linux and FreeBSD. The Solaris-compatible version scripts can only express one definition per symbol.
More information: .symver directive in info gas, Ulrich Drepper's "How to write shared libraries", the commit that deprecated ttyslot() at http://gitorious.org/freebsd/freebsd/commit/3f59ed0d571ac62355fc2bde3edbfe9a4e722845
One idea could be to generate a stub library that has these symbols but with unexpected properties.
perhaps create objects that have the name of the functions, so the linker in the configuration phase might complain that the symbols are not compatible
create functions that have a dependency "dont_use_symbol_XXX" that is never resolved
or fake a .a file with a global index that would have your functions but where the .o members in the archive have the wrong format
The best way to generate a link-time error for deprecated functions that you do not want people to use is to make sure the deprecated functions are not present in the libraries - which makes them one stage beyond 'deprecated'.
Maybe you can provide an auxiliary library with the deprecated function in it; the reprobates who won't pay attention can link with the auxiliary library, but people in the mainstream won't use the auxiliary library and therefore won't use the functions. However, it is still taking it beyond the 'deprecated' stage.
Getting a link-time warning is tricky. Clearly, GCC does that for some functions (mktemp() et al), and Apple has GCC warn if you run a program that uses gets(). I don't know what they do to make that happen.
In the light of the comments, I think you need to head the problem off at compile time, rather than waiting until link time or run time.
The GCC attributes include (from the GCC 4.4.1 manual):
error ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, an error
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions
where checking the inline function arguments is not possible through extern
char [(condition) ? 1 : -1]; tricks. While it is possible to leave the function
undefined and thus invoke a link failure, when using this attribute the problem
will be diagnosed earlier and with exact location of the call even in presence of
inline functions or when not emitting debugging information.
warning ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, a warning
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions.
While it is possible to define the function with a message in .gnu.warning*
section, when using this attribute the problem will be diagnosed earlier and
with exact location of the call even in presence of inline functions or when not
emitting debugging information.
If the configuration programs ignore the errors, they're simply broken. This means that new code could not be compiled using the functions, but the existing code can continue to use the deprecated functions in the libraries (up until it needs to be recompiled).
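As an illustration of the quoted attributes, here is a hedged sketch with a made-up function old_checksum and a made-up macro LEGACY_API. It uses the warning attribute so that the file still compiles; substituting error for warning would reject every call that survives optimization. The __has_attribute guard is an assumption about the toolchain (recent GCC and Clang provide it), and the attribute expands to nothing elsewhere.

```c
#ifndef __has_attribute
#define __has_attribute(x) 0   /* fallback for compilers without the test */
#endif

#if __has_attribute(__warning__)
#define LEGACY_API(msg) __attribute__((__warning__(msg)))
#else
#define LEGACY_API(msg)        /* attribute unavailable: plain declaration */
#endif

/* Hypothetical legacy entry point: still defined for old binaries,
 * but every newly compiled call is flagged at compile time. */
LEGACY_API("old_checksum is kept only for existing binaries")
int old_checksum(const char *s);

int old_checksum(const char *s)
{
    int sum = 0;
    while (*s != '\0')
        sum += (unsigned char)*s++;
    return sum;
}
```

This only helps for code that includes the header. A configure script that writes its own prototype never sees the attribute, which is the limitation the question describes.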

How to use the __attribute__ keyword in GCC C?

I am not clear on the use of the __attribute__ keyword in C. I have read the relevant GCC docs, but I am still not able to understand it. Can someone help me understand?
__attribute__ is not part of C, but is an extension in GCC that is used to convey special information to the compiler. The syntax of __attribute__ was chosen to be something that the C preprocessor would accept and not alter (by default, anyway), so it looks a lot like a function call. It is not a function call, though.
Like much of the information that a compiler can learn about C code (by reading it), the compiler can make use of the information it learns through __attribute__ data in many different ways -- even using the same piece of data in multiple ways, sometimes.
The pure attribute tells the compiler that a function is actually a mathematical function -- using only its arguments and the rules of the language to arrive at its answer with no other side effects. Knowing this the compiler may be able to optimize better when calling a pure function, but it may also be used when compiling the pure function to warn you if the function does do something that makes it impure.
If you can keep in mind that (even though a few other compilers support them) attributes are a GCC extension and not part of C and their syntax does not fit into C in an elegant way (only enough to fool the preprocessor) then you should be able to understand them better.
You should try playing around with them. Take the ones that are more easily understood for functions and try them out. Do the same thing with data (it may help to look at the assembly output of GCC for this, but sizeof and checking the alignment will often help).
Think of it as a way to inject syntax into the source code, which is not standard C, but rather meant for consumption of the GCC compiler only. But, of course, you inject this syntax not for the fun of it, but rather to give the compiler additional information about the elements to which it is attached.
You may want to instruct the compiler to align a certain variable in memory at a certain alignment. Or you may want to declare a function deprecated so that the compiler will automatically generate a deprecated warning when others try to use it in their programs (useful in libraries). Or you may want to declare a symbol as a weak symbol, so that it will be linked in only as a last resort, if any other definitions are not found (useful in providing default definitions).
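A minimal sketch of the weak-symbol case, assuming a hypothetical hook named app_log_level and an ELF-style toolchain (weak linkage behaves differently on some other platforms). The library ships a weak default, and an application that defines its own non-weak app_log_level replaces it at link time.

```c
/* Library-provided default (hypothetical name). It is used only when
 * no other, non-weak definition of app_log_level is linked in. */
__attribute__((__weak__)) int app_log_level(void)
{
    return 1;
}

/* Library code calls the hook without knowing which definition won. */
int should_log(int level)
{
    return level <= app_log_level();
}
```

With only this file linked, should_log(1) is true and should_log(2) is false, because the weak default is the only definition available.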
All of this (and more) can be achieved by attaching the right attributes to elements in your program. You can attach them to variables and functions.
Take a look at this whole bunch of other GCC extensions to C. The attribute mechanism is a part of these extensions.
There are too many attributes for there to be a single answer, but examples help.
For example, __attribute__((aligned(16))) makes the compiler align that struct or variable on a 16-byte boundary.
__attribute__((noreturn)) tells the compiler that this function never returns to its caller (e.g. standard functions like exit(int)).
__attribute__((always_inline)) makes the compiler inline that function even if it wouldn't normally choose to (using the inline keyword suggests to the compiler that you'd like it inlined, but it's free to ignore you - this attribute forces it).
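The three attributes can be tried together in a short sketch. The type and function names (vec4, fatal, vec4_sum, checked_div) are invented for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Every object of this type starts on a 16-byte boundary. */
struct vec4 {
    float x, y, z, w;
} __attribute__((aligned(16)));

/* The compiler may assume control never comes back from this call. */
__attribute__((noreturn)) void fatal(const char *msg)
{
    fputs(msg, stderr);
    exit(EXIT_FAILURE);
}

/* Inlined into callers even when optimization is off. */
static inline __attribute__((always_inline)) float vec4_sum(const struct vec4 *v)
{
    return v->x + v->y + v->z + v->w;
}

/* No return is needed after fatal(), thanks to noreturn. */
int checked_div(int a, int b)
{
    if (b == 0)
        fatal("division by zero\n");
    return a / b;
}

/* Helper to observe the alignment of an actual object. */
int vec4_is_aligned(const struct vec4 *v)
{
    return ((uintptr_t)v % 16) == 0;
}
```

Checking __alignof__(struct vec4) or the address of a local variable is an easy way to see the effect of aligned without reading assembly.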
Essentially they're mostly about telling the compiler you know better than it does, or for overriding default compiler behaviour on a function by function basis.
One of the best (but little known) features of GNU C is the attribute mechanism, which allows a developer to attach characteristics to function declarations to allow the compiler to perform more error checking. It was designed in a way to be compatible with non-GNU implementations, and we've been using this for years in highly portable code with very good results.
Note that __attribute__ is spelled with two underscores before and two after, and there are always two sets of parentheses surrounding the contents. There is a good reason for this - see below. GNU CC needs the -Wall compiler option to enable this (yes, there is a finer degree of warnings control available, but we are very big fans of max warnings anyway).
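A sketch of how this is commonly made portable, using a hypothetical wrapper format_msg. Because the contents sit inside a second pair of parentheses, the whole attribute is a single macro argument, so a one-parameter macro can erase it on compilers that do not define __GNUC__. With GCC and -Wall, the format attribute lets the compiler check the arguments against the format string as it does for printf.

```c
#include <stdarg.h>
#include <stdio.h>

#ifndef __GNUC__
#define __attribute__(x)   /* expands to nothing on non-GNU compilers */
#endif

/* Argument 3 is a printf-style format; checked arguments start at 4. */
int format_msg(char *buf, size_t size, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int format_msg(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);   /* returns the formatted length */
    va_end(ap);
    return n;
}
```

A call such as format_msg(buf, sizeof buf, "%s", 42) would then draw a format warning from GCC, while other compilers simply see an ordinary variadic prototype.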
For more information please go to http://unixwiz.net/techtips/gnu-c-attributes.html
Lokesh Venkateshiah
