If you've ever linked a kernel with gcc, you may have come across the -lgcc parameter.
Is this parameter important? What does it do?
If you do some driver/kernel development, you may use -nostdlib to keep your module free of the bloated stdlib. However, you also remove all the internal hacks GCC uses to provide consistent behaviour across a whole range of hardware.
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Link-Options.html
-nostdlib
Do not use the standard system startup files or libraries when linking. No startup files and only the libraries you specify will be passed to the linker; options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, will be ignored.
The compiler may generate calls to memcmp, memset, memcpy and memmove. These entries are usually resolved by entries in libc. These entry points should be supplied through some other mechanism when this option is specified.
One of the standard libraries bypassed by -nostdlib and -nodefaultlibs is libgcc.a, a library of internal subroutines that GCC uses to overcome shortcomings of particular machines, or special needs for some languages. (See Interfacing to GCC Output, for more discussion of libgcc.a.) In most cases, you need libgcc.a even when you want to avoid other standard libraries. In other words, when you specify -nostdlib or -nodefaultlibs you should usually specify -lgcc as well. This ensures that you have no unresolved references to internal GCC library subroutines. (For example, `__main', used to ensure C++ constructors will be called; see collect2.)
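Those internal subroutines get pulled in for surprisingly ordinary C. A minimal sketch of the classic case, assuming a 32-bit x86 target (where 64-bit division goes through a libgcc helper):

/* On a 32-bit target (e.g. gcc -m32), there is no single instruction for
   64-bit division, so GCC emits a call to the libgcc helper __udivdi3.
   Linking with -nostdlib but without -lgcc then typically fails with
   "undefined reference to `__udivdi3'". */
unsigned long long div64(unsigned long long a, unsigned long long b)
{
    return a / b;   /* becomes a call into libgcc.a */
}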
https://gcc.gnu.org/onlinedocs/gcc-4.6.1/gccint/Interface.html#Interface
3 Interfacing to GCC Output
GCC is normally configured to use the same function calling convention
normally in use on the target system. This is done with the
machine-description macros described (see Target Macros).
However, returning of structure and union values is done differently
on some target machines. As a result, functions compiled with PCC
returning such types cannot be called from code compiled with GCC, and
vice versa. This does not cause trouble often because few Unix library
routines return structures or unions.
GCC code returns structures and unions that are 1, 2, 4 or 8 bytes
long in the same registers used for int or double return values. (GCC
typically allocates variables of such types in registers also.)
Structures and unions of other sizes are returned by storing them into
an address passed by the caller (usually in a register). The target
hook TARGET_STRUCT_VALUE_RTX tells GCC where to pass this address.
By contrast, PCC on most target machines returns structures and unions
of any size by copying the data into an area of static storage, and
then returning the address of that storage as if it were a pointer
value. The caller must copy the data from that memory area to the
place where the value is wanted. This is slower than the method used
by GCC, and fails to be reentrant.
On some target machines, such as RISC machines and the 80386, the
standard system convention is to pass to the subroutine the address of
where to return the value. On these machines, GCC has been configured
to be compatible with the standard compiler, when this method is used.
It may not be compatible for structures of 1, 2, 4 or 8 bytes.
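To make the small-vs-large distinction concrete, here is a hedged sketch assuming the x86-64 System V ABI (struct names invented for illustration):

#include <stdint.h>

/* 8 bytes total: under the x86-64 System V ABI this is returned in a
   register (RAX), just like an integer. */
struct small { uint32_t a, b; };

/* 24 bytes: the caller passes a hidden pointer to storage it allocated,
   and the callee writes the result there. */
struct large { uint64_t x, y, z; };

struct small make_small(void) { return (struct small){1, 2}; }
struct large make_large(void) { return (struct large){1, 2, 3}; }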
GCC uses the system's standard convention for passing arguments. On
some machines, the first few arguments are passed in registers; in
others, all are passed on the stack. It would be possible to use
registers for argument passing on any machine, and this would probably
result in a significant speedup. But the result would be complete
incompatibility with code that follows the standard convention. So
this change is practical only if you are switching to GCC as the sole
C compiler for the system. We may implement register argument passing
on certain machines once we have a complete GNU system so that we can
compile the libraries with GCC.
On some machines (particularly the SPARC), certain types of arguments
are passed “by invisible reference”. This means that the value is
stored in memory, and the address of the memory location is passed to
the subroutine.
If you use longjmp, beware of automatic variables. ISO C says that
automatic variables that are not declared volatile have undefined
values after a longjmp. And this is all GCC promises to do, because it
is very difficult to restore register variables correctly, and one of
GCC's features is that it can put variables in registers without your
asking it to.
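The standard illustration of that caveat, as a small self-contained sketch (variable names made up):

#include <setjmp.h>
#include <stdio.h>

static jmp_buf env;

int main(void)
{
    volatile int safe = 0;  /* volatile: value is well-defined after longjmp */
    int risky = 0;          /* may live in a register; indeterminate after longjmp */

    if (setjmp(env) == 0) {
        safe = 1;
        risky = 1;
        longjmp(env, 1);             /* jump back to the setjmp above */
    } else {
        printf("safe=%d\n", safe);   /* guaranteed to print safe=1 */
        printf("risky=%d\n", risky); /* may print 0 or 1 */
    }
    return 0;
}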
Related
What is the default calling convention that the clang compiler uses? I noticed that when I return a pointer to a local variable, the reference is not lost:
#include <stdio.h>

char *retx(void) {
    char buf[4] = "buf";
    return buf;        /* returns the address of a local whose lifetime ends here */
}

int main(void) {
    char *p1 = retx();
    puts(p1);
    return 0;
}
This is Undefined Behaviour. It might happen to work, or it might not, depending on what the compiler happened to choose when compiling for some specific target. It's literally undefined, not "guaranteed to break"; that's the entire point. Compilers can just completely ignore the possibility of UB when generating code, not using extra instructions to make sure UB breaks. (If you want that, compile with -fsanitize=undefined).
Understanding exactly what happened requires looking at the asm, not just trying to run it.
warning: address of stack memory associated with local variable 'buf' returned [-Wreturn-stack-address]
return buf;
^~~
Clang prints this warning even without -Wall enabled. Exactly because it's not legal C, regardless of what asm calling convention you're targeting.
Clang uses the C calling convention of the target it's compiling for (see footnote 1). Different OSes on the same ISA can have different conventions, although outside of x86 most ISAs have only one major calling convention. x86 has been around so long that the original calling convention (stack args with no register args) was inefficient, so various 32-bit conventions evolved. And Microsoft chose a different 64-bit convention from everyone else. So there's x86-64 System V, Windows x64, i386 System V for 32-bit x86, AArch64's standard convention, PowerPC's standard convention, etc. etc.
I have tested with clang several times and every time it displayed the string
The "decision" / "luck" of whether it "works" or not is made at compile time, not runtime. Compiling / running the same source multiple times with the same compiler tells you nothing.
Look at the generated asm to find out where char buf[4] ends up.
My guess: maybe you're on Windows x64. Happening to work is more plausible there than under most calling conventions, where you'd expect buf[4] to end up below the stack pointer in main, so the call to puts, and puts itself, would be very likely to overwrite it.
If you're on Windows x64 compiling with optimization disabled, retx()'s local char buf[4] might be placed in the shadow space it owns. The caller then calls puts() with the same stack alignment, so retx's shadow space becomes puts's shadow space.
And if puts happens not to write its shadow space, then the data in memory that retx stored is still there. e.g. maybe puts is a wrapper function that in turn calls another function, without initializing a bunch of locals for itself first. But not a tailcall, so it allocates new shadow space.
(But that's not what clang 8.0 does in practice with optimization disabled. Testing with __attribute__((ms_abi)) to get Windows x64 code-gen from Linux clang, it looks like buf[4] is placed below RSP and gets stepped on there: https://godbolt.org/z/2VszYg)
But it's also possible in stack-args conventions where padding is left to align the stack pointer by 16 before a call. (e.g. modern i386 System V on Linux for 32-bit x86). puts() has an arg but retx() doesn't, so maybe buf[4] ended up in memory that the caller "allocates" as padding before pushing a pointer arg for puts.
Of course that would be unsafe because the data would be temporarily below the stack pointer, in a calling convention with no red-zone. (Only a few ABIs / calling conventions have red zones: memory below the stack pointer that's guaranteed not to be clobbered asynchronously by signal handlers, exception handlers, or debuggers calling functions in the target process.)
I wondered if enabling optimization would make it inline and happen to work. But no, I tested that for Windows x64: https://godbolt.org/z/k3xGe4. clang and MSVC both optimize away any stores of "buf\0" into memory. Instead they just pass puts a pointer to some uninitialized stack memory.
Code that breaks with optimization enabled is almost always UB.
Footnote 1: Except for x86-64 System V, where clang uses an extra undocumented "feature" of the calling convention: narrow integer types as function args in registers are assumed to be sign-extended to 32 bits. gcc and clang both do this when calling, but ICC does not, so calling clang functions from ICC-compiled code can cause breakage. See Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?
Annex L of the C11 Draft N1570 recognizes some situations (i.e. "non-critical Undefined Behavior") where the Standard imposes no particular behavioral requirements but implementations that define __STDC_ANALYZABLE__ with a non-zero value should offer some guarantees, and other situations ("critical Undefined Behavior") where it would be common for implementations not to guarantee anything. Attempts to access objects past their lifetime would fall into the latter category.
While nothing would prevent an implementation from offering behavioral guarantees beyond what the Standard requires, even for Critical Undefined Behavior, and some tasks would require that implementations do so (e.g. many embedded systems tasks require that programs dereference pointers to addresses whose targets do not satisfy the definition of "objects"), accessing automatic variables past their lifetime is a behavior about which few implementations would offer any guarantee beyond, perhaps, that reading an arbitrary RAM address will have no side effects beyond yielding an Unspecified value.
Even implementations that guaranteed how automatic objects will be laid out on the stack seldom guaranteed that the storage that held them wouldn't be overwritten between the time a function returned and the next action by the caller. Unless interrupts were disabled, interrupt handling could overwrite any storage used that had been used by automatic objects that were no longer in a live stack frame.
While many implementations can be configured to offer useful guarantees about the behavior of actions for which the Standard imposes no requirements, I can't think of any implementations that can be configured to offer sufficient guarantees to make the above code usable.
Say I have two compilers, or even a single compiler with two different option sets. Each compiler compiles some C code into an object and I try to link the two .o files with a common linker. Will this succeed?
My initial thought is: not always. If the compilers are using the same object file format and have compatible options, then it would succeed. But, if the compilers have conflicting options, or (and this is an easy one) are using two different object file formats, it would not work correctly.
Does anyone have more insight on this? What standards would the object files need to comply with to gain confidence that this will work?
Most flavors of *nix OSes have well-defined and open ABIs and mostly use the ELF object file format, so this is not a problem at all on *nix.
Windows is less strictly defined, and different compilers may vary in some calling conventions (for example, __fastcall may not be supported by some compilers or may behave differently; see https://en.wikipedia.org/wiki/X86_calling_conventions). But the main set of calling conventions (__stdcall, __cdecl, etc.) is standard enough to ensure a successful call of a function compiled by one compiler from code compiled by another; otherwise the program wouldn't work at all, since unlike Linux, every system call in Windows is wrapped by a function from a DLL, which you need to call successfully.
The other problem is that there is no standard common format for object files. Although most tools (MS, Intel, GCC (MinGW), Clang) use the COFF format, some may use OMF (Watcom) or ELF (TinyC).
Another problem is so-called "name mangling". Although it was introduced to support overloading of C++ functions with the same name, it was adopted by C compilers to prevent linkage of functions defined with different calling conventions. For example, the function int __cdecl fun(void); will get the compiled name _fun, whilst int __stdcall fun(void); will get the name _fun@0. For more information on name mangling, see https://en.wikipedia.org/wiki/Name_mangling.
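To illustrate, a sketch for 32-bit Windows (MSVC or MinGW; function names invented), showing the decorated names described above:

/* The same C function under two calling conventions gets different
   decorated symbol names, so a mismatched declaration fails to link
   instead of silently miscalling. */
int __cdecl   fun_c(void) { return 1; }   /* decorated as _fun_c   */
int __stdcall fun_s(void) { return 2; }   /* decorated as _fun_s@0 */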
Finally, default behavior may differ between compilers, so yes, options may prevent successful linking of object files produced by different compilers, or even by the same compiler. For example, TinyC uses the _cdecl convention by default, whilst Clang uses __stdcall. With default options, TinyC may not produce code that can be linked with the others because it doesn't prepend names with an underscore; to make it cross-linkable it needs the -fleading-underscore option.
But keeping all of the above in mind, code may successfully be intermixed. For example, I have successfully linked together code produced by Visual Studio, Intel Parallel Studio, GCC (MinGW), Clang, TinyC and NASM.
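As a concrete sketch of that kind of mixing (hypothetical file names; assumes both compilers target the same platform ABI and object format):

/* a.c -- compile with:  gcc -c a.c */
int add(int x, int y) { return x + y; }

/* main.c -- compile with:  clang -c main.c
   link with:               gcc a.o main.o -o demo */
#include <stdio.h>

int add(int x, int y);   /* defined in a.c, built by the other compiler */

int main(void)
{
    printf("%d\n", add(2, 3));  /* works because both objects follow the
                                   same calling convention and file format */
    return 0;
}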
Library functions have the weak attribute set by default (see [1]) and could be "overwritten" with functions having the same signature by accident.
For example printf internally calls fputc and I could easily declare one of my functions int fputc(int, FILE *).
If that happens, I would like to receive a compiler warning.
Is there a way to tell the compiler to warn me in case of overwriting a weak function?
[1] https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Function-Attributes.html
(I am guessing you are on Linux, and compiling and linking your application as usual, in particular with the libc.so dynamically linked)
Library functions have the weak attribute set by default
This is not always true; on my system fputc is not a weak symbol:
% nm -D /lib/x86_64-linux-gnu/libc-2.21.so|grep fputc
000000000006fdf0 T fputc
0000000000071ea0 T fputc_unlocked
(if it were weak, the T would be a W; and indeed write is weak)
BTW, redefining your own fputc (or malloc) is legitimate (and could be useful, but is very tricky), provided it keeps semantics conforming to the standard. More generally, weak symbols are expected to be redefinable (but this is tricky).
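For reference, a minimal sketch of weak-symbol overriding with GCC or clang on ELF (function name invented), showing why no warning appears:

/* lib.c -- a weak default:  gcc -c lib.c */
__attribute__((weak)) int get_answer(void)
{
    return 42;                  /* used only if no strong definition exists */
}

/* main.c -- gcc -c main.c && gcc lib.o main.o -o demo */
#include <stdio.h>

int get_answer(void) { return 7; }  /* strong: silently wins at link time */

int main(void)
{
    printf("%d\n", get_answer());   /* prints 7, not 42 -- and no warning */
    return 0;
}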
Is there a way to tell the compiler to warn me in case of overwriting a weak function?
No (the compiler cannot warn you reliably).
The only thing which could give you some warning is not the compiler (which does not know which particular libc will be used at runtime; you might upgrade your libc.so after compilation) but the linker, and more precisely the dynamic linker, that is ld-linux(8). And the warnings could reliably be given only at runtime (because the libc.so might be different at build time and at run time). Perhaps you want LD_DYNAMIC_WEAK.
If you are ready to spend weeks working on a solution, you might consider using GCC MELT with your own MELT extension and customize a recent GCC to emit a warning when a weak symbol from the libc available at compile time (which might not be the same libc dynamically linked at runtime, so such a check has limited usefulness) is redefined.
Perhaps you might use some LD_PRELOAD trick.
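For instance, a minimal LD_PRELOAD interposer might look like this (a sketch; error handling omitted):

/* preload.c -- build:  gcc -shared -fPIC -o preload.so preload.c -ldl
   run:                 LD_PRELOAD=./preload.so ./your_program */
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

int fputc(int c, FILE *stream)
{
    /* find the next definition of fputc in search order: the real libc one */
    int (*real_fputc)(int, FILE *) =
        (int (*)(int, FILE *))dlsym(RTLD_NEXT, "fputc");
    fprintf(stderr, "fputc intercepted: %c\n", c);
    return real_fputc(c, stream);
}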
Also, if you linked your application statically, the linker could give you diagnostics if you redefine a libc function.
Read also Drepper's How to Write a Shared Library & Levine's Linkers & loaders book.
The memory layout of a struct is up to the compiler. So what happens when some code compiled by one compiler uses a struct generated by code compiled by another compiler?
For example, say I have a header file that declares a struct somestruct, and a function that returns the struct. One source file defines that function and is compiled by compiler A. Another source file uses that function and is compiled by compiler B and links against the binary of the other source file.
If the two compilers create two different layouts for somestruct, then what's the layout of the variable returned by the function? Does it defer to one compiler's layout, or will there be a memory bug when the second source file tries to access elements of the struct returned by the first source file? Is it an error at compile time or link time?
The function will return the structure as specified by the ABI of the compiler that compiled it. The caller's compiler will simply treat the function as if it conformed to its own ABI.
Assuming the two compilers use a similar ABI, in most cases no errors will be reported at compile time or link time, or even at runtime. For compatible compilers like Clang, GCC, and the Intel C Compiler on OS X and Linux, no errors should result (if there are errors, it's a compiler bug). In the real world, however, it is usually difficult to find fully compatible compilers: in most cases their ABIs are similar but not exactly the same, and such ABI errors are even harder to track down, because your app appears normal until it crashes under some really weird circumstances at runtime.
Just as Basile said, name mangling for C++ poses an additional difference in ABI, but such differences are more easily caught, at link time, because the linker literally can't find the symbol of the function, rather than finding a function that is not compatible.
Also, passing structures is another headache in terms of ABI because there are multiple structure-packing ABIs, sometimes even different in "compatible" compilers like GCC/MinGW and MSVC. (See also the -m[no-]ms-bitfields option in GCC, which forces GCC to use the MSVC ABI for structures.) I have also seen some cases where passing structures by pointer is more reliable than passing structures by value.
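A small sketch of that kind of divergence (exact sizes depend on the compiler and flags, so treat the output as illustrative only):

#include <stdio.h>

/* Bit-field layout is a classic divergence point: GCC's native layout and
   the MSVC layout (forced in GCC with -mms-bitfields or
   __attribute__((ms_struct))) can give this struct different sizes. */
struct bits {
    char tag;
    int  flags : 4;
    char end;
};

int main(void)
{
    /* If two objects in one program disagree on this size, field offsets
       across the ABI boundary are silently wrong. */
    printf("sizeof(struct bits) = %zu\n", sizeof(struct bits));
    return 0;
}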
The layout of data (e.g. structures etc.) and the call protocol (how calls are made at the processor level) are defined in a processor- and operating-system-specific document called the Application Binary Interface. If both compilers follow the same ABI (for the same processor and the same operating system), their generated code should be interoperable.
See e.g. the wikipage for x86 calling conventions and the x86-64 ABI specification.
Name mangling, notably for C++, might also be an issue.
Read also Levine's book on Linkers and Loaders
Environment: Intel Linux, Red Hat 5.
Compiler: gcc 3.4.6
(old stuff, legacy environment with serious infrastructure, sorry)
I have multiple versions of a particular shared library (call it something like "shared_lib.so") derived from Fortran which contains a COMMON block and various computations with references to variables in that COMMON.
I need to be able to (from C code elsewhere in the end-product executable) use dlclose() and dlopen() to switch between versions of this library (within which all versions of the COMMON contents are identical) while running. In some cases the same COMMON also appears in code which is part of a static library (call it "static_lib.a") that is also linked into the executable, and is separately maintained from my project but which has functionality which interacts with that in my shared library.
I appear to be seeing that multiple instances of the COMMON wind up in the executable, and (more importantly) that there is no linkage between the values of variables in the instance from the static library, and the values of the “same” variables in the instance from a shared library pulled in with dlopen().
What I need, in summary, is (within the overall executable) for a dlopen()-loaded shared_lib.so to be able to set/use variable XYZ in COMMON ABC, and for code in static_lib.a to set/use XYZ, and have it in effect be the same instance of XYZ, or at least for the two to be kept in synch. Is this possible?
My compilation commands for sources in shared_lib.so are of the form:
g77 -c -g -m32 -fPIC -o shared_src.o shared_src.f
My command for building shared_lib.so is of the form:
gcc -g -m32 -fPIC -shared -o shared_lib.so *.o
My command for building the executable is of the form:
gcc -g -m32 -rdynamic -o exec exec.o static_lib.a shared_lib.so -lm -ldl -lg2c
My need is to do something from the C code of the form:
handle1 = dlopen ("shared_lib.so", RTLD_NOLOAD);             /* get a handle to the already-loaded library */
dlclose (handle1);                                           /* unload it */
handle2 = dlopen ("shared_lib2.so", RTLD_NOW | RTLD_GLOBAL); /* load the replacement version */
...
The initial startup configuration does appear to function correctly with respect to the needed variables, but the results of subsequent dlclose() and dlopen() sequences do not. Perhaps the underlying issue is that dlopen() lacks some intelligence that gcc possesses when it is linking.
Short answer
Did/can you recompile the executable with -fPIC? I found that it was necessary to compile both the shared library AND the executable with -fPIC to get the COMMON blocks to be recognized properly.
Long answer
I ran into a slightly similar problem recently with COMMON blocks shared between an executable and a FORTRAN shared library. However, I'm using Intel compilers NOT the GNU compilers. The executable is mixed C/C++ and FORTRAN.
The existing (working) Windows version of the code works by sharing the common blocks between executable and DLL through DLLEXPORT/DLLIMPORT ATTRIBUTE directives. According to the Intel compiler documentation, these attribute directives are not recognized in Linux. Indeed, the Linux Intel compiler just produces warnings for these directives.
The main changes in converting the code from Windows to Linux were replacing the Windows LoadLibrary and GetProcAddress with Linux's dlopen and dlsym routines, respectively, using #ifdef sections. The shared library was compiled using -fpic and linked with -shared.
While the shared library was compiled with -fpic, the executable was NOT. When running the code compiled in this manner, variables passed to the shared library through subroutine calls were passed properly, however, the COMMON block variables were not set correctly (or were uninitialized).
In desperation, I finally tried compiling the executable itself with the -fpic compiler option, and then the COMMON blocks were recognized properly in the shared library.
This isn't really an answer, but might help you.
Here's what the 2008 standard says about COMMON:
5.7.2.4 Common association
1 Within a program, the common block storage sequences of all nonzero-sized common blocks with the same name have the same first storage unit, and the common block storage sequences of all zero-sized common blocks with the same name are storage associated with one another. Within a program, the common block storage sequences of all nonzero-sized blank common blocks have the same first storage unit and the storage sequences of all zero-sized blank common blocks are associated with one another and with the first storage unit of any nonzero-sized blank common blocks. This results in the association of objects in different scoping units. Use or host association may cause these associated objects to be accessible in the same scoping unit.
In short, COMMON sections with the same name in the same program occupy the same storage.
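As a hedged illustration of what that means at the linker level (symbol and member names assumed; the exact naming is toolchain-specific), here is how a named COMMON block typically appears from C:

/* g77-era compilers usually emit COMMON /ABC/ as a single linker symbol,
   often "abc_"; every object file that references the block binds to the
   same storage, which is what "same first storage unit" means in practice. */
struct abc_common {
    double xyz;                  /* XYZ from COMMON /ABC/ (layout assumed) */
};
extern struct abc_common abc_;   /* exact symbol name is toolchain-specific */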
A program is defined as follows.
2.2.2 Program
1 A program shall consist of exactly one main program, any number (including zero) of other kinds of program units, any number (including zero) of external procedures, and any number (including zero) of other entities defined by means other than Fortran. The main program shall be defined by a Fortran main-program program-unit or by means other than Fortran, but not both.
The standard doesn't say anything about static vs dynamic linking and it doesn't restrict the previous statements to static linking. Therefore, it seems the dynamically loaded library should share the COMMON block with the main program (which I'm not sure is even technically possible) and thus the GNU implementation is incorrect.
On the other hand, the standard also doesn't say anything about being able to load libraries dynamically. Program units "defined by means other than Fortran" should include C libraries, but that doesn't tell us how these program units are connected to the main program. Fortran, in general, is not a very dynamic language.
Of course, you can work around all this by simply not using COMMON blocks. If a procedure needs to read/write some data, just pass it as a parameter with intent in/out. You can also group data together in a derived type and pass it around together as a unit. Nowadays (Fortran 2003+), you can even use object oriented programming, so there is really no need for global variables anymore.