shared library constructor not working - c

In my shared library I have to do certain initialization at the load time. If I define the function with the GCC attribute __attribute__ ((constructor)) it doesn't work, i.e. it doesn't get called when the program linking my shared library is loaded.
If I change the function name to _init(), it works. Apparently the usage of _init() and _fini() functions are not recommended now.
Any idea why __attribute__ ((constructor)) wouldn't work? This is with Linux 2.6.9, gcc version 3.4.6
Edit:
For example, let's say the library code is this the following:
#include <stdio.h>
int smlib_count;
void __attribute__ ((constructor)) setup(void) {
smlib_count = 100;
printf("smlib_count starting at %d\n", smlib_count);
}
void smlib_count_incr() {
smlib_count++;
smlib_count++;
}
int smlib_count_get() {
return smlib_count;
}
For building the .so I do the following:
gcc -fPIC -c smlib.c
ld -shared -soname libsmlib.so.1 -o libsmlib.so.1.0 -lc smlib.o
ldconfig -v -n .
ln -sf libsmlib.so.1 libsmlib.so
Since the .so is not in one of the standard locations I update the LD_LIBRARY_PATH and link the .so from a another program. The constructor doesn't get called. If I change it to _init(), it works.

Okay, so I've taken a look at this, and it looks like what's happening is that your intermediate gcc step (using -c) is causing the issue. Here's my interpretation of what I'm seeing.
When you compile as a .o with setup(), gcc just treats it as a normal function (since you're not compiling as a .so, so it doesn't care). Then, ld doesn't see any _init() or anything like a DT_INIT in the ELF's dynamic section, and assumes there's no constructors.
When you compile as a .o with _init(), gcc also treats it as a normal function. In fact, it looks to me like the object files are identical except for the names of the functions themselves! So once again, ld looks at the .o file, but this time sees a _init() function, which it knows it's looking for, and decides it's a constructor, and correspondingly creates a DT_INIT entry in the new .so.
Finally, if you do the compilation and linking in one step, like this:
gcc -Wall -shared -fPIC -o libsmlib.so smlib.c
Then what happens is that gcc sees and understands the __attribute__ ((constructor)) in the context of creating a shared object, and creates a DT_INIT entry accordingly.
Short version: use gcc to compile and link in one step. You can use -Wl (see the man page) for passing in extra options like -soname if required, like -Wl,-soname,libsmlib.so.1.

From this link :
"Shared libraries must not be compiled with the gcc arguments -nostartfiles'' or-nostdlib''. If those arguments are used, the constructor/destructor routines will not be executed (unless special measures are taken)."
gcc/ld doesn't set the DT_INIT bit in the elf header when -nostdlib is used . You can check objdump -p and look for the section INIT in both cases. In attribute ((constructor)) case you wont find that INIT section . But for __init case you will find INIT section in the shared library.

Related

How to write common functions for reusing in C

I was trying to write a common function for other files could reuse it, the example as following, I have three files:
The first file: cat test1.h
void say();
The second file: cat test1.c
void say(){
printf("This is c example!");
}
The third file: cat test2.c
include "test1.h"
void main(){
say();
}
but when I ran: gcc -g -o test2 test2.c
it threw error as:
undefined reference to `say'
Additionally: I knew this would work:gcc -g -o test2 test1.c test2.c
but I don't wanna do this, because the other team would use the server, and I hope them directly use my binary code not source code. I hope that just like we use printf() function, we just need include .
You can build yourself a library from the object files containing your useful functions, and store the header(s) that describe them in a convenient location. You and your colleagues then compile with the headers and link that library with any executables that use any of those functions. That's very much the same general mechanism that the C compiler uses to include the standard headers and automatically link with the standard C library.
The mechanics vary a bit depending on platform (Windows vs Unix being the primary distinction, though there are differences between Unix platforms too), and also on the type of library (static archive vs dynamic linked / loaded libraries — also known as shared objects or shared libraries).
In broad outline, for a Unix system with a static library, you'd:
Compile library object files libfile1.o, libfile2.o, … using (for example) gcc -c libfile1.c libfile2.c.
Create an archive from the object files — using for example ar r libname.a libfile1.o libfile2.o.
Copy the headers to a standard location such as /usr/local/include.
Copy the library to a standard location such as /usr/local/lib.
You'd compile any code that uses the library functions with -I/usr/local/include (if that is not already a standard compilation option).
You'd link the programs with -L/usr/local/lib -lname (you might not need to specify -L… but you would need to specify -lname).
Including a header file does not make a function available. It simply informs the compiler that the function will be provided at a later time.
You should compile the file with the function into a shareable object file (or a library if there is more than one function that you want to share). Mind the switch -c which tells gcc not to build an executable file:
gcc -o test1.o test1.c -c
Similarly, compile the main function into its own object file. Now you or anyone else can link the object file with their main program:
gcc -o test2 test2.o test1.o
The process can be automated using make.
Other programmers can use compiled object files (`*.o') in their programs. They need only to have a header file with function prototypes, extern data declarations and type definitions.
You can also wrap many object files into the library.
On many systems you can also create the dynamic linked libraries which do not have to be linked into the executable.
you also need to compile test1:
gcc -g -o test2 test1.c test2.c.

C symbol visibility in static archives

I have files foo.c bar.c and baz.c, plus wrapper code myfn.c defining a function myfn() that uses code and data from those other files.
I would like to create something like an object file or archive, myfn.o or libmyfn.a, so that myfn() can be made available to other projects without also exporting a load of symbols from {foo,bar,baz}.o as well.
What's the right way to do that in Linux/gcc? Thanks.
Update: I've found one way of doing it. I should've emphasised originally that this was about static archives, not DSOs. Anyway, the recipe:
#define PUBLIC __attribute__ ((visibility("default"))) then mark myfn() as PUBLIC in myfn.c. Don't mark anything else PUBLIC.
Compile objects with gcc -c foo.c bar.c baz.c myfn.c -fvisibility=hidden, which marks everything as hidden except for myfn().
Create a convenience archive using ld's partial-linking switch: ld -r foo.o bar.o baz.o myfn.o -o libmyfn.a
Localise everything that wasn't PUBLIC like so: objcopy --localize-hidden libmyfn.a
Now nm says myfn is the only global symbol in libmyfn.a and subsequent linking into other programs works just fine: gcc -o main main.c -L. -lmyfn (here, the program calls myfn(); if it tried to call foo() then compilation would fail).
If I use ar instead of ld -r in step 3 then compilation fails in step 5: I guess ar hasn't linked foo etc to myfn, and no longer can once those functions are localised, whereas ld -r resolves the link before it gets localised-away.
I'd welcome any response that confirms this is the "right" way, or describes a slicker way of achieving the same.
Unfortunately, C linkage for globals is all-or-nothing, in the sense that the globals of all modules would be available in libmyfn.a's final list of external symbols.
gcc tool chain offers an extension that lets you hide symbols from outside users, while making them available to other translation units in your library:
foo.h:
void foo();
foo.c:
void foo() __attribute__ ((visibility ("hidden")));
myfn.h:
void myfn();
myfn.c:
#include <stdio.h>
#include "foo.h"
void myfn() {
printf("calling foo...\n");
foo();
printf("calling foo again...\n");
foo();
}
For portability, you would probably benefit from making a macro for __attribute__ ((visibility ("hidden"))), and placing it in a conditional compilation block conditioned on gcc.
In addition, Linux offers a utility called strip, which lets you remove some of the symbols from compiled object files. Options -N and -K let you identify individual symbols that you want to keep or remove.
Start with this to build a static library
gcc -c -O2 foo.c bar.c baz.c myfn.c
ar av libmyfunctions.a foo.o bar.o baz.o myfn.o
Compile and link with other programs like:
gcc -O2 program.c -lmyfunctions -o myprogram
Now your libmyfunctions.a will ultimately have extra stuff from the source that isn't required by the code in myfn.c But the linker should do a reasonable job of removing this when it creates the final program.
Suppose myfn.c has function myfun() which you want to use in other three files foo.c, bar.c & baz.c
Now create a shared library from code in myfn.c viz libmyf.a
Use this function call myfun() in other three files. Declare function as extern in these files. Now you can create object code of these thee files and link the libmyf.a at linking phase.
Refer to following link for using shared libraries.
http://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html

Trying to understand the main function with GCC and Windows

They say that main() is a function like any other function, but "marked" as an entry point inside the binary, an entry point that the operating system may find (Don't know how) and start the program from there. So, I'm trying to find out more about this function. What have I done? I created a simple .C file with this code inside:
int main(int argc, char **argv) {
return (0);
}
I saved the file, installed the GCC compiler (in Windows, MingW environment) and created a batch file like this:
gcc -c test.c -nostartfiles -nodefaultlibs -nostdlib -nostdinc -o test.o
gcc -o test.exe -nostartfiles -nodefaultlibs -nostdlib -nostdinc -s -O2 test.o
#%comspec%
I did this to obtain a very simplistic compiler and linker, no library, no header, just the compiler. So, the compiling goes well but the linking stops with this error:
test.c:(.text+0xa): undefined reference to '___main'
collect2.exe: error: Id returned 1 exit status
I thought that the main function is exported by the linker but I believed that you didn't need any library with additional information about it. But it looks like it does. In my case I supposed that it must be the standard GCC library, so I downloaded the source code of it and opened this file: libgcc2.c
Now, I don't know if that is the file where the main function is constructed to be linked by GCC. In fact, I don't understand how the main function is used by GCC. Why does the linker need the gcc standard libraries? To know what about main? I hope this has made my question quite specific and clear. Thanks!
When gcc puts together all object files (test.o) and libraries to form a binary it also prepends a small object (usually crt0.o or crt1.o), which is responsible for calling your main(). You can see what gcc is doing, when you add -v on the command line:
$ gcc -v -o test.exe test.o
crt0/crt1 does some setup and then calls into main. But the linker is finally responsible for building the executable according to the OS. With -v you can see also an option for the target system. In my case it's for Linux 64 bit: -m elf_x86_64. For your system this will be something like -m windows or -m mingw.
The error happens because you use these two options: -nodefaultlibs -nostdlib
These tell GCC that it should not link your code against libc.a/c.lib which contains the code which really calls main(). In a nutshell, every OS is slightly different and most of them don't care about C and main(). Each has their own special way to start a process and most of them are not compatible with the C API.
So the solution of the C developers was to put "glue code" into the C standard library libc.a which contains the interface which the OS expects, creates the standard C environment (setting up the memory allocation structures so malloc() will map the OS's memory management functions, set up stdio, etc) and eventually calls main()
For C developers, this means they get a libc.a for their OS (along with the compiler binaries) and they don't need to care about how the setup works.
Another source of confusion is the name of the reference. On most systems, the symbolic name of main() is _main (i.e. one underscore) while __main is the name of an internal function called by the setup code which eventually calls the real main()

gcc detect duplicate symbols/functions in static libraries

Is there any way we can get gcc to detect a duplicate symbol in static libraries vs the main code (Or another static library ?)
Here's the situation:
main.c erroneously contained a function definition, e.g. with the signature uint foohash(const char*)
foo.c also contains a function definition with the signature uint foohash(const char*)
foo.c and other source files are compiled to a static util library, which the main program links in, i.e. something like:
gcc -o main main.o util.o -L ./libs -lfooutils
So, now main.o and libs/libfooutils.a both contain a foohash function. Presumably the linker found that symbol in main.o and doesn't bother looking for it elsewhere.
Is there any way we can get gcc to detect such a situation ?
Indeed as Simon Richter stated, --whole-archive option can be useful. Try to change your command-line to:
gcc -o main main.o util.o -L ./libs -Wl,--whole-archive -lfooutils -Wl,--no-whole-archive
and you'll see a multiple definition error.
gcc calls the ld program for linking. The relevant ld options are:
--no-define-common
--traditional-format
--warn-common
See the man page for ld. These should be what you need to experiment with to get the warnings sought.
Short answer: no.
GCC does not actually do anything with libraries. It is the task of ld, the linker (called automatically by GCC) to pull in symbols from libraries, and that's really a fairly dumb tool.
The linker has lots of complex jiggery pokery for combining different types of data from different sources, and supporting different file formats, and all the evil little details of binary executables, but in the end, all it really does is look for undefined symbols and find the definitions.
What you can do is a link trace (pass -t to gcc) to see what comes from where. Or else run nm on all the object files and libraries in your system, and write a script to detect duplicates.

Including source files in C

So I get the point of headers vs source files. What I don't get is how the compiler knows to compile all the source files. Example:
example.h
#ifndef EXAMPLE_H
#define EXAMPLE_H
int example(int argument); // prototype
#endif
example.c
#include "example.h"
int example(int argument)
{
return argument + 1; // implementation
}
main.c
#include "example.h"
main()
{
int whatever;
whatever = example(whatever); // usage in program
}
How does the compiler, compiling main.c, know the implementation of example() when nothing includes example.c?
Is this some kind of an IDE thing, where you add files to projects and stuff? Is there any way to do it "manually" as I prefer a plain text editor to quirky IDEs?
Compiling in C or C++ is actually split up into 2 separate phases.
compiling
linking
The compiler doesn't know about the implementation of example(). It just knows that there's something called example() that will be defined at some point. So it just generated code with placeholders for example()
The linker then comes along and resolves these placeholders.
To compile your code using gcc you'd do the following
gcc -c example.c -o example.o
gcc -c main.c -o main.o
gcc example.o main.o -o myProgram
The first 2 invocations of gcc are the compilation steps. The third invocation is the linker step.
Yes, you have to tell the compiler (usually through a makefile if you're not using an IDE) which source files to compile into object files, and the compiler compiles each one individually. Then you give the linker the list of object files to combine into the executable. If the linker is looking for a function or class definition and can't find it, you'll get a link error.
It doesn't ... you have to tell it to.
For example, whe using gcc, first you would compile the files:
gcc file1.c -c -ofile1.o
gcc file2.c -c -ofile2.o
Then the compiler compiles those files, assuming that symbols that you've defined (like your example function) exist somewhere and will be linked in later.
Then you link the object files together:
gcc file1.o file2.o -oexecutable
At this point of time, the linker looks at those assumtions and "clarifies" them ie. checks whether they're present. This is how it basically works...
As for your IDE question, Google "makefiles"
The compiler does not know the implementation of example() when compiling main.c - the compiler only knows the signature (how to call it) which was included from the header file. The compiler produces .o object files which are later linked by a linker to create the executable binary. The build process can be controlled by an IDE, or if you prefer a Makefile. Makefiles have a unique syntax which takes a bit of learning to understand but will make the build process much clearer. There are lots of good references on the web if you search for Makefile.
The compiler doesn't. But your build tool does. IDE or make tool. The manual way is hand-crafted Makefiles.

Resources