rpm reduces the size of a shared library

I am making an rpm of my shared library. In the .spec file, I write the normal install commands in the %install section, create some soft links, and run ldconfig in %post and %postun. I am not building the library in the rpm because I already have a compiled and stripped library. However, the library file in my development folder is 24k before packaging, while the deployed file in /usr/lib64/ is around 23.8k, so the hashes of the library before and after deployment don't match (and I need them to match at the moment). What could be the problem?
Any help would be appreciated.
Edit: I ran stat on both library files. The shared library file before deployment is 8 blocks larger than the file deployed through rpm.

There are various possibilities as to what is causing the change...
The first is that RPM may be stripping some of the symbols: if there are symbols for internal functions that are not exported, it may choose to strip them.
It may also be removing various ELF sections from the file, because RPM will normally try to extract any debug information into separate files, which are then placed in a separate debuginfo package. Even if you don't have any actual debug information in the library, it may still have empty debug sections that are removed by this process.
The best way to work out what is changing is to explore the two versions of the library with readelf and see whether the list of sections (reported by readelf -S) or symbols (reported by readelf -s) has changed.
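The readelf comparison suggested above could be scripted roughly as follows. This is only a sketch: elf_sections and compare_elf_sections are hypothetical helper names, the library paths in the usage comment are placeholders, and it assumes readelf from binutils is installed.

```shell
# Sketch only: the function names are hypothetical helpers, not part of rpm
# or binutils. Assumes readelf is on the PATH.

# Print the section names of the ELF file $1, one per line, sorted.
elf_sections() {
    # -W keeps each section header on a single line, so the name is the
    # first word after the "[Nr]" index column.
    readelf -S -W "$1" |
        sed -n 's/^ *\[ *[0-9][0-9]*\] \([^ ][^ ]*\).*/\1/p' |
        sort
}

# Diff the section lists of two ELF files; returns diff's exit status
# (0 when the lists are identical, 1 when they differ).
compare_elf_sections() {
    before=$(mktemp) || return 2
    after=$(mktemp) || return 2
    elf_sections "$1" > "$before"
    elf_sections "$2" > "$after"
    diff "$before" "$after"
    status=$?
    rm -f "$before" "$after"
    return "$status"
}

# Example call (placeholder paths):
#   compare_elf_sections ./libfoo.so /usr/lib64/libfoo.so
```

Any section that appears only on the "before" side of the diff is a candidate for what was removed during packaging. The same approach works for symbols by swapping in readelf -s.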

Related

Creating a standalone, relocatable build of postgres

For a small project I'm working on, I would like to create a “relocatable build” of PostgreSQL, similar to the binaries here. The idea is that you have PostgreSQL and all required libraries packaged so that you can just unpack it in any directory on any machine and it will run. I want the resulting build of Postgres to work on virtually any Linux machine it finds itself on.
I've gotten as far as determining which libraries I need to build:
My understanding is that I should be getting the source code for these libraries (and their dependencies) and compiling them statically.
As things stand currently, my build script is quite barebones and obviously produces an install that is linked against whatever distribution it was run on:
./configure \
--prefix="${outputDir}" \
--with-uuid="ossp"
I'm wondering if anyone could outline what steps I must take to get the relocatable build that I'm after. My hunch right now is that I'm looking for guidance on what environment variables I would need to set and/or parameters I'd need to provide to my build in order to end up with a fully relocatable build of Postgres.
Please note: I don't normally work with C/C++ although I have several years of ./configure, make and doing builds for other much higher level ecosystems under my belt. I'm well aware that distribution-specific releases of Postgres are widely available, to speak nothing of the official docker container. Please take the approach that I'm pursuing a concept in the spirit of research or exploration. I'm looking for a precise solution, not a fast one.
This answer is for Linux; this will work differently on different operating systems.
You can create a “relocatable build” of PostgreSQL if you build it with the appropriate “run path”. The documentation gives you these hints:
The method to set the shared library search path varies between platforms, but the most widely-used method is to set the environment variable LD_LIBRARY_PATH [...]
On some systems it might be preferable to set the environment variable LD_RUN_PATH before building.
The manual for ld tells you:
If -rpath is not used when linking an ELF executable, the contents of the environment variable LD_RUN_PATH will be used if it is defined.
It also tells you this about the run path:
The tokens $ORIGIN and $LIB can appear in these search directories. They will be replaced by the full path to
the directory containing the program or shared object in the case of $ORIGIN and either lib - for 32-bit
binaries - or lib64 - for 64-bit binaries - in the case of $LIB.
See also this useful answer.
So the sequence of steps would be:
./configure --disable-rpath [other options]
export LD_RUN_PATH='$ORIGIN/../lib'
make
make install
Then you package the PostgreSQL binaries in the bin subdirectory and the shared libraries plus all required libraries (you can find them with ldd) in the lib subdirectory. The libraries will then be looked up relative to the binaries.
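The packaging step could be sketched like this. bundle_libs is a hypothetical helper name and the paths in the usage comment are placeholders. It assumes a Linux system where ldd prints lines of the form "name => /path (address)", and it copies everything ldd resolves, so deciding whether to leave out system libraries such as glibc is left to you.

```shell
# Sketch only: bundle_libs is a hypothetical helper, not a PostgreSQL tool.

# Copy every shared library that ldd resolves for binary $1 into directory $2.
bundle_libs() {
    binary=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Keep only lines like "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
    # and print the resolved absolute path from the third column.
    ldd "$binary" |
        awk '$2 == "=>" && $3 ~ /^\// { print $3 }' |
        while IFS= read -r lib; do
            # -L follows symlinks, so the copy carries the name the
            # loader looks up but holds the real file's contents.
            cp -L "$lib" "$dest/"
        done
}

# Example call (placeholder paths), followed by a check that the run path
# was actually recorded in the binary:
#   bundle_libs "$outputDir/bin/postgres" "$outputDir/lib"
#   readelf -d "$outputDir/bin/postgres" | grep -E 'RPATH|RUNPATH'
```

If the readelf check shows no RPATH or RUNPATH entry containing $ORIGIN, the binaries will still fall back to the system library directories on the target machine.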

Does a .so library need to be present at run time in all cases

My question is in relation to .so shared libraries. I am building a project that uses cmake on one ubuntu machine but running the application on another ubuntu machine.
In the CMakeLists.txt file, I have the following lines:
project (clientapp)
add_executable(${PROJECT_NAME} ${SOURCES} ${WAKAAMA_SOURCES} ${SHARED_SOURCES})
LINK_DIRECTORIES(/home/user//mraa-master-built/build/src)
target_link_libraries (clientapp libmraa.so)
target_link_libraries(clientapp m)
These lines add two libraries libmraa.so and the math library to the executable and it runs successfully on the other machine.
My understanding of shared libraries is that they must be present at compile time, and when the application starts. But I do not have the libmraa.so file on the other machine and the application runs ok. I expected it not to work.
Is my assumption correct?
In general, gcc and clang support lazy linking/binding of symbols, but not for entire libraries. This means that all of the shared objects (i.e. .so files) should be present at application startup, at a minimum. The one exception to this is if you modified your makefile to not link against these libraries, and you manually call library functions via dlopen()/dlsym(), etc.
The binding of individual symbols within those libraries can be postponed until they are needed, or you can force all the symbols to be resolved at startup, using -z lazy or -z now, respectively.
It is strange that your application runs without libmraa.so being present. The two most likely reasons your application runs in the absence of the library are:
Your application isn't using any symbols defined in the library, so the linker ignores the library at build time (try ldd app_name and see if your library is present in the list of libraries provided by ldd).
Something is amiss in your build script, and you are statically linking against a .a archive of the library.
Edit: In response to how the application knows where to find the library: at run time it is the dynamic loader (ld.so), not the build-time linker ld, that searches for each needed library, using any rpath or runpath recorded in the binary along with its default directories. You can see how this works by running something like LD_DEBUG=libs app_name from the command line. You can also add an extra search path via LD_LIBRARY_PATH=/some/path app_name.
Is my assumption correct?
Yes.
There are two likely explanations for why the application runs anyway:
You are mistaken, and there is libmraa.so somewhere on the machine (though perhaps not in the place where you looked), or
Your compiler passes -Wl,--as-needed by default, and your binary does not in fact depend on libmraa.so despite the fact that it appears on your link line.
You can trivially confirm or disprove either of the above guesses.
To confirm guess 2, do this:
readelf -d clientapp | grep NEED | grep libmraa
# if there is no output, guess 2 is correct
If guess 2 is wrong, to confirm guess 1, do this (on the machine without libmraa.so):
ldd clientapp | grep libmraa.so
# if guess 2 is incorrect, and this command produces no output, then
# your dynamic loader is broken, which is very unlikely.

Compiling an individual kernel module (Debian/Ubuntu)

I need to modify the ELF loader's kernel implementation of an Ubuntu 14.04 distribution. Having downloaded the sources using:
sudo apt-get source linux-image-$(uname -r)
I ran the configuration script:
make config
in the root of the source tree. After a seemingly endless sequence of input requests, the script created the .config file needed to build the kernel (or a set of modules). The kernel version I am using is linux-3.13.0, and it has the following source tree layout:
$ ls
arch COPYING crypto Documentation dropped.txt FileSystemMakefile fs init Kbuild kernel MAINTAINERS mm README samples security sound ubuntu virt
block CREDITS debian.master drivers elf.dat firmware include ipc Kconfig lib Makefile net REPORTING-BUGS scripts shortcuts tools usr
The ELF loader is located in /path/to/source/fs/binfmt_elf.c. Following this question, in order to compile an individual module it is sufficient to run
make /path/to/module/directory.
In this case that would be:
make ./path/to/source/fs
The compilation is quite lengthy; it takes about twenty minutes (on a virtual machine), and the output is written (by default) to the same directory in which the module is located. I found the object files by running:
find . -name "*.o"
in /path/to/source/fs. Filtering by name, the ELF loader can be located by running:
find . -name "*elf*.o"
In the current sources it is written (by default) to:
/path/to/source/fs/binfmt_elf.o
Having gone through this tutorial, I've noticed that kernel modules have the naming convention [module_name].ko in order to distinguish them from user space object files.
My question is: how can I insert the new (modified) ELF loader into the kernel, given that the current ELF loader is present (and unloading it may prevent binaries from being executed)?
What you have described is not really compiling a "kernel module" as it is commonly referred to. You have built an object that is statically linked into the kernel and there is no way that you can load just that object into a running kernel.
"Kernel module" usually refers to a "loadable kernel module" (LKM). Building and loading the fs as an LKM is what you need/want. Take a look at the HOWTO below and follow it to build the desired fs as an LKM. Then you can just replace that one LKM (.ko) file and reboot. Normally you can dynamically remove and insert LKMs, but I am not sure how that will affect something as fundamental as the ELF fs; you can try rmmod/modprobe without a reboot first if you like.
http://www.tldp.org/HOWTO/Module-HOWTO/x73.html

Debugging core file for image built without gdb flag -g

Consider a field case where we do not ship an image built with debug flags.
Is there any link, documentation or similar material that helps in debugging a core file generated in the field? (Remember, the image is not built with the -g flag.)
Some pointers would be really useful!
An even better solution is to always build your program with -g (which at least for GCC does not inhibit optimization). Then you can use objcopy to create separate debug files which you do not ship with the product, and stripped binaries which you do ship.
Then when you load a core from the field on a development machine, where the debug symbols are present, GDB will load the debug symbols from the separate files. In the field, the debug symbol files are not present, since you didn't ship them, so the debug info is not available.
If applicable, you can also create a DVD or USB key with the symbol files so that a technician can bring symbols with them to analyze a core file on-site.
You need to build your executable with -g (you may also specify -O). You then ship a stripped version of the executable (man strip). Any core file will be compatible with either version.
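The objcopy workflow mentioned above could look roughly like this. split_debug is a hypothetical helper name and the file names in the usage comment are placeholders. The three commands follow the separate-debug-file procedure from the GDB manual and assume GNU binutils.

```shell
# Sketch only: split_debug is a hypothetical helper; assumes GNU binutils
# (objcopy, strip) and a program that was compiled with -g.

# Split the debug info of program $1 into "$1.debug" and strip "$1" in place.
split_debug() {
    prog=$1
    # 1. Save the debug sections in a separate file that stays in-house.
    objcopy --only-keep-debug "$prog" "$prog.debug" &&
    # 2. Remove the debug sections from the binary that will be shipped.
    strip --strip-debug "$prog" &&
    # 3. Record the debug file's name and checksum in the shipped binary
    #    so that GDB can locate and verify it later.
    objcopy --add-gnu-debuglink="$prog.debug" "$prog"
}

# Example (placeholder names):
#   gcc -g -O2 -o myapp myapp.c
#   split_debug myapp          # ship myapp, keep myapp.debug
#   gdb myapp core             # on a machine where myapp.debug sits beside myapp
```

Keep the .debug file for every build you ship, since a debug file only matches the exact binary it was split from.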

How do I include only used symbols when statically linking with gcc?

I'm deploying a small program compiled with gcc 4.3.2-1.1 (Debian). This program will be deployed on virtual machine templates ranging from Debian 5 to bleeding edge Fedora, Ubuntu, Slackware, Arch and others.
The program depends on some symbols from Xen's libraries which are only available in an unstable tree. Hence, installing Xen's libraries via respective package managers on the virtual machine templates would not solve my immediate issue.
Until I package my own version of these libraries, I need to statically link the executable.
Does gcc 4.3.x, by default, only include symbols that are actually used when statically linking, or is there another optimization flag that I should be passing to the linker? I know that statically linking is bad; I'm doing it only as a temporary workaround.
This issue is related not only to gcc, but to ld(1) too.
By default, gcc doesn't remove unused functions from the object files it links in. You can check this by compiling and linking an executable, and then running
objdump -d a.out
which shows you all functions in your executable.
A simple search gives this link.
So, to remove unused functions, you need:
Compile with “-fdata-sections” to keep the data in separate data sections and “-ffunction-sections” to keep functions in separate sections, so they (data and functions) can be discarded if unused.
Link with “--gc-sections” to remove unused sections (when linking through gcc, pass it as “-Wl,--gc-sections”).
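To see the effect of these flags, a small experiment along these lines may help. has_symbol and build_demo are hypothetical helper names, and demo.c is a made-up three-function program. It assumes gcc, GNU ld (or a compatible linker) and nm are installed.

```shell
# Sketch only: hypothetical helpers for trying out the flags on a toy program.

# Succeeds if binary $1 defines the global text symbol $2.
has_symbol() {
    nm "$1" | grep -q " T $2\$"
}

# Write demo.c into directory $1 and build it twice:
#   demo_plain  with default flags
#   demo_gc     with per-function sections and linker garbage collection
build_demo() {
    cat > "$1/demo.c" <<'EOF'
int used(void)   { return 1; }
int unused(void) { return 2; }   /* never called */
int main(void)   { return used() - 1; }
EOF
    gcc -o "$1/demo_plain" "$1/demo.c" &&
    gcc -ffunction-sections -fdata-sections -Wl,--gc-sections \
        -o "$1/demo_gc" "$1/demo.c"
}

# Example:
#   dir=$(mktemp -d)
#   build_demo "$dir"
#   has_symbol "$dir/demo_plain" unused && echo "plain build keeps unused()"
#   has_symbol "$dir/demo_gc" unused || echo "gc build dropped unused()"
```

For a static link against a library archive, the same flags need to be used when compiling the library's own objects as well; otherwise each object file pulled from the archive is still a single section that is kept or dropped as a whole.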
