Moving to different Linux build system, getting error: undefined symbol: stat - c

This may just be an issue with the build system I am migrating to, but I'll include differences in the two systems and how I encountered the problem.
My old build system is a SLES 10 machine. The gcc/cpp/g++ version is 4.1.0
My new system is on SLES 11 SP4, and the gcc/cpp/g++ version is 4.3.4.
I am building a shared library; building and linking work fine on the new system. However, at load time on the new system, I get the following:
error ./mysharedlib.so: undefined symbol: stat
Since the stat() function is included from /usr/include/sys/stat.h, I looked at glibc on both systems. Old:
# rpm -q -f /usr/include/sys/stat.h
glibc-devel-2.4-31.2
and new:
# rpm -q -f /usr/include/sys/stat.h
glibc-devel-2.11.3-17.95.2
I also looked at objdump output related to stat() on the old system:
# objdump -T mysharedlib.so | grep stat
0000000000000000 D *UND* 0000000000000000 __xstat
# objdump -x mysharedlib.so | grep stat
00000000000e3f8a l F .text 0000000000000024 stat
0000000000000000 *UND* 0000000000000000 __xstat
And the new system:
# objdump -T mysharedlib.so | grep stat
0000000000000000 D *UND* 0000000000000000 stat
0000000000000000 D *UND* 0000000000000000 lstat
# objdump -x mysharedlib.so | grep stat
0000000000000000 *UND* 0000000000000000 stat
0000000000000000 *UND* 0000000000000000 lstat
This tells me that on the old system, stat() was defined as a local function in the .text section of my actual shared object. On the new system, stat is undefined in mysharedlib.so.
I did find some information on feature_test_macros and thought that might resolve the issue, so I included features.h before stat.h and updated my makefile to define _XOPEN_SOURCE:
cc -D_XOPEN_SOURCE=500
This didn't resolve the problem.
I also tried adding "-lc" to my ld flags to link in libc. This seemed like it should work, since that's where stat() is defined (I think), but it did not.
At this point, I found this StackOverflow question:
Why does -O to gcc cause "stat" to resolve?
So I tried adding -O to my makefile when invoking g++ on the file that calls stat(). This seems to resolve the problem. I probably don't know enough about resolving symbols; however, this seems a bit hack-ish to me. Am I way off base there? If not, what's the correct way to resolve the load time error on the new system?

The problem you are facing is most likely the result of building your shared library with ld. User-level code on UNIX systems should never use ld directly. You should use the compiler driver (g++ in your case) to perform the link instead.
Example:
// t.c
#include <sys/stat.h>
void fn(const char *p)
{
struct stat st;
stat(p, &st);
}
gcc -fPIC -c t.c
ld -shared -o t.so t.o
nm t.so | grep stat
U stat ## problem: this library is not linked correctly
Compare to correctly linked library:
gcc -shared -o t.so t.o
nm t.so | grep stat
0000000000000700 t stat
0000000000000700 t __stat
U __xstat@@GLIBC_2.2.5
To find where the above local stat symbol came from, you could do this:
gcc -shared -o t.so t.o -Wl,-y,stat
t.o: reference to stat
/usr/lib/x86_64-linux-gnu/libc_nonshared.a(stat.oS): definition of stat
Finally, the reason U stat disappears with optimization:
gcc -E t.c | grep -A2 ' stat '
extern int stat (const char *__restrict __file,
struct stat *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
gcc -E t.c -O | grep -A2 ' stat '
__attribute__ ((__nothrow__ , __leaf__)) stat (const char *__path, struct stat *__statbuf)
{
return __xstat (1, __path, __statbuf);
That's right: you get different preprocessed source depending on the optimization level.
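The effect in that last step can be imitated in a few lines. The sketch below is not glibc's header: demo_stat and demo_xstat are made-up names standing in for stat and __xstat. It relies only on __OPTIMIZE__, which GCC predefines when optimization is enabled; that the real header is gated in a similar way is an assumption here. With -O the caller sees an inline body and ends up referencing only demo_xstat, while without -O it references demo_stat itself, which must then be defined somewhere at link time.

```c
#include <stdio.h>

/* Hypothetical stand-in for the versioned entry point (__xstat in glibc). */
int demo_xstat(int version, const char *path)
{
    printf("demo_xstat(%d, \"%s\")\n", version, path);
    return version;
}

#ifdef __OPTIMIZE__
/* Optimized build: the "header" supplies an inline body, so callers
 * only ever reference demo_xstat. */
static inline int demo_stat(const char *path)
{
    return demo_xstat(1, path);
}
#else
/* Unoptimized build: the "header" only declares demo_stat, so callers
 * reference demo_stat and the linker must find a definition.  It is
 * defined in this same file only so that the sketch links on its own. */
int demo_stat(const char *path);

int demo_stat(const char *path)
{
    return demo_xstat(1, path);
}
#endif
```

Calling demo_stat("/tmp") returns 1 either way; what differs between the two builds is which symbol the call site refers to.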

Related

How to link file generated with --relocatable in a PIE executable?

I have a big text file that I want to include in a C program. I could just make it a string literal but it's pretty big and that would be cumbersome. So I'm currently linking like this:
$ ld -r -b binary -o /tmp/stuff.o /tmp/stuff.txt
$ clang -o myprogram main.o /tmp/stuff.o
Objdump output:
$ objdump -t /tmp/stuff.o
/tmp/stuff.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000006 g *ABS* 0000000000000000 _binary__tmp_stuff_txt_size
0000000000000006 g .data 0000000000000000 _binary__tmp_stuff_txt_end
0000000000000000 g .data 0000000000000000 _binary__tmp_stuff_txt_start
In the code, I do this (gotten from this question):
extern char _binary__tmp_stuff_txt_start[];
extern char _binary__tmp_stuff_txt_size[];
int f(void) {
size_t size = (size_t)_binary__tmp_stuff_txt_size;
do_stuff(size, _binary__tmp_stuff_txt_start);
}
Everything works great, but when I compile with GCC instead of Clang, it segfaults. Looking at it in GDB, the size variable, initialized as size_t size = (size_t)_binary__tmp_stuff_txt_size;, holds garbage. It seems that when GCC links, it passes the -pie flag to ld, but Clang doesn't. I could fix this by just passing -no-pie to GCC, but it seems kind of sad that doing something so simple would prevent using PIE. Is there something I should change to make this work?
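A workaround that is often suggested for this situation, and that is not taken from the thread above, is to avoid the absolute _size symbol and compute the length from the _start and _end symbols instead. Both of those live in .data, so whatever load offset a PIE gets applies to both and cancels in the subtraction. In the sketch below, blob_start and blob_end are hypothetical stand-ins for _binary__tmp_stuff_txt_start and _binary__tmp_stuff_txt_end, so that it compiles without the object produced by ld -r -b binary.

```c
#include <stddef.h>

/* Stand-ins for the embedded file.  With the real object these would be
 *   extern const char _binary__tmp_stuff_txt_start[];
 *   extern const char _binary__tmp_stuff_txt_end[];
 */
const char blob_start[] = { 'h', 'e', 'l', 'l', 'o', '\n' };
const char *const blob_end = blob_start + sizeof blob_start;

/* Length as a pointer difference: both operands are ordinary addresses,
 * so a load-time offset applied to the image cancels out. */
size_t blob_size(const char *start, const char *end)
{
    return (size_t)(end - start);
}
```

With the six-byte stand-in, blob_size(blob_start, blob_end) yields 6, the same value the objdump listing shows for the _size symbol.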

How does GCC implement __attribute__((constructor)) on MinGW?

I know that on ELF platforms, __attribute__((constructor)) uses the .ctors ELF section. Now I realized that the function attribute works with GCC on MinGW as well and I'm wondering how it is implemented.
For MinGW targets (and other COFF targets, like Cygwin), the compiler just emits each constructor function's address into a .ctors COFF section:
$ cat c1.c
void c1() {
}
$ x86_64-w64-mingw32-gcc -c c1.c
$ objdump -x c1.o | grep ctors
# nothing
$ cat c1.c
__attribute__((constructor)) void c1() {
}
$ x86_64-w64-mingw32-gcc -c c1.c
$ objdump -x c1.o | grep ctors
5 .ctors 00000008 0000000000000000 0000000000000000 00000150 2**3
The GNU ld linker (for MinGW targets) is then configured (via its default linker script) to combine these sections into the regular .text section, with the __CTOR_LIST__ symbol pointing to the start of the list and a zero entry terminating it. (The .rdata section would probably be a clearer choice, since these are just addresses of functions, not CPU instructions, but for some reason .text is used. In fact, the LLVM LLD linker targeting MinGW places them in .rdata.)
LD linker:
$ x86_64-w64-mingw32-ld --verbose
...
.text ... {
...
__CTOR_LIST__ = .;
LONG (-1); LONG (-1);
KEEP (*(.ctors));
KEEP (*(.ctor));
KEEP (*(SORT_BY_NAME(.ctors.*)));
LONG (0); LONG (0);
...
...
}
It is then up to the C runtime library to run these constructors during initialization, using this __CTOR_LIST__ symbol.
From the mingw-w64 C runtime:
extern func_ptr __CTOR_LIST__[];
void __do_global_ctors (void)
{
// finds the last (zero terminated) item
...
// then runs from last to first:
for (i = nptrs; i >= 1; i--)
{
__CTOR_LIST__[i] ();
}
...
}
(The Cygwin runtime is very similar.)
This can also be seen in the debugger:
$ echo $MSYSTEM
MINGW64
$ cat c11.c
#include <stdio.h>
__attribute__((constructor))
void i1() {
puts("i 1");
}
int main() {
puts("main");
return 0;
}
$ gcc c11.c -o c11
$ gdb ./c11.exe
(gdb) b i1
(gdb) r
(gdb) bt
#0 0x00007ff603591548 in i1 ()
#1 0x00007ff6035915f2 in __do_global_ctors () at C:/_/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:44
#2 0x00007ff60359164f in __main () at C:/_/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:58
#3 0x00007ff60359139b in __tmainCRTStartup () at C:/_/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:313
#4 0x00007ff6035914f6 in mainCRTStartup () at C:/_/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:202
(gdb)
Note that in some environments (neither MinGW nor Linux) it is instead the responsibility of GCC (its compiler runtime libgcc, more specifically its static part, crtbegin.o and crtend.o), and not of the C runtime, to run these constructors.
Also, for comparison, on ELF targets (like Linux) GCC used to use a mechanism similar to the one described above for MinGW (ELF .ctors sections, although the rest was a bit different), but since GCC 4.7 (released in 2012) it uses a slightly different mechanism (the ELF .init_array section).
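The list walk described above can be tried out in isolation. The following is only a sketch of the idea, not the mingw-w64 source: demo_ctor_list is a hand-built stand-in for __CTOR_LIST__ (a -1 marker, the function addresses, then a null terminator), and demo_run_ctors, demo_order and the ctor_* functions are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*func_ptr)(void);

/* Records the order in which the demo constructors run. */
int demo_order[4];
size_t demo_count;

static void ctor_a(void) { demo_order[demo_count++] = 1; }
static void ctor_b(void) { demo_order[demo_count++] = 2; }
static void ctor_c(void) { demo_order[demo_count++] = 3; }

/* Hypothetical stand-in for __CTOR_LIST__: a -1 marker in slot 0,
 * the constructor addresses, then a terminating null entry. */
func_ptr demo_ctor_list[] = {
    (func_ptr)(intptr_t)-1, ctor_a, ctor_b, ctor_c, NULL
};

/* Count the entries up to the terminator, then call them last to first. */
size_t demo_run_ctors(const func_ptr *list)
{
    size_t nptrs = 0;

    while (list[nptrs + 1] != NULL)
        nptrs++;
    for (size_t i = nptrs; i >= 1; i--)
        list[i]();
    return nptrs;
}
```

Calling demo_run_ctors(demo_ctor_list) runs ctor_c, ctor_b and ctor_a in that order, so demo_order ends up holding 3, 2, 1.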

Symbol visibility not working as expected

I have a sample program like this:
#include <stdio.h>
#if 1
#define FOR_EXPORT __attribute__ ((visibility("hidden")))
#else
#define FOR_EXPORT
#endif
FOR_EXPORT void mylocalfunction1(void)
{
printf("function1\n");
}
void mylocalfunction2(void)
{
printf("function2\n");
}
void mylocalfunction3(void)
{
printf("function3\n");
}
void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
And compile it using
gcc -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
Now after compilation I do:
$ nm libdefaultvisibility.so
0000000000000eb0 t _mylocalfunction1
0000000000000ed0 t _mylocalfunction2
0000000000000ef0 t _mylocalfunction3
0000000000000f10 t _printMessage
U _printf
U dyld_stub_binder
Which means, as far as I can tell, that despite -fvisibility=hidden all symbols get exported. The book I was following claimed that only the function marked with FOR_EXPORT should be exported.
I looked up several other resources, but for the simple test I'm doing, -fvisibility=hidden should be sufficient.
My clang version:
$ clang -v
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You're misunderstanding the output of nm. Scroll through man nm and you'll
read that the t flag means the symbol is a local (static) symbol in
the text section. The linker can't see it. If it were global (external),
the flag would be T. So all four of your functions are local.
Contrast:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000570 t deregister_tm_clones
0000000000000600 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
0000000000000640 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000670 t mylocalfunction1
0000000000000690 t mylocalfunction2
00000000000006b0 t mylocalfunction3
00000000000006d0 t printMessage
00000000000005b0 t register_tm_clones
with the result of dropping -fvisibility=hidden:
$ clang -shared -fPIC -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' t '
0000000000000600 t deregister_tm_clones
0000000000000690 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
00000000000006d0 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000700 t mylocalfunction1
0000000000000640 t register_tm_clones
$ nm libdefaultvisibility.so | grep ' T '
0000000000000780 T _fini
00000000000005b0 T _init
0000000000000720 T mylocalfunction2
0000000000000740 T mylocalfunction3
0000000000000760 T printMessage
Then only the explicitly hidden mylocalfunction1 remains local, and the
other three are now global.
You should not expect that a symbol marked with __attribute__ ((visibility("hidden")))
will be exported by a shared library in any circumstances. The attribute means precisely
that it will not be, whether it is applied explicitly to a symbol, as in this case,
or acquired by default in the presence of the compiler option -fvisibility=hidden.
If you want to export just that one function in the example by means of a visibility attribute,
you would have:
#define FOR_EXPORT __attribute__ ((visibility("default")))
Then:
$ clang -shared -fPIC -fvisibility=hidden -o libdefaultvisibility.so defaultvisibility.c
$ nm libdefaultvisibility.so | grep ' T '
0000000000000720 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
It is global, because the explicit attribute overrides the command-line option,
and all your other functions are local. Perhaps confusingly, default visibility
is always public.
You could also accomplish this without resorting to visibility attributes, which are
not portable, by simply declaring all the functions that you don't want to export as static. Then the compiler
would not expose them to the linker in the first place:
foo.c
#include <stdio.h>
void mylocalfunction1(void)
{
printf("function1\n");
}
static void mylocalfunction2(void)
{
printf("function2\n");
}
static void mylocalfunction3(void)
{
printf("function3\n");
}
static void printMessage(void)
{
printf("Running the function exported from the shared library\n");
}
With which you again get:
$ clang -shared -fPIC -o libfoo.so foo.c
$ nm libfoo.so | grep ' T '
00000000000006c0 T _fini
0000000000000550 T _init
00000000000006a0 T mylocalfunction1
Although the distinction does not make itself felt in your example, you
should understand that while a local/static symbol is not seen by the linker and (therefore) is unavailable for dynamic linkage, a global/external symbol
may or may not be available for dynamic linkage. Visibility
controls the availability of global symbols for dynamic linkage only.
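To make that three-way distinction concrete, here is a small sketch with one function of each kind. The names scale, demo_internal and demo_api are hypothetical, and the comments describe what one would expect to see from nm and nm -D on an ELF shared library built from it; they are expectations, not captured output.

```c
/* Local: a static function never reaches the linker's global symbols,
 * so it would show up in nm as 't' (if it is emitted at all). */
static int scale(int x)
{
    return 2 * x;
}

/* Global but hidden: other object files linked into the same library
 * can call it, but it should not be listed by nm -D. */
__attribute__((visibility("hidden"))) int demo_internal(int x)
{
    return scale(x) + 1;
}

/* Global with default visibility: available for dynamic linkage,
 * so it should be listed by nm -D as 'T'. */
__attribute__((visibility("default"))) int demo_api(int x)
{
    return demo_internal(x) * 10;
}
```

The hidden function is where the difference shows: unlike a static function, it can be shared between the source files that make up the library, yet like a static function it stays out of the library's dynamic interface.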
According to GCC Wiki on Visibility, you should:
Use nm -C -D on the outputted DSO [Dynamic Shared Object] to compare before and after to see
the difference it makes.
As stated in the nm manual:
-D will display the dynamic symbols rather than the normal symbols
If I compile your code exactly as you did, I get the following symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200a68 B __bss_start
w __cxa_finalize
0000000000200a68 D _edata
0000000000200a70 B _end
00000000000006c8 T _fini
w __gmon_start__
0000000000000518 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
U puts
And if I compile it without the -fvisibility=hidden option, I get these symbols:
$ nm -C -D libdefaultvisibility.so
0000000000200ae8 B __bss_start
w __cxa_finalize
0000000000200ae8 D _edata
0000000000200af0 B _end
0000000000000748 T _fini
w __gmon_start__
00000000000005a0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
0000000000000712 T mylocalfunction2
0000000000000724 T mylocalfunction3
0000000000000736 T printMessage
U puts

C, Linker: How to use weak symbols with static library

I have a large code base which is mainly built as a single binary. I have changed the Makefile to create a static library, and I now build the binary by linking against that library.
When I link against the static library, the program doesn't work because the weak symbols remain undefined:
gcc test.c -L . -lasntc -ltc -lresolv -lnetlink -lutil -ltc -lm -o mytc
nm mytc | grep htb_qdisc_util
w htb_qdisc_util
I extracted the archives, which gave me the .o files, and built a binary directly from those object files. This works, as shown:
gcc tc.o tc_qdisc.o tc_class.o tc_filter.o tc_util.o tc_monitor.o
m_police.o m_estimator.o m_action.o m_ematch.o emp_ematch.yacc.o
emp_ematch.lex.o asn_tc.o asn_global.o q_fifo.o q_sfq.o q_red.o q_prio.o q_tbf.o
q_cbq.o q_rr.o q_multiq.o q_netem.o f_rsvp.o f_u32.o f_route.o f_fw.o f_basic.o
f_flow.o f_cgroup.o q_dsmark.o q_gred.o f_tcindex.o q_ingress.o q_hfsc.o q_htb.o
q_drr.o q_qfq.o m_gact.o m_mirred.o m_nat.o m_pedit.o m_skbedit.o p_ip.o
p_icmp.o p_tcp.o p_udp.o em_nbyte.o em_cmp.o em_u32.o em_meta.o q_mqprio.o static-syms.o tc_core.o tc_red.o tc_cbq.o tc_estimator.o tc_stab.o -lresolv -L. -lnetlink -lutil -L. -lm -o tc
nm tc | grep htb_qdisc_util
0000000000641bc0 D htb_qdisc_util
Looking at the symbol tables of the object files, I see the following:
nm *.o | grep htb_qdisc_util
0000000000000000 D htb_qdisc_util
w htb_qdisc_util
The weak symbols come from code like this:
extern char hfsc_qdisc_util[] __attribute__((weak)); if (!strcmp(sym, "hfsc_qdisc_util")) return hfsc_qdisc_util;
extern char htb_qdisc_util[] __attribute__((weak)); if (!strcmp(sym, "htb_qdisc_util")) return htb_qdisc_util;
How should I create the static library, and what is happening when I create the binary?
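A likely explanation, offered here rather than taken from the thread: a weak undefined reference does not make the linker extract a member from a .a archive, so the object that defines htb_qdisc_util is never pulled in and the reference quietly stays null, whereas object files named on the command line are always linked. The sketch below shows only the second half of that, namely that an unresolved weak reference links fine and evaluates to a null address. demo_missing_util, demo_present_util and demo_lookup are hypothetical names, and the behaviour assumes a GCC-compatible compiler on an ELF target.

```c
#include <stddef.h>
#include <string.h>

/* Weak reference: if nothing in the link defines this symbol, the link
 * still succeeds and the symbol's address is null at run time.
 * demo_missing_util is a made-up name that no object file defines. */
extern char demo_missing_util[] __attribute__((weak));

/* Defined in this file, so this one always resolves. */
char demo_present_util[] = "present";

/* Same shape as the lookup code in the question. */
void *demo_lookup(const char *sym)
{
    if (!strcmp(sym, "demo_missing_util"))
        return demo_missing_util;
    if (!strcmp(sym, "demo_present_util"))
        return demo_present_util;
    return NULL;
}
```

If that explanation applies, options worth trying are wrapping the library in -Wl,--whole-archive ... -Wl,--no-whole-archive or forcing individual symbols with -Wl,-u,htb_qdisc_util, both of which make the linker take archive members it would otherwise skip.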

Is it possible to override static functions in an object module (gcc, ld, x86, objcopy)?

Is there a way to override functions with static scope
within an object module?
Suppose I start with something like this: a module
in which the global symbol "foo" is a function that calls
the local symbol "bar", which in turn calls the local symbol "baz":
[scameron@localhost ~]$ cat foo.c
#include <stdio.h>
static void baz(void)
{
printf("baz\n");
}
static void bar(void)
{
printf("bar\n");
baz();
}
void foo(void)
{
printf("foo\n");
bar();
}
[scameron@localhost ~]$ gcc -g -c foo.c
[scameron@localhost ~]$ objdump -x foo.o | egrep 'foo|bar|baz'
foo.o: file format elf32-i386
foo.o
00000000 l df *ABS* 00000000 foo.c
00000000 l F .text 00000014 baz
00000014 l F .text 00000019 bar
0000002d g F .text 00000019 foo
It has one global, "foo" and two locals "bar" and "baz."
Suppose I want to write some unit tests that exercise bar and baz,
I can do:
[scameron@localhost ~]$ cat barbaz
bar
baz
[scameron@localhost ~]$ objcopy --globalize-symbols=barbaz foo.o foo2.o
[scameron@localhost ~]$ objdump -x foo2.o | egrep 'foo|bar|baz'
foo2.o: file format elf32-i386
foo2.o
00000000 l df *ABS* 00000000 foo.c
00000000 g F .text 00000014 baz
00000014 g F .text 00000019 bar
0000002d g F .text 00000019 foo
[scameron@localhost ~]$
And now bar and baz are global symbols and accessible from
outside the module. So far so good.
But what if I want to interpose my own function on top
of "baz", and have "bar" call my interposed "baz"?
Is there a way to do that?
--wrap option doesn't seem to do it...
[scameron@localhost ~]$ cat ibaz.c
#include <stdio.h>
extern void foo();
extern void bar();
void __wrap_baz()
{
printf("wrapped baz\n");
}
int main(int argc, char *argv[])
{
foo();
baz();
}
[scameron@localhost ~]$ gcc -o ibaz ibaz.c foo2.o -Xlinker --wrap -Xlinker baz
[scameron@localhost ~]$ ./ibaz
foo
bar
baz
wrapped baz
[scameron@localhost ~]$
The baz called from main() got wrapped, but
bar still calls the local baz not the wrapped baz.
Is there a way to make bar call the wrapped baz?
Even if it requires modifying the object code to tinker with the addresses of function calls, if that can be done in an automated way, that might be good enough, but in that case it needs to work on at least i386 and x86_64.
-- steve
Since static is a promise to the C compiler that the function or variable is local to the file, the compiler is free to remove that code if it can get the same result without it.
This might be inlining the function call. It might mean replacing a variable with constants. If the code is inside an if statement that is always false, the function may not even exist in the compiled result.
All of that means that you cannot reliably redirect calls to that function.
If you compile with the newer -flto option (link-time optimization) it is even worse, because the compiler is free to reorder, remove or inline code across the entire project.
I received an email from Ian Lance Taylor (author of gold, an alternative linker to ld):
Is there a way to override functions with static scope
within an object module? (I'm on x86_64 and i386 linux)
No, there isn't. In particular, the compiler may inline calls to static
functions, and it may also rewrite the functions to use a different
calling convention (GCC does both of these optimizations). So there is
no reliable way to override a static function after the code has been
compiled.
The inlining can be dealt with, I think, by -fno-inline, but changing the calling conventions is probably too much.
That being said, the DynamoRIO guys claim to be able to do it, but I haven't verified it:
https://groups.google.com/forum/?fromgroups#!topic/dynamorio-users/xt8JTXBCZ74
If you are okay with modifying machine code then you should have no problem modifying source code.
Write scripts to mechanically produce unit test sources from the real sources. Perl does this very well.
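If changing the source is acceptable, one way to get a replaceable baz is to give the call a seam in the source itself rather than patching the object file. The sketch below is one such arrangement, not something proposed in the answers: baz_hook, trace and fake_baz are hypothetical names, and the functions append to a buffer instead of printing so that the call order can be checked.

```c
#include <string.h>

/* Call trace, so a test can see which functions ran and in what order. */
char trace[64];

static void baz(void)
{
    strcat(trace, "baz;");
}

/* The seam: bar() calls through this pointer.  It defaults to the real
 * baz() but a unit test can point it at a replacement. */
void (*baz_hook)(void) = baz;

static void bar(void)
{
    strcat(trace, "bar;");
    baz_hook();
}

void foo(void)
{
    strcat(trace, "foo;");
    bar();
}

/* Example replacement that a test might install. */
void fake_baz(void)
{
    strcat(trace, "wrapped baz;");
}
```

With the default hook, foo() leaves "foo;bar;baz;" in trace; after clearing trace and setting baz_hook = fake_baz, the same call leaves "foo;bar;wrapped baz;". Because the call goes through a pointer that is visible outside the file, the compiler cannot simply inline the original baz away, which is the problem the answers above describe for the object-level approach.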

Resources