Retrieve names of called functions in a C program (Lib ELF) - c

I'm trying to get the names of the functions called in a C program.
For example, this is my code:
void toto(void)
{
printf("toto\n");
}
void tutu(void)
{
printf("tutu\n");
}
void mdr(void)
{
printf("tutu\n");
printf("tutu\n");
printf("tutu\n");
printf("tutu\n");
}
int main(int ac, char **av)
{
toto();
tutu();
mdr();
}
As a result, I only want main(), toto(), tutu() and mdr().
The problem is that when I use libelf, I can retrieve information from the symbol table, but it lists my functions along with many others, as in this example:
...
57: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_
58: 0000000000401148 47 FUNC GLOBAL DEFAULT 13 mdr
59: 0000000000401126 17 FUNC GLOBAL DEFAULT 13 toto
60: 0000000000404020 0 NOTYPE GLOBAL DEFAULT 23 __data_start
61: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
62: 0000000000402008 0 OBJECT GLOBAL HIDDEN 15 __dso_handle
63: 0000000000401137 17 FUNC GLOBAL DEFAULT 13 tutu
64: 0000000000402000 4 OBJECT GLOBAL DEFAULT 15 _IO_stdin_used
65: 00000000004011a0 101 FUNC GLOBAL DEFAULT 13 __libc_csu_init
66: 0000000000404028 0 NOTYPE GLOBAL DEFAULT 24 _end
67: 0000000000401070 5 FUNC GLOBAL HIDDEN 13 _dl_relocate_static_pie
68: 0000000000401040 47 FUNC GLOBAL DEFAULT 13 _start
69: 0000000000404024 0 NOTYPE GLOBAL DEFAULT 24 __bss_start
70: 0000000000401177 37 FUNC GLOBAL DEFAULT 13 main
71: 0000000000404028 0 OBJECT GLOBAL HIDDEN 23 __TMC_END__
72: 0000000000401000 0 FUNC GLOBAL HIDDEN 11 _init
So how can I retrieve only the functions I wrote and call? Thanks again for the help.
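One possible starting point, offered as a heuristic sketch rather than a definitive answer: filter the symbol table entries down to functions that are defined in the file and whose names do not begin with an underscore. The helper name looks_like_user_function is hypothetical, and the sketch uses Elf64_Sym from <elf.h> directly; it assumes the caller has already obtained each symbol and its name (for instance through libelf). Note that the symbol table lists defined functions, not calls, so this cannot tell whether a function is actually called.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Heuristic filter (hypothetical helper): keep symbols that are functions,
 * are defined in this file, and do not use an underscore-prefixed name,
 * which is the convention for runtime and startup symbols. */
bool looks_like_user_function(const Elf64_Sym *sym, const char *name)
{
    if (sym == NULL || name == NULL)
        return false;
    if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC)
        return false;               /* drops OBJECT, NOTYPE, ... */
    if (sym->st_shndx == SHN_UNDEF)
        return false;               /* drops imports such as printf */
    if (name[0] == '\0' || name[0] == '_')
        return false;               /* drops _start, _init, __libc_csu_init, ... */
    return true;
}
```

Applied to the listing above, this would keep mdr, toto, tutu and main. A user function whose name starts with an underscore would be filtered out as well, which is the main limitation of the heuristic.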

Related

Global const optimization and symbol interposition

I was experimenting with gcc and clang to see if they can optimize
#define SCOPE static
SCOPE const struct wrap_ { const int x; } ptr = { 42 /*==0x2a*/ };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
to return an intermediate constant.
It turns out they can:
0000000000000010 <ret_global>:
10: b8 2a 00 00 00 mov $0x2a,%eax
15: c3 retq
but surprisingly, removing the static yields the same assembly output.
That got me curious, because if the global isn't static it should be interposable, and replacing the reference with an intermediate should prevent interposition on the global variable.
And indeed it does:
#!/bin/sh -eu
: ${CC:=gcc}
cat > lib.c <<EOF
int ret_42(void) { return 42; }
#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
int ret_fn_result(void) { return ret_42()+1; }
EOF
cat > lib_override.c <<EOF
int ret_42(void) { return 50; }
#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
EOF
cat > main.c <<EOF
#include <stdio.h>
int ret_42(void), ret_global(void), ret_fn_result(void);
struct wrap_ { const int x; };
extern struct wrap { const struct wrap_ *ptr; } const w;
int main(void)
{
printf("ret_42()=%d\n", ret_42());
printf("ret_fn_result()=%d\n", ret_fn_result());
printf("ret_global()=%d\n", ret_global());
printf("w.ptr->x=%d\n",w.ptr->x);
}
EOF
for c in *.c; do
$CC -fpic -O2 $c -c
#$CC -fpic -O2 $c -c -fno-semantic-interposition
done
$CC lib.o -o lib.so -shared
$CC lib_override.o -o lib_override.so -shared
$CC main.o $PWD/lib.so
export LD_LIBRARY_PATH=$PWD
./a.out
LD_PRELOAD=$PWD/lib_override.so ./a.out
outputs
ret_42()=42
ret_fn_result()=43
ret_global()=42
w.ptr->x=42
ret_42()=50
ret_fn_result()=51
ret_global()=42
w.ptr->x=60
Is it OK for the compiler to replace refs to extern global variables with intermediates? Shouldn't those be interposable as well?
Edit:
Gcc does not optimize out external function calls (unless compiled with -fno-semantic-interposition)
such as the call to ret_42() in int ret_fn_result(void) { return ret_42()+1; }, even though, as with a reference to an extern global const variable, the only way for the definition of the symbol to change is through interposition.
0000000000000020 <ret_fn_result>:
20: 48 83 ec 08 sub $0x8,%rsp
24: e8 00 00 00 00 callq 29 <ret_fn_result+0x9>
29: 48 83 c4 08 add $0x8,%rsp
2d: 83 c0 01 add $0x1,%eax
I always assumed this was to allow for the possibility of symbol interposition. Incidentally, clang does optimize them.
I wonder where (if anywhere) it says that the reference to extern const w in ret_global() can be optimized to an intermediate while the call to ret_42() in ret_fn_result cannot.
Anyway, it seems that symbol interposition is awfully inconsistent and unreliable across different compilers unless you establish translation unit boundaries. :/
(Would be nice if simply all globals were consistently interposable unless -fno-semantic-interposition is on, but one can only wish.)
According to What is the LD_PRELOAD trick?, LD_PRELOAD is an environment variable that allows users to load a library before any other library is loaded, including libc.so.
This definition implies two things:
The library specified in LD_PRELOAD can overload symbols from other libraries.
However, if the specified library does not contain a symbol, the other libraries will be searched for that symbol as usual.
Here you specified lib_override.so in LD_PRELOAD. It defines int ret_42(void) and the global variables ptr and w, but it does not define int ret_global(void).
So int ret_global(void) is loaded from lib.so. The compiler sees no possibility that ptr and w from lib.c can be modified at runtime (they are put in a read-only data section of the ELF file, and Linux guarantees through hardware memory protection that they cannot be modified at runtime), so it optimized the function to return 42 directly.
Edit -- a test:
So I did some modification to your script:
#!/bin/sh -eu
: ${CC:=gcc}
cat > lib.c <<EOF
int ret_42(void) { return 42; }
#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
EOF
cat > lib_override.c <<EOF
int ret_42(void) { return 50; }
#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
EOF
cat > main.c <<EOF
#include <stdio.h>
int ret_42(void), ret_global(void);
struct wrap_ { const int x; };
extern struct wrap { const struct wrap_ *ptr; } const w;
int main(void)
{
printf("ret_42()=%d\n", ret_42());
printf("ret_global()=%d\n", ret_global());
printf("w.ptr->x=%d\n",w.ptr->x);
}
EOF
for c in *.c; do gcc -fpic -O2 $c -c; done
$CC lib.o -o lib.so -shared
$CC lib_override.o -o lib_override.so -shared
$CC main.o $PWD/lib.so
export LD_LIBRARY_PATH=$PWD
./a.out
LD_PRELOAD=$PWD/lib_override.so ./a.out
And this time, it prints:
ret_42()=42
ret_global()=42
w.ptr->x=42
ret_42()=50
ret_global()=60
w.ptr->x=60
Edit -- conclusion:
So it turns out that you should either overload all related parts or overload nothing; otherwise you get this kind of tricky behavior. Another approach is to define int ret_global(void) in the header rather than in the dynamic library, so you don't have to worry about this when you try to overload some functionality for testing.
Edit -- an explanation of why int ret_global(void) is overloadable and ptr and w are not.
First, I want to point out the types of the symbols you defined (using techniques from How do I list the symbols in a .so file):
File lib.so:
Symbol table '.dynsym' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
5: 0000000000001110 6 FUNC GLOBAL DEFAULT 12 ret_global
6: 0000000000001120 17 FUNC GLOBAL DEFAULT 12 ret_fn_result
7: 000000000000114c 0 FUNC GLOBAL DEFAULT 14 _fini
8: 0000000000001100 6 FUNC GLOBAL DEFAULT 12 ret_42
9: 0000000000000200 4 OBJECT GLOBAL DEFAULT 1 ptr
10: 0000000000003018 8 OBJECT GLOBAL DEFAULT 22 w
Symbol table '.symtab' contains 28 entries:
Num: Value Size Type Bind Vis Ndx Name
23: 0000000000001100 6 FUNC GLOBAL DEFAULT 12 ret_42
24: 0000000000001110 6 FUNC GLOBAL DEFAULT 12 ret_global
25: 0000000000001120 17 FUNC GLOBAL DEFAULT 12 ret_fn_result
26: 0000000000003018 8 OBJECT GLOBAL DEFAULT 22 w
27: 0000000000000200 4 OBJECT GLOBAL DEFAULT 1 ptr
File lib_override.so:
Symbol table '.dynsym' contains 11 entries:
Num: Value Size Type Bind Vis Ndx Name
6: 0000000000001100 6 FUNC GLOBAL DEFAULT 12 ret_42
7: 0000000000000200 4 OBJECT GLOBAL DEFAULT 1 ptr
8: 0000000000001108 0 FUNC GLOBAL DEFAULT 13 _init
9: 0000000000001120 0 FUNC GLOBAL DEFAULT 14 _fini
10: 0000000000003018 8 OBJECT GLOBAL DEFAULT 22 w
Symbol table '.symtab' contains 26 entries:
Num: Value Size Type Bind Vis Ndx Name
23: 0000000000001100 6 FUNC GLOBAL DEFAULT 12 ret_42
24: 0000000000003018 8 OBJECT GLOBAL DEFAULT 22 w
25: 0000000000000200 4 OBJECT GLOBAL DEFAULT 1 ptr
You will find that, although they are all GLOBAL symbols, the functions have type FUNC while the variables have type OBJECT. GCC treats a call to an exported FUNC symbol as overloadable and routes it through symbol resolution, whereas it folded the value of the const OBJECT data at compile time, so ret_global doesn't use symbol resolution to get the data.
For further information on this, see: What Are "Tentative" Symbols?
You can use LD_DEBUG=bindings to trace symbol binding. In this case, it prints (among other things):
17570: binding file /tmp/lib.so [0] to /tmp/lib_override.so [0]: normal symbol `ptr'
17570: binding file /tmp/lib_override.so [0] to /tmp/lib_override.so [0]: normal symbol `ptr'
17570: binding file ./a.out [0] to /tmp/lib_override.so [0]: normal symbol `ret_42'
17570: binding file ./a.out [0] to /tmp/lib_override.so [0]: normal symbol `ret_global'
So the ptr object in lib.so is indeed interposed, but the main program never calls ret_global in the original library. The call goes to ret_global from the preloaded library because the function is interposed as well.
EDIT: Question: I wonder where (if anywhere) it says that the reference to extern const w in ret_global() can be optimized to an intermediate while the call to ret_42() in ret_fn_result cannot.
TLDR; Logic behind this behavior (at least for GCC)
The compiler's constant-folding optimization is capable of inlining complex const variables and structures.
The compiler's default behavior for functions is to export them. Unless the -fvisibility=hidden flag is used, all functions are exported. Because an exported function can be interposed, GCC does not inline it here, so the call to ret_42 in ret_fn_result is not inlined. With -fvisibility=hidden turned on, the result is as shown below.
If a function could be both exported and inlined for optimization purposes at the same time, the linker would produce code that sometimes behaves one way (inlined) and sometimes another (overridden through interposition), depending on how the resulting executable happens to be loaded and run.
Other flags also affect this behavior. Most notably:
-Bsymbolic, -Bsymbolic-functions and --dynamic-list as per SO.
-fno-semantic-interposition
and, of course, the optimization flags
The function ret_fn_result when ret_42 is hidden: ret_42 is not exported and is therefore inlined.
0000000000001110 <ret_fn_result>:
1110: b8 2b 00 00 00 mov $0x2b,%eax
1115: c3 retq
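As a per-symbol alternative to passing -fvisibility=hidden for the whole file, the same effect can be sketched with the GCC/Clang visibility attribute. This is a minimal illustration reusing the names from lib.c; whether the call is actually folded still depends on the optimization level.

```c
/* Hidden visibility: the symbol is not exported from the shared object,
 * so it cannot be interposed and the compiler is free to inline it. */
__attribute__((visibility("hidden")))
int ret_42(void) { return 42; }

/* Still exported with default visibility. With optimization enabled,
 * the call to the hidden ret_42 may be folded into a constant return. */
int ret_fn_result(void) { return ret_42() + 1; }
```

With this variant, a preloaded library can still replace ret_fn_result itself, but it can no longer change what ret_fn_result computes by redefining ret_42.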
Technicals
STEP #1, subject is defined in lib.c:
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
When lib.c is compiled, w.ptr->x is optimized to a constant. So, with constant folding, it results in:
$ objdump -T lib.so
lib.so: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
0000000000001110 g DF .text 0000000000000006 Base ret_42
0000000000002000 g DO .rodata 0000000000000004 Base ptr
0000000000001120 g DF .text 0000000000000006 Base ret_global
0000000000001130 g DF .text 0000000000000011 Base ret_fn_result
0000000000003e18 g DO .data.rel.ro 0000000000000008 Base w
Here ptr and w are put in .rodata and .data.rel.ro (because w is a const pointer) respectively. Constant folding results in the following code:
0000000000001120 <ret_global>:
1120: b8 2a 00 00 00 mov $0x2a,%eax
1125: c3 retq
Another part is:
int ret_42(void) { return 42; }
int ret_fn_result(void) { return ret_42()+1; }
Here ret_42 is a function and, since it is not hidden, it is an exported function. So it is emitted as code and called as such. Both functions result in:
0000000000001110 <ret_42>:
1110: b8 2a 00 00 00 mov $0x2a,%eax
1115: c3 retq
0000000000001130 <ret_fn_result>:
1130: 48 83 ec 08 sub $0x8,%rsp
1134: e8 f7 fe ff ff callq 1030 <ret_42@plt>
1139: 48 83 c4 08 add $0x8,%rsp
113d: 83 c0 01 add $0x1,%eax
1140: c3 retq
Considering that the compiler only knows about lib.c, we are done. Put lib.so aside.
STEP #2, compile lib_override.c:
int ret_42(void) { return 50; }
#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
Which is simple:
$ objdump -T lib_override.so
lib_override.so: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
00000000000010f0 g DF .text 0000000000000006 Base ret_42
0000000000002000 g DO .rodata 0000000000000004 Base ptr
0000000000003e58 g DO .data.rel.ro 0000000000000008 Base w
The function ret_42 is exported, and ptr and w are put in .rodata and .data.rel.ro (because w is a const pointer) respectively. ret_42 compiles to the following code:
00000000000010f0 <ret_42>:
10f0: b8 32 00 00 00 mov $0x32,%eax
10f5: c3 retq
STEP #3, compile main.c. Let's look at the object file first:
$ objdump -t main.o
# SKIPPED
0000000000000000 *UND* 0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000000000 *UND* 0000000000000000 ret_42
0000000000000000 *UND* 0000000000000000 printf
0000000000000000 *UND* 0000000000000000 ret_fn_result
0000000000000000 *UND* 0000000000000000 ret_global
0000000000000000 *UND* 0000000000000000 w
All of these symbols are undefined, so they have to come from somewhere.
Then we link against lib.so by default, and the code is (printf and others are omitted):
0000000000001070 <main>:
1074: e8 c7 ff ff ff callq 1040 <ret_42@plt>
1089: e8 c2 ff ff ff callq 1050 <ret_fn_result@plt>
109e: e8 bd ff ff ff callq 1060 <ret_global@plt>
10b3: 48 8b 05 2e 2f 00 00 mov 0x2f2e(%rip),%rax # 3fe8 <w>
Now we have lib.so, lib_override.so and a.out in hands.
Let's simply call a.out:
main => ret_42 => lib.so => ret_42 => return 42
main => ret_fn_result => lib.so => ret_fn_result => return ( lib.so => ret_42 => return 42 ) + 1
main => ret_global => lib.so => ret_global => return rodata 42
main => lib.so => w.ptr->x = rodata 42
Now let's preload with lib_override.so:
main => ret_42 => lib_override.so => ret_42 => return 50
main => ret_fn_result => lib.so => ret_fn_result => return ( lib_override.so => ret_42 => return 50 ) + 1
main => ret_global => lib.so => ret_global => return rodata 42
main => lib_override.so => w.ptr->x = rodata 60
For 1: main calls ret_42 from lib_override.so; because that library is preloaded, ret_42 now resolves to the definition in lib_override.so.
For 2: main calls ret_fn_result from lib.so, which calls ret_42, but from lib_override.so, because the symbol now resolves to the definition there.
For 3: main calls ret_global from lib.so, which returns the folded constant 42.
For 4: main reads the extern pointer w, which resolves to the one in lib_override.so because that library is preloaded.
Finally, once lib.so has been generated with folded, inlined constants, one can't demand that they be "overridable". If the intention is to have an overridable data structure, it should be defined in some other way (provide functions to manipulate it, don't use constants, etc.). When something is defined as constant, the intention is clear, and the compiler acts on it. Even if that same symbol is then defined as non-constant in main.c or elsewhere, it cannot be unfolded back in lib.c.
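The suggestion to provide functions instead of exposing constants could look roughly like the sketch below. The accessor name get_wrap and the variable default_wrap are hypothetical; the idea is that the data is reached through an exported function call, which GCC keeps interposable when building with -fpic, as seen with ret_42 above. Other compilers, or GCC with -fno-semantic-interposition, may still inline the call.

```c
struct wrap_ { int x; };

/* Private default data; not part of the library's exported interface. */
static const struct wrap_ default_wrap = { 42 };

/* Exported accessor (hypothetical name). A preloaded library that defines
 * its own get_wrap() can substitute different data. */
const struct wrap_ *get_wrap(void) { return &default_wrap; }

/* Reads the value through the accessor instead of a global constant. */
int ret_global(void) { return get_wrap()->x; }
```

The trade-off is an extra indirect call on every access in exchange for data that an override library can actually replace.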

Advantage of different data section in c [duplicate]

When inspecting the object file with readelf, I see that the data and bss sections have the same offset.
The data section will contain the initialized global and static variables. BSS will contain un-initialized global and static variables.
#include<stdio.h>

static void display(int i, int* ptr);

int main(){
int x = 5;
int* xptr = &x;
printf("\n In main() program! \n");
printf("\n x address : 0x%x x value : %d \n",(unsigned int)&x,x);
printf("\n xptr points to : 0x%x xptr value : %d \n",(unsigned int)xptr,*xptr);
display(x,xptr);
return 0;
}

void display(int y,int* yptr){
char var[7] = "ABCDEF";
printf("\n In display() function \n");
printf("\n y value : %d y address : 0x%x \n",y,(unsigned int)&y);
printf("\n yptr points to : 0x%x yptr value : %d \n",(unsigned int)yptr,*yptr);
}
output:
SSS:~$ size a.out
text data bss dec hex filename
1311 260 8 1579 62b a.out
Here in the above program I don't have any un-initialized data but the BSS has occupied 8 bytes. Why does it occupy 8 bytes?
Also, when I dump the section headers of the object file,
EDITED :
[ 3] .data PROGBITS 00000000 000110 000000 00 WA 0 0 4
[ 4] .bss NOBITS 00000000 000110 000000 00 WA 0 0 4
[ 5] .rodata PROGBITS 00000000 000110 0000cf 00 A 0 0 4
data, rodata and bss have the same offset. Does that mean that rodata, data and bss refer to the same address?
Do the data, rodata and bss sections hold their values at the same address? If so, how are the data, bss and rodata sections distinguished?
The .bss section is guaranteed to be all zeros when the program is loaded into memory. So any global data that is uninitialized, or initialized to zero is placed in the .bss section. For example:
static int g_myGlobal = 0; // <--- in .bss section
The nice part about this is, the .bss section data doesn't have to be included in the ELF file on disk (ie. there isn't a whole region of zeros in the file for the .bss section). Instead, the loader knows from the section headers how much to allocate for the .bss section, and simply zero it out before handing control over to your program.
Notice the readelf output:
[ 3] .data PROGBITS 00000000 000110 000000 00 WA 0 0 4
[ 4] .bss NOBITS 00000000 000110 000000 00 WA 0 0 4
.data is marked as PROGBITS. That means there are "bits" of program data in the ELF file that the loader needs to read out into memory for you. .bss on the other hand is marked NOBITS, meaning there's nothing in the file that needs to be read into memory as part of the load.
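As a small illustration of where a typical GCC/ELF toolchain tends to place different kinds of globals, consider the sketch below. The variable names are made up for this example, and the section named in each comment is the usual outcome rather than a guarantee; checking with readelf -s on your own build is the reliable way to confirm.

```c
int in_data = 5;            /* nonzero initializer: typically .data (PROGBITS) */
int in_bss;                 /* no initializer: typically .bss (NOBITS) */
static int also_in_bss = 0; /* zero initializer: typically .bss as well */
const int in_rodata = 7;    /* read-only data: typically .rodata (PROGBITS) */

/* Uses every variable so that none of them is discarded as unused. */
int sum_all(void)
{
    return in_data + in_bss + also_in_bss + in_rodata;
}
```

Only in_data and in_rodata need bytes stored in the file; the two .bss variables are simply zeroed by the loader.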
Example:
// bss.c
static int g_myGlobal = 0;
int main(int argc, char** argv)
{
return 0;
}
Compile it with $ gcc -m32 -Xlinker -Map=bss.map -o bss bss.c
Look at the section headers with $ readelf -S bss
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
:
[13] .text PROGBITS 080482d0 0002d0 000174 00 AX 0 0 16
:
[24] .data PROGBITS 0804964c 00064c 000004 00 WA 0 0 4
[25] .bss NOBITS 08049650 000650 000008 00 WA 0 0 4
:
Now we look for our variable in the symbol table: $ readelf -s bss | grep g_myGlobal
37: 08049654 4 OBJECT LOCAL DEFAULT 25 g_myGlobal
Note that g_myGlobal is shown to be a part of section 25. If we look back in the section headers, we see that 25 is .bss.
To answer your real question:
Here in the above program I don't have any un-initialized data but the BSS has occupied 8 bytes. Why does it occupy 8 bytes?
Continuing with my example, we look for any symbol in section 25:
$ readelf -s bss | grep 25
9: 0804825c 0 SECTION LOCAL DEFAULT 9
25: 08049650 0 SECTION LOCAL DEFAULT 25
32: 08049650 1 OBJECT LOCAL DEFAULT 25 completed.5745
37: 08049654 4 OBJECT LOCAL DEFAULT 25 g_myGlobal
The third column is the size. We see our expected 4-byte g_myGlobal, and this 1-byte completed.5745. This is probably a function-static variable from somewhere in the C runtime initialization - remember, a lot of "stuff" happens before main() is ever called.
4+1=5 bytes of data. However, g_myGlobal is a 4-byte int and must itself be 4-byte aligned, which is why the symbol table places it at 08049654 rather than directly after completed.5745 at 08049650. The last column of the .bss section header, Al, is 4: that is the section alignment. One byte, three bytes of padding and four bytes add up to 8, and that's why the .bss section is 8 bytes.
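The arithmetic can be reproduced with a short sketch. The helper names align_up and bss_size_example are hypothetical; the layout mirrors the two symbols listed above, a 1-byte object at offset 0 of the section followed by a 4-byte int that has to start on a 4-byte boundary.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Size of a section holding a 1-byte object followed by a 4-byte int. */
size_t bss_size_example(void)
{
    size_t end = 0;
    end = align_up(end, 1) + 1;  /* completed.5745: 1 byte at offset 0 */
    end = align_up(end, 4) + 4;  /* g_myGlobal: 4 bytes at offset 4 */
    return end;                  /* 8 bytes in total */
}
```

The offsets 0 and 4 correspond to the addresses 08049650 and 08049654 in the symbol table, and the result matches the section size of 000008.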
Additionally, we can look at the map file generated by the linker to see which object files were placed where in the final output.
.bss 0x0000000008049650 0x8
*(.dynbss)
.dynbss 0x0000000000000000 0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crt1.o
*(.bss .bss.* .gnu.linkonce.b.*)
.bss 0x0000000008049650 0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crt1.o
.bss 0x0000000008049650 0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crti.o
.bss 0x0000000008049650 0x1 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/32/crtbegin.o
.bss 0x0000000008049654 0x4 /tmp/ccKF6q1g.o
.bss 0x0000000008049658 0x0 /usr/lib/libc_nonshared.a(elf-init.oS)
.bss 0x0000000008049658 0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/32/crtend.o
.bss 0x0000000008049658 0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crtn.o
Again, the third column is the size.
We see 4 bytes of .bss came from /tmp/ccKF6q1g.o. In this trivial example, we know that is the temporary object file from the compilation of our bss.c file. The other 1 byte came from crtbegin.o, which is part of the C runtime.
Finally, since we know that this 1-byte mystery bss variable is from crtbegin.o, and it's named completed.xxxx, its real name is completed and it's probably a static inside some function. Looking at crtstuff.c we find the culprit: a static _Bool completed inside of __do_global_dtors_aux().
By definition, the bss segment takes up space in memory (when the program starts) but doesn't need any disk space. You need to define some variables to get it filled, so try
int bigvar_in_bss[16300];
int var_in_data[5] = {1,2,3,4,5};
Your simple program might not have any data in .bss, and shared libraries (like libc.so) may have "their own .bss".
File offsets and memory addresses are not easily related.
Read more about the ELF specification, and also use /proc/ (e.g. cat /proc/self/maps displays the address space of the cat process running that command).
See also proc(5).

vdso gettimeofday with 64 bit kernel & application compiled for 32 bit

Is the vDSO supported for a 32-bit application running on a 64-bit kernel with glibc version 2.15? If yes, how do I make it work for a 32-bit application running on a 64-bit kernel? Even though dlopen on "linux-vdso.so.1" succeeds, dlsym on "__vdso_gettimeofday" fails.
On the same system I am able to do a dlopen on "linux-vdso.so.1" and a dlsym on "__vdso_gettimeofday" from an application compiled for 64 bit.
On my 64-bit Linux 4.4.15, the 32-bit vdso has these symbols:
readelf -Ws vdso32
Symbol table '.dynsym' contains 9 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000ce0 9 FUNC GLOBAL DEFAULT 12 __kernel_sigreturn@@LINUX_2.5
2: 00000d00 13 FUNC GLOBAL DEFAULT 12 __kernel_vsyscall@@LINUX_2.5
3: 00000ad0 438 FUNC GLOBAL DEFAULT 12 __vdso_gettimeofday@@LINUX_2.6
4: 00000c90 42 FUNC GLOBAL DEFAULT 12 __vdso_time@@LINUX_2.6
5: 00000770 853 FUNC GLOBAL DEFAULT 12 __vdso_clock_gettime@@LINUX_2.6
6: 00000cf0 8 FUNC GLOBAL DEFAULT 12 __kernel_rt_sigreturn@@LINUX_2.5
7: 00000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.5
8: 00000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.6
This suggests that the __vdso_gettimeofday you are looking for has been added in kernel 2.6, and that your kernel version is older.
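To check from inside a process whether the kernel mapped a vDSO at all, one option is to read the AT_SYSINFO_EHDR entry of the auxiliary vector. The sketch below assumes a glibc that provides getauxval() (added in glibc 2.16, so it is not available on the 2.15 system from the question), and the helper name vdso_looks_like_elf is hypothetical. It only confirms that an ELF image is present; it does not look up __vdso_gettimeofday.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Returns 1 if a vDSO is mapped and starts with the ELF magic,
 * -1 if the kernel did not advertise a vDSO,
 * 0 if something is mapped there but the magic does not match. */
int vdso_looks_like_elf(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return -1;
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0 ? 1 : 0;
}
```

Compiling this once with -m32 and once with -m64 on the same machine would show whether both kinds of process receive a vDSO; inspecting the exported symbols then requires parsing the image at that address.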

Hide function name in GCC compilation

I am compiling a C "hello world" program that just includes one simple function and a main function.
I am using GCC under Linux.
When I run the readelf command on the binary, I can see the symbol table, with the function names in clear text.
Is there a way to tell GCC (or the linker) to not generate this symbol table?
Is it possible to tell GCC to store only functions addresses, without storing function names in clear?
Use the -s option to strip the symbol table:
gcc -s -o hello hello.c
The utility strip discards symbols from object files.
Consider:
#include <stdio.h>
static void static_func(void)
{
puts(__FUNCTION__);
}
void func(void)
{
puts(__FUNCTION__);
}
int main(void)
{
static_func();
func();
return 0;
}
On a freshly compiled binary, readelf produces:
Symbol table '.symtab' contains 71 entries:
Num: Value Size Type Bind Vis Ndx Name
....
37: 0000000000000000 0 FILE LOCAL DEFAULT ABS hide.c
38: 0000000000400526 17 FUNC LOCAL DEFAULT 14 static_func
....
61: 0000000000400537 17 FUNC GLOBAL DEFAULT 14 func
....
66: 0000000000400548 21 FUNC GLOBAL DEFAULT 14 main
....
And after stripping the binary, the whole output is:
Symbol table '.dynsym' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__

glibc function strtoull() failure

I am facing an issue with the C library function strtoull, which is returning wrong output.
int main(int argc, char *argv[])
{
unsigned long long int intValue;
if(atoi(argv[2]) == 1)
{
intValue = strtoull((const char *)argv[1], 0, 10 );
}
else
{
// ...
}
printf("intValue of %s is %llu \n", argv[1], intValue);
return 0;
}
I built it and generated 32-bit and 64-bit executables, str32_new and str64_new.
But the output from the 32-bit exe is erroneous: a wrong number is returned.
strtoull should have returned 5368709120 for the string "5368709120", but it returned 1073741824.
# ./str32_new "5368709120" 1
intValue of 5368709120 is 1073741824
I note that when I remove one character from the string, it shows the proper output.
# ./str32_new "536870912" 1
intValue of 536870912 is 536870912
glibc attached to 32 bit exe is
# readelf -Wa /home/str32_new | grep strt
[39] .shstrtab STRTAB 00000000 002545 000190 00 0 0 1
[41] .strtab STRTAB 00000000 0032f8 0002a4 00 0 0 1
0804a014 00000607 R_386_JUMP_SLOT 00000000 strtoull
6: 00000000 0 FUNC GLOBAL DEFAULT UND strtoull@GLIBC_2.0 (2)
55: 00000000 0 FILE LOCAL DEFAULT ABS strtoull.c
75: 00000000 0 FUNC GLOBAL DEFAULT UND strtoull@@GLIBC_2.0
77: 08048534 915 FUNC GLOBAL DEFAULT 15 my_strtoull
glibc attached to 64 bit exe is
# readelf -Wa /home/str64_new | grep strt
[39] .shstrtab STRTAB 0000000000000000 001893 000192 00 0 0 1
[41] .strtab STRTAB 0000000000000000 002cd0 00029b 00 0 0 1
0000000000601028 0000000700000007 R_X86_64_JUMP_SLOT 0000000000000000 strtoull + 0
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtoull@GLIBC_2.2.5 (2)
57: 0000000000000000 0 FILE LOCAL DEFAULT ABS strtoull.c
73: 00000000004006cc 804 FUNC GLOBAL DEFAULT 15 my_strtoull
82: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtoull@@GLIBC_2.2.5
The 64-bit exe shows the proper output, but on some systems it too behaves abnormally.
Why is strtoull behaving this way in the 32-bit exe, and how can I resolve this issue?
OK, so we've established that this is quite obviously due to an overflow, as the value matches what you would get if it were cast to a 32-bit int.
This, however, does not explain everything: you did use strtoull, not the shorter strtoul, and it does work in the 64-bit binary. If anything, I was surprised to see that you were even able to call the longer version in your 32-bit build (how did you build it, by the way: with -m32, or on a special machine?).
This link raises the possibility that some linkage phenomenon makes strtoull get declared as int strtoll() (presumably the system can't support the original lib version), so the value is implicitly cast through int before being copied back to your unsigned long long.
Either way, the compiler should have warned about this. Try setting it to C99 and raising the warning levels; maybe that will make it complain.
I think that is due to overflow. A 32-bit int cannot hold a number that large (even an unsigned 32-bit value tops out at 4294967295). As Leeor said, 5368709120 & (0xffffffff) = 1073741824.
The C standard only requires int to be at least 16 bits wide, and it is 32 bits wide on most mainstream systems, including 64-bit ones.
You most likely forgot to #include <stdlib.h>, and you probably did not enable any compiler warnings (such as the one for using undeclared functions).
When a C compiler that still accepts implicit declarations sees a call to an undeclared function, it assumes that the function returns int. In your case the return value of strtoull() is therefore treated as an int, so the value is truncated to 32 bits.
(It is indeed quite strange that you get the correct result on a 64-bit system, where int is usually also just 32-bit.)
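A minimal sketch of the fix, assuming the missing declaration is indeed the cause: include <stdlib.h> so that strtoull is declared with its real return type, and check for conversion errors. The helper names parse_ull and truncate_to_32_bits are hypothetical; the second one only reproduces the wrong value reported in the question.

```c
#include <errno.h>
#include <stdlib.h>   /* declares strtoull with its unsigned long long return type */

/* Parse a base-10 unsigned value; returns 0 on success, -1 on error. */
int parse_ull(const char *text, unsigned long long *out)
{
    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;   /* out of range, no digits, or trailing characters */
    *out = value;
    return 0;
}

/* What is left of a value after it has passed through a 32-bit type. */
unsigned long long truncate_to_32_bits(unsigned long long value)
{
    return value & 0xffffffffULL;
}
```

For the input "5368709120", parse_ull stores 5368709120, and truncate_to_32_bits of that value gives 1073741824, the number printed by the 32-bit executable. Building with -Wall should also make the compiler report the undeclared function in the original program.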
