Replacing symbols without LD_PRELOAD


Is it possible to set hooks on system calls at runtime? In a portable way, without asm, maybe using some dynamic linker functions?
I want to intercept system calls made by 3rd-party libraries. I don't want to use LD_PRELOAD, because it needs an external wrapper/launcher script that sets the environment variable.

You can override a library call by redefining the function:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
void abort(void)
{
// If necessary, get a pointer to the "real" function:
void (*real_abort)(void) = dlsym(RTLD_NEXT, "abort");
if (!real_abort) {
fprintf(stderr, "Could not find real abort\n");
exit(1);
}
fprintf(stderr, "Calling abort\n");
real_abort();
}
with main
#include <stdlib.h>
int main(int argc, char** argv) {
abort();
}
Resulting in:
$ ./a.out
Calling abort
Aborted
If you want to do this at runtime for an arbitrary function (without compiling your own version of the function), you might try to use the relocation information of your ELF objects (executable and shared objects) and update the relocations at runtime.
Let's compile a simple hello world and look at its relocations:
$ LANG=C readelf -r ./a.out
Relocation section '.rela.dyn' at offset 0x348 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006008d8 000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
Relocation section '.rela.plt' at offset 0x360 contains 3 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006008f8 000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0
000000600900 000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
000000600908 000300000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
Those are the relocations done by the dynamic linker: the first line of .rela.plt tells the dynamic linker it needs to fill in the jump slot (the GOT entry used by the PLT stub) at 0x0000006008f8 for the puts symbol. In order to override the puts function, we might find all occurrences of the puts symbol in all shared objects and relocate them to the suitable function.
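As a rough illustration of that idea, the sketch below walks the loaded objects with dl_iterate_phdr and returns the address of the first jump slot whose symbol name matches. It assumes a glibc-style Linux system with 64-bit RELA relocations (as on x86-64). The names find_jump_slot, slot_query and dyn_ptr are made up for this example. It only locates the slot and does not overwrite it.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper types and names, for illustration only. */
struct slot_query {
    const char *symbol;  /* symbol name to look for, e.g. "puts" */
    void **slot;         /* result: address of the jump slot, or NULL */
};

/* Assumption: some loaders store absolute addresses in the dynamic
 * section, others store offsets from the load base. Values below the
 * load base are treated as offsets. */
static const void *dyn_ptr(const struct dl_phdr_info *info, ElfW(Addr) value)
{
    if (value < info->dlpi_addr)
        value += info->dlpi_addr;
    return (const void *)(uintptr_t)value;
}

static int find_slot_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct slot_query *q = data;
    const ElfW(Dyn) *dyn = NULL;
    const ElfW(Sym) *symtab = NULL;
    const char *strtab = NULL;
    const ElfW(Rela) *jmprel = NULL;
    size_t jmprel_size = 0;
    (void)size;

    /* Locate the dynamic section of this object. */
    for (int i = 0; i < info->dlpi_phnum; i++)
        if (info->dlpi_phdr[i].p_type == PT_DYNAMIC)
            dyn = (const ElfW(Dyn) *)(uintptr_t)
                  (info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
    if (dyn == NULL)
        return 0;

    /* Collect the tables needed to decode the PLT relocations. */
    for (; dyn->d_tag != DT_NULL; dyn++) {
        switch (dyn->d_tag) {
        case DT_SYMTAB:   symtab = dyn_ptr(info, dyn->d_un.d_ptr); break;
        case DT_STRTAB:   strtab = dyn_ptr(info, dyn->d_un.d_ptr); break;
        case DT_JMPREL:   jmprel = dyn_ptr(info, dyn->d_un.d_ptr); break;
        case DT_PLTRELSZ: jmprel_size = dyn->d_un.d_val; break;
        default: break;
        }
    }
    if (symtab == NULL || strtab == NULL || jmprel == NULL)
        return 0;

    /* Each entry corresponds to one line of `readelf -r` under .rela.plt. */
    for (size_t i = 0; i < jmprel_size / sizeof(ElfW(Rela)); i++) {
        const ElfW(Sym) *sym = &symtab[ELF64_R_SYM(jmprel[i].r_info)];
        if (strcmp(strtab + sym->st_name, q->symbol) == 0) {
            q->slot = (void **)(uintptr_t)
                      (info->dlpi_addr + jmprel[i].r_offset);
            return 1;  /* stop iterating: first match wins */
        }
    }
    return 0;
}

/* Returns the first matching jump slot among the loaded objects, or NULL. */
void **find_jump_slot(const char *symbol)
{
    struct slot_query q = { symbol, NULL };
    dl_iterate_phdr(find_slot_cb, &q);
    return q.slot;
}
```

A call such as find_jump_slot("puts") returns NULL when no loaded object calls puts through its PLT. Overwriting the returned slot may additionally require mprotect if the GOT has been made read-only (for example with full RELRO), and covering every object would mean collecting all matches rather than stopping at the first.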

Related

Is it possible to make a hardcoding with the help of the command objcopy

I'm working on Linux and I've just heard that there was a command objcopy, I've found the relative command on my x86_64 PC: x86_64-linux-gnu-objcopy.
With its help, I can convert a file into an obj file: x86_64-linux-gnu-objcopy -I binary -O elf64-x86-64 custom.config custom.config.o
The file custom.config is a human-readable file. It contains two lines:
name titi
password 123
Now I can execute objdump -x -s custom.config.o to check its information.
custom.config.o: file format elf64-little
custom.config.o
architecture: UNKNOWN!, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 00000017 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g .data 0000000000000000 _binary_custom_config_start
0000000000000017 g .data 0000000000000000 _binary_custom_config_end
0000000000000017 g *ABS* 0000000000000000 _binary_custom_config_size
Contents of section .data:
0000 6e616d65 20746974 690a7061 7373776f name titi.passwo
0010 72642031 32330a rd 123.
As we all know, we can open, read or write a file such as custom.config in any C/C++ project. Now I'm wondering whether it's possible to use the object file custom.config.o directly in a C/C++ project. For example, is it possible to read the content of custom.config.o directly, without calling I/O functions such as open, read or write? If so, I think this might become a kind of hardcoding and avoid calling the I/O functions.

Although I tried this on Win10 with MinGW (MinGW-W64 project, GCC 8.1.0), it should work for you with only minor adaptations.
As you can see from the info objdump gave you, the file's contents are placed in the .data section, which is the common section for non-constant variables.
Some symbols were also defined for it. You can declare these symbols in your C source.
The absolute value _binary_custom_config_size is special, because it is marked *ABS*. Currently I know of no other way to obtain its value than to declare a variable of any type and take its address.
This is my show_config.c:
#include <stdio.h>
#include <string.h>
extern const char _binary_custom_config_start[];
extern const char _binary_custom_config_size;
int main(void) {
size_t size = (size_t)&_binary_custom_config_size;
char config[size + 1];
strncpy(config, _binary_custom_config_start, size);
config[size] = '\0';
printf("config = \"%s\"\n", config);
return 0;
}
Because the "binary" file (actually a text) has no final '\0' character, you need to append one to get a correctly terminated C string.
You could as well declare _binary_custom_config_end and use it to calculate the size, or as a limit.
Building everything goes like this (I used the -g option to debug):
$ objcopy -I binary -O elf64-x86-64 -B i386 custom.config custom.config.o
$ gcc -Wall -Wextra -pedantic -g show_config.c custom.config.o -o show_config
And the output shows the success:
$ show_config.exe
config = "name titi
password 123"
If you need the file's contents in another section, you will add the option to rename the section to objcopy's call. Add any flag you need, the example shows .rodata that is used for read-only data:
--rename-section .data=.rodata,alloc,load,readonly,data,contents
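The alternative mentioned above, computing the size from the start and end symbols instead of taking the address of the *ABS* size symbol, could look roughly like the sketch below. blob_to_cstring is a hypothetical helper name. It takes plain pointers so it can be tried without the object file, and in the real program the arguments would be _binary_custom_config_start and _binary_custom_config_end.

```c
#include <stdlib.h>
#include <string.h>

/* In the real program these would be declared and passed in:
 *   extern const char _binary_custom_config_start[];
 *   extern const char _binary_custom_config_end[];
 */

/* Hypothetical helper: copies the bytes in [start, end) into a freshly
 * allocated, NUL-terminated string. The caller frees the result. */
char *blob_to_cstring(const char *start, const char *end)
{
    size_t size = (size_t)(end - start);  /* size derived from the two symbols */
    char *copy = malloc(size + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, size);
    copy[size] = '\0';                    /* the embedded data has no terminator */
    return copy;
}
```

Usage would be along the lines of char *config = blob_to_cstring(_binary_custom_config_start, _binary_custom_config_end); followed by free(config) when done.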

How does dynamic linker know which library to search for a symbol?

I'm experimenting with LD_PRELOAD/dlopen and faced a confusion regarding symbol lookup. Consider the following 2 libraries:
libshar
shared.h
int sum(int a, int b);
shared.c
int sum(int a, int b){
return a + b;
}
libshar2
shared.h
int sum(int a, int b);
shared.c
int sum(int a, int b){
return a + b + 10000;
}
and executable bin_shared:
#include <dlfcn.h>
#include <stdio.h>
#include "shared.h"
int main(void){
void *handle = dlopen("/home/me/c/build/libshar2.so", RTLD_NOW | RTLD_GLOBAL);
int s = sum(2, 3);
printf("s = %d", s);
}
Linking the binary with libshar and libdl, I considered the following 2 cases:
LD_PRELOAD is empty
The program prints 5.
Why does the dynamic linker decide to lookup the sum function in the libshar, not libshar2? Both of them are loaded and contain the needed symbol:
0x7ffff73dc000 0x7ffff73dd000 0x1000 0x0 /home/me/c/build/libshar2.so
0x7ffff73dd000 0x7ffff75dc000 0x1ff000 0x1000 /home/me/c/build/libshar2.so
0x7ffff75dc000 0x7ffff75dd000 0x1000 0x0 /home/me/c/build/libshar2.so
0x7ffff75dd000 0x7ffff75de000 0x1000 0x1000 /home/me/c/build/libshar2.so
#...
0x7ffff7bd3000 0x7ffff7bd4000 0x1000 0x0 /home/me/c/build/libshar.so
0x7ffff7bd4000 0x7ffff7dd3000 0x1ff000 0x1000 /home/me/c/build/libshar.so
0x7ffff7dd3000 0x7ffff7dd4000 0x1000 0x0 /home/me/c/build/libshar.so
0x7ffff7dd4000 0x7ffff7dd5000 0x1000 0x1000 /home/me/c/build/libshar.so
LD_PRELOAD = /path/to/libshar2.so
The program prints 10005. This is expected, but again I noticed that both libshar.so and libshar2.so are loaded:
0x7ffff79d1000 0x7ffff79d2000 0x1000 0x0 /home/me/c/build/libshar.so
0x7ffff79d2000 0x7ffff7bd1000 0x1ff000 0x1000 /home/me/c/build/libshar.so
0x7ffff7bd1000 0x7ffff7bd2000 0x1000 0x0 /home/me/c/build/libshar.so
0x7ffff7bd2000 0x7ffff7bd3000 0x1000 0x1000 /home/me/c/build/libshar.so
0x7ffff7bd3000 0x7ffff7bd4000 0x1000 0x0 /home/me/c/build/libshar2.so
0x7ffff7bd4000 0x7ffff7dd3000 0x1ff000 0x1000 /home/me/c/build/libshar2.so
0x7ffff7dd3000 0x7ffff7dd4000 0x1000 0x0 /home/me/c/build/libshar2.so
0x7ffff7dd4000 0x7ffff7dd5000 0x1000 0x1000 /home/me/c/build/libshar2.so
The LD_PRELOAD case seems to be explained in ld.so(8):
LD_PRELOAD
A list of additional, user-specified, ELF shared objects to be loaded
before all others. The items of the list can be separated by spaces
or colons. This can be used to selectively override functions in
other shared objects. The objects are searched for using the rules
given under DESCRIPTION.

Why does the dynamic linker decide to lookup the sum function in the libshar, not libshar2?
Dynamic linkers on UNIX attempt to emulate what would have happened if you linked with archive libraries.
In the case of empty LD_PRELOAD, the symbol search order is (when the symbol is referenced by the main binary; rules get more complicated when the symbol is referenced by the DSO): the main binary, directly linked DSOs in the order they are listed on the link line, dlopened DSOs in the order they were dlopened.
LD_PRELOAD = /path/to/libshar2.so
The program prints 10005. This is expected,
A non-empty LD_PRELOAD modifies the search order by inserting the libraries it lists after the main executable and before any directly linked DSOs.
but again I noticed that both libshar.so and libshar2.so are loaded:
Why is that a surprise? The dynamic linker loads all libraries listed in LD_PRELOAD, and then all libraries that you directly linked against (as explained before).

dlopen can't (nor can anything else) change the definition of (global) symbols already present at the time of the call. It can only make available new ones that did not exist before.
The (sloppy) formalization of this is in the specification for dlopen:
Symbols introduced into the process image through calls to dlopen() may be used in relocation activities. Symbols so introduced may duplicate symbols already defined by the program or previous dlopen() operations. To resolve the ambiguities such a situation might present, the resolution of a symbol reference to symbol definition is based on a symbol resolution order. Two such resolution orders are defined: load order and dependency order. Load order establishes an ordering among symbol definitions, such that the first definition loaded (including definitions from the process image file and any dependent executable object files loaded with it) has priority over executable object files added later (by dlopen()). Load ordering is used in relocation processing. Dependency ordering uses a breadth-first order starting with a given executable object file, then all of its dependencies, then any dependents of those, iterating until all dependencies are satisfied. With the exception of the global symbol table handle obtained via a dlopen() operation with a null pointer as the file argument, dependency ordering is used by the dlsym() function. Load ordering is used in dlsym() operations upon the global symbol table handle.
Note that LD_PRELOAD is nonstandard functionality and thus not described here, but on implementations that offer it, LD_PRELOAD acts with load order after the main program but before any shared libraries loaded as dependencies.
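If the goal is to reach the definition inside the dlopened library despite the load-order rule, one option is to look the symbol up through the handle, since dlsym on a specific handle uses dependency order starting at that object. A minimal sketch, assuming a library that exports int sum(int, int) as in the question; call_sum_from is a hypothetical helper name:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads the library at `path`, looks up "sum"
 * through the returned handle and calls it.
 * Returns 0 on success and -1 on failure. */
int call_sum_from(const char *path, int a, int b, int *result)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* Lookup via the handle starts at this library, not at the
     * global load-order list. */
    int (*sum_fn)(int, int);
    *(void **)(&sum_fn) = dlsym(handle, "sum");
    if (sum_fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *result = sum_fn(a, b);
    dlclose(handle);
    return 0;
}
```

With the libraries from the question, call_sum_from("/home/me/c/build/libshar2.so", 2, 3, &s) would be expected to use the libshar2 definition even though a direct call to sum() binds to libshar. On older glibc versions the program needs to be linked with -ldl.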

How to determine the address range of global variables in a shared library at runtime?

At runtime, are global variables in a loaded shared library guaranteed to occupy a contiguous memory region? If so, is it possible to find out that address range?
Context: we want to have multiple "instances" of a shared library (e.g. a protocol stack implementation) in memory for simulation purposes (e.g. to simulate a network with multiple hosts/routers). One of the approaches we are trying is to load the library only once, but emulate additional instances by creating and maintaining "shadow" sets of global variables, and switch between instances by memcpy()'ing the appropriate shadow set in/out of the memory area occupied by the global variables of the library. (Alternative approaches like using dlmopen() to load the library multiple times, or introducing indirection inside the shared lib to access global vars have their limitations and difficulties too.)
Things we tried:
Using dl_iterate_phdr() to find the data segment of the shared lib. The resulting address range was not too useful, because (1) it did not point to an area containing the actual global variables but to the segment as loaded from the ELF file (in readonly memory), and (2) it contained not only the global vars but also additional internal data structures.
Added start/end guard variables in C to the library code, and ensured (via linker script) that they are placed at the start and end of the .data section in the shared object. (We verified that with objdump -t.) The idea was that at runtime, all global variables would be located in the address range between the two guard variables. However, our observation was that the relative order of the actual variables in memory was quite different than what would follow from the addresses in the shared object. A typical output was:
$ objdump -t libx.so | grep '\.data'
0000000000601020 l d .data 0000000000000000 .data
0000000000601020 l O .data 0000000000000000 __dso_handle
0000000000601038 l O .data 0000000000000000 __TMC_END__
0000000000601030 g O .data 0000000000000004 custom_data_end_marker
0000000000601028 g O .data 0000000000000004 custom_data_begin_marker
0000000000601034 g .data 0000000000000000 _edata
000000000060102c g O .data 0000000000000004 global_var
$ ./prog
# output from dl_iterate_phdr()
name=./libx.so (7 segments)
header 0: type=1 flags=5 start=0x7fab69fb0000 end=0x7fab69fb07ac size=1964
header 1: type=1 flags=6 start=0x7fab6a1b0e08 end=0x7fab6a1b1038 size=560 <--- data segment
header 2: type=2 flags=6 start=0x7fab6a1b0e18 end=0x7fab6a1b0fd8 size=448
header 3: type=4 flags=4 start=0x7fab69fb01c8 end=0x7fab69fb01ec size=36
header 4: type=1685382480 flags=4 start=0x7fab69fb0708 end=0x7fab69fb072c size=36
header 5: type=1685382481 flags=6 start=0x7fab69bb0000 end=0x7fab69bb0000 size=0
header 6: type=1685382482 flags=4 start=0x7fab6a1b0e08 end=0x7fab6a1b1000 size=504
# addresses obtained via dlsym() are consistent with the objdump output:
dlsym('custom_data_begin_marker') = 0x7fab6a1b1028
dlsym('custom_data_end_marker') = 0x7fab6a1b1030 <-- between the begin and end markers
# actual addresses: at completely different address range, AND in completely different order!
&custom_data_begin_marker = 0x55d613f8e018
&custom_data_end_marker = 0x55d613f8e010 <-- end marker precedes begin marker!
&global_var = 0x55d613f8e01c <-- after both markers!
Which means the "guard variables" approach does not work.
Maybe we should iterate over the Global Offset Table (GOT) and collect the addresses of global variables from there? However, there doesn't seem to be an official way for doing that, if it's possible at all.
Is there something we overlooked? I'll be happy to clarify or post our test code if needed.
EDIT: To clarify, the shared library in question is a 3rd party library whose source code we prefer not to modify, hence the quest for the above general solution.
EDIT2: As further clarification, the following code outlines what I would like to be able to do:
// x.c -- source for the shared library
#include <stdio.h>
int global_var = 10;
void bar() {
global_var++;
printf("global_var=%d\n", global_var);
}
// a.c -- main program
#include <stdlib.h>
#include <dlfcn.h>
#include <memory.h>
struct memrange {
void *ptr;
size_t size;
};
extern int global_var;
void bar();
struct memrange query_globals_address_range(const char *so_file)
{
struct memrange result;
// TODO what generic solution can we use here instead of the next two specific lines?
result.ptr = &global_var;
result.size = sizeof(int);
return result;
}
struct memrange g_range;
void *allocGlobals()
{
// allocate shadow set and initialize it with actual global vars
void *globals = malloc(g_range.size);
memcpy(globals, g_range.ptr, g_range.size);
return globals;
}
void callBar(void *globals) {
memcpy(g_range.ptr, globals, g_range.size); // overwrite globals from shadow set
bar();
memcpy(globals, g_range.ptr, g_range.size); // save changes into shadow set
}
int main(int argc, char *argv[])
{
g_range = query_globals_address_range("./libx.so");
// allocate two shadow sets of global vars
void *globals1 = allocGlobals();
void *globals2 = allocGlobals();
// call bar() in the library a few times with each
callBar(globals1);
callBar(globals2);
callBar(globals2);
callBar(globals1);
callBar(globals1);
return 0;
}
Build+run script:
#! /bin/sh
gcc -c -g -fPIC x.c -shared -o libx.so
gcc a.c -g -L. -lx -ldl -o prog
LD_LIBRARY_PATH=. ./prog
EDIT3: Added dl_iterate_phdr() output

Shared libraries are compiled as Position-Independent Code. That means that unlike executables, addresses are not fixed, but are rather decided during dynamic linkage.
From a software engineering standpoint, the best approach is to use objects (structs) to represent all your data and avoid global variables (such data structures are typically called "contexts"). All API functions then take a context argument, which allows you to have multiple contexts in the same process.
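For a library whose source can be changed, the context approach might look like the following sketch. It reworks the bar() and global_var example from the question; lib_ctx, lib_ctx_init and lib_bar are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical context: holds what used to be the library's global state. */
struct lib_ctx {
    int global_var;
};

/* Gives a context the same initial state the global variable had. */
void lib_ctx_init(struct lib_ctx *ctx)
{
    ctx->global_var = 10;
}

/* Same logic as bar(), but operating on the caller's context.
 * Returns the new value so callers can inspect it. */
int lib_bar(struct lib_ctx *ctx)
{
    ctx->global_var++;
    printf("global_var=%d\n", ctx->global_var);
    return ctx->global_var;
}
```

Each simulated host then owns one struct lib_ctx, and no memcpy of shadow copies is needed when switching between instances.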

At runtime, are global variables in a loaded shared library guaranteed to occupy a contiguous memory region?
Yes: on any ELF platform (such as Linux) all writable globals are typically grouped into a single writable PT_LOAD segment, and that segment is located at a fixed address (determined at the library load time).
If so, is it possible to find out that address range?
Certainly. You can find the library load address using dl_iterate_phdr, and iterate over the program segments that it gives you. One of the program headers will have .p_type == PT_LOAD, .p_flags == PF_R|PF_W. The address range you want is [dlpi_addr + phdr->p_vaddr, dlpi_addr + phdr->p_vaddr + phdr->p_memsz).
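A minimal sketch of that lookup, assuming glibc's dl_iterate_phdr and reusing the struct memrange from the question. query_rw_range and rw_query are hypothetical names. Because some linkers emit more than one read-write PT_LOAD segment, the sketch reports the span from the lowest start to the highest end of all of them, which may include gaps.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct memrange {
    void *ptr;
    size_t size;
};

/* Hypothetical query state passed to the callback. */
struct rw_query {
    const char *needle;  /* substring of the library path, or NULL */
    uintptr_t lo, hi;    /* span covered by the read-write segments */
    int found;
};

static int rw_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct rw_query *q = data;
    (void)size;

    /* NULL selects the first object reported (the main program). */
    if (q->needle != NULL &&
        (info->dlpi_name == NULL || strstr(info->dlpi_name, q->needle) == NULL))
        return 0;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD || ph->p_flags != (PF_R | PF_W))
            continue;
        uintptr_t start = info->dlpi_addr + ph->p_vaddr;
        uintptr_t end = start + ph->p_memsz;
        if (!q->found || start < q->lo)
            q->lo = start;
        if (!q->found || end > q->hi)
            q->hi = end;
        q->found = 1;
    }
    return 1;  /* stop after the first matching object */
}

/* Returns 1 and fills *out if a read-write segment was found, else 0. */
int query_rw_range(const char *needle, struct memrange *out)
{
    struct rw_query q = { needle, 0, 0, 0 };
    dl_iterate_phdr(rw_cb, &q);
    if (!q.found)
        return 0;
    out->ptr = (void *)q.lo;
    out->size = (size_t)(q.hi - q.lo);
    return 1;
}
```

For the library in the question the call would be query_rw_range("libx.so", &range). The range also contains loader-internal data such as the GOT, so it is wider than the set of global variables.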
Here:
# actual addresses: completely different order:
you are actually looking at the address of the GOT entries in the main executable, and not the addresses of the variables themselves.

How do I add contents of text file as a section in an ELF file?

I have a NASM assembly file that I am assembling and linking (on Intel-64 Linux).
There is a text file, and I want the contents of the text file to appear in the resulting binary (as a string, basically). The binary is an ELF executable.
My plan is to create a new readonly data section in the ELF file (equivalent to the conventional .rodata section).
Ideally, there would be a tool to add a file verbatim as a new section in an elf file, or a linker option to include a file verbatim.
Is this possible?

This is possible and most easily done using OBJCOPY found in BINUTILS. You effectively take the data file as binary input and then output it to an object file format that can be linked to your program.
OBJCOPY will even produce a start and end symbol as well as the size of the data area so that you can reference them in your code. The basic idea is that you will want to tell it your input file is binary (even if it is text); that you will be targeting an x86-64 object file; specify the input file name and the output file name.
Assume we have an input file called myfile.txt with the contents:
the
quick
brown
fox
jumps
over
the
lazy
dog
Something like this would be a starting point:
objcopy --input binary \
--output elf64-x86-64 \
--binary-architecture i386:x86-64 \
myfile.txt myfile.o
If you wanted to generate 32-bit objects you could use:
objcopy --input binary \
--output elf32-i386 \
--binary-architecture i386 \
myfile.txt myfile.o
The output would be an object file called myfile.o . If we were to review the headers of the object file using OBJDUMP and a command like objdump -x myfile.o we would see something like this:
myfile.o: file format elf64-x86-64
myfile.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 0000002c 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g .data 0000000000000000 _binary_myfile_txt_start
000000000000002c g .data 0000000000000000 _binary_myfile_txt_end
000000000000002c g *ABS* 0000000000000000 _binary_myfile_txt_size
By default it creates a .data section with contents of the file and it creates a number of symbols that can be used to reference the data.
_binary_myfile_txt_start
_binary_myfile_txt_end
_binary_myfile_txt_size
These are effectively the address of the start byte, the address just past the last byte, and the size of the data that was placed into the object from the file myfile.txt. OBJCOPY will base the symbols on the input file name. myfile.txt is mangled into myfile_txt and used to create the symbols.
One problem is that a .data section is created which is read/write/data as seen here:
Idx Name Size VMA LMA File off Algn
0 .data 0000002c 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
You specifically are requesting a .rodata section that would also have the READONLY flag specified. You can use the --rename-section option to change .data to .rodata and specify the needed flags. You could add this to the command line:
--rename-section .data=.rodata,CONTENTS,ALLOC,LOAD,READONLY,DATA
Of course if you want to call the section something other than .rodata with the same flags as a read only section you can change .rodata in the line above to the name you want to use for the section.
The final version of the command that should generate the type of object you want is:
objcopy --input binary \
--output elf64-x86-64 \
--binary-architecture i386:x86-64 \
--rename-section .data=.rodata,CONTENTS,ALLOC,LOAD,READONLY,DATA \
myfile.txt myfile.o
Now that you have an object file, how can you use it in C code (as an example)? The symbols generated are a bit unusual and there is a reasonable explanation on the OS Dev Wiki:
A common problem is getting garbage data when trying to use a value defined in a linker script. This is usually because they're dereferencing the symbol. A symbol defined in a linker script (e.g. _ebss = .;) is only a symbol, not a variable. If you access the symbol using extern uint32_t _ebss; and then try to use _ebss the code will try to read a 32-bit integer from the address indicated by _ebss.
The solution to this is to take the address of _ebss either by using it as &_ebss or by defining it as an unsized array (extern char _ebss[];) and casting to an integer. (The array notation prevents accidental reads from _ebss as arrays must be explicitly dereferenced)
Keeping this in mind we could create this C file called main.c:
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
/* These are external references to the symbols created by OBJCOPY */
extern char _binary_myfile_txt_start[];
extern char _binary_myfile_txt_end[];
extern char _binary_myfile_txt_size[];
int main()
{
char *data_start = _binary_myfile_txt_start;
char *data_end = _binary_myfile_txt_end;
size_t data_size = (size_t)_binary_myfile_txt_size;
/* Print out the pointers and size */
printf ("data_start %p\n", data_start);
printf ("data_end %p\n", data_end);
printf ("data_size %zu\n", data_size);
/* Print out each byte until we reach the end */
while (data_start < data_end)
printf ("%c", *data_start++);
return 0;
}
You can compile and link with:
gcc -O3 main.c myfile.o
The output should look something like:
data_start 0x4006a2
data_end 0x4006ce
data_size 44
the
quick
brown
fox
jumps
over
the
lazy
dog
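Since the embedded data has no terminating NUL, any further processing has to stay within the start and end pointers. As a small sketch of that pattern, the hypothetical helper count_lines below counts the lines in such a range. In main.c it would be called with _binary_myfile_txt_start and _binary_myfile_txt_end.

```c
#include <stddef.h>

/* Hypothetical helper: counts the lines in the byte range [start, end).
 * A final line without a trailing newline is counted as well. */
size_t count_lines(const char *start, const char *end)
{
    size_t lines = 0;

    for (const char *p = start; p < end; p++)
        if (*p == '\n')
            lines++;

    /* Account for a last line that is not newline-terminated. */
    if (end > start && end[-1] != '\n')
        lines++;

    return lines;
}
```

For the 44-byte myfile.txt shown above, which holds nine newline-terminated words, the function returns 9.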
A NASM example of usage is similar in nature to the C code. The following assembly program called nmain.asm writes the same string to standard output using Linux x86-64 System Calls:
bits 64
global _start
extern _binary_myfile_txt_start
extern _binary_myfile_txt_end
extern _binary_myfile_txt_size
section .text
_start:
mov eax, 1 ; SYS_Write system call
mov edi, eax ; Standard output FD = 1
mov rsi, _binary_myfile_txt_start ; Address to start of string
mov rdx, _binary_myfile_txt_size ; Length of string
syscall
xor edi, edi ; Return value = 0
mov eax, 60 ; SYS_Exit system call
syscall
This can be assembled and linked with:
nasm -f elf64 -o nmain.o nmain.asm
gcc -m64 -nostdlib nmain.o myfile.o
The output should appear as:
the
quick
brown
fox
jumps
over
the
lazy
dog

Location of global variables with DWARF (and relocation)

When dynamically linking a binary with libraries, relocation information is used to bind the variables/functions of the different ELF objects. However DWARF is not affected by relocation: how is a debugger supposed to resolve global variables?
Let's say I have liba.so (a.c) defining a global variable (using GNU/Linux with GCC or Clang):
#include <stdio.h>
int foo = 10;
int test(void) {
printf("&foo=%p\n", (void *)&foo);
return 0;
}
and a program b linked against liba.so (b.c):
#include <stdio.h>
extern int foo;
int test(void); /* defined in liba.so */
int main(int argc, char** argv) {
test();
printf("&foo=%p\n", (void *)&foo);
return 0;
}
I expect that "foo" will be instantiated in liba.so
but in fact it is instantiated in both liba.so and b:
$ ./b
&foo=0x600c68 # <- b .bss
&foo=0x600c68 # <- b .bss
The foo variable which is used (both by b and by liba.so) is in the .bss of b
and not in liba.so:
[...]
0x0000000000600c68 - 0x0000000000600c70 is .bss
[...]
0x00007ffff7dda9c8 - 0x00007ffff7dda9d4 is .data in /home/foo/bar/liba.so
0x00007ffff7dda9d4 - 0x00007ffff7dda9d8 is .bss in /home/foo/bar/liba.so
The foo variable is instantiated twice:
once in liba.so (this instance is not used when linked with program b)
once in b (this instance is used instead of the other one in b).
(I don't really understand why the variable is instantiated in the executable.)
There is only a declaration in b (as expected) in the DWARF information:
$ readelf -wi b
[...]
<1><ca>: Abbrev Number: 9 (DW_TAG_variable)
<cb> DW_AT_name : foo
<cf> DW_AT_decl_file : 1
<d0> DW_AT_decl_line : 3
<d1> DW_AT_type : <0x57>
<d5> DW_AT_external : 1
<d5> DW_AT_declaration : 1
[...]
and a location is found in liba.so:
$ readelf -wi liba.so
[...]
<1><90>: Abbrev Number: 5 (DW_TAG_variable)
<91> DW_AT_name : foo
<95> DW_AT_decl_file : 1
<96> DW_AT_decl_line : 3
<97> DW_AT_type : <0x57>
<9b> DW_AT_external : 1
<9b> DW_AT_location : 9 byte block: 3 d0 9 20 0 0 0 0 0 (DW_OP_addr: 2009d0)
[...]
This address is the location of the (unused) instance of foo in liba.so (.data).
I end up with 2 instances of the foo global variable (one in liba.so and one in b);
only the first one can be seen with DWARF;
only the second one is used.
How is the debugger supposed to resolve the foo global variable?

I don't really understand why the variable is instantiated in the executable.
You can find the answer here.
How is the debugger supposed to resolve the foo global variable
The debugger reads symbol tables (in addition to debug info), and foo does get defined in both the main executable b, and in liba.so:
nm b | grep foo
0000000000600c68 B foo

(I read the Oracle doc provided by @Employed Russian.)
The global variable is instantiated a second time for non-PIC code, so that the variable can be accessed in a non-PIC way without patching the non-PIC code:
a copy of the variable is made for the non-PIC code;
the variable is instantiated in the executable;
a copy relocation is used to copy the data from the source shared object at dynamic linking time;
the instance in the shared object is not used (after the copy relocation has been done).
Copy relocation instructions:
$ readelf -r b
Relocation section '.rela.dyn' at offset 0x638 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000600c58 000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000600ca8 001200000005 R_X86_64_COPY 0000000000600ca8 foo + 0
For functions, the GOT+PLT technique is used the same way they are used in PIC code.
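To check at run time which loaded object actually holds a variable such as foo, one could map its address back to the object whose PT_LOAD segment covers it. The sketch below does this with glibc's dl_iterate_phdr. object_containing and owner_query are hypothetical names. With a copy relocation in effect, &foo would be expected to fall inside the executable rather than liba.so.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical query state passed to the callback. */
struct owner_query {
    uintptr_t addr;   /* address being looked up */
    char *name;       /* output buffer for the object name */
    size_t name_len;  /* size of the output buffer */
    int found;
};

static int owner_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct owner_query *q = data;
    (void)size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        uintptr_t start = info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= start && q->addr - start < ph->p_memsz) {
            /* glibc reports the main program with an empty name. */
            snprintf(q->name, q->name_len, "%s",
                     info->dlpi_name != NULL ? info->dlpi_name : "");
            q->found = 1;
            return 1;  /* stop iterating */
        }
    }
    return 0;
}

/* Returns 1 and writes the owning object's name into `name` if some
 * loaded object maps `addr`, else returns 0. */
int object_containing(const void *addr, char *name, size_t name_len)
{
    struct owner_query q = { (uintptr_t)addr, name, name_len, 0 };
    dl_iterate_phdr(owner_cb, &q);
    return q.found;
}
```

In program b, calling object_containing(&foo, buf, sizeof buf) and printing buf would show whether the address belongs to the executable or to liba.so.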
