Here is the source file fragment:
#define TEST 34
#define PRINT_CONCAT(a, b) \
printf("%d\n", a##b)
Compiling this source file with GCC, linking it into a binary with the flags -ggdb3 -O3, and running the app under gdb shows the following behavior:
(gdb) p TEST
$3 = 34
(gdb) p PRINT_CONCAT
No symbol "PRINT_CONCAT" in current context.
Is there any way to make gdb expand function-like macros?
It turned out to be as easy as macro expand:
(gdb) macro expand PRINT_CONCAT(2, 4)
expands to: printf("%d\n", 24)
-ggdb3 is not even required; -g3 is enough. -g2 does not seem to include the macro information.
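For anyone who wants to reproduce this, here is a minimal sketch of a translation unit built around the fragment from the question. The CONCAT helper macro and the functions pasted_value and print_pasted are hypothetical additions for illustration; they are not part of the original source.

```c
#include <stdio.h>

#define TEST 34
#define PRINT_CONCAT(a, b) \
    printf("%d\n", a##b)

/* Hypothetical helper (not in the original fragment): pastes two tokens. */
#define CONCAT(a, b) a##b

/* Hypothetical function: CONCAT(2, 4) becomes the single token 24. */
int pasted_value(void)
{
    return CONCAT(2, 4);
}

/* Hypothetical function that uses the macro from the question. */
void print_pasted(void)
{
    PRINT_CONCAT(2, 4); /* the preprocessor turns this into printf("%d\n", 24) */
}
```

If this is built with gcc -g3 and print_pasted is called from main, stopping inside print_pasted and running macro expand PRINT_CONCAT(2, 4) should report the same expansion as in the session above. GDB resolves macros relative to the current source location, so it helps to be stopped in (or to have listed) the file that defines them.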
TL;DR
When I pass an array from Fortran to C, the array's address is incorrect in C. I've checked this by printing the address of the array in Fortran before the CALL, then stepping into the C function and printing the address of the argument.
The Fortran pointer: 0x9acd44c0
The C pointer: 0xffffffff9acd44c0
The upper dword of the C pointer has been set to 0xffffffff. I'm trying to understand why this is happening, and why it happens only on the HPC cluster and not on a development machine.
Context
I'm using a rather large scientific program written in Fortran/C++/CUDA. On some particular machine, I get a segfault when calling a C function from Fortran. I've found that a pointer is being passed to the C function with some bytes set incorrectly.
Code Snippets
Every Fortran file in the program includes a common header file which sets up some options and declares the common blocks.
IMPLICIT REAL*8 (A-H,O-Z)
COMMON/NBODY/ X(3,NMAX), BODY(NMAX)
COMMON/GPU/ GPUPHI(NMAX)
The Fortran call site looks like this:
CALL GPUPOT(NN,BODY(IFIRST),X(1,IFIRST),GPUPHI)
And the C function, which is compiled by nvcc, is declared like so:
extern "C" void gpupot_(int *n,
double m[],
double x[][3],
double pot[]);
GDB Output
I found from debugging that the value of the pointer to pot is incorrect, so any attempt to access that array will segfault.
When I ran the program with gdb, I put a breakpoint just before the call to gpupot and printed the address of the GPUPHI variable:
(gdb) p &GPUPHI
$1 = (PTR TO -> ( real(kind=8) (1050000))) 0x9acd44c0 <gpu_>
I then let the debugger step into the gpupot_ C function, and inspected the value of the pot argument:
(gdb) p pot
$2 = (double *) 0xffffffff9acd44c0
All of the other arguments have the correct pointer values.
Compiler options
The compiler options that are set for gfortran are:
-fPIC -O3 -ffast-math -Wall -fopenmp -mcmodel=medium -march=native -mavx -m64
And nvcc is using the following:
-ccbin=g++ -Xptxas -v -ftz=true -lineinfo -D_FORCE_INLINES \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_35,code=compute_35 -Xcompiler \
"-O3 -fPIC -Wall -fopenmp -std=c++11 -fPIE -m64 -mavx \
-march=native" -std=c++14 -lineinfo
For debugging, the -O3 is replaced with -g -O0 -fcheck=all -fstack-protector -fno-omit-frame-pointer, but the behaviour (crash) remains the same.
This is prefaced by my top comments [and yours].
It looks like you're getting an [unwanted] sign extension of the address.
The Fortran code is being built with -mcmodel=medium, but the C/CUDA code is not.
With that option, large symbols/arrays may be linked above 2GB, where the sign bit of a 32-bit address is set.
So, to fix the problem, either add the option to both or leave it off both.
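To make the sign-extension effect concrete, here is a small sketch that applies it to the address from the question. The helper names sign_extend_32 and zero_extend_32 are hypothetical, and this only models the arithmetic; it is not a claim about the exact instruction or relocation the compilers emitted.

```c
#include <stdint.h>

/* Hypothetical helper: treat the low 32 bits as a signed value and widen
 * it to 64 bits. The uint32_t -> int32_t conversion is implementation-defined
 * before C23; GCC wraps modulo 2^32, which is assumed here. */
uint64_t sign_extend_32(uint32_t low)
{
    return (uint64_t)(int64_t)(int32_t)low;
}

/* Hypothetical helper: widen the same 32 bits without sign extension. */
uint64_t zero_extend_32(uint32_t low)
{
    return (uint64_t)low;
}
```

With the Fortran-side address 0x9acd44c0, the top bit of the 32-bit value is set, so sign_extend_32 yields 0xffffffff9acd44c0 (the bad pointer seen in C), while zero_extend_32 leaves 0x9acd44c0 intact. An address below 2GB, such as 0x7acd44c0, comes out the same either way, which is consistent with the problem only showing up when a large array lands above 2GB.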
I have a function in my C code that is called implicitly, and it is being discarded by the linker. How can I prevent this?
I'm compiling using gcc and the linker flag -gc-sections, and I don't want to exclude the whole file from the flag. I tried using attributes: "used" and "externally_visible" and neither has worked.
void __attribute__((section(".mySec"), nomicromips, used)) func(){
...
}
In the map file I can see that the function was compiled but was not linked. Am I using it wrong? Is there any other way to do it?
You are misunderstanding the used attribute:
used
This attribute, attached to a function, means that code must be emitted for the function even if it appears that the function is not referenced...
i.e. the compiler must emit the function definition even if the function appears
to be unreferenced. The compiler will never conclude that a function is unreferenced
if it has external linkage. So in this program:
main1.c
static void foo(void){}
int main(void)
{
return 0;
}
compiled with:
$ gcc -c -O1 main1.c
No definition of foo is emitted at all:
$ nm main1.o
0000000000000000 T main
because foo is not referenced in the translation unit, is not external,
and so may be optimised out.
But in this program:
main2.c
static void __attribute__((used)) foo(void){}
int main(void)
{
return 0;
}
__attribute__((used)) compels the compiler to emit the local definition:
$ gcc -c -O1 main2.c
$ nm main2.o
0000000000000000 t foo
0000000000000001 T main
But this does nothing to inhibit the linker from discarding a section
in which foo is defined, in the presence of -gc-sections, even if foo is external, if that section is unused:
main3.c
void foo(void){}
int main(void)
{
return 0;
}
Compile with function-sections:
$ gcc -c -ffunction-sections -O1 main3.c
The global definition of foo is in the object file:
$ nm main3.o
0000000000000000 T foo
0000000000000000 T main
But after linking:
$ gcc -Wl,-gc-sections,-Map=mapfile main3.o
foo is not defined in the program:
$ nm a.out | grep foo; echo Done
Done
And the function-section defining foo was discarded:
mapfile
...
...
Discarded input sections
...
...
.text.foo 0x0000000000000000 0x1 main3.o
...
...
As per Eric Postpischil's comment, to force the linker to retain
an apparently unused function-section you must tell it to assume that the program
references the unused function, with linker option {-u|--undefined} foo:
main4.c
void __attribute__((section(".mySec"))) foo(void){}
int main(void)
{
return 0;
}
If you don't tell it that:
$ gcc -c main4.c
$ gcc -Wl,-gc-sections main4.o
$ nm a.out | grep foo; echo Done
Done
foo is not defined in the program. If you do tell it that:
$ gcc -c main4.c
$ gcc -Wl,-gc-sections,--undefined=foo main4.o
$ nm a.out | grep foo; echo Done
0000000000001191 T foo
Done
it is defined. There's no use for attribute used.
Apart from -u, already mentioned, here are two other ways to keep the symbol using GCC.
Create a reference to it without calling it
This approach does not require messing with linker scripts, which means it will work for hosted programs and libraries using the operating system's default linker script.
However, its behavior varies with compiler optimization settings, and it may not be very portable.
For example, in GCC 7.3.1 with LD 2.31.1, you can keep a function without actually calling it, by calling another function on its address, or branching on a pointer to its address.
bool function_exists(void *address) {
return (address != NULL);
}
// Somewhere reachable from main
assert(function_exists(foo));
assert(foo != NULL); // Won't work, GCC optimises out the constant expression
assert(&foo != NULL); // works on GCC 7.3.1 but not GCC 10.2.1
Another way is to create a struct containing function pointers. You can then group them all together and just check the address of the struct. I use this a lot for interrupt handlers.
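A minimal sketch of that struct-of-pointers idea follows. All names here (isr_timer, isr_uart, struct isr_table, isr_handlers, isr_table_is_complete) are hypothetical, and the noinline attribute is an assumption about what is needed to stop GCC from folding the check away; as noted above, whether the reference survives depends on the compiler version and optimization settings.

```c
#include <stddef.h>

/* Hypothetical interrupt handlers that nothing calls directly. */
void isr_timer(void) {}
void isr_uart(void) {}

/* Grouping the pointers in one object means that a single reference to
 * the table also references every handler stored in it. */
struct isr_table {
    void (*timer)(void);
    void (*uart)(void);
};

const struct isr_table isr_handlers = { isr_timer, isr_uart };

/* Checks the table through a pointer. Assumption: noinline keeps the
 * comparison from being evaluated at compile time at the call site. */
__attribute__((noinline))
int isr_table_is_complete(const struct isr_table *table)
{
    return table != NULL && table->timer != NULL && table->uart != NULL;
}
```

Calling isr_table_is_complete(&isr_handlers) from code reachable from main gives the linker a chain of references from main to the table and from the table to each handler, so the handlers' sections are no longer unreferenced when -gc-sections runs.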
Modify the linker script to keep the section
If you are developing a hosted program or a library, then it's pretty tricky to change the linker script.
Even if you do, it's not very portable. For example, gcc on OS X does not actually use the GNU linker, since OS X uses the Mach-O format instead of ELF.
Your code already shows a custom section though, so it's possible you are working on an embedded system and can easily modify the linker script.
SECTIONS {
    /* ... */
    .mySec : {
        KEEP(*(.mySec))
    }
}
Why doesn't GDB print a macro's value in the following example?
❯ cat sample.c
#include <stdio.h>
#define M 42
int main(int argc, const char **argv)
{
printf("M: %d\n", M);
return 0;
}
❯ rm -f sample
❯ gcc -Wall -g3 -ggdb -gdwarf-2 sample.c -o sample
❯ gdb sample
gdb> break main
gdb> run
gdb> info macro M
The symbol `M' has no definition as a C/C++ preprocessor macro
at <user-defined>:-1
gdb> continue
Continuing.
M: 42
Thanks!
❯ gcc --version
Apple LLVM version 7.3.0 (clang-703.0.29)
❯ gdb --version
GNU gdb (GDB) 7.10.1
I get different results with GCC 4.4.7 and GDB 7.2 than what you report. Having used your source and your compilation command, my GDB session looks like this:
> gdb sample
[ ... startup banner ... ]
(gdb) break main
Breakpoint 1 at 0x4004d3: file sample.c, line 7.
(gdb) run
Starting program: /home/jbolling/tmp/sample
Breakpoint 1, main (argc=1, argv=0x7fffffffcba8) at sample.c:7
7 printf("M: %d\n", M);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.7.x86_64
(gdb) info macro M
Defined at /home/jbolling/tmp/sample.c:3
#define M 42
(gdb) continue
Continuing.
M: 42
Program exited normally.
(gdb)
I suspect that the key difference here, and the reason that you aren't seeing a definition of M, is in GDB's sense of the source location associated with a breakpoint at function main. The GDB output you reported provides a clue about this:
gdb> info macro M
The symbol `M' has no definition as a C/C++ preprocessor macro
at <user-defined>:-1
Note in particular the location GDB reports: "<user-defined>" file, line number -1. In my GDB run, the breakpoint was associated with the first source line in the body of main(). I am inclined to believe that if you break there then GDB will report correctly on the macro's definition at that location.
I am trying to use linker symbols to automatically set a version number in my executables, and it seems to work as long as the symbols aren't set to zero...
In my C code:
extern char __VERSION_MAJ;
extern char __VERSION_MIN;
...
printf("Version %u.%u\n", (unsigned) &__VERSION_MAJ, (unsigned) &__VERSION_MIN);
And in my makefile:
LDFLAGS += -Xlinker --defsym=__VERSION_MAJ=1
LDFLAGS += -Xlinker --defsym=__VERSION_MIN=0
Results in the following output when I try to run the executable test:
./test: symbol lookup error: ./test: undefined symbol: __VERSION_MIN
If I change the symbol definition as follows:
LDFLAGS += -Xlinker --defsym=__VERSION_MAJ=1
LDFLAGS += -Xlinker --defsym=__VERSION_MIN=1
Then it works just fine:
Version 1.1
I've read about linker symbols here http://web.eecs.umich.edu/~prabal/teaching/eecs373-f10/readings/Linker.pdf and trawled google but haven't spotted anything that says 0 is a disallowed value for custom linker symbols.
Also, if I look at the linker map output it does have __VERSION_MIN:
0x0000000000000001 __VERSION_MAJ = 0x1
0x0000000000000000 __VERSION_MIN = 0x0
So, I'm quite stumped!
I would just use gcc -D__VERSION_MIN=0 instead, but that leads to trickiness and makefile ugliness with using prerequisites to rebuild the application when the version changes (it will be stored in a text file, not hard-coded in the makefile as above.)
I'm compiling and linking with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) for target i686-linux-gnu, if any of that makes a difference.
Executive summary:
Should a --defsym expression that results in 0 be allowed?
What am I doing wrong?
Is there a better/simpler way to achieve this?
If you want to use
gcc -D__VERSION_MIN=0
then you have to remove this declaration from your header file:
extern char __VERSION_MIN;
The option
gcc -D__VERSION_MIN=0
is equivalent to defining __VERSION_MIN as a macro at the top of your C code:
#define __VERSION_MIN 0
You cannot then also declare __VERSION_MIN as a variable in the same code:
#define __VERSION_MIN 0
extern char __VERSION_MIN;
This is not allowed: the macro is already defined when the declaration is seen, so the declaration expands to extern char 0;, which is a syntax error.
So if you want to use
gcc -D__VERSION_MIN=0
then you have to remove extern char __VERSION_MIN; from your code.
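A minimal sketch of the macro-only variant, assuming the version is passed on the command line as -DVERSION_MAJ=1 -DVERSION_MIN=0. The names VERSION_MAJ, VERSION_MIN, version_major and version_minor are hypothetical; plain names are used because identifiers beginning with a double underscore are reserved for the implementation.

```c
/* Defaults used when no -DVERSION_MAJ=... / -DVERSION_MIN=... is given.
 * These names are hypothetical stand-ins for __VERSION_MAJ/__VERSION_MIN. */
#ifndef VERSION_MAJ
#define VERSION_MAJ 1
#endif
#ifndef VERSION_MIN
#define VERSION_MIN 0
#endif

/* The macros are ordinary integer constants, so a value of 0 is not
 * special and there is no extern declaration or address-of involved. */
unsigned version_major(void) { return (unsigned)VERSION_MAJ; }
unsigned version_minor(void) { return (unsigned)VERSION_MIN; }
```

The print statement from the question would then become printf("Version %u.%u\n", version_major(), version_minor()). This does not remove the rebuild-on-version-change concern raised in the question; it only shows what the code looks like once the extern declarations are gone.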
I'm trying to use dladdr. It correctly locates the library, but it does not find the function name. I can call objdump, do a little math, and get the address of the function that I pass dladdr. If objdump can see it, why can't dladdr?
Here is my function:
const char *FuncName(const void *pFunc)
{
Dl_info DlInfo;
int nRet;
// Lookup the name of the function given the function pointer
if ((nRet = dladdr(pFunc, &DlInfo)) != 0)
return DlInfo.dli_sname;
return NULL;
}
Here is a gdb transcript showing what I get.
Program received signal SIGINT, Interrupt.
[Switching to Thread 0xf7f4c6c0 (LWP 28365)]
0xffffe410 in __kernel_vsyscall ()
(gdb) p MatchRec8Cmp
$2 = {void (TCmp *, TWork *, TThread *)} 0xf1b62e73 <MatchRec8Cmp>
(gdb) call FuncName(MatchRec8Cmp)
$3 = 0x0
(gdb) call FuncName(0xf1b62e73)
$4 = 0x0
(gdb) b FuncName
Breakpoint 1 at 0xf44bdddb: file threads.c, line 3420.
(gdb) call FuncName(MatchRec8Cmp)
Breakpoint 1, FuncName (pFunc=0xf1b62e73) at threads.c:3420
3420 {
The program being debugged stopped while in a function called from GDB.
When the function (FuncName) is done executing, GDB will silently
stop (instead of continuing to evaluate the expression containing
the function call).
(gdb) s
3426 if ((nRet = dladdr(pFunc, &DlInfo)) != 0)
(gdb)
3427 return DlInfo.dli_sname;
(gdb) p DlInfo
$5 = {dli_fname = 0x8302e08 "/xxx/libdata.so", dli_fbase = 0xf1a43000, dli_sname = 0x0, dli_saddr = 0x0}
(gdb) p nRet
$6 = 1
(gdb) p MatchRec8Cmp - 0xf1a43000
$7 = (void (*)(TCmp *, TWork *, TThread *)) 0x11fe73
(gdb) q
The program is running. Exit anyway? (y or n) y
Here is what I get from objdump:
$ objdump --syms /xxx/libdata.so | grep MatchRec8Cmp
0011fe73 l F .text 00000a98 MatchRec8Cmp
Sure enough, 0011fe73 = MatchRec8Cmp - 0xf1a43000. Does anyone know why dladdr can't return dli_sname = "MatchRec8Cmp"?
I'm running Red Hat Enterprise Linux Server release 5.4 (Tikanga). I have seen this work before. Maybe it's my compile switches:
CFLAGS = -m32 -march=i686 -msse3 -ggdb3 -pipe -fno-common -fomit-frame-pointer \
-Ispio -fms-extensions -Wmissing-declarations -Wstrict-prototypes -Wunused -Wall \
-Wno-multichar -Wdisabled-optimization -Wmissing-prototypes -Wnested-externs \
-Wpointer-arith -Wextra -Wno-sign-compare -Wno-sequence-point \
-I../../../include -I/usr/local/include -fPIC \
-D$(Uname) -D_REENTRANT -D_GNU_SOURCE
I have tried it with -g instead of -ggdb3 although I don't think debugging symbols have anything to do with elf.
If objdump can see it, why can't dladdr?
dladdr can only see functions exported in the dynamic symbol table. Most likely
nm -D /xxx/libdata.so | grep MatchRec8Cmp
shows nothing. Indeed your objdump shows that the symbol is local, which proves that this is the cause.
The symbol is local either because it has hidden visibility, because it is static, or because you hid it in some other way (e.g. with a linker script).
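The first two cases can be sketched in C as follows. The function names local_cmp, hidden_cmp and exported_cmp are hypothetical, and the comments describe what I would expect nm -D and objdump --syms to show for a shared library built from this with GCC on an ELF target.

```c
/* Static: internal linkage, so the symbol is local to the object file
 * ('l' in objdump --syms) and never reaches the dynamic symbol table. */
static int local_cmp(int a, int b)
{
    return (a > b) - (a < b);
}

/* Hidden visibility: callable from other object files in the same
 * library, but expected to be absent from nm -D output. */
__attribute__((visibility("hidden")))
int hidden_cmp(int a, int b)
{
    return local_cmp(a, b);
}

/* Default visibility: exported from the shared library, so it appears
 * in the dynamic symbol table, which is where dladdr looks up names. */
__attribute__((visibility("default")))
int exported_cmp(int a, int b)
{
    return hidden_cmp(a, b);
}
```

Of the three, only exported_cmp would be a candidate for dli_sname. A function like MatchRec8Cmp that shows up with the l flag in objdump --syms falls into one of the first two categories.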
Update:
Those marked with the 'U' work with dladdr. They get "exported" automatically somehow.
They work because they are exported from some other shared library. The U stands for undefined, i.e. defined elsewhere.
I added -rdynamic to my LDFLAGS.
man gcc says:
-rdynamic
Pass the flag -export-dynamic to the ELF linker, on targets that support it. This instructs the linker to add all symbols, not only used ones, to the
dynamic symbol table. This option is needed for some uses of "dlopen" or to allow obtaining backtraces from within a program.
Adding the gcc option "-export-dynamic" solved this for me.
hinesmr's solution worked for me. The exact option I passed to gcc was "-Wl,--export-dynamic", and all the functions became visible to dladdr.