gcc weak attribute inconsistent behaviour - c

I am using the gcc compiler from PowerShell on Windows 10. This gcc came with the Atollic TrueSTUDIO IDE. I am doing this so that I can build an .exe file from the C code, which makes unit testing easier.
I encounter a linker error (undefined reference to 'function_name') when there is a function that is defined as weak and that function is used in another .c file.
Meanwhile I do not get this linker error if I use arm-atollic-eabi-gcc or gcc running on ubuntu.
Here is a simple code to demonstrate this:
hello.c:
#include "weak.h"
void whatever(void)
{
iamweak();
}
weak.c:
#include <stdio.h>
#include "weak.h"
void __attribute__((weak)) iamweak(void)
{
printf("i am weak...\n");
}
weak.h
void iamweak(void);
main.c
int main(void)
{
return 0;
}
Creating the object files and linking:
> gcc -c main.c weak.c hello.c
> gcc -o main.exe main.o weak.o hello.o
> hello.o:hello.c:(.text+0x7): undefined reference to `iamweak'
collect2.exe: error: ld returned 1 exit status
Now I checked with gcc-nm the symbol table of hello.o:
> gcc-nm hello.o
00000000 b .bss
00000000 d .data
00000000 r .eh_frame
00000000 r .rdata$zzz
00000000 t .text
U _iamweak
00000000 T _whatever
Symbol table for weak.o:
>gcc-nm weak.o
00000000 b .bss
00000000 d .data
00000000 r .eh_frame
00000000 r .rdata
00000000 r .rdata$zzz
00000000 t .text
00000000 T .weak._iamweak.
w _iamweak
U _puts
As I said, when I use gcc on Ubuntu everything works. The symbol tables are also a little different.
Symbol table for hello.o:
nm hello.o
U _GLOBAL_OFFSET_TABLE_
U iamweak
0000000000000000 T whatever
Symbol table for weak.o:
nm weak.o
U _GLOBAL_OFFSET_TABLE_
0000000000000000 W iamweak
U puts
From https://linux.die.net/man/1/nm it says that "If lowercase, the symbol is local; if uppercase, the symbol is global (external)."
So iamweak is local in windows10 and global in Ubuntu. Is that why the linker cannot see it? What can I do about this? The weak function definitions are also in some HAL libraries and I don't want to modify those. Is there a workaround?

This is a bug in the Atollic fork of gcc. It does even worse: when main calls iamweak() directly, the generated code calls address 0:
main:
00401440: push %ebp
00401441: mov %esp,%ebp
00401443: and $0xfffffff0,%esp
00401446: call 0x401970 <__main>
36 iamweak();
0040144b: call 0x0
37 return 0;
00401450: mov $0x0,%eax
38 }
the complete atollic studio project here
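If the sources can be touched at all, one possible workaround is to apply the weak attribute only on ELF targets. The sketch below is an illustration under stated assumptions, not the HAL's actual code: MAYBE_WEAK and the call counter are hypothetical names, and it relies on GCC predefining __ELF__ on ELF targets. On a PE/COFF host build the function then becomes an ordinary strong definition, so a reference from another .c file should resolve normally.

```c
#include <stdio.h>

/* HYPOTHETICAL macro: weak on ELF targets, plain (strong) elsewhere. */
#if defined(__ELF__)
#define MAYBE_WEAK __attribute__((weak))
#else
#define MAYBE_WEAK
#endif

/* Hypothetical counter, only here so the call can be observed. */
static int iamweak_calls;

/* Default implementation; on ELF a strong iamweak() elsewhere would replace it. */
MAYBE_WEAK void iamweak(void)
{
    iamweak_calls++;
    printf("i am weak...\n");
}

/* Same role as whatever() in hello.c, but returns the call count. */
int whatever(void)
{
    iamweak();
    return iamweak_calls;
}
```

For a HAL that wraps the attribute in its own macro, the same idea might be applied by redefining that macro for the host build only; whether that is possible depends on the HAL headers and is an assumption here.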

Related

What is *ABS* section and when to use?

// foo.c
int main() { return 0; }
When I compiled the code above I noticed some symbols located in *ABS*:
$ gcc foo.c
$ objdump -t a.out | grep ABS
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000 foo.c
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000
They look like debug symbols, but isn't debug info stored somewhere like the .debug_info section?
According to man objdump:
*ABS* if the section is absolute (ie not connected with any section)
I don't understand it since no example is given here.
Question here shows an interesting way to pass some extra symbols in *ABS* by --defsym. But I think it could be easier by passing macros.
So what is this *ABS* section and when would someone use it?
EDIT:
Absolute symbols don't get relocated, their virtual addresses (0000000000000000 in the example you gave) are fixed.
I wrote a demo but it seems that the addresses of absolute symbols can be modified.
// foo.c
#include <stdio.h>
extern char foo;
int main()
{
printf("%p\n", &foo);
return 0;
}
$ gcc foo.c -Wl,--defsym,foo=0xbeef -g
$ objdump -t a.out | grep ABS
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000 foo.c
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000000000 l df *ABS* 0000000000000000
000000000000beef g *ABS* 0000000000000000 foo
# the addresses are not fixed
$ ./a.out
0x556e06629eef
$ ./a.out
0x564f0d7aeeef
$ ./a.out
0x55c2608dceef
# gdb shows that before entering main(), &foo == 0xbeef
$ gdb a.out
(gdb) p &foo
$1 = 0xbeef <error: Cannot access memory at address 0xbeef>
(gdb) br main
Breakpoint 1 at 0x6b4: file foo.c, line 7.
(gdb) r
Starting program: /home/user/a.out
Breakpoint 1, main () at foo.c:7
7 printf("%p", &foo);
(gdb) p &foo
$2 = 0x55555555feef <error: Cannot access memory at address 0x55555555feef>
If you look at other symbols you might find an index (or section name if the reader does the mapping for you) in place of *ABS*. This is a section index in the section headers table. It points to the section header of a section the symbol is defined in (or SHN_UNDEF (zero) if it is undefined in the object you are looking at). So the value (virtual address) of a symbol will be adjusted by the same value its containing section is adjusted during loading. (This process is called relocation.) Not so for absolute symbols (having special value SHN_ABS as their st_shndx). Absolute symbols don't get relocated, their virtual addresses (0000000000000000 in the example you gave) are fixed.
Such absolute symbols are sometimes used to store some meta information. In particular, the compiler can create symbols with symbol names equivalent to the names of translation units it compiles. Such symbols aren't needed for linking or running the program, they are just for humans and binary processing tools.
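To make the role of st_shndx concrete, here is a minimal sketch that classifies a 64-bit ELF symbol table entry by its section index and checks whether it is a file-name symbol. It assumes the glibc <elf.h> header on Linux; the enum and the function names are made up for this illustration.

```c
#include <elf.h>   /* Elf64_Sym, SHN_*, STT_FILE (glibc/Linux header) */

/* Hypothetical classification used only in this sketch. */
enum sym_kind { SYM_UNDEFINED, SYM_ABSOLUTE, SYM_COMMON, SYM_IN_SECTION };

/* Classify a symbol by its st_shndx field.
 * Other reserved indices (e.g. SHN_XINDEX) are not handled specially. */
enum sym_kind classify_symbol(const Elf64_Sym *sym)
{
    switch (sym->st_shndx) {
    case SHN_UNDEF:  return SYM_UNDEFINED;   /* shown as *UND* by objdump */
    case SHN_ABS:    return SYM_ABSOLUTE;    /* shown as *ABS* */
    case SHN_COMMON: return SYM_COMMON;      /* shown as *COM* */
    default:         return SYM_IN_SECTION;  /* index into the section headers */
    }
}

/* Non-zero for symbols that name a source file (the "df" entries above). */
int is_file_symbol(const Elf64_Sym *sym)
{
    return ELF64_ST_TYPE(sym->st_info) == STT_FILE;
}
```

A symbol such as foo.c in the listing above would be classified as SYM_ABSOLUTE and also satisfy is_file_symbol().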
As for your question about why this isn't stored in the .debug_info section (and why this info is emitted even though no debug switches were specified), the answer is that it is a separate thing; it is just the symbol table (.symtab). It is also needed for debugging, sure, but its primary purpose is linking of object (.o) files. By default it is preserved in linked executables/libraries. You can get rid of it with strip.
Much of what I wrote here is in man 5 elf.
I don't think doing what you are doing (with --defsym) is supported/supposed to work with dynamic linking. Looking at the compiler output (gcc -S -masm=intel), I see this
lea rsi, foo[rip]
Or, if we look at objdump -M intel -rD a.out (linking with -q to preserve relocations), we see the same thing: rip-relative addressing is used to get the address of foo.
113d: 48 8d 35 ab ad 00 00 lea rsi,[rip+0xadab] # beef <foo>
1140: R_X86_64_PC32 foo-0x4
The compiler doesn't know that it's going to be an absolute symbol, so it produces the code it does (as for a normal symbol). rip is the instruction pointer, so it depends on the base address of the segment containing .text after the program is mapped into memory by ld.so.
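The addresses printed in the question fit this explanation. Assuming the executable is position independent, the printed value should be the load base plus the link-time value 0xbeef, so subtracting 0xbeef should give a page-aligned base. The sketch below does that arithmetic; the function names and the 4 KiB page size are assumptions made for the illustration.

```c
#include <stdint.h>

/* Load base implied by a runtime address and the link-time symbol value. */
uint64_t implied_load_base(uint64_t runtime_addr, uint64_t link_value)
{
    return runtime_addr - link_value;
}

/* Non-zero if addr is aligned to a 4 KiB page (assumed page size). */
int is_page_aligned(uint64_t addr)
{
    return (addr & 0xfffu) == 0;
}
```

For the first run, 0x556e06629eef - 0xbeef = 0x556e0661e000, and for the gdb session, 0x55555555feef - 0xbeef = 0x555555554000. Both are page aligned, which is what one would expect if the "absolute" value was simply shifted together with the rest of the image.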
I found this answer shedding light on the proper use-case for absolute symbols.

Generating both a library and binary from a single C source

I have a C source file with a main function and some other functions. Something like:
#include "stdlib.h"
int program(int argc, char ** argv)
{
int a = atoi(argv[1]);
int b = atoi(argv[2]);
return a + b;
}
int main(int argc, char ** argv)
{
return program(argc, argv);
}
I know how to compile this to produce a binary.
Is there a way to compile this into an object file with the main symbol/function omitted?
I understand that I could accomplish my goal by splitting main into its own file, but suppose I don't want to do that.
Usually, having a definition of main() in a library is not a problem because the linker would only use it if there were no main() in any non-library binary. That can even be used to advantage, to include a default main(). See, for example, the POSIX standard -ll library used with lex (or -lfl if you use flex).
If you really want to ensure that the symbol is not available for resolution, you can remove the symbol from the library. There are tools for manipulating binary files, which vary from system to system. For example, take a look at the --strip-symbol option of objcopy. (That doesn't remove the compiled code; it just makes it unresolvable.)
A library is simply an archive of object modules - to omit main() it must either be in a separate object module which you then simply omit from the library build, or you use conditional compilation so that it is omitted at compile time.
In fact, if main were in a separate object module, it would not matter if it were left in, since any definition in a directly linked object module overrides a static library definition; the library definition is used only if nothing else defines the symbol. I am not sure whether this works if main() is defined in a module containing other symbols that are referenced in the binary, but the worst that can happen if you try it is a duplicate symbol error.
Is there a way to compile this into an object file with the main
symbol/function omitted?
So you don't want symbol main in your object files.
This might be one way.
file.c
#include "stdlib.h"
int program(int argc, char ** argv)
{
int a = atoi(argv[1]);
int b = atoi(argv[2]);
return a + b;
}
int not_main(int argc, char ** argv)
{
exit(0);
}
and then compile
[gcc]
gcc file.c -o file -e not_main -nostartfiles
SYMBOL TABLE:
0000000000000238 l d .interp 0000000000000000 .interp
0000000000000254 l d .note.gnu.build-id 0000000000000000 .note.gnu.build-id
0000000000000278 l d .gnu.hash 0000000000000000 .gnu.hash
0000000000000298 l d .dynsym 0000000000000000 .dynsym
00000000000002e0 l d .dynstr 0000000000000000 .dynstr
0000000000000302 l d .gnu.version 0000000000000000 .gnu.version
0000000000000308 l d .gnu.version_r 0000000000000000 .gnu.version_r
0000000000000328 l d .rela.plt 0000000000000000 .rela.plt
0000000000000360 l d .plt 0000000000000000 .plt
0000000000000390 l d .text 0000000000000000 .text
00000000000003f0 l d .eh_frame_hdr 0000000000000000 .eh_frame_hdr
0000000000000418 l d .eh_frame 0000000000000000 .eh_frame
0000000000200e78 l d .dynamic 0000000000000000 .dynamic
0000000000200fd8 l d .got 0000000000000000 .got
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 l df *ABS* 0000000000000000 file.c
0000000000000000 l df *ABS* 0000000000000000
0000000000200e78 l O .dynamic 0000000000000000 _DYNAMIC
00000000000003f0 l .eh_frame_hdr 0000000000000000 __GNU_EH_FRAME_HDR
0000000000200fd8 l O .got 0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000201000 g .got 0000000000000000 _edata
00000000000003d5 g F .text 0000000000000019 not_main
0000000000000390 g F .text 0000000000000045 program
0000000000201000 g .got 0000000000000000 _end
0000000000201000 g .got 0000000000000000 __bss_start
0000000000000000 F *UND* 0000000000000000 atoi@@GLIBC_2.2.5
0000000000000000 F *UND* 0000000000000000 exit@@GLIBC_2.2.5
Is there a way to compile this into an object file with the main symbol/function omitted?
Yes, by using preprocessor tricks and/or preprocessor options to the compiler.
Change your C code (in your file mycode.c) to contain:
#ifdef HAVE_MAIN
int main(int argc, char ** argv)
{
return program(argc, argv);
}
#endif
Then, to get only an object file mycode.o, compile as gcc -Wall -Wextra -g mycode.c -c -o mycode.o (if using GCC)
To get the entire program myprog, compile it as gcc -Wall -Wextra -g -DHAVE_MAIN mycode.c -o myprog
You could (avoiding any #ifdef HAVE_MAIN) even compile with gcc -Wall -Wextra -g -Dmain=mymain -c mycode.c to get the main function renamed, by preprocessing, as mymain (and then it is losing its magical status of "entry point").
However, doing that is considered bad taste (not very readable code). You'd better put your main in a different translation unit and compile it only when you want a whole program. Quite often, a library (or an executable) is made from several translation units (each compiled into an object file; the set of object files gets linked together). In practice you'll use a build automation tool (e.g. make or ninja) to build it.
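Put together, the HAVE_MAIN approach could look like the sketch below. It keeps the names from the question (program, mycode.c); the argc check is an addition made here so that the function is safe to call with too few arguments, and is not part of the original code.

```c
/* mycode.c: builds as a library object by default, as a program with -DHAVE_MAIN */
#include <stdlib.h>

int program(int argc, char **argv)
{
    /* Added guard (not in the original): avoid reading missing arguments. */
    if (argc < 3)
        return -1;

    int a = atoi(argv[1]);
    int b = atoi(argv[2]);
    return a + b;
}

#ifdef HAVE_MAIN
/* Only compiled when the whole program is wanted. */
int main(int argc, char **argv)
{
    return program(argc, argv);
}
#endif
```

Compiled with gcc -c mycode.c, the object file contains program but no main; adding -DHAVE_MAIN and dropping -c produces the executable.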

floating point library on cortex-m0plus

I am working on a project that uses dynamic relocations, it works fine for the Cortex-M4, but I am having some problems with the Cortex-M0+.
The problems occur with the symbols of the floating-point functions. These cores do not have a floating-point unit.
So I was trying to understand the difference between the codes generated of the two cores (M4 and M0+).
The code is this:
#include <stdint.h>
#include <math.h> // <fastmath.h>
float a, b, c; //, d, e;
void ldMain(void)
{
a = 1.100000f + a;
b = 1.100000f - b;
c = 1.100000f * c;
//d = 1.100000f / d;
}
The commands to compile and linking are these:
arm-none-eabi-gcc.exe -c TESTE.c -o TESTE.o0 -mthumb -mcpu=cortex-m0plus -O0 -mlong-calls -mword-relocations -mabi=atpcs -mfloat-abi=soft -mcaller-super-interworking
arm-none-eabi-ld.exe -o TESTE.o TESTE.o0 --relocatable --strip-all --discard-all --embedded-relocs
The symbols that are generated are (get with arm-none-eabi-readelf):
Relocation section '.rel.text' at offset 0x2e4 contains 6 entries:
Offset Info Type Sym.Value Sym. Name
00000028 00000b02 R_ARM_ABS32 00000004 a
0000002c 00000c02 R_ARM_ABS32 00000000 __addsf3
00000034 00000602 R_ARM_ABS32 00000004 b
00000038 00000802 R_ARM_ABS32 00000000 __subsf3
0000003c 00000902 R_ARM_ABS32 00000004 c
00000040 00000a02 R_ARM_ABS32 00000000 __mulsf3
Independent of the flag -mcpu=cortex-m0plus or -mcpu=cortex-m4 used on the gcc command, the symbols generated are the same.
The problem is that these symbols do not appear to exist for the cortex-m0plus.
The libgcc for cortex-m0plus (armv6-m), located at C:\Program Files (x86)\GNU Tools ARM Embedded\4.9 2015q2\lib\gcc\arm-none-eabi\4.9.3\armv6-m, does not have these symbols. This was verified with the command arm-none-eabi-nm.
Does anyone know why these symbols are used if they do not exist for the cortex-m0plus?
I am using the version 4.9 2015q2 of the GCC ARM Embedded.
These functions are soft-float helper routines. They normally come from GCC's runtime library (libgcc), not from the C library (newlib or newlib-nano). This post is almost 4 years old, and I do not have experience with the 2015 GCC. However, recent (e.g. 2018) GCC releases definitely ship these FP software routines in their libraries.

Is extern optional?

I am sure I am going crazy, but consider the following C code:
// file1.c
int first;
void f(void)
{ first = 2; }
// file2.c
#include <stdio.h>
int first;
void f();
int main(void)
{
first = 1;
f();
printf("%d", first);
}
These two files, for some reason will compile and link together, and print 2. I was always under the impression that unless I labelled one or the other (but not both) definitions of first with extern, this wouldn't compile, and that was in fact the whole point of extern!
It only compiles because first is merely declared twice (each file contains only a tentative definition); there are not two places in memory, only one. Initialize one with int first=4; and the other with int first=5; and your linker will report the error, e.g. with GCC:
b.o:b.c:(.data+0x0): multiple definition of `_first'
a.o:a.c:(.data+0x0): first defined here
collect2.exe: error: ld returned 1 exit status
Under normal conditions (no extra gcc flags) you should be fine to compile this code as:
gcc file1.c file2.c
What happens is that neither of the two global variables named first is initialized, so the compiler places each of them in the "common" section** and the linker merges them. In other words, there will be only one copy of the "first" variable. This happens because gcc defaulted to -fcommon; since GCC 10 the default is -fno-common, so newer compilers reject this code unless you pass -fcommon explicitly.
If you were to compile with the -fno-common flag you'd now receive the error you were thinking of:
/tmp/ccZNeN8c.o:(.bss+0x0): multiple definition of `first'
/tmp/cc09s2r7.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status
To resolve this you'd add extern to all but one of the variables.
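A minimal sketch of that layout follows. It is shown as one listing so that it compiles on its own; the comments mark which part would go into which file. The header name first.h and the helper run_demo (used here in place of main) are hypothetical.

```c
/* --- would live in a shared header, e.g. a hypothetical first.h --- */
extern int first;      /* declaration only: no storage is reserved */
void f(void);

/* --- would live in file1.c: the single definition of first --- */
int first;

void f(void)
{
    first = 2;
}

/* --- would live in file2.c, which only sees the extern declaration --- */
int run_demo(void)
{
    first = 1;
    f();
    return first;      /* f() overwrote the shared variable */
}
```

With this split, the code also links under -fno-common, because only one translation unit defines first.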
WARNING:
Now let's say you had two global uninitialized arrays of different sizes:
// file1.c
int first[10];
// file2.c
int first[20];
Well guess what, compiling them with gcc -Wall file1.c file2.c produces no warnings or errors and the variable was made common even though it's differently sized!!!
//objdump from file1.c:
0000000000000028 O *COM* 0000000000000020 first
//objdump from file2.c:
0000000000000050 O *COM* 0000000000000020 first
This is one of the dangers of global variables.
**If you look at an objdump of the *.o files (you have to compile with gcc -c to generate them) you'll see first placed in the common (*COM*) section:
mike@mike-VirtualBox:~/C$ objdump -t file2.o
a.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 file2.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000004 O *COM* 0000000000000004 first
0000000000000000 g F .text 0000000000000039 main
0000000000000000 *UND* 0000000000000000 f
0000000000000000 *UND* 0000000000000000 printf

Undefined reference to main - collect2: ld returned 1 exit status

I'm trying to compile a program (called es3), but, when I write from terminal:
gcc es3.c -o es3
it appears this message:
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: ld returned 1 exit status
What could I do?
It means that es3.c does not define a main function, and you are attempting to create an executable out of it. An executable needs an entry point, hence the linker complains.
To compile only to an object file, use the -c option:
gcc es3.c -c
gcc es3.o main.c -o es3
The above compiles es3.c to an object file, then compiles a file main.c that would contain the main function, and the linker merges es3.o and main.o into an executable called es3.
Perhaps your main function has been commented out because of e.g. preprocessing.
To learn what preprocessing is doing, try gcc -C -E es3.c > es3.i then look with an editor into the generated file es3.i (and search main inside it).
First, you should always (since you are a newbie) compile with
gcc -Wall -g -c es3.c
gcc -Wall -g es3.o -o es3
The -Wall flag is extremely important, and you should always use it. It tells the compiler to give you (almost) all warnings. And you should always listen to the warnings, i.e. correct your source file es3.c until you get no more warnings.
The -g flag is important also, because it asks gcc to put debugging information in the object file and the executable. Then you are able to use a debugger (like gdb) to debug your program.
To get the list of symbols in an object file or an executable, you can use nm.
Of course, I'm assuming you use a GNU/Linux system (and I invite you to use GNU/Linux if you don't use it already).
In my case it was just because I had not saved the source file and was trying to compile an empty file.
Executable file needs a main function. See below hello world demo.
#include <stdio.h>
int main(void)
{
printf("Hello world!\n");
return 0;
}
As you can see, there is a main function. If you don't have this main function, ld will report "undefined reference to `main'".
check my result:
$ cat es3.c
#include <stdio.h>
int main(void)
{
printf("Hello world!\n");
return 0;
}
$ gcc -Wall -g -c es3.c
$ gcc -Wall -g es3.o -o es3
~$ ./es3
Hello world!
please use $ objdump -t es3.o to check if there is a main symbol. Below is my result.
$ objdump -t es3.o
es3.o: file format elf32-i386
SYMBOL TABLE:
00000000 l df *ABS* 00000000 es3.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .debug_abbrev 00000000 .debug_abbrev
00000000 l d .debug_info 00000000 .debug_info
00000000 l d .debug_line 00000000 .debug_line
00000000 l d .rodata 00000000 .rodata
00000000 l d .debug_frame 00000000 .debug_frame
00000000 l d .debug_loc 00000000 .debug_loc
00000000 l d .debug_pubnames 00000000 .debug_pubnames
00000000 l d .debug_aranges 00000000 .debug_aranges
00000000 l d .debug_str 00000000 .debug_str
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .comment 00000000 .comment
00000000 g F .text 0000002b main
00000000 *UND* 00000000 puts
One possibility which has not been mentioned so far is that you might not be editing the file you think you are. i.e. your editor might have a different cwd than you had in mind.
Run 'more' on the file you're compiling to double check that it does indeed have the contents you hope it does. Hope that helps!
You can just add a main function to resolve this problem.
Just like:
int main()
{
return 0;
}
In my case I found out that the void in the main function declaration was missing.
I was previously using Visual Studio in Windows and this was never a problem, so I thought I might leave it out now too.

Resources