I'm compiling C code with GCC and assembling some x86 code with NASM on Windows.
Now, GCC by default (and I haven't been able to find an option to change this) prepends an underscore (_) to all external symbol names, both the names it emits and the names it expects.
I need this assembly code to work with GCC on both Windows and Linux and would like to avoid hacks as much as possible (and code duplication; I had separate .s files for Windows/Linux at first).
I found out about (and used) the --prefix flag in NASM. Now for some symbols I'd like NASM to treat them as without the leading underscore (exact situation right now is that I need to reference the entry point in a linker script without the leading underscore). Hence the question here on how to override, per symbol, the --prefix/--postfix flags of NASM.
Feel free to treat this as an XY problem. If there's a way to set the mangling scheme of GCC for C that'd be great, for example.
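(To make the mismatch concrete, a minimal sketch with a made-up symbol: on 32-bit Windows, C code declaring extern int answer(void); will reference _answer, so the NASM side has to export the prefixed name unless --prefix adds it.)
; win32 sketch, hypothetical symbol "answer"
        global  _answer         ; what 32-bit GCC on Windows expects to see
_answer:
        mov     eax, 42
        ret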
I stumbled upon the same problem. I've created an include file with a lot of defines like
%define printf _printf
%define puts _puts
%define scanf _scanf
and some other stuff.
That file (libc_win32.inc) is included by a "master" include file (libc.inc):
%ifndef LIBC_INC
%define LIBC_INC
%ifdef win32
%include 'libc_win32.inc'
%elifdef win64
%include 'libc_win64.inc'
%elifdef elf32
%include 'libc_elf32.inc'
%elifdef elf64
%include 'libc_elf64.inc'
%else
; %error "libc.inc"
%endif
%endif
I set the symbols and include the files at the command line:
nasm -fwin32 -dwin32 -plibc.inc ...
or
nasm -felf32 -delf32 -plibc.inc ...
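With that in place, a source file can use the plain libc names and assemble unchanged for either target. A minimal sketch (file and symbol names made up):
; user.asm (hypothetical), assembled with either command line above
        extern  puts            ; expands to _puts when -dwin32 selects libc_win32.inc
        global  say_hello       ; our own symbol; add a %define (or use --prefix) if C
                                ; code on win32 needs to call it as _say_hello
        section .data
msg:    db      "hello", 0
        section .text
say_hello:
        push    msg             ; cdecl: single argument on the stack
        call    puts            ; becomes call _puts on win32, call puts on elf32
        add     esp, 4
        ret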
There is a predefined macro called __OUTPUT_FORMAT__, but it works only inside of a macro, not at program start.
I'm trying to locate the .c files that correspond to the #include header files for AVR.
I want to have a look at some of the standard libraries that are defined in the avr-gcc library, particularly the PORT definitions contained in <avr/io.h>. I searched through the library in /usr/lib/avr/include/avr and found the header file, however what I am looking for is the .c file. Does this file exist? If so, where can I find it? If not, what is the header file referencing?
The compiler-provided libraries are precompiled object code stored in static libraries. In gcc, libraries conventionally have the extension .a (for "archive", largely for historic reasons) and the prefix "lib".
At build time, the linker will search the library archives to find the object-code modules necessary to resolve references to library symbols. It extracts the required modules and links them into the binary image being built.
In gcc a library libXXX.a is typically linked using the command line switch -lXXX, so the libXXX.a naming convention is important in that case. For example, the standard C library libc.a is linked by the switch -lc.
So to answer your question, there are normally no .c files for the compiler-provided libraries shipped with the toolchain. The library need not even have been written in C.
That said, being open source, the source files (.c or otherwise) will be available from the repositories of the various libraries. For example, for the standard C library: https://www.nongnu.org/avr-libc/.
For other AVR architecture and I/O support libraries, you might inspect the associated header files or documentation. The header files will typically have a boiler-plate comment with a project URL for example.
PORTB and other special function registers are usually defined as macros in headers provided by avr-libc. Find your include/avr directory (the one that contains io.h). In that directory, there should be many other header files. As an example, iom328p.h contains the following line that defines PORTB on the ATmega328P:
#define PORTB _SFR_IO8(0x05)
If you are also looking for the libraries that are distributed as .a files, you should run avr-gcc -print-search-dirs.
There are several ways to find out where the system headers are located and which are included:
avr-gcc -v -mmcu=atmega8 foo.c ...
With option -v, GCC will print (amongst other stuff) which include paths it is using. Check the output on a shell/console for the search paths:
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/avr/5.4.0/include
/usr/lib/gcc/avr/5.4.0/include-fixed
/usr/lib/gcc/avr/5.4.0/../../../avr/include
The last location is for AVR-LibC, which provides avr/io.h. Resolving the ..s, that path is just /usr/lib/avr/include. These paths depend on how avr-gcc was configured and installed, hence you have to run that command with your installation of avr-gcc.
avr-gcc -H -mmcu=atmega8 foo.c ...
Suppose the C-file foo.c reads:
#include <avr/io.h>
int main (void)
{
PORTD = 0;
}
as an easy example. With -H, GCC will print which files it is actually including:
. /usr/lib/avr/include/avr/io.h
.. /usr/lib/avr/include/avr/sfr_defs.h
... /usr/lib/avr/include/inttypes.h
.... /usr/lib/gcc/avr/5.4.0/include/stdint.h
..... /usr/lib/avr/include/stdint.h
.. /usr/lib/avr/include/avr/iom8.h
.. /usr/lib/avr/include/avr/portpins.h
.. /usr/lib/avr/include/avr/common.h
.. /usr/lib/avr/include/avr/version.h
.. /usr/lib/avr/include/avr/fuse.h
.. /usr/lib/avr/include/avr/lock.h
avr-gcc -save-temps -g3 -mmcu=atmega8 foo.c ...
With DWARF-3 debugging info, the macro definitions will be recorded in the debug info and are visible in the pre-processed file (*.i for C code, *.ii for C++, *.s for pre-processed assembly). Hence, in foo.i we can find the definition of PORTD as
#define PORTD _SFR_IO8(0x12)
Starting from the line which contains that definition, scroll up until you find the annotation that tells in which file the macro definition happened. For example
# 45 "/usr/lib/avr/include/avr/iom8.h" 3
in the case of my toolchain installation. This means that the lines following that annotation come from line 45 onward of /usr/lib/avr/include/avr/iom8.h.
If you want to see the resolution of PORTD, scroll down to the end of foo.i which contains the pre-processed source:
# 3 "foo.c"
int main (void)
{
(*(volatile uint8_t *)((0x12) + 0x20)) = 0;
}
0x12 is the I/O address of PORTD, and 0x20 is the offset between I/O addresses and RAM addresses for ATmega8. This means the compiler may implement PORTD = 0 by means of out 0x12, __zero_reg__.
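To see this, you can also look at the generated assembly (foo.s from -save-temps, or avr-gcc -S). With optimization enabled, a plausible excerpt for main would be something like the following (exact output depends on compiler version and options):
main:
	out 0x12,__zero_reg__   ; PORTD = 0
	ldi r24,0               ; implicit return 0 from main
	ldi r25,0
	ret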
avr-gcc -print-file-name=libc.a -mmcu=...
Finally, this command will print the location (absolute path) of libraries like libc.a, libm.a, libgcc.a or lib<mcu>.a. The location of the library depends on how the compiler was configured and installed, but also on command line options like -mmcu=.
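For example, with the installation from above and -mmcu=atmega8, avr-gcc -print-file-name=libc.a might print something like:
/usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libc.a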
avr-gcc -Wl,-Map,foo.map -mmcu=atmega8 foo.c -o foo.elf
This directs the linker to dump a "map" file foo.map where it reports which symbol will drag which module from which library. This is a text file that contains lines like:
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/crtatmega8.o
...
LOAD /usr/lib/gcc/avr/5.4.0/avr4/libgcc.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libm.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libc.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libatmega8.a
libgcc.a is the compiler's own runtime support library, and all the others are provided by AVR-LibC. Resolving the ..s, the AVR-LibC files for ATmega8 are located in /usr/lib/avr/lib/avr4/.
I have two assembly files, code1.s and code2.s, and I want to build a relocatable (using the -fPIC switch) shared library from these two.
I want code2.s to call a function, named myfun1, which is defined in code1.s.
When I use call myfun1@PLT in code2.s it finds the function and it works like a charm, but it goes through the PLT to call this function, which is in the same shared library. I want to do this without going through the PLT. When I remove @PLT I get the relocation R_X86_64_PC32 against symbol error for myfun1.
How can I do this without using the PLT? Is there any way at all? I think it should be feasible, since the shared library has to be relocatable but not necessarily each of its object files, so why should calling a function inside the same library go through the PLT?
Here are my compile commands:
For codeX.s:
gcc -c codeX.s -fPIC -DPIC -o codeX.o
or
gcc -c codeX.s -o codeX.o
and for sharelibrary named libcodes.so:
gcc -shared -fPIC -DPIC -o libcodes.so code1.o code2.o
In case you are curious why I am doing this: I have many object files and each of them wants to call myfun1; here I just simplified things to ask the technical part. I even tried to put myfun1 in all codeX.s files, but then I get the error that myfun1 is defined multiple times. I don't care much about space, so I wouldn't mind putting myfun1 in all files.
From within one source file you can just use two labels (Building .so with recursive function in it), one with .globl and the other not. But that's not sufficient across source files within the shared library.
Still useful in combination with the below answer for functions that are also exported: one .hidden and one not, so you can efficiently call within the library.
Use .globl and .hidden to create a symbol that can be seen outside the current object file, but not outside the shared library. Thus it's not subject to symbol-interposition, and calls from other files in the same shared library can call it directly, not through the PLT or GOT.
Tested and working example:
## foo.S
.globl myfunc
.hidden myfunc
myfunc:
#.globl myfunc_external # optional, a non-hidden symbol at the same addr
#myfunc_external:
ret
## bar.S
.globl bar
bar:
call myfunc
ret
Build with gcc -shared foo.S bar.S -o foo.so, and objdump -drwC -Mintel foo.so:
Disassembly of section .text:
000000000000024d <myfunc>:
24d: c3 ret
000000000000024e <bar>:
24e: e8 fa ff ff ff call 24d <myfunc> # a direct near call
253: c3 ret
(I actually built with -nostdlib as well to keep the disassembly output clean for example purposes by omitting the other functions like __do_global_dtors_aux and register_tm_clones, and the .init section.)
I think glibc uses strong_alias or weak_alias for this (see "What does the weak_alias function do and where is it defined?"), so calls from within the shared library can use the normal name; see also "Where are syscalls located in glibc source?", e.g. __chdir and chdir.
e.g. glibc's printf.c defines __printf and makes printf a strong alias for it.
io/chdir.c defines __chdir and makes chdir a weak alias for it.
One of the x86-64 memchr asm implementations also uses a strong_alias macro (at the bottom of the file).
The relevant GAS directives are:
.weak names
.weakref foo, foo_internal
There's no strong-alias GAS directive; it may be equivalent to simply foo = foo_internal, or the equivalent .set foo, foo_internal.
(TODO: complete example and more details of what strong/weak do exactly. I don't currently know, so edits welcome if I don't get around to reading the docs myself.
I know this stuff exists and solves this problem, but I don't know exactly how.)
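For what it's worth, here is a minimal sketch of what such aliases can look like in GAS (symbol names made up; a plain = / .set assignment plus .globl gives a strong alias, .weak plus the assignment gives a weak one). Combine this with .hidden on the internal name, as shown above, to keep intra-library calls off the PLT:
	.text
	.globl __myfunc          # internal name used by calls inside the library
__myfunc:
	ret

	.globl myfunc            # strong alias: second global symbol, same address
	.set   myfunc, __myfunc

	.weak  myfunc_w          # weak alias: can be overridden by a strong definition
	.set   myfunc_w, __myfunc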
Well, I was not able to find a way to do so, but as I edited into my question, I don't mind putting myfun1 in all object files.
The problem I had was that the linker reported an error that myfun1 was defined in multiple places, and that was all because I had a .globl directive for myfun1; once I removed that line, it was fixed.
Thanks Ross Ridge for pushing me again to try that.
My question is fairly specific to OS X on x86-64, but a universal solution that works on other POSIX OSes would be appreciated even more.
I have a list of symbol names from some shared library (called the original library in the following), and I want my shared library to re-export these symbols. Re-export as in: if someone tries to resolve a symbol against my library, I either provide my version of this symbol or (if my library doesn't have this symbol) forward to the original library's symbol.
I don't know the types of the symbols, I only know whether they are functions (type T in nm output) or other symbols (type S in nm output).
For functions, I already have a solution: for every function I want to re-export I generate an assembly stub that dynamically resolves the symbol (using dlsym()) and then jumps into the resolved function with the very same environment (registers rdi, rsi, rdx, rcx, r8, r9, stack pointer, ...). I'm basically generating universal proxy functions. Using some macro trickery, these can be generated fairly easily without writing code for each and every symbol.
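A rough sketch of one such stub, to illustrate the idea (all names made up; _resolve_sym stands for a C helper void *resolve_sym(const char *name) that calls dlsym() on the original library and caches the result; x86-64 SysV/Darwin calling convention, so C symbols carry a leading underscore):
## stub.S (hypothetical)
    .text
    .globl _proxied_func
_proxied_func:
    # save the integer argument registers plus rax (al holds the vector-argument
    # count for variadic callees)
    pushq %rdi
    pushq %rsi
    pushq %rdx
    pushq %rcx
    pushq %r8
    pushq %r9
    pushq %rax
    subq  $128, %rsp            # room for xmm0-xmm7; keeps the stack 16-byte aligned
    movups %xmm0,   0(%rsp)     # the C helper may clobber the FP argument registers,
    movups %xmm1,  16(%rsp)     # so preserve them too
    movups %xmm2,  32(%rsp)
    movups %xmm3,  48(%rsp)
    movups %xmm4,  64(%rsp)
    movups %xmm5,  80(%rsp)
    movups %xmm6,  96(%rsp)
    movups %xmm7, 112(%rsp)
    leaq  proxied_name(%rip), %rdi  # argument: name to pass to dlsym()
    callq _resolve_sym              # returns the real function's address in rax
    movq  %rax, %r11                # r11 is call-clobbered, safe as scratch
    movups   0(%rsp), %xmm0
    movups  16(%rsp), %xmm1
    movups  32(%rsp), %xmm2
    movups  48(%rsp), %xmm3
    movups  64(%rsp), %xmm4
    movups  80(%rsp), %xmm5
    movups  96(%rsp), %xmm6
    movups 112(%rsp), %xmm7
    addq  $128, %rsp
    popq  %rax
    popq  %r9
    popq  %r8
    popq  %rcx
    popq  %rdx
    popq  %rsi
    popq  %rdi
    jmp   *%r11                 # tail-jump: original registers and stack restored
proxied_name:
    .asciz "proxied_func"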
For non-function symbols the problem seems to be harder: I cannot generate such a universal proxy function, because the resolving party never calls a function.
Using a constructor function static void init(void) __attribute__((constructor)); I can execute code whenever someone loads my library; that would be a good point to resolve and re-export all non-function symbols, if that's possible.
In other words, I'd like to rewrite the symbol table of my library so that it points to the respective symbols of another shared library. Doing the rewriting at compile or run time is okay (run time preferred). Or put yet another way, the behaviour of DYLD_INSERT_LIBRARIES (LD_PRELOAD) is exactly what I need, but I don't want to insert a new library, I want to replace one (in the file system). EDIT: The reason I don't want/can't use DYLD_INSERT_LIBRARIES or any other environment variable of the DYLD_* family is that they are ignored for code-signed, restricted, ... binaries.
I'm aware of the -reexport-l, -reexport_library and -reexported_symbols_list linker flags but I could not get them to work, especially when my library is a "replacement" for frameworks that are part of umbrella frameworks (example: /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/SearchKit) because ld forbids to link directly against parts of umbrella frameworks.
EDIT: Because I explained it somewhat ambiguously: I can't change the way the actual program is linked. The goal is to produce a shared library that is a replacement for the original library. (Apparently called filter library.)
Found it out now (OS X specific): clang -o replacement-lib.dylib ... -Xlinker -reexport_library PATH_TO_ORIGINAL_LIB does the trick. PATH_TO_ORIGINAL_LIB could for example be /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/Versions/Current/SearchKit.
If PATH_TO_ORIGINAL_LIB is a library that is part of an umbrella framework (as in the example above), then replace PATH_TO_ORIGINAL_LIB by the path of some other lib (I created a lib empty.dylib for that) and as a second step do
install_name_tool -change /usr/local/lib/empty.dylib PATH_TO_ORIGINAL_LIB replacement-lib.dylib
To see if the actual reexporting worked use:
otool -l replacement-lib.dylib | grep -A2 LC_REEXPORT_DYLIB
The output should look like
cmd LC_REEXPORT_DYLIB
cmdsize XX
name empty.dylib (offset YY)
After launching the install_name_tool it could be
cmd LC_REEXPORT_DYLIB
cmdsize XX
name /System/Library/Frameworks/CoreServices.framework/Frameworks/SearchKit.framework/Versions/Current/SearchKit (offset YY)
You could link against both libraries and use the link order to make sure to link against the right symbols. This works on both OS X and Linux:
cc -o executable -lmylib -loriglib
Where origlib is the original library and mylib contains symbols that are supposed to overwrite symbols in origlib. Then the executable will be linked against your symbols from mylib first and all unresolved symbols will be linked against origlib.
This works in the same way when linking against OS X frameworks. Just link against your library that replaces symbols first and against the framework after.
cc -o executable -lmylib -framework SomeFramework
Edit: If you just want to replace symbols at runtime then you can use LD_PRELOAD in the same way:
cc -o executable -framework SomeFramework
LD_PRELOAD=libmylib.dylib ./executable
I am facing a trivial problem.
I am doing an ldr r0, _buff
in ARM assembly, where _buff is defined in a C file. _buff is not static.
How should I declare external linkage in the assembly file (similar to extern in C)?
Is it required, or is there anything I am missing?
You don't need to. If the symbol can't be found in the source file, it'll be assumed to be defined elsewhere.
It's not necessary to do that. If _buff has been defined as global in the C file, you can compile and build the files together:
arm-none-eabi-gcc -o output assembly.s cfile.c
You can also compile them separately; that will work too. But if _buff does not exist, you will get a link error.
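For illustration, a minimal sketch (file and label names made up), assuming cfile.c contains a non-static definition such as int _buff;:
@ assembly.s: references _buff without any extern-style directive
        .text
        .global get_buff
get_buff:
        ldr     r0, =_buff      @ address of the C variable; the assembler simply
        ldr     r0, [r0]        @ marks it undefined and the linker resolves it
        bx      lr
        .ltorg                  @ literal pool holding the address of _buff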
A third party provided me a static lib (.a) to link with on a Solaris station.
I tried to compile with SunPro and failed at the link step.
I suppose the issue comes from the compiler I use (should it be gcc instead?) or simply its version (the standard library provided by the compiler could differ from the version expected by the library, and AFAIK that can lead to errors at the link step).
How can I tell which compiler was used to generate this lib? Are there tools for doing that? Some option in SunPro/gcc or whatever?
As a hint: I've read some time ago that compilers use different mangling conventions when generating object files (is that true?). Still, the "nm --demangle" command line prints all the function names from the debug symbols in this static lib just fine. How does that work? If my assumption is right, nm must have a way to figure out which convention is in use in a static library, doesn't it? Or does it simply mean that the lib was generated by GNU gcc, since nm is part of GNU binutils?
I am not close to my workstation, so I can't copy & paste the error output from the linker for the moment, but I could add it in a later edit.
Extract the object files from the archive then run the strings command on some of them (first on the smaller ones since there'd be less noise to sift through). Many compilers insert ASCII signatures in the object files.
For example, the following meaningless source file, foo.c:
extern void blah();
when compiled on my Fedora 10 machine into foo.o via gcc -c -o foo.o foo.c, results in a 647-byte foo.o object file. Running strings on foo.o results in
GCC: (GNU) 4.3.2 20081105 (Red Hat 4.3.2-7)
.symtab
.strtab
.shstrtab
.text
.data
.bss
.comment
.note.GNU-stack
foo.c
which makes it clear the compiler was GCC. Even if I'd compiled it with -fno-ident, the .note.GNU-stack ELF section would still have been present.
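If readelf is at hand, dumping the .comment section shows that same signature without the other noise (a sketch, using the foo.o from above):
readelf -p .comment foo.o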
You can extract the object files using the ar utility, or using Midnight Commander (which integrates ar), or you can simply run strings on the archive (which might give you more noise and be less relevant, but would still help.)
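A sketch of that workflow (libthirdparty.a is a placeholder name):
ar t libthirdparty.a          # list the object files in the archive
ar x libthirdparty.a          # extract them into the current directory
strings -a libthirdparty.a | grep -iE 'gcc|sun|acomp|iropt'   # or scan the whole archive at once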
I tend to use the strings program (with the '-a' option, or my own variant where the '-a' behaviour is standard) and look for the tell-tale signs. For example, in one of my own libraries, I find:
/work1/gcc/v4.2.3/bin/../lib/gcc/sparc-sun-solaris2.10/4.2.3/include
/work1/gcc/v4.3.0/bin/../lib/gcc/sparc-sun-solaris2.10/4.3.0/include
/work1/gcc/v4.3.1/bin/../lib/gcc/sparc-sun-solaris2.10/4.3.1/include
/work1/gcc/v4.3.3/bin/../lib/gcc/sparc-sun-solaris2.10/4.3.3/include
That suggests that the code in the library has been compiled with a variety of versions of GCC over a period of years (actually, I'm quite startled to find so many versions in a single library).
Another library contains:
cg: Sun Compiler Common 11 Patch 120760-06 2006/05/26
acomp: Sun C 5.8 Patch 121015-02 2006/03/29
iropt: Sun Compiler Common 11 Patch 120760-06 2006/05/26
/compilers/v11/SUNWspro/prod/bin/cc -O -v -Xa -xarch=v9 ...
So, there are usually fingerprints in the object files indicating which compiler was used. But you have to know how to look for them.
Is the library supposed to be a C or C++ library?
If it is a C library then name mangling can not be the problem, as there is none in C. It could be however in a wrong format. Unices used to have libraries in the a.out format but almost all newer versions switched to more powerful formats like ELF.
If it is a C++ library then name mangling can be an issue. Most compilers embed some compiler-specific symbols into the code, so if you have a tool like nm to list the symbols, you can hopefully deduce which compiler it came from.
For example g++ creates a symbol
__gxx_personality_v0
in its libraries.
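So a quick check with nm can already be telling (the library name is a placeholder):
nm libthirdparty.a | grep gxx_personality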
You can try the unix utility file:
file foo.a