Is it possible to access a C variable from a linker script - c

Let's say, for example, that I need to get the size of the process loaded into memory, so I define this in my code:
#include <stdio.h>
ssize_t prog_sz;
int main()
{
printf("%x\n", prog_sz);
}
Then I have a linker script accessing it with a line like this: prog_sz = .;
NOTE: every linker script I test with my programs produces errors, which is why I showed only one line from the script. For example, something as simple as this, without the line I mentioned above:
SECTIONS
{
.text : { *(.text) }
.data : { *(.data) }
.bss : { *(.bss) }
}
produces annoying errors like these:
/usr/bin/ld: a.out: error: PHDR segment not covered by LOAD segment
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS): in function `__libc_csu_init':
(.text+0x9): undefined reference to `__init_array_start'
/usr/bin/ld: (.text+0x20): undefined reference to `__init_array_end'
/usr/bin/ld: a.out: hidden symbol `__init_array_end' isn't defined
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
even an example from the ld info documentation produces that annoying error. Maybe you can help me solve that issue too.

It looks like you have two separate issues.
The first issue involves the symbol/variable prog_sz. The approach that you have shown will just cause the linker to try and create another symbol named prog_sz, which would not likely accomplish the goal.
Without further detail on exactly what is being done, I will present three objectives and their solutions:
Set the variable prog_sz to contain the address of a symbol defined in the linker file.
Define the symbol in your command file with a different name, such as prog_sz__. You can then add directly above the declaration of prog_sz in your code the line:
extern char prog_sz__;
The type char does not really matter here. This statement is just necessary to tell the compiler that the symbol will be defined somewhere else. After this, you can assign the address of the symbol to prog_sz by modifying your definition to:
size_t prog_sz = (size_t)(&prog_sz__);
The & operator takes the address associated with the symbol prog_sz__, so the variable prog_sz ends up holding the address that your linker script assigned to that symbol.
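As a minimal sketch of the same idea that can be tried without a custom linker script: on GNU/Linux the default linker script already defines the symbols etext, edata and end (documented in the end(3) manual page), and they are read in the same way as prog_sz__ would be. The helper name data_and_bss_span is made up for this illustration, and the example assumes a toolchain that provides those three symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a GNU/Linux toolchain whose default linker script defines
   etext, edata and end (see end(3)). They are linker symbols, not real
   variables: only their addresses are meaningful, never their contents. */
extern char etext, edata, end;

/* Hypothetical helper: number of bytes between the end of the text
   segment and the end of the uninitialized data (.bss). */
size_t data_and_bss_span(void)
{
    return (size_t)((uintptr_t)&end - (uintptr_t)&etext);
}
```

Calling printf("%zx\n", data_and_bss_span()); prints the span in hexadecimal. A symbol defined in your own script is read the same way, by taking its address.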
Position the variable at a fixed location using the linker script.
Assuming that you are using the GNU toolchain, build with the GCC option -fdata-sections. This places each variable into its own data section. Be aware that variables no longer land in the plain .bss and .data sections: each one gets a section named after it, such as .data.prog_sz or .bss.prog_sz, so your script may need a wildcard such as *(.bss .bss.*) to capture them all.
You can then add a section just below the location in the linker file where . is set to the desired address.
For example:
SECTIONS
{
...
. = where_i_want_prog_sz;
prog_sz_section :
{
* (.bss.prog_sz)
}
...
}
Note that this will store prog_sz at a specific location, but will not set prog_sz to the value of the location.
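If you would rather not build everything with -fdata-sections, a variable can also be put into its own input section from the source with the GCC/Clang section attribute. The sketch below assumes an ELF target and a GNU-compatible linker; the section name prog_sz_sec and both helper functions are invented for the illustration. Because that name is a valid C identifier and no script rule claims the section, the linker provides __start_prog_sz_sec and __stop_prog_sz_sec automatically, which makes it easy to confirm where the variable landed.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC or Clang targeting ELF. "prog_sz_sec" is a made-up name. */
__attribute__((section("prog_sz_sec"))) size_t prog_sz = 0x1234;

/* Provided by the linker for sections whose names are valid C identifiers. */
extern char __start_prog_sz_sec[];
extern char __stop_prog_sz_sec[];

/* Size in bytes of the section that holds prog_sz. */
size_t prog_sz_section_size(void)
{
    return (size_t)(__stop_prog_sz_sec - __start_prog_sz_sec);
}

/* Read the first size_t stored in the section through the linker symbol. */
size_t read_via_section_start(void)
{
    size_t value;
    memcpy(&value, __start_prog_sz_sec, sizeof value);
    return value;
}
```

In a custom script, the input section would then be matched with *(prog_sz_sec) in place of *(.bss.prog_sz).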
Treat a linker symbol defined in the command file as a size_t variable.
Define the variable using the extern keyword: extern size_t prog_sz;
This tells the compiler that the symbol is defined elsewhere but will be of type size_t. Keep in mind, if this is what is being done, then you will need to be sure that the memory location is not being used for anything else, otherwise prog_sz may overlap other data in the system.
Regarding the second issue, the list of linker error messages: a custom script that replaces the linker's default script (for example, one passed with -T) must supply everything the default script normally provides. That includes symbols such as __init_array_start and __init_array_end, which the C startup code uses to initialize the C programming environment. Your three-section script defines none of them, so the link fails. Running ld --verbose prints the default script, which is a useful starting point to modify. If you are using the GNU toolchain, you can find documentation here:
https://sourceware.org/binutils/docs-2.37/

Related

How do I fix collect2 error while compiling an old MUD?

I'm trying to run make on an Ubuntu machine to compile a RoT MUD, but the farthest I've gotten is when I get a collect2: error: ld returned 1 exit status.
This is what comes immediately before the error in the terminal (along with a lot of other similar errors):
/usr/bin/ld: obj/wizlist.o:/home/lucas/Projects/R2b5/src/merc.h:3355: multiple definition of `bllmax'; obj/act_comm.o:/home/lucas/Projects/R2b5/src/merc.h:3355: first defined here
From what I've gathered this means that the header files have variable declarations in them, and that using static is an easy fix, however, I haven't been able to figure out where I should put that keyword in the code to fix this issue. The following is the only mention of bllmax in merc.h:
int bllmax, crbmax, crnmax, srpmax, mngmax;
Here is the program I'm trying to compile.
You need to learn the difference between declaration and definition. A declaration is telling the compiler that the symbol exists somewhere but possibly not here. A definition is telling the compiler that the symbol exists here.
The line you show (without any context) is defining the variables, which means they will be defined in each source file that includes the header file.
What it should do is to declare the variables, which can be done by making them extern:
extern int bllmax, crbmax, crnmax, srpmax, mngmax;
Then in a single source file define the variables (without extern).
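To make the split concrete, the sketch below shows both halves in one listing. In the real project the first statement would live in merc.h and the second in exactly one .c file; which source file to pick is up to you, and the helper sum_of_maxima is purely illustrative.

```c
/* What merc.h should contain: declarations only, no storage is allocated. */
extern int bllmax, crbmax, crnmax, srpmax, mngmax;

/* What exactly ONE .c file should contain: the definitions.
   File-scope variables without an initializer start out as zero. */
int bllmax, crbmax, crnmax, srpmax, mngmax;

/* Hypothetical helper showing that any file that sees the declarations
   can use the variables. */
int sum_of_maxima(void)
{
    return bllmax + crbmax + crnmax + srpmax + mngmax;
}
```

Older GCC releases defaulted to -fcommon, which tolerated the same uninitialized definition appearing in several object files; GCC 10 changed the default to -fno-common, which is likely why older code bases such as this one start failing at link time.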

Relocation out of range; R_AARCH64_ADR_PREL_PG_HI21; Variable via Linker script;

I got a linker relocation error:
relocation R_AARCH64_ADR_PREL_PG_HI21 out of range: 8556371968 is not in [-4294967296, 4294967295]
There is a code which uses this relocated symbol (simplified, not actual but enough to get an idea)
extern "C" void* memOffset;
void* getAddress(const void* addr)
{
return (void*)((uintptr_t)addr + (uintptr_t)&memOffset);
}
memOffset is provided via a Linker script and calculated as: memOffset = addr1 - addr2;
Well, when the address difference exceeds 32 bits, there is a relocation error.
Is there a way to provide this full 64-bit 'offset' value as a linker script symbol?
Thanks
PS: I'm interested in keeping a single 'offset' symbol, not in a workaround such as passing the addresses (addr1 and addr2) into the code and calculating the offset at run time.
It looks like you need -fPIC compilation option, while compiling your .c file(s).
If you're attempting to cross-compile a library with precompiled .a or .so.x files, gcc might be using the static version of the library because it cannot find the .so file.
A simple ln -s libX.so.1.2.3 libX.so might fix this for you.

How do I specify manual relocation for GCC code?

I am in a situation in an embedded system (an xtensa processor) where I need to manually override a symbol, but the symbol happens to be in the middle of another symbol. When I try using -Wl,--wrap=symbol it won't work, since the symbol isn't its own thing.
What I need to do is specify (preferably in a GCC .S, though .c is okay) where the code will end up. Though the actual symbol will be located somewhere random by the compiler, I will be memcpying the code into the correct place.
40101388 <replacement_user_vect>:
40101388: 13d100 wsr.excsave1 a0
4010138b: 002020 esync
4010138e: 011fc5 call0 4010258c <_UserExceptionVector_1>
My problem is that GCC creates the assembly with relative jumps, assuming the code will be located where it is in flash, while the eventual location will be fixed in an interrupt vector. How do I tell GCC / GNU as "put the code wherever you feel like, but trust me, it will actually execute from {here}"?
Though my code is at 0x40101388 (GCC decided), it will eventually reside at and execute from 0x40100050. How do I trick GCC by telling it "put the code HERE" but pretend it's located "HERE"?
EDIT: I was able to get around this, as it turns out, the function I needed to modify was held in the linker script, individually. I was able to just switch it out in the linker script. Though I still would love to know the answer, I now have a work-around.
In the linker script each output section has two associated addresses: VMA and LMA -- the address for which the code is linked and the address where the code will be loaded.
Put the code that needs to be relocated into a separate section, add an output section with the desired VMA and LMA to your linker script, and put an input section matching the name of the code section inside it.
E.g. the following C code
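#include <string.h> /* memcpy */

/* In a function definition, GCC requires the attribute before the declarator. */
__attribute__((section(".relocatable1.text"))) void f(void)
{
...
}
/* Symbols defined by the linker script; only their addresses are used. */
extern char _relocatable1_lma[];
extern char _relocatable1_vma_start[];
extern char _relocatable1_vma_end[];
/* Copy the code from where it was loaded (LMA) to where it must run (VMA). */
void relocatable1_copy(void)
{
memcpy(_relocatable1_vma_start, _relocatable1_lma,
_relocatable1_vma_end - _relocatable1_vma_start);
}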
Together with the following piece of ld script, with VMA substituted with the desired target code location
SECTIONS {
...
.some_section : { ... }
.relocatable1 VMA : AT(LOADADDR(.some_section) + SIZEOF(.some_section)) {
_relocatable1_vma_start = . ;
*(.relocatable1.literal .relocatable1.text) ;
_relocatable1_vma_end = . ;
}
_relocatable1_lma = LOADADDR(.relocatable1) ;
...
}
should do what you want.

objcopy: fails to copy a particular section (`.rodata' required but not present [...])

I compiled a Hello World C file and need just one section (only the hello world function) of it.
The compiled file has the format elf32-i386 and contains 4 sections: .rodata, .text.hello, .comment, .eh_frame.
I tried to use objcopy to extract only the .text.hello section: http://www.thegeekstuff.com/2013/01/objcopy-examples/ example 3.
It fails, reporting:
BFD: hello_new: symbol `.rodata' required but not present
objcopy:hello_new: No symbols
How to solve it?
First, you mentioned that you need only the .text section. Is that for the purpose of runtime execution? If so, that is not enough: if the hello function has strings hardcoded inside it, those strings are located in the .rodata section, so are you going to ignore that section?
.eh_frame holds stack-unwinding information (used for exception handling and by debuggers), and I think .comment is not needed, but .data, when present, is also needed.
Another thing is the relocation table: if the function is to be dynamically loaded into some arbitrary memory region, then many locations inside the function may need to be patched. Check objdump -r on your ELF file to find out whether there are any relocation entries. If there are none, you are safe.
Also, any global data declared const that your function uses goes into the .rodata section. Non-static variables or constants local to the function are on the stack, and the other initialized global data is located in the .data section.
Coming back to the original error: the section extracted in example 3 of the linked page (the .interp section) has no cross-references, so copying it on its own does not produce an error. In your case, the code section does have a cross-reference to the .rodata section, which is no longer available once you have extracted just the code section.
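To illustrate why the code section depends on .rodata, here is a sketch of a function similar to the one described in the question. It assumes GCC or Clang on an ELF target; the section attribute is used here only to reproduce the .text.hello name (the original file may have obtained it another way, for example with -ffunction-sections), and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC or Clang targeting ELF. The code goes to .text.hello,
   but the string literal is stored elsewhere, typically in .rodata, so the
   code only holds a reference to it that must be resolved by a relocation. */
__attribute__((section(".text.hello")))
const char *hello(void)
{
    return "Hello, world!";
}

/* Hypothetical helper: uses the string, which only works while the
   section that stores the literal is still present. */
size_t hello_length(void)
{
    return strlen(hello());
}
```

Compiling this to an object file and running objdump -r on it should list a relocation entry in .text.hello that points at the read-only data, which is the dependency objcopy complains about.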

Place a function at very start of binary

I'm working on a toy operating system and bootloader. I'm trying to write the kernel in C, and then convert it to binary for direct jumping to from the bootloader (i.e., I'm not loading an ELF or anything like that).
I've got the linker file setup with the proper origin (I'm loading the kernel to address 0xC0000000) and confirm with objdump that it's using it correctly. However, it's not placing my entry point at the start (0xC0000000) like I wanted. I guess that's not what the ENTRY directive is for.
My problem is simply that I want to place a particular function, kernel_main at address 0xC0000000. Is there a way I can accomplish this using gcc for compiling and linking?
Here is what the relevant parts of my linker file look like:
ENTRY(kernel_main)
SECTIONS
{
/* Origin */
. = 0xC0000000;
.text BLOCK(4K) : ALIGN(4K)
{
*(.text)
}
/* etc. */
}
The ENTRY linker command tells the linker which symbol the loader should jump to when it loads the program. If you're making your own operating system it's really not used since there is no loader.
Instead, as you know, the program simply starts at the first code address.
To place a special piece of code first, you could put it in its own code section and list that section first:
.text BLOCK(4K) : ALIGN(4K)
{
*(.text.boot) *(.text)
}
The sections in the list are placed in the order they are given.
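One way to get a function into .text.boot, assuming GCC or Clang, is the section attribute. This is only a sketch: the function body and the kernel_started flag are placeholders added for the illustration, and a real kernel_main would normally never return.

```c
/* Placeholder state so the effect of the call can be observed. */
static volatile int kernel_started;

/* Assumption: GCC or Clang. The attribute places this one function in the
   input section .text.boot, which the script lists before *(.text). */
__attribute__((section(".text.boot")))
void kernel_main(void)
{
    kernel_started = 1; /* a real kernel would start its work here */
}

/* Hypothetical helper that reports whether kernel_main has run. */
int kernel_has_started(void)
{
    return kernel_started;
}
```

As long as kernel_main is the only code placed in .text.boot, it ends up at the very start of the output section.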
The ENTRY directive is only useful for output formats that support an entry point. Since you're using a binary output, this won't work. What you can do is write a small stub in a separate source file (e.g. entry.c or entry.asm). Then, in the ld script, before the *(.text) line, you can put entry.o(.text). This instructs ld to place the .text section of that specific object file first (whereas * denotes all object files). So the new ld script would look like this:
ENTRY(kernel_main)
SECTIONS
{
/* Origin */
. = 0xC0000000;
.text BLOCK(4K) : ALIGN(4K)
{
entry.o(.text)
*(.text)
}
/* etc. */
}
As long as entry.o contains just one function (that simply calls your kernel main), this should work.
