I compiled a Hello World C file and need just one section (only the hello world function) of it.
The compiled file has the format elf32-i386 and contains 4 sections: .rodata, .text.hello, .comment, .eh_frame.
I tried to use objcopy to extract only the .text.hello section: http://www.thegeekstuff.com/2013/01/objcopy-examples/ example 3.
It fails, reporting:
BFD: hello_new: symbol `.rodata' required but not present
objcopy:hello_new: No symbols
How to solve it?
First, you mentioned you need only the .text section. Is that for the purpose of runtime execution? If so, that is not enough: if the hello function has strings hardcoded inside, all of these strings will be located in the .rodata section, so are you going to ignore that section?
.eh_frame holds unwinding information (used by debuggers and for exception handling), and .comment I think is not needed, but a .data section, if there is one, is also needed.
Another thing is the relocation table: if the function is to be dynamically loaded into some arbitrary memory region, then many locations INSIDE the function may need to be patched. Check objdump -r on your ELF file to find out whether there are any relocation entries. If not, you are safe.
Also, anything declared as "const" with static storage (global data, of course) will go into the .rodata section. Ordinary variables or constants local to the function are on the stack, and initialized, writable global data is located in the .data section.
But coming back to the original error: the section used in example 3 of the linked article (.interp) does not cross-reference any other section, so objdump -s on the result works without error. In your case, .text.hello does reference the .rodata section, which is no longer available after you have extracted just the .text.hello section.
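To see which other sections a function depends on before extracting it, the output of objdump -r can be scanned for relocation targets. The following is only a sketch: the sample text is illustrative (its offsets and the puts symbol are made up, not taken from the asker's file), and relocation_targets is a hypothetical helper name.

```python
import re

# Illustrative `objdump -r` output; offsets and symbols are invented for the example.
SAMPLE = """\
hello.o:     file format elf32-i386

RELOCATION RECORDS FOR [.text.hello]:
OFFSET   TYPE              VALUE
00000007 R_386_32          .rodata
0000000c R_386_PC32        puts

RELOCATION RECORDS FOR [.eh_frame]:
OFFSET   TYPE              VALUE
00000020 R_386_PC32        .text.hello
"""

HEADER = re.compile(r"^RELOCATION RECORDS FOR \[(.+)\]:$")


def relocation_targets(objdump_text):
    """Map each section name to the set of symbols its relocations refer to."""
    targets = {}
    current = None
    for line in objdump_text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            targets[current] = set()
            continue
        fields = line.split()
        # A relocation record has three columns: offset, type, value.
        if current is None or len(fields) != 3 or fields[0] == "OFFSET":
            continue
        # Drop a trailing addend such as "+0x00000004" or "-0x00000004".
        symbol = re.sub(r"[+-]0x[0-9a-fA-F]+$", "", fields[2])
        targets[current].add(symbol)
    return targets


if __name__ == "__main__":
    for section, symbols in relocation_targets(SAMPLE).items():
        print(section, "->", sorted(symbols))
```

If .rodata shows up among the targets of .text.hello, as in the sample, the extracted section cannot stand on its own, which is what the BFD error is reporting.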
After writing a simple hello_world.c program and compiling it with 32-bit MinGW,
objdump can show the symbol table using:
objdump -t hello_world.exe
The symbol table then has an entry for _main:
...
[ 32](sec 1)(fl 0x00)(ty 20)(scl 2) (nx 1) 0x00000460 _main
...
However, when loading the hello_world.exe file in x64dbg debugger, the
_main symbol is not shown, as can be seen from the symbol list below
when hello_world module is selected.
This is annoying, since I would like to create a breakpoint at start of user
code in hello_world, and using the symbols to jump to the start location
would be very convenient.
Any idea on how to get the _main symbols included in the symbol list?
While PE has support for storing debug information, the symbol table is mostly an ELF concept.
If you look at the PE sections (use objdump -h) you'll see a lot of extra sections not referenced in the PE directories.
These are used by the binutils to extract the DWARF information and show you, for example, the symbol table.
x64dbg is a pure Windows/PE tool and doesn't understand DWARF.
However, it will show you the address of the PE entry-point (rarely the address of the "main" itself though) and will put a breakpoint there for you automatically.
The Entry-point is shown under the export symbols of the binary under inspection.
Furthermore, x64dbg will break on ntdll, allowing you to reach the entry-point with a ninja use of CTRL+F9 (beware of TLS initialization callbacks).
To get to main you can step through the code until you find a call to an address in the .text section or simply a call followed by two calls to cexit and ExitProcess.
Also, given the offset of _main retrieved with objdump -t, the VA of _main is BASE ADDRESS + .text RVA + _main OFFSET.
In my case, this was 4010460h
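The address arithmetic can be written out as a small sketch. Only the 0x460 offset comes from the objdump -t listing in the question; the image base and the .text RVA below are assumed values (commonly seen with 32-bit MinGW builds) and should be replaced with the ones from the PE headers of the binary being debugged.

```python
def virtual_address(image_base, section_rva, symbol_offset):
    """VA = image base + RVA of the containing section + offset inside that section."""
    return image_base + section_rva + symbol_offset


MAIN_OFFSET = 0x460    # from the objdump -t entry for _main
IMAGE_BASE = 0x400000  # assumption: not stated in the answer
TEXT_RVA = 0x1000      # assumption: not stated in the answer

if __name__ == "__main__":
    print(hex(virtual_address(IMAGE_BASE, TEXT_RVA, MAIN_OFFSET)))
```

The resulting value is what you would type into the debugger's "go to address" or breakpoint dialog.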
I have a requirement where I need to create a duplicate/copy section of .data section.
I've tried creating a dummy section with same size of data section in linker script and copy the contents of data section to the dummy section in the init functions of my ELF image, but that doesn't suit my requirement, as I want the copy/duplicate section to be created along with final ELF image not during the execution of it.
Below is what I wanted in my linker script,
SECTIONS {
.data : { <data section contents> }
.dummydata : { <copy of .data section> }
}
Can anyone help to write the linker script to match above requirement?
I don't think this can be done with just ld and a linker script. Given this passage from the ld documentation:
If a file name matches more than one wildcard pattern, or if a file
name appears explicitly and is also matched by a wildcard pattern, the
linker will use the first match in the linker script.
It sounds like the linker script will only put the data (or anything) in one section.
However, all hope is not lost. You can extract the section contents using objcopy and then add them back as a new section using objcopy again:
objcopy -O binary --only-section=.data your-file temp.bin
objcopy --add-section .dummydata=temp.bin your-file
This will append the section to be the last section with a VMA/LMA of 0. You can then use objcopy to move the section to the desired location.
objcopy --change-section-address .dummydata=desired-address your-file
Of course if there is something already there that would be problematic. Luckily you can create a hole right after your first .data with something like:
data_start = .;
.data : { *(.data) }
data_end = .;
. += (data_end - data_start);
This should create a hole right after your first data, big enough to put another copy of data right after it. If this isn't exactly where you want it to be just add (data_end - data_start) where you want the hole.
Finally you can change the section flags, again with objcopy
objcopy --set-section-flags .dummydata=the-flags-you-want your-file
Not as clean as just duplicating something in the linker script but it should work.
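For a build script, the objcopy steps above could be assembled like this. It is a sketch, not a tested build recipe: duplicate_section_commands is a hypothetical helper, and the address and flag values in the usage part are placeholders. It only builds the command lines; nothing is executed.

```python
import shlex


def duplicate_section_commands(elf, src=".data", dst=".dummydata",
                               address=None, flags=None, tmp="temp.bin"):
    """Return the objcopy invocations (as argv lists) that copy `src` into `dst`."""
    cmds = [
        # 1. Dump the raw bytes of the source section to a temporary file.
        ["objcopy", "-O", "binary", f"--only-section={src}", elf, tmp],
        # 2. Add those bytes back as a new section (objcopy rewrites `elf` in place).
        ["objcopy", "--add-section", f"{dst}={tmp}", elf],
    ]
    if address is not None:
        # 3. Optionally move the new section to the hole reserved in the linker script.
        cmds.append(["objcopy", "--change-section-address", f"{dst}={address:#x}", elf])
    if flags:
        # 4. Optionally set the section flags.
        cmds.append(["objcopy", "--set-section-flags", f"{dst}={','.join(flags)}", elf])
    return cmds


if __name__ == "__main__":
    # 0x20001000 and the flag list are placeholders, not values from the answer.
    for cmd in duplicate_section_commands("your-file", address=0x20001000,
                                          flags=["alloc", "load", "data"]):
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) so that the build stops on the first failing step.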
The test is on 32-bit Linux, x86.
Suppose in my assembly program final.s, I have to load some library symbols, say, stdin@@GLIBC_2.0, and I want to load these symbols in a fixed address.
So following instructions in this question, I did this:
echo "\"stdin@@GLIBC_2.0\" = 0x080a7390;" > symbolfile
echo "\"stdin@GLIBC_2.0 (4)\" = 0x080a7390;" >> symbolfile
gcc -Wl,--just-symbols=symbolfile final.s -g
And when I checked the output of symbol table, I got this:
readelf -s a.out | grep stdin
53: 080a7390 4 OBJECT GLOBAL DEFAULT ABS stdin@@GLIBC_2.0
17166: 080a7390 0 NOTYPE GLOBAL DEFAULT ABS stdin@GLIBC_2.0 (4)
And comparing to a common ELF binary that requires the stdin symbol:
readelf -s hello.out | grep stdin
17199: 0838b8c4 4 OBJECT GLOBAL DEFAULT 25 stdin@@GLIBC_2.0
52: 0838b8c4 4 OBJECT GLOBAL DEFAULT 25 stdin@GLIBC_2.0 (4)
So an obvious difference I found is that the Ndx column (that is, the section number) of my fixed-position symbols is ABS. Please check the references here.
When executing the a.out, it throws a segmentation fault error.
So my question is, how to set the section number of the symbol fixed position?
I want to load these symbols in a fixed address.
You are importing these symbols from GLIBC. Unless you are doing a fully-static linking, you get no say in what address these symbols end up at.
So my question is, how to set the section number of the symbol
That question makes no sense: section number itself is meaningless and 25 may refer to .bss in one executable, but to .text in another.
Your section 25 just happens to be .bss on this particular system and for this particular build. Try building a fully-static binary, and you are likely to see section 24 instead.
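To make the meaning of the Ndx column concrete, here is a small sketch that splits a readelf -s row into its columns and describes the index. The reserved values (0 for UND, 0xfff1 for ABS, 0xfff2 for COM) are the ones defined by the ELF specification; parse_symbol_line and describe_ndx are hypothetical helper names.

```python
# Reserved section indexes defined by the ELF specification.
SPECIAL_NDX = {"UND": 0x0000, "ABS": 0xFFF1, "COM": 0xFFF2}


def parse_symbol_line(line):
    """Split one `readelf -s` row into its eight columns."""
    num, value, size, typ, bind, vis, ndx, name = line.split(None, 7)
    return {
        "num": int(num.rstrip(":")),
        "value": int(value, 16),
        "size": int(size, 0),
        "type": typ,
        "bind": bind,
        "vis": vis,
        "ndx": ndx,
        "name": name.strip(),
    }


def describe_ndx(ndx):
    """Explain what an Ndx value refers to."""
    if ndx in SPECIAL_NDX:
        return f"reserved index {SPECIAL_NDX[ndx]:#06x} ({ndx})"
    return f"section header #{int(ndx)} of this particular file"


if __name__ == "__main__":
    row = parse_symbol_line("53: 080a7390 4 OBJECT GLOBAL DEFAULT ABS stdin@@GLIBC_2.0")
    print(row["name"], "->", describe_ndx(row["ndx"]))
    print("25 ->", describe_ndx("25"))
```

A numeric Ndx only has meaning together with the section header table of the same file (readelf -S), which is why comparing 25 across two different executables says nothing.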
Anyway, a normal executable gets stdin copied from libc.so.6. You will do well to read this description of the process, and pay special attention to "Extra credit #2: Referencing shared library data from the executable" section.
But it may be easier to understand the fully-static case first.
Is there a way to prohibit the use of global variables?
I want GCC to generate an error on compile time when a global variable is defined.
We have code that should run per thread, and we want to allow only the use of the stack (which is thread-safe).
Is there a way to enforce it?
Some GCC flag or other way to verify it?
One approach would be to generate a linker map file (e.g. pass option -Wl,-Map,program.map to gcc), and examine the .data and .bss output sections for any contributions from the object files that you want to run without globals.
For instance, if my source file hello.c has:
static int gTable[100];
the linker map file will have something like this in it:
.bss            0x0000000000600940      0x1b0
 *(.dynbss)
 .dynbss        0x0000000000000000        0x0 /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crt1.o
 *(.bss .bss.* .gnu.linkonce.b.*)
 .bss           0x0000000000600940        0x0 /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crt1.o
 .bss           0x0000000000600940        0x0 /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crti.o
 .bss           0x0000000000600940        0x1 /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o
 *fill*         0x0000000000600941       0x1f 00
 .bss           0x0000000000600960      0x190 hello.o
You can see that hello.o is contributing 0x190 (400) bytes to the .bss section. I've used the approach of parsing a link map file with a Python script to generate code size and RAM usage metrics for an embedded project with reasonable success in the past; the text output format from the linker is pretty stable.
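A minimal sketch of such a map-file check might look like the following. It is not the script mentioned above, just an illustration: data_contributions and offenders are hypothetical names, the sample rows are taken from the excerpt above, and rows that ld wraps onto two lines (long section names) or COMMON symbols are not handled.

```python
import re
from collections import defaultdict

# Matches input-section rows such as
#   " .bss  0x0000000000600960  0x190 hello.o"
# (section name, address, size, object file). Output-section header rows
# have no object file column and therefore do not match.
ROW = re.compile(r"^\s*(\.[\w.]+)\s+0x[0-9a-fA-F]+\s+(0x[0-9a-fA-F]+)\s+(\S+)\s*$")


def data_contributions(map_text, prefixes=(".data", ".bss")):
    """Sum the bytes each object file contributes to .data/.bss input sections."""
    totals = defaultdict(int)
    for line in map_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        section, size, obj = match.group(1), int(match.group(2), 16), match.group(3)
        wanted = any(section == p or section.startswith(p + ".") for p in prefixes)
        if wanted and size:
            totals[obj] += size
    return dict(totals)


def offenders(map_text, watched):
    """Return the watched object files that contribute any global data."""
    totals = data_contributions(map_text)
    return {obj: size for obj, size in totals.items() if obj in watched}


SAMPLE = """\
.bss            0x0000000000600940      0x1b0
 *(.dynbss)
 *(.bss .bss.* .gnu.linkonce.b.*)
 .bss           0x0000000000600940        0x1 /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o
 *fill*         0x0000000000600941       0x1f 00
 .bss           0x0000000000600960      0x190 hello.o
"""

if __name__ == "__main__":
    bad = offenders(SAMPLE, {"hello.o"})
    for obj, size in bad.items():
        print(f"{obj}: {size} bytes of global data")
    raise SystemExit(1 if bad else 0)
```

Run as a post-link step, the non-zero exit status makes the build fail, which approximates the compile-time error the question asks for.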
There is no such functionality in gcc. One workaround would be to incorporate into the build process a static analysis tool that can detect globals. The compilation would still not fail, but at least you would be warned in some way. I can see that PC-Lint (www.gimpel.com) has a check for
non const non volatile global variables, locating these can assist multi-threaded applications in detecting non re-entrant situations
Other tools probably include similar functionality.
I would use ctags to extract the symbols from the source code and then search the output with a (perl or python) script for global variables.
E.g. the following line would tell you whether a C source file hello.c contains global variables:
ctags -f- hello.c | perl -ne '@a=split(/\t/, $_); if ($a[3] eq qq(v)){ print qq(Has global variables.); exit 0; }'
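The same check can be sketched in Python. It assumes the default tab-separated ctags output where the fourth column is the one-letter kind ('v' for variables); the sample lines are illustrative, and global_variables is a hypothetical helper name.

```python
def global_variables(tags_text):
    """Return the names that ctags reports with kind 'v' (variable)."""
    names = []
    for line in tags_text.splitlines():
        # Skip blank lines and the "!_TAG_..." pseudo-tag header lines.
        if not line or line.startswith("!_TAG_"):
            continue
        fields = line.split("\t")
        # Columns: name, file, search pattern ending in ;", kind, extras...
        if len(fields) > 3 and fields[3] == "v":
            names.append(fields[0])
    return names


# Illustrative ctags output for a file with one file-scope variable.
SAMPLE = (
    'gTable\thello.c\t/^static int gTable[100];$/;"\tv\tfile:\n'
    'main\thello.c\t/^int main(void)$/;"\tf\n'
)

if __name__ == "__main__":
    found = global_variables(SAMPLE)
    if found:
        print("Has global variables:", ", ".join(found))
```

In a real build the text would come from running ctags -f- hello.c, for example via subprocess.run with capture_output=True and text=True, and the script would exit non-zero when the list is not empty.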
I'm working on a toy operating system and bootloader. I'm trying to write the kernel in C, and then convert it to binary for direct jumping to from the bootloader (i.e., I'm not loading an ELF or anything like that).
I've got the linker file set up with the proper origin (I'm loading the kernel to address 0xC0000000) and confirmed with objdump that it's being used correctly. However, it's not placing my entry point at the start (0xC0000000) like I wanted. I guess that's not what the ENTRY directive is for.
My problem is simply that I want to place a particular function, kernel_main at address 0xC0000000. Is there a way I can accomplish this using gcc for compiling and linking?
Here is what the relevant parts of my linker file look like:
ENTRY(kernel_main)
SECTIONS
{
/* Origin */
. = 0xC0000000;
.text BLOCK(4K) : ALIGN(4K)
{
*(.text)
}
/* etc. */
}
The ENTRY linker command tells the linker which symbol the loader should jump to when it loads the program. If you're making your own operating system it's really not used since there is no loader.
Instead, as you know, the program simply starts at the first code address.
To place a particular piece of code first, you could put it in its own code section and list that section first:
.text BLOCK(4K) : ALIGN(4K)
{
*(.text.boot) *(.text)
}
The sections in the list are placed in the order they are given.
The ENTRY directive is only useful for output formats that support an entry point. Since you're using a binary output, this won't work. What you can do is write a small stub in a separate source file (e.g. entry.c or entry.asm or whatever). Then, in the ld script, before the *(.text) line, you can put entry.o(.text). This instructs ld to place the .text section of that specific object file first (whereas * matches all object files). So the new ld script would look like this:
ENTRY(kernel_main)
SECTIONS
{
/* Origin */
. = 0xC0000000;
.text BLOCK(4K) : ALIGN(4K)
{
entry.o(.text)
*(.text)
}
/* etc. */
}
As long as entry.o contains just one function (that simply calls your kernel main), this should work.
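Whichever layout is used, the result can be checked before converting to a flat binary, assuming an intermediate ELF file is kept so that nm can list its symbols. The sketch below parses nm-style output and reports what sits at the load address; the sample listing is made up (print_string and boot_message are hypothetical symbols), and symbols_at is a hypothetical helper name.

```python
def symbols_at(nm_text, address):
    """Return the names that `nm` reports at exactly `address`."""
    found = []
    for line in nm_text.splitlines():
        fields = line.split()
        # Defined symbols have three columns: value, type letter, name.
        # Undefined symbols have no value column and are skipped.
        if len(fields) == 3 and int(fields[0], 16) == address:
            found.append(fields[2])
    return found


# Illustrative nm output for a linked kernel image.
SAMPLE = """\
c0000000 T kernel_main
c0000040 T print_string
c0001000 D boot_message
         U missing_symbol
"""

if __name__ == "__main__":
    print(symbols_at(SAMPLE, 0xC0000000))
```

With the stub approach the name at 0xC0000000 would be the stub's function rather than kernel_main, which is the expected outcome in that case.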