Finding the raw offset of .text section in a PE file


In IDA, the .text section begins at 0x01001630.
In the file, those bytes are located at raw offset 0xA30.
The PointerToRawData field in the section table for the .text section is 0x400, which is exactly 0xA30 - 0x630.
I'm not sure how to derive 0xA30 from the headers of the PE file. Any help is appreciated.

The first 0x630 bytes of the .text section are the IAT (import address table), which IDA has split off into a new .idata segment:
1000 [ 630] RVA [size] of Import Address Table Directory
Name    Start     End
----    -----     ---
HEADER  01000000  01001000
.idata  01001000  01001630  <- added by IDA
.text   01001630  01054000
.idata  01054000  01054004  <- added by IDA
.data   01054004  01059000
If you uncheck [x] Make imports segment in the initial load dialog, you'll get the unmodified section table:
Name    Start     End
----    -----     ---
HEADER  01000000  01001000
.text   01001000  01054000
.data   01054000  01059000

I believe this is a case of IDA trying to be clever. The .text section does actually start at file offset 0x400 (RVA 0x1000). IDA realises that the start of the .text section contains API import data, so it presents that range as a separate segment named .idata. If you look at all the section names in the PE header, you will see that there is no .idata section. Look at the data directories in the same header: the import address table starts at RVA 0x1000 and has a size of, you guessed it, 0x630.
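The arithmetic behind both answers can be written out as a small sketch: subtract the image base to get an RVA, find the section that contains it, then add that section's PointerToRawData. The Section tuple and the function name are hypothetical, the image base is read off the IDA listing, and the section extent is taken from the IDA listing rather than from the real VirtualSize field, so treat those values as assumptions.

```python
from collections import namedtuple

# Hypothetical container holding only the section-table fields used here.
Section = namedtuple(
    "Section", "name virtual_address virtual_size pointer_to_raw_data"
)

def rva_to_file_offset(rva, sections):
    """Translate an RVA to a raw file offset using the section table."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return rva - s.virtual_address + s.pointer_to_raw_data
    raise ValueError("RVA 0x%X is not inside any section" % rva)

IMAGE_BASE = 0x01000000  # assumption: HEADER starts here in the IDA listing

sections = [
    # Extent 0x53000 comes from the IDA listing (0x01001000..0x01054000),
    # not from the header's VirtualSize field, which may differ.
    Section(".text", 0x1000, 0x53000, 0x400),
]

va = 0x01001630                      # where IDA shows the code starting
offset = rva_to_file_offset(va - IMAGE_BASE, sections)
print(hex(offset))                   # 0xa30
```

With the values from the question this gives 0x1630 - 0x1000 + 0x400 = 0xA30, which matches the raw offset observed in the file.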

Related

What does the "Rva+Base" mean in a MSVC linker generated map file?

I compiled a C program and linked with the Microsoft linker link.exe.
The linker generated a *.map file. Below are some snippets:
Start          Length     Name             Class
0001:00000000  00000180H  .text            CODE
0001:00000180  00049158H  .text$mn         CODE
0002:00000000  000090c4H  .rdata           DATA
0002:000090c4  00000130H  .rdata$zzzdbg    DATA
0003:00000000  00002060H  .data            DATA
0003:00002060  00001370H  .bss             DATA
0004:00000000  00002790H  .pdata           DATA
0005:00000000  000003e4H  .xdata           DATA

Address        Publics by Value  Rva+Base          Lib:Object
0002:000000c0  _gVar             0000000000049660  MyLib:File1.obj
My understanding is:
The .rdata section has the index 0002 and the length 0x90c4.
The _gVar global variable has the address 0002:000000c0, so it lives in the .rdata section at section offset 0xc0.
The base address of the .rdata section is decided by the loader at runtime.
My confusion is about the Rva+Base column.
I checked the binary file at offset 0x49660. It contains exactly the value I assigned to _gVar. So it seems Rva+Base is the offset from the beginning of the file.
But I think Rva + Base should equal the absolute virtual address, which can only be decided by the loader at runtime.
How could it be the file offset? Or is it an unfortunate typo in the column name?
(By the way, I searched a lot for an authoritative reference on the .map file generated by link.exe, but had no luck so far. If anyone can share a resource, it will be deeply appreciated.)
ADD 1 - 5:31 PM 4/8/2019
Based on 500 - Internal Server Error's comment, I used CFF Explorer to inspect the PE sections.
[Image: section table in CFF Explorer]
According to the PE spec:
Virtual Address: For executable images, the address of the first byte of the section relative to the image base when the section is loaded into memory.
Raw Address/PointerToRawData: The file pointer to the first page of the section within the COFF file.
These two columns hold the same values in my scenario.
And in the PE Optional Header, the ImageBase is 0 (because of my link flag /BASE:0).
[Image: optional header in CFF Explorer]
When the linker generates the .map file, it has no knowledge of where the image will be loaded at runtime, so it has to use the preferred ImageBase (0 here) to calculate the Rva+Base column.
The absolute virtual address of _gVar = ImageBase (0) + .rdata Virtual Address (0x495A0) + offset into .rdata (0xc0) = 0x49660
In my case,
the Virtual Address column has the same value as the Raw Address column,
and ImageBase = 0 has the same effect as a file offset that starts at 0.
So the final absolute virtual address equals the file offset, but this is specific to my scenario.
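The coincidence described above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, the .rdata values are the ones quoted in the question, and the alternative base 0x140000000 is a hypothetical value chosen to show that the two numbers diverge as soon as the base is not 0.

```python
def rva_plus_base(image_base, section_rva, offset_in_section):
    """Value the map file lists under Rva+Base (preferred base, not runtime base)."""
    return image_base + section_rva + offset_in_section

def file_offset(section_rva, pointer_to_raw_data, rva):
    """Raw file offset of an RVA that lies inside the given section."""
    return rva - section_rva + pointer_to_raw_data

RDATA_RVA = 0x495A0   # Virtual Address of .rdata, from the question
RDATA_RAW = 0x495A0   # Raw Address of .rdata; equal to the RVA in this image
GVAR_OFF = 0xC0       # 0002:000000c0 in the map file

# With /BASE:0 the Rva+Base value and the file offset coincide.
addr = rva_plus_base(0, RDATA_RVA, GVAR_OFF)
raw = file_offset(RDATA_RVA, RDATA_RAW, RDATA_RVA + GVAR_OFF)
print(hex(addr), hex(raw))            # 0x49660 0x49660

# Hypothetical non-zero preferred base: the column no longer looks like a file offset.
print(hex(rva_plus_base(0x140000000, RDATA_RVA, GVAR_OFF)))  # 0x140049660
```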

How can i make the gnu linker create an output section similar to .bss?

In an embedded environment I want to convey information about a special part
of memory (start address and length) from the build process to the program
loader. My idea is to let the linker create an output section similar to .bss,
i.e. a section that does not occupy space in the ELF file and has the same flags
as the .bss section. I came to this idea since I am already using a customized
linker script.
When processing the ELF file, my customized loader could recognize this section
by a magic name and use the section's size and VMA as the description of the
special part of memory.
When I say it should be similar to .bss, I mean the output of objdump -h
should be similar to this:
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
...
  7 .bss          00000204  10204c9c  10204c9c  00005c40  2**2
                  ALLOC
...
I guess it is important that only the flag ALLOC is present here, but not LOAD
or CONTENTS.
Can this be achieved with some instructions in the linker script?
If so, what are those instructions?

Browsing through similar questions here at Stack Overflow and the ld documentation showed that the solution is quite simple:
.special_start = 0x20000000;
.special_size = 0x10000000;
.special .special_start (NOLOAD) :
{
    . = . + .special_size;
}
gives this output from objdump -h:
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
...
  6 .sbss         0000005c  10204c40  10204c40  00005c40  2**2
                  ALLOC, SMALL_DATA
  7 .bss          00000204  10204c9c  10204c9c  00005c40  2**2
                  ALLOC
  8 .special      10000000  20000000  20000000  00006000  2**0
                  ALLOC
...
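On the consuming side, the loader has to pick the section out by name and read its VMA and size. A real loader would read the ELF section headers directly; as a rough illustration only, the sketch below does the same lookup on the text that objdump -h prints. The function names and the sample text are assumptions made for this example.

```python
import re

# One section row of `objdump -h`: index, name, four hex fields, alignment.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)"
    r"\s+2\*\*(\d+)\s*$",
    re.IGNORECASE,
)
# The flags line that follows each row, e.g. "ALLOC, SMALL_DATA".
FLAGS = re.compile(r"^[A-Z_]+(,\s*[A-Z_]+)*$")

def parse_objdump_sections(text):
    """Return {name: {"size", "vma", "flags"}} parsed from `objdump -h` text."""
    sections = {}
    current = None
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            current = m.group(2)
            sections[current] = {
                "size": int(m.group(3), 16),
                "vma": int(m.group(4), 16),
                "flags": set(),
            }
        elif current and FLAGS.match(line.strip()):
            sections[current]["flags"] = {f.strip() for f in line.split(",")}
    return sections

def find_special(text, name=".special"):
    """Return (vma, size) if the named section is ALLOC without LOAD/CONTENTS."""
    sec = parse_objdump_sections(text).get(name)
    if sec and "ALLOC" in sec["flags"] and not sec["flags"] & {"LOAD", "CONTENTS"}:
        return sec["vma"], sec["size"]
    return None

SAMPLE = """\
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  6 .sbss         0000005c  10204c40  10204c40  00005c40  2**2
                  ALLOC, SMALL_DATA
  8 .special      10000000  20000000  20000000  00006000  2**0
                  ALLOC
"""

vma, size = find_special(SAMPLE)
print(hex(vma), hex(size))   # 0x20000000 0x10000000
```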

What is placed at the end of hex file generated by Keil

I've run into a problem I cannot answer. I've been placing a table of 3 bytes after the section where my program lies:
const uint8 AppVersion[] __attribute__((at(0x08006E00))) = {1,1,3};
What I got in the hex file generated by Keil was the table plus extra data:
[Image: end of hex file after data added at arbitrary address]
When I don't use the table, I still get the same "extra" data (364 bytes) at the end of the hex file:
[Image: end of hex file normally]
Could you tell me what is placed at the end of the application? I haven't found any clue in the .map file.
Thanks!
Paweł

You have to look in the .map file to see what the linker is putting there. It can be your code or a library. You are using an absolute address, not a reference relative to the "end of image".
Instead, use a custom linker file (scatter file) to explicitly place that table at the end of the image.
LR_IROM1 0x08000000 0x0007000 {
    ; Program ROM Area
    ER_IROM1 0x08000000 (0x0007000-3) {
        *.o (RESET, +First)
        *(InRoot$$Sections)
        .ANY (+RO)
    }
    ; Program SRAM Area
    RW_IRAM1 0x20000000 0x00001000 {
        .ANY (+RW +ZI)
    }
    ; Version area
    VERSION (0x08000000 + (0x0007000-3)) 0x3 {
        version.o
    }
}
I have no idea about your target layout, so adjust the numbers yourself.
Once on a sunny day I wrote a little tool to read the map file. Maybe it works on your version of Keil?
Update:
You've shared the .sct file (linker file).
LR_IROM1 0x08000000 0x00008000 {
    ER_IROM1 0x08000000 0x00008000 {
        *.o (RESET, +First)
        *(InRoot$$Sections)
        .ANY (+RO)
    }
    RW_IRAM1 0x20000000 0x00001000 {
        .ANY (+RW +ZI)
    }
}
Your ROM region LR_IROM1 spans from 0x08000000 to 0x08007FFF.
Therefore 0x08006E00 isn't the end of your image, and the linker is allowed to put anything (.ANY) after the statically placed object AppVersion.
If you don't want that, explicitly tell the linker to create a region only for your version object, as in the example above.
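The numbers in this answer can be checked with a short calculation. The sketch below only uses values quoted in the thread (region base and size from the shared .sct file, table address and length from the question); the variable names are made up for illustration.

```python
ROM_BASE = 0x08000000    # LR_IROM1 base from the shared .sct file
ROM_SIZE = 0x00008000    # LR_IROM1 size from the shared .sct file
TABLE_ADDR = 0x08006E00  # address used in __attribute__((at(...)))
TABLE_LEN = 3            # AppVersion holds 3 bytes

rom_end = ROM_BASE + ROM_SIZE            # first address past the region
last_rom_byte = rom_end - 1
free_after_table = rom_end - (TABLE_ADDR + TABLE_LEN)
end_aligned_addr = rom_end - TABLE_LEN   # where the table would sit at the very end

print(hex(last_rom_byte))      # 0x8007fff
print(free_after_table)        # 4605
print(hex(end_aligned_addr))   # 0x8007ffd
```

There are 4605 bytes of ROM left after the table, which is more than enough room for the 364 bytes the linker placed there.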

How can I fix overlapping sections in a linker script with a phdrs command?

I'm trying to make a simple higher-half operating system kernel. When using Grub as the bootloader, as I am, there must also be some lower-half (32 bit) code. Because I want to keep this 32 bit code as brief as possible, I do not want to write an ELF loader in it just to load the 64 bit code, because that would be patently absurd (this is in fact the most common solution, but I would like to avoid it if possible).
I discovered that linker scripts allow load addresses that differ from virtual addresses. This is useful so that I can load the 64 bit sections to fit in a small binary and then use virtual memory to map the proper virtual addresses to the physical addresses they were loaded at. This works except that the low text section is not put in the text segment. The entry point, _start, is in this section.
I can't put the low text section (where _start resides) in the text segment unless I specify the text segment in a PHDRS command. Of course, using this command makes the linker decide to take a pass on generating the normally expected segments. When I do this too, the sections end up overlapping and I am not entirely sure why. I specify the segments in the order data, rodata, text, and the sections are the same, and yet their load memory addresses are assigned with rodata and data swapped and all three overlapping.
Here is my linker script:
ENTRY(_start)
PHDRS {
    .low PT_LOAD FILEHDR PHDRS;
    .data PT_LOAD;
    .rodata PT_LOAD;
    .text PT_LOAD;
}
SECTIONS {
    . = 1M;
    .data_low BLOCK(4K) : ALIGN(4K) {
        *(.bss_low)
    } : .low
    .rodata_low BLOCK(4K) : ALIGN(4K) {
        KEEP(*(.multiboot_low))
        *(.rodata_low)
    } : .low
    .text_low BLOCK(4K) : ALIGN(4K) {
        *(.text_low)
    } : .low
    .stack 0xC0000000 : AT(0x200000) ALIGN(4K) {
        *(.bootstrap_stack)
    } : .data
    _LADD_ = LOADADDR(.stack) + SIZEOF(.stack);
    .data BLOCK(4K) : AT(_LADD_) ALIGN(4K) {
        *(COMMON)
        *(.bss)
    } : .data
    _LADD_ += SIZEOF(.data);
    .rodata BLOCK(4K) : AT(_LADD_) ALIGN(4K) {
        *(.rodata)
    } : .rodata
    _LADD_ += SIZEOF(.rodata);
    .text BLOCK(4K) : AT(_LADD_) ALIGN(4K) {
        *(.text)
    } : .text
}
I do not think the code is relevant to this error. When I link my object files using this linker script (additionally with -n --gc-sections), I get this error:
ld: section .data loaded at [0000000000200020,000000000020103f] overlaps section .rodata loaded at [0000000000200010,00000000002000d0]
ld: section .text loaded at [00000000002000d1,00000000002017ce] overlaps section .data loaded at [0000000000200020,000000000020103f]
The load memory addresses are in the order rodata, data, text, even though I expect they should be in the order data, rodata, text since the sections are specified in that order with AT specifiers with monotonically non decreasing positions (assuming the sections do not have negative size).
I should specify that I am using "segment" to mean one of the entries in the ELF program header (PHDRS in the linker script) and "section" to mean one of the entries in the ELF section header (SECTIONS in the linker script). I believe this to be correct terminology but acknowledge that I have an understanding of linker files and the ELF format that is limited at best. For whatever reason, Grub will not load an ELF file if its entry point is not in a segment.
Why indeed are the sections not in the order I expect them to be in, and how can I make them be? Thank you.
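To make the reported layout easier to see, the load ranges from the two ld messages can be compared directly. This sketch does not explain why ld chose these addresses; it only restates the ranges from the error output (treated as inclusive) and confirms the order and the overlaps described above. The helper name is hypothetical.

```python
# Inclusive load ranges copied from the ld error messages.
ranges = {
    ".data":   (0x200020, 0x20103F),
    ".rodata": (0x200010, 0x2000D0),
    ".text":   (0x2000D1, 0x2017CE),
}

def overlaps(a, b):
    """True if two inclusive [start, end] ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]

# Order of the sections by load address.
order = sorted(ranges, key=lambda name: ranges[name][0])
print(order)  # ['.rodata', '.data', '.text']

# Which pairs collide.
names = list(ranges)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        if overlaps(ranges[first], ranges[second]):
            print(first, "overlaps", second)
```

Only .rodata and .text are disjoint: .text begins one byte after .rodata ends, while .data overlaps both.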

Why does arm-none-eabi-size report the .data section to be 0 even though I am using initialized RAM?

I am a bit confused by the results I am getting from my toolchain's (Yagarto and CodeSourcery) size utility. It reports that I am using 0 bytes in the data section. See below:
$ arm-none-eabi-size.exe rest-server-example.crazy-horse.elf
   text    data     bss     dec     hex filename
  79364       0   34288  113652   1bbf4 rest-server-example.crazy-horse.elf
I know my code uses static RAM variables and initializes them to values other than 0.
Interestingly enough, when I pass the size tool some of the object files that get linked, I do see a .data section being reported.
Example:
   text    data     bss     dec     hex filename
   1648       0      20    1668     684 obj_crazy-horse/uip-nd6.o
    200      12    2652    2864     b30 obj_crazy-horse/uip-packetqueue.o
     12       0       0      12       c obj_crazy-horse/uip-split.o
   1816      24      48    1888     760 obj_crazy-horse/usb-core.o
    284       0       0     284     11c obj_crazy-horse/usb-interrupt.o
   2064      20     188    2272     8e0 obj_crazy-horse/xmac.o
Why would the ELF file report 0 for the .data section when the object files that make it up report non-zero values?
FYI, I am working on embedded software for an AT91SAM7x256 microcontroller.
edit:
Adding the CFLAGS and LDFLAGS:
CFLAGS += -O -DRUN_AS_SYSTEM -DROM_RUN -ffunction-sections
LDFLAGS += -L $(CPU_DIRECTORY) -T $(LINKERSCRIPT) -nostartfiles -Wl,-Map,$(TARGET).map
edit #2:
From the object dump we can clearly see that the .data section has data assigned to it, but the size utility is not picking it up for some reason.
All I am looking for is the exact usage of my RAM. I am not trying to figure out whether one of my variables was optimized out.
edit 3:
More information showing that the size utility does see something in the .data section:
$ arm-none-eabi-size.exe -A -t -x rest-server-example.crazy-horse.elf
rest-server-example.crazy-horse.elf  :
section              size       addr
.vectrom             0x34   0x100000
.text             0x10fc8   0x100038
.rodata            0x149c   0x111000
.ARM.extab           0x30   0x11249c
.ARM.exidx           0xe0   0x1124cc
.data              0x1028   0x200000
.bss               0x7bec   0x201028
.stack              0xa08   0x20f5f8
.ARM.attributes      0x32        0x0
.comment             0x11        0x0
.debug_aranges      0xc68        0x0
.debug_info       0x2b87e        0x0
.debug_abbrev      0x960b        0x0
.debug_line        0x9bcb        0x0
.debug_frame       0x4918        0x0
.debug_str         0x831d        0x0
.debug_loc        0x13fad        0x0
.debug_ranges       0x620        0x0
Total             0x7c4c5

My interpretation would be that the linker script creates a single loadable section, which contains the initial values of the data section together with a piece of startup code that copies those values into the data section in RAM.
This is necessary if you want a single image file that can run from read-only memory, because there is then no ELF loader in front of it that would perform that copy for you.
Normally this is done only in the section-to-segment mapping (i.e. the output sections are arranged in the linker script using the > region placement syntax) rather than by mapping the input section twice, but that is certainly possible as well.
The usage numbers are quite accurate: the text size is the amount of Flash space needed, and the BSS size is the amount of RAM needed. Initialized data is counted twice, once for the initial data in Flash and once for the modifiable data in RAM.

Your .data section has the CODE attribute set, and this confuses "arm-none-eabi-size". The size of the .data section is incorrectly added to the total text size instead of the data size.
My guess is that you have some code that is stored in flash but copied to RAM at run time, such as a fast interrupt handler or a flash reprogramming routine that must run from RAM. This sets the CODE attribute for the data section, and "size" then believes that all of .data is text.
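Since the question is ultimately about exact RAM usage, one way to sidestep the text/data classification entirely is to sum the sections of the `size -A -x` listing by address. The sketch below is only an illustration: the function names are hypothetical, and the RAM window 0x200000..0x210000 is an assumption read off the listing (where .data starts and .stack ends), not taken from the device datasheet.

```python
def parse_size_a(text):
    """Return (name, size, addr) tuples from `size -A -x` style output."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Section rows have exactly three fields, the last two in hex.
        if len(parts) == 3 and parts[1].startswith("0x") and parts[2].startswith("0x"):
            rows.append((parts[0], int(parts[1], 16), int(parts[2], 16)))
    return rows

def bytes_in_range(rows, start, end):
    """Sum the sizes of sections whose address lies in [start, end)."""
    return sum(size for _, size, addr in rows if start <= addr < end)

# Assumed RAM window, inferred from the listing in the question.
RAM_START, RAM_END = 0x200000, 0x210000

LISTING = """\
section              size       addr
.text             0x10fc8   0x100038
.rodata            0x149c   0x111000
.data              0x1028   0x200000
.bss               0x7bec   0x201028
.stack              0xa08   0x20f5f8
.comment             0x11        0x0
Total             0x7c4c5
"""

rows = parse_size_a(LISTING)
ram_used = bytes_in_range(rows, RAM_START, RAM_END)
print(ram_used)   # 38428
```

With the sizes from the question, .data + .bss + .stack comes to 0x961c (38428) bytes, regardless of which column the summary output assigns .data to.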
