How can I extract the addresses of constants added by compiler optimization from an ELF file? - c

I'm writing a code size analysis tool for my C program, using the output ELF file.
I'm using readelf --debug-dump=info to dump the DWARF information.
I've noticed that, as part of optimization, my compiler adds new constants to the .rodata section that are not in the DWARF data.
So the .rodata section size includes their sizes, but I don't have their sizes in DWARF.
Here is an example from the map file:
*(.rodata)
.rodata 0x10010000 0xc0 /<.o file0 path>
0x10010000 const1
0x10010040 const2
.rodata 0x100100c0 0xa /<.o file1 path>
fill 0x100100ca 0x6
.rodata 0x100100d0 0x6c /<.o file2 path>
0x100100d0 const3
0x100100e0 const4
0x10010100 const5
0x10010120 const6
fill 0x1001013c 0x4
In file1 above, although I didn't declare any const variable, the compiler did: this constant takes up space in .rodata, yet there is no symbol/name for it.
Here is the code inside some function that generates it:
uint8 arr[3][2] = {{146,179},
{133, 166},
{108, 141}} ;
So the compiler adds some constant values to optimize loading the array.
How can I extract these hidden additions from the data sections?
I want to be able to fully characterize my code: how much space is used in each file, etc.

I am guessing here - it will be linker dependent, but when you have code such as:
uint8 arr[3][2] = {{146,179},
{133, 166},
{108, 141}} ;
arr at run-time exists in r/w memory, but its initialiser will be located in R/O memory to be copied to the R/W memory when the array is initialised. The linker need only provide the address, because the size will be known locally as a compile-time constant embedded as a literal in the initializing code. Consequently the size information does not appear in the map, because the linker discards that information.
Length is, however, implied by the addresses of adjacent objects or of fill space. So for example:
The size of const1 is equal to const2 - const1, and for const6 it is 0x1001013c - const6.
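The subtraction above can be sketched in C. This is a minimal illustration, not part of any tool: the function name sizes_from_addresses is hypothetical, and the addresses are assumed to have been parsed from the map file and sorted already. Note that each result includes any padding or unnamed data lying between two symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill sizes[i] with the distance from addrs[i] to the next symbol, or to
 * `end` for the last one. `addrs` must be sorted in ascending order and
 * `end` is the first address past the region (section end or fill start).
 * Hypothetical helper for illustration only.
 */
void sizes_from_addresses(const uint32_t *addrs, size_t count,
                          uint32_t end, uint32_t *sizes)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t next = (i + 1 < count) ? addrs[i + 1] : end;
        assert(next >= addrs[i]);   /* input must be sorted */
        sizes[i] = next - addrs[i];
    }
}
```

For file2 in the map above, passing the addresses of const3 to const6 with end = 0x1001013c gives 0x10, 0x20, 0x20 and 0x1c bytes.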
It is all rather academic however - you have precise control over this in terms of the size of your constant initialisers. They are not magically created data unrelated to your code, and I am not convinced that they are a product of optimization as you suggest. The non-zero initialisers must exist regardless of optimisation options, and in any case optimisation primarily affects the size and/or speed of code (.text) rather than data. The impact on data sizes is likely to relate only to padding and alignment and, in debug builds, possibly "guard-space" for overrun detection.
However, there is no need at all for you to guess. You can determine how this data is used by inspecting the disassembly or observing its execution (at the instruction level) in a debugger, to see exactly where initialised variables are copying the data from. You could even place a read-access breakpoint at these addresses to determine directly what code is using them.

To get the size of an ELF file in detail, use:
"You can use nm and size to get the size of functions and ELF sections.
To get the size of the functions (and objects with static storage duration):
$ nm --print-size --size-sort --radix=d tst.o
The second column shows the size in decimal of function and objects.
To get the size of the sections:
$ size -A -d tst.o
The second column shows the size in decimal of the sections."
Tool to analyze size of ELF sections and symbol

Related

Why does GCC not assign the static variable when it is initialized to 0

I initialize a static variable to 0, but when I look at the assembly code, I find that memory is only allocated for the variable; no value is assigned.
When I initialize the static variable to other numbers, I can see that the memory is assigned a value.
My guess is that GCC assumes the OS will initialize the memory to 0 before we use it.
The GCC options I use are "gcc -m32 -fno-stack-protector -c -o".
When I initialize the static variable to 0, the c code and the assembly code:
static int temp_front=0;
.local temp_front.1909
.comm temp_front.1909,4,4
When I initialize it to other numbers, the code is:
static int temp_front=1;
.align 4
.type temp_front.1909, #object
.size temp_front.1909, 4
temp_front.1909:
.long 1
TL;DR: GCC knows the BSS is guaranteed to be zero-initialized on the platform it's targeting, so it puts zero-initialized static data there.
Big picture
The program loader of most modern operating systems gets two different sizes for each part of the program, like the data part. The first size it gets is the size of data stored in the executable file (like a PE/COFF .EXE file on Windows or an ELF executable on Linux), while the second size is the size of the data part in memory while the program is running.
If the data size for the running program is bigger than the amount of data stored in the executable file, the remaining part of the data section is filled with bytes containing zero. In your program, the .comm line tells the linker to reserve 4 bytes without initializing them, so that the OS zero-initializes them on start.
What does gcc do?
gcc (or any other C compiler) allocates zero-initialized variables with static storage duration in the .bss section. Everything allocated in that section will be zero-initialized on program startup. For allocation, it uses the .comm directive, which just specifies the size (4 bytes) and alignment.
You can see the size of the main section types (code, data, bss) using the size command. If you initialize the variable with one, it is included in a data section, and occupies 4 bytes there. If you initialize it with zero (or not at all), it is instead allocated in the .bss section.
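As a small sketch of the distinction described above, the following file can be compiled with gcc -c and inspected with size. The variable and function names are hypothetical, and the section placement in the comments is what GCC typically does by default, not a guarantee for every target or flag combination.

```c
/* Hypothetical example file; placement comments describe typical GCC defaults. */

static int zero_init = 0;     /* zero initializer: typically placed in .bss  */
static int nonzero_init = 1;  /* nonzero initializer: typically in .data     */
static int no_init;           /* no initializer: also zero, typically .bss   */

/* The variables are modified so the compiler cannot fold them away. */
int bump_zero_init(void)    { return ++zero_init; }
int bump_nonzero_init(void) { return ++nonzero_init; }
int bump_no_init(void)      { return ++no_init; }
```

Only the nonzero initializer needs bytes stored in the object file; the other two variables only need their size recorded.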
What does ld do?
ld merges all data-type sections of all object files (even those from static libraries) into one data section, followed by all .bss-type sections. The executable output contains a simplified view for the operating system's program loader. For ELF files, this is the "program header". You can take a look at it using objdump -p for any format, or readelf -l for ELF files.
The program headers consist of entries of different types. Among them are a couple of entries with the type PT_LOAD describing the "segments" to be loaded by the operating system. One of these PT_LOAD entries is for the data area (where the .data section is linked). It contains a field called p_filesz that specifies how many bytes for initialized variables are provided in the ELF file, and a field called p_memsz telling the loader how much space in the address space should be reserved. The details of which sections get merged into which PT_LOAD entries differ between linkers and depend on command line options, but generally you will find a PT_LOAD entry that describes a region that is both readable and writeable, but not executable, and has a p_filesz value that is smaller than its p_memsz value (potentially zero if there's only a .bss and no .data section). p_filesz is the size of all read+write data sections, whereas p_memsz is bigger to also provide space for zero-initialized variables.
The amount by which p_memsz exceeds p_filesz is the sum of all .bss sections linked into the executable. (The values might be off a bit due to alignment to pages or disk blocks.)
See chapter 5 in the System V ABI specification, especially pages 5-2 and 5-3 for a description of the program header entries.
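The p_memsz/p_filesz relationship can be sketched in C. This assumes a Linux-style <elf.h> that provides Elf32_Phdr and PT_LOAD (the question used -m32, hence the 32-bit type), and it assumes the program header table has already been read into memory; the function name zero_fill_bytes is hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum, over all PT_LOAD entries, the bytes that exist in memory but not in
 * the file (p_memsz - p_filesz). This is the space the loader must zero,
 * i.e. roughly the .bss. Hypothetical helper for illustration only.
 */
uint32_t zero_fill_bytes(const Elf32_Phdr *phdrs, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        const Elf32_Phdr *ph = &phdrs[i];
        if (ph->p_type != PT_LOAD)
            continue;                      /* only loadable segments count */
        if (ph->p_memsz > ph->p_filesz)
            total += ph->p_memsz - ph->p_filesz;
    }
    return total;
}
```

As noted above, the result may differ slightly from the sum of the .bss section sizes because of alignment.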
What does the operating system do?
The Linux kernel (or another ELF-compliant kernel) iterates over all entries in the program header. For each entry containing the type PT_LOAD it allocates virtual address space. It associates the beginning of that address space with the corresponding region in the executable file, and if the space is writeable, it enables copy-on-write.
If p_memsz exceeds p_filesz, the kernel arranges the remaining address space to be completely zeroed out. So the variable that got allocated in the .bss section by gcc ends up in the "tail" of the read-write PT_LOAD entry in the ELF file, and the kernel provides the zero.
Any whole pages that have no backing data can start out copy-on-write mapped to a shared physical page of zeros.
Why does GCC not assign ...
Most modern OSs will automatically zero-initialize the BSS section.
Using such an OS an "uninitialized" variable is identical to a variable that is initialized to zero.
However, there is one difference: The data of uninitialized variables are not stored in the resulting object and executable files; the data of initialized variables is.
This means that "real" zero-initialized variables may lead to a larger file size compared to uninitialized variables.
For this reason the compiler prefers using "uninitialized" variables if variables are really zero-initialized.
The GCC option I use is ...
Of course there are also operating systems which do not automatically initialize "uninitialized" memory to zero.
As far as I remember, Windows 95 is an example of this.
If you want to compile for such an operating system, you may use the GCC command line option -fno-zero-initialized-in-bss. This command line option forces GCC to "really" zero-initialize variables that are zero-initialized.
I just compiled your code with that command line option; the output looks like this:
.data
.align 4
.type temp_front, #object
.size temp_front, 4
temp_front:
.zero 4
Even on Windows 95, there is no point in adding zero-initialisation code to every compiled module. Maybe the Win95 program loader (or even MS-DOS) does not initialize the bss section, but the "crt0" init module (linked into every compiled C/C++ program, and which finally calls main() or the DLL entry point) can do that directly in one fast operation for the whole BSS section. Its size is already in the program header and can also be obtained from a static preinitialized variable whose value is computed by the linker, so there is no need to change the way each module is compiled with gcc.
However, automatic variables (local variables allocated on the stack) are more difficult: the compiler does not know whether a variable will be initialized if its first use is by reference in a call parameter to a non-inlined function (which may be in another module compiled separately, or linked from an external library or DLL) that is supposed to fill it.
GCC only knows when the variable is explicitly assigned in the function itself, but if it is used by reference only, GCC can now forcibly preinitialize it to zero to prevent it from keeping a sensitive value left on the stack. In that case, this adds some zero-fill code to the compiled function's preamble for these local variables, which helps prevent some data leaks (such a leak is generally unlikely when the variable is a simple type, but when it is a whole structure, many fields may be left in a random state by the subcall).
C11 says that code assuming initialization of auto variables has "undefined" behavior. But GCC helps close the security risk: C11 allows this because forced zeroing is better than leaving a random value, and both behaviors conform to the "undefined" behavior: zero is just as acceptable as a randomly leaked value.
Some secure functions also avoid leaving sensitive data behind when returning: they explicitly clear the variables they no longer need, to avoid exposing them after the function returns (notably when returning from privileged code to unprivileged code). This is good practice, but it is independent of the forced initialization of auto variables that are used by reference in subcalls before they were initialized. And GCC is smart enough not to forcibly initialize these auto variables when there is explicit code that assigns them a value, so the impact is minimal. This feature may be disabled in GCC for apps that want micro-optimizations in terms of performance, but in both cases it does not add to the BSS size, and the image size grows by only <0.1% for the Linux kernel, solely because of the few bytes of code compiled into the few functions that benefit from this security fix.
And this has no effect on "uninitialized" static variables, which GCC puts in the BSS section, cleared by the OS program loader or by the program's small crt0 init module.

Why is there no content for the .bss section in an object (ELF) file?

This question confused me a lot. As far as I know, the .bss section is for saving data that is initialized but not used yet. But I don't understand what 'content' means here, and why there is no content.
Thanks for any help!
The quick response is: well, there's no content to fill the .bss with, so there's no sense in putting any data in the executable for that section. Only the positions of the variables are stored, but that belongs to another ELF section.
The .bss section is where your program has all the uninitialized variables (by default all initialized to zero). The linker only needs to know the actual size of this region and the actual variable positions, but not the values, because its contents are obvious, independently of the nature or the distribution of the variables put there.
When your program is loaded, the kernel normally assigns a read-only segment for the unmodifiable text of the program (.text section) and also puts in that segment the contents of the initialized const variables (.rodata section), so if you attempt to modify something there, you get an exception. Then comes the initialized data section with the initial values of all the initialized variables of your program (.data section), followed by the uninitialized ones (.bss section).
The data segment (note that I distinguish between a section and a load segment) is given more space, the sum of the .data and .bss sections, to hold all the variables (both are included, which is why it uses its own length). But while the contents of the .data section have to be filled from the file, the contents of the .bss section don't, because they are all zeroed by the operating system before the user process is allowed to access the allocated segment. That's not true for small systems, where the operating system doesn't fill the data with zeros, but there the compiler adds some code to zero the whole .bss section, so again there's no need to copy any data from the executable file.
The historic (and main) reason for this behaviour is that the pages the kernel assigns to be loaded with your program are cleared to zero for security reasons (so you cannot, by luck, get a page full of other users' passwords or other sensitive information). So there's no reason to fill them with zeros again, nothing has to be copied there, and there's no reason to put anything in the executable file. The pages the kernel maintains are normally zeroed only when they are about to be given to a user, but otherwise keep (as they are designed for that purpose) their information until they are overwritten.
There's no content in the BSS (Block Started by Symbol) section because it would be wasted storage. The contents of the BSS are all zeros, and it is cleared by the startup code before main is called. Think of the BSS as a run-length compressed block of bytes. All you need to know to uncompress that block is the value (0) and the length, which is stored in the ELF entry for the BSS.
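The "uncompress" step performed by startup code on a small system can be sketched as a plain zeroing loop. This is an illustration only: the function name zero_region is hypothetical, and on a real target the start and end pointers would come from symbols defined by the linker script, whose names vary between toolchains.

```c
/*
 * Zero every byte in [start, end). A crt0-style startup routine would call
 * something like this once, before main, over the whole .bss region.
 * Hypothetical helper for illustration only.
 */
void zero_region(unsigned char *start, unsigned char *end)
{
    while (start < end)
        *start++ = 0;
}
```

Because only the two boundary addresses are needed, no byte of the region has to be stored in the executable.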
Your notion of "data that [is] initialized but not used yet" is a bit off. Consider that all sections in an ELF file are somehow "not used yet". The text segment may or may not become used (it may contain dead/unreachable code). The data segment may or may not be used at all (you can define objects never used by code).

Understanding certain ELF file structure

From ARM's infocenter, regarding section static linking and relocations:
** Section #1 'ER_RO' (SHT_PROGBITS) [SHF_ALLOC + SHF_EXECINSTR]
Size : 28 bytes (alignment 4)
Address: 0x00008000
$a
.text
bar
0x00008000: E59f000C .... LDR r0,[pc,#12] ; [0x8014] = 0x801C
0x00008004: E5901000 .... LDR r1,[r0,#0]
0x00008008: E2411001 ..A. SUB r1,r1,#1
0x0000800C: E5801000 .... STR r1,[r0,#0]
0x00008010: E12FFF1E ../. BX lr
$d
0x00008014: 0000801C .... DCD 32796
$a
.text
foo
0x00008018: EAFFFFF8 .... B bar ; 0x8000
and from ELF for the ARM architecture:
Table 4-7, Mapping symbols
Name Meaning
$a - Start of a sequence of ARM instructions
$d - Start of a sequence of data items (for example, a literal pool)
As you can see, the ELF file contains a section in which there is code (bar), then read-only data (32796), then more code (foo) at consecutive addresses.
Now, a basic principle regarding any software file structure is that the software is composed of different and separate sections: text (code), data, and bss (and rodata if we want to be pedantic), as we can see if we examine the MAP file.
This ELF structure is not consistent with that basic principle, so my question is: what is going on here? Am I mistaken about this basic principle? If not, will this ELF structure be changed at run time to meet the separation of sections?
And why does the ELF section contain mixed types in one sequential address range?
NOTE: I assume the scatter file used in the example is the default one, since the document containing the example does not provide any scatter file along with it.
At run time, the sections do not matter, only the PT_LOAD segments in the program header. The ELF specification is quite flexible there as well, but some loaders have restrictions on the PT_LOAD segments they can process.
The reason for splitting code and data this way could be that this architecture supports only a limited range of PC-relative addressing and needs a constant pool for loading most constants (because constructing them via immediates is too expensive). Having as few large constant pools as possible is attractive because it leads to improved data and instruction cache utilization (instead of caching memory which is not of the right type and thus can never be used), but you may still need more than one if the code size exceeds what can be addressed directly.
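The PC-relative arithmetic in the listing from the question can be checked with a short C sketch. It relies on the fact that in ARM (A32) state the PC reads as the address of the current instruction plus 8; the function names are hypothetical, and the LDR offset is taken from the disassembly text rather than decoded from the opcode.

```c
#include <stdint.h>

/* Address read by "LDR rX,[pc,#imm]" located at insn_addr (ARM state). */
uint32_t arm_ldr_literal_address(uint32_t insn_addr, uint32_t imm)
{
    return insn_addr + 8u + imm;   /* PC reads as insn_addr + 8 */
}

/* Target of an ARM-state B/BL instruction word located at insn_addr. */
uint32_t arm_branch_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;  /* signed word offset */
    uint32_t offset = imm24 << 2;         /* convert words to bytes */
    if (imm24 & 0x00800000u)
        offset |= 0xFC000000u;            /* sign-extend to 32 bits */
    return insn_addr + 8u + offset;       /* wraps modulo 2^32 */
}
```

With the values from the listing, the LDR at 0x8000 with offset 12 reads the literal at 0x8014, and the branch word EAFFFFF8 at 0x8018 targets 0x8000 (bar). The literal has to sit within the limited offset range of the load, which is why it is placed right after the function in the same section.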

In which segment is constant data stored?

I was trying to understand in which segment constant data is stored.
For example:
const int x = 100;
1) Where is x stored? In the code segment or the data segment?
2) Does a 'read only' data segment exist within the initialized data segment?
The data will be stored in any segment, or multiple segments, or no segment at all, depending on the specifics of the compiler, compiler flags, linker, linker flags, and surrounding code.
On Linux/ELF, if it is stored somewhere, it will usually get stored in the .rodata section which is inside a segment with r-x permissions. ELF segments do not have names, as far as I know.
However, in many cases, constants like these will get inlined.
Probably not in any segment since such constants are resolved at compile time, and their values used directly when needed.
In general there is the rodata section, which indeed stands for read-only data: a special section, usually inside the data segment, meant to be used for exactly this purpose.

Why does .bss explicitly initialize a global variable to zero?

I am generating MIPS disassembly in order to simulate it. I need a large amount of data to work on, but I don't want big assembly files, so I wanted to work on a big uninitialized array (and then possibly initialize it in my simulator). So I need this array to be global. And global variables seem to be put in the .bss section, to be initialized when the page is actually accessed.
The problem is that in my binary the array is in the .bss section but is explicitly filled with zeros. This is not the behaviour I expected, if I understood correctly what I found on the internet. Is there a way to tell the compiler (or linker, or loader; I don't understand well which one does what here) not to actually put zeros in this array?
Or alternatively, is there a compiler option or a C construct to say we don't want this array to be initialized with 0? (I tried to change the array's section with an attribute, but it is still initialized with 0.)
By the way, I am generating my disassembly file with objdump, which normally skips blocks of zeros, but I really need the other blocks of zeros to be disassembled, so I am using the "-z" option.
What I really don't understand is that everywhere I looked, it was said that the .bss section doesn't actually put zeros in the binary file.
The data for the .bss section isn't stored in the compiled object files because, well, there is no data—the compiler puts variables in that segment precisely because they should be zero-initialized.
When the OS loads the executable, it just looks at the size of the .bss segment, allocates that much memory, and zero-initializes it for you. By not storing that data in the executable file, it reduces loading times.
If you want data to be initialized with certain data, then give it an initializer in your code. The compiler will then put it in the .data segment (initialized data) instead of .bss (uninitialized data). When the OS then loads the executable, it will allocate the memory for the data and then copy it in from the executable. This takes extra I/O, but your data is explicitly initialized how you want it.
Alternatively, you could leave the data in the .bss segment and then initialize it yourself at runtime. If the data is quick and easy to generate at runtime, it might be faster to recompute it at startup rather than read it off of disk. But those situations are probably rare.
I suspect that using the -z option is causing objdump to show you zeroes for the .bss, even though the zeroes are not actually in your binary. Try using od -t x4 to get a simple hexadecimal dump of what is really in the binary. If od shows you blocks of zeroes, then they really are in the binary.
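Another way to check whether a section really occupies bytes in the file is its section header type: sections of type SHT_NOBITS, which is what .bss normally is, take up no file space regardless of their size. The sketch below assumes a Linux-style <elf.h> and a section header that has already been read from the file; the function name section_file_bytes is hypothetical.

```c
#include <elf.h>
#include <stdint.h>

/*
 * Number of bytes a section occupies in the file itself. SHT_NOBITS
 * sections have a size in memory (sh_size) but store nothing on disk.
 * Hypothetical helper for illustration only.
 */
uint32_t section_file_bytes(const Elf32_Shdr *sh)
{
    return (sh->sh_type == SHT_NOBITS) ? 0u : sh->sh_size;
}
```

The same type information is shown in the Type column of readelf -S, where such sections are listed as NOBITS.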

Resources