What is the "Value" field in the output of readelf -s?

Here is my code:
#include <stdio.h>
int variable;
int main(){
printf("%p", &variable);
}
Output of a couple of runs:
~ % ./a.out
0x559bae5c4030
~ % ./a.out
0x55b9d1038030
~ %
As you can see, both addresses end in "30".
And the symbol table:
~ % readelf -s a.out | grep variable
Num: Value Size Type Bind Vis Ndx Name
51: 0000000000004030 4 OBJECT GLOBAL DEFAULT 23 variable
~ %
Again, the Value field ends in "30".
My question is: what exactly is the Value field, and how does it relate to the output of my program? And why are the last two digits the same in every run?
sorry for my poor english

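The Value field from readelf is the address the linker assigned to variable inside the executable a.out. For a position-independent executable such as this one, that address is relative to wherever the image ends up being loaded.
What your program prints is the actual address of variable at run time, which is the load address plus that Value. So the executable was loaded at 0x559bae5c0000 in the first run (0x559bae5c4030 - 0x4030) and at 0x55b9d1034000 in the second run (0x55b9d1038030 - 0x4030). The load address is always page-aligned (a multiple of 0x1000 here), so the low three hex digits of the run-time address, 030, always match the Value field.
You can see the load address by inspecting /proc/<PID>/maps while a.out is running.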
The load address changes from run to run because of Address Space Layout Randomization on Linux.

Related

Why are instructions addresses on the top of the memory user space contrary to the linux process memory layout?

#include <stdio.h>
void func() {}
int main() {
printf("%p", &func);
return 0;
}
This program outputted 0x55c4cda9464a
Supposing that func will be stored in the .text section, and according to this figure, from CS:APP:
I suppose that the address of func would be somewhere near the starting address of the .text section, but this address is somewhere in the middle. Why is this the case? Local variables stored on the stack have addresses near 2^48 - 1, but I tried to disassemble different C codes and the instructions were always located somewhere around that 0x55... address.
gcc, when configured with --enable-default-pie [1] (which many distributions do), produces Position Independent Executables (PIE) by default, as if you had passed the -pie option. This means the load address is not the one the linker would otherwise fix at link time (0x400000 for x86_64); it is chosen when the program is loaded. This is a security mechanism that lets Address Space Layout Randomization (ASLR) [2] apply to the executable itself.
If you compile with -no-pie option (gcc -no-pie file.c), then you can see the address of func is closer to 0x400000.
On my system, I get:
$ gcc -no-pie t.c
$ ./a.out
0x401132
You can also check the load address with readelf:
$ readelf -Wl a.out | grep LOAD
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x000478 0x000478 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x0001f5 0x0001f5 R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000158 0x000158 R 0x1000
LOAD 0x002e10 0x0000000000403e10 0x0000000000403e10 0x000228 0x000230 RW 0x1000
[1] You can check this with gcc --verbose.
[2] You may also notice that the address printed by your program is different in each run. That's because of ASLR. You can disable it with:
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
ASLR is enabled on Linux by default.

The address of "main" remains unchanged. Why does a local variable's address change every run?

Please refer to the below program:
#include<stdio.h>
int main()
{
int a, b;
printf("address of main =%p\n", main);
a=3;
printf("Address of 'a' =%p\n", &a);
return 0;
}
I compiled the above program using gcc and then ran the binary. I am getting the below output:
[root@localhost gdb]# ./a.out
address of main =0x400536
Address of 'a' =0x7ffc4802cbdc
[root@localhost gdb]# ./a.out
address of main =0x400536
Address of 'a' =0x7ffe2bdcd66c
[root@localhost gdb]#
Same source code compiled with -m32, now I'm getting the output:
[root@localhost gdb]# ./a.out
address of main =0x804841b
Address of 'a' =0xffa6b29c
[root@localhost gdb]# ./a.out
address of main =0x804841b
Address of 'a' =0xff9b808c
Here is my question: why does the range of the variable's address differ between the 64-bit and 32-bit builds when both run on a 64-bit kernel? The address of main remains unchanged, so why does the variable's address change on every run? And where is the address of the variable stored?
The software that loads programs intentionally varies the location of the stack in each execution to make it harder for attackers to exploit bugs.
The program knows where a is because its offset within the stack frame of main is built into it by the compiler, and the address of the stack frame for main comes from the stack pointer passed to main by the software that loads the program and starts main.

What is the ".data..init_task" section in the Linux Kernel code?

I am exploring the Linux Kernel code and came across this line of code:
#define __init_task_data __attribute__((__section__(".data..init_task")))
I know that something like:
int x __attribute__((__section__("section"))) = 10;
is an attribute of gcc which would put the symbol of x into the section "section" of the compiled process image. However when I try to specify ".data..init_task" as the section, my variable gets put into the .data section. Here is my code:
int apple __attribute__((__section__(".data..init_task"))) = 10;
Compiled with:
gcc test.c
Disassembled with:
objdump -D a.out
My variable "apple" appears under the .data section, there is no section ".data..init_task" which is what has stumped me.

Setting a constant in rodata

I am trying to understand how to set the value of a string in the rodata segment as loading it using an environment variable gives me issues.
I want to set a constant string in the rodata section from outside the program. This should be independent of the code executed. So, when I do
"objdump -c foo"
the rodata section must list this string without the file foo.c having to define it.
How do I set a constant in the .rodata section?
Edit: Linux OS and using GCC
I cannot use an environment var as that would mean that the c code is modified, I want the c code untouched and add the constant, say "Goo" to the rodata segment.
Then you need to write a program that lets you modify the binary file.
Read the ELF file specifications.
Then write a program that modifies the ELF program and section headers and adds the data to the .rodata section.
I've managed to write a small bash script that does more or less what I think you want.
First let's consider this sample program:
test.c
#include <stdio.h>
const char message[1024] = "world";
int main()
{
printf("hello %s\n", message);
}
The target variable will be message. Note that I will not change the size of the variable (that would be a mess), so be careful to reserve as much memory as you will ever need.
Now the script:
patchsym
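#!/bin/bash
# usage: patchsym PROGRAM SYMBOL < NEWCONTENT
EXE="$1"
SYMBOL="$2"
# Virtual address of the symbol, taken from the symbol table.
ADDR=$((0x$(objdump -t "$EXE" | grep " $SYMBOL$" | cut -d ' ' -f 1)))
# VMA and file offset of .rodata (objdump -h columns: Idx Name Size VMA LMA File-off Algn).
# This assumes the symbol lives in .rodata.
read -r VMA FOFF < <(objdump -h "$EXE" | awk '$2 == ".rodata" { print $4, $6 }')
# File offset of the symbol = address - section VMA + section file offset.
OFFS=$((ADDR - 0x$VMA + 0x$FOFF))
dd of="$EXE" bs=1 seek=$OFFS conv=notrunc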
The new message content will be:
newmsg
universe^@
(where ^@ is actually a NUL character).
Now just do:
$ gcc test.c -o test
$ ./test
hello world
$ ./patchsym test message < newmsg
$ ./test
hello universe

Why does the int type take up 8 bytes in the BSS section but 4 bytes in the DATA section?

I am trying to learn the structure of executable files of C program. My environment is GCC and 64bit Intel processor.
Consider the following C code a.cc.
#include <cstdlib>
#include <cstdio>
int x;
int main(){
printf("%zu\n", sizeof(x));
return 10;
}
The size -o a shows
text data bss dec hex filename
1134 552 8 1694 69e a
After I added another initialized global variable y.
int y=10;
The size a shows (where a is the name of the executable file from a.cc)
text data bss dec hex filename
1134 556 12 1702 6a6 a
As we know, the BSS section stores the size of uninitialized global variables and DATA stores initialized ones.
Why does int take up 8 bytes in BSS? The sizeof(x) in my code shows that the int actually takes up 4 bytes.
The int y=10 added 4 bytes to DATA, which makes sense since int should take 4 bytes. But why does it add 4 bytes to BSS?
The difference between two size commands stays the same after deleting the two lines #include ....
Update:
I think my understanding of BSS is wrong. It may not store the uninitialized global variables. As the Wikipedia says "The size that BSS will require at runtime is recorded in the object file, but BSS (unlike the data segment) doesn't take up any actual space in the object file." For example, even the one line C code int main(){} has bss 8.
Does the 8 or 16 of BSS come from alignment?
It doesn't; an int takes up 4 bytes regardless of which section it's in. You can use the nm tool (from the GNU binutils package) with the -S argument to get the names and sizes of all of the symbols in the object file. You're likely seeing side effects of the compiler and linker including or not including certain other symbols.
For example:
$ cat a1.c
int x;
$ cat a2.c
int x = 1;
$ gcc -c a1.c a2.c
$ nm -S a1.o a2.o
a1.o:
0000000000000004 0000000000000004 C x
a2.o:
0000000000000000 0000000000000004 D x
One object file has a 4-byte object named x as a common symbol (C), which the linker normally places in the uninitialized data section (.bss), while the other object file has a 4-byte object named x in the initialized data section (D).
