What is align in memory sections of an object file? - c

I was using objdump with the -x flag, and I saw that the sections are listed with an alignment such as 2**0. What does that mean? Does this alignment have any practical effect?
#include <stdio.h>
int main(void) {
char *x = "section"; // .rodata algn 2**0
return 0;
}

Related

Shellcode not running, despite disabling stack protections

I am exploring shellcode. I wrote an example program as part of my exploration.
Using objdump, I got the following shellcode:
\xb8\x0a\x00\x00\x00\xc
for the simple function:
int boo()
{
return(10);
}
I then wrote the following program to attempt to run the shellcode:
#include <stdio.h>
#include <stdlib.h>
unsigned char code[] = "\xb8\x0a\x00\x00\x00\xc3";
int main(int argc, char **argv) {
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
I am compiling using gcc, with the options:
-fno-stack-protector -z execstack
However, when I attempt to run, I still get a segfault.
What am I messing up?
You're almost there!
You have placed your code[] outside of main, so it is a global array. Global variables are not placed on the stack. They can be placed:
In the BSS section if they are not initialized
In the data section if they are initialized and accessed for both reading and writing
In the rodata section if they are only read
Let's verify this. You can use the readelf command to check all the sections of your binary (I only show the ones we are interested in):
$ readelf -S --wide <your binary>
There are 31 section headers, starting at offset 0x39c0:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[...]
[16] .text PROGBITS 0000000000001060 001060 0001a5 00 AX 0 0 16
[...]
[18] .rodata PROGBITS 0000000000002000 002000 000008 00
[...]
[25] .data PROGBITS 0000000000004000 003000 000017 00 WA 0 0 8
[...]
[26] .bss NOBITS 0000000000004017 003017 000001 00 WA 0 0 1
Then we can look for your symbol code in your binary:
$ readelf -s <your binary> | grep code
66: 0000000000004010 7 OBJECT GLOBAL DEFAULT 25 code
This confirms that your array code is in the .data section, which does not have the X flag, so you cannot execute code from it.
From there, the solution is obvious: place your array in your main function (uint8_t requires <stdint.h>):
int main(int argc, char **argv) {
uint8_t code[] = "\xb8\x0a\x00\x00\x00\xc3";
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
However, this may also not work!
Your C compiler may notice that you use code but never read anything from it, so it may optimize the initialization away and simply allocate the array on the stack without filling it. This is what happens with my version of GCC.
To force the compiler not to optimize the array away, use the volatile keyword.
int main(int argc, char **argv) {
volatile uint8_t code[] = "\xb8\x0a\x00\x00\x00\xc3";
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
In a real use case, your array would be allocated on the stack and passed as a parameter to another function, which would itself fill the array with shellcode. So you would not encounter this compiler optimization issue.
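The missing X flag discussed above can also be observed at run time. The following is a minimal sketch, assuming a Linux system where /proc/self/maps is readable; the helper name is_executable_address is made up for illustration and is not part of any library. It looks up the mapping that contains an address and reports whether that mapping carries the x permission.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (Linux only): returns 1 if addr lies in a mapping
 * with the x permission, 0 if it lies in a mapping without it, and -1 if
 * the address is not mapped or /proc/self/maps cannot be read.
 */
int is_executable_address(uintptr_t addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[4096];
    int result = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];
        /* Each line begins with "start-end perms", for example "...-... r-xp". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;
        if (addr >= start && addr < end) {
            /* The third permission character is 'x' for executable mappings. */
            result = (perms[2] == 'x');
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Calling it with (uintptr_t)code for the global array and with the address of a function should show the difference between the two mappings on a typical system.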

Function address nearly the same as other variables addresses [duplicate]

Why are function addresses nearly the same as the address of static global variables or dynamically allocated variables? Here is the code for demonstration:
#include <stdio.h>
#include <stdlib.h>
int global_var;
int global_var1;
int global_var2;
static int st_var = 3;
void func()
{
return;
}
int main(void)
{
int x;
int* x_m = malloc(sizeof(int));
printf("Malloc: %p\n", x_m);
printf("Local: %p\n", &x);
printf("Function: %p\n", &func);
printf("Global: %p\n", &global_var);
printf("Global: %p\n", &global_var1);
printf("Global: %p\n", &global_var2);
printf("Static: %p\n", &st_var);
free(x_m);
return 0;
}
Output:
Malloc: 0x55bede9ce2a0
Local: 0x7ffdbc67b25c
Function: 0x55bede7151a9
Global: 0x55bede718024
Global: 0x55bede718030
Global: 0x55bede718020
Static: 0x55bede718010
Can somebody explain this? I thought that only global and static variables are stored in the .bss segment.
This is because, usually, the .text section (containing function code) and the .bss section of an ELF executable are mapped "relatively near" each other.
You can check this with readelf:
$ gcc prog.c
$ readelf -S a.out
There are 29 section headers, starting at offset 0x1ac0:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[14] .text PROGBITS 00000000000007e0 000007e0
0000000000000302 0000000000000000 AX 0 0 16
...
[24] .bss NOBITS 0000000000201010 00001010
0000000000000010 0000000000000000 WA 0 0 8
...
You can see from the "Address" field of .text and .bss above that they will be loaded 0x201010-0x7e0 = 0x200830 bytes apart in virtual memory when the program runs.
In any case, this does not mean that your code is in the .bss section or that your variables are in the .text section. They are in two different yet "relatively near" sections.
The distance between the two is arbitrary. There is no real minimum or maximum requirement dictated by the ELF specification. You could write your own linker script to place them farther apart if you really want to.

Objcopy symbols are mixed or invalid in executable

As a simple example of my problem, let's say we have two data arrays to embed into an executable to be used in a C program: chars and shorts. These data arrays are stored on disk as chars.raw and shorts.raw.
Using objcopy I can create object files that contain the data.
objcopy --input binary --output elf64-x86-64 chars.raw char_data.o
objcopy --input binary --output elf64-x86-64 shorts.raw short_data.o
objdump shows that the data is correctly stored and exported as _binary_chars_raw_start, end, and size.
$ objdump -x char_data.o
char_data.o: file format elf64-x86-64
char_data.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 0000000e 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g .data 0000000000000000 _binary_chars_raw_start
000000000000000e g .data 0000000000000000 _binary_chars_raw_end
000000000000000e g *ABS* 0000000000000000 _binary_chars_raw_size
(Similar output for short_data.o)
However, when I link these object files with my code into an executable, I run into problems. For example:
#include <stdio.h>
extern char _binary_chars_raw_start[];
extern char _binary_chars_raw_end[];
extern int _binary_chars_raw_size;
extern short _binary_shorts_raw_start[];
extern short _binary_shorts_raw_end[];
extern int _binary_shorts_raw_size;
int main(int argc, char **argv) {
printf("%ld == %ld\n", _binary_chars_raw_end - _binary_chars_raw_start, _binary_chars_raw_size / sizeof(char));
printf("%ld == %ld\n", _binary_shorts_raw_end - _binary_shorts_raw_start, _binary_shorts_raw_size / sizeof(short));
}
(compiled with gcc main.c char_data.o short_data.o -o main) prints
14 == 196608
7 == 98304
on my computer. The size _binary_chars_raw_size (and short) is not correct and I don't know why.
Similarly, if the _starts or _ends are used to initialize anything, then they may not even be located near each other in the executable (_end - _start is not equal to the size, and may even be negative).
What am I doing wrong?
The lines:
extern char _binary_chars_raw_start[];
extern char _binary_chars_raw_end[];
extern int _binary_chars_raw_size;
extern short _binary_shorts_raw_start[];
extern short _binary_shorts_raw_end[];
extern int _binary_shorts_raw_size;
These are not variables holding values. They are symbols placed at the beginning and end of the region, so the addresses of these symbols are the start and end of the region. Do:
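#include <stdint.h>
#include <stdio.h>
extern char _binary_chars_raw_start;
extern char _binary_chars_raw_end;
extern char _binary_chars_raw_size;
int main(void) {
    // print ptrdiff_t with %td
    printf("%td == %d\n",
        // the difference in addresses of these variables
        &_binary_chars_raw_end - &_binary_chars_raw_start,
        // go through intptr_t to avoid a pointer-to-int size warning
        (int)(intptr_t)&_binary_chars_raw_size);
    return 0;
}
Note: also print size_t values, such as the result of sizeof(..), with %zu.
Edit: _size is also a symbol rather than a variable. Its value is stored as the symbol's address, so take its address too.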

Why does the gold linker cause dl_iterate_phdr() not to return my custom note section?

On Linux, I would like to store some structures in a custom .note.foobar section and discover them at runtime.
I compile and link the program below once with gold and once without:
$ gcc -o test-ld test.c
$ gcc -o test-gold -fuse-ld=gold test.c
You can see that the ld-linked version finds the section while the gold-linked version does not:
$ ./test-ld
note section at vaddr: 2c4
note section at vaddr: 2f0
found f00dface
note section at vaddr: 324
note section at vaddr: 7a8
note section at vaddr: 270
note section at vaddr: 1c8
$ ./test-gold
note section at vaddr: 254
note section at vaddr: 7a8
note section at vaddr: 270
note section at vaddr: 1c8
However, the section does exist in both binaries:
$ readelf -x .note.foobar test-ld
Hex dump of section '.note.foobar':
0x000002f0 04000000 14000000 67452301 666f6f00 ........gE#.foo.
0x00000300 cefa0df0 00000000 00000000 00000000 ................
0x00000310 04000000 14000000 67452301 666f6f00 ........gE#.foo.
0x00000320 efbeadde ....
$ readelf -x .note.foobar test-gold
Hex dump of section '.note.foobar':
0x00000280 04000000 14000000 67452301 666f6f00 ........gE#.foo.
0x00000290 cefa0df0 00000000 00000000 00000000 ................
0x000002a0 04000000 14000000 67452301 666f6f00 ........gE#.foo.
0x000002b0 efbeadde ....
So you would expect the test-gold program to report a section at vaddr 280, but it does not.
Why can dl_iterate_phdr not find this section, while readelf can, and what is gold doing differently to cause this?
#define _GNU_SOURCE
#include <link.h>
#include <stdlib.h>
#include <stdio.h>
typedef struct {
unsigned int elf_namesize;
unsigned int elf_datasize;
unsigned int elf_type;
unsigned int elf_name;
unsigned int bar;
} foo_t;
const foo_t __attribute__((used,section(".note.foobar,\"a\"#"))) foo1 = {
4,
20,
0x01234567,
0x6f6f66,
0xf00dface,
};
const foo_t __attribute__((used,section(".note.foobar,\"a\"#"))) foo2 = {
4,
20,
0x01234567,
0x6f6f66,
0xdeadbeef,
};
static int
callback(struct dl_phdr_info *info, size_t size, void *data)
{
for (int i = 0; i < info->dlpi_phnum; i++) {
const ElfW(Phdr)* phdr = &info->dlpi_phdr[i];
if (phdr->p_type == PT_NOTE) {
foo_t *payload = (foo_t*)(info->dlpi_addr + phdr->p_vaddr);
printf("note section at vaddr: %lx\n", phdr->p_vaddr);
if (phdr->p_memsz >= sizeof(foo_t) && payload->elf_type == 0x01234567 && payload->elf_name == 0x6f6f66) {
printf("found %x\n", payload->bar);
}
}
}
return 0;
}
int
main(int argc, char *argv[])
{
dl_iterate_phdr(callback, NULL);
return 0;
}
This code:
foo_t *payload = (foo_t*)(info->dlpi_addr + phdr->p_vaddr);
assumes that your .note.foobar is the very first Elf...Note in the PT_NOTE segment, but you can't make that assumption -- the order of notes in PT_NOTE is not guaranteed; you need to iterate over all of them.
You can verify that there are multiple notes with readelf -n test-{ld,gold}.
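Iterating over all notes in a segment could look roughly like the sketch below. It is not the code from the question: struct note_hdr is a local mirror of the Elf32_Nhdr/Elf64_Nhdr layout, find_note and pad4 are hypothetical helper names, and the 4-byte padding of name and descriptor is an assumption that holds for the common case (a segment with a p_align of 8 may need different rounding).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the Elf32_Nhdr / Elf64_Nhdr layout from <elf.h>. */
struct note_hdr {
    uint32_t namesz; /* size of the name, including the trailing NUL */
    uint32_t descsz; /* size of the descriptor (payload) */
    uint32_t type;   /* owner-defined note type */
};

/* Round n up to a multiple of 4 (assumed note padding). */
size_t pad4(size_t n) { return (n + 3u) & ~(size_t)3u; }

/*
 * Hypothetical helper: walk every note in [seg, seg + len) and return a
 * pointer to the descriptor of the first note whose name and type match.
 * Returns NULL if nothing matches or a header points outside the segment.
 */
const unsigned char *find_note(const unsigned char *seg, size_t len,
                               const char *name, uint32_t type,
                               uint32_t *descsz_out)
{
    size_t want = strlen(name) + 1; /* namesz counts the NUL */
    size_t off = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        memcpy(&h, seg + off, sizeof h); /* avoids unaligned access */

        size_t name_off = off + sizeof h;
        if (h.namesz > len - name_off || pad4(h.namesz) > len - name_off)
            return NULL;
        size_t desc_off = name_off + pad4(h.namesz);
        if (h.descsz > len - desc_off)
            return NULL;

        if (h.type == type && h.namesz == want &&
            memcmp(seg + name_off, name, want) == 0) {
            *descsz_out = h.descsz;
            return seg + desc_off;
        }

        /* Advance to the next note; the last one may lack trailing padding. */
        size_t next = desc_off + pad4(h.descsz);
        off = next > len ? len : next;
    }
    return NULL;
}
```

Inside the callback, this would be called for each PT_NOTE header with (const unsigned char *)(info->dlpi_addr + phdr->p_vaddr) as seg and phdr->p_memsz as len, instead of casting the start of the segment directly to foo_t.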
It appears that GNU-ld emits a separate PT_NOTE for each .note* section, while Gold merges them all into a single PT_NOTE segment. Either behavior is perfectly fine as far as the ELF standard is concerned, though GNU-ld is wasteful (there is no need to emit extra PT_NOTE program headers).
Here is what I get for your test program:
readelf -l test-ld | grep NOTE
NOTE 0x00000000000002c4 0x00000000004002c4 0x00000000004002c4
NOTE 0x00000000000002f0 0x00000000004002f0 0x00000000004002f0
NOTE 0x0000000000000324 0x0000000000400324 0x0000000000400324
readelf -l test-gold | grep NOTE
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
P.S.
Why does the gold linker cause dl_iterate_phdr() not to return my custom note section?
The direct answer is that dl_iterate_phdr doesn't deal with (or care about) sections. It iterates over segments, and the assignment of sections to segments is up to linkers to perform as they see fit.

Data section in a.out

Here is a simple program that I compiled:
int a;
int main()
{
return 0;
}
Then, after compiling with gcc, I ran
size a.out
I got some output for the bss and data sections. Then I changed my code to this:
int a;
int main()
{
char *p = "hello";
return 0;
}
When I looked at the output of size a.out again after compiling, the size of the data section remained the same. But we know that the string hello will be allocated in the read-only initialized part of memory. Why, then, did the size of the data section remain the same?
#include <stdio.h>
int main()
{
return 0;
}
It gives
text data bss dec hex filename
960 248 8 1216 4c0 a.out
when you do
int a;
int main()
{
char *p = "hello";
return 0;
}
it gives
text data bss dec hex filename
982 248 8 1238 4d6 a.out
Here "hello" is stored in .rodata and its address is stored in the char pointer p, but p itself is stored on the stack, and size does not show the stack. I am not sure, but .rodata appears to be counted under text here, which would explain why text grew from 960 to 982.
When you write
int a;
char *p = "hello";
int main()
{
return 0;
}
it gives
text data bss dec hex filename
966 252 8 1226 4ca a.out
Here "hello" is again stored in .rodata, but the char pointer p is now a global that takes 4 bytes and is stored in .data, so data grows by 4 (from 248 to 252).
For more info http://codingfreak.blogspot.in/2012/03/memory-layout-of-c-program-part-2.html
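The split described above, between the pointer object and the string it points to, can be made concrete with sizeof. This is only an illustrative sketch: the names greeting, pointer_bytes and literal_bytes are made up, and the pointer size depends on the target (4 bytes in the output above, typically 8 on a 64-bit build).

```c
#include <stddef.h>

/* A global, initialized pointer: the pointer object itself is writable data,
 * while the characters it points to belong to the string literal. */
const char *greeting = "hello";

/* Bytes occupied by the pointer object (target dependent). */
size_t pointer_bytes(void) { return sizeof greeting; }

/* Bytes occupied by the literal itself: five characters plus the NUL. */
size_t literal_bytes(void) { return sizeof "hello"; }
```

Only pointer_bytes() worth of storage is added to the data column when the pointer becomes a global. The literal's bytes are accounted for elsewhere.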
Actually, that's an implementation detail. The compiler works by the as-if rule, meaning that as long as the observable behavior of the program is the same, it is free to drop any piece of code it wants. In this case, it can skip char* p = "hello" altogether.
The string "hello" is allocated in the section .rodata
Even if the total size doesn't change, that doesn't mean the code didn't.
I tested your example.
The string "hello" is constant data, so it is stored in the read-only .rodata section.
You can see this particular section using objdump, for example:
objdump -s -j .rodata <yourbinary>
With gcc 4.6.1 without any options, I got for your second code:
Contents of section .rodata:
4005b8 01000200 68656c6c 6f00 ....hello.
Since you don't use that char * in your code, the compiler optimized it away.

Resources