Linux Kernel function memblock_alloc_range_nid is not presented in the address space - c

I'm trying to debug physical memory allocation to understand what part of the Linux Kernel use memblock_alloc_range_nid on x86-64 and how.
I'm running the latest Linux Kernel 5.19-rc2 built from upstream with Ubuntu 20.04 inside QEMU. The problem is it's not possible to access memory address the function memblock_alloc_range_nid is located at. While other Kernel functions can be easily disassembled.
Here is what I have in my gdb connected to the QEMU VM:
(gdb) disas memblock_alloc_range_nid
Cannot access memory at address 0xffffffff831a05d1
(gdb) disas native_safe_halt
Dump of assembler code for function native_safe_halt:
#...
End of assembler dump.
What's wrong with the function memblock_alloc_range_nid? Why is it not possible to access its address? It seems all the function from memblock.c cannot be accessed.

As Margaret and Roi have noted in the above comments, memblock_alloc_range_nid() is annotated with __init. Functions annotated with __init, __head and similar macros are only needed during kernel initialization right after boot. After the kernel has finished initializing things, the special memory section containing all those functions is unmapped from memory since they are not needed anymore.
If you want to debug any such function, you will have to break very early, for example at start_kernel(), then you can inspect the function or set a breakpoint and continue execution to hit it.
Important: add -S (uppercase) to your QEMU command line options to make it stop and wait for the debugger before starting the kernel, and start the kernel with KASLR disabled using -append "nokaslr" (or adding nokaslr if you are already specifying -append).
The following GDB script should be what you need:
$ cat script.gdb
directory /path/to/kernel-source-dir
file /path/to/kernel-source-dir/vmlinux
target remote localhost:1234
break start_kernel
continue
Launch gdb -x script.gdb (after starting QEMU), and when you hit the start_kernel breakpoint, you can add another one for memblock_alloc_range_nid, then continue:
0x000000000000fff0 in exception_stacks ()
Breakpoint 1 at 0xffffffff82fbab4c
Breakpoint 1, 0xffffffff82fbab4c in start_kernel ()
(gdb) b memblock_alloc_range_nid
Breakpoint 2 at 0xffffffff82fe2879
(gdb) c
Continuing.
Breakpoint 2, 0xffffffff82fe2879 in memblock_alloc_range_nid ()
(gdb) disassemble
Dump of assembler code for function memblock_alloc_range_nid:
=> 0xffffffff82fe2879 <+0>: push r15
0xffffffff82fe287b <+2>: mov r15,rcx
0xffffffff82fe287e <+5>: push r14
0xffffffff82fe2880 <+7>: mov r14,rdx
0xffffffff82fe2883 <+10>: push r13
0xffffffff82fe2885 <+12>: mov r13d,r9
...

Related

Aarch64 baremetal Hello World program on QEMU

I have compiled and ran a simple hello world program on ARM Foundation Platform. The code is given below.
#include <stdio.h>
int main(int argc, char *argv[])
{
printf("Hello world!\n");
return 0;
}
This program is compiled and linked using ARM GNU tool chain as given below.
$aarch64-none-elf-gcc -c -march=armv8-a -g hello.c hello.o
aarch64-none-elf-gcc: warning: hello.o: linker input file unused because linking not done
$aarch64-none-elf-gcc -specs=aem-ve.specs -Wl,-Map=linkmap.txt hello.o -o hello.axf
I could successfully execute this program on ARM Foundation Platform (I think, foundation platform is similar to ARM Fixed Virtual Platform) and it prints "Hello world!"
The contents of 'aem-ve.specs' file is given below:
cat ./aarch64-none-elf/lib/aem-ve.specs
# aem-ve.specs
#
# Spec file for AArch64 baremetal newlib, libgloss on VE platform with version 2
# of AngelAPI semi-hosting.
#
# This Spec file is also appropriate for the foundation model.
%rename link old_link
*link:
-Ttext-segment 0x80000000 %(old_link)
%rename lib libc
*libgloss:
-lrdimon
*lib:
cpu-init/rdimon-aem-el3.o%s --start-group %(libc) %(libgloss) --end-group
*startfile:
crti%O%s crtbegin%O%s %{!pg:rdimon-crt0%O%s} %{pg:rdimon-crt0%O%s}
Is it possible to execute the same binary on QEMU? If so could you please share the sample command. If not, could you please share the right and best way to execute it on QEMU?
I have traced instruction execution of Foundation platform using the "trace" option and analysed using 'tarmac-profile' tool and 'Tarmac-calltree' tool. The outputs are given below:
$ ./tarmac-profile hello.trace --image hello.axf
Address Count Time Function name
0x80000000 1 120001
0x80001030 1 30001 register_fini
0x80001050 1 60001 deregister_tm_clones
0x800010c0 1 210001 __do_global_dtors_aux
0x80001148 1 23500001 frame_dummy
0x800012c0 1 10480001 main
0x80002000 1 40001 main
0x80003784 1 360001 main
0x80003818 1 460001 _cpu_init_hook
0x80003870 1 590001 _write_r
0x800038d0 1 1090001 __call_exitprocs
...
...
./tarmac-calltree hello.trace --image hello.axf
o t:10000 l:2288 pc:0x80001148 - t:23510000 l:7819 pc:0x80008060 : frame_dummy
- t:240000 l:2338 pc:0x800011a8 - t:720000 l:2443 pc:0x800011ac
o t:250000 l:2340 pc:0x80003818 - t:710000 l:2442 pc:0x80003828 : _cpu_init_hook
- t:260000 l:2343 pc:0x8000381c - t:320000 l:2354 pc:0x80003820
o t:270000 l:2345 pc:0x80002000 - t:310000 l:2353 pc:0x80002010 : main
- t:320000 l:2354 pc:0x80003820 - t:700000 l:2436 pc:0x80003824
o t:330000 l:2356 pc:0x80003784 - t:690000 l:2435 pc:0x80003814 : main
- t:760000 l:2453 pc:0x800011bc - t:2010000 l:2970 pc:0x800011c0
o t:770000 l:2455 pc:0x80004200 - t:2000000 l:2969 pc:0x800042c0 : memset
- t:2010000 l:2970 pc:0x800011c0 - t:4870000 l:3587 pc:0x800011c4
o t:2020000 l:2972 pc:0x80007970 - t:4860000 l:3586 pc:0x80007b04 : initialise_monitor_handles
- t:2960000 l:3165 pc:0x80007b24 - t:4340000 l:3465 pc:0x80007b28
I have tried the following method to execute on QEMU, without any success.
I have started the QEMU with gdb based debugging through TCP Port
$ qemu-system-aarch64 -semihosting -m 128M -nographic -monitor none -serial stdio -machine virt,gic-version=2,secure=on,virtualization=on -cpu cortex-a53 -kernel hello.axf -S -gdb tcp::9000
The result of debug session is given below:
gdb) target remote localhost:9000
Remote debugging using localhost:9000
_start () at /data/jenkins/workspace/GNU-toolchain/arm-11/src/newlib-cygwin/libgloss/aarch64/crt0.S:90
90 /data/jenkins/workspace/GNU-toolchain/arm-11/src/newlib-cygwin/libgloss/aarch64/crt0.S: No such file or directory.
(gdb) si
<The system hangs here>
I have tried to disassemble the code using gdb and its output is given below. It looks like the code is not loaded correctly
(gdb) disas frame_dummy
Dump of assembler code for function frame_dummy:
0x0000000080001110 <+0>: udf #0
0x0000000080001114 <+4>: udf #0
0x0000000080001118 <+8>: udf #0
0x000000008000111c <+12>: udf #0
0x0000000080001120 <+16>: udf #0
0x0000000080001124 <+20>: udf #0
0x0000000080001128 <+24>: udf #0
0x000000008000112c <+28>: udf #0
0x0000000080001130 <+32>: udf #0
0x0000000080001134 <+36>: udf #0
0x0000000080001138 <+40>: udf #0
0x000000008000113c <+44>: udf #0
0x0000000080001140 <+48>: udf #0
0x0000000080001144 <+52>: udf #0
End of assembler dump.
Could you please shed some light on this. Any hint given is appreciated very much.
The Foundation Model and the QEMU 'virt' board are not the same machine type. They have different devices at different physical addresses, and in particular they do not have RAM at the same address. To run bare metal code on the 'virt' machine type you will need to adjust your code. This is normal for bare metal -- the whole point is that you are running directly on some piece of (emulated) hardware and need to match the details of it.
Specifically, the minimum change you need to make here is that the RAM on the 'virt' board starts at 0x4000_0000, not the 0x8000_0000 that the Foundation Model uses. There may be others as well, but that is the immediate cause of what you're seeing.

Unable to inject valid instruction pointer during exploit

I am trying to exploit a given program and I can't figure out what I doing wrong. Long story short I manage to inject code to overwrite the RIP. This means that I should be able to redirect the code execution, but the problem is, I get SIGSEGV. Do I have to design the injected stack in a special way in order to not get SIGSEGV?
My game plan is to exploit the function mainloop and change the return adress. The stack for the function mainloop has the following values:
0000| 0x7fffffffdff0 --> 0xa7400ffffe010
0008| 0x7fffffffdff8 --> 0xf423f55758260
0016| 0x7fffffffe000 --> 0x7fffffffe010 --> 0x5555555550b0 (<__libc_csu_init>: push r15)
0024| 0x7fffffffe008 --> 0x5555555550a4 (<main+66>: mov eax,0x0)
So the return adress is stored at 0x7fffffffe008 and I have managed to overwrite that value with the adress pointing to the code that I want to execute. In this case the adress 0x555555554e6e.
The backtrace of the program is as follows:
#6 0x0000555555554fab in mainloop ()
#7 0x00005555555550a4 in main ()
#8 0x00007ffff7e1109b in __libc_start_main (main=0x555555555062 <main>, argc=0x1, argv=0x7fffffffe0f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fffffffe0e8) at ../csu/libc-start.c:308
#9 0x000055555555486a in _start ()
As you can see, when I exit mainloop I will be returned to main, and when I quit main I go to a bunch of libc-functions so that the program exits cleanly (?).
So what happens when I run my exploit-code? This:
0000| 0x7ffe1b3d4150 --> 0x424142001b3d4170
0008| 0x7ffe1b3d4158 ("ABABABABABABABABnNUUUU")
0016| 0x7ffe1b3d4160 ("ABABABABnNUUUU")
0024| 0x7ffe1b3d4168 --> 0x555555554e6e ('nNUUUU')
0032| 0x7ffe1b3d4170 --> 0x55a34784000a
0040| 0x7ffe1b3d4178 --> 0x7f7c156c409b (<__libc_start_main+235>: mov edi,eax)
0048| 0x7ffe1b3d4180 --> 0x0
0056| 0x7ffe1b3d4188 --> 0x7ffe1b3d4258 --> 0x7ffe1b3d5474 ("./device")
What you are seeing is the stack. I added some bytes so you can get more context. But I think I managed to hit the right size of the filler for my exploit. I have managed to change the value at byte 24.
But my PEDA/GDB doesn't seem to regard that value as a instruction pointer, which is weird. It seems to regard it as a C-string(?). The backtrace looks like this:
#6 0x000055a347849fab in mainloop ()
#7 0x0000555555554e6e in ?? ()
#8 0x000055a34784000a in ?? ()
#9 0x00007f7c156c409b in __libc_start_main (main=0x55a34784a062 <main>, argc=0x1, argv=0x7ffe1b3d4258, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffe1b3d4248) at ../csu/libc-start.c:308
#10 0x000055a34784986a in _start ()
And when I exit mainloop I get the following in PEDA/GDB:
Stopped reason: SIGSEGV
0x0000555555554e6e in ?? ()
And if I run the command i f in GDB I get:
Stack level 0, frame at 0x7ffe1b3d4178:
rip = 0x555555554e6e; saved rip = 0x55a34784000a
called by frame at 0x7ffe1b3d4180
Arglist at 0x7ffe1b3d4168, args:
Locals at 0x7ffe1b3d4168, Previous frame's sp is 0x7ffe1b3d4178
Saved registers:
rip at 0x7ffe1b3d4170
At adress 0x0000555555554e6e the program executes the following ASM:
0x0000555555554e6e <+172>: lea rdi,[rip+0x20126b] # 0x5555557560e0 <flag2>
So I seem to have the right RIP, but that's about it.
Guys, what's going on?
GDB (by default) disables ASLR when you start a program from within GDB. In a PIE executable, static code/data addresses are randomized. Injecting a return address only works if you know the right absolute address.
But apparently you're sometimes starting your program outside GDB, where addresses aren't the same every time. This obviously leads to a segfault, and your question doesn't show disassembly from the actual 0x0000555555554e6e in the actual process that segfaulted, contrary to what you're claimed / assumed.
(GCC making PIE executables by default is a recent thing on Linux; if you were following an old tutorial, it might have assumed that the executable itself was position-dependent, and only libraries + stack would be ASLRed. 32-bit absolute addresses no longer allowed in x86-64 Linux?)
See Disable randomization of memory addresses for a system-wide or per-process way to disable ASLR, or build non-PIE executables.

Translate Instruction Pointer Address (in shared library) to Source Instruction

Are there any tools or libraries one can use on Linux to get the original (source) instruction only from the PID and the current instruction pointer address, even if the IP currently points into a shared library?
AFAIK it should be possible, since the location of the library mapping is available through /proc/[PID]/maps, though I haven't found any applications or examples doing so.
Any suggestions?
EDIT: an assembly instruction or the nearest symbol suffice (source code line is not necessarily needed)
I found a way to do this with GDB:
Interactive:
$ gdb --pid 1566
(gdb) info symbol 0x7fe28b8a2b79
pselect + 89 in section .text of /lib/x86_64-linux-gnu/libc.so.6
(gdb) info symbol 0x5612550f14a4
copy_word_list + 20 in section .text of /usr/bin/bash
(gdb) info symbol 0x7fe28b878947
execve + 7 in section .text of /lib/x86_64-linux-gnu/libc.so.6
Shows exactly what I wanted!
It can also be scripted:
gdb -q --pid PID --batch -ex 'info symbol HEX_SYMBOL_ADDR'

Why is _init from glibc's csu/init-first.c called before _start even if _start is the ELF entry point?

I first noticed it while playing with GDB's rbreak ., and then made a minimal example:
(gdb) file hello_world.out
Reading symbols from hello_world.out...done.
(gdb) b _init
Breakpoint 1 at 0x4003e0
(gdb) b _start
Breakpoint 2 at 0x400440
(gdb) run
Starting program: /home/ciro/bak/git/cpp/cheat/gdb/hello_world.out
Breakpoint 1, _init (argc=1, argv=0x7fffffffd698, envp=0x7fffffffd6a8) at ../csu/init-first.c:52
52 ../csu/init-first.c: No such file or directory.
(gdb) continue
Continuing.
Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) continue
Continuing.
Breakpoint 1, 0x00000000004003e0 in _init ()
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
breakpoint already hit 2 times
1.1 y 0x00000000004003e0 <_init>
1.2 y 0x00007ffff7a36c20 in _init at ../csu/init-first.c:52
2 breakpoint keep y 0x0000000000400440 <_start>
breakpoint already hit 1 time
Note that there are 2 _init: one in csu/init-first.c, and the other seems to come from sysdeps/x86_64/crti.S. I'm talking about the csu one.
Isn't _start supposed to be the entry point set by the linker, and stored in the ELF header? What mechanism makes _init run first? What is its purpose?
Tested on GCC 4.8, glibc 2.19, GDB 7.7.1 and Ubuntu 14.04.
Where the debugger halts first in your example isn't the real beginning of the process.
In the ELF header there is an entry for the program interpreter (dynamic linker). On Linux 64 bit its value is /lib64/ld-linux-x86-64.so.2. The kernel sets the initial instruction pointer to the entry point of this program interpreter. The symbol name of it is _start too, like the programs _start.
After the dynamic linker has done its work, calling also functions in the program, like _init in glibc, it calls the entry point of the program.
The breakpoint at _start doesn't work for the dynamic linker because it takes only the address of the program's _start.
You can find the entry point address with readelf -h /lib64/ld-linux-x86-64.so.2.
You could also set a breakpoint at _dl_start and print a backtrace to see that this function is called from dynamic linker's _start.
If you download glibc's current source code you can find the entry point of the dynamic loader at glibc-2.21/sysdeps/x86_64/dl-machine.h starting on line 121.

No Debug Symbols cross compile ARM on BusyBox

i'm trying to debug a C program, which runs on an ARM926EJ-S rev 5 (v5l). The software was cross-compiled (and is statically linked) with the std. arm-linux-gnueabi compiler (intalled via synaptic). I run Ubuntu 13.04 64bit. On the device is a Busybox v1.18.2. I successfully compiled gdbserver (with host=arm-linux-gnueabi) and gdb (with target=arm-linux-gnueabi) and can start my program on the embedded device via the locally running gdb...
My problem now is, that i don't have a proper backtrace output.
Message of gdb:
Remote debugging using 192.168.21.127:2345
0x0000a79c in ?? ()
(gdb) run
The "remote" target does not support "run". Try "help target" or "continue".
(gdb) continue
Continuing.
Cannot access memory at address 0x0
Program received signal SIGINT, Interrupt.
0x00026628 in ?? ()
(gdb) backtrace
#0 0x00026628 in ?? ()
#1 0x00036204 in ?? ()
#2 0x00036204 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
I try to compile the software with -g, -g3 -gdwarf-2, -ggdb, -ggdb3 without any difference.
Has anybody an idea what i am missing here?
Is this a problem maybe with the BusyBox or do i need additional libs on my host system?
I also tried the function backtrace_symbols from execinfo.h with nearly the same output...
Thanks in advance for any reply.
Another way for debugging is use gdb inside board follow below steps.
1)Run gdb process and attach your process to gdb using attach <pid> command
2)Continue your process using c command in gdb
Whenever you find any SIGINT or SIGSEGV then refer stack of your process using bt command in gdb.

Resources