Why does the System V AMD64 ABI say to use syscall? - c

Originally my project had just x86 system call code in it (using 32 bit registers and int $0x80), but then I created a version called src3 that used 64 bit registers and syscall. That worked, until I created src4 which changed the argument to my _exit function so that it only handles a single byte as the desired process exit status value (in my tests anything that needed more than a byte to represent seemed to overflow in shell printouts, so I'm assuming that process exit status values are only 1 byte in size anyway). This broke my _exit function until I changed using syscall to int $0x80.
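On the side question of exit status size: on Linux and other POSIX systems, a parent that collects the status with wait/waitpid only sees the low 8 bits of the value passed to _exit, which matches the one-byte assumption above. A minimal sketch to check this (the function name observed_exit_status is made up for illustration):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code` and return the exit status the
 * parent observes through waitpid(), or -1 if something went wrong. */
static int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);      /* only the low 8 bits of `code` survive */
}
```

Calling observed_exit_status(300) should yield 44 (300 modulo 256), which is the same truncation a shell shows in $?.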

Related

What are the exact programming instructions that are in user space?

I know that a process switches between user mode and kernel mode while running. What confuses me is whether every line of code might need the kernel. Below is an example; could I get an explanation of the kernel's role in executing the following lines? Does the following actually require kernel mode?
if(a < 0)
a++
I am confused that for every line of code, we should possibly need the kernel.
Most code in user-space is executed without the kernel being involved. The kernel becomes involved (and the CPU switches from user-space to kernel) when:
a) The user-space code explicitly asks the kernel to do something (calls a system call).
b) There's an IRQ (from a device) that interrupts user-space code.
c) The kernel is providing some functionality that user-space code is unaware of. The most common reason is virtual memory management; but debugging and profiling are other reasons.
d) Asynchronous notifications (e.g. something causing a switch to kernel so that kernel can redirect the program to a suitable signal handler).
e) The user-space code does something illegal (crashes).
Does the following actually require kernel mode.
That code (if(a < 0) a++;) probably won't require the kernel's assistance, but it might. For example, if the variable a is in memory that was previously sent to swap space, then any attempt to access a is in effect a request for the kernel to fetch that data from swap space. In a similar way, if the executable file was memory mapped but not loaded yet (a common optimization to improve program startup time), then attempting to execute any instruction (regardless of what the instruction is) could require the kernel to fetch the code from the executable file on disk.
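One way to observe this kind of implicit kernel involvement is to count page faults around plain memory writes. The sketch below assumes Linux with getrusage reporting ru_minflt; the function name faults_while_touching is hypothetical. The loop contains no system call, yet the first write to each freshly mapped page typically traps into the kernel so that it can supply the page.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                  /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Map `pages` fresh anonymous pages, write one byte to each, and return the
 * number of minor page faults taken while doing so (-1 on error). */
static long faults_while_touching(size_t pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page;
    volatile unsigned char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);
    for (size_t i = 0; i < pages; i++)
        mem[i * page] = 1;           /* ordinary store, no explicit syscall */
    getrusage(RUSAGE_SELF, &after);

    munmap((void *)mem, len);
    return after.ru_minflt - before.ru_minflt;
}
```

The exact count depends on the kernel and its configuration, so treat the number as indicative rather than exact.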
Short answer:
It depends on the environment and on how the code is compiled, but the code you show shouldn't need the kernel. The CPU executes machine code directly and only traps to the kernel on instructions like syscall, on faults such as a page fault, or on an interrupt.
The ISA is designed so that a kernel can set up the page tables in a way that stops user-space from taking over the machine, even though the CPU is fetching bytes of its machine code directly. This is how user-space code can run just as efficiently when it's just operating on its own data, doing pure computation not hardware access.
Long answer:
Comparing a value and incrementing it shouldn't require a kernel. On the x86 (64-bit) architecture, the code could be represented like this (in NASM syntax):
; a is in RAX, perhaps a return value from some earlier function
cmp rax, 0 ; if (a<0) implemented as
jnl no_increase ; a jump over the inc if a is Not Less-than 0
inc rax
no_increase:
Actual compilers do it branchlessly, with various tricks as you can see on the Godbolt compiler explorer.
Clearly there aren't any system calls, so this piece of code can run on any x86-64 device, although on its own it wouldn't do anything meaningful.
What requires a kernel is a system call. System calls aren't strictly required to produce output: in theory you could find a memory location that corresponds to, say, video memory and manipulate pixels to put something on the screen. For userland this isn't possible because of virtual memory.
A userspace application needs a kernel to exist; without a kernel there would be no userspace :) Also note that not every kernel provides a userspace.
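For comparison, here is the same logic in C together with one possible branchless form. This is only a sketch: the function names are invented for illustration, and no claim is made that a particular compiler emits exactly this. Neither function involves a system call.

```c
#include <stdint.h>

/* Straightforward version: mirrors the cmp / jnl / inc sequence. */
static int64_t bump_if_negative(int64_t a)
{
    if (a < 0)
        a++;
    return a;
}

/* Branchless version: (a < 0) evaluates to 1 or 0, so it can be added
 * directly instead of jumping over an increment. */
static int64_t bump_if_negative_branchless(int64_t a)
{
    return a + (a < 0);
}
```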
So only doing something like:
write(STDOUT_FILENO, "windows sucks linux rocks", 25); /* POSIX write() from <unistd.h>; fd 1 is already open */
would obviously require a kernel.
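To make the trap into the kernel more explicit, the libc wrapper can be bypassed with the generic syscall() function. A minimal sketch, assuming Linux and glibc (write_via_syscall is a made-up name):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                  /* exposes syscall() in <unistd.h> */
#endif
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel to write `msg` to file descriptor `fd` by issuing the
 * write system call directly. Returns the byte count, or -1 with errno set. */
static long write_via_syscall(int fd, const char *msg)
{
    return syscall(SYS_write, fd, msg, strlen(msg));
}
```

For example, write_via_syscall(STDOUT_FILENO, "hello, kernel\n") prints the message; the user-space code only prepares the arguments, and the kernel does the actual I/O.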
Writing to or reading from an arbitrary memory location, for example 0xB8000 to manipulate video memory, doesn't need a kernel (on a system without memory protection).
TL;DR: The example code you provided needs a kernel in order to run in userspace, but it can also be written for a system where userspace and a kernel don't exist at all and work perfectly fine (e.g. microcontrollers).
In simpler words: it doesn't require a kernel to work, since it doesn't use any system calls. For meaningful operation on a modern operating system, though, it would at least require an exit system call to exit with a status code; otherwise you will see a segmentation fault, even though you did no dynamic allocation, because execution simply continues past the end of your code.
Whatever code we write runs, obviously, in user mode. Kernel mode only comes into the picture when the code performs a system call, and since the if() does not call any system function, it is not going to run in kernel mode.

how to change stack protection via syscalls without parameters

This is a slightly strange question. I am trying to find a syscall, callable without parameters on i386, that would allow executing code on the stack. I am doing a CTF and have found a way to invoke a syscall while controlling eax, and I have full control of the stack (via argv, so just pointers to my strings). Now I am jumping into the vDSO (that's all the code there is in the program; no DLLs or anything else) to run a syscall that will allow stack execution. But I have gone over the man pages again and again and haven't found anything I can use.
$ uname -r
4.4.179-0404179-generic
There's no zero-arg Linux system call equivalent to mprotect(stack_base, stack_size, PROT_WRITE|PROT_READ|PROT_EXEC).
Not that I know of, and I wouldn't expect there to be one. Probably the only use case would be to help attackers, which is the opposite of hardening; normally you can make the stack executable via linker options or any specific pages via mprotect with args. There's no need for a shortcut for that.
There's also not one that can set the READ_IMPLIES_EXEC personality for an already-running process, even if you do allow args. (See Using personality syscall to make the stack executable - at best it will have an effect after execve.)
You might be able to use some ROP techniques to get some args set up for mprotect, and then return to the code you injected.

vxworks system call trap mechanism

I'm new to VxWorks and working with an ELF binary for VxWorks. System calls appear to trap into the kernel by calling the address _func_syscallTrapHandle, which is 0x1234. Since the program must transition into the kernel, am I correct in assuming that the goal of this is to segfault by accessing low memory in order to enter the kernel? If so, does the segfault ISR check the contents of rax and, when it's 0x1234, perform the system call logic? Why isn't the syscall instruction used instead?
You are describing the system call trap mechanism in vxsim. Because VxWorks in this case runs as a normal process inside Linux or Windows, it cannot use the syscall instruction.
An ELF binary for real hardware behaves differently.

How does gdb implement call function

When I use gdb to debug a process on ARM Linux, I can use the call command, e.g. call write(123,"abc",3)
How does gdb inject that call into process and recover all?
GDB can read and write the memory of the inferior (the process being debugged) using the ptrace system call.
So it reads and saves in its own memory some chunk of instructions from inferior (say 100 bytes).
Then it overwrites this chunk with new instructions, which look something like:
r0 = 123
r1 = pointer to "abc"
r2 = 3
BL write
BKPT
Now GDB saves the current inferior registers, sets the program counter (pc) to point to the chunk of instructions it just wrote, and resumes the inferior.
Inferior executes instructions until it reaches the breakpoint, at which point GDB regains control. It can now look at the return register to know what write returned and print it. GDB now restores the original instructions and original register values, and we are back as if nothing happened.
P.S. This is a general description of how "call function in inferior" works; I do not claim that this is exactly how it works.
There are also complications: if write calls back into the code that GDB overwrote, it wouldn't work. So in reality GDB uses some other mechanism to obtain suitable "scratch area" in the inferior. Also, the "abc" string requires scratch area as well.
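The save, overwrite and restore cycle described above can be sketched with ptrace on a single data word instead of a chunk of instructions. This is an illustration of the mechanism only, not GDB's actual code: it assumes Linux, assumes the environment permits ptrace, and the names scratch_word and peek_poke_restore are invented here.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static long scratch_word = 0x1111;   /* same address in parent and child after fork() */

/* Fork a traced child, then from the parent: read the child's copy of
 * scratch_word, overwrite it with new_value, read it back, and restore it.
 * seen[0..2] receive the three values read. Returns 0 on success, -1 on failure. */
static int peek_poke_restore(long new_value, long seen[3])
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);   /* let the parent trace us */
        raise(SIGSTOP);                          /* hand control to the parent */
        _exit(0);
    }

    int status, rc = -1;
    if (waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status)) {
        errno = 0;
        seen[0] = ptrace(PTRACE_PEEKDATA, child, &scratch_word, NULL);   /* save */
        if (errno == 0 &&
            ptrace(PTRACE_POKEDATA, child, &scratch_word, (void *)new_value) == 0) {
            seen[1] = ptrace(PTRACE_PEEKDATA, child, &scratch_word, NULL);
            ptrace(PTRACE_POKEDATA, child, &scratch_word, (void *)seen[0]); /* restore */
            seen[2] = ptrace(PTRACE_PEEKDATA, child, &scratch_word, NULL);
            rc = 0;
        }
    }
    kill(child, SIGKILL);            /* the child is no longer needed */
    waitpid(child, &status, 0);
    return rc;
}
```

A debugger does the same kind of thing with instruction bytes and the full register set (via requests such as PTRACE_GETREGS and PTRACE_SETREGS on architectures that provide them), and then resumes the inferior until it hits the breakpoint.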

How are parameters passed to Linux system call ? Via register or stack?

I am trying to understand the internals of the Linux kernel by reading Robert Love's Linux Kernel Development.
On page 74 he says the easiest way to pass arguments to a syscall is via registers:
Somehow, user-space must relay the parameters to the kernel during the
trap. The easiest way to do this is via the same means that the syscall
number is passed: The parameters are stored in registers. On x86-32,
the registers ebx, ecx, edx, esi, and edi contain, in order, the first
five arguments.
Now this is bothering me for a number of reasons:
All syscalls are defined with the asmlinkage option, which implies that the arguments are always found on the stack and not in registers. So what is all this business with the registers?
It may be possible that before the syscall is performed the values are copied on to the kernel stack. I have no idea why that would be efficient but it might be a possibility.
(This answer is for 32-bit x86 Linux to match your question; things are slightly different for 64-bit x86 and other architectures.)
The parameters are passed from userspace in registers as Love says.
When userspace invokes a system call with int $0x80, the kernel syscall entry code gets control. This is written in assembly language and can be seen here, for instance. One of the things this code does is to take the parameters from the registers and push them onto the stack, and then call the appropriate kernel sys_XXX() function (which is written in C). So those functions do indeed expect their arguments on the stack.
It wouldn't work as well to try to pass parameters from userspace to the kernel on the stack. When the system call is made, the CPU switches to a separate kernel stack, so the parameters would have to be copied from the userspace stack to the kernel stack, and this is somewhat complicated. And it would have to be done even for very simple system calls that just take a few numeric arguments and wouldn't otherwise need to access userspace memory at all (think about close() for instance).
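The answer above concerns 32-bit x86, where int $0x80 takes the syscall number in eax and the arguments in ebx, ecx, edx, esi and edi. As a rough illustration of the same idea on x86-64 Linux, the sketch below loads the syscall number into rax and up to three arguments into rdi, rsi and rdx before executing the syscall instruction. The helper name raw_syscall3 is hypothetical, and on other architectures the sketch simply falls back to libc's syscall().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                  /* exposes syscall() in <unistd.h> */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Issue a system call with up to three arguments, all passed in registers.
 * On x86-64 the raw kernel result is returned (negative errno on failure);
 * the fallback path follows libc conventions (-1 with errno set). */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                           /* result comes back in rax */
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)  /* rax, rdi, rsi, rdx */
                      : "rcx", "r11", "memory");            /* clobbered by syscall */
    return ret;
#else
    return syscall(nr, a1, a2, a3);
#endif
}
```

For example, raw_syscall3(SYS_getpid, 0, 0, 0) should return the same value as getpid(). Nothing is placed on the user stack for the kernel to read; the entry code picks the values up from the saved registers.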

Resources