I am trying to add a set of system calls to support semaphores in xv6.
I added a syssemaphore.c file (which will hold the functions that fetch the user arguments from the user stack using argptr, argint, etc.) and noticed that I can't find the header file that would declare the functions I will write.
Basically, I want to add files like sysproc.c and sysfile.c.
Is it possible?
Adding a new system call to xv6 means touching the entire system call flow: from user space raising the system call interrupt with the system call number in the eax register, through the syscall function, which runs the matching system call handler, and finally to the system call implementation (which includes a sys_something function that retrieves the user parameters and validates them).
If I understand your question correctly, your new file, syssemaphore.c, contains the sys_something functions that you wish to call from syscall in syscall.c.
The syscall function is the only function that should invoke your new sys_something wrappers. Therefore, it is sufficient to add those function prototypes (as extern declarations) above the syscalls array in syscall.c, which then lets you add your new functions to the syscalls array.
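The wiring described above can be sketched as a small user-space mock of the dispatch table. This is not xv6 source: the names sys_sem_open and sys_sem_wait, the numbers 22 and 23, and the dispatch function are assumptions made for illustration, and the handlers are defined in the same file only so that the sketch compiles on its own. In xv6 they would live in syssemaphore.c and only the extern lines would appear in syscall.c.

```c
#include <stddef.h>

/* Hypothetical syscall numbers; in xv6 these would be added to syscall.h. */
#define SYS_sem_open 22
#define SYS_sem_wait 23

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* The prototypes that would go above the syscalls array in syscall.c. */
extern int sys_sem_open(void);
extern int sys_sem_wait(void);

/* Table indexed by syscall number; unlisted slots are NULL. */
static int (*syscalls[])(void) = {
    [SYS_sem_open] = sys_sem_open,
    [SYS_sem_wait] = sys_sem_wait,
};

/* Stand-in for xv6's syscall(): look up the handler and run it,
 * or return -1 for an unknown number. */
int dispatch(int num)
{
    if (num > 0 && (size_t)num < NELEM(syscalls) && syscalls[num])
        return syscalls[num]();
    return -1;
}

/* Placeholder handlers (would be in syssemaphore.c). Real ones would
 * fetch their arguments with argint/argptr before doing any work. */
int sys_sem_open(void) { return 100; }
int sys_sem_wait(void) { return 200; }
```

Calling dispatch(SYS_sem_open) runs the placeholder handler, while a number with no table entry falls through to -1, which mirrors how an unknown system call is rejected.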
See additional information at How to pass a value into system call XV6
As I understand it, when user space calls bpf_map_update_elem(int fd, void *key, void *value, __u64 flags):
first, user space finds the map through the fd;
second, user space allocates memory in user space;
and ....
I know a little, but the exact process is still unclear to me.
So I would like to know what happens in detail when user space calls the map helper API.
Because you mention “user space”, I am unsure what you are talking about exactly. So let's start with some clarification.
BPF maps (or at least, most of the existing types, including hash maps and arrays) can be accessed in two ways:
From user space, by any application running on the system and having sufficient permission
From kernel space, from BPF programs
From user space, there is no “helper” function. Interaction with maps is entirely (*) done through the bpf() syscall (with the BPF_MAP_LOOKUP_ELEM, BPF_MAP_UPDATE_ELEM, BPF_MAP_DELETE_ELEM commands passed to the syscall as its first argument). See the bpf(2) manual page for more details. This is what you use in a user space application that would load and manage BPF programs and maps, say bpftool for example.
From kernel space, i.e. from a BPF program, things work differently and access is done with one of the BPF “helpers” such as bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags). See the bpf-helpers(7) man page for details on existing helpers. You can find details on those helper calls in the Cilium guide, or obviously by reading kernel code (e.g. for array maps). They look like low-level C function calls, with BPF registers being used to pass the necessary arguments, and then it calls from the BPF program instructions into the helper that is compiled as part of the kernel binary.
So you mentioned bpf_map_update_elem() and user space. Although this is the name for the helper on the kernel side, I suspect you might be talking about the function with the same name which is offered by the libbpf library, to provide a wrapper around the bpf() system call. So what happens with this function is rather simple.
There is no need to find the map from the file descriptor in user space. Actually the opposite happens: the file descriptor is opened from the map in user space (from its map id, or from its pinned path under the /sys/fs/bpf virtual file system, for example). So the fd is passed to the bpf() system call and used by the kernel as a reference to the map.
I'm not sure what you mean by “userspace make a memory in user-space”. There is no need to allocate any memory here: the key and value should already have been filled in at this point, and they are passed to the kernel through the bpf() syscall to tell it which entry to update, and with what value. The same goes for the flags.
Once bpf() has been called, what happens on the kernel side is rather straightforward. Mostly, the kernel checks permissions, validates the arguments (to make sure they are safe and consistent with the map), then updates the actual data. For array maps, array_map_update_elem() (used with the BPF helper on the kernel side too, see link above) is eventually called.
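To make the user-space side concrete, here is a minimal sketch of what a wrapper around the bpf() system call for BPF_MAP_UPDATE_ELEM can look like. The function name map_update is hypothetical (chosen so it does not clash with libbpf), and this is not libbpf's actual source. It assumes the Linux UAPI header linux/bpf.h is installed.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: pass an already-open map fd plus pointers to an
 * already-filled key and value to the kernel. Returns 0 on success,
 * -1 with errno set on failure. */
int map_update(int map_fd, const void *key, const void *value, uint64_t flags)
{
    union bpf_attr attr;

    /* Unused bytes of the attribute union must be zero. */
    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (uint32_t)map_fd;
    attr.key    = (uint64_t)(uintptr_t)key;   /* user pointer, as a 64-bit value */
    attr.value  = (uint64_t)(uintptr_t)value;
    attr.flags  = flags;                      /* e.g. BPF_ANY */

    return (int)syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}
```

Nothing is allocated here: the caller owns key and value, and the kernel copies them in after resolving map_fd to the map. Passing a file descriptor that does not refer to a map makes the call fail with -1.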
(*) Some interactions might actually be done without the bpf() system call: I believe that with “global data” stored in BPF maps, user applications mmap() the kernel memory. But this goes beyond the scope of basic usage of arrays and maps.
I am trying to add a Linux syscall for an ARM architecture. So far I have added a new syscall number in the /arch/arm/include/asm/unistd.h file, added a function prototype in syscalls.h and added an entry for the syscall in calls.S. After compiling the kernel I am able to execute the syscall using the syscall() function. I now want the user to call the function through a wrapper or a stub, so that the user does not need to remember the syscall number. How do I achieve this at the kernel level? I tried looking into the _syscall() function but it seems to be deprecated.
Thanks.
If one tries to hook certain syscalls via sys_call_table hooking, e.g. sys_execve, this will fail, because they are called indirectly through a stub. For sys_execve this is stub_execve (compare assembly code on LXR).
But what are these stubs good for? Why do only certain system calls like execve(2) and fork(2) require a stub and how is this connected to x86_64? Is there a workaround to hook stubbed syscalls (in a Loadable Kernel Module)?
From here, it says:
"Certain special system calls that need to save a complete full stack frame."
And I think execve is just one of these special system calls.
Judging from the code of stub_execve, if you want to hook it, you can at least try one of the following:
Work out what that assembly code does and reimplement it yourself, so that you can call your own function from your own assembly code.
In the middle of the assembly code there is a call sys_execve; you can replace the address of sys_execve with the address of your own hook function.
I am trying to implement functionality in a linux 2.6.32.60 x86 kernel that would allow me to block all system calls based on a field I added in the task struct. This would basically be of the form:
struct task_struct *ts;  /* the task making the system call */
if (ts->added_field == 0) {
    /* do system call normally */
} else {
    /* don't do system call */
}
I was wondering if I should do this directly in entry_32.S or if I would be able to modify the way the syscall table is called elsewhere. The problem with directly modifying entry_32.S is that I don't know if I can access the task struct that is making the call.
Thanks for the help!
The kernel already has a very similar feature, called seccomp (LWN article). You may want to consider basing your feature off of this, rather than implementing something new.
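As a rough illustration of what seccomp already offers, the sketch below puts a child process into strict mode and checks that a system call outside the small strict-mode allow list gets the child killed. The function name seccomp_strict_blocks_getpid is hypothetical, and the result depends on the kernel being built with seccomp support, so the function reports "unavailable" separately from "not blocked".

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child was killed for calling getpid under strict
 * seccomp, 0 if the call was not blocked, -1 if seccomp is unavailable
 * or fork/waitpid failed. */
int seccomp_strict_blocks_getpid(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: from here on only read, write, exit and sigreturn are allowed. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);              /* strict mode could not be enabled */
        syscall(SYS_getpid);       /* not allowed: the kernel sends SIGKILL */
        syscall(SYS_exit, 0);      /* reached only if the call was let through */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 2)
        return -1;
    return 0;
}
```

The per-task flag the kernel checks here plays much the same role as the added task_struct field in the question, which is why building on seccomp may be less work than a new mechanism.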
If I were to do this, I'd hook into __kernel_vsyscall() and just stop the dispatch if the task structure so indicated per your logic above.
Specifically, arch/i386/kernel/vsyscall-sysenter.S is shared among every process's address space and is the entry point through which all syscalls go. This is the spot just before the actual syscall is dispatched and, in my opinion, the place to put your hook. You are in the process's address space, so you should have access to mm->current for your task structure. (See also arch/sh/kernel/vsyscall/vsyscall.c)
An LKM can dynamically create entries inside /proc/sys, but sysctl (the C function, not the Linux command) takes as its first argument an array of ints with predefined values representing entries inside /proc/sys. My question is: can I read a dynamically created entry with sysctl, or do I need to use fopen, read, etc.?
You need to use the file system interface: fopen, fread, etc (or open, read, if you prefer).
As for the C function called sysctl, don't use it:
Use of this system call has long been discouraged, and it is so unloved that it is likely to disappear in a future kernel version. Since Linux 2.6.24, uses of this system call result in warnings in the kernel log. Remove it from your programs now; use the /proc/sys interface instead.
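Reading an entry through the file system interface can be sketched as follows. The helper name read_sys_entry is hypothetical, and the path of a dynamically created entry depends on how the module registers it, so the path is left as a parameter.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a /proc/sys entry into buf (at most len-1
 * characters) and strip the trailing newline.
 * Returns 0 on success, -1 if the entry cannot be opened or read. */
int read_sys_entry(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

For example, read_sys_entry("/proc/sys/kernel/ostype", buf, sizeof buf) reads a standard entry; an entry created by a module is read the same way once its path is known.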