Notifying user-mode as soon as a packet arrives - c

(This is for a low latency system)
Assuming I have some code which transfers received UDP packets to a region of shared memory, how can I then notify the application (in user mode) that it is now time to read the shared memory? I do not want the application continuously polling eating up cpu cycles.
Is it possible to insert some code in the network stack which can call my application code immediately after it has written to the shared memory?
EDIT I added a C tag, but the application would be in C++

One way to signal an event from one Unix process to another is with POSIX semaphores. You would use sem_open to initialize and open a named semaphore that you can use cross-process.
See How can I get multiple calls to sem_open working in C?.
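A minimal sketch of the named-semaphore approach, assuming Linux with POSIX semaphores available. The function name semaphore_demo and the semaphore name passed to it are made up for illustration, and a forked child stands in for the process that fills the shared memory:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the waiter was woken by the poster. */
int semaphore_demo(const char *name)
{
    assert(name != NULL && name[0] == '/');  /* named semaphores start with '/' */
    sem_unlink(name);                        /* drop leftovers from earlier runs */

    /* Initial value 0, so the first sem_wait() blocks until someone posts. */
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
    if (sem == SEM_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sem_close(sem);
        sem_unlink(name);
        return -1;
    }
    if (pid == 0) {
        /* Producer side: open the same semaphore by name and signal it. */
        sem_t *s = sem_open(name, 0);
        if (s == SEM_FAILED)
            _exit(1);
        int ok = sem_post(s);
        sem_close(s);
        _exit(ok == 0 ? 0 : 1);
    }

    /* Consumer side: sleeps in the kernel until the post, no polling. */
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc != 0 && errno == EINTR);

    int status = 0;
    waitpid(pid, &status, 0);
    sem_close(sem);
    sem_unlink(name);
    return (rc == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In a real system the two sides would be separate programs that agree on the semaphore name; on older glibc versions you may need to link with -pthread.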
The lowest latency method to signal an event between processes on the same host is to spin-wait looking for a (shared) memory location to change... this avoids a system call. You expressly said you do not want the application polling, however in a multi-threaded application running on a multi-core system it may not be a bad tradeoff if you really care about latency.
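For comparison, a rough sketch of the spin-wait variant using C11 atomics on an anonymous shared mapping. The struct mailbox layout and the function name spin_wait_demo are assumptions made for this example, not something from the question:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared-memory layout: a payload plus a "ready" counter. */
struct mailbox {
    atomic_uint seq;      /* set by the producer after the payload is written */
    uint32_t payload;
};

/* The child publishes 'value'; the parent spins until it becomes visible. */
int spin_wait_demo(uint32_t value, uint32_t *out)
{
    assert(out != NULL);
    struct mailbox *mb = mmap(NULL, sizeof *mb, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mb == MAP_FAILED)
        return -1;
    atomic_init(&mb->seq, 0);
    mb->payload = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mb, sizeof *mb);
        return -1;
    }
    if (pid == 0) {
        mb->payload = value;
        /* Release store: the payload write is visible before seq changes. */
        atomic_store_explicit(&mb->seq, 1, memory_order_release);
        _exit(0);
    }

    /* Busy-wait: occupies one core, but makes no system call. */
    while (atomic_load_explicit(&mb->seq, memory_order_acquire) == 0)
        ;
    *out = mb->payload;

    waitpid(pid, NULL, 0);
    munmap(mb, sizeof *mb);
    return 0;
}
```

The acquire/release pair is what makes it safe to read the payload once the counter has changed. Whether burning a core on the loop is acceptable is exactly the tradeoff described above.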

Unless you are planning to use a real-time OS, there is no "immediate" protocol. CPU time is handed out in quanta of a few milliseconds, and it usually takes some time before your user thread is scheduled again and notices that it can continue.
Considering all of the above, any form of IPC would do: local sockets, signals, pipes, event descriptors, etc. The practical difference in performance would be negligible.
Furthermore, using shared memory can lead to unnecessary complications in maintenance and debugging, but that's the designer's choice.

Related

Non-blocking access to the file system

When writing a non-blocking program (handling multiple sockets) which at a certain point needs to open files using open(2), stat(2) files or open directories using opendir(2), how can I ensure that the system calls do not block?
To me it seems that there's no other alternative than using threads or fork(2).
As Mel Nicholson replied, for everything file-descriptor based you can use select/poll/epoll. For everything else you can have a proxy thread per item (or a thread pool) with a small stack that converts (by means of the kernel scheduler) any synchronous blocking wait into a select/poll/epoll-able asynchronous event, using eventfd or a Unix pipe (where portability is required).
The proxy thread shall block till the operation completes and then write to the eventfd or to the pipe to wake up the select/poll/epoll.
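A sketch of that proxy-thread pattern on Linux, assuming stat(2) is the blocking call being wrapped. The names struct stat_job, stat_worker and async_stat are invented for this example:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical job record shared between the event loop and the proxy. */
struct stat_job {
    const char *path;   /* input */
    struct stat st;     /* output */
    int result;         /* return value of stat() */
    int done_fd;        /* eventfd signalled when the job has finished */
};

static void *stat_worker(void *arg)
{
    struct stat_job *job = arg;
    job->result = stat(job->path, &job->st);    /* may block on disk I/O */
    uint64_t one = 1;
    if (write(job->done_fd, &one, sizeof one) != (ssize_t)sizeof one)
        job->result = -1;
    return NULL;
}

/* Runs stat() on a proxy thread and waits for it with poll().
 * Returns stat()'s result, or -2 if the machinery itself failed. */
int async_stat(const char *path, struct stat *out)
{
    assert(path != NULL && out != NULL);
    struct stat_job job = { .path = path, .result = -1 };
    job.done_fd = eventfd(0, EFD_CLOEXEC);
    if (job.done_fd < 0)
        return -2;

    pthread_t tid;
    if (pthread_create(&tid, NULL, stat_worker, &job) != 0) {
        close(job.done_fd);
        return -2;
    }

    /* In a real server this fd would sit in the main poll set
     * next to the sockets instead of being polled on its own. */
    struct pollfd pfd = { .fd = job.done_fd, .events = POLLIN };
    int n;
    do {
        n = poll(&pfd, 1, -1);
    } while (n < 0 && errno == EINTR);

    uint64_t count = 0;
    if (n == 1 && (pfd.revents & POLLIN) &&
        read(job.done_fd, &count, sizeof count) != (ssize_t)sizeof count)
        count = 0;

    pthread_join(tid, NULL);
    close(job.done_fd);
    *out = job.st;
    return count == 1 ? job.result : -2;
}
```

The event loop itself never calls stat(); it only sees one more readable descriptor. A pipe would work the same way where eventfd is not available.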
Indeed there is no other method.
Actually there is another kind of blocking that can't be dealt with other than by threads, and that is page faults. Those may happen in program code, program data, memory allocation or data mapped from files. It's almost impossible to avoid them (you can lock some pages in memory, but that is a privileged operation and would probably backfire by making the kernel do a poor job of memory management somewhere else). So:
You can't really weed out every last chance of blocking for a particular client, so don't bother with the likes of open and stat. The network will probably add larger delays than these functions anyway.
For optimal performance you should have enough threads so some can be scheduled if the others are blocked on page fault or similar difficult blocking point.
Also if you need to read and process or process and write data during handling a network request, it's faster to access the file using memory-mapping, but that's blocking and can't be made non-blocking. So modern network servers tend to stick with the blocking calls for most stuff and simply have enough threads to keep the CPU busy while other threads are waiting for I/O.
The fact that most modern servers are multi-core is another reason why you need multiple threads anyway.
You can use the poll() system call to check any number of sockets for data using a single thread.
See here for Linux details, or man poll for the details on your system.
open() and stat() will block the thread they are called from on all POSIX-compliant systems unless they are called via an asynchronous tactic (for example, from a forked child process).
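A small sketch of the single-threaded poll() idea. The helper name first_readable and the MAX_FDS limit are assumptions made for the example:

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 16   /* arbitrary limit for this sketch */

/* Returns the index of the first descriptor with data to read,
 * or -1 on timeout or error. */
int first_readable(const int *fds, int nfds, int timeout_ms)
{
    assert(fds != NULL);
    if (nfds <= 0 || nfds > MAX_FDS)
        return -1;

    struct pollfd pfds[MAX_FDS];
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* One call watches every descriptor; the thread sleeps until
     * one of them is ready or the timeout expires. */
    int n = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (n <= 0)
        return -1;

    for (int i = 0; i < nfds; i++)
        if (pfds[i].revents & POLLIN)
            return i;
    return -1;
}
```

A timeout of 0 makes the call a pure non-blocking check, and -1 waits indefinitely.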

pthread on-wakeup execution

How can I make my pthreads execute a function each time they are rescheduled by the kernel?
I need to identify on which physical CPU/socket (not logical core) my thread is being scheduled at and cannot afford to do this all the time.
Can the wakeup routine be hooked somehow to make the necessary updates to TLS only when the thread is actually being rescheduled?
As to why I need this: I have code which executes AMOs approximately every 70 ns per thread, which is fine if the address is not cached on another socket; deploying the same code on two sockets makes it 15 times slower because of frequent cache invalidations. I intend to allocate memory especially for this which is only shared among threads sharing the same L3 cache. So I need to identify on which socket I am running and address the correct memory block. I could obviously call sched_getcpu and compare this to the physical CPU ID in /proc/cpuinfo, but this is a rather big overhead. I cannot afford to allocate thread-private memory for each thread though; that is too expensive.
From what I have read in Linux Kernel Development, Third Edition, the kernel provides no service or interface for what you want. Setting the thread affinity (as suggested above by @osgx; with glibc on Linux the call is pthread_setaffinity_np, or sched_setaffinity for the calling thread) or caching a TLS key per CPU socket at startup (as suggested above by @caf) are perhaps the best approaches in that direction.
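A sketch of the pin-then-cache idea, assuming glibc on Linux. The function names pin_to_first_allowed_cpu and current_cpu_cached are hypothetical, and mapping the CPU number to a physical socket is left out:

```c
#define _GNU_SOURCE      /* pthread_setaffinity_np, sched_getcpu, CPU_* */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Pins the calling thread to the lowest-numbered CPU it is allowed
 * to run on. Returns that CPU number, or -1 on failure. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (pthread_getaffinity_np(pthread_self(), sizeof allowed, &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;
        cpu_set_t one;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        if (pthread_setaffinity_np(pthread_self(), sizeof one, &one) != 0)
            return -1;
        return cpu;
    }
    return -1;
}

/* Per-thread cache: once the thread is pinned its CPU cannot change,
 * so sched_getcpu() only has to be called once per thread. */
static _Thread_local int cached_cpu = -1;

int current_cpu_cached(void)
{
    if (cached_cpu < 0)
        cached_cpu = sched_getcpu();
    return cached_cpu;
}
```

The cached value is only trustworthy after the thread has been pinned; an unpinned thread can migrate at any time without the cache noticing.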

What's the difference between threads (and processes) in kernel mode and those in user mode?

My questions:
1) The book Modern Operating Systems says that threads and processes can be in kernel mode or user mode, but it does not clearly say what the difference between them is.
2) Why does a switch between kernel-mode threads and processes cost more than a switch between user-mode threads and processes?
3) I am now learning Linux. How would I create threads and processes in kernel mode and in user mode, respectively, on a Linux system?
4) The book also says that a process can be in user mode while the threads created in that user-mode process can be in kernel mode. How is this possible?
There are some terminology problems due more to historical accident than anything else here.
"Thread" usually refers to thread-of-control within a process, and may (does in this case) mean "a task with its own stack, but which shares access to everything not on that stack with other threads in the same protection domain".
"Process" tends to refer to a self-contained "protection domain" which may (and does in this case) have the ability to have multiple threads within it. Given two processes P1 and P2, the only way for P1 to affect P2 (or vice versa) is through some particular defined "communications channel" such as a file, pipe, or socket; via "inter-process" signals like Unix/Linux signals; and so on.
Since threads don't have this kind of barrier between each other, one thread can easily interfere with (corrupt the data used by) another thread.
All of this is independent of user vs kernel, with one exception: in "the kernel"—note that there is an implicit assumption here that there is just one kernel—you have access to the entire machine state at all times, and full privileges to do anything. Hence you can deliberately (or in some cases accidentally) disregard or turn off hardware protection and mess with data "belonging to" someone else.
That mostly covers several possibly-confused items in Q1. As for Q2, the answer to the question as asked is "it doesn't". In general, because threads do not involve (as much) protection, it's cheaper to switch from one thread to another: you do not have to tell the hardware (in whatever fashion) that it should no longer allow various kinds of access, since threads T1 and T2 have "the same" access. Switching between processes, however, as with P1 and P2, you "cross a protection barrier", which has some penalty (the actual penalty varies widely with hardware, and to some extent the skills of the OS writers).
It's also worth noting that crossing from user to kernel mode, and vice versa, is also crossing a protection domain, which again has some kind of cost.
In Linux, there are a number of ways for user processes to create what amount to threads, including both "POSIX threads" (pthreads) and the clone call (details for clone, which is extremely flexible, are beyond the scope of this answer). If you want to write portable code, you should probably stick with pthreads.
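As a small pthreads illustration, assuming nothing beyond the standard POSIX threads API: two threads update the same counter, which works only because they share one address space, and the mutex is what stops them from corrupting each other's updates. The names struct counter, add_many and run_two_threads are made up for the example:

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS 100000

/* Data visible to both threads, because threads share the address space. */
struct counter {
    pthread_mutex_t lock;
    long value;
};

static void *add_many(void *arg)
{
    assert(arg != NULL);
    struct counter *c = arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&c->lock);    /* without this, updates can be lost */
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Starts two threads on the same counter; returns the final value,
 * or -1 if a thread could not be created. */
long run_two_threads(void)
{
    struct counter c = { PTHREAD_MUTEX_INITIALIZER, 0 };
    pthread_t t[2];
    int started = 0;

    for (; started < 2; started++)
        if (pthread_create(&t[started], NULL, add_many, &c) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);

    return started == 2 ? c.value : -1;
}
```

Two separate processes would each get their own copy of the counter unless they explicitly set up shared memory, which is the protection barrier described above.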
Within the Linux kernel, threads are done completely differently, and you will need Linux kernel documentation.
I can't properly answer Q4 since I don't have the book and am not sure what they are referring to here. My guess is that they mean that whenever any user process-or-thread makes a "system call" (requests some service from the OS), this crosses that user/kernel protection barrier, and it is then up to the kernel to verify that the user code has appropriate privileges for that operation, and then to do that operation. The part of the kernel that does this is running with kernel-level protections and thus needs to be more careful.
Some hardware (mostly obsolete these days) has (or had) more than just two levels of hardware-provided protection. On these systems, "user processes" had the least direct privilege, but above those you would find "executive mode", "system mode", and (most privileged) "kernel" or "nucleus" mode. These were intended to lower the cost of crossing the various protection barriers. Code running in "executive" did not have full access to everything in the machine, so it could, for instance, just assume that a user-provided address was valid, and try to use it. If that address was in fact invalid, the exception would rise to the next higher level. With only two levels—"user", unprivileged; and "kernel", completely-privileged—kernel code must be written very carefully. However, it's possible to provide "virtual machines" at low cost these days, which pretty much obsoletes the need for multiple hardware levels of protection. One simply writes a true kernel, then lets it run other things in what they "think" is "kernel mode". This is what VMware and other "hypervisor" systems do.
User-mode threads are scheduled in user mode by something in the process, and the process itself is the only thing handled by the kernel scheduler.
That means your process gets a certain amount of grunt from the CPU and you have to share it amongst all your user mode threads.
Simple case, you have two processes, one with a single thread and one with a hundred threads.
With a simplistic kernel scheduling policy, the thread in the single-thread process gets 50% of the CPU and each thread in the hundred-thread process gets 0.5% each.
With kernel mode threads, the kernel itself manages your threads and schedules them independently. Using the same simplistic scheduler, each thread would get just a touch under 1% of the CPU grunt (101 threads to share the 100% of CPU).
In terms of why kernel mode switching is more expensive, it probably has to do with the fact that you need to switch to kernel mode to do it. User mode threads do all their stuff in user mode (obviously) so there's no involving the kernel in a thread switch operation.
Under Linux, you create threads (and processes) with the clone call, similar to fork but with much finer control over things.
Your final point is a little obtuse. I can't be certain but it's probably talking about user and kernel mode in the sense that one could be executing user code and another could be doing some system call in the kernel (which requires switching to kernel or supervisor mode).
That's not the same as the distinction when talking about the threading support (user or kernel mode support for threading). Without having a copy of the book to hand, I couldn't say definitively, but that'd be my best guess.

How Blocking IO Affects A Multithreaded Application/Service In Linux

I'm exploring several concepts for a web crawler in C on Linux. To decide whether to use blocking I/O, multiplexed I/O, AIO, or some combination, I especially need to know (I should probably discover it for myself with some test code, but for expediency I'd rather ask): when a blocking I/O call is made, is it only the calling thread (assuming a multithreaded app/service) or the whole process that is blocked? More specifically, in a multithreaded (POSIX) app/service, can a thread dedicated to remote reads/writes block the entire process? If so, how can I unblock such a thread without terminating the entire process?
NB: Whether or not I should use blocking/nonblocking is not really the question here.
Kindly
Blocking calls block only the thread that made them, not the entire process.
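A quick way to see this for yourself is a sketch like the following, where a pipe stands in for the remote socket. The names struct reader, blocking_reader and work_while_reader_blocks are invented for the example:

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

struct reader {
    int fd;
    char byte;
    ssize_t got;
};

static void *blocking_reader(void *arg)
{
    struct reader *r = arg;
    r->got = read(r->fd, &r->byte, 1);   /* blocks this thread only */
    return NULL;
}

/* Starts a thread that blocks in read(), does some work on the calling
 * thread in the meantime, then releases the reader by sending one byte.
 * Returns the sum 0 + 1 + ... + (iterations - 1), or -1 on failure. */
long work_while_reader_blocks(long iterations)
{
    assert(iterations >= 0);
    int p[2];
    if (pipe(p) != 0)
        return -1;

    struct reader r = { .fd = p[0], .byte = 0, .got = -1 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, blocking_reader, &r) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* The calling thread keeps running although the reader has no data. */
    long work = 0;
    for (long i = 0; i < iterations; i++)
        work += i;

    /* Making data available is one way to release the blocked thread. */
    int sent = (write(p[1], "!", 1) == 1);
    pthread_join(tid, NULL);
    close(p[0]);
    close(p[1]);
    return (sent && r.got == 1 && r.byte == '!') ? work : -1;
}
```

Here the reader is released by giving it data; with real sockets, a receive timeout is another common way to keep a thread from waiting forever.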
Whether to use blocking I/O (with one socket per thread) or non-blocking I/O (with each thread managing multiple sockets) is something you are going to have to benchmark. But as a rule of thumb...
Linux handles multiple threads reasonably efficiently. So if you are only handling a few dozen sockets, using one thread for each is easy to code and should perform well. If you are handling hundreds of sockets, it is a closer call. And for thousands of sockets, you are almost certainly better off using one thread (or process) to manage large groups.
In the latter case, for optimal performance you probably want to use epoll, even though it is Linux-specific.
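A minimal epoll sketch for that case, Linux-only as noted. The helper names watch_fds and next_ready_fd are assumptions made for the example:

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Registers every descriptor with a new epoll instance.
 * Returns the epoll descriptor, or -1 on error. */
int watch_fds(const int *fds, int nfds)
{
    assert(fds != NULL && nfds > 0);
    int ep = epoll_create1(EPOLL_CLOEXEC);
    if (ep < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[i], &ev) != 0) {
            close(ep);
            return -1;
        }
    }
    return ep;
}

/* Waits for one readable descriptor and returns it,
 * or -1 on timeout or error. */
int next_ready_fd(int ep, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(ep, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

Unlike poll(), the set of watched descriptors lives in the kernel, so it is not rebuilt and copied on every wait. That is the main reason epoll scales better to thousands of sockets.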

Executing a user-space function from the kernel space

I'm writing a custom device driver in Linux that has to be able to respond very rapidly to interrupts. Code to handle this already exists in a user-space implementation, but that is too slow as it relies on software constantly checking the state of the interrupt line. After doing some research, I found that you can register these interrupt lines from a kernel module and have a function executed through a function pointer. However, the code we want to execute is in user space. Is there a way to call a user-space function from a kernel-space module?
You are out of luck with invoking user-space functions from the kernel: the kernel doesn't, and isn't supposed to, know about individual user-space functions and logic. Each user-space application also has its own memory layout, which neither another process nor the kernel is allowed to invade in that way (shared objects are the exception, but you still can't tap into them from kernel space). There is also the security model: user-space code is automatically considered unsafe in kernel context, so running it there would break the kernel's security model on the spot. Considering all of the above, plus many other reasons, you might want to reconsider your approach and focus on kernel <-> user-space IPC and interfaces, the file system, or the usermode-helper API (read below).
You can, however, invoke user-space applications from the kernel using the usermode-helper API. The following IBM developerWorks article should get you started with it:
Kernel APIs, Part 1: Invoking user-space applications from the kernel
I think the easiest way is to register a character device which becomes ready when the device has some data.
Any process which tries to read from this device, then gets put to sleep until the device is ready, then woken up, at which point it can do the appropriate thing.
If you just want to signal readiness, a reader could just read a single null byte.
The userspace program would then just need to execute a blocking read() call, and would be blocked appropriately, until you wake it up.
You will need to understand the kernel scheduler's wait queue mechanism to use this.
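The user-space half of that scheme could look roughly like this. The function name wait_for_event is hypothetical, and the descriptor is assumed to come from open() on whatever device node your driver registers. The kernel half, which would typically sleep with wait_event_interruptible() in the read handler and call wake_up_interruptible() from the interrupt handler, is not shown:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Blocks until the driver hands over one byte.
 * Returns 0 when an event arrived, -1 on end-of-file or error. */
int wait_for_event(int fd)
{
    assert(fd >= 0);
    unsigned char token;
    for (;;) {
        ssize_t n = read(fd, &token, 1);   /* sleeps until the driver wakes us */
        if (n == 1)
            return 0;
        if (n < 0 && errno == EINTR)
            continue;                      /* interrupted by a signal, retry */
        return -1;
    }
}
```

The application would call this in a loop on a dedicated thread and run its handler each time the function returns 0.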
Sounds like your interrupt line is already available to userspace via gpiolib? (/sys/class/gpio/...)
Have you benchmarked if gpio edge triggering and poll() is fast enough for you? That way you don't have to poll the status from the userspace application but edge triggering will report it via poll(). See Documentation/gpio.txt in kernel source.
If edge triggering via sysfs is not good enough, then the proper way is to develop a kernel driver that takes care of the time-critical part and exports the results to userspace via an API (sysfs, device node, etc.).
I am facing the same problem. I read this document, http://people.ee.ethz.ch/~arkeller/linux/multi/kernel_user_space_howto-6.html, and am planning to use signals. In my case there is no chance of losing signals, because:
1. The system is closed loop: I only get another signal after the previous one has been handled.
2. I am using POSIX real-time signals.
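A sketch of the receiving side with POSIX real-time signals, assuming SIGRTMIN is free for application use. The function names block_rt_signal, send_event and wait_event are made up, and a user-space sigqueue() stands in for the kernel-side sender:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Blocks SIGRTMIN so it is queued instead of interrupting the program.
 * In a multithreaded program use pthread_sigmask() before creating threads. */
int block_rt_signal(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Stand-in for the sender: queues SIGRTMIN with an integer payload. */
int send_event(pid_t pid, int payload)
{
    union sigval v;
    v.sival_int = payload;
    return sigqueue(pid, SIGRTMIN, v);
}

/* Sleeps until a queued SIGRTMIN arrives and returns its payload. */
int wait_event(int *payload)
{
    assert(payload != NULL);
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);

    siginfo_t info;
    if (sigwaitinfo(&set, &info) != SIGRTMIN)
        return -1;
    *payload = info.si_value.sival_int;
    return 0;
}
```

Unlike standard signals, real-time signals of the same number are queued and delivered in order, so two events sent back to back are both received.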

Resources