Who executes the task for reading? Is it the kernel? And where is the task enqueued? Is the queue the same for all processes?
http://linux.die.net/man/3/aio_read
The aio_read() function queues the I/O request described by the buffer
pointed to by aiocbp. This function is the asynchronous analog of
read(2).
The kernel starts an I/O request at the request of the process. The process goes and does other things. Since I/O is usually much slower than memory operations, this means the process can do a lot of work before the read will have completed. The I/O completes asynchronously, meaning the process does not block, does not sit there doing nothing while the I/O subsystem goes out to disk and returns data.
An analogy is: you ask a friend to get you a glass of water while you are eating. While the friend gets the water, you continue eating. When the friend gets back later, you drink the water. That is asynchronous delivery of a glass of water. Synchronous means that you sit at the table doing nothing, unable to do anything but wait for the glass of water.
From my understanding, the task is executed by the process calling aio_read. The results are placed in the return buffer that you provided in the original call. This means that depending on the process and which socket it is trying to read, the output will differ even for concurrent calls, since each call stores its result in its own buffer.
Hopefully that was helpful. For additional information I would take a look at the IBM Source I posted below
Asynchronous I/O is currently only supported for sockets. The aio_offset field may be set but it will be ignored.
IBM Source
When a program is doing I/O, my understanding is that the thread will briefly sleep and then resume (e.g. when writing to a file). My question is: when we print using printf(), does a C program thread sleep in any way?
Since you've specifically asked for printf(), I'm going to assume that you mean in the most generic way where it will fill a reasonably sized buffer and invoke the system call write(2) to stdout and that the stdout happens to point to your terminal.
In most operating systems, when you invoke certain system calls the calling thread/process is removed from the CPU runnable list and placed on a separate waiting list. This is true for all I/O calls like read/write/etc. Being temporarily removed from processing due to I/O is not the same as being put to sleep via a timer.
For example, in Linux there's an uninterruptible sleep state of a thread/process specifically meant for I/O waiting, while the interruptible sleep state is for threads/processes waiting on timers and events. Though from a user's perspective they both seem the same, their implementations behind the scenes are significantly different.
To answer your question: a call to printf() isn't exactly sleeping, but rather waiting for the buffer to be flushed to the device. Even then there are a few more quirks which you can read about in signal(7), and even more about various process/thread states in Marek's blog.
Hope this helps.
Much of the point of stdio.h is that it buffers I/O: a call to printf will often simply put text into a memory buffer (owned by the library by default) and perform zero system calls, thus offering no opportunity to yield the CPU. Even when something like write(2) is called, the thread may continue running: the kernel can copy the data into kernel memory (from which it will be transferred to the disk later, e.g. by DMA) and return immediately.
Of course, even on a single-core system, most operating systems frequently interrupt the running thread in order to share the CPU. So another thread can still run at any time, even if no blocking calls are made.
I'd like to know what operations are safe in PortAudio's PaStreamFinishedCallback. I know typically it is not a good idea to attempt operations that could block on the PaStreamCallback for playback as that could cause pops/glitches on the user's or other application's audio streams. Do the same limitations apply to the PaStreamFinishedCallback? I guess ultimately I'm curious if that callback is also called on the OS's audio thread.
Alternately, is there a function like Pa_StopStream that will block until the callback has returned paComplete/paAbort, but without inducing a stop? That'd actually be ideal for my use, since I have a thread that's the right place for me to clean up. I know I could achieve this by having my callback signal to my thread that it's done, and then the thread could call Pa_StopStream but that feels heavy handed.
edit: To give a bit more context about my use, I have a ring buffer that holds some PCM and uses a pthread condvar to signal when space is available in the buffer. One thread writes into this ring and then the PaStreamCallback reads out of the other end. When things are finished, the writer sets a closed flag on the ring and then the callback drains whatever is left. I'd like to make sure my ring drains and that PortAudio flushes. The callback is the only place that knows when the ring drains, so returning paComplete feels appropriate. But then I need some way to know that it's ok to deallocate my ring.
The answer to this is that it depends highly on the host and the behavior may change over time even for one host. I went ahead and read the implementation, and I discovered a couple of useful pieces of information here.
Pa_StopStream will just invoke the host system's Stop()-like behavior. I didn't read all the implementations but presumably most have some sort of blocking Stop(). That means that it's unlikely that blocking for a stop, without actually asking for one, will be a supported behavior.
PaStreamFinishedCallback is also just a thin wrapper on the host's own stream stopped callback. For example, in OSX Core Audio this is a Listener on kAudioOutputUnitProperty_IsRunning. It's entirely up to the host how and when this is called. I think the smart play here is to be as cautious as possible -- assume no blocking operations are safe inside this callback.
So, if you're in the same situation as me where one thread feeds PCM into a ring buffer, and the PaStreamCallback reads from that ring, then you'll probably want to:
1. Subscribe to PaStreamFinishedCallback.
2. Have the producer thread close the ring buffer and let PaStreamCallback drain it.
3. Return paComplete from PaStreamCallback when the ring is drained.
4. Signal the producer thread that work is done from PaStreamFinishedCallback, in my case using pthread_cond_signal.
5. Have the producer thread wake up and clean up by deallocating.
Even signaling (and locking mutexes) from the audio thread is probably best avoided, but it's hard to imagine an alternative here. For regular reads from the PCM ring buffer, the PaStreamCallback should probably spin some limited number of times before giving up. For the completion signal, the producer thread should lock and then immediately wait, so that it holds the lock for as little time as possible.
What is the meaning of "blocking system call"?
In my operating systems course, we are studying multithreaded programming. I'm unsure what is meant when I read in my textbook "it can allow another thread to run when a thread make a blocking system call"
A blocking system call is one that must wait until the action can be completed. read() would be a good example - if no input is ready, it'll sit there and wait until some is (provided you haven't set it to non-blocking, of course, in which case it wouldn't be a blocking system call). Obviously, while one thread is waiting on a blocking system call, another thread can be off doing something else.
For a blocking system call, the caller can't do anything until the system call returns. If the system call may be lengthy (e.g. involve file IO or networking IO) this can be a bad thing (e.g. imagine a frustrated user hammering a "Cancel" button in an application that doesn't respond because that thread is blocked waiting for a packet from the network that isn't arriving). To get around that problem (to do useful work while you wait for a blocking system call to return) you can use threads - while one thread is blocked the other thread/s can continue doing useful work.
The alternative is non-blocking system calls. In this case the system call returns (almost) immediately. For lengthy system calls the result of the system call is either sent to the caller later (e.g. as some sort of event or message or signal) or polled by the caller later. This allows you to have a single thread waiting for many different lengthy system calls to complete at the same time; and avoids the hassle of threads (and locking, race conditions, the overhead of thread switches, etc). However, it also increases the hassle involved with getting and handling the system call's results.
It is (almost always) possible to write a non-blocking wrapper around a blocking system call; where the wrapper spawns a thread and returns (almost) immediately, and the spawned thread does the blocking system call and either sends the system call's results to the original caller or stores them where the original caller can poll for them.
It is also (almost always) possible to write a blocking wrapper around a non-blocking system call; where the wrapper does the system call and waits for the results before it returns.
I would suggest having a read on this very short text:
http://files.mkgnu.net/files/upstare/UPSTARE_RELEASE_0-12-8/manual/html-multi/x755.html
In particular you can read there why blocking system calls can be a worry with threads, not just with concurrent processes:
This is particularly problematic for multi-threaded applications since
one thread blocking on a system call may indefinitely delay the update
of the code of another thread.
Hope it helps.
A blocking system call is a system call by which a process requests some service from the system; if that service is not currently available, the call blocks the process until it is.
If you want to make it clear in context with multi threading you can go through the link...
In the context of block devices like a file: are Linux kernel AIO functions like io_submit() only asynchronous within the supplied queue of I/O operations, or are they (also) asynchronous across several processes and/or threads that also have queues of I/O operations on the same file?
Doc says: The io_submit() system call queues nr I/O request blocks for
processing in the AIO context ctx_id. The iocbpp argument should be
an array of nr AIO control blocks, which will be submitted to context
ctx_id.
Update:
Example:
If I spawn two threads, both have 100 queued I/O operations on the same file and both call io_submit() at approx. the same time; will all 200 I/O operations be asynchronous or will thread #1's 100 I/O operations only be asynchronous in regards to each other but block thread #2 until all thread #1's I/O operations are done?
The only PART of asynchronous behaviour that your application should care about is within your application. Yes, other processes are likely going to ALSO write data to the disk at some point during the runtime of your application. There is very little you can do to stop that in a multitasking, multiuser and potentially multiprocessor system.
The general idea here is that your application doesn't block, the way it would with read or write (and their more advanced cousins fread, fwrite, etc.).
If you want to stop other processes from touching "your" files, then you need to use file-locking or something similar.
When a set of IO requests is submitted with io_submit, the system call returns immediately. From the point of view of the thread emitting the requests, the execution of the commands embedded in the requests is asynchronous. The thread will have to query the OS to know the result, and is free to do what it wants in the meantime.
Now, if two threads happen each to emit a set of requests, they will both be in the same situation. They will both have to ask the OS about the progress of their respective IO commands. Neither thread will be blocked.
From the AIO framework's point of view, it is entirely possible for the OS to actually execute the requests before returning from the io_submit call, for any or all of the threads invoking it, but the API remains the same: userland threads still manipulate the API as an async one, obtaining a token for a future result when they post their requests, and using that token later to get the real result.
In the specific case of linux AIO, the token is the context created beforehand, and the result check syscall is io_getevents, which reports an "event" (ie. a result) for each completed request.
Regarding your example: is it possible that during the second syscall, all the requests of the first thread get completed? I see no reason why that could never happen, but if both threads post 100 requests very close to each other it seems very unlikely. A more likely scenario is that several of the first thread's requests have completed by the time the second thread makes its own call to io_submit, but in any case that call will not block.
I'm trying to understand how asynchronous file operations are emulated using threads. I've found next to no material to read on the subject.
Is it possible that:
1. a process uses a thread to open a regular file (HDD).
2. the parent gets the file descriptor from the thread; now it may terminate the thread.
3. the parent uses the file descriptor with a new thread, reading X bytes from the file.
4. the parent gets the file descriptor back, with the seek position reflecting the current file state.
5. the parent may repeat these operations without needing to open, or seek, every time it wishes to "continue" reading a new chunk of the file?
This is just a wild guess of mine; I'd appreciate it if anybody could shed more light on how this is emulated efficiently.
UPDATE:
By efficient I actually mean that I don't want the thread to be "waiting" from the moment the file has been opened. Think of an HTTP non-blocking daemon which serves a client with a huge file: you want to use a thread to read chunks of the file without blocking the daemon, but you don't want to keep the thread busy while "waiting" for the actual transfer to take place; you want to use the thread for other blocking operations of other clients.
To understand asynchronous I/O better, it may be helpful to think in terms of overlapping operation. That is, the number of pending operations (operations that have been started but not yet completed) can simultaneously go above one.
A diagram that explains asynchronous I/O might look like this: http://msdn.microsoft.com/en-us/library/aa365683(VS.85).aspx
If you are using the asynchronous I/O capabilities provided by the underlying Operating System, then it is possible to asynchronously read from multiple files without spawning an equal number of threads.
If your underlying Operating System does not provide asynchronous I/O, or if you decide not to use it (in other words, you wish to emulate asynchronous operation using only blocking I/O, the regular Read/Write provided by the Operating System), then it is necessary to spawn as many threads as the number of simultaneous I/O operations. This is because when a thread makes a blocking I/O call, it cannot continue execution until the operation finishes. To start another blocking I/O operation, that operation has to be issued from another thread that is not already occupied.
When you open/create a file fire up a thread. Now store that thread id/ptr as your file handle.
Basically the thread will do nothing except sit in a loop waiting for an "event". A semaphore would be good here. When you want to do a read, you add the read command to a queue (remember to protect the queue with a critical section), return a unique id, and then increment the semaphore. If the thread is asleep it will now wake up, grab the first command off the queue and process it. When it has completed, you remove the command from the queue.
To poll whether a file read has completed you can simply check to see if it's in the command queue. If it's not there, the command has completed.
Furthermore, if you want to allow synchronous reads as well, you can wait after sending the command for an "event" to be triggered on completion. You then check whether the unique id is in the queue; if it isn't, you return control. If it still is, you go back to a wait state until the relevant unique id has been processed.