When using snd_pcm_writei() in non-blocking mode everything works perfectly for a while, but eventually the audio gets choppy. It sounds like the ring buffer pointers are getting out of sync (i.e. sometimes I can tell that the audio is playing out of order). How long it takes for the problem to start is hardware dependent. On a Gentoo box on real hardware it seldom happens, but on a buildroot system running on QEMU it happens after about 5 minutes. In both cases, draining the pcm stream fixes the problem. I have verified that I'm writing the samples correctly by also writing them to a file and playing them with aplay.
Currently I'm setting avail_min to the period size (1024 frames) and calling snd_pcm_wait() before writing chunks of the period size. I have tried a number of variations (different chunk sizes, checking avail myself and using pthread_cond_timedwait() instead of snd_pcm_wait(), etc.), but the only thing that works reliably is blocking mode, which I cannot use.
You can see the current source code here: https://bitbucket.org/frodzdev/mediabox/src/5a6471316c7ae481b329e7e0d4af1bb68a32e71d/src/audio.c?at=staging&fileviewer=file-view-default (it needs a little cleanup since I'm trying all kinds of things). The code that does the actual IO starts at line 375.
Edit:
I think I have a solution, but I don't understand why it seems to work. It seems that it does not matter whether I'm using non-blocking mode; the problem is when I wait to make sure there's room in the buffer (whether through snd_pcm_wait(), pthread_cond_timedwait(), or usleep()).
The version that seems to work is here: https://bitbucket.org/frodzdev/mediabox/src/c3eb290087d9bbe0d5f37653a33a1ba88ef0628b/src/audio.c?fileviewer=file-view-default. I switched to blocking mode while still waiting before calling snd_pcm_writei() and it didn't make a difference. Then I added a call to snd_pcm_avail() before calling snd_pcm_status() in avbox_audiostream_gettime(). This function is called constantly by another thread to get the stream clock, and it only uses snd_pcm_status() to get the timestamps. Now it seems to work (at least the problem is far less likely to happen), but I don't understand exactly why. I understand that snd_pcm_avail() synchronizes the ring buffer pointers with the kernel, but I don't really understand when it needs to be called, or the difference between snd_pcm_state() et al. and snd_pcm_status(). Does snd_pcm_status() also synchronize anything? It seems not, because sometimes snd_pcm_status_get_state() will return RUNNING when snd_pcm_avail() returns -EPIPE. The ALSA documentation is really vague. Perhaps understanding these things will help me understand my problem?
Now, when I say it seems to be working, I mean that I cannot reproduce it on real hardware. It still happens on QEMU, though far less often. But considering that on the next commit I switched to blocking mode without waiting (which I've used in the past without problems on real hardware) and it still happens on QEMU, and also that this is a common issue with QEMU, I'm starting to think that I may have fixed the issue on my end and what remains is just a QEMU problem. Is there any way to determine whether the problem is a bug on my end that is easier to trigger on the emulator, or just an emulator problem?
Edit: I realize that I should fill the buffer before waiting, but at this point my concern is not to prevent underruns but to make sure that my code can handle them when they happen. Besides, the buffer fills up after a few iterations. I confirmed this by printing avail, buffer_size, etc. before writing each packet, and the numbers don't quite add up: they show an error of 1 or 2 periods about every 8th period. Also (and this is the main problem) I'm not detecting any underruns: the audio gets choppy but all writes succeed. In fact, if the problem starts happening and I trigger an underrun by overloading the CPU, it corrects itself when the pcm is reset.
In line 505: you're using time as an argument to malloc().
In line 568: weren't you playing audio? In that case, you should wait only after you have written the frames. Let's think...
The audio device generates an interrupt when it finishes processing a period.
| period A | period B |
           ^          ^
          irq        irq
Before you start the pcm, the audio device doesn't generate any interrupts. Notice that here you're waiting and you haven't started the pcm yet: you only start it when you call snd_pcm_writei().
When you wait for audio data you'll be woken only when the current period has been fully processed; in your first wait, the first period hadn't even been written. So in a comfortable situation you should write the whole buffer, wait for the first interrupt, then write the just-processed period, and so on.
Initially, the buffer is empty:
|            |            |
write():
|############|############|
wait():
..............
When we wake up:
|            |############|
write():
|############|############|
The problem I found is that you're writing audio just before it is played, so sometimes it arrives in the buffer too late.
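A minimal sketch of that pattern, assuming a period of 1024 frames (pcm is the open handle and next_period() is a placeholder for fetching the next period of samples; error handling omitted):

/* prime the ring buffer: keep writing until it is full */
snd_pcm_uframes_t period = 1024;
while (snd_pcm_avail_update(pcm) >= (snd_pcm_sframes_t)period)
    snd_pcm_writei(pcm, next_period(), period);

for (;;) {
    /* each wakeup now means one period has been played */
    snd_pcm_wait(pcm, -1);
    if (snd_pcm_writei(pcm, next_period(), period) == -EPIPE)
        snd_pcm_prepare(pcm);  /* recover from an underrun */
}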
I am running a DMA transfer through the DAC of an STM32F303RE Nucleo, and was wondering whether there is a difference between HAL_DAC_Stop_DMA and HAL_DAC_Stop. I ask because earlier in my code I just used HAL_DAC_Stop and it worked fine; however, I now see that there is also a HAL_DAC_Stop_DMA and was wondering what the difference is.
If you started it with DMA you should stop it with the equivalent function. If you use the non-DMA stop function, the DAC will stop but the DMA is still running, waiting for the DAC to request more data. That request will obviously never come, so the system is left in a funny state. Maybe the next start function can tidy up this funny state, or maybe it can't. Read the source of the functions if you want the exact details.
Another possible problem of not using the DMA stop function is that the last data may still be being transferred by the DMA when you go to reuse the buffer. This would mean that the buffer is not yet free for your code to modify. Whether or not this causes a problem depends on how soon your code touches the buffer.
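For illustration, a hedged sketch of the symmetric pairing (hdac and wave are placeholder names; the channel and alignment depend on your setup):

#include "stm32f3xx_hal.h"

extern DAC_HandleTypeDef hdac;   /* placeholder handle from the init code */
static uint32_t wave[256];       /* placeholder waveform buffer */

/* started with the DMA variant... */
HAL_DAC_Start_DMA(&hdac, DAC_CHANNEL_1, wave, 256, DAC_ALIGN_12B_R);

/* ...so stop with the DMA variant, which also disables the DMA
 * channel instead of leaving it waiting for further requests */
HAL_DAC_Stop_DMA(&hdac, DAC_CHANNEL_1);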
I've tried multiple example programs that appear to have code for handling xruns during playback:
https://albertlockett.wordpress.com/2013/11/06/creating-digital-audio-with-alsa/
https://www.linuxjournal.com/article/6735 (listing 3)
When using snd_pcm_writei(), it appears that when the return value is -EPIPE (i.e., an xrun/underrun), they do:
if (rc == -EPIPE) {
    /* EPIPE means underrun */
    fprintf(stderr, "underrun occurred\n");
    snd_pcm_prepare(handle);
}
I.e. call snd_pcm_prepare() on the handle.
However, I still get stuttering when I run programs like these. Typically I get at least a few, maybe half a dozen, xruns, and then playback continues smoothly without further xruns. However, if something else is using the sound card, such as Firefox, I get many more xruns, and sometimes only xruns. But even if I kill every other program that uses the sound card, I still get some initial xruns and audible stuttering on the speakers.
This is not acceptable to me. How can I modify this type of xrun handling to prevent the stuttering?
My own attempt at figuring this out:
From the ALSA API, I see that snd_pcm_prepare() does:
Prepare PCM for use.
This is not very helpful to an ALSA beginner like myself; it does not explain how the call can be used to recover from xruns.
I also note, from: https://www.alsa-project.org/alsa-doc/alsa-lib/pcm.html
SND_PCM_STATE_XRUN
The PCM device reached overrun (capture) or underrun (playback). You can use the -EPIPE return code from I/O functions (snd_pcm_writei(), snd_pcm_writen(), snd_pcm_readi(), snd_pcm_readn()) to determine this state without checking the actual state via the snd_pcm_state() call. It is recommended to use the helper function snd_pcm_recover() to recover from this state, but you can also use the snd_pcm_prepare(), snd_pcm_drop() or snd_pcm_drain() calls.
Again, this is not clear to me: can I use snd_pcm_prepare(), or should I use these other calls? What is the difference, and which should I use?
The best way to handle underruns is to avoid having to handle them, by preventing them. This can be done by writing samples early enough, before the buffer is empty. To do this,
reorganize your program so that the new samples are already available to be written when you need to call snd_pcm_write*(), and/or
increase the priority of your process/thread (if possible; this probably will not help if other programs interfere with your disk I/O), and/or
increase the buffer size (this also increases the latency).
When an underrun happens, you have to decide what should happen with the samples that should have been played but were not written to the buffer at the correct time.
To play these samples later (i.e., to move all following samples to a later time), configure the device so that an underrun stops it. (This is the default setting.) Your program has to restart the device when it has new samples.
To continue with the following samples at the same time, as if the missing samples had actually been played, configure the device so that an underrun does not stop it. This can be done by setting the stop threshold¹ to the boundary value². (Other errors, like unplugging a USB device, will still stop the device.)
When an underrun does happen, the device will play those samples that happen to be in the ring buffer. By default, these are the old samples from some time ago, which will not sound correct. To play silence instead (which will not sound correct either, but in a different way), tell the device to clear each part of the buffer immediately after it has been played by setting the silence threshold³ to zero and the silence size⁴ to the boundary value.
To (try to) reinitialize a device after an error (an xrun or some other error), you could call either snd_pcm_prepare() or snd_pcm_recover(). The latter calls the former, and also handles a suspended device (by waiting for it to be resumed). A sketch of both the threshold settings and this recovery follows the footnotes below.
¹stop threshold: snd_pcm_sw_params_set_stop_threshold()
²boundary value: snd_pcm_sw_params_get_boundary()
³silence threshold: snd_pcm_sw_params_set_silence_threshold()
⁴silence size: snd_pcm_sw_params_set_silence_size()
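A sketch of both pieces, assuming pcm is an already-configured playback handle and buf/frames hold the next samples to write (return-value checks omitted for brevity):

#include <alsa/asoundlib.h>

snd_pcm_sw_params_t *sw;
snd_pcm_uframes_t boundary;

snd_pcm_sw_params_alloca(&sw);
snd_pcm_sw_params_current(pcm, sw);
snd_pcm_sw_params_get_boundary(sw, &boundary);

/* never stop on underrun: stop threshold = boundary value */
snd_pcm_sw_params_set_stop_threshold(pcm, sw, boundary);
/* clear played parts of the buffer with silence immediately */
snd_pcm_sw_params_set_silence_threshold(pcm, sw, 0);
snd_pcm_sw_params_set_silence_size(pcm, sw, boundary);
snd_pcm_sw_params(pcm, sw);

/* writing with recovery: snd_pcm_recover() handles -EPIPE/-ESTRPIPE */
snd_pcm_sframes_t n = snd_pcm_writei(pcm, buf, frames);
if (n < 0 && snd_pcm_recover(pcm, n, 0) < 0)
    fprintf(stderr, "unrecoverable error: %s\n", snd_strerror(n));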
Clients sending a sufficiently large amount of data over a sufficiently slow internet connection are causing me to busy-wait in a classic non-blocking server-client setup in C with sockets.
In detail, the busy-waiting is caused by this procedure:
1. I install EPOLLIN for the client (to monitor for incoming data).
2. The client sends data.
3. epoll_wait() signals me that there is data to be read (EPOLLIN).
4. The coroutine is resumed and the data is consumed, but more data is needed to finish this client: EWOULDBLOCK, and back to 1.
This procedure repeats for minutes (due to the slow connection and the large amount of data). It is basically useless hopping around without doing anything meaningful other than consuming CPU time, and it defeats the purpose of epoll_wait.
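For reference, the loop shape being described looks roughly like this (a sketch; epfd is the epoll instance and consume() stands in for resuming the coroutine):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <errno.h>

static void event_loop(int epfd)
{
    struct epoll_event events[64];
    char buf[4096];

    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);   /* step 3 wakes us */
        for (int i = 0; i < n; i++) {
            /* EPOLLIN fires as soon as a single byte is readable, so a
             * slow sender wakes us once per small chunk */
            ssize_t r = recv(events[i].data.fd, buf, sizeof buf, 0);
            if (r > 0)
                consume(buf, r);  /* step 4: resume coroutine (placeholder) */
            else if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                continue;         /* nothing left: back to step 1 */
        }
    }
}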
So I wanted to avoid this busy-waiting with some mechanism that accumulates the data in the receive buffer until either a minimum size has been reached or a maximum timeout has passed since the first byte arrived, and only then has epoll_wait wake me up with EPOLLIN for this client.
I first looked into tcp(7), hoping for something like TCP_CORK but for the receive buffer, but I could not find anything.
Then I looked into unix(7) and tried to implement it myself via SIOCINQ right after step 3. The problem is that I end up busy-waiting again, because step 3 returns immediately as soon as any data is available for reading. Alternatively, I could deregister the client right after step 3, but this would block this specific client until epoll_wait returns for a different client.
Is this a stalemate, or is there a solution to accumulate data in the receive buffer up to a minimum size or a maximum time without busy-waiting?
@ezgoing and I chatted at length about this, and I'm convinced this is a non-problem (as @user207421 noted as well).
When I first read the question, I thought perhaps they were worried about tiny amounts (say, 16 bytes at a time), and that would have been worth investigating, but once it turns out that it's 4KiB at a time, it's so routine that this is not worth looking into.
Interestingly, the serial I/O module does support this, with a mode that wakes up only after so many characters are available or so much time has passed, but there is no such thing in the network module.
The only time this would be worth addressing is if there is actual evidence that it's impacting the application's responsiveness in a meaningful way, not a hypothetical concern for packet rates.
I am trying to set up SPI communication between an OMAP processor and a SAM4L. I have configured the SPI protocol, and the OMAP is the master. What I see now is that the test data I am sending reaches the SAM4L correctly, and the ISR prints that data. With more printfs here and there in the ISR, the respective operation happens; but if I remove all the printfs, I can't see any operation happening. What can be the cause of this anomaly? Is it the usual case of wrong frequency settings, or something similar?
If code is needed I will post that too, but it's big.
Thanks
I think you are trying to print messages in a driver.
Printing messages to the console slows your driver down, so with the printfs it runs slowly enough that it happens to work.
Use pr_info() for debugging, and change the settings so that messages do not go to the console, by editing /proc/sys/kernel/printk to read 4 4 1 7:
-> Debug messages will be stored in a buffer.
-> The driver is not slowed down by printing messages on the screen.
-> You can read the messages later with the dmesg command.
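For example (a fragment; the message text and the byte variable are illustrative):

/* in the driver: log at info level instead of printing to the console */
#include <linux/printk.h>

pr_info("sam4l spi: isr got byte 0x%02x\n", byte);

/* on the target, keep info messages off the console:
 *   echo "4 4 1 7" > /proc/sys/kernel/printk
 * then read them later with dmesg */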
Then find the original problem which may be causing the error.
If a routine works with printf calls "here and there" and not otherwise, almost certainly the problem is a timing issue. As a trivial example, say you write to an SPI flash and then check its content. The flash write takes some time, so if you check immediately, the data will not yet be valid; but if you insert a printf call in between, enough time may have passed that the read-back is now valid.
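The proper fix in that example is to poll the device's busy flag instead of relying on printf() as a delay; a sketch with hypothetical helpers (spi_flash_write(), spi_read_status(), STATUS_BUSY and spi_flash_read() stand in for whatever your part provides):

/* hypothetical flash write followed by an explicit ready-wait */
spi_flash_write(addr, data, len);
while (spi_read_status() & STATUS_BUSY)
    ;                                  /* wait for the write cycle to finish */
spi_flash_read(addr, readback, len);   /* read-back is now valid */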
What would be the correct way to prevent a soft lockup/unresponsiveness in a long-running while loop of a C program?
(dmesg is reporting a soft lockup)
Pseudo code is like this:
while (worktodo) {
    worktodo = doWork();
}
My code is of course far more complex, and also includes a printf statement that executes once a second to report progress, but the problem is that the program stops responding to ctrl+c at this point.
Things I've tried which do work (but I want an alternative):
doing a printf on every loop iteration (I don't know why, but the program becomes responsive again that way (???)); this wastes a lot of performance on unneeded printf calls (each doWork() call does not take very long)
using sleep/usleep/...; this also seems like a waste of (processing) time to me, as the whole program will already be running for several hours at full speed
What I'm thinking of is some kind of process_waiting_events() function or the like; normal signals seem to work fine, as I can use kill from a different shell to stop the program.
Additional background info: I'm using GWAN and my code is running inside the main.c "maintenance script", which seems to be running in the main thread as far as I can tell.
Thank you very much.
P.S.: Yes, I did check all the other threads I found regarding soft lockups, but they all seem to ask why soft lockups occur, whereas I know the why and want a way to prevent them.
P.P.S.: Optimizing the program (making it run shorter) is not really a solution, as I'm processing a 29GB bz2 file which extracts to about 400GB of xml, at about 10-40MB per second on a single thread, so even at max speed I would be I/O bound and it would still run for several hours.
While the posted answer using threads might be an option, in reality it would just shift the problem to a different thread. My solution in the end was using
sleep(0)
I also tested sched_yield / pthread_yield, neither of which really helped. Unfortunately I've been unable to find a good resource documenting sleep(0) on Linux, but for Windows the documentation states that a value of 0 makes the thread yield the remainder of its current CPU slice.
It turns out that sleep(0) most probably relies on what is called timer slack in Linux; an article about this can be found here: http://lwn.net/Articles/463357/
Another possibility is using nanosleep(&(struct timespec){0}, NULL), which does not seem to rely on timer slack: the Linux man pages for nanosleep state that if the requested interval is below clock granularity, it is rounded up to clock granularity (which on Linux depends on CLOCK_MONOTONIC, according to the man pages). Thus a value of 0 nanoseconds is perfectly valid and should always work, as clock granularity can never be 0.
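The loop from the question with that yield in place (a sketch; worktodo and doWork() as in the question):

#include <time.h>

while (worktodo) {
    worktodo = doWork();
    /* zero-length sleep: enters the kernel, so pending signals
     * (e.g. SIGINT from ctrl+c) are delivered promptly */
    nanosleep(&(struct timespec){0}, NULL);
}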
Hope this helps someone else as well ;)
Your scenario is not really a soft lockup; it is a process that is busy doing something.
How about this pseudo code:
void workerThread()
{
    while (workToDo)
    {
        if (threadSignalled)
            break;
        workToDo = DoWork();
    }
}

void sighandler()
{
    signal worker thread to finish;
    waitForWorkerThreadFinished;
}

void main()
{
    InstallSignalHandler;
    CreateSemaphore;
    StartThread;
    waitForWorkerThreadFinished;
}
Clearly a timing issue. Using a signalling mechanism should remove the problem.
The use of printf solves the problem because printf accesses the console, which is an expensive and time-consuming operation; in your case it gives the worker enough time to complete its work.
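For concreteness, a runnable version of that structure with POSIX threads, using a flag instead of a semaphore (doWork() is the function from the question; error checks omitted):

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

int doWork(void);   /* the long-running work function from the question */

static volatile sig_atomic_t stop_requested = 0;

static void sighandler(int sig)
{
    (void)sig;
    stop_requested = 1;   /* ask the worker to finish */
}

static void *worker(void *arg)
{
    (void)arg;
    int worktodo = 1;
    while (worktodo && !stop_requested)
        worktodo = doWork();
    return NULL;
}

int main(void)
{
    pthread_t tid;

    signal(SIGINT, sighandler);          /* ctrl+c now just sets a flag */
    pthread_create(&tid, NULL, worker, NULL);
    pthread_join(tid, NULL);             /* wait for the worker to finish */
    puts("done");
    return 0;
}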