The application I'm using only plays sound after enough audio has been generated. Say I click the mouse 10 times with no sound, and then after those ten clicks I'll hear ten mouse-click sounds (for example).
The only way I've found to alleviate this problem is to set a very short buffer size, which I don't want to do.
I've been trying to use the start_threshold sw parameter but that has no effect.
It seems like I should be able to force it to play once a specified amount of data, smaller than the buffer size, has been written. Is this correct? That's what start_threshold seems to indicate, since the period length can be much shorter than the buffer (or so I've seen in examples).
My code is like this:
1. Call the HW parameter setup
2. Get a byte array with data
3. Loop through and write to the buffer, adding one byte to the offset each time
4. Call start (this should force it to play back, right?)
5. If there is -EPIPE, call prepare and add 0 to the offset (I think this is the only time anything actually gets played.)
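For reference, the sw-params setup I have in mind is roughly this (a sketch rather than my exact code; pcm and period_size come from the HW parameter setup):

snd_pcm_sw_params_t *sw;
snd_pcm_sw_params_alloca(&sw);                               // allocate params on the stack
snd_pcm_sw_params_current(pcm, sw);                          // start from the current settings
snd_pcm_sw_params_set_start_threshold(pcm, sw, period_size); // start playback once one period is queued
snd_pcm_sw_params(pcm, sw);                                  // apply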
Thanks!
I'm using libao (ao_play) to play some buffers. I listen for keyboard keys, and for each key I have a WAV sound to play. It's simple.
With ao_play I see that the application blocks while the sound is playing. Because I want to play multiple sounds at the same time, I had to use threads (with the pthread library).
It works, but it feels like a workaround, and if I play too many files (maybe 10 or so) everything gets stuck for a few seconds and then comes back.
Well, my question is: how do I play multiple sounds at the same time, non-blocking, with libao (and without using threads)?
This is not a real design, more like a guess.
First of all, you'll need threads because it's a good old tradition to separate computations from visualisations, or audializations in this case. You'll need an audio thread that renders the stream and sends it to the output.
So, each time your main thread discovers a keypress, it sends a note to the audio thread. The latter catches the event and adds a wave to the currently playing stream. The stream is rendered in frames (64, 1024, or 10240 samples, or whatever latency you fancy); if the wave itself is a simple mix of a few possible samples, rendering can be comfortably real-time. You should keep track of the notes currently playing, with a position in each sample. If latency is low, and thus granularity high, you can even align sample edges with buffer edges, which would notably simplify rendering.
And after the current buffer is rendered you simply send it to the DAC and proceed with the next frame.
A quick glance at libao's help page does not reveal any mixing capabilities, so you'll need to create a simple mixer on your own, or you may want to pick up an existing solution, some simple open-source audio rendering library.
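For illustration only, the core of such a mixer could be as small as this (a sketch with made-up names; 16-bit signed samples assumed):

#include <stdint.h>

// Mix nvoices source buffers into out, clamping to the 16-bit range.
void mix_frame(int16_t *out, int16_t **voices, int nvoices, int nsamples)
{
    for (int i = 0; i < nsamples; i++) {
        int32_t acc = 0;                     // accumulate in a wider type to avoid overflow
        for (int v = 0; v < nvoices; v++)
            acc += voices[v][i];
        if (acc > 32767)  acc = 32767;       // clip instead of wrapping
        if (acc < -32768) acc = -32768;
        out[i] = (int16_t)acc;
    }
}

Each rendered frame then goes to ao_play exactly as before, only now it carries every active voice.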
As the heading says, is there a way to lower the speed of printf output in C? Something like watching every character get printed one at a time (it does not have to be that slow, just so you understand what I mean).
The reason I ask:
I need to program a small microcontroller. Every 'printf' executed on it should be sent back to the COM1 port of the host. Everything works fine: I already buffered my printf, so everything is stored in a char array of finite size and this array is sent back to COM1 char by char. But because I don't know how many printfs there will be, and because of the limited memory of the μC, a size-limited array isn't the best solution. So my new attempt is to write directly to the send register of the μC, which can only hold one char at a time until it is sent. I do this via
setvbuf(stdout, LINFLEX_0.BDRL.B.DATA0, _IOFBF, 1);
where LINFLEX_0.BDRL.B.DATA0 represents the transmit register. What I think my problem is now: the printfs overwrite the register too fast, so it has no time to send any char stored in it before it gets changed again. When sending char by char from the array, I wait until a data-transmission flag is set:
//write character to transmit buffer
LINFLEX_0.BDRL.B.DATA0 = buffer[j];
// Wait for data transmission completed flag
while (1 != LINFLEX_0.UARTSR.B.DTF) {}
// Clear DTF Flag
LINFLEX_0.UARTSR.R = 0x0002;
So the idea is to slow down the speed at which printf processes each character, but feel free to comment if anyone has another idea.
The problem isn't with printf as such but with the underlying UART driver. That's what you'd have to tweak. If you are using CodeWarrior for MPC56 you can actually view the source code for all of it: quite horrible code. Messing with it will only end badly - and apparently it doesn't work well in the first place.
Using printf in this kind of embedded application is overall a very bad idea, since the function is unsuitable for pretty much any purpose, UART communication in particular. The presence of printf is actually an indicator that a project has gone terribly wrong, quite possibly because it has been hijacked by PC programmers. That's not really a programming problem, but a management one.
Technically, the only sane thing to do here is to toss out all the crap from your project. That means everything remotely resembling stdio.h. Instead, write your own UART driver, based on the available Freescale examples. Make it work on bytes. This also enables you to add custom features such as "echo", where the MCU has to wait for a reply from the receiver. Or you could implement it with DMA if you just want to write data to a buffer and then forget all about it.
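As a rough sketch of what "work on bytes" means, reusing the register accesses from your own snippet (assuming the LINFlex register layout is as in your code):

// Blocking transmit of a single byte over the LINFlex UART.
static void uart_putc(char c)
{
    LINFLEX_0.BDRL.B.DATA0 = c;             // write character to transmit buffer
    while (1 != LINFLEX_0.UARTSR.B.DTF) {}  // wait for data transmission completed flag
    LINFLEX_0.UARTSR.R = 0x0002;            // clear DTF flag
}

// Transmit a zero-terminated string, byte by byte.
static void uart_puts(const char *s)
{
    while (*s)
        uart_putc(*s++);
}

Call uart_puts directly where you now call printf, and add a tiny number-to-string helper if you really need formatted values.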
I am writing a small module in C to handle jitter and drift for a full-duplex audio system. It acts as a very primitive voice chat module, which connects to an external modem that uses a separate clock, independent from my master system clock (ie: it is not slaved off of the system master clock).
The source is based on an existing example available online here: http://svn.xiph.org/trunk/speex/libspeex/jitter.c
I have 4 audio streams:
Network uplink (my voice, after processing, going to the far side speaker)
Network downlink (far side's voice, before processing, coming to me)
Speaker output (the far side's voice, after processing, to the local speakers)
Mic input (my voice, before processing, coming from the local microphone)
I have two separate threads of execution. One handles the local devices and buffer (ie: playing processed audio to the speakers, and capturing data from the microphone and passing it off to the DSP processing library to remove background noise, echo, etc). The other thread handles pulling the network downlink signal and passing it off to the processing library, and taking the processed data from the library and pushing it via the uplink connection.
The two threads use mutexes and a set of shared circular/ring buffers. I am looking for a way to implement a sure-fire (safe and reliable) jitter and drift correction mechanism. By jitter, I am referring to a clock having variable duty cycle, but the same frequency as an ideal clock.
The other potential issue I would need to correct is drift, which would assume both clocks use an ideal 50% duty cycle, but their base frequency is off by ±5%, for example.
Finally, these two issues can occur simultaneously. What would be the ideal approach to this? My current approach is to use a type of jitter buffer. These are just data buffers which maintain a moving average of their "fill" level. If a thread tries to read from the buffer and not enough data is available (a buffer underflow), I generate data for it on the fly, either by providing a spare zeroed-out packet or by duplicating a packet (i.e. packet loss concealment). If data is coming in too quickly, I discard an entire packet of data and keep going. This handles the jitter portion.
The second half of the problem is drift correction. This is where the average fill level metric comes in useful. For all buffers, I can calculate the relative growth/reduction levels in various buffers, and add or subtract a small number of samples every so often so that all buffer levels hover around a common average "fill" level.
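To make the idea concrete, the per-buffer bookkeeping I have in mind is roughly this (a sketch with made-up names and placeholder constants, not my actual code):

#include <stddef.h>

#define TARGET_FILL      1024   // desired average fill level, in samples (placeholder)
#define FILL_TOLERANCE    128   // tolerated deviation before correcting (placeholder)
#define ADJUST_SAMPLES      4   // samples added/dropped per correction (placeholder)

// Exponentially weighted moving average of the buffer fill level,
// updated once per packet.
double update_avg_fill(double avg, size_t current_fill)
{
    const double alpha = 0.01;  // smoothing factor
    return (1.0 - alpha) * avg + alpha * (double)current_fill;
}

// Every so often, compare the average fill to the target and return how
// many samples to insert (positive) or drop (negative) to counter drift.
int drift_adjustment(double avg_fill)
{
    if (avg_fill > TARGET_FILL + FILL_TOLERANCE)
        return -ADJUST_SAMPLES;   // buffer growing: drop a few samples
    if (avg_fill < TARGET_FILL - FILL_TOLERANCE)
        return +ADJUST_SAMPLES;   // buffer shrinking: duplicate/insert a few samples
    return 0;
}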
Does this approach make sense, and are there any better or "industry standard" approaches to handling this problem?
Thank you.
References
Word Clock – What’s the difference between jitter and frequency drift?, Accessed 2014-09-13, <http://www.apogeedigital.com/knowledgebase/fundamentals-of-digital-audio/word-clock-whats-the-difference-between-jitter-and-frequency-stability/>
Jitter.c, Accessed 2014-09-13, <http://svn.xiph.org/trunk/speex/libspeex/jitter.c>
I faced a similar, although admittedly simpler, problem. I won't be able to fully answer your question, but I hope sharing my solutions to some practical problems I ran into will benefit you anyway.
Last year I was working on a system which had to simultaneously record from and render to multiple audio devices, each potentially ticking off a different clock. The most obvious example is a duplex stream on 2 devices, but it also handled multiple inputs or outputs only. All in all it was a bit simpler than your situation (single-threaded and no network I/O). In the end I don't believe dealing with more than 2 devices is harder than dealing with 2; any system with multiple clocks is going to have to deal with the same problems.
Some things I've learned:
Pick one stream and designate its clock as "the truth" (i.e., sync all other streams to a common master clock). If you don't do this you won't have a well-defined notion of "current sample position", and without it there's nothing to sync to. This also has the benefit that at least one stream in the system will always be clean (no dropping/padding samples).
Your approach of using an additional buffer to handle jitter is correct. Without it you'd be constantly dropping/padding even on streams with the same nominal sample rate.
Consider whether or not you'd want to introduce such a jitter buffer for the "master" stream also. Doing so means introducing artificial latency in the master stream, not doing so means the rest of your streams will lag behind.
I'm not sure whether it's a good idea to drop entire packets. Why not use up as many of the samples as possible? Especially with large packet sizes this is far less noticeable.
To elaborate on the above, I got badly bitten by the following case: assume s1 (master) producing 48000 frames every second and s2 producing 96000 every 2 seconds. Round 1: read 48000 from s1, 0 from s2. Round 2: read 48000 from s1, 96000 from s2 -> overflow. Discard entire packet. Round 3: read 48000 from s1, 0 from s2. Etc. Obviously this is a contrived example, but I ran into cases where on average I dropped 50% of a secondary stream's data using this scheme. Introducing the jitter buffer helps but didn't completely fix this problem. Note that this is not strictly related to clock jitter/skew; it's just that some drivers like to update their padding values periodically and will not accurately report what is really in the hardware buffer.
Another variation on this problem happens when you really do get clock jitter but the API of your choice doesn't let you control the packet size (i.e., you can't ask for fewer frames than are actually available). Assume s1 (master) recording at 1000 Hz and s2 alternating each second between 1000 and 1001 Hz. Round 1: read 1000 frames from both. Round 2: read 1000 frames from s1 and 1001 from s2 -> overflow. Etc.; on average you'll dump around 50% of the frames on s2. Note that this is not so much a problem if your API lets you say "give me 1000 samples even though I know you've got more". By doing so, though, you'll eventually overflow the hardware input buffer.
To have the most control over when to drop/pad, I found it easiest to always keep input buffers empty and output buffers full. This way all dropping/padding takes place in the jitter buffer and you'll at least know and control what's happening.
If possible try to separate your program logic: the hard part is finding out where to pad/drop samples. Once you've got that in place it's easy to try different variations of pad/drop, sample-and-hold, interpolation etc.
All in all I'd say your solution looks very reasonable, although I'm not sure about the "drop entire packet thing" and I'd definitely pick one stream as the master to sync against. For completeness here's the solution I eventually came up with:
1: Assume a jitter buffer of size J on each stream.
2: Wait for a packet of size M to become available on the master stream (M is typically derived from the stream latency). We're going to deliver M frames of input/output to the app. I didn't implement an additional buffer on the master stream.
3: For all input streams: let H be the number of recorded frames in the hardware buffer, B the number of recorded frames currently in the jitter buffer, and A the number of frames available to the application: A = H + B.
3a: If A < M, we have input underflow. Offer A recorded frames + (M - A) padding frames to the app. Since the device is likely slow, fill 1/2 of the jitter buffer with silence.
3b: If A == M, offer A frames to the app. The jitter buffer is now empty.
3c: If A > M but (A - M) <= J, offer M recorded frames to the app. A - M frames stay in the jitter buffer.
3d: If A > M and (A - M) > J, we have input overflow. Offer M recorded frames to the app, of the remaining frames put J/2 back in the jitter buffer, we use up M + J/2 frames and we drop A - (M + J/2) frames as overflow. Don't try to keep the jitter buffer full because the device is likely fast and we don't want to overflow again on the next round.
4: Sort of the inverse of 3: for outputs, fast devices will underflow, slow devices will overflow.
A, H and B are the same thing, but this time they don't represent available frames but available padding (e.g., how many frames can I offer to the app to write to).
Try to keep hardware buffers full at all costs.
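For clarity, here is the input-side decision (steps 3a-3d) as a compact sketch; A, M, H, B and J are as defined above, and the helpers are placeholders for the actual buffer plumbing:

#include <stddef.h>

// Placeholder helpers; the real versions move frames between the
// hardware buffer, the jitter buffer and the application.
void deliver_recorded(size_t frames);
void deliver_silence(size_t frames);
void refill_jitter_buffer_with_silence(size_t frames);
void keep_in_jitter_buffer(size_t frames);
void drop_frames(size_t frames);

// One round of input handling for a secondary stream.
void handle_input_round(size_t M, size_t J, size_t H, size_t B)
{
    size_t A = H + B;                             // frames available to the application

    if (A < M) {                                  // 3a: input underflow
        deliver_recorded(A);
        deliver_silence(M - A);                   // pad up to M frames
        refill_jitter_buffer_with_silence(J / 2); // device is likely slow
    } else if (A == M) {                          // 3b: exact fit, jitter buffer now empty
        deliver_recorded(M);
    } else if (A - M <= J) {                      // 3c: surplus fits in the jitter buffer
        deliver_recorded(M);                      // A - M frames stay behind
    } else {                                      // 3d: input overflow
        deliver_recorded(M);
        keep_in_jitter_buffer(J / 2);             // don't keep it full; device is likely fast
        drop_frames(A - (M + J / 2));             // dropped as overflow
    }
}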
This scheme worked out quite well for me, although there's a few things to consider:
It involves a lot of bookkeeping. Make sure that for input buffers, data always flows from hardware -> jitter buffer -> application, and for outputs always from app -> jitter buffer -> hardware. It's very easy to make the mistake of thinking you can "skip" frames in the jitter buffer if there are enough samples available from the hardware to hand directly to the app. This will essentially mess up the chronological order of frames in the audio stream.
This scheme introduces variable latency on secondary streams because I try to postpone the moment of padding/dropping as long as possible. This may or may not be a problem. I found that in practice postponing these operations gives audibly better results, probably because many "minor" glitches of only a few samples are more annoying than the occasional larger hiccup.
Also, PortAudio (an open-source audio project) has implemented a similar scheme, see http://www.portaudio.com/docs/proposals/001-UnderflowOverflowHandling.html. It may be worthwhile to browse through the mailing list and see what problems/solutions came up there.
Note that everything I've said so far is only about interaction with the audio hardware; I've no idea whether this will work equally well with the network streams, but I don't see any obvious reason why not. Just pick one audio stream as the master and sync the other one to it, and do the same for the network streams. This way you'll end up with two more-or-less independent systems connected only by the ring buffer, each with an internally consistent clock, each running on its own thread. If you're aiming for low audio latency, you'll also want to drop the mutexes and opt for a lock-free FIFO of some sort.
I am curious to see if this is possible. I'll throw in my two bits though.
I am a novice programmer, but studied audio engineering/interactive audio.
My first assumption is that this is not possible. At least not on a sample-to-sample basis. Especially not for complex audio data and waveforms such as human speech. The program could have no expectation of what the waveform "should" look like.
This is why there are high-end audio interfaces with temperature controlled internal clocks.
On the other hand, maybe there is a library that can detect the symptoms of jitter, somehow...
In which case I would be very curious to hear about it.
As far as drift correction, maybe I don't understand something on the programming front, but shouldn't you be pulling audio at a specific sample rate? I believe sample rate/drift is handled at the hardware level.
I really hope this helps. You might have to steer me closer to home.
I'm writing my own drivers for LPC2148 and a question came to mind.
How do I receive a message of unspecified size in UART?
The only 2 things that come to mind are: 1 - configure a watchdog and end the reception when the time runs out; 2 - make it so that whenever a message is sent to it there must be an end-of-message character.
The first choice seems better in my opinion, but I'd like to know if anybody has a better answer, and I know there must be one.
Thank you very much
Just give the caller whatever bytes you have received so far. The UART driver shouldn't try to implement the application protocol; the application should do that.
That looks like the wrong use for a watchdog. I have ended up with three solutions for this problem:
Use fixed-size packets and DMA; so, you receive one packet per transaction. Apparently, it is not possible in your case.
Receive the message char by char until the end-of-message character is received. Somewhat error-prone, since the EOM char may also appear in the data.
Use a fixed-size header before every packet. In the header, store payload size and/or message type ID.
The third approach is probably the best one. You may combine it with the first one, i.e. use DMA to receive header and then data (in the second transaction, after the data size is known from the header). It is also one of the most flexible approaches.
One more thing to worry about is keeping the byte stream in sync. There may be rubbish lying in the UART input buffers which gets read as data, or you may get only part of a packet after your MCU is powered up (i.e. the beginning of the packet had already been sent by then). To avoid that, you can add magic bytes to your packet header, and probably a CRC.
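A sketch of what such a header could look like (field widths, magic value and names are arbitrary):

#include <stdint.h>

#define PKT_MAGIC 0xA55Au

// Fixed-size header transmitted before every payload.
struct packet_header {
    uint16_t magic;        // constant marker used to resynchronise the stream
    uint8_t  msg_type;     // message type ID
    uint8_t  payload_len;  // number of payload bytes that follow
    uint16_t crc;          // CRC over the payload (or header + payload)
};

The receiver reads sizeof(struct packet_header) bytes first (by DMA or char by char), checks the magic, then reads payload_len more bytes and verifies the CRC. In real code you would serialise the fields explicitly (or use a packed struct) so the wire layout doesn't depend on compiler padding.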
EDIT
OK, one more option :) Just store everything you receive in a growing buffer for later use. That is basically what PC drivers do.
Real embedded UART drivers usually use a ring buffer. Bytes are stored in order and the clients promise to read from the buffer before it's full.
A state machine can then process the message in multiple passes, with no need for a watchdog to tell it reception is over.
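A minimal sketch of such a ring buffer (the receive interrupt writes, the main loop reads; the size must be a power of two for the masking to work):

#include <stdint.h>

#define RX_BUF_SIZE 64u                       // must be a power of two

static volatile uint8_t  rx_buf[RX_BUF_SIZE];
static volatile uint32_t rx_head;             // written only by the ISR
static volatile uint32_t rx_tail;             // written only by the main loop

// Called from the UART receive interrupt with the received byte.
void uart_rx_isr(uint8_t byte)
{
    rx_buf[rx_head & (RX_BUF_SIZE - 1u)] = byte;
    rx_head++;
}

// Non-blocking read; returns 1 if a byte was available, 0 otherwise.
int uart_getc(uint8_t *out)
{
    if (rx_tail == rx_head)
        return 0;                             // buffer empty
    *out = rx_buf[rx_tail & (RX_BUF_SIZE - 1u)];
    rx_tail++;
    return 1;
}

The state machine in the main loop then pulls bytes out with uart_getc() and assembles messages at its own pace.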
It is better to go for option 2: append an end-of-transmission character to the transmitted string.
But I suggest adding a start-of-transmission character as well, to validate that you are receiving an actual transmission.
A watchdog timer is used to reset the system when there is unexpected behaviour of the device. I think it is better to use a buffer which can store the amount of data that your application requires.
I have an encoder which encodes a speech file (.wav) that I give as input. Now what I want to do is write a program so that I can speak into the mic and the encoder can process it at the same time. Basically I want to record and process a speech signal in real time (a small delay can be tolerated). To do this I was thinking of making a loop inside which I would first record the speech for, say, 1 second into a file, say speech.in, then copy this file to temp and pass temp to the encoder. In the meantime the recorder should overwrite the speech.in file and save the next 1 second of data in it. And continue this loop...
The problem I am having is that I can't write a program to control the recorder to do what I want. Is there any recorder which can be easily controlled, or any code to do it?
This is the only way I could think of to implement this. Any other (hopefully better) solution is also welcome.
Edit: I am working on Ubuntu 10.04, but I have used the same program on Windows as well, so suggestions for either platform are welcome.
Your proposed way is not the way to go. At least, this is not how it's done on Windows and Mac. (I don't know how Linux-flavoured machines do it, but I'm guessing the methodology is the same.)
You'll have to open the audio device and allocate a set of (say 4) internal memory buffers (100 ms of sound each would suffice, but you'll have to experiment with how small you can make the buffers; the smaller, the lower the latency, but the greater the chance of audio glitches).
You attach these to the audio device and ask for a callback whenever one of these buffers is filled. When you get the first callback, make sure you encode the buffer quickly enough, before it is reused by the audio device and overwritten with new data.
You could simultaneously output the encoded sound to the audio device again. The latency would be similar to the length of one buffer.
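Since the question mentions Ubuntu, here is a rough sketch of the same idea using ALSA's blocking capture API instead of callbacks (16-bit mono at 16 kHz assumed, error checking mostly omitted; encode_frame() stands in for your encoder):

#include <alsa/asoundlib.h>

// Placeholder: hand the captured samples to the real encoder here.
static void encode_frame(const short *samples, snd_pcm_sframes_t nframes)
{
    (void)samples;
    (void)nframes;
}

int main(void)
{
    snd_pcm_t *pcm;
    short buf[1600];                           // 100 ms of 16-bit mono at 16 kHz

    snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0);
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       1, 16000, 1, 100000);   // 1 channel, 16 kHz, ~100 ms latency

    for (;;) {
        snd_pcm_sframes_t n = snd_pcm_readi(pcm, buf, 1600);  // blocks until the frames arrive
        if (n < 0)
            n = snd_pcm_recover(pcm, (int)n, 0);              // recover from overruns
        if (n > 0)
            encode_frame(buf, n);              // capture keeps running in the driver meanwhile
    }
}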
Sounds like this would be best served by threading.
Here is an MSDN link