I have a encoder which encodes a speech file(.wav) that i give as input. Now what i want to do is to write a program such that i can speak in the mic and at the same time the encoder can process it. Basically i want to record and process a speech signal in real time (a small delay can be tolerated). To do this i was thinking of making a loop inside which i would first record the speech for say 1 sec in a file say speech.in, then i would copy this file to temp and pass this temp to the encoder. In the meantime the recorder should overwrite the speech.in file and save the next 1 sec of data in it.And continue this loop...
The problem i am having is i cant write a program to control the recorder to do the thing i want. Is there any recorder which can be easily controlled or any code to do it ?
This is the only way i could think of to implement this. Any other(hopefully better) solution is also welcome.
*edit: I am working on Ubuntu 10.04 but i have used the same program on windows as well so any suggestion on either platform is welcome
Your proposed way is not the way to go. At least, this is not how it's done on Windows and Mac. (I don't know how linux flavoured machines would do it but I'm guessing the methodology is the same)
You'll have to open the audio device, and allocate a set of (say 4) internal memorybuffers (length of 100ms sound would suffice, but you'll have to experiment how small you can get the buffer (the smaller, the less latency, but the more chances on audio glitches)).
You attach these to the audio device and ask for a callback when any of these buffers are filled. When you get the first call back, make sure you encode the buffer quickly enough before the 1st buffer is used again by the audiodevice and is overwritten with new data.
You could simultaneously output the encoded sound to the audiodevice again. The latency would be similar to the length of 1 of the buffers.
Sounds like this would be best served by threading.
Here is a MSDN link
Related
I'm using libao (ao_play) to play some buffers. I listen the keyboard keys and for each key I have a wav sound to play. It's simple.
With ao_play I see that the application blocks while is playing the sound. Because I want to play multiple audios at same time, I needed to use threads (with pthread lib).
It works, but I fell like a workaround and if I play to much files (maybe 10 or something like this) so everything stuck for some seconds and so come back.
Well, my question is: how to play multiple sounds at same time non-blocking using libao (and not using threads)?
This not a real design, more like a guess.
First of all, you'll need threads because it's a good old tradition to separate computations from visualisations, or audializations in this case. You'll need an audio thread that renders the stream and sends it to the output.
So, each time your main thread discovers a keypress, it sends a note to the audio thread. That latter captures an event and adds a wave to the currently played stream. The stream is rendered in frames (64, or 1024, or 10240 samples, or whatever you fancy your latency, if the wave itself is a simple mix of few possible samples, it can be notably realtime.) You should keep track of notes currently played, position per each sample. If latency is low, thus granularity high, you can even align sample edges by buffer edges, which would notably simplify rendering.
And after current buffer is rendered you simply send it to DAC and proceed with the next frame.
A quick glance at libao's help page does not reveal any mixing capabilities, so you'll need to create a simple mixer on your own, or you may actually need an existing solution, some simple opensource audio rendering library.
When using snd_pcm_writei() in non-blocking mode everything works perfect for a while but eventually the audio gets choppy. It sounds like the ring buffer pointers are getting out of sync (ie. sometimes I can tell that the audio is playing out of order). How long it takes for the problem to start it's hardware dependent. On a Gentoo box on real hardware it seldom happens, but on a buildroot system running on QEMU it happens after about 5 minutes. On both cases draining the pcm stream fixes the problem. I have verified that I'm writing the samples correctly by also writting them to a file and playing them with aplay.
Currently I'm setting avail_min to the period size (1024 frames) and calling snd_pcm_wait() before writting chunks of the period size. But I tried a number of different variations (different chunk sizes, checking avail myself and use pthread_cond_timedwait() instead of snd_pcm_wait(), etc). But the only thing that works fine is using blocking mode but I can not do that.
You can see the current source code here: https://bitbucket.org/frodzdev/mediabox/src/5a6471316c7ae481b329e7e0d4af1bb68a32e71d/src/audio.c?at=staging&fileviewer=file-view-default (it needs a little cleanup since I'm trying all kinds of things). The code that does the actual IO starts at line 375.
Edit:
I think I got a solution but I don't understand why it seems to work. It seems that it does not matter if I'm using non-blocking mode, the problem is when I wait to make sure there's room on the buffer (either through snd_pcm_wait(), pthread_cond_timedwait(), or usleep()).
The version that seems to work is here: https://bitbucket.org/frodzdev/mediabox/src/c3eb290087d9bbe0d5f37653a33a1ba88ef0628b/src/audio.c?fileviewer=file-view-default. I switched to blocking mode while still waiting before calling snd_pcm_writei() and it didn't made a difference. Then I added the call to snd_pcm_avail() before calling snd_pcm_status() on avbox_audiostream_gettime(). This function is called constantly by another thread to get the stream clock and it only uses snd_pcm_status() to get the timestamps. Now it seems to work (at least it is a lot less probable to happen) but I don't understand exactly why. I understand that snd_pcm_avail() will synchronize the pointers with the kernel but I don't really understand when it needs to be called and the difference between snd_pcm_state() et al and snd_pcm_status(). Does snd_pcm_status() also synchronize anything? It seems not because sometimes snd_pcm_status_get_state() will return RUNNING when snd_pcm_avail() returns -EPIPE. The ALSA documentation is really vague. Perhaps understanding these things will help me understand my problem?
Now, when I said that it seems to be working I mean that I cannot reproduce it on real hardware. It still happens on QEMU though way less often. But considering that on the next commit I switched to blocking mode without waiting (which I've used in the past and never had a problem with on real hardware) and it still happens in QEMU and also the fact that this is a common issue with QEMU I'm starting to think that I may have fixed the issue on my end and now it's just a QEMU problem. Is there any way to determine if the problem is a bug on my end that is easier to trigger on the emulator or if it's just an emulator problem?
Edit: I realize that I should fill the buffer before waiting but at this point my concern is not to prevent underruns but to make sure that my code can handle them when they happen. Besides the buffer is filling up after a few iterations. I confirmed this by outputing avail, buffer_size, etc before writing each packet and the numbers I get don't make perfect sense, they show an error of 1 or 2 periods about every 8th period. Also (and this is the main problem) I'm not detecting any underruns, the audio get choppy but all writes succeed. In fact, if the problem start happening and I trigger an underrun by overloading the CPU it will correct itself when the pcm is reset.
In line 505: You're using time as argument to malloc.
In line 568: Weren't you playing audio? In this case you should do wait only after you wrote the frames. Let's think ...
Audio device generates an interrupt when it terminates to process a period.
| period A | period B |
^ ^
irq irq
Before you start the pcm, audio device doesn't generate any interrupt. Notice here that you're waiting and you haven't started the pcm yet. You only starts it when you call snd_pcm_writei().
When you wait for audio data you'll be awake only when the current period has been fully processed -- in your first wait the first period wasn't even written -- so in a comfortable situation you should write the whole buffer, wait for the first interrupt, and then write the just processed period, and on and on.
Initially, buffer is empty:
| | |
write():
|############|############|
wait():
..............
When we wake up:
| |############|
write():
|############|############|
I found the problem is you're writing audio just before it be played, then sometimes it may arrive delayed in the buffer.
I have barcode scanner, attached to Linux computer via USB. The scanner emulates a keyboard device.
I have to write a program that to read the scanned barcodes and process them. The program runs on background as a service and should read the barcode scanner regardless of the current X focus.
How this can be made in Linux?
Some lower level solution/explanation is preferred.
It sounds like you want to capture the data from a specified device,
In which case the method described in this post should help:
(EDIT: original link dead, Archive link provided)
https://web.archive.org/web/20190101053530/http://www.thelinuxdaily.com/2010/05/grab-raw-keyboard-input-from-event-device-node-devinputevent/
That will listen out for keyboard events stemming from only the specified source.
A word of caution though, as far as I know, that won't stop it from propagating to whatever your current window focus is.
To start with solution, I guess a daemon would perfect choice.
You can write a daemon code, which will open device node (for scanner) and read the data buffer.
Now you have received data in user space, you are free to handle it as per your requirement.
I have been playing with creating sounds using mathematical wave functions in C. The next step in my project is getting user input from a MIDI keyboard controller in order to modulate the waves to different pitches.
My first notion was that this would be relatively simple and that Linux, being Linux, would allow me to read the raw data stream from my device like I would any other file.
However, research overwhelmingly advises that I write a device driver for the MIDI controller. The general idea is that even though the device file may be present, the kernel will not know what system calls to execute when my application calls functions like read() and write().
Despite these warnings, I did an experiment. I plugged in the MIDI controller and cat'ed the "/dev/midi1" device file. A steady stream of null characters appeared, and when I pressed a key on the MIDI controller several bytes appeared corresponding to the expected Message Chunks that a MIDI device should output. MIDI Protocol Info
So my questions are:
Why does the cat'ed stream behave this way?
Does this mean that there is a plug and play device driver already installed on my system?
Should I still go ahead and write a device driver, or can I get away with reading it like a file?
Thank you in advanced for sharing your wisdom in these areas.
Why does the cat'ed stream behave this way?
Because that is presumably the raw MIDI data that is being received by the controller. The null bytes are probably some sort of sync tick.
Does this mean that there is a plug and play device driver already installed on my system?
Yes.
However, research overwhelmingly advises that I write a device driver for the MIDI controller. The general idea is that even though the device file may be present, the kernel will not know what system calls to execute when my application calls functions like read() and write().
<...>
Should I still go ahead and write a device driver, or can I get away with reading it like a file?
I'm not sure what you're reading or how you're coming to this conclusion, but it's wrong. :) You've already got a perfectly good driver installed for your MIDI controller -- go ahead and use it!
Are you sure you are reading NUL bytes? And not 0xf8 bytes? Because 0xf8 is the MIDI time tick status and is usually sent periodically to keep the instruments in sync. Try reading the device using od:
od -vtx1 /dev/midi1
If you're seeing a bunch of 0xf8, it's okay. If you don't need the tempo information sent by your MIDI controller, either disable it on your controller or ignore those 0xf8 status bytes.
Also, for MIDI, keep in mind that the current MIDI status is usually sent once (to save on bytes) and then the payload bytes follow for as long as needed. For example, the pitch bend status is byte 0xeK (where K is the channel number, i.e. 0 to 15) and its payload is 7 bits of the least significant byte followed by 7 bits of the most significant bytes. Thus, maybe you got a weird controller and you're seeing only repeated payloads of some status, but any controller that's not stupid won't repeat what it doesn't need to.
Now for the driver: have a look at dmesg when you plug in your MIDI controller. Now if your OSS /dev/midi1 appears when you plug in your device (udev is doing this job), and dmesg doesn't shoot any error, you don't need anything else. The MIDI protocol is yet-another-serial-protocol that has a fixed baudrate and transmits/receives bytes. There's nothing complicated about that... just read from or write to the device and you're done.
The only issue is that queuing at some place could result in bad audio latency (if you're using the MIDI commands to control live audio, which I believe is what you're doing). It seems like those devices are mostly made for system exclusive messages, that is, for example, downloading some patch/preset for a synthesizer online and uploading it to the device using MIDI. Latency doesn't really matter in this situation.
Also have a look at the ALSA way of playing with MIDI on Linux.
If you are not developing a new MIDI controller hardware, you shouldn't worry about writing a driver for it. It's the user's concern installing their hardware, and the vendor's obligation to supply the drivers.
Under Linux, you just read the file. Now to interpret and make useful things with the data.
The application I'm using only plays sounds after enough sound has been generated. Say I click the mouse 10 times, with no sound, and then after those ten clicks I'll hear ten mouse click sounds (for example)
The only way I've found to alleviate this problem is to set a very short buffer size, which I don't want to do.
I've been trying to use the start_threshold sw parameter but that has no effect.
It seems like I should be able to force it to play when a specified amount of data has been written that is under buffer size, is this correct? That's what start_threshold seems to indicate, since the period length can be much shorter than the buffer (or so I've seen in examples).
My code is like this:
Call HW parameter setup
get a byte array with data
loop through and write to the buffer plus one byte offset each time
call start (this should force it to play back, right??)
if there is -EPIPE, call prepare and add 0 to offset (I think this is the only time when things get played.)
Thanks!