playing only part of a sound using FMOD - c

I'm trying to play only part of a sound using FMOD, say frames 50000-100000 of a 200000 frame file.
I have found a couple of ways to seek forward (i.e. to start playback at frame 50000), but I have not found a way to make sure the sound stops playing at frame 100000. Is there any way FMOD can natively do this without having to bring libsndfile or the like into the picture?
I should also mention that I am using the streaming option. I have to assume that these sounds are arbitrarily large and cannot be comfortably/quickly loaded into memory.

You can use Channel::setDelay for sample accurate starting and stopping of sounds. Use FMOD_DELAYTYPE_DSPCLOCK_START to set the start time of the sound and FMOD_DELAYTYPE_DSPCLOCK_END to set the end time.
Check out the docs for Channel::setDelay, FMOD_DELAYTYPE, System::getDSPClock.

You should be able to use the streaming callback to stop the stream when you get to the desired point.
Option 1: When you create the stream, set lenbytes to an even divisor of the number of frames you wish to play. In your example, set lenbytes to 5000, then keep a counter in the callback; when you get to 10, stop the stream.
Option 2: use FSOUND_Stream_AddSyncPoint with pcmoffset set to your desired stopping point. Register a callback with FSOUND_Stream_SetSyncCallback. Stop the stream in the callback.

To start playback at sample 50,000 and end at 100,000, you could do the following, assuming the sound file sample rate and the system sample rate are the same. As the DSP clock works in system output samples, you may need to do some maths to convert your end sample into the output rate. See Sound::getDefaults for the sound sample rate and System::getSoftwareFormat for the system rate.
unsigned int sysHi, sysLo;
// ... create sound, play sound paused ...
// Seek the data to the desired start offset
channel->setPosition(50000, FMOD_TIMEUNIT_PCM);
// For accurate sample playback get the current system "tick"
system->getDSPClock(&sysHi, &sysLo);
// Set start offset to a couple of "mixes" in the future, 2048 samples is far enough in the future to avoid issues with mixer timings
FMOD_64BIT_ADD(sysHi, sysLo, 0, 2048);
channel->setDelay(FMOD_DELAYTYPE_DSPCLOCK_START, sysHi, sysLo);
// Set end offset for 50,000 samples from our start time, which means the end sample will be 100,000
FMOD_64BIT_ADD(sysHi, sysLo, 0, 50000);
channel->setDelay(FMOD_DELAYTYPE_DSPCLOCK_END, sysHi, sysLo);
// ... unpause sound ...
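As an example of the rate adjustment mentioned above (the rates here are assumptions for illustration, not taken from the question): if the sound file is 44,100 Hz but the system output runs at 48,000 Hz, the 50,000-sample window must be expressed in output samples, i.e. 50000 * 48000 / 44100 ≈ 54,422, and that is the value to pass to the second FMOD_64BIT_ADD.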

Related

Play multiple wav audio files with C and libao at the same time

I'm using libao (ao_play) to play some buffers. I listen for keyboard keys, and for each key I have a wav sound to play. It's simple.
With ao_play I see that the application blocks while the sound is playing. Because I want to play multiple sounds at the same time, I needed to use threads (with the pthread library).
It works, but it feels like a workaround, and if I play too many files (maybe 10 or so) everything freezes for a few seconds and then recovers.
Well, my question is: how do I play multiple sounds at the same time, non-blocking, using libao (and without using threads)?
This is not a real design, more like a guess.
First of all, you'll need threads, because it's a good old tradition to separate computation from visualisation, or audialisation in this case. You'll need an audio thread that renders the stream and sends it to the output.
So, each time your main thread discovers a keypress, it sends a note to the audio thread. The latter captures the event and adds a wave to the currently playing stream. The stream is rendered in frames (64, 1024, or 10240 samples, whatever suits your latency); if each wave is a simple mix of a few possible samples, rendering can easily keep up in real time. You should keep track of the notes currently playing, with a position within each sample. If latency is low, and granularity therefore high, you can even align sample edges with buffer edges, which would notably simplify rendering.
After the current buffer is rendered, you simply send it to the DAC and proceed with the next frame.
A quick glance at libao's help page does not reveal any mixing capabilities, so you'll need to create a simple mixer of your own, or reach for an existing solution in some simple open-source audio rendering library.
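To illustrate such a mixer, here is a minimal single-threaded sketch in C using libao. It is a sketch only: the voices table and how keypresses populate it are hypothetical (in the real program the key-handling thread would add voices under a mutex), and it assumes 16-bit mono source samples at 44,100 Hz.

#include <ao/ao.h>
#include <limits.h>
#include <string.h>

#define MAX_VOICES 16
#define FRAMES 1024

/* Hypothetical voice table: one entry per currently sounding sample. */
typedef struct { const short *data; size_t len, pos; int active; } Voice;
static Voice voices[MAX_VOICES];

/* Sum all active voices into one block, clipping instead of wrapping. */
static void mix_block(short *out, int frames) {
    for (int i = 0; i < frames; i++) {
        int acc = 0;
        for (int v = 0; v < MAX_VOICES; v++) {
            if (!voices[v].active) continue;
            acc += voices[v].data[voices[v].pos++];
            if (voices[v].pos >= voices[v].len) voices[v].active = 0;
        }
        if (acc > SHRT_MAX) acc = SHRT_MAX;
        if (acc < SHRT_MIN) acc = SHRT_MIN;
        out[i] = (short)acc;
    }
}

int main(void) {
    ao_initialize();
    ao_sample_format fmt;
    memset(&fmt, 0, sizeof fmt);
    fmt.bits = 16; fmt.channels = 1; fmt.rate = 44100;
    fmt.byte_format = AO_FMT_LITTLE;
    ao_device *dev = ao_open_live(ao_default_driver_id(), &fmt, NULL);
    short buf[FRAMES];
    for (;;) {                                  /* one ~23 ms block per pass */
        mix_block(buf, FRAMES);
        ao_play(dev, (char *)buf, sizeof buf);  /* blocks ~23 ms, not per file */
    }
    ao_close(dev);
    ao_shutdown();
}

The point is that ao_play now blocks for only one small block at a time, regardless of how many sounds are audible, so a single playback thread suffices.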

How do I write audio data at a certain sample rate?

I am making a synthesizer by piping data into aplay (I know it's not ideal), and the sound is lagging behind the keypresses which alter it. I believe this is because aplay consumes data at a constant 8000 Hz, but the C program produces it at an unstable rate. How do I get the for loop to run at 8000 Hz in C?
To generate audio samples at 8000 Hz (or any fixed rate) you don't want your loop to "run at" that rate. That would involve huge amounts of overhead (99.99% or more) spinning doing nothing until time to generate the next sample, and (especially if you sleep rather than spinning) would be unreliable in that your process might not wake-up/get-scheduled in time for some of the samples.
Instead, you just want to be producing samples at an overall rate matching what the consumer (aplay/the audio device) expects. With times measured in seconds, you can compute the sample number you should have generated up to by now as something like:
(current_time + buffer_depth - start_time) * sample_rate
then, after generating up to that sample, sleep for some period proportional to the buffer depth, but sufficiently less that you won't be in trouble if your process doesn't get scheduled again right away. The buffer depth you can use depends on what kind of latency you need. If you're making sounds for live/realtime events, you probably want a buffer depth of 1/50 sec (20 ms) or less. If not, you can happily use huge buffers like 5-10 seconds.
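A minimal sketch of that pacing loop in C, assuming 8000 Hz mono 16-bit output piped to aplay; render_sample() is a hypothetical stand-in for the synth code, and the 20 ms buffer depth is the low-latency case from above:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define RATE 8000
#define BUFFER_DEPTH 0.020          /* seconds of lead over real time */

short render_sample(long n);        /* hypothetical synth function */

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void) {
    double start = now_sec();
    long generated = 0;
    for (;;) {
        /* How many samples should exist by now, plus the buffer depth? */
        long target = (long)((now_sec() - start + BUFFER_DEPTH) * RATE);
        while (generated < target) {
            short s = render_sample(generated++);
            fwrite(&s, sizeof s, 1, stdout);   /* piped to aplay */
        }
        fflush(stdout);
        usleep((useconds_t)(BUFFER_DEPTH * 1e6 / 2));  /* wake twice per depth */
    }
}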
If you are piping data to aplay, you will not experience any problems with the sample rate (8 kHz, for example) because the kernel will block your program inside write() when the pipe buffer is full. This effectively limits your audio generation to 8 kHz with no work on your part.
However, this is far from ideal. Your application will only be throttled once the kernel buffer for the pipe is full, and the default size for pipe buffers on Linux is 64 kB. For stereo 16-bit data at 8 kHz, this is two full seconds of audio data, so you would expect your audio to lag at least two seconds from the user input. This is unacceptable for synthesizer applications.
The only real solution is to use the ALSA library directly (or some alternative sound API). Using this API, you can send buffered audio data to your audio output device without accumulating excessive queued data in kernel buffers.
See A Guide Through The Linux Sound API Jungle for some tips.
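If you take the ALSA route, a minimal playback sketch might look like this (link with -lasound; fill_audio() is a hypothetical stand-in for the generator, and the 20 ms latency passed to snd_pcm_set_params is an assumption to tune):

#include <alsa/asoundlib.h>

#define RATE 8000
#define FRAMES 160                  /* 20 ms at 8 kHz */

void fill_audio(short *buf, int frames);   /* hypothetical synth code */

int main(void) {
    snd_pcm_t *pcm;
    if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0)
        return 1;
    /* format, access, channels, rate, allow resampling, latency in us */
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                       SND_PCM_ACCESS_RW_INTERLEAVED, 1, RATE, 1, 20000);
    short buf[FRAMES];
    for (;;) {
        fill_audio(buf, FRAMES);
        snd_pcm_sframes_t n = snd_pcm_writei(pcm, buf, FRAMES);
        if (n < 0)
            snd_pcm_recover(pcm, (int)n, 0);   /* recover from underruns */
    }
    snd_pcm_close(pcm);
}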

Detect a beep of a certain frequency and duration

How can one detect a beep from audio data with a known frequency and duration but unknown time of arrival?
I am trying to implement a bandpass filter to ignore sounds at any other frequency. I haven't succeeded yet. Once I do, I will check whether the amplitude of the sound exceeds a certain threshold for my fixed amount of time. That should detect the beep. I have been told that a Fourier transform can also be used to detect the beep.
Which strategy is better?
Furthermore, listening to my recordings, I have come to believe that Windows or the sound driver of my laptop (Inspiron 15R) applies some sort of noise-cancelling filter to the microphone input. Is this common on laptops? If it is, is there any way to get the untouched audio from the mic? I am using the PortAudio library to capture the sound.
The simplest and most common method for this kind of tone detection task (where the tone frequency is known a priori) is the Goertzel Algorithm. It's effectively just a single bin DFT at the frequency of interest - you take the output and low pass filter it and when it exceeds an empirically-derived threshold then you have detected your tone.
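For reference, the Goertzel update itself is only a few lines; a sketch in C (the block length and detection threshold are assumptions you would tune against your recordings):

#include <math.h>

/* Squared magnitude of the single DFT bin at freq, computed over n samples. */
double goertzel_power(const float *x, int n, double freq, double rate) {
    double coeff = 2.0 * cos(2.0 * M_PI * freq / rate);
    double s1 = 0.0, s2 = 0.0;
    for (int i = 0; i < n; i++) {
        double s0 = x[i] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

To use the known duration, run this over consecutive short blocks (say 10-20 ms) and declare a detection only when the power stays above the threshold for roughly duration/block_length blocks in a row.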

Tell libavcodec/ffmpeg to drop frame

I'm building an app in which I create a video.
Problem is, sometimes (well... most of the time) the frame acquisition process isn't quick enough.
What I'm currently doing is skipping the current frame acquisition when I'm running late; however, FFMPEG/libavcodec considers every frame I pass to it as the next frame in line, so if I drop 1 out of 2 frames, a 20 second video will only last 10. More problems come in as soon as I add sound, since sound processing is way faster...
What I'd like is to tell FFMPEG: "the last frame should last twice as long as originally intended", or anything that would allow me to process in real time.
At one point I tried to stack up the frames, but this ends up killing all my memory (I also tried to 'stack' my frames on the hard drive, which was way too slow, as I expected).
I guess I'll have to work with the PTS manually, but all my attempts have failed, and reading the code of other apps that use ffmpeg, such as VLC, wasn't of great help... so any advice would be much appreciated!
Thanks a lot in advance!
Your output will probably be considered variable frame rate (VFR), but you can simply generate a timestamp from wallclock time when a frame arrives and apply it to your AVFrame before encoding it. Then the frame will be displayed at the correct time on playback.
For an example of how to do this (at least the specify-your-own-timestamp part), see doc/examples/muxing.c in the ffmpeg distribution (line 491 in my current git pull):
frame->pts += av_rescale_q(1, video_st->codec->time_base, video_st->time_base);
Here the author is incrementing the frame timestamp by 1 in the video codec's timebase, rescaled to the video stream's timebase. In your case, you can simply rescale the number of seconds since you started capturing frames from an arbitrary timebase to your output video stream's timebase (as in the above example). For example, if your arbitrary timebase is 1/1000 and you receive a frame 0.25 seconds after you started capturing, then do this:
AVRational my_timebase = {1, 1000};
frame->pts = av_rescale_q(250, my_timebase, avstream->time_base);
Then encode the frame as usual.
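Putting that together, a sketch of stamping each captured frame from wallclock time (capture_start_us, the surrounding muxing setup, and the stamp_frame helper name are assumptions, not part of the ffmpeg example):

#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>  /* av_rescale_q */
#include <libavutil/time.h>         /* av_gettime, microseconds */

/* Hypothetical: set once, just before the capture loop starts. */
static int64_t capture_start_us;

static void stamp_frame(AVFrame *frame, AVStream *avstream)
{
    AVRational us_tb = {1, 1000000};        /* wallclock timebase */
    int64_t elapsed = av_gettime() - capture_start_us;
    frame->pts = av_rescale_q(elapsed, us_tb, avstream->time_base);
}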
Many (most?) video formats don't permit leaving out frames. Instead, try reusing old video frames when you can't get a fresh one in time.
Just an idea: when the processing is lagging, have you tried passing the same frame to it again (and dropping the current one)? Maybe it can process the duplicated frame quickly.
There's the ffmpeg command line switch -threads ... for multicore processing, so you should be able to do something similar with the API (though I have no idea how). This might solve your problem.

Trouble implementing a real time program in C

I have an encoder which encodes a speech file (.wav) that I give as input. Now what I want to do is write a program so that I can speak into the mic and the encoder can process it at the same time. Basically I want to record and process a speech signal in real time (a small delay can be tolerated). To do this I was thinking of making a loop inside which I would first record the speech for, say, 1 second into a file, say speech.in, then copy this file to temp and pass temp to the encoder. In the meantime the recorder should overwrite the speech.in file and save the next 1 second of data in it. And continue this loop...
The problem I am having is that I can't write a program to control the recorder to do what I want. Is there any recorder which can be easily controlled, or any code to do it?
This is the only way I could think of to implement this. Any other (hopefully better) solution is also welcome.
*edit: I am working on Ubuntu 10.04, but I have used the same program on Windows as well, so suggestions for either platform are welcome.
Your proposed way is not the way to go. At least, this is not how it's done on Windows and Mac. (I don't know how Linux-flavoured machines would do it, but I'm guessing the methodology is the same.)
You'll have to open the audio device and allocate a set of (say 4) internal memory buffers; 100 ms of sound each would suffice, but you'll have to experiment with how small you can get them (the smaller the buffer, the lower the latency, but the greater the chance of audio glitches).
You attach these to the audio device and ask for a callback whenever one of the buffers is filled. When you get the first callback, make sure you encode the buffer quickly enough, before it is handed back to the audio device and overwritten with new data.
You could simultaneously output the encoded sound to the audio device again. The latency would be similar to the length of one buffer.
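A sketch of that callback model using PortAudio, which works on both Ubuntu and Windows (encode_block() is a hypothetical stand-in for handing data to your encoder, and the 100 ms buffer size follows the suggestion above):

#include <portaudio.h>

#define RATE 8000
#define FRAMES 800                  /* 100 ms buffers at 8 kHz */

void encode_block(const short *samples, unsigned long n);  /* hypothetical */

static int record_cb(const void *input, void *output,
                     unsigned long frameCount,
                     const PaStreamCallbackTimeInfo *timeInfo,
                     PaStreamCallbackFlags statusFlags, void *userData)
{
    (void)output; (void)timeInfo; (void)statusFlags; (void)userData;
    encode_block((const short *)input, frameCount);  /* must return quickly */
    return paContinue;
}

int main(void) {
    PaStream *stream;
    Pa_Initialize();
    Pa_OpenDefaultStream(&stream, 1, 0, paInt16, RATE, FRAMES,
                         record_cb, NULL);
    Pa_StartStream(stream);
    for (;;) Pa_Sleep(1000);        /* main thread idles; callback does the work */
}

In practice, heavy work inside the callback risks glitches, so you would usually copy each buffer into a ring buffer and run the encoder on a separate thread, which matches the threading suggestion below.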
Sounds like this would be best served by threading.
Here is an MSDN link.
