How are sampling rate and frame size related?

Can anyone say how sampling rate and frame size are related?
I decoded an .spx file to .wav, with a sampling rate of 10 kHz at 16 bits. The frame size used during the decoding process was 640.
The decoded file is playable in VLC, but I want to play it in Flex.
Flex supports only rates of 44.1 kHz, 22.05 kHz, and 11.025 kHz. I want to increase the sampling rate during the decoding process. I know how to do that in the code, but I suspect the frame size also has to be increased. I don't know the dependency between the two. Can anyone help?

Frame size and sampling rate are generally orthogonal concepts; they don't need to affect each other unless a particular format demands it.
For PCM .wav, the frame size is always bits per sample * number of channels. In your case, 16 bits for mono, or 32 bits for stereo.
Also, there is no need to change the decoding frame size just because you resample afterwards.
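As a concrete illustration of that rule (the helper function here is hypothetical, not part of any API):

#include <stdio.h>

/* For PCM, one frame holds one sample per channel, so the frame size
   in bytes is (bits per sample / 8) * channels. */
static unsigned pcm_frame_bytes(unsigned bits_per_sample, unsigned channels)
{
    return (bits_per_sample / 8) * channels;
}

int main(void)
{
    printf("16-bit mono:   %u bytes/frame\n", pcm_frame_bytes(16, 1)); /* 2 */
    printf("16-bit stereo: %u bytes/frame\n", pcm_frame_bytes(16, 2)); /* 4 */
    return 0;
}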

You are mixing up two independent tasks: Speex decoding and resampling. The frame size you mention should be considered only as the size of a buffer that holds PCM samples. Those PCM samples are what you pass to a resampler (for example SSRC: http://shibatch.sourceforge.net/).
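For illustration only, here is a naive linear-interpolation resampler for 16-bit mono PCM; it shows that resampling is a separate stage operating on plain sample buffers, though a real resampler such as SSRC will sound much better:

#include <stddef.h>

/* Naive linear-interpolation resampler for 16-bit mono PCM.
   in_rate/out_rate are in Hz; returns the number of output samples. */
size_t resample_linear(const short *in, size_t in_len,
                       short *out, size_t out_max,
                       double in_rate, double out_rate)
{
    double step = in_rate / out_rate;  /* input samples per output sample */
    size_t n = 0;
    for (double pos = 0.0; (size_t)pos + 1 < in_len && n < out_max; pos += step) {
        size_t i = (size_t)pos;
        double frac = pos - (double)i;
        /* interpolate between neighbouring input samples */
        out[n++] = (short)((1.0 - frac) * in[i] + frac * in[i + 1]);
    }
    return n;
}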

Frame size depends on the codec used to compress the original data. It contains an integral number of samples (320 in this case).
If I'm thinking of this correctly, raw audio has a frame size equal to the sample size, whereas some codecs perform compression over a range of samples. Usually, the larger the frame size, the more memory is needed to compress the data, but the better the compression you can potentially achieve.
You can't increase the sampling rate during decoding, but you could resample the decoded audio. Presumably you're actually re-encoding the data to send it to Flex? You'll need to have a look at the codec you're using to re-encode. Which codec are you using?

Irrespective of the number of channels used, the frame rate and sampling rate are the same.
That is the purpose of TDM:
new channels are introduced in the gap left between two consecutive samples.
As the number of channels increases, the time allotted to each channel decreases, and with it the time taken by each bit.
But the time gap between consecutive samples of any one channel remains constant and equals the total frame time.
That is, time gap between samples = frame time; hence the frame rate is equal to the sample rate.
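A small worked example of that timing argument (rate and channel counts chosen purely for illustration):

#include <stdio.h>

/* In TDM the frame time equals the sample period, so adding channels
   shrinks each channel's slot while the frame (= sample) rate is fixed. */
int main(void)
{
    double sample_rate = 8000.0;               /* Hz */
    double frame_time_us = 1e6 / sample_rate;  /* 125 us between samples */
    for (int channels = 1; channels <= 4; channels *= 2)
        printf("%d channel(s): slot = %6.2f us, frame time = %.0f us\n",
               channels, frame_time_us / channels, frame_time_us);
    return 0;
}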

Related

How to set periods and buffer size in ALSA?

I'm trying to capture audio for a SIP-like application.
I want to get 20 milliseconds of audio at 8 kHz, mono.
I need the application to get audio exactly every 20 milliseconds to avoid jitter.
I have set the parameters as follows:
access: SND_PCM_ACCESS_RW_INTERLEAVED
format: SND_PCM_FORMAT_S16_LE
rate: 8000
channels: 1
period size: 160
I want the number of periods to be 2 and the buffer size to be 320 frames (period_size * periods). However, if I try to set either of these using:
snd_pcm_hw_params_set_periods
snd_pcm_hw_params_set_buffer_size
then I get -22 returned, which is -EINVAL.
The period size specifies how often the hardware notifies your application that a complete period has been captured. It is a hardware parameter, which means that the hardware might not support the value that you want.
To get the period size that is nearest to your desired value, use snd_pcm_hw_params_set_period_size_near().
If you want to read 160 samples, just tell snd_pcm_read*() to read 160 frames. However, if this does not match the period size, you will get jitter. If reducing jitter is important, you have to put the samples into your own queue and take them out with an appropriate timer.
Please note that capture latency depends only on the period size, not on the buffer size, so you should make the buffer as large as possible to reduce the risk of overruns.
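Building on that advice, here is a sketch of how the setup might look with the *_near variants, which let the hardware adjust values it cannot honour exactly (error checking omitted for brevity):

#include <alsa/asoundlib.h>

/* Ask for values *near* the desired ones; after each call the variable
   holds the value the hardware actually granted. */
void configure_capture(snd_pcm_t *pcm)
{
    snd_pcm_hw_params_t *hw;
    unsigned int rate = 8000;
    snd_pcm_uframes_t period = 160;   /* 20 ms at 8 kHz */
    snd_pcm_uframes_t buffer = 320;   /* 2 periods; larger is safer against overruns */
    int dir = 0;

    snd_pcm_hw_params_alloca(&hw);
    snd_pcm_hw_params_any(pcm, hw);
    snd_pcm_hw_params_set_access(pcm, hw, SND_PCM_ACCESS_RW_INTERLEAVED);
    snd_pcm_hw_params_set_format(pcm, hw, SND_PCM_FORMAT_S16_LE);
    snd_pcm_hw_params_set_channels(pcm, hw, 1);
    snd_pcm_hw_params_set_rate_near(pcm, hw, &rate, &dir);
    snd_pcm_hw_params_set_period_size_near(pcm, hw, &period, &dir);
    snd_pcm_hw_params_set_buffer_size_near(pcm, hw, &buffer);
    snd_pcm_hw_params(pcm, hw);
    /* rate, period and buffer now contain the granted values */
}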

ALSA lib hardware parameter setting

I'm trying to record sound on my Linux (Debian) embedded device with the ALSA library. My embedded hardware is this [1], and according to page 33 of its datasheet [2],
Analog audio signals are featured by the on-SOM TLV320AIC3106 audio codec.
and according to the datasheet of this Texas Instruments audio codec [3],
Supports Rates From 8 kHz to 96 kHz
I used the example application code for alsa-lib; for the initial work I didn't change it. In the example code, the sampling rate was set to 44100 Hz. I successfully recorded sound and played it back. So, based on the datasheets, I think I should be able to record sound with alsa-lib at a sampling rate of 8000 Hz. I set the sampling rate to 8000 Hz, but during ALSA configuration it changes to 16000 Hz.
I set the sampling rate to 8000 Hz:
snd_pcm_hw_params_set_rate_near(handle, params, &(record_params->rate), &dir);
snd_pcm_hw_params_set_channels(handle, params, record_params->channel);
rc = snd_pcm_hw_params(handle, params);
But after invoking this method:
snd_pcm_hw_params_get_period_time(params, &(record_params->rate), &dir);
it changes to 16000. There is no other method call in between. Are my settings wrong, or does the codec perhaps not support 8 kHz?
UPDATE: When I set the rate to 16000, it changes to 8000. Now I'm even more confused.
[1] = http://www.variscite.com/products/system-on-module-som/cortex-a9/dart-mx6-cpu-freescale-imx6
[2] = http://www.variscite.com/images/stories/DataSheets/DART-MX6/DART-MX6_v1_2_datasheet_v2_1.pdf
[3] = http://www.ti.com/lit/ds/symlink/tlv320aic3106.pdf
Period time and rate are two different things. Note that your snd_pcm_hw_params_get_period_time() call writes the period time, in microseconds, into record_params->rate, so the value you are looking at is not a sample rate at all. (A 128-frame period is 16000 µs at 8000 Hz and 8000 µs at 16000 Hz, which would also explain your update.)
The period of a PCM is basically the number of frames that is transferred between device interrupts. It's done this way because transferring data to a device frame by frame would be extremely inefficient.
The ALSA library lets you query the configured period in microseconds (snd_pcm_hw_params_get_period_time) or as a frame count (snd_pcm_hw_params_get_period_size).
If you're trying to calculate what size buffer to allocate for reading from or writing to a PCM, it is more intuitive to use snd_pcm_hw_params_get_period_size (which returns the number of frames in a period) and then call snd_pcm_frames_to_bytes, which converts a frame count of a PCM to a byte count.
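A minimal sketch of that buffer calculation, assuming the hardware parameters have already been installed with snd_pcm_hw_params() (the helper name is mine):

#include <stdlib.h>
#include <alsa/asoundlib.h>

/* Allocate a buffer big enough for one period of captured audio. */
char *alloc_period_buffer(snd_pcm_t *pcm, snd_pcm_hw_params_t *hw,
                          snd_pcm_uframes_t *frames_out)
{
    snd_pcm_uframes_t frames;
    int dir = 0;

    snd_pcm_hw_params_get_period_size(hw, &frames, &dir);
    *frames_out = frames;
    /* snd_pcm_frames_to_bytes() accounts for sample format and channel count */
    return malloc(snd_pcm_frames_to_bytes(pcm, frames));
}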

Reducing the Frame-Rate of a Network Video Stream with FFMPEG

I have a network video stream that I am decoding with the ffmpeg C library.
I'd like to cap the frame rate at some maximum, say 15 fps.
I used the filter fps=fps=15, but even on a 25 fps video stream this caused frame duplication. I presume this was due to network delays.
Is there some way to cap the frame rate while avoiding frame duplication, getting delays instead?
If not, is there a way to identify whether a decoded frame is one of the duplicates?

Converting frames of raw video to Inverse Perspective Mapped frames

I have a raw video of 1000 frames. I am applying Inverse Perspective Mapping to these frames and storing the results on the hard disk. But this process takes around 10 minutes.
Is there any other way the speed can be improved? I am using the cvWarpPerspective and cvGetPerspectiveTransform functions, and I have to do this in real time with a maximum delay of 500 ms.
You could use OpenGL for hardware acceleration, but your biggest bottleneck is likely to be writing the images back to disk.
Assuming the images aren't small, half a second to load, warp, and save 1000 raw frames is very demanding. What is the reason for this requirement?
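One cheap optimisation on the CPU side: since the camera geometry is fixed, compute the perspective matrix once and reuse it for every frame rather than recomputing it per frame. A rough sketch with the legacy C API; the four point correspondences are made up and error handling is omitted:

#include <opencv2/imgproc/imgproc_c.h>

/* Warp all frames with a single precomputed 3x3 homography. */
void ipm_all_frames(IplImage **frames, IplImage **warped, int n)
{
    /* hypothetical ground-plane correspondences for a 640x480 image */
    CvPoint2D32f src[4] = { {0, 480}, {640, 480}, {400, 200}, {240, 200} };
    CvPoint2D32f dst[4] = { {0, 480}, {640, 480}, {640, 0},   {0, 0}    };
    CvMat *map = cvCreateMat(3, 3, CV_32FC1);

    cvGetPerspectiveTransform(src, dst, map);   /* once, outside the loop */
    for (int i = 0; i < n; i++)
        cvWarpPerspective(frames[i], warped[i], map,
                          CV_INTER_LINEAR + CV_WARP_FILL_OUTLIERS,
                          cvScalarAll(0));
    cvReleaseMat(&map);
}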

Playing the solfege notes with the ALSA API?

I'm playing with the ALSA API and I wonder which parameters I should pass to the function snd_pcm_writei to simply play the solfège syllables/notes (A-G / do re mi fa sol la si do).
Thanks
If you really want to do it with that function, generate a waveform in a buffer. A triangle-shaped wave may not sound too awful and should be simple enough to generate.
The base "la" (A) is 440Hz, that is, 440 cycles of the waveform of your choice per second.
The other notes can be obtained by multiplying/dividing by 2^(1/12) (1.05946309) for each half tone above/below this base frequency. You will need to know at what frequency the output device is set up (that's probably an argument to another ALSA function). If the device frequency is, say, 44100 Hz, and you want to play the base "la", each period of your waveform should occupy 44100 / 440 or about 100 samples. Pay attention to the sample width and the number of channels the device is configured for, too.
Explanation: there are 12 half tones in an octave, and an octave is exactly half (lower pitched) or double (higher pitched) the frequency. Once you have multiplied 12 times by 2^(1/12), you have multiplied by 2, so each half-tone is at a factor of 2^(1/12) above the previous one.
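Putting those numbers into code: a minimal sketch that fills a 16-bit mono buffer with a triangle wave, assuming a 44100 Hz device rate (the function name and amplitude are arbitrary choices):

#include <math.h>

/* Fill buf with a triangle wave; semitones is the offset from the base
   "la" (A440), e.g. 0 = la/A4, 3 = do/C5. */
void fill_note(short *buf, int nsamples, int semitones)
{
    double rate = 44100.0;                             /* device sample rate */
    double freq = 440.0 * pow(2.0, semitones / 12.0);  /* 2^(1/12) per half tone */
    double period = rate / freq;                       /* ~100 samples for A440 */

    for (int i = 0; i < nsamples; i++) {
        double phase = fmod((double)i, period) / period;  /* 0..1 */
        /* triangle: ramp up on the first half, down on the second */
        double tri = phase < 0.5 ? 2.0 * phase : 2.0 - 2.0 * phase;
        buf[i] = (short)((tri - 0.5) * 2.0 * 30000.0);
    }
}

The filled buffer can then be handed to snd_pcm_writei(), whose size argument is counted in frames (equal to samples for mono audio).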
Sounds like you want MIDI, not ALSA. ALSA deals with sampled audio (e.g. digital waveforms derived from a CD, WAV, MP3, etc.). It is not a sound synthesis program.
