I'm trying to capture audio for a sip like application.
I want to get 20 milliseconds of audio at 8khz mono.
I need the application to get audio exactly every 20 milliseconds to avoid jitter.
I have set parameters as follows
access: SND_PCM_ACCESS_RW_INTERLEAVED
format: SND_PCM_FORMAT_S16_LE
rate: 8000
channels: 1
period size: 160
I want the periods to be 2 and the buffer to be 320 (period_size*periods). However if I try to set either of these using:
snd_pcm_hw_params_set_periods
snd_pcm_hw_params_set_buffer_size
Then I get -22 returned, which is -EINVAL
The period size specifies how often the hardware notifies your application that a complete period has been captured. It is a hardware parameter, which means that the hardware might not support the value that you want.
To get the period size that is nearest to your desired value, use snd_pcm_hw_params_set_period_size_near().
If you want to read 160 samples, just tell snd_pcm_read*() to read 160 frames. However, if this does not match the period size, you will get jitter. If reducing jitter is important, you have to put the samples in your own queue and take them with
out with an appropriate timer.
Please note that capture latency depends only on the period size, not on the buffer size, so you should make the buffer as large as possible to reduce the risk of overruns.
Related
here is the problem I am facing. I have interfaced my ATmega328P with a 6-axis IMU (MPU6050 with the GY521 breakout board). I can read data through the TWI interface (Atmel's I2C) and send it to my PC (running Ubuntu) via the UART. I am using custom-built libraries for both these communication protocols, but they are pretty standard and seem to work just fine. The goal of the project is to compute orientation data from the IMU readings in real-time, say at 100 Hz.
The main problem is that I cannot log data from the device at 100 Hz (not even at 50 Hz). The orientation filter I am using (here) requires a quite high frequency and 100 Hz turned out to work fine (tested offline acquiring data from another device).
Right now, I am using the 16-bit timer of the ATmega328P to sample data at 100 Hz and this seem to work, as I have added to the ISR a line to toggle the built-in LED and it looks to me that it is blinking at 100 Hz (I can barely see it turning on and off). In the same ISR, I read the values from the inertial sensor and, just to log them, send these values through the serial port. Every 10 ms (maximum), I send 9 floats (36 bytes) with a baud rate of 115200. If I use the Arduino IDE's Serial Monitor to visualize this data stream, I notice something very weird, as in the following screenshot.
https://imgur.com/zTBdkhv
As you notice taking a look at the timestamps, there is a common 33 ms delay every 2 or 3 sets of samples received. Moreover, I get roughly the 60% of the data. For example, an acquisition of 10 seconds only gets me less than 600 samples (per each variable) instead of 1000. Moreover, I tested the same sending only one variable through the UART (i.e. only a single float, 4 bytes) and this results in the same behavior!
By the way, I am exploiting the following to send each byte (char) via the UART interface.
void writeCharUART(char c) {
loop_until_bit_is_set(UCSR0A, UDRE0);
UDR0 = c;
}
Even though my ISR runs at 100 Hz (LED blinking seem to confirm that), data loss may occur at the level of the TWI transmission. To prove that, I modified the code of the ISR to send just a normal char (T) instead of data from the MPU and I got a similar behavior. Something like this:
00:10:05.203 -> T
00:10:05.203 -> T
00:10:05.236 -> T
00:10:05.236 -> T
00:10:05.236 -> T
00:10:05.236 -> T
00:10:05.269 -> T
So, I guess there is something wrong with the UART library and I actually sample at 100 Hz, but the logging frequency is much lower (and not constant). How can I solve this issue and/or debug the UART library? Do you see other reasons to justify this issue?
EDIT 1
As pointed out in the comments, it seems to be a problem of the receiving software that limits the frequency to ~30 Hz by some sort of buffering. To confirm that, I programmed the ATmega328P with the following code (this time using the IDE).
void loop() {
Serial.println("T");
}
At first, I thought there was no delay this time, but I could find it after 208 samples. So, there are ~200 samples received at the same timestamp and another bunch of samples after 33 ms. This may be proof that the receiving software introduces this delay.
I also tested a simple serial monitor that I had developed in C and, even though there is no timestamp functionality, I am also loosing samples if I fix the duration of the acquisition sampling at 100 Hz. My serial monitor is based on the termios.h library, but I could not find any documentation about its way of buffering incoming data.
There are two issues here:
You are missing messages. You checked the sample rate just with your eyes and told us that you can still see a very fast blinking. Depending on the colour of your LED, the ambient light, your physical state, and your eyes this could mean anything from 30 Hz to 100 Hz.
I would not trust my eyes to estimate and rather use an oscilloscope or a frequency counter to measure.
You could reduce the frequency of the LED blinking to 1Hz or even lower by dividing in software. Such a low frequency can be measured by hand via a stop watch. For example count 30 blinks and check the time needed for this.
Add a counter to the message and increment it with each message. You will see it right away if you're losing data.
The timestamps seem to indicate that the messages are "clustered" at about 30 Hz.
I'm guessing that the source of the timestamp in running at 30 Hz. So it can not give you more accurate values.
I kind of solved my issues! First of all, thanks to the comments I have checked that my ISR was correctly running at 100 Hz. Doing so, I could be sure that the problem where somewhere else, namely in the UART communication.
I found this very helpful: Linux, serial port, non-buffering mode
Apparently, the Serial Monitor provided by the Arduino IDE uses exploits the termios.h library and uses its default settings. I checked also the user manual and switched to the polling-read mode. Quoting from the user manual
If data is available, read(2) returns immediately, with the lesser of the number of bytes available, or the number of bytes requested. If no data is available, read(2) returns 0.
Hence, I switched back to my serial monitor code and changed the initPort() function adding the following lines of code.
struct termios options;
(...)
options.c_cc[VTIME] = 0;
options.c_cc[VMIN] = 0;
I noticed right away a much higher data frequency in the terminal. I kept the 1 Hz LED blinking in the ISR and there is no period stretching. Moreover, an acquisition of 10 seconds this time gave me roughly 1000 samples per variable, consistent with a sampling rate of 100 Hz.
On the AVR side, I also changed the way I send data through the UART. Before, I was sending 9 floats like this:
sprintf(buffer, "%f, %f, %f", value1_x, value1_y, value1_z);
serial_print(buffer); // no "\n" sent here
sprintf(buffer, "%f, %f, %f", value2_x, value2_y, value2_z);
serial_print(buffer); // again, no "\n" sent
sprintf(buffer, "%f, %f, %f", roll, pitch, yaw);
serial_println(buffer); // "\n" is sent here once the last data byte is sent
Now, I replaced all this with a single call to the function serial_println() and I write only 6 floats to the buffer.
I'm writing a client-server app using BSD sockets. It needs to run in the background, continuously transferring data, but cannot hog the bandwidth of the network interface from normal use. Depending on the speed of the interface, I need to throttle this connection to a certain max transfer rate.
What is the best way to achieve this, programmatically?
The problem with sleeping a constant amount of 1 second after each transfer is that you will have choppy network performance.
Let BandwidthMaxThreshold be the desired bandwidth threshold.
Let TransferRate be the current transfer rate of the connection.
Then...
If you detect your TransferRate > BandwidthMaxThreshold then you do a SleepTime = 1 + SleepTime * 1.02 (increase sleep time by 2%)
Before or after each network operation do a
Sleep(SleepTime)
If you detect your TransferRate is a lot lower than your BandwidthMaxThreshold you can decrease your SleepTime. Alternatively you could just decay/decrease your SleepTime over time always. Eventually your SleepTime will reach 0 again.
Instead of an increase of 2% you could also do an increase by a larger amount linearly of the difference between TransferRate - BandwidthMaxThreshold.
This solution is good, because you will have no sleeps if the user's network is already not as high as you would like.
The best way would be to use a token bucket.
Transmit only when you have enough tokens to fill a packet (1460 bytes would be a good amount), or if you are the receive side, read from the socket only when you have enough tokens; a bit of simple math will tell you how long you have to wait before you have enough tokens, so you can sleep that amount of time (be careful to calculate how many tokens you gained by how much you actually slept, since most operating systems can sleep your process for longer than you asked).
To control the size of the bursts, limit the maximum amount of tokens you can have; a good amount could be one second worth of tokens.
I've had good luck with trickle. It's cool because it can throttle arbitrary user-space applications without modification. It works by preloading its own send/recv wrapper functions which do the bandwidth calculation for you.
The biggest drawback I found was that it's hard to coordinate multiple applications that you want to share finite bandwidth. "trickled" helps, but I found it complicated.
Update in 2017: it looks like trickle moved to https://github.com/mariusae/trickle
I am making a synthesizer by piping data into aplay (I know it's not ideal) and the sound is lagging behind the keypresses which alter the sound. I believe this is because aplay is going at a constant 8000 Hz, but the c program is going at an unstable rate. How do I get the for loop to go at 8000 Hz in C?
To generate audio samples at 8000 Hz (or any fixed rate) you don't want your loop to "run at" that rate. That would involve huge amounts of overhead (99.99% or more) spinning doing nothing until time to generate the next sample, and (especially if you sleep rather than spinning) would be unreliable in that your process might not wake-up/get-scheduled in time for some of the samples.
Instead, you just want to be producing samples at an overall rate matching what the consumer (aplay/the audio device) expects. You can compute the overall current sample number you should be generating up to as something like:
current_time + buffer_depth - start_time
then, after generating up to that sample, sleep for some period proportional to the buffer depth, but sufficiently less that you won't be in trouble if your process doesn't get scheduled again right away. The buffer depth you can use depends on what kind of latency you need. If you're making sounds for live/realtime events, you probably want a buffer depth of 1/50 sec (20 ms) or less. If not, you can happily use huge buffers like 5-10 seconds.
If you are piping data to aplay, you will not experience any problems with the sample rate (8 kHz, for example) because the kernel will block your program when you write() when the buffer is full. This will effectively limit your audio generation to 8 kHz with no work on your part.
However, this is far from ideal. Your application will only be throttled once the kernel buffer for the pipe is full, and the default size for pipe buffers on Linux is 64 kB. For stereo 16-bit data at 8 kHz, this is two full seconds of audio data, so you would expect your audio to lag at least two seconds from the user input. This is unacceptable for synthesizer applications.
The only real solution is to use the ALSA library directly (or some alternative sound API). Using this API, you can send buffered audio data to your audio output device without accumulating excessive queued data in kernel buffers.
See A Guide Through The Linux Sound API Jungle for some tips.
Can anyone say how sampling rate and framesize are related ?
I decoded a spx file to wav, with sampling rate of 10 kHz and at 16 bit. The frame size applied during the decoding process was 640.
The decoded file is playable in vlc. But I want to play that file in Flex.
Flex supports rate of 44.1 kHz, 22.5 kHz and 11.2 kHz only. I want to increase the sampling rate during decoding process. I know how to do that in the code but I guess the framesize also should be increased. I don't know the dependency between these two. Can anyone help?
Frame size and sampling rate are generally orthogonal concepts. They don't need to affect each other unless a particular format demands it.
For PCM .wav, the frame size will always be bits/channels * channels. In your case, 16 bits for mono, or 32 bits for stereo.
Also, there is no need to change the decoding frame size only because you later apply resampling.
You mix two independent tasks: spex decoding and resampling. The mentioned frame size should be considered only as a buffer that contains PCM samples. These PCM samples you should pass to a resampler (for example SSRC: http://shibatch.sourceforge.net/).
Frame Size depends on the codec used to compress the original data. It will contain an integral number of samples (320 in this case).
If I'm correct in thinking raw audio has a frame size equal to the sample size. However some codecs perform compression over a range of samples. Usually the larger the frame size, the more memory needed to compress the data but the potentially better compression you can achieve.
You can't increase the sampling rate during decoding however you could resample the decoded audio. Presumably you're actually re-encoding the data to send it to Flex? You'll need to have a look at the codec you're using to rencode. Which codec are you using?
irrespective of number of channels used, frame rate and sampling rate are same.
because that is the purpose of TDM.
New channels are introduced in the gap left between two consecutive samples.
As the number of channels increase time allotted to each channel decrease there by time taken by each bit.
but tame gap between consecutive samples of any channel will remain constant and it will equal to the total frame time.
i.e. Time gap between samples = Frame time, hence Frame rate is equal to sample rate.
I write a program which can forward ip packets between 2 servers, so how to test the speed of the program ? thanks!
There are a number of communication metrics that may be of interest to your potential users.
Latency is the amount of time to send a message, usually quoted in microseconds for co-located devices and in milliseconds for all other scenarios. It is usually quoted as the "zero-byte latency", meaning the time required to transmitted the meta-data of a message. Lower is better.
Bandwidth is measured in bits per second. It is often quoted as "peak bandwidth" and can be obtained by sending a massive amount of data over the line. Higher is better.
CPU utilization is the percent of CPU time required to transmit a message. Network protocols that can offload a message's transmission have low utilization, which means that the communication can "overlap" some other computation in the user's application, which has the effect of hiding latency. Lower is better.
All of these are measured simply by a variation of the ping test, usually called the "ping-pong":
Node 1:
for n = 1 to MAXSIZE, step via n*=2
send message of size n bytes
receive a response of size n bytes
Node 2:
for n = 1 to MAXSIZE, step via n*=2
receive a message of size n bytes
send response of size n bytes
There's also a "ping-ping" test, in which both nodes write to each other at the same time. This requires non-blocking communication to set-up.
Just output n and the time required for each iteration. The first time is the zero-byte latency. The largest sustainable n/time is the bandwidth (convert to bits per second to be industry standard). You can also measure the CPU utilization required to run the larger iterations, but that's a tricky topic for a whole different question.
Take a look at iperf. You can find it at http://sourceforge.net/projects/iperf/ If you google around you will find tutorials for it. You can look at the source and might get some good ideas of how he does it. I use it for routine testing and it is quite robust