Phase angle from FFT using atan2 - weird behaviour. Phase shift offset? Unwrapping? - c

I'm testing and performing simple FFT's and I'm interested in phase shift.
I geneate simple array of 256 samples with sinusoid with 10 cycles.
I perform an FFT of those samples and receiving complex data (2x128).
Than I calculate magnitude of those data and FFT looks like expected:
Then I want to calculate phase shift from fft complex output. I'm using atan2.
Combined output fft_magnitude (blue) + fft+phase (red) looks like this:
This is pretty much what I expect with a "small" problem. I know this is wrapping but if I imagine unwrapping it, the phase shift in the magnitude peak is reading 36 degrees and I think it should be 0 because my input sinusiod was not shifted at all.
If I shift this -36 deg (blue is in-phase, red is shifted, blue is printed only for reference) the sinusiod looks like this:
And than if I perform an FFT of this red data the magnitude + phase output looks like this:
So it is easy to imagine that unwrapped phase will be close to 0 at the magniture peak.
So there is 36 deg offset. But what happenes if I prepare data with 20 cycles per 256 samples and 0 phase shift
If I then perform an FFT, this is an output (magnitude + phase):
And I can tell you if will cross the peak point at 72 degrees. So there is now 72 degrees offset.
Can anyone give me a hint why is that happening?
Is it right that atan2() phase output is frequency dependent with offset of 2pi/cycles (360 deg/cycles) ?
How to unwrap it and get correct results (I couldn't find working C library to unwrap).
This is running on ARM Cortex-M7 processor (embedded).
#define phaseShift 0
#define cycles 20
#include <arm_math.h>
#include <arm_const_structs.h>
float32_t phi = phaseShift * PI / 180; //phase shift in radians
float32_t data[256]; //input data for fft
float32_t output_buffer[256]; //output buffer from fft
float32_t phase_data[128]; //will contain atan2 values of output from fft (output values are complex)
float32_t magnitude[128]; //will contain absolute values of output from fft (output values are complex)
float32_t incrFactorRadians = cycles * 2 * PI / 255;
arm_rfft_fast_instance_f32 RealFFT_Instance;
void setup()
{
Serial.begin(115200);
delay(2000);
arm_rfft_fast_init_f32(&RealFFT_Instance, 256); //initializing fft to be ready for 256 samples
for (int i = 0; i < 256; i++) //print sinusoids
{
data[i] = arm_sin_f32(incrFactorRadians * i + phi);
Serial.print(arm_sin_f32(incrFactorRadians * i), 8); Serial.print(","); Serial.print(data[i], 8); Serial.print("\n"); //print reference in-phase sinusoid and shifted sinusoid (data for fft)
}
Serial.print("\n\n");
delay(10000);
arm_rfft_fast_f32(&RealFFT_Instance, data, output_buffer, 0); //perform fft
for (int i = 0; i < 128; i++) //calculate absolute values of an fft output (fft output is complex), and phase shift
{
magnitude[i] = output_buffer[i * 2] * output_buffer[i * 2] + output_buffer[(i * 2) + 1] * output_buffer[(i * 2) + 1];
__ASM("VSQRT.F32 %0,%1" : "=t"(magnitude[i]) : "t"(magnitude[i])); //fast square root ARM DSP function
phase_data[i] = atan2(output_buffer[i * 2], output_buffer[i * 2 +1]) * 180 / PI;
}
}
void loop() //print magnitude of fft and phase output every 10 seconds
{
for (int i = 0; i < 128; i++)
{
Serial.print(magnitude[i], 8); Serial.print(","); Serial.print(phase_data[i], 8); Serial.print("\n");
}
Serial.print("\n\n");
delay(10000);
}

To break down the excellent answer by hotpaw2. (Their answers are always so loaded with golden nuggets of information that I spend days learning enough to comprehend the brilliance.)
When an engineer says "integer periodic" they mean your samples that you are feeding into the FFT (the aperture) needs to sample in a way the ensures you capture one full wave of the frequency sin wave.
Think of the sin wave starting at zero and cresting at one then falling below zero into the trough at negative one and then coming back up to zero.
This is one "full cycle". Now if your wave has a period of 10 cycles per second and you sample at 100 samples per second you will have 10 samples per wave.
So now you put 13 samples into an FFT and your phase is off. Why?
Well the phase is looking for the wave to smoothly continue forever. You just started a zero for the first sample and dropped off as .25 on the 13th sample. Now the phase calculation tries to connect the two ends and has this jump in the wave. This causes the phase to come out wrong.
What you need to do is select a number of samples to feed into your FFT that you know will contain full waves only.
(NOTE) You are only concerned with the phase of one freq at a time.
AND your sample aperture must not start and end at the sin waves same point.
IF you start at zero and end at zero the calculation pasting the two ends together in a forever circle will get two zeros at the transition. So you have to stop one sample short of the repeat point.
Code demonstrating this can be found: Scipy FFT - how to get phase angle

An bare FFT plus an atan2() only correctly measures the starting phase of an input sinusoid if that sinusoid is exactly integer periodic in the FFT's aperture width.
If the signal is not exactly integer periodic (some other frequency), then you have to recenter the data by doing an FFTshift (rotate the data by N/2) before the FFT. The FFT will then correctly measure the phase at the center of the original data, and away from the circular discontinuity produced by the FFT's finite length rectangular window on non-periodic-in-aperture signals.
If you want the phase at some point in the data other than the center, you can use the estimate of the frequency and phase at the center to recalculate the phase at other positions.
There are other window functions (Blackman-Nutall, et.al.) that might produce a better phase estimate than a rectangular window, but usually not as good an estimate as using an FFTShift.

Related

How to use KissFFT with audio?

I have an array of 2048 samples of an audio file at 44.1 khz and want to transform it into a spectrum for an LED effect. I don't know too much about the inner workings of fft but I tryed it using kiss fft:
kiss_fft_cpx *cpx_in = malloc(FRAMES * sizeof(kiss_fft_cpx));
kiss_fft_cpx *cpx_out = malloc(FRAMES * sizeof(kiss_fft_cpx));
kiss_fft_cfg cfg = kiss_fft_alloc( FRAMES , 0 ,0,0 );
for(int j = 0;j<FRAMES;j++) {
float x = (alsa_buffer[(fft_last_index+j+BUFFER_OVERSIZE*FRAMES)%(BUFFER_OVERSIZE*FRAMES)] - offset);
cpx_in[j] = (kiss_fft_cpx){.r = x, .i = x};
}
kiss_fft(cfg, cpx_in, cpx_out);
My output seems really off. When I play a simple sine, there multiple outputs with values way above zero. Also it generally seems like the first entries are way higher. Do I have to weigh the outputs?
I also don't understand how I have to treat the complex numbers, I'm currently using my input values on the real and imaginary part and for the output I use the abs, is that right?
Also usually spectrum analyzers for audio have logarithmic scaling, so I tried that but the problem is that the fft output as far as I know isn't logarithmic, so the first band for example is say 0-100hz but optimally my first LED on the effect should be only up to like 60hz (so a fraction of the first outputs band), while the last LED would be say 8khz to 10khz which would in that case be 20 fft outputs.
Is there any way to make the output logarithmic? How do I limit the spectrum to 20khz (or know what the bands of the output are in general) and is there any other thing to look out for when working with audio signals?

Inverse FFT with CMSIS is wrong

I am attempting to perform an FFT on a signal and use the resulting data to retrieve the original samples via an IFFT. I am using the CMSIS DSP library on an STM32 with a M3.
My issue is understanding the scaling that occurs with the FFT, and also how to get a correct IFFT. Currently the IFFT results in a similar wave as the input, but points are scaled anywhere between 120x-140x of the original. Is this simply the result of precision errors of q15? Am I too scale the IFFT results by 7 bits? My code is below
The documentation also mentions "For the RIFFT, the source buffer must at least have length fftLenReal + 2. The last two elements must be equal to what would be generated by the RFFT: (pSrc[0] - pSrc[1]) >> 1 and 0". What is this for? Applying these operations to FFT_SIZE2 - 2, and FFT_SIZE2 - 1 respectively did not change the results of the IFFT at all.
//128 point FFT
#define FFT_SIZE 128
arm_rfft_instance_q15 fft_instance;
arm_rfft_instance_q15 ifft_instance;
//time domain signal buffers
float32_t sinetbl_in[FFT_SIZE];
float32_t sinetbl_out[FFT_SIZE];
//a copy for comparison after RFFT since function modifies input buffer
volatile q15_t fft_in_buf_cpy[FFT_SIZE];
q15_t fft_in_buf[FFT_SIZE];
//output for FFT, RFFT provides real and complex data organized as re[0], im[0], re[1], im[1]
q15_t fft_out_buf[FFT_SIZE*2];
q15_t fft_out_buf_mag[FFT_SIZE*2];
//inverse fft buffer result
q15_t ifft_out_buf[FFT_SIZE];
//generate 1kHz sinewave with a sample frequency of 8kHz for 128 samples, amplitude is 1
for(int i = 0; i < FFT_SIZE; ++i){
sinetbl_in[i] = arm_sin_f32(2*3.14*1000 *i/8000);
sinetbl_out[i] = 0;
}
//convert buffer to q15 (not enough flash to use f32 fft functions)
arm_float_to_q15(sinetbl_in, fft_in_buf, FFT_SIZE);
memcpy(fft_in_buf_cpy, fft_in_buf, FFT_SIZE*2);
//perform RFFT
arm_rfft_init_q15(&fft_instance, FFT_SIZE, 0, 1);
arm_rfft_q15(&fft_instance, fft_in_buf, fft_out_buf);
//calculate magnitude, skip 1st real and img numbers as they are DC and both real
arm_cmplx_mag_q15(fft_out_buf + 2, fft_out_buf_mag + 1, FFT_SIZE/2-1);
//weird operations described by documentation, does not change results
//fft_out_buf[FFT_SIZE*2 - 2] = (fft_out_buf[0] - fft_out_buf[1]) >> 1;
//fft_out_buf[FFT_SIZE*2 - 1] = 0;
//perform inverse FFT
arm_rfft_init_q15(&ifft_instance, FFT_SIZE, 1, 1);
arm_rfft_q15(&ifft_instance, fft_out_buf, ifft_out_buf);
//closest approximation to get to original scaling
//arm_shift_q15(ifft_out_buf, 7, ifft_out_buf, FFT_SIZE);
//convert back to float for comparison with input
arm_q15_to_float(ifft_out_buf, sinetbl_out, FFT_SIZE);
I feel like I answered my own question with the precision comment, but I'd like to be sure. Am I doing this FFT stuff right?
Thanks in advance
As Cris pointed out some libraries skip the normalization process. CMSIS DSP is one of those libraries as it is intended to be fast. For CMSIS, depending on the FFT size you must left shift your data a certain amount to get back to the original range. In my case with a FFT size of 128 and also the magnitude calculation, it was 7 as I originally surmised.

RMS calculation DC offset

I need to implement an RMS calculations of sine wave in MCU (microcontroller, resource constrained). MCU lacks FPU (floating point unit), so I would prefer to stay in integer realm. Captures are discrete via 10 bit ADC.
Looking for a solution, I've found this great solution here by Edgar Bonet: https://stackoverflow.com/a/28812301/8264292
Seems like it completely suits my needs. But I have some questions.
Input are mains 230 VAC, 50 Hz. It's transformed & offset by hardware means to become 0-1V (peak to peak) sine wave which I can capture with ADC getting 0-1023 readings. Hardware are calibrated so that 260 VRMS (i.e. about -368:+368 peak to peak) input becomes 0-1V peak output. How can I "restore" back original wave RMS value providing I want to stay in integer realm too? Units can vary, mV will do fine also.
My first guess was subtracting 512 from the input sample (DC offset) and later doing this "magic" shift as in Edgar Bonet answer. But I've realized it's wrong because DC offset aren't fixed. Instead it's biased to start from 0V. I.e. 130 VAC input would produce 0-500 mV peak to peak output (not 250-750 mV which would've worked so far).
With real RMS to subtract the DC offset I need to subtract squared sum of samples from the sum of squares. Like in this formula:
So I've ended up with following function:
#define INITIAL 512
#define SAMPLES 1024
#define MAX_V 368UL // Maximum input peak in V ( 260*sqrt(2) )
/* K is defined based on equation, where 64 = 2^6,
* i.e. 6 bits to add to 10-bit ADC to make it 16-bit
* and double it for whole range in -peak to +peak
*/
#define K (MAX_V*64*2)
uint16_t rms_filter(uint16_t sample)
{
static int16_t rms = INITIAL;
static uint32_t sum_squares = 1UL * SAMPLES * INITIAL * INITIAL;
static uint32_t sum = 1UL * SAMPLES * INITIAL;
sum_squares -= sum_squares / SAMPLES;
sum_squares += (uint32_t) sample * sample;
sum -= sum / SAMPLES;
sum += sample;
if (rms == 0) rms = 1; /* do not divide by zero */
rms = (rms + (((sum_squares / SAMPLES) - (sum/SAMPLES)*(sum/SAMPLES)) / rms)) / 2;
return rms;
}
...
// Somewhere in a loop
getSample(&sample);
rms = rms_filter(sample);
...
// After getting at least N samples (SAMPLES * X?)
uint16_t vrms = (uint32_t)(rms*K) >> 16;
printf("Converted Vrms = %d V\r\n", vrms);
Does it looks fine? Or am I doing something wrong like this?
How does SAMPLES (window size?) number relates to F (50Hz) and my ADC capture rate (samples per second)? I.e. how much real samples do I need to feed to rms_filter() before I can get real RMS value providing my capture speed are X sps? At least how to evaluate required minimum N of samples?
I did not test your code, but it looks to me like it should work fine.
Personally, I would not have implemented the function this way. I would
instead have removed the DC part of the signal before trying to
compute the RMS value. The DC part can be estimated by sending the raw
signal through a low pass filter. In pseudo-code this would be
rms = sqrt(low_pass(square(x - low_pass(x))))
whereas what you wrote is basically
rms = sqrt(low_pass(square(x)) - square(low_pass(x)))
It shouldn't really make much of a difference though. The first formula,
however, spares you a multiplication. Also, by removing the DC component
before computing the square, you end up multiplying smaller numbers,
which may help in allocating bits for the fixed-point implementation.
In any case, I recommend you test the filter on your computer with
synthetic data before committing it to the MCU.
How does SAMPLES (window size?) number relates to F (50Hz) and my ADC
capture rate (samples per second)?
The constant SAMPLES controls the cut-off frequency of the low pass
filters. This cut-off should be small enough to almost completely remove
the 50 Hz part of the signal. On the other hand, if the mains
supply is not completely stable, the quantity you are measuring will
slowly vary with time, and you may want your cut-off to be high enough
to capture those variations.
The transfer function of these single-pole low-pass filters is
H(z) = z / (SAMPLES * z + 1 − SAMPLES)
where
z = exp(i 2 π f / f₀),
i is the imaginary unit,
f is the signal frequency and
f₀ is the sampling frequency
If f₀ ≫ f (which is desirable for a good sampling), you can approximate
this by the analog filter:
H(s) = 1/(1 + SAMPLES * s / f₀)
where s = i2πf and the cut-off frequency is f₀/(2π*SAMPLES). The gain
at f = 50 Hz is then
1/sqrt(1 + (2π * SAMPLES * f/f₀)²)
The relevant parameter here is (SAMPLES * f/f₀), which is the number of
periods of the 50 Hz signal that fit inside your sampling window.
If you fit one period, you are letting about 15% of the signal through
the filter. Half as much if you fit two periods, etc.
You could get perfect rejection of the 50 Hz signal if you design a
filter with a notch at that particular frequency. If you don't want
to dig into digital filter design theory, the simplest such filter may
be a simple moving average that averages over a period of exactly
20 ms. This has a non trivial cost in RAM though, as you have to
keep a full 20 ms worth of samples in a circular buffer.

Generate a sine signal with time dependent frequency in C

I want to generate a sine signal with a time dependent frequency that varies periodically between fmin and fmax with a frequency f0 in C. Mathematically, this can be described by
y(t)=1/2*(1+sin(2*Pi*(fmin*t + (fmax-fmin)*1/2*(t - 1/2/Pi/f0*cos(2*Pi*f0*t) + 1/2/Pi/f0))))
Because I want to use this on a microcontroller, I want to reduce the use of floating-point numbers in order to save memory and increase performance.
This is also a reason why I cannot use the standard sine function. To calculate the sine function, I approximate it with a parabola. Therefore, I use the following code:
Input range [0R 2πR] mapped to [0 to 1024].
Output range [-1.0 1.0] mapped to [-512 512].
int_fast16_t sin_bam(int angle_bam) {
angle_bam %= 1024;
if (angle_bam < 0)
angle_bam += 1024;
uint_fast16_t sub_angle = angle_bam % (1024u / 2);
int_fast16_t sn = (uint32_t) (sub_angle*(1024/2 - sub_angle) / 128);
if (angle_bam & 512) {
sn = -sn;
}
return sn;
}
My first approach to implement the signal above was
y(i) = sin_bam(fmin*i+(fmax-fmin)*(i-2/f0*sin_bam((int) (f0*i+1024/4))+1024/f0))+512.
For values of approximately f0 >= 1, this works well, but for smaller values my code seems to break, e.g. for f0 = 0.1 the generated signal becomes very "unsmooth" each time it reaches a frequency minimum.
Here a sample output:
I assume that the problem could be the integer implementation of the sine function as for f0 = 0.1 it reads
sin_bam((int) (0.1*i+1024/4)).
That means, for example for values of i between 0 and 9, sin_bam((int) (0.1*i+1024/4)) delivers the same output.
My first idea to solve this issue was to increase the angular resolution of the sine function, but unfortunately this didn't help.
Is there any logical error in my algorithm or does anybody have a better idea to implement this?

Noise-cancellation with undirected microphones and different amplifiers

I have the following setup https://sketchfab.com/show/7e2912f5f8794a7b96ef3ac5930e090a (It's a 3d viewer, use your mouse to view all angles)
The box has two nondirectional electret microphones(black dots). On the ground there are some elements falling down like water or similar(symbolized by the sphere) and creating noises. On top, someone is speaking in the box. Distances are roughly accurate, so the mouth is pretty close.
Inside the box there are two different amplifiers(but the same electret microphones) with two different amplification circuits(the mouth-one is louder in general and has some normalization circuitry integrated. Long story short, I can record this into a raw audio file with 44100 Hz, 16Bit and Stereo, while the left channel is the upper, the right channel is the lower microphone amplifier output.
Goal is to - even though the electret microphones are not directed and even though there are different amplifiers - subtract the lower microphone(facing the ground) from the upper microphone(facing the speaker) to have noise cancellation.
I tried(With Datei being the raw-filename). This includes a high or low pass filter and a routine to put the final result back into a raw mono file (%s.neu.raw)
The problem is - well - undefinable distortion. I can hear my voice but it's not bearable at all. If you need a sample I can upload one.
EDIT: New code.
static void *substractf( char *Datei)
{
char ergebnis[80];
sprintf(ergebnis,"%s.neu.raw",Datei);
FILE* ausgabe = fopen(ergebnis, "wb");
FILE* f = fopen(Datei, "rb");
if (f == NULL)
return;
double g = 0.1;
double RC = 1.0/(1215*2*3.14);
double dt = 1.0/44100;
double alpha = dt/(RC+dt);
double noise_gain = 18.0;
double voice_gain = 1.0;
struct {
uint8_t noise_lsb;
int8_t noise_msb;
uint8_t voice_lsb;
int8_t voice_msb;
} sample;
while (fread(&sample, sizeof sample, 1, f) == 1)
{
int16_t noise_source = sample.noise_msb * 256 + sample.noise_lsb;
int16_t voice_source = sample.voice_msb * 256 + sample.voice_lsb;
double signal, difference_voice_noise;
difference_voice_noise = voice_gain*voice_source - noise_gain*noise_source;
signal = (1.0 - alpha)*signal + alpha*difference_voice_noise;
putc((char) ( (signed)signal & 0xff),ausgabe);
putc((char) (((signed)signal >> 8) & 0xff),ausgabe);
}
fclose(f);
fclose(ausgabe);
char output[300];
sprintf(output,"rm -frv \"%s\"",Datei);
system(output);
}
Your code doesn't take differences of path length into consideration.
The path difference d2 – d1 between the sound source and the two mics corresponds to a time delay of (d2 – d1) / v, where v is the speed of sound (330 m/s).
Suppose d2 – d1 is equal to 10 cm. In this case, any sound wave whose frequency is a multiple of 3300 Hz (i.e., whose period is a multiple of (0.10/330) seconds) will be at exactly the same phase at both microphones. This is how you want things to be at all frequencies.
However, a sound wave at an odd multiple of half that frequency (1650 Hz, 4950 Hz, 8250 Hz, etc.) will have changed in phase by 180° by the time it reaches the second mic. As a result, your subtraction operation will actually have the opposite effect — you'll be boosting these frequencies instead of making them quieter.
The end result will be similar to what you get if you push all the alternate sliders on a graphic equaliser in opposite directions. This is probably what you're experiencing now.
Try estimating the length of this path difference and delaying the samples in one channel by a corresponding amount. At a sampling rate of 44100 Hz, one centimetre corresponds to about 0.75 samples. If the sound source is moving around, then things get a bit complicated. You'll have to find a way of estimating the path difference dynamically from the audio signals themselves.
Ideas too big for a comment.
1) Looks like OP is filtering the l signal jetzt = vorher + (alpha*(l - vorher)) and then subtracting the r with dif = r - g*jetzt. It seems to make more sense to subtract l and r first and apply that difference to the filter.
float signal = 0.0; (outside loop)
...
float dif;
// Differential (with gain adjustments)
dif = gain_l*l - gain_r*r;
// Low pass filter (I may have this backwards)
signal = (1.0 - alpha)*signal + alpha*dif;
// I am not certain if diff or signal should be written
// but testing limit would be useful.
if ((dif > 32767) || (dif < -32767)) report();
int16_t sig = dif;
// I see no reason for the following test
// if (dif != 0)
putc((char) ( (unsigned)dif & 0xff),ausgabe);
putc((char) (((unsigned)dif >> 8) & 0xff),ausgabe);
2) The byte splicing may be off. Suggested simplification
// This assumes incoming data is little endian,
// Maybe data is in big endian and _that_ is OP problem?
struct {
uint8_t l_lsb;
int8_t l_msb;
uint8_t r_lsb;
int8_t r_msb;
} sample;
...
while (fread(&sample, sizeof sample, 1, f) == 1) {
int16_t left = sample.l_msb * 256 + sample.l_lsb;
int16_t right = sample.r_msb * 256 + sample.r_lsb;
3) Use of float vs. double. Usually the more limiting float creates computational noise, but the magnitude of OP's complaint suggest that this issue is unlikely the problem. Still worth considering.
4) Endian of the 16-bit samples may be backwards. Further, depending on A/D encoding the samples may be 16-bit unsigned rather than 16-bit signed.
5) The phase of the 2 signals could be 180 out from each other due to wiring and mic pick-up. Is so try diff = gain_l*l + gain_r*r.

Resources