Generate a sine signal with time-dependent frequency in C

I want to generate a sine signal with a time dependent frequency that varies periodically between fmin and fmax with a frequency f0 in C. Mathematically, this can be described by
y(t)=1/2*(1+sin(2*Pi*(fmin*t + (fmax-fmin)*1/2*(t - 1/2/Pi/f0*cos(2*Pi*f0*t) + 1/2/Pi/f0))))
Because I want to use this on a microcontroller, I want to reduce the use of floating-point numbers in order to save memory and increase performance.
This is also a reason why I cannot use the standard sine function. To calculate the sine function, I approximate it with a parabola. Therefore, I use the following code:
Input range [0, 2π] mapped to [0, 1024].
Output range [-1.0, 1.0] mapped to [-512, 512].
int_fast16_t sin_bam(int angle_bam) {
    angle_bam %= 1024;
    if (angle_bam < 0)
        angle_bam += 1024;
    uint_fast16_t sub_angle = angle_bam % (1024u / 2);
    /* force the multiply into 32 bits: sub_angle * (512 - sub_angle)
       peaks at 65536, which overflows a 16-bit int on small targets */
    int_fast16_t sn = (int_fast16_t)((uint32_t)sub_angle * (1024 / 2 - sub_angle) / 128);
    if (angle_bam & 512) {
        sn = -sn;
    }
    return sn;
}
My first approach to implement the signal above was
y(i) = sin_bam(fmin*i+(fmax-fmin)*(i-2/f0*sin_bam((int) (f0*i+1024/4))+1024/f0))+512.
For values of approximately f0 >= 1, this works well, but for smaller values my code seems to break, e.g. for f0 = 0.1 the generated signal becomes very "unsmooth" each time it reaches a frequency minimum.
Here is a sample output (plot omitted).
I assume that the problem could be the integer implementation of the sine function as for f0 = 0.1 it reads
sin_bam((int) (0.1*i+1024/4)).
That means, for example for values of i between 0 and 9, sin_bam((int) (0.1*i+1024/4)) delivers the same output.
My first idea to solve this issue was to increase the angular resolution of the sine function, but unfortunately this didn't help.
Is there any logical error in my algorithm or does anybody have a better idea to implement this?
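One common way around the coarse integer argument is to keep the phase in a fixed-point accumulator and only truncate when indexing the sine approximation. The sketch below is not from the original post; the Q10 scaling and the `sin_bam_acc` helper are my own illustration, reusing the question's `sin_bam`:

```c
#include <stdint.h>

/* Parabola approximation from the question, with the multiply forced
   into 32 bits (sub_angle * (512 - sub_angle) peaks at 65536, which
   would overflow a 16-bit int on AVR-class targets). */
static int_fast16_t sin_bam(int angle_bam) {
    angle_bam %= 1024;
    if (angle_bam < 0)
        angle_bam += 1024;
    uint_fast16_t sub_angle = angle_bam % (1024u / 2);
    int_fast16_t sn = (int_fast16_t)((uint32_t)sub_angle * (1024 / 2 - sub_angle) / 128);
    if (angle_bam & 512)
        sn = -sn;
    return sn;
}

/* Keep the phase in Q10 fixed point (angle scaled by 1024) so that a
   fractional rate such as f0 = 0.1 still advances the phase on every
   sample, instead of producing the same truncated angle 10 samples in
   a row. freq_q10 is the per-sample phase increment, also scaled by
   1024 (e.g. 0.1 BAM steps per sample -> freq_q10 = 102). */
static int_fast16_t sin_bam_acc(int32_t *phase_q10, int32_t freq_q10) {
    int_fast16_t s = sin_bam((int)(*phase_q10 >> 10));
    *phase_q10 = (*phase_q10 + freq_q10) & ((1024L << 10) - 1); /* wrap one period */
    return s;
}
```

The point is that truncation happens once per lookup, while the accumulated fractional phase is never thrown away, so slow modulation frequencies stay smooth.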

Related

Inverse FFT with CMSIS is wrong

I am attempting to perform an FFT on a signal and use the resulting data to retrieve the original samples via an IFFT. I am using the CMSIS DSP library on an STM32 with a M3.
My issue is understanding the scaling that occurs with the FFT, and also how to get a correct IFFT. Currently the IFFT results in a similar wave as the input, but the points are scaled to anywhere between 120x and 140x of the original. Is this simply the result of precision errors of q15? Am I to scale the IFFT results by 7 bits? My code is below.
The documentation also mentions "For the RIFFT, the source buffer must at least have length fftLenReal + 2. The last two elements must be equal to what would be generated by the RFFT: (pSrc[0] - pSrc[1]) >> 1 and 0". What is this for? Applying these operations to elements FFT_SIZE*2 - 2 and FFT_SIZE*2 - 1 respectively did not change the results of the IFFT at all.
//128 point FFT
#define FFT_SIZE 128
arm_rfft_instance_q15 fft_instance;
arm_rfft_instance_q15 ifft_instance;
//time domain signal buffers
float32_t sinetbl_in[FFT_SIZE];
float32_t sinetbl_out[FFT_SIZE];
//a copy for comparison after RFFT since function modifies input buffer
volatile q15_t fft_in_buf_cpy[FFT_SIZE];
q15_t fft_in_buf[FFT_SIZE];
//output for FFT, RFFT provides real and complex data organized as re[0], im[0], re[1], im[1]
q15_t fft_out_buf[FFT_SIZE*2];
q15_t fft_out_buf_mag[FFT_SIZE*2];
//inverse fft buffer result
q15_t ifft_out_buf[FFT_SIZE];
//generate 1kHz sinewave with a sample frequency of 8kHz for 128 samples, amplitude is 1
for(int i = 0; i < FFT_SIZE; ++i){
    sinetbl_in[i] = arm_sin_f32(2*3.14*1000 *i/8000);
    sinetbl_out[i] = 0;
}
//convert buffer to q15 (not enough flash to use f32 fft functions)
arm_float_to_q15(sinetbl_in, fft_in_buf, FFT_SIZE);
memcpy(fft_in_buf_cpy, fft_in_buf, FFT_SIZE*2);
//perform RFFT
arm_rfft_init_q15(&fft_instance, FFT_SIZE, 0, 1);
arm_rfft_q15(&fft_instance, fft_in_buf, fft_out_buf);
//calculate magnitude, skip 1st real and img numbers as they are DC and both real
arm_cmplx_mag_q15(fft_out_buf + 2, fft_out_buf_mag + 1, FFT_SIZE/2-1);
//weird operations described by documentation, does not change results
//fft_out_buf[FFT_SIZE*2 - 2] = (fft_out_buf[0] - fft_out_buf[1]) >> 1;
//fft_out_buf[FFT_SIZE*2 - 1] = 0;
//perform inverse FFT
arm_rfft_init_q15(&ifft_instance, FFT_SIZE, 1, 1);
arm_rfft_q15(&ifft_instance, fft_out_buf, ifft_out_buf);
//closest approximation to get to original scaling
//arm_shift_q15(ifft_out_buf, 7, ifft_out_buf, FFT_SIZE);
//convert back to float for comparison with input
arm_q15_to_float(ifft_out_buf, sinetbl_out, FFT_SIZE);
I feel like I answered my own question with the precision comment, but I'd like to be sure. Am I doing this FFT stuff right?
Thanks in advance
As Cris pointed out some libraries skip the normalization process. CMSIS DSP is one of those libraries as it is intended to be fast. For CMSIS, depending on the FFT size you must left shift your data a certain amount to get back to the original range. In my case with a FFT size of 128 and also the magnitude calculation, it was 7 as I originally surmised.
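For reference, the rescaling described above can be sketched in plain C. This mimics what a saturating left shift such as CMSIS's arm_shift_q15 does; the helper name here is mine, not a CMSIS API:

```c
#include <stdint.h>

typedef int16_t q15_t;

/* The q15 RFFT halves the data at each of its log2(N) stages to avoid
   overflow, so for N = 128 the output sits 7 bits low. Restoring the
   original range is a saturating left shift by 7. */
static void shift_q15_sat(const q15_t *src, int8_t bits, q15_t *dst, uint32_t n) {
    for (uint32_t i = 0; i < n; i++) {
        int32_t v = (int32_t)src[i] << bits;
        if (v > 32767)  v = 32767;    /* saturate to the q15 range */
        if (v < -32768) v = -32768;
        dst[i] = (q15_t)v;
    }
}
```

The saturation matters: any value whose restored magnitude would exceed the q15 range clamps rather than wraps, which is the same behavior arm_shift_q15 documents.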

Phase angle from FFT using atan2 - weird behaviour. Phase shift offset? Unwrapping?

I'm testing and performing simple FFT's and I'm interested in phase shift.
I generate a simple array of 256 samples containing a sinusoid with 10 cycles.
I perform an FFT of those samples and receiving complex data (2x128).
Then I calculate the magnitude of those data, and the FFT looks as expected:
Then I want to calculate phase shift from fft complex output. I'm using atan2.
Combined output fft_magnitude (blue) + fft+phase (red) looks like this:
This is pretty much what I expect, with a "small" problem. I know this is wrapping, but if I imagine unwrapping it, the phase at the magnitude peak reads 36 degrees, and I think it should be 0 because my input sinusoid was not shifted at all.
If I shift this by -36 deg (blue is in-phase, red is shifted, blue is printed only for reference) the sinusoid looks like this:
And than if I perform an FFT of this red data the magnitude + phase output looks like this:
So it is easy to imagine that the unwrapped phase will be close to 0 at the magnitude peak.
So there is a 36 deg offset. But what happens if I prepare data with 20 cycles per 256 samples and 0 phase shift?
If I then perform an FFT, this is the output (magnitude + phase):
And I can tell you it will cross the peak point at 72 degrees. So there is now a 72 degree offset.
Can anyone give me a hint why is that happening?
Is it right that the atan2() phase output is frequency dependent, with an offset of 2π/cycles (360 deg/cycles)?
How do I unwrap it and get correct results? (I couldn't find a working C library to unwrap.)
This is running on ARM Cortex-M7 processor (embedded).
#define phaseShift 0
#define cycles 20
#include <arm_math.h>
#include <arm_const_structs.h>
float32_t phi = phaseShift * PI / 180; //phase shift in radians
float32_t data[256]; //input data for fft
float32_t output_buffer[256]; //output buffer from fft
float32_t phase_data[128]; //will contain atan2 values of output from fft (output values are complex)
float32_t magnitude[128]; //will contain absolute values of output from fft (output values are complex)
float32_t incrFactorRadians = cycles * 2 * PI / 255;
arm_rfft_fast_instance_f32 RealFFT_Instance;
void setup()
{
    Serial.begin(115200);
    delay(2000);
    arm_rfft_fast_init_f32(&RealFFT_Instance, 256); //initializing fft to be ready for 256 samples
    for (int i = 0; i < 256; i++) //print sinusoids
    {
        data[i] = arm_sin_f32(incrFactorRadians * i + phi);
        Serial.print(arm_sin_f32(incrFactorRadians * i), 8); Serial.print(","); Serial.print(data[i], 8); Serial.print("\n"); //print reference in-phase sinusoid and shifted sinusoid (data for fft)
    }
    Serial.print("\n\n");
    delay(10000);
    arm_rfft_fast_f32(&RealFFT_Instance, data, output_buffer, 0); //perform fft
    for (int i = 0; i < 128; i++) //calculate absolute values of an fft output (fft output is complex), and phase shift
    {
        magnitude[i] = output_buffer[i * 2] * output_buffer[i * 2] + output_buffer[(i * 2) + 1] * output_buffer[(i * 2) + 1];
        __ASM("VSQRT.F32 %0,%1" : "=t"(magnitude[i]) : "t"(magnitude[i])); //fast square root ARM DSP function
        phase_data[i] = atan2(output_buffer[i * 2], output_buffer[i * 2 + 1]) * 180 / PI;
    }
}
void loop() //print magnitude of fft and phase output every 10 seconds
{
    for (int i = 0; i < 128; i++)
    {
        Serial.print(magnitude[i], 8); Serial.print(","); Serial.print(phase_data[i], 8); Serial.print("\n");
    }
    Serial.print("\n\n");
    delay(10000);
}
To break down the excellent answer by hotpaw2. (Their answers are always so loaded with golden nuggets of information that I spend days learning enough to comprehend the brilliance.)
When an engineer says "integer periodic" they mean the samples you are feeding into the FFT (the aperture) need to be taken in a way that ensures you capture a whole number of full waves of the frequency of interest.
Think of the sine wave starting at zero, cresting at one, then falling below zero into the trough at negative one, and then coming back up to zero.
This is one "full cycle". Now if your wave has a frequency of 10 cycles per second and you sample at 100 samples per second, you will have 10 samples per wave.
So now you put 13 samples into an FFT and your phase is off. Why?
Well, the phase calculation expects the wave to continue smoothly forever. You started at zero for the first sample and dropped off at 0.25 on the 13th sample. Now the phase calculation tries to connect the two ends and gets this jump in the wave. This causes the phase to come out wrong.
What you need to do is select a number of samples to feed into your FFT that you know will contain full waves only.
(NOTE) You are only concerned with the phase of one freq at a time.
AND your sample aperture must not start and end at the same point of the sine wave.
IF you start at zero and end at zero, the calculation pasting the two ends together into a forever circle will get two zeros at the transition. So you have to stop one sample short of the repeat point.
Code demonstrating this can be found: Scipy FFT - how to get phase angle
A bare FFT plus an atan2() only correctly measures the starting phase of an input sinusoid if that sinusoid is exactly integer periodic in the FFT's aperture width.
If the signal is not exactly integer periodic (some other frequency), then you have to recenter the data by doing an FFTshift (rotate the data by N/2) before the FFT. The FFT will then correctly measure the phase at the center of the original data, and away from the circular discontinuity produced by the FFT's finite length rectangular window on non-periodic-in-aperture signals.
If you want the phase at some point in the data other than the center, you can use the estimate of the frequency and phase at the center to recalculate the phase at other positions.
There are other window functions (Blackman-Nuttall, et al.) that might produce a better phase estimate than a rectangular window, but usually not as good an estimate as using an FFTShift.
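A minimal sketch of the pre-FFT rotation described above: for even N, rotating the buffer by N/2 is just swapping its two halves (the function name is illustrative, not from a library):

```c
#include <stddef.h>

/* Rotate the time-domain buffer by N/2 before the FFT so the measured
   phase refers to the centre of the original data, away from the
   circular discontinuity at the window edges. */
static void fftshift_f32(float *x, size_t n) {
    for (size_t i = 0; i < n / 2; i++) {
        float t = x[i];
        x[i] = x[i + n / 2];
        x[i + n / 2] = t;
    }
}
```

Calling this on `data` before `arm_rfft_fast_f32` makes the atan2() result refer to the centre of the aperture, per the answer above.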

RMS calculation DC offset

I need to implement an RMS calculations of sine wave in MCU (microcontroller, resource constrained). MCU lacks FPU (floating point unit), so I would prefer to stay in integer realm. Captures are discrete via 10 bit ADC.
Looking for a solution, I've found this great solution here by Edgar Bonet: https://stackoverflow.com/a/28812301/8264292
Seems like it completely suits my needs. But I have some questions.
The input is 230 VAC mains at 50 Hz. It is transformed and offset by hardware means into a 0-1 V (peak-to-peak) sine wave, which I can capture with the ADC, getting 0-1023 readings. The hardware is calibrated so that a 260 VRMS input (i.e. about -368 to +368 V peak) becomes a 0-1 V peak-to-peak output. How can I "restore" the original wave's RMS value, given that I want to stay in the integer realm too? Units can vary; mV will do fine also.
My first guess was subtracting 512 from the input sample (DC offset) and later doing the "magic" shift as in Edgar Bonet's answer. But I've realized that is wrong because the DC offset isn't fixed. Instead, the signal is biased to start from 0 V; i.e. a 130 VAC input would produce a 0-500 mV peak-to-peak output (not 250-750 mV, which would have worked so far).
For true RMS with the DC offset removed, I need to subtract the square of the mean from the mean of the squares, as in this formula:
rms = sqrt( sum(x^2)/N - (sum(x)/N)^2 )
So I've ended up with following function:
#define INITIAL 512
#define SAMPLES 1024
#define MAX_V 368UL // Maximum input peak in V ( 260*sqrt(2) )
/* K is defined based on equation, where 64 = 2^6,
 * i.e. 6 bits to add to 10-bit ADC to make it 16-bit
 * and double it for whole range in -peak to +peak
 */
#define K (MAX_V*64*2)

uint16_t rms_filter(uint16_t sample)
{
    static int16_t rms = INITIAL;
    static uint32_t sum_squares = 1UL * SAMPLES * INITIAL * INITIAL;
    static uint32_t sum = 1UL * SAMPLES * INITIAL;
    sum_squares -= sum_squares / SAMPLES;
    sum_squares += (uint32_t) sample * sample;
    sum -= sum / SAMPLES;
    sum += sample;
    if (rms == 0) rms = 1; /* do not divide by zero */
    rms = (rms + (((sum_squares / SAMPLES) - (sum/SAMPLES)*(sum/SAMPLES)) / rms)) / 2;
    return rms;
}

...
// Somewhere in a loop
getSample(&sample);
rms = rms_filter(sample);
...
// After getting at least N samples (SAMPLES * X?)
uint16_t vrms = (uint32_t)(rms*K) >> 16;
printf("Converted Vrms = %d V\r\n", vrms);
Does it look fine? Or am I doing something wrong here?
How does the SAMPLES (window size?) number relate to F (50 Hz) and my ADC capture rate (samples per second)? I.e. how many real samples do I need to feed to rms_filter() before I can get a real RMS value, given that my capture speed is X sps? At the least, how do I evaluate the required minimum N of samples?
I did not test your code, but it looks to me like it should work fine.
Personally, I would not have implemented the function this way. I would
instead have removed the DC part of the signal before trying to
compute the RMS value. The DC part can be estimated by sending the raw
signal through a low pass filter. In pseudo-code this would be
rms = sqrt(low_pass(square(x - low_pass(x))))
whereas what you wrote is basically
rms = sqrt(low_pass(square(x)) - square(low_pass(x)))
It shouldn't really make much of a difference though. The first formula,
however, spares you a multiplication. Also, by removing the DC component
before computing the square, you end up multiplying smaller numbers,
which may help in allocating bits for the fixed-point implementation.
In any case, I recommend you test the filter on your computer with
synthetic data before committing it to the MCU.
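As an illustration of the first formula, here is an integer sketch of rms = sqrt(low_pass(square(x - low_pass(x)))) in the same exponential-filter style as the original answer. The function name and constants are mine, and this is untested on real hardware:

```c
#include <stdint.h>

#define SAMPLES 1024   /* filter window, as in the question */

/* Remove the DC estimate first, then low-pass the squared AC part and
   take one Newton step per sample toward the square root. The AC term
   stays below +/-512 for a 10-bit ADC, so ac*ac fits comfortably and
   SAMPLES * ac^2 still fits in a uint32_t. */
static uint16_t rms_filter_ac(uint16_t sample) {
    static uint32_t sum = 512UL * SAMPLES;  /* low-pass of x, scaled by SAMPLES */
    static uint32_t sum_sq;                 /* low-pass of ac^2, scaled by SAMPLES */
    static uint16_t rms = 1;

    sum -= sum / SAMPLES;
    sum += sample;
    int32_t ac = (int32_t)sample - (int32_t)(sum / SAMPLES);  /* DC removed */

    sum_sq -= sum_sq / SAMPLES;
    sum_sq += (uint32_t)(ac * ac);

    uint32_t mean_sq = sum_sq / SAMPLES;
    rms = (uint16_t)((rms + mean_sq / rms) / 2);  /* Newton step toward sqrt */
    if (rms == 0)
        rms = 1;
    return rms;
}
```

As the answer notes, squaring the small AC residue instead of the full 10-bit sample is what buys the extra headroom in fixed point.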
How does SAMPLES (window size?) number relates to F (50Hz) and my ADC
capture rate (samples per second)?
The constant SAMPLES controls the cut-off frequency of the low pass
filters. This cut-off should be small enough to almost completely remove
the 50 Hz part of the signal. On the other hand, if the mains
supply is not completely stable, the quantity you are measuring will
slowly vary with time, and you may want your cut-off to be high enough
to capture those variations.
The transfer function of these single-pole low-pass filters is
H(z) = z / (SAMPLES * z + 1 − SAMPLES)
where
z = exp(i 2 π f / f₀),
i is the imaginary unit,
f is the signal frequency and
f₀ is the sampling frequency
If f₀ ≫ f (which is desirable for a good sampling), you can approximate
this by the analog filter:
H(s) = 1/(1 + SAMPLES * s / f₀)
where s = i2πf and the cut-off frequency is f₀/(2π*SAMPLES). The gain
at f = 50 Hz is then
1/sqrt(1 + (2π * SAMPLES * f/f₀)²)
The relevant parameter here is (SAMPLES * f/f₀), which is the number of
periods of the 50 Hz signal that fit inside your sampling window.
If you fit one period, you are letting about 15% of the signal through
the filter. Half as much if you fit two periods, etc.
You could get perfect rejection of the 50 Hz signal if you design a
filter with a notch at that particular frequency. If you don't want
to dig into digital filter design theory, the simplest such filter may
be a simple moving average that averages over a period of exactly
20 ms. This has a non-trivial cost in RAM though, as you have to
keep a full 20 ms worth of samples in a circular buffer.
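A sketch of that moving-average notch; the 1 kHz sample rate, and hence the 20-sample buffer, is an assumption for illustration:

```c
#include <stdint.h>

/* Averaging over exactly one 20 ms mains period nulls 50 Hz and all of
   its harmonics. At an assumed 1 kHz sample rate that is 20 samples,
   held in the circular buffer the answer warns about. */
#define PERIOD_SAMPLES 20

static uint16_t mains_period_average(uint16_t sample) {
    static uint16_t buf[PERIOD_SAMPLES];
    static uint32_t sum;
    static unsigned idx;

    sum -= buf[idx];          /* drop the sample from one period ago */
    buf[idx] = sample;
    sum += sample;
    idx = (idx + 1) % PERIOD_SAMPLES;
    return (uint16_t)(sum / PERIOD_SAMPLES);
}
```

If the ADC rate is not an exact multiple of 50 Hz the notch lands slightly off frequency, so in practice the sample rate is usually chosen to make the period an integer number of samples.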

Need an algorithm to detect large spikes in oscillating data

I am parsing through data on an SD card one large chunk at a time on a microcontroller. It is accelerometer data, so it is oscillating constantly. At certain points huge oscillations occur (shown in the graph). I need an algorithm to detect these large oscillations and, moreover, determine the range of data that contains the spike.
I have some example data:
This is the overall graph, there is only one spike of interest, the first one.
Here it is zoomed in a bit
As you can see it is a large oscillation that produces the spike.
So any algorithm that can scan through a data set and determine the portion of data that contains a spike relative to some threshold would be great. This data set is about 50,000 samples, each sample is 32 bits long. I have ample RAM to be able to hold this much data.
Thanks!
For the following signal:
If you take the absolute value of the differential between two consecutive samples, you get:
That is not quite good enough to unambiguously distinguish the spike from the minor "unsustained" disturbances. But it is if you then take a simple moving sum (a leaky integrator) of the abs-differentials. Here a window width of 4 diff-samples was used:
The moving average introduces a lag or phase shift, which in cases where the data is stored and processing is not real-time can easily be compensated for by subtracting half the window width from the timing:
For real-time processing if the lag is critical a more sophisticated IIR filter might be appropriate. Anyhow a clear threshold can be selected from this data.
In code for the above data set:
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

static int32_t dataset[] = { 0,0,0,0,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
                             0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,3,0,0,0,0,0,
                             0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
                             0,-10,-15,-5,20,25,50,-10,-20,-30,0,30,5,-5,
                             0,0,5,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
                             0,0,0,0,0,0,0,0,0,1,0,0,0,0,6,0,0,0,0,0,0,0 } ;

#define DATA_LEN (sizeof(dataset)/sizeof(*dataset))
#define WINDOW_WIDTH 4
#define THRESHOLD 15

int main()
{
    uint32_t window[WINDOW_WIDTH] = {0} ;
    int window_index = 0 ;
    int window_sum = 0 ;
    bool spike = false ;

    for( int s = 1; s < DATA_LEN ; s++ )
    {
        uint32_t diff = abs( dataset[s] - dataset[s-1] ) ;
        window_sum -= window[window_index] ;
        window[window_index] = diff ;
        window_index++ ;
        window_index %= WINDOW_WIDTH ;
        window_sum += diff ;

        if( !spike && window_sum >= THRESHOLD )
        {
            spike = true ;
            printf( "Spike START # %d\n", s - WINDOW_WIDTH / 2 ) ;
        }
        else if( spike && window_sum < THRESHOLD )
        {
            spike = false ;
            printf( "Spike END # %d\n", s - WINDOW_WIDTH / 2 ) ;
        }
    }
    return 0;
}
The output is:
Spike START # 66
Spike END # 82
https://onlinegdb.com/ryEw69jJH
Comparing the original data with the detection threshold gives:
For your real data, you will need to select a suitable window width and threshold to get the desired result, both of which will depend on the bandwidth and amplitude of the disturbances you wish to detect.
Also you may need to guard against arithmetic overflow if your samples are of sufficient magnitude. They need to be less than 2^32 / window-width to guarantee no overflow in the integrator. Alternatively you could use floating-point or uint64_t for the window type, or add code to deal with saturation.
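One way to make that bound explicit at build time (the sample-range value here is an assumption for the sketch):

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_WIDTH 4
/* Assumed maximum sample magnitude for this sketch. */
#define MAX_ABS_SAMPLE (INT32_C(1) << 24)

/* Each abs-differential can be up to 2 * MAX_ABS_SAMPLE, and
   WINDOW_WIDTH of them are summed into a uint32_t, so the worst case
   must fit; a compile-time check catches a bad configuration early. */
static_assert((uint64_t)2 * MAX_ABS_SAMPLE * WINDOW_WIDTH <= UINT32_MAX,
              "moving sum of abs-differentials can overflow uint32_t");
```

(static_assert via <assert.h> needs C11 or later; older compilers can use a run-time assert instead.)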
You could look at statistical analysis: calculate the standard deviation over the data set and then check when your data goes out of bounds.
You can choose to do this in two ways: either you use a running average over a fixed (relatively small) number of samples, or you take the average over the whole data set. As I see multiple spikes in your set, I would suggest the first. This way you can possibly stop processing (and later continue) every time you find a spike.
For your purpose you do not really need to calculate the standard deviation sigma. You could actually leave it at the square of sigma. This gives you a slight performance optimization by not having to calculate the square root.
Some pseudo code:
// The data set.
int x[N];
// The number of samples in your mean and std calculation.
int M <= N;
// Sigma at index i over the previous M samples.
int sigma_i = sqrt( sum( pow(x[i] - mean(x,M), 2) ) / M );
// Or the square of sigma:
int sigma_squared_i = sum( pow(x[i] - mean(x,M), 2) ) / M;
The disadvantage of this method is that you need to set a threshold for the value of sigma at which you trigger. However, it is very safe to say that when setting the threshold at 4 or 5 times your average sigma you will have a usable system.
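An integer sketch of that idea, using exponential moving averages in place of a true windowed mean (window length, threshold multiple, and the function name are all illustrative):

```c
#include <stdint.h>

#define M 32          /* effective averaging window, illustrative */
#define K_SQUARED 25  /* trigger at 5 sigma, compared as 25 * sigma^2 */

/* Track running estimates of mean(x) and mean(x^2) with exponential
   moving averages, and flag a sample whose squared deviation from the
   mean exceeds K_SQUARED times the running variance. Working with
   sigma^2 avoids the square root, as suggested above. */
static int is_spike(int32_t x) {
    static int64_t mean_num;     /* M * running mean of x */
    static int64_t mean_sq_num;  /* M * running mean of x^2 */

    mean_num += x - mean_num / M;
    mean_sq_num += (int64_t)x * x - mean_sq_num / M;

    int64_t mean = mean_num / M;
    int64_t var  = mean_sq_num / M - mean * mean;
    if (var < 1)
        var = 1;                 /* keep the comparison well defined */

    int64_t dev = x - mean;
    return dev * dev > (int64_t)K_SQUARED * var;
}
```

Note that a large spike inflates the running variance for a while after it passes, which is one reason the windowed (rather than whole-set) average suggested above is preferable.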
Managed to get a working algorithm. Basically: determine the average difference between data points. If the data starts to exceed some multiple of that value for several consecutive points, then most likely a spike is occurring.

Division in C returns 0... sometimes

I'm trying to make a simple RPM meter using an ATmega328.
I have an encoder on the motor which gives 306 interrupts per wheel rotation (the motor encoder has 3 spokes which interrupt on both rising and falling edges, and the motor is geared 51:1, so 6 transitions * 51 = 306 interrupts per wheel rotation). I am using a timer interrupting every 1 ms; however, the interrupt is set to recalculate only every 1 second.
There seems to be 2 problems.
1) The RPM never goes below 60; instead it's either 0 or RPM >= 60
2) Reducing the time interval causes it to always be 0 (as far as I can tell)
Here is the code
int main(void){
    while(1){
        int temprpm = leftRPM;
        printf("Revs: %d \n", temprpm);
        _delay_ms(50);
    };
    return 0;
}

ISR (INT0_vect){
    ticksM1++;
}

ISR(TIMER0_COMPA_vect){
    counter++;
    if(counter == 1000){
        int tempticks = ticksM1;
        leftRPM = ((tempticks - lastM1)/306)*1*60;
        lastM1 = tempticks;
        counter = 0;
    }
}
Anything that is not declared in that code is declared globally and as an int, ticksM1 is also volatile.
The macros are AVR macros for the interrupts.
The multiplication by 1 in leftRPM represents the time factor; ideally I want to use 1 ms without the if statement, in which case the 1 would become 1000.
For a speed between 60 and 120 RPM the result of ((tempticks - lastM1)/306) will be 1, and below 60 RPM it will be zero, so your output will always be a multiple of 60.
The first improvement I would suggest is not to perform expensive arithmetic in the ISR. It is unnecessary - store the speed in raw counts-per-second, and convert to RPM only for display.
Second, perform the multiply before the divide to avoid unnecessarily discarding information. Then for example at 60 RPM (306 CPS) you have (306 * 60) / 306 == 60. Even as low as 1 RPM you get (6 * 60) / 306 == 1. In fact it gives you a potential resolution of approximately 0.2 RPM as opposed to 60 RPM! To allow the parameters to be easily maintained, I recommend using symbolic constants rather than magic numbers.
#define ENCODER_COUNTS_PER_REV 306
#define MILLISEC_PER_SAMPLE 1000
#define SAMPLES_PER_MINUTE ((60 * 1000) / MILLISEC_PER_SAMPLE)

ISR(TIMER0_COMPA_vect){
    counter++;
    if(counter == MILLISEC_PER_SAMPLE)
    {
        int tempticks = ticksM1;
        leftCPS = tempticks - lastM1 ;
        lastM1 = tempticks;
        counter = 0;
    }
}
Then in main():
int temprpm = (leftCPS * SAMPLES_PER_MINUTE) / ENCODER_COUNTS_PER_REV ;
If you want better that 1RPM resolution you might consider
int temprpm_x10 = (leftCPS * SAMPLES_PER_MINUTE) / (ENCODER_COUNTS_PER_REV / 10) ;
then displaying:
printf( "%d.%d", temprpm / 10, temprpm % 10 ) ;
Given the potential resolution of 0.2 rpm by this method, higher resolution display is unnecessary, though you could use a moving-average to improve resolution at the expense of some "display-lag".
Alternatively now that the calculation of RPM is no longer in the ISR you might afford a floating point operation:
float temprpm = ((float)leftCPS * (float)SAMPLES_PER_MINUTE ) / (float)ENCODER_COUNTS_PER_REV ;
printf( "%f", temprpm ) ;
Another potential issue is that ticksM1++, tempticks = ticksM1, and the reading of leftRPM (or leftCPS in my solution) are not atomic operations, and can result in an incorrect value being read if interrupt nesting is supported (and, for the access from outside the interrupt context, even if it is not). If the maximum rate will be less than 256 CPS (42 RPM) then you might get away with an atomic 8-bit counter; you can alternatively reduce your sampling period to ensure the count is always less than 256. Failing that, the simplest solution is to disable interrupts while reading or updating non-atomic variables shared between interrupt and thread contexts.
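The disable-interrupts approach from the last paragraph looks like this on AVR; cli()/sei() are the real avr-libc macros from <avr/interrupt.h>, and the stubs here just let the sketch compile off-target:

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/interrupt.h>   /* real cli()/sei() */
#else
static void cli(void) {}     /* stubs so the sketch builds off-target */
static void sei(void) {}
#endif

static volatile uint16_t ticksM1;

/* On an 8-bit AVR a 16-bit load takes two instructions, so an INT0
   interrupt landing between them could yield a torn value. Briefly
   masking interrupts makes the copy atomic. */
static uint16_t read_ticks_atomic(void) {
    cli();
    uint16_t t = ticksM1;
    sei();
    return t;
}
```

If interrupts might already be disabled at the call site, avr-libc's ATOMIC_BLOCK(ATOMIC_RESTORESTATE) from <util/atomic.h> does the same job while restoring the previous interrupt state instead of unconditionally re-enabling.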
It's integer division. You would probably get better results with something like this:
leftRPM = ((tempticks - lastM1)/6);
