My goal is to compare two u_long timestamps delivered by a GPS device. A long integer like 16290212 has the following structure:
hhmmssµµ
The following code snippet shows one way to parse such a long integer into an integer array, but I don't think it is very efficient. What would be the quickest way to compare two timestamps? I would love to use a UNIX timestamp, but that is not possible in this context.
u_long timestamp_old = 16290212;
u_long base = 1000000;
/* arr[0]: hours
* arr[1]: minutes
* arr[2]: seconds
* arr[3]: milliseconds */
int arr[4];
int i=0;
// parse timestamp_old
while(base >= 1)
{
arr[i++] = (timestamp_old / base);
timestamp_old = (timestamp_old % base);
base /= 100;
}
Your timestamps are u_longs; compare them the same way you compare any 2 u_longs, something like <.
What would be the quickest way to compare two timestamps?
I want to check if the difference is smaller than 400 ms
Perhaps not the fastest yet at least a starting point with a quick worst case. Note that ssµµµ is the same as µµµµµ.
int32_t GPS_to_ms(u_long timestamp) {
int32_t ms = timestamp%100000;
int32_t hhmm = timestamp / 100000;
ms += (hhmm%100)*60*1000;
int32_t hh = hhmm / 100;
ms += hh*60*60*1000;
return ms;
}
if (GPS_to_ms(timestamp_later) - GPS_to_ms(timestamp_first) < 400) {
// timestamps are in close succession.
}
To speed things up on average, 1) assume timestamp_later >= timestamp_first
is usually true 2) timestamps typically have the same hhmm
bool GPS_compare_400(u_long first, u_long later) {
int32_t ms1 = first%100000;
int32_t hhmm1 = first/100000;
int32_t ms2 = later%100000;
int32_t hhmm2 = later/100000;
if (hhmm1 == hhmm2) {
return ms2 - ms1 < 400;
}
return GPS_to_ms(later) - GPS_to_ms(first) < 400;
}
I assume the input is a string gpstime of the form "hhmmssµµµ". I assume that trailing zeros are always present such that there are always exactly three digits for the microseconds part.
1.
int h, m, s, us;
double t;
if(sscanf(gpstime, "%2d%2d%2d%3d", &h, &m, &s, &us) == 4)
t = (h * 60L + m) * 60 + s + us/1000.;
else {
/* parse error */
t = 0;
}
If that's not efficient enough, here's something down-and-dirty, eschewing scanf:
2.
#define Ctod(c) ((c) - '0')
int h = 10 * Ctod(gpstime[0]) + Ctod(gpstime[1]);
int m = 10 * Ctod(gpstime[2]) + Ctod(gpstime[3]);
int s = 10 * Ctod(gpstime[4]) + Ctod(gpstime[5]);
int us = 100 * Ctod(gpstime[6]) + 10 * Ctod(gpstime[7]) + Ctod(gpstime[8]);
double t = (h * 60L + m) * 60 + s + us/1000.;
In either case, once you've got your t values, just subtract and compare as usual.
If you don't want to use floating point, change t to long int, change the scaling factors as appropriate, and compare your final difference to 400 instead of 0.4.
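The integer variant described above might look like the following minimal sketch. The names gps_str_to_ms and gps_within_400ms are made up for illustration, and the code assumes the nine-character "hhmmssµµµ" layout with plain ASCII digits and both timestamps falling on the same day.

```c
#include <stddef.h>

/* Hypothetical helper: convert "hhmmssfff" (fff = three fractional digits,
 * an assumed layout) to milliseconds since midnight. Returns -1 when the
 * string is not exactly nine ASCII digits. */
static long gps_str_to_ms(const char *s)
{
    int d[9];
    size_t i;

    if (s == NULL)
        return -1;
    for (i = 0; i < 9; i++) {
        /* a premature '\0' also fails this test, so we never read past the end */
        if (s[i] < '0' || s[i] > '9')
            return -1;
        d[i] = s[i] - '0';
    }
    if (s[9] != '\0')
        return -1;

    long h   = 10 * d[0] + d[1];
    long m   = 10 * d[2] + d[3];
    long sec = 10 * d[4] + d[5];
    long ms  = 100 * d[6] + 10 * d[7] + d[8];
    return ((h * 60 + m) * 60 + sec) * 1000 + ms;
}

/* Nonzero when `later` follows `first` by less than 400 ms
 * (assumption: no midnight rollover between the two). */
static int gps_within_400ms(const char *first, const char *later)
{
    long a = gps_str_to_ms(first);
    long b = gps_str_to_ms(later);

    if (a < 0 || b < 0)
        return 0;
    return b >= a && b - a < 400;
}
```

Because everything stays in integer milliseconds, the final comparison is against 400 rather than 0.4.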
(But with all of this said, I wouldn't worry about efficiency too much. 10 Hz may sound fast to a pitiful human, but a tenth of a second is a hell of a long time for a decent computer.)
I have an array of length n filled with raw 16-bit (int16) PCM data at a 44100 Hz sample rate, in stereo, so the array holds the left channel in the first 2 bytes, then the right channel, and so on. I tried to implement a simple low-pass filter by converting the samples to floating point in the range -1 to 1. The low pass works, but rounding errors cause little pops in the sound.
Right now I simply do this:
INT32 left_id = 0;
INT32 right_id = 1;
DOUBLE filtered_l_db = 0.0;
DOUBLE filtered_r_db = 0.0;
DOUBLE last_filtered_left = 0;
DOUBLE last_filtered_right = 0;
DOUBLE l_db = 0.0;
DOUBLE r_db = 0.0;
DOUBLE low_filter = filter_freq(core->audio->low_pass_cut);
for(UINT32 a = 0; a < (buffer_size/2);++a)
{
l_db = ((DOUBLE)input_buffer[left_id]) / (DOUBLE)32768;
r_db = ((DOUBLE)input_buffer[right_id]) / (DOUBLE)32768;
///////////////LOW PASS
filtered_l_db = last_filtered_left +
(low_filter * (l_db -last_filtered_left ));
filtered_r_db = last_filtered_right +
(low_filter * (r_db - last_filtered_right));
last_filtered_left = filtered_l_db;
last_filtered_right = filtered_r_db;
INT16 l = (INT16)(filtered_l_db * (DOUBLE)32768);
INT16 r = (INT16)(filtered_r_db * (DOUBLE)32768);
output_buffer[left_id] = (output_buffer[left_id] + l);
output_buffer[right_id] = (output_buffer[right_id] + r);
left_id +=2;
right_id +=2;
}
PS: the input buffer is an int16 array with the PCM data from -32767 to 32767.
I found the following function in "Low Pass filter in C", and it was the only one I could understand:
DOUBLE filter_freq(DOUBLE cut_freq)
{
DOUBLE a = 1.0/(cut_freq * 2 * PI);
DOUBLE b = 1.0/SAMPLE_RATE;
return b/(a+b);
}
My aim is instead to keep full precision on the waveform and to low-pass directly using only integers, at the cost of losing resolution in the filter (which I'm OK with). I have seen a lot of examples but didn't really understand any of them. Would someone be kind enough to explain how this is done in the simplest possible terms (in code or pseudocode)? Thank you.
Assuming the result of function filter_freq can be written as a fraction m/n your filter calculation basically is
y_new = y_old + (m/n) * (x - y_old);
which can be transformed to
y_new = ((n * y_old) + m * (x - y_old)) / n;
The integer division / n truncates the result towards 0. If you want rounding instead of truncation you can implement it as
y_tmp = ((n * y_old) + m * (x - y_old));
if(y_tmp < 0) y_tmp -= (n / 2);
else y_tmp += (n / 2);
y_new = y_tmp / n;
In order to avoid losing precision from dividing the result by n in one step and multiplying it by n in the next step you can save the value y_tmp before the division and use it in the next cycle.
y_tmp = (y_tmp + m * (x - y_old));
if(y_tmp < 0) y_new = y_tmp - (n / 2);
else y_new = y_tmp + (n / 2);
y_new /= n;
If your input data is int16_t I suggest to implement the calculation using int32_t to avoid overflows.
I tried to convert the filter in your code without checking other parts for possible problems.
INT32 left_id = 0;
INT32 right_id = 1;
int32_t filtered_l_out = 0; // output value after division
int32_t filtered_r_out = 0;
int32_t filtered_l_tmp = 0; // used to keep the output value before division
int32_t filtered_r_tmp = 0;
int32_t l_in = 0; // input value
int32_t r_in = 0;
DOUBLE low_filter = filter_freq(core->audio->low_pass_cut);
// define denominator and calculate numerator
// use power of 2 to allow bit-shift instead of division
const uint32_t filter_shift = 16U;
const int32_t filter_n = 1U << filter_shift;
int32_t filter_m = (int32_t)(low_filter * filter_n);
for(UINT32 a = 0; a < (buffer_size/2);++a)
{
l_in = input_buffer[left_id];
r_in = input_buffer[right_id];
///////////////LOW PASS
filtered_l_tmp = filtered_l_tmp + filter_m * (l_in - filtered_l_out);
if(filtered_l_tmp < 0) {
filtered_l_out = filtered_l_tmp - filter_n/2;
} else {
filtered_l_out = filtered_l_tmp + filter_n/2;
}
//filtered_l_out /= filter_n;
filtered_l_out >>= filter_shift;
/* same calculation for right */
INT16 l = (INT16)(filtered_l_out);
INT16 r = (INT16)(filtered_r_out);
output_buffer[left_id] = (output_buffer[left_id] + l);
output_buffer[right_id] = (output_buffer[right_id] + r);
left_id +=2;
right_id +=2;
}
As your filter is initialized with 0 it may need several samples to follow a possible step to the first input value. Depending on your data it might be better to initialize the filter based on the first input value.
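For reference, here is a minimal single-channel sketch of the scheme described above: keep the undivided value between samples and round only when producing output. The names lp_filter, lp_coeff and lp_step and the choice n = 2^15 are assumptions made for illustration. A 64-bit accumulator is used so the products cannot overflow, and a plain division is used instead of a right shift so that negative values round predictably.

```c
#include <stdint.h>

#define LP_SHIFT 15                        /* assumed denominator n = 2^15 */
#define LP_N     ((int64_t)1 << LP_SHIFT)

typedef struct {
    int64_t acc;   /* filter output scaled by n, kept undivided between samples */
    int32_t out;   /* rounded output of the previous sample */
} lp_filter;

/* Convert a coefficient in (0, 1] (e.g. from filter_freq) to the numerator m. */
static int32_t lp_coeff(double alpha)
{
    return (int32_t)(alpha * (double)LP_N + 0.5);
}

/* Feed one input sample and return the filtered sample. */
static int16_t lp_step(lp_filter *f, int32_t m, int16_t x)
{
    /* y_tmp = y_tmp + m * (x - y_old) */
    f->acc += (int64_t)m * ((int32_t)x - f->out);

    /* divide by n with rounding to nearest instead of truncation */
    int64_t rounded = (f->acc >= 0) ? f->acc + LP_N / 2 : f->acc - LP_N / 2;
    f->out = (int32_t)(rounded / LP_N);
    return (int16_t)f->out;
}
```

For stereo data you would keep one lp_filter per channel, both starting from {0, 0} or from the first input sample as suggested above.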
I have a problem that, after much head scratching, I think is to do with very small numbers in a long-double.
I am trying to implement Planck's law equation to generate a normalised blackbody curve at 1nm intervals between a given wavelength range and for a given temperature. Ultimately this will be a function accepting inputs, for now it is main() with the variables fixed and outputting by printf().
I see examples in matlab and python, and they are implementing the same equation as me in a similar loop with no trouble at all.
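This is the equation (Planck's law for spectral radiance, as implemented in the code below): B(λ, T) = (2hc^2 / λ^5) * 1 / (exp(hc / (λkT)) - 1)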
My code generates an incorrect blackbody curve:
I have tested key parts of the code independently. After trying to test the equation by breaking it into blocks in Excel, I noticed that it does result in very small numbers, and I wonder if my implementation of large numbers could be causing the issue. Does anyone have any insight into using C to implement equations? This is a new area to me and I have found the maths much harder to implement and debug than normal code.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
//global variables
const double H = 6.626070040e-34; //Planck's constant (Joule-seconds)
const double C = 299800000; //Speed of light in vacume (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin)
const double nm_to_m = 1e-6; //conversion between nm and m
const int interval = 1; //wavelength interval to caculate at (nm)
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} results;
int main() {
int min = 100 , max = 3000; //wavelength bounds to caculate between, later to be swaped to function inputs
double temprature = 200; //temprature in kelvin, later to be swaped to function input
double new_valu, old_valu = 0;
static results SPD_data, *SPD; //setup a static results structure and a pointer to point to it
SPD = &SPD_data;
SPD->wavelength = malloc(sizeof(int) * (max - min)); //allocate memory based on wavelength bounds
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
for (int i = 0; i <= (max - min); i++) {
//Fill wavelength vector
SPD->wavelength[i] = min + (interval * i);
//Computes radiance for every wavelength of blackbody of given temprature
SPD->radiance[i] = ((2 * H * pow(C, 2)) / (pow((SPD->wavelength[i] / nm_to_m), 5))) * (1 / (exp((H * C) / ((SPD->wavelength[i] / nm_to_m) * K * temprature))-1));
//Copy SPD->radiance to SPD->normalised
SPD->normalised[i] = SPD->radiance[i];
//Find largest value
if (i <= 0) {
old_valu = SPD->normalised[0];
} else if (i > 0){
new_valu = SPD->normalised[i];
if (new_valu > old_valu) {
old_valu = new_valu;
}
}
}
//for debug perposes
printf("wavelength(nm) radiance(Watts per steradian per meter squared) normalised radiance\n");
for (int i = 0; i <= (max - min); i++) {
//Normalise SPD
SPD->normalised[i] = SPD->normalised[i] / old_valu;
//for debug perposes
printf("%d %Le %Lf\n", SPD->wavelength[i], SPD->radiance[i], SPD->normalised[i]);
}
return 0; //later to be swaped to 'return SPD';
}
/*********************UPDATE Friday 24th Mar 2017 23:42*************************/
Thank you for the suggestions so far, lots of useful pointers especially understanding the way numbers are stored in C (IEEE 754) but I don't think that is the issue here as it only applies to significant digits. I implemented most of the suggestions but still no progress on the problem. I suspect Alexander in the comments is probably right, changing the units and order of operations is likely what I need to do to make the equation work like the matlab or python examples, but my knowledge of maths is not good enough to do this. I broke the equation down into chunks to take a closer look at what it was doing.
//global variables
const double H = 6.6260700e-34; //Planck's constant (Joule-seconds) 6.626070040e-34
const double C = 299792458; //Speed of light in vacume (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin) 1.3806488e-23
const double nm_to_m = 1e-9; //conversion between nm and m
const int interval = 1; //wavelength interval to caculate at (nm)
const int min = 100, max = 3000; //max and min wavelengths to caculate between (nm)
const double temprature = 200; //temprature (K)
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} results;
//main program
int main()
{
//setup a static results structure and a pointer to point to it
static results SPD_data, *SPD;
SPD = &SPD_data;
//allocate memory based on wavelength bounds
SPD->wavelength = malloc(sizeof(int) * (max - min));
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
//break equasion into visible parts for debuging
long double aa, bb, cc, dd, ee, ff, gg, hh, ii, jj, kk, ll, mm, nn, oo;
for (int i = 0; i < (max - min); i++) {
//Computes radiance at every wavelength interval for blackbody of given temprature
SPD->wavelength[i] = min + (interval * i);
aa = 2 * H;
bb = pow(C, 2);
cc = aa * bb;
dd = pow((SPD->wavelength[i] / nm_to_m), 5);
ee = cc / dd;
ff = 1;
gg = H * C;
hh = SPD->wavelength[i] / nm_to_m;
ii = K * temprature;
jj = hh * ii;
kk = gg / jj;
ll = exp(kk);
mm = ll - 1;
nn = ff / mm;
oo = ee * nn;
SPD->radiance[i] = oo;
}
//for debug perposes
printf("wavelength(nm) | radiance(Watts per steradian per meter squared)\n");
for (int i = 0; i < (max - min); i++) {
printf("%d %Le\n", SPD->wavelength[i], SPD->radiance[i]);
}
return 0;
}
Equation variable values during runtime in xcode:
I notice a couple of things that are wrong and/or suspicious about the current state of your program:
You have defined nm_to_m as 10^-9, yet you divide by it. If your wavelength is measured in nanometers, you should multiply it by 10^-9 to get it in meters. To wit, if hh is supposed to be your wavelength in meters, it is on the order of several light-hours.
The same is obviously true for dd as well.
mm, being the exponential expression minus 1, is zero, which gives you infinity in the results deriving from it. This is apparently because you don't have enough digits in a double to represent the significant part of the exponential. Instead of using exp(...) - 1 here, try using the expm1() function instead, which implements a well-defined algorithm for calculating exponentials minus 1 without cancellation errors.
Since interval is 1, it doesn't currently matter, but you can probably see that your results wouldn't match the meaning of the code if you set interval to something else.
Unless you plan to change something about this in the future, there shouldn't be a need for this program to "save" the values of all calculations. You could just print them out as you run them.
On the other hand, you don't seem to be in any danger of underflow or overflow. The largest and smallest numbers you use don't seem to be far from 10^±60, which is well within what ordinary doubles can deal with, let alone long doubles. That being said, it might not hurt to use more normalized units, but at the magnitudes you currently display, I wouldn't worry about it.
Thanks for all the pointers in the comments. For anyone else running into a similar problem with implementing equations in C, I had a few silly errors in the code:
writing a 6 not a 9
dividing when I should be multiplying
an off by one error with the size of my array vs the iterations of for() loop
200 when I meant 2000 in the temperature variable
As a result of the last one particularly I was not getting the results I expected (my wavelength range was not right for plotting the temperature I was calculating) and this was leading me to the assumption that something was wrong in the implementation of the equation, specifically I was thinking about big/small numbers in C because I did not understand them. This was not the case.
In summary, I should have made sure I knew exactly what my equation should be outputting for given test conditions before implementing it in code. I will work on getting more comfortable with maths, particularly algebra and dimensional analysis.
Below is the working code, implemented as a function, feel free to use it for anything but obviously no warranty of any kind etc.
blackbody.c
//
// Computes radiance for every wavelength of blackbody of given temperature
//
// INPUTS: int min wavelength to begin calculation from (nm), int max wavelength to end calculation at (nm), int temperature (kelvin)
// OUTPUTS: pointer to structure containing:
// - spectral radiance (Watts per steradian per meter squared per wavelength at 1nm intervals)
// - normalised radiance
//
//include & define
#include "blackbody.h"
//global variables
const double H = 6.626070040e-34; //Planck's constant (Joule-seconds) 6.626070040e-34
const double C = 299792458; //Speed of light in vacuum (meters per second)
const double K = 1.3806488e-23; //Boltzmann's constant (Joules per Kelvin) 1.3806488e-23
const double nm_to_m = 1e-9; //conversion between nm and m
const int interval = 1; //wavelength interval to calculate at (nm), to change this line 45 also need to be changed
bbresults* blackbody(int min, int max, double temperature) {
double new_valu, old_valu = 0; //variables for normalising result
bbresults *SPD;
SPD = malloc(sizeof(bbresults));
//allocate memory based on wavelength bounds
SPD->wavelength = malloc(sizeof(int) * (max - min));
SPD->radiance = malloc(sizeof(long double) * (max - min));
SPD->normalised = malloc(sizeof(long double) * (max - min));
for (int i = 0; i < (max - min); i++) {
//Computes radiance for every wavelength of blackbody of given temperature
SPD->wavelength[i] = min + (interval * i);
SPD->radiance[i] = ((2 * H * pow(C, 2)) / (pow((SPD->wavelength[i] * nm_to_m), 5))) * (1 / (expm1((H * C) / ((SPD->wavelength[i] * nm_to_m) * K * temperature))));
//Copy SPD->radiance to SPD->normalised
SPD->normalised[i] = SPD->radiance[i];
//Find largest value
if (i <= 0) {
old_valu = SPD->normalised[0];
} else if (i > 0){
new_valu = SPD->normalised[i];
if (new_valu > old_valu) {
old_valu = new_valu;
}
}
}
for (int i = 0; i < (max - min); i++) {
//Normalise SPD
SPD->normalised[i] = SPD->normalised[i] / old_valu;
}
return SPD;
}
blackbody.h
#ifndef blackbody_h
#define blackbody_h
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
//typedef structure to hold results
typedef struct {
int *wavelength;
long double *radiance;
long double *normalised;
} bbresults;
//function declarations
bbresults* blackbody(int, int, double);
#endif /* blackbody_h */
main.c
#include <stdio.h>
#include "blackbody.h"
int main() {
bbresults *TEST;
int min = 100, max = 3000, temp = 5000;
TEST = blackbody(min, max, temp);
printf("wavelength | normalised radiance | radiance |\n");
printf(" (nm) | - | (W per meter squr per steradian) |\n");
for (int i = 0; i < (max - min); i++) {
printf("%4d %Lf %Le\n", TEST->wavelength[i], TEST->normalised[i], TEST->radiance[i]);
}
free(TEST->wavelength);
free(TEST->radiance);
free(TEST->normalised);
free(TEST); // free the struct last, after the arrays it points to
return 0;
}
Plot of output:
unsigned const number = minimum + (rand() % (maximum - minimum + 1))
I know how to (easily) generate a random number within a range such as from 0 to 100. But what about a random number from the full range of int (assume sizeof(int) == 4), that is from INT_MIN to INT_MAX, both inclusive?
I don't need this for cryptography or the like, but an approximately uniform distribution would be nice, and I need a lot of those numbers.
The approach I'm currently using is to generate 4 random numbers in the range from 0 to 255 (inclusive) and do some messy casting and bit manipulations. I wonder whether there's a better way.
On my system RAND_MAX is 32767 which is 15 bits. So for a 32-bit unsigned just call three times and shift, or, mask etc.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void){
unsigned rando, i;
srand((unsigned)time(NULL));
for (i = 0; i < 3; i++) {
rando = ((unsigned)rand() << 17) | ((unsigned)rand() << 2) | ((unsigned)rand() & 3);
printf("%u\n", rando);
}
return 0;
}
Program output:
3294784390
3748022412
4088204778
For reference I'm adding what I've been using:
int random_int(void) {
assert(sizeof(unsigned int) == sizeof(int));
unsigned int accum = 0;
size_t i = 0;
for (; i < sizeof(int); ++i) {
accum <<= 8;
accum |= rand() & 0xFF;
}
// Attention: Implementation defined!
return (int) accum;
}
But I like Weather Vane's solution better because it uses fewer rand() calls and thus makes more use of the (hopefully good) distribution generated by it.
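Since RAND_MAX differs between systems, here is a sketch of the three-call approach that masks each call down to the 15 bits the C standard guarantees (RAND_MAX is at least 32767). The names rand_u32 and rand_i32 are made up for illustration. Note that some rand() implementations have weak low-order bits, so treat this as a starting point and not as a quality guarantee.

```c
#include <stdint.h>
#include <stdlib.h>

/* Build 32 random bits from three rand() calls, using only the low
 * 15 bits of each call. */
static uint32_t rand_u32(void)
{
    uint32_t a = (uint32_t)rand() & 0x7FFFu;   /* bits 31..17 */
    uint32_t b = (uint32_t)rand() & 0x7FFFu;   /* bits 16..2  */
    uint32_t c = (uint32_t)rand() & 0x3u;      /* bits 1..0   */
    return (a << 17) | (b << 2) | c;
}

/* Map the unsigned value onto INT32_MIN..INT32_MAX without relying on
 * an implementation-defined unsigned-to-signed cast. */
static int32_t rand_i32(void)
{
    uint32_t u = rand_u32();

    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

Calling rand() in three separate statements also fixes the order in which the calls happen, which a single expression with several rand() calls does not.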
We should be able to do something that works no matter what the range of rand() or what size result we're looking for just by accumulating enough bits to fill a given type:
// can be any unsigned type.
typedef uint32_t uint_type;
#define RAND_UINT_MAX ((uint_type) -1)
uint_type rand_uint(void)
{
// these are all constant and factor is likely a power of two.
// therefore, the compiler has enough information to unroll
// the loop and can use an immediate form shl in-place of mul.
uint_type factor = (uint_type) RAND_MAX + 1;
uint_type factor_to_k = 1;
uint_type cutoff = factor ? RAND_UINT_MAX / factor : 0;
uint_type result = 0;
while ( 1 ) {
result += rand() * factor_to_k;
if (factor_to_k <= cutoff)
factor_to_k *= factor;
else
return result;
}
}
Note: Makes the minimum number of calls to rand() necessary to populate all bits.
Let's verify this gives a uniform distribution.
At this point we could just cast the result of rand_uint() to type int and be done, but it's more useful to get output in a specified range. The problem is: How do we reach INT_MAX when the operands are of type int?
Well... We can't. We'll need to use a type with greater range:
int uniform_int_distribution(int min, int max)
{
// [0,1) -> [min,max]
double canonical = rand_uint() / (RAND_UINT_MAX + 1.0);
return floor(canonical * (1.0 + max - min) + min);
}
As a final note, it may be worthwhile to implement the random function in terms of type double instead, i.e., accumulate enough bits for DBL_MANT_DIG and return a result in the range [0,1). In fact this is what std::generate_canonical does.
I'm trying to use the sine table lookup method to generate a tone frequency at different step sizes, but when I convert the floating-point step to an integer and view the output on the oscilloscope, nothing is displayed on the screen.
Does anyone know the solution to this issue? Any help is appreciated.
Below is the code:
// use the formula: StepSize = 360/(Fs/f) Where Fs is the Sample frequency 44.1 kHz and f is the tone frequency.
// example: StepSize = 360/(44100/440) = 3.576, since the STM32 doesn't support the floating point, therefore, we have to use the fixed-point format which multiply it by 1000 to be 3575
int StepSize = 3575;
unsigned int v=0;
signed int sine_table[91] = {
0x800,0x823,0x847,0x86b,
0x88e,0x8b2,0x8d6,0x8f9,
0x91d,0x940,0x963,0x986,
0x9a9,0x9cc,0x9ef,0xa12,
0xa34,0xa56,0xa78,0xa9a,
0xabc,0xadd,0xaff,0xb20,
0xb40,0xb61,0xb81,0xba1,
0xbc1,0xbe0,0xc00,0xc1e,
0xc3d,0xc5b,0xc79,0xc96,
0xcb3,0xcd0,0xcec,0xd08,
0xd24,0xd3f,0xd5a,0xd74,
0xd8e,0xda8,0xdc1,0xdd9,
0xdf1,0xe09,0xe20,0xe37,
0xe4d,0xe63,0xe78,0xe8d,
0xea1,0xeb5,0xec8,0xedb,
0xeed,0xeff,0xf10,0xf20,
0xf30,0xf40,0xf4e,0xf5d,
0xf6a,0xf77,0xf84,0xf90,
0xf9b,0xfa6,0xfb0,0xfba,
0xfc3,0xfcb,0xfd3,0xfda,
0xfe0,0xfe6,0xfec,0xff0,
0xff4,0xff8,0xffb,0xffd,
0xffe,0xfff,0xfff};
unsigned int sin(int x){
x = x % 360;
if(x <= 90)
return sine_table[x];
else if ( x <= 180){
return sine_table[180 - x];
}else if ( x <= 270){
return 4096 - sine_table[x - 180];
}else{
return 4096 - sine_table[360 - x];
}
}
void main(void)
{
while(1){
v+=StepSize; // Don't know why it doesn't work in this way. not display anything on screen.
DAC->DHR12R2 = sin(v/1000); // DAC channel-2 12-bit Right aligned data
if (v >= 360) v = 0;
}
}
But, if I change the StepSize = 3; it shows the frequency:
There are a few issues with your code. But I will start with the one that you asked about.
int StepSize = 3575;
unsigned int v=0;
while(1){
v+=StepSize;
DAC->DHR12R2 = sin(v/1000);
if (v >= 360) v = 0;
}
The reason why this code doesn't work is that v is always set to 0 at the end of the loop because 3575 is greater than 360. So then you always call sin(3) because 3575/1000 is 3 in integer division.
Perhaps, you should rewrite your last line as if ((v/1000) >= 360) v = 0;. Otherwise, I would rewrite your loop like this
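while(1){
v+=StepSize; // v holds the angle in thousandths of a degree
if (v >= 360000) v -= 360000; // wrap at 360 degrees but keep the remainder
DAC->DHR12R2 = sin(v/1000); // look up whole degrees only
}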
I would also recommend that you declare your lookup table a const. So it would look like
const signed int sine_table[91] = {
Last recommendation is to choose another name for your sin function so as not to confuse with the sin library function. Even though in this case there shouldn't be a problem.
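To see the effect of keeping the angle in scaled units, here is a small hardware-independent sketch of such a phase accumulator. The names phase_advance, phase_degrees, PHASE_SCALE and PHASE_WRAP are made up for illustration, and the step is assumed to be smaller than one full turn.

```c
#include <stdint.h>

#define PHASE_SCALE 1000u                    /* thousandths of a degree */
#define PHASE_WRAP  (360u * PHASE_SCALE)     /* one full turn in scaled units */

/* Advance the phase by `step` scaled units (step < PHASE_WRAP assumed),
 * wrapping at 360 degrees while preserving the fractional remainder. */
static uint32_t phase_advance(uint32_t phase, uint32_t step)
{
    phase += step;
    if (phase >= PHASE_WRAP)
        phase -= PHASE_WRAP;
    return phase;
}

/* Whole degrees to use as the lookup-table index (0..359). */
static unsigned phase_degrees(uint32_t phase)
{
    return (unsigned)(phase / PHASE_SCALE);
}
```

With a step of 3575, the first two samples use 3 and 7 degrees, and after 101 steps the accumulator has wrapped to 1075, i.e. 1 degree, instead of being reset to 0.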
I have a number of time series each containing a sequence of 400 numbers that are close to each other. I have thousands of time series; each has its own series of close numbers.
TimeSeries1 = 184.56, 184.675, 184.55, 184.77, ...
TimeSeries2 = 145.73, 145.384, 145.96, 145.33, ...
TimeSeries3 = -126.48, -126.78, -126.55, ...
I can store an 8 byte double for each time Series, so for most of the time series, I can compress each double to a single byte by multiplying by 100 and taking the delta of the current value and the previous value.
Here is my compress/decompress code:
struct{
double firstValue;
double nums[400];
char compressedNums[400];
int compressionOK;
} timeSeries;
void compress(void){
timeSeries.firstValue = timeSeries.nums[0];
double lastValue = timeSeries.firstValue;
for (int i = 1; i < 400; ++i){
int delta = (int) ((timeSeries.nums[i] * 100) - (lastValue* 100));
timeSeries.compressionOK = 1;
if (delta > CHAR_MAX || delta < -CHAR_MAX){
timeSeries.compressionOK = 0;
return;
}
else{
timeSeries.compressedNums[i] = (char) delta;
lastValue = timeSeries.nums[i];
}
}
}
double decompressedNums[400];
void decompress(void){
if (timeSeries.compressionOK){
double lastValue = timeSeries.firstValue;
for (int i = 1; i < 400; ++i){
decompressedNums[i] = lastValue + timeSeries.compressedNums[i] / 100.0;
lastValue = decompressedNums[i];
}
}
}
I can tolerate some lossiness, on the order of .005 per number. However, I am getting more loss than I can tolerate, especially since a precision loss in one of the compressed series carries forward and causes an increasing amount of loss.
So my questions are:
Is there something I can change to reduce the lossiness?
Is there an altogether different compression method with a ratio comparable to, or better than, this 8 to 1?
You can avoid the slow drift in precision by working out the delta not from the precise value of the previous element, but rather from the computed approximation of the previous element (i.e. the sum of the deltas). That way, you will always get the closest approximation to the next value.
Personally, I'd use integer arithmetic for this purpose, but it will probably be fine with floating point arithmetic too, since floating point is reproducible even if not precise.
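A minimal integer-arithmetic sketch of this idea follows. The function names are made up for illustration, and unlike the question's code it stores the first value as a count of hundredths instead of a double. Each delta is taken against the running value the decoder will reconstruct, so rounding errors cannot pile up.

```c
#include <stddef.h>

/* Round a value to the nearest hundredth, expressed as an integer count. */
static long to_hundredths(double x)
{
    return (long)(x * 100.0 + (x >= 0 ? 0.5 : -0.5));
}

/* Returns 1 on success, 0 when some delta does not fit in a signed char.
 * deltas[0] is unused; *first receives the first value in hundredths. */
static int delta_compress(const double *in, size_t n, long *first,
                          signed char *deltas)
{
    if (n == 0)
        return 0;
    long approx = to_hundredths(in[0]);
    *first = approx;
    for (size_t i = 1; i < n; i++) {
        long delta = to_hundredths(in[i]) - approx;
        if (delta > 127 || delta < -127)
            return 0;
        deltas[i] = (signed char)delta;
        approx += delta;          /* track exactly what the decoder will see */
    }
    return 1;
}

static void delta_decompress(long first, const signed char *deltas, size_t n,
                             double *out)
{
    long approx = first;

    if (n == 0)
        return;
    out[0] = approx / 100.0;
    for (size_t i = 1; i < n; i++) {
        approx += deltas[i];
        out[i] = approx / 100.0;
    }
}
```

Because the running sum is an exact integer, every reconstructed value is simply the input rounded to two decimals, which keeps the error at about 0.005 per number regardless of position in the series.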
Look at the values as stored in memory:
184. == 0x4067000000000000ull
184.56 == 0x406711eb851eb852ull
The first two bytes are the same but the last six bytes are different.
For integer deltas, multiply by 128 instead of 100; this will get you 7 bits of the fractional part. If the delta is too large for one byte, use a three-byte sequence {0x80, hi_delta, lo_delta}, so 0x80 is used as a special indicator. If the delta happened to be -128, then that would be {0x80, 0xff, 0x80}.
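The escape scheme could be sketched as follows. The function names encode_delta and decode_delta are made up for illustration, and the code assumes that int is wider than 16 bits and that every delta fits in a signed 16-bit value.

```c
#include <stddef.h>

/* Write one delta to `out`; returns the number of bytes used (1 or 3).
 * Deltas in -127..127 take one byte; anything else is escaped with 0x80
 * followed by the 16-bit two's complement value, high byte first. */
static size_t encode_delta(int delta, unsigned char *out)
{
    if (delta >= -127 && delta <= 127) {
        out[0] = (unsigned char)delta;        /* never produces 0x80 */
        return 1;
    }
    unsigned u = (unsigned)delta & 0xFFFFu;   /* low 16 bits */
    out[0] = 0x80;
    out[1] = (unsigned char)(u >> 8);
    out[2] = (unsigned char)(u & 0xFFu);
    return 3;
}

/* Read one delta from `in`; returns the number of bytes consumed. */
static size_t decode_delta(const unsigned char *in, int *delta)
{
    if (in[0] != 0x80) {
        *delta = (in[0] < 0x80) ? (int)in[0] : (int)in[0] - 256;
        return 1;
    }
    int v = (in[1] << 8) | in[2];
    *delta = (v < 0x8000) ? v : v - 0x10000;  /* undo two's complement */
    return 3;
}
```

Since the encoded length now varies, the decoder has to walk the byte stream sequentially, using the returned byte counts to find the next delta.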
You should round the values before converting to an int to avoid the problems, as in this code.
#include <limits.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
enum { TS_SIZE = 400 };
typedef struct
{
double firstValue;
double nums[TS_SIZE];
signed char compressedNums[TS_SIZE];
int compressionOK;
} timeSeries;
static
void compress(timeSeries *t1)
{
t1->firstValue = t1->nums[0];
double lastValue = t1->firstValue;
for (int i = 1; i < TS_SIZE; ++i)
{
int delta = (int) round((t1->nums[i] - lastValue) * 100.0);
t1->compressionOK = 1;
if (delta > SCHAR_MAX || delta < -SCHAR_MAX)
{
printf("Delta too big: %d (%.3f) vs %d (%.3f) = delta %.3f\n",
i-1, t1->nums[i-1], i, t1->nums[i], t1->nums[i] - t1->nums[i-1]);
t1->compressionOK = 0;
return;
}
else
{
t1->compressedNums[i] = (signed char) delta;
lastValue = t1->nums[i];
}
}
}
static
void decompress(timeSeries *t1)
{
if (t1->compressionOK)
{
double lastValue = t1->firstValue;
for (int i = 1; i < TS_SIZE; ++i)
{
t1->nums[i] = lastValue + t1->compressedNums[i] / 100.0;
lastValue = t1->nums[i];
}
}
}
static void compare(const timeSeries *t0, const timeSeries *t1)
{
for (int i = 0; i < TS_SIZE; i++)
{
char c = (fabs(t0->nums[i] - t1->nums[i]) > 0.005) ? '!' : ' ';
printf("%c %03d: %.3f vs %.3f = %+.3f\n", c, i, t0->nums[i], t1->nums[i], t0->nums[i] - t1->nums[i]);
}
}
int main(void)
{
timeSeries t1;
timeSeries t0;
int i;
for (i = 0; i < TS_SIZE; i++)
{
if (scanf("%lf", &t0.nums[i]) != 1)
break;
}
if (i != TS_SIZE)
{
printf("Reading problems\n");
return 1;
}
t1 = t0;
for (i = 0; i < 10; i++)
{
printf("Cycle %d:\n", i+1);
compress(&t1);
decompress(&t1);
compare(&t0, &t1);
}
return 0;
}
With the following data (generated from integers in the range 18456..18855 divided by 100 and randomly perturbed by a small amount, about 0.3%, to keep the values close enough together), I got the same data over and over again for the full 10 cycles of compression and decompression.
184.60 184.80 184.25 184.62 184.49 184.94 184.95 184.39 184.50 184.96
184.54 184.72 184.84 185.02 184.83 185.01 184.43 185.00 184.74 184.88
185.04 184.79 184.55 184.94 185.07 184.60 184.55 184.57 184.95 185.07
184.61 184.57 184.57 184.98 185.24 185.11 184.89 184.72 184.77 185.29
184.98 184.91 184.76 184.89 185.26 184.94 185.09 184.68 184.69 185.04
185.39 185.05 185.41 185.41 184.74 184.77 185.16 184.84 185.31 184.90
185.18 185.15 185.03 185.41 185.18 185.25 185.01 185.31 185.36 185.29
185.62 185.48 185.40 185.15 185.29 185.19 185.32 185.60 185.39 185.22
185.66 185.48 185.53 185.59 185.27 185.69 185.29 185.70 185.77 185.40
185.41 185.23 185.84 185.30 185.70 185.18 185.68 185.43 185.45 185.71
185.60 185.82 185.92 185.40 185.85 185.65 185.92 185.80 185.60 185.57
185.64 185.39 185.48 185.36 185.69 185.76 185.45 185.72 185.47 186.04
185.81 185.80 185.94 185.64 186.09 185.95 186.03 185.55 185.65 185.75
186.03 186.02 186.24 186.19 185.62 186.13 185.98 185.84 185.83 186.19
186.17 185.80 186.15 186.10 186.32 186.25 186.09 186.20 186.06 185.80
186.02 186.40 186.26 186.15 186.35 185.90 185.98 186.19 186.15 185.84
186.34 186.20 186.41 185.93 185.97 186.46 185.92 186.19 186.15 186.32
186.06 186.25 186.47 186.56 186.47 186.33 186.55 185.98 186.36 186.35
186.65 186.60 186.52 186.13 186.39 186.55 186.50 186.45 186.29 186.24
186.81 186.61 186.80 186.60 186.75 186.83 186.86 186.35 186.34 186.53
186.60 186.69 186.32 186.23 186.39 186.71 186.65 186.37 186.37 186.54
186.81 186.84 186.78 186.50 186.47 186.44 186.36 186.59 186.87 186.70
186.90 186.47 186.50 186.74 186.80 186.86 186.72 186.63 186.78 186.52
187.22 186.71 186.56 186.90 186.95 186.67 186.79 186.99 186.85 187.03
187.04 186.89 187.19 187.33 187.09 186.92 187.35 187.29 187.04 187.00
186.79 187.32 186.94 187.07 186.92 187.06 187.39 187.20 187.35 186.78
187.47 187.54 187.33 187.07 187.39 186.97 187.48 187.10 187.52 187.55
187.06 187.24 187.28 186.92 187.60 187.05 186.95 187.26 187.08 187.35
187.24 187.66 187.57 187.75 187.15 187.08 187.55 187.30 187.17 187.17
187.13 187.14 187.40 187.71 187.64 187.32 187.42 187.19 187.40 187.66
187.93 187.27 187.44 187.35 187.34 187.54 187.70 187.62 187.99 187.97
187.51 187.36 187.82 187.75 187.56 187.53 187.38 187.91 187.63 187.51
187.39 187.54 187.69 187.84 188.16 187.61 188.03 188.06 187.53 187.51
187.93 188.04 187.77 187.69 188.03 187.81 188.04 187.82 188.14 187.96
188.05 187.63 188.35 187.65 188.00 188.27 188.20 188.21 187.81 188.04
187.87 187.96 188.18 187.98 188.46 187.89 187.77 188.18 187.83 188.03
188.48 188.09 187.82 187.90 188.40 188.32 188.33 188.29 188.58 188.53
187.88 188.32 188.57 188.14 188.02 188.25 188.62 188.43 188.19 188.54
188.20 188.06 188.31 188.19 188.48 188.44 188.69 188.63 188.34 188.76
188.32 188.82 188.45 188.34 188.44 188.25 188.39 188.83 188.49 188.18
Until I put the rounding in, the values would rapidly drift apart.
If you don't have round() — which was added to Standard C in the C99 standard — then you can use these lines in place of round():
int delta;
if (t1->nums[i] > lastValue)
delta = (int) (((t1->nums[i] - lastValue) * 100.0) + 0.5);
else
delta = (int) (((t1->nums[i] - lastValue) * 100.0) - 0.5);
This rounds correctly for positive and negative values. You could also factor that into a function; in C99, you could make it an inline function, but if that worked, you would have the round() function in the library, too. I used this code at first before switching to the round() function.