Problem with RMS and dB values of discrete samples in C

I'm trying to sample PCM data via the ALSA project on my Raspberry Pi 4 in C. Recording works like a charm, but working with the samples themselves leaves me confused, especially since I already did the same thing in a different project (on an ESP32).
Consider "buffer" to be an array of varying size per session (ALSA allocates differently every time) containing 32-bit, 44100 Hz discrete audio samples stored as 8-bit values (a cast to int32_t is needed). To get the dBFS value of a time stretch as long as one buffer, I square every sample, add the squares together, divide by the number of samples, take the square root, divide by INT32_MAX, take the log10 of that, and finally multiply by 20. A standard RMS and then dBFS calculation:
uint32_t sum = 0;
int32_t* samples = (int32_t*)buffer;
for(int i = 0; i < (size / (BIT_DEPTH/8)); i ++){
    sum += (uint32_t)pow(samples[i], 2);
}
double rms = sqrt(sum / (size / (BIT_DEPTH/8)));
int32_t decibel = (int32_t)(20 * log10(rms / INT32_MAX));
fprintf(stderr, "sum = %d\n", sum);
fprintf(stderr, "rms = %d\n", rms);
fprintf(stderr, "%d dBFS\n", decibel);
But instead of reasonable values for a somewhat quiet room (open window) or a speaker right next to the mics, I get unchanging, really quiet values of around -134 dBFS. Yes, the gain is low, so -134 could be possible, but what I understand even less is what happens when I print out the variables sum and rms:
buffersize: 262144
sum = -61773
rms = -262146
-138 dBFS
How could they ever be negative? This is probably a classic C issue which I can't see at the moment.
Again: writing the samples to a file results in a high-quality but low-gain WAV file (header needed). Any help? Thanks.

sum is a uint32_t, but you are printing it with %d, which is for int. The resulting behavior is not defined by the C standard. A common result is for values with the high bit set to be printed as negative numbers, but other behaviors are possible too. A correct conversion specification for unsigned int is %u, but, for uint32_t, you should include <inttypes.h> and use fprintf(stderr, "%" PRIu32 "\n", sum);.
Additionally, the squares and the summation may exceed what can be represented in a uint32_t, resulting in wrapping modulo 2^32.
rms is a double, but you are also printing it with %d, which is very wrong. Use %g, %f, or %e, or some other conversion for double, possibly with various modifiers to select formatting options.
With the int32_t decibel, %d might work in some C implementations, but a proper method is fprintf(stderr, "%" PRId32 " dBFS\n", decibel);.
Your compiler should have warned you of at least the double format issue. Pay attention to compiler warnings and fix the problems they report. Preferably, escalate compiler warnings to errors with the -Werror switch to GCC and Clang or the /WX switch to MSVC.
The line int32_t* samples = (int32_t*)buffer; may result in prohibited aliasing. Be very sure that the memory for buffer is properly defined to allow it to be aliased as int32_t. If it is not, the behavior is not defined by the C standard, and alternative techniques of accessing the buffer should be used, such as copying the data into an int32_t object one element at a time or into an array of int32_t.
Do not use pow to compute squares as it is wasteful (and introduces inaccuracies when other types are involved). For your types, use static inline uint32_t square(int32_t x) { return x*x; } and call it as square(samples[i]). If overflow is occurring, consider using int64_t when computing the square and uint64_t for the sum.
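Putting those suggestions together, here is one possible corrected sketch (illustrative only, not the original answer's code). It keeps the question's buffer/size/BIT_DEPTH names, uses memcpy instead of the pointer cast to sidestep the aliasing question, and deliberately accumulates in a double rather than a uint64_t so that even a worst-case buffer cannot overflow the sum:
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BIT_DEPTH 32

/* Sketch: RMS and dBFS over one buffer of native-endian int32_t samples. */
static void print_dbfs(const void *buffer, size_t size)
{
    size_t count = size / (BIT_DEPTH / 8);
    double sum = 0.0;

    if (count == 0)
        return;

    for (size_t i = 0; i < count; i++) {
        int32_t s;
        memcpy(&s, (const unsigned char *)buffer + i * sizeof s, sizeof s);
        double x = (double)s / (double)INT32_MAX;  /* normalize to roughly [-1, 1] */
        sum += x * x;
    }

    double rms = sqrt(sum / (double)count);        /* 0 for pure silence */
    double dbfs = 20.0 * log10(rms);               /* -inf for pure silence */

    fprintf(stderr, "rms = %g\n", rms);
    fprintf(stderr, "%.1f dBFS\n", dbfs);
}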

Related

Convert Long To Double, Unexpected Results

I am using very basic code to convert a string into a long and into a double. The CAN library I am using requires a double as an input. I am attempting to send the device ID as a double to another device on the CAN network.
If I use an input string that is 6 bytes long, the long and double values are the same. If I add a 7th byte to the string, the values are slightly different.
I do not think I am hitting a max value limit. This code is run with ceedling for an automated test. The same behaviour is seen when sending this data across my CAN communications. In main.c the issue is not observed.
The test is:
void test_can_hal_get_spn_id(void){
    struct dbc_id_info ret;
    memset(&ret, NULL_TERMINATOR, sizeof(struct dbc_id_info));

    char expected_str[8] = "smg123";
    char out_str[8];
    memset(&out_str, 0, 8);

    uint64_t long_val = 0;
    double phys = 0.0;

    memcpy(&long_val, expected_str, 8);
    phys = long_val;

    printf("long %ld \n", long_val);
    printf("phys %f \n", phys);

    uint64_t temp = (uint64_t)phys;
    memcpy(&out_str, &temp, 8);

    printf("%s\n", expected_str);
    printf("%s\n", out_str);
}
With the input = "smg123"
[test_can_hal.c]
- "long 56290670243187 "
- "phys 56290670243187.000000 "
- "smg123"
- "smg123"
With the input "smg1234"
[test_can_hal.c]
- "long 14692989459197299 "
- "phys 14692989459197300.000000 "
- "smg1234"
- "tmg1234"
Is this error just due to how floats are handled and rounded? Is there a way to test for that? Am I doing something fundamentally wrong?
Representing the char array as a double without the intermediate long solved the issue. For clarity I am using DBCPPP. I am using it in C. I should clarify my CAN library comes from NXP, DBCPPP allows my application to read a DBC file and apply the data scales and factors to my raw CAN data. DBCPPP accepts doubles for all data being encoded and returns doubles for all data being decoded.
The CAN library I am using requires a double as an input.
That sounds surprising, but if so, then why are you involving a long as an intermediary between your string and double?
If I use an input string that is 6 bytes long, the long and double values are the same. If I add a 7th byte to the string, the values are slightly different.
double is a floating point data type. To be able to represent values with a wide range of magnitudes, some of its bits are used to represent scale, and the rest to represent significant digits. A typical C implementation uses doubles with 53 bits of significand. It cannot exactly represent numbers with more than 53 significant binary digits. That's enough for 6 bytes of arbitrary data (48 bits), but not enough for 7 (56 bits).
I do not think I am hitting a max value limit.
Not a maximum value limit. A precision limit. A 64-bit long has smaller numeric range but more significant digits than an IEEE-754 double.
So again, what role is the long supposed to be playing in your code? If the objective is to get eight bytes of arbitrary data into a double, then why not go directly there? Example:
char expected_str[8] = "smg1234";
char out_str[8] = {0};
double phys = 0.0;
memcpy(&phys, expected_str, 8);
printf("phys %.14e\n", phys);
memcpy(&out_str, &phys, 8);
printf("%s\n", expected_str);
printf("%s\n", out_str);
Do note, however, that there is some risk when (mis)using a double this way. It is possible for the data you put in to constitute a trap representation (a signaling NaN might be such a representation, for example). Handling such a value might cause a trap, or cause the data to be corrupted, or possibly produce other misbehavior. It is also possible to run into numeric issues similar to the one in your original code.
Possibly your library provides some relevant guarantees in that area. I would certainly hope so if doubles are really its sole data type for communication. Otherwise, you could consider using multiple doubles to convey data payloads larger than 53 bits, each of which you could consider loading via your original technique.
If you have a look at the IEEE-754 Wikipedia page, you'll see that the double precision values have a precision of "[a]pproximately 16 decimal digits". And that's roughly where your problem seems to appear.
Specifically, though it's a 64-bit type, it does not have the necessary encoding to provide 2^64 distinct floating point values. There are many bit patterns that map to the same value.
For example, NaN is encoded as an exponent field of binary 1111 1111 with a non-zero fraction (23 bits) regardless of the sign (one bit). That's 2 * (2^23 - 1) (over 16 million) distinct bit patterns representing NaN.
So, yes, your "due to how floats are handled and rounded" comment is correct.
In terms of fixing it, you'll either have to limit your strings to values that can be represented by doubles exactly, or find a way to send the strings across the CAN bus.
For example (if you can't send strings), two 32-bit integers could represent an 8-character string value with zero chance of information loss.
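For instance, a minimal sketch of that two-integer idea (pack8 and unpack8 are made-up helper names; the explicit shifts keep the packing independent of host byte order, so the two integer values round-trip the string exactly):
#include <stdint.h>
#include <stdio.h>

/* Pack up to 8 characters into two 32-bit integers, most significant byte first. */
static void pack8(const char *str, uint32_t out[2])
{
    unsigned char padded[8] = {0};

    for (size_t i = 0; i < 8 && str[i] != '\0'; i++)
        padded[i] = (unsigned char)str[i];     /* zero-pad shorter strings */

    out[0] = (uint32_t)padded[0] << 24 | (uint32_t)padded[1] << 16
           | (uint32_t)padded[2] << 8  | (uint32_t)padded[3];
    out[1] = (uint32_t)padded[4] << 24 | (uint32_t)padded[5] << 16
           | (uint32_t)padded[6] << 8  | (uint32_t)padded[7];
}

/* Rebuild the string (always NUL-terminated) from the two integers. */
static void unpack8(const uint32_t in[2], char out[9])
{
    for (int w = 0; w < 2; w++)
        for (int b = 0; b < 4; b++)
            out[w * 4 + b] = (char)((in[w] >> (24 - 8 * b)) & 0xFF);
    out[8] = '\0';
}

int main(void)
{
    uint32_t words[2];
    char back[9];

    pack8("smg1234", words);
    unpack8(words, back);
    printf("%s\n", back);   /* prints "smg1234" - no precision loss possible */
    return 0;
}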

Setting one's own type limits in C?

Can I set my own limits for data types in C? I'm solving a problem which involves some mega-great numbers, and I wish to perform many additions and multiplications and then take the final result modulo some desired number, say 1537849. So I wonder if it's possible to reset the limits of data types so that values are automatically taken modulo my chosen number whenever the outcome of an operation exceeds it, just as the processor normally does but with the limits I wish. And if such a thing isn't possible, what is the most efficient way to handle such a problem?
edit:
Suppose one wants to calculate (2^1000) % 1537849 and place the result in the variable monster. Below is my attempt to conquer the problem:
uint64_t monster = 1;
uint32_t power = 1000;
for (uint32_t i = 0; i < power; i ++ ) {
    monster *= 2;
    if (i%64==63) monster %= 1537849;
}
monster %= 1537849;
Is there any better way of doing so (different algorithm, using libraries, whatever ...)??
Can I set my own limits for data types in C?
The limits of basic types are fixed per the compiler.
I wish to perform many additions and multiplications and to take the final result modulo some desired number, say 1537849
At any stage in the addition and multiplication, code can repeatedly perform the modulo. If the original numbers are N-bit, then at most N-bit math is needed - although it is easier to do with 2N-bit math. Unlimited wide math is inefficient and not needed for this task.
Example code for +, * and pow() with modulo limitations:
Modular exponentiation without range restriction
uintmax_t addmodmax(uintmax_t a, uintmax_t b, uintmax_t mod);
uintmax_t mulmodmax(uintmax_t a, uintmax_t b, uintmax_t mod);
uintmax_t powmodmax(uintmax_t x, uintmax_t expo, uintmax_t mod);
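For illustration, a minimal sketch of a powmodmax-style routine (this is not the code behind that link; it assumes the modulus is small enough that mod * mod fits in a uint64_t, which easily holds for 1537849):
#include <stdint.h>
#include <stdio.h>

/* Modular exponentiation by repeated squaring; every product is reduced
 * immediately, so intermediate values never exceed mod * mod. */
static uint64_t powmod(uint64_t base, uint64_t expo, uint64_t mod)
{
    uint64_t result = 1 % mod;

    base %= mod;
    while (expo > 0) {
        if (expo & 1)
            result = (result * base) % mod;
        base = (base * base) % mod;
        expo >>= 1;
    }
    return result;
}

int main(void)
{
    /* (2^1000) % 1537849, with no wide-integer library needed */
    printf("%llu\n", (unsigned long long)powmod(2, 1000, 1537849));
    return 0;
}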
Can I set my own limits for data types in C?
No, short of writing your own compiler and libraries.
I'm solving some problem which involves some mega-great numbers which easily exceed the types' limits
There are algorithms for handling huge numbers in parts ... and there are libraries that already do the work for you, e.g. have a look at the GNU Multiple Precision Arithmetic Library (GMP).
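For example, a small sketch using GMP for the (2^1000) % 1537849 case from the question (link with -lgmp; treat this as an illustrative sketch rather than the only way to use the library):
#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t base, mod, result;

    mpz_init_set_ui(base, 2);
    mpz_init_set_ui(mod, 1537849);
    mpz_init(result);

    /* result = base^1000 mod mod */
    mpz_powm_ui(result, base, 1000, mod);
    gmp_printf("%Zd\n", result);

    mpz_clears(base, mod, result, NULL);
    return 0;
}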

Join two integers into one double

I need to transfer a double value (-47.1235648, for example) using sockets. Since I'll have a lot of platforms, I must convert to network byte order to ensure the correct endianness on all ends... but this conversion doesn't accept double, just integer and short, so I'm 'cutting' my double into two integers to transfer, like this:
double lat = -47.848945;
int a;
int b;
a = (int)lat;
b = (int)(lat+1);
Now, I need to restore this on the other end using as little computation as possible (I saw some examples using pow, but it looks like pow uses a lot of resources for this; I'm not sure). Is there any way to join these as simply as possible, like bit manipulation?
Your code makes no sense.
The typical approach is to use memcpy():
const double lat = -47.848945;
uint32_t ints[sizeof lat / sizeof (uint32_t)];
memcpy(ints, &lat, sizeof lat);
Now send the elements of ints, which are just 32-bit unsigned integers.
This of course assumes:
That you know how to send uint32_ts in a safe manner, i.e. byte per byte or using endian-conversion functions.
That all hosts share the same binary double format (typically IEEE-754).
That you somehow can manage the byte order requirements when moving to/from a pair of integers from/to a single double value (see @JohnBollinger's answer).
I interpreted your question to mean all of these assumptions were safe, which might be a bit over the top. I can't delete this answer as long as it's accepted.
It's good that you're considering differences in numeric representation, but your idea for how to deal with this problem just doesn't work reliably.
Let us suppose that every machine involved uses 64-bit IEEE-754 format for its double representation. (That's already a potential point of failure, but in practice you probably don't have to worry about failures there.) You seem to postulate that the byte order for machines' doubles will map in a consistent way onto the byte order for their integers, but that is not a safe assumption. Moreover, even where that assumption holds true, you need exactly the right kind of mapping for your scheme to work, and that is not only not safe to assume, but very plausibly will not be what you actually see.
For the sake of argument, suppose machine A, which features big-endian integers, wants to transfer a double value to machine B, which features little-endian integers. Suppose further that on B, the byte order for its double representation is the exact reverse of the order on A (which, again, is not safe to assume). Thus, if on A, the bytes of that double are in the order
S T U V W X Y Z
then we want them to be in order
Z Y X W V U T S
on B. Your approach is to split the original into a pair (STUV, WXYZ), transfer the pair in a value-preserving manner to get (VUTS, ZYXW), and then put the pair back together to get ... uh oh ...
V U T S Z Y X W.
Don't imagine fixing that by first swapping the pair. That doesn't serve your purpose because you must avoid such a swap in the event that the two communicating machines have the same byte order, and you have no way to know from just the 8 bytes whether such a swap is needed. Thus even if we make simplifying assumptions that we know to be unsafe, your strategy is insufficient for the task.
Alternatives include:
transfer your doubles as strings.
transfer your doubles as integer (significand, scale) pairs. The frexp() and ldexp() functions can help with encoding and decoding such representations; see the sketch after this list.
transfer an integer-based fixed-point representation of your doubles (the same as the previous option, but with pre-determined scale that is not transferred)
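A minimal sketch of that (significand, scale) idea, assuming finite values and IEEE-754 binary64 doubles (encode_double and decode_double are made-up helper names; the resulting integers can then be sent with ordinary integer byte-order handling):
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Split a finite double into a 64-bit integer significand and a power-of-two
 * scale: x == sig * 2^scale, with no loss of information. */
static void encode_double(double x, int64_t *sig, int32_t *scale)
{
    int e;
    double m = frexp(x, &e);       /* x == m * 2^e, |m| in [0.5, 1) or 0 */

    *sig = (int64_t)ldexp(m, 53);  /* scale the 53-bit fraction up to an integer */
    *scale = (int32_t)(e - 53);
}

/* Rebuild the double from the transferred pair. */
static double decode_double(int64_t sig, int32_t scale)
{
    return ldexp((double)sig, scale);
}

int main(void)
{
    int64_t sig;
    int32_t scale;

    encode_double(-47.1235648, &sig, &scale);
    printf("%.9f\n", decode_double(sig, scale));  /* round-trips the double bit-for-bit */
    return 0;
}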
I need to transfer a double value (-47.1235648, for example) using sockets.
If the platforms have potentially different codings for double, then sending a bit pattern of the double is a problem. If code wants portability, a less than "just copy the bits" approach is needed. An alternative is below.
If platforms always have the same double format, just copy the n bits; see @Rishikesh Raje's answer below for an example.
In detail, OP's problem is only loosely defined. On many platforms a double is a binary64, yet this is not required by C. Such a double can represent about 2^64 different values exactly. Neither -47.1235648 nor -47.848945 is one of those, so it is possible OP does not have a strong precision concern.
"using the minimum computation as possible" implies minimal code, usually to have minimal time. For speed, any solution should be rated on order of complexity and with code profiling.
A portable method is to send via a string. This approach addresses correctness and best possible precision first and performance second. It removes endian issues, as the data is sent as text, and there is no precision/range loss in sending the data. The receiving side, if it uses the same double format, will re-form the double exactly. On a machine with a different double format, it still has a good string representation to do the best it can with.
// some ample sized buffer
#define N (sizeof(double)*CHAR_BIT)
double x = foo();
char buf[N];
#if FLT_RADIX == 10
// Rare base-10 platforms
// If macro DBL_DECIMAL_DIG not available, use (DBL_DIG+3)
sprintf(buf, "%.*e", DBL_DECIMAL_DIG-1, x);
#else
// print mantissa in hexadecimal notation with power-of-2 exponent
sprintf(buf, "%a", x);
#endif
bar_send_string(buf);
To reconstitute the double
char *s = foo_get_string();
double y;
// %lf decodes strings in decimal (f), exponential (e), or hexadecimal/exponential (a) notation
if (sscanf(s, "%lf", &y) != 1) Handle_Error(s);
else use(y);
A much better idea would be to send the double directly as 8 bytes in network byte order.
You can use a union
typedef union
{
    double a;
    uint8_t bytes[8];
} DoubleUnionType;

DoubleUnionType DoubleUnion;

//Assign the double by
DoubleUnion.a = -47.848945;
Then you can make a network byte order conversion function
void htonfl(uint8_t *out, uint8_t *in)
{
#if LITTLE_ENDIAN // Use macro name as per architecture
    out[0] = in[7];
    out[1] = in[6];
    out[2] = in[5];
    out[3] = in[4];
    out[4] = in[3];
    out[5] = in[2];
    out[6] = in[1];
    out[7] = in[0];
#else
    memcpy (out, in, 8);
#endif
}
And call this function before transmission and after reception.
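A usage sketch of the above, assuming the DoubleUnionType typedef and htonfl() shown earlier are in scope and that both ends use IEEE-754 binary64 (the round trip is shown in one process just to illustrate the call order):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    DoubleUnionType tx, rx;
    uint8_t wire[8];

    tx.a = -47.848945;
    htonfl(wire, tx.bytes);   /* host -> network byte order before transmission */

    /* ... the 8 bytes of wire[] would be sent over the socket here ... */

    htonfl(rx.bytes, wire);   /* the swap is symmetric, so it also restores host order */
    printf("%f\n", rx.a);     /* prints -47.848945 */
    return 0;
}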

floating point bug in embedded system

On a Rabbit microcontroller..
(1)
Every second, I increment f1 by one second converted into hours, adding it to the existing value and storing the result back in the same variable.
void main()
{
    float f1;
    int i;

    f1 = 4096;

    // Assume that I am simulating one second through each iteration of the following loop
    for(i = 0; i < 100; i++)
    {
        f1 += 0.000278; // f1 does not change from 4096
        printf("\ni: %d f1: %.06f", i, f1);
    }
}
(2)
Another question: when I try to store a 32-bit unsigned long value into a float variable, reading it back does not give me the value I stored. What am I doing wrong?
void main()
{
    unsigned long L1;
    int temp;
    float f1;

    L1 = 4000000000; // four billion
    f1 = (float)L1;

    // Now print both
    // You see that L1: 4000000000 while f1: -4000000000.000000
    printf("\nL1: %lu f1:%.6f", L1, f1);
}
The first problem is that single precision (32 bit) binary floating point is good for only approximately 6 significant figures in decimal. So if you start with 4096.00 anything less than .01 cannot be added to the value. Using double precision will improve the result at some significant cost.
It is usually unnecessary and inappropriate to use floating point; it is very expensive on a processor without an FPU - especially an 8-bitter. Moreover, your literal approximation of one second in hours (1.0f/3600.0f hours) will introduce significant cumulative error in any case. You may be better off storing time in integer seconds and converting to hours where necessary for display or output, as in the sketch below.
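For instance, a minimal host-side sketch of that integer-seconds approach (standard C; the names are made up and the loop simply stands in for a once-per-second tick):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t elapsed_s = 0;

    for (int i = 0; i < 100; i++)
        elapsed_s += 1;                 /* one exact tick per simulated second */

    /* Convert to hours/minutes/seconds only when formatting for output. */
    printf("%lu h %lu min %lu s\n",
           (unsigned long)(elapsed_s / 3600u),
           (unsigned long)((elapsed_s % 3600u) / 60u),
           (unsigned long)(elapsed_s % 60u));
    return 0;
}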
The second problem is less clear, but it seems likely to be an issue with the Rabbit compiler's implementation of floating point, or possibly of the %f format specifier in the printf() implementation. Check the ISO compliance statement in the compiler documentation - there may be restrictions, especially on floating point. Again, you may find that using a double resolves the problem - especially as, strictly, that is the type expected by the %f format specifier in an ISO-conforming implementation. As I said, you are probably best off avoiding floating point altogether on such a target.
Note that if you are using Rabbit's Dynamic C compiler, you should be clear that Dynamic C is not an ISO conforming C compiler. It is a proprietary C-like language, that is similar enough to C to cause a great deal of confusion! Specifically it does not support double precision (double) floating point.
f1 += (1/3600); should be f1 += (1.0f/3600.0f);.
If you perform integer division, the result will also be an integer.
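A minimal illustration of that point (standard C, just to show the two expressions differ):
#include <stdio.h>

int main(void)
{
    float f1 = 0.0f;

    f1 += (1 / 3600);        /* integer division: 1/3600 == 0, so f1 is unchanged */
    printf("%f\n", f1);      /* prints 0.000000 */

    f1 += (1.0f / 3600.0f);  /* float division: adds roughly 0.000278 */
    printf("%f\n", f1);      /* prints 0.000278 */
    return 0;
}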

printf support for MSP430 micro-controller

I am upgrading a fully tested C program for the Texas Instruments (TI) MSP430 microcontroller using a different C compiler, changing from the Quadravox AQ430 Development Tool to the VisualGDB C compiler.
The program compiles with zero errors and zero warnings with VisualGDB. The interrupt service routines, timers, UART control, and more all seem to be working. Certain uses of sprintf, however, are not working. For example:
unsigned int duration;
float msec;
msec = (float)duration;
msec = msec / 125.0;
sprintf(glb_response,"break: %3.1f msec",msec);
This returns: break: %3.1f msec
This is what is expected: break: 12.5 msec
I have learned about the following (from Wikipedia):
The compiler option --printf_support=[full | minimal | nofloat] allows
you to use a smaller, feature limited, variant of printf/sprintf, and
make that choice at build time.
The valid values are:
full: Supports all format specifiers. This is the default.
nofloat: Excludes support for printing floating point values. Supports
all format specifiers except %f, %g, %G, %e, and %E.
minimal: Supports the printing of integer, char, or string values
without width or precision flags. Specifically, only the %%, %d, %o,
%c, %s, and %x format specifiers are supported
I need full support for printf. I know the MSP430 on my product will support this, as this C program has been in service for years.
My problem is that I can't figure out 1) whether VisualGDB has the means to set printf support to full, and 2) if so, where and how to set it.
Any and all comments and answers will be appreciated.
I would suggest that full support for floating point is both unnecessary and ill-advised. It is a large amount of code to solve a trivial problem; and without floating-point hardware, floating point operations are usually best avoided in any case for performance, code space and memory usage reasons.
So it appears that duration is in units of 1/125000 seconds and that you wish to output a value to a precision of 0.1 milliseconds. So:
unsigned msec_x10 = duration * 10u / 125u ;

sprintf( glb_response,
         "break: %u.%u msec",
         msec_x10 / 10,
         msec_x10 % 10 ) ;
If you want rounding to the nearest tenth (as opposed to rounding down), then:
unsigned msec_x10 = ((duration * 20u) + 125 ) / 250u ;
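As a quick sanity check of that arithmetic on a desktop compiler (glb_response and the duration value are made up for the test; on the MSP430 itself, where unsigned int is 16 bits, watch that duration * 10u does not overflow for large durations):
#include <stdio.h>

int main(void)
{
    char glb_response[32];
    unsigned duration = 1563;                    /* 1563 / 125.0 = 12.504 ms */

    unsigned msec_x10 = duration * 10u / 125u;   /* tenths of a millisecond, truncated */
    sprintf(glb_response, "break: %u.%u msec", msec_x10 / 10, msec_x10 % 10);
    puts(glb_response);                          /* prints "break: 12.5 msec" */
    return 0;
}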
I agree with Clifford that if you don't need floats (or only need them for printing), don't use them.
However, if your program is already using floats extensively, and you need a way to print them, consider adapting a public domain printf such as the one from SQLite.
