I have a data-conversion function in our ARM9 code which uses varargs.
I've been using an arm-elf yagarto distribution from a couple of years
ago, with no problems. Recently, we upgraded to the arm-none-eabi
yagarto package from the yagarto site, and I'm finding that we now have
problems with floating point values. What I eventually discovered is
that doubles are being forced to 8-byte boundaries, and the existing
varargs floating-point handler didn't expect to find gaps in the args.
I can manually check the pointer and force it up to an eight-byte
boundary (in fact, I did that, and that fixed the issue entirely), but
I'd like to know why this has suddenly started happening.
Is there a compiler switch that specifies data alignment on the stack,
or in function calls, or something like that? And why would it default
to 8-byte boundaries on a 32-bit (4-byte) architecture?
I would appreciate any advice or insights that anyone could provide on
these issues.
The code is simple:
.....
float floatValue = 10.0;
int intValue = 10;
char buffer[32];
...
snprintf (buffer, 32, "%g", floatValue); /* junk value here, because of the 8-byte alignment */
snprintf (buffer, 32, "%lld", intValue); /* junk value here, because of the 8-byte alignment */
....
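For reference, the manual alignment workaround I described above looks something like the following sketch. This is not our actual handler; fetch_double and the raw argument pointer are illustrative names, and the sketch assumes a hand-rolled va_arg-style reader:

#include <stdint.h>
#include <string.h>

/* Illustrative sketch: round the argument pointer up to an 8-byte
   boundary before fetching a double, since the EABI aligns 64-bit
   types to 8 bytes. */
static double fetch_double(char **ap)
{
    uintptr_t p = (uintptr_t)*ap;
    p = (p + 7u) & ~(uintptr_t)7u;          /* align up to 8 bytes */
    double d;
    memcpy(&d, (const void *)p, sizeof d);  /* copy without assuming alignment */
    *ap = (char *)p + sizeof d;             /* step past the value just read */
    return d;
}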
The version of GCC we were using is 4.7.1
Compiling options:
Toolchain compiling options:
`-mabi=aapcs-linux`
`-mcpu=arm7tdmi` / `-mcpu=arm946e-s`
`-mfloat-abi=softfp`
Application compiler options:
`-mfloat-abi=softfp`
`-mfpu=vfp`
`-mstructure-size-boundary=8`
`-fomit-frame-pointer`
`-fshort-wchar`
`-fno-short-enums`
Related
Floating-point variables defined with float don't seem to work in µC-OS-III.
A simple code like this:
float f1;
f1 = 3.14f;
printf("\nFLOAT:%f", f1);
Would produce an output like this:
FLOAT:2681561605....
When I test this piece of code in main() before the µC-OS-III initialization, it works just fine. However, once multitasking begins it no longer works, neither in the tasks nor in the startup task.
I've searched the Internet for similar problems but couldn't find anything. However, there is an article that says "The IAR C/C++ Compiler for ARM requires the Stack Pointer to be aligned at 8 bytes...":
https://www.iar.com/support/tech-notes/general/problems-with-printf-floating-point-f-on-arm/
I located the stacks at 8-byte-aligned locations. The code then worked in the task, but the OS crashed right after the printf.
My compiler tool chain is IAR EWARM Version 8.32.1 and I am using µC-OS-III V3.07.03 with STM32F103.
I might be missing some OS or compiler configuration; I don't know. I had the same problem a few years ago with µC-OS-II, and in the end I decided to use fixed-point mathematics instead of floats.
Could someone shed some light on this?
According to the IAR article, locating the RTOS stacks on an 8-byte boundary solves the problem.
I located the stacks at fixed, 8-byte-aligned addresses:
static CPU_STK task_stk_startup[TASK_CFG_STACK_SIZE_STARTUP] @ (0x20000280u);
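An alternative to hard-coding the address is to let the compiler align the array; IAR provides an alignment pragma for this. A sketch, reusing the same names (I have not verified this on my exact setup):

#pragma data_alignment=8   /* IAR: force 8-byte alignment of the stack array */
static CPU_STK task_stk_startup[TASK_CFG_STACK_SIZE_STARTUP];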
Overview
I'm trying to program a frequency divider using a MC9S08QD4 microcontroller. It uses an 8-bit architecture following the HCS08 instruction set (documentation available here).
However, in order to accommodate input frequencies in as wide a range as possible, I've been trying to use double and unsigned long variables for storing properties such as the period of the input signal.
The issue I'm having is this: whenever I assign values to double or long variables, it correctly assigns the value to that variable but also overwrites completely unrelated parts of memory, corrupting other variables that may be stored there. A colleague suggested that this might be because it's using those locations to store intermediate values during calculation, which would be very odd if it's the case.
Environment
These are the tools I'm working with as a reference for the remainder of this post:
Windows 7 64-bit development machine
CodeWarrior Development Studio 10.7
Details
In certain parts of my code I assign potentially large values to long or double variables. By "large" I mean greater than what a 16-bit value could support but well within the range of what an unsigned 32-bit integer or double could support.
If I inspect local/global variables while performing these assignments, I can see that the variable I'm assigning to is assigned as expected, but so are other variables. I can also inspect the memory as I do this, where I see arbitrary and disparate locations being overwritten when I assign to these variables.
By following this guide, I've taken all the steps I'm aware of to ensure support for large and floating-point data types:
Set the S08 linker to include ansis.lib, which uses the small HCS08 memory model and supports 32-bit floats and 64-bit doubles.
Ensured __NO_FLOAT__ is not defined as a preprocessor symbol.
Ensured Use IEEE32 for double (default is IEEE64) under the HCS08 Compiler settings is not selected (although for my use case 32-bit doubles would be fine).
Ensured all data types are the expected sizes under HCS08 Compiler -> Type Sizes.
I've also verified that variables are being allocated the correct amount of memory in the generated map file and from the variable debugging screen.
SSCCE
I've been able to reproduce this issue easily with a very small amount of code in a dummy project I've set up:
static double temp = 0;
static double temp2 = 0;
static double temp3 = 0;

void main(void)
{
    double a = 1000;
    double b = a + 2;

    temp = 1;
    temp2 = temp + 2;
    temp3 = temp2 + 3;
}
Immediately after flashing the board, the debugger shows the global variables initialised to 0 and the local variables holding indeterminate values, which is fine as they haven't been assigned yet. Stepping past the first line, I see that a has been assigned correctly with no issues. Stepping one more line, I find that the assignment to b succeeds but corrupts my global variables.
Disassembly
Below is the disassembly for the first two lines of code in main. I've linked the HCS08 instruction set in Overview.
5 void main(void)
f092: A7F0 AIS #-16
7 double a = 1000;
f094: 5F CLRX
f095: 8C CLRH
f096: 9EFF07 STHX 7,SP
f099: 9EFF05 STHX 5,SP
f09c: 454000 LDHX #0x4000
f09f: 9EFF03 STHX 3,SP
f0a2: AE8F LDX #0x8F
f0a4: 9EFF01 STHX 1,SP
8 double b = a + 2;
f0a7: 95 TSX
f0a8: CDF4F1 JSR 0xF4F1 _DADD_RC (0xf4f1)
f0ab: 40 NEGA
f0ac: 000000 BRSET 0,0x00,*+3 main+0x15 (0xf0af)
f0af: 000000 BRSET 0,0x00,*+3 main+0x15 (0xf0b2)
f0b2: 00AF08 BRSET 0,0xAF,*+11 main+0x26 (0xf0bd)
f0b5: CDF13A JSR 0xF13A _POP64 (0xf13a)
The instructions for double a = 1000; look reasonable, but those for double b = a + 2; involve a jump that leads down a very deep rabbit hole that I've not been able to return from.
Any advice about why this might be happening would be appreciated.
Edit
I've uploaded the memory map file for my real project here (not enough space left in this post to include it directly). This is in response to those suggesting this is a matter of limited memory, which I don't believe is correct.
This is most likely a stack overflow. The MC9S08QD4 is not a PC; it is a very low-end 8-bit MCU with 256 bytes of RAM (including the S08 "zero page") and no FPU. Out of those 256 bytes, a small portion is reserved for the stack by default, perhaps 80 bytes or so. To know exactly how much, check your linker file (.prm).
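For example, in a typical CodeWarrior HCS08 .prm file the reservation is a single directive like this (the value is illustrative, not taken from your project):

STACKSIZE 0x50   /* reserve 80 bytes for the stack */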
Most likely the floating-point library alone needs more RAM than you even have available on-chip.
As someone with some 15 years of experience with these parts, as well as with the CodeWarrior compiler, I'll be brutally honest: picking such a limited MCU for a project that needs double-precision floating point and 32-bit integer arithmetic is sheer nonsense. Either you specified a completely wrong MCU for the task, or you are a PC programmer just now switching over to embedded systems. Either way, there is no way you can ever get this program to work.
Start over the project from scratch, beginning with the specification.
I have a somewhat odd problem that I have no real idea how to tackle.
I have a program that uses long doubles to do most of its math, and up until now that has worked fine. Recently I wanted to use MLAPACK, a high-precision version of LAPACK that uses double-double and quad-double types to do matrix solves. Unfortunately, when I link against the MLAPACK libraries, I lose precision in the original program.
For example, if I just do a simple sum of two numbers:
long double a = 50000.55964442486829568679L;
long double b = 0.006514624142341807720713032L;
When I don't link against MLAPACK, I get (correctly, for long double):
long double a + b = 50000.5661590490106374945
When I do link against this library, I get:
long double a + b = 50000.56615904901264002547
That is, they differ at the level of double precision rather than long double precision.
The thing is, I have no idea how to go about working out what is causing this change. I assume there must be a function in MLAPACK that is also defined in the original program, and that the wrong one is being called, but the original program is large (and not written by me).
The code is compiled on a Linux system, with the MLAPACK libraries linked as .so files; everything is compiled with the same version of gcc/gfortran.
I'm sure this is not the most well-posed of questions, but I don't really understand why this would happen. Any ideas where to even begin looking for a solution?
Cheers
I'm assuming you're compiling your program for Windows, as "32 bit", not "64 bit". If you're using Microsoft Visual C, add this line before you start doing math:
_control87( _PC_64, _MCW_PC ); /* requires: #include <float.h> */
If you're using a different compiler then you may need to use a different function.
(I kind of doubt you're using MSVC, because it doesn't have "long double" as a distinct floating-point type. What are you using, anyway?)
FULL EXPLANATION:
What's happening is that the floating-point unit inside your CPU is having its precision level changed by the startup code in the MLAPACK library.
The x87 FPU can run in any of three modes: single precision (24 bits of precision), double precision (53 bits), and extended precision (64 bits, a.k.a. long double). In Microsoft Visual C, the precision mode is set by the _control87 function; it might be different in your compiler. http://msdn.microsoft.com/en-us/library/e9b52ceh.aspx
Typically, the precision mode is set in the "startup code" for the C run-time library, which is included whenever you build a C program. Your program doesn't really start with main(), but with some other "entry point" inside the C run-time library. The code at that entry point sets everything up so a C program can run, then calls your main function. And if your program has long-double precision normally, that means that the entry point function must have called _control87(_PC_64, _MCW_PC) to set 64 bit long-double precision control.
So why is it changing when you link to MLAPACK? I would guess that MLAPACK is a DLL (dynamically-linked library), or at some point it happens to load a DLL. DLLs also have their own C run-time libraries (they're much more like separate executables than ordinary static libraries are) -- and especially if MLAPACK was built with a different compiler, it would have a different C run-time library with its own startup code. And that startup code sets the x87 FPU to 53-bit (double) precision!
So the answer is: you need to make sure you call _control87(_PC_64, _MCW_PC), or whatever the equivalent is on your compiler, to set to "long double" precision in your program before you start doing math. It might be OK just to do it in main, as one of the first things you do. Or maybe it is necessary to do something involving MLAPACK first, just to make sure MLAPACK is completely started up. Like you might invert a 1x1 matrix, kind of a dumb thing, and then set to 64-bit precision -- basically you're undoing the damage done by MLAPACK's C run-time library startup.
Note: on Windows, 64-bit programs don't use the x87 floating-point unit at all, and so they never have "long double" precision. That's why I assume you're building a 32 bit program. And if this is Linux or Mac, I don't know what's going on!
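For what it's worth, if it is 32-bit x86 Linux, the analogous knob is the same x87 control word, which glibc exposes through <fpu_control.h>. A minimal sketch, assuming x86 and glibc (the function name is illustrative):

#include <fpu_control.h>

/* Set the x87 precision-control field to extended (64-bit) precision. */
static void set_extended_precision(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);       /* read the current control word */
    cw |= _FPU_EXTENDED;  /* set both precision-control bits */
    _FPU_SETCW(cw);       /* write it back */
}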
I'm trying to print an integer into a string with snprintf for display on an OLED display from an ARM micro. However, when I use %d or %u the micro locks up and stops executing.
Using %x or %c works fine, but the output isn't much use.
What could cause this behaviour? Unfortunately I don't have access to a JTAG device to debug. I'm using arm-none-eabi-gcc to compile, and it's all running on a Maple Mini.
UPDATE
Passing values < 10 seems to make it work.
This actually turned out to be a stack size issue with the RTOS that I was using. I guess the added complexity of the snprintf call was pushing it over the limit and crashing.
Thanks to all who took a crack at answering this!
Passing values < 10 seems to make it work.
This sounds to me as if you have a missing or non-working divide routine. printf/sprintf usually prints decimal numbers by successively dividing them by 10. For numbers less than 10 no division is necessary, which is probably why those values work.
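To illustrate, the core of a typical %u/%x/%o conversion is a digit loop like the sketch below. Because real libraries usually pass the base in at run time, the division can't be strength-reduced to a multiply, so on ARM cores without hardware divide each iteration calls a runtime helper such as __aeabi_uidivmod. This is only a sketch, not the actual library code:

#include <stdint.h>

/* Write n in the given base, backwards from 'end'; returns a pointer
   to the first digit. Each iteration divides by the base. */
static char *utoa_base(uint32_t n, unsigned base, char *end)
{
    static const char digits[] = "0123456789abcdef";
    *--end = '\0';
    do {
        *--end = digits[n % base];  /* remainder -> next digit */
        n /= base;                  /* quotient -> remaining value */
    } while (n != 0);
    return end;
}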
To check, make a function which divides two variables (dividing by a constant is usually optimized into multiplication by the compiler). E.g.:
int t()
{
    volatile int a, b; // use volatile to prevent compiler optimizations
    a = 123;
    b = 10;
    return a / b;
}
Also, check your build log for link warnings.
It can't be a type error, since %x and %u both specify the same type (unsigned int). So it has to be a problem in snprintf itself. The only major difference between the two is that %u has to divide integers and compute remainders, whereas %x can get by with shifts and masks.
It is possible that your C library was compiled for a different variety of ARM processor than you are using, and perhaps it is using an illegal instruction to compute a quotient or remainder.
Make sure you are compiling your library for the Cortex-M3, e.g.:
gcc -mcpu=cortex-m3 ...
Do you have a prototype in scope? snprintf() is a varargs function, and calling a varargs function may involve some trickery to get the arguments to the place where the callee expects them.
Also: always pass the proper types when calling a varargs function. The conversion specifier after the '%' tells snprintf() what type to expect to find somewhere, and that 'somewhere' may even depend on the type; anything goes. In your case, "%x" expects an unsigned int, so give it one, either by casting the argument in the function call or by defining the variable as unsigned int sweeplow;. Negative frequencies or counts make no sense anyway.
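For example, keeping the sweeplow name from the original question (the buffer is illustrative):

char buf[16];
snprintf(buf, sizeof buf, "%x", (unsigned int)sweeplow);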
I've implemented some sorting algorithms (for integers) in C, carefully using uint64_t for anything that has to do with the data size (and therefore also counters and the like), since the algorithms need to be tested with data sets of several giga-integers.
The algorithms should be fine, and there should be no problem with the amount of data allocated: data is stored in files, and we only load small chunks at a time; everything works fine even when we shrink the in-memory buffers to any size.
Tests with data sets of up to 4 giga-ints (i.e. 16 GB of data) work fine (sorting 4 Gints took 2228 seconds, ~37 minutes), but when we go above that (e.g. 8 Gints) the algorithm doesn't seem to halt (it's been running for about 16 hours now).
I'm afraid the problem could be due to integer overflow: maybe a counter in a loop is stored in a 32-bit variable, or maybe we're calling some function that works with 32-bit integers.
What else could it be?
Is there any easy way to check whether an integer overflow occurs at runtime?
This is compiler-specific, but if you're using gcc then you can compile with -ftrapv to issue SIGABRT when signed integral overflow occurs.
For example:
/* compile with gcc -ftrapv <filename> */
#include <signal.h>
#include <stdio.h>
#include <limits.h>
void signalHandler(int sig) {
    printf("Overflow detected\n");
}

int main() {
    signal(SIGABRT, &signalHandler);

    int largeInt = INT_MAX;
    int normalInt = 42;
    int overflowInt = largeInt + normalInt; /* should cause overflow */

    /* if compiling with -ftrapv, we shouldn't get here */
    return 0;
}
When I run this code locally, the output is
Overflow detected
Aborted
Take a look at -ftrapv and -fwrapv:
-ftrapv
This option generates traps for signed overflow on addition, subtraction, and multiplication operations.
-fwrapv
This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction, and multiplication wraps around using two's-complement representation. This flag enables some optimizations and disables others. This option is enabled by default for the Java front end, as required by the Java language specification.
See also Integer overflow in C: standards and compilers, and Useful GCC flags for C.
Clang now supports dynamic overflow checks for both signed and unsigned integers; see the -fsanitize=integer switch. For now it is the only C/C++ compiler with fully supported dynamic overflow checking for debugging purposes.
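Usage is just a compile flag; at run time the sanitizer prints a diagnostic with the source location of each overflow (the file name here is illustrative):

$ clang -fsanitize=integer -g sort.c -o sort && ./sort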
If you are using Microsoft's compiler, there are options to generate code that triggers a SEH exception when an integer conversion cuts off non-zero bits. In places where this is actually desired, use a bitwise AND to remove the upper bits before doing the conversion.
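For instance, where the truncation is intentional, masking first documents that intent and keeps the check quiet (a sketch):

#include <stdint.h>

uint32_t wide = 0x12345678u;
uint8_t  low  = (uint8_t)(wide & 0xFFu);  /* deliberately keep only the low byte */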
The only sure-fire way is to wrap operations on those integers in functions that perform bounds-violation checking. This will of course slow down integer ops, but if your code asserts or halts on a boundary violation with a meaningful error message, that will go a long way towards helping you identify where the problem is.
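A sketch of such a wrapper, using the __builtin_add_overflow intrinsic available in newer GCC and Clang (checked_add_u64 is an illustrative name, not part of your code):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Add two 64-bit counters, halting with a message on wrap-around. */
static uint64_t checked_add_u64(uint64_t a, uint64_t b)
{
    uint64_t r;
    if (__builtin_add_overflow(a, b, &r)) {  /* true if a + b wrapped */
        fprintf(stderr, "64-bit overflow: %llu + %llu\n",
                (unsigned long long)a, (unsigned long long)b);
        abort();
    }
    return r;
}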
As for your particular issue, keep in mind that general-case sorting is O(n log n), so the extra running time may not scale linearly with the data-set size. Also, since you didn't mention how much physical memory is in the box and how much of it your algorithm uses, the larger data set could be causing paging to disk, potentially slowing things to a crawl.