Arm Embedded GCC Toolchain - Differences Between Versions 7 and 9

We use the ARM Embedded Toolchain based on GCC for our Cortex-M3 projects (arm-none-eabi maintained by ARM).
We recently upgraded from 7.2.1 to 9.3.1 in a project that uses GNU11 and GNU++14 with newlib-nano (nano.specs); the optimization level is set to -Og.
We are seeing strange behavior: seemingly at random, one GPIO stops toggling; this does not happen with 7.2.1. When stepping through the program to check why it no longer toggles, all the statements execute properly: initialization, and then writing a value to the GPIO.
Also noticeable: if we change the optimization level to any other level (-O0, -O1, -O2, -Os) it works again, so it seems that -Og with 9.3.1 breaks something.
We then started searching for which code section we must exclude from optimization (which function or even which instruction) to make it work again. We did not find a single culprit: while narrowing it down, disabling optimization for a seemingly random function that has nothing to do with accessing the GPIO makes it work again (per-function override shown below).
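For reference, this is roughly how we disable optimization for a single function while narrowing it down (GCC-specific attribute; the function name is just an example, not our real code):

/* Compile only this function at -O0 while the rest of the file keeps -Og */
__attribute__((optimize("O0")))
static void some_unrelated_helper(void)
{
    /* ... original body unchanged ... */
}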
Currently we have no idea how to proceed further to find the root cause of this behavior.
Questions:
What changed between 7.2.1 and 9.3.1 with respect to GNU11/GNU++14, newlib-nano and optimization for debugging (-Og)?
How can it be that the older version produces a working binary while the newer one causes issues?
Why does only -Og cause trouble, while all other optimization levels seem to work just fine?
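For context, the only kind of pattern we know of that legitimately behaves differently across optimization levels is a missing volatile on a memory-mapped register, as in the sketch below (the register address and names are made up); we have not found such a spot in our code yet:

#include <stdint.h>

#define GPIO_OUT_BAD   (*(uint32_t *)0x40010C0CU)           /* missing volatile */
#define GPIO_OUT_GOOD  (*(volatile uint32_t *)0x40010C0CU)  /* correct */

void toggle_pin(void)
{
    /* Without volatile the compiler is free to merge, reorder or drop
       these writes, and different -O levels make different choices. */
    GPIO_OUT_BAD  |=  (1u << 5);
    GPIO_OUT_BAD  &= ~(1u << 5);

    /* With volatile each access happens exactly as written. */
    GPIO_OUT_GOOD |=  (1u << 5);
    GPIO_OUT_GOOD &= ~(1u << 5);
}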
Any hint/input highly appreciated.

Related

Linking additional code for the microcontroller (AVR) against already existing code

The problem definition:
There is a need to have two parts of code in an AVR microcontroller: a fixed part that is always there and does not change (often), and a transient part that is replaced or appended (not so) often. The challenge is to let the transient code call functions and access global variables of the fixed part -- and vice versa.
It is quite obvious that there should be special methods for the fixed code to access the transient one -- like keeping calculated function pointers in RAM and calling transient code only through them (see the sketch below).
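(Something like this sketch is what I have in mind for that direction -- the names and the table address are invented for illustration:)

/* A small jump table at a RAM address both images agree on. */
typedef void (*transient_fn_t)(void);

#define TRANSIENT_TABLE ((volatile transient_fn_t *)0x0100)  /* made-up address */

static void call_transient(unsigned char slot)
{
    transient_fn_t fn = TRANSIENT_TABLE[slot];
    if (fn != 0)   /* only call slots the transient image has filled in */
        fn();
}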
For calls in the backwards direction, I was thinking about linking the transient code against the existing .elf file of the fixed code.
I'm using the avr-gcc toolchain (as in Ubuntu 20.20), gcc version 5.4.0.
What I've already tried:
adding '-shared' as a link argument when building fixed code -- appears to be unsupported for AVR (linker reports an error).
adding instead '-Wl,--export-dynamic' as a link argument -- it seems to be ignored, no .dynsym section appears in the elf.
There is still a .symtab section in the fixed code elf -- could that be somehow used to link against it?
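In other words, I was hoping for a link step along these lines (the file names and MCU are placeholders; GNU ld's --just-symbols option reads symbol addresses from an existing ELF without pulling its sections into the output):
avr-gcc -mmcu=atmega328p transient.o -Wl,--just-symbols=fixed.elf -o transient.elf
Presumably the transient image would also have to be placed at a flash address the fixed code does not use (e.g. with -Wl,-Ttext=...).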
Note: my division into 'fixed' and 'transient' code has nothing to do with the boot area of some AVR microcontrollers; the bootloader is just something I do not care about here.
Note 2: The question is much like this one, but gives a clear explanation of the need.
You have to forget all your big-computer knowledge. 8-bit AVRs are tiny microcontrollers. Code has to be linked statically. There is no other way.

Floating Point Multiply Bug Using gcc 2.7.0 on Amiga with 68881 - Any Fixes/Workarounds?

For the heck of it, I decided to see if a program I started writing on an Amiga many years ago, and developed much further on other machines, would still compile and run on an Amiga. I originally used Lattice C because that's what I had used before, but the 68881 support in Lattice is VERY buggy, so I decided to try gcc. I think the most recent version of gcc for the Amiga is 2.7.0 (so I can't upgrade). It has worked rather well except for one bug in the 68881 support: when multiplying any negative number by zero, the result is always:
1.:00000
when printed out (the colon is NOT a typo). BTW, if you set x to zero and then print it out, it's 0.00000 like it should be.
Here's a sample program to test the bug; it doesn't matter which variable is 0 and which is negative, and if the non-zero value is positive, it works fine.
#include <stdio.h>
#include <math.h>

int main(void)
{
    float x, a, b;

    a = -10.0;     /* any negative value                                */
    b = 0.0;
    x = a * b;     /* should be 0.0, but prints 1.:00000 with -m68881   */
    printf("%f\n", x);
    return 0;
}
and it's compiled with: gcc -o tt -m68020 -m68881 tt.c -lm
Taking out -m68881 works fine (but of course, doesn't use the FPU)
Taking out -lm and/or math.h makes no difference.
Does anyone know of a bug fix or workaround? Maybe a gcc command line argument? (would rather not have to do UGLY things like "if ((a<0)&&(b==0))")
BTW, since I don't have a working Amiga anymore, I've had to use an emulator. If you want to see what I've been doing on this project (using Lattice C version), you can view my video at:
https://www.youtube.com/watch?v=x8O-qYQvP4M
(Amiga part starts at 10:07)
Thanks for any help.
This isn't exactly an answer, but a revelation that the problem is rather complicated (more so than a simple bug with gcc). Here's the info:
If I set the Amiga emulator to emulate a 68020 or 68030 with a 68881 or 68882, instead of a 68040 using the 68040's internal FPU, it does not produce the 1.:00000 (in other words, it works). That could mean the emulator is to blame for not emulating the 68040's FPU correctly, though I imagine the 68040's FPU is likely compatible with the 68881/68882. (I don't know if there's a performance hit in setting the emulator to a 68020/30 with a 68881/2; I have the emulator set to run as fast as possible on the host machine instead of at the speed of the original 680xx.)
However, if I compile with the Amiga gcc's -noixemul option, the code works correctly in every combination of CPU and FPU. That would indicate the problem is in the Amiga version of gcc, really in the part that tries to emulate UNIX on the Amiga (which is what ixemul.library does). So it might not be gcc's fault (compiled on some other system that uses a 68040 it would probably work), but the fault of the people who ported gcc to the Amiga.
So, you might say "problem solved, just use -noixemul" - well not so fast... Although the simple test program doesn't crash, my bigger program that exposed this problem crashes on program exit (recoverable GURU meditation) only when compiled with -noixemul (perhaps it's trying to close a library that was never opened, I don't know). This is why I didn't use -noixemul even though I wanted to.
So, it's not exactly solved, but I would say it's not likely a non-Amiga gcc bug.

Speed up compiled programs using runtime information like for example JVM does it?

Java programs can outperform compiled languages like C in specific tasks. This is because the JVM has runtime information and does JIT compilation when necessary (I guess).
(example: http://benchmarksgame.alioth.debian.org/u32/performance.php?test=chameneosredux)
Is there anything like this for a compiled language?
(I am interested in C first of all.)
After compiling the source, the developer runs it and tries to mimic a typical workload.
A tool gathers information about the run and then, according to this data, recompiles the program.
gcc has -fprofile-arcs
from the manpage:
-fprofile-arcs
Add code so that program flow arcs are instrumented. During execution the
program records how many times each branch and call is executed and how many
times it is taken or returns. When the compiled program exits it saves this
data to a file called auxname.gcda for each source file. The data may be
used for profile-directed optimizations (-fbranch-probabilities), or for
test coverage analysis (-ftest-coverage).
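A profile-directed build with these options typically looks something like this (program and file names are illustrative):
gcc -O2 -fprofile-arcs -o myprog myprog.c
./myprog                  (run it on a typical workload; this writes myprog.gcda)
gcc -O2 -fbranch-probabilities -o myprog myprog.c
Newer GCC releases also provide -fprofile-generate and -fprofile-use, which bundle these and a few related options into the two halves of the cycle.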
I don't think the JVM has ever really beaten well-optimized C code.
But to do something like that for C, you are looking for profile-guided optimization, where the compiler uses runtime information from a previous run to re-compile the program.
Yes, there are tools like this; I think it's known as "profile-guided optimization".
There are a number of possible optimizations. Important ones are reducing backing-store paging and making better use of your code caches. Many modern processors have a first-level code cache, possibly a second-level code cache or a unified second-level data and code cache, and possibly a third level of cache.
The simplest thing to do is to move all of your most frequently used functions to one place in the executable file, say at the beginning. More sophisticated is to move the code of rarely taken branches into a completely different part of the file.
Some instruction set architectures, such as PowerPC, have branch prediction bits in their machine code. Profile-guided optimization tries to set these more advantageously.
Apple used to provide this for the Macintosh Programmer's Workshop - for Classic Mac OS - with a tool called "MrPlus". I think GCC can do it. I expect LLVM can but I don't know how.
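Without profile data, GCC also lets you state such likelihoods by hand, which is roughly what the profile automates (a small illustrative snippet; the functions are made up):

/* Tell GCC the error path is rare, so the likely path can fall through
   and the cold code can be laid out elsewhere. */
int process(int value)
{
    if (__builtin_expect(value < 0, 0)) {  /* expected to be false     */
        return -1;                         /* rarely taken error path  */
    }
    return value * 2;
}

/* Whole functions can be marked as rarely executed, too. */
__attribute__((cold)) void report_fatal_error(const char *msg);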

ARM THUMB mode issue on Cortex A15

We are using a Cortex-A15 and kernel 3.8.
If I compile
arm-gcc-4.7.3 test.c -o test_thumb -mthumb
then whether I set or unset CONFIG_ARM_THUMB in the kernel, my Thumb (user space) binary always runs.
So I could not understand the behavior.
Ok, so, I can't see a good reason to do what you're attempting to do ... so I'll assume you are asking out of pure curiosity.
It is not possible (in the processor) to disable decoding of Thumb instructions or switching to Thumb state. The CONFIG_ARM_THUMB option is about making the use of Thumb code in applications safe with regard to how the operating system acts. In theory, this means that leaving it disabled could cause the program not to work properly in certain situations - not that the kernel would actively prevent Thumb code from executing.
In practice, the main effect it ever had was with OABI, which used a value embedded in the SWI (now SVC) instruction to identify which system call was being requested.
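For illustration, this is roughly what the EABI convention looks like from C (Linux/ARM, GCC inline assembly; the syscall shown is the classic exit call, and under OABI the number would instead have been encoded in the SWI immediate itself):

void eabi_exit(int code)
{
    register long nr  asm("r7") = 1;    /* syscall number in r7 (EABI) */
    register long arg asm("r0") = code; /* first argument in r0        */
    asm volatile("svc 0" : : "r"(nr), "r"(arg) : "memory");
    for (;;) { }                        /* not reached */
}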
I think OABI is not even supported by latest versions of GCC/binutils...
Any 4.7 toolchain is highly likely to be EABI.

System calls not working in Atmel AVR Studio (with ASF)

I am not getting answers on the AVR Freaks forum and wonder if someone here could help me.
The answer might lie in this SO question, but I am not sure why it would be necessary.
Basically, I have my first ever Atmel project (AVR Studio 6, UC3 processor). The code compiles and links, and I can load it to the Atmel board and step through in the debugger.
However, when I try to step over (or run until a breakpoint on the line after) a (valid) call to sprintf(), malloc() or memcpy() (there may be more that I have not yet discovered), the IDE never returns to the next line of my code; it just seems to hang, or run forever.
[Note] Compiler optimization is off
Do I need to set some linker option (e.g. link statically, which I tried and it didn't help)? Or build with some library?
What confuses me is that the code compiles and links - what is being linked when I call these standard functions? If I needed something else I would expect a compiler or linker error, but get none - so why won't my code run?
Sorry for such a stupid n00b question, but it is my first microcontroller project.
I discovered that the CPU on my board is an Engineering Sample and not supported by Atmel Studio without a new io.h file.
I sort of figured that out from this question: http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=106652
Sorry to have troubled you.
what is being linked when I call these standard functions?
The AVR-libc, the implementation of the C standard library ported to the AVR platform.
so why won't my code run?
Compile-time errors and runtime errors are not even related. Both of these lines are valid C and they compile; however, on most systems, I'd expect them to dump core:
int x = 1 / 0;     /* integer division by zero: undefined behaviour    */
*(int *)0 = 41;    /* write through a null pointer: undefined behaviour */
So it might be either:
a bug in the standard library (very unlikely), or
a bug in the online debugger (very unlikely), or
maybe you just expect something that is not supposed to happen?
Instead of trying to step over, what happens if you set a breakpoint at the next line after the line you want to step over?
Also, does the operation change if you turn off compiler optimization?
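One way to narrow it down is a tiny test that exercises the three calls separately (illustrative only; set a breakpoint after each line and see which one never comes back):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char buf[32];
    char src[8] = "test";
    void *p;

    memcpy(buf, src, sizeof src);   /* pure copy: no heap, no I/O          */
    p = malloc(16);                 /* needs a working heap setup          */
    sprintf(buf, "p=%p", p);        /* drags in the printf formatting code */
    free(p);
    return 0;
}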

Resources