arm_math library with ropi rwpi compiler options - c

When cross-compiling the arm_math library for the STM32-U585 processor with the IAR compiler options ropi and rwpi enabled, the errors below appear.
Thank you in advance for any help.
arm_const_structs.c
Error[Pe147]: declaration is incompatible with "struct #30 arm_cfft_sR_f32_len16" (declared at line 36 of "C:\AST\FedLearn\AI_in_a_Module_on_U5_V1.1\B-U585I-IOT02A_VESPUCCI_MODULAR_FW\Drivers\CMSIS\DSP\Projects\IAR/../../Include\arm_const_structs.h") C:\AST\FedLearn\AI_in_a_Module_on_U5_V1.1\B-U585I-IOT02A_VESPUCCI_MODULAR_FW\Drivers\CMSIS\DSP\Source\CommonTables\arm_const_structs.c 36
Error[Ta071]: The address of a pc/static base relative object cannot be used in static initialization C:\AST\FedLearn\AI_in_a_Module_on_U5_V1.1\B-U585I-IOT02A_VESPUCCI_MODULAR_FW\Drivers\CMSIS\DSP\Source\CommonTables\arm_const_structs.c 38
Error[Ta071]: The address of a pc/static base relative object cannot be used in static initialization C:\AST\FedLearn\AI_in_a_Module_on_U5_V1.1\B-U585I-IOT02A_VESPUCCI_MODULAR_FW\Drivers\CMSIS\DSP\Source\CommonTables\arm_const_structs.c 38
Error[Pa088]: non-constructor dynamic initialization of const variables not allowed in this memory C:\AST\FedLearn\AI_in_a_Module_on_U5_V1.1\B-U585I-IOT02A_VESPUCCI_MODULAR_FW\Drivers\CMSIS\DSP\Source\CommonTables\arm_const_structs.c 36
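The Ta071 message suggests that, with position-independent options enabled, the address of a constant table is not a link-time constant, so it cannot appear in a static initializer. The following is only a minimal sketch of that pattern and of one possible workaround (filling the structure at run time). The names demo_instance, demo_table and demo_get_instance are hypothetical and are not part of CMSIS-DSP, and whether this approach suits the real arm_const_structs.c is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of an instance structure:
   a length plus a pointer to a constant table. */
typedef struct {
    uint16_t     length;
    const float *table;
} demo_instance;

static const float demo_table[4] = { 1.0f, 0.0f, 0.0f, -1.0f };

/* The pattern the errors point at: a const object whose static
   initializer contains the address of another object, e.g.
       const demo_instance demo_static = { 4, demo_table };
   With position-independent addressing that address may only be
   known at run time. */

/* Possible workaround (an assumption, not a vendor recommendation):
   keep the instance writable and fill it in on first use. */
static demo_instance demo_runtime;

const demo_instance *demo_get_instance(void)
{
    if (demo_runtime.table == NULL) {
        demo_runtime.length = 4;
        demo_runtime.table  = demo_table;  /* address taken at run time */
    }
    return &demo_runtime;
}
```

Callers would then use demo_get_instance() instead of referring to a statically initialized constant structure.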

Related

Why does Windows require DLL data to be imported?

On Windows data can be loaded from DLLs, but it requires indirection through a pointer in the import address table. As a result, the compiler must know if an object that is being accessed is being imported from a DLL by using the __declspec(dllimport) type specifier.
This is unfortunate because it means that a header for a Windows library designed to be used as either a static library or a dynamic library needs to know which version of the library the program is linking to. This requirement is not applicable to functions, which are transparently emulated for DLLs with a stub function calling the real function, whose address is stored in the import address table.
On Linux the dynamic linker (ld.so) copies the values of all linked data objects from a shared object into a private mapped region for each process. This doesn't require indirection because the address of the private mapped region is local to the module, so its address is decided when the program is linked (and in the case of position independent executables, relative addressing is used).
Why doesn't Windows do the same? Is there a situation where a DLL might be loaded more than once, and thus require multiple copies of linked data? Even if that was the case, it wouldn't be applicable to read only data.
It seems that the MSVCRT handles this issue by defining the _DLL macro when targeting the dynamic C runtime library (with the /MD or /MDd flag), then using that in all standard headers to conditionally declare all exported symbols with __declspec(dllimport). I suppose you could reuse this macro if you only supported statically linking when using the static C runtime and dynamically linking when using the dynamic C runtime.
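As an illustration of the header problem described above, here is a minimal sketch of the usual macro pattern. The names MYLIB_API, MYLIB_SHARED, MYLIB_BUILDING, mylib_version and mylib_get_version are hypothetical, and the header and source file are shown together only to keep the sketch self-contained.

```c
/* mylib.h (hypothetical): the header must know how the library is linked. */
#if defined(_WIN32) && defined(MYLIB_SHARED)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)  /* compiling the DLL itself */
#  else
#    define MYLIB_API __declspec(dllimport)  /* compiling a client of the DLL */
#  endif
#else
#  define MYLIB_API                          /* static library, or not Windows */
#endif

/* For data, dllimport is required for correctness on Windows. */
extern MYLIB_API int mylib_version;

/* For functions, dllimport is only an optimization. */
MYLIB_API int mylib_get_version(void);

/* mylib.c (hypothetical): the definitions. */
int mylib_version = 3;

int mylib_get_version(void)
{
    return mylib_version;
}
```

A client built against the DLL would define MYLIB_SHARED before including the header; a client of the static library would not, which is exactly the knowledge the header is forced to have.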
References:
LNK4217 - Russ Keldorph's WebLog (emphasis mine)
__declspec(dllimport) can be used on both code and data, and its semantics are subtly different between the two. When applied to a routine call, it is purely a performance optimization. For data, it is required for correctness.
[...]
Importing data
If you export a data item from a DLL, you must declare it with __declspec(dllimport) in the code that accesses it. In this case, instead of generating a direct load from memory, the compiler generates a load through a pointer, resulting in one additional indirection. Unlike calls, where the linker will fix up the code correctly whether the routine was declared __declspec(dllimport) or not, accessing imported data requires __declspec(dllimport). If omitted, the code will wind up accessing the IAT entry instead of the data in the DLL, probably resulting in unexpected behavior.
Importing into an Application Using __declspec(dllimport)
Using __declspec(dllimport) is optional on function declarations, but the compiler produces more efficient code if you use this keyword. However, you must use __declspec(dllimport) for the importing executable to access the DLL's public data symbols and objects.
Importing Data Using __declspec(dllimport)
When you mark the data as __declspec(dllimport), the compiler automatically generates the indirection code for you.
Importing Using DEF Files (interesting historical notes about accessing the IAT directly)
How do I share data in my DLL with an application or with other DLLs?
By default, each process using a DLL has its own instance of all the DLLs global and static variables.
Linker Tools Warning LNK4217
What happens when you get dllimport wrong? (seems to be unaware of data semantics)
How do I export data from a DLL?
CRT Library Features (documents the _DLL macro)
Linux and Windows use different strategies for accessing data stored in dynamic libraries.
On Linux, an undefined reference to an object is resolved to a library at link time. The linker finds the size of the object and reserves space for it in the .bss or the .rodata segment of the executable. When executed, the dynamic linker (ld.so) resolves the symbol to a dynamic library (again), and copies the object from the dynamic library to the process's memory.
On Windows, an undefined reference to an object is resolved to an import library at link time, and no space is reserved for it. When the module is executed, the dynamic linker resolves the symbol to a dynamic library, and creates a copy on write memory map in the process, backed by a shared data segment in the dynamic library.
The advantage of a copy on write memory map is that if the linked data is unchanged, then it can be shared with other processes. In practice this is a trifling benefit which greatly increases complexity, both for the toolchain and programs using dynamic libraries. For objects which are actually written this is always less efficient.
I suspect, although I have no evidence, that this decision was made for a particular and now outdated use case. Perhaps it was common practice to use large (for the time) read only objects in dynamic libraries on 16-bit Windows (in official Microsoft programs or otherwise). Either way, I doubt anyone at Microsoft has the expertise and time to change it now.
In order to investigate the issue I created a program which writes to an object from a dynamic library. It writes one byte per page (4096 bytes) in the object, then writes the entire object, then retries the initial one byte per page write. If the object is reserved for the process before main is called, the first and third loops should take approximately the same time, and the second loop should take longer than both. If the object is a copy on write map to a dynamic library, the first loop should take at least as long as the second, and the third should take less time than both.
The results are consistent with my hypothesis, and analyzing the disassembly confirms that Linux accesses the dynamic library data at a link time address, relative to the program counter. Surprisingly, Windows not only accesses the data indirectly, but also reloads the pointer to the data and its length from the import address table on every loop iteration, even with optimizations enabled. This was tested with Visual Studio 2010 on Windows XP, so things may have changed, although I doubt they have.
Here are the results for Linux:
$ dd bs=1M count=16 if=/dev/urandom of=libdat.dat
$ xxd -i libdat.dat libdat.c
$ gcc -O3 -g -shared -fPIC libdat.c -o libdat.so
$ gcc -O3 -g -no-pie -L. -ldat dat.c -o dat
$ LD_LIBRARY_PATH=. ./dat
local = 0x1601060
libdat_dat = 0x601040
libdat_dat_len = 0x601020
dirty= 461us write= 12184us retry= 456us
$ nm dat
[...]
0000000000601040 B libdat_dat
0000000000601020 B libdat_dat_len
0000000001601060 B local
[...]
$ objdump -d -j.text dat
[...]
400693: 8b 35 87 09 20 00 mov 0x200987(%rip),%esi # 601020 <libdat_dat_len>
[...]
4006a3: 31 c0 xor %eax,%eax # zero loop counter
4006a5: 48 8d 15 94 09 20 00 lea 0x200994(%rip),%rdx # 601040 <libdat_dat>
4006ac: 0f 1f 40 00 nopl 0x0(%rax) # align loop for efficiency
4006b0: 89 c1 mov %eax,%ecx # store data offset in ecx
4006b2: 05 00 10 00 00 add $0x1000,%eax # add PAGESIZE to data offset
4006b7: c6 04 0a 00 movb $0x0,(%rdx,%rcx,1) # write a zero byte to data
4006bb: 39 f0 cmp %esi,%eax # test loop condition
4006bd: 72 f1 jb 4006b0 <main+0x30> # continue loop if data is left
[...]
Here are the results for Windows:
$ cl /Ox /Zi /LD libdat.c /link /EXPORT:libdat_dat /EXPORT:libdat_dat_len
[...]
$ cl /Ox /Zi dat.c libdat.lib
[...]
$ dat.exe # note low resolution timer means retry is too small to measure
local = 0041EEA0
libdat_dat = 1000E000
libdat_dat_len = 1100E000
dirty= 20312us write= 3125us retry= 0us
$ dumpbin /symbols dat.exe
[...]
9000 .data
1000 .idata
5000 .rdata
1000 .reloc
17000 .text
[...]
$ dumpbin /disasm dat.exe
[...]
004010BA: 33 C0 xor eax,eax # zero loop counter
[...]
004010C0: 8B 15 8C 63 42 00 mov edx,dword ptr [__imp__libdat_dat] # store data pointer in edx
004010C6: C6 04 02 00 mov byte ptr [edx+eax],0 # write a zero byte to data
004010CA: 8B 0D 88 63 42 00 mov ecx,dword ptr [__imp__libdat_dat_len] # store data length in ecx
004010D0: 05 00 10 00 00 add eax,1000h # add PAGESIZE to data offset
004010D5: 3B 01 cmp eax,dword ptr [ecx] # test loop condition
004010D7: 72 E7 jb 004010C0 # continue loop if data is left
[...]
Here is the source code used for both tests:
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
typedef FILETIME time_l;
time_l time_get(void) {
    FILETIME ret; GetSystemTimeAsFileTime(&ret); return ret;
}
long long int time_diff(time_l const *c1, time_l const *c2) {
    return 1LL*c2->dwLowDateTime/100-c1->dwLowDateTime/100+c2->dwHighDateTime*100000-c1->dwHighDateTime*100000;
}
#else
#include <unistd.h>
#include <time.h>
#include <stdlib.h>
typedef struct timespec time_l;
time_l time_get(void) {
    time_l ret; clock_gettime(CLOCK_MONOTONIC, &ret); return ret;
}
long long int time_diff(time_l const *c1, time_l const *c2) {
    return 1LL*c2->tv_nsec/1000-c1->tv_nsec/1000+c2->tv_sec*1000000-c1->tv_sec*1000000;
}
#endif
#ifndef PAGESIZE
#define PAGESIZE 4096
#endif
#ifdef _WIN32
#define DLLIMPORT __declspec(dllimport)
#else
#define DLLIMPORT
#endif
extern DLLIMPORT unsigned char volatile libdat_dat[];
extern DLLIMPORT unsigned int libdat_dat_len;
unsigned int local[4096];
int main(void) {
    unsigned int i;
    time_l t1, t2, t3, t4;
    long long int d1, d2, d3;
    t1 = time_get();
    for(i=0; i < libdat_dat_len; i+=PAGESIZE) {
        libdat_dat[i] = 0;
    }
    t2 = time_get();
    for(i=0; i < libdat_dat_len; i++) {
        libdat_dat[i] = 0xFF;
    }
    t3 = time_get();
    for(i=0; i < libdat_dat_len; i+=PAGESIZE) {
        libdat_dat[i] = 0;
    }
    t4 = time_get();
    d1 = time_diff(&t1, &t2);
    d2 = time_diff(&t2, &t3);
    d3 = time_diff(&t3, &t4);
    printf("%-15s= %18p\n%-15s= %18p\n%-15s= %18p\n", "local", local, "libdat_dat", libdat_dat, "libdat_dat_len", &libdat_dat_len);
    printf("dirty=%9lldus write=%9lldus retry=%9lldus\n", d1, d2, d3);
    return 0;
}
I sincerely hope someone else benefits from my research. Thanks for reading!

Why does GCC store global and static int differently?

Here is my C program with one static, two global, one local and one extern variable.
#include <stdio.h>
int gvar1;
int gvar2 = 12;
extern int evar = 1;
int main(void)
{
    int lvar;
    static int svar = 4;
    lvar = 2;
    gvar1 = 3;
    printf ("global1-%d global2-%d local+1-%d static-%d extern-%d\n", gvar1, gvar2, (lvar+1), svar, evar);
    return 0;
}
Note that gvar1, gvar2, evar, lvar and svar are all defined as integers.
I disassembled the code using objdump and the debug_str for this shows as below:
Contents of section .debug_str:
0000 76617269 61626c65 732e6300 6c6f6e67 variables.c.long
0010 20756e73 69676e65 6420696e 74002f75 unsigned int./u
0020 73657273 2f686f6d 6534302f 72616f70 sers/home40/raop
0030 2f626b75 702f6578 616d706c 65730075 /bkup/examples.u
0040 6e736967 6e656420 63686172 00737661 nsigned char.sva
0050 72006d61 696e006c 6f6e6720 696e7400 r.main.long int.
0060 6c766172 0073686f 72742075 6e736967 lvar.short unsig
0070 6e656420 696e7400 67766172 31006776 ned int.gvar1.gv
0080 61723200 65766172 00474e55 20432034 ar2.evar.GNU C 4
0090 2e342e36 20323031 31303733 31202852 .4.6 20110731 (R
00a0 65642048 61742034 2e342e36 2d332900 ed Hat 4.4.6-3).
00b0 73686f72 7420696e 7400 short int.
Why is it showing the following?
unsigned char.svar
long int.lvar
short unsigned int.gvar1.gvar2.evar
How does GCC decide which type it should be stored as?
I am using GCC 4.4.6 20110731 (Red Hat 4.4.6-3)
Why is it showing the following?
Simple answer: It is not showing what you think but it is showing:
1 "variables.c"
2 "long unsigned int"
2a "unsigned int"
2b "int"
3 "/users/home40/raop/bkup/examples"
4 "unsigned char"
4a "char"
5 "svar"
6 "main"
7 "long int"
8 "lvar"
9 "short unsigned int"
10 "gvar1"
11 "gvar2"
12 "evar"
13 "GNU C 4.4.6 20110731 (Red Hat 4.4.6-3)"
14 "short int"
The section is named .debug_str; it contains a list of strings which are separated by NUL bytes. These strings are in any order and they are referenced by the section .debug_info. So the fact that svar is following unsigned char has no meaning at all.
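To make the layout concrete, here is a small sketch of how such a NUL-separated string table can be read by byte offset. The helper strtab_at and the table demo_strtab are hypothetical; the table mimics only the first few entries of the dump above, and a real DWARF consumer would take its offsets from .debug_info.

```c
#include <stddef.h>
#include <string.h>

/* Return the string starting at byte `offset` of a NUL-separated
   string table, or NULL if the offset is out of range or the string
   is not terminated inside the table. */
const char *strtab_at(const char *table, size_t size, size_t offset)
{
    if (offset >= size) {
        return NULL;
    }
    if (memchr(table + offset, '\0', size - offset) == NULL) {
        return NULL;
    }
    return table + offset;
}

/* "variables.c" is 11 characters plus a NUL, so the next string
   starts at offset 12 (0x0c), as in the hex dump. */
const char demo_strtab[] = "variables.c\0long unsigned int\0svar";
```

With this table, offset 12 yields "long unsigned int", offset 17 yields "unsigned int" and offset 26 yields "int". That is how one stored string can serve as three names, and why the neighbouring "svar" at offset 30 says nothing about any type.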
The .debug_info section contains the actual debugging information. This section does not contain strings. Instead it will contain information like this:
...
Item 123:
Type of information: Data type
Name: 2b /* String #2b in ".debug_str" is "int" */
Kind of data type: Signed integer
Number of bits: 32
... some more information ...
Item 124:
Type of information: Global variable
Name: 10 /* "gvar1" */
Data type defined by: Item 123
Stored at: Address 0x1234
... some more information ...
Begin item 125:
Type of information: Function
Name: 6 /* "main" */
... some more information ...
Item 126:
Type of information: Local variable
Name: 5 /* "svar" */
Data type defined by: Item 123
Stored at: Address 0x1238
... some more information ...
End item 125 /* Function "main" */
Item 127:
...
You can see this information using the following command:
readelf --debug-dump filename.o
Why does GCC store global and static int differently?
I compiled your example twice: Once with optimization and once without optimization.
Without optimization svar and gvar1 were stored exactly the same way: Data type int, stored on a fixed address. lvar was: Data type int, stored on the stack.
With optimization lvar and svar were stored the same way: Data type: int, not stored at all, instead they are treated as constant value.
(This makes sense because the values of these variables never change.)
The C11 specification (read n1570), like the older C standards, does not define at what addresses or offsets global or static variables are stored, so the implementation (your gcc compiler and your ld linker) is free to put them anywhere.
The organization and layout of the data segments is an implementation detail.
You may want to read more about DWARF to understand debug information, which is useful to the gdb debugger.
You may want to read more about linkers and loaders, and about the ELF format, if you want to understand how they are working. On Linux, there are several utilities to inspect elf(5) files, including objdump(1), readelf(1), nm(1).
Notice that your GCC4.4 is an old and obsolete version of GCC. The current version is GCC7, and GCC8 will be released in a few weeks (spring 2018). I strongly recommend upgrading your compiler.
If you need to understand how and why the data segments are organized in such a way and why your implementation chooses such a layout, you could take advantage of the fact that both gcc and ld (from binutils) are free software, and study their source in detail. You'll need many years of work, since they are complex software (more than ten million lines of source code).
If you happen to start studying the internals of GCC, be sure to study a recent version. Most people in the GCC community have probably forgotten the details of GCC4.4 (released in 2009). A lot of things have changed in GCC since that ancient thing. A few years ago, I wrote many slides about GCC internals; see the documentation of GCC MELT.
BTW, the layout of data segments, or of variables inside them, might vary with optimization options. It might happen that lvar does not sit in memory (e.g. stays in a register only); it could happen that a static variable is removed (using something like the as-if rule) etc.
For a single translation unit foo.c, you might compile it into assembler code using gcc -fverbose-asm -S -O foo.c and look into the emitted foo.s assembler code.
To understand more about how your ld linker works, you might look into some relevant linker script. You could find out how ld is invoked from gcc by using gcc -v (instead of gcc) in your compilation and linking command.
In most cases, you should not care about the particular offsets (in object files or executables) or addresses (in the virtual address space of your process) of global or static variables. Be also aware of ASLR. The proc(5) filesystem can be used to understand your process.
(your question is severely lacking some motivation and context)

The parameter type is not valid for a function of this linkage type

I'm working on AIX with IBM's XL C compiler. I'm catching a compile error and I'm not sure how to proceed:
$ xlc -g3 -O0 -qarch=pwr8 -qaltivec fips197-p8.c -o fips197-p8.exe
"fips197-p8.c", line 59.16: 1506-754 (W) The parameter type is not valid for a function of this linkage type.
The relevant source code is shown below. The complete source code is available at fips197-p8.c. The source code is a test driver for Power 8 __vcipher and __vcipherlast. It has a main and a few C functions. Effectively it is a minimal complete working example for Power 8 AES.
$ cat -n fips197-p8.c
...
11 #if defined(__xlc__) || defined(__xlC__)
12 // #include <builtins.h>
13 #include <altivec.h>
14 typedef vector unsigned char uint8x16_p8;
15 typedef vector unsigned int uint64x2_p8;
16 #else
17 #include <altivec.h>
18 typedef vector unsigned char uint8x16_p8;
19 typedef vector unsigned long long uint64x2_p8;
20 #endif
...
52 uint8x16_p8 Load8x16(const uint8_t src[16])
53 {
54 #if defined(__xlc__) || defined(__xlC__)
55 /* IBM XL C/C++ compiler */
56 # if defined(__LITTLE_ENDIAN__)
57 return vec_xl_be(0, src);
58 # else
59 return vec_xl(0, src);
60 # endif
61 #else
62 /* GCC, Clang, etc */
63
64 #endif
65 }
The compiler version is shown below. We don't control the compiler, so this is what we have:
$ xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0000
vec_xl is fine on little-endian. vec_xl on big-endian is giving us the trouble.
What is the problem, and how do I fix it?
So a little guesswork (confirmed by OP comments since it works) led me to think that this cryptic & obscure "The parameter type is not valid for a function of this linkage type." message (google first match is this question !) could be a qualifier issue.
Since your contract is
uint8x16_p8 Load8x16(const uint8_t src[16])
it is possible that, given the options & the current endianness, the compiler/prototype believes that vec_xl_be expects a non-const parameter as src.
So passing a const violates the contract (and that's the nicest way xlc could find to notify you).
So either change to
uint8x16_p8 Load8x16(uint8_t src[16])
(with the risk of dropping constant constraints for all callers)
or drop the const by a non-const cast (like we do when the prototype lacks const, but the data is in fact not modified in the function):
vec_xl_be(0,(uint8_t*)src);
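The second option can be sketched in portable C without AltiVec. Here legacy_load_sum is a hypothetical stand-in for an intrinsic whose prototype lacks const, and load_sum16 is a hypothetical wrapper that keeps const in its own contract; the cast is only safe on the assumption that the callee really does not write through the pointer.

```c
#include <stdint.h>

/* Hypothetical stand-in for a routine whose prototype takes a
   non-const pointer even though it only reads the data. */
uint32_t legacy_load_sum(long offset, uint8_t *src)
{
    uint32_t sum = 0;
    for (int i = 0; i < 16; i++) {
        sum += src[offset + i];
    }
    return sum;
}

/* The wrapper keeps const for its callers and drops it only at the
   call site, where we assume the callee does not modify the buffer. */
uint32_t load_sum16(const uint8_t src[16])
{
    return legacy_load_sum(0, (uint8_t *)src);
}
```

This keeps the const guarantee visible to every caller of the wrapper, at the cost of one cast that has to be re-checked if the callee ever changes.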

eeprom_write/read functions cause compilation error (PIC microcontroller)

I am using MPLAB X and the XC8 compiler for the PIC16F690 microcontroller.
From reading the XC8 documentation, it seems that the eeprom_write/read functions are included in the xc.h header. However, MPLAB X does not recognize the functions (unable to resolve identifier eeprom_write/read), and the code will not compile. Is other initialization required to use the EEPROM?
The variables I am trying to store are both unsigned chars that are smaller than 1 byte. Out of context, this is how they are being formatted ("final" being a previously declared char):
eeprom_write(0x00, final);
int x = (int) eeprom_read(0x00);

Never seen before C method of initialization of an array of structs found in the Linux kernel source

typedef struct pidmap {
        atomic_t nr_free;
        void *page;
} pidmap_t;

static pidmap_t pidmap_array[PIDMAP_ENTRIES] =
         { [ 0 ... PIDMAP_ENTRIES-1 ] = { ATOMIC_INIT(BITS_PER_PAGE), NULL } };
The code snippet above shows the initialization of an array of structs that I found in the Linux kernel source. I have never seen this form of initialization before and I couldn't reproduce it on my own. What am I missing?
Source of the code
It is a GNU/GCC extension to Designated Initializers. Designated initializers themselves are standard since C99, but the range form [first ... last] is not. You can find more information about it in the GCC documentation.
To initialize a range of elements to the same value, write [first ... last] = value. This is a GNU extension.
It is done by using a Designated Initializer.
The range form is a gcc extension and not a standard C construct. Using it results in non-portable code, so avoid such compiler extensions unless portability is the least of your concerns.
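For comparison, here is a minimal sketch of the range form next to a portable alternative. The names DEMO_ENTRIES, demo_gnu, demo_portable and demo_portable_init are hypothetical, and the __GNUC__ guard assumes a compiler that defines that macro also accepts the range syntax (GCC and Clang do).

```c
#include <stddef.h>

#define DEMO_ENTRIES 8

#if defined(__GNUC__)
/* GNU extension: every element from index 0 to DEMO_ENTRIES-1 is
   initialized to -1 at compile time. */
int demo_gnu[DEMO_ENTRIES] = { [0 ... DEMO_ENTRIES - 1] = -1 };
#endif

/* Portable alternative: standard C has no range designator, so the
   array is filled at run time instead. */
int demo_portable[DEMO_ENTRIES];

void demo_portable_init(void)
{
    for (size_t i = 0; i < DEMO_ENTRIES; i++) {
        demo_portable[i] = -1;
    }
}
```

The extension keeps the initialization static, which matters for objects that must be valid before any code runs; the portable version trades that for an explicit init call.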

Resources