I'm using a library to register some structs at compile time. In this case it's registering a struct representing a JSON-RPC method that I'd like to expose. The library marks the structs with __attribute__((section("xautodata_" "somename"))) so that they are put in a separate section that can later be retrieved. The generated content will look like this:
static const autodata_json_command_ *__attribute__((__used__)) __attribute__((section("xautodata_" "json_command"))) autodata_json_command_151 = (&help_command);;
static const autodata_json_command_ *__attribute__((__used__)) __attribute__((section("xautodata_" "json_command"))) autodata_json_command_173 = (&stop_command);;
The code that later retrieves the commands will get a pointer to the section (and count the number of elements in that section) and iterate over it, like this:
size_t count;
struct json_command **commands = get_json_commands(&count);
for (size_t i = 0; i < count; i++) {
    // Access commands[i];
}
This works perfectly fine if we don't compile with -fsanitize=address, but with -fsanitize=address the compiler adds padding between the entries in the section.
Without the address sanitizer the commands are adjacent to each other, i.e., commands[0] and commands[1] are valid pointers to structs. With the sanitizer only every 8th entry is a valid pointer (presumably due to padding).
Now for the real question: what's the cleanest way to fix this? Should I try to make the step size larger (in which case a preprocessor directive is needed to detect sanitizer use)? Or is there a way to disable this padding for things in the section?
GCC Asan deliberately avoids instrumenting variables in custom sections for the reasons you outlined (i.e. to preserve consecutiveness):
/* Don't protect if using user section, often vars placed
   into user section from multiple TUs are then assumed
   to be an array of such vars, putting padding in there
   breaks this assumption. */
|| (DECL_SECTION_NAME (decl) != NULL
    && !symtab_node::get (decl)->implicit_section
    && !section_sanitized_p (DECL_SECTION_NAME (decl)))
(from gcc/asan.c). A special flag -fsanitize-sections=wildcard1,wildcard2,... can be used to force instrumentation in this case.
Clang Asan on the other hand ignores user section annotations (see AddressSanitizer.cpp).
I suggest filing a PR in the Asan tracker to either make Clang behave like GCC or add a special flag to control instrumentation of user sections (in the latter case we would also need to update the Asan Clang/GCC incompatibility wiki).
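Until that is resolved, one possible workaround on the Clang side is to exclude the registered pointers from instrumentation with the no_sanitize("address") attribute, which Clang accepts on global variables. The following is only a minimal sketch, assuming an ELF target and a GNU-style linker that synthesizes __start_/__stop_ symbols for sections whose names are valid C identifiers. REGISTER_COMMAND, NO_ASAN_GLOBAL and count_commands are hypothetical stand-ins, not the library's real macros:

```c
#include <stddef.h>

/* Assumption: Clang honours no_sanitize("address") on globals, so no
   redzone padding is placed around the annotated variables. */
#if defined(__clang__)
#  define NO_ASAN_GLOBAL __attribute__((no_sanitize("address")))
#else
#  define NO_ASAN_GLOBAL /* GCC already skips user sections by default */
#endif

struct json_command { const char *name; };

static struct json_command help_command = { "help" };
static struct json_command stop_command = { "stop" };

/* Hypothetical registration macro: stores one pointer in the section */
#define REGISTER_COMMAND(id, cmd)                                  \
    static struct json_command *registered_##id                    \
        __attribute__((used, section("xautodata_json_command")))   \
        NO_ASAN_GLOBAL = (cmd)

REGISTER_COMMAND(0, &help_command);
REGISTER_COMMAND(1, &stop_command);

/* Provided by the linker (assumption: GNU ld or a compatible linker) */
extern struct json_command *__start_xautodata_json_command[];
extern struct json_command *__stop_xautodata_json_command[];

/* Number of pointer-sized slots in the section */
static size_t count_commands(void)
{
    return (size_t)(__stop_xautodata_json_command
                    - __start_xautodata_json_command);
}
```

If the entries are contiguous, count_commands() equals the number of registrations and every slot holds a valid pointer. If padding were present, the count would be larger and some slots would be empty.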
Are these attributes incompatible? The address attribute seems to be ignored, emitting no warnings (-Wall).
(For reference, EEMEM is defined in eeprom.h as: #define EEMEM __attribute__((section(".eeprom"))).)
Using a declaration like:
uint8_t storedFlags EEMEM __attribute__((address (100)));
(and similarly for all the others) results in the variables being placed in whatever order the linker prefers, ignoring my attribute. Order of attributes doesn't make a difference.
I am aware of the preferred method (creating sections and passing their locations to the linker). I was just looking to shove them around for the moment, as I'm in active development and adding and removing allocations in EEPROM; I'd rather things not move around every other build so I don't have to reprogram EEPROM from default values every damn time. Worst of all, I'm sure I've done precisely this before, and had it work. Version differences? Coincidental assignments? (I have GCC 3.4 and 8.1, not sure what that project used; I'm using 8.1 for this one.)
The documentation for the address attribute states:
Variables with the address attribute are used to address memory-mapped peripherals that may lie outside the io address range.
Looking at the AVR memory space shows the I/O addresses fall under the SRAM data memory space.
This explains why your construct doesn't work as expected since EEMEM and the address attribute map to conflicting memory sections.
Edit: Testing with avr-gcc 3.6.2 suggests that the section attribute overrides the address attribute (without warning). Using eeprom_read_byte to read data from EEPROM, the following example is compiled correctly by avr-gcc (correctly because the address 0x0123 is passed to the eeprom_read_byte function):
#include <avr/eeprom.h>
uint8_t __attribute__((address (0x0123))) storedFlags;
int main(void){
    if (eeprom_read_byte(&storedFlags) == 1){
        return 1;
    }
}
Edit2: tested on avr-gcc 11.1, also generates correct instructions.
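If the goal is simply to stop EEPROM variables from moving around between builds, one option that does not rely on the address attribute is to group them in a single struct placed in EEMEM, since the order of struct members is fixed by the language. A minimal sketch, assuming hypothetical members (only storedFlags comes from the question) and an empty fallback for EEMEM so that it also compiles off-target:

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#  include <avr/eeprom.h>   /* provides EEMEM */
#else
#  define EEMEM             /* assumption: empty fallback for host builds */
#endif

/* Hypothetical layout; only storedFlags is taken from the question. */
struct eeprom_layout {
    uint8_t  storedFlags;
    uint8_t  reserved[3];   /* spare bytes so later additions need not shift the rest */
    uint16_t calibration;   /* hypothetical member */
};

/* One object holds everything, so relative offsets cannot be reordered */
struct eeprom_layout eeprom_data EEMEM;

/* Offset of a member relative to the start of the struct */
#define EEPROM_OFFSET(member) offsetof(struct eeprom_layout, member)
```

The linker still decides where eeprom_data itself goes, but if it is the only object in .eeprom it would normally sit at the start of that section. Members would then be read with calls such as eeprom_read_byte(&eeprom_data.storedFlags).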
Will this compile and work as meant under Linux GCC ?
In the LoRa Gateway Stack hosted at Github I found the following construct in loragw_hal.h
enum lgw_radio_type_e {
    LGW_RADIO_TYPE_NONE,
    LGW_RADIO_TYPE_SX1255,
    LGW_RADIO_TYPE_SX1257
};
#define LGW_RF_CHAIN_NB 2 /* number of RF chains */
and then in loragw_hal.c
static enum lgw_radio_type_e rf_radio_type[LGW_RF_CHAIN_NB];
edit: the array is not initialized at any place in the code
and then in the function
setup_sx125x(uint8_t rf_chain, uint32_t freq_hz)
the following switch statement is used to select the rf chain according to the rf_chain argument
switch (rf_radio_type[rf_chain]) {
    case LGW_RADIO_TYPE_SX1255:
        // some code
        break;
    case LGW_RADIO_TYPE_SX1257:
        // some code
        break;
    default:
        DEBUG_PRINTF("ERROR: UNEXPECTED VALUE %d FOR RADIO TYPE\n",
                     rf_radio_type[rf_chain]);
        break;
}
rf_chain argument is set to 1, when the function is called, and it selects the default Error 'unexpected rf chain' of course.
Support at the copyright holder, Semtech Inc., always points to this code as the reference if you have any problems with their product.
But I have the feeling that this code wouldn't run anyway without any modifications.
So my question to the forum here is, aside from that this construct above makes not really sense, is that not a faulty construct anyway ?
Will this compile and work as meant under Linux GCC ?
I try to use this code under GCC ARM and it does NOT work as it seems to be planned.
You seem to be trying to draw attention to this:
enum lgw_radio_type_e {
    LGW_RADIO_TYPE_NONE,
    LGW_RADIO_TYPE_SX1255,
    LGW_RADIO_TYPE_SX1257
};
#define LGW_RF_CHAIN_NB 2 /* number of RF chains */
[...]
static enum lgw_radio_type_e rf_radio_type[LGW_RF_CHAIN_NB];
[...] the array is not initialized at any place in the code
It is not a particular problem that the array is not explicitly initialized. File-scope variables (and static block-scope variables) are subject to default initialization if no explicit initializer is provided. In this case, the array declaration is equivalent to
static enum lgw_radio_type_e rf_radio_type[2] = {
LGW_RADIO_TYPE_NONE, LGW_RADIO_TYPE_NONE
};
That seems to be quite sensible in itself.
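The implicit initialization can be checked with a small stand-alone sketch. The enum and the array are copied from the question; count_unset_chains is a hypothetical helper added only for illustration:

```c
#include <stddef.h>

enum lgw_radio_type_e {
    LGW_RADIO_TYPE_NONE,      /* first enumerator, so its value is 0 */
    LGW_RADIO_TYPE_SX1255,
    LGW_RADIO_TYPE_SX1257
};
#define LGW_RF_CHAIN_NB 2 /* number of RF chains */

/* No initializer: static storage duration means every element starts as 0 */
static enum lgw_radio_type_e rf_radio_type[LGW_RF_CHAIN_NB];

/* Hypothetical helper: counts the chains still at the implicit default */
static size_t count_unset_chains(void)
{
    size_t n = 0;
    for (size_t i = 0; i < LGW_RF_CHAIN_NB; i++) {
        if (rf_radio_type[i] == LGW_RADIO_TYPE_NONE) {
            n++;
        }
    }
    return n;
}
```

On a hosted implementation, calling count_unset_chains() before anything writes to the array reports that both chains are LGW_RADIO_TYPE_NONE, which is why the switch in the question ends up in its default branch.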
You go on to say,
[...] when the function is called, and it selects the default Error 'unexpected rf chain' of course.
I don't see any reason to expect a different case to be selected, but neither do I see any justification for assuming that a different one would not be selected. Nor is it clear under what circumstances the switch itself is executed at all.
One would normally expect one or both elements of rf_radio_type to be set during driver initialization if in fact the corresponding hardware is present. If the overall code (not just the parts you've presented) is correct, then probably it will not execute the presented switch when rf_radio_type[rf_chain] has a value different from both LGW_RADIO_TYPE_SX1255 and LGW_RADIO_TYPE_SX1257. On the other hand, printing the error message is essentially harmless in itself; if the driver prints it then that may be merely a quality-of-implementation issue, not a functional flaw.
So my question to the forum here is, aside from that this construct
above makes not really sense, is that not a faulty construct anyway ?
No, it isn't. And as far as I can tell, all constructs presented make as much sense as can be expected when taken out of context as they have been.
Will this compile and work as meant under Linux GCC ?
You have presented several individually valid C fragments, but they do not together constitute a valid translation unit. It is possible to form a complete, valid translation unit containing all those fragments that will compile successfully and do absolutely anything. The fragments will not inherently interfere with compilation, nor necessarily cause malfunction.
I try to use this code under GCC ARM and it does NOT work as it seems to be planned.
I find your apparent confidence in your assessment of the intended behavior of the overall code to be a bit optimistic.
edit: the array is not initialized at any place in the code
As pointed out in another answer, variables with static storage duration are required by the C standard to be implicitly initialized to zero if the programmer didn't initialize them explicitly. So this code is fine as far as the C standard is concerned.
However, writing code relying on initialization of static storage duration variables in .bss is recognized as bad practice in embedded systems programming. This is because the code that does the copy-down of .data and zero initialization of .bss is often omitted on embedded systems, as a very common non-standard practice in order to speed up program start-up.
Such a non-standard option is usually called "minimal/compact/fast start-up" or similar in the compiler options. If you have such an option enabled - which is quite common - the code won't work.
Good practice is to initialize such variables later on in "run-time" instead, before they are used for the first time.
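A minimal sketch of that practice, assuming hypothetical function names (lgw_radio_type_reset and lgw_radio_type_get are not part of the Semtech code): the array is given its starting values by code that runs at start-up, so it does not depend on .bss having been zeroed.

```c
enum lgw_radio_type_e {
    LGW_RADIO_TYPE_NONE,
    LGW_RADIO_TYPE_SX1255,
    LGW_RADIO_TYPE_SX1257
};
#define LGW_RF_CHAIN_NB 2 /* number of RF chains */

static enum lgw_radio_type_e rf_radio_type[LGW_RF_CHAIN_NB];

/* Hypothetical init routine: call once at start-up, before first use */
void lgw_radio_type_reset(void)
{
    for (unsigned i = 0; i < LGW_RF_CHAIN_NB; i++) {
        rf_radio_type[i] = LGW_RADIO_TYPE_NONE;
    }
}

/* Hypothetical accessor; rf_chain must be below LGW_RF_CHAIN_NB */
enum lgw_radio_type_e lgw_radio_type_get(unsigned rf_chain)
{
    return rf_radio_type[rf_chain];
}
```

With this in place the array has a known state even when the start-up code skips zero initialization, at the cost of one extra call during initialization.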
Summary: the code is sloppily written, since the intention here is to provide portable code across many different microcontroller platforms, rather than to provide code for some PC. I would guess it was written by some kind of PC programmer, as is often the case for these protocol stacks.
Is it possible for a GCC plugin to add a new builtin function? If so, how to do it properly?
GCC version is 5.3 (or newer). The code is compiled and processed by the plugin written in C.
It is mentioned in the rationale for GCC plugins at gcc-melt.org that this is doable but I cannot see how.
As far as I can see in the sources of GCC, the builtins are created using add_builtin_function() from gcc/langhooks.c:
tree
add_builtin_function (const char *name,
                      tree type,
                      int function_code,
                      enum built_in_class cl,
                      const char *library_name,
                      tree attrs)
It is more or less clear which values the arguments of this function should have, except for function_code, a unique numeric ID of the function.
Looks like (see add_builtin_function_common()), a value from enum built_in_function is expected there but a GCC plugin cannot change that enum.
One cannot pass any random value greater than END_BUILTINS as function_code either, it seems. builtin_decl_implicit() and builtin_decl_explicit() would have a failed assertion in that case.
So, what is the proper way to add a builtin in a GCC plugin (without using MELT and such, just GCC plugin API)?
Update
I looked again at the implementation of add_builtin_function_common() and of langhooks.builtin_function() for C as well as at how these are used in GCC. It seems that 0 is acceptable as function_code in some cases. You cannot use builtin_decl_implicit() then but you can save the DECL returned by add_builtin_function() and use it later.
Looks like the only event when I can try to create built-ins that way is PLUGIN_START_UNIT (otherwise GCC may crash due to external_scope variable being NULL).
I tried the following at that stage (fntype was created before):
decl = add_builtin_function (
    "my_helper", fntype,
    0 /* function_code */,
    BUILT_IN_NORMAL /* enum built_in_class cl */,
    NULL /* library_name */,
    NULL_TREE /* attrs */);
my_helper was defined in a different C source file compiled and linked with the main source file. Then I used decl to insert the calls to that function into other functions (gimple_build_call) during my GIMPLE pass.
GCC output no errors and indeed inserted the call to my_helper, but as a call to an ordinary function. What I actually needed was a builtin, so that the body of the function is inserted instead of a call.
On the other hand, tsan0 pass, which executes right after my pass, inserts the calls of builtin functions just like one would expect: there is no explicit call as a result, just the body of the function is inserted. Its builtins, however, are defined by GCC itself rather than by the plugins.
So I suppose my builtin still needs something to be a valid builtin, but I do not know what it is. What could that be?
I'm assuming what you want to do (from your comment and linked post) is insert C code into a function. In that case, I would have thought you wouldn't need to go so far as to write a compiler plugin. Have a look at Boost.Preprocessor, which can do very advanced manipulations of C code using only the preprocessor.
I understand that a function pointer points to the starting address of the code for a function. But is there any way to be able to point to the end of the code of a function as well?
Edit: Specifically on an embedded system with a single processor and no virtual memory. No optimisation too. A gcc compiler for our custom processor.
I wish to know the complete address range of my function.
If you put the function within its own special linker section, then your toolchain might provide a pointer to the end (and the beginning) of the linker section. For example, with Green Hills Software (GHS) MULTI compiler I believe you can do something like this:
#pragma ghs section text=".mysection"
void MyFunction(void) { }
#pragma ghs section
That will tell the linker to locate the code for MyFunction in .mysection. Then in your code you can declare the following pointers, which point to the beginning and end of the section. The GHS linker provides the definitions automatically.
extern char __ghsbegin_mysection[];
extern char __ghsend_mysection[];
I don't know whether GCC supports similar functionality.
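For what it's worth, the GNU toolchain has a comparable mechanism on ELF targets: when a section name is a valid C identifier, GNU ld synthesizes __start_<name> and __stop_<name> symbols for it. Whether a GCC port for a custom processor behaves the same way depends on its linker, so treat the following as a sketch under those assumptions. The section name mysection and the helper functions are illustrative only:

```c
#include <stddef.h>
#include <stdint.h>

/* Place the function in its own section; the name must be a valid C identifier */
__attribute__((section("mysection"), noinline))
int my_function(int x)
{
    return 2 * x + 1;
}

/* Synthesized by GNU ld for the section above (assumption: ELF target) */
extern char __start_mysection[];
extern char __stop_mysection[];

/* Size in bytes of everything the linker placed in mysection */
size_t mysection_size(void)
{
    return (size_t)(__stop_mysection - __start_mysection);
}

/* Nonzero if addr lies inside [__start_mysection, __stop_mysection) */
int in_mysection(uintptr_t addr)
{
    return addr >= (uintptr_t)__start_mysection
        && addr <  (uintptr_t)__stop_mysection;
}
```

If my_function is the only code in that section, the two symbols bracket its complete address range, including any alignment padding the toolchain adds.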
You didn't say why you need this information, but on some embedded systems it's required to copy a single function from flash to RAM in order to (re)program the flash.
Normally you place such functions into a new, unique section and, depending on your linker, you can copy this section with pure C or with assembler to the new (RAM) location.
You also need to tell the linker that the code will run from a different address than the one where it is placed in flash.
In a project the flash.c could look like
#pragma define_section pram_code "pram_code.text" RWX
#pragma section pram_code begin
uint16_t flash_command(uint16_t cmd, uint16_t *addr, uint16_t *data, uint16_t cnt)
{
    ...
}
#pragma section pram_code end
The linker command file looks like
.prog_in_p_flash_ROM : AT(Fpflash_mirror) {
    Fpram_start = .;
    # OBJECT(Fflash_command,flash.c)
    * (pram_code.text)
    . = ALIGN(2);
    # save data end and calculate data block size
    Fpram_end = .;
    Fpram_size = Fpram_end - Fpram_start;
} > .pRAM
But as others said, this is very toolchain specific
There is no way in C to point to the end of a function. A C compiler has a lot of latitude in how it arranges the machine code it emits during compilation. With various optimization settings, a C compiler may even merge or interleave the machine code of different functions.
On top of whatever the C compiler does, the linker and the loader also play a part: they combine the compiled pieces of object code and then load the application, which may also be using various kinds of shared libraries.
In the complex running environment of modern operating systems and modern development tool chains, unless the language provides a specific mechanism for doing something, it is prudent to not try to get fancy leaving yourself open to an application which suddenly stops working due to changes in the operating environment.
In most cases if you use a non-optimizing setting of the compiler with static linked libraries, the symbol map that most linkers provide will give you a good idea as to where functions begin and end. However the only thing you can really depend on is knowing the address of the function entry points.
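In some implementations (including gcc) you could do something like this (but it's not guaranteed and lots of implementation details could affect it):
#include <stdio.h>

int foo(void) {
    printf("testing\n");
    return 7;
}
void foo_end(void) { }
long sizeof_foo(void) {
    // assumes foo_end is emitted directly after foo and that there are
    // no code size optimizations across functions (not guaranteed)
    // function could be smaller than reported (padding is included)
    // reports size, not range
    // casting function pointers to char * is a compiler extension
    return (long)((char *)foo_end - (char *)foo);
}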
I'm programming an embedded powerpc 32 system with a 32 kbyte 8-way set associative L2 instruction cache. To avoid cache thrashing we align functions in a way such that the text of a set of functions called at a high frequency (think interrupt code) ends up in separate cache sets. We do this by inserting dummy functions as needed, e.g.
void high_freq1(void)
{
    ...
}
void dummy(void)
{
    __asm__(/* Silly opcodes to fill ~100 to ~1000 bytes of text segment */);
}
void high_freq2(void)
{
    ...
}
This strikes me as ugly and suboptimal. What I'd like to do is:
- avoid __asm__ entirely and use pure C89 (maybe C99)
- find a way to create the needed dummy() spacer that the GCC optimizer does not touch
- the size of the dummy() spacer should be configurable as a multiple of 4 bytes. Typical spacers are 260 to 1000 bytes.
- should be feasible for a set of about 50 functions out of a total of 500 functions
I'm also willing to explore entirely new techniques of placing a set of selected functions in a way so they aren't mapped to the same cache lines. Can a linker script do this?
Use GCC's __attribute__(( aligned(size) )).
Or, pass -falign-functions=n on your GCC command line.
GCC Function Attributes
GCC Optimize Options
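As a rough illustration of the attribute form, the sketch below aligns two functions to a 256-byte boundary and checks the resulting addresses. The value 256, the counter and the helper is_aligned are assumptions made for the example; the alignment that actually keeps functions in separate cache sets depends on the cache geometry:

```c
#include <stdint.h>

static volatile unsigned counter1; /* gives each function a distinct body */
static volatile unsigned counter2;

/* Each function starts on a 256-byte boundary (illustrative value) */
__attribute__((aligned(256))) void high_freq1(void)
{
    counter1++;
}

__attribute__((aligned(256))) void high_freq2(void)
{
    counter2 += 2;
}

/* Hypothetical helper: nonzero if fn starts on a multiple of alignment.
   Assumes function pointers are plain code addresses, which is not true
   on every target (for example ARM Thumb sets the low bit). */
int is_aligned(void (*fn)(void), uintptr_t alignment)
{
    return ((uintptr_t)fn % alignment) == 0;
}
```

Note that alignment alone only fixes where each function starts; it does not control how far apart the functions end up, so the linker-script approaches give finer control over which cache sets they occupy.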
Maybe linker scripts are the way to go. The GNU linker can use these, I think... I've used LD files for the AVR and on MQX, both of which were using GCC-based compilers... might help...
You can define your memory sections etc. and what goes where... Each time I come to write one, it's been so long since the last that I have to read up again...
Have a search for SVR3-style command files to get up to speed.
DISCLAIMER: Following example for a very specific compiler... but the SVR3-like format is pretty general... you'll have to read up for your system
For example you can use commands like...
ApplicationStart = 0x...;
MemoryBlockSize = 0x...;
ApplicationDataSize = 0x...;
ApplicationLength = MemoryBlockSize - ApplicationDataSize;
MEMORY {
    RAM: ORIGIN = 0x... LENGTH = 1M
    ROM: ORIGIN = ApplicationStart LENGTH = ApplicationLength
}
This defines two memory regions (RAM and ROM) for the linker. Then you can say things like
SECTIONS
{
    GROUP :
    {
        .text :
        {
            * (.text)
            * (.init , '.init$*')
            * (.fini , '.fini$*')
        }
        .my_special_text ALIGN(32):
        {
            * (.my_special_text)
        }
        .initdat ALIGN(4):
        // Blah blah
    } > ROM
    // SNIP
}
The SECTIONS command tells the linker how to map input sections into output sections, and how to place the output sections in memory... Here we're saying what is going into the ROM output section, which we defined in the MEMORY definition above. The bit possibly of interest to you is .my_special_text. In your code you can then do things like...
__attribute__ ((section(".my_special_text")))
void MySpecialFunction(...)
{
    ....
}
The linker will put any function preceded by the __attribute__ statement into the .my_special_text section. In the above example this is placed into ROM on the next 32-byte aligned boundary after the .text section, but you can put it anywhere you like. So you could make a few sections, one for each of the functions you describe, and make sure the addresses won't cause clashes...
You can get the size and memory location of the section using linker-defined variables of the form
extern char _fsection_name[]; // Set to the address of the start of section_name
extern char _esection_name[]; // Set to the first byte following section_name
So for this example...
extern char _fmy_special_text[]; // Set to the address of the start of .my_special_text
extern char _emy_special_text[]; // Set to the first byte following .my_special_text
If you are willing to expend some effort, you can use
__attribute__((section(".text.hotpath.a")))
to place the function into a separate section, and then in a custom linker script explicitly place the functions.
This gives you a bit more fine-grained control than simply asking for the functions to be aligned, but requires more hand-holding.
Example, assuming that you want to lock 4KiB into cache:
SECTIONS {
    .text.hotpath.one BLOCK(0x1000) {
        *(.text.hotpath.a)
        *(.text.hotpath.b)
    }
}
ASSERT(SIZEOF(.text.hotpath.one) <= 0x1000, "Hot Path functions do not fit into 4KiB")
This will make sure the hot path functions a and b are next to each other and both fit into the same block of 4 KiB that is aligned on a 4 KiB boundary, so you can simply lock that page into the cache; if the code doesn't fit, you get an error.
You can even use
NOCROSSREFS(.text.hotpath.one .text)
to forbid hot path functions calling other functions.
Assuming you're using GCC and GAS, this may be a simple solution for you:
void high_freq1(void)
{
    ...
}
asm(".org .+288"); /* Advance location by 288 bytes */
void high_freq2(void)
{
    ...
}
You could, possibly, even use it to set absolute locations for the functions rather than using relative increments in address, which would insulate you from consequences due to the functions changing in size when/if you modify them.
It's not pure C89, for sure, but it may be less ugly than using dummy functions. :)
(Then again, it should be mentioned that linker scripts aren't standardized either.)
EDIT: As noted in the comments, it seems to be important to pass the -fno-toplevel-reorder flag to GCC in this case.