How macros in embedded C affect memory? - c

I know that macros in C such as:
#define VARNULL (u8)0
doesn't store this VARNULL in RAM, but this of course will increase the code size in the FLASH.
But what if I have a multi-line macro such as:
#define CALL_FUNCS(x) \
do { \
func1(x); \
func2(x); \
func3(x); \
} while (0)
By knowing that func1, func2, and func3 are functions from different .c files. Does this means that these functions will be stored in RAM? And of course in the FLASH (the code).
Kindly correct me if I'm wrong?

You keep saying that "of course" the macros will be "stored" in flash memory on your target device, but that is not true.
The macros exist in the source code only; they are replaced with their defined values during compilation. The program in flash memory will not "contain" them in any meaningful way.

Macros, and any other directive prefixed with a # are processed before C compilation by the pre-processor; they do not generate any code, but rather generate source code that is then processed by the compiler as if you had typed in the code directly. So in your example the code:
int main()
{
CALL_FUNCS(2) ;
}
Results in the following generated source code:
int main()
{
do { \
func1(2);
func2(2);
func3(2);
} while (0) ;
}
Simple as that. If you never invoke the macro, it will generate exactly no code. If you invoke it multiple times, it will generate code multiple times. There is nothing clever going on the macro is merely a textual replacement generated before compilation; what the compiler does with that depends entirely on what the macro expands to and not the fact that it is a macro - the compiler sees only the generated code, not the macro definition.
With respect to const vs #define, a literal constant macro is also jyst a textual replacement and will be placed in the code as a literal constant. A const on the other hand is a variable. The compiler may simply insert a literal constant where that generates less code that fetching the constant from memory, in C++ that is guaranteed for simple types, and it would be unusual for a C compiler not to behave in the same way. However, because it is a variable you can take it's address - if your code does take the address of a const, then the const will necessarily have storage. Whether that storage is in RAM or ROM depends on your compiler and linker configuration - you should consult the toolchain documentation to see how it handles const storage.
One benefit of using a const is that const variables have strong typing and scope unlike macros.

Related

How does function-like macros work, this example

void someOtherFunction(void)
{
...
}
//Is it possible to call the #define like this, in the global scope? When does the execution come here?
SYSTEM_CONTROL_REGISTER_INIT_FUNCTION( credHandlerInit, LEVEL_APPLICATION, SYSTEM_CONTROL_ORDER_DONT_CARE );
void credHandlerInit(void)
{
portBASE_TYPE result;
result = xTaskCreate( credHandlerTask,
(portCHAR *) MBS_CFG_TASK_NAME_CRED_HANDLER,
MBS_CFG_TASK_STACK_CRED_HANDLER,
NULL,
MBS_CFG_TASK_PRIO_CRED_HANDLER,
&credHandlerTaskHandle );
}
and in a .h-file the following macro is defined:
#define SYSTEM_CONTROL_REGISTER_INIT_FUNCTION( _initFunctionName, \
_initLevel, \
_initOrder ) \
\
void _initFunctionName( void ); \
\
SystemControlInitListRecord const systemControlInitRecord_ ## _initLevel ## _initOrder ## _ ## _initFunctionName \
__attribute__ ((section (".systemControlInitList"))) \
= { \
.name = #_initFunctionName, \
.syncedInitFunction = NULL, \
.unsyncedInitFunction = _initFunctionName, \
.level = _initLevel, \
.order = _initOrder, \
.initType = SYSTEM_CONTROL_RTOS_RUNNING \
}; \
\
void _initFunctionName( void )
What I don't understand is how this function is called?
I do not see any call to this function in the code.
Can someone explain how this work?
Is the code below valid, calling the macro like this?
//main.c
#define SOMETHING(x) (someVariable = x)
static uint32_t someVariable;
SOMETHING(5);
int main(void)
{
printf("%d", someVariable); //should print 5 here then?
}
The intention of this code is the following:
The macro declares a function named credHandlerInit. This name is passed to the _initFunctionName macro parameter name. Whoever wrote the macro didn't quite know what they are doing, so we end up with two function declarations void credHandlerInit(void );, which is OK but very fishy. Perhaps they optionally meant the macro to be followed by the function definition in the form of { ... }. At any rate, not the best idea.
The whole SystemControlInitListRecord const part declares a read-only struct of that type, then names it systemControlInitRecord_LEVEL_APPLICATION_SYSTEM_CONTROL_ORDER_DONT_CARE _credHandlerInit. This is where I'd start to suspect that the programmer who wrote this is paid per letter written... and also that they were not an expert at C programming.
NOTE: identifiers longer than 32 characters is a safety hazard in C, see 5.2.4.1 translation limits. So if this function name will act as an external identifier, which we have all the reasons to believe since it ain't static, a conforming compiler may not be able to distinguish between different functions with the systemControlInitRecord_LEVEL_APPLICATION_ prefix. There's no guarantee of "name mangling" but the compiler might end up generating source that calls the wrong function from the external caller side.
In practice, modern compilers tend to distinguish between far more than 32 characters, but they aren't guaranteed to do so. Best case scenario, the code is non-portable between standard compilers. Worst case, the whole code will go completely haywire on the designated compiler.
So this is a subtle and severe bug who the original programmer was not aware of. Someone will need to slap 'em from inventing such ridiculously long identifier names. It is both dangerous and completely unreadable.
The __attribute__ ((section (".systemControlInitList"))) is a common non-standard extension used by gcc and a bunch of other compilers for creating custom memory segments. These names need to correspond to a memory section in the linker script. Why this variable needs to reside in that memory section, I don't know, but this is obviously part of some embedded system where named memory sections are quite common practice (used for bootloaders, on-chip library code, NVM variables, ISRs, flash drivers etc etc).
The whole { .name = #_initFunctionName, .syncedInitFunction = NULL, ... part is a struct initializer list utilizing _designated initializers. The first member is apparently a string and the # "stringification" operator turns the name string into "credHandlerInit".
The void _initFunctionName( void ), as already mentioned, might optionally begin the function definition, which is then apparently expected to continue on the caller-side. Or otherwise the ; in the macro call makes this a 2nd function declaration.
As you can hopefully tell from my comments, this code is very badly written by someone obsessing in making things as needlessly complicated as possible, whereas the truly good C programmers try to make things as simple as possible.
I would guess it is part of some smelly RTOS source and used for user-side task creation or similar? I would stay clear of this source, since there's lots of code smell and I already found one severe bug from just reading one single macro.
A "function-like macro" just means a macro which takes arguments.
In this example you have a macro which takes arguments but it is not being used like a function at all.
This macro expands to a constant object definition. The object is a structure with information about a function and when to call it. The object is placed in a custom non-standard section called ".systemControlInitList".
In order to use this code you must have a corresponding custom linker control script (usually called something.ld). This will collect all the objects that are placed in the custom section and put them somewhere in the program image, probably ROM. It will put some custom symbol at the the start of the block of such objects, and probably another symbol at the end.
There will then need to be some custom code that runs as part of your application start or boot sequence that looks for that symbol and executes all the functions referred to by the array of structure.

'Reverse' a collection of C preprocessor macros easily

I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
In the real application, each definition corresponds to an instruction in an interpreter virtual machine. The macros are also not sequential in numbering to leave space for future instructions; there may be a #define FOO 41, then the next one is #define BAR 64.
I'm now working on a debugger for this virtual machine, and need to effectively 'reverse' these preprecessor macros. In other words, I need a function which takes the number and returns the macro name, e.g. an input of 2 returns "BAR".
Of course, I could create a function using a switch myself:
const char* instruction_by_id(int id) {
switch (id) {
case FOO:
return "FOO";
case BAR:
return "BAR";
case BAZ:
return "BAZ";
default:
return "???";
}
}
However, this will a nightmare to maintain, since renaming, removing or adding instructions will require this function to be modified too.
Is there another macro which I can use to create a function like this for me, or is there some other approach? If not, is it possible to create a macro to perform this task?
I'm using gcc 6.3 on Windows 10.
You have the wrong approach. Read SICP if you have not read it.
I have a lot of preprocessor macro definitions, like this:
#define FOO 1
#define BAR 2
#define BAZ 3
Remember that C or C++ code can be generated, and it is quite easy to instruct your build automation tool to generate some particular C file (with GNU make or ninja you just add some rule or recipe).
For example, you could use some different preprocessor (liek GPP or m4), or some script -e.g. in awk or Python or Guile, etc..., or write your own program (in C, C++, Ocaml, etc...), to generate the header file containing these #define-s. And another script or program (or the same one, invoked differently) could generate the C code of instruction_by_id
Such basic metaprogramming techniques (of generating some or several C files from something higher level but specific) have been used since at least the 1980s (e.g. with yacc or RPCGEN). The C preprocessor facilitates that with its #include directive (since you can even include lines inside some function body, etc...). Actually, the idea that code is data (and proof) and data is code is even older (Church-Turing thesis, Curry-Howard correspondence, Halting problem). The Gödel, Escher, Bach book is very entertaining....
For example, you could decide to have a textual file opcodes.txt (or even some sqlite database containing stuff....) like
# ignore lines starting with an hashsign
FOO 1
BAR 2
and have two small awk or Python scripts (or two tiny C specialized programs), one generating the #define-s (into opcode-defines.h) and another generating the body of instruction_by_id (into opcode-instr.inc). Then you need to adapt your Makefile to generate these, and put #include "opcode-defines.h" inside some global header, and have
const char* instruction_by_id(int id) {
switch (id) {
#include "opcode-instr.inc"
default: return "???";
}
}
this will a nightmare to maintain,
Not so with such a metaprogramming approach. You'll just maintain opcodes.txt and the scripts using it, but you express a given "knowledge element" (the relation of FOO to 1) only once (in a single line of opcode.txt). Of course you need to document that (at the very least, with comments in your Makefile).
Metaprogramming from some higher-level, declarative formalization, is a very powerful paradigm. In France, J.Pitrat pioneered it (and he is writing an interesting blog today, while being retired) since the 1960s. In the US, J.MacCarthy and the Lisp community also.
For an entertaining talk, see Liam Proven FOSDEM 2018 talk on The circuit less traveled
Large software are using that metaprogramming approach quite often. For example, the GCC compiler have about a dozen of C++ code generators (in total, they are emitting more than a million of C++ lines).
Another way of looking at such an approach is the idea of domain-specific languages that could be compiled to C. If you use an operating system providing dynamic loading, you can even write a program emitting C code, forking a process to compile it into some plugin, then loading that plugin (on POSIX or Linux, with dlopen). Interestingly, computers are now fast enough to enable such an approach in an interactive application (in some sort of REPL): you can emit a C file of a few thousand lines, compile it into some .so shared object file, and dlopen that, in a fraction of second. You could also use JIT-compiling libraries like GCCJIT or LLVM to generate code at runtime. You could embed an interpreter (like Lua or Guile) into your program.
BTW, metaprogramming approaches is one of the reasons why basic compilation techniques should be known by most developers (and not only just people in the compiler business); another reason is that parsing problems are very common. So read the Dragon Book.
Be aware of Greenspun's tenth rule. It is much more than a joke, actually a profound truth about large software.
In a similar case I've resorted to defining a text file format that defines the instructions, and writing a program to read this file and write out the C source of the actual instruction definitions and the C source of functions like your instruction_by_id(). This way you only need to maintain the text file.
As awesome as general code generation is, I’m surprised that nobody mentioned that (if you relax your problem definition just a bit) the C preprocessor is perfectly capable of generating the necessary code, using a technique called X macros. In fact every simple bytecode VM in C that I’ve seen uses this approach.
The technique works as follows. First, there is a file (call it insns.h) containing the authoritative list of instructions,
INSN(FOO, 1)
INSN(BAR, 2)
INSN(BAZ, 3)
or alternatively a macro in some other header containing the same,
#define INSNS \
INSN(FOO, 1) \
INSN(BAR, 2) \
INSN(BAZ, 3)
whichever is more conveinent for you. (I’ll use the first option in the following.) Note that INSN is not defined anywhere. (Traditionally it would be called X, thus the name of the technique.) Wherever you want to loop over your instructions, define INSN to generate the code you want, include insns.h, then undefine INSN again.
In your disassembler, write
const char *instruction_by_id(int id) {
switch (id) {
#define INSN(NAME, VALUE) \
case NAME: return #NAME;
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
default: return "???";
}
}
using the prefix stringification operator # to turn names-as-identifiers into names-as-string-literals.
You obviously can’t define the constants this way, because macros cannot define other macros in the C preprocessor. However, if you don’t insist that the instruction constants be preprocessor constants, there’s a different perfectly serviceable constant facility in the C language: enumerations. Whether or not you use an enumerated type, the enumerators defined inside it are regular integer constants from the point of view of the compiler (though not the preprocessor—you cannot use #ifdef with them, for example). So, using an anonymous enumeration type, define your constants like this:
enum {
#define INSN(NAME, VALUE) \
NAME = VALUE,
#include "insns.h" /* or just INSNS if you use a macro */
#undef INSN
NINSNS /* C89 doesn’t allow trailing commas in enumerations (but C99+ does), and you may find this constant useful in any case */
};
If you want to statically initialize an array indexed by your bytecodes, you’ll have to use C99 designated initializers {[FOO] = foovalue, [BAR] = barvalue, /* ... */} whether or not you use X macros. However, if you don’t insist on assigning custom codes to your instructions, you can eliminate VALUE from the above and have the enumeration assign consecutive codes automatically, and then the array can be simply initialized in order, {foovalue, barvalue, /* ... */}. As a bonus, NINSNS above then becomes equal to the number of the instructions and the size of any such array, which is why I called it that.
There are more tricks you can use here. For example, if some instructions have variants for several data types, the instruction list X macro can call the type list X macro to generate the variants automatically. (The somewhat ugly second option of storing the X macro list in a large macro and not an include file may be more handy here.) The INSN macro may take additional arguments such as the mode name, which would ignored in the code list but used to call the appropriate decoding routine in the disassembler. You can use token pasting operator ## to add prefixes to the names of the constants, as in INSN_ ## NAME to generate INSN_FOO, INSN_BAR, etc. And so on.

Mechanism for including header files in C main body

For a large software developed by C, we first declare all the self-defined functions in a separate header file (e.g. myfun.h). After that, once we write a code (e.g. main.c) that uses the functions listed in myfun.h, we have to #include "myfun.h". I'm wondering how it works, because even if I include the function names declared in header file before the main body, the code cannot see the function details in main.c. I guess it will search the library to get the function details...Am I right?
When you say "it will search the library for the function details" you're not far off, but that isn't quite right. A function declaration, i.e.. a function prototype only contains enough information for the compiler to do two things:
First, the compiler will register the function as a known identifier so that it knows what you're taking about when you call it, as opposed to a random string of letters with parentheses (to the compiler, they are essentially the same thing without a function prototype for either - an error).
Second, the compiler uses the function prototype for checking code correctness. Correctness in this sense means that a function call will match the prototype in both arity and type. In other words a function call to int square(int a, int b); will have two arguments, both integers.
The program doesn't "search the library," though. Function names without parentheses are not function calls but rather function's address. Therefore, when you call a function, the processor jumps to the memory location of the function. (This assumes the function has not been inlined.)
Where is this function located though? It depends. If you wrote the function in the same module, i.e... a .c file that got compiled into an object linked with the main.c file into a single executable, then the location of the function will be somewhere in the .TEXT section of the executable. In other words, it's just a slight offset from the main function's entry point. In a huge project this offset won't be so slight, but it will be shorter than the offset of separate objects.
Having said that, if you compiled this hypothetical function into a DLL which you call from your main program, then the function's address will be determined in one of two ways:
Either you will have generated a .lib/.a? (depending on whether you're on Windows or Linux) file containing the function declaration's and addresses, or:
You will use run-time linking where the main program will calculate the function addresses when it loads the .dll/.so into its address space. First, it will determine where to load it. You can set DLL's to have preferred offsets to optimize load time. Otherwise, libraries will start loading from the first segment available and any additional libraries will need their function address recalculated using this new address, hampering initial load times. Once they are loaded into the program's memory though, there shouldn't be any performance hits thereafter.
Going back to the preprocessor, it's important to note two things. First, it runs before any compilation takes place. This is important. Since the program is not really being "compiled" when the preprocessor is doing its thing, macros are not type-safe. (Insert Haskell joke about C "type safety") This is why you don't -or shouldn't- see macros in C++. Anything that can be accomplished with macros in C can be accomplished by const and inline functions in C++, with the added benefit of type safety.
Second, the preprocessor is almost just a search and replace engine. For example, in the following code, nothing happens because the preprocessor if statement evaluates to false, since I never defined anything. The preprocessor removes the code in this section. Remember that since the compiler has not run in earnest yet, this removed code will not be compiled. This fact is usually utilized to implement functions for debugging or logging in debug builds. In release builds the preprocessor definition is then manipulated such that the debug code is not included.
#include <stdio.h>
#include <stdlib.h>
int main()
{
#if TRUE
printf("Hello, World!");
#endif
return EXIT_SUCCESS;
}
In fact, the EXIT_SUCCESS macro I used is defined in stdlib.h, and replaced by 0. (EXIT_FAILURE =1).
Back in the day, the preprocessor was used as duct tape, basically, to compensate for faults in C.
For example, since const values can't be used as array sizes, macros were used instead, like this:
// Not valid C89, possibly even C99
const int DEFAULT_BUFFER_SIZE = 128;
char user_input[DEFAULT_BUFFER_SIZE];
// Legal since the dawn of time
#define DEFAULT_BUFFER_SIZE 128
char user_input[DEFAULT_BUFFER_SIZE];
Another significant use of the preprocessor was for code portability, for example:
#ifdef WIN32
// Do windows things
#elif
// Handle other OS
#endif
One trick was to define a generic function and set it to the appropriate OS-dependent one (Remember that functions without the parentheses represent the function's address, not an actual function call), like this:
void RequestSomeKernelAction();
#ifdef WIN32
RequestSomeKernelAction = WindowsVersion;
#else
RequestSomeKernelAction = OtherOSFunction;
#endif
This is all to say that the code you see in header files follows these same rules. If I have the following header file:
#ifndef SRC_INCLUDES_TEST_H
#define SRC_INCLUDES_TEST_H
int square(int a);
#endif /** SRC_INCLUDES_TEST_H */
And I have this main.c file:
#define SRC_INCLUDES_TEST_H
#include "test.h"
int main()
{
int n = square(4);
}
This program will not compile. The square function will not be known to main.c because while I did include the header file where square is declared, my #define SRC_INCLUDES_TEST_H statement tells the preprocessor to copy all the header file contents over to main except those in the block where SRC_INCLUDES_TEST_H is defined, i.e... nothing.
These preprocessor commands can be nested, and there are several, which I highly recommend you look up, if only for historical or pedagogical reasons.
The last point I will make is that while the C preprocessor has its faults, it was a powerful tool in the right hands, and in fact, the first C++ compiler Bjarne Stroustroup wrote was essentially just a preprocessor.

Where macros variable created? and size of the variable?

I have doubts about macros, When we create like the following
#define DATA 40
where DATA can be create? and i need to know size also?and type of DATA?
In java we create macro along with data type,
and what about macro function they are all inline function?
Macros are essentially text substitutions.
DATA does not exist beyond the pre-processing stage. The compiler never sees it. Since no variable is created, we can't talk about its data type, size or address.
Macros are literally pasted into the code. They are not "parsed", but expanded. The compiler does not see DATA, but 40. This is why you must be careful because macros are not like normal functions or variables. See gcc's documentation.
A macro is a fragment of code which has been given a name. Whenever
the name is used, it is replaced by the contents of the macro. There
are two kinds of macros. They differ mostly in what they look like
when they are used. Object-like macros resemble data objects when
used, function-like macros resemble function calls.
You may define any valid identifier as a macro, even if it is a C
keyword. The preprocessor does not know anything about keywords. This
can be useful if you wish to hide a keyword such as const from an
older compiler that does not understand it. However, the preprocessor
operator defined (see Defined) can never be defined as a macro, and
C++'s named operators (see C++ Named Operators) cannot be macros when
you are compiling C++.
macro's are not present in your final executable. They present in your source code only.macro's are processed during pre-processing stage of compilation.You can find more info about macro's here
Preprocessor directives like #define are replaced with the corresponding text during the preprocessing phase of compilation, and are (almost) never represented in the final executable.

Why does my #define macro appear to be a global?

I was investigating a compile and link issue within my program when I came across the following macro that was defined in a header and source file:
/* file_A.c */
#ifndef _NVSize
#define _NVSize 1
#endif
/* file_B.c */
#include "My_Header.h"
#ifndef _NVSize
#define _NVSize 1
#endif
/* My_Header.h */
#define _NVSize 1024
Nothing out of the ordinary yet, until I saw the following information in the GCC output map file:
/* My Map File */
...
.rodata 0x08015694 _NVSize
...
My understanding of the map file is that if you see a symbol in the .rodata section of the map file, this symbol is being treated as a global variable by the compiler. But, this shouldn't be the case because macros should be handled by the preprocessor before the compiler even parses the file. This macro should be replaced with it's defined value before compiling.
Is this the standard way that GCC handles macros or is there some implementation specific reason that GCC would treat this as a global (debug setting maybe)? Also, what does this mean if my macro gets redefined in a different source file? Did I just redefine it for a single source file or did I modify a global variable, thereby changing _NVSize everywhere it's used within my program?
I think the compiler is free to assign your macro to a global variable as long as it ensures that this produces the exact same result as if it did a textual replacement.
During the compilation the compiler can mark this global specially to denote that it is a macro constant value, so no re-assignment is possible, no address can be taken, etc.
If you redefine the macro in your sorce, the compiler might not perform this transformation (and treat it as you'd expect: a pre-compier textual replacement), perform it on one of the different values (or on all of them say, using different names for each occurrance), or do domething else :)
Macros are substituted in the preprocessor step, the compiler only sees the substituted result. Thus if it sees the macro name, then my bet is that the macro wasn't defined at the point of usage. It is defined between the specific #define _NVSize and an #undef _NVSize. Redefining an existing macro without using an #undef first should result in a preprocessor error, AFAIR.
BTW, you shouldn't start your macro names with an underscore. These are reserved for the implementation.

Resources