Host testing C program with hard coded memory addresses - c

We will write functional/unit tests for C code. This C program will run as embedded software, but we need to run the tests in a Linux environment.
The problem is that parts of the code under test look like this:
my_addresses.h:
#define MY_BASE_ADDRESS (0x00600000)
#define MY_OFFSET_ADDRESS (0x108)
my_code.c
#include "my_addresses.h"
static const My_Type* my_ptr =
(My_Type*)(MY_BASE_ADDRESS + MY_OFFSET_ADDRESS);
/* my_ptr is dereferenced and used ... */
Obviously, this will not run so well in a Linux host environment.
Is there some way we can work around this issue during testing? Can we somehow "redirect" the program to use other addresses, that are valid addresses to memory allocated during test procedures?
Our first attempt was to replace "my_addresses.h" with another header file during tests, which (extern) declares variables instead of hard defines - then assign malloc'd memory to MY_BASE_ADDRESS, etc. The problem with that is the "static const" declaration in the c file. Of course you cannot assign a variable to a static const type.
Preferably, we should not modify the code under test (although in the worst case it may come to that).

You could check for e.g. the __linux__ macro and use conditional compilation. When on Linux use an array as base, and make it big enough to keep all the data needed in it.
Something like e.g.
#ifdef __linux__
int8_t array[1024];
# define MY_BASE_ADDRESS array
#else
# define MY_BASE_ADDRESS 0x00600000
#endif

In the Linux environment you can define a global array and then use the address of this as your base pointer.
const char my_buffer[1024];
#define my_base_addr (my_buffer)

Assuming
your embedded memory layout is small enough to model unchanged in its entirety (or in a few chunks) on Linux,
your compiler is happy with the constant memory address expressions below, and
My_Type in your example was defined as typedef My_Type1_t * My_Type;
you could (1) separate the definition of the embedded memory layout from deciding how it is placed and perhaps (2) gain some type-safety if you declare a struct for the layout:
#pragma (essential: stuff to force structs to contain no extra padding)
typedef struct {
char pad0[0x108];
My_Type1_t foo;
char pad1[0x210];
My_Type2_t bar;
...
} Memory_Layout_t;
#pragma (preferably: something to revert to previous struct layout options)
(If you don’t like calculating the size of pad1, use a union.)
Then make the variants:
#ifdef __linux__
Memory_Layout_t Embedded_Memory;
# define Embedded_Memory_P (& Embedded_Memory)
#else
# define Embedded_Memory_P ((Memory_Layout_t *) (0x00600000))
#endif
and reference it with
static const My_Type my_ptr = & Embedded_Memory_P->foo;
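A minimal host-side sketch of the layout-struct approach, under stated assumptions: the single field in My_Type1_t, the EMBEDDED_TARGET macro, and the helpers read_foo_status and fake_set_foo_status are hypothetical names used only for illustration. It shows that the static initializer stays a constant expression in both variants, so the code under test does not have to change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; the real My_Type1_t comes from the code under test. */
typedef struct { uint32_t status; } My_Type1_t;

typedef struct {
    uint8_t    pad0[0x108];   /* bytes 0x000..0x107 are not modelled here */
    My_Type1_t foo;           /* corresponds to MY_BASE_ADDRESS + 0x108 */
} Memory_Layout_t;

/* Fail the build if the compiler placed foo anywhere other than 0x108. */
_Static_assert(offsetof(Memory_Layout_t, foo) == 0x108, "foo must be at offset 0x108");

#ifdef EMBEDDED_TARGET        /* hypothetical macro, defined only by the target build */
# define Embedded_Memory_P ((Memory_Layout_t *)0x00600000)
#else
static Memory_Layout_t Embedded_Memory;   /* ordinary host memory */
# define Embedded_Memory_P (&Embedded_Memory)
#endif

/* Both variants are address constants, so this static initializer is legal. */
static const My_Type1_t *const my_ptr = &Embedded_Memory_P->foo;

/* Stand-in for the code under test. */
static uint32_t read_foo_status(void) { return my_ptr->status; }

#ifndef EMBEDDED_TARGET
/* Test-only helper: preload the fake memory before calling the code under test. */
static void fake_set_foo_status(uint32_t value) { Embedded_Memory.foo.status = value; }
#endif
```

A host test would call fake_set_foo_status() to arrange the "hardware" state and then check what read_foo_status() returns.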

Related

Redefining a define

I'm reviewing some code and I stumbled across this:
In a header file we have this MAGIC_ADDRESS defined
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
and
*(uint32_t*)MAGIC_ADDRESS = SOME_OTHER_DEFINE;
This compiles, apparently works, and it throws no linter errors. MAGIC_ADDRESS = 0; without the cast does not compile as I would expect.
So my questions are:
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
That's a fair question. One possibility is that ANOTHER_ADDRESS is used as a base address for more than one kind of data, but the code fragments presented do not show any reason why ANOTHER_ADDRESS should not be defined to expand to an expression of type uint32_t *. Note, however, that if that change were made then the definition of MAGIC_ADDRESS would need to be changed to (ANOTHER_ADDRESS + 1u).
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Where an in-scope macro identifier appears in C source code, the macro's replacement text is substituted. Simplifying a bit, if the replacement text contains macro identifiers too, then those are in turn replaced with their replacement text, and so on. Nowhere in your code fragments is a macro being cast, per se, but the fully-expanded result expresses some casts.
For example, this ...
*(uint32_t*)MAGIC_ADDRESS = 0;
... expands to ...
*(uint32_t*)(ANOTHER_ADDRESS + 4u) = 0;
... and then on to ...
*(uint32_t*)(((uint8_t*)0x40024000) + 4u) = 0;
There are no casts of macros there, but there are (valid) casts of macros' replacement text.
It's not the cast that allows the assignment to work, it's the * dereferencing operator. The macro expands to a pointer constant, and you can't reassign a constant. But since it's a pointer you can assign to the memory it points to. So if you wrote
*MAGIC_ADDRESS = 0;
you wouldn't get an error.
The cast is necessary to assign to a 4-byte field at that address, rather than just a single byte, since the macro expands to a uint8_t*. Casting it to uint32_t* makes it a 4-byte assignment.
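The difference between the byte store and the 4-byte store can be sketched on a host machine. This assumes a static array named fake_region standing in for the hardware region (on the target the base would be the literal 0x40024000); store_word and store_byte are hypothetical helper names.

```c
#include <stdint.h>

/* Host-side stand-in for the hardware region. */
static uint32_t fake_region[4];

#define ANOTHER_ADDRESS ((uint8_t *)fake_region)
#define MAGIC_ADDRESS   (ANOTHER_ADDRESS + 4u)   /* uint8_t* arithmetic: +4 bytes */

/* Expands to *(uint32_t *)(((uint8_t *)fake_region) + 4u) = value;
   the lvalue has type uint32_t, so four bytes are written. */
static void store_word(uint32_t value) { *(uint32_t *)MAGIC_ADDRESS = value; }

/* Without the cast the lvalue has type uint8_t, so only one byte is written. */
static void store_byte(uint8_t value) { *MAGIC_ADDRESS = value; }
```

After store_word(0xFFFFFFFF) followed by store_byte(0), the second element of fake_region has exactly one of its four bytes cleared; which one depends on the host's byte order.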
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
That's the problem - you don't want anything repetitive peppered throughout. Instead, this is what more-or-less idiomatic embedded C code would look like:
// Portable to compilers without void* arithmetic extension
#define BASE_ADDRESS ((uint8_t*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
You can then write REGISTER1 = 42 or if (REGISTER1 != 42) etc. As you may imagine, this is normally used for memory-mapped peripheral control registers.
If you're using gcc or clang, there's another layer of type safety available as an extension: you don't really want the compiler to allow *BASE_ADDRESS to compile, since presumably you only want to access registers - the *BASE_ADDRESS expression shouldn't pass a code review. And thus:
// gcc, clang, icc, and many others but not MSVC
#define BASE_ADDRESS ((void*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
Arithmetic on void* is a gcc extension adopted by most compilers that don't come from Microsoft, and it's handy: the *BASE_ADDRESS expression won't compile, and that's a good thing.
I imagine that the BASE_ADDRESS is the address of the battery-backed RAM on an STM32 MCU, in which case the "REGISTER" interpretation is incorrect, since all you want is to persist some application data. You're using C, not assembly language, and there's this handy thing we call structures: absolutely use a structure instead of this ugly hack. The things being stored in that non-volatile area aren't registers, they are just fields in a structure, and the structure itself is stored in a non-volatile fashion:
#define BKPSRAM_BASE_ ((void*)0x40024000)
#define nvstate (*(NVState*)BKPSRAM_BASE_)
enum NVLayout { NVVER_1 = 1, NVVER_2 = 2 };
struct {
// Note: This structure is persisted in NVRAM.
// Do not reorder the fields.
enum NVLayout layout;
// NVVER_1 fields
uint32_t value1;
uint32_t value2;
...
/* sometime later after a release */
// NVVER_2 fields
uint32_t valueA;
uint32_t valueB;
} typedef NVState;
Use:
if (nvstate.layout >= NVVER_1) {
nvstate.value1 = ...;
if (nvstate.value2 != 42) ...
}
And here we come to the crux of the problem: your code review was focused on the minutiae, but you should have also divulged the big picture. If my big picture guess is correct - that it's all about sticking some data in a battery-backed RAM, then an actual data structure should be used, not macro hackery and manual offset management. Yuck.
And yes, you'll need that layout field for forward compatibility unless the entire NVRAM area is pre-initialized to zeroes, and you're OK with zeroes as default values.
This approach easily allows you to copy the NVRAM state, e.g. if you wanted to send it over the wire for diagnostic purposes - you don't have to worry about how much data is there, just use sizeof(NVState) for passing it to functions such as fwrite, and you can even use a working copy of that NV data - all without a single memcpy:
NVState wkstate = nvstate;
/* user manipulates the state here */
if (OK_pressed)
nvstate = wkstate;
else if (Cancel_pressed)
wkstate = nvstate;
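A host-runnable sketch of the working-copy idea, assuming a static object named fake_bkpsram in place of the battery-backed RAM (on the target, nvstate would be the dereferenced pointer to 0x40024000 shown earlier). The function edit_value1 and its parameters are hypothetical.

```c
#include <stdint.h>

enum NVLayout { NVVER_1 = 1, NVVER_2 = 2 };

typedef struct {
    enum NVLayout layout;
    uint32_t value1;
    uint32_t value2;
} NVState;

/* Host stand-in for the non-volatile area. */
static NVState fake_bkpsram;
#define nvstate (fake_bkpsram)

/* Edit a working copy and commit it only if the user confirms. */
static void edit_value1(uint32_t new_value, int ok_pressed)
{
    NVState wkstate = nvstate;   /* struct assignment copies every field */
    wkstate.value1 = new_value;
    if (ok_pressed)
        nvstate = wkstate;       /* commit the whole state at once */
    /* otherwise the working copy is simply discarded */
}
```

Because the persisted data is one object, cancelling costs nothing and committing is a single assignment.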
If you need to assign values to a specific place in memory, using macros allows you to do so in a way that is relatively easy to read (and if you need to use another address later, you just change the macro definition).
The macro is translated by the preprocessor to a value. When you then dereference it, you get access to the memory, which you can read or write. This has nothing to do with the string that is used as a label by the preprocessor.
I'm afraid both definitions are wrong (or at least not completely correct).
If the pointers reference hardware registers, they should be defined as pointers to volatile values.
#define ANOTHER_ADDRESS ((volatile uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
It was probably defined as a uint8_t * pointer because the author wanted pointer arithmetic to be done at the byte level.
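A small sketch of why the volatile qualifier matters, assuming a host array fake_regs in place of the peripheral and a hypothetical STATUS_READY bit. With volatile, every loop iteration performs a fresh load; without it, a compiler would be allowed to read the register once and reuse the value.

```c
#include <stdint.h>

static uint32_t fake_regs[2];   /* host stand-in for the peripheral */

#define ANOTHER_ADDRESS ((volatile uint8_t *)fake_regs)
#define STATUS_REG      (*(volatile uint32_t *)(ANOTHER_ADDRESS + 4u))
#define STATUS_READY    (1u << 0)   /* hypothetical ready bit */

/* Poll the status register at most max_polls times.
   Returns 1 as soon as the ready bit is seen, 0 if it never appears. */
static int wait_ready(unsigned max_polls)
{
    while (max_polls-- > 0u) {
        if (STATUS_REG & STATUS_READY)
            return 1;
    }
    return 0;
}
```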

Placeholder for hardware reserved registers in struct

I have a struct which represents a set of hardware registers. Some parts of it are reserved and must be neither written nor read. Is there a placeholder or something similar that I can use instead of an obvious variable name?
typedef volatile struct RegisterStruct
{
uint8 BDH;
uint8 BDL;
...
uint8 IR;
uint8 RESERVED0; // this area should not be accessed
...
} RegisterStruct;
Obvious naming would be the right thing to use, as there's no "reserved" feature in C.
You can use arrays of byte-sized integers to correctly pad to the right length:
typedef volatile struct RegisterStruct
{
uint8_t BDH;
uint8_t BDL;
uint8_t IR;
uint8_t __RESERVED[num_of_reserved_bytes]; // this area should not be accessed
uint8_t NEXT_REGISTER_NAME;
} RegisterStruct;
The problem with using structs for register mapping in general (or similarly, for data communication protocol mapping), is that a struct may contain padding bytes anywhere.
If you use a struct (or union) for such purposes, you have to ensure that no padding is present, for example by adding a line like
_Static_assert(sizeof(RegisterStruct) == sizeof(uint8_t)*4, "Padding detected");
This will prevent padding bugs, as it will block structs with padding from compiling.
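As a sketch of that check, the struct below combines a reserved byte array with compile-time assertions on both the total size and individual offsets. The size of RESERVED0 and the DATA register are hypothetical, chosen only so the example has something to verify.

```c
#include <stddef.h>
#include <stdint.h>

typedef volatile struct {
    uint8_t  BDH;           /* offset 0 */
    uint8_t  BDL;           /* offset 1 */
    uint8_t  IR;            /* offset 2 */
    uint8_t  RESERVED0[5];  /* offsets 3..7: hypothetical size, do not access */
    uint32_t DATA;          /* offset 8: hypothetical 32-bit register */
} RegisterStruct;

/* Any unexpected padding changes an offset or the size and stops the build. */
_Static_assert(offsetof(RegisterStruct, IR) == 2, "IR misplaced");
_Static_assert(offsetof(RegisterStruct, DATA) == 8, "DATA misplaced");
_Static_assert(sizeof(RegisterStruct) == 12, "unexpected padding");
```

Checking offsets as well as the size catches the case where padding in one place happens to be cancelled out by a wrong array length in another.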
Unfortunately, you cannot disable struct padding in a portable manner; most of the time you don't want to disable it because it will make the programs slower at best, in the worst case you'll get hardware exceptions for misaligned access, all depending on CPU.
The most common non-standard extension to disable padding is #pragma pack(1), but it is non-standard and non-portable.
In my opinion, the best way to avoid all such problems is to avoid structs entirely for the actual mapping. Instead, just declare everything as plain volatile variables. (Or by using macros, which is unfortunately the only way you can map something to a specific memory location in standard C).
And when you have gotten that far, there's no need to use any "reserved" place holders. Simply don't map anything to those reserved memory locations.
There's actually really no sound reason why you would want to have a number of hardware registers in a struct, even though it is for some reason mighty popular to do so among embedded compilers. You'll find that register maps written for such compilers are unreadable and also extremely non-standard.
For communication protocols it makes more sense to have structs, but then you would typically write serialize/de-serialize routines to fill up the struct.
There isn't anything in C to declare a placeholder/hole without a name in a structure or something with a name that is unreadable (const could help but with write protection only). And I don't see anything in gcc's extensions that could help here.
But you could additionally scramble the name by using the preprocessor, e.g.:
#define GLUE_(X,Y,Z) X ## Y ## Z
#define GLUE(X,Y,Z) GLUE_(X,Y,Z)
#ifdef __GNUC__
#define SCRAMBLE(X) GLUE(X,_,__COUNTER__)
#else
#define SCRAMBLE(X) GLUE(X,_,__LINE__)
#endif
typedef volatile struct
{
uint8 BDH;
uint8 BDL;
// ...
uint8 IR;
uint8 SCRAMBLE(RESERVED0);
// ...
} RegisterStruct;
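One detail worth spelling out: operands of ## are not macro-expanded, so pasting __LINE__ or __COUNTER__ directly would produce a name ending in the literal text __LINE__. An extra level of macro indirection forces the expansion first. The sketch below uses __LINE__ only, and ScrambledRegs is a hypothetical type name.

```c
#include <stdint.h>

#define GLUE_(X, Y, Z) X ## Y ## Z
#define GLUE(X, Y, Z)  GLUE_(X, Y, Z)       /* extra level: __LINE__ expands first */
#define SCRAMBLE(X)    GLUE(X, _, __LINE__)

typedef volatile struct {
    uint8_t BDH;
    uint8_t SCRAMBLE(RESERVED);   /* becomes RESERVED_<line number> */
    uint8_t BDL;
    uint8_t SCRAMBLE(RESERVED);   /* different line, so a different member name */
} ScrambledRegs;
```

If the expansion did not happen, both reserved members would get the same name and the struct would fail to compile.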

Static initialization with pointer to extern variable

I would like to understand the innards of the Python import system, including the rough spots. In the Python C API documentation, there's this terse reference to one such rough spot:
This is so important that we’re going to pick the top of it apart still further:
PyObject_HEAD_INIT(NULL)
This line is a bit of a wart; what we’d like to write is:
PyObject_HEAD_INIT(&PyType_Type)
as the type of a type object is “type”, but this isn’t strictly conforming C and some compilers complain.
Why is this not strictly conforming C? Why do some compilers accept this without complaint and others do not?
I now think the following is misleading, skip down to "SUBSTANTIAL EDIT"
Scrolling about a page down there is what I believe is a clue. This quote regards initializing another member of the struct but it sounds like the same issue and this time it is explained.
We’d like to just assign this to the tp_new slot, but we can’t, for portability sake, On some platforms or compilers, we can’t statically initialize a structure member with a function defined in another C module
This still leaves me a bit confused, in part due to the odd word choice of "module". I think the second quote meant to say that static initialization that relies on calls to functions in separate compilation units is a non-standard extension. I still don't understand why that would be so. Is that what's going on in the first quote?
SUBSTANTIAL EDIT:
The use of PyObject_HEAD_INIT(NULL) is advised to go at the very top of the initialization of an instance of PyTypeObject.
The definition of PyTypeObject looks like this:
typedef struct _typeobject {
PyObject_VAR_HEAD
const char *tp_name; /* For printing, in format "<module>.<name>" */
Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */
/* Methods to implement standard operations */
destructor tp_dealloc;
/*... lots more ... */
} PyTypeObject;
The PyObject_HEAD_INIT(NULL) macro is used to initialize the top of PyTypeObject instances. The top of the PyTypeObject definition is created by the macro PyObject_VAR_HEAD. PyObject_VAR_HEAD is:
/* PyObject_VAR_HEAD defines the initial segment of all variable-size
* container objects. These end with a declaration of an array with 1
* element, but enough space is malloc'ed so that the array actually
* has room for ob_size elements. Note that ob_size is an element count,
* not necessarily a byte count.
*/
#define PyObject_VAR_HEAD \
PyObject_HEAD \
Py_ssize_t ob_size; /* Number of items in variable part */
#define Py_INVALID_SIZE (Py_ssize_t)-1
In turn, PyObject_HEAD expands to:
/* PyObject_HEAD defines the initial segment of every PyObject. */
#define PyObject_HEAD \
_PyObject_HEAD_EXTRA \
Py_ssize_t ob_refcnt; \
struct _typeobject *ob_type;
_PyObject_HEAD_EXTRA is only used in debugging builds and normally expands to nothing. The members being initialized by the PyObject_HEAD_INIT macro are ob_refcnt and ob_type. ob_type is the one that we would like to initialize with &PyType_Type but we're told that would violate the C Standard. ob_type points to a _typeobject, which is typedef'd as a PyTypeObject (the same struct that we're trying to initialize). The PyObject_HEAD_INIT macro, which initializes those two values, expands like so:
#define PyObject_HEAD_INIT(type) \
_PyObject_EXTRA_INIT \
1, type,
So we're starting a reference count at 1 and setting a member pointer to whatever is in the type parameter. The Python documentation says we can't set the type parameter to the address of PyType_Type because that is not strictly standard C, so we settle for NULL.
PyType_Type is declared in the same translation unit a few lines below.
PyAPI_DATA(PyTypeObject) PyType_Type; /* built-in 'type' */
PyAPI_DATA is defined elsewhere. It has a few different conditional definitions.
#define PyAPI_DATA(RTYPE) extern __declspec(dllexport) RTYPE
#define PyAPI_DATA(RTYPE) extern RTYPE
So the Python API documentation is saying that we'd like to initialize an instance of a PyTypeObject with a pointer to previously declared PyTypeObject that was declared with the extern qualifier. What in the C Standard would that violate?
The initialization of PyType_Type occurs in a .c file. A typical Python extension that initializes a PyTypeObject, as described above, will be dynamically loaded by code that was compiled with this initialization:
PyTypeObject PyType_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"type", /* tp_name */
sizeof(PyHeapTypeObject), /* tp_basicsize */
sizeof(PyMemberDef), /* tp_itemsize */
(destructor)type_dealloc, /* tp_dealloc */
/* ... lots more ... */
};
PyObject_HEAD_INIT(&PyType_Type)
Produces
1, &PyType_Type
which initializes fields
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
PyType_Type is defined with PyAPI_DATA(PyTypeObject) PyType_Type which produces
extern PyTypeObject PyType_Type;
possibly with a __declspec qualifier. PyTypeObject is a typedef for struct _typeobject, so we have
extern struct _typeobject PyType_Type;
so PyObject_HEAD_INIT(&PyType_Type) would initialize the struct _typeobject* ob_type field with a struct _typeobject* ... which is certainly valid C, so I don't see why they say it isn't.
I came upon an explanation of this elsewhere in the Python source code.
/* We link this module statically for convenience. If compiled as a shared
library instead, some compilers don't allow addresses of Python objects
defined in other libraries to be used in static initializers here. The
DEFERRED_ADDRESS macro is used to tag the slots where such addresses
appear; the module init function must fill in the tagged slots at runtime.
The argument is for documentation -- the macro ignores it.
*/
#define DEFERRED_ADDRESS(ADDR) 0
And then the macro is used where NULL appears at the top of the OP.
PyVarObject_HEAD_INIT(DEFERRED_ADDRESS(&PyType_Type), 0)
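The deferred-address pattern can be sketched without the Python headers. Everything below is a hypothetical stand-in: type_object plays the role of PyTypeObject, base_type the role of PyType_Type, and my_module_init the role of a module init function. In this single-file sketch the address would be usable in the initializer anyway; the pattern matters when base_type lives in a separate shared library whose address the compiler cannot treat as a constant.

```c
#include <stddef.h>

/* Hypothetical stand-in for PyTypeObject. */
typedef struct type_object {
    long refcnt;
    struct type_object *type;
    const char *name;
} type_object;

/* Stand-in for PyType_Type: its own type is itself. */
static type_object base_type = { 1, &base_type, "type" };

/* The argument only documents the intended value; the slot starts out null. */
#define DEFERRED_ADDRESS(ADDR) NULL

static type_object my_type = { 1, DEFERRED_ADDRESS(&base_type), "my_type" };

/* The init function fills in the tagged slot at run time. */
static void my_module_init(void) { my_type.type = &base_type; }
```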

Removing (or rather conditionally attach) const modifier using macros in C

I am dealing with the following issue in C. I use global variables for defining some global parameters in my code. I would like such global variables to be constant, even though they have to be initialized inside a routine that reads their values from an input data file. In a nutshell, I am looking for a good way to "cast away" constness during variable initialization in C (I guess in C++ this would not be an issue thanks to const_cast)
I came up with a pattern based on macros to do so, as illustrated below.
It seems to work fine, but I have the following questions.
Does anyone see any hidden flaw or potential danger in the procedure below?
Would anyone discourage the following approach in favor of a simpler one?
My approach:
I have a main header file containing the definition of my global variable (int N) like so
/* main_header.h */
#ifdef global_params_reader
#define __TYPE__QUAL__
#else
#define __TYPE__QUAL__ const
#endif
__TYPE__QUAL__ int N;
I have a file "get_global_params.c" implementing the initialization of N, which sees N as "int N" (as it includes "main_header.h" after defining global_params_reader)
/* get_global_params.c */
#define global_params_reader
#include "get_global_params.h"
void get_global_params(char* filename){
N = ... ; // calling some function that reads the value of N from
// the datafile "filename" and returns it
}
and the corresponding header file "get_global_params.h"
/* get_global_params.h */
#include "main_header.h"
void get_global_params(char* filename);
Finally, I have a main.c, which sees N as "const int N" (as it includes "main_header.h" without defining global_params_reader):
/* main.c */
#include "main_header.h"
#include "get_global_params.h"
int main(int argc, char **argv){
// setting up input data file //
...
// initialize N //
get_global_params(datafile);
// do things with N //
...
}
I hope my explanation was clear enough.
Thanks for any feedback.
Just contain the globals in a separate file.
globl.h:
struct Globals{
int N;
//...
};
extern const struct Globals *const globals;
init_globl.h:
void init_globals(/*Init Params*/);
globl.c
#include "globl.h"
#include "init_globl.h"
static struct Globals _globals;
const struct Globals *const globals = &_globals;
void init_globals(/*Init Params*/){
// Initialize _globals;
//...
}
Now you can initialize the globals at startup by including init_globl.h in whatever file needs access to that functionality, everyone else can directly access the globals just by including globl.h, and using the notation globals->N.
If I were you, I would simply avoid this kind of global variables. Instead, I would define a struct with all those program parameters, and define one function that returns a const pointer to the one and only instance of this struct (singleton pattern). That way, the function that returns the pointer has non-const access to the singleton, while the entire rest of the program does not. This is precisely what you need, it's clean and object oriented, so there is no reason to mess around with macros and casts.
The instance can be declared as a static variable within the function or it can be malloc'ed to a static pointer. It does not really matter, because that is an implementation detail of that function which is never leaked to the outside. Nor does the rest of the code need to be aware of when the parameters are actually read, it just calls the function and it gets the one and only object with all valid parameters.
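A minimal sketch of that accessor idea, using a static local as the single instance. The names Params, params_mutable, params_get and params_load are hypothetical, and the loader takes the value directly instead of parsing a file.

```c
struct Params { int N; };

/* Private to this file: the only path to a writable pointer. */
static struct Params *params_mutable(void)
{
    static struct Params instance;   /* zero-initialized, lives for the whole run */
    return &instance;
}

/* The rest of the program gets a read-only view of the same object. */
const struct Params *params_get(void) { return params_mutable(); }

/* Hypothetical loader; a real one would read the values from the input file. */
void params_load(int n_from_file) { params_mutable()->N = n_from_file; }
```

Callers write params_get()->N and get a compile error if they try to assign through the returned pointer.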
"I would like such global variables to be constant, even though they have to be initialized inside a routine that reads their values from an input data file."
It is not possible to initialize a const in C at run time. In C an object either has a const qualifier or it does not, and that is fixed at its declaration; C does not support changing it afterwards. An expert quoting the standard could give a more authoritative answer.
I don't think this is possible in c++ either, but I won't bet on it, since c++ can do some magic here and there.

Regarding typedefs of 1-element arrays in C

Sometimes, in C, you do this:
typedef struct foo {
unsigned int some_data;
} foo; /* btw, foo_t is discouraged */
To use this new type in an OO-sort-of-way, you might have alloc/free pairs like these:
foo *foo_alloc(/* various "constructor" params */);
void foo_free(foo *bar);
Or, alternatively init/clear pairs (perhaps returning error-codes):
int foo_init(foo *bar, /* and various "constructor" params */);
int foo_clear(foo *bar);
I have seen the following idiom used, in particular in the MPFR library:
struct foo {
unsigned int some_data;
};
typedef struct foo foo[1]; /* <- notice, 1-element array */
typedef struct foo *foo_ptr; /* let's create a ptr-type */
The alloc/free and init/clear pairs now read:
foo_ptr foo_alloc(/* various "constructor" params */);
void foo_free(foo_ptr bar);
int foo_init(foo_ptr bar, /* and various "constructor" params */);
int foo_clear(foo_ptr bar);
Now you can use it all like this (for instance, the init/clear pairs):
int main()
{
foo bar; /* constructed but NOT initialized yet */
foo_init(bar); /* initialize bar object, alloc stuff on heap, etc. */
/* use bar */
foo_clear(bar); /* clear bar object, free stuff on heap, etc. */
}
Remarks: The init/clear pair seems to allow for a more generic way of initializing and clearing out objects. Compared to the alloc/free pair, the init/clear pair requires that a "shallow" object has already been constructed. The "deep" construction is done using init.
Question: Are there any non-obvious pitfalls of the 1-element array "type-idiom"?
This is very clever (but see below).
It encourages the misleading idea that C function arguments can be passed by reference.
If I see this in a C program:
foo bar;
foo_init(bar);
I know that the call to foo_init does not modify the value of bar. I also know that the code passes the value of bar to a function when it hasn't initialized it, which is very probably undefined behavior.
Unless I happen to know that foo is a typedef for an array type. Then I suddenly realize that foo_init(bar) is not passing the value of bar, but the address of its first element. And now every time I see something that refers to type foo, or to an object of type foo, I have to think about how foo was defined as a typedef for a single-element array before I can understand the code.
It is an attempt to make C look like something it's not, not unlike things like:
#define BEGIN {
#define END }
and so forth. And it doesn't result in code that's easier to understand because it uses features that C doesn't support directly. It results in code that's harder to understand (especially to readers who know C well), because you have to understand both the customized declarations and the underlying C semantics that make the whole thing work.
If you want to pass pointers around, just pass pointers around, and do it explicitly. See, for example, the use of FILE* in the various standard functions defined in <stdio.h>. There is no attempt to hide pointers behind macros or typedefs, and C programmers have been using that interface for decades.
If you want to write code that looks like it's passing arguments by reference, define some function-like macros, and give them all-caps names so knowledgeable readers will know that something odd is going on.
I said above that this is "clever". I'm reminded of something I did when I was first learning the C language:
#define EVER ;;
which let me write an infinite loop as:
for (EVER) {
/* ... */
}
At the time, I thought it was clever.
I still think it's clever. I just no longer think that's a good thing.
The only advantage to this method is nicer looking code and easier typing. It allows the user to create the struct on the stack without dynamic allocation like so:
foo bar;
However, the structure can still be passed to functions that require a pointer type, without requiring the user to convert to a pointer with &bar every time.
foo_init(bar);
Without the 1 element array, it would require either an alloc function as you mentioned, or constant & usage.
foo_init(&bar);
The only pitfall I can think of is the normal concerns associated with direct stack allocation. If this is in a library used by other code, updates to the struct may break client code in the future, which would not happen when using an alloc/free pair.
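The behaviour both answers describe can be seen in a short sketch. Here foo_init takes an extra value parameter and demo is a hypothetical caller; neither is taken from MPFR.

```c
struct foo { unsigned int some_data; };
typedef struct foo foo[1];      /* one-element array type */
typedef struct foo *foo_ptr;

static int foo_init(foo_ptr p, unsigned int value)
{
    p->some_data = value;
    return 0;
}

static unsigned int demo(void)
{
    foo bar;              /* storage for one struct foo, no heap allocation */
    foo_init(bar, 7u);    /* bar decays to &bar[0], so the callee modifies it */
    /* foo copy = bar;          would not compile: an array cannot be
                                initialized from another array */
    /* foo other; other = bar;  would not compile: arrays are not assignable */
    return bar->some_data;
}
```

The two commented-out lines are a practical cost of the idiom: objects of type foo cannot be copied or returned by value the way a plain struct can.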
