C structs with three options in the Magenta kernel

In the Magenta kernel there is a declaration in which a struct variable has not only a type and a name but one more item. I found nothing in the references that explains this syntax. So what is __CPU_ALIGN for in this struct declaration, and where do I find the syntax for it?
struct type name ???
#if WITH_SMP
/* a global state structure, aligned on cpu cache line to minimize aliasing */
struct mp_state mp __CPU_ALIGN = {
.hotplug_lock = MUTEX_INITIAL_VALUE(mp.hotplug_lock),
.ipi_task_lock = SPIN_LOCK_INITIAL_VALUE,
};
I know that __CPU_ALIGN itself is used to align data to the CPU's cache line size.

It's a macro shorthand for the aligned attribute, which is a GCC extension.
The macro is defined as follows:
#define __CPU_ALIGN __ALIGNED(CACHE_LINE)
The macro __ALIGNED in turn is defined like this:
#define __ALIGNED(x) __attribute__((aligned(x)))
...which matches the syntax in the GCC documentation. (The value of CACHE_LINE depends on the architecture.)
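To make the placement of the attribute concrete, here is a minimal sketch for GCC or Clang. The DEMO_* names, the struct and the 64-byte cache line are stand-ins chosen for illustration; they are not the kernel's actual definitions.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real CACHE_LINE is per-architecture. */
#define DEMO_CACHE_LINE 64
#define DEMO_ALIGNED(x) __attribute__((aligned(x)))
#define DEMO_CPU_ALIGN  DEMO_ALIGNED(DEMO_CACHE_LINE)

static_assert((DEMO_CACHE_LINE & (DEMO_CACHE_LINE - 1)) == 0,
              "alignment must be a power of two");

/* Hypothetical stand-in for struct mp_state. */
struct demo_state {
    int hotplug_lock;
    int ipi_task_lock;
};

/* The attribute sits between the declarator and the initializer,
   in the same position as __CPU_ALIGN in the kernel snippet. */
struct demo_state demo_mp DEMO_CPU_ALIGN = {
    .hotplug_lock = 0,
    .ipi_task_lock = 0,
};

/* Returns nonzero when demo_mp starts on a cache-line boundary. */
int demo_mp_is_aligned(void)
{
    return ((uintptr_t)&demo_mp % DEMO_CACHE_LINE) == 0;
}
```

Because the attribute is attached to the variable rather than to the type, it changes where demo_mp is placed but not sizeof(struct demo_state).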

Related

Redefining a define

I'm reviewing some code and I stumbled across this:
In a header file we have this MAGIC_ADDRESS defined
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
and
*(uint32_t*)MAGIC_ADDRESS = SOME_OTHER_DEFINE;
This compiles, apparently works, and it throws no linter errors. MAGIC_ADDRESS = 0; without the cast does not compile as I would expect.
So my questions are:
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Why in the world would we ever want to do this rather than just making a uint32_t in the first place?
That's a fair question. One possibility is that ANOTHER_ADDRESS is used as a base address for more than one kind of data, but the code fragments presented do not show any reason why ANOTHER_ADDRESS should not be defined to expand to an expression of type uint32_t *. Note, however, that if that change were made then the definition of MAGIC_ADDRESS would need to be changed to (ANOTHER_ADDRESS + 1u).
How does this actually work? I thought preprocessor defines were untouchable, how are we managing to cast one?
Where an in-scope macro identifier appears in C source code, the macro's replacement text is substituted. Simplifying a bit, if the replacement text contains macro identifiers, too, then those are in turn replaced with their replacement text, and so on. Nowhere in your code fragments is a macro being cast, per se, but the fully-expanded result expresses some casts.
For example, this ...
*(uint32_t*)MAGIC_ADDRESS = 0;
... expands to ...
*(uint32_t*)(ANOTHER_ADDRESS + 4u) = 0;
... and then on to ...
*(uint32_t*)(((uint8_t*)0x40024000) + 4u) = 0;
There are no casts of macros there, but there are (valid) casts of macros' replacement text.
It's not the cast that allows the assignment to work, it's the * dereferencing operator. The macro expands to a pointer constant, and you can't reassign a constant. But since it's a pointer you can assign to the memory it points to. So if you wrote
*MAGIC_ADDRESS = 0;
you wouldn't get an error.
The cast is necessary to assign to a 4-byte field at that address, rather than just a single byte, since the macro expands to a uint8_t*. Casting it to uint32_t* makes it a 4-byte assignment.
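The difference between the one-byte and the four-byte store can be tried on a PC with a small sketch. Here an ordinary array stands in for the device memory at 0x40024000 (that substitution, and the helper function names, are assumptions made so the code can run anywhere); the two macros keep the shape they have in the question.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Stand-in for the memory at 0x40024000 so the sketch runs on a PC. */
static uint32_t fake_device[4];

static_assert(sizeof fake_device >= 8,
              "offset 4 must stay inside the stand-in buffer");

#define ANOTHER_ADDRESS ((uint8_t *)fake_device)
#define MAGIC_ADDRESS   (ANOTHER_ADDRESS + 4u)

/* MAGIC_ADDRESS has type uint8_t *, so this stores a single byte. */
void store_byte(uint8_t v)
{
    *MAGIC_ADDRESS = v;
}

/* The cast changes the pointed-to type, so this stores four bytes. */
void store_word(uint32_t v)
{
    *(uint32_t *)MAGIC_ADDRESS = v;
}

/* Reads back the 32-bit word that lives at byte offset 4. */
uint32_t read_word(void)
{
    return fake_device[1];
}
```

After store_word(0xFFFFFFFFu), a call to store_byte(0) clears only one of the four bytes, while store_word(0) clears all of them.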
#define ANOTHER_ADDRESS ((uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
And then peppered throughout the code in various files we have things like this:
*(uint32_t*)MAGIC_ADDRESS = 0;
That's the problem - you don't want anything repetitive peppered throughout. Instead, this is what more-or-less idiomatic embedded C code would look like:
// Portable to compilers without void* arithmetic extension
#define BASE_ADDRESS ((uint8_t*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
You can then write REGISTER1 = 42 or if (REGISTER1 != 42) etc. As you may imagine, this is normally used for memory-mapped peripheral control registers.
If you're using gcc or clang, there's another layer of type safety available as an extension: you don't really want the compiler to allow *BASE_ADDRESS to compile, since presumably you only want to access registers - the *BASE_ADDRESS expression shouldn't pass a code review. And thus:
// gcc, clang, icc, and many others but not MSVC
#define BASE_ADDRESS ((void*)0x40024000)
#define REGISTER1 (*(uint32_t*)(BASE_ADDRESS + 4u))
Arithmetic on void* is a gcc extension adopted by most compilers that don't come from Microsoft, and it's handy: the *BASE_ADDRESS expression won't compile, and that's a good thing.
I imagine that the BASE_ADDRESS is the address of the battery-backed RAM on an STM32 MCU, in which case the "REGISTER" interpretation is incorrect: all you want is to persist some application data, you're using C rather than assembly language, and there's this handy thing we call structures. Absolutely use a structure instead of this ugly hack. The things being stored in that non-volatile area aren't registers, they are just fields in a structure, and the structure itself is stored in a non-volatile fashion:
#define BKPSRAM_BASE_ ((void*)0x40024000)
#define nvstate (*(NVState*)BKPSRAM_BASE_)
enum NVLayout { NVVER_1 = 1, NVVER_2 = 2 };
typedef struct {
// Note: This structure is persisted in NVRAM.
// Do not reorder the fields.
enum NVLayout layout;
// NVVER_1 fields
uint32_t value1;
uint32_t value2;
...
/* sometime later after a release */
// NVVER_2 fields
uint32_t valueA;
uint32_t valueB;
} NVState;
Use:
if (nvstate.layout >= NVVER_1) {
nvstate.value1 = ...;
if (nvstate.value2 != 42) ...
}
And here we come to the crux of the problem: your code review was focused on the minutiae, but you should have also divulged the big picture. If my big picture guess is correct - that it's all about sticking some data in a battery-backed RAM, then an actual data structure should be used, not macro hackery and manual offset management. Yuck.
And yes, you'll need that layout field for forward compatibility unless the entire NVRAM area is pre-initialized to zeroes, and you're OK with zeroes as default values.
This approach easily allows you to copy the NVRAM state, e.g. if you wanted to send it over the wire for diagnostic purposes - you don't have to worry about how much data is there, just use sizeof(NVState) for passing it to functions such as fwrite, and you can even use a working copy of that NV data - all without a single memcpy:
NVState wkstate = nvstate;
/* user manipulates the state here */
if (OK_pressed)
nvstate = wkstate;
else if (Cancel_pressed)
wkstate = nvstate;
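The working-copy idea can be exercised off-target with a sketch like the following. Ordinary RAM stands in for the battery-backed region, the struct is trimmed to the NVVER_1 fields, and edit_value1 is a hypothetical helper; all three are assumptions made for the sake of a runnable example.

```c
#include <stdint.h>

enum NVLayout { NVVER_1 = 1, NVVER_2 = 2 };

/* Trimmed-down version of the structure above (NVVER_1 fields only). */
typedef struct {
    enum NVLayout layout;
    uint32_t value1;
    uint32_t value2;
} NVState;

/* Assumption: ordinary RAM stands in for the battery-backed area. */
static NVState fake_nvram;
#define nvstate (*(NVState *)&fake_nvram)

/* Edits a working copy, then commits or discards it using plain
   struct assignment. Returns the value now stored in "NVRAM". */
uint32_t edit_value1(uint32_t new_value, int commit)
{
    NVState wkstate = nvstate;   /* snapshot of the persistent state */
    wkstate.value1 = new_value;  /* the user manipulates the copy */
    if (commit)
        nvstate = wkstate;       /* OK pressed: write everything back */
    return nvstate.value1;
}
```

On real hardware the macro would point at the backup SRAM address instead of fake_nvram; the copy-in and copy-out assignments stay the same.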
If you need to assign values to a specific place in memory, using macros
allows you to do so in a way that is relatively easy to read (and if you need to
use another address later, just change the macro definition).
The macro is translated by the preprocessor to a value. When you then de-reference
it you get access to the memory which you can read or write to. This has nothing to
do with the string that is used as a label by the preprocessor.
Both definitions are wrong, I'm afraid (or at least not completely correct).
They should be defined as pointers to volatile values if the pointers reference hardware registers.
#define ANOTHER_ADDRESS ((volatile uint8_t*)0x40024000)
#define MAGIC_ADDRESS (ANOTHER_ADDRESS + 4u)
It was defined as a uint8_t * pointer probably because the author wanted pointer arithmetic to be done at the byte level.

Understanding #define preprocessor directive macro syntax

The following code is taken from the LPC54618.h header file:
typedef struct {
//...structure elements
__IO uint32_t SDIOCLKSEL;
//...more elements
} SYSCON_Type;
#define SYSCON_BASE (0x40000000u)
#define SYSCON ((SYSCON_Type *)SYSCON_BASE)
As far as I can guess the meaning behind the line
#define SYSCON ((SYSCON_Type *)SYSCON_BASE)
I would assume that it creates a pointer named SYSCON that points to a variable of type SYSCON_Type which is stored at the address 0x40000000u. Is this really what happens? And is there any resource that explains the syntax that is being used here (i.e. defining pointers inside macros)?
When I try to alter the value of SDIOCLKSEL directly, i.e.:
SYSCON->SDIOCLKSEL = some value;
I get an error:
error: expected ')'
error: expected parameter declarator
error: expected ')'
error: expected function body after function declarator
but if I use it inside a function, e.g.:
void foo(void)
{
SYSCON->SDIOCLKSEL = some value;
}
there is no error. Why is that? Why can't I write directly to the structure?
Any answer would be greatly appreciated!
#define SYSCON_BASE (0x40000000u)
This simply names the physical address 0x40000000.
#define SYSCON ((SYSCON_Type *)SYSCON_BASE)
This converts the integer constant 0x40000000u to a pointer to struct by means of a cast. It doesn't actually allocate anything - the actual registers are already allocated as memory-mapped hardware.
Simply put, it says "at address 0x40000000 there's a hardware peripheral SYSCON" (whatever that is, some timer?). It's a common scenario that you have several hardware peripherals of the same type inside a MCU (many SPI, ADC etc), each with the same register layout, but found at different addresses. We can use the same struct type for each such peripheral, and also the same driver code.
The struct itself will have a memory map which corresponds 100% to the register layout. Here it is important to ensure that padding/alignment doesn't screw things up, but hopefully the MCU manufacturer has thought of that (don't take it for granted though).
Assuming SDIOCLKSEL has a register offset of 0x10, then when you type SYSCON->SDIOCLKSEL = some value;, you get machine code like this (pseudo assembler code):
LOAD 0x40000000 into index register X
LOAD 0x10 into register A
ADD A to X
MOVE some value into the address of X
(ARM has special instructions that can move data based on an offset, so there may be fewer instructions in the actual machine code. Subsequent register accesses could keep "X" untouched and use that base address repeatedly, for efficient code.)
The __IO qualifier is just code bloat hiding volatile.
The reason why you get an error when you try to "write directly into the structure" is simply that you can't execute code outside all functions, it has nothing to do with this struct.
It is very easy.
that it creates a pointer named SYSCON that points to a variable of
type SYSCON_Type which is stored at the address 0x40000000u. Is this
really what happens?
Yes and no. When you use the macro SYSCON
void foo(uint32_t value)
{
SYSCON->SDIOCLKSEL = value;
}
the preprocessor converts it into:
void foo(uint32_t value)
{
((SYSCON_Type *)0x40000000u)->SDIOCLKSEL = value;
}
which writes the 32bit unsigned value to the memory location at the address 0x40000000u + the offset of the struct member.
It is usually used to access hardware registers mapped into the memory address space.
You need to do it inside a function (like all executable code in the C language).
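The struct-overlay idea from both answers can be tried without the hardware. In the sketch below a static object stands in for the peripheral, and the layout (four reserved words so that SDIOCLKSEL lands at offset 0x10, as the first answer assumed) is hypothetical; the DEMO_* names are not from LPC54618.h.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical register layout, not the real SYSCON_Type. */
typedef struct {
    volatile uint32_t RESERVED[4];
    volatile uint32_t SDIOCLKSEL;   /* offset 0x10 in this sketch */
} DEMO_SYSCON_Type;

static_assert(offsetof(DEMO_SYSCON_Type, SDIOCLKSEL) == 0x10,
              "struct layout must match the assumed register offset");

/* Assumption: a static object stands in for the memory-mapped block. */
static DEMO_SYSCON_Type fake_syscon;

#define DEMO_SYSCON_BASE ((uintptr_t)&fake_syscon)
#define DEMO_SYSCON      ((DEMO_SYSCON_Type *)DEMO_SYSCON_BASE)

/* Statements such as this assignment must live inside a function. */
void set_sdio_clock(uint32_t value)
{
    DEMO_SYSCON->SDIOCLKSEL = value;
}

uint32_t get_sdio_clock(void)
{
    return DEMO_SYSCON->SDIOCLKSEL;
}
```

The static_assert is one way to catch padding surprises at compile time instead of trusting that the struct matches the register map.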

How can we use unions and structure within a #define statement

I was going through a header file of a micro-controller when I found this:
#define DEFA(name, address) __no_init union \
{ \
struct \
{ \
volatile unsigned char name##_L; \
volatile unsigned char name##_H; \
}; \
struct \
{ \
volatile unsigned short name##L; \
volatile unsigned short name##H; \
}; \
__ACCESS_20BIT_REG__ name; \
} # address;
I have multiple questions here,
I didn't know that we can use union and structures within #define statements. How are they interpreted by the compiler?
What does "name##_L" and "name##L" mean.?, especially "##".
What is "__ACCESS_20BIT_REG__ name"?
What is this new "#" symbol? I googled to find out about it and found nothing related to the # symbol anywhere. Some people say that this is not standard C; is this true, and what does it mean?
Someone, please explain this piece of code to me. Everything depends on it, and everything either directly or indirectly uses this #define. :(
You can put most things inside macros, they are mostly just text replacement (though on a token by token level). Putting typedefs and variable declarations inside macros is however bad practice and should be avoided unless there's no other option.
It pastes the tokens together and creates a new pre-processor token. In this case it creates member variable names. If you use DEFA(foo,0) you will get members named foo_H and foo_L.
Notably, this particular macro would probably have been better if the fields were just named H and L. Why would anyone want to type reg.foo_H rather than foo.H? Using _L versus L postfix to distinguish between 8 and 16 bit access is not a very bright idea either.
Some custom type used by the same code. Why they have a union between something called "20 bit" and 32 bit structs, I have no idea.
A non-standard extension. Usually it is used for allocation of a variable at a specific address, which is the case here. Probably you have a memory-mapped hardware register and this macro is supposed to be a generic declaration for such registers.
Also notable, the union syntax isn't standard C. You need to name the union. The anonymous structs are standard C since C11, but I doubt this is from a modern compiler.
Overall, very bad code. A very typical kind of trash code you find in microcontroller register maps delivered by silicon/tool vendors. Not compliant with ISO C nor MISRA-C.
I didn't know that we can use union and structures within #define statements. How are they interpreted by the compiler?
Define statements are just macros. They are NOT interpreted by the compiler but by the preprocessor. That bit of code that is defined inside the macro is just copy-pasted everywhere, where you call/use that macro. If it contains valid C/C++ code, then it will compile in the next step also, once the preprocessor is done.
What does "name##_L" and "name##L" mean.?, especially "##".
Yes, I was surprised by this ## once upon a time also. It is token concatenation. Basically, you can generate 'programmatically' variable/function names using it. So for example, you have,
#include <stdio.h>
#define concat(a, b) a##b
int main(void)
{
int foobar = 1;
printf("%d", concat(foo, bar));
return 0;
}
This would print 1, as it concatenates foo and bar together to form foobar.
What is "__ACCESS_20BIT_REG__ name"?
__ACCESS_20BIT_REG__ seems to be an implementation-defined macro, probably defined somewhere else in the code. Identifiers that begin with a double underscore, or with an underscore followed by a capital letter, are reserved for the implementation (so that they can be used, for example, by your uC manufacturer).
What is this new "#" symbol, I googled to find out about this symbol, and i find nothing related to the # symbol anywhere, some people say that this is not C standard, is this true?, and what does it mean?
This one has me stumped also. I don't think this is legal C.
EDIT:
Actually, regarding your point 4, i.e. the # symbol, I googled some more and found this stackoverflow question.
I think the best way to explain this piece of code is by trying an example:
So what results from the define DEFA(name, address)?
Using the define like e.g. DEFA(myName, myAddress) the preprocessor creates these lines of code:
__no_init union
{
struct
{
volatile unsigned char myName_L;
volatile unsigned char myName_H;
};
struct
{
volatile unsigned short myNameL;
volatile unsigned short myNameH;
};
__ACCESS_20BIT_REG__ myName;
} # myAddress;
So now to your questions:
Interpreted as shown
This is called token concatenation; it joins the macro argument to the adjacent token. See en.wikipedia.org/wiki/C_preprocessor#Token_concatenation
__ACCESS_20BIT_REG__ is probably a macro for a 20-bit-wide data type, which is defined somewhere else
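For comparison, here is a small sketch of the same ## trick in standard C11, without the vendor-specific __no_init keyword and without the address placement. DEMO_REG and its members are made-up names, and the layout is much simpler than the vendor's macro.

```c
#include <stdint.h>

/* Builds a union type whose member names are derived from the
   macro argument by token pasting. */
#define DEMO_REG(name)                  \
    union {                             \
        struct {                        \
            volatile uint8_t name##_L;  \
            volatile uint8_t name##_H;  \
        };                              \
        volatile uint16_t name;         \
    }

/* Expands to members counter_L, counter_H and counter. */
typedef DEMO_REG(counter) demo_counter_t;

/* Writes the two byte-wide halves and reads back the 16-bit view. */
uint16_t demo_roundtrip(uint8_t low, uint8_t high)
{
    demo_counter_t reg;
    reg.counter_L = low;
    reg.counter_H = high;
    return reg.counter;
}
```

Which half ends up in the upper byte of counter depends on the byte order of the target, so the 16-bit value is only predictable without knowing the target when both halves are equal.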

Static initialization with pointer to extern variable

I would like to understand the innards of the Python import system, including the rough spots. In the Python C API documentation, there's this terse reference to one such rough spot:
This is so important that we’re going to pick the top of it apart
still further:
PyObject_HEAD_INIT(NULL)
This line is a bit of a wart; what we’d like
to write is:
PyObject_HEAD_INIT(&PyType_Type)
as the type of a type object is
“type”, but this isn’t strictly conforming C and some compilers
complain.
Why is this not strictly conforming C? Why do some compilers accept this without complaint and others do not?
I now think the following is misleading, skip down to "SUBSTANTIAL EDIT"
Scrolling about a page down there is what I believe is a clue. This quote regards initializing another member of the struct but it sounds like the same issue and this time it is explained.
We’d like to just assign this to the tp_new slot, but we can’t, for
portability sake, On some platforms or compilers, we can’t statically
initialize a structure member with a function defined in another C
module
This still leaves me a bit confused, in part due to the odd word choice of "module". I think the second quote meant to say that static initialization that relies on calls to functions in separate compilation units is a non-standard extension. I still don't understand why that would be so. Is that what's going on in the first quote?
SUBSTANTIAL EDIT:
The use of PyObject_HEAD_INIT(NULL) is advised to go at the very top of the initialization of an instance of PyTypeObject.
The definition of PyTypeObject looks like this:
typedef struct _typeobject {
PyObject_VAR_HEAD
const char *tp_name; /* For printing, in format "<module>.<name>" */
Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */
/* Methods to implement standard operations */
destructor tp_dealloc;
/*... lots more ... */
} PyTypeObject;
The PyObject_HEAD_INIT(NULL) macro is used to initialize the top of PyTypeObject instances. The top of the PyTypeObject definition is created by the macro PyObject_VAR_HEAD. PyObject_VAR_HEAD is:
/* PyObject_VAR_HEAD defines the initial segment of all variable-size
* container objects. These end with a declaration of an array with 1
* element, but enough space is malloc'ed so that the array actually
* has room for ob_size elements. Note that ob_size is an element count,
* not necessarily a byte count.
*/
#define PyObject_VAR_HEAD \
PyObject_HEAD \
Py_ssize_t ob_size; /* Number of items in variable part */
#define Py_INVALID_SIZE (Py_ssize_t)-1
In turn, PyObject_HEAD expands to:
/* PyObject_HEAD defines the initial segment of every PyObject. */
#define PyObject_HEAD \
_PyObject_HEAD_EXTRA \
Py_ssize_t ob_refcnt; \
struct _typeobject *ob_type;
_PyObject_HEAD_EXTRA is only used in debugging builds and normally expands to nothing. The members being initialized by the PyObject_HEAD_INIT macro are ob_refcnt and ob_type. ob_type is the one that we would like to initialize with &PyType_Type, but we're told that would violate the C Standard. ob_type points to a _typeobject, which is typedef'd as PyTypeObject (the same struct that we're trying to initialize). The PyObject_HEAD_INIT macro, which initializes those two values, is defined like so:
#define PyObject_HEAD_INIT(type) \
_PyObject_EXTRA_INIT \
1, type,
So we're starting a reference count at 1 and setting a member pointer to whatever is in the type parameter. The Python documentation says we can't set the type parameter to the address of PyType_Type because that is not strictly standard C, so we settle for NULL.
PyType_Type is declared in the same translation unit a few lines below.
PyAPI_DATA(PyTypeObject) PyType_Type; /* built-in 'type' */
PyAPI_DATA is defined elsewhere. It has a few different conditional definitions.
#define PyAPI_DATA(RTYPE) extern __declspec(dllexport) RTYPE
#define PyAPI_DATA(RTYPE) extern RTYPE
So the Python API documentation is saying that we'd like to initialize an instance of a PyTypeObject with a pointer to a previously declared PyTypeObject that was declared with the extern qualifier. What in the C Standard would that violate?
The initialization of PyType_Type occurs in a .c file. A typical Python extension that initializes a PyTypeObject, as described above, will be dynamically loaded by code that was compiled with this initialization:
PyTypeObject PyType_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"type", /* tp_name */
sizeof(PyHeapTypeObject), /* tp_basicsize */
sizeof(PyMemberDef), /* tp_itemsize */
(destructor)type_dealloc, /* tp_dealloc */
/* ... lots more ... */
};
PyObject_HEAD_INIT(&PyType_Type)
Produces
1, &PyType_Type
which initializes fields
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
PyType_Type is declared with PyAPI_DATA(PyTypeObject) PyType_Type which produces
extern PyTypeObject PyType_Type;
possibly with a __declspec qualifier. PyTypeObject is a typedef for struct _typeobject, so we have
extern struct _typeobject PyType_Type;
so PyObject_HEAD_INIT(&PyType_Type) would initialize the struct _typeobject* ob_type field with a struct _typeobject* ... which is certainly valid C, so I don't see why they say it isn't.
I came upon an explanation of this elsewhere in the Python source code.
/* We link this module statically for convenience. If compiled as a shared
library instead, some compilers don't allow addresses of Python objects
defined in other libraries to be used in static initializers here. The
DEFERRED_ADDRESS macro is used to tag the slots where such addresses
appear; the module init function must fill in the tagged slots at runtime.
The argument is for documentation -- the macro ignores it.
*/
#define DEFERRED_ADDRESS(ADDR) 0
And then the macro is used where NULL appears at the top of the OP.
PyVarObject_HEAD_INIT(DEFERRED_ADDRESS(&PyType_Type), 0)
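The deferred-address pattern described in that comment can be sketched in plain C. Everything named demo_* below is hypothetical and only mirrors the shape of the CPython code; in a single program the direct initializer would be fine, and the sketch just imagines that demo_base_type lives in a separate shared library.

```c
#include <stddef.h>

typedef struct demo_type {
    struct demo_type *ob_type;
    const char *name;
} demo_type;

/* The argument is for documentation only; the macro ignores it. */
#define DEMO_DEFERRED_ADDRESS(ADDR) NULL

/* Imagine this object is exported by another shared library. */
extern demo_type demo_base_type;

/* The slot is tagged in the static initializer but left empty. */
static demo_type demo_my_type = {
    DEMO_DEFERRED_ADDRESS(&demo_base_type),
    "my_type",
};

/* The module init function fills in the tagged slot at run time. */
void demo_module_init(void)
{
    demo_my_type.ob_type = &demo_base_type;
}

/* Returns nonzero once the deferred slot has been filled in. */
int demo_type_is_set(void)
{
    return demo_my_type.ob_type == &demo_base_type;
}

/* Definition of the "imported" object, kept here so the sketch links. */
demo_type demo_base_type = { &demo_base_type, "type" };
```

Before demo_module_init() runs, the slot holds NULL, which matches what PyObject_HEAD_INIT(NULL) leaves in ob_type until the type is readied.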

C empty struct -- what does this mean/do?

I found this code in a header file for a device that I need to use, and although I've been doing C for years, I've never run into this:
struct device {
};
struct spi_device {
struct device dev;
};
and it is used as in:
int spi_write_then_read(struct spi_device *spi,
const unsigned char *txbuf, unsigned n_tx,
unsigned char *rxbuf, unsigned n_rx);
and also here:
struct spi_device *spi = phy->spi;
where it is defined the same.
I'm not sure what the point is with this definition. It is in a header file for a Linux application of the board, but I am baffled by its use. Any explanations, ideas? Anyone seen this before (I'm sure some of you have :).
Thanks!
This is not C as C structures have to contain at least one named member:
(C11, 6.7.2.1 Structure and union specifiers p8) "If the struct-declaration-list does not contain any named members, either directly or via an anonymous structure or anonymous union, the behavior is undefined."
but a GNU C extension:
GCC permits a C structure to have no members:
struct empty {
};
The structure has size zero
https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html
I don't know what is the purpose of this construct in your example but in general I think it may be used as a forward declaration of the structure type. Note that in C++ it is allowed to have a class with no member.
Linux kernel 2.4 has an example of an empty structure type with conditional compilation, in the definition of the spinlock_t type alias (in include/linux/spinlock.h):
#if (DEBUG_SPINLOCKS < 1)
/* ... */
typedef struct { } spinlock_t;
#elif (DEBUG_SPINLOCKS < 2)
/* ... */
typedef struct {
volatile unsigned long lock;
} spinlock_t;
#else /* (DEBUG_SPINLOCKS >= 2) */
/* ... */
typedef struct {
volatile unsigned long lock;
volatile unsigned int babble;
const char *module;
} spinlock_t;
#endif
The purpose is to save some space without having to change the function APIs in case DEBUG_SPINLOCKS < 1. It also allows defining dummy (zero-sized) objects of type spinlock_t.
Another example in the (recent) Linux kernel of an empty structure hack used with conditional compilation in include/linux/device.h:
struct acpi_dev_node {
#ifdef CONFIG_ACPI
void *handle;
#endif
};
See the discussion with Greg Kroah-Hartman for this last example here:
https://lkml.org/lkml/2012/11/19/453
This is not standard C.
C11: 6.2.5-20:
— A structure type describes a sequentially allocated nonempty set of member objects (and, in certain circumstances, an incomplete array), each of which has an optionally specified name and possibly distinct type.
J.2 Undefined behavior:
The behavior is undefined in the following circumstances:
....
— A structure or union is defined without any named members (including those
specified indirectly via anonymous structures and unions) (6.7.2.1).
GCC supports it as an extension (no further detail is given there about when or where it should be used). Using this in any program will make it compiler-specific.
One reason to do this in a library might be that the library developers do not want you to know or interfere with the internals of these structs. In these cases they may provide an "interface" version of the structs spi_device/device (which is what you may be seeing) and have a second type definition that defines another version of said structs, with the actual members, for use inside the library.
Since you cannot access struct members or even create compatible structs of that type yourself with that approach (since even your compiler would not know the actual size of this struct), this only works if the library itself creates the structs, only ever passes you pointers to them, and does not need you to modify any members.
If you add an empty struct as the first member of another struct, the empty
struct can serve as a "marker interface", i.e. when you cast a pointer to that
outer struct to a pointer of the inner struct and the cast succeeds you know
that the outer struct is "marked" as something.
Also it might just be a placeholder for future development, not too sure. Hope this helps.
This is valid C
struct empty;
struct empty *empty;
and facilitates use of addresses of opaque regions of memory.
Such addresses are usually obtained from and passed to library subroutines.
For example, something like this is done in stdio.h.
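A minimal sketch of that opaque-pointer arrangement follows. The demo_* names are invented for illustration, and the "header" and "library" halves are shown in one block only so that it compiles on its own.

```c
#include <stdlib.h>

/* What a public header would expose: an incomplete type plus
   functions that hand out and accept pointers to it. */
struct demo_handle;
struct demo_handle *demo_open(int id);
int demo_id(const struct demo_handle *h);
void demo_close(struct demo_handle *h);

/* What the library's own source file would contain: the real
   definition, invisible to callers that only see the header. */
struct demo_handle {
    int id;
};

struct demo_handle *demo_open(int id)
{
    struct demo_handle *h = malloc(sizeof *h);
    if (h != NULL)
        h->id = id;
    return h;   /* NULL on allocation failure */
}

int demo_id(const struct demo_handle *h)
{
    return h->id;
}

void demo_close(struct demo_handle *h)
{
    free(h);
}
```

A caller that sees only the declarations can store and pass the pointer around, but cannot take sizeof(struct demo_handle) or touch its members, and unlike the empty-struct extension this is standard C.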
