Using structs with attribute packed when parsing binary data - C

I have seen various code where data is read into a char buffer and then cast to a struct. An example is parsing file formats where data sits at fixed offsets.
Example:
struct some_format {
    char magic[4];
    uint32_t len;
    uint16_t foo;
};
struct some_format *sf = (struct some_format *) buf;
To be sure this is always valid, one needs to eliminate padding from the struct by using __attribute__((packed)).
struct test {
    uint8_t a;
    uint8_t b;
    uint32_t c;
    uint8_t d[128];
} __attribute__((packed));
When reading big and complex file formats, this surely makes things much simpler; think of reading a media format with structs having 30+ members. It is also easy to read in a huge buffer and cast pieces of it to the proper types, for example:
struct mother {
    uint8_t a;
    uint8_t b;
    uint32_t offset_child;
};
struct child {
    ...
};
struct mother *m = (struct mother *) buf;
struct child *c = (struct child *) ((uint8_t *) buf + m->offset_child);
Or:
read_big_buf(buf, 4096);
a = (struct a*) buf;
b = (struct b*) (buf + sizeof(struct a));
c = (struct c*) (buf + SOME_DEF);
...
It would also be easy to quickly write such structures to file.
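For instance, dumping a packed struct to disk takes only a few lines (a minimal sketch; the variable t and the file name are illustrative):
FILE *f = fopen("dump.bin", "wb");
if (f != NULL) {
    fwrite(&t, sizeof(struct test), 1, f); /* t is a struct test as above */
    fclose(f);
}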
My question is how good or bad this way of coding is. I am looking at various data structures and would like to use the best way to handle this.
Is this how it is done? (As in: is this common practice.)
Is __attribute__((packed)) always safe?
Is it better to use sscanf? (What was I thinking about? Thanks #Amardeep.)
Is it better to make functions that initialize the structure with casts and bit shifting?
etc.
As of now I use this mainly in data-inspection tools, like listing all structures of a certain type with their values in a file format such as a media stream. Information-dumping tools.

It is how it is sometimes done. Packed is safe as long as you use it correctly. Using sscanf() would imply you are reading text data, which is a different use case than a binary image from a structure.
If your code does not require portability across compilers and/or platforms (CPU architectures), and your compiler has support for packed structures, then this is a perfectly legitimate way of accessing serialized data.
However, problems may arise if you try to generate data on one platform and use it on another due to:
Host Byte Order (Little Endian/Big Endian)
Different sizes for language primitive types (long can be 32 or 64 bits for example)
Code changes on one side but not the other.
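The byte-order issue is commonly handled by defining the serialized format to use network (big-endian) byte order and converting each field after the cast; a minimal sketch, assuming the struct some_format from the question:
#include <arpa/inet.h> /* ntohl(), ntohs() */

struct some_format *sf = (struct some_format *) buf;
uint32_t len = ntohl(sf->len); /* big-endian wire order -> host order */
uint16_t foo = ntohs(sf->foo);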
There are libraries that simplify serialization/deserialization and handle most of these issues. The overhead of such operations is more easily justified on systems that must span processes and hosts. However, if your structures are very complex, using a ser/des library may be justified simply for ease of maintenance.

Is this how it is done?
I don't understand this question. Edit: you'd like to know if this is a common idiom. In codebases where dependency on GNU extensions is acceptable, yes, this is used quite frequently, since it's convenient.
Is __attribute__((packed)) always safe?
For this use case, pretty much yes, except when it's unavailable.
Is it better to use sscanf?
No. Don't use scanf().
Is it better to make functions that initialize the structure with casts and bit shifting?
It's more portable. __attribute__((packed)) is a GNU extension, and not all compilers support it (although I'm wondering who cares about compilers other than GCC and Clang, but theoretically, this still is an issue).

One of my gripes about C language standards to date is that they impose enough rules about how compilers have to lay out structures and bit fields to preclude what might otherwise be useful optimizations [e.g. on a system with power-of-two integer sizes, a compiler would be forbidden from packing eight three-bit fields into three bytes], but they do not provide any means by which a programmer can specify an explicit struct layout. I used to frequently use byte pointers to read out data from structures, but I don't favor such techniques now as much as I used to. When speed isn't critical, I prefer nowadays to use a family of functions which either write multi-byte types to multiple consecutive memory locations using whatever endianness is needed [e.g. void storeI32LE(uint8_t **dest, int32_t dat)] or read them back [e.g. int32_t readI32LE(uint8_t const **src)]. Such code will not be as efficient as what a compiler might be able to produce in cases where the processor has the correct endianness and either the structure members are aligned or the processor supports unaligned accesses, but code using such methods may easily be ported to any processor regardless of its native alignment and endianness.
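A minimal sketch of such helpers (the double pointer lets each call advance the cursor, matching the signatures above):
#include <stdint.h>

void storeI32LE(uint8_t **dest, int32_t dat)
{
    uint32_t u = (uint32_t) dat;
    (*dest)[0] = (uint8_t)(u & 0xFF);
    (*dest)[1] = (uint8_t)((u >> 8) & 0xFF);
    (*dest)[2] = (uint8_t)((u >> 16) & 0xFF);
    (*dest)[3] = (uint8_t)((u >> 24) & 0xFF);
    *dest += 4; /* advance the output cursor */
}

int32_t readI32LE(uint8_t const **src)
{
    uint32_t u = (uint32_t)(*src)[0]
               | ((uint32_t)(*src)[1] << 8)
               | ((uint32_t)(*src)[2] << 16)
               | ((uint32_t)(*src)[3] << 24);
    *src += 4; /* advance the input cursor */
    return (int32_t) u;
}
Byte-at-a-time access like this never performs a misaligned load and never depends on the host's endianness.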

Related

Skip/avoid alignment padding bytes when calculating struct checksum

Is there a generic way to skip/avoid alignment padding bytes when calculating the checksum of a C struct?
I want to calculate the checksum of a struct by summing the bytes. The problem is, the struct has alignment padding bytes which can get random (unspecified) values and cause two structs with the identical data to get different checksum values.
Note: I'm mainly concerned about maintainability (adding/removing/modifying fields without the need to update the code) and reusability, not about portability (the platform is very specific and unlikely to change).
So far I have found a few solutions, but they all have disadvantages:
1. Pack the struct (e.g. #pragma pack(1)). Disadvantage: I prefer to avoid packing for better performance.
2. Calculate the checksum field by field. Disadvantage: the code will need to be updated when modifying the struct and requires more code (depending on the number of fields).
3. Set all struct bytes to zero before setting values. Disadvantage: I cannot fully guarantee that all structs were initially zeroed.
4. Arrange the struct fields to avoid padding, possibly adding dummy fields to fill padding. Disadvantage: not generic; the struct will need to be carefully rearranged whenever it is modified.
Is there a better generic way?
Calculating checksum example:
unsigned int calcCheckSum(MyStruct* myStruct)
{
    unsigned int checkSum = 0;
    unsigned char* bytes = (unsigned char*) myStruct;
    unsigned int byteCount = sizeof(MyStruct);
    for (unsigned int i = 0; i < byteCount; i++)
    {
        checkSum += bytes[i];
    }
    return checkSum;
}
Is there a generic way to skip/avoid alignment padding bytes when calculating the checksum of a C struct?
There is no such mechanism on which a strictly conforming program can rely. This follows from
the fact that C implementations are permitted to lay out structures with arbitrary padding following any member or members, for any reason or none, and
the fact that
When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.
(C2011, 6.2.6.1/6)
The former means that the standard provides no conforming way to guarantee that a structure layout contains no padding, and the latter means that in principle, there is nothing you can do to control the values of the padding bytes -- even if you initially zero-fill a structure instance, any padding takes unspecified values as soon as you assign to that object or to any of its members.
In practice, it is likely that any of the approaches you mention in the question will do the job where the C implementation and the nature of the data permit. But only (2), computing the checksum member by member, can be used by a strictly-conforming program, and that one is not "generic" as I take you to mean that term. This is what I would choose. If you have many distinct structures that require checksumming, then it might be worthwhile to deploy a code generator or macro magic to help you with maintaining things.
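For illustration, a member-by-member version of the question's calcCheckSum() might look like this (a sketch; the MyStruct members shown are hypothetical):
#include <stddef.h>
#include <stdint.h>

typedef struct { /* hypothetical members, for illustration only */
    uint16_t id;
    uint32_t value;
    uint8_t flags;
} MyStruct;

/* Sum the bytes of a single member, so padding is never read. */
static unsigned int sumBytes(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += bytes[i];
    return sum;
}

unsigned int calcCheckSumByMember(const MyStruct *s)
{
    return sumBytes(&s->id, sizeof s->id)
         + sumBytes(&s->value, sizeof s->value)
         + sumBytes(&s->flags, sizeof s->flags);
}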
On the other hand, your most reliable way to provide for generic checksumming is to exercise an implementation-specific extension that enables you to avoid structures containing any padding (your (1)). Note that this will tie you to a specific C implementation or implementations that implement such an extension compatibly, that it may not work at all on some systems (such as those where misaligned access is a hard error), and that it may reduce performance on other systems.
Your (4) is an alternative way to avoid padding, but it would be a portability and maintenance nightmare. Nevertheless, it could provide for generic checksumming, in the sense that the checksum algorithm wouldn't need to pay attention to individual members. But note that this also places a requirement for initialization behavior analogous to (3). That would come cheaper, but it would not be altogether automatic.
In practice, C implementations do not wantonly modify padding bytes, but they don't necessarily go out of their way to preserve them, either. In particular, even if you zero-filled rigorously, per your (3), padding is not guaranteed to be copied by whole-structure assignment or when you pass or return a structure by value. If you want to do any of those things, then you need to take measures on the receiving side to ensure zero-filling, and that requires member-by-member attention.
This sounds like an XY problem. Computing a checksum for a C object in memory is not usually a meaningful operation; the result depends on the C implementation (arch/ABI if not the specific compiler), and C does not admit a fault-tolerance programming model able to handle the possibility of object values changing out from under you due to hardware faults or memory-safety errors. Checksums make sense mainly for serialized data on disk or in transit over a network, where you want to guard against data corruption in storage/transit. And C structs are not for serialization (although they're commonly abused for it). If you write proper serialization routines, you can then just compute the checksum over the serialized byte stream.

Are there reasons to avoid bit-field structure members?

I have long known that C has bit-fields, and I occasionally use them to define densely packed structs:
typedef struct Message_s {
    unsigned int flag : 1;
    unsigned int channel : 4;
    unsigned int signal : 11;
} Message;
When I read open source code, I instead often find bit-masks and bit-shifting operations used to store and retrieve such information in hand-rolled bit-fields. This is so common that I doubt the authors were unaware of the bit-field syntax, so I wonder whether there are reasons to roll your own bit-fields via bit-masks and shift operations instead of relying on the compiler to generate code for getting and setting them.
Why do other programmers use hand-coded bit manipulation instead of bit-fields to pack multiple fields into a single word?
This answer is opinion-based, as the question is quite open:
Many programmers are unaware of the availability of bitfields or unsure about their portability and precise semantics. Some even distrust the compiler's ability to produce correct code. They prefer to write explicit code that they understand.
As commented by Cornstalks, this attitude is rooted in real life experience as explained in this article.
Bitfields' actual memory layout is implementation-defined: if the memory layout must follow a precise specification, bitfields should not be used and hand-coded bit manipulation may be required.
The handling of signed values in signed-typed bitfields is implementation-defined. If signed values are packed into a range of bits, it may be more reliable to hand-code the access functions.
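For comparison, a hand-rolled equivalent of the Message example above, with the bit positions chosen explicitly (the layout here is an arbitrary choice for illustration; the bit-field version leaves it to the compiler):
#include <stdint.h>

/* Chosen layout: flag = bit 0, channel = bits 1..4, signal = bits 5..15. */
#define MSG_FLAG_MASK 0x0001u
#define MSG_CHANNEL_SHIFT 1
#define MSG_CHANNEL_MASK 0x001Eu
#define MSG_SIGNAL_SHIFT 5
#define MSG_SIGNAL_MASK 0xFFE0u

static inline unsigned get_channel(uint16_t msg)
{
    return (msg & MSG_CHANNEL_MASK) >> MSG_CHANNEL_SHIFT;
}

static inline uint16_t set_channel(uint16_t msg, unsigned ch)
{
    return (uint16_t)((msg & ~MSG_CHANNEL_MASK)
                      | ((ch << MSG_CHANNEL_SHIFT) & MSG_CHANNEL_MASK));
}
Unlike the bit-field version, this pins down the exact layout, at the cost of writing (and maintaining) the masks and accessors by hand.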
Are there reasons to avoid bitfield-structs?
Bitfield-structs come with some limitations:
Bit fields result in non-portable code. Also, the bit field length has a high dependency on word size.
Reading into bit fields (e.g. with scanf()) and taking pointers to them is not possible, since they are not addressable.
Bit fields are used to pack more variables into a smaller data space, but cause the compiler to generate additional code to manipulate these variables. This results in an increase in both space and time complexity.
The sizeof() operator cannot be applied to bit fields, since sizeof() yields the result in bytes and not in bits.
Source
So whether you should use them or not depends. Read more in Why bit endianness is an issue in bitfields?
PS: When to use bit-fields in C?
There is no reason to avoid them. Bitfields are useful and convenient. They are in common use in embedded projects. Some architectures (like ARM) even have special instructions for manipulating bitfields.
Just compare the code (and write the rest of the function foo1)
https://godbolt.org/g/72b3vY
In many cases, it is useful to be able to address individual groups of bits within a word, or to operate on a word as a unit. The Standard presently does not provide
any practical and portable way to achieve such functionality. If code is written to use bitfields and it later becomes necessary to access multiple groups as a word, there would be no nice way to accommodate that without reworking all the code using the bit fields or disabling type-based aliasing optimizations, using type punning, and hoping everything gets laid out as expected.
Using shifts and masks may be inelegant, but until C provides a means of treating an explicitly-designated sequence of bits within one lvalue as another lvalue, it is often the best way to ensure that code will be adaptable to meet needs.

When to use stdint.h's scalars in driver code

It has come to my attention that there seems to be no consistency nor best practice for when to use native scalar types (int, short, char) or the ones provided by stdint.h: uint32_t, uint16_t, uint8_t.
This is bugging me a lot because drivers form an essential part of a kernel that needs to be maintainable, consistent, stable and good.
Here is an illustrative example in gcc (I used this for a hobby project on the Raspberry Pi):
// using native scalars
struct fbinfo {
    unsigned width, height;
    unsigned vwidth, vheight;
    unsigned pitch, bits;
    int x, y;
    void *ptr;
    unsigned size;
} __attribute__((aligned(16)));
// using stdint scalars
struct fbinfo {
    uint32_t width, height;
    uint32_t vwidth, vheight;
    uint32_t pitch, bits;
    int32_t x, y;
    uint32_t ptr; // convert to void* in order to use it
    uint32_t size;
} __attribute__((aligned(16)));
To me, the first example seems more logical, because this piece of code is only intended to run on a Raspberry Pi. It would be pointless to run this on other hardware.
The second example seems more practical, because it is more descriptive: C does not guarantee much about the size of integers. An int may be 16 bits or something else. uint32_t, uint_fast32_t, and variants make guarantees about the exact or approximate size, e.g. exactly 32 bits, or at least 32 bits.
The operating system development community tends to use the stdint types, while the Linux kernel uses multiple different techniques: u32, __u32, and endian-specific types like __le32.
What considerations should be taken into account when choosing between a native scalar type and a typedef'd one? Is it better to use native scalar types in the provided example, or stdint.h's?
1. Fixed-width vs. fundamental types
Fixed-width types are sometimes difficult to use. E.g. the printf() specifier for int32_t is PRIi32, which requires splitting the format string:
printk("foo=%" PRIi32 ", bar=%" PRIi32 "\n", foo, bar);
Fixed width types should/must be used when hardware is accessed directly; e.g. when writing DMA descriptors. But for simple register accesses, writel() or readl() functions can be used which work with fundamental types.
As a rule of thumb, when a certain memory layout is assumed (like the __attribute__((__aligned__(16))) in your example), fixed-width types should be used.
Signed fixed width types (int32_t x,y in your example) might need double checking, whether their representation matches the hardware expectations.
NOTE that in your example, the second structure is architecture-dependent because of
uint32_t ptr; // convert to void* in order to use it
Writing such a thing in portable C would be uintptr_t ptr, and in the kernel it is common to write
unsigned long ptr;
Alternatively, dma_addr_t might be a better type.
2. uint32_t vs. __u32
More than 10 years ago, Linus Torvalds objected to uint32_t because at that time, non-C99 compilers were common and using such types in (exported) Linux headers would pollute the namespace.
But now, uint32_t and similar types are available everywhere (you cannot compile the kernel with a non-C99 compiler) and kernel header export has been improved significantly, so these arguments are gone.
It is a matter of personal preference whether to use the standard types or typedef'ed variants (which are framework-dependent and differ between frameworks).
3. uint_fastX_t and variants
They are not used in the kernel and I would avoid them. They combine the disadvantages of uint32_t (awkward usage) and int (variable width).
4. __le32 vs. __u32
Use the endian types when specification explicitly requires them (e.g. in network protocol implementations). This makes it easy to detect wrong usage (e.g. assignments like endian_variable = native_variable).
Do not use them e.g. for filling processor structures (e.g. DMA descriptors); some processors can run both in little- and big-endian mode, and native datatypes are usually the right way to write such information.
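A minimal sketch of the intended usage in kernel code (assuming the standard kernel helpers cpu_to_le32()/le32_to_cpu(); the struct is illustrative):
#include <linux/types.h>
#include <asm/byteorder.h>

struct wire_header {
    __le32 length; /* explicitly little-endian, regardless of CPU mode */
};

static void fill_header(struct wire_header *h, u32 len)
{
    h->length = cpu_to_le32(len); /* explicit conversion, checked by sparse */
    /* h->length = len; would be flagged by sparse as a type mismatch */
}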

Struct packing for multi processor data communication

On our ASIC we have two processors, which means two different compilers that behave slightly differently. We pass structs full of data between the two for communication; this happens many times per second, so we don't have much time to burn here.
The problem is that the two compilers treat padding differently. So when we go to copy data from one domain to the other, we get values that don't align correctly. Our initial solution was to put __attribute__((packed)) on everything inside the struct. While this seems to work most of the time, it is by no means portable. As we move the code to different platforms, I'm noticing that not all compilers understand __attribute__((packed)), and I'd like to keep our code portable.
Has anyone here dealt with this kind of issue? What would you recommend?
Thanks in advance!
I would manually pack such structures, so that there is no padding issue with any compiler, present or future.
It is tedious, but it is a portable future-proof solution, worth the effort.
Fundamentally, struct arrangement in C isn't portable, so __attribute__((packed)) or similar is the typical way to impose a fixed layout on a struct.
The other option is to manually add pad fields in appropriate places, staying aware of each platform's alignment constraints, to ensure that the two structures match across your two platforms -- but this is essentially a manual __attribute__((packed)).
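For example, a layout with all padding spelled out explicitly (the members here are illustrative) so that every compiler sees the same arrangement:
#include <stdint.h>

/* Every member sits at a naturally aligned offset and the total size is a
   multiple of the widest member's alignment, so no compiler needs to
   insert padding of its own. */
struct message {
    uint32_t x;     /* offset 0 */
    uint16_t y;     /* offset 4 */
    uint8_t flags;  /* offset 6 */
    uint8_t pad0;   /* offset 7: explicit padding, always written as 0 */
};

/* Compile-time check (C11) that the layout is what we expect. */
_Static_assert(sizeof(struct message) == 8, "unexpected padding");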
Note that the pahole utility from dwarves (announcement, paper) is a great tool to check and view the alignment of structures, assuming your compiler emits ELF files.
This is the reason why you shouldn't use structs for data protocols. It might seem like a harsh thing to say, but structs are unfortunately non-portable and therefore dangerous. You could consider doing something like this (pseudo code):
typedef struct
{
    uint32_t x;
    uint16_t y;
    ...
} my_struct_t; // some custom struct

#define PROTOCOL_SIZE ( sizeof(uint32_t) + sizeof(uint16_t) + ... )

void pack (uint8_t raw_data[PROTOCOL_SIZE],
           const my_struct_t* ms)
{
    uint16_t i=0;

    memcpy(&raw_data[i], &ms->x, sizeof(ms->x));
    i += sizeof(ms->x);
    memcpy(&raw_data[i], &ms->y, sizeof(ms->y));
    i += sizeof(ms->y);
    ...
}
And then make a similar unpack() function which copies raw data into the struct.
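A matching unpack() in the same pseudo-code style, with the copy direction reversed:
void unpack (my_struct_t* ms,
             const uint8_t raw_data[PROTOCOL_SIZE])
{
    uint16_t i=0;

    memcpy(&ms->x, &raw_data[i], sizeof(ms->x));
    i += sizeof(ms->x);
    memcpy(&ms->y, &raw_data[i], sizeof(ms->y));
    i += sizeof(ms->y);
    ...
}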
The advantages of this: it is 100% portable. And if the protocol specifies a particular endianness, this function could also handle that conversion (which you would have to do anyhow).
Disadvantages are one extra memory buffer and some extra data copying.

MPI and C structs

I have to admit, I was quite shocked to see how many lines of code are required to transfer one C struct with MPI.
Under what circumstances will it work to simply transmit a struct using the predefined datatype MPI_CHAR? Consider the following example:
struct particle {
    double x;
    double y;
    long i;
};

struct particle p;
MPI_Isend(&p, sizeof(struct particle), MPI_CHAR, dest, tag, MPI_COMM_WORLD, &sendr);
In my case, all processes run on the same architecture. Is padding the only issue?
MPI_BYTE is the datatype to use when sending untyped data, not MPI_CHAR. If the two machines have the same architecture but use different character encoding, using MPI_CHAR may corrupt your data. Sending your data as MPI_BYTE will leave the binary representation as it is and perform no representation conversion whatsoever.
That said, yes, technically it is correct to send a struct that way, if (and only if) you can guarantee that the data representation will be the same on the sending and receiving ends. However, it is poor programming practice, as it obfuscates the purpose of your code, and introduces platform dependency.
Keep in mind that you only have to define and commit a datatype once, while you will generally need to write code to send it many times. Reducing the clarity of all your sends just to save a couple of lines on a single definition is not a good trade.
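For reference, defining and committing a datatype for struct particle is short; a sketch using the standard MPI type-construction calls, with member offsets taken via offsetof:
#include <mpi.h>
#include <stddef.h> /* offsetof */

/* Build an MPI datatype matching struct particle; call once, reuse for
   every send/receive. */
MPI_Datatype make_particle_type(void)
{
    int blocklengths[3] = { 1, 1, 1 };
    MPI_Aint displacements[3] = { offsetof(struct particle, x),
                                  offsetof(struct particle, y),
                                  offsetof(struct particle, i) };
    MPI_Datatype types[3] = { MPI_DOUBLE, MPI_DOUBLE, MPI_LONG };
    MPI_Datatype particle_type;

    MPI_Type_create_struct(3, blocklengths, displacements, types,
                           &particle_type);
    MPI_Type_commit(&particle_type);
    return particle_type;
}
The send then becomes MPI_Isend(&p, 1, particle_type, dest, tag, MPI_COMM_WORLD, &sendr), and MPI takes care of any representation differences.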
Personally I'd be more concerned about comprehensibility and maintainability, even portability, than padding. If I'm sending a structure I like my code to show that I am sending a structure, not a sequence of bytes or chars. And I expect my codes to run on multiple architectures, across multiple generations of language standards and compilers.
I guess all I'm saying is that, if it's worth defining a structure (and you obviously think it is) then it's worth defining a structure. Saving a few lines of (near-)boilerplate isn't much of an argument against that.
