I have to admit, I was quite shocked to see how many lines of code are required to transfer one C struct with MPI.
Under what circumstances will it work to simply transmit a struct using the predefined datatype MPI_CHAR? Consider the following example:
struct particle {
double x;
double y;
long i;
};
struct particle p;
MPI_Isend(&p, sizeof(struct particle), MPI_CHAR, dest, tag, MPI_COMM_WORLD, &sendr);
In my case, all processes run on the same architecture. Is padding the only issue?
MPI_BYTE is the datatype to use when sending untyped data, not MPI_CHAR. If the two machines have the same architecture but use different character encoding, using MPI_CHAR may corrupt your data. Sending your data as MPI_BYTE will leave the binary representation as it is and perform no representation conversion whatsoever.
That said, yes, technically it is correct to send a struct that way, if (and only if) you can guarantee that the data representation will be the same on the sending and receiving ends. However, it is poor programming practice, as it obfuscates the purpose of your code, and introduces platform dependency.
Keep in mind that you only have to define and commit a datatype once, while you will generally need to write code to send it several times. Reducing the clarity of all your sends just to save a couple lines on the single definition is not a trade up.
Personally I'd be more concerned about comprehensibility and maintainability, even portability, than padding. If I'm sending a structure I like my code to show that I am sending a structure, not a sequence of bytes or chars. And I expect my codes to run on multiple architectures, across multiple generations of language standards and compilers.
I guess all I'm saying is that, if it's worth defining a structure (and you obviously think it is) then it's worth defining a structure. Saving a few lines of (near-)boilerplate isn't much of an argument against that.
Is there a generic way to skip/avoid alignment padding bytes when calculating the checksum of a C struct?
I want to calculate the checksum of a struct by summing the bytes. The problem is, the struct has alignment padding bytes which can get random (unspecified) values and cause two structs with the identical data to get different checksum values.
Note: I'm mainly concerned about maintainability (adding/removing/modifying fields without the need to update the code) and reusability, not about portability (the platform is very specific and unlikely to change).
Currently, I found a few solutions, but they all have disadvantages:
Pack the struct (e.g. #pragma pack (1)). Disadvantage: I prefer to avoid packing for better performance.
Calculate checksum field by field. Disadvantage: The code will need to be updated when modifying the struct and requires more code (depending on the number of fields).
Set to zero all struct bytes before setting values. Disadvantage: I cannot fully guarantee that all structs were initially zeroed.
Arrange the struct fields to avoid padding and possibly add dummy fields to fill padding. Disadvantage: Not generic, the struct will need to be carefully rearranged when modifying the struct.
Is there a better generic way?
Calculating checksum example:
unsigned int calcCheckSum(MyStruct* myStruct)
{
unsigned int checkSum = 0;
unsigned char* bytes = (unsigned char*)myStruct;
unsigned int byteCount = sizeof(MyStruct);
for(int i = 0; i < byteCount; i++)
{
checkSum += bytes[i];
}
return checkSum;
}
Is there a generic way to skip/avoid alignment padding bytes when calculating the checksum of a C struct?
There is no such mechanism on which a strictly conforming program can rely. This follows from
the fact that C implementations are permitted to lay out structures with arbitrary padding following any member or members, for any reason or none, and
the fact that
When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.
(C2011, 6.2.6.1/6)
The former means that the standard provides no conforming way to guarantee that a structure layout contains no padding, and the latter means that in principle, there is nothing you can do to control the values of the padding bytes -- even if you initially zero-fill a structure instance, any padding takes indeterminate values as soon as you assign to that object or to any of its members.
In practice, it is likely that any of the approaches you mention in the question will do the job where the C implementation and the nature of the data permit. But only (2), computing the checksum member by member, can be used by a strictly-conforming program, and that one is not "generic" as I take you to mean that term. This is what I would choose. If you have many distinct structures that require checksumming, then it might be worthwhile to deploy a code generator or macro magic to help you with maintaining things.
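As a rough illustration of the "macro magic" route, here is a minimal sketch using an X-macro. The member list (flag, count, id) and the names MY_STRUCT_FIELDS, DECLARE_FIELD, SUM_FIELD and sumBytes are made up for the example; only calcCheckSum and MyStruct come from the question. The same list generates both the struct and the member-by-member checksum, so adding a field means touching one place only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field list: one X(type, name) entry per member. */
#define MY_STRUCT_FIELDS(X) \
    X(uint8_t,  flag)       \
    X(uint32_t, count)      \
    X(uint16_t, id)

/* Expand the list into the struct definition. */
#define DECLARE_FIELD(type, name) type name;
typedef struct {
    MY_STRUCT_FIELDS(DECLARE_FIELD)
} MyStruct;

/* Sum the bytes of one object; called once per member. */
static unsigned int sumBytes(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += bytes[i];
    }
    return sum;
}

/* Expand the list into one sumBytes() call per member.
 * Padding bytes are never read, so their values cannot matter. */
#define SUM_FIELD(type, name) checkSum += sumBytes(&s->name, sizeof s->name);
unsigned int calcCheckSum(const MyStruct *s)
{
    unsigned int checkSum = 0;
    MY_STRUCT_FIELDS(SUM_FIELD)
    return checkSum;
}
```

This only handles scalar members as written; array or nested-struct members would need their own entry kinds in the list.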
On the other hand, your most reliable way to provide for generic checksumming is to exercise an implementation-specific extension that enables you to avoid structures containing any padding (your (1)). Note that this will tie you to a specific C implementation or implementations that implement such an extension compatibly, that it may not work at all on some systems (such as those where misaligned access is a hard error), and that it may reduce performance on other systems.
Your (4) is an alternative way to avoid padding, but it would be a portability and maintenance nightmare. Nevertheless, it could provide for generic checksumming, in the sense that the checksum algorithm wouldn't need to pay attention to individual members. But note also that this also places a requirement for initialization behavior analogous to (3). That would come cheaper, but it would not be altogether automatic.
In practice, C implementations do not wantonly modify padding bytes, but they don't necessarily go out of their way to preserve them, either. In particular, even if you zero-filled rigorously, per your (3), padding is not guaranteed to be copied by whole-structure assignment or when you pass or return a structure by value. If you want to do any of those things then you need to take measures on the receiving side to ensure zero-filling, and that requires member-by-member attention.
This sounds like an XY problem. Computing a checksum for a C object in memory is not usually a meaningful operation; the result is dependent on the C implementation (arch/ABI if not even the specific compiler) and C does not admit a fault-tolerance programming model able to handle the possibility of object values changing out from under you due to hardware faults or memory-safety errors. Checksums make sense mainly for serialized data on disk or in transit over a network where you want to guard against data corruption in storage/transit. And C structs are not for serialization (although they're commonly abused for it). If you write proper serialization routines, you then can just do the checksum on the serialized byte stream.
On our ASIC we have two processors, which means two different compilers that behave slightly differently. We pass structs full of data between the two for communication, this happens quite often per second so we don't have much time to burn here.
The problem is that both compilers treat padding differently. So when we go to copy data from one domain to the other, we get values that don't align correctly. Our initial solution was to put __attribute__((packed)) on everything inside the struct. While this seems to work most of the time, it is by no means portable. As we are moving the code to different platforms, I'm noticing that not all compilers understand __attribute__((packed)) and I'd like to keep our code portable.
Has anyone here dealt with this kind of issue? What would you recommend?
Thanks in advance!
I would manually pack such structures, so that there is no padding issue with any compiler, present or future.
It is tedious, but it is a portable future-proof solution, worth the effort.
Fundamentally struct arrangement in C isn't portable, and so __attribute__((packed)) or similar is generally the typical way to impose a fixed layout on a struct.
The other option is to manually add pad fields in appropriate places and be aware of each platform's alignment constraints to ensure that across your two platforms the two structures match - but this is essentially manual __attribute__((packed)).
Note that the pahole utility from dwarves (announcement, paper) is a great tool to check and view the alignment of structures, assuming your compiler emits ELF files.
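To make the manual-padding approach a bit safer, you can have each compiler check the layout at build time. A minimal sketch, assuming both toolchains support C11; the struct shared_msg and its members are hypothetical, chosen only to show the idea:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical message shared by both processors. Padding is spelled out
 * as named members so that no compiler has a reason to insert its own. */
struct shared_msg {
    uint8_t  kind;
    uint8_t  pad0[3];   /* explicit padding: keeps 'value' at offset 4 */
    uint32_t value;
    uint16_t seq;
    uint8_t  pad1[2];   /* explicit tail padding: size stays a multiple of 4 */
};

/* Compile-time layout checks: the build fails on any compiler that disagrees. */
static_assert(offsetof(struct shared_msg, kind)  == 0,  "kind offset");
static_assert(offsetof(struct shared_msg, value) == 4,  "value offset");
static_assert(offsetof(struct shared_msg, seq)   == 8,  "seq offset");
static_assert(sizeof(struct shared_msg)          == 12, "total size");
```

This does nothing about byte order or differing type sizes; it only turns a silent layout mismatch into a compile error.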
This is the reason why you shouldn't use structs for data protocols. It might seem like a harsh thing to say, but structs are unfortunately non-portable and therefore dangerous. You could consider doing something like this (pseudo code):
typedef struct
{
uint32_t x;
uint16_t y;
...
} my_struct_t; // some custom struct
#define PROTOCOL_SIZE ( sizeof(uint32_t) + sizeof(uint16_t) + ... )
void pack (uint8_t raw_data[PROTOCOL_SIZE],
const my_struct_t* ms)
{
uint16_t i=0;
memcpy(&raw_data[i], &ms->x, sizeof(ms->x));
i += sizeof(ms->x);
memcpy(&raw_data[i], &ms->y, sizeof(ms->y));
i += sizeof(ms->y);
...
}
And then make a similar unpack() function which copies raw data into the struct.
The advantage of this is that it is 100% portable. And if the protocol specifies a particular endianness, this function could also handle that conversion (which you would have to do anyhow).
Disadvantages are one extra memory buffer and some extra data copying.
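For completeness, a sketch of what the pack/unpack pair might look like for a concrete two-member struct. The members are cut down to x and y as an assumption for the example, and the bytes are copied in native order; any endianness conversion the protocol requires would still have to be added on top.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t x;
    uint16_t y;
} my_struct_t;   /* cut-down stand-in for the custom struct */

/* Size of the raw record: the members back to back, with no padding. */
#define PROTOCOL_SIZE (sizeof(uint32_t) + sizeof(uint16_t))

/* Copy each member into the raw buffer at its running offset. */
void pack(uint8_t raw_data[PROTOCOL_SIZE], const my_struct_t *ms)
{
    size_t i = 0;
    memcpy(&raw_data[i], &ms->x, sizeof ms->x);
    i += sizeof ms->x;
    memcpy(&raw_data[i], &ms->y, sizeof ms->y);
}

/* Inverse of pack(): copy each member back out of the raw buffer. */
void unpack(my_struct_t *ms, const uint8_t raw_data[PROTOCOL_SIZE])
{
    size_t i = 0;
    memcpy(&ms->x, &raw_data[i], sizeof ms->x);
    i += sizeof ms->x;
    memcpy(&ms->y, &raw_data[i], sizeof ms->y);
}
```

Because memcpy works on bytes, neither function cares how the raw buffer is aligned or how the compiler padded my_struct_t.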
I have seen various code where one reads data into a char * or void * buffer and then
casts it to a struct. An example is parsing file formats where data has fixed offsets.
Example:
struct some_format {
char magic[4];
uint32_t len;
uint16_t foo;
};
struct some_format *sf = (struct some_format*) buf;
To be sure this is always valid, one needs to align the struct by using __attribute__((packed)).
struct test {
uint8_t a;
uint8_t b;
uint32_t c;
uint8_t d[128];
} __attribute__((packed));
When reading big and complex file formats this surely makes things much simpler. Typically
reading media format with structs having 30+ members etc.
It is also easy to read in a huge buffer and cast to proper type by for example:
struct mother {
uint8_t a;
uint8_t b;
uint32_t offset_child;
};
struct child {
...
};
m = (struct mother*) buf;
c = (struct child*) ((uint8_t*)buf + m->offset_child);
Or:
read_big_buf(buf, 4096);
a = (struct a*) buf;
b = (struct b*) (buf + sizeof(struct a));
c = (struct c*) (buf + SOME_DEF);
...
It would also be easy to quickly write such structures to file.
My question is how good or bad this way of coding is. I am looking at various data
structures and would use the best way to handle this.
Is this how it is done? (As in: is this common practice.)
Is __attribute__((packed)) always safe?
Is it better to use sscanf? (What was I thinking? Thanks, @Amardeep.)
Is it better to make functions where one initiates structure with casts and bit shifting.
etc.
As of now I use this mainly in data information tools. Like listing all structures of
a certain type with their values in a file format like e.g. a media stream.
Information dumping tools.
It is how it is sometimes done. Packed is safe as long as you use it correctly. Using sscanf() would imply you are reading text data, which is a different use case than a binary image from a structure.
If your code does not require portability across compilers and/or platforms (CPU architectures), and your compiler has support for packed structures, then this is a perfectly legitimate way of accessing serialized data.
However, problems may arise if you try to generate data on one platform and use it on another due to:
Host Byte Order (Little Endian/Big Endian)
Different sizes for language primitive types (long can be 32 or 64 bits for example)
Code changes on one side but not the other.
There are libraries that simplify serialization/deserialization and handle most of these issues. The overhead of such operations is easier justified on systems that must span processes and hosts. However, if your structures are very complex, using a ser/des library may be justified simply due to ease of maintenance.
Is this how it is done?
I don't understand this question. Edit: you'd like to know if this is a common idiom. In codebases where dependency on GNU extensions is acceptable, yes, this is used quite frequently, since it's convenient.
is __attribute__((packed)) always safe?
For this use case, pretty much yes, except when it's unavailable.
Is it better to use sscanf.
No. Don't use scanf().
Is it better to make functions where one initiates structure with casts and bit shifting.
It's more portable. __attribute__((packed)) is a GNU extension, and not all compilers support it (although I'm wondering who cares about compilers other than GCC and Clang, but theoretically, this still is an issue).
One of my gripes about C language standards to date is that they impose enough rules about how compilers have to lay out structures and bit fields to preclude what might otherwise be useful optimizations [e.g. on a system with power-of-two integer sizes, a compiler would be forbidden from packing eight three-bit fields into three bytes] but do not provide any means by which a programmer can specify an explicit struct layout. I used to frequently use byte pointers to read out data from structures, but I don't favor such techniques now so much as I used to. When speed isn't critical, I prefer nowadays to use a family of functions which either write multi-byte types to, or read them from, multiple consecutive memory locations using whatever endianness is needed [e.g. void storeI32LE(uint8_t **dest, int32_t dat) or int32_t readI32LE(uint8_t const **src);]. Such code will not be as efficient as what a compiler might be able to write in cases where processors have the correct endianness and either the structure members are aligned or processors support unaligned accesses, but code using such methods may easily be ported to any processor regardless of its native alignment and endianness.
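One possible implementation of those two helpers, as a sketch. The signatures are the ones given above; the bodies, and the choice to advance the pointer past the bytes just written or read, are my assumption about the intended behaviour.

```c
#include <stdint.h>
#include <string.h>

/* Write dat as four little-endian bytes at *dest, then advance *dest past them. */
void storeI32LE(uint8_t **dest, int32_t dat)
{
    uint32_t u = (uint32_t)dat;   /* well-defined modulo-2^32 conversion */
    for (int i = 0; i < 4; i++) {
        *(*dest)++ = (uint8_t)(u >> (8 * i));
    }
}

/* Read four little-endian bytes at *src, then advance *src past them. */
int32_t readI32LE(uint8_t const **src)
{
    uint32_t u = 0;
    int32_t dat;
    for (int i = 0; i < 4; i++) {
        u |= (uint32_t)*(*src)++ << (8 * i);
    }
    /* int32_t is two's complement with no padding bits, so copying the
     * bit pattern recovers negative values exactly. */
    memcpy(&dat, &u, sizeof dat);
    return dat;
}
```

Since the cursor moves on every call, a record is written by calling the store functions in field order and read back by calling the matching read functions in the same order.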
I have a question about structure padding and memory alignment optimizations regarding structures in the C language. I am sending a structure over the network, and I know that, for run-time optimization purposes, the memory inside a structure is not contiguous. I've run some tests on my local computer and indeed, sizeof(my_structure) was different from the sum of the sizes of all my structure members. I did some research and found two things:
First, the sizeof() operator retrieves the padded size of the structure (i.e. the real size that would be stored in memory).
When specifying __attribute__((__packed__)) in the declaration of the structure this optimization is disabled by the compiler, so sizeof(my_structure) will be exactly the same as the sum of the sizes of the fields of my structure.
That being said, I am wondering whether the sizeof operator returns the padded size on every compiler implementation and on every architecture. In other words, is it always safe to copy a structure with memcpy, for example using the sizeof operator such as:
memcpy(struct_dest, struct_src, sizeof(*struct_src));
I am also wondering what the real purpose of __attribute__((__packed__)) is. Is it used to send less data over the network when transmitting a structure, or is it, in fact, used to avoid some unspecified and platform-dependent sizeof operator behaviour?
Thanks in advance.
Different compilers on different architectures can and do use different padding. So for wire transmission it is not uncommon to pack structs to achieve a consistent binary layout. This can then cater for the code at each end of the wire running on different architecture.
However you also need to make sure that your data types are the same size if you use this approach. For example, on 64 bit systems, long is 4 bytes on Windows and 8 bytes almost everywhere else. And you also need to deal with endianness issues. The standard is to transmit over the wire in network byte order. In practice you would be better using a dedicated serialization library rather than trying to reinvent solutions to all these issues.
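If you do roll your own, the usual building blocks for network byte order are htonl()/ntohl() and their 16-bit counterparts. A small sketch, assuming a POSIX system for the header; put_u32_net and get_u32_net are names invented for this example:

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX; Windows declares them in <winsock2.h>) */
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value into a buffer in network (big-endian) byte order. */
void put_u32_net(uint8_t *buf, uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(buf, &net, sizeof net);   /* memcpy avoids any alignment requirement on buf */
}

/* Load a 32-bit value that was stored in network byte order. */
uint32_t get_u32_net(const uint8_t *buf)
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Using fixed-width types such as uint32_t here also sidesteps the problem of long having different sizes on different platforms.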
I am sending a structure over the network
Stop there. Perhaps some would disagree with me on this (in practice you do see a lot of projects doing this), but struct is a way of laying out things in memory - it's not a serialization mechanism. By using this tool for the job, you're already tying yourself to a bunch of non-portable assumptions.
Sure, you may be able to fake it with things like structure padding pragmas and attributes, but - can you really? Even with those non-portable mechanisms you never know what quirks might show up. I recall working in a code base where "packed" structures were used, then suddenly taking it to a platform where access had to be word aligned... even though it was nominally the same compiler (thus supported the same proprietary extensions) it produced binaries which crashed. Any pain you get from this path is probably deserved, and I would say only take it if you can be 100% sure it will only run in a given compiler and environment, and that will never change. I'd say the safer bet is to write a proper serialization mechanism that doesn't involve passing raw structures across process boundaries.
Is it always safe to copy a structure with memcpy for example using the sizeof operator
Yes, it is and that is the purpose of providing the sizeof operator.
Usually __attribute__((__packed__)) is used not for size considerations but when you want to make sure that the layout of a structure is exactly as you want it to be.
For example:
If a structure is to be used to match hardware or be sent on a wire then it needs to have the exact same layout without any padding. This is because different architectures usually implement different kinds and amounts of padding and alignment, and the only way to ensure common ground is to take padding out of the picture by using packing.
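A small sketch to make the difference visible, assuming GCC or Clang for the attribute. The two structs and copy_padded are made up for the example. Note that the memcpy takes the size of the pointed-to object, not of the pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct padded {
    uint8_t  tag;
    uint32_t value;   /* most ABIs insert 3 padding bytes before this member */
};

struct packed_rec {
    uint8_t  tag;
    uint32_t value;   /* packed: follows 'tag' immediately, at offset 1 */
} __attribute__((__packed__));   /* GNU extension (GCC, Clang) */

/* Copy a whole struct, padding included. sizeof *dst is the size of the
 * object itself, so it always covers the full padded representation. */
void copy_padded(struct padded *dst, const struct padded *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

With the attribute, sizeof(struct packed_rec) is 5 and value sits at offset 1; sizeof(struct padded) is whatever the ABI decides, commonly 8.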
Is it a good idea to simply dump a struct to a binary file using fwrite?
e.g
struct Foo {
char name[100];
double f;
int bar;
} data;
fwrite(&data,sizeof(data),1,fout);
How portable is it?
I think it's really a bad idea to just dump whatever the compiler gives you (padding, integer size, etc.), even if platform portability is not important.
I've a friend arguing that doing so is very common.... in practice.
Is it true???
Edit: What's the recommended way to write a portable binary file? Using some sort of library?
I'm interested in how this is achieved too. (By specifying byte order, sizes, ...?)
That's certainly a very bad idea, for two reasons:
the same struct may have different sizes on different platforms due to alignment issues and compiler mood
the struct's elements may have different representations on different machines (think big-endian/little-endian, IEEE 754 vs. some other stuff, sizeof(int) on different platforms)
It rather critically matters whether you want the file to be portable, or just the code.
If you're only ever going to read the data back on the same C implementation (and that means with the same values for any compiler options that affect struct layout in any way), using the same definition of the struct, then the code is portable. It might be a bad idea for other reasons: difficulty of changing the struct, and in theory there could be security risks around dumping padding bytes to disk, or bytes after any NUL terminator in that char array. They could contain information that you never intended to persist. That said, the OS does it all the time in the swap file, so whatEVER, but try using that excuse when users notice that your document format doesn't always delete data they think they've deleted, and they just emailed it to a reporter.
If the file needs to be passed between different platforms then it's a pretty bad idea, because you end up accidentally defining your file format to be something like, "whatever MSVC on Win32 ends up writing". This could end up being pretty inconvenient to read and write on some other platform, and certainly the code you wrote in the first place won't do it when running on another platform with an incompatible storage representation of the struct.
The recommended way to write portable binary files, in order of preference, is probably:
Don't. Use a text format. Be prepared to lose some precision in floating-point values.
Use a library, although there's a bit of a curse of choice here. You might think ASN.1 looks all right, and it is as long as you never have to manipulate the stuff yourself. I would guess that Google Protocol Buffers is fairly good, but I've never used it myself.
Define some fairly simple binary format in terms of what each unsigned char in turn means. This is fine for characters[*] and other integers, but gets a bit tricky for floating-point types. "This is a little-endian representation of an IEEE-754 float" will do you OK provided that all your target platforms use IEEE floats. Which I expect they do, but you have to bet on that. Then, assemble that sequence of characters to write and interpret it to read: if you're "lucky" then on a given platform you can write a struct definition that matches it exactly, and use this trick. Otherwise do whatever byte manipulation you need to. If you want to be really portable, be careful not to use an int throughout your code to represent the value taken from bar, because if you do then on some platform where int is 16 bits, it won't fit. Instead use long or int_least32_t or something, and bounds-check the value on writing. Or use uint32_t and let it wrap.
[*] Until you hit an EBCDIC machine, that is. Not that anybody will seriously expect your files to be portable to a machine that plain text files aren't portable to either.
How fond are you of getting a call in the middle of the night? Either use a #pragma to pack them or write them variable by variable.
Yes, this sort of foolishness is very common but that doesn't make it a good idea. You should write each field individually in a specified byte order, that will avoid alignment and byte order problems at the cost of a little tiny bit of extra effort. Reading and writing field by field will also make your life easier when you upgrade your software and have to read your old data format or if the underlying hardware architecture changes.
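For the struct Foo from the question, writing field by field might look roughly like the sketch below. The on-disk layout (100 name bytes, then f as 64 little-endian bits, then bar as 32 little-endian bits) is a choice made for this example, as are the helper names; it assumes double is IEEE 754 binary64 and that bar fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct Foo {
    char name[100];
    double f;
    int bar;
};

#define FOO_FILE_SIZE (100 + 8 + 4)   /* name, f as 64 bits, bar as 32 bits */
static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(unsigned char *p, uint64_t v, int n)
{
    for (int i = 0; i < n; i++) p[i] = (unsigned char)(v >> (8 * i));
}

/* Load n bytes from p, least significant byte first. */
static uint64_t get_le(const unsigned char *p, int n)
{
    uint64_t v = 0;
    for (int i = 0; i < n; i++) v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Write one record; returns 1 on success, 0 on a short write. */
int write_foo(FILE *out, const struct Foo *foo)
{
    unsigned char rec[FOO_FILE_SIZE];
    uint64_t fbits;
    /* strncpy zero-fills after the terminator, so stale bytes never reach the file */
    strncpy((char *)rec, foo->name, 100);
    memcpy(&fbits, &foo->f, sizeof fbits);   /* bit pattern of the double */
    put_le(rec + 100, fbits, 8);
    put_le(rec + 108, (uint32_t)foo->bar, 4);
    return fwrite(rec, 1, sizeof rec, out) == sizeof rec;
}

/* Read one record; returns 1 on success, 0 on a short read. */
int read_foo(FILE *in, struct Foo *foo)
{
    unsigned char rec[FOO_FILE_SIZE];
    uint64_t fbits;
    uint32_t ubar;
    int32_t bar;
    if (fread(rec, 1, sizeof rec, in) != sizeof rec) return 0;
    memcpy(foo->name, rec, 100);
    fbits = get_le(rec + 100, 8);
    memcpy(&foo->f, &fbits, sizeof foo->f);
    ubar = (uint32_t)get_le(rec + 108, 4);
    memcpy(&bar, &ubar, sizeof bar);         /* recover the sign bit exactly */
    foo->bar = bar;
    return 1;
}
```

The record is always 112 bytes regardless of how the compiler pads struct Foo, and the file reads back the same on a big-endian machine.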