How can I disable structure padding in C without using pragma?
There is no standard way of doing this. The standard states that padding may be done at the discretion of the implementation. From C99 6.7.2.1 Structure and union specifiers, paragraph 12:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
Having said that, there's a couple of things you can try.
The first you've already discounted, using #pragma to try and convince the compiler not to pack. In any case, this is not portable. Nor are any other implementation-specific ways but you should check into them as it may be necessary to do it if you really need this capability.
The second is to order your fields in largest to smallest order such as all the long long types followed by the long ones, then all the int, short and finally char types. This will usually work since it's most often the larger types that have the more stringent alignment requirements. Again, not portable.
Thirdly, you can define your types as char arrays and cast the addresses to ensure there's no padding. But keep in mind that some architectures will slow down if the variables aren't aligned properly and still others will fail miserably (such as raising a BUS error and terminating your process, for example).
That last one bears some further explanation. Say you have a structure with the fields in the following order:
char C; // one byte
int I; // two bytes
long L; // four bytes
With padding, you may end up with the following bytes:
CxxxIIxxLLLL
where x is the padding.
However, if you define your structure as:
typedef struct { char c[7]; } myType;
myType n;
you get:
CCCCCCC
You can then do something like:
int *pInt = &(n.c[1]);
int *pLng = &(n.c[3]);
int myInt = *pInt;
int myLong = *pLng;
to give you:
CIILLLL
Again, unfortunately, not portable.
All these "solutions" rely on you having intimate knowledge of your compiler and the underlying data types.
Other than compiler options like pragma pack, you cannot, padding is in the C Standard.
You can always attempt to reduce padding by declaring the smallest types last in the structure as in:
struct _foo {
int a; /* No padding between a & b */
short b;
} foo;
struct _bar {
short b; /* 2 bytes of padding between a & b */
int a;
} bar;
Note for implementations which have 4 byte boundaries
On some architectures, the CPU itself will object if asked to work on misaligned data. To work around this, the compiler could generate multiple aligned read or write instructions, shift and split or merge the various bits. You could reasonably expect it to be 5 or 10 times slower than aligned data handling. But, the Standard doesn't require compilers to be prepared to do that... given the performance cost, it's just not in enough demand. The compilers that support explicit control over padding provide their own pragmas precisely because pragmas are reserved for non-Standard functionality.
If you must work with unpadded data, consider writing your own access routines. You might want to experimenting with types that require less alignment (e.g. use char/int8_t), but it's still possible that e.g. the size of structs will be rounded up to multiples of 4, which would frustrate packing structures tightly, in which case you'll need to implement your own access for the entire memory region.
Either you let compiler do padding, or tell it not to do using #pragma, either you just use some bunch of bytes like a char array, and you build all your data by yourself (shifting and adding bytes). This is really inefficient but you'll exactly control the layout of the bytes. I did that sometimes preparing network packets by hand, but in most case it's a bad idea, even if it's standard.
If you really want structs without padding: Define replacement datatypes for short, int, long, etc., using structs or classes that are composed only of 8 bit bytes. Then compose your higher level structs using the replacement datatypes.
C++'s operator overloading is very convenient, but you could achieve the same effect in C using structs instead of classes. The below cast and assignment implementations assume the CPU can handle misaligned 32bit integers, but other implementations could accommodate stricter CPUs.
Here is sample code:
#include <stdint.h>
#include <stdio.h>
class packable_int { public:
int8_t b[4];
operator int32_t () const { return *(int32_t*) b; }
void operator = ( int32_t n ) { *(int32_t*) b = n; }
};
struct SA {
int8_t c;
int32_t n;
} sa;
struct SB {
int8_t c;
packable_int n;
} sb;
int main () {
printf ( "sizeof sa %d\n", sizeof sa ); // sizeof sa 8
printf ( "sizeof sb %d\n", sizeof sb ); // sizeof sb 5
return 0;
}
We can disable structure padding in c program using any one of the following methods.
-> use __attribute__((packed)) behind definition of structure. for eg.
struct node {
char x;
short y;
int z;
} __attribute__((packed));
-> use -fpack-struct flag while compiling c code. for eg.
$ gcc -fpack-struct -o tmp tmp.c
Hope this helps.
Thanks.
Related
For code that is compiled on various/unknown architectures/compilers (8/16/32/64-bit) a global mempool array has to be defined:
uint8_t mempool[SIZE];
This mempool is used to store objects of different structure types e.g.:
typedef struct Meta_t {
uint16_t size;
struct Meta_t *next;
//and more
}
Since structure objects always have to be aligned to the largest possible boundary e.g. 64-byte it has to be ensured that padding bytes are added between those structure objects inside the mempool:
struct Meta_t* obj = (struct Meta_t*) mempool[123] + padding;
Meaning if a structure object would start on a not aligned address, the access to this would cause an alignment trap.
This works already well in my code. But I'm still searching for a portable way for aligning the mempool start address as well. Because without that, padding bytes have to be inserted already between the array start address and the address of the first structure inside the mempool.
The only way I have discovered so far is by defining the mempool inside a union together with another variable that will be aligned by the compiler anyways, but this is supposed be not portable.
Unfortunately for embedded platforms my code is also compiled with ANSI C90 compilers. In fact I cannot make any guess what compilers are exactly used. Because of this I'm searching for an absolutely portable solution and I guess any kind of preprocessor directives or compiler specific attributes or language features that were added after C90 cannot be used
You can use _Alignas, which is part of the C11 standard, to force a particular alignment.
_Alignas(uint64_t) uint8_t mempool[SIZE];
(struct Meta_t*) mempool will lead to undefined behavior for more reasons than alignment - it's also a strict aliasing violation.
The best solution might be to create a union such as this:
typedef union
{
struct Meta_t my_struct;
uint8_t bytes[sizeof(struct Meta_t)];
} thing;
This solves both the alignment and the pointer aliasing problems and works in C90.
Now if we do *(thing)mempool then this is well-defined since this (lvalue) access is done through a union type that includes an uint8_t array among its members. And type punning between union members is also well-defined in C. (No solution exists in C++.)
Unfortunately, this ...
This mempool is used to store objects of different structure types
... combined with this ...
I'm searching for an absolute portable solution and I guess any kind of preprocessor directives or compiler specific attributes or language features that were added after C90 cannot be used
... puts you absolutely up a creek, unless you know in advance all the structure types with which your memory pool may be used. Even the approach you are taking now does not conform strictly to C90, because there is no strictly-conforming way to determine the alignment of an address, so as to compute how much padding is needed.* (You have probably assumed that you can convert it to an integer and look at the least-significant bits, but C does not guarantee that you can determine anything about alignment that way.)
In practice, there is a variety of things that will work in a very wide range of target environments, despite not strictly conforming to the C language specification. Interpreting the result of converting a pointer to an integer as a machine address, so that it is sensible to use it for alignment computations, is one of those. For appropriately aligning a declared array, so would this be:
#define MAX(x,y) (((x) < (y)) ? (y) : (x))
union max_align {
struct d { long l; } l;
struct l { double d; } d;
struct p { void *p; } p;
unsigned char bytes[MAX(MAX(sizeof(struct d), sizeof(struct l)), sizeof(struct p))];
};
#undef MAX
#define MEMPOOL_BLOCK_SIZE sizeof(union max_align)
union maxalign mempool[(size + MEMPOOL_BLOCK_SIZE - 1) / MEMPOOL_BLOCK_SIZE];
For a very large set of C implementations, that not only ensures that the pool itself is properly aligned for any use by strictly-conforming C90 clients, but it also conveniently divides the pool into aligned blocks on which your allocator can draw. Refer also to #Lundin's answer for how pointers into that pool would need to be used to avoid strict aliasing violations.
(If you do happen to know all the types for which you must ensure alignment, then put one of each of those into union max_align instead of d, l, and p, and also make your life easier by having the allocator hand out pointers to union max_align instead of pointers to void or unsigned char.)
Overall, you need to choose a different objective than absolute portability. There is no such thing. Avoiding compiler extensions and language features added in C99 and later is a great start. Minimizing the assumptions you make about implementation behavior is important. And where that's not feasible, choose the most portable option you can come up with, and document it.
*Not to mention that you are relying on uint8_t, which is not in C90, and which is not necessarily provided even by all C99 and later implementations.
Conversion of several values to a byte-string for radio transmission has to avoid unneeded bytes. Using GCC on an ARM target (32 bit) I use "attribute ((packed))". This directive is GCC based (as I read somewhere here) and so not generally portable - which I would prefer. Example:
typedef struct __attribute__ ((packed)) {
uint8_t u8; // (*)
int16_t i16; // (*)
float v;
...
uint16_t cs; // (*)
}valset_t; // valset_t is more often used
valset_t vs;
(*) values would use 4 bytes without the ((packed)) attribute, instead of one or two as desired. Byte-wise access for transmission with:
union{
valset_t vs; // value-set
uint8_t b[sizeof(valset_t)]; // byte array
}vs_u;
using vs_u.b[i] .
Processing time is not critical here, as the transmission is much slower.
Endian is also not to be considered here, but a different C-compiler might be applied in some cases.
Elder posts to this task gave some insight, but maybe packing and alignment features were improved meanwhile in C ?c) .
Is there a more portable solution in C to perform this task?
Packing/padding isn't standardized, and therefore structs/unions are strictly speaking not portable. #pragma pack(1) is somewhat more common and supported by many different compilers (including gcc), but it is still not fully portable.
Also please note that padding exists for a reason, these structs with non-standard packing could be dangerous or needlessly inefficient on some systems.
The only 100% portable data type for storing protocols etc is an array of unsigned char. You can get only get fully portable structs if you write serializer/deserializer routines for them. This naturally comes at the expense of extra code. Example of deserialization:
valset_t data_to_valset (const unsigned char* data)
{
valset_t result;
result.something = ... ;
...
return result;
}
In case of a certain network endianess, you can convert from network endianess to CPU endianess inside the same routine.
Please note that you have to type it out like in the function example above. You cannot write code such as:
unsigned char data[n] = ... ;
valset_t* vs = (valset_t*)data; // BAD
This is bad in multiple different ways: alignment, padding, strict aliasing, endianess and so on.
It is possible to go the other way around though, using a unsigned char* to inspect or serialize a struct byte by byte. However, doing so doesn't solve the issues of padding bytes or endianess still.
I've inherited some old code that assumes that an int can store values from -231 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, that code should have used uint32_t, except that it wasn't. I would like to fix this code to use uint32_t.
The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:
struct data {
int a[10];
};
void frobnicate(struct data *param);
I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.
I can make the code work on uint32_t or unsigned internally:
struct internal_data {
unsigned a[10];
};
void frobnicate(struct data *param) {
struct internal_data *internal = (struct internal_data *)param;
// ... work with internal ...
}
However this is not actually correct C since it's casting between pointers to different types.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.
I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.
I'm at least going to add
#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif
With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?
In summary:
I can't change the function prototype. It references an array of int.
The code should manipulate the array (not a copy of the array) as an array of unsigned.
The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
I have no control over which compiler is used and with which settings.
However this is not actually correct C since it's casting between pointers to different types.
Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:
typedef union
{
struct data;
uint32_t array[10];
} internal_t;
...
void frobnicate(struct data *param) {
internal_t* internal = (internal_t*)param;
...
Another option if you can change the original struct declaration but not its member names, is to use C11 anonymous union:
struct data {
union {
int a[10];
uint32_t u32[10];
}
};
This means that user code accessing foo.a won't break. But you'd need C11 or newer.
Alternatively, you could use a uint32_t* to access the int[10] directly. This is also well-defined, since uint32_t in this case is the unsigned equivalent of the effective type int.
Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?
The obvious is static_assert(sizeof(int) == 4, "int is not 32 bits"); but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":
#define stat_assert(expr) typedef int dummy_t [expr];
#if INT_MIN != -0x80000000
Depending on how picky you are, this isn't 100% portable. int could in theory be 64 bits, but probably portability to such fictional systems isn't desired either.
If you don't want to drag limits.h around, you could also write the macro as
#if (unsigned int)-1 != 0xFFFFFFFF
It's a better macro regardless, since it doesn't have any hidden implicit promotion gems - note that -0x80000000 is always 100% equivalent to 0x80000000 on a 32 bit system.
So I have a structure with mixed data types like the below and I want to make sure that sizeof(struct a) is a multiple of the word size in x32 and x64. How can I do that? Thank you.
struct a {
vaddr_t v1;
size_t v2;
unsigned short v3;
struct b* v4;
struct a *v5;
int v6;
pthread_mutex_t lock;
};
With basic types, like ints or shorts, you could achieve this by explicitly using int32 or int16 instead of int or short. For other types like size_t or pointers, it gets more complicated. Your best bet is to use type attributes (http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Type-Attributes.html).
If all that matters is the structure alignment in memory, align the structure itself, not its members.
I may be stepping a bit outside of my comfort zone here, however there seems to be a variation on the malloc called memalign thus :
void *memalign(size_t alignment, size_t size);
The memalign() function allocates size bytes on a specified
alignment boundary and returns a pointer to the allocated
block. The value of the returned address is guaranteed to be
an even multiple of alignment. The value of alignment must
be a power of two and must be greater than or equal to the
size of a word.
That may or may not exist on all platforms but this one seems to be very common :
int posix_memalign(void **memptr, size_t alignment, size_t size);
Seen at :
http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_memalign.html
Now I would think that the datatypes for fixed width type declarations, as proposed by the ISO/JTC1/SC22/WG14 C, committee's working draft for the revision of the current ISO C standard ISO/IEC 9899:1990 Programming language - C, ( I read that in a manpage ) would be cross platform and cross architecture stable.
So if you looked into the lower levels of your struct members then hopefully they are based on things like int32_t or uint32_t for an integer. There are POSIX types such as :
/*
* POSIX Extensions
*/
typedef unsigned char uchar_t;
typedef unsigned short ushort_t;
typedef unsigned int uint_t;
typedef unsigned long ulong_t;
So I am thinking here that perhaps it is possible to construct your structs using only types that are defined as these totally cross platform stable datatypes and the end result being that the structs are always the same size regardless where or how you compile your code.
Please bear in mind, I am making a stretch here and hoping that someone else may clarify and perhaps correct my thinking.
I'm writing C cross-platform library but eventually I've got error in my unittests, but only on Windows machines. I've tracked the problem and found it's related to alignment of structures (I'm using arrays of structures to hold data for multiple similar objects). The problem is: memset(sizeof(struct)) and setting structures members one by one produce different byte-to-byte result and therefore memcmp() returns "not equal" result.
Here the code for illustration:
#include <stdio.h>
#include <string.h>
typedef struct {
long long a;
int b;
} S1;
typedef struct {
long a;
int b;
} S2;
S1 s1, s2;
int main()
{
printf("%d %d\n", sizeof(S1), sizeof(S2));
memset(&s1, 0xFF, sizeof(S1));
memset(&s2, 0x00, sizeof(S1));
s1.a = 0LL; s1.b = 0;
if (0 == memcmp(&s1, &s2, sizeof(S1)))
printf("Equal\n");
else
printf("Not equal\n");
return 0;
}
This code with MSVC 2003 # Windows produce following output:
16 8
Not equal
But the same code with GCC 3.3.6 # Linux works as expected:
12 8
Equal
This makes my unit-testing very hard.
Am I understand correctly that MSVC uses size of biggest native type (long long) to determine alignment to structure?
Can somebody give me advice how can I change my code to make it more robust against this strange alignment problem? In my real code I'm working with arrays of structures via generic pointers to execute memset/memcmp and I'm usually don't know exact type, I have only sizeof(struct) value.
Your unit test's expectation is wrong. It (or the code it tests) should not scan the structure's buffer byte-by-byte. For byte-precise data the code should create a byte buffer explicitly on stack or on heap and fill it with the extracts from each member. The extracts can be obtained in CPU-endianness-independent way by using the right shift operation against the integer values and casting the result by the byte type such as (unsigned char).
BTW, your snippet writes past s2. You could fix that by changing this
memset(&s2, 0x00, sizeof(S1));
s1.a = 0LL; s1.b = 0;
if (0 == memcmp(&s1, &s2, sizeof(S1)))
to this,
memset(&s2, 0x00, sizeof(S2));
s1.a = 0LL; s1.b = 0;
if (0 == memcmp(&s1, &s2, sizeof(S2)))
but the result is technically "undefined" because the alignment of members in the structures is compiler-specific.
GCC Manual:
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question.
Also, this typically introduces an element of padding (i.e. filler bytes to have the structure aligned). You can use the #pragma with an argument of packed. Note, #pragmas are NOT a portable way of working. Unfortunately, this is also about the only way of working in your case.
References:
Here GCC on structure alignment.
MSDN structure alignment.
What we have done is used the #pragma pack to specify how big the objects should be:
#pragma pack(push, 2)
typedef struct {
long long a;
int b;
} S1;
typedef struct {
long a;
int b;
} S2;
#pragma pack(pop)
If you do this, the structures will be the same size on both platforms.
Note that this is not a 'strange' alignment problem. MSVC has chosen to ensure that the struct is aligned on a 64-bit boundary since it has a 64-bit member so it adds some padding at the end of the struct to ensure that arrays of those objects will have each element properly aligned. I'm actually surprised that GCC doesn't do the same.
I'm curious what you're unit testing does that hits a snag with this - most of the time alignment of structure members isn't necessary unless you need to match a binary file format or a wire protocol or you really need to reduce the memory used by a structure (especially used in embedded systems). Without knowing what you're trying to do in your tests I don't think a good suggestion can be given. Packing the structure might be a solution, but it comes at some cost - performance (especially on non-Intel platforms) and portability (how struct packing is set up is can be different from compiler to compiler). These may not matter to you, but there might be a better way to solve the problem in your case.
You can either do something like
#ifdef _MSC_VER
#pragma pack(push, 16)
#endif
/* your struct defs */
#ifdef _MSC_VER
#pragma pack(pop)
#endif
to give a compiler directive forcing alignment
Or go into the project options and change the default struct alignment [under Code Generation]
Structure padding for 64-bit values is different on different compilers. I've seen differences between even between gcc targets.
Note that explicitly padding to 64-bit alignment will only hide the problem. It will come back if you begin naively nesting structures, because the compilers will still disagree on the natural alignment of the inner structures.