In the "Advanced Programming in the Unix Environment" book there's a part (ch 8.14, page 251) in which the author shows us the definition of the "acct" struct (used to store accounting records info). He then shows a program in which he reads the accounting data from a file into the struct (the key part of which is):
fread(&acdata, sizeof(acdata), 1, fp);
The trouble I'm having is that I've heard that C compilers will sometimes rearrange the elements of a struct in memory in order to better utilize space (due to alignment issues). So if this code just takes all the content of the file and sticks it into acdata, and the contents of the file are arranged to match the ordering in the struct definition, then if some of the struct's elements have been moved, referring to them in code may not give me what I expect (since the data in the file did not get rearranged the way the struct did in memory).
What am I missing? From what I can tell, this doesn't seem reliable.
Thanks for your help (my apologies if I've done something wrong procedurally - this is my first time posting)
Worry!
You are right to worry about this issue and pay attention to it. It's a vexing problem, and often happens when you carry your source to another machine, with a different -- even slightly different -- architecture, and perhaps with a different OS or maybe a different compiler; compile your program there; and expect your structs to remain intact over fwrite() and fread(). Or when you add a 1-byte variable to your struct, recompile, and send out binaries to all your friends. Your program doesn't work on their machines anymore, for some mysterious reason.
Sometimes it works (by accident) and you never notice the problem; sometimes it doesn't work and you pull your hair out for a few days.
The issue has nothing to do with rearrangement of struct members. Compilers don't do that. It has nothing to do with optimization, either.
The issue is byte alignment, and the Wikipedia article mentioned below tells you how to fix up your structs so they'll always be correctly aligned. It's always a good idea to pay attention to byte alignment. Otherwise your program isn't portable. And, worse, the program you carefully compiled on your whiz-bang x86-64 and distributed to all of your customers all of a sudden won't run on their 32-bit machines.
Just as important: be mindful of the lengths and alignments of the struct members, too.
There's a nice Wikipedia article that explains the details. It's a very worthwhile read.
I would be wary of a compiler-specific pragma that does the job, but just for that compiler. If you put a pragma in your code, your program is no longer portable C.
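If you want to see the padding on your own machine, here is a minimal sketch (the member offsets and total size noted in the comment assume a typical platform where int is 4 bytes with 4-byte alignment):

#include <stdio.h>
#include <stddef.h>

struct sample {
    char c;   /* 1 byte                        */
    int  i;   /* typically 4 bytes, 4-aligned  */
    char d;   /* 1 byte                        */
};

int main(void)
{
    /* On a typical platform this prints offsets 0, 4, 8 and a total
       size of 12: three bytes of padding after 'c' and three more
       after 'd', even though the members only sum to 6 bytes. */
    printf("offsetof c = %zu\n", offsetof(struct sample, c));
    printf("offsetof i = %zu\n", offsetof(struct sample, i));
    printf("offsetof d = %zu\n", offsetof(struct sample, d));
    printf("sizeof     = %zu\n", sizeof(struct sample));
    return 0;
}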
The layout (padding and alignment, but not order) of the structure may change if you compile your code on a different compiler, or a later version of the compiler, or even with different compile-time options.
It won't change from run to run of the same compiled program - that would be a nightmare scenario :-)
So, provided the same program (or technically, any program which has the same structure layout encoded into it at compile time) is the one doing the reading, this will work just fine.
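To make that claim concrete, a minimal round-trip sketch (the file name and member types are arbitrary): the same compiled program writes and re-reads the struct, so the in-memory layout, padding included, matches by construction.

#include <stdio.h>

struct acct_like {
    char flag;
    long total;
};

int main(void)
{
    struct acct_like out = { 'A', 12345L }, in;
    FILE *fp = fopen("demo.bin", "w+b");
    if (!fp) return 1;

    /* Written and read back by the same compiled program, so the
       layout encoded at compile time is identical on both sides. */
    fwrite(&out, sizeof out, 1, fp);
    rewind(fp);
    if (fread(&in, sizeof in, 1, fp) != 1) { fclose(fp); return 1; }
    fclose(fp);

    return (in.flag == out.flag && in.total == out.total) ? 0 : 1;
}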
The relevant sections of the C99 standard are:
6.2.6.1/1: The representations of all types are unspecified except as stated in this subclause.
6.2.6.1/6 (the only mention of structures in that subclause): When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.
That's the only mention of structure padding in that subclause. In other words, it's up to the implementation and they don't even need to document it (unspecified as opposed to implementation-defined, which would require documenting).
6.7.2.1/13: ... There may be unnamed padding within a structure object, but not at its beginning.
6.7.2.1/15: There may be unnamed padding at the end of a structure or union.
If you were to create version 1.1 of your program and it uses a different structure layout (new compiler, different compiler options, #pragma pack, etc), it would very quickly be evident that you had a problem during your unit tests (which should include loading in a file from the previous version).
In that case, you could include some 'intelligence' in your 1.1 program which could recognise an earlier file layout and transform the data as it comes in. That's why good file formats will often have a version indicator (for the file layout version, not the program version) as the first item in that file.
For example, quite a few of my applications use an application identifier along with a 16-bit integer at the front of the file to indicate which application and version it is, and the file-loader part of the program can handle at least the current and previous versions (and often every version ever created).
The program version and file layout version are separate things - they can drift if, for example, you release ten versions of your program without needing to update the file layout.
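As an illustration (the magic string and version numbers here are invented for the example), such a loader might start like this:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define APP_MAGIC "MYAP"   /* hypothetical 4-byte application identifier */
#define LAYOUT_V1 1
#define LAYOUT_V2 2        /* current file layout version */

int load_file(FILE *fp)
{
    char     magic[4];
    uint16_t version;

    if (fread(magic, sizeof magic, 1, fp) != 1 ||
        memcmp(magic, APP_MAGIC, sizeof magic) != 0)
        return -1;                    /* not one of our files */

    /* In a real format you would also fix the byte order of
       'version' rather than reading it raw. */
    if (fread(&version, sizeof version, 1, fp) != 1)
        return -1;

    switch (version) {
    case LAYOUT_V1: /* transform old records into the current form */ break;
    case LAYOUT_V2: /* read records directly */                      break;
    default:        return -1;        /* newer than we understand */
    }
    return 0;
}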
Yes
Your program will be stable.
Your question has touched off a bonfire of portability recommendations that you didn't actually ask for. The question you seemed to be asking is "is this code pattern and my program stable?". And the answer to that is yes.
Your structure will not be reordered. C99 specifically prohibits rearranging the structure members.¹
Also, the layout and alignment do not depend on optimization level. If they did, all programs would have to be entirely built with the same optimization level, as well as all library routines, the kernel, all kernel interfaces, etc.
Users would also have to track, forever, the optimization level of every one of those interfaces listed above that ever had been compiled as part of the system.
The memory alignment rules are really a kind of hidden ABI. They can't change without adding very specialized and by definition rarely-used compiler flags. They tend to work just fine over different compilers. (Otherwise, every element of a system identified above would ALSO have to be compiled by the same compiler, or be useless. Every compiler that supports a given system uses the exact same alignment rules. Nothing would work, otherwise.) The compiler flags that change alignment policies are usually intended to be built into the compiler configuration for a given OS.
Now, your binary file layout, while perfectly reasonable, is a bit old-school. It has certain drawbacks. While none of these are show-stoppers and none are generally worth rewriting an app, they include:
it's hard to debug binary files
they do lock in a single byte order and a single alignment policy. In the (sadly, increasingly unlikely) case where you need to port to a new architecture, you might end up needing to unpack the record with memcpy(3), as sketched after this list. Not the end of the world.
they aren't structured. Things like YAML and, ahem, even XML are sort of self-parsing, so it becomes a lot easier to read in a file, and certain types of file manipulations can be done with tools. Even more important, the file format itself becomes more flexible. Your ability to take advantage of the auto-parsed-object is limited, however, in C and C++.
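Here is that memcpy-based unpacking sketch (the record layout is hypothetical): read the file into a byte buffer, then copy each field out at its known offset, so the host struct's own padding never matters.

#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record: a 1-byte flag at offset 0 followed by
   a 4-byte count at offset 1, with no padding in the file itself. */
struct record {
    uint8_t  flag;
    uint32_t count;
};

void unpack_record(struct record *r, const unsigned char *buf)
{
    memcpy(&r->flag,  buf + 0, sizeof r->flag);
    memcpy(&r->count, buf + 1, sizeof r->count);
    /* If the file's byte order differs from the host's, byte-swap
       r->count here before using it. */
}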
As I understand Paxdiablo's request, he would like me to agree that there exist compiler options and pragmas that, if used, will alter the alignment rules. That's true. Obviously these options are used only for specific reasons.
¹ C99 6.7.2.1(13): Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.
The struct is written to the file exactly as it is laid out in memory, so the ordering will be the same. Mixing compilers between write and read might be an issue, however.
The standard doesn't seem to impose any padding requirements on struct members, even though it does prohibit reordering (6.7.2.1p6). How likely is it that a C platform will not pad minimally, i.e., not add only the minimum amount of padding needed to make sure the next member (or instance of the same struct, if this is the last member) is sufficiently aligned for its type?
Is it even sensible of the standard not to require that padding be minimal?
I'm asking because this lack of a padding guarantee seems to prevent me from portably representing serialized objects as structs (even if I limit myself to just uint8_t arrays as members, compilers seem to be allowed to add padding in between them), and I'm finding it a little weird to have to resort to offset arithmetic there.
How likely is it that a C platform will not pad minimally, i.e., not add only the minimum amount of padding needed to make sure the next member (or instance of the same struct, if this is the last member) is sufficiently aligned for its type?
Essentially, the "extra" padding may allow significant compiler optimizations.
Unfortunately, I don't know if any compilers actually do that (and therefore cannot provide any estimate on its likelihood of occurring).
As a simple example, consider a 32-bit or 64-bit architecture, where the ABI states that string literals and character arrays are aligned to 32-bit or 64-bit boundary. Many of the C library functions are (also) implemented by the C compiler itself; see e.g. these lists for GCC. The compiler can track the parameters to see if they refer to a string literal or (the beginning of a) character array, and if so, replace e.g. strcmp() with an optimized built-in version (which does the comparison in 32-bit units, rather than char-at-a-time).
As a more complicated example, consider a RISC hardware architecture, where unaligned byte access is slower than aligned native word access. (For example, the former may be implemented in hardware as the latter, followed by a bit shift.) Such an architecture could have an ABI that requires all structure members to be word-aligned. Then, the C compiler would be required to add more-than-minimal padding.
Traditionally, the C standards committee has been very careful to not exclude any kind of hardware architecture from correctly implementing the language.
Is it even sensible of the standard not to require that padding be minimal?
The purpose of the C standard used to be to ensure that C code would behave in the same manner if compiled with different compilers, and to allow implementation of the language on any sufficiently capable hardware architecture. In that sense, it is very sensible for the standard not to require minimal padding, as some ABIs may require more than minimal padding for whatever reason.
With the introduction of the Microsoft "extensions", the purpose of the C standard has shifted significantly, to binding C to C++ to ensure a C++ compiler can compile C code with minimal differences to C++ compilation, and to provide interfaces that can be marketed as "safer" with the actual purpose of balkanizing developers and binding them to a single vendor implementation. Because this is contrary to the previous purpose of the standard, and it is clearly non-sensible to standardize single-vendor functions like fscanf_s() while not standardizing multi-vendor functions like getline(), it may not be possible to define what sensible means anymore in the context of the C standard. It definitely does not match "good judgment"; it probably now refers to "being perceptible by the senses".
I'm asking because this lack of a padding guarantee seems to prevent me from portably representing serialized objects as structs
You are making the same mistake C programmers make, over and over again. Structs are not suitable for representing serialized objects. You should not use a struct to represent a network object, or a file header, because of the C struct rules.
Instead, you should use a simple character buffer, and either accessor functions (to extract or pack each member or field from the buffer), or conversion functions (to convert the buffer contents to a struct and vice versa).
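For instance, a minimal sketch of such accessor functions, fixing the wire format to little-endian (the function names are made up for the example):

#include <stdint.h>

/* Pack/unpack a 32-bit value in a fixed (here, little-endian) byte
   order, independent of the host's own representation. */
void put_u32le(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)( v        & 0xFF);
    buf[1] = (unsigned char)((v >> 8)  & 0xFF);
    buf[2] = (unsigned char)((v >> 16) & 0xFF);
    buf[3] = (unsigned char)((v >> 24) & 0xFF);
}

uint32_t get_u32le(const unsigned char *buf)
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}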
The underlying reason why even experienced programmers like the asker would still prefer to use a struct is that the accessor/conversion approach involves a lot of extra code; having the compiler do it instead would be much better: less code, simpler code, easier to maintain.
And I agree. It would even be quite straightforward if a new keyword, say serialized_struct, were introduced for a serialized data structure with completely different member rules from traditional C structs. (Note that this support would not affect e.g. linking at all, so it really is not as complicated as one might think.) Additional attributes or keywords could be used to specify an explicit byte order, and the compiler would handle all the conversion details for us, in whatever way it sees best for the specific architecture it compiles for. This support would only be available to new code, but it would be hugely beneficial in cutting down on interoperability issues -- and it would make a lot of serialization code simpler!
Unfortunately, when you combine the C standard committee's traditional dislike of adding new keywords with the overall direction change from interoperability to vendor lock-in, there is no chance at all of anything like this being included in the C standard.
Of course, as described in the comments, there are lots of C libraries that implement one serialization scheme or other. I've even written a few myself (for rather peculiar use cases, though). A sensible approach (poor pun intended) would be to pick a vibrant one (well maintained, with a lively community around the library), and use it.
The page shown is from "Programming Embedded Systems" by Michael Barr and Anthony Massa, published by O'Reilly, ISBN 0-596-00983-6, page 204.
I am asking for more details and explanation on this, such as:
Does this mean that the bit fields are going to be portable across all compilers?
For different architectures, does this work for bit fields larger than one byte (considering endianness differences, which I don't think this method will overcome)?
For the same architecture, does this work for bit fields larger than one byte?
If they are standardised across all compilers as the book says, can we specify how they are going to be aligned?
Q.1.2: If the bit fields are just one byte, the endianness problem won't affect them, right? So will bit fields be portable across all compilers on different architectures and endiannesses?
does this mean that the bit fields are going to be portable across all compilers
No. The cited text about unions somehow making bit-fields portable is strange; union does not add any further portability guarantees whatsoever. There are many aspects of bit-fields that make them completely non-portable, because they are very poorly specified by the standard. Some examples here.
For example, using uint8_t or a char type for a bit-field is not covered by the standard. The book fails to mention this even though it makes such a non-standard example.
for different architectures, does this work for bit fields larger than one byte
for the same architecture, does this work for bit fields larger than one byte
No, there are no guarantees at all.
if they are standardised across all compilers as the book says, can we specify how they are going to be aligned?
They aren't; the book is misleading. My advice is to stop reading at "bitfields are not portable", then forget that you ever heard about bit-fields. They are a 100% superfluous feature anyway. Instead, use bitwise operators.
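To illustrate the bitwise-operator alternative, here is a minimal sketch for a hypothetical 8-bit register layout (the field names and bit positions are made up for the example):

#include <stdint.h>

/* Hypothetical register layout: bits 0-2 = mode, bit 3 = enable,
   bits 4-7 = channel. With explicit shifts and masks the layout is
   exactly what the code says, on every compiler. */
#define MODE_SHIFT   0
#define MODE_MASK    0x07u
#define ENABLE_SHIFT 3
#define ENABLE_MASK  0x01u
#define CHAN_SHIFT   4
#define CHAN_MASK    0x0Fu

static inline uint8_t get_mode(uint8_t reg)
{
    return (reg >> MODE_SHIFT) & MODE_MASK;
}

static inline uint8_t set_chan(uint8_t reg, uint8_t chan)
{
    /* Clear the channel bits, then merge in the new value. */
    return (uint8_t)((reg & ~(CHAN_MASK << CHAN_SHIFT))
                   | ((chan & CHAN_MASK) << CHAN_SHIFT));
}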
That is a very disturbing text; I think I would toss the whole book.
Please go read the spec. You don't have to pay for it: older editions and draft versions (which of course are not the final text) are available, and over the last couple of decades they are more alike than different.
If I could add more upvotes to Lundin's answer I would; it's not worth creating new users with email addresses just to do that...
I have possibly sparked (or will spark) an argument on this, but... The spec does say that if you declare some number of (non-zero-sized) bit-fields in a row in a struct or union, they will get packed, and there is a special zero-sized one that is used to break that packing, so that you can declare a bunch of bit-fields and group them without having to make some other struct.
Perhaps it says they will be aligned, but I would never assume that. I know for a fact that the same compiler will treat endians differently and pack them from opposite ends (top bit down or bottom bit up). There is no reason to assume that any compiler follows any convention beyond packing, and I would assume (though perhaps this is also subject to interpretation) that they are packed in the order declared once you figure out where they start. So I wouldn't assume that six bits' worth of declaration are aligned either way: there could be up to six different alignments within a byte, assuming a byte is the size of the unit, and if the size of the unit is 32 or 64 bits I am not going to bother counting the combinations; it is more than one, and that is all that matters. I know for a fact from gcc that when the 32-to-64-bit x86 transition happened, it caused problems for code making assumptions about where those bits landed.
I personally wouldn't even assume that the bits are in the declared order when they are packed together... Popular compilers tend to do that, but the spec says no more than that they are packed, and what does that mean? If I have a 1-bit, then an 8-bit, then a 1-bit, then an 8-bit, then a 6-bit field, I would hope the compiler aligns the 8-bit ones on byte boundaries and moves the two 1-bit fields next to the 6-bit one, if I were ever to use a bit-field, which I don't...
The prime contention here: to me the spec is very clear that the initial items of multiple declarations in a union only use the same memory if the order and size are the same, i.e. they are compatible types. A 1-bit unsigned int is not the same as a 32-bit unsigned int, so they are NOT compatible types, in my opinion. The spec goes further and states that for bit-fields the types have to be the same type and size; so for bit-fields to share the same memory in a union, you need two structures whose initial bit-field items are of the same type and size, and only those items are, per the spec, going to share memory. What happens with the rest of the bits is a different story, per the spec. So, from my reading of the spec, nothing in your example says that the 8-bit char (using a made-up, non-spec declaration) and 8 declared bits of bit-field are expected to line up with each other and share the same memory. Just because a compiler chooses to do that in some version does not mean you can assume it; the union in particular does not make that code portable or more portable in any way. In fact it is perhaps worse: now you not only have a bit-field issue across compilers or versions, you have union issues across compilers or versions as well.
As a general rule, NEVER use a structure across a compile domain, with or without bit-fields (this includes unions). So never read a file into a structure: you have crossed a compile domain. Never point structures at hardware registers: you have crossed a compile domain. Never point structures at memory; don't point a structure at a char array that contains an Ethernet packet and use the struct and/or bit-fields to pick apart the IP header. Yes, these rules are widely used and abused, and they are a ticking time bomb. The primary reason the time bomb only goes off rarely is that the code keeps using the same compiler, or a couple of very popular compilers that currently have the same implementation. But struct pointing in general fails very, very often; bit-field failures are just a side effect of that. And, perhaps because of the horrible text in your book, unions are starting to show up a lot, making the time bomb nuclear instead of conventional.
So, if you want to use a struct or a union or a bit-field and have the code actually work without maintenance: stay within the same compile domain (one program compiled at the same time with the same compiler and settings), pass structures across functions as structures, and do not point them at memory or other arrays, etc. For unions, never access across individually defined items: if you use one variable, use only that variable until you are completely finished with it, and assume it is trash once you use another struct or variable in that union. With bit-fields, treat each variable as a standalone item independent of the variables next to it; you are merely ATTEMPTING to save memory by using them, while actually wasting a lot of code overhead, performance, and code space. Keep to that and your code is far more likely to work without maintenance, across compilers. Now, if you want to use this as job security and have your code fail to build or function on every minor or major release of a compiler, then do the things above: point structs across a compile domain, point bit-fields at hardware registers, etc. Other than your boss noting that you write horrible code that breaks often when other employees' code doesn't, you will have to keep maintaining that code on a regular basis for the life of the product.
All the compiler does with your bit-field is generate masks and shifts. If you write those masks and shifts yourself, the result is MASSIVELY more portable. You may still have endianness issues (which can often be solved in portable, endian-agnostic code), but you won't be pointing at completely the wrong thing. Masking and shifting simply works; it does not produce more code or slower code. If you really need to, make macros for everything: a macro that isolates a field within a "unit" is far more portable than a bit-field.
Forget you ever read about bit-fields or heard about them; never use them again. The need for them died decades ago. Unions somewhat fall into the same category: they do actually save memory, but you have to be careful to share that memory properly.
And I would toss that book as well. If the authors don't understand this simple topic, what else do they not understand? As with most folks, this may be a case of confusing a popular compiler's interpretation with reality.
There is a little 'bit' of confusion between the concepts of bit-fields and endianness:
Suppose you have a 32-bit MCU: its internal memory is organized in 32-bit words.
Now, as you may know, an MCU stores multi-byte values either most-significant byte first (big endian) or least-significant byte first (little endian);
see here for an illustration: Endians figure
As can be seen, the same data, 0x12345678 (a 32-bit value), is stored differently internally.
When you read and write memory through a 32-bit pointer (the trivial case), the MCU handles it for you and the endianness makes no difference. The problem arises when you manipulate memory byte by byte, or when exporting to (or importing from) another MCU or memory peripheral that also uses 8-bit/1-byte accesses.
Bit fields are aligned to byte, word, and long-word types (as seen), so they can be misinterpreted when porting to another target.
Hence, the answers to your questions:
1. If it is only one byte that you are dividing into bits, it will port nicely.
2. If you define a multi-byte union, it will get you into trouble.
3. Answered in the introduction of this answer.
4. See answer no. 1.
5. See the figure referenced above for illustration.
Right in general
1, 2: Not quite: it always depends on the platform (endianness) and the types you are using.
3: Yes, they will always land on the same spot in memory.
4: Which alignment do you mean — the memory alignment or the field alignment?
When is the Ada type attribute Alignment needed when wrapping C structures from Ada?
Our typical wrapper structure looks like
type T is record
   a : aliased Interfaces.C.unsigned_char;
   b : aliased Interfaces.C.double;
end record;
Now, when/where is
for T'Alignment use 8;
needed?
And does this depend on the target architecture?
Ada 2012 LRM's definition of 'Alignment.
When 'Alignment is associated with a type, the address of objects of that type must be evenly divisible by the alignment value.
So in your definition, an object of type T could be explicitly placed, or will be automatically allocated by the compiler, at address 800, but not 804.
This becomes relevant when certain data types must obey alignment constraints, such as doubles starting on a double-word (8-byte) boundary. (This is dependent on the target architecture: some impose such constraints, others do not.) Likewise, some architectures may allow multi-byte values to start on odd addresses ('Alignment use 1), but most do not.
This issue is most likely to arise in situations such as you describe where you need to define an Ada layout matching an externally defined one. Specifying 'Alignment can ensure that objects and record components get properly laid out to match the external source.
Often though, especially when interfacing to C, simply applying the Convention aspect or pragma to the corresponding type definitions will ensure that the Ada layout automatically matches the C layout, leaving all the detailed aligning and padding to the compiler.
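For comparison with the C side, here is a rough C11 sketch (the struct name is invented for the example): _Alignas plays a role similar to the 'Alignment clause, and the compiler inserts padding between the members so b meets its own alignment requirement.

#include <stdalign.h>
#include <stdio.h>

/* Roughly the C counterpart of the Ada record above: padding is
   inserted after 'a' so 'b' is suitably aligned, and _Alignas forces
   the object itself to an 8-byte boundary, much like
   "for T'Alignment use 8;". */
struct t {
    unsigned char a;
    double        b;
};

int main(void)
{
    _Alignas(8) struct t obj;
    printf("alignof(struct t) = %zu, sizeof = %zu\n",
           alignof(struct t), sizeof(struct t));
    (void)obj;
    return 0;
}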
The alignment attribute's purpose is not to support C compatibility; it is chiefly to provide compatibility with hardware that has special alignment restrictions for its data (including some data for some CPUs, most typically floats and the FPU).
That being said, technically I suppose it could be useful for C compatibility in situations where you know the alignment rules used by the C compiler you are trying to make your Ada code compatible with.
In this particular case, I suspect the compiler would make sure all objects of type T get laid out on a quad-word boundary. However, the compiler might then choose to put seven bytes of fill in the middle of your T object to ensure that the double is also on a quad-word boundary. You can't really say without forcing its hand with a record representation clause as well. If you care how the internals of a record structure are laid out (e.g. you want to ensure it matches some C layout), you probably ought to be explicit about it.
I have a question about structure padding and memory alignment optimizations regarding structures in C. I am sending a structure over the network, and I know that, for run-time optimization purposes, the members inside a structure are not laid out contiguously. I ran some tests on my local computer, and indeed sizeof(my_structure) was different from the sum of the sizes of all my structure's members. Some research turned up two things:
First, the sizeof operator retrieves the padded size of the structure (i.e., the real size it occupies in memory).
Second, when __attribute__((__packed__)) is specified in the declaration of the structure, this padding is disabled by the compiler, so sizeof(my_structure) will be exactly the sum of the sizes of the structure's fields.
That being said, I am wondering whether the sizeof operator returns the padded size on every compiler implementation and every architecture; in other words, is it always safe to copy a structure with memcpy using the sizeof operator, such as:
memcpy(struct_dest, struct_src, sizeof(struct_src));
I am also wondering what the real purpose of __attribute__((__packed__)) is: is it used to send less data over the network when transmitting a structure, or is it, in fact, used to avoid some unspecified, platform-dependent sizeof behaviour?
Thanks in advance.
Different compilers on different architectures can and do use different padding. So for wire transmission it is not uncommon to pack structs to achieve a consistent binary layout. This can then cater for the code at each end of the wire running on different architecture.
However, you also need to make sure that your data types are the same size if you use this approach. For example, on 64-bit systems, long is 4 bytes on Windows and 8 bytes almost everywhere else. And you also need to deal with endianness issues; the standard practice is to transmit over the wire in network byte order. In practice, you would be better off using a dedicated serialization library rather than trying to reinvent solutions to all these issues.
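If you do roll it by hand, a minimal sketch looks like this (assuming a POSIX system for htonl; the message fields are made up): fixed-width types plus an explicit network byte order remove both the size and the endianness variables.

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl; POSIX */

/* Fixed-width fields, explicitly encoded in network byte order, so
   both ends of the wire agree regardless of their native layout. */
struct msg {
    uint32_t id;
    uint32_t length;
};

size_t encode_msg(unsigned char *buf, const struct msg *m)
{
    uint32_t id  = htonl(m->id);
    uint32_t len = htonl(m->length);
    memcpy(buf + 0, &id,  4);
    memcpy(buf + 4, &len, 4);
    return 8;   /* bytes written */
}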
I am sending a structure over the network
Stop there. Perhaps some would disagree with me on this (in practice you do see a lot of projects doing this), but struct is a way of laying out things in memory - it's not a serialization mechanism. By using this tool for the job, you're already tying yourself to a bunch of non-portable assumptions.
Sure, you may be able to fake it with things like structure padding pragmas and attributes, but - can you really? Even with those non-portable mechanisms you never know what quirks might show up. I recall working in a code base where "packed" structures were used, then suddenly taking it to a platform where access had to be word aligned... even though it was nominally the same compiler (thus supported the same proprietary extensions) it produced binaries which crashed. Any pain you get from this path is probably deserved, and I would say only take it if you can be 100% sure it will only run in a given compiler and environment, and that will never change. I'd say the safer bet is to write a proper serialization mechanism that doesn't allow writing structures around across process boundaries.
Is it always safe to copy a structure with memcpy for example using the sizeof operator
Yes, it is, and that is precisely the purpose of providing the sizeof operator.
Usually __attribute__((__packed__)) is used not for size considerations but when you want to make sure the layout of a structure is exactly as you specified it.
For example:
If a structure is to be used to match hardware registers or be sent on a wire, it needs to have exactly the same layout everywhere, without any padding. This is because different architectures usually implement different kinds and amounts of padding and alignment, and the only way to ensure common ground is to take padding out of the picture by using packing.
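A minimal sketch of that (using the GCC/Clang-specific attribute; the header fields are invented for the example), with a compile-time check that no padding crept in:

#include <stdint.h>

/* __attribute__((packed)) removes all padding, so the struct matches
   a byte-exact wire or register layout. The C11 _Static_assert
   catches any surprise at compile time. */
struct __attribute__((packed)) wire_header {
    uint8_t  type;
    uint32_t payload_len;
    uint16_t checksum;
};

_Static_assert(sizeof(struct wire_header) == 7,
               "wire_header must have no padding");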
Is it a good idea to simply dump a struct to a binary file using fwrite?
e.g.:

struct Foo {
    char   name[100];
    double f;
    int    bar;
} data;

fwrite(&data, sizeof(data), 1, fout);
How portable is it?
I think it's really a bad idea to just dump out whatever the compiler gives you (padding, integer sizes, etc.), even if platform portability is not important.
I have a friend who argues that doing so is very common... in practice.
Is it true???
Edit: What is the recommended way to write portable binary files? Using some sort of library?
I'm interested in how this is achieved too (by specifying byte order, sizes, ...?).
That's certainly a very bad idea, for two reasons:
the same struct may have different sizes on different platforms due to alignment issues and compiler mood
the struct's elements may have different representations on different machines (think big-endian/little-endian, IEEE 754 vs. some other format, sizeof(int) on different platforms)
It rather critically matters whether you want the file to be portable, or just the code.
If you're only ever going to read the data back on the same C implementation (and that means with the same values for any compiler options that affect struct layout in any way), using the same definition of the struct, then the code is portable. It might be a bad idea for other reasons: difficulty of changing the struct, and in theory there could be security risks around dumping padding bytes to disk, or bytes after any NUL terminator in that char array. They could contain information that you never intended to persist. That said, the OS does it all the time in the swap file, so whatEVER, but try using that excuse when users notice that your document format doesn't always delete data they think they've deleted, and they just emailed it to a reporter.
If the file needs to be passed between different platforms then it's a pretty bad idea, because you end up accidentally defining your file format to be something like, "whatever MSVC on Win32 ends up writing". This could end up being pretty inconvenient to read and write on some other platform, and certainly the code you wrote in the first place won't do it when running on another platform with an incompatible storage representation of the struct.
The recommended way to write portable binary files, in order of preference, is probably:
Don't. Use a text format. Be prepared to lose some precision in floating-point values.
Use a library, although there's a bit of a curse of choice here. You might think ASN.1 looks all right, and it is as long as you never have to manipulate the stuff yourself. I would guess that Google Protocol Buffers is fairly good, but I've never used it myself.
Define some fairly simple binary format in terms of what each unsigned char in turn means. This is fine for characters[*] and other integers, but gets a bit tricky for floating-point types. "This is a little-endian representation of an IEEE-754 float" will do you OK provided that all your target platforms use IEEE floats. Which I expect they do, but you have to bet on that. Then, assemble that sequence of characters to write and interpret it to read: if you're "lucky" then on a given platform you can write a struct definition that matches it exactly, and use this trick. Otherwise do whatever byte manipulation you need to. If you want to be really portable, be careful not to use an int throughout your code to represent the value taken from bar, because if you do then on some platform where int is 16 bits, it won't fit. Instead use long or int_least32_t or something, and bounds-check the value on writing. Or use uint32_t and let it wrap.
[*] Until you hit an EBCDIC machine, that is. Not that anybody will seriously expect your files to be portable to a machine that plain text files aren't portable to either.
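Applied to the Foo struct from the question, a sketch of option 3 might look like the following (assuming the host's double is IEEE-754 and choosing little-endian as the file's byte order; the bounds check on bar is left to the caller, as discussed above):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

struct Foo {
    char   name[100];
    double f;
    int    bar;
};

/* Define every byte of the record explicitly: 100 bytes of name,
   8 bytes of IEEE-754 double (little-endian), 4 bytes of bar
   (little-endian). Total: 112 bytes, on every platform. */
int write_foo(FILE *out, const struct Foo *p)
{
    unsigned char buf[112];
    uint64_t bits;
    uint32_t b = (uint32_t)p->bar;   /* caller ensures the value fits */

    memcpy(buf, p->name, 100);
    memcpy(&bits, &p->f, 8);         /* reinterpret the IEEE-754 bits */
    for (int i = 0; i < 8; i++)
        buf[100 + i] = (unsigned char)(bits >> (8 * i));  /* LE bytes */
    for (int i = 0; i < 4; i++)
        buf[108 + i] = (unsigned char)(b >> (8 * i));     /* LE bytes */

    return fwrite(buf, sizeof buf, 1, out) == 1 ? 0 : -1;
}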
How fond are you of getting a call in the middle of the night? Either use a #pragma to pack them or write them variable by variable.
Yes, this sort of foolishness is very common but that doesn't make it a good idea. You should write each field individually in a specified byte order, that will avoid alignment and byte order problems at the cost of a little tiny bit of extra effort. Reading and writing field by field will also make your life easier when you upgrade your software and have to read your old data format or if the underlying hardware architecture changes.