We use the structure below in code running on a 32-bit machine. If we have to transfer this structure to a 64-bit machine, is there any change required?
struct test
{
    int num;
    char a;
    double dd;
};
I have two machines on a network, and I have to transfer data stored in the above-mentioned structure from the 32-bit machine to the 64-bit machine. How do I make the structure generic so that no data is lost? That is my question.
The layout of such a structure is completely platform-dependent: you can't even use it to transfer data between two instances of a 32-bit application compiled with different compilers, or with different compile settings under the same compiler.
The only safe use for such a structure in data transfer is between multiple instances of the same executable. Same as in: same build. You can't even generally guarantee that some later build will have the same structure.
To transfer binary data in a binary-compatible fashion, you need to use some kind of binary stream that maintains a fixed binary structure, independent of the platform. Google Protocol Buffers are one example of such a stream; another is Qt's QDataStream.
Generally, a struct is not really adequate for network or persistence purposes, as it relies in too many ways on the C implementation (compiler + platform).
"Transferring" depends on what you're doing with the struct and contained elements.
These items should be on you checklist:
Check elements for value ranges. All used types may change in width. char may change in signedness.
Check the whole structure's size. This might be important for code relying on a specific size or some arbitrary bounds.
When data leaves the process's address space (over the network, or into persistent storage), make sure the structs are properly migrated, including endianness, size, and alignment.
Everything depends heavily on the C implementations used on the different platforms.
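As a minimal sketch of what such a migration can look like for the struct test above (assuming IEEE 754 doubles on both machines and that num fits in 32 bits; pack_test is a made-up name), each field is written at a fixed width and byte order, so native padding and alignment never reach the wire:

#include <stdint.h>
#include <string.h>

struct test
{
    int num;
    char a;
    double dd;
};

/* Hypothetical helper: serialize field by field into a fixed
 * 13-byte wire format (4 + 1 + 8 bytes), big-endian.
 * Assumes num fits in 32 bits and IEEE 754 doubles on both ends. */
static void pack_test(const struct test *t, unsigned char out[13])
{
    uint32_t n = (uint32_t)t->num;
    uint64_t d;

    memcpy(&d, &t->dd, sizeof d);          /* reinterpret double as raw bits */

    out[0] = (unsigned char)(n >> 24);     /* int as 4 bytes, big-endian */
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
    out[4] = (unsigned char)t->a;          /* char as a single byte */
    for (int i = 0; i < 8; i++)            /* double bits, big-endian */
        out[5 + i] = (unsigned char)(d >> (56 - 8 * i));
}

The receiving side reverses the process, so neither side ever depends on the other's struct layout.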
I want to synchronize two Raspberry Pis with a C program. It works fine as long as the program runs only on the Pis, but for development I want to use my PC (where it is also easier to debug), and I send the timespec struct directly as binary over the wire. A Raspberry Pi uses 4 bytes each for long and time_t, while my PC uses 8 bytes each, so the two sides do not agree.
Is it possible to set long and time_t to 4 bytes each, only for this C script?
I know that the size of long, short, etc. is defined by the system.
Important: I only want to define this once in the script, not transform it to uintXX or int each time.
In programming, it is not uncommon to need to treat network transmissions separately from in-memory handling. In fact, it is pretty much the norm. So converting the data to a network format with a defined byte order and size is really recommended, and will help with the abstractions for your interfaces.
You might as well consider transforming to plain text, if that is not a time-critical piece of data exchange. It makes for a lot easier debugging.
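A minimal sketch of the plain-text approach for struct timespec (ts_to_text and ts_from_text are made-up names, and the "<seconds> <nanoseconds>" line format is just an assumed convention):

#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* Both sides print and parse via intmax_t, so the native widths of
 * time_t and long never appear on the wire. */
static int ts_to_text(const struct timespec *ts, char *buf, size_t len)
{
    return snprintf(buf, len, "%jd %ld\n",
                    (intmax_t)ts->tv_sec, ts->tv_nsec);
}

static int ts_from_text(const char *buf, struct timespec *ts)
{
    intmax_t sec;
    long nsec;

    if (sscanf(buf, "%jd %ld", &sec, &nsec) != 2)
        return -1;                   /* malformed input */
    ts->tv_sec = (time_t)sec;
    ts->tv_nsec = nsec;
    return 0;
}

A debugging bonus: the traffic is readable in any packet capture.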
C is probably not the best tool for the job here. It's much too low-level to provide automatic data serialization the way JavaScript, Python, or similar more abstract languages do.
You cannot assume the definitions of timespec will be identical on different platforms. For one thing, the sizes of the integer fields differ between 32-bit and 64-bit architectures, and you can have endianness problems too.
When you want to exchange data structures between heterogeneous platforms, you need to define your own protocol with unambiguous data and a clear endianness convention.
One solution would be to send the numbers as ASCII. Not terribly efficient, but if it's just a couple of values, who cares?
Another would be to create an exchange structure with (u)intXX_t fields.
You can assume a standard Raspberry Pi kernel will be little-endian like your PC, but if you're writing a small exchange protocol, you might as well add a couple of htonl/ntohl calls for good measure.
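A sketch of such an exchange structure for the timespec case, assuming the values fit in 32 bits (wire_ts and the helper names are made up):

#include <stdint.h>
#include <time.h>
#include <arpa/inet.h>   /* htonl / ntohl */

/* Fixed-width exchange structure: both machines agree on exactly
 * two 32-bit fields, regardless of their native time_t/long sizes.
 * Two uint32_t members are naturally aligned everywhere, so no
 * padding sneaks in. */
struct wire_ts
{
    uint32_t sec;    /* network byte order */
    uint32_t nsec;   /* network byte order */
};

static struct wire_ts ts_to_wire(const struct timespec *ts)
{
    struct wire_ts w;
    w.sec  = htonl((uint32_t)ts->tv_sec);    /* assumes it fits in 32 bits */
    w.nsec = htonl((uint32_t)ts->tv_nsec);
    return w;
}

static struct timespec ts_from_wire(const struct wire_ts *w)
{
    struct timespec ts;
    ts.tv_sec  = (time_t)ntohl(w->sec);
    ts.tv_nsec = (long)ntohl(w->nsec);
    return ts;
}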
I am working on a project that is using Unix domain socket (AF_UNIX) as a choice of IPC between different processes.
When I want to pass a data structure from one process to another, do I need to do serialization on the data structure as mentioned in this question (Passing a structure through Sockets in C)?
Since these processes are compiled with the same compiler and run on the same machine, there should be no endianness issues, nor any difference in padding. So I am not sure whether serialization is necessary.
You need only ensure that the received structure is intelligible.
If the structure is composed of self-contained types, then no processing is required; you can just call write() or send() to push the data into the socket.
Serialisation is needed where the structure is not self-contained (e.g. if it contains pointers or platform-specific data types).
If there is a chance that the two processes could have different bitness (e.g. 32-bit vs 64-bit) or different endianness, you'll want to take care that the struct is well-defined such that it comes out with the same binary representation in both forms.
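A minimal sketch of the same-build case, using socketpair() to stand in for the two local processes (error handling mostly trimmed):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

struct msg              /* self-contained: no pointers, fixed-size fields */
{
    int id;
    char name[32];
};

int main(void)
{
    int fd[2];
    struct msg out = { 42, "hello" }, in;

    /* AF_UNIX stream pair; in real code the two ends would live
     * in different processes built from the same sources. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1)
        return 1;

    write(fd[0], &out, sizeof out);   /* push the raw struct bytes */
    read(fd[1], &in, sizeof in);      /* same build, same layout */

    printf("%d %s\n", in.id, in.name);
    return 0;
}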
Serialization is not necessary in this case. Every operating system and CPU architecture combination will have a quite well defined ABI which says how structs and such are laid out in memory. This severely limits the compiler in how much it is allowed to change things around and for a good reason - change the ABI and all precompiled libraries stop working. So if you compile stuff with the same compiler targeting the same architecture the in-memory layout of structs will be the same.
To be sure, just remember to rebuild both sides on major operating system updates in case the ABI changes (which it never does, but it could happen some day).
I am learning structure padding and packing in C.
I have read that padding depends on the architecture, so does it affect inter-machine communication, i.e. when data created on one machine is read on another machine?
How is this problem avoided in this scenario?
Yes, you cannot send the binary data of a structure between platforms and expect it to look the same on the other side.
The way you solve it is to create a marshaller/demarshaller for your construct and run the data through it on the way out of one system and on the way into the other system. This lets the compiler take care of the buffering for you on each system.
Each side knows how to take the data, as you've specified it will be sent, and deal with it for the local platform.
Platforms such as Java handle this for you by providing serialization mechanisms for your classes. In C, you'll need to do this yourself. How you do it depends on how you want to send your data: you could serialize to binary, XML, or anything else.
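A sketch of what such a marshaller/demarshaller pair can look like (struct point and the helper names are made up; the wire format is an assumed 6-byte big-endian layout, while the in-memory struct is typically 8 bytes because of padding):

#include <stdint.h>

struct point            /* in-memory form; typically padded to 8 bytes */
{
    uint16_t x;
    uint32_t y;
};

/* Marshal into a fixed 6-byte big-endian wire format, independent
 * of whatever padding the local compiler inserts between x and y. */
static void marshal_point(const struct point *p, unsigned char buf[6])
{
    buf[0] = (unsigned char)(p->x >> 8);
    buf[1] = (unsigned char)p->x;
    buf[2] = (unsigned char)(p->y >> 24);
    buf[3] = (unsigned char)(p->y >> 16);
    buf[4] = (unsigned char)(p->y >> 8);
    buf[5] = (unsigned char)p->y;
}

static void demarshal_point(const unsigned char buf[6], struct point *p)
{
    p->x = (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
    p->y = ((uint32_t)buf[2] << 24) | ((uint32_t)buf[3] << 16)
         | ((uint32_t)buf[4] << 8)  |  (uint32_t)buf[5];
}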
#pragma pack is supported by most compilers that I know of. It allows the programmer to specify the desired padding method for structs.
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
http://gcc.gnu.org/onlinedocs/gcc/Structure_002dPacking-Pragmas.html
http://clang.llvm.org/docs/UsersManual.html#microsoft-extensions
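A small sketch of the effect, using the push/pop form supported by MSVC, GCC, and Clang (the printed sizes are typical, not guaranteed):

#include <stdio.h>

struct padded           /* default alignment: typically 12 bytes */
{
    char c;
    int i;
    short s;
};

#pragma pack(push, 1)
struct packed           /* packed to 1-byte alignment: typically 7 bytes */
{
    char c;
    int i;
    short s;
};
#pragma pack(pop)

int main(void)
{
    printf("padded: %zu, packed: %zu\n",
           sizeof(struct padded), sizeof(struct packed));
    return 0;
}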
In C/C++, structures are used as plain data packs. They don't provide any data encapsulation or data hiding features (C++ is an exception, due to the semantic similarity of its structs with classes).
Because of the alignment requirements of the various data types, every member of a structure should be naturally aligned. The members of a structure are allocated sequentially, in increasing address order.
Communication will only be affected if the code you have compiled for the other architecture uses a different padding scheme.
To help alleviate problems, I recommend that you pack structures so that they have no implicit padding. Where padding is required, use place-holders instead (e.g. char reserved[2]). Also, don't use bitfields!! They are not portable.
You should also be aware of other architecture-related problems, specifically endianness and data type sizes. If you need better portability, you may want to serialise and de-serialise a byte stream instead of casting it as a struct.
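A sketch of the place-holder technique, where every byte is spelled out so the compiler has no implicit padding left to insert on common ABIs (wire_rec is a made-up name; the _Static_assert requires C11):

#include <stdint.h>

struct wire_rec
{
    uint16_t id;        /* bytes 0-1 */
    uint8_t flags;      /* byte 2 */
    uint8_t reserved;   /* byte 3: explicit place-holder, not hidden padding */
    uint32_t value;     /* bytes 4-7, naturally aligned */
};

/* Catch any surprise padding at compile time. */
_Static_assert(sizeof(struct wire_rec) == 8, "unexpected padding in wire_rec");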
You can use #pragma pack(1) before the struct declaration and #pragma pack() after it to disable architecture-based packing; this solves half of the problem, because some data types are architecture-dependent too. To solve the second half, I usually use fixed-size data types such as int16_t for 16-bit integers, uint32_t for 32-bit integers, and so on.
Take a look at http://freebsd.active-venture.com/FreeBSD-srctree/newsrc/netinet/ip_icmp.h.html ; this header describes some architecture-independent network data packets.
I have a question about structure padding and memory alignment optimizations regarding structures in C. I am sending a structure over the network. I know that, for run-time optimization purposes, the members of a structure are not necessarily adjacent in memory. I've run some tests on my local computer and indeed, sizeof(my_structure) was different from the sum of the sizes of all my structure members. I did some research and found out two things:
First, the sizeof() operator retrieves the padded size of the structure (i.e. the real size that would be stored in memory).
Second, when specifying __attribute__((__packed__)) in the declaration of the structure, this optimization is disabled by the compiler, so sizeof(my_structure) will be exactly the sum of the sizes of the fields of my structure.
That being said, I am wondering whether the sizeof operator returns the padded size on every compiler implementation and on every architecture. In other words, is it always safe to copy a structure with memcpy using the sizeof operator, such as:
memcpy(struct_dest, struct_src, sizeof(struct_src));
I am also wondering what the real purpose of __attribute__((__packed__)) is. Is it used to send a smaller amount of data over the network when submitting a structure, or is it, in fact, used to avoid some unspecified and platform-dependent sizeof operator behaviour?
Thanks in advance.
Different compilers on different architectures can and do use different padding. So for wire transmission it is not uncommon to pack structs to achieve a consistent binary layout. This can then cater for the code at each end of the wire running on different architectures.
However you also need to make sure that your data types are the same size if you use this approach. For example, on 64 bit systems, long is 4 bytes on Windows and 8 bytes almost everywhere else. And you also need to deal with endianness issues. The standard is to transmit over the wire in network byte order. In practice you would be better using a dedicated serialization library rather than trying to reinvent solutions to all these issues.
I am sending a structure over the network
Stop there. Perhaps some would disagree with me on this (in practice you do see a lot of projects doing this), but struct is a way of laying out things in memory - it's not a serialization mechanism. By using this tool for the job, you're already tying yourself to a bunch of non-portable assumptions.
Sure, you may be able to fake it with things like structure padding pragmas and attributes, but - can you really? Even with those non-portable mechanisms you never know what quirks might show up. I recall working in a code base where "packed" structures were used, and then it was taken to a platform where access had to be word-aligned... even though it was nominally the same compiler (and thus supported the same proprietary extensions), it produced binaries which crashed. Any pain you get from this path is probably deserved, and I would say only take it if you can be 100% sure it will only run in a given compiler and environment, and that this will never change. I'd say the safer bet is to write a proper serialization mechanism that doesn't pass raw structures across process boundaries.
Is it always safe to copy a structure with memcpy for example using the sizeof operator
Yes, it is, and that is the purpose of providing the sizeof operator.
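One caveat worth spelling out: if struct_src is a pointer, sizeof(struct_src) yields the size of the pointer, not of the structure. A sketch of the safe form:

#include <string.h>

struct test
{
    int num;
    char a;
    double dd;
};

void copy_test(struct test *dst, const struct test *src)
{
    /* sizeof *src is the structure's full (padded) size;
     * sizeof src would be only the size of the pointer. */
    memcpy(dst, src, sizeof *src);
}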
Usually __attribute__((__packed__)) is used not for size considerations but when you want to make sure the layout of a structure is exactly as you want it to be.
For example:
If a structure is to be used to match hardware or be sent on a wire, then it needs to have the exact same layout without any padding. This is because different architectures usually implement different kinds and amounts of padding and alignment, and the only way to ensure common ground is to take padding out of the picture by using packing.
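A sketch with the GCC/Clang syntax (wire_hdr and its fields are made up; a real layout would come from the protocol or hardware documentation):

#include <stdio.h>
#include <stdint.h>

/* Packed: the layout is exactly 1 + 4 + 2 = 7 bytes, matching the
 * assumed wire header byte for byte, with no inserted padding. */
struct __attribute__((__packed__)) wire_hdr
{
    uint8_t type;
    uint32_t length;
    uint16_t checksum;
};

int main(void)
{
    printf("%zu\n", sizeof(struct wire_hdr));   /* prints 7 */
    return 0;
}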
8-bit, 16-bit, 32-bit, and 64-bit operating systems have different data ranges for integer, float, and double values.
Is it the compiler or the processor that makes the difference (8-bit, 16-bit, 32-bit, 64-bit)?
If, over a network, 16-bit integer data from one system is transferred to a 32-bit system, or vice versa, will the data be correctly represented in memory? Please help me understand.
Ultimately, it is up to the compiler. The compiler is free to choose any data types it likes*, even if it has to emulate their behaviour with software routines. Of course, typically, for efficiency it will try to replicate the native types of the underlying hardware.
As to your second question, yes, of course, if you transfer the raw representation from one architecture to another, it may be interpreted incorrectly (endianness is another issue). That is why functions like ntohs() are used.
* Well, not literally anything it likes. The C standard places some constraints, such as that an int must be at least as large as a short.
The compiler (more properly the "implementation") is free to choose the sizes, subject to the limits in the C standard. The set of sizes offered by C for its various types depends in part on the hardware it runs on; i.e. the compiler makes the choice, but (except in cases like Java, where data types are explicitly independent of the underlying hardware) it is strongly influenced by what the hardware offers.
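A quick way to see what a given implementation chose (the output varies by platform; for example, long is commonly 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux):

#include <stdio.h>

int main(void)
{
    /* All of these are implementation choices within the C standard's
     * minimum limits. */
    printf("short: %zu int: %zu long: %zu long long: %zu\n",
           sizeof(short), sizeof(int), sizeof(long), sizeof(long long));
    return 0;
}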
It depends not just on the compiler and operating system; it is dictated by the architecture (the processor, at least).
When passing data between possibly different architectures, fixed-size data types are used, e.g. uint64_t and uint32_t instead of int, short, etc.
But the size of integers is not the only concern when communicating between computers with different architectures; there's a byte-order issue too (try googling big-endian and little-endian).
The size of a given type depends on the CPU and the conventions on the operating system.
If you want an int of a specific size, use the stdint.h header. It defines int8_t, int16_t, int32_t, int64_t, some others, and their unsigned equivalents.
For communications between different computers, the protocol should define the sizes and byte order to use.
For network communication, the protocol has to define which data sizes you have. For endianness, it is highly recommended to use big-endian values (network byte order).
If there weren't the issue with the APIs, a compiler would be free to set its short, int, long as it wants. But often, the API calls are tied to these types. E.g. the open() function returns an int, whose size should be correct.
But the types might as well be part of the ABI definition.