Passing a structure through unix domain socket - c

I am working on a project that uses Unix domain sockets (AF_UNIX) for IPC between different processes.
When I want to pass a data structure from one process to another, do I need to serialize it, as mentioned in this question (Passing a structure through Sockets in C)?
Since these processes are compiled with the same compiler and run on the same machine, there should be no endianness or padding differences, so I am not sure whether serialization is necessary.

You need only ensure that the received structure is intelligible.
If the structure is composed of self-contained types then no processing is required; you can just call write() or send() to push the data into the socket.
Serialisation is needed where the structure is not self-contained (e.g. if it contains pointers or platform-specific data types).
If there is a chance that the two processes could have different bit-ness (e.g. 32-bit vs 64-bit) or different endianness, take care that the struct is well defined, so that it has the same binary representation on both sides.
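As a rough illustration of the "just call write()" case, here is a minimal sketch. The struct sensor_msg and the helpers send_msg/recv_msg are made-up names for this example, and it assumes both ends are built from the same source with the same compiler and target, and that the socket is a SOCK_STREAM socket.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical message type: fixed-size fields only, no pointers. */
struct sensor_msg {
    int32_t id;
    double  value;
    char    label[16];
};

/* Write the whole struct as raw bytes; returns 0 on success, -1 on error.
   The loop handles short writes. EINTR is treated as an error to keep
   the sketch short. */
int send_msg(int fd, const struct sensor_msg *m)
{
    const char *p = (const char *)m;
    size_t left = sizeof *m;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0)
            return -1;
        p += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Read exactly one struct; returns 0 on success, -1 on error or EOF. */
int recv_msg(int fd, struct sensor_msg *m)
{
    char *p = (char *)m;
    size_t left = sizeof *m;

    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n <= 0)
            return -1;
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

The loops are there because a stream socket may deliver fewer bytes than requested in a single call. If the struct ever gains a pointer member, this approach stops working and the pointed-to data has to be serialised separately.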

Serialization is not necessary in this case. Every operating system and CPU architecture combination will have a quite well defined ABI which says how structs and such are laid out in memory. This severely limits the compiler in how much it is allowed to change things around and for a good reason - change the ABI and all precompiled libraries stop working. So if you compile stuff with the same compiler targeting the same architecture the in-memory layout of structs will be the same.
To be sure, just remember to rebuild both sides on major operating system updates in case the ABI changes (in practice this almost never happens, but it could some day).

Related

Define size of long and time_t to 4bytes

I want to synchronize two Raspberry Pis with a C program. It works fine when the program runs only on the Pis, but for development I want to use my PC (where it is also easier to debug). The problem is that I send the timespec struct directly as binary over the wire. A Raspberry Pi uses 4 bytes each for long and time_t, while my PC uses 8 bytes each, so the two sides do not match.
Is it possible to set long and time_t to 4 bytes each, only for this C script?
I know that the size of long, short, etc. is defined by the system.
Important: I only want to define it once in the script, not convert to uintXX or int each time.
In programming, it is not uncommon to need to treat network transmissions as separate from in-memory handling. In fact, it is pretty much the norm. So converting the data to a network format with a defined byte order and size is really recommended, and it will help with the abstractions for your interfaces.
You might as well consider transforming to plain text, if that is not a time-critical piece of data exchange. It makes for a lot easier debugging.
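A plain-text exchange of a timespec could look roughly like this. The function names and the "seconds nanoseconds" line format are invented for this sketch; nothing in the question prescribes them.

```c
#include <stdio.h>
#include <time.h>

/* Format a timespec as "<seconds> <nanoseconds>\n".
   Returns the number of characters written, or -1 if buf is too small. */
int timespec_to_text(const struct timespec *ts, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%lld %ld\n",
                     (long long)ts->tv_sec, (long)ts->tv_nsec);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Parse the same format back. Returns 0 on success, -1 on malformed input. */
int timespec_from_text(const char *buf, struct timespec *ts)
{
    long long sec;
    long nsec;

    if (sscanf(buf, "%lld %ld", &sec, &nsec) != 2)
        return -1;
    if (nsec < 0 || nsec > 999999999L)
        return -1;
    ts->tv_sec = (time_t)sec;   /* may truncate where time_t is 4 bytes */
    ts->tv_nsec = nsec;
    return 0;
}
```

Because the values travel as decimal text, the width of long and time_t on each side no longer matters, as long as the value itself fits on the receiving side.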
C is probably not the best tool for the job here. It's much too low level to provide automatic data serialization like JavaScript, Python or similar more abstract languages.
You cannot assume the definition of timespec will be identical on different platforms. For one thing, the sizes of long and time_t can differ between 32-bit and 64-bit architectures (as you have seen), and you can have endianness problems too.
When you want to exchange data structures between heterogeneous platforms, you need to define your own protocol with unambiguous data sizes and a clear endianness convention.
One solution would be to send the numbers as ASCII. Not terribly efficient, but if it's just a couple of values, who cares?
Another would be to create an exchange structure with (u)intXX_t fields.
You can assume a standard raspberry kernel will be little endian like your PC, but if you're writing a small exchange protocol, you might as well add a couple of htonl/ntohl for good measure.
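The exchange-structure idea might be sketched like this. The 12-byte layout (8 bytes of seconds, 4 bytes of nanoseconds, all big-endian) and the names timespec_pack/timespec_unpack are assumptions made for the example, not part of any standard.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define WIRE_TS_SIZE 12  /* 8 bytes seconds + 4 bytes nanoseconds */

/* Encode a timespec into a fixed 12-byte big-endian buffer. */
void timespec_pack(const struct timespec *ts, unsigned char out[WIRE_TS_SIZE])
{
    uint64_t sec = (uint64_t)(int64_t)ts->tv_sec;
    uint32_t hi = htonl((uint32_t)(sec >> 32));
    uint32_t lo = htonl((uint32_t)(sec & 0xffffffffu));
    uint32_t ns = htonl((uint32_t)ts->tv_nsec);

    memcpy(out, &hi, 4);
    memcpy(out + 4, &lo, 4);
    memcpy(out + 8, &ns, 4);
}

/* Decode the buffer back into the local timespec, whatever its field sizes. */
void timespec_unpack(const unsigned char in[WIRE_TS_SIZE], struct timespec *ts)
{
    uint32_t hi, lo, ns;
    uint64_t sec;

    memcpy(&hi, in, 4);
    memcpy(&lo, in + 4, 4);
    memcpy(&ns, in + 8, 4);
    sec = ((uint64_t)ntohl(hi) << 32) | ntohl(lo);
    ts->tv_sec = (time_t)(int64_t)sec;   /* truncates if time_t is 4 bytes */
    ts->tv_nsec = (long)ntohl(ns);
}
```

The conversion is written once in these two functions, so the rest of the program keeps using struct timespec as before.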

Sending C struct of any type safely in MPI

How can I safely send a C struct of any (unknown) type and content in MPI, considering heterogeneous processors. I know MPI_BYTE with MPI_Send as the data type could be used, but how can I ensure consistency and the correctness of the data representation at the receiver side in a heterogeneous system.
You cannot. MPI is not automatically aware of how specific compilers lay out data structures in memory, therefore it relies heavily on its extensive built-in datatype system, which is used to tell the library explicitly where in memory to find the data and how to interpret it.
To send a C structure between instances of two different executables, which includes e.g. different executables for different platforms (heterogeneous computing) or different executables produced by different compilers or by the same compiler with different alignment options (I have no idea why anyone would do such a thing, but it is possible), you must construct an MPI datatype that describes the structure using MPI_Type_create_struct. Another option is to pack the relevant structure fields on the sending side using MPI_Pack and then unpack them on the receiving side using MPI_Unpack. In both cases, an MPI implementation that supports heterogeneous environments will take care of transforming the data to and from some intermediate format (with XDR often being the format of choice).
Using MPI_BYTE to send raw binary data between machines with different endianness or even with different type alignment is simply not an option.

Generic structure to transfer from 32 bit machine to 64 bit machine

We use the structure below in code running on a 32-bit machine. If we have to transfer this structure to a 64-bit machine, is any change required?
struct test
{
int num;
char a;
double dd;
};
I have two machines on a network, and I have to transfer data stored in the above structure from the 32-bit machine to the 64-bit machine. How can I make this structure generic so that no data is lost? That is my question.
The layout of such a structure is completely platform-dependent and you can't even use it to transfer data between two instances of a 32 bit application compiled using different compilers, or different compile settings under the same compiler.
The only safe use for such a structure in data transfer is between multiple instances of the same executable. Same as in: same build. You can't even generally guarantee that some later build will have the same structure.
To transfer binary data in a binary-compatible fashion, you need to use some kind of a binary stream that maintains a fixed binary structure, independent of the platform. Google Protocol Buffers are one example of such, another is Qt's QDataStream.
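If pulling in a library is not an option, a hand-rolled fixed layout for struct test might look like the sketch below. The 13-byte big-endian format and the names test_encode/test_decode are invented here, and the code assumes that int values fit in 32 bits and that double is an 8-byte IEEE 754 value on both machines.

```c
#include <stdint.h>
#include <string.h>

struct test {
    int num;
    char a;
    double dd;
};

#define TEST_WIRE_SIZE 13  /* 4 + 1 + 8 bytes, no padding */

_Static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");

static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static void put_be64(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (56 - 8 * i));
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

static uint64_t get_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Write each field at a fixed offset, most significant byte first. */
void test_encode(const struct test *t, unsigned char out[TEST_WIRE_SIZE])
{
    uint64_t bits;

    memcpy(&bits, &t->dd, sizeof bits);  /* raw bit pattern of the double */
    put_be32(out, (uint32_t)t->num);
    out[4] = (unsigned char)t->a;
    put_be64(out + 5, bits);
}

/* Rebuild the local struct from the fixed layout. */
void test_decode(const unsigned char in[TEST_WIRE_SIZE], struct test *t)
{
    uint64_t bits = get_be64(in + 5);

    t->num = (int)(int32_t)get_be32(in);
    t->a = (char)in[4];
    memcpy(&t->dd, &bits, sizeof bits);
}
```

The buffer, not the struct, is what goes over the network, so padding and the width of int on either machine no longer affect the bytes on the wire.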
Generally the struct is not really adequate to use for network or persistency purposes, as it relies in too many ways on the C implementation (compiler + platform).
"Transferring" depends on what you're doing with the struct and contained elements.
These items should be on your checklist:
Check elements for value ranges. All used types may change in width. char may change in signedness.
Check the whole structure's size. This might be important for code relying on a specific size or some arbitrary bounds.
When leaving the process's address space (sending over a network or storing persistently), make sure that the structs are properly migrated, including endianness, size and alignment.
Everything depends heavily on the used C implementations on the different platforms.
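Parts of that checklist can be turned into compile-time checks, so that a build on a platform with a different layout fails instead of silently misbehaving. The expected numbers below (4-byte int, 8-byte double, 16-byte struct) are assumptions that hold on common x86 and ARM Linux ABIs, and char_is_signed is a made-up helper name. _Static_assert needs C11.

```c
#include <limits.h>
#include <stddef.h>

struct test {
    int num;
    char a;
    double dd;
};

/* Each assertion documents one assumption the transfer code relies on. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits");
_Static_assert(sizeof(int) == 4, "int must be 4 bytes");
_Static_assert(sizeof(double) == 8, "double must be 8 bytes");
_Static_assert(offsetof(struct test, a) == 4, "unexpected offset of a");
_Static_assert(offsetof(struct test, dd) == 8, "unexpected offset of dd");
_Static_assert(sizeof(struct test) == 16, "unexpected struct size");

/* Signedness of plain char differs between platforms (for example it is
   usually signed on x86 and unsigned on ARM Linux), so report it rather
   than assert it. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

These checks only catch size and alignment differences. Byte order still has to be handled separately.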

Questions about memory alignement in structures and portability of the sizeof operator

I have a question about structure padding and memory alignment optimizations for structures in C. I am sending a structure over the network, and I know that, for run-time optimization purposes, the members of a structure are not necessarily contiguous in memory. I've run some tests on my local computer and indeed, sizeof(my_structure) was different from the sum of the sizes of all my structure members. I did some research and found out two things:
First, the sizeof() operator retrieves the padded size of the structure (i.e the real size that would be stored in memory).
When specifying __attribute__((__packed__)) in the declaration of the structure this optimization is disabled by the compiler, so sizeof(my_structure) will be exactly the same as the sum of the fields of my structure.
That being said, I am wondering whether the sizeof operator returns the padded size on every compiler implementation and on every architecture. In other words, is it always safe to copy a structure with memcpy using the sizeof operator, for example:
memcpy(struct_dest, struct_src, sizeof(*struct_src));
I am also wondering what the real purpose of __attribute__((__packed__)) is. Is it used to send less data over the network when transmitting a structure, or is it in fact used to avoid some unspecified and platform-dependent behaviour of the sizeof operator?
Thanks in advance.
Different compilers on different architectures can and do use different padding. So for wire transmission it is not uncommon to pack structs to achieve a consistent binary layout. This can then cater for the code at each end of the wire running on different architecture.
However you also need to make sure that your data types are the same size if you use this approach. For example, on 64 bit systems, long is 4 bytes on Windows and 8 bytes almost everywhere else. And you also need to deal with endianness issues. The standard is to transmit over the wire in network byte order. In practice you would be better using a dedicated serialization library rather than trying to reinvent solutions to all these issues.
I am sending a structure over the network
Stop there. Perhaps some would disagree with me on this (in practice you do see a lot of projects doing this), but struct is a way of laying out things in memory - it's not a serialization mechanism. By using this tool for the job, you're already tying yourself to a bunch of non-portable assumptions.
Sure, you may be able to fake it with things like structure padding pragmas and attributes, but can you really? Even with those non-portable mechanisms you never know what quirks might show up. I recall working in a code base where "packed" structures were used, then suddenly taking it to a platform where access had to be word aligned... even though it was nominally the same compiler (and thus supported the same proprietary extensions), it produced binaries which crashed. Any pain you get from this path is probably deserved, and I would say only take it if you can be 100% sure it will only run with a given compiler and environment, and that this will never change. I'd say the safer bet is to write a proper serialization mechanism that never writes raw structures across process boundaries.
Is it always safe to copy a structure with memcpy for example using the sizeof operator
Yes, it is and that is the purpose of providing the sizeof operator.
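One detail worth spelling out: when the source and destination are pointers, sizeof has to be applied to the pointed-to object, not to the pointer itself. A minimal sketch, with a made-up struct my_structure and a helper copy_structure:

```c
#include <string.h>

/* Hypothetical structure with padding between its members. */
struct my_structure {
    char   tag;
    int    count;
    double ratio;
};

void copy_structure(struct my_structure *dest, const struct my_structure *src)
{
    /* sizeof *src is the full padded size of the struct.
       sizeof src would only be the size of a pointer. */
    memcpy(dest, src, sizeof *src);
}
```

Within one program, plain assignment (*dest = *src) does the same job and leaves the size bookkeeping to the compiler.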
Usually __attribute__((__packed__)) is used not for size considerations but when you want to make sure that the layout of a structure is exactly as you want it to be.
For example:
If a structure is to be used to match hardware or be sent on a wire, then it needs to have the exact same layout everywhere, without any padding. This is because different architectures usually implement different kinds and amounts of padding and alignment, and the only way to ensure common ground is to take padding out of the picture by using packing.
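To make the effect of packing visible, here is a small sketch comparing the same hypothetical header with and without the attribute. It assumes GCC or Clang, since __attribute__((__packed__)) is a compiler extension, and the struct and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler may insert padding before length
   and after flags. */
struct header_natural {
    uint8_t  type;
    uint32_t length;
    uint16_t flags;
};

/* Packed layout: members follow each other with no padding (1 + 4 + 2). */
struct __attribute__((__packed__)) header_packed {
    uint8_t  type;
    uint32_t length;
    uint16_t flags;
};

size_t natural_size(void) { return sizeof(struct header_natural); }
size_t packed_size(void)  { return sizeof(struct header_packed); }
size_t packed_length_offset(void) { return offsetof(struct header_packed, length); }
```

The packed variant is 7 bytes, while the natural one is typically 12 on common ABIs. Packing fixes the offsets only; the byte order of length and flags still depends on the host, and length sits at an odd offset, so access to it can be slow or faulting on strict-alignment hardware.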

When to worry about endianness?

I have seen countless references about endianness and what it means. I got no problems about that...
However, my coding project is a simple game to run on linux and windows, on standard "gamer" hardware.
Do I need to worry about endianness in this case? When should I need to worry about it?
My code is simple C and SDL+GL, the only complex data are basic media files (png+wav+xm) and the game data is mostly strings, integer booleans (for flags and such) and static-sized arrays. So far no user has had issues, so I am wondering if adding checks is necessary (will be done later, but there are more urgent issues IMO).
The times when you need to worry about endianness:
you are sending binary data between machines or processes (using a network or file). If the machines may have different byte order or the protocol used specifies a particular byte order (which it should), you'll need to deal with endianness.
you have code that accesses memory through pointers of different types (say you access an unsigned int variable through a char*).
If you do these things you're dealing with byte order whether you know it or not - it might be that you're dealing with it by assuming it's one way or the other, which may work fine as long as your code doesn't have to deal with a different platform.
In a similar vein, you generally need to deal with alignment issues in those same cases and for similar reasons. Once again, you might be dealing with it by doing nothing and having everything work fine because you don't have to cross platform boundaries (which may come back to bite you down the road if that does become a requirement).
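The pointer case is easy to see for yourself. This sketch looks at a 32-bit value through an unsigned char pointer and formats the bytes in memory order; bytes_in_memory is a made-up helper name.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the four bytes of value, in the order they sit in memory,
   as "xx xx xx xx" (11 characters plus the terminating NUL). */
void bytes_in_memory(uint32_t value, char out[12])
{
    const unsigned char *p = (const unsigned char *)&value;

    snprintf(out, 12, "%02x %02x %02x %02x", p[0], p[1], p[2], p[3]);
}
```

For the value 0x01020304 a little-endian machine yields "04 03 02 01" and a big-endian machine yields "01 02 03 04". Code that relies on either result is making a byte-order assumption.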
If you mean a PC by "standard gamer hardware", then you don't have to worry about endianness as it will always be little endian on x86/x64. But if you want to port the project to other architectures, then you should design it endianness-independently.
Whenever you receive or transmit data over a network, remember to convert between network and host byte order. The C functions htons, htonl, etc., or their equivalents in your language, should be used here.
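A small sketch of those conversions in use. The 6-byte header (a 16-bit type followed by a 32-bit length) and the names header_write/header_read are invented for the example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Serialize a hypothetical header in network (big-endian) byte order. */
void header_write(unsigned char out[6], uint16_t type, uint32_t length)
{
    uint16_t t = htons(type);
    uint32_t l = htonl(length);

    memcpy(out, &t, 2);
    memcpy(out + 2, &l, 4);
}

/* Parse it back into host byte order. */
void header_read(const unsigned char in[6], uint16_t *type, uint32_t *length)
{
    uint16_t t;
    uint32_t l;

    memcpy(&t, in, 2);
    memcpy(&l, in + 2, 4);
    *type = ntohs(t);
    *length = ntohl(l);
}
```

On a big-endian host these functions are no-ops, so the same code produces the same bytes on the wire either way.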
You also need to worry about it whenever you read multi-byte values (like UTF-16 characters or 32-bit ints) from a file, since that file might have originated on a system with different endianness. If the file is UTF-16 or UTF-32 it probably has a BOM (byte-order mark). Otherwise, the file format will have to specify endianness in some way.
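For the UTF-16 case, BOM handling might be sketched as follows. The enum and function names are made up, and the sketch deliberately ignores UTF-32, whose little-endian BOM starts with the same two bytes.

```c
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_UNKNOWN, ORDER_BIG, ORDER_LITTLE };

/* Inspect the first two bytes of a buffer for a UTF-16 byte-order mark. */
enum byte_order utf16_bom(const unsigned char *buf, size_t len)
{
    if (len < 2)
        return ORDER_UNKNOWN;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return ORDER_BIG;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return ORDER_LITTLE;
    return ORDER_UNKNOWN;
}

/* Read one 16-bit value from two bytes in the given order. Any order
   other than ORDER_BIG is read as little-endian. Working byte by byte
   gives the same result regardless of the host's own endianness. */
uint16_t read_u16(const unsigned char *p, enum byte_order order)
{
    if (order == ORDER_BIG)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)((p[1] << 8) | p[0]);
}
```

Assembling values from individual bytes like this avoids having to know the host byte order at all.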
You only need to worry about it if your game needs to run on different hardware architectures. If you are positive that it will always run on Intel hardware then you can forget about it. If it will run on Linux, though, many people use architectures other than Intel and you may end up having to think about it.
Are you distributing your game in source code form?
Because if you are distributing your game as a binary only, then you know exactly which processor families your game will run on. Also, are the media files user generated (possibly via a level editor), or are they really only meant to be supplied by you?
If this is a truly closed environment (you distribute binaries and the game assets are not intended to be customized) then you know your own endianness risks and I personally wouldn't fool with it.
However, if you are distributing source and/or hoping people will customize their game, then you have a potential cause for concern. That said, with most of the desktop and laptop computers around these days moving to x86, I would think this is a diminishing concern.
The problem occurs with networking and how the data is sent and when you are doing bit fiddling on different processors since different processors may store the data differently in memory.
I believe PowerPC has the opposite endianness of Intel processors. You might be able to have a routine that handles endianness depending on the architecture. I'm not sure if you can actually tell what the hardware architecture is in code... maybe someone smarter than me knows the answer to that question.
Now, in reference to your "standard gamer hardware" statement: most gamers use consumer off-the-shelf hardware, so you are almost certain to get the same endianness across the board. I'm sure someone will disagree with me, but that's my $.02.
Ha...I just noticed on the right there is a link that is showing up related to the suggestion I had above.
Find Endianness through a c program
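For what it's worth, a routine of the kind suggested above can be written in a few lines. This is only a sketch with made-up names (is_little_endian, swap32, host_to_le32), and it checks byte order at run time rather than identifying the architecture.

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 otherwise: look at which byte of
   a known two-byte value is stored first. */
int is_little_endian(void)
{
    const uint16_t probe = 1;

    return *(const unsigned char *)&probe == 1;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Convert a host value so that its in-memory bytes are little-endian,
   e.g. before writing it to a save file with a little-endian format. */
uint32_t host_to_le32(uint32_t v)
{
    return is_little_endian() ? v : swap32(v);
}
```

With something like host_to_le32 applied at every file or network boundary, the rest of the game code can keep working with native integers.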
