I've got the following struct:
struct fetch_info_t {
u_int8_t grocery_type;
u_int8_t arg[1024];
} __attribute__((packed));
I'd like to send this over a socket to a server, to request data. I'd very much like to avoid any libraries, such as protobuf.
grocery_type can be any value between 1 and 255. Some grocery types, say type 128, must provide additional information. It's not enough to provide type 128; I'd also like to provide Cheeses as a string. That said, type 129 must provide a number (u_int32_t), not a string, unlike 128.
Basically I've allocated 1024 bytes for the additional information the system may require. The question is, how do I send it over a socket, or more specifically, how do I populate arg with the right information in a system-independent way? I know htonl could be used on the number, but how do I actually set the buffer value to that?
I'd imagine that sending the info would eventually come down to casting the struct pointer to an unsigned char array and sending that over a socket. Please let me know if there's a better way.
You cannot assign the 32-bit value directly to the array,
because correct alignment is not guaranteed.
memcpy() simply copies the bytes, so there is no alignment problem.
u_int32_t the_value=htonl( ... );
struct fetch_info_t the_info;
the_info.grocery_type=129;
memcpy(the_info.arg, &the_value, sizeof(the_value));
Then, because your structure is packed, you can send it with
send(my_socket, &the_info,
sizeof(the_info.grocery_type)+sizeof(the_value), 0);
In case you need to send a string
char *the_text= ... ;
size_t the_size=strlen(the_text)+1;
struct fetch_info_t the_info;
the_info.grocery_type=128;
memcpy(the_info.arg, the_text, the_size);
send(my_socket, &the_info,
sizeof(the_info.grocery_type)+the_size, 0);
Note that the '\0' is transmitted here.
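Putting both cases together, here is a minimal sketch of two helpers that fill the struct and return how many bytes to pass to send(). The function names fill_number_request and fill_text_request are made up for illustration, and the struct is spelled with <stdint.h> types; the packed attribute is the same GCC/Clang extension used in the question.

```c
#include <arpa/inet.h>   /* htonl */
#include <stdint.h>
#include <string.h>

/* Same layout as the question's struct, spelled with <stdint.h> types. */
struct fetch_info_t {
    uint8_t grocery_type;
    uint8_t arg[1024];
} __attribute__((packed));

/* Hypothetical helper: request carrying a 32-bit number.
 * Returns the number of bytes to send. */
size_t fill_number_request(struct fetch_info_t *info, uint8_t type, uint32_t number)
{
    uint32_t wire = htonl(number);           /* network (big-endian) order */
    info->grocery_type = type;
    memcpy(info->arg, &wire, sizeof wire);   /* byte copy, no alignment issue */
    return sizeof info->grocery_type + sizeof wire;
}

/* Hypothetical helper: request carrying a NUL-terminated string.
 * Returns the number of bytes to send, or 0 if the text does not fit. */
size_t fill_text_request(struct fetch_info_t *info, uint8_t type, const char *text)
{
    size_t size = strlen(text) + 1;          /* include the '\0' */
    if (size > sizeof info->arg)
        return 0;
    info->grocery_type = type;
    memcpy(info->arg, text, size);
    return sizeof info->grocery_type + size;
}
```

The returned length is what you would pass as the third argument of send(my_socket, &the_info, length, 0). The size check in the text helper keeps an over-long string from overflowing arg.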
Basically I have a custom structure that contains different kinds of data. For example:
typedef struct example_structure{
uint8_t* example_1[4];
int example_2[4];
int example_3;
} example_structure;
What I need to do is copy the contents of this structure to a const char* buffer so I can send that copied data (buffer) using winsock2's send(SOCKET s, const char* buffer, int len, int flags) function. I tried using memcpy(), but wouldn't I just copy the addresses of the pointers and not the data?
Yes, if you copied or sent that structure through a socket you would end up copying/sending pointers, which would be meaningless to the recipient. Moreover, if the recipient is running on different hardware (e.g. with a different endianness), all of the data may be meaningless anyway. On top of that, differences in the amount of padding between structure members may also become a problem.
For non-trivial situations it is best to use an existing protocol (such as protobuf), or roll your own protocol, keeping in mind the potential differences in hardware representation of your data.
You need to design a protocol before you can encode the data in accord with that protocol. Decide exactly how the data will be encoded at the byte level. Then write code to encode and decode to that format that you decided on.
Do not skip the step of actually documenting the wire protocol at the byte level. It will save you pain later, I promise.
See this answer for a bit more detail.
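To make "decide the byte layout, then write the encoder and decoder" concrete, here is a small sketch for a made-up three-field message. The struct demo_msg, its 7-byte layout and the function names are all invented for illustration; only the technique (writing each byte explicitly with shifts, so host endianness and padding never matter) is the point.

```c
#include <stdint.h>

/* Hypothetical wire format, 7 bytes, all integers big-endian:
 *   offset 0: type  (1 byte)
 *   offset 1: count (2 bytes)
 *   offset 3: value (4 bytes)
 */
enum { DEMO_WIRE_SIZE = 7 };

struct demo_msg {
    uint8_t  type;
    uint16_t count;
    uint32_t value;
};

/* Encode m into exactly DEMO_WIRE_SIZE bytes. */
void demo_encode(const struct demo_msg *m, uint8_t out[DEMO_WIRE_SIZE])
{
    out[0] = m->type;
    out[1] = (uint8_t)(m->count >> 8);
    out[2] = (uint8_t)(m->count);
    out[3] = (uint8_t)(m->value >> 24);
    out[4] = (uint8_t)(m->value >> 16);
    out[5] = (uint8_t)(m->value >> 8);
    out[6] = (uint8_t)(m->value);
}

/* Decode DEMO_WIRE_SIZE bytes back into the in-memory struct. */
void demo_decode(const uint8_t in[DEMO_WIRE_SIZE], struct demo_msg *m)
{
    m->type  = in[0];
    m->count = (uint16_t)((in[1] << 8) | in[2]);
    m->value = ((uint32_t)in[3] << 24) | ((uint32_t)in[4] << 16) |
               ((uint32_t)in[5] << 8)  |  (uint32_t)in[6];
}
```

The comment block at the top doubles as the byte-level documentation of the format, which is the part that tends to get skipped.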
const char* buffer
This is a pointer to constant data, so you can't copy anything into it. You probably don't need to copy anything anyway. Just call send like this, where ex is your variable of type example_structure:
send(s, (const char*)&ex, sizeof(ex), flags)
But the pointers in your structure (uint8_t* example_1[4];) are a problem here.
Sending pointers between different applications or machines does not make sense.
Hmm, your struct contains uint8_t * fields, which look like C strings... It does not make sense to copy or send a pointer, which is merely a memory address in the sending process's user space.
If your struct had been (note, no pointers):
typedef struct example_structure{
uint8_t example_1[4];
int example_2[4];
int example_3;
} example_structure;
and provided both ends run on exactly the same architecture (same hardware, same compiler, same compiler options), you could simply do:
example_structure ex_struc;
// initialize the struct
...
send(s, &ex_struc, sizeof(ex_struc), flags);
And even in that case, I would strongly advise you to define and use a protocol. As already said by @DavidSchwartz, it could save you time and headaches later...
But as you have pointers, you cannot do that and must define a protocol.
It could be (but you are free to prefer little-endian order, or 2 or 8 bytes for each int depending on your actual data):
one byte (or two) for length of first uint8_t array, followed by the array
above repeated 3 more times
four bytes in big endian order for first int of example_2
repeated 3 times
four bytes in big endian order for int of example_3
This clearly defines the format of a message.
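As a rough sketch, an encoder for that format could look like the following. The struct does not record how long each example_1 array is, so the lens parameter is an assumption added for illustration, as are the names encode_example and put_i32_be; int is assumed to fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct example_structure {
    uint8_t *example_1[4];
    int example_2[4];
    int example_3;
} example_structure;

/* Write v as four big-endian bytes; returns the next write position. */
static uint8_t *put_i32_be(uint8_t *p, int32_t v)
{
    uint32_t u = (uint32_t)v;
    p[0] = (uint8_t)(u >> 24); p[1] = (uint8_t)(u >> 16);
    p[2] = (uint8_t)(u >> 8);  p[3] = (uint8_t)u;
    return p + 4;
}

/* lens[i] is the byte count of example_1[i] (hypothetical: the struct itself
 * does not record it). Returns bytes written, or 0 if out is too small. */
size_t encode_example(const example_structure *s, const uint8_t lens[4],
                      uint8_t *out, size_t cap)
{
    size_t need = 4 + 4 * 4 + 4;   /* four length bytes + five 32-bit ints */
    for (int i = 0; i < 4; i++)
        need += lens[i];
    if (need > cap)
        return 0;

    uint8_t *p = out;
    for (int i = 0; i < 4; i++) {
        *p++ = lens[i];                        /* one-byte length prefix */
        if (lens[i] != 0)
            memcpy(p, s->example_1[i], lens[i]);
        p += lens[i];
    }
    for (int i = 0; i < 4; i++)
        p = put_i32_be(p, (int32_t)s->example_2[i]);
    p = put_i32_be(p, (int32_t)s->example_3);
    return (size_t)(p - out);
}
```

The resulting buffer is what gets handed to send(); the receiver walks the same layout in the same order to rebuild its own copy.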
I'm making a C program that passes a structure over a socket
This is my struct
typedef struct{
char type; //message type
char* sender; //sender
char* receiver; //receiver
unsigned int msglen; //msg length
char* msg; //text
} msg_t;
this is my send function:
void send_message(int socket, char* msg)
{
msg_t message;
bzero(&message,sizeof(message));
message.msg = msg;
if(send(socket,&message,sizeof(msg_t),0) < 0)
{
perror("ERROR: send fail\n");
}
}
and this is my receive function:
msg_t rec_message(int socket)
{
msg_t buff;
bzero(&buff,sizeof(buff));
if(recv(socket,&buff,sizeof(buff),0) < 0)
{
perror("ERROR: receive failed\n");
}
return buff;
}
When I send messages as plain strings everything works fine, but when I switch to the structure the client seems to send the message and then gives me this:
ERROR: receive failed: connection reset by peer
and the server this:
ERROR: receive failed: invalid argument
What am I doing wrong?
The question has several problems that need to be addressed. Perhaps it would be best to focus, first, on the msg_t structure itself. Here is a model of what it probably looks like; both in memory, as well as 'on the wire' as it is transmitted:
According to the above, msg_t is 40 bytes long. This can be confirmed by printing out its size:
printf("sizeof(msg_t): %zu\n", sizeof(msg_t));
"So what's with all the empty white blocks?"
In order to make things speedy at run-time, the compiler 'aligns' each field in the 'msg_t' structure at "natural/native" offsets of the CPU addressing architecture. On my 64-bit system, that means each structure field will be aligned on an eight-byte offset; even if it means leaving empty, unused space in the structure. Notice that the offsets of the structure fields are: 0, 8, 16, 24, 32; all multiples of 8 bytes.
On a 32-bit system, you might find that these offsets are at multiples of 4 bytes.
While an 8-byte alignment of structure fields is optimum for memory access, it is not so great when structures are sent over the wire. It is preferable for wire/protocol structures to be aligned at 1-byte; thus eliminating unused 'filler' bytes in the structure.
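If you want to see the layout on your own machine rather than take my word for it, offsetof() from <stddef.h> reports where each field sits. This is just a sketch; the helper names are made up, and the numbers printed depend on your compiler and target.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
    char type;            /* message type */
    char *sender;         /* sender */
    char *receiver;       /* receiver */
    unsigned int msglen;  /* msg length */
    char *msg;            /* text */
} msg_t;

/* Sum of the member sizes, i.e. the size the struct would have with no padding. */
size_t msg_t_unpadded_size(void)
{
    return sizeof(char) + 3 * sizeof(char *) + sizeof(unsigned int);
}

/* Print each field's offset and the total amount of padding. */
void print_msg_t_layout(void)
{
    printf("type     at offset %zu\n", offsetof(msg_t, type));
    printf("sender   at offset %zu\n", offsetof(msg_t, sender));
    printf("receiver at offset %zu\n", offsetof(msg_t, receiver));
    printf("msglen   at offset %zu\n", offsetof(msg_t, msglen));
    printf("msg      at offset %zu\n", offsetof(msg_t, msg));
    printf("sizeof(msg_t) = %zu, padding = %zu bytes\n",
           sizeof(msg_t), sizeof(msg_t) - msg_t_unpadded_size());
}
```

Calling print_msg_t_layout() from main() on both endpoints is a quick way to check whether the two sides even agree on the in-memory layout.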
One way to change the alignment of a structure (supported by many compilers, but not defined by the C language itself) is '#pragma pack()', which is used as shown below:
#pragma pack(1)
typedef struct{
char type; //message type
char* sender; //sender
char* receiver; //receiver
unsigned int msglen; //msg length
char* msg; //text
} msg_t;
#pragma pack()
In the above structure definition, the first '#pragma pack(1)' causes the following structure to be 1-byte aligned. The next '#pragma pack()' returns the compiler to its default alignment. This "packed" structure looks like this:
Next, examine the fields in the structure. The 'sender' field is a 'char *'. A 'char *' is an address where the sender string can be found on the 'sending' machine (or endpoint).
To be blunt, this 'address' is of no value at all to the 'receiver' machine (or endpoint); as the 'receiver' has no access to the memory of the 'sender'.
The same is true of the 'receiver' field; and the 'msg' field. All of these are addresses of strings on the 'sender' machine; which are of no value to the 'receiver' machine.
Most likely, the 'intent' is to send the actual 'sender', 'receiver' and 'msg' strings. To do that, a structure similar to the following might be used:
#pragma pack(1)
typedef struct{
char type; //message type
char sender[15]; //sender
char receiver[15]; //receiver
char msg[30]; //text
} msg_t;
#pragma pack()
This structure looks like this:
Now, the actual strings are in the structure; not just their address in memory. This will do what was actually intended.
Unfortunately, it does limit the length of each string; and it also contains a lot of unused/wasted space. Perhaps it would be nice to remove that limitation and allow more flexibility. It might be better to send these fields like this:
Notice that each 'variable-length' string is prefixed with one byte that indicates the length of the string that follows. (This is how strings are stored in the PASCAL language). This byte allows the following string to be from 0-255 bytes long. No wasted space on the wire.
Unfortunately, this 'wire format' cannot be produced directly using C structures.
Let's go back now to the structure defined in the question, with some slight modification:
typedef struct{
char type; //message type
char* sender; //sender
char* receiver; //receiver
char* msg; //text
} msg_t;
Notice that I have returned the structure to its natural/native 8-byte alignment by eliminating the '#pragma pack()' stuff. I have also removed the 'msglen' field (it is not really needed).
Most likely, the sender, receiver, and msg fields of the structure will be initialized to point to strings (perhaps allocated with malloc(), etc.). What you do to send this structure over the wire, using the efficient layout above, is to send each field individually.
First, send the one-byte 'type'. Then send the one-byte length of the 'sender' string (i.e. strlen(sender) + 1). Then send the 'sender' string, followed by the one-byte length of the 'receiver' string, followed by the 'receiver' string, followed by the one-byte length of the 'msg' string, followed by the 'msg' string.
On the 'receiver' endpoint, you first read the one-byte 'type' (which would clue you in that there will be three length-preceded strings to follow). Reading the next byte would tell you the size of the following string (and allow you to malloc() memory to the 'sender' field of the msg_t structure at the receiver endpoint). Then read the 'sender' string into exactly the right sized, malloc()ed memory. Do the same to read the receiver string length and the receiver string; and finally the msg length and string.
If you find a PASCAL string (limited to 255 bytes) a bit tight, change the length prefix from one byte to multiple bytes.
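Here is a minimal sketch of the length-prefixed encoding described above, working on a plain byte buffer so the socket calls stay out of the way. The names put_lp_string and get_lp_string are invented for this example. As in the description, the length byte counts the trailing '\0', which limits a string to 254 characters.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Append one length-prefixed string to out. Returns the new write
 * position, or NULL if the string is too long or out is full. */
uint8_t *put_lp_string(uint8_t *out, const uint8_t *end, const char *s)
{
    size_t len = strlen(s) + 1;               /* count the '\0' too */
    if (len > 255 || (size_t)(end - out) < 1 + len)
        return NULL;
    *out++ = (uint8_t)len;
    memcpy(out, s, len);
    return out + len;
}

/* Read one length-prefixed string from in and store a malloc()ed copy
 * in *s. Returns the new read position, or NULL on malformed input. */
const uint8_t *get_lp_string(const uint8_t *in, const uint8_t *end, char **s)
{
    if (in >= end)
        return NULL;
    size_t len = *in++;
    if (len == 0 || (size_t)(end - in) < len || in[len - 1] != '\0')
        return NULL;                          /* truncated or unterminated */
    *s = malloc(len);
    if (*s == NULL)
        return NULL;
    memcpy(*s, in, len);
    return in + len;
}
```

To build a message you would write the 'type' byte and then call put_lp_string() three times; the receiver reads the type and calls get_lp_string() three times, freeing the strings when done. On a real socket, keep in mind that recv() may return fewer bytes than requested, so collect the whole message before decoding.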
I'm stuck with a problem reading bytes in my C TCP socket server, which receives requests from a Python client. I have the following struct as my receive template:
struct ofp_connect {
uint16_t wildcards; /* identifies ports to use below */
uint16_t num_components;
uint8_t pad[4]; /* Align to 64 bits */
uint16_t in_port[0];
uint16_t out_port[0];
struct ofp_tdm_port in_tport[0];
struct ofp_tdm_port out_tport[0];
struct ofp_wave_port in_wport[0];
struct ofp_wave_port out_wport[0];
};
OFP_ASSERT(sizeof(struct ofp_connect) == 8);
I can read the first two 16-bit fields properly, but my problem is that in_port[0], after the pad field, seems to be wrong. The way it's currently being read is:
uint16_t portwin, portwout, * wportIN;
wportIN = (uint16_t*)&cflow_mod->connect.in_port; //where cflow_mod is the main struct which encompasses connect struct template described above
memcpy(&portwin, wportIN, sizeof(portwin) );
DBG("inport:%d:\n", ntohs(portwin));
Unfortunately this doesn't give me the expected in_port number. I can check in Wireshark that the client is sending the right packet format, but I feel the way I read the in/out ports is wrong. Or is it because of the way Python sends the data? Can you provide some advice on where and why I'm going wrong? Thanks in advance.
The declaration of struct ofp_connect violates the following clause of the ISO C standard:
6.7.2.1 Structure and union specifiers ... 18 As a special case, the last element of a structure with more than one named member may have
an incomplete array type; this is called a flexible array member.
Note that in your case in_port and out_port should have been declared as in_port[] and out_port[] to take advantage of the clause above, in which case you would have two flexible array members, which is prohibited by that clause. The zero-length array declaration is a convention adopted by many compilers (including gcc, for example) which has the same semantics, but in your case both in_port and out_port share the same space (essentially whatever bytes follow the ofp_connect structure). Moreover, for this to work, you have to allocate some space after the structure for the flexible array members. Since, as you said, struct ofp_connect is part of a larger structure, accessing in_port returns the 'value' stored in the containing structure's member following the connect sub-struct.
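One way around this, sketched below, is to keep only the fixed 8-byte part in the struct and pull the variable part straight out of the receive buffer by offset. This assumes the 16-bit ports immediately follow the 8-byte header in the bytes you received, which is a guess about your protocol; the names ofp_connect_hdr and read_port_after_header are made up.

```c
#include <arpa/inet.h>   /* ntohs */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed 8-byte part of the message; the variable-length port lists follow it
 * in the byte stream and are not declared as members. */
struct ofp_connect_hdr {
    uint16_t wildcards;
    uint16_t num_components;
    uint8_t  pad[4];
};

/* Copy the index-th 16-bit port that follows the header in buf into *port
 * (host byte order). Returns 0 on success, -1 if buf is too short. */
int read_port_after_header(const uint8_t *buf, size_t len, size_t index,
                           uint16_t *port)
{
    size_t off = sizeof(struct ofp_connect_hdr) + index * sizeof(uint16_t);
    uint16_t raw;
    if (len < off + sizeof raw)
        return -1;
    memcpy(&raw, buf + off, sizeof raw);   /* no alignment assumptions */
    *port = ntohs(raw);
    return 0;
}
```

Here buf must point at the start of the connect part within the bytes you actually received, not at a member of the enclosing C struct.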
Suppose I have some complex struct
struct icmphdr
{
u_int8_t type; // Type of ICMP message
u_int8_t code; // Packet code
u_int16_t checksum; // Datagram checksum
/* Parts of the packet below don't have to appear */
union
{
struct
{
u_int16_t id;
u_int16_t sequence;
} echo;
u_int32_t gateway;
struct
{
u_int16_t __unused;
u_int16_t mtu;
} frag;
} un;
};
and a
char buf[SIZE];//for some integer SIZE
What is the meaning and the purpose of this cast?
ip=(struct icmphdr*)buf; //ip was formerly defined as some struct iphdr *ip;
The likely scenario behind your code is this:
The programmer wanted to create a data protocol and represent the various contents as a struct, to ease programming and improve code readability.
The underlying API probably only allows data transmissions on byte basis. This means that the struct will have to be passed as a "chunk of bytes". Your particular code appears to be the receiver: it has a chunk of raw bytes and states that the data in those bytes corresponds to a struct.
Formally & theoretically, the C standard does not define what happens when you cast between pointers to different data types. In theory, anything can happen if you do. But in practice/the real world, such casts are well-defined as long as there is some sort of guarantee about the structure of the data.
Here is where you can get problems. Many computers have alignment requirements, meaning that the compiler is free to insert so-called padding bytes anywhere inside your struct/union. These padding bytes may not necessarily be the same between two compilations, and they may certainly not be the same between two different systems.
So you have to either ensure that both the sender and the receiver have no padding enabled, or that they have the same padding. Otherwise you cannot use structs/unions this way; they will cause the program to crash and burn.
The quick & dirty way to ensure that struct padding isn't enabled is to use a compiler extension such as the non-standard #pragma pack(1), which is commonly supported by many compilers.
The professional, portable way is to add a compile-time assert to check that the size of the struct is indeed as intended. With C11, it would look like
static_assert(sizeof(struct icmphdr) ==
(sizeof(uint8_t) +
sizeof(uint8_t) + ... /* all individual members' types */ ),
"Error: padding detected");
If the compiler doesn't support static_assert, there are several ways to achieve something similar with various macros, or even a runtime assert().
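For a complete, compilable instance of that check, here is a sketch on a cut-down echo header. The struct icmp_echo_hdr is a simplified stand-in invented for this example, not the real system header.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Simplified stand-in for an ICMP echo header: 1 + 1 + 2 + 2 + 2 = 8 bytes. */
struct icmp_echo_hdr {
    uint8_t  type;
    uint8_t  code;
    uint16_t checksum;
    uint16_t id;
    uint16_t sequence;
};

/* Compilation fails here if the compiler inserted any padding. */
static_assert(sizeof(struct icmp_echo_hdr) ==
              sizeof(uint8_t) + sizeof(uint8_t) + 3 * sizeof(uint16_t),
              "padding detected in struct icmp_echo_hdr");
```

The check costs nothing at run time, and if someone later reorders or adds members in a way that introduces padding, the build breaks instead of the protocol.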
That's pretty bad. Don't ever make a char buffer and cast it to a struct, because the alignment will be wrong (i.e., the char buffer is going to have some random starting address because strings can start anywhere, but ints need, or at least should have, addresses that are multiples of four on most architectures).
The solution is not to do nasty casts like that. Make a proper union that will have the alignment of the most restrictive of its members, or use a special element to force the alignment you need if you have to (see the definition of sockaddr_storage in your /usr/include/sys/socket.h or similar).
Illustration
You create a buffer on the stack and read some data into it:
char buf[1024]; ssize_t nread = read(fd, buf, sizeof(buf));
Now you pretend the buffer was the struct:
CHECK(nread >= (ssize_t)sizeof(struct icmphdr));
struct icmphdr* hdr = (struct icmphdr*)buf;
hdr->un.gateway; // probable SIGSEGV on eg Itanium!
By reinterpreting the buffer as a struct, we bypassed the compiler's checks. If we're unlucky, &hdr->un.gateway won't be a multiple of four, and accessing it as an integer will barf on some platforms.
Illustration of solution
struct icmphdr hdr; ssize_t nread = read(fd, &hdr, sizeof(hdr));
CHECK(nread == (ssize_t)sizeof(hdr));
hdr.un.gateway; // OK
Let the compiler help you. Don't do grotty casts. When you make a buffer, tell the compiler what you're going to use the buffer for so it can put it in the correct place in memory for you.
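If the bytes are already sitting in a char buffer (say the header is somewhere in the middle of a larger packet), you can still avoid the cast by copying into a properly declared struct. A sketch, using a cut-down header invented for the example (icmp_min_hdr and parse_icmp_min_hdr are made-up names):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down header: 1 + 1 + 2 + 4 = 8 bytes, no padding needed. */
struct icmp_min_hdr {
    uint8_t  type;
    uint8_t  code;
    uint16_t checksum;
    uint32_t gateway;
};

/* Returns 0 and fills *hdr if buf holds a full header, -1 otherwise.
 * Multi-byte fields are copied as-is, i.e. still in network byte order. */
int parse_icmp_min_hdr(const char *buf, size_t nread, struct icmp_min_hdr *hdr)
{
    if (nread < sizeof *hdr)
        return -1;
    memcpy(hdr, buf, sizeof *hdr);   /* hdr is properly aligned; buf need not be */
    return 0;
}
```

The compiler places hdr at a suitably aligned address, and memcpy() does not care where buf starts, so this works even when the header begins at an odd offset.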
I have several structures defined to send between different operating systems (over TCP).
Defined structures are:
struct Struct1 { uint32_t num; char str[10]; char str2[10];};
struct Struct2 { uint16_t num; char str[10];};
typedef Struct1 a;
typedef Struct2 b;
The data is stored in a text file.
Data Format is as such:
123
Pie
Crust
Struct1 a is stored as 3 separate parameters. However, Struct2 is two separate parameters, with both the 2nd and 3rd lines stored in char str[]. The problem is that when I write to a server across the networks, the data is not received correctly. There are numerous spaces separating the different parameters in the structures. How do I ensure proper sending and padding when I write to the server? How do I store the data correctly (dynamic buffer or fixed buffer)?
Example of write: write(fd,&a, sizeof(typedef struct a)); Is this correct?
Problem Receive Side Output for struct2:
123( , )
0 (, Pie)
0 (Crust,)
Correct Output
123(Pie, Crust)
write(fd,&a, sizeof(a)); is not correct; at least not portably, since the C compiler may introduce padding between the elements to ensure correct alignment. sizeof(typedef struct a) doesn't even make sense.
How you should send the data depends on the specs of your protocol. In particular, protocols define widely varying ways of sending strings. It is generally safest to send the struct members separately, either with multiple calls to write or with a single writev(2). For instance, to send
struct { uint32_t a; uint16_t b; } foo;
over the network, where foo.a and foo.b already have the correct endianness, you would do something like:
struct iovec v[2];
v[0].iov_base = &foo.a;
v[0].iov_len = sizeof(uint32_t);
v[1].iov_base = &foo.b;
v[1].iov_len = sizeof(uint16_t);
writev(fp, v, 2);
Sending structures over the network is tricky. You might run into the following problems:
(1) Byte endianness issues with integers.
(2) Padding introduced by your compiler.
(3) String parsing (i.e. detecting string boundaries).
If performance is not your goal, I'd suggest creating encoders and decoders for each struct to be sent and received (ASN.1, XML or custom). If performance is really required you can still use structures and solve (1) by fixing an endianness (i.e. network byte order) and ensuring your integers are stored as such in those structures, and (2) by fixing a compiler and using the pragmas or attributes to enforce a "packed" structure.
GCC, for example, uses __attribute__((packed)) as such:
struct mystruct {
uint32_t a;
uint16_t b;
unsigned char text[24];
} __attribute__((__packed__));
(3) is not easy to solve. Using null-terminated strings in a network protocol
and depending on the terminator being present would make your code vulnerable to several attacks. If strings need to be involved I'd use a proper encoding method such as the ones suggested above.
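If you do end up with a fixed-size text field like the one above, a cheap safeguard on the receiving side is to check for the terminator before treating the bytes as a C string. A minimal sketch (checked_field_length is a made-up name):

```c
#include <stddef.h>
#include <string.h>

/* Returns the string length if field[0..size) contains a '\0', or -1 if the
 * sender left the terminator out. */
long checked_field_length(const unsigned char *field, size_t size)
{
    const unsigned char *nul = memchr(field, '\0', size);
    return nul ? (long)(nul - field) : -1;
}
```

Only call strlen(), strcpy(), printf("%s") and friends on the field after this returns a non-negative value; otherwise a malicious peer can make you read past the end of the struct.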
The easy way would be to write two functions for each structure: one to convert from textual representation to the struct and one to convert a struct back to text. Then you just send the text over the network and on the receiving side convert it to your structures. That way endianness does not matter.
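For the question's Struct2 this could look roughly like the sketch below. The one-line "<num> <str>" text format and the function names are assumptions made for the example, and it only works if str is NUL-terminated and contains no whitespace.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct Struct2 { uint16_t num; char str[10]; };

/* Format as "<num> <str>\n"; assumes str is NUL-terminated and has no
 * whitespace. Returns the text length, or -1 if out is too small. */
int struct2_to_text(const struct Struct2 *s, char *out, size_t cap)
{
    int n = snprintf(out, cap, "%" PRIu16 " %s\n", s->num, s->str);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Parse the text produced above. Returns 0 on success, -1 on malformed input. */
int struct2_from_text(const char *text, struct Struct2 *s)
{
    /* %9s leaves room for the terminator in str[10]. */
    return sscanf(text, "%" SCNu16 " %9s", &s->num, s->str) == 2 ? 0 : -1;
}
```

The sender writes the returned number of bytes to the socket; the receiver reads up to the newline and passes the line to struct2_from_text().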
There are conversion functions to ensure portability of binary integers across a network. Use htons, htonl, ntohs and ntohl to convert 16 and 32 bit integers from host to network byte order and vice versa.
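A small sketch of those functions in use, packing one 16-bit and one 32-bit value into a 6-byte buffer and reading them back. The helper names pack_u16_u32 and unpack_u16_u32 are invented for the example.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs, ntohl */
#include <stdint.h>
#include <string.h>

/* Store a 16-bit and a 32-bit value into out in network byte order. */
void pack_u16_u32(uint8_t out[6], uint16_t a, uint32_t b)
{
    uint16_t na = htons(a);
    uint32_t nb = htonl(b);
    memcpy(out, &na, sizeof na);
    memcpy(out + 2, &nb, sizeof nb);
}

/* Read the two values back and convert them to host byte order. */
void unpack_u16_u32(const uint8_t in[6], uint16_t *a, uint32_t *b)
{
    uint16_t na;
    uint32_t nb;
    memcpy(&na, in, sizeof na);
    memcpy(&nb, in + 2, sizeof nb);
    *a = ntohs(na);
    *b = ntohl(nb);
}
```

Whatever the host's endianness, the six bytes on the wire come out most significant byte first, so both ends agree.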