I have a struct that I am sending to a UDP socket:
typedef struct
{
char field_id;
short field_length;
char* field;
} field_t, *field_p;
I am able to read the field_id and field_length once received on the UDP server-side, however the pointer to field is invalid as expected.
What is the best method to properly send and receive a dynamic char*?
I have a basic solution using memcpy on the client side:
char* data =
(char*)malloc(sizeof(field_t) + strlen(my_field->field) + 1); /* +1 for the '\0' copied below */
memcpy(data, my_field, sizeof(field_t));
memcpy(data+sizeof(field_t), my_field->field, strlen(my_field->field) + 1);
And on the server side:
field_p data = (field_p)buffer;
field_string = (char*)buffer+sizeof(field_t);
Is there a cleaner way of doing this or is this the only way?
Thanks.
You cannot, of course, send a pointer over a socket: get rid of the char* field; member. Instead, send the id and size pair followed by the data itself. Use writev(2) or sendmsg(2) to avoid copying data from buffer to buffer.
Watch out for structure member alignment and padding and number endianness.
Serialization is your friend.
Define your structure as:
typedef struct
{
uint8_t field_id;
uint16_t field_length;
char field[0]; // note: in C99 you could use char field[];
} field_t, *field_p;
Then, text buffer will immediately follow your structure. Just remember a few tricks:
// initialize structure
field_t *
field_init (uint8_t id, uint16_t len, const char *txt)
{
field_t *f = malloc (sizeof (field_t) + len); // note "+ len"
f->field_id = id;
f->field_length = len;
memcpy (f->field, txt, len);
return f;
}
// send structure
int
field_send (field_t *f, int fd)
{
return write (fd, f, sizeof (*f) + f->field_length); // note "+ f->field_length"
}
The zero-length array is not standard C, though. It is an extension that GCC supports and that MSVC accepts as a nonstandard extension (with a warning); the standard spelling is the C99 flexible array member, char field[];. If your compiler supports neither, you can use a one-element char array; just remember to subtract the extra byte when calculating the packet size.
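On the receiving side it is worth checking the datagram length before trusting field_length. Here is a minimal sketch, assuming the sender and receiver share the same struct layout and byte order (which the raw-struct approach above already requires); field_view is a made-up helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  field_id;
    uint16_t field_length;
    char     field[];            /* C99 flexible array member */
} field_t;

/* Hypothetical helper: returns a pointer to the text inside buf, or NULL
   if the datagram is too short for the header or for the length the
   header claims. The text is NOT NUL-terminated. */
const char *field_view(const void *buf, size_t received,
                       uint8_t *id, uint16_t *len)
{
    const size_t header = offsetof(field_t, field);
    field_t hdr;

    if (received < header)
        return NULL;
    memcpy(&hdr, buf, header);           /* avoids misaligned access */
    if (received - header < hdr.field_length)
        return NULL;

    *id  = hdr.field_id;
    *len = hdr.field_length;
    return (const char *)buf + header;
}
```

You would call it with the buffer and byte count returned by recvfrom(); a NULL result means the packet should be dropped.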
I am trying to copy a byte array to my struct, then serialize my struct to a byte array again.
But after I serialize my struct, I can't get my data values (0x12, 0x34, 0x56) back; instead I get garbage.
What is wrong here?
#pragma pack(push, 1)
typedef struct {
uint8_t length;
uint8_t *data;
} Tx_Packet;
#pragma pack(pop)
static void create_tx_packet(uint8_t *packet, uint8_t *src, int length);
int main(void)
{
uint8_t packet[32];
uint8_t data[] = { 0x12, 0x34, 0x56 };
create_tx_packet(packet, data, 3);
//i check using debugger, i cant get the data value correctly
//but i could get length value correctly
return 0;
}
static void create_tx_packet(uint8_t *packet, uint8_t *src, int length)
{
Tx_Packet *tx_packet = malloc(sizeof(*tx_packet ));
tx_packet->length = length;
tx_packet->data = (uint8_t *)malloc(length);
memcpy(tx_packet->data, src, length);
memcpy(packet, tx_packet, sizeof(*tx_packet));
}
Right now, your create_tx_packet() function copies a Tx_Packet struct created in the function to a uint8_t array. That struct contains the length and a pointer to the data, but not the data itself. It's actually not necessary to use the struct as an intermediate step at all, particularly for such a simple packet, so you could instead do:
static void create_tx_packet(uint8_t *packet, uint8_t *src, int length)
{
*packet = length; /* set (first) uint8_t pointed to by packet to the
length */
memcpy(packet + 1, src, length); /* copy length bytes from src to
the 2nd and subsequent bytes of
packet */
}
You still need to make sure packet points to enough space (at least length + 1 bytes) for everything (which it does). Since the version above doesn't dynamically allocate anything, it also fixes the memory leaks in your original (which should have freed tx_packet->data and tx_packet before exiting).
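For completeness, the reverse direction for this [length][data] layout might look like the sketch below. parse_tx_packet is a hypothetical name, and the bounds checks are my own addition rather than part of the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical inverse of create_tx_packet(): copies the payload into dst
   and returns its length, or -1 if the packet is truncated or dst is too
   small. */
int parse_tx_packet(const uint8_t *packet, size_t packet_size,
                    uint8_t *dst, size_t dst_size)
{
    uint8_t length;

    if (packet_size < 1)
        return -1;                       /* no room for the length byte */
    length = packet[0];
    if ((size_t)length + 1 > packet_size || length > dst_size)
        return -1;
    memcpy(dst, packet + 1, length);     /* data starts at the 2nd byte */
    return length;
}
```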
--
If you do want to use a struct, you can (since the data is at the end) change your struct to use an array instead of a pointer for data -- then extra space past the size of the struct can be used for the data, and accessed through the data array in the struct. The struct might be:
typedef struct {
uint8_t length;
uint8_t data[];
} Tx_Packet;
and the function becomes (if a temporary struct is used):
static void create_tx_packet(uint8_t *packet, uint8_t *src, int length)
{
/* allocate the temporary struct, with extra space at the end for the
data */
Tx_Packet *tx_packet = malloc(sizeof(Tx_Packet)+length);
/* fill the struct (set length, copy data from src) */
tx_packet->length = length;
memcpy(tx_packet->data, src, length);
/* copy the struct and following data to the output array */
memcpy(packet, tx_packet, sizeof(Tx_Packet) + length);
/* and remember to free our temporary struct/data */
free(tx_packet);
}
Rather than allocate a temporary struct, though, you could also use struct pointer to access the byte array in packet directly and avoid the extra memory allocation:
static void create_tx_packet(uint8_t *packet, uint8_t *src, int length)
{
/* Set a Tx_Packet pointer to point at the output array */
Tx_Packet *tx_packet = (Tx_Packet *)packet;
/* Fill out the struct as before, but this time directly into the
output array so we don't need to allocate and copy so much */
tx_packet->length = length;
memcpy(tx_packet->data, src, length);
}
If you use memcpy(packet, tx_packet, sizeof(*tx_packet)); you are copying the in-memory representation of the Tx_Packet struct into packet, starting with tx_packet->length.
Additionally, when allocating tx_packet, the size should be sizeof(*packet)+sizeof(uint8_t) (length of packet plus length field).
And again when copying the tx_packet back to packet you are writing out of the boundaries of packet.
EDIT:
I forgot to mention that, depending on your compiler's memory-alignment settings, fields (including tx_packet->length) may be padded to any width to speed up memory access. On a 32-bit machine the length field could occupy 4 bytes, with the padding bytes holding garbage.
When you serialize your struct with
memcpy(packet, tx_packet, sizeof(*tx_packet));
you're copying the length and the pointer to the data, but not the data itself. You'll probably need two memcpy calls: one of sizeof(uint8_t) to copy the length field, and one of length to copy the data.
This line:
Tx_Packet *tx_packet = malloc(sizeof(*packet));
only allocates one byte for the packet header, which you then immediately write off the end of, causing undefined behavior. You probably meant
Tx_Packet *tx_packet = malloc(sizeof(*tx_packet));
I have a struct object that comprises of several primitive data types, pointers and struct pointers. I want to send it over a socket so that it can be used at the other end. As I want to pay the serialization cost upfront, how do I initialize an object of that struct so that it can be sent immediately without marshalling? For example
struct A {
int i;
struct B *p;
};
struct B {
long l;
char *s[0];
};
struct A *obj;
// how do I initialize obj?
int len = sizeof(struct A) + sizeof(struct B) + sizeof(?);
obj = (struct A *) malloc(len);
...
write(socket, obj, len);
// on the receiver end, I want to do this
char buf[len];
read(socket, buf, len);
struct A *obj = (struct A *)buf;
int i = obj->i;
char *s = obj->p->s[0];
Thank you.
The simplest way to do this may be to allocate a chunk of memory to hold everything. For instance, consider a struct as follows:
typedef struct A {
int v;
char* str;
} our_struct_t;
Now, the simplest way to do this is to create a defined format and pack it into an array of bytes. I will try to show an example:
int sLen = 0;
int tLen = 0;
char* serialized = 0;
char* metadata = 0;
char* xval = 0;
char* xstr = 0;
our_struct_t x;
x.v = 10;
x.str = "Our String";
sLen = strlen(x.str); // Assuming null-terminated (which ours is)
tLen = sizeof(int) + sLen; // Our struct has an int and a string - we want the whole string not a mem addr
serialized = malloc(sizeof(char) * (tLen + sizeof(int))); // We have an additional sizeof(int) for metadata - this will hold our string length
metadata = serialized;
xval = serialized + sizeof(int);
xstr = xval + sizeof(int);
*((int*)metadata) = sLen; // Pack our metadata
*((int*)xval) = x.v; // Our "v" value (1 int)
strncpy(xstr, x.str, sLen); // A full copy of our string
So this example copies the data into an array of size 2 * sizeof(int) + sLen which allows us a single integer of metadata (i.e. string length) and the extracted values from the struct. To deserialize, you could imagine something as follows:
char* serialized = // Assume we have this
char* metadata = serialized;
char* yval = metadata + sizeof(int);
char* ystr = yval + sizeof(int);
our_struct_t y;
int sLen = *((int*)metadata);
y.v = *((int*)yval);
y.str = malloc((sLen + 1) * sizeof(char)); // +1 to null-terminate
strncpy(y.str, ystr, sLen);
y.str[sLen] = '\0';
As you can see, our array of bytes is well-defined. Below I have detailed the structure:
Bytes 0-3 : Meta-data (string length)
Bytes 4-7 : X.v (value)
Bytes 8 to 8+sLen-1 : X.str (value)
This kind of well-defined structure allows you to recreate the struct on any environment if you follow the defined convention. To send this structure over the socket, now, depends on how you develop your protocol. You can first send an integer packet containing the total length of the packet which you just constructed, or you can expect that the metadata is sent first/separately (logically separately, this technically can still all be sent at the same time) and then you know how much data to receive on the client-side. For instance, if I receive metadata value of 10 then I can expect sizeof(int) + 10 bytes to follow to complete the struct. In general, this is probably 14 bytes.
EDIT
I will list some clarifications as requested in the comments.
I do a full copy of the string so it is in (logically) contiguous memory. That is, all the data in my serialized packet is actually full data - there are no pointers. This way, we can send a single buffer (we call it serialized) over the socket. If we simply send the pointer, the user receiving the pointer would expect that pointer to be a valid memory address. However, it is unlikely that your memory addresses will be exactly the same. Even if they are, however, he will not have the same data at that address as you do (except in very limited and specialized circumstances).
Hopefully this point is made more clear by looking at the deserialization process (this is on the receiver's side). Notice how I allocate a struct to hold the information sent by the sender. If the sender did not send me the full string but instead only the memory address, I could not actually reconstruct the data which was sent (even on the same machine we have two distinct virtual memory spaces which are not the same). So in essence, a pointer is only a good mapping for the originator.
Finally, as far as "structs within structs" go, you will need to have several functions for each struct. That said, it is possible that you can reuse the functions. For instance, if I have two structs A and B where A contains B, I can have two serialize methods:
char* serializeB()
{
// ... Do serialization
}
char* serializeA()
{
char* B = serializeB();
// ... Either add on to serialized version of B or do some other modifications to combine the structures
}
So you should be able to get away with a single serialization method for each struct.
This answer sets aside the problems with your malloc call.
Unfortunately, you cannot find a nice trick that would still be compatible with the standard. The only way of properly serializing a structure is to separately dissect each element into bytes, write them to an unsigned char array, send them over the network and put the pieces back together on the other end. In short, you would need a lot of shifting and bitwise operations.
In certain cases you would need to define a kind of protocol. In your case for example, you need to be sure you always put the object p is pointing to right after struct A, so once recovered, you can set the pointer properly. Did everyone say enough already that you can't send pointers through network?
Another protocol-like thing you may want to do is to write the size allocated for the flexible array member s in struct B. Whatever layout you choose for your serialized data, both sides obviously have to respect it.
It is important to note that you cannot rely on anything machine specific such as order of bytes, structure paddings or size of basic types. This means that you should serialize each field of the element separately and assign them fixed number of bytes.
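To make the "shifting and bitwise operations" concrete, here is a small sketch that writes each field with a fixed width and byte order. The wire layout (two 32-bit values followed by a length-prefixed string) and the names pack_record/unpack_record are assumptions for illustration; a real struct B with several strings would repeat the length-prefixed part once per string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v as 4 bytes, most significant first, regardless of host order. */
static unsigned char *put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
    return p + 4;
}

static const unsigned char *get_u32(const unsigned char *p, uint32_t *v)
{
    *v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return p + 4;
}

/* Assumed layout: [i:4][l:4][slen:4][slen bytes of s, no terminator].
   Returns the number of bytes written, or 0 if out is too small. */
size_t pack_record(unsigned char *out, size_t cap,
                   uint32_t i, uint32_t l, const char *s)
{
    size_t slen = strlen(s);

    if (slen > 0xFFFFFFFFu || cap < 12 || cap - 12 < slen)
        return 0;
    out = put_u32(out, i);
    out = put_u32(out, l);
    out = put_u32(out, (uint32_t)slen);
    memcpy(out, s, slen);
    return 12 + slen;
}

/* Returns 0 on success; s receives a NUL-terminated copy of the string. */
int unpack_record(const unsigned char *in, size_t n,
                  uint32_t *i, uint32_t *l, char *s, size_t scap)
{
    uint32_t slen;

    if (n < 12)
        return -1;
    in = get_u32(in, i);
    in = get_u32(in, l);
    in = get_u32(in, &slen);
    if (slen > n - 12 || slen >= scap)
        return -1;
    memcpy(s, in, slen);
    s[slen] = '\0';
    return 0;
}
```

Nothing here depends on sizeof(int), padding, or host endianness, which is the point being made above.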
You should serialize the data in a platform independent way.
Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj));
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys. There is also support for lists. And you can insert a structure inside another (nested structures). eg:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);
Interpret your data and understand what you want to serialize. You want to serialize an integer and a structure of type B (recursively, you want to serialize an int, a long, and an array of strings). Then serialize them. The length you need is sizeof(int) + sizeof(long) + ∑(strlen(s[i])+1).
On the other hand, serialization is a solved problem (multiple times actually). Are you sure you need to hand write a serialization routine ? Why don't you use D-Bus or a simple RPC call ? Please consider using them.
I tried the method provided by @RageD but it didn't work.
The int value I got from deserialization was not the original one.
For me, memcpy() works for non-string variables. (You can still use strcpy() for char *.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct A {
int a;
char *str;
} test_struct_t;
char *serialize(test_struct_t t) {
int str_len = strlen(t.str);
int size = 2 * sizeof(int) + str_len;
char *buf = malloc(sizeof(char) * (size+1));
memcpy(buf, &t.a, sizeof(int));
memcpy(buf + sizeof(int), &str_len, sizeof(int));
memcpy(buf + sizeof(int) * 2, t.str, str_len);
buf[size] = '\0';
return buf;
}
test_struct_t deserialize(char *buf) {
test_struct_t t;
memcpy(&t.a, buf, sizeof(int));
int str_len;
memcpy(&str_len, buf+sizeof(int), sizeof(int));
t.str = malloc(sizeof(char) * (str_len+1));
memcpy(t.str, buf+2*sizeof(int), str_len);
t.str[str_len] = '\0';
return t;
}
int main() {
char str[15] = "Hello, world!";
test_struct_t t;
t.a = 123;
t.str = malloc(strlen(str) + 1);
strcpy(t.str, str);
printf("original values: %d %s\n", t.a, t.str);
char *buf = serialize(t);
test_struct_t new_t = deserialize(buf);
printf("new values: %d %s\n", new_t.a, new_t.str);
return 0;
}
And the output of the code above is:
original values: 123 Hello, world!
new values: 123 Hello, world!
@Shahbaz is right. I would think you actually want this:
int len = sizeof(struct A);
obj = (struct A *) malloc(len);
But also you will run into problems when sending a pointer to another machine as the address the pointer points to means nothing on the other machine.
I looked at couple of instances wherein I see something like char fl[1] in the following code snippet. I am not able to guess what might possibly be the use of such construct.
struct test
{
int i;
double j;
char fl[1];
};
int main(int argc, char *argv[])
{
struct test a,b;
a.i=1;
a.j=12;
a.fl[0]='c';
b.i=2;
b.j=24;
memcpy(&(b.fl), "test1" , 6);
printf("%lu %lu\n", sizeof(a), sizeof(b));
printf("%s\n%s\n",a.fl,b.fl);
return 0;
}
output -
24 24
c<some junk characters here>
test1
It's called "the struct hack", and you can read about it at the C FAQ. The general idea is that you allocate more memory than necessary for the structure as listed, and then use the array at the end as if it had length greater than 1.
There's no need to use this hack anymore though, since it's been replaced by C99+ flexible array members.
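For comparison, a minimal sketch of the C99 flexible array member form. The struct name fam_message and the helper message_new are made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct fam_message {
    uint16_t len;      /* number of payload bytes that follow */
    uint16_t type;
    char payload[];    /* C99 flexible array member: adds no size itself */
};

/* Hypothetical constructor: one allocation holds header and payload. */
struct fam_message *message_new(uint16_t type, const void *data, uint16_t len)
{
    struct fam_message *m = malloc(sizeof *m + len);

    if (m == NULL)
        return NULL;
    m->len = len;
    m->type = type;
    if (len > 0)
        memcpy(m->payload, data, len);
    return m;
}
```

Unlike the one-element array version, there is no extra byte to subtract when computing sizes, and indexing past payload[0] is well defined as long as it stays inside the allocation.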
The idea usually is to have a name for variable-size data, like a packet read off a socket:
struct message {
uint16_t len; /* tells length of the message */
uint16_t type; /* tells type of the message */
char payload[1]; /* placeholder for message data */
};
Then you cast your buffer to such struct, and work with the data by indexing into the array member.
Note that the code you have written is overwriting memory that you shouldn't be touching. The memcpy() is writing more than one character into a one character array.
The use case for this is often more like this:
struct test *obj;
obj = malloc(sizeof(struct test) + 300); // 300 characters of space in the
// flexible member (the array).
obj->i = 3;
obj->j = 300;
snprintf(obj->fl, 300, "hello!");
I have the following function in C, which returns an enum:
enum enum_type GetInfo (int socket, unsigned char *data)
{
}
and in the API documentation I can find this:
Received data is written to pointer *data....
So if I'm doing something like this:
unsigned char *data;
enum_type enum1;
enum1 = GetInfo (socket, data);
I get a segmentation fault.
What's my problem?
Thanks,
Simon
Your problem is that you haven't allocated space for data but try to write to it. Do
unsigned char *data = malloc(sizeof(unsigned char) * MYBUFLENGTH);
and then pass data to GetInfo. At the end do not forget to
free(data);
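Alternatively you could allocate the buffer on the stack. With a compile-time constant size, as below, this works in every version of C; only a size computed at run time needs a C99 variable-length array (which some compilers also support as an extension with earlier versions of the C Standard).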
unsigned char data[MYBUFLENGTH];
In this case you should not worry about memory management.
You need to allocate memory to store the data in.
For instance:
unsigned char data[10000]; /* allocate 10000 bytes */
enum_type enum1;
enum1 = GetInfo(socket, data);
If you don't understand what's going on, I recommend spending time to read up on pointers.
I'm not sure how big the info is, but try the following:
unsigned char data[512] = {0};
enum_type enum1;
enum1 = GetInfo (socket, data);
This makes sure that data points to a valid memory address on the stack.
It's probably because GetInfo wants to write to the buffer pointed to by data, and you just pass the pointer without allocating any memory for it. Allocate memory and point data at it like this:
// I assume you need 1000 bytes
data = (unsigned char*)malloc(1000*sizeof(unsigned char));
I am trying to pass whole structure from client to server or vice-versa. Let us assume my structure as follows
struct temp {
int a;
char b;
};
I am using sendto, passing the address of the structure variable, and receiving it on the other side using the recvfrom function. But I am not able to get the original data back on the receiving end. In the recvfrom call I am saving the received data into a variable of type struct temp.
n = sendto(sock, &pkt, sizeof(struct temp), 0, &server, length);
n = recvfrom(sock, &pkt, sizeof(struct temp), 0, (struct sockaddr *)&from,&fromlen);
Where pkt is the variable of type struct temp.
Even though I am receiving 8 bytes of data, if I try to print it, it simply shows garbage values. Any help with a fix?
NOTE: No third party Libraries have to be used.
EDIT1: I am really new to this serialization concept. But can't I send a structure via sockets without doing serialization?
EDIT2: When I try to send a string or an integer variable using the sendto and recvfrom functions I am receiving the data properly at receiver end. Why not in the case of a structure? If I don't have to use serializing function then should I send each and every member of the structure individually? This really is not a suitable solution since if there are 'n' number of members then there are 'n' number of lines of code added just to send or receive data.
This is a very bad idea. Binary data should always be sent in a way that:
Handles different endianness
Handles different padding
Handles differences in the byte-sizes of intrinsic types
Don't ever write a whole struct in a binary way, not to a file, not to a socket.
Always write each field separately, and read them the same way.
You need to have functions like
unsigned char * serialize_int(unsigned char *buffer, int value)
{
/* Write big-endian int value into buffer; assumes 32-bit int and 8-bit char. */
buffer[0] = value >> 24;
buffer[1] = value >> 16;
buffer[2] = value >> 8;
buffer[3] = value;
return buffer + 4;
}
unsigned char * serialize_char(unsigned char *buffer, char value)
{
buffer[0] = value;
return buffer + 1;
}
unsigned char * serialize_temp(unsigned char *buffer, const struct temp *value)
{
buffer = serialize_int(buffer, value->a);
buffer = serialize_char(buffer, value->b);
return buffer;
}
unsigned char * deserialize_int(unsigned char *buffer, int *value);
Or the equivalent, there are of course several ways to set this up with regards to buffer management and so on. Then you need to do the higher-level functions that serialize/deserialize entire structs.
This assumes serializing is done to/from buffers, which means the serialization doesn't need to know if the final destination is a file or a socket. It also means you pay some memory overhead, but it's generally a good design for performance reasons (you don't want to do a write() of each value to the socket).
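Only the prototype of deserialize_int() is given above, so here is one possible sketch of the reading side, mirroring the big-endian layout of serialize_int(). Like the serializer it assumes a 32-bit int; deserialize_char and deserialize_temp are names I made up by analogy.

```c
#include <stdint.h>

struct temp {
    int a;
    char b;
};

/* Read a big-endian 32-bit value; returns a pointer past the bytes read. */
unsigned char *deserialize_int(unsigned char *buffer, int *value)
{
    uint32_t u = ((uint32_t)buffer[0] << 24) | ((uint32_t)buffer[1] << 16) |
                 ((uint32_t)buffer[2] << 8)  |  (uint32_t)buffer[3];

    *value = (int)u;   /* assumes a 32-bit two's-complement int */
    return buffer + 4;
}

unsigned char *deserialize_char(unsigned char *buffer, char *value)
{
    *value = (char)buffer[0];
    return buffer + 1;
}

/* Fields are read in exactly the order serialize_temp() wrote them. */
unsigned char *deserialize_temp(unsigned char *buffer, struct temp *value)
{
    buffer = deserialize_int(buffer, &value->a);
    buffer = deserialize_char(buffer, &value->b);
    return buffer;
}
```

As with the serializers, there is no bounds checking here; the caller should make sure at least 5 bytes were actually received before calling deserialize_temp().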
Once you have the above, here's how you could serialize and transmit a structure instance:
int send_temp(int socket, const struct sockaddr *dest, socklen_t dlen,
const struct temp *temp)
{
unsigned char buffer[32], *ptr;
ptr = serialize_temp(buffer, temp);
return sendto(socket, buffer, ptr - buffer, 0, dest, dlen) == ptr - buffer;
}
A few points to note about the above:
The struct to send is first serialized, field by field, into buffer.
The serialization routine returns a pointer to the next free byte in the buffer, which we use to compute how many bytes it serialized to
Obviously my example serialization routines don't protect against buffer overflow.
Return value is 1 if the sendto() call succeeded, else it will be 0.
Using the 'pragma' pack option did solve my problem, but I am not sure whether it has any dependencies.
#pragma pack(push, 1) // this packs the struct to 5 bytes (assuming a 4-byte int)
struct packet {
int i;
char j;
};
#pragma pack(pop) // restore the previous packing
Then the following lines of code worked out fine without any problem
n = sendto(sock,&pkt,sizeof(struct packet),0,&server,length);
n = recvfrom(sock, &pkt, sizeof(struct packet), 0, (struct sockaddr *)&from, &fromlen);
There is no need to write your own serialisation routines for 16-bit and 32-bit integer types - use the POSIX functions htons()/htonl() (and ntohs()/ntohl() on the receiving side).
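A short sketch of how that might look for a small header; pack_header and unpack_header are hypothetical names, and the 16-bit type plus 32-bit length layout is just an example.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes [type:2][length:4] in network byte order; returns bytes written. */
size_t pack_header(unsigned char *out, uint16_t type, uint32_t length)
{
    uint16_t t = htons(type);
    uint32_t l = htonl(length);

    memcpy(out, &t, sizeof t);            /* memcpy avoids alignment issues */
    memcpy(out + sizeof t, &l, sizeof l);
    return sizeof t + sizeof l;
}

void unpack_header(const unsigned char *in, uint16_t *type, uint32_t *length)
{
    uint16_t t;
    uint32_t l;

    memcpy(&t, in, sizeof t);
    memcpy(&l, in + sizeof t, sizeof l);
    *type = ntohs(t);
    *length = ntohl(l);
}
```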
If you don't want to write the serialisation code yourself, find a proper serialisation framework, and use that.
Maybe Google's protocol buffers would be possible?
Serialization is a good idea. You can also use Wireshark to monitor the traffic and understand what is actually passed in the packets.
Instead of serialising and depending on 3rd party libraries, it's easy to come up with a primitive protocol using tag, length and value.
Tag: 32 bit value identifying the field
Length: 32 bit value specifying the length in bytes of the field
Value: the field
Concatenate as required. Use enums for the tags. And use network byte order...
Easy to encode, easy to decode.
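A minimal sketch of such a tag/length/value encoder and decoder, assuming 32-bit tags and lengths in network byte order as described. The names tlv_put, tlv_get and the enum values are made up for illustration.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum field_tag { TAG_ID = 1, TAG_NAME = 2 };   /* hypothetical tags */

/* Appends one record at out; returns bytes written, or 0 if it won't fit. */
size_t tlv_put(unsigned char *out, size_t cap,
               uint32_t tag, const void *value, uint32_t length)
{
    uint32_t t = htonl(tag), l = htonl(length);

    if (cap < 8 || cap - 8 < length)
        return 0;
    memcpy(out, &t, 4);
    memcpy(out + 4, &l, 4);
    if (length > 0)
        memcpy(out + 8, value, length);
    return 8 + (size_t)length;
}

/* Reads one record from in; returns bytes consumed, or 0 if truncated.
   *value points into in, so nothing is copied. */
size_t tlv_get(const unsigned char *in, size_t n,
               uint32_t *tag, const unsigned char **value, uint32_t *length)
{
    uint32_t t, l;

    if (n < 8)
        return 0;
    memcpy(&t, in, 4);
    memcpy(&l, in + 4, 4);
    t = ntohl(t);
    l = ntohl(l);
    if (n - 8 < l)
        return 0;
    *tag = t;
    *length = l;
    *value = in + 8;
    return 8 + (size_t)l;
}
```

To concatenate fields, call tlv_put() repeatedly, advancing the output pointer by each return value; the decoder walks the buffer the same way and can skip tags it does not recognise.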
Also, if you use TCP, remember that it is a stream of data, so if you send e.g. 3 packets you will not necessarily receive 3 packets. They may be "merged" into a stream depending on TCP_NODELAY/Nagle's algorithm, amongst other things, and you may get them all in one recv... You need to delimit the data, for example using RFC 1006.
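One common way to delimit messages on a stream is a length prefix. The sketch below uses a plain 4-byte big-endian length, which is simpler than (and not the same as) the RFC 1006 header; read_exact and recv_frame are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Keep reading until exactly n bytes have arrived; 0 on success. */
static int read_exact(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;

    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r == 0)
            return -1;                   /* peer closed mid-message */
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Reads [length:4, big-endian][body]; returns a malloc'd body (caller
   frees) or NULL on error or if length exceeds max_len. */
unsigned char *recv_frame(int fd, uint32_t max_len, uint32_t *out_len)
{
    unsigned char hdr[4], *body;
    uint32_t len;

    if (read_exact(fd, hdr, sizeof hdr) != 0)
        return NULL;
    len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
          ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    if (len > max_len)
        return NULL;                     /* refuse absurd sizes */
    body = malloc(len ? len : 1);
    if (body == NULL)
        return NULL;
    if (read_exact(fd, body, len) != 0) {
        free(body);
        return NULL;
    }
    *out_len = len;
    return body;
}
```

The loop in read_exact() is what copes with a message arriving in several pieces or several messages arriving in one read.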
UDP is easier in this respect: you'll receive a distinct packet for each packet that arrives, but it's a lot less reliable, since packets can be lost, duplicated or reordered.
If the format of the data you want to transfer is very simple then converting to and from an ANSI string is simple and portable.
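For the two-member struct from the question, a text encoding could be as simple as the sketch below. The "a b\n" line format and the function names are assumptions for illustration; the char is sent as its numeric code so the line stays printable.

```c
#include <stddef.h>
#include <stdio.h>

struct temp {
    int a;
    char b;
};

/* Formats t as "<a> <b as number>\n"; returns the length, or -1 if the
   text does not fit in out. */
int temp_to_text(const struct temp *t, char *out, size_t cap)
{
    int n = snprintf(out, cap, "%d %d\n", t->a, (int)t->b);

    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Parses the format above; returns 0 on success, -1 on malformed input. */
int temp_from_text(const char *in, struct temp *t)
{
    int a, b;

    if (sscanf(in, "%d %d", &a, &b) != 2)
        return -1;
    t->a = a;
    t->b = (char)b;
    return 0;
}
```

The text is larger than the binary form, but it has no padding or endianness concerns and is easy to inspect in a packet capture.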