Basically, I have a custom structure that contains different kinds of data. For example:
typedef struct example_structure{
uint8_t* example_1[4];
int example_2[4];
int example_3;
} example_structure;
What I need to do is copy the contents of this structure to a const char* buffer so I can send that copied data (buffer) using winsock2's send(SOCKET s, const char* buffer, int len, int flags) function. I tried using memcpy(), but wouldn't I just copy the addresses held in the pointers and not the data?
Yes. If you copied or sent that structure through a socket, you would end up copying or sending the pointers themselves, which are meaningless to the recipient. Beyond that, if the recipient is running on different hardware (e.g. with a different endianness), all of the data may be misinterpreted anyway. On top of that, differences in the amount of padding between structure members may also become a problem.
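To make the padding and endianness points concrete, here is a minimal sketch. The struct padded_demo and both helper functions are hypothetical names introduced only for illustration; the values they report depend on the compiler and CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct, used only to make padding visible. */
struct padded_demo {
    uint8_t  tag;    /* 1 byte */
    uint32_t value;  /* usually aligned to 4, so padding often precedes it */
};

/* Number of padding bytes the compiler added to the struct. */
size_t padding_bytes(void)
{
    return sizeof(struct padded_demo) - (sizeof(uint8_t) + sizeof(uint32_t));
}

/* Returns 1 when the lowest-order byte of an integer is stored first. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On many common ABIs padding_bytes() is 3, but nothing in the C standard fixes that number, which is why the raw bytes of a struct are a poor wire format.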
For non-trivial situations it is best to use an existing protocol (such as protobuf), or roll your own protocol, keeping in mind the potential differences in hardware representation of your data.
You need to design a protocol before you can encode the data in accord with that protocol. Decide exactly how the data will be encoded at the byte level. Then write code to encode and decode to that format that you decided on.
Do not skip the step of actually documenting the wire protocol at the byte level. It will save you pain later, I promise.
See this answer for a bit more detail.
const char* buffer
This buffer is declared const, so you can't copy anything into it through that pointer. You probably don't need to copy anything, though. Just call send like this (here ex is a variable of type example_structure):
send(s, (const char *)&ex, sizeof(ex), flags);
But there is still the problem with the pointers in your structure (uint8_t* example_1[4];).
Sending pointers between different applications or machines does not make sense.
Hmm, your struct contains uint8_t * fields, which look like C strings... It does not make sense to copy or send a pointer, which is merely a memory address in the sending process's user space.
If your struct had been (note, no pointers):
typedef struct example_structure{
uint8_t example_1[4];
int example_2[4];
int example_3;
} example_structure;
and provided you transfer it on exactly same architecture (same hardware, same compiler, same compiler options), you could do simply:
example_structure ex_struc;
// initialize the struct
...
send(s, (const char *)&ex_struc, sizeof(ex_struc), flags);
And even in that case, I would strongly advise you to define and use a protocol. As already said by @DavidSchwartz, it could save you time and headaches later...
But as you have pointers, you cannot do that and must define a protocol.
It could be (but you are free to prefer little endian order, or 2 or 8 bytes for each int, depending on your actual data):
one byte (or two) for the length of the first uint8_t array, followed by the array
the above repeated 3 more times
four bytes in big endian order for the first int of example_2
the above repeated 3 more times
four bytes in big endian order for the int of example_3
This clearly defines the format of a message.
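The format above could be encoded roughly as follows. This is only a sketch under stated assumptions: the struct does not record how long each example_1 array is, so the lengths are passed in a separate lens array, and the names put_u32_be and encode_example are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct example_structure {
    uint8_t *example_1[4];
    int example_2[4];
    int example_3;
} example_structure;

/* Write a 32-bit value as four bytes, most significant byte first. */
static unsigned char *put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return out + 4;
}

/*
 * Encode s into out (capacity cap). lens[i] is the length of example_1[i];
 * the struct itself does not carry it, so it is passed separately here
 * (an assumption). Returns the number of bytes written, or 0 on overflow.
 */
size_t encode_example(const example_structure *s, const uint8_t lens[4],
                      unsigned char *out, size_t cap)
{
    size_t need = 4 + 4 * 4 + 4;   /* four length bytes, four ints, one int */
    unsigned char *p = out;
    int i;

    for (i = 0; i < 4; i++)
        need += lens[i];
    if (need > cap)
        return 0;

    for (i = 0; i < 4; i++) {
        *p++ = lens[i];                     /* one-byte length prefix */
        if (lens[i] > 0)
            memcpy(p, s->example_1[i], lens[i]);
        p += lens[i];
    }
    for (i = 0; i < 4; i++)
        p = put_u32_be(p, (uint32_t)s->example_2[i]);
    p = put_u32_be(p, (uint32_t)s->example_3);
    return (size_t)(p - out);
}
```

The resulting bytes can then be handed to send as (const char *)out with the returned length. The receiver reverses the same steps, reading each length byte before the array it describes.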
I've got the following struct:
struct fetch_info_t {
u_int8_t grocery_type;
u_int8_t arg[1024];
} __attribute__((packed));
I'd like to send this over a socket to a server, to request data. I'd very much like to avoid any libraries, such as protobuf.
grocery_type can be any value between 1 and 255. Some grocery types, say type 128, must provide additional information. It's not enough to provide type 128; I'd also like to provide Cheeses as a string. That said, type 129 must provide a number, a u_int32_t, and not a string, unlike 128.
Basically I've allocated 1024 bytes for the additional information the system may require. The question is, how do I send it over a socket, or more specifically, populate arg with the right information in a system-independent way? I know htonl could be used on the number, but how do I actually set the buffer value to that?
I'd imagine that the info sending would actually eventually be casting the struct pointer to unsigned char array and send it like that over a socket. Let me please know if there's a better way.
You cannot directly assign the 32-bit value into the array
because correct alignment is not guaranteed.
memcpy() will just replicate the bytes with no alignment problem.
u_int32_t the_value=htonl( ... );
struct fetch_info_t the_info;
the_info.grocery_type=129;
memcpy(the_info.arg, &the_value, sizeof(the_value));
Then, because your structure is packed, you can send it with
send(my_socket, &the_info,
sizeof(the_info.grocery_type)+sizeof(the_value), 0);
In case you need to send a string
char *the_text= ... ;
size_t the_size=strlen(the_text)+1;
struct fetch_info_t the_info;
the_info.grocery_type=128;
memcpy(the_info.arg, the_text, the_size);
send(my_socket, &the_info,
sizeof(the_info.grocery_type)+the_size, 0);
Note that the '\0' is transmitted here.
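For completeness, the receiving side could undo those steps roughly as below. This is a sketch with assumptions: both function names are hypothetical, and buf is assumed to already hold exactly one complete message (TCP itself does not preserve message boundaries, so the caller has to frame messages somehow).

```c
#include <arpa/inet.h>   /* ntohl; on Windows this comes from winsock2.h */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical receiver for a type-129 message: one type byte followed by a
 * 32-bit number in network byte order.
 * Returns 0 on success and -1 when the message is malformed.
 */
int parse_number_request(const unsigned char *buf, size_t len, uint32_t *out)
{
    uint32_t wire;

    if (len != 1 + sizeof wire || buf[0] != 129)
        return -1;
    memcpy(&wire, buf + 1, sizeof wire);  /* memcpy avoids unaligned access */
    *out = ntohl(wire);
    return 0;
}

/*
 * Same idea for a type-128 message: the text must be NUL-terminated inside
 * the received bytes before it is safe to treat as a C string.
 * Returns a pointer into buf, or NULL when the message is malformed.
 */
const char *parse_text_request(const unsigned char *buf, size_t len)
{
    if (len < 2 || buf[0] != 128 || buf[len - 1] != '\0')
        return NULL;
    return (const char *)(buf + 1);
}
```

Checking the terminator on the receiving side matters because the sender is not necessarily trustworthy; without it, a missing '\0' would make string functions read past the received data.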
I need help trying to understand what's happening in some old C code.
I'm utilizing an old btree/isam software product (from Softfocus) to write my data to the database. It essentially puts data into a mydb.dt file and the index data into mydb.nx (for example).
In my program, I have a struct with members corresponding to "fields" in the database. The struct is defined like so (I'm greatly simplifying with fictional data):
typedef struct {
unsigned char name[50]; /*size is 50 bytes*/
int active; /*size is 4 bytes*/
int yet_unused_bytes[46]; /*unused space (in fixed-length record)*/
} DB_PEOPLE; /*total struct size is 100 bytes*/
When I want to write to the record, I call the DB software's bt3Write routine like so (people_db_fd is my database-file's descriptor, and db_current_record_people is just a copy of my struct above, with data in it):
ret = bt3Write(people_db_fd, db_current_record_people);
That bt3Write routine is basically the following (I don't think it's important to know exactly what it's doing, but the key part is the trueBase bit). The fd is the database file, and data is the byte stream (the db_current_record_people struct that I'm handing to it above). I suppose recno is just some overhead for the nx file that I don't care about here, that lioWrite takes care of:
/*
* all the keys are in; write the data record and store a copy
*/
if (lioWrite(fd -> fdData, recno, trueBase(data)) == UERROR)
return (sfuint) isMuCallErr(BT3WRITE, 0);
In a header file, trueBase and BASEOFFSET are defined as the following macros:
/*
* macro for easing the buffer address calculations (who knows what
* may change down the road
*/
#define BASEOFFSET (sizeof(sflong))
#define trueBase(address) ((char *) address - BASEOFFSET)
Now, here's what I would like help with (I'm by no means a C expert... barely functional, really). I need to know what trueBase is doing (or your best guess). To my untrained eye, it seems like it's shifting the pointer to the data by the length of BASEOFFSET (which is 8 bytes on my system).
Extra bonus points for anyone who knows anything about this particular software product, too! It's pretty old, and I can't really find ANY documentation for it. It's commented fairly well - except for this bit.
Your analysis is correct. It looks as if they are storing some hidden header data in each allocation that they are doing. So they only give you a pointer to the "user" part. Only when you have to free the data, for example, you need to know the "real" starting point of the allocated space and that's what the macro is computing.
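The hidden-header pattern described above can be sketched as follows. Everything here is hypothetical: hdr_t, HDR_SIZE and TRUE_BASE stand in for the library's sflong, BASEOFFSET and trueBase, and storing a record number in the header is only a guess at what such a header might hold.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the library's sflong and macros. */
typedef long hdr_t;
#define HDR_SIZE (sizeof(hdr_t))
#define TRUE_BASE(address) ((char *)(address) - HDR_SIZE)

/*
 * Allocate room for a hidden header plus user_size bytes and hand back a
 * pointer to the user part only. The header stores a record number here.
 */
void *record_alloc(size_t user_size, hdr_t recno)
{
    char *base = malloc(HDR_SIZE + user_size);

    if (base == NULL)
        return NULL;
    memcpy(base, &recno, HDR_SIZE);
    return base + HDR_SIZE;
}

/* Read the hidden header back by stepping from the user pointer to the base. */
hdr_t record_number(const void *user)
{
    hdr_t recno;

    memcpy(&recno, TRUE_BASE(user), HDR_SIZE);
    return recno;
}

/* free() needs the address malloc() returned, not the user pointer. */
void record_free(void *user)
{
    if (user != NULL)
        free(TRUE_BASE(user));
}
```

If the library works this way, a buffer passed to bt3Write must have come from the library's own allocation routine; applying the macro to an ordinary struct on the stack would step back into memory that is not a header at all.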
I don't apply for the extra point since I have no idea what that is.
I have a structure like this
struct packet
{
int seqnum;
char type[1];
float time1;
float pri;
float time2;
unsigned char data[512];
};
I am receiving packet in an array
char buf[529];
I want to extract seqnum, data and everything else separately. Does the following typecast work? It is giving me junk values.
struct packet *pkt;
pkt=(struct packet *)buf;
printf(" %d",pkt->seqnum);
No, that likely won't work and is generally a bad and broken way of doing this.
You must use compiler-specific extensions to make sure there's no invisible padding between your struct members, for something like that to work. With gcc, for instance, you do this using the __attribute__() syntax.
It is, thus, not a portable idea.
It's much better to be explicit about it, and unpack each field. This also gives you a chance to have a well-defined endianness in your network protocol, which is generally a good idea for interoperability's sake.
No, that isn't generally valid code. You should create the struct first and then memcpy the data into it:
struct packet p;
memcpy(&p.seqnum, buf + 0, 4);
memcpy(&p.type[0], buf + 4, 1);
memcpy(&p.time1, buf + 5, 4);
And so forth.
You must take great care to get the type sizes and endianness right.
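A fuller version of that field-by-field unpacking might look like the sketch below. It assumes the sender wrote the fields back to back with no padding, in big-endian order, and that float is a 32-bit IEEE-754 value on both ends; the question does not state any of this, and the helper names are invented here.

```c
#include <stdint.h>
#include <string.h>

struct packet {
    int seqnum;
    char type[1];
    float time1;
    float pri;
    float time2;
    unsigned char data[512];
};

/* Read four bytes, most significant first, as a 32-bit value. */
static uint32_t get_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reinterpret a 32-bit pattern as a float (assumes IEEE-754 on both ends). */
static float get_f32_be(const unsigned char *p)
{
    uint32_t bits = get_u32_be(p);
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Unpack the 529-byte wire image field by field: 4 + 1 + 4 + 4 + 4 + 512. */
void unpack_packet(const unsigned char buf[529], struct packet *pkt)
{
    pkt->seqnum  = (int)get_u32_be(buf + 0);
    pkt->type[0] = (char)buf[4];
    pkt->time1   = get_f32_be(buf + 5);
    pkt->pri     = get_f32_be(buf + 9);
    pkt->time2   = get_f32_be(buf + 13);
    memcpy(pkt->data, buf + 17, sizeof pkt->data);
}
```

With the asker's char buf[529], the call would be unpack_packet((const unsigned char *)buf, &p). If the sender actually uses little-endian order, only the two get_ helpers need to change.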
First of all, you cannot know in advance where the compiler will insert padding bytes in your structure for performance optimization (cache line alignment, integer alignment etc) since this is platform-dependent. Except, of course, if you are considering building the app only on your platform.
Anyway, in your case it seems like you are getting data from somewhere (network ?) and it is highly probable that the data has been compacted (no padding bytes between fields).
If you really want to typecast your array to a struct pointer, you can still tell the compiler to remove the padding bytes it might add. Note that this depends on the compiler you use and is not standard C. With gcc, you can add this attribute at the end of your structure definition:
struct my_struct {
int blah;
/* Blah ... */
} __attribute__((packed));
Note that it will affect the performance for member access, copy etc ...
Unless you have a very good reason to do so, don't ever use the __attribute__((packed)) thing !
The other solution, which is much more advisable, is to do the parsing on your own. You just allocate an appropriate structure and fill its fields by reading the right data from your buffer. A sequence of memcpy calls is likely to do the trick here (see Kerrek's answer).
I'm coding a network layer protocol and need to find the size of a packed structure defined in C. Compilers may add extra padding bytes, which makes sizeof useless in my case. I searched Google and found that we could use something like __attribute__((packed)) to prevent the compiler from adding extra padding bytes. But I believe this is not a portable approach, and my code needs to support both Windows and Linux environments.
Currently, I've defined a macro to map packed sizes of every structure defined in my code. Consider code below:
typedef struct {
...
} a_t;
typedef struct {
...
} b_t;
#define SIZE_a_t 8
#define SIZE_b_t 10
#define SIZEOF(XX) SIZE_##XX
and then in the main function, I can use the above macro definition as below:
int size = SIZEOF(a_t);
This approach does work, but I believe it may not be the best approach. Any suggestions or ideas on how to efficiently solve this problem in C?
Example
Consider the C structure below:-
typedef struct {
uint8_t a;
uint16_t b;
} e_t;
Under Linux, sizeof returns 4 bytes instead of 3. To prevent this I'm currently doing this:
typedef struct {
uint8_t a;
uint16_t b;
} e_t;
#define SIZE_e_t 3
#define SIZEOF(XX) SIZE_##XX
Now, when I call SIZEOF(e_t) in my function, it should return 3, not 4.
sizeof is the portable way to find the size of a struct, or of any other C data type.
The problem you're facing is how to ensure that your struct has the size and layout that you need.
#pragma pack or __attribute__((packed)) may well do the job for you. Neither is 100% portable (there's no mention of packing in the C standard), but they may be portable enough for your current purposes; do consider whether your code might need to be ported to some other platform in the future. Packing is also potentially unsafe; see this question and this answer.
The only 100% portable approach is to use arrays of unsigned char and keep track of which fields occupy which ranges of bytes. This is a lot more cumbersome, of course.
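For the e_t example from the question, the byte-array approach could look like this sketch. The offset constants and the e_encode/e_decode names are invented here, and big-endian order for b is an assumption, since the question does not specify a byte order.

```c
#include <stdint.h>

/* Wire layout of e_t: field a at byte 0, field b at bytes 1..2 (big-endian). */
enum { E_OFF_A = 0, E_OFF_B = 1, E_WIRE_SIZE = 3 };

typedef struct {
    uint8_t a;
    uint16_t b;
} e_t;

/* Copy each field to its fixed position in the 3-byte wire image. */
void e_encode(const e_t *e, unsigned char out[E_WIRE_SIZE])
{
    out[E_OFF_A]     = e->a;
    out[E_OFF_B]     = (unsigned char)(e->b >> 8);
    out[E_OFF_B + 1] = (unsigned char)(e->b & 0xFF);
}

/* Rebuild the in-memory struct from the 3-byte wire image. */
void e_decode(const unsigned char in[E_WIRE_SIZE], e_t *e)
{
    e->a = in[E_OFF_A];
    e->b = (uint16_t)((in[E_OFF_B] << 8) | in[E_OFF_B + 1]);
}
```

Here E_WIRE_SIZE plays the role of the hand-written SIZE_e_t, but it describes the wire format rather than the in-memory struct, so it stays correct whatever padding the compiler chooses.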
Your macro tells you the size that you think the struct should have, if it has been laid out as you intend.
If that's not equal to sizeof(a_t), then whatever code you write that thinks it is packed isn't going to work anyway. Assuming they're equal, you might as well just use sizeof(a_t) for all purposes. If they're not equal then you should be using it only for some kind of check that SIZEOF(a_t) == sizeof(a_t), which will fail and prevent your non-working code from compiling.
So it follows that you might as well just put the check in the header file that sizeof(a_t) == 8, and not bother defining SIZEOF.
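Such a check can be written as a compile-time assertion. In this sketch wire_header_t is a hypothetical struct standing in for a_t, whose members are not shown in the question; _Static_assert requires C11, and the typedef below it is a common fallback for older compilers.

```c
#include <stdint.h>

/* Hypothetical struct standing in for a_t, whose members are elided above. */
typedef struct {
    uint32_t id;
    uint32_t length;
} wire_header_t;

/* C11: compilation stops here if the compiler laid the struct out differently. */
_Static_assert(sizeof(wire_header_t) == 8,
               "wire_header_t must be exactly 8 bytes");

/* Pre-C11 fallback: an array type of negative size is a compile error. */
typedef char wire_header_size_check[(sizeof(wire_header_t) == 8) ? 1 : -1];
```

Either form costs nothing at run time, and code that assumes the packed layout simply refuses to build on a platform where the assumption is false.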
That's all aside from the fact that SIZEOF doesn't really behave like sizeof. For example consider typedef a_t foo; sizeof(foo);, which obviously won't work with SIZEOF.
I don't think that specifying the size manually is more portable than using sizeof.
If the struct changes, your hard-coded size will be wrong.
Packing is available on the major compilers: GCC has __attribute__((packed)), and Visual Studio has #pragma pack.
I would recommend against trying to read/write data by overlaying it on a struct. I would suggest instead writing a family of routines which are conceptually like printf/scanf, but which use format specifiers that specify binary data formats. Rather than using percent-sign-based tags, I would suggest simply using a binary encoding of the data format.
There are a few approaches one could take, involving trade-off between the size of the serialization/deserialization routines themselves, the size of the code necessary to use them, and the ability to handle a variety of deserialization formats. The simplest (and most easily portable) approach would be to have routines which, instead of using a format string, process items individually by taking a double-indirect pointer, read some data type from it, and increment it suitably. Thus:
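uint32_t read_uint32_bigendian(uint8_t const **src)
{
uint8_t const *p = *src;
uint32_t tmp;
/* Cast before shifting so the arithmetic is done in uint32_t, not int. */
tmp = (uint32_t)(*p++) << 24;
tmp |= (uint32_t)(*p++) << 16;
tmp |= (uint32_t)(*p++) << 8;
tmp |= (uint32_t)(*p++);
*src = p; /* advance the caller's pointer past the four bytes read */
return tmp;
}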
...
uint8_t buff[256];
...
uint8_t const *buffptr = buff;
first_word = read_uint32_bigendian(&buffptr);
next_word = read_uint32_bigendian(&buffptr);
This approach is simple, but has the disadvantage of having lots of redundancy in the packing and unpacking code. Adding a format string could simplify it:
#define BIGEND_INT32 "\x43" // Or whatever the appropriate token would be
uint8_t const *buffptr = buff;
read_data(&buffptr, BIGEND_INT32 BIGEND_INT32, &first_word, &second_word);
This approach could read any number of data items with a single function call, passing buffptr only once, rather than once per data item. On some systems, it might still be a bit slow. An alternative approach would be to pass in a string indicating what sort of data should be received from the source, and then also pass in a string or structure indicating where the data should go. This could allow any amount of data to be parsed by a single call giving a double-indirect pointer for the source, a string pointer indicating the format of data at the source, a pointer to a struct indicating how the data should be unpacked, and a pointer to a struct to hold the target data.
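One possible shape for the format-string variant is sketched below. The token values and the three supported types are arbitrary choices made for this example, not part of any existing library, and the function does no bounds checking; a real version would also need to know how many bytes remain in the source.

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical one-byte format tokens; the values are arbitrary. */
#define FMT_U8     "\x01"
#define FMT_U16_BE "\x02"
#define FMT_U32_BE "\x03"

/*
 * Read one item per token in fmt from *src, storing each through the matching
 * pointer argument, then advance *src past the consumed bytes.
 * Returns the number of items read; stops early at an unknown token.
 */
int read_data(uint8_t const **src, const char *fmt, ...)
{
    uint8_t const *p = *src;
    int count = 0;
    va_list ap;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt == FMT_U8[0]) {
            *va_arg(ap, uint8_t *) = p[0];
            p += 1;
        } else if (*fmt == FMT_U16_BE[0]) {
            *va_arg(ap, uint16_t *) = (uint16_t)((p[0] << 8) | p[1]);
            p += 2;
        } else if (*fmt == FMT_U32_BE[0]) {
            *va_arg(ap, uint32_t *) = ((uint32_t)p[0] << 24) |
                                      ((uint32_t)p[1] << 16) |
                                      ((uint32_t)p[2] << 8)  |
                                       (uint32_t)p[3];
            p += 4;
        } else {
            break;
        }
        count++;
    }
    va_end(ap);
    *src = p;
    return count;
}
```

Adjacent string literals are concatenated by the compiler, so a call such as read_data(&ptr, FMT_U32_BE FMT_U16_BE, &word, &half) passes a single two-token format string. As with printf, the compiler cannot check that the pointer arguments match the tokens.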
I have several structures defined to send between different operating systems (over TCP networks).
Defined structures are:
struct Struct1 { uint32_t num; char str[10]; char str2[10]; };
struct Struct2 { uint16_t num; char str[10]; };
struct Struct1 a;
struct Struct2 b;
The data is stored in a text file.
Data Format is as such:
123
Pie
Crust
Struct1 a is stored as 3 separate parameters. However, Struct2 has two separate parameters, with both the 2nd and 3rd lines stored in char str[]. The problem is that when I write to a server over multiple networks, the data is not received correctly. There are numerous spaces that separate the different parameters in the structures. How do I ensure proper sending and padding when I write to the server? How do I store the data correctly (dynamic buffer or fixed buffer)?
Example of write: write(fd,&a, sizeof(typedef struct a)); Is this correct?
Problem Receive Side Output for struct2:
123( , )
0 (, Pie)
0 (Crust,)
Correct Output
123(Pie, Crust)
write(fd,&a, sizeof(a)); is not correct; at least not portably, since the C compiler may introduce padding between the elements to ensure correct alignment. sizeof(typedef struct a) doesn't even make sense.
How you should send the data depends on the specs of your protocol. In particular, protocols define widely varying ways of sending strings. It is generally safest to send the struct members separately; either by multiple calls to write or writev(2). For instance, to send
struct { uint32_t a; uint16_t b; } foo;
over the network, where foo.a and foo.b already have the correct endianness, you would do something like:
struct iovec v[2];
v[0].iov_base = &foo.a;
v[0].iov_len = sizeof(uint32_t);
v[1].iov_base = &foo.b;
v[1].iov_len = sizeof(uint16_t);
writev(fd, v, 2);
Sending structures over the network is tricky. You might run into the following problems:
(1) Byte endianness issues with integers.
(2) Padding introduced by your compiler.
(3) String parsing (i.e. detecting string boundaries).
If performance is not your goal, I'd suggest creating encoders and decoders for each struct to be sent and received (ASN.1, XML or custom). If performance is really required, you can still use structures and solve (1) by fixing an endianness (i.e. network byte order) and ensuring your integers are stored as such in those structures, and (2) by fixing a compiler and using the pragmas or attributes to enforce a "packed" structure.
GCC, for example, uses __attribute__((packed)) as such:
struct mystruct {
uint32_t a;
uint16_t b;
unsigned char text[24];
} __attribute__((__packed__));
(3) is not easy to solve. Using null-terminated strings in a network protocol and depending on the terminator being present would make your code vulnerable to several attacks. If strings need to be involved, I'd use a proper encoding method such as the ones suggested above.
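One simple alternative to relying on a terminator is a length prefix that the receiver validates. The sketch below uses a hypothetical format (one length byte, then the characters) and a hypothetical function name; it is not taken from ASN.1 or any other standard encoding.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wire format: one length byte followed by that many characters,
 * with no terminator on the wire. Copies the string into dst (capacity
 * dst_cap) and NUL-terminates it locally.
 * Returns the number of bytes consumed from in, or 0 if the input is
 * truncated or the string does not fit.
 */
size_t read_lp_string(const unsigned char *in, size_t in_len,
                      char *dst, size_t dst_cap)
{
    size_t n;

    if (in_len < 1)
        return 0;
    n = in[0];
    if (n > in_len - 1 || n + 1 > dst_cap)
        return 0;               /* claimed length exceeds what we have */
    memcpy(dst, in + 1, n);
    dst[n] = '\0';
    return n + 1;
}
```

The key point is that the claimed length is checked against both the bytes actually received and the destination capacity before anything is copied.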
The easy way would be to write two functions for each structure: one to convert from textual representation to the struct and one to convert a struct back to text. Then you just send the text over the network and on the receiving side convert it to your structures. That way endianness does not matter.
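For Struct1 from the question, such a pair of functions might look like the sketch below. It reuses the three-line text layout shown in the question; the function names are invented, and it assumes the string fields contain no whitespace, since sscanf's %s stops at the first space.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct Struct1 { uint32_t num; char str[10]; char str2[10]; };

/*
 * Format s as three newline-terminated lines. Returns the text length, or -1
 * if out (capacity cap) is too small.
 */
int struct1_to_text(const struct Struct1 *s, char *out, size_t cap)
{
    /* %.9s never reads more than 9 characters from each 10-byte field. */
    int n = snprintf(out, cap, "%" PRIu32 "\n%.9s\n%.9s\n",
                     s->num, s->str, s->str2);

    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Parse the same three lines back. Returns 0 on success, -1 otherwise. */
int struct1_from_text(const char *text, struct Struct1 *s)
{
    /* %9s stores at most 9 characters, leaving room for the terminator. */
    int got = sscanf(text, "%" SCNu32 " %9s %9s", &s->num, s->str, s->str2);

    return got == 3 ? 0 : -1;
}
```

The receiver still has to know where one record ends on the stream, for example by counting the three newlines or by sending a length first.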
There are conversion functions to ensure portability of binary integers across a network. Use htons, htonl, ntohs and ntohl to convert 16 and 32 bit integers from host to network byte order and vice versa.
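As a small illustration of those functions, the sketch below packs one 32-bit and one 16-bit integer into a byte buffer and reads them back. The function names are hypothetical; on Windows the conversion functions come from winsock2.h rather than arpa/inet.h.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit and a 16-bit integer to out in network byte order. */
size_t pack_u32_u16(unsigned char *out, uint32_t a, uint16_t b)
{
    uint32_t a_net = htonl(a);
    uint16_t b_net = htons(b);

    memcpy(out, &a_net, sizeof a_net);
    memcpy(out + sizeof a_net, &b_net, sizeof b_net);
    return sizeof a_net + sizeof b_net;   /* 6 bytes, no padding */
}

/* Reverse of pack_u32_u16: convert back to host byte order. */
void unpack_u32_u16(const unsigned char *in, uint32_t *a, uint16_t *b)
{
    uint32_t a_net;
    uint16_t b_net;

    memcpy(&a_net, in, sizeof a_net);
    memcpy(&b_net, in + sizeof a_net, sizeof b_net);
    *a = ntohl(a_net);
    *b = ntohs(b_net);
}
```

Because the values go through a byte buffer rather than a struct, the six bytes on the wire are the same regardless of how the compiler would pad a struct holding the same two fields.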