I have been given an array of structs:
typedef struct sRawMsg{
int a;
}sRawMsg;
sRawMsg RawMsg[10];
First the struct array entries are filled with data. Then the data is copied to an output buffer given as a 2D array.
// sending buffer which allocates memory for the array struct
static unsigned char sendingBuffer[10][sizeof(sRawMsg)];
for(int i = 0; i < 10; ++i)
{
sRawMsg* pMsg = &(RawMsg[i]);
// data is now stored in the struct array # pos i
...
// data from the struct entry is now saved in the output sending buffer
memcpy(&(sendingBuffer[i][0]), pMsg, sizeof(sRawMsg));
}
The resulting output buffer is transmitted as a plain byte array over a wireless connection. Since I am new to C programming, I want to ask whether a more efficient, elegant, or secure way exists to handle the struct array data.
Since you don't have any padding in the structure (one element structures don't have padding with any normal compiler), you could simply pass the
(unsigned char *)&RawMsg[0]
as the argument to your sending function.
If you were converting the data to a fixed format (e.g. network order), or if your structure included a mixture of types with padding between the elements, or if your structure included pointers to strings (or other pointers to data), you'd have to work harder — use serialization analogous to what you are doing. With pointers to strings, you'd probably need a protocol that knows how to identify the lengths of strings. One such convention is known as TLV (Type, length, value). Another (vastly more complex) one is ASN.1. Or you can use a format such as JSON or BSON, or maybe Google's Protocol buffers.
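As a rough illustration of the "fixed format" case, here is a minimal sketch that writes each message as 4 bytes in network (big-endian) order instead of copying the raw struct. The wire size of 4 bytes per message and the name serialize_msgs are my assumptions, not something from the question, and it assumes int is at most 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct sRawMsg {
    int a;
} sRawMsg;

/* Assumed wire format: each message is 4 bytes, most significant byte first. */
#define WIRE_MSG_SIZE 4

/* Writes count messages into out, which must hold count * WIRE_MSG_SIZE
   bytes. Returns the number of bytes written. */
size_t serialize_msgs(const sRawMsg *msgs, size_t count, unsigned char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; ++i) {
        uint32_t v = (uint32_t)msgs[i].a;   /* two's-complement bit pattern */
        out[pos++] = (unsigned char)(v >> 24);
        out[pos++] = (unsigned char)(v >> 16);
        out[pos++] = (unsigned char)(v >> 8);
        out[pos++] = (unsigned char)v;
    }
    return pos;
}
```

Because the bytes are produced with shifts, the output is the same on little-endian and big-endian hosts, and struct padding never reaches the wire.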
I'm receiving byte buffer array and trying to copy it to a structure:
my structure is:
typedef struct mydata_block
{
uint8_t cmd;
uint32_t param;
char str_buf[10];
uint32_t crc32;
} mydata_t;
first, the program that sends the data as following:
blockTX.cmd = 2
blockTX.str_buf = "eee789"
blockTX.param = 1001
blockTX.crc32 = 3494074521
-
02-00-00-00-E9-03-00-00-65-65-65-37-38-39-00-00-00-00-00-00-99-58-43-D0
When the data is received, I copy it into the structure using the memcpy call below:
memcpy((uint8_t *)&blockRX,(uint8_t *)usbd_cdc_buffer,sizeof(blockRX));
Everything looks fine except cmd (it is 1 byte, but is there padding in the structure?). How do I fix this?
Transferring data means accounting for padding, sizes, endianness and so on, so you need to generate and parse the byte stream explicitly. You can use something like Google's protobuf to serialize and deserialize your data portably and conveniently.
But if you must, you can give the structure the packed attribute. This removes all the padding and alignment restrictions. That lets you memcpy() the struct without padding, but at the cost of slower access to the members of the struct itself. There are only two good reasons to do this:
The alignment and padding of the struct are determined by forces outside your control (it has to match hardware or 3rd party software).
As an intermediate step to converting the data into host format.
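To make "parse the byte stream" concrete, here is a minimal sketch that decodes the 24-byte dump from the question field by field instead of relying on the receiver's struct layout. The offsets (0, 4, 8, 20) and the little-endian byte order are read off that dump and are assumptions about the sender; decode_block and get_le32 are names I made up.

```c
#include <stdint.h>
#include <string.h>

typedef struct mydata_block {
    uint8_t  cmd;
    uint32_t param;
    char     str_buf[10];
    uint32_t crc32;
} mydata_t;

/* Read a 32-bit little-endian value from four bytes. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Offsets assumed from the 24-byte dump in the question. */
enum { OFF_CMD = 0, OFF_PARAM = 4, OFF_STR = 8, OFF_CRC = 20, WIRE_SIZE = 24 };

/* wire must point at WIRE_SIZE bytes. */
void decode_block(const uint8_t *wire, mydata_t *out)
{
    out->cmd   = wire[OFF_CMD];
    out->param = get_le32(wire + OFF_PARAM);
    memcpy(out->str_buf, wire + OFF_STR, sizeof out->str_buf);
    out->crc32 = get_le32(wire + OFF_CRC);
}
```

With this approach the receiving struct can be packed or not; only the documented offsets matter.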
I'm trying to create a C client for dalmatinerdb but am having trouble understanding how to combine the variables, write them to a buffer and send it to the database. The fact that dalmatinerdb is written in Erlang makes it more difficult. However, by looking at a Python client for dalmatinerdb I have (probably) found the necessary variable sizes and order.
The Erlang client has a function called "encode", see below:
encode({stream, Bucket, Delay}) when
is_binary(Bucket), byte_size(Bucket) > 0,
is_integer(Delay), Delay > 0, Delay < 256->
<<?STREAM,
Delay:?DELAY_SIZE/?SIZE_TYPE,
(byte_size(Bucket)):?BUCKET_SS/?SIZE_TYPE, Bucket/binary>>;
According to the official dalmatinerdb protocol we can see the following:
-define(STREAM, 4).
-define(DELAY_SIZE, 8). % bits
-define(BUCKET_SS, 8). % bits
Let's say i would like to create this kind of structure in C,
would it look something like the following:
struct package {
unsigned char[1] mode; // = "4"
unsigned char[1] delay; // = for example "5"
unsigned char[1] bucketNameSize; // = "5"
unsigned char[1] bucketName; // for example "Test1"
};
Update:
I realized that the dalmatinerdb frontend (web interface) only reacts and updates when values have been sent to the bucket. In other words, just sending the first struct won't tell me whether it's right or wrong. Therefore I will try to create a second struct with the actual values.
The Erlang code snippet which encodes values looks like this:
encode({stream, Metric, Time, Points}) when
is_binary(Metric), byte_size(Metric) > 0,
is_binary(Points), byte_size(Points) rem ?DATA_SIZE == 0,
is_integer(Time), Time >= 0->
<<?SENTRY,
Time:?TIME_SIZE/?SIZE_TYPE,
(byte_size(Metric)):?METRIC_SS/?SIZE_TYPE, Metric/binary,
(byte_size(Points)):?DATA_SS/?SIZE_TYPE, Points/binary>>;
The different sizes:
-define(SENTRY, 5)
-define(TIME_SIZE, 64)
-define(METRIC_SS, 16)
-define(DATA_SS, 32)
Which gives me:
<<?5,
Time:?64/?SIZE_TYPE,
(byte_size(Metric)):?16/?SIZE_TYPE, Metric/binary,
(byte_size(Points)):?32/?SIZE_TYPE, Points/binary>>;
My guess is that my struct containing a value should look like this:
struct Package {
unsigned char sentry;
uint64_t time;
unsigned char metricSize;
uint16_t metric;
unsigned char pointSize;
uint32_t point;
};
Any comments on this structure?
The binary created by the encode function has this form:
<<?STREAM, Delay:?DELAY_SIZE/?SIZE_TYPE,
(byte_size(Bucket)):?BUCKET_SS/?SIZE_TYPE, Bucket/binary>>
First let's replace all the preprocessor macros with their actual values:
<<4, Delay:8/unsigned-integer,
(byte_size(Bucket)):8/unsigned-integer, Bucket/binary>>
Now we can more easily see that this binary contains:
a byte of value 4
the value of Delay as a byte
the size of the Bucket binary as a byte
the value of the Bucket binary
Because of the Bucket binary at the end, the overall binary is variable-sized.
A C99 struct that resembles this value can be defined as follows:
struct EncodedStream {
unsigned char mode;
unsigned char delay;
unsigned char bucket_size;
unsigned char bucket[];
};
This approach uses a C99 flexible array member for the bucket field, since its actual size depends on the value set in the bucket_size field, and you are presumably using this structure by allocating memory large enough to hold the fixed-size fields together with the variable-sized bucket field, where bucket itself is allocated to hold bucket_size bytes. You could also replace all uses of unsigned char with uint8_t if you #include <stdint.h>. In traditional C, bucket would be defined as a 0- or 1-sized array.
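For illustration, here is a minimal sketch of allocating and filling such a struct. The function name make_stream and its parameters are hypothetical; the mode value 4 and the limits on delay and bucket size come from the Erlang guard and macros quoted above.

```c
#include <stdlib.h>
#include <string.h>

struct EncodedStream {
    unsigned char mode;
    unsigned char delay;
    unsigned char bucket_size;
    unsigned char bucket[];
};

/* Returns a malloc'ed message and stores its total length in *total.
   Returns NULL if delay is 0, the bucket name is empty or longer than
   255 bytes, or allocation fails. The caller frees the result. */
struct EncodedStream *make_stream(unsigned char delay, const char *bucket,
                                  size_t *total)
{
    size_t len = strlen(bucket);
    if (delay == 0 || len == 0 || len > 255)
        return NULL;

    /* Fixed-size header plus room for the bucket name (no terminator). */
    struct EncodedStream *s = malloc(sizeof *s + len);
    if (s == NULL)
        return NULL;

    s->mode = 4;                          /* ?STREAM */
    s->delay = delay;
    s->bucket_size = (unsigned char)len;
    memcpy(s->bucket, bucket, len);
    *total = sizeof *s + len;
    return s;
}
```

Since every field is a single byte, there is no padding or endianness to worry about here, and the first *total bytes of the result can be sent as they are.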
Update: the OP extended the question with another struct, so I've extended my answer below to cover it too.
The obvious-but-wrong way to write a struct corresponding to the metric/time/points binary is:
struct Wrong {
unsigned char sentry;
uint64_t time;
uint16_t metric_size;
unsigned char metric[];
uint32_t points_size;
unsigned char points[];
};
There are two problems with the Wrong struct:
Padding and alignment: Normally, fields are aligned on natural boundaries corresponding to their sizes. Here, the C compiler will align the time field on an 8-byte boundary, which means there will be padding of 7 bytes following the sentry field. But the Erlang binary contains no such padding.
Illegal flexible array field in the middle: The metric field size can vary, but we can't use the flexible array approach for it as we did in the earlier example because such arrays can only be used for the final field of a struct. The fact that the size of metric can vary means that it's impossible to write a single C struct that matches the Erlang binary.
Solving the padding and alignment issue requires using a packed struct, which you can achieve with compiler support such as the gcc and clang __packed__ attribute (other compilers might have other ways of achieving this). The variable-sized metric field in the middle of the struct can be solved by using two structs instead:
typedef struct __attribute((__packed__)) {
unsigned char sentry;
uint64_t time;
uint16_t size;
unsigned char metric[];
} Metric;
typedef struct __attribute((__packed__)) {
uint32_t size;
unsigned char points[];
} Points;
Packing both structs means their layouts will match the layouts of the corresponding data in the Erlang binary.
There's still a remaining problem, though: endianness. By default, fields in an Erlang binary are big-endian. If you happen to be running your C code on a big-endian machine, then things will just work, but if not — and it's likely you're not — the data values your C code reads and writes won't match Erlang.
Fortunately, endianness is easily handled: you can use byte swapping to write C code that can portably read and write big-endian data regardless of the endianness of the host.
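One common way to do this, sketched here with helper names of my own choosing (put_be64 and get_be64), is to build the value byte by byte with shifts, which never depends on the host's byte order:

```c
#include <stdint.h>

/* Store v at p as 8 bytes, most significant byte first. */
void put_be64(unsigned char *p, uint64_t v)
{
    for (int i = 7; i >= 0; --i) {
        p[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}

/* Read 8 big-endian bytes from p back into a host integer. */
uint64_t get_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}
```

The 16-bit and 32-bit versions follow the same pattern with fewer bytes.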
To use the two structs together, you'd first have to allocate enough memory to hold both structs and both the metric and the points variable-length fields. Cast the pointer to the allocated memory — let's call it p — to a Metric*, then use the Metric pointer to store appropriate values in the struct fields. Just make sure you convert the time and size values to big-endian as you store them. You can then calculate a pointer to where the Points struct is in the allocated memory as shown below, assuming p is a pointer to char or unsigned char:
Points* points = (Points*)(p + sizeof(Metric) + <length of Metric.metric>);
Note that you can't just use the size field of your Metric instance for the final addend here since you stored its value as big-endian. Then, once you fill in the fields of the Points struct, again being sure to store the size value as big-endian, you can send p over to Erlang, where it should match what the Erlang system expects.
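Putting those steps together, a minimal sketch might look like the following. It assumes gcc or clang for the packed attribute; encode_points and store_be are hypothetical names, and the sentry value 5 is taken from the ?SENTRY macro in the question.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct __attribute__((__packed__)) {
    unsigned char sentry;
    uint64_t time;
    uint16_t size;
    unsigned char metric[];
} Metric;

typedef struct __attribute__((__packed__)) {
    uint32_t size;
    unsigned char points[];
} Points;

/* Store v into dst as n big-endian bytes (n is 2, 4 or 8 here). */
static void store_be(void *dst, uint64_t v, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * (n - 1 - i)));
}

/* Returns a malloc'ed buffer holding the whole message, or NULL on
   allocation failure. The caller frees it. */
unsigned char *encode_points(uint64_t time,
                             const void *metric, uint16_t metric_len,
                             const void *points, uint32_t points_len,
                             size_t *out_len)
{
    size_t total = sizeof(Metric) + metric_len + sizeof(Points) + points_len;
    unsigned char *p = malloc(total);
    if (p == NULL)
        return NULL;

    Metric *m = (Metric *)p;
    m->sentry = 5;                                   /* ?SENTRY */
    store_be(&m->time, time, sizeof m->time);
    store_be(&m->size, metric_len, sizeof m->size);
    memcpy(m->metric, metric, metric_len);

    /* Use the host-order length, not the big-endian m->size field. */
    Points *pt = (Points *)(p + sizeof(Metric) + metric_len);
    store_be(&pt->size, points_len, sizeof pt->size);
    memcpy(pt->points, points, points_len);

    *out_len = total;
    return p;
}
```

The first *out_len bytes of the returned buffer are what would go over the socket. I have not run this against dalmatinerdb itself, so treat the layout as a reading of the Erlang code above rather than a verified client.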
Basically, I have a custom structure that contains different kinds of data. For example:
typedef struct example_structure{
uint8_t* example_1[4];
int example_2[4];
int example_3;
} example_structure;
What I need to do is copy the contents of this structure to a const char* buffer so I can send that copied data (buffer) using winsock2's send(SOCKET s, const char* buffer, int len, int flags) function. I tried using memcpy(), but wouldn't I just copy the addresses held in the pointers and not the data?
Yes, if you copied or sent that structure through a socket you would end up copying or sending the pointers, which would be meaningless to the recipient. Moreover, if the recipient is running on different hardware (e.g. with a different endianness), all of the data may be meaningless anyway. On top of that, differences in the amount of padding between structure members may also become a problem.
For non-trivial situations it is best to use an existing protocol (such as protobuf), or roll your own protocol, keeping in mind the potential differences in hardware representation of your data.
You need to design a protocol before you can encode the data in accord with that protocol. Decide exactly how the data will be encoded at the byte level. Then write code to encode and decode to that format that you decided on.
Do not skip the step of actually documenting the wire protocol at the byte level. It will save you pain later, I promise.
See this answer for a bit more detail.
const char* buffer
This buffer points to constant data, so you can't copy anything into it through that pointer. You probably don't need to copy anything, though. Just call send like this, where ex_struc is a variable of type example_structure:
send(s, (const char*)&ex_struc, sizeof(ex_struc), flags);
But there is still the problem with the pointers in your structure (uint8_t* example_1[4];).
Sending pointers between different applications or machines does not make sense.
Hmm, your struct contains uint8_t * fields, which look like C strings... It does not make sense to copy or send a pointer, which is merely a memory address in the sending process's address space.
If your struct had been (note, no pointers):
typedef struct example_structure{
uint8_t example_1[4];
int example_2[4];
int example_3;
} example_structure;
and provided you transfer it on exactly same architecture (same hardware, same compiler, same compiler options), you could do simply:
example_structure ex_struc;
// initialize the struct
...
send(s, (const char *)&ex_struc, sizeof(ex_struc), flags);
And even in that case, I would strongly advise you to define and use a protocol. As already said by @DavidSchwartz, it could save you time and headaches later...
But as you have pointers, you cannot do that and must define a protocol.
It could be (though you are free to prefer little-endian order, or 2 or 8 bytes for each int, depending on your actual data):
one byte (or two) for length of first uint8_t array, followed by the array
above repeated 3 more times
four bytes in big endian order for first int of example_2
repeated 3 times
four bytes in big endian order for int of example_3
This clearly defines the format of a message.
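A minimal sketch of an encoder for that format could look like this. The struct does not record how long each uint8_t array is, so the lens parameter is a hypothetical addition, as are the names encode_example and put_be32; I also assume int fits in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct example_structure {
    uint8_t *example_1[4];
    int example_2[4];
    int example_3;
} example_structure;

/* Write v as four big-endian bytes; returns the number of bytes written. */
static size_t put_be32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return 4;
}

/* lens[i] is the byte length of s->example_1[i] (assumed to be known by
   the caller). Returns the number of bytes written, or 0 if cap is too
   small for the whole message. */
size_t encode_example(const example_structure *s, const uint8_t lens[4],
                      unsigned char *out, size_t cap)
{
    size_t need = 5 * 4;                     /* five 32-bit integers */
    for (int i = 0; i < 4; ++i)
        need += 1 + (size_t)lens[i];         /* length byte + payload */
    if (need > cap)
        return 0;

    size_t pos = 0;
    for (int i = 0; i < 4; ++i) {
        out[pos++] = lens[i];
        if (lens[i] > 0)
            memcpy(out + pos, s->example_1[i], lens[i]);
        pos += lens[i];
    }
    for (int i = 0; i < 4; ++i)
        pos += put_be32(out + pos, (uint32_t)s->example_2[i]);
    pos += put_be32(out + pos, (uint32_t)s->example_3);
    return pos;
}
```

The resulting buffer can then be passed to send() with a cast to const char*. The receiver walks the same layout in the same order.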
According to this,
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
It is said that the benefit is
They are very useful as the last element of a structure that is really
a header for a variable-length object
What does it mean?
The zero-length array is a GCC extension (read as: not standard) which you should not use.
While recent versions of C allow for something similar (a flexible array member with empty brackets), C++ knows no such thing. As people often mix C and C++, this is a possible source of confusion.
Instead, an array of length 1 should be used, which is standards-compliant under both C and C++, and which just works with every compiler.
What is this useful for at all?
Sometimes you need to access "invalid" out-of-bounds data knowing that it is valid in reality. In the strictest sense, this is undefined behavior (since you are accessing out-of-bounds values which are indeterminate, and using indeterminate values is UB), but that is only as far as the compiler knows, not what is in fact the case, so it nevertheless "works fine".
For example, you might receive framed data on the network consisting of a tag word, a length, and an amount of data corresponding to the length given. Or an operating system function might return a variable amount of results to you (a couple of Win32 API functions work that way, for example).
In either case, you have an unknown (unknown at compile time) number of elements at the end of this structure, so it is not possible to define a single legitimate structure to hold everything.
That is what flexible array members are for. And with this, it is explained why they must be the last member as well. It doesn't make sense for something that could have "any size" to be anywhere but at the end -- it's impossible for the compiler to lay out any members after it, not knowing its size.
(In case you wonder how the compiler can ever free the storage not knowing the object's size... it cannot! There normally exists an explicit function for freeing such an object as part of the API, which takes care of this exact problem.)
It's probably best to demonstrate with a small example:
#include <stdio.h>
#include <stdlib.h>
#define BLOB_TYPE_FOO 0xBEEF
struct blob {
/* Part of your object header... perhaps describing the type of blob. */
int type;
/* This is actually the length of the "data" field below */
unsigned length;
/* The data */
unsigned char data[];
};
struct blob *
create_blob(int type, size_t size)
{
/* Allocate enough space for the "header" and "size" bytes of data. */
struct blob *x = calloc(1, sizeof(struct blob) + size);
x->type = type;
x->length = size;
return x;
}
int
main(void)
{
/* Note that sizeof(struct blob) doesn't include the data field. */
printf("sizeof(struct blob): %zu\n", sizeof(struct blob));
struct blob *x = create_blob(BLOB_TYPE_FOO, 1000);
/*
You can manipulate data here, but be careful not to exceed the
allocated size.
*/
size_t i;
for (i = 0; i < 1000; i++)
{
x->data[i] = 'A' + (i % 26);
}
/*
Since data was allocated with the rest of the header, everything is
freed.
*/
free(x);
return 0;
}
The nice part about this setup is that sizeof(struct blob) represents the size of the "object header" (on my machine, that's 8 bytes), and that since you allocate the whole object together, a single free() is all that is needed to release the memory.
Like others have stated here, this is a non-standard extension and you should really consider using it with care. Damon's answer is the better way to go, though the sizeof() operation is not quite the right size (it's a bit too large to represent the size of the actual header). It's not too hard to work around that problem, though.
You cannot have an array of length 0, because trying to make one would mean creating a pointer to nothing, which is not correct. The GCC documentation linked in the question describes how the standard C99 flexible array member differs:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero.
Flexible array members may only appear as the last member of a struct that is otherwise non-empty.
A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are permitted by GCC as extensions.)
IMPORTANT EDIT:
Sorry everyone, I made a big mistake in the structure.
char *name; is meant to be outside of the structure, written to the file after the structure.
This way, you read the structure, find out the size of the name, then read in the string. This also explains why there is no need for a null terminator.
However, I feel that somewhere my actual question has been answered. If someone would like to edit their response so I can choose the best-fitting one, I'd appreciate it.
Again, the question I was asking is "If you read in a structure, are you also reading in the data it holds, or do you need to access it some other way".
Sorry for the confusion
For an assignment, I've been tasked with a program which writes and reads structures to a disk (using fread and fwrite).
I'm having trouble grasping the concept.
Let's say we have this structure:
typedef struct {
short nameLength;
char* name;
}attendenceList;
attendenceList names;
now assume we give it this data:
names.name = "John Doe\0";
names.nameLength = strlen(names.name); /*potentially -1?*/
and then we use fwrite... given a file pointer fp.
fwrite(&names,sizeof(names),1,fp);
Now we close the file, and open it later to read in the structure.
The question is this: when we read in the structure, are we also reading in the variables it stores?
Can we then do something like:
if(names.nameLength < 10)
{
...
}
Or do we have to fread something more than just the structure, or assign them somehow?
Assuming the fread is:
fread(&names,sizeof(names),1,fp);
Also assuming we've defined the structure in our current function, as above.
Thanks for the help!
You have a problem here:
fwrite(&names,sizeof(names),1,fp);
Since attendenceList saves the name as a char * this will just write out the pointer, not the actual text. When you read that back in, the memory the pointer is referencing will most likely have something else in it.
You have two choices:
Put a character array (char name[MAXSIZE]) in attendenceList.
Don't write the raw data structure, but write the necessary fields.
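Here is a minimal sketch of the second choice: write the length, then the characters, and reverse that when reading. The function names write_entry and read_entry are made up for this example. The length is still written in the host's native short format, so this only round-trips on the same platform.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    short nameLength;
    char *name;
} attendenceList;

/* Writes the length, then the characters without a terminator.
   Returns 0 on success, -1 on error. */
int write_entry(FILE *fp, const attendenceList *e)
{
    if (e->nameLength < 0)
        return -1;
    size_t len = (size_t)e->nameLength;
    if (fwrite(&e->nameLength, sizeof e->nameLength, 1, fp) != 1)
        return -1;
    if (len > 0 && fwrite(e->name, 1, len, fp) != len)
        return -1;
    return 0;
}

/* Reads one entry. On success e->name is malloc'ed and null-terminated,
   and the caller must free it. Returns 0 on success, -1 on error. */
int read_entry(FILE *fp, attendenceList *e)
{
    short len;
    if (fread(&len, sizeof len, 1, fp) != 1 || len < 0)
        return -1;
    char *buf = malloc((size_t)len + 1);     /* +1 for the terminator */
    if (buf == NULL)
        return -1;
    if (len > 0 && fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        return -1;
    }
    buf[len] = '\0';
    e->nameLength = len;
    e->name = buf;
    return 0;
}
```

The pointer itself never reaches the file; only the length and the characters it points to do.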
You're writing the memory layout of the structure, which includes its members.
You'll get them back if you read the structure back in again, at least if you do it on the same platform, with a program compiled with the same compiler and compiler settings.
Your name member is declared just as a char, so you can't store a string in it.
If name was a pointer like this:
typedef struct {
short nameLength;
char *name;
}attendenceList;
You really should not read/write the struct to a file. You will write the structure as it's laid out in memory, and that includes the value of the name pointer.
fwrite knows nothing about pointers inside your structure; it will not follow them and write whatever they point to.
When you read the structure back again, you'll read in the address in the name pointer, and that might not point to anything sensible anymore.
If you declare name as an array, you'll be ok, as the array and its content is part of the structure.
typedef struct {
short nameLength;
char name[32];
}attendenceList;
As always, make sure you don't try to copy a string that, including its nul terminator, is larger than 32 bytes into name. And when you read it back again, set yourstruct.name[31] = 0; so you are sure the buffer is null-terminated.
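One way to keep the copy within bounds is sketched below. The helper name set_name is hypothetical; it truncates names that are too long rather than rejecting them, which may or may not be what you want.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    short nameLength;
    char name[32];
} attendenceList;

/* Copies src into e->name, truncating if necessary and always leaving
   the buffer null-terminated. */
void set_name(attendenceList *e, const char *src)
{
    /* Zero the buffer first so no stale bytes end up in the file. */
    memset(e->name, 0, sizeof e->name);
    /* snprintf writes at most sizeof e->name bytes, terminator included. */
    snprintf(e->name, sizeof e->name, "%s", src);
    e->nameLength = (short)strlen(e->name);
}
```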
To write a structure, you'd do
attendenceList my_list;
//initialize my_list
if(fwrite(&my_list,sizeof my_list,1,f) != 1) {
//handle error
}
And to read it back again:
attendenceList my_list;
//my_list is filled in by fread
if(fread(&my_list,sizeof my_list,1,f) != 1) {
//handle error
}
I'm assuming you meant char* name instead of char name.
Also sizeof(name) will return the size of a char* (4 on a typical 32-bit platform, 8 on a 64-bit one), not the length of the char array. So you should write strlen(name) bytes, not sizeof(name), in your fwrite.
In your above example I would recommend storing the string exact size without the null termination. You don't need to store the string length as you can get that after.
If you are reading just a string from a file, and you wrote the exact size without the null termination. Then you need to manually null terminate your buffer after you read the data in.
So make sure you allocate at least the size of your data you are reading in plus 1.
Then you can set the last byte of that array to '\0'.
If you write a whole struct at a time to the buffer, you should be careful because of padding. The padding may not always be the same.
when we read in the structure, are we also reading in the variables it stores?
Yes you are, but the problem you have is that, as I mentioned above, you will be storing the pointer char* (typically 4 or 8 bytes) and not the actual char array. I would recommend storing the struct elements individually.
You ask:
now we close the file, and open it later to read in the structure. the question is this: when we read in the structure, are we also reading in the variables it stores?
No. sizeof(names) is a constant value defined at compile time. It will be the same as
sizeof(short) + sizeof(void*) + some_amount_of_padding_to_align_things
it will NOT include the size of what names.name points to, it will only include the size of the pointer itself.
So you have two problems when writing this to a file.
you aren't actually writing the name string to the file
you are writing a pointer value to the file that will have no meaning when you read it back.
As your code is currently written, When you read back the names, names.name will point to somewhere, but it won't point to "John Doe\0".
What you need to do is to write the string pointed to by names.name instead of the pointer value.
What you need to do is sometimes called "flattening" the structure: you make a structure in memory that contains no pointers but holds the same data as the structure you want to use, then you write the flattened structure to disk. This is one way to do that.
typedef struct {
short nameLength;
char name[1]; // this will be variable sized at runtime.
}attendenceListFlat;
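size_t cbFlat = sizeof(attendenceListFlat) + strlen(names.name); // name[1] already covers the terminator
attendenceListFlat * pflat = malloc(cbFlat);
if (pflat != NULL) {
pflat->nameLength = names.nameLength;
strcpy(pflat->name, names.name);
fwrite(pflat, cbFlat, 1, fp);
free(pflat); // the flattened copy is no longer needed once written
}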
The flattened structure ends with an array that has a minimum size of 1, but when we malloc, we add strlen(names.name) so we can treat that as an array of strlen(names.name)+1 size.
A few things.
Structures are just chunks of memory. It's just taking a bunch of bytes and drawing boundaries on them. Accessing structure elements is just a convenient way of getting a particular memory offset cast as a particular type of data.
You are attempting to assign a string to a char type. This will not work. In C, strings are arrays of characters with a NUL byte at the end of them. The easiest way to get this to work is to set aside a fixed buffer for the name. When you create your structure you'll have to copy the name into the buffer (being very careful not to write more bytes than the buffer contains). You can then write/read the buffer from the file in one step.
struct attendanceList {
int namelen;
char name[256]; //fixed size buffer for name
};
Another way you could do it is by having the name be a pointer to a string. This makes what you're trying to do more complicated, because in order to write/read the struct to/from a file, you will have to take into account that the name is stored in a different place in memory. This means two writes and two reads (depending on how you do it) as well as correctly assigning the name pointer to wherever you read the data for the name.
struct attendanceList {
int namelen;
char* name; //the * means "this is a pointer to a char somewhere else in memory"
};
There's a third way you could do it, with a dynamically sized struct using a trick with a zero-length array at the end of a struct. Once you know how long the name is, you allocate the correct amount (sizeof(struct attendanceList) + length of string). Then you have it in one contiguous buffer. You just need to remember that sizeof(struct attendanceList) is not the size you need to write/read. This might be a little confusing for a beginner. It is also kind of a hack that's not supported by all compilers.
struct attendanceList {
int namelen;
char name[0]; //this just allows easy access to the data following the struct. Be careful!
};