use of struct { ... char arr[1]; } construct? - c

I have seen a couple of instances of something like char fl[1] at the end of a struct, as in the following code snippet. I can't work out what such a construct might be used for.
struct test
{
int i;
double j;
char fl[1];
};
int main(int argc, char *argv[])
{
struct test a,b;
a.i=1;
a.j=12;
a.fl[0]='c';
b.i=2;
b.j=24;
memcpy(&(b.fl), "test1" , 6);
printf("%lu %lu\n", sizeof(a), sizeof(b));
printf("%s\n%s\n",a.fl,b.fl);
return 0;
}
output -
24 24
c<some junk characters here>
test1

It's called "the struct hack", and you can read about it at the C FAQ. The general idea is that you allocate more memory than necessary for the structure as listed, and then use the array at the end as if it had length greater than 1.
There's no need to use this hack anymore though, since it's been replaced by C99+ flexible array members.
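As a rough sketch of what the flexible array member version of the question's struct might look like (the names struct test_fam and test_fam_new are made up for illustration and are not from the original code):

```c
#include <stdlib.h>
#include <string.h>

/* Same fields as the question's struct, but ending in a C99 flexible array member. */
struct test_fam {
    int i;
    double j;
    char fl[];   /* no size given: the storage comes from the allocation */
};

/* Hypothetical helper: allocate a struct test_fam with room for a copy of s. */
struct test_fam *test_fam_new(int i, double j, const char *s)
{
    size_t n = strlen(s) + 1;                 /* include the terminating '\0' */
    struct test_fam *t = malloc(sizeof *t + n);
    if (t == NULL)
        return NULL;
    t->i = i;
    t->j = j;
    memcpy(t->fl, s, n);
    return t;                                 /* caller releases it with free() */
}
```

A call such as test_fam_new(2, 24.0, "test1") gives a struct whose fl really does have room for the whole string, which is what the memcpy in the question was missing.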

The idea usually is to have a name for variable-size data, like a packet read off a socket:
struct message {
uint16_t len; /* tells length of the message */
uint16_t type; /* tells type of the message */
char payload[1]; /* placeholder for message data */
};
Then you cast your buffer to such struct, and work with the data by indexing into the array member.
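A slightly more defensive variant of the same idea, as a sketch only: copy the header out of the buffer with memcpy (which avoids alignment problems with the cast) and check the length before touching the payload. This assumes len counts payload bytes and is in host byte order; struct message_header and message_payload are names invented here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size part of the message from the answer above. */
struct message_header {
    uint16_t len;    /* assumed here: number of payload bytes */
    uint16_t type;
};

/* Hypothetical helper: fill *hdr from buf and return a pointer to the
 * payload inside buf, or NULL if the buffer is too short for what the
 * header claims. */
const unsigned char *message_payload(const unsigned char *buf, size_t buflen,
                                     struct message_header *hdr)
{
    if (buflen < sizeof *hdr)
        return NULL;
    memcpy(hdr, buf, sizeof *hdr);            /* no alignment assumptions on buf */
    if (hdr->len > buflen - sizeof *hdr)
        return NULL;
    return buf + sizeof *hdr;
}
```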

Note that the code you have written is overwriting memory that you shouldn't be touching. The memcpy() is writing more than one character into a one character array.
The use case for this is often more like this:
struct test *obj;
obj = malloc(sizeof(struct test) + 300); // 300 characters of space in the
// flexible member (the array).
obj->i = 3;
obj->j = 300;
snprintf(obj->fl, 300, "hello!");

Related

Cast a struct member in GDB Pretty Print?

I want to pretty print this struct struct MyStruct { char buffer[16]; }. Depending on buffer[15] I want to print buffer as a 10 byte string or treat it as a pointer. The 10 byte case is simple and works return self.val['buffer'].string(length = 10)
The second case I can't figure out. I want to do something like (*(char**)buffer[0]), but I'm not sure how. I thought parse_and_eval could be an easy option even if it's not optimal, but I couldn't figure out how to access buffer. I also need to read part of the buffer as a 32-bit int (len = *(int*)(buffer+4);), and I couldn't figure that out either.
If I interpret your description correctly, I think your data is actually laid out like this:
struct MyStruct {
union {
char string[10];
struct {
char *p;
int size;
} ptr;
};
int flag;
};
So I think casting the buffer to a pointer of that type, and then choosing the format based on mystruct->flag would make life easier for you.
Even if my interpretation is not correct, try to find the correct version of that struct that captures the duality of the data.

Preventing char pointer overflow in struct

I have a function that accepts a struct * pointer containing sensitive data (in a char array) as an argument (sort of a small library).
The two struct models are as follows:
struct struct1 {
char str[1024]; /* maybe even 4096 or 10KB+ */
size_t str_length;
};
struct struct2 {
char *str;
size_t str_length;
};
The test function is:
/* Read str_length bytes from the char array. */
void foo(struct struct1/struct2 *s) {
int i;
for (i = 0; i < s->str_length; i++) {
printf("%c\n", s->str[i]);
}
}
My concern is that, since the str_length parameter is an arbitrary value, one could intentionally set it to cause a buffer overflow (actually someone stupid enough to purposely create a security flaw in its own program, but I feel I have to take such cases into account). By using the struct1 model, however, I could simply check for a possible buffer overflow by just using:
if (s->str_length > sizeof(s->str)) {
/* ERROR */
}
The problem is that the array length is actually unknown at compile time. So I don't know whether to use a char * pointer (struct2 style, so no overflow check) or define a very big array (struct1), which would limit the max length (something I would like to avoid) and would allocate unnecessary space most of the time (which could be problematic in embedded systems with scarce memory, I suppose). I know I have to make a compromise. I'd personally use the struct2 model, but I'm not sure if it's a good choice security-wise.
Where does the user of your library get the struct2 instance to pass to the function from? I don't think he creates it by himself and then passes its address to your function, that would be a weird way to pass arguments. It is most likely returned from another function in your library, in which case you can make struct2 an opaque data type that the user cannot alter directly (or only in hacky ways):
/* in the header file */
typedef struct struct2_s struct2;
/* in the implementation file, where allocation is handled as well
* so you know str_length is set to the proper value.
*/
struct struct2_s {
char *str;
size_t str_length;
};
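To make the opaque-type idea a bit more concrete, here is a minimal sketch of what the library side could look like. The function names struct2_create, struct2_length and struct2_destroy are assumptions for illustration; the point is only that str_length is set by the library at allocation time and never by the caller.

```c
#include <stdlib.h>
#include <string.h>

typedef struct struct2_s struct2;   /* this line would live in the header */

struct struct2_s {                  /* this definition stays in the .c file */
    char *str;
    size_t str_length;
};

/* Hypothetical constructor: copies length bytes from data (assumed non-NULL),
 * so str and str_length always agree. */
struct2 *struct2_create(const char *data, size_t length)
{
    struct2 *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->str = malloc(length ? length : 1);
    if (s->str == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->str, data, length);
    s->str_length = length;
    return s;
}

/* Read-only accessor: callers can query the length but not change it. */
size_t struct2_length(const struct2 *s)
{
    return s->str_length;
}

void struct2_destroy(struct2 *s)
{
    if (s != NULL) {
        free(s->str);
        free(s);
    }
}
```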
Put the big array at the end:
struct struct1 {
anyType thisVar;
someType anotherVar;
size_t str_length;
char str[10240000];
};
Let the user malloc it to whatever 'real' size they wish. If they set 'str_length' wrong, well, there's not much you can do about it, whatever approach you take. :(

Continuous memory allocation with different data type in C?

I'm trying to compose a string (char array exactly) containing a fixed 14 starting characters and ending with varying content. The varying bit contains 2 floats and 1 32-bit integer that's to be individually treated as 4 1-byte characters in the array separated by commas. It can be illustrated by the following piece of code, which doesn't compile for some obvious reasons (*char can't assign to *float). So, what can I do to get around it?
char *const comStr = "AT+UCAST:0000=0760,0020,0001\r"; // command string
float *pressure;
float *temperature;
uint32_t *timeStamp;
pressure = comStr + 14; // pressure in the address following the '=' in command string
temperature = comStr + 18; // temperature in the address following the 1st ',' in command string
timeStamp = comStr + 22; // time stamp in the address following the 2nd ',' in command string
I have an unclear memory about something like struct and union in the C language which reserves strictly the memory allocation order in which the variables are defined within the "structure". Maybe something like this:
typedef struct
{
char[14] command;
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
}comStr;
Does this structure guarantee that comStr-> command[15] gives me the first/last byte (depending on the endianness) of *pressure? Or is there some other special construct that does the trick and that I'm missing?
(Note: comStr-> command[15] isn't going to be evaluated in future code, so exceeding index boundary is not a concern here. The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr-> command) gives me exactly the string I want).
p.s. As I am writing this, I came up with an idea. Can I possibly just use memcpy() for the purpose ;) memcpy has parameters of void* type, hopefully it works! I am going to try it now! All hail stackOverflow anyway!
EDIT: I should have made myself clearer, sorry for any confusion! The character array I want to construct is to be sent through UART byte by byte. To do this, a DMA system is to be used to transfer the array to the transmit buffer byte by byte automatically if the character array's starting memory address and length are given to the DMA system. So the character array must be stored contiguously in memory. I hope this makes the question clearer.
This proposed structure:
typedef struct
{
char[14] command;
float *pressure;
char comma;
float *temperature;
char comma;
uint32_t *time_stamp;
char CR;
}comStr;
Is not going to help you with your requirement:
The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr->command) gives me exactly the string I want.
Note you can't have two members with the same name; you'd need to use comma1 and comma2 for example. Also, the array dimension is in the wrong place.
One problem is that there will be padding bytes within the structure.
Another problem is that the pointers will be holding addresses of something outside the structure (since there is nothing valid inside the structure for them to point at).
It is not clear what you're after. Only a very limited range of floating point values can be represented by 4 bytes in a string. If you're after binary data I/O, then you can drop the pointers and the commas:
typedef struct
{
char command[14];
float pressure;
float temperature;
uint32_t time_stamp;
}comStr;
If you want the commas present, then you're going to have to work harder:
typedef struct
{
char command[14];
char pressure[4];
char comma1;
char temperature[4];
char comma2;
char time_stamp[4];
char CR;
} comStr;
You will have to load the data carefully:
comStr com; /* comStr is a typedef name, so no struct keyword */
float pressure = ...;
float temperature = ...;
uint32_t time_stamp = ...;
assert(sizeof(float) == 4);
...
memmove(&com.pressure, &pressure, sizeof(pressure));
memmove(&com.temperature, &temperature, sizeof(temperature));
memmove(&com.time_stamp, &time_stamp, sizeof(time_stamp));
You have to unpack with a similar set of memory copies. Note that you won't be able to use simple string manipulation on the structure; there could be zero bytes in any or all of the pressure, temperature and time_stamp sections of the structure.
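The unpacking side might look something like this sketch. It repeats the char-array version of the structure so that it stands alone, and it assumes, like the assert above, that float is 4 bytes; the function name comStr_unpack is made up here.

```c
#include <stdint.h>
#include <string.h>

typedef struct
{
    char command[14];
    char pressure[4];
    char comma1;
    char temperature[4];
    char comma2;
    char time_stamp[4];
    char CR;
} comStr;

/* Hypothetical helper: copy the raw bytes back into properly typed
 * variables. Assumes sizeof(float) == 4 and the same byte order as
 * the side that packed the data. */
void comStr_unpack(const comStr *com, float *pressure,
                   float *temperature, uint32_t *time_stamp)
{
    memcpy(pressure, com->pressure, sizeof com->pressure);
    memcpy(temperature, com->temperature, sizeof com->temperature);
    memcpy(time_stamp, com->time_stamp, sizeof com->time_stamp);
}
```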
Structure padding
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>
typedef struct
{
char command[14];
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
} comStr;
int main(void)
{
static const struct
{
char *name;
size_t offset;
} offsets[] =
{
{ "command", offsetof(comStr, command) },
{ "pressure", offsetof(comStr, pressure) },
{ "comma1", offsetof(comStr, comma1) },
{ "temperature", offsetof(comStr, temperature) },
{ "comma2", offsetof(comStr, comma2) },
{ "time_stamp", offsetof(comStr, time_stamp) },
{ "CR", offsetof(comStr, CR) },
};
enum { NUM_OFFSETS = sizeof(offsets)/sizeof(offsets[0]) };
printf("Size of comStr = %zu\n", sizeof(comStr));
for (int i = 0; i < NUM_OFFSETS; i++)
printf("%-12s %2zu\n", offsets[i].name, offsets[i].offset);
return 0;
}
Output on Mac OS X:
Size of comStr = 64
command 0
pressure 16
comma1 24
temperature 32
comma2 40
time_stamp 48
CR 56
Note how large the structure is on a 64-bit machine. Pointers are 8-bytes each and are 8-byte aligned.
There are various issues to cover in your question. I'll take a shot at some of them.
The order of members in a structure is guaranteed to be the same as the order in which you declared them. But there is a different issue here: padding.
See http://c-faq.com/struct/padding.html and follow the other links/questions there.
Next thing is that you are mistaken in thinking that something like "125" is an integer or something like "1.25" is a float. It's not; it's a string, i.e.
char * p = "125";
p[0] will not contain 1. It will contain '1', which is 49 if the encoding is ASCII. Likewise, p[1] will contain 50 and p[2] will contain 53. It will be something similar for a float.
The opposite will also happen,
i.e. if you have a float at an address and you treat it as a char array, the char array will not contain the text of the float you think it will.
Try this program to see this
#include <stdio.h>
struct A
{
char c[4];
float * p;
int i;
};
int main()
{
float x = 1.25;
struct A a;
a.p = &x;
a.i = 0; // to make sure the 'presumed' string starting at p gets null terminate after the float
printf("%s\n", &a.c[4]);
}
For me, it prints "╪·↓". And this has nothing to do with endianness.
Another thing to remember when assigning values to your structure object: comStr.pressure and comStr.temperature are pointers. You cannot assign float values to them directly. You need to either give them the address of an existing float or allocate memory dynamically for them to point to.
Also, are you trying to create the char array or to parse a char array which already exists? If you are trying to create it, a better way will be to use snprintf. snprintf uses format specifiers similar to printf but prints to a char array, so you can create your char array that way. A bigger question remains: what do you plan to do with this char array you create? That will determine if endianness is relevant for you.
If you are trying to read from the char array you have been given and trying to split into floats and commas and whatever, then one way to do this will be sscanf but may be difficult for your particular string format.
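For the "create it" case, a minimal snprintf sketch could look like the following. It assumes the receiver wants the values as zero-padded 4-digit decimal text, as in the example string from the question; the function name build_command and its parameters are invented for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: format the command as text, e.g.
 * "AT+UCAST:0000=0760,0020,0001\r".
 * Returns the number of characters written (not counting the '\0'),
 * or -1 if the output buffer is too small or formatting fails. */
int build_command(char *out, size_t outsz, unsigned pressure,
                  unsigned temperature, uint32_t time_stamp)
{
    int n = snprintf(out, outsz, "AT+UCAST:0000=%04u,%04u,%04lu\r",
                     pressure, temperature, (unsigned long)time_stamp);
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    return n;
}
```

With the values 760, 20 and 1 this produces the 29-character string from the question, which can then be handed to the DMA transfer as an ordinary contiguous char array.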
At last, I found an easy way round but I don't know if there is any drawback for this method. I did:
char commandStr[27];
char *commandHeader = "AT+UCAST:0000=";
float pressure = 760.0;
float temperature = 20.0;
uint32_t timeStamp = 0;
memcpy(commandStr, commandHeader, 14);
commandStr[26] = '\r';
memcpy((void*)(commandStr+14), (void*)(&pressure), 4);
memcpy((void*)(commandStr+18), (void*)(&temperature), 4);
memcpy((void*)(commandStr+22), (void*)(&timeStamp), 4);
Does this code have any security issues or performance issues or whatever?

how to serialize a struct in c?

I have a struct object that comprises several primitive data types, pointers and struct pointers. I want to send it over a socket so that it can be used at the other end. As I want to pay the serialization cost upfront, how do I initialize an object of that struct so that it can be sent immediately without marshalling? For example
struct A {
int i;
struct B *p;
};
struct B {
long l;
char *s[0];
};
struct A *obj;
// how do I initialize obj?
int len = sizeof(struct A) + sizeof(struct B) + sizeof(?);
obj = (struct A *) malloc(len);
...
write(socket, obj, len);
// on the receiver end, I want to do this
char buf[len];
read(socket, buf, len);
struct A *obj = (struct A *)buf;
int i = obj->i;
char *s = obj->p->s[0];
Thank you.
The simplest way to do this may be to allocate a chunk of memory to hold everything. For instance, consider a struct as follows:
typedef struct A {
int v;
char* str;
} our_struct_t;
Now, the simplest way to do this is to create a defined format and pack it into an array of bytes. I will try to show an example:
int sLen = 0;
int tLen = 0;
char* serialized = 0;
char* metadata = 0;
char* xval = 0;
char* xstr = 0;
our_struct_t x;
x.v = 10;
x.str = "Our String";
sLen = strlen(x.str); // Assuming null-terminated (which ours is)
tLen = sizeof(int) + sLen; // Our struct has an int and a string - we want the whole string not a mem addr
serialized = malloc(sizeof(char) * (tLen + sizeof(int))); // We have an additional sizeof(int) for metadata - this will hold our string length
metadata = serialized;
xval = serialized + sizeof(int);
xstr = xval + sizeof(int);
*((int*)metadata) = sLen; // Pack our metadata
*((int*)xval) = x.v; // Our "v" value (1 int)
strncpy(xstr, x.str, sLen); // A full copy of our string
So this example copies the data into an array of size 2 * sizeof(int) + sLen which allows us a single integer of metadata (i.e. string length) and the extracted values from the struct. To deserialize, you could imagine something as follows:
char* serialized = // Assume we have this
char* metadata = serialized;
char* yval = metadata + sizeof(int);
char* ystr = yval + sizeof(int);
our_struct_t y;
int sLen = *((int*)metadata);
y.v = *((int*)yval);
y.str = malloc((sLen + 1) * sizeof(char)); // +1 to null-terminate
strncpy(y.str, ystr, sLen);
y.str[sLen] = '\0';
As you can see, our array of bytes is well-defined. Below I have detailed the structure:
Bytes 0-3 : Meta-data (string length)
Bytes 4-7 : X.v (value)
Bytes 8 to 8+sLen-1 : X.str (value)
This kind of well-defined structure allows you to recreate the struct on any environment if you follow the defined convention. To send this structure over the socket, now, depends on how you develop your protocol. You can first send an integer packet containing the total length of the packet which you just constructed, or you can expect that the metadata is sent first/separately (logically separately, this technically can still all be sent at the same time) and then you know how much data to receive on the client-side. For instance, if I receive metadata value of 10 then I can expect sizeof(int) + 10 bytes to follow to complete the struct. In general, this is probably 14 bytes.
EDIT
I will list some clarifications as requested in the comments.
I do a full copy of the string so it is in (logically) contiguous memory. That is, all the data in my serialized packet is actually full data; there are no pointers. This way, we can send a single buffer (we call it serialized) over the socket. If we simply send the pointer, the user receiving the pointer would expect that pointer to be a valid memory address. However, it is unlikely that your memory addresses will be exactly the same. Even if they are, he will not have the same data at that address as you do (except in very limited and specialized circumstances).
Hopefully this point is made more clear by looking at the deserialization process (this is on the receiver's side). Notice how I allocate a struct to hold the information sent by the sender. If the sender did not send me the full string but instead only the memory address, I could not actually reconstruct the data which was sent (even on the same machine we have two distinct virtual memory spaces which are not the same). So in essence, a pointer is only a good mapping for the originator.
Finally, as far as "structs within structs" go, you will need to have several functions for each struct. That said, it is possible that you can reuse the functions. For instance, if I have two structs A and B where A contains B, I can have two serialize methods:
char* serializeB()
{
// ... Do serialization
}
char* serializeA()
{
char* B = serializeB();
// ... Either add on to serialized version of B or do some other modifications to combine the structures
}
So you should be able to get away with a single serialization method for each struct.
This answer sets aside the problems with your malloc.
Unfortunately, you cannot find a nice trick that would still be compatible with the standard. The only way of properly serializing a structure is to separately dissect each element into bytes, write them to an unsigned char array, send them over the network and put the pieces back together on the other end. In short, you would need a lot of shifting and bitwise operations.
In certain cases you would need to define a kind of protocol. In your case for example, you need to be sure you always put the object p is pointing to right after struct A, so once recovered, you can set the pointer properly. Did everyone say enough already that you can't send pointers through network?
Another protocolish thing you may want to do is to write the size allocated for the flexible array member s in struct B. Whatever layout for your serialized data you choose, obviously both sides should respect.
It is important to note that you cannot rely on anything machine specific such as order of bytes, structure paddings or size of basic types. This means that you should serialize each field of the element separately and assign them fixed number of bytes.
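As an illustration of the "shifting and bitwise operations" approach, here is a small sketch that writes and reads a 32-bit value as exactly 4 bytes, most significant byte first, regardless of the host's byte order. The choice of big-endian and the names put_u32_be and get_u32_be are assumptions made for this example.

```c
#include <stdint.h>

/* Write v into out[0..3], most significant byte first. */
void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read a value previously written by put_u32_be. */
uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Each field of the struct would get a pair of functions like this (or reuse these for its 32-bit members), so the bytes on the wire do not depend on padding, type sizes or endianness of either machine.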
You should serialize the data in a platform independent way.
Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj));
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys. There is also support for lists. And you can insert a structure inside another (nested structures). eg:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);
Interpret your data and understand what you want to serialize. You want to serialize an integer and a structure of type B (recursively, you want to serialize an int, a long, and an array of strings). Then serialize them. The length you need is sizeof(int) + sizeof(long) + ∑(strlen(s[i])+1).
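The length formula above could be computed along these lines. This is only a sketch: the function name serialized_length is made up, and it assumes the number of strings is tracked separately, since struct B as posted does not store it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed for an int, a long and `count`
 * strings, each stored with its terminating '\0'. */
size_t serialized_length(const char *const *s, size_t count)
{
    size_t len = sizeof(int) + sizeof(long);
    for (size_t k = 0; k < count; k++)
        len += strlen(s[k]) + 1;
    return len;
}
```

Note that the receiver also has to learn the string count somehow, for example from an extra field sent ahead of the strings.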
On the other hand, serialization is a solved problem (multiple times actually). Are you sure you need to hand write a serialization routine ? Why don't you use D-Bus or a simple RPC call ? Please consider using them.
I tried the method provided by @RageD but it didn't work.
The int value I got from deserialization was not the original one.
For me, memcpy() works for non-string variables. (You can still use strcpy() for char *)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct A {
int a;
char *str;
} test_struct_t;
char *serialize(test_struct_t t) {
int str_len = strlen(t.str);
int size = 2 * sizeof(int) + str_len;
char *buf = malloc(sizeof(char) * (size+1));
memcpy(buf, &t.a, sizeof(int));
memcpy(buf + sizeof(int), &str_len, sizeof(int));
memcpy(buf + sizeof(int) * 2, t.str, str_len);
buf[size] = '\0';
return buf;
}
test_struct_t deserialize(char *buf) {
test_struct_t t;
memcpy(&t.a, buf, sizeof(int));
int str_len;
memcpy(&str_len, buf+sizeof(int), sizeof(int));
t.str = malloc(sizeof(char) * (str_len+1));
memcpy(t.str, buf+2*sizeof(int), str_len);
t.str[str_len] = '\0';
return t;
}
int main() {
char str[15] = "Hello, world!";
test_struct_t t;
t.a = 123;
t.str = malloc(strlen(str) + 1);
strcpy(t.str, str);
printf("original values: %d %s\n", t.a, t.str);
char *buf = serialize(t);
test_struct_t new_t = deserialize(buf);
printf("new values: %d %s\n", new_t.a, new_t.str);
return 0;
}
And the output of the code above is:
original values: 123 Hello, world!
new values: 123 Hello, world!
@Shahbaz is right. I would think you actually want this:
int len = sizeof(struct A);
obj = (struct A *) malloc(len);
But also you will run into problems when sending a pointer to another machine as the address the pointer points to means nothing on the other machine.

Sending char pointer with UDP in C

I have a struct that I am sending to a UDP socket:
typedef struct
{
char field_id;
short field_length;
char* field;
} field_t, *field_p;
I am able to read the field_id and field_length once received on the UDP server-side, however the pointer to field is invalid as expected.
What is the best method to properly send and receive a dynamic char*?
I have a basic solution using memcpy on the client side:
char* data =
(char*)malloc(sizeof(field_t) + strlen(my_field->field) + 1); /* +1 for the '\0' copied below */
memcpy(data, my_field, sizeof(field_t));
memcpy(data+sizeof(field_t), my_field->field, strlen(my_field->field) + 1);
And on the server side:
field_p data = (field_p)buffer;
field_string = (char*)buffer+sizeof(field_t);
Is there a cleaner way of doing this or is this the only way?
Thanks.
You of course cannot send a pointer over a socket - get rid of the char* field; member. Instead, just append id and size pair with the data itself. Use writev(2) or sendmsg(2) to avoid moving data around from buffer to buffer.
Watch out for structure member alignment and padding and number endianness.
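A sketch of the writev(2) suggestion: the id and length go out as a small fixed header, followed directly by the data, without copying everything into one buffer first. The 3-byte header layout (1 byte id, 2 bytes big-endian length) and the name send_field are assumptions for this example; on a connected UDP socket a single writev call should go out as one datagram.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send id, length and data with one system call.
 * Returns the number of bytes written, or -1 on error (see writev(2)). */
ssize_t send_field(int fd, uint8_t id, const char *data, uint16_t len)
{
    unsigned char header[3];
    struct iovec iov[2];

    header[0] = id;
    header[1] = (unsigned char)(len >> 8);    /* length, high byte first */
    header[2] = (unsigned char)(len & 0xff);

    iov[0].iov_base = header;
    iov[0].iov_len = sizeof header;
    iov[1].iov_base = (void *)data;           /* writev only reads from it */
    iov[1].iov_len = len;

    return writev(fd, iov, 2);
}
```

Writing the header byte by byte like this also sidesteps the padding and endianness concerns mentioned above.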
Serialization is your friend.
Define your structure as:
typedef struct
{
uint8_t field_id;
uint16_t field_length;
char field[0]; // note: in C99 you could use char field[];
} field_t, *field_p;
Then, text buffer will immediately follow your structure. Just remember a few tricks:
// initialize structure
field_t *
field_init (uint8_t id, uint16_t len, const char *txt)
{
field_t *f = malloc (sizeof (field_t) + len); // note "+ len"
f->field_id = id;
f->field_length = len;
memcpy (f->field, txt, len);
return f;
}
// send structure
int
field_send (field_t *f, int fd)
{
return write (fd, f, sizeof (*f) + f->field_length); // note "+ f->field_length"
}
I don't think it's standard, though. However, most compilers (GCC && MSVC) should support this. If your compiler does not support zero-sized array, you can use one-element char array - just remember to subtract extra one byte when calculating packet size.
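For the one-element array fallback, one way to get the size right is to use offsetof instead of subtracting a byte from sizeof, because sizeof may also include trailing padding. The following is a sketch under that assumption; field1_t, field1_packet_size and field1_init are names invented here, and writing past field[1] relies on the same struct hack behaviour discussed above rather than on anything the standard promises.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    uint8_t field_id;
    uint16_t field_length;
    char field[1];            /* one-element placeholder (struct hack) */
} field1_t;

/* Bytes occupied by the header plus len bytes of text. */
size_t field1_packet_size(uint16_t len)
{
    return offsetof(field1_t, field) + len;
}

field1_t *field1_init(uint8_t id, uint16_t len, const char *txt)
{
    size_t size = field1_packet_size(len);
    if (size < sizeof(field1_t))
        size = sizeof(field1_t);   /* never allocate less than the struct itself */
    field1_t *f = malloc(size);
    if (f == NULL)
        return NULL;
    f->field_id = id;
    f->field_length = len;
    memcpy(f->field, txt, len);
    return f;
}
```

The value returned by field1_packet_size is then also the byte count to pass to write when sending.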
