How to save a dynamic struct to file - c

I have something like this, in fact more complex struct than this:
typedef struct _sample {
unsigned char type;
char *name;
test *first;
} sample;
typedef struct _test {
test *prev;
test *next;
char *name;
int total;
test_2 **list;
} test;
typedef struct _test_2 {
char *name;
unsigned int blabla;
} test_2;
sample *sample_var;
I want to back up this struct to a file and restore it later.
I also tried fwrite(sample_var, sizeof(sample), 1, file_handle); but the real problem is that sizeof(sample) returns the wrong size, not the real size of the variable.
Is there a way to save it to a file and restore it without knowing the size?

You are trying to serialize, or marshal the structure. You can't just fwrite the data (having pointers is the most obvious stopper). The sizeof problem is really minor when compared to storing pointers in a file (a pointer is meaningless outside the program where it originated).
You will have to define your own serialization / deserialization functions. You could either use your own simple format or use JSON, XML, XDR or something like that.
Personally I would go with JSON, since it's all the rage these days anyway.
As an aside, here is a C FAQ vaguely linked to your own question (though it discusses interoperability issues).

There is no easy approach to save such a structure into a file. For instance, the sample.name field itself is only a pointer (typically 4 or 8 bytes, depending on the architecture), while what you probably want to save is the content of the memory pointed to by sample.name.
Here is some sample code that does this for the first fields. You will have to repeat the process for the remaining fields to save the entire structure.
void saveToFile(FILE *fh, sample s)
{
    fwrite(&s.type, sizeof s.type, 1, fh); // write the type byte
    size_t nameSize = strlen(s.name); // get the length of the name field
    fwrite(&nameSize, sizeof nameSize, 1, fh); // write the length of the name field
    fwrite(s.name, sizeof(char), nameSize, fh); // write the content of the name field
    // continue with other fields
}
The idea is to store the size of the next piece of data and then write its content. To get the information back from the file, you read the size and then read the data.
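To make the read side concrete, here is a minimal sketch of a length-prefixed string writer and its matching reader. The names write_string and read_string are made up for this illustration, and the length is stored as a 32-bit value in host byte order, so the file is only readable on a machine with the same endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write a string as a 32-bit length followed by
 * its bytes (no terminator). Returns 0 on success, -1 on failure. */
int write_string(FILE *fh, const char *s)
{
    assert(fh != NULL && s != NULL);
    uint32_t len = (uint32_t)strlen(s);
    if (fwrite(&len, sizeof len, 1, fh) != 1)
        return -1;
    if (len > 0 && fwrite(s, 1, len, fh) != len)
        return -1;
    return 0;
}

/* Hypothetical helper: read a string written by write_string().
 * Returns a malloc'd, NUL-terminated copy that the caller must free,
 * or NULL on failure (including end of file). */
char *read_string(FILE *fh)
{
    assert(fh != NULL);
    uint32_t len;
    if (fread(&len, sizeof len, 1, fh) != 1)
        return NULL;
    char *s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;
    if (len > 0 && fread(s, 1, len, fh) != len) {
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

Each char * field of the struct would be saved with one write_string() call and restored with one read_string() call, in the same order.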

sizeof(sample) is not incorrect: it returns the size of a char followed by two pointers (plus any padding). If you need to save such a recursive data type, you have to manually follow (dereference) the pointers.

It seems like what you really want to do is store the struct and what its pointers are referring to, not the pointers themselves.
You will need to write some logic to determine the size of the data being pointed at, and write that data to the file instead of the pointers.
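As a rough sketch of that logic for a linked list, the code below walks the nodes and writes the pointed-to data rather than the pointers. The node type is a simplified, hypothetical stand-in for the question's test struct, and save_list, load_list and free_list are names invented here. Integers are written in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the question's `test` node. */
typedef struct node {
    struct node *next;
    char *name;
    int total;
} node;

/* Write each node as: marker byte 1, total, name length, name bytes.
 * A marker byte of 0 ends the list. Returns 0 on success. */
int save_list(FILE *fh, const node *head)
{
    assert(fh != NULL);
    for (const node *n = head; n != NULL; n = n->next) {
        uint8_t present = 1;
        int32_t total = n->total;
        uint32_t len = (uint32_t)strlen(n->name);
        if (fwrite(&present, 1, 1, fh) != 1 ||
            fwrite(&total, sizeof total, 1, fh) != 1 ||
            fwrite(&len, sizeof len, 1, fh) != 1 ||
            fwrite(n->name, 1, len, fh) != len)
            return -1;
    }
    uint8_t end = 0;
    return fwrite(&end, 1, 1, fh) == 1 ? 0 : -1;
}

void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head->name);
        free(head);
        head = next;
    }
}

/* Rebuild the list. The next pointers are recreated from the order of
 * the records; no pointer value is ever read from the file. */
int load_list(FILE *fh, node **out)
{
    assert(fh != NULL && out != NULL);
    node *head = NULL;
    node **tail = &head;
    for (;;) {
        uint8_t present;
        int32_t total;
        uint32_t len;
        if (fread(&present, 1, 1, fh) != 1)
            goto fail;
        if (!present)
            break;
        if (fread(&total, sizeof total, 1, fh) != 1 ||
            fread(&len, sizeof len, 1, fh) != 1)
            goto fail;
        node *n = calloc(1, sizeof *n);
        if (n == NULL)
            goto fail;
        *tail = n;          /* link first so free_list() cleans up on failure */
        tail = &n->next;
        n->total = total;
        n->name = malloc((size_t)len + 1);
        if (n->name == NULL || fread(n->name, 1, len, fh) != len)
            goto fail;
        n->name[len] = '\0';
    }
    *out = head;
    return 0;
fail:
    free_list(head);
    return -1;
}
```

The prev pointer of a doubly linked list needs no storage either: it can be filled in while the nodes are relinked on load.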

Related

Make struct Array point to another struct Array

I have two structs in a library I cannot change, for example:
struct{
uint8_t test;
uint8_t data[8];
}typedef aStruct;
struct{
uint8_t value;
uint8_t unimportant_stuff;
char data[8];
}typedef bStruct;
aStruct a;
bStruct b;
In my application there is a process that continuously refreshes my aStructs.
Now I have a buffer of bStructs that I want to keep updated as well.
The data[] array is the important field. I don't really care about the other values of the structs.
I have already made sure that on the specific system where the code runs, a "char" is 8 bits as well.
Now I'd like to make the "b.data" array point to exactly the same values as my "a.data" array, so that when the process refreshes my aStruct, the values in my bStruct are up to date as well.
Since in C an array is only a pointer to the first element, I thought something like this must be possible:
b.data = a.data
But unfortunately this gives me the compiler-error:
error: assignment to expression with array type
Is there a way to do what I intend to do?
Thanks in advance
Okay, according to the input I got from you guys, I think it might be the best thing to redesign my application.
So instead of a buffer of bStruct's I might use a buffer of aStruct*. This makes sure my buffer is always up to date. And then if I need to do something with an element of the buffer, I will write a short getter-function which copies the data from that aStruct* into a temporary bStruct and returns it.
Thanks for your responses and comments.
If you want b.data[] array to point to exactly the same values, then you can make data of b a char* and make it point to a's data.
Something like
struct{
uint8_t value;
uint8_t unimportant_stuff;
char* data;
}typedef bStruct;
and
b.data = a.data;
But, keep in mind, this means that b.data is pointing at the same memory location as a.data and hence, changing values of b.data would change values of a.data also.
There is another way of doing this: copy all the values of a.data into b.data. Then b.data merely contains the same values as a.data, but they live at a different memory location.
This can be done either by copying the 8 elements one by one in a for loop,
or by using memcpy().
NOTE
Arrays cannot be made to point to another memory location, because they are non-modifiable lvalues. If you cannot modify the structs, then you have to use the second method.
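A minimal sketch of the copying approach, assuming the two struct layouts from the question. The helper name copy_data is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same layouts as the library structs in the question. */
typedef struct {
    uint8_t test;
    uint8_t data[8];
} aStruct;

typedef struct {
    uint8_t value;
    uint8_t unimportant_stuff;
    char data[8];
} bStruct;

/* Hypothetical helper: copy a's payload into b. After the call the two
 * arrays hold equal bytes but remain separate objects, so it must be
 * called again whenever a is refreshed. */
void copy_data(bStruct *b, const aStruct *a)
{
    assert(a != NULL && b != NULL);
    memcpy(b->data, a->data, sizeof b->data);
}
```

Because b keeps its own copy, a later write to a.data does not show up in b.data until copy_data() runs again.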
What you are asking is not possible when you can not modify the existing struct definitions. But you can still automate the functionality with a bit of OO style programming on your side. All of the following assumes that the data fields in the structs are of same length and contain elements of same size, as in your example.
Basically, you wrap the existing structs with your own container. You can put this in a header file:
/* Forward declaration of the wrapper type */
typedef struct s_wrapperStruct wrapperStruct;
/* Function pointer type for an updater function */
typedef void (*STRUCT_UPDATE_FPTR)(wrapperStruct* w, aStruct* src);
/* Definition of the wrapper type */
struct s_wrapperStruct
{
STRUCT_UPDATE_FPTR update;
aStruct* ap;
bStruct* bp;
};
Then you can create a factory-style module that you use to create your synced struct pairs, which avoids exposing your synchronization logic to uninterested parties. Implement a couple of simple functions.
#include <string.h> /* memcpy */
/* Sync the data fields of the two separate structs.
   Defined first so that it is declared before updateStructs() calls it.
   Rename it if you also include <unistd.h>, which declares POSIX sync(). */
static void sync(wrapperStruct* w)
{
    if (w != NULL)
    {
        memcpy(w->bp->data, w->ap->data, sizeof(w->bp->data));
    }
}
/* The updater function */
static void updateStructs(wrapperStruct* w, aStruct* src)
{
    if ( (w != NULL) && (src != NULL) )
    {
        /* Copy the source data to your aStruct (or just the data field) */
        memcpy(w->ap, src, sizeof(aStruct));
        /* Sync a's data field to b */
        sync(w); /* Keep this as a separate function so you can make it optional */
    }
}
Then in your factory function you can create the wrapped pairs.
/* Create a wrapper */
wrapperStruct syncedPair = { &updateStructs, &someA, &someB };
You can then pass the pair where you need it, e.g. the process that is updating your aStruct, and use it like this:
/* Pass new data to the synced pair */
syncedPair.update( &syncedPair, &newDataSource );
Because C is not designed as an OO language, it does not have a this pointer and you need to pass around the explicit wrapper pointer. Essentially this is what happens behind the scenes in C++ where the compiler saves you the extra trouble.
If you need to sync a single aStruct to multiple bStructs, it should be quite simple to change the bp pointer to a pointer-to-array and modify the rest accordingly.
This might look like an overly complicated solution, but when you implement the logic once, it will likely save you from some manual labor in maintenance.

Allocating a dynamic array in a dynamically allocated struct (struct of arrays)

This question is really about how to use variable-length types in the Python/C API (PyObject_NewVar, PyObject_VAR_HEAD, PyTypeObject.tp_basicsize and .tp_itemsize), but I can ask this question without bothering with the details of the API. Just assume I need to use an array inside a struct.
I can create a list data structure in one of two ways. (I'll just talk about char lists for now, but it doesn't matter.) The first uses a pointer and requires two allocations. Ignoring #includes and error handling:
struct listptr {
size_t elems;
char *data;
};
struct listptr *listptr_new(size_t elems) {
size_t basicsize = sizeof(struct listptr), itemsize = sizeof(char);
struct listptr *lp;
lp = malloc(basicsize);
lp->elems = elems;
lp->data = malloc(elems * itemsize);
return lp;
}
The second way to create a list uses array notation and one allocation. (I know this second implementation works because I've tested it pretty thoroughly.)
struct listarray {
size_t elems;
char data[1];
};
struct listarray *listarray_new(size_t elems) {
size_t basicsize = offsetof(struct listarray, data), itemsize = sizeof(char);
struct listarray *la;
la = malloc(basicsize + elems * itemsize);
la->elems = elems;
return la;
}
In both cases, you then use lp->data[index] to access the array.
My question is why does the second method work? Why do you declare char data[1] instead of any of char data[], char data[0], char *data, or char data? In particular, my intuitive understanding of how structs work is that the correct way to declare data is char data with no pointer or array notation at all. Finally, are my calculations of basicsize and itemsize correct in both implementations? In particular, is this use of offsetof guaranteed to be correct for all machines?
Update
Apparently this is called the struct hack. In C99, you can use a flexible array member:
struct listarray2 {
size_t elems;
char data[];
};
with the understanding that you'll malloc enough space for data at runtime. Before C99, the data[1] declaration was common. So my question now is why declare char data[1] or char data[] instead of char *data or char data?
The reason you'd declare char data[1] or char data[] instead of char *data or char data is to keep your structure directly serializable and deserializable. This is important in cases where you'll be writing these sorts of structures to disk or over a network socket, etc.
Take for example your first code snippet that requires two allocations. Your listptr type is not directly serializable. i.e. listptr.elems and the data pointed to by listptr.data are not in a contiguous piece of memory. There is no way to read/write this structure to/from disk with a generic function. You need a custom function that is specific to your struct listptr type to do it. i.e. On serialize you'd have to first write elems to disk, and then write the data pointed to by the data pointer. On deserialization you'd have to read elems, allocate the appropriate space to listptr.data and then read the data from disk.
Using a flexible array member solves this problem because listarray.elems and listarray.data reside in one contiguous block of memory. So to serialize it you can simply write out the total allocated size for the structure and then the structure itself. On deserialization you first read the allocated size, allocate the needed space and then read your listarray struct into that space.
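Here is a sketch of that size-then-struct scheme, using the C99 flexible array member form of the struct. The function names are invented for this example, and the raw struct bytes are written as-is, so the file is only meant to be read back by a program built for the same platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct listarray {
    size_t elems;
    char data[];   /* C99 flexible array member */
};

/* Total bytes occupied by a list holding `elems` chars. */
static size_t listarray_size(size_t elems)
{
    return sizeof(struct listarray) + elems;
}

struct listarray *listarray_new(size_t elems)
{
    struct listarray *la = malloc(listarray_size(elems));
    if (la != NULL)
        la->elems = elems;
    return la;
}

/* Write the total size, then the whole struct in one go. */
int listarray_write(FILE *fh, const struct listarray *la)
{
    assert(fh != NULL && la != NULL);
    size_t total = listarray_size(la->elems);
    if (fwrite(&total, sizeof total, 1, fh) != 1)
        return -1;
    return fwrite(la, total, 1, fh) == 1 ? 0 : -1;
}

/* Read the total size, allocate that much, then read the struct into it.
 * Returns NULL on failure or if the stored sizes disagree. */
struct listarray *listarray_read(FILE *fh)
{
    assert(fh != NULL);
    size_t total;
    if (fread(&total, sizeof total, 1, fh) != 1 ||
        total < sizeof(struct listarray))
        return NULL;
    struct listarray *la = malloc(total);
    if (la == NULL)
        return NULL;
    if (fread(la, total, 1, fh) != 1 ||
        listarray_size(la->elems) != total) {
        free(la);
        return NULL;
    }
    return la;
}
```

Note that the reader needs no knowledge of the element type to get the bytes back; only the size prefix matters, which is what makes the generic approach possible.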
You may wonder why you'd ever really need this, but it can be an invaluable feature. Consider a data stream of heterogeneous types. Provided you define a header that identifies which type you have and its size, and precede each item in the stream with this header, you can generically serialize and deserialize the data stream very elegantly and efficiently.
The only reason I know of for choosing char data[1] over char data[] is if you are defining an API that needs to be portable between C99 and C++ since C++ does not have support for flexible array members.
Also, wanted to point out that in the char data[1] you can do the following to get the total needed structure size:
size_t totalsize = offsetof(struct listarray, data[elems]);
You also ask why you wouldn't use char data instead of char data[1] or char data[]. While technically possible to use just plain old char data, it would be (IMHO) morally shunned. The two main issues with this approach are:
You wanted an array of chars, but now you can't access the data member directly as an array. You need to point a pointer to the address of data to access it as an array. i.e.
char *as_array = &listarray.data;
Your structure definition (and your code's use of the structure) would be totally misleading to anyone reading the code. Why declare a single char when you really meant an array of char?
Given these two things, I don't know why anyone would use char data in favor of char data[1]. It just doesn't benefit anyone given the alternatives.

Why are Lua's internal strings stored the way they are?

I wanted a simple string table that stores a bunch of constants, and I thought "Hey! Lua does that, let me use some of their functions!"
This is mainly in the lstring.h/lstring.c files (I am using 5.2).
I will show the code I am curious about first. It's from lobject.h:
/*
** Header for string value; string bytes follow the end of this structure
*/
typedef union TString {
L_Umaxalign dummy; /* ensures maximum alignment for strings */
struct {
CommonHeader;
lu_byte reserved;
unsigned int hash;
size_t len; /* number of characters in string */
} tsv;
} TString;
/* get the actual string (array of bytes) from a TString */
#define getstr(ts) cast(const char *, (ts) + 1)
/* get the actual string (array of bytes) from a Lua value */
#define svalue(o) getstr(rawtsvalue(o))
As you see, the data is stored outside of the structure. To get the byte stream, you take the TString pointer and add 1 (pointer arithmetic, so this skips one whole TString), and you get the char* pointer.
Isn't this bad coding though? It's been DRILLED into me in my C classes to make clearly defined structures. I know I might be stirring up a hornet's nest here, but do you really gain that much speed/space by defining a structure as a header for the data rather than defining a pointer value for that data?
The idea is probably that you allocate the header and the data in one big chunk of data instead of two:
TString *str = (TString*)malloc(sizeof(TString) + <length_of_string>);
In addition to having just one call to malloc/free, you also reduce memory fragmentation and improve memory locality.
But to answer your question: yes, these kinds of hacks are usually bad practice and should be done with extreme care. And if you do use them, you'll probably want to hide them under a layer of macros/inline functions.
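A small sketch of the single-allocation idea with the trick hidden behind an inline accessor. The strhdr type and the str_new and str_bytes names are hypothetical and much simpler than Lua's real TString; they only illustrate the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the string bytes follow it in the same block. */
typedef struct {
    size_t len;   /* number of bytes in the string, excluding the NUL */
} strhdr;

/* Accessor that hides the pointer arithmetic: h + 1 is the address
 * just past the header, which is where the bytes were placed. */
static inline char *str_bytes(strhdr *h)
{
    return (char *)(h + 1);
}

/* Allocate header and bytes with one malloc; release with one free(). */
strhdr *str_new(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    strhdr *h = malloc(sizeof *h + len + 1);
    if (h == NULL)
        return NULL;
    h->len = len;
    memcpy(str_bytes(h), s, len + 1);   /* copies the terminator too */
    return h;
}
```

Callers only ever touch str_new(), str_bytes() and free(), so the layout can change later without affecting them.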
As rodrigo says, the idea is to allocate the header and string data as a single chunk of memory. It's worth pointing out that you also see the non-standard hack
struct lenstring {
unsigned length;
char data[0];
};
but C99 added flexible array members so it can be done in a standard compliant way as
struct lenstring {
unsigned length;
char data[];
};
If Lua's string were done in this way it'd be something like
typedef union TString {
L_Umaxalign dummy;
struct {
CommonHeader;
lu_byte reserved;
unsigned int hash;
size_t len;
const char data[];
} tsv;
} TString;
#define getstr(ts) ((ts)->tsv.data)
It relates to the complications arising from the more limited C language. In C++, you would just define a base class called GCObject which contains the garbage collection variables, then TString would be a subclass and by using a virtual destructor, both the TString and its accompanying const char * blocks would be freed properly.
When it comes to writing the same kind of functionality in C, it's a bit more difficult as classes and virtual inheritance do not exist.
What Lua is doing is implementing garbage collection by inserting the header required to manage the garbage collection status of the part of memory following it. Remember that free(void *) does not need to know anything other than the address of the memory block.
#define CommonHeader GCObject *next; lu_byte tt; lu_byte marked
Lua keeps a linked list of these "collectable" blocks of memory, in this case an array of characters, so that it can then free the memory efficiently without knowing the type of object it is pointing to.
If your TString pointed to another block of memory where the character array was, then the garbage collector would have to determine the object's type and delve into its structure to also free the string buffer.
The pseudo code for this kind of garbage collection would be something like this:
GCHeader *next, *prev = NULL;
GCHeader *current = firstObject;
while (current)
{
    next = current->next;
    if (/* current is ready for deletion */)
    {
        free(current);
        // relink previous to the next (singly-linked list)
        if (prev)
            prev->next = next;
        else
            firstObject = next; // the head itself was deleted
    }
    else
        prev = current; // store previous undeleted object
    current = next;
}

Reading mixed data into C struct

I am trying to read mixed data into a C struct
usually, I do something like this
typedef struct {
uint32_t value;
float x,y,z;
} __attribute__((__packed__)) data;
and read it in like so:
data x;
fread(&x, 1, sizeof(data), filePointer);
and that works just fine for fixed-length data. However, I need to load an ASCIIZ string, which is variable length, and I was wondering if there is an easy way to read that into a struct.
Sorry, but there is no built-in serialization for C. This has been asked on SO before with some very good answers.
If that doesn't give you what you want, then search for C serialize or C serialization in your favorite search engine.
There are two ways you could be storing your ASCIIZ string in the structure, exemplified by:
struct asciiz_1
{
char asciiz[32];
};
struct asciiz_2
{
size_t buflen;
char *buffer;
};
The first (struct asciiz_1) can be treated the same way as your struct data; even though the string may be of variable length with garbage after the null (zero) byte, the structure as a whole is a fixed size and can be handled safely with fread() and fwrite().
The second (struct asciiz_2) is a lost cause. You have to allocate the extra space to receive the string (presumably after reading the length), and the pointer value should not be written to the file (it won't have any meaning to the reading process). So, you have to handle this differently.
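For the second layout, one possible way to handle the string is to read bytes until the terminating zero, growing the buffer as needed. This is only a sketch: read_asciiz is a made-up name, and it assumes the string was stored in the file with its NUL terminator and no length prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read bytes up to and including a NUL terminator
 * into a malloc'd buffer that the caller must free. Returns NULL on
 * allocation failure or if end of file arrives before the NUL. */
char *read_asciiz(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    for (;;) {
        int c = fgetc(fp);
        if (c == EOF) {
            free(buf);
            return NULL;
        }
        if (len == cap) {
            /* Buffer is full: double it before storing the next byte. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        if (c == '\0')
            return buf;
    }
}
```

The fixed-size part of the record can still be read with a single fread(); the string is then read separately and its pointer stored in the in-memory struct.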
Your data structure - your choice.

Flexible Arrays in C99

NOTE: I've rewritten the original question to make it much clearer.
I have a function called
VcStatus readVcard( FILE *const vcf, Vcard **const cardp )
vcf is an open file I will read, and cardp is a pointer to the start of an array of cards.
A file will have multiple cards in it.
readVcard reads the file a line at a time, and calls the function parseVcProp to identify keywords in the line and assign them to the appropriate place in a structure.
Here are the structures
typedef struct { // property (=contentline)
VcPname name; // property name
// storage for 0-2 parameters (NULL if not present)
char *partype; // TYPE=string
char *parval; // VALUE=string
char *value; // property value string
void *hook; // reserved for pointer to parsed data structure
} VcProp;
typedef struct { // single card
int nprops; // no. of properties
VcProp prop[]; // array of properties
} Vcard;
typedef struct { // vCard file
int ncards; // no. of cards in file
Vcard **cardp; // pointer to array of card pointers
} VcFile;
So a file contains multiple cards, a card contains multiple properties, etc.
The thing is, a single card can have any number of properties. It is not known how many until you are done reading them.
Here is what I do not understand.
How must I allocate the memory to use parseVcProp properly?
Each time I call parseVcProp, I obviously want it to store the data in a new structure, so how do I allocate this memory beforehand? Do I just malloc(sizeof(VcProp)*1)?
Vcard *getcards(int n) {
Vcard *c = malloc(sizeof(Vcard) + sizeof(VcProp) * n);
c->nprops = n;
return c;
}
You really need to show us the particular line that's producing the error.
With that said, for a structure like Vcard that contains a flexible array member, you cannot usefully create variables of that type, because they would have no room for the array. You can only create pointer variables and allocate the storage yourself. For instance:
Vcard *vc = malloc(sizeof(Vcard) + n*sizeof(VcProp));
At this point, vc->prop[0] through vc->prop[n-1] are valid array elements (each has type VcProp).
Note that a flexible array member is an array, not a pointer.
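When the number of properties is not known until the card has been read, one option is to grow the allocation with realloc as each property arrives. The sketch below uses cut-down, hypothetical versions of VcProp and Vcard and an invented vcard_append helper; it is not the assignment's API.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical versions of the question's types. */
typedef struct {
    char *value;
} VcProp;

typedef struct {
    int nprops;
    VcProp prop[];   /* flexible array member */
} Vcard;

/* Append one property, growing the card by one element.
 * Pass NULL as `card` to start a new card. Returns the (possibly moved)
 * card, or NULL on failure, in which case the original is untouched. */
Vcard *vcard_append(Vcard *card, VcProp p)
{
    int n = card ? card->nprops : 0;
    assert(n >= 0);
    Vcard *grown = realloc(card, sizeof(Vcard) + sizeof(VcProp) * ((size_t)n + 1));
    if (grown == NULL)
        return NULL;
    grown->prop[n] = p;
    grown->nprops = n + 1;
    return grown;
}
```

Since realloc may move the block, the caller has to keep using the returned pointer. Growing by one element per call keeps the example short; doubling the capacity would reduce the number of reallocations for large cards.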
Sorry for the confusion everyone.
I figured out my error.
The reason things were going wacky is that propp is an output pointer, not an input pointer.
I was trying to pass Vcard->prop as the argument, when I actually had to create my own VcProp and pass its address.

Resources