I'm working on a hot-upgrade feature and need to package up an array of structs to be stashed away for the new version to find them. I really want to avoid adding a conversion function for every possible version transition. Is this reasonable?
The most likely change to the struct is for more fields to be added to the structure in the future and if this happens then a default value for the new field will be available. I will also soon face the task of saving the array of structs into a configuration file, so extra credit for answers that can be applied to both hot-upgrade and configuration saving.
I don't have to worry about the hot-update mechanism; I just give it a pointer and a size and it does fantastic magic :)
From version 1, always include sizeof(myStruct) as a field at the beginning of each struct. Then, when you need to add new fields, always add them at the end of the struct, never in the middle. When receiving (or reading from a file), first read only the size field, so that you know how many bytes follow it. If the size is less than sizeof(myStruct) as determined by the receiver/reader, then you know that trailing fields are missing and default values are needed for them.
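A minimal sketch of the size-prefix idea, for illustration only. The names record_v1, record_v2, load_record and DEFAULT_TIMEOUT_MS are hypothetical, and the sketch assumes that writer and reader share endianness and that the stored size always ends on a field boundary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout written by an older build (hypothetical). */
typedef struct {
    uint32_t size;        /* sizeof the struct in the build that wrote it */
    int32_t  id;
} record_v1;

/* Current layout: the new field is appended at the end, never inserted. */
typedef struct {
    uint32_t size;
    int32_t  id;
    int32_t  timeout_ms;  /* new in this version */
} record_v2;

#define DEFAULT_TIMEOUT_MS 500

/* Load one record from src into the current layout.
 * Returns the stored size (so the caller can step to the next record),
 * or 0 if the input is truncated or malformed. */
static size_t load_record(const unsigned char *src, size_t avail, record_v2 *out)
{
    uint32_t stored;

    assert(src != NULL && out != NULL);
    if (avail < sizeof stored)
        return 0;
    memcpy(&stored, src, sizeof stored);          /* read the size field first */
    if (stored < sizeof stored || stored > avail)
        return 0;

    /* Start from defaults, then overwrite whatever the writer supplied. */
    out->id = 0;
    out->timeout_ms = DEFAULT_TIMEOUT_MS;
    size_t n = stored < sizeof *out ? stored : sizeof *out;
    memcpy(out, src, n);

    out->size = sizeof *out;                      /* now in the current layout */
    return stored;
}
```

Because the function returns the stored size rather than sizeof(record_v2), an array of records written by an older or newer build can still be walked record by record; fields the reader does not know about are simply skipped.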
I'd recommend using something like Google's protocol buffers, which automatically handle versioning. If you add new fields to your messages, it's very easy to handle.
So I have a struct that looks something like this (more or less):
typedef struct AST_STRUCT
{
    enum {
        AST_OBJECT,
        AST_REFERENCE,
        AST_VARIABLE,
        AST_VARIABLE_DEFINITION,
        AST_VARIABLE_ASSIGNMENT,
        AST_VARIABLE_MODIFIER,
        AST_FUNCTION_DEFINITION,
        AST_FUNCTION_CALL,
        AST_NULL,
        AST_STRING,
        AST_CHAR,
        AST_FLOAT,
        AST_LIST,
        AST_BOOLEAN,
        AST_INTEGER,
        AST_COMPOUND,
        AST_TYPE,
        AST_BINOP,
        AST_NOOP,
        AST_BREAK,
        AST_RETURN,
        AST_IF,
        AST_ELSE,
        AST_WHILE,
        AST_ATTRIBUTE_ACCESS,
        AST_LIST_ACCESS,
        AST_NEW
    } type;
    struct AST_STRUCT* variable_value;
} AST_STRUCT;
Now I would like to write this struct, serialized to the disk into a .dat file.
The problem is that as you can see, it has a field called variable_value .
I am using this function to write it to disk:
https://www.geeksforgeeks.org/readwrite-structure-file-c/
I am also using the other function in that article to read it from the disk.
It appears as if the variable_value field on the struct is not loaded properly.
How would I write the entire struct to disk and maintain the data of the variable_value field?
I was first thinking about dumping the variable_value field into a separate file and then sort of "link it back" into the struct once I load it, but maybe there is another way of doing this?
You are trying to serialize a linked structure (I assume a tree, from the name "AST").
If we imagine for a second that you successfully did that and wrote it to disk, when you load it back, you'll allocate memory for the links in the tree, but the addresses (values of pointers) of these memory chunks are not guaranteed to be the same as the old ones. You will not be able to reconstruct the tree.
So you can't use the value of the addresses as links on the disk. You'll need to use some other method. These days, the popular method is the JSON format, which would work for you assuming that you don't have cross-links or back-links in the tree.
So what you need is a JSON C library. I've never used one, but here are a couple I found in 2 seconds:
json-c
cJSON
JSON and XML are perfectly good formats to use if you want to save your data as text. They do have issues with restoring multiple references to the same object, but even those can be resolved.
If the data is a simple linked list, you can simply restore the objects in order and repair the links yourself as you restore the objects. If the data structure is more complex than that, you need a more generic solution.
If you really want to save the data as binary, then you need to fix up pointers when you reload. The main way to do this is to keep a map of saved addresses vs newly allocated addresses. If you really don't like the idea of emitting the saved addresses, you can use a serial number to represent each unique address you find.
For each object you save, you have to record its scalar address and the type of the object before you save the object itself.
For each pointer you save, you need to save the scalar address, and "remember" to later save that referenced object.
On restoring, when you restore an object you need to load the object based on its type, and create a mapping entry that shows how the saved address has turned into a restored address.
If an object contains any pointers, the address stored in those pointers is found by applying that mapping. However, you will often find that the object has not yet been loaded for that pointer, so you will also need to record a mapping from the saved scalar address to the address of the pointer. You can then either fix up all these unfinished pointers after you finish loading, or you can do it for a particular referenced object after that object is loaded.
You need to put in a little extra care to handle objects that support multiple inheritance, by noting that the pointer does not point to the root of the object.
But otherwise, this is just about all you need, plus considerations of versioning, endianness, padding - if you care about the longevity of the data you are saving.
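The address-map approach described above can be sketched roughly as follows. This is an illustration under assumptions, not a complete serializer: the node and saved_node types and the function names are hypothetical, there is only one object type (so no type tag is recorded), the structure is a plain tree, and the records live in a memory array that stands in for the file. All restored nodes are allocated in one block, so record i maps to nodes[i] and a single free() releases the tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *left, *right;
} node;

/* Saved form of one node: its own old address plus the old addresses
 * it pointed to, all stored as plain integers. */
typedef struct {
    uintptr_t self, left, right;
    int value;
} saved_node;

/* Depth-first save into out[], which the caller must make large enough.
 * Returns the number of records written so far. */
static size_t save_tree(const node *n, saved_node *out, size_t count)
{
    assert(out != NULL);
    if (n == NULL)
        return count;
    out[count].self  = (uintptr_t)n;
    out[count].left  = (uintptr_t)n->left;
    out[count].right = (uintptr_t)n->right;
    out[count].value = n->value;
    count++;
    count = save_tree(n->left, out, count);
    return save_tree(n->right, out, count);
}

/* Map a saved address to the restored node (linear search for brevity).
 * Assumes a null pointer was saved as 0. */
static node *remap(uintptr_t saved, const saved_node *in, node *nodes, size_t count)
{
    if (saved == 0)
        return NULL;
    for (size_t i = 0; i < count; i++)
        if (in[i].self == saved)
            return &nodes[i];
    return NULL;                       /* dangling reference in the input */
}

/* Allocate every node first, then fix up the pointers through the map.
 * The root is record 0 because save_tree writes it first. */
static node *load_tree(const saved_node *in, size_t count)
{
    node *nodes = calloc(count, sizeof *nodes);
    if (nodes == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        nodes[i].value = in[i].value;
        nodes[i].left  = remap(in[i].left, in, nodes, count);
        nodes[i].right = remap(in[i].right, in, nodes, count);
    }
    return nodes;
}
```

A real implementation would replace the linear search with a hash table and would write the records with fwrite rather than keeping them in an array.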
Is there a way to create a fixed-size array in LabVIEW?
I know that I can check the array size and discard values when the array grows beyond a specific size. But I think this is a common problem, so is there a built-in function in LabVIEW to get a fixed-size array?
As far as I know this is impossible, unless they changed something in one of their latest releases but I doubt it: it would probably require a serious rewrite of the core array code.
The closest you can get is writing your own (possibly polymorphic) array class in which you encapsulate an actual array, that you initialize once with a certain size. For the rest your class only exposes methods to get/set by index. No resize etc.
Or, if you are talking about arrays of controls etc. on the front panel, you can probably do this at the UI level by hiding the index control and making sure the array cannot be resized graphically. It is probably also doable to create a custom control and strip a lot of the array functionality from it.
If the array size is fixed at design time, then you might consider using a cluster instead. There is even a primitive to convert an array to a cluster of fixed size, provided the length is less than 257 (the Array To Cluster function).
There is also a primitive to go the other way if you need to index the array.
One implementation that you could do is a queue with a fixed size. You can use preview queue and flush queue to implement the functionality you want. However a specific custom class is probably a better idea.
In regular desktop LabVIEW, fixed-sized arrays would be something you'd have to code as per the answers you've already gotten here. However, in LabVIEW FPGA with, say, cRIO, all arrays must be fixed-size.
When using the Call Library Function Node to call a WINAPI DLL, there are times when a structure element is officially defined as BYTE[130]. So how do you absolutely, positively make sure your cluster has exactly the space for 130 bytes?
You can't do it with arrays no matter what, because LabVIEW arrays are pointers to a structure (the first element being the length), meaning any array you insert will only allocate enough space for a pointer, 4 bytes.
The work-around I came up with is to insert a cluster that includes sixteen U64 and one U16, pass that through an unflatten to string and you'll find it's exactly 130 bytes long.
When the cluster returns from the call, merely type cast the flattened string result into a U8 array.
I know one can use SetWindowLongPtr + GWLP_USERDATA to store a pointer which points to some data.
But could one store the data directly, for example "a handle", "a bool", "an int", or other larger data?
From http://msdn.microsoft.com/zh-tw/library/windows/desktop/ms644898%28v=vs.85%29.aspx, it says:
Sets new extra information that is private to the application, such as handles or pointers.
, so I guess to store a handle is OK. I also used this method to store an RGB value without problem.
But I don't know if this is a good idea to do things like this. And can we store other data which is large (for example, a structure)?
p.s: The motivation of this question is: When I create a dialog window, I want to store data for each of its controls. Of course I can use static variables in the window procedure and pass pointer (to them) to SetWindowLongPtr function. But this is not "perfect" in theory, because when the dialog window is closed, I don't need these data anymore. Of course, in practice, the data I need to use is very small, and I should not care about the usage of memory. But I still like to know if there is a better way.
You only need one pointer to store anything you want. Declare a struct with the data you want to store. Allocate it before the CreateWindowEx() call and pass the pointer as the last argument. You get it back in your window procedure when handling the WM_CREATE message, in the CREATESTRUCT.lpCreateParams field. Now call SetWindowLongPtr to store that pointer.
Any time you need it back, use GetWindowLongPtr to recover the pointer to the struct. You'll need to clean up again; use the WM_NCDESTROY message to release the memory.
Note that this is a standard technique used in C++ class libraries that wrap the winapi. Do consider using one of them instead of spinning this yourself.
The SetWindowLongPtr function can store a piece of data which has the same size as LONG_PTR (most likely 32bit or 64bit). If your data can be stored in that size, you're fine. I.e. a bool would be fine, so would most handles (since handles tend to be pointers, too).
A typical RGB value would work as well since it's stored as three bytes (one byte per color component) or four bytes (an extra byte for the alpha channel).
If you need more space than this, you should allocate a structure somewhere else and store a pointer to that structure.
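To illustrate why a small value such as an RGB color fits directly, here is a portable sketch. It does not call the Windows API: slot_t is a hypothetical stand-in for LONG_PTR (both are pointer-sized signed integers), and pack_rgb and the accessor functions are illustrative names. The byte order mirrors the usual 0x00bbggrr layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for LONG_PTR: a pointer-sized signed integer. */
typedef intptr_t slot_t;

/* The value must fit in the slot; three color bytes need at most 4 bytes. */
static_assert(sizeof(slot_t) >= 4, "slot too small for a packed RGB value");

/* Pack three 8-bit components into one integer (0x00bbggrr). */
static slot_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return (slot_t)((uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16));
}

/* Recover the individual components from the stored value. */
static uint8_t rgb_red(slot_t v)   { return (uint8_t)(v & 0xFF); }
static uint8_t rgb_green(slot_t v) { return (uint8_t)((v >> 8) & 0xFF); }
static uint8_t rgb_blue(slot_t v)  { return (uint8_t)((v >> 16) & 0xFF); }
```

On Windows, the value returned by pack_rgb would be what is passed as the new value to SetWindowLongPtr with GWLP_USERDATA; anything larger than the slot has to live in separately allocated memory, with only its pointer stored.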
I am creating a new Tcl_ObjType, so I need to define the 4 functions: setFromAnyProc, updateStringProc, dupIntRepProc and freeIntRepProc. While testing my code, I noticed something puzzling.
In my testing code, when I do the following:
Tcl_GetString(p_New_Tcl_obj);
updateStringProc() for the new TCL object is called, I can see it in gdb, this is expected.
The weird thing is when I do the following testing code:
Tcl_SetStringObj(p_New_Tcl_obj, p_str, strlen(p_str));
I expect setFromAnyProc() to be called, but it is not!
I am confused. Why is it not called?
The setFromAnyProc is not nearly as useful as you might think. Its role is to convert a value[*] from something with a populated bytes field into something with a populated bytes field and a valid internalRep and typePtr. It's called when something wants a generic conversion to a particular format, and is in particular the core of the Tcl_ConvertToType function. You probably won't have used that; Tcl itself certainly doesn't!
This is because it turns out that the point when you want to do the conversion is in a type-specific accessor or manipulator function (examples from Tcl's API include Tcl_GetIntFromObj and Tcl_ListObjAppendElement, which are respectively an accessor for the int type[**] and a manipulator for the list type). At that point, you're in code that has to know the full details of the internals of that specific type, so using a generic conversion is not really all that useful: you can do the conversion directly if necessary (or factor that out to a conversion function).
Tcl_SetStringObj works by throwing away the internal representation of your object (with the freeIntRepProc callback), disposing of the old bytes string representation (through Tcl_InvalidateStringRep, or rather its internal analog) and then installing the new bytes you've supplied.
I find that I can leave the setFromAnyProc field of a Tcl_ObjType set to NULL with no problems.
[*] The Tcl_Obj type is mis-named for historic reasons. It's a value. Tcl_Value was taken for something else that's now obsolete and virtually unused.
[**] Integers are actually represented by a cluster of internal types, depending on the number of bits required. You don't need to know the details if you're just using them, as the accessor functions completely hide the complexity.
I'd like a C library that can serialize my data structures to disk, and then load them again later. It should accept arbitrarily nested structures, possibly with circular references.
I presume that this tool would need a configuration file describing my data structures. The library is allowed to use code generation, although I'm fairly sure it's possible to do this without it.
Note I'm not interested in data portability. I'd like to use it as a cache, so I can rely on the environment not changing.
Thanks.
Results
Someone suggested Tpl which is an awesome library, but I believe that it does not do arbitrary object graphs, such as a tree of Nodes that each contain two other Nodes.
Another candidate is Eet, which is a project of the Enlightenment window manager. Looks interesting but, again, seems not to have the ability to serialize nested structures.
Check out tpl. From the overview:
Tpl is a library for serializing C data. The data is stored in its natural binary form. The API is small and tries to stay "out of the way". Compared to using XML, tpl is faster and easier to use in C programs. Tpl can serialize many C data types, including structures.
I know you're asking for a library. If you can't find one (::boggle::, you'd think this was a solved problem!), here is an outline for a solution:
You should be able to write a code generator[1] to serialize trees/graphs without (run-time) pre-processing fairly simply.
You'll need to parse the node structure (typedef handling?), and write the included data values in a straight ahead fashion, but treat the pointers with some care.
For pointers to other objects (e.g. char *name;) which you know are singly referenced, you can serialize the target data directly.
For objects that might be multiply referenced, and for other nodes of your tree, you'll have to represent the pointer structure. Each object gets assigned a serialization number, which is what is written out in place of the pointer. Maintain a translation structure between current memory position and serialization number. On encountering a pointer, see if it has already been assigned a number; if not, give it one and queue that object up for serialization.
Reading back also requires a node-#/memory-location translation step, and might be easier to do in two passes: regenerate the nodes with the node numbers in the pointer slots (bad pointer, be warned) to find out where each node gets put, then walk the structure again fixing the pointers.
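Here is a hand-written sketch of what such generated code might do for one node type, including circular references. Everything in it is hypothetical (gnode, gnode_rec, MAX_NODES and the function names), the translation table is a small fixed array searched linearly, and the record array stands in for the file. On reading, it takes a slightly different route from the one described above: all nodes are allocated first in one block, so a serialization number translates to an address by indexing and the numbers never sit in real pointer slots.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_NODES 64

typedef struct gnode {
    int value;
    struct gnode *a, *b;           /* may form cycles or share targets */
} gnode;

/* Saved form: serialization numbers replace pointers, 0 means NULL. */
typedef struct {
    int value;
    uint32_t a, b;
} gnode_rec;

/* Translation table: seen[i] is the object with number i + 1. */
typedef struct {
    const gnode *seen[MAX_NODES];
    uint32_t count;
} numbering;

/* Return the number for p, assigning the next free one on first sight.
 * Appending to seen[] also queues the object for serialization. */
static uint32_t number_of(numbering *t, const gnode *p)
{
    if (p == NULL)
        return 0;
    for (uint32_t i = 0; i < t->count; i++)
        if (t->seen[i] == p)
            return i + 1;
    assert(t->count < MAX_NODES);
    t->seen[t->count] = p;
    return ++t->count;
}

/* Write every reachable node to out[]; returns the record count. */
static uint32_t serialize(const gnode *root, gnode_rec *out)
{
    numbering t = { {0}, 0 };
    number_of(&t, root);
    for (uint32_t i = 0; i < t.count; i++) {   /* t.count grows as we go */
        const gnode *n = t.seen[i];
        out[i].value = n->value;
        out[i].a = number_of(&t, n->a);
        out[i].b = number_of(&t, n->b);
    }
    return t.count;
}

/* Allocate all nodes, then turn numbers back into pointers.
 * Record 0 is the root; free() on the result releases everything. */
static gnode *deserialize(const gnode_rec *in, uint32_t count)
{
    if (count == 0)
        return NULL;
    gnode *nodes = calloc(count, sizeof *nodes);
    if (nodes == NULL)
        return NULL;
    for (uint32_t i = 0; i < count; i++) {
        if (in[i].a > count || in[i].b > count) {  /* corrupt input */
            free(nodes);
            return NULL;
        }
        nodes[i].value = in[i].value;
        nodes[i].a = in[i].a ? &nodes[in[i].a - 1] : NULL;
        nodes[i].b = in[i].b ? &nodes[in[i].b - 1] : NULL;
    }
    return nodes;
}
```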
I don't know anything about tpl, but you might be able to piggy-back on it.
The on-disk/network format should probably be framed with some type information. You'll need a name-mangling scheme.
[1] ROOT uses this mechanism to provide very flexible serialization support in C++.
Late addition: It occurs to me that this is not always as easy as I implied above. Consider the following (contrived and badly designed) declaration:
enum {
    mask_none = 0x00,
    mask_something = 0x01,
    mask_another = 0x02,
    /* ... */
    mask_all = 0xff
};
typedef struct mask_map {
    int mask_val;
    char *mask_name;
} mask_map_t;
mask_map_t mask_list[] = {
    {mask_something, "mask_something"},
    {mask_another, "mask_another"},
    /* ... */
};
struct saved_setup {
    char* name;
    /* various configuration data */
    char* mask_name;
    /* ... */
};
and assume that we initialize our struct saved_setup items so that mask_name points at mask_list[foo].mask_name.
When we go to serialize the data, what do we do with struct saved_setup.mask_name?
You will need to take care in designing your data structures and/or bring some case-specific intelligence to the serialization process.
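One possible piece of such case-specific intelligence, sketched under assumptions: when a pointer is known to point into a static table, write the table index instead and look the pointer up again on load. The helper names mask_name_to_index and index_to_mask_name are hypothetical, and the table below is a cut-down copy of the declaration above with literal mask values.

```c
#include <stddef.h>

typedef struct mask_map {
    int mask_val;
    const char *mask_name;
} mask_map_t;

static const mask_map_t mask_list[] = {
    {0x01, "mask_something"},
    {0x02, "mask_another"},
};

#define MASK_COUNT (sizeof mask_list / sizeof mask_list[0])

/* Value to write in place of the pointer: the index of the table entry
 * it points at, or -1 if it does not point into mask_list at all. */
static int mask_name_to_index(const char *p)
{
    for (size_t i = 0; i < MASK_COUNT; i++)
        if (mask_list[i].mask_name == p)   /* pointer identity, not strcmp */
            return (int)i;
    return -1;
}

/* On load, turn the saved index back into a pointer into the table. */
static const char *index_to_mask_name(int idx)
{
    if (idx < 0 || (size_t)idx >= MASK_COUNT)
        return NULL;
    return mask_list[idx].mask_name;
}
```

An index of -1 tells the serializer that the string is not shared with the table and has to be written out as ordinary data. Saving the mask name or value rather than the index would survive reordering of the table between versions.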
This is my solution. It uses my own implementation of malloc and free, together with the mmap and munmap system calls. See the example code at the link. Ref: http://amscata.blogspot.com/2013/02/serialize-your-memory.html
In my approach I create a char array that serves as my own RAM space, with functions to allocate memory from it and to free it. After creating the data structure, I write the char array to a file using mmap.
Whenever you want to load it back into memory, there is a function that uses munmap to put the data structure back into the char array. Since the pointers hold virtual addresses, you can reuse your data structure. That means you can create a data structure, save it, load it, edit it again, and save it again.
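The following is not the code from the linked post, only a rough sketch of the general idea under different assumptions: a bump allocator over a fixed char array, with links stored as offsets into that array instead of real addresses, so the whole block can be copied (or written with fwrite and read back with fread) without any pointer fix-up. All names (arena, ref, item and the functions) are hypothetical, and freeing individual blocks is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 1024

typedef uint32_t ref;   /* offset into arena.bytes; 0 plays the role of NULL */

typedef struct {
    uint32_t used;                               /* bytes handed out so far */
    _Alignas(8) unsigned char bytes[ARENA_SIZE]; /* the private "RAM" */
} arena;

/* Offset 0 is reserved as the null reference, so allocation starts at 8. */
static void arena_init(arena *a)
{
    memset(a, 0, sizeof *a);
    a->used = 8;
}

/* Bump allocation in 8-byte steps; returns 0 when the arena is full. */
static ref arena_alloc(arena *a, uint32_t size)
{
    size = (size + 7u) & ~7u;
    if (size > ARENA_SIZE - a->used)
        return 0;
    ref r = a->used;
    a->used += size;
    return r;
}

/* Turn an offset into a usable pointer for the arena it is applied to. */
static void *arena_ptr(arena *a, ref r)
{
    assert(r < ARENA_SIZE);
    return r ? a->bytes + r : NULL;
}

/* A list element that links by offset rather than by address. */
typedef struct {
    int value;
    ref next;
} item;

/* Prepend a value; on allocation failure the list is returned unchanged. */
static ref push_front(arena *a, ref head, int value)
{
    ref r = arena_alloc(a, sizeof(item));
    if (r == 0)
        return head;
    item *it = arena_ptr(a, r);
    it->value = value;
    it->next = head;
    return r;
}

static int sum_list(arena *a, ref head)
{
    int sum = 0;
    for (ref r = head; r != 0; ) {
        item *it = arena_ptr(a, r);
        sum += it->value;
        r = it->next;
    }
    return sum;
}
```

Only the arena and the offset of the root object (here, the list head) need to be saved; a copy of the arena at any other address is immediately usable and can be extended with further allocations.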
You can take a look at eet, a library from the Enlightenment project for storing C data types (including nested structures). Although nearly all libs of the Enlightenment project are in a pre-alpha state, eet is already released. I'm not sure, however, whether it can handle circular references. Probably not.
http://s11n.net/c11n/
HTH
You should check out gwlib. The serializer/deserializer is extensive, and there are extensive tests available to look at. http://gwlib.com/
I'm assuming you are talking about storing a graph structure, if not then disregard...
If you're storing a graph, I personally think the best idea would be to implement a function that converts your graph into an adjacency matrix. You can then write a function that converts an adjacency matrix back into your graph data structure.
This has three benefits (that may or may not matter in your application):
An adjacency matrix is a very natural way to create and store a graph.
You can create adjacency matrices elsewhere and import them into your application.
You can store and read your data in a meaningful way.
I used this method during a CS project and it is definitely how I would do it again.
You can read more about adjacency matrix here: http://en.wikipedia.org/wiki/Modified_adjacency_matrix
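A small sketch of the two conversion functions, assuming a directed graph whose vertices are stored in one array so that a vertex's index can be recovered by pointer subtraction. The vertex type, MAX_V and the function names are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_V 4

/* Pointer-based graph: each vertex holds pointers to its successors. */
typedef struct vertex {
    int degree;
    struct vertex *out[MAX_V];
} vertex;

/* m[i][j] becomes 1 when there is an edge from g[i] to g[j]. */
static void graph_to_matrix(const vertex *g, int n, unsigned char m[MAX_V][MAX_V])
{
    assert(n <= MAX_V);
    memset(m, 0, sizeof(unsigned char[MAX_V][MAX_V]));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < g[i].degree; k++)
            m[i][g[i].out[k] - g] = 1;   /* index of the target vertex */
}

/* Rebuild the pointer structure in g[0..n-1] from the matrix. */
static void matrix_to_graph(unsigned char m[MAX_V][MAX_V], int n, vertex *g)
{
    assert(n <= MAX_V);
    for (int i = 0; i < n; i++) {
        g[i].degree = 0;
        for (int j = 0; j < n; j++)
            if (m[i][j])
                g[i].out[g[i].degree++] = &g[j];
    }
}
```

The matrix contains no addresses, so it can be written to disk as plain bytes or as text. Its size grows with the square of the vertex count, which is worth keeping in mind for large sparse graphs.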
Another option is Avro C, an implementation of Apache Avro in C.
Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj), 0);
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys.
There is also support for lists, and all these structures can be nested:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);
In theory YAML should do what you want http://code.google.com/p/yaml-cpp/
Please let me know if it works for you.