I am creating an algorithm in C that is confidential and cannot be shared with external customers. So, I decided to go with creating a library (.a) file which compiles my algorithm and lets others use it without modifying it. It basically alters the data of a variable within a structure. Now, the structure as such is visible externally (The structure is defined in a separate header file which is included in my .c file) and is generated based on user's configuration. But the said variable is always present within the structure - only remaining data is changed based on user's configuration.
The problem is that if the structure is not exactly the one I used to create the library file, the code fails.
So is there a way to create a library file to modify the data inside a structure, if the structure itself is not available in the beginning?
Any help is greatly appreciated...
Technically all structures you use must be character by character equal everywhere. If you have any difference between the same structure in two (or more) translation units that will lead to undefined behavior.
There are ways around that though, for example by using nested structures. For example you could create one structure to contain your private data, and then another structure whose first member is an instance of the first private structure.
For example something like this:
struct private_data
{
// TODO: The private members here
};
struct public_data
{
struct private_data private;
// TODO: The public members here
};
This is in effect similar to inheritance in an object-oriented language. C guarantees that a pointer to a structure, suitably converted, points to its first member, so a pointer to the public_data structure can be cast to a pointer to the private_data structure and passed to the functions that need it.
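A minimal sketch of that idea follows. The member names `counter` and `user_option` and the functions `lib_update` and `embed_demo` are hypothetical and not taken from the question; the only real requirement is that `struct private_data` is the first member and is defined identically on both sides.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed part: must be identical in the library and in every user build. */
struct private_data {
    int counter;                    /* hypothetical member the library modifies */
};

/* Configurable part: the library never looks past the first member. */
struct public_data {
    struct private_data private;    /* must stay the first member */
    int user_option;                /* hypothetical generated field */
};

/* Library side: only knows struct private_data. */
void lib_update(struct private_data *p)
{
    assert(p != NULL);
    p->counter += 1;
}

/* User side: both calls pass the same address. */
int embed_demo(void)
{
    struct public_data d = { { 41 }, 7 };
    lib_update(&d.private);                  /* explicit member address */
    lib_update((struct private_data *)&d);   /* cast of the outer pointer */
    return d.private.counter;                /* 43 */
}
```

Taking the address of the member, as in the first call, avoids the cast entirely and keeps working even if the member is later moved.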
To keep the private data, well, private you could use opaque data types and opaque pointers:
// Forward declaration of the actual private data
struct actual_private_data;
// The "public" private structure
struct private_data
{
// Pointer to the actual private data
struct actual_private_data *private;
};
It's important to note that this only works for the private data used for the library. If the public data structure contains data that needs to be accessed by the library as well, you might want to rename the private_data structure and put the common data there. Note that this common data must be in all variants of the structure, it can't be auto-generated differently than what is used in the library.
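As a rough illustration of the opaque-pointer variant, the sketch below shows the header part and the library part in one listing. The functions `private_init`, `private_step`, `private_value` and `private_destroy`, and the hidden `state` field, are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what the public header would expose ---- */
struct actual_private_data;                  /* incomplete type: layout hidden */

struct private_data {
    struct actual_private_data *private;     /* opaque pointer */
};

int  private_init(struct private_data *p);   /* hypothetical API */
void private_step(struct private_data *p);
int  private_value(const struct private_data *p);
void private_destroy(struct private_data *p);

/* ---- what only the library source would contain ---- */
struct actual_private_data {
    int state;                               /* hypothetical hidden field */
};

int private_init(struct private_data *p)
{
    p->private = malloc(sizeof *p->private);
    if (p->private == NULL)
        return -1;
    p->private->state = 0;
    return 0;
}

void private_step(struct private_data *p)
{
    assert(p->private != NULL);
    p->private->state += 1;                  /* stand-in for the real algorithm */
}

int private_value(const struct private_data *p)
{
    return p->private->state;
}

void private_destroy(struct private_data *p)
{
    free(p->private);
    p->private = NULL;
}
```

Because callers only ever hold a pointer, the library can change `struct actual_private_data` freely without affecting the size or layout of anything the user compiles.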
If you are interested in only one data member of the structure, then pass the address of that member to your confidential library and modify its value through the pointer.
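A small sketch of that approach, assuming the member is an `int`. The names `lib_modify`, `struct user_config` and `member_demo` are made up for illustration, and the doubling merely stands in for the confidential algorithm.

```c
#include <assert.h>
#include <stddef.h>

/* Library side: knows nothing about the enclosing structure. */
void lib_modify(int *value)
{
    assert(value != NULL);
    *value *= 2;            /* stand-in for the confidential algorithm */
}

/* User side: any generated layout works as long as the member exists. */
struct user_config {
    char   name[16];        /* hypothetical generated fields */
    int    target;          /* the member the library must modify */
    double scale;
};

int member_demo(void)
{
    struct user_config cfg = { "demo", 21, 1.0 };
    lib_modify(&cfg.target);    /* the compiler computes the offset on the user side */
    return cfg.target;          /* 42 */
}
```

The library never includes the generated header, so changing the other fields cannot break it.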
Related
I've used quite a bit of JavaScript so far. If you use an object constructor in JavaScript, you have access to this inside it.
So my question relates to trying to use a similar concept in C. I created a struct that I want to be able to self reference:
struct Storage {
void (*delete)();
};
So if I were to allocate a Storage class:
struct Storage *myStruct = malloc(sizeof(struct Storage));
Let's say I'm trying to delete myStruct. If I have some delete function that I point to (with myStruct->delete = deleteStructure), I would like to do something like this:
myStruct->delete();
which would then free() the struct through a self referencing variable inside of said delete function. I'm wondering if there would be a way to have the delete function look like:
void deleteStructure() {
free( /* "this" or some equivalent C self-reference */ );
}
My assumption from research so far is that this is not possible since this is usually only in object oriented programming languages. If this is not possible, I'm wondering what would be the semantically correct way to do this. I'm hoping to make the usage of this delete functionality rather simplistic from a user interface perspective. The only way I understand this to work would be passing a reference to the structure like:
void deleteStructure(struct Storage *someStructure) {
free(someStructure);
}
which would then require deletion to be done as follows:
deleteStructure(myStruct);
To sum up: is there a way to make a delete function that uses self references in C, and if not, what would be the most semantically correct way to delete a structure in the most user friendly way?
No. You cannot even define a function for a struct.
struct Storage {
void (*delete)();
};
simply stores a pointer to a void function. That could be any void function and when it is being called, it has no connection to Storage whatsoever.
Also note that in your code, every instance of the struct stores one pointer to a void function. You could initialize them so that they all point to the same function, in which case you would simply waste one pointer (64 bits on a typical 64-bit platform) per instance without any real benefit. You could also make them point to completely different functions with different semantics.
As per @UnholySheep's comment, the semantically correct way to tie a C function to a struct follows this pattern:
struct Storage {
/* Some definitions here */
};
void deleteStructure(struct Storage *someStructure) {
free( /* all inner structure allocations */ );
free(someStructure);
}
Here's more about passing structs by reference.
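If the function-pointer member is kept anyway, a common C idiom is to give it an explicit parameter that plays the role of `this`. The sketch below assumes a hypothetical `data` member and a hypothetical `newStorage` constructor; neither appears in the question.

```c
#include <assert.h>
#include <stdlib.h>

struct Storage {
    int *data;                               /* hypothetical inner allocation */
    void (*delete)(struct Storage *self);    /* receives the instance explicitly */
};

static void deleteStructure(struct Storage *self)
{
    if (self == NULL)
        return;
    free(self->data);    /* inner allocations first */
    free(self);          /* then the structure itself */
}

/* Allocates a Storage with n zeroed ints; returns NULL on failure. */
struct Storage *newStorage(size_t n)
{
    assert(n > 0);
    struct Storage *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = calloc(n, sizeof *s->data);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    s->delete = deleteStructure;
    return s;
}
```

The call site then reads `myStruct->delete(myStruct);`. The instance still has to be named twice, which is why a plain `deleteStructure(myStruct)` is usually preferred unless different instances really need different delete functions.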
So I have a struct that looks something like this (more or less):
typedef struct AST_STRUCT
{
enum {
AST_OBJECT,
AST_REFERENCE,
AST_VARIABLE,
AST_VARIABLE_DEFINITION,
AST_VARIABLE_ASSIGNMENT,
AST_VARIABLE_MODIFIER,
AST_FUNCTION_DEFINITION,
AST_FUNCTION_CALL,
AST_NULL,
AST_STRING,
AST_CHAR,
AST_FLOAT,
AST_LIST,
AST_BOOLEAN,
AST_INTEGER,
AST_COMPOUND,
AST_TYPE,
AST_BINOP,
AST_NOOP,
AST_BREAK,
AST_RETURN,
AST_IF,
AST_ELSE,
AST_WHILE,
AST_ATTRIBUTE_ACCESS,
AST_LIST_ACCESS,
AST_NEW
} type;
struct AST_STRUCT* variable_value;
}
Now I would like to write this struct, serialized to the disk into a .dat file.
The problem is that, as you can see, it has a pointer field called variable_value.
I am using this function to write it to disk:
https://www.geeksforgeeks.org/readwrite-structure-file-c/
I am also using the other function in that article to read it from the disk.
It appears as if the variable_value field on the struct is not loaded properly.
How would I write the entire struct to disk and maintain the data of the variable_value field?
I was first thinking about dumping the variable_value field into a separate file and then sort of "link it back" into the struct once I load it, but maybe there is another way of doing this?
You are trying to serialize a linked structure (I assume a tree, from the name "AST").
If we imagine for a second that you successfully did that and wrote it to disk, when you load it back, you'll allocate memory for the links in the tree, but the addresses (values of pointers) of these memory chunks are not guaranteed to be the same as the old ones. You will not be able to reconstruct the tree.
So you can't use the value of the addresses as links on the disk. You'll need to use some other method. These days, the popular method is the JSON format, which would work for you assuming that you don't have cross-links or back-links in the tree.
So what you need is a JSON C library. I've never used one, but here are a couple I found in 2 seconds:
json-c
cJSON
JSON and XML are perfectly good formats to use if you want to save your data as text. They do have issues with restoring multiple references to the same object, but even those can be resolved.
If the data is a simple linked list, you can simply restore the objects in order and repair the links yourself as you restore the objects. If the data structure is more complex than that, you need a more generic solution.
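For the simple linked-list case, a sketch could look like the following. It uses a hypothetical `struct node` with a single `int` payload rather than the AST structure from the question, and it writes raw binary, so the file is only readable on the same platform.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Write only the payloads, in list order; pointers are never written. */
int list_save(const struct node *head, FILE *fp)
{
    assert(fp != NULL);
    for (; head != NULL; head = head->next)
        if (fwrite(&head->value, sizeof head->value, 1, fp) != 1)
            return -1;
    return 0;
}

/* Read payloads until EOF and re-create the links with fresh allocations. */
struct node *list_load(FILE *fp)
{
    struct node *head = NULL, **tail = &head;
    int value;

    assert(fp != NULL);
    while (fread(&value, sizeof value, 1, fp) == 1) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            break;              /* out of memory: return what was loaded so far */
        n->value = value;
        n->next = NULL;
        *tail = n;              /* link the new node at the end */
        tail = &n->next;
    }
    return head;
}

void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The order of the records on disk carries the link information, so no addresses need to be stored at all.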
If you really want to save the data as binary, then you need to fix up pointers when you reload. The main way to do this is to keep a map of saved addresses vs newly allocated addresses. If you really don't like the idea of emitting the saved addresses, you can use a serial number to represent each unique address you find.
For each object you save, you have to record its scalar address and the type of the object before you save the object itself.
For each pointer you save, you need to save the scalar address, and "remember" to later save that referenced object.
On restoring, when you restore an object you need to load the object based on its type, and create a mapping entry that shows how the saved address has turned into a restored address.
If an object contains any pointers, the address stored in those pointers is found by applying that mapping. However, you will often find that the object has not yet been loaded for that pointer, so you will also need to record a mapping from the saved scalar address to the address of the pointer. You can then either fix up all these unfinished pointers after you finish loading, or you can do it for a particular referenced object after that object is loaded.
You need to put in a little extra care to handle objects that support multiple inheritance, by noting that the pointer does not point to the root of the object.
But otherwise, this is just about all you need, plus considerations of versioning, endianness, padding - if you care about the longevity of the data you are saving.
I already have, say, a struct smallbox with two primitive variables (int identifier, int size) in it. This smallbox is part of higher-level structs that are used to build, e.g., queues.
Now, in one part of my project I have an issue for which my solution is to expand this smallbox so that it holds another piece of information, such as int costs_to_send_it. However, I am not allowed to change my base structs. Is there a way to expand this struct in some fashion, like method overloading in Java? Will I still be able to use all the operations that I have on my higher-level structs while having the new struct smallbox with the new attribute inside instead of the old one?
This sentence determines the answer: “[Will] I still be able to use all operation that I have on my higher structs while having the new struct smallbox with color attribute inside instead of the old one?” The answer is no.
If the headers and routines involved were completely separate, there are some compiling and linking “games” you could play—compiling one set of source files with one definition of the structure and another set of source files with another definition of the structure and ensuring they never interacted in ways depending on the structure definition. However, since you ask whether the operations defined using one definition could be used with the alternate definition, you are compelling one set of code to use both definitions. (An alternate solution would be to engineer one source file to use different names for its routines under different circumstances, and then you could compile it twice, once for one definition of the structure and once for another, and then you could use the “same” operations on the different structures, but they would actually be different routines with different names performing the “same” operation in some sense.)
While you could define the structure differently within different translation units, when the structure or any type derived from it (such as a pointer to the structure) is used with a routine in a different translation unit, the type the routine is expecting to receive as a parameter must be compatible with the type that is passed to it as an argument, aside from some rules about signed types, adding qualifiers, and so on that do not help here.
For two structures to be compatible, there must be a one-to-one correspondence between their members, which must themselves be of compatible types (C 2018 6.2.7 1). Two structures with different numbers of members do not have a one-to-one correspondence.
is there a way to expand this struct in some fashion like methods
overloading in java or so?
In method overloading, the compiler chooses among same-named methods by examining the arguments to each invocation of a method of that name. Observe that that is an entirely localized decision: disregarding questions of optimization, the compiler's choice here affects only code generation for a single statement.
Will I still be able to use all operation
that I have on my higher structs while having the new struct smallbox
with color attribute inside instead of the old one?
I think what you're looking for is polymorphism, not overloading. Note well that in Java (and C++ and the other languages I know of that support this) it is based on a type / subtype relationship between differently-named types. I don't know of any language that lets you redefine type names and use the two distinct types as if they were the same in any sense. Certainly C does not.
There are some alternatives, however. Most cleanly-conforming would involve creating a new, differently-named structure type that contains an instance of the old:
struct sb {
int id;
int size;
};
struct sb_metered {
struct sb box;
int cost;
};
Functions that deal in individual instances of these objects by pointer, not by value, can be satisfied easily:
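#include <stdio.h>

/* Works on the base type, so it also accepts the box embedded in an sb_metered. */
int get_size(struct sb *box) {
    return box->size;
}

int get_cost(struct sb_metered *metered_box) {
    return metered_box->cost;
}

int main(void) {
    struct sb_metered b = { { 1, 17 }, 42 };
    printf("id: %d, size: %d, cost: %d\n",
           b.box.id,            /* base members are reached through .box */
           get_size(&b.box),
           get_cost(&b));
    return 0;
}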
Note that this does not allow you to form arrays of the supertype (struct sb) that actually contain instances of the subtype, nor to pass or return structure objects of the subtype by value as if they were objects of the supertype.
In one file I have a struct like...
struct t {
int private;
int public;
};
struct t s;
One way to have other object files be able to access s.public would be to put...
struct t {
int private;
int public;
};
extern struct t s;
...into a header file and then have the other files reference s.public.
I'd like to avoid this because it locks the offset between the base of s and the base of public into every object file that references s.public. If I ever added a new private2 after private, those files would have the wrong address for public until they were recompiled.
Instead, I'd like to find a way to export the location of s.public directly as its own symbol, say s_public_direct, rather than as s plus an offset to public. Other files would then only need the header...
extern int s_public_direct;
...and would have no knowledge of the layout (or even existence) of the structure that public happens to live in.
Is there any way to export a symbol reference for a variable that lives inside a structure in C/C++? If not, is there an elegant way to solve this problem?
Note that this is not a scoping issue, so marking private with C++ private: would not change the fact that the referencing object file would still get passed the base address of the enclosing struct and would then add the offset to get to public. I am really looking for some kind of C/C++ syntax that tells the compiler to export a symbol for a variable that is inside a struct. Or maybe a way to declare a new exportable symbol like int s_public_direct as an alias for the variable inside the struct.
The easiest way to do this is to keep the type opaque and export accessor function(s) for the field:
extern int get_t_public(struct t *);
extern void set_t_public(struct t *, int);
This allows you to also export it read-only (by defining the get without the set), or a variety of other useful things (enforcing some constraints on the value, or caching things dependent on them and invalidating the cache when the value changes and things need to be recomputed.)
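A sketch of how the two sides might be laid out, shown in one listing for brevity. The `get_s` function is a hypothetical way for callers to obtain a pointer to `s`, and `private2` is the later addition mentioned in the question.

```c
#include <assert.h>
#include <stddef.h>

/* ---- public header: callers see only an incomplete type ---- */
struct t;
extern struct t *get_s(void);               /* hypothetical handle to s */
extern int  get_t_public(struct t *);
extern void set_t_public(struct t *, int);

/* ---- implementation file: the only place that knows the layout ---- */
struct t {
    int private;
    int private2;       /* added later; callers need no recompile */
    int public;
};

static struct t s;

struct t *get_s(void)
{
    return &s;
}

int get_t_public(struct t *p)
{
    assert(p != NULL);
    return p->public;
}

void set_t_public(struct t *p, int value)
{
    assert(p != NULL);
    p->public = value;
}
```

Only the implementation file contains the offset of `public`, so adding or reordering private members requires recompiling that one file and relinking, nothing more.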
I'd like a C library that can serialize my data structures to disk, and then load them again later. It should accept arbitrarily nested structures, possibly with circular references.
I presume that this tool would need a configuration file describing my data structures. The library is allowed to use code generation, although I'm fairly sure it's possible to do this without it.
Note I'm not interested in data portability. I'd like to use it as a cache, so I can rely on the environment not changing.
Thanks.
Results
Someone suggested Tpl which is an awesome library, but I believe that it does not do arbitrary object graphs, such as a tree of Nodes that each contain two other Nodes.
Another candidate is Eet, which is a project of the Enlightenment window manager. Looks interesting but, again, seems not to have the ability to serialize nested structures.
Check out tpl. From the overview:
Tpl is a library for serializing C
data. The data is stored in its
natural binary form. The API is small
and tries to stay "out of the way".
Compared to using XML, tpl is faster
and easier to use in C programs. Tpl
can serialize many C data types,
including structures.
I know you're asking for a library. If you can't find one (::boggle::, you'd think this was a solved problem!), here is an outline for a solution:
You should be able to write a code generator[1] to serialize trees/graphs without (run-time) pre-processing fairly simply.
You'll need to parse the node structure (typedef handling?), and write the included data values in a straight ahead fashion, but treat the pointers with some care.
For pointers to other objects (e.g. char *name;) which you know are singly referenced, you can serialize the target data directly.
For objects that might be multiply referenced, and for other nodes of your tree, you'll have to represent the pointer structure. Each object gets assigned a serialization number, which is what is written out in place of the pointer. Maintain a translation structure between current memory position and serialization number. On encountering a pointer, see if it is already assigned a number; if not, give it one and queue that object up for serialization.
Reading back also requires a node-#/memory-location translation step, and might be easier to do in two passes: regenerate the nodes with the node numbers in the pointer slots (bad pointer, be warned) to find out where each node gets put, then walk the structure again fixing the pointers.
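The numbering scheme can be sketched for a hypothetical two-pointer node type as shown below. The names `struct table`, `id_of`, `graph_save` and `graph_load`, and the `MAX_NODES` limit, are illustrative assumptions. Unlike the two-pass reader described above, this sketch allocates all nodes up front as one array, so a serial number maps directly to an address and a single pass is enough.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_NODES 64             /* hypothetical limit to keep the sketch short */

struct node {
    int value;
    struct node *left, *right;   /* may form cycles or shared references */
};

struct table {                   /* address -> serial number (array index) */
    const struct node *addr[MAX_NODES];
    int count;
};

/* Serial number of n, assigned on first sight; -1 stands for NULL. */
static int id_of(struct table *t, const struct node *n)
{
    if (n == NULL)
        return -1;
    for (int i = 0; i < t->count; i++)
        if (t->addr[i] == n)
            return i;
    assert(t->count < MAX_NODES);
    t->addr[t->count] = n;
    return t->count++;
}

int graph_save(const struct node *root, FILE *fp)
{
    struct table t = { {0}, 0 };

    id_of(&t, root);
    /* Discover every reachable node; the table doubles as the work queue. */
    for (int i = 0; i < t.count; i++) {
        id_of(&t, t.addr[i]->left);
        id_of(&t, t.addr[i]->right);
    }
    if (fwrite(&t.count, sizeof t.count, 1, fp) != 1)
        return -1;
    for (int i = 0; i < t.count; i++) {
        /* Each record holds the payload plus serial numbers instead of pointers. */
        int rec[3] = { t.addr[i]->value,
                       id_of(&t, t.addr[i]->left),
                       id_of(&t, t.addr[i]->right) };
        if (fwrite(rec, sizeof rec, 1, fp) != 1)
            return -1;
    }
    return 0;
}

/* Loads into one calloc'd array; element 0 is the root. Release with free(). */
struct node *graph_load(FILE *fp)
{
    int count;

    if (fread(&count, sizeof count, 1, fp) != 1 || count <= 0 || count > MAX_NODES)
        return NULL;
    struct node *nodes = calloc((size_t)count, sizeof *nodes);
    if (nodes == NULL)
        return NULL;
    for (int i = 0; i < count; i++) {
        int rec[3];
        if (fread(rec, sizeof rec, 1, fp) != 1 ||
            rec[1] < -1 || rec[1] >= count || rec[2] < -1 || rec[2] >= count) {
            free(nodes);
            return NULL;
        }
        nodes[i].value = rec[0];
        nodes[i].left  = rec[1] < 0 ? NULL : &nodes[rec[1]];
        nodes[i].right = rec[2] < 0 ? NULL : &nodes[rec[2]];
    }
    return nodes;
}
```

The linear search in `id_of` keeps the example short; a hash table keyed on the address would be the usual choice for large graphs.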
I don't know anything about tpl, but you might be able to piggy-back on it.
The on-disk/network format should probably be framed with some type information. You'll need a name-mangling scheme.
[1] ROOT uses this mechanism to provide very flexible serialization support in C++.
Late addition: It occurs to me that this is not always as easy as I implied above. Consider the following (contrived and badly designed) declaration:
enum {
mask_none = 0x00,
mask_something = 0x01,
mask_another = 0x02,
/* ... */
mask_all = 0xff
};
typedef struct mask_map {
int mask_val;
char *mask_name;
} mask_map_t;
mask_map_t mask_list[] = {
{mask_something, "mask_something"},
{mask_another, "mask_another"},
/* ... */
};
struct saved_setup {
char* name;
/* various configuration data */
char* mask_name;
/* ... */
};
and assume that we initialize our struct saved_setup items so that mask_name points at mask_list[foo].mask_name.
When we go to serialize the data, what do we do with struct saved_setup.mask_name?
You will need to take care in designing your data structures and/or bring some case-specific intelligence to the serialization process.
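One possible piece of such case-specific handling, sketched under the assumption that `mask_name` always aliases an entry of `mask_list`: store the index of the entry instead of the pointer, and map it back on load. The helper names `mask_name_to_index` and `mask_index_to_name` are hypothetical, and the table is shortened to the two entries shown in the declaration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mask_map {
    int   mask_val;
    char *mask_name;
} mask_map_t;

static mask_map_t mask_list[] = {
    { 0x01, "mask_something" },
    { 0x02, "mask_another"   },
};

#define MASK_COUNT (sizeof mask_list / sizeof mask_list[0])

/* On save: index of the table entry the pointer aliases, or -1 if none. */
int mask_name_to_index(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < MASK_COUNT; i++)
        if (mask_list[i].mask_name == name)    /* compare addresses, not text */
            return (int)i;
    return -1;
}

/* On load: turn the stored index back into a pointer into the live table. */
char *mask_index_to_name(int index)
{
    if (index < 0 || (size_t)index >= MASK_COUNT)
        return NULL;
    return mask_list[index].mask_name;
}
```

A return value of -1 tells the serializer that the string is owned elsewhere and has to be written out as ordinary character data.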
This is my solution. It uses my own implementations of malloc and free, together with the mmap and munmap system calls. Follow the given example code. Ref: http://amscata.blogspot.com/2013/02/serialize-your-memory.html
In my approach I create a char array that serves as my own RAM space. There are functions to allocate memory from it and to free it again. After creating the data structure, I write the char array to a file by using mmap.
Whenever you want to load it back into memory, there is a function which uses munmap to put the data structure back into the char array. Since it has virtual addresses for your pointers, you can reuse your data structure. That means you can create a data structure, save it, load it, edit it again, and save it again.
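The linked post is not reproduced here, so the following is only a rough sketch of the general idea, not the author's code. It swaps in two simplifications: links are stored as offsets into the byte array rather than as remapped virtual addresses, and the array is written with plain `fwrite`/`fread` instead of `mmap`. All names are hypothetical, and the file is assumed to come from a trusted source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ARENA_SIZE 1024              /* hypothetical capacity */

/* Everything lives inside this one block, so it can be written out as-is. */
struct arena {
    size_t used;                     /* next free byte; offset 0 is reserved */
    size_t head;                     /* offset of the first item, 0 = empty */
    unsigned char bytes[ARENA_SIZE];
};

/* Links are offsets into arena.bytes rather than pointers. */
struct item {
    int value;
    size_t next;                     /* 0 = end of list */
};

void arena_init(struct arena *a)
{
    memset(a, 0, sizeof *a);
    a->used = 1;                     /* keep offset 0 free to mean "none" */
}

/* Bump allocator: returns an offset, or 0 when the arena is full. */
static size_t arena_alloc(struct arena *a, size_t size)
{
    if (size > ARENA_SIZE - a->used)
        return 0;
    size_t off = a->used;
    a->used += size;
    return off;
}

/* Prepend a value; returns the new item's offset, or 0 on failure. */
size_t list_push(struct arena *a, int value)
{
    size_t off = arena_alloc(a, sizeof(struct item));
    if (off != 0) {
        struct item it = { value, a->head };
        memcpy(a->bytes + off, &it, sizeof it);   /* memcpy sidesteps alignment */
        a->head = off;
    }
    return off;
}

int list_sum(const struct arena *a)
{
    int sum = 0;
    for (size_t off = a->head; off != 0; ) {
        struct item it;
        assert(off <= ARENA_SIZE - sizeof it);    /* stored offset must fit */
        memcpy(&it, a->bytes + off, sizeof it);
        sum += it.value;
        off = it.next;
    }
    return sum;
}

int arena_save(const struct arena *a, FILE *fp)
{
    return fwrite(a, sizeof *a, 1, fp) == 1 ? 0 : -1;
}

int arena_load(struct arena *a, FILE *fp)
{
    return fread(a, sizeof *a, 1, fp) == 1 ? 0 : -1;
}
```

Because offsets stay valid wherever the block ends up in memory, the loaded arena can be traversed and extended immediately, which matches the save, load, edit, save cycle described above.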
You can take a look at eet, a library from the Enlightenment project for storing C data types (including nested structures). Although nearly all libraries of the Enlightenment project are in a pre-alpha state, eet has already been released. I'm not sure, however, if it can handle circular references. Probably not.
http://s11n.net/c11n/
HTH
You should check out gwlib. The serializer/deserializer is extensive, and there are extensive tests available to look at. http://gwlib.com/
I'm assuming you are talking about storing a graph structure, if not then disregard...
If you're storing a graph, I personally think the best idea would be to implement a function that converts your graph into an adjacency matrix. You can then write a function that converts an adjacency matrix back into your graph data structure.
This has three benefits (that may or may not matter in your application):
Adjacency matrices are a very natural way to create and store a graph
You can create an adjacency matrix and import them into your applications
You can store and read your data in a meaningful way.
I used this method during a CS project, and it is definitely how I would do it again.
You can read more about adjacency matrix here: http://en.wikipedia.org/wiki/Modified_adjacency_matrix
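A minimal sketch of the two conversion functions, assuming a hypothetical graph type in which the vertices sit in one array of fixed size `N` and each vertex holds pointers to its neighbours:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 4                        /* hypothetical fixed number of vertices */

struct vertex {
    int degree;                    /* number of valid entries in adj */
    struct vertex *adj[N];         /* pointers to neighbouring vertices */
};

/* Pointer graph -> matrix: m[i][j] is 1 when vertex i links to vertex j. */
void graph_to_matrix(const struct vertex g[N], unsigned char m[N][N])
{
    memset(m, 0, N * N);
    for (int i = 0; i < N; i++) {
        for (int k = 0; k < g[i].degree; k++) {
            ptrdiff_t j = g[i].adj[k] - g;     /* index of the neighbour */
            assert(j >= 0 && j < N);
            m[i][j] = 1;
        }
    }
}

/* Matrix -> pointer graph, rebuilding the links against the new array. */
void matrix_to_graph(unsigned char m[N][N], struct vertex g[N])
{
    for (int i = 0; i < N; i++) {
        g[i].degree = 0;
        for (int j = 0; j < N; j++)
            if (m[i][j])
                g[i].adj[g[i].degree++] = &g[j];
    }
}
```

The matrix contains no pointers, so it can be written to disk with a single `fwrite(m, sizeof m, 1, fp)` in the caller and read back the same way before calling `matrix_to_graph`. Cycles need no special treatment.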
Another option is Avro C, an implementation of Apache Avro in C.
Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj), 0);
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys.
There is also support for lists, and all these structures can be nested:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);
In theory YAML should do what you want http://code.google.com/p/yaml-cpp/
Please let me know if it works for you.