How to encode JSON buffer in C?


I need some advice.
I gather data from sensors on the analogue ports and I maintain data on the readings.
I then format this data as JSON and send it to the cloud.
The code that formats the various values as JSON writes them into a character array (not a string object, of course) using int sprintf ( char * str, const char * format, ... );.
Here is the routine that uses this code:
void StackData() {
char buff[256];
sprintf(buff, "{\"id\":\"stat\",\"minHour\":%1i,\"maxHour\":%2i,\"minDay\":%3i,\"maxDay\":%4i,\"inHour\":%5lu,\"inDay\":%6lu,\"inWeek\":%7lu}",
minHour, maxHour, minDay, maxDay, AmpsHour, AmpsDay, AmpsWeek);
}
I would like to see how others might do this differently. Is there another way, perhaps using a specific library?
PS: I have successfully used coreJSON library to parse JSON input

What you have is reasonable, although an alternative might be some sort of result builder:
char buff[256] = { 0 };
jsonObjectOpen(buff);
jsonObjectInteger(buff,"minHour", minHour);
jsonObjectInteger(buff,"maxHour", maxHour);
jsonObjectClose(buff);
Basically each function appends the necessary JSON elements to the buffer. You'd need to implement functions for each data type (string, int, float) and, of course, make sure you call them in the correct order.
I don't think this is more succinct, but if you are doing it more than a few times, especially for more complex structures, you might find it more readable and maintainable.
It's entirely possible there is an existing library that helps with this type of approach. Whatever you use, be mindful of ensuring that the buffer space isn't exceeded during the building process.
In other languages that have type detection, this is a lot easier. I suppose you could always have a single function that takes a void pointer and a 'type' enum, but that could be more error prone for the sake of a marginally simpler API.
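A minimal sketch of such a builder, assuming a small JsonBuilder struct that tracks the buffer, its capacity and an overflow flag. The struct, the extra parameters and the jsonAppendf helper are hypothetical additions for illustration; the answer above only names the jsonObject* functions. String values are not escaped here, so this only suits values you control.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical builder state: caller-supplied buffer plus bookkeeping. */
typedef struct {
    char  *buf;      /* destination buffer */
    size_t cap;      /* capacity, including the terminating NUL */
    size_t len;      /* characters written so far */
    bool   first;    /* true until the first member is written */
    bool   overflow; /* set once an append did not fit */
} JsonBuilder;

/* Append formatted text. On overflow, set the flag and drop the partial tail. */
static void jsonAppendf(JsonBuilder *jb, const char *fmt, ...)
{
    if (jb->overflow)
        return;
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(jb->buf + jb->len, jb->cap - jb->len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= jb->cap - jb->len) {
        jb->overflow = true;
        jb->buf[jb->len] = '\0';
        return;
    }
    jb->len += (size_t)n;
}

void jsonObjectOpen(JsonBuilder *jb, char *buf, size_t cap)
{
    jb->buf = buf;
    jb->cap = cap;
    jb->len = 0;
    jb->first = true;
    jb->overflow = (cap == 0);   /* nothing fits, not even the NUL */
    jsonAppendf(jb, "{");
}

/* Write the separating comma (if needed) and the quoted key. */
static void jsonKey(JsonBuilder *jb, const char *key)
{
    jsonAppendf(jb, "%s\"%s\":", jb->first ? "" : ",", key);
    jb->first = false;
}

void jsonObjectInteger(JsonBuilder *jb, const char *key, long value)
{
    jsonKey(jb, key);
    jsonAppendf(jb, "%ld", value);
}

void jsonObjectString(JsonBuilder *jb, const char *key, const char *value)
{
    jsonKey(jb, key);
    jsonAppendf(jb, "\"%s\"", value);   /* assumption: value needs no escaping */
}

/* Returns true only if every append fitted into the buffer. */
bool jsonObjectClose(JsonBuilder *jb)
{
    jsonAppendf(jb, "}");
    return !jb->overflow;
}
```

The caller checks the result of jsonObjectClose once at the end instead of after every append, which keeps the call sites as short as in the outline above.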

It might be a good idea to separate building the JSON object from encoding it.
One existing C JSON library (the calls below match Jansson's API) does it the following way:
json_t *item = json_object();
json_object_set_new(item, "id", json_string("stat"));
json_object_set_new(item, "minHour", json_integer(minHour));
json_object_set_new(item, "maxHour", json_integer(maxHour));
...
// Dump to console
json_dumpf(item, stdout, JSON_INDENT(4) | JSON_SORT_KEYS);
// Dump to file
json_dumpf(item, file, JSON_COMPACT);
// Free allocated resources
json_decref(item);
The separation gives some benefits.
For example, the output formatting can be selected in one place.
And the same object can easily be encoded in several ways (as in the example).

Related

How to safely use a padded struct as a hashmap key

I'm using libuv to write a UDP server. To tell clients apart I need to look at the source IP and source port. This is provided in the on_read callback as const struct sockaddr*. I need to use this information as the key for looking up the user's context somehow.
Ideally I would use a hashmap and use this struct as the key. However it's not clear if libuv zero initialises that structure and so there could be random data in the padding making it unsuitable as a raw hashmap key (a memcmp on the struct).
Assuming that libuv doesn't zero out the padding first, what would be the most efficient way to build a key out of this information? I am thinking I can simply use assignment or memcpy to copy the two fields I want into a clean struct, but I would have to do this for every packet.
I know that in the grand scheme of things this is not a huge amount of overhead, but have I missed a more elegant or efficient solution?
Edit: I've updated the title to reflect that even though my challenge is with libuv right now, this isn't really just a libuv specific problem as a struct like this could come from a number of places. When you get passed a struct from somewhere and you need to use that (or its contents) as a key, what's the correct / safe way to do that?

EDIT: Adding a "generic" response and moving the bad libuv TCP response to the end.
If you don't want to copy the struct (it is very small in this case, but treating this as a generic problem), the straightforward solution is to hash it member by member.
Let's assume that you need to extract a lot of sparse fields from a large struct. For example, if the hash is just a sum of the bytes:
#define KEY_INITIAL_STATUS 0
/* Fold len bytes of buf into the running hash value *status. */
void hash(char *status, const char *buf, size_t len) {
    size_t i;
    for (i=0; i<len; ++i)
        *status += buf[i]; /* update the value, not the pointer */
}
void receive_buf(struct addr_t addr, ...) {
    char key = KEY_INITIAL_STATUS;
    /* Hash only the fields of interest, so padding never reaches the key. */
    hash(&key, addr.field1, addr.field1_len);
    hash(&key, addr.field2, addr.field2_len);
    void *value = hashtable_search(hashtable, key, ...);
    // Do things with the value
}
The majority of hashes can be calculated this way, and then optimized (there is no need to go byte by byte).
A benchmark is needed to check whether it is better to do this or to copy everything into a zeroed struct.
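As a slightly more realistic sketch of the same idea, the following hashes two fields with 32-bit FNV-1a instead of a byte sum. The struct endpoint type and the function names are hypothetical stand-ins for the address and port taken from the sockaddr; FNV-1a is swapped in here only as a well-known simple hash, not something the answer prescribes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key source. On most ABIs there are 2 padding bytes after port. */
struct endpoint {
    uint16_t port;
    uint32_t addr;
};

#define FNV_OFFSET_BASIS 2166136261u
#define FNV_PRIME        16777619u

/* Fold len bytes into a running 32-bit FNV-1a hash. */
static uint32_t hash_bytes(uint32_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Hash only the named fields, so padding bytes never reach the hash. */
uint32_t endpoint_hash(const struct endpoint *e)
{
    uint32_t h = FNV_OFFSET_BASIS;
    h = hash_bytes(h, &e->port, sizeof e->port);
    h = hash_bytes(h, &e->addr, sizeof e->addr);
    return h;
}

/* Equality must also go field by field; memcmp would compare the padding. */
int endpoint_equal(const struct endpoint *a, const struct endpoint *b)
{
    return a->port == b->port && a->addr == b->addr;
}
```

Two endpoints with identical fields but different garbage in the padding produce the same hash and compare equal, which is exactly what a raw memcmp-based key cannot guarantee.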
I see that the libuv read callback uses this signature:
void read_cb(uv_stream_t * stream, ssize_t nread, const uv_buf_t *buf)
The client data is linked to the connection/stream, and libuv has already done this lookup for you. The library expects you to pass your data along with the stream somehow.
If I look at the doc here:
http://docs.libuv.org/en/v1.x/stream.html
"See also: The uv_handle_t members also apply."
So if I check the uv_handle_t members in http://docs.libuv.org/en/v1.x/handle.html#c.uv_handle_t:
void* uv_handle_t.data
Space for user-defined arbitrary data. libuv does not use this field.
So you should store your client information here and read it back in the callback; there is no need for you to do a single search.
In other libraries, it is common to return this type of data either in the connection struct, as a void * parameter in the "on_read" (or similar) callback, or even by allocating extra memory along with the library_stream_t structure, like malloc(sizeof(uv_stream_t) + sizeof(my_opaque_data)).

I wouldn't recommend directly using the struct as a key, but rather choose a Set or Hashtable library that allows you to pass in a comparator, when you initialize it. Of course the comparator should know how to compare the struct.
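A short sketch of what such a comparator could look like. The struct peer type is hypothetical, and the standard library's qsort and bsearch over a sorted array stand in for the set or hashtable library, since they accept the same kind of callback.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record keyed by address and port. */
struct peer {
    uint16_t port;
    uint32_t addr;
    int      user_id;
};

/* qsort/bsearch-style comparator: orders by addr, then port.
   Padding bytes and user_id never take part in the comparison. */
int peer_compare(const void *lhs, const void *rhs)
{
    const struct peer *a = lhs;
    const struct peer *b = rhs;
    if (a->addr != b->addr)
        return a->addr < b->addr ? -1 : 1;
    if (a->port != b->port)
        return a->port < b->port ? -1 : 1;
    return 0;
}

/* Look up a peer in an array previously sorted with peer_compare. */
struct peer *peer_find(struct peer *peers, size_t count,
                       uint32_t addr, uint16_t port)
{
    struct peer key = { .port = port, .addr = addr, .user_id = 0 };
    return bsearch(&key, peers, count, sizeof *peers, peer_compare);
}
```

Because the comparator names the fields explicitly, the caller can hand over the struct as received without cleaning it first.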

How can one avoid mistakenly casting a void* to the wrong type?

Programming in C, are there any techniques one can use to avoid (or at least minimize the likelihood of) casting a void * to the wrong pointer type by mistake? I'm working on a program that parses several different types of CSV datafiles and stores the fields into specific data structures for processing. For example, the records of one of the data files are stored in a hash table; data from another file is stored in a directed graph.
I'd like to create one main parsing function that reads in a record of fields from the file and passes each record to a function that tokenizes the fields and stores them in the appropriate data type. Instead of creating separate parsing/tokenizing functions for each file-type, I wanted to create a generic function to do this. In my design, the calling function would pass a record of fields, a function pointer to the tokenizer applicable to the data-file, and a void* pointing to a node of the destination data structure applicable to the data-file.
What I want to know is whether there is any way to ensure that the user does not mistakenly call the parsing function with a mismatched tokenizer / data structure. (By using the pointer to void, the compiler is certainly helpless here.) Or, if there are no such techniques, are there any effective error-handling methods to catch this mistake and prevent major problems (e.g., segfaults)?
I'd like the code to be as portable as possible.
Any thoughts? I'm not married to this algorithm, if someone has a better idea I'm open to it.

One possibility is to create a set of very thin wrapper functions that abstract away the type-safety issues:
void foo_generic(FieldRecords *records, Parser *parser, void *dest);
...
void foo_A(FieldRecords *records, A *dest) { foo_generic(records, &parse_A, dest); }
void foo_B(FieldRecords *records, B *dest) { foo_generic(records, &parse_B, dest); }
void foo_C(FieldRecords *records, C *dest) { foo_generic(records, &parse_C, dest); }
Now all the potential for typos is restricted to a single location, where mistakes should be easier to find due to the symmetry.
If you're feeling especially naughty, you could use a macro to simplify the generation of these wrapper functions:
#define FOO(T) void foo_##T(FieldRecords *records, T *dest) { \
foo_generic(records, &parse_##T, dest); \
}
FOO(A)
FOO(B)
FOO(C)
This minimises the probability of typos, but increases the chance of confusing the debugger/IDE!
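To make the wrapper idea concrete, here is a small self-contained sketch. The Edge and Entry types, the record formats and all function names are hypothetical; only the shape (one generic core taking void *, one thin typed wrapper per destination type) comes from the answer above.

```c
#include <stdio.h>

/* Hypothetical destination types for two kinds of data file. */
typedef struct { int from; int to; } Edge;
typedef struct { int key; double value; } Entry;

typedef int (*Tokenizer)(const char *record, void *dest);

/* Generic core: the only place where a void * is passed around. */
static int parse_generic(const char *record, Tokenizer tokenize, void *dest)
{
    if (record == NULL || dest == NULL)
        return -1;
    return tokenize(record, dest);
}

static int tokenize_edge(const char *record, void *dest)
{
    Edge *e = dest;
    return sscanf(record, "%d,%d", &e->from, &e->to) == 2 ? 0 : -1;
}

static int tokenize_entry(const char *record, void *dest)
{
    Entry *en = dest;
    return sscanf(record, "%d,%lf", &en->key, &en->value) == 2 ? 0 : -1;
}

/* Typed entry points: each pairs one tokenizer with one destination type,
   so passing an Entry * to parse_edge is diagnosed by the compiler. */
int parse_edge(const char *record, Edge *dest)
{
    return parse_generic(record, tokenize_edge, dest);
}

int parse_entry(const char *record, Entry *dest)
{
    return parse_generic(record, tokenize_entry, dest);
}
```

Callers only ever see parse_edge and parse_entry; keeping parse_generic and the tokenizers static means a mismatched pair cannot be assembled outside this file.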

Writing and reading (fwrite - fread) structures with pointers

I'm working on a mailbox project, and I have these two structures:
struct mmbox_mail
struct mmbox_mail {
char *sender, *recipient;
char *obj, *date;
char flags;
size_t size;
};
and
mail_t
typedef struct{
struct mmbox_mail info;
void *body;
void *next;
} mail_t;
I cannot modify the structures' fields, because I need variable data (for this purpose I used char* instead of char[]).
Each mail_t structure is a mail. I need to save every mail of a user in a file, which could be a binary or a text file (but I think a binary file is better, because I have the void* body, which is difficult to save as plain text).
I tried to do this, but it seems like it doesn't work:
while(mailtmp != NULL){ /* I have a list of mails and I use a mailtmp pointer to save each mail */
fwrite(mailtmp, sizeof(mail_t), 1, fp);
/* next mail */
mailtmp=mailtmp->next;
}
Could you help me? I searched everywhere but never found anyone asking how to save two structures, one nested inside the other.

Of course that will not work: for the strings it writes only the pointer itself (typically 4 or 8 bytes), not the characters it points to. I see 3 options here:
Serialize the data into a binary file (http://en.wikipedia.org/wiki/Serialization).
Create your own format to store the data in a text file.
Use a markup language like XML/JSON etc.
In any case you need to go through every field of the structure in order to write it to the data file. As for reading, in the first 2 cases you have to read the fields in exactly the order you wrote them; in the third case you can read fields independently, in any order.
If you choose the first method, also write the zero-termination byte for every string (char *) field, so that you always know where it ends when reading it back.
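A minimal sketch of the first option for the struct mmbox_mail from the question, assuming every string field is non-NULL. The function names are hypothetical, and the file is only meant to be read back on the same kind of machine, since flags and size are written in their native binary form.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct mmbox_mail {
    char *sender, *recipient;
    char *obj, *date;
    char flags;
    size_t size;
};

/* Write a string including its terminating NUL. Returns 0 on success. */
int write_cstring(FILE *fp, const char *s)
{
    size_t n = strlen(s) + 1;
    return fwrite(s, 1, n, fp) == n ? 0 : -1;
}

/* Read bytes up to and including the next NUL into a malloc'd string. */
char *read_cstring(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *s = malloc(cap);
    if (s == NULL)
        return NULL;
    for (;;) {
        int c = fgetc(fp);
        if (c == EOF) {              /* truncated file: no terminator found */
            free(s);
            return NULL;
        }
        if (len == cap) {            /* grow before storing the next byte */
            char *grown = realloc(s, cap * 2);
            if (grown == NULL) {
                free(s);
                return NULL;
            }
            s = grown;
            cap *= 2;
        }
        s[len++] = (char)c;
        if (c == '\0')
            return s;
    }
}

/* Fields are written in a fixed order and must be read in the same order. */
int write_mail_info(FILE *fp, const struct mmbox_mail *m)
{
    if (write_cstring(fp, m->sender) || write_cstring(fp, m->recipient) ||
        write_cstring(fp, m->obj) || write_cstring(fp, m->date))
        return -1;
    if (fwrite(&m->flags, sizeof m->flags, 1, fp) != 1)
        return -1;
    return fwrite(&m->size, sizeof m->size, 1, fp) == 1 ? 0 : -1;
}

int read_mail_info(FILE *fp, struct mmbox_mail *m)
{
    m->sender    = read_cstring(fp);
    m->recipient = read_cstring(fp);
    m->obj       = read_cstring(fp);
    m->date      = read_cstring(fp);
    if (m->sender && m->recipient && m->obj && m->date &&
        fread(&m->flags, sizeof m->flags, 1, fp) == 1 &&
        fread(&m->size, sizeof m->size, 1, fp) == 1)
        return 0;
    free(m->sender); free(m->recipient); free(m->obj); free(m->date);
    return -1;
}
```

The body could follow in the same style: write info.size bytes from body right after the header, and on reading allocate info.size bytes and fread them back. The next pointer is never written; the list is rebuilt while reading.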

What you're doing is saving the literal binary representation of mail_t into the file, which is just a bunch of pointers. What you want to do is something to the effect of:
fprintf( fp, "To: %s\nFrom: %s\n....\nContents: %.*s\n\n", mailtmp->info.recipient, mailtmp->info.sender, (int)mailtmp->info.size, (const char *)mailtmp->body );
That will render the values pointed to as strings and save them to the file. A pointer to a location in memory held by your application is a bit useless to most people after said application closes ;)
EDIT: Regarding saving two structures, one inside the other:
If you just had plain value types, such as ints or floats, your method would work perfectly. However, since your structures hold pointers, namely your char * and void * members, you have to actually specify how the data pointed to should be saved.

Well, you are storing the struct's pointers into the file, not the data they point to. Even if you stored the struct as you want, it would be hard to get it back from the file. I think you need a serialization component like Google Protocol Buffers. Then you can write an adaptor that translates the struct to a protobuf object and stores it to a file, and retrieve it when you want. Hope this helps :)

C functions overusing parameters?

I have legacy C code base at work and I find a lot of function implementations in the style below.
char *DoStuff(char *inPtr, char *outPtr, char *error, long *amount)
{
*error = 0;
*amount = 0;
// Read bytes from inPtr and decode them as a long storing in amount
// before returning as a formatted string in outPtr.
return (outPtr);
}
Using DoStuff:
myOutPtr = DoStuff(myInPtr, myOutPtr, myError, &myAmount);
I find that pretty obtuse and when I need to implement a similar function I end up doing:
long NewDoStuff(char *inPtr, char *error)
{
long amount = 0;
*error = 0;
// Read bytes from inPtr and decode them as a long storing in amount.
return amount;
}
Using NewDoStuff:
myAmount = NewDoStuff(myInPtr, myError);
myOutPtr += sprintf (myOutPtr, "%ld", myAmount);
I can't help but wonder whether there is something I'm missing with the top example. Is there a good reason to use that type of approach?

One advantage is that if you have many, many calls to these functions in your code, it will quickly become tedious to have to repeat the sprintf calls over and over again.
Also, returning the out pointer makes it possible for you to do things like:
DoOtherStuff(DoStuff(myInPtr, myOutPtr, myError, &myAmount), &myOther);
With your new approach, the equivalent code is quite a lot more verbose:
myAmount = NewDoStuff(myInPtr, myError);
myOutPtr += sprintf(myOutPtr, "%ld", myAmount);
myOther = DoOtherStuff(myInPtr, myError);
myOutPtr += sprintf(myOutPtr, "%ld", myOther);

It is the C standard library style. The return value is there to aid chaining of function calls.
Also, DoStuff is cleaner IMO, and you really should be using snprintf. With DoStuff, a change in the internals of buffer management does not affect your code; that is no longer true with NewDoStuff.
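A small sketch of how the two points above (a returned pointer for chaining, and snprintf for bounds checking) might be combined. The append_long helper and its end parameter are hypothetical and not part of the code in the question.

```c
#include <stdio.h>

/* Hypothetical helper: format amount at out without writing at or past end.
   Returns a pointer to the new terminating NUL so that calls can be chained,
   or NULL if out is NULL or the text did not fit. */
char *append_long(char *out, char *end, long amount)
{
    if (out == NULL || out >= end)
        return NULL;
    int n = snprintf(out, (size_t)(end - out), "%ld", amount);
    if (n < 0 || n >= end - out)
        return NULL;
    return out + n;
}
```

Since a NULL result is passed through unchanged by the next call, a chain such as append_long(append_long(buf, end, a), end, b) needs only one check at the end.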

The code you presented is a little unclear (for example, why are you adding the result of sprintf to myOutPtr?).
However, in general it seems that you're describing the breakdown of one function that does two things into a function that does one thing plus some code that does the other (the concatenation).
Separating responsibilities into two functions is a good idea. However, you would want a separate function for the concatenation and formatting as well; as written, it's really not clear.
In addition, every time you break a function call into multiple calls, you are creating code duplication. Code duplication is never a good idea, so you would need a function to do that, and you will end up (this being C) with something that looks like your original DoStuff.
So I am not sure that there is much you can do about this. One of the limitations of non-OOP languages is that you have to pass large numbers of parameters (unless you use structs). You might not be able to avoid the giant interface.

If you wind up having to do the sprintf call after every call to NewDoStuff, then you are repeating yourself (and therefore violating the DRY principle). When you realize that you need to format it differently you will need to change it in every location instead of just the one.

As a rule of thumb, if the interface to one of my functions exceeds 110 columns, I look strongly at using a structure (and at whether I'm taking the best approach). What I don't (ever) want to do is take a function that does 5 things and break it into 5 functions, unless some functionality within the function is not only useful, but needed on its own.
I would favor the first function, but I'm also quite accustomed to the standard C style.

Serialize Data Structures in C

I'd like a C library that can serialize my data structures to disk, and then load them again later. It should accept arbitrarily nested structures, possibly with circular references.
I presume that this tool would need a configuration file describing my data structures. The library is allowed to use code generation, although I'm fairly sure it's possible to do this without it.
Note I'm not interested in data portability. I'd like to use it as a cache, so I can rely on the environment not changing.
Thanks.
Results
Someone suggested Tpl which is an awesome library, but I believe that it does not do arbitrary object graphs, such as a tree of Nodes that each contain two other Nodes.
Another candidate is Eet, which is a project of the Enlightenment window manager. Looks interesting but, again, seems not to have the ability to serialize nested structures.

Check out tpl. From the overview:
Tpl is a library for serializing C data. The data is stored in its natural binary form. The API is small and tries to stay "out of the way". Compared to using XML, tpl is faster and easier to use in C programs. Tpl can serialize many C data types, including structures.

I know you're asking for a library. If you can't find one (::boggle::, you'd think this was a solved problem!), here is an outline for a solution:
You should be able to write a code generator[1] to serialize trees/graphs without (run-time) pre-processing fairly simply.
You'll need to parse the node structure (typedef handling?), and write the included data values in a straight ahead fashion, but treat the pointers with some care.
For pointers to other objects (e.g. char *name;) which you know are singly referenced, you can serialize the target data directly.
For objects that might be multiply referenced, and for other nodes of your tree, you'll have to represent the pointer structure. Each object gets assigned a serialization number, which is what is written out in place of the pointer. Maintain a translation structure between current memory position and serialization number. On encountering a pointer, see if it is already assigned a number; if not, give it one and queue that object up for serialization.
Reading back also requires a node-#/memory-location translation step, and might be easier to do in two passes: regenerate the nodes with the node numbers in the pointer slots (bad pointers, be warned) to find out where each node gets put, then walk the structure again fixing the pointers.
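The numbering scheme can be sketched for a node with two child pointers, written by hand rather than by a code generator. Everything here is an illustrative assumption: the Node type, the text format (one "value left right" line per node), the fixed MAX_NODES cap and the linear lookup table. Unlike the description above, the loader keeps the serial numbers in side arrays instead of storing them in the pointer slots.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_NODES  256     /* arbitrary cap to keep the sketch short */
#define NO_NODE    (-1L)   /* serial number standing in for NULL */
#define TABLE_FULL (-2L)

typedef struct Node {
    int value;
    struct Node *left, *right;
} Node;

/* Translation table: the index in seen[] is the node's serial number. */
typedef struct {
    const Node *seen[MAX_NODES];
    long count;
} NodeTable;

/* Return the serial number of n, assigning the next free one on first sight. */
static long serial_of(NodeTable *t, const Node *n)
{
    if (n == NULL)
        return NO_NODE;
    for (long i = 0; i < t->count; ++i)
        if (t->seen[i] == n)
            return i;
    if (t->count == MAX_NODES)
        return TABLE_FULL;
    t->seen[t->count] = n;
    return t->count++;
}

/* Write every node reachable from root. The table doubles as the work queue,
   so record i in the file always describes serial number i. */
int graph_save(FILE *fp, const Node *root)
{
    NodeTable t = { .count = 0 };
    if (root == NULL)
        return -1;
    serial_of(&t, root);                    /* root becomes number 0 */
    for (long i = 0; i < t.count; ++i) {
        const Node *n = t.seen[i];
        long l = serial_of(&t, n->left);
        long r = serial_of(&t, n->right);
        if (l == TABLE_FULL || r == TABLE_FULL)
            return -1;
        if (fprintf(fp, "%d %ld %ld\n", n->value, l, r) < 0)
            return -1;
    }
    return 0;
}

/* Two passes: allocate all nodes first, then turn numbers into pointers. */
Node *graph_load(FILE *fp)
{
    Node *nodes[MAX_NODES];
    long left[MAX_NODES], right[MAX_NODES];
    long count = 0, l, r;
    int value;

    while (count < MAX_NODES && fscanf(fp, "%d %ld %ld", &value, &l, &r) == 3) {
        nodes[count] = malloc(sizeof *nodes[count]);
        if (nodes[count] == NULL)
            goto fail;
        nodes[count]->value = value;
        left[count] = l;
        right[count] = r;
        ++count;
    }
    if (count == 0)
        return NULL;
    for (long i = 0; i < count; ++i) {
        if (left[i] < NO_NODE || left[i] >= count ||
            right[i] < NO_NODE || right[i] >= count)
            goto fail;                      /* corrupt serial number */
        nodes[i]->left  = left[i]  == NO_NODE ? NULL : nodes[left[i]];
        nodes[i]->right = right[i] == NO_NODE ? NULL : nodes[right[i]];
    }
    return nodes[0];
fail:
    for (long i = 0; i < count; ++i)
        free(nodes[i]);
    return NULL;
}
```

Shared and circular references survive the round trip because a node that has already been numbered is never queued again. A real implementation would replace the linear search with a hash table keyed on the pointer value.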
I don't know anything about tpl, but you might be able to piggy-back on it.
The on-disk/network format should probably be framed with some type information. You'll need a name-mangling scheme.
[1] ROOT uses this mechanism to provide very flexible serialization support in C++.
Late addition: It occurs to me that this is not always as easy as I implied above. Consider the following (contrived and badly designed) declaration:
enum {
mask_none = 0x00,
mask_something = 0x01,
mask_another = 0x02,
/* ... */
mask_all = 0xff
};
typedef struct mask_map {
int mask_val;
char *mask_name;
} mask_map_t;
mask_map_t mask_list[] = {
{mask_something, "mask_something"},
{mask_another, "mask_another"},
/* ... */
};
struct saved_setup {
char* name;
/* various configuration data */
char* mask_name;
/* ... */
};
and assume that we initialize our struct saved_setup items so that mask_name points at mask_list[foo].mask_name.
When we go to serialize the data, what do we do with struct saved_setup.mask_name?
You will need to take care in designing your data structures and/or bring some case-specific intelligence to the serialization process.

This is my solution. It uses my own implementations of malloc and free, plus the mmap and munmap system calls. Follow the given example code. Ref: http://amscata.blogspot.com/2013/02/serialize-your-memory.html
In my approach I create a char array as my own RAM space. Then there are functions to allocate memory from it and to free it. After creating the data structure, I write the char array to a file by using mmap.
Whenever you want to load it back into memory, there is a function which uses munmap to put the data structure back into the char array. Since it has virtual addresses for your pointers, you can reuse your data structure. That means you can create a data structure, save it, load it, edit it again, and save it again.

You can take a look at eet, a library of the Enlightenment project for storing C data types (including nested structures). Although nearly all libs of the Enlightenment project are in a pre-alpha state, eet is already released. I'm not sure, however, if it can handle circular references. Probably not.

http://s11n.net/c11n/
HTH

You should check out gwlib. The serializer/deserializer is extensive, and there are extensive tests available to look at. http://gwlib.com/

I'm assuming you are talking about storing a graph structure; if not, then disregard...
If you're storing a graph, I personally think the best idea would be implementing a function that converts your graph into an adjacency matrix. You can then make a function that converts an adjacency matrix back to your graph data structure.
This has three benefits (that may or may not matter in your application):
An adjacency matrix is a very natural way to create and store a graph.
You can create adjacency matrices elsewhere and import them into your application.
You can store and read your data in a meaningful way.
I used this method during a CS project and it is definitely how I would do it again.
You can read more about adjacency matrices here: http://en.wikipedia.org/wiki/Modified_adjacency_matrix

Another option is Avro C, an implementation of Apache Avro in C.

Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj), 0);
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys.
There is also support for lists, and all these structures can be nested:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);

In theory YAML should do what you want http://code.google.com/p/yaml-cpp/
Please let me know if it works for you.
