I'm here going through this book (Mastering Algorithms in C), looking for graph implementations. But he uses some notations that are not familiar to me. It's not obvious, and I tried to find if he explained it somewhere in the book, but I didn't.
When defining the type Graph, one of the struct members is
int (*match) (const void *key1, const void *key2);
Ok, so here we have 2 generic values that are being compared? Why inside the struct? Where is this *match function, that doesn't appear anywhere else?
He's using this kind of declaration all the way from singly linked lists, but without any explaining I could find.
Is the *destroy more or less the same kind of function? I found him saying this deallocates the memory for the struct. But again, why here?
This feels like the sort of very basic and obvious question, but I couldn't find the answer anywhere, and I don't really have anyone to ask.
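This is how you declare a pointer to a function.
The struct has a member named match, which should be assigned a function that takes two const void * arguments and returns an int. It lives inside the struct because the container only stores void * data and cannot know how to compare keys itself, so the caller supplies the comparison.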
int matchGraphMyWay(const void *key1, const void *key2){
...
}
int main(){
// Graph g = ...
g.match = &matchGraphMyWay; // Point the match member at our function.
g.match(...); // Call the function through the pointer
}
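To make the "why inside the struct" part concrete, here is a minimal sketch of a container that carries its own match and destroy pointers. Box, box_init, box_store_int and match_int are hypothetical names chosen for illustration, not the book's API, and the sketch assumes the convention that match returns nonzero when the two keys are equal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical one-element container: the caller supplies match and
   destroy once, and the generic code calls them through the struct. */
typedef struct Box {
    void *data;
    int  (*match)(const void *key1, const void *key2);
    void (*destroy)(void *data);
} Box;

/* Assumed convention: returns 1 when both keys point to equal ints. */
static int match_int(const void *key1, const void *key2)
{
    return *(const int *)key1 == *(const int *)key2;
}

static void box_init(Box *b,
                     int (*match)(const void *, const void *),
                     void (*destroy)(void *))
{
    assert(b != NULL && match != NULL);
    b->data = NULL;
    b->match = match;
    b->destroy = destroy;
}

/* Generic code: it never needs to know what type data points to. */
static int box_contains(const Box *b, const void *key)
{
    return b->data != NULL && b->match(b->data, key);
}

/* Releases the stored element with the caller-supplied destroy, if any. */
static void box_clear(Box *b)
{
    if (b->destroy != NULL && b->data != NULL)
        b->destroy(b->data);
    b->data = NULL;
}

/* Example helper: stores a heap-allocated int. Returns 0 on success. */
static int box_store_int(Box *b, int value)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = value;
    box_clear(b);
    b->data = p;
    return 0;
}
```

A caller would write box_init(&b, match_int, free) once; after that, box_contains and box_clear work for ints without containing any int-specific code, which is the same role match and destroy play in the book's Graph.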
Related
I was recently reading about ways to create something that resembles c++ objects, but in C, and I came across some pretty good examples. However, there was this single piece of code that had me thinking for hours, since it was the first time I saw this kind of syntax, and I didn't find anything like it on Google...
The block itself is this one:
struct stack {
struct stack_type * my_type;
// Put the stuff that you put after private: here
};
struct stack_type {
void (* construct)(struct stack * this); // This takes uninitialized memory
struct stack * (* operator_new)(); // This allocates a new struct, passes it to construct, and then returns it
void (*push)(struct stack * this, thing * t); // Pushing t onto this stack
thing * (*pop)(struct stack * this); // Pops the top thing off the stack and returns it
int this_is_here_as_an_example_only;
}Stack = {
.construct = stack_construct,
.operator_new = stack_operator_new,
.push = stack_push,
.pop = stack_pop
};
Assuming all the functions being set to the pointers are defined somewhere else, my doubts are the following:
1) What does the dot ( '.' ) mean, and what is its purpose when initializing the function pointers? (example: .construct = stack_construct )
2) why is there an equal sign after 'Stack' at the end of the struct definition, and why is there something other than a ';' ( in this case the word 'Stack' ), given the fact that there is no typedef at the beginning?
I assume it has something to do with initialization (like a constructor, I don't know), but it is the first time I see a struct = {...,...,...} in the definition. I've seen that when you initialize a struct like in the following example:
typedef struct s{
int a;
char b;
}struc;
void main(){
struc my_struct={12,'z'};
}
But this is not in the main declaration of the struct, and still, there are no '=' within the {}, unlike the first example, where it showed something like...
struc my_struct={ a = 12,
b = 'z'};
3) This is a minor doubt, meaning, I'm much more interested in the first two. Anyway, here it goes...
At the beginning of the first code, it says something like '// Put the stuff that you put after private: here'. Why is that? how would that make them private?
That is all, I would appreciate anything, this had me thinking for hours! Thanks in advance, have a great day!
1) Don't worry about the members being functions, it's just this:
What does dot (.) mean in a struct initializer?
2) In
struct S {int i;}
s = {0};
the first line names a type that can be used to declare and initialize a variable.
It is really equivalent to something like
double d = 1;
where you replace double with struct S {int i;}, replace d with s, and replace 1 with the initializer for a struct, {0}. Now combine this with the dot syntax.
It just has the side effect of also defining struct S.
Update:
Regarding 3), I do not think that with private they refer to an implementation of access control, but rather refer to the fact that in C++, you would usually list the data members in the private section of the class, so I understand this as instructions to add the data members after the type identifying element.
I'm trying to create a generic hash table in C. I've read a few different implementations, and came across a couple of different approaches.
The first is to use macros like this: http://attractivechaos.awardspace.com/khash.h.html
And the second is to use a struct with 2 void pointers like this:
struct hashmap_entry
{
void *key;
void *value;
};
From what I can tell this approach isn't great because it means that each entry in the map requires at least 2 allocations: one for the key and one for the value, regardless of the data types being stored. (Is that right???)
I haven't been able to find a decent way of keeping it generic without going the macro route. Does anyone have any tips or examples that might help me out?
C does not provide what you need directly, nevertheless you may want to do something like this:
Imagine that your hash table is a fixed-size array of doubly linked lists and it is OK that items are always allocated/destroyed on the application layer. These conditions will not work for every case, but in many cases they will. Then you will have these data structures, function sketches, and prototypes:
struct HashItemCore
{
HashItemCore *m_prev;
HashItemCore *m_next;
};
struct HashTable
{
HashItemCore m_data[256]; // This is actually an array of circular
// doubly linked lists.
int (*GetHashValue)(HashItemCore *item);
bool (*CompareItems)(HashItemCore *item1, HashItemCore *item2);
void (*ReleaseItem)(HashItemCore *item);
};
void InitHash(HashTable *table)
{
// Ensure that user provided the callbacks.
assert(table->GetHashValue != NULL && table->CompareItems != NULL && table->ReleaseItem != NULL);
// Init all doubly linked lists. Pointers of an empty list should point to the list head itself.
for (int i=0; i<256; ++i)
table->m_data[i].m_prev = table->m_data[i].m_next = table->m_data+i;
}
void AddToHash(HashTable *table, void *item);
void *GetFromHash(HashTable *table, void *item);
....
void *ClearHash(HashTable *table);
In these functions you need to implement the logic of the hash table. While working they will be calling user defined callbacks to find out the index of the slot and if items are identical or not.
The users of this table should define their own structures and callback functions for every pair of types that they want to use:
struct HashItemK1V1
{
HashItemCore m_core;
K1 key;
V1 value;
};
int CalcHashK1V1(void *p)
{
HashItemK1V1 *param = (HashItemK1V1*)p;
// App code.
}
bool CompareK1V1(void *p1, void *p2)
{
HashItemK1V1 *param1 = (HashItemK1V1*)p1;
HashItemK1V1 *param2 = (HashItemK1V1*)p2;
// App code.
}
void FreeK1V1(void *p)
{
HashItemK1V1 *param = (HashItemK1V1*)p;
// App code if needed.
free(p);
}
This approach will not provide type safety because items will be passed around as void pointers, assuming that every application structure starts with a HashItemCore member. This will be a sort of hand-made polymorphism. It is maybe not perfect, but it will work.
I implemented this approach in C++ using templates. But if you strip out all the fancy parts of C++, in a nutshell it is exactly what I described above. I used my table in multiple projects and it worked like a charm.
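A compact, compilable variant of this idea is sketched below. It is a simplification under stated assumptions: chains are singly linked instead of circular doubly linked, there are 16 slots instead of 256, and the names hash_init, hash_add, hash_find and IntDoubleItem are hypothetical. The callbacks take HashItemCore pointers rather than void pointers so that the compiler checks at least the link type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_COUNT 16

/* Link that every user item embeds as its FIRST member. */
typedef struct HashItemCore {
    struct HashItemCore *next;
} HashItemCore;

typedef struct HashTable {
    HashItemCore *slots[SLOT_COUNT];
    unsigned (*hash)(const HashItemCore *item);
    bool (*equal)(const HashItemCore *a, const HashItemCore *b);
} HashTable;

static void hash_init(HashTable *t,
                      unsigned (*hash)(const HashItemCore *),
                      bool (*equal)(const HashItemCore *, const HashItemCore *))
{
    assert(t != NULL && hash != NULL && equal != NULL);
    for (size_t i = 0; i < SLOT_COUNT; ++i)
        t->slots[i] = NULL;
    t->hash = hash;
    t->equal = equal;
}

/* The table never allocates: the caller owns the item's memory. */
static void hash_add(HashTable *t, HashItemCore *item)
{
    unsigned slot = t->hash(item) % SLOT_COUNT;
    item->next = t->slots[slot];
    t->slots[slot] = item;
}

/* probe only needs its key filled in. Returns NULL when not found. */
static HashItemCore *hash_find(const HashTable *t, const HashItemCore *probe)
{
    unsigned slot = t->hash(probe) % SLOT_COUNT;
    for (HashItemCore *p = t->slots[slot]; p != NULL; p = p->next)
        if (t->equal(p, probe))
            return p;
    return NULL;
}

/* Application side: int -> double with no separate key/value allocations. */
typedef struct IntDoubleItem {
    HashItemCore core;   /* must stay first so the casts below are valid */
    int key;
    double value;
} IntDoubleItem;

static unsigned int_item_hash(const HashItemCore *item)
{
    return (unsigned)((const IntDoubleItem *)item)->key;
}

static bool int_item_equal(const HashItemCore *a, const HashItemCore *b)
{
    return ((const IntDoubleItem *)a)->key == ((const IntDoubleItem *)b)->key;
}
```

Because key and value live inside the item itself, an int-to-double entry costs one allocation (or none, if the item is on the stack or in an array), which addresses the two-allocations concern from the question.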
A generic hashtable in C is a bad idea.
A neat implementation will require function pointers, which are slow, since these functions cannot be inlined (the general case will need at least two function calls per hop: one to compute the hash value and one for the final compare).
To allow inlining of functions you'll have to either write the code manually, use a code generator, or use macros, which can get messy.
IIRC, the Linux kernel uses macros to create and maintain (some of?) its hashtables.
C does not have generic data types, so what you want to do (no extra allocations and no void* casting) is not really possible. You can use macros to generate the right data functions/structs on the fly, but you're trying to avoid macros as well.
So you need to give up at least one of your ideas.
You could have a generic data structure without extra allocations by allocating something like:
size_t key_len;
size_t val_len;
char key[];
char val[];
in one go and then handing out either void pointers, or adding an api for each specific type.
Alternatively, if you have a limited number of types you need to handle, you could also tag the value with the right one so now each entry contains:
size_t key_len;
size_t val_len;
int val_type;
char key[];
char val[];
but in the API at least you can verify that the requested type is the right one.
Otherwise, to make everything generic, you're left with either macros, or changing the language.
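As a rough sketch of the single-allocation layout: standard C permits only one flexible array member per struct, so the version below stores the key bytes and the value bytes back to back in one array. Entry, entry_new, entry_get_val and entry_key_equals are hypothetical names, and size overflow checks are omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry = one malloc. Key bytes come first, value bytes follow. */
typedef struct Entry {
    size_t key_len;
    size_t val_len;
    unsigned char bytes[];   /* flexible array member (C99) */
} Entry;

/* Copies key and value into a freshly allocated entry, or returns NULL. */
static Entry *entry_new(const void *key, size_t key_len,
                        const void *val, size_t val_len)
{
    Entry *e = malloc(sizeof *e + key_len + val_len);
    if (e == NULL)
        return NULL;
    e->key_len = key_len;
    e->val_len = val_len;
    memcpy(e->bytes, key, key_len);
    memcpy(e->bytes + key_len, val, val_len);
    return e;
}

/* Copies the stored value out. Returns 0 on success, -1 on a size
   mismatch. Copying sidesteps alignment issues of the packed layout. */
static int entry_get_val(const Entry *e, void *out, size_t out_len)
{
    assert(e != NULL && out != NULL);
    if (out_len != e->val_len)
        return -1;
    memcpy(out, e->bytes + e->key_len, e->val_len);
    return 0;
}

/* Bytewise key comparison. */
static int entry_key_equals(const Entry *e, const void *key, size_t key_len)
{
    return e->key_len == key_len && memcmp(e->bytes, key, key_len) == 0;
}
```

The length check in entry_get_val is a weaker cousin of the val_type tag described above: it catches a request for a value of the wrong size, but not one of the wrong type with the same size.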
I'm implementing a set of common yet not so trivial (or error-prone) data structures for C (here) and just came with an idea that got me thinking.
The question in short is, what is the best way to implement two structures that use similar algorithms but have different interfaces, without having to copy-paste/rewrite the algorithm? By best, I mean most maintainable and debug-able.
I think it is obvious why you wouldn't want to have two copies of the same algorithm.
Motivation
Say you have a structure (call it map) with a set of associated functions (map_*()). Since the map needs to map anything to anything, we would normally implement it taking a void *key and void *data. However, think of a map of int to int. In this case, you would need to store all the keys and data in another array and give their addresses to the map, which is not so convenient.
Now imagine if there was a similar structure (call it mapc, c for "copies") that during initialization takes sizeof(your_key_type) and sizeof(your_data_type) and given void *key and void *data on insert, it would use memcpy to copy the keys and data in the map instead of just keeping the pointers. An example of usage:
int i;
mapc m;
mapc_init(&m, sizeof(int), sizeof(int));
for (i = 0; i < n; ++i)
{
int j = rand(); /* whatever */
mapc_insert(&m, &i, &j);
}
which is quite nice, because I don't need to keep another array of is and js.
My ideas
In the example above, map and mapc are very closely related. If you think about it, map and set structures and functions are also very similar. I have thought of the following ways to implement their algorithm only once and use it for all of them. Neither of them however are quite satisfying to me.
Use macros. Write the function code in a header file, leaving the structure dependent stuff as macros. For each structure, define the proper macros and include the file:
map_generic.h
#define INSERT(x) x##_insert
int INSERT(NAME)(NAME *m, PARAMS)
{
// create node
ASSIGN_KEY_AND_DATA(node)
// get m->root
// add to tree starting from root
// rebalance from node to root
// etc
}
map.c
#define NAME map
#define PARAMS void *key, void *data
#define ASSIGN_KEY_AND_DATA(node) \
do {\
node->key = key;\
node->data = data;\
} while (0)
#include "map_generic.h"
mapc.c
#define NAME mapc
#define PARAMS void *key, void *data
#define ASSIGN_KEY_AND_DATA(node) \
do {\
memcpy(node->key, key, m->key_size);\
memcpy(node->data, data, m->data_size);\
} while (0)
#include "map_generic.h"
This method is not half bad, but it's not so elegant.
Use function pointers. For each part that is dependent on the structure, pass a function pointer.
map_generic.c
int map_generic_insert(void *m, void *key, void *data,
void (*assign_key_and_data)(void *, void *, void *, void *),
map_node *(*get_root)(void *))
{
// create node
assign_key_and_data(m, node, key, data);
root = get_root(m);
// add to tree starting from root
// rebalance from node to root
// etc
}
map.c
static void assign_key_and_data(void *m, void *node, void *key, void *data)
{
map_node *n = node;
n->key = key;
n->data = data;
}
static map_node *get_root(void *m)
{
return ((map *)m)->root;
}
int map_insert(map *m, void *key, void *data)
{
return map_generic_insert(m, key, data, assign_key_and_data, get_root);
}
mapc.c
static void assign_key_and_data(void *m, void *node, void *key, void *data)
{
map_node *n = node;
mapc *mc = m;
memcpy(n->key, key, mc->key_size);
memcpy(n->data, data, mc->data_size);
}
static map_node *get_root(void *m)
{
return ((mapc *)m)->root;
}
int mapc_insert(mapc *m, void *key, void *data)
{
return map_generic_insert(m, key, data, assign_key_and_data, get_root);
}
This method requires writing more functions that could have been avoided in the macro method (as you can see, the code here is longer) and doesn't allow optimizers to inline the functions (as they are not visible to map_generic.c file).
So, how would you go about implementing something like this?
Note: I wrote the code in the stack-overflow question form, so excuse me if there are minor errors.
Side question: Does anyone have a better idea for a suffix that says "this structure copies the data instead of the pointer"? I use c for "copies", but there could be a much better word for it in English that I don't know about.
Update:
I have come up with a third solution. In this solution, only one version of the map is written, the one that keeps a copy of data (mapc). This version would use memcpy to copy data. The other map is an interface to this, taking void *key and void *data pointers and sending &key and &data to mapc so that the address they contain would be copied (using memcpy).
This solution has the downside that a normal pointer assignment is done by memcpy, but it completely solves the issue otherwise and is very clean.
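The third solution can be sketched as follows. To keep it short, the stand-in mapc here is a fixed-capacity array searched linearly with bytewise key comparison, not a balanced tree, and the capacity parameter and mapc_find/map_find functions are assumptions added for the example. The point is only the wrapper: map stores pointers by asking mapc to copy sizeof(void *) bytes from &key and &data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Tiny stand-in for the copying map (array-backed, NOT a tree). */
typedef struct mapc {
    size_t key_size, data_size, count, cap;
    unsigned char *keys;
    unsigned char *data;
} mapc;

static int mapc_init(mapc *m, size_t key_size, size_t data_size, size_t cap)
{
    assert(m != NULL && key_size > 0 && data_size > 0 && cap > 0);
    m->key_size = key_size;
    m->data_size = data_size;
    m->count = 0;
    m->cap = cap;
    m->keys = malloc(key_size * cap);
    m->data = malloc(data_size * cap);
    if (m->keys == NULL || m->data == NULL) {
        free(m->keys);
        free(m->data);
        m->keys = m->data = NULL;
        return -1;
    }
    return 0;
}

/* Copies key_size and data_size bytes into the map's own storage. */
static int mapc_insert(mapc *m, const void *key, const void *data)
{
    if (m->count == m->cap)
        return -1;
    memcpy(m->keys + m->count * m->key_size, key, m->key_size);
    memcpy(m->data + m->count * m->data_size, data, m->data_size);
    m->count++;
    return 0;
}

/* Returns a pointer to the stored copy of the data, or NULL. */
static void *mapc_find(const mapc *m, const void *key)
{
    for (size_t i = 0; i < m->count; ++i)
        if (memcmp(m->keys + i * m->key_size, key, m->key_size) == 0)
            return m->data + i * m->data_size;
    return NULL;
}

static void mapc_free(mapc *m) { free(m->keys); free(m->data); }

/* Pointer-keeping map as a thin wrapper: the "element" that gets
   copied is the pointer value itself. */
typedef struct map { mapc impl; } map;

static int map_init(map *m, size_t cap)
{
    return mapc_init(&m->impl, sizeof(void *), sizeof(void *), cap);
}

static int map_insert(map *m, void *key, void *data)
{
    return mapc_insert(&m->impl, &key, &data);
}

/* Keys match by pointer identity in this sketch. */
static void *map_find(const map *m, void *key)
{
    void **slot = mapc_find(&m->impl, &key);
    return slot != NULL ? *slot : NULL;
}
```

A real map would compare what the keys point to through a user-supplied comparator rather than comparing the pointer bytes; that part is left out here.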
Alternatively, one can only implement the map and use an extra vectorc with mapc which first copies the data to vector and then gives the address to a map. This has the side effect that deletion from mapc would either be substantially slower, or leave garbage (or require other structures to reuse the garbage).
Update 2:
I came to the conclusion that careless users might use my library the way they write C++, copy after copy after copy. Therefore, I am abandoning this idea and accepting only pointers.
You roughly covered both possible solutions.
The preprocessor macros roughly correspond to C++ templates and have the same advantages and disadvantages:
They are hard to read.
Complex macros are often hard to use (consider type safety of parameters etc.)
They are just "generators" of more code, so the compiled output still contains a lot of duplication.
On the other hand, they allow the compiler to optimize a lot of stuff.
The function pointers roughly correspond to C++ polymorphism. They are IMHO a cleaner and generally easier-to-use solution, but they bring some cost at runtime (in tight loops, a few extra function calls can be expensive).
I generally prefer the function pointers, unless the performance is really critical.
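The trade-off can be seen side by side in a small sketch. DEFINE_ARRAY_MAX, generic_max and cmp_int are hypothetical names for this example; the macro stamps out one specialised function per type, while the function-pointer version keeps a single copy of the loop and takes the comparison at run time, in the style of qsort.

```c
#include <assert.h>
#include <stddef.h>

/* Macro "template": each expansion is a separate, fully typed function
   whose comparison the compiler can see and inline. */
#define DEFINE_ARRAY_MAX(T, NAME)              \
    static T NAME(const T *a, size_t n)        \
    {                                          \
        assert(a != NULL && n > 0);            \
        T best = a[0];                         \
        for (size_t i = 1; i < n; ++i)         \
            if (a[i] > best)                   \
                best = a[i];                   \
        return best;                           \
    }

DEFINE_ARRAY_MAX(int, max_int)
DEFINE_ARRAY_MAX(double, max_double)

/* Function-pointer version: one copy of the loop, no type checking on
   the elements, and one indirect call per comparison. */
static const void *generic_max(const void *base, size_t n, size_t size,
                               int (*cmp)(const void *, const void *))
{
    assert(base != NULL && n > 0 && cmp != NULL);
    const unsigned char *p = base;
    const void *best = p;
    for (size_t i = 1; i < n; ++i)
        if (cmp(p + i * size, best) > 0)
            best = p + i * size;
    return best;
}

/* Comparator in the qsort convention: negative, zero or positive. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Both give the same answer for an int array; the difference is where the cost goes: code size and readability for the macro, an indirect call per comparison for the function pointer.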
There's also a third option that you haven't considered: you can create an external script (written in another language) to generate your code from a series of templates. This is similar to the macro method, but you can use a language like Perl or Python to generate the code. Since these languages are more powerful than the C pre-processor, you can avoid some of the potential problems inherent in doing templates via macros. I have used this method in cases where I was tempted to use complex macros like in your example #1. In the end, it turned out to be less error-prone than using the C preprocessor. The downside is that between writing the generator script and updating the makefiles, it's a little more difficult to get set up initially (but IMO worth it in the end).
What you're looking for is polymorphism. C++, C# or other object-oriented languages are more suitable for this task, though many people have tried to implement polymorphic behavior in C.
The Code Project has some good articles/tutorials on the subject:
http://www.codeproject.com/Articles/10900/Polymorphism-in-C
http://www.codeproject.com/Articles/108830/Inheritance-and-Polymorphism-in-C
gcc 4.4.3 c89
I am creating a client server application and I will need to implement some callback functions.
However, I am not too experienced with callbacks, and I am wondering if anyone knows some good reference material to follow when designing them. Are there any design patterns that are used for C? I did look at some patterns, but they were all C++.
Many thanks for any suggestions,
Here is a very rough example. Please note, the only thing I'm trying to demonstrate is the use of callbacks; it's designed to be informational, not a polished demonstration.
Let's say that we have a library (or any set of functions that revolve around a structure); we're going to have code that looks similar to this (of course, I'm naming it foo):
typedef struct foo {
int value;
char *text;
} foo_t;
That's simple enough. We'd then (conventionally) provide some means of allocating and freeing it, such as:
foo_t *foo_start(void)
{
foo_t *ret = NULL;
ret = (foo_t *)malloc(sizeof(struct foo));
if (ret == NULL)
return NULL;
return ret;
}
And then:
void foo_stop(foo_t *f)
{
if (f != NULL)
free(f);
}
But we want a callback, so we can define a function that will be entered when foo->text has something to report. To do that, we use a typed function pointer:
typedef void (* foo_callback_t)(int level, const char *data);
We also want any of the foo family of functions to be able to enter this callback conveniently. To do that, we need to add it to the structure, which would now look like this:
typedef struct foo {
int value;
char *text;
foo_callback_t callback;
} foo_t;
Then we write the function that will actually be entered (using the same prototype of our callback type):
void my_foo_callback(int val, const char *data)
{
printf("Val is %d, data is %s\n", val, data == NULL ? "NULL" : data);
}
We then need to write some convenient way to say what function it actually points to:
void foo_reg_callback(foo_t *f, foo_callback_t cbfunc)
{
f->callback = cbfunc;
}
And then our other foo functions can use it, for instance:
int foo_bar(foo_t *f, char *data)
{
if (data == NULL) {
f->callback(LOG_ERROR, "data was NULL");
return -1;
}
return 0;
}
Note that in the above:
f->callback(LOG_ERROR, "data was NULL");
Is just like doing this:
my_foo_callback(LOG_ERROR, "data was NULL");
Except that, we enter my_foo_callback() via a function pointer that we previously set, thereby giving us the flexibility to define our own handler on the fly (and even switch handlers if / as needed).
One of the biggest problems with callbacks (and even the code above) is type safety when using them. A lot of callbacks will take a void * pointer, usually named something like context which could be any type of data/memory. This provides great flexibility, but can be problematic if your pointers get away from you. For instance, you don't want to accidentally cast what is actually a struct * as char * (or int for that matter) by assignment. You can pass much more than simple strings and integers - structures, unions, enums, etc can all be passed. CCAN's type safe callbacks help you to avoid unwittingly evil casts (to / from void *) when doing so.
Again, this is an oversimplified example that's designed to give you an overview of one possible way to use callbacks. Please consider it pseudocode that is meant only as an example.
In C, callbacks are done with function pointers.
One feature that you definitely want is user defined context. Your code takes a void * pointer and makes it available to the callback function:
void callback(..., void *ctx);
void call_service_which_invokes_callback(...,
void (*cb)(..., void *ctx),
void *ctx);
This way, the callback can access any necessary state without having to use global variables.
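A minimal sketch of the context pattern, with made-up names (for_each_item, item_cb, sum_cb, max_cb): the service passes ctx through untouched, and each callback casts it back to whatever state it expects.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: receives one item plus the caller's context. */
typedef void (*item_cb)(int item, void *ctx);

/* Hypothetical "service": calls cb once per element and hands ctx
   through without looking at it. */
static void for_each_item(const int *items, size_t n, item_cb cb, void *ctx)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; ++i)
        cb(items[i], ctx);
}

/* State for the first callback lives in a struct owned by the caller. */
struct sum_state {
    long total;
    size_t calls;
};

static void sum_cb(int item, void *ctx)
{
    struct sum_state *s = ctx;   /* caller must pass a struct sum_state */
    s->total += item;
    s->calls++;
}

/* A second callback with a different context type: a plain int. */
static void max_cb(int item, void *ctx)
{
    int *max = ctx;
    if (item > *max)
        *max = item;
}
```

The same for_each_item serves both callbacks even though their state has different types, and neither needs a global. The price is that nothing stops a caller from passing the wrong context type, which is the type-safety concern raised earlier.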
Callbacks in C are implemented using function pointers. This might be helpful for starting points:
What is a "callback" in C and how are they implemented?
Also,
http://www.newty.de/fpt/callback.html#howto
I am trying to explore OOP in C. I am however a C n00b and would like to pick the brilliant brains of stackoverflow :)
My code is below:
#include <stdio.h>
#include <stdlib.h>
typedef struct speaker {
void (*say)(char *msg);
} speaker;
void say(char *dest) {
printf("%s",dest);
}
speaker* NewSpeaker() {
speaker *s;
s->say = say;
return s;
}
int main() {
speaker *s = NewSpeaker();
s->say("works");
}
However I'm getting a segfault from this, if I however remove all args from say and make it void, I can get it to work properly. What is wrong with my current code?
Also. While this implements a form of object in C, I'm trying to further implement it with inheritance, and even overriding/overloading of methods. How do you think I can implement such?
Thank You!
In your code, NewSpeaker() doesn't actually create a "new" speaker. You need to use a memory allocation function such as malloc or calloc.
speaker* NewSpeaker() {
speaker *s = malloc(sizeof(speaker));
s->say = say;
return s;
}
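Without assigning it a value, for example the return value of malloc, s is uninitialized and holds whatever junk happens to be on the stack, so dereferencing it is undefined behavior, hence the segfault. In real code, also check that malloc did not return NULL.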
Firstly, as it has been noted already, you failed to allocate memory for your 'speaker' object in 'NewSpeaker'. Without the unnecessary clutter it would look as follows
speaker* NewSpeaker(void)
{
speaker *s = malloc(sizeof *s);
s->say = say;
return s;
}
Note, that there's no cast on the result of the malloc, no type name in the 'sizeof' argument and the function parameter list is declared as '(void)', not just '()'.
Secondly, if you want to be able to create non-dynamic objects of your 'speaker' type, you might want to provide an in-place initialization function first, and then proceed from there
speaker* InitSpeaker(speaker* s)
{
assert(s != NULL);
s->say = say;
return s;
}
speaker* NewSpeaker(void)
{
void *raw = malloc(sizeof(speaker));
return raw != NULL ? InitSpeaker(raw) : NULL;
}
Finally, if you really want to create something like virtual C++ methods, you need to supply each method with a 'this' parameter (to get access to other members of your object). So it should probably look something like
typedef struct speaker
{
void (*say)(struct speaker *this, char *msg);
} speaker;
void say(speaker *this, char *dest)
{
printf("%s",dest);
}
This, of course, will require you to pass the corresponding argument every time you call a "method", but there's no way around this.
Additionally, I hope you know that you need "method" pointers in your "class" for "virtual methods" only. Ordinary (non-virtual) methods don't need such pointers.
Also, a "traditional" C++ class implementation doesn't store virtual method pointers inside each instance of the class. Instead, they are placed in a separate table (VMT), and a pointer to that table is added to each instance. This saves a lot of memory. And this, BTW, makes especially good sense when you implement inheritance.
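A sketch of the table approach applied to speaker follows. The names speaker_vtable, plain_vt, loud_vt and speaker_say are hypothetical, the parameter is called self instead of this so the code also compiles as C++, and say writes into a caller-supplied buffer (an assumption made so the result is easy to check).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct speaker;

/* One table per "class", shared by every instance of that class. */
typedef struct speaker_vtable {
    int (*say)(const struct speaker *self, const char *msg,
               char *out, size_t out_size);
} speaker_vtable;

typedef struct speaker {
    const speaker_vtable *vt;   /* one pointer per instance, however many methods */
    const char *name;
} speaker;

static int plain_say(const speaker *self, const char *msg, char *out, size_t n)
{
    return snprintf(out, n, "%s: %s", self->name, msg);
}

static int loud_say(const speaker *self, const char *msg, char *out, size_t n)
{
    return snprintf(out, n, "%s: %s!!!", self->name, msg);
}

static const speaker_vtable plain_vt = { plain_say };
static const speaker_vtable loud_vt  = { loud_say };

/* In-place initialization: picks the "class" by choosing the table. */
static void speaker_init(speaker *s, const speaker_vtable *vt, const char *name)
{
    assert(s != NULL && vt != NULL);
    s->vt = vt;
    s->name = name;
}

/* Dispatch helper so callers need not write s->vt->say(s, ...). */
static int speaker_say(const speaker *s, const char *msg, char *out, size_t n)
{
    return s->vt->say(s, msg, out, n);
}
```

Two speakers initialized with different tables respond differently to the same speaker_say call, which is the behaviour a virtual method gives you in C++.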
You can implement inheritance by embedding the parent class structure in the top of the child class structure. That way you can safely cast from the child class to the parent class. Here's an article on implementing OO features in C. If you want an existing solution, or just want to learn more about ways of achieving OO, look at the GObject library.
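The embedding technique can be sketched like this, with hypothetical shape and rect types. Because base is the first member of rect, a pointer to a rect may be converted to a pointer to a shape and back.

```c
#include <assert.h>
#include <stddef.h>

/* "Parent class" with one overridable method. */
typedef struct shape {
    const char *name;
    double (*area)(const struct shape *self);
} shape;

/* "Child class": the parent is embedded as the FIRST member. */
typedef struct rect {
    shape base;
    double w, h;
} rect;

static double shape_area_default(const shape *self)
{
    (void)self;
    return 0.0;
}

static double rect_area(const shape *self)
{
    /* Valid because self points at the first member of a rect. */
    const rect *r = (const rect *)self;
    return r->w * r->h;
}

static void shape_init(shape *s, const char *name)
{
    s->name = name;
    s->area = shape_area_default;
}

static void rect_init(rect *r, double w, double h)
{
    shape_init(&r->base, "rect");   /* run the parent "constructor" first */
    r->base.area = rect_area;       /* then override the method           */
    r->w = w;
    r->h = h;
}

/* Code written against the parent type works for any child. */
static double total_area(shape *const *shapes, size_t n)
{
    assert(shapes != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += shapes[i]->area(shapes[i]);
    return sum;
}
```

Passing &r.base instead of casting (shape *)&r does the same thing and keeps the compiler's type checking, so it is usually the safer way to "upcast".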