Looking for a good hash table implementation in C [closed] - c

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I am primarily interested in string keys. Can someone point me towards a library?

I had the same need, did some research, and ended up using libcfu.
It's simple and readable, so if I need to modify it I can do so without spending too much time understanding it. It's also BSD-licensed, and there is no need to change my structs (to embed, say, a next pointer).
I had to reject the other options for the following reasons (my personal reasons, YMMV):
* sglib --> it's a macro maze, and I wasn't comfortable debugging or making changes on a code base built from macros alone
* cbfalconer --> a lot of licensing red flags, the site was down, and there were too many unfavorable discussions on the web about support and the author; I didn't want to take the risk
* Google sparsehash --> as stated already, it's for C++, not C
* glib (gnome hash) --> looked very promising, but I couldn't find any easy way to install the developer kit; I just needed the C routines/files, not the full-blown development environment
* Judy --> seems too complex for a simple use; I also was not ready to debug it myself if I ran into any issues
* npsml (mentioned here) --> can't find the source
* strmap --> found it very simple and useful, but it's too simplistic in that both key and value must be strings; requiring the value to be a string seems too restrictive (it should accept void *)
* uthash --> seems good (it is mentioned in the Wikipedia article on hash tables), but it requires the struct to be modified; I didn't want to do that, as performance is not really a concern for my use -- development velocity matters more
In summary, for very simple use strmap is good; use uthash if you are concerned about additional memory use. If speed of development or ease of use is the primary objective, libcfu wins [note that libcfu allocates memory internally to maintain the nodes/hashtables]. It's surprising that there aren't many simple C hash implementations available.

GLib is a great library to use as a foundation in your C projects. They have some decent data structure offerings including Hash Tables: http://developer.gnome.org/glib/2.28/glib-Hash-Tables.html (link updated 4/6/2011)

For strings, the Judy Array might be good.
A Judy array is a complex but very fast associative array data structure for storing and looking up values using integer or string keys. Unlike normal arrays, Judy arrays may be sparse; that is, they may have large ranges of unassigned indices.
Here is a Judy library in C.
A C library that provides a state-of-the-art core technology that implements a sparse dynamic array. Judy arrays are declared simply with a null pointer. A Judy array consumes memory only when it is populated, yet can grow to take advantage of all available memory if desired.
Other references,
This Wikipedia hash implementation reference has some C open source links.
Also, cmph -- A Minimal Perfect Hashing Library in C, supports several algorithms.

There are some good answers here:
Container Class / Library for C
http://sglib.sourceforge.net.
http://cbfalconer.home.att.net/download/

Dave Hanson's C Interfaces and Implementations includes a fine hash table and several other well-engineered data structures. There is also a nice string-processing interface. The book is great if you can afford it, but even if not, I have found this software very well designed, small enough to learn in its entirety, and easy to reuse in several different projects.

A long time has passed since I asked this question... I can now add my own public domain library to the list:
http://sourceforge.net/projects/npsml/

C Interfaces and Implementations discusses hash table implementations in C. The source code is available online. (My copy of the book is at work so I can't be more specific.)

Apache's APR library has its own hash-implementation. It is already ported to anything Apache runs on and the Apache license is rather liberal too.

khash.h from samtools/bwa/seqtk/klib
curl https://raw.github.com/attractivechaos/klib/master/khash.h
via http://www.biostars.org/p/10353/

Never used it but Google Sparsehash may work

Download tcl and use their time-proven tcl hash function. It's easy. The TCL API is well documented.

Gperf - Perfect Hash Function Generator
http://www.ibm.com/developerworks/linux/library/l-gperf.html

https://github.com/dozylynx/C-hashtable
[updated URL as original now 404s: http://www.cl.cam.ac.uk/~cwc22/hashtable/ ]
Defined functions
* create_hashtable
* hashtable_insert
* hashtable_search
* hashtable_remove
* hashtable_count
* hashtable_destroy
Example of use
struct hashtable *h;
struct some_key *insert_key, *retrieve_key;
struct some_value *v, *found;
static unsigned int hash_from_key_fn( void *k );
static int keys_equal_fn ( void *key1, void *key2 );
h = create_hashtable(16, hash_from_key_fn, keys_equal_fn);
insert_key = (struct some_key *) malloc(sizeof(struct some_key));
retrieve_key = (struct some_key *) malloc(sizeof(struct some_key));
v = (struct some_value *) malloc(sizeof(struct some_value));
/* You should initialise insert_key, retrieve_key and v here */
if (! hashtable_insert(h,insert_key,v) )
{ exit(-1); }
if (NULL == (found = hashtable_search(h,retrieve_key) ))
{ printf("not found!\n"); }
if (NULL == (found = hashtable_remove(h,retrieve_key) ))
{ printf("Not found\n"); }
hashtable_destroy(h,1); /* second arg indicates "free(value)" */
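The example above only declares hash_from_key_fn and keys_equal_fn. For string keys, the two callbacks might look like the sketch below. The bodies are an assumption, not code from the library: the hash is the well-known djb2 string hash, and the equality callback is assumed to return non-zero when the keys match, so check the library's header before relying on that convention.

```c
#include <string.h>

/* Hypothetical callbacks for NUL-terminated string keys. The names match
 * the prototypes in the example above; the bodies are an illustration. */

/* djb2 string hash: h = h * 33 + c for every byte of the key. */
static unsigned int hash_from_key_fn(void *k)
{
    const unsigned char *s = k;
    unsigned int h = 5381;

    while (*s != '\0')
        h = h * 33u + *s++;
    return h;
}

/* Returns non-zero when both keys hold the same string (assumed convention). */
static int keys_equal_fn(void *key1, void *key2)
{
    return strcmp(key1, key2) == 0;
}
```

With callbacks like these, the keys passed to hashtable_insert would be heap-allocated strings rather than struct some_key objects.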

The STL has map and hash_map (hash_map is only in some implementations), which are key-to-value containers, if you are able to use C++.
http://www.cplusplus.com/reference/stl/map/


How to implement C code with pointers in Prolog?

I am new to Prolog. I have learned that, though it is a declarative language, Prolog can be used as a general-purpose programming language, just like C. So whatever problems you can solve in C, you can solve in Prolog as well, even though its run time may not be as good. Since there are no pointers in Prolog (as far as I know), I am wondering if I can write an equivalent program in Prolog for the following code written in C:
#include <stdio.h>
int main()
{
    int a = 5;
    int *p;
    p = &a;
    printf("The address of a is %p.", (void *)p);
    return 0;
}
You're trying to drive in a nail using a screwdriver, to use a popular analogy. Prolog is not C and solving problems in Prolog is fundamentally different from solving them in C.
Printing the value of a variable is easy to do, for example:
main :-
    X = 5,
    format("X = ~w~n", [X]).
but you can't get the address of X like you can in C. And why would you want to? The address could be different next time since Prolog has automatic garbage collection.
If you want to learn Prolog, forget about trying to write Prolog programs which look like C programs, and try to solve actual problems instead. You could try out the Project Euler series of problems, for example.
Apart from the comments and the existing answer, here is more:
Ask yourself: what is the use of the C program that you have shown? What problem does it solve? I can't answer this question, and I suspect you can't answer it either. In isolation, this program has no useful application whatsoever! So despite C being a general purpose programming language, you can write programs without any purpose, general or domain-specific.
The same, of course, is true of Prolog.
To pointers in particular: they are a very thin abstraction over absolute memory addresses. You can use (and abuse) pointers in many ways, and, if your algorithms are correct for the problem you are currently solving, the compiler can generate very efficient machine code. The same, however, is true of Prolog. The paradigms, however, will be very different.
In summary, you have managed to write a question so devoid of meaning that you provoked me to answer it without any code.
P.S. Or you have just trolled us with moderate success.
Well, since you tagged your question swi-prolog, I can show the code I used to exchange Qt GUI objects (just pointers, you know...) with the Prolog engine.
/** get back an object passed by pointer to Prolog */
template<typename Obj> Obj* pq_cast(T ptr) {
    return static_cast<Obj*>(static_cast<void*>(ptr));
}
to be used, for instance, in swipl-win, where _read_f is really a C callback:
/** fill the buffer */
ssize_t Swipl_IO::_read_f(void *handle, char *buf, size_t bufsize) {
    auto e = pq_cast<Swipl_IO>(handle);
    return e->_read_(buf, bufsize);
}
swipl-win has found its way as the new console in SWI-Prolog.


Tool to convert (translate) C to Go? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
What tool to use to convert C source code into Go source code?
For example, if the C code contains:
struct Node {
    struct Node *left, *right;
    void *data;
};
char charAt(char *s, int i) {
    return s[i];
}
the corresponding Go code generated by the tool should be:
type Node struct {
    left, right *Node
    data interface{}
}
func charAt(s string, i int) byte {
    return s[i]
}
The tool does not need to be perfect. It is OK if some parts of the generated Go code need to be corrected by hand.
rsc created github.com/rsc/c2go to convert the C-based Go compiler into Go.
As an external example, akavel seems to be trying to use it to create a Go-based Lua: github.com/akavel/goluago/
github.com/xyproto/c2go is another project, but it hasn't been touched in a little while.
I guess no such tool (C to Go source code conversion) exists today. You might consider making your own converter. The question becomes: is it worth it, and how would you do it?
It is probably not worth the effort, because Go and C are interoperable to some degree. For example, if you use GCC 4.6 (or the to-be-released 4.7, i.e. the latest snapshot), you can probably link C and Go code together, with some care.
Of course, as usual, the devil is in the details.
If you want a converter, do you want the generated Go code to be readable and editable? Then the task is more difficult, since you want to keep the structure of the code and also the comments. In that case, you probably need your own C parser (and that is a difficult task).
If you don't care about the readability of the generated Go code, you could for example extend an existing compiler to do the work. GCC is extensible through plugins or through MELT extensions, and you could customize GCC (with MELT, or your own C plugin for GCC) to transform the Gimple representation (the main internal representation for instructions inside GCC) into unreadable Go code. This is somewhat simpler (but still requires more than a week of work).
Of course, Go interfaces, channels and even memory management (garbage-collected memory) have no standard C counterpart.
Check out this project
https://github.com/elliotchance/c2go
The detailed description is in this article
Update: August 6, 2021
Also check this one
https://github.com/gotranspile/cxgo
I'm almost sure there is no such tool, but IMHO in every language it's good to write in its own "coding style".
Remember how much we all loved C preprocessor tricks and really artistic work with pointers? Remember how much care it took to deal with malloc/free or with threads?
Go is different. You have no preprocessor, but you have closures, objects with methods, interfaces, garbage collector, slices, goroutines and many other nice features.
So why convert code instead of rewriting it in a much better and cleaner way?
Of course, I hope you don't have a 1000K lines of code in C that you have to port to Go :)
Take a look at SWIG, http://www.swig.org/Doc2.0/Go.html it will translate the C/C++ headers to go and wrap them for a starting point. Then you can port parts over bit by bit.
As far as I know, such a tool does not exist (yet). So you're bound to convert your C code to Go by hand.
I don't know how complex the C code is you want to convert, but you might want to keep in mind Go has a "special" way of doing things. Like the usage of interfaces and channels.

something like an "extended" C string library?

I have used several dynamically typed languages and I have been avoiding C but enough is enough, it's the right tool for the job sometimes and I need to get over it.
The things I miss when working with C are associative arrays and large string libraries. Is there a library that gives more options than string.h? Any general advice on making the transition with strings?
Thanks for reading-Patrick
You can take a look at the Better String Library. The description from the site:
The Better String Library is an abstraction of a string data type which is superior to the C library char buffer string type, or C++'s std::string. Among the features achieved are:
* Substantial mitigation of buffer overflow/overrun problems and other failures that result from erroneous usage of the common C string library functions
* Significantly simplified string manipulation
* High performance interoperability with other source/libraries which expect '\0' terminated char buffers
* Improved overall performance of common string operations
* Functional equivalency with other more modern languages
The library is totally stand alone, portable (known to work with gcc/g++, MSVC++, Intel C++, WATCOM C/C++, Turbo C, Borland C++, IBM's native CC compiler on Windows, Linux and Mac OS X), high performance, easy to use and is not part of some other collection of data structures. Even the file I/O functions are totally abstracted (so that other stream-like mechanisms, like sockets, can be used.) Nevertheless, it is adequate as a complete replacement of the C string library for string manipulation in any C program.
POSIX gives you <string.h>, <strings.h> and <regex.h>.
If you really need more of a string library than this, C is probably not the right tool for that particular job.
As for a hash table, you can't get a type-safe hash table in C without a lot of nasty macros.
If you're OK with just storing void-pointers, or with doing some manual work for each type of map, then you shouldn't be lacking for options. Coding your own hash table is a hoot and a half - just search Stackoverflow for help with the hash function. If you don't want to roll your own, strmap [LGPL] looks decent.
GLib provides many pre-made data structures and string handling functions, but it's a set of functions and types completely separated from the "usual" ones, and it's not a very lightweight dependency.
If instead C++ is a viable alternative for your task, it bundles a string class and several generic containers ready-made into the standard library (and much other related stuff can be found in Boost).
What specifically are you looking for in your extended c-string library?
One way to get better at C, is to create your own c-string library. Then make it open source, and let others help refine it.
I don't usually advocate creating your own string libraries, but w.r.t. C, it's a great way to learn C.
Much of the power of C consists of the ability to have direct control over the memory as a sequence of bytes. It is a bit against the philosophy of the language to treat strings as something higher-level than that.
I would recommend rolling your own very basic one. It will be an enlightening experience, especially for learning pointer arithmetic and loops.
For example, learn about "Schlemiel the Painter's algorithm" regarding strcat and design your library to solve this problem.
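The problem with repeated strcat calls is that each call rescans the destination from the start to find its end, so building a long string costs quadratic time. A minimal sketch of one way around it is a buffer that remembers its own length. The strbuf type and function names here are hypothetical, made up for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string builder: it remembers the length, so each append
 * starts writing at the end instead of rescanning from the start. */
struct strbuf {
    char  *data;   /* NUL-terminated contents, or NULL while empty */
    size_t len;    /* bytes in use, excluding the terminator */
    size_t cap;    /* bytes allocated */
};

/* Appends s to sb. Returns 0 on success, -1 if memory runs out. */
static int strbuf_append(struct strbuf *sb, const char *s)
{
    size_t n = strlen(s);

    if (sb->len + n + 1 > sb->cap) {
        size_t cap = sb->cap ? sb->cap : 16;
        char *p;

        while (cap < sb->len + n + 1)
            cap *= 2;                       /* grow geometrically */
        p = realloc(sb->data, cap);
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = cap;
    }
    memcpy(sb->data + sb->len, s, n + 1);   /* copies the terminator too */
    sb->len += n;
    return 0;
}

/* Releases the buffer and resets sb to the empty state. */
static void strbuf_free(struct strbuf *sb)
{
    free(sb->data);
    sb->data = NULL;
    sb->len = sb->cap = 0;
}
```

A zero-initialised struct strbuf is a valid empty buffer, so a caller can declare one, call strbuf_append repeatedly, read the result from the data member, and finish with strbuf_free.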
I've not used it myself, but you should at least review the SEI/CERT library Specifications for Managed Strings, 2nd Edition. The code can be found at CERT.
An associative array associating string keys and struct values in C consists of:
* A hash function for strings
* An array with a prime number of elements, inside each of which is a linked-list head.
* Linked-list elements containing char * pointers to the stored keys and (optionally) a struct * pointer to the corresponding value for each key.
To store a string key in your associative array:
* Hash it modulo that prime array size.
* In that array bin, add it to the linked-list.
* Assign the value pointer to the value you are adding.
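The recipe above can be sketched roughly as follows. All names (assoc, entry, NBINS and the functions) are hypothetical, the bin count of 101 is just one arbitrary prime, and FNV-1a is one reasonable choice of string hash rather than anything this answer prescribes:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBINS 101   /* a prime number of bins */

struct entry {
    char         *key;     /* private copy of the key */
    void         *value;   /* caller-owned value */
    struct entry *next;    /* next entry in the same bin */
};

struct assoc {
    struct entry *bins[NBINS];   /* linked-list heads */
};

/* 32-bit FNV-1a string hash. */
static uint32_t hash_str(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Walks the key's bin and returns its entry, or NULL if absent. */
static struct entry *assoc_find(struct assoc *a, const char *key)
{
    struct entry *e = a->bins[hash_str(key) % NBINS];

    while (e != NULL && strcmp(e->key, key) != 0)
        e = e->next;
    return e;
}

/* Stores value under key, replacing any old value. 0 on success, -1 on OOM. */
static int assoc_put(struct assoc *a, const char *key, void *value)
{
    struct entry *e = assoc_find(a, key);

    if (e == NULL) {
        size_t n = strlen(key) + 1;
        uint32_t bin = hash_str(key) % NBINS;

        e = malloc(sizeof *e);
        if (e == NULL)
            return -1;
        e->key = malloc(n);
        if (e->key == NULL) {
            free(e);
            return -1;
        }
        memcpy(e->key, key, n);
        e->next = a->bins[bin];    /* push onto the front of the bin's list */
        a->bins[bin] = e;
    }
    e->value = value;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
static void *assoc_get(struct assoc *a, const char *key)
{
    struct entry *e = assoc_find(a, key);

    return e != NULL ? e->value : NULL;
}

/* Frees all entries and key copies; the values stay owned by the caller. */
static void assoc_free(struct assoc *a)
{
    size_t i;

    for (i = 0; i < NBINS; i++) {
        while (a->bins[i] != NULL) {
            struct entry *e = a->bins[i];

            a->bins[i] = e->next;
            free(e->key);
            free(e);
        }
    }
}
```

This sketch never resizes, so lookups slow down once the number of keys grows well past the bin count; a real implementation would rehash into a larger array.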

what the author of nedtries means by "in-place"?

I just implemented a kind of bitwise trie (based on nedtries), but my code does a lot of memory allocation (one for each node).
Contrary to my implementation, nedtries are claimed to be fast, among other things, because of their small number of memory allocations (if any).
The author claims his implementation to be "in-place", but what does that really mean in this context?
And how does nedtries achieve such a small number of dynamic memory allocations?
PS: I know that the sources are available, but the code is pretty hard to follow and I cannot figure out how it works.
I'm the author, so this is for the benefit of the many who, according to Google, are similarly having difficulties in using nedtries. I would like to thank the people here on Stack Overflow for not making unpleasant comments about me personally, which some other discussions about nedtries do.
I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:
#include <assert.h>
#include <stdio.h>
#include "nedtrie.h"
typedef struct foo_s foo_t;
struct foo_s {
    NEDTRIE_ENTRY(foo_t) link;
    size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;
static size_t fookeyfunct(const foo_t *RESTRICT r)
{
    return r->key;
}
NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));
int main(void)
{
    foo_t a, b, c, *r;
    NEDTRIE_INIT(&footree);
    a.key=2;
    NEDTRIE_INSERT(foo_tree_s, &footree, &a);
    b.key=6;
    NEDTRIE_INSERT(foo_tree_s, &footree, &b);
    r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
    assert(r==&b);
    c.key=5;
    r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
    assert(r==&b); /* NFIND finds next largest. Invert the key function to invert this */
    NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
    NEDTRIE_FOREACH(r, foo_tree_s, &footree)
    {
        printf("%p, %zu\n", (void *) r, r->key); /* key is a size_t, hence %zu */
    }
    NEDTRIE_PREV(foo_tree_s, &footree, &a);
    return 0;
}
You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it otherwise it can contain whatever you like. You also need a key generating function. Other than that, it's pretty boilerplate.
I wouldn't have chosen this system of macro based initialisation myself! But it's for compatibility with the BSD rbtree.h so nedtries is very easy to swap in to anything using BSD rbtree.h.
Regarding my usage of "in place" algorithms, well I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in place algorithm it will only touch that 64 bytes i.e. it won't make use of extra metadata, or allocate some extra memory, or indeed write to global state. A good example is an "in place" sort implementation where only the collection being sorted (and I suppose the thread stack) gets touched.
Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.
Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you state at any given time. In fact, all debugging from my end happens in a C++ build and I meticulously transcribe the C++ changes into macroised C.
Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things, and I am typically amazed at what some people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them. Equally, asking on Stack Overflow doesn't let me know immediately that there is a docs problem but rather relies on me to find it next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion, say, like here on Stack Overflow.
Niall
I took a look at the nedtrie.h source code.
It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.
You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.
So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.
At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:
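To illustrate the general idea of embedding the bookkeeping in the stored items, here is a minimal sketch using a singly linked list instead of a trie. It is not nedtries' actual layout or API; every name below is hypothetical. The point is only that the container never calls malloc, because the link storage arrives inside each item:

```c
#include <stddef.h>

/* Hypothetical intrusive list: the link lives inside the caller's struct. */
struct link {
    struct link *next;
};

struct item {
    struct link link;   /* embedded bookkeeping, playing the role of NEDTRIE_ENTRY */
    int         key;
};

struct list {
    struct link *head;
};

/* Pushes it at the front; touches only memory the caller already owns. */
static void list_push(struct list *l, struct item *it)
{
    it->link.next = l->head;
    l->head = &it->link;
}

/* Recovers the enclosing item from a pointer to its embedded link. */
static struct item *item_of(struct link *ln)
{
    return (struct item *)((char *)ln - offsetof(struct item, link));
}

/* Returns the first item with the given key, or NULL if there is none. */
static struct item *list_find(struct list *l, int key)
{
    struct link *ln;

    for (ln = l->head; ln != NULL; ln = ln->next) {
        if (item_of(ln)->key == key)
            return item_of(ln);
    }
    return NULL;
}
```

Items can live on the stack, in an array, or in memory the caller allocated for other reasons; the list only threads pointers through them.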
In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc, but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).
In-place means to operate on the input data and (possibly) update it. The implication is that there is no copying and/or moving of the input data. This may result in losing the original values of the input data, so you will need to consider whether that matters for your particular case.
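In this general sense of the term, the difference can be shown with two versions of array reversal. This is a small sketch with made-up function names, unrelated to nedtries itself:

```c
#include <stddef.h>

/* In-place: the input buffer becomes the output, so the original order is lost. */
static void reverse_in_place(int *a, size_t n)
{
    size_t i;

    for (i = 0; i < n / 2; i++) {
        int tmp = a[i];

        a[i] = a[n - 1 - i];
        a[n - 1 - i] = tmp;
    }
}

/* Not in-place: src is left untouched and the result goes to dst,
 * which must have room for n elements and must not overlap src. */
static void reverse_copy(const int *src, int *dst, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[n - 1 - i];
}
```

The in-place version needs no second buffer, at the cost of destroying its input; the copying version keeps the input but doubles the memory touched.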
