C lookup string by value

I need to translate a value into a human readable string. Normally for things I define I would use values that start at zero and create a simple array of strings with the values as the index.
static const char *foo[] = { "foo", "bar", "etc" };
if (val < 3) printf("%s\n", foo[val]);
For this circumstance I have values that do not start at zero and there are some gaps between them. Is there a good way to do this without having to manually code in a bunch of empty strings for the indexes without a matching value/string pair?
static const char *foo[] = { "", "", "", "foo", "", "bar", "", "", "etc" };

As of C99, you can use designated initialisers:
static const char *foo[] = {
[3] = "foo",
[5] = "bar",
[8] = "etc"
};
This generates an array with 9 entries, like the definition you posted, except that the unlisted entries are null pointers rather than empty strings, so check for NULL before printing an entry. There is a similar syntax for the initialisation of structs:
struct Person joe = {
.name = "Joe", .age = 24, .favcolor = "mauve"
};
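Note that array designators like these are a C feature and will not work in C++. C++20 added designated initialisers only for struct members, and requires them to appear in declaration order.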
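Because the gaps in a designated-initialiser table are null pointers, a lookup usually needs two checks: the index must be in range and the entry must be non-NULL. The following is a minimal sketch of that idea. The names `names` and `lookup_name` and the "(unknown)" fallback text are illustrative assumptions, not part of the original answer.

```c
#include <stddef.h>

/* Sparse table: entries that are not listed are initialised to NULL. */
static const char *const names[] = {
    [3] = "foo",
    [5] = "bar",
    [8] = "etc",
};

/* Returns the string for val, or "(unknown)" for gaps and
   out-of-range values. The fallback text is an arbitrary choice. */
const char *lookup_name(unsigned val)
{
    size_t count = sizeof names / sizeof names[0];
    if (val >= count || names[val] == NULL)
        return "(unknown)";
    return names[val];
}
```

A caller can then write printf("%s\n", lookup_name(val)) without worrying about passing a null pointer to %s.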

If there aren't too many gaps, you can encode each contiguous sequence as a separate array, and then do a little bounds-checking to find the appropriate one to use. Here's a quick-and-dirty example:
#include <stdio.h>
#include <stdlib.h>
static const int array_first_indices[] = {3, 15, 28, 32};
static const char * array0[] = {"foo"};
static const char * array1[] = {"bar", "baz"};
static const char * array2[] = {"bloop", "blorp", "blat"};
static const char * array3[] = {"glop", "slop", "bop"};
#define check_array(whichArray, idx) { \
unsigned int relIdx = idx - array_first_indices[whichArray]; \
if (relIdx < (sizeof(array##whichArray)/sizeof(const char *))) \
return array##whichArray[relIdx]; \
}
const char * LookupWord(int idx)
{
check_array(0, idx);
check_array(1, idx);
check_array(2, idx);
check_array(3, idx);
return NULL;
}
int main(int args, char ** argv)
{
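for (int i=0; i<50; i++) {
const char *word = LookupWord(i);
/* passing NULL to %s is undefined behaviour, so substitute a placeholder */
printf(" LookupWord(%i) = %s\n", i, word ? word : "(none)");
}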
return 0;
}
For a fully general lookup mechanism, you'd probably need to use a data structure like a hash table or a tree; C implementations of those data structures are available, although if you have the option of using C++ it would be easier to use those data structures in that language, as they are provided by the standard library.
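To give a rough idea of what the hash-table option involves, here is a minimal sketch of a fixed-size table with linear probing that maps int keys to strings. Everything in it (`TABLE_SLOTS`, `struct slot`, `table_put`, `table_get`, and the modulo hash) is an assumption made for illustration. It does not support deletion or resizing, and stored values must be non-NULL because NULL marks an empty slot.

```c
#include <stddef.h>

#define TABLE_SLOTS 16   /* must exceed the number of entries stored */

struct slot {
    int key;
    const char *value;   /* NULL marks an empty slot */
};

static struct slot table[TABLE_SLOTS];   /* zero-initialised: all empty */

/* Deliberately simple hash: the key reduced modulo the table size. */
static size_t slot_for(int key)
{
    return (unsigned)key % TABLE_SLOTS;
}

/* Inserts or overwrites; returns 0 on success, -1 if the table is full. */
int table_put(int key, const char *value)
{
    size_t start = slot_for(key);
    for (size_t n = 0; n < TABLE_SLOTS; n++) {
        struct slot *s = &table[(start + n) % TABLE_SLOTS];
        if (s->value == NULL || s->key == key) {
            s->key = key;
            s->value = value;
            return 0;
        }
    }
    return -1;
}

/* Returns the stored string, or NULL if the key is absent. */
const char *table_get(int key)
{
    size_t start = slot_for(key);
    for (size_t n = 0; n < TABLE_SLOTS; n++) {
        const struct slot *s = &table[(start + n) % TABLE_SLOTS];
        if (s->value == NULL)
            return NULL;          /* reached an empty slot: not present */
        if (s->key == key)
            return s->value;
    }
    return NULL;
}
```

Keys 3 and 19 hash to the same slot here, which is the case the probing loop exists to handle. A production table would also need a better hash function and a growth strategy.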

Create a sorted array that maps IDs to strings and use the bsearch() function to look up the string:
#include <stdio.h>
#include <stdlib.h>
struct id_msg_map {
int id;
char const* str;
};
int comp_id_string( const void* key, const void* element)
{
int key_id = ((struct id_msg_map*) key)->id;
int element_id = ((struct id_msg_map*) element)->id;
if (key_id < element_id) return -1;
if (key_id > element_id) return 1;
return 0;
}
static struct id_msg_map msg_map[] = {
{3, "message 3"} ,
{12, "message 12"},
{100, "message 100"},
{32000, "message 32000"},
};
#define ELEMENTS_OF(x) (sizeof(x) / sizeof((x)[0]))
char const* get_msg(int x)
{
struct id_msg_map key = {x};
struct id_msg_map* msg = bsearch(&key, msg_map, ELEMENTS_OF(msg_map), sizeof(msg_map[0]), comp_id_string);
if (!msg) return "invalid msg id";
return msg->str;
}
void test_msg(int x)
{
printf("The message for ID %d: \"%s\"\n", x, get_msg(x));
}
int main(void)
{
test_msg(0);
test_msg(3);
test_msg(100);
test_msg(-12);
return 0;
}

You can use designated initialisers, as described in M. Oehm's post, but that silently introduces the same gaps you were referring to earlier (with implicit 0 values). That option is most suitable when you know 0 will never be an actual selection, when the table doesn't change dynamically (particularly in size) and when the size of the table is small.
If the table is particularly large, but items are never added or removed from it you can use qsort and bsearch on a key/value-pair style structure. For example:
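#include <stdlib.h> /* qsort, bsearch */
struct foo_pair {
int key;
const char *value;
};
/* qsort and bsearch require a comparator taking const void pointers */
int foo_pair_compare(const void *x, const void *y) {
const struct foo_pair *a = x, *b = y;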
return (a->key > b->key) - (a->key < b->key);
}
int main(void) {
struct foo_pair foo[] = { { .key = 3, .value = "foo" },
{ .key = 5, .value = "bar" },
{ .key = 6, .value = "etc" } };
/* qsort needs to be done at the start of the program,
and again each time foo changes */
qsort(foo, sizeof foo / sizeof *foo, sizeof *foo, foo_pair_compare);
/* bsearch is used to retrieve an item from the sorted array */
struct foo_pair *selection = bsearch(&(struct foo_pair) { .key = 5 },
foo, sizeof foo / sizeof *foo,
sizeof *foo, foo_pair_compare);
}
When items are routinely added to or removed from the collection, it makes more sense to use a hash table or some kind of ordered map. If you would rather not write and test such a collection yourself, there are plenty of tried and tested libraries available that you could check out.

Related

How can I print the name variable of my struct here?

I have an ArrayList struct and Department struct that go as follows:
typedef struct ArrayList {
void** elements;
int size;
int length;
} ArrayList;
typedef struct Department {
char* name;
ArrayList* courses;
} Department;
To print my list, I'm using these two methods:
void* get(ArrayList* arraylist, int i) {
if (i < 0 || i >= arraylist -> size) {
return (void*) NULL;
}
return arraylist -> elements[i];
}
void printAL(ArrayList* arraylist) {
for (int i = 0; i < arraylist -> size; i++) {
printf("%s\n", (char*) get(arraylist, i));
}
}
The issue I'm facing, however, is that when I add a Department to my ArrayList, the line 'return arraylist -> elements[i];' returns the address of that struct. I'm trying to get it to print the name of the struct using 'return arraylist -> elements[i] -> name' but I keep getting a warning that I'm dereferencing a void* pointer, followed by an error that says 'request for member ‘name’ in something not a structure'. This obviously means that 'arraylist -> elements[i]' isn't a struct but rather an address. How can I reference the name of the struct at that address then? I'm quite confused because of the double pointer in the ArrayList struct.
TIA!
You need different printing functions for each different type of data element that could be in the ArrayList. You need one function to print departments; you need a different function to print courses. You pass the function pointer to the printing function — printAL() — along with a pointer to other data (which in this case is probably just a FILE *, but could be a more general structure).
This is analogous to the qsort() function in standard C. It can sort any data type; you just need to pass it a different comparator function for different data types.
Like this:
#include <stdio.h>
#include <stdlib.h>
typedef struct ArrayList
{
void **elements;
int size; /* Allocated size */
int length; /* Space in use */
} ArrayList;
typedef struct Department
{
char *name;
ArrayList *courses;
} Department;
static void *get(ArrayList *arraylist, int i)
{
if (i < 0 || i >= arraylist->length) /* only slots in use hold valid pointers */
return NULL;
return arraylist->elements[i];
}
static void printAL(ArrayList *arraylist, void (*function)(const void *data, void *thunk), void *thunk)
{
for (int i = 0; i < arraylist->length; i++)
{
(*function)(get(arraylist, i), thunk);
}
}
static void put(ArrayList *al, void *data)
{
if (al->length >= al->size)
{
size_t new_size = (al->size + 2) * 2;
void *new_data = realloc(al->elements, new_size * sizeof(void *));
if (new_data == 0)
{
fprintf(stderr, "Failed to allocate %zu bytes memory\n", new_size * sizeof(void *));
exit(1);
}
al->elements = new_data;
al->size = new_size;
}
al->elements[al->length++] = data;
}
/*
typedef struct Course
{
const char *name;
const char *code;
// ...
} Course;
static void print_courseinfo(const void *data, void *thunk)
{
FILE *fp = thunk;
const Course *cp = data;
fprintf(fp, " - %s (%s)\n", cp->name, cp->code);
}
*/
static void print_deptname(const void *data, void *thunk)
{
FILE *fp = thunk;
const Department *dp = data;
fprintf(fp, "Name: %s\n", dp->name);
/*
if (dp->courses != 0)
printAL(dp->courses, print_courseinfo, thunk);
*/
}
int main(void)
{
ArrayList al = { 0, 0, 0 };
Department dl[] =
{
{ "Engineering", 0 },
{ "Physics", 0 },
{ "Mathematics", 0 },
{ "Chemistry", 0 },
{ "Biology", 0 },
{ "English", 0 },
{ "Computational Astronomy and Universe-Scale Data Modelling", 0 },
{ "Economics", 0 },
};
enum { DL_SIZE = sizeof(dl) / sizeof(dl[0]) };
for (size_t i = 0; i < DL_SIZE; i++)
put(&al, &dl[i]);
printAL(&al, print_deptname, stdout);
return 0;
}
Sample output:
Name: Engineering
Name: Physics
Name: Mathematics
Name: Chemistry
Name: Biology
Name: English
Name: Computational Astronomy and Universe-Scale Data Modelling
Name: Economics
You didn't document what the length and size members of the ArrayList represent. I've annotated what I've assumed, but I had to change the printAL() function to iterate over length elements instead of size elements, so I may have inverted the meaning you intended. It's easy to reverse them. I tend to use names like max_elements and num_elements for the job; it is more obvious what they're for, perhaps, since length and size are ambiguous or even equivalent in many contexts.
There's skeletal code in there to show how to handle the ArrayList of courses offered by each department. I couldn't be bothered to write code to initialize a separate ArrayList for each department, though it wouldn't be particularly hard to do.
I still prefer the pre-standard (*funcptr)(arg1, arg2) notation for invoking a function designated by a function pointer; it was necessary when I learned C, and I still find it clearer than the alternative. You're excused if you prefer funcptr(arg1, arg2) instead, though that can leave me wondering where funcptr is defined.
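The two call notations are interchangeable in standard C. As a small illustration, the sketch below calls the same function through a pointer both ways. The helper names (`call_explicit`, `call_implicit`, `twice`) are made up for this example.

```c
/* Both helpers call through the same pointer; only the call syntax differs. */
int call_explicit(int (*fn)(int), int x)
{
    return (*fn)(x);   /* explicit dereference, the older style */
}

int call_implicit(int (*fn)(int), int x)
{
    return fn(x);      /* the pointer is used directly */
}

/* A trivial function to pass as the callback. */
int twice(int x)
{
    return 2 * x;
}
```

Both call_explicit(twice, 21) and call_implicit(twice, 21) evaluate to 42, so the choice between the two forms is purely stylistic.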
You can also find some code closely related to what you're doing in my SOQ (Stack Overflow Questions) repository on GitHub as files aomcopy.c, aomcopy.h, aommngd.c, aommngd.h, aomptr.c, aomptr.h, aoscopy.c, aoscopy.h, aosptr.c and aosptr.h in the src/libsoq sub-directory.
aomcopy.c, aomcopy.h: array of memory blocks, copied.
aommngd.c, aommngd.h: array of memory blocks, managed.
aomptr.c, aomptr.h: array of memory blocks, 'raw'.
aoscopy.c, aoscopy.h: array of strings, copied.
aosptr.c, aosptr.h: array of strings, 'raw'.
The 'raw' versions simply take the pointer passed and store it. The onus is on the user to ensure the data pointed at remains valid while the array lasts. The 'copied' versions allocate a simple copy of the data passed to them; it doesn't matter if the data passed is reused to store the next value. The 'managed' version calls user-defined functions to create copies of the data structures. This would be necessary if you have a complex structure (like a department) where you need a 'deep copy' of the data.

Key Value Pair in C Language

I am new to C programming and I am trying to create a key/value structure like the ones in Perl. I saw one solution like this:
struct key_value
{
int key;
char* value;
};
struct key_value kv;
kv.key = 1;
kv.value = "foo";
But I don't know how to access the values in this structure. Can someone enlighten me on this?
Here is an example:
#include <stdio.h>
#include <stdlib.h>
struct key_value
{
int key;
char* value;
};
int main(void)
{
int number_of_keys = 2;
struct key_value *kv = malloc(sizeof(struct key_value) * number_of_keys);
if (kv == NULL) {
perror("Malloc");
exit(EXIT_FAILURE);
}
kv[0].key = 8;
kv[0].value = "Test 8 key!";
kv[1].key = 6;
kv[1].value = "Test 6 key!";
printf("Key = %d\nKey value = %s\n", kv[0].key, kv[0].value);
printf("Key = %d\nKey value = %s\n", kv[1].key, kv[1].value);
free(kv);
return 0;
}
What you are missing is a collection. Most languages have a data type called a dictionary or a map or an associative array or some variation thereof. C does not have a data structure of this type; in fact, the only collection type you have built in to C is the array. So, if you want something where you can supply a key and get the value, you have to roll your own or find one on the Internet. The latter is probably preferable because you are likely to make mistakes and produce a slow data structure if you roll your own (especially if you are a beginner).
To give you a flavour of what you'll end up with, here's a simple example:
You'll need something to represent the collection; call it a ListMap for now:
struct ListMap;
The above is called an incomplete type. For now, we are not concerned with what's in it. You can't do anything with it except pass pointers to instances around.
You need a function to insert items into your collection. Its prototype would look something like this:
bool listMapInsert(struct ListMap* collection, int key, const char* value);
// Returns true if insert is successful, false if the map is full in some way.
And you need a function to retrieve the value for any one key.
const char* listMapValueForKey(struct ListMap* collection, int key);
You also need a function to initialise the collection:
struct ListMap* newListMap();
and to throw it away:
void freeListMap(struct ListMap* listMap);
The hard bit is implementing how those functions do what they do. Anyway, here's how you would use them:
struct ListMap* myMap = newListMap();
listMapInsert(myMap, 1, "foo");
listMapInsert(myMap, 1729, "taxi");
listMapInsert(myMap, 28, "perfect");
const char* value = listMapValueForKey(myMap, 28); // perfect
freeListMap(myMap);
Here's a simple implementation. This is just for illustration: I haven't tested it, and the time taken to search for an entry increases linearly with the number of entries (you can do much better than that with hash tables and other structures).
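#include <stdbool.h>
#include <stdlib.h>
#include <string.h> // strdup() is POSIX; it only became standard C in C23
enum
{
listMapCapacity = 20
};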
struct ListMap
{
struct key_value kvPairs[listMapCapacity];
size_t count;
};
struct ListMap* newListMap()
{
struct ListMap* ret = calloc(1, sizeof *ret);
if (ret != NULL)
{
ret->count = 0; // not strictly necessary because of calloc
}
return ret;
}
bool listMapInsert(struct ListMap* collection, int key, const char* value)
{
if (collection->count == listMapCapacity)
{
return false;
}
collection->kvPairs[collection->count].key = key;
collection->kvPairs[collection->count].value = strdup(value);
collection->count++;
return true;
}
const char* listMapValueForKey(struct ListMap* collection, int key)
{
const char* ret = NULL;
for (size_t i = 0 ; i < collection->count && ret == NULL ; ++i)
{
if (collection->kvPairs[i].key == key)
{
ret = collection->kvPairs[i].value;
}
}
return ret;
}
void freeListMap(struct ListMap* listMap)
{
if (listMap == NULL)
{
return;
}
for (size_t i = 0 ; i < listMap->count ; ++i)
{
free(listMap->kvPairs[i].value);
}
free(listMap);
}
typedef struct key_value
{
int key;
char* value;
}List;
struct key_value k1;
struct key_value k2;
struct key_value k3;
k1.key = 1;
k1.value = "foo";
k2.key = 2;
k2.value = "sec";
k3.key = 3;
k3.value = "third";
You will need to create N instances of the struct and give each one its values the way you did for the first one. Alternatively, create an array of N structs and assign the values in a loop.
Array:
List arr[29];
int i;
for(i = 0;i<=28;i++){
arr[i].key = i;
arr[i].value = "W/e it needs to be";
}
The functionality you are looking for needs your own implementation in C; e.g. an array of your struct-type.
Here is an example of how to read the value for a key, without knowing anything about at which array-index the key will be found.
I have the keys numbered backward in order to illustrate that.
Note that more sophisticated API definitions are needed for special cases such as non-existing key; I just blindly return the last entry to keep things easy here.
#include <stdio.h>
#define MAPSIZE 30
struct key_value
{
int key;
char* value;
};
struct key_value kvmap[MAPSIZE];
void initmap(void)
{
int i;
for(i=0; i<MAPSIZE; i++)
{
kvmap[i].key=MAPSIZE-i-1;
kvmap[i].value="unset";
}
kvmap[0].value="zero";
kvmap[1].value="one";
kvmap[2].value="two";
kvmap[3].value="three";
kvmap[4].value="four";
kvmap[5].value="five";
kvmap[6].value="six";
kvmap[7].value="seven";
kvmap[8].value="eight";
kvmap[24].value="find this"; // it has the key "5"
}
char* readmap(int key)
{
int i=0;
while ((i<MAPSIZE-1) && (kvmap[i].key!=key))
{ printf("Not in %d\n", i);
++i;}
// will return last entry if key is not present
return kvmap[i].value;
}
int main(void)
{
initmap();
printf("%s\n", readmap(5));
return 0;
}
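As noted above, returning the last entry for a missing key is a shortcut. One common alternative is to return NULL when the key is not found, so the caller can tell the two cases apart. The sketch below assumes the same struct key_value layout; the function name `find_value` and its parameters are hypothetical.

```c
#include <stddef.h>

struct key_value {
    int key;
    char *value;
};

/* Linear search over n entries; returns NULL when the key is absent. */
const char *find_value(const struct key_value *map, size_t n, int key)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].key == key)
            return map[i].value;
    }
    return NULL;
}
```

The caller then has to check the result before using it, for example by printing a placeholder when find_value() returns NULL.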
"I have to store 30 key/value pair"
Create an array of structs, e.g. of struct key_value:
struct key_value
{
int key;
char* value;
};
struct key_value kv[30];
kv[0].key = 1;
kv[0].value = "foo";
printf("%s", kv[0].value);
You can loop through to assign values to keys and values.
Access to whatever is in kv is simple.
int i = kv[0].key; // copy value of kv[0].key to i
char *v = kv[0].value; // copy value of kv[0].value to v
Your code already has the means to access the values.
kv.key = 1;
kv.value = "foo";
Reading the assigned values back is just as simple:
kv.key
kv.value
It is a plain struct; if you want something like a Python dict, you will need to implement a hash table, which is more complicated.

Translation array

I need to make a simple translator. For example:
input: "foo" output: "bar"
input: "the" output: "teh"
input: "what" output: "wut"
I know I can write it like this:
if (!strcmp(input, "foo"))
puts("bar");
else if (!strcmp(input, "the"))
puts("teh");
else if (!strcmp(input, "what"))
puts("wut");
But that's big and messy. Is there a shortcut to do this? I know that in PHP (sorry for the inevitable syntax errors, I'm not proficient) there's something like this:
value = array(
"foo" => "bar",
"the" => "teh",
"what" => "wut"
);
How can I shorten the original code using something like a PHP array?
You can define a struct, which contains the word and the translation:
typedef struct {
const char *word;
const char *translation;
} translate_t;
Then you can just create an array of structs like this:
const translate_t translate[] = {{"foo", "bar"},
{"the", "teh"},
{"what", "wut"}};
If you want to print out the words and translations, then you can just do this:
size_t size = sizeof translate/sizeof *translate;
for (size_t i = 0; i < size; i++) {
printf("Word: %s Translation: %s\n", translate[i].word, translate[i].translation);
}
Which will output:
Word: foo Translation: bar
Word: the Translation: teh
Word: what Translation: wut
This is a good approach for associating a word with a translation.
UPDATE: @Olaf suggested using a macro for the size, which is a better way to compute the number of elements in an array. The loop above can then be written as:
#define ARRAY_SIZE(x) (sizeof (x) / sizeof *(x)) /* near top, or before main() is a good place for this */
for (size_t i = 0; i < ARRAY_SIZE(translate); i++) {
printf("Word: %s Translation: %s\n", translate[i].word, translate[i].translation);
}
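The same array of structs can also drive the lookup itself, which is what the question asks for. The following is a minimal sketch. The function name `lookup_translation` is an assumption, and returning the input unchanged when no entry matches is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *word;
    const char *translation;
} translate_t;

static const translate_t translate[] = {
    {"foo", "bar"},
    {"the", "teh"},
    {"what", "wut"},
};

/* Returns the translation, or the input itself when no entry matches. */
const char *lookup_translation(const char *input)
{
    size_t count = sizeof translate / sizeof translate[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(translate[i].word, input) == 0)
            return translate[i].translation;
    }
    return input;
}
```

With this in place the if/else chain collapses to puts(lookup_translation(input)). The search is linear, which is fine for a handful of entries; a sorted table with bsearch() would scale better.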
Found it out:
const char *Translate[] = {
"foo", "bar",
"the", "teh",
"what", "wut",
NULL /* sentinel: the loop in t_idx() stops here */
};
int t_idx(char *s)
{
int i;
for (i = 0; Translate[i]; i += 2)
if (!strcmp(Translate[i], s))
return i+1;
return -1;
}
const char *translate(char *s)
{
int idx = t_idx(s);
return (idx == -1) ? s : Translate[idx];
}
Return values of translate:
translate("what") = "wut"
translate("some") = "some"
translate("foo") = "bar"
There are no map or associative array types in standard C; you must implement it yourself. A simple idea would be to use a struct:
#include <string.h>
#include <stdio.h>
struct map {
struct map_elem {
char *key;
char *value;
} * elem;
size_t size;
};
int main(void) {
struct map_elem map_elem[] = {
{"foo", "bar"}, {"the", "teh"}, {"what", "wut"}};
struct map const map = {map_elem, sizeof map_elem / sizeof *map_elem};
char input[] = "foo";
for (size_t i = 0; i < map.size; i++) {
struct map_elem *elem = map.elem + i;
if (strcmp(input, elem->key) == 0) {
puts(elem->value);
break;
}
}
}
Of course, it's just a little example.

Custom sort by value using tokyo cabinet in C

I'm implementing a btree using tokyo cabinet, but I'd like to know if it's possible to keep the values sorted. I know I can use a tcbdbsetcmpfunc to set the custom comparison function for the keys, but not sure about the values?
I ask this because most of the time I only need the first 1000 records assuming my values are sorted. Otherwise I will have to loop over millions of records sort them and get the first 1000, which can be slow.
For instance:
#include <tcutil.h>
#include <tcbdb.h>
#include <stdbool.h>
#include <stdint.h>
struct foo {
int one;
double two;
char *three;
};
// sort by three field
static int val_cmp(const char *aptr, int asiz, const char *bptr, int bsiz, void *op) {
return 1;
}
int main() {
int ecode;
TCBDB *db;
db = tcbdbnew();
struct foo *f;
tcbdbsetcmpfunc(db, val_cmp, f); // sort by struct->three?
// open the database
if(!tcbdbopen(db, "struct.tcb", BDBOWRITER | BDBOCREAT)){
ecode = tcbdbecode(db);
fprintf(stderr, "open error: %s\n", tcbdberrmsg(ecode));
}
f = malloc(sizeof(struct foo));
f->one = 100;
f->two = 1.1111;
f->three = "Hello World";
printf("put: %d\n", tcbdbput(db, "foo", 3, f, sizeof(struct foo)));
f = malloc(sizeof(struct foo));
f->one = 100;
f->two = 1.1111;
f->three = "Hello Planet";
printf("put: %d\n", tcbdbput(db, "bar", 3, f, sizeof(struct foo)));
char *key;
BDBCUR *cursor;
cursor = tcbdbcurnew(db);
tcbdbcurfirst(cursor);
while ((key = tcbdbcurkey2(cursor)) != NULL) {
struct foo *val;
int size;
val = tcbdbcurval(cursor, &size);
printf("%s: one=%d\n", key, val->one);
printf("%s: two=%f\n", key, val->two);
tcbdbcurnext(cursor);
}
tcbdbdel(db);
return 0;
}
I don't think you can define an order for values. I recommend keeping a second Tokyo Cabinet database that maps values to keys. Even with this one level of indirection, I expect you will still get good performance.
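The idea of a second mapping from values to keys can be illustrated without any database. The sketch below is plain C and deliberately does not use the Tokyo Cabinet API: it keeps an in-memory index of (value, key) pairs and sorts it by value with qsort(). The names `index_entry`, `by_value` and `sort_index` are hypothetical. In a real B+ tree database the second index would stay sorted as records are inserted, so no explicit sort step would be needed.

```c
#include <stdlib.h>
#include <string.h>

/* One entry of the secondary index: the sortable value plus the primary key. */
struct index_entry {
    const char *value;   /* e.g. the "three" field of struct foo */
    const char *key;     /* key of the record in the primary store */
};

/* Orders index entries by their value string. */
static int by_value(const void *a, const void *b)
{
    const struct index_entry *x = a, *y = b;
    return strcmp(x->value, y->value);
}

/* Sorts the index by value; the first n entries are then the n smallest. */
void sort_index(struct index_entry *index, size_t count)
{
    qsort(index, count, sizeof index[0], by_value);
}
```

After sorting, reading the first 1000 entries of the index gives the keys of the first 1000 records in value order, and each key can then be looked up in the primary store.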
Tokyo Cabinet has no built-in mechanism for sorting by value. You could instead keep the records in a list and apply your own custom ordering to it.

Assign the static array to another array

#include<stdio.h>
struct test_ {
char *device_name;
char *path_name;
};
typedef struct test_ test_t;
struct capabilities_ {
test_t tab[3];
int enable;
};
static test_t table[3] = {
{ "first", "john"},
{ "second", "mike"},
{ "third:", "vik" },
};
int main()
{
struct capabilities_ cap;
//cap.tab = table; ???
return 0;
}
I have a static array of values, table, which I want to assign or copy to cap.tab, a member of the same type and size inside the structure. Could you please show me how to do that?
To do it at runtime, you can use user9000's approach, or something like this:
for (int i = 0; i < 3; i++)
cap.tab[i] = table[i];
Or, convert your tab to use a pointer to test_t instead of array of test_t.
struct capabilities_ {
test_t *tab;
int enable;
};
int main()
{
struct capabilities_ cap;
cap.tab = table;
printf("%s\n", cap.tab[1].device_name);
return 0;
}
Or, if you are trying to do it at initialization, use one of the following:
struct capabilities_ cap = {
{
{ "first", "john" },
{ "second", "mike" },
{ "third:", "vik" },
},
1
};
Or this,
struct capabilities_ cap = {
{
table[0],
table[1],
table[2],
},
1
};
If you want to copy the strings, and not just the pointers to the strings, you'll need to allocate memory for each string in the target capabilities struct. Here is one way to do that (it needs <stdlib.h> and <string.h>):
for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
{
size_t target_device_length = strlen(table[i].device_name) + 1; // + 1 for null terminator
size_t target_path_length = strlen(table[i].path_name) + 1; // + 1 for null terminator
cap.tab[i].device_name = malloc(target_device_length);
cap.tab[i].path_name = malloc(target_path_length);
if (cap.tab[i].device_name == NULL || cap.tab[i].path_name == NULL)
{
exit(EXIT_FAILURE); // allocation failed
}
// memcpy is portable; strncpy_s is only available with Annex K or the MSVC runtime
memcpy(cap.tab[i].device_name, table[i].device_name, target_device_length);
memcpy(cap.tab[i].path_name, table[i].path_name, target_path_length);
}
If you don't care to make a deep copy, you can use the shallow copy mechanism shown by user9000 to just copy the pointers to the strings.
Also, if you use the mechanism above, don't forget to free if your capabilities is going to go out of scope and no longer be used :)
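Since a deep copy has to be released again, it can help to pair the copy with a matching cleanup routine. The following sketch shows one way to do that; the function names `copy_string`, `copy_table` and `free_table` are assumptions made for this example, and the test_t layout mirrors the one in the question.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *device_name;
    char *path_name;
} test_t;

/* Returns a heap copy of s, or NULL if allocation fails. */
static char *copy_string(const char *s)
{
    size_t size = strlen(s) + 1;      /* include the null terminator */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}

/* Releases the strings owned by dst[0..count-1] and clears the pointers. */
void free_table(test_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(dst[i].device_name);
        free(dst[i].path_name);
        dst[i].device_name = NULL;
        dst[i].path_name = NULL;
    }
}

/* Deep-copies count entries; returns 0 on success, -1 on allocation failure. */
int copy_table(test_t *dst, const test_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i].device_name = copy_string(src[i].device_name);
        dst[i].path_name = copy_string(src[i].path_name);
        if (dst[i].device_name == NULL || dst[i].path_name == NULL) {
            free_table(dst, i + 1);   /* undo the partial copy */
            return -1;
        }
    }
    return 0;
}
```

For the struct in the question, the calls would be copy_table(cap.tab, table, 3) after declaring cap, and free_table(cap.tab, 3) before cap goes out of scope.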
You can do it like so:
memcpy(cap.tab, table, sizeof (test_t) * (sizeof(table) / sizeof(test_t)));
This is just the same mechanism used in copying a string to another. Since you have the table size known, you can just do:
memcpy(cap.tab, table, 3 * sizeof(test_t));
The equivalent method of copying characters is like:
memcpy(str, str1, sizeof(char) * 4); // copy 4 chars of str1 into str
