Optimizing a search algorithm in C

Can the performance of this sequential search algorithm (taken from
The Practice of Programming) be improved using any of C's native utilities, e.g. by making the i variable a register variable?
int lookup(char *word, char *array[])
{
    int i;
    for (i = 0; array[i] != NULL; i++)
        if (strcmp(word, array[i]) == 0)
            return i;
    return -1;
}

Yes, but only very slightly. A much bigger performance improvement can be achieved by using better algorithms (for example keeping the list sorted and doing a binary search).
In general optimizing a given algorithm only gets you so far. Choosing a better algorithm (even if it's not completely optimized) can give you a considerable (order of magnitude) performance improvement.

I think it will not make much of a difference. The compiler will already optimize it in that direction.
Besides, the variable i does not have much impact: word stays constant throughout the function, and the rest is too large to fit in any register. It is only a matter of how large the cache is and whether the whole array might fit in it.
String comparisons are rather expensive computationally.
Can you perhaps use some kind of hashing for the array before searching?

There is a well-known technique called the sentinel method.
To use the sentinel method, you must know the length of array[].
The sentinel lets you remove the array[i] != NULL comparison.
int lookup(char *word, char *array[], int array_len)
{
    int i = 0;
    array[array_len] = word;
    for (;; ++i)
        if (strcmp(word, array[i]) == 0)
            break;
    array[array_len] = NULL;
    return (i != array_len) ? i : -1;
}

If you're reading TPOP, you will next see how they make this search many times faster with different data structures and algorithms.
But you can make things a bit faster by replacing things like
for (i = 0; i < n; ++i)
    foo(a[i]);
with
char **p = a;
for (i = 0; i < n; ++i)
    foo(*p);
    ++p;
If there is a known value at the end of the array (e.g. NULL) you can eliminate the loop counter:
for (p = a; *p != NULL; ++p)
    foo(*p);
Good luck, that's a great book!

To optimize that code the best bet would be to rewrite the strcmp routine since you are only checking for equality and don't need to evaluate the entire word.
Other than that you can't do much else. You can't sort as it appears you are looking for text within a larger text. Binary search won't work either since the text is unlikely to be sorted.
My 2p (C pseudocode):
wrd_end = wrd_ptr + wrd_len;
arr_end = arr_ptr + arr_len - wrd_len;
while (arr_ptr < arr_end)
{
    wrd_beg = wrd_ptr; arr_beg = arr_ptr;
    while (*wrd_ptr == *arr_ptr)
    {
        wrd_ptr++; arr_ptr++;
        if (wrd_ptr == wrd_end)
            return arr_beg;
    }
    wrd_ptr = wrd_beg;
    arr_ptr = arr_beg + 1;
}

Mark Harrison: Your for loop will never terminate! (++p is indented, but is not actually within the for :-)
Also, switching between pointers and indexing will generally have no effect on performance, nor will adding register keywords (as mat already mentions) -- the compiler is smart enough to apply these transformations where appropriate, and if you tell it enough about your CPU architecture, it will do a better job of these than manual pseudo-micro-optimizations.

A faster way to match strings would be to store them Pascal style. If you don't need more than 255 characters per string, store them roughly like this, with the count in the first byte:
char s[] = "\x05Hello";
Then you can do:
for (i = 0; i < len; ++i) {
    s_len = strings[i][0];
    if (s_len == match_len
        && strings[i][s_len] == match[s_len - 1]
        && 0 == memcmp(strings[i] + 1, match, s_len - 1))
    {
        return 1;
    }
}
And to get really fast, add memory prefetch hints for string start + 64, + 128 and the start of the next string. But that's just crazy. :-)

Another fast way to do it is to get your compiler to use a SSE2 optimized memcmp. Use fixed-length char arrays and align so the string starts on a 64-byte alignment. Then I believe you can get the good memcmp functions if you pass const char match[64] instead of const char *match into the function, or strncpy match into a 64,128,256,whatever byte array.
Thinking a bit more about this, these SSE2 match functions might be part of packages like Intel's and AMD's accelerator libraries. Check them out.

Realistically, setting i to be a register variable won't do anything that the compiler wouldn't do already.
If you are willing to spend some time upfront preprocessing the reference array, you should google "The World's Fastest Scrabble Program" and implement that. Spoiler: it's a DAG optimized for character lookups.

/* It doesn't get any quicker than this. */
int lookup(char *word, char *array[])
{
    int i;
    for (i = 0; *array != NULL; i++, array++)
        if (strcmp(word, *array) == 0)
            return i;
    return -1;
}

Related

C for loop optimisation by embedding statements into loop-head itself

Just wondering if these variations of for loops are more efficient and practical.
By messing with the C for loop syntax I can embed statements that would go in the loop body into the loop head, like so:
Example 1:
#include <stdio.h>
int main(int argc, char ** argv)
{
    // Simple program that prints out the command line arguments passed in
    if (argc > 1)
    {
        for (int i = 1; puts(argv[i++]), i < argc;);
        // This does the same as this:
        // for (int i = 1; i < argc; i++)
        // {
        //     puts(argv[i]);
        // }
    }
    return 0;
}
I understand how the commas work in the for loop: it goes through each statement in order, evaluates them, then disregards all but the last one, which is why it is able to iterate using the "i < argc" condition. There is no need for the final segment to increment the i variable, as I did that in the middle segment of the loop head (in the puts(argv[i++]) bit).
Is this more efficient, or is it just cleaner to separate it into the loop body rather than combine it all into one line?
Example 2:
int stringLength(const char * string)
{
    // Function that counts characters up until null terminator character and returns the total
    int counter = 0;
    for (counter; string[counter] != '\0'; counter++);
    return counter;
    // Same as:
    // int counter = 0;
    // for (int i = 0; string[i] != '\0'; i++)
    // {
    //     counter++;
    // }
    // return counter;
}
This one seems more efficient than the version with the loop body as no local variable for the for-loop is initialised. Is it conventional to do these sorts of loops with no bodies?
Step 1: Correctness
Make sure code is correct.
Consider OP's code below. Does it attempt to print argv[argc] which would be bad?
if (argc > 1) {
    for (int i = 1; puts(argv[i++]), i < argc;);
I initially thought it did. So did another user. Yet it is OK.
… and this is exactly why the code is weak.
Code should not only be correct; better code looks correct too. Using an anti-pattern as OP suggests is rarely1 a good thing.
Step 2: Since the code variations have the same big O, focus on understandability.
Sculpt your code – remove what is not needed.
for (int i = 1; i < argc; i++) {
    puts(argv[i]);
}
What OP is doing is a trivial optimization concern.
Is premature optimization really the root of all evil?
Is it conventional to do these sorts of loops with no bodies?
Not really.
The key to coding style is to follow your group's style guide. Great software is often a team effort. If your group likes to minimize bodies, go ahead. I have seen the opposite more commonly: explicit { some_code } bodies.
Note: int stringLength(const char * string) fails for strings longer than INT_MAX. Better to use size_t as the return type – thus an example of step 1 faltering.
1 All coding style rules, except this rule, have exceptions.

C array sorting ignoring special characters

char temp[size];
int b, z;
for (b = 0; b < size; b++) {
    for (z = 0; z < size; z++) {
        if (strcmp(processNames[b], processNames[z]) < 0) {
            strcpy(temp, processNames[b]);
            strcpy(processNames[b], processNames[z]);
            strcpy(processNames[z], temp);
        }
    }
}
I'm sorting a list of char ** processNames;
I want it to sort like this:
abc
bee
george
(sally)
saw
thomas
zebra
However, it is sorting it like this:
(sally)
abc
bee
george
saw
thomas
zebra
Thanks, I'm not sure how to ignore the special characters and sort only on the alphabetic ones. Thanks!
You can pre-process the string and use strcmp to compare the processed string:
// Inside the two-layer for loop
char newb[size], newz[size];
int ib, iz, tb = 0, tz = 0;
for (ib = 0; processNames[b][ib] != '\0'; ib++) {
    if (isalpha(processNames[b][ib])) {
        newb[tb++] = processNames[b][ib];
    }
}
newb[tb] = 0;
for (iz = 0; processNames[z][iz] != '\0'; iz++) {
    if (isalpha(processNames[z][iz])) {
        newz[tz++] = processNames[z][iz];
    }
}
newz[tz] = 0;
if (strcmp(newb, newz) < 0) {
    // swap the ORIGINAL strings here
}
The above code is what I came up with at first. It is very inefficient and is not recommended. Alternatively, you can write your own mystrcmp() implementation:
int mystrcmp(const char *a, const char *b) {
    for (;;) {
        while (*a && !isalpha((unsigned char)*a)) a++;
        while (*b && !isalpha((unsigned char)*b)) b++;
        if (*a == '\0' || *b == '\0' || *a != *b)
            return (unsigned char)*a - (unsigned char)*b;
        a++, b++;
    }
}
“Sorting” means “putting things in order.” What order? The order is defined by a rule that tells us which of two items goes first.
In your code, you are using strcmp to decide which item goes first. That is the thing that decides the order. Since strcmp is giving an order you do not want, you need another function. In this case, you have to write your own function.
Your function should take two strings (via pointers to char), examine the strings, and return a value to indicate whether the first string should be before or after the second string (or whether they are equal).
Since this is likely a class assignment, I will leave it to you to ponder the necessary comparison function.
Alternative
There is an alternative method which is likely to be used in professionally deployed code, in suitable situations. I recommend the above because it is suitable for a class assignment—it addresses the key principle this assignment seems to target.
The alternative is to preprocess all the list items before doing the sort. Since you want to sort on the non-special characters of the names, you would augment the list by creating copies of the names with the special characters removed. These new versions would be your “sort keys”—they would be the values you use to decide order instead of the original names. You could compare them with strcmp.
This method requires allocating new memory for the new versions of the names, managing both the keys and the names while you sort them, and releasing the memory after the sort. It requires some overhead before you start the sort. However, if there are a very large number of things to sort with a considerable number of special characters, then doing the extra work up front can result in better performance overall.
(Again, I mention this only for completeness. It is likely not useful in a class assignment of this sort, just something computer science students should learn over time.)
Bonus Notes
You say you are sorting an array of char **ProcessNames. In this case, it is probably not necessary to move the strings themselves with strcpy. Instead, you can simply move the pointers to the strings. E.g., if you want to swap ProcessNames[4] and ProcessNames[7], just make a copy of the pointer that is ProcessNames[4], set ProcessNames[4] to be the pointer that is ProcessNames[7], and set ProcessNames[7] to be the temporary copy you made. This is generally faster than moving strings.
As others note, starting your z loop with z = 0 is probably not a good idea. You likely want z = b+1.
Your code uses size for the size of the string buffer (char temp[size]) and for the size of the ProcessNames array (for (b = 0; b < size; b++)). It is unlikely the number of strings to be sorted is the same as the maximum length of the strings. You should be sure to use the correct size in each instance.

How to optimize my hashtable to reduce real world running time?

Below is a piece of my program which loads a file(dictionary) into memory using hashtables. The dictionary contains only 1 word per line. But the process is taking too much time. How do I optimize it ??
bool load(const char* dictionary)
{
    // TODO
    int k;
    FILE* fp = fopen(dictionary, "r");
    if (fp == NULL)
        return false;
    for (int i = 0; i < 26; i++)
    {
        hashtable[i] = NULL;
    }
    while (true)
    {
        if (feof(fp))
            return true;
        node* n = malloc(sizeof(node));
        n->pointer = NULL;
        fscanf(fp, "%s", n->word);
        if (isalpha(n->word[0]))
        {
            k = hashfunction(n->word);
        }
        else return true;
        if (hashtable[k] == NULL)
        {
            hashtable[k] = n;
            total_words++;
        }
        else
        {
            node* traverse = hashtable[k];
            while (true)
            {
                if (traverse->pointer == NULL)
                {
                    traverse->pointer = n;
                    total_words++;
                    break;
                }
                traverse = traverse->pointer;
            }
        }
    }
    return false;
}
Get rid of potential functional problems, then worry about performance.
A) for(int i=0; i<26; i++) may be wrong, hashtable[] definition not posted. It is certainly unwise for performance to use such a small fixed table.
B) "%s" is as safe as gets() - both are bad. Instead of fscanf(fp,"%s",n->word);, use fgets().
C) Instead of if(feof(fp)), check return value from fscanf()/fgets().
D) isalpha(n->word[0]) --> isalpha((unsigned char) n->word[0]) to cope with negative char values.
E) Check for memory allocation failure.
F) Other issues may also exist depending on unposted code.
Then form a simple test case and with minimal code that works, consider posting on codereview.stackexchange.com to solicit performance improvements.
You are making the assumption that all the words in the file are distinct. That is a reasonable assumption for a dictionary but it is bad defensive programming. You should always assume that input is out to get you, which means you cannot really assume anything about it.
In this case, though, you could argue that repeated words in the hashtable do not prevent it from working; they just slow it down slightly. Since the erroneous input won't cause bugs, undefined behaviour, or other catastrophes, it is marginally acceptable to document the requirement that reference words be unique.
Anyway, if you are not actually checking for duplicates, there is no need to walk the entire hash bucket for every insertion. If you insert new entries at the beginning of the bucket rather than at the end, you can avoid the scan which will probably yield a noticeable speedup if the buckets are large.
Of course, that optimization can only be used when loading the dictionary. It won't help you use the hashtable once initialization is complete, and it is rarely worthwhile hyper-optimizing startup code.

Convert read-only character array to float without null termination in C

I'm looking for a C function like the following that parses a length-terminated char array that expresses a floating point value and returns that value as a float.
float convert_carray_to_float(char const *inchars, int incharslen) {
    ...
}
Constraints:
The character at inchars[incharslen] might be a digit or other character that might confuse the commonly used standard conversion routines.
The routine is not allowed to set inchars[incharslen] = 0 to create a zero-terminated string in place and then use the typical library routines. Even patching up the overwritten character before returning is not allowed.
Obviously one could copy the char array in to a new writable char array and append a null at the end, but I am hoping to avoid copying. My concern here is performance.
This will be called often so I'd like this to be as efficient as possible. I'd be happy to write my own routine that parses and builds up the float, but if that's the best solution, I'd be interested in the most efficient way to do this in C.
If you think removing constraint 3 really is the way to go to achieve high performance, please explain why and provide a sample that you think will perform better than solutions that maintain constraint 3.
David Gay's implementation, used in the *BSD libcs, can be found here: https://svnweb.freebsd.org/base/head/contrib/gdtoa/ The most important file is strtod.c, but it requires some of the headers and utilities. Modifying that to check the termination every time the string pointer is updated would be a bit of work but not awful.
However, you might afterwards think that the cost of the extra checks is comparable to the cost of copying the string to a temporary buffer of known length, particularly if the strings are short and of a known length, as in your example of a buffer packed with 3-byte undelimited numbers. On most architectures, if the numbers are no more than 8 bytes long and you were careful to ensure that the buffer had a bit of tail room, you could do the copy with a single 8-byte unaligned memory access at very little cost.
Here's a pretty good outline.
Not sure it covers all cases, but it shows most of the flow:
float convert_carray_to_float(char const *inchars, int incharslen)
{
    int Sign = +1;
    int IntegerPart = 0;
    int DecimalPart = 0;
    int Denominator = 1;
    bool beforeDecimal = true;
    if (incharslen == 0)
    {
        return 0.0f;
    }
    int i = 0;
    if (inchars[0] == '-')
    {
        Sign = -1;
        i++;
    }
    else if (inchars[0] == '+')
    {
        Sign = +1;
        i++;
    }
    for ( ; i < incharslen; ++i)
    {
        if (inchars[i] == '.')
        {
            beforeDecimal = false;
            continue;
        }
        if (!isdigit((unsigned char)inchars[i]))
        {
            return 0.0f;
        }
        if (beforeDecimal)
        {
            IntegerPart = 10 * IntegerPart + (inchars[i] - '0');
        }
        else
        {
            DecimalPart = 10 * DecimalPart + (inchars[i] - '0');
            Denominator *= 10;
        }
    }
    return Sign * (IntegerPart + ((float)DecimalPart / Denominator));
}

C: sum of integer values by string identifiers

So, I have two files of financial data, say 'symbols', and 'volumes'. In symbols I have strings such as:
FOO
BAR
BAZINGA
...
In volumes, I have integer values such as:
0001387
0000022
0123374
...
The idea is that the stock symbols will repeat in the file and I need to find the total volume of each stock. So, each row where I observe foo I increment total volume of foo by the value observed in volumes. The problem is that these files can be huge: easily 5 - 100 million records. A typical day may have ~1K different symbols in the file.
Doing a strcmp against every known symbol on each new line would be very inefficient. I was thinking of using an associative array --- a hash table library which allows string keys --- such as uthash or GLib's hash table.
I am reading some pretty good things about Judy arrays? Is the licensing a problem in this case?
Any thoughts on the choice of an efficient hash-table implementation? And also, whether I should use hash tables at all or perhaps something else entirely.
Umm.. apologize for the omission earlier: I need to have a pure C solution.
Thanks.
Definitely hashtable sounds good. You should look at the libiberty implementation.
You can find it on the GCC project Here.
I would use the map from the C++ STL. Here's how the pseudo-code looks:
map<string, long int> Mp;
while (eof is not reached)
{
    string stock_name = readline_from_file1();
    long int stock_value = readline_from_file2();
    Mp[stock_name] += stock_value;
}
for (each (stock_name, stock_value) pair in Mp)
    cout << stock_name << " " << stock_value << endl;
Based on the amount of data you gave, it may be a bit inefficient, but I'd suggest this because its much easier to implement.
If the solution is to be implemented strictly in C, then hashing will be the best solution. But if you feel that implementing a hash table and writing the code to avoid collisions is complex, I have another idea: using a trie. It may sound weird, but this can also help a bit.
I would suggest you read this one. It has a nice explanation of what a trie is and how to construct it. An implementation in C is also given there. As for where to store the volume for each stock: this value can be stored at the node for the end of the stock string and updated easily whenever needed.
But as you say you are new to C, I advise you to try implementing it with a hash table first and then try this one.
Thinking about it, why not stick to your associative array idea. I assume at the end of execution you need to have a list of unique names with their aggregated values. The below will work as long as you have memory to hold all unique names. Of course, this might not be that efficient; however, a few tricks can be done depending upon the patterns in your data.
int Consolidate_Index = 0;
struct struct_Customers
{
    char *name;
    int value;
};
struct struct_Customers Customers[This_Could_be_worse_if_all_names_are_unique];

void consolidate_names(char *name, int value)
{
    for (i = 0; i < Consolidate_Index; i++) {
        if (strcmp(Customers[i].name, name) == 0) {
            Customers[i].value += value;
            return;
        }
    }
    /* Not found: allocate memory for the name now! */
    Customers[Consolidate_Index].name = name;
    Customers[Consolidate_Index].value = value;
    Consolidate_Index++;
}

main() {
    struct struct_Customers buffer[Size_In_Each_Iteration];
    while (file is not done) {
        read file-data chunk of names into buffer names
        read file-data chunk of values into buffer values
        for (i = 0; i < Size_In_Each_Iteration; i++)
            consolidate_names(buffer[i].name, buffer[i].value);
    }
}
My solution:
I did end up using the JudySL array to solve this problem. After some reading, the solution was quite simple to implement using Judy. I am replicating the solution here in full for it to be useful to anyone else.
#include <stdio.h>
#include <Judy.h>

const unsigned int BUFSIZE = 10; /* A symbol is only 8 chars wide. */

int main(int argc, char const **argv) {
    FILE *fsymb = fopen(argv[1], "r");
    if (fsymb == NULL) return 1;
    FILE *fvol = fopen(argv[2], "r");
    if (fvol == NULL) return 1;
    FILE *fout = fopen(argv[3], "w");
    if (fout == NULL) return 1;
    unsigned int lnumber = 0;
    uint8_t symbol[BUFSIZE];
    unsigned long volume;
    /* Initialize the associative map as a JudySL array. */
    Pvoid_t assmap = (Pvoid_t) NULL;
    Word_t *value;
    while (1) {
        fscanf(fsymb, "%s", symbol);
        if (feof(fsymb)) break;
        fscanf(fvol, "%lu", &volume);
        if (feof(fvol)) break;
        ++lnumber;
        /* Insert a new symbol or return value if exists. */
        JSLI(value, assmap, symbol);
        if (value == PJERR) {
            fclose(fsymb);
            fclose(fvol);
            fclose(fout);
            return 2;
        }
        *value += volume;
    }
    symbol[0] = '\0';            /* Start from the empty string. */
    JSLF(value, assmap, symbol); /* Find the next string in the array. */
    while (value != NULL) {
        fprintf(fout, "%s: %lu\n", symbol, *value); /* Print to output file. */
        JSLN(value, assmap, symbol);                /* Get next string. */
    }
    Word_t tmp;
    JSLFA(tmp, assmap); /* Free the entire array. */
    fclose(fsymb);
    fclose(fvol);
    fclose(fout);
    return 0;
}
I tested the solution on a 'small' sample containing 300K lines. The output is correct and the elapsed time was 0.074 seconds.
