Saving a value between function calls in C

Is there any difference in functionality between these two sections of code? The main difference is the use of 'static' in the first example to save the value of x each time function1 is called. But the second example removes the need for 'static' altogether by passing the value of i from main to function1 on each iteration of the for loop. Both have exactly the same output. Is there any hidden advantage to using one way over the other?
Note: the first example is a simplified version of a piece of code I've seen. Just wondering why this was used rather than the alternative.
First example:
#include <stdio.h>
void function1()
{
static int x = 0;
printf("function1 has now been called %d times\n", ++x);
}
int main(void)
{
for (int i = 0; i < 5; i++)
function1();
return 0;
}
Second example:
#include <stdio.h>
void function1(int i)
{
printf("function1 has now been called %d times\n", ++i);
}
int main(void)
{
int i;
for (i = 0; i < 5; i++)
function1(i);
return 0;
}
I'd appreciate any shared knowledge.

As people stated in the comments, there are pros/cons to each approach. Which one you choose depends on the situation you are in, and what tradeoffs you are willing to make. Below are a few ideas to get you rolling.
Static variable approach
void function1(void)
{
static int x = 0;
printf("function1 has now been called %d times\n", ++x);
}
Pros:
Lower resource usage: You aren't passing x on the stack, so you use less memory, if memory is at a premium. Additionally, you save a few instructions moving the argument onto the stack. The address is also fixed, so you don't need to store it or manipulate it in code.
Good locality: As a static, x remains minimally scoped to its purpose, meaning the code is easier to understand and debug. If the scope of x were increased to the entire file, it would be a lot harder to understand (who is modifying x, how can it change, etc.).
More flexible architecturally (simple interface): There really isn't a wrong way to use this function — just call it. You don't need to worry about validating inputs. If you need to move it around, just drop in the header and you are good to go.
Cons:
Less flexible functionally: If you need to change the value of x, for example to reset it, you don't have a way to do that. You would need to alter your design somehow (make x a global, add some reset parameter to the function, etc.) to make it work. Requirements change all the time, and a design that needs only minimal change to meet them is a hallmark of good design.
Harder to test: Tying into the point above, how would you unit test this function? If you wanted to test some low numbers and some high numbers (typical boundary tests), you would need to iterate through the entire space, which may not be feasible if the function takes a long time to run.
Argument approach
void function1(int x)
{
printf("function1 has now been called %d times\n", x);
}
void caller(int *x) /* could get x from anywhere */
{ /* showing it as a pointer from outside here */
*x = *x + 1;
function1(*x);
}
Pros:
More flexible functionally: If your design requirements change, it's relatively simple to change. If you need to change the sequence or have special conditions, like repeat the first value or skip a particular value, that's really easy to do from the calling code. You've separated out things so you can bolt on a new driver and don't need to touch this function anymore. Things are more modular.
More testable: The function can be tested much more easily, granting more confidence that it actually works. You can do boundary testing, test inputs you are worried about, or recreate a failure scenario with ease.
Easier to understand: Functions in this format have the ability to be easier to understand. If a function produces the same output for a given set of inputs, it is said to be pure. The example given is not pure, because it is doing IO (printing to the screen). However, generally speaking, a pure function is easier to reason about because it doesn't hold any internal state, so the only things you really care about are the inputs. Just like 1 + 1 = 2, a pure function has this same simplifying property.
(Potentially) more performant: In the case of pure functions, compilers can take advantage of the function's referential transparency (a fancy term meaning you can replace add(1,1) with 2) and safely cache results from previous calls. Why do the same work again if you've already done it? If the function was particularly expensive to call and sat in a tight loop that called it with similar arguments, you've just saved tons of cycles. Again, this function is not pure, but even if it were, you wouldn't get any performance boost since it is a sequential counter. Any benefits would be seen when the counter wraps, or the function is called with the same arguments again.
Cons:
More resource usage: If you are squeezed for memory, you use a little more stack space as you pass the variable over. You also use more instructions keeping track of its address and moving it over.
Easier to screw up: If you only support numbers 1-10, someone is going to pass it 0 or -1. Now you need to decide how to handle that.
(Potentially) less performant: You can also bog down your code handling cases that should never happen in the name of defensive programming (which is a good thing!). But generally speaking, defensive programming is not built for speed. Speed comes from carefully thought out assumptions. If you are guaranteed that your input falls in a certain range, you can keep things moving as fast as possible without pesky sanity checks peppering your pipeline. The flexibility you gain from exposing this interface comes at a performance cost.
Less flexible architecturally (more cruft): If you call this function from a lot of places, now you need to string along this parameter to feed to it. If you are in a particularly deep call stack, there can be 20 or more functions which pass this argument along. And if the design changes and you need to move this function call from one place to another, you have the pleasure of ripping out the argument from the existing call stack, changing all the callers to conform to the new signature, inserting the argument into the new call stack, and changing all its callers to conform to the new signature! Your other option is to leave the old call stack alone, which leads to harder maintainability and a higher "huh" factor when someone peruses the extra baggage from the days of yore.

Related

Sharing common data in C for scientific computing

In assignments where I have been forced to use C for scientific computing (rather than say, C++, my default choice), I often come across the following pattern:
There's usually a set of data that is commonly needed by many functions. For example, in solving differential equations, I would need to know number of points, the equations' parameters, etc:
struct parameters{
unsigned int num_x;
double length_x;
// so forth
};
I often end up having to combine them in a structure and then end up repeating myself in nearly every function: void f(struct parameters* p, ...). This wouldn't be so bad if it made sense for every function to have it as part of its interface, but it is not always the case, and I dislike the repetition anyway.
Furthermore, it is not always meaningful to have all these parameters in one structure, but splitting it up would make the interface more unmanageable.
Are there any workarounds or useful design patterns to deal with this? Making a global p would fix this, but justifying the use of a global when they are generally not recommended is difficult.
To my mind, there are two big reasons not to use global variables:
Since they're accessible everywhere, it can be impossible to keep track of when and how they get changed.
Their use makes it much more difficult to turn some standalone code into a utility function (a library, for example) that can be easily called from another program, with perhaps multiple instances.
But sometimes, there is data that is just truly global, potentially needed in all parts of a program, and if that's the case, I don't believe there should be any stigma against making it global, if it basically is.
You can dutifully pass around a pointer to your "shared" or "common" data (as you suggested), and often this is absolutely the right pattern, but in that case you've basically reintroduced problem #1.
And if you're sure you're never going to want to repackage your program as a separable, callable library, objection #2 goes away, too.
As Mark Benningfield suggested in a comment, the reason not to use globals is not just because everyone says you shouldn't. If you know what you're doing, and if a global isn't going to cause you problems, you should go ahead and use it.
Me, the only thing I insist on is that if a variable is global, it must have a nice, long, descriptive name. One- or two-character global variable names are right out.
(But with all of that said, you will usually find that global variables, like gotos, can be kept to a bare minimum. The general advice to steer clear of them when possible, though that advice is indeed sometimes overzealously or religiously applied, is usually right.)
As generally you will just be passing a pointer around, using one big struct may be preferred. You can document in each function which members it uses (its actual interface).
You could break down the struct in a number of structs for different types of computation, all having distinct members, and can combine them all in the big struct.
There may be no preferred design pattern.
One method I have used in the past is to declare a set of macro constants at the file level - your example would then be something akin to
#define NUM_X <value>
#define LENGTH_X <value>
These are of course substituted by the preprocessor, which in a global-averse situation is beneficial.
If you really want to avoid using global variables but only want to set your structure once, then you can write a module to do just that.
parameters.h:
struct t_param{
unsigned int num_x;
double length_x;
// so forth
};
int get_parameters(struct t_param * out);
int init_parameters(const struct t_param* in);
parameters.c:
#include <string.h>
#include "parameters.h"
static struct t_param parameters = {0,0.0};
static int initialized = 0;
int init_parameters(const struct t_param *in)
{
if(initialized == 0)
{
memcpy(&parameters, in, sizeof parameters);
initialized = 1;
return 0;
}else
{
return -1;
}
}
int get_parameters(struct t_param *out)
{
if(initialized == 0)
{
return -1;
}else
{
memcpy(out, &parameters, sizeof parameters);
return 0;
}
}
Edit: Traded out member assignment for memcpy calls.

Two approaches to writing functions

I am asking this question in the context of the C language, though it applies really to any language supporting pointers or pass-by-reference functionality.
I come from a Java background, but have written enough low-level code (C and C++) to have observed this interesting phenomenon. Supposing we have some object X (not using "object" here in the strictest OOP sense of the word) that we want to fill with information by way of some other function, it seems there are two approaches to doing so:
Returning an instance of that object's type and assigning it, e.g. if X has type T, then we would have:
T func(){...}
X = func();
Passing in a pointer / reference to the object and modifying it inside the function, and returning either void or some other value (in C, for instance, a lot of functions return an int corresponding to the success/failure of the operation). An example of this here is:
int func(T* x){...*x = 1;...}
func(&X);
My question is: in what situations is one method better than the other? Are they equivalent approaches to accomplishing the same outcome? What are the restrictions of each?
Thanks!
There is a reason that you should always consider using the second method, rather than the first. If you look at the return values for the entirety of the C standard library, you'll notice that there's almost always an element of error handling involved in them. For example, you have to check the return value of the following functions before you assume they've succeeded:
calloc, malloc and realloc
getchar
fopen
scanf and family
strtok
There are other non-standard functions that follow this pattern:
pthread_create, etc.
socket, connect, etc.
open, read, write, etc.
Generally speaking, a return value conveys a number of items successfully read/written/converted or a flat-out boolean success/fail value, and in practice you'll almost always need such a return value, unless you're going to exit(EXIT_FAILURE); at any errors (in which case I would rather not use your modules, because they give me no opportunity to clean up within my own code).
There are functions that don't use this pattern in the standard C library, because they use no resources (e.g. allocations or files) and so there's no chance of any error. If your function is a basic translation function (e.g. like toupper, tolower and friends which translate single character values), for example, then you don't need a return value for error handling because there are no errors. I think you'll find this scenario quite rare indeed, but if that is your scenario, by all means use the first option!
In summary, you should always strongly consider using option 2, reserving the return value for a similar use, for the sake of consistency with the rest of the world, and because you might later decide that you need the return value for communicating errors or the number of items processed.
Method (1) returns the object by value, which requires that the object be copied: once into the return value and again when it is assigned. Method (2) passes only a pointer. When you're dealing with a primitive, (1) is just fine, but when you're dealing with an object, a struct, or an array, that's just wasted space and time.
In Java and many other languages, objects are always passed by reference. Behind the scenes, only a pointer is copied. This means that even though the syntax looks like (1), it actually works like (2).
I think I get you.
These two approaches are very different.
The question you have to ask yourself whenever you are trying to decide which approach to take is:
Which class should have the responsibility?
If you pass a reference to the object, you decouple the creation of the object and hand it to the caller, making the functionality more serviceable: you can create a util class in which all of the functions are stateless, in that they get an object, manipulate the input, and return it.
The other approach is more like an API: you are requesting an operation.
For example, if you have an array of bytes and you would like to convert it to a string, you would probably choose the first approach.
And if you would like to do some operation on a DB, you would choose the second one.
Whenever you have more than one function of the first kind covering the same area, you would encapsulate them into a util class; the same applies to the second, which you would encapsulate into an API.
In method 2, we call x an output parameter. This is actually a very common design utilized in a lot of places: think of some of the standard C functions that populate a text buffer, like snprintf.
This has the benefit of being fairly space-efficient, since you won't be copying structs/arrays/data onto the stack and returning brand new instances.
A really, really convenient quality of method 2 is that you can essentially have any number of "return values." You "return" data through the output parameters, but you can also return a success/error indicator from the function.
A good example of method 2 being used effectively is the standard C function strtol. This function converts a string to a long (basically, parses a number from a string). One of the parameters is a char **. When calling the function, you declare char * endptr locally, and pass in &endptr.
The function will return either:
the converted value if it was successful,
0 if it failed, or
LONG_MIN or LONG_MAX if it was out of range
as well as set the endptr to point to the first non-digit it found.
This is great for error reporting if your program depends on user input, because you can check for failure in so many ways and report different errors for each.
If *endptr isn't the terminating null character after the call to strtol, then you know precisely that the user entered a non-integer, and you can print straight away the character that the conversion failed on if you'd like.
Like Thom points out, Java makes implementing method 2 simpler by simulating pass-by-reference behavior, which is just pointers behind the scenes without the pointer syntax in the source code.
To answer your question: I think C lends itself well to the second method. Functions like realloc are there to give you more space when you need it. However, there isn't much stopping you from using the first method.
Maybe you're trying to implement some kind of immutable object. The first method will be the choice there. But in general, I opt for the second.
(Assuming we are talking about returning only one value from the function.)
In general, the first method is used when type T is relatively small. It is definitely preferable with scalar types. It can be used with larger types. What is considered "small enough" for these purposes depends on the platform and the expected performance impact. (The latter is caused by the fact that the returned object is copied.)
The second method is used when the object is relatively large, since this method does not perform any copying. And with non-copyable types, like arrays, you have no choice but to use the second method.
Of course, when performance is not an issue, the first method can be easily used to return large objects.
An interesting matter is optimization opportunities available to C compiler. In C++ language compilers are allowed to perform Return Value Optimizations (RVO, NRVO), which effectively turn the first method into the second one "under the hood" in situations when the second method offers better performance. To facilitate such optimizations C++ language relaxes some address-identity requirements imposed on the involved objects. AFAIK, C does not offer such relaxations, thus preventing (or at least impeding) any attempts at RVO/NRVO.
Short answer: take 2 if you don't have a necessary reason to take 1.
Long answer: In the world of C++ and its derived languages (Java, C#), exceptions help a lot. In the C world, there is not very much you can do. Following is a sample API I take from the CUDA library, which is a library I like and consider well designed:
cudaError_t cudaMalloc (void **devPtr, size_t size);
compare this API with malloc:
void *malloc(size_t size);
in old C interfaces, there are many such examples:
int open(const char *pathname, int flags);
FILE *fopen(const char *path, const char *mode);
I would argue to the end of the world that the interface CUDA provides is much more obvious and leads to proper results.
There is another set of interfaces where the valid return value space actually overlaps with the error codes, so the designers of those interfaces scratched their heads and came up with ideas that are not brilliant at all, say:
ssize_t read(int fd, void *buf, size_t count);
A daily operation like reading file content is restricted by the definition of ssize_t: since the return value has to encode error codes too, it has to allow negative numbers. On a 32-bit system, the max of ssize_t is 2G, which very much limits the number of bytes you can read from your file in one call.
If your error designator is encoded inside the function's return value, I bet 10/10 programmers won't try to check it, though they really know they should; they just don't, or don't remember, because the form is not obvious.
And another reason is that human beings are very lazy and not good at dealing with if's. The documentation of these functions will describe:
if return value is NULL then ... blah.
if return value is 0 then ... blah.
Yuck.
In the first form, things change. How do you judge whether the value has been returned? There is no NULL or 0 any more. You have to use SUCCESS, FAILURE1, FAILURE2, or something similar. This interface forces users to code more safely and makes the code much more robust.
With these macros, or enums, it's much easier for programmers to learn about the effects of the API and the causes of the different errors too. With all these advantages, there is actually no extra runtime overhead either.
I will try to explain :)
Let's say you have to load a giant rocket into a semi.
Method 1)
The truck driver parks the truck in a parking lot and goes off to find a hooker; you are stuck with putting the load onto a forklift or some kind of trailer to bring it to the truck.
Method 2)
The truck driver forgets the hooker and backs the truck right up to the rocket; then you just need to push it in.
That is the difference between those two :). What it boils down to in programming is:
Method 1)
The caller function reserves an address for the called function to return its value to. How the called function produces that value does not matter to the caller; it needs something returned, and it is the callee's job to get it there. So the called function reserves an address for its calculations, stores the value there, and then returns the value to the caller, which copies it into the address it reserved earlier.
Method 2)
The caller function says "Hey, I will help you. I will give you the address that I have reserved; store whatever you calculate in it." This way you save not only memory but time as well.
And I think the second is better, and here is why:
Let's say you have a struct with 1000 ints inside it. Method 1 would be pointless: it has to reserve 2 × 1000 × 32 bits of memory (64,000 bits), and each value gets copied twice, first into the callee's location and then into the caller's. If each copy took 1 millisecond, you would wait 2 seconds just to store and copy the variables. Whereas if you pass an address, you only have to store everything once.
They are equivalent to me but not in the implementation.
#include <stdio.h>
#include <stdlib.h>
int func(int a,int b){
return a+b;
}
int funn(int *x){
*x=1;
return 777;
}
int main(void){
int sx,*dx;
/* "static" case */
sx=func(4,6); /* looks legit */
funn(&sx); /* looks wrong in this case */
/* "dynamic" case */
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6); /* looks wrong in this case */
sx=funn(dx); /* looks legit */
free(dx);
}
return 0;
}
In the "static" approach it is more comfortable for me to use your first method, because I don't want to mess with the dynamic part (with pointers).
But in the "dynamic" approach I'll use your second method, because it is made for it.
So they are equivalent but not the same: the second approach is clearly made for pointers, and so for the dynamic part.
And this is far clearer ->
int main(void){
int sx,*dx;
sx=func(4,6);
dx=malloc(sizeof(int));
if(dx){
sx=funn(dx);
free(dx);
}
return 0;
}
than this ->
int main(void){
int sx,*dx;
funn(&sx);
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6);
free(dx);
}
return 0;
}

In C language, what is the best practice to check return value of a function for a branching statement?

I'm trying to have an embedded software development point of view, and I'd like to ask which one is better to go with, and what are the possible advantages and disadvantages?
bool funct(){
bool retVal = 0;
//do something
return retVal;
}
//First Choice
if(funct()){
//do something
}
//Second Choice
bool retVal = funct();
if(retVal)
{
//do something
}
Either is probably OK in this example; however, the second has a slight advantage when debugging: when stepping through the code you will know whether the condition is true before the branch is taken, and you can coerce the variable to a different value if you want to test the alternate path. Being able to see the result of a call after the event is useful in any case during debugging.
In more complex expressions the approach may be more important, for example in:
if( x() || y() ) ...
if x() returns true, then y() will not be evaluated, which may or may not be desirable if y() has side effects, so the semantics of that are not the same as:
bool xx = x() ;
bool yy = y() ;
if( xx || yy ) ...
Using explicit assignment allows the required semantics to be clearly expressed.
//First Choice
if(funct()){
//do something
}
This is totally fine, as you check the return value of the function to make the decision, and your function returns either 0 or 1.
There is also an advantage here over the second choice: you save the space of the one variable, retVal, that exists just to hold the return value for the check.
If you need the return value not just for the check in the if condition but also somewhere else in the program, then I would suggest storing the return value (choice 2).
Both methods will work fine. If you define better as code that will execute (very slightly) faster and take up (very slightly) less room when it is compiled, then alternative 1) is better. Alternative 1) will read the value of the function into a register and branch on the value in two commands and use no memory. Alternative 2) will read the value of the function into register, write the value to memory, read the value from memory into a register and branch on the value - for a total of four commands and four bytes of storage (assumes a 32 bit processor).
The first choice (note the spelling) is better, but for reasons entirely unrelated to what you might think.
The reason is that it is one line of code shorter, and therefore you have one less line of code to have to worry about, one less line of code to have to read when trying to understand how it works, one less line of code to have to maintain in the future.
Performance considerations are completely pointless under any real-life scenario, and as a matter of fact I would be willing to guess that any halfway decent compiler will produce the exact same machine code for both of these choices.
If you have questions of such a basic nature, I would strongly advise you to quit trying to "have an embedded software development point of view". Embedded is hard; try non-embedded, which is a lot easier. Once you master non-embedded, then you can try embedded.

when (not) to store a part of a nested data structure into a temporary variable -- pretty/ugly, faster/slower?

What's the best way to read multiple numbers/strings in one array/struct/union, which itself is nested in one or more parent arrays/structs/unions?
1st example without temporary variable:
printf("%d %d\n", a[9][3], a[9][4]);
1st example with temporary variable:
int *b = a[9];
printf("%d %d\n", b[3], b[4]);
The temporary variable in the first example above is quite silly I'm sure, but in the second example below it makes sense and looks better to use one, right?
2nd example without temporary variable:
foo[i]->bar.red[j][0]++;
foo[i]->bar.red[j][1]++;
foo[i]->bar.red[j][2]++;
foo[i]->bar.red[j][3]++;
2nd example with temporary variable:
int *p = foo[i]->bar.red[j];
p[0]++;
p[1]++;
p[2]++;
p[3]++;
So where do you draw the line? I realize that compilers are smart enough to insert any indirection needed to produce assembly of optimal efficiency in this way, but (hypothetically assuming extremely performance critical code) maybe there are exceptions? And from a code clarity/maintainability point of view, what's your rule of thumb, if any?
First, I believe that this is purely a matter of code readability/maintainability, that is, an individual preference. Compilers are smarter than us today (joke). :-)
Personally I usually think about extracting a temporary variable (or a function, this applies to them as well) in two cases:
When I can name it so that its name is self-explanatory and tells more than the initial expression, or
If I have to repeat some piece of code at least three times.
Thus, your first example I'd leave as is:
printf("%d %d\n", a[9][3], a[9][4]);
And the second one would be:
int *p = foo[i]->bar.red[j];
p[0]++;
// ...
Or, (often, but not always) better:
int *bar_red = foo[i]->bar.red[j];
bar_red[0]++;
// ...
My rule of thumb is: add a temporary variable if you can give it a clear and meaningful name that makes the code easier to understand.
A second piece of advice would be: enclose the (small) part that uses this variable in braces. This way the intent is clear, and the code is almost ready if refactoring is needed (the code in braces will be easy to extract into a function).
In your second sample, this would lead to:
/*previous code
...
*/
int *quadruplet = foo[i]->bar.red[j];
{ /*intentionaly meaningless braces that prepare the introduction
of a function (incrementQuadruplet) in a later refactoring if needed*/
quadruplet[0]++;
quadruplet[1]++;
quadruplet[2]++;
quadruplet[3]++;
}
/*following code
...
*/
This doesn't answer your question exactly, but I can't help but stress this, since it is a trap programmers fall into too often.
Rule of thumb: avoid premature optimization.
Modern compilers are clever enough to perform trivial optimizations on their own.
Write code which is easy to understand and maintain, and which follows the coding standards/practices of the team you work in.
Get your code working correctly as per the requirement, profile it for bottlenecks, and only then try to optimize the code which is found to be the bottleneck.

what the author of nedtries means by "in-place"?

I just implemented a kind of bitwise trie (based on nedtries), but my code does a lot of memory allocation (one for each node).
Contrary to my implementation, nedtries are claimed to be fast, among other things, because of their small number of memory allocations (if any).
The author claims his implementation to be "in-place", but what does that really mean in this context?
And how does nedtries achieve such a small number of dynamic memory allocations?
PS: I know that the sources are available, but the code is pretty hard to follow and I cannot figure out how it works.
I'm the author, so this is for the benefit of the many who, according to Google, are similarly having difficulties in using nedtries. I would like to thank the people here on Stack Overflow for not making unpleasant comments about me personally, which some other discussions about nedtries do.
I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:
typedef struct foo_s foo_t;
struct foo_s {
NEDTRIE_ENTRY(foo_t) link;
size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;
static size_t fookeyfunct(const foo_t *RESTRICT r)
{
return r->key;
}
NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));
int main(void)
{
foo_t a, b, c, *r;
NEDTRIE_INIT(&footree);
a.key=2;
NEDTRIE_INSERT(foo_tree_s, &footree, &a);
b.key=6;
NEDTRIE_INSERT(foo_tree_s, &footree, &b);
r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
assert(r==&b);
c.key=5;
r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
assert(r==&b); /* NFIND finds next largest. Invert the key function to invert this */
NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
NEDTRIE_FOREACH(r, foo_tree_s, &footree)
{
printf("%p, %zu\n", (void *)r, r->key);
}
NEDTRIE_PREV(foo_tree_s, &footree, &a);
return 0;
}
You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it otherwise it can contain whatever you like. You also need a key generating function. Other than that, it's pretty boilerplate.
I wouldn't have chosen this system of macro based initialisation myself! But it's for compatibility with the BSD rbtree.h so nedtries is very easy to swap in to anything using BSD rbtree.h.
Regarding my usage of "in place" algorithms, well, I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in-place algorithm it will only touch that 64 bytes, i.e. it won't make use of extra metadata, or allocate some extra memory, or indeed write to global state. A good example is an "in place" sort implementation where only the collection being sorted (and I suppose the thread stack) gets touched.
Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.
Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you state at any given time. In fact, all debugging from my end happens in a C++ build and I meticulously transcribe the C++ changes into macroised C.
Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things, and I am typically amazed at what some people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them; equally, asking on Stack Overflow doesn't let me know immediately that there is a docs problem but rather relies on me to find it next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion, say, like here on Stack Overflow.
Niall
I took a look at the nedtrie.h source code.
It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.
You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.
So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.
At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:
In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc, but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).
In-place means to operate on the input data and (possibly) update it. The implication is that there is no copying and/or moving of the input data. This may result in losing the input data's original values, which you will need to consider if it is relevant to your particular case.
