I have a function in C that accepts and returns a double (and uses several doubles internally). Is there a good way to make a second version of the function, just like it except with float in place of double? Also constants like DBL_EPSILON should be updated.
I suppose I could do this with the preprocessor, but that seems awkward (and probably difficult to debug if there's a compile error). What do best practices recommend? I can't imagine I'm the only one who's had to deal with this.
Edit: I forgot, this is stackoverflow so I can't just ask a question, I have to justify myself. I have code which is very performance-sensitive in this case; the cost of using doubles rather than floats is 200% to 300%. Up until now I only needed a double version -- when I needed it I wanted as much precision as possible, regardless of the time needed (in that application it was a tiny percentage). But now I've found a use that is sensitive to speed and doesn't benefit from the extra precision. I cringed at my first thought, which was to copy the entire function and replace the types. Then I thought that a better approach would be known to the experts at SO, so I posted here.
don't know about "best practices", but the preprocessor definitely was the first thing to jump to my mind. it's similar to templates in C++.
[edit: and the Jesus Ramos answer mentions the different letters on functions with different types in libraries, and indeed you would probably want to do this]
you create a separate source file with your functions; everywhere you have a double, change it to FLOATING_POINT_TYPE (just as an example), and then include that source file twice from another file (or use whatever method you like -- you just need to be able to process the file twice, once with each data type as your define). [also define FLOATING_POINT_TYPE_CHAR to hold the character appended to distinguish the different versions of each function]
#define FLOATING_POINT_TYPE double
#define FLOATING_POINT_TYPE_CHAR d
#include "my_fp_file.c"
#undef FLOATING_POINT_TYPE_CHAR
#undef FLOATING_POINT_TYPE
#define FLOATING_POINT_TYPE float
#define FLOATING_POINT_TYPE_CHAR f
#include "my_fp_file.c"
#undef FLOATING_POINT_TYPE
#undef FLOATING_POINT_TYPE_CHAR
then you can also use a similar strategy for your prototypes in your headers.
but in your header file you would need something like:
#define MY_FP_FUNC(funcname, typechar) \
funcname##typechar
and for your function definitions/prototypes:
FLOATING_POINT_TYPE
MY_FP_FUNC(DoTheMath, FLOATING_POINT_TYPE_CHAR)
(
FLOATING_POINT_TYPE Value1,
FLOATING_POINT_TYPE Value2
);
and so forth.
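As a rough sketch (not from the original answer; FLOATING_POINT_EPSILON is an extra define you would set to DBL_EPSILON or FLT_EPSILON alongside the other defines), the body of my_fp_file.c could then look like:
/* my_fp_file.c -- included twice, once per type (sketch only).
   FLOATING_POINT_EPSILON is an assumed extra define, set to
   DBL_EPSILON or FLT_EPSILON next to FLOATING_POINT_TYPE. */
FLOATING_POINT_TYPE
MY_FP_FUNC(DoTheMath, FLOATING_POINT_TYPE_CHAR)
(
    FLOATING_POINT_TYPE Value1,
    FLOATING_POINT_TYPE Value2
)
{
    FLOATING_POINT_TYPE diff = Value1 - Value2;
    /* treat nearly-equal inputs as equal, using the type's epsilon */
    if (diff < FLOATING_POINT_EPSILON && diff > -FLOATING_POINT_EPSILON)
        return (FLOATING_POINT_TYPE)0;
    return diff;
}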
i'll definitely leave it to someone else to talk about best practices :)
BTW, for an example of this kind of strategy in a mature piece of software, you can check out FFTW (fftw.org); it's a bit more complicated than this example, but i think it uses basically the same strategy.
Don't bother.
Except for a few specific hardware implementations, there is no advantage to having a float version of a double function. Most IEEE 754 hardware performs all calculations in 64- or 80-bit arithmetic internally, and truncates the results to the desired precision on storing.
It is completely fine to return a double to be used or stored as a float. Creating a float version of the same logic is not likely to run any faster or be more suitable for much of anything. The only exception coming to mind would be GPU-optimized algorithms which do not support 64+ bit operations.
As you can see from most standard libraries and such, functions aren't really overloaded; new functions with different names are created instead. For example:
void my_function(double d1, double d2);
void my_functionf(float f1, float f2);
A lot of them have a different last letter in the name to indicate that it is sort of like an overload for a different type. The same applies to variations in return type, as with atoi, atol, atof, etc.
Alternatively wrap your function in a macro that adds the type as an argument such as
#define myfunction(arg1, arg2, type) ....
This way it's much easier: you can wrap everything with your type, avoid copy-pasting the function, and you can always check the type.
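A minimal sketch of what such a macro could look like (the body here is just a placeholder addition, not an existing API):
/* type-parameterized macro: the caller passes the type along with the operands */
#define myfunction(arg1, arg2, type) (((type)(arg1)) + ((type)(arg2)))

/* usage -- the same macro works for any arithmetic type: */
int a = 3, b = 4;
float f  = myfunction(a, b, float);
double d = myfunction(a, b, double);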
In this case I would say the best practice would be writing a custom codegen tool, which will take 'generic' code and create new double and float versions each time before compilation.
I have a long formula, like the following:
float a = sin(b)*cos(c)+sin(c+d)*sin(d)....
Is there a way to use s instead of sin in C, to shorten the formula, without affecting the running time?
There are at least three options for using s for sin:
Use a preprocessor macro:
#define s(x) (sin(x))
#define c(x) (cos(x))
float a = s(b)*c(c)+s(c+d)*c(d)....
#undef c
#undef s
Note that the macro definitions are immediately removed with #undef to prevent them from affecting subsequent code. You should also be aware of the basics of preprocessor macro substitution, noting that the first c in c(c) will be expanded but the second c will not, since the function-like macro c(x) is expanded only where c is followed by (.
This solution will have no effect on run time.
Use an inline function:
static inline double s(double x) { return sin(x); }
static inline double c(double x) { return cos(x); }
With a good compiler, this will have no effect on run time, since the compiler should replace a call to s or c with a direct call to sin or cos, having the same result as the original code. Unfortunately, in this case, the c function will conflict with the c object you show in your sample code. You will need to change one of the names.
Use function pointers:
static double (* const s)(double) = sin;
static double (* const c)(double) = cos;
With a good compiler, this also will have no effect on run time, although I suspect a few more compilers might fail to optimize code using this solution than the previous solution. Again, you will have the name conflict with c. Note that using function pointers creates a direct call to the sin and cos functions, bypassing any macros that the C implementation might have defined for them. (C implementations are allowed to implement library functions using macros as well as functions, and they might do so to support optimizations or certain features. With a good quality compiler, this is usually a minor concern; optimization of a direct call should still be good.)
if I use define, does it affect runtime?
define works by doing text-based substitution at compile time. If you #define s(x) sin(x) then the C pre-processor will rewrite all the s(x) into sin(x) before the compiler gets a chance to look at it.
BTW, this kind of low-level text-munging is exactly why #define can be dangerous to use for more complex expressions. For example, one classic pitfall is that if you do something like #define times(x, y) x*y then times(1+1,2) rewrites to 1+1*2, which evaluates to 3 instead of the expected 4. For more complex expressions like this, it is often a good idea to use inlineable functions instead.
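The standard fix is to parenthesize both the parameters and the whole expansion:
#define times(x, y) ((x) * (y))   /* times(1+1, 2) now expands to ((1+1) * (2)), which is 4 */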
Don't do this.
Mathematicians have been abbreviating the trigonometric functions to sin, cos, tan, sinh, cosh, and tanh for many, many years now. Even though mathematicians (like me) like to use their favourite and often idiosyncratic notation, puffing up any paper by a number of pages, these abbreviations have emerged as pretty standard. Even LaTeX has commands like \sin, \cos, and \tan.
The Japanese immortalised the abbreviations when releasing scientific calculators in the 1970s (the shorthand can fit easily on a button), and the C standard library adopted them.
If you deviate from this then your code immediately becomes difficult to read. This can be particularly pernicious with mathematical code where you can't immediately see the effects of a bad implementation.
But if you must, then a simple
static double(*const s)(double) = sin;
will suffice.
I'm writing a Scheme interpreter. For each built-in type (integer, character, string, etc) I want to have the read and print functions named consistently:
READ_ERROR Scheme_read_integer(FILE *in, Value *val);
READ_ERROR Scheme_read_character(FILE *in, Value *val);
I want to ensure consistency in the naming of these functions, so I have defined:
#define SCHEME_READ(type_) Scheme_read_##type_
#define DEF_READER(type_, in_strm_, val_) READ_ERROR SCHEME_READ(type_)(FILE *in_strm_, Value *val_)
So that now, instead of the above, in code I can write
DEF_READER(integer, in, val)
{
// Code here ...
}
DEF_READER(character, in, val)
{
// Code here ...
}
and
if (SOME_ERROR != SCHEME_READ(integer)(stdin, my_value)) do_stuff(); // etc.
Now is this considered an unidiomatic use of the preprocessor? Am I shooting myself in the foot somewhere unknowingly? Should I instead just go ahead and use the explicit names of the functions?
If not are there examples in the wild of this sort of thing done well?
I've seen this done extensively in a project, and there's a severe danger of foot-shooting going on.
The problem happens when you try to maintain the code. Even though your macro-ized function definitions are all neat and tidy, under the covers you get function names like Scheme_read_integer. Where this becomes an issue is when something like Scheme_read_integer appears on a crash stack. If someone searches the source tree for Scheme_read_integer, they won't find it. This can cause great pain and gnashing of teeth ;)
If you're the only developer, and the code base isn't that big, and you remember using this technique years down the road and/or it's well documented, you may not have an issue. In my case it was a very large code base, poorly documented, with none of the original developers around. The result was much tooth-gnashing.
I'd go out on a limb and suggest using a C++ template, but I'm guessing that's not an option since you specifically mentioned C.
Hope this helps.
I'm usually a big fan of macros, but you should probably consider inlined wrapper functions instead. They will add negligible runtime overhead and will appear in stack backtraces, etc., when you're debugging.
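For example (just a sketch reusing the names from the question; read_value_generic and the TYPE_* tags are hypothetical helpers, not part of the original code), the greppable names stay in the source while the shared logic lives in one place:
/* Sketch: explicit, searchable names; the shared work is delegated to a
   single helper. read_value_generic and TYPE_* are hypothetical. */
static inline READ_ERROR Scheme_read_integer(FILE *in, Value *val)
{
    return read_value_generic(in, val, TYPE_INTEGER);
}

static inline READ_ERROR Scheme_read_character(FILE *in, Value *val)
{
    return read_value_generic(in, val, TYPE_CHARACTER);
}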
A situation I run into a lot in writing C code (context is scientific computation) is that I will have functions which have exactly the same body modulo minor type differences. I realize C++ offers the template feature and function overloading which allows one to have only one copy of said function and let the compiler figure out what signature you meant to use when you build.
While this is a great feature in C++, my project is in C and I furthermore do not need the full power of templating. So far what I have tried is m4 macros on a candidate source file, and this spits out respective .c files with appropriate name mangling for the different types I need. The preprocessor could therefore accomplish this as well, but I'm attempting to avoid using it in complicated ways (my code needs to be understandable for reproducibility reasons). I'm not very good with m4, so all the files have been hacks that only work in specific cases and are inapplicable in new situations.
What do other people programming in C do when this is necessary? Manually produce and maintain the different permutations of function signatures? I'm hoping that isn't the best answer, or that a tool exists to automate this dreary and error prone task.
Apologies for vagueness, let me give a toy example. Suppose I have need to add two numbers. The function might look something like this:
float add(float x,float y){
return x+y;
}
Ok, that's great for floats, but what if I need it for a wide range of types on which arithmetic is available? Ok, I can do this:
float add_f(float x,float y){...}
double add_lf(double x,double y){...}
unsigned int add_ui(unsigned int x, unsigned int y){...}
and so forth. If for some (probably stupid) reason I decide I need to also write the contents of the arguments to a binary file, I now have to add in the requisite file I/O code in every single function. Is there a simple way/tool to take an add function and spit out different ones with name mangling to avoid this annoying situation?
Basically in my m4 cases I would just find/replace a macro TYPE with the requisite type, and have a macro MANGLE() which mangles the functions, then I point the output to an alternate .c file. My m4 skills are lacking though.
Function pointers can help with the ultimate interface of my code, but eventually those pointers have to point to something, and then we're just enumerating all the possibilities again. I'm also unclear on how this might affect potential inlining of short functions.
The only thing i can think of is: make the algorithm itself independent of the type, have the user of your function create his own function to handle the type-specific parts, and make one of the parameters to your function a pointer to the "handler function".
See the definition/implementation of the qsort routine for what i mean. Qsort works for all kinds of data, but handles the data itself transparently - the only type-specific things you pass to qsort are the size of each entry and a function pointer to a function that does the real comparison.
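Applied to the add example, that could look something like this (a sketch; the names are illustrative, not an existing API):
#include <stdio.h>

/* The "algorithm" below is type-agnostic: it only shuffles opaque
   pointers around. The caller supplies a handler that knows the type,
   just as qsort takes a comparison function. */
typedef void (*combine_fn)(const void *a, const void *b, void *out);

static void combine(const void *a, const void *b, void *out, combine_fn fn)
{
    fn(a, b, out);   /* type-specific work delegated to the caller */
}

static void add_float(const void *a, const void *b, void *out)
{
    *(float *)out = *(const float *)a + *(const float *)b;
}

int main(void)
{
    float x = 1.5f, y = 2.25f, r;
    combine(&x, &y, &r, add_float);
    printf("%f\n", r);   /* prints 3.750000 */
    return 0;
}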
You appear to be asking for generic type support. While the macro processing can work in restricted domains, what you are doing is complex.
If the variants are so similar that simple type and name mangling is enough, then could you not use regular C #defines before each of multiple inclusions of the same source fragment to let the preprocessor perform the substitution? This way, at least, there is only a single environment to manage.
Alternatively, if the performance hit is not substantial, could you prepare a stub function for each specialisation and map these to a generic version that is called from the stubs?
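For example (a sketch only; here the generic version simply works in double, the widest type involved, which is where the conversion cost comes from):
/* One shared implementation in the widest type, with thin per-type stubs. */
static double add_generic(double x, double y)
{
    return x + y;
}

float add_f(float x, float y)     { return (float)add_generic(x, y); }
double add_lf(double x, double y) { return add_generic(x, y); }
unsigned int add_ui(unsigned int x, unsigned int y)
{
    /* note: going through double changes overflow behaviour for unsigned int */
    return (unsigned int)add_generic(x, y);
}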
I use GNU autogen for code generation tasks, which sounds somewhat like your current m4 solution, but might be better organized. For example:
type.def
autogen definitions type;
type = { name="int"; mangle="i"; };
type = { name="double"; mangle="lf"; };
type = { name="float"; mangle="f"; };
type = { name="unsigned int"; mangle="ui"; };
type.tpl
[+ autogen5 template
c=%s.c
(setenv "SHELL" "/bin/sh") +]/*
[+ (dne "* " "* ") +]
*/
[+
FOR type "\n" +][+name+] add_[+mangle+]([+name+] x, [+name+] y) { ... }[+ENDFOR+]
or something like that. This should spit out a function for each of the types in type.def looking something like:
unsigned int add_ui(unsigned int x, unsigned int y) { ... }
You can also have it insert type-specific code in certain places if needed, etc. You could have it output the add functions described above as well as the I/O versions. You'd have to compute the text for mangle instead of what I've got, but that's not a problem. You'd also have some conditional code for the I/O and a way to toggle the condition on and off (again, not a problem).
I'd definitely try and see if there was some way to generalize the algorithm, but this approach might have drawbacks (e.g. performance issues from not having the real underlying type) as well. But it sounds from the comments that this approach might not work for you.
I know that most C developers are afraid of them, but have you thought about using macros?
specific to your example:
// floatstuff.h
float add_f(float x,float y);
double add_lf(double x,double y);
unsigned int add_ui(unsigned int x, unsigned int y);
combined with:
// floatstuff.c
#define MY_CODE \
return x + y
float
add_f (float x, float y)
{
MY_CODE;
}
double
add_lf (double x, double y)
{
MY_CODE;
}
unsigned int
add_ui (unsigned int x, unsigned int y)
{
MY_CODE;
}
If the code you are using per function is truly identical, then this might be the solution you are looking for. It avoids most of the code duplication, maintains some degree of readability and has no impact on your runtime.
Also, if you keep the macro local to your .c file, you are unlikely to break anything, so no worries there either.
Also, you can do even more weird stuff using parameterized macros, which can reduce code duplication even further.
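For instance (a sketch, not part of the code above), a parameterized macro can stamp out one definition per type:
/* One macro generates each definition; invoke it once per type/suffix. */
#define DEFINE_ADD(type, suffix)        \
    type add_##suffix (type x, type y)  \
    {                                   \
        return x + y;                   \
    }

DEFINE_ADD(float, f)
DEFINE_ADD(double, lf)
DEFINE_ADD(unsigned int, ui)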
I have a question about the performance of my code.
Let's say I have a struct in C for a point:
typedef struct _CPoint
{
float x, y;
} CPoint;
and a function where I use the struct.
float distance(CPoint p1, CPoint p2)
{
return sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2));
}
I was wondering if it would be a smart idea to replace this function with a #define:
#define distance(p1, p2)(sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2)));
I think it will be faster because there will be no function overhead, and I'm wondering if I should use this approach for all other functions in my program to increase the performance. So my question is:
Should I replace all my functions with #define to increase the performance of my code?
No. You should never make the decision between a macro and a function based on a perceived performance difference. You should evaluate it solely based on the merits of functions over macros. In general, choose functions.
Macros have a lot of hidden downsides that can bite you. Case in point: your translation to a macro here is incorrect (or at least does not preserve the semantics of the original function). The arguments to the macro distance get evaluated twice each. Imagine I made the following call
distance(GetPointA(), GetPointB());
In the macro version this actually results in 4 function calls because each argument is evaluated twice. Had distance been left as a function it would only result in 3 function calls (distance and each argument). Note: I'm ignoring the impact of sqrt and pow in the above calculations as they're the same in both versions.
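You can see this directly in the expansion (sketched with the macro from the question):
/* distance(GetPointA(), GetPointB()) expands, roughly, to: */
sqrt(pow((GetPointB().x - GetPointA().x), 2) + pow((GetPointB().y - GetPointA().y), 2));
/* GetPointA() and GetPointB() each appear -- and therefore run -- twice. */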
There are three things:
normal functions like your distance above
inline functions
preprocessor macros
While functions guarantee some kind of type safety, they also incur a performance cost, because a stack frame needs to be set up at each function call. Code from inline functions is copied at the call site, so that penalty is not paid -- however, your code size will increase. Macros provide no type safety and also involve textual substitution.
Choosing from all three, I'd usually use inline functions, and macros only when they are very short and very useful in this form (like hlist_for_each from the Linux kernel).
Jared's right, and in this specific case, the cycles spent in the pow calls and the sqrt call would be in the range of 2 orders of magnitude more than the cycles spent in the call to distance.
Sometimes people assume that small code equals small time. Not so.
I'd recommend an inline function rather than a macro. It'll give you any possible performance benefits of a macro, without the ugliness. (Macros have some gotchas that make them very iffy as a general replacement for functions. In particular, macro args are evaluated every time they're used, while function args are evaluated once each before the "call".)
inline float distance(CPoint p1, CPoint p2)
{
float dx = p2.x - p1.x;
float dy = p2.y - p1.y;
return sqrt(dx*dx + dy*dy);
}
(Note I also replaced pow(dx, 2) with dx * dx. The two are equivalent, and multiplication is more likely to be efficient. Some compilers might try to optimize away the call to pow...but guess what they replace it with.)
If you are using a fairly mature compiler, it will probably do this for you at the assembly level if optimisation is switched on.
For gcc, the -O3 option, or (for "small" functions) even -O2, will do this.
For details, you might consider reading http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, in particular the "-finline*" options.
I was reading some code written in C this evening, and at the top of the file was the function-like macro HASH:
#define HASH(fp) (((unsigned long)fp)%NHASH)
This left me wondering, why would somebody choose to implement a function this way, using a function-like macro instead of implementing it as a regular vanilla C function? What are the advantages and disadvantages of each implementation?
Thanks a bunch!
Macros like that avoid the overhead of a function call.
It might not seem like much. But in your example, the macro turns into 1-2 machine language instructions, depending on your CPU:
Get the value of fp out of memory and put it in a register
Take the value in the register, do a modulus (%) calculation by a fixed value, and leave that in the same register
whereas the function equivalent would be a lot more machine language instructions, generally something like
Stick the value of fp on the stack
Call the function, which also puts the next (return) address on the stack
Maybe build a stack frame inside the function, depending on the CPU architecture and ABI convention
Get the value of fp off the stack and put it in a register
Take the value in the register, do a modulus (%) calculation by a fixed value, and leave that in the same register
Maybe take the value from the register and put it back on the stack, depending on CPU and ABI
If a stack frame was built, unwind it
Pop the return address off the stack and resume executing instructions there
A lot more code, eh? If you're doing something like rendering every one of the tens of thousands of pixels in a window in a GUI, things run an awful lot faster if you use the macro.
Personally, I prefer using C++ inline as being more readable and less error-prone, but inlines are also really more of a hint to the compiler which it doesn't have to take. Preprocessor macros are a sledge hammer the compiler can't argue with.
One important advantage of macro-based implementation is that it is not tied to any concrete parameter type. A function-like macro in C acts, in many respects, as a template function in C++ (templates in C++ were born as "more civilized" macros, BTW). In this particular case the argument of the macro has no concrete type. It might be absolutely anything that is convertible to type unsigned long. For example, if the user so pleases (and if they are willing to accept the implementation-defined consequences), they can pass pointer types to this macro.
Anyway, I have to admit that this macro is not the best example of the type-independent flexibility of macros, but in general that flexibility comes in handy quite often. Again, when certain functionality is implemented by a function, it is restricted to specific parameter types. In many cases, in order to apply a similar operation to different types, it is necessary to provide several functions with different types of parameters (and different names, since this is C), while the same can be done by just one function-like macro. For example, the macro
#define ABS(x) ((x) >= 0 ? (x) : -(x))
works with all arithmetic types, while a function-based implementation has to provide quite a few variants (I'm referring to the standard abs, labs, llabs and fabs). (And yes, I'm aware of the traditionally mentioned dangers of such a macro.)
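For example, the one macro covers all of these:
int    i = ABS(-5);     /* int */
long   l = ABS(-5L);    /* long */
double d = ABS(-5.25);  /* double */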
Macros are not perfect, but the popular maxim about "function-like macros being no longer necessary because of inline functions" is just plain nonsense. In order to fully replace function-like macros, C is going to need function templates (as in C++) or at least function overloading (as in C++ again). Without that, function-like macros are and will remain an extremely useful mainstream tool in C.
On one hand, macros are bad because they're handled by the preprocessor, which doesn't understand anything about the language and just does text replacement. They usually have plenty of limitations; I can't see any in the one above, but usually macros are ugly solutions.
On the other hand, they are at times even faster than a static inline method. I was heavily optimizing a short program and found that calling a static inline method takes about twice as much time (just overhead, not actual function body) as compared with a macro.
The most common (and most often wrong) reason people give for using macros (in "plain old C") is the efficiency argument. Using them for efficiency is fine if you have actually profiled your code and are optimizing a true bottleneck (or are writing a library function that might be a bottleneck for somebody someday). But most people who insist on using them have not actually analyzed anything and are just creating confusion where it adds no benefit.
Macros can also be used for some handy search-and-replace type substitutions which the regular C language is not capable of.
One problem I have had in maintaining code written by macro abusers is that the macros can look quite like functions but do not show up in the symbol table, so it can be very annoying trying to trace them back to their origins in sprawling codebases (where is this thing defined?!). Writing macros in ALL CAPS is obviously helpful to future readers.
If they are more than fairly simple substitutions, they can also create some confusion if you have to step-trace through them with a debugger.
Your example is not really a function at all,
#define HASH(fp) (((unsigned long)fp)%NHASH)
// this is a cast  ^^^^^^^^^^^^^^^
// this is your value 'fp'        ^^
// this is a MOD operation           ^^^^^^
I'd think this was just a way of writing more readable code, with the cast and mod operation wrapped into a single macro, 'HASH(fp)'.
Now, if you decide to write a function for this, it would probably look like,
int hashThis(int fp)
{
return ((fp)%NHASH);
}
Quite an overkill for a function, as it:
introduces a call point
introduces call-stack setup and restore
The C preprocessor can be used to create inline functions. In your example, the code will appear to call the function HASH, but it is actually just inline code.
The benefits of macro functions were eliminated when C++ introduced inline functions. Many older APIs like MFC and ATL still use macro functions for preprocessor tricks, but it just leaves the code convoluted and harder to read.
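For comparison, an inline-function version of the same hash might look like this (a sketch; it assumes fp is a pointer, matching the cast in the macro, and that NHASH is defined as before):
/* Sketch: inline-function equivalent of the HASH macro (C99 or C++).
   Converting a pointer to unsigned long is the same implementation-defined
   conversion the macro performs. */
static inline unsigned long hash_fp(const void *fp)
{
    return (unsigned long)fp % NHASH;
}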