I have an question about performance of my code.
Let's say I have a struct in C for a point:
typedef struct _CPoint
{
float x, y;
} CPoint;
and a function where I use the struct.
float distance(CPoint p1, CPoint p2)
{
return sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2));
}
I was wondering if it would be a smart idea to replace this function for a #define,
#define distance(p1, p2)(sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2)));
I think it will be faster because there will be no function overhead, and I'm wondering if I should use this approach for all other functions in my program to increase the performance. So my question is:
Should I replace all my functions with #define to increase the performance of my code?
No. You should never make the decision between a macro and a function based on a perceived performance difference. You should evaluate it soley based on the merits of functions over macros. In general choose functions.
Macros have a lot of hidden downsides that can bite you. Case in point, your translation to a macro here is incorrect (or at least not semantics preserving with the original function). The argument to the macro distance gets evaluated 2 times each. Imagine I made the following call
distance(GetPointA(), GetPointB());
In the macro version this actually results in 4 function calls because each argument is evaluated twice. Had distance been left as a function it would only result in 3 function calls (distance and each argument). Note: I'm ignoring the impact of sqrt and pow in the above calculations as they're the same in both versions.
There are three things:
normal functions like your distance above
inline functions
preprocessor macros
While functions guarantee some kind of type safety, they also incur a performance loss due to the fact that a stack frame needs to be used at each function call. code from inline functions is copied at the call site so that penalty is not paid -- however, your code size will increase. Macros provide no type safety and also involve textual substitution.
Choosing from all three, I'd usually use inline functions. Macros only when they are very short and very useful in this form (like hlist_for_each from the Linux kernel)
Jared's right, and in this specific case, the cycles spent in the pow calls and the sqrt call would be in the range of 2 orders of magnitude more than the cycles spent in the call to distance.
Sometimes people assume that small code equals small time. Not so.
I'd recommend an inline function rather than a macro. It'll give you any possible performance benefits of a macro, without the ugliness. (Macros have some gotchas that make them very iffy as a general replacement for functions. In particular, macro args are evaluated every time they're used, while function args are evaluated once each before the "call".)
inline float distance(CPoint p1, CPoint p2)
{
float dx = p2.x - p1.x;
float dy = p2.y - p1.y;
return sqrt(dx*dx + dy*dy);
}
(Note i also replaced pow(dx, 2) with dx * dx. The two are equivalent, and multiplication is more likely to be efficient. Some compilers might try to optimize away the call to pow...but guess what they replace it with.)
If using a fairly mature compiler it propaby will do this for you on assembly level if optimisation is swtiched on.
For gcc the -O3 or (for "small" functions) even the -O2 option will do this.
For details on this you might consider reading here http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html for "-finline*" options.
Related
I have a long formula, like the following:
float a = sin(b)*cos(c)+sin(c+d)*sin(d)....
Is there a way to use s instead of sin in C, to shorten the formula, without affecting the running time?
There are at least three options for using s for sin:
Use a preprocessor macro:
#define s(x) (sin(x))
#define c(x) (cos(x))
float a = s(b)*c(c)+s(c+d)*c(d)....
#undef c
#undef s
Note that the macros definitions are immediately removed with #undef to prevent them from affecting subsequent code. Also, you should be aware of the basics of preprocessor macro substitution, noting the fact that the first c in c(c) will be expanded but the second c will not since the function-like macro c(x) is expanded only where c is followed by (.
This solution will have no effect on run time.
Use an inline function:
static inline double s(double x) { return sin(x); }
static inline double c(double x) { return cos(x); }
With a good compiler, this will have no effect on run time, since the compiler should replace a call to s or c with a direct call to sin or cos, having the same result as the original code. Unfortunately, in this case, the c function will conflict with the c object you show in your sample code. You will need to change one of the names.
Use function pointers:
static double (* const s)(double) = sin;
static double (* const c)(double) = cos;
With a good compiler, this also will have no effect on run time, although I suspect a few more compilers might fail to optimize code using this solution than than previous solution. Again, you will have the name conflict with c. Note that using function pointers creates a direct call to the sin and cos functions, bypassing any macros that the C implementation might have defined for them. (C implementations are allowed to implement library function using macros as well as functions, and they might do so to support optimizations or certain features. With a good quality compiler, this is usually a minor concern; optimization of a direct call still should be good.)
if I use define, does it affect runtime?
define works by doing text-based substitution at compile time. If you #define s(x) sin(x) then the C pre-processor will rewrite all the s(x) into sin(x) before the compiler gets a chance to look at it.
BTW, this kind of low-level text-munging is exactly why define can be dangerous to use for more complex expressions. For example, one classic pitfall is that if you do something like #define times(x, y) x*y then times(1+1,2) rewrites to 1+1*2, which evaluates to 3 instead of the expected 4. For more complex expressions like it is often a good idea to use inlineable functions instead.
Don't do this.
Mathematicians have been abbreviating the trigonometric functions to sin, cos, tan, sinh, cosh, and tanh for many many years now. Even though mathematicians (like me) like to use their favourite and often idiosyncratic notation so puffing up any paper by a number of pages, these have emerged as pretty standard. Even LaTeX has commands like \sin, \cos, and \tan.
The Japanese immortalised the abbreviations when releasing scientific calculators in the 1970s (the shorthand can fit easily on a button), and the C standard library adopted them.
If you deviate from this then your code immediately becomes difficult to read. This can be particularly pernicious with mathematical code where you can't immediately see the effects of a bad implementation.
But if you must, then a simple
static double(*const s)(double) = sin;
will suffice.
I have couple of simple functions like
#define JacobiLog(x1,x2) ((x1>x2)?x1:x2)+log(1+exp(-fabs(x1-x2)))
What is better to implement (code, compile, memory...) - as above with define or to write some simple function
double JacobiLog(double x1,double x2)
{
return ((x1>x2) ? x1 : x2) + log(1+exp(-fabs(x1-x2)));
}
The compiler will probably automatically set your function as inline. You should use it and not a define.
It will also avoid unexpected comportment in the case where you use your define as
double num = JacobiLog(x++, y++);
I let you imagine the problem with code replacement...
define can possibly be little faster, but most probably compiler will inline the function anyway (or you can mark as inline) and they will be the same. But function is better, because it is more readable and easier to debug.
The function is better, assuming a good compiler.
With the function, it is left to the compiler whether the code is inlined, or not (assuming the definition of the function is accessible to everyone who uses it, for example if it is an inline function declared in a header for C++, or just a plain function with all of its users in the same translation unit). With the macro, it is always inlined, which is not necessarily faster, as it may lead to code bloat and therefore more cache misses and page faults.
Not to mention macros are difficult to read and, even worse, to debug.
Even though the 'define' is faster (since it prevents a function call), the compiler can optimize and inline your function, and make it as fast.
If you are in a c++ environment, you should always use template and functions. It will make you're program more readable and prevent type error.
In C, macro can be useful since the type is not specified (see example below):
/* Will work with int, long, double, short, etc. */
#HIGHER(VAL1, VAL2) ((VAL1) > (VAL2) ? (VAL1) : (VAL2))
It's a micro-optimization. Unless you're doing embedded programming and every instruction counts, go with the function. Not to mention that the log is likely about 100x slower than the overhead to call a function. So you can only get about a 1% saving if your program consists mainly of calling this function. [1] Once your program starts doing significant other things, this saving will be reduced to basically unnoticeable.
The compiler is free to inline the function wherever possible, which would make the two identical. However, you can't force the compiler to do so. There is an inline keyword in C++, but this is just a hint, the compiler is free to ignore it.
See this for some differences between the two (this covers inline versus non-inline functions, but, as stated above, inline functions are essentially the same as #define's). The basic conclusion to the link is "it depends".
Also note that, behaviourally, a #define and a function are not 100% equivalent.
[1]: Figures largely made up. Benchmark if you want accurate results.
First (for a complete answer) we have to acknowledge that using a macro can have surprise side-effects which you might not intend, and that a function ensures that you know the incoming types and you know that each parameter is evaluated exactly once.
More often than not, these effects of using a macro are a source of problems.
Generally a compiler will inline the function as appropriate, and if it does its job right then it should have nearly all the advantages of a macro but without the rarely-intended side-effects.
Occasionally, though, you can actually get some benefits that an inlining compiler mightn't recognise. For example your macro will temporarily defer converting the arguments to double if they were int or long and perform more operations in integer arithmetic (which might have a performance or precision advantage). You might also get integer overflow and incorrect results.
Since you included 'memory' in your list of "better" factors, it's tempting to say that the function is smaller (assuming you configure your compiler to optimise for size), but this isn't necessarily true.
Obviously as a function you need only one copy of it in memory and all callers can use that same code, whereas inlined or expanded at every use duplicates the code. Your compiler is very unlikely to isolate a macro and convert it into a function called from many different places in the code.
Where a never-inlined function can fail to be smaller is where it stands in the way of simplifications. There are three common cases I can think of:
If all of the uses of the function involve constant parameters, the inlined simplifications might come out smaller than the whole original function.
The register marshalling code required to execute a function call with the parameters in the correct registers can be longer than the function itself.
Adding a function call can add to the register pressure in the caller, forcing it to generate more complicated code, possibly forcing it to create a stack frame and save more registers on entry and exit.
I can' figure out what is the advantage of using
#define CRANDOM() (random() / 2.33);
instead of
float CRANDOM() {
return random() / 2.33;
}
By using a #define macro you are forcing the body of the macro to be inserted inline.
When using a function there will1 be a function call (and therefor a jump to the address of the function (among other things)), which will slow down performance somewhat.
The former will most often be faster, even though the size of the executable will grow for each use of the #defined macro.
greenend.org.uk - Inline Functions In C
1 a compiler might be smart enough to optimize away the function call, and inline the function - effectively making it the same as using a macro. But for the sake of simplicitly we will disregard this in this post.
It makes sure that the call to CRANDOM is inlined, even if the compiler doesn't support inlining.
Firstly, that #define is incorrect due to the semi-colon at the end, and the compiler would baulk at:
float f = CRANDOM() * 2;
Secondly, I personally try and avoid using the preprocessor beyond separating platform-independent sections in cross-platform code, and of course code reserved exclusively for DEBUG or non-DEBUG builds.
nightcracker correctly states it will always be "effectively" inline, but given you can re-write the function to be inline itself, I see no advantage to using the preprocessor version unless the C-compiler in question does not inline.
The former is old style. The only advantage of the former is that if you have a compiler following the old C90 standard, the macro will work as inlining. On a modern C compiler you should always write:
inline float CRANDOM() {
return random() / 2.33f;
}
where the inline keyword is optional.
(Note that float literals must have a f at the end, otherwise you force the calculation to be performed on double, which you then implicitly round into a float.)
Calling a function involves a little bit of overhead -- pushing the return address onto the machine stack and branching, in this case. By using a macro, you can avoid this overhead. A long time ago, this was important; these days, many compilers will insert the body of a tiny function like this inline, anyway. In general, trying to fool the compiler into emitting faster code is a fool's game; you often end up creating something slower.
In a previous question what I thought was a good answer was voted down for the suggested use of macros
#define radian2degree(a) (a * 57.295779513082)
#define degree2radian(a) (a * 0.017453292519)
instead of inline functions. Please excuse the newbie question, but what is so evil about macros in this case?
Most of the other answers discuss why macros are evil including how your example has a common macro use flaw. Here's Stroustrup's take: http://www.research.att.com/~bs/bs_faq2.html#macro
But your question was asking what macros are still good for. There are some things where macros are better than inline functions, and that's where you're doing things that simply can't be done with inline functions, such as:
token pasting
dealing with line numbers or such (as for creating error messages in assert())
dealing with things that aren't expressions (for example how many implementations of offsetof() use using a type name to create a cast operation)
the macro to get a count of array elements (can't do it with a function, as the array name decays to a pointer too easily)
creating 'type polymorphic' function-like things in C where templates aren't available
But with a language that has inline functions, the more common uses of macros shouldn't be necessary. I'm even reluctant to use macros when I'm dealing with a C compiler that doesn't support inline functions. And I try not to use them to create type-agnostic functions if at all possible (creating several functions with a type indicator as a part of the name instead).
I've also moved to using enums for named numeric constants instead of #define.
There's a couple of strictly evil things about macros.
They're text processing, and aren't scoped. If you #define foo 1, then any subsequent use of foo as an identifier will fail. This can lead to odd compilation errors and hard-to-find runtime bugs.
They don't take arguments in the normal sense. You can write a function that will take two int values and return the maximum, because the arguments will be evaluated once and the values used thereafter. You can't write a macro to do that, because it will evaluate at least one argument twice, and fail with something like max(x++, --y).
There's also common pitfalls. It's hard to get multiple statements right in them, and they require a lot of possibly superfluous parentheses.
In your case, you need parentheses:
#define radian2degree(a) (a * 57.295779513082)
needs to be
#define radian2degree(a) ((a) * 57.295779513082)
and you're still stepping on anybody who writes a function radian2degree in some inner scope, confident that that definition will work in its own scope.
For this specific macro, if I use it as follows:
int x=1;
x = radian2degree(x);
float y=1;
y = radian2degree(y);
there would be no type checking, and x,y will contain different values.
Furthermore, the following code
float x=1, y=2;
float z = radian2degree(x+y);
will not do what you think, since it will translate to
float z = x+y*0.017453292519;
instead of
float z = (x+y)+0.017453292519;
which is the expected result.
These are just a few examples for the misbehavior ans misuse macros might have.
Edit
you can see additional discussions about this here
if possible, always use inline function. These are typesafe and can not be easily redefined.
defines can be redfined undefined, and there is no type checking.
Macros are relatively often abused and one can easily make mistakes using them as shown by your example. Take the expression radian2degree(1 + 1):
with the macro it will expand to 1 + 1 * 57.29... = 58.29...
with a function it will be what you want it to be, namely (1 + 1) * 57.29... = ...
More generally, macros are evil because they look like functions so they trick you into using them just like functions but they have subtle rules of their own. In this case, the correct way would be to write it would be (notice the paranthesis around a):
#define radian2degree(a) ((a) * 57.295779513082)
But you should stick to inline functions. See these links from the C++ FAQ Lite for more examples of evil macros and their subtleties:
inline vs. macros
macros containing if
macros with multiple lines
macros used to paste two tokens together
The compiler's preprocessor is a finnicky thing, and therefore a terrible candidate for clever tricks. As others have pointed out, it's easy to for the compiler to misunderstand your intention with the macro, and it's easy for you to misunderstand what the macro will actually do, but most importantly, you can't step into macros in the debugger!
Macros are evil because you may end up passing more than a variable or a scalar to it and this could resolve in an unwanted behavior (define a max macro to determine max between a and b but pass a++ and b++ to the macro and see what happens).
If your function is going to be inlined anyway, there is no performance difference between a function and a macro. However, there are several usability differences between a function and a macro, all of which favor using a function.
If you build the macro correctly, there is no problem. But if you use a function, the compiler will do it correctly for you every time. So using a function makes it harder to write bad code.
I was reading some code written in C this evening, and at the top of
the file was the function-like macro HASH:
#define HASH(fp) (((unsigned long)fp)%NHASH)
This left me wondering, why would somebody choose to implement a
function this way using a function-like macro instead of implementing
it as a regular vanilla C function? What are the advantages and
disadvantages of each implementation?
Thanks a bunch!
Macros like that avoid the overhead of a function call.
It might not seem like much. But in your example, the macro turns into 1-2 machine language instructions, depending on your CPU:
Get the value of fp out of memory and put it in a register
Take the value in the register, do a modulus (%) calculation by a fixed value, and leave that in the same register
whereas the function equivalent would be a lot more machine language instructions, generally something like
Stick the value of fp on the stack
Call the function, which also puts the next (return) address on the stack
Maybe build a stack frame inside the function, depending on the CPU architecture and ABI convention
Get the value of fp off the stack and put it in a register
Take the value in the register, do a modulus (%) calculation by a fixed value, and leave that in the same register
Maybe take the value from the register and put it back on the stack, depending on CPU and ABI
If a stack frame was built, unwind it
Pop the return address off the stack and resume executing instructions there
A lot more code, eh? If you're doing something like rendering every one of the tens of thousands of pixels in a window in a GUI, things run an awful lot faster if you use the macro.
Personally, I prefer using C++ inline as being more readable and less error-prone, but inlines are also really more of a hint to the compiler which it doesn't have to take. Preprocessor macros are a sledge hammer the compiler can't argue with.
One important advantage of macro-based implementation is that it is not tied to any concrete parameter type. A function-like macro in C acts, in many respects, as a template function in C++ (templates in C++ were born as "more civilized" macros, BTW). In this particular case the argument of the macro has no concrete type. It might be absolutely anything that is convertible to type unsigned long. For example, if the user so pleases (and if they are willing to accept the implementation-defined consequences), they can pass pointer types to this macro.
Anyway, I have to admit that this macro is not the best example of type-independent flexibility of macros, but in general that flexibility comes handy quite often. Again, when certain functionality is implemented by a function, it is restricted to specific parameter types. In many cases in order to apply similar operation to different types it is necessary to provide several functions with different types of parameters (and different names, since this is C), while the same can be done by just one function-like macro. For example, macro
#define ABS(x) ((x) >= 0 ? (x) : -(x))
works with all arithmetic types, while function-based implementation has to provide quite a few of them (I'm implying the standard abs, labs, llabs and fabs). (And yes, I'm aware of the traditionally mentioned dangers of such macro.)
Macros are not perfect, but the popular maxim about "function-like macros being no longer necessary because of inline functions" is just plain nonsense. In order to fully replace function-like macros C is going to need function templates (as in C++) or at least function overloading (as in C++ again). Without that function-like macros are and will remain extremely useful mainstream tool in C.
On one hand, macros are bad because they're done by the preprocessor, which doesn't understand anything about the language and does text-replace. They usually have plenty of limitations. I can't see one above, but usually macros are ugly solutions.
On the other hand, they are at times even faster than a static inline method. I was heavily optimizing a short program and found that calling a static inline method takes about twice as much time (just overhead, not actual function body) as compared with a macro.
The most common (and most often wrong) reason people give for using macros (in "plain old C") is the efficiency argument. Using them for efficiency is fine if you have actually profiled your code and are optimizing a true bottleneck (or are writing a library function that might be a bottleneck for somebody someday). But most people who insist on using them have Not actually analyzed anything and are just creating confusion where it adds no benefit.
Macros can also be used for some handy search-and-replace type substitutions which the regular C language is not capable of.
Some problems I have had in maintaining code written by macro abusers is that the macros can look quite like functions but do not show up in the symbol table, so it can be very annoying trying to trace them back to their origins in sprawling codesets (where is this thing defined?!). Writing macros in ALL CAPS is obviously helpful to future readers.
If they are more than fairly simple substitutions, they can also create some confusion if you have to step-trace through them with a debugger.
Your example is not really a function at all,
#define HASH(fp) (((unsigned long)fp)%NHASH)
// this is a cast ^^^^^^^^^^^^^^^
// this is your value 'fp' ^^
// this is a MOD operation ^^^^^^
I'd think, this was just a way of writing more readable code with the casting and mod opration wrapped into a single macro 'HASH(fp)'
Now, if you decide to write a function for this, it would probably look like,
int hashThis(int fp)
{
return ((fp)%NHASH);
}
Quite an overkill for a function as it,
introduces a call point
introduces call-stack setup and restore
The C Preprocessor can be used to create inline functions. In your example, the code will appear to call the function HASH, but instead is just inline code.
The benefits of doing macro functions were eliminated when C++ introduced inline functions. Many older API like MFC and ATL still use macro functions to do preprocessor tricks, but it just leaves the code convoluted and harder to read.