Is it bad or good practice, or maybe undefined behavior, to re-assign a function parameter inside the function?
Let me explain what I'm trying to do with an example, here the function:
void
gkUpdateTransforms(GkNode *node /* other params */) {
GkNode *nodei;
if (!(nodei = node->chld))
return;
do {
/* do job */
nodei = nodei->next;
} while (nodei);
}
Alternative:
void
gkUpdateTransforms2(GkNode *node /* other params */) {
/* node parameter is only used here to get chld, not anywhere else */
if (!(node = node->chld))
return;
do {
/* do job */
node = node->next;
} while (node);
}
I checked the assembly output and it seems to be the same, so we don't need to declare a variable in the second one. You may ask what happens if the parameter type changes, but the same applies to the first version, because it also needs to be updated.
EDIT: Parameters are passed by value, and my intention is not to edit the caller's pointer itself.
EDIT2: What about recursive functions? What would happen if gkUpdateTransforms2 were recursive? I'm confused because the function will call itself, but I think in every call the parameters will be in a different stack frame.
I have no idea why you think this would be undefined behavior - it is not. Mostly it is a matter of coding style, there's no obvious right or wrong.
Generally, it is good practice to regard parameters as immutable objects. It is useful to preserve an untouched copy of the input to the function. For that reason, it may be a good idea to use a local variable which is just a copy of the parameter. As you can see, this does not affect performance in the slightest - the compiler will optimize the code.
However, it is not a big deal if you write to the parameters either. This is common practice too. Calling it bad practice to do so would be very pedantic.
Some pedantic coding styles make all function parameters const if they shouldn't be modified, but I personally think that's just obfuscation, which makes the code harder to read. In your case such pedantic style would be void gkUpdateTransforms(GkNode*const node). Not to be confused with const correctness, which is a universally good thing and not just a style matter.
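To make the difference between the two uses of const concrete, here is a minimal sketch. The Node type and both function names are made up for illustration (Node merely stands in for GkNode). A const parameter only stops the function from reassigning its own copy of the pointer, while const correctness is a promise not to modify the data the pointer refers to.

```c
#include <stddef.h>

/* Hypothetical node type, standing in for GkNode. */
typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* Style choice: the parameter itself is const, so `head = head->next`
 * would not compile and a separate iterator variable is required. */
int sum_const_param(Node *const head) {
    int total = 0;
    for (Node *it = head; it != NULL; it = it->next)
        total += it->value;
    return total;
}

/* Const correctness: the pointed-to nodes are read-only, but the
 * parameter may still be reassigned and used as the iterator. */
int sum_const_data(const Node *head) {
    int total = 0;
    while (head != NULL) {
        total += head->value;
        head = head->next;
    }
    return total;
}
```

Both functions compute the same result; only the second signature tells the caller anything useful, namely that the list is not modified.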
However, there is something in your code which is definitely considered bad practice, and that is assignment inside conditions. Avoid this whenever possible, it is dangerous and makes the code harder to read. Most often there is no benefit.
The danger of mixing up = and == was noted early on in the history of C. To counter this, in the 1980s people came up with brain-damaged things like the "yoda conditions". Then around 1989 came Borland Turbo C which had a fancy warning feature "possible incorrect assignment". That was the death of the Yoda conditions, and compilers since then have warned against assignment in conditions.
Make sure that your current compiler gives a warning for this. That is, make sure not to use a worse compiler than Borland Turbo from 1989. Yes, there are worse compilers on the market.
(gcc gives "warning: suggest parentheses around assignment used as truth value")
I would write the code as
void gkUpdateTransforms(GkNode* node /* other params */)
{
  if(node == NULL)
  {
    return ;
  }
  for(GkNode* i=node->chld; i!=NULL; i=i->next)
  {
    /* do job */
  }
}
These are mostly stylistic changes to make the code more readable. They do not improve performance much.
IMHO it is not exactly "bad" practice, but it is worthwhile to ask oneself whether there is a better way. About your analysis of the assembler output: it may serve as an interesting and educational look behind the curtain, but you are ill-advised to use it as a justification for optimization or, worse, laziness in the source code. The next compiler or the next architecture may just render your musings completely invalid - my recommendation is to stay with Knuth here: "Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.".
In your code I think the decision is 50:50 with no clear winner. I would deem the node-iterator a concept of its own, justifying a separate programming construct (which in our case is just a variable), but then again the function is so simple that we don't win much in terms of clarity for the next programmer looking at your code, so we can very well live with the second version. If your function starts to mutate and grow over time, this premise may become invalid and we would be better off with the first version.
That said, I would code the first version like this:
void
gkUpdateTransforms(GkNode *node /* other params */) {
for (GkNode *nodei = node->chld; nodei != NULL; nodei = nodei->next) {
/* do job */
}
}
This is well defined and a perfectly good way to implement this behaviour.
The reason you might see it as an issue is the common mistake of doing the following:
int func(object a) {
modify a // only modifying copy, but user expects a to be modified
}
But in your case, you expect to make a copy of the pointer.
As long as it's passed by value, it can be safely treated as any other local variable. Not a bad practice in this scenario, and not undefined behaviour either.
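The same reasoning covers the recursion question from EDIT2: every call gets its own copy of the parameter, so reassigning it in one call cannot disturb another. Here is a small sketch under assumed names (TreeNode is a made-up stand-in for GkNode, and count_descendants is hypothetical) that reuses the parameter as the iterator inside a recursive function.

```c
#include <stddef.h>

/* Hypothetical stand-in for GkNode: first child plus next sibling. */
typedef struct TreeNode {
    struct TreeNode *chld;
    struct TreeNode *next;
} TreeNode;

/* Counts all descendants of `node` (which must not be NULL).
 * The parameter is reused as the iterator; the recursive call below
 * receives its own copy, so this frame's `node` is still the current
 * child when the call returns. */
int count_descendants(TreeNode *node) {
    int count = 0;
    node = node->chld;
    while (node != NULL) {
        count += 1 + count_descendants(node);
        node = node->next;
    }
    return count;
}
```

The pointer held by the caller is untouched as well: after `count_descendants(p)`, `p` still points at the same node.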
Say you have (for reasons that are not important here) the following code:
int k = 0;
... /* no change to k can happen here */
if (k) {
do_something();
}
Using the -O2 flag, GCC will not generate any code for it, recognizing that the if test is always false.
I'm wondering if this is a pretty common behaviour across compilers or it is something I should not rely on.
Does anybody know?
Dead code elimination in this case is trivial to do for any modern optimizing compiler. I would definitely rely on it, given that optimizations are turned on and you are absolutely sure that the compiler can prove that the value is zero at the moment of check.
However, you should be aware that sometimes your code has more potential side effects than you think.
The first source of problems is calling non-inlined functions. Whenever you call a function which is not inlined (e.g. because its definition is located in another translation unit), the compiler assumes that all global variables and the whole contents of the heap may change inside this call. Local variables are the lucky exception, because the compiler knows that it is illegal to modify them indirectly... unless you save the address of a local variable somewhere. For instance, in this case dead code won't be eliminated:
int function_with_unpredictable_side_effects(const int &x);
void doit() {
int k = 0;
function_with_unpredictable_side_effects(k);
if (k)
printf("Never reached\n");
}
So the compiler has to do some work and may fail even for local variables. By the way, I believe the problem which is solved in this case is called escape analysis.
The second source of problems is pointer aliasing: the compiler has to take into account that all sorts of pointers and references in your code may be equal, so changing something via one pointer may change the contents at the other one. Here is one example:
struct MyArray {
int num;
int arr[100];
};
void doit(int idx) {
MyArray x;
x.num = 0;
x.arr[idx] = 7;
if (x.num)
printf("Never reached\n");
}
The Visual C++ compiler does not eliminate the dead code, because it thinks that you may access x.num as x.arr[-1]. It may sound like an awful thing to do to you, but this compiler has been used in the gamedev area for years, and such hacks are not uncommon there, so the compiler stays on the safe side. On the other hand, GCC removes the dead code. Maybe this is related to its exploitation of the strict pointer aliasing rule.
P.S. A const on a reference or pointer parameter (like const int &x above) does not help the optimizer, because the callee may legally cast it away when the referenced object was not itself defined const; in that position it is present in C/C++ only for programmers' convenience.
There is no behaviour that is common across all compilers. But there is a way to explore how different compilers act on a specific piece of code.
Compiler Explorer will help you answer every question about code generation, but of course you must be familiar with assembly language.
We often write functions that have more than one exit point (that is, return in C). At the same time, some general work on exit, such as resource cleanup, should be implemented only once rather than at every exit point. Typically, we can achieve this by using goto, like the following:
void f()
{
...
...{..{... if(exit_cond) goto f_exit; }..}..
...
f_exit:
some general work such as cleanup
}
I think using goto here is acceptable, and I know many people agree on using goto here. Just out of curiosity, does there exist any elegant way for neatly exiting a function without using goto in C?
Why avoid goto?
The problem you want to solve is: How to make sure some common code always gets executed before the function returns to the caller? This is an issue for C programmers, since C does not provide any built in support for RAII.
As you already concede in your question body, goto is a perfectly acceptable solution. Nevertheless, there may be non-technical reasons to avoid using it:
academic exercise
coding standard compliance
personal whim (which I think is what is motivating this question)
There is always more than one way to skin a cat, but elegance as a criterion is too subjective to provide a way to narrow to a single best alternative. You have to decide the best option for yourself.
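For reference, the goto idiom that the alternatives below are measured against might look like the following sketch. Everything in it is invented for illustration: concat_checked, the tracked_alloc/tracked_free helpers and the live_buffers counter exist only to make it visible that every exit path passes through the same cleanup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping, only here to make leaks observable. */
static int live_buffers = 0;

static void *tracked_alloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL)
        live_buffers++;
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) {
        live_buffers--;
        free(p);
    }
}

/* Writes a followed by b into out. Returns 0 on success, -1 on failure. */
int concat_checked(const char *a, const char *b, char *out, size_t out_size) {
    int    result  = -1;
    char  *scratch = NULL;
    size_t need;

    if (a == NULL || b == NULL || out == NULL)
        goto done;
    need = strlen(a) + strlen(b) + 1;
    scratch = tracked_alloc(need);
    if (scratch == NULL)
        goto done;
    if (need > out_size)
        goto done;              /* early exit while scratch is allocated */
    strcpy(scratch, a);
    strcat(scratch, b);
    memcpy(out, scratch, need);
    result = 0;

done:
    tracked_free(scratch);      /* single cleanup point for every path */
    return result;
}
```

After any mix of successful and failing calls, live_buffers is back at zero, which is the property each of the goto-free variants has to reproduce.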
Explicitly calling a cleanup function
To avoid an explicit jump (e.g., goto or break), common cleanup code can be encapsulated within a function and explicitly called at the point of early return.
int foo () {
...
if (SOME_ERROR) {
return foo_cleanup(SOME_ERROR_CODE, ...);
}
...
}
(This is similar to another posted answer, that I only saw after I initially posted, but the form shown here can take advantage of sibling call optimizations.)
Some people feel explicitness is more clear, and therefore more elegant. Others feel the need to pass cleanup arguments to the function to be a major detractor.
Add another layer of indirection.
Without changing the semantics of the user API, change its implementation into a wrapper composed of two parts. Part one performs the actual work of the function. Part two performs the cleanup necessary after part one is done. If each part is encapsulated within its own function, the wrapper function has a very clean implementation.
struct bar_stuff {...};
static int bar_work (struct bar_stuff *stuff) {
...
if (SOME_ERROR) return SOME_ERROR_CODE;
...
}
int bar () {
struct bar_stuff stuff = {0};
int r = bar_work(&stuff);
return bar_cleanup(r, &stuff);
}
The "implicit" nature of the cleanup from the point of view of the function that performs the work may be viewed favorably by some. Some potential code bloat is also avoided by only calling the cleanup function from a single place. Some argue that "implicit" behaviors are "tricky", and therefore more difficult to understand and maintain.
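A filled-in version of that wrapper could look roughly like this. The struct members, the table_len parameter and the error codes are assumptions made up for the sketch; only the three-function shape comes from the outline above.

```c
#include <stdlib.h>
#include <string.h>

/* Everything that needs cleanup lives in one struct (members are hypothetical). */
struct bar_stuff {
    char *name;
    int  *table;
};

/* Part one: the actual work. It returns early without cleaning up. */
static int bar_work(struct bar_stuff *stuff, int table_len) {
    stuff->name = malloc(16);
    if (stuff->name == NULL)
        return -1;
    strcpy(stuff->name, "bar");

    if (table_len <= 0)
        return -2;                      /* early return, name still allocated */

    stuff->table = calloc((size_t)table_len, sizeof *stuff->table);
    if (stuff->table == NULL)
        return -1;
    return 0;
}

/* Part two: cleanup, written once. free(NULL) is a no-op, so it does not
 * matter how far bar_work got. The result code is passed straight through. */
static int bar_cleanup(int r, struct bar_stuff *stuff) {
    free(stuff->table);
    free(stuff->name);
    stuff->table = NULL;
    stuff->name  = NULL;
    return r;
}

/* The public function keeps its signature and just glues the parts together. */
int bar(int table_len) {
    struct bar_stuff stuff = {0};
    int r = bar_work(&stuff, table_len);
    return bar_cleanup(r, &stuff);
}
```

The zero-initialised struct is what makes this safe: any member bar_work never reached is still NULL when bar_cleanup runs.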
Miscellaneous...
More esoteric solutions using setjmp()/longjmp() can be considered, but using them correctly can be difficult. There are open-source wrappers that implement try/catch exception handling style macros over them (for example, cexcept), but you have to change your coding style to use that style for error handling.
One could also consider implementing the function like a state machine. The function tracks progress through each state, an error causes the function to short circuit to the cleanup state. This style is usually reserved for particularly complex functions, or functions that need to be retried later and be able to pick up from where they left off.
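As a rough sketch of the state-machine style (the states and the function are invented for illustration, and the job itself is deliberately trivial), any failure simply sets the next state to the cleanup state:

```c
#include <stdlib.h>

enum step { STEP_ALLOC, STEP_FILL, STEP_SUM, STEP_CLEANUP, STEP_DONE };

/* Sums 1..n through a temporary array; returns -1 on failure. */
long sum_via_states(int n) {
    enum step state  = STEP_ALLOC;
    int      *values = NULL;
    long      result = -1;

    while (state != STEP_DONE) {
        switch (state) {
        case STEP_ALLOC:
            values = (n > 0) ? malloc((size_t)n * sizeof *values) : NULL;
            /* an error short-circuits straight to the cleanup state */
            state  = (values != NULL) ? STEP_FILL : STEP_CLEANUP;
            break;
        case STEP_FILL:
            for (int i = 0; i < n; i++)
                values[i] = i + 1;
            state = STEP_SUM;
            break;
        case STEP_SUM:
            result = 0;
            for (int i = 0; i < n; i++)
                result += values[i];
            state = STEP_CLEANUP;
            break;
        case STEP_CLEANUP:
            free(values);           /* runs exactly once on every path */
            state = STEP_DONE;
            break;
        case STEP_DONE:
            break;
        }
    }
    return result;
}
```

For a function this small the machinery is clearly overkill; it starts to pay off when the state has to be stored so the function can be resumed later.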
Do as the Romans do.
If you need to comply with coding standards, then the best approach is to follow whatever technique is most prevalent in the existing code base. This applies to almost all aspects of making changes to an existing stable source code base. It would be considered disruptive to introduce a new coding style. You should seek approval from the powers that be if you feel a change would dramatically improve some aspect of the software. Otherwise, as "elegance" is subjective, arguing for the sake of "elegance" is not going to get you anywhere.
For example
void f()
{
do
{
...
...{..{... if(exit_cond) break; }..}..
...
} while ( 0 );
some general work such as cleanup
}
Or you could use the following structure
while ( 1 )
{
//...
}
The main advantage of the structural approach contrary to using goto statements is that it introduces a discipline in writing code.
I am sure, and have enough experience to say, that if a function has one goto statement, then over time it will have several goto statements. :)
I've seen a lot of solutions for how to do this and they tend to be obscure, unreadable and ugly to some degree.
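A complete instance of the do { ... } while (0) pattern might look like the sketch below. The function is_palindrome_ci is made up for the example; the point is that every break lands on the single free() after the block.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive palindrome check.
 * Returns 1 or 0, or -1 if s is NULL or memory runs out. */
int is_palindrome_ci(const char *s) {
    int   result  = -1;
    char *scratch = NULL;

    do {
        if (s == NULL)
            break;                      /* jumps to the cleanup below */
        size_t len = strlen(s);
        scratch = malloc(len + 1);
        if (scratch == NULL)
            break;                      /* jumps to the cleanup below */
        for (size_t i = 0; i < len; i++)
            scratch[i] = (char)tolower((unsigned char)s[i]);
        result = 1;
        for (size_t i = 0; i < len / 2; i++) {
            if (scratch[i] != scratch[len - 1 - i]) {
                result = 0;
                break;                  /* leaves only this for loop */
            }
        }
    } while (0);

    free(scratch);                      /* general work done once */
    return result;
}
```

Note the last break: inside a nested loop it no longer reaches the end of the do block directly, which is the main limitation of this approach.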
I personally think the least ugly way is this:
int func (void)
{
if(some_error)
{
cleanup();
return result;
}
...
if(some_other_error)
{
cleanup();
return result;
}
...
cleanup();
return result;
}
Yes, it uses two rows of code instead of one. So? It is clear, readable, maintainable. This is a perfect example of where you have to fight your knee-jerk reflexes against code repetition and use common sense. The cleanup function is written only once, all clean up code is centralized there.
I think the question is very interesting, but cannot be answered without being influenced by subjectivity because elegance is subjective. My ideas on it are as follows: In general, what you want to do in the scenario you describe, is to prevent control from passing through a series of statements along the execution path. Other languages would do this by raising an exception, which you would have to catch.
I had already written down neat hacks to do what you want to do with pretty much every control statement there is in C, sometimes in combination, but I think they are all just very obscure ways of expressing the idea of skipping to a special point. Instead I'll just make my point on how we arrive at a point where goto can be preferable: Once again, what you want to express using goto is that something has occurred that prevents following the regular execution path. Something that is not just a regular condition that can be handled by taking a different branch down the path, but something that makes it impossible to use the path to the regular return point in a safe way in the current state. I think there are three options to proceed at that point:
return through a conditional clause
goto an error-label
every statement that could fail is inside a conditional statement, and regular execution is considered a series of conditional operations.
If your cleanup is similar enough on every possible emergency exit I would prefer the goto, because writing the code redundantly just clutters the function. I think you should trade the number of return points and replicated clean-up code that you create against the awkwardness of using a goto. Both solutions should be accepted as a personal choice of the programmer, unless there are severe reasons for not doing so, e.g. you agreed that all functions must have a single exit. However, whichever you choose should be applied consistently across the code. The third alternative is - imo - the less readable cousin of the goto, because in the end you will skip to a set of cleanup routines - possibly enclosed by else-statements too - but it makes it much harder for humans to follow the regular flow of your program, due to the deep nesting of conditional statements.
tl;dr: I think choosing between conditional return and goto based on consistent style decisions is the most elegant way, because it is the most expressive way to represent your ideas behind the code and clarity is elegance.
I guess that for you elegant may mean weird, and that you simply want to avoid the goto keyword, so....
You might consider using setjmp(3) and longjmp :
#include <setjmp.h>
void foo() {
    jmp_buf jb;
    if (setjmp(jb) == 0) {
        some_stuff();
        //// etc...
        if (bad_thing()) {
            longjmp(jb, 1);
        }
    }
    /* both the normal path and the longjmp path continue here */
}
I have no idea if it fits your elegance criteria. (I believe it is not very elegant, but this is only an opinion; however, there is no explicit goto).
However, the interesting thing is that longjmp is a non-local jump : You could have passed (indirectly) jb to some_stuff and have some other routine (e.g. called by some_stuff) do the longjmp. This may become unreadable code (so comment it wisely).
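Here is a hedged sketch of that non-local variant, with made-up names (require_positive, checked_sum). The jmp_buf is handed to a callee, which jumps back on bad input; the caller then runs its cleanup once. Locals that are changed after setjmp and read after the jump are declared volatile, since otherwise their values are indeterminate after longjmp.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical callee: bails out through the caller's jmp_buf.
 * jmp_buf is an array type, so the parameter is really a pointer to it. */
static void require_positive(int value, jmp_buf on_error) {
    if (value <= 0)
        longjmp(on_error, 1);
}

/* Sums the values, or returns -1 if any value is not positive. */
int checked_sum(const int *values, int count) {
    jmp_buf       jb;
    int *volatile total  = NULL;   /* heap accumulator, only so there is something to free */
    volatile int  result = -1;

    if (setjmp(jb) == 0) {
        total = malloc(sizeof *total);
        if (total != NULL) {
            *total = 0;
            for (int i = 0; i < count; i++) {
                require_positive(values[i], jb);   /* may not return */
                *total += values[i];
            }
            result = *total;
        }
    }
    /* reached both on normal completion and after longjmp */
    free(total);
    return result;
}
```

The jump must never target a function that has already returned, which is one reason this technique is easy to get wrong.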
Even uglier than longjmp : use (on Linux) setcontext(3)
Read about continuations and exceptions (and the call/cc operation in Scheme).
And of course, the standard exit(3) is an elegant (and useful) way to get out of some function. You could sometimes play a neat trick by also using atexit(3).
BTW, Linux kernel code uses goto quite often, including in some code which is considered elegant.
My point is: IMHO don't be fanatic against gotos, since there are cases where using one (with care) is in fact elegant.
I'm a fan of:
void foo(exp)
{
if( ate_breakfast(exp)
&& tied_shoes(exp)
&& finished_homework(exp)
)
{
good_to_go(exp);
}
else
{
fix_the_problems(exp);
}
}
Where ate_breakfast, tied_shoes, and finished_homework take a pointer to exp that they work on, and return bools indicating whether that particular test passed.
It helps to remember that short circuit evaluation is at work here - Which may qualify as a code smell to some people, but like everybody else has been saying, elegance is somewhat subjective.
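A tiny sketch makes the short-circuit behaviour visible. The helper step and the counter are invented for the demonstration: each call is counted, so you can see that nothing after the first failing test is evaluated.

```c
#include <stdbool.h>

static int calls = 0;          /* hypothetical counter for the demo */

/* Stand-in for ate_breakfast() and friends: records that it ran. */
static bool step(bool ok) {
    calls++;
    return ok;
}

/* Returns how many of the three steps were actually evaluated. */
int steps_run(bool a, bool b, bool c) {
    calls = 0;
    if (step(a) && step(b) && step(c)) {
        /* good_to_go() would be called here */
    } else {
        /* fix_the_problems() would be called here */
    }
    return calls;
}
```

With the second test failing, steps_run(true, false, true) evaluates only two steps; the third is never called, so it must not be relied on for side effects.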
A goto statement is never necessary; it is also easier to write code without using it.
Instead you can use a cleanup function and return.
if(exit_cond) {
clean_the_mess();
return;
}
Or you can break, as Vlad mentioned above. But one drawback of that is that if your loops are deeply nested, break will only exit from the innermost loop.
For example:
while ( exp ) {
    for (exp; exp; exp) {
        for (exp; exp; exp) {
            if(exit_cond) {
                clean_the_mess();
                break;
            }
        }
    }
}
will only exit from the inner for loop and doesn't abandon the whole process.
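The effect is easy to measure. In this sketch (both function names and the fixed 3x3 grid are assumptions for the example) the first version uses break and keeps scanning the remaining rows, while the second puts the loops in a function of their own so that return leaves all of them at once.

```c
/* Counts visited cells while searching for `target`.
 * break leaves only the inner loop, so later rows are still scanned. */
int visits_with_break(int grid[3][3], int target) {
    int visited = 0;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            visited++;
            if (grid[r][c] == target)
                break;              /* inner loop only */
        }
    }
    return visited;
}

/* Same search, but return abandons every enclosing loop immediately. */
int visits_with_return(int grid[3][3], int target) {
    int visited = 0;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            visited++;
            if (grid[r][c] == target)
                return visited;     /* leaves both loops */
        }
    }
    return visited;
}
```

Moving nested loops into a helper function and returning from it is therefore one goto-free way to get a full exit.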
I'm writing a Scheme interpreter. For each built-in type (integer, character, string, etc) I want to have the read and print functions named consistently:
READ_ERROR Scheme_read_integer(FILE *in, Value *val);
READ_ERROR Scheme_read_character(FILE *in, Value *val);
I want to ensure consistency in the naming of these functions
#define SCHEME_READ(type_) Scheme_read_##type_
#define DEF_READER(type_, in_strm_, val_) READ_ERROR SCHEME_READ(type_)(FILE *in_strm_, Value *val_)
So that now, instead of the above, in code I can write
DEF_READER(integer, in, val)
{
// Code here ...
}
DEF_READER(character, in, val)
{
// Code here ...
}
and
if (SOME_ERROR != SCHEME_READ(integer)(stdin, my_value)) do_stuff(); // etc.
Now is this considered an unidiomatic use of the preprocessor? Am I shooting myself in the foot somewhere unknowingly? Should I instead just go ahead and use the explicit names of the functions?
If not, are there examples in the wild of this sort of thing done well?
I've seen this done extensively in a project, and there's a severe danger of foot-shooting going on.
The problem happens when you try to maintain the code. Even though your macro-ized function definitions are all neat and tidy, under the covers you get function names like Scheme_read_integer. Where this can become an issue is when something like Scheme_read_integer appears on a crash stack. If someone does a search of the source pack for Scheme_read_integer, they won't find it. This can cause great pain and gnashing of teeth ;)
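To see the mismatch concretely, here is a small sketch built on the macros from the question. The READ_ERROR and Value definitions, the placeholder body and the last_reader variable are stand-ins invented for the example. __func__ reports the name the compiler actually generated, which is the name a debugger or crash stack would show.

```c
#include <stdio.h>
#include <string.h>

typedef int READ_ERROR;                 /* stand-in for the real type */
typedef struct { long number; } Value;  /* stand-in for the real type */

#define SCHEME_READ(type_) Scheme_read_##type_
#define DEF_READER(type_, in_strm_, val_) \
    READ_ERROR SCHEME_READ(type_)(FILE *in_strm_, Value *val_)

static const char *last_reader = "";    /* records which function ran */

DEF_READER(integer, in, val)
{
    (void)in;                 /* unused in this placeholder body */
    val->number = 42;
    last_reader = __func__;   /* the pasted-together function name */
    return 0;
}
```

After calling SCHEME_READ(integer)(...), last_reader holds the full generated name, yet searching the source for that name finds no definition, only the DEF_READER(integer, ...) line.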
If you're the only developer, and the code base isn't that big, and you remember using this technique years down the road and/or it's well documented, you may not have an issue. In my case it was a very large code base, poorly documented, with none of the original developers around. The result was much tooth-gnashing.
I'd go out on a limb and suggest using a C++ template, but I'm guessing that's not an option since you specifically mentioned C.
Hope this helps.
I'm usually a big fan of macros, but you should probably consider inlined wrapper functions instead. They will add negligible runtime overhead and will appear in stack backtraces, etc., when you're debugging.
I'm a little bit new to C so I'm not familiar with how I would approach a solution to this issue. As you read on, you will notice it's not critical that I find a solution, but it sure would be nice for this application and future reference. :)
I have a parameter int hello and I want to make a synonymous copy of its negation, !hello.
f(int hello, structType* otherParam){
// I would like to have a synonom for (!hello)
}
My first thought was to make a local constant, but I'm not sure if there will be additional memory consumption. I'm building with GCC and I really don't know if it would recognize a constant of a parameter (before any modifications) as just a synonymous variable. I don't think so, because the parameter could (even though it won't) be changed later on in that function, which would not affect the constant.
I then thought about making a local typedef, but I'm not sure exactly the syntax for doing so. I attempted the following:
typedef (!hello) hi;
However I get the following error.
D:/src-dir/file.c: In function 'f':
D:/src-dir/file.c: 00: error: expected identifier or '(' before '!' token
Any help is appreciated.
In general, in C, you want to write the code that most clearly expresses your intentions, and allow the optimiser to figure out the most efficient way to implement that.
In your example of a frequently-reused calculation, storing the result in a const-qualified variable is the most appropriate way to do this - something like the following:
void f(int hello)
{
const int non_hello = !hello;
/* code that uses non_hello frequently */
}
or more likely:
void x(structType *otherParam)
{
    char * const d_name = otherParam->b->c->d->name;
    /* code that uses d_name frequently */
}
Note that such a const variable does not necessarily have to be allocated any memory (unless you take its address with & somewhere) - the optimiser might simply place it in a register (and bear in mind that even if it does get allocated memory, it will likely be stack memory).
Typedef defines an alias for a type, it's not what you want. So..
Just use !hello where you need it
Why would you need a "synonym" for a !hello ? Any programmer would instantly recognize !hello instead of looking for your clever trick for defining a "synonym".
Given:
f(int hello, structType* otherParam){
// I would like to have a synonom for (!hello)
}
The obvious, direct answer to what you have here would be:
f(int hello, structType *otherParam) {
int hi = !hello;
// ...
}
I would not expect to see any major (or probably even minor) effect on execution speed from this. Realistically, there probably isn't a lot of room for improvement in the execution speed.
There are certainly times something like this can make the code more readable. Also note, however, that when/if you modify the value of hello, the value of hi will not be modified to match (unless you add code to update it). It's rarely an issue, but something to remain aware of nonetheless.
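That caveat fits in a few lines. In this sketch (stale_copy_demo is a made-up name) the copy is taken once and then compared against a fresh !hello after the parameter has been changed.

```c
/* Returns 1 if the saved copy still equals !hello after hello is
 * modified, 0 otherwise. */
int stale_copy_demo(int hello) {
    const int hi = !hello;   /* snapshot of the negation, taken once */
    hello = !hello;          /* the parameter changes afterwards */
    return hi == !hello;     /* the snapshot was not updated */
}
```

The function returns 0 for every input, because hi keeps the value computed before the assignment.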
Sometimes I have to write code that alternates between doing things and checking for error conditions (e.g., call a library function, check its return value, keep going). This often leads to long runs where the actual work is happening in the conditions of if statements, like
if(! (data = (big_struct *) malloc(sizeof(*data)))){
//report allocation error
} else if(init_big_struct(data)){
//handle initialization error
} else ...
How do you guys write this kind of code? I've checked a few style guides, but they seem more concerned with variable naming and whitespace.
Links to style guides welcome.
Edit: in case it's not clear, I'm dissatisfied with the legibility of this style and looking for something better.
Though it pains me to say it, this might be a case for the never-popular goto. Here's one link I found on the subject: http://eli.thegreenplace.net/2009/04/27/using-goto-for-error-handling-in-c/
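Applied to the allocation example from the question, the goto style could look something like this sketch. The layout of big_struct, the count parameter and the bodies of init_big_struct and destroy_big_struct are all assumptions, since the question does not show them. Each label undoes exactly the steps that had succeeded before the failure.

```c
#include <stdlib.h>

/* Hypothetical layout; the real big_struct is not shown in the question. */
typedef struct {
    int *items;
    int  count;
} big_struct;

/* Hypothetical initializer: returns 0 on success, nonzero on failure. */
static int init_big_struct(big_struct *data, int count) {
    if (count <= 0)
        return 1;
    data->items = calloc((size_t)count, sizeof *data->items);
    if (data->items == NULL)
        return 1;
    data->count = count;
    return 0;
}

/* Returns a ready-to-use big_struct, or NULL after undoing partial work. */
big_struct *create_big_struct(int count) {
    big_struct *data = malloc(sizeof *data);
    if (data == NULL)
        goto fail;                  /* nothing to undo yet */
    if (init_big_struct(data, count) != 0)
        goto fail_free_data;        /* undo the malloc */
    return data;

fail_free_data:
    free(data);
fail:
    return NULL;
}

void destroy_big_struct(big_struct *data) {
    if (data != NULL) {
        free(data->items);
        free(data);
    }
}
```

The happy path reads top to bottom without nesting, and adding a third step means adding one more check and one more label.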
I usually write that code in this way:
data = (big_struct *) malloc(sizeof(*data));
if(!data){
//report allocation error
return ...;
}
err = init_big_struct(data);
if(err){
//handle initialization error
return ...;
}
...
This way I avoid calling functions inside the if condition, and debugging is easier because you can inspect the return values.
Don't use assert in production code.
In debug mode, assert should never be used for something that can actually happen (like malloc returning NULL); rather, it should be used for cases that should be impossible (like an array index being out of bounds).
Read this post for more.
One method which I have used to great effect is the one used by W. Richard Stevens in Unix Network Programming (code is downloadable here). For common functions which he expects to succeed all the time, and for which he has no recourse on failure, he wraps them, using a capital letter (code compressed vertically):
void * Malloc(size_t size) {
void *ptr;
if ( (ptr = malloc(size)) == NULL)
err_sys("malloc error");
return(ptr);
}
err_sys here displays the error and then performs an exit(1). This way you can just call Malloc and know that it will error out if there is a problem.
UNP continues to be the only book I've seen where I think the author has code which checks the return values of all the functions which can fail. Every other book says "you should check the return values, but we'll leave that for you to do later".
I tend to
Delegate error checking to wrapper functions (like Stevens)
On error, simulate exceptions using longjmp. (I actually use Dave Hanson's C Interfaces and Implementations to simulate exceptions.)
Another option is to use Don Knuth's literate programming to manage the error-handling code, or some other kind of preprocessor. This option is available only if you get to set the rules for your shop :-)
The only grouping property of code like this is that there simply is an externally imposed sequence that it has to follow. This is why you put these allocations into one function, but this is a very weak commonality. Why some people recommend abandoning the scope advantages of nested if's is beyond my understanding. You are effectively trying to put lipstick on a pig (no insult intended) - the code's nature will never yield anything clean; the best you can do is to use the compiler's help to catch (maintenance) errors. Stick with the if's IMHO.
PS: if I haven't convinced you yet: what will the goto-solution look like if you have to take ternary decisions? The if's will get uglier for sure, but the goto's???