C Programming: how to avoid code duplication without losing clarity

edit: Thanks to all repliers. I should have mentioned in my original post that I am not allowed to change any of the specifications of these functions, so solutions using assertions and/or allowing to dereference NULL are out of the question.
With this in mind, I gather that it's either I go with a function pointer, or just leave the duplication as it is. For the sake of clarity I'd like to avoid function pointers this time.
original:
I am trying to avoid code duplication without losing clarity.
Often, when working on a specific assignment (Uni - undergrad), I recognize these patterns in functions, but I don't always find a "great-job" solution.
What would any of you suggest I should do (pointers to functions, macros, etc.) with these three C functions that check some of their arguments in the same way to make the checking more modular (it should be more modular, right?)?
BTW these are taken directly from a HW assignment, so the details of their functionality are not concerning my question, only the arguments checking at the function's top.
TeamResult teamIsDuplicateCoachName(Team team, bool* isDuplicate) {
TeamResult result = TEAM_SUCCESS;
if (!team || !isDuplicate) {
result = TEAM_NULL_ARGUMENT;
} else if (teamEmpty(team)) {
result = TEAM_IS_EMPTY;
} else {
for (int i = 0; i < team->currentFormations; ++i) {
if (teamIsPlayerInFormation(team->formations[i], team->coach)) {
*isDuplicate = true;
break;
}
}
}
return result;
}
TeamResult teamGetWinRate(Team team, double* winRate) {
TeamResult result = TEAM_SUCCESS;
if (!team || !winRate) {
result = TEAM_NULL_ARGUMENT;
} else {
int wins = 0, games = 0;
for (int i = 0; i < team->currentFormations; ++i) {
Formation formation = team->formations[i];
if (formationIsComplete(formation)) {
games += formation->timesPlayed;
wins += formation->timesWon;
}
}
double win = ( games == 0 ) ? 0 : (double) wins / games;
assert(win >= 0 && win <= 1);
*winRate = win;
}
return result;
}
TeamResult teamGetNextIncompleteFormation(Team team, Formation* formation,
int* index) {
TeamResult result = TEAM_SUCCESS;
if (!team || !formation || !index) {
result = TEAM_NULL_ARGUMENT;
} else {
*formation = NULL; /* default result, will be returned if there are no incomplete formations */
for (int i = 0; i < team->currentFormations; ++i) {
Formation formationPtr = team->formations[i];
if (!formationIsComplete(formationPtr)) {
*formation = formationPtr;
*index = i;
break;
}
}
}
return result;
}
Any advice on how (specifically) to avoid the code duplication would be appreciated.
Thanks for your time! :)

It looks like it's a coding mistake to pass nulls to these functions. There are three main ways to deal with this situation:
1. Handle the erroneous nulls and return an error value. This introduces extra code which checks the arguments to return error values, and extra code around every call site, which now has to handle the error return values. Probably none of this code is tested, since if you knew that code was mistakenly passing nulls you'd just fix it.
2. Use assert to check the validity of arguments, resulting in a clean error message and clear-to-read preconditions, at the cost of some extra code.
3. Have no precondition checks, and debug segfaults when you dereference a NULL.
In my experience 3 is usually the best approach. It adds zero extra code, and a segfault is usually just as easy to debug as the clean error message you'd get from 2. However, you'll find many software engineers who would prefer 2, and it's a matter of taste.
Your code, which is pattern 1, has some significant downsides. First, it's adding extra code which can't be optimised away. Second, more code means more complexity. Third, it's unclear if the functions are supposed to be able to accept broken arguments, or if the code's just there to help debugging when things are wrong.

I would create a function to check the team object:
TeamResult TeamPtrCheck(Team team)
{
if (team == NULL)
return TEAM_NULL_ARGUMENT;
else if (teamEmpty(team))
return TEAM_IS_EMPTY;
else
return TEAM_SUCCESS;
}
And then call that, plus your other checks, at the top of each function, for example:
TeamResult result = TeamPtrCheck(team);
if (result != TEAM_SUCCESS)
return result;
if (winRate == NULL)
return TEAM_NULL_ARGUMENT;
Otherwise, if each function is different then leave the checks as different!
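To see how this composes, here is a minimal compilable sketch. The struct layout, the enum values, and the function teamCountFormations are stand-ins invented for illustration (the assignment's real headers are not shown in the question); only the shape of the checks follows the answer above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in definitions (assumptions): the real assignment headers are not shown. */
typedef enum { TEAM_SUCCESS, TEAM_NULL_ARGUMENT, TEAM_IS_EMPTY } TeamResult;
typedef struct team_t { int currentFormations; } *Team;

static bool teamEmpty(Team team) { return team->currentFormations == 0; }

/* Shared check: NULL first, then emptiness. */
static TeamResult TeamPtrCheck(Team team) {
    if (team == NULL) return TEAM_NULL_ARGUMENT;
    if (teamEmpty(team)) return TEAM_IS_EMPTY;
    return TEAM_SUCCESS;
}

/* Hypothetical caller: reports the number of formations, reusing the shared check. */
TeamResult teamCountFormations(Team team, int *count) {
    TeamResult result = TeamPtrCheck(team);
    if (result != TEAM_SUCCESS) return result;
    if (count == NULL) return TEAM_NULL_ARGUMENT;
    *count = team->currentFormations;
    return TEAM_SUCCESS;
}
```

One thing to watch: with this ordering, an empty team plus a NULL output pointer yields TEAM_IS_EMPTY, whereas the original functions would report TEAM_NULL_ARGUMENT. If the specification fixes the priority of the error codes, check the output pointer before calling the helper.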

If you are concerned about the duplication of the NULL checks at the start of each function, I wouldn't be. It makes it clear to the user that you are simply doing input validation prior to doing any work. No need to worry about the few lines.
In general, don't sweat the small stuff like this.

There are a few techniques to reduce the redundancy you perceive; which one is applicable depends heavily on the nature of the condition you are checking. In any case, I would advise against any (preprocessor) tricks to reduce duplication which hide what is actually happening.
If you have a condition that should not happen, one concise way to check for it is to use an assert. With an assert you basically say: This condition must be true, otherwise my code has a bug, please check if my assumption is true, and kill my program immediately if it's not. This is often used like this:
#include <assert.h>
void foo(int a, int b) {
assert((a < b) && "some error message that should appear when the assert fails (a failing assert prints its argument)");
//do some sensible stuff assuming a is really smaller than b
}
A special case is the question whether a pointer is null. Doing something like
void foo(int* bar) {
assert(bar);
*bar = 3;
}
is pretty pointless, because dereferencing a null pointer will reliably segfault your program on any sane platform, so the following will just as reliably stop your program:
void foo(int* bar) {
*bar = 3;
}
Language lawyers may not be happy with what I'm saying because, according to the standard, dereferencing a null pointer is undefined behaviour, and technically the compiler would be allowed to produce code that formats your hard drive. However, dereferencing a null pointer is such a common error that you can expect your compiler not to do stupid things with it, and you can expect your system to take special care to ensure that the hardware will scream if you try to do it. This hardware check comes for free; the assert takes a few cycles to check.
The assert (and segfaulting null pointers), however, is only suitable for checking for fatal conditions. If you are just checking for a condition that makes any further work inside a function pointless, I would not hesitate to use an early return. It is usually much more readable, especially since syntax highlighting readily reveals the return statements to the reader:
void foo(int a, int b) {
if(a >= b) return;
//do something sensible assuming a < b
}
With this paradigm, your first function would look like this:
TeamResult teamIsDuplicateCoachName(Team team, bool* isDuplicate) {
if(!team || !isDuplicate) return TEAM_NULL_ARGUMENT;
if(teamEmpty(team)) return TEAM_IS_EMPTY;
for (int i = 0; i < team->currentFormations; ++i) {
if (teamIsPlayerInFormation(team->formations[i], team->coach)) {
*isDuplicate = true;
break;
}
}
return TEAM_SUCCESS;
}
I believe this is much clearer and more concise than the version with the if around the body.

This is more or less a design question. If the functions above are all static functions (or only one is extern), then the whole "bundle of functions" should check the condition once per object along the execution flow, and the lower-level functions can assume that their input data is valid.
For example, if you go back to wherever the team is created, allocated and initialized, and wherever the formation is created, allocated and initialized, and build rules there that ensure that every created team exists and that no duplicate exists, you will not have to validate the input, because by definition/construction it will always be valid. These are examples of preconditions. Invariants would be the persistence of the truthfulness of these definitions (no function may alter invariant states upon return), and postconditions would be somewhat the opposite (for example, when the objects are freed but pointers to them still exist somewhere).
That being said, when manipulating "object-like" data in C, my personal preference is to create extern functions that create, return and destroy such objects. If the members are kept static within the .c files with a minimal .h interface, you get something conceptually similar to object-oriented programming (though you can never make members fully "private").

Thanks to all repliers. I should have mentioned in my original post that I am not allowed to change any of the specifications of these functions, so solutions using assertions and/or allowing to dereference NULL are out of the question, though I'll consider them for other occasions.
With this in mind, I gather that it's either I go with a function pointer, or just leave the duplication as it is. For the sake of clarity I'd like to avoid function pointers this time.

Related

scanf inside function to return value (or other function)

So I was going to run a function in an infinite loop which takes a number input, but then I remembered I couldn't do
while (true) {
myfunc(scanf("%d"));
}
because I need to put the scanf input into a variable. I can't do scanf("%*d") because that doesn't give me the value at all. I don't want to have to do
int temp;
while (true) {
scanf("%d", &temp);
myfunc(temp);
}
or include more libraries. Is there any standard single function like gets? (I could do myfunc((int) strtol(gets(), (char**) NULL, 10)); but it's kinda messy, so yeah.)
Sorry if I'm asking too much or being pedantic and I should just do ^
BTW, unrelated question: is there any way to declare a string as an int, or even better, a single function for converting int to string? I usually use
//num is some number
char* str = (char*) malloc(12);
sprintf(str, "%d", num);
func(str);
but wouldn't func(str(num)); be easier?

For starters, the return value of scanf (and similar functions) is the number of conversions that took place. That return value is also used to signify if an error occurred.
In C you must manually manage these errors.
if ((retv = scanf("%d", &n)) != 1) {
/* Something went wrong. */
}
What you seem to be looking for are conveniences found in higher-level languages. Languages & runtimes that can hide the details from you with garbage collection strategies, exception nets (try .. catch), etc. C is not that kind of language, as by today's standards it is quite a low-level language. If you want "non-messy" functions, you will have to build them up from scratch, but you will have to decide what kinds of tradeoffs you can live with.
For example, perhaps you want a simple function that just gets an int from the user. A tradeoff you could make is that it simply returns 0 on any error whatsoever, in exchange for never knowing if this was an error, or the user actually input 0.
int getint(void) {
int n;
if (scanf("%d", &n) != 1)
return 0;
return n;
}
This means that if a user makes a mistake on input, you have no way of retrying, and the program must simply roll on ahead.
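If losing the error is not acceptable, one alternative tradeoff is to return a status and hand the value back through a pointer. The sketch below is only an illustration of that idea: parse_int is a hypothetical helper (not a standard function) that parses text with strtol, so it can be fed a line read with fgets.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal int from text.
 * Returns true and stores the value in *out on success;
 * returns false (leaving *out untouched) on any error. */
bool parse_int(const char *text, int *out) {
    char *end;
    long value;

    if (text == NULL || out == NULL) return false;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text) return false;                  /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    while (*end == ' ' || *end == '\n') ++end;      /* tolerate trailing blanks/newline */
    if (*end != '\0') return false;                 /* trailing garbage */

    *out = (int)value;
    return true;
}
```

A caller can then read a line with fgets, call parse_int on it, and either pass the result to myfunc or ask the user again, which is the retry that getint above cannot offer.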
This naive approach scales poorly with the fact that you must manually manage memory in C. It is up to you to free any memory you dynamically allocate.
You could certainly write a simple function like
char *itostr(int n) {
char *r = malloc(12);
if (r && sprintf(r, "%d", n) < 1) {
r[0] = '0';
r[1] = '\0';
}
return r;
}
which does the most minimal of error checking (Again, we don't know if "0" is an error, or a valid input).
The problem comes when you write something like func(itostr(51));, unless func is to be expected to free its argument (which would rule out passing non-dynamically allocated strings), you will constantly be leaking memory with this pattern.
So no there is no real "easy" way to do these things. You will have to get "messy" (handle errors, manage memory, etc.) if you want to build anything with complexity.
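One tradeoff that sidesteps the leak is to let the caller own the buffer. The following is a sketch under that assumption; int_to_str is a hypothetical name, not a standard function.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format n into a caller-supplied buffer.
 * Returns buf on success, or NULL if the buffer is missing or too small. */
const char *int_to_str(int n, char *buf, size_t size) {
    int written;

    if (buf == NULL || size == 0) return NULL;
    written = snprintf(buf, size, "%d", n);
    if (written < 0 || (size_t)written >= size) return NULL;
    return buf;
}
```

With a local char buf[12]; (enough for a 32-bit int, sign and terminator included), a call such as func(int_to_str(51, buf, sizeof buf)) needs no free. The price is that func has to cope with a NULL argument, and the string is only valid while buf is.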

How to check the formal parameters at the entry of a function and ensure that the function has only one return statement?

This is just to meet MISRA C:2012 Rule 15.5. My function must have only one return statement, but I also want to check its parameters. When the parameter check fails, the function can end right away, since there is no need to execute the code that follows. The code that does not meet the rule is as follows:
int32_t func(struct st *p, uint8_t num)
{
if ((NULL == p) || (num > MAX_NUM)) {
perror("print some error info");
return -1;
}
// do something
return 0;
}
How to improve it?

Generally when writing MISRA-C compliant code you'd start your function with something like
int32_t result = OK; // whatever "OK" means in your context
Then you change it to an error code if something goes wrong and return it at the end.
Notably, your code would benefit from enum named error codes instead of magic numbers 0, -1 etc. When we have an enum error code type we can make every single function in our API return that same type, then document which return values that are possible per function. Very user-friendly and makes it way more pleasant to write error handles in the calling application.
Now regarding the specific MISRA-C rule, I've been giving this particular one some pretty sour critique over the years, see this. And rightly so if you follow the chain of sources they give as rationale for the rule... So the sound solution might just be to create a permanent deviation against the rule in your coding standard. As noted in comments to that link, the rule was demoted from Required to Advisory in MISRA-C:2012 so you don't even need a formal deviation.
Personally I go with this rule:
Functions should only have one single return statement unless multiple return statements make the code more readable.
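Put together, the result variable and the enum advice from this answer might look like the sketch below. The names status_t and STATUS_*, the struct st layout, and the MAX_NUM value are placeholders chosen for illustration, since the question does not define them, and I am not claiming the snippet has been run through a MISRA checker.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM 8u   /* assumed limit; the question does not give a value */

struct st { uint8_t count; };   /* placeholder layout */

typedef enum {
    STATUS_OK,
    STATUS_NULL_ARGUMENT,
    STATUS_OUT_OF_RANGE
} status_t;

status_t func(struct st *p, uint8_t num)
{
    status_t result = STATUS_OK;

    if (p == NULL) {
        result = STATUS_NULL_ARGUMENT;
    } else if (num > MAX_NUM) {
        result = STATUS_OUT_OF_RANGE;
    } else {
        p->count = num;   /* stands in for "do something" */
    }

    return result;   /* the only return statement */
}
```

Splitting the two failure cases into separate codes also lets the caller tell a NULL pointer from an out-of-range value, which the single -1 in the question cannot do.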

You'll need to store the return code as a separate variable and put the code following the if block into an else:
int32_t func(struct st *p, uint8_t num)
{
int32_t rval;
if ((NULL == p) || (num > MAX_NUM)) {
perror("print some error info");
rval = -1;
} else {
// do something
rval = 0;
}
return rval;
}

MISRA C Rule 15.5 is an Advisory rule.
This means that you do not require a "formal" deviation to justify violations, although you should still document it. The justification of "Code Quality - readability" is appropriate (see MISRA Compliance)
Alternatively, you could use a forward GOTO, although this too breaches an Advisory rule.
So what I'm saying is that, while MISRA makes some recommendations, there are appropriate mechanisms to follow if you feel you need to violate one of those Rules, as long as you can get appropriate sign-off and are fully aware of the consequences.
Blind adherence to a Rule can result in poorer-quality code than controlled violation!
[See profile for affiliation]
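For reference, the forward-goto alternative mentioned in the answer above could look roughly like this. It is only a sketch: copy_bytes is a made-up example function, and as that answer notes, goto itself runs against another Advisory rule.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: duplicate len bytes from src into a new buffer.
 * Returns 0 on success, -1 on failure; *out is set only on success. */
int32_t copy_bytes(const uint8_t *src, size_t len, uint8_t **out)
{
    int32_t rval = -1;
    uint8_t *buf = NULL;

    if ((src == NULL) || (out == NULL) || (len == 0u)) {
        goto done;               /* forward jump to the single exit */
    }

    buf = malloc(len);
    if (buf == NULL) {
        goto done;
    }

    memcpy(buf, src, len);
    *out = buf;                  /* caller takes ownership and must free() it */
    rval = 0;

done:
    return rval;
}
```

The parameter check stays at the top and reads like an early return, while the function still has exactly one return statement.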

Should you check parameters passed into function before passing them, or check them in the function?

As a good practice, do you think one should verify passed parameters within a function to which the parameters are being passed, or simply make sure the function will always accept correct parameters?
Consider the following code:
Matrix * add_matrices(const Matrix * left, const Matrix * right)
{
assert(left->rowsCount == right->rowsCount
&& left->colsCount == right->colsCount);
int rowsCount = left->rowsCount;
int colsCount = left->colsCount;
Matrix * matOut = create_matrix(rowsCount, colsCount);
int i = 0;
int j = 0;
for (i = 0; i < rowsCount; ++i)
{
for (j = 0; j < colsCount; ++j)
{
matOut->matrix[i][j] = left->matrix[i][j] + right->matrix[i][j];
}
}
return matOut;
}
Do you think I should check the parameters before passing them to the function or after, i.e. in the function? What is the better practice, or is it programmer-dependent?

Inside. The function can be viewed as an individual component.
Its author is best placed to define any preconditions and check them.
Checking them outside presupposes the caller knows the preconditions which may not be the case.
Also by placing them inside the function you're assured every call is checked.
You should also check any post-conditions before leaving the function.
For example if you have a function called int assertValid(const Matrix*matrix) that checks integrity of the object (e.g. the data is not a NULL pointer) you could call it on entry to all functions and before returning from functions that modify a Matrix.
Consistent use of pre- and post-condition integrity checks is an enormously effective way of ensuring quality and localising faults.
In practice zealous conformance to this rule usually results in unacceptable performance. The assert() macro or a similar conditional compilation construct is a great asset. See <assert.h>.
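As a rough illustration of checking integrity on entry and before returning, consider the sketch below. The Matrix layout here is a simplified placeholder (a flat array rather than the question's matrix[i][j]), and matrixIsValid and matrixScale are hypothetical names; the validity test returns a flag that is then wrapped in assert, a small variation on the assertValid idea above.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder type: the question's real Matrix layout is only partly shown. */
typedef struct {
    int rowsCount;
    int colsCount;
    double *cells;   /* rowsCount * colsCount values, row-major (assumed) */
} Matrix;

/* Returns 1 if the object looks structurally sound, 0 otherwise. */
int matrixIsValid(const Matrix *m) {
    return m != NULL && m->cells != NULL && m->rowsCount > 0 && m->colsCount > 0;
}

/* Multiplies every cell by factor, checking integrity on entry and exit. */
void matrixScale(Matrix *m, double factor) {
    assert(matrixIsValid(m));                 /* precondition */
    for (int i = 0; i < m->rowsCount * m->colsCount; ++i) {
        m->cells[i] *= factor;
    }
    assert(matrixIsValid(m));                 /* postcondition */
}
```

Because the checks go through assert, compiling with NDEBUG defined removes them, which addresses the performance concern for release builds.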

Depends if the function is global in scope or local static.
A global function cannot control what calls it. Defensive coding will perform validation of the arguments received. But how much validation to do?
int my_abs(int x) {
assert(x >= -INT_MAX);
return abs(x);
}
The above example, in a debug build, checks to ensure the absolute value function will succeed, as abs(INT_MIN) may be a problem. Whether this check should also be in production builds is another question.
int some_string(char *s) {
assert(s != NULL);
...
}
In some_string() the test for NULL-ness may be dropped, as the function definition may state that s must be a string. Even though NULL is not a C string, NULL is only one of many bad pointers that could be passed which do not point to a string, so this test has limited value as validation.
With static functions, the code is under local control. Argument validation could occur by the function, the caller, both or neither. That selection is code dependent.
A counter-example exists with user/file input. Basic data qualification should occur promptly.
int GetDriversAge(FILE *inf) {
int age;
if (fscanf(inf, "%d", &age) != 1) Handle_Error();
if (age < 16 || age > 122) Handle_Error();
return age;
}
In OP's example, parameter checking is done by the function, not the caller. Without the equivalence test, the function can easily fail in mysterious ways. The cost of this check here is a small fraction of the code's work. That makes it a good check as expensive checks (time, complexity) can cause more trouble than they solve. Note that if the calling code did this test and add_matrices() was called from N places, then that checking code is replicated N times in various, perhaps, inconsistent ways.
Matrix * add_matrices(const Matrix * left, const Matrix * right) {
assert(left->rowsCount == right->rowsCount
&& left->colsCount == right->colsCount);
Conclusion: more compelling reasons to check the parameters in the function than in the caller though exceptions exist.

What I do is to check the parameters inside the function and act accordingly (throw exceptions, return error messages, etc.). I suppose it's the function's job to check whether the passed parameters are of the correct data type and contain valid values.

The function should perform its task correctly; otherwise, it should throw an exception. The client/consuming code may or may not do a check, depending on the data source and how much you trust it. Either way, you should also enclose the function call in a try-catch block to catch an invalid-argument exception.
EDIT:
Sorry, I confused C for C++. Instead of throwing an exception, you can return null. The client doesn't necessarily have to check the data before calling (depending on the data source and other factors like performance constraints), but must always check for null as a return value.

C - Eclipse CDT -Efficient debugging + What is better code (pointers to functions)?

I'm a new C programmer and I'm writing some data structures for homework.
I have two questions here.
We see a lot of examples of C's function-pointers, usually used to save code duplication. I messed around with this function, which I initially wrote:
(The constants were pre-#defined. Indentation is off, too.)
static PlayerResult playerCheckArguments(const char* name, int age,
int attack, int defense) {
PlayerResult result = PLAYER_SUCCESS;
if (!name) {
result = PLAYER_NULL_ARGUMENT;
} else if (strlen(name) > PLAYER_MAX_NAME_LENGTH) {
result = PLAYER_NAME_TOO_LONG;
} else if (invalidAge(age)) {
result = PLAYER_INVALID_AGE;
} else if (invalidAttack(attack)) {
result = PLAYER_INVALID_ATTACK;
} else if (invalidDefense(defense)) {
result = PLAYER_INVALID_DEFENSE;
}
return result;
}
until I got this ghoul:
static PlayerResult playerCheckArguments(const char* name, int age, int attack,
int defense) {
void* arguments[PLAYER_NUM_OF_PAREMETERS] = { name, &age, &attack, &defense };
PlayerResult (*funcArray[PLAYER_NUM_OF_PAREMETERS])(
int) = {&invalidName, &invalidAge, &invalidAttack, &invalidDefense };
PlayerResult result = PLAYER_SUCCESS;
for (int i = 0;
i < PLAYER_NUM_OF_PAREMETERS && result == PLAYER_SUCCESS; i++) {
PlayerResult (*func)(int) = funcArray[i];
void* key = arguments[i];
result = func(key);
}
return result;
}
My first question being - is there any reason why I should use/write the second function over the other, and generally try to use such "sophistications" which obviously lessen the code's clarity and/or simplicity?
Now, for my second question: as you may have noticed, I am using a lot of local variables for the purpose of easier debugging. This way, I can see all relevant evaluations and efficiently monitor the program as it runs.
Is there any other way to display expressions made in a function other than using local variables?
Thanks very much!
return 0 ;-)

Clarity is far more important than cleverness. The harder it is to figure out, the harder it is to get right, and to debug when you don't.
There is nothing wrong with using local variables for clarity or debugging. There is an old saw that goes "Avoid the sin of premature optimization". Make your code as simple and as clear as you can. If you then find that isn't enough, work to add as little complexity as needed to get the job done.

Since your question is tagged coding style, I'll just say, the first is definitely preferred. The reason is simple. Show the two functions to 200 programmers, 100 see the first, 100 see the second, and then record the average time it takes for the programmers to be able to describe what the function does. You'll absolutely, averaged over hundreds of programmers, find that the first wins every time.
So you would only do the second if perhaps you had 20+ different parameters to check, and even then there are cleaner ways to do it. I don't believe you'd see any speed increase for the second one either.
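One possible reading of "cleaner ways" for many parameters is a table of range checks walked by a single loop, with no function pointers or void* casts. This is my own sketch, not something taken from the question: the limits are made-up placeholders, only the numeric parameters are covered, and the enum is redeclared so the snippet stands alone.

```c
#include <stddef.h>

typedef enum {
    PLAYER_SUCCESS,
    PLAYER_INVALID_AGE,
    PLAYER_INVALID_ATTACK,
    PLAYER_INVALID_DEFENSE
} PlayerResult;

/* One row per numeric parameter: the value, its allowed range, and the
 * error to report. All limits below are made-up placeholders. */
typedef struct {
    int value;
    int min;
    int max;
    PlayerResult error;
} RangeCheck;

PlayerResult playerCheckRanges(int age, int attack, int defense) {
    const RangeCheck checks[] = {
        { age,     17, 45,  PLAYER_INVALID_AGE },
        { attack,  1,  100, PLAYER_INVALID_ATTACK },
        { defense, 1,  100, PLAYER_INVALID_DEFENSE },
    };

    for (size_t i = 0; i < sizeof checks / sizeof checks[0]; ++i) {
        if (checks[i].value < checks[i].min || checks[i].value > checks[i].max) {
            return checks[i].error;   /* first failing check wins */
        }
    }
    return PLAYER_SUCCESS;
}
```

For three or four parameters the plain if/else chain is still easier to read; a table like this only starts to pay off when the list of checks grows long.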

if statement in C & assignment without a double call

Would it be possible to implement an if that checks for -1 and, if the value is not -1, assigns it, but without having to call the function twice or saving the return value to a local variable? I know this is possible in assembly, but is there a C implementation?
int i, x = -10;
if( func1(x) != -1) i = func1(x);

"saving the return value to a local variable"
In my experience, avoiding local variables is rarely worth the clarity forfeited. Most compilers can usually avoid the corresponding loads/stores and just use registers for those locals. So don't avoid it, embrace it! The maintainer's sanity that gets preserved just might be your own.
"I know this is possible in assembly, but is there a c implementation?"
If it turns out your case is one where assembly is actually appropriate, make a declaration in a header file and link against the assembly routine.
Suggestion:
const int x = -10;
const int y = func1(x);
const int i = y != -1
? y
: 0 /* You didn't really want an uninitialized value here, right? */ ;

It depends whether or not func1 generates any side-effects. Consider rand(), or getchar() as examples. Calling these functions twice in a row might result in different return values, because they generate side effects; rand() changes the seed, and getchar() consumes a character from stdin. That is, rand() == rand() will usually[1] evaluate to false, and getchar() == getchar() can't be predicted reliably. Supposing func1 were to generate a side-effect, the return value might differ for consecutive calls with the same input, and hence func1(x) == func1(x) might evaluate to false.
If func1 doesn't generate any side-effect, and the output is consistent based solely on the input, then I fail to see why you wouldn't settle for int i = func1(x);, and base logic on whether or not i == -1. Writing the least repetitive code results in greater legibility and maintainability. If you're concerned about the efficiency of this, don't be. Your compiler is most likely smart enough to eliminate dead code, so it'll do a good job at transforming this into something fairly efficient.
[1] ... at least in any sane standard library implementation.

int c;
if((c = func1(x)) != -1) i = c;
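To check that this form really calls the function only once, here is a small self-contained sketch. The body of func1 and the call counter are invented for the demonstration; the question does not show the real func1.

```c
/* Hypothetical stand-in for func1: counts its calls and returns -1 for
 * negative input, otherwise twice the input. */
static int call_count = 0;

static int func1(int x) {
    ++call_count;
    return (x < 0) ? -1 : 2 * x;
}

/* Stores func1(x) in *i unless it returned -1; func1 runs exactly once. */
void assign_unless_error(int x, int *i) {
    int c;
    if ((c = func1(x)) != -1) {
        *i = c;
    }
}
```

The extra parentheses around the assignment matter: without them, != binds tighter than =, and c would receive the result of the comparison instead of the return value.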

The best implementation I could think of would be:
int i = 0; // initialize to something
const int x = -10;
const int y = func1(x);
if (y != -1)
{
i = y;
}
The const would let the compiler do any optimizations that it thinks are best (perhaps inlining func1). Notice that func1 is only called once, which is probably best. The const y would also allow y to be kept in a register (which it would need to be anyway in order to perform the if). If you wanted to give more of a suggestion, you could do:
register const int y = func1(x);
However, the compiler is not required to honor your register keyword suggestion, so it's probably best to leave it out.
EDIT BASED ON INSPIRATION FROM BRIAN'S ANSWER:
int i = ((func1(x) + 1) ?:0) - 1;
BTW, I probably wouldn't suggest using this (the ?: operator with an omitted middle operand is a GNU extension, not standard C), but it does answer the question. This is based on the SO question here. To me, I'm still confused as to the why for the question; it seems like more of a puzzle or job interview question than something that would be encountered in a "real" program. I'd certainly like to hear why this would be needed.
