I write code in C. I have been striving to write more testable code but I am a little
confused on deciding between writing pure functions that are really good for testing
but require smaller functions and hurt readability in my opinion and writing functions
that do modify some internal state.
For example (all state variables are declared static and hence are "private" to my module):
Which of these is more testable in your opinion?
int outer_API_bar()
{
    // Modify internal state
    internal_foo();
}
int internal_foo()
{
    // Do stuff
    if (internal_state_variable)
    {
        // Do some more stuff
        internal_state_variable = false;
    }
}
OR
int outer_API_bar()
{
    // Modify internal state
    internal_foo(internal_state_variable);
    // This could be another function if repeated many
    // times in the module
    if (internal_state_variable)
    {
        internal_state_variable = false;
    }
}
int internal_foo(bool arg)
{
    // Do stuff
    if (arg)
    {
        // Do some more stuff
    }
}
The second implementation is more testable with respect to internal_foo, since it has no side effects, but it makes bar uglier and requires smaller functions. That makes even small snippets hard to follow, because the reader has to constantly shift attention between functions.
Which one do you think is better? Compare this to writing OOP code, where private functions mostly use internal state and are not pure. Testing is done by setting up internal state on a mock object instance and testing the private function. I am getting a little confused about whether private functions should use internal state directly or have it passed in for the sake of "testability".
Whenever writing automated tests, ideally we want to focus on testing the specification of that unit of code, not the implementation (otherwise we create fragile tests that will break whenever we modify the implementation). Therefore, what happens internally in the object should not be of concern to the test.
For this example, I would look to build a test that:
Executes the test by calling outer_API_bar.
Asserts the correct behavior of the call using other publicly accessible functions and/or state. (There must be some way of doing this: if the only side effect of calling outer_API_bar were internal to this unit of code, then calling it could not affect your wider application in any way, and the function would essentially be useless.)
This way, you are able to keep the fact that you use functions like internal_foo, and variables like internal_state_variable as implementation details, which you can freely change when refactoring your code (i.e. to make it more readable) without having to change your tests.
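As a rough sketch of such a test: the module below keeps internal_foo and internal_state_variable private, and the test touches only public functions. The names outer_API_arm, outer_API_processed and processed_count are hypothetical additions made up for this illustration; the question does not show what the module's observable behavior actually is.

```c
#include <assert.h>
#include <stdbool.h>

/* --- module under test (would normally live in its own .c file) --- */
static bool internal_state_variable = false; /* private to the module */
static int  processed_count = 0;             /* hypothetical observable effect */

static void internal_foo(void)
{
    if (internal_state_variable) {
        processed_count++;                   /* stands in for "do some more stuff" */
        internal_state_variable = false;
    }
}

/* Hypothetical public functions, assumed for this sketch. */
void outer_API_arm(void)       { internal_state_variable = true; }
int  outer_API_processed(void) { return processed_count; }

int outer_API_bar(void)
{
    internal_foo();
    return 0;
}

/* --- test: only public functions are called --- */
void test_bar_processes_once_per_arm(void)
{
    int before = outer_API_processed();

    outer_API_arm();
    outer_API_bar();
    assert(outer_API_processed() == before + 1);

    /* A second call without re-arming must not process again. */
    outer_API_bar();
    assert(outer_API_processed() == before + 1);
}
```

Because the test never names internal_foo or internal_state_variable, both can be renamed, merged or split without touching it.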
NOTE: This suggestion is based on my own personal preference for only testing public functions, and not private ones. You will find much debate on this topic where some people pose good arguments for testing private functions being a valid thing to do.
To answer your question very specifically pure functions are waaaaay more 'testable' than any other kind of abstraction. The more pure functions you can include, the more testable your code would be. As you rightly mention, this can come at the cost of readability, and I am sure there are other trade offs to consider. My suggestion would be to aim for more pure functions and look for other techniques that would allow you to compensate on the readability side of things.
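One way to get the testability of pure functions while keeping the public function readable is a pure core with a thin stateful wrapper. The sketch below assumes, purely for illustration, that the functions take and return an int; the arithmetic is a placeholder for the "do stuff" parts of the question.

```c
#include <assert.h>
#include <stdbool.h>

/* Pure: the result depends only on the arguments, no state is touched. */
static int internal_foo(int value, bool arg)
{
    int result = value * 2;      /* placeholder for "do stuff" */
    if (arg) {
        result += 1;             /* placeholder for "do some more stuff" */
    }
    return result;
}

static bool internal_state_variable = true;

/* Thin wrapper: the only place that reads and writes the module state. */
int outer_API_bar(int value)
{
    int result = internal_foo(value, internal_state_variable);
    internal_state_variable = false;
    return result;
}

/* The pure part needs no setup: inputs in, result out. */
void test_internal_foo(void)
{
    assert(internal_foo(3, false) == 6);
    assert(internal_foo(3, true) == 7);
}
```

The wrapper stays short enough to read at a glance, and all branching logic sits in a function that can be tested with plain input/output pairs.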
Both snippets are testable via mocks. The second one, however, has the advantage that you can also check the argument of internal_foo(bool arg) for an expected value of true or false when the mock for internal_foo() is invoked. In my opinion, that would make for a more meaningful test.
Depending on the rest of the code that we don't know, testing without mocks may be more difficult.
I've been messing around with SDL2 in C and was wondering how to abstract code away without using too many function parameters. For example, in a normal gameplay loop there is usually an input, update, render cycle. Ideally, I would like this to be as abstracted as possible so I could have functions called "input", "update", "render" in my loop. How could I do this in C without having those functions take a ludicrous number of parameters? I know that C++ kind of solves this issue through classes, but I am curious and want to know how to do this in a procedural programming setting.
So far, I can't really think of any way to fix this. I tried looking it up online but only get results for C++ classes. As mentioned before, I want to stick to C because that is what I am comfortable with right now and would prefer to use.
If you have complex state to transport between calls, put that in a struct. Pass a pointer to that as the sole argument to your functions, or at least as the first of very few.
That is a very common design pattern in C code.
void inputstep(struct state_t* systemstate);
void updatestep(struct state_t* systemstate);
void renderstep(struct state_t* systemstate, struct opengl_context_t* oglctx);
Note also that it is exactly the same, if not even more (due to less safety about pointers), overhead as having a C++ class with methods.
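A minimal sketch of that pattern follows. The struct fields and the run_loop helper are made up for illustration, and the render step is simplified to take only the state (no OpenGL context), so this is not tied to any real SDL2 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical game state; the field names are illustrative only. */
struct state_t {
    bool  move_right;   /* set by the input step */
    float player_x;     /* advanced by the update step */
    int   frames_drawn; /* bumped by the render step */
    bool  running;
};

void inputstep(struct state_t *systemstate)
{
    /* A real program would poll its event queue here. */
    systemstate->move_right = true;
}

void updatestep(struct state_t *systemstate)
{
    if (systemstate->move_right) {
        systemstate->player_x += 1.0f;
    }
}

void renderstep(struct state_t *systemstate)
{
    systemstate->frames_drawn++;
    if (systemstate->frames_drawn >= 3) {
        systemstate->running = false;   /* stop the demo after three frames */
    }
}

/* The loop body stays three short calls, however large the state grows. */
void run_loop(struct state_t *systemstate)
{
    while (systemstate->running) {
        inputstep(systemstate);
        updatestep(systemstate);
        renderstep(systemstate);
    }
}

void test_run_loop(void)
{
    struct state_t s = { .player_x = 0.0f, .running = true };
    run_loop(&s);
    assert(s.frames_drawn == 3);
}
```

Adding a new piece of state later means adding a struct field, not changing every function signature.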
this in a functional programming setting.
Well, C is about as far as you get from a purely functional language, so functional programming paradigms only awkwardly translate. Are you sure you didn't mean "procedural"?
In a functional programming mindset, the state you pass into a function would be immutable or discarded after the function, and the function would return a new state; something like
struct mystate_t* mystate;
...
while(1) {
mystate = inputfunc(mystate);
mystate = updatefunc(mystate);
…
}
Only that in a functional setting, you wouldn't re-assign to a variable, and wouldn't have a while loop like that. Essentially, you wouldn't write C.
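For comparison, here is a sketch of what the "return a new state" style can look like in C. It deviates from the snippet above by passing the struct by value instead of by pointer, and the fields are invented for illustration.

```c
#include <assert.h>

/* Hypothetical state; small enough to copy cheaply. */
struct mystate_t {
    int input;     /* last input read */
    int position;  /* derived from the input */
};

/* Each step receives the old state by value and returns a new one;
   the caller's copy is never modified. */
struct mystate_t inputfunc(struct mystate_t old)
{
    struct mystate_t next = old;
    next.input = 1;                 /* placeholder for reading real input */
    return next;
}

struct mystate_t updatefunc(struct mystate_t old)
{
    struct mystate_t next = old;
    next.position = old.position + old.input;
    return next;
}

void test_state_is_not_mutated(void)
{
    const struct mystate_t s0 = { 0, 0 };
    struct mystate_t s1 = updatefunc(inputfunc(s0));

    assert(s0.position == 0);       /* original state is untouched */
    assert(s1.position == 1);
}
```

This works for small structs, but copying a large state every frame is exactly the cost that makes the pointer-passing style more common in C.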
My program responds to incoming messages and performs some logic based on the IDs and data included in the messages.
I have a different function for each ID.
The project is pure C.
To make the code easy to work with I have adjusted all functions to the same style (same return and parameters).
I also want to avoid long switch-case constructions and make the code easier to edit later, so I have created the following function:
AnswerStruct IDHandler(Request Message)
{
    AnswerStruct ANS;
    // Look up the handler registered for this message ID
    SIDHandler = IDfunctions[Message.ID];
    ANS = SIDHandler(Message);
    return ANS;
}
AnswerStruct is struct for answer messages.
Request is struct for incoming messages.
IDfunctions is array of pointers to functions which looks like this -
AnswerStruct func1(Request);
AnswerStruct func4(Request);
...
typedef AnswerStruct(*f)(Request);
AnswerStruct (*SIDHandler)(Request);
static f IDfunctions[IDMax] = {0, func1, 0, 0, func4, ...};
Function pointers are placed in the array cells whose indices equal their IDs, for example:
func1 handles messages with ID=1.
func4 handles messages with ID=4.
I think, that by using this array I make my life much easier.
I can call function which I need in one step (just go to the IDfunctions[ID]).
Also, adding new functions becomes a two step operation (just add function to the IDfunctions and write logic).
I doubt the efficiency of the selected solution, it seems clunky to me.
The question is - Is this a good architecture?
If not, how can I edit my solution to make it better?
Thanks.
I doubt the efficiency of the selected solution, it seems clunky to me.
It can be less efficient to call a function via a function pointer than to call it directly by name, because the indirect call usually denies the compiler the opportunity to inline or otherwise optimize the call. But you have to consider whether that actually matters. In a system that dispatches function calls based on messages received from an external source, the I/O involved in receiving the messages is likely to be much more expensive than the indirect function calls, so the difference in call performance is unlikely to be significant.
On the other hand, your approach affords simpler logic and many fewer lines of code, which is a different and potentially more valuable kind of efficiency.
The question is - Is this a good architecture?
The general approach is perfectly good, and I don't see much to complain about in the implementation sketch provided.
Personally, I would declare the array IDfunctions to be const (supposing, of course, that you don't intend to replace any of its members after their initialization), but that's a minor safety / performance detail, where again the performance dimension is probably irrelevant.
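A self-contained sketch of the dispatch table with the const qualifier applied is shown below. The fields of Request and AnswerStruct, the handler bodies, the value of IDMax and the error answer are all assumptions, since the question elides them. The range and null checks are an addition of this sketch, not something the question's code contains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message types; the real fields are not shown in the question. */
typedef struct { int ID; int data; } Request;
typedef struct { int status; int value; } AnswerStruct;

typedef AnswerStruct (*f)(Request);

enum { IDMax = 5 };   /* assumed table size */

static AnswerStruct func1(Request r) { AnswerStruct a = { 0, r.data + 1 }; return a; }
static AnswerStruct func4(Request r) { AnswerStruct a = { 0, r.data * 4 }; return a; }

/* const: the table entries cannot be replaced after initialization. */
static const f IDfunctions[IDMax] = { NULL, func1, NULL, NULL, func4 };

AnswerStruct IDHandler(Request Message)
{
    /* Reject IDs outside the table and IDs that have no handler. */
    if (Message.ID < 0 || Message.ID >= IDMax || IDfunctions[Message.ID] == NULL) {
        AnswerStruct err = { -1, 0 };   /* hypothetical error answer */
        return err;
    }
    return IDfunctions[Message.ID](Message);
}

void test_dispatch(void)
{
    Request known = { 1, 10 };
    Request empty_slot = { 2, 10 };

    assert(IDHandler(known).value == 11);
    assert(IDHandler(empty_slot).status == -1);
}
```

Since the message ID comes from an external source, the check costs very little compared with calling through a null or out-of-range entry.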
Say I have an external library that computes the optima, say minima, of a given function. Say its headers give me a function
double *minimizer(ObjFun f);
where the headers define
typedef double (*ObjFun)(double x[]);
and "minimizer" returns the minima of the function f of, say, a two dimensional vector x.
Now, I want to use this to minimize a parameterized function. I don't know how to express this in code exactly, but say if I am minimizing quadratic forms (just a silly example, I know these have closed form minima)
double quadraticForm(double x[]) {
    return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22;
}
which is parameterized by the constants (q11, q12, q22). I want to write code where the user can input (q11, q12, q22) at runtime, I can generate a function to give to the library as a callback, and return the optima.
What is the recommended way to do this in C?
I am rusty with C, so I am asking about both feasibility and best practices. Really I am trying to solve this using C/Cython code. I was using Python bindings to the library so far, and with "inner functions" it was really obvious how to do this in Python:
def getFunction(q11, q12, q22):
    def f(x):
        return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22
    return f
# now submit getFunction(<user params>) to the library
I am trying to figure out the C construct so that I can be better informed in creating a Cython equivalent.
The header defines the prototype of a function which can be used as a callback. I am assuming that you can't/won't change that header.
If your function has more parameters, they cannot be filled by the library's call.
Calling it through the callback type would give undefined behaviour or bogus parameter values, so a function with additional parameters cannot be passed as the callback.
This means you need to drop the idea of "parameterizing" your function.
Your actual goal is to somehow allow the constants/coefficients to be changed during runtime.
Find a different way of doing that. Think of "dynamic configuration" instead of "parameterizing".
I.e. the function does not always expect those values at each call. It just has access to them.
(This suggests the configuration values are less often changed than the function is called, but does not require it.)
How:
I can only think of one simple way, and it is pretty ugly and vulnerable (e.g. to race conditions, concurrent access, reentrancy; you name it, it will hurt you ...):
Introduce a set of global variables, or better one struct-variable, for readability. (See recommendation below for "file-global" instead of "global".)
Set them at runtime to the desired values, using a separate function.
Initialise them to meaningful defaults, in case they never get written.
Read them at the start of the minimizing callback function.
Recommendation: Have everything (the minimizing function, the configuration variable and the function which sets the configuration at runtime) in one code file and make the configuration variable(s) static (i.e. restrict access to them to this code file).
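The steps above might look roughly like this. The setter name setQuadraticForm and the default coefficients are assumptions for illustration; ObjFun is the callback type from the question, and the library's minimizer itself is not called here. As noted, this is not safe under concurrent or reentrant use.

```c
#include <assert.h>

typedef double (*ObjFun)(double x[]);   /* callback type from the library header */

/* File-local configuration with meaningful defaults (here: q11 = q22 = 1, q12 = 0). */
static struct { double q11, q12, q22; } quad_cfg = { 1.0, 0.0, 1.0 };

/* Hypothetical setter: call it before handing the callback to the library. */
void setQuadraticForm(double q11, double q12, double q22)
{
    quad_cfg.q11 = q11;
    quad_cfg.q12 = q12;
    quad_cfg.q22 = q22;
}

/* Matches ObjFun exactly, so it can be passed as the callback. */
double quadraticForm(double x[])
{
    /* Read the configuration once at the start of the call. */
    const double q11 = quad_cfg.q11;
    const double q12 = quad_cfg.q12;
    const double q22 = quad_cfg.q22;

    return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22;
}

void test_runtime_coefficients(void)
{
    ObjFun cb = quadraticForm;      /* what would be given to the library */
    double x[2] = { 1.0, 2.0 };

    setQuadraticForm(2.0, 1.0, 3.0);
    assert(cb(x) == 18.0);          /* 1*1*2 + 2*1*2*1 + 2*2*3 */
}
```

The library only ever sees a plain double (*)(double[]) function, while the coefficients can still be changed between minimization runs.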
Note:
This answer is only the analysis of why you should not try additional parameters.
The proposed method is not considered part of the answer; it is more simple than good.
I invite more holistic answers, which propose safer implementation.
I work in safety-critical application development. Recently, as a code reviewer, I objected to the coding style shown below but couldn't make a strong case against it. What would be a good argument against such variable redundancy/duplication? I am looking for cases where it might lead to problems, or test cases that might fail, rather than just coding-style arguments.
// global data
int Block1Var;
int Block2Var;
...
//Block1
{
...
Block1Var = someCondition; // someCondition is a logical expression
...
}
//Block2
{
...
Block2Var = Block1Var; // Block2Var is an unconditional copy of Block1Var
...
}
I think a little more context would be helpful perhaps.
You could argue that the value of Block1Var is not guaranteed to stay the same across concurrent access/modification. This is only valid if Block1Var ever changes (i.e. is not only read). I don't know if you are concerned with multi-threaded applications or not.
Readability is an important issue as well. Future code maintainers don't want to have to trace around a bunch of trivial assignments.
Depends on what's done with those variables later, but one argument is that it's not future-proof. If, in the future, you change the code such that it changes the value of Block1Var, but Block2Var is used instead (without the additional change) later on, then this will result in erroneous behavior.
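A small sketch of that failure mode, with the blocks turned into functions so it can be run. The later_change and consumer functions are hypothetical and stand for a future modification and for code that still reads the copy.

```c
#include <assert.h>

/* global data */
int Block1Var;
int Block2Var;

void block1(int someCondition) { Block1Var = someCondition; }
void block2(void)              { Block2Var = Block1Var; }   /* unconditional copy */

/* Hypothetical later change: Block1Var is updated after the copy was taken. */
void later_change(void)        { Block1Var = 0; }

/* Hypothetical consumer that still reads the copy, not the original. */
int consumer(void)             { return Block2Var; }

void test_copy_goes_stale(void)
{
    block1(1);
    block2();
    later_change();

    assert(Block1Var == 0);
    assert(consumer() == 1);    /* the copy no longer matches the original */
}
```

With a single variable there is nothing that can drift apart, which is the concrete argument a reviewer can make beyond style.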
If the function shown reaches a certain length (I'm assuming a lot of detail has been discarded to create the minimal reproducible example for this question), a good next step could be to create a new (sub-)function out of Block 2. Calling this subfunction would then begin by assigning Block1Var (the actual parameter) to Block2Var (the formal parameter). If there were no other coupling to the rest of the function, one could cut the rest of Block 2, paste it as a function definition, and would only have to replace the assignment with the subfunction call.
My answer is fairly speculative, but I have seen many cases where this strategy helped me to mark useful points to split a complex function later during the development. Of course, this interpretation only applies to an intermediate stage of development and not to code that is stated to be "ready for release".
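A sketch of where that refactoring could end up is below. It uses local variables rather than the globals from the question, and the bodies of both blocks are placeholders, so treat it as an illustration of the shape only.

```c
#include <assert.h>

/* Former Block 2: the copy has become the formal parameter. */
static int block2(int block2Var)
{
    return block2Var ? 10 : 20;     /* placeholder for the rest of Block 2 */
}

int whole_function(int a, int b)
{
    /* Block 1 */
    int block1Var = (a > b);        /* stands in for someCondition */

    /* Block 2 is now a single call; block1Var is the actual parameter. */
    return block2(block1Var);
}

void test_whole_function(void)
{
    assert(whole_function(2, 1) == 10);
    assert(whole_function(1, 2) == 20);
}
```

After the split, the "copy" is just parameter passing, so the duplicated variable disappears from the enclosing function.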
I have a library I wrote with API based on opaque structures. Using opaque structures has a lot of benefits and I am very happy with it.
Now that my API is stable in terms of specification, I'd like to write a complete battery of unit tests to ensure a solid base before releasing it.
My concern is simple: how do you unit test an API based on opaque structures, where the main goal is to hide the internal logic?
For example, let's take a very simple object, an array with a very simple test:
WSArray a = WSArrayCreate();
int foo = 5;
WSArrayAppendValue(a, &foo);
int *bar = WSArrayGetValueAtIndex(a, 0);
if (&foo != bar)
    printf("Erroneous value returned\n");
else
    printf("Good value returned\n");
WSRelease(a);
Of course, this tests some facts, such as the array behaving as expected with one value, but when I write unit tests, at least in C, I usually compare the memory footprint of my data structures with a known state.
In my example, I don't know if some internal state of the array is broken.
How would you handle that? I'd really like to avoid adding code to the implementation files only for unit testing. I really emphasize loose coupling of modules, and injecting unit tests into the implementation would seem rather invasive to me.
My first thought was to include the implementation file into my unit test, linking my unit test statically to my library.
For example:
#include <stdio.h>
#include <WS/WS.h>
#include <WS/Collection/Array.c>
static void TestArray(void)
{
    WSArray a = WSArrayCreate();
    /* Structure members are available because we included Array.c */
    printf("%d\n", a->count);
    WSRelease(a);
}
Is that a good idea?
Of course, the unit tests won't benefit from encapsulation, but they are here to ensure it's actually working.
I would test only the API, and focus on testing every possible corner case.
I can see the appeal of checking that the memory structures hold what you expect. If you do this, though, you will be tightly coupling the tests to the specifics of the implementation, and I think creating a lot of long-term maintenance work.
My thought here is that the API is the contract, and if you fulfil that then your code is working. If you change the implementation later, then presumably one of the things you need to know is that the contract is maintained. Your unit tests will verify that.
Your unit tests shouldn't depend on the internal details of the code that they're testing. Your initial example is actually a pretty good test. It does one thing, then verifies that the state of the object is as expected.
You'd want to create tests that verify the behavior of other parts of the API as well, of course. For example, in the array case, you'd want to have test cases that verify that the length of the array is reported correctly after adding and removing items.
Writing unit tests that depend on an exact match with a known good memory snapshot is generally a really bad idea, in that every implementation change will cause the tests to fail. If you do decide to use snapshot-based tests, make sure there's an easy way to regenerate the "known good" snapshots.
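To illustrate API-only testing of an opaque type, here is a sketch using a hypothetical Vec container, not the WSArray library from the question, whose API is only partly shown. The interface and the implementation sit in one block so the example is self-contained. In a real project the struct definition would stay in the .c file, and the test would see only the declarations.

```c
#include <assert.h>
#include <stdlib.h>

/* --- public header part: only an incomplete type is visible --- */
typedef struct Vec Vec;
Vec   *VecCreate(void);
int    VecAppend(Vec *v, void *value);   /* returns 0 on success */
void  *VecGet(const Vec *v, size_t i);   /* NULL if i is out of range */
size_t VecCount(const Vec *v);
void   VecRelease(Vec *v);

/* --- implementation part (would live in the .c file) --- */
struct Vec { void **items; size_t count, cap; };

Vec *VecCreate(void) { return calloc(1, sizeof(Vec)); }

int VecAppend(Vec *v, void *value)
{
    if (v->count == v->cap) {
        size_t cap = v->cap ? v->cap * 2 : 4;
        void **p = realloc(v->items, cap * sizeof *p);
        if (p == NULL) return -1;
        v->items = p;
        v->cap = cap;
    }
    v->items[v->count++] = value;
    return 0;
}

void  *VecGet(const Vec *v, size_t i) { return i < v->count ? v->items[i] : NULL; }
size_t VecCount(const Vec *v)         { return v->count; }
void   VecRelease(Vec *v)             { if (v) { free(v->items); free(v); } }

/* --- test: API only, including corner cases --- */
void test_vec_api(void)
{
    int vals[10];
    Vec *v = VecCreate();

    assert(v != NULL);
    assert(VecCount(v) == 0);
    assert(VecGet(v, 0) == NULL);            /* empty container */

    for (size_t i = 0; i < 10; i++) {         /* enough appends to force regrowth */
        int rc = VecAppend(v, &vals[i]);
        assert(rc == 0);
    }
    assert(VecCount(v) == 10);
    assert(VecGet(v, 0) == &vals[0]);
    assert(VecGet(v, 9) == &vals[9]);
    assert(VecGet(v, 10) == NULL);            /* one past the end */
    VecRelease(v);
}
```

The test exercises growth indirectly by appending past the initial capacity, without ever reading the cap or items fields.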
I would suggest splitting the unit testing into black box and white box unit testing. The black box testing focuses on the API interface and the correctness of results, while the white box testing focuses on the internals.
To facilitate this I use a private header (e.g. example_priv.h), with an #ifdef TESTING around function prototypes that are otherwise internal / private. Thus you can exercise internal functions for unit testing purposes, without exposing them in the general case.
The only loss with this method is losing the ability to explicitly label the internal functions as static in their source file.
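A sketch of the private-header idea, collapsed into a single block so it can be compiled on its own. The PRIVATE macro is a variation that is not part of the answer above: it keeps the function static in normal builds and drops the static only when TESTING is defined. The helper clamp_index is a made-up internal function, and the #define at the top stands in for passing -DTESTING on the compiler command line.

```c
#include <assert.h>

#define TESTING 1   /* stands in for compiling the test build with -DTESTING */

/* --- example_priv.h (sketch) --- */
#ifdef TESTING
#define PRIVATE                       /* external linkage so tests can call it */
int clamp_index(int i, int count);    /* hypothetical internal helper */
#else
#define PRIVATE static                /* normal builds keep it file-local */
#endif

/* --- example.c (sketch) --- */
PRIVATE int clamp_index(int i, int count)
{
    if (count <= 0) return 0;         /* nothing to index into, fall back to 0 */
    if (i < 0) return 0;
    if (i >= count) return count - 1;
    return i;
}

/* --- white box test: calls the internal helper directly --- */
void test_clamp_index(void)
{
    assert(clamp_index(-1, 3) == 0);
    assert(clamp_index(1, 3) == 1);
    assert(clamp_index(5, 3) == 2);
}
```

The trade-off is that the test build and the release build no longer compile exactly the same declarations, so it is worth running the black box tests against the release configuration as well.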
I hope that is helpful.