How extensive should a single unit test be? [closed]


Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am writing an application in C. I am new to writing unit tests. I will be using the glib testing framework.
I read this article on Wikipedia. I am unsure of what my unit tests should cover.
I know that my unit test should check whether for a valid input, the expected result is obtained. Is this all that needs to be done while writing a unit test for a function?
Should I also check the value of each variable every time it is modified? Because if the functionality is extended, then more variables might be added and current variables might be modified at various other places, so then I will have to change the test itself.
Please give me your input.

I am unsure of what my unit tests should cover.
Maybe it helps to look at unit tests as a specification (runnable documentation!) for your code.
What is considered valid input?
What happens when the code is given valid input? Does it produce the expected output? (Don't just test one valid input value; focus on maybe 1 typical case, then all edge cases and extremes that should still just work.)
What happens when the code is given invalid input? Are the expected errors / error codes produced?
What environment is the code dependent on?
What happens when that environment isn't properly set up?
The last two points are actually special cases of the first three. That's because unit tests don't necessarily test single, isolated functions:
"Input" doesn't just have to be a function argument. It can be any kind of program state that your code will read (i.e. depends on).
In the same way, "output" is not just a return value of a function. It can be any variable or program state that your code modifies.
Your unit tests might not test just one single isolated function, but the interplay between several functions that must be called in sequence to get something done. Read as documentation/specification, such a unit test would suggest that calling the functions in that order is an appropriate or even suggested way to get some task done.
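To make the "tests as specification" idea concrete, here is a minimal sketch using plain assert.h rather than any particular framework. The function safe_div and its error convention are invented for illustration; each test function documents one of the questions above (typical valid input, edge cases, invalid input).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical SUT: divides a by b and writes the quotient to *out.
 * Returns 0 on success and -1 on invalid input. */
int safe_div(int a, int b, int *out)
{
    if (out == NULL || b == 0)
        return -1;
    if (a == INT_MIN && b == -1)   /* the quotient would not fit in an int */
        return -1;
    *out = a / b;
    return 0;
}

/* A typical valid input produces the expected output. */
void test_typical_case(void)
{
    int q = 0;
    assert(safe_div(12, 3, &q) == 0);
    assert(q == 4);
}

/* Edge cases and extremes that should still just work. */
void test_edge_cases(void)
{
    int q = 0;
    assert(safe_div(0, 5, &q) == 0 && q == 0);
    assert(safe_div(INT_MAX, 1, &q) == 0 && q == INT_MAX);
    assert(safe_div(INT_MIN, 1, &q) == 0 && q == INT_MIN);
    assert(safe_div(-7, 2, &q) == 0 && q == -3);   /* C99 truncates toward zero */
}

/* Invalid input produces the documented error code and leaves *out untouched. */
void test_invalid_input(void)
{
    int q = 99;
    assert(safe_div(1, 0, &q) == -1);
    assert(safe_div(INT_MIN, -1, &q) == -1);
    assert(safe_div(1, 1, NULL) == -1);
    assert(q == 99);
}
```

Someone reading only the three test functions learns what counts as valid input and what happens otherwise, without opening the implementation.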
Should I also check the value of each variable every time it is modified?
Unit tests are completely separate from what they are testing (often called the System Under Test, abbreviated to SUT). That is, your unit tests should be separate functions wherein you set up the SUT, exercise it, and then check the outcome against the expected result.
Therefore your unit test functions will be very simple:
set up the input for your SUT.
call/exercise the SUT.
read the output of the SUT and compare against expected output.
As you can see, there's not much room in such a simple procedure for variables that will change their value all the time. If you have such a unit test, chances are that it's too complex and you might want to change it, e.g. split it up.
Changing variables are more likely seen in the tested code (i.e. in the SUT) itself. But that's not where you put your test logic. That goes into a completely separate function, which makes up your unit test.
(Note that I'm speaking very generally, since you haven't said what framework you are using for your unit tests, so I might be slightly off on some issues.)
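The three steps can be sketched as follows. The tiny stack is a made-up SUT, chosen because its "output" is partly program state (the element count) and because the test exercises several functions in sequence. Plain assert.h is used to keep the sketch framework-neutral.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAP 4

/* Hypothetical SUT: a tiny fixed-capacity stack. */
struct stack {
    int    items[STACK_CAP];
    size_t count;
};

void stack_init(struct stack *s) { s->count = 0; }

bool stack_push(struct stack *s, int v)
{
    if (s->count == STACK_CAP)
        return false;
    s->items[s->count++] = v;
    return true;
}

bool stack_pop(struct stack *s, int *out)
{
    if (s->count == 0)
        return false;
    *out = s->items[--s->count];
    return true;
}

/* Unit test: values come back in last-in, first-out order. */
void test_push_then_pop_is_lifo(void)
{
    /* 1. set up the input for the SUT */
    struct stack s;
    int first = 0, second = 0;
    stack_init(&s);

    /* 2. exercise the SUT */
    bool pushed = stack_push(&s, 10) && stack_push(&s, 20);
    bool popped = stack_pop(&s, &first) && stack_pop(&s, &second);

    /* 3. compare the output against the expected output */
    assert(pushed && popped);
    assert(first == 20 && second == 10);
    assert(s.count == 0);
}

/* Unit test: pushing onto a full stack fails and changes nothing. */
void test_push_on_full_stack_fails(void)
{
    struct stack s;
    stack_init(&s);
    for (int i = 0; i < STACK_CAP; i++)
        assert(stack_push(&s, i));

    assert(!stack_push(&s, 99));
    assert(s.count == STACK_CAP);
}
```

Neither test inspects the stack's variables while they are being modified; each only checks the observable result afterwards. With GLib, test functions like these would typically be registered with g_test_add_func and would use macros such as g_assert_cmpint in place of assert.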

Related

Time and memory limits as parameters in Metric-FF

When running Metric-FF, there do not appear to be any time or memory limits among the possible input parameters. My question is: is it possible to introduce them as input parameters when running the planner?
The cited planner is the following one:
https://fai.cs.uni-saarland.de/hoffmann/metric-ff.html

Error management for a C computer game [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
What kind of errors should I expect with a computer game written in C, and how should I handle them? By "computer game" I mean a program where there is no danger of any kind to human lives or "property".
I would like to add as little error-handling code as necessary to keep everything as clear and simple as possible. For example I do not want to do this, because this is much simpler and sufficient for a game.
Up to now I have thought about this:
Error: out-of-memory when calling malloc.
Handling: Print error message and call exit(EXIT_FAILURE); (like this)
Error: A programming error, i.e. something which would work if implemented correctly.
Handling: Use assert to detect (which aborts the program if failed).
Error: Reading a corrupted critical file (e.g. game resource).
Handling: Print error message and call exit(EXIT_FAILURE);
Error: Reading a corrupted non-critical file (e.g. load a saved game).
Handling: Show message to user and ask to load another file.
Do you think this is a reasonable approach? What other errors should I expect and what is a reasonable minimal approach to handle them?

You can expect at least the errors mentioned in the documentation of the libraries you use. For a C program that is typically at least libc.
Check the ERRORS section of the man-pages for the functions you'd be using.
I'd also think this over:
I do not want to do this, because this is much simpler and sufficient for a game.
Imagine you'd have fought yourself through a dozen game-levels and then suddenly the screen is gone with an odd OOM*1-error message. And ... - you didn't save! DXXM!
*1 Out-Of-Memory

As I've already stated in the comment I think this is a very broad question. However, it's Xmas and I'll try and be helpful (lest I upset Santa).
The general best practices have been given in the answers posted by #alk and #user2485710. I will try and give a generic boiler-plate for error handling as I see it in C.
You can't guard against everything without writing perfect code. Perfect code is unreachable (kind of like infinity in calculus) though you can try and get close.
If you try to put too much error handling code in, you will be affecting performance. So, let me define what I will call a simple function and a user function.
A user function is a function that can return an error value. e.g. fopen
A simple function is a function that can not return an error value. e.g.
long add(int a, int b)
{
    long rv = a; // #alk - This way it shouldn't overflow (where long is wider than int). :P
    return rv + b;
}
Here are a couple rules to follow:
All calls to user functions must handle the returned errors.
All calls to simple functions are assumed safe so no error handling is needed.
If a simple function's parameter is restricted (i.e. an int parameter that must be between 0 and 9) use an assert to ensure its validity (unless the value is the result of user input in which case you should either handle it or propagate it making this a user function).
If a user function's parameter is restricted and it doesn't cause an error do the same as above. Otherwise, propagate it without additional asserts.
Just like your malloc example you can wrap your user functions with code that will gracefully exit your game thereby turning them into simple functions.
This won't remove all errors but should help reduce them whilst keeping performance in mind. Testing should reduce the remaining errors to a minimum.
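A minimal sketch of these rules, with invented names (xmalloc, digit_to_char, format_score): the wrapper turns malloc into a simple function by exiting on failure, and the restricted parameter is guarded by an assert.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Wraps malloc (a user function) so that callers never see a failure:
 * the game exits with a message instead. */
void *xmalloc(size_t size)
{
    /* Request at least one byte, because malloc(0) may legitimately return NULL. */
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory (requested %zu bytes)\n", size);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Simple function with a restricted parameter: digit must be in 0..9.
 * Passing anything else is a programming error, so assert catches it. */
char digit_to_char(int digit)
{
    assert(digit >= 0 && digit <= 9);
    return (char)('0' + digit);
}

/* Built only from simple functions, so it needs no error handling.
 * The caller owns the returned string and must free it. */
char *format_score(int hundreds, int tens, int ones)
{
    char *text = xmalloc(4);
    text[0] = digit_to_char(hundreds);
    text[1] = digit_to_char(tens);
    text[2] = digit_to_char(ones);
    text[3] = '\0';
    return text;
}
```

If the digits came from user input rather than from game logic, the assert would be the wrong tool; the function would then have to report the problem to its caller, which makes it a user function.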
Forgive me for not being more specific, however, the question seems to ask for a generic method of error handling in C.
In conclusion I would add that testing, whether unit testing or otherwise, is where you make sure that your code works. Error handling isn't something you can plan for in its entirety because some possible errors will only be evident once you start to code (like a game not allowing you to move because you managed to get yourself stuck inside a wall which should be impossible but was allowed because of some strange explosive mechanics). However, testing can and should be planned for because that will reveal where you should spend more time handling errors.

My suggestion is about:
turning on the compiler's flags for errors and warnings and making your compiler as pedantic as possible; -Wall, -Werror and -Wextra, for example, are a good start for both clang and gcc
be sure that you know what undefined behaviour means and which scenarios can trigger UB; the compiler doesn't always help, even with all the warnings turned on.
make your program modular, especially when it comes to memory management and the use of malloc
be sure that your compiler and your standard library of choice both support the C standard that you pick

Is it a bad practice to output error messages in a function with one input and one output [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I was once told that functions with one input and one output (not exactly one) should not print messages when being called. But I don't understand. Is it for security or just for convention?
Let me give an example. How should I deal with an attempt to access data in a sequential list with an incorrect index?
// 1. Give out the error message inside the function directly.
DataType GetData(seqList *L, int index)
{
    if (index < 0 || index >= L->length) {
        printf("Error: Access beyond bounds of list.\n");
        // exit(EXIT_FAILURE);
    }
    return L->data[index];
}

// 2. Return a value or use a global variable (like errno) that
//    indicates whether the function performs successfully.
StateType GetData(seqList *L, int index, int *data)
{
    if (index < 0 || index >= L->length) {
        return ERROR;
    }
    *data = L->data[index];
    return OK;
}

I think there are two things going on here:
Any visible and unexpected side-effect such as writing to streams is generally bad, and not just for functions with one input and one output. If I was using a list library, and it started silently writing error messages to the same output stream I was using for my regular output, I'd consider that a problem. However, if you are writing such a function for your own personal use, and you know ahead of time that the action you want taken is always to print a message and exit(), then it's fine. Just don't force this behavior on everyone else.
This is a specific case of the general problem of how to inform callers about errors. A lot of the time, a function cannot know the correct response to an error, because it doesn't have the context that the caller does. Take malloc(), for instance. The vast majority of the time, when malloc() fails, I just want to terminate, but once in a great while I might want to deliberately fill the memory by calling malloc() until it fails, and then proceed to do something else. In this case, I don't want the function to decide whether or not to terminate - I just want it to tell me it's failed, and then pass control back to me.
There are a number of different approaches to handling errors in library functions:
Terminate - fine if you're writing a program yourself, but bad for a general purpose library function. In general, for a library function, you'll want to let the caller decide what to do in the case of an error, so the function's role is limited to informing the caller of the error.
Return an error value - sometimes OK, but sometimes there is no feasible error value. atoi() is a good case in point - all the possible values it returns could be correct translations of the input string. It doesn't matter what you return on error, be it 0, -1 or anything else, there is no way to distinguish an error from a valid result, which is precisely why the behavior is undefined if the converted value cannot be represented. It's also semantically questionable from a slightly purist point of view - for instance, a function which returns the square root of a number is one thing, but a function which sometimes returns the square root of a number, but which sometimes returns an error code rather than a square root is another thing. You can lose the self-documenting simplicity of a function when return values serve two completely separate purposes.
Leave the program in an error state, such as setting errno. You still have the fundamental problem that if there is no feasible return value, the function still can't tell you that an error has occurred. You could set errno to 0 in advance and check it afterwards every time, but this is a lot of work, and may just not be feasible when you start involving concurrency.
Call an error handling function - this basically just passes the buck, since the error function then also has to address the issues above, but at least you could provide your own. Also, as R. notes in the comments below, other than in very simple cases like "always terminate on any error" it can be asking too much of a single global error handling function to be able to sensibly handle any error that might arise in a way that your program can then resume normal execution. Having numerous error handling functions and passing the appropriate ones individually to each library function is technically possible, but hardly an optimal solution. Using error handling functions in this way can also be difficult or even impossible to use correctly in the presence of concurrency.
Pass in an argument that gets modified by the function if it encounters an error. Technically feasible, but it's not really desirable to add an additional parameter for this purpose to every library function ever written.
Throw an exception - your language has to support them to do this, and they come along with all kinds of associated difficulties including unclear structure and program flow, more complex code, and the like. Some people - I'm not one of them - consider exceptions to be the moral equivalent of longjmp().
All the possible ways have their drawbacks and advantages; as of yet, humanity has not discovered the perfect way of reporting errors from library functions.
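As one illustration of these trade-offs, here is a sketch of a wrapper around strtol() that avoids the atoi() problem: the return value carries only the status and the parsed number travels through a pointer argument. The names parse_long and parse_status are invented for this example. It also shows the "set errno to 0 in advance and check it afterwards" pattern mentioned above.

```c
#include <errno.h>
#include <stdlib.h>

enum parse_status { PARSE_OK = 0, PARSE_NOT_A_NUMBER, PARSE_OUT_OF_RANGE };

/* Hypothetical helper: parses a complete base-10 integer from text.
 * On success stores the value in *out; on failure leaves *out untouched. */
enum parse_status parse_long(const char *text, long *out)
{
    char *end = NULL;

    errno = 0;                       /* clear any stale error before the call */
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return PARSE_NOT_A_NUMBER;   /* nothing parsed, or trailing characters */
    if (errno == ERANGE)
        return PARSE_OUT_OF_RANGE;   /* strtol clamped the result */

    *out = value;
    return PARSE_OK;
}
```

The caller decides what each failure means: terminate, ask the user again, or fall back to a default. The function itself prints nothing and never exits.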

In general you should make sure you have a consistent and coherent error handling strategy, which means considering whether you want to pass an error up to a higher level or handle it at the level it initially occurs. This decision has nothing to do with how many inputs and outputs a function has.
In a deeply embedded system where a memory allocation failure occurs at a critical juncture, for example, there's no point passing that error back up (and indeed you may well not be able to) - all you can do might be enter a tight loop to let the watchdog reset you. In this case there's no point reserving invalid return values to indicate error, or indeed in even checking the return value at all if it doesn't make sense to do so. (Note I am not advocating just lazily not bothering to check return values, that is a different matter entirely).
On the other hand, in a lovely beautiful GUI app you probably want to fail as gracefully as possible and pass the error up to a level where it can be either worked around / retried / whatever is appropriate; or presented to the user as an error if nothing else can be done.

It is better to use perror() to display error messages rather than using printf()
Syntax:
void perror(const char *s);
Also, error messages are supposed to be sent to the stderr stream rather than stdout.
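A small sketch of perror() in use; the helper name open_for_reading is made up. Note that perror() describes the current value of errno, so it is only meaningful right after a call that failed and set errno. POSIX guarantees this for fopen(); ISO C alone does not.

```c
#include <stdio.h>

/* Hypothetical helper: opens a file for reading and reports a failure
 * on stderr. Returns NULL if the file could not be opened. */
FILE *open_for_reading(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        perror(path);   /* writes the path, a colon and the errno description to stderr */
    return fp;
}
```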

Yes, it's a bad practice; even worse is that you're sending the output to stdout rather than stderr. This could end up corrupting data by mixing error messages in with output.
Personally, I believe very strongly that this kind of "error handling" is harmful. There is no way you can validate that the caller passed a valid value for L, so checking the validity of index is inconsistent. The documented interface contract for the function should simply be that L must be a valid pointer to an object of the correct type, and index a valid index (in whatever sense is meaningful to your code). If an invalid value for L or index is passed, this is a bug in your code, not a legitimate error that can occur at runtime. If you need help debugging it, the assert macro from assert.h is probably a good idea; it makes it easy to turn off the check when you no longer need it.
One possible exception to the above principle would be the case where the value of L is coming from other data structures in your program, but index is coming from some external input that's not under your control. You could then perform an external validation step before calling this function, but if you always need the validation, it makes sense to integrate it like you're doing. However, in that case you need to have a way to report the failure to the caller, rather than printing a useless and harmful message to stdout. So you need to either reserve one possible return value as an error sentinel, or have an extra argument that allows you to return both a result and an error status to the caller.
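Both variants can be sketched side by side. The seqList layout below is an assumption (the question does not show its definition), int stands in for DataType, and TryGetData is an invented name for the variant that validates an externally supplied index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout; the question does not show how seqList is defined. */
typedef struct {
    int *data;
    int  length;
} seqList;

/* Contract: L is valid and 0 <= index < L->length.
 * A violation is a bug in the caller, so it is caught by assert only. */
int GetData(const seqList *L, int index)
{
    assert(L != NULL);
    assert(index >= 0 && index < L->length);
    return L->data[index];
}

/* For an index that comes from external input: report failure to the
 * caller through the return value and print nothing. */
bool TryGetData(const seqList *L, int index, int *out)
{
    assert(L != NULL && out != NULL);
    if (index < 0 || index >= L->length)
        return false;
    *out = L->data[index];
    return true;
}
```

Compiling with NDEBUG defined removes the assert checks once they are no longer needed, while the validation in TryGetData always stays in place.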

Return a reserved value that's invalid for a success condition. For example, NULL.
It is advisable not to print because:
It doesn't help the calling code reason about the error.
You're writing to a stream that might be used by higher level code for something else.
The error may be recoverable higher, so you might be just printing misleading error messages.
As others have said, consistency in how you deal with error conditions is also an important factor. But consider this:
If your code is used as a component of another application, one that does not follow your printing convention, then by printing you're not allowing the client code to remain faithful to its own strategy. Thus, using this strategy you impose your convention to all related code.
On the other hand, if you follow the "cleaner" solution of returning a reserved value and the client code wants to follow the printing convention, the client code can easily adapt to what you return and even print an error, by making simple wrappers around your functions. Thus, using this strategy you give the users of your code enough space to choose the strategy that best works for them and to be faithful to it.
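A sketch of that second strategy, with made-up names: the library-style function returns NULL as its reserved value and stays silent, while a client that prefers the printing convention adds it in a thin wrapper.

```c
#include <stdio.h>
#include <string.h>

struct player {
    const char *name;
    int         score;
};

/* Library-style lookup: returns NULL for "not found" and prints nothing. */
const struct player *find_player(const struct player *players, size_t count,
                                 const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(players[i].name, name) == 0)
            return &players[i];
    return NULL;
}

/* Client-side wrapper for code that wants the printing convention. */
const struct player *find_player_or_warn(const struct player *players,
                                         size_t count, const char *name)
{
    const struct player *p = find_player(players, count, name);
    if (p == NULL)
        fprintf(stderr, "warning: no player named \"%s\"\n", name);
    return p;
}
```

Going the other way is not possible: if find_player printed by itself, no wrapper could take the message back.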

It is always best if code deals with one thing only. It is easier to understand, it is easier to use, and is applicable in more instances.
Your GetData function that prints an error message isn't suitable for use in cases where there may not be a value, i.e. where the calling code wants to try to get a value and handle the error by using a default if it doesn't exist.
Since GetData doesn't know the context, it can't report a good error message. For example, higher up the call stack we can report "hey, you forgot to give this user an age", whereas in GetData all it knows is that we couldn't get some value.
What about a multithreaded situation? GetData seems like it would be something that might get called from multiple threads. With a random bit of IO shoved in the middle it will cause contention over who has access to the console if all the threads need to write at the same time.

How to evaluate if a code is correct against a submitted solution

I'm looking for information about how to compare two programs and decide whether the code submitted by someone is correct or not (based on a solution defined beforehand).
I could compare the output, but many programs may have the same output. So I think I must somehow compare the code itself and give a percentage of similarity.
Can anybody help me?
(The language is C, but I think this isn't important.)

Some of my teachers used online automated program grading systems like http://web-cat.org/
In the assignment they would specify a public api you must provide, and then they would just write tests against your functions, much like unit tests. They would intentionally pick tests that would exploit boundary conditions and other things students are notorious for not thinking about, and just call your code with many different inputs to try to get your code to fail.
Sometimes they would hardcode the expected values, other times they would allow values within a range, and other times they just did the assignment themselves and made it so your own code has to match the results produced by their code.
Obviously, not all programs can be effectively graded this way. It's also kinda error prone in that sometimes even the teacher made a mistake and overflowed an int or something, and then the correct student submissions wouldn't match the teacher's incorrect results. But a system doesn't need to be perfect to be useful, and I think this raises an important point: manually grading by reading the code won't necessarily reveal all mistakes either.
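Here is a rough sketch of the "match the teacher's code" variant, under invented assumptions: the assignment is to clamp an int to the range 0..100, every submission is compiled as a function with the same signature, and the inputs deliberately include boundary values.

```c
#include <limits.h>
#include <stddef.h>

typedef int (*clamp_fn)(int);

/* Teacher's reference solution for the hypothetical assignment. */
int reference_clamp(int x)
{
    if (x < 0)
        return 0;
    if (x > 100)
        return 100;
    return x;
}

/* Example of a flawed submission: the lower bound is forgotten. */
int buggy_clamp(int x)
{
    return x > 100 ? 100 : x;
}

/* Returns the number of inputs on which the submission disagrees
 * with the reference solution. */
size_t count_mismatches(clamp_fn submission)
{
    static const int inputs[] = { INT_MIN, -1, 0, 1, 50, 99, 100, 101, INT_MAX };
    size_t mismatches = 0;

    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
        if (submission(inputs[i]) != reference_clamp(inputs[i]))
            mismatches++;
    return mismatches;
}
```

A grade could then be derived from the mismatch count. As noted above, a bug in the reference solution would make correct submissions fail, so the reference deserves tests of its own.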

Another possibility is to copy the submitted code, strip out all of the white space and search for substrings that must exist for the code to be correct and/or substrings that cannot exist for the code to be considered correct. The troublesome bit might be setting up to allow for some of the more tricky requirements such as [(a or c),((a or b) and c),((a or b) and c)], where the variables are the result of a boolean check as to whether the substring related to the variable exists within the code.
For example, [("printf"),("for"), (not "1,2,3,4,5,6,7,9,10")] would require that "printf" and "for" be substrings in the code, while "1,2,3,4,5,6,7,9,10" must not be. I'm not familiar with C, so I'm assuming here that "printf" is required to be able to print anything without involving output streams, which could be accounted for by something like [("printf" or "out"),("for"), (not "1,2,3,4,5,6,7,9,10")], where "out" is part of C code required to make use of output streams.
It might be possible to automatically find required substrings based on a "correct" code, but as others have mentioned, there are alternative ways to do things. Which is why hard-coding the "solution" is probably required. Even so, it's quite possible that you'll miss a required substring, and it'll be marked as wrong, but it's probably the only way you can do what you ask with some degree of success.
Regular expressions might be useful here.
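A minimal sketch of this idea in C, using strstr() rather than regular expressions and hard-coding the example rule [("printf"),("for"), (not "1,2,3,4,5,6,7,9,10")]. The function names are invented for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of src with all whitespace removed,
 * or NULL if allocation fails. The caller must free the result. */
char *strip_whitespace(const char *src)
{
    char *dst = malloc(strlen(src) + 1);
    size_t n = 0;

    if (dst == NULL)
        return NULL;
    for (; *src != '\0'; src++)
        if (!isspace((unsigned char)*src))
            dst[n++] = *src;
    dst[n] = '\0';
    return dst;
}

/* Checks the hard-coded rule: "printf" and "for" must occur in the
 * stripped code, and the literal number list must not. */
bool passes_substring_rules(const char *source_code)
{
    char *code = strip_whitespace(source_code);
    bool ok;

    if (code == NULL)
        return false;
    ok = strstr(code, "printf") != NULL
      && strstr(code, "for") != NULL
      && strstr(code, "1,2,3,4,5,6,7,9,10") == NULL;
    free(code);
    return ok;
}
```

Plain substring matching is crude: "for" also matches inside an identifier such as "format", and a comment containing "printf" satisfies the rule. That is one reason this approach can only ever be a heuristic.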

how to test c code or libraries written in c

I am new to the testing field. First, I would like to ask in what ways a C application can be tested (with a C framework or a C tool), how I should start, what the steps are, and which are the best tools I can use for testing C code.
Need some help and some documentation.
Thx

What a unit testing tool or framework usually does is automate all input sets and check outputs for valid results, as well as run negative tests, i.e. pass invalid values and check for an appropriate response; the system should at least remain stable. For example, if a function says it only processes positive numbers, ideally it should be able to say "invalid data" when passed a negative number, instead of giving wrong answers or, worse, crashing.
At the API level, say you have a function which takes a number and returns its square: you write a script (or have a tool) which calls that function repeatedly, passing it all valid inputs (or at least inputs of each different type, such that each class is covered).
This would mean testing boundary conditions (min max values), basic use case conditions and negative conditions etc.
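For the square example, such a script might look like the table-driven sketch below. The function square and its "return false instead of overflowing" convention are assumptions made for illustration, and the boundary values 46340 and 46341 assume a 32-bit int.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical SUT: stores x squared in *out, or returns false if the
 * result does not fit in an int. Assumes long long is wider than int. */
bool square(int x, int *out)
{
    long long wide = (long long)x * x;
    if (wide > INT_MAX)
        return false;
    *out = (int)wide;
    return true;
}

struct square_case {
    int  input;
    bool should_succeed;
    int  expected;        /* only checked when should_succeed is true */
};

/* Runs every case and returns the number of failed cases. */
size_t count_square_failures(void)
{
    /* The two boundary rows assume a 32-bit int. */
    static const struct square_case cases[] = {
        { 0,       true,  0 },            /* basic use cases */
        { 7,       true,  49 },
        { -7,      true,  49 },
        { 46340,   true,  2147395600 },   /* largest input that still fits */
        { 46341,   false, 0 },            /* smallest input that overflows */
        { INT_MAX, false, 0 },            /* extremes */
        { INT_MIN, false, 0 },
    };
    size_t failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int got = 0;
        bool ok = square(cases[i].input, &got);
        if (ok != cases[i].should_succeed || (ok && got != cases[i].expected))
            failures++;
    }
    return failures;
}
```

Adding a new input class means adding one row to the table, which keeps the test cheap to extend when a bug report reveals a case nobody thought of.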
Beyond unit tests you can do white box testing. Such as code coverage i.e. ensuring you have executed test cases which cover most if not all code paths.
Automating some/most of the above so that it can be repeatedly executed and validated every time a change is made is called regression testing.
Then there are several other areas of testing such as localization, globalization, security testing etc. to name a few.
