I have the following C code that uses sqlite3:
if (SQLITE_OK == sqlite3_initialize()) {
    self->db_open_result = sqlite3_open(self->db_uri, &(self->db));
} else {
    self->db_open_result = SQLITE_ERROR;
}
Obviously I have a pretty high confidence that the code is correct and will behave as expected. However, I am measuring code coverage of my unit tests using gcov/lcov and I'm curious about how I might get my coverage number to 100% in this case. Under normal circumstances sqlite3_initialize() is never going to fail, so the else clause will never execute.
Is there a way to cause this to fail that isn't totally disruptive?
You want your unit tests to test your code, but you also want to know that all of that code, including the error branches, has actually been exercised. One way to do that is "mocking": you replace the real libraries (such as SQLite) with fake, or "mock", libraries and then run your program against the fakes.
Whether the library is replaced at compile time or at run time is incidental, but in C it is easier to do at compile or link time. You can write the mocks by hand, or you can use a tool such as CMock.
In the faked library, you then provoke various errors and failures. Notably, the faked library doesn't even have to do anything or even keep track of much or any state, you can often get quite far by returning "OK" or "FAIL".
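A hand-written mock for the snippet in the question could look roughly like the sketch below. It is only an illustration: the SQLITE_OK and SQLITE_ERROR defines and the sqlite3 typedef stand in for sqlite3.h so the file builds without SQLite, and Store, store_open and the two mock_*_result variables are hypothetical names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for declarations that normally come from sqlite3.h. */
#define SQLITE_OK    0
#define SQLITE_ERROR 1
typedef struct sqlite3 sqlite3;

/* Knobs the test sets before calling the code under test (hypothetical). */
static int mock_initialize_result = SQLITE_OK;
static int mock_open_result = SQLITE_OK;

/* Fake library: keeps no state, just returns what the test asked for. */
int sqlite3_initialize(void) { return mock_initialize_result; }

int sqlite3_open(const char *filename, sqlite3 **ppDb) {
    (void)filename;
    *ppDb = NULL;
    return mock_open_result;
}

/* Code under test, wrapped in a hypothetical struct and function. */
typedef struct {
    const char *db_uri;
    sqlite3 *db;
    int db_open_result;
} Store;

void store_open(Store *self) {
    if (SQLITE_OK == sqlite3_initialize()) {
        self->db_open_result = sqlite3_open(self->db_uri, &(self->db));
    } else {
        self->db_open_result = SQLITE_ERROR;
    }
}

/* One check per branch, so both show up as covered in gcov. */
void test_store_open(void) {
    Store s = { "test.db", NULL, -1 };

    mock_initialize_result = SQLITE_OK;
    store_open(&s);
    assert(s.db_open_result == SQLITE_OK);

    mock_initialize_result = SQLITE_ERROR;   /* forces the else branch */
    store_open(&s);
    assert(s.db_open_result == SQLITE_ERROR);
}
```

In a real project the fake functions would live in their own source file that is linked into the test executable in place of the SQLite library.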
Is there a way to cause this to fail that isn't totally disruptive?
For portability reasons you should check that the function succeeded. What happens if the SQLite library is not installed, for example? The library cannot be initialized in that case.
"If for some reason, sqlite3_initialize() is unable to initialize the library (perhaps it is unable to allocate a needed resource such as a mutex) it returns an error code..."
So, if you want portability, check the error.
In our environment we're encountering a problem regarding mocking functions for our library unit tests.
The thing is that instead of mocking whole modules (.c files) we would like to mock single functions.
The library is compiled to an archive file and linked statically to the unit test. Without mocking there isn't any issue.
Now when trying to mock single functions of the library we would get multiple definitions obviously.
My approach now is to use the weak function attribute when compiling/linking the library so that the linker takes the mocked (non-weak) function when linking against the unit test. I already tested it and it seems to work as expected.
The downside of this is that we need many attribute declarations in the code.
My final approach would be to pass some compile or link arguments to the compiler so that every function is automatically declared as a weak symbol.
The question now is: Is there anything to do this in a nice way?
btw: We use clang 8 as a compiler.
James Grenning describes several options to solve this problem (http://blog.wingman-sw.com/linker-substitution-in-c-limitations-and-workarounds). The option "function pointer substitution" gives a high degree of freedom. It works as follows: Replace functions by pointers to functions. The function pointers are initialized to point to the original function, but each pointer can be redirected individually to a test double.
This approach lets you keep one single test executable while still deciding, for each test case individually, which functions use a test double and which use the original function.
It certainly also comes at a price:
One indirection for each call. But, if you use link-time-optimization the optimizer will most likely eliminate that indirection again, so this may not be an issue.
You make it possible to redirect function calls also in production code. This would certainly be a misuse of the concept, however.
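A minimal sketch of function pointer substitution, under the assumption of a made-up library function: read_sensor, sensor_is_hot and the fake are hypothetical names chosen for illustration and do not come from the linked article.

```c
#include <assert.h>

/* Original implementation, as it would exist in the library. */
static int read_sensor_impl(void) { return 42; }

/* Callers go through this pointer, which defaults to the original. */
int (*read_sensor)(void) = read_sensor_impl;

/* Library code that depends on read_sensor. */
int sensor_is_hot(void) { return read_sensor() > 100; }

/* Test double, used only by the test cases that ask for it. */
static int read_sensor_fake_hot(void) { return 150; }

/* A test case that redirects the pointer and restores it afterwards. */
void test_sensor_is_hot(void) {
    assert(sensor_is_hot() == 0);          /* original function */

    read_sensor = read_sensor_fake_hot;    /* redirect to the double */
    assert(sensor_is_hot() == 1);

    read_sensor = read_sensor_impl;        /* restore for the next test */
    assert(sensor_is_hot() == 0);
}
```

Because each pointer is redirected individually, no weak attributes are needed and there are no duplicate definitions at link time.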
I would suggest using VectorCAST
https://www.vector.com/us/en/products/products-a-z/software/vectorcast/
I've used Unity/CMock and others for unit testing C in the past, but after a while it is very tedious to create these mocks manually for a language that isn't really built around that concept and that very much takes a "here's a hammer and chisel, the world is yours" approach.
VectorCAST abstracts away the majority of the manual work that is required with tools like Unity/CMock, so we get results across a project or module sooner than we did in the past with the other tools.
Is VectorCAST expensive and very much an enterprise-level tool? Yes... but it's definitely worth its weight in gold. And that's coming from someone who is very old school and takes a manual approach to software development... just text editors, terminals and command-line debuggers.
VectorCAST handles pointers and function pointers extremely well, and stubbing a function is just two clicks away. It saved our team a lot of time... allowing us to focus on results and shortening the development feedback loop.
I have a function which does basic error checking before returning the next node of a linked list:
Node *next_node(Node *n) {
    switch(type(n)){
    case A:
    case B:
    .
    .
    case N:
        return n->next;
    default:
        exit(1); //This is just a representation of my code handling error scenario.
    }
}
There are more wrapper functions on top of it which do certain things depending on the return value of the function above.
These wrapper functions are used extensively throughout my code base, so much so that when I run a profiler they show up as time-consuming routines.
This makes sense to me, because there are thousands of calls to these functions from different areas of the code, and that many calls will take time due to function call overhead and/or instruction cache misses.
I also know that compilers can optimize such functions by inlining them when generating assembly code. That does not seem to be happening, presumably because of the way the functions are currently implemented.
So, my questions are:
What is the common way of writing such wrapper functions so that compiler can optimize them for low runtime?
How do other companies handle such a scenario in their code base?
Note1: the code above is just a representation; there are a lot of such wrapper functions in the code base. So if anyone has an idea for reducing the runtime cost of wrapper functions, please share it.
Note2: I'm using gcc as compiler and my code base is entirely in C.
So you seem to have a legitimate case for low-level optimization, because you have profiled your code and found a time-consuming function.
Assuming that you already use the highest optimization level of your compiler, the bad news is that there is no general answer to such a question. I can only give you some hints here:
are there redundant operations somewhere? If yes, you can try to make sure that you do them only once
are there loops that could be unrolled? There is not much to gain beyond saving a few loop tests, at the price of longer code
is assembly code an option? If yes, you can try to write all or part of the function in assembly language
But you could (should?) also wonder whether:
the general structure of your application is correct - if you can manage to call the function less often, you spend less time in it...
you really need the tests in production code. If they are assertions more than tests, they should only be conditionally included with a #ifdef in test and debug versions.
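The last hint can be sketched as follows. This is an assumption-laden illustration, not the asker's real code: the Node layout and the NodeType values are invented, and the exit(1) error path is deliberately swapped for assert(), which the preprocessor removes when NDEBUG is defined (for example with gcc -DNDEBUG). Declaring the wrappers static inline in a header makes their bodies visible to every caller, which gives gcc the chance to inline them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout for illustration only. */
typedef enum { NODE_A, NODE_B, NODE_INVALID } NodeType;

typedef struct Node {
    NodeType type;
    struct Node *next;
} Node;

/* Body visible in every translation unit that includes the header,
   so the compiler can inline it without link-time optimization. */
static inline Node *next_node(const Node *n) {
    /* Debug-only check: compiled out entirely when NDEBUG is defined. */
    assert(n != NULL && n->type != NODE_INVALID);
    return n->next;
}

/* A wrapper built on top of next_node; also a candidate for inlining. */
static inline size_t list_length(const Node *n) {
    size_t len = 0;
    for (; n != NULL; n = next_node(n)) {
        len++;
    }
    return len;
}
```

Whether this helps in practice has to be confirmed with the profiler; inlining trades call overhead for code size.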
I have a large code base of quite old C code on an embedded system and unfortunately there are no automated test cases/suites. This makes restructuring and refactoring code a dangerous task.
Manually writing test cases is very time consuming, so I thought that it should be possible to automate at least some part of this process for instance by tracing all the function calls and recording of the input and output values. I could then use these values in the test cases (this would not work for all but at least for some functions). It would probably also be possible to create mock functions based on the gathered data.
Having such test cases would make refactoring a less dangerous activity.
Are there any solutions that already can do this? What would be the easiest way to get this to work if I had to code it myself?
I thought about using ctags to find the function definitions, and wrapping them in a function that records the parameter values. Another possibility would probably be a gcc compiler plugin.
There is a gcc option, "-finstrument-functions", whose mechanism lets you define your own callbacks for each function's entry and exit.
Google it and you can find many good examples.
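As a rough sketch of what such callbacks look like: when a file is compiled with -finstrument-functions, gcc inserts calls to __cyg_profile_func_enter and __cyg_profile_func_exit around every function. The hooks themselves must be excluded from instrumentation. The call_depth counter and check_hooks are hypothetical additions for this example.

```c
#include <assert.h>
#include <stdio.h>

static int call_depth = 0;   /* current nesting level, for indentation */

/* Called on entry to every instrumented function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    call_depth++;
    fprintf(stderr, "%*senter %p (called from %p)\n",
            call_depth, "", this_fn, call_site);
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    fprintf(stderr, "%*sexit  %p (called from %p)\n",
            call_depth, "", this_fn, call_site);
    call_depth--;
}

/* Calls the hooks by hand, the way the compiler-inserted calls would. */
__attribute__((no_instrument_function))
void check_hooks(void) {
    int before = call_depth;
    __cyg_profile_func_enter(NULL, NULL);
    assert(call_depth == before + 1);
    __cyg_profile_func_exit(NULL, NULL);
    assert(call_depth == before);
}
```

The hooks only receive addresses; mapping them back to function names needs a separate step such as addr2line or the symbol table.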
[Edit] With this gcc option's callbacks you can only track a function's entry and exit, not its parameters. With some tricks you may be able to track the parameters too (by walking the current frame pointer to find the parameters on the stack).
Here is an article that discusses the idea behind such an implementation:
http://linuxgazette.net/151/melinte.html
Furthermore, depending on your embedded system, on Linux you can try something like ltrace to show the parameters (similar to how strace works). There are many tools that do function tracing in user space or kernel space on Linux: ftrace, ust, ltrace, utrace, strace, systemtap. In any case, if you do not add any hard-coded debugging code, it is not possible to display the parameters correctly. If you accept the effort of adding entry/exit debugging information, it becomes much easier.
Here is also a similar thread that discusses this problem:
Tool to trace local function calls in Linux
Is there a way to programmatically check if a single C source file is potentially harmful?
I know that no check will yield 100% accuracy -- but am interested at least to do some basic checks that will raise a red flag if some expressions / keywords are found. Any ideas of what to look for?
Note: the files I will be inspecting are relatively small in size (few 100s of lines at most), implementing numerical analysis functions that all operate in memory. No external libraries (except math.h) shall be used in the code. Also, no I/O should be used (functions will be run with in-memory arrays).
Given the above, are there some programmatic checks I could do to at least try to detect harmful code?
Note: since I don't expect any I/O, if the code does I/O -- it is considered harmful.
Yes, there are programmatic ways to detect the conditions that concern you.
It seems to me you ideally want a static analysis tool to verify that the preprocessed version of the code:
Doesn't call any functions except those it defines and non I/O functions in the standard library,
Doesn't do any bad stuff with pointers.
By preprocessing, you get rid of the problem of detecting macros, possibly-bad-macro content, and actual use of macros. Besides, you don't want to wade through all the macro definitions in standard C headers; they'll hurt your soul because of all the historical cruft they contain.
If the code only calls its own functions and trusted functions in the standard library, it isn't calling anything nasty. (Note: It might be calling some function through a pointer, so this check either requires a function-points-to analysis or the agreement that indirect function calls are verboten, which is actually probably reasonable for code doing numerical analysis).
The purpose of checking for bad stuff with pointers is so that it doesn't abuse pointers to manufacture nasty code and pass control to it. This first means, "no casts to pointers from ints" because you don't know where the int has been :-}
For the who-does-it-call check, you need to parse the code and name/type resolve every symbol, and then check call sites to see where they go. If you allow pointers/function pointers, you'll need a full points-to analysis.
One of the standard static analyzer tool companies (Coverity, Klocwork) likely provides some way of restricting which functions a code block may call. If that doesn't work, you'll have to fall back on more general analysis machinery like our DMS Software Reengineering Toolkit with its C Front End. DMS provides customizable machinery to build arbitrary static analyzers for a language whose description is provided to it as a front end. DMS can be configured to do exactly test 1), including the preprocessing step; it also has full points-to and function-points-to analyzers that could be used for the points-to checking.
For 2) "doesn't use pointers maliciously", again the standard static analysis tool companies provide some pointer checking. However, here they have a much harder problem because they are statically trying to reason about a Turing machine. Their solution is either to miss cases or to report false positives. Our CheckPointer tool is a dynamic analysis, that is, it watches the code as it runs, and if there is any attempt to misuse a pointer CheckPointer will report the offending location immediately. Oh, yes, CheckPointer outlaws casts from ints to pointers :-} So CheckPointer won't provide a static diagnostic "this code can cheat", but you will get a diagnostic if it actually attempts to cheat. CheckPointer has rather high overhead (all that checking costs something), so you probably want to run your code with it for a while to gain some faith that nothing bad is going to happen, and then stop using it.
EDIT: Another poster says "There's not a lot you can do about buffer overwrites for statically defined buffers." CheckPointer will do those tests and more.
If you want to make sure it's not calling anything not allowed, then compile the piece of code and examine what it's linking to (say via nm). Since you're hung up on doing this by a "programmatic" method, just use python/perl/bash to compile then scan the name list of the object file.
There's not a lot you can do about buffer overwrites for statically defined buffers, but you could link against an electric-fence type memory allocator to prevent dynamically allocated buffer overruns.
You could also compile and link the C-file in question against a driver which would feed it typical data while running under valgrind which could help detect poorly or maliciously written code.
In the end, however, you're always going to run up against the "does this routine terminate" question, which is famous for being undecidable. A practical way around this would be to compile your program and run it from a driver which would alarm-out after a set period of reasonable time.
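One way to sketch such a driver on a POSIX system is to run the routine in a child process that arms alarm(), so that a runaway routine is killed by SIGALRM while the parent observes the outcome. The names run_with_timeout, quick_task and spin_forever are hypothetical, and a real driver would also feed the routine its input data.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs fn in a child process.
   Returns 0 if it finished, 1 if it timed out, -1 on any other failure. */
int run_with_timeout(void (*fn)(void), unsigned seconds) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        alarm(seconds);   /* default SIGALRM action terminates the child */
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM) {
        return 1;
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return 0;
    }
    return -1;
}

/* Two stand-ins for the routine being vetted. */
static void quick_task(void) { }
static void spin_forever(void) {
    volatile unsigned counter = 0;
    for (;;) {
        counter++;
    }
}

void check_run_with_timeout(void) {
    assert(run_with_timeout(quick_task, 1) == 0);
    assert(run_with_timeout(spin_forever, 1) == 1);
}
```

Running the routine in a separate process also keeps a crash in the vetted code from taking the driver down with it.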
EDIT: Example showing use of nm:
Create a C snippet defining function foo which calls fopen:
#include <stdio.h>
void foo(void) {
    FILE *fp = fopen("/etc/passwd", "r");
}
Compile with -c, and then look at the resulting object file:
$ gcc -c foo.c
$ nm foo.o
0000000000000000 T foo
U fopen
Here you'll see that there are two symbols in the foo.o object file. One is defined: foo, the name of the subroutine we wrote. The other is undefined: fopen, which will be resolved to its definition when the object file is linked together with the other C files and necessary libraries. Using this method, you can see immediately whether the compiled object references anything outside of its own definitions, which by your rules can be considered "bad".
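The scan itself can be automated. The answer suggests python/perl/bash; the sketch below does the same thing in C to stay in the language of the thread. It classifies one line of nm output at a time against an allowlist. The allowlist contents and the function names are assumptions made for this example, and it assumes the nm output format shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Functions the inspected code may reference (example allowlist). */
static const char *const allowed[] = { "sqrt", "sin", "cos", "exp", "log", "pow" };

static int is_allowed(const char *name) {
    size_t i;
    for (i = 0; i < sizeof allowed / sizeof allowed[0]; i++) {
        if (strcmp(name, allowed[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if the nm line names an undefined symbol not on the allowlist.
   Undefined symbols have no address column, so the first non-blank
   character is the type letter 'U'; defined symbols start with a hex digit. */
int nm_line_is_disallowed(const char *line) {
    char type;
    char name[256];
    if (sscanf(line, " %c %255s", &type, name) != 2 || type != 'U') {
        return 0;
    }
    return !is_allowed(name);
}

/* Counts disallowed references in a whole nm listing. */
int count_disallowed(FILE *nm_output) {
    char line[512];
    int bad = 0;
    while (fgets(line, sizeof line, nm_output) != NULL) {
        bad += nm_line_is_disallowed(line);
    }
    return bad;
}

void check_nm_scan(void) {
    assert(nm_line_is_disallowed("0000000000000000 T foo\n") == 0);
    assert(nm_line_is_disallowed("                 U fopen\n") == 1);
    assert(nm_line_is_disallowed("                 U sqrt\n") == 0);
}
```

A driver could pass the stream returned by popen("nm foo.o", "r") to count_disallowed and raise a red flag on any nonzero result.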
You could do some obvious checks for "bad" function calls like network IO or assembly blocks. Beyond that, I can't think of anything you can do with just a C file.
Given the nature of C you're just about going to have to compile to even get started. Macros and such make static analysis of C code pretty difficult.
I have been thinking about the difficulty of error handling in C. For example, who actually writes
if(printf("hello world")==-1){exit(1);}
But you break common standards by not writing such verbose, and usually useless, code. So what if you had a wrapper around libc, so that you could do something like this:
//main...
error_catchall(my_errors);
printf("hello world"); //this will automatically call my_errors on an error of printf
ignore=1; //this makes it so the function will return like normal and we can check error values ourself
if(fopen.... //we want to know if the file opened or not and handle it ourself.
}
int my_errors(){
if(ignore==0){
_exit(1); //exit if we aren't handling this error by flagging ignore
}
return 0;
//this is called when there is an error anywhere in the libc
}
...
I am considering making such a wrapper as I am synthesizing my own BSD licensed libc(so I already have to touch the untouchable..), but I would like to know what people think about it..
would this actually work in real life and be more useful than returning -1?
Over the years I've seen several attempts to mimic try/catch in ANSI C:
http://simgrid.gforge.inria.fr/doc/group__XBT__ex.html
http://llg.cubic.org/trycatch/
I think the try/catch approach is simpler than yours.
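Such emulations are commonly built on setjmp/longjmp from the standard library. The following is a bare-bones sketch of that idea and not the implementation of either linked library; catch_point, last_error, throw_error, parse_digit and try_parse are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf catch_point;        /* where a "throw" jumps back to */
static volatile int last_error;    /* error code carried by the throw */

/* "throw": record the code and unwind to the matching setjmp. */
static void throw_error(int code) {
    last_error = code;
    longjmp(catch_point, 1);
}

/* A function deep in the call chain that reports failure by throwing. */
static int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw_error(1);
    }
    return c - '0';
}

/* Returns 0 and stores the digit on success, or the error code thrown. */
int try_parse(char c, int *out) {
    if (setjmp(catch_point) == 0) {   /* "try": first pass returns 0 */
        *out = parse_digit(c);
        return 0;
    }
    return last_error;                /* "catch": reached via longjmp */
}

void check_try_parse(void) {
    int value = -1;
    assert(try_parse('7', &value) == 0);
    assert(value == 7);
    assert(try_parse('x', &value) == 1);
    assert(value == 7);               /* untouched on failure */
}
```

A single global jmp_buf supports only one active "try" at a time; nesting would need a stack of them, and longjmp does not release resources acquired in the frames it skips.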
But how would you be able to catch the error when it was expected? For example I might expect a file open to fail and want to deal with it in code instead of the generic error catcher.
To do this you would need two versions of every function: one that traps errors and one that returns them.
I did something like this long ago without modifying the library. I just created wrapper functions for common calls that did error checking. So my errchk_malloc call checked the return and raised an error if the allocation failed. Then I just used this version everywhere in place of the built in malloc.
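A wrapper of that kind might look like the sketch below. Only the name errchk_malloc comes from the answer; the body, the error message and the choice to exit are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* malloc that never returns NULL for a nonzero request: it reports
   the failure and terminates instead. */
void *errchk_malloc(size_t size) {
    void *p = malloc(size);
    if (p == NULL && size != 0) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        exit(EXIT_FAILURE);
    }
    return p;
}

void check_errchk_malloc(void) {
    int *numbers = errchk_malloc(4 * sizeof *numbers);
    assert(numbers != NULL);   /* callers no longer need this check */
    numbers[3] = 7;
    assert(numbers[3] == 7);
    free(numbers);
}
```

Code that expects a failure and wants to handle it keeps calling plain malloc, which gives the two versions of the function mentioned above.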
If the goal is to exit cleanly as soon as you encounter an error, that's OK... but if you want to do even a minimum of error recovery, I can't see how your approach is useful...
To avoid this kind of problem, I sometimes use LD_PRELOAD to inject my own error management (only for my own projects, since this is not really good practice...)
Do you really want to change the standard behavior of your libc? You could add a few extensions around common functions.
For example, GLib (used by GNOME) provides g_malloc and g_try_malloc. The former will abort on failure while the latter will simply yield a null pointer like malloc.