Automatically deleting unused local variables from C source code - c

I want to delete unused local variables from C file.
Example:
int fun(int a, int b)
{
    int c,sum=0;
    sum=a + b;
    return sum;
}
Here the unused variable is 'c'.
I will have a list of all unused local variables from an external source. Using that list, I need to find those variables in the source code and delete them.
In the example above, "c" is the unused variable. I will already know this (I have code for that).
So here I have to find c and delete it.
EDIT
The point is not to find unused local variables with an external tool. The point is to remove them from code given a list of them.

Turn up your compiler warning level, and it should tell you.
Putting your source fragment in "f.c":
% gcc -c -Wall f.c
f.c: In function 'fun':
f.c:1: warning: unused variable 'c'

Tricky - you will have to parse C code for this. How close does the result have to be?
Example of what I mean:
int a, /* foo */
b, /* << the unused one */
c; /* bar */
Now, it's obvious to humans that the second comment has to go.
Slight variation:
void test(/* in */ int a, /* unused */ int b, /* out */ int* c);
Again, the second comment has to go, the one before b this time.
In general, you want to parse your input, filter it, and emit everything that's not the declaration of an unused variable. Your parser would have to preserve comments and #include statements, but if you don't #include headers it may be impossible to recognize declarations (even more so if macros are used to hide the declaration). After all, you need headers to decide if A * B(); is a function declaration (when A is a type) or a multiplication (when A is a variable).
[edit] Furthermore:
Even if you know that a variable is unused, the proper way to remove it depends a lot on remote context. For instance, assume
int foo(int a, int b, int c) { return a + b; }
Clearly, c is unused. Can you change it to the following?
int foo(int a, int b) { return a + b; }
Perhaps, but not if &foo is stored in an int(*)(int,int,int). And that may happen somewhere else. If (and only if) that happens, you should change it to
int foo(int a, int b, int /*unused*/ ) { return a + b; }
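To make the function-pointer point concrete, here is a minimal C sketch. The typedef binop3_fn and the helpers call_through_pointer and example_usage are hypothetical names standing in for "somewhere else" in the program. Note that leaving the parameter unnamed in a function definition, as in the line above, is accepted by C++ and by C23; older C standards require a name, so this sketch keeps the name and uses the common (void)c idiom instead.

```c
#include <assert.h>

/* Hypothetical callback type: another module expects three int arguments. */
typedef int (*binop3_fn)(int, int, int);

/* c is unused, but the signature must stay compatible with binop3_fn. */
int foo(int a, int b, int c)
{
    (void)c; /* marks the parameter as intentionally unused */
    return a + b;
}

/* Stands in for remote code that stores &foo in a three-argument pointer. */
int call_through_pointer(int a, int b, int c)
{
    binop3_fn fp = &foo;
    return fp(a, b, c);
}

void example_usage(void)
{
    /* The third argument is passed but has no effect on the result. */
    assert(call_through_pointer(1, 2, 99) == 3);
}
```

If the third parameter were simply deleted from foo, the assignment to fp would no longer type-check, which is exactly the remote-context problem described above.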

Why do you want to do this? Assuming you have a decent optimizing compiler (GCC, Visual Studio et al.), the binary output will not be any different whether you remove the 'int c' in your original example or not.
If this is just about code cleanup, any recent IDE will give you quick links to the source code for each warning, just click and delete :)

My answer is more of an elaborate comment to MSalters' very thorough answer.
I would go beyond 'tricky' and say that such a tool is both impossible and inadvisable.
If you are looking to simply remove the references to the variable, then you could write a code parser of your own, but it would need to distinguish between the function context it is in such as
int foo(double a, double b)
{
    b = 10.0;
    return (int) b;
}
int bar(double a, double b)
{
    a = 5.00;
    return (int) a;
}
Any simple parser would have trouble with both 'a' and 'b' being unused variables.
Secondly, if you consider comments as MSalters has, you'll discover that people do not comment consistently:
double a;
/*a is designed as a dummy variable*/
double b;
/*a is designed as a dummy variable*/
double a;
double b;
double a; /*a is designed as a dummy variable*/
double b;
etc.
So simply removing the unused variables will create orphaned comments, which are arguably more dangerous than not commenting at all.
Ultimately, it is an obscenely difficult task to do elegantly, and you would be mangling code regardless. By automating the process, you would be making the code worse.
Lastly, you should be considering why the variables were in the code in the first place, and if they are deprecated, why they were not deleted when all their references were.

Static code analysis tools can find these too, in addition to raising the compiler warning level as Paul correctly stated.

As well as being able to reveal these through warnings, the compiler will normally optimise these away if any optimisations are turned on. Checking if a variable is never referenced is quite trivial in terms of implementation in the compiler.

You will need a good parser that preserves original character position of tokens (even in presence of preprocessor!). There are some tools for automated refactoring of C/C++, but they are far from mainstream.
I recommend you to check out Taras' Blog. The guy is doing some large automated refactorings of Mozilla codebase, like replacing out-params with return values. His main tool for code rewriting is Pork:
Pork is a C++ parsing and rewriting
tool chain. The core of Pork is a C++
parser that provides exact character
positions for the start and end of
every AST node, as well as the set of
macro expansions that contain any
location. This information allows C++
to be automatically rewritten in a
precise way.
From the blog:
So far pork has been used for “minor”
things like renaming
classes&functions, rotating
outparameters and correcting prbool
bugs. Additionally, Pork proved itself
in an experiment which involved
rewriting almost every function (ie
generating a 3+MB patch) in Mozilla to
use garbage collection instead of
reference-counting.
It is for C++, but it may suit your needs.

One of the posters above says "impossible and inadvisable".
Another says "tricky", which is the right answer.
You need 1) a full C (or whatever language of interest) parser,
2) inference procedures that understand the language
identifier references and data flows to determine that a variable
is indeed "dead", and 3) the ability to actually modify
the source code.
What's hard about all this is the huge effort needed to build
1), 2) and 3). You can't justify it for any individual cleanup task.
What one can do is to build such infrastructure specifically
with the goal of amortizing it across lots of different
program analysis and transformation tasks.
My company offers such a tool: The DMS Software Reengineering
Toolkit. See
http://www.semdesigns.com/Products/DMS/DMSToolkit.html
DMS has production quality front ends for many languages,
including C, C++, Java and COBOL.
We have in fact built an automated "find useless declarations"
tool for Java that does two things:
a) lists them all (thus producing the list!)
b) makes a copy of the code with the useless declarations
removed.
You choose which answer you want to keep :-)
To do the same for C would not be difficult. We already
have a tool that identifies such dead variables/functions.
One case we did not address is the "useless parameter"
case, because to remove a useless parameter, you have
to find all the calls from other modules,
verify that setting up the argument doesn't have a side
effect, and rip out the useless argument.
We in fact have full graphs of the entire software
system of interest, and so this would also be
possible.
So, it's just tricky, and not even very tricky
if you have the right infrastructure.

You can solve the problem as a text processing problem. There must be a small number of regexp patterns that describe how unused local variables are defined in the source code.
Using a list of unused variable names and the line numbers where they are, You can process the C source code line-by-line. On each line You can iterate over the variable names. On each variable name You can match the patterns one-by-one. After a successful match You know the syntax of the definition, so You know how to delete the unused variable from it.
For example, if the source line is "int a, unused, b;" and the compiler reported "unused" as an unused variable on that line, then the pattern "/, unused,/" will match and You can replace that substring with a single ",".
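The line-by-line idea above can be sketched in C without a regex library. This is only an illustration under strong assumptions: the declaration fits on one line, the name is a plain identifier (no pointer stars or array brackets), and no initializer contains a comma. The function names remove_declarator, find_identifier and example_usage are made up for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static int is_ident_char(char ch)
{
    return isalnum((unsigned char)ch) || ch == '_';
}

/* Find `name` in `line` as a whole identifier, or return NULL. */
static char *find_identifier(char *line, const char *name)
{
    size_t n = strlen(name);
    if (n == 0)
        return NULL;
    for (char *p = strstr(line, name); p != NULL; p = strstr(p + 1, name)) {
        int left_ok = (p == line) || !is_ident_char(p[-1]);
        int right_ok = !is_ident_char(p[n]);
        if (left_ok && right_ok)
            return p;
    }
    return NULL;
}

/* Remove one declarator from a simple one-line declaration, in place.
 * Returns 1 if the line was edited, 0 if the case is not handled
 * (name absent, sole declarator, or declaration split across lines). */
int remove_declarator(char *line, const char *name)
{
    char *start = find_identifier(line, name);
    if (start == NULL)
        return 0;

    /* The declarator ends at the next ',' or ';' (this skips "= 0"). */
    char *end = start + strlen(name);
    end += strcspn(end, ",;");
    if (*end == '\0')
        return 0;

    if (*end == ',') {
        /* "unused, b": drop the declarator, its comma and following blanks. */
        end++;
        while (*end == ' ' || *end == '\t')
            end++;
    } else {
        /* "a, unused;": drop the preceding comma and the declarator. */
        char *q = start;
        while (q > line && isspace((unsigned char)q[-1]))
            q--;
        if (q == line || q[-1] != ',')
            return 0; /* sole declarator: the caller must delete the line */
        start = q - 1;
    }
    memmove(start, end, strlen(end) + 1);
    return 1;
}

void example_usage(void)
{
    char line[] = "int a, unused, b;";
    assert(remove_declarator(line, "unused") == 1);
    assert(strcmp(line, "int a, b;") == 0);
}
```

Even this small sketch needs three separate cases, which gives a feel for how quickly the pattern list grows once comments, pointers and multi-line declarations are allowed.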

Also: splint.
Splint is a tool for statically checking C programs for security vulnerabilities and coding mistakes. With minimal effort, Splint can be used as a better lint. If additional effort is invested adding annotations to programs, Splint can perform stronger checking than can be done by any standard lint.

Related

Is there a static C analyzer which detects uninitialized static variables? [duplicate]

I need to debug an ugly and huge math C library, probably once produced by f2c. The code is abusing local static variables, and unfortunately somewhere it seems to exploit the fact that these are automatically initialized to 0. If its entry function is called with the same input twice, it is giving different results. If I unload the library and reload it again, it works correctly. It needs to be fast, so I would like to get rid of the load/unload.
My question is that how to uncover these errors with valgrind or by any other tool without manually walking through the entire code.
I am hunting places where a local static variable is declared, read first, and written only later. The problem is even further complicated by the fact that the static variables are sometimes passed further via pointers (yep - it is so ugly).
I understand that one can argue that mistakes like this should not necessarily be detected by an automatic tool, as in some scenarios this is exactly the intended behaviour. Still, is there a way to make the auto-initialized local static variables "dirty"?
The devil is in the details, but this may work for you:
First, get Frama-C. If you are using Unix, your distribution may have a package. The package won't be the latest version but it may be good enough and it will save you some time if you install it this way.
Say your example is as below, only so much bigger that it's not obvious what is wrong:
int add(int x, int y)
{
    static int state;
    int result = x + y + state; // I tested it once and it worked.
    state++;
    return result;
}
Type a command like:
frama-c -lib-entry -main add -deps ugly.c
Options -lib-entry -main add mean "look at function add". Option -deps computes functional dependencies. You'll find these "functional dependencies" in the log:
[from] Function add:
state FROM state; (and default:false)
\result FROM x; y; state; (and default:false)
This lists the actual inputs the results of add depend on, and the actual outputs computed from these inputs, including static variables read from and modified. A static variable that was properly initialized before being used would normally not appear as input, unless the analyzer was unable to determine that it was always initialized before being read from.
The log shows state as dependency of \result. If you expected the returned result to depend only on the arguments (meaning two calls with the same arguments produce the same result), it's a hint there may be something wrong here, with the variable state.
Another hint shown in the above lines is that the function modifies state.
This may help or not. Option -lib-entry means that the analyzer does not assume that any non-const static variable has kept its value at the time the function under analysis is called, so that may be too imprecise for your code. There are ways around that, but then it is up to you whether you want to gamble the time it takes to learn these ways.
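The dependency of \result on state is also easy to observe at run time. The following small C sketch reuses the add function from the example; differs_on_repeat and example_usage are hypothetical helpers added here only to show that two calls with the same arguments disagree.

```c
#include <assert.h>

/* Same shape as the example: the result silently depends on hidden state. */
int add(int x, int y)
{
    static int state; /* zero-initialized once, before the first call */
    int result = x + y + state;
    state++;
    return result;
}

/* Returns 1 if two consecutive calls with identical arguments disagree. */
int differs_on_repeat(int x, int y)
{
    int first = add(x, y);
    int second = add(x, y);
    return first != second;
}

void example_usage(void)
{
    /* A function of its arguments alone would make this return 0. */
    assert(differs_on_repeat(1, 2) == 1);
}
```

A harness like this only confirms that hidden state exists; it does not say which variable is responsible, which is where the dependency listing helps.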
EDIT: here is a more complex example:
void initialize_1(int *p)
{
    *p = 0;
}
void initialize_2(int *p)
{
    *p; // I made a mistake here.
}
int add(int x, int y)
{
    static int state1;
    static int state2;
    initialize_1(&state1);
    initialize_2(&state2);
    // This is safe because I have initialized state1 and state2:
    int result = x + y + state1 + state2;
    state1++;
    state2++;
    return result;
}
On this example, the same command produces the results:
[from] Function initialize_1:
state1 FROM p
[from] Function initialize_2:
[from] Function add:
state1 FROM \nothing
state2 FROM state2
\result FROM x; y; state2
What you see for initialize_2 is an empty list of dependencies, meaning the function assigns nothing. I will make this case clearer by displaying an explicit message rather than just an empty list. If you know what any of the functions initialize_1, initialize_2 or add is supposed to do, you can compare this a priori knowledge to the results of the analysis and see that something is wrong for initialize_2 and add.
SECOND EDIT: and now my example shows something strange for initialize_1, so perhaps I should explain that. Variable state1 depends on p in the sense that p is used to write to state1, and if p had been different, then the final value of state1 would have been different. Here is a last example:
int t[10];
void initialize_index(int i)
{
    t[i] = 1;
}
int main(int argc, char **argv)
{
    initialize_index(argv[1][0]-'0');
}
With the command frama-c -deps t.c, the dependencies computed for initialize_index are:
[from] Function initialize_index:
t[0..9] FROM i (and SELF)
This means that each of the cells depends on i (it may be modified if i is the index of that particular cell). Each cell may also keep its value (if i indicates another cell): this is indicated with the (and SELF) mention in the latest version, and was indicated with a more obscure (and default:true) in previous versions.
Static code analysis tools are pretty good at finding typical programming errors like the use of uninitialized variables. Here is a list of free tools that do this for C.
Unfortunately I can't recommend any of the tools in the list. I am only familiar with two commercial products, Coverity and Klocwork. Coverity is very good (and expensive). Klocwork is so so (but less expensive).
What I did in the end is removed all static qualifiers from the code by '#define static'. This turns uninitialised static usage into invalid use, and the type of abuse I am hunting can be uncovered by the tools.
In my actual case this was enough to determine the place of the bug, but in a more general situation it should be refined if static's are actually doing something important, by gradually re-adding 'static' when the code fails to continue.
I don't know of any library that does this for you, but I would look into using regular expressions to find them. Something like
rgrep "static\s*int" path/to/src/root | grep -v = | grep -v "("
That should return all static int variables declared without an equals sign, and the last pipe should remove anything with parentheses in it (getting rid of functions). There's a good chance that this won't work exactly for you, but playing around with grep may be the fastest way for you to track this down.
Of course, once you find one that works you can replace int with all of the other kinds of variables to search for those too. HTH
My question is that how to uncover these errors ...
But these aren't errors: the expectation that a static variable is initialized to 0 is perfectly valid, as is assigning some other value to it.
So asking for a tool that will automatically find non-errors is unlikely to produce a satisfying result.
From your description, it appears that somefunc() returns correct result first time it is called, and incorrect result on subsequent calls.
The simplest way to debug such problems is to have two GDB sessions side-by-side: one freshly-loaded (will compute correct answer), and one with "second iteration" (will compute wrong answer). Then step through both sessions "in parallel", and see where their computation or control flow starts to diverge.
Since you can usually effectively divide the problem in half, it often doesn't take long to find the bug. Bugs that always reproduce are the easiest ones to find. Just do it.

Return a struct directly or fill a pointer?

Let's say I have the following function to initialize a data structure:
void init_data(struct data *d) {
    d->x = 5;
    d->y = 10;
}
However, with the following code:
struct data init_data(void) {
    struct data d = { 5, 10 };
    return d;
}
Wouldn't this be optimized away due to copy elision and be just as performant as the former version?
I tried to do some tests on godbolt to see if the assembly was the same, but when using any optimization flags everything was always entirely optimized away, with nothing left but something like this: movabsq $42949672965, %rax, and I am not sure if the same would happen in real code.
The first version I provided seems to be very common in C libraries, and I do not understand why as they should be both just as fast with RVO, with the latter requiring less code.
The first version I provided seems to be very common in C libraries, and I do not understand why as they should be both just as fast with RVO, with the latter requiring less code.
The main reason the first form is so common is historical. The second way of initializing structures from literals was not standard (well, it was, but only for static initializers and never for automatic variables), and it has never been allowed in assignments (though I have not checked the status of the recent standards). In ancient C, even a simple assignment such as:
struct A a, b;
...
a = b; /* this was not allowed a long time ago */
was not accepted at all.
So, in order to compile code on every platform, you have to write it the old way: modern compilers normally accept legacy code, while the opposite (old compilers accepting new code) is not possible.
This also applies to returning structures or passing them by value. Apart from normally being a huge waste of resources (it's common to see the whole structure copied onto the stack, or copied back to its proper place once the function returns), old compilers didn't accept these constructs, so to be portable you must avoid using them.
Finally, a comment: don't rely on your compiler to check whether both constructs generate the same code. It probably does, but you'll wrongly assume that this is universal and run into trouble. A different implementation can (and is allowed to) translate the code differently and produce different code.
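For reference, the two styles from the question can sit side by side in one file. This is only a sketch: the names init_data_ptr and init_data_val are renamed here so both versions can coexist, the layout of struct data is assumed from the question, and example_usage is a made-up helper. It shows that the two forms are interchangeable in behaviour; it says nothing about the generated code, which depends on the compiler and the platform ABI.

```c
#include <assert.h>

struct data {
    int x;
    int y;
};

/* Old, maximally portable style: the caller owns the storage. */
void init_data_ptr(struct data *d)
{
    d->x = 5;
    d->y = 10;
}

/* Return-by-value style: relies on struct return and aggregate
 * initialization of an automatic variable. */
struct data init_data_val(void)
{
    struct data d = { 5, 10 };
    return d;
}

void example_usage(void)
{
    struct data a;
    init_data_ptr(&a);
    struct data b = init_data_val();
    assert(a.x == b.x && a.y == b.y);
}
```

When comparing the two on a compiler explorer, calling them from a separate translation unit (or otherwise preventing inlining) keeps the optimizer from folding everything into a constant.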

Frama-C code slicer not loading any C files

I have a 1000-line C file with 10 maths algorithms written by a professor. I need to delete 9 of the maths functions and all their dependencies from those 1000 lines, so I am having a go with the Frama-C Boron Windows binary installer.
Now it won't load even the simplest example.c file: I select the source file and nothing loads.
The Boron edition is from 2010, so I checked how to compile a later Frama-C. They say that having a space in my Windows 7 user name can cause problems, which is not encouraging.
Is Frama-C my best option for my slicing task?
Here is an example file that won't load:
// File swap.c:
/*@ requires \valid(a) && \valid(b);
  @ ensures A: *a == \old(*b) ;
  @ ensures B: *b == \old(*a) ;
  @ assigns *a,*b ;
  @*/
void swap(int *a,int *b)
{
    int tmp = *a ;
    *a = *b ;
    *b = tmp ;
    return ;
}
Here is the code I wish to take only one function from: the option labelled smooth and swvd. https://sites.google.com/site/kootsoop/Home/cohens_class_code
I looked at the code you linked to, and it does not seem like the best candidate for a Frama-C analysis. For instance, that code is not strictly C99-conforming, using e.g. some old-style prototypes (including implicit int return types), functions that are used before they are defined without forward declarations (fft), and a missing header inclusion (stdlib.h). Those are not big issues, since the changes are relatively simple, and some of them are treated similarly to how gcc -std=c99 works: they emit warnings but not errors. However, it's important to notice that they do require a non-zero amount of time, therefore this won't be a "plug-and-play" solution.
On a more general note, Frama-C relies on CIL (C Intermediate Language) for C code normalization, so the sliced program will probably not be identical to "the original program minus the sliced statements". If the objective is merely to remove some statements but keep the code syntactically identical otherwise, then Frama-C will not be ideal [1].
Finally, it is worth noting that some Frama-C analyses can help finding dead code, and the result is even clearer if the code is already split into functions. For instance, using Value analysis on a properly configured program, it is possible to see which statements/functions are never executed. But this does rely on the absence of at least some kinds of undefined behavior.
E.g. if uninitialized variables are used in your program (which is forbidden by the C standard, but occasionally happens and goes unnoticed), the Value analysis will stop its propagation and the code afterwards may be marked as dead, since it is semantically dead w.r.t. the standard. It's important to be aware of that, since a naive approach would be misleading.
Overall, for the code size you mention, I'm not sure Frama-C would be a cost-effective approach, especially if (1) you have never used Frama-C and are having trouble compiling it (Boron is a really old release, not recommended) and (2) if you already know your code base, and therefore would be relatively proficient in manually slicing its parts.
[1] That said, I do not know of any C slicer that preserves statements like that; my point is that, while intuitively one might think that a C slicer would straightforwardly preserve most of the syntax, C is such a devious language that doing so is very hard, which is why most tools do some normalization steps beforehand.

Generating C code for functions of different signatures, but same implementation

A situation I run into a lot in writing C code (context is scientific computation) is that I will have functions which have exactly the same body modulo minor type differences. I realize C++ offers the template feature and function overloading which allows one to have only one copy of said function and let the compiler figure out what signature you meant to use when you build.
While this is a great feature in C++, my project is in C and I furthermore do not need the full power of templating. So far what I have tried is m4 macros on a candidate source file, and this spits out respective .c files with appropriate name mangling for the different types I need. The preprocessor could therefore accomplish this as well, but I'm attempting to avoid using it in complicated ways (my code needs to be understandable for reproducibility reasons). I'm not very good with m4, so all the files have been hacks that only work in specific cases and are inapplicable in new situations.
What do other people programming in C do when this is necessary? Manually produce and maintain the different permutations of function signatures? I'm hoping that isn't the best answer, or that a tool exists to automate this dreary and error prone task.
Apologies for vagueness, let me give a toy example. Suppose I have need to add two numbers. The function might look something like this:
float add(float x,float y){
return x+y;
}
Ok that's great for floats, but what if I need it for a wide range of types on which arithmetic is available. Ok I can do this
float add_f(float x,float y){...}
double add_lf(double x,double y){...}
unsigned int add_ui(unsigned int x, unsigned int y){...}
and so forth. If for some (probably stupid) reason I decide I need to also write the contents of the arguments to a binary file, I now have to add in the requisite file I/O code in every single function. Is there a simple way/tool to take an add function and spit out different ones with name mangling to avoid this annoying situation?
Basically in my m4 cases I would just find/replace a macro TYPE with the requisite type, and have a macro MANGLE() which mangles the functions, then I point the output to an alternate .c file. My m4 skills are lacking though.
Function pointers can help with the ultimate interface of my code, but eventually those pointers have to point to something, and then we're just enumerating all the possibilities again. I'm also unclear on how this might affect potential inlining of short functions.
The only thing I can think of is: make the algorithm itself independent of the type, have the user of your function supply their own function to handle the type-specific parts, and make one of the parameters to your function a pointer to that "handler function".
See the definition of the qsort routine for what I mean. qsort works for all kinds of data but handles the data itself opaquely: apart from the array and its length, all you pass to qsort is the size of each entry and a pointer to a function that does the real comparison.
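As a concrete instance of the handler-function pattern, here is a short sketch using the standard qsort from <stdlib.h>. The names compare_doubles, sort_doubles and example_usage are chosen for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Type-specific part: qsort only sees untyped pointers to two elements. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y); /* negative, zero or positive */
}

/* Generic part: qsort needs only the element size and the comparator. */
void sort_doubles(double *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_doubles);
}

void example_usage(void)
{
    double v[] = { 3.0, 1.0, 2.0 };
    sort_doubles(v, 3);
    assert(v[0] == 1.0 && v[1] == 2.0 && v[2] == 3.0);
}
```

The cost of this design is the indirect call per comparison, which is the inlining concern raised in the question.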
You appear to be asking for generic type support. While the macro processing can work in restricted domains, what you are doing is complex.
If the variants are so similar that simple type and name mangling is enough, then could you not use regular C #defines before each of multiple inclusions of the same source fragment to allow the preprocessor to perform the substitution? This way, at least there is only a single environment to manage.
Alternately, if the performance hit is not substantial, could you prepare multiple stub functions for each specialisation and map these to a generic version that can be called from the stubs?
I use GNU autogen for code generation tasks, which sounds somewhat like your current m4 solution, but might be better organized. For example:
type.def
autogen definitions type;
type = { name="int"; mangle="i"; };
type = { name="double"; mangle="lf"; };
type = { name="float"; mangle="f"; };
type = { name="unsigned int"; mangle="ui"; };
type.tpl
[+ autogen5 template
c=%s.c
(setenv "SHELL" "/bin/sh") +]/*
[+ (dne "* " "* ") +]
*/
[+
FOR type "\n" +][+name+] add_[+mangle+]([+name+] x, [+name+] y) { ... }[+ENDFOR+]
or something like that. This should spit out a function for each of the types in type.def looking something like:
unsigned int add_ui(unsigned int x, unsigned int y) { ... }
You can also have it insert type-specific code in certain places if needed, etc. You could have it output the add functions described above as well as the I/O versions. You'd have to compute the text for mangle instead of what I've got, but that's not a problem. You'd also have some conditional code for the I/O and a way to toggle the condition on and off (again, not a problem).
I'd definitely try and see if there was some way to generalize the algorithm, but this approach might have drawbacks (e.g. performance issues from not having the real underlying type) as well. But it sounds from the comments that this approach might not work for you.
I know that most C developers are afraid of it, but have you thought about using macros?
specific to your example:
// floatstuff.h
float add_f(float x,float y);
double add_lf(double x,double y);
unsigned int add_ui(unsigned int x, unsigned int y);
combined with:
// floatstuff.c
#define MY_CODE \
    return x + y

float
add_f (float x, float y)
{
    MY_CODE;
}

double
add_lf (double x, double y)
{
    MY_CODE;
}

unsigned int
add_ui (unsigned int x, unsigned int y)
{
    MY_CODE;
}
If the code you are using per function is truly identical, then this might be the solution you are looking for. It avoids most of the code duplication, maintains some degree of readability and has no impact on your runtime.
Also, if you keep the macro local to your .c file, you are unlikely to break anything, so no worries there either.
Also, you can do even more weird stuff using parameterized macros, which can give you even more reduced code duplication.
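One possible shape for the parameterized-macro variant is sketched below. The macro name DEFINE_ADD and the helper example_usage are made up for this example; the generated function names follow the add_f / add_lf / add_ui mangling from the question. Token pasting with ## builds the function name, so the body is written exactly once.

```c
#include <assert.h>

/* Expands to one complete function definition per (type, suffix) pair. */
#define DEFINE_ADD(TYPE, SUFFIX)          \
    TYPE add_##SUFFIX(TYPE x, TYPE y)     \
    {                                     \
        return x + y;                     \
    }

DEFINE_ADD(float, f)
DEFINE_ADD(double, lf)
DEFINE_ADD(unsigned int, ui)

void example_usage(void)
{
    assert(add_f(1.5f, 2.0f) == 3.5f);
    assert(add_ui(2u, 3u) == 5u);
}
```

Adding the file I/O mentioned in the question then means editing the macro body once rather than each function, at the price of error messages and debugger line numbers that point at the macro invocation.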

Resources