Is there a warning flag so that gcc will warn about every variable that is defined and not immediately initialized? I know about -Wuninitialized but it does not warn about every case.
Consider this code:
int foo(int i)
{
int a;
a= i%2 ? 0x42 : 42;
return a;
}
Even though this is perfectly valid code and does not cause UB, I would like a warning for this. The line int a; should be replaced by int a=0;. Can I tell gcc to warn about int a;?
Edit, a more complex example:
#include <stdio.h>
int foo(void)
{
int a;
if(1==scanf("%i",&a))
{
return a;
}
return 0;
}
Here I want gcc to warn about int a;
This does not seem a language problem, so I don't think GCC will ever react; actually, it is defining a value for a that might be warned against by some linter:
int a = 0;
^-- warning: redundant initialisation (value is immediately discarded)
a= i%2 ? 0x42 : 42;
What you can, perhaps, do - but this will not catch all possible cases - is recognize the issue at the static level.
I myself do this for some specific cases, which is why I have trained myself to write code in what is sometimes an unusual way. For example, I write free(p); p = NULL; on a single line, or, if p is to be immediately discarded, free(p); // p = NULL;, so that no pointer can ever point to freed memory, and I track this using grep. The same goes for fclose(fp); fp = NULL;.
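As a rough illustration of that free-and-reset habit (the function copy_length is made up for this sketch and is not from any real code base):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into a heap buffer, measures the copy,
   and releases the buffer again. Returns 0 if the allocation fails. */
size_t copy_length(const char *src)
{
    size_t len = strlen(src);
    char *p = malloc(len + 1);
    if (p == NULL) {
        return 0;
    }
    memcpy(p, src, len + 1);          /* copy including the terminator */
    size_t copied = strlen(p);
    free(p); p = NULL;                /* same line, so grep can check the pairing */
    return copied;
}
```

With the reset kept on the same line as the free, a grep for lines that contain free( but not NULL lists the places where a pointer might still refer to freed memory.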
In this case, at least for the common data types and simple initializations, you can recognize the unassignment with a regex too:
((unsigned\\s+)?(long\\s+)?(int|char|float|double)...
(Not a real regex, just throwing in tokens at random. You may want to look at other questions to this effect).
Once you have a means of singling out all definitions, any definition that does not include an equals sign is suspect and can be printed for further analysis. You can avoid an "unneeded initialization" by writing int a; //OK and ignoring the lines containing '//OK'.
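A minimal sketch of that line check, assuming a POSIX system (it uses regcomp and regexec from regex.h). The function name is_suspect_definition and the pattern are my own illustration: the pattern only covers a few basic types, one declarator per line, and a space after the type name, so it will miss plenty of real declarations.

```c
#include <regex.h>
#include <string.h>

/* Illustrative pattern only: optional sign, optional long/short,
   a basic type, optional stars, an identifier, then ';' with no '='. */
static const char *const PATTERN =
    "^[[:space:]]*"
    "((unsigned|signed)[[:space:]]+)?"
    "((long|short)[[:space:]]+)?"
    "(int|char|float|double)[[:space:]]+"
    "\\**[[:space:]]*"
    "[A-Za-z_][A-Za-z0-9_]*[[:space:]]*;";

/* Returns 1 if the line looks like a definition without an initializer
   and is not marked with //OK, otherwise 0. */
int is_suspect_definition(const char *line)
{
    regex_t re;

    if (strstr(line, "//OK") != NULL) {
        return 0;                     /* explicitly accepted by the author */
    }
    if (regcomp(&re, PATTERN, REG_EXTENDED | REG_NOSUB) != 0) {
        return 0;                     /* pattern failed to compile */
    }
    int suspect = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return suspect;
}
```

Feeding each source line to this function and printing the ones that return 1 gives the list of candidates to review by hand.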
In your code sample, being warned to replace the int a; line with int a = 0; will be counter-productive. A good compiler would/should warn you about the latter, as there you are assigning a value to a that will never be used (the value is immediately overwritten in the following line of code).
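For the first example in the question, one way to sidestep both complaints is to define the variable at the point where its value is known. This is only a sketch of that style and does not help in the scanf case, where the variable has to exist before the call:

```c
int foo(int i)
{
    /* Defined where its value is known: never uninitialized, and no
       throwaway "= 0" that is overwritten on the next line. */
    int a = i % 2 ? 0x42 : 42;
    return a;
}
```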
Related
Problem description
I'm developing a frama-c plugin that uses the slicing plugin as a library to remove unused bits of automatically generated code. Unfortunately the slicing plugin drops a bunch of stack values that are actually used. They are used insofar as their addresses are contained in structures that are handed off to abstract external functions.
Simple example
This is a simpler example that models the same general structure I have.
/* Abstract external function */
void some_function(int* ints[]);
int main() {
int i;
int *p = &i;
int *a[] = { &p };
some_function(a);
return 0;
}
When slicing this example with frama-c-gui -slice-calls some_function experiment_slicing.c (I haven't figured out how to see the slicing output when invoking the command line without the GUI), it drops everything but the declaration int *a[]; and the call to some_function.
Attempted fixes
I tried fixing it by adding ACSL annotations.
However what I believed to be the sensible specification (see below) did not work
/*@ requires \valid(ints) && \valid(ints[0]);
*/
void some_function(int* ints[]);
I then tried with an assigns clause (see below), which does have the desired behaviour. However, it is not a correct specification, since the function never actually writes to the pointer but needs to read it for correct functionality. I am worried that if I move ahead with such an incorrect specification it will lead to weird problems down the line.
/*@ requires \valid(ints) && \valid(ints[0]);
assigns *ints;
*/
void some_function(int* ints[]);
You are on the right track: it is the assigns clause that you should use here, as it indicates which parts of the memory state are affected by a call to an undefined function. However, you need to provide a complete assigns clause, with its \from part (which indicates which memory locations are read to compute the new value of the memory location written to).
I have added an int variable to your example, as your function isn't returning a result (void return type). For a function that is returning something, you should also have a clause assigns \result \from ...;:
int x;
/*@ assigns x \from indirect:ints[..], *(ints[..]); */
void some_function(int* ints[]);
int main() {
int i;
int*p = &i;
int *a[] = { &p };
some_function(a);
return 0;
}
The assigns clause indicates that some_function might change the value of x, and that the new value will be computed from the addresses stored in ints[..]
(the indirect label tells that we're not using their value directly, this is described in more detail in section 8.2 of Eva's manual), and their content.
Running frama-c -slice-calls some_function file.c -then-last -print gives the result below. The last arguments print the resulting file on the standard output: -then-last indicates that the following options should operate on the last Frama-C project created (here, the one resulting from the slicing), and -print prints the C code of said project. You may also use -ocode output.c to redirect the pretty-printed code into output.c.
/* Generated by Frama-C */
void some_function(int **ints);
void main(void)
{
int i;
int *p = & i;
int *a[1] = {(int *)(& p)};
some_function(a);
return;
}
Note in addition that your example is not well-typed: &p is a pointer to pointer to int, and should thus be stored in an int** array, not an int* array. But I assume that this only stems from reducing your original example and does not matter much for the slicing itself.
/*implementation of strcmp*/
#include<stdio.h>
#include<string.h>
/*length of the string*/
static const int MAX_LENGTH = 4;
/*gets string comparison's return value i.e. the difference between the first unmatched characters*/
int getStrCmpReturn(char[], char[]);
/*gets the maximum of the two integers*/
int max(int, int);
int main()
{
char string1[MAX_LENGTH];
char string2[MAX_LENGTH];
gets(string1);
gets(string2);
printf("\n%d", getStrCmpReturn(string1, string2));
return 0;
}
int getStrCmpReturn(char string1[], char string2[])
{
//int test = 50;
int strCmpReturn = 0;
int i;
for(i = 0; (i < max((int)strlen(string1), (int)strlen(string2))); i++)
{
if(string1[i] != string2[i])
{
strCmpReturn = string1[i] - string2[i];
break;
}
}
return strCmpReturn; //not required, why?
}
int max(int string1Length, int string2Length)
{
if(string1Length >= string2Length)
{
return string1Length;
}
else
{
return string2Length;
}
}
Look at the definition of the function getStrCmpReturn(). If the return statement is removed or commented out, the function still returns the value stored in the variable strCmpReturn. Even if an extra variable is added to it, like "int test = 50;" (shown in comments), the function still returns the value stored in the variable strCmpReturn.
How is the compiler able to guess that the value in "strCmpReturn" is to be returned, and not the ones stored in other variables like "test" or "i"?
The compiler is not guessing.
You have (without the required return) some undefined behavior. Be scared.
What might happen is that your particular compiler (with your particular compilation flags on your particular machine) has filled (by accident, bad luck or whatever reason) a processor register with some apparently suitable return value (in the place required by the relevant ABI and calling conventions).
With different compilers (or different versions of them), a different operating system or computer, or different optimization flags, you could observe some other behavior.
A compiler might use random numbers to allocate registers (or make other decisions), but for the sake of compiler writers it usually doesn't. In other words, compiler writers try to make their compilers reasonably deterministic, but the C11 standard (read n1570) does not require that.
6.9.1 Function definitions
...
12 If the } that terminates a function is reached, and the value of the function call is used by
the caller, the behavior is undefined.
C 2011 online draft
In plain English, the behavior of this code is not predictable. It's working as expected for you with your particular combination of hardware, OS, and compiler, but that may not be the case with a different compiler, or even in a different program using the same compiler.
"Undefined behavior" means that the compiler and runtime environment are not required to "do the right thing", whatever the right thing would be. The code may work as expected, or it may crash immediately, or it may corrupt other data leading to a crash later on, or any of a hundred other outcomes.
C's definition is a bit loose in places. There is a constraint (i.e. a semantic rule) that says if a return statement appears in a function that returns anything other than void, then it must be followed by an expression (the return value). Similarly, there's a constraint that says if a return statement appears in a function returning void, then it must not be followed by an expression. However, there are no constraints that say a return statement must be present in either case.
Believe it or not, knowing the history of C, this makes sense. C didn't originally have a void type, and there wasn't a good way to distinguish between functions that computed and returned a value and functions that just executed some statements. It was a bit of a pain to force a return on something whose value would never be used anyway, so the presence of a return statement is not enforced by either the grammar or any constraints.
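To stay out of undefined behavior, every path through a non-void function should end in an explicit return; with GCC, -Wall includes -Wreturn-type, which reports a missing one. Below is a sketch of a comparison function written that way. The name compare_strings is made up, and unlike the loop in the question it stops at the end of the shorter string instead of reading past it:

```c
#include <stddef.h>

/* Hypothetical strcmp-like helper: returns the difference between the
   first pair of characters that differ, or 0 if the strings are equal. */
int compare_strings(const char *s1, const char *s2)
{
    size_t i = 0;

    /* Advance while both strings agree and s1 has not ended. */
    while (s1[i] != '\0' && s1[i] == s2[i]) {
        i++;
    }
    /* The only exit path, and it returns a value explicitly. */
    return (unsigned char)s1[i] - (unsigned char)s2[i];
}
```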
With gcc 4.4.5, I have a warning with the following code.
char *f(void)
{
char c;
return &c;
}
But, when I use a temporary pointer, there is no warning anymore (even if the behavior is wrong).
char *f(void)
{
char c;
char *p = &c;
return p;
}
I heard that pointer analysis is difficult in C, but can gcc warn about such code?
Compilers, and most static analyzers, do not try to warn for everything wrong a program might do, because that would entail too many false positives (warnings that do not correspond to actual problems in the source code).
Macmade recommends Clang in the comments, a recommendation I can second. Note that Clang still aims at being useful for most developers by minimizing false positives. This means that it has false negatives, or, in other words, that it misses some real issues (when unsure that there is a problem, it may remain silent rather than risk wasting the developer's time with a false positive).
Note that it is even arguable whether there really is a problem in function f() in your program.
Function h() below is clearly fine, although the calling code mustn't use p after it returns:
char *p;
void h(void)
{
char c;
p = &c;
}
Another static analyzer I can recommend is Frama-C's value analysis (I am one of the developers). This one does not leave any false negatives, for some families of errors (including dangling pointers), when used in controlled conditions.
char *f(void)
{
char c;
return &c;
}
char *g(void)
{
char c;
char *p = &c;
return p;
}
$ frama-c -val -lib-entry -main g r.c
...
r.c:11:[value] warning: locals {c} escaping the scope of g through \result
...
$ frama-c -val -lib-entry -main f r.c
...
r.c:4:[value] warning: locals {c} escaping the scope of f through \result
...
The above are only informative messages, they do not mean the function is necessarily wrong. There is one for my function h() too:
h.c:7:[value] warning: locals {c} escaping the scope of h through p
The real error, characterized by the word “assert” in Frama-C's output, is if a function calls h() and then uses p:
void caller(void)
{
char d;
h();
d = *p;
}
$ frama-c -val -lib-entry -main caller h.c
...
h.c:7:[value] warning: locals {c} escaping the scope of h through p
...
h.c:13:[kernel] warning: accessing left-value p that contains escaping addresses; assert(Ook)
h.c:13:[kernel] warning: completely undefined value in {{ p -> {0} }} (size:<32>).
Frama-C's value analysis is called context-sensitive. It analyses function h() for each call, with the values that are actually passed to it. It also analyzes the code that comes after the call to h() in function caller() with the values that can actually be returned by h(). This is more expensive than the context-insensitive analyses that Clang or GCC typically do, but more precise.
In this first example, gcc can clearly see you're returning the address of an automatic variable that will no longer exist. In the second, the compiler would have to follow your program's logic, as p could easily point to something valid (e.g. an external character variable).
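To make the second case concrete, here is a sketch where the same return p; is sometimes valid and sometimes not. The names pick and external_c are made up for illustration, and some newer compilers may still print a "may return address of local variable" warning for it:

```c
/* Static storage duration: this object outlives every call. */
static char external_c = 'x';

/* Hypothetical variant of the question's function. */
char *pick(int want_external)
{
    char c = 'y';
    char *p = &c;                 /* p starts out pointing at a local ... */

    if (want_external) {
        p = &external_c;          /* ... unless this branch retargets it */
    }
    return p;                     /* usable only if want_external was nonzero */
}
```

Whether the returned pointer dangles depends on a value that is only known at run time, which is why a warning here needs some form of data-flow analysis rather than a look at the return expression alone.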
Although gcc won't complain here, it will warn with pointer use like this:
char *f(const char *x)
{
char *y = x;
...
}
Again, it can see without any doubt that you're removing the 'const' qualifier in this definition.
Another utility that will detect this problem is splint (http://splint.org).
We are able to modify the value behind a pointer to a constant integer (b in the code below). How can we prevent accidental modification of that value?
#include <stdio.h>
/**
* Snippet to understand the working of "pointer to integer(any) constant"
*
* We are able to modify the value behind a pointer to a constant integer (b),
* how can we prevent accidental modification of that value?
*
*/
void modify_value(const int *m, const int *n) {
//*m = 50; // expected error, assignment of read-only location
*((int*)n) = 100; // value of pointed by pointer gets updated !!
}
int main() {
int a=5,b=10;
printf("a : %d , b : %d \n", a,b);
modify_value(&a,&b);
printf("a : %d , b : %d \n", a,b);
return 0;
}
As far as I know there's no watertight way to prevent this, but there are ways to avoid doing it by mistake.
For example, some compilers have warnings for this, and you can even turn warnings into errors (i.e. fail to compile if a warning is triggered). On GCC you can use -Wcast-qual and -Werror (or -Werror=cast-qual). But this will not completely prevent you from modifying data pointed to by a const *, because, as with many warnings, there are ways to work around them. For example, on some platforms you could cast via an integral type, as in (int*)((char const*)m - (char*)NULL), but note that this is not a portable construct (then again, I think casting away constness is a non-portable construct anyway).
If you want to go a bit further you could of course recompile GCC to better keep track on qualifiers and prohibit some of these workarounds, but that might be at the cost of dropping standard conformance.
Another solution would be to use some kind of lint tool. These can often emit warnings for things that normal compilers let pass. In your build scripts you can normally also treat these lint warnings as build errors (and refuse to compile a file that has lint warnings).
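A sketch of what const-correct code looks like once the cast is gone, so that it compiles cleanly under -Wcast-qual -Werror=cast-qual. The function names sum_values and set_value are made up for this example:

```c
/* Only reads through both pointers, so the const qualifiers are kept. */
int sum_values(const int *m, const int *n)
{
    return *m + *n;
}

/* A function that really has to write says so in its signature:
   the parameter is a pointer to non-const int, and no cast is needed. */
void set_value(int *n, int value)
{
    *n = value;
}
```

With this split, any attempt to write through m or n inside sum_values is rejected by the compiler, and the only way around it is an explicit cast, which is exactly what -Wcast-qual reports.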