Why did this code still work? - c

Some old code that I just came across:
MLIST * new_mlist_link()
{
MLIST *new_link = (MLIST * ) malloc(sizeof(MLIST));
new_link->next = NULL;
new_link->mapi = NULL;
new_link->result = 0;
}
This was being called to build a linked list, however I noticed there is no statement:
return new_link;
Even without the return statement there, the list still got built properly. Why did this happen?
Edit: Platform: Mandriva 2009 64bit Linux 2.6.24.7-server GCC 4.2.3-6mnb1
Edit: Funny... this code also ran successfuly on about 5 different Linux installations, all different versions/flavors, as well as a Mac.

On 32-bit Windows, most of the time, the return value from a function is left in the EAX register. Similar setups are used in other OSes, though of course it's compiler-specific. This particular function presumably stored the new_link variable in that same location, so when you returned without a return, the variable in that location was treated as the return value by the caller.
This is non-portable and very dangerous to actually do, but is also one of the little things that makes programming in C so much fun.

Possible it just used the EAX register which normally stores the return value of the last function that was called. This is not good practice at all! The behavior for this sort of things is undefined.. But it is cool to see work ;-)

It's basically luck; apparently, the compiler happens to stick new_link into the same place it would stick a returned value.

To avoid this problem, use:
-Wreturn-type:
Warn whenever a function is defined with a return-type that defaults to int. Also warn about any return statement with no return-value in a function whose return-type is not void (falling off the end of the function body is considered returning without a value), and about a return statement with an expression in a function whose return-type is void.
-Werror=return-type to turn the above into an error:
Make the specified warning into an error. The specifier for a warning is appended, for example -Werror=switch turns the warnings controlled by -Wswitch into errors. This switch takes a negative form, to be used to negate -Werror for specific warnings, for example -Wno-error=switch makes -Wswitch warnings not be errors, even when -Werror is in effect. You can use the -fdiagnostics-show-option option to have each controllable warning amended with the option which controls it, to determine what to use with this option.
(from GCCs warning options)

This works by coincidence. You shouldn't rely on that.

Most likely is that it'll produce a very hard to find bug. I'm unsure where I read it, but I recall that if you forget to put in a return statement, most compilers will default to return void.
Here's a short example:
#include <iostream>
using namespace std;
int* getVal();
int main()
{
int *v = getVal();
cout << "Value is: " << *v << endl;
return 0;
}
int* getVal()
{
// return nothing here
}
For me, this works as well. However, when I run it, I get a segment fault. So it's really undefined. Just because it compiles, doesn't mean it'll work.

It works because in the 1940's when the C language was created, there was no return keyword. If you look in the "functions" section of the C43 MINSI standard, it has this to say on the subject (amongst other things):
16.4.3b For backwards compatibility the EAX register MUST be used to return
the address of the first chunk of memory allocated by malloc.
</humour>

Probably a coincidence:
The space for the return value of the function is allocated ahead of time. Since that value is uninitialized, it could have pointed to the same space on the heap as the memory allocated for the struct.

Related

Casting function pointers with different return types

This question is about using function pointers, which are not precisely compatible, but which I hope I can use nevertheless as long as my code relies only on the compatible parts. Let's start with some code to get the idea:
typedef void (*funcp)(int* val);
static void myFuncA(int* val) {
*val *= 2;
return;
}
static int myFuncB(int* val) {
*val *= 2;
return *val;
}
int main(void) {
funcp f = NULL;
int v = 2;
f = myFuncA;
f(&v);
// now v is 4
// explicit cast so the compiler will not complain
f = (funcp)myFuncB;
f(&v);
// now v is 8
return 0;
}
While the arguments of myFuncA and myFuncB are identical and fully compatible, the return values are not and are thus just ignored by the calling code. I tried the above code and it works correctly using GCC.
What I learned so far from here and here is that the functions are incompatible by definition of the standard and may cause undefined behavior. My intuition, however, tells me that my code example will still work correctly, since it does not rely in any way on the incompatible parts (the return value). However, in the answers to this question a possible corruption of the stack has been mentioned.
So my question is: Is my example above valid C code so it will always work as intended (without any side effects), or does it depend on the compiler?
EDIT:
I want to do this in order to use a "more powerful" function with a "les powerful" interface. In my example funcp is the interface but I would like to provide additional functionality like myFuncB for optional use.
Agreed it is undefined behaviour, don't do that!
Yes the code functions, i.e. it doesn't fall over, but the value you assign after returning void is undefined.
In a very old version of "C" the return type was unspecified and int and void functions could be 'safely' intermixed. The integer value being returned in the designated accumulator register. I remember writing code using this 'feature'!
For almost anything else you might return the results are likely to be fatal.
Going forward a few years, floating-point return values are often returned using the fp coprocessor (we are still in the 80s) register, so you can't mix int and float return types, because the state of the coprocessor would be confused if the caller does not strip off the value, or strips off a value that was never placed there and causes an fp exception. Worse, if you build with fp emulation, then the fp value may be returned on the stack as described next. Also on 32-bit builds it is possible to pass 64bit objects (on 16 bit builds you can have 32 bit objects) which would be returned either using multiple registers or on the stack. If they are on the stack and allocated the wrong size, then some local stomping will occur,
Now, c supports struct return types and return value copy optimisations. All bets are off if you don't match the types correctly.
Also some function models have the caller allocate stack space for the parameters for the call, but the function itself releases the stack. Disagreement between caller and implementation on on the number or types of parameters and return values would be fatal.
By default C function names are exported and linked undecorated - just the function name defines the symbol, so different modules of your program could have different views about function signatures, which conflict when you link, and potentially generate very interesting runtime errors.
In c++ the function names are highly decorated primarily to allow overloading, but also it helps to avoid signature mismatches. This helps with keeping arguments in step, but actually, ( as noted by #Jens ) the return type is not encoded into the decorated name, primarily because the return type isn't used (wasn't, but I think occasionally can now influence) for overload resolution.
Yes, this is an undefined behaviour and you should never rely on undefined behaviour if want to write portable code.
Function with different return values can have different calling conventions. Your example will probably work for small return types, but when returning large structs (e.g. larger than 32 bits) some compilers will generate code where struct is returned in a temporary memory area which should be cleaned up the the caller.

Malloced pointer changes value on return

I have code that looks like this :
Foo* create(args) {
Foo *t = malloc (sizeof (Foo)) ;
// Fill up fields in struct t from args.
return t;
}
The call is
Foo *created = create (args)
Note that the function and the call to the function are two separate modules.
The value of the pointer assigned to t on being malloced is slightly different to what is captured in created. Seemingly the MSB of the address is changed and replaced with fffff. The LSB portion is the same for around 6-7 characters.
I'm at a loss as to what's going on.
I'm using GCC 4.6
The most likely explanation one can come up with from what you provided is that at the point of the call the function create is undeclared. A permissive C compiler assumed that unknown function create returned an int and generated code, which effectively truncated the pointer value or, more precisely, sign-extended the assumed int return value into the MSB of the recipient pointer (assuming a platform where pointers are wider than int in terms of their bit-width, e.g. 64-bit platform).
Most likely the compiler issued a warning about an undeclared function being called, but the warning was ignored by the user.
Make sure declaration (or, better, prototype) of create is visible at the point of the call. If you are compiling in C99 mode (or later), ask your compiler to strictly enforce C99 requirements. And don't ignore compiler diagnostic messages.

Splint warning "Statement has no effect" due to function pointer

I'm trying to check a C program with Splint (in strict mode). I annotated the source code with semantic comments to help Splint understand my program. Everything was fine, but I just can't get rid of a warning:
Statement has no effect (possible undected modification through call to unconstrained function my_function_pointer).
Statement has no visible effect --- no values are modified. It may modify something through a call to an unconstrained function. (Use -noeffectuncon to inhibit warning)
This is caused by a function call through a function pointer. I prefer not to use the no-effect-uncon flag, but rather write some more annotations to fix it up. So I decorated my typedef with the appropriate #modifies clause, but Splint seems to be completely ignoring it. The problem can be reduced to:
#include <stdio.h>
static void foo(int foobar)
/*#globals fileSystem#*/
/*#modifies fileSystem#*/
{
printf("foo: %d\n", foobar);
}
typedef void (*my_function_pointer_type)(int)
/*#globals fileSystem#*/
/*#modifies fileSystem#*/;
int main(/*#unused#*/ int argc, /*#unused#*/ char * argv[])
/*#globals fileSystem#*/
/*#modifies fileSystem#*/
{
my_function_pointer_type my_function_pointer = foo;
int foobar = 123;
printf("main: %d\n", foobar);
/* No warning */
/* foo(foobar); */
/* Warning: Statement has no effect */
my_function_pointer(foobar);
return(EXIT_SUCCESS);
}
I've read the manual, but there's not much information regarding function pointers and their semantic annotations, so I don't know whether I'm doing something wrong or this is some kind of bug (by the way, it's not already listed here: http://www.splint.org/bugs.html).
Has anyone managed to successfully check a program like this with Splint in strict mode? Please help me find the way to make Splint happy :)
Thanks in advance.
Update #1: splint-3.1.2 (windows version) yields the warning, while splint-3.1.1 (Linux x86 version) does not complain about it.
Update #2: Splint doesn't care whether the assignment and the call are short or long way:
/* assignment (short way) */
my_function_pointer_type my_function_pointer = foo;
/* assignment (long way) */
my_function_pointer_type my_function_pointer = &foo;
...
/* call (short way) */
my_function_pointer(foobar);
/* call (long way) */
(*my_function_pointer)(foobar);
Update #3: I'm not interested in inhibiting the warning. That's easy:
/*#-noeffectuncon#*/
my_function_pointer(foobar);
/*#=noeffectuncon#*/
What I'm looking for is the right way to express:
"this function pointer points to a function which #modifies stuff, so it does have side-effects"
Maybe you are confusing splint by relying on the implicit conversion from "function name" to "pointer to function" in your assignment of my_function_pointer. Instead, try the following:
// notice the &-character in front of foo
my_function_pointer_type my_function_pointer = &foo;
Now you have an explicit conversion and splint doesn't need to guess.
This is just speculation, though. I haven't tested it.
I'm not familiar with splint, but it looks to me that it will check function calls to see if they produce an effect, but it doesn't do analysis to see what a function pointer points to. Therefore, as far as it's concerned, a function pointer could be anything, and "anything" includes functions with no effect, and so you'll continue to get that warning on any use of a function pointer to call a function, unless you so something with the return value. The fact that there's not much on function pointers in the manual may mean they don't handle them properly.
Is there some sort of "trust me" annotation for an entire statement that you can use with function calls through pointers? It wouldn't be ideal, but it would allow you to get a clean run.
I believe the warning is correct. You're casting a value as a pointer but doing nothing with it.
A cast merely makes the value visible in a different manner; it doesn't change the value in any way. In this case you've told the compiler to view "foobar" as a pointer but since you're not doing anything with that view, the statement isn't doing anything (has no effect).

memcpy vs assignment in C -- should be memmove?

As pointed out in an answer to this question, the compiler (in this case gcc-4.1.2, yes it's old, no I can't change it) can replace struct assignments with memcpy where it thinks it is appropriate.
I'm running some code under valgrind and got a warning about memcpy source/destination overlap. When I look at the code, I see this (paraphrasing):
struct outer
{
struct inner i;
// lots of other stuff
};
struct inner
{
int x;
// lots of other stuff
};
void frob(struct inner* i, struct outer* o)
{
o->i = *i;
}
int main()
{
struct outer o;
// assign a bunch of fields in o->i...
frob(&o.i, o);
return 0;
}
If gcc decides to replace that assignment with memcpy, then it's an invalid call because the source and dest overlap.
Obviously, if I change the assignment statement in frob to call memmove instead, then the problem goes away.
But is this a compiler bug, or is that assignment statement somehow invalid?
I think that you are mixing up the levels. gcc is perfectly correct to replace an assignment operation by a call to any library function of its liking, as long as it can guarantee the correct behavior.
It is not "calling" memcpy or whatsoever in the sense of the standard. It is just using one function it its library for which it might have additional information that guarantees correctness. The properties of memcpy as they are described in the standard are properties seen as interfaces for the programmer, not for the compiler/environment implementor.
Whether or not memcpy in that implementation in question implements a behavior that makes it valid for the assignment operation is another question. It should not be so difficult to check that or even to inspect the code.
As far as I can tell, this is a compiler bug. i is allowed to alias &o.i according to the aliasing rules, since the types match and the compiler cannot prove that the address of o.i could not have been previously taken. And of course calling memcpy with overlapping (or same) pointers invokes UB.
By the way note that, in your example, o->i is nonsense. You meant o.i I think...
I suppose that there is a typo: "&o" instead of "0".
Under this hypothesis, the "overlap" is actually a strict overwrite: memcpy(&o->i,&o->i,sizeof(o->i)). In this particular case memcpy behaves correctly.

how to return NULL for double function with Intel C compiler?

I have some code that I am porting from an SGI system using the MIPS compiler. It has functions that are declared to have double return type. If the function can't find the right double, those functions return "NULL"
The intel C compiler does not like this, but I was trying to see if there is a compiler option to enable this "feature" so that I can compile without changing code. I checked the man page, and can't seem to find it.
Thanks
Example of the code that currently exists, and works fine with MIPS
double some_function(int param){
double test = 26.25;
if(param == 10){
return test;
}
return (NULL);
}
the intel compiler complains:
error: return value type does not match the function type
There might be a "NaN" (not a number) capability, via compiler options. But that still won't be a null.
Based on your comment:
I just checked, and the calling functions just use it as if it is a valid double (in an if statement)
I would just modify the function to return 0; instead of return NULL;
It's not a compiler option, but you could #define NULL 0 to (in theory) get the desired effect.
Are you sure they don't return pointer-to-double? A NULL double is meaningless, since zero is a valid value for double.
NULL doesn't necessarily have a special meaning when it is returned from a function. If you want to indicate that a function did not complete successfully, the common technique is to return some value that would never be returned under normal circumstances. Usually, this means returning an invalid value. When a function returns a pointer, it often uses NULL as a return value signaling an error since a NULL pointer is rarely a value that the user would expect the function to return.
The constant NULL only has meaning in the context of pointers, and since you are returning a scalar value the compiler is griping at you about using NULL. Instead of returning NULL, find some value that could never be a valid return value and use it as your error indicator. To make it more clear what you are doing, alias that value with a constant using #define AN_ERROR_OCCURED <<someval>> to avoid using "magic numbers".
Another common way to do it is to use the errno global (see errno.h). If your function encounters an error, set errno to some non-zero value before returning. The code calling your function will then can check for and handle errors using something like if (errno) {...}.
Yet another method is to have the function return zero if successful and a non-zero error code otherwise. If the user needs to get data back from the function, an extra "return buffer" pointer would have to be passed to the function.
I think I will just use #ifdef INTEL for a compiler flag, and define the return as zero for that compile time option. Kind of a pain, but seems to be the best way.

Resources