Could somebody help me interpret this line of code (from here)?
*(void **) (&funcp) = dlsym(libHandle, argv[2]);
I do not understand what *(void **) (&funcp) does.
This might clarify it (from here):
(Update: I didn't include the exact same link to be snarky. I missed the link in the original question. :P)
/* The rather clumsy cast above is necessary because the ISO C standard
does not require that pointers to functions can be cast back and
forth to 'void *'. (See TLPI pages 863-864.) SUSv3 TC1 and SUSv4
accept the ISO C requirement and propose casts of the above
form as the workaround. However, the 2013 Technical Corrigendum
(TC1) to SUSv4 requires casts of the following more natural form
to work correctly:
funcp = (void (*)()) dlsym(libHandle, argv[2]);
Various current compilers (e.g., gcc with the '-pedantic' flag)
may still complain about such casts, however. */
It's basically a hack to avoid having to cast the void* from dlsym() to a function pointer, by instead reinterpreting the data in funcp as a void* and storing into that. This is done by taking the address of funcp (the address of the variable itself), pretending that the address refers to a void* (via the (void**) cast), dereferencing it, and storing the void* from dlsym() into it. Simpler forms are likely to work in practice too.
This method of "reinterpreting" data by taking its address, casting that address to a pointer to a different type, and dereferencing, is often called type punning by the way. The pun comes from the same data having different meanings when interpreted in different ways, which is how real puns work too.
(Type punning can be unsafe in certain circumstances, including when the compiler makes use of strict aliasing rules, which lets it assume that certain pointers of different type do not alias (that they do not refer to the same data). The cast above might violate strict aliasing as you get a function pointer and a void* referring to the same data, though it's a "likely to work in practice" thing in this case.)
(The reason ISO C does not require that function pointers can be safely cast to void pointers and back is probably that functions and data (void pointers refer to "data") are stored separately on some machines. Since they are separate, they might also use different address lengths or formats, so that casting between function pointers and data pointers might not make sense. Architectures that separate code and data in that way are called Harvard architectures.)
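To make the punning form concrete, here is a minimal sketch that avoids needing a real shared library. `lookup_symbol` is a hypothetical stand-in for dlsym(), and `binop_fn`, `add`, `load_with_pun` and `load_with_memcpy` are names invented for illustration. It assumes a platform where function pointers and void * have the same size, as POSIX systems do.

```c
#include <assert.h>
#include <string.h>

typedef int (*binop_fn)(int, int);

static int add(int a, int b) { return a + b; }

/* Hypothetical stand-in for dlsym(): returns a function's address as void *.
   This cast is the conversion POSIX permits; ISO C alone does not promise it. */
static void *lookup_symbol(const char *name)
{
    return strcmp(name, "add") == 0 ? (void *)add : NULL;
}

/* The form from the question: store the void * through a void ** view of fp. */
binop_fn load_with_pun(const char *name)
{
    binop_fn fp;
    *(void **)(&fp) = lookup_symbol(name);
    return fp;
}

/* Same effect, but copying the bytes instead of punning through a cast. */
binop_fn load_with_memcpy(const char *name)
{
    void *sym = lookup_symbol(name);
    binop_fn fp;
    static_assert(sizeof fp == sizeof sym, "pointer sizes must match");
    memcpy(&fp, &sym, sizeof fp);
    return fp;
}
```

Calling `load_with_pun("add")(2, 3)` should give 5 on such a platform. The memcpy variant sidesteps the strict-aliasing question, at the cost of still assuming the two pointer representations are interchangeable.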
dlsym() relates to dlopen() and dlclose(); see the man pages.
The line you're seeing looks up a symbol in the binary object loaded by dlopen() (fetching the value of the symbol, which in this case is presumed to be a function address) and assigns it to the pointer variable funcp.
When funcp is typecast to the corresponding function type, meaning given the proper function signature, that function can be called via funcp and parameters passed as well.
The dl (dynamic library) set of functions is the mechanism by which plugins are typically implemented on systems that support dlopen/dlsym/dlclose. The plugin has to conform to an interface defined by the user or the community. The code that loads the plugin must know that interface too, so that it knows how to find the symbols by name, how to cast them, and how to use them.
Another way of putting it: it lets you make calls into an object that isn't available at link time, so your code does at runtime what the linker handles for you at link time.
dlsym() returns a (void *) because it can't assume anything about the symbol it's loading, which could be a variable, a function entry point, etc. To do anything with the value returned, you generally need to cast it to the type that corresponds to the symbol in the binary loaded with dlopen().
I have a library with the following structure:
struct frame_meta_data
{
uint8_t id;
uint8_t general_field_1;
uint8_t general_field_2;
...
uint8_t user_data[16];
};
And I would like users of the library to be able to save custom data into frame objects (that's what the user_data field is for).
However when trying to cast user_data into a custom structure:
frame_meta_data cur_frame;
...
#define USER_HDR ((struct my_user_header*)cur_frame.user_data)
I get the following error:
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
#define USER_HDR ((struct my_user_header*)cur_frame.user_data)
How can I work around this?
Thanks in advance.
Reinterpreting addresses like that isn't allowed by the C standard. Strict aliasing means that compilers are free to assume two pointers of different types will never point at the same object, and then make all sorts of optimizations based on that.
Your code violates the C standard and has undefined behavior on account of that. But you can fix it still. Like melpomene suggested in the comments, don't cast, but use memcpy:
struct my_user_header obj;
memcpy(&obj, cur_frame.user_data, sizeof obj);
Alternatively, some compilers allow you to write non-standard code with a compiler option, such as GCC's -fno-strict-aliasing.
If you know what you are doing, you can disable that warning. There is however a potential problem.
Assume that the structure you want to use contains something that is larger than 1 byte. For example a 4-byte integer. Now if you simply cast that user_data field to your structure it is possible that the int is not aligned to 4-byte boundary as it should be. This might result in a runtime exception in some architectures.
Using memcpy should solve that problem though. And remove the warning.
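A minimal sketch of the memcpy approach follows. The layout of `struct my_user_header` and the helper names `frame_store_header` and `frame_load_header` are assumptions made up for illustration, and the frame struct is trimmed to two fields. Note that user_data sits at offset 1 here, so a direct cast would also hand out a misaligned pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define USER_DATA_SIZE 16

struct frame_meta_data {
    uint8_t id;
    uint8_t user_data[USER_DATA_SIZE];
};

/* Hypothetical user-defined header; it only has to fit in user_data. */
struct my_user_header {
    uint32_t sequence;
    uint16_t flags;
};

static_assert(sizeof(struct my_user_header) <= USER_DATA_SIZE,
              "user header must fit in user_data");

/* Copy the header's bytes into the frame; no pointer cast is involved. */
void frame_store_header(struct frame_meta_data *f,
                        const struct my_user_header *h)
{
    memcpy(f->user_data, h, sizeof *h);
}

/* Copy the bytes back out into a properly aligned local object. */
struct my_user_header frame_load_header(const struct frame_meta_data *f)
{
    struct my_user_header h;
    memcpy(&h, f->user_data, sizeof h);
    return h;
}
```

Compilers commonly turn small fixed-size memcpy calls like these into plain loads and stores, so the copy is usually cheap.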
I suspect it's because you're using copies of this USER_HDR macro expression in multiple accesses to the data area, and it's confusing the compiler. Or perhaps you're even mixing USER_HDR accesses with manipulations of the underlying byte array.
If the data area is only being initialized and accessed as that given user data type, there isn't any aliasing going on. Ensure you use it that way.
I would provide an external (as in, non-inlined, external linkage) API function which, given a frame object, returns a void * to the associated user data:
struct foobar *fbs = (struct foobar *) frame_get_userdata(fr);
// now work just with fbs
The cast isn't necessary; that's my style.
Depending on what precedes the user data, it might not be suitably aligned for arbitrary use. One easy way to fix that would be to make it the first struct member, if that option is available. Otherwise there are various fairly portable tricks involving making a union between a char array and various types like long double and whatnot, or else using compiler-specific constructs, like __attribute__((aligned)) with GCC.
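As a rough sketch of the union trick, assuming a C11 compiler: `struct frame`, `struct foobar` and the body of `frame_get_userdata` are illustrative guesses, not the library's real definitions. The `max_align_t` member is never read; it is there only to raise the alignment of the byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct frame {
    uint8_t id;
    union {
        unsigned char bytes[16];
        max_align_t   align_me;   /* never read; only forces alignment */
    } user;
};

/* Hypothetical user type that should live in the user data area. */
struct foobar {
    double weight;
    int    count;
};

static_assert(sizeof(struct foobar) <= 16, "foobar must fit in user data");

/* Accessor returning the user area as void *, as suggested above. */
void *frame_get_userdata(struct frame *fr)
{
    return fr->user.bytes;
}
```

One cost of this design: the union is at least as large as `max_align_t`, which can exceed 16 bytes on some platforms, so the frame may grow.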
Second edition. I'm looking at their hash table example in section 6.6. I found the full source transcribed here. This is the part I'm puzzling over:
struct nlist *np;
if((np=lookup(name))==NULL){
np=(struct nlist *)malloc(sizeof(*np));
Why the cast to (struct nlist *) on the last line? I can remove it without getting any compiler warnings.
I'm similarly confused by
free((void *)np->def);
Are these intended to aid readability somehow?
Casting the result of malloc was necessary in some pre-ANSI dialects of C, and the usage was retained in K&R2 (which covers the language as of the 1989 ANSI C standard).
This is mentioned in the errata list for the book. I've seen it via Dennis Ritchie's home page, which isn't currently available (I hope AT&T hasn't permanently removed it), but I found it elsewhere:
142 (§6.5, toward the end): The remark about casting the return value of malloc ("the proper method is to declare ... then explicitly coerce") needs to be rewritten. The example is correct and works, but the advice is debatable in the context of the 1988-1989 ANSI/ISO standards. It's not necessary (given that coercion of void * to ALMOSTANYTYPE * is automatic), and possibly harmful if malloc, or a proxy for it, fails to be declared as returning void *. The explicit cast can cover up an unintended error. On the other hand, pre-ANSI, the cast was necessary, and it is in C++ also.
Despite the opinion of the legions of posters here who will immediately jump on any code with an unnecessary (but harmless) cast of malloc(), the truth is that it just doesn't matter. Yes, assignment to and from void * does not require casting, but nor is it forbidden, and the arguments for leaving it in or taking it out really aren't that strong.
There are more important things to spend brain cells on. It just doesn't matter.
For that example to be completely correct today, you have to put
#include <stdlib.h>
so you get the proper prototype for malloc(3). In this case, it's not important if you do a cast or not, as malloc is declared as returning void * there, and no need to cast from this type to another pointer type (but you can if you desire).
Today, it's better not to do the cast, because it can hide a fairly common error. If you cast and don't provide a prototype for malloc, a compiler that still accepts implicit declarations assumes int malloc(); (returning an int instead of a void *, and taking an unspecified number of arguments), and the cast tells it explicitly to convert that int to a pointer. The compiler calls malloc and takes the supposed int result (a 32-bit value, not the actual 64-bit pointer returned by malloc; depending on the architecture's calling convention the two may or may not be related, but the int range is smaller than the pointer range). It then converts that value to the type named in the cast without a warning, because you asked for the conversion explicitly, and fills in the missing 32 bits to make a full 64-bit pointer. THIS IS TRULY DANGEROUS IF INTEGER TYPES ARE NOT THE SAME SIZE AS POINTER TYPES, as on 64-bit platforms or in old MS-DOS compilers using large memory models. The cast hides the real problem, which is that you did not include the proper header. You will be lucky if it works, as that means all the pointers returned by malloc are below the 0x100000000 limit (on 64-bit Intel architectures; 64-bit big-endian architectures may behave differently). This is the actual source of undefined behaviour you should expect.
Normally, with modern compilers, you'll probably get some kind of warning for using a function with no declaration, but the compiler will still compile the program and generate an executable, probably not the one you want. This is one of the main differences between C and C++: C++ doesn't allow you to call a function that has not been declared, so you get an error instead of a possible warning if you hit the invalid malloc case described above.
This kind of error is very difficult to track down, and that's the reason people advocate not casting.
The cast is unnecessary. void * can be assigned directly to any object pointer type; you actually even should do so. K&R is a bit outdated in some aspects and you should definitely get something more recent (and for newer standards - C99 upwards).
See n1570: 6.3.2.3/1.
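For comparison, a minimal sketch of the cast-free style. The struct layout imitates the book's hash table entry, but the field names and the helpers `nlist_new` and `nlist_free` are assumptions for illustration, not the book's code.

```c
#include <assert.h>
#include <stdlib.h>   /* declares malloc and free, so no implicit int */
#include <string.h>

struct nlist {
    struct nlist *next;
    char *name;
    char *defn;
};

/* Allocate an entry holding a copy of name; returns NULL on failure. */
struct nlist *nlist_new(const char *name)
{
    assert(name != NULL);

    /* No cast: void * converts implicitly, and sizeof *np tracks the type. */
    struct nlist *np = malloc(sizeof *np);
    if (np == NULL)
        return NULL;

    size_t len = strlen(name) + 1;
    np->name = malloc(len);
    if (np->name == NULL) {
        free(np);
        return NULL;
    }
    memcpy(np->name, name, len);
    np->defn = NULL;
    np->next = NULL;
    return np;
}

/* free() takes void *, so no cast is needed here either. */
void nlist_free(struct nlist *np)
{
    if (np == NULL)
        return;
    free(np->name);
    free(np->defn);
    free(np);
}
```

Writing `sizeof *np` rather than `sizeof(struct nlist)` means the allocation stays correct if the type of np ever changes.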
Can you cast a function pointer of this type:
void (*one)(int a)
to one of this type:
void (*two)(int a, int b)
and then safely invoke the pointed-to function with the additional argument(s) it has been cast to take? I had thought such a thing was illegal, that both function types had to be compatible. (Meaning the same prototype: same return value, same parameter list.) But that is exactly what this bit of GTK+ code appears to be doing (taken from here):
g_signal_connect_swapped(G_OBJECT(button), "clicked",
G_CALLBACK(gtk_widget_destroy), G_OBJECT(window));
If you look up the "clicked" signal (or just look at other examples of its use from the first link), you will see that its handlers are expected to be declared like this:
void user_function(GtkButton *button, gpointer user_data);
When you register a handler via g_signal_connect_swapped(), the widget pointer and data pointer arguments are swapped in order, thus, the declaration should look like this instead:
void user_function(gpointer user_data, GtkButton *button);
Here is the problem. The gtk_widget_destroy() function registered as a callback is prototyped like this:
void gtk_widget_destroy(GtkWidget *widget);
to take only a single argument. Presumably, because the data pointer (a GtkWindow) and the pointer to the signaling widget (a GtkButton) are swapped, the sole argument it receives will be the window pointer, and the button pointer, which will be passed after, will silently be ignored. Some Googling has turned up similar examples, even the registering of functions like gtk_main_quit() that take no arguments at all.
Am I correct in believing this to be a standards violation? Have the GTK+ developers found some legal magic to make this all work?
The C calling convention makes it the responsibility of the caller to clean up the arguments on the stack. So if the caller supplies too many arguments, it is not a problem. The additional arguments are just ignored.
So yes, you can cast a function pointer to another function pointer type with the same arguments and then some, and call the original function with too many arguments and it will work.
In my opinion, C89 standards in this regard are quite confusing. As far as I know, they don't disallow casting from/to function with no param specification, so:
typedef void (*one)(int first);
typedef void (*two)(int first, int second);
typedef void (*empty)();
one src = something;
two dst;
/* Disallowed by C89 standards */
dst = (two) src;
/* Not disallowed by C89 standards */
dst = (two) ((empty) src);
At the end, the compilers must be able to cast from one to two, so I don't see the reason to forbid the direct cast.
Anyway, signal handling in GTK+ uses some dark magic behind the scenes to manage callbacks with different argument patterns, but this is a different question.
Now, how cool is that, I'm also currently working my way through the GTK tutorial, and stumbled over exactly the same problem.
I tried a few examples, and then asked the question "What happens if I cast a function pointer, changing the number of parameters", with a simplified example.
The answer to your question (adapted from the excellent answers to my question above, and the answers from question Function pointer cast to different signature):
Yes, this is a violation of the C standard. If you cast a function pointer to a function pointer of an incompatible type, you must cast it back to its original (or a compatible) type before calling it. Anything else is undefined behaviour.
However, what it does in practice depends on the C compiler, especially the calling convention it uses. The most common calling convention (at least on i386) happens to just put the parameters on the stack in reverse order, so if a function needs fewer parameters than it is supplied, it will just use the first parameters and ignore the rest -- which is just what you want. This will break, however, on platforms with different calling conventions.
So the real question is why the GLib developers did it like this. But that's a different question I guess...
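The two standard-conforming options can be sketched without GTK. All names here (`record`, `record_adapter`, `generic_fn` and so on) are made up for illustration. One option is an adapter with exactly the expected signature; the other is to store the pointer under another function pointer type and convert it back before calling.

```c
#include <assert.h>

typedef void (*generic_fn)(void);
typedef void (*one_arg_fn)(int);
typedef void (*two_arg_fn)(int, int);

static int last_seen;

/* The one-argument function we actually want to run. */
static void record(int a) { last_seen = a; }

/* Adapter with the exact two-argument signature the caller expects. */
static void record_adapter(int a, int b)
{
    (void)b;          /* extra argument deliberately ignored */
    record(a);
}

/* Option 1: register the adapter, so the call uses a compatible type. */
int call_via_adapter(int a, int b)
{
    two_arg_fn cb = record_adapter;
    cb(a, b);
    return last_seen;
}

/* Option 2: store as another function pointer type, convert back to call. */
int call_after_round_trip(int a)
{
    generic_fn stored = (generic_fn)record;
    one_arg_fn restored = (one_arg_fn)stored;
    assert(restored == record);   /* round trip must compare equal */
    restored(a);
    return last_seen;
}
```

Neither option calls a function through an incompatible type, so neither depends on the calling convention.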
I know this has been asked before but none of the cases I've seen here are like this one.
I am importing some API functions at runtime, the general declaration on those functions would be like:
// Masks for UnmapViewOfFile and MapViewOfFile
typedef BOOL (WINAPI *MyUnmapViewOfFile)(LPCVOID);
typedef LPVOID (WINAPI *MyMapViewOfFile)(HANDLE, DWORD, DWORD, DWORD, SIZE_T);
// Declarations
MyUnmapViewOfFile LoadedUnmapViewOfFile;
MyMapViewOfFile LoadedMapViewOfFile;
I then call a generic "load" function, which calls GetProcAddress to get the address of the exported function from the proper DLL. That address is returned through a void**. This void** is one of the parameters of the generic load, something like:
int GenericLoad(char* lib, void** Address, char* TheFunctionToLoad)
and I would call this function:
void *Address;
GenericLoad("kernel32.dll", &Address, "UnmapViewOfFile");
LoadedUnmapViewOfFile = (MyUnmapViewOfFile) Address;
Or something similar to this.
Now, of course the compiler complains about trying to cast a data void* to a function pointer. How do I do this then?
I've read countless sites and all kinds of nasty casts, so I'd appreciate it if you add code to the explanation.
Thanks
Jess
The correct code would be this one line:
GenericLoad("kernel32.dll", (void**)&LoadedUnmapViewOfFile, "UnmapViewOfFile");
What is done here basically is this: the address of the pointer variable (the one in which you want the address of the function to be put) is passed to GenericLoad, which is basically what it expects. The void** denotes "give me the address of your pointer". All the type casting is just decoration around that. C does not allow specifying "a pointer to any function pointer", so the API author preferred void**.
Now, of course the compiler complains about trying to cast a data void* to a function pointer
My compilers don't complain at all with your code (then again, I don't have warnings cranked up - I'm using more-or-less default options). This is with a variety of compilers from MS, GCC, and others. Can you give more details about the compiler and compiler options you're using and the exact warning you're seeing?
That said, C doesn't guarantee that a function pointer can be cast to/from a void pointer without problems, but in practice this will work fine on Windows.
If you want something that's standards compliant, you'll need to use a 'generic' function pointer instead of a void pointer - C guarantees that a function pointer can be converted to any other function pointer type and back without loss, so this will work regardless of your platform. That's probably why the Win32 GetProcAddress() API returns a FARPROC, which is just a typedef for a pointer to a function that takes no parameters (or at least unspecified parameters) and returns a pointer-sized int. Something like:
typedef INT_PTR (FAR WINAPI *FARPROC)();
FARPROC would be Win32's idea of a 'generic' function pointer. So all you should need to do is have a similar typedef (if you don't want to use FARPROC for some reason):
typedef intptr_t (*generic_funcptr_t)(); // intptr_t is typedef'ed appropriately elsewhere,
// like in <stdint.h> or something
int GenericLoad(char* lib, generic_funcptr_t* Address, char* TheFunctionToLoad)
generic_funcptr_t Address;
GenericLoad("kernel32.dll", &Address, "UnmapViewOfFile");
LoadedUnmapViewOfFile = (MyUnmapViewOfFile) Address;
Or you can dispense with the middle-man and pass the pointer you really want to get the value into:
GenericLoad2("kernel32.dll", (generic_funcptr_t *) &LoadedUnmapViewOfFile, "UnmapViewOfFile");
Though that's more dangerous than the method using the intermediate variable - for example the compiler will not give a diagnostic if you leave off the ampersand in this last example, however in the previous example, it would generally give at least a warning if you left off the ampersand from the Address argument. Similar to Bug #1 here: http://blogs.msdn.com/sdl/archive/2009/07/28/atl-ms09-035-and-the-sdl.aspx
Now you should be set. However, any way you look at it you'll need to perform some dangerous casting. Even in C++ with templates you'd have to perform a cast at some level (though you might be able to hide it in the template function) because the GetProcAddress() API doesn't know the actual type of the function pointer you're retrieving.
Also note that your interface for GenericLoad() has a possibly serious design problem - it provides no way to manage the lifetime of the library. That may not be a problem if your intent is to not allow unloading a library, but it's something that users may want so you should consider the issue.
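A portable sketch of the intermediate-variable pattern, with the Win32 parts replaced by stand-ins so it can be compiled anywhere: `generic_load` and `fake_unmap` are hypothetical, and a real loader would call GetProcAddress() where the strcmp is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_funcptr_t)(void);   /* the 'generic' function pointer */
typedef int  (*unmap_fn)(const void *);    /* the type we really want */

/* Hypothetical exported function standing in for UnmapViewOfFile. */
static int fake_unmap(const void *view) { return view != NULL; }

/* Hypothetical loader: hands the address back as a generic function pointer. */
int generic_load(const char *name, generic_funcptr_t *out)
{
    assert(out != NULL);
    if (strcmp(name, "fake_unmap") == 0) {
        *out = (generic_funcptr_t)fake_unmap;
        return 0;
    }
    *out = NULL;
    return -1;
}

/* Intermediate variable, then one cast back to the real function type. */
unmap_fn load_unmap(void)
{
    generic_funcptr_t address = NULL;
    if (generic_load("fake_unmap", &address) != 0)
        return NULL;
    return (unmap_fn)address;
}
```

Because `&address` already has type `generic_funcptr_t *`, forgetting the ampersand here would draw a diagnostic, which is the safety argument made above for the intermediate variable.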
Addresses of data and addresses of functions are incompatible things. I believe that your Address variable must be a function pointer, as in your typedef: BOOL (WINAPI *MyUnmapViewOfFile)(LPCVOID). This type of declaration is required because, in order to call through a pointer to a function, the return type and the type and number of arguments must be known. When you call the function, the correct amount of space must be set aside on the stack for the return value and arguments.
Given that, I believe that the correction in Pavel's answer is correct (FYI, his (void**) cast is a type safety measure).
I use a structure of function pointers to implement an interface for different backends. The signatures are very different, but the return values are almost all void, void * or int.
struct my_interface {
void (*func_a)(int i);
void *(*func_b)(const char *bla);
...
int (*func_z)(char foo);
};
But a backend is not required to support every interface function. So I have two possibilities. The first option is to check before every call whether the pointer is NULL. I don't like that very much, because of the readability and because I fear the performance impact (I haven't measured it, however). The other option is to have a dummy function for the rare cases where an interface function doesn't exist.
I would then need a dummy function for every signature. I wonder if it is possible to have only one for the different return values, and cast it to the given signature.
#include <stdio.h>
int nothing(void) {return 0;}
typedef int (*cb_t)(int);
int main(void)
{
cb_t func;
int i;
func = (cb_t) nothing;
i = func(1);
printf("%d\n", i);
return 0;
}
I tested this code with gcc and it works. But is it sane? Or can it corrupt the stack or can it cause other problems?
EDIT: Thanks for all the answers. After a bit of further reading, I have learned a lot about calling conventions and now have a much better understanding of what happens under the hood.
By the C specification, casting a function pointer results in undefined behavior. In fact, for a while, GCC 4.3 prereleases would return NULL whenever you cast a function pointer, perfectly valid by the spec, but they backed out that change before release because it broke lots of programs.
Assuming GCC continues doing what it does now, it will work fine with the default x86 calling convention (and most calling conventions on most architectures), but I wouldn't depend on it. Testing the function pointer against NULL at every callsite isn't much more expensive than a function call. If you really want, you may write a macro:
#define CALL_MAYBE(func, ...) do { if (func) (func)(__VA_ARGS__); } while (0)
Or you could have a different dummy function for every signature, but I can understand that you'd like to avoid that.
Edit
Charles Bailey called me out on this, so I went and looked up the details (instead of relying on my holey memory). The C specification says
766 A pointer to a function of one type may be converted to a pointer to a function of another type and back again;
767 the result shall compare equal to the original pointer.
768 If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
and GCC 4.2 prereleases (this was settled way before 4.3) were following these rules: the cast of a function pointer did not result in NULL, as I wrote, but attempting to call a function through an incompatible type, i.e.
func = (cb_t)nothing;
func(1);
from your example, would result in an abort. They changed back to the 4.1 behavior (allow but warn), partly because this change broke OpenSSL, but OpenSSL has been fixed in the meantime, and this is undefined behavior which the compiler is free to change at any time.
OpenSSL was only casting function pointers to other function types taking and returning the same number of values of the same exact sizes, and this (assuming you're not dealing with floating-point) happens to be safe across all the platforms and calling conventions I know of. However, anything else is potentially unsafe.
I suspect you will get an undefined behaviour.
You can assign (with the proper cast) a pointer to function to another pointer to function with a different signature, but when you call it weird things may happen.
Your nothing() function takes no arguments, so the compiler may optimize its use of the stack on the assumption that no arguments will be there. But here you call it with an argument, which is an unexpected situation, and it may crash.
I can't find the proper point in the standard but I remember it says that you can cast function pointers but when you call the resulting function you have to do with the right prototype otherwise the behaviour is undefined.
As a side note, you should not compare a function pointer with a data pointer (like NULL) as the pointers may belong to separate address spaces. There's an appendix in the C99 standard that allows this specific case but I don't think it's widely implemented. That said, on architectures where there is only one address space, casting a function pointer to a data pointer, or comparing it with NULL, will usually work.
You do run the risk of causing stack corruption. Having said that, if you declare the functions with extern "C" linkage (and/or __cdecl depending on your compiler), you may be able to get away with this. It would be similar then to the way a function such as printf() can take a variable number of arguments at the caller's discretion.
Whether this works or not in your current situation may also depend on the exact compiler options you are using. If you're using MSVC, then debug vs. release compile options may make a big difference.
It should be fine. Since the caller is responsible for cleaning up the stack after a call, it shouldn't leave anything extra on the stack. The callee (nothing() in this case) is OK since it won't try to use any parameters on the stack.
EDIT: this does assume cdecl calling conventions, which is usually the default for C.
This is fine as long as you can guarantee that the call uses a convention in which the caller, rather than the callee, balances the stack (__cdecl). If you don't specify a calling convention, the global default could be set to something else (__stdcall or __fastcall), either of which could lead to stack corruption.
This won't work unless you use implementation-specific/platform-specific stuff to force the correct calling convention. For some calling conventions the called function is responsible for cleaning up the stack, so they must know what's been pushed on.
I'd go for the check for NULL then call - I can't imagine it would have any impact on performance.
Computers can check for NULL about as fast as anything they do.
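A small sketch of the check-then-call approach using C99 variadic macros. `CALL_OR`, `add_to_total` and `demo_partial_backend` are names invented here, and the interface struct is trimmed to two members.

```c
#include <assert.h>
#include <stddef.h>

struct my_interface {
    void (*func_a)(int i);
    int  (*func_z)(char foo);
};

/* Call fn only if it is set; any return value is discarded. */
#define CALL_MAYBE(fn, ...) do { if (fn) (fn)(__VA_ARGS__); } while (0)

/* Expression form: yields fallback when fn is not set. */
#define CALL_OR(fn, fallback, ...) ((fn) ? (fn)(__VA_ARGS__) : (fallback))

static int total;
static void add_to_total(int i) { total += i; }

/* A backend that implements func_a but leaves func_z unset. */
int demo_partial_backend(void)
{
    struct my_interface backend = { .func_a = add_to_total, .func_z = NULL };

    total = 0;
    CALL_MAYBE(backend.func_a, 3);     /* runs: total becomes 3 */
    CALL_MAYBE(backend.func_z, 'x');   /* skipped: pointer is NULL */
    assert(total == 3);

    /* func_z is missing, so the fallback value comes back. */
    return CALL_OR(backend.func_z, -1, 'x');
}
```

Both macros evaluate `fn` twice, so the argument passed for `fn` should be free of side effects.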
Casting a function pointer to NULL is explicitly not supported by the C standard. You're at the mercy of the compiler writer. It works OK on a lot of compilers.
It is one of the great annoyances of C that there is no equivalent of NULL or void* for function pointers.
If you really want your code to be bulletproof, you can declare your own nulls, but you need one for each function type. For example,
void void_int_NULL(int n) { (void)n; abort(); }
and then you can test
if (my_thing->func_a != void_int_NULL) my_thing->func_a(99);
Ugly, innit?
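A less ugly variation, offered only as a sketch: keep one harmless dummy per signature, but fill the empty slots once when the backend is registered, so call sites need no check at all. The `noop_*` functions and `interface_fill_defaults` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct my_interface {
    void  (*func_a)(int i);
    void *(*func_b)(const char *bla);
    int   (*func_z)(char foo);
};

/* One do-nothing default per distinct signature. */
static void  noop_a(int i)           { (void)i; }
static void *noop_b(const char *bla) { (void)bla; return NULL; }
static int   noop_z(char foo)        { (void)foo; return 0; }

/* Replace every unset slot with its default; set slots are left alone. */
void interface_fill_defaults(struct my_interface *iface)
{
    assert(iface != NULL);
    if (!iface->func_a) iface->func_a = noop_a;
    if (!iface->func_b) iface->func_b = noop_b;
    if (!iface->func_z) iface->func_z = noop_z;
}
```

Each dummy has exactly the type of its slot, so no function pointer is ever called through an incompatible type.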