I have a function that needs some constant data, but retrieving the constant data requires calling a function that performs a linear search to retrieve the data. I don't want to perform the search for each call of the function, so I tried making the variable in question static. But static variables cannot be initialized to non-constant values:
int my_function(int foo)
{
static const Thing *bar = thing_from_name("bar");
return do_thing(foo, bar);
}
GCC rightly complains that the "initializer element is not constant".
After pondering my situation for a moment, I came up with a way to have my cake and eat it too:
int my_function(int foo)
{
static const Thing *bar = NULL;
if (!bar) bar = thing_from_name("bar");
return do_thing(foo, bar);
}
This appears to work fine so far, but it feels... wrong. Are there any pitfalls to this approach? Is there a better way to solve my problem?
To be clear, thing_from_name is effectively a pure function, in that it only reads from constant data in memory. Since it searches for a string, there's not really an easy way for me to optimize it down to a constant expression (as far as I know).
EDIT: here's a rough outline of what thing_from_name does, for further context:
const Thing *thing_from_name(const char *name)
{
const Thing *t;
for (t = &thing_array[0]; t->name != NULL; t++) {
if (strcmp(t->name, name) == 0) {
return t;
}
}
return NULL;
}
In plain C you would likely need something like Boost.Preprocessor to do this at compile time. If thing_from_name() can be expressed entirely as static code that can be precomputed at compile time, that may succeed. If not, use your solution, but remember that it is not thread-safe.
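If thread safety matters, one possible variation on the lazy-initialization idea is to keep the cached pointer in a C11 atomic. This is only a sketch: the Thing layout, the contents of thing_array and the body of do_thing are made up here so that it compiles on its own, and it leans on the fact that thing_from_name is pure, so two threads racing through the first call simply store the same pointer.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles on its own. */
typedef struct { const char *name; int value; } Thing;

static const Thing thing_array[] = {
    { "foo", 1 }, { "bar", 2 }, { NULL, 0 }   /* NULL name terminates */
};

static const Thing *thing_from_name(const char *name)
{
    const Thing *t;
    for (t = &thing_array[0]; t->name != NULL; t++) {
        if (strcmp(t->name, name) == 0)
            return t;
    }
    return NULL;
}

/* Hypothetical: adds the looked-up value, or returns -1 if bar is NULL. */
static int do_thing(int foo, const Thing *bar)
{
    return bar ? foo + bar->value : -1;
}

int my_function(int foo)
{
    /* A static atomic pointer starts out as a null pointer. */
    static _Atomic(const Thing *) cache;
    const Thing *bar = atomic_load_explicit(&cache, memory_order_acquire);

    if (bar == NULL) {
        /* Several threads may get here at once; all compute the same value. */
        bar = thing_from_name("bar");
        atomic_store_explicit(&cache, bar, memory_order_release);
    }
    return do_thing(foo, bar);
}
```

Note that if the name is not found, the lookup returns NULL and the search is repeated on every call, exactly as in the non-atomic version.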
Related
I am making a little project just for fun about OO in C.
The problem I am encountering is fairly odd to me. The program below is the entire thing.
When compiled and run, all the 'methods' work properly... up until it hits the first 'puts'. At this point, the 'methods' have lost the reference to 'self'.
I'm not sure why it works up until then, but not after. It is worth noting that executing the 'constructor' again before the second 'puts' will make it work. The 'if' chains clearly show that the methods work because they properly set the new strings.
I do realize this would be fixed by simply passing the 'object's' address via parameters to the 'methods', but that's kind of the point of the post, I would like to find a neat way to do it without doing so.
#include <stdio.h>
//-----class_a--------------------------------------------------------------//
typedef struct{
char* string;
char* (*get_string)();
char* (*set_string)(char*);
} class_a;
void constructor_class_a(class_a *self, char* string){
self->string = string;
char* get_string(){
return self->string;
} self->get_string = get_string;
char* set_string(char* new_string){
return self->string = new_string;
} self->set_string = set_string;
}
//--------------------------------------------------------------------------//
int main(){
class_a object_a;
constructor_class_a(&object_a, "default string");
printf("string: %s\n", object_a.get_string());
if (object_a.get_string()=="default string"){
object_a.set_string("temporary string");
if (object_a.get_string()=="temporary string"){
object_a.set_string("final string");
if (object_a.get_string()=="final string"){
printf("string: %s\n", object_a.get_string());
}
}
}
printf("%s", object_a.get_string());
return 0;
}
I've tried doing this to no avail (including adding an int init member to the 'class'):
char* get_string(){
static class_a *self2 = {0};
if (!self2->init){
*self2 = *self;
self2->init = 1;
}
return self2->string;
} self->get_string = get_string;
The problem is that you're using nested functions, which are not part of the C language but a GCC extension, and you're using them incorrectly. Their lifetimes end at the end of the block they're nested in, and any use of a function pointer to them after their lifetime ends has undefined behavior.
Even if this did work, it would be an awful idea, since the ability to have pointers to nested functions necessarily depends on having an executable stack, which is deprecated because it makes most kinds of vulnerabilities trivial to exploit. This is among the many reasons that clang and other compilers refuse to copy this GCC feature and why it's essentially dead.
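For comparison, the portable pattern is the one the question already mentions: ordinary file-scope functions that receive the object explicitly. A minimal sketch, with the class_a_get_string and class_a_set_string names invented here and const added to the string types:

```c
#include <stddef.h>

typedef struct class_a class_a;

struct class_a {
    const char *string;
    const char *(*get_string)(const class_a *self);
    const char *(*set_string)(class_a *self, const char *new_string);
};

/* Ordinary functions: they exist for the whole program, so pointers
 * to them stay valid after the constructor returns. */
static const char *class_a_get_string(const class_a *self)
{
    return self->string;
}

static const char *class_a_set_string(class_a *self, const char *new_string)
{
    return self->string = new_string;
}

void constructor_class_a(class_a *self, const char *string)
{
    self->string = string;
    self->get_string = class_a_get_string;
    self->set_string = class_a_set_string;
}
```

A call then reads object_a.get_string(&object_a). It is more verbose, but nothing depends on a stack frame that no longer exists.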
Just out of curiosity, I'm trying to understand how pointers to functions work in C.
In order to associate a function to a typedef, I've declared a pointer in it, and then I've stored the address of the desired function in there.
This is what I was able to achieve:
typedef struct
{
void (*get)(char*, int);
char string[10];
} password;
int main()
{
password userPassword;
userPassword.get = &hiddenStringInput;
userPassword.get(userPassword.string, 10);
return EXIT_SUCCESS;
}
While this does actually work perfectly, I'd like for "userPassword.get" to be a shortcut that when used calls the hiddenStringInput function and fills in the requested arguments (in this case, an array of characters and an integer).
Basically, since I'm always going to use userPassword.get in association with the arguments "userPassword.string" and "10", I'm trying to figure out a way to somehow store those parameters in the pointer that points to the hiddenString function. Is it even possible?
The way I see this usually done is by providing a "dispatch" function:
void get(password * pw) {
pw->get(pw->string, 10);
}
Then, after setting userPassword.get to your function, you call just:
get(&userPassword);
Obviously this adds some boilerplate code when done for multiple functions. It does let you implement further "class-like" things, though.
You can do this in Clang using the "Blocks" language extension. As commented, there have been attempts to standardize this (and it's not been received with hostility or anything), but they're moving slowly.
Translated to use Blocks, your example could look like this:
#include <stdlib.h>
#include <Block.h>
typedef void (^GetPw)(int); // notice how Block pointer types are used
typedef void (*GetPw_Impl)(char*, int); // the same way as function pointer types
typedef struct
{
GetPw get;
char string[10];
} password;
extern void hiddenStringInput(char*, int);
extern void setPw(char dst [static 10], char * src);
GetPw bindPw (GetPw_Impl get_impl, char * pw)
{
return Block_copy (^ (int key) {
get_impl (pw, key);
});
}
int main()
{
password userPassword;
setPw(userPassword.string, "secret");
userPassword.get = bindPw(hiddenStringInput, userPassword.string);
userPassword.get(10);
return EXIT_SUCCESS;
}
There are some subtleties to the way arrays are captured that might confuse this case; the example captures the password by normal pointer and assumes userPassword is responsible for ownership of it, separately from the block.
Since a block captures values, it needs to provide and release dynamic storage for the copies of the captured values that will be created when the block itself is copied out of the scope where it was created; this is done with the Block_copy and Block_release functions.
Block types (syntactically function pointers, but using ^ instead of *) are just pointers - there's no way to access the underlying block entity, just like basic C functions.
This is the Clang API - standardization would change this slightly, and will probably reduce the requirement for dynamic memory allocation to copy a block around (but the Clang API reflects how these are currently most commonly used).
So, I've just realized that I can write functions directly inside of structs
typedef struct
{
char string[10];
void get(void)
{
hiddenStringInput(string, 10);
return;
}
void set(const char* newPassword)
{
strcpy(string, newPassword);
return;
}
void show(void)
{
printf("%s", string);
return;
}
} password;
Now I can just call userPassword.get(), userPassword.show() and userPassword.set("something"), and what happens is exactly what the label says. Are there any reasons I shouldn't do this? This looks like it could come pretty handy.
EDIT: So this is only possible in C++. I didn't realize I'm using a C++ compiler and by attempting to do random stuff I came up with this solution. So this isn't really what I was looking for.
How can I check if a pointer to function was initialized?
I can check for NULL, but if it's not NULL it could still be garbage, right?
I have the following:
#include <stdio.h>
#include <stdlib.h>
typedef struct client_struct
{
char *name;
char *email;
void (*print)(struct client_struct *c);
} client;
void print_client(client *c)
{
if (c->print != NULL)
c->print(c);
}
int main()
{
client *c = (client *)malloc(sizeof(client));
c->email = (char *)malloc(50 * sizeof(char));
sprintf(c->email, "email@server.com");
c->name = (char *)malloc(50 * sizeof(char));
sprintf(c->name, "some name");
//Uncommenting line below work as expected, otherwise segmentation fault
//c->print = NULL;
print_client(c);
printf("\nEOF\n");
int xXx = getchar();
return 0;
}
How can I check if this pointer really points to function "void (*f)(client *)"?
Comparing sizes doesn't work because garbage could have the same size, correct?
I would like a way to accomplish that preferably according to C standard.
As described in the comments, it is impossible to determine with 100% certainty whether a pointer is garbage.
To avoid this situation, you can provide a "constructor" function, like this:
struct client_struct* client_allocate()
{
struct client_struct* object = malloc(sizeof *object);
if (object)
{
object->name = NULL;
object->email = NULL;
object->print = NULL;
}
return object;
}
Then write in your documentation that the only valid way to create "clients" is by using your function. If you do this, you should also provide a destroy function, where you call free.
Suppose you add a new pointer to your struct one day. Then you update your client_allocate function, where you set this pointer to NULL, and the new pointer will always be properly initialized. There is no need to update all places in code where your struct is allocated, because now there is only one such place.
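A matching destroy function could look roughly like this. It is a sketch that assumes name and email, when set, are heap allocations owned by the struct; client_destroy is a name chosen here to pair with client_allocate.

```c
#include <stdlib.h>

typedef struct client_struct {
    char *name;
    char *email;
    void (*print)(struct client_struct *c);
} client;

/* The only supported way to create a client: every pointer starts as NULL. */
client *client_allocate(void)
{
    client *object = malloc(sizeof *object);
    if (object) {
        object->name = NULL;
        object->email = NULL;
        object->print = NULL;
    }
    return object;
}

/* Assumes name and email are either NULL or were obtained from malloc. */
void client_destroy(client *c)
{
    if (c == NULL)
        return;
    free(c->name);   /* free(NULL) is a no-op */
    free(c->email);
    free(c);
}

/* With the constructor in place, the NULL check is reliable. */
void print_client(client *c)
{
    if (c != NULL && c->print != NULL)
        c->print(c);
}
```

With this pair in place, a freshly allocated client can be passed to print_client safely even before a print function has been assigned.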
Caveats
Checking whether a function pointer is initialized with a valid function is not an easily solvable problem. Any solution will not be portable across platforms, and will also depend on the binary format (statically or dynamically linkable) that you end up with. There are ways to do this, with varying success, on different binary formats, but I am not going to go over every permutation. Hopefully this will get you going down that rabbit hole :-) and you can figure out the particular solution that works for you in your circumstances.
In order for some of the solutions to work you have to ensure that the linked binaries have exported symbols (it's possible to do it without, but it's a lot harder and I don't have the time). So when you're linking your program ensure that you have dynamic symbols enabled.
Having said that, here's an approach you can use on systems using dlfcn functions. (See History below)
More Caveats
As @Deduplicator points out in his comment below, there may be situations where 0xdeadbeef happens to point to a valid function, in which case you may end up calling the wrong valid function. There are ways to mitigate that at either compile time or runtime, but you'll have to build the solution by hand. For example, C++ does it by mangling the namespace into the symbol names. You could require that to happen. (I'll think of an interesting way to do this and post it)
Linux / SysV variants (Mac OSX included)
Use dladdr (SysV) (GNU has a dladdr1 as well) to determine which function the address you provide falls within:
Example:
#define _GNU_SOURCE /* glibc only exposes dladdr() and Dl_info with this defined */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
int is_valid_function_ptr( void *func) {
Dl_info info;
int rc;
rc = dladdr(func, &info);
if (!rc) {
/* really should do more checking here */
return 0;
}
return 1; /* you can print out function names and stuff here */
}
void print(const char *value) {
fprintf(stdout, "%s", value);
}
void call_the_printer(void (*foo)(), const char *value)
{
if(is_valid_function_ptr(foo)) {
foo(value);
}
else {
fprintf(stderr, "The Beef is Dead!\n");
}
}
int main()
{
void (*funcptr)() = (void (*)()) 0xdeadbeef; /* some dead pointer */
call_the_printer(funcptr, "Hello Dead Beef\n");
funcptr = print; /* actually a function */
call_the_printer(funcptr, "Hello Printer\n");
return 0;
}
NOTE Enable dynamic symbols for this to work
GCC/LLVM etc.
Use -rdynamic or -Wl,--export-dynamic when linking, so compile with:
gcc -o ex1 -rdynamic ex1.c
Windows
Windows does its own thing (as always) and I haven't tested any of these, but the basic concept should work:
Use GetModuleHandle and EnumCurrentProcess together to get loaded symbol information, then run through the pointers in a loop to see whether they match any of the addresses therein.
The other way would be to use VirtualQuery and then cast mbi.AllocationBase to (HMODULE) and see if you get the path of your own binary back.
In C, function pointers are like regular pointers in that the standard gives them exactly one value meaning "do not use this": NULL.
The way you should work with pointers is to set them only to a valid value or to NULL. There is no other way to be sure a pointer holds an OK value. And by convention, every value that is not NULL should be considered valid.
As pointed out in other comments and answers, there is no way to check whether a variable is initialized. That's why initializing variables to NULL and then checking is considered good practice.
If you really want to validate your function pointer is pointing to the correct place, you could export the function and load your pointer from the ELF symbols (see: http://gcc.gnu.org/wiki/Visibility)
Always check for null parameters first of all.
void print_client(client *c)
{
if ((c != NULL) && (c->print != NULL))
{
c->print(c);
}
}
As for your question, zero out your client struct after it's been malloc'd. This way you can ensure that an unassigned function pointer will indeed compare equal to NULL.
client* create_client(void)
{
client *c = malloc(sizeof(client));
if (c != NULL)
{
memset(c, 0, sizeof(*c)); /* size of the struct, not of the pointer */
}
return c;
}
In C99, is there an easier way to check whether any member of a structure of function pointers is NULL, other than checking each individual pointer?
What I currently have is similar to the following:
typedef struct {
void* (*foo)(int a);
int (*bar)(int a, int b);
} funcs;
void *funcs_dll;
funcs_dll = dlopen("my_funcs_dll.so", RTLD_NOW | RTLD_GLOBAL);
if (funcs_dll == NULL) {
THROW_ERROR;
}
funcs.foo = dlsym(funcs_dll, "foo");
funcs.bar = dlsym(funcs_dll, "bar");
if (!funcs.foo || !funcs.bar) {
THROW_ERROR;
}
What I am looking to do is reduce the second if check, so that I do not need to check each individual function. Any suggestions would be helpful.
Not directly, no.
You can't use memcmp() to compare to some constant buffer, since there might be padding inside the structure which will have "random" values. If you can make sure that the size of the structure is exactly the sum of the function pointer fields, you can perhaps go that way.
You can also use a proxy, e.g. by declaring an initial uint32_t member that is a bitset representing which function pointer(s) are valid. Then you can check up to 32 (or 64 with uint64_t) proxy bits in parallel.
If you only want to do this once, my suggestion would be a data-driven approach. Define a table of function names to look for, and process that in a loop, exiting as soon as a dlsym() call fails.
Something like:
const struct {
const char *name;
size_t offset;
} functions[] = {
{ "foo", offsetof(funcs, foo) },
{ "bar", offsetof(funcs, bar) },
};
Data-driven code like this is very powerful, and often very fast.
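A loop over that table might look like the sketch below. To keep it testable without a real shared library, the resolver is passed in as a function pointer with the same signature as dlsym(); in real code you would pass dlsym itself. The load_funcs, symbol_lookup and demo_* names are invented for this example, and the memcpy relies on the POSIX assumption that a void * can hold a function address.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    void *(*foo)(int a);
    int (*bar)(int a, int b);
} funcs;

/* Same table as above: symbol name plus where its pointer lives in funcs. */
static const struct {
    const char *name;
    size_t offset;
} functions[] = {
    { "foo", offsetof(funcs, foo) },
    { "bar", offsetof(funcs, bar) },
};

/* Signature-compatible with dlsym(). */
typedef void *(*symbol_lookup)(void *handle, const char *name);

/* Returns 0 on success, or -1 as soon as one symbol cannot be resolved.
 * Assumes void * and function pointers have the same size (true on POSIX). */
int load_funcs(funcs *out, void *handle, symbol_lookup lookup)
{
    size_t i;
    for (i = 0; i < sizeof functions / sizeof functions[0]; i++) {
        void *sym = lookup(handle, functions[i].name);
        if (sym == NULL)
            return -1;
        memcpy((char *)out + functions[i].offset, &sym, sizeof sym);
    }
    return 0;
}

/* Stand-ins for a real library, only so the sketch can be exercised. */
static void *demo_foo(int a) { (void)a; return NULL; }
static int demo_bar(int a, int b) { return a + b; }

static void *demo_lookup(void *handle, const char *name)
{
    (void)handle;
    if (strcmp(name, "foo") == 0) return (void *)demo_foo;
    if (strcmp(name, "bar") == 0) return (void *)demo_bar;
    return NULL;
}
```

With a real library the call would be load_funcs(&my_funcs, funcs_dll, dlsym), followed by a single check of the return value instead of one check per pointer.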
Write a wrapper function for dlsym that sets an error flag if the return value is NULL.
I'm about to debug someone else's code and I stumbled across a certain 'way' of handling global arrays which I consider deeply bad, but the one who first used it swears by it.
I need to find arguments against it.
Here is a simplified version of the code (this is not the original, just an abstracted version).
So my question: what arguments would you bring against this (or what code would bring this method down)?
int test(int i, int v, int type, int** t)
{
static int *teeest;
int result = 0;
switch(type)
{
case (1):
{
int testarr[i];
teeest = testarr;
}
break;
case (2):
result = teeest[i];
break;
case (3):
teeest[i] = v;
break;
}
if (t != NULL)
{
*t = teeest;
}
return result;
}
int main()
{
int *te = (int*)1;
test(5, 0, 1, &te);
printf("%p\n", te);
int i=0;
for(;i<5;i++)
{
test(i, i, 3, NULL);
printf("Value: %d\n", test(i,0,2, NULL));
}
return 0;
}
Local variables are dead after the block they are declared in, so this code has undefined behavior. Like any access to a random address, it may work, but it also may not.
Note that if you use malloc instead of int testarr[i] (and take care to free the previous array and to initialize teeest), the code will be correct. The problems with this code have nothing to do with static pointers.
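A heap-based variant along those lines could look like this. It is a sketch rather than a recommendation of the design: it keeps the original type codes (1 allocates, 2 reads, 3 writes) and adds a stored element count with bounds checks, which the original did not have.

```c
#include <stdlib.h>

/* type 1: (re)allocate i ints; type 2: read index i; type 3: write v at i */
int test(int i, int v, int type, int **t)
{
    static int *teeest = NULL;   /* heap storage survives between calls */
    static int count = 0;        /* added here: number of valid elements */
    int result = 0;

    switch (type) {
    case 1:
        free(teeest);            /* release the previous array, if any */
        teeest = (i > 0) ? calloc((size_t)i, sizeof *teeest) : NULL;
        count = (teeest != NULL) ? i : 0;
        break;
    case 2:
        if (i >= 0 && i < count)
            result = teeest[i];
        break;
    case 3:
        if (i >= 0 && i < count)
            teeest[i] = v;
        break;
    }
    if (t != NULL)
        *t = teeest;
    return result;
}
```

Calling test(0, 0, 1, NULL) at the end frees the array. The function is still neither reentrant nor thread-safe, so the other objections to this style remain.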
This is really bad. Just because the pointer is static doesn't mean the data it points to will be around. For example, testarr disappears when the function exits and the returned pointer, if used, might cause dragons to appear.
It seems to me the big downfall of this style is that you are hiding the fact that you are accessing a locally declared array which is on the stack. You then keep a pointer into that stack across calls, each of which has a different stack frame.
Another thing I was thinking about is that you have hidden from the developer what the data structure is. Indexing an array is a normal operation. Indexing a pointer makes the developer acknowledge it is an array and not a more complex data type. This also adds confusion to bounds checking.
Another thing is that all the disadvantages of global variables apply directly. The code is not reentrant, and it is hard to make thread-safe (if that's a concern).