I am making a little project just for fun about OO in C.
The problem I am encountering is fairly odd to me. The program below is the entire thing.
When compiled and run, all the 'methods' work properly until the first 'puts' is reached. After that, the 'methods' seem to have lost the reference to 'self'.
I'm not sure why it works up until then, but not after. It is worth noting that executing the 'constructor' again before the second 'puts' will make it work. The 'if' chains clearly show that the methods work because they properly set the new strings.
I do realize this would be fixed by simply passing the 'object's' address via parameters to the 'methods', but that's kind of the point of the post, I would like to find a neat way to do it without doing so.
#include <stdio.h>
//-----class_a--------------------------------------------------------------//
typedef struct{
char* string;
char* (*get_string)();
char* (*set_string)(char*);
} class_a;
void constructor_class_a(class_a *self, char* string){
self->string = string;
char* get_string(){
return self->string;
} self->get_string = get_string;
char* set_string(char* new_string){
return self->string = new_string;
} self->set_string = set_string;
}
//--------------------------------------------------------------------------//
int main(){
class_a object_a;
constructor_class_a(&object_a, "default string");
printf("string: %s\n", object_a.get_string());
if (object_a.get_string()=="default string"){
object_a.set_string("temporary string");
if (object_a.get_string()=="temporary string"){
object_a.set_string("final string");
if (object_a.get_string()=="final string"){
printf("string: %s\n", object_a.get_string());
}
}
}
printf("%s", object_a.get_string());
return 0;
}
I've tried doing this to no avail (including a int init member to the 'class'):
char* get_string(){
static class_a *self2 = {0};
if (!self2->init){
*self2 = *self;
self2->init = 1;
}
return self2->string;
} self->get_string = get_string;
The problem is that you're using nested functions, which are a GCC extension rather than part of the C language, and you're using them incorrectly. Their lifetimes end at the end of the block they're nested in (here, the body of constructor_class_a), and any use of a function pointer to them after their lifetime ends has undefined behavior.
Even if this did work, it would be an awful idea, since the ability to have pointers to nested functions necessarily depends on having an executable stack, which is deprecated because it makes most kinds of vulnerabilities trivial to exploit. This is among the many reasons that clang and other compilers refuse to copy this GCC feature and why it's essentially dead.
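For comparison, here is a minimal sketch of the portable approach, which is the one the question hopes to avoid: every 'method' is an ordinary file-scope function that receives the object explicitly. The names class_a_init, class_a_get and class_a_set are made up for this illustration and are not from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rewrite of class_a in standard C: each "method" is a
   file-scope function that takes the object as its first argument. */
typedef struct class_a class_a;

struct class_a {
    const char *string;
    const char *(*get_string)(const class_a *self);
    const char *(*set_string)(class_a *self, const char *new_string);
};

static const char *class_a_get(const class_a *self)
{
    assert(self != NULL);
    return self->string;
}

static const char *class_a_set(class_a *self, const char *new_string)
{
    assert(self != NULL);
    self->string = new_string;
    return self->string;
}

/* The stored pointers refer to file-scope functions, so they remain
   valid after the constructor returns. */
void class_a_init(class_a *self, const char *string)
{
    assert(self != NULL);
    self->string = string;
    self->get_string = class_a_get;
    self->set_string = class_a_set;
}
```

A call then looks like object_a.get_string(&object_a). It is more typing than the nested-function version, but nothing here depends on the constructor's stack frame.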
Related
Just out of curiosity, I'm trying to understand how pointers to functions work in C.
In order to associate a function to a typedef, I've declared a pointer in it, and then I've stored the address of the desired function in there.
This is what I was able to achieve:
typedef struct
{
void (*get)(char*, int);
char string[10];
} password;
int main()
{
password userPassword;
userPassword.get = &hiddenStringInput;
userPassword.get(userPassword.string, 10);
return EXIT_SUCCESS;
}
While this does actually work perfectly, I'd like for "userPassword.get" to be a shortcut that when used calls the hiddenStringInput function and fills in the requested arguments (in this case, an array of characters and a integer).
Basically, since I'm always going to use userPassword.get in association with the arguments "userPassword.string" and "10", I'm trying to figure out a way to somehow store those parameters in the pointer that points to the hiddenString function. Is it even possible?
The way I see this usually done is by providing a "dispatch" function:
void get(password * pw) {
pw->get(pw->string, 10);
}
Then, after setting userPassword.get to your function, you call just:
get(&userPassword);
Obviously this adds some boilerplate code when done for multiple functions. It does, however, make it possible to implement further "class-like" things.
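If the goal is to store the arguments together with the function, one possible sketch in plain C is a small struct that holds both. The names bound_get, bind_get, call_bound and fill_with_stars are all hypothetical. fill_with_stars merely stands in for hiddenStringInput, whose body is not shown in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "bound call": the function pointer and the arguments it
   should always receive are stored side by side. */
typedef struct {
    void (*fn)(char *buf, int size);
    char *buf;
    int size;
} bound_get;

bound_get bind_get(void (*fn)(char *, int), char *buf, int size)
{
    bound_get b = { fn, buf, size };
    return b;
}

/* Invokes the stored function with the stored arguments. */
void call_bound(const bound_get *b)
{
    assert(b != NULL && b->fn != NULL);
    b->fn(b->buf, b->size);
}

/* Stand-in for hiddenStringInput (an assumption): fills the buffer
   with '*' characters and terminates it. */
void fill_with_stars(char *buf, int size)
{
    for (int i = 0; i + 1 < size; i++)
        buf[i] = '*';
    if (size > 0)
        buf[size - 1] = '\0';
}
```

The caller still has to pass the bound_get object to call_bound, so this does not remove the explicit argument. It only packages the arguments once instead of repeating them at every call site.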
You can do this in Clang using the "Blocks" language extension. As commented, there have been attempts to standardize this (and it's not been received with hostility or anything), but they're moving slowly.
Translated to use Blocks, your example could look like this:
#include <stdlib.h>
#include <Block.h>
typedef void (^GetPw)(int); // notice how Block pointer types are used
typedef void (*GetPw_Impl)(char*, int); // the same way as function pointer types
typedef struct
{
GetPw get;
char string[10];
} password;
extern void hiddenStringInput(char*, int);
extern void setPw(char dst [static 10], char * src);
GetPw bindPw (GetPw_Impl get_impl, char * pw)
{
return Block_copy (^ (int key) {
get_impl (pw, key);
});
}
int main()
{
password userPassword;
setPw(userPassword.string, "secret");
userPassword.get = bindPw(hiddenStringInput, userPassword.string);
userPassword.get(10);
return EXIT_SUCCESS;
}
There are some subtleties to the way arrays are captured that might confuse this case; the example captures the password by normal pointer and assumes userPassword is responsible for ownership of it, separately from the block.
Since a block captures values, it needs to provide and release dynamic storage for the copies of the captured values that will be created when the block itself is copied out of the scope where it was created; this is done with the Block_copy and Block_release functions.
Block types (syntactically function pointers, but using ^ instead of *) are just pointers - there's no way to access the underlying block entity, just like basic C functions.
This is the Clang API - standardization would change this slightly, and will probably reduce the requirement for dynamic memory allocation to copy a block around (but the Clang API reflects how these are currently most commonly used).
So, I've just realized that I can write functions directly inside of structs
typedef struct
{
char string[10];
void get(void)
{
hiddenStringInput(string, 10);
return;
}
void set(const char* newPassword)
{
strcpy(string, newPassword);
return;
}
void show(void)
{
printf("%s", string);
return;
}
} password;
Now I can just call userPassword.get(), userPassword.show() and userPassword.set("something"), and what happens is exactly what the label says. Are there any reasons I shouldn't do this? This looks like it could come pretty handy.
EDIT: So this is only possible in C++. I didn't realize I'm using a C++ compiler and by attempting to do random stuff I came up with this solution. So this isn't really what I was looking for.
How can I check if a pointer to function was initialized?
I can check for NULL, but if not null could be garbage, right?
I have the following:
#include <stdio.h>
#include <stdlib.h>
typedef struct client_struct
{
char *name;
char *email;
void (*print)(struct client_struct *c);
} client;
void print_client(client *c)
{
if (c->print != NULL)
c->print(c);
}
int main()
{
client *c = (client *)malloc(sizeof(client));
c->email = (char *)malloc(50 * sizeof(char));
sprintf(c->email, "email@server.com");
c->name = (char *)malloc(50 * sizeof(char));
sprintf(c->name, "some name");
//Uncommenting the line below works as expected, otherwise segmentation fault
//c->print = NULL;
print_client(c);
printf("\nEOF\n");
int xXx = getchar();
return 0;
}
How can I check if this pointer really points to function "void (*f)(client *)"?
Comparing sizes doesn't work because garbage could have the same size, correct?
I would like a way to accomplish that preferably according to C standard.
As described in the comments, it is impossible to determine with 100% certainty whether a pointer is garbage.
To avoid such a situation, you can provide a "constructor" function, like this:
struct client_struct* client_allocate()
{
struct client_struct* object = malloc(sizeof *object);
if (object)
{
object->name = NULL;
object->email = NULL;
object->print = NULL;
}
return object;
}
Then write in your documentation that the only valid way to create "clients" is by using your function. If you do this, you should also provide a destroy function, where you call free.
Suppose you add a new pointer to your struct one day. Then you update your client_allocate function, where you set this pointer to NULL, and the new pointer will always be properly initialized. There is no need to update all places in code where your struct is allocated, because now there is only one such place.
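A sketch of how the matching destroy function might look, next to a copy of the allocator so that the example compiles on its own. The names client_create, client_set_print and client_destroy are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct client_struct {
    char *name;
    char *email;
    void (*print)(struct client_struct *c);
} client;

/* Same idea as client_allocate: every pointer starts out as NULL. */
client *client_create(void)
{
    client *c = malloc(sizeof *c);
    if (c != NULL) {
        c->name = NULL;
        c->email = NULL;
        c->print = NULL;
    }
    return c;
}

/* Hypothetical setter, so the function pointer is only ever NULL or a
   value the caller chose deliberately. */
void client_set_print(client *c, void (*print)(client *))
{
    assert(c != NULL);
    c->print = print;
}

/* Releases the members and then the object itself. free(NULL) is a
   no-op, so members that were never set are handled too. */
void client_destroy(client *c)
{
    if (c == NULL)
        return;
    free(c->name);
    free(c->email);
    free(c);
}
```

This assumes name and email, when set, point to memory obtained from malloc, as in the question's main.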
Caveats
Checking whether a pointer to a function is initialized with a valid function is not an easily solvable problem. Any solution will not be portable across platforms, and also depends on the binary format (statically or dynamically linkable formats) that you end up with. There are ways to do this, with varying success, on different binary formats; however, I am not going to go over every permutation. Hopefully this will get you going down that rabbit hole :-) and you can figure out the particular solution that works for you in your circumstances.
In order for some of the solutions to work you have to ensure that the linked binaries have exported symbols (it's possible to do it without, but it's a lot harder and I don't have the time). So when you're linking your program ensure that you have dynamic symbols enabled.
Having said that, here's an approach you can use on systems using dlfcn functions. (See History below)
More Caveats
As @Deduplicator points out in his comment below, there may be situations where 0xdeadbeef happens to point to a valid function, in which case you may end up calling the wrong valid function. There are ways to mitigate that situation at either compile-time or runtime but you'll have to build the solution by hand. For example, C++ does it by mangling the namespace into the symbols. You could require that to happen. (I'll think of an interesting way to do this and post it)
Linux / SysV variants (Mac OSX included)
Use dladdr (SysV; GNU has a dladdr1 as well) to determine which function the address you provide falls within:
Example:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
int is_valid_function_ptr( void *func) {
Dl_info info;
int rc;
rc = dladdr(func, &info);
if (!rc) {
/* really should do more checking here */
return 0;
}
return 1; /* you can print out function names and stuff here */
}
void print(const char *value) {
fprintf(stdout, "%s", value);
}
void call_the_printer(void (*foo)(), const char *value)
{
if(is_valid_function_ptr(foo)) {
foo(value);
}
else {
fprintf(stderr, "The Beef is Dead!\n");
}
}
int main()
{
void (*funcptr)() = (void (*)()) 0xdeadbeef; /* some dead pointer */
call_the_printer(funcptr, "Hello Dead Beef\n");
funcptr = print; /* actually a function */
call_the_printer(funcptr, "Hello Printer\n");
return 0;
}
NOTE Enable dynamic symbols for this to work
GCC/LLVM etc.
use -rdynamic or -Wl,--export-dynamic during the link process, so compile with:
gcc -o ex1 -rdynamic ex1.c
Windows
Windows does its own thing (as always) and I haven't tested any of these, but the basic concept should work:
Use GetModuleHandle and EnumCurrentProcess together to get loaded symbol information and run through the pointers in a loop to see whether they match any of the addresses therein.
The other way would be to use VirtualQuery and then cast mbi.AllocationBase to (HMODULE) and see if you get the path of your own binary back.
In C, function pointers are no different from regular pointers in this respect: the standard gives them one value that says the pointer should not be used, and that is NULL.
The way you should work with pointers is to set them only to a valid value or NULL. There is no other way you can be sure there is an OK value. And by definition every value that is not NULL should be considered valid.
As pointed out in other comments and answers, there is no way to check whether a variable is initialized. That's why initializing variables to NULL and then checking is considered good practice.
If you really want to validate your function pointer is pointing to the correct place, you could export the function and load your pointer from the ELF symbols (see: http://gcc.gnu.org/wiki/Visibility)
Always check for null parameters first of all.
void print_client(client *c)
{
if ((c != NULL) && (c->print != NULL))
{
c->print(c);
}
}
As for your question, nullify your client struct after it's been malloc'd. This way you can ensure that an unassigned function pointer shall indeed ==NULL.
client* create_client(void)
{
client *c = malloc(sizeof(client));
if (c != NULL)
{
memset(c, 0, sizeof(*c));
}
return c;
}
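One portability footnote: the C standard does not promise that all-bits-zero is a null pointer, although it is on mainstream platforms. If that matters to you, a possible sketch is to copy a zero-initialized struct instead of calling memset. The names create_client_zeroed and client_is_blank are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct client_struct {
    char *name;
    char *email;
    void (*print)(struct client_struct *c);
} client;

/* Copies a struct whose members were initialized by the compiler, so
   every pointer is a genuine null pointer whatever its bit pattern. */
client *create_client_zeroed(void)
{
    static const client empty = {0};
    client *c = malloc(sizeof *c);
    if (c != NULL)
        *c = empty;
    return c;
}

/* Hypothetical helper: reports whether all pointers are still NULL. */
int client_is_blank(const client *c)
{
    assert(c != NULL);
    return c->name == NULL && c->email == NULL && c->print == NULL;
}
```

In practice the memset version behaves the same on common systems, so treat this as a stylistic alternative rather than a required fix.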
I have a function that needs some constant data, but retrieving the constant data requires calling a function that performs a linear search to retrieve the data. I don't want to perform the search for each call of the function, so I tried making the variable in question static. But static variables cannot be initialized to non-constant values:
int my_function(int foo)
{
static const Thing *bar = thing_from_name("bar");
return do_thing(foo, bar);
}
GCC rightly complains that the "initializer element is not constant".
After pondering my situation for a moment, I came up with a way to have my cake and eat it too:
int my_function(int foo)
{
static const Thing *bar = NULL;
if (!bar) bar = thing_from_name("bar");
return do_thing(foo, bar);
}
This appears to work fine so far, but it feels... wrong. Are there any pitfalls to this approach? Is there a better way to solve my problem?
To be clear, thing_from_name is effectively a pure function, in that it only reads from constant data in memory. Since it searches for a string, there's not really an easy way for me to optimize it down to a constant expression (as far as I know).
EDIT: here's a rough outline of what thing_from_name does, for further context:
const Thing *thing_from_name(const char *name)
{
const Thing *t;
for (t = &thing_array[0]; t->name != NULL; t++) {
if (strcmp(t->name, name) == 0) {
return t;
}
}
return NULL;
}
For plain C you would likely have to use something like Boost.Preprocessor. If thing_from_name() is expressible as completely static code that can be pre-calculated at compile time, then you may succeed. If not, then you should use your solution, but remember that it's not thread-safe.
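If thread safety is a concern, one possible sketch uses a C11 atomic pointer from <stdatomic.h> for the cache. The Thing layout and the contents of thing_array below are stand-ins, since the question does not show them, and cached_bar is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the question's type and table (assumptions). */
typedef struct {
    const char *name;
    int value;
} Thing;

static const Thing thing_array[] = {
    { "foo", 1 }, { "bar", 2 }, { NULL, 0 }
};

const Thing *thing_from_name(const char *name)
{
    assert(name != NULL);
    for (const Thing *t = thing_array; t->name != NULL; t++)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Lazily cached lookup. Two threads may both run the search the first
   time, but they store the same pointer, so the duplicate work is
   harmless as long as thing_from_name is pure. */
const Thing *cached_bar(void)
{
    static _Atomic(const Thing *) cache = NULL;
    const Thing *t = atomic_load(&cache);
    if (t == NULL) {
        t = thing_from_name("bar");
        atomic_store(&cache, t);
    }
    return t;
}
```

This relies on the lookup being idempotent. If the initialization had side effects that must happen exactly once, a once-style primitive would be needed instead.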
I'm about to debug someone else's code and I stumbled across a certain 'way' of handling global arrays which I consider deeply bad, but the one who first used it swears by it.
I need to find arguments against it.
Here is the code in simplified form (this is not the original code, just an abstracted version).
So my question: which arguments would you bring against this (or maybe some code which breaks this method)?
int test(int i, int v, int type, int** t)
{
static int *teeest;
int result = 0;
switch(type)
{
case (1):
{
int testarr[i];
teeest = testarr;
}
break;
case (2):
result = teeest[i];
break;
case (3):
teeest[i] = v;
break;
}
if (t != NULL)
{
*t = teeest;
}
return result;
}
int main()
{
int *te = (int*)1;
test(5, 0, 1, &te);
printf("%p\n", te);
int i=0;
for(;i<5;i++)
{
test(i, i, 3, NULL);
printf("Value: %d\n", test(i,0,2, NULL));
}
return 0;
}
Local variables are dead after the block they are declared in ends, so this code has undefined behavior. Like any access to a random address, it may work, but it also may not.
Note that if you use malloc instead of int testarr[i] (and take care to free the previous array and to initialize teeest), it will be correct. The problems of this code have nothing to do with static pointers.
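A minimal sketch of that malloc-based variant, split into separate functions instead of a type switch. The names storage_init, storage_get, storage_set and storage_free are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

static int *storage = NULL;     /* heap array, outlives any single call */
static size_t storage_len = 0;

/* (Re)allocates the array, zero-filled. Returns 0 on success and -1
   if the allocation fails. */
int storage_init(size_t n)
{
    assert(n > 0);
    free(storage);              /* release the previous array, if any */
    storage = calloc(n, sizeof *storage);
    storage_len = (storage != NULL) ? n : 0;
    return (storage != NULL) ? 0 : -1;
}

int storage_get(size_t i)
{
    assert(i < storage_len);    /* bounds are known, so they can be checked */
    return storage[i];
}

void storage_set(size_t i, int v)
{
    assert(i < storage_len);
    storage[i] = v;
}

void storage_free(void)
{
    free(storage);
    storage = NULL;
    storage_len = 0;
}
```

The array now lives on the heap, so the pointer stays valid between calls. It is still hidden global state, so the code remains non-reentrant.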
This is really bad. Just because the pointer is static doesn't mean the data it points to will be around. For example, testarr disappears when the function exits and the returned pointer, if used, might cause dragons to appear.
It seems to me the big downfall of this style is that you are hiding the fact that you are accessing a locally declared array which is on the stack. Then you persist a pointer to your stack which will persist through calls, which will have different stacks each call.
Another thing I was thinking about is that you have hidden from the developer what the data structure is. Indexing an array is a normal operation. Indexing a pointer makes the developer acknowledge it is an array and not a more complex data type. This also adds confusion to bounds checking.
Another thing is, that all disadvantages of global variables apply directly. The code is not reentrant, and hard to make thread-safe (if that's a concern).
I am having an argument with my friend. He says that I can return a pointer to local data from a function. This is not what I have learned, but I can't find a counterargument to prove my point.
Here is an illustrative case:
char *name() {
char n[10] = "bodacydo!";
return n;
}
And it's used as:
int main() {
char *n = name();
printf("%s\n", n);
}
He says this is perfectly OK because after a program calls name, it returns a pointer to n, and right after that it just prints it. Nothing else happens in the program meanwhile, because it's single threaded and execution is serial.
I can't find a counter-argument. I would never write code like that, but he's stubborn and says this is completely ok. If I was his boss, I would fire him for being a stubborn idiot, but I can't find a counter argument.
Another example:
int *number() {
int n = 5;
return &n;
}
int main() {
int *a = number();
int b = 9;
int c = *a * b;
printf("%d\n", c);
}
I will send him this link after I get some good answers, so he at least learns something.
Your friend is wrong.
name is returning a pointer to the call stack. Once you invoke printf, there's no telling how that stack will be overwritten before the data at the pointer is accessed. It may work on his compiler and machine, but it won't work on all of them.
Your friend claims that after name returns, "nothing happens except printing it". printf is itself another function call, with who knows how much complexity inside it. A great deal is happening before the data is printed.
Also, code is never finished; it will be amended and added to. Code that "does nothing" now will do something once it's changed, and your closely-reasoned trick will fall apart.
Returning a pointer to local data is a recipe for disaster.
You will run into a problem as soon as another function that itself uses the stack is called between name() and printf():
char *fun(char *what) {
char res[10];
strncpy(res, what, 9);
return res;
}
int main(void) {
char *r1 = fun("bla");
char *r2 = fun("blubber");
printf("'%s' is bla and '%s' is blubber", r1, r2);
}
As soon as the scope of the function ends, i.e. after the closing brace } of the function, the memory allocated (on the stack) for all its local variables is released. Returning a pointer to memory that is no longer valid therefore invokes undefined behavior.
You can also say that a local variable's lifetime ends when the function finishes execution.
Also more details you can read HERE.
My counter-arguments would be:
it's never OK to write code with undefined behavior,
how long before somebody else uses that function in different context,
the language provides facilities to do the same thing legally (and possibly more efficiently)
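As one example of such a facility, the caller can supply the buffer. This is only a sketch, and the name name_into is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical replacement for name(): the caller owns the buffer, so
   no pointer to a dead local is ever returned. Returns buf so the call
   can be used directly as an argument. */
char *name_into(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* snprintf truncates if needed and always terminates the string */
    snprintf(buf, size, "%s", "bodacydo!");
    return buf;
}
```

In main this becomes char n[10]; printf("%s\n", name_into(n, sizeof n)); and the array lives as long as main's own frame does.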
It's undefined behavior and the value could easily be destroyed before it is actually printed. printf(), which is just a normal function, could use some local variables or call other functions before the string is actually printed. Since these actions use the stack they could easily corrupt the value.
Whether the code happens to print the correct value depends on the implementation of printf() and on how function calls work on the compiler/platform you are using (which parameters/addresses/variables are put where on the stack, ...). Even if the code happens to "work" on your machine with certain compiler settings, it's far from certain that it will work anywhere else or under slightly different boundary conditions.
You are correct - n lives on the stack and so could go away as soon as the function returns.
Your friend's code might work only because the memory location that n is pointing to has not been corrupted (yet!).
As the others have already pointed out, it is not illegal to do this, but it is a bad idea because the returned data resides in the unused part of the stack and may get overwritten at any time by other function calls.
Here is a counter-example that crashes on my system if compiled with optimizations turned on:
char * name ()
{
char n[] = "Hello World";
return n;
}
void test (char * arg)
{
// msg and arg will reside roughly at the same memory location.
// so changing msg will change arg as well:
char msg[100];
// this will override whatever arg points to.
strcpy (msg, "Logging: ");
// here we access the overridden data. A bad idea!
strcat (msg, arg);
strcat (msg, "\n");
printf (msg);
}
int main ()
{
char * n = name();
test (n);
return 0;
}
gcc : main.c: In function ‘name’:
main.c:4: warning: function returns address of local variable
However, it could be done like this (but it's not sexy code :p):
char *name()
{
static char n[10] = "bodacydo!";
return n;
}
int main()
{
char *n = name();
printf("%s\n", n);
}
Warning: it's not thread-safe.
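Even in a single thread, the static buffer is shared by every call. Here is a small sketch of that pitfall, using a hypothetical label function that formats a number so the sharing becomes visible:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical variant of the static-buffer approach. */
const char *label(int id)
{
    static char buf[16];        /* one buffer shared by every call */
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Demonstrates the pitfall: both pointers refer to the same storage,
   so the second call overwrites what the first one returned. */
void label_demo(void)
{
    const char *a = label(1);
    const char *b = label(2);
    assert(a == b);
    printf("%s %s\n", a, b);    /* prints "item-2 item-2" */
}
```

So the result has to be used or copied before the next call to the function.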
You're right, your friend is wrong. Here's a simple counterexample:
char *n = name();
printf("(%d): %s\n", 1, n);
Returning a pointer to a local variable is always wrong, even if it appears to work in some rare situation.
A local (automatic) variable can be allocated either from stack or from registers.
If it is allocated from stack, it will be overwritten as soon as next function call (such as printf) is executed or if an interrupt occurs.
If the variable is allocated from a register, it is not even possible to have a pointer pointing to it.
Even if the application is "single threaded", the interrupts may use the stack. In order to be relatively safe, you should disable the interrupts. But it is not possible to disable the NMI (Non Maskable Interrupt), so you can never be safe.
While it is true that you cannot return pointers to local stack variables declared inside a function, you can however allocate memory inside a function using malloc and then return a pointer to that block. Maybe this is what your friend meant?
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
char* getstr(){
char* ret=malloc(sizeof(char)*15);
strcpy(ret,"Hello World");
return ret;
}
int main(){
char* answer=getstr();
printf("%s\n", answer);
free(answer);
return 0;
}
The way I see it you have three main options, because this one is dangerous and relies on undefined behavior:
replace: char n[10] = "bodacydo!"
with: static char n[10] = "bodacydo!"
This will give undesirable results if you use the same function more than once in a row while trying to keep the values it returned.
replace:
char n[10] = "bodacydo!"
with:
char *n = new char[10];
strcpy(n, "bodacydo!");
This will fix the aforementioned problem, but you will then need to delete[] the heap memory or start incurring memory leaks (new and delete[] are C++; in C, use malloc and free).
Or finally:
replace: char n[10] = "bodacydo!";
with: shared_ptr<char> n(new char[10], default_delete<char[]>()); strcpy(n.get(), "bodacydo!");
This relieves you from having to delete the heap memory, but you will then have to change the return type and the char *n in main to a shared_ptr as well in order to hand off the management of the pointer. If you don't hand it off, the shared_ptr goes out of scope when the function returns, the memory is freed, and any raw pointer to it is left dangling.
If we take the code segment you gave...
char *name() {
char n[10] = "bodacydo!";
return n;
}
int main() {
char *n = name();
printf("%s\n", n);
}
This may appear to work in main, but n is a local array that is merely initialized from the string literal, so the returned pointer refers to storage that no longer exists and the behavior is undefined.
But now let's look at slightly different code:
class SomeClass {
int *i;
public:
SomeClass() {
i = new int();
*i = 23;
}
~SomeClass() {
delete i;
i = NULL;
}
void print() {
printf("%d", *i);
}
};
SomeClass *name() {
SomeClass s;
return &s;
}
int main() {
SomeClass *n = name();
n->print();
}
In this case, when the name() function returns, the SomeClass destructor is called, and the member variable i is deallocated and set to NULL.
So when we call print() in main, even if the memory pointed to by n has not been overwritten (I am assuming that), the print call will crash when it tries to dereference a NULL pointer.
So your code segment may appear not to fail, but it will most likely fail if the object's destructor releases resources that are used afterwards.
Hope it helps