Terminology when Initializing C Structures - c

This will be an easy question but googling around does not seem to provide me with an answer. The way I understand it in C we have two ways to initialize a foo object, when foo is a structure. Look at the code below for an example
typedef struct foo
{
int var1;
int var2;
char* var3;
}foo;
//initializes and allocates a foo
foo* foo_init1(int v1,int v2,const char* v3)
{
if(..some checks..)
return 0;
foo* ret = malloc(sizeof(foo));
ret->var1 = v1;
ret->var2 = v2;
ret->var3 = malloc(strlen(v3)+1);
strcpy(ret->var3,v3);
return ret;
}
// initializes foo by ... what? How do we call this method of initialization?
char foo_init2(foo* ret,int v1,int v2, const char* v3)
{
//do some checks and return false
if(...some checks..)
return false; // assume you defined true and false in a header file as 1 and 0 respectively
ret->var1 = v1;
ret->var2 = v2;
ret->var3 = malloc(strlen(v3)+1);
strcpy(ret->var3,v3);
return true;
}
So my question is this. How do we refer in C to these different initializing methods? The first returns an initialized pointer to foo so it's easy to use if you want a foo object on the heap like that:
foo* f1 = foo_init1(10,20,"hello");
But the second requires a foo .. what? Look at the code below.
foo f1;
foo_init2(&f1,10,20,"hello");
So the second method makes it easy to initialize an object on the stack but how do you call it? This is basically my question, how to refer to the second method of initialization.
The first one allocates and initializes a pointer to foo.
The second one initializes a foo by ... what? Reference?
As a bonus question, how do you guys work when coding in C? Do you determine the usage of the object you are making and by that determine if you should have an initializing function of type1 , or 2 or even both of them?

I am not sure there is any well-defined nomenclature for the two methods.
In the first method, the function dynamically allocates a structure and assigns values to its members,
while in the second the structure is allocated before the call and the function just assigns values to its members.
Do you determine the usage of the object you are making and by that determine if you should have an initializing function of type1 , or 2 or even both of them?
Choosing between the first and the second method depends on an important difference:
The first method is preferred when you need to pass the returned structure across scopes: memory on the heap persists until it is explicitly freed. In the second method, if the object passed in lives on the stack, it is gone once its scope ends.
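As a rough sketch of how the two styles can coexist, the heap-allocating function can be built on top of the caller-allocates one. The names foo_clear, foo_create and foo_destroy are illustrative assumptions, not established convention, and the error handling is deliberately minimal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct foo {
    int var1;
    int var2;
    char *var3;
} foo;

/* Initializes a foo that the caller already owns (stack, static, or heap).
 * Returns 1 on success, 0 on failure. */
int foo_init(foo *f, int v1, int v2, const char *v3)
{
    assert(f != NULL && v3 != NULL);
    f->var3 = malloc(strlen(v3) + 1);
    if (f->var3 == NULL)
        return 0;
    strcpy(f->var3, v3);
    f->var1 = v1;
    f->var2 = v2;
    return 1;
}

/* Releases what foo_init allocated, but not the foo itself. */
void foo_clear(foo *f)
{
    free(f->var3);
    f->var3 = NULL;
}

/* Allocates a foo on the heap and initializes it; returns NULL on failure. */
foo *foo_create(int v1, int v2, const char *v3)
{
    foo *f = malloc(sizeof *f);
    if (f != NULL && !foo_init(f, v1, v2, v3)) {
        free(f);
        f = NULL;
    }
    return f;
}

/* Counterpart of foo_create: clears the contents, then frees the object. */
void foo_destroy(foo *f)
{
    if (f != NULL) {
        foo_clear(f);
        free(f);
    }
}
```

A stack object would then be paired as foo_init(&f, ...) / foo_clear(&f), and a heap object as foo_create(...) / foo_destroy(p). Either way the string copied into var3 lives on the heap and has to be released.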

Since none of the people in the comments took up on the offer to turn their comments into an answer I am forced to reply to my own question.
Well basically a possible answer would be that as AIs states there is no specific naming convention. Of course whatever naming method is used, it should be:
Consistent across all of the project/s for clarity's sake
Recognizable by other programmers as a function that does what it is actually doing.
To achieve that there were some great recommendations in the comments. For when a foo object is:
Passed for initialization inside the function: foo_init
Allocated inside the function and a pointer returned: foo_alloc, foo_make , foo_new
All of the above are clear, I suppose, but the names that most accurately describe what happens in the functions would be foo_init and foo_alloc.
Personally I really dislike the _alloc solution because I don't like how it looks in my code so I decided to add the verb _create instead of alloc after the function to denote what it's doing.
But well what the answer boils down to I guess is personal preference. All should be okay and acceptable as long as the functionality of the function is made clear by reading its name.

Related

C function that returns a pointer to an array correct syntax?

In C you can declare a variable that points to an array like this:
int int_arr[4] = {1,2,3,4};
int (*ptr_to_arr)[4] = &int_arr;
Although practically it is the same as just declaring a pointer to int:
int *ptr_to_arr2 = int_arr;
But syntactically it is something different.
Now, what would a function look like that returns such a pointer to an array (of int, e.g.)?
A declaration of int is int foo;.
A declaration of an array of 4 int is int foo[4];.
A declaration of a pointer to an array of 4 int is int (*foo)[4];.
A declaration of a function returning a pointer to an array of 4 int is int (*foo())[4];. The () may be filled in with parameter declarations.
As already mentioned, the correct syntax is int (*foo(void))[4]; And as you can tell, it is very hard to read.
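To see that declaration in a complete definition, here is a small sketch. The names get_arr and sum_arr are made up for illustration; the array is the one from the question, made static so that returning its address is safe.

```c
#include <stddef.h>

static int int_arr[4] = {1, 2, 3, 4};

/* get_arr is a function taking no arguments and returning
 * a pointer to an array of 4 int. */
int (*get_arr(void))[4]
{
    return &int_arr;
}

/* Sums the elements through the pointer-to-array.
 * sizeof *p is the size of the whole array, not of a pointer. */
int sum_arr(void)
{
    int (*p)[4] = get_arr();
    int sum = 0;
    for (size_t i = 0; i < sizeof *p / sizeof (*p)[0]; i++)
        sum += (*p)[i];
    return sum;
}
```

Unlike a plain int *, the returned pointer carries the element count in its type, which is why sizeof *p can be used for the loop bound.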
Questionable solutions:
Use the syntax as C would have you write it. This is in my opinion something you should avoid, since it's incredibly hard to read, to the point where it is completely useless. This should simply be outlawed in your coding standard, just like any sensible coding standard enforces function pointers to be used with a typedef.
Oh so we just typedef this just like when using function pointers? One might get tempted to hide all this goo behind a typedef indeed, but that's problematic as well. And this is since both arrays and pointers are fundamental "building blocks" in C, with a specific syntax that the programmer expects to see whenever dealing with them. And the absence of that syntax suggests an object that can be addressed, "lvalue accessed" and copied like any other variable. Hiding them behind typedef might in the end create even more confusion than the original syntax.
Take this example:
typedef int(*arr)[4];
...
arr a = create(); // calls malloc etc
...
// somewhere later, lets make a hard copy! (or so we thought)
arr b = a;
...
cleanup(a);
...
print(b); // mysterious crash here
So this "hide behind typedef" system heavily relies on us naming types somethingptr to indicate that it is a pointer. Or lets say... LPWORD... and there it is, "Hungarian notation", the heavily criticized type system of the Windows API.
A slightly more sensible work-around is to return the array through one of the parameters. This isn't exactly pretty either, but at least somewhat easier to read since the strange syntax is centralized to one parameter:
void foo (int(**result)[4])
{
...
*result = &arr;
}
That is: a pointer to a pointer-to-array of int[4].
If one is prepared to throw type safety out the window, then of course void* foo (void) solves all of these problems... but creates new ones. Very easy to read, but now the problem is type safety and uncertainty regarding what the function actually returns. Not good either.
So what to do then, if these versions are all problematic? There are a few perfectly sensible approaches.
Good solutions:
Leave allocation to the caller. This is by far the best method, if you have the option. Your function would become void foo (int arr[4]); which is readable and type safe both.
Old school C. Just return a pointer to the first item in the array and pass the size along separately. This may or may not be acceptable from case to case.
Wrap it in a struct. For example this could be a sensible implementation of some generic array type:
typedef struct
{
size_t size;
int arr[];
} array_t;
array_t* alloc (size_t items)
{
array_t* result = malloc(sizeof *result + sizeof(int[items]));
return result;
}
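The first of the good solutions above, leaving allocation to the caller, might look like the following sketch. The function name fill_squares and the constant ARR_LEN are assumptions made for the example.

```c
#include <stddef.h>

enum { ARR_LEN = 4 };

/* The caller owns the storage; the function only fills it in.
 * The parameter is adjusted to int *, so the length in the
 * brackets documents intent but is not enforced by the compiler. */
void fill_squares(int arr[ARR_LEN])
{
    for (size_t i = 0; i < ARR_LEN; i++)
        arr[i] = (int)(i * i);
}
```

The caller simply declares int a[ARR_LEN]; and calls fill_squares(a);, so no unusual return type appears anywhere.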
The typedef keyword can make things a lot clearer/simpler in this case:
int int_arr[4] = { 1,2,3,4 };
typedef int(*arrptr)[4]; // Define a pointer to an array of 4 ints ...
arrptr func(void) // ... and use that for the function return type
{
return &int_arr;
}
Note: As pointed out in the comments and in Lundin's excellent answer, using a typedef to hide/bury a pointer is a practice that is frowned-upon by (most of) the professional C programming community – and for very good reasons. There is a good discussion about it here.
However, although, in your case, you aren't defining an actual function pointer (which is an exception to the 'rule' that most programmers will accept), you are defining a complicated (i.e. difficult to read) function return type. The discussion at the end of the linked post delves into the "too complicated" issue, which is what I would use to justify use of a typedef in a case like yours. But, if you should choose this road, then do so with caution.

How to retrieve function pointer from inside function in C?

How can I retrieve the function pointer that was used to call a function, from within the function itself? Here's an example of what I need to accomplish:
struct vtable {
void (*func)(void);
};
void foobar(void) {
// How can I get the address of t.func from here?
}
int main(void)
{
struct vtable t = { foobar };
t.func();
return 0;
}
In particular I would like to know if this can be done without using additional parameters in the function definition, ie. not this way:
struct vtable {
void (*func)(struct vtable t);
};
void foobar(struct vtable t) {
...
}
int main(void)
{
struct vtable t = { foobar };
t.func(t);
return 0;
}
This is impossible in portable C. It's also impossible on typical implementations.
When you have a function call
int main(void) {
…foobar(…)…
}
there is no way for foobar to know that it was called by main using C language constructs alone. Many implementations make this information available through debugging features that let you explore the call stack, which the implementation maintains under the hood so as to keep track of where return goes to. In practice this doesn't always match the calling structure in the source code due to compile-time transformations such as inlining.
When the function is determined through a function pointer variable, typical implementations do not keep track of this information at all. A typical way to compile t.func() is:
Load the function pointer t.func into a processor register r.
Push the current instruction pointer to the call stack.
Branch to the address stored in r.
There is no information in memory that links steps 1 and 3. Other things may have happened between steps 1 and 3 depending on how the optimizer handled this particular chunk of code.
If you need to know from which “object” a “method” was called, you need to pass a pointer to the object to the function that is the method. This is how object-oriented languages with actual methods work: under the hood, there is an extra “this” or “self” argument, even if the language doesn't make it explicit.
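A minimal sketch of that explicit "self" argument, reusing the struct vtable and foobar names from the question. The CALL macro is a hypothetical convenience, not something from any library.

```c
struct vtable {
    /* Every method takes the object it is invoked on as its first argument. */
    const struct vtable *(*func)(const struct vtable *self);
};

/* Returns the address of the struct it was called through, which it
 * knows only because the caller passed it in. */
const struct vtable *foobar(const struct vtable *self)
{
    return self;
}

/* Hypothetical helper: hides the repeated object argument.
 * obj is evaluated twice, so it should have no side effects. */
#define CALL(obj, method) ((obj)->method(obj))
```

With struct vtable t = { foobar };, both t.func(&t) and CALL(&t, func) hand foobar the address of t.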
the problem that I'm trying to solve is how to get the address of the struct without altering the function's list of arguments
The only way to do that, short of doing it the correct way with parameter passing, is to have the caller store the address in a global variable. That's ugly but possible:
#include <stdio.h>
struct vtable {
void (*func)(void);
};
static struct vtable* lastcall;
#define call(x, func) do { lastcall=&(x); (x).func(); } while(0)
void foobar(void) {
printf("foobar caller: %p\n", (void*)lastcall);
}
int main(void)
{
struct vtable t = { foobar };
printf("Address of t: %p\n", (void*)&t);
call(t, func);
return 0;
}
I wouldn't recommend the above - it is better if you change the API to include the struct, then hide that part behind a macro if you must.
Setting portability aside entirely, it is of course also possible to dissect the stack and find the caller address there. This is ABI-specific though, and you might have to do it in assembler.
No, it is not possible. How should a function know by which way it is called?
Consider if you call the function without using a structure holding its pointer, like this:
foobar();
You need to invent some way to pass the requested value as a parameter.
I can give you a working answer, but it won't be a pretty one.
C is a pretty chill language when it comes to accessing memory. In fact, in practice you can read the entire program stack from any function, which means that you can reach main's variables from foobar.
This is as powerful as it is, usually, a bad idea.
For your problem, you can search a range of the stack for a pointer to your foobar function: treat ARBITRARY values stored on the stack as addresses of a struct vtable, and check whether the func field equals the address of foobar.
Usually this will yield a SIGSEGV; to avoid it, you can use pointer arithmetic to restrict the candidates to valid stack addresses.
Here is a working example in "pure" C (simply play with the RANGE define). But I have to warn you again: don't use this in the real world, unless you want to show off your hacking skills.
#include <stdio.h>
#include <stdint.h>
#define RANGE 100
struct vtable {
void (*func)(void);
};
void foobar(void) {
uintptr_t a[1]; //We control the stack from this address!
for (int i = 0; i < RANGE; i++) { //We are basically doing a buffer over-read (undefined behavior)
if (a[i] > (uintptr_t)a && a[i] < (uintptr_t)(a+RANGE)) { //Ignore addresses too far away to prevent a SEGV
struct vtable *t = (struct vtable*)a[i];
if (t->func == foobar)
printf("[FOOBAR] Address of t is: %p\n", (void*)t);
}
}
}
int main(void)
{
struct vtable t = { foobar };
printf("[MAIN] Address of t: %p\n", (void*)&t);
t.func();
return 0;
}
Have a nice day!
First of all, you cannot get the address of t if you don't pass any reference to it. This is like trying to follow a pointer back to its pointer. Pointers in general don't work in reverse, which is the reason for data structures like doubly linked lists. You can have millions of such pointers, all pointing to this function, so there's nothing in the function's address that lets you know where the function pointer was stored.
That said:
In your first paragraph you say:
How can I retrieve the function pointer that was used to call a function, from within the function itself?
Well, that's precisely what you get when you use the plain name of the function (as in main); you can then call that function with a (probably non-empty) argument list. You don't know where your function was called from, but if control is inside its body, the function must have been entered through its address, so using the function name inside the function is a way to get a pointer to it. But you cannot go further and get the structure in which that pointer was stored: this information is not passed to your function, and you have no means of getting to it. This is the same problem as being forced to pass an array length to a function because nothing in the array tells you how large it is.
I have not checked your code thoroughly, as it is just a snippet that needs some adjustments to become fully executable, but from my point of view it is correct and will do what you intend. Just test it; the computer is not going to break if you make a mistake.
Beware in your code you have passed a full struct record by value, and that will make a full copy of the struct in order to put it in the parameter stack. Probably what you want is something like:
struct vtable {
void (*func)(void); /* correct */
};
void foobar(void) {
// How can I get the address of t.func from here?
/* if you want to get the address of the function, it is
* easy, every function knows its address, is in its
* name */
void (*f)(void) = foobar; /* this pointer is the only one
* that could be used to call
* this function and be now
* executing code here. :) */
/* ... */
f(); /* this will call foobar again, but through the pointer
* f, recursively (the pointer although, is the same) */
}
int main(void)
{
struct vtable t = { foobar };
t.func();
return 0;
}
It is very common to see functions that take callbacks to be executed on behalf of the calling code. It is also common for those callbacks to require pointers to structures that represent the context on whose behalf they are called. So don't hesitate to pass arguments to your function (try not to pass large structures by value, as you do in your example; well, I recognize it is not large, it has only a pointer). OOP implementations rest heavily on these premises.

Defining a function as a function pointer

Mostly for fun, I've decided to write my own minimal test framework for my C code. I use a basic struct for the test information, create an array of test structs and then iterate over them to run all the tests. This amounts to a very small amount of work for a fairly elegant (imho) solution.
However, the one thing that is a little annoying is that I cannot figure out how to define functions as function pointers instead of defining the function and then creating a function pointer later.
I have the following (which works just fine):
typedef int (* test_p) (void);
struct test {
char * desc;
test_p func;
};
int
example_test (void) {
puts("This is a test");
return 0;
}
void
run_test (char * test_name, test_p test) {
printf("Testing %s\t\t\t[ PEND ]\r", test_name);
char * test_result = (test() ? "FAIL" : "PASS");
printf("Testing %s\t\t\t[ %s ]\n", test_name, test_result);
}
int
main (void) {
struct test test_list [] = {
{ "example test", (test_p )example_test }
};
for ( int i = 0; i < 1; i ++ ) {
run_test(test_list[i].desc, test_list[i].func);
}
return 0;
}
However, I am hoping I can remove the need for the casting in the struct and instead define the function as being a function pointer from the beginning. The following is an example of how I would like this to work (assuming many of the same things as above):
test_p
example_test = {
puts("This is a test");
return 0;
}
If I could do something like this, then in the struct, I could simply have the func field be example_test rather than (test_p )example_test. Is this (or something like it) possible? If not, is there a reason why not (If that reason is simply "because it wasn't added to the language", that's fine)?
A function pointer is one kind of thing and a function is another kind of thing so you can't really make the latter be the former. But if you use a function name where a function pointer is expected, that produces a pointer to the function, so you can just remove the unnecessary cast, as WhozCraig said in the first comment above. You write
If I could do something like this, then in the struct, I could simply have the func field be example_test rather than (test_p )example_test.
You can do that, with example_test defined just as it is in your current code ... did you try that?
You can also forward declare a function, like so:
typedef int test_func(void); // note no indirection
typedef test_func* test_p;
test_func example_test;
It would be nice if you could use that sort of syntax when you define the function, as in your attempted syntax, but there's simply no way to do that in C ... you have to explicitly provide the return type and parameter list.
Another detail is that, when you invoke the function pointed to by a function pointer, you don't have to dereference it ... that's why you were able to write
test()
instead of
(*test)()
although the latter also works. (In fact, because the dereference is stripped, (********test)() also works ... but only do that if you're trying to win an obfuscation contest.)
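Putting those points together, a cast-free version of the test table could be sketched as follows. The run_all function is an assumption added for the example; the typedefs follow the ones shown above.

```c
#include <stddef.h>

typedef int test_func(void);   /* a function type, no indirection */
typedef test_func *test_p;     /* pointer to that function type */

struct test {
    const char *desc;
    test_p func;
};

test_func example_test;        /* forward declaration using the typedef */

int example_test(void)         /* the definition must still spell out the signature */
{
    return 0;
}

/* No cast needed: the function name converts to a pointer automatically. */
static const struct test test_list[] = {
    { "example test", example_test },
};

/* Runs every test and returns the number of failures. */
int run_all(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof test_list / sizeof test_list[0]; i++)
        if (test_list[i].func() != 0)   /* same as (*test_list[i].func)() */
            failures++;
    return failures;
}
```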
What you are describing is a kind of meta-programming. Rather than writing code to explicitly solve the problem, you are concerned with a kind of syntactic structure that will allow you to define a whole raft of test functions without unnecessary cruft.
In Lisp you would use macros. In C++ you might use templates and/or lambdas. In C you use macros.
So you need to write a macro that:
takes a name and descriptive text as arguments
defines a static variable of type function (created from that name using token pasting)
defines a function (using a name created by token pasting)
[edit] At this point you have achieved the goal: you have created the function and given it a name that is (only) a function pointer, and you can use that name in your struct without a cast. I would suggest one additional step, the macro also:
adds the variable/function and descriptive text to a list of functions to be tested.
Then your boilerplate loop iterates over the structure calling each function and reporting the results using the descriptive text. Problem solved.
Some people don't like macros, but they are ideally suited to this situation, and there is no other way to do it in C. I did something just like this before making the move to C++.
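One way such a macro might look is sketched below. The macro name TEST, the _entry suffix and the count_failures function are all assumptions made for illustration. Standard C has no portable way to make a macro append itself to a list, so in this sketch the list of entries is still written by hand.

```c
#include <stddef.h>

struct test {
    const char *desc;
    int (*func)(void);
};

/* Hypothetical TEST macro: declares the function, defines a struct test
 * named <name>_entry that describes it, then opens the function definition. */
#define TEST(name, text)                                      \
    static int name(void);                                    \
    static const struct test name##_entry = { text, name };   \
    static int name(void)

TEST(addition_works, "addition works")
{
    return (1 + 1 == 2) ? 0 : 1;   /* 0 means PASS, as in the question */
}

TEST(always_fails, "always fails")
{
    return 1;
}

/* The list of entries is maintained manually in this sketch. */
static const struct test *const all_tests[] = {
    &addition_works_entry,
    &always_fails_entry,
};

/* Runs every registered test and returns the number of failures. */
int count_failures(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof all_tests / sizeof all_tests[0]; i++)
        if (all_tests[i]->func() != 0)
            failures++;
    return failures;
}
```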

Regarding typedefs of 1-element arrays in C

Sometimes, in C, you do this:
typedef struct foo {
unsigned int some_data;
} foo; /* btw, foo_t is discouraged */
To use this new type in an OO-sort-of-way, you might have alloc/free pairs like these:
foo *foo_alloc(/* various "constructor" params */);
void foo_free(foo *bar);
Or, alternatively init/clear pairs (perhaps returning error-codes):
int foo_init(foo *bar, /* and various "constructor" params */);
int foo_clear(foo *bar);
I have seen the following idiom used, in particular in the MPFR library:
struct foo {
unsigned int some_data;
};
typedef struct foo foo[1]; /* <- notice, 1-element array */
typedef struct foo *foo_ptr; /* let's create a ptr-type */
The alloc/free and init/clear pairs now read:
foo_ptr foo_alloc(/* various "constructor" params */);
void foo_free(foo_ptr bar);
int foo_init(foo_ptr bar, /* and various "constructor" params */);
int foo_clear(foo_ptr bar);
Now you can use it all like this (for instance, the init/clear pairs):
int main()
{
foo bar; /* constructed but NOT initialized yet */
foo_init(bar); /* initialize bar object, alloc stuff on heap, etc. */
/* use bar */
foo_clear(bar); /* clear bar object, free stuff on heap, etc. */
}
Remarks: The init/clear pair seems to allow for a more generic way of initializing and clearing out objects. Compared to the alloc/free pair, the init/clear pair requires that a "shallow" object has already been constructed. The "deep" construction is done using init.
Question: Are there any non-obvious pitfalls of the 1-element array "type-idiom"?
This is very clever (but see below).
It encourages the misleading idea that C function arguments can be passed by reference.
If I see this in a C program:
foo bar;
foo_init(bar);
I know that the call to foo_init does not modify the value of bar. I also know that the code passes the value of bar to a function when it hasn't initialized it, which is very probably undefined behavior.
Unless I happen to know that foo is a typedef for an array type. Then I suddenly realize that foo_init(bar) is not passing the value of bar, but the address of its first element. And now every time I see something that refers to type foo, or to an object of type foo, I have to think about how foo was defined as a typedef for a single-element array before I can understand the code.
It is an attempt to make C look like something it's not, not unlike things like:
#define BEGIN {
#define END }
and so forth. And it doesn't result in code that's easier to understand because it uses features that C doesn't support directly. It results in code that's harder to understand (especially to readers who know C well), because you have to understand both the customized declarations and the underlying C semantics that make the whole thing work.
If you want to pass pointers around, just pass pointers around, and do it explicitly. See, for example, the use of FILE* in the various standard functions defined in <stdio.h>. There is no attempt to hide pointers behind macros or typedefs, and C programmers have been using that interface for decades.
If you want to write code that looks like it's passing arguments by reference, define some function-like macros, and give them all-caps names so knowledgeable readers will know that something odd is going on.
I said above that this is "clever". I'm reminded of something I did when I was first learning the C language:
#define EVER ;;
which let me write an infinite loop as:
for (EVER) {
/* ... */
}
At the time, I thought it was clever.
I still think it's clever. I just no longer think that's a good thing.
The only advantage to this method is nicer looking code and easier typing. It allows the user to create the struct on the stack without dynamic allocation like so:
foo bar;
However, the structure can still be passed to functions that require a pointer type, without requiring the user to convert to a pointer with &bar every time.
foo_init(bar);
Without the 1 element array, it would require either an alloc function as you mentioned, or constant & usage.
foo_init(&bar);
The only pitfall I can think of is the normal concerns associated with direct stack allocation. If this is in a library used by other code, updates to the struct may break client code in the future, which would not happen when using an alloc/free pair.
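To make the behavior discussed in these answers concrete, here is a small sketch of the idiom. The extra value parameter and the foo_bump and demo functions are assumptions added for the example; the typedefs are the ones from the question.

```c
struct foo {
    unsigned int some_data;
};
typedef struct foo foo[1];     /* 1-element array, as in the question */
typedef struct foo *foo_ptr;

/* bar arrives as a pointer to the caller's single array element. */
int foo_init(foo_ptr bar, unsigned int value)
{
    bar->some_data = value;
    return 0;
}

/* A parameter declared with the array type is adjusted to a pointer,
 * so this also modifies the caller's object. */
void foo_bump(foo bar)
{
    bar->some_data++;
}

unsigned int demo(void)
{
    foo bar;                   /* really: struct foo bar[1] */
    foo_init(bar, 41);         /* passes &bar[0], not a copy */
    foo_bump(bar);
    return bar[0].some_data;
}
```

Because foo is an array type, an object of that type cannot be assigned with = or returned from a function by value, which is a further surprise for a reader who assumes foo is an ordinary struct.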

Return function pointer to a nested function in C

As the title already states, I'm trying to declare a nested function and return a pointer to that function. I want this function 'not' to return a new function pointer which will return the negation of whatever the original function was.
Here is what I have:
someType not( someType original ) {
int isNot( ListEntry* entry ) {
return !original( entry );
}
someType resultFunc = calloc( 1024, 1 );
memcpy( resultFunc, &isNot, 1024 );
return resultFunc;
}
someType is defined as:
typedef int(*someType)(ListEntry* entry);
Steve, you have a completely wrong mental model of what is a C function.
someType resultFunc = calloc( 1024, 1 );
memcpy( resultFunc, &isNot, 1024 );
From your code fragment, I can surmise that you think that you can copy function's compiled code into a block of memory, and then reuse it. This kind of thing smells of Lisp, except even in lisp you don't do it that way.
In fact, when you say "&isNot", you get a pointer to the function. Copying the memory that pointer points at is counterproductive: the memory was initialized when your executable was loaded, and it's not changing. In any case, calling someFunc() would cause a core dump, as the heap memory behind someFunc cannot be executed; this protects you from all sorts of viruses.
You seem to expect an implementation of closures in C. That implementation is simply not there. Unlike Lisp or Perl or Ruby, C cannot preserve elements of a stack frame once you have exited that frame. Even where nested functions are permitted by a compiler, they cannot safely refer to the containing function's local variables after that function has returned. The closest thing to closures is indeed a C++ object that stores the state and implements operator(), but it's a completely different approach, and you'd still have to do things manually.
Update: here is the relevant portion of GCC documentation. Look for "But this technique works only so long as the containing function (hack, in this example) does not exit."
You're not going to be able to do this in the fashion you want. You have a couple of alternative options.
You can use macros:
#define FN_NOT(F) !F
#define notSomeFunc FN_NOT(someFunc)
...
x = notSomeFunc(entry);
But I suspect you wanted to be able to pass the negated function around to other functions that take function pointers, so that won't work.
You can change your interfaces to accept some extra information, eg
struct closure {
void *env;
int (*f)(struct closure* extra, ListEntry*);
};
static int isNot(struct closure* extra, ListEntry *entry) {
someType original = extra->env;
return !original(entry);
}
struct closure not(someType original) {
struct closure rv;
rv.env = original;
rv.f = &isNot;
return rv;
}
And then use it like:
struct closure inverse_fn;
inverse_fn = not( &fn );
if( inverse_fn.f(&inverse_fn, entry) ) {
...
}
There are other things you can try, like JITing functions at runtime, but those sorts of techniques are going to be platform and architecture dependent. This solution is awkward, but pure C and portable.
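A self-contained variant of the closure-struct approach is sketched below, under a few assumptions: ListEntry is a stand-in type with a single field, the captured function is stored with its real type instead of void * (ISO C does not guarantee that function pointers convert to and from void *), and the factory is named make_not rather than not so it cannot clash with the not macro from <iso646.h>.

```c
typedef struct ListEntry { int value; } ListEntry;   /* stand-in type */
typedef int (*someType)(ListEntry *entry);

struct closure {
    someType original;   /* captured state, kept with its real type */
    int (*f)(const struct closure *self, ListEntry *entry);
};

/* The "body" of the closure: negates whatever the captured function returns. */
static int is_not(const struct closure *self, ListEntry *entry)
{
    return !self->original(entry);
}

/* Builds a closure that negates original. */
struct closure make_not(someType original)
{
    struct closure rv;
    rv.original = original;
    rv.f = is_not;
    return rv;
}

/* Example predicate to negate. */
int is_positive(ListEntry *entry)
{
    return entry->value > 0;
}
```

Usage mirrors the snippet above: struct closure c = make_not(is_positive); followed by c.f(&c, &entry).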
I'm using GCC.
You can turn on nested functions by using the flag:
-fnested-functions
when you compile.
I also never heard of nested functions in C, but if gcc supports it, this is not going to work the way you expect. You are just simply copying the machine instructions of isNot, and that won't include the actual value of "original" at the time "not" is being called.
You should use a C++ class to implement a function object that stores a pointer that you can initialize with the value of "original" and return an instance of this class from "not".

Resources