For example, I would like to do something like this:
#include <gmp.h>
typedef mpz_t Integer;
//
Integer F(Integer a,Integer b,Integer c,Integer d) {
Integer ret = times(plus(a,b),plus(c,d));
}
But, GMP doesn't let me do this, apparently mpz_t is an array, so I get the error:
error: ‘F’ declared as function returning an array
So instead I would have to do something like this:
void F(Integer ret,Integer a,Integer b,Integer c,Integer d) {
Integer tmp1,tmp2;
plus(tmp1,a,b);
plus(tmp2,c,d);
times(ret,tmp1,tmp2);
}
This is unnatural, and not following the logical way that C (or in general mathematical) expressions can be composed. In fact, you can't compose anything in a math-like way because apparently you can't return GMP numbers! If I wanted to write - for example - a simple yacc/bison style parser that converted a simple syntax using +, -, /, * etc. into C code implementing the given expressions using GMP it seems it would be much more difficult as I would have to keep track of all the intermediate values.
So, how can I force GMP to bend to my will here and accept a more reasonable syntax? Can I safely "cheat" and cast mpz_t to a void * and then reconstitute it at the other end back into mpz_t? I'm assuming from reading the documentation that it is not really passing around an array, but merely a reference, so why can't it return a reference as well? Is there some good sound programming basis for doing it this way that I should consider in writing my own program?
From gmp.h:
typedef __mpz_struct mpz_t[1];
This makes a lot of sense, and is pretty natural. Think about it: having an
array of size 1 allows you to deal with an obscured pointer (known as opaque
reference) and all its advantages:
mpz_t number;
DoubleIt(number); /* DoubleIt() operates on `number' (modifies it) as
it will be passed as a pointer to the real data */
Were it not an array, you'd have to do something like:
mpz_t number;
DoubleIt(&number);
And then it comes all the confusion. The intention behind the opaque type is
to hide these, so you don't have to worry about it. And one of the main
concerns should be clear: size (which leads to performance). Of course you
can't return such struct that holds data limited to the available memory. What
about this one (consider mpz_t here as a "first-class" type):
mpz_t number = ...;
number = DoubleIt(number);
You (the program) would have to copy all the data in number and push it as a
parameter to your function. Then it needs to leave appropriate space for
returning another number even bigger.
Conclusion: as you have to deal with data indirectly (with pointers) it's
better to use an opaque type. You'll be passing a reference only to your
functions, but you can operate on them as if the whole concept was
pass-by-reference (C defaults to pass-by-reference).
Related
In C you can declare a variable that points to an array like this:
int int_arr[4] = {1,2,3,4};
int (*ptr_to_arr)[4] = &int_arr;
Although practically it is the same as just declaring a pointer to int:
int *ptr_to_arr2 = int_arr;
But syntactically it is something different.
Now, how would a function look like, that returns such a pointer to an array (of int e.g.) ?
A declaration of int is int foo;.
A declaration of an array of 4 int is int foo[4];.
A declaration of a pointer to an array of 4 int is int (*foo)[4];.
A declaration of a function returning a pointer to an array of 4 int is int (*foo())[4];. The () may be filled in with parameter declarations.
As already mentioned, the correct syntax is int (*foo(void))[4]; And as you can tell, it is very hard to read.
Questionable solutions:
Use the syntax as C would have you write it. This is in my opinion something you should avoid, since it's incredibly hard to read, to the point where it is completely useless. This should simply be outlawed in your coding standard, just like any sensible coding standard enforces function pointers to be used with a typedef.
Oh so we just typedef this just like when using function pointers? One might get tempted to hide all this goo behind a typedef indeed, but that's problematic as well. And this is since both arrays and pointers are fundamental "building blocks" in C, with a specific syntax that the programmer expects to see whenever dealing with them. And the absensce of that syntax suggests an object that can be addressed, "lvalue accessed" and copied like any other variable. Hiding them behind typedef might in the end create even more confusion than the original syntax.
Take this example:
typedef int(*arr)[4];
...
arr a = create(); // calls malloc etc
...
// somewhere later, lets make a hard copy! (or so we thought)
arr b = a;
...
cleanup(a);
...
print(b); // mysterious crash here
So this "hide behind typedef" system heavily relies on us naming types somethingptr to indicate that it is a pointer. Or lets say... LPWORD... and there it is, "Hungarian notation", the heavily criticized type system of the Windows API.
A slightly more sensible work-around is to return the array through one of the parameters. This isn't exactly pretty either, but at least somewhat easier to read since the strange syntax is centralized to one parameter:
void foo (int(**result)[4])
{
...
*result = &arr;
}
That is: a pointer to a pointer-to-array of int[4].
If one is prepared to throw type safety out the window, then of course void* foo (void) solves all of these problems... but creates new ones. Very easy to read, but now the problem is type safety and uncertainty regarding what the function actually returns. Not good either.
So what to do then, if these versions are all problematic? There are a few perfectly sensible approaches.
Good solutions:
Leave allocation to the caller. This is by far the best method, if you have the option. Your function would become void foo (int arr[4]); which is readable and type safe both.
Old school C. Just return a pointer to the first item in the array and pass the size along separately. This may or may not be acceptable from case to case.
Wrap it in a struct. For example this could be a sensible implementation of some generic array type:
typedef struct
{
size_t size;
int arr[];
} array_t;
array_t* alloc (size_t items)
{
array_t* result = malloc(sizeof *result + sizeof(int[items]));
return result;
}
The typedef keyword can make things a lot clearer/simpler in this case:
int int_arr[4] = { 1,2,3,4 };
typedef int(*arrptr)[4]; // Define a pointer to an array of 4 ints ...
arrptr func(void) // ... and use that for the function return type
{
return &int_arr;
}
Note: As pointed out in the comments and in Lundin's excellent answer, using a typedef to hide/bury a pointer is a practice that is frowned-upon by (most of) the professional C programming community – and for very good reasons. There is a good discussion about it here.
However, although, in your case, you aren't defining an actual function pointer (which is an exception to the 'rule' that most programmers will accept), you are defining a complicated (i.e. difficult to read) function return type. The discussion at the end of the linked post delves into the "too complicated" issue, which is what I would use to justify use of a typedef in a case like yours. But, if you should choose this road, then do so with caution.
I have been writing C for a decent amount of time, and obviously am aware that C does not have any support for explicit private and public fields within structs. However, I (believe) I have found a relatively clean method of implementing this without the use of any macros or voodoo, and I am looking to gain more insight into possible issues I may have overlooked.
The folder structure isn't all that important here but I'll list it anyway because it gives clarity as to the import names (and is also what CLion generates for me).
- example-project
- cmake-build-debug
- example-lib-name
- include
- example-lib-name
- example-header-file.h
- src
- example-lib-name
- example-source-file.c
- CMakeLists.txt
- CMakeLists.txt
- main.c
Let's say that example-header-file.h contains:
typedef struct ExampleStruct {
int data;
} ExampleStruct;
ExampleStruct* new_example_struct(int, double);
which just contains a definition for a struct and a function that returns a pointer to an ExampleStruct.
Obviously, now if I import ExampleStruct into another file, such as main.c, I will be able to create and return a pointer to an ExampleStruct by calling
ExampleStruct* new_struct = new_example_struct(<int>, <double>);,
and will be able to access the data property like: new_struct->data.
However, what if I also want private properties in this struct. For example, if I am creating a data structure, I don't want it to be easy to modify the internals of it. I.e. if I've implemented a vector struct with a length property that describes the current number of elements in the vector, I wouldn't want for people to just be able to change that value easily.
So, back to our example struct, let's assume we also want a double field in the struct, that describes some part of internal state that we want to make 'private'.
In our implementation file (example-source-file.c), let's say we have the following code:
#include <stdlib.h>
#include <stdbool.h>
typedef struct ExampleStruct {
int data;
double val;
} ExampleStruct;
ExampleStruct* new_example_struct(int data, double val) {
ExampleStruct* new_example_struct = malloc(sizeof(ExampleStruct));
example_struct->data=data;
example_struct->val=val;
return new_example_struct;
}
double get_val(ExampleStruct* e) {
return e->val;
}
This file simply implements that constructor method for getting a new pointer to an ExampleStruct that was defined in the header file. However, this file also defines its own version of ExampleStruct, that has a new member field not present in the header file's definition: double val, as well as a getter which gets that value. Now, if I import the same header file into main.c, which contains:
#include <stdio.h>
#include "example-lib-name/example-header-file.h"
int main() {
printf("Hello, World!\n");
ExampleStruct* test = new_example(6, 7.2);
printf("%d\n", test->data); // <-- THIS WORKS
double x = get_val(test); // <-- THIS AND THE LINE BELOW ALSO WORK
printf("%f\n", x); //
// printf("%f\n", test->val); <-- WOULD THROW ERROR `val not present on struct!`
return 0;
}
I tested this a couple times with some different fields and have come to the conclusion that modifying this 'private' field, val, or even accessing it without the getter, would be very difficult without using pointer arithmetic dark magic, and that is the whole point.
Some things I see that may be cause for concern:
This may make code less readable in the eyes of some, but my IDE has arrow buttons that take me to and from the definition and the implementation, and even without that, a one line comment would provide more than enough documentation to point someone in the direction of where the file is.
Questions I'd like answers on:
Are there significant performance penalties I may suffer as a result of writing code this way?
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Aside: I am not trying to make C into C++, and generally favor the way C does things, but sometimes I really want some encapsulation of data.
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Yes: your approach produces undefined behavior.
C requires that
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
(C17 6.2.7/2)
and that
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
[...]
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
(C17 6.5/7, a.k.a. the "Strict Aliasing Rule")
Your two definitions of struct ExampleStruct define incompatible types because they specify different numbers of members (see C17 6.2.7/1 for more details on structure type compatibility). You will definitely have problems if you pass instances by value between functions relying on different of these incompatible definitions. You will have trouble if you construct arrays of them, whether dynamically, automatically, or statically, and attempt to use those across boundaries between TUs using one definition and those using another. You may have problems even if you do none of the above, because the compiler may behave unexpectedly, especially when optimizing. DO NOT DO THIS.
Other alternatives:
Opaque pointers. This means you do not provide any definition of struct ExampleStruct in those TUs where you want to hide any of its members. That does not prevent declaring and using pointers to such a structure, but it does prevent accessing any members, declaring new instances, or passing or receiving instances by value. Where member access is needed from TUs that do not have the structure definition, it would need to be mediated by accessor functions.
Just don't access the "private" members. Do not document them in the public documentation, and if you like, explicity mark them (in code comments, for example) as reserved. This approach will be familiar to many C programmers, as it is used a lot for structures declared in POSIX system headers.
As long as the public has a complete definition for ExampleStruct, it can make code like:
ExampleStruct a = *new_example_struct(42, 1.234);
Then the below will certainly fail.
printf("%g\n", get_val(&a));
I recommend instead to create an opaque pointer and provide access public functions to the info in .data and .val.
Think of how we use FILE. FILE *f = fopen(...) and then fread(..., f), fseek(f, ...), ftell(f) and eventually fclose(f). I suggest this model instead. (Even if in some implementations FILE* is not opaque.)
Are there significant performance penalties I may suffer as a result of writing code this way?
Probably:
Heap allocation is expensive, and - today - usually not optimized away even when that is theoretically possible.
Dereferencing a pointer for member access is expensive; although this might get optimized away with link-time-optimization... if you're lucky.
i.e. is there a simpler way to do this
Well, you could use a slack array of the same size as your private fields, and then you wouldn't need to go through pointers all the time:
#define EXAMPLE_STRUCT_PRIVATE_DATA_SIZE sizeof(double)
typedef struct ExampleStruct {
int data;
_Alignas(max_align_t) private_data[EXAMPLE_STRUCT_PRIVATE_DATA_SIZE];
} ExampleStruct;
This is basically a type-erasure of the private data without hiding the fact that it exists. Now, it's true that someone can overwrite the contents of this array, but it's kind of useless to do it intentionally when you "don't know" what the data means. Also, the private data in the "real" definition will need to have the same, maximal, _AlignAs() as well (if you want the private data not to need to use AlignAs(), you will need to use the real alignment quantum for the type-erased version).
The above is C11. You can sort of do about the same thing by typedef'ing max_align_t yourself, then using an array of max_align_t elements for private data, with an appropriate length to cover the actual size of the private data.
An example of the use of such an approach can be found in CUDA's driver API:
Parameters for copying a 3D array: CUDA_MEMCPY3D vs
Parameters for copying a 3D array between two GPU devices: CUDA_MEMCPY3D_peer
The first structure has a pair of reserved void* fields, hiding the fact that it's really the second structure. They could have used an unsigned char array, but it so happens that the private fields are pointer-sized, and void* is also kind of opaque.
This causes undefined behaviour, as detailed in the other answers. The usual way around this is to make a nested struct.
In example.h, one defines the public-facing elements. struct example is not meant to be instantiated; in a sense, it is abstract. Only pointers that are obtained from one of it's (in this case, the) constructor are valid.
struct example { int data; };
struct example *new_example(int, double);
double example_val(struct example *e);
and in example.c, instead of re-defining struct example, one has a nested struct private_example. (Such that they are related by composite aggregation.)
#include <stdlib.h>
#include "example.h"
struct private_example {
struct example public;
double val;
};
struct example *new_example(int data, double val) {
struct private_example *const example = malloc(sizeof *example);
if(!example) return 0;
example->public.data = data;
example->val = val;
return &example->public;
}
/** This is a poor version of `container_of`. */
static struct private_example *example_upcast(struct example *example) {
return (struct private_example *)(void *)
((char *)example - offsetof(struct private_example, public));
}
double example_val(struct example *e) {
return example_upcast(e)->val;
}
Then one can use the object as in main.c. This is used frequently in linux kernel code for container abstraction. Note that offsetof(struct private_example, public) is zero, ergo example_upcast does nothing and a cast is sufficient: ((struct private_example *)e)->val. If one builds structures in a way that always allows casting, one is limited by single inheritance.
I’m building a small multi-platform C library. A design objective is to have a set of header files that are suitably generic, and which define a number of placeholder pointers to structs - if that’s the right term - like this:
/* mylib_types.h */
typedef struct _mylib_matrix *mylib_matrix;
These placeholders can be used to specify the parameters to the function prototypes in the other headers, like:
/* mylib_api.h */
MY_API mylib_status mylibAddMatrix(mylib_matrix a,
mylib_matrix b,
mylib_matrix* result);
So, that’s fine for the headers - everything is self-contained and stand-alone. Then, when it comes to implementing the library I want to use different underlying, platform specific, libraries to actually implement the methods.
The idea being that the library is optimised for any given platform, but the API to the library will be universally defined (so easily cross-compiled).
The problem I have is that: yes - I have got this working - but in the rather crude way using casting. I just wonder what the best practice - if any - actually is?
For example, in my implementation of a method I must then remember to immediately cast the placeholder pointer to the actual type of thing we are using for that platforms implementation, and similarly cast back any results.
e.g.
/* mylib_matrix.c */
#include “mylib_types.h"
#include “mylib_api.h”
#include <PlatformSpecificFunkyMatrix.h>
MY_API mylib_status mylibAddMatrix(mylib_matrix a,
mylib_matrix b,
mylib_matrix* result)
{
*result = (mylib_matrix)PlatformSpecificFunkyMatrix_AddMatrix(
(PlatformSpecificFunkyMatrix*)a,
(PlatformSpecificFunkyMatrix*)b);
return MYLIB_SUCCESS;
}
This all seems very brittle and liable for me to forget a cast or allowing the compiler to do any type checking. Is it at all principled?
I guess I could be explicit in my types of cast - but that still requires some consideration. Perhaps some pre-processor #defines might help wrap things up, but of course that can get rather messy... I could of course go and redefine the low-level structs (e.g. mylib_matrix) for each implementation, but then we are talking a different set of headers for each platform (again, I could go with the preprocessor to help swap the right definitions in or out).
Hmmm. Maybe I’m dwelling too much upon this...
One way to get around the casting.
In the platform specific file, use:
struct _mylib_matrix
{
PlatformSpecificFunkyMatrix* realMatrix;
};
and
MY_API mylib_status mylibAddMatrix(mylib_matrix a,
mylib_matrix b,
mylib_matix* result)
{
PlatformSpecificFunkyMatrix* r =
PlatformSpecificFunkyMatrix_AddMatrix(a->realMatrix, b->realMatrix);
*result = malloc(sizeof(_mylib_matrix));
*result->realMatrix = r
return MYLIB_SUCCESS;
}
Better still...
You can avoid the double indirection and the need for casting by using:
struct _mylib_matrix
{
// Add all the data here that you have in PlatformSpecificFunkyMatrix
};
typedef struct _mylib_matrix PlatformSpecificFunkyMatrix;
and then,
MY_API mylib_status mylibAddMatrix(mylib_matrix a,
mylib_matrix b,
mylib_matix* result)
{
*result = PlatformSpecificFunkyMatrix_AddMatrix(a, b);
return MYLIB_SUCCESS;
}
So, recently I had the unfortunate need to make a C extension for Ruby (because of performance). Since I was having problems with understanding VALUE (and still do), so I looked into the Ruby source and found: typedef unsigned long VALUE; (Link to Source, but you will notice that there are a few other 'ways' it's done, but I think it's essentially a long; correct me if I'm wrong). So, while investigating this further I found an interesting blog post, which says:
"...in some cases the VALUE object could BE the data instead of POINTING TO the data."
What confuses me is that, when I attempt to pass a string to C from Ruby, and use RSTRING_PTR(); on the VALUE (passed to the C-function from Ruby), and try to 'debug' it with strlen(); it returns 4. Always 4.
example code:
VALUE test(VALUE inp) {
unsigned char* c = RSTRING_PTR(inp);
//return rb_str_new2(c); //this returns some random gibberish
return INT2FIX(strlen(c));
}
This example returns always 1 as the string length:
VALUE test(VALUE inp) {
unsigned char* c = (unsigned char*) inp;
//return rb_str_new2(c); // Always "\x03" in Ruby.
return INT2FIX(strlen(c));
}
Sometimes in ruby I see an Exception saying "Can't convert Module to String" (or something along those lines, however I was messing with the code so much trying to figure this out that I am unable to reproduce the error now the error would happen when I tried StringValuePtr(); [I'm a bit unclear what this exactly does. Documentation says it changes the passed paramater to char*] on inp):
VALUE test(VALUE inp) {
StringValuePtr(inp);
return rb_str_new2((char*)inp); //Without the cast, I would get compiler warnings
}
So, the Ruby code in question is: MyMod::test("blahblablah")
EDIT: Fixed a few typos and updated the post a little.
The questions
What exactly does VALUE imp hold? A pointer to the object/value?
The value itself?
If it holds the value itself: when does it do that, and is there a way to check for it?
How do I actually access the value (since I seem to accessing almost everything but
the value)?
P.S: My understanding of C isn't really the best, but it's a work in progress; also, read the comments in the code snippets for some additional description (if it helps).
Thanks!
Ruby Strings vs. C strings
Let's start with strings first. First of all, before trying to retrieve a string in C, it is good habit to call StringValue(obj) on your VALUE first. This ensures that you will really deal with a Ruby string in the end because if it is not already a string, then it will turn it into one by coercing it with a call to that object's to_str method. So this makes things safer and prevents the occasional segfault you might get otherwise.
The next thing to watch out for is that Ruby strings are not \0-terminated as your C code would expect them to make things like strlen etc. work as expected. Ruby's strings carry their length information with them instead - that's why in addition to RSTRING_PTR(str) there is also the RSTRING_LEN(str) macro to determine the actual length.
So what StringValuePtr now does is returning the non-zero-terminated char * to you - this is great for buffers where you have a separate length, but not what you want for e.g. strlen. Use StringValueCStr instead, it will modify the string to be zero-terminated so that it is safe for usage with functions in C that expect it to be zero-terminated. But, try to avoid this wherever possible, because this modification is much less performant than retrieving the non-zero-terminated string that does not have to be modified at all. It's surprising if you keep an eye on this how rarely you will actually need "real" C strings.
self as an implicit VALUE argument
Another reason why your current code doesn't work as expected is that every C function to be called by Ruby gets passed self as an implicit VALUE.
No arguments in Ruby ( e.g. obj.doit ) translates to
VALUE doit(VALUE self)
Fixed amount of arguments (>0, e.g. obj.doit(a, b)) translates to
VALUE doit(VALUE self, VALUE a, VALUE b)
Var args in Ruby ( e.g. obj.doit(a, b=nil)) translates to
VALUE doit(int argc, VALUE *argv, VALUE self)
in Ruby. So what you were working on in your example is not the string passed to you by Ruby but actually the current value of self, that is the object that was the receiver when you called that function. A correct definition for your example would be
static VALUE test(VALUE self, VALUE input)
I made it static to point out another rule that you should follow in your C extensions. Make your C functions only public if you intend to share them among several source files. Since that's almost never the case for function that you attach to a Ruby class, you should declare them as static by default and only make them public if there is a good reason to do so.
What is VALUE and where does it come from?
Now to the harder part. If you dig down deeply into Ruby internals, then you will find the function rb_objnew in gc.c. Here you can see that any newly created Ruby object becomes a VALUEby being cast as one from something called the freelist. It's defined as:
#define freelist objspace->heap.freelist
You can imagine the objspace as a huge map that stores each and every object that is currently alive at a given point in time in your code. This is also where the garbage collector fulfills his duty and the heap struct in particular is the place where new objects are born. The "freelist" of the heap is again declared as being an RVALUE *. This is the C-internal representation of the Ruby built-in types. An RVALUE is actually defined as follows:
typedef struct RVALUE {
union {
struct {
VALUE flags; /* always 0 for freed obj */
struct RVALUE *next;
} free;
struct RBasic basic;
struct RObject object;
struct RClass klass;
struct RFloat flonum;
struct RString string;
struct RArray array;
struct RRegexp regexp;
struct RHash hash;
struct RData data;
struct RTypedData typeddata;
struct RStruct rstruct;
struct RBignum bignum;
struct RFile file;
struct RNode node;
struct RMatch match;
struct RRational rational;
struct RComplex complex;
} as;
#ifdef GC_DEBUG
const char *file;
int line;
#endif
} RVALUE;
That is, basically a union of core data types that Ruby knows about. Missing something? Yes, Fixnums, Symbols, nil and boolean values are not included there. It's because these kinds of objects are directly represented using the unsigned long that a VALUE boils down to in the end. I think the design decision there was (besides being a cool idea) that dereferencing a pointer might be slightly less performant than the bit shifts that are currently needed when transforming the VALUE to what it actually represents. Essentially
obj = (VALUE)freelist;
says give me whatever freelist points to currently and treat is as unsigned long. This is safe because freelist is a pointer to an RVALUE - and a pointer can also be safely interpreted as unsigned long. This implies that every VALUE except those carrying Fixnums, symbols, nil or Booleans are essentially pointers to an RVALUE, the others are directly represented within the VALUE.
Your last question, how can you check for what a VALUE stands for? You can use the TYPE(x) macro to check whether a VALUE's type would be one of the "primitive" ones.
VALUE test(VALUE inp)
The first issue is here: inp is self (so, in your case, the module). If you want to refer to the first argument, you need to add a self argument before that (which makes me to add -Wno-unused-parameters to my cflags, as it is never used in the case of module functions):
VALUE test(VALUE self, VALUE inp)
Your first example uses a module as a string, which certainly won't result into anything good. RSTRING_PTR lacks type checks, which is a good reason not to use it.
A VALUE is a reference to the Ruby object, but not directly a pointer to what it may contain (like a char* in the case of a string). You need to get that pointer using some macros or functions depending on each object. For a string, you want StringValuePtr (or StringValueCStr to ensure that the string is null-terminated) which returns the pointer (it doesn't change the content of your VALUE in any way).
strlen(StringValuePtr(thing));
RSTRING_LEN(thing); /* I assume strlen was just an example ;) */
The actual content of the VALUE is, in MRI and YARV at least, the object_id of the object (or at least, it is after a bitshift).
For your own objects, the VALUE will most likely contain a pointer to a C object which you can get using Data_Get_Struct:
my_type *thing = NULL;
Data_Get_Struct(rb_thing, my_type, thing);
I came across the following function signature and I wondered if this (the ellipsis, or "...") is some kind of polymorphism?
#include <fcntl.h>
int fcntl(int fd, int cmd, ... );
Thanks in advance.
It's a variable argument list.
That is a variadic function. See stdarg.h for more details.
The ... means that you can pass any number of arguments to this function, as other commenters have already mentioned. Since the optional arguments are not typed, the compiler cannot check the types and you can technically pass in any argument of any type.
So does this mean you can use this to implement some kind of polymorphic function? (I.e., a function that performs some operation based on the type of its arguments.)
No.
The reason you cannot do this, is because you cannot at runtime inspect the types of the arguments passed in. The function reading in the variable argument list is expected to already know the types of the optional arguments it is going to receive.
In case of a function that really is supposed to be able to take any number of arguments of any type (i.e., printf), the types of the arguments are passed in via the format string. This means that the caller has to specify the types it is going to pass in at every invocation, removing the benefit of polymorphic functions (that the caller doesn't have to know the types either).
Compare:
// Ideal invocation
x = multiply(number_a, number_b)
y = multiply(matrix_a, matrix_b)
// Standard C invocation
x = multiply_number(number_a, number_b)
y = multiply_matrix(matrix_a, matrix_b)
// Simulated "polymorphism" with varargs
x = multiply(T_NUMBER, number_a, number_b)
y = multiply(T_MATRIX, matrix_a, matrix_b)
You have to specify the type before the varargs function can do the right thing, so this gains you nothing.
No, that's the "ellipsis" you're seeing there, assuming you're referring to the ... part of the declaration.
Basically it says that this function takes an unknown number of arguments after the first two that are specified there.
The function has to be written in such a way that it knows what to expect, otherwise strange results will ensue.
For other functions that support this, look at the printf function and its variants.
Does C support polymorphism?
No, it doesn't.
However there are several libraries, such as Python C API, that implements a rough variant of polymorphism using structs and pointers. Beware that compiler cannot perform appropriate type checking in most cases.
The tecnhique is simple:
typedef struct {
char * (*to_string)();
} Type;
#define OBJ_HEADER Type *ob_type
typedef struct {
OBJ_HEADER;
} Object;
typedef struct {
OBJ_HEADER;
long ival;
} Integer;
typedef struct {
OBJ_HEADER;
char *name;
char *surname;
} Person;
Integer and Person get a Type object with appropriate function pointers (e.g. to functions like integer_to_string and person_to_string).
Now just declare a function accepting an Object *:
void print(Object *obj) {
printf("%s", obj->type->to_string());
}
now you can call this function with both an Integer and a Person:
Integer *i = make_int(10);
print((Object *) i);
Person *p = make_person("dfa");
print((Object *) p);
EDIT
alternatively you can declare i and p as Object *; of course make_int and make_person will allocate space for Integer and Person and do the appropriate cast:
Object *
make_integer(long i) {
Integer *ob = malloc(sizeof(Integer));
ob->ob_type = &integer_type;
ob->ival = i;
return (Object *) ob;
}
NB: I cannot compile these examples rigth now, please doublecheck them.
I came across the following function signature and I wondered if this (the ellipsis, or "...") is some kind of polymorphism?
yes, it is a primitive form of polymorphism. With only one function signature you are able to pass various structures. However the compiler cannot help you with detecting type errors.
Adding to what's been said: C supports polymorphism through other means. For example, take the standard library qsort function which sorts data of arbitrary type.
It is able to do so by means of untyped (void) pointers to the data. It also needs to know the size of the data to sort (provided via sizeof) and the logic that compares the objects' order. This is accomplished by passing a function pointer to the qsort function.
This is a prime example of runtime polymorphism.
There are other ways to implement object-oriented behaviour (in particular, virtual function calls) by managing the virtual function tables manually. This can be done by storing function pointers in structures and passing them around. Many APIs do so, e.g. the WinAPI, which even uses advanced aspects of object orientation, e.g. base class call dispatch (DefWindowProc, to simulate calling the virtual method of the base class).
I assume you are referring to the ellipsis (...)? If so this indicates that 0 or more parameters will follow. It is called varargs, defined in stdarg.h
http://msdn.microsoft.com/en-us/library/kb57fad8.aspx
printf uses this functionality. Without it you wouldn't be able to keep adding parameters to the end of the function.
C supports a crude form of Polymorphism. I.e. a type being able to appear and behave as another type. It works in a similar was as in C++ under the hood (relying on memory being aligned) but you have to help the compiler out by casting. E.g. you can define a struct:
typedef struct {
char forename[20];
char surname[20];
} Person;
And then another struct:
typedef struct {
char forename[20];
char surname[20];
float salary;
char managername[20];
} Employee;
Then
int main (int argc, int *argv)
{
Employee Ben;
setpersonname((Person *) &Ben);
}
void setpersonname(Person *person)
{
strcpy(person->forename,"Ben");
}
The above example shows Employee being used as a Person.
No, it is a function that is taking variable number of arguments.
That is not technically polymorphism. fcntl takes variable number of arguments & that is the reason for the ... similar to printf function.
C neither supports function overloading - which is a type of ad-hoc polymorphism based on compile-time types - nor multiple dispatch (ie overloading based on runtime types).
To simulate function overloading in C, you have to create multiple differently named functions. The functions' names often contain the type information, eg fputc() for characters and fputs() for strings.
Multiple dispatch can be implemented by using variadic macros. Again, it's the programmer's job to provide the type information, but this time via an extra argument, which will be evaluated at runtime - in contrast to the compile-time function name in case of the approach given above. The printf() family of functions might not be the best example for multiple dispatch, but I can't think of a better one right now.
Other approaches to multiple dispatch using pointers instead of variadic functions or wrapping values in structures to provide type annotations exist.
The printf declaration in the standard library is
int printf(const char*, ...);
Think about that.
You can write code that supports Polymorphic behavior in C, but the ... (ellipsis) is not going to be much help. That is for variable arguments to a function.
If you want polymorphic behavior you can use, unions and structures to construct a data structure that has a "type" section and variable fields depending on type. You can also include tables of function pointers in the structures. Poof! You've invented C++.
Yes C Do support the polymorphism
the Code which we write in the C++ using virtual to implement the polymorphism
if first converted to a C code by Compiler (one can find details here).
It's well known that virtual functionality in C++ is implemented using function pointers.