My question is, if it is correct to define both:
typedef void* Elem;
typedef const void* const constElem;
If I know that I would work with const and non const generic elements, for example for the copyElem function as a parameter, I would prefer to get it as const, for the const correctness, is this practical?
I don't see how there is a matter of correctness here, except inasmuch as yes, both typedefs you present are syntactically correct, and it is possible that a program would have use for both types.
As a matter of style, however, it is poor form to hide pointer nature behind a typedef. It very easily becomes confusing.
It is also a bit questionable to typedef anything as void * -- it provides no type safety of any significance, and not much semantic assistance either. If you really want to accept a pointer to anything, then just say void *. If, on the other hand, you want to hide the details of a struct type, then just declare the structure type as an incomplete type, and leave it that way:
struct Elem;
Optionally, typedef that for convenience:
typedef struct Elem Elem;
Personally, and again as a matter of style, I would prefer to see the const keyword than to see a typedef that rolls in the const, no matter how clearly designated.
Note also that all questions of style aside, this particular typedef ...
typedef const void* const constElem;
... seems of rather narrow usefulness. It designates a pointer whose value cannot be changed and that points to an object that also cannot be changed. A non-const pointer to a const object is much more often what you want. Note, too, that the uncertainty over what constElem should mean is one consequence of rolling up pointer nature inside the Elem typedef.
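To make the suggestion above concrete, here is a minimal sketch of an incomplete struct type with const spelled out at the point of use. The function names (elem_create, elem_copy, elem_value, elem_destroy) and the single int member are assumptions for illustration only; they are not from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view: callers see only an incomplete type. */
struct Elem;
typedef struct Elem Elem;

/* Implementation detail, normally hidden in a .c file (hypothetical layout). */
struct Elem {
    int value;
};

/* Returns a new element, or NULL if allocation fails. */
Elem *elem_create(int value)
{
    Elem *e = malloc(sizeof *e);
    if (e != NULL) {
        e->value = value;
    }
    return e;
}

/* src is a non-const pointer to a const Elem: readable, not writable. */
Elem *elem_copy(const Elem *src)
{
    assert(src != NULL);
    return elem_create(src->value);
}

int elem_value(const Elem *e)
{
    assert(e != NULL);
    return e->value;
}

void elem_destroy(Elem *e)
{
    free(e);
}
```

Because the const is visible in the prototype of elem_copy(), a reader can tell at a glance that the source element is only read, without having to look up a constElem typedef.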
const void *const doesn't buy you much, unless it is used as the type of a function parameter:
void foo(const void *const e)
{
e = NULL; // compile-time error because of the second const
char *s = e; // compile-time error because of the first const
const char *t = e; // OK
}
...
char x[] = "test";
foo(x); // you can be confident that foo will not modify the contents
For regular usage it is impractical because the first const says that the dereferenced value cannot be modified, but void pointers cannot be dereferenced anyway. The second const essentially doesn't allow you to reassign the pointer once it is initialised.
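As a rough illustration of the parameter case, the sketch below takes a const void *const and converts it to a const-qualified byte pointer before reading. The function count_byte() is a made-up example, not something from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many bytes in buf equal value. Both consts are in play:
   buf cannot be reseated, and the bytes it points to cannot be written. */
size_t count_byte(const void *const buf, size_t len, unsigned char value)
{
    /* A void pointer cannot be dereferenced, so convert it first;
       the target type must keep the const. */
    const unsigned char *bytes = buf;
    size_t hits = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] == value) {
            hits++;
        }
    }
    return hits;
}
```

A caller can pass a plain char array; the conversion to const void * is implicit and the array contents are left untouched.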
I'm trying to understand how 'const' works in C.
What I would like to create is a polygon struct whose members cannot be mutated.
I started by creating the following structs
struct vector2{
float x;
float y;
};
struct polygon2{
const size_t count;
struct vector2* const points;
};
To create a polygon I created the following function:
struct polygon2* polygon2_create(size_t count)
{
struct vector2* points = calloc(count, sizeof *points);
struct polygon2 temp = {.count = count,
.points = points};
struct polygon2* actual = malloc(sizeof *actual);
memcpy(actual, &temp, sizeof(*actual));
return actual;
}
I believe this function doesn't cause undefined behavior.
This way I can do things like
struct polygon2* poly = polygon2_create(30);
poly->points[3] = (struct vector2){7.1, 5.3};
But I can't do
poly->points = NULL;
Nor
poly->count = 3;
Which is great. I'm sure that I won't accidentally change the contents of struct polygon.
But I'd like to make vector2's members const too.
If I change vector2 to:
struct vector2{
const float x;
const float y;
};
I no longer am able to do this:
poly->points[3] = (struct vector2){7.1, 5.3};
I'd like to know why. I expected that making vector2's members const I wouldn't be able to do this
poly->points[3].x = 3
But I'd still be able to do this
poly->points = otherpoint;
Can someone explain what am I missing? And how can I achieve the following:
create an "immutable" vector2 struct
create a polygon struct whose points or count member can't be changed, but the things pointed to by points can be 'swapped'.
const qualification of a type means that lvalue expressions of that type are not modifiable. In particular, lvalues of const type or of composite type with at least one const member, recursively, cannot be the left-hand side of an assignment operator, and a pointer to a const-qualified type cannot be passed to free() without a cast.
Moreover, qualified, including const-qualified, types are different types from their unqualified counterparts and from differently-qualified versions of the underlying unqualified type. This has implications on compatibility of composite types that have qualified members.
On the other hand, do not mistake const to be a promise that the value is actually constant. It can be the case that the same object can be designated by multiple lvalues, some const and others non-const. In that case, the object can be modified via any of the non-const lvalues that designate it, and those modifications will be visible even via the const lvalues that designate it.
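A small sketch of that last point, using a made-up function name: the same int is designated by a modifiable lvalue and by a const-qualified one, and a write through the former is visible through the latter.

```c
#include <assert.h>

/* Returns what a const-qualified view sees after the object is changed
   through a non-const lvalue that designates the same object. */
int observe_through_const_view(void)
{
    int value = 1;                 /* the object itself is not const */
    const int *view = &value;      /* read-only view of the same object */

    assert(*view == 1);
    value = 2;                     /* allowed: 'value' is a modifiable lvalue */
    return *view;                  /* the change is visible through the view */
}
```

The const on view only restricts what can be done through view; it says nothing about the object it points to.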
With respect to the specifics of your question:
I agree that your function polygon2_create() is valid and has well-defined behavior. In particular, const members of a struct can be initialized in an initializer, and functions such as memcpy() can modify memory in which an object that can be referenced via a const lvalue is stored. Your compiler might warn about the memcpy(), though.
More generally, the initialization and assignment behavior and constraints you describe are correct.
As for poly->points[3] = (struct vector2){7.1, 5.3};, how would it make sense for that to be acceptable if the members of struct vector2 were const? If allowed, the assignment certainly would modify them, and preventing that is exactly the point of const. Or if you prefer a citation to authority, C2011 6.3.2.1/1 specifies that if a structure type has any const members, then lvalue expressions of that type are not modifiable.
It sounds like you are confused about the semantics of whole-struct assignment. If you assign one struct to a different struct, you are not replacing one struct with the other; rather you are copying the value of one struct to the other. This is exactly analogous to assignments to simple types, such as int.
You asked,
how can I achieve the following:
create an "immutable" vector2 struct
You already know how to do this, to the extent that it is possible. If you make all the members const then they cannot be modified via an expression of type struct vector2. As I remarked before, however, this does not confer absolute immutability. C has no such thing.
create a polygon struct whose points or count member can't be changed, but the things pointed to by points can be 'swapped'.
I'm not sure I understand how an ability to swap points is consistent with your desired level of unmodifiability. Certainly, if struct vector2 has const members then you cannot assign to lvalue expressions of that type. You could still perform swapping via memcpy(), though, or by casting to a modifiable type. These mechanisms do, however, violate at least the spirit of const-ness. Your compiler will likely warn about them.
You could consider changing points from a struct vector2 * const to a struct vector2 ** const. You can then swap the (non-const) struct vector2 * objects accessible via *points:
struct vector2 *temp = poly->points[3];
poly->points[3] = poly->points[2];
poly->points[2] = temp;
Your focus on immutability makes me wonder whether you come from a Java background; either way, that's more Java-esque, since all non-primitives in Java are references, which are more or less pointers.
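For what it is worth, a fuller sketch of the pointer-array layout might look like the following. The helper names vector2_create(), polygon2_swap() and polygon2_destroy() are assumptions, and the memcpy() initialisation is the same trick used in the question's polygon2_create().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct vector2 {
    const float x;
    const float y;
};

struct polygon2 {
    const size_t count;
    struct vector2 **const points;   /* fixed array of swappable pointers */
};

/* Allocates one immutable point; returns NULL on allocation failure. */
struct vector2 *vector2_create(float x, float y)
{
    struct vector2 temp = {.x = x, .y = y};
    struct vector2 *v = malloc(sizeof *v);
    if (v != NULL) {
        memcpy(v, &temp, sizeof *v);
    }
    return v;
}

/* Allocates a polygon whose slots all start out as NULL. */
struct polygon2 *polygon2_create(size_t count)
{
    struct vector2 **points = calloc(count, sizeof *points);
    struct polygon2 *poly = malloc(sizeof *poly);
    if (points == NULL || poly == NULL) {
        free(points);
        free(poly);
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        points[i] = NULL;
    }
    struct polygon2 temp = {.count = count, .points = points};
    memcpy(poly, &temp, sizeof *poly);
    return poly;
}

/* Exchanges two slots without touching any const member. */
void polygon2_swap(struct polygon2 *poly, size_t i, size_t j)
{
    assert(poly != NULL && i < poly->count && j < poly->count);
    struct vector2 *temp = poly->points[i];
    poly->points[i] = poly->points[j];
    poly->points[j] = temp;
}

/* Frees the points, the pointer array, and the polygon itself. */
void polygon2_destroy(struct polygon2 *poly)
{
    if (poly == NULL) {
        return;
    }
    for (size_t i = 0; i < poly->count; i++) {
        free(poly->points[i]);
    }
    free(poly->points);
    free(poly);
}
```

The price of this layout is one extra allocation and one extra indirection per point, which is part of why const members tend to complicate memory management.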
Overall, however, I think you are overly focused on immutability. const-ness will cause you trouble, especially with memory management. Consider doing without, at least for your struct members themselves. At most, use const qualification on function parameter types to express that the function will not modify the actual argument, and perhaps on global variables where you want at least to be warned about any possibility that they will be modified.
Sometimes it is useful to cast function callbacks.
For example, we may have a function to duplicate some data:
struct MyStruct *my_dupe_fn(const struct MyStruct *s)
But pass it as a generic callback:
typedef void *(*MyGenericCopyCallback)(void *key);
Eg: ensure_key_in_set(my_set, my_key, (MyGenericCopyCallback)my_dupe_fn);
Since the difference between const struct MyStruct * and void * is not going to cause problems in this case, it won't cause any bugs (at least in the function call itself).
However, if an argument is later added to my_dupe_fn, this could cause a bug which wouldn't give a compiler warning.
Is there a way to cast a function, but still show warnings if the arguments or return values are different sizes?
Obligatory disclaimer: of course C isn't *safe*, but ways to prevent potential bugs in a widely used language are still useful.
You say "won't cause any bugs", however it causes undefined behaviour to call a function through a function pointer with incompatible return types or parameter types, even in your example code.
If you want to rely on undefined behaviour then that's your risk to take. Relying on UB has a tendency to cause bugs sooner or later. A better idea would be to re-design the callback interface to not rely on undefined behaviour. For example, only use functions of the correct type as the callback function.
In your example this might be:
typedef void *MyCallback(void *key); // style: avoid pointer typedefs
struct MyStruct *my_dupe_fn(const struct MyStruct *s)
{ ... }
void *my_dupe_fn_callback(void *s)
{
return my_dupe_fn(s);
}
void generic_algorithm(MyCallback *callback)
{
// ....
ensure_key_in_set(my_set, my_key, callback);
// ....
}
// elsewhere
generic_algorithm(my_dupe_fn_callback);
Note the lack of casts. Managing a style policy of not using any function casts is simpler than a policy of allowing certain types.
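A self-contained version of the same idea, for anyone who wants to compile it: struct MyStruct is given a single made-up member, and copy_key() is a hypothetical stand-in for ensure_key_in_set() that simply invokes the callback.

```c
#include <assert.h>
#include <stdlib.h>

struct MyStruct {
    int id;   /* made-up member for illustration */
};

typedef void *MyCallback(void *key);   /* function type, as above */

/* Strongly typed duplicate function; returns NULL on allocation failure. */
struct MyStruct *my_dupe_fn(const struct MyStruct *s)
{
    struct MyStruct *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = *s;
    }
    return copy;
}

/* Adapter with exactly the callback's signature; no function cast needed. */
void *my_dupe_fn_callback(void *s)
{
    return my_dupe_fn(s);
}

/* Hypothetical stand-in for ensure_key_in_set(): just invokes the callback. */
void *copy_key(void *key, MyCallback *callback)
{
    assert(key != NULL && callback != NULL);
    return callback(key);
}
```

If my_dupe_fn() later gains a parameter, the call inside my_dupe_fn_callback() stops compiling, which is the diagnostic the function cast would have suppressed.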
If you are using gcc and are not afraid of using helpful extensions, you might have a look at plan9-extensions. In combination with anonymous struct fields (standard since C11) as the first field, they allow you to build a type hierarchy with static functions, etc. This avoids tons of casts in my code and makes it much more readable.
Not sure, but according to the gcc documentation, the MS compiler supports some (all?) of these features, too. No warranty for that, however.
That later error is coming from two pieces of code that say the same thing getting out of sync -- the first where you define the type of my_dupe_fn, and the second where you cast the generic callback pointer back to its original type.
This is where DRY (do not repeat yourself) comes in. The whole point is to only say something once, so that you can't later come back and change only one instance.
In this case, you'd want to typedef the type of a pointer to my_dupe_fn, preferably very close to where you declare the function itself, to help ensure that the typedef always changes along with the function signature itself.
The compiler is never going to catch this for you as long as it thinks that it is just dealing with a generic void pointer.
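One possible way to apply this, sketched under a few assumptions: the signature is written once as a function-type typedef (MyDupeFn, a made-up name), the function is declared through that typedef, and the stored pointer uses a generic function pointer type (GenericFn, also made up) rather than void *. Converting a function pointer to another function pointer type and back before calling is well defined in C.

```c
#include <assert.h>
#include <stdlib.h>

struct MyStruct {
    int id;   /* made-up member for illustration */
};

/* Single source of truth for the signature. */
typedef struct MyStruct *MyDupeFn(const struct MyStruct *s);

/* Declared through the typedef: if the definition below drifts,
   the compiler reports conflicting types. */
MyDupeFn my_dupe_fn;

/* Generic function pointer used for storage only; never called directly. */
typedef void (*GenericFn)(void);

struct MyStruct *my_dupe_fn(const struct MyStruct *s)
{
    struct MyStruct *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = *s;
    }
    return copy;
}

/* Converts back to the original type before calling. */
struct MyStruct *call_stored_dupe(GenericFn stored, const struct MyStruct *s)
{
    assert(stored != NULL && s != NULL);
    MyDupeFn *typed = (MyDupeFn *)stored;
    return typed(s);
}
```

This does not make the cast itself checked; it only guarantees that the declaration, the definition and the cast-back site all name the same type, so they cannot silently diverge.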
Unfortunately you typically have to forgo some of this compile-time safety if you're using C. You might get a warning at best, but if you have a design that is uniformly casting function pointers this way, you're likely to ignore or outright disable them. Instead you want to place your emphasis on achieving safe coding standards. What you can't guarantee by force, you can encourage strongly by policy.
I would suggest, if you can afford it, to start by casting arguments and return values rather than whole function pointers. A flexible representation is like so:
typedef void* GenericFunction(int argc, void** args);
This emulates the ability to have variadic callbacks, and you can uniformly do runtime safety checks in debug builds, e.g., to make sure that the number of arguments matches the assumptions:
void* MyCallback(int argc, void** args)
{
assert(argc == 2);
...
return 0;
}
If you need more safety than this for the individual arguments being passed and can afford a typically-small cost of an extra pointer per argument with a slightly bulky structured solution, you can do something like this:
struct Variant
{
void* ptr;
const char* type_name;
};
struct Variant to_variant(void* ptr, const char* type_name)
{
struct Variant new_var;
new_var.ptr = ptr;
new_var.type_name = type_name;
return new_var;
}
void* from_variant(struct Variant* var, const char* type_name)
{
assert(strcmp(var->type_name, type_name) == 0 && "Type mismatch!");
return var->ptr;
}
void* pop_variant(struct Variant** args, const char* type_name)
{
struct Variant* var = *args;
assert(var->ptr && "Trying to pop off the end of the argument stack!");
assert(strcmp(var->type_name, type_name) == 0 && "Type mismatch!");
++*args;
return var->ptr;
}
With macros like so:
#define TO_VARIANT(val, type) to_variant(&(val), #type)
#define FROM_VARIANT(var, type) (*(type*)from_variant(&(var), #type))
#define POP_VARIANT(args, type) (*(type*)pop_variant(&(args), #type))
typedef struct Variant* GenericFunction(struct Variant* args);
Example callback:
struct Variant* MyCallback(struct Variant* args)
{
// `args` is null-terminated.
int arg1 = POP_VARIANT(args, int);
float arg2 = POP_VARIANT(args, float);
...
return 0;
}
A side benefit is what you can see in your debugger when you trace into MyCallback through those type_name fields.
This kind of thing can be useful if your codebase supports callbacks into dynamically-typed scripting languages, since scripting languages should not be doing type casts in their code (typically they're meant to be a bit on the safer side). The type names can then be used to automatically convert the arguments into the scripting language's native types dynamically using those type_name fields.
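Putting the pieces together, a compact end-to-end sketch could look like this. The callback scale_callback() and the caller call_scale() are invented for the example, and the callback returns a float directly rather than a struct Variant * to keep it short; the macros are written without trailing semicolons so they can be used inside expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct Variant {
    void *ptr;
    const char *type_name;
};

struct Variant to_variant(void *ptr, const char *type_name)
{
    struct Variant new_var;
    new_var.ptr = ptr;
    new_var.type_name = type_name;
    return new_var;
}

/* Returns the current argument's pointer and advances the cursor. */
void *pop_variant(struct Variant **args, const char *type_name)
{
    struct Variant *var = *args;
    assert(var->ptr && "Trying to pop off the end of the argument stack!");
    assert(strcmp(var->type_name, type_name) == 0 && "Type mismatch!");
    ++*args;
    return var->ptr;
}

#define TO_VARIANT(val, type) to_variant(&(val), #type)
#define POP_VARIANT(args, type) (*(type *)pop_variant(&(args), #type))

/* Hypothetical callback: reads an int and a float, returns their product. */
float scale_callback(struct Variant *args)
{
    int count = POP_VARIANT(args, int);
    float factor = POP_VARIANT(args, float);
    return (float)count * factor;
}

/* Builds the null-terminated argument list and invokes the callback. */
float call_scale(int count, float factor)
{
    struct Variant args[] = {
        TO_VARIANT(count, int),
        TO_VARIANT(factor, float),
        { NULL, NULL }   /* terminator: ptr == NULL marks the end */
    };
    return scale_callback(args);
}
```

Swapping the int and float pops in the callback trips the "Type mismatch!" assertion in a debug build instead of silently reinterpreting the bytes.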
Consider the following C code:
typedef char * MYCHAR;
MYCHAR x;
My understanding is that the result would be that x is a pointer to char. However, if the declaration of x were to occur far away from the typedef, a human reader of the code would not immediately know that x is a pointer. Alternatively, one could use
typedef char MYCHAR;
MYCHAR *x;
Which is considered to be better form? Is this more than a matter of style?
If the pointer is never meant to be dereferenced or otherwise manipulated directly -- IOW, you only pass it as an argument to an API -- then it's okay to hide the pointer behind a typedef.
Otherwise, it's better to make the "pointerness" of the type explicit.
I would use pointer typedefs only in situations when the pointer nature of the resultant type is of no significance. For example, pointer typedef is justified when one wants to declare an opaque "handle" type which just happens to be implemented as a pointer, but is not supposed to be usable as a pointer by the user.
typedef struct HashTableImpl *HashTable;
/* 'struct HashTableImpl' is (or is supposed to be) an opaque type */
In the above example, HashTable is a "handle" for a hash table. The user will receive that handle initially from, say, CreateHashTable function and pass it to, say, HashInsert function and such. The user is not supposed to care (or even know) that HashTable is a pointer.
But in cases when the user is supposed to understand that the type is actually a pointer and is usable as a pointer, pointer typedefs are significantly obfuscating the code. I would avoid them. Declaring pointers explicitly makes code more readable.
It is interesting to note that the C standard library avoids such pointer typedefs. For example, FILE is obviously intended to be used as an opaque type, which means that the library could have defined it as typedef <some pointer type> FILE instead of making us use FILE * all the time. But for some reason they decided not to.
I don't particularly like typedef to a pointer, but there is one advantage to it. It removes confusion and common mistakes when you declare more than one pointer variable in a single declaration.
typedef char *PSTR;
...
PSTR str1, str2, str3;
is arguably clearer than:
char *str1, str2, str3; // oops
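A quick way to see the difference is to compare the size of the second variable in each style. The two function names below are made up for the demonstration.

```c
#include <assert.h>
#include <stddef.h>

typedef char *PSTR;

/* With the typedef, every declarator in the list is a pointer. */
size_t size_of_second_with_typedef(void)
{
    PSTR str1 = NULL, str2 = NULL;
    (void)str1;
    return sizeof str2;            /* sizeof(char *) */
}

/* Without it, the * binds to str1 only; str2 is a plain char. */
size_t size_of_second_without_typedef(void)
{
    char *str1 = NULL, str2 = '\0';
    (void)str1;
    return sizeof str2;            /* sizeof(char), which is always 1 */
}
```

Declaring one variable per line avoids the problem just as well, without hiding the pointer.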
I prefer leaving the *, it shows there's a pointer. And your second example should be shortened as char* x;, it makes no sense.
I also think this is a matter of style/convention. In Apple's Core Graphics library they frequently "hide" the pointer and use a convention of appending "Ref" to the end of the type. So for example, CGImage * corresponds to CGImageRef. That way you still know it's a pointer reference.
Another way to look at it is from the perspective of types. A type defines the operations that are possible on that type, and the syntax to invoke these operations. From this perspective, MYCHAR is whatever it is. It is the programmer's responsibility to know the operations allowed on it. If it is declared like the first example, then it supports the * operator. You can always name the identifier appropriately to clarify its use.
Other cases where it is useful to declare a type that is a pointer is when the nature of the parameter is opaque to the user (programmer). There may be APIs that want to return a pointer to the user, and expect the user to pass it back to the API at some other point, like an opaque handle or a cookie, to be used by the API only internally. The user does not care about the nature of the parameter. It would make sense not to muddy the waters or expose its exact nature by exposing the * in the API.
If you look at several existing APIs, it looks as if not putting the pointerness into the type seems better style:
the already mentioned FILE *
the MYSQL * returned by MySQL's mysql_real_connect()
the MYSQL_RES * returned by MySQL's mysql_store_result() and mysql_use_result()
and probably many others.
For an API it is not necessary to hide structure definitions and pointers behind "abstract" typedefs.
/* This is part of the (hypothetical) WDBC- API
** It could be found in wdbc_api.h
** The struct connection and struct statement are both incomplete types,
** but we are allowed to use pointers to incomplete types, as long as we don't
** dereference them.
*/
struct connection *wdbc_connect (char *connection_string);
int wdbc_disconnect (struct connection *con);
struct statement *wdbc_prepare (struct connection * con, char *statement);
int main(void)
{
struct connection *conn;
struct statement *stmt;
int rc;
conn = wdbc_connect( "host='localhost' database='pisbak' username='wild' password='plasser'" );
stmt = wdbc_prepare (conn, "Select id FROM users where name='wild'" );
rc = wdbc_disconnect (conn);
return 0;
}
The above fragment compiles fine. (but it fails to link, obviously)
Is this more than a matter of style?
Yes. For instance, this:
typedef int *ip;
const ip p;
is not the same as:
const int *p; // p is non-const pointer to const int
It is the same as:
int * const p; // p is constant pointer to non-const int
See also the related question "typedef pointer const weirdness".
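If you want the compiler to confirm this, C11's _Generic can report which type a declaration actually produced. The macro CONST_KIND and the variable names below are made up for the check; it inspects the address of each object so that the qualifiers survive.

```c
#include <assert.h>

typedef int *ip;

/* Classifies the declared type of an object by inspecting its address:
   1 = const pointer to int, 2 = pointer to const int, 0 = anything else. */
#define CONST_KIND(obj) _Generic(&(obj), \
    int *const *: 1,                     \
    const int **: 2,                     \
    default: 0)

static int target = 5;

const ip via_typedef = &target;     /* const applies to the pointer */
int *const spelled_out = &target;   /* same type as via_typedef */
const int *to_const = &target;      /* different: the int is const */

int kind_via_typedef(void) { return CONST_KIND(via_typedef); }
int kind_spelled_out(void) { return CONST_KIND(spelled_out); }
int kind_to_const(void)    { return CONST_KIND(to_const); }
```

The first two functions select the same branch, while the third selects a different one, which matches the claim that const ip is int * const and not const int *.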
This thing was partly touched upon in another question on SO, but somewhat casually as it was not the main question. As my confusion still persists, I am putting it in a separate question.
Why are the following two statements equivalent to int* const p=&num and not const int* p=&num when the latter seems more logical and intuitive? What are the rigorous reasons for this behavior of typedef?
typedef int* PTR;
const PTR p=&num;
And finally, in that question one member remarks that it is bad practice to use typedefed pointers. But I have seen it being widely used in many books and websites and it seems a convenient thing to do. It makes the code more understandable. So what is the final word on it? Should one avoid typedefed pointers as much as possible?
Edit: And what will be the correct syntax for that typedef statement if we intend the following:
const int* const p=&num;
Edit: I inadvertently forgot to ask an important thing. What is the correct syntax using that typedef statement for the following then?
const int* p=&num; //instead of the int* const p=&num that we got
Generally,
const TYPE var = ini;
declares a const variable var of type TYPE. So
const PTR p=&num;
declares a const variable p of type PTR, initialised with the address of num.
A typedef is not a textual alias, so you can't just replace the typedefed name with its expansion to see what it results in.
If you want to get
const int* const p=&num;
with a typedef, you must typedef something including const int, e.g.
typedef const int *CI_ptr;
and then you can write
const CI_ptr p = &num;
(but don't, it's ugly).
And for
const int *p = &num;
you can then write
CI_ptr p = &num;
And finally, in that question one member remarks that it is bad practice to use typedefed pointers. But I have seen it being widely used in many books and websites and it seems a convenient thing to do. It makes the code more understandable.
Whether it makes the code more understandable or less depends. One thing that in my experience is always a bad thing is to typedef pointer types to names that hide the fact that you are dealing with pointers.
typedef struct list_node {
int value;
struct list_node *next;
} *node;
for example is one unfortunately common abuse. When you read the type node, you don't suspect it's a pointer. At least typedef it to node_ptr. But then, why typedef the pointer at all, typedefing the structure and using node* is shorter and clearer.
So what is the final word on it? Should one avoid typedefed pointers as much as possible?
There's no ultimate authority on it, so it's mostly your decision. Follow the coding style in the company/project if there is one, use your judgment if you're coding on your own.
This is one of the issues of using typedef for object pointer types.
A qualifier (const, volatile) never penetrates a typedef.
That's one of the reasons some coding standards forbid the use of typedef for object pointer types.
In the following:
struct adt { void * A; };
int new_adt(const void * const A)
{
struct adt * r = malloc(sizeof(struct adt));
r->A = A;
}
I get:
warning: assignment discards qualifiers from pointer target type
I know I can use
memcpy(&(r->A), &A, sizeof(void *));
to workaround it, but I must ask: Is there any alternative?
By using const void * const I pretend to say that no changes will be made to the input. Also, now that I think of it, const void * would suffice, wouldn't it? (Since I can't change the pointer so that it affects the caller)
Thanks for taking the time to read.
By using const void * const I pretend to say that no changes will be made to the input.
You have not pretended to say that no changes will be made to the input, you have explicitly told the compiler that you have guaranteed that no changes will be made to the input.
You MUST NOT do what you are doing.
If you never want to modify A through its pointer from adt, then you should make that pointer const as well, i.e.:
struct adt {
const void * A;
};
That will make the error go away.
If you do want to modify A through adt, then new_adt should take a non-const pointer.
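A minimal sketch of the first option, with the member declared const void *. The question's new_adt() returns int and never returns a value, so this version is changed to return the new object (or NULL on allocation failure); that change is my assumption about the intent.

```c
#include <assert.h>
#include <stdlib.h>

struct adt {
    const void *A;   /* read-only view of the caller's data */
};

/* Stores the pointer without discarding const; NULL on allocation failure. */
struct adt *new_adt(const void *A)
{
    struct adt *r = malloc(sizeof *r);
    if (r != NULL) {
        r->A = A;    /* const void * to const void *: no warning */
    }
    return r;
}
```

The second const on the parameter is dropped here because it only affects the function body, not the caller, and the caller is responsible for calling free() on the result.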
Edit: A more general note, which may help:
The const keyword, in general, applies to the type immediately to its left. However, if there is no type to its left, it will apply to the type to the right. So:
const int * A is the same as int const * A (the const applies to the int, not to the pointer), but in int * const A, the const applies to the pointer, not to the int.
There is a more comprehensive explanation on Wikipedia: const-correctness. The 'Pointers and References' section of that page includes an example which covers the various combinations of const and non-const that you can apply to a pointer.
You can cast the const away:
r->A = (void *) A;
The issue, though, is less about how to avoid the warning and more about what you're trying to do. The compiler is warning you because you're telling the compiler that "A" is not writable, but you're trying to store it in a location that defines it as writable. That's generally not OK, semantically.
Imagine that "A" points to a location in your binary's read-only data section; trying to write to it, after casting the const away, is undefined behaviour and will most probably cause a segfault.
So, really, think about what you're trying to do more closely before trying to work around the compiler warning. They're warnings for a good reason.