Consider a simple, reusable library. It has an object for the current state,
and a callback function to feed it input.
typedef struct Context_S Context_T;
typedef size_t (*GetBytes_T) (Context_T * ctx, uint8_t * bytes, size_t max);
struct Context_S {
GetBytes_T byteFunc;
void * extra;
// more elements
};
void Init(Context_T * ctx, GetBytes_T func);
int GetNext(Context_T * ctx); // Calls callback when needs more bytes
The user might need some extra data for the callback (such as a file pointer). The library
provides functions to store one extra pointer:
void SetExtra(Context_T * ctx, void * ext); // May be called after init
void * GetExtra(Context_T const * ctx); // May be called in callback
However, if the user's extra data is constant, the user would have to cast constness
away before setting the data. I could change the functions to take/return const,
but this would require an extra cast in the callback if the data should not be constant.
void SetExtra(Context_T * ctx, void const * ext);
void const * GetExtra(Context_T const * ctx);
A third alternative would be to hide the cast inside the function calls:
void SetExtra(Context_T * ctx, void const * ext);
void * GetExtra(Context_T const * ctx);
Is it a good idea to hide the cast in this case?
I'm trying to find a balance between usability and type safety. But since we are
using void* pointers, a lot of safety is gone already.
Or am I overlooking something worthy of consideration?
The C standard library has similar problems. Notoriously, the strchr function accepts a const char * parameter and returns a char * value that points into the given string.
This is a deficiency in the C language: Its provisions for const do not support all the ways in which const might be reasonably used.
It is not unreasonable to follow the example of the C standard: Accept a pointer to const and, when giving it back to the calling software, provide a pointer to non-const, as in your third example.
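As a rough sketch of that third option (not the asker's actual library), the cast can live in exactly one place, inside GetExtra. The reduced Context_S layout and the DemoExtra function are assumptions made up for illustration; the real struct has more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal context; the real Context_S has more members. */
typedef struct Context_S {
    void const *extra;   /* stored as pointer-to-const internally */
} Context_T;

/* Accepts const and non-const data alike, with no cast at the call site. */
void SetExtra(Context_T *ctx, void const *ext)
{
    ctx->extra = ext;
}

/* The single cast that drops const lives here, much like strchr. */
void *GetExtra(Context_T const *ctx)
{
    return (void *)ctx->extra;
}

/* Hypothetical demo: one context holds const data, the other mutable data. */
int DemoExtra(void)
{
    static int const fixed = 40;  /* must only ever be read */
    int counter = 1;              /* may be modified */
    Context_T a, b;
    int const *ro;
    int *rw;

    SetExtra(&a, &fixed);
    SetExtra(&b, &counter);

    ro = GetExtra(&a);   /* the caller re-adds const for read-only data */
    rw = GetExtra(&b);
    assert(ro == &fixed);

    *rw += 1;            /* fine: counter is not a const object */
    return *ro + *rw;
}
```

The price is that nothing stops a callback from writing through the pointer returned for `fixed`; the caller has to remember which kind of data it registered.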
Another alternative is to define two sets of routines, SetExtra and GetExtra that use non-const, and SetExtraConst and GetExtraConst that use const. These could be enforced at run-time with an extra bit that records whether the set context was const or non-const. However, even without enforcement, they could be helpful because they could make errors more visible in the calling code: Somebody reading the code could see that SetExtraConst is used to set the data and GetExtra (non-const) is used to get the data. (This might not help if the calling code is somewhat convoluted and uses const data in some cases and non-const data in others, but it is better to catch more errors than fewer.)
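A minimal sketch of the two-sets-of-routines idea with run-time enforcement might look like the following. The extraIsConst member, the choice to return NULL on a mismatch, and DemoConstFlag are all assumptions for illustration; an assertion or an error code would work just as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced context with the extra bit described above. */
typedef struct Context_S {
    void const *extra;
    int extraIsConst;    /* 1 if the pointer was set through SetExtraConst */
} Context_T;

void SetExtra(Context_T *ctx, void *ext)
{
    ctx->extra = ext;
    ctx->extraIsConst = 0;
}

void SetExtraConst(Context_T *ctx, void const *ext)
{
    ctx->extra = ext;
    ctx->extraIsConst = 1;
}

/* Refuses (returns NULL) when the data was registered as const. */
void *GetExtra(Context_T const *ctx)
{
    if (ctx->extraIsConst)
        return NULL;
    return (void *)ctx->extra;
}

/* Always allowed: adding const is harmless. */
void const *GetExtraConst(Context_T const *ctx)
{
    return ctx->extra;
}

/* Hypothetical demo: returns 1 if the mismatch is caught and the match works. */
int DemoConstFlag(void)
{
    static int const table = 7;
    int scratch = 3;
    Context_T ctx;

    SetExtraConst(&ctx, &table);
    assert(GetExtraConst(&ctx) == &table);
    if (GetExtra(&ctx) != NULL)   /* const data must not come back as non-const */
        return 0;

    SetExtra(&ctx, &scratch);
    return GetExtra(&ctx) == &scratch;
}
```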
For a standard "hack away" functional program design, it is quite simple:
If the function modifies the contents of a pointer parameter, the pointer should not be const.
If the function does not modify the contents of a pointer parameter, the pointer should always be const.
But in your case, it would rather seem that you are doing a proper object-oriented design, where your code module is the only one who knows what Context_T is and what it contains. (I take it the typedef on the first row is actually in the h file?)
If so, you cannot and should not make the pointer const. Especially not if you are implementing true OO encapsulation using incomplete type ("opaque" type), because in that case the caller can't modify the contents anyhow: the "const correctness" becomes superfluous.
Related
I have this very simple test function that I'm using to figure out what's going on with the const qualifier.
int test(const int* dummy)
{
*dummy = 1;
return 0;
}
This one throws me an error with GCC 4.8.3.
Yet this one compiles:
int test(const int* dummy)
{
*(char*)dummy = 1;
return 0;
}
So it seems like the const qualifier works only if I use the argument without casting to another type.
Recently I've seen code that used
test(const void* vpointer, ...)
At least for me, when I use void *, I tend to cast it to char * for pointer arithmetic in stacks or for tracing. How can const void * prevent subroutine functions from modifying the data at which vpointer is pointing?
const int *var;
const is a contract. By receiving a const int * parameter, you "tell" the caller that you (the called function) will not modify the objects the pointer points to.
Your second example explicitly breaks that contract by casting away the const qualifier and then modifying the object pointed by the received pointer. Never ever do this.
This "contract" is enforced by the compiler. *dummy = 1 won't compile. The cast is a way to bypass that, by telling the compiler that you really know what you are doing and to let you do it. Unfortunately the "I really know what I am doing" is usually not the case.
const can also be used by compiler to perform optimization it couldn't otherwise.
Undefined Behavior note:
Please note that while the cast itself is technically legal, modifying a value declared as const is Undefined Behavior. So technically, the original function is ok, as long as the pointer passed to it points to data declared mutable. Else it is Undefined Behavior.
more about this at the end of the post
As for motivation and use lets take the arguments of strcpy and memcpy functions:
char* strcpy( char* dest, const char* src );
void* memcpy( void* dest, const void* src, size_t count );
strcpy operates on char strings, memcpy operates on generic data. While I use strcpy as example, the following discussion is exactly the same for both, but with char * and const char * for strcpy and void * and const void * for memcpy:
dest is char * because in the buffer dest the function will put the copy. The function will modify the contents of this buffer, thus it is not const.
src is const char * because the function only reads the contents of the buffer src. It doesn't modify it.
Only by looking at the declaration of the function, a caller can assert all the above. By contract strcpy will not modify the content of the second buffer passed as argument.
const and void are orthogonal. That is all the discussion above about const applies to any type (int, char, void, ...)
void * is used in C for "generic" data.
Even more on Undefined Behavior:
Case 1:
int a = 24;
const int *cp_a = &a; // mutable to const is perfectly legal. This is in effect
// a constant view (reference) into a mutable object
*(int *)cp_a = 10; // Legal, because the object referenced (a)
// is declared as mutable
Case 2:
const int cb = 42;
const int *cp_cb = &cb;
*(int *)cp_cb = 10; // Undefined Behavior.
// the write into a const object (cb here) is illegal.
I began with these examples because they are easier to understand. From here there is only one step to function arguments:
void foo(const int *cp) {
*(int *)cp = 10; // Legal in case 1. Undefined Behavior in case 2
}
Case 1:
int a = 0;
foo(&a); // the write inside foo is legal
Case 2:
int const b = 0;
foo(&b); // the write inside foo causes Undefined Behavior
Again I must emphasize: unless you really know what you are doing, and all the people working in the present and in the future on the code are experts and understand this, and you have a good motivation, unless all the above are met, never cast away the constness!!
int test(const int* dummy)
{
*(char*)dummy = 1;
return 0;
}
No, this does not work. Modifying truly const data after casting away constness is undefined behavior, and your program will likely crash if, for example, the implementation put const data in ROM. The fact that "it works" doesn't change the fact that the behavior is undefined.
At least for me, when I used void*, I tend to cast it to char* for
pointer arithmetic in stacks or for tracing. How can const void*
prevent subroutine functions from modifying the data at which vpointer
is pointing?
A const void* is a pointer to some data that must not be changed through it. In order to read the data, yes, you have to cast the pointer to a concrete type such as const char*. But that is reading. Writing through it after casting away const is, again, UB when the data is truly const.
This is covered more in depth here. C allows you to entirely bypass type-safety: it's your job to prevent that.
It’s possible that a given compiler on a given OS could put some of its const data in read-only memory pages. If so, attempting to write to that location would fail in hardware, such as causing a general protection fault.
The const qualifier just means that writing there is undefined behavior. This means the language standard allows the program to crash if you do (or anything else). Despite that, C lets you shoot yourself in the foot if you think you know what you’re doing.
You can’t stop a subroutine from reinterpreting the bits you give it however it wants and running any machine instruction on them it wants. The library function you’re calling might even be written in assembler. But doing that to a const pointer is undefined behavior, and you really don’t want to invoke undefined behavior.
Off the top of my head, one rare example where it might make sense: suppose you’ve got a library that passes around handle parameters. How does it generate and use them? Internally, they might be pointers to data structures. So that’s an application where you might typedef const void* my_handle; so the compiler will throw an error if your clients try to dereference it or do arithmetic on it by mistake, then cast it back to a pointer to your data structure inside your library functions. It’s not the safest implementation, and you want to be careful about attackers who can pass arbitrary values to your library, but it’s very low-overhead.
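To make the handle idea concrete, here is a small sketch. The counter_* functions, struct Counter, and DemoHandle are invented for illustration. Casting away const inside the library is well-defined here only because the object comes from malloc and is therefore not itself a const object.

```c
#include <assert.h>
#include <stdlib.h>

typedef const void *my_handle;     /* all that clients get to see */

struct Counter { int value; };     /* hypothetical private structure */

my_handle counter_create(int start)
{
    struct Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

int counter_increment(my_handle h)
{
    /* Well-defined: the pointed-to object was never declared const. */
    struct Counter *c = (struct Counter *)h;
    return ++c->value;
}

void counter_destroy(my_handle h)
{
    free((void *)h);
}

/* Hypothetical client code: it can pass the handle around but not use it. */
int DemoHandle(void)
{
    my_handle h = counter_create(5);
    int result;

    if (h == NULL)
        return -1;
    /* *h or h + 1 here would be rejected by a conforming compiler. */
    counter_increment(h);
    result = counter_increment(h);
    assert(result == 7);
    counter_destroy(h);
    return result;
}
```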
I have a function with a prototype like this:
ErrorType function(void ** parameter, other_args);
This function reads the pointer pointed by 'parameter' and changes it (think of it like a realloc).
Now, to be correct according to the C Standard, if I want to pass the address of a pointer of any type other than void *, I must declare a temporary void * variable and use that instead.
So what I want is to create a wrapper (I don't care if it's a function or a macro) that makes the function call with any pointer type.
I think I could do that in C11 with _Generic and a function for each basic type, plus a function for all structs and a function for all unions, but I think it's too troublesome.
I also read about a GCC extension that lets you write statements and declarations in expressions, and I think I could easily do what I want with that, but I prefer that my code compiles with all standard compilers, not only GCC or Clang.
So the question is: is there any way to do that without too much trouble in a C11 compiler?
If I understand the question correctly, you'd like for function to be able to modify various types of pointers. Well, there's bad news and good news about that.
Bad news: The object representation of pointers is opaque, so you'd need to communicate to your function which kind of pointer it should be working with, unless your function is guaranteed to copy an object representation from a source pointer's object representation and you know that the two representations have the same meaning.
For example, sizeof (double *) could be different than sizeof (void *)
Good news: A pointer to any object type can be cast to a void * and back again, which includes a void ** and a double ** and many other pointer types. So you could have:
ErrorType function(void * ptr, int ptr_type, ...) {
void ** vpp;
double ** dpp;
...
...
switch (ptr_type) {
case PTR_TYPE_VOIDP:
vpp = ptr;
/* Now you can work with *vpp */
*vpp = ...
break;
case PTR_TYPE_DOUBLEP:
dpp = ptr;
/* Now you can work with *dpp */
*dpp = ...
break;
}
...
}
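If a statement-style macro is acceptable, the temporary void * mentioned in the question can be hidden in standard C without any GCC extension. This is only a sketch: the realloc-based body of function, the ErrorType typedef, the size_t parameter standing in for other_args, and the CALL_FUNCTION name are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>

typedef int ErrorType;   /* hypothetical: 0 means success */

/* Stand-in for the real function: resizes the block *parameter points to. */
ErrorType function(void **parameter, size_t new_size)
{
    void *p = realloc(*parameter, new_size);
    if (p == NULL)
        return 1;
    *parameter = p;
    return 0;
}

/* Routes the call through a real void * object, so the function never
   accesses a double * (or any other pointer type) as if it were a void *. */
#define CALL_FUNCTION(err, ptr, size)        \
    do {                                     \
        void *tmp_ = (ptr);                  \
        (err) = function(&tmp_, (size));     \
        (ptr) = tmp_;                        \
    } while (0)

/* Hypothetical demo with a double * instead of a void *. */
int DemoWrapper(void)
{
    double *values = NULL;
    ErrorType err;

    CALL_FUNCTION(err, values, 4 * sizeof *values);
    if (err != 0)
        return -1;
    values[3] = 2.5;
    assert(values[3] == 2.5);
    free(values);
    return 0;
}
```

The macro evaluates ptr twice, so it should only be given a plain lvalue without side effects.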
I have a bunch of code that I need to analyze and prepare for import into a new project. The following patterns occur often:
typedef struct t_Substruct
{
/* some elements */
} ts;
typedef struct t_Superstruct
{
/* some elements */
ts substruct;
/* some more elements */
} tS;
void funct1(const tS * const i_pS, tS * const o_pS)
{ /* do some complicated calculations and transformations */ }
void funct2(const ts * const i_ps, tS * const o_pS)
{ /* do some complicated calculations and transformations */ }
void funct3(const tS * const i_ps, ts * const o_ps)
{ /* do some complicated calculations and transformations */ }
In general there is reading from the i_ parameters and writing to the o_ parameters. Now there might be calls like:
void some_funct()
{
tS my_struct = {0};
/* do some stuff */
funct1(&my_struct,&my_struct);
funct2(&my_struct.substruct, &my_struct);
funct3(&my_struct, &my_struct.substruct);
}
I'm not sure about the possible pitfalls of such functions and calling contexts:
Are such declarations or calls in context of const correctness allowed with regards to language constraints and/or undefined behavior?
Is it allowed to change one object, that is referenced protected and not protected in the same function?
I know there are some problems with accessing or modifying the same variable multiple times between sequence points (although I am not sure I fully understand sequence points). Does this or a similar issue apply here, and in which way?
If not undefined behavior, is there any other typical problem which reduces portability in the above situation?
If there are problems, what would be a good (safe and if possible with as little overhead as possible) general pattern to allow those calls, so that such issues might not happen every other line?
I have to implement in C90, but if there are issues in porting to other C incarnations, regarding the above, this is also important for me.
Thank you in advance.
There are two different aspects of const with a pointer T* p.
Is it allowed to change the pointer? Example: p = NULL;
Is it allowed to change the object pointed to? Example: p->x = 5;
These are the four possibilities:
T* p: both changes allowed
const T* p: object cannot be changed
T* const p: pointer cannot be changed
const T* const p: neither object nor pointer can be changed
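The four forms can be checked against a compiler with a short sketch like this one (DemoConstForms is a name made up for illustration; the commented-out statements are the ones a compiler rejects):

```c
#include <assert.h>

int DemoConstForms(void)
{
    int x = 1, y = 2;

    int *p = &x;                /* both changes allowed */
    const int *pc = &x;         /* object cannot be changed through pc */
    int *const cp = &x;         /* pointer cannot be changed */
    const int *const cpc = &x;  /* neither can be changed */

    *p = 10;                    /* x is now 10 */
    p = &y;
    pc = &y;                    /* *pc = 10;  would not compile */
    *cp = 20;                   /* cp = &y;   would not compile */
                                /* *cpc = 30; cpc = &y;  neither compiles */

    /* cpc still sees the change made through cp: const restricts the
       access path, not the object. */
    assert(*pc == 2 && *cpc == 20);
    return *p + *cp + *pc + *cpc;
}
```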
In your example void funct1(const tS * const i_pS, tS * const o_pS) this means the following:
You are not allowed to change the pointers i_pS and o_pS.
You are not allowed to change the object pointed to by i_pS.
You are allowed to change the object pointed to by o_pS.
The first condition looks rather pointless, so probably this
void funct1(const tS* i_pS, tS* o_pS)
is more readable.
Regarding the second and third case where you have two pointers which point to the same part of an object: Be careful that you do not make wrong assumptions in the code, for example that the object pointed to by the const pointer is actually not changing.
Remember, a pointer to const never means that the object is not changing, only that you are not allowed to change it via that pointer.
Example for problematic code:
void foo(const S* a, S* b) {
if(a->x != 0) {
b->x = 0;
b->y = 5 / a->x; // why is a->x suddenly 0 ??
}
}
S s;
foo(&s, &s);
Regarding undefined behaviour and sequence points, I would advise reading this answer: Undefined behavior and sequence points
So for example the expression i = a->x + (b->x)++; is definitely undefined behavior if a and b point to the same object.
A function void funct1(const tS* i_ps, tS* o_pS) which is called as funct1(&my_struct, &my_struct); is an open door to confusion and errors.
The C library also knows that problem. Consider for example memcpy and memmove.
So I would advise building your functions such that you can be sure that no undefined behavior can occur. The most drastic measure would be to make a complete copy of the input struct. This has overhead, but in your specific case perhaps it is enough to copy only some small part of the input argument.
If the overhead is too big, specifically state in the function documentation that it is not allowed to give the same object as input and output. Then, if possible and necessary, create a second function with the necessary overhead to handle the case where input and output are the same.
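For the small foo example above, copying only the part of the input that is needed could look like this. The layout of S, the foo_safe name, and DemoAliasing are assumptions; the point is just that the snapshot is taken before the first write.

```c
#include <assert.h>

typedef struct { int x; int y; } S;   /* hypothetical layout for the example */

/* Takes a snapshot of the needed input before writing to *b. */
void foo_safe(const S *a, S *b)
{
    const int ax = a->x;   /* read while a->x still holds its input value */

    if (ax != 0) {
        b->x = 0;
        b->y = 5 / ax;     /* unaffected even if a and b point to the same S */
    }
}

/* Hypothetical demo: input and output are the same object. */
int DemoAliasing(void)
{
    S s = { 5, 0 };

    foo_safe(&s, &s);
    assert(s.x == 0);
    return s.y;
}
```

This is C90-compatible, and the overhead is one int copy rather than a copy of the whole struct.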
This call is invalid. For example:
funct1(&my_struct.substruct, &my_struct.substruct);
because funct1 expects a tS * but these are ts *. You would need to get a cast to get this to compile. The code would then work (because there is actually a tS at the pointed-to location) but it is strange to say the least, you should just change it to &my_struct instead of adding the cast.
Also, I beg you to use a different naming convention than ts versus tS.
As Danvil says, it's important that your code takes account of the fact that the two pointers may be pointing to parts of the same object.
As a matter of style, I don't like the "top level const". It makes the function header harder to read, and you have to take some time to work out what is const and what isn't.
Modern compilers can optimize code when they see a const. However, I've never seen the C standard library use const for its non-pointer arguments. memcmp() is an example of this: it has two const void * arguments, but its third argument is a plain size_t.
Why is the standard library (and other libraries) designed this way? Why don't I see const size_t or const int in modern code?
C uses call-by-value. It does not help the compiler one bit to mark a function argument as const. (Note that none of the arguments of memcmp() is const. The pointer arguments could also be declared const, and you could have suggested that they should be: int memcmp(const void * const s1, const void * const s2, size_t const n);. But they aren't.)
The reason it would not help the compiler to mark function arguments const is that a function argument is just, from the point of view of the function, a local variable. As long as the function does not take its address, it's very easy for the compiler to see that the variable is never modified.
In contrast, the const modifiers that are part of memcmp()'s prototype (const void *s1) are part of its contract: they express that the function does not modify the pointed-to data. The const modifier is never used this way for the arguments themselves because the caller does not care whether the function modifies its arguments: they are just copies (again because C uses call-by-value).
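One way to see that top-level const on a parameter is invisible to callers: a prototype without it and a definition with it declare compatible function types, so the following sketch compiles. The function name count_zero_bytes and the demo are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Prototype as a caller sees it: const only on the pointed-to data. */
size_t count_zero_bytes(const unsigned char *buf, size_t n);

/* Compatible definition: the extra top-level const on buf and n only
   constrains the function body, not the interface. */
size_t count_zero_bytes(const unsigned char *const buf, const size_t n)
{
    size_t i, zeros = 0;

    for (i = 0; i < n; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}

/* Hypothetical demo. */
int DemoTopLevelConst(void)
{
    const unsigned char data[] = { 0, 1, 0, 2, 0 };
    size_t zeros = count_zero_bytes(data, sizeof data);

    assert(zeros == 3);
    return (int)zeros;
}
```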
Those consts mean different things. In
int memcmp ( const void * ptr1, const void * ptr2, size_t num );
const void * ptr1 means that memcmp will treat ptr1 as pointing to constant data and won't modify it; similarly for const void * ptr2. Thus, the caller knows that the stored values won't be changed and can optimize accordingly. In a function call like
int result = memcmp(ptr1, ptr2, num);
the variables ptr1, ptr2, and num are copied into the function. memcmp makes no promises not to adjust them; it only promises not to adjust what the pointers point to. Indeed, it might increment/decrement any of the copied variables in order to step through the arrays if that proves efficient. If it wanted to promise not to change any of them, the declaration would be:
int memcmp ( const void *const ptr1, const void *const ptr2, const size_t num );
For simple data types (like pointers and integers), little (if any) optimization can be gained in this way, and the original specifiers of this function (and others) apparently saw no reason to prevent implementations from modifying the variables in fortuitous circumstances.
The main reason for this is library consistency. Changing a size_t argument to a const size_t would require implementations that do modify the size to be rewritten. Not all implementations of a library need to use the same algorithm. This is the main reason why existing library functions are not being adapted.
New library functions are often created purposefully with non const arguments so that certain machine dependent implementations can use the modifiable argument.
For instance the Intel C++ Compiler version of memcmp actually counts down the length argument during execution. Some other implementations may not do so.
I was recently making some adjustments to code wherein I had to change a formal parameter in a function. Originally, the parameter was similar to the following (note, the structure was typedef'd earlier):
static MySpecialStructure my_special_structure;
static unsigned char char_being_passed; // Passed to function elsewhere.
static MySpecialStructure * p_my_special_structure; // Passed to function elsewhere.
int myFunction (MySpecialStructure * p_structure, unsigned char useful_char)
{
...
}
The change was made because I could define and initialize my_special_structure before compile time and myFunction never changed the value of it. This led to the following change:
static const MySpecialStructure my_special_structure;
static unsigned char char_being_passed; // Passed to function elsewhere.
static MySpecialStructure * p_my_special_structure; // Passed to function elsewhere.
int myFunction (const MySpecialStructure * p_structure, unsigned char useful_char)
{
...
}
I also noticed that when I ran Lint on my program that there were several Info 818's referencing a number of different functions. The info stated that "Pointer parameter 'x' (line 253) could be declared as pointing to const".
Now, I have two questions in regards to the above. First, in regards to the above code, since neither the pointer nor the variables within MySpecialStructure is changed within the function, is it beneficial to declare the pointer as constant as well? e.g. -
int myFunction (const MySpecialStructure * const p_structure, unsigned char useful_char)
My second question is in regards to the Lint information. Are there any benefits or drawbacks to declaring pointers as a constant formal parameter if the function is not changing its value... even if what you are passing to the function is never declared as a constant? e.g. -
static unsigned char my_char;
static unsigned char * p_my_char;
p_my_char = &my_char;
int myFunction (const unsigned char * p_char)
{
...
}
Thanks for your help!
Edited for clarification -
What are the advantages of declaring a pointer to const or a const pointer to const- as a formal parameter? I know that I can do it, but why would I want to... particularly in the case where the pointer being passed and the data it is pointing to are not declared constant?
What are the advantages of declaring a pointer as a const - as a formal parameter? I know that I can do it, but why would I want to... particularly in the case where the pointer being passed and the data it is pointing to are not declared constant?
I assumed you meant a pointer to const.
By having a pointer to const as a parameter, you document the API: you tell the programmer that your function does not modify the object pointed to by the pointer.
For example look at memcpy prototype:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
It tells the programmer the object pointed to by s2 will not be modified through memcpy call.
It also provides compiler enforced documentation as the implementation will issue a diagnostic if you modify a pointee from a pointer to const.
const also lets you signal to users of your function that you won't modify this parameter behind their back.
If you declare a formal parameter as const, the compiler can check that your code does not attempt to modify that parameter, yielding better software quality.
Const correctness is a wonderful thing. For one, it lets the compiler help keep you from making mistakes. An obvious simple case is assigning when you meant to compare. In that instance, if the pointer is const, the compiler will give you an error. Google 'const correctness' and you'll find many resources on the benefits of it.
For your first question, if you are damn sure of not modifying either the pointer or the variable it points to, you can by all means go ahead and make both of them constant!
Now, for your question as to why you would declare a formal pointer parameter as const even though the passed pointer is not constant: a typical use case is the library function printf(). printf accepts a const char * format string, and the compiler doesn't complain even if you pass a char * to it. In such a case, it makes sense that printf() does not alter the user's data inadvertently. It's like printf() clearly saying: whether you pass a const char * or a char *, don't worry, I still won't modify your data!
For your second question, const pointers find excellent application in the embedded world where we generally write to a memory address directly. Here is the detailed explanation
Well, what are the advantages of declaring anything as a const while you have the option to not to do so? After all, if you don't touch it, it doesn't matter if it's const or not. This provides some safety checks that the compiler can do for you, and it gives some information of the function interface. For example, you can safely pass a string literal to a function that expects a const char *, but you need to be careful if the parameter is declared as just a char *.
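A short sketch of that last point, with a made-up count_char function: because the parameter is a pointer to const, the same function accepts a modifiable buffer and a string literal, and the compiler rejects any accidental write inside it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how often c occurs in s without modifying s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; s++)   /* the local copy s may change; *s may not */
        if (*s == c)
            n++;
    return n;
}

/* Hypothetical demo. */
int DemoCountChar(void)
{
    char buffer[] = "banana";                       /* modifiable array */
    size_t in_buffer  = count_char(buffer, 'a');    /* char * converts to const char * */
    size_t in_literal = count_char("hello", 'l');   /* literals are safe to pass */

    assert(in_buffer == 3);
    return (int)(in_buffer + in_literal);
}
```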