Compiler Optimization: const on non-pointer function arguments in C

Modern compilers can optimize code when they see a const. However, I've never seen the C standard library use const for its non-pointer arguments. memcmp() is an example: it has two const void * arguments, but its third argument is a plain size_t.
Why is the standard library (and other libraries) designed this way? Why don't I see const size_t or const int in modern code?

C uses call-by-value. It does not help the compiler one bit to mark a function argument as const. (Note that none of the arguments of memcmp() is itself const. The pointer arguments could also be declared const, and you could have suggested that they should be: int memcmp(const void * const s1, const void * const s2, size_t const n);. But they aren't.)
The reason it would not help the compiler to mark function arguments const is that a function argument is just, from the point of view of the function, a local variable. As long as the function does not take its address, it's very easy for the compiler to see that the variable is never modified.
In contrast, the const modifiers that are part of memcmp()'s prototype (const void *s1) are part of its contract: they express that the function does not modify the pointed-to data. The const modifier is never used this way for the arguments themselves because the caller does not care whether the function modifies its arguments: they are just copies (again, because C uses call-by-value).
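To make the "just a local variable" point concrete, here is a minimal sketch. The names count_zero_bytes and demo_count_zero_bytes are made up for illustration. It relies on the fact that C ignores top-level qualifiers on parameters when checking whether two function declarations are compatible, so the prototype the caller sees can omit a const that the definition uses internally.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration as a caller sees it: no top-level const on buf or n. */
size_t count_zero_bytes(const unsigned char *buf, size_t n);

/* The definition adds top-level const to both parameters. The function
   type is unchanged, so this is compatible with the declaration above;
   the extra qualifiers only stop the body from assigning to buf or n. */
size_t count_zero_bytes(const unsigned char *const buf, const size_t n)
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 0) {
            zeros++;
        }
    }
    return zeros;
}

/* Hypothetical self-check: the caller's variables are untouched either way. */
void demo_count_zero_bytes(void)
{
    const unsigned char data[] = { 1, 0, 2, 0, 0 };
    size_t len = sizeof data;

    assert(count_zero_bytes(data, len) == 3);
    assert(len == 5);
}
```

Calling demo_count_zero_bytes() from main exercises both assertions. The const on n is purely an implementation detail of the function body.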

Those consts mean different things. In
int memcmp ( const void * ptr1, const void * ptr2, size_t num );
const void * ptr1 means that memcmp will treat ptr1 as pointing to constant data and won't modify it; similarly for const void * ptr2. Thus, the caller knows that the stored values won't be changed and can optimize accordingly. In a function call like
int result = memcmp(ptr1, ptr2, num);
the variables ptr1, ptr2, and num are copied into the function. memcmp makes no promises not to adjust them; it only promises not to adjust what the pointers point to. Indeed, it might increment/decrement any of the copied variables in order to step through the arrays if that proves efficient. If it wanted to promise not to change any of them, the declaration would be:
int memcmp ( const void *const ptr1, const void *const ptr2, const size_t num );
For simple data types (like pointers and integers), little (if any) optimization can be gained in this way, and the original specifiers of this function (and others) apparently saw no reason to prevent implementations from modifying the variables in fortuitous circumstances.
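As a rough illustration of an implementation that adjusts its own copies, here is a sketch of a memcmp-like function. It is not how any particular C library implements memcmp; my_memcmp and demo_my_memcmp are hypothetical names. The length parameter is counted down directly, which would not be possible if it were declared const size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Compares n bytes. The pointed-to data is never written (that is the
   contract expressed by const void *), but the parameter n is a private
   copy and is freely counted down. */
int my_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p1 = s1;
    const unsigned char *p2 = s2;

    while (n-- > 0) {
        if (*p1 != *p2) {
            return (*p1 < *p2) ? -1 : 1;
        }
        p1++;
        p2++;
    }
    return 0;
}

/* Hypothetical self-check: the caller's len is unaffected by the countdown. */
void demo_my_memcmp(void)
{
    size_t len = 3;

    assert(my_memcmp("abc", "abd", len) < 0);
    assert(my_memcmp("abc", "abc", len) == 0);
    assert(len == 3);
}
```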

The main reason for this is library consistency. Changing a size_t argument to a const size_t would require implementations that do modify the size to be rewritten. Not all implementations of a library need to use the same algorithm. This is the main reason why existing library functions are not being adapted.
New library functions are often created purposefully with non-const arguments so that certain machine-dependent implementations can use the modifiable argument.
For instance, the Intel C++ Compiler version of memcmp actually counts down the length argument during execution. Some other implementations may not do so.

Related

Does the second const matter in void f(char const * const p)?

A function declared
void f(const char * const p) { ... }
means that it takes a constant pointer to a constant character. But, with the variable p scoped only within the function itself and its usage hidden from the caller, does it matter if the second const is there?
In other words, wouldn't the following be semantically identical to the first from the perspective of the caller?
void f(const char *p) { ... }
The first const indicates that you can't change the data p is pointing to; the second indicates that the pointer itself can't be overwritten.
So, for a caller of this function, it doesn't matter.
For the implementation of the function, it can be useful, though it will only have effect locally.
The second const (the one after the *) doesn't matter from the caller's point of view.
But it does matter from the point of view of the function body.
The second const makes sure that you can't make the pointer change its value (i.e. point to a different memory location).
It's comparable to declaring a simple primitive parameter as const, like in void f(const int value) {}
Yes. You tell the compiler that you won't modify p in that function. This allows the compiler certain optimizations, e.g. if p is already in a register, there's no need to save it and no need to allocate a new register.
It also causes the compiler to issue a diagnostic should you attempt to modify p, e.g. ++p or p = ..., because that violates the promise you made.
Yes, I believe so. The additional const here only means that in the implementation of f, the implementer is not allowed to reassign p to point to some other character or C-style string.
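A small sketch of what the second const means inside the function body; count_char and demo_count_char are hypothetical names. Since p cannot be reassigned, the body walks the string with a separate cursor, while the caller can still pass any ordinary pointer.

```c
#include <assert.h>
#include <stddef.h>

/* p is a const pointer to const char: neither *p nor p may be modified
   inside this function. */
size_t count_char(const char *const p, char target)
{
    size_t count = 0;

    /* ++p; or p = "other"; would be rejected by the compiler here,
       so iteration uses a local cursor instead. */
    for (const char *cur = p; *cur != '\0'; cur++) {
        if (*cur == target) {
            count++;
        }
    }
    return count;
}

/* Hypothetical self-check: the caller's pointer is not const at all. */
void demo_count_char(void)
{
    char word[] = "banana";
    char *w = word;

    assert(count_char(w, 'a') == 3);
    assert(w == word);
}
```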

When to use const void*?

I have this very simple test function that I'm using to figure out what's going on with the const qualifier.
int test(const int* dummy)
{
*dummy = 1;
return 0;
}
This one throws me an error with GCC 4.8.3.
Yet this one compiles:
int test(const int* dummy)
{
*(char*)dummy = 1;
return 0;
}
So it seems like the const qualifier works only if I use the argument without casting to another type.
Recently I've seen code that used
test(const void* vpointer, ...)
At least for me, when I use void *, I tend to cast it to char * for pointer arithmetic in stacks or for tracing. How can const void * prevent subroutine functions from modifying the data at which vpointer is pointing?
const int *var;
const is a contract. By receiving a const int * parameter, you "tell" the caller that you (the called function) will not modify the objects the pointer points to.
Your second example explicitly breaks that contract by casting away the const qualifier and then modifying the object pointed to by the received pointer. Never ever do this.
This "contract" is enforced by the compiler. *dummy = 1 won't compile. The cast is a way to bypass that, by telling the compiler that you really know what you are doing and to let you do it. Unfortunately the "I really know what I am doing" is usually not the case.
const can also be used by the compiler to perform optimizations it couldn't perform otherwise.
Undefined Behavior note:
Please note that while the cast itself is technically legal, modifying a value declared as const is Undefined Behavior. So technically, the original function is ok, as long as the pointer passed to it points to data declared mutable. Else it is Undefined Behavior.
More about this at the end of the post.
As for motivation and use, let's take the arguments of the strcpy and memcpy functions:
char* strcpy( char* dest, const char* src );
void* memcpy( void* dest, const void* src, size_t count );
strcpy operates on char strings, memcpy operates on generic data. While I use strcpy as example, the following discussion is exactly the same for both, but with char * and const char * for strcpy and void * and const void * for memcpy:
dest is char * because in the buffer dest the function will put the copy. The function will modify the contents of this buffer, thus it is not const.
src is const char * because the function only reads the contents of the buffer src. It doesn't modify it.
Only by looking at the declaration of the function, a caller can assert all the above. By contract strcpy will not modify the content of the second buffer passed as argument.
const and void are orthogonal. That is all the discussion above about const applies to any type (int, char, void, ...)
void * is used in C for "generic" data.
Even more on Undefined Behavior:
Case 1:
int a = 24;
const int *cp_a = &a; // mutable to const is perfectly legal. This is in effect
// a constant view (reference) into a mutable object
*(int *)cp_a = 10; // Legal, because the object referenced (a)
// is declared as mutable
Case 2:
const int cb = 42;
const int *cp_cb = &cb;
*(int *)cp_cb = 10; // Undefined Behavior.
// the write into a const object (cb here) is illegal.
I began with these examples because they are easier to understand. From here there is only one step to function arguments:
void foo(const int *cp) {
*(int *)cp = 10; // Legal in case 1. Undefined Behavior in case 2
}
Case 1:
int a = 0;
foo(&a); // the write inside foo is legal
Case 2:
int const b = 0;
foo(&b); // the write inside foo causes Undefined Behavior
Again I must emphasize: unless you really know what you are doing, and all the people working in the present and in the future on the code are experts and understand this, and you have a good motivation, unless all the above are met, never cast away the constness!!
int test(const int* dummy)
{
*(char*)dummy = 1;
return 0;
}
No, this does not work. Casting away constness (with truly const data) is undefined behavior and your program will likely crash if, for example, the implementation put const data in ROM. The fact that "it works" doesn't change the fact that your code is ill-formed.
At least for me, when I used void*, I tend to cast it to char* for pointer arithmetic in stacks or for tracing. How can const void* prevent subroutine functions from modifying the data at which vpointer is pointing?
A const void* means a pointer to some data that cannot be changed through that pointer. In order to read it, yes, you have to convert it to a pointer to a concrete type such as const char *. But I said reading, not writing; writing, again, is UB if the data is truly const.
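For the reading case, the conversion can keep the const qualifier, so nothing is cast away. The following is a minimal sketch with hypothetical names (sum_bytes, demo_sum_bytes):

```c
#include <assert.h>
#include <stddef.h>

/* Reads len bytes of arbitrary data. The const void * is converted to a
   const unsigned char *, so the qualifier is preserved and any attempt to
   write through bytes would be diagnosed by the compiler. */
unsigned long sum_bytes(const void *data, size_t len)
{
    const unsigned char *bytes = data; /* implicit conversion in C, no cast */
    unsigned long sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum += bytes[i];
    }
    return sum;
}

/* Hypothetical self-check using truly const data. */
void demo_sum_bytes(void)
{
    static const unsigned char table[] = { 1, 2, 3, 250 };

    assert(sum_bytes(table, sizeof table) == 256);
}
```

Pointer arithmetic for stepping or tracing works the same way on const unsigned char * as on unsigned char *; only writes are ruled out.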
This is covered more in depth here. C allows you to entirely bypass type-safety: it's your job to prevent that.
It’s possible that a given compiler on a given OS could put some of its const data in read-only memory pages. If so, attempting to write to that location would fail in hardware, such as causing a general protection fault.
The const qualifier just means that writing there is undefined behavior. This means the language standard allows the program to crash if you do (or anything else). Despite that, C lets you shoot yourself in the foot if you think you know what you’re doing.
You can’t stop a subroutine from reinterpreting the bits you give it however it wants and running any machine instruction on them it wants. The library function you’re calling might even be written in assembler. But doing that to a const pointer is undefined behavior, and you really don’t want to invoke undefined behavior.
Off the top of my head, one rare example where it might make sense: suppose you’ve got a library that passes around handle parameters. How does it generate and use them? Internally, they might be pointers to data structures. So that’s an application where you might typedef const void* my_handle; so the compiler will throw an error if your clients try to dereference it or do arithmetic on it by mistake, then cast it back to a pointer to your data structure inside your library functions. It’s not the safest implementation, and you want to be careful about attackers who can pass arbitrary values to your library, but it’s very low-overhead.
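A rough sketch of that handle idea follows. Only the typedef comes from the text above; struct widget, the pool, and the widget_* functions are invented for illustration. The cast inside the library is legal here because the underlying pool objects are not declared const.

```c
#include <assert.h>
#include <stddef.h>

typedef const void *my_handle;   /* what clients see */

/* Hypothetical private data structure behind the handle. */
struct widget {
    int id;
    int uses;
};

static struct widget pool[4];
static size_t pool_used = 0;

/* Hands out a handle, or NULL when the pool is exhausted. */
my_handle widget_open(int id)
{
    if (pool_used >= sizeof pool / sizeof pool[0]) {
        return NULL;
    }
    struct widget *w = &pool[pool_used++];
    w->id = id;
    w->uses = 0;
    return w;
}

/* Inside the library the handle is cast back to the real type.
   No validation is done, so a bogus handle is the caller's problem. */
int widget_use(my_handle h)
{
    struct widget *w = (struct widget *)h;
    w->uses++;
    return w->uses;
}

/* Hypothetical self-check. */
void demo_widget(void)
{
    my_handle h = widget_open(7);

    assert(h != NULL);
    assert(widget_use(h) == 1);
    assert(widget_use(h) == 2);
}
```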

Choosing const vs. non-const pointer for user data

Consider a simple, re-usable library. It has an object for the current state and a callback function to feed it input.
typedef struct Context_S Context_T;
typedef size_t (*GetBytes_T) (Context_T * ctx, uint8_t * bytes, size_t max);
struct Context_S {
GetBytes_T byteFunc;
void * extra;
// more elements
};
void Init(Context_T * ctx, GetBytes_T func);
int GetNext(Context_T * ctx); // Calls callback when needs more bytes
The user might need some extra data for the callback (like a file pointer). The library provides functions to store one extra pointer:
void SetExtra(Context_T * ctx, void * ext); // May be called after init
void * GetExtra(Context_T const * ctx); // May be called in callback
However, if the user's extra data is constant, the user would have to cast constness away before setting the data. I could change the functions to take/return const, but this would require an extra cast in the callback if the data should not be constant.
void SetExtra(Context_T * ctx, void const * ext);
void const * GetExtra(Context_T const * ctx);
Third alternative would be to hide cast inside the function calls:
void SetExtra(Context_T * ctx, void const * ext);
void * GetExtra(Context_T const * ctx);
Is it a good idea to hide the cast in this case?
I'm trying to find a balance between usability and type safety. But since we are using void * pointers, a lot of safety is gone already.
Or am I overlooking something worthy of consideration?
The C standard library has similar problems. Notoriously, the strchr function accepts a const char * parameter and returns a char * value that points into the given string.
This is a deficiency in the C language: Its provisions for const do not support all the ways in which const might be reasonably used.
It is not unreasonable to follow the example of the C standard: Accept a pointer to const and, when giving it back to the calling software, provide a pointer to non-const, as in your third example.
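Here is a minimal sketch of that third alternative, with the cast hidden in one place. The struct is cut down to just the extra member for brevity (the real Context_S in the question has more fields), and demo_extra is a hypothetical name.

```c
#include <assert.h>

typedef struct Context_S Context_T;

/* Simplified for this sketch: only the user-data slot is shown. */
struct Context_S {
    void *extra;
};

/* Accepts pointer-to-const so const and non-const data can both be stored. */
void SetExtra(Context_T *ctx, void const *ext)
{
    ctx->extra = (void *)ext;   /* the single place where const is cast away */
}

/* Returns pointer-to-non-const, like strchr does. The caller is trusted
   not to write through it if the data was originally const. */
void *GetExtra(Context_T const *ctx)
{
    return ctx->extra;
}

/* Hypothetical self-check covering both kinds of user data. */
void demo_extra(void)
{
    Context_T ctx;
    int counter = 0;
    static const char label[] = "read-only";

    SetExtra(&ctx, &counter);
    int *p = GetExtra(&ctx);
    (*p)++;                     /* fine: counter is a mutable object */
    assert(counter == 1);

    SetExtra(&ctx, label);
    const char *s = GetExtra(&ctx);   /* re-add const on the way out */
    assert(s[0] == 'r');
}
```

The trade-off is the same as with strchr: the compiler can no longer catch a write to data that was const when it was stored.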
Another alternative is to define two sets of routines, SetExtra and GetExtra that use non-const, and SetExtraConst and GetExtraConst that use const. These could be enforced at run-time with an extra bit that records whether the set context was const or non-const. However, even without enforcement, they could be helpful because they could make errors more visible in the calling code: Somebody reading the code could see that SetExtraConst is used to set the data and GetExtra (non-const) is used to get the data. (This might not help if the calling code is somewhat convoluted and uses const data in some cases and non-const data in others, but it is better to catch more errors than fewer.)
For a standard "hack away" functional program design, it is quite simple:
If the function modifies the contents of a pointer parameter, the pointer should not be const.
If the function does not modify the contents of a pointer parameter, the pointer should always be const.
But in your case, it would rather seem that you are doing a proper object-oriented design, where your code module is the only one who knows what Context_T is and what it contains. (I take it the typedef on the first row is actually in the h file?)
If so, you cannot and should not make the pointer const. Especially not if you are implementing true OO encapsulation using incomplete type ("opaque" type), because in that case the caller can't modify the contents anyhow: the "const correctness" becomes superfluous.

Declaring a pointer to const or a const pointer to const as a formal parameter

I was recently making some adjustments to code wherein I had to change a formal parameter in a function. Originally, the parameter was similar to the following (note, the structure was typedef'd earlier):
static MySpecialStructure my_special_structure;
static unsigned char char_being_passed; // Passed to function elsewhere.
static MySpecialStructure * p_my_special_structure; // Passed to function elsewhere.
int myFunction (MySpecialStructure * p_structure, unsigned char useful_char)
{
...
}
The change was made because I could define and initialize my_special_structure before compile time and myFunction never changed the value of it. This led to the following change:
static const MySpecialStructure my_special_structure;
static unsigned char char_being_passed; // Passed to function elsewhere.
static MySpecialStructure * p_my_special_structure; // Passed to function elsewhere.
int myFunction (const MySpecialStructure * p_structure, unsigned char useful_char)
{
...
}
I also noticed that when I ran Lint on my program that there were several Info 818's referencing a number of different functions. The info stated that "Pointer parameter 'x' (line 253) could be declared as pointing to const".
Now, I have two questions in regards to the above. First, in regards to the above code, since neither the pointer nor the variables within MySpecialStructure is changed within the function, is it beneficial to declare the pointer as constant as well? e.g. -
int myFunction (const MySpecialStructure * const p_structure, unsigned char useful_char)
My second question is in regards to the Lint information. Are there any benefits or drawbacks to declaring pointers as a constant formal parameter if the function is not changing its value... even if what you are passing to the function is never declared as a constant? e.g. -
static unsigned char my_char;
static unsigned char * p_my_char;
p_my_char = &my_char;
int myFunction (const unsigned char * p_char)
{
...
}
Thanks for your help!
Edited for clarification -
What are the advantages of declaring a pointer to const or a const pointer to const- as a formal parameter? I know that I can do it, but why would I want to... particularly in the case where the pointer being passed and the data it is pointing to are not declared constant?
What are the advantages of declaring a pointer as a const - as a formal parameter? I know that I can do it, but why would I want to... particularly in the case where the pointer being passed and the data it is pointing to are not declared constant?
I assumed you meant a pointer to const.
By having a pointer to const as a parameter, you document the API: you tell the programmer that your function does not modify the object pointed to by the pointer.
For example look at memcpy prototype:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
It tells the programmer the object pointed to by s2 will not be modified through memcpy call.
It also provides compiler enforced documentation as the implementation will issue a diagnostic if you modify a pointee from a pointer to const.
const also lets you tell users of your function that you won't modify this parameter behind their back.
If you declare a formal parameter as const, the compiler can check that your code does not attempt to modify that parameter, yielding better software quality.
Const correctness is a wonderful thing. For one, it lets the compiler help keep you from making mistakes. An obvious simple case is assigning when you meant to compare. In that instance, if the pointer is const, the compiler will give you an error. Google 'const correctness' and you'll find many resources on the benefits of it.
For your first question, if you are damn sure of not modifying either the pointer or the variable it points to, you can by all means go ahead and make both of them constant!
Now, for your question as to why you would declare a formal pointer parameter as const even though the passed pointer is not constant: a typical use case is the library function printf(). printf is declared to accept a const char *, but the compiler doesn't complain if you pass a char * to it. In that case, it makes sense that printf() does not alter the user's data inadvertently. It's like printf() clearly saying: whether you pass a const char * or a char *, don't worry, I still won't modify your data!
For your second question, const pointers find excellent application in the embedded world where we generally write to a memory address directly. Here is the detailed explanation
Well, what are the advantages of declaring anything as a const while you have the option to not to do so? After all, if you don't touch it, it doesn't matter if it's const or not. This provides some safety checks that the compiler can do for you, and it gives some information of the function interface. For example, you can safely pass a string literal to a function that expects a const char *, but you need to be careful if the parameter is declared as just a char *.
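The string-literal point can be sketched like this; count_spaces, to_upper_in_place and demo_const_param are hypothetical names. A const char * parameter accepts both literals and mutable buffers, while a char * parameter signals that the function writes and therefore must not be given a literal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Read-only: safe to call with a string literal or a mutable buffer. */
size_t count_spaces(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {   /* s is a private copy; advancing it is fine */
        if (*s == ' ') {
            n++;
        }
    }
    return n;
}

/* Writes through its argument, so the parameter is plain char *. */
void to_upper_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)toupper((unsigned char)*s);
    }
}

/* Hypothetical self-check. */
void demo_const_param(void)
{
    char buf[] = "a b c";

    assert(count_spaces("hello world") == 1);   /* literal is fine here */
    assert(count_spaces(buf) == 2);             /* so is a mutable buffer */

    to_upper_in_place(buf);
    assert(strcmp(buf, "A B C") == 0);
    /* to_upper_in_place("hello") would compile in C, but writing to a
       string literal is undefined behavior. */
}
```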

returning const char * from a function in C

In my library I have to return a string to the callers. The string I am returning will be a global array, and the intended use by the caller is just to read the string. I don't want them to modify it.
Is this the right way to declare the function?
const char * get_some_details();
This should generate a warning (tried only gcc) either when the caller assigns the return value to a char * ptr or assigns to const char * ptr, but later tries to modify it.
I am asking because I would expect functions like getenv() to return const char *. But it returns char *. Is there any gotcha in returning const char * ?
Returning const char* is exactly the right thing to do in these circumstances.
Many older APIs don't use const since they pre-date the introduction of const in C90 (there was no const before then).
That is the right way to do it.
Many standard library functions take or return char * when logically it should be const char *, for historical reasons: pre-C89, there was no const; post-C89, their type signatures could not be changed without breaking application code. As a general rule, do not use the standard library as a style guide; it is very old and contains many things that are no longer considered good practice. (cough gets)
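A minimal sketch of the pattern from the question follows. The array contents and demo_get_some_details are made up for illustration, and the function is written with (void) rather than empty parentheses, which is the safer spelling in C before C23.

```c
#include <assert.h>
#include <string.h>

/* The global array the library exposes read-only (contents are invented). */
static const char details[] = "version 1.0";

/* Callers get a pointer through which they can read but not write. */
const char *get_some_details(void)
{
    return details;
}

/* Hypothetical self-check. */
void demo_get_some_details(void)
{
    const char *d = get_some_details();

    assert(strcmp(d, "version 1.0") == 0);
    /* char *bad = get_some_details();  would discard the const qualifier
       and draw a diagnostic; d[0] = 'V'; would not compile at all. */
}
```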
