Let's take the following two functions:
#include<stdio.h>
void my_print1(char* str) {
// str = "OK!";
printf("%s\n", str);
}
void my_print2(char* const str) {
// str = "OK!";
printf("%s\n", str);
}
They both produce the same assembly:
How then is the const-ness enforced here? For example, if I un-comment str = "OK!"; it will of course work in the first function but not the second (error: assignment of read-only parameter ‘str’).
Is the const-ness of a local variable just a compiler construct that the compiler is responsible for checking, or how does it work if the assembly for the two functions is the same? Note: this is C only, not C++ (as I think they treat const differently).
Correct, on most implementations it's just a compiler construct.
On a typical mainstream OS implementation, there is a way to place const objects having static storage duration in memory that is actually write-protected by the CPU's memory-management unit (MMU), e.g. a .text or .rodata section. Then attempts to write it, if not prevented at compile time, will cause a trap at runtime. But hardware write protection applies to large blocks of memory (e.g. whole pages). There is no good way to do this with auto objects, such as local variables or function parameters, which live in stack memory or in registers. On the stack, since they are mixed in with non-const variables, hardware write protection is not fine-grained enough to apply to them, and in any case would be very expensive to be continually changing (it needs a call to the operating system). And registers on most machines cannot be made read-only at all.
Since there is no good way to protect them at runtime, compilers often generate the exact same code for const auto objects as for non-const.
You might see differences in some cases, since const informs the compiler that the object's value is not supposed to change, and therefore the compiler can assume that it does not. For instance, if you pass a pointer to a non-const object to another function, the compiler has to assume that the value of the object may have been changed, and will reload it from memory after each function call. But a const object may have its value cached in a register across the function call, or just optimized into an immediate constant if possible.
The const qualifier on function parameters and other local variables normally has no effect on the generated code. It just tells the compiler to prevent assigning to the variable.
Theoretically, it could generate code that prevents modifying the variable through other means. E.g. if you had
void my_print2(char* const str) {
*(char **)&str = "OK!";
printf("%s\n", str);
}
The assignment causes undefined behavior, but won't cause an error (although the compiler might warn about casting away constness). But the compiler could theoretically store str in memory that's marked read-only; in that case, the assignment would cause a segmentation fault. This would not normally be done for function parameters because it's difficult to reconcile that with using the stack for automatic data. (Nate Eldredge's answer explains this better.)
The compiler enforces const by refusing to generate non-compliant code. If the my_print2 source had attempted to modify str, the compiler would have issued an error message.
With respect to your code however:
void my_print2(char* const str)...
It's kind of pointless as it can only limit what the function can do with the value of the pointer itself, not what it can do to memory it points to, so:
void my_print2(char* const str)
{
str++; // Not allowed, compile time error.
*str = 'A'; // Okay.
}
Functions that do not need to modify the content of what is pointed to, should be declared as const <type> * to give callers confidence that your function, print in this case, won't change their data:
void my_print3(const char* str)
{
str++; // Okay
*str = 'A'; // Not allowed, compile time error.
}
void my_print4(char const* str)
{
str++; // Okay
*str = 'A'; // Not allowed, compile time error.
}
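As a rough sketch of how the two placements of const combine, the helper below takes const char *const str, so it can neither modify the caller's characters nor reseat its own parameter. The name count_upper is invented for illustration and does not appear in the question.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts uppercase letters without modifying the
   caller's data (const char) or its own parameter (* const). */
size_t count_upper(const char *const str)
{
    size_t count = 0;
    /* str itself is const, so iterate with a separate, modifiable cursor. */
    for (const char *p = str; *p != '\0'; p++) {
        if (isupper((unsigned char)*p))
            count++;
    }
    return count;
}
```

Writing str++ or *str = 'A' inside this function would be rejected at compile time, while the local cursor p can still be advanced freely.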
Related
Does the following scenario have undefined behavior?
void do_stuff(const int *const_pointer, int *pointer) {
printf("%i\n", *const_pointer);
*pointer = 1;
}
int data = 0;
do_stuff(&data, &data);
If this is undefined behavior it could probably cause problems if the compiler assumes that the value that const_pointer points to never changes. In this case it might reorder both instructions in do_stuff and thereby change the behavior from the intended printf("0") to printf("1").
If the compiler can prove that the value pointed to by a pointer to const will not change, then it will not need to reload the value or keep the ordering.
In this case this cannot be done because the two pointers may alias, so the compiler cannot assume the pointers don't point to the same object. (The call to the function might be done from a different translation unit, the compiler doesn't know what is passed to the function.)
There is no undefined behavior in your example, and you are guaranteed that the printf will output 0.
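A small sketch of the aliasing case, assuming a variant of do_stuff that returns what it saw instead of printing it. The function name observe_alias and the before/after encoding are made up for this illustration.

```c
/* Hypothetical variant of do_stuff that reports what it observed.
   Returns before * 10 + after so both reads can be inspected. */
int observe_alias(const int *const_pointer, int *pointer)
{
    int before = *const_pointer;  /* read through the pointer to const */
    *pointer = 1;                 /* may modify the very same object */
    int after = *const_pointer;   /* has to be read again, not reused */
    return before * 10 + after;
}
```

When both arguments are &data with data initially 0, the first read yields 0 and the second yields 1. The pointer to const only restricts writes made through it; it says nothing about writes made through other pointers.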
A pointer to const, such as const_pointer, merely means that the data cannot be changed through that pointer.
When you pass a pointer to your function, it's entirely up to you to decide whether it should be const or not. const is a protection tool for you to prevent unwanted changes to your data. But without const, a function can still work. For example, your own version of strcpy can be written either as:
strcpy( char *s, const char *t );
or,
strcpy( char *s, char *t ); // without const, still works, just not as good as the first version
So there shouldn't be anything unexpected in your code: you cannot modify data via const_pointer, but you can modify it via pointer (even when the two pointers are pointing to the same location).
I know global const is stored in .rodata
Also, I know variables declared in functions are stored on the stack. However, since const is supposed to be read-only, is there a special section of the stack for them? How are accesses to them controlled?
What you really should know: If an object is declared as const, the compiler will not easily let you attempt to modify it, and if you get around the compiler, then any attempt to modify the object is undefined behaviour. That's it. Nothing else. Forget about .rodata or anything you learned, what counts is that an attempt to modify a const object is undefined behaviour.
What I mean by "the compiler doesn't let you" and getting around it:
const int x = 5;
x = 6; // Not allowed by compiler
int* p = &x; *p = 6; // Not allowed by compiler
int* p = (int*)&x; *p = 6; // Allowed by compiler, undefined behaviour.
Executing the last statement can crash, or change x to 6, or change x to 999, or leave x unchanged, or make it behave in a schizophrenic way where it is 5 at some times and 6 at other times, including x == x being false.
A const local variable may not be stored at all when it is initialized with a constant expression. Consider the following code:
int foo(int param)
{
const int value = 10;
return param + value;
}
It is likely that an optimizing compiler will generate assembly code with e.g. an add operation in which value is replaced by the literal 10.
Other than that, many compilers place const locals in the stack frame, just as they do "ordinary" automatic variables, so any protection you get comes from the compiler itself.
No, in general there is not any "constant" area of the stack. But that's okay, because what const really means is "I promise not to try to modify this". It does not mean "put this in read-only memory so that we're guaranteed to get a bus error if I goof".
There's no readonly stack segment, because it would have to be writable for the initializations, which happen every time the function is entered. Every function call would have an unreasonable amount of overhead, asking the kernel to change the page protection, initialize the variables, then change it back.
rodata works because the statically allocated const variables are only initialized once.
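To make the contrast concrete, here is a minimal sketch with invented names (base_weights, scaled_sum). The file-scope table has a constant initializer and is set up once, whereas the const local depends on the argument and therefore has to be written on every call.

```c
/* File-scope const data: initialized once, before the program runs, so a
   typical toolchain is free to place it in a read-only section. */
static const int base_weights[3] = {1, 2, 3};

/* Hypothetical function: 'scaled' is const, yet its values depend on the
   argument, so its storage has to be written each time the function runs. */
int scaled_sum(int factor)
{
    const int scaled[3] = {
        base_weights[0] * factor,
        base_weights[1] * factor,
        base_weights[2] * factor
    };
    return scaled[0] + scaled[1] + scaled[2];
}
```

Where the compiler actually keeps scaled (stack, registers, or nowhere at all after optimization) is an implementation detail; the point is only that it cannot be pre-initialized read-only storage.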
Sorry for being such a dumb here. Can't sort this out for myself.
In a header file there is a Macro like this.
#define kOID "1.3.6.1.4.1.1.1.2.4.0"
How to declare and initialize a char pointer to this data without creating a copy of this string?
Preprocessor macros are nothing but a textual substitution. Thus if you write
const char *pointer = kOID;
the preprocessor will substitute the text with
const char *pointer = "1.3.6.1.4.1.1.1.2.4.0";
One thing to bear in mind is that the const specifier is strongly advisable: once the textual substitution is made, the pointer refers to a string literal, which is typically placed in a read-only segment and which must not be modified.
Also be careful to have the macro visible at the point where you'd like to declare that pointer.
Assuming that you're not planning to change the contents of this string, you can simply use:
char* p = kOID;
The string will typically reside in a read-only section of the program, so any attempt to change its contents is undefined behavior and will usually result in a memory access violation at runtime. So for your own safety, you should generally use:
const char* p = kOID;
Thus, any attempt to change the contents of the string pointed by p will lead to a compile-time error instead of a runtime error. The former is typically much easier to track-down and fix than the latter.
To summarize the const issue, here are the options that you can use:
char* p = kOID;
char* const p = kOID; // compilation error if you change the pointer
const char* p = kOID; // compilation error if you change the pointed data
const char* const p = kOID; // compilation error if you change either one of them
UPDATE - Memory Usage Considerations:
Please note that every such declaration may result in additional memory usage, adding up to the length of the string plus one character, plus 4 or 8 bytes for the pointer (depending on your system). Now, the pointer is perhaps less of an issue, but the string itself might yield extensive memory usage if you instantiate it in several places in the code. So if you're planning to use the string in various places within your program, then you should probably declare it globally in one place.
In addition, please note that the string may reside either in the code-section of the program or in the data-section of the program. Depending on your memory partitions, you may prefer having it in one place over the other.
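One possible way to follow the "declare it globally in one place" advice is sketched below. The names oid_string and get_oid are assumptions made for the example, and the macro is repeated here only so the snippet compiles on its own.

```c
#define kOID "1.3.6.1.4.1.1.1.2.4.0"  /* normally comes from the header */

/* One shared pointer to the literal. In a multi-file program this definition
   would live in a single .c file, with an extern declaration in a header. */
const char *const oid_string = kOID;

/* Hypothetical accessor, only to show that every caller gets the same
   pointer rather than its own instance of the string. */
const char *get_oid(void)
{
    return oid_string;
}
```

Whether the toolchain would have merged identical literals anyway is implementation-specific, so a single definition is the more predictable choice.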
Include the header file first.
#include <header.h>
Add the defined constant
char * s = kOID;
This will compile fine. However, as kOID is a string literal, it is typically stored in read-only memory of your program, so modifying the string through s is likely to cause a segmentation fault. The workaround is to make s a pointer to const.
const char * s = kOID;
Now when you compile the program, the compiler will check every assignment through s and report an error accordingly.
a.c: In function ‘main’:
a.c:10:5: error: assignment of read-only location ‘*s’
So you'll be safe.
To add to what has been said by others, also you can initialize your array this way:
const char some_string[] = kOID;
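This is similar to const char *const some_string = kOID;, except that some_string is an array object initialized from the literal rather than a pointer to it. It may therefore lead to additional memory allocation, but this depends on the compiler.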
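The difference between the array form and the pointer form shows up in sizeof. The sketch below uses invented names (oid_array, oid_pointer and the two size helpers) and repeats the macro so that it stands alone.

```c
#include <stddef.h>

#define kOID "1.3.6.1.4.1.1.1.2.4.0"  /* normally comes from the header */

const char oid_array[] = kOID;         /* array object holding the characters */
const char *const oid_pointer = kOID;  /* pointer to the string literal */

/* sizeof on the array yields the string length plus the terminator. */
size_t oid_array_size(void)
{
    return sizeof oid_array;
}

/* sizeof on the pointer yields only the size of a pointer. */
size_t oid_pointer_size(void)
{
    return sizeof oid_pointer;
}
```

The array form is handy when the length is needed at compile time; the pointer form is enough when the string is only passed around.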
For straight C and GCC, why doesn't the pointed-to string get corrupted here?
#include <stdio.h>
int main(int argc, char *argv[])
{
char* str_ptr = NULL;
{
//local to this scope-block
char str[4]={0};
sprintf(str, "AGH");
str_ptr = str;
}
printf("str_ptr: %s\n", str_ptr);
getchar();
return 0;
}
|----OUTPUT-----|
str_ptr: AGH
|--------------------|
Here's a link to the above code compiled and executed using an online compiler.
I understand that if str was a string literal, str would be stored in the bss ( essentially as a static ), but sprintf(ing) to a stack-allocated buffer, I thought the string buffer would be purely stack-based ( and thus the address meaningless after leaving the scope block )? I understand that it may take additional stack allocations to over-write the memory at the given address, but even using a recursive function until a stack-overflow occurred, I was unable to corrupt the string pointed to by str_ptr.
FYI I am doing my testing in a VS2008 C project, although GCC seems to exhibit the same behavior.
While nasal lizards are a popular part of C folklore, code whose behaviour is undefined can actually exhibit any behaviour at all, including magically resuscitating variables whose lifetime has expired. The fact that code with undefined behaviour can appear to "work" should neither be surprising nor an excuse to neglect correcting it. Generally, unless you're in the business of writing compilers, it's not very useful to examine the precise nature of undefined behaviour in any given environment, especially as it might be different after you blink.
In this particular case, the explanation is simple, but it's still undefined behaviour, so the following explanation cannot be relied upon at all. It might at any time be replaced with reptilian emissions.
Generally speaking, C compilers will make each function's stack frame a fixed size, rather than expanding and contracting as control flow enters and leaves internal blocks. Unless called functions are inlined, their stack frames will not overlap with the stack frame of the caller.
So, in certain C compilers with certain sets of compile options and except for particular phases of the moon, the character array str will not be overwritten by the call to printf, even though the variable's lifetime has expired.
Most likely the compiler does some sort of simple optimizations resulting in the string still being in the same place on the stack. In other words, the compiler allows the stack to grow to store 'str'. But it doesn't shrink the stack in the scope of main, because it is not required to do so.
If you really want to see the result of saving the address of variables on the stack, call a function.
#include <stdio.h>
char * str_ptr = NULL;
void onstack(void)
{
char str[4] = {0};
sprintf(str,"AGH");
str_ptr = str;
}
int main(int argc, char *argv[])
{
onstack();
int x = 0x61626364;
printf("str_ptr: %s\n", str_ptr);
printf("x:%i\n",x);
getchar();
return 0;
}
With gcc -O0 -std=c99 strcorrupt.c I get random output on the first printf. It will vary from machine to machine and architecture to architecture.
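For comparison, a well-defined alternative is to let the caller own the storage. This is only a sketch; fill_label is a hypothetical replacement for onstack() and not something from the question.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical replacement for onstack(): the caller supplies the buffer,
   so the text stays valid for as long as the caller's buffer lives.
   Returns the length the full text needs, as snprintf does. */
int fill_label(char *buffer, size_t size)
{
    /* snprintf writes at most 'size' bytes, terminator included. */
    return snprintf(buffer, size, "%s", "AGH");
}
```

A caller would declare char str[4]; in its own scope, call fill_label(str, sizeof str), and use str only while that array is still alive.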
I'm wondering whether static constant variables are thread-safe or not?
Example code snippet:
void foo(int n)
{
static const char *a[] = {"foo","bar","egg","spam"};
if( ... ) {
...
}
}
Any variable that is never modified, whether or not it's explicitly declared as const, is inherently thread-safe.
const is not a guarantee from the compiler that a variable is immutable. const is a promise that you make to the compiler that a variable will never be modified. If you go back on that promise, the compiler will generate an error pointing that out to you, but you can always silence the compiler by casting away constness.
To be really safe you should do
static char const*const a[]
This prevents modification of both the data and the pointers in the table.
BTW, I prefer to write the const after the typename such that it is clear at a first glance to where the const applies, namely to the left of it.
In your example the pointer itself can be considered as thread safe. It will be initialized once and won't be modified later.
However, the content of the memory pointed to won't be thread-safe at all.
In this example, a is not const. It's an array of pointers to const strings. If you want to make a itself const, you need:
static const char *const a[] = {"foo","bar","egg","spam"};
Regardless of whether it's const or not, it's always safe to read data from multiple threads if you do not write to it from any of them.
As a side note, it's usually a bad idea to declare arrays of pointers to constant strings, especially in code that might be used in shared libraries, because it results in lots of relocations and the data cannot be located in actual constant sections. A much better technique is:
static const char a[][5] = {"foo","bar","egg","spam"};
where 5 has been chosen such that all your strings fit. If the strings are variable in length and you don't need to access them quickly (for example if they're error messages for a function like strerror to return) then storing them like this is the most efficient:
static const char a[] = "foo\0bar\0egg\0spam\0";
and you can access the nth string with:
const char *s;
int i;
for (i=0, s=a; i<n && *s; i++, s+=strlen(s)+1);
return s;
Note that the final \0 is important. It causes the string to have two 0 bytes at the end, thus stopping the loop if n is out of bounds. Alternatively you could bounds-check n ahead of time.
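Put together as a complete function, the packed-table lookup might look like the sketch below. The names packed and nth_string are invented for the example, and a negative n is simply treated like 0.

```c
#include <string.h>

/* Strings packed back to back. The explicit trailing \0 plus the implicit
   terminator form an empty string that marks the end of the table. */
static const char packed[] = "foo\0bar\0egg\0spam\0";

/* Hypothetical accessor: returns the nth string, or "" if n is past the
   last entry. */
const char *nth_string(int n)
{
    const char *s = packed;
    for (int i = 0; i < n && *s != '\0'; i++)
        s += strlen(s) + 1;  /* skip this string and its terminator */
    return s;
}
```

The lookup is linear in n, which is the trade-off this layout makes in exchange for needing no pointers and no relocations.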
static const char *a[] = {"foo","bar","egg","spam"};
In C that would always be thread-safe: the structures are already created at compile time, so no extra action is taken at run time and no race condition is possible.
Beware of C++ compatibility, though. There, a static const object with a non-constant initializer is initialized on first entry into the function, and before C++11 the language did not guarantee that this initialization was thread-safe. In other words, under the older rules this is open to a race condition when two different threads enter the function simultaneously and try to initialize the object in parallel.
But even in C++, POD (plain old data: structures not using C++ features, like in your example) would behave in the C compatible way.