Why does the last line of this segment compile and run in C (Visual Studio)? It is int* on the left and int on the right. The types do not match so I would think it is an error. In other words a data value should not be compatible with a memory address.
int x = 6;
int* baz = &x;
int* foo = &x;
foo = baz; /* ok */
foo = *baz; /* ?? */
I assumed it would not compile.
MSVC does give a warning for this. You must be ignoring the warning.
Elevate warnings to errors by using the /WX switch.
The C standard requires a diagnostic message for this because it violates the constraint for simple assignment in C 2018 6.5.16 1:
One of the following shall hold:
[list of six cases, none of which covers assigning an int value to a pointer]
However, even a compiler that conforms to the C standard is allowed to accept the program after issuing a diagnostic for a constraint violation (see the footnote), and that is what happened. Using /WX will prevent that.
Footnote
1 It is allowed to do this because there is no rule against it in the standard, and footnote 9 says:
… It [a C implementation] can also successfully translate an invalid program…
You can think of a memory address as an integer. Doing foo = *baz is the same as doing foo = 6.
In other words, your integer pointer foo is now pointing to memory at address 0x6. This still compiles, but if you try to dereference foo, you'll encounter undefined behavior.
Do both statements mean the same thing that p is pointing at address location 10?
On compilation, the first initialization gives some warning. What's the meaning of that?
#include <stdio.h>
int main()
{
int *p = 10;
int *q = (int *)10;
return 0;
}
output:
warning: initialization of ‘int *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
Both cases convert the integer 10 to a pointer type which is used to initialize an int *. The cast in the second case makes it explicit that this behavior is intentional.
While converting from an integer to a pointer is allowed, the assignment operator (and by extension, initialization) does not specifically allow this conversion, so a cast is required to be conforming. Many compilers will nevertheless allow this and simply issue a warning (as yours apparently does).
Note, however, that actually attempting to use a pointer that has been assigned a specific numeric value will most likely cause a crash unless you're on an embedded system that supports reading or writing specific memory addresses.
int *p = 10; is incorrect (constraint violation), and the compiler must produce a diagnostic message. The compiler could reject the program, and there is no behaviour defined if it doesn't. The rule is that the initializer for a pointer must be a compatible pointer value or a null pointer constant.
int *q = (int *)10; means to convert the integer 10 to a pointer. The result is implementation-defined and it could be a trap representation, meaning that the initialization causes undefined behaviour if execution reaches this line.
int and pointer to int (int*) are different types. The 10 on the first line is an int that you are trying to assign to a pointer to int. Hence the warning. (On 32-bit x86 both happen to have the same size, but consider that mostly coincidence; on x86-64 a pointer is typically 8 bytes while an int is 4.)
By casting the int to a pointer, like you do on the second line, you are telling the compiler "Hey, I know these are different types but I know what I'm doing, so go ahead and just treat the value 10 like a pointer because I really do want to point at the memory with an address of 10". (in almost every case the memory address of 10 is not going to be usable by you)
What would this statement yield?
void *p = malloc(sizeof(void));
Edit: An extension to the question.
If sizeof(void) yields 1 in GCC compiler, then 1 byte of memory is allocated and the pointer p points to that byte and would p++ be incremented to 0x2346? Suppose p was 0x2345. I am talking about p and not *p.
The type void has no size; that would be a compilation error. For the same reason you can't do something like:
void n;
EDIT.
To my surprise, doing sizeof(void) actually does compile in GNU C:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc -w - && ./a.out
1
However, in C++ it does not:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc++ -w - && ./a.out
<stdin>: In function 'int main()':
<stdin>:1: error: invalid application of 'sizeof' to a void type
<stdin>:1: error: 'printf' was not declared in this scope
If you are using GCC and you are not using compilation flags that remove compiler specific extensions, then sizeof(void) is 1. GCC has a nonstandard extension that does that.
In general, void is an incomplete type, and you cannot use sizeof for incomplete types.
Although void may stand in place for a type, it cannot actually hold a value. Therefore, it has no size in memory. Getting the size of a void isn’t defined.
A void pointer is simply a language construct meaning a pointer to untyped memory.
void has no size. In both C and C++, the expression sizeof (void) is invalid.
In C, quoting N1570 6.5.3.4 paragraph 1:
The sizeof operator shall not be applied to an expression that
has function type or an incomplete type, to the parenthesized name of
such a type, or to an expression that designates a bit-field member.
(N1570 is a draft of the 2011 ISO C standard.)
void is an incomplete type. This paragraph is a constraint, meaning that any conforming C compiler must diagnose any violation of it. (The diagnostic message may be a non-fatal warning.)
The C++ 11 standard has very similar wording. Both editions were published after this question was asked, but the rules go back to the 1989 ANSI C standard and the earliest C++ standards. In fact, the rule that void is an incomplete type to which sizeof may not be applied goes back exactly as far as the introduction of void into the language.
gcc has an extension that treats sizeof (void) as 1. gcc is not a conforming C compiler by default, so in its default mode it doesn't warn about sizeof (void). Extensions like this are permitted even for fully conforming C compilers, but the diagnostic is still required.
Taking the size of void is a GCC extension.
sizeof() cannot be applied to incomplete types, and void is an incomplete type that cannot be completed.
In C, sizeof(void) == 1 in GCC, but this appears to depend on your compiler.
In C++, I get:
In function 'int main()':
Line 2: error: invalid application of 'sizeof' to a void type
compilation terminated due to -Wfatal-errors.
To the 2nd part of the question: Note that sizeof(void *)!= sizeof(void).
On a 32-bit architecture, sizeof(void *) is 4 bytes, but that is the size of the pointer itself, not the amount by which p++ advances. The amount by which a pointer is incremented depends on the type it points to. With GCC treating sizeof(void) as 1, p++ advances p by 1 byte, so 0x2345 becomes 0x2346.
While sizeof(void) perhaps makes no sense in itself, it matters when you're doing pointer arithmetic.
For example:
void *p;
while(...)
p++;
If sizeof(void) is considered 1 then this will work.
If sizeof(void) is considered 0 then you hit an infinite loop.
Most C++ compilers chose to raise a compile error when trying to get sizeof(void).
When compiling C, gcc is not conforming and chose to define sizeof(void) as 1. It may look strange, but it has a rationale. In pointer arithmetic, adding or subtracting one unit means adding or subtracting the size of the object pointed to. Defining sizeof(void) as 1 therefore lets void* behave as a pointer to a byte (an untyped memory address). Otherwise you would get surprising behavior from pointer arithmetic, such as p+1 == p when p is a void*. Such pointer arithmetic on void pointers is not allowed in C++ but works fine when compiling C with gcc.
The standard recommended way would be to use char* for that kind of purpose (pointer to byte).
Another similar difference between C and C++ when using sizeof occurs when you defined an empty struct like:
struct Empty {
} empty;
Using gcc as my C compiler sizeof(empty) returns 0.
Using g++ the same code will return 1.
I'm not sure what the C and C++ standards say on this point, but I believe giving empty structs/objects a nonzero size helps with reference management: it avoids two references to different consecutive objects, the first one being empty, getting the same address. If references are implemented using hidden pointers, as is often done, ensuring different addresses helps when comparing them.
But this merely avoids one surprising behavior (corner-case comparison of references) by introducing another (empty objects, even PODs, consume at least 1 byte of memory).
Given the following C code, what is the difference between a = f; and a = (int *) f;?
float *f;
int *a;
...
a = f;
a = (int *) f;
float *f;
int *a;
a = f;
This assignment is erroneous (it is a C constraint violation): there is no implicit conversion between pointer types (except to and from void *). A compiler can refuse to compile a program containing this assignment.
Given:
float *f;
int *a;
This:
a = f;
is a constraint violation. It requires a diagnostic from any conforming compiler. After issuing the required diagnostic, it may or may not reject the program. (IMHO it should do so.) A conforming compiler may choose to accept it with a mere warning (which qualifies as a diagnostic), but once it does so the behavior of the program is undefined. Compilers that do this most commonly generate an implicit conversion from float* to int*, giving the same behavior as if there were a cast (an explicit conversion), but the standard does not require that.
Non-conforming compilers, of course, are free to do anything they like.
Conclusion: Don't write code like that. Even if your compiler lets you get away with it, another compiler might not. If you want to convert from one pointer type to another, use a cast. Aside from validity issues, the cast makes it much clearer to the reader that something funny is going on. If your compiler gave you a warning, heed it. If it didn't, find out how to increase the warning levels on your compiler.
This:
a = (int *) f;
takes the value of f (which is of type float*) and explicitly converts it to type int*, then assigns that int* value to a. (I'll assume that something between the declaration and the assignment has set f to some valid value.)
If f is a null pointer, the conversion is well defined, and yields a null pointer of type int*. The rules for converting a non-null object pointer to another pointer type are (quoting N1570 6.3.2.3p7):
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare equal
to the original pointer.
This kind of conversion, assuming int and float are the same size and have similar alignment requirements, is likely intended to let you treat a float object as if it were an int object. This is called "type-punning". If int and float aren't the same size, or if they have different alignment requirements, this can easily blow up in your face, crashing your program (if you're lucky) or giving you garbage results (if you're not). (Yes, crashing the program is a good outcome; it lets you know there's a problem.)
If you really need to do that for some reason, it's better to define a union with int and float members, or to use memcpy() to copy the contents of a float object into an int object.
But it very rarely makes sense to do that kind of thing. If you want to examine the representation of a float object, it's better to treat it as an array of unsigned char, something that the language standard explicitly permits.
6.5.16.1 Simple assignment
the left operand has atomic, qualified, or unqualified pointer type, and (considering
the type the left operand would have after lvalue conversion) both operands are
pointers to qualified or unqualified versions of compatible types, and the type pointed
to by the left has all the qualifiers of the type pointed to by the right.
So, a = f is a constraint violation and invokes undefined behavior.
In the second case you are converting f (by casting it) to be compatible with a's type. Casting is legal in C (not sure about other languages).
But note that after the cast f is still a pointer to float, so you have to cast it every time you assign it to a.
a = (int*) f; makes explicit that you want to cast a float* pointer to an int* pointer. Without it, you'll receive an incompatible pointer types error.
Your code will compile (at least on my Linux machine with gcc), but you will get a warning.
If you use a = f; and then use a somewhere in your code, you will get erroneous data, because a float is stored in a different format in memory. Even if you do the casting first you probably will get erroneous results, but the compiler sees your casting and assumes you know what you are doing.
a = f; //assignment
// is a constraint violation
a = (int *) f; //cast + assignment
Explicitly casting a float pointer to an int pointer simply hides compiler warnings or errors.
The program may still crash at run time, because the size of what it expects to find when dereferencing the pointer can differ from what is actually there.
I'm new to C++ and just trying to get a hang of it. It generally seems not too bad, but I stumbled upon this weird/pathological segfaulting behavior:
int main () {
int* b;
*b = 27;
int c = *b;
cout << "c points to " << c << endl; //OK
printf( "b points to %d\n", *b); //OK
// cout << "b points to " << (*b) << endl; - Not OK: segfaults!
return 0;
}
This program, as given, produces what you'd expect:
c points to 27
b points to 27
On the other hand, if you uncomment the second-to-last line, you get a program that crashes (seg-fault) in runtime. Why? This is a valid pointer.
int* b points to an unknown memory address because it wasn't initialized. If you initialized it to whatever null pointer value exists for your compiler (0 until C++11, nullptr in C++11 and newer), you'd most certainly get a segfault earlier. The problem lies in the fact that you allocated space for the pointer but not the data it points to. If you instead did this:
int c = 27;
int* b = &c;
cout << "c points to " << c << endl;
printf ("b points to %d\n", *b);
cout << "b points to " << (*b) << endl;
Things would work because int* b refers to a memory location that is accessible by your program (since the memory is actually a part of your program).
If you leave a pointer uninitialized or assign a null value to it, you can't use it until it points to a memory address that you KNOW you can access. For example, using dynamic allocation with the new operator will reserve memory for the data for you:
int* b = new int();
*b = 27;
int c = *b;
//output
delete b;
Update 3
My answer to Where exactly does C++ standard say dereferencing an uninitialized pointer is undefined behavior? gives a much better answer to why using an uninitialized pointer is undefined behavior. The basic logic from the C++ draft standard, section 24.2 Iterator requirements, specifically section 24.2.1 In general paragraph 5 and 10 which respectively say (emphasis mine):
[...][ Example: After the declaration of an uninitialized pointer x (as with int* x;), x must always be assumed to have a singular value of a pointer. —end example ] [...] Dereferenceable values are always non-singular.
Update 2
This was originally an answer to a C question with nearly identical circumstances but the original question I answered was merged with this one. I am updating my answer to include an answer specific to the new question and to the C++ draft standard.
b has not been initialized, and therefore its value is indeterminate, but you used indirection on b, which is undefined behavior.
One possible simple fix would be to assign b to the address of an existing variable, for example:
int a ;
int* b = &a;
Another option would have been to use dynamic allocation via new.
For completeness' sake, we can see this is undefined behavior by going to the draft C++ standard section 5.3.1 Unary operators paragraph 1, which says (emphasis mine):
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.[...]
and if we then go to section 3.10 Lvalues and rvalues, paragraph 1 says (emphasis mine):
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [...]
but b does not point to a valid object.
Original Answer
You did not allocate any memory for either f or b, but you used indirection on both, which is undefined behavior.
Update
It is worth noting that cranking up the warning levels should have indicated this was a problem, for example using gcc -Wall gives me the following warning for this code:
warning: 'f' is used uninitialized in this function [-Wuninitialized]
The simplest fix would be to assign f to point to a valid object like so:
char a ;
char *f = &a ;
Another option would be to use dynamic allocation; if you don't have a handy reference, the C FAQ is not a bad place to start.
For completeness' sake, the C99 draft standard, Annex J.2 Undefined behavior, paragraph 1 says:
The behavior is undefined in the following circumstances:
and includes the following bullet:
The value of an object with automatic storage duration is used while it is
indeterminate (6.2.4, 6.7.8, 6.8).
f and b are both automatic variables, and their values are indeterminate since they are not initialized.
It is not clear from reading the referenced sections which statement makes it undefined but section 6.5.2.5 Compound literals paragraph 17 which is part of normative text has an example with the following text which uses the same language and says:
[...]next time around p would have an indeterminate value, which would result in undefined behavior.
In the C11 draft standard the paragraph is 16.
The pointer is valid inasmuch as it has a value. But the memory it refers to is probably not. It's your OS telling you that you are touching memory that isn't yours.
I'm frankly surprised it doesn't crash earlier than that.
Here's why:
int* b; // b is uninitialized.
*b = 27;
Where does b point? It might be somewhere valid, or somewhere totally off-limits. You can usually bet on the latter.
Here's a better way to do what you want.
int b1 = 27;
int *b = &b1;
Now b points to the location on the stack where b1's value is stored.
This is because f is a pointer, and it needs to point at allocated memory before it is dereferenced.
General rule: initialize variable before using it
char* f; declares a variable. *f uses that variable. Like any variable, f must be initialized before it is used.
From this article.
Another use for declaring a variable as register and const is to inhibit any non-local change of that variable, even through taking its address and then casting the pointer. Even if you think that you yourself would never do this, once you pass a pointer (even with a const attribute) to some other function, you can never be sure that this might be malicious and change the variable under your feet.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
const int a = 81;
int *p = (int *)&a;
*p = 42; /* not allowed */
The author's point is that declaring a variable with register storage class prevents you from taking its address, so it can not be passed to a function that might change its value by casting away const.
void bad_func(const int *p) {
int *q = (int *) p; // casting away const
*q = 42; // potential undefined behaviour
}
void my_func() {
int i = 4;
const int j = 5;
register const int k = 6;
bad_func(&i); // ugly but allowed
bad_func(&j); // oops - undefined behaviour invoked
bad_func(&k); // constraint violation; diagnostic required
}
By changing potential UB into a constraint violation, a diagnostic becomes required and the error is (required to be) diagnosed at compile time:
c11
5.1.1.3 Diagnostics
1 - A conforming implementation shall produce at least one diagnostic message [...] if a preprocessing translation unit or translation unit
contains a violation of any syntax rule or constraint, even if the behavior is also explicitly
specified as undefined or implementation-defined.
6.5.3.2 Address and indirection operators
Constraints
1 - The operand of the unary & operator shall be [...] an lvalue that designates an object that [...] is
not declared with the register storage-class specifier.
Note that array-to-pointer decay on a register array object is undefined behaviour that is not required to be diagnosed (6.3.2.1:3).
Note also that taking the address of a register lvalue is allowed in C++, where register is just an optimiser hint (and a deprecated one at that).
Can we modify the value of a const variable?
Yes, you can modify a const variable through various means: pointer hackery, casts, etc.
Do read the next question!
Is it valid code to modify the value of a const variable?
No! What that gives you is Undefined Behavior.
Technically, your code example has an Undefined Behavior.
The program no longer adheres to the C standard once you modify the const object, and hence may give any result.
Note that Undefined Behavior does not mean that the compiler needs to report the violation with a diagnostic. In this case your code uses pointer hackery to modify a const object, and the compiler is not required to provide a diagnostic for it.
The C99 standard 3.4.3 says:
Undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Your code compiles, but it has undefined behavior.
The author's point is to use const and register so that the code no longer compiles:
const int a = 81;
int *p = (int *)&a; /* no compile error */
*p = 42; /* UB */
register const int b = 81;
int *q = (int *)&b; /* does not compile */
The code fragment indeed invokes undefined behavior.
I'm not really sure what the author's point is: in order to not let "foreign code" change the value of the variable you make it const so that... UB is invoked instead? How is that preferable? To be frank, it does not make sense.
I think the author is also talking about this case, which is a misunderstanding of const:
int a = 1;
int* const a_ptr = (int* const)&a; //cast not relevant
void function(int* const p){
    int* malicious = (int*)p; /* drops the const on the pointer, not on the int */
    *malicious = 2;
}
The variable itself is not constant, but the pointer is. The malicious code can convert it to a regular pointer and legally modify the variable it points to.
I don't understand how we can modify the value of a const variable by a pointer. Isn't it undefined behavior?
Yes, it is undefined behavior:
Quote from C18, 6.7.3/7:
"If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined."
But just because the behavior is undefined does not mean you cannot do it. As far as I know, compilers usually do not warn you when your program contains undefined behavior, which is a big problem.
Fortunately, in this case, when compiling for example:
#include <stdio.h>
int main(){
const int a = 25;
int *p = &a;
*p = 26;
printf("a = %d",a);
}
the compiler will throw a warning:
initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] (gcc)
or
warning: initializing 'int *' with an expression of type 'const int *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] (clang)
But even though the code contains undefined behavior, and you can never be sure what it will print on any given execution, that malicious program still gets compiled (unless you use the -Werror option, of course).
Can we modify the value of a const variable?
So, yes, unfortunately. One can actually modify a const object, but you should never do that, neither intentionally nor by accident.
Using the register keyword can be an effective safeguard because a variable declared register cannot have its address taken. This means you can neither assign its address to a pointer nor pass it to a function as an argument of the corresponding pointer type.