Do both statements mean the same thing that p is pointing at address location 10?
On compilation, the first initialization gives some warning. What's the meaning of that?
#include <stdio.h>
int main()
{
int *p = 10;
int *q = (int *)10;
return 0;
}
output:
warning: initialization of ‘int *’ from ‘int’ makes pointer from integer without a cast [- Wint-conversion]
Both cases convert the integer 10 to a pointer type which is used to initialize an int *. The cast in the second case makes it explicit that this behavior is intentional.
While converting from an integer to pointer is allowed, the assignment operator (and by extension, initialization) does not specifically allow this conversion, so a cast it required to be conforming. Many compilers however will still allow this and simply issue a warning (as your apparently does).
Note however that actually attempting to use a pointer that is assigned a specific numeric value will most likely cause a crash unless you're on a embedded system that supports reading or writing specific memory addresses.
int *p = 10; is incorrect (constraint violation), and the compiler must produce a diagnostic message. The compiler could reject the program, and there is no behaviour defined if it doesn't. The rule is that the initializer for a pointer must be a compatible pointer value or a null pointer constant.
int *q = (int *)10; means to convert the integer 10 to a pointer. The result is implementation-defined and it could be a trap representation, meaning that the initialization causes undefined behaviour if execution reaches this line.
int and pointer to an integer int* are different types. The 10 on the first line is an int that you are trying to assign to a pointer to int type. Hence the warning. (on X86 both share the same size, but consider that mostly coincidence at this point).
By casting the int to a pointer, like you do on the second line, you are telling the compiler "Hey, I know these are different types but I know what I'm doing, so go ahead and just treat the value 10 like a pointer because I really do want to point at the memory with an address of 10". (in almost every case the memory address of 10 is not going to be usable by you)
Is it legal to cast a pointer to an array of ints to an int pointer?
int arr[4];
int (*a)[4] = &arr;
int *p = (int*)a;
C 2018 6.3.2.3 7 says we can convert an int (*)[4] to an int *:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined…
The alignment is necessarily correct since an array of int must have the alignment required for an int.
However, the only thing the C standard says about the value resulting from this conversion is:
… when converted back again, the result shall compare equal to the original pointer.
This means that an int * can temporarily hold the value of an int (*)[4]. If we execute:
int arr[4];
int (*x)[4] = &arr;
int *y = (int *) x;
int (*z)[4] = (int (*)[4]) y;
then we know x == z is true because the standard tells us that. But we do not know what y is. Because the standard permits different types of pointers to have different representations (use the bits that represent their values in different ways), it is possible that y has no useful meaning as an int *. The C standard does not say the converted pointer can be used to access objects.
Most C implementations either support this deliberately or as an artifact of how they are designed. However, in terms of what the C standard specifies, no guarantee is given.
If the original pointer's initialized to either NULL or a valid pointer to an int[4], then yes. Pointer casts must not violate alignment requirements lest you get UB. A cast such as what I've described won't violate such requirement ̶a̶n̶d̶ ̶f̶u̶r̶t̶h̶e̶r̶m̶o̶r̶e̶ ̶i̶t̶ ̶w̶i̶l̶l̶ ̶b̶e̶ ̶u̶s̶a̶b̶l̶e̶ ̶f̶o̶r̶ ̶d̶e̶r̶e̶f̶e̶r̶e̶n̶c̶i̶n̶g̶ ̶b̶e̶c̶a̶u̶s̶e̶ ̶i̶f̶ ̶t̶h̶e̶ ̶̶i̶n̶t̶(̶*̶a̶)̶[̶4̶]̶̶ ̶i̶s̶ ̶v̶a̶l̶i̶d̶ ̶a̶n̶d̶ ̶n̶o̶n̶n̶u̶l̶l̶ ̶t̶h̶e̶n̶ ̶t̶h̶e̶r̶e̶ ̶i̶n̶d̶e̶e̶d̶ ̶i̶s̶ ̶a̶n̶ ̶̶i̶n̶t̶̶ ̶a̶t̶ ̶̶(̶i̶n̶t̶*̶)̶a̶̶.
If you feel uneasy about pointer casts (as you should), you can effect the conversion in this case without casting by simply doing *a (will get int[4] which will decay to int*) or a[0] or &a[0][0] or &(*a)[0]. That way, you can also dereference the result while adhering to the letter of the standard.
Take a look at the following program. What I don't understand is why do I have to cast the address of the variable x to char* when it actually would be absolutely useless if you think about it for a second. All I really need is only the address of the variable and all the necessary type information is already in place provided by the declaration statement char* ptr.
#include <stdio.h>
int main(void) {
int x = 0x01020309;
char* ptr = &x; /* The GCC compiler is going to complain here. It will
say the following: "warning: initialization from
incompatible pointer type [enabled by default]". I
need to use the cast operator (char*) to make the
compiler happy. But why? */
/* char* ptr = (char*) &x; */ /* this will make the compiler happy */
printf("%d\n", *ptr); /* Will print 9 on a little-endian machine */
return 0;
}
The C Standard, 6.2.5 Types, paragraph 28 states:
A pointer to void shall have the same representation and
alignment requirements as a pointer to a character type.
Similarly, pointers to qualified or unqualified versions of
compatible types shall have the same representation and
alignment requirements. All pointers to structure types shall have
the same representation and alignment requirements as each other.
All pointers to union types shall have the same
representation and alignment requirements as each other.
Pointers to other types need not have the same representation or alignment requirements.
Since different types of pointers can have differing implementations or constraints, you can't assume it's safe to convert from one type to another.
For example:
char a;
int *p = &a
If the implementation has an alignment restriction on int, but not on char, that would result in a program that could fail to run.
This is because pointers of different types point to blocks of memory of different sizes even if they point to the same location.
&x is of type int* which tells the compiler the number of bytes (depending on sizeof(int)) to read when getting data.
Printing *(&x) will return the original value you entered for x
Now if you just do char* ptr = &x; the compiler assigns the address in &x to your new pointer (it can as they are both pointers) but it warns you that you are changing the size of the block of memory being addressed as a char is only 1 byte. If you cast it you are telling the compiler that this is what you intend.
Printing *(ptr) will return only the first byte of the value of x.
You are correct that it makes no practical difference. The warning is there to inform you that there might be something fishy with that assignment.
C has fairly strong type-checking, so most compilers will issue a warning when the types are not compatible.
You can get rid of the warning by adding an explicit cast (char*), which is you saying:
I know what I'm doing, I want to assign this value to my char* pointer even if the types don't match.
Its just simple as you assign integer type to character. similarly you are trying to assign integer type pointer to character type pointer.
Now why is so because this is how c works, if you increment a character pointer it will give you one byte next address and incrementing integer pointer will give you 2 byte next address.
According to your code, x is of type int. So the pointer that points to x should be of type int *. Compiler gives such error because you use a pointer which is not int *.
So make your pointer either int *, or void * then you don't need cast.
I have a void pointer pointing to a memory address. Then, I do
int pointer = the void pointer
float pointer = the void pointer
and then, dereference them go get the values.
{
int x = 25;
void *p = &x;
int *pi = p;
float *pf = p;
double *pd = p;
printf("x: n%d\n", x);
printf("*p: %d\n", *(int *)p);
printf("*pi: %d\n", *pi);
printf("*pf: %f\n", *pf);
printf("*pd: %f\n", *pd);
return 0;
}
The output of dereferencing pi(int pointer) is 25.
However the output of dereferencing pf(float pointer) is 0.000.
Also dereferncing pd(double pointer) outputs a negative fraction that keeps
changing?
Why is this and is it related to endianness(my CPU is little endian)?
As per C standard, you'er allowed to convert any pointer to void * and convert it back, it'll have the same effect.
To quote C11, chapter §6.3.2.3
[...] A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
That is why, when you cast the void pointer to int *, de-reference and print the result, it prints properly.
However, standard does not guarantee that you can dereference that pointer to be of a different data type. It is essentially invoking undefined behaviour.
So, dereferencing pf or pd to get a float or double is undefined behavior, as you're trying to read the memory allocated for an int as a float or double. There's a clear case of mismtach which leads to the UB.
To elaborate, int and float (and double) has different internal representations, so trying to cast a pointer to another type and then an attempt to dereference to get the value in other type won't work.
Related , C11, chapter §6.5.3.3
[...] If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined.
and for the invalid value part, (emphasis mine)
Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an
address inappropriately aligned for the type of object pointed to, and the address of an object after the
end of its lifetime.
In addition to the answers before, I think that what you were expecting could not be accomplished because of the way the float numbers are represented.
Integers are typically stored in Two's complement way, basically it means that the number is stored as one piece. Floats on the other hand are stored using a different way using a sign, base and exponent, Read here.
So the main idea of convertion is impossible since you try to take a number represented as raw bits (for positive) and look at it as if it was encoded differently, this will result in unexpected results even if the convertion was legit.
So... here's probably what's going on.
However the output of dereferencing pf(float pointer) is 0.000
It's not 0. It's just really tiny.
You have 4-byte integers. Your integer looks like this in memory...
5 0 0 0
00000101 00000000 00000000 00000000
Which interpreted as a float looks like...
sign exponent fraction
0 00001010 0000000 00000000 00000000
+ 2**-117 * 1.0
So, you're outputting a float, but it's incredibly tiny. It's 2^-117, which is virtually indistinguishable from 0.
If you try printing the float with printf("*pf: %e\n", *pf); then it should give you something meaningful, but small. 7.006492e-45
Also dereferncing pd(double pointer) outputs a negative fraction that keeps changing?
Doubles are 8-bytes, but you're only defining 4-bytes. The negative fraction change is the result of looking at uninitialized memory. The value of uninitialized memory is arbitrary and it's normal to see it change with every run.
There are two kinds of UBs going on here:
1) Strict aliasing
What is the strict aliasing rule?
"Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other.)"
However, strict aliasing can be turned off as a compiler extension, like -fno-strict-aliasing in GCC. In this case, your pf version would function well, although implementation defined, assuming nothing else has gone wrong (usually float and int are both 32 bit types and 32 bit aligned on most computers, usually). If your computer uses IEEE754 single, you can get a very small denorm floating point number, which explains for the result you observe.
Strict aliasing is a controversial feature of recent versions of C (and considered a bug by a lot of people) and makes it very difficult and more hacky than before to do reinterpret cast (aka type punning) in C.
Before you are very aware of type punning and how it behaves with your version of compiler and hardware, you shall avoid doing it.
2) Memory out of bound
Your pointer points to a memory space as large as int, but you dereference it as double, which is usually twice of the size of an int, you are basically reading half a double of garbage from somewhere in the computer, which is why your double keeps changing.
The types int, float, and double have different memory layouts, representations, and interpretations.
On my machine, int is 4 bytes, float is 4 bytes, and double is 8 bytes.
Here is how you explain the results you are seeing.
Derefrencing the int pointer works, obviously, because the original data was an int.
Derefrencing the float pointer, the compiler generates code to interpret the contents of 4 bytes in memory as a float. The value in the 4 bytes, when interpreted as a float, gives you 0.00. Lookup how float is represented in memory.
Derefrencing the double pointer, the compiler generates code to interpret the contents in memory as a double. Because a double is larger than an int, this accesses the 4 bytes of the original int, and an extra 4 bytes on the stack. Because the contents of these extra 4 bytes is dependent on the state of the stack, and is unpredictable from run to run, you see the varying values that correspond to interpreting the entire 8 bytes as a double.
In the following,
printf("x: n%d\n", x); //OK
printf("*p: %d\n", *(int *)p); //OK
printf("*pi: %d\n", *pi); //OK
printf("*pf: %f\n", *pf); // UB
printf("*pd: %f\n", *pd); // UB
The accesses in the first 3 printfs are fine as you are accessing int through the lvalue type of type int. But the next 2 are not fine as the violate 6.5, 7, Expressions.
An int * is not a compatible type with a float * or double *. So the accesses in the last two printf() calls cause undefined behaviour.
C11, $6.5, 7 states:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
The term "C" is used to describe two languages: one invented by K&R in which pointers identify physical memory locations, and one which is derived from that which works the same in cases where pointers are either read and written in ways that abide by certain rules, but may behave in arbitrary fashion if they are used in other ways. While the latter language is defined the by the Standards, the former language is what became popular for microcomputer programming in the 1980s.
One of the major impediments to generating efficient machine code from C code is that compilers can't tell what pointers might alias what variables. Thus, any time code accesses a pointer that might point to a given variable, generated code is required to ensure that the contents of the memory identified by the pointer and the contents of the variable match. That can be very expensive. The people writing the C89 Standard decided that compilers should be allowed to assume that named variables (static and automatic) will only be accessed using pointers of their own type or character types; the people writing C99 decided to add additional restrictions for allocated storage as well.
Some compilers offer means by which code can ensure that accesses using different types will go through memory (or at least behave as though they are doing so), but unfortunately I don't think there's any standard for that. C14 added a memory model for use with multi-threading which should be capable of achieving required semantics, but I don't think compilers are required to honor such semantics in cases where they can tell that there's no way for outside threads to access something [even if going through memory would be necessary to achieve correct single-thread semantics].
If you're using gcc and want to have memory semantics that work as K&R intended, use the "-fno-strict-aliasing" command-line option. To make code efficient it will be necessary to make substantial use of the "restrict" qualifier which was added in C99. While the authors of gcc seem to have focused more on type-based aliasing rules than "restrict", the latter should allow more useful optimizations.
I read that assigning a pointer to a type to another pointer to another type is illegal; for example, in this book:
C How To Program 7 ed. pag. 299
Common Programming Error 7.7
Assigning a pointer of one type to a
pointer of another type if neither is of type void * is a syntax
error.
or at this URL:
but hower in the C11 standard is written:
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare equal
to the original pointer.
So I understand that only if there is an alignment problem the behavior is undefined.
In fact compilers like GCC 4.8 or cl 2013 emits only a warning of assignment from incompatible pointer type.
That exception is not present for void pointer.
So:
int a = 10;
float b = 100.22f;
int *p_a = &a;
float *p_f = &b;
p_a = p_f; // NOT ILLEGAL but WARNING
p_f = p_a; // converted back again. it still works
void *v_p = p_a; // NOT ILLEGAL NO WARNING
p_a = v_p; // converted back again. it still works
Do I understand well? Or Am I missing something?
P.S.
Can also anyone show me an example of an "alignment problem"?
You can consider a pointer to be an offset in memory to pointed data, and no matter what type it is. Any standard pointer (not smart ones) has fixed size, depending on system (32 bit or 64). So you can assign them each to other, modern compiler will warn you about dangerous operations, BUT problem is when you try to dereference pointer of incompaitable type. When dereferencing, app looks up the number of bytes corresponding to pointed type, so in next example
int b = 10;
int* pB = &b;
double* pA = pB;
on dereference pB you will get 10 (size of int is 4 bytes), and on dereference pA you will get trash or crash because next 4 bytes (double is 8 bytes) can be memory space allocated for another variable.
Conclusion: keep track of pointer types when assigning them to each others and especially if you assign them indirectly, using void* as intermediary.
Legality or illegality of pointer assignment is compiler's matter, the old ones didn't warn you. Assigning to void* is not considered "illegal" maybe because it is generic, unreferencable pointer type (it has undefined referencing size and that restriction is c language restriction), so you can't receive error like in example above.