This question already has answers here:
Initialization from incompatible pointer type warning when assigning to a pointer
(5 answers)
Closed 4 years ago.
Can anyone please tell me the difference between
int vector[] = {10, 20, 30};
int *p = vector;
and
int *p = &vector;
I know that mentioning the array name gives us its base address. The second statement gives me this warning:
initialisation from incompatible pointer type
Why the warning, when both statements give the array's base address?
When used in an expression, an array name will in most cases decay to a pointer to its first element.
So this:
int *p = vector;
Is equivalent to:
int *p = &vector[0];
The type of &vector[0] is int *, so the type is compatible.
One of the few times an array does not decay is when it is the operand of the address-of operator &. Then &vector is the address of the array and has type int (*)[3], i.e. a pointer to an array of 3 ints. This is not compatible with int *, hence the warning.
Although the address of the array and the address of its first element have the same value, their types are different.
&vector is of type int (*)[3] i.e. a pointer to an array, whereas vector without & would decay to int * pointing to the first element.
Even though it points at the same address, int (*)[3] is incompatible with int *, so your program has a constraint violation. That makes it an invalid program, and a compiler must issue a diagnostic message.
The C standard explicitly mentions in a footnote (C11 footnote 9) that a compiler is allowed to successfully compile an invalid program:
9) The intent is that an implementation should identify the nature of, and where possible localize, each violation. Of course, an implementation is free to produce any number of diagnostics as long as a valid program is still correctly translated. It may also successfully translate an invalid program.
I.e. a compiler is allowed to do whatever it pleases with the given source, provided that a valid program is correctly translated. Hence, with default settings, many compilers will turn a blind eye to this kind of thing and provide a translation that has more or less expected - or equally unexpected - results, and you will get only a warning.
Expressions in C have both values and types.
Given int vector[] = {10, 20, 30};, vector is an array. C has a rule that, when an array is used in an expression outside of certain places (footnote 1), it is automatically converted to a pointer to its first element. Then its value is effectively the address of the start of the array (footnote 2), and its type is “pointer to int”.
The expression &vector takes the address of the array. This is different from the address of its first element. Largely, they both have the same value. Both the array and its first element start at the same place in memory. But they have different types. The type of &vector is “pointer to array of 3 int”.
C has rules about types, and you cannot automatically use one type where another is expected. Sometimes types are converted automatically, as when a narrower integer is converted to a wider integer. But, generally in places where the type is important to the meaning of the software, there are no automatic conversions (or limited conversions). If you try to assign a pointer to an array to a pointer to an int, a good compiler will warn you that you are doing something improper.
When a pointer to one type is assigned to a pointer to a different type, it might be because the programmer has made a mistake. This is why the compiler warns you.
Additionally, the same value with different types may behave differently. Because vector is converted to a pointer to its first element, vector + 1 is a pointer to the second element. But, since &vector is a pointer to the array, &vector + 1 is a pointer to the end of the array (where the next array after it would start, if there were one).
Footnotes
1 An array is not automatically converted when it is the operand of sizeof or the operand of unary & or is a string literal used to initialize an array.
2 The array starts in the same place in memory that its first element starts, of course. So they have the same virtual address. So they have the same value in that sense. However, there are some technical issues about C pointers that mean two pointers to the same place may not be exactly the “same” in certain senses. In this answer, I will not get into details of that. We can treat pointers to the same place as the same in this discussion.
Related
So for a while I was confused about array names and pointers.
We declare int a[10];
And somewhere down the road also have a and &a.
So I get how the syntax works. a is the array name. When it is not used as an operand of sizeof, &, etc., it will be converted or "decayed" so that it yields a pointer to integer holding the address of the first element of the array.
If the array name is used as an operand for sizeof or &, its type is int (*)[10]. So I guess the type is different because that "decay" does not happen.
But I still do not understand how &a works. My understanding is that it is giving me the address of whatever it was before the "decay" happened.. So before the "decay" to pointer happened, then what is it and how does the compiler work with the "original" to evaluate &a?
In comparison, if we declare int *p;
and later have &p and p somewhere in the code...
In this case the pointer to integer p is given a separate pointer cell with its address and the value at that address will be whatever address we assign to it (or the garbage value at that address pre-assignment).
a does not get assigned a separate pointer cell in memory when it is declared int a[10]. I heard it is identified with an offset on the register %ebp. Then what is happening with the compiler when it evaluates &a? The "decay" to a pointer to integer is not happening, there was no separate "pointer" in the first place. Then what does the compiler identify a as and what does it do when it sees that unary & operator is using the array name as an operand?
Given:
int a[10];
the object a is of type int[10]. The expression a, in most but not all contexts, "decays" to a pointer expression; the expression yields a value of type int*, equivalent to &a[0].
But I still do not understand how &a works. My understanding is that it is giving me the address of whatever it was before the "decay" happened.. So before the "decay" to pointer happened, then what is it and how does the compiler work with the "original" to evaluate &a?
That's not quite correct. In &a, the decay doesn't happen at all. a is of type "array of 10 int" (int[10]), so &a is of type "pointer to array of 10 int" (int(*)[10]).
There's nothing special about this. For any name foo of type some_type, the expression &foo is of type "pointer to some_type". (What's confusing about it is that this is one of the rare cases where an array name doesn't behave strangely.)
It's best to think of the words "array" and "pointer" as adjectives rather than nouns. Thus we can have an array object, an array expression, an array type, and so forth -- but just "an array" is ambiguous.
This:
int a[10];
defines an array object named a (and allocates 10 * sizeof (int) bytes to hold it). No pointer object is created. You can create a pointer value by taking the address of the object, or of any element of it. This is no different than objects of any other type. Defining an object of type some_type doesn't create an object of type some_type*, but you can create a value of type some_type* by computing the address of the object.
Then what does the compiler identify a as and what does it do when it
sees that unary & operator is using the array name as an operand?
The compiler identifies a as a 10-element integer array, and when it sees the & operator, it returns the address of that array.
Just like it would see int i = 3; as an integer, and &i as the address of that integer.
Concerning taking the address of an array: an array is an object in and of itself, so it has both a size and an address (though taking its address is seldom useful).
The conversion of an array to a pointer to its first element is an implicit conversion. It happens wherever an array expression is used, except when the array is the operand of sizeof or unary &.
For instance, in a comparison between an array and a pointer, the array is implicitly converted to an int* (to its first element) and then the pointers are compared. Comparing pointers of different types is a constraint violation, but most compilers accept it and only emit a warning.
As far as types are concerned, this is actually comparing int* to int(*)[10], as you said. These will necessarily have the same address (regardless of type) because arrays hold their data directly. So the address of an array will always be the address of its first element.
With sizeof, however, no conversion takes place, so sizeof(a) gets the size of the entire array. So this is the same as sizeof(int[10]).
Your other case sizeof(&a) is really sizeof(int(*)[10]) as you said.
What is the meaning of (int*) &i?
char i;
int* p = (int*) &i;
...
*p = 1234567892;
...
If it was * &i, I would understand. But in this case, there is an "int" in there.
&i : means to take the address of i (which is a char*)
(int*)&i : casts that pointer to be a pointer to integer (which is bad/wrong to do, but you told the compiler to do it so it won't even give a warning)
int* p = (int*)&i; : a statement that says to store the pointer of i in p (and cast it too: the compiler won't even complain)
*p = 1234567892; : write this value, which is several bytes long, to the location pointed to by p (p is treated as pointing to an int, but it actually points to a char!). One of those bytes will end up in i, but the others will overwrite the bytes neighboring i.
The construct (int *) &var, where var is a char, takes a pointer to var, and then converts it to a pointer of a different type (namely int). The program later writes an int value into the pointer. Since the pointer actually points to a char, an int value does not fit, which triggers undefined behavior, which is a fancy name for "literally anything (that your computer can physically accomplish) could happen -- this program is buggy".
EDIT: As requested, some standardology to explain why this program has undefined behavior. All section references below are to N1570, which is the closest approximation to the official text of C2011 that can be accessed online for free.
As a preamble, when reading the text of the C standard, you need to know that the word "shall" has special significance. Any sentence containing the word "shall" imposes a hard requirement on either the program, or the compiler and runtime environment; you have to figure out which from context. There are two kinds of hard requirements on the program. If a "shall" sentence appears in a "constraints" section, then the compiler is required to diagnose violations (§5.1.1.3) (the standard never flat out says that a program must be rejected, but that's the usual line drawn between hard errors and warnings). If a "shall" sentence appears somewhere else, then the compiler isn't required to diagnose it, but a program that violates the requirement has undefined behavior (§4p1,2). Sometimes the text says "If X, then the behavior is undefined" instead; there's no difference in the consequences.
First off, the conversion (int *) &var converts char * to int *, which is explicitly allowed by §6.3.2.3p7 if and only if the value of the pointer-to-char is properly aligned for an object of type int.
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
There's nothing in the code shown that would ensure that var is aligned appropriately for an int, so the program might already have triggered undefined behavior at this point, but let's assume it is aligned correctly. Saving a value of type int * into a variable declared with that type is unproblematic. The next operation is *p = integer_literal. This is a write access to the stored value of the object var, which must obey the rules in §6.5p6,7:
The effective type of an object for an access to its stored value is the declared type of the object, if any. [... more text about what happens if the object has no declared type; not relevant here ...]
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type
For simple arithmetic types like int and char, compatible type means the same type after stripping typedefs. (The exact definition is spread over §§ 6.2.7, 6.7.2, 6.7.3, and 6.7.6.) What matters for this analysis is simply that the declared type of var is char, but the lvalue expression *p has type int; int is not compatible with char, and int is not a character type. Therefore this program violates a requirement stated with the word "shall", which is not within a section named "constraints", and its behavior is undefined.
Note the asymmetry of the last bullet point. Any object may have its stored value accessed by an lvalue expression with character type, but an object declared to have character type may not be accessed by an lvalue expression with an incompatible type. Yes, that means the common idiom of accessing a large array of characters (such as a buffer of data read from a file) "four at a time" via a pointer to int is, strictly speaking, invalid. Many compilers make a special exception to their pointer-aliasing rules for that case, to avoid invalidating that idiom.
However, accessing a single char via a pointer to int is also invalid because (on most systems) int is bigger than char, so you read or write bytes beyond the end of the object. The standard doesn't bother distinguishing that case from the array case, but it will blow up on you regardless.
int * is a type — specifically it is pointer to int.
(type)x is a type cast. It says to reinterpret or convert x to that type. With pointer types it always means reinterpret.
i is of type char. So &i is of type char *. Casting it to int * makes it of type int * so that it can be assigned to p.
When you subsequently write via p you'll be writing a whole int.
&i gives the address of the variable i. The (int *) converts that pointer, which is of type char *, into a pointer to int.
The statement *p = 1234567892 then has undefined behaviour, since p actually points to an address of a single char, but this expression treats that location as if it contains an int (different type). In practice, the usual result is to write to memory locations past the single char, which can cause anything from data poisoning (e.g. changing values of other variables) to an immediate program crash.
Without the (int*), gcc would complain because pointers to carrots are not pointers to potatoes.
warning: initialization from incompatible pointer type [enabled by default]
Thus, this notation just means ok I know what I'm doing here, consider it a pointer to different type, ie an int.
It means that your program is about to crash with a BUS error
It's a typecast. i is a char variable and p is a pointer to int.
So p = (int *) &i means p stores the address of i, which has type char *; because you cast it, the compiler accepts it. Now p points to i.
*p = 123455; // writes an int-sized value starting at &i
If you then print these values:
print (*p) // 123455
print (i) // some garbage -- because i is a char (1 byte) but an int (typically 4 bytes) was written over it, so i holds only one byte of that value and prints accordingly.
But let's say *p = 65;
print(*p) // 65
print(i) // A (on a little-endian machine) -- because i now holds 65, and character 65 is 'A'
Hope this helps.
The C99 standard says the following in 6.7.5.3/7:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation.
Which I understand as:
void foo(int * arr) {} // valid
void foo(int arr[]) {} // invalid
However, gcc 4.7.3 will happily accept both function definitions, even when compiled with gcc -Wall -Werror -std=c99 -pedantic-errors. Since I am not a C expert, I am unsure if maybe I misinterpreted what the standard is saying.
I also noticed that
size_t foo(int arr[]) { return sizeof(arr); }
will always return sizeof(int *) instead of the array size, which confirms my belief that int arr[] is handled as int * and gcc is just trying to make me feel more comfortable.
Can someone shed some light on this issue? Just for reference, this question arose from this comment.
Some context:
First of all, remember that when an expression of type "N-element array of T" appears in a context where it isn't the operand of the sizeof or unary & operator, or isn't a string literal being used to initialize another array in a declaration, it will be converted to an expression of type "pointer to T" and its value will be the address of the first element in the array.
That means when you pass an array argument to a function, the function will receive a pointer value as a parameter; the array expression is converted to a pointer type before the function is called.
That's all well and good, but why is arr[] allowed as a pointer declaration? I can't say that this is the reason for sure, but I suspect it's a holdover from the B language, from which C was derived. In fact, pretty much everything hinky or unintuitive about arrays in C is a holdover from B.
B was a "typeless" language; you didn't have different types for floats, integers, text, whatever. Everything was stored as fixed-size words, or "cells", and memory was treated as a linear array of cells. When you declared an array in B, as in
auto arr[10];
the compiler would set aside 10 cells for the array, and then set aside an additional 11th cell that would store an offset to the first element of the array, and that additional cell would be bound to the variable arr. As in C, array indexing in B was computed as *(arr + i); you'd take the value stored in arr, add an offset i, and dereference the result. Ritchie retained most of these semantics, with the huge exception of no longer setting aside storage for the pointer to the first element of the array; instead, that pointer value would be computed from the array expression itself when the code was translated. This is why array expressions are converted to pointer types, why &arr and arr give the same value, if different types (the address of the array and the address of the first element of the array are the same) and why an array expression cannot be the target of an assignment (there's nothing to assign to; no storage has been set aside for a variable independent of the array elements).
Now here's the fun bit; in B, you'd declare a "pointer" as
auto ptr[];
This had the effect of allocating the cell to store the offset to the first element of the array and binding it to ptr, but ptr didn't point anywhere in particular; you could assign it to point to various locations. I suspect that notation was held over for a couple of reasons:
Most of the guys who worked on the initial version of C were familiar with it;
It sort of emphasizes that the parameter represents an array in the caller;
Personally, I would have preferred that Ritchie had used * to designate pointers everywhere, but he didn't (or, alternately, use [] to designate a pointer in all contexts, not just a function parameter declaration). I will normally recommend that everyone use * notation for function parameters instead of [], simply because it more accurately conveys the type of the parameter, but I can understand why people would prefer the second notation.
Both your valid and invalid declarations are internally equivalent, i.e., the compiler converts the latter to the former.
What your function sees is the pointer to the first element of the array.
PS. The alternative would be to push the whole array on the stack, which would be grossly inefficient from both time and space viewpoints.
Is this always the case? I mean, is the array name always a pointer to the first element of the array? Why is that so? Is it an implementation detail or a language feature?
An array name is not itself a pointer, but decays into a pointer to the first element of the array in most contexts. It's that way because the language defines it that way.
From C11 6.3.2.1 Lvalues, arrays, and function designators, paragraph 3:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue.
You can learn more about this topic (and lots about the subtle behaviour involved) from the Arrays and Pointers section of the comp.lang.c FAQ.
Editorial aside: The same kind of behaviour takes place in C++, though the language specifies it a bit differently. For reference, from a C++11 draft I have here, 4.2 Array-to-pointer conversion, paragraph 1:
An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" can be converted to an rvalue of type "pointer to T". The result is a pointer to the first element of the array.
The historical reason for this behavior can be found here.
C was derived from an earlier language named B (go figure). B was a typeless language, and memory was treated as a linear array of "cells", basically unsigned integers.
In B, when you declared an N-element array, as in
auto a[10];
N cells were allocated for the array, and another cell was set aside to store the address of the first element, which was bound to the variable a. As in C, array indexing was done through pointer arithmetic:
a[j] == *(a+j)
This worked pretty well until Ritchie started adding struct types to C. The example he gives in the paper is a hypothetical file system entry, which is a node id followed by a name:
struct {
int inumber;
char name[14];
};
He wanted the contents of the struct type to match the data on the disk; 2 bytes for an integer immediately followed by 14 bytes for the name. There was no good place to stash the pointer to the first element of the array.
So he got rid of it. Instead of setting aside storage for the pointer, he designed the language so that the pointer value would be computed from the array expression itself.
This, incidentally, is why an array expression cannot be the target of an assignment; it's effectively the same thing as writing 3 = 4; - you'd be trying to assign a value to another value.
Carl Norum has given the language-lawyer answer on the question (and got my upvote on it), here comes the implementation detail answer:
To the computer, any object in memory is just a range of bytes, and, as far as memory handling is concerned, uniquely identified by an address to the first byte and a size in bytes. Even when you have an int in memory, its address is nothing more or less than the address of its first byte. The size is almost always implicit: If you pass a pointer to an int, the compiler knows its size because it knows that the bytes at that address are to be interpreted as an int. The same goes for structures: their address is the address of their first byte, their size is implicit.
Now, the language designers could have implemented a similar semantic with arrays as they did with structures, but they didn't for a good reason: Copying was then even more inefficient than now compared to just passing a pointer, structures were already passed around using pointers most of the time, and arrays are usually meant to be large. Prohibitively large to force value semantics on them by language.
Thus, arrays were just forced to be memory objects at all times by specifying that the name of an array would be virtually equivalent to a pointer. In order not to break the similarity of arrays to other memory objects, the size was again said to be implicit (to the language implementation, not the programmer!): The compiler could just forget about the size of an array when it was passed somewhere else and rely on the programmer to know how many objects were inside the array.
This had the benefit that array accesses are excruciatingly simple; they decay to a matter of pointer arithmetic, of multiplying the index with the size of the object in the array and adding that offset to the pointer. It's the reason why a[5] is exactly the same as 5[a]: it's shorthand for *(a + 5).
Another performance related aspect is that it is excruciatingly simple to make a subarray from an array: only the start address needs to be calculated. There is nothing that would force us to copy the data into a new array, we just have to remember to use the correct size...
So, yes, it has profound reasons in terms of implementation simplicity and performance that array names decay to pointers the way they do, and we should be glad for it.
Closed 10 years ago.
Possible Duplicate:
Is array name a pointer in C?
If I define:
int tab[4];
tab is a pointer, because if I display tab:
printf("%d", tab);
the code above will display the address to the first element in memory.
That's why I was wondering why we don't define an array like the following:
int *tab[4];
as tab is a pointer.
Thank you for any help!
tab is a pointer
No, tab is an array. An int[4] to be specific. But when you pass it as an argument to a function (and in many other contexts) the array is converted to a pointer to its first element. You can see the difference between arrays and pointers for example when you call sizeof array vs. sizeof pointer, when you try to assign to an array (that won't compile), and more.
int *tab[4];
declares an array of four pointers to int. I don't see how that is related to the confusion between arrays and pointers.
tab is not a pointer; it's an array of 4 integers. When passed to a function, it decays into a pointer to the first element:
int tab[4];
And this is another array but it holds 4 integer pointers:
int *tab[4];
Finally, for the sake of completeness, this is a pointer to an array of 4 integers, if you dereference this you get an array of 4 integers:
int (*tab)[4];
You are not completely wrong, meaning that your statement is wrong but you are not that far from the truth.
Arrays and pointers in C share the same arithmetic, but the main difference is that arrays are containers, whereas pointers are like any other atomic variable: their purpose is to store a memory address and provide information about the type of the pointed-to value.
I suggest to read something about pointer arithmetic
Pointer Arithmetic
http://www.learncpp.com/cpp-tutorial/68-pointers-arrays-and-pointer-arithmetic/
Considering the Steve Jessop comment I would like to add a snippet that can introduce you to the simple and effective world of the pointer arithmetic:
#include <stdio.h>
int main()
{
int arr[10] = {10,11,12,13,14,15,16,17,18,19};
int pos = 3;
printf("Arithmetic part 1 %d\n",arr[pos]);
printf("Arithmetic part 2 %d\n",pos[arr]);
return(0);
}
Arrays can behave like pointers, and even look like pointers in your case; you can apply exactly the same kind of arithmetic, but they are not pointers.
int *tab[4];
This definition means that the tab array contains pointers to int, not ints.
From C standard
Coding Guidelines
The implicit conversion of array objects to a
pointer to their first element is a great inconvenience in trying to
formulate stronger type checking for arrays in C. Inexperienced, in
the C language, developers sometimes equate arrays and pointers much
more closely than permitted by this requirement (which applies to uses
in expressions, not declarations). For instance, in:
file_1.c
extern int *a;
file_2.c
extern int a[10];
the two declarations of a are sometimes incorrectly assumed by
developers to be compatible. It is difficult to see what guideline
recommendation would overcome incorrect developer assumptions (or poor
training). If the guideline recommendation specifying a single point
of declaration is followed, this problem will not occur. Unlike the
function designator usage, developers are familiar with the fact that
objects having an array type are implicitly converted to
a pointer to their first element. Whether applying a unary & operator
to an operand having an array type provides readers with a helpful
visual cue or causes them to wonder about the intent of the author
(“what is that redundant operator doing there?”) is not known.
Example
static double a[5];
void f(double b[5])
{
double (*p)[5] = &a;
double **q = &b; /* This looks suspicious, */
p = &b; /* and so does this. */
q = &a;
}
If the array object has register storage class, the behavior is undefined
Under most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type, and the value of the expression will be the address of the first element in the array. The exceptions to this rule are when the array expression is an operand of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration.
int tab[4];
defines tab as a 4-element array of int. In the statement
printf("%d", tab); // which *should* be printf("%p", (void*) tab);
the expression tab is converted from type "4-element array of int" to "pointer to int".