Adding pointer addresses - legality - c

Since adding two pointers together is illegal, how is this code snippet valid?
struct key *low = &tab[0];
struct key *high = &tab[n];
struct key *mid;
while (low < high)
{
mid = low + (high-low) / 2; //isn't this adding pointers?
//code continues...
The first statement in the while loop seems to add two addresses together, how is this legal?
This code is from K&Rs the C programming language on page 122

The difference of two pointers (high - low) is an integer (actually ptrdiff_t, which is a signed integer type), so you're adding an integer to a pointer, which is perfectly legal. This also explains why it's perfectly OK to divide the difference by 2, which is not something you could do with a pointer.

You are allowed to subtract two pointers(the result is ptrdiff_t) and you are allowed to add an integer value to a pointer. This is covered in the draft C99 standard section 6.5.6 Additive operators paragrph 2:
For addition, either both operands shall have arithmetic type, or one operand shall be a
pointer to an object type and the other shall have integer type. (Incrementing is
equivalent to adding 1.)
and paragraph 3:
For subtraction, one of the following shall hold:
and includes the following bullets:
both operands are pointers to qualified or unqualified versions of compatible object
types; or
Some important notes, when subtracting two pointers they must point to the same array, this is covered in paragraph 9 which says:
When two pointers are subtracted, both shall point to elements of the
same array object, or one past the last element of the array object; [...]
In order to avoid undefined behavior the resulting pointer from an addition must still point to the same array or one off the end of the array and if you point to one past the end you shall not dereference it, which is in paragraph 8 which says:
[...]If both the pointer operand and the result point to elements of
the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element
of the array object, it shall not be used as the operand of a unary *
operator that is evaluated.

The difference of two pointers high - low is a value with an arithmetic, integral type (representable by a value of type ptrdiff_t, in case you need to store it). It's perfectly allowed to add integral values to pointers.

If you're subtracting two pointers that point to elements of the same array, or just past the last element of the same array, then the subtraction is the difference (in array elements) between them. The return value is a signed integer type, ptrdiff_t which is from stddef.h.
So, high - low is returning this signed integer which is then being added to low. So you're not adding pointers, you're adding a pointer with a signed integer type.

Related

Does the C standard require pointers to be (integer) numbers?

Does the C standard require pointers to be (integer) numbers?
One may argue that yes, because of pointer arithmetic...
But on the other hand operations like -- or ++ may be understood as previous memory location, next memory location, depending on how they are described in the standard, and actual implementation may use any representation to hold pointer data (as long as mentioned operations are implemented)...
Another question comes to mind - does C require arrays/buffers etc. to be contiguous, i.e. next element is stored in next memory location (++p where p is a pointer)? I ask because you can often see implementations online that seem to assume that it does.
No, pointers need not be plain numbers.
If you read the standard, there are provisions for that:
Two pointers to unrelated objects (meaning not part of a bigger object, remember structs and arrays) may not be compared, except for equality.
6.5.8 Relational operators
[...]
5 When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript
values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
Two pointers to unrelated objects may not be subtracted.
6.5.6 Additive operators
[...]
9 When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the twoarray elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. If the result is not representable in an object of that type, the behavior is undefined. In other words, if the expressions P and Q point to, respectively,the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. Moreover, if the expression P points either to an element of an array object or one past the last element of an array object, and the expression Q points to the last element of the same array object, the expression ((Q)+1)-(P) has the same
value as ((Q)-(P))+1 and as -((P)-((Q)+1)), and has the value zero if the expression P points one past the last element of the array object, even though the expression (Q)+1 does not point to an element of the array object.91)
There may not be a way to represent a pointer as a number, as no suitable type might exist. Thus, trying to convert might result in Undefined Behavior.
Any specific implementation defining a behavior does not mean it isn't UB according to the standard.
6.3.2.3 Pointers
[...]
6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of anyinteger type.
7.18.1.4 Integer types capable of holding object pointers
1 The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
These types are optional.
That's just off the top of my head, I'm sure there's more.
All quotes from n1256 (C99 draft).
Arrays have always been required to be contiguous.
To answer to your second question in arrays elements are in contiguous Memory locations. Thats why you use pointer arithmetic to move between elements.

division during pointer subtraction in C

Consider below code snippet :
int *p;
/* Lets say p points to address 100
and sizeof(int) is 4 bytes. */
int *q = p+1;
unsigned long r = q-p;
/* r results in 1, hence for r = q-p
something is happening similar to r=(104-100)/4 */
Is there a real division by sizeof(datatype) going on during runtime when two pointers of same type are subtracted, or there is some other mechanism through which pointer subtraction works.
The C standard states the following regarding pointer subtraction (section 6.5.6p9):
When two pointers are subtracted, both shall point to elements of the
same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the
two array elements. The size of the result is
implementation-defined, and its type (a signed integer type) is
ptrdiff_t defined in the header. If the result is not
representable in an object of that type, the behavior is
undefined. In other words, if the expressions P and Q point to,
respectively, the i
-th and j
-th elements of an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t . Moreover,
if the expression P points either to an element of an array object or
one past the last element of an array object, and the expression Q
points to the last element of the same array object, the expression
((Q)+1)-(P) has the same value as ((Q)-(P))+1 and as
-((P)-((Q)+1)) , and has the value zero if the expression P points one past the last element of the array object, even
though the expression (Q)+1 does not point to an element of the array
object. 106)
Footnote 106 states:
Another way to approach pointer arithmetic is first to convert the
pointer(s) to character pointer(s): In this scheme the integer
expression added to or subtracted from the converted pointer is first
multiplied by the size of the object originally pointed to,
and the resulting pointer is converted back to the original
type. For pointer subtraction, the result of the difference
between the character pointers is similarly divided by the size of
the object originally pointed to. When viewed in this way, an
implementation need only provide one extra byte (which may
overlap another object in the program) just after the end of the
object in order to satisfy the "one past the last element"
requirements.
So the footnote states that pointer subtraction may be implemented by subtracting the raw pointer values and dividing by the size of the pointed-to object. It doesn't have to be implemented this way, however.
Note also that the standard requires that pointer subtraction is performed between pointers pointing to elements of the same array object (or one element past the end). If they don't then the behavior is undefined. In practice, if you're working on a system with a flat memory model you'll probably still get the "expected" values but you can't depend on that.
See #dbush answer for the explanation on how pointer substraction works.
If, instead, you are programming something low-level, say a kernel, driver, debugger or similar and you need to have actual subtraction of addresses, cast the pointers to char *:
(char *)q - (char *)p
The result will be of ptrdiff_t type, an implementation defined signed integer.
Of course, this is not defined/portable C, but will work on most architectures/environments.

Why does the compiler assume that these seemingly equal pointers differ?

Looks like GCC with some optimization thinks two pointers from different translation units can never be same even if they are actually the same.
Code:
main.c
#include <stdint.h>
#include <stdio.h>
int a __attribute__((section("test")));
extern int b;
void check(int cond) { puts(cond ? "TRUE" : "FALSE"); }
int main() {
int * p = &a + 1;
check(
(p == &b)
==
((uintptr_t)p == (uintptr_t)&b)
);
check(p == &b);
check((uintptr_t)p == (uintptr_t)&b);
return 0;
}
b.c
int b __attribute__((section("test")));
If I compile it with -O0, it prints
TRUE
TRUE
TRUE
But with -O1
FALSE
FALSE
TRUE
So p and &b are actually the same value, but the compiler optimized out their comparison assuming they can never be equal.
I can't figure out, which optimization made this.
It doesn't look like strict aliasing, because pointers are of one type, and -fstrict-aliasing option doesn't make this effect.
Is this the documented behavour? Or is this a bug?
There are three aspects in your code which result in general problems:
Conversion of a pointer to an integer is implementation defined. There is no guarantee conversion of two pointers to have all bits identical.
uintptr_t is guaranteed to convert from a pointer to the same type then back unchanged (i.e. compare equal to the original pointer). But nothing more. The integer values themselves are not guaranteed to compare equal. E.g. there could be unused bits with arbitrary value. See the standard, 7.20.1.4.
And (briefly) two pointers can only compare equal if they point into the same array or right behind it (last entry plus one) or at least one is a null pointer. For any other constellation, they compare unequal. For the exact details, see the standard, 6.5.9p6.
Finally, there is no guarantee how variables are placed in memory by the toolchain (typically the linker for static variables, the compiler for automatic variables). Only an array or a struct (i.e. composite types) guarantee the ordering of its elements.
For your example, 6.5.9p7 also applies. It basically treats a pointer to a non-array object for comparision like on to the first entry of an array of size 1. This does not cover an incremented pointer past the object like &a + 1. Relevant is the object the pointer is based on. That is object a for pointer p and b for pointer &b. The rest can be found in paragraph 6.
None of your variables is an array (last part of paragraph 6), so the pointers need not compare equal, even for &a + 1 == &b. The last "TRUE" might arise from gcc assuming the uintptr_t comparison returning true.
gcc is known to agressively optimise while strictly following the standard. Other compilers are more conservative, but that results in less optimised code. Please don't try "solving" this by disabling optimisation or other hacks, but fix it using well-defined behaviour. It is a bug in the code.
p == &b is a pointer comparison and is subject to the following rules from the C Standard (6.5.9 Equality operators, point 4):
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.
(uintptr_t)p == (uintptr_t)&b is an arithmetic comparison and is subject to the following rules (6.5.9 Equality operators, point 6):
If both of the operands have arithmetic type, the usual arithmetic conversions are performed. Values of complex types are equal if and only if both their real parts are equal and also their imaginary parts are equal. Any two values of arithmetic types from different type domains are equal if and only if the results of their conversions to the (complex) result type determined by the usual arithmetic conversions are equal.
These two excerpts require very different things from the implementation. And it is clear that the C specification places no requirement on an implementation to mimic the behavior of the former kind of comparison in cases where the latter kind is invoked and vice versa. The implementation is only required to follow this rule (7.18.1.4 Integer types capable of holding object pointers in C99 or 7.20.1.4 in C11):
The [uintptr_t] type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer.
(Addendum: The above quote isn't applicable in this case, because the conversion from int* to uintptr_t does not involve void* as an intermediate step. See Hadi's answer for an explanation and citation on this. Still, the conversion in question is implementation-defined and the two comparisons you are attempting are not required to exhibit the same behavior, which is the main takeaway here.)
As an example of the difference, consider two pointers that point at the same address of two different address spaces. Comparing them as pointers shouldn't return true, but comparing them as unsigned integers might.
&a + 1 is an integer added to a pointer, which is subject to the following rules (6.5.6 Additive operators, point 8):
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. If the pointer operand points to an element of
an array object, and the array is large enough, the result points to an element offset from
the original element such that the difference of the subscripts of the resulting and original
array elements equals the integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and
(P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of
the array object, provided they exist. Moreover, if the expression P points to the last
element of an array object, the expression (P)+1 points one past the last element of the
array object, and if the expression Q points one past the last element of an array object,
the expression (Q)-1 points to the last element of the array object. If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.
I believe that this excerpt shows that pointer addition (and subtraction) is defined only for pointers within the same array object or one past the last element. And because (1) a is not an array and (2) a and b aren't members of the same array object, it seems to me that your pointer math operation invokes undefined behavior and your compiler takes advantage of it to assume that the pointer comparison returns false. Again as pointed out in Hadi's answer (and in contrast to what my original answer assumed at this point), pointers to non-array objects can be considered pointers to array objects of length one, and thus adding one to your pointer to the scalar does qualify as pointing to one past the end of the array.
Therefore your case seems to fall under the last part of the first excerpt mentioned in this answer, making your comparison well-defined to evaluate to true if and only if the two variables are linked in sequence and in ascending order. Whether this is true for your program is left unspecified by the standard and it's up to the implementation.
While one of the answers has already been accepted, the accepted answer (and all other answers for that matter) are critically wrong as I'll explain and then answer the question. I'll be quoting from the same C standard, namely n1570.
Let's start with &a + 1. In contrast to what #Theodoros and #Peter has stated, this expression has defined behavior. To see this, consider section 6.5.6 paragraph 7 "Additive operators" which states:
For the purposes of these operators, a pointer to an object that is
not an element of an array behaves the same as a pointer to the first
element of an array of length one with the type of the object as its
element type.
and paragraph 8 (in particular, the emphasized part):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i+n-th and
i−n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element
of the array object, it shall not be used as the operand of a unary *
operator that is evaluated.
The expression (uintptr_t)p == (uintptr_t)&b has two parts. The conversion from a pointer to an uintptr_t is NOT defined by section 7.20.1.4 (in contrast to what #Olaf and #Theodoros have said):
The following type designates an unsigned integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to pointer to void, and the result will compare
equal to the original pointer:
uintptr_t
It's important to recognize that this rule applies only to valid pointers to void. However, in this case, we have a valid pointer to int. A relevant paragraph can be found in section 6.3.2.3 paragraph 1:
A pointer to void may be converted to or from a pointer to any object
type. A pointer to any object type may be converted to a pointer to
void and back again; the result shall compare equal to the original
pointer.
This means that (uintptr_t)(void*)p is allowed according to this paragraph and 7.20.1.4. But (uintptr_t)p and (uintptr_t)&b are ruled by section 6.3.2.3 paragraph 6:
Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
Note that uintptr_t is an integer type as stated in section 7.20.1.4 mentioned above and therefore this rule applies.
The second part of (uintptr_t)p == (uintptr_t)&b is comparing for equality. As previously discussed, since the result of conversion is implementation-defined, the result of equality is also implementation defined. This applies irrespective of whether the pointers themselves are equal or not.
Now I'll discuss p == &b. The third point in #Olaf's answer is wrong and #Theodoros's answer is incomplete regarding this expression. Section 6.5.9 "Equality operators" paragraph 7:
For the purposes of these operators, a pointer to an object that is
not an element of an array behaves the same as a pointer to the first
element of an array of length one with the type of the object as its
element type.
and paragraph 6:
Two pointers compare equal if and only if both are null pointers,
both are pointers to the same object (including a pointer to an object
and a subobject at its beginning) or function, both are pointers to
one past the last element of the same array object, or one is a
pointer to one past the end of one array object and the other is a
pointer to the start of a different array object that happens to
immediately follow the first array object in the address space.)
In contrast what #Olaf have said, comparing pointers using the == operator never results in undefined behavior (which may occur only when using relational operators such as <= according to section 6.5.8 paragraph 5 which I'll omit here for brevity). Now since p points to the next int relative to a, it will be equal to &b only when the linker has placed b in that location in the binary. Otherwise, there are unequal. So this is implementation-dependent (the relative order of a and b is unspecified by the standard). Since the declarations of a and b use a language extension, namely __attribute__((section("test"))), the relative locations is indeed implementation-dependent by J.5 and 3.4.2 (omitted for brevity).
We conclude that the results of check(p == &b) and check((uintptr_t)p == (uintptr_t)&b) are implementation-dependent. So the answer depends on which version of which compiler you are using. I'm using gcc 4.8 and by compiling with default options except for the level of optimization, the output I get in both -O0 and -O1 cases is all TRUE.
According to C11 6.5.9/6 and C11 6.5.9/7, the test p == &b must give 1 if a and b are adjacent in the address space.
Your example shows that GCC appears to not fulfill this requirement of the Standard.
Update 26/Apr/2016: My original answer contained suggestions about modifying the code to remove other potential sources of UB and isolate this one condition.
However, it's since come to light that the issues raised by this thread are under review - N2012.
One of their recommendations is that p == &b should be unspecified, and they acknowledge that GCC does in fact not implement the ISO C11 requirement.
So I have the remaining text from my answer, as it is no longer necessary to prove a "compiler bug", since the non-conformance (whether you want to call it a bug or not) has been established.
Re-reading your program I see that you are (understandably) baffled by the fact that in the optimized version
p == &b
is false, while
(uintptr_t)p == (uintptr_t)&b;
is true. The last line indicates that the numerical values are indeed identical; how can p == &b then be false??
I must admit that I have no idea. I am convinced that it is a gcc bug.
After a discussion with M.M I think I can make the following case if the conversion to uintptr_t goes through an intermediate void pointer (you should include that in your program and see whether it changes anything):
Because both steps in the conversion chain int* -> void* -> uintptr_t are guaranteed to be reversible, unequal int pointers can logically not result in equal uintptr_t values.1 (Those equal uintptr_t values would have to convert back to equal int pointers, altering at least one of them and thus violating the value-preserving conversion rule.) In code (I'm not aiming for equality here, just demonstrating the conversions and comparisons):
int a,b, *ap=&a, *bp = &b;
assert(ap != bp);
void *avp = ap, *bvp bp;
uintptr_t ua = (uintptr_t)avp, ub = (uintptr_t)bvp;
// Now the following holds:
// if ap != bp then *necessarily* ua != ub.
// This is violated by the OP's case (sans the void* step).
assert((int *)(void *)ua == (int*)(void*)ub);
1This assumes that the uintptr_t doesn't carry hidden information in the form of padding bits which are not evaluated in an arithmetic comparison but possibly in a type conversion. One can check that through CHAR_BIT, UINTPTR_MAX, sizeof(uintptr_t) and some bit fiddling.—
For a similar reason it's conceivable that two uintptr_t values compare different but convert back to the same pointer (namely if there are bits in uintptr_t not used for storing a pointer value, and the conversion does not zero them). But that is the opposite of the OP's problem.

Accessing variables in allocated memory

Lets say I want to allocate memory for 3 integers:
int *pn = malloc(3 * sizeof(*pn));
Now to assign to them values I do:
pn[0] = 5550;
pn[1] = 11;
pn[2] = 70000;
To access 2nd value I do:
pn[1]
But the [n] operator is just a shortcut for *(a+n). Then it would mean that i access first byte after a index. But int is 4 bytes long so shoudn't i do
*(a+sizeof(*a)*n)
instead? How does it work?
No, the compiler takes care of that. There are special rules in pointer arithmetic, and that is one of them.
If you really only want to increment it by one byte, you have to cast the pointer to a pointer to a type which is one byte long (for example char).
Good question, but C will automatically multiply the offset by the size of the pointed-to type. In other words, when you access
p[n]
for a pointer declared as
T *p;
you will access the address p + (sizeof(T) * n) implicitly.
For instance, we can use C99 standard to find out what is going on. According to C99 standard:
6.5.2.1 Array subscripting
Constraints
- 1
One of the expressions shall have type ‘‘pointer to object type’’, the other expression shall
have integer type, and the result has type ‘‘type’’.
Semantics
- 2 A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th
element of E1 (counting from zero).
And from 6.5.5.8 about conversion rules for + operator:
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. If the pointer operand points to an element of
an array object, and the array is large enough, the result points to an element offset from
the original element such that the difference of the subscripts of the resulting and original
array elements equals the integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and
(P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of
the array object, provided they exist. Moreover, if the expression P points to the last
element of an array object, the expression (P)+1 points one past the last element of the
array object, and if the expression Q points one past the last element of an array object,
the expression (Q)-1 points to the last element of the array object. If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.
Thus, all these notes are about your case and it works exactly as you wrote and you don't need special constructions, dereferencing or anything else (pointers arithmetic do that for you):
pn[1] => *((pn)+(1))
Or, in terms of byte pointers (to simplify description what is going on) this operation is similar to :
pn[1] => *(((char*)pn) + (1*sizeof(*pn)))
Moreover you can access this element with 1[pn] and result will be the same.
You should not. The rules what is happening when you add a int to a pointer are not obvious. So better not use your intuition, but read language standards about what is happens in such a cases. For example read more about pointer arithmetics here (C) or here (C++).
Shortly - a non-void pointer are "measured" in units of the type lengths.

How does unary addition on C pointers work?

I know that the unary operator ++ adds one to a number. However, I find that if I do it on an int pointer, it increments by 4 (the sizeof an int on my system). Why does it do this? For example, the following code:
int main(void)
{
int *a = malloc(5 * sizeof(int));
a[0] = 42;
a[1] = 42;
a[2] = 42;
a[3] = 42;
a[4] = 42;
printf("%p\n", a);
printf("%p\n", ++a);
printf("%p\n", ++a);
return 0;
}
will return three numbers with a difference of 4 between each.
It's just the way C is - the full explanation is in the spec, Section 6.5.6 Additive operators, paragraph 8:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
To relate that to your use of the prefix ++ operator, you need to also read Section 6.5.3.1 Prefix increment and decrement operators, paragraph 2:
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to (E+=1).
And also Section 6.5.16.2 Compound assignment, paragraph 3:
A compound assignment of the form E1 op= E2 differs from the simple assignment expression E1 = E1 op (E2) only in that the lvalue E1 is evaluated only once.
It's incrementing the pointer location by the size of int, the declared type of the pointer.
Remember, an int * is just a pointer to a location in memory, where you are saying an "int" is stored. When you ++ to the pointer, it shifts it one location (by the size of the type), in this case, it will make your value "4" higher, since sizeof(int)==4.
The reason for this is to make the following statement true:
*(ptr + n) == ptr[n]
These can be used interchangeably.
In pointer arithmetic, adding one to a pointer will add the sizeof the type which it points to.
so for a given:
TYPE * p;
Adding to p will actually increment by sizeof(TYPE). In this case the size of the int is 4.
See this related question
Because in "C" pointer arithmetic is always scaled by the size of the object being pointed to. If you think about it a bit, it turns out to be "the right thing to do".
It does this so that you don't start accessing an integer in the middle of it.
Because a pointer is not a reference ;). It's not a value, it's just an address in memory. When you check the pointer's value, it will be a number, possibly big, and unrelated to the actual value that's stored at that memory position. Say, printf("%p\n", a); prints "2000000" - this means your pointer points to the 2000000th byte in your machine's memory. It's pretty much unaware of what value it's stored there.
Now, the pointer knows what type it points to. An integer, in your case. Since an integer is 4 bytes long, when you want to jump to the next "cell" the pointer points to, it needs to be 2000004. That's exatly 1 integer farther, so a++ makes perfect sense.
BTW, if you want to get 42 (from your example), print out the value pointed to: printf("%d\n", *a);
I hope this makes sense ;)
Thats simple, cause when it comes down to pointer, in your case an integer pointer, a unary increment means INCREMENT THE MEMORY LOCATION BY ONE UNIT, where ONE UNIT = SIZE OF INTEGER .
This size of integer depends from compile to compiler, for a 32-bit and 16-bit it is 4bytes, while for a 64-bit compiler it is 8bytes.
Try doing the same program with character datatype, it will give difference of 1 byte as character takes 1 byte.
In Short, the difference of 4's that
you've come across is the difference
of SIZE OF ONE INTEGER in memory.
Hope this helped, if it didn't i'll be glad to help just let me know.
"Why does it do this?" Why would you expect it to do anything else? Incrementing a point makes it point to the next item of the type that it's a pointer to.

Resources