C array address confusion - c

Say we have the following code:
#include <stdio.h>
int main(){
    int a[3]={1,2,3};
    printf(" E: %p\n", (void *)a);
    printf(" &E[2]: %p\n", (void *)&a[2]);
    printf("&E[2]-E: 0x%tx\n", &a[2] - a);
    return 0;
}
When compiled and run, the results are as follows:
E: 0xbf8231f8
&E[2]: 0xbf823200
&E[2]-E: 0x2
I understand the result of &E[2], which is the array's address plus 8, since it is indexed by 2 and the element type is int (4 bytes on my 32-bit system). But I can't figure out why the last line is 2 instead of 8.
In addition, what should the type of the last expression be: an integer or an integer pointer?
I wonder whether it is the C type system (some kind of implicit cast) that causes this quirk.

You have to remember what the expression a[2] really means. It is exactly equivalent to *(a+2). So much so, that it is perfectly legal to write 2[a] instead, with identical effect.
For that to work and make sense, pointer arithmetic takes into account the type of the thing pointed at. But that is taken care of behind the scenes. You get to simply use natural offsets into your arrays, and all the details just work out.
The same logic applies to pointer differences, which explains your result of 2.
Under the hood, in your example the index is multiplied by sizeof(int) to get a byte offset which is added to the base address of the array. You expose that detail in your two prints of the addresses.

When subtracting two pointers of the same type, the result is the number of elements between them, not the number of bytes. This is by design, so that you can easily index arrays of any type. If you want the number of bytes, cast the addresses to char*.

When you increment a pointer by 1 (p+1), the pointer then points to the next element: the address stored in p advances by sizeof(Type) bytes (if Type is int, by sizeof(int) bytes).
The same logic holds for p-1 (subtracting instead of adding, of course).
If you apply those principles here, in simple terms:
&a[2] can be written as (a+2)
&a[2] - a ==> (a+2) - (a) ==> 2
So, behind the scenes, in terms of byte addresses,
&a[2] - &a[0]
==> {(a + (2 * sizeof(int))) - (a + 0)} / sizeof(int)
==> 2 * sizeof(int) / sizeof(int) ==> 2

The expression &E[2]-E is doing pointer subtraction, not integer subtraction. Pointer subtraction (when both pointers point into the same array) returns the difference of the addresses divided by the size of the type they point to. The result is a signed integer of type ptrdiff_t (declared in <stddef.h>), not a pointer.
To answer your "update" question, once again pointer arithmetic (this time pointer addition) is being performed. It's done this way in C to make it easier to "index" a chunk of contiguous data pointed to by the pointer.

You may be interested in Pointer Arithmetic In C question and answers.
Basically, the + and - operators take the element size into account when used on pointers.

When adding and subtracting pointers in C, you use the size of the data type rather than absolute addresses.
If you have an int pointer and add the number 2 to it, it will advance by 2 * sizeof(int) bytes. In the same manner, if you subtract two int pointers, you will get the result in units of sizeof(int) rather than the difference of the absolute addresses.
(Having pointers use the size of the data type is quite convenient: for example, you can simply write p++ instead of having to specify the size of the type every time, as in p+=sizeof(int).)

Re: "In addition, what type of the last line should be? An integer, or an integer pointer?"
An integer. By the same token, today's date minus April 1 gives a number of days, not a date.

If you want to see the byte difference, you'll have to cast to a pointer to a type that is 1 byte in size, like this:
printf("&E[2]-E:\t0x%tx\n", (char *)&a[2] - (char *)&a[0]);

Related

issue in double pointer address addition

I ran into an issue with pointers in some open source code, which I have tried to reproduce in the small snippet below.
int main()
{
int **a=0x0;
printf ("a = %d Add = %d\n", a, a+75);
return 1;
}
I expected to get 75 (0x4B), but this code gives 300 on 32-bit and 600 on 64-bit machines.
Output:
a = 0 Add = 600
But the idea is to access the added position, i.e. the 75th position in a hash table.
So it should be
printf ("a = %d Add = %d\n", a, sizeof (a)+75);
But I can't work out where the 300 or 600 comes from. Could anyone please explain?
I got as far as noticing that some left shift seems to be happening internally, since:
75 - 1001011
600 - 1001011000.
Solutions are appreciated. Thanks in advance.
Pointer arithmetic is always done using the size of what is pointed to. In your case a is a pointer to a pointer to int, so the unit size is sizeof(int*) which in your case seems to be 4 (32 bits). 4 * 75 = 300.
More precisely, a + 75 adds the byte offset sizeof(*a) * 75 (note the dereferencing of a) to the pointer. What happens is that you are effectively doing &a[75], i.e. you're getting a pointer to the element at index 75.
On a slightly related note, when you print pointers with printf you should be using the format "%p", and casting the pointers to void *. See e.g. this printf (and family) reference.
As for the different results on 32-bit and 64-bit systems, that is to be expected. A pointer on a 32-bit system is typically 32 bits, while on a 64-bit system it's 64 bits.
The program behaviour is undefined:
The format specifier %d is not valid for pointer types: use %p instead.
Pointer arithmetic is only valid within and one past the last element for arrays, or one past the address of the scalar for scalars. You can't read a + 75.
First of all, use %p for printing pointers and %zu for a sizeof result.
That said, check the type of a: it is int **, so the type it points to is int *, whose size is the size of a pointer. That size depends on the platform and compiler.
Pointer arithmetic honors the data type, so a pointer is always advanced in units of the size of the type it points to.

C array variable & address computation

int a[2][2]={{2,3},{1,6}};
printf("%d",&a[1][0] - &a[0][1]);
Here, a[0][1] and a[1][0] are two consecutive integer items. As each integer takes 4 bytes, there should be a 4-byte difference between them. So the answer should be 4.
But I think address subtraction is illegal, and Dev-C++ also reports a compiler error for it. Yet the given output is 1. How is that possible?
You're doing subtraction on int pointers, so you get a result in "sizeof(int) units".
If you run your current code, it'll print 1, because those integers are indeed next to each other.
What you probably want to do is arithmetic on the addresses as numbers :
int a[2][2]={{2,3},{1,6}};
printf("%" PRIiPTR,(intptr_t)&a[1][0] - (intptr_t)&a[0][1]);
Casting the pointers to intptr_t (in header stdint.h) is a way to do that.
PRIiPTR is a macro (from header inttypes.h) used to output an intptr_t variable with printf.
No, it should not be 4.
Your assumption is incorrect: Pointer arithmetic is done in units of the type being pointed at (i.e. sizeof (int) here), not in bytes.
Your array looks like this in memory:
[ 2 | 3 | 1 | 6 ]
You are printing the difference between the addresses of the 1 and the 3, which are adjacent, i.e. there's exactly 1 int's worth of bytes between them.
Also, it is incorrect to print a pointer difference as if it were an int (with %d). The difference has type ptrdiff_t, so print it with %td, or cast to intptr_t and use "%" PRIdPTR.

Why address of i+2 is not 653064?

I'm learning pointers in C and I'm confused about pointer arithmetic. Have a look at the program below:
#include<stdio.h>
int main()
{
int a[] = {2,3,4,5,6};
int *i=a;
printf("value of i = %d\n", i); /* just for the sake of simplicity I have used %d */
printf("value of i+2 = %d\n", i+2);
return 0;
}
My question is: if the value of i is 653000, then why is the value of i+2 653008? As far as I know every bit in memory has its address specified, so according to this the value of i+2 should be 653064, because 1 byte = 8 bits. Why is pointer arithmetic scaled by bytes and not by bits?
Thanks in advance, and sorry for my bad English!
As far as I know every bit in memory has its address specified
Wrong.
Why is pointer arithmetic scaled by bytes and not by bits?
The byte is the minimal addressable unit of storage on a computer, not the bit. Addresses refer to bytes - you cannot create a pointer that points to a specific bit in memory [1].
  Addresses refer to *bytes*
    |
    |
    v     _______________
  0x1000 |_|_|_|_|_|_|_|_| \
  0x1001 |_|_|_|_|_|_|_|_|  > Each row is one byte
  0x1002 |_|_|_|_|_|_|_|_| /
          \_______ _______/
                  v
        Each column is one bit
As others have explained, this is basic pointer arithmetic in action. When you add n to a pointer p, you're adding n elements, not n bytes. You're effectively adding n * sizeof(*p) bytes to the pointer's address.
[1] Without using architecture-specific tricks like bit-banding on ARM, as myaut pointed out.
You should read about pointer arithmetic; see the link given in the comment.
When a pointer is incremented, it advances by the size of the data type it points to. In this case i+2 advances the address by eight bytes.
An int is four bytes here (this is system defined). So i+2 acts as i+(2*sizeof(int)) in bytes, which is i+8. So the address is incremented by eight.
Addresses are counted in bytes, not bits. Take a character pointer. Each byte has 8 bits and can hold 256 distinct values.
Consider the string "hi". It will be stored like this:
h    i
1001 1002
The ASCII value of h is 104, and it is stored in one byte. A signed char can store the non-negative values 0 to 127, so one byte is enough to store one character. A single bit cannot hold such a value, so pointer arithmetic is based on bytes.
When you do PTR + n, the simple maths is
PTR + sizeof(*PTR) * n.
Here the size of the pointed-to type, int, is 4 bytes.

Array Pointers in C

Ok, so I'm learning pointers and I am having trouble understanding how the pointers function in arrays.
Basically given this:
int a[5] = {1,2,4,7,7}; // (allocated at 0xA000)
int b[5] = {4,3,5,1,8}; // (at 0xA0020)
short *c[2]; // (at 0xA0040)
c[0] = (short *)b;
c[1] = (short *)a;
I'm supposed to determine the values of these calculations.
c[0] + 4
To my understanding c is an array of pointers. c[0] is a short that holds the pointer to the first element of the array b. If b starts at 0xA0020, why is it that c[0] + 4 is not 0xA0024 and is instead 0xA0028?
Also, how am I supposed to determine the value of c[1][2]? c is not a multidimensional array, so how would this calculation work out?
Thank you!
Actually, when you add a number to a pointer, this number is multiplied by the size of the element being pointed to (short in your case because you have a short*). The size of short is probably 2 bytes on your computer, hence it adds 4*2 to the address, which is 8.
To my understanding c is an array of pointers.
Correct, to be precise: array of pointers to short
c[0] is a short that holds the pointer to the first element of the array b. If b starts at 0xA0020, why is it that c[0] + 4 is not 0xA0024 and is instead 0xA0028?
Nope, c[0] is a pointer to short. The size of short is typically 2 bytes. When you add an integer to a pointer, the integer is scaled by the size of the pointed-to type. In this case, since c[0] is a pointer to short, c[0] + 4 means c[0] + 4 * 2 in bytes. So, if c[0] points to 0xA0020, c[0] + 4 will point to 0xA0028.
Also, how am I supposed to determine the value of c[1][2]? c is not a multidimensional array, so how would this calculation work out?
Pointer semantics in C let you index a pointer as if it were an array. Given this declaration:
int* X;
Then these equalities hold:
*X == X[0]
*(X + 1) == X[1]
or in general:
*(X + n) == X[n]
where n is an integer value.
Your c is an array of pointers, so the first index selects a pointer, and the second index selects the data pointed to by that pointer. Use the equalities above to find the answer to your question.
NOTE: One thing you have to be aware of is the endianness of the machine. Little-endian and big-endian machines store values bigger than a byte (short, long, int, long long, etc.) in different byte orders.
This is because the size of a short integer is 2 bytes.
When you do c[0] + 4, you're saying,
"move forward 4 full spaces from c[0]", where a 'space' is the size of the type c[0] points to (a short, so 2 bytes).
If c[0] were a char*, the result WOULD be 0xA0024 (because a char is 1 byte in size). If it were a long*, you'd get 0xA0030 or even more, and so on.
c[0] isn't a short that holds a pointer. But rather it's a pointer to a short. And the increment is based on size of what it points to. Which it looks like others have just explained.
In your case c[1][2] gives you the following:
"c" - array of pointers to short (as per the declaration in your code)
[1] - gives the second element of the array of pointers "c" (a pointer to the start of the int array "a")
[2] - gives the data (a short) at an offset of sizeof(short)*2 bytes from the address stored in that element (that is, from the start of array "a")
So it will be one half of the second element of array "a" (assuming a 2-byte short and a 4-byte int). Which half you get depends on the endianness of your machine.
If you have a little-endian machine, you get the 16 least significant bits of the second element of "a". That is 0x0002 in hex, or just "2".
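Reading an int array through the pointer produced by (short *)b is, strictly speaking, undefined behavior (it violates the strict aliasing rule), so anything can happen. Formally, then, there is no single correct answer to what such an access yields.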
Also, even though a specific compiler may implement this undefined behavior in a particular deterministic way, there would still be no way to tell with the information given. To answer, you would have to know the sizes of int and short, as well as the endianness of the particular machine.

Subtle differences in C pointer addresses

What is the difference between:
*((uint32_t*)(p) + 4);
*(uint32_t*)(p+4);
or is there even a difference in the value?
My intuition is that in the latter example the value starts at index 4 of the array that p points to and takes the first 4 bytes from there, while the first example takes one byte every 4 indices. Is this intuition correct?
The p+4 expression computes the address by adding 4*sizeof(*p) bytes to the value of p. If the size of *p is the same as that of uint32_t, there is no difference between the results of these two expressions.
Given that p is an int pointer, and assuming that int on your system is 32-bit, your two expressions produce the same result.
