Adding the integer to hexadecimal address and How is the pointers calculation done in C? [duplicate] - arrays

#include<stdio.h>
int main(void){
int *ptr,a,b;
a = ptr;
b = ptr + 1;
printf("the vale of a,b is %x and %x respectively",a,b);
int c,d;
c = 0xff;
d = c + 1;
printf("the value of c d are %x and %x respectively",c,d);
return 0;
}
the out put value is
the vale of a,b is 57550c90 and 57550c94 respectively
the value of c d are ff and 100 respectively%
it turns out the ptr + 1 actually, why it behave this way?

Because pointers are designed to be compatible with arrays:
*(pointer + offset)
is equivalent to
pointer[offset]
So pointer aritmetic doesn't work in terms of bytes, but in terms of sizeof(pointer base type)-bytes sized blocks.

Consider what a pointer is... it's a memory address. Every byte in memory has an address. So, if you have an int that's 4 bytes and its address is 1000, 1001 is actually the 2nd byte of that int and 1002 is the third byte and 1003 is the fourth. Since the size of an int might vary from compiler to compiler, it is imperative that when you increment your pointer you don't get the address of some middle point in the int. So, the job of figuring out how many bytes to skip, based on your data type, is handled for you and you can just use whatever value you get and not worry about it.
As Basile Starynkvitch points out, this amount will vary depending on the sizeof property of the data member pointed to. It's very easy to forget that even though addresses are sequential, the pointers of your objects need to take into account the actual memory space required to house those objects.

Pointer arithmetic is a tricky subject. A pointer addition means passing to some next pointed element. So the address is incremented by the sizeof the pointed element.

Short answer
The address of the pointer will be incremented by sizeof(T) where T is the type pointed to. So for an int, the pointer will be incremented by sizeof(int).
Why?
Well first and foremost, the standard requires it. The reason this behaviour is useful (other than for compatibility with C) is because when you have a data structure which uses contiguous memory, like an array or an std::vector, you can move to the next item in the array by simply adding one to the pointer. If you want to move to the nth item in the container, you just add n.
Being able to write firstAddress + 2 is far simpler than firstAddress + (sizeof(T) * 2), and helps prevent bugs arising from developers assuming sizeof(int) is 4 (it might not be) and writing code like firstAddress + (4 * 2).
In fact, when you say myArray[4], you're saying myArray + 4. This is the reason that arrays indices start at 0; you just add 0 to get the first element (i.e. myArray points to the first element of the array) and n to get the nth.
What if I want to move one byte at a time?
sizeof(char) is guaranteed to be one byte in size, so you can use a char* if you really want to move one byte at a time.

A pointer is used to point to a specific byte of memory marking where an object has been allocated (technically it can point anywhere, but that's how it's used). When you do pointer arithmetic, it operates based on the size of the objects pointed to. In your case, it's a pointer to integers, which have a size of 4 bytes each.

Let consider a pointer p. The expression p+n is like (unsigned char *)p + n * sizeof *p (because sizeof(unsigned char) == 1).
Try this :
#include <stdio.h>
#define N 3
int
main(void)
{
int i;
int *p = &i;
printf("%p\n", (void *)p);
printf("%p\n", (void *)(p + N));
printf("%p\n", (void *)((unsigned char *)p + N * sizeof *p));
return 0;
}

Related

In C, why does incrementing a pointer adds the size of the type the pointer is referring to instead of 1? [duplicate]

#include<stdio.h>
int main(void){
int *ptr,a,b;
a = ptr;
b = ptr + 1;
printf("the vale of a,b is %x and %x respectively",a,b);
int c,d;
c = 0xff;
d = c + 1;
printf("the value of c d are %x and %x respectively",c,d);
return 0;
}
the out put value is
the vale of a,b is 57550c90 and 57550c94 respectively
the value of c d are ff and 100 respectively%
it turns out the ptr + 1 actually, why it behave this way?
Because pointers are designed to be compatible with arrays:
*(pointer + offset)
is equivalent to
pointer[offset]
So pointer aritmetic doesn't work in terms of bytes, but in terms of sizeof(pointer base type)-bytes sized blocks.
Consider what a pointer is... it's a memory address. Every byte in memory has an address. So, if you have an int that's 4 bytes and its address is 1000, 1001 is actually the 2nd byte of that int and 1002 is the third byte and 1003 is the fourth. Since the size of an int might vary from compiler to compiler, it is imperative that when you increment your pointer you don't get the address of some middle point in the int. So, the job of figuring out how many bytes to skip, based on your data type, is handled for you and you can just use whatever value you get and not worry about it.
As Basile Starynkvitch points out, this amount will vary depending on the sizeof property of the data member pointed to. It's very easy to forget that even though addresses are sequential, the pointers of your objects need to take into account the actual memory space required to house those objects.
Pointer arithmetic is a tricky subject. A pointer addition means passing to some next pointed element. So the address is incremented by the sizeof the pointed element.
Short answer
The address of the pointer will be incremented by sizeof(T) where T is the type pointed to. So for an int, the pointer will be incremented by sizeof(int).
Why?
Well first and foremost, the standard requires it. The reason this behaviour is useful (other than for compatibility with C) is because when you have a data structure which uses contiguous memory, like an array or an std::vector, you can move to the next item in the array by simply adding one to the pointer. If you want to move to the nth item in the container, you just add n.
Being able to write firstAddress + 2 is far simpler than firstAddress + (sizeof(T) * 2), and helps prevent bugs arising from developers assuming sizeof(int) is 4 (it might not be) and writing code like firstAddress + (4 * 2).
In fact, when you say myArray[4], you're saying myArray + 4. This is the reason that arrays indices start at 0; you just add 0 to get the first element (i.e. myArray points to the first element of the array) and n to get the nth.
What if I want to move one byte at a time?
sizeof(char) is guaranteed to be one byte in size, so you can use a char* if you really want to move one byte at a time.
A pointer is used to point to a specific byte of memory marking where an object has been allocated (technically it can point anywhere, but that's how it's used). When you do pointer arithmetic, it operates based on the size of the objects pointed to. In your case, it's a pointer to integers, which have a size of 4 bytes each.
Let consider a pointer p. The expression p+n is like (unsigned char *)p + n * sizeof *p (because sizeof(unsigned char) == 1).
Try this :
#include <stdio.h>
#define N 3
int
main(void)
{
int i;
int *p = &i;
printf("%p\n", (void *)p);
printf("%p\n", (void *)(p + N));
printf("%p\n", (void *)((unsigned char *)p + N * sizeof *p));
return 0;
}

Incrementing pointer to array

I came across this program on HSW:
int *p;
int i;
p = (int *)malloc(sizeof(int[10]));
for (i=0; i<10; i++)
*(p+i) = 0;
free(p);
I don't understand the loop fully.
Assuming the memory is byte addressable, and each integer takes up 4 bytes of memory, and say we allocate 40 bytes of memory to the pointer p from address 0 to 39.
Now, from what I understand, the pointer p initially contains value 0, i.e. the address of first memory location. In the loop, a displacement is added to the pointer to access the subsequent integers.
I cannot understand how the memory addresses uptil 39 are accessed with a displacement value of only 0 to 9. I checked and found that the pointer is incremented in multiples of 4. How does this happen? I'm guessing it's because of the integer type pointer, and each pointer is supposedly incremented by the size of it's datatype. Is this true?
But what if I actually want to point to memory location 2 using an integer pointer. So, I do this: p = 2. Then, when I try to de-reference this pointer, should I expect a segmentation fault?
Now, from what I understand, the pointer p initially contains value 0
No, the pointer p would not hold the value 0 in case malloc returns successfully.
At the point of declaring it, the pointer is uninitialized and most probably holds a garbage value. Once you assign it to the pointer returned by malloc, the pointer points to a region of dynamically allocated memory that the allocator sees as unoccupied.
I cannot understand how the memory addresses uptil 39 are accessed
with a displacement value of only 0 to 9
The actual displacement values are 0, 4, 8, 12 ... 36. Because the pointer p has a type, in that case int *, this indicates that the applied offset in pointer arithmetics is sizeof(int), in your case 4. In other words, the displacement multiplier is always based on the size of the type that your pointer points to.
But what if I actually want to point to memory location 2 using an
integer pointer. So, I do this: p = 2. Then, when I try to
de-reference this pointer, should I expect a segmentation fault?
The exact location 2 will most probably be unavailable in the address space of your process because that part would either be reserved by the operating system, or will be protected in another form. So in that sense, yes, you will get a segmentation fault.
The general problem, however, with accessing a data type at locations not evenly divisible by its size is breaking the alignment requirements. Many architectures would insist that ints are accessed on a 4-byte boundary, and in that case your code will trigger an unaligned memory access which is technically undefined behaviour.
Now, from what I understand, the pointer p initially contains value 0
No, it contains the address to the first integer in an array of 10. (Assuming that malloc was successful.)
In the loop, a displacement is added to the pointer to access the subsequent integers.
Umm no. I'm not sure what you mean but that is not what the code does.
I checked and found that the pointer is incremented in multiples of 4. How does this happen?
Pointer arithmetic, that is using + - ++ -- etc operators on a pointer, are smart enough to know the type. If you have an int pointer a write p++, then the address that is stored in p will get increased by sizeof(int) bytes.
But what if I actually want to point to memory location 2 using an integer pointer. So, I do this: p = 2.
No, don't do that, it doesn't make any sense. It sets the pointer to point at address 0x00000002 in memory.
Explanation of the code:
int *p; is a pointer to integer. By writing *p = something you change the contents of what p points to. By writing p = something you change the address of where p points.
p = (int *)malloc(sizeof(int[10])); was written by a confused programmer. It doesn't make any sense to cast the result of malloc in, you can find extensive information about that topic on this site.
Writing sizeof(int[10]) is the same as writing 10*sizeof(int).
*(p+i) = 0; is the very same as writing p[i] = 0;
I would fix the code as follows:
int *p = malloc(sizeof(int[10]));
if(p == NULL) { /* error handling */ }
for (int i=0; i<10; i++)
{
p[i] = 0;
}
free(p);
Since you have a typed pointer, when you perform common operations on it (addition or subtraction), it automatically adjusts the alignment for your type. Here, since on your computer sizeof (int) is 4, p + i will result in the address p + sizeof (int) * i, or p + 4*i in your case.
And you seem to misunderstand the statement *(p+i) = 0. This statement is equivalent to p[i] = 0. Obviously, your malloc() call won't return you 0, except if it fails to actually allocate the memory you asked.
Then, I assume that your last question means "If I shift my malloc-ated address by exactly two bytes, what will occur?".
The answer depends on what you do next and on the endianness of your system. For example:
/*
* Suppose our pointer p is well declared
* And points towards a zeroed 40 bytes area.
* (here, I assume sizeof (int) = 4)
*/
int *p1 = (int *)((char *)p + 2);
*p1 = 0x01020304;
printf("p[0] = %x, p[1] = %x.\n", p[0], p[1]);
Will output
p[0] = 102, p[1] = 3040000.
On a big endian system, and
p[0] = 3040000, p[1] = 102
On a little endian system.
EDIT : To answer to your comment, if you try to dereference a randomly assigned pointer, here is what can happen:
You are lucky : the address you type correspond to a memory area which has been allocated for your program. Thus, it is a valid virtual address. You won't get a segfault, but if you modify it, it might corrupt the behavior of your program (and it surely will ...)
You are luckier : the address is invalid, you get a nice segfault that prevents your program from totally screwing things up.
It is called pointer arithmetic. Add an integer n to a pointer of type t* moves the pointer by n * sizeof(t) elements. Therefore, if sizeof(int) is 4 bytes:
p + 1 (C) == p + 1 * sizeof(int) == p + 1 * 4 == p + 4
Then it is easier to index your array:
*(p+i) is the i-th integer in the array p.
I don't know if by "memory location 2" you mean your example memory address 2 or if you mean the 2nd value in your array. If you mean the 2nd value, that would be memory address 1. To get a pointer to this location you would do int *ptr = &p[1]; or equivalently int *ptr = p + 1;, then you can print this value with printf("%d\n", *ptr);. If you mean the memory address 2 (your example address), that would be the 3rd value in the array, then you'd want p[2] or p + 2. Note that memory addresses are usually in hex and wouldn't actually start at 0. It would be something like 0x092ef000, 0x092ef004, 0x092ef008, . . .. All of the other answers aren't understanding that you are using memory addresses 0 . . . 39 just as example addresses. I don't think you honestly are referring to the physical locations starting at address 0x00000000 and if you are then what everyone else is saying is right.

Accessing struct member by address

Consider a struct with two members of integer type. I want to get both members by address. I can successfully get the first, but I'm getting wrong value with the second. I believe that is garbage value. Here's my code:
#include <stdio.h>
typedef struct { int a; int b; } foo_t;
int main(int argc, char **argv)
{
foo_t f;
f.a = 2;
f.b = 4;
int a = ((int)(*(int*) &f));
int b = ((int)(*(((int*)(&f + sizeof(int))))));
printf("%d ..%d\n", a, b);
return 0;
}
I'm getting:
2 ..1
Can someone explain where I've gone wrong?
The offset of the first member must always be zero by the C standard; that's why your first cast works. The offset of the second member, however, may not necessarily be equal to the size of the first member of the structure because of padding.
Moreover, adding a number to a pointer does not add the number of bytes to the address: instead, the size of the thing being pointed to is added. So when you add sizeof(int) to &f, sizeof(int)*sizeof(foo_t) gets added.
You can use offsetof operator if you want to do the math yourself, like this
int b = *((int*)(((char*)&f)+offsetof(foo_t, b)));
The reason this works is that sizeof(char) is always one, as required by the C standard.
Of course you can use &f.b to avoid doing the math manually.
Your problem is &f + sizeof(int). If A is a pointer, and B is an integer, then A + B does not, in C, mean A plus B bytes. Rather, it means A plus B elements, where the size of an element is defined by the pointer type of A. Therefore, if sizeof(int) is 4 on your architecture, then &f + sizeof(int) means "four foo_ts into &f, or 4 * 8 = 32 bytes into &f".
Try ((char *)&f) + sizeof(int) instead.
Or, of course, &f.a and &f.b instead, quite simply. The latter will not only give you handy int pointers anyway and relieve you of all those casts, but also be well-defined and understandable. :)
The expression &f + sizeof(int) adds a number to a pointer of type foo_t*. Pointer arithmetic always assumes the pointer is to an element of an array, and the number is treated as a count of elements.
So if sizeof(int) is 4, &f + sizeof(int) points four foo_t structs past f, or 4*sizeof(foo_t) bytes after &f.
If you must use byte counts, something like this might work:
int b = *(int*)((char*)(&f) + sizeof(int));
... assuming there's no padding between members a and b.
But is there any reason you don't just get the value f.b or the pointer &f.b?
This works:
int b = ((int)(*(int*)((int)&f + sizeof(int))));
You are interpreting &f as a pointer and when you are adding 4 (size of int) to it, it interprets it as adding 4 pointers which is 16 or 32 bytes depending on 32 vs 64 arch. If you cast pointer to int it will properly add 4 bytes to it.
This is an explanation of what is going on. I'm not sure what you are doing, but you most certainly should not be doing it like that. This can get you in trouble with alignment etc. The safe way to figure out offset of a struct element is:
printf("%d\n", &(((foo_t *)NULL)->b));
You can do &f.b, and skip the actual pointer math.
Here is how I would change your int b line:
int b = (int)(*(((int*)(&(f.b)))));
BTW, when I run your program as is, I get 2 ..0 as the output.

Why a pointer + 1 add 4 actually

#include<stdio.h>
int main(void){
int *ptr,a,b;
a = ptr;
b = ptr + 1;
printf("the vale of a,b is %x and %x respectively",a,b);
int c,d;
c = 0xff;
d = c + 1;
printf("the value of c d are %x and %x respectively",c,d);
return 0;
}
the out put value is
the vale of a,b is 57550c90 and 57550c94 respectively
the value of c d are ff and 100 respectively%
it turns out the ptr + 1 actually, why it behave this way?
Because pointers are designed to be compatible with arrays:
*(pointer + offset)
is equivalent to
pointer[offset]
So pointer aritmetic doesn't work in terms of bytes, but in terms of sizeof(pointer base type)-bytes sized blocks.
Consider what a pointer is... it's a memory address. Every byte in memory has an address. So, if you have an int that's 4 bytes and its address is 1000, 1001 is actually the 2nd byte of that int and 1002 is the third byte and 1003 is the fourth. Since the size of an int might vary from compiler to compiler, it is imperative that when you increment your pointer you don't get the address of some middle point in the int. So, the job of figuring out how many bytes to skip, based on your data type, is handled for you and you can just use whatever value you get and not worry about it.
As Basile Starynkvitch points out, this amount will vary depending on the sizeof property of the data member pointed to. It's very easy to forget that even though addresses are sequential, the pointers of your objects need to take into account the actual memory space required to house those objects.
Pointer arithmetic is a tricky subject. A pointer addition means passing to some next pointed element. So the address is incremented by the sizeof the pointed element.
Short answer
The address of the pointer will be incremented by sizeof(T) where T is the type pointed to. So for an int, the pointer will be incremented by sizeof(int).
Why?
Well first and foremost, the standard requires it. The reason this behaviour is useful (other than for compatibility with C) is because when you have a data structure which uses contiguous memory, like an array or an std::vector, you can move to the next item in the array by simply adding one to the pointer. If you want to move to the nth item in the container, you just add n.
Being able to write firstAddress + 2 is far simpler than firstAddress + (sizeof(T) * 2), and helps prevent bugs arising from developers assuming sizeof(int) is 4 (it might not be) and writing code like firstAddress + (4 * 2).
In fact, when you say myArray[4], you're saying myArray + 4. This is the reason that arrays indices start at 0; you just add 0 to get the first element (i.e. myArray points to the first element of the array) and n to get the nth.
What if I want to move one byte at a time?
sizeof(char) is guaranteed to be one byte in size, so you can use a char* if you really want to move one byte at a time.
A pointer is used to point to a specific byte of memory marking where an object has been allocated (technically it can point anywhere, but that's how it's used). When you do pointer arithmetic, it operates based on the size of the objects pointed to. In your case, it's a pointer to integers, which have a size of 4 bytes each.
Let consider a pointer p. The expression p+n is like (unsigned char *)p + n * sizeof *p (because sizeof(unsigned char) == 1).
Try this :
#include <stdio.h>
#define N 3
int
main(void)
{
int i;
int *p = &i;
printf("%p\n", (void *)p);
printf("%p\n", (void *)(p + N));
printf("%p\n", (void *)((unsigned char *)p + N * sizeof *p));
return 0;
}

Why does my homespun sizeof operator need a char* cast?

Below is the program to find the size of a structure without using sizeof operator:
struct MyStruct
{
int i;
int j;
};
int main()
{
struct MyStruct *p=0;
int size = ((char*)(p+1))-((char*)p);
printf("\nSIZE : [%d]\nSIZE : [%d]\n", size);
return 0;
}
Why is typecasting to char * required?
If I don't use the char* pointer, the output is 1 - why?
Because pointer arithmetic works in units of the type pointed to. For example:
int* p_num = malloc(10 * sizeof(int));
int* p_num2 = p_num + 5;
Here, p_num2 does not point five bytes beyond p_num, it points five integers beyond p_num. If on your machine an integer is four bytes wide, the address stored in p_num2 will be twenty bytes beyond that stored in p_num. The reason for this is mainly so that pointers can be indexed like arrays. p_num[5] is exactly equivalent to *(p_num + 5), so it wouldn't make sense for pointer arithmetic to always work in bytes, otherwise p_num[5] would give you some data that started in the middle of the second integer, rather than giving you the sixth integer as you would expect.
In order to move a specific number of bytes beyond a pointer, you need to cast the pointer to point to a type that is guaranteed to be exactly 1 byte wide (a char).
Also, you have an error here:
printf("\nSIZE : [%d]\nSIZE : [%d]\n", size);
You have two format specifiers but only one argument after the format string.
If I don't use the char* pointer, the output is 1 - WHY?
Because operator- obeys the same pointer arithmetic rules that operator+ does. You incremented the sizeof(MyStruct) when you added one to the pointer, but without the cast you are dividing the byte difference by sizeof(MyStruct) in the operator- for pointers.
Why not use the built in sizeof() operator?
Because you want the size of your struct in bytes. And pointer arithmetics implicitly uses type sizes.
int* p;
p + 5; // this is implicitly p + 5 * sizeof(int)
By casting to char* you circumvent this behavior.
Pointer arithmetic is defined in terms of the size of the type of the pointer. This is what allows (for example) the equivalence between pointer arithmetic and array subscripting -- *(ptr+n) is equivalent to ptr[n]. When you subtract two pointers, you get the difference as the number of items they're pointing at. The cast to pointer to char means that it tells you the number of chars between those addresses. Since C makes char and byte essentially equivalent (i.e. a byte is the storage necessary for one char) that's also the number of bytes occupied by the first item.

Resources