I'm looking at some code a classmate posted and it has been simplified for the sake of this example:
#include <stdio.h>
int main()
{
int n = 0x4142;
char * a = (char *) &n;
char * b1 = (((char *) &n) + 1);
char * b2 = (((int) &n) + 1);
printf("B1 Points to 0x%x - B2 Points to 0x%x\n", b1, b2);
printf("A Info - 0x%x - %c\n", a, *a);
printf("B1 Info - 0x%x - %c\n", b1, *b1);
printf("B2 Info - 0x%x - %c\n", b2, *b2);
return 0;
}
The output is:
B1 Points to 0xcfefb03d - B2 Points to 0xcfefb03d
A Info - 0xcfefb03c - B
B1 Info - 0xcfefb03d - A
Segmentation fault (core dumped)
It segfaults when trying to print out b2, whereas I think the output should include the following line instead of seg faulting:
B2 Info - 0xcfefb03d - A
Why is this the case? b1 and b2 are both char*, and they both point to the same address. This is on a 64bit machine where sizeof(char*) == 8 and sizeof(int) == 4 if that matters.
but it segfaults on a 64bit computer
The likely reason is that, on your platform, pointers are 64-bit and ints are 32-bit. Thus when you cast the pointer to int, you lose information.
My compiler specifically warns about this:
test.c:7:19: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
This is easy to see using the following program:
#include <stdio.h>
int main()
{
char *s = "";
printf("%p %p\n", s, (char*)(int)s);
return 0;
}
On my computer, this prints two different addresses (the second has its top bits chopped off).
what I don't quite understand is that both b1 and b2 are char*, and they both point to the same address
Actually, they don't point to the same address. Your printf() format specifiers are all wrong. The first printf() should be:
printf("B1 Points to %p - B2 Points to %p\n", (void *)b1, (void *)b2);
When you run this, you'll see that the addresses differ.
(int)&n may truncate the address of n to however many bits there are in an int. In other words, if the pointer size is larger than the size of int (which is often the case on 64-bit CPUs), you get a bogus, truncated address/pointer that you cannot dereference safely.
In your example, since an int is 4 bytes, the value of &n is truncated when you cast it and assign it to b2. To treat pointer values as integers, use uintptr_t, an unsigned integer type that can safely store a pointer regardless of the platform's pointer width:
char * b2 = (((int) &n) + 1);
should be:
char * b2 = (char *) ((uintptr_t) &n + 1);
(uintptr_t is declared in <stdint.h>; casting the result back to char * avoids the implicit integer-to-pointer conversion.)
Related
I am studying for an exam and the following question is asked: what output does the following program generate on a 32-bit and on a 64-bit machine?
#include <stdio.h>
int main(int argc, char **argv)
{
long *p = (long *)8;
short *q = (short *)0;
int c, i, l;
p = p + 1;
q = q + 1;
printf("p = %p q = %p\n", p, q);
c = (char *)p - (char *)q;
i = (int *)p - (int *)q;
l = (long *)p - (long *)q;
printf("c = %d, i = %d, l = %d\n", c, i, l);
return 0;
}
The result on 32bit is the following:
p = 0xc q = 0x2
c = 10, i = 2, l = 2
The result on 64bit is the following:
p = 0x10 q = 0x2
c = 14, i = 3, l = 1
Only I must confess that I can't understand these results at all. When I count up a pointer, I'd expect it to point at something random in memory, yet it always comes out as 0xc or 0x10.
Can someone help me?
Only I must confess that I can't understand these results at all. When I count up a pointer, I'd expect it to point at something random in memory, yet it always comes out as 0xc or 0x10.
The program doesn't access "value in memory pointed to by the pointer" for any of the pointers. It only does pointer arithmetic (changing what the pointer points to).
The goal of the question is likely to test if you understand how pointer arithmetic works in C.
How pointer arithmetic works in C is that it hides a "multiplication by sizeof(whatever the pointer points to)". In other words, p = p + n; is roughly equivalent to p = (char *)p + n * sizeof(*p);, or alternatively p = &p[n];.
First, I should point out that there's a lot of undefined behavior in this code, and it makes a lot of assumptions that aren't guaranteed by the standard. But that having been said, the question can be answered in terms of what you'll see on a typical implementation.
Let's assume that the data type sizes (in bytes) on the two machines are:
type     32-bit size   64-bit size
long     4             8
int      4             4
short    2             2
char     1             1
Again, I don't think these are guaranteed, but they are very typical.
Now consider what the code is doing. First, it's setting
long *p = (long *)8;
short *q = (short *)0;
These addresses do not refer to valid data (at least, not portably), but the code never actually dereferences them. All it is doing is using them to perform pointer arithmetic.
First, it increments them:
p = p + 1;
q = q + 1;
When adding an integer to a pointer, the integer is scaled by the size of the target data type. So in the case of p, the scale is 4 (for 32-bit) or 8 (for 64-bit). In the case of q, the scale is 2 (in both cases).
So p becomes 12 (for 32-bit) or 16 (for 64-bit), and q becomes 2 (in both cases). When printed in hex, 12 is 0xc and 16 is 0x10, so this is consistent with what you saw.
Then it takes the differences between the two pointers, after first casting them to various pointer types:
c = (char *)p - (char *)q;
i = (int *)p - (int *)q;
l = (long *)p - (long *)q;
These results exhibit undefined behavior, but if you assume that all pointer values are byte addresses, then here's what's happening.
First, when subtracting two pointers of the same type, the difference is divided by the size of the target data type. This gives the number of elements of that type separating the two pointers.
So in the case of c (type char *), it divides by 1, giving the raw pointer difference, which is (12 - 2) = 10 (for 32-bit) and (16 - 2) = 14 (for 64-bit).
For i (type int *), it divides by 4 (and truncates the result), so the difference is 10 / 4 = 2 (for 32-bit) and 14 / 4 = 3 (for 64-bit).
Finally, for l (type long*), it divides by 4 (for 32-bit) or 8 (for 64-bit), again truncating the result. So the difference is 10 / 4 = 2 (for 32-bit) and 14 / 8 = 1 (for 64-bit).
Again, the C standard does not guarantee any of this, so it is not safe to assume these results will be obtained on different platforms.
#include <stdio.h>
int main()
{
//type casting in pointers
int a = 500; //value is assigned
int *p; //pointer p
p = &a; //stores the address in the pointer
printf("p=%d\n*p=%d", p, *p);
printf("\np+1=%d\n*(p+1)=%d", p + 1, *(p + 1));
char *p0;
p0 = (char *)p;
printf("\n\np0=%d\n*p0=%d", p0, *p0);
return 0;
}
I was exploring pointers in the C language and ran into a problem reading the value through the char pointer after converting it from an integer pointer.
Please tell me how this works.
To print a pointer use %p and cast the argument to (void*).
Like
printf("p=%p\n*p=%d", (void*)p, *p);
Reading p + 1, i.e. doing *(p + 1), is undefined behavior because p + 1 doesn't point to an int. So don't do that!
In a comment OP asks:
p=6487564 *p=500 p+1=6487568 *(p+1)=0 p0=6487564 *p0=-12
this is the output I am getting; why is *p0 equal to -12? Please explain.
The decimal value 500 is the same as the hexadecimal value 0x000001F4. On a little endian machine (with 32 bit int) this is stored like:
p -> F4 01 00 00
Then you assign p0 the value of p so you have
p -> F4 01 00 00
^
|
p0
so p0 points to 0xF4 (assuming 8 bit char).
On a machine with signed chars, the hex value 0xF4 is the decimal value -12 (i.e. signed 8 bit 2's complement representation).
Conclusion On a little endian machine with signed 8 bit chars the printed value will be -12.
If you change
char *p0;
p0 = (char *)p;
to
unsigned char *p0;
p0 = (unsigned char *)p;
then it will print 244. That may be easier to understand because 500 is 256 + 244 (or in hex: 0x1F4 = 0x100 + 0xF4).
I'm running this bit of code to understand pointers a little better.
void foo(void)
{
int a[4] = {0, 1, 2, 3};
printf("a[0]:%d, a[1]:%d, a[2]:%d, a[3]:%d\n", a[0], a[1], a[2], a[3]);
int *c;
c = a + 1;
c = (int *)((char*) c + 1);
*c = 10;
printf("c:%p, c+1:%p\n", c, c+1);
printf("a:%p, a1:%p, a2:%p, a3:%p\n", a, a+1, a+2, a+3);
printf("a[0]:%d, a[1]:%d, a[2]:%d, a[3]:%d\n", a[0], a[1], a[2], a[3]);
printf("c[0]:%d, c[1]:%d\n", *c, *(c+1));
}
The output I get is:
a[0]:0, a[1]:1, a[2]:2, a[3]:3
c:0xbfca1515, c+1:0xbfca1519
a:0xbfca1510, a1:0xbfca1514, a2:0xbfca1518, a3:0xbfca151c
a[0]:0, a[1]:2561, a[2]:0, a[3]:3
c[0]:10, c[1]:50331648
Could someone please explain how a[1] is now 2561?
I understand that when we do this:
c = (int *) ((char *) c + 1);
c is now pointing to the 4 bytes following the first byte of a[1].
But how did a[1] end up with 2561?
I'm guessing this has to do with endianness?
c = a + 1;
now c points at a[1], the second element of a (which holds 1)
c = (int *)((char*) c + 1);
You "cheated" with the pointer arithmetic, adding 1 to the address regardless of the size of int. Note that this is illegal on old machines like the 68000, which don't tolerate multi-byte accesses at odd addresses; others do the job, albeit a lot slower, which is arguably worse since you won't notice it (for instance, it works on a 68020, but slower).
now c points at the last 3 bytes of a[1] and overflows into the first byte of a[2], so when you do:
*c = 10;
since your machine is little endian, you leave the low byte of a[1] (which holds 1) untouched, write 10 into the next byte, and zeroes after that, clobbering the low byte of a[2] (which held 2).
So now:
a[1] = 1 + (10<<8) = 2561
a[2] = 0
the result is different on a big endian machine:
PowerPC big endian (if int is 32 bit, else it's a different result):
a[1] = 10485760
a[2] = 2 // first byte is overwritten, but with zero
68000/68010:
bus error (coredump) / guru meditation
To sum it up: don't create misaligned pointers, and don't violate the strict aliasing rule.
What will be the output of the following C code. Assuming it runs on Little endian machine, where short int takes 2 Bytes and char takes 1 Byte.
#include<stdio.h>
int main() {
short int c[5];
int i = 0;
for(i = 0; i < 5; i++)
c[i] = 400 + i;
char *b = (char *)c;
printf("%d", *(b+8));
return 0;
}
In my machine it gave
-108
I don't know if my machine is little endian or big endian. I found somewhere that it should give
148
as the output, because the low-order 8 bits of 404 (i.e. element c[4]) are 148. But I thought that due to "%d", it should read 2 bytes from memory starting at the address of c[4].
The code gives different outputs on different computers because on some platforms the char type is signed by default and on others it's unsigned by default. That has nothing to do with endianness. Try this:
char *b = (char *)c;
printf("%d\n", (unsigned char)*(b+8)); // always prints 148
printf("%d\n", (signed char)*(b+8));   // always prints -108 (148 - 256)
The default value is dependent on the platform and compiler settings. You can control the default behavior with GCC options -fsigned-char and -funsigned-char.
c[4] stores 404. In a two-byte little-endian representation, that means two bytes of 0x94 0x01, or (in decimal) 148 1.
b+8 addresses the memory of c[4]. b is a pointer to char, so the 8 means adding 8 bytes (which is 4 two-byte shorts). In other words, b+8 points to the first byte of c[4], which contains 148.
*(b+8) (which could also be written as b[8]) dereferences the pointer and thus gives you the value 148 as a char. What this does is implementation-defined: On many common platforms char is a signed type (with a range of -128 .. 127), so it can't actually be 148. But if it is an unsigned type (with a range of 0 .. 255), then 148 is fine.
The bit pattern for 148 in binary is 10010100. Interpreting this as a two's complement number gives you -108.
This char value (of either 148 or -108) is then automatically converted to int because it appears in the argument list of a variable-argument function (printf). This doesn't change the value.
Finally, "%d" tells printf to take the int argument and format it as a decimal number.
So, to recap: Assuming you have a machine where
a byte is 8 bits
negative numbers use two's complement
short int is 2 bytes
... then this program will output either -108 (if char is a signed type) or 148 (if char is an unsigned type).
To see what sizes types have in your system:
printf("char = %zu\n", sizeof(char));
printf("short = %zu\n", sizeof(short));
printf("int = %zu\n", sizeof(int));
printf("long = %zu\n", sizeof(long));
printf("long long = %zu\n", sizeof(long long));
Change the lines in your program
unsigned char *b = (unsigned char *)c;
printf("%d\n", *(b + 8));
And a simple test (I know it is not guaranteed, but all C compilers I know do it this way, and I do not care about old CDC or UNISYS machines, which had different addresses and different pointer representations for different types of data):
printf(" endianes test: %s\n", (*b + (unsigned)*(b + 1) * 0x100) == 400? "little" : "big");
Another remark: this works only because in your program c[0] == 400.
void *memory;
unsigned int b=65535; //1111 1111 1111 1111 in binary
int i=0;
memory= &b;
for(i=0;i<100;i++){
printf("%d, %d, d\n", (char*)memory+i, *((unsigned int * )((char *) memory + i)));
}
I am trying to understand one thing.
(char*)memory+i prints out addresses in the range 2686636 - 2686735.
When I store 65535 with memory = &b, this should store the number at addresses 2686636 and 2686637, because every address is just one byte (8 binary digits). So when I print
*((unsigned int * )((char *) memory + i)), it should print 2686636, 255 and 2686637, 255;
instead it prints 2686636, 65535 and 2686637, a random number.
I am trying to implement memory allocation; it is a school project. This should represent memory: one address should be one byte, so the header will be 2686636-2686639 (4 bytes for the size of the block) and 2686640 (1 char byte as a free/used flag). Can someone explain this to me? Thanks.
Thanks for answers.
void *memory;
void *abc;
abc=memory;
for(i=0;i<100;i++){
*(int*)abc=0;
abc++;
}
*(int*)memory=16777215;
for(i=0;i<100;i++){
printf("%p, %c, %d\n", (char*)memory+i, *((char *)memory +i), *((char *)memory +i));
}
output is
0028FF94, , -1
0028FF95, , -1
0028FF96, , -1
0028FF97, , 0
0028FF98, , 0
0028FF99, , 0
0028FF9A, , 0
0028FF9B, , 0
I think it works: 255 gives one -1, 65535 gives -1 twice, and 16777215 gives -1 three times.
In your program the address of b seems to be 2686636. When you write (char*)memory + i or (char*)&b + i, the pointer points to char, so adding one makes it advance by exactly one memory address, i.e. to 2686637, and so on up to 2686735 (i.e. (char*)2686636 + 99).
Now when you dereference it, i.e. *((unsigned int *)((char *) memory + i)), you get the value at that memory address, but you have only assigned a value to b (whose address is 2686636); all the other addresses contain garbage values, which is what you are printing.
So first you have to store some data at the rest of the addresses (2686637 to 2686735).
Good luck, and I hope this helps.
I did not mention this in my comments yesterday, but it is obvious that your for loop from 0 to 100 overruns the unsigned integer: it reads far beyond the few bytes that b actually occupies.
I simply ignored some of the obvious issues in the code and tried to give hints on the actual question you asked (it's difficult to do more than that on a handy :-)). Unfortunately I did not have time to complete this yesterday, so here are my hints, with one day's delay.
Try to avoid making assumptions about how big a certain type is (like 2 bytes or 4 bytes). Even if your assumption holds true now, it might change if you switch the compiler or move to another platform, so use sizeof(type) consistently throughout the code. For a longer discussion you might want to look at the Stack Overflow question on the sizes of int, long and so on. The standard mandates only the ranges a certain type must be able to hold (0-65535 for unsigned int), i.e. only a minimal size, which means the size of int might be (and typically is) bigger than 2 bytes. Beyond primitive types, sizeof also helps you compute the size of structures, where due to memory alignment and packing the size might differ from what you would "expect" by simply looking at the members. So the sizeof operator is your friend.
Make sure you use the correct formatting in printf.
Be carefull with pointer arithmetic and casting since the result depends on the type of the pointer (and obviously on the value of the integer you add with).
I.e.
(unsigned int*)memory + 1 != (unsigned char*)memory + 1
(unsigned int*)memory + 1 == (unsigned char*)memory + 1 * sizeof(unsigned int)
Below is how I would write the code:
//check how big is int on our platform for illustrative purposes
printf("Sizeof int: %zu bytes\n", sizeof(unsigned int));
//we initialize b with maximum representable value for unsigned int
//include <limits.h> for UINT_MAX
unsigned int b = UINT_MAX; //0xffffffff (if sizeof(unsigned int) is 4)
//we print out the value and its hexadecimal representation
printf("B=%u 0x%X\n", b, b);
//we take the address of b and store it in a void pointer
void* memory= &b;
int i = 0;
//we loop the unsigned chars starting at the address of b up to the sizeof(b)
//(in our case b is unsigned int) using sizeof(b) is better since if we change the type of b
//we do not have to remember to change the sizeof in the for loop. The loop works just the same
for(i=0; i<sizeof(b); ++i)
{
//here we kept %d for formating the individual bytes to represent their value as numbers
//we cast to unsigned char since char might be signed (so from -128 to 127) on a particular
//platform and we want to illustrate that the expected (all bytes 1 -> printed value 255) occurs.
printf("%p, %d\n", (void *)((unsigned char *)memory + i), *((unsigned char *) memory + i));
}
I hope you will find this helpfull. And good luck with your school assignment, I hope you learned something you can use now and in the future :-).