Pointer dereferencing from a (char *) to an (int *) not understood in this example - C

I was taking practice tests on C on a website, where I happened to see this question.
My doubt is explained in the comments, so please read them.
#include <stdio.h>
int main()
{
    int arr[3] = {2, 3, 4}; // assumed to be stored little-endian, i.e.:
                            // 2 = 00000010 00000000 00000000 00000000
                            // 3 = 00000011 00000000 00000000 00000000
                            // 4 = 00000100 00000000 00000000 00000000
    char *p;
    p = arr;
    p = (char*)((int*)(p));
    printf("%d ", *p);
    p = (int*)(p+1); // This cast is expected to convert the char pointer p
                     // to an int pointer, so the value at p is now assumed
                     // to be 00000000 00000000 00000000 00000011.
                     // But the output was: 0. Per my assumption it
                     // should be: 2^24+2^25 = 50331648. Please clarify
                     // whether my assumption is wrong and explain why.
    printf("%d\n", *p);
    return 0;
}

If you were to cast p back to int*, then the int value would be:
00000000 00000000 00000000 00000011
where the last byte is the first byte of your second array element. By doing p+1, you're skipping the least significant byte of the first element.
Remember that p remains a char pointer, so assigning an int* to it will not change its type.
When you printf the char at p+1, you are printing the value of the second byte, which is 0.

p = (char*)((int*)(p));
// Up to this point, the pointer p has been cast to a pointer to char.
printf("%d, ", *p); // %d prints an integer value, so the value at the first address, i.e. 2, is printed.
p = (int*)(p+1); // p is still of type char* (as cast in step 1), so p+1 advances the address by only one byte.
Assuming that an int requires 2 bytes of storage, the integer array will be stored in memory as:
value      2                  3                  4
bytes      00000010 00000000  00000011 00000000  00000100 00000000
pointer    p        p+1
So p+1 points to the high-order byte of the first element. That byte is 00000000, because the values 2, 3 and 4 each fit entirely in the low byte of their 2-byte int.
(int*)p+1 // p+1 is cast to int* again
printf("%d", *p); // this prints 0, because the byte p points to is 0.

Remember that p is still a char pointer, so *p fetches a char value from it. That char value is then promoted to an int when passed as an argument to a variadic function (like printf).
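To make the question's expectation concrete, here is a minimal sketch (assuming a little-endian machine with 4-byte int). To obtain the 50331648 the question anticipated, the four bytes starting at p+1 must be fetched as an int; memcpy is used instead of dereferencing (int*)(p+1) directly, since that dereference could trap on alignment-strict platforms.
#include <stdio.h>
#include <string.h>

int main(void)
{
    int arr[3] = {2, 3, 4};
    char *p = (char *)arr;
    int v;
    memcpy(&v, p + 1, sizeof v); /* bytes 00 00 00 03 on little-endian */
    printf("%d\n", v);           /* prints 50331648 (0x03000000) */
    return 0;
}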

Related

getting values of void pointer while only knowing the size of each element

I'll start by saying I've seen a bunch of posts with similar titles, but none focus on my question.
I've been tasked to make a function that receives a void* arr, an unsigned int sizeofArray and an unsigned int sizeofElement.
I managed to iterate through the array with no problem; however, when I try to print out the values or do anything with them, I seem to get garbage unless I specify their type beforehand.
This is my function:
void MemoryContent(void* arr, unsigned int sizeRe, unsigned int sizeUnit)
{
    int sizeArr = sizeRe/sizeUnit;
    for (int i = 0; i < sizeArr; i++)
    {
        printf("%d\n", arr);          // this one prints garbage
        printf("%d\n", *(int*)arr);   // this one prints expected values given the array is of int*
        arr = arr + sizeUnit;
    }
}
The output of this with the following array (int arr[] = {1, 2, 4, 8, 16, 32, -1};) is:
-13296 1
-13292 2
-13288 4
-13284 8
-13280 16
-13276 32
-13272 -1
I realize I have to specify the type somehow. While the printf won't actually be used (I need the binary representation of whatever value is in there, which is already taken care of in a different function), I'm still not sure how to get the actual value without casting, while knowing only the size of each element.
Any explanation would be highly appreciated!
Note: the compiler used is gcc, so pointer arithmetic on void* is allowed as used.
Edit for clarification:
The output, after formatting and all that, should look like this for the array in the previous example:
00000000 00000000 00000000 00000001 0x00000001
00000000 00000000 00000000 00000010 0x00000002
00000000 00000000 00000000 00000100 0x00000004
00000000 00000000 00000000 00001000 0x00000008
00000000 00000000 00000000 00010000 0x00000010
00000000 00000000 00000000 00100000 0x00000020
11111111 11111111 11111111 11111111 0xFFFFFFFF
Getting the values behind a void pointer while knowing only the size of each element is not possible.
Say the size is 4. Is the element an int32_t, a uint32_t, a float, a bool, some struct, an enum, a pointer, etc.? Are any of the bits padding? Properly interpreting the bits requires more than knowing only the size.
Code could print out the bits at void *ptr and leave the interpretation to the user.
unsigned char bytes[sizeUnit];
memcpy(bytes, ptr, sizeUnit);
for (size_t i = 0; i < sizeof bytes; i++) {
    printf(" %02X", bytes[i]);
}
Simplifications exist.
OP's code (void* arr, ... arr = arr + sizeUnit;) is not portable, as adding to a void * is not defined by the C standard. Some compilers do allow it, though, treating the pointer as if it were a char pointer.
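For illustration, here is a portable sketch of the loop (keeping the hypothetical MemoryContent signature from the question): it walks the buffer through an unsigned char *, which is always legal, and prints each element's raw bytes in hex, leaving interpretation to the caller.
#include <stdio.h>

void MemoryContent(const void *arr, unsigned int sizeRe, unsigned int sizeUnit)
{
    const unsigned char *bytes = arr;    /* byte-wise view; no void* arithmetic */
    unsigned int count = sizeRe / sizeUnit;
    for (unsigned int i = 0; i < count; i++) {
        for (unsigned int j = 0; j < sizeUnit; j++)
            printf("%02X ", bytes[i * sizeUnit + j]);  /* raw bytes, no type assumed */
        printf("\n");
    }
}
Called as MemoryContent(arr, sizeof arr, sizeof arr[0]) on the example array, it prints one hex-dumped element per line.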

How does Pointer Arithmetic work after Pointer Casting?

#include <stdio.h>

int main() {
    short int a[4] = {1, 1, [3] = 1};
    int *p = (int*)a;
    printf("p: %p %d\n", p, *p);
    printf("p+1: %p %d\n", (p + 1), *(p + 1));
}

Why does *p = 65537 and *(p+1) = 65536?
Well, to understand why *p is 65537 and *(p+1) is 65536, let's take a look at the memory:
00000001 00000000 | 00000001 00000000 | 00000000 00000000 | 00000001 00000000
I've separated the bytes with spaces and the individual short ints with a |. Now we cast the pointer to an int*, and each element spans four bytes instead of two:
00000001 00000000 00000001 00000000 | 00000000 00000000 00000001 00000000
If you input those binaries into your calculator and let it show you the decimal representation, you'd get exactly those numbers. (That's little-endian, however, so the rightmost byte is the big end, which you'd input first into your calculator: 0x00010001 = 65537 and 0x00010000 = 65536.)
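The byte dump can be checked directly with a small sketch (assuming 2-byte short, 4-byte int, little-endian): read right to left, the first four bytes form 0x00010001 = 65537 and the next four form 0x00010000 = 65536.
#include <stdio.h>

int main(void)
{
    short int a[4] = {1, 1, [3] = 1};
    unsigned char *b = (unsigned char *)a;  /* byte-wise view of the array */
    for (size_t i = 0; i < sizeof a; i++)
        printf("%02X ", b[i]);  /* 01 00 01 00 00 00 01 00 on little-endian */
    printf("\n");
    return 0;
}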

Char pointer to integer array

#include <stdio.h>

int main()
{
    int x[] = {1, 2, 3};
    char *p = &x;
    printf("%d", *(p+1));
    return 0;
}
I ran the code in Code::Blocks and it gives 0 as output.
If I change p to an int pointer, then it gives 2 as output.
#include <stdio.h>

int main()
{
    int x[] = {1, 2, 3};
    int *p = &x;
    printf("%d", *(p+1));
    return 0;
}
Why so?
When p is declared as a pointer to char, it is expected to point at data with a size of 1 byte, so (p + 1) increments p by just 1 byte.
Since an int is typically 4 bytes long, (p + 1) points to the second byte of the int 1, which is still 0 (the value 1 occupies only the lowest-order byte).
If you wanted it to have identical output, you would do something like this:
    printf("%d\n", *(p + sizeof(int)));
But it's best to avoid such code and to compile with the -Wall flag, which would definitely produce a warning in your case.
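A side-by-side sketch of the two interpretations (assuming a 4-byte little-endian int):
#include <stdio.h>

int main(void)
{
    int x[] = {1, 2, 3};
    char *cp = (char *)x;  /* steps one byte at a time */
    int  *ip = x;          /* steps sizeof(int) bytes at a time */

    printf("%d\n", *(cp + 1));            /* 2nd byte of x[0]: 0 */
    printf("%d\n", *(cp + sizeof(int)));  /* 1st byte of x[1]: 2 on little-endian */
    printf("%d\n", *(ip + 1));            /* all of x[1]: 2 */
    return 0;
}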
Assume sizeof(int) is 2 bytes, i.e. 16 bits. 2 in binary is 00000000 00000010.
sizeof(char) is 1 byte, i.e. 8 bits.
Little endian and big endian are two ways of storing multibyte data types.
Consider the following code:
    int i = 2;
    char *c = (char*)&i;
    if (*c == 2)
        printf("Little endian");
    else // if *c is 0
        printf("Big endian");
From this code you can conclude that big endian will store 2 as 00000000 00000010, while little endian will store it as 00000010 00000000. So zero as output would mean the first 8 bits are zero, and the system is big endian. Had it been little endian, the answer would be 2, since a char pointer refers to only 8 bits.
Actually, declaring the data type of a pointer specifies how many bits it refers to and how many bytes it will jump over when incremented.
In this example, as p is a char pointer, *(p+1) refers to 00000010 on big endian and 00000000 on little endian.
Your compiler probably uses 32 bits for an integer, in which case I think *(p+1) gives 0 in both cases (as 2 => 00000000 00000000 00000000 00000010, and the 2nd byte from either side is 0).
Refer to this:
#include <stdio.h>
int main()
{
    int x[] = {1, 2, 3};
    char *p = &x;
    printf("%d\n", *p);
    printf("%d\n", *(p+1));
    printf("%d\n", *(p+2));
    printf("%d\n", *(p+3));
    printf("%d\n", *(p+4));
    printf("%d\n", *(p+5));
    printf("%d\n", *(p+6));
    printf("%d\n", *(p+7));
    printf("%d\n", *(p+8));
    return 0;
}
Output:
1
0
0
0
2
0
0
0
3
To have a look from a slightly different angle at the binary + operator: chapter 6.5.6, paragraph 8 of the C99 standard says, [emphasis mine]
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and
(P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist.
So, in your first case, p is of type char * and (p + 1) yields a pointer incremented by sizeof(char) [that's 1 byte in most cases], hence pointing to the 2nd byte of the array p points into. Since that array is actually of type int [let's say 4 bytes long, on a 32-bit system], and 1 is stored as 00000001 00000000 00000000 00000000 (little-endian), *(p+1) prints out 0.
OTOH, in your second case, p is of type int * and (p + 1) yields a pointer incremented by sizeof(int), hence pointing to the 2nd element of the int array. Since the array holds int x[] = {1, 2, 3};, *(p+1) prints out 2.
When you increment a pointer, it moves by the size of the thing it is pointing to.
Let's say you have 16-bit integers. In binary, the number one is: 00000000 00000001
A char pointer can only point to 8 bits at a time: 00000000. So *(p+1) reads a single byte of that int, and on a little-endian machine the second byte of the value 1 is 0, which is why you see 0.

Wrong number produced when memcpy-ing data into an integer?

I have a char buffer like this:
char *buff = "aaaa0006france";
I want to extract bytes 4 to 7 and store them in an int.
    int i;
    memcpy(&i, buff+4, 4);
    printf("%d ", i);
But it prints junk values.
What is wrong with this?
The string
0006
does not have the same binary representation as the integer 6. Instead, its bit representation is that of four ASCII characters: the glyph 0, the glyph 0, the glyph 0, then the glyph 6. In hex, these bytes are
0x30 0x30 0x30 0x36
If you blindly reinterpret these bytes as a number, you get back 909,127,728 (0x36303030) on a little-endian system and 808,464,438 (0x30303036) on a big-endian one.
If you want to convert a substring of your string into a number, you will need to instead look for a function that converts a string of text into a number. You might want to try something like this:
    char digits[5];
    /* Copy over the digits in question. */
    memcpy(digits, buff + 4, 4);
    digits[4] = '\0'; /* Make sure it's null-terminated! */
    /* Convert the string to a number. */
    int i = strtol(digits, NULL, 10);
This uses the strtol function, which converts a text string into a number, to explicitly convert the text to an integer.
Hope this helps!
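For comparison, a sketch that prints the raw reinterpretation next to the parsed value (the raw number depends on your machine's endianness; the figures in the comment assume little-endian):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *buff = "aaaa0006france";

    int raw;
    memcpy(&raw, buff + 4, sizeof raw);  /* reinterpret the bytes '0','0','0','6' */

    char digits[5];
    memcpy(digits, buff + 4, 4);
    digits[4] = '\0';
    long parsed = strtol(digits, NULL, 10);  /* parse the text as a decimal number */

    printf("raw: %d, parsed: %ld\n", raw, parsed);  /* raw: 909127728, parsed: 6 */
    return 0;
}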
Here you need to note down two things:
how the characters are stored, and
the endianness of the system.
Each character (letter, digit, or special character) is stored as a 7-bit ASCII value (in an 8-bit byte). When memcpy-ing the string (array of characters) "0006" to a 4-byte int variable, we have to give the address of the string as the source and the address of the int as the destination, like below:
char a[] = "0006";
int b = 0, c = 6;
memcpy(&b, a, 4);
The values of a, b and c are stored as below (most significant byte on the left; the machine is assumed little-endian, so a's last byte, '6', is the MSB of the int view):
a   00110110 00110000 00110000 00110000
b   00000000 00000000 00000000 00000000
c   00000000 00000000 00000000 00000110
    MSB                            LSB
This is because the ASCII value of the character 0 is 48 (00110000) and that of the character 6 is 54 (00110110). Now memcpy copies whatever value is present in a over to b. After the memcpy, the value of b is as below:
a   00110110 00110000 00110000 00110000
b   00110110 00110000 00110000 00110000
c   00000000 00000000 00000000 00000110
    MSB                            LSB
Next is endianness. Now suppose we put the value 0006 into the character buffer another way, as raw bytes: a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 6;. If we do the memcpy now, we get the value 100663296 (0x06000000), not 6, on a little-endian machine. On a big-endian machine you would get 6.
a   00000110 00000000 00000000 00000000
b   00000110 00000000 00000000 00000000
c   00000000 00000000 00000000 00000110
    MSB                            LSB
These are the two problems to consider when writing a function that converts numeric characters to an integer value. A simple solution to both is to use the existing library function atoi.
The below code might help you:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char *buff = "aaaa0006france";
    char digits[5];
    memcpy(digits, buff + 4, 4);
    digits[4] = '\0';
    int a = atoi(digits);
    printf("int : %d", a);
    return 0;
}

How to interpret *( (char*)&a )

I've seen this program given as a way to determine the endianness of the platform, but I don't understand it:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int a = 1;
    if( *( (char*)&a ) == 1) printf("Little Endian\n");
    else printf("Big Endian\n");
    system("PAUSE");
    return 0;
}
What does the test do?
An int is almost always larger than a byte and often tracks the word size of the architecture. For example, a 32-bit architecture will likely have 32-bit ints. So given typical 32-bit ints, the layout of the 4 bytes might be:
00000000 00000000 00000000 00000001
or with the least significant byte first:
00000001 00000000 00000000 00000000
A char is one byte, so if we cast this address to a char* and dereference it, we'll get the first byte above, either
00000000
or
00000001
So by examining the first byte, we can determine the endianness of the architecture.
This would only work on platforms where sizeof(int) > 1. As an example, we'll assume it's 2, and that a char is 8 bits.
Basically, with little-endian, the number 1 as a 16-bit integer looks like this:
00000001 00000000
But with big-endian, it's:
00000000 00000001
So first the code sets a = 1, and then this:
*( (char*)&a ) == 1
takes the address of a, treats it as a pointer to a char, and dereferences it. So:
If a contains a little-endian integer, you're going to get the 00000001 section, which is 1 when interpreted as a char.
If a contains a big-endian integer, you're going to get 00000000 instead. The check for == 1 will fail, and the code will assume the platform is big-endian.
You could improve this code by using int16_t and int8_t instead of int and char. Or better yet, just check if htons(1) != 1.
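A fixed-width sketch of the same test using <stdint.h> (htons would additionally need a POSIX header such as <arpa/inet.h>, so it is not shown here):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t a = 1;
    uint8_t first = *(uint8_t *)&a;  /* the byte at the lowest address */
    printf("%s endian\n", first == 1 ? "Little" : "Big");
    return 0;
}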
You can look at an integer as an array of 4 bytes (on most platforms). A little-endian integer will have the values 01 00 00 00 and a big-endian one 00 00 00 01.
By doing &a you get the address of the first element of that array.
The expression (char*)&a casts it to the address of a single byte.
And finally *( (char*)&a ) gets the value contained by that address.
take the address of a
cast it to char*
dereference this char*; this will give you the first byte of the int
check its value: if it's 1, then it's little endian; otherwise, big endian.
Assume sizeof(int) == 4, then:
|........||........||........||........| <- 4bytes, 8 bits each for the int a
| byte#1 || byte#2 || byte#3 || byte#4 |
When steps 1, 2 and 3 are executed, *( (char*)&a ) will give you the first byte, | byte#1 |.
Then, by checking the value of byte#1 you can tell whether it's big or little endian.
The program just reinterprets the space taken up by an int as an array of chars, and relies on 1 as an int being stored as a series of bytes, the lowest-order of which has the value 1 and the rest of which are 0.
So if the lowest-order byte occurs first, the platform is little endian; otherwise it's big endian.
These assumptions may not hold on every single platform in existence.
a = 00000000 00000000 00000000 00000001
    ^                          ^
    |                          |
    &a if big endian           &a if little endian

    00000000                   00000001
    ^                          ^
    |                          |
    (char*)&a for BE           (char*)&a for LE

*(char*)&a = 0 for BE          *(char*)&a = 1 for LE
Here's how it breaks down:
a -- given the variable a
&a -- take its address; type of the expression is int *
(char *)&a -- cast the pointer expression from type int * to type char *
*((char *)&a) -- dereference the pointer expression
*((char *)&a) == 1 -- and compare it to 1
Basically, the cast (char *)&a converts the type of the expression &a from a pointer to int to a pointer to char; when we apply the dereference operator to the result, it gives us the value stored in the first byte of a.
*( (char*)&a )
On a big-endian machine, the data for int i = 1 (size 4 bytes) is arranged in memory as follows (from lower address to higher address):
00000000 --> Address 0x100
00000000 --> Address 0x101
00000000 --> Address 0x102
00000001 --> Address 0x103
While on a little-endian machine it is:
00000001 --> Address 0x100
00000000 --> Address 0x101
00000000 --> Address 0x102
00000000 --> Address 0x103
Analyzing the cast above: say &a = 0x100. Then *((char*)0x100) reads one byte at a time (rather than the four bytes an int load would fetch), so only the data at address 0x100 is referenced.
*( (char*)&a ) == 1 therefore becomes *(char*)0x100 == 1, i.e. 1 == 1 on this machine, which is true, implying it is little endian.
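A union is another common way to spell the same byte inspection; this sketch is equivalent in effect to the pointer cast discussed above (type punning through a union is well defined in C):
#include <stdio.h>

int main(void)
{
    union {
        int i;
        char c[sizeof(int)];
    } u = { .i = 1 };

    /* u.c[0] is the byte at the lowest address of u.i */
    printf("%s endian\n", u.c[0] == 1 ? "Little" : "Big");
    return 0;
}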
