C pointer only using 24 bits - c

I have the following snippet, which prints the value of a pointer. Since it is a pointer on a 64-bit machine, its size is 8 bytes, so 64 bits should be used to represent the address:
#include <stdio.h>
int main()
{
char *s = "";
printf("%p %p\n", s, (char*)(int)s);
return 0;
}
But the output is:
0x4005e4 0x4005e4
Why are only 24 bits used for the pointer value? Shouldn't this be 64 bits?
Also, is it UB if casts between differently sized types are involved, like here: (char *)(int)s?
I was expecting (char)s to give only a 1-byte address, but it prints an 8-byte address.

why only 24 bits are used for the pointer value
Your pointers happen to have their most significant bits set to zero, so they aren't printed. If you really want to print all 64 bits, you can change your printf format string to make it print leading zeros.
is it UB if cast of different size pointer are involved like here (char *)(int)s ?
There are no "different size pointers" on the machines you're likely to be using, but int is commonly 32 bits. So by casting through int on the way to char*, you are throwing away the most significant 32 bits. If they were zero, you may not notice the difference, but if not you'll corrupt the pointer and nobody knows what you'll get if you dereference it. You can still print its value (i.e. the address it points to, even if it's nonsense).

Related

Type casting a 16 bit unsigned integer to an 8 bit unsigned integer pointer in C

While playing around with pointers I came around something interesting.
I initialized a 16 bit unsigned int variable to the number 32771 and then assigned the address of that variable to an 8 bit unsigned int pointer.
Now 32771, in unsigned 16 bit form, has binary representation of 110000000000001. So when dereferencing the 8 bit pointer the first time, I expected it to print the value of 11000000, which is 192, and after incrementing the pointer and dereferencing it again, I expected it to print the value 00000001, which is 1.
In actuality, for the first dereference, 3 was printed, which is what I would get if I read 11000000 from left to right and the second dereference printed 128.
#include <stdio.h>
int main(){
__uint16_t a = 32771;
__uint8_t *p = (__uint8_t *)&a;
printf("%d", *p); //Expected 192 but actual output 3.
++p;
printf("%d", *p); //Expected 1, but actual output 128
}
I know that bits are read from right to left, however in this case the bits are being read from left to right. Why?
32771 is 32768 plus 3. In binary, it's 1000000000000011.
If you split it into the most significant byte and the least significant byte, you get 128 and 3 because 128 * 256 + 3 = 32771.
So the bytes will be 3 and 128. There's no particular reason one should occur before the other. Your CPU can store the bytes that compose a multi-byte number in whatever order it wants to. Apparently, yours stores the least significant byte at a lower address than the most significant byte.

c programming question on reinterpret_cast

What is the reinterpret_cast of (char) doing here?
unsigned int aNumber = 258; // 4 bytes in allocated memory [02][01][00][00]
printf("\n is printing out the first byte %02i",(char)aNumber); // Outputs the first byte[02]
Why am I getting the first byte without pointing to it, such as with (char*)&aNumber?
Is the %02i doing this: (char)*&aNumber?
Or is the reinterpret_cast of (char) cutting off the other 3 bytes, since a char only occupies one of those 4 bytes?
First, reinterpret_cast is a C++ operator. What you've shown is not that but a C-style cast.
The cast is converting a value of type unsigned int to a value of type char. Conversion of an out-of-range value to a signed type is implementation-defined (if char is unsigned, the value is reduced modulo 2^CHAR_BIT, typically 256), but in most implementations you're likely to come across, this is implemented as keeping the low-order byte as the converted value.
In this particular case, the low-order byte of aNumber has the value 0x02, so that's the result when it is cast to a char.

Typecasting of pointers in c

How integer pointer to char pointer conversion works?
I have a program that has integer value 320 and I'm typecasting into char*. It will show the output 64. I want to know how its works?
#include <stdio.h>
int main()
{
int i=320;
char *p=(char*)&i;
printf("%d",*p);
return 0;
}
Well, on your little-endian system, let's assume sizeof (int) is 4.
Then the memory for i looks like:
+--+-+-+-+
i: |64|1|0|0|
+--+-+-+-+
This is because 320 is 0x00000140, i.e. 320 = 1 * 256 + 64.
So you set p to point at the first byte (64), and then dereference it so that single byte is read.
Your final line is wrong, you meant:
printf("%d\n", *p);
Quoting C11, chapter §6.3.2.3, emphasis mine
A pointer to an object type may be converted to a pointer to a different object type. If the
resulting pointer is not correctly aligned for the referenced type, the behavior is
undefined. Otherwise, when converted back again, the result shall compare equal to the
original pointer. When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
So, the binary representation would look like (little-endian architecture assumed, based on output you presented)
00000001 01000000
^^ ^^
HAB LAB HAB- High Address Byte, LAB - Low Address Byte
And, by the cast, you are essentially pointing to
01000000
part. So the dereference will produce that value as the integer result: 01000000 in binary is 64 in decimal.
Note: Only a character type pointer is capable of aliasing any other pointer type. Don't try it with other target types which are not compatible with the source type.
The different value is due to truncation; it also depends on the endianness of the platform. The value 320, if stored in an int of, say, 16 bits, has the following binary pattern.
0000 0001 0100 0000
If a pointer to these two bytes is cast to a pointer of char, it will refer to the lower byte, which is as follows.
0100 0000
However, this bit pattern has a numerical value of 64, which is the output of the program.

Why this program output 64?

I found this program during an online test on C programming. I tried to work through it myself, but I cannot figure out why its output is 64.
Can anyone explain the concept behind this?
#include <iostream>
#include <stdio.h>
using namespace std;
int main()
{
int a = 320;
char *ptr;
ptr = (char *)&a;
printf("%d",*ptr);
return 0;
}
output:
64
Thankyou.
A char * points to one byte only. Assuming a byte on your system is 8 bits, the number 320 occupies 2 bytes. The lower byte of those is 64, the upper byte is 1, because 320 = 256 * 1 + 64. That is why you get 64 on your computer (a little-endian computer).
But note that on other platforms, so called big-endian platforms, the result could just as well be 1 (the most significant byte of a 16 bit/2 byte value) or 0 (the most significant byte of a value larger than 16 bit/2 bytes).
Note that all this assumes that the platform has 8-bit bytes. If it had, say 10-bit bytes, you would get a different result again. Fortunately, most computers have 8-bit bytes nowadays.
You won't be able to understand this unless you know about:
hex/binary representation, and
CPU endianness.
Type out the decimal number 320 in hex. Split it up into bytes. Assuming int is 4 bytes, you should be able to tell which parts of the number go in which bytes.
After that, consider the endianness of the given CPU and sort the bytes in that order. (MS byte first or LS byte first.)
The code accesses the byte allocated at the lowest address of the integer. What it contains depends on the CPU endianness. You'll either get hex 0x40 or hex 0x00.
Note: You shouldn't use char for this kind of thing, because it has implementation-defined signedness. If the data bytes contain values larger than 0x7F, you might get some very weird bugs that appear and disappear inconsistently across compilers. Always use uint8_t* when doing any form of bit/byte manipulation.
You can expose this bug by replacing 320 with 384. Your little endian system may then either print -128 or 128, you'll get different results on different compilers.
What @Lundin said is enough.
BTW, maybe some basic knowledge is helpful. 320 = 0x0140, and an int is 4 chars wide. So when the first byte is printed, the output is 0x40 = 64 because of CPU endianness.
ptr is a char pointer to a, so *ptr reads a single byte of a. On a little-endian machine that is the least significant byte, which holds the value modulo 256: 256 becomes 0, 257 becomes 1, and so on. Thus 320 becomes 64.
An int is a four-byte data type while a char is a one-byte data type, so a char pointer addresses one byte at a time. The binary value of 320 is 00000000 00000000 00000001 01000000. On a little-endian machine the least significant byte (01000000) is stored at the lowest address, so the char pointer ptr points to that byte.
*ptr, i.e. the content of that byte, is 01000000, and its decimal value is 64.

Pointer indirection when pointer and data widths differ

I want to access a 32-bit data pointed to by an address in a hardware register (which is 64 bits, with only 40 LSb's set). So I do:
paddr_t address = read_hw(); // paddr_t is unsigned long long
unsigned int value = *(unsigned int*) address; // error: cast to pointer from integer of different size
unsigned int value2 = (unsigned int) *((paddr_t*) address); // error: cast to pointer from integer of different size
What would be the right way to do this without compiler error (I use -Werror)?
Nominally with C99 the first option is closest to correct,
uint32_t value = *(uint32_t*)address;
However you may also choose to use the other pointer/integer helpers,
uintptr_t address = read_hw();
uint32_t value = *(uint32_t*)address;
I'm not sure I understand the question.
"I want to access a 32-bit data pointed to by an address in a hardware register (which is 64 bits, with only 40 LSb's set)."
So you have a hardware register 64 bits wide, the least significant 40 of which should be interpreted as an address in memory which contains 32 bits of data?
Can you try
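uint32_t *pointer = (uint32_t *)(uintptr_t)((*(volatile uint64_t *)register_address) & (~UINT64_C(0) >> 24));
uint32_t value = *pointer;
Although this might get more complicated depending on endianness. The mask is built from an unsigned 64-bit constant so that >> is a logical shift; with plain ~0, which is a signed int, the shift would not produce a 40-bit mask.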
Although, really, I want to ask,
Does "I am using a cross-compiler, I don't have the luxury of printf" mean you can't actually run your code, or just that you have to do it on some hardware that lacks a convenient output channel?
What is your target architecture, that your pointers are 40 bits long?!
From what you have written, you have a 64-bit pointer of which only 40 bits are the pointer, and that pointer points to some data that is 32 bits in size.
Your code seems to be trying to mangle the 40 bit pointer into a 32 bit pointer.
What you should be doing is &'ing the relevant 40 bits within the 64 bit pointer so that it remains a 64 bit pointer, and then using that to access the data, which you can then similarly & to get the data. Otherwise you are (as the errors indicate) truncating the pointer.
Something like (I don't have 64 bit so I can't test this, but you get the idea):
address = address & 0x????????????????; // use the ?s to mask off the bits you
// want to ignore
value64 = *address; // value64 is 64 bits
value32 = (int)(value64 & 0x00000000ffffffff); // if the data is in the lower
// half of value64
or
value32 = (int)((value64 & 0xffffffff00000000) >> 32); // if the data is in the
// higher half of value64
where the ?'s are masking the bits as needed (depending on the endianness that you are working with).
You'll probably also need to change the (int) casts to suit (you want to instead cast it to whatever 32 bit data type the data represents - ie. the type of value32).
check the real sizes for your pointers and paddr_t type:
printf("paddr_t size: %zu, pointer size: %zu\n",
sizeof(paddr_t), sizeof(unsigned int *));
what do you get?
update:
ARM is a 32-bit architecture here, so you are trying to convert a 64-bit integer to a 32-bit pointer, and your compiler doesn't like it!
If you are sure that the value in paddr_t fits in a 32-bit pointer, you can just cast it to a pointer-sized integer first (uintptr_t from <stdint.h> is a safer choice than int):
unsigned int *p = (unsigned int *)(uintptr_t)address;

Resources