Casting pointers? - c

What will happen If we cast 32 bit pointer to 8 bit?
I assume that for example we have 0x8000 0000 and if we cast to 8 bit, the value of the new pointer will be 0x00. Am I right?

A pointer is a pointer, depend on platform. On a 32 bits CPU a pointer is always 32 bits.
void *temp = 0x8000000;
uint8_t *temp = 0x8000000;
uint16_t *temp = 0x8000000;
If you cast the pointer you change the pointed value size.
void *temp = 0x80000000;
uint8_t temp2 = *((char *)(temp)); // return a single char (8 bits) at 0x80000000
uint16_t temp3 = *((short *)(temp)); // return a single short (16 bits) at 0x80000000

On a x86 arch CPU a pointer is just a integer pointing to a memory area.
What you are really asking, i think, and hope, is if is possible to have a pointer to an integer, and then cast it to pointer to a char.
You can look inside the bits of an integer by creating a pointer to it and cast it to a narrower type, such as char and then increment it.
Basically, if you would increment a char pointer, you will move up only one byte inside the integer.

Well, there are actually architectures which have different pointers of different size. A classical x86 would be to have near and far pointers. The former had 16 bits and was restricted to a certain memory area. Any usage outside this area would have been undefined. The far pointer consisted of two 16 bit values (segment:offset) in a singel 32 bit variable. The segment part spcified the memory region the pointer was valid for, while the offset was actually identical to a near pointer.
However, that was (and is) not part of the C standard, but architecture-specific extensions. As these might have been accepted as crucial for the platform, it was supported by all toolchains available, but still nothing standard.
For your pointers, you should - in general - never fiddle with a pointer, expecially never cast it to a type which cannot hold all bits required for the pointer. If you have to store a pointer in an integer variable for some reason, use uintptr_t or intptr_t which are defined in stdint.h as of C99 standard. However, they are optional, so your toolchain might not support them (gcc for instances does; not idea if Microsoft already had the time to add this header to VS).
Remember that the value stored in uintptr_t (for example) might not be the same as a direct cast to another integer type (but for ARM this is the case as I know myself).
If you are asking, however, how to compress a pointer value, you might first cast it to unitptr_t and then compress the integer value using one of the well-known algorithms (Huffman, etc.).

Related

Why is it possible to store the information content of an int pointer to an int variable in c?

Let us consider the following piece of code:
#include <stdio.h>
int main()
{
int v1, v2, *p;
p = &v1;
v2 = &v1;
printf("%d\t%d\n",p,v2);
printf("%d\t%d\n",sizeof(v2),sizeof(p));
return 0;
}
We can see, as expected, that the v2 variable (int) occupies 4 bytes and that the p variable (int pointer) occupies 8 bytes.
So, if a pointer occupies more than 4 bytes of memory, why we can store its content in an int variable?
In the underlying implementation, does the pointer variables store only the memory address of another variable, or it stores something else?
We can see, as expected, that the v2 variable (int) occupies 4 bytes
and that the p variable (int pointer) occupies 8 bytes.
I'm not sure what exactly the source of your expectation is there. The C language does not specify the sizes of ints or pointers. Its requirements on the range of representable values of type int afford int size as small as two 8-bit bytes, and historically, that was once a relatively common size for int. Some implementations these days have larger ints (and maybe also larger char, which is the unit of measure for sizeof!).
I suppose that your point here is that in the implementation tested, the size of int is smaller than the size of int *. Fair enough.
So, if a pointer occupies more than 4 bytes of memory, why we can
store its content in an int variable?
Who says the code stores the pointer's (entire) content in the int? It converts the pointer to an int,* but that does not imply that the result contains enough information to recover the original pointer value.
Exactly the same applies to converting a double to an int or an int to an unsigned char (for example). Those assignments are allowed without explicit type conversion, but they are not necessarily value-preserving.
Perhaps your confusion is reflected in the word "content". Assignment does not store the representation of the right-hand side to the left-hand object. It converts the value, if necessary, to the target object's type, and stores the result.
In the underlying implementation, does the pointer variables store
only the memory address of another variable, or it stores something
else?
Implementations can and have varied, and so too the meaning of "address" for different machines. But most commonly these days, pointers are represented as binary numbers designating locations in a flat address space.
But that's not really relevant. C specifies that pointers can be converted to integers and vice versa. It also provides integer types intptr_t and uintptr_t (in stdint.h) that support full-fidelity round trip void * to integer to void * conversion. Pointer representation is irrelevant to all that. It is the implementation's responsibility to implement the types and conversions involved so that they behave as required, and there is more than one way to do that.
*C actually requires an explicit conversion -- that is, a typecast -- between pointers and integer. The language specification does not define the meaning of the cast-less assignment in the example code, but some compilers do accept that and perform the needed conversion implicitly. My remarks assume such an implementation.
There is always a warning, see below.
main.c: In function ‘main’:
main.c:6:8: warning: assignment to ‘int’ from ‘int *’ makes integer from pointer without a cast [-Wint-conversion]
v2 = &v1;
main.c: In function ‘main’:
main.c:6:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
v2 = (int) &v1;
In the first case, just setting an integer value to a pointer value is not appropriate, because it is not a compatible type.
In the second case, with a cast of the pointer to an integer, the compiler recognizes the problem of the different sizes, which means v2 can not completely hold (int) &v1;
Conclusion: Both cases are "bad" in terms of creating an undesired behaviour.
About your question "So, if a pointer occupies more than 4 bytes of memory, why we can store its content in an int variable?" - It can NOT completely be stored in an int variable.
About your question "In the underlying implementation, does the pointer variables store only the memory address of another variable, or it stores something else?" - A pointer just points to an address. (It could be the address of another variable or not. It does not matter. It just points to an address.)
The key to understanding what's going on here is that C is an abstraction layer on top of the underlying ISA. Most architectures have little more than registers and memory addresses1 to work with, all of which are of a fixed size. When manipulating "variables", you're really just expressing your intent which the compiler translates into more concrete instructions.
On x86_64, a common architecture, an int is in actuality either a portion of a 64-bit register, or it's a 4-byte location in memory that's aligned on a 4-byte boundary. An int* is a 64-bit value, or 8-byte location in memory with corresponding alignment constraints.
Putting an int* value into a suitably sized variable, such as uint64_t, is allowed. Putting that value back into a pointer and exercising that pointer may not be permitted, it depends on your architecture.
From the programmer's perspective a pointer is just 64 bits of data. From the CPU's perspective it may contain more than that, with modern architectures having things like internal "Pointer Authentication Codes" (PACs) that ensure pointers cannot be injected from external sources. It gets quite a bit more complicated under the hood.
In general it's best to treat pointers as opaque, that is their actual value is as good as random and irrelevant to the internal operation of your program. It's only when you're doing deeper analysis at the architectural level with sufficiently robust profiling tools that the actual internals of the pointer can be informative or relevant.
There are several well-defined operations you can do on pointers, like p[n] to access specific offsets within the bounds of a structure or allocation, but outside of that you're pretty limited in what you can do, or even infer. Remember that modern CPUs and operating systems use virtual memory, so pointer addresses are "fake" and don't represent where they are in physical memory. In fact, they're deliberately scrambled to make them harder to guess.
1 This disregards VLIW, SIMD, and other extensions which are not so simple.
So, if a pointer occupies more than 4 bytes of memory, why we can store its content in an int variable?
You cannot, indeed the code you post is not legal.
#include <stdio.h>
int main()
{
int v1, v2, *p;
this declares to int variables and a pointer to int called p.
p = &v1;
this is legal, as you assign to p the address of the integer variable v1.
v2 = &v1; /* INCORRECT!!! */
this is not. It assigns to an int variable the address of another variable (which is a pointer, and as you well say, it is not possible) The most probable intention of the code writer was:
v2 = *p;
which assigns to v2 the integer value stored at address pointed to by p (which is pointing to v1, so it assigns v2 the value stored in v1.

C - Why cast to uintptr_t vs char* when doing pointer arithmetic

I am working on a programm where I have to modify the target process memory/ read it.
So far I am using void* for storing adresses and cast those to char* if I need to change them (add offset or modify in general)
I have heard of that type defined in stdint.h but I don't see the difference in using it for pointer arithmetic over the char* conversion (which seems more C89 friendly atleast to me)
So my question: What of those both methods should I use for pointer arithmetic? Should I even consider using uintptr_t over char* in any case?
EDIT 1
Basically I just need to know if this yields
0x00F00BAA hard coded memory adress in target
process
void* x = (void*)0x00F00BAA;
char* y = (void*)0x00F00BAA;
x = (uintptr_t)x + 0x123;
y = (char*)y + 0x123;
x == y?
x == (void*)0x00F00CCD?
y == (void*)0x00F00CCD?
In comments user R.. points out that the following is likely incorrect if the addresses the code is dealing with are not valid within the current process. I've asked the OP for clarification.
Do not use uintptr_t for pointer arithmetic if you care about the portability of your code. uintptr_t is an integer type. Any arithmetic operations on it are integer arithmetic, not pointer arithmetic.
If you have a void* value and you want to add a byte offset to it, casting to char* is the correct approach.
It's likely that arithmetic on uintptr_t values will work the same way as char* arithmetic, but it absolutely is not guaranteed. The only guarantee that the C standard provides is that you can convert a void* value to uintptr_t and back again, and the result will compare equal to the original pointer value.
And the standard doesn't guarantee that uintptr_t exists. If there is no integer type wide enough to hold a converted pointer value without loss of information, the implementation just won't define uintptr_t.
I've actually worked on systems (Cray vector machines) where arithmetic on uintptr_t wouldn't necessarily work. The hardware had 64-bit words, with a machine address containing the address of a word. The Unix-like OS needed to support 8-bit bytes, so byte pointers (void*, char*) contained a word address with a 3-bit offset stored in the otherwise unused high-order 3 bits of the 64-bit word. Pointer/integer conversions simply copied the representation. The result was that adding 1 to a char* pointer would cause it to point to the next byte (with the offset handled in software), but converting to uintptr_t and adding 1 would cause it to point to the next word.
Bottom line: If you need pointer arithmetic, use pointer arithmetic. That's what it's for.
(Incidentally, gcc has an extension that permits pointer arithmetic on void*. Don't use it in portable code. It also causes some odd side effects, like sizeof (void) == 1.)

Fixed-sized pointer type in C99

I want to create a type to store pointers. The type should be compatible with C99 and have a fixed-width of 64 bits. I came up with several alternatives but they all seem flawed:
Using uint64_t is incorrect since conversions between pointers and integers are implementation-defined [C99 standard, 6.3.2.3].
uinptr_t also appears to be out of the picture, since the width of this type is not fixed and the type is optional anyway [7.18.1.4].
Using a struct such as
struct {
#ifdef __LP64__
void* ptr;
#else
// if big endian the following two fields need to be flipped
void* ptr;
uint32_t padding;
#endif
} fixed_ptr_type;
does not work either because the size of a pointer is not fixed even within the same implementation
Is there any C99-compatible definition of the type I'm looking for?
Object pointers
The best type to store object pointers is void *. Any object pointer can be converted to void * and back again.
Function pointers
A void * cannot necessarily store a function pointer. However, any function pointer can be converted to another type of function pointer, so you could store them in some arbitrary type (such as void (*)(void)).
Padding
I have no idea why you would need your pointer type to have a predetermined size, but you could pad them by using a union and hope that the result is not too large:
union fixed_ptr_type {
void *p;
char c[64/CHAR_BIT];
};
assert (CHAR_BIT * sizeof (union fixed_ptr_type) == 64);
I don't understand your objection to using the void * with padding. All objects of the same type have the same size. If a different object pointer type has a different size, that doesn't matter, because you convert it to void * to store it in your super-pointer.
Regarding uintptr_t: If it is not supported , then chances are that it's because there is actually no way of doing this on the particular platform.
So you could use uintptr_t. To add in the fixed-width requirement, you could cast to uintptr_t then to uint64_t (if you're happy with knowing you'll have to change your code when someone puts out a system that has pointers greater than 64bits!)
You cannot portably store pointer values in a 64-bit type. It's perfectly legal for an implementation to use 128-bit pointers.
If you don't mind losing portability to systems with pointers bigger than 64 bits, you can probably get away with using uint64_t. Conversions from pointer types to uint64_t are not guaranteed to work correctly without losing information, but they will almost certainly do so on any reasonable systems where pointers are no wider than 64 bits.
If an implementation has no 64-bit unsigned integer type without padding bits, then it will not define uint64_t at all (for example, a system with 9-bit bytes would not be able to implement uint64_t). There's a type uint_least64_t that's guaranteed, as the name implies, to be at least 64 bits wide; it will be exactly 64 bits on most systems, and wider than 64 bits only on systems where uint64_t doesn't exist.
uintptr_t is guaranteed to hold a converted void* value without loss of information, but it's not guaranteed to exist -- and if it doesn't exist, then no integer type can hold a converted void* value without loss of information. A conforming implementation needn't necessarily have any integer type that can hold a pointer value without loss of information.
Function pointers are another matter. Conversion from a function pointer to void*, or to any integer type, has undefined behavior (because the standard doesn't say what the behavior should be).
There simply is no 100% portable way to do what you're trying to do. You'll just have to settle for 99.9% portability. If you're not concerned with function pointers, I'd suggest using uint64_t (perhaps defining your own typedef to make it clear what you're doing) and add a compile-time or run-time check to confirm that sizeof (void*) <= sizeof (uint64_t). That should cover every existing implementation that I've ever heard of.
It might be helpful to know what your actual goal is. Why do you want to store pointers in no more or less than 64 bits? What problem does this solve that storing them in void* objects doesn't solve?
Incidentally, the __LP64__ macro that you mention in your question is non-standard.

what does the following type casting do in C?

I have a question about type casting in C:
void *buffer;
(int *)((int)buffer);
what is this type casting doing? and what does the ((int)buffer) doing?
Imagine you are on a Linux/x86-64 computer like mine. Then pointers are 64 bits and int is 32 bits wide.
So, the buffer variable has been initialized to some location; Perhaps 0x7fff4ec52020 (which could be the address of some local variable, perhaps inside main).
The cast (int)buffer gives you an int, probably the least significant 32 bits, i.e. 0x4ec52020
You are casting again with (int*)((int)buffer), which gives you the bogus address 0x000000004ec52020 which does not point into valid memory. If you dereference that bogus pointer, you'll very probably get a SIGSEGV.
So on some machines (notably mine) (int *)((int)buffer) is not the same as (int*)buffer;
Fortunately, as a statement, (int *)((int)buffer); has no visible side effect and will be optimized (by "removing" it) by the compiler (if you ask it to optimize).
So such code is a huge mistake (could become an undefined behavior, e.g. if you dereference that pointer). If the original coder really intended the weird semantics described here, he should add a comment (and such code is unportable)!
Perhaps #include-ing <stdint.h> and using intptr_t or uintptr_t could make more sense.
Let's see what the C Standard has to say. On page 55 of the last freely published version of the C11 standard http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf we have
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
What does this mean in your example? Section 6 says that the cast (int)buffer will compile, but if the integer is not big enough to hold the pointer (which is likely on 64-bit machines), the result is undefined. The final (int*) casts back to a pointer.
Section 5 says that if the integer was big enough to hold the intermediate result, the result is exactly the same as just casting to (int*) from the outset.
In short the cast (int) is at best useless and at worst causes the pointer value to be lost.
In straight C code this is pointless because conversions to and from void* are implicit. The following would compile just fine
int* p = buffer;
Worse though is this code potentially introduces errors. Consider the case of a 64 bit platform. The conversion to int will truncate the pointer to 32 bits and then assign it to an int*. This will cause the pointer value to be truncated and certainly lead to errors.

Why casting short to void * get a warning, and from int it's okey?

See
I'm using glib, and gpointer is a typedef of void *. (glib did this type to make things clear, I guess).
when I make a signal connect, I have to pass the data as void pointer (void *).
so the code is (something like this) :
...
g_signal_connect (object, function, (gpointer) data);
...
If I use short as data type, I get a warning message of gcc like this:
warning: cast to pointer from integer
of different size
If I use int as data type, I get no warnings.
But in both cases, everything works well, so, why I get this warning using short?
On a 32 bit architecture, a pointer is 32 bits. An int is usually 32 bits as well - so the conversion is '1 to 1'. A short is usually 2 bytes, so casting a short to a pointer isn't normally a very safe things to do - which is why the compiler warns you.
Gcc is telling you that you cast an integer of different size to a pointer, which is dangerous mainly the other way you have it (eg. from a larger datatype to a smaller one). To silence the warning, you can use intptr_t as an intermediate type.
g_signal_connect (object, function, (gpointer)(intptr_t) data);
A short is 2 Bytes a pointer is platform dependant but usually 4 or 8 Bytes. That should be why you get that error. You probably want to pass a the reference of the short in which would be:
(gpointer) &data);
Integer to pointer conversions are implementation-dependent in C. From the ISO standard:
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
As a short (i.e. signed short) is usually 16 bits, and a void * pointer is usually 32 bits on 32 bit architectures or 64 bits on 64 bit architectures, GCC has to decide how to pad the extra bits. What GCC actually does is sign-extend the short, i.e. it copies bits 14-bit 0 into bits 14-bit 0 of the pointer, and it copies bit 15 of the short into the remaining upper bits (bits 31-15 of a 32-bit pointer, or bits 63-15 of a 64-bit pointer.) Sign-extension preserves the value if the pointer is later converted back into an integer type of different width.
short s = -1; // i.e. -> 0xffff
void *p = (void *)s; // on 32 bit system -> 0xffffffff
If sign-extension wasn't what you wanted, perhaps you just wanted a direct copy of the short into bits 15-0 of the pointer. In that case, you'd do something like this:
short s = -1; // i.e. -> 0xffff
void *p = (void *)((unsigned short)s); // on 32 bit system -> 0x0000ffff

Resources