Size of Pointer Variables - c

Considering pch, pshort, and pdouble declared as pointers to char, short int, and double respectively, how would the three variables be ordered if they were arranged by size?

The facetious answer is you don't know. char, short, and double could all be the same size, and char*, short*, and double* could all be different sizes!
sizeof(char) is 1 by the standard. You can't have anything smaller than that, so it makes sense to put char first.
But short int could be the same size as a long int: the standard only specifies minimum ranges. And either could be larger than a double.
Normally a double weighs in at 64 bits, and a short at 16 or 32 bits.
The parsimonious answer is char, short, double.
As for pointers, the standard allows sizeof(char*), sizeof(short*), and sizeof(double*) to all differ.

Pointers to different types may have different sizes, although on most modern platforms they are all the same size (32 bits on x86, 64 bits on x86_64).
The requirements [1] are:
Pointers to char and pointers to void have the same size and representation;
Pointers to struct types all have the same size and representation;
Pointers to union types all have the same size and representation;
Pointers to qualified and unqualified versions of compatible types have the same size and representation (i.e., sizeof (int *) == sizeof (const int *));
Pointers to all other types may have different sizes.
[1] C 2011 Online Draft, section 6.2.5, para 28.
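The guarantees listed above can be written down as compile-time checks. This is a minimal sketch, assuming a C11 compiler (for _Static_assert); the struct and union types and the function name are hypothetical and exist only for illustration.

```c
#include <stddef.h>

struct point { int x, y; };          /* hypothetical types, illustration only */
struct node;                         /* an incomplete struct type also counts */
union number { int i; double d; };
union blob { char c; long l; };

/* Guaranteed relations: these hold on every conforming implementation. */
_Static_assert(sizeof(void *) == sizeof(char *), "void* and char* match");
_Static_assert(sizeof(struct point *) == sizeof(struct node *), "struct pointers match");
_Static_assert(sizeof(union number *) == sizeof(union blob *), "union pointers match");
_Static_assert(sizeof(int *) == sizeof(const int *), "qualifiers do not change size");

/* NOT guaranteed: returns 1 only if this platform happens to use a single
   size for these three object pointer types, 0 otherwise. */
int object_pointers_share_size(void)
{
    return sizeof(char *) == sizeof(short *)
        && sizeof(short *) == sizeof(double *);
}
```

On mainstream desktop platforms object_pointers_share_size() would be expected to return 1, but code that depends on that result is relying on the platform, not on the standard.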

Pointer variables are usually 64 bits on machines that can do 64-bit arithmetic, because that allows the machine to address more than 2^32 bytes (4 GiB) of RAM. On machines that can't do 64-bit arithmetic, reaching more memory than a native pointer can address requires long pointers, which are really two pointers put together: one selects a region of memory (treating each maximum-size region as an element of an array) and the other is an offset within that region. Consequently, long pointers are slower than regular pointers.
EDIT
Bethsheba reminded me, in a comment below, that the pointer will usually not be 64 bits when in a register, but will usually be padded to 64 bits in memory for performance reasons (so, treat it as 64 bits).

Related

Can a C primitive be wider than a void*?

I know that void* can hold any pointer but I wonder also if with casting it can hold anything besides an array or a struct, i.e., primitives and unions. Is this a guarantee with C (and if so which version) or just implementation-specific or not even a thing?
I wonder also if with casting it can hold anything besides an array or a struct, i.e., primitives and unions. Is this a guarantee with C [...]?
No, it is not. As a counterexample, pointers are typically 32 bits wide in C implementations for 32-bit CPUs,* but C requires type long long int to be at least 64 bits wide.** C places no upper bound on the size of any arithmetic type and no (explicit) lower bound on the size of any pointer type, so in principle, even short might be wider than void *.
*For example, GCC uses the pointer representation specified by the target application binary interface (ABI). On x86 Linux, that means the x86 System V ABI, which defines the size of an address as 4 bytes.
** Per paragraph 5.2.4.2.1/1 of the current C language standard, the maximum value representable by an object of type long long int must be at least 9223372036854775807, which is 2^63 - 1. If you include the sign bit, too, then it follows that a long long int requires at least 64 bits.
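Since a long long is not guaranteed to fit in a pointer, code that wants to pass an integer through a void * can check the range first. The sketch below uses the range of intptr_t as a stand-in for "fits in a pointer", which is an assumption: intptr_t is optional, and the function name is hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if v is representable in intptr_t, 0 if it is not, and -1 when
   the implementation does not provide intptr_t at all (the type is optional). */
int fits_in_intptr(long long v)
{
#ifdef INTPTR_MAX
    return v >= INTPTR_MIN && v <= INTPTR_MAX;
#else
    (void)v;
    return -1;
#endif
}
```

On a 32-bit x86 Linux target of the kind mentioned in the footnote, a call such as fits_in_intptr(LLONG_MAX) would be expected to return 0, which is exactly the counterexample the answer describes.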

Casting pointers?

What will happen If we cast 32 bit pointer to 8 bit?
I assume that for example we have 0x8000 0000 and if we cast to 8 bit, the value of the new pointer will be 0x00. Am I right?
A pointer is a pointer, and its size depends on the platform. On a typical 32-bit CPU every pointer is 32 bits wide, whatever it points to:
void *p0 = (void *)0x80000000;
uint8_t *p1 = (uint8_t *)0x80000000;
uint16_t *p2 = (uint16_t *)0x80000000;
Casting the pointer changes the size of the object it is treated as pointing to, not the size of the pointer itself:
void *temp = (void *)0x80000000;
uint8_t temp2 = *((char *)(temp)); // reads a single char (8 bits) at 0x80000000
uint16_t temp3 = *((short *)(temp)); // reads a single short (16 bits) at 0x80000000
On an x86 CPU a pointer is just an integer holding the address of a memory location.
What you are really asking, I think (and hope), is whether it is possible to take a pointer to an integer and cast it to a pointer to char.
You can look inside the bytes of an integer by creating a pointer to it, casting that pointer to a pointer to a narrower type such as char, and then incrementing it.
Each time you increment a char pointer, you move up only one byte inside the integer.
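The byte-by-byte walk described above might look like the following. It is a minimal sketch: the function name is hypothetical, uint32_t is assumed to be available, and unsigned char is used instead of plain char so the byte values are never negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies the object representation of value into out, one byte at a time,
   in whatever order this platform stores the bytes (endianness-dependent). */
void dump_bytes(uint32_t value, unsigned char out[sizeof(uint32_t)])
{
    const unsigned char *p = (const unsigned char *)&value;
    for (size_t i = 0; i < sizeof value; i++) {
        out[i] = *p;
        p++;   /* advancing an unsigned char * moves exactly one byte */
    }
}
```

For the value 0x01020304, a little-endian machine would store 0x04 in out[0] and a big-endian machine would store 0x01 there; the pointer itself keeps its full width throughout.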
Well, there are actually architectures which have pointers of different sizes. A classic example is x86 with near and far pointers. The former had 16 bits and was restricted to a certain memory area. Any usage outside this area would have been undefined. The far pointer consisted of two 16-bit values (segment:offset) in a single 32-bit variable. The segment part specified the memory region the pointer was valid for, while the offset was actually identical to a near pointer.
However, that was (and is) not part of the C standard, but architecture-specific extensions. As these might have been accepted as crucial for the platform, it was supported by all toolchains available, but still nothing standard.
For your pointers, you should, in general, never fiddle with a pointer, and especially never cast it to a type which cannot hold all the bits required for the pointer. If you have to store a pointer in an integer variable for some reason, use uintptr_t or intptr_t, which are defined in stdint.h as of the C99 standard. However, they are optional, so your toolchain might not support them (gcc, for instance, does; no idea whether Microsoft has had the time to add this header to VS yet).
Remember that the value stored in uintptr_t (for example) might not be the same as a direct cast to another integer type (but for ARM this is the case as I know myself).
If you are asking, however, how to compress a pointer value, you might first cast it to uintptr_t and then compress the integer value using one of the well-known algorithms (Huffman, etc.).
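The advice about uintptr_t above can be sketched as follows, next to the lossy 8-bit truncation the question asks about. The function names are hypothetical, and the code assumes the implementation provides uintptr_t.

```c
#include <stdint.h>

/* Converts an object pointer to an integer and back again. */
void *round_trip_via_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* the integer value is implementation-defined */
    return (void *)bits;             /* compares equal to the original p */
}

/* Truncating to 8 bits keeps only the low-order bits of the converted
   value; the original pointer cannot be recovered from them. */
unsigned char low_byte_of_pointer(void *p)
{
    return (unsigned char)((uintptr_t)p & 0xFFu);
}
```

On a flat-address platform the converted value of a pointer to 0x80000000 would typically have a low byte of 0x00, matching the guess in the question, but the standard does not promise that mapping.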

Fixed-sized pointer type in C99

I want to create a type to store pointers. The type should be compatible with C99 and have a fixed-width of 64 bits. I came up with several alternatives but they all seem flawed:
Using uint64_t is incorrect since conversions between pointers and integers are implementation-defined [C99 standard, 6.3.2.3].
uintptr_t also appears to be out of the picture, since the width of this type is not fixed and the type is optional anyway [7.18.1.4].
Using a struct such as
struct {
#ifdef __LP64__
void* ptr;
#else
// if big endian the following two fields need to be flipped
void* ptr;
uint32_t padding;
#endif
} fixed_ptr_type;
does not work either because the size of a pointer is not fixed even within the same implementation
Is there any C99-compatible definition of the type I'm looking for?
Object pointers
The best type to store object pointers is void *. Any object pointer can be converted to void * and back again.
Function pointers
A void * cannot necessarily store a function pointer. However, any function pointer can be converted to another type of function pointer, so you could store them in some arbitrary type (such as void (*)(void)).
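Storing a function pointer in an arbitrary function pointer type, as described above, could look like this. The typedef and function names are hypothetical; the important part is converting back to the original type before the call.

```c
typedef void (*generic_fn)(void);     /* arbitrary storage type for function pointers */
typedef int (*binary_op)(int, int);   /* the real type of the stored function */

int add_ints(int a, int b) { return a + b; }

/* Converting between function pointer types is allowed by the standard. */
generic_fn store_op(binary_op op)
{
    return (generic_fn)op;
}

int call_stored_op(generic_fn stored, int a, int b)
{
    /* Calling through the wrong type is undefined behavior, so convert
       back to the original type first. */
    binary_op op = (binary_op)stored;
    return op(a, b);
}
```

With these definitions, call_stored_op(store_op(add_ints), 2, 3) calls add_ints through its original type.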
Padding
I have no idea why you would need your pointer type to have a predetermined size, but you could pad them by using a union and hope that the result is not too large:
union fixed_ptr_type {
void *p;
char c[64/CHAR_BIT];
};
assert (CHAR_BIT * sizeof (union fixed_ptr_type) == 64);
I don't understand your objection to using the void * with padding. All objects of the same type have the same size. If a different object pointer type has a different size, that doesn't matter, because you convert it to void * to store it in your super-pointer.
Regarding uintptr_t: if it is not supported, then chances are that it's because there is actually no way of doing this on the particular platform.
So you could use uintptr_t. To add in the fixed-width requirement, you could cast to uintptr_t and then to uint64_t (if you're happy with knowing you'll have to change your code when someone puts out a system that has pointers greater than 64 bits!)
You cannot portably store pointer values in a 64-bit type. It's perfectly legal for an implementation to use 128-bit pointers.
If you don't mind losing portability to systems with pointers bigger than 64 bits, you can probably get away with using uint64_t. Conversions from pointer types to uint64_t are not guaranteed to work correctly without losing information, but they will almost certainly do so on any reasonable systems where pointers are no wider than 64 bits.
If an implementation has no 64-bit unsigned integer type without padding bits, then it will not define uint64_t at all (for example, a system with 9-bit bytes would not be able to implement uint64_t). There's a type uint_least64_t that's guaranteed, as the name implies, to be at least 64 bits wide; it will be exactly 64 bits on most systems, and wider than 64 bits only on systems where uint64_t doesn't exist.
uintptr_t is guaranteed to hold a converted void* value without loss of information, but it's not guaranteed to exist -- and if it doesn't exist, then no integer type can hold a converted void* value without loss of information. A conforming implementation needn't necessarily have any integer type that can hold a pointer value without loss of information.
Function pointers are another matter. Conversion from a function pointer to void*, or to any integer type, has undefined behavior (because the standard doesn't say what the behavior should be).
There simply is no 100% portable way to do what you're trying to do. You'll just have to settle for 99.9% portability. If you're not concerned with function pointers, I'd suggest using uint64_t (perhaps defining your own typedef to make it clear what you're doing) and add a compile-time or run-time check to confirm that sizeof (void*) <= sizeof (uint64_t). That should cover every existing implementation that I've ever heard of.
It might be helpful to know what your actual goal is. Why do you want to store pointers in no more or less than 64 bits? What problem does this solve that storing them in void* objects doesn't solve?
Incidentally, the __LP64__ macro that you mention in your question is non-standard.
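The suggestion above (a uint64_t typedef plus a check that sizeof (void*) <= sizeof (uint64_t)) might be sketched like this. The names are hypothetical, the conversion goes through uintptr_t as an added assumption, and the negative-array-size trick stands in for a static assertion so the code stays within C99.

```c
#include <stdint.h>

typedef uint64_t stored_ptr;   /* hypothetical name for the 64-bit storage type */

/* Compile-time check that works in C99: the array size becomes -1, and the
   build fails, if a void * is wider than the storage type. */
typedef char stored_ptr_fits_check[(sizeof(void *) <= sizeof(stored_ptr)) ? 1 : -1];

/* Object pointers only; function pointers are not covered. */
stored_ptr ptr_store(void *p)
{
    return (stored_ptr)(uintptr_t)p;
}

void *ptr_load(stored_ptr s)
{
    return (void *)(uintptr_t)s;
}
```

This is the "99.9% portable" variant: it refuses to compile on an implementation with pointers wider than 64 bits, and it does not exist at all where uint64_t or uintptr_t is missing.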

Is sizeof(int) guaranteed to equal sizeof(void*)

Is the size of the datatype "int" always equals to the size of a pointer in the c language?
I'm just curious.
Not at all, there is no guarantee that sizeof(int) == sizeof(void*). And on Linux/AMD64 sizeof(int) is 4 bytes, and sizeof(void*) is 8 bytes (same as sizeof(long) on that platform).
Recent C standards (e.g. C99) define a standard header <stdint.h> which may define, among others, an integer type intptr_t. The type is optional; where it exists, any valid void * can be converted to intptr_t and back, and the result compares equal to the original pointer. It is not required to have exactly the size of a pointer.
I think that the standard does not guarantee that all pointers have the same size; in particular, pointers to functions can be "bigger" than data pointers (though I cannot name a platform where this is true). I believe that recent POSIX standards require function pointers to be convertible to and from void * (e.g. for dlsym(3)).
See also this C reference and the n1570 draft C11 standard (or better)
PS. In 2021 I cannot name a common platform with sizeof(long) != sizeof(void*). But in the previous century the old Intel 286 could have been such a platform.
No. for example, in most 64bit systems, int is 4 bytes, and void* is 8.
It is not guaranteed.
And for example, in most 64-bit systems both sizes are usually different.
Even sizeof (int *) is not guaranteed to be equal to sizeof (void *).
The only guarantee for void * size is
sizeof (void *) == sizeof (char *)
== sizeof (signed char *) == sizeof (unsigned char *)
No. Some (mostly older, VAX-era) code assumes this, but it's definitely not required, and assuming it is not portable. There are real implementations where the two differ (e.g., some current 64-bit environments use a 64-bit pointer and 32-bit int).
The C language gives no guarantee of any relationship between integer sizes and pointer sizes.
The size of int is typically the same as the data bus width, but not necessarily. The size of a pointer is typically the same as the address bus width, but not necessarily.
Many compilers use non-standard extensions like the far keyword, to access data beyond the width of the default pointer type.
In addition to 64-bit systems, there are also plenty of microcontroller/microprocessor architectures where the size of int and the size of a pointer are different. Windows 3.1 and DOS are other examples.
There's no guarantee of any relation between the sizes of these two types, nor that either can be faithfully represented in the other via round-trip casts. It's all implementation-defined.
With that said, in the real world, unless you're dealing with really obscure legacy 16-bit systems or odd DSPs or such, sizeof(int) is going to be less than or equal to sizeof(void *), and you can faithfully convert int values to void * to pass them to interfaces (like pthread_create) that take a generic void * argument to avoid wasteful allocation and freeing of memory to store a single int. In particular, if you're using POSIX or Windows interfaces already, this is definitely a safe real-world assumption to make.
You should never assume void * can be faithfully represented in int (i.e. casting a pointer to int and back). This does not work on any popular real-world 64-bit systems, and the percentage of systems it works on is sure to plummet in the near future.
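The int-through-void * technique described above, for interfaces such as pthread_create that take a generic void * argument, might look like the sketch below. The function names are hypothetical, intptr_t is assumed to exist, and the result of these conversions is implementation-defined, so this rests on the same real-world assumption the answer states.

```c
#include <stdint.h>

/* Packs a small integer into a void * argument instead of allocating
   memory for it. Going through intptr_t avoids a direct int-to-pointer cast. */
void *int_to_arg(int value)
{
    return (void *)(intptr_t)value;
}

/* Recovers the integer on the receiving side, e.g. inside a thread function. */
int arg_to_int(void *arg)
{
    return (int)(intptr_t)arg;
}
```

The pointer produced by int_to_arg must never be dereferenced; it only carries the integer. The reverse direction (squeezing a real pointer into an int) is the case the answer warns against.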
No. Pointer types do not have to be the same size or representation as integer types. Here are a few relevant sections from the C language standard (online draft available here):
6.2.5 Types
...
27 A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. 39) Similarly, pointers to qualified or unqualified versions of
compatible types shall have the same representation and alignment requirements. All
pointers to structure types shall have the same representation and alignment requirements
as each other. All pointers to union types shall have the same representation and
alignment requirements as each other. Pointers to other types need not have the same
representation or alignment requirements.
...
39) The same representation and alignment requirements are meant to imply interchangeability as
arguments to functions, return values from functions, and members of unions.
...
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.56)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
...
56) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
No, it doesn't have to be, but it's usually the case that sizeof(long) == sizeof(void*).

Byte precision pointer arithmetic in C when sizeof(char) != 1

How can one portably perform pointer arithmetic with single byte precision?
Keep in mind that:
char is not 1 byte on all platforms
sizeof(void) == 1 is only available as an extension in GCC
While some platforms may have pointer deref pointer alignment restrictions, arithmetic may still require a finer granularity than the size of the smallest fundamental POD type
Your assumption is flawed - sizeof(char) is defined to be 1 everywhere.
From the C99 standard (TC3), in section 6.5.3.4 ("The sizeof operator"):
(paragraph 2)
The sizeof operator yields the size
(in bytes) of its operand, which may
be an expression or the
parenthesized name of a type.
(paragraph 3)
When applied to an operand that has
type char, unsigned char, or signed
char, (or a qualified version
thereof) the result is 1.
When these are taken together, it becomes clear that in C, whatever size a char is, that size is a "byte" (even if that's more than 8 bits, on some given platform).
A char is therefore the smallest addressable type. If you need to address in units smaller than a char, your only choice is to read a char at a time and use bitwise operators to mask out the parts of the char that you want.
sizeof(char) always returns 1, in both C and C++. A char is always one byte long.
According to the standard char is the smallest addressable chunk of data. You just can't address with greater precision - you would need to do packing/unpacking manually.
sizeof(char) is guaranteed to be 1 by the C standard. Even if char uses 9 bits or more.
So you can do:
type *pt;
unsigned char *pc = (unsigned char *)pt;
And use pc for arithmetic. Converting the result back to type * with a cast is undefined behavior by the C standard, though, if the resulting pointer is not correctly aligned for type.
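Byte-granular arithmetic through unsigned char * can be sketched as below. The struct and function names are hypothetical; the offset comes from offsetof, so the pointer converted back at the end is correctly aligned for the member it designates.

```c
#include <stddef.h>

struct record { char tag; double value; int count; };   /* hypothetical type */

/* Moves a pointer forward by n bytes, where a byte is one unit of
   sizeof(char), whatever its width in bits on this platform. */
void *advance_bytes(void *base, size_t n)
{
    unsigned char *pc = (unsigned char *)base;
    return pc + n;
}

/* Reaches the count member by byte offset instead of by name. */
int *count_member(struct record *r)
{
    return (int *)advance_bytes(r, offsetof(struct record, count));
}
```

Here count_member(&r) yields the same address as &r.count; an arbitrary byte offset, by contrast, might not be suitably aligned for int.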
If char is more than 8-bits wide, you can't do byte-precision pointer arithmetic in portable (ANSI/ISO) C. Here, by byte, I mean 8 bits. This is because the fundamental type itself is bigger than 8 bits.
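Cast the pointer to uintptr_t, an unsigned integer type wide enough to hold a converted void * (where the implementation provides it). Now do your arithmetic on it, then cast the result back to a pointer of the type you want to dereference. The mapping between the integer value and the address is implementation-defined, so this relies on a flat address space.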
(Note that intptr_t is signed, which is usually NOT what you want! It's safer to stick to uintptr_t unless you have a good reason not to!)
I don't understand what you are trying to say with sizeof(void) being 1 in GCC. While type char might theoretically consist of more than 1 underlying machine byte, in C language sizeof(char) is 1 and always exactly 1. In other words, from the point of view of C language, char is always 1 "byte" (C-byte, not machine byte). Once you understand that, you'd also understand that sizeof(void) being 1 in GCC does not help you in any way. In GCC the pointer arithmetic on void * pointers works in exactly the same way as pointer arithmetic on char * pointers, which means that if on some platform char * doesn't work for you, then void * won't work for you either.
If on some platform char objects consist of multiple machine bytes, the only way to access smaller units of memory than a full char object would be to use bitwise operations to "extract" and "modify" the required portions of a complete char object. C language offers no way to directly address anything smaller than char. Once again char is always a C-byte.
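The bitwise "extract" and "modify" operations mentioned above might be sketched like this. The function names are hypothetical, and both functions assume width is at least 1, shift + width does not exceed CHAR_BIT, and width is smaller than the number of bits in unsigned int.

```c
/* Extracts width bits from byte, starting at bit position shift
   (bit 0 is the least significant bit). */
unsigned get_bits(unsigned char byte, unsigned shift, unsigned width)
{
    unsigned mask = (1u << width) - 1u;
    return (byte >> shift) & mask;
}

/* Returns a copy of byte in which the same field is replaced by value;
   bits of value that do not fit in the field are discarded. */
unsigned char set_bits(unsigned char byte, unsigned shift, unsigned width, unsigned value)
{
    unsigned mask = ((1u << width) - 1u) << shift;
    return (unsigned char)((byte & ~mask) | ((value << shift) & mask));
}
```

For example, get_bits(0xAB, 4, 4) yields 0xA and set_bits(0xAB, 4, 4, 0x5) yields 0x5B. The char stays the smallest addressable unit; the sub-char fields exist only by convention in the code.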
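The C99 standard defines uint8_t, an unsigned type exactly 8 bits wide, in <stdint.h>. The type is optional: it exists only where the implementation has such a type, so on a platform whose char is wider than 8 bits it will not be available. If the compiler doesn't provide <stdint.h>, you could define the type yourself using a typedef. Of course you would need a different definition, depending on the platform and/or compiler. Bundle everything in a header file and use it everywhere.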

Resources