C: why would a pointer be larger than an integer?

I am playing around with sizeof() in GCC on a linux machine right now and I found something very surprising.
printf("\nsize of int = %zu\n", sizeof(int));
printf("\nsize of int* = %zu\n", sizeof(int *));
yields
size of int = 4
size of int* = 8
I thought the size of a pointer to an integer would be much smaller than the actual integer itself!
I am researching embedded software right now and I was under the impression that passing by reference was more efficient (in terms of power) than passing by value.
Could someone please clarify why it is more efficient to pass by reference than by value if the size of the pointer is larger than the actual value?
Thanks!

An int can be any size the compiler writer likes; the only rules (in standard C) are: a) int isn't smaller than a short or bigger than a long, and b) int has at least 16 bits.
It's not uncommon on a 64-bit platform to keep int at 32 bits for compatibility.

Passing by reference is more efficient than passing by value when the value to be passed is larger than the size of the reference. It makes a lot of sense to pass by reference if what you are passing is a large struct/object. It also makes sense to pass a reference if you want to make persistent modifications to your value.

Passing by reference is more efficient because no data (other than the pointer) needs to be copied. This means that this is only more efficient when passing classes with many fields or structs or any other data that is larger than a pointer on the system used.
In the case you mentioned it could indeed be more efficient to not use a pointer because the actual value is smaller than a pointer to it (at least on the machine you were using).
Bear in mind that on a 32-bit machine a pointer typically occupies 4 bytes (4*8 = 32 bits), while on the 64-bit machine you were apparently using a pointer occupies 8 bytes (8*8 = 64 bits).
On older 16-bit machines pointers require only 2 bytes. Some embedded systems may still use such an architecture, but I don't know for certain.

In C, a pointer, any pointer, is just a memory address. You're on a 64-bit machine, and at the hardware level, memory addresses are referred to with 64-bit values. This is why 64-bit machines can use much more memory than 32-bit machines.

A pointer to an integer can point at a single integer, but the same pointer can also point at ten, twenty, one hundred or one million integers.
Obviously passing a single 8 byte pointer in lieu of a single 4 byte integer is not a win; but passing a single 8 byte pointer in lieu of one million 4 byte integers certainly is.

One thing has nothing to do with the other. One is an address, a pointer to something; it doesn't matter whether that something is a char, short, int or structure. The other is a language-specific thing called an int, which the compiler for that system, that compiler version and perhaps its command-line options happens to define as some size.
It appears that you are running on a 64-bit system, so all your pointers/addresses are going to be 64 bits. What they point to is a separate discussion: longs are probably going to be 64 bits as well, sometimes ints too; shorts are probably still 16 bits, but that is not a hard and fast rule, and chars are hopefully 8 bits, but that is also not a hard and fast rule.
Where this can get even worse is cross-compiling. While using llvm-gcc, before clang was as solid as it is now, with a 64-bit host the bytecode was all generated based on the 64-bit host: 64-bit integers, 64-bit pointers, etc. Then when the backend ran for the ARM target, it had to use compiler library calls for all of this 64-bit work. The variables hardly needed to be shorts, much less ints, but were ints. The -m32 switch was broken; you still got 64-bit integers due to the host, not the ultimate target. gcc directly doesn't appear to have this problem, and clang+llvm doesn't currently have this problem either.
The short answer is that the language defines some data types (char, short, int, long, etc.) and those data types have a compiler-implementation-defined size. An address is just another implementation-defined data type. It is like asking why a short is not the same number of bytes as a long: because they are different data types; one is a short, one is a long. One is an address, the other is a variable: two different things.

http://developers.sun.com/solaris/articles/ILP32toLP64Issues.html
When converting 32-bit programs to 64-bit programs, only long types
and pointer types change in size from 32 bits to 64 bits; integers of
type int stay at 32 bits in size.
In 64-bit executables, pointers are 64 bits. Long ints are also 64 bits, but ints are only 32 bits.
In 32-bit executables, pointers, ints and long ints are all 32 bits. 32-bit executables also support 64-bit "long long" ints.

A pointer must be able to reference all of memory. If the (virtual) address space is larger than 4 GiB, then the pointer must be wider than 32 bits.

Related

Is this pointer code legal on 64-bit computers

I plan to use memory across two pointers. Let's call them pointer1 and pointer2. Each pointer will be connected to its own share of memory as defined by block1 and block2 respectively.
I think this way works for all systems (both 32 and 64 bit):
char block1[100000];
char *pointer1=block1;
char block2[100000];
char *pointer2=block2;
However I think a faster way would be to use this code:
char block[200000];
char *pointer1=block;
char *pointer2=block+100000;
My question is would the last line of the last code fragment be compatible with 64-bit architecture?
The address space of a 32-bit architecture is 2**32 = 4294967296 bytes; for a 64-bit architecture it is 2**64 = 18446744073709551616 bytes. I think you will be OK. The compiler should handle it on its own. For your use case, it is just plain pointer arithmetic that stays within the array.
What you have done is set up a memory pool in its most basic form. Your example uses char arrays and pointers, so you are unlikely to get unwanted results. However, if your second pointer were, for instance, a long * (with proper casting), you could get alignment differences that cause significantly slower code unless you take special precautions to align the regions manually (using hex values instead of decimal for offsets makes this a bit more obvious).
So in a more complex scenario it would matter, because long may need to be aligned to 4 or 8 bytes.
I apologize for going a bit beyond the scope of the question, but I didn't want someone mistakenly extrapolating what is fine for char to mixed types onto a char[]

Sizeof pointer for 16 bit and 32 bit

I was just curious to know what sizeof would return for a pointer on a 16-bit and a 32-bit system:
printf("%zu", sizeof(int16 *));
printf("%zu", sizeof(int32 *));
Thank you.
Short answer: on a 32-bit Intel 386 you will likely see these returning 4, while targeting a 16-bit 8086 you will most likely see either 2 or 4, depending on the memory model you selected.
The details
First, standard C does not mandate anything particular about pointers, only that they must be able to "point to" the given variable and that pointer arithmetic must work within the data area of the given variable. Even a C interpreter with some exotic representation of pointers is possible, and given this flexibility pointers might truly be of any size, depending on what you target.
Usually, however, compilers do represent pointers as memory addresses, which makes several operations left undefined by the C standard "usually work". How the compiler chooses to represent a pointer depends on the targeted architecture: compiler writers obviously choose representations that are useful, efficient, or both.
An example of a useful representation is the generic pointer on a Harvard-architecture micro. Generic pointers let you address both code memory and data RAM. On an 8-bit micro they might be encoded as one type byte plus 2 address bytes; this obviously implies that whenever you dereference such a pointer, more complex code has to be emitted to load the contents from the proper place.
That leads to an example of an efficient representation: why not have specific pointers instead? One kind that points to code memory, another that points to data memory? Just 2 bytes each (assuming a 16-bit address space, as is usual for 8-bit micros such as the 8051), and no need to select by type.
But then you have multiple types of pointers (again the 8051: you will likely have at least one additional type of pointer pointing within its internal RAM too). The programmer then needs to think about which particular pointer type to use.
And of course the sizes also differ. On this hypothetical compiler targeting the 8051, you would have a generic pointer type of 3 bytes, an external data memory pointer type of 2 bytes, a code memory pointer of 2 bytes, and an internal RAM pointer type of 1 byte.
Also note that these are types of pointers, and not the types of data they point to (function pointers are a little off here as the fact a pointer is a function pointer implies that it is of a different type than data pointers while not having any specific syntax difference except that the data type it points to is a function type).
Back to your 16bit machine, assuming it is a 8086:
If you use some memory model where the compiler assumes you have a single data segment, you will likely get 2 byte data pointers if you don't specifically declare one near or far. Otherwise you will get 4 byte pointers by default. The representation of 2 byte pointers is usually simply the 16bit offset, while for 4 byte pointers it is a segment:offset pair. You can always apply a near or far specifier to explicitly make your pointers one or another type.
(How do near pointers work in a program which also uses far pointers? Simply, there is a default data segment generated by the compiler, and all near pointers refer to locations within it. The compiler may keep the ds segment register loaded with the default data segment permanently, or at least most of the time, so access to data through near pointers can be faster.)
The size of a pointer depends on the architecture. More precisely, it depends on the size of the addresses used in that architecture, which reflects the width of the bus used to access memory.
For example, on a 32-bit architecture the size of an address is 4 bytes:
sizeof (void *) == 4 bytes.
On 64-bit architectures, addresses are 8 bytes:
sizeof (void *) == 8 bytes.
Note that on such architectures all pointers have the same size, independently of the pointed-to type. So if you execute your code, the size of an int16 pointer and the size of an int32 pointer will be the same.
On a 16-bit system, however, the size of a pointer should be 2 bytes. 16-bit systems usually have very little memory, and 2 bytes are enough to address all of its locations. To be more precise, with a 16-bit pointer the maximum memory you can address is 2^16 = 65,536 bytes (64 KiB), which is very little compared to the amount of memory in today's computers.

Size of pointers and if that size is dependent of the architecture

Sorry for the question; it is more of a general-knowledge one (I haven't found precise answers).
If I have something like
char * Field
or
void * Field
or
double pointers
Is the size of the pointer the same? (as far as I remember from college it was 4 bytes, but ...)
Does the size of the pointer depend on the architecture of the CPU?
If I point to a data structure, the size of the pointer itself is the same, isn't it?
Assume the examples are in C (I would be prone to believe that it will be the same for other languages that do not handle pointers directly)
Is the size of the pointer the same? (as far as I remember from college it was 4 bytes, but ...)
Not necessarily the same and not necessarily 4 bytes; see "Are all data pointers the same size in one platform for all data types?"
Does the size of the pointer depend on the architecture of the CPU?
It varies from architecture to architecture. Even on the same hardware it can vary from operating system to operating system (e.g. 32-bit vs 64-bit).
If I point to a data structure, the size of the pointer itself is the same, isn't it?
Again, not necessarily; see "Are all data pointers the same size in one platform for all data types?"
On most systems all pointers have the same size, but C doesn't guarantee that. It only promises that void* is wide enough to hold any object pointer type (pointers to functions are excluded). And yes, it depends on the CPU. (On 64-bit systems, a pointer is usually 8 bytes.)
A 32-bit system usually has pointers of size 4 bytes and a 64-bit machine, usually has pointers of 8 bytes size.
The keyword here is, of course, "usually": it is entirely possible that the device you may be using is based on a Harvard architecture (or some other bus architecture scheme), which has separate memories for data and code regions.
Hence there are separate buses with different widths, so it is possible for the size of data pointers (int*, double*, long int*, etc.) to be 8 bits while the size of a function pointer is 16 bits on the very same architecture.
1) On mainstream platforms, the size of a pointer is the same for all data pointer types.
2) Generally on a 32-bit architecture it will be 4 bytes, and on a 64-bit architecture it will be 8 bytes.
3) On those platforms, the size of the pointer will be the same no matter what you point it to.

Is there any way the size of the pointer can be changed from 2 bytes?

Can we anyhow change the size of the pointer from 2 bytes so it can occupy more than 2 bytes?
Sure, compile for a 32 (or 64) bit platform :-)
The size of pointers is platform specific; it would be 2 bytes only on 16-bit platforms (which have not been widely used for more than a decade; nowadays all mainstream desktop / laptop / server platforms are at least 32 bits).
If your pointer size is 2 bytes, that means you're running on a 16-bit system.
The only way to increase the pointer size is to use a 32-bit or 64-bit system instead (which would mean any desktop or laptop computer built in the last 15 years or so).
If you're running on some embedded device that uses 16-bit, your only option would be to switch to another device which uses 32-bits (or just live with your pointers being 16-bit).
When a processor is said to be "X-bit" (where X is 16, 32, 64, etc), that X refers to the size of the memory address register. Thus a 16-bit system has a memory address register of 2 bytes.
You cannot cast a 4-byte address to anything smaller because it would lose part of where it's pointing to. (A 2-byte memory address register can only point to 2^16=64KB of memory, whereas a 4-byte register can point to 2^32=4GB of memory.)
You can always "step-up" (ie, run a 32-bit software application on a 64-bit computer) because there's no loss in pointer range. But you can never step down, which is why 64-bit programs don't run on 32-bit systems.
Think of a pointer as a number, only instead of an actual value used for computation, it's the number of a 'slot' in the memory map of the system.
A pointer must be able to represent the highest position of the memory map. That is, it must have at least as many bits as are required to represent the number of the highest position.
In a 16-bit system, the highest possible position is 0xFFFF (a 16-bit number with all the bits set to 1). A pointer must also have 16 bits, so it can reach that number.
Generalizing, in an X-bit system, a pointer will have X bits.
You can store a pointer in a larger variable, the same way you can store the number 1 in a char, in an int, or an unsigned long long if you wanted to; but there's little point to that: think that, the same way a shorter pointer won't be able to reach the highest memory position, a longer pointer would be able to point to things that can't actually exist in memory, so why have it?
Also, you'd have to 'trick' the compiler for that. If you use the pointer notation in your code, the compiler will always use the correct amount of bytes for it. You can instruct the compiler to compile for another platform, though.

Does the size of pointers vary in C?

Is it possible that the size of a pointer to a float in c differs from a pointer to int? Having tried it out, I get the same result for all kinds of pointers.
#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t, so the matching conversion is %zu, not %i */
    printf("sizeof(int*): %zu\n", sizeof(int*));
    printf("sizeof(float*): %zu\n", sizeof(float*));
    printf("sizeof(void*): %zu\n", sizeof(void*));
    return 0;
}
Which outputs here (OSX 10.6 64bit)
sizeof(int*): 8
sizeof(float*): 8
sizeof(void*): 8
Can I assume that pointers of different types have the same size (on one arch of course)?
Pointers are not always the same size on the same arch.
You can read more on the concept of "near", "far" and "huge" pointers, just as an example of a case where pointer sizes differ...
http://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer_sizes
In days of old, using e.g. Borland C compilers on the DOS platform, there were a total of (I think) 5 memory models which could even be mixed to some extent. Essentially, you had a choice of small or large pointers to data, and small or large pointers to code, and a "tiny" model where code and data had a common address space of (If I remember correctly) 64K.
It was possible to specify "huge" pointers within a program that was otherwise built in the "tiny" model. So in the worst case it was possible to have different sized pointers to the same data type in the same program!
I think the standard doesn't even forbid this, so theoretically an obscure C compiler could do this even today. But there are doubtless experts who will be able to confirm or correct this.
Pointers to data must always be convertible to void* and back, so nowadays they are generally implemented as types of the same width.
This statement is not true for function pointers, they may have different width. For that reason in C99 casting function pointers to void* is undefined behavior.
As I understand it there is nothing in the C standard which guarantees that pointers to different types must be the same size, so in theory an int * and a float * on the same platform could be different sizes without breaking any rules.
There is a requirement that char * and void * have the same representation and alignment requirements, and there are various other similar requirements for different subsets of pointer types but there's nothing that encompasses everything.
In practise you're unlikely to run into any implementation that uses different sized pointers unless you head into some fairly obscure places.
Yes. It's uncommon, but this would certainly happen on systems that are not byte-addressable. E.g. a 16 bit system with 64 Kword = 128KB of memory. On such systems, you can still have 16 bits int pointers. But a char pointer to an 8 bit char would need an extra bit to indicate highbyte/lowbyte within the word, and thus you'd have 17/32 bits char pointers.
This might sound exotic, but many DSPs spend 99.x% of their time executing specialized numerical code. A sound DSP can be a bit simpler if all it has to deal with is 16-bit data, leaving the occasional 8-bit math to be emulated by the compiler.
I was going to write a reply saying that C99 has various pointer conversion requirements that more or less ensure that pointers to data have to be all the same size. However, on reading them carefully, I realised that C99 is specifically designed to allow pointers to be of different sizes for different types.
For instance on an architecture where the integers are 4 bytes and must be 4 byte aligned an int pointer could be two bits smaller than a char or void pointer. Provided the cast actually does the shift in both directions, you're fine with C99. It helpfully says that the result of casting a char pointer to an incorrectly aligned int pointer is undefined.
See the C99 standard, section 6.3.2.3.
Yes, the size of a pointer is platform dependent. More specifically, the size of a pointer depends on the target processor architecture and the "bit-ness" you compile for.
As a rule of thumb, on a 64bit machine a pointer is usually 64bits, on a 32bit machine usually 32 bits. There are exceptions however.
Since a pointer is just a memory address, it's always the same size regardless of what the memory it points to contains. So pointers to a float, a char or an int are all the same size.
Can I assume that pointers of different types have the same size (on one arch of course)?
For the platforms with flat memory model (== all popular/modern platforms) pointer size would be the same.
For the platforms with segmented memory model, for efficiency, often there are platform-specific pointer types of different sizes. (E.g. far pointers in the DOS, since 8086 CPU used segmented memory model.) But this is platform specific and non-standard.
You should probably keep in mind that in C++ the size of a normal pointer might differ from the size of a pointer to a virtual method. Pointers to virtual methods have to carry extra information in order to work properly with polymorphism. This is probably the only exception I'm aware of that is still relevant (since I doubt that the segmented memory model will ever make a comeback).
There are platforms where function pointers are a different size than other pointers.
I've never seen more variation than this. All other pointers must be at most sizeof(void*) since the standard requires that they can be cast to void* without loss of information.
Pointer is a memory address - and hence should be the same on a specific machine. 32 bit machine => 4Bytes, 64 bit => 8 Bytes.
Hence irrespective of the datatype of the thing that the pointer is pointing to, the size of a pointer on a specific machine would be the same (since the space required to store a memory address would be the same.)
Assumption: I'm talking about near pointers to data values, the kind you declared in your question.

Resources