How do we treat the header in an 8-byte-aligned malloc implementation? - c

I am trying to build a mymalloc in C with 8-byte alignment.
But I ran into a problem: if there is a 4-byte header and an 8-byte payload (the malloc'ed data),
do we have to allocate 16 bytes to match the alignment, or do we only care about the alignment of the payload?

Per C 2018 7.22.3 1, malloc returns memory sufficiently aligned for any fundamental alignment requirement. The C implementation may define its greatest fundamental alignment, and it is sufficient for all of the basic, enumerated, and pointer types and arrays, structures, and unions whose members have fundamental alignment requirements and for all complete object types in the standard C library. You can find the fundamental alignment with _Alignof (max_align_t), where max_align_t is declared in <stddef.h>. Let’s call that F.
If you are using malloc to get memory to be used in your mymalloc, you will use a few bytes of it (less than F) for your own data, and you wish the address you return to have alignment F, then you need to ask malloc for F bytes more than the amount of memory the caller requested. That is because malloc will return some address A with alignment F (or better), and, after putting your data there, you have to return some address greater than that to the user. The next address with alignment F is A+F, so the memory block at A will have F bytes followed by the user’s data. Hence you need to ask malloc for F bytes plus the amount the caller requested.
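The layout described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the names my_header, mysize and myfree are hypothetical, and it assumes the bookkeeping data fits in F bytes (checked at compile time).

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* max_align_t, size_t */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, free */

/* F: the greatest fundamental alignment of this implementation. */
#define F (_Alignof(max_align_t))

/* Hypothetical bookkeeping header stored in front of the payload. */
typedef struct {
    size_t size;   /* payload size requested by the caller */
} my_header;

static_assert(sizeof(my_header) <= _Alignof(max_align_t),
              "header must fit in F bytes");

void *mymalloc(size_t n)
{
    if (n > SIZE_MAX - F)
        return NULL;                    /* n + F would overflow */
    unsigned char *a = malloc(F + n);   /* A is aligned to F (or better) */
    if (a == NULL)
        return NULL;
    ((my_header *)a)->size = n;         /* our own data lives at A */
    return a + F;                       /* A + F is the next F-aligned address */
}

/* Read back the payload size recorded by mymalloc. */
size_t mysize(const void *p)
{
    const unsigned char *a = (const unsigned char *)p - F;
    return ((const my_header *)a)->size;
}

void myfree(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - F);   /* hand back the address malloc returned */
}
```

A caller would use mymalloc(8) and myfree(p) exactly like malloc and free; the F bytes in front of the payload are invisible to it.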

Related

malloc boundary sizes: is there a performance difference

My beginner's class on C has notes which say malloc returns a pointer to a block aligned to a 16-byte boundary on x86 machines.
Does that mean that there is no advantage in calling malloc(1), ie the performance would be no different from calling malloc(16)?
The C standard says
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
So the pointer alignment is not 16 bytes, but implementation-defined; and on your implementation it so happens that there are some types of objects that are required to be 16-byte-aligned in memory; and thus pointers returned by malloc are 16-byte-aligned.
However it does not mean that the char *p = malloc(1) allocates memory for 16 bytes - on the contrary, you're not to touch any memory beyond p[0]; malloc also needs some internal bookkeeping so it can be that malloc(1) consumes a total of 16 bytes of memory, whereas malloc(16) would consume 32, or 64; you would not know.
Each call to malloc does not necessarily ask the OS for memory. malloc asks the OS for big chunks of memory when required and then hands a bit of that to you. On future calls it will have some to spare and can just allocate it to you without needing to ask the OS.
So, it will allocate memory that is convenient to the processor to use - usually aligned.
You should just allocate the memory required and let malloc sort out the rest.
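To see on your own system that the alignment does not depend on the requested size, a check along these lines may help. It is only a sketch: the function name is made up, and it assumes uintptr_t exists and that converting a pointer to it yields the numeric address, which holds on common platforms but is implementation-defined.

```c
#include <stddef.h>   /* max_align_t, size_t */
#include <stdint.h>   /* uintptr_t */
#include <stdlib.h>   /* malloc, free */

/* Returns 1 if malloc(n) is aligned for max_align_t, 0 if it is not,
   and -1 if the allocation itself fails. */
int malloc_is_max_aligned(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        return -1;
    int ok = ((uintptr_t)p % _Alignof(max_align_t)) == 0;
    free(p);
    return ok;
}
```

Calling it with 1 and with 16 lets you compare the two cases from the question; how much memory each call actually consumes internally is still not observable this way.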

What is a memory block?

The C books keep talking about memory blocks, but I never understood what exactly they are. Is a memory block an array? A large memory cell? For example:
malloc(2*sizeof(int)); /*This allocates a block*/
A "memory block" is a contiguous chunk of memory.
An array in C is also a contiguous chunk of memory. However, using the less-generic term "array" has implications that the generic term "memory block" does not (you can, after all, have data of multiple types within a block, whereas the term "array" implies uniformity of use).
Using malloc gives you memory dynamically -- the alternative is to allocate memory from the stack, as with int my_ints[2], but that doesn't give you control over the size of the block after your function is already running, or let you allocate more blocks after your function has started.
Also, stack size is relatively limited.
A memory block is a group of one or more contiguous chars ("bytes" - see note) of (real or virtual) memory.
The malloc(size_t size) function allocates a memory block. The size is how large (in chars) the block should be. Note that sizeof(int) is the number of chars that an int consumes, so malloc(2*sizeof(int)); allocates a memory block that is large enough to store 2 ints.
Because C is designed for many very different architectures, there's no guarantee that there's any relationship between things in different memory blocks. For example, you can't allocate 2 memory blocks and then calculate the difference between them (at least, not without relying on implementation defined behaviour).
For arrays, there must be a relationship between elements in the array; and for structures there must be a relationship between members of the structure. For this reason, each array and each structure must be contained within a memory block.
Note: Historically, char was the smallest unit C handled, and "byte" was defined as 1 char (e.g. sizeof(char) == 1), even if CHAR_BIT happens to be something bizarre, like 9 or 16 or 32. Outside of C, "byte" has become synonymous with an 8-bit quantity, and is defined as an 8-bit unit in International standards (IEC 80000-13, IEEE 1541). The result of this is that the definition that C uses for "byte" is not an actual (International standard) byte, and it would be wrong to say that malloc() allocates (International standard) bytes, but correct to say that malloc() allocates chars or "non-standard things that were unfortunately called bytes by the C standard once upon a time".
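As a small illustration of "a block large enough to store 2 ints", here is a sketch that generalizes malloc(2*sizeof(int)) to any element count. The helper name make_int_block is hypothetical; the overflow check is an addition that the answers above do not mention.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc */

/* Allocate one contiguous block big enough for count ints and fill it
   with a starting value. The caller releases it with free(). */
int *make_int_block(size_t count, int fill)
{
    if (count == 0 || count > SIZE_MAX / sizeof(int))
        return NULL;                        /* nothing to do, or count * sizeof(int) overflows */
    int *block = malloc(count * sizeof *block);
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        block[i] = fill;                    /* the block is used as an array of ints */
    return block;
}
```

Because all count ints live in the same block, indexing and pointer arithmetic between them are well defined.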

Which guarantees does malloc make about memory alignment?

I came across the following code:
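#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *A=(char *)malloc(20);
    char *B=(char *)malloc(10);
    char *C=(char *)malloc(10);
    /* %d is not valid for pointers; print the addresses as integers via uintptr_t */
    printf("\n%" PRIuPTR, (uintptr_t)A);
    printf("\t%" PRIuPTR, (uintptr_t)B);
    printf("\t%" PRIuPTR "\n", (uintptr_t)C);
    return 0;
}
//output-- 152928264 152928288 152928304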
I want to know how the allocation and padding is done by malloc(). Looking at the output I can see that the starting address is a multiple of 8. Are there any other rules?
According to this documentation page,
the address of a block returned by malloc or realloc in the GNU system is always a multiple of eight (or sixteen on 64-bit systems).
In general, malloc implementations are system-specific. All of them keep some memory for their own bookkeeping (e.g. the actual length of the allocated block) in order to be able to release that memory correctly when you call free. If you need to align to a specific boundary, use other functions, such as posix_memalign.
The only standard rule is that the address returned by malloc will be suitably aligned to store any kind of variable. What exactly that means is platform-specific (since alignment requirements vary from platform to platform).
The C standard says that the result of malloc() must be cast-able to any legit pointer type. So
... = (DataType *)malloc(...);
must be possible, regardless what type DataType is.
If a system has memory alignment requirements for certain data types, malloc() has to take that into account. And since malloc() cannot know to which pointer type you are going to cast the result, it always must follow the strictest memory alignment requirement.
The original wording in the standard is:
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
Source: ISO/IEC 9899:201x (aka ISO C11)
E.g. if a system requires int to be 4 byte aligned and long to be 8 byte aligned, malloc() must return memory that is 8 byte aligned because it cannot know if you are going to cast the result to int * or to long *.
Theoretically, if you request less than sizeof(long) bytes, a cast to long * is invalid as a long would not even fit into that memory. One might think that in that case malloc() could choose a smaller alignment but that's not what the standard says. The alignment requirement in the standard does not depend on the size of the allocation!
Since many CPUs as well as many operating systems do have alignment requirements, most malloc implementations will always return aligned memory, but which alignment rules they follow is system specific. There are also CPUs and systems that have no alignment requirements, in which case malloc() may as well return unaligned memory.
If you depend on a specific alignment, you can either use aligned_alloc(), which is defined in the ISO C11 standard and thus portable to all systems for which a C11 compiler exists, or you can use posix_memalign(), which is defined in IEEE Std 1003.1-2001 (aka POSIX 2001) and is available on all POSIX conforming systems as well as systems that try to be as POSIX conforming as possible (Linux for example).
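A minimal sketch of the aligned_alloc() route follows. C11 requires the size passed to aligned_alloc() to be an integral multiple of the alignment, so this hypothetical wrapper (alloc_aligned is not a standard function) rounds the size up first. It assumes the alignment is a power of two that the implementation supports.

```c
#include <assert.h>   /* assert */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* aligned_alloc, free */

/* Allocate at least size bytes aligned to alignment.
   The result is released with free(). */
void *alloc_aligned(size_t alignment, size_t size)
{
    /* Assumption: alignment is a nonzero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    if (size > SIZE_MAX - (alignment - 1))
        return NULL;                          /* rounding up would overflow */
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded); /* size is now a multiple of alignment */
}
```

For example, alloc_aligned(64, 100) requests 128 bytes on a 64-byte boundary.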
Fun fact:
malloc() on macOS always returns memory that is 16 byte aligned, despite the fact that no data type on macOS has a memory alignment requirement beyond 8. The reason for that is SSE. Some SSE instructions have a 16 byte alignment requirement and by ensuring that malloc() always returns memory that is 16 byte aligned, Apple can very often use SSE optimization in its standard library.
For a 32-bit Linux system:
When malloc() allocates memory, it allocates in multiples of 8 bytes (padding to 8) and adds an extra 8 bytes for bookkeeping.
For example:
malloc(10) and malloc(12) will each allocate 24 bytes of memory (16 bytes after padding + 8 bytes for bookkeeping).
malloc() pads because the addresses returned must be multiples of eight, and thus valid for pointers of any type. The 8 bookkeeping bytes are used when we call free(); they store the length of the allocated memory.
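The arithmetic in this answer can be written down as a tiny model. Note that this only reproduces the numbers described above (pad to a multiple of 8, then add 8 bytes of bookkeeping); it is not taken from any real allocator, and the names are invented for the example.

```c
#include <stddef.h>   /* size_t */

#define PAD_TO      8u   /* payload is rounded up to a multiple of 8 */
#define BOOKKEEPING 8u   /* extra bytes assumed for the allocator's own data */

/* Total bytes consumed for a request, under the model described above. */
size_t modeled_chunk_size(size_t request)
{
    size_t padded = (request + PAD_TO - 1) / PAD_TO * PAD_TO;
    return padded + BOOKKEEPING;
}
```

Under this model both a 10-byte and a 12-byte request are padded to 16 bytes and cost 24 bytes in total, matching the example in the answer.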

Aligned memory management?

I have a few related questions about managing aligned memory blocks. Cross-platform answers would be ideal. However, as I'm pretty sure a cross-platform solution does not exist, I'm mainly interested in Windows and Linux and to a (much) lesser extent Mac OS and FreeBSD.
What's the best way of getting a chunk of memory aligned on 16-byte boundaries? (I'm aware of the trivial method of using malloc(), allocating a little extra space and then bumping the pointer up to a properly aligned value. I'm hoping for something a little less kludge-y, though. Also, see below for additional issues.)
If I use plain old malloc(), allocate extra space, and then move the pointer up to where it would be correctly aligned, is it necessary to keep the pointer to the beginning of the block around for freeing? (Calling free() on pointers to the middle of the block seems to work in practice on Windows, but I'm wondering what the standard says and, even if the standard says you can't, whether it works in practice on all major OS's. I don't care about obscure DS9K-like OS's.)
This is the hard/interesting part. What's the best way to reallocate a memory block while preserving alignment? Ideally this would be something more intelligent than calling malloc(), copying, and then calling free() on the old block. I'd like to do it in place where possible.
If your implementation has a standard data type that needs 16-byte alignment (long long for example), malloc already guarantees that your returned blocks will be aligned correctly. Section 7.20.3 of C99 states The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object.
You have to pass back the exact same address into free as you were given by malloc. No exceptions. So yes, you need to keep the original copy.
See (1) above if you already have a 16-byte-alignment-required type.
Beyond that, you may well find that your malloc implementation gives you 16-byte-aligned addresses anyway for efficiency although it's not guaranteed by the standard. If you require it, you can always implement your own allocator.
Myself, I'd implement a malloc16 layer on top of malloc that would use the following structure:
some padding for alignment (0-15 bytes)
size of padding (1 byte)
16-byte-aligned area
Then have your malloc16() function call malloc to get a block 16 bytes larger than requested, figure out where the aligned area should be, put the padding length just before that and return the address of the aligned area.
For free16, you would simply look at the byte before the address given to get the padding length, work out the actual address of the malloc'ed block from that, and pass that to free.
This is untested but should be a good start:
#include <stdint.h>
#include <stdlib.h>
void *malloc16 (size_t s) {
    unsigned char *p;
    unsigned char *porig = malloc (s + 0x10); // allocate extra
    if (porig == NULL) return NULL; // catch out of memory
    p = (unsigned char *)(((uintptr_t)porig + 16) & ~(uintptr_t)0xf); // insert padding
    *(p-1) = (unsigned char)(p - porig); // store padding size
    return p;
}
void free16(void *p) {
    unsigned char *porig = p; // work out original
    porig = porig - *(porig-1); // by subtracting padding
    free (porig); // then free that
}
The magic line in malloc16 is the one that computes p: it adds 16 to the address and then sets the lower 4 bits to 0, in effect bringing it back down to the nearest alignment point at or below that address (the +16 guarantees it is past the actual start of the malloc'ed block). The casts through uintptr_t are needed because & cannot be applied to a pointer directly.
Now, I don't claim that the code above is anything but kludgey. You would have to test it in the platforms of interest to see if it's workable. Its main advantage is that it abstracts away the ugly bit so that you never have to worry about it.
I'm not aware of any way of requesting malloc return memory with stricter alignment than usual. As for "usual" on Linux, from man posix_memalign (which you can use instead of malloc() to get more strictly aligned memory if you like):
GNU libc malloc() always returns 8-byte aligned memory addresses, so
these routines are only needed if you require larger alignment values.
You must free() memory using the same pointer returned by malloc(), posix_memalign() or realloc().
Use realloc() as usual, including sufficient extra space so if a new address is returned that isn't already aligned you can memmove() it slightly to align it. Nasty, but best I can think of.
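A sketch of that realloc()-plus-memmove() idea follows. It is untested against any particular platform and makes assumptions of its own: it stores the padding length in the byte just before the aligned pointer (the same trick as a malloc16-style wrapper), and the names malloc16, free16 and realloc16 are hypothetical helpers defined here, not library functions.

```c
#include <stdint.h>   /* uintptr_t, SIZE_MAX */
#include <stdlib.h>   /* malloc, realloc, free */
#include <string.h>   /* memmove */

#define ALIGNMENT 16u

/* Next 16-byte boundary strictly above raw (1 to 16 bytes further on). */
static unsigned char *align_up(unsigned char *raw)
{
    uintptr_t addr = ((uintptr_t)raw + ALIGNMENT) & ~(uintptr_t)(ALIGNMENT - 1);
    return raw + (addr - (uintptr_t)raw);
}

void *malloc16(size_t size)
{
    if (size > SIZE_MAX - ALIGNMENT)
        return NULL;
    unsigned char *raw = malloc(size + ALIGNMENT);
    if (raw == NULL)
        return NULL;
    unsigned char *p = align_up(raw);
    p[-1] = (unsigned char)(p - raw);        /* remember the padding length */
    return p;
}

void free16(void *p)
{
    if (p != NULL) {
        unsigned char *q = p;
        free(q - q[-1]);                     /* recover the address malloc returned */
    }
}

void *realloc16(void *p, size_t size)
{
    if (p == NULL)
        return malloc16(size);
    if (size > SIZE_MAX - ALIGNMENT)
        return NULL;
    unsigned char *old = p;
    size_t old_pad = old[-1];
    unsigned char *raw = realloc(old - old_pad, size + ALIGNMENT);
    if (raw == NULL)
        return NULL;                         /* the old block is still valid */
    unsigned char *aligned = align_up(raw);
    size_t new_pad = (size_t)(aligned - raw);
    if (new_pad != old_pad)
        /* Data still sits at its old offset; slide it to the new aligned spot.
           Both ranges lie inside the size + 16 bytes just (re)allocated. */
        memmove(aligned, raw + old_pad, size);
    aligned[-1] = (unsigned char)new_pad;    /* written after the move */
    return aligned;
}
```

When realloc() can grow the block in place, the padding is unchanged and no copy happens; the memmove() only runs when the block moved to an address with a different alignment.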
You could write your own slab allocator to handle your objects, it could allocate pages at a time using mmap, maintain a cache of recently-freed addresses for fast allocations, handle all your alignment for you, and give you the flexibility to move/grow objects exactly as you need. malloc is quite good for general-purpose allocations, but if you know your data layout and allocation needs, you can design a system to hit those requirements exactly.
The trickiest requirement is obviously the third one, since any malloc() / realloc() based solution is hostage to realloc() moving the block to a different alignment.
On Linux, you could use anonymous mappings created with mmap() instead of malloc(). Addresses returned by mmap() are by necessity page-aligned, and the mapping can be extended with mremap().
Starting with C11, you have the void *aligned_alloc( size_t alignment, size_t size ); function, where the parameters are:
alignment - specifies the alignment. Must be a valid alignment supported by the implementation.
size - number of bytes to allocate. Must be an integral multiple of alignment.
Return value
On success, returns the pointer to the beginning of newly allocated memory. The returned pointer must be deallocated with free() or realloc().
On failure, returns a null pointer.
Example:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p1 = malloc(10*sizeof *p1);
printf("default-aligned addr: %p\n", (void*)p1);
free(p1);
int *p2 = aligned_alloc(1024, 1024*sizeof *p2);
printf("1024-byte aligned addr: %p\n", (void*)p2);
free(p2);
}
Possible output:
default-aligned addr: 0x1e40c20
1024-byte aligned addr: 0x1e41000
Experiment on your system. On many systems (especially 64-bit ones), you get 16-byte aligned memory out of malloc() anyway. If not, you will have to allocate the extra space and move the pointer (by at most 8 bytes on almost every machine).
For example, 64-bit Linux on x86/64 has a 16-byte long double, which is 16-byte aligned - so all memory allocations are 16-byte aligned anyway. However, with a 32-bit program, sizeof(long double) is 8 and memory allocations are only 8-byte aligned.
Yes - you can only free() the pointer returned by malloc(). Anything else is a recipe for disaster.
If your system does 16-byte aligned allocations, there isn't a problem. If it doesn't, then you'll need your own reallocator, which does a 16-byte aligned allocation and then copies the data - or that uses the system realloc() and adjusts the realigned data when necessary.
Double check the manual page for your malloc(); there may be options and mechanisms to tweak it so it behaves as you want.
On MacOS X, there is posix_memalign() and valloc() (which gives a page-aligned allocation), and there is a whole series of 'zoned malloc' functions identified by man malloc_zoned_malloc and the header is <malloc/malloc.h>.
You might be able to jimmy (in Microsoft VC++ and maybe other compilers):
#pragma pack(16)
such that malloc( ) is forced to return a 16-byte-aligned pointer. Something along the lines of:
ptr_16byte = malloc( 10 * sizeof( my_16byte_aligned_struct ));
If it worked at all for malloc( ), I'd think it would work for realloc( ) just as well.
Just a thought.
-- pete

Pointer implementation details in C

I would like to know architectures which violate the assumptions I've listed below. Also, I would like to know if any of the assumptions are false for all architectures (that is, if any of them are just completely wrong).
sizeof(int *) == sizeof(char *) == sizeof(void *) == sizeof(func_ptr *)
The in-memory representation of all pointers for a given architecture is the same regardless of the data type pointed to.
The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture.
Multiplication and division of pointer data types are only forbidden by the compiler. NOTE: Yes, I know this is nonsensical. What I mean is - is there hardware support to forbid this incorrect usage?
All pointer values can be casted to a single integer. In other words, what architectures still make use of segments and offsets?
Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer. If p is an int32* then p+1 is equal to the memory address 4 bytes after p.
I'm most used to pointers being used in a contiguous, virtual memory space. For that usage, I can generally get by thinking of them as addresses on a number line. See Stack Overflow question Pointer comparison.
I can't give you concrete examples of all of these, but I'll do my best.
sizeof(int *) == sizeof(char *) == sizeof(void *) == sizeof(func_ptr *)
I don't know of any systems where I know this to be false, but consider:
Mobile devices often have some amount of read-only memory in which program code and such is stored. Read-only values (const variables) may conceivably be stored in read-only memory. And since the ROM address space may be smaller than the normal RAM address space, the pointer size may be different as well. Likewise, pointers to functions may have a different size, as they may point to this read-only memory into which the program is loaded, and which can otherwise not be modified (so your data can't be stored in it).
So I don't know of any platforms on which I've observed that the above doesn't hold, but I can imagine systems where it might be the case.
The in-memory representation of all pointers for a given architecture is the same regardless of the data type pointed to.
Think of member pointers vs regular pointers. They don't have the same representation (or size). A member pointer consists of a this pointer and an offset.
And as above, it is conceivable that some CPU's would load constant data into a separate area of memory, which used a separate pointer format.
The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture.
Depends on how that bit length is defined. :)
An int on many 64-bit platforms is still 32 bits. But a pointer is 64 bits.
As already said, CPU's with a segmented memory model will have pointers consisting of a pair of numbers. Likewise, member pointers consist of a pair of numbers.
Multiplication and division of pointer data types are only forbidden by the compiler.
Ultimately, pointer data types only exist in the compiler. What the CPU works with is not pointers, but integers and memory addresses. So there is nowhere else where these operations on pointer types could be forbidden. You might as well ask for the CPU to forbid concatenation of C++ string objects. It can't do that because the C++ string type only exists in the C++ language, not in the generated machine code.
However, to answer what you mean, look up the Motorola 68000 CPUs. I believe they have separate registers for integers and memory addresses. Which means that they can easily forbid such nonsensical operations.
All pointer values can be casted to a single integer.
You're safe there. The C and C++ standards guarantee that this is always possible, no matter the memory space layout, CPU architecture and anything else. Specifically, they guarantee an implementation-defined mapping. In other words, you can always convert a pointer to an integer, and then convert that integer back to get the original pointer. But the C/C++ languages say nothing about what the intermediate integer value should be. That is up to the individual compiler, and the hardware it targets.
Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer.
Again, this is guaranteed. If you consider that conceptually, a pointer does not point to an address, it points to an object, then this makes perfect sense. Adding one to the pointer will then obviously make it point to the next object. If an object is 20 bytes long, then incrementing the pointer will move it 20 bytes, so that it moves to the next object.
If a pointer was merely a memory address in a linear address space, if it was basically an integer, then incrementing it would add 1 to the address -- that is, it would move to the next byte.
Finally, as I mentioned in a comment to your question, keep in mind that C++ is just a language. It doesn't care which architecture it is compiled to. Many of these limitations may seem obscure on modern CPU's. But what if you're targeting yesteryear's CPU's? What if you're targeting the next decade's CPU's? You don't even know how they'll work, so you can't assume much about them. What if you're targeting a virtual machine? Compilers already exist which generate bytecode for Flash, ready to run from a website. What if you want to compile your C++ to Python source code?
Staying within the rules specified in the standard guarantees that your code will work in all these cases.
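The last two points can be checked with a short sketch. It assumes the implementation provides uintptr_t (the type is optional in the standard), and the function names are invented for the example.

```c
#include <stddef.h>   /* ptrdiff_t */
#include <stdint.h>   /* uintptr_t */

/* Convert a pointer to an integer and back; the integer's value is
   implementation-defined, but the round trip yields the same pointer. */
int roundtrip_preserves_pointer(void)
{
    int value = 42;
    int *p = &value;
    uintptr_t bits = (uintptr_t)(void *)p;
    int *q = (int *)(void *)bits;
    return q == p && *q == 42;
}

/* Number of chars between p and p + 1 for an int pointer. */
ptrdiff_t int_pointer_step(void)
{
    int pair[2] = {0, 0};
    int *p = pair;
    return (char *)(p + 1) - (char *)p;   /* measured in chars, equals sizeof(int) */
}
```

Nothing here depends on what the integer in bits looks like, which is exactly the part the standard leaves to the implementation.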
I don't have specific real world examples in mind but the "authority" is the C standard. If something is not required by the standard, you can build a conforming implementation that intentionally fails to comply with any other assumptions. Some of these assumptions are true most of the time just because it's convenient to implement a pointer as an integer representing a memory address that can be directly fetched by the processor, but this is just a consequence of "convenience" and can't be held as a universal truth.
Not required by the standard (see this question). For instance, sizeof(int*) can be unequal to sizeof(double*). void* is guaranteed to be able to store any pointer value.
Not required by the standard. By definition, size is a part of representation. If the size can be different, the representation can be different too.
Not necessarily. In fact, "the bit length of an architecture" is a vague statement. What is a 64-bit processor, really? Is it the address bus? Size of registers? Data bus? What?
It doesn't make sense to "multiply" or "divide" a pointer. It's forbidden by the compiler but you can of course multiply or divide the underlying representation (which doesn't really make sense to me) and that results in undefined behavior.
Maybe I don't understand your point but everything in a digital computer is just some kind of binary number.
Yes; kind of. It's guaranteed to point to a location that's a sizeof(pointer_type) farther. It's not necessarily equivalent to arithmetic addition of a number (i.e. farther is a logical concept here. The actual representation is architecture specific)
For 6.: a pointer is not necessarily a memory address. See for example "The Great Pointer Conspiracy" by Stack Overflow user jalf:
Yes, I used the word “address” in the comment above. It is important to realize what I mean by this. I do not mean “the memory address at which the data is physically stored”, but simply an abstract “whatever we need in order to locate the value”. The address of i might be anything, but once we have it, we can always find and modify i.
And:
A pointer is not a memory address! I mentioned this above, but let’s say it again. Pointers are typically implemented by the compiler simply as memory addresses, yes, but they don’t have to be.
Some further information about pointers from the C99 standard:
6.2.5 §27 guarantees that void* and char* have identical representations, ie they can be used interchangeably without conversion, ie the same address is denoted by the same bit pattern (which doesn't have to be true for other pointer types)
6.3.2.3 §1 states that any pointer to an incomplete or object type can be cast to (and from) void* and back again and still be valid; this doesn't include function pointers!
6.3.2.3 §6 states that void* can be cast to (and from) integers and 7.18.1.4 §1 provides appropriate types intptr_t and uintptr_t; the problem: these types are optional - the standard explicitly mentions that there need not be an integer type large enough to actually hold the value of the pointer!
sizeof(char*) != sizeof(void(*)(void)) ? - Not on x86 in 36 bit addressing mode (supported on pretty much every Intel CPU since Pentium 1)
"The in-memory representation of a pointer is the same as an integer of the same bit length" - there's no in-memory representation on any modern architecture; tagged memory has never caught on and was already obsolete before C was standardized. Memory in fact doesn't even hold integers, just bits and arguably words (not bytes; most physical memory doesn't allow you to read just 8 bits.)
"Multiplication of pointers is impossible" - 68000 family; address registers (the ones holding pointers) didn't support that IIRC.
"All pointers can be cast to integers" - Not on PICs.
"Incrementing a T* is equivalent to adding sizeof(T) to the memory address" - true by definition. Also equivalent to &pointer[1].
I don't know about the others, but for DOS, the assumption in #3 is untrue. DOS is 16 bit and uses various tricks to map many more than 16 bits worth of memory.
The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture.
I think this assumption is false because on the 80186, for example, a 32-bit pointer is held in two registers (an offset register and a segment register), and which half-word went in which register matters during access.
Multiplication and division of pointer data types are only forbidden by the compiler.
You can't multiply or divide types. ;P
I'm unsure why you would want to multiply or divide a pointer.
All pointer values can be casted to a single integer. In other words, what architectures still make use of segments and offsets?
The C99 standard allows pointers to be stored in intptr_t, which is an integer type. So, yes.
Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer. If p is an int32* then p+1 is equal to the memory address 4 bytes after p.
x + y where x is a T * and y is an integer is equivalent to (T *)((intptr_t)x + y * sizeof(T)) as far as I know. Alignment may be an issue, but padding may be provided in the sizeof. I'm not really sure.
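To make the segment-and-offset point concrete, here is a sketch of how 8086-family real mode forms an address from such a pair: the segment is shifted left by 4 bits and the offset is added. The far_ptr type and function name are made up for illustration, and address wrap-around beyond 20 bits is ignored.

```c
#include <stdint.h>

/* A real-mode far pointer: a 16-bit segment plus a 16-bit offset. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr;

/* Linear address formed by the hardware: segment * 16 + offset. */
uint32_t linear_address(far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}
```

The pairs 1234:0010 and 1235:0000 both map to linear address 0x12350, so two pointers with different bit patterns can refer to the same byte. That is one reason such a pointer cannot be treated as a plain integer of the same width.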
In general, the answer to all of the questions is "yes", and it's because only those machines that implement popular languages directly saw the light of day and persisted into the current century. Although the language standards reserve the right to vary these "invariants", or assertions, it hasn't ever happened in real products, with the possible exception of items 3 and 4 which require some restatement to be universally true.
It's certainly possible to build segmented MMU designs, which correspond roughly with the capability-based architectures that were popular academically in past years, but no such system has typically seen common use with such features enabled. Such a system might have conflicted with the assertions as it would probably have had large pointers.
In addition to segmented/capability MMUs, which often have large pointers, more extreme designs have tried to encode data types in pointers. Few of these were ever built. (This question brings up all of the alternatives to the basic word-oriented, pointer-is-a-word architectures.)
Specifically:
The in-memory representation of all pointers for a given architecture is the same regardless of the data type pointed to. True except for extremely wacky past designs that tried to implement protection not in strongly-typed languages but in hardware.
The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture. Maybe, certainly some sort of integral type is the same, see LP64 vs LLP64.
Multiplication and division of pointer data types are only forbidden by the compiler. Right.
All pointer values can be casted to a single integer. In other words, what architectures still make use of segments and offsets? Nothing uses segments and offsets today, but a C int is often not big enough, you may need a long or long long to hold a pointer.
Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer. If p is an int32* then p+1 is equal to the memory address 4 bytes after p. Yes.
It is interesting to note that every Intel Architecture CPU, i.e., every single PeeCee, contains an elaborate segmentation unit of epic, legendary, complexity. However, it is effectively disabled. Whenever a PC OS boots up, it sets the segment bases to 0 and the segment lengths to ~0, nulling out the segments and giving a flat memory model.
There were lots of "word addressed" architectures in the 1950s, 1960s and 1970s. But I cannot recall any mainstream examples that had a C compiler. I recall the ICL / Three Rivers PERQ machines in the 1980s that were word addressed and had a writable control store (microcode). One of its instantiations had a C compiler and a flavor of Unix called PNX, but the C compiler required special microcode.
The basic problem is that char* types on word addressed machines are awkward, however you implement them. You often end up with sizeof(int *) != sizeof(char *) ...
Interestingly, before C there was a language called BCPL in which the basic pointer type was a word address; that is, incrementing a pointer gave you the address of the next word, and ptr!1 gave you the word at ptr + 1. There was a different operator for addressing a byte: ptr%42 if I recall.
EDIT: Don't answer questions when your blood sugar is low. Your brain (certainly, mine) doesn't work as you expect. :-(
Minor nitpick:
p is an int32* then p+1
is wrong, it needs to be unsigned int32, otherwise it will wrap at 2GB.
Interesting oddity - I got this from the author of the C compiler for the Transputer chip - he told me that for that compiler, NULL was defined as -2GB. Why? Because the Transputer had a signed address range: -2GB to +2GB. Can you believe that? Amazing isn't it?
I've since met various people that have told me that defining NULL like that is broken. I agree, but if you don't you end up with NULL pointers being in the middle of your address range.
I think most of us can be glad we're not working on Transputers!
I would like to know architectures which violate the assumptions I've
listed below.
I see that Stephen C mentioned PERQ machines, and MSalters mentioned 68000s and PICs.
I'm disappointed that no one else actually answered the question by naming any of the weird and wonderful architectures that have standards-compliant C compilers that don't fit certain unwarranted assumptions.
sizeof(int *) == sizeof(char *) == sizeof(void *) == sizeof(func_ptr *) ?
Not necessarily. Some examples:
Most compilers for Harvard-architecture 8-bit processors -- PIC and 8051 and M8C -- make sizeof(int *) == sizeof(char *),
but different from the sizeof(func_ptr *).
Some of the very small chips in those families have 256 bytes of RAM (or less) but several kilobytes of PROGMEM (Flash or ROM), so compilers often make sizeof(int *) == sizeof(char *) equal to 1 (a single 8-bit byte), but sizeof(func_ptr *) equal to 2 (two 8-bit bytes).
Compilers for many of the larger chips in those families with a few kilobytes of RAM and 128 or so kilobytes of PROGMEM make sizeof(int *) == sizeof(char *) equal to 2 (two 8-bit bytes), but sizeof(func_ptr *) equal to 3 (three 8-bit bytes).
A few Harvard-architecture chips can store exactly a full 2^16 ("64KByte") of PROGMEM (Flash or ROM), and another 2^16 ("64KByte") of RAM + memory-mapped I/O.
The compilers for such a chip make sizeof(func_ptr *) always be 2 (two bytes);
but often have a way to make the other kinds of pointers sizeof(int *) == sizeof(char *) == sizeof(void *) into a "long ptr" 3-byte generic pointer that has the extra magic bit that indicates whether that pointer points into RAM or PROGMEM.
(That's the kind of pointer you need to pass to a "print_text_to_the_LCD()" function when you call that function from many different subroutines, sometimes with the address of a variable string in buffer that could be anywhere in RAM, and other times with one of many constant strings that could be anywhere in PROGMEM).
Such compilers often have special keywords ("short" or "near", "long" or "far") to let programmers specifically indicate three different kinds of char pointers in the same program -- constant strings that only need 2 bytes to indicate where in PROGMEM they are located, non-constant strings that only need 2 bytes to indicate where in RAM they are located, and the kind of 3-byte pointers that "print_text_to_the_LCD()" accepts.
Most computers built in the 1950s and 1960s use a 36-bit word length or an 18-bit word length, with an 18-bit (or less) address bus.
I hear that C compilers for such computers often use 9-bit bytes,
with sizeof(int *) == sizeof(func_ptr *) == 2 which gives 18 bits, since all integers and functions have to be word-aligned; but sizeof(char *) == sizeof(void *) == 4 to take advantage of special PDP-10 instructions that store such pointers in a full 36-bit word.
That full 36-bit word includes an 18-bit word address, and a few more bits in the other 18 bits that (among other things) indicate the bit position of the pointed-to character within that word.
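To check what a particular compiler does with the sizes discussed above, a small probe like the following can help. This is only a sketch: the names func_ptr, pointer_sizes, get_pointer_sizes and print_pointer_sizes are made up for illustration, and on a typical desktop compiler all four sizes will simply come out the same.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name for a plain function-pointer type. */
typedef void (*func_ptr)(void);

struct pointer_sizes {
    size_t to_int;   /* sizeof(int *)    */
    size_t to_char;  /* sizeof(char *)   */
    size_t to_void;  /* sizeof(void *)   */
    size_t to_func;  /* sizeof(func_ptr) */
};

/* Collect the sizes, measured in C bytes (each CHAR_BIT bits wide). */
struct pointer_sizes get_pointer_sizes(void)
{
    struct pointer_sizes s = {
        sizeof(int *), sizeof(char *), sizeof(void *), sizeof(func_ptr)
    };
    return s;
}

/* Print the sizes so they can be compared across compilers. */
void print_pointer_sizes(void)
{
    struct pointer_sizes s = get_pointer_sizes();
    printf("CHAR_BIT         = %d\n", CHAR_BIT);
    printf("sizeof(int *)    = %zu\n", s.to_int);
    printf("sizeof(char *)   = %zu\n", s.to_char);
    printf("sizeof(void *)   = %zu\n", s.to_void);
    printf("sizeof(func_ptr) = %zu\n", s.to_func);
}
```

Calling print_pointer_sizes() from main() prints one line per pointer kind. The only equality the C standard promises here is between char * and void *, which must share a representation; the other sizes are free to differ, as in the examples above.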
The in-memory representation of all pointers for a given architecture
is the same regardless of the data type pointed to?
Not necessarily. Some examples:
On any one of the architectures I mentioned above, pointers come in different sizes. So how could they possibly have "the same" representation?
Some compilers on some systems use "descriptors" to implement character pointers and other kinds of pointers.
Such a descriptor is different for a pointer pointing to the first "char" in a "char big_array[4000]" than for a pointer pointing to the first "char" in a "char small_array[10]", which are arguably different data types, even when the small array happens to start at exactly the same location in memory previously occupied by the big array.
Descriptors allow such machines to catch and trap the buffer overflows that cause such problems on other machines.
The "Low-Fat Pointers" used in the SAFElite and similar "soft processors" have analogous "extra information" about the size of the buffer that the pointer points into. Low-Fat pointers have the same advantage of catching and trapping buffer overflows.
The in-memory representation of a pointer is the same as an integer of
the same bit length as the architecture?
Not necessarily. Some examples:
In "tagged architecture" machines, each word of memory has some bits that indicate whether that word is an integer, or a pointer, or something else.
With such machines, looking at the tag bits would tell you whether that word was an integer or a pointer.
I hear that Nova minicomputers have an "indirection bit" in each word which inspired "indirect threaded code". It sounds like storing an integer clears that bit, while storing a pointer sets that bit.
Multiplication and division of pointer data types are only forbidden
by the compiler. NOTE: Yes, I know this is nonsensical. What I mean is
- is there hardware support to forbid this incorrect usage?
Yes, some hardware doesn't directly support such operations.
As others have already mentioned, the "multiply" instruction in the 68000 and the 6809 works only with (some) "data registers"; it can't be directly applied to values in "address registers".
(It would be pretty easy for a compiler to work around such restrictions -- to MOV those values from an address register to the appropriate data register, and then use MUL).
All pointer values can be casted to a single data type?
Yes.
In order for memcpy() to work right, the C standard mandates that every pointer to an object type can be converted to a void pointer ("void *") and back again without loss. (Function pointers are not covered by that guarantee.)
The compiler is required to make this work, even for architectures that still use segments and offsets.
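A minimal sketch of that guarantee, using hypothetical helper names (survives_void_round_trip and copy_value are not from any library): an object pointer converted to void * and back compares equal to the original, and memcpy() accepts any object pointer precisely because its parameters are void *.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the pointer is unchanged after a trip through void *. */
int survives_void_round_trip(int32_t *p)
{
    void *generic = p;        /* implicit conversion to void * */
    int32_t *back = generic;  /* implicit conversion back */
    return back == p;
}

/* memcpy() takes void * arguments, so &dst and src convert implicitly. */
int32_t copy_value(const int32_t *src)
{
    int32_t dst;
    memcpy(&dst, src, sizeof dst);
    return dst;
}
```

For any valid int32_t object x, survives_void_round_trip(&x) returns 1 and copy_value(&x) returns the value of x.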
All pointer values can be casted to a single integer? In other words,
what architectures still make use of segments and offsets?
I'm not sure.
I suspect that all pointer values can be cast to the "size_t" and "ptrdiff_t" integral data types defined in "<stddef.h>".
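For what it's worth, the integer types that C99 actually provides for this purpose are intptr_t and uintptr_t from <stdint.h>, not size_t or ptrdiff_t. They are optional, so an implementation with unusual pointers may not define them at all. Where they exist, a void * converted to uintptr_t and back compares equal to the original. A sketch, with a made-up function name:

```c
#include <stdint.h>

/* Returns 1 if the pointer is unchanged after a trip through uintptr_t.
 * Assumes the implementation provides uintptr_t (it is optional). */
int survives_integer_round_trip(int32_t *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;   /* pointer -> integer */
    int32_t *back = (int32_t *)(void *)bits; /* integer -> pointer */
    return back == p;
}
```

The standard only promises the round trip; what the integer value itself looks like is implementation-defined.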
Incrementing a pointer is equivalent to adding sizeof(the pointed data
type) to the memory address stored by the pointer. If p is an int32*
then p+1 is equal to the memory address 4 bytes after p.
It is unclear what you are asking here.
Q: If I have an array of some kind of structure or primitive data type (for example, a "#include <stdint.h> ... int32_t example_array[1000]; ..."), and I increment a pointer that points into that array (for example, "int32_t *p = &example_array[99]; ... p++; ..."), does the pointer now point to the very next consecutive member of that array, which is sizeof(the pointed data type) bytes further along in memory?
A: Yes, the compiler must make the pointer, after incrementing it once, point at the next independent consecutive int32_t in the array, sizeof(the pointed data type) bytes further along in memory, in order to be standards compliant.
Q: So, if p is an int32* , then p+1 is equal to the memory address 4 bytes after p?
A: When sizeof( int32_t ) is actually equal to 4, yes. Otherwise, such as for certain word-addressable machines including some modern DSPs where sizeof( int32_t ) may equal 2 or even 1, then p+1 is equal to the memory address 2 or even 1 "C bytes" after p.
Q: So if I take the pointer, and cast it into an "int" ...
A: One type of "All the world's a VAX heresy".
Q: ... and then cast that "int" back into a pointer ...
A: Another type of "All the world's a VAX heresy".
Q: So if I take the pointer p which is a pointer to an int32_t, and cast it into some integral type that is plenty big enough to contain the pointer, and then add sizeof( int32_t ) to that integral type, and then later cast that integral type back into a pointer -- when I do all that, the resulting pointer is equal to p+1?
Not necessarily.
Lots of DSPs and a few other modern chips have word-oriented addressing, rather than the byte-oriented processing used by 8-bit chips.
Some of the C compilers for such chips cram 2 characters into each word, but it takes 2 such words to hold an int32_t -- so they report that sizeof( int32_t ) is 4.
(I've heard rumors that there's a C compiler for the 24-bit Motorola 56000 that does this).
The compiler is required to arrange things such that doing "p++" with a pointer to an int32_t increments the pointer to the next int32_t value.
There are several ways for the compiler to do that.
One standards-compliant way is to store each pointer to an int32_t as a "native word address".
Because it takes 2 words to hold a single int32_t value, the C compiler compiles "int32_t * p; ... p++" into some assembly language that increments that pointer value by 2.
On the other hand, if that one does "int32_t * p; ... int x = (int)p; x += sizeof( int32_t ); p = (int32_t *)x;", that C compiler for the 56000 will likely compile it to assembly language that increments the pointer value by 4.
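The difference between the two kinds of arithmetic can be sketched as follows. The function names are hypothetical. The first measurement is fixed by the standard; the second depends on how the compiler maps pointers to integers, and only happens to give the same answer on ordinary byte-addressed machines.

```c
#include <stddef.h>
#include <stdint.h>

/* Distance between p and p + 1 measured through char pointers.
 * The standard requires this to be sizeof(int32_t) C bytes.
 * p must point at a real object so that p + 1 is a valid pointer. */
size_t stride_in_c_bytes(int32_t *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Distance between the integer images of p and p + 1.
 * Implementation-defined: sizeof(int32_t) on a flat byte-addressed
 * machine, but possibly something else on a word-addressed one. */
uintptr_t stride_as_integers(int32_t *p)
{
    return (uintptr_t)(p + 1) - (uintptr_t)p;
}
```

Portable code should rely only on the first result; the second is the quantity that the 56000-style compiler described above would be free to report as 2 rather than 4.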
I'm most used to pointers being used in a contiguous, virtual memory
space.
Several PIC and 8086 and other systems have non-contiguous RAM --
a few blocks of RAM at addresses that "made the hardware simpler",
with memory-mapped I/O or nothing at all attached to the gaps in address space between those blocks.
It's even more awkward than it sounds.
In some cases -- such as with the bit-banding hardware used to avoid problems caused by read-modify-write -- the exact same bit in RAM can be read or written using 2 or more different addresses.
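As one concrete illustration (an assumption on my part, since no specific chip is named above): on ARM Cortex-M3/M4 parts that implement bit-banding, each bit of the first 1 MB of SRAM starting at 0x20000000 also appears as its own 32-bit word in an alias region starting at 0x22000000. The alias address can be computed like this; bitband_alias is a made-up helper name, and the two base addresses are the Cortex-M SRAM values.

```c
#include <stdint.h>

/* Cortex-M SRAM bit-band layout (assumed target). */
#define SRAM_BASE        0x20000000u
#define SRAM_ALIAS_BASE  0x22000000u

/* Address of the alias word that maps to one bit of one SRAM byte.
 * byte_addr must lie in the 1 MB bit-band region, bit must be 0..7. */
uint32_t bitband_alias(uint32_t byte_addr, uint32_t bit)
{
    uint32_t byte_offset = byte_addr - SRAM_BASE;
    return SRAM_ALIAS_BASE + byte_offset * 32u + bit * 4u;
}
```

So bit 2 of the byte at 0x20000300 is also reachable as the word at 0x22006008, which is exactly the "same bit, two addresses" situation described above.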

Resources