I'm trying to implement a fixed memory pool (first-fit free list)
my struct is:
struct mempool {
void* data;
int offset;
};
data is divided into 8-byte blocks: 4 bytes holding the offset of the next block and 4 bytes of data. offset points to the first free block. I'm trying to understand why accessing the first block is done by:
int* address = (int*)((int)&pool->data + pool->offset);
especially the (int)&pool->data part. Isn't pool->data already a pointer? Why do I need its address to perform arithmetic and shift to the offset?
I'm trying to understand why accessing the first block is done by:
int* address = (int*)((int)&pool->data + pool->offset);
especially the (int)&pool->data part. Isn't pool->data already a
pointer?
Yes, pool->data is a pointer. And one can obtain the address of a pointer, so there's nothing inherently wrong with that. The result in this case has type void **.
Moreover, given that data is the first member of struct mempool, pool itself would point to the same address. Although pool has a different type (struct mempool *), that's probably mooted by the fact that the code performs a conversion to type int.
why do I need its address to perform arithmetic and shift to
the offset?
The effect is to compute an address relative to the location of the data pointer itself, rather than relative to its target. Furthermore, the cast to type int suggests that the offset is measured in bytes. That aspect of it is a bit unsafe, however, because it is not guaranteed that type int is large enough to support round-trip conversion from pointer to int to pointer.
This all seems consistent with your characterization of the pool having metadata adjacent to the data, but it remains unclear what purpose the data pointers serve.
Overall, I'm not convinced of the correctness or efficacy of the minimal code presented. If it in fact serves the purpose for which it is intended, relative to the structure definition presented, then this variation should do so better and more clearly:
int* address = (int*)((char *)pool + pool->offset);
That avoids the question of an integer type that can represent pointers (although there is one in the form of intptr_t). The cast to char * accounts for the fact that pointer arithmetic is performed in units the size of the pointed-to type, and the offset seems to be expressed in one-byte units.
Your code does not seem correct. You are adding pool->offset to the address of the pool->data field rather than to the address stored in the pool->data field. I would suggest fixing it like this:
int* address = (int *)pool->data + pool->offset;
in case your offset is in 4-byte chunks, or like this:
int* address = (int *)((char *)pool->data + pool->offset);
in case your offset is in bytes.
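As a minimal sketch of the byte-offset variant: this assumes pool->data points at the start of the buffer and that offset counts bytes from there. The helper name first_free_block is made up for illustration and is not from the question.

```c
#include <assert.h>
#include <stddef.h>

struct mempool {
    void *data;   /* start of the pool's buffer */
    int offset;   /* byte offset of the first free block (assumption) */
};

/* Hypothetical helper: address of the first free block. */
int *first_free_block(const struct mempool *pool)
{
    assert(pool != NULL && pool->data != NULL);
    /* Step in bytes through char *, then view the block as int. */
    return (int *)((char *)pool->data + pool->offset);
}
```

With an int buffer and an offset of 2 * sizeof(int), the helper returns the address of the third int in the buffer.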
pool->data + pool->offset wouldn't be possible because you can't do pointer arithmetic on void pointers; that isn't valid C. Pointer arithmetic also assumes that the underlying object is an array.
&pool->data gives the address of the pointer itself, which happens to be the address of the struct. Its type is void **. Arithmetic on that compiles, but it steps in units of sizeof(void *) rather than bytes, so it doesn't help here either.
Therefore the naive, bad solution here is to cast the pointer to an int and then do simple addition. That doesn't work either, because int is not guaranteed to be able to hold the contents of a pointer. uintptr_t should have been used instead of int.
And finally, accessing that chunk of memory through an int * and then dereferencing it is only possible if what's stored there is already regarded as type int by the compiler. If not, it invokes undefined behavior (see "What is the strict aliasing rule?").
Summary: this is quite questionable code and there are many better ways to implement it.
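One possible alternative, as a rough sketch only: keep every computation relative to pool->data, step through unsigned char *, and read or write the 4-byte "next" field with memcpy so no int * aliasing is involved. The names pool_init, pool_alloc, BLOCK_SIZE, NEXT_SIZE and NO_BLOCK are assumptions for illustration, as is the use of -1 to mark the end of the free list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 8, NEXT_SIZE = 4, NO_BLOCK = -1 };

struct mempool {
    void *data;   /* caller-provided buffer */
    int offset;   /* byte offset of the first free block, or NO_BLOCK */
};

/* Link all blocks into a free list: each block stores the next offset. */
void pool_init(struct mempool *pool, void *buffer, int32_t nblocks)
{
    assert(pool != NULL && buffer != NULL && nblocks > 0);
    pool->data = buffer;
    pool->offset = 0;
    for (int32_t i = 0; i < nblocks; i++) {
        int32_t next = (i + 1 < nblocks) ? (i + 1) * BLOCK_SIZE : NO_BLOCK;
        memcpy((unsigned char *)buffer + i * BLOCK_SIZE, &next, sizeof next);
    }
}

/* Pop the first free block; returns its 4-byte payload, or NULL if empty. */
void *pool_alloc(struct mempool *pool)
{
    assert(pool != NULL);
    if (pool->offset == NO_BLOCK)
        return NULL;
    unsigned char *block = (unsigned char *)pool->data + pool->offset;
    int32_t next;
    memcpy(&next, block, sizeof next);   /* byte copy, no int * dereference */
    pool->offset = next;
    return block + NEXT_SIZE;
}
```

For a 3-block buffer, successive pool_alloc calls hand out the payloads at byte offsets 4, 12 and 20, and then NULL.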
I cannot understand why are pointer type casts necessary, as long as pointers point to an address and their type is important only when it comes to pointer arithmetic.
That is to say, if I encounter the next code snippet:
int a = 5; then both
char*b = (char*)&a; and int*c = (int*)&a
point to the very same memory location.
Indeed, when executing char*b = (char*)&a part of the memory contents may be lost, but this is because the type of b is char*, which can store only sizeof(char) bytes of memory, and this could be done implicitly.
The pointer type is important when you dereference it, since it indicates how many bytes should be read or stored.
It's also important when you perform pointer arithmetic, since this is done in units of the size that the pointer points to.
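To illustrate the dereference side, here is a small sketch (the function name sum_of_bytes is made up): the same address is read one byte at a time through unsigned char *, whereas reading it through int * would fetch all sizeof(int) bytes at once.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bytes of an int by viewing it through unsigned char *. */
unsigned sum_of_bytes(const int *value)
{
    assert(value != NULL);
    const unsigned char *bytes = (const unsigned char *)value;
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *value; i++)
        sum += bytes[i];   /* each dereference reads exactly one byte */
    return sum;
}
```

For int a = 5, exactly one byte holds 5 and the others hold 0, whichever end the machine stores it at, so the sum is 5.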
The type of the pointer is most important when you dereference the pointer.
The excerpt
char *b = (int*)&a;
is wrong because an int * cannot be assigned to a char * without a cast. You meant char *b = (char *)&a;. The C standard says you need an explicit cast because the types are not compatible. The cast is the "yeah yeah, I know what I am doing".
The excerpt
int*c = (int*)&a;
is right, but &a is already a pointer to an int, so the cast will do no conversion, therefore you can write it as int *c = &a;.
as long as pointers point to an address and their type is important only when it comes to pointer arithmetic.
No, not only for pointer arithmetic. It's also important for accessing the data pointed to by the pointer; accessing it the wrong way, as in your example, leads to improper results.
when executing char *b = (int*)&a part of the memory contents may be lost, but this is due to the type of b is char* which can store only sizeof(char) bytes of memory,
First of all, that is wrong to do (unless we really need to extract 1 byte out of 4 bytes).
Second, pointer casts are mostly used with void * (void pointers) when passing data to a function that can handle many different types; the best example is qsort.
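A short sketch of the qsort case: the comparator receives const void * and has to cast back to the real element type before dereferencing. compare_ints and sort_ints are illustrative names; qsort itself is the standard library function from <stdlib.h>.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator for qsort: receives void *, must cast back to the real type. */
int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);   /* avoids the overflow that a - b can cause */
}

/* Sort an array of int in ascending order. */
void sort_ints(int *values, size_t count)
{
    assert(values != NULL);
    qsort(values, count, sizeof values[0], compare_ints);
}
```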
If I know that two types T and U have the same alignment, can I use a single malloc call like this:
void* allocate_memory(int n, int m) {
return malloc(sizeof(T) * n + sizeof(U) * m);
}
to allocate contiguous memory for arrays of these two types?
If it is okay, what is the correct way to acquire the pointer to the first element of the second array? Conversion void* -> char* -> (+= sizeof(T) * n) -> U* seems fine, but I feel like there might be some kind of undefined behaviour there.
(I'm almost sure it can't be done in C++, rules of pointer arithmetic won't allow this (At no point array of U starts to exist, so you can't perform pointer arithmetic on this storage). Hence my cautiousness about C rules)
edit:
Since P0593R6 got accepted and applied as Defect Report to all C++ standards back to C++98, a call to malloc implicitly creates objects in allocated storage. Because of that, this construction is now valid in C++ too and pointer arithmetic on this range is well-defined as well.
In C, you can perform arithmetic on the full allocated object via its representation array, which has type unsigned char [] but can legally be addressed (less verbosely) via just char *. I'm not sure about in C++ but I would think you could do the same.
If p is the pointer returned, (U *)((char *)p + sizeof(T) * n) is a valid pointer to what you want.
Note that you can get rid of the "same alignment" requirement just by using _Alignof(U) or by using sizeof(U) (or the highest power of two that divides it) as a (not necessarily sharp) estimate for the alignment and working out the necessary padding in between to reach a multiple of the alignment. If you do this make sure to allocate the right total amount including the padding.
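A sketch of the padded variant, under stated assumptions: T and U are stand-ins (int and double are picked here only so the example compiles), allocate_pair and round_up are made-up names, and overflow of the size computation is not checked. It needs C11 for _Alignof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* T and U are placeholders; int and double are chosen only for illustration. */
typedef int T;
typedef double U;

/* Round size up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* One malloc holding n T's followed by m U's; free(*first) releases both. */
int allocate_pair(size_t n, size_t m, T **first, U **second)
{
    assert(first != NULL && second != NULL);
    /* Pad the T array so the U array starts at a multiple of _Alignof(U). */
    size_t second_offset = round_up(sizeof(T) * n, _Alignof(U));
    void *p = malloc(second_offset + sizeof(U) * m);
    if (p == NULL)
        return -1;
    *first = (T *)p;
    *second = (U *)((char *)p + second_offset);
    return 0;
}
```

Because malloc returns storage aligned for any object type, rounding the offset up to a multiple of _Alignof(U) is enough to make the second pointer suitably aligned.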
AFAIK, the size occupied by any type of pointer is the same on a given architecture. That is, the only difference between different types of pointers is what will happen when we use an operation such as ptr++ or ptr-- on the pointer.
As an example:
char *cptr;
int *iptr;
occupy the same amount of memory (such as 4 bytes, or 8 bytes or something else). However, the difference is what will happen when we use the increment (or decrement) operator on the pointers. cptr++ will increment cptr by 1, while iptr++ will increase iptr by 4 (depending on the architecture, it can be a different value than 4 as well).
The Question
My question is, are there any differences between:
char **cdptr;
int **idptr;
(Assume that for the machine under mention, pointers have a size of 4 bytes)
Since both are pointers, both will occupy the same amount of space: 4 bytes. Also, since both point to something that occupies the same size (again, 4 bytes), the operations cdptr++ and idptr++ will work exactly the same on these two pointers (incrementing each of them by 4).
So, do different types of higher order pointers have any differences?
Formally speaking, yes, these pointer types are different. They have different types, types which are important to the programmer and which the compiler keeps intimate track of. You can prove they're different by trying to compile
char **cdptr;
int **idptr = NULL;
cdptr = idptr;
Your compiler will complain. (gcc says "assignment from incompatible pointer type".) You can also convince yourself that they're different by noticing what happens when you indirect on them: cdptr[1][2] is of course a char, while idptr[1][2] is an int.
Now, it's true, since sizeof(*cdptr) almost certainly equals sizeof(*idptr), pointer arithmetic like cdptr++ and idptr++ will generate the same code. But this doesn't strike me as a terribly useful fact -- it's about as interesting as observing that if we declare
int *iptr;
char **cdptr;
we get the same code for iptr++ and cdptr++ on a machine where ints and pointers happen to be the same size. But this doesn't tell us anything we can use while writing C programs. "Generate the same code when incremented" does not equal "are the same".
Basically, in C language, a pointer is more than a memory address. It is a memory address AND a type.
The type is needed when you use pointer arithmetic. For example: ptr + 2 means that you shift the current position of the pointer in memory by 2 * sizeof(the type ptr points to).
So, a pointer to a pointer differs from a simple pointer by its type... That's all.
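The stride rule can be checked with a small sketch (function names are made up; each pointer is assumed to point into an array of at least two elements so that p + 2 is valid):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by "p + 2" when p is an int *. */
size_t bytes_for_two_ints(const int *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 2) - (const char *)p);
}

/* Bytes covered by "p + 2" when p points to char * elements. */
size_t bytes_for_two_char_ptrs(char *const *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 2) - (const char *)p);
}
```

The first returns 2 * sizeof(int) and the second 2 * sizeof(char *); whether those two numbers coincide depends on the platform, and even when they do, the pointer types remain distinct.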
I found this declaration in a C program
char huge * far *p;
Explanation: p is huge pointer, *p is far pointer and **p is char type
data variable.
Please explain declaration in more detail.
PS: I'm not asking about huge or far pointer here. I'm a newbie to programming
**p is a character. Now a pointer pointing to the address of this character will have the value &(**p). Again, if you want a pointer to this pointer, the next one will be &(*p), and the result will be p itself.
If you read below sentence from right to left, you will get it all
p is huge pointer, *p is far pointer and **p is char type data variable.
In a nutshell, virtual addresses on an Intel x86 chip have two components: a selector and an offset. The selector is an index into a table of base addresses and the offset is added onto that base address. This was designed to let the processor access 20 bit (on an 8086/8, 186), 30 bit (286) or 46 bit (386 and later) virtual address spaces without needing registers that big.
'far' pointers have an explicit selector. However when you do pointer arithmetic on them the selector isn't modified.
'huge' pointers have an explicit selector. When you do pointer arithmetic on them though the selector can change.
Huge and far pointers are not part of standard C. They are Borland extensions to the C language for managing segmented memory in DOS and Windows 16/32bit. Functionally, what the declaration says is that **p is a char. That means "dereference p to get a pointer to char, dereference that to get a char".
In order to understand C pointer declarator semantics try this expression instead:
int* p;
It means p is a pointer to int. (hint: read from right to left)
int* const p;
This means p is a const pointer to int. (so you can't change the value of p)
Here's a proof of that:
p = 42; // error: assignment of read-only variable 'p'
Another example:
int* const* lol;
This means lol is a pointer to const pointer to int. So the pointer which lol points at cannot point at another int.
lol= &p; // and yes, p cannot be reassigned, so we are correct.
In most cases reading from right to left makes sense. Now read the expression in question from right to left:
char huge * far *p;
Now the huge and far are just behaviour specifiers for pointers created by Borland. What it actually means is
char** p;
"p is a pointer to pointer to char"
That means whatever p points to, points to a char.
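The right-to-left reading of int* const* can be exercised in standard C (no huge or far needed). This is only a sketch and store_through is a made-up name:

```c
#include <assert.h>
#include <stddef.h>

/* lol is a pointer to a const pointer to int. */
void store_through(int *const *lol, int value)
{
    assert(lol != NULL && *lol != NULL);
    **lol = value;       /* allowed: the int itself is not const */
    /* *lol = NULL; */   /* would not compile: *lol is a const pointer */
}
```

Called with the address of an int *const p that points at some int x, the function changes x but has no way to make p point elsewhere.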
Back in the 16-bit days on 8086, it would have declared a 32-bit pointer to a "normalized" 32-bit pointer to char (or to the first of an array thereof).
The difference exists because 32-bit pointers were composed of a segment number and an offset within that segment, and segments overlapped (which meant two different pointers could point to the same physical address; example: 0x1200:1000 and 0x1300:0000). The huge qualifier forced a normalization using the highest segment number (and therefore, the lowest possible offset).
However, this normalization had a cost performance-wise, because after each operation that modified a pointer, the compiler had to automatically insert a code like this:
ptr = normalize(ptr);
with:
void huge * normalize(void huge *input)
{
    unsigned long input2 = (unsigned long)input;
    unsigned short segment = (unsigned short)(input2 >> 16);
    unsigned short offset = (unsigned short)input2;
    /* fold every whole 16-byte paragraph of the offset into the segment */
    segment += (offset >> 4);
    offset &= 0x000F;
    return (void huge *)(((unsigned long)segment << 16) | offset);
}
The upside was the advantage of using your memory like it was flat, without worrying about segments and offsets.
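Since huge does not exist in modern compilers, the same arithmetic can be sketched in standard C with the segment and offset kept as plain integers. struct seg_ptr, linear_address and this normalize are illustrative names, not anything a DOS compiler provided.

```c
#include <stdint.h>

/* A real-mode segment:offset pair held as two 16-bit integers. */
struct seg_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Real-mode linear address: segment * 16 + offset. */
uint32_t linear_address(struct seg_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Fold as much of the offset as possible into the segment. */
struct seg_ptr normalize(struct seg_ptr p)
{
    struct seg_ptr out;
    out.segment = (uint16_t)(p.segment + (p.offset >> 4));
    out.offset = (uint16_t)(p.offset & 0x000F);
    return out;
}
```

Both 0x1200:1000 and 0x1300:0000 map to linear address 0x13000, and normalizing the first yields the second.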
Clarification to the other answers:
The far keyword is non-standard C, but it is not just an old obsolete extension from ancient PC days. Today, there are many modern 8 and 16 bit CPUs that use "banked" memory to extend the amount of addressable memory beyond 64 KB. Typically they use a special register to pick a memory bank, effectively ending up with 24-bit addresses. All small microcontrollers on the market with more than 64 KB of RAM+flash memory use such features.
The following code :
int *a;
*a = 5;
will most likely result in a segmentation fault and I know why.
The following code :
int a;
*a = 5;
won't even compile.
(gcc says : invalid type argument of unary *).
Now, a pointer is simply an integer, which is used
for storing an address.
So, why should it be a problem if I say :
*a = 5;
Ideally, this should also result in a segmentation fault.
A pointer is not an integer. C has data types to
a) prevent certain programming errors, and
b) improve portability of programs
On some systems, pointers may not be integers, because they really consist of two integers (segment and offset). On other systems, the "int" type cannot be used to represent pointers because an int is 32 bits and a pointer is 64 bits. For these reasons, C disallows using ints directly as pointers. If you want to use an integral type that is large enough to hold a pointer, use intptr_t.
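A minimal sketch of the integer round trip, using uintptr_t from <stdint.h> (the unsigned counterpart of intptr_t; both are optional in the standard but present on mainstream platforms). The function name round_trip is made up.

```c
#include <stdint.h>

/* Convert a pointer to an integer wide enough to hold it, then back. */
int *round_trip(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;   /* pointer -> integer */
    return (int *)(void *)bits;              /* integer -> pointer */
}
```

The standard guarantees that a void * converted to uintptr_t and back compares equal to the original, which is exactly the guarantee plain int does not give.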
When you say
int a;
*a = 5;
you are trying to make the compiler dereference something that is not a pointer. Sure, you could cast it to a pointer and then dereference it, like so,
*((int*)a) = 5;
.. and that tells the compiler that you really, really want to do that. BUT -- It's kind of a risky thing to do. Why? Well, in your example, for instance, you never actually initialized the value of a, so when you use it as a pointer, you are going to have whatever value is already at the location being used for a. Since it looks like it is a local variable, that will be an un-init'd location in the function's stack frame, and could be anything. In essence, you would be trying to write the value 5 to some undetermined location; not really a wise thing to do!
It's said to illustrate that pointers merely store addresses, and that addresses may be thought as numbers, much like integers. But usually addresses have a structure (like, page number, offset within page, etc).
You should not take that literally. An integer literally stores a number, which you can add, subtract etc. But which you cannot use as a pointer. An integer is an integer, and a pointer is a pointer. They serve different purposes.
Sometimes, a cast from a pointer to an integer may be necessary (for whatever purposes, maybe in an OS kernel to do some address arithmetic). Then you may cast the pointer to such an integer type, having first figured out whether your compiler guarantees correct sizes and preserves values. But if you want to dereference, you have to cast back to a pointer type.
You never actually assign "a" in the first case.
int* a = ?
*a = 5; //BAD. What is 'a' exactly?
int a = ? //but some int anyway
*a = 5; //'a' is not a pointer!
If you wish to use the integer as a pointer, you'll have to cast it first. Pointers may be integers, but conceptually they serve different purposes.
The operator * is a unary operator which is not defined for the integer data type. That's why the statement
*a = 5;
won't compile.
Also, an integer and a pointer are not the same thing. They are typically the same size in memory (4 bytes for 32 bit systems).
int* a is a pointer to int. It points nowhere, because you haven't initialized it. Please, read any book about C before asking such questions.