For a course on how operating systems work, we had to write a malloc/free implementation for a struct of one specific size. Our idea was to store the overhead, such as the start and end of the (static) memory block our code has to work in, in the first few addresses of that block.
However, something went wrong with the calculation of the last memory slot. To determine the last slot, we add the total size of all usable memory slots to the address of the first usable slot. But when we add the int sizeofslots to the address in currentslot, the address actually advances by four times sizeofslots. Here's the relevant code:
/* Example memory block*/
/* |------------------------------------------------------------------------------|*/
/* | ovr | 1 t | 0 | 1 t | 1 t | 1 t | 0 | 0 | 0 | 0 | 0 | 0 | 0 |*/
/* |------------------------------------------------------------------------------|*/
/* ovr: overhead, the variables `currentslot`, `firstslot` and `lastslot`.
* 1/0: Whether or not the slot is taken.
* t: the struct
*/
/* Store the pointer to the last allocated slot at the first address */
currentslot = get_MEM_BLOCK_START();
*currentslot = currentslot + 3*sizeof(void *);
/* The first usable memory slot after the overhead */
firstslot = currentslot + sizeof(void *);
*firstslot = currentslot + 3*sizeof(void *);
/* The total size of all the effective memory slots */
int sizeofslots = SLOT_SIZE * numslots;
/* The last usable slot in our memory block */
lastslot = currentslot + 2*sizeof(void*);
*lastslot = firstslot + sizeofslots;
printf("%p + %i = %p, became %p\n", previous, sizeofslots, previous + (SLOT_SIZE*numslots), *lastslot);
We figured it had something to do with integers being 4 bytes, but we still don't understand what is happening here. Can anyone explain it?
C's pointer arithmetic always works like this: addition and subtraction are always in units of the item being pointed at, not in bytes.
Compare it to array indexing: as you might know, the expression a[i] is equivalent to *(a + i), for any pointer a and integer i. Thus, it must be the case that the addition happens in terms of the size of each element of a.
To work around it, cast the structure pointer down to (char *) before the add.
When you add an integer to a pointer, the pointer advances by that many strides, i.e. myPointer + x advances the address by x * sizeof(*myPointer) bytes. If this didn't happen, it would be possible to end up with unaligned integers, which on many processor architectures is a fault and will cause some funky behaviour, to say the least.
Take the following as an example
char* foo = (char*)0x0; // Foo = 0
foo += 5; // foo = 5
short* bar = (short*)0x0; // Bar = 0; (we assume two byte shorts)
bar += 5; // Bar = 0xA (10)
int* foobar = (int*)0x0; // foobar = 0; (we assume four byte ints)
foobar += 2; // foobar = 8;
char (*myArr)[8] = (char (*)[8])0x0; // A pointer to an array of 8 chars; myArr = 0
myArr += 2; // myArr = 0x10 (16). This is because sizeof(char[8]) = 8;
Example
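#include <stdio.h>

const int MAX = 3;

int main(void)
{
    int var[] = {10, 100, 200};
    int i, *ptr;

    /* let us have array address in pointer */
    ptr = var;
    for (i = 0; i < MAX; i++)
    {
        /* %p expects a void *; %x is meant for unsigned int and may truncate an address */
        printf("Address of var[%d] = %p\n", i, (void *)ptr);
        printf("Value of var[%d] = %d\n", i, *ptr);
        /* move to the next location */
        ptr++;
    }
    return 0;
}
Output (the exact address format depends on the platform):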
Address of var[0] = bfb7fe3c
Value of var[0] = 10
Address of var[1] = bfb7fe40
Value of var[1] = 100
Address of var[2] = bfb7fe44
Value of var[2] = 200
From this example you can see that incrementing a pointer advances it by the size of the type it points to. Here that is sizeof(int), which is 4 bytes on this platform. Similarly, a char pointer advances by 1 byte.
I'm trying to understand this code:
struct mys {
double d[128];
};
void my_func(int iters) {
int i;
struct mys *ptr = malloc(iters *sizeof(struct mys));
for(i = 0; i < iters; i++) {
ptr[i].d[0] = (double)i;
}
free(ptr);
}
What I know:
mys is of size 8 * 128 (size of double is 8, it's an array of 128 doubles)
*ptr is of size iters * (8 * 128)
What is going on here:
ptr[i].d[0] = (double)i;
?
What I know:
// ptr->d is the address of the first part of d
// same as (*ptr).d
// BECAUSE d IS A STRUCT
// ptr->d[i] is the actual value. so, 0.0000
// same as (*ptr).d[i]
Thanks in advance.
ptr[i] is the value at index i, so starts at 0.0000.
d is not initialized; it is just the name of a member of the struct. How can we just use d here?
What I think:
*ptr is multiple (iters) structs.
So, ptr[0] is the first struct, ptr[1] is the second struct, etc.
ptr[i].d access the ith struct's d array.
ptr[i].d[0] accesses the first index of the d array. So the line above sets that number to double(i).
So this really only sets the first element of each struct to be 0. Am I right?
But when iters is 2, and I try:
for(int i = 0; i < iters; i++) {
printf("%p\n", ptr[200].d);
}
it still prints an address. Why is that?
What is going on here: ptr[i].d[0] = (double)i;?
This:
struct mys *ptr = malloc(iters *sizeof(struct mys));
allocates memory for an array of structs, called ptr.
This line of code:
ptr[i].d[0] = (double)i;
assigns i to the first cell of the array d, of the i-th struct, in the array ptr.
i is cast to double because d is an array of doubles and i is declared as int. The cast is optional here, since the assignment would convert i implicitly.
when iters is 2, and I try: for(int i = 0; i < iters; i++) { printf("%p\n", ptr[200].d); } it still prints an address. Why is that? Shouldn't it be out of range since ptr is only 2 structs?
This is definitely out of range: only iters = 2 structs were allocated, and since arrays are 0-indexed the only valid indices are 0 and 1.
However, that attempt invokes Undefined Behavior (UB), which means that you don't know how the code is going to behave. For example, on your computer it prints an address, on my computer it might cause a segmentation fault, and so on...
So this really only sets the first element of each struct to be 0. Am I right?
It copies the index i, converted to type double, into the first element of each struct. Otherwise you are right.
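Regarding the expression ptr[200].d: it evaluates to the same address as &(ptr[200]), because the array d[] is the sole member of a mys object (only the types differ). Because a double is eight bytes wide here, each mys object occupies (8 bytes)(128) = 1 KiB. Pointer arithmetic counts in whole objects, so &(ptr[200]) == ptr + 200, which lies 200 * 1024 bytes, i.e. 200 KiB, past the beginning of *ptr. With only two structs allocated, merely computing that address is already undefined behavior, and whether the address has meaning depends on whether anything meaningful is stored there. Printing it does not access the memory, which is why you still see an address.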
One of my pointers is being changed after I dereference it and assign a value through it. I don't understand why, especially since the code in that function is run multiple times and works most of the time. Here is the code:
typedef struct s_freeList
{
int sSize;
struct s_freeList *next;
struct s_freeList *back;
int *eSize;
} t_freeList;
void *addNode(void *addr, size_t size)
{
t_freeList *freeList;
// Working stuff
freeList = (t_freeList *)addr;
freeList->sSize = size * -1;
freeList->next = ((t_freeList *)g_startAddr)->next;
freeList->back = g_startAddr;
((t_freeList *)g_startAddr)->next->back = freeList;
((t_freeList *)g_startAddr)->next = freeList;
// Not working stuff
printf("addr = %p\nsize = %d\n", addr, (int)size);
freeList->eSize = addr + size - 4;
printf("freelist->esize = %p\n", freeList->eSize);
*(freeList->eSize) = size * -1;
printf("freelist->esize = %p\n\n", freeList->eSize);
return (collapseNodes(addr));
}
Here is the output with the address :
addr = 0x1647020
size = 32
freelist->esize = 0x164703c
freelist->esize = 0xffffffe00164703c
From your printf output, you're running on a platform with 64-bit pointers. Each pointer has size and, importantly, natural alignment of 8 bytes, which gives the following layout of your structure (offsets are in decimal):
00 sSize;
04 <padding>
08 *next;
16 *back;
24 *eSize;
32 <total size>
You're setting freeList->eSize = addr + size - 4, which, with size = 32, just happens to overlap with the higher portion of eSize field itself.
If you're going to do the address arithmetic, try to use the compiler-provided introspection facilities such as sizeof and offsetof wherever possible. They may reduce the incidence of such errors and will make the code more readable and more portable.
(Or, better yet, avoid the address arithmetic altogether).
I have not tried the code, but here is what I think is happening. The following few lines assign values for the new node and insert it into the list:
// Working stuff
freeList = (t_freeList *)addr;
freeList->sSize = size * -1;
freeList->next = ((t_freeList *)g_startAddr)->next;
freeList->back = g_startAddr;
((t_freeList *)g_startAddr)->next->back = freeList;
((t_freeList *)g_startAddr)->next = freeList;
Clearly, addr is a pointer to the newly inserted node. Now this is the interesting thing:
freeList->eSize = addr + size - 4;
assuming that size == sizeof(struct s_freeList) then what you are basically doing is assigning eSize its own address + 4 bytes! I think this is because the sizeof(int *) happens to be 8 bytes and in memory your structure probably looks like this:
----------
|sSize | <-- addr
----------
|next |
----------
|back; |
----------
|eSize; | <-- addr + size - 4 probably because sizeof(int *) == 8 bytes.
----------
So when you do:
*(freeList->eSize) = size * -1;
you are modifying the value of eSize since it points to itself!
Edit: I just went over the code and output again, and what atzz says is right: you are probably running on a 64-bit machine. The idea is still the same, though: eSize is overwriting itself...
I'm quite new to the C language, so this "problem" is very confusing for me.
I wanted to create 2D array using array of int pointers (rows) which points to arrays of ints (columns) in one block of memory. I did it and it works but I'm not sure why after I checked something.
I've used malloc to allocate 48 bytes (2x4 array) in the heap (I'm on x86-64 machine):
int **a;
a = (int **)malloc(sizeof(int*) * 2 + sizeof(int) * 2 * 4);
Now let's assume that this is the whole 48 bytes in memory. I wanted an array with 2 rows, so I needed 2 pointers to arrays of ints, a[0] and a[1]:
----------------------------------------------------------------
| a[0] | a[1] | |
----------------------------------------------------------------
^
|
I assumed that all pointers are 8 bytes long and that address of a[2] (arrow) is the place where I can start storing my values (arrays of ints). So I did...
int *addr = (int*)&a[2];
a[0] = addr;
addr += 4;
a[1] = addr;
This is working perfectly fine, I can easily fill and print 2D array. Problem is that when I was writing int *addr = (int*)&a[2]; I was sure that this will be the address of a[0] plus 2 * 8 bytes, but it wasn't. I've checked it at another example with this simple code:
int *p;
int **k;
p = (int*) malloc(30);
k = (int**) malloc(30);
printf("&p = %p %p %p\n", &p[0], &p[1], &p[2]);
printf("&k = %p %p %p\n", &k[0], &k[1], &k[2]);
Output:
&p = 0x14d8010 0x14d8014 0x14d8018 <-- ok (int = 4 bytes)
&k = 0x14d8040 0x14d8048 0x14d8050 <-- something wrong in my opinion (ptrs = 8 bytes)
My question is: why is the third address in the array of pointers 0x14d8050 and not 0x14d8056? I think it might be because 0x14d8056 is not the best address for ints, but why is that, and why does it happen only with an array of pointers? I've checked this on an x86 machine, and there the pointers have "normal" values:
&p = 0x8322008 0x832200c 0x8322010
&k = 0x8322030 0x8322034 0x8322038
I know this might be an obvious or even stupid question for someone so please at least share some links with information about this behavior. Thank you.
Numbers prefixed by 0x are written in hexadecimal.
Thus, 0x14d8048 + 8 == 0x14d8050 is expected.
As timrau said in his comment, 0x14d8048 + 8 is not 0x14d8056 but 0x14d8050, because the addresses are hexadecimal.
Concerning your 2D array: I'm not sure why it worked, but that's not the usual way to create one.
There are two common ways to create a 2D array. The first and simplest is "statically", and it goes like this: int a[2][4];.
The second one, the one you tried, is dynamic. It is slightly more complicated and goes like this:
int **a;
int i;
a = malloc(2 * sizeof(int *));
for(i = 0 ; i < 2 ; i++)
a[i] = malloc(4 * sizeof(int));
I'm having a hard time understanding this program to illustrate pointers (from http://theocacao.com/document.page/234):
Below I don't understand why:
int * currentSlot = memoryBlock
isn't using &memoryBlock. I read the comment but don't get it. What is memoryBlock putting in there that &memoryBlock wouldn't? Won't both return the pointer to the set of ints created with calloc (assuming I understand what's been done that is)? What is really in * memoryBlock after calloc?
Then here, *currentSlot = rand();, how does the dereferencing work here? I thought the dereference would stop *currentSlot from giving the value of the memory address (the reference) to the actual value (no longer a reference but the value).
#include <stdio.h>
#include <stdlib.h> // for calloc and free
#include <time.h> // for random seeding
int main (void)
{
const int count = 10;
int * memoryBlock = calloc ( count, sizeof(int) );
if ( memoryBlock == NULL )
{
// we can't assume the memoryBlock pointer is valid.
// if it's NULL, something's wrong and we just exit
return 1;
}
// currentSlot will hold the current "slot" in the,
// array allowing us to move forward without losing
// track of the beginning. Yes, C arrays are primitive
//
// Note we don't have to do '&memoryBlock' because
// we don't want a pointer to a pointer. All we
// want is a _copy_ of the same memory address
int * currentSlot = memoryBlock;
// seed random number so we can generate values
srand(time(NULL));
int i;
for ( i = 0; i < count; i++ )
{
// use the star to set the value at the slot,
// then advance the pointer to the next slot
*currentSlot = rand();
currentSlot++;
}
// reset the pointer back to the beginning of the
// memory block (slot 0)
currentSlot = memoryBlock;
for ( i = 0; i < count; i++ )
{
// use the star to get the value at this slot,
// then advance the pointer
printf("Value at slot %i: %i\n", i, *currentSlot);
currentSlot++;
}
// we're all done with this memory block so we
// can free it
free( memoryBlock );
}
Thank you for any help.
Below I don't understand why:
int * currentSlot = memoryBlock
isn't using &memoryBlock.
Because both memoryBlock and currentSlot are pointers to int. &memoryBlock would be the address of a pointer to int, i.e. an int **.
What is "in" memoryBlock is a pointer to a block of memory.
Then here, *currentSlot = rand();, how does the dereferencing work here?
This is a rule of C: when a dereferencing expression like this occurs on the left-hand side of an assignment, the right-hand side's value is stored in the memory location pointed to by the pointer being dereferenced.
int * memoryBlock;
memoryBlock is a variable which can hold the address of a memory block of integers. The size of the memoryBlock variable is the size of an address. Typically 4 or 8 bytes (sizeof(int*)). Its type is "pointer to int".
memoryBlock = calloc ( 5, sizeof(int) );
the memoryBlock variable is assigned the address of the start of the memory block able to hold 5 integers. The memory block size is 5 * sizeof(int) bytes.
memoryBlock + 1 is the address of the second integer in the block.
memoryBlock + 5 is one past the last integer in the block.
*memoryBlock is the content at that address (the first integer). Its type is int.
*(memoryBlock + 0) = 0;
*(memoryBlock + 1) = 1;
*(memoryBlock + 2) = 2;
*(memoryBlock + 3) = 3;
*(memoryBlock + 4) = 4;
// *(memoryBlock + 5) = 5; illegal
This assigns integers to the memory block.
The subscript form is equivalent to the above:
memoryBlock[0] = 0;
memoryBlock[1] = 1;
memoryBlock[2] = 2;
memoryBlock[3] = 3;
memoryBlock[4] = 4;
// memoryBlock[5] = 5; illegal
&memoryBlock is the address of the memoryBlock variable. This is not the address of the calloc'ed space. Its type is int ** ("pointer to pointer to int"), not int *.
int ** pmemoryBlock;
pmemoryBlock is a variable which holds the address of an address of a memory block of integers. The size of pmemoryBlock is the size of an address, typically 4 or 8 bytes (sizeof(int**)).
pmemoryBlock = &memoryBlock;
pmemoryBlock is assigned the address of a variable which holds the address of the start of the memory block able to hold 5 integers.
*pmemoryBlock is the address of the memory block.
**pmemoryBlock is the first integer in the memory block
*((*pmemoryBlock) + 0) is the first integer in the memory block
*((*pmemoryBlock) + 1) is the second integer in the memory block
...
memoryBlock is an array of integers (int*). (technically a pointer to an int but since it was allocated with enough room for 10 integers you can think of it as the start of an array)
*memoryBlock is the integer that memoryBlock is pointing to (the first int in the array). While the notation looks the same as in the declaration, here it actually dereferences the pointer. IMHO it is poorly written, as it should be declared
int* currentSlot = memoryBlock;
to make it more clear that it's a pointer to an integer, but that's a style choice.
&memoryBlock is the address of the pointer.
int * currentSlot = memoryBlock;
stores the pointer to the first slot in currentSlot. The program then generates random numbers and puts them in each of the 10 slots by incrementing currentSlot (which internally increments the pointer by the size of an int).
Hope that helps.
In the code, memoryBlock is a pointer to some memory that stores integers. That is, the actual value of the variable memoryBlock is the address just allocated. If you use &memoryBlock you get the address of where the variable itself is stored, not what it points to.
Let's take an example:
int foo = 5;
/* the variable "foo" is stored in memory,
and that memory contains the number 5 */
int bar = 7;
/* the variable "bar" is stored in memory,
and that memory contains the number 7 */
int *foo_pointer = &foo;
/* the variable "foo_pointer" is stored in memory,
and that memory contains the address of the variable "foo" */
foo_pointer = &bar;
/* the contents of the variable "foo_pointer" is no longer the address
of where the variable "foo" is in memory, instead it is the address
of where the variable "bar" */
I hope this makes some sense, and it helps a little.
It's not supposed to use &memoryBlock, which is the (int ** (heh!)) address of the pointer into the memory you are clearing. In other words, memoryBlock (iff it's not NULL) points to (i.e., holds the address of) the first int in the calloc()'ed memory. To reference that cleared-to-0 memory, you use *memoryBlock.
If you ever find yourself trying to use &memoryBlock, don't: it's never the right thing to do in the code fragment you posted.
HTH. If it doesn't help, go back to K&R and study pointers some more. Maybe a lot more.
int * is a pointer, which can be dereferenced. int ** is a pointer to a pointer, which can be dereferenced twice. So what does this mean? Well, on typical machines a pointer is represented as an integer address. Memory addresses run from zero up to some maximum. On a 32-bit system, the range of addressable memory is 0 to 2^32-1 (or 4294967295). Each of these addresses holds one byte. If you have an int *, it accesses memory 4 bytes at a time (assuming a 4-byte int).
Also, for simplicity, let's assume this is a virtual address, you can't just access all this memory, some will be protected (system), some are not valid (not committed). To gain more memory you can ask the system to allocate more from this range. sbrk in Linux, VirtualAlloc in Windows but you will be accessing them usually through C's malloc or calloc.
Let's say, you have starting from 0x100:
0x100: 'h', 'e', 'l', 'l', 'o', '\0'
So this string, occupies memory from 0x100 to 0x105 (including the null terminator). If you have a pointer:
char *p = (char *)0x100;
Then you have:
p // 0x100
*p // 'h'
p+1 // 0x101
*(p+1) // 'e'
p += 2 // 0x102
*p // 'l'
p = (char *)0x200;
p // now points to 0x200 in memory
*p // contains whatever value is in 0x200
If you have int pointers, then you are accessing memory 4-bytes at a time (or however big an int is on your system).
So with all that background, when you run calloc, it returns the address of the block you've requested.
int *memoryBlock = calloc(count, sizeof(int));
// memoryBlock is an array of int, assuming sizeof(int) == 4, then
// you have 40 bytes of memory starting from the address of what is
// returned by calloc.
memoryBlock++; // now memoryBlock is at base address + 4
*memoryBlock = 10; // now that address contains the value 10
(*memoryBlock)++; // now that address contains the value 11
memoryBlock++; // now memoryBlock is 4 bytes further
After using malloc() to initialize 5000 bytes of memory, how would I reference the bytes in this memory space? For example, if I need to point to a starting location of data within the memory, how would I go about that?
EDIT: Does it matter what I use to point to it? I mean I am seeing people use bytes/int/char? Is it relevant?
Error I get:
You can use the subscript array[n] operator to access the index you are interested in reading/writing, like so:
uint8_t* const bytes = (uint8_t*)malloc(5000);
bytes[0] = UINT8_MAX; // << write UINT8_MAX to the first element
uint8_t valueAtIndexZero = bytes[0]; // << read the first element (will be UINT8_MAX)
...
free(bytes); // bytes is declared const, so it cannot be reset to 0 afterwards
char * buffer = malloc(5000);
buffer[idx] = whatever;
char * p = buffer + idx;
*p = whatever;
malloc doesn't initialize the memory it allocates. Use calloc() instead if you need the memory zeroed.
int *p = malloc (5000); // p points to the start of the dynamically allocated area.
As has been mentioned by others, you could do something like this:
typedef unsigned char byte; // byte is not a standard C type, so define it first
int nbytes = 23; // number of bytes of space to allocate
byte *stuff = malloc(nbytes * sizeof stuff[0]);
stuff[0] = 0; // set the first byte to 0
byte x = stuff[0]; // get the first byte
int n = 3;
stuff[n] = 0; // set the nth byte to 0
x = stuff[n]; // nth byte, or in the case of some other type, nth whatever - just make sure it's a safe value, from 0 (inclusive) to the number (nbytes here) of things you allocated (exclusive)
However, a couple of things to note:
malloc will not initialise the memory, but calloc will (as mentioned by Prasoon Saurav)
You should always check to see if the memory allocation failed (see below for an example)
int nbytes = 23; // or however many you want
byte *stuff = malloc(nbytes * sizeof stuff[0]);
if (NULL == stuff) // memory allocation failed!
{
//handle it here, e.g. by exiting the program and displaying an appropriate error message
}
stuff[0] = 0; // set the first byte to 0
byte x = stuff[0]; // get the first byte
int n = 3;
stuff[n] = 0; // set the nth byte to 0
x = stuff[n]; // nth byte, or in the case of some other type, nth whatever
malloc() returns a pointer to the allocated memory:
typedef unsigned char byte;
byte * mem = malloc( 5000 );
byte val = mem[1000]; /* gets the byte at index 1000, i.e. the 1001st byte */
After using malloc() to initialize 5000 bytes of memory, how would I
reference the bytes in this memory space? For example, if I need to
point to a starting location of data within the memory, how would I go
about that?
Does it matter what I use to point to it? I mean I am seeing people
use bytes/int/char? Is it relevant?
As you have seen, malloc allocates a block of memory counted in bytes. You can assign a pointer to that block, and depending on the pointer type the compiler knows how to reference individual elements:
unsigned char *memblob = malloc( 1024 );
short* pshort = (short*)memblob;
Now if you reference the second short value, i.e. *(pshort + 1) or pshort[1], the compiler knows that it needs to add sizeof(short) bytes (typically 2) in order to get the next element.
float* pfloat = (float*)memblob;
Now if you reference the second float value, i.e. *(pfloat + 1) or pfloat[1], the compiler knows that it needs to add sizeof(float) bytes (typically 4) in order to get the next element.
The same goes for user-defined data types:
typedef struct s
{
short a;
long b;
} mystruct_t;
mystruct_t* pstruct = (mystruct_t*)memblob;
pstruct + 1 points to the struct at byte offset sizeof(mystruct_t).
So it is really up to you how you want to use the allocated memory.