Creating a segregated list for a malloc() clone


I want to create my own malloc. I was given a library that provides its own sbrk() function to grow my heap; it basically uses malloc() to create "simulated" virtual memory.
Anyway, I ran into problems when implementing my free() function, which resulted in pointer confusion and questioning the whole existence of the universe. So I created some example code.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
size_t *seg_lists[4] = {NULL};
// Allocate two memory blocks of size 2
// The first 8 bytes hold the size
// The remaining two 8-byte slots hold the data
void *mem1 = malloc(3*sizeof(size_t));
void *mem2 = malloc(3*sizeof(size_t));
// Set size fields
*(size_t *)mem1 = 2;
*(size_t *)mem2 = 2;
// Move pointers to start of body
mem1 = (size_t *)mem1 + 1;
mem2 = (size_t *)mem2 + 1;
// Set data to body of mem1
size_t* mem1_first = (size_t *)mem1;
size_t* mem1_second = (size_t *)mem1 + 1;
*mem1_first = 100;
*mem1_second = 200;
// Set data to body of mem2
size_t* mem2_first = (size_t *)mem2;
size_t* mem2_second = (size_t *)mem2 + 1;
*mem2_first = 300;
*mem2_second = 400;
// Output bodies of mem1 and mem2
printf("mem1=[%zu|%zu|%zu]\n", *((size_t *)mem1-1), *mem1_first, *mem1_second);
printf("mem2=[%zu|%zu|%zu]\n", *((size_t *)mem2 - 1), *mem2_first, *mem2_second);
// We first free mem2 and add it to the seg_list with index 1
size_t *head = seg_lists[1];
seg_lists[1] = mem2;
*(size_t *)mem2 = (size_t)head;
// We now want to free mem1 and add it to the list.
// I.e. we set the head to mem1 and the first 8 bytes of the body of
// mem2 to the address of mem1
head = seg_lists[1];
seg_lists[1] = mem1;
*(size_t *)mem1 = (size_t)head;
// Follow first entry in seg_lists[1] and print the 2nd 8 bytes of its body
size_t *first = (size_t *)*seg_lists[1]; // stored value is an address, so cast it back to a pointer
printf("*first=%zu\n", *(first+1));
}
So first I create two memory blocks of size 2. I.e. they can hold up to 16 bytes of data. The first 8 bytes are used to store the size. So it looks like this:
mem1=[2|100|200]
mem2=[2|300|400]
Next I want to free both memory blocks. The address of the next free block should be saved in the first 8 bytes of the body of the previous block, whereas the last block should always point to NULL.
After freeing mem1, we want seg_list[1] to point to the address of mem1 and the first 8 bytes of the body of mem1 to hold NULL.
After freeing mem2, we want seg_list[1] to point to the address of mem2 and the first 8 bytes of the body of mem2 should hold the address of mem1.
Now, I did the "save address of X in the first 8 bytes of the body of Y" by actually saving the hex value of the address in those bytes.
Now someone kept trying to tell me that's wrong and that I shouldn't do that. Supposedly I can't save the address as a value, but I can't see why I couldn't. It obviously works and for me, memory just holds data. An address is a hex value which I can store as plain data and then treat however I want.
So is it really that bad what I'm doing here? Shouldn't I save an address as "plain data"?
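For comparison, here is a minimal sketch of the same "store the next pointer in the body of the freed block" idea, written so the link is stored as a pointer rather than as a size_t. The names free_node, NUM_LISTS, list_push and list_pop are made up for this illustration and are not part of the code above. It assumes every block body is at least sizeof(void *) bytes and suitably aligned for a pointer, which holds for memory handed out by malloc() or an sbrk()-style heap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: overlays the first bytes of a freed block's body. */
typedef struct free_node {
    struct free_node *next;
} free_node;

#define NUM_LISTS 4
static free_node *seg_lists[NUM_LISTS]; /* static storage: every list starts empty (NULL) */

/* Push the body of a freed block onto the front of list `idx`. */
static void list_push(size_t idx, void *body)
{
    assert(idx < NUM_LISTS && body != NULL);
    free_node *node = body;      /* reuse the body as a list node */
    node->next = seg_lists[idx]; /* old head, or NULL for the first block */
    seg_lists[idx] = node;
}

/* Pop the most recently freed block from list `idx`, or NULL if it is empty. */
static void *list_pop(size_t idx)
{
    assert(idx < NUM_LISTS);
    free_node *node = seg_lists[idx];
    if (node != NULL)
        seg_lists[idx] = node->next;
    return node;
}
```

With this layout, "freeing" mem2 and then mem1 would be list_push(1, mem2); list_push(1, mem1); and the list head then leads to mem1 first and mem2 second, with no integer-to-pointer casts anywhere.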

Related

Memory alignment in C: How is offset factoring in the return address?

I came across following code here.
/* Allocate aligned memory in a portable way.
*
* Memory allocated with aligned alloc *MUST* be freed using aligned_free.
*
* @param alignment The number of bytes to which memory must be aligned. This
* value *must* be <= 255.
* @param size The number of bytes to allocate.
* @param zero If true, the returned memory will be zeroed. If false, the
* contents of the returned memory are undefined.
* @returns A pointer to `size` bytes of memory, aligned to an `alignment`-byte
* boundary.
*/
void *aligned_alloc(size_t alignment, size_t size, bool zero) {
size_t request_size = size + alignment;
char* buf = (char*)(zero ? calloc(1, request_size) : malloc(request_size));
size_t remainder = ((size_t)buf) % alignment;
size_t offset = alignment - remainder;
char* ret = buf + (unsigned char)offset;
// store how many extra bytes we allocated in the byte just before the
// pointer we return
*(unsigned char*)(ret - 1) = offset;
return (void*)ret;
}
/* Free memory allocated with aligned_alloc */
void aligned_free(void* aligned_ptr) {
int offset = *(((char*)aligned_ptr) - 1);
free(((char*)aligned_ptr) - offset);
}
Explanation:
char *ret = buf + (unsigned char)offset; here, we're setting a new pointer which is ahead of base address of buf by offset bytes.
E.g. if we want to allocate 68 bytes aligned to a 16-byte boundary, it would look something like this:
request_size = 68 + 16 = 84, and let's assume the base address returned for buf is 0x11223341, then
remainder = ((size_t)buf) % 16 = 0x11223341 % 16 = 1
offset = 16 - 1 = 15 (i.e. 0x0F)
ret = buf + offset = 0x11223341 + 0x0F = 0x11223350
Questions:
What does following line do ?
I'm having a little bit of trouble understanding this syntax and thus implementation that results.
*(unsigned char*)(ret - 1) = offset
When we return ret, what happens to the extra bytes that were allocated but are not part of the memory ret points to? I.e. if we allocate 16 extra bytes but only needed 12 bytes for alignment, what happens to the rest of the bytes?
=======UPDATE ON CODE IN QUESTION=======
Thanks to @ThomasMailund and his insights, I think I can safely modify the code in question to simplify some of the type casting as follows:
/* Allocate aligned memory in a portable way.
*
* Memory allocated with aligned alloc *MUST* be freed using aligned_free.
*
* @param alignment The number of bytes to which memory must be aligned. This
* value *must* be <= 255.
* @param size The number of bytes to allocate.
* @param zero If true, the returned memory will be zeroed. If false, the
* contents of the returned memory are undefined.
* @returns A pointer to `size` bytes of memory, aligned to an `alignment`-byte
* boundary.
*/
void *aligned_alloc(size_t alignment, size_t size, bool zero) {
size_t request_size = size + alignment;
unsigned char *buf = zero ? calloc(1, request_size) : malloc(request_size);
size_t remainder = ((size_t)buf) % alignment;
size_t offset = alignment - remainder;
unsigned char *ret = buf + (unsigned char)offset;
// store how many extra bytes we allocated in the byte just before the
// pointer we return
*(ret - 1) = offset;
return ret;
}
/* Free memory allocated with aligned_alloc */
void aligned_free(void* aligned_ptr) {
int offset = *(((char*)aligned_ptr) - 1);
free(((char*)aligned_ptr) - offset);
}

For anyone in future who is looking at this:
I think I understand it now, when I looked at the void aligned_free function.
We're just saving the number of offset bytes we calculated at the address (ret - 1) and then reading it back when we're ready to free the memory, so that the additional bytes get freed too.
I.e. read the offset from (ret - 1), subtract that many bytes from the address we're asked to free to obtain the original base address of buf, and then free buf so we don't run into memory leaks!
Note: those additional offset bytes are still allocated in memory, just not used!
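To make the round trip concrete, here is a small sketch of the same technique. The names offset_aligned_alloc and offset_aligned_free are made up here (partly to avoid clashing with C11's own aligned_alloc), and the zero flag is left out for brevity. One deliberate difference from the code in the question: the offset byte is read back through an unsigned char. With plain char, an offset above 127 could come back negative on platforms where char is signed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name. alignment must be in the range 1..255 so that the
 * offset fits in the single byte stored just before the returned pointer. */
static void *offset_aligned_alloc(size_t alignment, size_t size)
{
    assert(alignment >= 1 && alignment <= 255);
    if (size > SIZE_MAX - alignment)
        return NULL;                    /* size + alignment would overflow */

    unsigned char *buf = malloc(size + alignment);
    if (buf == NULL)
        return NULL;

    /* offset is always in 1..alignment, so ret[-1] is inside the allocation. */
    size_t offset = alignment - ((uintptr_t)buf % alignment);
    unsigned char *ret = buf + offset;
    ret[-1] = (unsigned char)offset;    /* remember how far we moved */
    return ret;
}

/* Hypothetical name. Undoes the shift and frees the original allocation. */
static void offset_aligned_free(void *aligned_ptr)
{
    if (aligned_ptr == NULL)
        return;
    unsigned char *p = aligned_ptr;
    size_t offset = p[-1];              /* unsigned read, 1..255 */
    free(p - offset);
}
```

The bytes between buf and ret, and any slack after the last requested byte, simply stay part of the one malloc'd block until offset_aligned_free hands the whole block back.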

Reallocation of memory to empty reference

I was watching a lesson on malloc, and one specific example made no sense to me: I don't see why this code prints out the entire list.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int *list = malloc(3 * sizeof(int));
if (list == NULL)
return 1;
list[0] = 1;
list[1] = 2;
list[2] = 3;
int *tmp = malloc(4 * sizeof(int));
if (tmp == NULL)
return 1;
for (int i = 0; i < 3; i++)
tmp[i] = list[i];
tmp[3] = 4;
free(list);
//From this line and below is the thing in question.
list = tmp;
for (int i = 0; i < 4; i++)
printf("%i\n", list[i]);
}
OUTPUT:
~/test/ $ ./malloc_test
1
2
3
4
From my understanding of free(), it de-allocates the memory allocated by allocation functions and free() will 'free' up the memory of the pointer.
If I were to go by this definition:
*list was allocated 12 bytes (3 * sizeof(int), with sizeof(int) = 4)
Hard code list with numbers
*tmp was allocated 16 bytes
Copy the numbers in list to tmp
Free up list making it have no allocated bytes
List is now equal to tmp
How can list now be equal to tmp when list doesn't have any allocated memory?
Is list pointing to the address of tmp? If yes, why do we need to not allocate memory for list since earlier in the code, we did this int *tmp = malloc(4 * sizeof(int));

*list was allocated 12 bytes (3 * sizeof(int), with sizeof(int) = 4)
To be precise, you allocated 12 bytes of memory and set list to point to that memory.
Hard code list with numbers
*tmp was allocated 16 bytes
Again, to be precise, you allocated 16 bytes of memory and set the pointer variable tmp to point to it.
Copy the numbers in list to tmp
Free up list making it have no allocated bytes
To be precise, you freed the object that list pointed to so now list points to garbage and must not be dereferenced.
list is now equal to tmp
To be precise, list now points to the same thing tmp points to: those 16 bytes you allocated, which were previously pointed to only by tmp.
How can list now be equal to tmp when list doesn't have any allocated memory?
Previously, list didn't point to allocated memory. But you changed it to point to the 16 bytes you allocated second. The value of a pointer is what it points to, list and tmp now point to the same thing so they have the same value. So they're now equal.
Is list pointing to the address of tmp? If yes, why do we need to not allocate memory for list since earlier in the code, we did this int *tmp = malloc(4 * sizeof(int));
The code frees the first allocated block of memory. If it didn't allocate a second block of memory, there would be nothing valid for either pointer to point to!

When you copy a pointer you're just copying the pointer's value [1]:
list = tmp;
This means that list now contains a copy of the tmp pointer, or in other words, they both point to the same memory address. In this case that address is an allocation.
Once they're made identical you can reference either of them interchangeably.
[1] Of course this presumes the pointers are of the same type, like int* to int*, and not int to int**. You could also assign to void* and convert back again later, that works as well.

A pointer is a data type that holds memory addresses of the corresponding data type. In your case, list = tmp; assigns to list the address of the first position of the memory block that you created with int *tmp = malloc(4 * sizeof(int));. After the assignment, list points to that memory address (in your case, the beginning of that memory block).
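The allocate, copy, free, reassign sequence from the question can be wrapped in a helper to make the roles clearer. This is only a sketch; grow_list is a made-up name, and in real code realloc() would usually do this job.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new, larger block holding the old values,
 * or NULL on failure (in which case the old block is left untouched). */
static int *grow_list(int *old, size_t old_count, size_t new_count)
{
    assert(new_count > 0 && new_count >= old_count);
    int *tmp = malloc(new_count * sizeof *tmp);
    if (tmp == NULL)
        return NULL;
    if (old_count > 0)
        memcpy(tmp, old, old_count * sizeof *tmp);
    free(old);   /* releases the old block; the variable `old` is just a stale address now */
    return tmp;  /* the caller stores this address in its own pointer variable */
}
```

A caller would write something like int *bigger = grow_list(list, 3, 4); if (bigger != NULL) list = bigger;. The variable list never "owns" memory itself; it only holds whichever address was last assigned to it.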

Segmentation fault when trying to access a struct array

As part of an assignment, I have to deal with three structs. There is some larger table within a file, FileHeader, that is made up of SectionHeader structs. Hdr is made up of an array of these structs laid out in contiguous memory. As a result, I should be able to access the array by typecasting the location of the table in memory.
typedef struct {
unsigned int offset; // offset in bytes from start of file to section headers
unsigned short headers; // count of section headers in table
} FileHeader;
typedef struct {
unsigned int name;
unsigned int type;
} SectionHeader;
I am supposed to: Use the offset and headers fields from the FileHeader (hdr) to identify the location and length of the section header table. I have assumed the start of the file is &hdr.
So I did this, but it is giving me a seg-fault. What is the proper way to access this array?
int header_location = hdr.offset;
int header_length = hdr.headers;
SectionHeader *sec_hdrs = (SectionHeader *) &hdr + header_location;
SectionHeader sec_hdr;
for (int i = 0; i < header_length; i++) {
sec_hdr = sec_hdrs[i];
if (sec_hdr.type == SHT_SYMTAB) break;
}

Try this: SectionHeader *sec_hdrs = (SectionHeader *)((unsigned char *) &hdr + header_location);
Your original code (SectionHeader *) &hdr + header_location applies the cast first and then advances the pointer by header_location * sizeof(SectionHeader) bytes rather than by header_location bytes, which is not your intention.
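Here is a sketch of the byte-offset approach as a small function. It assumes the whole file image sits in one buffer that begins with the FileHeader; find_section and its parameters are made-up names. It copies each header out with memcpy so it does not depend on the table being suitably aligned inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned int offset;    /* offset in bytes from start of file to section headers */
    unsigned short headers; /* count of section headers in table */
} FileHeader;

typedef struct {
    unsigned int name;
    unsigned int type;
} SectionHeader;

/* Hypothetical helper: returns the index of the first section header whose
 * type equals `wanted`, or -1 if there is none. `file` points at the first
 * byte of the file image, which starts with a FileHeader. */
static int find_section(const unsigned char *file, unsigned int wanted)
{
    assert(file != NULL);
    FileHeader hdr;
    memcpy(&hdr, file, sizeof hdr);

    /* Byte arithmetic: advance exactly hdr.offset bytes, not elements. */
    const unsigned char *table = file + hdr.offset;

    for (int i = 0; i < hdr.headers; i++) {
        SectionHeader sh;
        memcpy(&sh, table + (size_t)i * sizeof sh, sizeof sh);
        if (sh.type == wanted)
            return i;
    }
    return -1;
}
```

The key line is file + hdr.offset: because file is a pointer to unsigned char, adding hdr.offset moves by that many bytes.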

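You declared sec_hdrs as a pointer to SectionHeader. Indexing a pointer is legal in C, so the compiler will not complain, but it only works if the pointer really points at an array of SectionHeader objects.
Try reading the section headers into a real array first and pointing sec_hdrs at it:
int header_location = hdr.offset; // byte offset of the table in the file
int header_length = hdr.headers;
SectionHeader hdrs[header_length]; // fill this from the file, starting at header_location
SectionHeader *sec_hdrs = hdrs;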
SectionHeader sec_hdr;
for (int i = 0; i < header_length; i++) {
sec_hdr = sec_hdrs[i];
if (sec_hdr.type == SHT_SYMTAB) break;
}

Here is a visualization of the memory with an initial offset followed by SectionHeader's placed in contiguous memory.
header_location | sizeof(SectionHeader)| sizeof(SectionHeader) | sizeof(SectionHeader)
vijairaj makes a very valid point about a possible bug in your code.
Your original code &hdr + header_location would offset the pointer by
sizeof(hdr) * header_location which is not your intention.
This is a valid diagnosis, and you should look into how pointer arithmetic works: adding n to a pointer advances the address by n times the size of the pointed-to type. Once you are sure that sec_hdrs points to the correct place, rerun your program. If the segfault persists, try my next piece of debugging advice.
Yes, on other questions here, I have seen that you might have to malloc first. But I do not understand why that is necessary if you have a pointer to the array if you know that it is in contiguous memory, and also how to do this.
Just because we know something is in contiguous memory does not mean our program is allowed to use it. malloc reserves a block that stays yours until you free it. If you access memory you never allocated, you run the risk of reading unrelated data, overwriting data the program depends on, or touching a page that is not mapped at all, which is when a segfault occurs. This is why you need to malloc.
Ensure that you malloc enough space:
malloc(header_location + header_length * sizeof(SectionHeader))
This line of code is saying, "Please allocate contiguous memory for one offset and n SectionHeader's". The malloc call will return a pointer to the start of that memory block (&hdr) and then you may access anything within that block of memory.
Perhaps include the code that is providing you with &hdr? Hope this is helpful!

When using calloc in C what is stored at the address of the pointer?

I'm having a hard time understanding this program to illustrate pointers (from http://theocacao.com/document.page/234):
Below I don't understand why:
int * currentSlot = memoryBlock
isn't using &memoryBlock. I read the comment but don't get it. What is memoryBlock putting in there that &memoryBlock wouldn't? Won't both return the pointer to the set of ints created with calloc (assuming I understand what's been done that is)? What is really in * memoryBlock after calloc?
Then here, *currentSlot = rand();, how does the dereferencing work here? I thought the dereference would stop *currentSlot from giving the value of the memory address (the reference) to the actual value (no longer a reference but the value).
#include <stdio.h>
#include <stdlib.h> // for calloc and free
#include <time.h> // for random seeding
int main(void)
{
const int count = 10;
int * memoryBlock = calloc ( count, sizeof(int) );
if ( memoryBlock == NULL )
{
// we can't assume the memoryBlock pointer is valid.
// if it's NULL, something's wrong and we just exit
return 1;
}
// currentSlot will hold the current "slot" in the,
// array allowing us to move forward without losing
// track of the beginning. Yes, C arrays are primitive
//
// Note we don't have to do '&memoryBlock' because
// we don't want a pointer to a pointer. All we
// want is a _copy_ of the same memory address
int * currentSlot = memoryBlock;
// seed random number so we can generate values
srand(time(NULL));
int i;
for ( i = 0; i < count; i++ )
{
// use the star to set the value at the slot,
// then advance the pointer to the next slot
*currentSlot = rand();
currentSlot++;
}
// reset the pointer back to the beginning of the
// memory block (slot 0)
currentSlot = memoryBlock;
for ( i = 0; i < count; i++ )
{
// use the star to get the value at this slot,
// then advance the pointer
printf("Value at slot %i: %i\n", i, *currentSlot);
currentSlot++;
}
// we're all done with this memory block so we
// can free it
free( memoryBlock );
}
Thank you for any help.

Below I don't understand why:
int * currentSlot = memoryBlock
isn't using &memoryBlock.
Because both memoryBlock and currentSlot are pointers to int. &memoryBlock would be the address of a pointer to int, i.e. an int **.
What is "in" memoryBlock is a pointer to a block of memory.
Then here, *currentSlot = rand();, how does the dereferencing work here?
This is a rule of C: when a dereferencing expression like this occurs on the left-hand side of an assignment, the right-hand side's value is stored in the memory location pointed to by the pointer being dereferenced.

int * memoryBlock;
memoryBlock is a variable which can hold the address of a memory block of integers. The size of the memoryBlock variable is the size of an address. Typically 4 or 8 bytes (sizeof(int*)). Its type is "pointer to int".
memoryBlock = calloc ( 5, sizeof(int) );
the memoryBlock variable is assigned the address of the start of the memory block able to hold 5 integers. The memory block size is 5 * sizeof(int) bytes.
memoryBlock + 1 is the address of the second integer in the block.
memoryBlock + 5 is one past the address of the last integer in the block.
*memoryBlock is the content of the address (the first integer). type is integer.
*(memoryBlock + 0) = 0;
*(memoryBlock + 1) = 1;
*(memoryBlock + 2) = 2;
*(memoryBlock + 3) = 3;
*(memoryBlock + 4) = 4;
// *(memoryBlock + 5) = 5; illegal
Assigns integers to the memory block.
Subscript form, same as above.
memoryBlock[0] = 0;
memoryBlock[1] = 1;
memoryBlock[2] = 2;
memoryBlock[3] = 3;
memoryBlock[4] = 4;
// memoryBlock[5] = 5; illegal
&memoryBlock is the address of the memoryBlock variable. This is not the address of the callocated space. Its type is int** ("pointer to pointer to integer"), not int*.
int ** pmemoryBlock;
pmemoryBlock is a variable which holds the address of an address of a memory block of integers. The size of pmemoryBlock is the size of an address. Typically 4 or 8 bytes (sizeof(int**)).
pmemoryBlock = &memoryBlock;
pmemoryBlock is assigned the address of a variable which holds the address of the start of the memory block able to hold 5 integers.
*pmemoryBlock is the address of the memory block.
**pmemoryBlock is the first integer in the memory block
*((*pmemoryBlock) + 0) is the first integer in the memory block
*((*pmemoryBlock) + 1) is the second integer in the memory block
...
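A short sketch of when each level of indirection is actually needed. The helper names reset_block and fill_sequence are made up for this illustration: a function that only writes into the block takes an int *, while a function that has to change which block the caller's variable points to needs the address of that variable, i.e. an int **.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: replaces *block_ptr with a zeroed block of `count`
 * ints. It takes an int ** because it changes the caller's pointer variable. */
static int reset_block(int **block_ptr, size_t count)
{
    assert(block_ptr != NULL && count > 0);
    int *fresh = calloc(count, sizeof *fresh);
    if (fresh == NULL)
        return -1;
    free(*block_ptr);    /* release whatever the caller pointed at before (NULL is fine) */
    *block_ptr = fresh;  /* overwrite the caller's pointer variable */
    return 0;
}

/* Hypothetical helper: needs only an int *, because it writes through a
 * copy of the address and never changes the caller's variable. */
static void fill_sequence(int *block, size_t count)
{
    for (size_t i = 0; i < count; i++)
        block[i] = (int)i;
}
```

Usage would look like int *memoryBlock = NULL; reset_block(&memoryBlock, 5); fill_sequence(memoryBlock, 5);. The & appears only where the pointer variable itself has to be modified.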

memoryBlock is an array of integers (int*). (technically a pointer to an int but since it was allocated with enough room for 10 integers you can think of it as the start of an array)
*memoryBlock is the integer that memoryBlock is pointing to (the first int in the array). While the notation looks the same as the declaration, it is actually dereferencing the pointer. IMHO it is poorly written, as it should be declared
int* currentSlot = memoryBlock;
to make it more clear that it's a pointer to an integer, but that's a style choice.
&memoryBlock is the address of the pointer.
int * currentSlot = memoryBlock;
stores the pointer to the first slot in currentSlot. The program then generates random numbers and puts them in each of the 10 slots by incrementing currentSlot (which internally increments the pointer by the size of an integer).
Hope that helps.

In the code, memoryBlock is a pointer to some memory that stores integers. That is, the actual value of the variable memoryBlock is the address just allocated. If you use &memoryBlock you get the address of where the variable is stored, not what it points to.
Let's take an example:
int foo = 5;
/* the variable "foo" is stored in memory,
and that memory contains the number 5 */
int bar = 7;
/* the variable "bar" is stored in memory,
and that memory contains the number 7 */
int *foo_pointer = &foo;
/* the variable "foo_pointer" is stored in memory,
and that memory contains the address of the variable "foo" */
foo_pointer = &bar;
/* the contents of the variable "foo_pointer" is no longer the address
of where the variable "foo" is in memory, instead it is the address
of where the variable "bar" */
I hope this makes some sense, and it helps a little.

It's not supposed to use &memoryBlock, which is the (int ** (heh!)) address of the pointer into the memory you are clearing. In other words, memoryBlock (iff it's not NULL) points to (i.e., holds the address of) the first int in the calloc()'ed memory. To reference that cleared-to-0 memory, you use *memoryBlock.
If you ever find yourself trying to use &memoryBlock, don't: it's never the right thing to do in the code fragment you posted.
HTH. If it doesn't help, go back to K&R and study pointers some more. Maybe a lot more.

int * is a pointer, which can be dereferenced. int ** is a pointer to a pointer, which can be dereferenced twice. So what does this mean? Well, on typical systems a pointer is stored as nothing more than an integer. Memory addresses just run from zero to the maximum of the range. For 32-bit systems, the range of addressable memory is 0 to 2^32-1 (or 4294967295). Each of these addresses holds one byte. If you have an int *, then it will access memory 4 bytes at a time.
Also, for simplicity, let's assume this is a virtual address, you can't just access all this memory, some will be protected (system), some are not valid (not committed). To gain more memory you can ask the system to allocate more from this range. sbrk in Linux, VirtualAlloc in Windows but you will be accessing them usually through C's malloc or calloc.
Let's say, you have starting from 0x100:
0x100: 'h', 'e', 'l', 'l', 'o', '\0'
So this string, occupies memory from 0x100 to 0x105 (including the null terminator). If you have a pointer:
char *p = (char *)0x100;
Then you have:
p // 0x100
*p // 'h'
p+1 // 0x101
*(p+1) // 'e'
p += 2 // 0x102
*p // 'l'
p = (char *)0x200;
p // now points to 0x200 in memory
*p // contains whatever value is in 0x200
If you have int pointers, then you are accessing memory 4-bytes at a time (or however big an int is on your system).
So with all that background, when you run calloc, it returns the address of the block you've requested.
int *memoryBlock = calloc(count, sizeof(int));
// memoryBlock is an array of int, assuming sizeof(int) == 4, then
// you have 40 bytes of memory starting from the address of what is
// returned by calloc.
memoryBlock++; // now memoryBlock is at base address + 4
*memoryBlock = 10; // now that address contains the value 10
(*memoryBlock)++; // now that address contains the value 11
memoryBlock++; // now memoryBlock is 4 bytes further
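The "4 bytes at a time" claim can be checked directly instead of assumed. A minimal sketch, with int_step_bytes as a made-up helper name; the caller is responsible for keeping p + steps inside the same array (or one past its end), as C requires for pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes between p and p + steps.
 * p + steps must stay within the same array, or one past its end. */
static ptrdiff_t int_step_bytes(const int *p, ptrdiff_t steps)
{
    assert(p != NULL);
    const char *from = (const char *)p;
    const char *to = (const char *)(p + steps);
    return to - from;  /* equals steps * sizeof(int) */
}
```

For an int block[10], int_step_bytes(block, 1) gives sizeof(int) on whatever system it runs on, which is 4 only where int happens to be 4 bytes wide.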

How to access allocated memory in C?

After using malloc() to initialize 5000 bytes of memory, how would I reference the bytes in this memory space? For example, if I need to point to a starting location of data within the memory, how would I go about that?
EDIT: Does it matter what I use to point to it? I mean I am seeing people use bytes/int/char? Is it relevant?

You can use the subscript array[n] operator to access the index you are interested in reading/writing, like so:
uint8_t* bytes = (uint8_t*)malloc(5000); // not const, so it can be reset after free() below
bytes[0] = UINT8_MAX; // << write UINT8_MAX to the first element
uint8_t valueAtIndexZero = bytes[0]; // << read the first element (will be UINT8_MAX)
...
free(bytes), bytes = 0;

char * buffer = malloc(5000);
buffer[idx] = whatever;
char * p = buffer + idx;
*p = whatever;

malloc() doesn't initialize the bytes it allocates. Use calloc() instead if you need them zeroed.
int *p = malloc (5000); // p points to the start of the dynamically allocated area.

As has been mentioned by others, you could do something like this:
typedef unsigned char byte; // byte is not a built-in C type
int nbytes = 23; // number of bytes of space to allocate
byte *stuff = malloc(nbytes * sizeof stuff[0]);
stuff[0] = 0; // set the first byte to 0
byte x = stuff[0]; // get the first byte
int n = 3;
stuff[n] = 0; // set the nth byte to 0
x = stuff[n]; // nth byte, or in the case of some other type, nth whatever - just make sure it's a safe value, from 0 (inclusive) to the number (nbytes here) of things you allocated (exclusive)
However, a couple of things to note:
malloc will not initialise the memory, but calloc will (as mentioned by Prasoon Saurav)
You should always check to see if the memory allocation failed (see below for an example)
typedef unsigned char byte; // byte is not a built-in C type
int nbytes = 23; // or however many you want
byte *stuff = malloc(nbytes * sizeof stuff[0]);
if (NULL == stuff) // memory allocation failed!
{
//handle it here, e.g. by exiting the program and displaying an appropriate error message
}
stuff[0] = 0; // set the first byte to 0
byte x = stuff[0]; // get the first byte
int n = 3;
stuff[n] = 0; // set the nth byte to 0
x = stuff[n]; // nth byte, or in the case of some other type, nth whatever
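The two notes above (zero-initialisation and checking the index range) could be combined into something like the following sketch. alloc_zeroed and byte_at are made-up names; byte_at returns -1 for an out-of-range index so the caller can tell a failed lookup apart from any real byte value (0 to 255).

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned char byte;

/* Hypothetical helper: allocates nbytes zeroed bytes, or returns NULL on failure. */
static byte *alloc_zeroed(size_t nbytes)
{
    assert(nbytes > 0);
    return calloc(nbytes, sizeof(byte));
}

/* Hypothetical helper: bounds-checked read. Returns the byte at `index`
 * as an int in 0..255, or -1 if buf is NULL or index is out of range. */
static int byte_at(const byte *buf, size_t nbytes, size_t index)
{
    if (buf == NULL || index >= nbytes)
        return -1;
    return buf[index];
}
```

The caller still has to check the result of alloc_zeroed for NULL and eventually pass the pointer to free().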

malloc() returns a pointer to the allocated memory:
typedef unsigned char byte;
byte * mem = malloc( 5000 );
byte val = mem[1000]; /* reads the byte at index 1000 (the 1001st byte); its value is indeterminate until you write to it */

After using malloc() to initialize 5000 bytes of memory, how would I
reference the bytes in this memory space? For example, if I need to
point to a starting location of data within the memory, how would I go
about that?
Does it matter what I use to point to it? I mean I am seeing people
use bytes/int/char? Is it relevant?
As you have seen, malloc allocates a block of memory counted in bytes. You can assign a pointer to that block, and depending on the pointer type the compiler knows how to reference individual elements:
unsigned char *memblob = malloc( 1024 );
short* pshort = (short*)memblob;
Now if you reference the second short value, i.e. *(pshort + 1) or pshort[1], the compiler knows that it needs to add sizeof(short) bytes (typically 2) in order to get to the next element.
float* pfloat = (float*)memblob;
Now if you reference the second float value, i.e. *(pfloat + 1) or pfloat[1], the compiler knows that it needs to add sizeof(float) bytes (typically 4) in order to get to the next element.
The same goes for your own data types:
typedef struct s
{
short a;
long b;
} mystruct_t;
mystruct_t* pstruct = (mystruct_t*)memblob;
pstruct + 1 points to the struct at byte offset sizeof(mystruct_t).
So it is really up to you how you want to use the allocated memory.
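As a rough sketch of the struct case, the helper below treats a raw block as an array of mystruct_t and refuses indexes that would run past the end. struct_at is a made-up name, and the block is assumed to be suitably aligned for mystruct_t, which is true for anything returned by malloc.

```c
#include <assert.h>
#include <stddef.h>

typedef struct s {
    short a;
    long b;
} mystruct_t;

/* Hypothetical helper: views `blob` (nbytes long, suitably aligned) as an
 * array of mystruct_t and returns element `index`, or NULL if that element
 * would not fit completely inside the block. */
static mystruct_t *struct_at(void *blob, size_t nbytes, size_t index)
{
    assert(blob != NULL);
    if (index >= nbytes / sizeof(mystruct_t))
        return NULL;
    /* The compiler scales index by sizeof(mystruct_t) for us. */
    return (mystruct_t *)blob + index;
}
```

With unsigned char *memblob = malloc(1024);, struct_at(memblob, 1024, 1) would land sizeof(mystruct_t) bytes into the block, padding included.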
