Memory management in C

Memory management in C - c

Suppose I have a structure, and a pointer to a memory location p, how do I make sure that while creating an instance 'a' of the structure, it is placed into a memory chunk starting at p?
So I am passed a chunk of memory and I want to create a node at the beginning of it, so I need to make sure that the node is created at the beginning of it. (Note: I cannot use malloc, calloc, free, memcpy, or similar functions because I am writing code for a memory management system).

You don't really 'create' instances of structures in C like that. Assuming that p points to a block of usable memory, you can just treat p as a pointer to your structure type:
typedef struct {int x; long y;} a;
a *p2 = (a*)p;
int z = p2->x;
// or, if you don't want p2:
z = ((a*)p)->x;
Once p is cast (implicitly or explicitly as above), you can initialize the contents of your structure however you wish.
As an example, the following code will initialize a structure as you seem to request:
typedef struct {int x; float y;} tA;
void initA (void *p) {
tA *p2 = (tA*)p;
p2->x = 0;
p2->y = 3.14159;
}
int main (void) {
char bigmem[100];
initA (&(bigmem[0]));
return 0;
}
Don't get hung up on the main function above, it's only to illustrate how you can pass an arbitrary memory address to the function. In your real-world case, you will have the memory already allocated somehow.

If creation of the instance 'a' involves allocation of memory, then you can't make that allocation occur at memory pointed to by 'p'.
However, if by creation you mean initialisation of a structure in already allocated memory, then you should be able to pass 'p', typecast to a pointer to the structure, to the initialisation routine. But you will have to be careful that the memory pointed to by 'p' is large enough for the structure, is not being used for something else, and has the right alignment for the structure you are initialising.
If you are actually trying to do something else, you should post some code or go into a bit more detail.

just typecast the pointer to the type of your struct and you are done...
Hope it helps!

Basically, you take the address and cast it to a pointer of the appropriate type. The major problem you can run into is alignment: if the address isn't properly aligned for an object of that type, attempting to dereference the pointer can (and will) cause undefined behavior -- a typical reaction will be your program being aborted. If memory serves, a typical Unix kernel will give you an error message about a "bus error".

You can't control where the allocator will allocate memory from, but you can make a temporary instance (on the stack) and copy it into where p points with memcpy.
(Assuming p points to validly allocated memory, large enough for your structure and aligned appropriately.)

typedef struct {int x; long y;} A;
// Populate the members individually
A *aPtr = malloc(sizeof(A));
aPtr->x = 1;
aPtr->y = 2;
or
A *aPtr = malloc(sizeof(A));
A a;
a.x = 1;
a.y = 2;
// Use C's inherent "assignment == copy by value" capability
*aPtr = a;
or
A *aPtr = malloc(sizeof(A));
A a;
a.x = 1;
a.y = 2;
// Copy the memory yourself
memcpy(aPtr, &a, sizeof(A));
Feel free to replace my malloc with your own malloc.

Sounds like you are trying to do something similar to C++ placement new.
I think you confusing C with higher level languages. You never instantiate structs. You just allocate a bunch of memory and then cast those to a a struct pointer. Alternatively you allocate it on a stack which just means that compiler reserves so many bytes for you to use.

Code example for what Alex says:
struct foo {
int a;
char *b;
float c;
};
struct enough_space_for_a_foo {
char a[sizeof(struct foo))];
};
int main() {
// region of memory, which in the real code someone else is giving us
struct enough_space_for_a_foo memory_region;
// temporary object
struct foo tmp = {10, "ten", 10.0};
// copied to the specified region
memcpy(&memory_region, &tmp, sizeof(struct foo));
}
So, an arbitrary memory region now contains the same values as if it had been initialized as a struct foo, using the initializer expression {10, "ten", 10.0}.
If your struct doesn't need initializing with particular values, then you don't need to do anything. A region of memory in C basically is an instance of a struct if you choose to think of it as one (and it's big enough, and correctly aligned). There are no constructors, so just cast the pointer and get on with filling in the fields.

You asked just the right person. Not specifically, of course.
The answer depends on your OS. If you truly can't use anyone else's memory stuff, then you have your work cut out for you. You will have to make some kind of heap structure(s), perhaps some free list allocators for fixed size, and figure out what the OS has to offer. You have VirtualAlloc in windows and you have brk or similar in unix.
If this is really homework, this is way too much work if this is a single semester assignment.
Of course if all you want to know is how to prepend allocation size to what you return, just do whatever your code normally is, then put your value at the front, then advance the pointer by one and return that.

Related

malloc'd pointer inside struct that is passed by value

I am putting together a project in C where I must pass around a variable length byte sequence, but I'm trying to limit malloc calls due to potentially limited heap.
Say I have a struct, my_struct, that contains the variable length byte sequence, ptr, and a function, my_func, that creates an instance of my_struct. In my_func, my_struct.ptr is malloc'd and my_struct is returned by value. my_struct will then be used by other functions being passed by value: another_func. Code below.
Is this "safe" to do against memory leaks provided somewhere on the original or any copy of my_struct when passed by value, I call my_struct_destroy or free the malloc'd pointer? Specifically, is there any way that when another_func returns, that inst.ptr is open to being rewritten or dangling?
Since stackoverflow doesn't like opinion-based questions, are there any good references that discuss this behavior? I'm not sure what to search for.
typedef struct {
char * ptr;
} my_struct;
// allocates n bytes to pointer in structure and initializes.
my_struct my_func(size_t n) {
my_struct out = {(char *) malloc(n)};
/* initialization of out.ptr */
return out;
}
void another_func(my_struct inst) {
/*
do something using the passed-by-value inst
are there problems with inst.ptr here or after this function returns?
*/
}
void my_struct_destroy(my_struct * ms_ptr) {
free(ms_ptr->ptr);
ms_ptr->ptr = NULL;
}
int main() {
my_struct inst = my_func(20);
another_func(inst);
my_struct_destroy(&inst);
}

I's safe to pass and return a struct containing a pointer by value as you did it. It contains a copy of ptr. Nothing is changed in the calling function. There would, of course, be a big problem if another_func frees ptr and then the caller tries to use it or free it again.
Locality of alloc+free is a best practice. Wherever possible, make the function that allocates an object also responsible for freeing it. Where that's not feasible, malloc and free of the same object should be in the same source file. Where that's not possible (think complex graph data structure with deletes), the collection of files that manage objects of a given type should be clearly identified and conventions documented. There's a common technique useful for programs (like compilers) that work in stages where much of the memory allocated in one stage should be freed before the next starts. Here, memory is only malloced in big blocks by a manager. From these, the manager allocs objects of any size. But it knows only one way to free: all at once, presumably at the end of a stage. This is a gcc idea: obstacks. When allocation is more complex, bigger systems implement some kind of garbage collector. Beyond these ideas, there are as many ways to manage C storage as there are colors. Sorry I don't have any pointers to references (pun intended :)
If you only have one variable-length field and its size doesn't need to be dynamically updated, consider making the last field in the struct an array to hold it. This is okay with the C standard:
typedef struct {
... other fields
char a[1]; // variable length
} my_struct;
my_struct my_func(size_t n) {
my_struct *p = malloc(sizeof *p + (n - 1) * sizeof p->a[0]);
... initialize fields of p
return p;
}
This avoids the need to separately free the variable length field. Unfortunately it only works for one.
If you're okay with gcc extensions, you can allocate the array with size zero. In C 99, you can get the same effect with a[]. This avoids the - 1 in the size calculation.

Why memory allocation for a structure in C works with any value given to malloc?

I tried to find the proper way to dynamically allocate memory for a structure that looks like this:
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
I remember, but I'm not sure, that, in some school paper, I saw that the proper way to allocate memory for this case is this one:
myThread *node = (myThread *)malloc(sizeof(myThread *));
I tried that and it worked, but I didn't understand why. Sizeof pointer for my architecture is 8 bytes, so by writing the instruction above, I'm allocating 8 bytes of continuous memory, not enough to hold the information needed in my structure. So I tried to allocate 1 byte of memory, like this:
myThread *node = (myThread *)malloc(1);
And it's still working.
I tried to find the answer for this behavior but I didn't succeed. Why is this working? Besides that, I have few more questions:
Which is the right way to dynamically allocate memory for a structure?
Is that cast necessary?
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
Thank you
Later Edit: Thank you for your answers, but I realized that I didn't explained myself properly. By saying that it works, I mean that it worked to store values in those specific fields of the structure and use them later. I tested that by filling up the fields and printing them afterwards. I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.

The below works in that they allocate memory - yet the wrong size.
myThread *node = (myThread *)malloc(sizeof(myThread *));// wrong size,s/b sizeof(myThread)
myThread *node = (myThread *)malloc(1); // wrong size
Why is this working?
When code attempts to save data to that address, the wrong size may or may not become apparent. It is undefined behavior (UB).
C is coding without training wheels. When code has UB like not allocating enough memory and using it, it does not have to fail, it might fail, now or later or next Tuesday.
myThread *node = (myThread *)malloc(1); // too small
node->timeSlice = 42; // undefined behavior
Which is the right way to dynamically allocate memory for a structure? #M.M
The below is easy to code right, review and maintain.
p = malloc(sizeof *p); //no cast, no type involved.
// or
number_of_elements = 1;
p = malloc(sizeof *p * number_of_elements);
// Robust code does error checking looking for out-of-memory
if (p == NULL) {
Handle_error();
}
Is that cast necessary?
No. Do I cast the result of malloc?
How is the structure stored in memory?
Each member followed by potential padding. It is implementation dependent.
unsigned int
maybe some padding
char
maybe some padding
unsigned int
maybe some padding
sem_t
maybe some padding
I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.
OP is looking for a reason why it works.
Perhaps memory allocation is done in chunks of 64-bytes or something exceeding sizeof *p so allocating 1 had same effect as sizeof *p.
Perhaps the later memory area now corrupted by code's use of scant allocation will manifest itself later.
Perhaps the allocater is a malevolent beast toying with OP, only to wipe out the hard drive next April 1. (Nefarious code often takes advantage of UB to infect systems - this is not so far-fetched)
Its all UB. Anything may happen.

Since memory allocation in C is quite error prone I always define macro functions NEW and NEW_ARRAY as in the example below. This makes memory allocation more safe and succinct.
#include <semaphore.h> /*POSIX*/
#include <stdio.h>
#include <stdlib.h>
#define NEW_ARRAY(ptr, n) \
{ \
(ptr) = malloc((sizeof (ptr)[0]) * (n)); \
if ((ptr) == NULL) { \
fprintf(stderr, "error: Memory exhausted\n"); \
exit(EXIT_FAILURE); \
} \
}
#define NEW(ptr) NEW_ARRAY((ptr), 1)
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
int main(void)
{
myThread *node;
myThread **nodes;
int nodesLen = 100;
NEW(node);
NEW_ARRAY(nodes, nodesLen);
/*...*/
free(nodes);
free(node);
return 0;
}

malloc reserves memory for you to use.
When you attempt to use more memory than you requested, several results are possible, including:
Your program accesses memory it should not, but nothing breaks.
Your program accesses memory it should not, and this damages other data that your program needs, so your program fails.
Your program attempts to access memory that is not mapped in its virtual address space, and a trap is caused.
Optimization by the compiler transforms your program in an unexpected way, and strange errors occur.
Thus, it would not be surprising either that your program appears to work when you fail to allocate enough memory or that your program breaks when you fail to allocate enough memory.
Which is the right way to dynamically allocate memory for a structure?
Good code is myThread *node = malloc(sizeof *node);.
Is that cast necessary?
No, not in C.
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
The structure is stored in memory as a sequence of bytes, as all objects in C are. You do not need to do any byte or pointer calculations because the compiler does it for you. When you write node->timeSlice, for example, the compiler takes the pointer node, adds the offset to the member timeSlice, and uses the result to access the memory where the member timeSlice is stored.

you do not allocate the right size doing
myThread *node = (myThread *)malloc(sizeof(myThread *));
the right way can be for instance
myThread *node = (myThread *)malloc(sizeof(myThread));
and the cast is useless so finally
myThread *node = malloc(sizeof(myThread));
or as said in remarks to your question
myThread *node = malloc(sizeof(*node));
The reason is you allocate a myThread not a pointer to, so the size to allocate is the size of myThread
If you allocate sizeof(myThread *) that means you want a myThread ** rather than a myThread *
I know that (*node).threadId is equivalent to node->threadI
yes, -> dereference while . does not
Having myThread node; to access the field threadId you do node.threadId, but having a pointer to you need to deference whatever the way
Later Edit: ...
Not allocating enough when you access out of the allocated block the behavior is undefined, that means anything can happen, including nothing bad visible immediately

Difference in creating a struct using malloc and without malloc

Could someone please explain to me the difference between creating a structure with and without malloc. When should malloc be used and when should the regular initialization be used?
For example:
struct person {
char* name;
};
struct person p = {.name="apple"};
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
What is really the difference between the two? When would one approach be used over others?

Having a data structure like;
struct myStruct {
int a;
char *b;
};
struct myStruct p; // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct)); // alternative 2
Alternative 1: Allocates a myStruct width of memory space on stack and hands back to you the memory address of the struct (i.e., &p gives you the first byte address of the struct). If it is declared in a function, its life ends when the function exits (i.e. if function gets out of the scope, you can't reach it).
Alternative 2: Allocates a myStruct width of memory space on heap and a pointer width of memory space of type (struct myStruct*) on stack. The pointer value on the stack gets assigned the value of the memory address of the struct (which is on the heap) and this pointer address (not the actual structs address) is handed back to you. It's life time never ends until you use free(q).
In the latter case, say, myStruct sits on memory address 0xabcd0000 and q sits on memory address 0xdddd0000; then, the pointer value on memory address 0xdddd0000 is assigned as 0xabcd0000 and this is returned back to you.
printf("%p\n", &p); // will print "0xabcd0000" (the address of struct)
printf("%p\n", q); // will print "0xabcd0000" (the address of struct)
printf("%p\n", &q); // will print "0xdddd0000" (the address of pointer)
Addressing the second part of your; when to use which:
If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
If this struct is temporary and you do not need its value after, you do not need to malloc memory.
Usage with an example:
struct myStruct {
int a;
char *b;
};
struct myStruct *foo() {
struct myStruct p;
p.a = 5;
return &p; // after this point, it's out of scope; possible warning
}
struct myStruct *bar() {
struct myStruct *q = malloc(sizeof(struct myStruct));
q->a = 5;
return q;
}
int main() {
struct myStruct *pMain = foo();
// memory is allocated in foo. p.a was assigned as '5'.
// a memory address is returned.
// but be careful!!!
// memory is susceptible to be overwritten.
// it is out of your control.
struct myStruct *qMain = bar();
// memory is allocated in bar. q->a was assigned as '5'.
// a memory address is returned.
// memory is *not* susceptible to be overwritten
// until you use 'free(qMain);'
}

If we assume both examples occur inside a function, then in:
struct person p = {.name="apple"};
the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:
You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
You are working with a small number of objects at one time.
In:
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:
The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.
Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.
If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.
Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the p_tr requires edits in two places, which allows a human opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.

struct person p = {.name="apple"};
^This is Automatic allocation for a variable/instance of type person.
struct person* p_tr = malloc(sizeof(person));
^This is dynamic allocation for a variable/instance of type person.
Static memory allocation occurs at Compile Time.
Dynamic memory allocation means it allocates memory at runtime when the program executes that line of instruction

Judging by your comments, you are interested in when to use one or the other. Note that all types of allocation reserve a computer memory sufficient to fit the value of the variable in it. The size depends on the type of the variable. Statically allocated variables are pined to a place in the memory by the compiler. Automatically allocated variables are pinned to a place in stack by the same compiler. Dynamically allocated variables do not exist before the program starts and do not have any place in memory till they are allocated by 'malloc' or other functions.
All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.
The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.
Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.
There are other uses as well.
In your simple example there is no much difference between both cases. The second requires additional computer operations, call to the 'malloc' function to allocate the memory for your struct. Whether in the first case the memory for the struct is allocated in a static program region defined at the program start up time. Note that the pointer in the second case also allocated statically. It just keeps the address of the memory region for the struct.
Also, as a general rule, the dynamically allocated data should be eventually freed by the 'free' function. You cannot free the static data.

initialising a structure through pointer

Suppose we have
struct me {
int b;
};
void main() {
struct me *m1;
m1->b=3;
}
My quesrion is that , as m1 is a pointer of type me and is currently
not holding any address of variable of type me then how we can access b which is member of me
through a pointer which is not pointing to any variable of type me and if we can then which variable of type me is accesing a?

It's either
struct me m1;
m1.b = 3;
or
struct me *m1 = malloc(sizeof(struct me));
m1->b = 3;
When you deal with pointers in C, you usually need to do 3 things:
create the pointer
make sure the memory is allocated where the pointer should point
make the pointer point to that memory
Your solution only did the first of these.
The reason why your printf works, is that the actual assignment and reading still works. You were overwriting some random memory in your process, this time without any disastrous result. But it's pure "luck". You could have ended up with a segmentation fault as well.

1) You must allocate space for the object you're pointing to first
2) Then - and only then - can you assign the value m1->b = 3

void main()
{
struct me *m1=malloc(sizeof(struct me)); //here allocating the memory first
m1->b=3;
//do what you want to do
free(m1); //once you allocate the memory, you have to free it after your job is done
}
If you do not allocate memory and access (like you have done), you are accessing a part of memory where m1 points to. It will compile fine. But if m1 has a value outside the segment of your code, it will give rise to segmentation fault. Also if it is within your segment, it may overwrite other values. So it is always desirable to allocate the memory before using it.

memory allocation for stack vs primitive datatypes

While declaring a structure in C, say:
typedef struct my_stuct {
int x;
float f;
} STRT;
If we want to create an instance of this struct and use it, we explicitly need to call malloc, get a pointer to the memory location for this struct before we can actually initialize/use any of the members of the structure:
STRT * my_struct_instance = (STRT *) (malloc(sizeof(STRT)));
However, if I declare a primitive data type (say "int a;") and then want to initialize it (or do any other operation to it), I do not need to explicitly assign mempory space for it by calling malloc before performing any operation on it:
// we do not need to do a malloc(sizeof(i)) blah blah here. Why?
i = 10;
Can you please explain what is the reason for this inconsistency? Thank you!

There is no inconsistency. Each of the two methods can be used both with primitives and with structs:
STRT s1 = {1, 2};
int i1 = 1;
STRT *s2 = (STRT *)malloc(sizeof(STRT));
int *i2 = (int *)malloc(sizeof(int));
...

you can do:
int i;
or
int *i = (int*) malloc(sizeof(int));
just like you can do
STRT my_struct_instance;
or
STRT * my_struct_instance = (STRT *) (malloc(sizeof(STRT)));

In your malloc example, you are using pointers. The inconsistency, as you call it, is because a pointer can be initialized in several ways. It is not always initialized by a new memory allocations, but it can also be initialized to point at an existing memory block. So, it is not possible for the language to assume that the variable should be allocated on the heap:
STRT* my_struct_instance; // here I assume (incorrectly) that it is automatically allocated on the heap
my_struct_instance->x = 0; // ERROR: uninitialized use of that variable
Don't know if that answers your question.

You can do it in both ways, there isn't any inconsistency.
Heap
int* a= malloc(sizeof(int));
*a=10;
STRT* b= malloc(sizeof(STRT));
b->x=1;
b->f=1.0;
Stack
int a=10;
STRT b= {1, 1.0};

STRT * my_struct_instance = (STRT *) (malloc(sizeof(STRT)));
uses dynamic storage;
int a;
uses automatic storage (I'm using C++ names right now, but it is probably called similarly in C). So, those are two completely different things. int a; is local, on (in most implementations) stack (although stack is not relevant implementation detail); SRTR * [...] is dynamic, on (in most implementations) heap (although, again, heap is not relevant implementation detail).
So, there is no inconsistency. Saying there is one is like saying that there is inconsistency between apples and oranges - but of course there is, since you are comparing apples and oranges. (The other parts of the question don't make sense, since they are based on assumption that apples and oranges are one and the same thing).

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight