I'm doing a project where our professor has given us code with variables and prototype declarations that we are not allowed to change. One is a struct, and a pointer to that struct is typedef'd as pStudentRecord:
typedef struct
{
char* firstName;
char* lastName;
int id;
float mark;
}* pStudentRecord;
There is also a pointer to this type called g_ppRecords. This will be a dynamic array of pointers to the structs above.
Here's where my question comes in. The records will be read from a file. If the filename specified doesn't exist then it creates a new one. I'm initializing the g_ppRecords pointer whenever the user adds the first new record:
if(!n) //where n = number of records
g_ppRecords = (pStudentRecord*) malloc(sizeof(pStudentRecord));
g_ppRecords[n] = (pStudentRecord) malloc(16);
This code works every time I've run it so far, but I'm not sure how. If I add more records then a new pointer (pStudentRecord) will be created in the next position in g_ppRecords. By my understanding, I haven't allocated the space for that new pointer, yet every time it works without even a hint of a problem. I can access the members of the new structs fine and I'm not getting a heap corruption error or access violation or anything like that. Are my concerns correct or am I doubting myself?
Based on the code that you've shown, your concerns are valid.
This line:
g_ppRecords = (pStudentRecord*) malloc(sizeof(pStudentRecord));
Allocates only enough space for a single pStudentRecord. Think of this as an array of pStudentRecord with only a single element, at g_ppRecords[0].
If I add more records then a new pointer (pStudentRecord) will be created in the next position in g_ppRecords.
Now the problem is what might happen when you do what you've described here. What happens when you add a new pointer? Unless you use realloc to get more space for g_ppRecords, you don't have room in that array for more pointers to records. If you malloc a new pointer at the second element, i.e.:
g_ppRecords[1] = (pStudentRecord) malloc(16);
Then you're using memory, g_ppRecords[1], that you haven't allocated. This may appear to work, but this memory doesn't belong to you. Keep adding new pointers and eventually your program will break. Or your program may break because of something totally unrelated in another part of your code.
The fix is that you should initially allocate your array so that it can hold multiple pointers, instead of only one. How can you do this with your first malloc line?
I should add that when you allocate memory for a struct using malloc(16) you're making assumptions about the data structure that you shouldn't make, specifically that the struct will always occupy 16 bytes. Given your typedef: straight to a pointer from an anonymous struct, you can change that 16 to something more general, but this isn't directly related to your question, and is something that you should ask your professor about.
As a general rule, try to avoid malloc( sizeof( type )), especially when the type is obfuscated by a typecast. It is much safer to call sizeof on a variable: malloc( sizeof x ). Also, in C, you should not cast the return from malloc. In other words, instead of allocating space for only one record with:
g_ppRecords = (pStudentRecord*) malloc(sizeof(pStudentRecord));
it would be better to allocate space for n records by writing:
g_ppRecords = malloc( n * sizeof *g_ppRecords );
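One way to act on this advice when the number of records is not known up front is to grow the pointer array with realloc as records are added. The following is only a sketch under stated assumptions: add_record, g_count and g_capacity are hypothetical names introduced for illustration (they are not part of the professor's code), and doubling the capacity is just one common growth strategy.

```c
#include <stdlib.h>

typedef struct
{
    char* firstName;
    char* lastName;
    int id;
    float mark;
}* pStudentRecord;

pStudentRecord* g_ppRecords = NULL; /* dynamic array of record pointers */
size_t g_count = 0;                 /* hypothetical: slots currently in use */
size_t g_capacity = 0;              /* hypothetical: slots currently allocated */

/* Appends one zero-filled record and returns it, or NULL on failure. */
pStudentRecord add_record(void)
{
    if (g_count == g_capacity) {
        /* Grow the array of pointers before writing to a new slot. */
        size_t newCap = g_capacity ? g_capacity * 2 : 4;
        pStudentRecord* tmp = realloc(g_ppRecords, newCap * sizeof *g_ppRecords);
        if (tmp == NULL)
            return NULL;            /* the old array is still valid */
        g_ppRecords = tmp;
        g_capacity = newCap;
    }

    /* sizeof *rec is the size of the anonymous struct, whatever it is. */
    pStudentRecord rec = calloc(1, sizeof *rec);
    if (rec == NULL)
        return NULL;

    g_ppRecords[g_count++] = rec;
    return rec;
}
```

Assigning the result of realloc to a temporary first means the existing array is not lost if the call fails.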
typedef struct
{
char* firstName;
char* lastName;
int id;
float mark;
}* pStudentRecord;
This is an anonymous struct. A bit weird here, but maybe to teach you something. Do this to create a new one:
pStudentRecord ptr;
ptr = malloc(sizeof(*ptr));
This will automatically malloc the right amount of memory.
You've still got problems because you need to malloc the array to hold the pointers. For that do this:
pStudentRecord* g_ppRecords = malloc(n * sizeof(pStudentRecord));
You can then use g_ppRecords like this:
pStudentRecord ptr = g_ppRecords[3];
Putting it all together we have our custom allocator:
pStudentRecord* g_ppRecords = malloc(n * sizeof(pStudentRecord));
for (size_t i = 0; i < n; ++i)
{
pStudentRecord ptr;
g_ppRecords[i] = malloc(sizeof(*ptr));
}
I wrote all this without compiling and testing, so there may be errors (but it's not my homework :-) )
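For completeness, here is a sketch of the same allocation with error checking and a matching cleanup routine. The function names records_create and records_destroy are hypothetical, and the sketch assumes firstName and lastName, when set, point to memory obtained from malloc.

```c
#include <stdlib.h>

typedef struct
{
    char* firstName;
    char* lastName;
    int id;
    float mark;
}* pStudentRecord;

/* Allocates an array of n record pointers plus n records.
   Returns NULL if any allocation fails (nothing is leaked). */
pStudentRecord* records_create(size_t n)
{
    pStudentRecord* recs = malloc(n * sizeof *recs);
    if (recs == NULL)
        return NULL;

    for (size_t i = 0; i < n; ++i) {
        recs[i] = malloc(sizeof *recs[i]);
        if (recs[i] == NULL) {
            while (i > 0)           /* undo the records created so far */
                free(recs[--i]);
            free(recs);
            return NULL;
        }
        recs[i]->firstName = NULL;
        recs[i]->lastName = NULL;
        recs[i]->id = 0;
        recs[i]->mark = 0.0f;
    }
    return recs;
}

/* Frees every record, its strings, and then the array itself. */
void records_destroy(pStudentRecord* recs, size_t n)
{
    if (recs == NULL)
        return;
    for (size_t i = 0; i < n; ++i) {
        free(recs[i]->firstName);   /* free(NULL) is a no-op */
        free(recs[i]->lastName);
        free(recs[i]);
    }
    free(recs);
}
```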
I am putting together a project in C where I must pass around a variable length byte sequence, but I'm trying to limit malloc calls due to potentially limited heap.
Say I have a struct, my_struct, that contains the variable length byte sequence, ptr, and a function, my_func, that creates an instance of my_struct. In my_func, my_struct.ptr is malloc'd and my_struct is returned by value. my_struct will then be used by other functions being passed by value: another_func. Code below.
Is this "safe" to do against memory leaks provided somewhere on the original or any copy of my_struct when passed by value, I call my_struct_destroy or free the malloc'd pointer? Specifically, is there any way that when another_func returns, that inst.ptr is open to being rewritten or dangling?
Since stackoverflow doesn't like opinion-based questions, are there any good references that discuss this behavior? I'm not sure what to search for.
typedef struct {
char * ptr;
} my_struct;
// allocates n bytes to pointer in structure and initializes.
my_struct my_func(size_t n) {
my_struct out = {(char *) malloc(n)};
/* initialization of out.ptr */
return out;
}
void another_func(my_struct inst) {
/*
do something using the passed-by-value inst
are there problems with inst.ptr here or after this function returns?
*/
}
void my_struct_destroy(my_struct * ms_ptr) {
free(ms_ptr->ptr);
ms_ptr->ptr = NULL;
}
int main() {
my_struct inst = my_func(20);
another_func(inst);
my_struct_destroy(&inst);
}
It's safe to pass and return a struct containing a pointer by value as you did. The copy contains a copy of ptr, which points to the same allocated block. Nothing is changed in the calling function. There would, of course, be a big problem if another_func frees ptr and then the caller tries to use it or free it again.
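A small sketch may make the aliasing concrete. The function write_through_copy is a hypothetical name used only for illustration: it shows that writes through the copied pointer are visible to the caller, while reassigning the copy's pointer is not.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char * ptr;
} my_struct;

/* inst is a copy of the caller's struct, but inst.ptr holds the same
   address as the caller's ptr, so both refer to one heap block. */
void write_through_copy(my_struct inst, size_t n)
{
    memset(inst.ptr, 'x', n);   /* modifies the shared block: caller sees it */
    inst.ptr = NULL;            /* modifies only the local copy of the pointer */
}
```

After the call, the caller's ptr is unchanged and still owns the block, so exactly one free is needed no matter how many copies of the struct were made.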
Locality of alloc+free is a best practice. Wherever possible, make the function that allocates an object also responsible for freeing it. Where that's not feasible, malloc and free of the same object should be in the same source file. Where that's not possible (think complex graph data structure with deletes), the collection of files that manage objects of a given type should be clearly identified and conventions documented. There's a common technique useful for programs (like compilers) that work in stages where much of the memory allocated in one stage should be freed before the next starts. Here, memory is only malloced in big blocks by a manager. From these, the manager allocs objects of any size. But it knows only one way to free: all at once, presumably at the end of a stage. This is a gcc idea: obstacks. When allocation is more complex, bigger systems implement some kind of garbage collector. Beyond these ideas, there are as many ways to manage C storage as there are colors. Sorry I don't have any pointers to references (pun intended :)
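The "allocate from big blocks, free all at once" technique described above can be sketched as a tiny arena. This is not the obstack API; the names arena, arena_init, arena_alloc and arena_release are hypothetical, and the sketch uses a single fixed-size block for brevity where a real manager would chain several.

```c
#include <stddef.h>
#include <stdlib.h>

/* Alignment unit: the size of a union of commonly strict types. */
typedef union { long long ll; long double ld; void *p; } arena_align_t;
#define ARENA_ALIGN sizeof(arena_align_t)

typedef struct {
    unsigned char *base;  /* one big block obtained from malloc */
    size_t used;          /* bytes handed out so far */
    size_t cap;           /* total bytes in the block */
} arena;

/* Returns 1 on success, 0 if the big block could not be allocated. */
int arena_init(arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base != NULL;
}

/* Carves n bytes out of the block; returns NULL when it is exhausted. */
void *arena_alloc(arena *a, size_t n)
{
    size_t rounded = (n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (rounded < n || rounded > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* The only way to free: everything at once. */
void arena_release(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->used = 0;
    a->cap = 0;
}
```

Individual objects are never freed, so there is a single free call to get right at the end of each stage.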
If you only have one variable-length field and its size doesn't need to be dynamically updated, consider making the last field in the struct an array to hold it. This is okay with the C standard:
typedef struct {
... other fields
char a[1]; // variable length
} my_struct;
my_struct *my_func(size_t n) {
my_struct *p = malloc(sizeof *p + (n - 1) * sizeof p->a[0]);
... initialize fields of p
return p;
}
This avoids the need to separately free the variable length field. Unfortunately it only works for one.
If you're okay with gcc extensions, you can declare the array with size zero. In C99, you can get the same effect with a[] (a flexible array member). This avoids the - 1 in the size calculation.
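As a sketch of the C99 a[] variant, assuming the struct only needs a length and the bytes themselves: the type byte_seq and the function byte_seq_create are hypothetical names, not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;     /* number of bytes stored in data */
    char data[];    /* C99 flexible array member: must be the last field */
} byte_seq;

/* Allocates the header and n bytes of payload in one malloc call,
   then copies n bytes from src. Returns NULL on failure. */
byte_seq *byte_seq_create(const void *src, size_t n)
{
    byte_seq *p = malloc(sizeof *p + n);
    if (p == NULL)
        return NULL;
    p->len = n;
    if (n > 0)
        memcpy(p->data, src, n);
    return p;
}
```

A single free(p) releases both the header and the payload. Note that such a struct has to be passed around by pointer: copying it by value copies only the fixed-size part, not the trailing bytes.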
I tried to find the proper way to dynamically allocate memory for a structure that looks like this:
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
I remember, but I'm not sure, that, in some school paper, I saw that the proper way to allocate memory for this case is this one:
myThread *node = (myThread *)malloc(sizeof(myThread *));
I tried that and it worked, but I didn't understand why. Sizeof pointer for my architecture is 8 bytes, so by writing the instruction above, I'm allocating 8 bytes of continuous memory, not enough to hold the information needed in my structure. So I tried to allocate 1 byte of memory, like this:
myThread *node = (myThread *)malloc(1);
And it's still working.
I tried to find the answer for this behavior but I didn't succeed. Why is this working? Besides that, I have few more questions:
Which is the right way to dynamically allocate memory for a structure?
Is that cast necessary?
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
Thank you
Later Edit: Thank you for your answers, but I realized that I didn't explain myself properly. By saying that it works, I mean that it worked to store values in those specific fields of the structure and use them later. I tested that by filling up the fields and printing them afterwards. I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.
Both lines below work in the sense that they allocate memory, but they allocate the wrong size.
myThread *node = (myThread *)malloc(sizeof(myThread *)); // wrong size; should be sizeof(myThread)
myThread *node = (myThread *)malloc(1); // wrong size
Why is this working?
When code attempts to save data to that address, the wrong size may or may not become apparent. It is undefined behavior (UB).
C is coding without training wheels. When code has UB like not allocating enough memory and using it, it does not have to fail, it might fail, now or later or next Tuesday.
myThread *node = (myThread *)malloc(1); // too small
node->timeSlice = 42; // undefined behavior
Which is the right way to dynamically allocate memory for a structure? #M.M
The below is easy to code right, review and maintain.
p = malloc(sizeof *p); //no cast, no type involved.
// or
number_of_elements = 1;
p = malloc(sizeof *p * number_of_elements);
// Robust code does error checking looking for out-of-memory
if (p == NULL) {
Handle_error();
}
Is that cast necessary?
No. See the question "Do I cast the result of malloc?"
How is the structure stored in memory?
Each member followed by potential padding. It is implementation dependent.
unsigned int
maybe some padding
char
maybe some padding
unsigned int
maybe some padding
sem_t
maybe some padding
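The actual offsets and padding on a given implementation can be inspected with offsetof. The sketch below uses a hypothetical demoThread type in which a double stands in for sem_t, so that it compiles without POSIX headers; the printed numbers depend on the platform.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    double threadSem;       /* hypothetical stand-in for sem_t */
} demoThread;

/* Prints where each member starts and the total size, padding included. */
void print_layout(void)
{
    printf("threadId       at offset %zu\n", offsetof(demoThread, threadId));
    printf("threadPriority at offset %zu\n", offsetof(demoThread, threadPriority));
    printf("timeSlice      at offset %zu\n", offsetof(demoThread, timeSlice));
    printf("threadSem      at offset %zu\n", offsetof(demoThread, threadSem));
    printf("sizeof(demoThread) = %zu\n", sizeof(demoThread));
}
```

If sizeof(demoThread) is larger than the sum of the member sizes, the difference is padding.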
I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.
OP is looking for a reason why it works.
Perhaps memory allocation is done in chunks of 64-bytes or something exceeding sizeof *p so allocating 1 had same effect as sizeof *p.
Perhaps the later memory area now corrupted by code's use of scant allocation will manifest itself later.
Perhaps the allocator is a malevolent beast toying with OP, only to wipe out the hard drive next April 1. (Nefarious code often takes advantage of UB to infect systems, so this is not so far-fetched.)
It's all UB. Anything may happen.
Since memory allocation in C is quite error prone I always define macro functions NEW and NEW_ARRAY as in the example below. This makes memory allocation more safe and succinct.
#include <semaphore.h> /*POSIX*/
#include <stdio.h>
#include <stdlib.h>
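/* do { ... } while (0) makes the macro behave like a single statement,
   so NEW(p); is safe inside an if/else without braces */
#define NEW_ARRAY(ptr, n) \
do { \
(ptr) = malloc((sizeof (ptr)[0]) * (n)); \
if ((ptr) == NULL) { \
fprintf(stderr, "error: Memory exhausted\n"); \
exit(EXIT_FAILURE); \
} \
} while (0)
#define NEW(ptr) NEW_ARRAY((ptr), 1)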
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
int main(void)
{
myThread *node;
myThread **nodes;
int nodesLen = 100;
NEW(node);
NEW_ARRAY(nodes, nodesLen);
/*...*/
free(nodes);
free(node);
return 0;
}
malloc reserves memory for you to use.
When you attempt to use more memory than you requested, several results are possible, including:
Your program accesses memory it should not, but nothing breaks.
Your program accesses memory it should not, and this damages other data that your program needs, so your program fails.
Your program attempts to access memory that is not mapped in its virtual address space, and a trap is caused.
Optimization by the compiler transforms your program in an unexpected way, and strange errors occur.
Thus, it would not be surprising either that your program appears to work when you fail to allocate enough memory or that your program breaks when you fail to allocate enough memory.
Which is the right way to dynamically allocate memory for a structure?
Good code is myThread *node = malloc(sizeof *node);.
Is that cast necessary?
No, not in C.
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
The structure is stored in memory as a sequence of bytes, as all objects in C are. You do not need to do any byte or pointer calculations because the compiler does it for you. When you write node->timeSlice, for example, the compiler takes the pointer node, adds the offset to the member timeSlice, and uses the result to access the memory where the member timeSlice is stored.
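To illustrate the offset calculation, here is a rough sketch of what node->timeSlice amounts to. The type demoThread and the function read_time_slice are hypothetical (sem_t is left out to keep the example free of POSIX headers); real code should simply use the -> operator.

```c
#include <stddef.h>

typedef struct {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
} demoThread;

/* Reads timeSlice by hand: start address in bytes, plus the member's
   offset, reinterpreted as a pointer to the member's type. */
unsigned int read_time_slice(const demoThread *node)
{
    const char *bytes = (const char *)node;
    return *(const unsigned int *)(bytes + offsetof(demoThread, timeSlice));
}
```

This also shows why *(node + sizeof(firstElement)) does not reach the second member: arithmetic on a myThread * moves in units of whole structures, not bytes, and the member offsets include any padding the compiler inserted.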
You do not allocate the right size with
myThread *node = (myThread *)malloc(sizeof(myThread *));
the right way can be for instance
myThread *node = (myThread *)malloc(sizeof(myThread));
and the cast is useless so finally
myThread *node = malloc(sizeof(myThread));
or as said in remarks to your question
myThread *node = malloc(sizeof(*node));
The reason is that you are allocating a myThread, not a pointer to one, so the size to allocate is the size of myThread.
If you allocate sizeof(myThread *), that means you want a myThread ** rather than a myThread *.
I know that (*node).threadId is equivalent to node->threadId
Yes, -> dereferences the pointer while . does not.
Given myThread node; you access the field threadId with node.threadId, but given a pointer to a myThread you need to dereference it one way or the other.
Later Edit: ...
If you do not allocate enough memory and then access outside the allocated block, the behavior is undefined. That means anything can happen, including nothing visibly bad happening immediately.
I've got a theoretical question on allocating memory for structs. Consider the following code IN THE MAIN FUNCTION:
I have the following struct:
typedef struct {
char *descr = NULL;
DWORD id = 0x00FFFF00;
int start_byte = 0;
int end_byte = 0;
double conversion_factor = 0.0;
} CAN_ID_ENTRY;
I want an array of this structs, so I'm allocating a pointer to the first struct:
can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY));
And then I'm allocating memory for the first struct can_id_list[0]:
can_id_list[0] = (CAN_ID_ENTRY *)malloc(sizeof(CAN_ID_ENTRY));
Now the problem is, that I don't know HOW MANY of these structs I need (because I'm reading a CSV-File and I don't know the amount of lines/entries). So I need to enlarge the struct-pointer can_id_list for a second one:
can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY));
And then I'm allocating the second struct can_id_list[1]:
can_id_list[1] = (CAN_ID_ENTRY *)malloc(sizeof(CAN_ID_ENTRY));
can_id_list[1]->id = 6;
Obviously, this works. But why? My point is the following: Normally, malloc allocates memory in one block in the memory (without gaps). But if another malloc is done BEFORE I'm allocating memory for the next struct, there is a gap between the first and the second struct. So, why can I access the second struct via can_id_list[1]? Does the index [1] store the actual address of the struct, or does it just calculate the size of the struct and jumps to this address beginning on the offset of the struct-pointer can_id_list (-> can_id_list+<2*sizeof(CAN_ID_ENTRY))?
Well, my real problem is, that I need to do this inside a function and therefore I need to pass the pointer of the struct to the function. But I don't know how to do this, because can_id_list is already a pointer ... and the changes must also be visible in the main method (that's the reason i need to use pointers).
The mentioned function is this one:
int load_can_id_list(char *filename, CAN_ID_ENTRY **can_id_list);
But is the parameter CAN_ID_ENTRY **can_id_list correct? And how do i pass the struct-array into this function? And how can i modify it inside??
Any help would be great!
EDIT: Casting malloc returns - Visual Studio forces me to do that! (Because it's a C++ project i think)
As the comments already said, the source of your confusion is can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY)); allocating the wrong amount of memory. It probably gave you space for a few pointers to be stored, not just one. Should be can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY*));.
To answer the question at the end,
But is the parameter CAN_ID_ENTRY **can_id_list correct? And how do i
pass the struct-array into this function? And how can i modify it
inside??
If you want to enlarge the size of the array within another function, you need to pass CAN_ID_ENTRY*** ptr so you can set *ptr = realloc(...) inside as needed. realloc may give you the new chunk of memory at a different address, so you can't simply pass in a CAN_ID_ENTRY** ptr and then call realloc(ptr), because the caller would never see the new address. See https://www.tutorialspoint.com/c_standard_library/c_function_realloc.htm
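A sketch of that pattern follows. It is written as plain C, so the results of realloc and calloc are not cast; in a C++ build they would need casts. The function append_entry and its count parameter are hypothetical, and unsigned long stands in for DWORD so the example does not depend on Windows headers.

```c
#include <stdlib.h>

typedef struct {
    char *descr;
    unsigned long id;           /* stand-in for DWORD */
    int start_byte;
    int end_byte;
    double conversion_factor;
} CAN_ID_ENTRY;

/* Grows *list by one slot and puts a new zero-filled entry in it.
   Updates the caller's pointer and count. Returns 0 on success, -1 on failure. */
int append_entry(CAN_ID_ENTRY ***list, size_t *count)
{
    CAN_ID_ENTRY **grown = realloc(*list, (*count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;              /* *list is untouched and still valid */
    *list = grown;              /* the array may have moved: tell the caller */

    CAN_ID_ENTRY *entry = calloc(1, sizeof *entry);
    if (entry == NULL)
        return -1;              /* array is larger, but count is unchanged */

    grown[*count] = entry;
    ++*count;
    return 0;
}
```

The caller starts with CAN_ID_ENTRY **list = NULL and size_t count = 0, then calls append_entry(&list, &count) once per CSV line. The index list[1] works because the array stores the address of each separately allocated struct, not because the structs are adjacent in memory.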
This seems to be a very simple problem but I can't quite figure out which part is causing it. Basically, I have a struct that just contains an array of strings
struct command_stream{
char **tokens;
};
typedef struct command_stream *command_stream_t;
command_stream_t test;
Then later on, I parse some strings into shorter ones and end up with another array of strings
char **words = *array of strings*
words contains the correct information I want, I looped through and printed out each element to make sure I wasn't getting a faulty string. So now I just point tokens to words
test->tokens = words;
But it gives me a segmentation fault. I'm not sure why though. They're both pointers, so unless I'm missing something obvious...
EDIT: The function as a whole has to return a pointer, which is why it was set up like this, which I keep forgetting. But I think I've got it, if I just create a new typedef
typedef struct command_stream command_stream_s;
command_stream_s new_command_stream;
and just return
&new_command_stream;
That should work right? Even though new_command_stream itself isn't a pointer.
From your code excerpt, it seems that you have not created an instance of the struct. You have successfully declared a pointer to the struct with command_stream_t test; but this pointer does not point anywhere yet.
You need to allocate memory for your struct in some way and make test reference it. For instance:
command_stream_t test =
(command_stream_t) malloc(sizeof(struct command_stream));
This way you can successfully use:
test->tokens = words;
as you intended.
Note that you don't need to use malloc to allocate the memory. The pointer can reference a local/global variable as long as it has memory associated to it (N.B. if you use a local var don't use the pointer outside the declaration scope of that var).
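Since the function has to return a pointer, the malloc route is the one that survives the return. A minimal sketch, with make_command_stream as a hypothetical name:

```c
#include <stdlib.h>

struct command_stream {
    char **tokens;
};
typedef struct command_stream *command_stream_t;

/* Allocates the struct on the heap so the returned pointer stays valid
   after the function returns. The caller frees it with free(). */
command_stream_t make_command_stream(char **words)
{
    command_stream_t s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->tokens = words;  /* stores the pointer; the strings are not copied */
    return s;
}
```

Returning the address of a variable declared inside the function would instead hand the caller a pointer to storage that no longer exists once the function returns.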
typedef struct command_stream *command_stream_t;
command_stream_t test;
This makes "test" a pointer. There is no memory allocated for the structure.
You need to allocate memory for the structure and make the test pointer point to the block of memory before you can dereference by saying -
test->tokens = words;
Do this:
typedef struct command_stream command_stream_t;
command_stream_t test;
test.tokens = words;
The difference is that command_stream_t is no longer a pointer type; it is the actual structure.
Hello, I'm not sure if I understand the following piece of code. I would be glad if someone could read my explanations and correct me if I'm wrong.
So first of all I'm declaring a struct with three arrays of char and an integer.
struct Employee
{
char last[16];
char first[11];
char title[16];
int salary;
};
After that I declare a function which takes three pointers to char and an integer value. This function uses malloc() and sizeof() to create a struct on the heap. Now this creation of the object on the heap is not really clear to me. When I use struct Employee* p = malloc(sizeof(struct Employee)), what happens there exactly? What happens when I use the function struct Employee* createEmployee (char* last, char* first, char* title, int salary) several times with different input? I know that I will get back a pointer p, but isn't that the same pointer to the same struct on the heap? So do I overwrite the information on the heap when I use the function several times? Or does it always create a new object in a different memory space?
struct Employee* createEmployee(char*, char*, char*, int);
struct Employee* createEmployee(char* last, char* first, char* title, int salary)
{
struct Employee* p = malloc(sizeof(struct Employee));
if (p != NULL)
{
strcpy(p->last, last);
strcpy(p->first, first);
strcpy(p->title, title);
p->salary = salary;
}
return p;
}
I would be glad if someone could explain it to me. Thank you very much.
The malloc function allocates some new bytes on the heap and returns the pointer.
So the createEmployee function allocates new memory every time it's called, then fills it with some data (in an unsafe way; consider using strncpy instead, keeping in mind that strncpy does not null-terminate the destination when the source is too long) and returns the pointer to that memory. It will return a different pointer every time it's called.
Each instance you create with this function will exist as long as you don't call free on its pointer.
Your first question is a question about malloc. You might get better results searching for "How does malloc work?" The answer is different for different operating systems and C libraries.
The createEmployee function creates an all-new struct Employee every time it is called.
I also see that createEmployee is written in a very dangerous way. No checking is done to ensure that the strings fit into their destinations before calling strcpy. This is how buffer overflows are created.
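One possible way to close that hole is to check each string's length against its destination before copying. The following is a sketch, not the original function: createEmployeeChecked is a hypothetical name, and rejecting over-long input (rather than truncating it) is an assumption about the desired behavior.

```c
#include <stdlib.h>
#include <string.h>

struct Employee
{
    char last[16];
    char first[11];
    char title[16];
    int salary;
};

/* Like createEmployee, but returns NULL if any string (including its
   terminating '\0') would not fit in the destination array. */
struct Employee* createEmployeeChecked(const char* last, const char* first,
                                       const char* title, int salary)
{
    struct Employee* p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;

    if (strlen(last) >= sizeof p->last ||
        strlen(first) >= sizeof p->first ||
        strlen(title) >= sizeof p->title) {
        free(p);
        return NULL;
    }

    strcpy(p->last, last);      /* safe: lengths were checked above */
    strcpy(p->first, first);
    strcpy(p->title, title);
    p->salary = salary;
    return p;
}
```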
malloc assigns you a block of memory at least as large as its argument, in this case the size of the Employee.
Every time you call createEmployee, you call malloc a separate time, and every time you call malloc, it gives you a fresh piece of memory.
This is what allows you to have different employees: if they all used the same memory, you would only be able to create one.
This is why calling free to release that memory is important: the allocator has no other way of knowing whether you're still using the memory or not.
If you want to edit an existing employee, maintain a pointer reference to it, and add a strcpy(p->title, newTitle); to change its title to newTitle.
Also, as has been mentioned, strcpy is dangerous, as it will continue to write its string regardless of whether it has exceeded the space allotted for the destination (for example, the 11 characters of first).
Every time you call malloc(), you're telling it to give you a new chunk of memory, at least as long as you've asked for, not currently in use anywhere else. So the following gives you three different pointers:
void *p1 = malloc(100);
void *p2 = malloc(100);
void *p3 = malloc(100);
It's like hitting a button on a vending machine. Each time, you get a different candy bar that conforms to your requests ("Caramilk" for instance.)
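The claim can be checked directly. In this sketch, blocks_are_distinct is a hypothetical helper that compares the three addresses while all three blocks are still live; once a block is freed, a later malloc is allowed to hand the same address out again.

```c
#include <stdlib.h>

/* Returns 1 if three live 100-byte allocations have pairwise distinct
   addresses, 0 if not, and -1 if any allocation failed. */
int blocks_are_distinct(void)
{
    void *p1 = malloc(100);
    void *p2 = malloc(100);
    void *p3 = malloc(100);
    int result = -1;

    if (p1 != NULL && p2 != NULL && p3 != NULL)
        result = (p1 != p2 && p2 != p3 && p1 != p3);

    free(p1);   /* free(NULL) is harmless if an allocation failed */
    free(p2);
    free(p3);
    return result;
}
```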