C queue implementation using void* - good or bad practice?

I have implemented a basic queue structure in C using void pointers. The procedure is as follows:
initializing the structure - I set the size of the variable type to be stored in the queue
push - I pass the pointer to the variable to be stored, the queue then grabs a copy for itself
front - the structure returns a void* to the element in front. I may just grab the pointer, or memcpy() it to have a local copy.
The struct itself looks like this:
struct queue
{
void* start; //pointer to the beginning of queue
void* end; //-||- to the end
size_t memsize; //size of allocated memory, in bytes
size_t varsize; //size of a single variable, in bytes
void* initial_pointer; //position of the start pointer before pop() operations
};
start and end are just void pointers that point to some location within the currently allocated memory block. If I push elements on the queue, I increment the end pointer by varsize. If I pop(), I just decrement the end pointer also by varsize.
I don't think I should post the functions code here, it's over 100 lines.
The question: is this considered a good or a bad practice? Why (not)?
Note: I'm aware that there are many other options for a queue in C. I'm just asking about the quality of this one.
EDIT: The implementation is available here:
http:// 89.70.149.19 /stuff/queue.txt (remove the spaces)

It's OK to use void * if you don't know the type and size of the objects to be stored in the queue (in fact, the C standard library follows the same approach, see the memcpy() and qsort() functions for some examples). However, it would be better to use size_t (or ssize_t if you need a signed data type) for specifying the size of the elements stored in the queue.
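As a rough illustration of the approach described in the question (element size fixed at initialization, push copies the element, the front element is copied out), here is a minimal sketch. All names (`byte_queue`, `bq_init`, `bq_push`, `bq_pop`, `bq_free`) are hypothetical and not taken from the asker's code, and the fixed-capacity ring buffer is an assumption made to keep the example short. The storage is an `unsigned char *` rather than a `void *` because ISO C does not define arithmetic on `void *` (GCC accepts it only as an extension).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic FIFO: elements are stored by value in one byte buffer. */
struct byte_queue {
    unsigned char *buf;  /* storage for capacity * varsize bytes */
    size_t varsize;      /* size of one element, in bytes */
    size_t capacity;     /* maximum number of elements */
    size_t head;         /* index of the front element */
    size_t count;        /* number of stored elements */
};

/* Returns 1 on success, 0 on bad arguments or allocation failure. */
int bq_init(struct byte_queue *q, size_t varsize, size_t capacity)
{
    assert(q != NULL);
    if (varsize == 0 || capacity == 0 || capacity > SIZE_MAX / varsize)
        return 0;
    q->buf = malloc(varsize * capacity);
    if (q->buf == NULL)
        return 0;
    q->varsize = varsize;
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
    return 1;
}

/* Copies *item into the queue; returns 0 when the queue is full. */
int bq_push(struct byte_queue *q, const void *item)
{
    if (q->count == q->capacity)
        return 0;
    size_t tail = (q->head + q->count) % q->capacity;
    memcpy(q->buf + tail * q->varsize, item, q->varsize);
    q->count++;
    return 1;
}

/* Copies the front element into *out and removes it; returns 0 when empty. */
int bq_pop(struct byte_queue *q, void *out)
{
    if (q->count == 0)
        return 0;
    memcpy(out, q->buf + q->head * q->varsize, q->varsize);
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return 1;
}

/* Releases the storage; the struct itself belongs to the caller. */
void bq_free(struct byte_queue *q)
{
    free(q->buf);
    q->buf = NULL;
    q->count = 0;
}
```

A caller would use it as `bq_init(&q, sizeof(int), 16)`, then `bq_push(&q, &value)` and `bq_pop(&q, &value)`. The type safety is entirely the caller's responsibility, which is the usual trade-off of a `void *` interface.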

You really don't show us enough to be sure about your implementation. void* for the user data items is fine, you can't do much otherwise in C.
But I strongly suspect that you have an internal list element type that you use to manage the individual items, something like
struct list_item {
struct list_item* next;
void* data;
};
If that is the case and your start and end pointers are pointing to such elements, you definitely should use your element type in the struct queue declaration:
struct queue
{
struct list_item* start; //pointer to the beginning of queue
struct list_item* end; //-||- to the end
size_t memsize; //size of allocated memory, in bytes
size_t varsize; //size of a single variable, in bytes
struct list_item* initial_pointer; //position of the start pointer before pop() operations
};
For this to work you don't even have to expose the definition of struct list_item to the user of struct queue.

Related

struct inside struct : to point or not to point?

I'd like to understand the difference between using a pointer and a value when it comes to referencing a struct inside another struct.
By that I mean, I can have those two declarations:
struct foo {
int bar;
};
struct fred {
struct foo barney;
struct foo *wilma;
};
It appears I can get the same behavior from both barney and wilma entries, as long as I de-reference accordingly when I access them. The barney case intuitively feels “wrong” but I cannot say why.
Am I just relying on some C undefined behavior? If not, what would be the reason(s) to opt for one style over the other?
The following code shows how I come to the conclusion both use cases are equivalent; neither clang nor gcc complain about anything.
#include <stdio.h>
#include <stdlib.h>
struct a_number {
int i;
};
struct s_w_ptr {
struct a_number *n;
};
struct s_w_val {
struct a_number n;
};
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
int main(void) {
struct s_w_ptr *swp = malloc(sizeof(swp));
struct s_w_val *swv = malloc(sizeof(swv));
store_via_ptr(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
store_via_val(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
}
It's perfectly valid to have both struct members in a struct and have pointers to struct in a struct. They must be used differently but both are legal.
Why have a struct in a struct ?
One reason is to group things together. For instance:
struct car
{
struct motor motor; // a struct with several members describing the motor
struct wheel wheel; // a struct with several members describing the wheels
...
}
struct car myCar = {....initializer...};
myCar.wheel = SomeOtherWheelModel; // Replace wheels in a single assign
myCar.wheel.pressure = 2.1; // Change a single wheel member
Why have a struct pointer in a struct?
One very obvious reason is that it can be used as an array of N structs by using dynamic allocation of N times the struct size.
Another typical example is linked lists where you have a pointer to a struct of the same type as the struct containing the pointer.
There are several advantages of having a struct in a struct instead of having a pointer to struct in a struct:
It requires less memory allocation. In the case where you have a pointer to a struct in a struct, memory is needed both for the pointer stored within the parent struct and, separately, for the child struct itself.
Additional instructions are typically required to access the contents of the child struct. For example consider that the program is reading the contents of the child struct. If a struct within a struct is used, the program will apply an offset to the address of the variable and read the contents of that memory location. In the case of a pointer to a struct in a struct, the program will actually apply an offset to the parent struct variable address, fetch the address of the child struct, then read from memory the contents of the child struct.
A separate variable needs to be declared for both the parent and child struct and if an initializer is used, then a separate initializer is needed. In the case of a struct in a struct only one variable must be declared and a single initializer is used.
In cases where dynamic memory allocation is used, the developer must remember to deallocate memory for both the child and parent objects before the variables fall out of scope. In the case of struct in a struct the memory must be freed for only one variable.
Lastly, as is shown in the example, if a pointer is used, Null checking may be necessary to ensure that the pointer to the child struct has been initialized.
The primary advantage of having a pointer to a struct in a struct arises when you need to replace the child struct with another struct within the program, as in a linked list. A less common case might be if the child struct can be of more than one type. In this case you might use a void * type for the child. I may also use a pointer within a struct to point to an array in cases where the array pointed to may vary in size between instances.
Based on my knowledge of the case shown in the example above, I would be inclined to use a struct in a struct, since both objects are of fixed size and type and since it appears that they would not need to be separated.
C structures can be used to group related data, such as the title of a book, its author, its assigned book number, and so on. But much of what we use structures for is creating data structures (in a different sense of the word “structure”) in memory.
Consider that the book’s author has a name, a date of birth, other biographical information, a list of books they have written, and more. We could include in the struct book a struct author that would contain all this information. But, if the author has written a hundred books, we could have 100 copies of all that information, one copy in each struct book. Further, we cannot continue the “contain the data inside the structure directly” model with the struct author, because it cannot contain a struct book for each book the author publishes if those struct book members also have to contain the struct author for the author—every object would have to contain itself.
It is more efficient to create one struct author and have each struct book for that author link to it.
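The book and author arrangement can be sketched as follows. The field names and the helper `count_books_by` are assumptions made for illustration; the point is only that each `struct book` holds a pointer to one shared `struct author` instead of its own copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; the fields are assumptions for illustration only. */
struct author {
    const char *name;
    int birth_year;
};

struct book {
    const char *title;
    const struct author *author;  /* link to shared author data, not a copy */
};

/* Counts how many books in the array refer to the given author object. */
size_t count_books_by(const struct book *books, size_t n,
                      const struct author *who)
{
    assert(books != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (books[i].author == who)
            count++;
    return count;
}
```

Because every book points at the same author object, correcting the author's data in one place is immediately visible through all of that author's books.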
Another example is that we use pointers to create data structures for efficient access to data. If we are reading data for thousands of items and want to keep them sorted by name, one option is to allocate memory for some number of structures, read the data, and sort the data. When new data is read and we have used all the memory we allocated, we allocate new memory, copy all the old data to the new memory if necessary, and move some of the data so we can insert the new data in its proper place. However, we have many better options than that. We can use linked lists, binary trees, other kinds of trees, and hash tables.
These data structures effectively require using pointers. A binary tree will have a root node, and each node contains two pointers, one to a subtree of nodes that are earlier than it in the sorting order and another to a subtree of nodes that are later than it. We can look up items in the tree by following pointers to earlier or later nodes to find the right position. And we can insert items by changing a few pointers. If the tree happens to become unbalanced, we can rearrange nodes in the tree by changing pointers. The bulk of the data in the nodes does not have to be changed or copied, just some pointers.
We can also use pointers to have multiple structures for the same data. All the data about books could be stored in one place, and a tree ordered by name could contain nodes in which each node contained a pointer to the book structure and two pointers to subtrees. We could have one tree like this ordered by title of the book and another tree ordered by the name of the author and another tree ordered by the assigned book number. Then we can efficiently look up a book by title or author or number, but there is only one master copy of the complete book data, in the struct book objects. The look-up data is in the tree, which contains only pointers. That is much more efficient than copying all of the struct book data for each tree.
So the reason we choose between structures and pointers as members is not whether the C syntax allows us to refer to the data; we can get to the data in both cases. The reason is that one method embeds the data, which is inflexible and requires copying, while the other is flexible and efficient.
Let's consider at first this function
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
This declaration
struct a_number *i = malloc(sizeof(i));
is equivalent to the following declaration
struct a_number *i = malloc(sizeof( struct a_number * ));
So in general the function can invoke undefined behavior when sizeof( struct a_number ) is greater than sizeof( struct a_number * ).
It seems you mean
struct a_number *i = malloc(sizeof( *i ) );
^^^
If you split the function into two functions, one for each of its parameters, like
void store_via_ptr1( struct s_w_ptr *swp ) {
struct a_number *i = malloc(sizeof( *i ) );
i->i = 1;
swp->n = i;
}
and
void store_via_ptr2( struct s_w_val *swv ) {
struct a_number *i = malloc(sizeof( *i));
i->i = 1;
swv->n = *i;
}
then after calling the first function, the owner of the object pointed to by swp must remember to free the memory allocated within the function. Otherwise there will be a memory leak.
The second function already produces a memory leak because the allocated memory was not freed.
Now let's consider the second function
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
Here the pointer swp->n will point to a local object j. So after exiting the function this pointer will be invalid because the pointed object will not be alive.
So both functions are incorrect. Instead you could write the following functions
int store_via_ptr(struct s_w_ptr *swp ) {
swp->n = malloc( sizeof( *swp->n ) );
int success = swp->n != NULL;
if ( success ) swp->n->i = 1;
return success;
}
and
void store_via_val( struct s_w_val *swv ) {
swv->n.i = 2;
}
When to include a whole object of a structure type in another object of a structure type or to use a pointer to an object of a structure type within other object of a structure type depends on the design and context where such objects are used.
For example consider a structure struct Point
struct Point
{
int x;
int y;
};
In this case if you want to declare a structure struct Rectangle then it is natural to define it like
struct Rectangle
{
struct Point top_left;
struct Point bottom_right;
};
On the other hand, if you have a two-sided singly-linked list then it can look like
struct Node
{
int value;
struct Node *next;
};
struct List
{
struct Node *head;
struct Node *tail;
};
Two problems:
In store_via_ptr you allocate memory for i dynamically. When you use s_w_val you copy the structure, and then leave the pointer. Which means the pointer will be lost and can't be passed to free later.
In store_via_val you make swp->n point to the local variable j. A variable whose life-time will end when the function returns, leaving you with an invalid pointer.
The first problem might lead to a memory leak (something you may not care about in your simple example program).
The second problem is worse, since it will lead to undefined behavior when you dereference the pointer swp->n.
Unrelated to that, in the main function you don't need to allocate memory dynamically for the structures. You could just have defined them as plain structure objects and used the pointer-to operator & when calling the functions.
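Putting these points together, a corrected version could look roughly like the sketch below. The extra `value` parameter and the `release_ptr` helper are assumptions added for illustration and do not appear in the question.

```c
#include <assert.h>
#include <stdlib.h>

struct a_number { int i; };
struct s_w_ptr { struct a_number *n; };  /* refers to a heap-allocated number */
struct s_w_val { struct a_number n; };   /* contains the number directly */

/* Allocates the pointed-to number; returns 1 on success, 0 on failure. */
int store_via_ptr(struct s_w_ptr *swp, int value)
{
    assert(swp != NULL);
    swp->n = malloc(sizeof *swp->n);  /* size of the struct, not of a pointer */
    if (swp->n == NULL)
        return 0;
    swp->n->i = value;
    return 1;
}

/* No allocation needed: the member lives inside *swv. */
void store_via_val(struct s_w_val *swv, int value)
{
    assert(swv != NULL);
    swv->n.i = value;
}

/* Hypothetical helper that releases what store_via_ptr allocated. */
void release_ptr(struct s_w_ptr *swp)
{
    free(swp->n);
    swp->n = NULL;
}
```

In `main`, both containers can then be plain objects, for example `struct s_w_ptr swp = { NULL }; struct s_w_val swv;`, passed as `&swp` and `&swv`, with a call to `release_ptr(&swp)` before the program ends.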

Explicit free list (dynamic memory allocation)

I need some help with my assignment. My job is to create/implement malloc/free functions in C using “Explicit free list among only the free blocks” technique. I have already studied a lot of materials, but I am still stuck at some point and I do not understand some details. So my job is to create 4 functions – initialize(), allocate() ,free() and check(). I can use only one global variable void *memory – this is the block in which I can allocate my memory using alloc().
So I wanted to implement this using doubly linked-list and I created a structure:
typedef struct memoryBlock{
struct memoryBlock *prev,*next;
}memoryBlock;
And the structure for the header:
typedef struct header{
int size;
}header;
I was advised in my class to create a separate structure for a free memory block and another separate structure for an allocated block. My first idea was to distinguish the free/allocated blocks using one bit of the block size in header – set it to 1 if the block is allocated and 0 if it is free. ( I saw this technique used in implicit lists). So my question is: do I need to create a freeBlock and allocatedBlock structure for an explicit list or can I just use the one bit of the size?
The second question is: do I need a separate structure for the header/footer of the block? Or can I just write the size of the block in the header/footer as *(int *)ptr = size; ? I tried to use this in the initialize() function:
void initialize(void *ptr, int size){
memory = ptr;
*(int *)memory = size; //header
*((int *)memory + size) = size; //footer
}
Is this correct, please?
Many thanks for any help.
do I need to create a freeBlock and allocatedBlock structure for an explicit list[…]?
This sounds to me like a very good idea. Just don't misunderstand the advice: You don't need two different structure definitions but two different lists. You can realize this with two pointers, one for free blocks and one for allocated blocks.
[…] can I just use the one bit of the size?
You can only use one bit of size if it is unused otherwise. If you choose bit 0 it will only work when only even numbers of memory words are allocated.
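A minimal sketch of the flag-bit idea, assuming block sizes are always rounded up to an even number so that bit 0 is free to carry the allocated flag. The helper names are hypothetical, and `size_t` is used here instead of the `int` from the question.

```c
#include <assert.h>
#include <stddef.h>

#define ALLOCATED_BIT ((size_t)1)

/* Rounds a requested size up to the next even value so bit 0 is unused. */
size_t round_up_even(size_t size)
{
    return (size + 1) & ~ALLOCATED_BIT;
}

/* Packs an even block size and an allocated flag into one header word. */
size_t pack_header(size_t size, int allocated)
{
    assert((size & ALLOCATED_BIT) == 0);  /* the size must be even */
    return size | (allocated ? ALLOCATED_BIT : 0);
}

/* Extracts the block size, masking the flag bit away. */
size_t header_size(size_t header)
{
    return header & ~ALLOCATED_BIT;
}

/* Returns nonzero if the block is marked as allocated. */
int header_is_allocated(size_t header)
{
    return (header & ALLOCATED_BIT) != 0;
}
```

Real allocators usually round to a larger alignment (8 or 16 bytes), which frees several low bits, but the masking works the same way.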
do I need a separate structure for the header/footer of the block?
That depends on your design and algorithm. Are you bound to create a header and a footer?
Or can I just write the size of the block in the header/footer as *(int *)ptr = size;?
You can do this. But I would assign the given pointer to a (temporary) pointer to the right struct and then assign values right in their places.
void initialize(void *ptr, int size) {
memory = ptr;
header* h = memory;
h->size = size;
}
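If a footer is wanted as well, note that `(int *)memory + size` in the question advances by `size * sizeof(int)` bytes, which lands beyond the end of a block of `size` bytes. A minimal sketch that places the footer in the last `sizeof(int)` bytes instead, using byte arithmetic; the function names are hypothetical, and `memcpy` is used so that no alignment assumption is made about the footer position:

```c
#include <assert.h>
#include <string.h>

/* Writes the block size into the first and last sizeof(int) bytes of a block
   that is `size` bytes long. Returns 0 if the block is too small for both. */
int write_boundary_tags(void *block, int size)
{
    assert(block != NULL);
    if (size < (int)(2 * sizeof(int)))
        return 0;
    unsigned char *bytes = block;
    memcpy(bytes, &size, sizeof size);                       /* header */
    memcpy(bytes + size - sizeof size, &size, sizeof size);  /* footer */
    return 1;
}

/* Reads the footer of a block that starts at `block` and is `size` bytes long. */
int read_footer(const void *block, int size)
{
    int value;
    memcpy(&value, (const unsigned char *)block + size - sizeof value,
           sizeof value);
    return value;
}
```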
Additional observation: Instead of
typedef struct memoryBlock{
struct memoryBlock *prev,*next;
}memoryBlock;
better get used to this, it will save you a lot of pulled-out hair:
typedef struct memoryBlock {
struct memoryBlock *prev;
struct memoryBlock *next;
} memoryBlock;
Note: Due to the (seeming) complexity of pointers, please raise the warning level of your compiler to the maximum. Read all warnings and eliminate their causes.

Naming a variable with another variable in C

I want to create a struct with 2 variables, such as
struct myStruct {
char charVar;
int intVar;
};
and I will name the structs as:
struct myStruct name1;
struct myStruct name2;
etc.
The problem is, I don't know how many variables will be entered, so there must be infinite nameX structures.
So, how can I name these structures with variables?
Thanks.
You should use an array and a pointer.
struct myStruct *p = NULL;
p = malloc(N * sizeof *p); // where N is the number of entries.
int index = 1; /* or any other number - from 0 to N-1*/
p[index].member = x;
Then you can add elements to it by using realloc if you need to add additional entries.
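Growing the array with realloc could look like the following sketch. The wrapper type `myArray`, the function `myArray_append`, and the doubling strategy are assumptions for illustration. The result of realloc is stored in a temporary first, so the original block is not lost if the call fails.

```c
#include <assert.h>
#include <stdlib.h>

struct myStruct {
    char charVar;
    int intVar;
};

/* Hypothetical growable array of struct myStruct. */
struct myArray {
    struct myStruct *items;
    size_t count;     /* entries in use */
    size_t capacity;  /* entries allocated */
};

/* Appends one entry, doubling the storage when it is full.
   Returns 1 on success, 0 if realloc fails (the array is left unchanged). */
int myArray_append(struct myArray *a, char c, int i)
{
    assert(a != NULL);
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 4;
        struct myStruct *p = realloc(a->items, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        a->items = p;
        a->capacity = new_cap;
    }
    a->items[a->count].charVar = c;
    a->items[a->count].intVar = i;
    a->count++;
    return 1;
}
```

Start from `struct myArray a = { NULL, 0, 0 };` (realloc with a null pointer behaves like malloc) and call `free(a.items)` when done.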
Redefine myStruct as
struct myStruct {
char charVar;
int intVar;
struct myStruct *next;
};
Keep track of the last structure you have as well as the start of the list. When adding new elements, append them to the end of your linked list.
/* To initialize the list */
struct myStruct *start, *end;
start = malloc(sizeof(struct myStruct));
start->next = NULL;
end = start;
/* To add a new structure at the end */
end->next = malloc(sizeof(struct myStruct));
end = end->next;
end->next = NULL;
This example does not do any error checking. Here is how you would step along the list to print all the values in it:
struct myStruct *ptr;
for(ptr = start; ptr != NULL; ptr = ptr->next)
printf("%d %c\n", ptr->intVar, ptr->charVar);
You do not have to have a distinct name for each structure in a linked list (or any other kind of list, in general). You can assign any of the unnamed structures to the pointer ptr as you use them.
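For comparison, here is a sketch of the same list with error checking added. Unlike the snippet above, it starts from an empty list instead of a pre-allocated first node. The handle type `myList` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct myStruct {
    char charVar;
    int intVar;
    struct myStruct *next;
};

/* Hypothetical list handle that tracks both ends of the list. */
struct myList {
    struct myStruct *start;
    struct myStruct *end;
};

/* Appends a node at the end; returns 1 on success, 0 if malloc fails. */
int myList_append(struct myList *list, char c, int i)
{
    assert(list != NULL);
    struct myStruct *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->charVar = c;
    node->intVar = i;
    node->next = NULL;
    if (list->end != NULL)
        list->end->next = node;  /* link after the current last node */
    else
        list->start = node;      /* first node of an empty list */
    list->end = node;
    return 1;
}

/* Frees every node and resets the list to empty. */
void myList_clear(struct myList *list)
{
    struct myStruct *p = list->start;
    while (p != NULL) {
        struct myStruct *next = p->next;  /* save before freeing p */
        free(p);
        p = next;
    }
    list->start = list->end = NULL;
}
```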
So, how can I name these structures with variables?
I think every beginner starts out wanting to name everything. It's not surprising -- you learn about using variables to store data, so it seems natural that you'd always use variables. The answer, however, is that you don't always use variables for storing data. Very often, you store data in structures or objects that are created dynamically. It may help to read about dynamic allocation. The idea is that when you have a new piece of data to store, you ask for a piece of memory (using a library call like malloc or calloc). You refer to that piece of memory by its address, i.e. a pointer.
There are a number of ways to keep track of all the pieces of memory that you've obtained, and each one constitutes a data structure. For example, you could keep a number of pieces of data in a contiguous block of memory -- that's an array. See Devolus's answer for an example. Or you could have lots of little pieces of memory, with each one containing the address (again, a pointer) of the next one; that's a linked list. Mad Physicist's answer is a fine example of a linked list.
Each data structure has its own advantages and disadvantages -- for example, arrays allow fast access but are slow for inserting and deleting, while linked lists are relatively slow for access but are fast for inserting and deleting. Choosing the right data structure for the job at hand is an important part of programming.
It usually takes a little while to get comfortable with pointers, but it's well worth the effort as they open up a lot of possibilities for storing and manipulating data in your program. Enjoy the ride.

Creating a struct at address pointed by pointer

I have an unsigned char* head that is pointing to a certain address in memory, and now I have to create an instance of a typedef struct that I've declared, starting at the location of that pointer... I am confused about how to do that!
Here is the declaration of the typedef
typedef struct {
struct block *next;
struct block *prev;
int size;
unsigned char *buffer;
} block;
My assignment involves implementing malloc, so I can't use malloc. A block is part of a free_list which contains all chunks of free memory blocks that I have in my program heap. Hence, the previous and next pointers that point to the previous and next free blocks of memory.
Head points to the start of the free_list. When I have to split, say, the first block of free memory to satisfy a malloc() request that needs less space than that free block has, I need to move my head and create a new block struct there.
Hope this makes sense. If not, the assignment looks something like this
Your struct has no tag, so you need to give it one in order for it to point to itself:
typedef struct block {
struct block *next;
struct block *prev;
int size;
unsigned char *buffer;
} block;
If you're using C99 you can initialise the memory at head directly, if necessary, without declaring a temporary struct block:
*(block *)head = (block){NULL, NULL, 0, NULL};
You now have a struct block at the address head, as long as you cast it properly.
e.g.
((block *)head)->size = 5;
Or you assign a cast pointer to it:
block *p = (block *)head;
p->size = 5;
unsigned char* head = /* whatever you have assuming that it has a sufficient size. */;
/* Create a block in memory */
block* b = (block*)malloc(sizeof(block));
/*
* modify data in b here as you wish.
*/
b->next = 0;
b->prev = 0;
/* etc... */
/* copy b to head */
memcpy(head, b, sizeof(block));
/* free block */
free(b);
The above assumes that head has enough space to store an instance of block.
What it does is create a block, and copy the memory to the position of head, then free the allocated block.
From comments:
head points to the start of a place in memory where I can overwrite data...You may assume that I have enough space!
Then to obtain a properly typed pointer:
struct block *p = (struct block *)head;
and to have a copy of the block:
struct block b = *(struct block *)head;
The operating system will provide an API call to allocate blocks of memory that your malloc can carve up and provide to callers. In Linux/unix look at sbrk. In Windows look at the Win32 heap API. Your records will point into this block. Making sure no two allocated sections of the block overlap is the job of your allocator code.
It looks like your records are implementing a free list. So how are you going to allocate list nodes when you don't have an allocator (yet)? The usual solution is to do it in the free blocks themselves. So a free block has the structure:
typedef struct free_block {
struct free_block *next, *prev;
size_t size;
unsigned char buffer[1];
} FREE_BLOCK;
Now this data structure actually lies at the start of a free block. Its buffer has only 1 byte in the declaration, but the actual buffer is size bytes. Initially you'd have something like:
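static FREE_BLOCK *free_list;
/* in an initialization function (a static object cannot be initialized by a function call): */
free_list = sbrk(ARENA_SIZE);
free_list->next = free_list->prev = free_list;
free_list->size = ARENA_SIZE - offsetof(FREE_BLOCK, buffer);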
This places the whole arena on the free list as a single block. Your allocator will search free_list to find a block that's big enough, carve out the piece it needs, put the remaining small block (if any) back on the free list. For freeing, it will add the freed block to the list and coalesce adjacent blocks.
Simple free list allocators differ in how they choose the free block to allocate from: first fit, rotating first fit, best fit, worst fit, etc. In practice rotating first fit seems to work as well as or better than any of the others.
Incidentally, all of the common algorithms implemented with free lists don't need double links. Single ones will do.
Since this is an academic assignment, it should be fine to just call malloc (instead of an operating system API) to establish the big block (often called the "arena") your allocator will manage. You could also declare a big array of bytes.
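A rough first-fit sketch along these lines, using malloc to obtain the arena and a singly linked free list kept inside the free blocks themselves. All function names are hypothetical. Coalescing of adjacent free blocks is deliberately left out, and payload alignment is only as good as `sizeof(FREE_BLOCK)`, which is assumed here to be sufficient.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored at the start of every block; free blocks are singly linked. */
typedef struct free_block {
    struct free_block *next;  /* next free block, meaningful only while free */
    size_t size;              /* usable bytes following this header */
} FREE_BLOCK;

static unsigned char *arena;   /* the big block obtained from malloc */
static FREE_BLOCK *free_list;  /* head of the list of free blocks */

/* Obtains the arena and puts all of it on the free list as one block. */
int arena_init(size_t size)
{
    if (size <= sizeof(FREE_BLOCK))
        return 0;
    arena = malloc(size);
    if (arena == NULL)
        return 0;
    free_list = (FREE_BLOCK *)arena;
    free_list->next = NULL;
    free_list->size = size - sizeof(FREE_BLOCK);
    return 1;
}

/* First fit: returns a pointer to at least n usable bytes, or NULL. */
void *arena_alloc(size_t n)
{
    const size_t unit = sizeof(FREE_BLOCK);
    if (n == 0 || n > SIZE_MAX - unit)
        return NULL;
    n = (n + unit - 1) / unit * unit;  /* keeps later headers aligned */

    for (FREE_BLOCK **link = &free_list; *link != NULL; link = &(*link)->next) {
        FREE_BLOCK *b = *link;
        if (b->size < n)
            continue;
        if (b->size >= n + 2 * unit) {
            /* Split: the tail of this block becomes a new free block. */
            FREE_BLOCK *rest = (FREE_BLOCK *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - unit;
            rest->next = b->next;
            b->size = n;
            *link = rest;
        } else {
            *link = b->next;  /* too small to split: hand out the whole block */
        }
        return b + 1;  /* the payload starts right after the header */
    }
    return NULL;
}

/* Returns a block to the front of the free list (no coalescing). */
void arena_free(void *p)
{
    if (p == NULL)
        return;
    FREE_BLOCK *b = (FREE_BLOCK *)p - 1;
    b->next = free_list;
    free_list = b;
}

/* Releases the arena itself. */
void arena_destroy(void)
{
    free(arena);
    arena = NULL;
    free_list = NULL;
}
```

The header is kept in front of allocated blocks too, so that `arena_free` can recover the block size from the pointer it is given.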

Why queues and stacks are declared as pointers?

I'm studying Data Structures, and I'm not getting why stacks and queues need to be declared like:
struct stack *Stack;
(forget about the struct syntax)
I mean, why it is always declared as a pointer?
They are not always declared like that!
In general, declaring a variable as a pointer is useful for later allocating it dynamically. This can be due to a couple of reasons:
The variable is too big for the program stack
You want to return that variable from a function
In your case, let's think of two different implementations of stack:
struct stack
{
void *stuff[10000];
int size;
};
This is a terrible implementation, but assuming you have one like this, then you'd most probably not want to put it on the program stack.
Alternatively, if you have:
struct stack
{
void **stuff;
int size;
int mem_size;
};
You dynamically change the size of stuff anyway, so there is absolutely no harm in declaring a variable of type struct stack on the program stack, i.e. like this:
struct stack stack;
Unless, you'd want to return it from a function. For example:
struct stack *make_stack(int initial_size)
{
struct stack *s;
s = malloc(sizeof(*s));
if (s == NULL)
goto exit_no_mem;
if (initial_size == 0)
initial_size = 1;
s->stuff = malloc(initial_size * sizeof(*s->stuff));
if (s->stuff == NULL)
goto exit_no_stuff_mem;
s->size = 0;
s->mem_size = initial_size;
return s;
exit_no_stuff_mem:
free(s);
exit_no_mem:
return NULL;
}
Personally, though, I would have declared the function like this:
int make_stack(struct stack *s, int initial_size);
and allocate the struct stack on the program stack.
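One possible body for that signature is sketched below. The return convention (0 on success, -1 on failure) and the `destroy_stack` helper are assumptions, since only the prototype is given above.

```c
#include <assert.h>
#include <stdlib.h>

struct stack {
    void **stuff;   /* dynamically sized array of element pointers */
    int size;       /* number of elements in use */
    int mem_size;   /* number of slots allocated */
};

/* Initializes a caller-provided stack; returns 0 on success, -1 on failure. */
int make_stack(struct stack *s, int initial_size)
{
    assert(s != NULL);
    if (initial_size <= 0)
        initial_size = 1;
    s->stuff = malloc((size_t)initial_size * sizeof *s->stuff);
    if (s->stuff == NULL)
        return -1;
    s->size = 0;
    s->mem_size = initial_size;
    return 0;
}

/* Releases the element array; the struct itself belongs to the caller. */
void destroy_stack(struct stack *s)
{
    free(s->stuff);
    s->stuff = NULL;
    s->size = s->mem_size = 0;
}
```

With this shape the caller writes `struct stack st; if (make_stack(&st, 16) != 0) { /* handle failure */ }`, so only the element array lives on the heap.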
It depends on how your stack structure is defined (not just the layout of the struct, but the operations that manipulate it as well).
It's entirely possible to define a stack as a simple array and index, such as
struct stack_ {
T data[N]; // for some type T and size N
size_t stackptr; // index of the current top element; N means the stack is empty
} stack;
stack.stackptr = N; // stack grows towards 0
// push operation
if (stack.stackptr)
stack.data[--stack.stackptr] = some_data();
else
// overflow
// pop operation
if (stack.stackptr < N)
x = stack.data[stack.stackptr++];
else
// underflow
However, fixed-sized arrays are limiting. One easy method of implementing a stack is to use a list structure:
struct stack_elem {
T data;
struct stack_elem *next;
};
The idea is that the head of the list is the top of the stack. Pushing an item onto the stack adds an element at the head of the list; popping an item removes that element from the head of the list:
int push(struct stack_elem **stack, T data)
{
struct stack_elem *s = malloc(sizeof *s);
if (s)
{
s->data = data; // new element gets data
s->next = *stack; // set new element to point to current stack head
*stack = s; // new element becomes new stack head
}
return s != NULL;
}
int pop(struct stack_elem **stack, T *data)
{
int stackempty = (*stack == NULL);
if (!stackempty)
{
struct stack_elem *s = *stack; // retrieve the current stack head
*stack = (*stack)->next; // set stack head to point to next element
*data = s->data; // get the data
free(s); // deallocate the element
}
return !stackempty;
}
int main(void)
{
struct stack_elem *mystack = NULL; // stack is initially empty
T value;
...
if (!push(&mystack, some_data()))
// handle overflow
...
if (!pop(&mystack, &value))
// handle underflow
...
}
Since push and pop need to be able to write new pointer values to mystack, we need to pass a pointer to it, hence the double indirection for stack in push and pop.
No, they don't have to be declared as pointers.
One can as well allocate stacks and queues as global variables:
struct myHash { int key; int next_idx; int data[4]; } mainTable[65536];
struct myHash duplicates[65536*10];
int stack[16384];
myHash also includes a linked list for duplicate entries using indices.
But as stated in the comments, if one has to add more elements to the structures than was initially planned, then pointers come in handy.
An additional reason to declare structures as pointers is that, with pointers, one can typically access the complete structure as a whole, any individual element of the structure, or some subset of the elements. That makes the syntax more versatile. Also, when the structure is passed as a parameter to some external function, a pointer is inevitable.
There's really no need to implement stacks and queues with pointers - others have already stated this fact clearly. See @JohnBode's answer for how a stack can be perfectly implemented using arrays. The thing is that modelling certain data structures (such as stacks, queues and linked lists) using pointers allows you to program them in a very efficient way in terms of both execution speed and memory consumption.
Usually an underlying array for holding a data structure is a very good implementation choice if your use cases require frequent random access to an element in the structure, given its positional index (this is FAST with an array). However, growing the structure past its initial capacity can be expensive AND you waste memory with the unused elements in the array. Insertion and deletion operations can also be very expensive, since you may need to rearrange elements to either compact the structure or make space for the new elements.
Since a queue and a stack don't have this random-access requirement and you don't insert or delete elements in the middle of them, it is a better implementation choice to dynamically allocate each individual element "on the fly", requesting memory when a new element is required (this is what malloc does), and freeing it as an element is deleted. This is fast and will consume no more memory than is actually needed by your data structure.
As already pointed out, it depends on how big the struct is.
Another reason is encapsulation. The stack implementation might not expose the definition of struct stack in its header file. This hides the implementation detail from the user, with the downside that free store allocation is required.
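The encapsulation point can be sketched like this. The split into a header part and an implementation part is shown in a single listing for brevity, and every name here (`stack_create`, `stack_push`, `stack_pop`, `stack_destroy`) is hypothetical. User code sees only the incomplete type, so it can hold a `struct stack *` but cannot declare a `struct stack` or touch its members.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a hypothetical stack.h would expose --- */
struct stack;                                  /* incomplete type: layout hidden */
struct stack *stack_create(void);              /* NULL on failure */
int  stack_push(struct stack *s, int value);   /* 1 on success, 0 on failure */
int  stack_pop(struct stack *s, int *value);   /* 1 on success, 0 if empty */
void stack_destroy(struct stack *s);

/* --- What a hypothetical stack.c would contain --- */
struct stack {
    int *data;
    size_t size;
    size_t capacity;
};

struct stack *stack_create(void)
{
    struct stack *s = malloc(sizeof *s);
    if (s != NULL) {
        s->data = NULL;
        s->size = 0;
        s->capacity = 0;
    }
    return s;
}

int stack_push(struct stack *s, int value)
{
    assert(s != NULL);
    if (s->size == s->capacity) {
        size_t new_cap = s->capacity ? s->capacity * 2 : 8;
        int *p = realloc(s->data, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        s->data = p;
        s->capacity = new_cap;
    }
    s->data[s->size++] = value;
    return 1;
}

int stack_pop(struct stack *s, int *value)
{
    assert(s != NULL);
    if (s->size == 0)
        return 0;
    *value = s->data[--s->size];
    return 1;
}

void stack_destroy(struct stack *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Because callers cannot know `sizeof(struct stack)`, the object has to come from `stack_create`, which is the free store allocation cost mentioned above.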

Resources