Simple redundance in a linked list - c

So I have the following structure:
typedef struct listElement
{
element value;
struct listElement *next;
} listElement, *List;
element is not a known type, meaning I don't know exactly what data type I'm dealing with, whether the values are integers, floats, or strings.
The goal is to write a function that deletes the listElements that are redundant more than twice (meaning a value may appear zero times, once, or twice, but not more).
I've already written a brute-force function with a nested loop (going through every element and comparing it to the rest of the elements in the list), but that's a mess as I'm dealing with a large number of elements in my list.
I was wondering if there is a better solution that uses fewer instructions and has a lower complexity.

You can use a hash table that maps each element to its count.
While traversing the list, if hashTable[element] (the count for this particular element) is already 2, delete the current element; otherwise increment the count and move on.
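The counting idea above can be sketched in C as follows. This is only an illustration under stated assumptions: element is taken to be int (the question does not say what it is), the table is a fixed-size open-addressing table that must be larger than the number of distinct values, and findSlot, pushFront, and removeExtras are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef int element;                 /* assumption: the real type is unknown */

typedef struct listElement {
    element value;
    struct listElement *next;
} listElement, *List;

#define TABLE_SIZE 1024              /* assumption: power of two, > distinct values */

typedef struct { element key; int count; } Slot;   /* count == 0 means empty */

/* Returns the slot holding key, or the empty slot where it belongs
   (linear probing). Returns NULL only if the table is full. */
static Slot *findSlot(Slot *table, element key)
{
    size_t i = ((size_t)key * 2654435761u) & (TABLE_SIZE - 1);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        if (table[i].count == 0 || table[i].key == key)
            return &table[i];
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}

/* Helper for building a list: adds v in front of head. */
List pushFront(List head, element v)
{
    listElement *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = v;
    n->next = head;
    return n;
}

/* Keeps the first two occurrences of every value and frees the others.
   Returns the number of removed nodes. One pass over the list. */
size_t removeExtras(List *head)
{
    assert(head != NULL);
    Slot *table = calloc(TABLE_SIZE, sizeof *table);
    if (table == NULL)
        return 0;
    size_t removed = 0;
    listElement **link = head;       /* pointer to the link that leads to cur */
    while (*link != NULL) {
        listElement *cur = *link;
        Slot *s = findSlot(table, cur->value);
        if (s != NULL && s->count == 2) {
            *link = cur->next;       /* third or later occurrence: unlink it */
            free(cur);
            removed++;
        } else {
            if (s != NULL) {         /* a full table just keeps the element */
                s->key = cur->value;
                s->count++;
            }
            link = &cur->next;
        }
    }
    free(table);
    return removed;
}
```

For the list 3, 1, 3, 3, 2, 1, 3, 1 this leaves 3, 1, 3, 2, 1. With an unknown element type, the hash and the equality test would have to be supplied by the caller instead of being hard-coded for int.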

Related

Best algo to retrieve elements from random id

I'm currently trying to find the best data structure / algorithm that fits to my case:
I receive single unique random ids (uint32_t), and I create an element associated to each new id.
I need to retrieve elements from the id.
I also need to access the next and the previous element from any element (or even id) in the order of creation. The order of creation mainly depends on the current element, which is always accessible aside, so the new element should be its next.
Here is an example:
(12) <-> (5) <-> (8) <-> (1)
 ^                        ^
 '------------------------'
If I suppose the current element to be (8) and a new element (3) is created, it should look like:
(12) <-> (5) <-> (8) <-> (3) <-> (1)
 ^                                ^
 '--------------------------------'
An important thing to consider is that insertion, deletion and search happen with almost the same (high) frequency. Not completely sure about how many elements will live at the same time, but I would say max ~1000.
Knowing all of this, I think about using an AVL with ids as the sorted keys, keeping the previous and the next element too.
In C language, something like this:
struct element {
uint32_t id;
/* some other fields */
struct element *prev;
struct element *next;
};
struct node {
struct element *elt;
struct node *left;
struct node *right;
};
static struct element* current;
Another idea may be to use a hash map, but then I would need to find the right hash function. Not completely sure it always beats the AVL in practice for this amount of elements though. It depends on the hash function anyway.
Is the AVL a good idea or should I consider something else for this case?
Thanks !
PS: I'm not a student trying to make you do my homework, I'm just trying to develop a simple window manager (just for fun).
You are looking for some variation of what is called a LinkedHashMap in Java.
This is basically a combination of a hash table and a (bi-directional) linked list.
The linked list holds all the elements in the desired order. Inserting an element at a known location (assuming you have a pointer to that location) is done in O(1). The same goes for deletion.
The second data structure is the hash map (or tree map). It maps from a key (your unique id) to a POINTER into the linked list. This way, given an id, you can quickly find its location in the linked list, and from there you can easily access the next and previous elements.
High-level pseudocode for insertion:
insert(x, v, y): //insert key=x value=v, after element with key=y
    if x is in hash-table:
        abort
    p = find(hash-table,y) //p is a pointer to the node with key=y
    q = insert_to_list_after(x,v,p) //insert key=x,value=v right after p; q points to the new node
    add(hash-table,x,q) //add x to the hash-table, and make it point to the new node q.
High-level pseudocode for search:
search(x):
    if x is not in hash-table:
        abort
    p = find(hash-table,x)
    return p->value;
Deletion should be very similar to insertion (and have the same time complexity).
Note that it is also fairly easy to find the element that comes after x:
p = find(hash-table,x)
if (p != NULL && p->next != NULL):
    return p->next->value
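The pseudocode above could look roughly like this in C. It is a minimal sketch, not a definitive implementation: the names lhm, lhmFind, lhmInsertAfter, and lhmRemove are made up for illustration, the bucket count is an assumption sized for about 1000 ids, the list is kept non-circular for simplicity, and because the ids are random they are used as hash values directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 256                 /* assumption: enough for ~1000 random ids */

struct element {
    uint32_t id;
    struct element *prev, *next;     /* neighbours in creation order */
    struct element *chain;           /* next element in the same hash bucket */
};

struct lhm {
    struct element *buckets[NBUCKETS];
};

/* Search by id: expected O(1). Returns NULL if the id is unknown. */
struct element *lhmFind(const struct lhm *m, uint32_t id)
{
    for (struct element *e = m->buckets[id % NBUCKETS]; e != NULL; e = e->chain)
        if (e->id == id)
            return e;
    return NULL;
}

/* Creates an element for id right after `after` (NULL for the first
   element). Returns NULL if the id already exists or malloc fails. */
struct element *lhmInsertAfter(struct lhm *m, uint32_t id, struct element *after)
{
    assert(m != NULL);
    if (lhmFind(m, id) != NULL)
        return NULL;
    struct element *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->id = id;
    e->prev = after;
    e->next = after ? after->next : NULL;
    if (e->next)
        e->next->prev = e;
    if (after)
        after->next = e;
    e->chain = m->buckets[id % NBUCKETS];   /* the map points at the NEW node */
    m->buckets[id % NBUCKETS] = e;
    return e;
}

/* Unlinks e from its bucket and from the ordered list, then frees it. */
void lhmRemove(struct lhm *m, struct element *e)
{
    struct element **link = &m->buckets[e->id % NBUCKETS];
    while (*link != e) {
        assert(*link != NULL);       /* e must belong to this map */
        link = &(*link)->chain;
    }
    *link = e->chain;
    if (e->prev)
        e->prev->next = e->next;
    if (e->next)
        e->next->prev = e->prev;
    free(e);
}
```

With the question's example, inserting 3 after the element with id 8 makes lhmFind(&m, 8)->next point to the element with id 3, whose next is the element with id 1.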
My suggestion is that you use a combination of two data structures - a list to store the elements in the order they are inserted and a hash map or binary search tree to implement an associative array(map) between the id and list node. You will perform the search using the associative array and will be able to access neighboring elements using the list. Deletion is also relatively easy, but you need to delete from both structures.
Complexity of find/insert/delete will be log(n) if you use binary search tree and expected complexity is constant if you use a hash table.
You should definitely consider the Skip List data structure.
It seems perfect for your case, because it has an expected O(log(n)) insert / search / delete and if you have a pointer to a node, you can find the previous and the next element in O(1) by just moving that pointer.
The conclusion is that if you've just created a node, you have a pointer to it, and you can find the prev/next element in O(1) time.

C Two Dimensional Array into Linked List

So I'm still trying to wrap my head around linked lists in C. They are.. mind-boggling to me right now because I have yet to fully understand pointers, let alone pointers to pointers, and dynamic memory allocation that linked lists require.
I'm trying to create a two dimensional array with independent height and width values. At most they would be 30x30. I have a two dimensional array, let's call it arr[x][y]. arr[x][y] is filled with integer values ranging from -2 to 1. How would I transfer this two dimensional array into a linked list? How would I then access values from this linked list on a whim? I'm very confused, and any help would be appreciated. I'm looking through tutorials as we speak.
Additionally this is supposed to be a sort of stack linked list where I could call functions such as push(pushes a new value to the top of the linked list), pop(pops a value from the top of the linked list), top(returns the value most recently pushed onto the stack), isEmpty(checks if the stack is empty).
I don't need any full code, but code would be helpful here. I just need an understanding though of Linked Lists, and how to implement these sort of functions.
Additionally here is the assignment that this is related to: Assignment
It's a maze solver. I've already written the code that converts an ASCII picture into integer values for the two dimensional array. As stated above, the linked list is what I need help with.
Hint : from your assignment, the stack is not supposed to fully represent the array, but to represent a path you dynamically build to find a way from the starting position of the maze to the target position of the maze.
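To make the hint concrete, here is a minimal sketch of a linked-list stack of maze positions offering the four operations named in the question. The type names Stack and StackNode and the choice to store x/y coordinates are assumptions for illustration; the assignment may prescribe something different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct StackNode {
    int x, y;                        /* one position on the current path */
    struct StackNode *below;         /* the node pushed before this one */
} StackNode;

typedef struct {
    StackNode *top;                  /* NULL when the stack is empty */
} Stack;

bool isEmpty(const Stack *s)
{
    return s->top == NULL;
}

/* Pushes a position on top. Returns false if allocation fails. */
bool push(Stack *s, int x, int y)
{
    StackNode *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->x = x;
    n->y = y;
    n->below = s->top;               /* the old top ends up under the new node */
    s->top = n;
    return true;
}

/* Reads the most recently pushed position without removing it. */
void top(const Stack *s, int *x, int *y)
{
    assert(!isEmpty(s));
    *x = s->top->x;
    *y = s->top->y;
}

/* Removes the most recently pushed position (used when backtracking). */
void pop(Stack *s)
{
    assert(!isEmpty(s));
    StackNode *n = s->top;
    s->top = n->below;
    free(n);
}
```

A solver would push each square it steps onto and pop whenever it hits a dead end, so that the stack always holds the path from the start to the current square.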
Basically you need to create a linked list in which each node is the head of another list contained as a member (which conceptually grows downwards), along with the usual next pointer in the list.
To access an element as in a 2D array, such as arr[3][4], you need to walk the first list while keeping a count of y and then move downward counting x, or vice versa.
This is a common data structure assignment which goes by the name "multi stack or multi queue", which, if implemented with lists, gives what you are looking for.
struct Node
{
int data;
struct Node *next;
struct Node *head; // This head can be null initially as well as for the last node in a direction
};
First of all you need to define the proper structure. At first it will be easier for you to create a list that terminates when the pointer to the next node is NULL. Later you will discover lists with a sentinel, bidirectional lists, and other things that may seem too complicated right now.
For example that's a structure:
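#include <stdlib.h> /* malloc */
typedef struct node /* identifiers starting with __ are reserved, so plain "node" is used */
{
    int info;
    struct node* next;
}node;
typedef node* list;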
For now let's assume that list and node are the same thing. Later you will find it more precise to separate the concept of a list from the concept of a node; for example, you may store the list's length in the list itself (to avoid counting all the nodes every time), but for now let's do it this way.
You initialize the list:
list l=NULL;
So the list contains zero nodes, to test if it's empty you just see if the pointer is NULL.
Add a new element:
if(NULL==l)
{
l=(node*)malloc(sizeof(node));
l->next=NULL;
l->info=0;
}
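Now the list contains one node. Create a function to add a new node:
void pushBack(list* listPointer, int info)
{
    /* build the new node first; it becomes the tail, so its next is NULL */
    node* newNode=(node*)malloc(sizeof(node));
    if(NULL==newNode)
        return; /* allocation failed, the list is left unchanged */
    newNode->info=info;
    newNode->next=NULL;
    if(NULL==*listPointer)
    {
        /* empty list: the new node becomes the head */
        *listPointer=newNode;
    }
    else
    {
        /* walk to the last node and link the new node after it */
        node* ptr=*listPointer;
        while(ptr->next!=NULL)
            ptr=ptr->next;
        ptr->next=newNode;
    }
}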
You could also gain efficiency by adding the elements at the front, or optimize the code by returning the added element so that you don't have to find the last element every time. I leave this to you. Now let's call the pushBack function for every element of the array:
for(int i=0; i<N; i++)
{
    /* pushBack takes the address of the list; arr is treated here as a flat array of N ints */
    pushBack(&l,arr[i]);
}
That's all, learn your way to implement linked lists.
You're not supposed to convert the whole array into a linked list, you're only supposed to convert the best path into a linked list. You'd do this by brute force, trying directions and backtracking when you ran into dead ends.
Your path, the linked list, would need to look something like this:
typedef struct PathNode
{
    int coordX, coordY;
    struct PathNode * next, * prev;
} PathNode;
If I remember later, I'll draw a picture or something of this structure and add it to the post. Comment on this post in a few hours to attract my attention.
The list would always contain a starting point, which would be the first node in the list. As you moved to other positions, one after the other, you'd push them onto the end of the list. This way, you could follow your path from your current position to the beginning of the maze by simply popping elements off of the list, one by one, in order.
This particular linked list is special in that it's two way: it has a pointer to both the next element and the previous one. Lists with only one of the two are called singly linked lists, this one with both is called a doubly linked list. Singly linked lists are one way only, and can only be traversed in one direction.
Think of your linked list as a giant pile of strings, each with a starting end and a finishing end. As you walk through the maze, you tie a string at every node you visit and bring an end with you to the next square. If you have to backtrack, you bring the string back with you so it no longer points to the wrong square. Once you find your way to the end of the maze, you will be able to trace your steps by following the string.
Could you just explain what -> means exactly?
-> is an all-in-one pointer dereference and member access operator. Say we have:
PathNode * p = malloc(sizeof(*p));
PathNode q;
We can access p's and q's members in any of the following ways:
(*p).coordX;
q.coordX;
p->coordX;
(&q)->coordX;

How to get the number of used elements of an array of structures in C?

Say I have:
struct a b[4];
//i filled some elements of b
I need to know the number of non-empty elements of b.
Since I'm not sure whether b has exactly 4 non-empty elements, is there any way to do this?
There is no way to retrieve this information. You have to keep track of the number of elements you use yourself.
Typically, C developers use another integer value alongside the array:
struct a b[4];
int b_count;
Increment the counter each time you fill an element in the array.
You can wrap all this into a structure, in order to keep the counter near the array. This allows you to return the array along with the counter from a function:
struct array {
struct a values[4];
int count;
};
struct array b;
There are two normal ways to do this.
The first is to have some sort of sentinel value which indicates that the array element isn't in use. For example, if you were storing quantities in an integer, you could use the value -1 to indicate it wasn't in use.
As a more relevant example to your situation:
struct a {
int inUse;
// all other fields in structure
};
and set inUse within the array element to 1 or 0 depending on whether that array element is in use.
The second is to maintain extra information outside of the array to indicate which elements were in use. This could be a map if the usage information was sparse, or just a count if you could guarantee active elements would be contiguous at the start.
For a map, you could use:
struct a b[4];
int inUse[4]; // shows inUse indication for each element.
For a simpler count variation:
struct a b[4];
int inUseAtStart; // 0 thru 4 shows how many elements are in use,
// starting at b[0].
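With the inUse flag variant, the number of used elements can be obtained by scanning the array. A small sketch, assuming the flag is stored inside struct a as shown earlier in this answer; the payload field and the function name countInUse are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct a {
    int inUse;                       /* 1 if this element holds data, 0 otherwise */
    int payload;                     /* assumption: stands in for the other fields */
};

/* Counts the elements whose inUse flag is set. O(n) per call. */
size_t countInUse(const struct a *items, size_t n)
{
    assert(items != NULL || n == 0);
    size_t used = 0;
    for (size_t i = 0; i < n; i++)
        if (items[i].inUse)
            used++;
    return used;
}
```

The scan costs O(n) on every call, which is the price of not maintaining a separate count.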
There is no such empty or non-empty distinction in C.
The very thing you describe as empty may refer to uninitialized variables.
You will have to keep track of how many elements of the array you use when you populate it. Note that you must do this anyway, because C does no bounds checking for arrays, so you have to make sure that you do not exceed the bounds of the array (you end up with undefined behavior if you don't). While doing so, you can easily keep track of how many elements you have used.
C adds no overhead to its arrays, and therefore it doesn't store any additional information, including the element count. There's a decent C++ std::vector template for this in case you don't want to do it yourself (which can be annoying) and are willing to use C++, just saying :)
One thing you can do is mark the item after the last one you inserted. For example, if you used 2 elements, you can mark the third element with a specific value like -1.
Another way is to keep a variable that holds the count of the used elements.

Arranging elements in C array so there are no gaps

I have a regular array of structs in C, in a program that runs every second and updates all the data in the structs. When a condition is met, one of the elements gets cleared and is used as a free slot for new element (in this case timers) that might come in at any point.
What I do is just go through all the elements of the array looking for active elements requiring updates. But even if the number of elements is small (<2000), I feel this wastes time going through the inactive ones. Is there a way I can keep the array gap-free so I just need to iterate through the number of currently allocated elements?
Assuming the specific order of the elements does not matter, it can be done very easily.
If you have your array A and the number of active elements N, you can then add an element E like this:
A[N++] = E;
and remove the element at index I like this:
A[I] = A[--N];
So how does this work? Well, it's fairly simple. We want the array to only store active elements, so we may assume that the array is like that when we start doing either of these things.
Adding an element will always put it at the end, and since all elements currently in the array, as well as the newly added element, will be active, we can safely add one to the end.
Removing an element is done by moving the last element to take over the array index of the element we want to remove. Writing N for the count after the decrement, A[0..I-1] is active, as well as A[I+1..N], and by moving A[N] to A[I], the entire range A[0..N-1] is active (A[N] is not active, because it no longer exists: we moved it to A[I], and that's why we decrease N by 1).
If you're removing elements while iterating over them to update them, note that you can only increment your loop counter after processing an element which doesn't get removed, since otherwise, you would never process the moved elements.
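The remark about the loop counter can be sketched like this. struct timer, its remaining field, and tickAll are hypothetical names chosen for illustration; the point is only that the index advances when the current element is kept and stays put when an element has been moved into its place.

```c
#include <assert.h>
#include <stddef.h>

struct timer {
    int remaining;                   /* assumption: ticks left before expiry */
};

/* Updates every active timer exactly once and removes the expired ones
   by moving the last active timer into the gap. Returns the new count. */
size_t tickAll(struct timer *timers, size_t n)
{
    assert(timers != NULL || n == 0);
    size_t i = 0;
    while (i < n) {
        if (--timers[i].remaining <= 0) {
            /* A[I] = A[--N]; the moved element has not been processed
               yet, so i is NOT incremented here. */
            timers[i] = timers[--n];
        } else {
            i++;
        }
    }
    return n;
}
```

Starting from remaining values 1, 3, 1, 2, one call removes the two timers that reach zero and leaves two active entries at the front of the array.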
Traversing 2,000 entries per second is negligible. It's really not worth optimizing. If you really feel you must, swap the inactive entry for the last active entry.
It doesn't sound like you have a great reason for not using a linked list. If you do the implementation well, you'll get O(1) inserts, O(1) removals, and you'll only ever need to keep (and iterate over) active structs. There'd be some memory overhead, but for even moderately sized structs, even a doubly-linked list would be pretty efficient. The nice thing about this approach is that you can keep elements in their insertion order without extra computational overhead.
A relatively simple way to accomplish this:
void remove(struct foo *foo_array, int *n)
{
    struct foo *src = foo_array, *dst = foo_array;
    int num_removed = 0;
    for (int i=0; i<*n; ++i)
    {
        // Do we want to remove this? (should_remove() left as exercise for reader.)
        if (should_remove(src))
        {
            // yes, remove; advance src without advancing dst
            ++src;
            ++num_removed;
        }
        else if (src != dst)
        {
            // advance src and dst (with copy)
            *dst++ = *src++;
        }
        else
        {
            // advance both pointers (no copy)
            ++src;
            ++dst;
        }
    }
    // update size of array
    *n -= num_removed;
}
The idea is that you keep track of how many elements of the array are valid (*n here), and pass a pointer to that count as an "in/out parameter". remove() decides which elements to remove and copies the ones that are out of place. Notice that this is O(n) regardless of how many elements end up being removed.
A few alternatives come to mind, choose according to your needs:
1) Leave it as it is unless you are having some performance issues or need to scale up.
2) Add a "next" pointer to each struct to use it as an element in a linked list. Keep two lists, one for the active ones and one for the unused ones. Depending on how you use the structs, also consider making the list doubly linked. (You can also still have the elements in an array if you need to index the structs, or you can stop using the array if not.)
3) If you don't need indices (or order) of the structs in the array to be constant, move unused entries to the end of the array. Then when you iterate through the array from the beginning, you can stop whenever you reach the first unused one. (You can store the index of the last active struct so that whenever a struct is deactivated you can just have it switch places with the last active one, and then decrement the index of the last active struct.)
How about adding a linked list behavior in your struct, i.e. a pointer member pointing to the next active element?
You would have to update these pointers on element activation and deactivation.
EDIT: This method is not suitable for dynamically resized arrays, because resizing may change the memory object's address, invalidating the pointers used by the list.

Space efficient trie

I'm trying to implement a space efficient trie in C. This is my struct:
struct node {
char val; //character stored in node
int key; //key value if this character is an end of word
struct node* children[256];
};
When I add a node, its index is the unsigned char cast of the character. For example, if I want to add "c", then
children[(unsigned char)'c']
is the pointer to the newly added node. However, this implementation requires me to declare a node* array of 256 elements. What I want to do is:
struct node** children;
and then when adding a node, just malloc space for the node and have
children[(unsigned char)'c']
point to the new node. The issue is that if I don't malloc space for children first, then I obviously can't reference any index or else that's a big error.
So my question is: how do I implement a trie such that it only stores the non-null pointers to its children?
You could try using a de la Briandais trie, where you only have one child pointer for each node, and every node also has a pointer to a "sibling", so that all siblings are effectively stored as a linked list rather than directly pointed to by the parent.
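A rough sketch of the child/sibling layout described above, under a few assumptions: a key of -1 marks "no word ends here", keys passed in are non-negative, and childFor, trieInsert, and trieLookup are hypothetical names. Each node costs two pointers instead of 256, at the price of a linear scan over the siblings.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    char val;                        /* character stored in node */
    int key;                         /* assumption: -1 if no word ends here */
    struct node *child;              /* first child */
    struct node *sibling;            /* next node with the same parent */
};

/* Finds the child of parent that holds c by walking the sibling list.
   If it is missing and create is nonzero, appends a new node. */
static struct node *childFor(struct node *parent, char c, int create)
{
    struct node **link = &parent->child;
    while (*link != NULL && (*link)->val != c)
        link = &(*link)->sibling;
    if (*link == NULL && create) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->val = c;
        n->key = -1;
        n->child = NULL;
        n->sibling = NULL;
        *link = n;
    }
    return *link;
}

/* Inserts word with the given key. Returns 0 on allocation failure. */
int trieInsert(struct node *root, const char *word, int key)
{
    assert(root != NULL && word != NULL && key >= 0);
    struct node *n = root;
    for (; *word != '\0'; word++) {
        n = childFor(n, *word, 1);
        if (n == NULL)
            return 0;
    }
    n->key = key;
    return 1;
}

/* Returns the key stored for word, or -1 if word is not in the trie. */
int trieLookup(struct node *root, const char *word)
{
    struct node *n = root;
    for (; n != NULL && *word != '\0'; word++)
        n = childFor(n, *word, 0);
    return n != NULL ? n->key : -1;
}
```

Lookup time per character grows with the number of siblings, so this trades the O(1) child access of the 256-entry array for memory.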
You can't really have it both ways and be both space efficient and have O(1) lookup in the children nodes.
When you only allocate space for the entries that are actually added, and not the null pointers, you can no longer do
children[(unsigned char)'c']
as you can no longer index directly into the array.
One alternative is to simply do a linear search through the children and store an additional count of how many entries the children array has, i.e.
children[(unsigned char)'c'] = ...;
has to become
for(i = 0; i < len; i++) {
    if(children[i]->val == 'c') // compare the character stored in the child, not the pointer
        break;
}
if(i == len) {
    //...reallocate, add space for one item in children, and increment len
}
children[i] = ...;
If your tree ends up with a lot of non-empty entries at one level, you might insert the children in sorted order and do a binary search. Or you might store the children as a linked list instead of an array.
If you just want to do an English keyword search, I think you can minimize the size of your children, from 256 to just 26 - just enough to cover the 26 letters a-z.
Furthermore, you can use a linked list to keep the number of children even smaller so we can have more efficient iteration.
I haven't gone through the libraries yet but I think trie implementation will help.
You can be both space efficient and keep the constant lookup time by making child nodes of every node a hash table of nodes. Especially when Unicode characters are involved and the set of characters you can have in your dictionary is not limited to 52 + some, this becomes more of a requirement than a nicety. This way you can keep the advantages of using a trie and be time and space efficient at the same time.
I must also add that if the character set you are using is approaching unbounded, chances are having a linked list of nodes may just do fine. If you like an unmanageable nightmare, you can opt for a hybrid approach where first few levels keep their children in hash tables while the lower levels have a linked list of them. For a true bug farm, opt for a dynamic one where as each linked list passes a threshold, you convert it to a hash table on the fly. You could easily amortize the cost.
Possibilities are endless!

Resources