To understand the data structure, assume a linked list of nodes where each node has an array called bucket that can store some strings.
struct Node
{
    char *bucket[BUCKET_SIZE]; // an array of strings
    int count; // number of items in the array
    struct Node *next;
};
To insert into this list, the function prototype is:
insert( List *list, char *new_string );
I am trying to understand what this means. I have a bucket (an array) of some size (e.g.: 20) which is inside a struct (node) of a linked list.
"As you add values to the list, they can be inserted into an existing bucket if there is room, using a regular ordered array insertion (shuffling items down)."
These are two ways which I think would work. Please let me know which one to implement.
bucket[8]
Type a:
1 insert "o".
bucket[0]="o";
2 insert "one"
bucket[0]="o"
bucket[1]="one"
3 insert "two"
bucket[0]="o"
bucket[1]="one"
bucket[2]="two"
bucket[8]
Type b:
1 insert "o"
bucket[0]="o";
2 insert "one"
bucket[0]="one"
bucket[1]="o"
3 insert "two"
bucket[0]= "two"
bucket[1]="one"
bucket[2]="o"
Or am I getting it completely wrong, and it is trying to tell me something else?
This really depends on how you interpret "regular ordered array insertion" and "shuffling items down".
By "ordered array insertion", does it mean "sorted array insertion"? If it does, neither of your suggestions would work. Rather, you'd want an insertion that inserts the value correctly within the array at its sorted position. If it does not, then either of your suggestions would work (but first, let's consider the next phrase).
By "shuffling items down", does it mean "keep pushing items down"? If it does, you'd want Type b. If it doesn't, you'd want Type a.
Since this doesn't seem like a standard library requirement or anything, you probably would have better luck asking the source of your requirements.
Also, as moeCake pointed out, you'll have to handle the case where the bucket is already full (count == BUCKET_SIZE).
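If "ordered" does mean "sorted", the insertion could look roughly like the sketch below. This is only one reading of the requirement: the sort order (strcmp), the value of BUCKET_SIZE, and the function name bucket_insert are all assumptions, and the caller is expected to move on to another node when the bucket is full.

```c
#include <string.h>

#define BUCKET_SIZE 8   /* assumed size, matching bucket[8] in the question */

struct Node {
    char *bucket[BUCKET_SIZE]; /* strings kept in ascending strcmp() order */
    int count;                 /* number of slots in use */
    struct Node *next;
};

/* Hypothetical helper: insert s into node's bucket at its sorted position.
 * Returns 1 on success, 0 if the bucket is already full. */
int bucket_insert(struct Node *node, char *s)
{
    if (node->count == BUCKET_SIZE)
        return 0;

    /* Start at the end and shuffle larger items down one slot. */
    int i = node->count;
    while (i > 0 && strcmp(node->bucket[i - 1], s) > 0) {
        node->bucket[i] = node->bucket[i - 1];
        i--;
    }
    node->bucket[i] = s;
    node->count++;
    return 1;
}
```

With this version, inserting "two", "o" and "one" in that order leaves the bucket as "o", "one", "two".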
I'm trying to create a program that reads a file filled with dictionary words and stores every word in a hash table. I already have a hash function. Say the hash function returns index 123: how can I determine whether that index has no value yet? And if the index already has a value, should I make the word the new head of the list, or add it to the end of the list? Should I first initialize the whole array to something like NULL, since an uninitialized variable contains a garbage value? Does that work the same way with arrays from a struct?
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
// Number of buckets in hash table
// N = 2 ^ 13
const unsigned int N = 8192;
// Hash table
node *table[N];
This is part of my code. LENGTH is defined above with the value 45.
how will I be able to determine if that index right there has no value yet
The "slots" in your table are linked lists. The table stores pointers to the head nodes of these linked lists. If that pointer is NULL, the list is empty, but you don't need to make it a special case: When you look up a word, just walk the list while the pointer to the next node is not null. If the pointer to the head node is null, your walk is stopped short early, that's all.
should I just make the word the new head of the list or should I add it to the end of the list?
It shouldn't really matter. The individual lists at the nodes are supposed to be short. The idea of the hash table is to turn a linear search on all W words into a faster linear search on W/N words on average. If you see that your table has only a few long lists, your hash function isn't good.
You must walk the list once to ensure that you don't insert duplicates anyway, so you can insert at the end. Or you could try to keep each linked list alphabetically sorted. Pick one method and stick with it.
Should I initialize the whole array first to something like "NULL" because if a variable wasn't initialized it contains garbage value, does that work the same with arrays from a struct.
Yes, please initialize your array of head node pointers to NULL, so that the hash table is in a defined state. (If your array is at file scope or static, the table should be initialized to null pointers already, but it doesn't hurt to make the initialization explicit.)
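Putting these points together, a minimal sketch might look like this. The hash function here is only a placeholder (a djb2-style hash) standing in for the one you already have, N is written as a macro so the array size is a constant expression, and inserting at the head is just one of the two options discussed above.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45
#define N 8192

typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

/* File-scope array: every head pointer starts out as NULL. */
node *table[N];

/* Placeholder hash (assumption); substitute your own function. */
unsigned int hash(const char *word)
{
    unsigned long h = 5381;
    while (*word)
        h = h * 33 + (unsigned char)*word++;
    return (unsigned int)(h % N);
}

/* Walk the list at the word's slot; an empty slot ends the loop at once. */
bool lookup(const char *word)
{
    for (node *n = table[hash(word)]; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return true;
    return false;
}

/* Insert at the head unless the word is already stored or is too long. */
bool insert(const char *word)
{
    if (strlen(word) > LENGTH || lookup(word))
        return false;
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    strcpy(n->word, word);
    unsigned int i = hash(word);
    n->next = table[i];
    table[i] = n;
    return true;
}
```

Note that lookup() needs no special case for an empty slot: the loop condition fails immediately when the head pointer is NULL.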
So I have the following structure:
typedef struct listElement
{
element value;
    struct listElement *next;
} listElement, *List;
element is not a known type, meaning I don't know exactly what data type I'm dealing with, whether integers, floats or strings.
The goal is to write a function that deletes the listElements that appear more than twice (meaning a value can appear 0 times, once or twice, but not more).
I've already made a function that uses brute force, with a nested loop, but that's a cluster**** as I'm dealing with a large number of elements in my list. (It goes through every element and compares it to the rest of the elements in the list.)
I was wondering if there is a better solution that uses fewer instructions and has a lower complexity.
You can use a hash table and map elements to their count.
Walk the list once: if hashTable[element] (the count for this value so far) is already 2, delete the current element; otherwise increment the count.
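A rough sketch of that idea follows. Since the real element type is unknown, it assumes element is int; for another type you would need a matching hash and equality test. The table size SLOTS and the names bump and limit_to_two are made up for illustration.

```c
#include <stdlib.h>

typedef int element;   /* assumption: the real element type is unknown */

typedef struct listElement {
    element value;
    struct listElement *next;
} listElement, *List;

#define SLOTS 1024     /* assumed number of hash slots */

typedef struct counter {
    element key;
    int count;
    struct counter *next;
} counter;

/* Return how many times key has been seen, including this call. */
static int bump(counter **slots, element key)
{
    unsigned h = (unsigned)key % SLOTS;
    for (counter *c = slots[h]; c; c = c->next)
        if (c->key == key)
            return ++c->count;
    counter *c = malloc(sizeof *c);
    if (!c)
        abort();
    c->key = key;
    c->count = 1;
    c->next = slots[h];
    slots[h] = c;
    return 1;
}

/* Keep at most two nodes per value and free the rest, in one pass. */
void limit_to_two(List *head)
{
    counter *slots[SLOTS] = {0};
    listElement **link = head;
    while (*link) {
        if (bump(slots, (*link)->value) > 2) {
            listElement *dead = *link;   /* third or later occurrence */
            *link = dead->next;
            free(dead);
        } else {
            link = &(*link)->next;
        }
    }
    for (int i = 0; i < SLOTS; i++)      /* release the counters */
        for (counter *c = slots[i], *n; c; c = n) {
            n = c->next;
            free(c);
        }
}
```

Each element is visited once, so the expected cost is linear in the list length instead of quadratic.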
I'm currently trying to find the best data structure / algorithm that fits to my case:
I receive single unique random ids (uint32_t), and I create an element associated to each new id.
I need to retrieve elements from the id.
I also need to access the next and the previous element from any element (or even id) in the order of creation. The order of creation mainly depends on the current element, which is always accessible aside, so the new element should be its next.
Here is an example:
(12) <-> (5) <-> (8) <-> (1)
^ ^
'------------------------'
If I suppose the current element to be (8) and a new element (3) is created, it should look like:
(12) <-> (5) <-> (8) <-> (3) <-> (1)
^ ^
'--------------------------------'
An important thing to consider is that insertion, deletion and search happen with almost the same (high) frequency. Not completely sure about how many elements will live at the same time, but I would say max ~1000.
Knowing all of this, I think about using an AVL with ids as the sorted keys, keeping the previous and the next element too.
In C language, something like this:
struct element {
uint32_t id;
/* some other fields */
struct element *prev;
struct element *next;
};
struct node {
struct element *elt;
struct node *left;
struct node *right;
};
static struct element* current;
Another idea may be to use a hash map, but then I would need to find the right hash function. Not completely sure it always beats the AVL in practice for this amount of elements though. It depends on the hash function anyway.
Is the AVL a good idea or should I consider something else for this case?
Thanks !
PS: I'm not a student trying to make you do my homework, I'm just trying to develop a simple window manager (just for fun).
You are looking for some variation of what Java calls a LinkedHashMap.
This is basically a combination of a hash table and a (bi-directional) linked list.
The linked list contains all the elements in the desired order. Inserting an element at a known location (assuming you have a pointer to that location) is O(1), and the same goes for deletion.
The second data structure is the hash map (or tree map). It maps a key (your unique id) to a POINTER into the linked list. This way, given an id, you can quickly find its location in the linked list, and from there you can easily access the next and previous elements.
high level pseudo code for insertion:
insert(x, v, y): // insert key=x, value=v after the element with key=y
    if x is in hash-table:
        abort
    p = find(hash-table, y) // p is a pointer to y's list node
    n = insert_to_list_after(x, v, p) // new list node for x, placed right after p
    add(hash-table, x, n) // map x to its own list node n
high level pseudo code for search:
search(x):
    if x is not in hash-table:
        abort
    p = find(hash-table, x)
    return p->value
deletion should be very similar to insertion (and in same time complexity).
Note that it is also fairly easy to find element that is after x:
p = find(hash-table, x)
if (p != NULL && p->next != NULL):
    return p->next->value
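In C, the same idea could be sketched as below. Everything here is illustrative rather than a definitive design: the hash table is a fixed array of chains indexed by id % SLOTS (reasonable only because the ids are random), the hash chain is threaded through the elements themselves, the list is not circular, and the names insert_after, find and remove_element are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

#define SLOTS 256   /* assumption: sized for roughly 1000 live elements */

struct element {
    uint32_t id;
    struct element *prev, *next;   /* creation-order list */
    struct element *chain;         /* next element in the same hash slot */
};

static struct element *slots[SLOTS];

/* Look an element up by id; returns NULL if it does not exist. */
struct element *find(uint32_t id)
{
    struct element *e = slots[id % SLOTS];
    while (e && e->id != id)
        e = e->chain;
    return e;
}

/* Create element id right after `after` (NULL means: start a new list). */
struct element *insert_after(struct element *after, uint32_t id)
{
    if (find(id))
        return NULL;                       /* ids must be unique */
    struct element *e = calloc(1, sizeof *e);
    if (!e)
        return NULL;
    e->id = id;
    if (after) {
        e->prev = after;
        e->next = after->next;
        if (after->next)
            after->next->prev = e;
        after->next = e;
    }
    e->chain = slots[id % SLOTS];          /* register in the hash table */
    slots[id % SLOTS] = e;
    return e;
}

/* Unlink e from both structures and free it. */
void remove_element(struct element *e)
{
    struct element **link = &slots[e->id % SLOTS];
    while (*link != e)
        link = &(*link)->chain;
    *link = e->chain;                      /* unlink from hash chain */
    if (e->prev) e->prev->next = e->next;  /* unlink from ordered list */
    if (e->next) e->next->prev = e->prev;
    free(e);
}
```

For the example in the question, insert_after(find(8), 3) places (3) between (8) and (1), and the neighbours of any element are then available through its prev and next pointers.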
My suggestion is that you use a combination of two data structures - a list to store the elements in the order they are inserted and a hash map or binary search tree to implement an associative array(map) between the id and list node. You will perform the search using the associative array and will be able to access neighboring elements using the list. Deletion is also relatively easy, but you need to delete from both structures.
The complexity of find/insert/delete will be O(log n) if you use a binary search tree, and expected constant time if you use a hash table.
You should definitely consider the Skip List data structure.
It seems perfect for your case, because it has an expected O(log(n)) insert / search / delete and if you have a pointer to a node, you can find the previous and the next element in O(1) by just moving that pointer.
The conclusion is that if you've just created a node, you have a pointer to it, and you can find the prev/next element in O(1) time.
So I'm still trying to wrap my head around linked lists in C. They are.. mind-boggling to me right now because I have yet to fully understand pointers, let alone pointers to pointers, and dynamic memory allocation that linked lists require.
I'm trying to create a two dimensional array with independent height, and width values. At most they would be 30x30. I have a two dimensional array let's call it arr[x][y]. arr[x][y] is filled with values of integers ranging from -2 to 1, how would I transfer this two dimensional array into a linked list? How would I then access values from this linked list on whim? I'm very confused, and any help would be appreciated. I'm looking through tutorials as we speak.
Additionally this is supposed to be a sort of stack linked list where I could call functions such as push(pushes a new value to the top of the linked list), pop(pops a value from the top of the linked list), top(returns the value most recently pushed onto the stack), isEmpty(checks if the stack is empty).
I don't need any full code, but code would be helpful here. I just need an understanding though of Linked Lists, and how to implement these sort of functions.
Additionally here is the assignment that this is related to: Assignment
It's a maze solver. I've already written code that converts an ASCII picture into integer values for the two dimensional array. As stated above, the linked list is what I need help with.
Hint : from your assignment, the stack is not supposed to fully represent the array, but to represent a path you dynamically build to find a way from the starting position of the maze to the target position of the maze.
Basically you need to create a linked list in which each node is the head of another list contained as a member (which conceptually grows downwards), along with the usual next pointer in the list.
To access an element as in a 2D array, such as arr[3][4], you walk the first list while keeping a count of y and then move downward counting x, or vice versa.
This is a common data structure assignment that goes by the name "multi stack" or "multi queue"; implemented with lists, it gives what you are looking for.
struct Node
{
int data;
struct Node *next;
struct Node *head; // This head can be null initially as well as for the last node in a direction
};
First of all, you need to define the proper structure. At first it will be easier to create a list that terminates when the pointer to the next node is NULL. Afterwards you will discover lists with sentinels, bidirectional lists, and other things that may seem too complicated now.
For example that's a structure:
typedef struct __node
{
int info;
struct __node* next;
}node;
typedef node* list;
For now, let's treat list and node as the same thing. Later you will find it cleaner to separate the concept of a list from that of a node, for example so the list can store its own length (avoiding a count over all the nodes every time), but for now let's do it this way.
You initialize the list:
list l=NULL;
So the list contains zero nodes, to test if it's empty you just see if the pointer is NULL.
Add a new element:
if(NULL==l)
{
l=(node*)malloc(sizeof(node));
l->next=NULL;
l->info=0;
}
Now the list contains one node. Create a function to add a new node:
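void pushBack(list* listPointer, int info)
{
    node* newNode=(node*)malloc(sizeof(node));
    if(NULL==newNode)
        return; // allocation failed, list left unchanged
    newNode->info=info;
    newNode->next=NULL;
    if(NULL==*listPointer)
    {
        *listPointer=newNode; // empty list: the new node becomes the head
    }
    else
    {
        node* ptr=*listPointer;
        while(ptr->next!=NULL) // find the last node
            ptr=ptr->next;
        ptr->next=newNode;
    }
}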
You could also gain efficiency by adding the elements at the front, or optimize the code by returning the added element so that you don't have to find the last element every time. I leave this to you. Now let's call the pushBack function for every element of the array:
for(int i=0; i<N; i++)
{
    pushBack(&l,arr[i]);
}
That's all, learn your way to implement linked lists.
You're not supposed to convert the whole array into a linked list, you're only supposed to convert the best path into a linked list. You'd do this by brute force, trying directions and backtracking when you ran into dead ends.
Your path, the linked list, would need to look something like this:
typedef struct PathNode
{
    int coordX, coordY;
    struct PathNode * next, * prev;
} PathNode;
If I remember later, I'll draw a picture or something of this structure and add it to the post. Comment on this post in a few hours to attract my attention.
The list would always contain a starting point, which would be the first node in the list. As you moved to other positions, one after the other, you'd push them onto the end of the list. This way, you could follow your path from your current position to the beginning of the maze by simply popping elements off of the list, one by one, in order.
This particular linked list is special in that it's two way: it has a pointer to both the next element and the previous one. Lists with only one of the two are called singly linked lists, this one with both is called a doubly linked list. Singly linked lists are one way only, and can only be traversed in one direction.
Think of your linked list as a giant pile of strings, each with a starting end and a finishing end. As you walk through the maze, you tie a string at every node you visit and bring an end with you to the next square. If you have to backtrack, you bring the string back with you so it no longer points to the wrong square. Once you find your way to the end of the maze, you will be able to trace your steps by following the string.
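As a sketch of how the path could be kept as a stack with the push, pop, top and isEmpty operations the question mentions: the version below is simplified to a singly linked list that only keeps the prev pointer (enough for backtracking), and the exact signatures are assumptions rather than anything the assignment prescribes.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct PathNode {
    int coordX, coordY;
    struct PathNode *prev;   /* the square we came from */
} PathNode;

/* The stack is just a pointer to the most recent node; NULL means empty. */
bool isEmpty(const PathNode *stack)
{
    return stack == NULL;
}

/* Most recently pushed position, or NULL if the stack is empty. */
const PathNode *top(const PathNode *stack)
{
    return stack;
}

/* Push a new position; returns false if allocation fails. */
bool push(PathNode **stack, int x, int y)
{
    PathNode *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->coordX = x;
    n->coordY = y;
    n->prev = *stack;
    *stack = n;
    return true;
}

/* Discard the most recent position (used when backtracking). */
void pop(PathNode **stack)
{
    if (*stack != NULL) {
        PathNode *dead = *stack;
        *stack = dead->prev;
        free(dead);
    }
}
```

A solver would start with PathNode *path = NULL, push the starting square, push each square it steps onto, and pop whenever it hits a dead end; following prev from the top then retraces the route back to the start.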
Could you just explain what -> means exactly?
-> is an all-in-one pointer dereference and member access operator. Say we have:
PathNode * p = malloc(sizeof(*p));
PathNode q;
We can access p's and q's members in any of the following ways:
(*p).coordX;
q.coordX;
p->coordX;
(&q)->coordX;
I'm writing a program in which you enter words via the keyboard or file and then they come out sorted by length. I was told I should use linked lists, because the length of the words and their number aren't fixed.
Should I use linked lists to represent words?
struct node{
char c;
struct node *next;
};
And then how can I use qsort to sort the words by length? Doesn't qsort work with arrays?
I'm pretty new to programming.
Thank you.
I think there is a bigger issue than which sorting algorithm you should pick: the struct you're defining is not going to hold a list of words, but rather a list of single letters (that is, a single word). Strings in C are represented as null-terminated arrays of characters, laid out like so:
| A | n | t | h | o | n | y | \0 |
This array would ideally be declared as char[8] - one slot for each letter, plus one slot for the null byte (literally one byte of zeros in memory.)
Now I'm aware you probably know this, but it's worth pointing this out for clarity. When you operate on arrays, you can look at multiple bytes at a time and speed things up. With a linked list, you can only look at things in truly linear time: step from one character to the next. This is important when you're trying to do something quickly on strings.
The more appropriate way to hold this information is in a style that is very C like, and used in C++ as vectors: automatically-resized blocks of contiguous memory using malloc and realloc.
First, we setup a struct like this:
struct string {
    char *data;
    int logLen;
    int allocLen;
};
typedef struct string string;
And we provide some functions for these:
// mallocs a block of memory and holds its length in allocLen
void string_create(string* input);
// appends a character and moves up the null character
// if running out of space, (logLen == allocLen), realloc 2x as much
void string_addchar(string* input, char c);
void string_delete(string* input);
Now, this isn't great because you can't just read into an easy buffer using scanf, but you can use a getchar()-like function to read single characters and place them into the string using string_addchar(), which avoids using a linked list. The string reallocates as rarely as possible, only when the capacity doubles, and you can still use functions from the C string library on it! This helps a LOT with implementing your sorts.
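One possible implementation of these three functions is sketched below. The initial capacity of 16 is an arbitrary assumption, and unlike the prototypes above, these versions return an int status so the caller can detect allocation failure; that return value is an addition for illustration.

```c
#include <stdlib.h>

typedef struct string {
    char *data;     /* always null-terminated */
    int logLen;     /* characters stored, not counting the terminator */
    int allocLen;   /* bytes allocated */
} string;

/* Allocate an empty string. Returns 1 on success, 0 on failure. */
int string_create(string *s)
{
    s->allocLen = 16;               /* assumed starting capacity */
    s->logLen = 0;
    s->data = malloc((size_t)s->allocLen);
    if (s->data == NULL)
        return 0;
    s->data[0] = '\0';
    return 1;
}

/* Append c, doubling the buffer when there is no room for c plus '\0'. */
int string_addchar(string *s, char c)
{
    if (s->logLen + 1 == s->allocLen) {
        char *bigger = realloc(s->data, (size_t)s->allocLen * 2);
        if (bigger == NULL)
            return 0;               /* the old buffer is still valid */
        s->data = bigger;
        s->allocLen *= 2;
    }
    s->data[s->logLen++] = c;
    s->data[s->logLen] = '\0';
    return 1;
}

/* Release the buffer and reset the fields. */
void string_delete(string *s)
{
    free(s->data);
    s->data = NULL;
    s->logLen = s->allocLen = 0;
}
```

Because data is kept null-terminated after every append, it can be passed directly to strlen(), strcmp() and the other C string functions.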
So now how do I actually implement a sort with this? You can create a similar type intended to hold entire strings in a similar manner, growing as necessary, to hold the input strings from the console. Either way, all your data now lives in contiguous blocks of memory that can be accessed as an array - because it is an array! For example, say we've got this:
struct stringarray {
string *data;
int logLen;
int allocLen;
};
typedef struct stringarray cVector;
cVector myData;
And similar functions as before: create, delete, insert.
The key here is that you can implement your sort functions using strcmp() on the string.data element since it's JUST a C string. Since we've got a built-in implementation of qsort that uses a function pointer, all we have to do is wrap strcmp() for use with these types and pass the address in.
If you know how you want the items sorted, you should use an insertion sort when reading the data so that once all the input has been entered, all you have to do is write the output. Using a linked list would be ok, though you'll find that it has O(N^2) performance. If you store the input in a binary tree ordered by length (a balanced tree would be best), then your algorithm will have O(N log N) performance. If you're only going to do it once, then go for simplicity of implementation over efficiency.
Pseudocode:
list = new list
read line
while not end of file
    len = length(line)
    elem = head(list)
    while (elem != null and len > length(elem->value))
        elem = elem->next
    end
    insert line in list before elem (at the tail if elem is null)
    read line
end
// at this point the list's elements are sorted from shortest to longest
// so just write it out in order
elem = head(list)
while (elem != null)
    output elem->value
    elem = elem->next
end
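In C, the insertion step of this pseudocode might look like the following sketch. The node layout and the name insert_by_length are assumptions; the function copies the word so the caller can reuse its input buffer.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *string;
    struct node *next;
};

/* Insert a copy of word so that the list stays ordered from shortest to
 * longest. Returns 1 on success, 0 on allocation failure. */
int insert_by_length(struct node **head, const char *word)
{
    size_t len = strlen(word);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->string = malloc(len + 1);          /* +1 for the terminator */
    if (n->string == NULL) {
        free(n);
        return 0;
    }
    memcpy(n->string, word, len + 1);

    /* Stop at the first node that is at least as long, or at the end. */
    struct node **link = head;
    while (*link != NULL && strlen((*link)->string) < len)
        link = &(*link)->next;
    n->next = *link;
    *link = n;
    return 1;
}
```

Using a pointer to the link (struct node **) means the empty list, insertion at the head and insertion at the tail need no special cases.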
Yes, the classic "C" library function qsort() only works on an array. That is a contiguous collection of values in memory.
Tvanfosson's advice is pretty good: as you build the linked list, you can insert elements at the correct position. That way, the list is always sorted.
I think the comment you made that you were told to use a linked list is interesting. Indeed a list can be a good data structure to use in many instances, but it does have drawbacks; for example, it must be traversed to find elements.
Depending on your application, you may want to use a hash table. In C++ you could use a hash_set or a hash_map.
I would recommend you spend some time studying basic data structures. Time spent here will serve you well and put you in a better position to evaluate advice such as "use a linked list".
There are lots of ways to handle it... You can use arrays, via dynamic memory allocation, with realloc, if you feel brave enough to try.
The standard implementation of qsort, though, needs each element to be a fixed length, which would mean having an array-of-pointers-to-strings.
Implementing a linked list, though, should be easy, compared to using pointers to pointers.
I think what you were told to do was not to store each string as a list, but to store the strings in a linked list:
struct node {
    char *string;
    struct node *next;
};
Then, all you have to do is, every time you read a string, add a new node into the list, in its ordered place. (Walk the list until the current string's length is greater than the string you just read.)
The problem of words not being a fixed length is common, and it's usually handled by storing the word temporarily in a buffer, and then copying it into a proper length array (dynamically allocated, of course).
Edit:
In pseudo code:
array_size = 1
array_count = 0
array = malloc(array_size * sizeof(char *))
while (read(buffer) != EOF):
    if (array_count == array_size):
        array_size = array_size * 2
        array = realloc(array, array_size * sizeof(char *))
    string_temp = malloc(strlen(buffer) + 1)
    strcpy(string_temp, buffer)
    array[array_count] = string_temp
    array_count++
qsort(array, array_count, sizeof(char *), comparison)
print array
Of course, that needs a TON of polishing. Remember that array is of type char **array, ie "A pointer to a pointer to char" (which you handle as an array of pointers); since you're passing pointers around, you can't just pass the buffer into the array.
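A more polished version of that idea could look like the sketch below. The struct words wrapper and the function names are invented for illustration; the comparator shows the extra level of indirection that qsort() introduces when the elements are themselves pointers.

```c
#include <stdlib.h>
#include <string.h>

struct words {
    char **items;    /* array of pointers to heap-allocated copies */
    size_t count;    /* slots in use */
    size_t size;     /* slots allocated */
};

/* Append a copy of buffer, doubling the pointer array when it is full.
 * Returns 1 on success, 0 on allocation failure. */
int words_add(struct words *w, const char *buffer)
{
    if (w->count == w->size) {
        size_t new_size = w->size ? w->size * 2 : 1;
        char **bigger = realloc(w->items, new_size * sizeof *bigger);
        if (bigger == NULL)
            return 0;
        w->items = bigger;
        w->size = new_size;
    }
    char *copy = malloc(strlen(buffer) + 1);   /* +1 for the terminator */
    if (copy == NULL)
        return 0;
    strcpy(copy, buffer);
    w->items[w->count++] = copy;
    return 1;
}

/* qsort() hands the comparator pointers to the elements, i.e. char **. */
int by_length(const void *a, const void *b)
{
    size_t la = strlen(*(char *const *)a);
    size_t lb = strlen(*(char *const *)b);
    return (la > lb) - (la < lb);
}

/* Sort the stored words from shortest to longest. */
void words_sort(struct words *w)
{
    qsort(w->items, w->count, sizeof *w->items, by_length);
}
```

Start from struct words w = {0}; call words_add() for each word read into the buffer, then words_sort(&w) and print w.items[0] through w.items[w.count - 1]. Note that qsort() is not guaranteed to be stable, so words of equal length may come out in any relative order.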
You qsort a linked list by allocating an array of pointers, one per list element.
You then sort that array, where in the compare function you are of course receiving pointers to your list elements.
This then gives you a sorted list of pointers.
You then traverse your list by traversing the array of pointers and adjusting each element in turn, rearranging its order in the list to match the order of your array of pointers.
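These steps can be sketched as follows, assuming a node type that holds a char * and sorting by string length as in the question; the name sort_list is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *string;
    struct node *next;
};

/* The array elements are struct node *, so qsort() passes struct node **. */
static int by_length(const void *a, const void *b)
{
    const struct node *na = *(struct node *const *)a;
    const struct node *nb = *(struct node *const *)b;
    size_t la = strlen(na->string), lb = strlen(nb->string);
    return (la > lb) - (la < lb);
}

/* Sort the list by string length; returns the new head (or the old one
 * unchanged if the temporary array cannot be allocated). */
struct node *sort_list(struct node *head)
{
    size_t n = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        n++;
    if (n < 2)
        return head;

    struct node **ptrs = malloc(n * sizeof *ptrs);
    if (ptrs == NULL)
        return head;

    size_t i = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        ptrs[i++] = p;                 /* one pointer per list element */

    qsort(ptrs, n, sizeof *ptrs, by_length);

    for (i = 0; i + 1 < n; i++)        /* relink in sorted order */
        ptrs[i]->next = ptrs[i + 1];
    ptrs[n - 1]->next = NULL;

    head = ptrs[0];
    free(ptrs);
    return head;
}
```

Only the next pointers change; the nodes and their strings stay where they are in memory.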