Inserting a number into a sorted array! - c

I would like to write a piece of code for inserting a number into a sorted array at the appropriate position (i.e. the array should still remain sorted after insertion).
My data structure doesn't allow duplicates.
I am planning to do something like this:
1. Find the right index where I should be putting this element using binary search.
2. Create space for this element by moving all the elements from that index down.
3. Put this element there.
Is there any other better way?

If you really have an array and not a better data structure, that's optimal. If you're flexible on the implementation, take a look at AA trees: they're rather fast and easy to implement. Obviously, a tree takes more space than an array, and it's not worth it unless the number of elements is big enough for the cost of the block move to be noticeable compared to pointer manipulation.
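The plan from the question (binary search, shift, store) can be sketched as follows. This is only a minimal sketch, assuming an array of int with a fixed capacity; the names lower_bound and sorted_insert are illustrative, not from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first element >= key in sorted a[0..n). */
static size_t lower_bound(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Inserts key into sorted a[0..*n), which has room for cap elements.
 * Returns 1 on success, 0 if key is already present or the array is full. */
int sorted_insert(int *a, size_t *n, size_t cap, int key)
{
    assert(a != NULL && n != NULL && *n <= cap);
    size_t pos = lower_bound(a, *n, key);
    if (pos < *n && a[pos] == key)
        return 0;                       /* duplicates are not allowed */
    if (*n == cap)
        return 0;                       /* no room; the caller must grow the array */
    /* Open a gap at pos by moving the tail one slot towards the end. */
    memmove(&a[pos + 1], &a[pos], (*n - pos) * sizeof a[0]);
    a[pos] = key;
    ++*n;
    return 1;
}
```

The search is O(log n), but the memmove still makes the whole insertion O(n) in the worst case.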

Does the data have to be completely sorted all the time?
If not, and you only need quick access to the smallest or largest element, a binary heap gives constant-time access to that element and O(log n) insertion and deletion.
Moreover, it can satisfy your condition that the memory be contiguous, since you can implement a binary heap on top of an array (i.e. for the node at array[n], array[2n+1] is the left child and array[2n+2] is the right child).
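As a rough illustration of the array layout described above, here is a minimal min-heap sketch for int values. The function names are made up for this example, and the caller is assumed to provide an array with enough spare capacity.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Adds key to the min-heap h[0..*n); the caller guarantees spare capacity. */
void heap_push(int *h, size_t *n, int key)
{
    assert(h != NULL && n != NULL);
    size_t i = (*n)++;
    h[i] = key;
    /* Sift up: the parent of index i lives at (i - 1) / 2. */
    while (i > 0 && h[(i - 1) / 2] > h[i]) {
        swap_int(&h[(i - 1) / 2], &h[i]);
        i = (i - 1) / 2;
    }
}

/* Removes and returns the smallest element; the heap must be non-empty. */
int heap_pop_min(int *h, size_t *n)
{
    assert(h != NULL && n != NULL && *n > 0);
    int min = h[0];
    h[0] = h[--*n];
    size_t i = 0;
    for (;;) {
        /* Sift down: children of index i live at 2i + 1 and 2i + 2. */
        size_t l = 2 * i + 1, r = 2 * i + 2, s = i;
        if (l < *n && h[l] < h[s]) s = l;
        if (r < *n && h[r] < h[s]) s = r;
        if (s == i) break;
        swap_int(&h[i], &h[s]);
        i = s;
    }
    return min;
}
```

The smallest element is always h[0], so reading it is O(1); both functions do at most one swap per tree level, hence O(log n).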

A heap-based implementation of a tree would be more efficient if you are inserting a lot of elements: O(log n) for both the locating/removing and the inserting operations.

Related

Array VS single linked list VS double link list

I am learning about arrays, singly linked lists and doubly linked lists these days, and this question came up:
"What is the best option between these three data structures when it comes to fast searching, less memory, and easy insertion and updating of things?"
As far as I know, an array cannot be the answer because it has a fixed size, so inserting a new item would not always be possible. A doubly linked list can do the task, but it needs two pointers per node, which costs memory, so I think a singly linked list fulfills all the given requirements. Am I right? Please correct me if I am missing any point. One more question: instead of choosing one of them, can I combine two or more of these data structures to meet all the requirements?
"What is the best option between these three data structures when it comes to fast searching, less memory, easily insertion and updating of things".
As far as I can tell, arrays serve the purpose.
Fast search: You can do a binary search if the array is sorted. You don't get that option with a linked list.
Less memory: Arrays take the least memory (but it must be contiguous memory).
Insertion: Writing into a free slot of an array is a matter of a[i] = "value". If the array's capacity is exceeded, simply copy the data into a new, larger array. That is how HashMaps / ArrayLists work under the covers.
Updating things: Only arrays provide random access. a[i] = "new value" updates an element in O(1) time if you know the index.
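The "copy into a new, larger array" step mentioned above could look roughly like this in C. It is a sketch only: the struct int_vec type, the starting capacity of 4 and the doubling policy are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array of int; initialise with {0}. */
struct int_vec {
    int *data;
    size_t len;     /* number of elements in use */
    size_t cap;     /* number of elements allocated */
};

/* Appends value, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if the allocation fails. */
int vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* the old buffer is still valid */
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

Because the capacity doubles, the occasional copy is amortized over many appends. Note that this covers appending at the end only; inserting in the middle still requires shifting the elements after the insertion point.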
Each of those has its own benefits and downsides.
For search speed, I'd say arrays are better suited due to their quick lookup times.
Since an array is a sequence of same-size elements, retrieving the value at an index is just memoryLocation + index * elementSize. For a linked list, the list has to be traversed up to that position.
Arrays also win in the "less memory" category, since there's no need to store extra pointers.
For insertions, arrays are slow. You need to shift the elements after the insertion point and, if the array is full, copy the contents to a new array, assign the new array, delete the old one...
Insertions go much quicker in singly or doubly linked lists, because it's just a matter of changing one or two pointers.
In the end, it all just depends on the use case. Are you inserting a lot? Then you probably want to consider a non-array structure.
Do you need many quick lookups? Consider those arrays again. Etc..
See also this question.
A linked list is usually the best choice when we don’t know in advance the number of elements we will have to store or the number can change dynamically.
Arrays have slow insertion and deletion times. To insert an element to the front or middle of the array, the first step is to ensure that there is space in the array for the new element, otherwise, the array needs to be RESIZED. This is an expensive operation. The next step is to open space for the new element by shifting every element after the desired index. Likewise, for deletion, shifting is required after removing an element. This implies that insertion time for arrays is Big O of n (O(n)) as n elements must be shifted.
Using static arrays, we can save some memory in comparison to linked lists because we do not need to store pointers to the next node.
A doubly linked list supports fast insertion and removal at both ends. This is used in an LRU cache, where you need to insert a new item at the front and remove the oldest item from the end.
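The two LRU-style operations mentioned above, inserting at the front and removing from the end, can be sketched like this. The types dnode and dlist are hypothetical names for this example, and nodes are owned by the caller.

```c
#include <assert.h>
#include <stddef.h>

struct dnode {
    int value;
    struct dnode *prev, *next;
};

struct dlist {
    struct dnode *head, *tail;  /* both NULL when the list is empty */
};

/* Links n in as the new first node: O(1). */
void dlist_push_front(struct dlist *l, struct dnode *n)
{
    assert(l != NULL && n != NULL);
    n->prev = NULL;
    n->next = l->head;
    if (l->head)
        l->head->prev = n;
    else
        l->tail = n;            /* list was empty */
    l->head = n;
}

/* Unlinks and returns the last node, or NULL if the list is empty: O(1). */
struct dnode *dlist_pop_back(struct dlist *l)
{
    assert(l != NULL);
    struct dnode *n = l->tail;
    if (n == NULL)
        return NULL;
    l->tail = n->prev;
    if (l->tail)
        l->tail->next = NULL;
    else
        l->head = NULL;         /* list became empty */
    n->prev = n->next = NULL;
    return n;
}
```

Neither function walks the list, which is what makes the tail pointer plus the prev links worth their extra memory in this use case.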

Circular buffer with fast lookup/insert/remove

I have a problem for which I haven't found a sufficiently efficient solution. I need to speed up a circular buffer with a fixed size of 1,000,000 elements. It is currently implemented using a singly linked list.
For the moment, I have changed the implementation to use an array instead of the linked list. I use a write pointer and a read pointer to avoid shifting every element of my array. I need to do A LOT of lookups in my FIFO, and I also need to delete items at arbitrary indexes (well, I know it breaks the FIFO rule).
At first I was thinking of a sorted index table that mirrors the FIFO array. Lookup would be O(log n), but every time I update my FIFO, I also need to update my index table. This is the part I didn't manage to do efficiently (with low complexity).
Any hints about an implementation that keeps track of the FIFO's order and gives good performance for insert/delete/search operations?
Thanks.
One approach would be to use:
An array with n elements to store the items
A Fenwick tree with n elements to store the occupancy.
We use the Fenwick Tree to write a 1 whenever an element is present, or 0 if the element is not present.
Once you have this structure, you can find the k-th present element and perform deletions in O(log n) time. (The actual implementation details may be a bit fiddly due to the FIFO wraparound; it may help to keep track of the total occupancy in the array and the occupancy from the pointer to the first element until the end of the array.)
Note that this structure will allow you to delete items anywhere, but only to insert items at the end of the FIFO - it is not clear whether this matches your requirements?
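A minimal sketch of the occupancy Fenwick tree described above is shown below. It ignores the FIFO wraparound, uses a small hypothetical slot count FW_SLOTS instead of 1,000,000, and keeps the tree in a static array for brevity; all names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define FW_SLOTS 16                 /* hypothetical slot count for the example */

static int fw_tree[FW_SLOTS + 1];   /* 1-based Fenwick tree over occupancy bits */

/* Adds delta at slot i (0-based): +1 when a slot is filled, -1 when it is freed. */
void fenwick_add(size_t i, int delta)
{
    assert(i < FW_SLOTS);
    for (size_t j = i + 1; j <= FW_SLOTS; j += j & (~j + 1))
        fw_tree[j] += delta;
}

/* Returns the number of occupied slots in [0, i). */
int fenwick_prefix(size_t i)
{
    assert(i <= FW_SLOTS);
    int sum = 0;
    for (size_t j = i; j > 0; j -= j & (~j + 1))
        sum += fw_tree[j];
    return sum;
}

/* Returns the 0-based slot of the k-th occupied element.
 * Assumes 1 <= k <= total occupancy. */
size_t fenwick_find_kth(int k)
{
    size_t step = 1, pos = 0;
    while (step * 2 <= FW_SLOTS)
        step *= 2;                  /* largest power of two <= FW_SLOTS */
    for (; step > 0; step >>= 1) {
        /* Skip a whole block if it holds fewer than k occupied slots. */
        if (pos + step <= FW_SLOTS && fw_tree[pos + step] < k) {
            pos += step;
            k -= fw_tree[pos];
        }
    }
    return pos;
}
```

Deleting the item in slot i is then fenwick_add(i, -1) plus whatever bookkeeping the item array needs; every operation touches O(log n) tree entries.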

Use of memory between an array and a linked list

In C, which is more efficient in terms of memory management, a linked list or an array?
For my program, I could use one or both of them. I would like to take this point into consideration before starting.
Both linked lists and arrays have good and bad sides.
Array
Accessing a particular position takes O(1) time, because the memory of an array is contiguous. If the address of the first element is A, then the address of the 5th element is A + 4 * sizeof(element).
Inserting a number at some position takes O(n) time, because you have to shift every number after that position and possibly also increase the size of the array.
Searching for an element: if the array is sorted, you can do a binary search, and since accessing each position is O(1), the search takes O(log n). If the array is not sorted, you have to traverse the entire array, so O(n) time.
Deletion is the exact opposite of insertion. You have to shift left all the numbers after the place where you deleted. You might also need to recreate the array for memory efficiency. So O(n).
Memory must be contiguous, which can be a problem on old x86 machines with 64k segments.
Freeing is a single operation.
Linked list
Accessing a particular position takes O(n) time, because you have to traverse the list to get to that position.
If you want to insert a number at some position and you already have a pointer to that position, inserting the new value takes O(1) time.
Searching for an element: no matter how the numbers are arranged, you have to traverse them from front to back one by one to find your number. So it's always O(n).
Deletion is the exact opposite of insertion. If you already have a pointer to the position, say the list is p->q->r and you want to delete q, all you need to do is set the next pointer of p to r. So O(1), given that you have a pointer to p.
Memory is dispersed. With a naive implementation, that can be bad for cache locality, and overall memory use can be high because the memory allocation system has overhead for each node. However, careful programming can get around this problem.
Freeing requires a separate call for each node; again, careful programming can get around this problem.
So depending on what kind of problem you are solving you have to choose one of the two.
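To make the O(1) linked-list operations above concrete, here is a small sketch of inserting and deleting right after a node you already hold a pointer to. The snode type and function names are assumptions for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct snode {
    int value;
    struct snode *next;
};

/* Inserts a new node holding value right after p: O(1).
 * Returns the new node, or NULL if the allocation fails. */
struct snode *insert_after(struct snode *p, int value)
{
    assert(p != NULL);
    struct snode *q = malloc(sizeof *q);
    if (q == NULL)
        return NULL;
    q->value = value;
    q->next = p->next;
    p->next = q;
    return q;
}

/* Unlinks and frees the node after p, so p->q->r becomes p->r: O(1). */
void delete_after(struct snode *p)
{
    assert(p != NULL);
    struct snode *q = p->next;
    if (q == NULL)
        return;                 /* nothing to delete */
    p->next = q->next;
    free(q);
}
```

Each inserted node costs one malloc call, which is the per-node allocator overhead mentioned above; finding p in the first place is still O(n).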
Linked list uses more memory, from both the linked list itself and inside the memory manager due to the fact you are allocating many individual blocks of memory.
That does not mean it is less efficient at all, depending on what you are doing.
While a linked list uses more memory, adding or removing elements is very efficient, as it doesn't require moving data around at all, whereas resizing a dynamic array means you have to allocate a whole new area in memory to fit the modified array with items added or removed. You can also sort a linked list without moving its data.
On the other hand, arrays can be substantially faster to iterate over due to caching, prefetching etc., as the data is placed sequentially in memory.
Which one is better for you will really depend on the application.

Can I use Day-Stout-Warren to rebalance a Binary Search Tree implemented in an array?

So, I've implemented a binary search tree backed by an array. The full implementation is here.
Because the tree is backed by an array, I determine left and right children by performing arithmetic on the current index.
    private Integer getLeftIdx(Integer rootIndex) {
        return 2 * rootIndex + 1;
    }

    private Integer getRightIdx(Integer rootIndex) {
        return 2 * rootIndex + 2;
    }
I've realized that this can become really inefficient as the tree becomes unbalanced, partly because the array will be sparsely populated, and partly because the tree height will increase, causing searches to tend towards O(n).
I'm looking at ways to rebalance the tree, but I keep coming across algorithms like Day-Stout-Warren which seem to rely on a linked-list implementation for the tree.
Is this just the tradeoff for an array implementation? I can't seem to think of a way to rebalance without creating a second array.
Imagine you have an array of length M that contains N items (with N < M, of course) at various positions, and you want to redistribute them into "valid positions" without changing their order.
To do that you can first walk through the array from end to start, packing all the items together at the end, and then walk through the array from start to end, moving an item into each valid position you find until you run out of items.
This easy problem is the same as your problem, except that you don't want to walk through the array in "index order"; you want to walk through it in binary in-order traversal order.
You want to move all the items into "valid positions", i.e. the part of the array corresponding to indexes < N, and you don't want to change their in-order traversal order.
So, walk the array in reverse in-order, packing items into the in-order-last possible positions. Then walk forward over the items in order, putting each item into the in-order-first available valid position until you run out of items.
BUT NOTE: This is fun to consider, but it's not going to make your tree efficient for inserts -- you have to do too many rebalancings to keep the array at a reasonable size.
BUT BUT NOTE: You don't actually have to rebalance the whole tree. When there's no free place for the insert, you only have to rebalance the smallest subtree on the path that has an extra space. I vaguely remember a result that I think applies, which suggests that the amortized cost of an insert using this method is O(log^2 N) when your array has a fixed number of extra levels. I'll do the math and figure out the real cost when I have time.
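The second phase described above (placing items into the valid positions in in-order) can be sketched in C as follows. To stay short, this sketch is a simplification and not the in-place method from the answer: it assumes the items have already been collected into a separate sorted array, which costs the auxiliary space the question wanted to avoid. The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Visits the heap-layout indices below n in in-order and writes the next
 * sorted item into each visited index. */
static void fill_in_order(int *tree, size_t n, size_t i,
                          const int *sorted, size_t *next)
{
    if (i >= n)
        return;
    fill_in_order(tree, n, 2 * i + 1, sorted, next);   /* left subtree */
    tree[i] = sorted[(*next)++];
    fill_in_order(tree, n, 2 * i + 2, sorted, next);   /* right subtree */
}

/* Rebuilds a BST stored in heap layout so that it occupies exactly the
 * indices 0..n-1. tree and sorted must not overlap. */
void rebuild_balanced(int *tree, const int *sorted, size_t n)
{
    assert(tree != NULL && sorted != NULL);
    size_t next = 0;
    fill_in_order(tree, n, 0, sorted, &next);
}
```

Because only indices below n are used, the result is a complete tree of minimal height, and the recursion depth is O(log n).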
I keep coming across algorithms like Day-Stout-Warren which seem to rely on a linked-list implementation for the tree.
That is not quite correct. The original paper discusses the case where the tree is embedded into an array. In fact, section 3 is devoted to the changes necessary. It shows how to do so with constant auxiliary space.
Note that there's a difference between their implementation and yours, though.
Your idea is to use a binary-heap order, where once you know a single-number index i, you can determine the indices of the children (or the parent). The array is, in general, not sorted in increasing indices.
The idea in the paper is to use an array sorted in increasing indices, and to compact the elements toward the beginning of the array on a rebalance. Using this implementation, you would not specify an element by an index i. Instead, as in binary search, you would indirectly specify an element by a pair (b, e), where the idea is that the index is implicitly specified as ⌊(b + e) / 2⌋, but the information allows you to determine how to go left or right.
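A small sketch of the (b, e) idea: a lookup that treats each range of a sorted array as a subtree whose root sits at the midpoint. Using a half-open range [b, e) is an assumption made for this example, and the function name is made up; the paper's exact conventions may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Searches sorted a[0..n), treating each range [b, e) as a subtree rooted
 * at index (b + e) / 2. Returns the index of key, or -1 if it is absent. */
long range_tree_find(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    size_t b = 0, e = n;
    while (b < e) {
        size_t root = b + (e - b) / 2;  /* same as (b + e) / 2, without overflow */
        if (key == a[root])
            return (long)root;
        if (key < a[root])
            e = root;                   /* left child is the range [b, root) */
        else
            b = root + 1;               /* right child is the range [root + 1, e) */
    }
    return -1;
}
```

No child indices are stored or computed from the node's index; the pair (b, e) alone determines the root and both children.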

What data structure to use to emulate an array in which one can add data in any position?

I want to store a small number of items (fewer than 255) which have constant size (a C char) and be able to do the following operations:
Insert a value at an arbitrary position and have the other items preserve their previous order.
Delete an item and have the other items preserve their order (as above).
Find the next and previous of an item.
I have tried using an array and writing a function that adds a value by moving all items after it one place forward. The same can be done for deletion, but it is too inefficient. Of course, I do not mind having to use a library, as long as it is readily available and free.
Array - access: O(1), insert: O(n)
Doubly linked list - access: O(n), previous/next: O(1), insert(*): O(1)
RB tree with the number of children stored in each node: O(log n) for all operations.
(*): You need to traverse the list first to get to the position (O(n)).
Note: no, the array is not messy, it's really simple to implement. Also, as you can see, depending on the usage, it can be quite efficient.
Based on the number of elements and your remark about the array implementation, you should stick to arrays.
You could use a doubly linked list for it. However, this won't work if you want to keep the array behaviour, e.g. accessing elements quickly by their index (O(1) for an array, O(n) for a linked list).

Resources