Why is array considered a data structure?

Why is an array considered a data structure? How is an array a data structure in terms of efficiency? Please explain by giving some examples.

It's a data structure because it's a collection of data together with the tools to work with it.
Primary features:
Extremely fast lookup by index.
Extremely fast index-order traversal.
Minimal memory footprint (not so with the optional modifications I mentioned).
Insertion is normally O(N) because you may need to copy the array when you reallocate the array to make space for new elements. However, you can bring the cost of appending down to amortized O(1) by over-allocating (i.e. by doubling the size of the array every time you reallocate).[1]
Deletion is O(N) because you will need to shift N/2 elements on average. You could keep track of the number of unused elements at the start and end of the array to make removals from the ends O(1).[1]
Lookup by index is O(1). It's a simple pointer addition.
Lookup by value is O(N). If the data is ordered, one can use a binary search to reduce this to O(log N).
Keeping track of the first used element and the last used element would technically qualify as a different data structure because the functions to access the data structure are different, but it would still be called an array.
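
The over-allocation idea described above can be sketched as follows. This is a minimal illustration, not a production container: the class name DynamicArray and the copies counter are hypothetical, added only to make the reallocation cost visible.

```python
class DynamicArray:
    """Append-only array that doubles its capacity whenever it is full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity
        self.copies = 0  # hypothetical counter: elements copied during reallocations

    def append(self, value):
        if self._size == self._capacity:
            self._grow()
        self._slots[self._size] = value
        self._size += 1

    def _grow(self):
        # Reallocate at twice the capacity and copy every element over: O(N) this time.
        new_slots = [None] * (2 * self._capacity)
        for i in range(self._size):
            new_slots[i] = self._slots[i]
            self.copies += 1
        self._slots = new_slots
        self._capacity *= 2

    def __getitem__(self, index):
        # Lookup by index is O(1): a bounds check plus one indexed read.
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._slots[index]

    def __len__(self):
        return self._size


arr = DynamicArray()
for i in range(1000):
    arr.append(i)
```

After 1000 appends the counter stays below 2 copies per element on average, which is what "amortized O(1)" means here, even though an individual append that triggers a reallocation is O(N).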

Related

Use cases of linked list in JavaScript especially in Reactjs /ExpressJS and nosql database [duplicate]

I use a lot of lists and arrays but I have yet to come across a scenario in which the array list couldn't be used just as easily as, if not easier than, the linked list. I was hoping someone could give me some examples of when the linked list is notably better.

Linked lists are preferable over arrays when:
you need constant-time insertions/deletions from the list (such as in real-time computing where time predictability is absolutely critical)
you don't know how many items will be in the list. With arrays, you may need to re-declare and copy memory if the array grows too big
you don't need random access to any elements
you want to be able to insert items in the middle of the list (such as a priority queue)
Arrays are preferable when:
you need indexed/random access to elements
you know the number of elements in the array ahead of time so that you can allocate the correct amount of memory for the array
you need speed when iterating through all the elements in sequence. You can use pointer math on the array to access each element, whereas in a linked list you need to look up each node by following a pointer, which may cause page faults and, in turn, performance hits.
memory is a concern. Filled arrays take up less memory than linked lists. Each element in the array is just the data. Each linked list node requires the data as well as one (or more) pointers to the other elements in the linked list.
Array lists (like those in .NET) give you the benefits of arrays but allocate storage dynamically, so you don't need to worry much about list size, and you can delete items at any index without re-shuffling elements yourself (the list still shifts the remaining elements internally, which costs O(n)). Performance-wise, array lists are slower than raw arrays.

Arrays have O(1) random access, but are really expensive to add stuff onto or remove stuff from.
Linked lists are really cheap to add or remove items anywhere and to iterate, but random access is O(n).

Algorithm               ArrayList   LinkedList
seek front              O(1)        O(1)
seek back               O(1)        O(1)
seek to index           O(1)        O(N)
insert at front         O(N)        O(1)
insert at back          O(1)        O(1)
insert after an item    O(N)        O(1)
ArrayLists are good for write-once-read-many or appenders, but bad at add/remove from the front or middle.

To add to the other answers, most array list implementations reserve extra capacity at the end of the list so that new elements can be added to the end of the list in O(1) time. When the capacity of an array list is exceeded, a new, larger array is allocated internally, and all the old elements are copied over. Usually, the new array is double the size of the old one. This means that on average, adding new elements to the end of an array list is an O(1) operation in these implementations. So even if you don't know the number of elements in advance, an array list may still be faster than a linked list for adding elements, as long as you are adding them at the end. Obviously, inserting new elements at arbitrary locations in an array list is still an O(n) operation.
Accessing elements in an array list is also faster than a linked list, even if the accesses are sequential. This is because array elements are stored in contiguous memory and can be cached easily. Linked list nodes can potentially be scattered over many different pages.
I would recommend only using a linked list if you know that you're going to be inserting or deleting items at arbitrary locations. Array lists will be faster for pretty much everything else.

The advantage of lists appears if you need to insert items in the middle and don't want to start resizing the array and shifting things around.
You're correct in that this is typically not the case. I've had a few very specific cases like that, but not too many.

It all depends on what type of operation you are performing. All data structures trade off time against memory, and we should choose the right one for our needs. So there are some cases where a LinkedList is faster than an array and vice versa. Consider the three basic operations on data structures.
Searching
Since an array is an index-based data structure, array.get(index) takes O(1) time. A linked list is not indexed, so you need to traverse up to index, where index <= n and n is the size of the linked list. An array is therefore faster than a linked list when you need random access to elements.
Q. So what's the beauty behind this?
Because arrays are contiguous blocks of memory, large chunks of them are loaded into the cache on first access, which makes it comparatively quick to access the remaining elements. The more elements we access in sequence, the better the locality of reference and the fewer the cache misses. Cache locality means the data being operated on is already in the cache, so operations execute much faster than when they go to main memory. Basically, with an array we maximize the chance that sequential element accesses hit the cache. Linked lists, on the other hand, aren't necessarily stored in contiguous blocks of memory, and there's no guarantee that items which appear sequentially in the list are actually near each other in memory. This means fewer cache hits, i.e. more cache misses, because we may need to read from main memory for every access to a linked list element, which increases access time and degrades performance. So if we are doing mostly random access operations, i.e. searching by index, an array will be faster, as explained above.
Insertion
This is easy and fast in a LinkedList, since insertion is an O(1) operation in LinkedList (in Java) compared with an array. Consider the case when the array is full: we need to copy its contents to a new array, which makes inserting an element into an ArrayList O(n) in the worst case. An ArrayList also needs to shift the following elements if you insert anywhere except at the end. A linked list never needs to be resized; you just need to update pointers.
Deletion
It works like insertion and is better in a LinkedList than in an array.
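
To make the "just update pointers" point concrete, here is a minimal sketch of a singly linked list next to a Python list (which is array-backed). The names Node, insert_after and to_list are hypothetical helpers written for this illustration, and the O(1) cost only holds once you already have a reference to the node you insert after.

```python
class Node:
    """One element of a singly linked list."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def insert_after(node, value):
    # O(1): only two references change, no other element moves.
    node.next = Node(value, node.next)
    return node.next


def to_list(head):
    # O(n) traversal, used here only to inspect the result.
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


# Linked list 1 -> 2 -> 4, then insert 3 after the node holding 2.
head = Node(1, Node(2, Node(4)))
second = head.next
insert_after(second, 3)

# Same insertion on an array-backed list: every element after index 2 is shifted.
array = [1, 2, 4]
array.insert(2, 3)
```

Both end up holding 1, 2, 3, 4. The difference is in the work done: the linked list touched two references, while the array-backed list had to move the tail of the array.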

Those are the most commonly used implementations of Collection.
ArrayList:
insert/delete at the end: generally O(1), worst case O(n)
insert/delete in the middle: O(n)
retrieve any position: O(1)
LinkedList:
insert/delete in any position: O(1) (only if you already have a reference to the element)
retrieve in the middle: O(n)
retrieve first or last element: O(1)
Vector: don't use it. It is an old implementation similar to ArrayList but with all methods synchronized. It is not the correct approach for a shared list in a multithreading environment.
HashMap
insert/delete/retrieve by key in O(1)
TreeSet
insert/delete/contains in O(log N)
HashSet
insert/remove/contains/size in O(1)

In practice, memory locality has a huge influence on performance.
The increased use of disk streaming rather than random access in "big data" processing shows how structuring your application around locality can dramatically improve performance on a larger scale.
If there is any way to access an array sequentially, that is by far the best-performing option. Designing with this as a goal should at least be considered if performance is important.

I think the main difference is whether you frequently need to insert or remove items at the top of the list.
With an array, if you remove something from the top of the list, then the complexity is O(n) because all of the following elements have to shift.
With a linked list, it is O(1): to insert you only create the node, set its next reference to the previous head, and reassign the head; to remove you just reassign the head.
When frequently inserting or removing at the end of the list, arrays are preferable because the complexity is O(1) and no reindexing is required, whereas for a singly linked list without a tail reference it is O(n) because you need to walk from the head to the last node.
Searching is O(n) in both a linked list and an unsorted array. A sorted array can be searched in O(log n) with binary search, but a linked list cannot, because binary search needs random access.

Hmm, I guess an ArrayList can be used in cases like the following:
you are not sure how many elements will be present
but you need to access all the elements randomly through indexing
For example, you need to import and access all elements in a contact list (the size of which is unknown to you).

Use linked list for Radix Sort over arrays and for polynomial operations.

1) As explained above, insert and remove operations perform well (O(1)) in LinkedList compared to ArrayList (O(n)). Hence, if the application requires frequent addition and deletion, LinkedList is the better choice.
2) Search (get method) operations are fast in ArrayList (O(1)) but not in LinkedList (O(n)), so if there are fewer add and remove operations and more search operations, ArrayList would be your best bet.

I did some benchmarking, and found that the List class is actually faster than LinkedList for random inserting:
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = 20000;
            Random rand = new Random(12345);

            // LinkedList: find a random existing value (an O(n) scan), then insert before it.
            Stopwatch watch = Stopwatch.StartNew();
            LinkedList<int> ll = new LinkedList<int>();
            ll.AddLast(0);
            for (int i = 1; i < count; i++)
            {
                ll.AddBefore(ll.Find(rand.Next(i)), i);
            }
            Console.WriteLine("LinkedList/Random Add: {0}ms", watch.ElapsedMilliseconds);

            // List: find the index of a random existing value (also O(n)), then insert there.
            watch = Stopwatch.StartNew();
            List<int> list = new List<int>();
            list.Add(0);
            for (int i = 1; i < count; i++)
            {
                list.Insert(list.IndexOf(rand.Next(i)), i);
            }
            Console.WriteLine("List/Random Add: {0}ms", watch.ElapsedMilliseconds);

            Console.ReadLine();
        }
    }
}
It takes 900 ms for the linked list and 100 ms for the List class.
It creates lists of consecutive integers. Each new integer is inserted before a randomly chosen number that is already in the list.
Maybe the List class uses something better than just an array.

Arrays, by far, are the most widely used data structures. However, linked lists prove useful in their own unique way where arrays are clumsy - or expensive, to say the least.
Linked lists are useful to implement stacks and queues in situations where their size is subject to vary. Each node in the linked list can be pushed or popped without disturbing the majority of the nodes. Same goes for insertion/deletion of nodes somewhere in the middle. In arrays, however, all the elements have to be shifted, which is an expensive job in terms of execution time.
Binary trees and binary search trees, hash tables, and tries are some of the data structures wherein - at least in C - you need linked lists as a fundamental ingredient for building them up.
However, linked lists should be avoided in situations where it is expected to be able to call any arbitrary element by its index.

A simple answer to the question can be given using these points:
Arrays are to be used when a collection of data elements of the same type is required, whereas a linked list is a collection of linked elements, known as nodes, which can hold data of mixed types.
In an array, one can visit any element in O(1) time, whereas in a linked list we need to traverse the list from the head to the required node, taking O(n) time.
For arrays, a specific size needs to be declared initially. But linked lists are dynamic in size.

What data structure to use here

Hashes provide an excellent mechanism to extract values corresponding to some given key in almost O(1) time. But it never preserves the order in which the keys are inserted. So is there any data structure which can simulate the best of array as well as hash, that is, return the value corresponding to a given key in O(1) time, as well as returning the nth value inserted in O(1) time? The ordering should be maintained, i.e., if the hash is {a:1,b:2,c:3}, and something like del hash[b] has been done, nth(2) should return {c,3}.
Examples:
hash = {};
hash[a] = 1;
hash[b] = 2;
hash[c] = 3;
nth(2); //should return 2
hash[d] = 4;
del hash[c];
nth(3); //should return 4, as 'd' has been shifted up
Using modules like TIE::Hash or similar stuff won't do, the onus is on me to develop it from scratch!

It depends on how much memory may be allocated for this data structure. For O(N) space there are several choices:
It's easy to get a data structure with O(1) time for each of these operations: "get value by key", "get nth value inserted", "insert" - but only when "delete" time is O(N). Just use combination of a hash map and an array, as explained by ppeterka.
Less obvious, but still simple is O(sqrt N) for "delete" and O(1) for all other operations.
A little more complicated is "delete" in O(N^(1/4)), O(N^(1/6)), or, in the general case, in O(M*N^(1/M)) time.
It is most likely impossible to decrease "delete" time to O(log N) while retaining O(1) for the other operations. But it is possible if you accept O(log N) time for every operation. Solutions based on a binary search tree or on a skip list allow it. One option is an order statistic tree: you can augment every node of a binary search tree with a counter storing the number of elements in the sub-tree under that node, then use it to find the nth node. Another option is an indexable skip list. One more option is to use the O(M*N^(1/M)) solution with M = log(N).
And I don't think you can get O(1) "delete" without increasing time for other operations even more.
If unlimited space is available, you can do every operation in O(1) time.
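
As a rough illustration of the "O(log N) for nth and delete" trade-off in the list above, here is a sketch that swaps in a Fenwick tree (binary indexed tree) over insertion slots instead of the order statistic tree or indexable skip list named there. It is not the answer's own method. The class name LazyOrderedMap, its method names, and the fixed capacity are all assumptions made for this example; deleted slots are simply marked dead and never reused.

```python
class LazyOrderedMap:
    """Hash map plus a Fenwick tree counting live insertion slots (fixed capacity)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0] * (capacity + 1)       # Fenwick tree, 1-based
        self.slots = [None] * (capacity + 1)   # slot -> (key, value), None if deleted
        self.pos = {}                          # key -> slot
        self.used = 0                          # slots handed out so far

    def _add(self, i, delta):
        while i <= self.capacity:
            self.tree[i] += delta
            i += i & -i

    def insert(self, key, value):
        if key in self.pos:                    # existing key: overwrite in place
            self.slots[self.pos[key]] = (key, value)
            return
        if self.used == self.capacity:
            raise OverflowError("capacity exhausted in this sketch")
        self.used += 1
        self.slots[self.used] = (key, value)
        self.pos[key] = self.used
        self._add(self.used, 1)                # O(log N)

    def delete(self, key):
        slot = self.pos.pop(key)
        self.slots[slot] = None
        self._add(slot, -1)                    # O(log N), nothing is shifted

    def get(self, key):
        return self.slots[self.pos[key]][1]    # O(1) expected

    def nth(self, n):
        """Return (key, value) of the n-th live element in insertion order, 1-based."""
        if n < 1 or n > len(self.pos):
            raise IndexError(n)
        idx, remaining = 0, n
        step = 1 << self.capacity.bit_length()
        while step:                            # binary descent over the Fenwick tree
            nxt = idx + step
            if nxt <= self.capacity and self.tree[nxt] < remaining:
                idx = nxt
                remaining -= self.tree[nxt]
            step >>= 1
        return self.slots[idx + 1]


m = LazyOrderedMap(capacity=8)
m.insert("a", 1)
m.insert("b", 2)
m.insert("c", 3)
second = m.nth(2)
m.insert("d", 4)
m.delete("c")
third = m.nth(3)
```

With the question's example, nth(2) yields the entry for b, and after inserting d and deleting c, nth(3) yields the entry for d. The price compared with a plain array is that nth is O(log N) rather than O(1).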
O(sqrt N) "delete"
You can use a combination of two data structures to find a value by key and to find a value by its insertion order. The first one is a hash map (mapping each key to both its value and a position in the other structure). The second one is a tiered vector, which maps a position to both value and key.
A tiered vector is a relatively simple data structure that may easily be developed from scratch. The main idea is to split the array into sqrt(N) smaller arrays, each of size sqrt(N). Each small array needs only O(sqrt N) time to shift values after a deletion. And since each small array is implemented as a circular buffer, neighbouring small arrays can exchange a single element in O(1) time, which allows the "delete" operation to complete in O(sqrt N) time (one such exchange for each sub-array between the deleted value and the first/last sub-array). A tiered vector also allows insertion into the middle in O(sqrt N), but this problem does not require it, so we can just append a new element at the end in O(1) time. To access an element by its position, we need to determine the starting position of the circular buffer for the sub-array where the element is stored, then get the element from that circular buffer; this also needs O(1) time.
Since the hash map remembers a position in the tiered vector for each of its keys, it should be updated whenever an element in the tiered vector changes position (O(sqrt N) hash map updates for each "delete").
O(M*N^(1/M)) "delete"
To optimize the "delete" operation even more, you can use the approach described in this answer. It deletes elements lazily and uses a trie to adjust an element's position, taking deleted elements into account.
O(1) for every operation
You can use a combination of three data structures to do this. First one is a hash map (mapping key to both value and a position in the array). Second one is an array, which maps position to both value and key. And third one is a bit set, one bit for each element of the array.
"Insert" operation just adds one more element to the array's end and inserts it into hash map.
"Delete" operation just unsets corresponding bit in the bit set (which is initialized with every bit = 1). Also it deletes corresponding entry from hash map. (It does not move elements of array or bit set). If, after "delete" the bit set has more than some constant proportion of elements deleted (like 10%), the whole data structure should be re-created from scratch (this allows O(1) amortized time).
"Find by key" is trivial, only hash map is used here.
"Find by position" requires some pre-processing. Prepare a 2D array. One index is the position we search. Other index is current state of our data structure, the bit set, reinterpreted as an index. Calculate population count for each prefix of every possible bit set and store prefix length, indexed by both population count and the bit set itself. Having this 2D array ready, you can perform this operation by first indexing by position and current "state" in this 2D array, then by indexing in the array with values.
Time complexity for every operation is O(1) (for insert/delete it is O(1) amortized). Space complexity is O(N 2^N).
In practice, using the whole bit set to index an array limits the allowed value of N by the pointer size (usually 64); even more, it is limited by available memory. To alleviate this, we can split both the array and the bit set into sub-arrays of size N/C, where C is some constant. Now we can use a smaller 2D array to find the nth element in each sub-array. And to find the nth element in the whole structure, we need an additional structure to record the number of valid elements in each sub-array. This is a structure of constant size C, so every operation on it is also O(1). This additional structure may be implemented as an array, but it is better to use some logarithmic-time structure like an indexable skip list. After this modification, time complexity for every operation is still O(1); space complexity is O(N 2^(N/C)).

Now that the question is clear to me too (better late than never...), here are my proposals:
You could maintain two hashes: one with the keys, and one with the insert order. This, however, is very ugly and slow to maintain when deleting and inserting in between. It would give the same almost-O(1) time needed to access the elements both ways.
You could use a hash for the keys, and maintain an array for the insert order. This one is a lot nicer than the two-hash approach; deleting is still not very fast, but I think it is still a lot quicker than with the two-hash approach. This also gives true O(1) access to the nth element.
At first, I misunderstood the question and gave a solution that gives O(1) key lookup and O(n) lookup of the nth element:
In Java, there is the LinkedHashMap for this particular task.
I think, however, that if someone finds this page, this might not be totally useless, so I leave it here...
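
The hash-plus-array proposal can be sketched roughly like this. The class name InsertionOrderedMap and its method names are hypothetical, nth is assumed to be 1-based as in the question, and delete is deliberately left at O(n), as this proposal accepts.

```python
class InsertionOrderedMap:
    """Hash for key lookup plus an array that records insertion order."""

    def __init__(self):
        self._values = {}   # key -> value, O(1) expected lookup
        self._order = []    # keys in insertion order, O(1) access by position

    def insert(self, key, value):
        if key not in self._values:
            self._order.append(key)   # amortized O(1) append
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def nth(self, n):
        """Return (key, value) of the n-th inserted element still present, 1-based."""
        if n < 1:
            raise IndexError(n)
        key = self._order[n - 1]
        return key, self._values[key]

    def delete(self, key):
        del self._values[key]
        self._order.remove(key)       # O(n): later keys shift down by one


h = InsertionOrderedMap()
h.insert("a", 1)
h.insert("b", 2)
h.insert("c", 3)
before = h.nth(2)
h.insert("d", 4)
h.delete("c")
after = h.nth(3)
```

Following the question's example, nth(2) gives the entry for b, and after adding d and deleting c, nth(3) gives the entry for d, because d has shifted up in the order array.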

There is no data structure that is O(1) for everything you cited. In particular, any data structure with random dynamic insertion/deletion in the middle AND sorted/indexed access cannot have a maintenance time lower than O(log N). To maintain such a dynamic collection you have to resort either to the "less than" operator (binary, thus O(log2 N)) or to some computed organization (typically O(sqrt N), by using sqrt(N) sub-arrays). Note that O(sqrt N) > O(log N).
So, no.
You might reach O(1) for everything including keeping order with the linked list+hash map, and if access is mostly sequential, you could cache nth(x), to access nth(x+/-1) in O(1).

I guess only a plain array will give you O(1); the best variant is to look for a solution that gives O(n) in the worst case. You can also use a really, really bad approach: using the key as an index into a plain array. I guess there is a way to transform any key into an index into a plain array.
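#include <string>

std::string memoryMap[0x10000];  // one slot per possible 16-bit key

int main() {
    int key = 100;
    std::string value = "Hello, World!";
    memoryMap[key] = value;  // the key is used directly as the array index
}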

Inserting a number into a sorted array!

I would like to write a piece of code for inserting a number into a sorted array at the appropriate position (i.e. the array should still remain sorted after insertion)
My data structure doesn't allow duplicates.
I am planning to do something like this:
Find the right index where I should be putting this element using binary search
Create space for this element, by moving all the elements from that index down.
Put this element there.
Is there any other better way?
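
The three steps above can be sketched as follows, using the standard bisect module for the binary search. The function name insert_unique is hypothetical, and rejecting duplicates by returning False is an assumption about how "doesn't allow duplicates" should behave.

```python
import bisect


def insert_unique(sorted_values, value):
    """Insert value into an already sorted list, keeping it sorted.

    Returns False and leaves the list unchanged if value is already present.
    """
    # Step 1: binary search for the insertion index, O(log n).
    index = bisect.bisect_left(sorted_values, value)
    if index < len(sorted_values) and sorted_values[index] == value:
        return False
    # Steps 2 and 3: list.insert shifts the tail down by one and stores the value, O(n).
    sorted_values.insert(index, value)
    return True


data = [10, 20, 40, 50]
added = insert_unique(data, 30)
duplicate = insert_unique(data, 20)
```

The search is logarithmic, but the shift still makes the whole insertion O(n), which is why the answers below discuss trees and heaps as alternatives.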

If you really have an array and not a better data structure, that's optimal. If you're flexible on the implementation, take a look at AA Trees - They're rather fast and easy to implement. Obviously, takes more space than array, and it's not worth it if the number of elements is not big enough to notice the slowness of the blit as compared to pointer magic.

Does the data have to be completely sorted all the time?
If not, and it is only necessary to access the smallest or highest element quickly, a binary heap gives constant access time and O(log n) addition and deletion time.
Moreover, it can satisfy your condition that the memory should be consecutive, since you can implement a binary heap on top of an array (i.e. array[2n+1] is the left child and array[2n+2] is the right child of array[n]).
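
A minimal sketch of the array-backed binary heap, using Python's standard heapq module (a min-heap stored in a plain list). The helper is_min_heap is hypothetical and only checks the 2n+1 / 2n+2 child layout mentioned above.

```python
import heapq


def is_min_heap(a):
    """Check that every parent a[n] is <= its children a[2n+1] and a[2n+2]."""
    size = len(a)
    for n in range(size):
        for child in (2 * n + 1, 2 * n + 2):
            if child < size and a[n] > a[child]:
                return False
    return True


heap = []
for value in [7, 3, 9, 1, 5]:
    heapq.heappush(heap, value)   # O(log n) per insertion

smallest = heap[0]                # O(1) access to the minimum
layout_ok = is_min_heap(heap)
popped = heapq.heappop(heap)      # O(log n) removal of the minimum
```

Note that the list is only partially ordered: the minimum is always at index 0, but the remaining elements are not in sorted order, which is the trade-off this answer is pointing at.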

A heap based implementation of a tree would be more efficient if you are inserting a lot of elements - log n for both locating/removing and inserting operations.

Efficient data structure for fast random access, search, insertion and deletion

I'm looking for a data structure (or structures) that would allow me to keep an ordered list of integers, no duplicates, with indexes and values in the same range.
I need four main operations to be efficient, in rough order of importance:
taking the value from a given index
finding the index of a given value
inserting a value at a given index
deleting a value at a given index
Using an array I have 1 at O(1), but 2 is O(N) and insertion and deletions are expensive (O(N) as well, I believe).
A Linked List has O(1) insertion and deletion (once you have the node), but 1 and 2 are O(N) thus negating the gains.
I tried keeping two arrays a[index]=value and b[value]=index, which turn 1 and 2 into O(1) but turn 3 and 4 into even more costly operations.
Is there a data structure better suited for this?

I would use a red-black tree to map keys to values. This gives you O(log(n)) for 1, 3, 4. It also maintains the keys in sorted order.
For 2, I would use a hash table to map values to keys, which gives you O(1) performance. It also adds O(1) overhead for keeping the hash table updated when adding and deleting keys in the red-black tree.

How about using a sorted array with binary search?
Insertion and deletion are slow, but given that the data are plain integers, they could be optimized with calls to memmove() if you are using C or C++ (memcpy() is not safe for the overlapping ranges involved when shifting within one array). If you know the maximum size of the array, you can even avoid any memory allocations during the usage of the array, as you can preallocate it to the maximum size.
The "best" approach depends on how many items you need to store and how often you will need to insert/delete compared to finding. If you rarely insert or delete, a sorted array with O(1) access to the values is certainly better, but if you insert and delete things frequently, a binary tree can be better than the array. For a small enough n the array most likely beats the tree in any case.
If storage size is of concern, the array is better than the trees, too. Trees also need to allocate memory for every item they store, and the overhead of the memory allocation can be significant as you only store small values (integers).
You may want to profile which is faster: copying the integers when you insert/delete from the sorted array, or the tree with its memory (de)allocations.

I don't know what language you're using, but if it's Java you can leverage LinkedHashMap or a similar Collection. It's got all of the benefits of a List and a Map, provides constant time for most operations, and has the memory footprint of an elephant. :)
If you're not using Java, the idea of a LinkedHashMap is probably still suitable for a usable data structure for your problem.

Use a vector for the array access.
Use a map as a search index to the subscript into the vector.
Given a subscript, fetch the value from the vector: O(1).
Given a key, use the map to find the subscript of the value: O(log N).
To insert a value, push it back on the vector, O(1) amortized, and insert the subscript into the map, O(log N).
To delete a value, delete it from the map: O(log N).

How about a TreeMap? O(log n) for the operations described.

I like balanced binary trees a lot. They are sometimes slower than hash tables or other structures, but they are much more predictable; they are generally O(log n) for all operations. I would suggest using a Red-black tree or an AVL tree.

How to achieve 2 with RB-trees? We can make them count their children with every insert/delete operation. This doesn't make these operations last significantly longer. Then going down the tree to find the i-th element is possible in O(log n) time. But I see no implementation of this method in Java or the STL.

If you're working in .NET, then according to the MS docs http://msdn.microsoft.com/en-us/library/f7fta44c.aspx
SortedDictionary and SortedList both have O(log n) for retrieval
SortedDictionary has O(log n) for insert and delete operations, whereas SortedList has O(n).
The two differ by memory usage and speed of insertion/removal. SortedList uses less memory than SortedDictionary. If the SortedList is populated all at once from sorted data, it's faster than SortedDictionary. So it depends on the situation as to which is really the best for you.
Also, your argument for the Linked List is not really valid as it might be O(1) for the insert, but you have to traverse the list to find the insertion point, so it's really not.

When to use a linked list over an array/array list?

I use a lot of lists and arrays but I have yet to come across a scenario in which the array list couldn't be used just as easily as, if not easier than, the linked list. I was hoping someone could give me some examples of when the linked list is notably better.

Linked lists are preferable over arrays when:
you need constant-time insertions/deletions from the list (such as in real-time computing where time predictability is absolutely critical)
you don't know how many items will be in the list. With arrays, you may need to re-declare and copy memory if the array grows too big
you don't need random access to any elements
you want to be able to insert items in the middle of the list (such as a priority queue)
Arrays are preferable when:
you need indexed/random access to elements
you know the number of elements in the array ahead of time so that you can allocate the correct amount of memory for the array
you need speed when iterating through all the elements in sequence. You can use pointer math on the array to access each element, whereas you need to lookup the node based on the pointer for each element in linked list, which may result in page faults which may result in performance hits.
memory is a concern. Filled arrays take up less memory than linked lists. Each element in the array is just the data. Each linked list node requires the data as well as one (or more) pointers to the other elements in the linked list.
Array Lists (like those in .Net) give you the benefits of arrays, but dynamically allocate resources for you so that you don't need to worry too much about list size and you can delete items at any index without any effort or re-shuffling elements around. Performance-wise, arraylists are slower than raw arrays.

Arrays have O(1) random access, but are really expensive to add stuff onto or remove stuff from.
Linked lists are really cheap to add or remove items anywhere and to iterate, but random access is O(n).

Algorithm ArrayList LinkedList
seek front O(1) O(1)
seek back O(1) O(1)
seek to index O(1) O(N)
insert at front O(N) O(1)
insert at back O(1) O(1)
insert after an item O(N) O(1)
ArrayLists are good for write-once-read-many or appenders, but bad at add/remove from the front or middle.

To add to the other answers, most array list implementations reserve extra capacity at the end of the list so that new elements can be added to the end of the list in O(1) time. When the capacity of an array list is exceeded, a new, larger array is allocated internally, and all the old elements are copied over. Usually, the new array is double the size of the old one. This means that on average, adding new elements to the end of an array list is an O(1) operation in these implementations. So even if you don't know the number of elements in advance, an array list may still be faster than a linked list for adding elements, as long as you are adding them at the end. Obviously, inserting new elements at arbitrary locations in an array list is still an O(n) operation.
Accessing elements in an array list is also faster than a linked list, even if the accesses are sequential. This is because array elements are stored in contiguous memory and can be cached easily. Linked list nodes can potentially be scattered over many different pages.
I would recommend only using a linked list if you know that you're going to be inserting or deleting items at arbitrary locations. Array lists will be faster for pretty much everything else.

The advantage of lists appears if you need to insert items in the middle and don't want to start resizing the array and shifting things around.
You're correct in that this is typically not the case. I've had a few very specific cases like that, but not too many.

It all depends what type of operation you are doing while iterating , all data structures have trade off between time and memory and depending on our needs we should choose the right DS. So there are some cases where LinkedList are faster then array and vice versa . Consider the three basic operation on data structures.
Searching
Since array is index based data structure searching array.get(index) will take O(1) time while linkedlist is not index DS so you will need to traverse up to index , where index <=n , n is size of linked list , so array is faster the linked list when have random access of elements.
Q.So what's the beauty behind this ?
As Arrays are contiguous memory blocks, large chunks of them will be loaded into the cache upon first access this makes it comparatively quick to access remaining elements of the array,as much as we access the elements in array locality of reference also increases thus less catch misses, Cache locality refers to the operations being in the cache and thus execute much faster as compared to in memory,basically In array we maximize the chances of sequential element access being in the cache. While Linked lists aren't necessarily in contiguous blocks of memory, there's no guarantee that items which appear sequentially in the list are actually arranged near each-other in memory, this means fewer cache hits e.g. more cache misses because we need to read from memory for every access of linked list element which increases the time it takes to access them and degraded performance so if we are doing more random access operation aka searching , array will be fast as explained below.
Insertion
This is easy and fast in a LinkedList: insertion is an O(1) operation in LinkedList (in Java) once you are at the insertion point, compared to an array. Consider the case when the array is full: we need to copy the contents to a new array, which makes inserting an element into an ArrayList O(n) in the worst case. An ArrayList also needs to shift the following elements if you insert anywhere except at the end. With a linked list there is nothing to resize; you just need to update pointers.
Deletion
It works like insertion and is better in LinkedList than in an array.
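As a rough illustration of the insertion case, the sketch below inserts into the middle of a java.util.ArrayList and a java.util.LinkedList. The class and method names are made up for the example. Note that with LinkedList you still pay O(n) to walk to the position; only the insert itself is a constant-time relinking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Sketch only: the class and method names are illustrative.
final class MiddleInsertSketch {
    // ArrayList: every element after `index` is shifted one slot to the right.
    static List<Integer> insertIntoArrayList(List<Integer> source, int index, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.add(index, value);
        return copy;
    }

    // LinkedList: the iterator walks to `index`, then add() only relinks neighbours.
    static List<Integer> insertIntoLinkedList(List<Integer> source, int index, int value) {
        LinkedList<Integer> copy = new LinkedList<>(source);
        ListIterator<Integer> it = copy.listIterator(index);
        it.add(value);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 4, 5);
        System.out.println(insertIntoArrayList(source, 2, 3));  // prints [1, 2, 3, 4, 5]
        System.out.println(insertIntoLinkedList(source, 2, 3)); // prints [1, 2, 3, 4, 5]
    }
}
```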

These are the most commonly used collection implementations.
ArrayList:
insert/delete at the end: generally O(1), worst case O(n)
insert/delete in the middle: O(n)
retrieve any position: O(1)
LinkedList:
insert/delete in any position: O(1) (note: only if you already have a reference to the element)
retrieve in the middle: O(n)
retrieve first or last element: O(1)
Vector: don't use it. It is an old implementation similar to ArrayList but with all methods synchronized. It is not the correct approach for a shared list in a multithreaded environment.
HashMap
insert/delete/retrieve by key in O(1) on average
TreeSet
insert/delete/contains in O(log N)
HashSet
insert/remove/contains in O(1) on average, size in O(1)
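For reference, a small sketch exercising one typical operation on each of these java.util classes. The demo method and its sample data are invented for the example; the costs in the comments are the ones listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: demo() and its sample values are illustrative.
final class CollectionOpsSketch {
    static String demo() {
        List<String> arrayList = new ArrayList<>();
        arrayList.add("a");           // append at the end
        arrayList.add("c");
        arrayList.add(1, "b");        // middle insert, shifts "c" to the right

        Deque<String> linked = new LinkedList<>();
        linked.addFirst("x");         // O(1) at the head
        linked.addLast("y");          // O(1) at the tail

        Map<String, Integer> ages = new HashMap<>();
        ages.put("ann", 30);          // hashed insert by key

        Set<Integer> sorted = new TreeSet<>(Arrays.asList(3, 1, 2)); // kept in sorted order
        Set<Integer> seen = new HashSet<>(Arrays.asList(1, 2, 2, 3)); // duplicate 2 is dropped

        return arrayList.get(1) + " " + linked.getFirst() + " " + ages.get("ann")
                + " " + sorted + " " + seen.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints b x 30 [1, 2, 3] 3
    }
}
```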

In reality, memory locality has a huge influence on performance in real processing.
The increased use of disk streaming rather than random access in "big data" processing shows how structuring your application around this can dramatically improve performance on a larger scale.
If there is any way to access an array sequentially, that is by far the best-performing option. Designing with this as a goal should at least be considered if performance is important.

I think the main difference is whether you frequently need to insert or remove stuff at the top of the list.
With an array, if you remove something from the top of the list then the complexity is O(n), because all of the remaining array elements have to shift.
With a linked list, it is O(1): to insert you need only create the node, make it the head, and point its next reference at the previous head; to remove you just move the head to the next node.
When frequently inserting or removing at the end of the list, arrays are preferable because the complexity is O(1) and no reindexing is required, but for a singly linked list without a tail reference it is O(n) because you need to go from the head to the last node.
Searching by value is O(n) in both unless the data is sorted. On a sorted array you can use a binary search for O(log n), but a linked list stays O(n) because it has no constant-time access to its middle element.
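Binary search needs sorted data and constant-time index access, so it fits arrays rather than linked lists. A minimal sketch, assuming a sorted int array and a java.util.LinkedList of the same values (the wrapper method names are made up):

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Sketch only: the wrapper methods are illustrative.
final class SearchSketch {
    // O(log n): Arrays.binarySearch requires the array to be sorted.
    // Returns the index of the key, or a negative number if it is absent.
    static int searchSortedArray(int[] sorted, int key) {
        return Arrays.binarySearch(sorted, key);
    }

    // O(n): indexOf walks the list node by node; returns -1 if absent.
    static int searchLinkedList(List<Integer> list, int key) {
        return list.indexOf(key);
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 13, 21};
        List<Integer> linked = new LinkedList<>(Arrays.asList(2, 5, 8, 13, 21));
        System.out.println(searchSortedArray(sorted, 13)); // prints 3
        System.out.println(searchLinkedList(linked, 13));  // prints 3
    }
}
```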

Hmm, ArrayList can be used in cases like the following, I guess:
you are not sure how many elements will be present
but you need to access all the elements randomly through indexing
For example, you need to import and access all elements in a contact list (the size of which is unknown to you)
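Something like the following sketch, where importContacts is a hypothetical helper and the names are made-up sample data:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: importContacts and the sample names are illustrative.
final class ContactListSketch {
    // Collects contacts from any source without knowing the count up front.
    static List<String> importContacts(Iterable<String> source) {
        List<String> contacts = new ArrayList<>();
        for (String name : source) {
            contacts.add(name); // the backing array grows as needed
        }
        return contacts;
    }

    public static void main(String[] args) {
        List<String> contacts = importContacts(Arrays.asList("Ada", "Grace", "Linus"));
        System.out.println(contacts.size()); // prints 3
        System.out.println(contacts.get(1)); // prints Grace, O(1) lookup by index
    }
}
```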

Use linked list for Radix Sort over arrays and for polynomial operations.
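For the polynomial case, one possible sketch is to keep each polynomial as a linked list of terms in descending exponent order and merge two such lists to add them. The Term class and the add and format helpers are hypothetical names for this illustration only.

```java
// Sketch only: Term, add and format are illustrative helpers.
final class PolynomialSketch {
    static final class Term {
        final int coeff;
        final int exp;
        Term next;
        Term(int coeff, int exp, Term next) {
            this.coeff = coeff;
            this.exp = exp;
            this.next = next;
        }
    }

    // Adds two polynomials whose terms are linked in descending exponent order.
    static Term add(Term a, Term b) {
        Term dummy = new Term(0, 0, null); // placeholder head, dropped at the end
        Term tail = dummy;
        while (a != null && b != null) {
            if (a.exp > b.exp) {
                tail = tail.next = new Term(a.coeff, a.exp, null);
                a = a.next;
            } else if (a.exp < b.exp) {
                tail = tail.next = new Term(b.coeff, b.exp, null);
                b = b.next;
            } else {
                int sum = a.coeff + b.coeff;
                if (sum != 0) { // drop terms that cancel out
                    tail = tail.next = new Term(sum, a.exp, null);
                }
                a = a.next;
                b = b.next;
            }
        }
        // Copy whatever is left of the longer polynomial.
        for (Term rest = (a != null) ? a : b; rest != null; rest = rest.next) {
            tail = tail.next = new Term(rest.coeff, rest.exp, null);
        }
        return dummy.next;
    }

    static String format(Term p) {
        StringBuilder sb = new StringBuilder();
        for (Term t = p; t != null; t = t.next) {
            if (sb.length() > 0) sb.append(" + ");
            sb.append(t.coeff).append("x^").append(t.exp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Term p = new Term(3, 2, new Term(2, 0, null)); // 3x^2 + 2
        Term q = new Term(5, 2, new Term(4, 1, null)); // 5x^2 + 4x
        System.out.println(format(add(p, q)));         // prints 8x^2 + 4x^1 + 2x^0
    }
}
```

Terms with a zero coefficient never need a slot, which is where the linked representation is convenient for sparse polynomials.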

1) As explained above, insert and remove operations perform well (O(1)) in LinkedList compared to ArrayList (O(n)), provided you already hold a position in the list. Hence if the application requires frequent additions and deletions, LinkedList is the better choice.
2) Search (get method) operations are fast in ArrayList (O(1)) but not in LinkedList (O(n)), so if there are fewer add and remove operations and more search operations, ArrayList would be your best bet.

I did some benchmarking, and found that the List class is actually faster than LinkedList for random inserting:
using System;
using System.Collections.Generic;
using System.Diagnostics;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = 20000;
            Random rand = new Random(12345);
            // LinkedList: Find is a linear search, AddBefore relinks two nodes.
            Stopwatch watch = Stopwatch.StartNew();
            LinkedList<int> ll = new LinkedList<int>();
            ll.AddLast(0);
            for (int i = 1; i < count; i++)
            {
                ll.AddBefore(ll.Find(rand.Next(i)), i);
            }
            Console.WriteLine("LinkedList/Random Add: {0}ms", watch.ElapsedMilliseconds);
            // List: IndexOf is a linear search, Insert shifts the following elements.
            watch = Stopwatch.StartNew();
            List<int> list = new List<int>();
            list.Add(0);
            for (int i = 1; i < count; i++)
            {
                list.Insert(list.IndexOf(rand.Next(i)), i);
            }
            Console.WriteLine("List/Random Add: {0}ms", watch.ElapsedMilliseconds);
            Console.ReadLine();
        }
    }
}
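It takes 900 ms for the linked list and 100 ms for the List class.
It creates lists of consecutive integers. Each new integer is inserted before a randomly chosen number which is already in the list.
Note that both loops first do a linear search (Find and IndexOf) for the random value, so the timings include that search and not just the insert.
Maybe the List class uses something better than just an array.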

Arrays are, by far, the most widely used data structures. However, linked lists prove useful in their own way where arrays are clumsy, or expensive, to say the least.
Linked lists are useful for implementing stacks and queues in situations where their size is subject to vary. Each node in the linked list can be pushed or popped without disturbing the majority of the nodes. The same goes for insertion/deletion of nodes somewhere in the middle. In arrays, however, all the following elements have to be shifted, which is an expensive job in terms of execution time.
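A minimal sketch of such a stack built on linked nodes, in Java. The class and its methods are written for this example only; push and pop touch nothing but the top node, so no other element moves.

```java
import java.util.NoSuchElementException;

// Sketch only: a hand-rolled stack of ints backed by singly linked nodes.
final class LinkedStackSketch {
    private static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node top;
    private int size;

    // O(1): the new node simply points at the old top.
    void push(int value) {
        top = new Node(value, top);
        size++;
    }

    // O(1): unlink the top node and return its value.
    int pop() {
        if (top == null) {
            throw new NoSuchElementException("stack is empty");
        }
        int value = top.value;
        top = top.next;
        size--;
        return value;
    }

    boolean isEmpty() {
        return top == null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        LinkedStackSketch stack = new LinkedStackSketch();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack.pop());  // prints 3
        System.out.println(stack.size()); // prints 2
    }
}
```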
Binary trees and binary search trees, hash tables, and tries are some of the data structures for which, at least in C, you need linked nodes (the building block of linked lists) as a fundamental ingredient.
However, linked lists should be avoided in situations where you expect to access an arbitrary element by its index.

A simple answer to the question can be given using these points:
Arrays are to be used when a collection of data elements of the same type is required. A linked list, by contrast, is a collection of linked elements known as nodes, each holding a data item and a reference to the next node.
In an array, one can visit any element in O(1) time. In a linked list, we need to traverse from the head to the required node, taking O(n) time in the worst case.
For plain arrays, a specific size needs to be declared initially. Linked lists, however, are dynamic in size.
