Which data-structure is used widely in C? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
A couple of days ago I asked myself which data-structure I should in a function in C.
I usually write in C++ and the choice would have fallen to std::vector.
There a some possible choices:
a static (big enough) array
a dynamic array which grows when needed(e.g. doubling its size)
an own list implementation as struct with a pointer next
The last option seems to be unusual. Are there any bigger project where
someone uses own structures like lists?
Is there a general rule for the decision between array or own
data-structures?
When I would need a tree structure I wouldn't think twice an just write a tree.
Are there any widely used libs with such structures(like boost for C++)?
Or is this considered as bad style because you would have to store a void* instead of
the actual type?
Thank a lot for your experience!

Different data structures offer different computational complexity for insertions, lookups and other operations.
To take your specific example, there is a number of differences between an array and a linked list:
lookup by index is O(1) in an array and O(n) in a list;
insertions into a list are O(1) or O(n) (depending on whether they require traversal), and into an array are amortized O(1) if done well;
deletions from an array are O(n) whereas certain types of list deletions can be done in O(1) time;
arrays offer better locality of reference and consequently better cache performance.
You might find the following page useful: http://essays.hexapodia.net/datastructures/
As a general rule, when choosing a data structure I first consider whether I have strong reasons to believe that performance of the code in question is going to be important:
if I don't, I choose the simplest data structure that would do the job, or the one that lends itself to the clearest code;
if I do, I think carefully about what operations I am going to perform on the structure and choose accordingly, possibly followed by profiling.
As for recommendations for good C data structure libraries, take a look at Are there any open source C libraries with common data structures?

It completely depends on which operations you are going to perform on the data structure. If you will be retrieving data by index (eg, data[ 3 ]), then a list is a horrible idea since each read will require you to walk the list. If you will be inserting into the first position a lot (eg, data[ 0 ] = x), then an array will be terrible because you will be moving all the data for each insertion.
If you were going to use std::vector, then a dynamic array is probably the best replacement. But perhaps std::vector would not have been the correct choice.

Each has its own advantages and disadvantages.
Static arrays: perfect for lookup tables and data that doesn't change its size throughout the program execution. They get allocated on program startup, so you don't have to manage this memory in any way. A disadvantage is that you cannot resize or free a static array - it stays there until the program terminates.
Dynamically growing arrays: an easy to manage data structure that can almost be a substitute for vector arrays in C++. A disadvantage is the overhead of allocating memory at runtime, but that can be alleviated if you allocate bigger chunks at once.
Linked lists: take care if you are going to have a lot of elements because allocating each one separately without using a memory pool can lead to memory waste and fragmentation.

Dynamic Link lists are widely used in C where number of items to be stored is not known.

Vector is a dynamic array and is implemented internally as a dynamic array. I would suggest you to use vector as the machines are optimized to retrieve the continuous memory locations,while the link list will not be stored in continuous location.
Having said that,if you don't require the fast retrieval of element by index in your use case then you can go for link list also.The link list also has a benefit of not wasting the space unlike dynamic array(by doubling when it is about to get full) and also insertion in the start or between the elements is cheaper as compared to array.

In pure C I mostly use static arrays. It is connected with programming of embedded devices that have problems with allocating and freeing memory - heap fragmentation.
But if there is a need of using list I implement one myself - sometimes its implementation is based on static arrays (again).
I think there are libraries for C that offer decent implementation of more complex data structures. AFAIK glib is widely used and offers at least linked lists.

I remember I read a document from Apple some time ago saying that such a thing could be naively implemented with a structure similar to:
struct {
void *data; //other type rather than void is easier
int length;
} MyArray;
MyArray *MyArrayCreate(){
//mallocate memory
return ...
}
void MyArrayRelease(){
free(...);
}
And implement a function that checks the length of the array and its not long enough then another big enough array will be allocated, former data will be copied to it and new data added to it.
MyArrayInsertAt(MyArray *array, index, void *object){
if (length < index){
//mallocate again, copy data
//update length
}
data[index] = object;
}
So it can used like so:
MyArray *array = MyArrayCreate();
MyArrayInsertAt(array, 5, something);
MyArrayRelease(array);
MyArrayInsertAt() function won't win a price for its performance, but it could be a solution for no high-demanding programs/applications
I just can't find the link... maybe someone read it too?
ADDED:
I have found that GNU NSMutableArray (Objective-C) methods implementation are done in C and they do the same as above. They double the size of the array each time an object has to be added and won't fit in the array. See line 131 through 145

Related

Is Array of Objects Access Time O(1)?

I know that accessing something from an array takes O(1) time if it is of one type using mathematical formula array[n]=(start address of array + (n * size Of(type)), but Assume you have an array of objects. These objects could have any number of fields including nested objects. Can we consider the access time to be constant?
Edit- I am mainly asking for JAVA, but I would like to know if there is a difference in case I choose another mainstream language like python, c++, JavaScript etc.
For example in the below code
class tryInt{
int a;
int b;
String s;
public tryInt(){
a=1;
b=0;
s="Sdaas";
}
}
class tryobject{
public class tryObject1{
int a;
int b;
int c;
}
public tryobject(){
tryObject1 o=new tryObject1();
sss="dsfsdf";
}
String sss;
}
public class Main {
public static void main(String[] args) {
System.out.println("Hello World!");
Object[] arr=new Object[5];
arr[0]=new tryInt();
arr[1]=new tryobject();
System.out.println(arr[0]);
System.out.println(arr[1]);
}
}
I want to know that since the tryInt type object should take less space than tryobject type object, how will an array now use the formula array[n]=(start address of array + (n * size Of(type)) because the type is no more same and hence this formula should/will fail.
The answer to your question is it depends.
If it's possible to random-access the array when you know the index you want, yes, it's a O(1) operation.
On the other hand, if each item in the array is a different length, or if the array is stored as a linked list, it's necessary to start looking for your element at the beginning of the array, skipping over elements until you find the one corresponding to your index. That's an O(n) operation.
In the real world of working programmers and collections of data, this O(x) stuff is inextricably bound up with the way the data is represented.
Many people reserve the word array to mean a randomly accessible O(1) collection. The pages of a book are an array. If you know the page number you can open the book to the page. (Flipping the book open to the correct page is not necessarily a trivial operation. You may have to go to your library and find the right book first. The analogy applies to multi-level computer storage ... hard drive / RAM / several levels of processor cache)
People use list for a sequentially accessible O(n) collection. The sentences of text on a page of a book are a list. To find the fifth sentence, you must read the first four.
I mention the meaning of the words list and array here for an important reason for professional programmers. Much of our work is maintaining existing code. In our justifiable rush to get things done, sometimes we grab the first collection class that comes to hand, and sometimes we grab the wrong one. For example, we might grab a list O(n) rather than an array O(1) or a hash O(1, maybe). The collection we grab works well for our tests. But, boom!, performance falls over just when the application gets successful and scales up to holding a lot of data. This happens all the time.
To remedy that kind of problem we need a practical understanding of these access issues. I once inherited a project with a homegrown hashed dictionary class that consumed O(n cubed) when inserting lots of items into the dictionary. It took a lot of digging to get past the snazzy collection-class documentation to figure out what was really going on.
In Java, the Object type is a reference to an object rather than an object itself. That is, a variable of type Object can be thought of as a pointer that says “here’s where you should go to find your Object” rather than “I am an actual, honest-to-goodness Object.” Importantly, the size of this reference - the number of bytes used up - is the same regardless of what type of thing the Object variable refers to.
As a result, if you have an Object[], then the cost of indexing into that array is indeed O(1), since the entries in that array are all the same size (namely, the size of an object reference). The sizes of the objects being pointed at might not all be the same, as in your example, but the pointers themselves are always the same size and so the math you’ve given provides a way to do array indexing in constant time.
The answer depends on context.
It's really common in some textbooks to treat array access as O(1) because it simplifies analysis.
And in fairness it is O(1) cpu instructions in today's architectures.
But:
As the dataset gets larger tending to infinity, it doesn't fit in memory. If the "array" is implemented as a database structure spread across multiple machines, you'll end up with a tree structure and probably have logarithmic worst case access times.
If you don't care about data size going to infinity, then big O notation may not be the right fit for your situation
On real hardware, memory accesses are not all equal -- there are many layers of caches and cache misses cost hundreds or thousands of cycles. The O(1) model for memory access tends to ignore that
In theory work, random access machines access memory in O(1), but turing machines cannot. Hierarchical cache effects tend to be ignored. Some models like transdichotomous RAM try to account for this.
In short, this is a property of your model of computation. There are many valid and interesting models of computation and what to choose depends on your needs and your situation.
In general. array denotes a fixed-size memory range, storing elements of the same size. If we consider this usual concept of array, then if objects are members of the array, then under the hood your array stores the object references and referring the i'th element in your array you find the reference of the object/pointer it contains, with an O(1) complexity and the address it points to is something the language is to find.
However, there are arrays which do not comply to this definition. For example, in Javascript you can easily add items to the array, which makes me think that in Javascript the arrays are somewhat different from an allocated fixed-sized range of elements of the same size/type. Also, in Javascript you can add any types of elements to an array. So, in general I would say the complexity is O(1), but there are quite a few important exceptions from this rule, depending on the technologies.

Why are lists used infrequently in Go?

Is there a way to create an array/slice in Go without a hard-coded array size? Why is List ignored?
In all the languages I've worked with extensively: Delphi, C#, C++, Python - Lists are very important because they can be dynamically resized, as opposed to arrays.
In Golang, there is indeed a list.Liststruct, but I see very little documentation about it - whether in Go By Example or the three Go books that I have - Summerfield, Chisnal and Balbaert - they all spend a lot of time on arrays and slices and then skip to maps. In souce code examples I also find little or no use of list.List.
It also appears that, unlike Python, Range is not supported for List - big drawback IMO. Am I missing something?
Slices are lovely, but they still need to be based on an array with a hard-coded size. That's where List comes in.
Just about always when you are thinking of a list - use a slice instead in Go. Slices are dynamically re-sized. Underlying them is a contiguous slice of memory which can change size.
They are very flexible as you'll see if you read the SliceTricks wiki page.
Here is an excerpt :-
Copy
b = make([]T, len(a))
copy(b, a) // or b = append([]T(nil), a...)
Cut
a = append(a[:i], a[j:]...)
Delete
a = append(a[:i], a[i+1:]...) // or a = a[:i+copy(a[i:], a[i+1:])]
Delete without preserving order
a[i], a = a[len(a)-1], a[:len(a)-1]
Pop
x, a = a[len(a)-1], a[:len(a)-1]
Push
a = append(a, x)
Update: Here is a link to a blog post all about slices from the go team itself, which does a good job of explaining the relationship between slices and arrays and slice internals.
I asked this question a few months ago, when I first started investigating Go. Since then, every day I have been reading about Go, and coding in Go.
Because I did not receive a clear-cut answer to this question (although I had accepted one answer) I'm now going to answer it myself, based on what I have learned, since I asked it:
Is there a way to create an array /slice in Go without a hard coded
array size?
Yes. Slices do not require a hard coded array to slice from:
var sl []int = make([]int, len, cap)
This code allocates slice sl, of size len with a capacity of cap - len and cap are variables that can be assigned at runtime.
Why is list.List ignored?
It appears the main reasons list.List seem to get little attention in Go are:
As has been explained in #Nick Craig-Wood's answer, there is
virtually nothing that can be done with lists that cannot be done
with slices, often more efficiently and with a cleaner, more
elegant syntax. For example the range construct:
for i := range sl {
sl[i] = i
}
cannot be used with list - a C style for loop is required. And in
many cases, C++ collection style syntax must be used with lists:
push_back etc.
Perhaps more importantly, list.List is not strongly typed - it is very similar to Python's lists and dictionaries, which allow for mixing various types together in the collection. This seems to run contrary
to the Go approach to things. Go is a very strongly typed language - for example, implicit type conversions never allowed in Go, even an upCast from int to int64 must be
explicit. But all the methods for list.List take empty interfaces -
anything goes.
One of the reasons that I abandoned Python and moved to Go is because
of this sort of weakness in Python's type system, although Python
claims to be "strongly typed" (IMO it isn't). Go'slist.Listseems to
be a sort of "mongrel", born of C++'s vector<T> and Python's
List(), and is perhaps a bit out of place in Go itself.
It would not surprise me if at some point in the not too distant future, we find list.List deprecated in Go, although perhaps it will remain, to accommodate those rare situations where, even using good design practices, a problem can best be solved with a collection that holds various types. Or perhaps it's there to provide a "bridge" for C family developers to get comfortable with Go before they learn the nuances of slices, which are unique to Go, AFAIK. (In some respects slices seem similar to stream classes in C++ or Delphi, but not entirely.)
Although coming from a Delphi/C++/Python background, in my initial exposure to Go I found list.List to be more familiar than Go's slices, as I have become more comfortable with Go, I have gone back and changed all my lists to slices. I haven't found anything yet that slice and/or map do not allow me to do, such that I need to use list.List.
I think that's because there's not much to say about them as the container/list package is rather self-explanatory once you absorbed what is the chief Go idiom for working with generic data.
In Delphi (without generics) or in C you would store pointers or TObjects in the list, and then cast them back to their real types when obtaining from the list. In C++ STL lists are templates and hence parameterized by type, and in C# (these days) lists are generic.
In Go, container/list stores values of type interface{} which is a special type capable to represent values of any other (real) type—by storing a pair of pointers: one to the type info of the contained value, and a pointer to the value (or the value directly, if it's size is no greater than the size of a pointer). So when you want to add an element to the list, you just do that as function parameters of type interface{} accept values coo any type. But when you extract values from the list, and what to work with their real types you have to either type-asert them or do a type switch on them—both approaches are just different ways to do essentially the same thing.
Here is an example taken from here:
package main
import ("fmt" ; "container/list")
func main() {
var x list.List
x.PushBack(1)
x.PushBack(2)
x.PushBack(3)
for e := x.Front(); e != nil; e=e.Next() {
fmt.Println(e.Value.(int))
}
}
Here we obtain an element's value using e.Value() and then type-assert it as int a type of the original inserted value.
You can read up on type assertions and type switches in "Effective Go" or any other introduction book. The container/list package's documentation summaries all the methods lists support.
Note that Go slices can be expanded via the append() builtin function. While this will sometimes require making a copy of the backing array, it won't happen every time, since Go will over-size the new array giving it a larger capacity than the reported length. This means that a subsequent append operation can be completed without another data copy.
While you do end up with more data copies than with equivalent code implemented with linked lists, you remove the need to allocate elements in the list individually and the need to update the Next pointers. For many uses the array based implementation provides better or good enough performance, so that is what is emphasised in the language. Interestingly, Python's standard list type is also array backed and has similar performance characteristics when appending values.
That said, there are cases where linked lists are a better choice (e.g. when you need to insert or remove elements from the start/middle of a long list), and that is why a standard library implementation is provided. I guess they didn't add any special language features to work with them because these cases are less common than those where slices are used.
From: https://groups.google.com/forum/#!msg/golang-nuts/mPKCoYNwsoU/tLefhE7tQjMJ
It depends a lot on the number of elements in your lists,
whether a true list or a slice will be more efficient
when you need to do many deletions in the 'middle' of the list.
#1
The more elements, the less attractive a slice becomes.
#2
When the ordering of the elements isn't important,
it is most efficient to use a slice and
deleting an element by replacing it by the last element in the slice and
reslicing the slice to shrink the len by 1
(as explained in the SliceTricks wiki)
So
use slice
1. If order of elements in list is Not important, and you need to delete, just
use List swap the element to delete with last element, & re-slice to (length-1)
2. when elements are more (whatever more means)
There are ways to mitigate the deletion problem --
e.g. the swap trick you mentioned or
just marking the elements as logically deleted.
But it's impossible to mitigate the problem of slowness of walking linked lists.
So
use slice
1. If you need speed in traversal
Unless the slice is updated way too often (delete, add elements at random locations) the memory contiguity of slices will offer excellent cache hit ratio compared to linked lists.
Scott Meyer's talk on the importance of cache..
https://www.youtube.com/watch?v=WDIkqP4JbkE
list.List is implemented as a doubly linked list. Array-based lists (vectors in C++, or slices in golang) are better choice than linked lists in most conditions if you don't frequently insert into the middle of the list. The amortized time complexity for append is O(1) for both array list and linked list even though array list has to extend the capacity and copy over existing values. Array lists have faster random access, smaller memory footprint, and more importantly friendly to garbage collector because of no pointers inside the data structure.

Fast way to remove bytes from a buffer

Is there a faster way to do this:
Vector3* points = malloc(maxBufferCount*sizeof(Vector3));
//put content into the buffer and increment bufferCount
...
// remove one point at index `removeIndex`
bufferCount--;
for (int j=removeIndex; j<bufferCount; j++) {
points[j] = points[j+1];
}
I'm asking because I have a huge buffer from which I remove elements quite often.
No, sorry - removing elements from the middle of an array takes O(n) time. If you really want to modify the elements often (i. e. remove certain items and/or add others), use a linked list instead - that has constant-time removal and addition. In contrast, arrays have constant lookup time, while linked lists can be accessed (read) in linear time. So decide what you will do more frequently (reading or writing) and choose the appropriate data structure based upon that decision.
Note, however, that I (kindly) assumed you are not trying to commit the crime of premature optimization. If you haven't benchmarked that this is the bottleneck, then probably just don't worry about it.
Unless you know it's a bottleneck you can probably let the compiler optimize for you, but you could try memmove.
The selected answer here is pretty comprehensive: When to use strncpy or memmove?
A description is here: http://www.kernel.org/doc/man-pages/online/pages/man3/memmove.3.html
A few things to say. The memmove function will probably copy faster than you, often it is optimised by the writers of your particular complier to use special instructions which arent available in the C language without inline assembler. I believe these instructions are called SIMD instructions (Single Instruction Multiple Data)? Somebody correct me if I am wrong.
If you can save up items to be removed, then you can optimse by sorting the list of items you wish to remove and then, doing a single pass. It isnt hard but just takes some funny arithmetic.
Also you could just store each item in a linked list, removing an item is trivial, but you lose random acccess to your array.
Finally you can have an additional array of pointers, the same size of your array, each pointer pointing to an element. Then you can access the array through double indirection, you can sort the array by swapping pointers, and you can delete items by making their pointer NULL.
Hope this gives you some ideas. There usually is a way to optimise things, but then it becomes more application specific.

Sparse Array in C! How accomplish it? Can I alloc only parts of an array?

The first question is: "How I do a simple sparse array in C (with one dimension only)?" {with my own hands, without libraries.}
And the last one: "Can I allocate only parts of an array?"
like *array;
then use malloc to allocate some mem for this;
so, We free the index that we don't want.
Can I do it?
Thanks so much!
No, you can't do it.
What you can do is to allocate blocks, but you need to design it carefully.
Probably the best optimization is to use ranges of cell. So you can use a linked list (or a map) of available ranges:
struct SparseBlock
{
void *blockData;
int beginIndex;
int endIndex;
struct SparseBlock *next;
}
obviously if endIndex - beginIndex = 0 you have a single cell (that is isolated inside the array), otherwise you have got a block of cells, allowing you to allocate the right amount of memory for it.
This approach is simple for immutable sparse vectors, otherwise you should take care of
restructuring the blocks whenever a hole is filled or generated
just store single cells
In addition you have to decide how to index these blocks, you can keep them ordered in a linked list, or you can use a map to have a constant O(1) time to retrieve a n-th block (of course you will have to insert many equal keys for the same block if it's a range or reduce the index to the nearest lower index available).
Solutions are many, just express your creativity! :)
It is not uncommon to implement these in linked structures of one kind or another. In one dimension you can simple generate a linked list of occupied regions, and I've discussed a two dimensional implementation in another context before.
You do lose O(1) access time this way, but the win on space can be considerable if the structure really is sparse.

Array versus linked-list

Why would someone want to use a linked-list over an array?
Coding a linked-list is, no doubt, a bit more work than using an array and one may wonder what would justify the additional effort.
I think insertion of new elements is trivial in a linked-list but it's a major chore in an array. Are there other advantages to using a linked list to store a set of data versus storing it in an array?
This question is not a duplicate of this question because the other question is asking specifically about a particular Java class while this question is concerned with the general data structures.
Another good reason is that linked lists lend themselves nicely to efficient multi-threaded implementations. The reason for this is that changes tend to be local - affecting only a pointer or two for insert and remove at a localized part of the data structure. So, you can have many threads working on the same linked list. Even more, it's possible to create lock-free versions using CAS-type operations and avoid heavy-weight locks altogether.
With a linked list, iterators can also traverse the list while modifications are occurring. In the optimistic case where modifications don't collide, iterators can continue without contention.
With an array, any change that modifies the size of the array is likely to require locking a large portion of the array and in fact, it's rare that this is done without a global lock across the whole array so modifications become stop the world affairs.
It's easier to store data of different sizes in a linked list. An array assumes every element is exactly the same size.
As you mentioned, it's easier for a linked list to grow organically. An array's size needs to be known ahead of time, or re-created when it needs to grow.
Shuffling a linked list is just a matter of changing what points to what. Shuffling an array is more complicated and/or takes more memory.
As long as your iterations all happen in a "foreach" context, you don't lose any performance in iteration.
Wikipedia has very good section about the differences.
Linked lists have several advantages
over arrays. Elements can be inserted
into linked lists indefinitely, while
an array will eventually either fill
up or need to be resized, an expensive
operation that may not even be
possible if memory is fragmented.
Similarly, an array from which many
elements are removed may become
wastefully empty or need to be made
smaller.
On the other hand, arrays allow random
access, while linked lists allow only
sequential access to elements.
Singly-linked lists, in fact, can only
be traversed in one direction. This
makes linked lists unsuitable for
applications where it's useful to look
up an element by its index quickly,
such as heapsort. Sequential access on
arrays is also faster than on linked
lists on many machines due to locality
of reference and data caches. Linked
lists receive almost no benefit from
the cache.
Another disadvantage of linked lists
is the extra storage needed for
references, which often makes them
impractical for lists of small data
items such as characters or boolean
values. It can also be slow, and with
a naïve allocator, wasteful, to
allocate memory separately for each
new element, a problem generally
solved using memory pools.
http://en.wikipedia.org/wiki/Linked_list
I'll add another - lists can act as purely functional data structures.
For instance, you can have completely different lists sharing the same end section
a = (1 2 3 4, ....)
b = (4 3 2 1 1 2 3 4 ...)
c = (3 4 ...)
i.e.:
b = 4 -> 3 -> 2 -> 1 -> a
c = a.next.next
without having to copy the data being pointed to by a into b and c.
This is why they are so popular in functional languages, which use immutable variables - prepend and tail operations can occur freely without having to copy the original data - very important features when you're treating data as immutable.
Besides inserting into the middle of the list being easier - it's also much easier to delete from the middle of a linked list than an array.
But frankly, I've never used a linked list. Whenever I needed fast insertion and deletion, I also needed fast lookup, so I went to a HashSet or a Dictionary.
Merging two linked lists (especially two doubly linked lists) is much faster than merging two arrays (assuming the merge is destructive). The former takes O(1), the latter takes O(n).
EDIT: To clarify, I meant "merging" here in the unordered sense, not as in merge sort. Perhaps "concatenating" would have been a better word.
A widely unappreciated argument for ArrayList and against LinkedList is that LinkedLists are uncomfortable while debugging. The time spent by maintenance developers to understand the program, e.g. to find bugs, increases and IMHO does sometimes not justify the nanoseconds in performance improvements or bytes in memory consumption in enterprise applicatons. Sometimes (well, of course it depends on the type of applications), it's better to waste a few bytes but have an application which is more maintainable or easier to understand.
For example, in a Java environment and using the Eclipse debugger, debugging an ArrayList will reveal a very easy to understand structure:
arrayList ArrayList<String>
elementData Object[]
[0] Object "Foo"
[1] Object "Foo"
[2] Object "Foo"
[3] Object "Foo"
[4] Object "Foo"
...
On the other hand, watching the contents of a LinkedList and finding specific objects becomes a Expand-The-Tree clicking nightmare, not to mention the cognitive overhead needed to filter out the LinkedList internals:
linkedList LinkedList<String>
header LinkedList$Entry<E>
element E
next LinkedList$Entry<E>
element E "Foo"
next LinkedList$Entry<E>
element E "Foo"
next LinkedList$Entry<E>
element E "Foo"
next LinkedList$Entry<E>
previous LinkedList$Entry<E>
...
previous LinkedList$Entry<E>
previous LinkedList$Entry<E>
previous LinkedList$Entry<E>
First of all, in C++ linked-lists shouldn't be much more trouble to work with than an array. You can use the std::list or the boost pointer list for linked lists. The key issues with linked lists vs arrays are extra space required for pointers and terrible random access. You should use a linked list if you
you don't need random access to the data
you will be adding/deleting elements, especially in the middle of the list
For me it is like this,
Access
Linked Lists allow only sequential access to elements. Thus the algorithmic complexities is order of O(n)
Arrays allow random access to its elements and thus the complexity is order of O(1)
Storage
Linked lists require an extra storage for references. This makes them impractical for lists of small data items such as characters or boolean values.
Arrays do not need an extra storage to point to next data item. Each element can be accessed via indexes.
Size
The size of Linked lists are dynamic by nature.
The size of array is restricted to declaration.
Insertion/Deletion
Elements can be inserted and deleted in linked lists indefinitely.
Insertion/Deletion of values in arrays are very expensive. It requires memory reallocation.
Two things:
Coding a linked list is, no doubt, a bit more work than using an array and he wondered what would justify the additional effort.
Never code a linked list when using C++. Just use the STL. How hard it is to implement should never be a reason to choose one data structure over another because most are already implemented out there.
As for the actual differences between an array and a linked list, the big thing for me is how you plan on using the structure. I'll use the term vector since that's the term for a resizable array in C++.
Indexing into a linked list is slow because you have to traverse the list to get to the given index, while a vector is contiguous in memory and you can get there using pointer math.
Appending onto the end or the beginning of a linked list is easy, since you only have to update one link, where in a vector you may have to resize and copy the contents over.
Removing an item from a list is easy, since you just have to break a pair of links and then attach them back together. Removing an item from a vector can be either faster or slower, depending if you care about order. Swapping in the last item over top the item you want to remove is faster, while shifting everything after it down is slower but retains ordering.
Eric Lippert recently had a post on one of the reasons arrays should be used conservatively.
Fast insertion and removal are indeed the best arguments for linked lists. If your structure grows dynamically and constant-time access to any element isn't required (such as dynamic stacks and queues), linked lists are a good choice.
Here's a quick one: Removal of items is quicker.
Linked-list are especially useful when the collection is constantly growing & shrinking. For example, it's hard to imagine trying to implement a Queue (add to the end, remove from the front) using an array -- you'd be spending all your time shifting things down. On the other hand, it's trivial with a linked-list.
Other than adding and remove from the middle of the list, I like linked lists more because they can grow and shrink dynamically.
Arrays Vs Linked List:
Array memory allocation will fail sometimes because of fragmented memory.
Caching is better in Arrays as all elements are allocated contiguous memory space.
Coding is more complex than Arrays.
No size constraint on Linked List, unlike Arrays
Insertion/Deletion is faster in Linked List and access is faster in Arrays.
Linked List better from multi-threading point of view.
No one ever codes their own linked list anymore. That'd be silly. The premise that using a linked list takes more code is just wrong.
These days, building a linked list is just an exercise for students so they can understand the concept. Instead, everyone uses a pre-built list. In C++, based the on the description in our question, that'd probably mean an stl vector (#include <vector> ).
Therefore, choosing a linked list vs an array is entirely about weighing the different characteristics of each structure relative to the needs of your app. Overcoming the additional programming burden should have zero impact on the decision.
It's really a matter of efficiency, the overhead to insert, remove or move (where you are not simply swapping) elements inside a linked list is minimal, i.e. the operation itself is O(1), verses O(n) for an array. This can make a significant difference if you are operating heavily on a list of data. You chose your data-types based on how you will be operating on them and choose the most efficient for the algorithm you are using.
Arrays make sense where the exact number of items will be known, and where searching by index makes sense. For example, if I wanted to store the exact state of my video output at a given moment without compression I would probably use an array of size [1024][768]. This will provide me with exactly what I need, and a list would be much, much slower to get the value of a given pixel. In places where an array does not make sense there are generally better data types than a list to deal with data effectively.
as arrays are static in nature, therefore all operations
like memory allocation occur at the time of compilation
only. So processor has to put less effort at its runtime .
Suppose you have an ordered set, which you also want to modify by adding and removing elements. Further, you need ability to retain a reference to an element in such a way that later you can get a previous or next element. For example, a to-do list or set of paragraphs in a book.
First we should note that if you want to retain references to objects outside of the set itself, you will likely end up storing pointers in the array, rather than storing objects themselves. Otherwise you will not be able to insert into array - if objects are embedded into the array they will move during insertions and any pointers to them will become invalid. Same is true for array indexes.
Your first problem, as you have noted yourself, is insertion - linked list allows inserting in O(1), but an array would generally require O(n). This problem can be partially overcome - it is possible to create a data structure that gives array-like by-ordinal access interface where both reading and writing are, at worst, logarithmic.
Your second, and more severe problem is that given an element finding next element is O(n). If the set was not modified you could retain the index of the element as the reference instead of the pointer thus making find-next an O(1) operation, but as it is all you have is a pointer to the object itself and no way to determine its current index in the array other than by scanning the entire "array". This is an insurmountable problem for arrays - even if you can optimized insertions, there is nothing you can do to optimize find-next type operation.
In an array you have the privilege of accessing any element in O(1) time. So its suitable for operations like Binary search Quick sort, etc. Linked list on the other hand is suitable for insertion deletion as its in O(1) time. Both has advantages as well as disadvantages and to prefer one over the other boils down to what you want to implement.
-- Bigger question is can we have a hybrid of both. Something like what python and perl implement as lists.
Linked List
Its more preferable when it comes about insertion! Basically what is does is that it deals with the pointer
1 -> 3 -> 4
Insert (2)
1........3......4
.....2
Finally
1 -> 2 -> 3 -> 4
One arrow from the 2 points at 3 and the arrow of 1 points at 2
Simple!
But from Array
| 1 | 3 | 4 |
Insert (2)
| 1 | 3 | | 4 |
| 1 | | 3 | 4 |
| 1 | 2 | 3 | 4 |
Well anyone can visualize the difference!
Just for 4 index we are performing 3 steps
What if the array length is one million then? Is array efficient?
The answer is NO! :)
The same thing goes for deletion!
In Linked List we can simply use the pointer and nullify the element and next in the object class!
But for array, we need to perform shiftLeft()
Hope that helps! :)
Linked List are more of an overhead to maintain than array, it also requires additional memory storage all these points are agreed. But there are a few things which array cant do. In many cases suppose you want an array of length 10^9 you can't get it because getting one continous memory location has to be there. Linked list could be a saviour here.
Suppose you want to store multiple things with data then they can be easily extended in the linked list.
STL containers usually have linked list implementation behind the scene.
1- Linked list is a dynamic data structure so it can grow and shrink at runtime by allocating and deallocating memory. So there is no need to give an initial size of the linked list. Insertion and deletion of nodes are really easier.
2- size of the linked list can increase or decrease at run time so there is no memory wastage. In the case of the array, there is a lot of memory wastage, like if we declare an array of size 10 and store only 6 elements in it then space of 4 elements is wasted. There is no such problem in the linked list as memory is allocated only when required.
3- Data structures such as stack and queues can be easily implemented using linked list.
Only reason to use linked list is that insert the element is easy (removing also).
Disadvatige could be that pointers take a lot of space.
And about that coding is harder:
Usually you don't need code linked list (or only once) they are included in
STL
and it is not so complicated if you still have to do it.
i also think that link list is more better than arrays.
because we do traversing in link list but not in arrays
Depending on your language, some of these disadvantages and advantages could be considered:
C Programming Language: When using a linked list (through struct pointers typically), special consideration must be made sure that you are not leaking memory. As was mentioned earlier, linked lists are easy to shuffle, because all were doing is changing pointers, but are we going to remember to free everything?
Java: Java has an automatic garbage collect, so leaking memory won't be an issue, but hidden from the high level programmer is the implementation details of what a linked list is. Methods such as removing a node from the middle of the list is more complicated of a procedure than some users of the language would expect it to be.
Why a linked list over an array ? Well as some have already said, greater speed of insertions and deletions.
But maybe we don't have to live with the limits of either, and get the best of both, at the same time... eh ?
For array deletions, you can use a 'Deleted' byte, to represent the fact that a row has been deleted, thus reorging the array is no longer necessary. To ease the burden of insertions, or rapidly changing data, use a linked list for that. Then when referring to them, have your logic first search one, then the other. Thus, using them in combination gives you the best of both.
If you have a really large array, you could combine it with another, much smaller array or linked list where the smaller one hold thes 20, 50, 100 most recently used items. If the one needed is not in the shorter linked list or array, you go to the large array. If found there, you can then add it to the smaller linked list/array on the presumption that 'things most recently used are most likey to be re-used' ( and yes, possibly bumping the least recently used item from the list ). Which is true in many cases and solved a problem I had to tackle in an .ASP security permissions checking module, with ease, elegance, and impressive speed.
While many of you have touched upon major adv./dis of linked list vs array, most of the comparisons are how one is better/ worse than the other.Eg. you can do random access in array but not possible in linked list and others. However, this is assuming link lists and array are going to be applied in a similar application. However a correct answer should be how link list would be preferred over array and vice-versa in a particular application deployment.
Suppose you want to implement a dictionary application, what would you use ?
Array : mmm it would allow easy retrieval through binary search and other search algo .. but lets think how link list can be better..Say you want to search "Blob" in dictionary. Would it make sense to have a link list of A->B->C->D---->Z and then each list element also pointing to an array or another list of all words starting with that letter ..
A -> B -> C -> ...Z
| | |
| | [Cat, Cave]
| [Banana, Blob]
[Adam, Apple]
Now is the above approach better or a flat array of [Adam,Apple,Banana,Blob,Cat,Cave] ? Would it even be possible with array ?
So a major advantage of link list is you can have an element not just pointing to the next element but also to some other link list/array/ heap/ or any other memory location.
Array is a one flat contigous memory sliced into blocks size of the element it is going to store.. Link list on the other hand is a chunks of non-contigous memory units (can be any size and can store anything) and pointing to each other the way you want.
Similarly lets say you are making a USB drive. Now would you like files to be saved as any array or as a link list ? I think you get the idea what I am pointing to :)

Resources