What is the difference between an array and a list?
In C, an array is a fixed-size region of contiguous storage containing multiple objects, one after the other. This array is an "object" in the meaning which C gives to the word - basically just some memory that represents something. An object could just be an int.
You can distinguish slightly between array objects, and array types. Often people use array objects which are allocated with malloc, and used via a pointer to the first element. But C does also have specific types for arrays of different sizes, and also for variable-length-arrays, whose size is set when they are created. VLAs have a slightly misleading name: the size is only "variable" in the sense that it isn't fixed at compile time. It can't change during the lifetime of the object.
So, when I say an array is fixed-size I mean that the size cannot change once the array is created, and this includes VLAs. There is realloc, which logically returns a pointer to a new array that replaces the old one, but can sometimes return the same address passed in, having changed the size of the array in place. realloc operates on memory allocations, not on arrays in general.
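As a minimal sketch of the realloc point above (the helper name `grow_ints` is made up for illustration, and it assumes `new_count` is greater than zero): the result of realloc is assigned to a temporary first, because the returned address may or may not be the one passed in, and on failure the old block is still valid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize a malloc'd block of ints so that it can
 * hold new_count elements (assumed > 0). Returns 0 on success, -1 on
 * failure. On failure *block is left untouched, so the caller can still
 * use or free it. */
int grow_ints(int **block, size_t new_count)
{
    assert(block != NULL);

    /* Assign to a temporary: realloc returns NULL on failure and leaves
     * the old block valid, so overwriting *block directly would leak it. */
    int *tmp = realloc(*block, new_count * sizeof **block);
    if (tmp == NULL)
        return -1;

    *block = tmp;   /* may or may not equal the old address */
    return 0;
}
```

After a successful call, any other pointers into the old block must be treated as invalid, since the block may have moved.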
That's what an array is. The C programming language doesn't define anything called a list. Can't really compare something which is well defined, with something that isn't defined ;-) Usually "list" would mean a linked list, but in some contexts or in other languages it means other things.
For that matter, in other languages "array" could mean other things, although I can't immediately think of a language where it means anything very different from a C array.
If your question really has nothing to do with C, and is a language-agnostic data-structures question, "what is the difference between an array and a linked list?", then it's a duplicate of this:
Array versus linked-list
There is no such thing as a standard list in C. There is such a thing in C++ (std::list), where it is implemented as a doubly linked list.
The main differences are that arrays have random access: you can access any member of the array in O(1) time (e.g. if a is an array, a[4]), and their size is fixed when they are created (often at compile time). Linked lists have sequential access: to access an element, you have to loop through the list until you get to the element you want (e.g. if b is a linked list, to get to the 5th element of b you have to iterate through elements 0, 1, 2, 3 and 4), and the size can grow and shrink dynamically.
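A small sketch of that access difference, using a hypothetical singly linked `struct node` and two made-up accessor functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked node type, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Array: element i is found by address arithmetic, O(1). */
int array_get(const int *a, size_t i)
{
    return a[i];
}

/* Linked list: follow i links from the head, O(i).
 * The caller must ensure the list has more than i nodes. */
int list_get(const struct node *head, size_t i)
{
    while (i-- > 0) {
        assert(head != NULL);
        head = head->next;
    }
    assert(head != NULL);
    return head->value;
}
```

`array_get(a, 4)` costs the same as `array_get(a, 0)`, while `list_get(head, 4)` has to follow four links first.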
Although there is nothing like a list in C per se, you could well be talking about a linked list implementation.
Array: Random access, predefined size.
Linked List: Sequential access, size at runtime.
Other languages, say Python, may have both lists and arrays built in, and their meanings may differ.
Useful comments from below:
You could add array lists: lists which internally are an array that is doubled when needed and halved when only 1/4 full. This gives amortized O(1) for add, remove and get(index). – lasseespeholt
Python's list is not a linked list. And the distinction between a Python list and an array is that a list can store anything while an array can only store primitive types (int, float, etc). – KennyTM
An array has a fixed size, as when we write new int[100],
but a list does not have a fixed size; it can keep growing.
Insertion and deletion are easier in a list than in an array.
Reason: in a linked list we can insert and delete simply by changing pointers, but in an array insertion and deletion require shifting elements right or left.
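To make the reason concrete, here is a hedged sketch in C (the node type and both function names are assumptions for illustration): the list insert rewires two pointers, while the array insert has to move every element after the insertion point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical singly linked node type, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Linked list: insert new_node after pos by rewiring two pointers, O(1). */
void list_insert_after(struct node *pos, struct node *new_node)
{
    assert(pos != NULL && new_node != NULL);
    new_node->next = pos->next;
    pos->next = new_node;
}

/* Array: insert value at index i by shifting a[i..len-1] one slot to the
 * right, O(len - i). The caller must guarantee room for len + 1 elements.
 * Returns the new length. */
size_t array_insert(int *a, size_t len, size_t i, int value)
{
    assert(a != NULL && i <= len);
    memmove(&a[i + 1], &a[i], (len - i) * sizeof a[0]);
    a[i] = value;
    return len + 1;
}
```

Note that the list version is only O(1) once you already hold a pointer to the insertion position; finding that position is still a linear walk.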
A linked list can use a dummy head node to avoid the special cases of inserting into an empty list or removing the last node from a list of size one, and it can use double links to allow iterating in both directions. The cost of course is the extra space needed to hold the dummy node (minimal cost), and the extra previous link in addition to the usual next link for each node (a much more significant cost).
In an array, we can write to any position directly thanks to its random access.
In such a linked list, the reference to the tail node is simply header.prev, which gives us the ability to append to the list in constant time (without having to iterate to find the tail reference, or having to maintain a separate tail reference).
But in an array, we need to resize the array before inserting if it is already full.
An array offers random access, unlike a linked list.
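The dummy-head, doubly linked layout described above could be sketched in C roughly as follows. The type `struct dnode` and the three function names are made up for this example; the list is circular, so the tail is always `header->prev`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical doubly linked node; the dummy header is also a dnode
 * whose value field is simply unused. */
struct dnode {
    int value;
    struct dnode *prev, *next;
};

/* An empty list is a header that points to itself in both directions. */
void dlist_init(struct dnode *header)
{
    assert(header != NULL);
    header->prev = header->next = header;
}

/* Append in O(1): the current tail is always header->prev. */
void dlist_append(struct dnode *header, struct dnode *n)
{
    struct dnode *tail = header->prev;
    n->prev = tail;
    n->next = header;
    tail->next = n;
    header->prev = n;
}

/* Unlink n in O(1); there are no special cases for the first, last or
 * only node because the header is always present. */
void dlist_remove(struct dnode *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}
```

The price, as noted above, is one extra node for the header and one extra pointer per element.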
A linked list has problems such as:
It consumes extra memory for the pointers stored in every node.
Access by position takes O(n) time instead of O(1) as in an array.
Reverse traversal is difficult in a singly linked list, and if we use a doubly linked list, another pointer per node means even more memory.
Heap restriction as well: each node is allocated only if there is space available in the heap. If there is insufficient memory, the node cannot be created.
An array has problems such as:
a chance of wasted memory (if it is sized too large) or a shortage (if it is sized too small).
Hope this helps! :)
An often underappreciated characteristic of linked data structures is that you can use them in situations where memory is highly fragmented, because there is no requirement that elements be contiguous in memory. For example, you could have 100MB of free space but a longest run of free memory of only 10MB. In this case you can only create an array of at most 10MB, but potentially a much larger linked list, since you would be able to make use of every run of free memory that is large enough to contain a single node.
An array holds elements of a single data type only, i.e. it is homogeneous. We can have an array of strings, an array of integers and so on, but not a mix. The size of an array is also predefined.
A list, on the other hand, can hold elements of any type, be it strings, integers or a combination of both. Null and duplicate elements are also allowed in a list. Examples of lists include ArrayList and LinkedList. The size of a list can grow or shrink at any time.
As I continue learning the C language, I have a question. What are the differences between using an array in which each element is a struct and using an array in which each element is a pointer to the same type of struct? It seems to me that you can use both equally (although with the pointers you have to deal with memory allocation). Can somebody explain to me in which cases it is better to use one or the other?
Thank you.
Arrays of structures and arrays of pointers to structures are different ways to organize memory.
Arrays of structures have these strong points:
it is easy to allocate such an array dynamically in one step with struct s *p = calloc(n, sizeof(*p));.
if the array is part of an enclosing structure, no separate allocation code is needed at all. The same is true for local and global arrays.
the array is a contiguous block of memory, a pointer to the next and previous elements can be easily computed as struct s *prev = p - 1, *next = p + 1;
accessing array element members may be faster as they are close in memory, increasing cache efficiency.
They also have disadvantages:
the size of the array must be passed explicitly as there is no way to tell from the pointer to the array how many elements it has.
the expression p[i].member generates a multiplication, which may be costly on some architectures if the size of the structure is not a power of 2.
changing the order of elements is costly as it may involve copying large amounts of memory.
Using an array of pointers has these advantages:
the size of the array could be determined by allocating an extra element and setting it to NULL. This convention is used for the argv[] array of command line arguments provided to the main() function.
if the above convention is not used, and the number of elements is passed separately, NULL pointer values could be used to specify missing elements.
it is easy to change the order of elements by just moving the pointers.
multiple elements could be made to point to the same structure.
reallocating the array is easier as only the array of pointers needs reallocation, optionally keeping separate length and size counts to minimize reallocations. Incremental allocation is easy too.
the expression p[i]->member generates a simple shift and an extra memory access, but may be more efficient than the equivalent expression p[i].member for arrays of structures.
and the following drawbacks:
allocating and freeing this indirect array is more cumbersome. An extra loop is required to allocate and/or initialize the structures pointed to by the array.
access to structure elements involve an extra memory indirection. Compilers can generate efficient code for this if multiple members are accessed in the same function, but not always.
pointers to adjacent structures cannot be derived from a pointer to a given element.
EDIT: As hinted by David Bowling, one can combine some of the advantages of both approaches by allocating an array of structures on one hand and a separate array of pointers pointing to the elements of the first array. This is a handy way to implement a sort order, or even multiple concomitant sort orders with separate arrays of pointers, like database indexes.
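A hedged sketch of that combined approach, using the standard `qsort` and a hypothetical `struct person` record (the type, the comparator names and `build_index` are all invented for this example): the structures stay put in one array, and each sort order is just a separately sorted array of pointers into it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct person {
    char name[16];
    int age;
};

/* qsort hands the comparator pointers to the array elements, which here
 * are themselves pointers to struct person. */
int by_age(const void *a, const void *b)
{
    const struct person *pa = *(const struct person *const *)a;
    const struct person *pb = *(const struct person *const *)b;
    return (pa->age > pb->age) - (pa->age < pb->age);
}

int by_name(const void *a, const void *b)
{
    const struct person *pa = *(const struct person *const *)a;
    const struct person *pb = *(const struct person *const *)b;
    return strcmp(pa->name, pb->name);
}

/* Fill index[0..n-1] with pointers to people[0..n-1] and sort only the
 * pointers; the structures themselves never move. */
void build_index(struct person *people, size_t n, struct person **index,
                 int (*cmp)(const void *, const void *))
{
    assert(people != NULL && index != NULL);
    for (size_t i = 0; i < n; i++)
        index[i] = &people[i];
    qsort(index, n, sizeof index[0], cmp);
}
```

Calling `build_index` twice with different comparators and different pointer arrays gives two independent orderings over the same data, much like two database indexes on one table.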
I was recently in an interview that required me to choose between two data structures for a problem, and now I have this question:
What is the reasoning for using a Stack over an array if the only operations needed are push and pop? An array provides constant time for appending and popping the last element from it and it takes up generally less memory than implementing a Stack with a LinkedList. It also provides random access should it be required. Is the only reasoning because an array is typically of fixed size, so we need to dynamically resize the array for each element we put in? This still is in constant time though isn't it unless the penalty is disproportionate?
There are several aspects to consider here...
First, a Stack is an abstract data type. It doesn't define how to implement itself.
An array is (generally) a well defined concrete implementation, and might even be fixed size unless explicitly defined to be dynamic.
A dynamic array can be implemented such that it automatically grows by some factor when exhausted and also might shrink when its fill rate drops. These operations are not constant time, but they amortize to constant time because the array doesn't grow or shrink on every operation. In terms of memory usage it's hard to imagine an array being more expensive than a linked list unless it is extremely underused.
The main problem with an array is large allocation size. This is both a problem of maximum limitation and memory fragmentation. Using a linked list avoids both issues because every entry has a small memory footprint.
In some languages like C++, the underlying container that the 'stack' class uses can actually be changed between a dynamic array (vector), linked list (list), or even a double ended queue (deque). I only mention this because it's typically not fair to compare a stack vs an array (one is an interface, the other is a data structure).
Most dynamic array implementations will allocate more space than is needed, and upon filling the array they will again resize to 2x the size and so on. This avoids frequent allocations and keeps the performance of push generally constant time. However, the occasional resize does require copying the elements, which is O(n), though this is usually said to be amortized to constant time. So in general, you are correct that this is efficient.
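A minimal sketch of such a doubling array used as a stack, in C. The `struct stack` layout, the initial capacity of 4 and the function names are assumptions made for this example, not taken from any particular library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable stack of ints backed by a single heap block.
 * Start with all fields zero: { NULL, 0, 0 }. */
struct stack {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Push in amortized O(1): the capacity doubles only when the block is
 * full. Returns 0 on success, -1 if memory could not be obtained. */
int stack_push(struct stack *s, int value)
{
    assert(s != NULL);
    if (s->len == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 4;
        /* realloc(NULL, n) behaves like malloc(n) on the first push. */
        int *tmp = realloc(s->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        s->data = tmp;
        s->cap = new_cap;
    }
    s->data[s->len++] = value;
    return 0;
}

/* Pop in O(1). Returns 0 and stores the top in *out, or -1 if empty. */
int stack_pop(struct stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->len == 0)
        return -1;
    *out = s->data[--s->len];
    return 0;
}
```

With this growth policy, pushing 100 elements triggers only six allocations (capacities 4, 8, 16, 32, 64, 128), which is where the amortized constant cost comes from. The caller releases the storage with `free(s.data)`.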
Linked lists on the other hand typically require an allocation for every push, which can be somewhat expensive, and the nodes they create are larger than a single element in the array.
One possible advantage of linked lists, however, is that they do not require contiguous memory. If you have a very large number of elements, it's possible that you will fail to allocate a large enough block of memory for an array. Having said that, linked lists take up more memory... so it's a bit of a wash.
In C++ for example, the stack by default uses the deque container. The deque is typically implemented as a dynamic array of 'pages' of memory. Each page of memory is fixed in size, which allows the container to actually have random access properties. Moreover, since each page is separate, then the entire container does not require contiguous memory meaning that it can store many many elements. Resizing is also cheap for a deque because it simply allocates another page, making it a great choice for a large stack.
In a program I'm writing, I am implementing binary tree and linked list structures; because I don't know how many nodes I will need, I am putting them on the heap and having the program use realloc() if they need more room.
The problem is that such structures include pointers to other locations in the same structure, and because the realloc() moves the structure, I need to redo all those pointers (unless I change them to offsets, but that increases the complexity of the code and the cost of using the structure, which is far more common than the reallocations).
Now, this might not be a problem; I could just take the old pointer, subtract it from the new pointer, and add the result to each of the pointers I need to change. However, this only works if it is possible to subtract two pointers and get the difference in their addresses (and then add that difference to another pointer to get the pointer that many bytes ahead); because I'm working on the heap, I can't guarantee that the difference of addresses will be divisible by the size of the entries, so normal pointer subtraction (which gives the number of objects in between) will introduce errors. So how do I make it give me the difference in bytes, and work even when they are in two different sections of the heap?
To get the difference between two pointers in bytes, cast them to char *:
(char *) ptrA - (char *) ptrB;
However, if you'd like to implement a binary tree or linked list with all nodes sharing the same block of memory, consider using an array of structs instead, with the pointers being replaced by array indices. The primary advantage of using pointers for a linked list or tree, rather than an array of structs, is that you can add or remove nodes individually without reallocating memory or moving other nodes around, but by making the nodes share the same array, you're negating this advantage.
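As a rough sketch of the index-based approach, assuming a hypothetical `struct pool` that owns all nodes of one singly linked list (all names here, including the `NIL` sentinel, are invented for illustration): because links are stored as array indices, growing the pool with realloc() needs no pointer fix-ups even if the block moves.

```c
#include <assert.h>
#include <stdlib.h>

#define NIL ((size_t)-1)   /* plays the role of NULL for indices */

/* Hypothetical node that links by index instead of by pointer. */
struct inode {
    int value;
    size_t next;           /* index of the next node, or NIL */
};

/* All nodes live in one block. Start with { NULL, 0, 0, NIL }. */
struct pool {
    struct inode *nodes;
    size_t len, cap;
    size_t head;           /* index of the first node, or NIL */
};

/* Push a value at the front of the list. realloc may move the block,
 * but the stored indices stay valid. Returns 0 on success, -1 on
 * allocation failure. */
int pool_push_front(struct pool *p, int value)
{
    assert(p != NULL);
    if (p->len == p->cap) {
        size_t new_cap = p->cap ? p->cap * 2 : 4;
        struct inode *tmp = realloc(p->nodes, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        p->nodes = tmp;
        p->cap = new_cap;
    }
    p->nodes[p->len].value = value;
    p->nodes[p->len].next = p->head;
    p->head = p->len++;
    return 0;
}

/* Walk the list by following indices instead of pointers. */
long pool_sum(const struct pool *p)
{
    long sum = 0;
    for (size_t i = p->head; i != NIL; i = p->nodes[i].next)
        sum += p->nodes[i].value;
    return sum;
}
```

Every access goes through `p->nodes[i]`, which is the extra cost the question mentions for offsets; in exchange, reallocation becomes trivial.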
The best way would indeed be to malloc() a new chunk for every node you have. But this might have some overhead for the internal management of the memory, so if you have lots of them, it might indeed be useful to allocate space for more nodes at once.
If you then need to realloc, you should go about it another way:
1. Calculate the offset within your memory block: `ofs = ptrX - start`
2. Add this offset to the new address returned by `realloc()`.
This way, you always stay inside the area you allocated and don't have strange heap pointer differences with nearly no meaning.
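Those two steps might look like this in C. This is a sketch under stated assumptions: `grow_and_rebase` is a made-up helper, it handles a single interior pointer, and it takes the offset in bytes via `char *` before calling realloc(), while the old block is still valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: grow a block and re-base one pointer that points
 * somewhere inside it. The byte offset is computed BEFORE realloc, since
 * the old pointer values must not be used once the block has moved.
 * Returns 0 on success, -1 on failure (in which case nothing changes). */
int grow_and_rebase(void **block, size_t new_size, void **interior)
{
    assert(block != NULL && interior != NULL);

    /* Step 1: offset of the interior pointer within the old block. */
    ptrdiff_t ofs = (char *)*interior - (char *)*block;

    void *tmp = realloc(*block, new_size);
    if (tmp == NULL)
        return -1;

    /* Step 2: add the offset to the address returned by realloc. */
    *block = tmp;
    *interior = (char *)tmp + ofs;
    return 0;
}
```

A real tree or list would need this done for every stored pointer, which is why storing indices or offsets in the nodes is often simpler.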
In fact, you can use malloc or calloc to get memory for each node.
So you only need to remember the address of the tree's root node.
This way, you never need to realloc memory for the whole tree. The address of each node also never changes. :)
Many times, a stack is implemented as a linked list. Is the array representation not good enough? With an array we can perform push and pop easily, whereas a linked list complicates the code and seems to have no advantage over the array implementation.
Can you give any example where a linked list implementation is more beneficial, or where we can't do without it?
I would say that many practical implementations of stacks are written using arrays. For example, the .NET Stack implementation uses an array as a backing store.
Arrays are typically more efficient because you can keep the stack nodes all nearby in contiguous memory that can fit nicely in your fast cache lines on the processor.
I imagine you see textbook implementations of stacks that use linked lists because they're easier to write and don't force you to write a little bit of extra code to manage the backing array store as well as come up with a growth/copy/reserve space heuristic.
In addition, if you're really pressed to use little memory, a linked list implementation might make sense since you don't "waste" space that's not currently used. However, on modern processors with plenty of memory, it's typically better to use arrays to gain the cache advantages they offer rather than worry about page faults with the linked list approach.
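For comparison, the textbook linked-list stack mentioned above is roughly this (a sketch with made-up names `struct snode`, `lstack_push` and `lstack_pop`). There is no capacity to manage, but every push is a separate allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node for a linked-list stack; an empty stack is a NULL
 * top pointer. */
struct snode {
    int value;
    struct snode *next;
};

/* Push: one allocation per element, O(1).
 * Returns 0 on success, -1 if malloc fails. */
int lstack_push(struct snode **top, int value)
{
    assert(top != NULL);
    struct snode *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *top;
    *top = n;
    return 0;
}

/* Pop: O(1). Returns 0 and stores the top value in *out, or -1 if the
 * stack is empty. The popped node is freed. */
int lstack_pop(struct snode **top, int *out)
{
    assert(top != NULL && out != NULL);
    struct snode *n = *top;
    if (n == NULL)
        return -1;
    *out = n->value;
    *top = n->next;
    free(n);
    return 0;
}
```

The whole thing is shorter than an array-backed stack with a growth policy, which fits the point about textbooks, but the nodes end up scattered across the heap rather than sitting together in cache.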
The size of an array is limited and predefined. When you don't know how many elements there will be, a linked list is a good option.
A more elaborate comparison (+ where the linked list wins, - where the array wins):
Size and type constraints:
(+) The elements of an array sit at equal distances from one another and need contiguous memory, whereas a linked list can work with non-contiguous memory, so it is sometimes the better fit for memory when the data is huge (avoids CPU polling for resources).
(+) Suppose you are using an array as a stack, and the array is of type int. How would you accommodate a double in it?
Portability
(+) An array can cause errors such as index-out-of-bounds exceptions, whereas you can extend the chain of a linked list at any time.
Speed and performance
(-) If it's about performance, then most array operations are O(1). In a linked list you have to pick a starting node and trace from there, which adds a performance penalty.
When the size of the stack can vary greatly you waste space if you have generalized routines which always allocate a huge array.
Obviously a fixed-size array has the limitation that you need to know the maximum size beforehand.
If you consider a dynamic array, then Linked List vs. Arrays covers the details, including the complexities of the operations.
A stack is often implemented using a linked list because push and pop are always O(1), whereas a push onto a full dynamic array costs O(n) for the resize and copy (although it is still amortized O(1)). The linked list also has the advantage of flexible size.
Can someone explain to me the difference between the Vector and Linked List ADTs in the context of the C programming language?
Thanks.
Well, in C, there are no "vector" and "list" data types available to you directly like in C++ std library. But in terms of "abstract data type", a vector is usually considered to represent contiguous storage, and a linked list is considered to be represented by individual cells linked together. Vectors provide fast constant time random-access read and write operations, but inserting and deleting vector elements take linear time. Lists have linear lookup performance to find an element to read and write, but given an element location, have constant time insertion and deletion. You can also add items to the start and to the end of a list in constant time (if the ADT implementation caches the location of the last element in the list).
A vector is often implemented as an array, i.e. a contiguous block of memory, whereas a list can be spread across memory because each element holds pointers to one or more other elements (it could be doubly linked). This gives vectors the access speed advantage but lists the insertion/deletion advantage.
Basically, a vector resides in contiguous memory. A linked list contains pointers to the previous and next structures. Vector is faster for random access, the linked list is better for growing.
http://www.codeguru.com/forum/archive/index.php/t-309352.html
A vector is a dynamic array. Its elements are adjacent in memory. The elements of a linked list are not necessarily adjacent.