Is stack an efficient data structure? - arrays

If two stacks growing in opposite directions are implemented within the same array, would this be a better solution or would this raise issues? Can somebody explain please.

A priori, there is no issue. You can totally grow two stacks in opposite directions within an array.
In general, I don't think it's better or worse than growing the two stacks in two different arrays. Since most libraries provide an implementation for stacks, we usually don't bother implementing our own 2-stacks-in-one-array and we use simple stacks as they are implemented.
In some particular cases, using one array instead of two might be better, and in some other particular cases, it might be worse.
When allocating an array, we ask the operating system for a contiguous region of memory. If you want to allocate two size-10000 array, then you ask for a contiguous region of 10000 free spaces, plus another contiguous region of 10000 free spaces. If you want to allocate a size-20000 array, then you ask for a contiguous region of 20000 free spaces.
Obviously, it's easier to find two regions of 10000 spaces each, than one contiguous region of 20000 spaces. So in the case of two huge stacks, it might be better to use two arrays.
However, if you are working with two stacks with the property that the total number of elements is approximately constant, but the distribution of the elements among the two stacks is changing, then it might be more efficient to store the two stacks in the same array. For instance, if you start with 10000 elements in the first stack and 0 in second, then at some point there are 5000 elements in each stack, then later there are no elements in first stack and 10000 elements in second stack, then storing the two stacks in the same array might be much more efficient than using two separate arrays.

Related

Why would you use a LIFO stack over an array?

I was recently in an interview that required me to choose over the two data structures for a problem, and now I have the question of:
What is the reasoning for using a Stack over an array if the only operations needed are push and pop? An array provides constant time for appending and popping the last element from it and it takes up generally less memory than implementing a Stack with a LinkedList. It also provides random access should it be required. Is the only reasoning because an array is typically of fixed size, so we need to dynamically resize the array for each element we put in? This still is in constant time though isn't it unless the penalty is disproportionate?
There are several aspects to consider here...
First, a Stack is an abstract data type. It doesn't define how to implement itself.
An array is (generally) a well defined concrete implementation, and might even be fixed size unless explicitly defined to be dynamic.
A dynamic array can be implemented such that it automatically grows by some factor when exhausted and also might shrink when fill rate drops. These operations are not constant time, but are actually amortized to constant time because the array doesn't grow or shrink in each operation. In terms of memory usage it's hard to imagine an array being more expensive then a linked list unless extremely under used.
The main problem with an array is large allocation size. This is both a problem of maximum limitation and memory fragmentation. Using a linked list avoids both issues because every entry has a small memory footprint.
In some languages like C++, the underlying container that the 'stack' class uses can actually be changed between a dynamic array (vector), linked list (list), or even a double ended queue (deque). I only mention this because its typically not fair to compare a stack vs an array (one is an interface, another is a data structure).
Most dynamic array implementations will allocate more space than is needed, and upon filling the array they will again resize to 2x the size and so on. This avoids allocations and keeps the performance of push generally constant time. However the occasional resize does require copying elements O(n), though this is usually said to amortized to constant time. So in general, you are correct in that this is efficient.
Linked lists on the other hand typically require allocations for every push, which can be somewhat expensive, and the node's they create are larger in size than a single element in the array.
One possible advantage of linked lists, however, is that they do not require contiguous memory. If you have many many elements, its possible that you can fail to allocate a large enough block of memory for an array. Having said that, linked lists take up more memory... so its a bit of a wash.
In C++ for example, the stack by default uses the deque container. The deque is typically implemented as a dynamic array of 'pages' of memory. Each page of memory is fixed in size, which allows the container to actually have random access properties. Moreover, since each page is separate, then the entire container does not require contiguous memory meaning that it can store many many elements. Resizing is also cheap for a deque because it simply allocates another page, making it a great choice for a large stack.

Storing arrays in cache

I was reviewing an interview question and comparing notes with a friend, and we have different ideas on one with respect to CPU caching.
Consider a large set of data, such as a large array of double, ie:
double data[1024];
Consider using a dynamically allocated on-the-fly linked list to store the same number of elements. The question asked for a description of a number of trade-offs:
Which allows for quicker random access: we both agreed the array was quicker, since you didn't have to traverse the list in a linear fashion (O(n)), just provide an index (O(1)).
Which is quicker for comparing two lists of the same length: we both decided that if it was just primitive data types, the array would allow for a memcmp(), while the linked list required element-wise comparison plus dereferencing overhead.
Which allowed for more efficient caching if you were accessing the same element several times?
In point 3, this is where our opinions differed. I contended that that the CPU is going to try and cache the entire array, and if the array is obscenely large, it can't be stored in cache, and therefore there will be no caching benefit. With the linked list, individual elements can be cached. Therefore, linked lists lend themselves to cache "hits" more than static arrays do when dealing with a very large number of elements.
To the question: Which of the two is better for cache "hits", and can modern systems cache part of an array, or do they need the whole array or it won't try? Any sort of references to technical documents or standards I could also use to provide a definitive answer would help a lot.
Thanks!
The CPU doesn't know about your data structures. It caches more-or-less raw blocks of memory. Therefore, if you suppose you can access the same one element multiple times without traversing the list each time, then neither linked list nor array has a caching advantage over the other.
HOWEVER, arrays have a big advantage over dynamically-allocated linked lists for accessing multiple elements in sequence. Because CPU caches operate on blocks of memory rather larger than one double, when one array element is in the cache, it is likely that several others that reside at adjacent addresses are in the cache, too. Thus one (slow) read from main memory gives access to (fast) cached access to multiple adjacent array elements. The same is not true of linked lists, as nodes may be allocated anywhere in memory, and even a single node has at minimum the overhead of a next pointer to dilute the number of data elements that may be cached at the same time.
Caches don't know about arrays, they just see memory accesses and store a little bit of the memory near that address. Once you've accessed something at an address it should stick around in the cache a while, regardless of whether that address belongs to an array or a linked list. But the cache controllers don't really know what's being accessed.
When you traverse an array, the cache system may pre-fetch the next bit of an array. This is usually heuristically driven (maybe with some compiler hints).
Some hardware and toolchains offer intrinsics that let you control cache residency (through pre-fetches, explicit flushes and so forth). Normally you don't need this kind of control, but for things like DSP code, resource-constrained game consoles and OS-level stuff that needs to worry about cache coherency it's pretty common to see people use this functionality.

Memory consumption of a concurrent data structure in C

I would like to know how much memory a given data structure is consuming. So suppose I have a concurrent linked list. I would like to know how big the list is. I have a few options: malloc_hooks, which I do not think is thread-safe, and getrusage's ru_maxrss, but I don't really know what that gives me (how much memory the whole process consumed during its execution?). I would like to know if someone has actually measured memory consumption in this way. Is there a tool to do this? How does massif fare?
To get an idea of how many bytes it actually costs to malloc some structure, like a linked list node, make an isolated test case(non-concurrent!) which allocates thousands of them, and look at the delta values in the program's memory usage. There are various ways to do that. If your library has a mallinfo structure, like the GNU C Library found on GNU/Linux systems, you can look at the statistics before and after. Another way is to trace the program's system calls to watch its pattern of allocating from the OS. If, say, we allocate 10,000,000 list nodes, and the program performs a sbrk() call about 39,000 times, increasing the size of the process by 8192 bytes in each call, then that means that a list node takes up 32 bytes, overhead and all.
Keeping in mind that allocating thousands of objects of the same size in a single thread does not realistically represent the actual memory usage in a realistic program, which includes fragmentation.
If you want to allocate small structures and come close to not wasting a byte (or not causing any waste that you don't know about and control), and to control fragmentation, then allocate large arrays of the objects from malloc (or your system allocator of choice) and break them up yourself. There is still unknown overhead in the malloc but it is divided over a large number of objects, making it negligible.
Or, generally, write your own allocator whose behavior and overheads you understand in detail, and which itself takes big chunks from the system.
Conceptually speaking you need to know the number of items you are working with. Then you need to know the size of each different data type used in your data structure. You also will have to take into account the size of pointers or anything that is somewhat using some sort of memory.
Then you can come up with a formula that looks like the following:
Consumption= N *(sizeof(data types) ).
So in other words you want to make sure you add any data type together (the data type's size) and multiply it by the number of items.

Why stack is implemented as a linked list ? What is the need, why array implementation is not sufficient?

Many times, stack is implemented as a linked list, Is array representation not good enough, in array we can perform push pop easily, and linked list over array complicates the code, and has no advantage over array implementation.
Can you give any example where a linked list implementation is more beneficial, or we cant do without it.
I would say that many practical implementations of stacks are written using arrays. For example, the .NET Stack implementation uses an array as a backing store.
Arrays are typically more efficient because you can keep the stack nodes all nearby in contiguous memory that can fit nicely in your fast cache lines on the processor.
I imagine you see textbook implementations of stacks that use linked lists because they're easier to write and don't force you to write a little bit of extra code to manage the backing array store as well as come up with a growth/copy/reserve space heuristic.
In addition, if you're really pressed to use little memory, a linked list implementation might make sense since you don't "waste" space that's not currently used. However, on modern processors with plenty of memory, it's typically better to use arrays to gain the cache advantages they offer rather than worry about page faults with the linked list approach.
Size of array is limited and predefined. When you dont know how many of them are there then linked list is a perfect option.
More Elaborated comparison:-(+ for dominating linked list and - for array)
Size and type constraint:-
(+) Further members of array are aligned at equal distance and need contiguous memory while on the other side link list can provide non contiguous memory solution, so sometimes it is good for memory as well in case of huge data(avoids cpu polling for resource).
(+) Suppose in a case you are using an array as stack, and the array is of type int.Now how will you accommodate a double in it??
Portability
(+) Array can cause exceptions like index out of bound exceptions but you can increase the chain anytime in a linked list.
Speed and performance
(-)If its about performance, then obviously most of the complexity fall around O(1) for arrays.In case of a linked list you will have to select a starting node to start the tracing and this adds to performance penalty.
When the size of the stack can vary greatly you waste space if you have generalized routines which always allocate a huge array.
Obviously a fixed size array has limitation of knowing maximum size before hand.
If you consider dynamic array then Linked List vs. Arrays covers the details including complexities for performing operations.
Stack is implemented using Linked List because Push and Pop operations are of O(1) time complexities, compared to O(n) for arrays. (apart from flexible size advantage in Linked List)

Array Performance very similar to LinkedList - What gives?

So the title is somewhat misleading... I'll keep this simple: I'm comparing these two data structures:
An array, whereby it starts at size 1, and for each subsequent addition, there is a realloc() call to expand the memory, and then append the new (malloced) element to the n-1 position.
A linked list, whereby I keep track of the head, tail, and size. And addition involves mallocing for a new element and updating the tail pointer and size.
Don't worry about any of the other details of these data structures. This is the only functionality I'm concerned with for this testing.
In theory, the LL should be performing better. However, they're near identical in time tests involving 10, 100, 1000... up to 5,000,000 elements.
My gut feeling is that the heap is large. I think the data segment defaults to 10 MB on Redhat? I could be wrong. Anyway, realloc() is first checking to see if space is available at the end of the already-allocated contiguous memory location (0-[n-1]). If the n-th position is available, there is not a relocation of the elements. Instead, realloc() just reserves the old space + the immediately following space. I'm having a hard time finding evidence of this, and I'm having a harder time proving that this array should, in practice, perform worse than the LL.
Here is some further analysis, after reading posts below:
[Update #1]
I've modified the code to have a separate list that mallocs memory every 50th iteration for both the LL and the Array. For 1 million additions to the array, there are almost consistently 18 moves. There's no concept of moving for the LL. I've done a time comparison, they're still nearly identical. Here's some output for 10 million additions:
(Array)
time ./a.out a 10,000,000
real 0m31.266s
user 0m4.482s
sys 0m1.493s
(LL)
time ./a.out l 10,000,000
real 0m31.057s
user 0m4.696s
sys 0m1.297s
I would expect the times to be drastically different with 18 moves. The array addition is requiring 1 more assignment and 1 more comparison to get and check the return value of realloc to ensure a move occurred.
[Update #2]
I ran an ltrace on the testing that I posted above, and I think this is an interesting result... It looks like realloc (or some memory manager) is preemptively moving the array to larger contiguous locations based on the current size.
For 500 iterations, a memory move was triggered on iterations:
1, 2, 4, 7, 11, 18, 28, 43, 66, 101, 154, 235, 358
Which is pretty close to a summation sequence. I find this to be pretty interesting - thought I'd post it.
You're right, realloc will just increase the size of the allocated block unless it is prevented from doing so. In a real world scenario you will most likely have other objects allocated on the heap in between subsequent additions to the list? In that case realloc will have to allocate a completely new chunk of memory and copy the elements already in the list.
Try allocating another object on the heap using malloc for every ten insertions or so, and see if they still perform the same.
So you're testing how quickly you can expand an array verses a linked list?
In both cases you're calling a memory allocation function. Generally memory allocation functions grab a chunk of memory (perhaps a page) from the operating system, then divide that up into smaller pieces as required by your application.
The other assumption is that, from time to time, realloc() will spit the dummy and allocate a large chunk of memory elsewhere because it could not get contiguous chunks within the currently allocated page. If you're not making any other calls to memory allocation functions in between your list expand then this won't happen. And perhaps your operating system's use of virtual memory means that your program heap is expanding contiguously regardless of where the physical pages are coming from. In which case the performance will be identical to a bunch of malloc() calls.
Expect performance to change where you mix up malloc() and realloc() calls.
Assuming your linked list is a pointer to the first element, if you want to add an element to the end, you must first walk the list. This is an O(n) operation.
Assuming realloc has to move the array to a new location, it must traverse the array to copy it. This is an O(n) operation.
In terms of complexity, both operations are equal. However, as others have pointed out, realloc may be avoiding relocating the array, in which case adding the element to the array is O(1). Others have also pointed out that the vast majority of your program's time is probably spent in malloc/realloc, which both implementations call once per addition.
Finally, another reason the array is probably faster is cache coherency and the generally high performance of linear copies. Jumping around to erratic addresses with significant gaps between them (both the larger elements and the malloc bookkeeping) is not usually as fast as doing a bulk copy of the same volume of data.
The performance of an array-based solution expanded with realloc() will depend on your strategy for creating more space.
If you increase the amount of space by adding a fixed amount of storage on each re-allocation, you'll end up with an expansion that, on average, depends on the number of elements you have stored in the array. This is on the assumption that realloc will need to (occasionally) allocate space elsewhere and copy the contents, rather than just expanding the existing allocation.
If you increase the amount of space by adding a proportion of your current number of elements (doubling is pretty standard), you'll end up with an expansion that, on average, takes constant time.
Will the compiler output be much different in these two cases?
This is not a real life situation. Presumably, in real life, you are interested in looking at or even removing items from your data structures as well as adding them.
If you allow removal, but only from the head, the linked list becomes better than the array because removing an item is trivial and, if instead of freeing the removed item, you put it on a free list to be recycled, you can eliminate a lot of the mallocs needed when you add items to the list.
On the other had, if you need random access to the structure, clearly an array beats the linked list.
(Updated.)
As others have noted, if there are no other allocations in between reallocs, then no copying is needed. Also as others have noted, the risk of memory copying lessens (but also its impact of course) for very small blocks, smaller than a page.
Also, if all you ever do in your test is to allocate new memory space, I am not very surprised you see little difference, since the syscalls to allocate memory are probably taking most of the time.
Instead, choose your data structures depending on how you want to actually use them. A framebuffer is for instance probably best represented by a contiguous array.
A linked list is probably better if you have to reorganise or sort data within the structure quickly.
Then these operations will be more or less efficient depending on what you want to do.
(Thanks for the comments below, I was initially confused myself about how these things work.)
What's the basis of your theory that the linked list should perform better for insertions at the end? I would not expect it to, for exactly the reason you stated. realloc will only copy when it has to to maintain contiguity; in other cases it may have to combine free chunks and/or increase the chunk size.
However, every linked list node requires fresh allocation and (assuming double linked list) two writes. If you want evidence of how realloc works, you can just compare the pointer before and after realloc. You should find that it usually doesn't change.
I suspect that since you're calling realloc for every element (obviously not wise in production), the realloc/malloc call itself is the biggest bottleneck for both tests, even though realloc often doesn't provide a new pointer.
Also, you're confusing the heap and data segment. The heap is where malloced memory lives. The data segment is for global and static variables.

Resources