Searching for first set bit in a huge bitmap (C)

Interview question:
In a parking lot which can hold a million cars, you need to find a free parking slot. There are no constraints on where the slot is, i.e., the lot can have multiple entrances and finding a slot near an entrance does not matter. The question was what kind of data structure should be used and what the complexity of the various operations would be.
I suggested using a bit array of a million bits, with 0/1 for a taken/free slot, so finding a free spot translates to finding the first set bit. Do not assume anything about how many cars there are, i.e., the bit array could be sparse or dense.
What is the fastest way to find a set bit in a huge bitmap? I did suggest binary search plus an efficient ffs() per word as the scheme.
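
For the per-word part, here is a minimal sketch of a word-at-a-time scan in C (names are illustrative; it assumes GCC/Clang for __builtin_ctzll, which plays the role of ffs() here):

#include <stdint.h>
#include <stddef.h>

/* Return the index of the first set (free) bit, or -1 if none.
   The bitmap is nwords 64-bit words; a million bits is ~15,625 words. */
long first_set_bit(const uint64_t *bitmap, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        if (bitmap[i] != 0)   /* skip all-zero words 64 bits at a time */
            return (long)(i * 64) + __builtin_ctzll(bitmap[i]);
    }
    return -1;                /* no free slot */
}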

A million 32-bit integers require about 4MB of memory. So I'd say you keep a list of free slots. Whenever a car enters, you take an item off the list and assign that. Whenever a car leaves, you put the freed slot number into the list.
As you'd only ever be manipulating the end of the list (so it is in fact used as a stack or LIFO structure), this gives you optimal O(1) performance both for finding a free slot and for returning a slot to the free state. If you do this at a low level with a raw chunk of memory, you'll need a pointer indicating the current end of the list. Taking a slot decrements that pointer and returns the value it then points to; returning a slot stores the slot number at the pointer and increments it afterwards.
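
A minimal sketch of that free-slot stack in C (the names and the fixed size are illustrative, assuming slot numbers fit in 32 bits):

#include <stdint.h>
#include <stddef.h>

#define SLOTS 1000000

static uint32_t free_list[SLOTS]; /* slot numbers currently free */
static size_t   top;              /* current end of the list */

void init(void)                   /* initially, every slot is free */
{
    for (uint32_t i = 0; i < SLOTS; i++)
        free_list[i] = i;
    top = SLOTS;
}

long take_slot(void)              /* O(1): pop a free slot, -1 if lot is full */
{
    return top > 0 ? (long)free_list[--top] : -1;
}

void return_slot(uint32_t slot)   /* O(1): push a freed slot back */
{
    free_list[top++] = slot;
}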
If you decide to add additional requirements later on, you could do some manipulation of your data, e.g. turn it into a heap. With a big map of 0/1 bits, such extensions wouldn't be possible.

You can go this way:
Store the index of the last freed slot in a variable, and when looking for the next free one, don't scan the bitmap from the beginning but from that index.
When you free a slot, assign its index to that variable.
std::vector<bool> can serve as your bit array, so you will not need to deal with the bits yourself (the bools are bit-packed into words internally).
You can introduce a mip-mapped structure:
std::vector<bool> Bitmap;  // full size
std::vector<bool> Bitmap2; // half-sized
std::vector<bool> Bitmap4; // 1/4
std::vector<bool> Bitmap8; // 1/8
// etc.
A free value in an upper-level array indicates that the corresponding region of the lower-level array has at least one free slot. You can traverse this structure with a binary-search-like descent to locate a free bit.
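
A sketch of the same two-level idea in plain C rather than std::vector<bool>, with one summary bit per 64-bit bitmap word (names are illustrative; __builtin_ctzll assumes GCC/Clang, and further levels follow the same pattern):

#include <stdint.h>
#include <stddef.h>

#define NWORDS 15625                /* 1,000,000 bits / 64 */
#define NSUMS  ((NWORDS + 63) / 64)

uint64_t bits[NWORDS];              /* bit set = slot free */
uint64_t summary[NSUMS];            /* bit i of word s set = bits[64*s+i] != 0 */

long find_free(void)
{
    for (size_t s = 0; s < NSUMS; s++) {
        if (summary[s] == 0)
            continue;               /* skips 4096 slots at once */
        size_t w = s * 64 + (size_t)__builtin_ctzll(summary[s]);
        return (long)(w * 64) + __builtin_ctzll(bits[w]);
    }
    return -1;
}

void mark(size_t slot, int is_free) /* keep the summary level in sync */
{
    size_t w = slot / 64;
    if (is_free)
        bits[w] |= 1ULL << (slot % 64);
    else
        bits[w] &= ~(1ULL << (slot % 64));
    if (bits[w])
        summary[w / 64] |= 1ULL << (w % 64);
    else
        summary[w / 64] &= ~(1ULL << (w % 64));
}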

Use of memory between an array and a linked list

In C, which is more efficient in terms of memory management, a linked list or an array?
For my program, I could use one or both of them. I would like to take this point into consideration before starting.
Both linked lists and arrays have good and bad sides.
Array
Accessing a particular position takes O(1) time, because an array's memory is contiguous: if the address of the first element is A, the address of the 5th element is A + 4 (times the element size).
If you want to insert a number at some position, it takes O(n) time, because you have to shift every element after that position and also grow the array (see the sketch after this list).
About searching for an element: if the array is sorted, you can do a binary search, and since each access is O(1), the whole search is O(log n). If the array is not sorted, you have to traverse the entire array, so O(n) time.
Deletion is the exact opposite of insertion: you have to shift left all the elements after the deleted position, and you might also recreate the array for memory efficiency. So O(n).
Memory must be contiguous, which can be a problem on old x86 machines with 64k segments.
Freeing is a single operation.
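
A minimal sketch of the O(n) insertion shift mentioned above, assuming the array already has capacity for one more element:

#include <string.h>

/* Insert value at position pos in arr, which currently holds *len
   elements and has capacity for at least *len + 1. */
void array_insert(int *arr, size_t *len, size_t pos, int value)
{
    /* shift the tail one element to the right: O(n) */
    memmove(&arr[pos + 1], &arr[pos], (*len - pos) * sizeof arr[0]);
    arr[pos] = value;
    (*len)++;
}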
Linked list
Accessing a particular position takes O(n) time, because you have to traverse the list from the head to reach that position.
If you want to insert a number at some position and you already have a pointer to that position, it takes O(1) time to insert the new value.
About searching for an element: no matter how the numbers are arranged, you have to traverse the list from front to back, one node at a time, to find your particular number. So it's always O(n).
Deletion is the exact opposite of insertion: if you already know the position via some pointer, it is O(1). Suppose the list is p -> q -> r and you want to delete q: all you need is to set the next of p to r, nothing else. So O(1), given you have the pointer to p (see the sketch below).
Memory is dispersed. With a naive implementation that can be bad for cache locality, and the overall footprint can be high because the memory allocator has per-node overhead. However, careful programming can get around this problem.
Freeing the whole list requires a separate call for each node; again, careful programming can get around this problem.
So depending on what kind of problem you are solving you have to choose one of the two.
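
A minimal sketch of the O(1) unlink described in the deletion point above (the node type is illustrative):

#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

/* Delete the node after p in O(1): p -> q -> r becomes p -> r. */
void delete_after(struct node *p)
{
    struct node *q = p->next;
    if (q != NULL) {
        p->next = q->next;
        free(q);
    }
}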
A linked list uses more memory, both for the links themselves and inside the memory manager, because you are allocating many individual blocks of memory.
That does not mean it is less efficient at all, depending on what you are doing.
While a linked list uses more memory, adding or removing elements from it is very efficient, as it doesn't require moving data around at all, while resizing a dynamic array means you have to allocate a whole new area in memory to fit the modified array with items added or removed. You can also sort a linked list without moving its data.
On the other hand, arrays can be substantially faster to iterate over due to caching, prefetching, etc., as the data is placed sequentially in memory.
Which one is better for you will really depend on the application.

Can I represent a red black tree as an array?

Is it worth representing a red-black tree as an array to eliminate the memory overhead? Or will the array take up more memory, since it will have empty slots?
It will have both positive and negative sides. This answer applies to C [since you mentioned this is what you will use].
Positive sides
Let's assume you have created an array as a pool of objects to be used for the red-black tree. Deleting an element, or initializing a new element once its position is found, will be a little faster, because you will be using a memory pool you created yourself.
Negative sides
Yes, the array will most probably end up taking more memory, since it will sometimes have empty slots.
You have to be sure about the MAX size of the red-black tree up front, so there is a size limitation.
Tree traversal jumps around the array rather than walking it in order, so you are not using the benefit of the sequential memory space; that can be a waste of the resource.
Yes, you can represent a red-black tree as an array, but it's not worth it.
The maximum height of a red-black tree is 2*log2(n+1), where n is the number of entries. In an array (heap-style) representation, level L of the tree occupies 2^L slots, so the array must cover 2^height entries in the worst case. To store 1,000 entries you'd have to allocate an array of 2^20 = 1,048,576 entries; to store 1,000,000 entries, an array of 2^40 = 1,099,511,627,776 entries.
It's not worth it.
A red-black tree (and most data structures, really) doesn't care which storage facility is used. That means you can use an array, or even a HashTable/Map, to store the tree nodes; the array index or map key becomes your new "pointer". You could even put the tree on disk as a file and use file offsets as node indices if you wanted to (though in that case you should use a B-Tree instead).
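
A minimal sketch of that index-as-pointer layout in C, with a free list threaded through the pool (all names and the pool size are illustrative):

#define MAX_NODES 4096
#define NIL (-1)                 /* index that plays the role of NULL */

struct rb_node {
    int key;
    int left, right, parent;     /* indices into the pool, not pointers */
    unsigned char red;
};

static struct rb_node pool[MAX_NODES];
static int free_head;            /* free nodes threaded through .left */

void pool_init(void)
{
    for (int i = 0; i < MAX_NODES; i++)
        pool[i].left = (i + 1 < MAX_NODES) ? i + 1 : NIL;
    free_head = 0;
}

int node_alloc(void)             /* returns an index, or NIL if exhausted */
{
    int i = free_head;
    if (i != NIL)
        free_head = pool[i].left;
    return i;
}

void node_free(int i)            /* return an index to the pool */
{
    pool[i].left = free_head;
    free_head = i;
}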
The main problem is the increased complexity: you now have to manage the storage manually (as opposed to letting the OS and/or language runtime do it for you). Sometimes you want to scale the array up so you can store more items, and sometimes you want to scale it down (vacuum) to free unused space. These operations can be costly on their own.
Memory-usage wise, the storage facility does not change how many nodes the tree has. If you have 2,000 nodes in your old-school pointer-based tree (tree height = 10), you'll still have 2,000 nodes in your fancy array-based tree (the height is still 10). However, redundant space may exist in between vacuum operations.

Trees: Linked Lists vs Arrays (Efficiency)

This is an assignment question that I am having trouble wording an answer to.
"Suppose a tree may have up to k children per node. Let v be the average number of children per node. For what value(s) of v is it more efficient (in terms of space used) to store the child nodes in a linked list versus storage in an array? Why?"
I believe I can answer the "why?" more or less in plain English: it is more efficient to use the linked list because, rather than having a bunch of empty nodes (i.e., empty indexes in the array when your average is lower than the max) taking up memory, you only allocate space for a linked-list node when you're actually filling in a value.
So if you've got an average of 6 children when your maximum is 200, the array will reserve space for all 200 children of each node when the tree is created, but the linked list will only allocate space for nodes as needed. With the linked list, space used will be roughly proportional to the average; with the array, space used will be proportional to the max.
...I don't see when it would ever be more efficient to use the array. Is this a trick question? Do I have to take into account the fact that the array needs to have a limit on total number of nodes when it's created?
For many commonly used languages, the array will require allocating storage for k memory addresses (of the children's data). A singly-linked list will require 2 addresses per node (data and next); a doubly-linked list would require 3 addresses per node.
Let n be the actual number of children of a particular node A:
The array uses k memory addresses
The singly-linked list uses 2n addresses
The doubly-linked list uses 3n addresses
Comparing k with 2n or 3n tells you whether the list is a gain or a loss relative to simply storing the addresses in an array; with v the average number of children, the singly-linked list wins on space when 2v < k, i.e., v < k/2 (and the doubly-linked list when v < k/3).
...I don't see when it would ever be more efficient to use the array. Is this a trick question?
It’s not a trick question. Think of the memory overhead that a linked list has. How is a linked list implemented (vs. an array)?
Also (though this is beyond the scope of the question!), space consumption isn’t the only deciding factor in practice. Caching plays an important role in modern CPUs and storing the individual child nodes in an array instead of a linked list can improve the cache locality (and consequently the tree’s performance) drastically.
Arrays must pre-allocate space, but we can use them to access any entry very fast.
Lists allocate memory whenever they create a new node, which isn't ideal because memory allocation costs CPU time.
Tip: you can allocate the whole array at once if you want to, but usually we allocate, let's say, 4 entries and resize by doubling the size when we need more space (see the sketch below).
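
A minimal sketch of that doubling strategy in C (struct and function names are illustrative):

#include <stdlib.h>

struct vec {
    int    *data;
    size_t  len, cap;            /* elements used / elements allocated */
};

/* Append one value, doubling the capacity when full: amortized O(1). */
int vec_push(struct vec *v, int value)
{
    if (v->len == v->cap) {
        size_t newcap = v->cap ? v->cap * 2 : 4;  /* start with 4 entries */
        int *p = realloc(v->data, newcap * sizeof *p);
        if (p == NULL)
            return -1;                            /* allocation failed */
        v->data = p;
        v->cap  = newcap;
    }
    v->data[v->len++] = value;
    return 0;
}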
I can imagine it could be a very good idea, and very efficient in many scenarios, to use a linked list if the data item has intrinsic logic for a previous and next item anyway, for example a TimeTableItem or anything that is somehow time-related. Such items should implement an interface, so the linked-list implementation can leverage it and doesn't have to wrap the items in its own node objects. Inserting and removing would be much more efficient here than with a List implementation that internally juggles arrays around.
You're assuming the array can't be dynamically re-allocated. If it can, the array wins: it need be no bigger than the actual number of children (plus a constant overhead), and it doesn't need per-item storage for pointers.

Sparse Multi-Dimensional Data Representation

I'm working on a cardiac simulation tool that uses 4-dimensional data, i.e. several (3-30) variables at locations in 3D space.
I'm now adding some tissue geometry which will leave over 2/3 of the points in the containing 3D box outside of the tissue to be simulated, so I need a way to efficiently store the active points and not the others.
Crucially, I need to be able to:
Iterate over all of the active points within a constrained 3D box (iterator, perhaps?)
Having accessed one point, find its orthogonal neighbours (x,y,z) +/- 1.
That's probably more than one question! My main concern is how to efficiently represent the sparse data.
I'm using C.
How often do you add the tissue, and how much time can it take?
One simple solution is using a linked list+hash with pointers from one to the other.
Meaning:
Save a linked list containing all of the relevant points and their data
Save a hash for easy access to this data: key = coordinates, data = pointer to the linked-list node.
The implementation of the operations would be:
Add a box: Go over the full linked list, and take only the relevant elements into the "work" linked list
Iterate: Go over the "work" linked list
Find neighbors: Seek each of the neighbors in the hash.
Complexity:
Add: O(n), Iterate O(1) for finding next element, Neighbor O(1) average (due to hash).
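
A minimal sketch of such a layout in C; the hash function, table size, and field names are illustrative placeholders, not something prescribed by the answer:

#include <stddef.h>

#define NVARS   30
#define NBUCKET 65536

struct point {
    int x, y, z;
    double vars[NVARS];          /* the 3-30 simulation variables */
    struct point *next;          /* the "work" linked list of active points */
    struct point *hnext;         /* chain within a hash bucket */
};

static struct point *buckets[NBUCKET];

static size_t hash_xyz(int x, int y, int z)
{
    return ((size_t)x * 73856093u ^ (size_t)y * 19349663u
            ^ (size_t)z * 83492791u) % NBUCKET;
}

/* O(1) average: find a point (e.g. the neighbour at x+1) by coordinates. */
struct point *lookup(int x, int y, int z)
{
    struct point *p;
    for (p = buckets[hash_xyz(x, y, z)]; p != NULL; p = p->hnext)
        if (p->x == x && p->y == y && p->z == z)
            return p;
    return NULL;
}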
If you want to use plain array indexing, you can create a sparse array on POSIX systems using mmap():
#include <sys/mman.h>

float (*a)[500][500];

a = mmap(NULL, (size_t)500 * sizeof a[0], PROT_READ | PROT_WRITE,
         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if ((void *)a != MAP_FAILED)
{
    /* a is now a 500 x 500 x 500 sparse array of floats */
}
You can then just access a[x][y][z] as you like, and it will only actually allocate real memory for each page that's touched. The array will be initialised to zero bytes.
If your system doesn't have MAP_ANONYMOUS you can achieve the same effect by mapping from /dev/zero.
Note that on many systems, swap space will be reserved (though not used) for the whole array.
First off, I think it's worth considering what your real requirement is. I suspect that it's not just "store the active points and none of the others in as space-efficient a manner as possible", but also a certain amount of "store adjacent points in nearby memory locations so that you get good caching behavior" and "store points in a manner for which lookups can be done efficiently".
With that said, here's what I would suggest. Divide the full 3D region into cubical blocks, all the same size. For each block, store all of the points in the block in dense arrays, including a boolean isTissue array for whether each point is in the tissue region or not. Allocate only the blocks that have points in them. Make a (dense) array of pointers to blocks, with NULL pointers for non-allocated blocks.
Thus, to find the point at (i,j) (the third dimension works the same way), you first compute ii = i/blocksize and jj = j/blocksize, then look in the pointer-to-block table at (ii,jj) to find the block that contains your point. If that pointer is NULL, your point isn't in the tissue. If it's non-NULL, you look at (i mod blocksize, j mod blocksize) in that block, and there is your (i,j) point. You can check its isTissue flag to see whether it's a "present" point or not.
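
A minimal sketch of that lookup in C, written for the 3D case (block size, table size, and field names are illustrative):

#include <stdbool.h>
#include <stddef.h>

#define BS  16                   /* block edge length, in points */
#define NBX 32                   /* blocks per axis: 32 * 16 = 512 points */

struct block {
    bool  isTissue[BS][BS][BS];  /* is this point inside the tissue? */
    float vars[BS][BS][BS];      /* per-point data (one variable shown) */
};

static struct block *blocks[NBX][NBX][NBX]; /* NULL = block not allocated */

/* Return a pointer to the point's data, or NULL if outside the tissue. */
float *point_lookup(int i, int j, int k)
{
    struct block *b = blocks[i / BS][j / BS][k / BS];
    if (b == NULL || !b->isTissue[i % BS][j % BS][k % BS])
        return NULL;
    return &b->vars[i % BS][j % BS][k % BS];
}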
You'll want to choose the block size as a balance between minimizing the number of times you do adjacent-point computations that cross block boundaries, and minimizing the number of points that are in blocks but not in the tissue region. I'd guess that at a minimum you want a row of the block to be a cache-line long. Probably the optimum is rather larger than that, though it will depend at least somewhat on your geometry.
To iterate over all the points in a 3D box, you would either just do lookups for each point, or (more efficiently) figure out which blocks the box touches, and iterate over the regions in those blocks that are within your box, skipping the ones where isTissue is false.
If you're doing a lot of deallocation and re-allocation of blocks, you probably want to "deallocate" blocks by dropping them into an "unused" pool, and then pulling blocks out of that pool rather than reallocating them. This also has the advantage that those blocks will already have all of their points set to "not present" (because that's why you deallocated the block), so you don't need to initialize them.
The experienced reader will probably recognize similarities between this and ways of laying out data for parallel computations; if you have a really big simulation, you can easily distribute the blocks across multiple nodes, and you only need to do cross-node communication for the cross-block computations. For that sort of application, you may find it useful to do nested levels of blocks, where you have meta-blocks (for the cross-node communication) containing smaller blocks (for the geometry).
