How to efficiently store a large sparse array

I am tracking particles in a 3D lattice. Each lattice cell is labeled with an index corresponding to an unrolled 3D array
S = x + WIDTH * (y + DEPTH * z)
I am interested in the transition from cell S1 to cell S2. The resulting transition matrix M(S1,S2) is sparsely populated, because particles can only reach nearby cells. Unfortunately, with the indexing of an unrolled 3D array, cells that are geometrically near can have very different indices. For instance, cells that sit on top of each other (say at z and z+1) have indices that differ by WIDTH*DEPTH. Therefore, when I accumulate into the 2D matrix M(S1,S2), S1 and S2 can be very different even though the cells are adjacent. This is a significant problem, because I can't use the usual sparse matrix storage.
At the beginning I tried storing the matrix in coordinate format:
I , J VALUE
Unfortunately I then need to loop over the entire index set to find the proper (S1, S2) pair and store the accumulated M(S1,S2).
Usually sparse matrices have some underlying structure, so the indexing is quite straightforward. In this case, however, I am having trouble figuring out how to index my cells.
I would appreciate your help
Thank you in advance,

There are several approaches. Which is best depends on operations that need to be performed on the matrix.
A good general purpose one is to use a hash table where the key is the index tuple, in your case (i,j).
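For instance, a minimal Python sketch of that idea (the dictionary plays the role of the hash table; the function names are illustrative only, not from any particular library):

```python
from collections import defaultdict

# Accumulate transition counts keyed by the (S1, S2) index pair.
transitions = defaultdict(float)

def record_transition(s1, s2, weight=1.0):
    transitions[(s1, s2)] += weight        # expected O(1) per update

def transition(s1, s2):
    return transitions.get((s1, s2), 0.0)  # absent pairs are implicit zeros
```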
If neighboring (in the Euclidean sense) matrix elements must be discoverable, then an alternate strategy is a balanced tree with a Morton Order key. The Morton order value of a key (i,j) is just the integers i and j with their bits interleaved. You should quickly see that index tuples close to each other in the index 2-space are also close in linear Morton order.
Of course if you are building the matrix all at once, after which it's immutable, then you can build the key-value pairs in an array rather than a hash table or balanced tree, sort them (lexicographically for (i,j) pairs and linearly for Morton keys) and then do reads with simple binary search.
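To make the Morton-order idea concrete, here is a rough Python sketch (the function names are mine, not from any library): it interleaves the bits of the two indices and, for the immutable case, sorts the keyed entries once and answers reads with binary search.

```python
import bisect

def morton2d(i, j, bits=21):
    """Interleave the bits of i and j: i fills the even bit positions, j the odd ones."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (2 * b)
        key |= ((j >> b) & 1) << (2 * b + 1)
    return key

def build(entries):
    """entries: iterable of ((i, j), value). Sort once by Morton key."""
    table = sorted((morton2d(i, j), v) for (i, j), v in entries)
    keys = [k for k, _ in table]
    vals = [v for _, v in table]
    return keys, vals

def lookup(keys, vals, i, j):
    """Read M(i, j) with binary search; missing entries are zero."""
    k = morton2d(i, j)
    pos = bisect.bisect_left(keys, k)
    return vals[pos] if pos < len(keys) and keys[pos] == k else 0.0
```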

Related

Sparse matrices, sparse accumulator and multiplication

I'm trying to implement the algorithm for multiplying two sparse matrices from this paper: https://crd.lbl.gov/assets/pubs_presos/spgemmicpp08.pdf (the first algorithm - 1D algorithm).
What bothers me is that I'm not sure what a SPA (sparse accumulator) really is. I've done some research and concluded that a SPA represents a single row/column of a sparse matrix (I'm mostly unsure about that part) and that it consists of a dense vector of the nonzero values, a list of indices of the nonzero elements (why a list?), and a dense boolean vector of "occupied" flags (True at the i-th index if the element at that position of the active row/column is not zero). Some implementations also keep the number of nonzero entries.
Am I correct? If so, I have some questions. If this structure has a dense boolean vector and we must keep the values, isn't it easier to simply fill one dense vector and ignore that it's sparse? I'm sure that there are reasons why this is more efficient (memory and time), but I don't see why.
Also, as I've already asked, why is everything a vector except the list of indices? Why isn't that also a vector?
Thanks in advance!
Many sparse matrix algorithms use a dense working vector to allow random access to the currently "active" column or row of a matrix.
The sparse MATLAB implementation formalizes this idea by defining an abstract data type called the sparse accumulator, or SPA. The SPA consists of a dense vector of real (or complex) values, a dense vector of true/false "occupied" flags, and an unordered list of the indices whose occupied flags are true.
The SPA represents a column vector whose "unoccupied" positions are zero and whose "occupied" positions have values (zero or nonzero) specified by the dense real or complex vector. It allows random access to a single element in constant time, as well as sequencing through the occupied positions in constant time per element.
Check section 3.1.3 at https://epubs.siam.org/doi/pdf/10.1137/0613024
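To make that description concrete, here is a rough Python sketch of a SPA for one active column of an n-row matrix (my own illustration, not the MATLAB internals):

```python
class SPA:
    """Sparse accumulator: dense values, dense occupied flags, list of touched indices."""

    def __init__(self, n):
        self.value = [0.0] * n         # dense vector of (real) values
        self.occupied = [False] * n    # dense vector of "occupied" flags
        self.indices = []              # unordered list of occupied positions

    def add(self, i, v):
        """Accumulate v into position i in O(1)."""
        if not self.occupied[i]:
            self.occupied[i] = True
            self.indices.append(i)     # the list grows only on first touch
        self.value[i] += v

    def to_pairs(self):
        """Emit the occupied (index, value) pairs in O(nnz)."""
        return [(i, self.value[i]) for i in self.indices]

    def reset(self):
        """Clear only the touched positions: O(nnz), not O(n)."""
        for i in self.indices:
            self.value[i] = 0.0
            self.occupied[i] = False
        self.indices.clear()
```

The index list is what lets you enumerate or reset only the occupied positions in time proportional to their number; with a dense vector alone, every output or reset would cost O(n), which is exactly what sparse algorithms try to avoid.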

Find the largest rectangle with no repeated elements

Find the maximum size of a contiguous rectangular submatrix whose elements are unique (i.e. not repeated within the submatrix).
How can I solve this?
Set a maximum value to 0. Iterate over the rows of the matrix and, if a row is not repeating (whatever that means for your data), compare its size to the maximum. If it is bigger, store the new maximum value and use it for the further iterations. When you find a new maximum, also store whatever else you need to store. So the algorithm looks like this:
maximum <- 0
for all rows as row
    if (row is not repeating) then
        if (row rectangle size > maximum) then
            maximum <- new maximum
            store whatever you need to store
        end if
    end if
end for
Note that if you do not have further information, it is pointless to do a binary search, since you will have to check each rectangle anyway. If you have further knowledge about your rectangles, the algorithm might be optimized.
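For illustration only, here is a brute-force Python sketch that checks every rectangle with a set; it makes the "check each rectangle" point concrete, but is far too slow for large matrices:

```python
def largest_unique_rectangle(matrix):
    """Maximum area of a contiguous rectangular submatrix with no repeated values.
    Pure brute force: enumerate every rectangle and test it with a set."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    best = 0
    for top in range(rows):
        for left in range(cols):
            for bottom in range(top, rows):
                for right in range(left, cols):
                    seen = set()
                    ok = True
                    for r in range(top, bottom + 1):
                        for c in range(left, right + 1):
                            if matrix[r][c] in seen:
                                ok = False
                                break
                            seen.add(matrix[r][c])
                        if not ok:
                            break
                    if ok:
                        area = (bottom - top + 1) * (right - left + 1)
                        best = max(best, area)
    return best
```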
A first idea (recursion): maybe identify duplicate pairs in the whole array; these give constraints to respect. If a value v appears at both positions (x0,y0) and (x1,y1), then no rectangle may contain both positions, so this lets you construct some candidate rectangles from these constraints and recurse on them?
Another one (dynamic programming): start with elementary arrays (size 1x1) and try to merge them while respecting the constraint?

In-place sorting items into segments

I have an array of n items of type T, and a categorization function f(t) that assigns to each item a category number, from 0 to k-1 (k being the number of categories).
The goal is to divide the array into k segments, one for each category, and rearrange the items so that they are all in the right segment.
With two different arrays for input and output, I could do it in O(n), but I need to do it in-place (i.e. using swaps as basic operation), and if possible, using a parallelizable algorithm.
One idea would be to do one segment after the other (first swapping all 0's into a segment at the beginning [0, i0], then all 1's (starting after i0) into a new segment after that, etc.). This would be O(n * k) (with n getting smaller at each step), but it is not parallelizable.
Another way would be to use a sorting algorithm in O(n log n) that may be parallelizable, but this is likely not optimal because most items compare as equal.
My question is what would be a good approach for this problem, and how this problem would be called in literature?
As a quick note, this problem is related to - but not exactly the same as - the Dutch national flag problem. In this problem, you have an array with balls of three different colors (red, white, and blue), and the goal is to reorder the elements to get them sorted so that red comes first, then white, then blue.
Using ideas from the Dutch national flag problem, I think that you can solve this relatively efficiently and in-place. For example, you may want to use a quicksort variant that's specifically designed to handle duplicate elements. The Bentley-McIlroy 3-way partitioning algorithm, for example, was specifically designed to handle inputs where there are a lot of duplicate keys and does a quicksort where the partitioning scheme groups elements into three groups - elements less than the key, elements greater than the key, and elements equal to the key - then only sorts the "less" and "greater" groups. If you have an array with only k distinct values in it, then the runtime will be O(n log k) on expectation, since each recursive call will be made on a subarray with roughly half as many distinct keys in it. This isn't O(n), but it does work in-place and parallelizes really well (have different threads handle each subarray).
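Here is a rough Python sketch of that idea, using a plain Dutch-national-flag style 3-way partition on the category numbers rather than Bentley-McIlroy's exact scheme (the function and parameter names are mine):

```python
import random

def sort_by_category(items, f, lo=0, hi=None):
    """In-place 3-way quicksort on category numbers f(t);
    expected O(n log k) for k distinct categories."""
    if hi is None:
        hi = len(items)
    if hi - lo <= 1:
        return
    pivot = f(items[random.randrange(lo, hi)])
    # Invariant: [lo, lt) < pivot, [lt, i) == pivot, [gt, hi) > pivot.
    lt, i, gt = lo, lo, hi
    while i < gt:
        c = f(items[i])
        if c < pivot:
            items[lt], items[i] = items[i], items[lt]
            lt += 1
            i += 1
        elif c > pivot:
            gt -= 1
            items[gt], items[i] = items[i], items[gt]
        else:
            i += 1
    # The "< pivot" and "> pivot" ranges are disjoint, so the two recursive
    # calls can be handed to separate threads or tasks.
    sort_by_category(items, f, lo, lt)
    sort_by_category(items, f, gt, hi)
```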

Finding Median in Three Sorted Arrays in O(logn)

From some googling, I know the basic idea:
1. Let A, B, and C be sorted arrays, each containing n elements.
2. Pick the median in each array and call them medA, medB, and medC.
3. Without loss of generality, suppose that medA > medB > medC.
4. The elements bigger than medA in array A cannot become the median of the three arrays. Likewise, the elements smaller than medC in array C cannot, so such elements will be ignored.
5. Repeat steps 2-4 recursively.
My question is, what is the base case?
Assuming a lot of base cases, I tested the algorithm by hands for hours, but I was not able to find a correct base case.
Also, the lengths of the three arrays will become different at every recursive step. Does step 4 still work even if the lengths of the three arrays are different?
This algorithm works for two sorted arrays of the same size, but not for three. After one iteration you eliminate half of the elements in A and C but leave B unchanged, so the arrays no longer contain the same number of elements and the method no longer applies. For arrays of different sizes, if you apply the same method, you will be removing different numbers of elements from the lower half and from the upper half, so the median of the remaining elements is not the same as the median of the original arrays.
That being said, you can modify the algorithm to eliminate the same number of elements at both ends in each iteration, though this can be inefficient when some of the arrays are very small and some are very large. You can also turn this into the question of finding the k-th element, tracking the number of elements thrown away and updating the value of k at each iteration. Either way, this is much trickier than the two-array situation.
There is another post talking about a general case: Median of 5 sorted arrays
I think you can use the selection algorithm, slightly modified to handle more arrays.
You're looking for the median, which is the p=[n/2]th element.
Pick the median of the largest array, find for that value the splitting point in the other two arrays (binary search, log(n)). Now you know that the selected number is the kth (k = sum of the positions).
If k > p, discard elements in the 3 arrays above it; if smaller, below it (discarding can be implemented by maintaining lower and upper indexes for each array, separately). If it was smaller, also update p = p - k.
Repeat until k=p.
Oops, I think this is log(n)^2, let me think about it...
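Not the position-based algorithm sketched above, but a compact alternative that is easy to get right if the elements are integers: binary-search on the value and count, with bisect, how many elements are less than or equal to a candidate. This is a sketch of my own, not taken from the question:

```python
from bisect import bisect_right

def kth_smallest(arrays, k):
    """k-th smallest (1-based) element over several sorted integer arrays,
    found by binary search on the value rather than on positions."""
    lo = min(a[0] for a in arrays if a)
    hi = max(a[-1] for a in arrays if a)
    while lo < hi:
        mid = (lo + hi) // 2
        if sum(bisect_right(a, mid) for a in arrays) >= k:
            hi = mid                      # the answer is <= mid
        else:
            lo = mid + 1                  # the answer is > mid
    return lo

def median_of_three(a, b, c):
    n = len(a) + len(b) + len(c)
    return kth_smallest([a, b, c], (n + 1) // 2)
```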

Find common elements in two sorted arrays [duplicate]

Possible Duplicate:
The intersection of two sorted arrays
We have two sorted arrays A and B. Besides comparing each element of one with all the elements of the other array, how can we design a better algorithm to find their common elements?
Hold two pointers: one for each array.
i <- 0, j <- 0
repeat while i < length(arr1) and j < length(arr2):
    if arr1[i] > arr2[j]: increase j
    else if arr1[i] < arr2[j]: increase i
    else: output arr1[i], increase both pointers
The idea is that, since the data is sorted, once an element in one array is "too big" for the current element of the other, that smaller element can never match anything remaining in the first array, so its pointer can safely be advanced.
This solution requires a single traversal on the data. O(n) (with good constants as well).
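The same idea as runnable Python:

```python
def intersect_sorted(arr1, arr2):
    """Common elements of two sorted arrays in one simultaneous pass, O(N + M)."""
    i = j = 0
    out = []
    while i < len(arr1) and j < len(arr2):
        if arr1[i] > arr2[j]:
            j += 1                 # arr2[j] cannot match anything later in arr1
        elif arr1[i] < arr2[j]:
            i += 1                 # arr1[i] cannot match anything later in arr2
        else:
            out.append(arr1[i])
            i += 1
            j += 1
    return out
```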
If the lengths of the two arrays (say, A has N elements and B has M elements) are similar, then the best approach is to perform a linear search of one array's elements in the other array. Of course, since the arrays are sorted, each search should begin where the previous search stopped. This is the classic principle used in the "sorted array merge" algorithm. The complexity is O(N + M).
If the lengths are significantly different (say, M << N), then a much more optimal approach would be to iterate through elements of the shorter array and use binary search to look for these values in the longer array. The complexity is O(M * log N) in that case.
As you can see O(M * log N) is better than O(N + M) if M is much smaller than N, and worse otherwise.
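A small sketch of the second approach (M << N), assuming Python's bisect module:

```python
from bisect import bisect_left

def intersect_binary(short, long):
    """Binary-search each element of the shorter array in the longer one: O(M log N)."""
    out = []
    for x in short:
        i = bisect_left(long, x)
        if i < len(long) and long[i] == x:
            out.append(x)
    return out
```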
The difference in array sizes which should trigger the switch from one approach to the other depends on practical considerations. It should be chosen based on practical experiments with your data.
These two approaches (linear and binary searches) can be "blended" into a single algorithm. Let's assume M <= N. In that case let's choose step value S = [N / M]. You take first element from array A and perform a straddled linear search for that element in array B with step S, meaning that you check elements B[0], B[S], B[2*S], B[3*S], ... and so on. Once you find the index range [S*i, S*(i+1)] that potentially contains the element you are searching for, you switch to binary search inside that segment of array B. Done. The straddled linear search for the next element of A begins where the previous search left off. (As a side note, it might make sense to choose the value of S equal to a power of 2).
This "blended" algorithm is the most asymptotically optimal search/merge algorithm for two sorted arrays in existence. However, in practice the more simple approach with choosing either binary or linear search depending on relative sizes of the arrays works perfectly well.
besides comparing each element of one with all the elements of the other array
You will have to compare A[] to B[] in order to know that they are the same -- unless you know a lot about what kind of data they can hold. The nature of the comparison probably has many solutions and can be optimized as required.
If the arrays are very strictly created, i.e. they contain only sequential values of a known pattern and always start from a known point, you could just look at the length of each array and know whether or not all the items are common.
Unfortunately, that doesn't sound like a very realistic or useful array, and so you are back to checking for A[i] in B[].
