motion estimation by FS-LBP - video-processing

I want to do motion estimation with the FS-LBP method, but I need to compute NNMN, and I don't understand the ⊗ symbol.
The formula is NNMN(m, n) = Σ_i Σ_j LBP_t(i, j) ⊗ LBP_{t−1}(i + m, j + n), for −s ≤ m, n ≤ s − 1.
LBP_t and LBP_{t−1} are the LBP transforms of the current frame and the reference frame (the previous frame) respectively, and s is the search range. NNMN counts the number of mismatching neighbors around the central pixel of a macro-block between the current frame and the reference frame.

The ⊗ symbol is the XOR operation.
For NNMN, each pixel's LBP code carries 8 bits of information about its 8 neighbors. Build the search window of ±16 pixels around the block position in LBP_{t−1} (so the search window is 48x48 for a 16x16 block). Divide LBP_t into macro-blocks of size 16x16, and scan each macro-block over the search window to find the displacement with maximum similarity, i.e. minimum NNMN; a minimal sketch is given below.
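Here is a minimal sketch of computing NNMN for one candidate displacement (m, n) in Python. The function name, the numpy representation, and storing each LBP code as one uint8 are my assumptions for illustration, not part of the original question:

import numpy as np

def nnmn(lbp_cur, lbp_ref, bx, by, m, n, block=16):
    # Macro-block at (bx, by) in the current frame's LBP transform, and the
    # candidate block displaced by (m, n) in the reference frame's transform.
    # The caller must keep the displaced block inside the frame.
    cur = lbp_cur[by:by + block, bx:bx + block].astype(np.uint8)
    ref = lbp_ref[by + n:by + n + block, bx + m:bx + m + block].astype(np.uint8)
    # XOR the 8-bit LBP codes: each set bit marks one mismatching neighbor.
    diff = np.bitwise_xor(cur, ref)
    # Count the set bits over the whole block (popcount via unpackbits).
    return int(np.unpackbits(diff).sum())

The motion vector for the block is then the (m, n) over the search range −s ≤ m, n ≤ s − 1 that minimizes this count.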

Related

2D peak finding binary search

Taken from MIT 6.006: to find a peak in a 2D array, where a number is a peak if it is >= all of its neighbours:
Pick the middle column j = m/2.
Find the global maximum of column j; call its position (i, j).
Compare (i, j − 1), (i, j), (i, j + 1).
If (i, j − 1) > (i, j), recurse on the left columns; similarly for the right.
(i, j) is a 2D peak if neither condition holds.
Solve the new problem with half the number of columns.
When you have a single column, find its global maximum and you're done.
I understand why it might find a peak, but I think it only finds a peak in half of the array, if one exists there. Using binary search here confuses me, since:
(1) The 2D array is not sorted, and each time you halve you are essentially saying there can be no peak on the left (which is not confirmed?).
(2) It finds the maximum element in the middle column. This ignores the possibility of a peak formed from non-maximal numbers, or that there can be more than one 1D peak in that column.
(3) It compares the numbers to the left and right of the middle column's maximum. This discounts that there may be elements in the left and right columns that are larger than the maximum but not adjacent to it.
Can someone explain to me why this algorithm is correct, hopefully by addressing (1), (2) and (3)?
each time you halve you are essentially saying there can be no peak on the left
Ah, no, we're saying that there is a peak on the right. There can be peaks on the left too, but we don't need to find every peak.
To prove that there is a peak on the (without loss of generality) right, consider the following "gradient ascent" algorithm:
Start at an arbitrary number.
While the current number has at least one greater neighbor, go to an arbitrary greater neighbor.
This algorithm never cycles because the current number only increases. This algorithm hence terminates because there are finitely many numbers. When the algorithm terminates, it has found a peak.
Consider what happens if (i, j) has the maximum value in its column and we start gradient ascent at (i, j). Either (i, j) is a peak (great!), or we move to a greater number in one of the adjacent columns. In the latter case, this number is greater than the maximum in column j, hence greater than every number in column j. Therefore, gradient ascent will never reenter the column, and thus it will never enter the columns on the other side, implying the existence of a peak on the desired side.
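For concreteness, here is a short Python rendering of the pseudocode from the question (my own sketch, not code from the course); lo and hi delimit the current window of columns:

def peak_2d(grid, lo=0, hi=None):
    # Returns the (row, col) position of a 2D peak within the column window.
    if hi is None:
        hi = len(grid[0])
    j = (lo + hi) // 2                                    # middle column
    i = max(range(len(grid)), key=lambda r: grid[r][j])   # its global maximum
    left = grid[i][j - 1] if j - 1 >= lo else float('-inf')
    right = grid[i][j + 1] if j + 1 < hi else float('-inf')
    if left > grid[i][j]:
        return peak_2d(grid, lo, j)      # a peak exists among the left columns
    if right > grid[i][j]:
        return peak_2d(grid, j + 1, hi)  # a peak exists among the right columns
    return (i, j)                        # neither neighbour is larger: a peak

Each level costs one column scan, O(rows), and the number of columns halves every time, giving the familiar O(rows * log(columns)) bound.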
The idea of using binary search to find peak elements in a matrix is very good. But used alone, it returns only one peak element (and this value is generally not the maximum). You can use the following code to find all the peak elements by the binary-search method.
def Div(lst, low, high, c):
    # Recurse binary-search-style over the rows of column c, collecting
    # every entry strictly greater than its four axis neighbours (a peak).
    mid = (low + high) // 2
    if low > high:
        return ""
    if (mid + 1 < len(lst)
            and lst[mid][c] > lst[mid + 1][c]
            and lst[mid][c] > lst[mid - 1][c]
            and lst[mid][c] > lst[mid][c + 1]
            and lst[mid][c] > lst[mid][c - 1]):
        return str(lst[mid][c]) + " " + Div(lst, low, mid - 1, c) + Div(lst, mid + 1, high, c)
    else:
        return Div(lst, low, mid - 1, c) + Div(lst, mid + 1, high, c)

def peak(lst, c):
    return Div(lst, 0, len(lst), c)

lst = [[0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 4, 0],
       [0, 2, 0, 3, 0, 0],
       [0, 0, 5, 0, 6, 0],
       [0, 0, 0, 0, 0, 0]]

for i in range(0, 5):
    print(peak(lst, i))
Output (the blank line printed for column 0 is omitted):
2
1 5
3
4 6
Complexity is O(mn), since the recursion still visits every row of each column.

Fill Grid With Random Pixels

I have a grid of pixels, 64x8. The aim is to activate the pixels on this grid in a random manner until the whole grid is activated.
Logically, I can generate random numbers in the 0-63 and 0-7 ranges and then activate that pixel. Assuming I run this for long enough, the grid should be completely activated.
However, I am wondering if there is any algorithm that can minimize or altogether avoid collisions (returning an already-activated pixel coordinate) and guarantee complete grid activation in a finite amount of time?
Fill an array of length 512 with numbers increasing from 0 to 511 (64x8 = 512), so the array will contain {0, 1, 2, 3, ..., 511}.
Then shuffle that array, for example as explained here: Shuffle array in C.
Then define a function that maps a number to a coordinate, which would be:
y = n / 8
x = n % 8
with n being one of the numbers of the array.
If the array is well shuffled, this guarantees that all pixels will be activated, each exactly once, in a random order.
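A short sketch of this approach in Python (the answer links a C shuffle; random.shuffle is the equivalent Fisher-Yates shuffle, and the final print stands in for whatever hypothetical call drives the display):

import random

def activation_order(width=64, height=8):
    order = list(range(width * height))   # one index per pixel: 0..511
    random.shuffle(order)                 # Fisher-Yates under the hood
    # Map each index to a coordinate; every pixel appears exactly once.
    return [(n % width, n // width) for n in order]

for x, y in activation_order():
    print(x, y)                           # activate_pixel(x, y) in real code

Which axis gets the division and which the modulo is an arbitrary choice, as long as it is used consistently.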
You could implement a pseudo-random generator (PRG; see Wikipedia) with a period of 64 * 8 = 512. Use 3 bits for the axis with 8 pixels, and the remaining 6 bits for the axis with 64.
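One concrete way to get such a generator is a 9-bit maximal-length LFSR; the sketch below is my own illustration, with a caveat the answer glosses over: a maximal LFSR cycles through the 511 nonzero states only, so the all-zero state (one pixel) must be emitted separately. The taps come from x^9 + x^5 + 1, a standard primitive polynomial for 9 bits:

def lfsr_pixels(seed=1):
    # 9-bit Fibonacci LFSR: visits all 511 nonzero states exactly once.
    state = (seed & 0x1FF) or 1      # seed must be nonzero
    yield (0, 0)                     # the all-zero state, emitted by hand
    for _ in range(511):
        col = state >> 3             # high 6 bits -> axis with 64 pixels
        row = state & 0x7            # low 3 bits  -> axis with 8 pixels
        yield (col, row)
        bit = ((state >> 8) ^ (state >> 4)) & 1   # taps at bits 9 and 5
        state = ((state << 1) | bit) & 0x1FF

Unlike the shuffle, this needs no 512-entry table, at the cost of a fixed (rather than freshly random) visiting order per seed.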

Searching through a partially sorted array in O(lgn)

I'm having a hard time solving this problem.
A[1..n] is an array of real numbers which is partially sorted: there are some p, q (1 <= p <= q <= n) such that:
A[1] <= ... <= A[p]
A[p] >= ... >= A[q]
A[q] <= ... <= A[n]
How can we find a value in this array in O(lgn)?
(You can assume that the value exists in the array)
If p and q are known, make 3 binary searches: from 1 to p, from p to q, and from q to n. The complexity is still O(log n).
Since we don't know p and q:
You cannot solve this problem in O(log n) time. Consider a sorted list of positive numbers with one zero mixed in (p + 1 = q and A[q] = 0). This situation satisfies all the criteria you mentioned, yet finding where that zero is located cannot be done in sub-O(n) time. Therefore your problem cannot be solved in O(log n) time.
Despite the "buried zero" worst case already pointed out, I would still recommend implementing an algorithm that can often speed things up, depending on p, q. For example, suppose that you have n numbers, and each increasing and decreasing region has size at least k.

Then if you check 2^m elements of your array, including the first and last element and the rest of the elements as equally spaced as possible, starting with m = 2 and then iteratively increasing m by 1, eventually you will reach an m where you find 3 pairs of consecutive elements (A,B), (C,D), (E,F), from left to right among the 2^m elements that you have checked, which satisfy A < B, C > D, E < F (some pairs may share elements). If my back-of-the-envelope calculation is correct, the worst-case m you will need to achieve this has you checking no more than 4n/k elements, so e.g. if k = 100 you are much faster than checking all n elements. Then you know everything before A and everything after F are increasing sequences, and you can binary search through them.

Now, if m got big enough that you checked at least sqrt(n) elements, then you can finish up by doing a brute-force search between A and F, and the overall running time will be O(n/k + sqrt(n)).

On the other hand, if the final m had you check fewer than sqrt(n) elements, then you can further increase m until you have checked sqrt(n) elements. Then there will be 2 pairs of consecutive checked elements (A,B), (C,D) that satisfy A < B, C > D, and there will also be 2 pairs of consecutive checked elements (W,X), (Y,Z) later in the array that satisfy W > X, Y < Z. Then everything before A is increasing, everything between D and W is decreasing, and everything after Z is increasing. So you can binary search these 3 regions of the array. The remaining part of the array that you haven't entirely searched through has size O(sqrt(n)), so you can brute-force search the unchecked regions, and the overall running time is O(sqrt(n)). Thus the bound O(n/k + sqrt(n)) holds in general. I have a feeling this is worst-case optimal, but I don't have a proof.
It's solvable in O(log^2 n):
1. If at the midpoint the slope is decreasing, we're in the p..q range; if it is increasing, we're either in the 1..p or the q..n range.
2. Perform a binary search over the 1..midpoint and midpoint..n ranges to look for a position where the slope is decreasing. It will be found in only one of the two ranges, so now we know in which of the 1..p and q..n subranges the midpoint is located.
3. Repeat the process from step 1 on the subrange containing the peaks until you hit the p..q range. You can find the peaks of the subranges by applying the algorithm in "Divide and conquer algorithm applied in finding a peak in an array".
4. Perform 3 binary searches in the ranges 1..p, p..q, and q..n (see the sketch below).
==> Overall complexity is O(log^2 n).
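As a minimal sketch of the final step (my own Python illustration, assuming p and q have already been located): each of the three runs is sorted either ascending or descending, so a direction-aware binary search suffices:

def bsearch(A, lo, hi, x, ascending=True):
    # Standard binary search on A[lo..hi] (inclusive bounds), which must be
    # sorted in the indicated direction; returns an index of x, or -1.
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == x:
            return mid
        if (A[mid] < x) == ascending:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def search_partially_sorted(A, p, q, x):
    # Three O(log n) searches over the runs [0..p], [p..q], [q..n-1]
    # (0-based indices here; the problem statement is 1-based).
    for lo, hi, asc in ((0, p, True), (p, q, False), (q, len(A) - 1, True)):
        i = bsearch(A, lo, hi, x, asc)
        if i != -1:
            return i
    return -1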

How would you convert X,Y points to Rho,Theta for hough transform in C?

So I am trying to code the Hough transform in C. I have a binary image and have extracted the binary values from the image. Now to do the Hough transform I have to convert the [X, Y] values from the image into [rho, theta], to do a parametric transform of the form
rho = x*cos(theta) + y*sin(theta)
I don't quite understand how it's actually transformed, looking at other online code. Any help explaining the algorithm and how the accumulator for [rho, theta] values should be built based on [X, Y] would be appreciated. Thanks in advance. :)
Your question hints at the fact that you think you need to map each (X, Y) point of interest in the image to ONE (rho, theta) vector in the Hough space.
The fact of the matter is that each point in the image is mapped to a curve, i.e. SEVERAL vectors in the Hough space. The number of vectors for each input point depends on some "arbitrary" resolution that you decide upon. For example, at 1 degree resolution, you'd get 360 vectors in Hough space.
There are two possible conventions for the (rho, theta) vectors: either you use the [0, 359] degree range for theta, in which case rho is always positive, or you use [0, 179] degrees for theta and allow rho to be either positive or negative. The latter is what many implementations use.
Once you understand this, the accumulator is little more than a two-dimensional array which covers the range of the (rho, theta) space, and where each cell is initialized to 0. It is used to count the number of vectors that are common to the curves of different points in the input.
The algorithm therefore computes all 360 vectors (assuming 1 degree resolution for theta; only 180 with the signed-rho convention used below) for each point of interest in the input image. For each of these vectors, after rounding rho to the nearest integral value (which depends on the precision chosen for the rho dimension, e.g. 0.5 if we have 2 points per unit), it finds the corresponding cell in the accumulator and increments the value of this cell.
When this has been done for all points of interest, the algorithm searches for all cells in the accumulator which have a value above a chosen threshold. The (rho, theta) "addresses" of these cells are the polar coordinates of the lines (in the input image) that the Hough algorithm has identified.
Now, note that this gives you line equations; one is typically left with figuring out which segments of these lines effectively belong in the input image.
A very rough pseudo-code "implementation" of the above:

Accumulator_rho_size   = sqrt(2) * max(width_of_image, height_of_image)
                         * precision_factor    // e.g. 2 if we want 0.5 precision
Accumulator_theta_size = 180                   // using the signed-rho convention

Accumulator = newly allocated array of integers
              with dimensions [Accumulator_rho_size, Accumulator_theta_size]
Fill all cells of Accumulator with the value 0.

For each (x, y) point of interest in the input image
    For theta = 0 to 179
        rho = round(x * cos(theta) + y * sin(theta),
                    value_based_on_precision_factor)
        Accumulator[rho, theta]++

Search Accumulator for the cells with the biggest counter values
(or with values above a given threshold).      // picking the threshold can be tricky

The corresponding (rho, theta) "addresses" of these high-valued cells are the polar coordinates of the lines discovered in the original image, defined by their angle relative to the x axis and their distance to the origin. Simple math can be used to compute various points on each such line, in particular the axis intercepts, to produce a y = ax + b equation if so desired.
Overall this is a rather simple algorithm. The complexity lies mostly in being consistent with the units, e.g. in the conversion between degrees and radians (most math libraries' trig functions are radian-based), and in the coordinate system used for the input image.
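To make the bookkeeping concrete (offsetting a possibly negative rho into a nonnegative array index, degrees versus radians), here is a small runnable sketch in Python rather than C; the function names and the default 0.5-unit rho precision are illustrative choices, not part of the answer above:

import math

def hough_accumulate(points, width, height, precision_factor=2):
    # Signed-rho convention: theta in [0, 180) degrees, rho in [-diag, +diag].
    diag = math.hypot(width, height)
    n_rho = int(math.ceil(2 * diag * precision_factor)) + 1
    acc = [[0] * 180 for _ in range(n_rho)]
    for x, y in points:
        for t in range(180):
            theta = math.radians(t)         # trig functions want radians
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int(round((rho + diag) * precision_factor))  # shift to >= 0
            acc[r][t] += 1
    return acc, diag

def strong_lines(acc, diag, precision_factor=2, threshold=50):
    # Map high-count cells back to (rho, theta-in-degrees) line parameters.
    return [(r / precision_factor - diag, t)
            for r, row in enumerate(acc)
            for t, count in enumerate(row)
            if count >= threshold]

The same structure ports directly to C: a flat int array of size n_rho * 180, indexed as acc[r * 180 + t].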

fast algorithm of finding sums in array

I am looking for a fast algorithm.
I have an int array of size n; the goal is to find all patterns in the array where x1, x2, x3 are different elements of the array such that x1 + x2 = x3.
For example, for an int array of size 3, [1, 2, 3], there's only one such pattern: 1 + 2 = 3 (1 + 2 and 2 + 1 count as the same pattern).
I am thinking about using pairs and hash maps to make the algorithm fast (the fastest one I have now is still O(n^2)).
Please share your ideas for this problem, thank you.
Edit: The answer below applies to a version of this problem in which you only want one triplet that adds up like that. When you want all of them, since there are potentially at least O(n^2) possible outputs (as pointed out by ex0du5), and even O(n^3) in pathological cases of repeated elements, you're not going to beat the simple O(n^2) algorithm based on hashing (mapping from a value to the list of indices with that value).
This is basically the 3SUM problem. Without potentially unboundedly large elements, the best known algorithms are approximately O(n^2), but we've only proved that it can't be faster than O(n lg n) for most models of computation.
If the integer elements lie in the range [u, v], you can do a slightly different version of this in O(n + (v-u) lg (v-u)) with an FFT. I'm going to describe a process to transform this problem into that one, solve it there, and then figure out the answer to your problem based on this transformation.
The problem that I know how to solve with FFT is to find a length-3 arithmetic sequence in an array: that is, a sequence a, b, c with c - b = b - a, or equivalently, a + c = 2b.
Unfortunately, the last step of the transformation back isn't as fast as I'd like, but I'll talk about that when we get there.
Let's call your original array X, which contains integers x_1, ..., x_n. We want to find indices i, j, k such that x_i + x_j = x_k.
Find the minimum u and maximum v of X in O(n) time. Let u' be min(u, u*2) and v' be max(v, v*2).
Construct a binary array (bitstring) Z of length v' - u' + 1; Z[i] will be true if either X or its double [x_1*2, ..., x_n*2] contains u' + i. This is O(n) to initialize; just walk over each element of X and set the two corresponding elements of Z.
As we're building this array, we can save the indices of any duplicates we find into an auxiliary list Y. Once Z is complete, we just check for 2 * x_i for each x_i in Y. If any are present, we're done; otherwise the duplicates are irrelevant, and we can forget about Y. (The only situation slightly more complicated is if 0 is repeated; then we need three distinct copies of it to get a solution.)
Now, a solution to your problem, i.e. x_i + x_j = x_k, will appear in Z as three evenly-spaced ones, since some simple algebraic manipulations give us 2*x_j - x_k = x_k - 2*x_i. Note that the elements on the ends are our special doubled entries (from 2X) and the one in the middle is a regular entry (from X).
Consider Z as a representation of a polynomial p, where the coefficient for the term of degree i is Z[i]. If X is [1, 2, 3, 5], then Z is 1111110001 (because we have 1, 2, 3, 4, 5, 6, and 10); p is then 1 + x + x^2 + x^3 + x^4 + x^5 + x^9.
Now, remember from high school algebra that the coefficient of x^c in the product of two polynomials is the sum, over all a, b with a + b = c, of the first polynomial's coefficient for x^a times the second's coefficient for x^b. So, if we consider q = p^2, the coefficient of x^(2j) (for a j with Z[j] = 1) will be the sum over all i of Z[i] * Z[2*j − i]. But since Z is binary, that's exactly the number of triplets (i, j, k) which are evenly-spaced ones in Z. Note that (j, j, j) is always such a triplet, so we only care about coefficients with values > 1.
We can then use a Fast Fourier Transform to find p^2 in O(|Z| log |Z|) time, where |Z| is v' − u' + 1. We get out another array of coefficients; call it W.
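A quick numpy illustration of this squaring step, using the example above (my own sketch; the padding size just needs to be at least 2|Z| − 1 so the circular convolution equals the linear one):

import numpy as np

# Z for X = [1, 2, 3, 5], so u' = 1 and v' = 10, as in the example above.
Z = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 1], dtype=float)
size = 32                         # power of two >= 2*len(Z) - 1
W = np.rint(np.fft.irfft(np.fft.rfft(Z, size) ** 2, size)).astype(int)
# W[2*(x_k - u')] counts the pairs of ones in Z equidistant from x_k's slot;
# a value > 1 flags x_k as a candidate center.
for x_k in [1, 2, 3, 5]:
    print(x_k, W[2 * (x_k - 1)])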
Loop over each x_k in X. (Recall that our desired evenly-spaced ones are all centered on an element of X, not 2*X.) If the corresponding W for twice this element, i.e. W[2*(x_k - u')], is 1, we know it's not the center of any nontrivial progressions and we can skip it. (As argued before, it should only be a positive integer.)
Otherwise, it might be the center of a progression that we want (so we need to find i and j). But, unfortunately, it might also be the center of a progression that doesn't have our desired form. So we need to check. Loop over the other elements x_i of X, and check if there's a triple 2*x_i, x_k, 2*x_j for some j (by checking Z[2*(x_k − x_i) − u']). If so, we have an answer; if we make it through all of X without a hit, then the FFT found only spurious answers, and we have to check another element of W.
This last step is therefore O(n * (1 + the number of x_k with W[2*(x_k − u')] > 1 that aren't actually solutions)), which may be as bad as O(n^2), which is obviously not okay. There should be a way to avoid generating these spurious answers in the output W; if we knew that any appropriate W coefficient definitely had an answer, this last step would be O(n) and all would be well.
I think it's possible to use a somewhat different polynomial to do this, but I haven't gotten it to actually work. I'll think about it some more....
Partially based on this answer.
It has to be at least O(n^2), as there are n(n−1)/2 different sums to check against the other members. You have to compute all of them, because any pair's sum may equal any other member (start with one example and permute the elements to convince yourself that all must be checked). Or look at the Fibonacci sequence for something concrete.
So computing all the sums and looking the members up in a hash table gives amortized O(n^2). Or use an ordered tree if you need the best worst case.
You essentially need to find all the different sums of value pairs, so I don't think you're going to do any better than O(n^2). But you can optimize by sorting the list and removing duplicate values, then only pairing a value with values equal or greater, and stopping when the sum exceeds the maximum value in the list.
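A minimal sketch of the hashing approach from the answers above (Python; reading "different elements" as different positions in the array, which is my assumption):

def find_sum_patterns(arr):
    # Map each value to the positions where it occurs, so duplicates work.
    positions = {}
    for idx, v in enumerate(arr):
        positions.setdefault(v, []).append(idx)
    found = []
    n = len(arr)
    for i in range(n):                     # each unordered pair once,
        for j in range(i + 1, n):          # so 1+2 and 2+1 aren't repeated
            for k in positions.get(arr[i] + arr[j], []):
                if k != i and k != j:      # x3 must be a third element
                    found.append((arr[i], arr[j], arr[k]))
    return found

print(find_sum_patterns([1, 2, 3]))        # [(1, 2, 3)]

This is O(n^2) expected time plus the size of the output, matching the bound discussed above.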
