I know this may be a duplicate, but it seems like a variation on the 'Closest pair of Points' algorithm.
Given a set of N points (x, y) in the unit square and a distance d, find all pairs of points such that the distance between them is at most d.
For large N the brute force method is not an option. Besides the 'sweep line' and 'divide and conquer' methods, is there a simpler solution? These pairs of points are the edges of an undirected graph that I need to traverse to decide whether it is connected or not (which I already did using DFS, but when N = 1 million it never finishes!).
Any pseudocode, comments or ideas are welcome,
Thanks!
EDIT: I found this in Sedgewick's book (I'm looking at the code right now):
Program 3.18 uses a two-dimensional array of linked lists to improve the running time of Program 3.7 by a factor of about 1/d² when N is sufficiently large. It divides the unit
square up into a grid of equal-sized smaller squares. Then, for each square, it builds a linked list of all the
points that fall into that square. The two-dimensional array provides the capability to access immediately
the set of points close to a given point; the linked lists provide the flexibility to store the points where
they may fall without our having to know ahead of time how many points fall into each grid square.
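For reference, here is a rough Python sketch of that grid-of-buckets idea (this is not Sedgewick's actual Program 3.18; the function name and the dict-based grid are my own choices):

```python
# Bucket the points into a grid of cells of side d; any pair within
# distance d must lie in the same cell or in adjacent cells.
from collections import defaultdict
from math import floor, hypot

def close_pairs(points, d):
    """Yield all pairs of points whose Euclidean distance is at most d."""
    grid = defaultdict(list)                      # (cell_x, cell_y) -> points
    for p in points:
        grid[(floor(p[0] / d), floor(p[1] / d))].append(p)
    for (cx, cy), bucket in grid.items():
        for i, p in enumerate(bucket):
            # pairs within the same cell
            for q in bucket[i + 1:]:
                if hypot(p[0] - q[0], p[1] - q[1]) <= d:
                    yield p, q
            # pairs with half of the neighbouring cells, so each
            # cross-cell pair is reported exactly once
            for dx, dy in ((1, -1), (1, 0), (1, 1), (0, 1)):
                for q in grid.get((cx + dx, cy + dy), ()):
                    if hypot(p[0] - q[0], p[1] - q[1]) <= d:
                        yield p, q
```

With roughly uniform points this does about N times (points per 3×3 block) distance tests instead of N²/2, which is where the ~1/d² speed-up comes from.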
We are really looking for points that lie inside a circle of center (x, y) and radius d.
The square that encloses the circle has center (x, y) and side 2d. Any point outside this square does not need to be checked; it's out. So a point a = (xa, ya) is out if abs(xa - x) > d or abs(ya - y) > d.
Similarly, the square enclosed by that circle has center (x, y) and diagonal 2d, so its half-side is d / sqrt(2) ≈ 0.707·d. Any point inside this square does not need the full check; it's in. So a point a = (xa, ya) is in if abs(xa - x) < d * 0.707 and abs(ya - y) < d * 0.707.
These two easy rules combined greatly reduce the number of points to check. If we sort the points by x, filter the candidates, then sort by y and filter again, we are left with only the points for which we really need to compute the full distance.
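A small sketch of those two rules as a pre-filter (the function name and the constant are my own; only points that survive both quick tests pay for the exact distance):

```python
from math import hypot, sqrt

HALF_SIDE = 1 / sqrt(2)   # half-side of the inscribed square, as a fraction of d

def within_d(a, center, d):
    dx, dy = abs(a[0] - center[0]), abs(a[1] - center[1])
    if dx > d or dy > d:                              # outside the enclosing square: definitely out
        return False
    if dx < d * HALF_SIDE and dy < d * HALF_SIDE:     # inside the inscribed square: definitely in
        return True
    return hypot(dx, dy) <= d                         # only now compute the full distance
```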
For any given point, you can use a Manhattan distance (x-delta plus y-delta) heuristic to filter out most of the points that are not within the distance "d" - filter out any points whose Manhattan distance is greater than (sqrt(2) * d), then run the expensive-and-precise distance test on the remaining points.
I have multiple faces in 3D space creating cells. All these faces lie within a predefined cube (e.g. of size 100x100x100).
Every face is convex and defined by a set of corner points and a normal vector. Every cell is convex. The cells are result of 3d voronoi tessellation, and I know the initial seed points of the cells.
Now for every integer coordinate I want the smallest distance to any face.
My current solution uses this answer https://math.stackexchange.com/questions/544946/determine-if-projection-of-3d-point-onto-plane-is-within-a-triangle/544947 and, for every point, for every face, for every possible triple of that face's corner points, calculates the projection of the point onto the triangle formed by the triple and checks whether the projection lies inside the triangle. If it does, I return the distance between the projection and the original point. If not, I calculate the distance from the point to every possible line segment defined by two points of a face. Then I choose the smallest distance. I repeat this for every point.
This is quite slow and clumsy. I would much rather calculate all the points that lie on (or almost on) a face and then use these to calculate the smallest distance for all neighbouring points, repeating the process.
I have found this Get all points within a Triangle but am not sure how to apply it to 3D space.
Are there any techniques or algorithms to do this efficiently?
Since we're working with a Voronoi tessellation, we can simplify the current algorithm. Given a grid point p, it belongs to the cell of some site q. Take the minimum over each neighboring site r of the distance from p to the plane that is the perpendicular bisector of qr. We don't need to worry whether the closest point s on the plane belongs to the face between q and r; if not, the segment ps intersects some other face of the cell, which is necessarily closer.
Actually it doesn't even matter if we loop r over some sites that are not neighbors. So if you don't have access to a point location subroutine, or it's slow, we can use a fast nearest neighbors algorithm. Given the grid point p, we know that q is the closest site. Find the second closest site r and compute the distance d(p, bisector(qr)) as above. Now we can prune the sites that are too far away from q (for every other site s, we have d(p, bisector(qs)) ≥ d(q, s)/2 − d(p, q), so we can prune s unless d(q, s) ≤ 2 (d(p, bisector(qr)) + d(p, q))) and keep going until we have either considered or pruned every other site. To do pruning in the best possible way requires access to the guts of the nearest neighbor algorithm; I know that it slots right into the best-first depth-first search of a kd-tree or a cover tree.
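For what it's worth, here is a rough Python sketch of the bisector-distance idea (not the pruning scheme; it simply takes a fixed number k of nearest sites via scipy's cKDTree, and all names are my own):

```python
import numpy as np
from scipy.spatial import cKDTree

def distance_to_nearest_face(grid_points, sites, k=20):
    """For each grid point p with owning site q, approximate the distance to
    the nearest cell face as min over nearby sites r of d(p, bisector(q, r))."""
    k = min(k, len(sites))
    tree = cKDTree(sites)
    _, idx = tree.query(grid_points, k=k)        # k nearest sites; column 0 is q
    q = sites[idx[:, 0]]
    best = np.full(len(grid_points), np.inf)
    for j in range(1, k):
        r = sites[idx[:, j]]
        mid = 0.5 * (q + r)                       # a point on the bisector plane
        n = r - q
        n /= np.linalg.norm(n, axis=1, keepdims=True)            # plane normal
        d = np.abs(np.einsum('ij,ij->i', grid_points - mid, n))  # |(p - mid) . n|
        best = np.minimum(best, d)
    return best
```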
I want to create a set of N random convex disjoint polygons on a plane where each polygon must have at most V vertices, where N and V are parameters for my function, and I'd like to obtain a distribution as close as possible to uniform (every possible set being equally probable). Also I need to randomly select two points on the plane that either match with one of the vertices in the scene or are in empty space (not inside a polygon).
For other reasons I have already implemented, in the same programming language, an AABB tree and Separating Axis Theorem-based collision detection between convex polygons, and I can generate a random convex polygon with an arbitrary number of vertices inside a circle of given radius. My best bet thus far:
Generate a random polygon using the function I have available.
Query the AABB tree to test for intersection with existing polygons.
If the AABB tree query returns empty set I push the generated polygon into it, otherwise I test with SAT against all the other polygons whose AABB overlaps with the generated one's. If SAT returns "no intersection" I push the polygon, otherwise I discard it.
Repeat from 1 until N polygons are generated.
Generate a random number in {0,1}
If the generated number is 1 I pick a random polygon and a random vertex on it as a point
If the generated number is 0 I generate a random position in (x,y) and test if it falls within some polygon (I might create a tiny AABB around it and exploit the AABB tree to reduce the required number of PiP tests). In case it's not inside any polygon I approve it as a valid point, otherwise I repeat from 5.
Repeat from 5 once more to get the second point.
I think the solution would possibly work, but unfortunately there's no way to guarantee that I can generate N such polygons for very large N, or find two good points in an acceptable time, and I'm programming in React, where long operations run on the main thread, blocking the UI until they end. I could circumvent the issue by ejecting from create-react-app and learning Web Workers, which would probably require more time than it's worth for me.
This definitely gives a non-uniform distribution, but perhaps you could begin by generating N points in the plane and then computing the Voronoi diagram for those points. The Voronoi diagram can be computed in O(n log n) time with Fortune's algorithm. The cells of the Voronoi diagram are convex, so you can then construct a random polygon of the desired number of vertices that lies within each cell of the diagram.
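A hedged sketch of that idea with scipy.spatial.Voronoi (which wraps Qhull rather than Fortune's algorithm; the shrink factor, the helper name, and the way surplus vertices are dropped are my own choices, and unbounded cells are simply skipped):

```python
import numpy as np
from scipy.spatial import Voronoi

def random_disjoint_convex_polygons(n, v, rng=np.random.default_rng()):
    seeds = rng.random((n, 2))                  # n seed points in the unit square
    vor = Voronoi(seeds)
    polygons = []
    for region_idx in vor.point_region:
        region = vor.regions[region_idx]
        if not region or -1 in region:          # skip unbounded cells
            continue
        cell = vor.vertices[region]             # convex cell, vertices in cyclic order
        centroid = cell.mean(axis=0)
        cell = centroid + 0.9 * (cell - centroid)   # shrink so neighbouring polygons stay disjoint
        if len(cell) > v:                       # keep at most v vertices (still convex)
            keep = np.sort(rng.choice(len(cell), size=v, replace=False))
            cell = cell[keep]
        polygons.append(cell)
    return polygons
```

Note that skipping unbounded cells means you get fewer than n polygons; generating extra seeds or clipping the diagram to the bounding box would fix that.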
[Illustration: Voronoi diagram. By Balu Ertl - Own work, CC BY-SA 4.0, Link]
Ok, here is another proposal. I have little knowledge of js, but could cook up something in Python.
Use Poisson disk sampling with distance parameter d to generate N samples of the centers
For a given center make a circle with R ≤ d/2 (so circles around different centers cannot overlap).
Generate V angles using Dirichlet distribution such that sum of angles is equal to 2π. Sort them.
Place vertices on the circle using the angles generated at step 3 and connect them. This would be your polygon.
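A possible Python sketch of steps 2-4 (names are mine; V positive angular gaps from a symmetric Dirichlet sum to 2π, and their cumulative sums are already sorted):

```python
import numpy as np

def random_convex_polygon(center, R, V, rng=np.random.default_rng()):
    gaps = rng.dirichlet(np.ones(V)) * 2 * np.pi   # V gaps that sum to 2*pi
    angles = np.cumsum(gaps)                       # sorted angles in (0, 2*pi]
    cx, cy = center
    return np.column_stack((cx + R * np.cos(angles),
                            cy + R * np.sin(angles)))  # vertices in angular order -> convex
```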
UPDATE
Instead of using Poisson disk sampling for step 1, one could use Sobol quasi-random sequences. For N points sampled in the 1×1 box (well, you have to scale it afterwards), the least distance between points would be
d = 0.5 * sqrt(D) / N,
where D is the dimension of the problem, 2 in your case. So the radius of the circle for step 2 would be 0.25 * sqrt(2) / N. This ties N and d together nicely.
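For example, with scipy's quasi-Monte Carlo module (assuming scipy >= 1.7; the choice of N = 128 is arbitrary, kept a power of two as Sobol prefers):

```python
import numpy as np
from scipy.stats import qmc

N, D = 128, 2
centers = qmc.Sobol(d=D, scramble=False).random(N)   # N quasi-random centers in [0, 1)^2
R = 0.25 * np.sqrt(D) / N                            # circle radius from the formula above
```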
https://www.sciencedirect.com/science/article/abs/pii/S0378475406002382
I'm looking for an algorithm to do a best fit of an arbitrary rectangle to an unordered set of points. Specifically, I'm looking for a rectangle where the sum of the distances of the points to any one of the rectangle edges is minimised. I've found plenty of best fit line, circle and ellipse algorithms, but none for a rectangle. Ideally, I'd like something in C, C++ or Java, but not really that fussy on the language.
The input data will typically consist mostly of points lying on or close to the rectangle, with a few outliers. The distribution of data will be uneven, and unlikely to include all four corners.
Here are some ideas that might help you.
We can estimate whether a point is on an edge or on a corner as follows:
Collect the point's n nearest neighbours
Calculate the neighbours' centroid
Calculate the neighbours' covariance matrix as follows:
Start with Covariance = ((0, 0), (0, 0))
For each point calculate d = point - centroid
Covariance += outer_product(d, d)
Calculate the covariance's eigenvalues. (e.g. with SVD)
Classify point:
if one eigenvalue is large and the other very small, we are probably on an edge
otherwise we should be on a corner
Extract all corner points and do a segmentation. Choose the four segments with the most entries. The centroids of those segments are candidates for the rectangle's corners.
Calculate the normalized direction vectors of two opposite sides and calculate their mean. Calculate the mean of the other two opposite sides. These are the direction vectors of a parallelogram. If you want a rectangle, calculate a vector perpendicular to one of those directions and calculate its mean with the other direction vector. Then the rectangle's directions are the mean vector and a perpendicular vector.
In order to calculate the corners, you can project the candidates on their directions and move them so that they form the corners of a rectangle.
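Here is a rough sketch of the edge/corner classification step described above (assuming NumPy/SciPy; the neighbourhood size n and the eigenvalue-ratio threshold are arbitrary choices of mine):

```python
import numpy as np
from scipy.spatial import cKDTree

def classify_points(points, n=10, ratio=10.0):
    """Return a boolean array: True where a point looks like an edge point,
    False where it looks more like a corner point."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=min(n, len(points)))
    is_edge = np.empty(len(points), dtype=bool)
    for i, nbrs in enumerate(idx):
        nb = points[nbrs]
        d = nb - nb.mean(axis=0)              # offsets from the neighbourhood centroid
        cov = d.T @ d                          # 2x2 (unnormalised) covariance
        evals = np.linalg.eigvalsh(cov)        # ascending eigenvalues
        # one large and one very small eigenvalue -> locally line-like -> edge
        is_edge[i] = evals[1] > ratio * max(evals[0], 1e-12)
    return is_edge
```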
The idea of a line of best fit is to compute the vertical distances between your points and the line y=ax+b. Then you can use calculus to find the values of a and b that minimize the sum of the squares of the distances. The reason squaring is chosen over absolute value is because the former is differentiable at 0.
If you were to try the same approach with a rectangle, you would run into the problem that the square of the distance to the side of a rectangle is a piecewise defined function with 8 different pieces and is not differentiable when the pieces meet up inside the rectangle.
In order to proceed, you'll need to decide on a function that measures how far a point is from a rectangle that is everywhere differentiable.
Here's a general idea. Make a grid with smallish cells; calculate the best fit line for each not-too-empty cell (the calculation is immediate¹, there's no search involved). Join adjacent cells while making sure the standard deviation is improving/not worsening much. Thus we detect the four sides and the four corners, and divide our points into four groups, each belonging to one of the four sides.
Next, we throw away the corner cells, put the true rectangle in place of the four approximate
lines and do a bit of hill climbing (or whatever). The calculation of best fit line may be augmented for this case, since the two lines are parallel, and we've already separated our points into the four groups (for a given rectangle, we know the delta-y between the two opposing sides (taking horizontal-ish sides for a moment), so we just add this delta-y to the ys of the lower group of points and make the calculation).
The initial rectangular grid may be replaced with working by stripes (say, vertical). Then, at least half of the stripes will have two pronounced groupings of points (find them by dividing each stripe by horizontal division lines into cells).
¹ For a line Y = a*X + b, minimize the sum of squares of perpendicular distances of data points {xi, yi} to that line. This is directly solvable for a and b. For more vertical lines, flip the Xs and the Ys.
P.S. I interpret the problem as minimizing the sum of squares of perpendicular distances of each point to its nearest side of the rectangle, not to all the rectangle's sides.
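For completeness, one way to do the footnote's "directly solvable" perpendicular-distance fit is total least squares via SVD (a sketch; names are mine, and it has no trouble with vertical lines):

```python
import numpy as np

def fit_line_tls(points):
    """Fit a line to an (M, 2) array of points, minimising the sum of squared
    perpendicular distances. Returns (point_on_line, unit_direction)."""
    centroid = points.mean(axis=0)
    d = points - centroid
    _, _, vt = np.linalg.svd(d, full_matrices=False)
    return centroid, vt[0]       # first right singular vector = direction of the line
```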
I am not completely sure, but you might play around with the first 2 (3?) dimensions of a PCA of your points. It will work reasonably fast in most cases.
I have the following geometry problem:
You are given a circle centered at the origin, C(0, 0), with radius 1. Inside the circle you are given N points, which represent the centers of N different circles. You are asked to find the minimum radius of the small circles (all the small circles have the same radius) needed to cover the entire boundary of the large circle.
The number of circles is: 3 ≤ N ≤ 10000 and the problem has to be solved with a precision of P decimals where 1 ≤ P ≤ 6.
For example:
N = 3 and P = 4
and the coordinates:
(0.193, 0.722)
(-0.158, -0.438)
(-0.068, 0.00)
The radius of the small circles is: 1.0686.
I have the following idea but my problem is implementing it. The idea consists of a binary search to find the radius and, for each value given by the binary search, trying to find all the intersection points between the small circles and the large one. Each intersection yields an arc. The next step is to 'project' the coordinates of the arcs onto the X axis and the Y axis, the result being a number of intervals. If the unions of the intervals on the X and Y axes give the interval [-1, 1] on each axis, it means that the whole circle is covered.
In order to avoid precision problems I thought of searching between 0 and 2×10^P, and also taking the radius as 10^P, thus eliminating the digits after the decimal point, but my problem is figuring out how to compute the intersections of the circles and afterwards how to check whether the union of the resulting intervals forms the interval [-1, 1].
Any suggestions are welcomed!
Each point in your set has to cover the intersection of its cell in the point set's Voronoi diagram with the test circle around the origin.
To find the radius, start by computing the voronoi diagram of your point set. Now "close" this voronoi diagram by intersecting all infinite edges with your target-circle. Then for each point in your set, check the distance to all the points of its "closed" voronoi cell. The maximum should be your solution.
It shouldn't matter that the test circle closes the cells with an arc instead of a straight line until your solution radius gets greater than 1 (because then the "small" circles arc more strongly). In that case, you also have to check the furthest point from the cell's center to that arc.
I might be missing something, but it seems that you only need to find the maximal minimal distance between a point on the circle and the given points.
That is, if you consider the set of all points on the circle, take the minimal distance from each point to one of the given points, and then take the maximal value of all these, you have found your radius.
This is, of course, not an algorithm, as there are uncountably many points.
I think what I'll do would be along the line of:
Find the minimal distance between the circumference and the set of points, this is your initial radius R.
Check if the entire circle was covered, like so:
For any two points whose distance from each other is more than 2R, check if the entire segment was covered (for each point, check if the circle around it intersects the segment, and if so, remove that segment and keep going). That should take about O(N^3) (you iterate over all of the points for each pair of points). If I'm correct (though I didn't formally prove it) the circle is covered iff all of the segments are covered.
Of all the segments which weren't covered, take the longest one, and add half its length to R.
Repeat.
This algorithm will never cover the circle per se, but it's easy to prove that it converges exponentially to a full cover, so it should be able to find the needed radius with arbitrary accuracy within a reasonable number of iterations.
Hope that helps.
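As a crude numerical check of the maximal-minimal-distance observation above, one could simply discretise the boundary (assuming NumPy/SciPy; the sample count is my own arbitrary choice and would have to grow for P = 6):

```python
import numpy as np
from scipy.spatial import cKDTree

def cover_radius(centers, boundary_samples=200_000):
    """Approximate the radius needed so that equal circles around `centers`
    cover the boundary of the unit circle: the maximum over sampled boundary
    points of the distance to the nearest center."""
    t = np.linspace(0.0, 2 * np.pi, boundary_samples, endpoint=False)
    boundary = np.column_stack((np.cos(t), np.sin(t)))
    d, _ = cKDTree(centers).query(boundary)
    return d.max()
```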
I am searching for the fastest or simplest method to compute the outer angle at any vertex of a convex polygon. That means always the bigger angle, where the two angles in question add up to 360 degrees.
Here is an illustration:
Now I know that I may compute the angles between the two vectors A-B and C-B which involves the dot product, normalization and cosine. Then I would still have to determine which of the two resulting angles (second one is 180 degrees minus first one) I want to take two times added to the other one.
However, I thought there might be a much simpler, less tricky solution, probably using the mighty atan2() function. I got stuck and ask you about this :-)
UPDATE:
I was asked what I need the angle for. I need to calculate the area of this specific circle around B, but only the part outside of the polygon that is described by A, B, C, ...
So to calculate the area, I need the angle to use the formula 0.5*angle*r*r.
Use the inner-product (dot product) of the vectors describing the lines to get the inner angle and subtract from 360 degrees?
Works best if you already have the lines in point-vector form, but you can get vectors from point-to-point form pretty easily (i.e. by subtraction).
Taking . to be the dot product we have
v . w = |v| * |w| * cos(theta)
where v and w are vectors and theta is the angle between the lines. And the dot product can be computed from the components of the vectors by
v . w = SUM(v_i * w_i : i = 1..3) // three terms for three dimensions; use more or fewer as needed
here the subscripts indicate components.
Having actually read the question:
The angle returned from inverting the dot-product will always be less than 180 degrees, so it is always the inner angle.
Use this formula:
beta = 360° − arccos(⟨BA, BC⟩ / (|BA| · |BC|))
where ⟨·,·⟩ is the scalar product and BA (BC) is the vector from B to A (from B to C).
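For example, in Python (a sketch; the second function is the atan2 variant the question asked about, and the vertex names follow the question):

```python
from math import acos, atan2, hypot, pi

def outer_angle(A, B, C):
    bax, bay = A[0] - B[0], A[1] - B[1]
    bcx, bcy = C[0] - B[0], C[1] - B[1]
    c = (bax * bcx + bay * bcy) / (hypot(bax, bay) * hypot(bcx, bcy))
    inner = acos(max(-1.0, min(1.0, c)))     # clamp against rounding error
    return 2 * pi - inner                    # the bigger angle, in radians

def outer_angle_atan2(A, B, C):
    # atan2 of the cross and dot products gives the inner angle directly,
    # with no explicit normalisation.
    bax, bay = A[0] - B[0], A[1] - B[1]
    bcx, bcy = C[0] - B[0], C[1] - B[1]
    inner = abs(atan2(bax * bcy - bay * bcx, bax * bcx + bay * bcy))
    return 2 * pi - inner
```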
I need to calculate the area of the circle outside of the polygon that is described by A, B, C, ...
It sounds like you're taking the wrong approach, then. You need to calculate the area of the circumcircle, then subtract the area of the polygon.
If you need the angle, there is no way around normalizing the vectors and doing a dot or cross product. You often have a choice of computing the angle via asin, acos or atan, but in the end that makes no difference to execution speed.
However, it would be nice if you could tell us what you're trying to achieve. If we have a better picture of what you're doing, we might be able to give you some hints on how to solve the problem without calculating the angle in the first place.
Lots of geometric algorithms can be rewritten to work with cross and dot-products only. Euler-angles are rarely needed.