Why does my implementation of Dijkstra's algorithm not behave as it should?

I am writing an implementation of Dijkstra's algorithm to learn about cool graph algorithms (this isn't a homework assignment, FYI). I am using Wikipedia's description of the algorithm as my main resource.
I have tested different traversal paths and gotten the following results ((foo, bar) means foo to bar):
crashes:
(e, f)
(f, d)
(c, a)
(f, g)
incorrect:
(a, c)
(g, f)
working:
(d, f)
My graph that I am working with looks like this:
    F - H
    |   |
A - B - C
|       |
E - G - D
By tracing the path from E to F, I mostly understand why my code is failing. The other problem is that I don't know how to implement the algorithm correctly while keeping my way of doing it. Here's a trace from E to F:
At node E, my neighbors are A and G. G has the shortest tentative distance, so it becomes the next current node. G's neighbors are E and D, but E was already traversed, so D is the next one. D's only unvisited neighbor is C, so C comes next. For C, its neighbor D was traversed, so we now arrive at B (B and H are equidistant, but B comes first in C's list of edges). Here is where my problem lies:
A's tentative distance was already calculated from E to be 2. Since the new tentative distance through B to A is much larger than 2, its distance stays at 2. F's distance is set to its tentative distance, since it was initialized to infinity. A's distance is smaller, so it's chosen as the next node. But A's only neighbors are E and B, which have already been traversed, so every node around it has already been explored. The variable closest (see my code below) was initialized to a node with no fields filled in other than a distance of infinity, so on the next iteration it has no edges, and I get a segmentation fault.
I know that this is what happened in my code because of its output, shown below:
Current: e
New neighbor: a
New neighbor: g
g, closest, distance of 1
Current: g
New neighbor: d
d, closest, distance of 2
Current: d
New neighbor: c
c, closest, distance of 4
Current: c
New neighbor: b
New neighbor: h
b, closest, distance of 5
Current: b
New neighbor: a
New neighbor: f
a, closest, distance of 2
Current: a
?, closest, distance of 1000
Current: ?
Segmentation fault: 11
Where did I step wrong in implementing this algorithm? I have tried to follow Wikipedia's 6-step description of it very carefully. The only difference between their description and mine is that I am not using sets to keep track of explored and unexplored nodes (rather, that data is kept in the nodes themselves). Please provide any insight you can.
Note: I am compiling with Clang on a Mac with no optimization (-O0). I've noticed that with higher optimization levels my program recurses infinitely and then gives me another segmentation fault, but I'd rather fix the central problem with my algorithm before dealing with that.
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>

#define infinity 1000

struct Node {
    unsigned char value;
    int visited, distance, edge_count;
    int* weights;
    int weight_assign_index, freed;
    struct Node** edges;
};

typedef struct Node Node;

Node* init_node(const unsigned char value, const int edge_count) {
    Node* node = malloc(sizeof(Node));
    node->value = value;
    node->visited = 0;
    node->distance = infinity;
    node->edge_count = edge_count;
    node->weights = malloc(edge_count * sizeof(int));
    node->weight_assign_index = 0;
    node->freed = 0;
    node->edges = malloc(edge_count * sizeof(Node*));
    return node;
}

void assign_edges(Node* node, const int amount, ...) {
    va_list edges;
    va_start(edges, amount);
    for (int i = 0; i < amount; i++)
        node->edges[i] = va_arg(edges, Node*);
    va_end(edges);
}

void assign_weight(Node* node_1, Node* node_2, const int weight) {
    for (int i = 0; i < node_1->edge_count; i++) {
        if (node_1->edges[i] == node_2) {
            node_1->weights[node_1->weight_assign_index++] = weight;
            node_2->weights[node_2->weight_assign_index++] = weight;
        }
    }
}

void deinit_graph(Node* node) {
    if (!node->freed) {
        node->freed = 1;
        free(node->weights);
        for (int i = 0; i < node->edge_count; i++)
            deinit_graph(node->edges[i]);
        free(node->edges);
    }
}

void dijkstra(Node* current, Node* goal) {
    Node local_closest;
    local_closest.distance = infinity;
    Node* closest = &local_closest;

    printf("Current: %c\n", current->value);
    for (int i = 0; i < current->edge_count; i++) {
        Node* neighbor = current->edges[i];
        if (!neighbor->visited) {
            printf("New neighbor: %c\n", neighbor->value);
            const int tentative_distance = current->distance + current->weights[i];
            if (tentative_distance < neighbor->distance)
                neighbor->distance = tentative_distance;
            if (neighbor->distance < closest->distance)
                closest = neighbor;
        }
    }

    printf("%c, closest, distance of %d\n", closest->value, closest->distance);
    current->visited = 1;
    if (closest == goal) printf("Shortest distance is %d\n", closest->distance);
    else dijkstra(closest, goal);
}

int main() {
    Node
        *a = init_node('a', 2),
        *b = init_node('b', 3),
        *c = init_node('c', 3),
        *d = init_node('d', 2),
        *e = init_node('e', 2),
        *f = init_node('f', 2),
        *g = init_node('g', 2),
        *h = init_node('h', 2);

    assign_edges(a, 2, e, b);
    assign_edges(b, 3, a, f, c);
    assign_edges(c, 3, b, h, d);
    assign_edges(d, 2, c, g);
    assign_edges(e, 2, a, g);
    assign_edges(f, 2, b, h);
    assign_edges(g, 2, e, d);
    assign_edges(h, 2, f, c);

    assign_weight(a, e, 2);
    assign_weight(a, b, 4);
    assign_weight(b, c, 1);
    assign_weight(b, f, 1);
    assign_weight(f, h, 1);
    assign_weight(h, c, 1);
    assign_weight(c, d, 2);
    assign_weight(d, g, 1);
    assign_weight(g, e, 1);

    e->distance = 0;
    dijkstra(e, f);
    deinit_graph(a);
}

Read Step 6 of Wikipedia's algorithm again:
Otherwise, select the unvisited node that is marked with the smallest tentative distance, set it as the new "current node", and go back to step 3.
The "select" here means "among all the unvisited nodes in the entire graph", not just among the unvisited neighbors of the current node, which is what your code is doing. So if the unvisited node of smallest tentative distance is not a neighbor of the current node, your code goes astray. And if the current node has no unvisited neighbors at all (which is entirely possible, either with a situation like what you encountered, or more simply with a dead-end node), your code absurdly visits the local_closest node, which isn't in the graph at all and whose edges are uninitialized, naturally causing a crash.
So you diverged from the correct algorithm earlier than the visit to A that you are focusing on. When you finished visiting D, the remaining unvisited nodes were A at tentative distance 2, C at tentative distance 4, and B, F, and H at tentative distance infinity. So by the algorithm, you ought to visit A next. But instead you visit C, again because your code wrongly considers only neighbors of the current node as candidates for the next node to visit.
Frankly I don't see how a recursive algorithm is going to be workable at all here. You need to have access to some data structure that tracks all the unvisited nodes, so that at any time you can find the one at minimum tentative distance, even if it's very far away from your current node. Your idea of keeping track of visited status on the nodes themselves has a problem with this, because you have no good way to search them all, except by going back to the starting node and doing some kind of DFS/BFS. That's (1) not possible in your current implementation, because the recursive calls to dijkstra no longer have a pointer to the starting node, and (2) very inefficient, because it's O(N) on every visit.
There's a good reason why the Wikipedia algorithm suggests using a set here. And I think it lends itself better to an iterative than to a recursive algorithm.
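To make the fix concrete, here is a minimal iterative sketch (my code, not a patched version of yours): it keeps the whole graph in an array that serves as the unvisited set, and on each pass it scans that array for the unvisited node with the smallest tentative distance. It assumes your Node struct and infinity macro; dijkstra_iterative and the nodes array are names I made up.
/* Sketch only: assumes the question's Node struct and infinity macro. */
void dijkstra_iterative(Node** nodes, int node_count, Node* start, Node* goal) {
    start->distance = 0;
    while (1) {
        /* Step 6: pick the unvisited node with the smallest tentative
           distance from the WHOLE graph, not just among the neighbors. */
        Node* current = NULL;
        for (int i = 0; i < node_count; i++)
            if (!nodes[i]->visited &&
                (!current || nodes[i]->distance < current->distance))
                current = nodes[i];
        if (!current || current->distance == infinity) return; /* goal unreachable */
        if (current == goal) {
            printf("Shortest distance is %d\n", current->distance);
            return;
        }
        current->visited = 1;
        /* Relax every edge out of the current node. */
        for (int i = 0; i < current->edge_count; i++) {
            Node* neighbor = current->edges[i];
            const int tentative = current->distance + current->weights[i];
            if (tentative < neighbor->distance)
                neighbor->distance = tentative;
        }
    }
}
In main, the call would then look like Node* all[] = { a, b, c, d, e, f, g, h }; dijkstra_iterative(all, 8, e, f); replacing the e->distance = 0 line and the recursive call.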

Related

Function to find the closest number in an array to a given value (in C)

I'm given the task of finding the value in an array that is closest to a given value t, where closeness is measured by absolute difference.
I came up with the following function in C:
#include <stdio.h>
#include <stdlib.h>

struct tuple
{
    int index;
    int val;
};
typedef struct tuple tuple;

tuple find_closest(int A[], int l, int r, int t)
{
    if (l == r)
    {
        tuple t1;
        t1.val = abs(A[l] - t);
        t1.index = l;
        return t1;
    }
    int m = (l + r) / 2;
    tuple t2, t3;
    t2 = find_closest(A, l, m, t);
    t3 = find_closest(A, m + 1, r, t);
    if (t2.val < t3.val)
    {
        return t2;
    }
    else
    {
        return t3;
    }
}

int main()
{
    int A[] = {5, 7, 9, 13, 15, 27, 2, 3};
    tuple sol;
    sol = find_closest(A, 0, 7, 20);
    printf("%d", sol.index);
    return 0;
}
We learnt about the divide-and-conquer method, which is why I implemented it recursively. I'm trying to determine the asymptotic complexity of my solution so I can make a statement about its efficiency. Can someone help? I don't think my solution is the most efficient one.
The code performs exactly n-1 comparisons of array values (which is easy to prove in several ways, for example by induction, or by noting that each comparison rejects exactly one element from being the best and you do comparisons until there's exactly one index left). The depth of the recursion is ceil(lg(n)).
An inductive proof looks something like this: let C(n) be the number of times if(t2.val < t3.val) is executed where n=r-l+1. Then C(1) = 0, and for n>1, C(n) = C(a) + C(b) + 1 for some a+b=n, a, b > 0. Then by the induction hypothesis, C(n) = a-1 + b-1 + 1 = a+b - 1 = n - 1. QED. Note that this proof is the same no matter how you choose m as long as l <= m < r.
This isn't a problem that divide-and-conquer helps with unless you are using parallelism, and even then a linear scan has the benefit of using the CPU's cache efficiently so the practical benefit of parallelism will be less (possibly a lot less) than you expect.
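For contrast, here is a plain linear scan (a sketch I am adding, not the poster's code); it performs the same n-1 comparisons in a single pass, with no recursion overhead:
#include <stdio.h>
#include <stdlib.h>

typedef struct { int index; int val; } tuple;

/* One pass, n-1 comparisons of candidate distances, O(1) extra space. */
tuple find_closest_linear(const int A[], int n, int t)
{
    tuple best = { 0, abs(A[0] - t) };
    for (int i = 1; i < n; i++) {
        int d = abs(A[i] - t);
        if (d < best.val) {
            best.val = d;
            best.index = i;
        }
    }
    return best;
}

int main()
{
    int A[] = {5, 7, 9, 13, 15, 27, 2, 3};
    printf("%d", find_closest_linear(A, 8, 20).index); /* prints 4 (A[4] = 15) */
    return 0;
}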

C - Depth first search in adjacency matrix using recursion

I have a problem I would like to solve using recursion.
For example, given this adjacency matrix AdjMat:
0 1 2 3
0 0 1 0 0
1 1 0 1 0
2 0 1 0 1
3 0 0 1 0
Say I would like to look at column 0 and all of its neighbors, and its neighbors' neighbors (a distance of 2), and store all of the row indices whose entries are > 0 into a linked list of ints.
Here is my updated code:
intNode *Neighbors(intNode *head, int colOfInterest, int distance) {
    int x = colOfInterest;
    if (distance == 0) {
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 10; j++) {
                if (AdjMat[x][j] > 0) {
                    head = insertInt(head, j);
                }
            }
            break;
        }
    }
    intNode *subpath = NULL;
    for (int i = 0; i < distance; i++) {
        subpath = Neighbors(head, colOfInterest, distance);
    }
    // Once the final neighbor column has been reached, add elements to the linked list.
    return head;
}
It currently does not return the expected output (which is 0, 1, and 2 in the linked list), but I am not sure why. Any help or direction is appreciated.
You have two major misconceptions in your code. The first is about recursion and the second is about how an adjacency matrix works.
The recursion basically works like this:
take a node and a max. distance: func(node, d);
if the distance is negative, return
add the node to the list;
for all adjacent nodes, call the function on that node with a new, now shorter distance: func(next, d - dist(node, next)).
To find all nodes in the vicinity of node #0, you'd start with an empty list and then call func(0, 2), which will lead to the following calls:
func(0, 2)          // list {0}
  func(1, 1)        // list {0, 1}
    func(0, 0)      // list {0, 1, 0} error, see below
      func(1, -1)   // negative d, do nothing
    func(2, 0)      // list {0, 1, 0, 2}
      func(1, -1)   // negative d, do nothing
      func(3, -1)   // negative d, do nothing
(indentation --> recursion depth)
This recursion will eventually stop, because you diminish the distance in each step. This is important: every recursion must have a termination condition, otherwise it would recurse endlessly. (It is a matter of style whether you test the distance up front or when you recurse. Up front catches invalid input early, but may lead to useless "dead" recursions.)
The recursion as given has a subtle problem: when you call func(0, 2), the function will add node #0 twice, because going from node 0 to 1 and then back to 0 yields a distance of 2, which is within reach. There are several ways to solve this. For example, you could check whether the given node is already in your list. Or you could flag nodes as visited as you go.
The adjacency matrix determines whether two nodes are connected. Two nodes a and b are connected if adj[a][b] != 0. That means that if you want to find all neighbours next of a given node node, you should do something like this:
for (int next = 0; next < N; next++) {
    if (adj[node][next]) {
        // do something with next
    }
}
You don't need two nested loops. The matrix has two dimensions, but the first one is always fixed: it's the source node. (If you look at your code, you'll see that you don't do anything with i.)
In your case, the adjacency matrix seems to have values of 0 and 1 only, but it could have other non-zero values to indicate the distances of a weighted graph.
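Putting the pieces together, here is a compact sketch of the recursion described above (my code, not a fixed version of yours: collect and containsInt are names I am inventing, and insertInt mirrors the question's helper). It de-duplicates by checking list membership, as suggested:
#include <stdio.h>
#include <stdlib.h>

#define N 4

int AdjMat[N][N] = {
    {0, 1, 0, 0},
    {1, 0, 1, 0},
    {0, 1, 0, 1},
    {0, 0, 1, 0},
};

typedef struct intNode { int val; struct intNode *next; } intNode;

intNode *insertInt(intNode *head, int val) {   /* prepend, as in the question */
    intNode *n = malloc(sizeof *n);
    n->val = val;
    n->next = head;
    return n;
}

int containsInt(const intNode *head, int val) {
    for (; head; head = head->next)
        if (head->val == val) return 1;
    return 0;
}

void collect(int node, int distance, intNode **head) {
    if (distance < 0) return;              /* the termination condition */
    if (!containsInt(*head, node))         /* avoid duplicates in the list */
        *head = insertInt(*head, node);
    for (int next = 0; next < N; next++)   /* one loop: the source row is fixed */
        if (AdjMat[node][next])
            collect(next, distance - 1, head);
}

int main(void) {
    intNode *head = NULL;
    collect(0, 2, &head);
    for (const intNode *p = head; p; p = p->next)
        printf("%d ", p->val);             /* prints 2 1 0: prepending reverses */
    printf("\n");
    return 0;
}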

Using the returned values as an input to the function continuously

I have this C code, which uses a k-d tree to search for the nearest neighbor and then uses the returned value to search for the next nearest neighbor from that point. I want to do this for about 5 iterations, i.e. the result of the first iteration is used as the input to the 2nd, the result of the second for the 3rd, etc. I am a beginner, and I thought a do-while loop would work, but it fails after 2 iterations, i.e. I get the same input again.
How do I change the value of this variable so that the output of each iteration becomes the input of the next operation? Also, if there is a way to create a function for this, it would be highly appreciated. The code works with the gcc compiler.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

#define MAX_DIM 4

struct kd_node_t{
    double x[MAX_DIM];
    struct kd_node_t *left, *right;
};

inline double
dist(struct kd_node_t *a, struct kd_node_t *b, int dim)
{
    double t, d = 0;
    while (dim--) {
        t = a->x[dim] - b->x[dim];
        d += t * t;
    }
    return d;
}

inline void swap(struct kd_node_t *x, struct kd_node_t *y) {
    double tmp[MAX_DIM];
    memcpy(tmp, x->x, sizeof(tmp));
    memcpy(x->x, y->x, sizeof(tmp));
    memcpy(y->x, tmp, sizeof(tmp));
}

/* quickselect method */
struct kd_node_t*
find_median(struct kd_node_t *start, struct kd_node_t *end, int idx)
{
    if (end <= start) return NULL;
    if (end == start + 1)
        return start;

    struct kd_node_t *p, *store, *md = start + (end - start) / 2;
    double pivot;
    while (1) {
        pivot = md->x[idx];

        swap(md, end - 1);
        for (store = p = start; p < end; p++) {
            if (p->x[idx] < pivot) {
                if (p != store)
                    swap(p, store);
                store++;
            }
        }
        swap(store, end - 1);

        /* median has duplicate values */
        if (store->x[idx] == md->x[idx])
            return md;

        if (store > md) end = store;
        else            start = store;
    }
}

struct kd_node_t*
make_tree(struct kd_node_t *t, int len, int i, int dim)
{
    struct kd_node_t *n;

    if (!len) return 0;

    if ((n = find_median(t, t + len, i))) {
        i = (i + 1) % dim;
        n->left  = make_tree(t, n - t, i, dim);
        n->right = make_tree(n + 1, t + len - (n + 1), i, dim);
    }
    return n;
}

int visited;

void nearest(struct kd_node_t *root, struct kd_node_t *nd, int i, int dim,
             struct kd_node_t **best, double *best_dist)
{
    double d, dx, dx2;

    if (!root) return;
    d = dist(root, nd, dim);
    dx = root->x[i] - nd->x[i];
    dx2 = dx * dx;

    visited ++;

    if (!*best || d < *best_dist) {
        *best_dist = d;
        *best = root;
    }

    nearest(dx > 0 ? root->left : root->right, nd, i, dim, best, best_dist);
    if (dx2 >= *best_dist) return;
    nearest(dx > 0 ? root->right : root->left, nd, i, dim, best, best_dist);
}

int main(void)
{
    int i;
    struct kd_node_t wp[] = {
        {{7, 9, 5, 56}}, {{2, 4, 8, 10}}, {{81, 2, 31, 80}}, {{31, 4, 900, 1}},
        {{4, 7, 1, 9}}, {{9, 6, 2, 0}}, {{4, 3, 11, 2}}, {{7, 7, 9, 1}}, {{6, 9, 0, 2}}
    };
    struct kd_node_t testNode = {{0, 2}}; // This is the input
    struct kd_node_t *root, *found, *million;
    double best_dist;
    double length = sizeof(wp) / sizeof(wp[1]);

    root = make_tree(wp, sizeof(wp) / sizeof(wp[1]), 0, 2);

    visited = 0;
    found = 0;
    nearest(root, &testNode, 0, 2, &found, &best_dist);

    printf(">> WP tree\nsearching for (%g, %g)\n"
           "found (%g, %g %g, %g) dist %g\nseen %d nodes\n\n",
           testNode.x[0], testNode.x[1],
           found->x[0], found->x[1], found->x[2], found->x[3], sqrt(best_dist), visited);
    // It produces an output found->x[0], found->x[1], found->x[2], found->x[3]

    // Where the problem is: I would like to continuously use the returned value as input
    for (int i = 0; i < 5; i++)
    {
        // This is the new input, i.e. the output of the previous run
        testNode = (struct kd_node_t){{found->x[2], found->x[3]}};
        printf(" (%g, %g) ", found->x[2], found->x[3]);
        root = make_tree(wp, sizeof(wp) / sizeof(wp[1]), 0, 2);
        nearest(root, &testNode, 0, 2, &found, &best_dist);
        printf(">> WP tree\nsearching for (%g, %g)\n"
               "found (%g, %g %g, %g) dist %g\nseen %d nodes\n\n",
               testNode.x[0], testNode.x[1],
               found->x[0], found->x[1], found->x[2], found->x[3], sqrt(best_dist), visited);
    }
    return 0;
}
This is a sample output from the running code
>> WP tree
searching for (0, 2)
found (2, 4 8, 10) dist 2.82843
seen 4 nodes
8 10>> WP tree
searching for (8, 10)
found (7, 9 5, 56) dist 1.41421
seen 11 nodes
5 56>> WP tree
searching for (5, 56)
found (7, 9 5, 56) dist 1.41421
seen 16 nodes
5 56>> WP tree
searching for (5, 56)
found (7, 9 5, 56) dist 1.41421
seen 21 nodes
5 56>> WP tree
searching for (5, 56)
found (7, 9 5, 56) dist 1.41421
seen 26 nodes
5 56>> WP tree
searching for (5, 56)
found (7, 9 5, 56) dist 1.41421
seen 31 nodes
In fact, (5, 56) should return (6, 9, 0, 2), and the next point to be selected should be (0, 2), etc.
Also, how do I temporarily remove the returned value from wp so that the search does not include its coordinates in the search space, with its value restored at the next iteration? E.g., say I am searching for (5, 56): the entire {{7, 9, 5, 56, 30}} would be removed from wp for that current search, but restored after its nearest data is returned.
Any help will be highly appreciated. Thanks
In a comment, I stated that the OP needs to modify the code to ignore the already known points.
However, that turns out to be incorrect. A k-d tree works very well for locating the nearest "mapped" known point to any supplied point. However, trying to ignore mapped points leads to serious complications. A typical result is that your search simply misses some nearest points. This can be corrected for, but it slows down the search and requires more complex code.
It took me quite a bit of thought, and even an example program, to find out exactly what happens when a k-d tree search tries to exclude points in the tree from consideration. Those who are aware of this property of k-d trees are probably laughing at me for not realizing it earlier; it is quite simple when you see what happens. However, most authors I have found gloss over this, and simply state that k-d trees are approximate nearest-neighbour data structures (at least when you wish to ignore specific points in the tree).
So, in the hopes that this might help someone else, here are my thoughts on the matter.
First, let's review how a "normal" k-d tree nearest neighbour search works:
All points are stored in leaf nodes
Typically a set of points is stored in each leaf node; the size depends on the number of dimensions used. For 3D, I'd start with at least ten vectors per leaf node. The optimum number depends on too many factors for any real guide to be more than handwaving.
Each inner node splits the search volume in half, along one of the axes
To find the approximate nearest neighbour of a test point, a function descends into the tree, always selecting the subvolume that contains the test point, until it arrives at a leaf node; there, it finds the point closest to the test point
To ensure that the neighbour found in the leaf node is the real nearest neighbour, or to find the actual one, a precise nearest-neighbour search must ascend back towards the root. For each inner node, if the splitting plane (split coordinate) is closer than the distance to the nearest neighbour found so far, the other subvolume (the one we did not just ascend from) must also be verified.
There are several ways to extend the search to k-nearest neighbour.
My original suggestion, that one could simply ignore points in the tree that are already known, will occasionally fail: in cases where a neighbour is found in the leaf node only after others have been excluded, we fail to descend into a sibling node because the split plane is too far. Typically, we stumble onto a closer split plane higher up, closer to the root, and descend into that subvolume, but to no avail: we have already skipped the subvolumes where the true nearest neighbour was.
A much better option is to record all k nearest neighbours, and using the furthest one to decide when not to descend into sibling subvolumes on our way back towards the root. We only need room for the k nearest neighbours, sorted according to distance from the test point. This list will be modified along several steps when returning back from the leaf towards the root.
Of course, that does not return the kth nearest neighbour, but all k nearest neighbours to the test point. Usually, this is preferable. I do not know of efficiency figures (how many leaf nodes must be visited to satisfy a typical search), but if k is typically on the order of number of points stored in each leaf node, the performance should be acceptable.
Directly excluding from consideration specific points in the tree is possible, but complicated. When arriving at the initial leaf node, we must make the decisions on whether to descend into a sibling subvolume on our way back towards root, but exclude it from the nearest neighbour consideration. In particular, if all the points in our initial leaf node are excluded from consideration, we must descend into all sibling subvolumes on our way up, until we do have a nearest neighbour candidate.
Of these, the variant that retrieves the k nearest neighbours seems to me to have the highest potential. (It is quite analogous to the normal exact nearest-neighbour search, and with reasonable values of k (compared to the number of points per leaf node) it should perform at least acceptably; certainly faster than k separate searches on the same dataset.)
I am tempted to consider writing some example code for this, but seeing as I've already made a butt out of myself on this subject, I'm very hesitant.
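For what it is worth, here is one possible shape for that k-best bookkeeping, written against the question's node-per-point tree rather than the leaf-bucketed tree described above, so take it as a sketch of the idea only. dist() and struct kd_node_t come from the question; knn, knn_add, knn_search, and K are names I am inventing.
#define K 3   /* how many neighbours to keep; tune as needed */

struct knn {
    struct kd_node_t *node[K];   /* current k best, sorted by distance */
    double dist[K];              /* squared distances, as dist() returns */
    int count;
};

/* Insert candidate n at squared distance d, keeping the array sorted. */
static void knn_add(struct knn *r, struct kd_node_t *n, double d)
{
    int i;
    if (r->count < K)
        r->count++;
    else if (d >= r->dist[K - 1])
        return;                          /* worse than the current k-th best */
    for (i = r->count - 1; i > 0 && r->dist[i - 1] > d; i--) {
        r->dist[i] = r->dist[i - 1];     /* shift worse entries down */
        r->node[i] = r->node[i - 1];
    }
    r->dist[i] = d;
    r->node[i] = n;
}

static void knn_search(struct kd_node_t *root, struct kd_node_t *nd,
                       int i, int dim, struct knn *r)
{
    if (!root) return;
    double dx = root->x[i] - nd->x[i];
    knn_add(r, root, dist(root, nd, dim));   /* dist() from the question */
    int next = (i + 1) % dim;
    knn_search(dx > 0 ? root->left : root->right, nd, next, dim, r);
    /* prune with the k-th best distance, not the single best one */
    if (r->count == K && dx * dx >= r->dist[K - 1]) return;
    knn_search(dx > 0 ? root->right : root->left, nd, next, dim, r);
}
Used as struct knn r = {0}; knn_search(root, &testNode, 0, 2, &r); when the query point itself is stored in the tree, r.node[0] will typically be that point at distance 0, and r.node[1] is the nearest other point, which sidesteps the delete-and-restore bookkeeping the question asks about.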

Finding the squares in a plane given n points

Given n points in a plane, how many squares can be formed?
I tried calculating the distances between each pair of points, sorting them, and looking for squares among the points with four or more equal distances, after verifying the points and slopes.
But this looks like an approach with very high complexity. Any other ideas?
I thought dynamic programming for checking line segments of equal length might work, but I could not get the idea quite right.
Any better ideas?
P.S.: The squares can be oriented in any manner. They can overlap, share a side, or lie one inside another.
If possible, please give some sample code for the above.
Let d[i][j] = the distance between points i and j. We are interested in a function count(i, j) that returns, as fast as possible, the number of squares that we can draw by using points i and j.
Basically, count(i, j) will have to find two points x and y such that d[i][j] = d[x][y] and check if these 4 points really define a square.
You can use a hash table to solve the problem in O(n^2) on average. Let H[x] = the list of all pairs of points (p, q) that have d[p][q] = x.
Now, for each pair of points (i, j), count(i, j) will have to iterate over H[ d[i][j] ] and count the pairs in that list that form a square with points i and j.
This should run very fast in practice, and I don't think it can ever get worse than O(n^3) (I'm not even sure it can ever get that bad).
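The "check if these 4 points really define a square" step can be reduced to integer arithmetic: since the hash bucket already guarantees the two segments are equally long, they are the diagonals of a square exactly when they share a midpoint and are perpendicular. A hypothetical helper (the Pt type and all names here are mine, not from the answer):
typedef struct { int x, y; } Pt;

/* (a,b) and (c,d) are candidate diagonals of equal length; they bound a
   square iff they share a midpoint (compared doubled, to stay in integers)
   and are perpendicular (zero dot product of the direction vectors). */
int is_square(Pt a, Pt b, Pt c, Pt d)
{
    return a.x + b.x == c.x + d.x
        && a.y + b.y == c.y + d.y
        && (b.x - a.x) * (d.x - c.x) + (b.y - a.y) * (d.y - c.y) == 0;
}
Equal diagonals that bisect each other give a rectangle; perpendicular diagonals make it a square, which is why the three checks suffice.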
This problem can be solved in O(n^1.5) time with O(n) space.
The basic idea is to group the points by X or Y coordinate, being careful to avoid making groups that are too large. The details are in the paper Finding squares and rectangles in sets of points. The paper also covers lots of other cases (allowing rotated squares, allowing rectangles, and working in higher dimensions).
I've paraphrased their 2d axis-aligned square finding algorithm below. Note that I changed their tree set to a hash set, which is why the time bound I gave is not O(n^1.5 log(n)):
Make a hash set of all the points. Something you can use to quickly check if a point is present.
Group the points by their X coordinate. Break any groups with more than sqrt(n) points apart, and re-group those now-free points by their Y coordinate. This guarantees the groups have at most sqrt(n) points and guarantees that for each square there's a group that has two of the square's corner points.
For every group g, for every pair of points p,q in g, check whether the other two points of the two possible squares containing p and q are present. Keep track of how many you find. Watch out for duplicates (are the two opposite points also in a group?).
Why does it work? Well, the only tricky thing is the regrouping. If either the left or right columns of a square are in groups that are not too large, the square will get found when that column group gets iterated. Otherwise both its top-left and top-right corners get regrouped, placed into the same row group, and the square will be found when that row group gets iterated.
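To illustrate step 3 in code (a sketch of mine, not the paper's): axis-aligned squares only, with the sqrt(n) regrouping left out for brevity; point_in_set stands for the hash-set membership test from step 1, and every name here is assumed.
#include <stdlib.h>

/* Count axis-aligned squares having a side in one x-group: ys[] holds the
   m y-coordinates of the points in the column at coordinate x. */
int squares_in_group(const int ys[], int m, int x,
                     int (*point_in_set)(int px, int py))
{
    int count = 0;
    for (int i = 0; i < m; i++)
        for (int j = i + 1; j < m; j++) {
            int d = abs(ys[j] - ys[i]);   /* candidate side length */
            if (point_in_set(x + d, ys[i]) && point_in_set(x + d, ys[j]))
                count++;                  /* square to the right of the column */
            if (point_in_set(x - d, ys[i]) && point_in_set(x - d, ys[j]))
                count++;                  /* square to the left */
        }
    return count;   /* each square is seen from both of its side columns,
                       the duplicate case step 3 warns about */
}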
I have an O(N^2) time, O(N) space solution:
Assume the given points are an array of Point objects, each with x and y.
First iterate through the array and add each item to a HashSet: this de-duplicates the points and gives us O(1) access time. The whole process takes O(N) time.
Using some math: say vertices A, B, C, D can form a square, and the diagonal AC is known; then the corresponding B and D are unique. We can write a function to calculate them. This step is O(1) time.
Now, back to the main routine: write a for-i loop with a for-j inner loop. Say input[i] and input[j] form a diagonal; check whether the endpoints of its anti-diagonal are in the set or not: if they exist, counter++. This process takes O(N^2) time.
My code in C#:
public int SquareCount(Point[] input)
{
    int count = 0;
    HashSet<Point> set = new HashSet<Point>();
    foreach (var point in input)
        set.Add(point);
    for (int i = 0; i < input.Length; i++)
    {
        for (int j = 0; j < input.Length; j++)
        {
            if (i == j)
                continue;
            // For each Point i, Point j, check if b & d exist in set.
            Point[] DiagVertex = GetRestPoints(input[i], input[j]);
            if (set.Contains(DiagVertex[0]) && set.Contains(DiagVertex[1]))
            {
                count++;
            }
        }
    }
    return count;
}

public Point[] GetRestPoints(Point a, Point c)
{
    Point[] res = new Point[2];
    int midX = (a.x + c.x) / 2;  // integer division assumes an integral midpoint
    int midY = (a.y + c.y) / 2;
    int Ax = a.x - midX;
    int Ay = a.y - midY;
    int bX = midX - Ay;
    int bY = midY + Ax;
    Point b = new Point(bX, bY);
    int cX = (c.x - midX);
    int cY = (c.y - midY);
    int dX = midX - cY;
    int dY = midY + cX;
    Point d = new Point(dX, dY);
    res[0] = b;
    res[1] = d;
    return res;
}
It looks like O(n^3) to me. A simple algo might be something like:
for each pair of points
    for each of 3 possible squares which might be formed from these two points
        test remaining points to see if they coincide with the other two vertices
Runtime: O(n log(n)^2), Space: Θ(n), where n is the number of points.
For each point p:
    Add it to the existing arrays, kept sorted on the x- and y-axis respectively.
    For every pair of points that collide with p on the x- and y-axis respectively:
        If there exists another point on the opposite side of p, increment the square count by one.
The intuition is to count how many squares each new point creates. Every square is completed by the insertion of its fourth point. A new point creates a new square if it has colliding points on the concerned axes and the "fourth" point that completes the square exists on the opposite side. This exhausts all possible distinct squares.
Insertion into the arrays can be done with binary search, and checking for the opposite point can be done via a hash table keyed on the points' coordinates.
This algorithm is optimal for sparse points, since there will be very few colliding points to check. It is pessimal for points forming dense squares, for the opposite reason.
The algorithm can be further optimized by tracking whether points in one axis array have a collision in the complementary axis.
Just a thought: if a vertex A is one corner of a square, then there must be vertices B, C, D at the other corners with AB = AD and AC = sqrt(2)AB and AC must bisect BD. Assuming every vertex has unique coordinates, I think you can solve this in O(n^2) with a hash table keying on (distance, angle).
This is just an example implementation in Java - any comments welcome.
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.Map;
import java.util.HashMap;
import java.util.List;
import java.util.ArrayList;

public class SweepingLine {

    public static void main(String[] args) {
        Point[] points = {
            new Point(1,1),
            new Point(1,4),
            new Point(4,1),
            new Point(4,4),
            new Point(7,1),
            new Point(7,4)
        };
        int max = Arrays.stream(points).mapToInt(p -> p.x).max().orElseThrow(NoSuchElementException::new);
        int count = countSquares(points, max);
        System.out.println(String.format("Found %d squares in %d x %d plane", count, max, max));
    }

    private static int countSquares(Point[] points, int max) {
        int count = 0;
        Map<Integer, List<Integer>> map = new HashMap<>();
        for (int x = 0; x < max; x++) {
            for (int y = 0; y < max; y++) {
                for (Point p : points) {
                    if (p.x == x && p.y == y) {
                        List<Integer> ys = map.computeIfAbsent(x, _u -> new ArrayList<Integer>());
                        ys.add(y);
                        Integer ley = null;
                        for (Integer ey : ys) {
                            if (ley != null) {
                                int d = ey - ley;
                                for (Point p2 : points) {
                                    if (x + d == p2.x && p2.y == ey) {
                                        count++;
                                    }
                                }
                            }
                            ley = ey;
                        }
                    }
                }
            }
        }
        return count;
    }

    private static class Point {
        public final int x;
        public final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }
}
Here is a complete implementation of finding the diagonal points in C++!
Given points a and c, return b and d, which lie on the opposite diagonal.
If b or d is not an integer point, discard them (optional).
To find all squares generated by n points, you can check out this C++ implementation.
Idea credited to Kevman. Hope it can help!
vector<vector<int>> createDiag(vector<int>& a, vector<int>& c) {
    double midX = (a[0] + c[0]) / 2.0;
    double midY = (a[1] + c[1]) / 2.0;
    double bx = midX - (a[1] - midY);
    double by = midY + (a[0] - midX);
    double dx = midX - (c[1] - midY);
    double dy = midY + (c[0] - midX);

    // discard the non-integer points
    double intpart;
    if (modf(bx, &intpart) != 0 or modf(by, &intpart) != 0 or
        modf(dx, &intpart) != 0 or modf(dy, &intpart) != 0) {
        return {{}};
    }
    return {{(int)bx, (int)by}, {(int)dx, (int)dy}};
}

Finding the intersecting node from two intersecting linked lists

Suppose there are two singly linked lists, both of which intersect at some point and become a single linked list.
The head (start) pointers of both lists are known, but the intersecting node is not. Also, the number of nodes in each list before the intersection is unknown, and the two lists may differ: List1 may have n nodes before it reaches the intersection point and List2 may have m nodes before it reaches the intersection point, where m and n may satisfy
m = n,
m < n, or
m > n.
One known, easy solution is to compare every node pointer in the first list with every node pointer in the second list; the matching node pointers lead us to the intersecting node. But the time complexity in this case is O(n^2), which is high.
What is the most efficient way of finding the intersecting node?
This takes O(M+N) time and O(1) space, where M and N are the total lengths of the linked lists. It may be inefficient if the common part is very long (i.e. M, N >> m, n):
Traverse the two linked list to find M and N.
Get back to the heads, then traverse |M − N| nodes on the longer list.
Now walk in lock step and compare the nodes until you found the common ones.
Edit: See more here.
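In C, the three steps look something like this (a sketch of mine; the node type is assumed, since the question does not give one):
struct node { int data; struct node *next; };

static int length(const struct node *n) {
    int len = 0;
    for (; n; n = n->next) len++;
    return len;
}

struct node *find_intersection(struct node *a, struct node *b) {
    int la = length(a), lb = length(b);     /* step 1: find M and N */
    while (la > lb) { a = a->next; la--; }  /* step 2: skip |M - N| nodes */
    while (lb > la) { b = b->next; lb--; }
    while (a != b) {                        /* step 3: lock-step walk */
        a = a->next;
        b = b->next;
    }
    return a;   /* the common node, or NULL if the lists never meet */
}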
If possible, you could add a 'color' field or similar to the nodes. Iterate over one of the lists, coloring the nodes as you go. Then iterate over the second list. As soon as you reach a node that is already colored, you have found the intersection.
Dump the contents (or addresses) of both lists into one hash table. The first collision is your intersection.
Check the last node of each list: if there is an intersection, their last nodes will be the same.
This is a crazy solution I found while coding late at night. It is 2x slower than the accepted answer but uses a nice arithmetic hack:
public ListNode findIntersection(ListNode a, ListNode b) {
    if (a == null || b == null)
        return null;

    int A = a.count();
    int B = b.count();

    ListNode reversedB = b.reverse();

    // L = a elements + 1 c element + b elements
    int L = a.count();

    // restore b
    reversedB.reverse();

    // A = a + c
    // B = b + c
    // L = a + b + 1
    int cIndex = ((A + B) - (L - 1)) / 2;
    return a.atIndex(A - cIndex);
}
We split the lists into three parts: a, the part of the first list before the start of the common part; b, the part of the second list before the common part; and c, the common part of the two lists. We count the list sizes, then reverse list b; as a result, a traversal that starts at the head of the first list ends by walking back along reversedB (we go a -> firstElementOfC -> reversedB), so that traversal has length L = a + 1 + b. Together with A = a + c and B = b + c, this gives cIndex = ((A + B) - (L - 1)) / 2 = ((a + c) + (b + c) - (a + b)) / 2 = c, and a.atIndex(A - cIndex) is the first common node.
This is too slow for programming competitions or use in production, but I think the approach is interesting.
Maybe irrelevant at this point, but here's my dirty recursive approach.
This takes O(M) time and O(M) space, where M >= N, for list_M of length M and list_N of length N.
Recursively iterate to the end of both lists, then count from the end on the way back (step 2). Note that list_N will hit null before list_M when M > N.
Same lengths (M = N): they intersect where list_M != list_N && list_M.next == list_N.next.
Different lengths (M > N): they intersect where list_N != null.
Code Example:
Node yListsHelper(Node n1, Node n2, Node result) {
    if (n1 == null && n2 == null)
        return null;
    yLists(n1 == null ? n1 : n1.next, n2 == null ? n2 : n2.next, result);
    if (n1 != null && n2 != null) {
        if (n2.next == null) { // n1 > n2
            result.next = n1;
        } else if (n1.next == null) { // n1 < n2
            result.next = n2;
        } else if (n1 != n2 && n1.next == n2.next) { // n1 = n2
            result.next = n1.next; // or n2.next
        }
    }
    return result.next;
}
