Double bridge move in TSP - permutation

I want to implement the double bridge move for the TSP. I know that I have to select 3 random positions, split the permutation into 4 parts, and then reconnect these parts in a different order, but I would like to know all the possible combinations that the double bridge move can produce for a TSP instance.
Assuming that the number of cities is n, will the number of possible combinations for the double bridge be n?

If we divide the whole permutation into roughly four equal parts and reconnect them to find a new solution, then the approximate number of neighborhood solutions may be [(n-2)/4]^3 for n cities. Here, [x] denotes the least integer that is greater than or equal to x (the ceiling of x).
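For reference, the move itself can be sketched as follows. This is a minimal Python sketch, assuming the tour is a list of city indices and that the four parts A, B, C, D are reconnected as A, C, B, D (the usual form of the double bridge); the function name and the optional cuts parameter are only illustrative.

```python
import random

def double_bridge(tour, cuts=None, rng=random):
    """Return a new tour: split into A|B|C|D at three cuts, rejoin as A+C+B+D."""
    n = len(tour)
    if n < 4:
        raise ValueError("need at least 4 cities for four non-empty parts")
    if cuts is None:
        # Three distinct cut positions strictly inside the tour.
        cuts = sorted(rng.sample(range(1, n), 3))
    i, j, k = cuts
    return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]

print(double_bridge(list(range(8)), cuts=(2, 4, 6)))
```

With cuts (2, 4, 6) the parts are [0, 1], [2, 3], [4, 5], [6, 7], and the two middle parts swap places.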

Related

Simplest way to make a histogram of an unknown, finite list of discrete floating point numbers

I have code that generates a sequence of configurations of some system of interest (Markov Chain Monte Carlo). For each configuration, I make a measurement of a particular value for that configuration, which is bounded between zero and some maximum that I can presumably predict beforehand; let's call it Rmax. It can only take a finite number of discrete values between 0 and Rmax, but the values could be irrational and are not evenly spaced, and I don't know them a priori, or necessarily how many there are (though I could probably estimate an upper bound). I want to generate a very large number of configurations (on the order of 1e8) and make a histogram of the distribution of these values, but the issue that I am facing is how to effectively keep track of them.
For example, if the values were integers in the range [0,N-1], I would just create an integer array of N elements, initially set to zero, and increment the appropriate array element for each configuration, e.g. in pseudocode
do i = 1, 100000000
    call generateConfig()
    R = measureR()            ! R is an integer in [0, N-1]
    Rhist(R) = Rhist(R) + 1   ! Rhist is indexed from 0 to N-1
end do
How can I do something similar to count or tally the number of times each of these irrational, non-uniformly distributed numbers occurs?
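One possible direction, sketched in Python: replace the integer array with a hash map keyed on the measured value. This is only a sketch under an assumption that is not in the question, namely that two measurements of the same underlying value agree to about 9 decimal places, so rounding merges floating-point noise; the function name and the decimals parameter are made up for illustration.

```python
from collections import Counter
import math

def tally(values, decimals=9):
    """Count occurrences of each distinct value, merging values that
    agree after rounding to `decimals` places (an assumed tolerance)."""
    counts = Counter()
    for r in values:
        counts[round(r, decimals)] += 1
    return counts

samples = [math.sqrt(2), 0.5, math.sqrt(2), math.pi, 0.5, 0.5]
for value, count in sorted(tally(samples).items()):
    print(value, count)
```

The memory used grows with the number of distinct values rather than with the number of configurations, which matters for 1e8 samples.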

How to find pattern groups in boolean array?

Given a 2D array of Boolean values I want to find all patterns that consist of at least 2 columns and at least 2 rows. The problem is somewhat close to finding cliques in a graph.
In the example below green cells represent "true" bits, greys are "false". Pattern 1 contains cols 1,3,4 and 5 and rows 1 and 2. Pattern 2 contains only columns 2 and 4, and rows 2,3,4.
The business idea behind this is to find similarity patterns among various groups of social network users. In the real world the number of rows can go up to 3E7, and the number of columns up to 300.
I can't really figure out a solution other than brute-force matching.
Please advise me of the proper name of the problem, so I can read more, or suggest an elegant solution.
This is (equivalent to) asking for all bicliques (complete bipartite subgraphs) larger than a certain size in a bipartite graph. Here the rows are the vertices of one part A of the graph, and the columns are the vertices of the other part B, and there is an edge between u \in A and v \in B whenever the cell at row u, column v is green.
Although you say that you want to find all patterns, you probably want to find only the maximal ones, that is, patterns that cannot be extended to become larger patterns by adding more rows or columns. (Otherwise, for any pattern with c >= 2 columns and r >= 3 rows, you will also get back the more than 2^(c-2)*2^(r-3) non-maximal patterns that can be formed by deleting some of the rows or columns.)
But even listing just the maximal patterns can take time exponential in the number of rows and columns, assuming that P != NP. That's because the problem of finding a maximum (i.e. largest-possible) pattern, in terms of the total number of green cells, has been proven to be NP-complete: if it were possible to list all maximal patterns in polynomial time, then we could simply do so, and pick the largest, thereby solving this NP-complete problem in polynomial time.
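To make the notion of a maximal pattern concrete, here is a small brute-force sketch in Python. It is exponential in the number of rows, so it is only useful on toy inputs, not on 3E7 rows. The grid below is a hypothetical stand-in for the picture in the question (0-based indices), and the function name is illustrative.

```python
from itertools import combinations

def maximal_patterns(grid, min_rows=2, min_cols=2):
    """Enumerate maximal (rows, cols) patterns of True cells by brute force."""
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    # For each row, the set of columns that are True.
    row_sets = [frozenset(c for c in range(n_cols) if grid[r][c])
                for r in range(n_rows)]
    found = set()
    for size in range(min_rows, n_rows + 1):
        for rows in combinations(range(n_rows), size):
            # Columns shared by every chosen row.
            cols = frozenset.intersection(*(row_sets[r] for r in rows))
            if len(cols) < min_cols:
                continue
            # Extend to every row containing those columns, so the
            # pattern cannot be enlarged by adding a row or a column.
            closed = frozenset(r for r in range(n_rows) if cols <= row_sets[r])
            found.add((closed, cols))
    return found

T, F = True, False
grid = [[T, F, T, T, T],
        [T, T, T, T, T],
        [F, T, F, T, F],
        [F, T, F, T, F]]
for rows, cols in maximal_patterns(grid):
    print(sorted(rows), sorted(cols))
```

On this grid the sketch reports two patterns: rows 0-1 with columns 0, 2, 3, 4, and rows 1-3 with columns 1, 3.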

Algorithm - How to select one number from each column in an array so that their sum is as close as possible to a particular value

I have an m x n matrix of real numbers. I want to choose a single value from each column such that the sum of my selected values is as close as possible to a pre-specified total.
I am not an experienced programmer (although I have an experienced friend who will help). I would like to achieve this using Matlab, Mathematica or C++ (MySQL if necessary).
The code only needs to run a few times, once every few days - it does not necessarily need to be optimised. I will have 16 columns and about 12 rows.
Normally I would suggest dynamic programming, but there are a few features of this situation suggesting an alternative approach. First, the performance demands are light; this program will be run only a couple times, and it doesn't sound as though a running time on the order of hours would be a problem. Second, the matrix is fairly small. Third, the matrix contains real numbers, so it would be necessary to round and then do a somewhat sophisticated search to ensure that the optimal possibility was not missed.
Instead, I'm going to suggest the following semi-brute-force approach. 12**16 ~ 1.8e17, the total number of possible choices, is too many, but 12**9 ~ 5.2e9 is doable with brute force, and 12**7 ~ 3.6e7 fits comfortably in memory. Compute all possible choices for the first seven columns. Sort these possibilities by total. For each possible choice for the last nine columns, use an efficient search algorithm to find the best mate among the first seven. (If you have a lot of memory, you could try eight and eight.)
I would attempt a first implementation in C++, using std::sort and std::lower_bound from the <algorithm> standard header. Measure it; if it's too slow, then try an in-memory B+-tree (does Boost have one?).
I spent some more time thinking about how to implement what I wrote above in the simplest way possible. Here's an approach that will work well for a 12 by 16 matrix on a 64-bit machine with roughly 4 GB of memory.
The number of choices for the first eight columns is 12**8. Each choice is represented by a 4-byte integer between 0 and 12**8 - 1. To decode a choice index i, the row for the first column is given by i % 12. Then update i /= 12; the row for the second column is now given by i % 12, et cetera.
A vector holding all choices requires roughly 12**8 * 4 bytes, or about 1.6 GB. Two such vectors require 3.2 GB. Prepare one for the first eight columns and one for the last eight. Sort them by the sum of the entries that they indicate. Use saddleback search to find the best combination. (Initialize an iterator into the first vector and a reverse iterator into the second. While neither iterator is at its end, compare the current combination against the current best and update the current best if necessary. If the current combination sums to less than the target, increment the first iterator. If the sum is greater than the target, increment the second iterator.)
I would estimate that this requires less than 50 lines of C++.
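To illustrate the split-and-search idea on a small scale, here is a Python sketch rather than the C++ suggested above. It assumes the matrix is given as a list of columns, splits the columns in half, and uses bisect_left in the role of std::lower_bound; the function names are illustrative, and a full 12 by 16 matrix would need the more memory-conscious encoding described above.

```python
from bisect import bisect_left
from itertools import product

def half_sums(columns):
    """All (sum, picks) pairs for one entry per column, sorted by sum."""
    return sorted((sum(vals), vals) for vals in product(*columns))

def closest_selection(columns, target):
    """Pick one value per column so that the total is closest to target."""
    mid = len(columns) // 2
    left = half_sums(columns[:mid])
    right = half_sums(columns[mid:])
    right_totals = [s for s, _ in right]
    best = None
    for left_sum, left_picks in left:
        # Position of the first right-half sum >= what is still missing.
        i = bisect_left(right_totals, target - left_sum)
        for j in (i - 1, i):  # nearest candidates below and above
            if 0 <= j < len(right):
                total = left_sum + right[j][0]
                if best is None or abs(total - target) < abs(best[0] - target):
                    best = (total, left_picks + right[j][1])
    return best

columns = [[1, 5], [2, 8], [3, 9], [4, 7]]
print(closest_selection(columns, 10))
```

Each left-half choice costs one binary search, so the work is roughly the size of one half times the logarithm of the other, instead of the product of the two.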
Without knowing the range of values that might fill the arrays, how about something generic like this:
1. Divide the target by the number of remaining columns.
2. Pick the number from the current column that is closest to that value.
3. Repeat from step 1 until a number has been picked from each column.
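A short Python sketch of this heuristic follows. One detail is an assumption on my part, since the steps do not spell it out: after each pick, the picked value is subtracted from the target, so "the target" in step 1 means what is still left to reach. The result is not guaranteed to be optimal.

```python
def greedy_pick(columns, target):
    """Pick one value per column, aiming each pick at an equal share
    of the remaining target (assumed interpretation of the steps)."""
    remaining = target
    picks = []
    for idx, column in enumerate(columns):
        share = remaining / (len(columns) - idx)
        choice = min(column, key=lambda v: abs(v - share))
        picks.append(choice)
        remaining -= choice
    return picks

print(greedy_pick([[1, 5], [2, 8], [3, 9], [4, 7]], 10))
```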

Search for the closest points to a given point among 1 million points

This is an algorithm question.
Given 1 million points, each with x and y coordinates that are floating point numbers,
find the 10 closest points to a given point as fast as possible.
The closeness can be measured as Euclidean distance on a plane or other kind of distance on a globe. I prefer binary search due to the large number of points.
My idea:
Save the points in a database:
1. Multiply x by a large integer, e.g. 10^4, cut off the decimal part, and then multiply the resulting integer by 10^4 again.
2. Multiply y by a large integer, e.g. 10^2, and cut off the decimal part.
3. Sum the results from steps 1 and 2; we call the sum the associate_value.
4. Repeat steps 1 to 3 for each point in the database.
E.g.
x = 12.3456789, y = 98.7654321
x times 10^4 = 123456.789, which truncates to 123456; times 10^4 again gives 1234560000
y times 10^2 = 9876.54321, which truncates to 9876
Sum them to get 1234560000 + 9876 = 1234569876
In this way, I transform 2-d data to 1-d data. In the database, each point is associated with an integer (associate_value). The integer column can be set as index in the database for fast search.
For a given point (x, y), I perform step 1 - 3 for it and then find the points in the database such that their associate_value is close to the given point associate_value.
e.g.
x = 59.469797 , y = 96.4976416
their associated value is 5946979649
Then in the database, I search the associate_values that are close to 5946979649, for example, 5946979649 + 50 , 5946979649 - 50 and also 5946979649 + 50000000 , 5946979649 - 500000000. This can be done by index-search in database.
In this way, I can find a group of points that are close to the given point. I can reduce the search space greatly. Then, I can use Euclidean or other distance formula to find the closest points.
I am not sure about the efficiency of the algorithm, especially the process of generating the associate_values.
Does my idea work? Are there any better ideas?
Thanks
Your idea seems like it may work, but I would be concerned with degenerate cases (for example, if no points fall in your specified ranges, though maybe that's not possible given the constraints). Either way, since you asked for other ideas, here's my stab at it: store all of your points in a quad tree. Then just walk down the quad tree until you have a sufficiently small group to search through. Since the points are fixed, building the quad tree is a one-time cost, and each lookup should be logarithmic in the number of points you have.
You can do better and just interleave the bits of the binary x- and y-coordinates. Instead of along a straight line, this orders the points along a z-curve. Then you can compute the upper bounds with the most significant bits. The z-curve is often used in mapping applications: http://msdn.microsoft.com/en-us/library/bb259689.aspx.
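A minimal Python sketch of such a z-curve key, assuming non-negative coordinates that are first scaled and truncated to integers as in the question; the function names, the scale of 10^4 and the 32-bit width are illustrative choices, not part of the linked article.

```python
def interleave_bits(x, y, bits=32):
    """Interleave the low `bits` bits of two non-negative integers
    into a single z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key

def z_key(x, y, scale=10**4, bits=32):
    """Quantize floating point coordinates, then interleave their bits."""
    return interleave_bits(int(x * scale), int(y * scale), bits)

print(interleave_bits(3, 0), interleave_bits(0, 3), interleave_bits(2, 3))
```

Points whose keys share a long prefix of most significant bits lie in the same square cell, which is what makes a range scan on an indexed key column useful; points near a cell boundary can still be close in the plane while far apart in key order.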
The way I read your algorithm, you are grouping together the values that lie along a line with a slope of -1 through your point, i.e. if your point is (2,2) you would look at points (1,3), (0,4) and (-1,5), and likely miss points that are closer. Most algorithms to solve this are O(n), which isn't terribly bad.
A simple algorithm to solve this problem is to keep a priority queue of the closest ten points, together with the distance of the furthest of those ten, as you iterate over the set. If the x or y value of a point is not within that furthest distance, discard the point immediately. Otherwise calculate its distance with whatever distance measurement you're using and see if it gets inserted into the queue. If so, update your furthest-of-the-top-ten threshold and continue iterating.
If your points are pre-sorted on one of the axes, you can further optimize the algorithm by starting at the point that matches on that axis and radiating outward until the difference on that axis is greater than the distance to your tenth closest point. I did not include sorting in the description in the paragraph above because sorting is O(n log n), which is slower than O(n). If you are doing this multiple times on the same set then it could be beneficial to sort it.
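The priority-queue scan can be sketched in Python roughly like this, assuming Euclidean distance and k >= 1; heapq is a min-heap, so distances are stored negated to keep the furthest of the kept points on top. The function name is illustrative.

```python
import heapq
import math

def k_nearest(points, query, k=10):
    """Return the k points closest to query, nearest first (k >= 1)."""
    qx, qy = query
    heap = []  # entries are (-distance, point); heap[0] is the furthest kept
    for x, y in points:
        if len(heap) == k:
            worst = -heap[0][0]
            # Cheap rejection: too far along one axis already.
            if abs(x - qx) > worst or abs(y - qy) > worst:
                continue
        d = math.hypot(x - qx, y - qy)
        if len(heap) < k:
            heapq.heappush(heap, (-d, (x, y)))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, (x, y)))
    return [p for _, p in sorted(heap, reverse=True)]

points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0), (9.0, 9.0)]
print(k_nearest(points, (1.5, 1.5), k=3))
```

This is a single pass over the points with at most k entries in memory, so it needs no preprocessing when the query is run only once.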

Graph Theory - Fill nodes with a limited number of routes

First I must say this is not homework or anything related; it is a problem from a game named Freeciv.
In the game we have n cities (usually 8-12), each city can have at most k trade routes (usually 4), and the two cities of a trade route need to be at distance d or further apart (8 Manhattan tiles).
The problem consists in finding the k*n trade routes with maximum (or minimum) distances. Obviously this can be solved with a brute-force algorithm, but that is really slow when the player has more than 10 cities, because the program has to make many iterations. I tried to solve it using graph theory, but I am not an expert in it; I even asked some of my teachers and none could explain a smart algorithm to me. So I did not come here to find the exact solution, but to get the idea or the steps to analyze this.
The problem has two parts:
1. Calculate the pair-wise distances between the cities.
2. Select which pairs should become trade routes.
I don't think the first part can be calculated faster than O(n·t) where t is number of tiles, as each run of Dijkstra's algorithm will give you distances from one city to all other cities. However if I understand correctly, distance between two cities never changes and is symmetrical. So whenever a new city is built, you just need to run Dijkstra's algorithm from it and cache the distances.
For the second part I would expect a greedy algorithm to work. Order all pairs of cities by suitability and in each step pick the first one that does not violate the constraint of k routes per city. I am not sure whether it can be proven (the proof, if it exists, should be similar to the one for Kruskal's minimal spanning-tree algorithm). But I suspect it will work fine in practice even if you find that it does not work in theory (I haven't tried to either prove or disprove it; it's up to you).
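A rough Python sketch of that greedy selection, with several assumptions of mine: cities are given as (x, y) tile coordinates, the distance is plain Manhattan distance instead of cached Dijkstra results, and "suitability" simply means longest route first (or shortest first). As said above, this is not proven to be optimal.

```python
from itertools import combinations

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def greedy_routes(cities, k=4, min_dist=8, longest_first=True):
    """Greedily pick trade routes (i, j, distance), at most k per city."""
    pairs = [(manhattan(cities[i], cities[j]), i, j)
             for i, j in combinations(range(len(cities)), 2)]
    # Drop pairs that are closer than the minimum trade-route distance.
    pairs = [p for p in pairs if p[0] >= min_dist]
    pairs.sort(reverse=longest_first)
    used = [0] * len(cities)  # number of routes already assigned per city
    routes = []
    for dist, i, j in pairs:
        if used[i] < k and used[j] < k:
            routes.append((i, j, dist))
            used[i] += 1
            used[j] += 1
    return routes

cities = [(0, 0), (10, 0), (0, 10), (10, 10)]
print(greedy_routes(cities, k=1))
```

With k=1 on this square, the two diagonals (distance 20) are picked and every city is then full.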
Continuing in @Jan Hudec's way:
Init stage:
Let's say you have N cities (c1, c2, ... cN). You should build a list of connections in which each entry has the format (cX, cY, Distance) (with X < Y, this takes n^2/2 time) and order it by distance (descending for max distance or ascending for min distance). You should also have an array/list that holds the number of connections per city (cZ = W), initialized to N-1 for each city because all cities are connected at the beginning.
Iterations:
Iterate over the list of connections.
For each (cX, cY, D), if the number of connections (in the connection-count array) of cX > k and of cY > k, then delete (cX, cY, D) from the connection list and also decrease by one the values of cX and cY in the connection-count array.
In the end, you'll have the connection list with the values you wish for.
