FIND-S Algorithm - simple question - artificial-intelligence

The FIND-S algorithm is probably one of the simplest machine learning algorithms. However, I can't find many examples out there, just the standard 'sunny, rainy, play-ball' examples that are always used in machine learning. Could someone please help me with this application (it's a past exam question in machine learning)?
Hypotheses are of the form a <= x <= b, c <= y <= d, where x and y are the coordinates of a point in the x,y plane and a, b, c and d are any integers. Basically, these hypotheses define rectangles in the x,y space.
These are the training examples where - is a negative example and + is a positive example and the pairs are the x,y co-ordinates:
+ 4, 4
+ 5, 3
+ 6, 5
- 1, 3
- 2, 6
- 5, 1
- 5, 8
- 9, 4
All I want to do is apply FIND-S to this example! It must be simple! Either some tips or a solution would be awesome.
Thank you.

Find-S seeks the most restrictive (i.e. most 'specific') hypothesis that fits all the positive examples (negatives are ignored).
In your case, there's an obvious graphical interpretation: "find the smallest rectangle that contains all the '+' coordinates"...
... which would be a=4, b=6, c=3, d=5.
The algorithm for doing it would be something like this:
Define a hypothesis rectangle h[a,b,c,d], and initialise it to [-,-,-,-]
for each + example e {
    if e is not within h {
        enlarge h to be just big enough to hold e (and all previous e's)
    } else { do nothing: h already contains e }
}
If we step through this with your training set, we get:
0. h = [-,-,-,-] // initial value
1. h = [4,4,4,4] // (4,4) is not in h: change h so it just contains (4,4)
2. h = [4,5,3,4] // (5,3) is not in h, so enlarge h to fit (4,4) and (5,3)
3. h = [4,6,3,5] // (6,5) is not in h, so enlarge again
4. // no more positive examples left, so we're done.
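The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `find_s_rectangle` and the `(label, x, y)` tuple layout are assumptions made for the example, and `None` stands in for the initial [-,-,-,-] hypothesis.

```python
def find_s_rectangle(examples):
    """Return (a, b, c, d) for the smallest rectangle a <= x <= b, c <= y <= d
    that contains every positive example, or None if there are no positives."""
    h = None  # plays the role of the initial [-,-,-,-] hypothesis
    for label, x, y in examples:
        if label != "+":
            continue  # FIND-S ignores negative examples
        if h is None:
            h = (x, x, y, y)  # first positive: rectangle is a single point
        else:
            a, b, c, d = h
            # enlarge h just enough to hold (x, y); no change if already inside
            h = (min(a, x), max(b, x), min(c, y), max(d, y))
    return h


training = [("+", 4, 4), ("+", 5, 3), ("+", 6, 5),
            ("-", 1, 3), ("-", 2, 6), ("-", 5, 1), ("-", 5, 8), ("-", 9, 4)]
print(find_s_rectangle(training))  # (4, 6, 3, 5)
```

None of the negative examples falls inside the resulting rectangle, so the hypothesis is consistent with the whole training set.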


MATLAB Vectorised Pairwise Distance

I'm struggling to vectorise a function which performs a somewhat pairwise difference between two vectors x = 2xN and v = 2xM, for some arbitrary N, M. I have this to work when N = 1, although, I would like to vectorise this function to apply to inputs with N arbitrary.
Indeed, what I want this function to do is for each column of x find the normed difference between x(:,column) (a 2x1) and v (a 2xM).
A similar post is this, although I haven't been able to generalise it.
Current implementation
function mat = vecDiff(x,v)
    diffVec = bsxfun(@minus, x, v);
    mat = diffVec ./ vecnorm(diffVec);
Example
x =
1
1
v =
1 3 5
2 4 6
----
vecDiff(x,v) =
0 -0.5547 -0.6247
-1.0000 -0.8321 -0.7809
Your approach can be adapted as follows to suit your needs:
Permute the dimensions of either x or v so that its number of columns becomes the third dimension. I'm choosing v in the code below.
This lets you exploit implicit expansion (or equivalently bsxfun) to compute a 2×N×M array of differences, where N and M are the numbers of columns of x and v respectively.
Compute the vector-wise (2-)norm along the first dimension and use implicit expansion again to normalize this array:
x = [1 4 2 -1; 1 5 3 -2];
v = [1 3 5; 2 4 6];
diffVec = x - permute(v, [1 3 2]);
diffVec = diffVec./vecnorm(diffVec, 2, 1);
You may need to apply permute differently if you want the dimensions of the output in another order.
Suppose your two input matrices are A (a 2 x N matrix) and B (a 2 x M matrix), where each column represents a different observation (note that this is not the traditional way to represent data).
Note that the output will be of the size N x M x 2.
out = zeros(N, M, 2);
We can find the distance between them using the builtin function pdist2.
dists = pdist2(A.', B.'); (with the transpositions required for the orientation of the matrices)
To get the individual x and y distances, the easiest way I can think of is using repmat:
xdists = repmat(A(1,:).', 1, M) - repmat(B(1,:), N, 1);
ydists = repmat(A(2,:).', 1, M) - repmat(B(2,:), N, 1);
And we can then normalise this by the distances found earlier:
out(:,:,1) = xdists./dists;
out(:,:,2) = ydists./dists;
This returns a matrix out where the elements at position (i, j, :) are the components of the normed distance between A(:,i) and B(:,j).
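For readers who want to check the numbers without MATLAB, here is a plain-Python sketch of the same computation. The name `vec_diff` and the list-of-pairs input layout are assumptions for illustration only; it returns NaN components when a point of x coincides with a point of v, which mirrors what 0/0 gives in the MATLAB versions.

```python
import math


def vec_diff(x_cols, v_cols):
    """out[i][j] is the unit vector (x_cols[i] - v_cols[j]) / norm of that difference."""
    out = []
    for x0, x1 in x_cols:
        row = []
        for v0, v1 in v_cols:
            dx, dy = x0 - v0, x1 - v1
            norm = math.hypot(dx, dy)
            if norm == 0.0:
                row.append((math.nan, math.nan))  # coincident points: undefined direction
            else:
                row.append((dx / norm, dy / norm))
        out.append(row)
    return out


# The example from the question: one column in x, three columns in v.
x_cols = [(1, 1)]
v_cols = [(1, 2), (3, 4), (5, 6)]
for ux, uy in vec_diff(x_cols, v_cols)[0]:
    print(round(ux, 4), round(uy, 4))
```

The printed pairs correspond to the columns of the vecDiff(x,v) output shown in the question.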

Algorithm find number position in snail 2D array

I have a square 2D array filled in a snail (spiral) pattern, such as:
(3x3)        (4x4)
1 2 3   or   1  2  3  4
8 9 4        12 13 14 5
7 6 5        11 16 15 6
             10 9  8  7
Given a value and the array size, I am trying to find the Y, X position of that value in the 2D array.
Example:
>> find_y_x_in_snail(3, 4)
1, 2
# in a 3x3 array, search value 4
return y=1 x=2
The only idea I have is to create the snail in a 2D array and return the position, which is not that great.
I found the opposite algorithm here (first example).
Any ideas?
You could use this function:
def find_y_x_in_snail(n, v):
    r = 0
    span = n
    while v > span:
        v -= span
        r += 1
        span -= r%2
    d, m = divmod(r,4)
    c = n-1-d
    return [d, d+v, c, c-v][m], [d+v-1, c, c-v, d][m] # y, x
Explanation
r is the number of corners the "snake" needs to take to get to the targeted value.
span is the number of values in the current, straight segment of the snake, i.e. it starts with n, and decreases after the next corner, and again every second next corner.
d is the distance the snake has from the nearest side of the matrix, i.e. the "winding" level.
m indicates which of the 4 sides the segment containing the targeted value is at:
0: up
1: right
2: down
3: left
Depending on m a value is taken from a list with 4 expressions, each one tailored for the corresponding side: it defines the y coordinate. A similar method is applied (but with different expressions) for x.
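One way to gain confidence in the closed-form lookup is to compare it against the brute-force approach mentioned in the question. The sketch below repeats the function from the answer and adds a hypothetical helper, `build_snail`, that fills the grid explicitly; the helper is an assumption introduced for testing only.

```python
def find_y_x_in_snail(n, v):
    # Function from the answer above, repeated so this block is self-contained.
    r = 0
    span = n
    while v > span:
        v -= span
        r += 1
        span -= r % 2
    d, m = divmod(r, 4)
    c = n - 1 - d
    return [d, d + v, c, c - v][m], [d + v - 1, c, c - v, d][m]  # y, x


def build_snail(n):
    """Brute force: fill an n x n grid with 1..n*n in clockwise spiral order."""
    grid = [[0] * n for _ in range(n)]
    y, x, dy, dx = 0, 0, 0, 1  # start top-left, heading right
    for value in range(1, n * n + 1):
        grid[y][x] = value
        ny, nx = y + dy, x + dx
        if not (0 <= ny < n and 0 <= nx < n) or grid[ny][nx]:
            dy, dx = dx, -dy  # turn clockwise: right -> down -> left -> up
            ny, nx = y + dy, x + dx
        y, x = ny, nx
    return grid


# Every cell of every small grid should agree with the closed-form lookup.
for n in range(1, 8):
    grid = build_snail(n)
    for y in range(n):
        for x in range(n):
            assert find_y_x_in_snail(n, grid[y][x]) == (y, x)

print(find_y_x_in_snail(3, 4))  # (1, 2)
```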

finding maximum sum of a disjoint sequence of an array

Problem from :
https://www.hackerrank.com/contests/epiccode/challenges/white-falcon-and-sequence.
Visit link for references.
I have a sequence of integers (-10^6 to 10^6) A. I need to choose two contiguous disjoint subsequences of A, let's say x and y, of the same size, n.
After that you will calculate the sum given by ∑x(i)y(n−i+1) (1-indexed)
And I have to choose x and y such that sum is maximised.
Eg:
Input:
12
1 7 4 0 9 4 0 1 8 8 2 4
Output: 120
Where x = {4,0,9,4}
y = {8,8,2,4}
∑x(i)y(n−i+1)=4×4+0×2+9×8+4×8=120
Now, the approach that I was thinking of for this is something in lines of O(n^2) which is as follows:
Initialise two variables l = 0 and r = N-1. Here, N is the size of the array.
Now, for l=0, I will calculate the sum while (l<r) which basically refers to the subsequences that will start from the 0th position in the array. Then, I will increment l and decrement r in order to come up with subsequences that start from the above position + 1 and on the right hand side, start from right-1.
Is there any better approach that I can use? Anything more efficient? I thought of sorting but we cannot sort numbers since that will change the order of the numbers.
To answer the question we first define S(i, j) to be the max sum obtained by multiplying the items of the two sub-sequences, for sub-array A[i...j], when the sub-sequence x starts at position i and sub-sequence y ends at position j.
For example, if A=[1 7 4 0 9 4 0 1 8 8 2 4], then S(1, 2)=1*7=7 and S(2, 5)=7*9+4*0=63.
The recursive rule to compute S is: S(i, j)=max(0, S(i+1, j-1)+A[i]*A[j]), and the end condition is S(i, j)=0 iff i>=j.
The requested final answer is simply the maximum value of S(i, j) over all combinations of i=1..N, j=1..N, since one of the S(i, j) values will correspond to the max x,y sub-sequences, and thus will be equal to the maximum value for the whole array. The complexity of computing all such S(i, j) values is O(N^2) using dynamic programming, since in the course of computing S(i, j) we will also compute up to N other S(i', j') values, but ultimately each combination will be computed only once.
def max_sum(l):
    def _max_sub_sum(i, j):
        if m[i][j] is None:
            v=0
            if i<j:
                v=max(0, _max_sub_sum(i+1, j-1)+l[i]*l[j])
            m[i][j]=v
        return m[i][j]

    n=len(l)
    m=[[None for i in range(n)] for j in range(n)]
    v=0
    for i in range(n):
        for j in range(i, n):
            v=max(v, _max_sub_sum(i, j))
    return v
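The same recurrence can also be evaluated without recursion or an N x N table, because S(i, j) depends only on S(i+1, j-1), i.e. on the pair with the same i + j. The following is a sketch under that observation; the name `max_sum_iterative` is an assumption for illustration, and indices are 0-based. It is still O(N^2) time but uses O(1) extra memory.

```python
def max_sum_iterative(a):
    """Maximum of S(i, j) = max(0, S(i+1, j-1) + a[i]*a[j]) over all i < j."""
    n = len(a)
    best = 0
    # s = i + j is constant along each chain S(i, j) -> S(i+1, j-1).
    for s in range(1, 2 * n - 2):
        i = (s - 1) // 2  # innermost pair with i < j and i + j == s
        j = s - i
        cur = 0           # S is 0 once i >= j
        while i >= 0 and j < n:
            cur = max(0, cur + a[i] * a[j])
            best = max(best, cur)
            i -= 1        # widen the pair outward
            j += 1
    return best


A = [1, 7, 4, 0, 9, 4, 0, 1, 8, 8, 2, 4]
print(max_sum_iterative(A))  # 120
```

Avoiding recursion also sidesteps Python's recursion limit, which the memoised version can hit on long inputs.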
WARNING:
This method assumes the numbers are non-negative, so it does not answer the poster's actual problem now that it has been clarified that negative input values are allowed.
Trick 1
Assuming the numbers are always non-negative, it is always best to make the sequences as wide as possible given the location where they meet.
Trick 2
We can change the sum into a standard convolution by summing over all values of i. This produces twice the desired result (as we get both the product of x with y, and y with x), but we can divide by 2 at the end to get the original answer.
Trick 3
You are now attempting to find the maximum of a convolution of a signal with itself. There is a standard method for doing this which is to use the fast fourier transform. Some libraries will have this built in, e.g. in Scipy there is fftconvolve.
Python code
Note that you don't allow the central value to be reused (e.g. for a sequence 1,3,2 we can't make x 1,3 and y 3,1) so we need to examine alternate values of the convolved output.
We can now compute the answer in Python via:
import scipy.signal
A = [1, 7, 4, 0, 9, 4, 0, 1, 8, 8, 2, 4]
print(max(scipy.signal.fftconvolve(A, A)[1::2]) / 2)

How can i find the number of lowest possible square that can fit in the given square

Let's suppose I have a 7x7 square. I can fill it with other squares (i.e. squares of dimension 1x1, 2x2, ..., 6x6). How can I fill the square with the fewest possible smaller squares? Please help me.
Consider a square with dimensions s x s. Cutting a smaller square of dimensions m x m out will result in a square of m x m, a square of n x n, and two rectangles of dimensions m x n, where m + n = s.
When s is even, the square can be divided such that m = n, in which case the rectangles will also be squares, resulting in an answer of 4.
However, when s is odd, values of m and n must be chosen such that the resulting rectangle can be filled with the least number of squares possible. There doesn't seem to be an immediately obvious way to figure out the best configuration, so I would suggest coming up with an algorithm to figure out the least number of squares that can be used to fill a rectangle of size m x n (this is a slightly simpler problem and I believe it can be solved with a recursive algorithm). The total number of squares needed will then be equal to 2 x ([number of squares in m x n rectangle] + 1). You can use a loop to check all the sizes of m between 1 and s/2.
Hope that gets you started.
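As a rough sketch of the suggestion above: the code below fills the m x n rectangle using straight edge-to-edge (guillotine) cuts only, which is an assumption of this example and not something the answer specifies. The function names are hypothetical. Because of that restriction, and because the formula fixes the overall layout, the result is only an upper bound on the true minimum.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def rect_squares(m, n):
    """Fewest squares tiling an m x n rectangle when only straight
    edge-to-edge (guillotine) cuts are allowed. Assumption: this can be
    larger than the true minimum for some rectangles."""
    if m == n:
        return 1
    best = m * n  # fallback: all 1x1 squares
    for k in range(1, m // 2 + 1):   # cut across the first dimension
        best = min(best, rect_squares(k, n) + rect_squares(m - k, n))
    for k in range(1, n // 2 + 1):   # cut across the second dimension
        best = min(best, rect_squares(m, k) + rect_squares(m, n - k))
    return best


def square_estimate(s):
    """The answer's formula 2 * (squares in m x n rectangle + 1),
    minimised over m = 1 .. s/2 with n = s - m. Requires s >= 2."""
    return min(2 * (rect_squares(m, s - m) + 1) for m in range(1, s // 2 + 1))


for s in (2, 3, 4, 5, 7):
    print(s, square_estimate(s))
```

For s = 2, 3 and 5 this gives 4, 6 and 8. For s = 7 it gives 10, so layouts outside this family can do better.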
Consider a square with dimensions s x s.
Factorise s into primes. Then solve the problem for each prime factor sp. The answer for sp x sp can be reused for s x s. It is probable that the smallest prime will give the lowest result. I have no proof of this, but I have checked by hand up to 17 x 17.
This is a generalisation of Otaia's notion of an even s resulting in an answer of 4.
Placing algorithm:
You need to loop from n = (s+1)/2, rounded down, to n = s-1.
Put the n x n square in a corner.
Let m = s - n.
Place m x m squares in the adjacent corners and keep placing them until they (almost) reach the end of the n x n square.
The remaining space will be m x m (if you are lucky), or up to 2m-1 x 2m-1 with a corner piece missing.
Fill the remaining space with a similar algorithm. Start with placing a n2 x n2 square in the corner opposite to the missing corner piece.
Working by hand I have obtained the following results:
s minimum number of squares:
2 4
3 6
5 8
7 9
11 10
13 11
17 12
First check if n is even. If n is even, then the answer is four, since there is no way to fit 2 or 3 squares together to make another square. That solves half of all possible cases.
BEFORE YOU PROCEED: This approach is incomplete and this may be the WRONG approach
I just intend to throw out an out-of-the-box idea just because I feel like this may help and, hopefully, advance the problem. I feel like it may have some correlation with Goldbach's weak conjecture. The algorithm may be too long to compute for larger values, and I'm not sure how much optimization is happening.
Now my idea would be to try to enumerate all triples (n1,n2,n3) where n1 + n2 + n3 = n AND n1, n2, n3 are all prime (which are >= 2) AND n >= 7 AND n1 <= n2 <= n3
Now let me literally depict my algorithm:
Now my idea is find all possible triples (n1,n2,n3) so it fits the definition stated above. Next set n_s = n1 + n2. IF n_s > n3 follow the depiction above else flip n_s and n3
Now the problem is the white rectangles left over (that should be congruent to each other).
Let n4 x n3 denote the rectangles where:
n4 = n - 2 * n3 // if following the depicted example
Enumerate all possible triples (n41, n42, n43) (treating n as n = n4, so n3 >= 7) and (n31, n32, n33) (treating n as n = n3, so n3 >= 7). Next find the value where n_s3 == n_s4 and both are the greatest they could be. For example:
Let's suppose x3 = 17 and x4 = 13
Enumeration of x3 = 17:
2 + 2 + 13
3 + 3 + 11
5 + 5 + 7
Enumeration of x_s3:
4 = 2 + 2
6 = 3 + 3
10 = 5 + 5
12 = 5 + 7
14 = 3 + 11
15 = 2 + 13
Enumeration of x4 = 13:
2 + 2 + 7
3 + 5 + 5
Enumeration of x_s4:
4 = 2 + 2
8 = 3 + 5
9 = 2 + 7
10 = 5 + 5
Since 10 is the largest value shared between 13 and 17, you fit a 10 by 10 square in both rectangles. What remains is a non-parallelogram region, which gets progressively more difficult to fill, but this may be (I feel) a step in the right direction.
All feedback appreciated.

Determining the direction of a turn?

I want to turn an object clockwise or counter-clockwise. A set of integers (0 to 7) represents the directions the object can face (e.g. left, left-up, up, up-right, right, ...). Adding 1 to the current direction of the object turns it clockwise; subtracting 1 turns it counter-clockwise.
If I want the object to turn to a certain direction (= integer), how do I determine the minimum amount of turns necessary?
Currently I'm using this way of thinking :
int minimumRequiredTurns = min(abs(currentDirection.intvalue - goalDirection.intvalue),
8 - abs(currentDirection.intvalue - goalDirection.intvalue));
Is it possible to do it without a min statement?
I think
(1-(abs(abs(currentDirection.intvalue - goalDirection.intvalue)/(n/2)-1)))*(n/2)
should do the trick, where n is the number of possible directions.
In order to have integer only calculations transform this to
(n/2)-abs(abs(currentDirection.intvalue - goalDirection.intvalue)-(n/2))
Explanation: Using the hat function to generate the map:
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 3
6 -> 2
7 -> 1
If you really don't like the "min", you could use a lookup table.
int minRequiredTurns[8][8] = {
0, 1, 2, 3, 4, 3, 2, 1,
1, 0, 1, 2, 3, 4, 3, 2,
2, 1, 0, 1, 2, 3, 4, 3,
/* and so on... */
};
Almost certainly, a much better design would be to use vectors to represent directions; treat the "direction" as a pair of numbers (x,y) so that x represents the horizontal direction, y represents the vertical.
So (1,0) would represent facing right; (0,1) would represent facing up; (-1, 0) would be facing left; (1,1) would be facing up-right; etc.
Then you can just use the normal vector-based solution to your problem: Take the direction you're facing, and the direction you want to face, and take the cross-product of the two.
result = x1y2 - x2y1
If the result is positive, rotate counter-clockwise; if the result is negative, rotate clockwise (this works because of the right-hand rule that defines cross-products).
Note that this approach generalizes trivially to allow arbitrary directions, not just horizontal/vertical/diagonal.
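A minimal Python sketch of the cross-product test, assuming directions are given as (x, y) pairs with y pointing up as in the examples above. The function name and the string return values are hypothetical choices for illustration.

```python
def rotation_sense(facing, target):
    """Decide which way to rotate from `facing` to `target`,
    using the sign of the 2D cross product x1*y2 - x2*y1."""
    x1, y1 = facing
    x2, y2 = target
    cross = x1 * y2 - x2 * y1
    if cross > 0:
        return "counter-clockwise"
    if cross < 0:
        return "clockwise"
    # cross == 0: the vectors are parallel, so the object is either
    # already aligned or facing exactly the opposite way.
    return "none"


print(rotation_sense((1, 0), (0, 1)))   # facing right, want up
print(rotation_sense((0, 1), (1, 1)))   # facing up, want up-right
```

Note that the sign alone gives the direction of the shorter rotation, not the number of steps, and the exactly-opposite case needs separate handling.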
First, force a positive difference, then fold it into the range 0 to N/2 (0 to 4):
N=8
diff = (new-old+N)%N;
turns = diff > N/2 ? N - diff : diff
int N = 8, turns = abs(current-goal);
if (turns > N/2) turns = N-turns;
But I don't understand why you don't want the min-statement...
No min, no abs, one expression, no division:
turns = ((((goalDirection + 8 - currentDirection) % 8) + 4) % 8) - 4
How it works: the innermost expression (goalDirection + 8 - currentDirection) is the same as given by AShelley; number of required turns in the clockwise direction. The outermost expression shifts this to its equivalent in [-4..+3]
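The formulas in this thread are easy to cross-check exhaustively, since there are only 64 (current, goal) pairs. The sketch below is an illustration in Python; the function names are assumptions, and the expressions are transcribed from the min-based version in the question, the integer-only hat function, and the signed modulo expression above.

```python
N = 8  # number of possible directions


def turns_min(cur, goal):
    """Baseline from the question: min of the two ways around."""
    d = abs(cur - goal)
    return min(d, N - d)


def turns_hat(cur, goal):
    """Integer-only hat function: (n/2) - abs(abs(cur - goal) - (n/2))."""
    return N // 2 - abs(abs(cur - goal) - N // 2)


def turns_signed(cur, goal):
    """Signed result in [-4, 3]: positive means clockwise (+1 steps),
    negative means counter-clockwise."""
    return ((goal + N - cur) % N + N // 2) % N - N // 2


for cur in range(N):
    for goal in range(N):
        expected = turns_min(cur, goal)
        assert turns_hat(cur, goal) == expected
        assert abs(turns_signed(cur, goal)) == expected
        # Applying the signed number of +1/-1 steps lands on the goal.
        assert (cur + turns_signed(cur, goal)) % N == goal

print(turns_signed(1, 6))  # -3: three steps counter-clockwise
```

The signed variant has the practical advantage of also telling you which way to turn.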
