Positive and Negative labeled data points Support Vector Machine - artificial-intelligence

I was learning about SVM on YouTube (https://www.youtube.com/watch?v=ivPoCcYfFAw) and there is this example. Suppose we are given the following positively labelled data points,
{(3 1), (3 -1), (6 1), (6 -1)}
and the following negatively labelled data points,
{(1 0), (0 1), (0 -1), (-1 0)}
My question is: how do you differentiate positively and negatively labelled data points? Is it because there is a 0?
There is also this problem in my book, which says:
Use values +1 and -1 (instead of 1 and 0) for the target output, so that the four input points and the corresponding target outputs look like ([0, 0], -1); ([1, 0], -1); ([0, 1], -1); and ([1, 1], 1).
How can I determine which data points are positive and which are negative?
Thank you.

Related

Next permutation/ranking with specific strength

I am searching for an algorithm that gives me the next permutation with a specific strength.
A permutation of length n is defined with the elements (1,2,3,...n)
What is the strength of a permutation?
The strength of a permutation with length 10 is defined as |a1-a2|+|a2-a3|+...+|a9-a10|+|a10-a1|.
For example:
(1,2,3,4,5,6) has the strength 10
(1,2,6,3,4,5) has the strength 14
Is there a formula to compute the next permutation of a given strength and length, or is it necessary to compute all elements?
Is ranking/unranking of the subsets possible?
The next-permutation function should return the next lexicographical permutation within the subset defined by the given strength and length, without computing the intermediate permutations of different strengths.
This is a nicely masked problem in combinatorics. First, note that this is a ring of integers; the linear "array" is an implementation choice, rather than part of the strength analysis. Let's look at the second case, given as (1,2,6,3,4,5):
  1
5   2
4   6
  3
Every element appears in exactly two terms. Thus, we have a simple linear combination of the elements, with coefficients of -2, 0, or 2. If the element is larger than both neighbors (e.g. 5), the coefficient is 2; if smaller than both neighbors (e.g. 1), it's -2; if between, the two abs operations cancel, and it's 0 (e.g. 4).
Lemma: the strength must be an even number.
Thus, the summation and some transformations can be examined easily enough with simple analysis. The largest number always has a coefficient of +2; the smallest always has a coefficient of -2.
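To make the coefficient argument concrete, here is a minimal Python sketch. The function names strength and coefficients are my own, not from the question, and it assumes distinct elements and a ring of at least three entries.

```python
def strength(p):
    """Sum of absolute differences between cyclic neighbours."""
    n = len(p)
    return sum(abs(p[i] - p[(i + 1) % n]) for i in range(n))


def coefficients(p):
    """Coefficient (-2, 0 or 2) of each element in the strength sum.

    Assumes distinct elements and len(p) >= 3.
    """
    n = len(p)
    result = []
    for i, x in enumerate(p):
        left, right = p[i - 1], p[(i + 1) % n]
        if x > left and x > right:
            result.append(2)    # local maximum: counted positively twice
        elif x < left and x < right:
            result.append(-2)   # local minimum: counted negatively twice
        else:
            result.append(0)    # in between: the two terms cancel
    return result


ring = (1, 2, 6, 3, 4, 5)
coeffs = coefficients(ring)
weighted = sum(c * x for c, x in zip(coeffs, ring))
print(coeffs, weighted, strength(ring))
```

Since every coefficient is even, the weighted sum is even, which is the lemma above.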
You can find "close relative" permutations by finding interchangeable elements. For instance, you can always interchange the largest two elements (6 and 5) and/or the smallest two elements (1 and 2), without affecting the strength. For instance, 6 and 5 can be interchanged because they're strictly larger than their neighbors:
(6-2) + (6-3) + (5-1) + (5-4) =
(5-2) + (5-3) + (6-1) + (6-4) =
2*6 + 2*5 - 2 - 3 - 1 - 4
1 and 2 can be interchanged, even though they're adjacent, for a similar reason ... except that there are only three terms, one of which involves the pair:
(5-1) + (2-1) + (6-2) =
(5-2) + (2-1) + (6-1) =
5 + 6 - 2*1
Depending on the distribution of the set of numbers, there will likely be more direct ways to construct a ring with a given strength. Since we do not yet have an ordering defined on the permutations, we have no way to determine a "next" one. However, one simple observation is that rotations and reflections of a given permutation will all have the same strength:
(1,2,6,3,4,5)
(2,6,3,4,5,1)
(6,3,4,5,1,2)
...
(5,4,3,6,2,1)
(4,3,6,2,1,5)
...
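A quick way to check the rotation/reflection claim is to enumerate them. This is only a sketch; rotations_and_reflections is a name I made up for illustration.

```python
def strength(p):
    """Sum of absolute differences between cyclic neighbours."""
    n = len(p)
    return sum(abs(p[i] - p[(i + 1) % n]) for i in range(n))


def rotations_and_reflections(p):
    """All rotations of p and of its reversal, as a set of tuples."""
    p = tuple(p)
    n = len(p)
    variants = set()
    for q in (p, p[::-1]):
        for k in range(n):
            variants.add(q[k:] + q[:k])
    return variants


variants = rotations_and_reflections((1, 2, 6, 3, 4, 5))
strengths = {strength(v) for v in variants}
print(len(variants), strengths)
```

For a ring of six distinct elements this yields 12 arrangements, all with the same strength.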
Does that get you moving?
Addition w.r.t. OP updates:
There are several trivially strength-invariant swaps available. I've already mentioned the two extreme pairs (6-5) and (1-2). You can also swap adjacent, consecutive numbers: that adds (4-5) and (3-4) in the above example. From simple algebraic properties, you can often identify a 2-element swap or 3-element rotation (respecting an increase in lexicographic position) that generates the next desired permutation. For instance:
(5, 6, 1, 3, 4, 2)
(5, 6, 1, 4, 2, 3) rotate 3, 4, 2
(5, 6, 1, 4, 3, 2) swap 2, 3
However, there are irruptions in the sequence that you'd be hard-pressed to find in this fashion. For instance, making the leap to change the first or second element is not so clean:
(5, 6, 3, 1, 4, 2)
(5, 6, 3, 2, 4, 1) swap 1, 2 -- easy
(6, 1, 2, 4, 5, 3) wholesale rearrangement -- hard to see that this is the next strength=14
I feel that finding these would require a set of algebraic rules that would find the simple moves and eliminate invalid moves (such as generating 563421 before the "wholesale rearrangement" just above). However, following these rules would often take more time than working through all permutations.
I'd love to find that I'm wrong on this last point. :-)
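For reference, the "working through all permutations" baseline can be sketched as below. It does visit the intermediate permutations, so it is not the shortcut the question asks for, only something to test a smarter rule against. The helper names next_permutation and next_same_strength are my own.

```python
def strength(p):
    """Sum of absolute differences between cyclic neighbours."""
    n = len(p)
    return sum(abs(p[i] - p[(i + 1) % n]) for i in range(n))


def next_permutation(p):
    """Next permutation in lexicographic order, or None if p is the last."""
    a = list(p)
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        return None
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    a[i + 1:] = reversed(a[i + 1:])
    return tuple(a)


def next_same_strength(p):
    """Brute force: step forward until the strength matches again."""
    target = strength(p)
    q = next_permutation(p)
    while q is not None and strength(q) != target:
        q = next_permutation(q)
    return q


print(next_same_strength((5, 6, 1, 3, 4, 2)))
print(next_same_strength((5, 6, 3, 2, 4, 1)))
```

The second call reproduces the "wholesale rearrangement" jump from the example above.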

Generating all unique orders of looping series of characters (all circular permutations)

I have a string that is made out of Xs and Ys.
For the sake of the question let's say this string is constructed of two Xs and four Ys:
XXYYYY
How can I generate all of the possible unique strings that are made out of two Xs and four Ys, given that a string is not considered unique if looping (rotating/shifting) its characters around produces a string that was already found?
For instance:
XXYYYY is considered similar to YXXYYY and YYYXXY (cardinal numbers added to clarify)
123456                          612345     456123
Notice that the order of the characters stays unchanged; the only thing that changes is the starting character (the original string starts with 1, the 2nd string with 6, and the 3rd with 4, but they all keep the order).
In the case of Two-Xs and Four-Ys (our example) all the possible permutations that are unique are:
XXYYYY
XYXYYY
XYYXYY
And every other order would be a shifted version of one of those 3.
How would you generate all of the possible permutations of a string with N Xs and M Ys?
Essentially you need to generate combinatorial objects called binary necklaces with a fixed number of ones.
This is Python code adapted from Sawada's article "A fast algorithm to generate necklaces with fixed contents".
(I used the simplest variant; there are also more optimized ones.)
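n = 6                # total length
d = 3                # number of ones
aa = [0] * n         # necklace under construction
bb = [n - d, d]      # remaining counts: n-d zeros, d ones

def SimpleFix(t, p):
    if t > n:
        # complete string: it is a necklace only if its period p divides n
        if n % p == 0:
            print(aa)
    else:
        for j in range(aa[t - p - 1], 2):
            if bb[j] > 0:
                aa[t - 1] = j
                bb[j] -= 1
                if j == aa[t - p - 1]:
                    SimpleFix(t + 1, p)
                else:
                    SimpleFix(t + 1, t)
                bb[j] += 1

SimpleFix(1, 1)

# example output for 3+3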
[0, 0, 0, 1, 1, 1]
[0, 0, 1, 0, 1, 1]
[0, 0, 1, 1, 0, 1]
[0, 1, 0, 1, 0, 1]
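As a cross-check for small inputs, here is a brute-force sketch that keeps one canonical rotation (the lexicographically smallest) per string. It is not Sawada's algorithm and it enumerates every placement of the Xs, so it only makes sense for short strings. The function name necklaces is my own.

```python
from itertools import combinations


def necklaces(num_x, num_y):
    """Unique strings of num_x Xs and num_y Ys, up to rotation (brute force)."""
    n = num_x + num_y
    seen = set()
    for x_positions in combinations(range(n), num_x):
        s = "".join("X" if i in x_positions else "Y" for i in range(n))
        # the smallest rotation represents the whole rotation class
        canonical = min(s[k:] + s[:k] for k in range(n))
        seen.add(canonical)
    return sorted(seen)


print(necklaces(2, 4))
print(len(necklaces(3, 3)))
```

For two Xs and four Ys this gives the three strings listed in the question.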

Algorithm to fill an array randomly without collision

Say I have an array of N integers set to the value '0', and I want to pick a random element of that array that has the value '0' and set it to '1'.
How do I do this efficiently?
I came up with 2 solutions, but they look quite inefficient.
First solution
int array[N] //init to 0s
int n //number of 1s we want to add to the array
int i = 0
while i < n
    int a = random(0, N)
    if array[a] == 0
        array[a] = 1
        i++
    end if
end while
It would be extremely inefficient for large arrays because of the probability of collision.
The second involves a list containing all the remaining 0 positions: we choose a random number between 0 and the number of 0s remaining, and look up the value in the list that corresponds to the index in the array.
It's a lot more reliable than the first solution, since the number of operations is bounded, but it still has a worst-case complexity of O(N²) if we want to fill the array completely.
Your second solution is actually a good start. I assume that it involves rebuilding the list of positions after every change, which makes it O(N²) if you want to fill the whole array. However, you don't need to rebuild the list every time. Since you want to fill the array anyway, you can just use a random order and choose the remaining positions accordingly.
As an example, take the following array (size 7 and not initially full of zeroes) : [0, 0, 1, 0, 1, 1, 0]
Once you have built the list of zeros positions, here [0, 1, 3, 6], just shuffle it to get a random ordering. Then fill in the array in the order given by the positions.
For example, if the shuffle gives [3, 1, 6, 0], then you can fill the array like so :
[0, 0, 1, 0, 1, 1, 0] <- initial configuration
[0, 0, 1, 1, 1, 1, 0] <- First, position 3
[0, 1, 1, 1, 1, 1, 0] <- Second, position 1
[0, 1, 1, 1, 1, 1, 1] <- etc.
[1, 1, 1, 1, 1, 1, 1]
If the array is initially filled with zeros, then it's even easier. Your initial list is the list of integers from 0 to N-1 (where N is the size of the array). Shuffle it and apply the same process.
If you do not want to fill the whole array, you still need to build the whole list, but you can truncate it after shuffling it (which just means to stop filling the array after some point).
Of course, this solution requires that the array does not change between each step.
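A minimal Python sketch of this approach, assuming the array is a plain list and using the standard random module. The function name fill_random_zeros and its parameters are my own.

```python
import random


def fill_random_zeros(array, count, rng=random):
    """Set `count` randomly chosen zero entries of `array` to 1, in place."""
    zero_positions = [i for i, value in enumerate(array) if value == 0]
    if count > len(zero_positions):
        raise ValueError("not enough zeros left in the array")
    rng.shuffle(zero_positions)               # random order, built once
    for position in zero_positions[:count]:   # truncate to fill only `count`
        array[position] = 1
    return array


data = [0, 0, 1, 0, 1, 1, 0]
fill_random_zeros(data, 2)
print(data)
```

Building and shuffling the list is linear in the array size, and each fill is then constant time.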
You can fill the array with n ones and N-n zeros and then shuffle it randomly.
The Fisher-Yates shuffle has linear complexity:
for i from N−1 downto 1 do
    j ← random integer such that 0 ≤ j ≤ i
    exchange a[j] and a[i]
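The pseudocode above translates to Python roughly as follows. This is a sketch with names of my own choosing; in practice random.shuffle does the same job.

```python
import random


def fisher_yates_shuffle(a, rng=random):
    """Shuffle list `a` in place in O(len(a)) time."""
    for i in range(len(a) - 1, 0, -1):
        j = rng.randint(0, i)      # randint is inclusive: 0 <= j <= i
        a[i], a[j] = a[j], a[i]


def random_binary_array(size, ones, rng=random):
    """List of length `size` with exactly `ones` entries set to 1."""
    a = [1] * ones + [0] * (size - ones)
    fisher_yates_shuffle(a, rng)
    return a


result = random_binary_array(10, 3)
print(result)
```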

Numpy: finding nonzero values along arbitrary dimension

It seems I just cannot solve this in Numpy: I have a matrix, with an arbitrary number of dimensions, ordered in an arbitrary way. Inside this matrix, there is always one dimension I am interested in (as I said, the position of this dimension is not always the same). Now, I want to find the first nonzero value along this dimension. In fact, I need the index of that value to perform some operations on the value itself.
An example: if my matrix a is n x m x p and the dimension I am interested in is number 1, I would do something like:
for ii in range(a.shape[0]):
    for kk in range(a.shape[2]):
        myview = np.squeeze(a[ii, :, kk])
        firsti = np.nonzero(myview)[0][0]
        myview[firsti] = dostuff
Apart from performance considerations, I really do not know how to do this having different number of dimensions, and having the dimension I am interested in an arbitrary position.
You can abuse np.argmax for your purpose. It lets you specify the axis you are interested in, where 0 is along columns, 1 is along rows, and so on. You just need an array that contains the same value for all non-zero elements. You can get that with a != 0, which contains False (meaning 0) for all zero elements and True (meaning 1) for all non-zero elements. Then np.argmax(a != 0, axis=1) gives you the index of the first non-zero element along axis 1.
For example:
import numpy as np
a = np.array([[0, 1, 4], [1, 0, 2], [0, 0, 1]])
# a = [[0, 1, 4],
#      [1, 0, 2],
#      [0, 0, 1]]
print(np.argmax(a != 0, axis=0))
# [1 0 0] -> along columns
print(np.argmax(a != 0, axis=1))
# [1 0 2] -> along rows
This also works for higher-dimensional arrays, but the output is less instructive, as it is harder to visualize.
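One caveat: np.argmax returns 0 for a line that contains only zeros, so that case needs separate handling. Below is a sketch for an arbitrary axis that marks such lines with -1 and then modifies the first non-zero values in place with np.take_along_axis and np.put_along_axis. The function names and the "multiply by a factor" operation are placeholders of mine for whatever you actually need to do.

```python
import numpy as np


def first_nonzero_index(a, axis):
    """Index of the first non-zero entry along `axis`, or -1 if there is none."""
    mask = a != 0
    idx = np.argmax(mask, axis=axis)
    return np.where(mask.any(axis=axis), idx, -1)


def scale_first_nonzero(a, axis, factor):
    """Multiply the first non-zero entry along `axis` by `factor`, in place."""
    idx = first_nonzero_index(a, axis)
    # All-zero lines point at position 0; scaling a zero leaves it unchanged.
    safe = np.expand_dims(np.where(idx < 0, 0, idx), axis)
    values = np.take_along_axis(a, safe, axis=axis)
    np.put_along_axis(a, safe, values * factor, axis=axis)


a = np.zeros((2, 3, 4), dtype=int)
a[0, 1, 2] = 5
a[0, 2, 2] = 7
a[1, 0, 0] = 3
idx = first_nonzero_index(a, axis=1)
scale_first_nonzero(a, axis=1, factor=10)
print(idx)
```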

tensorflow creating mask of varied lengths

I have a tensor of lengths in tensorflow, let's say it looks like this:
[4, 3, 5, 2]
I wish to create a mask of 1s and 0s whose number of 1s correspond to the entries to this tensor, padded by 0s to a total length of 8. I.e. I want to create this tensor:
[[1,1,1,1,0,0,0,0],
[1,1,1,0,0,0,0,0],
[1,1,1,1,1,0,0,0],
[1,1,0,0,0,0,0,0]
]
How might I do this?
This can now be achieved with tf.sequence_mask.
This can be achieved using a variety of TensorFlow transformations:
import tensorflow as tf

lengths = [4, 3, 5, 2]
# Make a 4 x 1 column of lengths; broadcasting repeats each length across the 8 columns.
lengths_transposed = tf.expand_dims(lengths, 1)
# Make a 1 x 8 row containing [0, 1, ..., 7]; broadcasting repeats it for each of the 4 rows.
index_range = tf.range(0, 8, 1)
range_row = tf.expand_dims(index_range, 0)
# Use the logical operations to create a 4 x 8 boolean mask.
mask = tf.less(range_row, lengths_transposed)
# Select between 1 or 0 for each value (tf.select was renamed to tf.where in TensorFlow 1.0).
result = tf.where(mask, tf.ones([4, 8]), tf.zeros([4, 8]))
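The broadcasting trick is easier to see outside of TensorFlow. Here is the same comparison as a NumPy sketch (NumPy swapped in purely for illustration; the function name sequence_mask_np is my own):

```python
import numpy as np


def sequence_mask_np(lengths, max_len):
    """Rows of 1s followed by 0s; row i has lengths[i] ones."""
    lengths = np.asarray(lengths)
    positions = np.arange(max_len)
    # (1, max_len) compared against (len(lengths), 1) broadcasts to a full grid
    return (positions[None, :] < lengths[:, None]).astype(np.int32)


mask = sequence_mask_np([4, 3, 5, 2], 8)
print(mask)
```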
I've got a slightly shorter version than the previous answer. I'm not sure whether it is more efficient or not.
def mask(self, seq_length, max_seq_length):
    return tf.map_fn(
        lambda x: tf.pad(tf.ones([x], dtype=tf.int32), [[0, max_seq_length - x]]),
        seq_length)
