Construct a decision-tree classifier with binary splits at each node? - database

Construct a decision-tree classifier with binary splits at each node, using tuples in relation r (A, B, C) shown below as training data; attribute C denotes the class.
Show the final tree, and with each node show the best split for each attribute along with its information gain value.
Training Data:
(1, 2, a), (2, 1, a), (2, 5, b), (3, 3, b), (3, 6, b), (4, 5, b), (5, 5, c), (6, 3, b), (6, 7, c)
How should I proceed?
Any link would be helpful.

Have you decided which algorithm (e.g. ID3) you want to use to build your decision tree? To predict the class, you need to train the decision tree on observations about the data (i.e. features). This link explains decision tree learning.
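To make the information-gain step concrete, here is a minimal sketch in plain Python. It assumes entropy-based gain (as in ID3) and binary splits of the form "attribute <= t", with candidate thresholds taken at midpoints between consecutive distinct values. That threshold convention and the names entropy, best_split and build are assumptions for illustration, not something the question prescribes.

```python
from collections import Counter
from math import log2

# Training tuples (A, B, C) from the question; C is the class label.
DATA = [(1, 2, "a"), (2, 1, "a"), (2, 5, "b"), (3, 3, "b"), (3, 6, "b"),
        (4, 5, "b"), (5, 5, "c"), (6, 3, "b"), (6, 7, "c")]
ATTRS = ("A", "B")


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in rows."""
    counts = Counter(row[-1] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def best_split(rows, attr):
    """Best binary split row[attr] <= t for one attribute, as (gain, t).

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumed convention). Returns None if the attribute has one value only.
    """
    values = sorted({row[attr] for row in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r[attr] <= t]
        right = [r for r in rows if r[attr] > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
        gain = entropy(rows) - remainder
        if best is None or gain > best[0]:
            best = (gain, t)
    return best


def build(rows):
    """Return a class label (leaf) or a dict describing an internal node."""
    labels = Counter(r[-1] for r in rows)
    splits = {}
    for i, name in enumerate(ATTRS):
        found = best_split(rows, i)
        if found is not None:
            splits[name] = found  # best (gain, threshold) for this attribute
    if len(labels) == 1 or not splits:
        return labels.most_common(1)[0][0]  # pure node, or nothing left to split on
    name = max(splits, key=lambda k: splits[k][0])  # attribute with the highest gain
    gain, t = splits[name]
    i = ATTRS.index(name)
    return {"split": f"{name} <= {t}", "candidates": splits,
            "left": build([r for r in rows if r[i] <= t]),
            "right": build([r for r in rows if r[i] > t])}


tree = build(DATA)
print(tree["split"], tree["candidates"])
```

Under these assumptions the root split comes out as B <= 2.5 with a gain of about 0.764 bits. The "candidates" entry of each node holds the best split and gain for every attribute, which is what the exercise asks to show.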

Related

How to remove loops from three nodes? [networkx]

I have a graph with hundreds of edges and I want to remove loops like this:
(1, 2)
(1, 3)
(2, 3)
I have tried:
G.remove_edges_from(nx.selfloop_edges(G))
But it does not seem to work. Any advice?
Self-loops are edges from a node to itself. For example, (1, 1) or (2, 2) are self-loops. The example you give is a simple cycle, i.e., a closed path where no node appears twice. You can use simple_cycles or find_cycle. For example, you could iteratively use find_cycle:
import networkx as nx
G = nx.karate_club_graph()
print(nx.find_cycle(G, orientation="ignore"))
# [(0, 1, 'forward'), (1, 2, 'forward'), (2, 0, 'forward')]

minimum operations to make array left part equal to right part

Given an even-length array [a1, a2, ..., an], a beautiful array is one where a[i] == a[i + n / 2] for 0 <= i < n / 2. Define an operation as changing all array elements equal to value x to value y. What is the minimum number of operations required to make a given array beautiful? All elements are in the range [1, 100000]. Simply counting the unmatched pairs (ignoring order) between the left and right halves gives wrong results in some cases. For example, in [1, 1, 2, 5, 2, 5, 5, 2] the unmatched pairs are (1, 2), (1, 5), (2, 5), but after changing 2 -> 5, the pairs (1, 2) and (1, 5) become the same. What is the correct method to solve this problem?
It is a graph question.
For every pair (a[i], a[i+n/2]) where a[i] != a[i+n/2], add an undirected edge between the two values, treating each distinct value as a node.
Note that you shouldn't add multiple edges between 2 numbers.
Now you essentially need to remove all the edges in the graph by performing some operations. The final answer is the number of operations.
In each operation, you remove an edge. After removing an edge between two vertices, combine the vertices and rearrange their edges.
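One way to read the "combine the vertices" step: every operation merges two values into one, so a connected component with k distinct values needs k - 1 operations, and the answer is the sum of that over all components. The sketch below counts those merges with a small union-find. The function name min_operations is hypothetical, and this is an interpretation of the answer above rather than code from it.

```python
def min_operations(arr):
    """Minimum number of 'change every x to y' operations to make arr beautiful."""
    half = len(arr) // 2
    parent = {}

    def find(x):
        # Union-find lookup with path halving; unseen values start as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ops = 0
    for i in range(half):
        root_left, root_right = find(arr[i]), find(arr[i + half])
        if root_left != root_right:
            parent[root_left] = root_right  # merge the two values' components
            ops += 1                        # one merge corresponds to one operation
    return ops


print(min_operations([1, 1, 2, 5, 2, 5, 5, 2]))  # 2
```

For the array from the question the values 1, 2 and 5 form a single component, so two operations suffice (for example 2 -> 5, then 1 -> 5).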

Find total number of ways possible to create an array of size M

Suppose I have M = 2 and N = 5 and K = 2 where
M = size of array
N = Maximum number that can be present as an array element
K = Minimum number that can be present as an array element.
So how do I find the number of possible ways to create an array using the above conditions? Also, the current element should not be greater than the previous element.
The arrays created using the above conditions are
[5,5],[5,4],[5,3],[5,2],[4,4],[4,3],[4,2],[3,3],[3,2],[2,2]
i.e. 10 arrays can be created from the above conditions.
I tried doing it by using combinations and factorials, but not getting the desired output. Any help would be appreciated.
Assuming you are just interested in the number of combinations, the formula is:
(N-K+M)!/(M!(N-K)!)
For M = 2, N = 5, K = 2 this gives 5!/(2! 3!) = 10.
See more here
This is known as a combinations_with_replacement: combination because the order doesn't matter (or it would be a permutation), and with replacement because elements can be repeated, like [5, 5].
import itertools
list(itertools.combinations_with_replacement(range(2, 6), 2))
# [(2, 2), (2, 3), (2, 4), (2, 5), (3, 3), (3, 4), (3, 5), (4, 4), (4, 5), (5, 5)]
If you want the exact ones you listed, you will have to reverse each element, and the list itself.
list(reversed([tuple(reversed(element)) for element in itertools.combinations_with_replacement(range(2,6), 2)]))
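As a quick cross-check, the number of size-M multisets drawn from the N - K + 1 allowed values is the binomial coefficient C(N - K + M, M), which can be compared against the enumeration. A minimal sketch, assuming Python 3.8+ for math.comb; count_arrays is a hypothetical helper name.

```python
from itertools import combinations_with_replacement
from math import comb


def count_arrays(M, N, K):
    """Number of non-increasing arrays of length M with elements in [K, N]."""
    return comb(N - K + M, M)


M, N, K = 2, 5, 2
enumerated = list(combinations_with_replacement(range(K, N + 1), M))
print(count_arrays(M, N, K), len(enumerated))  # 10 10
```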

Grouping lines considering intersections of each line using python

There are 5 lines. I want to group them considering whether they intersect or not by limiting to the two end points of each line.
I want general logic that works for any set of lines, not just the given scenario.
Array of 5 lines (coordinates of end points).
lines_all = [[(1, 10), (5, 10)],[(3, 5), (5, 5)],[(3, 10), (3, 13)],[(5,10),(5,13)],[(3,13),(4,13)]]
Then finally I want to get the following array list.
result = [[[(1, 10), (5, 10)], [(3, 10), (3, 13)],[(3, 13), (4, 13)]], [[(1, 10), (5, 10)], [(5, 10), (5, 13)]],[(3, 5), (5, 5)]]
To find all line segment intersections, you can use the Bentley-Ottmann algorithm.
An arbitrarily chosen Python implementation
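For a handful of segments, a brute-force pairwise check is often enough. The sketch below is not Bentley-Ottmann: it swaps in the simpler O(n^2) orientation test and only reports which pairs of segments touch or cross, leaving the final grouping to the caller. The helper names orient, on_segment, intersects and intersecting_pairs are hypothetical.

```python
from itertools import combinations

lines_all = [[(1, 10), (5, 10)], [(3, 5), (5, 5)], [(3, 10), (3, 13)],
             [(5, 10), (5, 13)], [(3, 13), (4, 13)]]


def orient(p, q, r):
    """Cross product of (q - p) and (r - p): >0 left turn, <0 right turn, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def on_segment(p, q, r):
    """For r collinear with p and q: does r lie within the bounding box of segment pq?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def intersects(s, t):
    """True if segments s and t share at least one point (touching endpoints count)."""
    (p1, q1), (p2, q2) = s, t
    d1, d2 = orient(p2, q2, p1), orient(p2, q2, q1)
    d3, d4 = orient(p1, q1, p2), orient(p1, q1, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing: each segment straddles the other
    return ((d1 == 0 and on_segment(p2, q2, p1)) or (d2 == 0 and on_segment(p2, q2, q1))
            or (d3 == 0 and on_segment(p1, q1, p2)) or (d4 == 0 and on_segment(p1, q1, q2)))


def intersecting_pairs(lines):
    """Index pairs (i, j), i < j, of segments that intersect."""
    return [(i, j) for (i, s), (j, t) in combinations(enumerate(lines), 2)
            if intersects(s, t)]


print(intersecting_pairs(lines_all))  # [(0, 2), (0, 3), (2, 4)]
```

These pairs can be treated as edges of a graph over the segments, and the groups in the question can then be derived from that graph.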

Finding the set of all winning tic tac toe board states

Here's my problem. I want to create an algorithm which generates an array of arrays of every possible winning board state for an n-dimensional tic-tac-toe board. Say you have an n = 2 board, meaning 2x2, then the function should return the following array:
wins = [
[1,2],
[1,3],
[1,4],
[2,4]
]
I know this isn't specifically a MATLAB problem, however I'm trying to expand my understanding of how MATLAB works. My general idea is an algorithm that does the following:
generate an n-dimensional board of zeros
1. Go to the first cell, record that index ([1,])
2. Go to the end of the row, and that's your first board state ([1,2])
3. Go to the end of the column, that's your second board state ([1,3])
4. Go to the end of the diagonal, that's your third board state ([1,4])
5. Advance to the next cell, repeat, checking if you have already created that board state first ([2,4] should be the only one it hasn't done)
I think I'm overthinking the problem, but I'm not sure how to approach it. Can someone give me some guidance on how to do this in a MATLAB-y way? My guess is that traversing the matrix and just picking whole rows/columns/diagonals is easy; it's the 'checking if it exists' part that I'm not getting. What would you call this algorithm, in general? Thanks for any help!
Better idea: you don't do this square by square, you do this by dimension. For each dimension on the board, you have these possibilities for the coordinate to vary or not through winning combinations:
iterate through all the possible values, low to high
iterate through all the possible values, high to low
hold constant as the other dimensions iterate, but do so for each value in range, repeating for the other coordinates.
For instance, for a 3^3 board, let's look at the last coordinate, x3 (call the coordinates x1, x2, x3). Assume that you've already determined that x1 will iterate low to high and x2 is constant at 2. You will now treat x3 with:
iterate through all the possible values, low to high
(1, 2, 1), (2, 2, 2), (3, 2, 3)
iterate through all the possible values, high to low
(1, 2, 3), (2, 2, 2), (3, 2, 1)
hold constant as the other dimensions iterate, but do so for each value in range, repeating for the other coordinates.
(1, 2, 1), (2, 2, 1), (3, 2, 1)
(1, 2, 2), (2, 2, 2), (3, 2, 2)
(1, 2, 3), (2, 2, 3), (3, 2, 3)
Does that get you moving?
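The per-dimension idea can be sketched as follows, in Python rather than MATLAB. Each coordinate either runs low to high, runs high to low, or stays at one constant value; combinations where every coordinate is constant are skipped because they describe a single cell, and a line and its reverse are stored once. The function name winning_lines and the 1-based coordinate tuples are assumptions for illustration.

```python
from itertools import product


def winning_lines(k, d):
    """All winning lines on a d-dimensional board with k cells per side."""
    up = tuple(range(1, k + 1))
    down = up[::-1]
    # Per dimension: iterate low to high, iterate high to low, or hold a constant.
    choices = [up, down] + [(c,) * k for c in up]
    lines = set()
    for combo in product(choices, repeat=d):
        if all(len(set(axis)) == 1 for axis in combo):
            continue  # no coordinate varies, so this is one cell, not a line
        cells = tuple(zip(*combo))           # k cells, each a d-tuple of coordinates
        lines.add(min(cells, cells[::-1]))   # a line equals its reverse; keep one copy
    return sorted(lines)


print(len(winning_lines(3, 2)))  # 8
for line in winning_lines(2, 2):
    print(line)
```

Storing the lines in a set handles the "checking if it exists" part: duplicates are dropped automatically instead of being searched for by hand.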
