How to display decision tree plot using export_graphviz

I'm a newbie.
I'm using Windows 10, Python 3.5.2 and Graphviz (dot) version 2.38.0, which is correctly installed.
I'm trying to use export_graphviz to visualize a decision tree.
I think I'm pretty close; I just can't do the last step.
Here is the sample code:
from sklearn.datasets import load_iris
from sklearn import tree
clf = tree.DecisionTreeClassifier()
iris = load_iris()
clf = clf.fit(iris.data, iris.target)
tree.export_graphviz(clf, out_file='tree.dot')
The 'tree.dot' file is written. When I double-click it, Windows opens it in Microsoft Word and displays the following text:
digraph Tree {
node [shape=box] ;
0 [label="X[2] <= 2.45\ngini = 0.6667\nsamples = 150\nvalue = [50, 50, 50]"] ;
1 [label="gini = 0.0\nsamples = 50\nvalue = [50, 0, 0]"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="X[3] <= 1.75\ngini = 0.5\nsamples = 100\nvalue = [0, 50, 50]"] ;
0 -> 2 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
3 [label="X[2] <= 4.95\ngini = 0.168\nsamples = 54\nvalue = [0, 49, 5]"] ;
2 -> 3 ;
4 [label="X[3] <= 1.65\ngini = 0.0408\nsamples = 48\nvalue = [0, 47, 1]"] ;
3 -> 4 ;
5 [label="gini = 0.0\nsamples = 47\nvalue = [0, 47, 0]"] ;
4 -> 5 ;
6 [label="gini = 0.0\nsamples = 1\nvalue = [0, 0, 1]"] ;
4 -> 6 ;
7 [label="X[3] <= 1.55\ngini = 0.4444\nsamples = 6\nvalue = [0, 2, 4]"] ;
3 -> 7 ;
8 [label="gini = 0.0\nsamples = 3\nvalue = [0, 0, 3]"] ;
7 -> 8 ;
9 [label="X[0] <= 6.95\ngini = 0.4444\nsamples = 3\nvalue = [0, 2, 1]"] ;
7 -> 9 ;
10 [label="gini = 0.0\nsamples = 2\nvalue = [0, 2, 0]"] ;
9 -> 10 ;
11 [label="gini = 0.0\nsamples = 1\nvalue = [0, 0, 1]"] ;
9 -> 11 ;
12 [label="X[2] <= 4.85\ngini = 0.0425\nsamples = 46\nvalue = [0, 1, 45]"] ;
2 -> 12 ;
13 [label="X[1] <= 3.1\ngini = 0.4444\nsamples = 3\nvalue = [0, 1, 2]"] ;
12 -> 13 ;
14 [label="gini = 0.0\nsamples = 2\nvalue = [0, 0, 2]"] ;
13 -> 14 ;
15 [label="gini = 0.0\nsamples = 1\nvalue = [0, 1, 0]"] ;
13 -> 15 ;
16 [label="gini = 0.0\nsamples = 43\nvalue = [0, 0, 43]"] ;
12 -> 16 ;
}
This example code works properly:
http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#sphx-glr-auto-examples-tree-plot-iris-py
Thanks in advance.

This comment worked great for me. Thanks, ashley!
For Windows: download the MSI and install it; find gvedit.exe in your programs list; open the .dot file in question; click the running-person icon on the toolbar; go to Graph -> Settings; change the output file type to the type of your liking and press OK. It doesn't report anything; you just find the output file in the same directory as your .dot file. – ashley Mar 26 '15 at 9:15
Graphviz: How to go from .dot to a graph?
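The same conversion can also be done without the GUI by calling the dot program directly (dot -Tpng tree.dot -o tree.png). Below is a minimal sketch, assuming dot is on the PATH. The helper names dot_command and render_dot are made up for illustration and are not part of scikit-learn or Graphviz.

```python
import shutil
import subprocess


def dot_command(dot_file, out_file, fmt="png"):
    """Build the Graphviz command line that renders dot_file to out_file."""
    return ["dot", "-T" + fmt, dot_file, "-o", out_file]


def render_dot(dot_file, out_file, fmt="png"):
    """Run dot if it is installed; return True when the render succeeded."""
    if shutil.which("dot") is None:
        # Assumption: dot must be reachable through PATH for this to work.
        return False
    result = subprocess.run(dot_command(dot_file, out_file, fmt))
    return result.returncode == 0


# Show the command that would be executed for the file from the question.
print(" ".join(dot_command("tree.dot", "tree.png")))
```

Calling render_dot("tree.dot", "tree.png") after export_graphviz has written tree.dot should leave tree.png next to it, provided Graphviz is installed.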

Related

Getting memory Address like values of my array when printing (C programming)

I am trying to create an 8x8 array, but when I print it, the elements after [7][7] do not have the values that I assigned when creating the array.
My array is as follows:
#include <stdio.h>
int main() {
int puzzle[8][8] = {
// 0 1 2 3 4 5 6 7 8
{0,0,2,0,4,0,5,9,3},//0
{7,0,0,3,1,0,4,0,8},//1
{4,0,0,8,0,5,1,0,2},//2
{8,3,0,2,0,0,0,0,0},//3
{0,9,6,0,0,0,3,5,0},//4
{0,0,0,0,0,4,0,8,6},//5
{3,0,1,5,0,9,0,0,7},//6
{6,0,5,0,2,8,0,0,1},//7
{0,4,0,0,7,0,6,0,0} //8
};
for (int i = 0; i < 9; ++i)
{
for (int j = 0; j < 9; ++j)
{
printf("\n puzzle[%d][%d] = %d",i,j,puzzle[i][j]);
}
}
return 0;
}
and I am getting output as
puzzle[0][0] = 0
puzzle[0][1] = 0
puzzle[0][2] = 2
puzzle[0][3] = 0
puzzle[0][4] = 4
puzzle[0][5] = 0
puzzle[0][6] = 5
puzzle[0][7] = 9
puzzle[0][8] = 7
puzzle[1][0] = 7
puzzle[1][1] = 0
puzzle[1][2] = 0
puzzle[1][3] = 3
puzzle[1][4] = 1
puzzle[1][5] = 0
puzzle[1][6] = 4
puzzle[1][7] = 0
puzzle[1][8] = 4
puzzle[2][0] = 4
puzzle[2][1] = 0
puzzle[2][2] = 0
puzzle[2][3] = 8
puzzle[2][4] = 0
puzzle[2][5] = 5
puzzle[2][6] = 1
puzzle[2][7] = 0
puzzle[2][8] = 8
puzzle[3][0] = 8
puzzle[3][1] = 3
puzzle[3][2] = 0
puzzle[3][3] = 2
puzzle[3][4] = 0
puzzle[3][5] = 0
puzzle[3][6] = 0
puzzle[3][7] = 0
puzzle[3][8] = 0
puzzle[4][0] = 0
puzzle[4][1] = 9
puzzle[4][2] = 6
puzzle[4][3] = 0
puzzle[4][4] = 0
puzzle[4][5] = 0
puzzle[4][6] = 3
puzzle[4][7] = 5
puzzle[4][8] = 0
puzzle[5][0] = 0
puzzle[5][1] = 0
puzzle[5][2] = 0
puzzle[5][3] = 0
puzzle[5][4] = 0
puzzle[5][5] = 4
puzzle[5][6] = 0
puzzle[5][7] = 8
puzzle[5][8] = 3
puzzle[6][0] = 3
puzzle[6][1] = 0
puzzle[6][2] = 1
puzzle[6][3] = 5
puzzle[6][4] = 0
puzzle[6][5] = 9
puzzle[6][6] = 0
puzzle[6][7] = 0
puzzle[6][8] = 6
puzzle[7][0] = 6
puzzle[7][1] = 0
puzzle[7][2] = 5
puzzle[7][3] = 0
puzzle[7][4] = 2
puzzle[7][5] = 8
puzzle[7][6] = 0
puzzle[7][7] = 0
puzzle[7][8] = -445142480
puzzle[8][0] = -445142480
puzzle[8][1] = 32765
puzzle[8][2] = 480800256
puzzle[8][3] = -129200335
puzzle[8][4] = 0
puzzle[8][5] = 0
puzzle[8][6] = -2108403533
puzzle[8][7] = 32522
puzzle[8][8] = -2106304992
As you can see, I am not getting the value that I assigned to position [7][8].
The output looks like addresses or IDs. I don't understand why this is happening. Is it an IDE problem, or is there a mistake in my code?
As you already seem to understand, array elements are indexed beginning at position zero. So when you define an array of size n, arr[n], it can hold only n elements (indices 0 to n - 1). The same applies to multi-dimensional arrays.
In your case you have defined an 8x8 array, which can hold only 64 elements, but you are trying to assign 9x9 = 81 elements to it. Only indices puzzle[0][0] to puzzle[7][7] are valid; reading anything outside that range is undefined behavior, which is where the garbage values come from.
The compiler should issue a diagnostic for this initialization
int puzzle[8][8] = {
// 0 1 2 3 4 5 6 7 8
{0,0,2,0,4,0,5,9,3},//0
{7,0,0,3,1,0,4,0,8},//1
{4,0,0,8,0,5,1,0,2},//2
{8,3,0,2,0,0,0,0,0},//3
{0,9,6,0,0,0,3,5,0},//4
{0,0,0,0,0,4,0,8,6},//5
{3,0,1,5,0,9,0,0,7},//6
{6,0,5,0,2,8,0,0,1},//7
{0,4,0,0,7,0,6,0,0} //8
};
because each element of the array is itself an array of 8 elements, but you are supplying 9 initializers for each of them.
And if an array has N elements, then the valid range of indices is [0, N).
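Why does puzzle[0][8] print 7, the value of puzzle[1][0]? Reading out of bounds is undefined behavior in C, so nothing is guaranteed, but the printed output is consistent with the usual row-major layout, where puzzle[i][j] sits at offset i*8 + j. Here is a small Python model of that layout, assuming the compiler dropped the excess initializers. The names flat and read are illustrative only.

```python
ROWS, COLS = 8, 8

# The 9x9 initializer list from the question.
initializers = [
    [0, 0, 2, 0, 4, 0, 5, 9, 3],
    [7, 0, 0, 3, 1, 0, 4, 0, 8],
    [4, 0, 0, 8, 0, 5, 1, 0, 2],
    [8, 3, 0, 2, 0, 0, 0, 0, 0],
    [0, 9, 6, 0, 0, 0, 3, 5, 0],
    [0, 0, 0, 0, 0, 4, 0, 8, 6],
    [3, 0, 1, 5, 0, 9, 0, 0, 7],
    [6, 0, 5, 0, 2, 8, 0, 0, 1],
    [0, 4, 0, 0, 7, 0, 6, 0, 0],
]

# Assumption: only the first 8 rows and the first 8 values per row are stored.
flat = [value for row in initializers[:ROWS] for value in row[:COLS]]


def read(i, j):
    """Model of puzzle[i][j] as a read at offset i*COLS + j in flat memory."""
    offset = i * COLS + j
    if offset < len(flat):
        return flat[offset]
    return None  # past the 64 stored ints: whatever happens to be in memory


print(read(0, 8), read(1, 8), read(7, 8))
```

In this model read(0, 8) lands on the first element of row 1, which matches the 7 in the output, while read(7, 8) falls outside the 64 stored values, which matches the garbage printed from puzzle[7][8] onward.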

Matlab - Subtract 1 vector with another in struct array

I have two different struct arrays (in the same MATLAB file). I want to take one parameter from a field in one struct array and subtract from it several parameters from fields in the other struct array. Is this possible?
Here is a small part of my code:
Dist(1).name = 'Pristina'
Dist(1).KM_To_Fushe_ks = 13.7 % 199-13.7 =
Dist(1).KM_to_Lipjan = 8.7 % 199-8.7 =
Dist(1).KM_to_Sllatina = 4.2 % 199-4.2 =
Dist(1).KM_to_Hajvali = 3.5 % 199-3.5 =
Dist(1).KM_to_Mitrovica = 46.9 % 199-46.9 =
Dist(1).KM_to_Anija = 1.9 % 199-1.9 =
EV(1).name = 'Nissan Leaf 24 kWh pack'
EV(1).RangeInKM_By_Manufacturer = 199 %SUBTRACT this with parameters above:
EV(1).Battery_Capacity = 21.6
EV(1).Battery_Warranty_KM = 100000
EV(1).Battery_Warrany_Year = 5
EV(1).EnginePower_Kw = 80
EV(1).EnginePower_hK = 109
EV(1).Torque_in_NewtonMeter = 254
EV(1).QuickCharging_type = 'CHAdeMO'
EV(1).QuickChargingEffect_kW_DC = 50
EV(1).NormalCharging_OnBoard_kW_AC = 3.3
EV(1).Seats = 5
EV(1).Luggage_in_Liters = 370
EV(1).Consumption_Mixed_kWh_per_10km_NEDC = 1.5
EV(1).Weight_Without_Driver = 1475
EV(1).TopSpeed_KM_per_hour = 144
EV(1).Acceleration_0to100KM_per_hour = 11.5
EV(1).RangeInKM_By_Manufacturer_RANK = 10
What I want is to take the number 199 and subtract each of these numbers from it: [13.7, 8.7, 4.2, 3.5, 46.9, 1.9]
How do I do this?
Maybe I misinterpret your question, but this seems to work:
EV(1).RangeInKM_By_Manufacturer = 199 - Dist(1).KM_To_Fushe_ks
In the line you quote in your question, you left the initialization of KM_To_Fushe_ks after the difference; in short, you cannot have two variable assignments in the same command.
Also, if you end your lines with semicolons you will suppress the output to the command window. Like this:
Dist(1).name = 'Pristina';
Dist(1).KM_To_Fushe_ks = 13.7;
Dist(1).KM_to_Lipjan = 8.7;
% Etc...
Here is one solution to my problem:
distances = [KM_to_Fushe_KS, KM_to_Lipjan];
remainingrange = arrayfun(@(s) s.RangeInKM - distances, EV, 'UniformOutput', false)
Or I could do this:
remainingrange = cell(size(EV));
for evidx = 1:numel(EV)
remainingrange{evidx} = EV(evidx).RangeInKM - distances;
end
Another solution is to put all the distances into one matrix.
Example:
Towns = {'Town1', 'Town2', 'Town3', 'Town4'};
distances = [0 200 13.7 8.7;
200 0 13.3 9.3;
13.7 13.3 0 255;
8.7 9.3 255 0];
EVs = {'Nissan Leaf 24 kWh pack', 'Nissan Leaf 30 kWh pack'};
ranges = [199 250];
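And then I can calculate the remaining range as a 3D matrix (either line works):
remainingrange = permute(ranges, [1 3 2]) - distances; % implicit expansion, R2016b or later
remainingrange = bsxfun(@minus, permute(ranges, [1 3 2]), distances); % equivalent for older releases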
If I want to check whether an EV does not have enough range in km, I could write:
tooFarForMyEV = find(remainingrange < 0)
[from, to, ev] = ind2sub(size(remainingrange), tooFarForMyEV);
lackingrange = table(Towns(from)', Towns(to)', EVs(ev)', remainingrange(tooFarForMyEV), 'VariableNames', {'From', 'To', 'EV', 'Deficit'})
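For readers who want to see what the 3D subtraction computes, here is a plain-Python sketch of the same idea using the example data above. It is an illustration only, not a translation of the MATLAB code. The function name deficits is made up, and the result order differs from MATLAB's column-major find.

```python
towns = ["Town1", "Town2", "Town3", "Town4"]
distances = [
    [0, 200, 13.7, 8.7],
    [200, 0, 13.3, 9.3],
    [13.7, 13.3, 0, 255],
    [8.7, 9.3, 255, 0],
]
evs = ["Nissan Leaf 24 kWh pack", "Nissan Leaf 30 kWh pack"]
ranges = [199, 250]


def deficits(distances, ranges):
    """Return (from, to, ev, remaining) for every trip longer than the EV range."""
    out = []
    for ev, ev_range in enumerate(ranges):
        for src, row in enumerate(distances):
            for dst, km in enumerate(row):
                remaining = ev_range - km  # same value as remainingrange(src, dst, ev)
                if remaining < 0:
                    out.append((src, dst, ev, remaining))
    return out


for src, dst, ev, remaining in deficits(distances, ranges):
    print(towns[src], towns[dst], evs[ev], remaining)
```

Each printed row corresponds to one row of the lackingrange table: a town pair, the EV, and how many km the range falls short.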

find a list of integers for a checksum

I need a list L of n positive integers that has the following properties:
for each possible subset S of L, the sum of all items of S is not in L
for each possible subset S of L, the sum of all items of S is unique (each subset can be identified by its sum)
Working example 1:
n = 4
L = [1, 5, 7, 9]
check:
1+5 = 6 ok
5+7 = 12 ok
7+9 = 16 ok
9+1 = 10 ok
1+7 = 8 ok
5+9 = 14 ok
1+5+7 = 13 ok
5+7+9 = 21 ok
1+5+9 = 15 ok
1+7+9 = 17 ok
1+5+7+9 = 22 ok
All sums are unique -> L is OK for n = 4
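The manual check above can be automated with a brute-force test over all non-empty subsets. This is a sketch for small n only, since there are 2**n - 1 subsets. The function name is made up, and it assumes the items of L are distinct. Because single-item subsets are included, a passing result also means that no sum of two or more items is itself in L.

```python
from itertools import combinations


def subset_sums_are_unique(items):
    """Return True if every non-empty subset of items has a distinct sum."""
    seen = set()
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            total = sum(subset)
            if total in seen:
                return False  # two different subsets share this sum
            seen.add(total)
    return True


print(subset_sums_are_unique([1, 5, 7, 9]))
print(subset_sums_are_unique([1, 2, 3]))
```

The second call fails the check because 1 + 2 equals 3, which is already the sum of the single-item subset [3].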
As an easy-to-construct sequence, I suggest using the powers of a fixed base, e.g.
1, 2, 4, 8, ..., 2**k, ...
1, 3, 9, 27, ..., 3**k, ...
1, 4, 16, 64, ..., 4**k, ...
...
1, n, n**2, n**3,..., n**k, ... where n >= 2
Take, for instance, 2: no power of 2 is a sum of other powers of 2; given a sum (a number), you can easily find the subset by converting the sum into its binary representation:
23 = 10111 (binary) = 2**0 + 2**1 + 2**2 + 2**4 = 1 + 2 + 4 + 16
In the general case, a simple greedy algorithm will do: given a sum, subtract the largest item less than or equal to it; continue subtracting until you reach zero:
n = 3
sum = 273
273 - 243 (3**5) = 30
30 - 27 (3**3) = 3
3 - 3 (3**1) = 0
273 = 3**5 + 3**3 + 3**1 = 243 + 27 + 3
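The greedy procedure can be sketched in a few lines of Python. The function name decode_subset is made up for illustration. Each item is used at most once, and the sketch assumes a list where every item is larger than the sum of all smaller ones, which holds for powers of any base n >= 2.

```python
def decode_subset(total, items):
    """Greedily recover the subset of items that adds up to total.

    Returns the chosen items from largest to smallest, or None when
    total is not the sum of any subset.
    """
    chosen = []
    remaining = total
    for item in sorted(items, reverse=True):
        if item <= remaining:
            chosen.append(item)
            remaining -= item
    return chosen if remaining == 0 else None


powers_of_3 = [3 ** k for k in range(7)]  # 1, 3, 9, ..., 729
powers_of_2 = [2 ** k for k in range(5)]  # 1, 2, 4, 8, 16

print(decode_subset(273, powers_of_3))
print(decode_subset(23, powers_of_2))
```

For the two worked examples above this yields 243, 27, 3 and 16, 4, 2, 1 respectively.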

How many iterations of the while loop must be made in order to locate it?

I'm having some trouble working out why the correct answer to this question is 4. Could anyone be kind enough to briefly explain why? Thanks in advance! Here's the question:
Consider the array a with values as shown:
4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321
where 4 is a[0] and 321 is a[13]. Suppose that the search method is called with
first = 0 and last = 13 to locate the key 205. How many iterations of the while loop must be made in order to locate it?
My guess is that you have to use a binary search here, as the items are sorted.
Given this array
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321
You initialize with:
left and right indexes: l = 0, r = 14 (= length of array)
Then you need these iterations:
m = (l + r) / 2 = (0 + 14) / 2 = 7
[m = 7] = 100 is < 205 ==> l = 7 + 1
m = (l + r) / 2 = (8 + 14) / 2 = 11
[m = 11] = 271 is > 205 ==> r = 11 - 1
m = (l + r) / 2 = (8 + 10) / 2 = 9
[m = 9] = 205 is = 205 ==> result = [9]
= 3 iterations!
However, a slight change to the algorithm can change the number of iterations. If you take r = N-1 instead of N as initial value then you get:
m = (l + r) / 2 = (0 + 13) / 2 = 6 (integer division)
[m = 6] = 50 is < 205 ==> l = 6 + 1
m = (l + r) / 2 = (7 + 13) / 2 = 10
[m = 10] = 220 is > 205 ==> r = 10 - 1
m = (l + r) / 2 = (7 + 9) / 2 = 8
[m = 8] = 101 is < 205 ==> l = 8 + 1
m = (l + r) / 2 = (9 + 9) / 2 = 9
[m = 9] = 205 is = 205 ==> result = [9]
= 4 iterations!
So the result depends on implementation details. Both variants are correct. Take care to choose the appropriate loop condition (I think l < r for the first and l <= r for the second algorithm).
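Both variants can be checked by counting loop iterations in code. This is a sketch, not the search method from the original exercise, which is not shown. The half-open version below uses the conventional update hi = mid rather than the r = 11 - 1 step written in the first trace; for this key the count comes out the same.

```python
a = [4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321]


def iterations_half_open(a, key):
    """Binary search on [lo, hi) with hi starting at len(a); returns loop count."""
    lo, hi = 0, len(a)
    count = 0
    while lo < hi:
        count += 1
        mid = (lo + hi) // 2
        if a[mid] == key:
            return count
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return count


def iterations_closed(a, key):
    """Binary search on [lo, hi] with hi starting at len(a) - 1; returns loop count."""
    lo, hi = 0, len(a) - 1
    count = 0
    while lo <= hi:
        count += 1
        mid = (lo + hi) // 2
        if a[mid] == key:
            return count
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return count


print(iterations_half_open(a, 205), iterations_closed(a, 205))
```

The closed-interval version, called with first = 0 and last = 13 as in the question, is the one that takes 4 iterations.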
Just search backwards from the last index.
You start at index 13; after the first iteration you are at index 12, and on the 4th iteration you are at index 9, which holds 205.

Matlab create a matrix with values

I've written a function in MATLAB which generates a matrix using a loop. I was wondering whether it is possible to generate the same result without a loop. x can be 1 x 50, 2 x 50, 3 x 50, etc.; the values in each row run from 1 to 50, increasing by one per column.
For example
1 x 1 = 1,
2 x 1 = 1,
3 x 1 = 1,
1 x 2 = 2,
2 x 2 = 2,
3 x 2 = 2,
.....................
1 x 50 = 50,
2 x 50 = 50,
3 x 50 = 50,
My function:
function [i] = m(x)
[a, b] = size(x);
i = zeros(a, b);
for c = 1 : a
i(c, :) = (1:size(x,2));
end
end
Thanks.
Try this:
N = 3;
M = 50;
x = repmat((1:N)',M,1);
y = reshape(repmat((1:M)',1,N)',N*M,1);
%z = x.*y
z = strcat(num2str(x),'x',num2str(y),'=',num2str(x.*y))
This will give the same format as in your question.
Use repmat:
output = repmat(1:size(x,2), size(x,1), 1);
Some alternatives are
output = ones(size(x,1),1)*(1:size(x,2));
and
output = cumsum(ones(size(x)),2);
One alternative to repmat (Luis's answer) is bsxfun:
out = bsxfun(@times,ones(size(x,1),1),1:size(x,2))
