Creating a heatmap from apriori output

I would also like to create a data.frame with one row per rule and one column per item, filled with 1 or 0 according to whether the item appears in the rule.
Example:
            A  B  C  D
A + B + C   1  1  1  0
A + D + C   1  0  1  1
Any suggestions? The idea is to create a heatmap.
Thanks a lot!

as(x, "matrix") converts an itemMatrix into a logical matrix. Example:
data("Adult")
rules <- apriori(Adult,
                 parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
as(items(rules), "matrix")
items() extracts the LHS and RHS of the rule as a single itemMatrix.
You can convert the matrix into a data.frame and change it from logical to 0-1 (using storage.mode()).
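To make the target layout concrete, here is a minimal language-neutral sketch in Python of the same idea: one row per rule, one column per item, filled with 1 or 0. The function name rules_to_binary_rows and the two sample rules are illustrative assumptions taken from the example in the question, not part of arules.

```python
def rules_to_binary_rows(rules):
    """Return (sorted item names, one 0/1 row per rule)."""
    # Collect every item that appears in any rule; these become the columns.
    items = sorted({item for rule in rules for item in rule})
    # For each rule, mark 1 where the item is present and 0 otherwise.
    rows = [[1 if item in rule else 0 for item in items] for rule in rules]
    return items, rows


# Hypothetical rules, mirroring the example in the question.
rules = [{"A", "B", "C"}, {"A", "D", "C"}]
items, rows = rules_to_binary_rows(rules)

print(items)
for rule, row in zip(rules, rows):
    print(" + ".join(sorted(rule)), row)
```

The resulting rows have the same shape as the matrix that a heatmap function would expect.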

Related

Julia / Cellular Automata: efficient way to get neighborhood

I'd like to implement a cellular automaton (CA) in Julia. Dimensions should be wrapped, this means: the left neighbor of the leftmost cell is the rightmost cell etc.
One crucial question is: how do I get the neighbors of a cell in order to compute its state in the next generation? As dimensions should be wrapped and Julia does not allow negative indices (as Python does), I had this idea:
Considered a 1D CA, one generation is a one-dimensional array:
0 0 1 0 0
What if we create a two dimensional Array, where the first row is shifted right and the third is shifted left, like this:
0 0 0 1 0
0 0 1 0 0
0 1 0 0 0
Now the first column contains the states of the first cell and its neighbors, and so on.
I think this can easily be generalized to two or more dimensions.
First question: do you think this is a good idea, or is this a wrong track?
EDIT: Answer to first question was no, second Question and code example discarded.
Second question: If the approach is basically ok, please have a look at the following sketch:
EDIT: Other approach, here is a stripped down version of a 1D CA, using mod1() for getting neighborhood-indices, as Bogumił Kamiński suggested.
for any cell:
- A array of all indices
- B array of all neighborhood states
- C states converted to one integer
- D lookup next state
function digits2int(digits, base=10)
    int = 0
    for digit in digits
        int = int * base + digit
    end
    return int
end
gen = [0,0,0,0,0,1,0,0,0,0,0]
rule = [0,1,1,1,1,0,0,0]
function nextgen(gen, rule)
    values = [mod1.(x .+ [-1,0,1], size(gen)) for x in 1:length(gen)] # A
    values = [gen[value] for value in values] # B
    values = [digits2int(value, 2) for value in values] # C
    values = [rule[value+1] for value in values] # D
    return values
end
for _ in 1:100
    global gen
    println(gen)
    gen = nextgen(gen, rule)
end
Next step should be to extend it to two dimensions, will try it now...
The way I typically do it is to use mod1 function for wrapped indexing.
With this approach, whatever the dimensionality of your array a, moving from position x by an offset dx along the first dimension only requires mod1(x+dx, size(a, 1)).
Here is a simple example of a random walk on a 2D torus counting the number of times a given cell was visited (here I additionally use broadcasting to handle all dimensions in one expression):
function randomwalk()
    a = zeros(Int, 8, 8)
    pos = (1,1)
    for _ in 1:10^6
        # Von Neumann neighborhood
        dpos = rand(((1,0), (-1,0), (0,1), (0,-1)))
        pos = mod1.(pos .+ dpos, size(a))
        a[pos...] += 1
    end
    a
end
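For readers who do not use Julia, the wrapped-index idea can be sketched in Python as follows. The helper mod1 here is a hand-written stand-in that mimics Julia's mod1 for positive n (it is not a Python built-in), and neighbors_1d is a hypothetical name used only for illustration.

```python
def mod1(x, n):
    """1-based wrap-around, like Julia's mod1: the result is always in 1..n."""
    return (x - 1) % n + 1


def neighbors_1d(i, n):
    """1-based indices of the left neighbor, the cell itself, and the right neighbor."""
    return [mod1(i + d, n) for d in (-1, 0, 1)]


# In a row of 5 cells, the left neighbor of cell 1 is cell 5,
# and the right neighbor of cell 5 is cell 1.
print(neighbors_1d(1, 5))
print(neighbors_1d(5, 5))
```

Shifting by one before and after the modulo is what keeps the result in 1..n instead of 0..n-1.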
Usually, if the CA has cells that are only dependent on the cells next to them, it's simpler just to "wrap" the vector by adding the last element to the front and the first element to the back, doing the simulation, and then "unwrap" by taking the first and last elements away again to get the result length the same as the starting array length. For the 1-D case:
const lines = 10
const start = ".........#........."
const rules = [90, 30, 14]
rule2poss(rule) = [rule & (1 << (i - 1)) != 0 for i in 1:8]
cells2bools(cells) = [cells[i] == '#' for i in 1:length(cells)]
bools2cells(bset) = prod([bset[i] ? "#" : "." for i in 1:length(bset)])
function transform(bset, ruleposs)
    newbset = map(x->ruleposs[x],
        [bset[i + 1] * 4 + bset[i] * 2 + bset[i - 1] + 1
        for i in 2:length(bset)-1])
    vcat(newbset[end], newbset, newbset[1])
end
const startset = cells2bools(start)
for rul in rules
    println("\nUsing Rule $rul:")
    bset = vcat(startset[end], startset, startset[1]) # wrap ends
    rp = rule2poss(rul)
    for _ in 1:lines
        println(bools2cells(bset[2:end-1])) # unwrap ends
        bset = transform(bset, rp)
    end
end
As long as only the adjacent cells are used in the simulation for any given cell, this is correct.
If you extend this to a 2D matrix, you would also "wrap" the first and last rows as well as the first and last columns, and so forth.
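The pad-then-trim idea for the 1-D case can also be sketched in Python. This is an illustration under stated assumptions rather than a port of the Julia code: the function name step is hypothetical, cells are 0/1 integers, and the left neighbor is taken as the most significant bit of the rule lookup, which may differ from the bit ordering used in the Julia version.

```python
def step(cells, rule_number):
    """Advance a wrapped 1-D, 3-neighbor cellular automaton by one generation."""
    # Wrap: put the last cell in front and the first cell at the back.
    padded = [cells[-1]] + cells + [cells[0]]
    out = []
    for i in range(1, len(padded) - 1):
        # Assumption: left neighbor is the high bit, right neighbor the low bit.
        idx = padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1]
        # Bit idx of the rule number gives the next state.
        out.append((rule_number >> idx) & 1)
    # The ghost cells are never written to out, so the length is unchanged.
    return out


print(step([0, 0, 1, 0, 0], 90))
print(step([1, 0, 0, 0, 0], 90))
```

In the second call the live cell sits on the edge, so the new live cell at the far end shows that the wrap is working.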

Variation of indices of an array or matrix

If I use this syntax:
mX=[1:5];
A=rand(5,1);
C(mX)=sum(A(1:mX));
Why doesn't the content of C(mX) vary with varying mX?
Instead of doing
C(1)=A(1)
C(2)=A(1)+A(2), etc
it does:
C(1)=A(1)
C(2)=A(1)
C(3)=A(1), etc
Is there any way to vary C(mX) without resorting to a loop?
To answer your first question:
mX=1:5;
A=rand(5,1);
C(mX)=sum(A(1:mX));
computes the sum over A(1:[1 2 3 4 5]). When the colon operator is given a non-scalar operand, MATLAB uses only its first element, so this evaluates to A(1:1), and every C(mX) entry is filled with the single element A(1).
What you want is a cumulative sum, which can be done with cumsum, as @leanderMoesinger mentioned:
A=rand(5,1);
C = cumsum(A)
C =
0.0975
0.3760
0.9229
1.8804
2.8453
If you want to learn more about indexing I can highly recommend the following post: Linear indexing, logical indexing, and all that
If you want not all elements of A, but e.g. up to element three you can do
mX = 1:3;
A = rand(5,1);
C = cumsum(A(mX)); % calculate only up to mX
mX = [1 3 5];
C = cumsum(A(mX)) % Also works if you only want elements 1 3 and 5 to appear
% If you want elements of C 1 3 and 5 use
tmp = cumsum(A);
C = tmp(mX);
You can do this by cumsum like so:
mX=[1:5];
A=rand(5,1);
C = cumsum(A(mX));
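The same running-sum idea, and the difference between "cumulative sum of selected elements" and "selected elements of the cumulative sum", can be sketched in Python with the standard library. The values in A are made up so that the sums are easy to check by hand.

```python
from itertools import accumulate

A = [0.5, 1.5, 2.0, 4.0, 8.0]  # made-up values, chosen for easy checking

# C[k] holds A[0] + ... + A[k], the analogue of cumsum(A).
C = list(accumulate(A))

# Elements 1, 3 and 5 (1-based) of the full cumulative sum,
# the analogue of tmp = cumsum(A); C = tmp(mX).
selected = [C[i - 1] for i in (1, 3, 5)]

# Cumulative sum over only elements 1, 3 and 5,
# the analogue of cumsum(A(mX)) with mX = [1 3 5].
partial = list(accumulate(A[i - 1] for i in (1, 3, 5)))

print(C)
print(selected)
print(partial)
```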

How to compare each matrix to mean and return value in Matlab

For example, let's consider
a = fix(8 * randn(10,5));
and mean(a) would give me the mean of each column.
So what I was planning to do was compare the mean of the first column with each entry in that column, then proceed to the next column and compare its mean with each of its entries.
I was able to come up with the code below (I know there are multiple for loops, but that's the best I could do; any alternative answer would be greatly appreciated):
if(ndims(a)==2)
    b = mean(a);
    for c = 1:size(a,2)
        for d = 1:size(a)
            for e = 1:size(b,2)
                if(a(d,c)>b(1,c))
                    disp(1);
                else
                    disp(false);
                end
            end
        end
    end
else
    disp('Input should be a 2D matrix');
end
I don't know if this is the right answer? Could any one tell me?
Thanks in advance.
It seems you want to know whether each entry is greater than its column-mean.
This is done efficiently with bsxfun:
result = bsxfun(@gt, a, mean(a,1));
Example:
a =
     3     1     3     2
     5     2     3     1
     1     3     5     2
The column-means, given by mean(a,1), are
ans =
   3.000000000000000   2.000000000000000   3.666666666666667   1.666666666666667
Then
>> result = bsxfun(@gt, a, mean(a,1))
result =
     0     0     0     1
     1     0     0     0
     0     1     1     1
If you are trying to do what I think you are (print one if the average value of a column is greater than the value in that column, zero otherwise) you can eliminate a lot of loops doing the following (using your same a and b):
for ii=1:length(b)
    c(:,ii) = b(ii) > a(:,ii);
end
c will be your array of ones and zeros.
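For comparison, the "is each entry greater than its column mean" check can be sketched in plain Python, using the 3x4 example matrix from the first answer. The function name greater_than_column_mean is a hypothetical choice for this illustration.

```python
def greater_than_column_mean(a):
    """Return a 0/1 matrix marking entries strictly greater than their column mean."""
    n_rows = len(a)
    # zip(*a) iterates over columns of a row-major list of lists.
    means = [sum(col) / n_rows for col in zip(*a)]
    return [[1 if value > mean else 0 for value, mean in zip(row, means)]
            for row in a]


a = [[3, 1, 3, 2],
     [5, 2, 3, 1],
     [1, 3, 5, 2]]
result = greater_than_column_mean(a)

for row in result:
    print(row)
```

Flipping the comparison to mean > value gives the second answer's interpretation instead.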

gnuplot--iteration to obtain variables in datafile

Let's say that I've got data called 'myData.dat' in the form
x y
0 0
1 1
2 2
4 3
8 4
16 5
I need to find the following things from this data:
slope for points
0 to 5
1 to 5
2 to 5
3 to 5
4 to 5
y-intercept for the same pairs
equation for the line connecting the same pairs
Then I need to plot the data and overlay the lines; below is a picture of what I'm asking for.
I know how to obtain the slope and y-intercept for a single pair of points, and how to plot the data and the equation of the line. For example, for points 1 and 5:
set table
plot "myData.dat" using 0:($0==0 ? y1=$2 : $2)
plot "myData.dat" using 0:($0==4 ? y5=$2 : $2)
unset table
m1 = (y5 - y1)/(5-1)
b1 = y1 - m1*1
y1(x) = m1*x + b1
I'm new to iteration (and gnuplot) and I think there's something wrong with my syntax. I've tried a number of things and they haven't worked. My best guess is that it would be in the form
plot for [i=1:4] using 0:($0==1 ? y.i=$1 : $1)
do for [i=1:5]{
m.i = (y5 - y.i)/(5-i)
b.i = y.i - m.i*1
y.i(x) = m.i*x + b.i
}
set multiplot
plot "myData.dat" w lp
plot for [i=1:4] y.1(x)
unset multiplot
So what is going wrong? Is gnuplot able to concatenate the loop counter to variable names?
Your syntax is incorrect. Although there are other ways to do what you want, for instance using word(var,i), the most straightforward fix to what you already have is to use eval to evaluate a string into which you concatenate the loop counter:
do for [i=1:5]{
    eval "m".i." = (y5 - y".i.")/(5-".i.")"
    eval "b".i." = y".i." - m".i."*1"
    eval "y".i."(x) = m".i."*x + b".i
}
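Outside gnuplot, the usual alternative to building variable names from a loop counter is to keep the per-pair results in an indexed container. The Python sketch below illustrates that idea with the data from the question. Note the assumptions: it uses the actual x and y columns of each point (the question's formulas use the point number as x instead), points are numbered from 0 as in the question's list, and line_through is a hypothetical helper name.

```python
def line_through(p, q):
    """Slope and y-intercept of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b


# Data from myData.dat as (x, y) pairs, numbered 0..5.
points = [(0, 0), (1, 1), (2, 2), (4, 3), (8, 4), (16, 5)]

# One (slope, intercept) entry per pair "i to 5", keyed by i,
# instead of separate variables m1, m2, ... and b1, b2, ...
lines = {i: line_through(points[i], points[5]) for i in range(5)}

for i, (m, b) in lines.items():
    print(f"points {i} to 5: y = {m}*x + {b}")
```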

Vectorization in MATLAB

I'm trying to create a matrix of size 121x101 such that the ith column is made up of V_t*e, where V_t = 1000*10^((i-1)/20) and e is a column of 121 ones.
Clearly i is to be varied from 1 to 101, but how would I apply that to a matrix without only yielding the final value in the results (applying this to every column without repeating commands)?
From your question, it looks like each row is the same. Thus, you can just calculate one row using REPMAT as
iRow = 1:101;
V_t = 1000*10.^((iRow-1)/20);
V_te = repmat(V_t,121,1);
If you want to have e be 1 in row 1, 2 in row 2, etc, you can use NDGRID to create two arrays of the same size as the output, which contain the values of e and i for every element (i,j) of the output
[ee,ii] = ndgrid(1:121,1:101);
V_te = 1000*10.^((ii-1)/20) .* ee;
or you can use BSXFUN to do the expansion of e and i for you
iRow = 1:101;
V_t = 1000*10.^((iRow-1)/20);
V_te = bsxfun(@times,V_t,(1:121)');
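As a cross-check of the shapes involved, the first variant (every row identical, as with REPMAT) can be sketched in plain Python. The names n_rows and n_cols are illustrative; the formula is the one given in the question.

```python
n_rows, n_cols = 121, 101

# One row of values: V_t for i = 1..101.
V_t = [1000 * 10 ** ((i - 1) / 20) for i in range(1, n_cols + 1)]

# Repeat that row 121 times; each row is an independent copy.
V_te = [list(V_t) for _ in range(n_rows)]

print(len(V_te), len(V_te[0]))
print(V_te[0][0], V_te[0][20], V_te[0][100])
```

Column 1 is 1000, column 21 is ten times that, and column 101 is 1000*10^5, so each 20 columns correspond to one decade.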
