Decision Network Example - artificial-intelligence

I am reading this example, but could you explain it a little more? I don't get the part where it says "then we Normalize"...
I know
P(sun) * P(F=bad|sun) = 0.7*0.2 = 0.14
P(rain)* P(F=bad|rain) = 0.3*0.9 = 0.27
But where do they get
W      P(W | F=bad)
-------------------
sun    0.34
rain   0.66
Example from

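To normalize a list of numbers, you divide each by the sum of the list. Here the sum, 0.14 + 0.27 = 0.41, is P(F=bad), so dividing by it turns each product P(W) * P(F=bad|W) into P(W | F=bad).
For example, in Python: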
>>> v = [0.14, 0.27]
>>> s = sum(v)
>>> print s
0.41000000000000003
>>> vnorm = [n/s for n in v]
>>> print vnorm
[0.34146341463414637, 0.65853658536585369]
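The same step can also be read as Bayes' rule, which may make it clearer where the two numbers come from. Below is a minimal Python 3 sketch; the names prior, likelihood_bad, joint and posterior are made up for illustration, and the values are the ones from the question.

```python
# Prior P(W) and likelihood P(F=bad | W), taken from the question.
prior = {"sun": 0.7, "rain": 0.3}
likelihood_bad = {"sun": 0.2, "rain": 0.9}

# Unnormalized posterior: P(W) * P(F=bad | W) for each weather value.
joint = {w: prior[w] * likelihood_bad[w] for w in prior}

# The normalizing constant is P(F=bad), the sum over all weather values.
p_bad = sum(joint.values())

# Dividing by P(F=bad) gives P(W | F=bad), which sums to 1.
posterior = {w: joint[w] / p_bad for w in joint}

for w, p in posterior.items():
    print(w, round(p, 2))
```

Rounded to two decimals, this reproduces the 0.34 and 0.66 in the table from the example.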

Related

Julia way to write k-step look ahead function?

Suppose I have two arrays representing a probabilistic graph:
2
/ \
1 -> 4 -> 5 -> 6 -> 7
\ /
3
Where the probability of going to state 2 is 0.81 and the probability of going to state 3 is (1-0.81) = 0.19. My arrays represent the estimated values of the states as well as the rewards. (Note: Each index of the array represents its respective state)
V = [0, 3, 8, 2, 1, 2, 0]
R = [0, 0, 0, 4, 1, 1, 1]
The context doesn't matter so much, it's just to give an idea of where I'm coming from. I need to write a k-step look ahead function where I sum the discounted value of rewards and add it to the estimated value of the kth-state.
I have been able to do this so far by creating a separate function for each number of look-ahead steps. My goal in asking this question is to figure out how to refactor this code so that I don't repeat myself and so that it uses idiomatic Julia.
Here is an example of what I am talking about:
function E₁(R::Array{Float64,1}, V::Array{Float64, 1}, P::Float64)
    V[1] + 0.81*(R[1] + V[2]) + 0.19*(R[2] + V[3])
end
function E₂(R::Array{Float64,1}, V::Array{Float64, 1}, P::Float64)
    V[1] + 0.81*(R[1] + R[3]) + 0.19*(R[2] + R[4]) + V[4]
end
function E₃(R::Array{Float64,1}, V::Array{Float64, 1}, P::Float64)
    V[1] + 0.81*(R[1] + R[3]) + 0.19*(R[2] + R[4]) + R[5] + V[5]
end
.
.
.
So on and so forth. It seems that if I was to ignore E₁() this would be exceptionally easy to refactor. But because I have to discount the value estimate at two different states, I'm having trouble thinking of a way to generalize this for k-steps.
I think obviously I could write a single function that took an integer as a value and then use a bunch of if-statements but that doesn't seem in the spirit of Julia. Any ideas on how I could refactor this? A closure of some sort? A different data type to store R and V?
It seems like you essentially have a discrete Markov chain. So the standard way would be to store the graph as its transition matrix:
T = zeros(7,7)
T[1,2] = 0.81
T[1,3] = 0.19
T[2,4] = 1
T[3,4] = 1
T[4,5] = 1
T[5,6] = 1
T[6,7] = 1
Then you can calculate the probabilities of ending up at each state, given an initial distribution, by multiplying by T' from the left (the transpose is needed because the transition matrix is usually defined with T[i,j] as the probability of going from state i to state j):
julia> T' * [1,0,0,0,0,0,0] # starting from (1)
7-element Array{Float64,1}:
0.0
0.81
0.19
0.0
0.0
0.0
0.0
Likewise, the probability of ending up at each state after k steps can be calculated by using powers of T':
julia> T' * T' * [1,0,0,0,0,0,0]
7-element Array{Float64,1}:
0.0
0.0
0.0
1.0
0.0
0.0
0.0
Now that you have all probabilities after k steps, you can easily calculate expectations as well. It may pay off to define T as a sparse matrix.
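To make the last step concrete, here is a rough sketch of a k-step look-ahead built on the transition-matrix idea, written in Python rather than Julia. Several things here are assumptions and not part of the question: the function names step and k_step_lookahead, the discount factor gamma (defaulting to 1), and the convention that the reward R[s] is collected on arriving in state s. States are 0-indexed, so the edge 4 -> 5 from the diagram is stored at T[3][4].

```python
def step(T, dist):
    """Advance the distribution one step: new[j] = sum_i dist[i] * T[i][j]."""
    n = len(T)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]


def k_step_lookahead(T, R, V, start, k, gamma=1.0):
    """Expected discounted rewards over k steps plus the discounted
    value estimate of the state reached after k steps."""
    dist = [0.0] * len(T)
    dist[start] = 1.0
    total = 0.0
    for t in range(1, k + 1):
        dist = step(T, dist)
        # Assumed convention: reward R[s] is received on arrival in s.
        total += gamma ** (t - 1) * sum(p * r for p, r in zip(dist, R))
    # Bootstrap with the value estimate of the k-th state.
    total += gamma ** k * sum(p * v for p, v in zip(dist, V))
    return total


# Transition matrix for the graph 1 -> {2, 3} -> 4 -> 5 -> 6 -> 7 (0-indexed).
n = 7
T = [[0.0] * n for _ in range(n)]
T[0][1] = 0.81
T[0][2] = 0.19
T[1][3] = 1.0
T[2][3] = 1.0
T[3][4] = 1.0
T[4][5] = 1.0
T[5][6] = 1.0

V = [0, 3, 8, 2, 1, 2, 0]
R = [0, 0, 0, 4, 1, 1, 1]

for k in (1, 2, 3):
    print(k, k_step_lookahead(T, R, V, start=0, k=k))
```

A single function parameterized by k replaces the separate E₁, E₂, E₃ definitions; if the reward convention differs from the one assumed here, only the line that accumulates rewards needs to change.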

End Loop when significant value found : Stata?

Could you help me figure out how to tell Stata to end a loop over iterations when it finds the first positive and significant value of a particular coefficient in a regression?
Here is a small sample, using a publicly available dataset, that shows what I am trying to do. In the following case, I want Stata to stop looping when it finds the "year" coefficient to be positive and significant.
set more off
clear all
clear matrix
use http://www.stata-press.com/data/r13/abdata
forvalues i=1/8 {
    xtabond n w k ys year, lags(`i') noconstant
    matrix b = e(b)'
    mat byear = b["year",1]
    if `i'==1 matrix byear=b["year",1]
    else matrix byear=(byear\ b["year",1])
}
Could you please help me figure out how to tell Stata to stop looping when a condition is met?
Thank you
Here is some code that seems to do what you want. I had to set the confidence level to 80 (from the default of 95) so it would terminate before it exceeded the maximum number of lags.
set more off
clear all
clear matrix
set level 80
use http://www.stata-press.com/data/r13/abdata
forvalues i=1/8 {
    quietly xtabond n w k ys year, lags(`i') noconstant
    matrix t = r(table)
    scalar b = t[rownumb(t,"b"),colnumb(t,"year")]
    scalar p = t[rownumb(t,"pvalue"),colnumb(t,"year")]
    scalar r = 1-r(level)/100
    scalar q = (b>0) & (p<=r)
    if q {
        display "success with `i' lags"
        display "b: " b " p: " p " r: " r " q: " q
        xtabond
        continue, break
    }
    else {
        display "no luck with `i' lags"
    }
}
which yields
no luck with 1 lags
success with 2 lags
b: .00759529 p: .18035747 r: .2 q: 1
Arellano-Bond dynamic panel-data estimation     Number of obs     =        611
Group variable: id                              Number of groups  =        140
Time variable: year
                                                Obs per group:
                                                              min =          4
                                                              avg =   4.364286
                                                              max =          6
Number of instruments =     31                  Wald chi2(6)      =    1819.55
                                                Prob > chi2       =     0.0000
One-step results
------------------------------------------------------------------------------
           n |      Coef.   Std. Err.      z    P>|z|     [80% Conf. Interval]
-------------+----------------------------------------------------------------
           n |
         L1. |   .3244849   .0774312     4.19   0.000     .1727225    .4762474
         L2. |  -.0266879   .0363611    -0.73   0.463    -.0979544    .0445785
             |
           w |  -.5464779   .0562155    -9.72   0.000    -.6566582   -.4362975
           k |    .360622   .0330634    10.91   0.000     .2958189    .4254252
          ys |   .5948084   .0818672     7.27   0.000     .4343516    .7552652
        year |   .0075953   .0056696     1.34   0.180    -.0035169    .0187075
------------------------------------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/.).n
        Standard: D.w D.k D.ys D.year
.
end of do-file

Bayesian Network

I have the following Bayesian network:
I was asked to find:
Value of P(b)
The solution
P(b) = Σ_{A ∈ {a, ¬a}} P(A) P(b|A)
= 0.1 × 0.5 + 0.9 × 0.8 = 0.77
and the value of P(d|a)
The solution:
P(d|a) = Σ_{B ∈ {b, ¬b}} P(d|B) P(B|a)
= 0.9 × 0.5 + 0.2 × 0.5 = 0.55
How did they come up with the above formulas?
What rule did they use to find a marginal probability from the Bayesian network graph?
I understand the basic joint probability distribution formula, which is just the product of each node's probability given its parents.
Some explanation and resources relating to this will be helpful.
Thank you
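I guess I found my answer.
It uses marginalization: summing over all values of an intermediate variable.
The formula is:
P(X|Y) = Σ_{z ∈ all possible values of Z} P(X|Y,Z=z) P(Z=z|Y)
When X is conditionally independent of Y given Z, as the solution assumes for d and a given B, P(X|Y,Z) reduces to P(X|Z), which gives the P(d|B) term in the second solution.
Now you can easily find the above two probabilities.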
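As a quick check of the two results, here is a small Python sketch of the marginalization. The network figure is not reproduced in the text, so the numbers below are read off the worked solutions (P(a) = 0.1, P(b|a) = 0.5, P(b|¬a) = 0.8, P(d|b) = 0.9, P(d|¬b) = 0.2), and the chain structure a -> b -> d is an assumption implied by the use of P(d|B). The variable names are illustrative only.

```python
# Probabilities read off the worked solutions (True = event holds).
p_A = {True: 0.1, False: 0.9}            # P(A)
p_b_given_A = {True: 0.5, False: 0.8}    # P(b | A)
p_d_given_B = {True: 0.9, False: 0.2}    # P(d | B)

# P(b) = sum over A of P(A) * P(b | A)
p_b = sum(p_A[a] * p_b_given_A[a] for a in (True, False))

# P(B | a): b is true with P(b | a), false with 1 - P(b | a).
p_B_given_a = {True: p_b_given_A[True], False: 1 - p_b_given_A[True]}

# P(d | a) = sum over B of P(d | B) * P(B | a),
# assuming d is independent of a once B is known.
p_d_given_a = sum(p_d_given_B[b] * p_B_given_a[b] for b in (True, False))

print(round(p_b, 2), round(p_d_given_a, 2))
```

Both sums have the same shape: enumerate the values of the variable being summed out, weight each conditional probability by how likely that value is, and add up.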

Using a for loop to generate elements of a vector

I am trying to compute with the equation
(q + N - 1)! / (q! (N - 1)!)
and I would like to store each value into a row vector. Here is my attempt:
multiA = [1];
multiB = [];
NA = 6;
NB = 4;
q = [0,1,2,3,4,5,6];
for i=2:7
multiA = [multiA(i-1), (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
%multiA = [multiA, multiA(i)];
end
multiA
But this does not work. I get the error message
Attempted to access multiA(3); index out of bounds because numel(multiA)=2.
multiA = [multiA(i-1), (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
Is my code even remotely close to what I want to achieve? What can I do to fix it?
You don't need any loop, just use the vector directly.
NA = 6;
q = [0,1,2,3,4,5,6];
multiA = factorial(q + NA - 1)./(factorial(q).*factorial(NA-1))
gives
multiA =
1 6 21 56 126 252 462
For multiple values of N, a loop isn't necessary either:
N = [6,8,10];
q = [0,1,2,3,4,5,6];
[N,q] = meshgrid(N,q)
multiA = factorial(q + N - 1)./(factorial(q).*factorial(N-1))
Also consider the following remarks from the documentation regarding the loss of accuracy for n > 21 in:
f = factorial(n)
Limitations
The result is only accurate for double-precision values of n that are less than or equal to 21. A larger value of n produces a result that has the correct order of magnitude and is accurate for the first 15 digits. This is because double-precision numbers are only accurate up to 15 digits.
For single-precision input, the result is only accurate for values of n that are less than or equal to 13. A larger value of n produces a result that has the correct order of magnitude and is accurate for the first 8 digits. This is because single-precision numbers are only accurate up to 8 digits.
Factorials of moderately large numbers can cause overflow. Two possible approaches to prevent that:
Avoid computing terms that will cancel. This approach is especially suited to the case where q is of the form 1,2,... as in your example. It also has the advantage that, for each value of q, the result for the previous value is reused, which minimizes the number of operations:
>> q = 1:6;
>> multiA = cumprod((q+NA-1)./q)
multiA =
6 21 56 126 252 462
Note that 0 is not allowed in q. But the result for 0 is just 1, so the final result would be just [1 multiA].
For q arbitrary (not necessarily of the form 1,2,...), you can use the gammaln function, which gives the logarithms of the factorials:
>> q = [0 1 2 6 3];
>> multiA = exp(gammaln(q+NA)-gammaln(q+1)-gammaln(NA))
multiA =
1.0000 6.0000 21.0000 462.0000 56.0000
You want to append a new element to the end of 'multiA':
for i=2:7
multiA = [multiA, (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
end
A function handle makes it much simpler:
%define:
omega=@(q,N)(factorial(q + N - 1))./(factorial(q).*factorial(N-1))
%use:
omega(0:6,4) %q=0..6, N=4
It might be better to use nchoosek as opposed to factorial. The latter can overflow quite easily, I'd imagine.
multiA=nan(1,7);
for i=1:7
    multiA(i)=nchoosek(q(i)+NA-1, q(i));
end
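For comparison, the binomial-coefficient and cancel-the-terms ideas from the answers above can be sketched in Python, where integers do not overflow. This is only an illustration: the function names multiplicity and multiplicities_upto are made up here, and math.comb requires Python 3.8 or later.

```python
from math import comb


def multiplicity(q, n):
    """(q + n - 1)! / (q! (n - 1)!) written as a binomial coefficient."""
    return comb(q + n - 1, q)


def multiplicities_upto(q_max, n):
    """Values for q = 0..q_max, reusing the previous term each time:
    m(q) = m(q - 1) * (q + n - 1) / q, with m(0) = 1."""
    values = [1]
    for q in range(1, q_max + 1):
        # The division is exact because the result is a binomial coefficient.
        values.append(values[-1] * (q + n - 1) // q)
    return values


NA = 6
print([multiplicity(q, NA) for q in range(7)])
print(multiplicities_upto(6, NA))
```

Both lines print the same values as the vectorized MATLAB answer, 1 6 21 56 126 252 462, without ever forming a large factorial.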

Correct way to get weighted average of concrete array-values along continous interval

I've been searching the web for a while, but I am probably missing the right terminology.
I have arbitrarily sized arrays of scalars ...
array = [n_0, n_1, n_2, ..., n_m]
I also have a function f: x -> y, with 0 <= x <= 1, where y is a value interpolated from the array. Examples:
array = [1,2,9]
f(0) = 1
f(0.5) = 2
f(1) = 9
f(0.75) = 5.5
My problem is that I want to compute the average value over some interval r = [a..b], where a ∈ [0..1] and b ∈ [0..1], i.e. I want to generalize my interpolation function f to compute the average along r.
I am slightly puzzled about how to find the right weighting. Imagine I want to compute f([0.2,0.8]):
array     -->      1     |           2           |     9
[0..1]    -->  0.00    0.25        0.50        0.75    1.00
[0.2,0.8] -->        ^_____________________________^
The latter being the range of values I want to compute the average of.
Would it be mathematically correct to compute the average like this?
        1 * (1-0.8)   <- 0.2 'translated' to [0..0.25]
      + 2 * 1
avg = + 9 * 0.2       <- 0.8 'translated' to [0.75..1]
      -----------
        1.4           <-- the sum of weights
The approach looks correct, provided each weight is the length of the part of the interval that the element covers.
In your example, the interval's length is 0.8 - 0.2 = 0.6. Within that interval, your number 2 takes up (0.75-0.25)/0.6 = 0.5/0.6 = 10/12 of the space. Your number 1 takes up (0.25-0.2)/0.6 = 0.05/0.6 = 1/12 of the space, and likewise your number 9.
This sums up to 10/12 + 1/12 + 1/12 = 1.
For better intuition, think about it like this: The problem is to determine how much space each array-element covers along an interval. The rest is just filling the machinery described in http://en.wikipedia.org/wiki/Weighted_average#Mathematical_definition .
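The "how much space does each element cover" idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the asker's interpolation function: the name interval_average is made up, and the segment boundaries 0, 0.25, 0.75, 1 are taken from the diagram in the question. With covered length as the weight, the example uses weights 0.05, 0.5 and 0.05 (out of 0.6), which differ from the 0.2, 1, 0.2 in the question because the middle element spans a wider segment.

```python
def interval_average(values, edges, a, b):
    """Average of a piecewise-constant function over [a, b].

    values[i] holds on the segment [edges[i], edges[i + 1]]; each value is
    weighted by the length of its segment that lies inside [a, b].
    """
    if not a < b:
        raise ValueError("need a < b")
    total = 0.0
    for v, lo, hi in zip(values, edges, edges[1:]):
        overlap = max(0.0, min(b, hi) - max(a, lo))
        total += v * overlap
    return total / (b - a)


values = [1, 2, 9]
edges = [0.0, 0.25, 0.75, 1.0]   # assumed segment boundaries from the diagram

print(interval_average(values, edges, 0.2, 0.8))
```

For the interval [0.2, 0.8] this gives (1 * 0.05 + 2 * 0.5 + 9 * 0.05) / 0.6, which is 2.5 up to floating-point rounding.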
