How to assign NaN to array values where j < k in MATLAB

I'm a MATLAB newbie and I would like to assign NaN to every element of an array of size (J, K, L) whose subscripts satisfy j < k or j < l. How do I do this most efficiently?

You can use bsxfun to do it efficiently:
J = (1:size(A,1)).';
K = 1:size(A,2);
L = reshape(1:size(A,3),1,1,[]);
A(bsxfun(@or,bsxfun(@lt,J,K),bsxfun(@lt,J,L))) = NaN;
In MATLAB R2016b or later (or in Octave), implicit expansion lets you simply write:
J = (1:size(A,1)).';
K = 1:size(A,2);
L = reshape(1:size(A,3),1,1,[]);
A(J<K|J<L)=NaN;
Result of a test on a matrix A = rand(500,400,300):
 __________________________________
| METHOD   | MEMORY   | SPEED      |
|==========|==========|============|
| MESHGRID | 1547 MB  | 1.24 secs  |
|----------|----------|------------|
| BSXFUN   |   57 MB  | 0.18 secs  |
|__________|__________|____________|

Use fancy vectorization:
% This may be memory-expensive for big matrices.
% Note: ndgrid (not meshgrid) keeps the dimension order matching A.
[j,k,l] = ndgrid(1:size(A,1), 1:size(A,2), 1:size(A,3));
% Tada!
A(j<k | j<l) = NaN;
If you do not have enough RAM (or do not want to use it for this), then the best option is just a loop:
for jj = 1:size(A,1)
    for k = 1:size(A,2)
        for l = 1:size(A,3)
            if (jj<k || jj<l)
                A(jj,k,l) = NaN;
            end
        end
    end
end
This will likely be slower, but doesn't need any extra memory.


Comparing rows in spark dataframe to obtain a new column

I'm a beginner in Spark and I'm dealing with a large dataset (over 1.5 million rows and 2 columns). I have to evaluate the cosine similarity of the field "features" between each pair of rows. The main problem is iterating over the rows and finding an efficient and fast method. I will have to use this method with another dataset of 42.5 million rows, and it would be a big computational problem if I don't find the most efficient way of doing it.
| post_id | features |
| -------- | -------- |
| Bkeur23 |[cat,dog,person] |
| Ksur312kd |[wine,snow,police] |
| BkGrtTeu3 |[] |
| Fwd2kd |[person,snow,cat] |
I've created an algorithm that evaluates the cosine similarity between the i-th and j-th rows, but I've only tried using lists, or creating a Spark DF / RDD for each result and merging them with the "union" function.
The function I've used to evaluate the cosine similarity is the following. It takes 2 lists as input (the lists of the i-th and j-th rows) and returns the maximum value of the cosine similarity between each pair of elements in the lists. But this is not the problem.
# imports added for completeness
import sys
from math import sqrt
import numpy as np
import tensorflow as tf

def cosineSim(lista1, lista2, embed):
    # embed = hub.KerasLayer(os.getcwd())
    eps = sys.float_info.epsilon
    if (lista1 is not None) and (lista2 is not None):
        if (len(lista1) > 0) and (len(lista2) > 0):
            risultati = {}
            for a in lista1:
                x = tf.constant([a])
                embeddings = embed(x)
                x1 = np.asarray(embeddings)[0].tolist()
                for b in lista2:
                    x = tf.constant([b])
                    embeddings = embed(x)
                    x2 = np.asarray(embeddings)[0].tolist()
                    dot = 0      # renamed from `sum` to avoid shadowing the builtin
                    suma1 = 0
                    sumb1 = 0
                    for i, j in zip(x1, x2):
                        suma1 += i * i
                        sumb1 += j * j
                        dot += i * j
                    cosine_sim = dot / (sqrt(suma1) * sqrt(sumb1) + eps)
                    risultati[a + '-' + b] = cosine_sim
            risultati = max(risultati.values())
            return risultati
The function I'm using to iterate over the rows is the following one:
# imports added for completeness
from itertools import islice
from pyspark.sql.types import StructType, StructField, StringType, FloatType

def iterazione(df, numero, embed):
    a = 1
    k = 1
    emp_RDD = spark.sparkContext.emptyRDD()
    columns1 = StructType([StructField('Source', StringType(), False),
                           StructField('Destination', StringType(), False),
                           StructField('CosinSim', FloatType(), False)])
    first_df = spark.createDataFrame(data=emp_RDD, schema=columns1)
    for i in df:
        for j in islice(df, a, None):
            r = cosineSim(i[1], j[1], embed)
            if r > 0.45:
                z = spark.createDataFrame(data=[(i[0], j[0], r)], schema=columns1)
                first_df = first_df.union(z)
            k = k + 1
            if k == numero:
                k = a + 1
                a = a + 1
    return first_df
The output I desire is something like this:
| Source | Dest | CosinSim |
| -------- | ---- | ------ |
| Bkeur23 | Ksur312kd | 0.93 |
| Bkeur23 | Fwd2kd | 0.673 |
| Ksur312kd | Fwd2kd | 0.76 |
But there is a problem in my "iterazione" function.
I'm asking for help in finding the best way to iterate over all these rows. I was also thinking about copying the column "features" as "features2" and applying my function using withColumn, but I don't know how to do that or whether it would work. I want to know if there's some way to do it directly on a Spark dataframe, avoiding the creation of other datasets and merging them later, or if you know some faster and more efficient method. Thank you!
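For what it's worth, here is a minimal sketch of the self-join idea mentioned above (my own illustration, not tested at scale): it assumes an existing SparkSession named spark, the cosineSim function and embed model from the question, and a dataframe df with columns post_id and features. The join condition keeps each unordered pair exactly once, so no manual index bookkeeping is needed.
from pyspark.sql import functions as F
from pyspark.sql.types import FloatType

# Wrap the question's cosineSim as a UDF. Caveat: `embed` must be available on
# the executors (e.g. re-created inside the UDF); this sketch glosses over that.
cosine_udf = F.udf(lambda l1, l2: float(cosineSim(l1, l2, embed) or 0.0),
                   FloatType())

df2 = df.select(F.col("post_id").alias("Destination"),
                F.col("features").alias("features2"))

pairs = (df.select(F.col("post_id").alias("Source"), "features")
           .join(df2, F.col("Source") < F.col("Destination"))  # each pair once
           .withColumn("CosinSim", cosine_udf("features", "features2"))
           .filter(F.col("CosinSim") > 0.45)
           .select("Source", "Destination", "CosinSim"))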

Algorithm complexity of traversing a 3D array

I have an algorithm that traverses a 3D array. For every value in the array I do some computations. I'm trying to figure out the time complexity of the algorithm. In my case it's not a complete traversal: some values of the array are not considered.
def process_matrix(k: int, V: int):
    import numpy as np
    sp_matrix = np.zeros((V, V, k))
    for e in range(k):
        for i in range(V):
            # Note that the range of index j shrinks as index i grows
            for j in range(i, V):
                # The range of index a also shrinks according to index i
                for a in range(i, V):
                    if something:   # placeholder condition from the question
                        sp_matrix[i][j][e] = set_some_value()
As you can see, I'm not considering the values j < i for any index e.
If we take only the three outermost loops, I think the complexity is:
V * (1+V)/2 * k
k -> the outermost loop
V*(1+V)/2 -> for the second and third loops I used Gauss's formula for summing consecutive numbers
With some approximation, the complexity of these three loops is, I think, O(((V^2)/2)*k).
At first I thought the inner loop contributes another factor of (1+V)/2 to the O, giving (V * (1+V)/2 * k) * (1+V)/2. But then I considered this situation:
k = 1
V = 3
The resulting array is:
      j=0   j=1   j=2
i=0 |  3  |  3  |  3  |
i=1 |  x  |  2  |  2  |
i=2 |  x  |  x  |  1  |
(the values in the matrix represent how many times the innermost loop runs)
The total is: 3+3+3+2+2+1 = 14
I expect the same value using my formula (V * (1+V)/2 * k) * (1+V)/2:
(3*(1+3)/2*1) * (1+3)/2 = 12
But it's not...
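The discrepancy comes from the fact that, for a fixed i, both the j loop and the a loop run (V - i) times, so the total is a sum of squares rather than a product of two Gauss sums. A minimal Python check (my own, not from the original post):
# Count the innermost iterations directly: for each i, both j and a take
# V - i values, contributing (V - i)^2 iterations.
def exact_count(k, V):
    return k * sum((V - i) ** 2 for i in range(V))

print(exact_count(1, 3))                # -> 14, matching 3+3+3+2+2+1
print(1 * 3 * (3+1) * (2*3+1) // 6)     # closed form k*V*(V+1)*(2V+1)/6 -> 14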
Big-O notation is about setting upper limits, ignoring constant factors. In this sense, and considering the 3 outer loops, O(((V^2)/2)*k) = O(k * V^2), and if k is constant, = O(V^2).
However, if you start to count executions of the innermost code, and compare those against your expected number of executions, you are leaving big-O territory, since constant factors can no longer be ignored. Also, counting executions of a single instruction, while useful, is by no means as exact as measuring real-world performance (which, however, will depend on the exact workload and machine/environment you test it on).
Since your 3 inner loops are essentially drawing a tetrahedron, you can use its formula to get an approximation of the complexity: O(V^3 / 3). But if you want an exact count, I have successfully tested the following JS code:
let K=1, V=6, t=0;                 // t counts totals; reset to 0 for each try
for (let k=0; k<K; k++)            // inside repeats K times
  for (let i=0; i<V; i++)          // inside repeats V*K times
    for (let j=i; j<V; j++)        // inside repeats (V+1)*V/2 * K times
      for (let a=i; a<V; a++)      // inside repeats (V+1)*(V+.5)*V/3 * K times
        console.log(k, i, j, a, ++t, (V+1)*(V+.5)*V/3 * K);

End loop when significant value found: Stata?

Could you help me figure out how to tell Stata to end a loop over iterations when it finds the first positive and significant value of a particular coefficient in a regression?
Here is a small sample using a publicly available dataset that shows what I am trying to do. In the following case, I want Stata to stop looping when it finds the "year" coefficient to be positive and significant.
set more off
clear all
clear matrix
use http://www.stata-press.com/data/r13/abdata
forvalues i = 1/8 {
    xtabond n w k ys year, lags(`i') noconstant
    matrix b = e(b)'
    if `i'==1 matrix byear = b["year",1]
    else matrix byear = (byear \ b["year",1])
}
Could you please help me figure out how to tell Stata to stop looping once the condition is met?
Thank you
Here is some code that seems to do what you want. I had to set the confidence level to 80 (from the default of 95) so it would terminate before it exceeded the maximum number of lags.
set more off
clear all
clear matrix
set level 80
use http://www.stata-press.com/data/r13/abdata
forvalues i = 1/8 {
    quietly xtabond n w k ys year, lags(`i') noconstant
    matrix t = r(table)
    scalar b = t[rownumb(t,"b"), colnumb(t,"year")]
    scalar p = t[rownumb(t,"pvalue"), colnumb(t,"year")]
    scalar r = 1 - r(level)/100
    scalar q = (b>0) & (p<=r)
    if q {
        display "success with `i' lags"
        display "b: " b "  p: " p "  r: " r "  q: " q
        xtabond
        continue, break
    }
    else {
        display "no luck with `i' lags"
    }
}
which yields
no luck with 1 lags
success with 2 lags
b: .00759529 p: .18035747 r: .2 q: 1
Arellano-Bond dynamic panel-data estimation Number of obs = 611
Group variable: id Number of groups = 140
Time variable: year
Obs per group:
min = 4
avg = 4.364286
max = 6
Number of instruments = 31 Wald chi2(6) = 1819.55
Prob > chi2 = 0.0000
One-step results
------------------------------------------------------------------------------
n | Coef. Std. Err. z P>|z| [80% Conf. Interval]
-------------+----------------------------------------------------------------
n |
L1. | .3244849 .0774312 4.19 0.000 .1727225 .4762474
L2. | -.0266879 .0363611 -0.73 0.463 -.0979544 .0445785
|
w | -.5464779 .0562155 -9.72 0.000 -.6566582 -.4362975
k | .360622 .0330634 10.91 0.000 .2958189 .4254252
ys | .5948084 .0818672 7.27 0.000 .4343516 .7552652
year | .0075953 .0056696 1.34 0.180 -.0035169 .0187075
------------------------------------------------------------------------------
Instruments for differenced equation
GMM-type: L(2/.).n
Standard: D.w D.k D.ys D.year
.
end of do-file

Find product of integers at interval of X and update value at position 'i' in an array for N queries

I am given an array of integers of length up to 10^5, and I want to perform the following operations on the array:
1 -> Update the value of the array at any position i. (1 <= i <= n)
2 -> Get the product of the numbers at indexes 0, X, 2X, 3X, 4X, ... (J * X <= n)
The number of operations will be up to 10^5.
Is there any O(log n) approach to answer queries and update values?
(My original thought was to use a Segment Tree, but I think it is not needed...)
Let N = 10^5 and A := the original array, of size N.
We use 0-based indexing below.
Make a new array B of integers, of length up to M = N lg N:
The first integer is equal to A[0];
the next N integers are those of A at indexes 1, 2, 3, ...; I call this group 1;
the next N/2 integers are those at indexes 2, 4, 6, ...; I call this group 2;
the next N/3 integers are those at indexes 3, 6, 9, ...; I call this group 3.
Here is an example of visualized B:
B = [A[0] | A[1], A[2], A[3], A[4] | A[2], A[4] | A[3] | A[4]]
I think the original thought can be used without even using a Segment Tree.
(It is overkill: for operation 2 we always query a specific range of B rather than an arbitrary range, i.e. we do not need that much flexibility and complexity in the data structure.)
You can create the new array B described above, and also create another array C of length M, where C[i] := the product of group i.
For operation 1, simply spend O(# factors of i) to see which group(s) you need to update, and update the values in both B and C (i.e. C[x] = C[x] / old B[y] * new B[y]).
For operation 2, just output the corresponding C[i].
Not sure if I am wrong, but this should be even faster and should pass the judge, if the original idea is correct but got TLE.
As the OP has added a new condition (for operation 2, we need to multiply in A[0] as well), we can special-case it. Here is my thought:
just declare a new variable z = A[0]; for operation 1, if it updates index 0, update this variable; for operation 2, query using the same method above and multiply by z afterwards.
I have updated my answer, so now I simply use the first element of B to represent A[0].
Example
A = {1,4,6,2,8,7}
B = {1 | 4,6,2,8,7 | 6,8 | 2 | 8 | 7 } // O(N lg N)
C = {1 | 2688 | 48 | 2 | 8 | 7 } // O (Nlg N)
factorization of all possible indexes X (X is an index, so <= N) // O(N*sqrt(N))
operation 1:
update A[4] to 5: factors = 1,2,4 // number of factors of the index, ~ O(sqrt(N))
which means updating groups 1, 2, 4, i.e. the corresponding elements in B & C.
Locating the corresponding elements in B & C may be a bit tricky,
but that should not increase the complexity.
B = {1 | 4,6,2,5,7 | 6,5 | 2 | 5 | 7 } // O(sqrt(N))
C = {1 | 2688/8*5 | 48/8*5 | 2 | 8/8*5 | 7 } // O(sqrt(N))
update A[0] to 2:
B = {2 | 4,6,2,5,7 | 6,5 | 2 | 5 | 7 } // O(1)
C = {2 | 2688/8*5 | 48/8*5 | 2 | 8/8*5 | 7 } // O(1)
// Now A is actually {2,4,6,2,5,7}
operation 2:
X = 3
C[3] * C[0] = 2*2 = 4 // O(1)
X = 2
C[2] * C[0] = 30*2 = 60 // O(1)
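Here is a minimal Python sketch of this structure (my own illustration, with made-up names; it assumes nonzero integers so the running products can be divided exactly, and it keeps A itself rather than materializing B, since C is all that operation 2 needs):
class StrideProducts:
    def __init__(self, A):
        self.A = A[:]
        n = len(A) - 1
        self.C = [A[0]]                      # C[0] stands for A[0]
        for x in range(1, n + 1):            # group x holds A[x], A[2x], ...
            prod = 1
            for j in range(x, n + 1, x):
                prod *= A[j]
            self.C.append(prod)              # O(N lg N) total work

    def update(self, i, value):              # operation 1
        if i == 0:
            self.C[0] = value
        else:
            x = 1
            while x * x <= i:                # enumerate divisors in O(sqrt(i))
                if i % x == 0:
                    for g in {x, i // x}:    # i is in group g iff g divides i
                        self.C[g] = self.C[g] // self.A[i] * value
                x += 1
        self.A[i] = value

    def query(self, X):                      # operation 2: A[0]*A[X]*A[2X]*...
        return self.C[0] * self.C[X]

sp = StrideProducts([1, 4, 6, 2, 8, 7])
sp.update(4, 5)                              # factors 1, 2, 4 -> groups 1, 2, 4
sp.update(0, 2)
print(sp.query(3))                           # -> 4, as in the worked example
print(sp.query(2))                           # -> 60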

Fast Hypotenuse Algorithm for Embedded Processor?

Is there a clever/efficient algorithm for determining the hypotenuse of a right triangle (i.e. sqrt(a² + b²)), using fixed-point math on an embedded processor without hardware multiply?
If the result doesn't have to be particularly accurate, you can get a crude
approximation quite simply:
Take absolute values of a and b, and swap if necessary so that you have a <= b. Then:
h = ((sqrt(2) - 1) * a) + b
To see intuitively how this works, consider the way that a shallow angled line is plotted on a pixel display (e.g. using Bresenham's algorithm). It looks something like this:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | | | | | | | | | | | | |*|*|*| ^
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| | | | | | | | | | | | |*|*|*|*| | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| | | | | | | | |*|*|*|*| | | | | | | | a pixels
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| | | | |*|*|*|*| | | | | | | | | | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
|*|*|*|*| | | | | | | | | | | | | | | | v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
<-------------- b pixels ----------->
For each step in the b direction, the next pixel to be plotted is either immediately to the right, or one pixel up and to the right.
The ideal line from one end to the other can be approximated by the path which joins the centre of each pixel to the centre of the adjacent one. This path consists of a segments of length sqrt(2) (the diagonal steps) and b - a segments of length 1 (taking a pixel to be the unit of measurement). Hence the formula above.
This clearly gives an accurate answer for a == 0 and a == b; but gives an over-estimate for values in between.
The error depends on the ratio b/a; the maximum error occurs when b = (1 + sqrt(2)) * a and turns out to be 2/sqrt(2+sqrt(2)), or about 8.24% over the true value. That's not great, but if it's good enough for your application, this method has the advantage of being simple and fast. (The multiplication by a constant can be written as a sequence of shifts and adds.)
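To make the shifts-and-adds remark concrete, here is a sketch in Python (my own; the integer shifts translate line-for-line to C on a fixed-point MCU). It uses (sqrt(2) - 1) ≈ 1/2 - 1/16 - 1/64 - 1/128 = 0.4140625, so no multiply is needed:
def approx_hypot(a, b):
    a, b = abs(a), abs(b)
    if a > b:
        a, b = b, a                          # ensure a <= b
    # (sqrt(2) - 1) * a via shifts: a/2 - a/16 - a/64 - a/128
    return b + (a >> 1) - (a >> 4) - (a >> 6) - (a >> 7)

print(approx_hypot(300, 400))                # -> 526 vs. the true 500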
For the record, here are a few more approximations, listed in roughly
increasing order of complexity and accuracy. All these assume 0 ≤ a ≤ b.
h = b + 0.337 * a // max error ≈ 5.5 %
h = max(b, 0.918 * (b + (a>>1))) // max error ≈ 2.6 %
h = b + 0.428 * a * a / b // max error ≈ 1.04 %
Edit: to answer Ecir Hana's question, here is how I derived these
approximations.
First step. Approximating a function of two variables can be a
complex problem. Thus I first transformed this into the problem of
approximating a function of one variable. This can be done by choosing
the longest side as a “scale” factor, as follows:
h = √(b² + a²)
  = b √(1 + (a/b)²)
  = b f(a/b)    where f(x) = √(1 + x²)
Adding the constraint 0 ≤ a ≤ b means we are only concerned with
approximating f(x) in the interval [0, 1].
Below is the plot of f(x) in the relevant interval, together with the
approximation given by Matthew Slattery (namely (√2−1)x + 1).
Second step. The next step is to stare at this plot while asking
yourself the question “how can I approximate this function cheaply?”.
Since the curve looks roughly parabolic, my first idea was to use a
quadratic function (the third approximation). But since this is still
relatively expensive, I also looked at linear and piecewise-linear
approximations, which gave the three solutions listed above.
The numerical constants (0.337, 0.918 and 0.428) were initially free
parameters. The particular values were chosen in order to minimize the
maximum absolute error of the approximations. The minimization could
certainly be done by some algorithm, but I just did it “by hand”,
plotting the absolute error and tuning the constant until it is
minimized. In practice this works quite fast. Writing the code to
automate this would have taken longer.
Third step is to come back to the initial problem of approximating a
function of two variables:
h ≈ b (1 + 0.337 (a/b)) = b + 0.337 a
h ≈ b max(1, 0.918 (1 + (a/b)/2)) = max(b, 0.918 (b + a/2))
h ≈ b (1 + 0.428 (a/b)²) = b + 0.428 a²/b
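If you want to reproduce the quoted error bounds, here is a small verification sketch (my own, in Python): it scans the ratio x = a/b over [0, 1] and measures each estimate of f(x) = sqrt(1 + x^2) against the true value.
from math import sqrt

approximations = {
    "b + 0.337*a":           lambda x: 1 + 0.337 * x,
    "max(b, 0.918*(b+a/2))": lambda x: max(1.0, 0.918 * (1 + x / 2)),
    "b + 0.428*a*a/b":       lambda x: 1 + 0.428 * x * x,
}
for name, g in approximations.items():
    worst = max(abs(g(i / 10000) / sqrt(1 + (i / 10000) ** 2) - 1)
                for i in range(10001))
    print(f"{name:24s} max error ~ {100 * worst:.2f} %")
# prints roughly 5.53 %, 2.64 % and 1.04 %, matching the figures above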
Consider using CORDIC methods. Dr. Dobb's has an article and associated library source here. Square-root, multiply and divide are dealt with at the end of the article.
One possibility looks like this:
#include <math.h>

/* Iterations    Accuracy
 *  2             6.5 digits
 *  3            20 digits
 *  4            62 digits
 * assuming a numeric type able to maintain that degree of accuracy in
 * the individual operations.
 */
#define ITER 3

double dist(double P, double Q) {
    /* A reasonably robust method of calculating `sqrt(P*P + Q*Q)'
     *
     * Transliterated from _More Programming Pearls, Confessions of a Coder_
     * by Jon Bentley, pg. 156.
     */
    double R;
    int i;

    P = fabs(P);
    Q = fabs(Q);

    if (P < Q) {
        R = P;
        P = Q;
        Q = R;
    }

    /* The book has this as:
     *   if P = 0.0 return Q; # in AWK
     * However, this makes no sense to me - we've just ensured that P >= Q, so
     * P == 0 only if Q == 0; OTOH, if Q == 0, then distance == P...
     */
    if (Q == 0.0)
        return P;

    for (i = 0; i < ITER; i++) {
        R = Q / P;
        R = R * R;
        R = R / (4.0 + R);
        P = P + 2.0 * R * P;
        Q = Q * R;
    }
    return P;
}
This still does a couple of divides and four multiplies per iteration, but you rarely need more than three iterations (and two is often adequate) per input. At least on most processors I've seen, that will generally be faster than sqrt would be on its own.
For the moment it's written for doubles, but assuming you've implemented the basic operations, converting it to work with fixed point shouldn't be terribly difficult.
Some doubts have been raised by the comment about "reasonably robust". At least as originally written, this was basically a rather backhanded way of saying that "it may not be perfect, but it's still at least quite a bit better than a direct implementation of the Pythagorean theorem."
In particular, when you square each input, you need roughly twice as many bits to represent the squared result as you did to represent the input value. After you add (which needs only one extra bit) you take the square root, which gets you back to needing roughly the same number of bits as the inputs. Unless you have a type with substantially greater precision than the inputs, it's easy for this to produce really poor results.
This algorithm doesn't square either input directly. It is still possible for an intermediate result to underflow, but it's designed so that when it does, the result still comes out as well as the format in use supports. Basically, the situation in which it happens is an extremely acute triangle (e.g., something like 90 degrees, 0.000001 degrees, and 89.999999 degrees). If it's close enough to 90, 0, 90, we may not be able to represent the difference between the two longer sides, so it'll compute the hypotenuse as being the same length as the other long side.
By contrast, when the Pythagorean theorem fails, the result will often be a NaN (i.e., tells us nothing) or, depending on the floating point format in use, quite possibly something that looks like a reasonable answer, but is actually wildly incorrect.
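To make the range argument concrete, here is a small float analogue (my own demonstration, in Python; in fixed point the same effect shows up as overflow or catastrophic truncation): squaring the inputs needs twice the range of the result, while the scaled iteration stays in range.
from math import sqrt

def naive_hypot(p, q):
    return sqrt(p * p + q * q)               # p*p overflows double range

def dist(p, q, iters=3):                     # same iteration as the C code
    p, q = abs(p), abs(q)
    if p < q:
        p, q = q, p
    if q == 0.0:
        return p
    for _ in range(iters):
        r = q / p
        r = r * r
        r = r / (4.0 + r)
        p = p + 2.0 * r * p
        q = q * r
    return p

print(naive_hypot(3e200, 4e200))             # -> inf
print(dist(3e200, 4e200))                    # -> 5e+200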
You can start by reevaluating if you need the sqrt at all. Many times you are calculating the hypotenuse just to compare it to another value - if you square the value you're comparing against you can eliminate the square root altogether.
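For instance (a trivial sketch of this suggestion, with a hypothetical within_radius helper):
def within_radius(a, b, limit):
    return a * a + b * b <= limit * limit    # compare squares: no sqrt needed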
Unless you're doing this at more than about 1 kHz, multiply even on an MCU without hardware MUL isn't terrible. What's much worse is the sqrt. I would try to modify my application so it doesn't need to calculate it at all.
Standard libraries would probably be best if you actually need it, but you could look at using Newton's method as a possible alternative. It would require several multiply/divide cycles to perform, however.
AVR resources
Atmel App note AVR200: Multiply and Divide Routines (pdf)
This sqrt function on AVR Freaks forum
Another AVR Freaks post
Maybe you could use some of Elm Chan's assembler libraries and adapt the ihypot function to your ATtiny. You would need to replace the MUL and maybe (I haven't checked) some other instructions.
