Count the number of elements in an array fulfilling a simple condition

I am just jumping back into R after a long time away and I am surprised by how simple some of the things are to do. I have created 3 arrays:
Xs = runif(N, min=-1, max=1);
Ys = runif(N, min=-1, max=1);
Rs = sqrt( Xs^2 + Ys^2 );
where Xs and Ys together define N points within the (-1,1) square and Rs is the vector of the distances of these points from the origin.
If I want to count the number of elements in Rs which are less than or equal to 1, is there a simple inline command to do this?

sum( Rs <= 1 )
Rs <= 1 yields a logical vector. TRUE equals 1; FALSE equals 0.
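For readers more used to Python, the same idea (summing booleans) can be sketched as follows. This is only an illustration: the helper name count_at_most and the value N = 1000 are assumptions, not part of the original post.

```python
import math
import random

def count_at_most(values, limit):
    """Count the elements of `values` that are <= limit."""
    # Each comparison yields a bool; True counts as 1 and False as 0 when summed.
    return sum(v <= limit for v in values)

N = 1000  # assumed sample size, the original post does not state N
xs = [random.uniform(-1, 1) for _ in range(N)]
ys = [random.uniform(-1, 1) for _ in range(N)]
rs = [math.hypot(x, y) for x, y in zip(xs, ys)]  # distance from the origin

inside = count_at_most(rs, 1)
print(inside)
```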

Related

Maximum adjacent product sum (Interview question)

We have an array of integers, where the integer at each position is that position's value. Each time a position is selected, you earn its value multiplied by the values of its adjacent positions (left and right). After a position has been selected, it is removed from the array, and its left and right neighbours become adjacent to each other.
If an adjacent position does not exist, assume a value of 1 for it. For example, if only a single position is left and you select it, its value is multiplied by 1 for both the left and the right neighbour.
Find the maximum amount that can be earned after selecting all positions.
I have implemented a dynamic programming approach using the following recurrence relation. First, observe that if at some step of the process we multiply arr[position_p] and arr[position_q], then all positions between position_p and position_q, if any, must already have been chosen.
For simplicity let us assume array indices start from 1 and position 0 and position n+1 contain value 1 in accordance with the question, where n is the number of elements in array.
So we need to select positions p+1 to q-1 in such an order that maximizes the amount.
Using this, we obtain recurrence relation :
If f(p,q) is maximum amount obtained by choosing only from positions p+1 to q-1, then we have :
f(p, q) = max ( f(p,k) + f(k,q) + arr[p] * arr[k] * arr[q] ) for k between p and q (Excluding p and q)
where k is last position chosen from positions p+1 to q-1 before choosing either p or q
And here is the python implementation :
import numpy as np

n = int(input("Enter the no. of inputs : "))
arr = [1]
arr = arr + list( map( int, input("Enter the list : ").split() ) )
arr.append(1)
# matrix created to memoize values instead of recomputing
mat = np.zeros( (n+2, n+2), dtype = "i8" )
# Bottom-up dynamic programming approach
for row in range ( n + 1, -1, -1 ) :
    for column in range ( row + 2, n + 2 ) :
        # This initialization to zero may not work when there are negative integers in the list.
        max_sum = 0
        # Recurrence relation
        # mat[row][column] should have the maximum product sum from indices row+1 until column-1
        # And arr[row] and arr[column] are boundary values for sub_array
        # By above notation, if column <= row + 1, then there would be no elements between them and thus mat[row][column] should remain zero
        for k in range ( row + 1 , column ) :
            max_sum = max( max_sum, mat[row][k] + mat[k][column] + ( arr[row] * arr[k] * arr[column] ) )
        mat[row][column] = max_sum
print(mat[0][n+1])
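The comment in the code above notes that initializing max_sum to zero may not work when the list contains negative integers. One possible way around that, as a sketch only, is to take the maximum directly over the candidate values of k and leave empty intervals at zero. The function name max_product_sum and the use of plain lists instead of NumPy are assumptions made for this illustration.

```python
def max_product_sum(values):
    """Maximum amount for the recurrence
    f(p, q) = max over p < k < q of f(p, k) + f(k, q) + arr[p] * arr[k] * arr[q],
    with sentinel values of 1 at both ends."""
    n = len(values)
    arr = [1] + list(values) + [1]
    # best[p][q] is the maximum for positions p+1 .. q-1; empty intervals stay 0.
    best = [[0] * (n + 2) for _ in range(n + 2)]
    for row in range(n - 1, -1, -1):
        for column in range(row + 2, n + 2):
            # No zero initialization here, so negative totals are kept as they are.
            best[row][column] = max(
                best[row][k] + best[k][column] + arr[row] * arr[k] * arr[column]
                for k in range(row + 1, column)
            )
    return best[0][n + 1]

print(max_product_sum([3, 1, 5, 8]))
```

The time and space complexity are unchanged, O(n^3) and O(n^2); only the handling of negative values differs.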
I came across this question some time ago in the programming round of an interview. Although my solution seems to work, it has O(n^3) time complexity and O(n^2) space complexity.
Can I do better? What about the case when all values in the array are positive (the original question assumes this)? Any help with reducing the space complexity is also appreciated.
Thank you.
Edit :
Though this is no proof, as suggested by @risingStark, I have also seen the same question on LeetCode, where all the correct algorithms seem to use O(n^2) space and run in O(n^3) time for the general case.

A matrix and a column vector containing indices, how to iterate with no loop?

I have a big matrix (500212x7) and a column vector like below
matrix F    vector P
0001011     4
0001101     3
1101100     6
0000110     1
1110000     7
The vector contains indices considered within the matrix rows. P(1) is meant to point at F(1,4), P(2) at F(2,3) and so on.
In each row of F, I want to negate the bit in the column pointed to by the corresponding element of P.
I thought of things like
F(:,P(1)) = ~F(:,P(1));
F(:,P(:)) = ~F(:,P(:));
but of course these won't produce the result I expect: the first line never moves on from the first element of P, and the second one won't even let me run the program, because a full vector cannot be used as an index.
The idea is that I need to do this for all rows of F and P (advancing through both "simultaneously"), using the value of the P element as the column.
I know this is easily achieved with a for loop, but due to the large dimensions of the F array, such a solution is completely unacceptable.
Is there any kind of Matlab wizardry that solves such a task using matrix operations?

"I know this is easily achieved with for loop but due to large dimensions of the F array such a way to solve the problem is completely unacceptable."
You should never make such an assumption. First implement the loop, then check to see if it really is too slow for you or not, then worry about optimizing.
Here I'm comparing Luis' answer and the trivial loop:
N = 500212;
F = rand(N,7) > 0.6;
P = randi(7,N,1);
timeit(@()method1(F,P))
timeit(@()method2(F,P))

function F = method1(F,P)
    ind = (1:size(F,1)) + (P(:).'-1)*size(F,1); % create linear index
    F(ind) = ~F(ind); % negate those entries
end

function F = method2(F,P)
    for ii = 1:numel(P)
        F(ii,P(ii)) = ~F(ii,P(ii));
    end
end
Timings are 0.0065 s for Luis' answer, and 0.0023 s for the loop (MATLAB Online R2019a).
Especially for very large arrays, loops can be faster than vectorized code when the vectorization requires creating an intermediate array. Memory access is expensive, and using more memory makes the code slower.
Lessons: don't dismiss loops, don't prematurely try to optimize, and don't optimize without comparing.

Another solution:
xor( F, 1:7 == P )
Explanation:
1:7 == P generates one-hot arrays.
xor will cause a bit to retain its value against a 0, and flip it against a 1
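The one-hot and xor idea can be illustrated outside MATLAB as well. The following is a minimal Python sketch on the small example from the question, assuming F is a list of rows of 0/1 integers and P holds 1-based column numbers; the function name flip_with_one_hot is hypothetical.

```python
def flip_with_one_hot(F, P):
    """Flip, in each row of F, the bit in the (1-based) column given by P."""
    n_cols = len(F[0])
    flipped = []
    for row, p in zip(F, P):
        # One-hot mask: 1 in column p, 0 everywhere else (like 1:7 == P for one row).
        one_hot = [1 if c == p else 0 for c in range(1, n_cols + 1)]
        # XOR keeps a bit against 0 and flips it against 1.
        flipped.append([bit ^ mask for bit, mask in zip(row, one_hot)])
    return flipped

F = [
    [0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0],
]
P = [4, 3, 6, 1, 7]
for row in flip_with_one_hot(F, P):
    print(row)
```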

Not sure if it qualifies as wizardry, but linear indexing does exactly what you want:
F = [0 0 0 1 0 1 1; 0 0 0 1 1 0 1; 1 1 0 1 1 0 0; 0 0 0 0 1 1 0; 1 1 1 0 0 0 0];
P = [4; 3; 6; 1; 7];
ind = (1:size(F,1)) + (P(:).'-1)*size(F,1); % create linear index
F(ind) = ~F(ind); % negate those entries
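To see what the linear index computes, here is a small Python sketch that mimics MATLAB's column-major storage with a flat list. It is an illustration under assumptions: the helper names column_major_index and flip_by_linear_index are made up, and indices are kept 1-based to match the MATLAB expression row + (column-1)*size(F,1).

```python
def column_major_index(rows, cols, n_rows):
    """1-based column-major linear indices: row + (col - 1) * n_rows."""
    return [r + (c - 1) * n_rows for r, c in zip(rows, cols)]

def flip_by_linear_index(F, P):
    n_rows, n_cols = len(F), len(F[0])
    # Flatten column by column, the order in which MATLAB stores a matrix.
    flat = [F[r][c] for c in range(n_cols) for r in range(n_rows)]
    for ind in column_major_index(range(1, n_rows + 1), P, n_rows):
        flat[ind - 1] = 1 - flat[ind - 1]  # negate the addressed entry
    # Rebuild the row-wise list-of-lists view.
    return [[flat[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]

F_example = [
    [0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0],
]
P_example = [4, 3, 6, 1, 7]
print(column_major_index(range(1, 6), P_example, 5))
```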

Find longest slices of an array that contain distinct values

I have a very long one-dimensional array of positive integers. Starting from one end, I need to find the longest slices/chunks of the array in which every value is more than one unit away from all the other values in that slice.
i.e., I want to make a partitioning of the array (starting from the left) such that each partition contains elements that are more than one unit away from all the other elements in that partition.
Eg:
[1,1,9,5,3,8,7,4,1,2] -> [1],[1,9,5,3],[8],[7,4,1],[2]
[1,5,9,1,3,6,4,2,7,0] -> [1,5,9],[1,3,6,4],[2,7,0]
Below, I've written a little code in Fortran that will let me find the first such point of recurrence of a previous value.
mask is a LOGICAL array
array is the array in question
n is the length of the array
I can easily extend this to find the full partitioning.
mask = .FALSE.
DO i = 1,n
  k = array(i)
  IF ( mask(k) ) THEN
    PRINT*, i
    EXIT
  ELSE
    mask(k-1 : k+1) = .TRUE.
  END IF
END DO
So my question is: is there a better way (algorithm) to do this? By better, I mean faster. I don't mind a memory cost.
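For reference, the mask idea can be extended to the full partitioning in a single pass. The sketch below is in Python rather than Fortran and uses a set in place of the LOGICAL mask, which is an assumption made for brevity; the name partition_distinct is hypothetical.

```python
def partition_distinct(values):
    """Greedy left-to-right partition in which every value in a slice is
    more than one unit away from all other values in that slice."""
    parts = []
    current = []
    blocked = set()  # plays the role of the LOGICAL mask
    for v in values:
        if v in blocked:
            # v is within one unit of a value already in the slice: start a new one.
            parts.append(current)
            current = []
            blocked = set()
        current.append(v)
        blocked.update((v - 1, v, v + 1))
    if current:
        parts.append(current)
    return parts

print(partition_distinct([1, 1, 9, 5, 3, 8, 7, 4, 1, 2]))
```

Each element is visited once, so the cost is linear in the array length, assuming constant-time set operations. Applying the rule as stated (more than one unit apart), the second example array splits before the 4, because 3 and 4 are only one unit apart.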

Conceptually it could look something like this...
DO i = 1,n-1
  Delta(I) = array(I+1) - array(I)
ENDDO
iMask = 0
WHERE(ABS(Delta) < 2) iMask =1
ALLOCATE(splits(SUM(iMask)))
K=0
DO I = 1, n-1
  IF(iMask(I) == 0) CYCLE
  K = K +1
  Splits(K) = I
ENDDO
!... DEALLOCATE(Splits)
Then just print out the data between the split values. These could be off by one, and you may also need to do something for the Nth point, so it depends a bit on your implementation and whether your delta is "to the next point" or "from the last point".
In this case I used imask as an integer rather than a logical so I could use SUM.

My initial reaction is the naive approach:
1. Save the index bounds of the partition you're currently expanding (partitionNumber, from iStart to iEnd).
2. Take the next point, with index iEnd+1, and loop from iStart to iEnd, testing that the candidate point is not within 1 of any current member.
3. If the candidate fails the inclusion test, start it in its own partition by resetting iStart and incrementing partitionNumber.
4. Increment iEnd.
If you're expecting the partitions to mostly be quite short then this should be pretty quick. If you're expecting long chains of increasing or decreasing integers, you could save the min and max of the values in the partition and include a quick test to see if your candidate is outside that range.
I've not tested this and my fortran might be a bit rusty, but I think it represents the above algorithm.
partitionNumber = 1
iStart = 1
iEnd = 1
iCandidate = iEnd + 1
arrayMember(iStart) = partitionNumber
DO WHILE (iCandidate <= N)
  DO j = iStart,iEnd
    ! candidate is within 1 of a current member: start a new partition
    IF ( ABS(array(iCandidate)-array(j)) < 2 ) THEN
      partitionNumber = partitionNumber + 1
      iStart = iCandidate
      EXIT
    END IF
  END DO
  arrayMember(iCandidate) = partitionNumber
  iEnd = iEnd + 1
  iCandidate = iEnd + 1
END DO
Operating on your two examples, I would hope it to return arrayMember with entries
[1,1,9,5,3,8,7,4,1,2] -> [1,2,2,2,2,3,4,4,4,5] (represents [1],[1,9,5,3],[8],[7,4,1],[2])
[1,5,9,1,3,6,4,2,7,0] -> [1,1,1,2,2,2,3,3,3,3] (represents [1,5,9],[1,3,6],[4,2,7,0])
I'm not entirely sure I understand how you would extend your version to all partitions, but this might save on defining mask of size MAX(array)?

Array Manipulation - randomly choosing elements

Suppose I have an array of length N. I want to choose n positions randomly, make them zero and then add the existing elements to the next non-zero element.
For example, suppose r = (r1,r2,r3,r4,r5), N = 5. Let n = 2. And the randomly picked positions are 3rd and 4th. Then I want to transform r to
r_new = (r1, r2, 0, 0, r3+r4+r5).
Instead if the randomly selected positions were 1 and 3, then I want to have
r_new = (0, r1 + r2, 0, r3+r4, r5).
I am coding in MATLAB. Here is my current code.
u = randperm(T);
ind = sort(u(1:n(i)));
tmp = r(ind);
r(ind) = 0;
x = find( r );
I am not necessarily looking for MATLAB code. Pseudocode would be helpful enough.

I'm assuming the last position can never be selected, otherwise the intended behaviour is undefined. So you randomly select n positions uniformly distributed from 1 up to N-1 (not up to N).
Here's one approach:
1. Select n distinct random positions from 1 to N-1, and sort them. Call the resulting vector of positions pos. This can be easily done with randperm and sort.
2. For each value in pos, say p, accumulate r(p) into r(p+1), and set r(p) to zero. This is done with a for loop.
In step 2, if position p+1 happens to belong to pos too, the accumulated value will be moved further to the right in a subsequent iteration. This works because pos has been sorted, so the randomly selected positions are processed from left to right.
r = [3 5 4 3 7 2 8]; %// data
n = 2; %// number of positions
pos = sort(randperm(numel(r)-1,n)); %// randomly select positions, and sort them
for p = pos
    r([p p+1]) = [0 r(p)+r(p+1)]; %// process position p
end
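Since the question says pseudocode would be enough, the same approach can also be sketched in Python. This is only an illustration: positions are 0-based here, the last position is never selected (as assumed above), and the names merge_into_next and random_merge are hypothetical.

```python
import random

def merge_into_next(r, positions):
    """Zero the given 0-based positions, pushing each value into the next element."""
    out = list(r)
    # Process from left to right so that runs of selected positions keep
    # carrying the accumulated value further to the right.
    for p in sorted(positions):
        out[p + 1] += out[p]
        out[p] = 0
    return out

def random_merge(r, n):
    """Pick n distinct positions, excluding the last one, and merge them."""
    positions = random.sample(range(len(r) - 1), n)
    return merge_into_next(r, positions)

print(merge_into_next([1, 2, 3, 4, 5], [2, 3]))  # 3rd and 4th positions
print(merge_into_next([1, 2, 3, 4, 5], [0, 2]))  # 1st and 3rd positions
```

With r = (1, 2, 3, 4, 5) standing in for (r1, ..., r5), the two calls reproduce the two examples from the question.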

Assuming N, n and r are already generated, then we select random indexes:
inds = randperm(N,n); % n distinct indexes (randi could return duplicates)
Then to achieve the desired results you can loop as follows:
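inds = sort(inds);
for ii=1:numel(inds)
    if(inds(ii)<N)
        % move the value into the next position
        r(inds(ii)+1)=r(inds(ii)+1) +r(inds(ii));
    end
    % zero only the current position, so that a value just moved into
    % another selected position is not lost
    r(inds(ii))=0;
end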
This will create the desired outcome of adding the values to the next index that wasn't selected to be set to 0.
Note I had to assume an edge case where if the last index is set to 0, then its value is not added to anything.

Identify missing values from array

I am striving to create a function in VBA that calculates the number of missing values in each column of a matrix of nxn dimensions.
Each column should contain the numbers 1 to n only once.
However, if this is not the case, I want the function to state how many values are missing. For example, in a column of a 4x4 matrix containing (1,2,1,3) there is one missing value, which is 4, so the function should return 1.
I am very new to VBA and by no means a master, but this is what I have done so far...
Function calcCost(sol() As Integer, n As Integer) As Integer
Dim ArrayOfTruth(1 To n) As Boolean
For Row = 1 To n
For i = 1 To n
If ProbMatrix(Column, Row) = i Then
ArrayOfTruth(i) = True
cost = 0
For i = 1 To n
If ArrayOfTruth(i) = True Then
cost = cost + 1
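To make the intended per-column rule concrete, here is a minimal sketch in Python rather than VBA. It assumes the matrix is given as a list of rows; the names missing_count and missing_per_column are hypothetical and not taken from the post.

```python
def missing_count(column, n):
    """Number of values from 1..n that do not appear in `column`."""
    present = set(column) & set(range(1, n + 1))
    return n - len(present)

def missing_per_column(matrix):
    """Missing-value count for each column of an n x n matrix (list of rows)."""
    n = len(matrix)
    return [missing_count([row[c] for row in matrix], n) for c in range(n)]

# The column (1, 2, 1, 3) from the question is missing only the value 4.
print(missing_count([1, 2, 1, 3], 4))
```

Using a set means duplicates are counted once, which is why a repeated 1 in the example leaves exactly one value missing.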

Assuming that the requirement of a square range of cells supersedes the description of the 'matrix's' values, I'm not sure why an array is needed at all.
Function calcCost(rTopLeft As Range, n As Long)
    Dim c As Long, r As Long
    For c = 1 To n
        If Not CBool(Application.CountIf(rTopLeft.Resize(n, n), c)) Then _
            r = r + 1
    Next c
    calcCost = r
End Function
Syntax:
    =calcCost(<top left corner of desired range>, <number of cells both right and down>)
Example:
=calcCost(E9, 18)
The above implementation could also be written as,
=18-SUMPRODUCT(--SIGN(COUNTIF(OFFSET(E9,0,0,18,18), ROW(1:18))))