Order ranking column algorithm - database

For example, I have a column called rank in a table called questions. It describes the order of these questions. Let's represent that column as a list: [1, 2, 3, 4, ... , 9998, 9999, 10000].
Now the user decides that the question with rank = 10000 is important and drags it to rank = 4. The list should now be [1, 2, 3, 5, 6, ... , 9999, 10000, 4], so that ordering by this column gives the new order.
In other words, we turn [1, 2, 3, 4, ... , 9998, 9999, 10000] into [1, 2, 3, 5, 6, ..., 9999, 10000, 4].
Is there any way to do this without looping over every element in [4, 9999] and adding 1 to each?
Thank you!
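To make the transformation concrete, here is a minimal sketch using Python's built-in sqlite3 module. The questions(id, rank) table layout, the contiguous ranks, and the helper name move_up are assumptions made for illustration. It issues one set-based UPDATE instead of an application-side loop, although the database still rewrites every row in the affected range.

```python
import sqlite3

def move_up(conn, old_rank, new_rank):
    """Move the row at old_rank to new_rank (assumes new_rank < old_rank)."""
    conn.execute(
        '''
        UPDATE questions
        SET "rank" = CASE WHEN "rank" = :old THEN :new ELSE "rank" + 1 END
        WHERE "rank" BETWEEN :new AND :old
        ''',
        {"old": old_rank, "new": new_rank},
    )

# Hypothetical table with 10 questions whose rank equals their id
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE questions (id INTEGER PRIMARY KEY, "rank" INTEGER)')
conn.executemany(
    'INSERT INTO questions (id, "rank") VALUES (?, ?)',
    [(i, i) for i in range(1, 11)],
)

move_up(conn, old_rank=10, new_rank=4)
ranks = [r for (r,) in conn.execute('SELECT "rank" FROM questions ORDER BY id')]
print(ranks)
```

With ten rows, the ranks listed by id come out as [1, 2, 3, 5, 6, 7, 8, 9, 10, 4], which is the small-scale version of the list in the question.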

Related

Apply scipy scoreatpercentile for each list in array

Suppose I have an array like this:
import numpy as np
a = np.array([[10, 3, 7, 1, 3, 3],
              [ 7, 2, 4, 2, 4, 6],
              [ 4, 1, 9, 3, 5, 7]])
scipy.stats.scoreatpercentile or numpy.percentile lets me get the score at a given percentile for each row of the array:
from scipy import stats
test = stats.scoreatpercentile(a, 50, axis=1)
print(test)
# Output: [3.  4.  4.5]
What interests me even more is the reverse: getting the percentile of a given score. For that I would use scipy.stats.percentileofscore, but this ('reverse') function cannot be applied to this kind of array, because unlike the related scipy.stats.scoreatpercentile it does not take an axis parameter. Is there another clean way to apply this function to each row of the array?
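One possible approach, sketched here with plain NumPy, is to compute the percentile of a score per row directly. The helper name percentile_of_score_rows is hypothetical, and the sketch follows the "weak" definition (the percentage of values less than or equal to the score), which is not the default kind used by scipy.stats.percentileofscore.

```python
import numpy as np

def percentile_of_score_rows(arr, score):
    """Percentage of values in each row that are <= score ('weak' definition)."""
    arr = np.asarray(arr)
    # The boolean mask averages to the fraction of matching values per row
    return (arr <= score).mean(axis=1) * 100.0

a = np.array([[10, 3, 7, 1, 3, 3],
              [ 7, 2, 4, 2, 4, 6],
              [ 4, 1, 9, 3, 5, 7]])
result = percentile_of_score_rows(a, 3)
print(result)
```

If the exact SciPy semantics are needed, np.apply_along_axis(stats.percentileofscore, 1, a, 3) should apply the SciPy function to each row, at the cost of a Python-level call per row.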

counting DISTINCT copies of row elements

Consider the array sample A.
import numpy as np
A = np.array([[2, 3, 6, 7, 3, 6, 7, 2],
              [2, 3, 6, 7, 3, 6, 7, 7],
              [2, 4, 3, 4, 6, 4, 9, 4],
              [4, 9, 0, 1, 2, 5, 3, 0],
              [5, 5, 2, 5, 4, 3, 7, 5],
              [7, 5, 4, 8, 0, 1, 2, 6],
              [7, 5, 4, 7, 3, 8, 0, 7]])
PROBLEM: I want to identify rows that have a specified number of DISTINCT element copies. The code needs to be able to answer questions like "which rows of A have exactly 4 elements that appear twice?" or "which rows of A have exactly 1 element that appears three times?" The following code comes close:
r,c = A.shape
nCopies = 4
s = np.sort(A,axis=1)
out = A[((s[:,1:] != s[:,:-1]).sum(axis=1)+1 == c - nCopies)]
This produces 2 output rows, both having 4 copied elements.
The 1st row has copies of 2,3,6,7. The 2nd row has copies of 3,6,7,7:
array([[2, 3, 6, 7, 3, 6, 7, 2],
       [2, 3, 6, 7, 3, 6, 7, 7]])
My problem is that I don't want the 2nd output row, because it only has 3 DISTINCT copies (i.e. 3, 6, 7).
How can the code be modified to identify only distinct copies?
If I understand correctly, you want the rows of A that have 4 distinct values, each of which appears more than once. You can leverage np.unique with return_counts=True, which returns two arrays: the distinct values and the count of each value.
# For every row: (distinct values, count of each distinct value)
counts = [np.unique(row, return_counts=True) for row in A]
# Keep a row if every value occurs more than once and there are exactly 4 distinct values
valid_indices = [np.all(row[1] > 1) and row[0].shape[0] == 4 for row in counts]
valid_rows = A[valid_indices]
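The same np.unique idea can be generalized to the other questions in the post. This is a minimal sketch under one reading of the problem, namely "exactly n_values distinct values occur exactly n_times each". The helper name rows_with_copies and its parameters are hypothetical.

```python
import numpy as np

def rows_with_copies(A, n_values, n_times):
    """Indices of rows in which exactly n_values distinct values occur exactly n_times."""
    hits = []
    for i, row in enumerate(A):
        _, counts = np.unique(row, return_counts=True)
        # Count how many distinct values have the requested multiplicity
        if np.count_nonzero(counts == n_times) == n_values:
            hits.append(i)
    return hits

A = np.array([[2, 3, 6, 7, 3, 6, 7, 2],
              [2, 3, 6, 7, 3, 6, 7, 7],
              [2, 4, 3, 4, 6, 4, 9, 4],
              [4, 9, 0, 1, 2, 5, 3, 0],
              [5, 5, 2, 5, 4, 3, 7, 5],
              [7, 5, 4, 8, 0, 1, 2, 6],
              [7, 5, 4, 7, 3, 8, 0, 7]])

twice_x4 = rows_with_copies(A, 4, 2)   # exactly 4 values that appear twice
thrice_x1 = rows_with_copies(A, 1, 3)  # exactly 1 value that appears three times
print(twice_x4, thrice_x1)
```

Under this reading only row 0 has four values that appear twice, while rows 1 and 6 each have one value that appears three times.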

drop np array rows based on element uniqueness and one other condition

Consider the 2d integer array below:
import numpy as np
arr = np.array([[1, 3, 5, 2, 8],
                [9, 6, 1, 7, 6],
                [4, 4, 1, 8, 0],
                [2, 3, 1, 8, 5],
                [1, 2, 3, 4, 5],
                [6, 6, 7, 9, 1],
                [5, 3, 1, 8, 2]])
PROBLEM: Eliminate rows from arr based on two conditions:
a) The row's elements must all be unique.
b) Among these unique-element rows, I want to eliminate the permutation duplicates.
All other rows in arr are kept.
In the example given above, the rows with indices 0, 3, 4, and 6 meet condition a): their elements are unique.
Of these 4 rows, the ones with indices 0, 3, and 6 are permutations of each other. I want to keep one of them, say index 0, and ELIMINATE the other two.
The output would look like:
[[1, 3, 5, 2, 8],
 [9, 6, 1, 7, 6],
 [4, 4, 1, 8, 0],
 [1, 2, 3, 4, 5],
 [6, 6, 7, 9, 1]]
I can identify the rows that meet condition a) with something like:
s = np.sort(arr,axis=1)
arr[~(s[:,:-1] == s[:,1:]).any(1)]
But, I'm not sure at all how to eliminate the permutation duplicates.
Here's one way -
# Sort each row
b = np.sort(arr, axis=1)
# Mask of rows whose elements are all unique; select those (sorted) rows
m = (b[:, :-1] != b[:, 1:]).all(1)
d = b[m]
# Indices in arr of the rows with unique elements
idx = np.flatnonzero(m)
# Among those rows, find the first occurrence of each distinct sorted row,
# i.e. one representative per set of permutations
u, stidx, c = np.unique(d, axis=0, return_index=True, return_counts=True)
# Combine the representatives with the rows that were not masked
out = arr[np.sort(np.r_[idx[stidx], np.flatnonzero(~m)])]
Alternatively, final step could be optimized further, with something like this -
m[idx[stidx]] = 0
out = arr[~m]
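For small inputs, the same rule can be cross-checked with a loop-based sketch in plain Python. The helper name drop_permutation_duplicates is hypothetical. It works on lists of lists rather than NumPy arrays and keeps the first occurrence of each permutation group.

```python
def drop_permutation_duplicates(rows):
    """Keep all rows, except later permutations of an earlier all-unique row."""
    seen = set()
    kept = []
    for row in rows:
        all_unique = len(set(row)) == len(row)
        if all_unique:
            key = tuple(sorted(row))  # identical for every permutation of the row
            if key in seen:
                continue  # a permutation of this row was already kept
            seen.add(key)
        kept.append(list(row))
    return kept

rows = [[1, 3, 5, 2, 8],
        [9, 6, 1, 7, 6],
        [4, 4, 1, 8, 0],
        [2, 3, 1, 8, 5],
        [1, 2, 3, 4, 5],
        [6, 6, 7, 9, 1],
        [5, 3, 1, 8, 2]]
kept = drop_permutation_duplicates(rows)
print(kept)
```

Rows with repeated elements are never compared against each other, so they are always kept, as the question requires.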

Given an unsorted array, Find the maximum subtraction between two elements in the array

I got this question in an interview at Microsoft: given an unsorted array, find the maximum difference between two elements of the array such that:
(Index1, Index2) = arr[Index2] - arr[Index1], with Index1 < Index2.
Example:
given the array: [1, 5, 3, 2, 7, 9, 4, 3] -> Output: (1,9)=8.
given the array: [4, 9, 2, 3, 6, 3, 8, 1] -> Output: (2,8)=6.
The naive solution runs in O(n^2) time: take the first index, subtract its value from the value at every later index and save the maximum, then continue with the next index, and so on.
Is there any way to optimize this?
Fairly simple when you write it down. Rephrasing the problem, you want to find the largest element to the right of each element. Now given the first example, this is:
[1, 5, 3, 2, 7, 9, 4, 3]
=>
[9, 9, 9, 9, 9, 4, 3]
Now, notice the maximums array is just the cumulative maximums from the right. Given this property it is easy to construct an O(n) time algorithm.
Implementation in python:
def find_max(xs):
    ys = []
    cur_max = float('-inf')
    for x in reversed(xs):
        cur_max = max(x, cur_max)
        ys.append(cur_max)
    ys = ys[::-1][1:]
    return max(y - x for x, y in zip(xs, ys))
We can also construct the maximums array lazily. Doing so needs only O(1) extra memory, which is the best possible:
def find_max(xs):
    cur_max = float('-inf')
    cum_max = xs[-1]
    for i in range(len(xs) - 2, -1, -1):
        cur_max = max(cur_max, cum_max - xs[i])
        cum_max = max(cum_max, xs[i])
    return cur_max
I think this is correct and O(n log n): split the array in the middle and take the max value from the right part and the min value from the left part. Then also split at the other two quarter points; if one of those gives a bigger result, continue on that branch recursively.
Second example:
4, 9, 2, 3| 6, 3, 8, 1 -> 2 and 8
4, 9| 2, 3, 6, 3, 8, 1 -> 4 and 8
4, 9, 2, 3, 6, 3| 8, 1 -> 2 and 8
So working on the right split:
4, 9, 2, 3, 6, 3, 8| 1 -> 2 and 1
Selecting the 2 and 8 option. It also works for the first example.
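A standard divide-and-conquer formulation, which may differ from the exact procedure described above, can be sketched as follows. The helper name max_diff_dc is hypothetical. The best pair either lies entirely in the left half, entirely in the right half, or straddles the split, in which case it is the largest value on the right minus the smallest value on the left. Recomputing the min and max at every level gives O(n log n) time.

```python
def max_diff_dc(xs, lo=0, hi=None):
    """Return the largest xs[j] - xs[i] with lo <= i < j < hi."""
    if hi is None:
        hi = len(xs)
    if hi - lo < 2:
        return float('-inf')  # no valid pair in this range
    mid = (lo + hi) // 2
    # Best pair that straddles the split: smallest on the left, largest on the right
    cross = max(xs[mid:hi]) - min(xs[lo:mid])
    return max(cross, max_diff_dc(xs, lo, mid), max_diff_dc(xs, mid, hi))

print(max_diff_dc([1, 5, 3, 2, 7, 9, 4, 3]))
print(max_diff_dc([4, 9, 2, 3, 6, 3, 8, 1]))
```

For the two example arrays in the question this gives 8 and 6, matching the expected outputs.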

Get a number in an unordered straight that follow a start value from an interval

I'm looking for a fast method to solve my problem.
Imagine ordered seats numbered from 1 to 8, with people sitting in seats [2, 6, 5, 3]. I want to get the second person (interval +2) after seat number 4 (the start value).
For example:
With the array [2, 5, 8, 7, 1], I start at value 3 and move +2 times:
the first number after 3 in the list is 5, the second is 7, so the method must return 7.
With the same [2, 5, 8, 7, 1], I start from 7 and move +3 times.
Here the method must wrap around to the minimal value, passing through 8, 1, 2. Result: 2
With [1, 3], start 4, count +2, result 3
With [5, 3, 9], start 3, count +1, result 5
With [5, 3, 9], start 3, count +2, result 9
I hope someone will understand my problem.
Thank you.
Sort your list, use bisect to find the starting index, then mod the result of the addition by the length of the list.
So, this is basically just an example implementation of Ignacio's algorithm in Python:
from bisect import bisect
def circular_highest(lst, start, move):
    slst = sorted(lst)
    # bisect() gives the insertion point to the right of start;
    # step back one, advance by move, and wrap around the list
    return slst[(bisect(slst, start) - 1 + move) % len(lst)]
print(circular_highest([2, 5, 8, 7, 1], 3, 2))
print(circular_highest([2, 5, 8, 7, 1], 7, 3))
print(circular_highest([1, 3], 4, 2))
print(circular_highest([5, 3, 9], 3, 1))
print(circular_highest([5, 3, 9], 3, 2))
Output:
7
2
3
5
9
