Create an array based on grouping (and conditions) from an array

So, I have the following array (structured as Array{Tuple{Int,Float64,Int,Int},1}, but it can also be an Array of Arrays), where the first element of each tuple is an ID and the second is a number indicating a cost. What I want to do is group by ID and then take the difference between the cheapest and the second-cheapest cost for that ID. If there is no second cost, the cost difference should be typemax(Float64) - firstcost. For the third and fourth elements of the tuple, I want to keep those of the firstcost (that is, of the minimum-cost tuple).
Example of what I have
(1, 223.2, 2, 2)
(1, 253.2, 3, 2)
(2, 220.0, 4, 6)
(3, 110.0, 1, 4)
(3, 100.0, 3, 8)
Example of what I want:
(1, 30.0, 2, 2)
(2, typemax(Float64)-220.0, 4, 6)
(3,10.0, 3, 8)

This is one way of doing it:
A = [(1, 223.2, 2, 2), (1, 253.2, 3, 2), (2, 220.0, 4, 6), (3, 110.0, 1, 4), (3, 100.0, 3, 8)]
function f(a)
aux(b::Vector) = (b[1][1], (length(b) == 1 ? typemax(Float64) : b[2][2]) - b[1][2], b[1][3:4]...)
sort([aux(sort(filter(x -> x[1] == i, a))) for i in Set(map(first, a))])
end
@show f(A)

There's SplitApplyCombine.jl, which implements (unsurprisingly) a split-apply-combine logic like that found in DataFrames. This is an example where I would stay away from simple one-liners / short solutions and write things out more explicitly, in the interest of making the code readable and understandable if someone else (or you yourself in a few months' time!) reads it:
julia> tups = [(1, 223.2, 2, 2)
(1, 253.2, 3, 2)
(2, 220.0, 4, 6)
(3, 110.0, 1, 4)
(3, 100.0, 3, 8)]
5-element Array{Tuple{Int64,Float64,Int64,Int64},1}:
(1, 223.2, 2, 2)
(1, 253.2, 3, 2)
(2, 220.0, 4, 6)
(3, 110.0, 1, 4)
(3, 100.0, 3, 8)
julia> using SplitApplyCombine
julia> function my_fun(x) # function to apply to each group
           if length(x) == 1
               return (x[1][1], typemax(Float64) - x[1][2], x[1][3], x[1][4])
           else
               s = sort(x, by = t -> t[2]) # sort the group by cost, cheapest first
               return (s[1][1], diff(getindex.(s[1:2], 2))[1], s[1][3], s[1][4])
           end
       end
my_fun (generic function with 1 method)
julia> [my_fun(x) for x in group(first, tups)] # apply function group wise
3-element Array{Tuple{Int64,Float64,Int64,Int64},1}:
 (2, Inf, 4, 6)
 (3, 10.0, 3, 8)
 (1, 30.0, 2, 2)
If performance is a concern you might want to think about my_fun and do some profiling to see if and how you can improve it - the only thing I've done here is to use diff to subtract the first from the second element of the sorted array to avoid sorting twice.
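For comparison, the same group-then-compare logic can be sketched in Python. This is only an illustration under a few assumptions: the function name cost_gaps is made up for this example, math.inf stands in for typemax(Float64) (which is Inf in Julia), and the rows are plain tuples shaped like those in the question.

```python
import math
from collections import defaultdict

def cost_gaps(rows):
    """For each ID, return (id, second_cheapest - cheapest, third, fourth),
    where third and fourth come from the cheapest row of that ID."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)          # split by ID
    out = []
    for key in sorted(groups):
        ranked = sorted(groups[key], key=lambda r: r[1])   # cheapest first
        best = ranked[0]
        # With a single row there is no second cost: use infinity instead.
        second_cost = ranked[1][1] if len(ranked) > 1 else math.inf
        out.append((key, second_cost - best[1], best[2], best[3]))
    return out

rows = [(1, 223.2, 2, 2), (1, 253.2, 3, 2), (2, 220.0, 4, 6),
        (3, 110.0, 1, 4), (3, 100.0, 3, 8)]
result = cost_gaps(rows)
```

Sorting each group once by cost gives both the cheapest row and the second-cheapest cost, so the tuple fields and the difference always come from the same ranking.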

Related

How to find sum of every possible trio in a given array in O(n^2)?

I have an array like this:
1,2,3,4,5.
I want to find sum of every possible trio like:
123, 124, 125, 134, 135, etc
I have tried using 3 while loops and 3 variables (i, j, k) to iterate over every trio, but the time complexity
was O(n^3), and I want O(n^2).
Hope someone will help.
k-permutations of n? My first try would be something like this, given that the array is like [1,2,3,...,n,n+1] and sums of 3 will repeat:
unique_sums = set()
iter_counter = [0]
call_counter = [0]
def sum_permute(arr, perm, k):
    call_counter[0] += 1
    result = sum(perm)
    # stop early if this (partial) sum has already been recorded
    if result in unique_sums:
        return
    if len(perm) == k:
        iter_counter[0] += 1
        # print(f"sum{perm}={result}")
        unique_sums.add(result)
        return
    for el in arr:
        iter_counter[0] += 1
        if el not in perm:
            perm.add(el)
            sum_permute(arr, perm, k)
            # note: set.pop() removes an arbitrary element, not necessarily el
            perm.pop()
If we don't discard repeated sums, this runs in n(n-1)(n-2)...(n-k+1) steps, which for k=3 is ~O(n^3), assuming set lookup is O(1). Anyway, here is a result:
arr = [el for el in range(100)]
sum_permute(arr ,set(), 3)
print(f"Total elements: {len(arr)}")
print(f"Total iterations: {iter_counter[0]}")
print(f"Total function call: {call_counter[0]}")
print(f"n^2={len(arr)}^2={len(arr)**2}")
Result:
Total elements: 100
Total iterations: 3372
Total function call: 3091
n^2=100^2=10000
Python has a built-in method to do it.
Time complexity: see "What is the computational complexity of `itertools.combinations` in python?"
Code:
import itertools
print(list(itertools.combinations(range(1,6),3)))
result:
[(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)]
If you need the sum:
code:
import itertools
print(sum([int(''.join(map(str,x))) for x in list(itertools.combinations(range(1,6),3))]))
result:
1845
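Listing every trio necessarily takes time proportional to the number of trios, but if only the total is needed it can be computed without enumerating them. The sketch below assumes the interpretation used in the code above: each trio (a, b, c), taken in index order i < j < k, contributes 100*a + 10*b + c, which matches the digit concatenation only for single-digit elements. The function names are made up for this example. Each element is counted once per role (first, middle, last) times the number of trios in which it plays that role, so the loop is O(n).

```python
from itertools import combinations
from math import comb

def trio_concat_sum(digits):
    """Sum of 100*a + 10*b + c over all index triples i < j < k, in O(n)."""
    n = len(digits)
    total = 0
    for idx, d in enumerate(digits):
        before = idx              # number of elements to the left of idx
        after = n - 1 - idx       # number of elements to the right of idx
        total += 100 * d * comb(after, 2)    # d is the first of the trio
        total += 10 * d * before * after     # d is the middle of the trio
        total += d * comb(before, 2)         # d is the last of the trio
    return total

def trio_concat_sum_brute(digits):
    """Reference version that enumerates every trio (cubic in n)."""
    return sum(100 * a + 10 * b + c for a, b, c in combinations(digits, 3))

print(trio_concat_sum([1, 2, 3, 4, 5]))  # 1845, same as the itertools version
```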

Python itertools.combinations: how to obtain the indices of the combined numbers within the combinations at the same time

According to the question presented here: Python itertools.combinations: how to obtain the indices of the combined numbers, given the following code:
import itertools
my_list = [7, 5, 5, 4]
pairs = list(itertools.combinations(my_list, 2))
#pairs = [(7, 5), (7, 5), (7, 4), (5, 5), (5, 4), (5, 4)]
indexes = list(itertools.combinations(range(len(my_list)), 2))
#indexes = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
Is there any way to obtain pairs and indexes in a single line so I can have a lower complexity in my code (e.g. using enumerate or something likewise)?
@Maf - try this; as @jonsharpe suggested earlier, use zip:
from pprint import pprint
from itertools import combinations
my_list = [7, 5, 5, 4]
>>> pprint(list(zip(combinations(enumerate(my_list),2), combinations(my_list,2))))
[(((0, 7), (1, 5)), (7, 5)),
(((0, 7), (2, 5)), (7, 5)),
(((0, 7), (3, 4)), (7, 4)),
(((1, 5), (2, 5)), (5, 5)),
(((1, 5), (3, 4)), (5, 4)),
(((2, 5), (3, 4)), (5, 4))]
(Explicit is better than implicit. Simple is better than complex.)
I would use a list comprehension for its flexibility:
list((x, (x[0][1], x[1][1])) for x in list(combinations(enumerate(my_list), 2)))
This can be further extended using the likes of operator.itemgetter.
Also, the idea is to use the iterator only once, so that the method can potentially be applied to other non-deterministic iterators as well, say, a yield from random.choices.
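A minimal sketch of the single-pass idea, assuming the goal is one list of index pairs and one list of value pairs: iterate combinations(enumerate(my_list), 2) once, unpack each element into its index and value, and split the result afterwards with zip. The variable names are only illustrative.

```python
from itertools import combinations

my_list = [7, 5, 5, 4]

# One pass over the combinations; each item is ((i, a), (j, b)).
combined = [((i, j), (a, b))
            for (i, a), (j, b) in combinations(enumerate(my_list), 2)]

# Split into two parallel tuples (this unpacking needs a non-empty list).
indexes, pairs = zip(*combined)

print(list(indexes))  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(list(pairs))    # [(7, 5), (7, 5), (7, 4), (5, 5), (5, 4), (5, 4)]
```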

Sort array of objects in numpy?

How can I efficiently sort an array of objects on two or more attributes in Numpy?
import numpy as np
class Obj():
    def __init__(self, a, b):
        self.a = a
        self.b = b
arr = np.array([], dtype=Obj)
for i in range(10):
    arr = np.append(arr, Obj(i, 10-i))
arr_sort = np.sort(arr, order=a,b) ???
Thx, Willem-Jan
The order parameter only applies to structured arrays:
In [383]: arr=np.zeros((10,),dtype='i,i')
In [385]: for i in range(10):
...: arr[i] = (i,10-i)
In [386]: arr
Out[386]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3), (8, 2), (9, 1)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
In [387]: np.sort(arr, order=['f0','f1'])
Out[387]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3), (8, 2), (9, 1)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
In [388]: np.sort(arr, order=['f1','f0'])
Out[388]:
array([(9, 1), (8, 2), (7, 3), (6, 4), (5, 5), (4, 6), (3, 7), (2, 8),
(1, 9), (0, 10)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
With a 2d array, lexsort provides a similar 'ordered' sort
In [402]: arr=np.column_stack((np.arange(10),10-np.arange(10)))
In [403]: np.lexsort((arr[:,1],arr[:,0]))
Out[403]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
In [404]: np.lexsort((arr[:,0],arr[:,1]))
Out[404]: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0], dtype=int32)
With your object array, I could extract the attributes into either of these structures:
In [407]: np.array([(a.a, a.b) for a in arr])
Out[407]:
array([[ 0, 10],
[ 1, 9],
[ 2, 8],
....
[ 7, 3],
[ 8, 2],
[ 9, 1]])
In [408]: np.array([(a.a, a.b) for a in arr],dtype='i,i')
Out[408]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3),
(8, 2), (9, 1)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
The Python sorted function will work on arr (or its list equivalent)
In [421]: arr
Out[421]:
array([<__main__.Obj object at 0xb0f2d24c>,
<__main__.Obj object at 0xb0f2dc0c>,
....
<__main__.Obj object at 0xb0f35ecc>], dtype=object)
In [422]: sorted(arr, key=lambda a: (a.b,a.a))
Out[422]:
[<__main__.Obj at 0xb0f35ecc>,
<__main__.Obj at 0xb0f3570c>,
...
<__main__.Obj at 0xb0f2dc0c>,
<__main__.Obj at 0xb0f2d24c>]
Your Obj class is missing a nice __str__ method. I have to use something like [(i.a, i.b) for i in arr] to see the values of the arr elements.
As I stated in the comment, for this example, a list is much nicer than an object array.
In [423]: alist=[]
In [424]: for i in range(10):
...: alist.append(Obj(i,10-i))
List append is faster than repeated array append. And object arrays don't add much functionality compared to a list, especially when they are 1d and the objects are custom classes like this. You can't do any math on arr, and as you can see, sorting isn't any easier.
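Putting the list-based suggestion together, a small sketch in plain Python: the objects live in a list, the class gets a __repr__ so the values are visible, and sorted takes operator.attrgetter as a multi-attribute key. The sample data (a = i % 3, b = 10 - i) is made up here so that ties on a actually occur.

```python
from operator import attrgetter

class Obj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __repr__(self):
        # Readable display, so no extra list comprehension is needed to inspect values
        return f"Obj(a={self.a}, b={self.b})"

objs = [Obj(i % 3, 10 - i) for i in range(6)]

# Sort on a first, then on b to break ties
by_a_then_b = sorted(objs, key=attrgetter("a", "b"))

# Sort on b first, then on a (equivalent to key=lambda o: (o.b, o.a))
by_b_then_a = sorted(objs, key=attrgetter("b", "a"))

print(by_a_then_b)
```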

Sorting an integer circular array swapping elements with 1 in between

Given a circular integer array v with n > 0 numbers, the only allowed move "m" is to swap an element with the one two positions ahead of it.
I have to print a sequence of moves that sorts v, or say that it's impossible (in C).
Example:
initial: v = [8, 0, 5, 9, -1, 5]
after applying m to v[4]: v = [-1, 0, 5, 9, 8, 5]
after applying m to v[3]: v = [-1, 0, 5, 5, 8, 9]
v is now sorted starting from position 0. The output would be "4 3".
What I know up to this point:
- If there is an odd number of elements, you can move any element to any position. (But is that enough to guarantee it can be sorted?)
- For an even number of elements, it's not always possible to sort it, since you can't move elements between odd and even positions (ex: v = [-1, -2, 1, 7] is impossible because -2 is at an odd position but should be at an even position).
I've been thinking about this:
- Use an auxiliary array "aux" to put the numbers next to their real neighbours, like:
v = [8, 0, 5, 9, -1, 5, 6] -> aux = [8, 5, -1, 6, 0, 9, 5]
Now in aux I can perform simple swaps of adjacent numbers.
- The final configuration in terms of their position "i" in aux is:
v = [8, 0, 5, 9, -1, 5, 6] -> aux = [8, 5, -1, 6, 0, 9, 5] -> i = [0, 2, 4, 6, 1, 3, 5]
- If n is even, there would be 2 aux, because to sort v, you can't move numbers between odd and even positions, so there's 2 sub problems. Ex:
v = [8, 0, 5, 9, -1, 5]
aux-even = [8, 5, -1] -> i-even = [0, 2, 4]
aux-odd = [0, 9, 5] -> i-odd = [1, 3, 5] (thinking in terms of v)
I'm not sure where to go from here or if it's even a good path to run.
Any ideas or help are welcome.
EDIT
I'm trying to simulate the algorithm proposed by AlexD for the odd case:
v = [8, 0, 5, 9, -1, 5, 6]
v-sorted = [-1, 0, 5, 5, 6, 8, 9]
Assigning them keys (-1, 1), (0, 5), (5, 2), (5, 6), (6, 3), (8, 7), (9, 4).
Back to the original order:
(8, 7) (0, 5) (5, 2) (9, 4) (-1, 1) (5, 6) (6, 3)
Using bubble sort for the keys with 1 in between.
(8, 7) (0, 5) (5, 2) (9, 4) (-1, 1) (5, 6) (6, 3)
(5, 2) (0, 5) (8, 7) (9, 4) (-1, 1) (5, 6) (6, 3)
(5, 2) (0, 5) (-1, 1) (9, 4) (8, 7) (5, 6) (6, 3)
(5, 2) (0, 5) (-1, 1) (9, 4) (6, 3) (5, 6) (8, 7)
(-1, 1) (0, 5) (5, 2) (9, 4) (6, 3) (5, 6) (8, 7)
(-1, 1) (9, 4) (5, 2) (0, 5) (6, 3) (5, 6) (8, 7)
(-1, 1) (9, 4) (5, 2) (0, 5) (6, 3) (5, 6) (8, 7)
But the numbers aren't sorted. What am I missing?
For brevity, let me illustrate the idea with concrete cases, which are easy to generalize.
Even case
Let's say we have 8 elements. Then we sort the two sub-arrays (with odd and even indices) separately, using bubble sort with step 2, and have
a1 a2 a3 a4
b1 b2 b3 b4
If the resulting array a1 b1 a2 b2 a3 b3 a4 b4 happened to be sorted, we are done. Otherwise there is no solution.
Odd case
Let's say you have 7 elements. Sort them with your favorite method. Let's say after sorting they are
a5 a4 a1 a7 a3 a2 a6
Assign key to every element like this:
a5 a4 a1 a7 a3 a2 a6
1 5 2 6 3 7 4
Then come back to the original order:
a1 a2 a3 a4 a5 a6 a7
2 7 3 5 1 4 6
Then sort the pairs by the new keys (1 - 7), using the bubble sort and with jumping over one element every time. As soon as you have the keys sorted, your a1 ... a7 values take their destination positions.
Let's illustrate with array
5 1 5 9 0
The sorted array and corresponding keys would be
0 1 5 5 9
1 4 2 5 3
Now let's sort the pairs by key (the pairs whose keys we compare at each step are in <>)
<5 2> (1 4) <5 5> (9 3) (0 1) // starting the first loop
(5 2) (1 4) <5 5> (9 3) <0 1>
(5 2) <1 4> (0 1) (9 3) <5 5>
(5 2) <5 5> (0 1) <9 3> (1 4)
<5 2> (9 3) <0 1> (5 5) (1 4) // (5 5) is in its final position, starting the second loop
(0 1) (9 3) <5 2> (5 5) <1 4>
(0 1) <9 3> (5 2) (5 5) <1 4>
<0 1> (1 4) <5 2> (5 5) (9 3) // (1 4) in its final position, starting the third loop
(0 1) (1 4) <5 2> (5 5) <9 3>
<0 1> (1 4) <5 2> (5 5) (9 3) // (9 3) in the final position
(0 1) (1 4) (5 2) (5 5) (9 3) // (0 1) and (5 2) are in final positions
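Both cases can be sketched in Python (the question asks for C, so treat this as pseudocode to port). Assumptions: a move i swaps v[i] with v[(i + 2) % n]; the function names are invented for this example; and instead of the bubble sort described above, it uses a simpler pass that finds the value each slot needs and walks it into place with adjacent swaps along the step-2 chain. It never uses the wrap-around swap in the even case, so the move list it returns is valid but not necessarily the shortest.

```python
def skip_swap_moves(v):
    """Return a list of moves that sorts v, or None if that is impossible.
    A move i swaps v[i] with v[(i + 2) % n]."""
    n = len(v)
    cur = list(v)
    target = sorted(cur)
    if n % 2 == 1:
        # Odd n: stepping by 2 visits every position once: 0, 2, ..., n-1, 1, 3, ...
        chains = [[(2 * t) % n for t in range(n)]]
    else:
        # Even n: even and odd positions never mix, so they form two chains.
        chains = [list(range(0, n, 2)), list(range(1, n, 2))]
    moves = []
    for chain in chains:
        for t in range(len(chain)):
            wanted = target[chain[t]]   # value that must end up in this slot
            s = next((s for s in range(t, len(chain))
                      if cur[chain[s]] == wanted), None)
            if s is None:
                return None             # the needed value is stuck in the other chain
            # Walk the value back from chain slot s to slot t, one move at a time.
            for r in range(s, t, -1):
                a, b = chain[r - 1], chain[r]   # b == (a + 2) % n
                cur[a], cur[b] = cur[b], cur[a]
                moves.append(a)
    return moves

def apply_moves(v, moves):
    """Replay a move list on a copy of v (useful for checking a result)."""
    out = list(v)
    n = len(out)
    for i in moves:
        j = (i + 2) % n
        out[i], out[j] = out[j], out[i]
    return out
```

For example, apply_moves(v, skip_swap_moves(v)) should equal sorted(v) whenever skip_swap_moves does not return None, and skip_swap_moves([-1, -2, 1, 7]) returns None, matching the impossible case from the question.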

Array manipulation in Fortran

I have two arrays fListU and fListD both of which contain 4-tuples. Specifically:
fListU = [(2, 1, 1, 0), (2, 5, 5, 0), (5, 4, 10, 0), (6, 1, 5, 0), (6, 5, 7, 0)]
fListD = [(1, 4, 0, 4), (3, 4, 0, 4), (5, 4, 0, 6)]
Now I want to put together these into one array, with the condition that when the first two items of the tuples are equal, then the third and fourth items of two lists should be added. In this case, the result I am looking for is
fList = [(2, 1, 1, 0), (2, 5, 5, 0), (5, 4, 10, 6), (6, 1, 5, 0),
(6, 5, 7, 0), (1, 4, 0, 4), (3, 4, 0, 4)]
where (5, 4, 10, 0) and (5, 4, 0, 6) are combined to (5, 4, 10, 6).
This is what I tried.
ALLOCATE (fList((n-1)**2,4))
fList = 0
p = 1 ! p signifies the position in fList.
DO k = 1, ((n-1)**2), 1 ! k is the index for fListD
DO l = 1, ((n-1)**2), 1 ! l is the index for fListU
IF ( ALL (fListU(l,1:2) == fListD(k,1:2)) ) THEN
fList(p,1:2) = fListU(l,1:2)
fList(p,3) = fListU(l,3)
fList(p,4) = fListD(k,4)
ELSE
fList(p,:) = fListU(l,:)
p = p+1
fList(p,:) = fListD(k,:)
p = p+1
END IF
END DO
END DO
This is not producing what I want. What would be the problem?
I'm not sure how you are reading in fListU and fListD. One thing that you need to realise is that in Fortran (unlike most other programming languages), the first index of a multi-dimensional array is the fastest changing. That's why the way you read the data in is so important: if you read the data in sequentially, or use reshape, then the second element you read in will be in position (2, 1), not (1, 2) as you might expect.
So I strongly suggest to have the shape of fListU as (4, 5), not (5, 4), and consequently address the first two elements of a tuple as flist(1:2, p).
Here's a possible solution that knows the lengths of the two input arrays.
The output will still contain another line of all zeros, because I haven't programmed it to get the size of the output array right (instead it just uses the sum of the sizes of the input arrays).
program Combine_List_Simple
    implicit none
    integer, dimension(:, :), allocatable :: fListU, fListD, fList
    integer :: u_size, d_size
    integer :: u_index, d_index, f_index
    u_size = 5
    allocate(fListU(4, u_size))
    fListU = reshape((/2, 1, 1, 0, 2, 5, 5, 0, 5, 4, 10, 0, &
                       6, 1, 5, 0, 6, 5, 7, 0/), (/4, u_size/))
    d_size = 3
    allocate(fListD(4, d_size))
    fListD = reshape((/1, 4, 0, 4, 3, 4, 0, 4, 5, 4, 0, 6/), &
                     (/4, d_size/))
    allocate(fList(4, u_size + d_size))
    flist(:, 1:u_size) = fListU(:, :)
    flist(:, u_size+1:) = 0
    f_index = u_size+1
    d_loop : do d_index = 1, d_size
        do u_index = 1, u_size
            if (all(fListD(1:2, d_index) == fList(1:2, u_index))) then
                fList(4, u_index) = fListD(4, d_index)
                cycle d_loop
            end if
        end do
        fList(:, f_index) = fListD(:, d_index)
        f_index = f_index+1
    end do d_loop
    write(*, '(4I4)') fList
end program Combine_List_Simple
This code also assumes that the 4th element of all tuples in fListU and the 3rd element of all tuples in fListD is zero. But your code seems to assume that as well. Also it assumes that the combination of 1st and 2nd elements of the tuples are unique in each of the input arrays.
First, I completely copy the contents of fListU into fList. Then I loop over fListD, and compare it to the first entries in fList, because that's where the contents of fListU are. If it finds a match, it updates only the 4th element of the tuple, and then cycles the loop of the fListD array.
Only if it doesn't find a match will it reach the end of the inner loop, and then append the tuple to fList.
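To make the intended merge rule concrete, here is a short Python sketch (not Fortran, and the function name merge_tuples is made up for this example). It keys each tuple on its first two items and adds the third and fourth items whenever a key repeats, so it does not rely on the zero-valued columns assumed above. A dict keeps insertion order, so fListU entries come first, followed by the unmatched fListD entries.

```python
def merge_tuples(f_list_u, f_list_d):
    """Combine 4-tuples; tuples sharing the first two items have their
    third and fourth items added together."""
    merged = {}
    for a, b, c, d in list(f_list_u) + list(f_list_d):
        prev_c, prev_d = merged.get((a, b), (0, 0))
        merged[(a, b)] = (prev_c + c, prev_d + d)
    return [(a, b, c, d) for (a, b), (c, d) in merged.items()]

fListU = [(2, 1, 1, 0), (2, 5, 5, 0), (5, 4, 10, 0), (6, 1, 5, 0), (6, 5, 7, 0)]
fListD = [(1, 4, 0, 4), (3, 4, 0, 4), (5, 4, 0, 6)]
fList = merge_tuples(fListU, fListD)
print(fList)
```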

Resources