Julia: delete rows and columns from an array or matrix - arrays

How can I delete one or more rows and/or columns from an array?

Working with:
julia> array = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
4×4 Array{Int64,2}:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
To delete a single row (here row 2):
julia> newarray = array[1:end .!= 2, :]
3×4 Array{Int64,2}:
1 2 3 4
9 10 11 12
13 14 15 16
To delete a single column (here column 3):
julia> newarray = array[:, 1:end .!= 3]
4×3 Array{Int64,2}:
1 2 4
5 6 8
9 10 12
13 14 16
To delete a single row and a single column (here row 3, column 3):
julia> newarray = array[1:end .!= 3, 1:end .!= 3]
3×3 Array{Int64,2}:
1 2 4
5 6 8
13 14 16
To delete multiple rows (here rows 2, 4):
julia> newarray = array[setdiff(1:end, (2,4)), :]
2×4 Array{Int64,2}:
1 2 3 4
9 10 11 12
To delete multiple columns (here columns 2, 4):
julia> newarray = array[:, setdiff(1:end, (2,4))]
4×2 Array{Int64,2}:
1 3
5 7
9 11
13 15
To delete a single row and multiple columns (here row 4 and columns 3, 4):
julia> newarray = array[1:end .!= 4, setdiff(1:end, (3,4))]
3×2 Array{Int64,2}:
1 2
5 6
9 10
# or
julia> newarray = array[setdiff(1:end, 4), setdiff(1:end, (3,4))]
3×2 Array{Int64,2}:
1 2
5 6
9 10
# or
julia> newarray = array[setdiff(1:end, (4,)), setdiff(1:end, (3,4))]
3×2 Array{Int64,2}:
1 2
5 6
9 10
To delete multiple rows and columns (here rows 1, 2 and columns 3, 4):
julia> newarray = array[setdiff(1:end, (1,2)), setdiff(1:end, (3,4))]
2×2 Array{Int64,2}:
9 10
13 14
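The patterns above can be folded into one small function. This is only a sketch: `droprowscols` and its keyword names `rows` and `cols` are hypothetical, not part of Base Julia. It uses `axes` instead of `1:end` because `end` is only valid inside an indexing expression.

```julia
# Hypothetical helper (not part of Base): return a copy of A without
# the given rows and columns. `rows` and `cols` may be a single index
# or any iterable of indices.
function droprowscols(A::AbstractMatrix; rows=Int[], cols=Int[])
    keep_rows = setdiff(axes(A, 1), rows)  # row indices that survive
    keep_cols = setdiff(axes(A, 2), cols)  # column indices that survive
    return A[keep_rows, keep_cols]
end

array = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
newarray = droprowscols(array; rows=(1, 2), cols=(3, 4))
```

As with the indexing expressions above, this returns a new array and leaves `array` unchanged.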

Related

filter an array by another array in julia

I have two arrays in Julia. Array 1 (45807x2) has two columns: the first column is the position of the SNP and the second column is the snpID. I want to get the snpIDs from array 1 whose positions appear in array 2 (4580x1).
For example, the first element of array 2 (5) corresponds to the fifth snpID (BTA-34880) in array 1.
How can I do this? Thanks.
45807×2 Array{Any,2}:
1 "BovineHD0100000015"
2 "Hapmap43437-BTA-101873"
3 "BovineHD0100000062"
4 "ARS-BFGL-NGS-16466"
5 "BTA-34880"
6 "BovineHD0100000096"
7 "Hapmap34944-BES1_Contig627_1906"
8 "ARS-BFGL-NGS-98142"
9 "rs29015850"
10 "ARS-BFGL-NGS-114208"
11 "ARS-BFGL-NGS-66449"
12 "BovineHD0100000204"
13 "BovineHD0100000220"
⋮
4580-element Array{Int64,1}:
5
6
18
25
26
54
55
67
69
84
88
You can directly use the second array as an index for the first array. Look at this example:
julia> using Random
julia> a = hcat(1:10, shuffle(1:10))
10×2 Array{Int64,2}:
1 7
2 6
3 10
4 1
5 9
6 8
7 4
8 5
9 3
10 2
julia> b = shuffle(1:5)
5-element Array{Int64,1}:
2
5
3
4
1
julia> a[b,2]
5-element Array{Int64,1}:
6
9
10
1
7
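Applied to data shaped like the question's, the same idea might look as follows. This is a sketch on a six-row excerpt; the names `snps` and `positions` are made up for illustration. Indexing directly assumes the position column equals the row number, as it does in the excerpt shown. If that does not hold, a membership mask on the first column is an alternative.

```julia
# Six-row excerpt of the question's table (position, snpID).
snps = Any[1 "BovineHD0100000015";
           2 "Hapmap43437-BTA-101873";
           3 "BovineHD0100000062";
           4 "ARS-BFGL-NGS-16466";
           5 "BTA-34880";
           6 "BovineHD0100000096"]
positions = [5, 6]

# Case 1: positions are row numbers, so index with them directly.
ids_by_row = snps[positions, 2]

# Case 2: positions are values stored in column 1, so build a Bool mask.
mask = in.(snps[:, 1], Ref(positions))
ids_by_value = snps[mask, 2]
```

`Ref(positions)` keeps the broadcast from iterating over `positions` itself, so each entry of column 1 is tested against the whole vector.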

Julia: join two matrices using the same memory

I want to fuse two arrays without using more memory. Is that possible? For instance:
a=[1 2 3
4 5 6
7 8 9]
b=[11 12 13
14 15 16
17 18 19]
I need to get the array:
c=[a b]
but using the same memory as a and b, i.e., any change in a or b must be reflected in c.
There's also another package, CatViews.jl:
julia> using CatViews
julia> x = CatView(a, b); # no copying!!!
julia> reshape(x, size(a, 1), :)
3×6 reshape(::CatView{2,Int64}, 3, 6) with eltype Int64:
1 2 3 11 12 13
4 5 6 14 15 16
7 8 9 17 18 19
If you start in reverse, define C first
julia> C = rand(0:9, 3, 6)
3×6 Array{Int64,2}:
3 2 4 4 9 8
8 8 6 5 5 9
0 7 5 8 7 5
then have A and B be views of C
julia> A = @view C[:, 1:3]
3×3 view(::Array{Int64,2}, :, 1:3) with eltype Int64:
3 2 4
8 8 6
0 7 5
julia> B = @view C[:, 4:6]
3×3 view(::Array{Int64,2}, :, 4:6) with eltype Int64:
4 9 8
5 5 9
8 7 5
then it works.
julia> A[2,2] = -1
-1
julia> C
3×6 Array{Int64,2}:
3 2 4 4 9 8
8 -1 6 5 5 9
0 7 5 8 7 5

Given an index of choices for each column, construct a 1D array from a 2D array

I have a 2D array such as:
julia> m = [1 2 3 4 5
6 7 8 9 10
11 12 13 14 15]
3×5 Array{Int64,2}:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
I want to pick one value from each column and construct a 1D array.
So for instance, if my choices are
julia> choices = [1, 2, 3, 2, 1]
5-element Array{Int64,1}:
1
2
3
2
1
Then the desired output is [1, 7, 13, 9, 5]. What's the best way to do that? In my particular application, I am randomly generating these values, e.g.
choices = rand(1:size(m)[1], size(m)[2])
Thank you!
This is probably the simplest approach:
[m[c, i] for (i, c) in enumerate(choices)]
EDIT:
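If "best" means fastest for you, a function like this should be approximately 2x faster than the comprehension for large m:
function selector(m, choices)
    # one output element per column, same element type as m
    v = similar(m, size(m, 2))
    for i in eachindex(choices)
        # pick row choices[i] from column i, skipping bounds checks
        @inbounds v[i] = m[choices[i], i]
    end
    v
end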
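Another way to express the same selection, shown here only as a sketch, is to build one `CartesianIndex` per column by broadcasting and index `m` with the resulting vector. No timing is claimed for this variant.

```julia
m = [1 2 3 4 5
     6 7 8 9 10
     11 12 13 14 15]
choices = [1, 2, 3, 2, 1]

# Pair each chosen row with its column number: (1,1), (2,2), (3,3), (2,4), (1,5).
idx = CartesianIndex.(choices, axes(m, 2))
picked = m[idx]

# The comprehension from the answer gives the same result.
picked_comprehension = [m[c, i] for (i, c) in enumerate(choices)]
```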

How to fill an array by row in Julia

I would like to fill an Array object by row in the Julia language.
The reshape function wants to fill by column (Julia is column major).
julia> reshape(1:15, 3,5)
3x5 Array{Int64,2}:
1 4 7 10 13
2 5 8 11 14
3 6 9 12 15
Is there a way to persuade it to fill by row? It feels like there should be an obvious answer, but I've not found one.
One suggestion:
julia> reshape(1:15, 5, 3) |> transpose
3x5 Array{Int64,2}:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
With array comprehension:
julia> [i+5*j for j=0:2,i=1:5]
3x5 Array{Int64,2}:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
Ah, it's more than 10x faster than the other suggestion (actually, an embarrassing 100x on my initial benchmark).
permutedims is another choice when dealing with more general multi-way arrays.
julia> permutedims(reshape(1:24, 2,3,4), [2,1,3])
3x2x4 Array{Int64,3}:
[:, :, 1] =
1 2
3 4
5 6
[:, :, 2] =
7 8
9 10
11 12
[:, :, 3] =
13 14
15 16
17 18
[:, :, 4] =
19 20
21 22
23 24
However, it is the slowest of the suggestions in your specific case.
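For the two-dimensional case, the suggestions can be compared side by side. This is a sketch with illustrative variable names. On Julia 1.x, `transpose` returns a lazy wrapper around its argument, while the one-argument `permutedims` materializes a new `Matrix`.

```julia
# Fill a 5×3 array column by column, then swap the two dimensions.
lazy_rows = transpose(reshape(1:15, 5, 3))    # lazy wrapper, no copy
dense_rows = permutedims(reshape(1:15, 5, 3)) # plain 3×5 Matrix{Int}

# The comprehension builds the same matrix directly.
comp_rows = [i + 5 * j for j = 0:2, i = 1:5]
```

All three hold the same values, so the choice mainly depends on whether a copy is wanted.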

Taking averages of data based on logical filter

We have two columns ('A' and 'B') as follows:
A = [10 5 6 6 10 2 3 2 1 3 2 3 3 7 9 8 6 8 8 12]
B = [10 5 6 6 2 2 3 2 1 3 2 3 3 7 2 2 3 3 8 12]
logicalFilter= ~(B<=3 & B>1)
Now I need to take the average of the data points in A where logicalFilter == 1, separately for each of the three blocks where logicalFilter == 1, ignoring the first two points (for example) of each block. How can this be done?
My mentalist skills lead me to this answer:
%// input
A = [10 5 6 6 10 2 3 2 1 3 2 3 3 7 9 8 6 8 8 12]
B = [10 5 6 6 2 2 3 2 1 3 2 3 3 7 2 2 3 3 8 12]
mask = (B<=3 & B>1)
%// get subs and vals for accumarray
C = cumsum(~mask) + 1
[~,~,subs] = unique(C(mask))
val = A(mask)
%// calculate mean starting with 3rd value of group
out = accumarray(subs(:),val(:),[],@(x) mean(x(3:end)) )
out =
2.5000 3.0000 7.0000
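The answer above is MATLAB. For readers following the Julia threads on this page, the same block-wise average could be sketched as below. `block_means` and its `skip` keyword are hypothetical names, and the mask is the non-negated one used in the answer. Unlike the MATLAB version, this sketch silently drops blocks with `skip` or fewer points instead of returning NaN for them.

```julia
using Statistics  # provides mean

A = [10, 5, 6, 6, 10, 2, 3, 2, 1, 3, 2, 3, 3, 7, 9, 8, 6, 8, 8, 12]
B = [10, 5, 6, 6, 2, 2, 3, 2, 1, 3, 2, 3, 3, 7, 2, 2, 3, 3, 8, 12]
mask = (B .<= 3) .& (B .> 1)  # same mask as in the answer

# Hypothetical helper: mean of each run of consecutive `true` entries
# in `mask`, ignoring the first `skip` values of every run.
function block_means(values, mask; skip=2)
    means = Float64[]
    block = eltype(values)[]
    for (v, keep) in zip(values, mask)
        if keep
            push!(block, v)
        elseif !isempty(block)
            # a run just ended: average what is left after skipping
            length(block) > skip && push!(means, mean(block[skip+1:end]))
            empty!(block)
        end
    end
    # flush a run that extends to the end of the data
    length(block) > skip && push!(means, mean(block[skip+1:end]))
    return means
end

out = block_means(A, mask)  # [2.5, 3.0, 7.0]
```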
