I'm trying to solve some Google Code Jam problems, where an input matrix is typically given in this form:
2 3 #matrix dimensions
1 2 3 4 5 6 7 8 9 # all 3 elements in the first row
2 3 4 5 6 7 8 9 0 # each element is composed of three integers
where each element of the matrix is composed of, say, three integers. So this example should be converted to
#!scala
Array(
  Array(A(1,2,3),A(4,5,6),A(7,8,9)),
  Array(A(2,3,4),A(5,6,7),A(8,9,0))
)
An imperative solution would be of the form
#!python
from pprint import pprint

input = """2 3
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 0
"""
lines = input.strip().split('\n')  # strip() avoids a trailing empty line

class Aclass:
    def __init__(self, a, b, c):
        pass

print(lines[0])
m, n = (int(x) for x in lines[0].split())
array = []
row = []
A = []
for line in lines[1:]:
    for elt in line.split():
        A.append(elt)
        if len(A) == 3:
            row.append(Aclass(A[0], A[1], A[2]))
            A = []
    array.append(row)
    row = []
pprint(array)
A functional solution I've thought of is
#!scala
def splitList[A](l:List[A],i:Int):List[List[A]] = {
  if (l.isEmpty) return List[List[A]]()
  val (head,tail) = l.splitAt(i)
  return head :: splitList(tail,i)
}
def readMatrix(src:Iterator[String]):Array[Array[TrafficLight]] = {
  val Array(x,y) = src.next.split(" +").map(_.trim.toInt)
  val mat = src.take(x).toList.map(_.split(" ").
      map(_.trim.toInt)).
    map(a => splitList(a.toList,3).
      map(b => TrafficLight(b(0),b(1),b(2))
      ).toArray
    ).toArray
  return mat
}
But I really feel it's the wrong way to go because:
I'm using the functional List structure for each line and then converting it to an array. The whole code seems much less efficient.
I find it longer, less elegant and much less readable than the Python solution. It is harder to tell which of the map functions operates on what, as they all use the same semantics.
What is the right functional way to do that?
val x = """2 3
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 0
"""
val a = x split "\n" map (_.trim.split(" "))
val rows = a(0)(0).toInt
val columns = a(0)(1).toInt
val matrix = (a drop 1) map (_ grouped columns toList) toList
And to print the result:
matrix.map(_.map(_.mkString("(",",",")")).mkString("(",",",")")).mkString("\n")
res1: String =
((1,2,3),(4,5,6),(7,8,9))
((2,3,4),(5,6,7),(8,9,0))
with the assumptions:
assert(rows == matrix.length)
assert(matrix.forall(_.forall(_.size == columns)))
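For comparison with the imperative Python in the question, the same parse can be sketched with nested comprehensions. This is only an illustration and not part of the Scala answer: `parse_matrix` and its `width` parameter (integers per element, 3 in the example) are names assumed here.

```python
def parse_matrix(text, width=3):
    """Parse a 'rows cols' header plus rows of flat integers into nested tuples."""
    lines = text.strip().split("\n")
    rows, cols = (int(tok) for tok in lines[0].split())
    # Turn each data line into a flat list of ints (lazily, one line at a time).
    flat_rows = ([int(tok) for tok in line.split()] for line in lines[1:rows + 1])
    # Cut every flat row into consecutive chunks of `width` integers.
    matrix = [
        [tuple(nums[i:i + width]) for i in range(0, len(nums), width)]
        for nums in flat_rows
    ]
    return rows, cols, matrix


sample = "2 3\n1 2 3 4 5 6 7 8 9\n2 3 4 5 6 7 8 9 0\n"
rows, cols, matrix = parse_matrix(sample)
print(matrix)
```

Each comprehension level corresponds to one level of the result (matrix, row, element), which may make it easier to see what each step operates on.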
To produce an array, Array.tabulate fits better:
val a = x split "\n" map (_.trim.split(" "))
val rows = a(0)(0).toInt
val columns = a(0)(1).toInt
val matrix = Array.tabulate(rows, a(1).size / columns, columns)(
(i,j,k) => a(i + 1)(j * columns + k))
Here's a version that works on Scala 2.7:
val x = """2 3
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 0
"""
val a = x.trim split "\n" map (_.trim.split(" "))
val rows = a(0)(0).toInt
val columns = a(0)(1).toInt
def intervals(n: Int) = (Stream from (0, n)) zip (Stream from (n, n))
val matrix = (a drop 1) map (v =>
  intervals(v.size / columns)
  take columns
  map Function.tupled(v.subArray)
  toArray
) toArray
val repr = matrix map (
  _ map (
    _ mkString ("Array(", ", ", ")")
  )
  mkString ("Array(", ", ", ")")
) mkString ("Array(\n\t", ",\n\t", "\n)")
println(repr)
I asked a question recently that is very similar. I think you will find the answer there.
find unique matrices from a larger matrix
The input begins as a String, and in the process is transformed into series of 2D matrices.
Let's try this then. It seems you're not too worried about the language, so I'll just describe the code for it.
So we shall have a function that takes in this string and returns a multi-dimensional array.
The first thing the function needs to do is read the string until it reaches a space, convert this substring into an int and store it as 'rows', then do the same again but store it as 'columns'.
Next, it needs to loop through the remainder of the string, reading out numbers and storing them as ints in an array.
Then it needs to calculate the amount of numbers per cell, which should be "number_of_ints / (rows * columns)". That divide should be integer division, the kind that says "16 / 5 = 3", not "16 / 5 = 1" or "16 / 5 = 3.2".
Then we create our array of length rows, where each element is an array of length columns, where each element is an array of length 'numbers per cell'. This 3D array lets us still access each and every number stored.
Now we need to loop through each cell and put its numbers into it.
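for(i = 0 ; i < rows ; i = i + 1)
{
    for(j = 0 ; j < columns ; j = j + 1)
    {
        for(k = 0 ; k < numbers_per_cell ; k = k + 1)
        {
            // skip the numbers of all earlier cells, then step to the k-th number of this cell
            matrix[i][j][k] = numbers[( ( i * columns ) + j ) * numbers_per_cell + k]
        }
    }
}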
You should now have a matrix in which every one of our numbers is stored as a single int somewhere within the array.
It should look like
Array(
  Array(Array(1,2,3),Array(4,5,6),Array(7,8,9)),
  Array(Array(2,3,4),Array(5,6,7),Array(8,9,0))
)
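The steps described in this answer can be sketched in Python as follows. The function name `build_matrix` is an assumption made for the sketch. The flat index used is `(i * columns + j) * numbers_per_cell + k`, so that each cell skips over all the numbers belonging to the cells before it.

```python
def build_matrix(text):
    """Read 'rows columns' followed by a flat run of ints into a 3D nested list."""
    tokens = text.split()
    rows, columns = int(tokens[0]), int(tokens[1])
    numbers = [int(tok) for tok in tokens[2:]]
    # Integer division: how many ints make up one cell.
    numbers_per_cell = len(numbers) // (rows * columns)
    return [
        [
            [numbers[(i * columns + j) * numbers_per_cell + k]
             for k in range(numbers_per_cell)]
            for j in range(columns)
        ]
        for i in range(rows)
    ]


print(build_matrix("2 3\n1 2 3 4 5 6 7 8 9\n2 3 4 5 6 7 8 9 0\n"))
```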
Hope this helps you. I will update it if I need to explain something better, or if someone has a suggestion.
I have a table composed of four variables [v1, v2, v3, v4], with nearly 26000 rows. I need to search for the values of a specific row, e.g. [1 1 2 2016], within the table (26000 x 4), and return the index of the row in which it is located.
Example of what I would like to search for:
want_1 = [1 1 3 2016];
want_2 = [1 1 5 2016];
I would like to obtain the number of the row in which each is located.
If you have a matrix M (which you could get from table2array(T) on your table) you should be able to use implicit expansion* and all to get your result
srch = [1 1 3 2016]; % Row to search for
res = find( all( M == srch, 2 ) );
The find converts the logical array returned by all into the row numbers where it is true.
The implicit expansion here is basically the same as repeating the srch array for the entire height of the matrix M and then doing an element-wise == operation. The all then ensures that every comparison in a given row was true (i.e. a match for every element of srch).
*Implicit expansion relies on having MATLAB R2016b or newer... for older versions you can achieve the same using bsxfun.
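To make the comparison logic concrete outside MATLAB, here is a minimal plain-Python sketch of the same idea: compare every row with the wanted row and collect the positions that match in all columns. The name `find_rows` is assumed for the sketch, and indices are 1-based to line up with what `find` returns.

```python
def find_rows(matrix, wanted):
    """Return the 1-based indices of the rows equal to `wanted` in every column."""
    return [
        index
        for index, row in enumerate(matrix, start=1)
        if list(row) == list(wanted)
    ]


M = [[1, 2, 3, 4], [1, 1, 3, 2016], [5, 6, 7, 8]]
print(find_rows(M, [1, 1, 3, 2016]))  # [2]
print(find_rows(M, [1, 1, 5, 2016]))  # []
```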
Just as an exercise in alternatives, you could use splitapply instead to apply the all and == operators to each row in turn, although this is probably slower:
res = find( splitapply( @(x)all(x==srch), M, (1:size(M,1)).' ) );
Or you could even use rowfun, which is a bit of a loop-in-disguise, but would work on your table T without having to first convert it to a matrix:
res = find( rowfun( @(varargin)all([varargin{:}]==srch), T, 'OutputFormat', 'uniform' ) );
For a matrix, you can use ismember with the 'rows' option:
M = [1 2 3 4; 1 1 3 2016; 5 6 7 8]; % example data matrix
wanted = [1 1 3 2016]; % example wanted row
result = find(ismember(M, wanted, 'rows'));
This also works with a table, as long as the wanted row is a table (of one row) with the same variable names:
M = table;
M.hour = [1; 2; 3]; M.day = [4; 5; 6]; M.month = [7; 8; 9]; M.year = [10; 11; 12];
wanted = table;
wanted.hour = 2; wanted.day = 5; wanted.month = 8; wanted.year = 11;
result = find(ismember(M, wanted, 'rows'));
I have an array A=[a1,a2,a3, ..., aN] and I would like to take the sum of each group of 3 elements:
s1=a1+a2+a3
s2=a4+a5+a6
...
sM=a(N-2)+a(N-1)+aN
My solution:
k=size(A);
s=0;
for n=1:k
s(n)=s(n-2)+s(n-1)+s(n);
end
Error: Attempted to access s(2); index out of bounds because numel(s)=1.
How do I fix it?
If you want to sum in blocks, for the general case when the number of elements of A is not necessarily a multiple of the block size, you can use accumarray:
A = [3 8 5 8 2 3 4 7 9 6 4]; % 11 elements
s = 3; % block size
result = accumarray(ceil((1:numel(A))/s).', A(:));
If you want a sliding sum with a given block size, you can use conv:
A = [3 8 5 8 2 3 4 7 9 6 4]; % 11 elements
s = 3; % block size
result = conv(A(:).', ones(1,s), 'valid');
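The difference between the two variants (non-overlapping blocks versus a sliding window) can be seen in a short plain-Python sketch on the same data. The function names `block_sums` and `sliding_sums` are assumptions made for illustration and are not MATLAB functions.

```python
def block_sums(values, size):
    """Sum consecutive, non-overlapping blocks; the last block may be shorter."""
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]


def sliding_sums(values, size):
    """Sum every full window of `size` consecutive elements."""
    return [sum(values[i:i + size]) for i in range(len(values) - size + 1)]


A = [3, 8, 5, 8, 2, 3, 4, 7, 9, 6, 4]  # 11 elements
print(block_sums(A, 3))    # [16, 13, 20, 10]
print(sliding_sums(A, 3))  # [16, 21, 15, 13, 9, 14, 20, 22, 19]
```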
You try to calculate s by using values from s. Don't you mean s(n)=A(n-2)+A(n-1)+A(n);? Also, size returns more than one dimension on its own.
That being said, getting the 2 previous values n-2 and n-1 doesn't work for n=1 and n=2 (because indices must be positive). You have to explain how the first two values should be handled. I assume either 0 for elements that do not exist yet
k=size(A,2); %only the second dimension when A 1xn, or length(A)
s=zeros(1,k); %get empty values instead of appending each value for better performance
s(1)=A(1);
s(2)=A(2)+A(1);
for n=3:k %start at 3
s(n)=A(n-2)+A(n-1)+A(n);
end
or s should be 2 values shorter than A.
k=size(A,2);
s=zeros(1,k-2);
for n=1:k-2
s(n)=A(n)+A(n+1)+A(n+2);
end
You initialise s as a scalar with s = 0. Then you try and index it like an array, but it only has a single element.
Your current logic (if fixed) will calculate this:
s(1) = a(1)+a(2)+a(3)
s(2) = a(2)+a(3)+a(4)
...
% 's' will be 2 elements shorter than 'a'
So we need to be a bit wiser with the indexing to get what you describe, which is
s(1) = a(1)+a(2)+a(3)
s(2) = a(4)+a(5)+a(6)
...
% 's' will be a third as big as 'a'
You should pre-allocate s to the right size, like so:
k = numel(A); % Number of elements in 'A'
s = zeros( 1, k/3 ); % Output array, assuming 'k' is divisible by 3
for n = 0:3:k-3
s(n/3+1) = A(n+1) + A(n+2) + A(n+3);
end
You could do this in one line by reshaping the array to have 3 rows and then summing down each column. This assumes that the number of elements in A is divisible by 3, and that A is a row vector:
s = sum( reshape( A, 3, [] ) );
I'm trying to write a function that shuffles an array, which contains repeating elements, but ensures that repeating elements are not too close to one another.
This code works but seems inefficient to me:
function shuffledArr = distShuffle(myArr, myDist)
% this function takes an array myArr and shuffles it, while ensuring that repeating
% elements are at least myDist elements away from one another
% flag to indicate whether there are repetitions within myDist
reps = 1;
while reps
% set to 0 to break while-loop, will be set to 1 if it doesn't meet condition
reps = 0;
% randomly shuffle array
shuffledArr = Shuffle(myArr);
% loop through each unique value, find its position, and calculate the distance to the next occurrence
for x = 1:length(unique(myArr))
% check if there are any repetitions that are separated by myDist or less
if any(diff(find(shuffledArr == x)) <= myDist)
reps = 1;
break;
end
end
end
This seems suboptimal to me for three reasons:
1) It may not be necessary to repeatedly shuffle until a solution has been found.
2) This while loop will go on forever if there is no possible solution (i.e. setting myDist to be too high to find a configuration that fits). Any ideas on how to catch this in advance?
3) There must be an easier way to determine the distance between repeating elements in an array than what I did by looping through each unique value.
I would be grateful for answers to points 2 and 3, even if point 1 is correct and it is possible to do this in a single shuffle.
I think it is sufficient to check the following condition to prevent infinite loops:
[~,num, C] = mode(myArr);
N = numel(C);
assert( (myDist<=N) || (myDist-N+1) * (num-1) +N*num <= numel(myArr),...
'Shuffling impossible!');
Assume that myDist is 2 and we have the following data:
[4 6 5 1 6 7 4 6]
We can find the mode, 6, and its number of occurrences, 3. We arrange the 6s, separating them by 2 = myDist blanks:
6 _ _ 6 _ _ 6
There must be (3-1) * myDist = 4 numbers to fill the blanks. Now we have five more numbers so the array can be shuffled.
The problem becomes more complicated if we have multiple modes. For example for this array [4 6 5 1 6 7 4 6 4] we have N=2 modes: 6 and 4. They can be arranged as:
6 4 _ 6 4 _ 6 4
We have 2 blanks and three more numbers [5 1 7] that can be used to fill the blanks. If, for example, we had only one number [5], it would be impossible to fill the blanks and we couldn't shuffle the array.
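The counting argument can be sketched in Python. The modes need `num - 1` gaps of `myDist + 1` slots each plus one final group of `N` values, which is the second clause of the assertion with the product expanded. The name `can_shuffle` is assumed, and treating this single bound as the whole test is an assumption of the sketch rather than something the answer states.

```python
from collections import Counter


def can_shuffle(arr, my_dist):
    """Check whether equal values can be kept more than `my_dist` positions apart."""
    if not arr:
        return True
    counts = Counter(arr)
    num = max(counts.values())                                # occurrences of the mode
    n_modes = sum(1 for c in counts.values() if c == num)     # number of tied modes
    # (num - 1) full gaps of (my_dist + 1) slots, then the last group of modes.
    return (num - 1) * (my_dist + 1) + n_modes <= len(arr)


print(can_shuffle([4, 6, 5, 1, 6, 7, 4, 6], 2))     # True: 7 slots needed, 8 available
print(can_shuffle([4, 6, 5, 1, 6, 7, 4, 6, 4], 2))  # True: 8 slots needed, 9 available
print(can_shuffle([6, 4, 6, 4, 6, 4, 5], 2))        # False: 8 slots needed, 7 available
```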
For the third point you can use sparse matrix to accelerate the computation (My initial testing in Octave shows that it is more efficient):
function shuffledArr = distShuffleSparse(myArr, myDist)
[U,~,idx] = unique(myArr);
reps = true;
while reps
S = Shuffle(idx);
shuffledBin = sparse ( 1:numel(idx), S, true, numel(idx) + myDist, numel(U) );
reps = any (diff(find(shuffledBin)) <= myDist);
end
shuffledArr = U(S);
end
Alternatively you can use sub2ind and sort instead of sparse matrix:
function shuffledArr = distShuffleSparse(myArr, myDist)
[U,~,idx] = unique(myArr);
reps = true;
while reps
S = Shuffle(idx);
f = sub2ind ( [numel(idx) + myDist, numel(U)] , 1:numel(idx), S );
reps = any (diff(sort(f)) <= myDist);
end
shuffledArr = U(S);
end
If you just want to find one possible solution you could use something like that:
x = [1 1 1 2 2 2 3 3 3 3 3 4 5 5 6 7 8 9];
n = numel(x);
dist = 3; %minimal distance
uni = unique(x); %get the unique value
his = histc(x,uni); %count the occurence of each element
s = [sortrows([uni;his].',2,'descend'), zeros(length(uni),1)];
xr = []; %the vector that will contains the solution
%the for loop that will maximize the distance of each element
for ii = 1:n
s(s(:,3)<0,3) = s(s(:,3)<0,3)+1;
s(1,3) = s(1,3)-dist;
s(1,2) = s(1,2)-1;
xr = [xr s(1,1)];
s = sortrows(s,[3,2],{'descend','descend'});
end
if any(s(:,2)~=0)
fprintf('failed, dist is too big')
end
Result:
xr = [3 1 2 5 3 1 2 4 3 6 7 8 3 9 5 1 2 3]
Explanation:
I create a matrix s, and at the beginning s is equal to:
s =
3 5 0
1 3 0
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
%col1 = unique element; col2 = occurrences of each element; col3 = penalties
At each iteration of our for-loop we choose the element with the maximum number of occurrences, since this element will be harder to place in our array.
Then after the first iteration s is equal to:
s =
1 3 0 %1 is the next element that will be placed in our array.
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
3 4 -3 %3 now has 5-1 = 4 occurrences and a penalty of -3, so it won't show up in the next 3 iterations.
At the end, every number in the second column should be equal to 0; if not, the minimal distance was too big.
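The same greedy idea can be sketched in Python. This is a loose re-implementation under stated assumptions, not a line-by-line port: instead of a penalty column it remembers the last position of each value, ties are broken by the smallest value, and it returns `None` when it gets stuck. The name `spaced_order` is assumed.

```python
from collections import Counter


def spaced_order(values, dist):
    """Greedily order `values` so equal items have at least `dist` items between them."""
    remaining = Counter(values)   # how many copies of each value are still unplaced
    last_pos = {}                 # position at which each value was last placed
    result = []
    for pos in range(len(values)):
        # A value is allowed if copies remain and its last use is far enough back.
        candidates = [
            v for v in sorted(remaining)
            if remaining[v] > 0 and pos - last_pos.get(v, -dist - 1) > dist
        ]
        if not candidates:
            return None           # dist is too big for this input
        # Place the value with the most copies left, since it is the hardest to fit.
        choice = max(candidates, key=lambda v: remaining[v])
        remaining[choice] -= 1
        last_pos[choice] = pos
        result.append(choice)
    return result


x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 5, 5, 6, 7, 8, 9]
print(spaced_order(x, 3))
```

Because the tie-breaking differs, the order it produces need not match the MATLAB result shown above, but it respects the same spacing constraint.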
Let's say I have 3 arrays
X = [ 1 3 9 10 ];
Y = [ 1 9 11 20];
Z = [ 1 3 9 11 ];
Now I would like to find the values that appear only once, and which array they belong to.
I generalized EBH's answer to cover flexible number of arrays, arrays with different sizes and multidimensional arrays. This method also can only deal with integer-valued arrays:
function [uniq, id] = uniQ(varargin)
combo = [];
idx = [];
for ii = 1:nargin
combo = [combo; varargin{ii}(:)]; % merge the arrays
idx = [idx; ii*ones(numel(varargin{ii}), 1)];
end
counts = histcounts(combo, min(combo):max(combo)+1);
ids = find(counts == 1); % finding index of unique elements in combo
uniq = min(combo) - 1 + ids(:); % constructing array of unique elements in 'counts'
id = zeros(size(uniq));
for ii = 1:numel(uniq)
ids = find(combo == uniq(ii), 1); % finding index of unique elements in 'combo'
id(ii) = idx(ids); % assigning the corresponding index
end
And this is how it works:
[uniq, id] = uniQ([9, 4], 15, randi(12,3,3), magic(3))
uniq =
1
7
11
12
15
id =
4
4
3
3
2
If you are only dealing with integers and your vectors are equally sized (all with the same number of elements), you can use histcounts for a quick search for unique elements:
X = [1 -3 9 10];
Y = [1 9 11 20];
Z = [1 3 9 11];
XYZ = [X(:) Y(:) Z(:)]; % one matrix with all vectors as columns
counts = histcounts(XYZ,min(XYZ(:)):max(XYZ(:))+1);
R = min(XYZ(:)):max(XYZ(:)); % range of the data
unkelem = R(counts==1);
and then locate them using a loop with find:
pos = zeros(size(unkelem));
counter = 1;
for k = unkelem
[~,pos(counter)] = find(XYZ==k);
counter = counter+1;
end
result = [unkelem;pos]
and you get:
result =
-3 3 10 20
1 3 1 2
so -3, 3, 10 and 20 are unique, and they appear in vectors 1, 3, 1 and 2 (X, Z, X and Y), respectively.
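The two steps (count every value, then look up where each singleton came from) can be sketched in Python on the same data. The function name `singletons` is assumed, and unlike the histcounts approach this sketch does not require integers or equally sized inputs.

```python
from collections import Counter


def singletons(*arrays):
    """Return (value, 1-based array index) pairs for values occurring exactly once overall."""
    counts = Counter(value for arr in arrays for value in arr)
    return sorted(
        (value, index)
        for index, arr in enumerate(arrays, start=1)
        for value in arr
        if counts[value] == 1
    )


X = [1, -3, 9, 10]
Y = [1, 9, 11, 20]
Z = [1, 3, 9, 11]
print(singletons(X, Y, Z))  # [(-3, 1), (3, 3), (10, 1), (20, 2)]
```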
I have quite a big array. To keep things simple, let's reduce it to:
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
So, there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now, I want to keep only the columns that belong to a group of 3 or more elements. So the result will be:
B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];
I was doing it using a for loop, scanning the 1's, 2's, 3's and so on separately, but it's extremely slow with big arrays...
Thanks for any suggestions how to do it in more efficient way :)
Art.
A general approach
If your vector is not necessarily sorted, then you need to count the number of occurrences of each element in the vector. You have histc just for that:
elem = unique(A);
counts = histc(A, elem);
B = A;
B(ismember(A, elem(counts < 3))) = []
The last line picks the elements that have fewer than 3 occurrences and deletes them.
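As a language-neutral illustration of this count-then-filter approach, here is a minimal Python sketch. The name `keep_frequent` and its `min_count` parameter are assumptions for the example. It counts occurrences over the whole vector, so the input does not need to be sorted or grouped.

```python
from collections import Counter


def keep_frequent(values, min_count=3):
    """Keep only the elements whose value occurs at least `min_count` times overall."""
    counts = Counter(values)
    return [v for v in values if counts[v] >= min_count]


A = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5]
print(keep_frequent(A))
```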
An approach for a grouped vector
If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:
start_idx = find(diff([0, A]))
counts = diff([start_idx, numel(A) + 1]);
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = []
Again, note that the vector need not be entirely sorted, just that similar elements are adjacent to each other.
Here is my two-liner
counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));
accumarray is used to count the individual members of A. find extracts the ones that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A need not be sorted. Of course, this use of accumarray only works for positive integer values in A.
What you are describing is called run-length encoding.
There is software for this in Matlab on the FileExchange. Or you can do it directly as follows:
len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));
Once you have your run-length encoding you can remove elements based on the length. i.e.
idx = (len>=3);
len = len(idx);
val = val(idx);
And then decode to get the array you want:
i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1;
j(1) = 1;
B = val(cumsum(j));
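The encode, filter and decode steps can be sketched compactly in Python with `itertools.groupby`, which yields one group per run of equal adjacent values. The name `keep_long_runs` is assumed. Note that this filters by run length, so a value that occurs often but only in short, scattered runs is removed, which differs from counting occurrences over the whole array.

```python
from itertools import groupby


def keep_long_runs(values, min_len=3):
    """Keep only runs of equal adjacent values that are at least `min_len` long."""
    kept = []
    for _, run in groupby(values):   # one group per run (the "encode" step)
        run = list(run)
        if len(run) >= min_len:      # filter on the run length
            kept.extend(run)         # "decode" by writing the run back out
    return kept


A = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5]
print(keep_long_runs(A))
```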
Here's another way to do it using matlab built-ins.
% Set up
A=[1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5];
threshold=2;
% Get the unique elements of the array
uniqueElements=unique(A);
% Count how many times each unique element occurs
counts=histc(A,uniqueElements);
% Write which elements should be kept
toKeep=uniqueElements(counts>threshold);
% Make a logical index
indexer=false(size(A));
for i=1:length(toKeep)
% For every unique element we want to keep select the indices in A that
% are equal
indexer=indexer|(toKeep(i)==A);
end
% Apply index
B=A(indexer);