How do I create a function (e.g. here, an anonymous one but I don't mind any) to get x elements from vec that are most centered (i.e. around the median)? In essence I want a function with same syntax as Matlab's randsample(n,k), but for non-random, with elements spanning around the center.
cntr=#(vec,x) vec(round(end*.5)+(-floor(x/2):floor(x/2))); %this function in question
cntr(1:10,3) % outputs 3 values around median 5.5 => [4 5 6];
cntr(1:11,5) % outputs => [4 5 6 7 8]
Note that vec is always sorted.
One part that I struggle with is not to output more than the limits of vec. For example, cntr(1:10, 10) should not throw an error.
edit: sorry to answer-ers for many updates of question
It's not a one-line anonymous function, but you can do this pretty simply with a couple calls to sort:
function vec = cntr(vec, x)
[~, index] = sort(abs(vec-median(vec)));
vec = vec(sort(index(1:min(x, end))));
end
The upside: it will still return the same set of values even if vec isn't sorted. Some examples:
>> cntr(1:10, 3)
ans =
4 5 6
>> cntr(1:11, 5)
ans =
4 5 6 7 8
>> cntr(1:10, 10) % No indexing errors
ans =
1 2 3 4 5 6 7 8 9 10
>> cntr([3 10 2 4 1 6 5 8 11 7 9], 5) % Unsorted version of example 2
ans =
4 6 5 8 7 % Same values, in their original order in vec
OLD ANSWER
NOTE: This applied to an earlier version of the question where a range of x values below and x values above the median were desired as output. Leaving it for posterity...
I broke it down into these steps (starting with a sorted vec):
Find the values in vec less than the median, get the last x indices of these, then take the first (smallest) of them. This is the starting index.
Find the values in vec greater than the median, get the first x indices of these, then take the last (largest) of them. This is the ending index.
Use the starting and ending indices to select the center portion of vec.
Here's the implementation of the above, using the functions find, min, and max:
cntr = #(vec, x) vec(min(find(vec < median(vec), x, 'last')):max(find(vec > median(vec), x)));
And a few tests:
>> cntr(1:10, 3) % 3 above and 3 below 5.5
ans =
3 4 5 6 7 8
>> cntr(1:11, 5) % 5 above and 5 below 6 (i.e. all of vec)
ans =
1 2 3 4 5 6 7 8 9 10 11
>> cntr(1:10, 10) % 10 above and 10 below 5.5 (i.e. all of vec, no indexing errors)
ans =
1 2 3 4 5 6 7 8 9 10
median requires sorting the array elements. Might as well sort manually, and pick out the middle block (edit: OP's comment indicates elements are already sorted, more justification for keeping it simple):
function data = cntr(data,x)
x = min(x,numel(data)); % don't pick more elements than exist
data = sort(data);
start = floor((numel(data)-x)/2) + 1;
data = data(start:start+x-1);
You could stick this into a single-line anonymous function with some tricks, but that just makes the code ugly. :)
Note that in the case of an uneven division (when we don't leave an even number of elements out), here we prioritize an element on the left. Here is what I mean:
0 0 0 0 0 0 0 0 0 0 0 => 11 elements, x=4
\_____/
picking these 4 values
This choice could be made more complex, for example shifting the interval left or right depending on which of those values is closest to the mean.
Given data (i.e. vec) is already sorted, the indexing operation can be kept to a single line:
cntr = #(data,x) data( floor((numel(data)-x)/2) + (1:x) );
The thing that is missing in that line is x = min(x,numel(data)), which we need to add twice becuase we can't change a variable in an anonymous function:
cntr = #(data,x) data( floor((numel(data)-min(x,numel(data)))/2) + (1:min(x,numel(data))) );
This we can simplify to:
cntr = #(data,x) data( floor(max(numel(data)-x,0)/2) + (1:min(x,numel(data))) );
Related
I have a vector like this:
h = [1,2,3,4,5,6,7,8,9,10,11,12]
And I want to repeat every third element like so:
h_rep = [1,2,3,3,4,5,6,6,7,8,9,9,10,11,12,12]
How do I accomplish this elegantly in MATLAB? The actual arrays are huge, so ideally I don't want to write a for loop. Is there a vectorized way to do this?
One way to do this would be to use the recent repelem function that was released in version R2015b where you can repeat each element in a vector a certain amount of times. In this case, specify a vector where every third element is a 2 with the rest of the values being a 1 as the number of times to repeat the corresponding element, then use the function:
N = numel(h);
rep = ones(1, N);
rep(3:3:end) = 2;
h_rep = repelem(h, rep);
Using your example: h = 1 : 12, we thus get:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
If repelem is not available to you, then a clever use of cumsum may help. Basically, note that for every three elements, the next one is a copy of the previous element. If we had an indicator vector of [1 1 1 0] where 1 is the position that we want to copy and 0 tells us to copy the last value, using cumulative sum or cumsum on repeated versions of this vector - exactly 1 + (numel(h) / 4) will give us exactly where we would need to index into h. Therefore, create a vector of ones that is the length of h added with 1 + (numel(h) / 4 to ensure that we make space for the duplicate elements, then make sure every fourth element is set to 0 before applying the cumsum:
N = numel(h);
rep = ones(1, N + 1 + (N / 4));
rep(4:4:end) = 0;
rep = cumsum(rep);
h_rep = h(rep);
Thus:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
One last suggestion (thanks to user #bremen_matt) would be to reshape your vector into a matrix so that it has 3 rows, duplicate the last row, then reshape the resulting duplicated matrix back to a single vector:
h_rep = reshape(h, 3, []);
h_rep = reshape([h_rep; h_rep(end,:)], 1, []);
We again get:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
Of course the obvious caveat with the above code is that the length of vector h is evenly divisible by 4.
(Modified according to rayryeng's correct observations)...
Another solution is to play around with the reshape function. If you reshape the matrix to a 3xn matrix first...
B = reshape(h,3,[])
And then copy the last row
B = [B;B(end,:)]
And finally vectorize the solution...
B(:).'
You can use just indexing:
h = [1,2,3,4,5,6,7,8,9,10,11,12]; % initial data
n = 3; % step for repetition
h_rep = h(ceil(n/(n+1):n/(n+1):end));
An index-based approach (using sort):
h_rep = h(sort([1:numel(h) 3:3:numel(h)]));
Or a slightly shorter syntax...
h_rep = h(sort([1:end 3:3:end]));
I think this will do it:
h = [1,2,3,4,5,6,7,8,9,10,11,12];
h0=kron(h,[1 1])
h_rep=h0(mod(1:length(h0),2)==0 | mod(1:length(h0),3)==2)
Answer:
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
Explanation:
After duplicating every element, you select only those that you wants. You can extend this idea to duplicate second and third. etc..
I am trying to remove array values whose difference is a member of that array in MATLAB. For example, if I have an array defined as
x = [1 2 4 3 7];
I would like to remove 2, because it can be achieved from 4 - 2. I would also like to remove 4 because it can be achieved from 7 - 3. I would then like to store these values (2 and 4, respectively) into a matrix. The latter is easy. I just have a hard time doing this checker for summation.
I know you can use
ismember(*any 2 differences*),x(:))
to check if the differences are in the array. However, I don't know how to code my function to try out all the combinations of element subtraction.
Seemed like a good setup to use bsxfun -
abs_diffs = abs(bsxfun(#minus,x(:),x(:).')) %//'
unq_abs_diffs = unique(abs_diffs)
out = x(~any(bsxfun(#eq,unq_abs_diffs(:),x(:).'),1)) %//'
%// OR x(~ismember(x,unq_abs_diffs))
Sample run -
>> x
x =
1 2 4 3 7
>> abs_diffs = abs(bsxfun(#minus,x(:),x(:).'))
abs_diffs =
0 1 3 2 6
1 0 2 1 5
3 2 0 1 3
2 1 1 0 4
6 5 3 4 0
>> unq_abs_diffs = unique(abs_diffs)
unq_abs_diffs =
0
1
2
3
4
5
6
>> out = x(~any(bsxfun(#eq,unq_abs_diffs(:),x(:).'),1))
out =
7
So, in [1 2 4 3 7], only 7 seemed like the one that could not be removed.
You could do it like this:
n = length(a);
differences = meshgrid(a,a) - meshgrid(a,a)'; % get differences between elements
differences(1:n+1:n*n) = []; % remove diagonal
a(ismember(a,differences)) = []; % remove elements in differences
I'm assuming that you only want differences between unique elements. If you want to allow the difference between an element of a and itself, then remove the 3rd line.
I'm trying to create a script to solve my problem, but I got stuck in one place.
So I have imported .txt file with 4x1 sized matrix (simplified to give an example in my case it might be 1209x1 matrix) which contains some coordinate X. And it's look like this:
0
1
2
3
That's coordinates for one period, and I need to get one column for different number of periods N . Each period is the same and lenght=L
So you can do it manually by this script, for example for N=3 periods:
X=[X; X+L; X+2*L];
so for example if L=3
then i will get
0
1
2
3
3
4
5
6
6
7
8
9
it works well but it's not efficient in case if I need to work with number of periods let's say N=1000 or if I need to change their number quickly. Any solution to parameterize this operation so I can just put number for N and get X for N periods?
Thanks and Regards
I don't have MATLAB on this machine so I can't test, but the most straightforward implementation would be something like:
n = 1000;
L = 3;
nvalues = length(X); % Assuming X is your initial vector
newx = zeros(n*nvalues, 1); % Preallocate new array
for ii = 0:(n-1)
startidx = (nvalues*ii) + 1;
endidx = nvalues*(ii+1);
newx(startidx:endidx) = X + ii*L
end
You can use bsxfun to create X, X+L, X+2*L, ... and then reshape it to a vector
>> F=bsxfun(#plus, X, (0:(N-1))*L)
F =
0 3 6
1 4 7
2 5 8
3 6 9
>> X=F(:)
X =
0
1
2
3
3
4
5
6
6
7
8
9
or in a more concise form:
>> X=reshape(bsxfun(#plus, X, (0:(N-1))*L), [], 1)
X =
0
1
2
3
3
4
5
6
6
7
8
9
I have quite big array. To make things simple lets simplify it to:
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
So, there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now, I want to keep only columns, which belong to group of 3 or more elements. So it will be like:
B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];
I was doing it using for loop, scanning separately 1's, 2's, 3's and so on, but its extremely slow with big arrays...
Thanks for any suggestions how to do it in more efficient way :)
Art.
A general approach
If your vector is not necessarily sorted, then you need to run to count the number of occurrences of each element in the vector. You have histc just for that:
elem = unique(A);
counts = histc(A, elem);
B = A;
B(ismember(A, elem(counts < 3))) = []
The last line picks the elements that have less than 3 occurrences and deletes them.
An approach for a grouped vector
If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:
start_idx = find(diff([0, A]))
counts = diff([start_idx, numel(A) + 1]);
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = []
Again, note that the vector need not to be entirely sorted, just that similar elements are adjacent to each other.
Here is my two-liner
counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));
accumarray is used to count the individual members of A. find extracts the ones that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A needs not be sorted. Of course, accumarray only works for integer values in A.
What you are describing is called run-length encoding.
There is software for this in Matlab on the FileExchange. Or you can do it directly as follows:
len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));
Once you have your run-length encoding you can remove elements based on the length. i.e.
idx = (len>=3)
len = len(idx);
val = val(idx);
And then decode to get the array you want:
i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1;
j(1) = 1;
B = val(cumsum(j));
Here's another way to do it using matlab built-ins.
% Set up
A=[1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5];
threshold=2;
% Get the unique elements of the array
uniqueElements=unique(A);
% Count haw many times each unique element occurs
counts=histc(A,uniqueElements);
% Write which elements should be kept
toKeep=uniqueElements(counts>threshold);
% Make a logical index
indexer=false(size(A));
for i=1:length(toKeep)
% For every unique element we want to keep select the indices in A that
% are equal
indexer=indexer|(toKeep(i)==A);
end
% Apply index
B=A(indexer);
This question already has answers here:
Repeat copies of array elements: Run-length decoding in MATLAB
(5 answers)
Closed 8 years ago.
My question is similar to this one, but I would like to replicate each element according to a count specified in a second array of the same size.
An example of this, say I had an array v = [3 1 9 4], I want to use rep = [2 3 1 5] to replicate the first element 2 times, the second three times, and so on to get [3 3 1 1 1 9 4 4 4 4 4].
So far I'm using a simple loop to get the job done. This is what I started with:
vv = [];
for i=1:numel(v)
vv = [vv repmat(v(i),1,rep(i))];
end
I managed to improve by preallocating space:
vv = zeros(1,sum(rep));
c = cumsum([1 rep]);
for i=1:numel(v)
vv(c(i):c(i)+rep(i)-1) = repmat(v(i),1,rep(i));
end
However I still feel there has to be a more clever way to do this... Thanks
Here's one way I like to accomplish this:
>> index = zeros(1,sum(rep));
>> index(cumsum([1 rep(1:end-1)])) = 1;
index =
1 0 1 0 0 1 1 0 0 0 0
>> index = cumsum(index)
index =
1 1 2 2 2 3 4 4 4 4 4
>> vv = v(index)
vv =
3 3 1 1 1 9 4 4 4 4 4
This works by first creating an index vector of zeroes the same length as the final count of all the values. By performing a cumulative sum of the rep vector with the last element removed and a 1 placed at the start, I get a vector of indices into index showing where the groups of replicated values will begin. These points are marked with ones. When a cumulative sum is performed on index, I get a final index vector that I can use to index into v to create the vector of heterogeneously-replicated values.
To add to the list of possible solutions, consider this one:
vv = cellfun(#(a,b)repmat(a,1,b), num2cell(v), num2cell(rep), 'UniformOutput',0);
vv = [vv{:}];
This is much slower than the one by gnovice..
What you are trying to do is to run-length decode. A high level reliable/vectorized utility is the FEX submission rude():
% example inputs
counts = [2, 3, 1];
values = [24,3,30];
the result
rude(counts, values)
ans =
24 24 3 3 3 30
Note that this function performs the opposite operation as well, i.e. run-length encodes a vector or in other words returns values and the corresponding counts.
accumarray function can be used to make the code work if zeros exit in rep array
function vv = repeatElements(v, rep)
index = accumarray(cumsum(rep)'+1, 1);
vv = v(cumsum(index(1:end-1))+1);
end
This works similar to solution of gnovice, except that indices are accumulated instead being assigned to 1. This allows to skip some indices (3 and 6 in the example below) and remove corresponding elements from the output.
>> v = [3 1 42 9 4 42];
>> rep = [2 3 0 1 5 0];
>> index = accumarray(cumsum(rep)'+1, 1)'
index =
0 0 1 0 0 2 1 0 0 0 0 2
>> cumsum(index(1:end-1))+1
ans =
1 1 2 2 2 4 5 5 5 5 5
>> vv = v(cumsum(index(1:end-1))+1)
vv =
3 3 1 1 1 9 4 4 4 4 4