Vectorizing Matlab replace array values from start to end - arrays

I have an array in which I want to replace the values at a known set of indices with the value immediately preceding each of them. As an example, my array might be
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
and the indices of values to be replaced by previous values might be
y = [2, 3, 8];
I want this replacement to occur from left to right, i.e. from start to finish. That is, the value at index 2 should be replaced by the value at index 1 before the value at index 3 is replaced by the value at index 2. The result using the arrays above should be
[1, 1, 1, 4, 5, 6, 7, 7, 9, 0]
However, if I use the obvious method to achieve this in Matlab, my result is
>> x(y) = x(y-1)
x =
1 1 2 4 5 6 7 7 9 0
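Hopefully you can see that the replacement did not cascade: the right-hand side x(y-1) is evaluated in full before anything is assigned, so index 3 received the original value at index 2 rather than the updated one, as if the operation had been performed right to left.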
My question is this: Is there some way of achieving my desired result in a simple way, without brute force looping over the arrays or doing something time consuming like reversing the arrays around?

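Practically this is still a loop, but the number of iterations is bounded by the length of the longest run of consecutive indices in y rather than by the length of the array.
% repeat until every x(y) already equals its predecessor
while ~isequal(x(y), x(y-1))
    x(y) = x(y-1);
end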

Using nancumsum you can achieve a fully vectorized version. Nevertheless, for most cases the solution karakfa provided is probably the one to prefer. Only in extreme cases, with long runs of consecutive indices in y, is this code faster.
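src = y;                        % source indices; y itself stays the assignment target
c1 = [0, diff(y)==1];           % 1 where y(k) directly follows y(k-1)
c1(c1==0) = nan;
shift = nancumsum(c1, 2, 4);    % intended: position within each run of consecutive indices
m = ~isnan(shift);
src(m) = src(m) - shift(m);     % map every run member back to the start of its run
x(y) = x(src-1)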
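The run-start idea behind the vectorized version can be illustrated outside Matlab. The following is only a sketch in Python (0-based indices, so the example indices become 1, 2, 7); the names run_sources and replace_with_previous are hypothetical, and y is assumed to be sorted with every index at least 1.

```python
def run_sources(y):
    """For each target index in sorted y, return the index its value should come from.

    Members of a run of consecutive indices all read from the element just
    before the run, which is what a left-to-right replacement produces.
    """
    sources = []
    for k, idx in enumerate(y):
        if k > 0 and idx == y[k - 1] + 1:
            sources.append(sources[-1])   # same run: reuse the run's source
        else:
            sources.append(idx - 1)       # new run: read from the preceding element
    return sources


def replace_with_previous(x, y):
    """Return a copy of x with the left-to-right replacement applied at indices y."""
    out = list(x)
    for target, source in zip(y, run_sources(y)):
        out[target] = x[source]
    return out


result = replace_with_previous([1, 2, 3, 4, 5, 6, 7, 8, 9, 0], [1, 2, 7])
```

Because every source index lies outside the runs being overwritten, the values can be read from the original array in any order, which is what makes a single vectorized assignment possible.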

Related

What do these constraints mean?

Can anyone help me understand this coding problem assignment?
I have an array of numbers, where each number appears twice except for one, and I need to identify which is the number that only appears once.
E.g.
const num_list = [8, 6, 3, 2, 4, 2, 3, 4, 5, 8, 7, 7, 6]
Answer: 5
The thing I'm confused about though is the constraints given for the problem are:
2 <= num_list[i] <= 100000
3 <= i <= 10,000
In particular the second constraint given - What does 'i' refer to here? Is it just stating the minimum number of elements that will be in the array (there are multiple test cases with different arrays as input)? Or does it mean that if I iterate over the array I can only start iterating from index 3 of the array onwards?
Thanks in advance
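For reference, the task itself does not depend on where iteration starts. One plausible reading (an assumption, since the original problem statement is not shown here) is that the second constraint bounds the number of elements rather than restricting which indices may be visited. A minimal Python sketch of one common approach, XOR-ing all values so that pairs cancel out; find_single is a hypothetical name:

```python
from functools import reduce
from operator import xor


def find_single(num_list):
    """Return the value that appears once when every other value appears exactly twice."""
    # a ^ a == 0 and a ^ 0 == a, so all paired values cancel and the single one remains
    return reduce(xor, num_list, 0)


answer = find_single([8, 6, 3, 2, 4, 2, 3, 4, 5, 8, 7, 7, 6])
```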

Minimum increment decrement operations to make array non increasing

Given an array a, your task is to convert it into non-increasing
form using the minimum number of operations, where each operation
increments or decrements one array element by 1.
Examples :
Input : a[] = {3, 1, 2, 1}
Output : 1
Explanation : We can convert the array into 3 1 1 1 by changing the 3rd element of the array, i.e. 2, into its previous integer 1 in one step, hence only one step is required.
Input : a[] = {3, 1, 5, 1}
Output : 4
Explanation : We need to decrease 5 to 1 to make the array sorted in non-increasing order.
Input : a[] = {1, 5, 5, 5}
Output : 4
Explanation : We need to increase 1 to 5.
This is the problem: https://www.geeksforgeeks.org/minimum-incrementdecrement-to-make-array-non-increasing/
The solution given there seems wrong. It fails for the test case
{1, 2, 3, 4, 5}. The answer should be 6, with all array elements converted into 3, but the code gives 4 as the output. I don't think it is a greedy algorithm problem. What should the approach be?
The code given on gfg is correct. You must have missed the part where the element is pushed twice: once inside the if condition and once outside it.
P.S. I am also looking for the logic behind it.
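The double push mentioned above can be sketched as follows. This is a Python rendering of the min-heap approach as I understand it from the answer, not a copy of the gfg code; min_ops_non_increasing is a hypothetical name.

```python
import heapq


def min_ops_non_increasing(a):
    """Minimum number of +1/-1 steps needed to make a non-increasing."""
    heap = []      # min-heap of values seen so far
    total = 0
    for value in a:
        if heap and heap[0] < value:
            # the smallest earlier value is below the current one: pay the gap
            total += value - heap[0]
            heapq.heapreplace(heap, value)   # pop the smallest, push value (first push)
        heapq.heappush(heap, value)          # second push, done for every element
    return total


results = [min_ops_non_increasing(case) for case in
           ([3, 1, 2, 1], [3, 1, 5, 1], [1, 5, 5, 5], [1, 2, 3, 4, 5])]
```

With both pushes in place, {1, 2, 3, 4, 5} yields 6. Dropping the push outside the if condition yields 4, which matches the wrong output reported in the question.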

Efficient way of finding sequential numbers across multiple arrays?

I'm not looking for any code or having anything being done for me. I need some help to get started in the right direction but do not know how to go about it. If someone could provide some resources on how to go about solving these problems I would very much appreciate it. I've sat with my notebook and am having trouble designing an algorithm that can do what I'm trying to do.
I can probably do:
foreach element in array1
foreach element in array2
check if array1[i] == array2[j]+x
I believe this would work for both forward and backward sequences, and for the multiples just check array1[i] % array2[j] == 0. I have a list which contains int arrays and am getting list[index] (for array1) and list[index+1] for array2, but this solution can get complex and lengthy fast, especially with large arrays and a large list of those arrays. Thus, I'm searching for a better solution.
I'm trying to come up with an algorithm for finding sequential numbers in different arrays.
For example:
[1, 5, 7] and [9, 2, 11] would find that 1 and 2 are sequential.
This should also work for multiple sequences in multiple arrays. So if there is a third array of [24, 3, 15], it will also include 3 in that sequence, and continue on to the next array until there isn't a number that matches the last sequential element + 1.
It also should be able to find more than one sequence between arrays.
For example:
[1, 5, 7] and [6, 3, 8] would find that 5 and 6 are sequential and also 7 and 8 are sequential.
I'm also interested in finding reverse sequences.
For example:
[1, 5, 7] and [9, 4, 11] would return that 5 and 4 are reverse sequential.
Example with all:
[1, 5, 8, 11] and [2, 6, 7, 10] would return 1 and 2 are sequential, 5 and 6 are sequential, 8 and 7 are reverse sequential, 11 and 10 are reverse sequential.
It can also overlap:
[1, 5, 7, 9] and [2, 6, 11, 13] would return 1 and 2 sequential, 5 and 6 sequential and also 7 and 6 reverse sequential.
I also want to expand this to check numbers with a difference of x (above examples check with a difference of 1).
In addition to all of that (although this might be a different question), I also want to check for multiples,
Example:
[5, 7, 9] and [10, 27, 8] would return 5 and 10 as multiples, 9 and 27 as multiples.
and numbers with the same ones place.
Example:
[3, 5, 7] and [13, 23, 25] would return 3 and 13 and 23 have the same ones digit.
Use a dictionary (set or hashmap)
dictionary1 = {}
Go through each item in the first array and add it to the dictionary.
[1, 5, 7]
Now dictionary1 = {1:true, 5:true, 7:true}
dictionary2 = {}
Now go through each item in [6, 3, 8] and lookup if it's part of a sequence.
6 is part of a sequence because dictionary1[6+1] == true
so dictionary2[6] = true
We get dictionary2 = {6:true, 8:true}
Now set dictionary1 = dictionary2 and dictionary2 = {}, and go to the third array.. and so on.
We only keep track of sequences.
Since each lookup is O(1), and we do 2 lookups per number (e.g. 6-1 and 6+1), the total is N*O(1), which is O(N), where N is the number of numbers across all the arrays.
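The dictionary steps above can be sketched in Python with a set standing in for the dictionary. This is only an illustration: sequential_pairs is a hypothetical name, and the difference x is assumed to be a positive integer.

```python
def sequential_pairs(arrays, x=1):
    """For each pair of adjacent arrays, list (a, b) with a in the chain so far,
    b in the next array, and |a - b| == x."""
    results = []
    current = set(arrays[0])                 # plays the role of dictionary1
    for arr in arrays[1:]:
        pairs = []
        for b in arr:
            for a in (b - x, b + x):         # two O(1) lookups per number
                if a in current:
                    pairs.append((a, b))
        results.append(pairs)
        current = {b for _, b in pairs}      # only sequence members are carried forward
    return results


two_arrays = sequential_pairs([[1, 5, 7], [6, 3, 8]])
three_arrays = sequential_pairs([[1, 5, 7], [9, 2, 11], [24, 3, 15]])
```

Forward and reverse sequences are both reported, so for [1, 5, 7] and [6, 3, 8] the pair 7 and 6 shows up alongside 5 and 6 and 7 and 8.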
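A brute force approach will be O(c^n) (exponential) if it enumerates every combination of one element per array, where c is the average number of elements per array and n is the total number of arrays. Comparing only adjacent arrays pairwise, as in your pseudocode, costs roughly O(n*c^2), which still grows quickly with large arrays.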
If the input space is sparse (meaning there will be more missing numbers on average than present numbers), then one way to speed up this process is to first create a single sorted set of all the unique numbers from all your different arrays. This "master" set will then allow you to early exit (i.e. break statements in your loops) on any sequences which are not viable.
For example, if we have input arrays [1, 5, 7] and [6, 3, 8] and [9, 11, 2], the master ordered set would be {1, 2, 3, 5, 6, 7, 8, 9, 11}. If we are looking for n+1 type sequences, we could skip checking any sequence that contains a 3 or 9 or 11 (because the n+1 value is not present at the next index in the sorted set). While the speedups are not drastic in this particular example, if you have hundreds of input arrays and a very large range of values for n (sparsity), then the speedups should be exponential because you will be able to early exit on many permutations. If the input space is not sparse (such as in this example where we didn't have many holes), the speedups will be less than exponential.
A further improvement would be to store a "master" set of key-value pairs, where the key is the n value as shown in the example above, and the value portion of the pair is a list of the indices of any arrays that contain that value. The master set of the previous example would then be: {[1, 0], [2, 2], [3, 1], [5, 0], [6, 1], [7, 0], [8, 1], [9, 2], [11, 2]}. With this architecture, scan time could potentially be as low as O(c*n), because you could just traverse this single sorted master set looking for valid sequences instead of looping over all the sub-arrays. By also requiring the array indexes to increment, you can clearly see that the 1->2 sequence can be skipped because the arrays are not in the correct order, and the same with the 2->3 sequence, etc. Note this toy example is somewhat oversimplified because in practice you would need a list of indices for the value portions of the key-value pairs. This would be necessary if the same value of n ever appeared in multiple arrays (duplicate values).
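A rough Python sketch of the key-value master set follows. The names build_master and forward_pairs are hypothetical, and a dictionary lookup for n+1 is used in place of checking the next position in the sorted set, which is an implementation choice of this sketch rather than part of the description above.

```python
from collections import defaultdict


def build_master(arrays):
    """Map each value to the list of indices of the arrays that contain it, sorted by value."""
    master = defaultdict(list)
    for idx, arr in enumerate(arrays):
        for value in arr:
            master[value].append(idx)
    return dict(sorted(master.items()))


def forward_pairs(master):
    """Return (n, n + 1, i) where n is in array i and n + 1 is in array i + 1."""
    pairs = []
    for value, owners in master.items():
        next_owners = master.get(value + 1)
        if next_owners is None:
            continue                          # early exit: n + 1 is absent everywhere
        next_set = set(next_owners)
        for i in owners:
            if i + 1 in next_set:             # require the array index to increment
                pairs.append((value, value + 1, i))
    return pairs


master = build_master([[1, 5, 7], [6, 3, 8], [9, 11, 2]])
pairs = forward_pairs(master)
```

Storing a list of array indices per value is what handles the duplicate-value case mentioned at the end of the paragraph above.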

Removing numbers from a large range of numbers

I've got the following problem that I'm trying to find a more optimal solution for.
Let's say you have a range of numbers between 0 and 9:
Values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Index: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Now, let's say you "remove" 1, 4, 5, and 7:
Values: 0, -, 2, 3, -, -, 6, -, 8, 9
Index: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Where there is no value, all subsequent values are shifted to the left:
Values: 0, 2, 3, 6, 8, 9
Index: 0, 1, 2, 3, 4, 5
The value at index 1 has now become 2 (was 1), the value at index 2 is now 3 (was 2), the value at index 3 is now 6 (was 3), etc.
Here's the problem. I need to manage this on a larger scale, up to tens of thousands of values. A random number of those values will be removed from the original contiguous range, and potentially added back afterwards (but not in the same order they were removed). The starting state will always be a complete sequence of numbers between 0 and MAX_VAL.
Things I've tried:
1) Maintaining an array of values, removing values from that array, and shifting everything over by one. This fails because you're iterating through all the values after the one you've just removed, and it's too slow as a result. Getting the value for a given index afterwards is really fast though.
2) Maintaining a linked list of values, and removing the value by pulling it out of the list. This seems to be slow both adding/removing values and getting the value at a given index, since I need to walk through the list first.
3) Keeping track of the "removed" values, rather than maintaining a giant array/list/etc of values from 0 to MAX_VAL. If the removed values are stored in an ordered array, then it becomes trivial to calculate how many values have been removed before and after a given index, and just return an offset index instead. This kinda works, except it's slow to maintain the ordered array of removed values and iterate through that instead, especially if the number of removed values approaches MAX_VAL.
Is there some sort of algorithm or technique that can handle this kind of problem more quickly and efficiently?
The answer very much depends on typical use cases:
Is the set of numbers typically sparse or dense?
How often do you do insertions vs. removals vs. lookups?
In which patterns are numbers inserted or removed (random, continuous, from the end or start)?
Are there any memory constraints?
Here are some ideas for a generic solution:
Create a structure that stores ranges instead of numbers.
Start with a single entry: 0 - MAX_VAL.
A range can have subranges. This resulting graph of ranges forms a tree.
Removing a number splits a leaf range into two, creating two new leaves.
This algorithm would perform quite well when the set is dense (because there are few ranges). It would still be reasonably fast as the graph grows (O(log n) for lookups), provided you keep the tree balanced.
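The range idea can be sketched in Python as below. This is a simplified variant that keeps the ranges in a flat sorted list rather than the balanced tree described above, so lookups by index are linear in the number of ranges instead of O(log n). RangeSet and its method names are hypothetical, and adding values back is left out.

```python
import bisect


class RangeSet:
    """Remaining values stored as sorted, disjoint, inclusive (start, end) ranges."""

    def __init__(self, max_val):
        self.ranges = [(0, max_val)]          # start with a single entry: 0 - MAX_VAL

    def remove(self, value):
        # locate the last range whose start is <= value
        i = bisect.bisect_right(self.ranges, (value, float("inf"))) - 1
        if i < 0:
            return
        start, end = self.ranges[i]
        if value > end:
            return                            # value was already removed
        pieces = []
        if start < value:
            pieces.append((start, value - 1))
        if value < end:
            pieces.append((value + 1, end))
        self.ranges[i:i + 1] = pieces         # split the range around the removed value

    def value_at(self, index):
        # walk the ranges, skipping whole ranges until the index falls inside one
        for start, end in self.ranges:
            size = end - start + 1
            if index < size:
                return start + index
            index -= size
        raise IndexError("index out of range")


rs = RangeSet(9)
for removed in (1, 4, 5, 7):
    rs.remove(removed)
remaining = [rs.value_at(i) for i in range(6)]
```

To get the logarithmic lookups mentioned above, the flat list would have to be replaced by a balanced tree whose nodes also store the number of values beneath them.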
Now, let's say you "remove" 1, 4, 5, and 7:
Values: 0, -100, 2, 3, -100, -100, 6, -100, 8, 9 // use a sentinel value that does not otherwise occur in the array
Index: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Calling Groups of Elements of Matlab Arrays

I'm dealing with long daily time series in Matlab, running over periods of 30-100+ years. I've been meaning to start looking at it by seasons, roughly approximating that by taking 91-day segments of each year over the time period (with some tbd method of correcting for odd number of days in the year)
Basically, what I want is an array indexing method that allows me to make a new array that takes 91 elements every 365 elements, starting at element 1. I've been looking for some normal array methods (some (:) or other), but I haven't been able to find one. I guess an alternative would be to kind of iterate over 365-day segments 91 times, but that seems needlessly complicated.
Is there a simpler way that I've missed?
Thanks in advance for the help!
So if I understand correctly, you want to extract elements 1-91, 366-456, 731-821, and so on? I'm not sure that there is a way to do this with basic matrix indexing, but you can do the following:
days = 1:365; % create array ranging from 1 to 365
difference = length(data) - 365; % how much longer is the time series?
% extend to fit the time series; 'post' pads only at the end (the default pads both sides)
padded = padarray(days, [0, difference], 'circular', 'post');
extracted = data(padded <= 91); % keep every element whose day-of-cycle is in 1-91
Basically, I create an array the same size as your time series that repeats 1-365 over and over, then use logical indexing to pick the elements of data where the padded array is less than or equal to 91.
As a more approachable example, consider:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
days = 1:5;
difference = length(x) - 5;
padded = padarray(days, [0, difference], 'circular', 'post'); % pad at the end only
extracted = x(padded <= 2);
padded then is equal to [1, 2, 3, 4, 5, 1, 2, 3, 4, 5] and extracted is going to be [1, 2, 6, 7]
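The same repeating pattern can also be expressed with modulo arithmetic instead of an explicit padded array. A minimal Python sketch of that idea (not Matlab, 0-based positions); seasonal_slice and its parameter names are hypothetical:

```python
def seasonal_slice(data, period=365, length=91):
    """Keep the first `length` elements out of every `period` elements."""
    # i % period is the 0-based position inside the current cycle
    return [value for i, value in enumerate(data) if i % period < length]


extracted = seasonal_slice([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], period=5, length=2)
```

This avoids building the repeated 1-365 array and works for any series length, including series that end partway through a cycle.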
