What type of array should I use to combine numerical values and string values and have the ability to sort them in octave/matlab - arrays

I'm trying to link numerical (octave/matlab) values in an array to string values in the array how can I go about doing this. The reason I'm trying to do this is to sort the array based on the numerical values.
Example:
array=[1,2,'filename1';3,4,'filename2';5,6,'filename3'] (I know this is incorrect and will give an error)
This is what I'm trying to get it to look like so I can sort based on the first or the second column and have the third column be "linked" / follow the sort. (Please note the numbers will not be a sequential sequence like 1,2,3... I just used that as an example)
1,2,filename1
3,4,filename2
5,6,filename3
If I sort the first numerical column in descending order it should look like this
5,6,filename3
3,4,filename2
1,2,filename1
How can I go about doing this and still get the values of the array individually?
Example:
array(1,1) would be 5
and array(3,3) would be filename1
If you want to know, I plan on creating a playlist of wavfile names based on this sort.
Ps: I'm using Octave/Matlab

You can use 2 separate arrays, one with 2 columns of number and one with strings. When you sort the first array based on the first number, the function will return both the sorted list as well as the reordered indices. You can use the reordered indices to rearrange the strings.
It would be something like this:
[sort_list, indx] = sort(array1, 1);
array_string = array_strings[indx];
I don't think the access in that way is possible. Maybe you can do something similar with cells but I would be more expensive than using 2 arrays.

Use a cell arrray. You then sort using sortrows:
>> myCell = {1, 2, 'filename1';
3, 4, 'filename2';
5, 6, 'filename3'};
>> array = flipud(sortrows(myCell, 1)) %// "1" => col. "flipud" => descending
ans =
[5] [6] 'filename3'
[3] [4] 'filename2'
[1] [2] 'filename1'
>> array(1,1)
ans =
[5]
>> array(3,3)
ans =
'filename1'

Related

How to get the mean of first values in arrays in matrix in Matlab

If I have a square matrix of arrays such as:
[1,2], [2,3]
[5,9], [1,4]
And I want to get the mean of the first values in the arrays of each row such:
1.5
3
Is this possible in Matlab?
I've used the mean(matrix, 2) command to do this with a matrix of single values, but I'm not sure how to extend this to deal with the arrays.
Get the first elements in all arrays of matrix, then call mean function
mean(matrix(:,:,1))
maybe you need to reshape before call mean
a = matrix(:,:,1);
mean(a(:))
You can apply mean function inside mean function to get the total mean value of the 2D array at index 1. You can do similary with array at index 2. Consider the following snapshot.
After staring at your problem for a long time, it looks like your input is a 3D matrix where each row of your formatting corresponds to a 2D matrix slice. Therefore, in proper MATLAB syntax, your matrix is actually:
M = cat(3, [1,2; 2,3], [5,9; 1,4]);
We thus get:
>> M = cat(3, [1,2; 2,3], [5,9; 1,4])
M(:,:,1) =
1 2
2 3
M(:,:,2) =
5 9
1 4
The first slice is the matrix [1,2; 2,3] and the second slice is [5,9; 1,4]. From what it looks like, you would like the mean of only the first column of every slice and return this as a single vector of values. Therefore, use the mean function and index into the first column for all rows and slices. This will unfortunately become a singleton 3D array so you'll need to squeeze out the singleton dimensions.
Without further ado:
O = squeeze(mean(M(:,1,:)))
We thus get:
>> O = squeeze(mean(M(:,1,:)))
O =
1.5000
3.0000

How to structure multiple python arrays for sorting

A fourier analysis I'm doing outputs 5 data fields, each of which I've collected into 1-d numpy arrays: freq bin #, amplitude, wavelength, normalized amplitude, %power.
How best to structure the data so I can sort by descending amplitude?
When testing with just one data field, I was able to use a dict as follows:
fourier_tuples = zip(range(len(fourier)), fourier)
fourier_map = dict(fourier_tuples)
import operator
fourier_sorted = sorted(fourier_map.items(), key=operator.itemgetter(1))
fourier_sorted = np.argsort(-fourier)[:3]
My intent was to add the other arrays to line 1, but this doesn't work since dicts only accept 2 terms. (That's why this post doesn't solve my issue.)
Stepping back, is this a reasonable approach, or are there better ways to combine & sort separate arrays? Ultimately, I want to take the data values from the top 3 freqs and associated other data, and write them to an output data file.
Here's a snippet of my data:
fourier = np.array([1.77635684e-14, 4.49872050e+01, 1.05094837e+01, 8.24322470e+00, 2.36715913e+01])
freqs = np.array([0. , 0.00246951, 0.00493902, 0.00740854, 0.00987805])
wavelengths = np.array([inf, 404.93827165, 202.46913583, 134.97942388, 101.23456791])
amps = np.array([4.33257766e-16, 1.09724890e+00, 2.56328871e-01, 2.01054261e-01, 5.77355886e-01])
powers% = np.array([4.8508237956526163e-32, 0.31112370227749603, 0.016979224022185751, 0.010445983875848858, 0.086141014686372669])
The last 4 arrays are other fields corresponding to 'fourier'. (Actual array lengths are 42, but pared down to 5 for simplicity.)
You appear to be using numpy, so here is the numpy way of doing this. You have the right function np.argsort in your post, but you don't seem to use it correctly:
order = np.argsort(amplitudes)
This is similar to your dictionary trick only it computes the inverse shuffling compared to your procedure. Btw. why go through a dictionary and not simply a list of tuples?
The contents of order are now indices into amplitudes the first cell of order contains the position of the smallest element of amplitudes, the second cell contains the position of the next etc. Therefore
top5 = order[:-6:-1]
Provided your data are 1d numpy arrays you can use top5 to extract the elements corresponding to the top 5 ampltiudes by using advanced indexing
freq_bin[top5]
amplitudes[top5]
wavelength[top5]
If you want you can group them together in columns and apply top5 to the resulting n-by-5 array:
np.c_[freq_bin, amplitudes, wavelength, ...][top5, :]
If I understand correctly you have 5 separate lists of the same length and you are trying to sort all of them based on one of them. To do that you can either use numpy or do it with vanilla python. Here are two examples from top of my head (sorting is based on the 2nd list).
a = [11,13,10,14,15]
b = [2,4,1,0,3]
c = [22,20,23,25,24]
#numpy solution
import numpy as np
my_array = np.array([a,b,c])
my_sorted_array = my_array[:,my_array[1,:].argsort()]
#vanilla python solution
from operator import itemgetter
my_list = zip(a,b,c)
my_sorted_list = sorted(my_list,key=itemgetter(1))
You can then flip the array with my_sorted_array = np.fliplr(my_sorted_array) if you wish or if you are working with lists you can reverse it in place with my_sorted_list.reverse()
EDIT:
To get first n values only, you have to simply slice the array similarly to what #Paul is suggesting. Slice is done in a similar manner to classic list slicing by specifying start:stop:step (you can omit the step) arguments. In your case for 5 top columns it would be [:,-5:]. So in the example above you can take top 2 columns from each row like this:
my_sliced_sorted_array = my_sorted_array[:,-2:]
result will be:
array([[15, 13],
[ 3, 4],
[24, 20]])
Hope it helps.

Fill a vector in Julia with a repeated list

I would like to create a column vector X by repeating a smaller column vector G of length h a number n of times. The final vector X will be of length h*n. For example
G = [1;2;3;4] #column vector of length h
X = [1;2;3;4;1;2;3;4;1;2;3;4] #ie X = [G;G;G;G] column vector of
length h*n
I can do this in a loop but is there an equivalent to the 'fill' function that can be used without the dimensions going wrong. When I try to use fill for this case, instead of getting one column vector of length h*n I get a column vector of length n where each row is another vector of length h. For example I get the following:
X = [[1,2,3,4];[1,2,3,4];[1,2,3,4];[1,2,3,4]]
This doesn't make sense to me as I know that the ; symbol is used to show elements in a row and the space is used to show elements in a column. Why is there the , symbol used here and what does it even mean? I can access the first row of the final output X by X[1] and then any element of this by X[1][1] for example.
Either I would like to use some 'fill' equivalent or some sort of 'flatten' function if it exists, to flatten all the elements of the X into one column vector with each entry being a single number.
I have also tried the reshape function on the output but I can't get this to work either.
Thanks Dan Getz for the answer:
repeat([1, 2, 3, 4], outer = 4)
Type ?repeat at the REPL to learn about this useful function.
In older versions of Julia, repmat was an alternative, but it has now been deprecated and absorbed into repeat
As #DanGetz has pointed out in a comment, repeat is the function you want. From the docs:
repeat(A, inner = Int[], outer = Int[])
Construct an array by repeating the entries of A. The i-th element of inner specifies the number of times that the individual entries of the i-th dimension of A should be repeated. The i-th element of outer specifies the number of times that a slice along the i-th dimension of A should be repeated.
So an example that does what you want is:
X = repeat(G; outer=[k])
where G is the array to be repeated, and k is the number of times to repeat it.
I will also attempt to answer your confusion about the result of fill. Julia (like most languages) makes a distinction between vectors containing numbers and numbers themselves. We know that fill(5, 5) produces [5, 5, 5, 5, 5], which is a one-dimensional array (a vector) where each element is 5.
Note that fill([5], 5), however, produces a one-dimensional array (a vector) where each element is [5], itself a vector. This prints as
5-element Array{Array{Int64,1},1}:
[5]
[5]
[5]
[5]
[5]
and we see from the type that this is indeed a vector of vectors. That of course is not the same thing as the concatenation of vectors. Note that [[5]; [5]; [5]; [5]; [5]] is syntax for concatenation, and will return [5, 5, 5, 5, 5] as you might expect. But although ; syntax (vcat) does concatenation, fill does not do concatenation.
Mathematically (under certain definitions), we may imagine R^(kn) to be distinct (though isomorphic to) from (R^k)^n, for instance, where R^k is the set of k-tuples of real numbers. fill constructs an object of the latter, whereas repeat constructs an object of the former.
As long as you are working with 1-dimensional arrays (Vectors)...
X=repmat(G,4) should do it.
--
On another note, Julia makes no distinction between row and column vector, they are both one-dimensional arrays.
[1,2,3]==[1;2;3] returns true as they are both 3-element Array{Int64,1} or vectors (Array{Int,1} == Vector{Int} returns true)
This is one of the differences between Matlab and Julia...
If, for some specific reason you want to do it, you can create 2-dimensional Arrays (or Matrices) with one of the dimensions equal to 1.
For example:
C = [1 2 3 4] will create a 1x4 Array{Int64,2} the 2 there indicates the dimensions of the Array.
D = [1 2 3 4]' will create a 4x1 Array{Int64,2}.
In this case, C == D returns false of course. But neither is a Vector for Julia, they are both Matrices (Array{Int,2} == Matrix{Int} returns true).

Binsearch in cell matrix in Matlab

I have a cell matrix with 2 rows - sorted cell array of different strings and array of numbers. Also I have an example string. It's guarantied that this string appears in the 1st row in the cell array. I want to get index of example's appearance in cell array of strings.
Is there any function in Matlab, that provides solving with logarithmic complexity (something like binary search)?
If you look at ismember code (type open ismember), you'll see that it basically
Checks if the array is sorted (by calling issorted);
If not, it sorts the array;
Then it applies binary search.
So you can directly use ismember.
Example:
>> strings = {'a', 'aa', 'be', 'day', 'yes'};
>> [tf, loc ] = ismember('day', strings);
>> loc
loc =
4
Or maybe modify ismember (save it with another name) to bypass step 1, since you already know your array is sorted.

MATLAB: comparing all elements in three arrays

I have three 1-d arrays where elements are some values and I want to compare every element in one array to all elements in other two.
For example:
a=[2,4,6,8,12]
b=[1,3,5,9,10]
c=[3,5,8,11,15]
I want to know if there are same values in different arrays (in this case there are 3,5,8)
The answer given by AB is correct, but it is specific for the case when you have 3 arrays that you are comparing. There is another alternative that will easily scale to any number of arrays of arbitrary size. The only assumption is that each individual array contains unique (i.e. non-repeated) values:
>> allValues = sort([a(:); b(:); c(:)]); %# Collect all of the arrays
>> repeatedValues = allValues(diff(allValues) == 0) %# Find repeated values
repeatedValues =
3
5
8
If the arrays contains repeated values, you will need to call UNIQUE on each of them before using the above solution.
Leo is almost right, should be
unique([intersect(a,[b,c]), intersect(b,c)])
c(ismember(c,a)|ismember(c,b)),
ans =
3 5 8
I think this works for all matrices.
Define what you mean by compare. If the arrays are of the same length, and you are comparing equality then you can just do foo == bar -- it's vectorized. If you need to compare in the less than/greater than sense, you can do sign(foo-bar). If the arrays are not the same length and/or you aren't comparing element-wise -- please clarify what you'd like the output of the comparison to be. For instance,
foo = 1:3;
bar = [1,2,4];
baz = 1:2;
sign(repmat(foo',1,length([bar,baz])) - repmat([bar, baz],length(foo),1))
# or, more concisely:
bsxfun(#(x,y)sign(x-y),foo',[bar,baz])
does what you ask for, but there is probably a better way depending on what you want as an output.
EDIT (OP clarified question):
To find common elements in the 3 arrays, you can simply do:
>> [intersect(a,[b,c]), intersect(b,c)]
ans =
8 3 5

Resources