Sorting elements of a single array into different subarrays - arrays

I have a 1000-element array with values ranging from 1 to 120. I want to split this array into 6 subarrays according to value range, for example:
array 1 with values in the range 0-20,
array 2 with values in the range 20-40, ..., array 6 with values in the range 100-120.
At the end I would like to plot a histogram with the range on the X-axis and each bar showing the number of elements in that range. I don't know of any other way to do this kind of plotting.
Thanks

In other words, you want to create a histogram. Matlab's hist() will do this for you.

If you only need the histogram, you can achieve the result using histc, like this:
edges = 0:20:120; % edges to compute histogram
n = histc(array,edges);
n = n(1:end-1); % drop the last count, which only holds values exactly equal to edges(end) (see "histc" doc)
bar(edges(1:end-1)+diff(edges)/2, n); % do the plot; the x positions are the centre of each bin
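For readers who want to check the binning logic outside MATLAB, here is a minimal Python sketch of the same counting. It assumes half-open bins [0,20), [20,40), ..., with the last bin also including its right edge so that a value of exactly 120 is kept. The function name `bin_counts` and the sample data are hypothetical.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Count how many values fall into each bin defined by consecutive edges.

    Bins are half-open [edges[k], edges[k+1]), except the last one, which
    also includes its right edge (an assumption made for this sketch).
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the covered range: not counted
        k = bisect_right(edges, v) - 1   # index of the left edge of v's bin
        k = min(k, len(counts) - 1)      # fold v == edges[-1] into the last bin
        counts[k] += 1
    return counts

edges = list(range(0, 121, 20))              # 0, 20, ..., 120
sample = [1, 19, 20, 39, 40, 100, 119, 120]  # made-up data
counts = bin_counts(sample, edges)
print(counts)
```

The resulting list has one entry per range, so it can be passed directly to a bar-plot routine.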

Related

How to count for 2 different arrays how many times the elements are repeated, in MATLAB?

I have arrays A (44x1) and B (41x1), and I want to count how many times each element is repeated in each array. If a repeated value is present in both arrays, I want to divide its counts (for instance, if the value 0.5 appears 500 times in A and 350 times in B, divide 500 by 350).
I have to do this for bigger arrays as well, so I was thinking about using a loop (but I have no idea how to do it in MATLAB).
I got what I want in Python:
import pandas as pd
data1 = pd.read_excel('C:/Users/Desktop/Python/data1.xlsx')
data2 = pd.read_excel('C:/Users/Desktop/Python/data2.xlsx')
for i in data1['Mag'].value_counts() & data2['Mag'].value_counts():
    a = data1['Mag'].value_counts()/data2['Mag'].value_counts()
    print(a)
    break
Any idea how to do the same in MATLAB? Thanks!
Since you can enumerate all valid earthquake magnitude values, you could use:
% Make up some data
A=randi([2 58],[100 1])/10;
B=randi([2 58],[20 1])/10;
% Round data to nearest tenth
%A=round(A,1); %uncomment if necessary
%B=round(B,1); %same
% Divide frequencies
validmags=0.2:0.1:5.8;
Afreqs=sum(double( abs(A-validmags)<1e-6 ),1); %relies on implicit expansion; A must be a column vector and validmags must be a row vector; dimension argument to sum() only to remind user; double() not really needed
Bfreqs=sum(double( abs(B-validmags)<1e-6 ),1); %same
Bfreqs./Afreqs, %for a fancier version: [{'Magnitude'} num2cell(validmags) ; {'Freq(B)/Freq(A)'} num2cell(Bfreqs./Afreqs)].'
The last line will produce NaN for 0/0, +Inf for nn/0, and 0 for 0/nn (where nn is any nonzero count).
You could also use uniquetol, align the unique values of each vector, and divide the respective absolute frequencies. But I think the above approach is cleaner and easier to understand.
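The alternative mentioned above (collect the distinct values of each vector, then divide the matching counts) can be sketched in Python with the standard library instead of pandas. This is a minimal sketch, not the answerer's code: rounding to one decimal mirrors the commented-out round() calls, the ratio is count in A divided by count in B as in the question, and the function name `frequency_ratios` and the sample data are hypothetical.

```python
from collections import Counter

def frequency_ratios(a, b, ndigits=1):
    """Return {value: count_in_a / count_in_b} for values present in both."""
    # Round first so floating-point noise does not split equal magnitudes
    count_a = Counter(round(x, ndigits) for x in a)
    count_b = Counter(round(x, ndigits) for x in b)
    shared = sorted(count_a.keys() & count_b.keys())
    return {v: count_a[v] / count_b[v] for v in shared}

a = [0.5, 0.5, 0.5, 1.2, 3.4]  # made-up data
b = [0.5, 0.5, 1.2, 2.0]       # made-up data
ratios = frequency_ratios(a, b)
print(ratios)
```

Because only shared values are kept, no division by zero can occur, unlike the element-wise division over all valid magnitudes.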

How to scale very large numbers such that they could be represented as an array index?

I have a 2D array of size 30*70 (30 rows, 70 columns).
My values are very large, ranging from 8066220960081 up to some number of the same order of magnitude, and I need to draw a scatter plot in the array. How do I index into the array given such large values?
Also, I need to do this in kernel space.
Let's take an array long long int A with large values.
A[0] = 393782040
A[1] = 2*393782040
... and so on
A[N] = 8066220960081; where N = 30*70 - 1
We can scale A by a factor, or shift A by a constant and then scale it. That lets you work with numbers in a range such as 0 to 1, -1 to 1, or any x to y, whichever you need. In theory this should not change the scatter plot other than the placement of the axes. However, if the scatter plot also represents the underlying values, i.e. the dots are proportional to the values, it is a good idea to be nice to your plotting tool and not flood it with very large values that might overflow, depending on how the plotting code is written.
PS: I would assume you know how to flatten a 2d array.
I just ended up computing a regular interval between max and min, and then using min + interval*index to get the number, where index is the index into the array.
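The shift-and-scale idea and the min + interval*index formula can be sketched as follows. This is a user-space Python illustration only, not kernel code. It assumes a linear mapping from [lo, hi] onto the indices 0 to size-1 with size greater than 1, and the helper names `to_index` and `from_index` are hypothetical.

```python
def to_index(value, lo, hi, size):
    """Map value in [lo, hi] linearly onto an integer index in [0, size - 1]."""
    if hi == lo:
        return 0  # degenerate range: everything lands in the first slot
    # Integer arithmetic only, so very large values lose no precision
    idx = (value - lo) * (size - 1) // (hi - lo)
    return max(0, min(size - 1, idx))  # clamp out-of-range inputs

def from_index(idx, lo, hi, size):
    """Approximate inverse: lo + interval * idx, interval = (hi - lo) / (size - 1)."""
    return lo + (hi - lo) * idx // (size - 1)

lo, hi, size = 393782040, 8066220960081, 30 * 70
print(to_index(lo, lo, hi, size), to_index(hi, lo, hi, size))
```

Sticking to integer division keeps the sketch close to what is possible in environments without floating point.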

Create list of random numbers between x and y using formula in Google Sheets

I'm trying to create a list of 50 random numbers, let's say between 100 and 500, with one formula in Google Sheets. Is there any formula like 'apply this to x cells'?
What I have tried so far (which doesn't work) is the following. I hoped the RANDARRAY function would 'force' RANDBETWEEN to create a 2D array (RANDARRAY creates a list of numbers between 0 and 1).
={
RANDARRAY(50,1), ARRAY_CONSTRAIN(RANDBETWEEN(100,500),50,1)
}
Error
Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 50. Actual: 1.
So this error indicates that array_constrain didn't help either.
Try it like this:
=ARRAYFORMULA(RANDBETWEEN(ROW(A1:A50)*0+100, 500))
In generic terms, if you need N random numbers, between X and Y, you would combine the following formulas:
RandBetween(X, Y)
Row(cell_ref)
Indirect(string_cell_ref)
ArrayFormula(array_formula)
Details
When combining a Row(cell_ref) with an ArrayFormula, you can specify a cell range or simply a number range:
ArrayFormula(Row(1:50))
The above example generates a one dimensional array (column) with the numbers 1 through 50. In order to programmatically change the number, we use the Indirect function to specify the upper bound of the range, N:
ArrayFormula(Row(Indirect("1:"&N)))
N can be a named range, hard coded, or a cell reference containing a number greater than 0. Because you want each row to contain a random number between X and Y, you need to eliminate the sequential number in each array position by multiplying the number generated by the above formula by zero:
ArrayFormula(Row(Indirect("1:"&N))*0)
which generates a one-dimensional array (column) of N zeros. Now you can combine these as follows to generate a one-dimensional array (column) of N random numbers between X and Y:
Solution
ArrayFormula(RandBetween(Row(Indirect("1:"&N))*0+X, Y))
You could use named ranges for N, X, and Y; hard-code them, e.g. 50, 100, 500; or use simple cell references as in the example below:
ArrayFormula(RandBetween(row(indirect("1:"&B1))*0+B2, B3))
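For comparison, the same task (N random integers between X and Y, both ends included, as with RANDBETWEEN) is short to write outside Sheets. A minimal Python sketch; the function name `random_column` is hypothetical.

```python
import random

def random_column(n, x, y):
    """Return a list of n random integers between x and y, inclusive."""
    # random.randint includes both endpoints, like RANDBETWEEN
    return [random.randint(x, y) for _ in range(n)]

numbers = random_column(50, 100, 500)
print(numbers)
```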

Splitting an array into n parts and then joining them again forming a histogram

I am new to Matlab.
Let's say I have an array a = [1:1:1000].
I have to divide this into 50 parts: 1-20, 21-40, ..., 981-1000.
I am trying to do it this way:
E=1000X
a=[1:E]
n=50
d=E/n
b=[]
for i=0:n
b(i)=a[i:d]
end
But I am unable to get the result.
The second part I am working on depends on another result: say my answer is 3, then the first split array should have a counter that is incremented by 1; if the answer is 45, the 3rd split array's counter should be incremented by 1, and so on. In the end I have to make a histogram of all the counters.
You can do all of this with one function: histc. In your situation:
X = (1:1:1000)';
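Edges = (1:20:1001)'; % 51 edges: 1, 21, ..., 981, 1001
Count = histc(X, Edges);
Count = Count(1:end-1); % drop the last entry, which only counts values equal to Edges(end)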
Essentially, Count contains the number of elements in X that fall into the categories defined in Edges, where Edges is a monotonically increasing vector whose elements define the boundaries of sequential categories. A more common example might be to draw X from a probability distribution, say, the uniform distribution, e.g.:
X = 1000 * rand(1000, 1);
Play around with specifications for X and Edges and you should get the idea. If you want the actual histogram plot, look into the hist function.
As for the second part of your question, I'm not really sure what you're asking.
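One possible reading of the second part is that each result should increment the counter of the bin it falls into. A minimal Python sketch under that assumption, with 50 bins of width 20 covering 1 to 1000; the function name `count_results` and the sample results are hypothetical.

```python
def count_results(results, n_bins=50, width=20, start=1):
    """Increment one counter per result, chosen by the bin the result falls in."""
    counters = [0] * n_bins
    for r in results:
        k = (r - start) // width  # 1..20 -> 0, 21..40 -> 1, 41..60 -> 2, ...
        if 0 <= k < n_bins:       # ignore results outside 1..1000
            counters[k] += 1
    return counters

counters = count_results([3, 45, 45, 1000])  # made-up results
print(counters[0], counters[2], counters[49])
```

Plotting the `counters` list as a bar chart then gives the histogram of all the counters.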

How to create a numeric vector which gives a uniform grid for all dimensions of a matrix X?

I'm applying the function histcnd to a matrix of size 744x2. This function calculates frequencies of values within certain edges. I want to set the edges to, for example, groups of 5 values, but I can't seem to be able to do it.
The syntax of the function is histcnd(X,edges), where the length of edges must equal the number of columns of X. How do I define 'edges' as a 2-column vector, so that it will group the values of each column in groups of 5?
What about using something like this:
X = randn(744,2);
[a,b] = size(X);
edges = num2cell([linspace(min(X(:,1)),max(X(:,1)),floor(a/5)); linspace(min(X(:,2)),max(X(:,2)),floor(a/5))],2); % one row of edges per column of X
% not sure if it's the same histcnd; the one I found wants edges to be a cell array
H = histcnd(X, edges);
You can probably pick the min/max values for each axis in a more intelligent way if you know something about your data.
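The idea of building one evenly spaced edge vector per column, from that column's minimum to its maximum, can be sketched in Python. This is not histcnd itself; the function name `uniform_edges` and the sample data are hypothetical, and `n_edges` is assumed to be at least 2.

```python
def uniform_edges(rows, n_edges):
    """Return one list of n_edges evenly spaced edges per column of rows."""
    columns = list(zip(*rows))  # transpose: one tuple per column
    edges = []
    for col in columns:
        lo, hi = min(col), max(col)
        step = (hi - lo) / (n_edges - 1)
        edges.append([lo + i * step for i in range(n_edges)])
    return edges

rows = [(0.0, 10.0), (2.0, 30.0), (4.0, 20.0)]  # made-up 3x2 data
edges = uniform_edges(rows, 5)
print(edges)
```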

Resources