I have data in column A, and would like to put the averages in column B like this:
a b
1 10 10
2 7 8.5
3 8 8.333
4 19 11
5 13 11.4
where b1 =average(a1), b2 =average(a1:a2), b3 =average(a1:a3)....
Using average() is alright for small amounts of data, but I have over 1500 data entries. I would like to find a more efficient way of doing this.
Make your initial range reference absolute, while the other is relative, i.e.:
b4 = average($a$1:a4)
You can paste or fill that down all 1500 rows and it will always increment the end of the range while keeping the beginning pinned to A1, because of the dollar signs in that reference.
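For intuition outside the spreadsheet, the same running average can be sketched in Python. This is only an illustration of the idea, not part of the spreadsheet answer; the function name `running_averages` is made up for the example. It keeps a running total instead of re-reading the whole range for every row, which is the main difference from stacking 1500 AVERAGE calls.

```python
from itertools import accumulate


def running_averages(values):
    """Return the mean of values[0..i] for every position i."""
    totals = accumulate(values)  # running sums: v0, v0+v1, v0+v1+v2, ...
    return [total / count for count, total in enumerate(totals, start=1)]


data = [10, 7, 8, 19, 13]  # column A from the question
print([round(avg, 3) for avg in running_averages(data)])
```

Each element of the result corresponds to one cell in column B.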
I need to extract data from one tab ('Extracted data') to another tab and validate the data in the following way:
if exactly 0%, assign 3
if between 0% and -10%, assign 2
if -10% or lower, assign 1
if between 0% and 10%, assign 4
if 10% or higher, assign 5
Here is the link to the file: https://docs.google.com/spreadsheets/d/1f8SFi2hNP6Anav7G7BYWyK-fasPk1pT1A2HFJblT-FI/edit?usp=sharing
I suggest you use two vlookups.
If you have a tab called 'Ranges' with the following two columns:
Percentage Result
-1000% 1
-10% 2
0% 3
10% 4
11% 5
Then the formula in cell B1 on the 'calculations' tab would be something like:
=arrayformula({"Con Potential";iferror(vlookup(vlookup(A2:A,'Extracted data'!A:D,4,0),Ranges!A:B,2,1),)})
Delete all data below cell B1 for the arrayformula to work correctly.
The inner vlookup returns column D (index 4) of the 'Extracted data' tab, because I think that is the percentage you are comparing. If not, change the 4 in that vlookup to another column index.
If it helps, please see:
https://stackoverflow.com/help/someone-answers
NB: In place of Ranges!A:B you could use a fixed array:
=arrayformula({"Con Potential";iferror(vlookup(vlookup(A2:A,'Extracted data'!A:D,4,0),{-10,1;-0.1,2;0,3;0.1,4;0.11,5},2,1),)})
If you want to temporarily see the fixed array in case you want to edit any values, place this in a cell somewhere out of the way:
={-10,1;-0.1,2;0,3;0.1,4;0.11,5}
Inside the braces, a comma (,) moves to a new column and a semicolon (;) starts a new row.
Relevance
Looking at 'Relevance' lookup from 'Position Delta' and this table in your sheet:
Since a 'position delta' value of 10 cannot both have a relevance of 5 and 4, I've made the assumption that 10 gets 5. If that is incorrect, then I'll adjust the boundaries.
Add this to cell C1 on the 'calculations' tab (clearing all cells below):
=arrayformula({"Relevance";iferror(vlookup(vlookup(calculations!A2:A,'Extracted data'!A:D,3,0),{0,5;11,4;21,3;31,2;41,1;51,0},2,1),)})
The fixed array {0,5;11,4;21,3;31,2;41,1;51,0} has these values:
0 5
11 4
21 3
31 2
41 1
51 0
If you need to change the boundaries so 10 is a 4, not 5, then change the vlookup to use this fixed range {0,5;10,4;20,3;30,2;40,1;50,0}:
0 5
10 4
20 3
30 2
40 1
50 0
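With a sorted lookup range, vlookup works in bands: anything from 0 up to (but not including) 11 gets 5, then 11 to 20 gets 4, 21 to 30 gets 3, and so on.
The ,1) at the far right of the vlookup asks for an approximate match: it returns the row with the largest boundary that is less than or equal to the 'position delta', so a result holds until the next boundary is reached.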
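The banding behaviour described above can be sketched in Python with a binary search over the sorted boundaries. This is an illustration of the lookup rule, not of how the spreadsheet implements it; `approx_lookup` and `RELEVANCE_TABLE` are names invented for the example, and the table values are the fixed array from the formula.

```python
from bisect import bisect_right

# (lower boundary, relevance) pairs, sorted by boundary, as in the fixed array
RELEVANCE_TABLE = [(0, 5), (11, 4), (21, 3), (31, 2), (41, 1), (51, 0)]


def approx_lookup(key, table):
    """Return the result paired with the largest boundary <= key.

    Returns None when key is below the first boundary, which is the case
    the formula blanks out with iferror().
    """
    boundaries = [boundary for boundary, _ in table]
    index = bisect_right(boundaries, key) - 1
    if index < 0:
        return None
    return table[index][1]


for delta in (0, 10, 11, 20, 21, 75):
    print(delta, approx_lookup(delta, RELEVANCE_TABLE))
```

Swapping in the alternative boundaries (10, 20, 30, ...) only requires changing the table, just as in the formula.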
I want to apply feature selection to a dataset (lung.mat).
After loading the data, I computed the mean distance between each feature and the others using the Jaccard measure. Then I sorted the distances in descending order into B1, selected, for example, the top 25 features, and saved the matrix in databs1.
I want to select the features that have distance values greater than the mean of the array (B1).
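close all;
clc
load lung.mat
data=lung;
[n,m]=size(data);
% pairwise Jaccard distances between feature columns (upper triangle only)
for i=1:m-1
    for j=i+1:m
        t1(i,j)=fjaccard(data(:,i),data(:,j));
    end
end
b1=sum(t1)/(m-1); % column sums of t1 divided by (m-1); computed once, after the loops
[B1,indB1]=sort(b1,'descend');
databs1=data(:,indB1(1:25)); % keep the 25 top-ranked features
databs1=[databs1,data(:,m)]; %jaccard
save('databs1.mat'); % note: this saves every variable in the workspace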
I'd be grateful for your opinions on how to keep only the values of B1 that are greater than the mean of B1, that is, cutting off the values smaller than the mean of B1.
I used this line:
B1(B1>mean(B1(:)))
After running it, B1 still has as many columns as the full dataset. For example, lung.mat has 57 features, and after this line B1 still has 57 columns.
I expected this line to cut B1 down to the features whose values are greater than the mean of B1.
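The general answer to your question is here (this seems clear to you based on your code). Note that indexing only returns the selected elements; B1 itself is unchanged unless you assign the result back, e.g. B1 = B1(B1>mean(B1(:))).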
a=randi(10,1,10) %example data
a>mean(a) %get binary matrix of which elements are larger than mean
a(a>mean(a)) %select elements from a that are larger than mean
a =
1 9 10 7 8 8 4 7 2 8
ans =
1×10 logical array
0 1 1 1 1 1 0 1 0 1
ans =
9 10 7 8 8 7 8
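The same selection can be sketched in Python to show why the original array keeps its size: filtering produces a new result and leaves the input alone, so the result has to be stored somewhere. This is only an illustration; `select_above_mean` is a name made up for the example, and it also returns the positions, which is what you would need to pick the matching feature columns.

```python
from statistics import mean


def select_above_mean(values):
    """Return (indices, selected values) for entries greater than the mean."""
    threshold = mean(values)
    indices = [i for i, v in enumerate(values) if v > threshold]
    return indices, [values[i] for i in indices]


a = [1, 9, 10, 7, 8, 8, 4, 7, 2, 8]  # the example data from the answer
kept_indices, kept_values = select_above_mean(a)
print(kept_values)   # the filtered copy
print(len(a))        # the original list is untouched
```

The positions in `kept_indices` are zero-based here, unlike MATLAB's one-based indices.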
This is my table (copied from the similar question Finding minimum value in index(match) array [EXCEL])
A                   B   C   D
tasmania            10  3   10
queensland          22  8   10
new south wales     10  12  12
northern territory  8   4   15
south australia     12  2   8
western australia   32  4   15
tasmania            72  6   16
I have criteria for B and C, and I want to retrieve the A with the lowest corresponding value D. Values in B, C and D can be duplicates, values in A can not.
Example:
B >= 8
C >= 4
Should result in "queensland" (lowest matching value of D is 10), but not the first "tasmania" row (it has the same cost, but fails the criterion on C)
I am currently trying this array formula:
{ =MIN(IF(B:B>=8;IF(C:C>=4;D;""));1) }
Which returns the correct lowest D, but since I am losing the information about A, I cannot retrieve the value for A.
This as an array formula should work for you:
=INDEX($A$1:$A$7,MATCH(MIN(IF($B$1:$B$7>=8,IF($C$1:$C$7>=4,$D$1:$D$7))),IF($B$1:$B$7>=8,IF($C$1:$C$7>=4,$D$1:$D$7)),0))
Note that if you have Excel 2016 or Office 365, you'll have access to the MINIFS function, which is probably better suited to this task (I don't actually have the newest version, so I am unable to test).
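The logic of the array formula (filter the rows by the criteria, then take the row with the smallest D) can be sketched in Python. This is an illustration only; `cheapest_match` is a hypothetical name, and the rows are my reading of the table in the question.

```python
# (A, B, C, D) rows as read from the table in the question
ROWS = [
    ("tasmania", 10, 3, 10),
    ("queensland", 22, 8, 10),
    ("new south wales", 10, 12, 12),
    ("northern territory", 8, 4, 15),
    ("south australia", 12, 2, 8),
    ("western australia", 32, 4, 15),
    ("tasmania", 72, 6, 16),
]


def cheapest_match(rows, min_b, min_c):
    """Return column A of the row with the lowest D among rows meeting both criteria."""
    candidates = [row for row in rows if row[1] >= min_b and row[2] >= min_c]
    if not candidates:
        return None  # nothing satisfies the criteria
    # min() keeps the first row on ties, like MATCH(..., 0) returning the first hit
    return min(candidates, key=lambda row: row[3])[0]


print(cheapest_match(ROWS, min_b=8, min_c=4))
```

Filtering before taking the minimum is what keeps the non-matching rows with an equal or lower D out of the result.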
I am new to array formulae and have noticed that while SUBTOTAL includes many functions, it does not feature COUNTIF (only COUNT and COUNTA).
I'm trying to figure out how I can integrate a COUNTIF-like feature to my array formula.
I have a matrix, a small subset of which looks like:
A B C D E
48 53 46 64 66
48 66 89
40 38 42 49 44
37 33 35 39 41
Thanks to the help of @Tom Sharpe in this post, I (he) was able to average the sum of each row in the matrix provided it had complete data (so rows 2 and 4 in the example above would not be included).
Now I would like to count the number of rows with complete data (so rows 2 and 4 would be ignored) which include at least one value above a given threshold (say 45).
In the current example, the result would be 2, since row 1 has 5/5 values > 45, and row 3 has 1 value > 45. Row 5 has values < 45 and rows 2 and 3 have partially or fully missing data, respectively.
I have recently discovered the SUMPRODUCT function and think that perhaps SUMPRODUCT(--(A1:E1 >= 45)) could be useful, but I'm not sure how to integrate it within Tom Sharpe's elegant code, e.g.,
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))),""))
Remember, I am no longer looking for the average: I want to filter rows for whether they have full data, and if they do, I want to count rows with at least 1 entry > 45.
Try the following. Enter as array formula.
=COUNT(IF(SUBTOTAL(4,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))>45,IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))))))
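The formula combines two per-row tests: the row maximum (SUBTOTAL with function 4) must exceed the threshold, and the row's count of numbers (function 2) must equal the number of columns. A Python sketch of the same two tests follows; it is illustrative only, `count_complete_rows_above` is a made-up name, and the placement of the blanks (None) in the sample matrix is an assumption, since the layout in the question does not show which cells are empty.

```python
def count_complete_rows_above(rows, threshold):
    """Count rows with no missing cells whose largest value exceeds threshold."""
    count = 0
    for row in rows:
        if any(value is None for value in row):
            continue  # incomplete row: skip it
        if max(row) > threshold:
            count += 1
    return count


matrix = [
    [48, 53, 46, 64, 66],
    [48, None, 66, None, 89],  # assumed positions of the missing cells
    [40, 38, 42, 49, 44],
    [37, 33, 35, 39, 41],
]
print(count_complete_rows_above(matrix, 45))
```

Testing the row maximum is equivalent to asking whether at least one value in the row is above the threshold.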
Data
I am having difficulties with making an array formula work the way I want it to work.
Out of a column of dates which is not sorted, I want it to extract values into a new column. The formula below identifies the required cells of a given month and year, but they appear in their original row rather than on top of the output range. Moreover, I want all ""/FALSE cells to be excluded from the output array.
=IF((MONTH($I$15:$I$1346)=1)*(YEAR($I$15:$I$1346)=2008),$I$15:$I$1346,"")
In fact, the $I$15:$I$1346 should be dynamic and go to the last filled range (I could make a named range for that)
Part two is to expand on that formula so that it calculates on the data that is a two-column offset from the data described above.
Is the above possible to build into one cell probably with a combination of IF, INDEX, SMALL and maybe others?
I'm not looking for a filter solution. Hope the above is clear enough and that you can help!
Here's a shortened sample layout:
A B C
1 Date Series_A Series_B
2 03/01/2011 45 20
3 04/01/2011 73 30
4 06/01/2011 95 40
5 08/01/2011 72 50
6 06/02/2011 5 13
7 09/02/2011 12 #N/A
8 05/02/2011 23 65
9 07/03/2011 12 65
Then I want three input cells for the year, the month, and the series name (index/match, as there are many more columns with data). If it would be 2011, Feb and Series_A, I want it to calculate the average for that month. In this case it would be (5+12+23)/3. If it would be Feb-2011 and Series_B instead, which has an error, it should show (13+65)/2 rather than an error.
Aside from that, I want a separate formula which will output an array with the data, without 'holes' in between and with the right 'length'. Example for Feb-2011 in column C:
  A           B         C               D
1 Date        Series_A  Desired Output  Output based on formula above
2 03/01/2011  45        5
3 04/01/2011  73        12
4 06/01/2011  95        23
5 08/01/2011  72
6 06/02/2011  5                         5
7 09/02/2011  12                        12
8 05/02/2011  23                        23
9 07/03/2011  12
If I then run =ISBLANK(C5) it should return TRUE, i.e. the cell should be truly empty rather than hold "" (in which case only =""=C5 would be true).
Hope the edit clarifies
I reached out to various platforms to get an answer, and here is one that is OK. It still doesn't fully answer part 1, but it works nonetheless.
http://www.excelforum.com/excel-formulas-and-functions/905356-exclude-blank-false-cells-in-in-excel-array-if-formula-output.html
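For readers who want to see the intended result independently of the linked thread, here is a small Python sketch of the two requirements from the question: a compact list of one month's values with no holes, and an average that skips error cells. It is not the formula from the linked answer. The function name `month_values` is made up, the dates are read as dd/mm/yyyy, and None stands in for the #N/A cell.

```python
from datetime import date
from statistics import mean

# (date, Series_A, Series_B) rows from the sample layout; None stands in for #N/A
RECORDS = [
    (date(2011, 1, 3), 45, 20),
    (date(2011, 1, 4), 73, 30),
    (date(2011, 1, 6), 95, 40),
    (date(2011, 1, 8), 72, 50),
    (date(2011, 2, 6), 5, 13),
    (date(2011, 2, 9), 12, None),
    (date(2011, 2, 5), 23, 65),
    (date(2011, 3, 7), 12, 65),
]


def month_values(records, year, month, column):
    """Return the usable values of one series for one month, in original order."""
    return [
        row[column]
        for row in records
        if row[0].year == year and row[0].month == month and row[column] is not None
    ]


feb_a = month_values(RECORDS, 2011, 2, column=1)
feb_b = month_values(RECORDS, 2011, 2, column=2)
print(feb_a, mean(feb_a))  # compact list with no gaps, then its average
print(feb_b, mean(feb_b))  # the error cell is skipped
```

The list has exactly as many entries as there are matching rows, which is the "right length" the question asks for.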