Excel, Frequent string value with IF condition - arrays

I would like to find the modal area code for each ID number in excel.
I have 2 columns
ID no. Area Code
1 ABC
1 ABC
1 ABC
1 DEF
2 HIJ
2 HIJ
2 KLM
So far I am finding the mode of the whole column using:
=(INDEX(B:B,MODE(MATCH(B:B,B:B,0))))
But I would like all ID no. 1 area codes to be ABC and ID no. 2 to be HIJ
Any advice would be great! Thanks

You could use a lookup table with the following array formula:
=INDEX($B$2:$B$13,MODE(IF($A$2:$A$13=D2,MATCH($A$2:$A$13,$A$2:$A$13,0))))
You enter array formulas by pressing Ctrl + Shift + Enter to enter the formula
In the example shown below the formula would go in E2 next to the first listed ID and then you would drag it down for all the IDs in the adjacent column.
Example:

Related

How to extract a handicap value from column based on 2 criteria

Good morning,
I have an issue with extracting the correct handicap value within the following table:
K L M
Handicap York Hereford
0 1287 1280
1 1285 1275
2 1280 1271
3 1275 1268
4 1270 1265
5 1268 1260
6 1265 1258
7 1260 1254
8 1255 1250
9 1253 1246
I also have these 2 lines of sample score/round data:
G H I
Round Score Handicap
York 1269 5
York 1270 4
Hereford 1269 XXX
Hereford 1270 XXX
If for instance someone on a York, gets a score of 1269, they should get a handicap of 5, which this formula achieves:
INDEX($K$7:$K$16,MATCH($H7,$L$7:$L$16,-1))+1
However this formula only works on the one column $L$7:$L$16
Similarly, the 2ns score is calculated with the following formula:
=INDEX($K$8:$K$17,MATCH($H8,$L$8:$L$17,-1))
What I'd like to do is, build that out so if I changed the round to a Hereford, with the exact same score, the cell would automatically calculate that the handicap should be 3.
Is this possible, maybe with an array?
Regards,
Andrew.
With ms365, try:
Formula in I2:
=XLOOKUP(H2,FILTER(L$2:M$11,L$1:M$1=G2),K$2:K$11,"NB",-1,-1)
I would avoid using OFFSET because it is a volatile function.
To select the appropriate column, you can use another MATCH:
MATCH($G7,$L$6:$M$6,0)
will return the column number. This makes it simple if you more than just York and Hereford columns.
Then, to return the matching line:
=MATCH($H7,INDEX($L$7:$M$16,0,MATCH($G7,$L$6:$M$6,0)),-1)
Note the use of 0 for the Row argument in the INDEX function which will return the entire column (all the rows).
Since your handicaps are sequential, as written this formula returns the same values as does yours. But I don't think it is correct since both formulas return 1 for a 1287 York.
You probably need to subtract one from the result of the formula.
=MATCH($H7,INDEX($L$7:$M$16,0,MATCH($G7,$L$6:$M$6,0)),-1)-1
Reference your lookup range with an OFFSET() function, and for the third parameter (which is column offset), use a MATCH() on the headers.
The formula on your first row would be:
=INDEX($K$7:$K$16,MATCH(H7,OFFSET($L$7:$M$16,0,MATCH(G7,$L$6:$M$6,0)-1,ROWS($L$7:$M$16),1),-1))+1

Alternate for Array Sum Formula

Table copied as Text
Column1 Column2 Column3 Column4 Column5 Column6
A AA AAA 100 95 92
A AA AAA 85 83 81
A AA BBB 200 199 160
A BB AAA 65 55 49
B AA AAA 89 88 83
B AA BBB 150 149 145
B BB AAA 140 135
B BB BBB 190 185
B AA AAA 510
AA
AAA BBB
A 173 160
B 593 145
and some more explanation
Basically i want the sum of "Column 6" for the given criteria but the data in Column 6 can only be entered after some delay w.r.t. Column 1, Column 2, Column 3 & Column 4.
Till Column 6 data is entered, i want excel to use the number available in Column 5 which is also entered after some delay w.r.t. Column 1, Column 2, Column 3 & Column 4 but before Column 6.
And till Column 5 data is entered, i want excel to use the number available in Column 4.
Now I am familiar with two SUM/IF arrangements as included below in post.
First one is array sum/if arrangement which is convenient to write but results in terribly long calculation time with 1.5 seconds for just one column and I have over 100 columns in one sheet and about 9 sheets.
Second one is using SUMIFS which requires extensive time to write but relatively better calculation time of 0.5 seconds for column but is still quite high.
Now I need to do away with the array arrangement but doing so will take quite some time and I want to know if there is any better/other arrangement.
Just let me know other arrangement which can get the required result and I will check the arrangement for calculation timing. If the other arrangement is also convenient to write than that is a plus.
This is my table:
And I want to add the right most columns which are not empty i.e. have a number in it, but with the criteria for the first three columns in cell D15.
I only found option to add image. Please let me know how to upload excel file.
enter image description here
Can somebody please suggest an alternate to this array formula so it can calculate way faster
{=SUM(
IF(
($B$2:$B$10=$C15)*
($C$2:$C$10=$C$13)*
($D$2:$D$10=D$14)>0,
IF(
$G$2:$G$10<>"",
$G$2:$G$10,
IF(
$F$2:$F$10<>"",
$F$2:$F$10,
$E$2:$E$10))))}
I have tried below which reduces the calculation time to 1/3 but it is too much typing for the large data I am dealing with
=SUMIFS(
$G$2:$G$10,
$B$2:$B$10,$C15,
$C$2:$C$10,$C$13,
$D$2:$D$10,H$14,
$G$2:$G$10,"<>"&"")
+SUMIFS(
$F$2:$F$10,
$B$2:$B$10,$C15,
$C$2:$C$10,$C$13,
$D$2:$D$10,H$14,
$G$2:$G$10,"="&"",
$F$2:$F$10,"<>"&"")
+SUMIFS(
$E$2:$E$10,
$B$2:$B$10,$C15,
$C$2:$C$10,$C$13,
$D$2:$D$10,H$14,
$G$2:$G$10,"="&"",
$F$2:$F$10,"="&"")
If you're OK with using a helper column (which you should be), you can use this formula in a helper cell and drag down. (In my example at bottom, this formula is in cell H2 and drag down.)
= INDEX(E2:G2,MATCH(-1E+300,E2:G2,-1))
This gets all of the data in either column 4 5 or 6 all into one column.
Then you can use a simpler SUMIFS formula in cell D15:
= SUMIFS($H$2:$H$10, // Sum range (helper column)
$B$2:$B$10,$C15, // Criteria 1 (A or B)
$C$2:$C$10,$C$13, // Criteria 2 (AA or BB)
$D$2:$D$10,D$14) // Criteria 3 (AAA or BBB)
See below, working example:
DISCLAIMER
This answer will simplify your formulas, but I'm not sure if this will help with the performance problems you are experiencing. SUMIFS in itself I don't see being likely the cause of long calculation times. Probably you are experiencing long calculation times because other parts of your spreadsheet are using inefficient formulas and/or formulas involving volatile cells, but that is just a guess because I have no idea what the rest of your spreadsheet looks like.

Count based on column and row

I can't seem to find something quite like this problem...
I have an array table where each row contains a random assortment of numbers 1-N
On another sheet, I have a table with column and row headers numbered 1-N
I want to count how many rows in the array contain both the column and row headers for a given cell in the table. Since countifs only reference the current cell in the specified array, they don't seem to be working in this scenario.
Example array:
A B C D
1 3 5 7
1 2 3 4
2 3 4 5
2 4 6 8
...
Table results (symmetrical about the diagonal):
A B C D E F
. 1 2 3 4 5 ...
1 - 1 2 1 1
2 1 - 2 2 1
3 2 2 - 2 2
4 1 2 2 - 1
5 1 1 2 1 -
Would using nested countifs work?
I don't agree with your results corresponding to 4/2, which surely should be 3, not 2, but this formula, based on the array table being in Sheet1 A1:D4 and the results table being in Sheet2 A1:F6, placed in cell B2 of the latter, should work:
=IF($A2=B$1,"-",SUMPRODUCT(N(MMULT(N(COUNTIF(OFFSET(Sheet1!$A$1:$D$1,ROW(Sheet1!$A$1:$D$4)-MIN(ROW(Sheet1!$A$1:$D$4)),),CHOOSE({1,2},B$1,$A2))>0),{1;1})=2)))
Copy across and down as required.
Note: If your actual table is in fact much larger than that given, it will probably be worth adding a simple clause into the above to the effect that the results for approximately half of the cells are obtained from their symmetrical counterparts, rather than via calculation of this construction, thus saving resource.
Regards

create index in SAS using do loop

Say I have a set of data in this format:
ID Product account open date
1 A 20100101
1 B 20100103
2 C 20100104
2 A 20100205
2 D 20100605
3 A 20100101
And I want to create a column to capture the sequence of the products opened so the table will look like this:
ID First Second third
1 A B
2 C A D
3 A
I know I need to create an index for each ID so I can transpose the data afterwards:
ID Product account open date sequence
1 A 20100101 1
1 B 20100103 2
2 C 20100104 1
2 A 20100205 2
2 D 20100605 3
3 A 20100101 1
From my limited knowledge in do loop, I think I need to write something like this:
if first.ID and not last.ID then n=1 do while ID not last n+1
Something like that. Can anyone help me with the exact syntax? I have tried googling for similar codes and haven't had much luck.
Thanks!
I'd sort by ID and then date and use proc transpose for simplicity. Here's an example:
data prod;
input ID Product $ Open_DT :yymmdd8.;
format open_dt date9.;
datalines;
1 A 20100101
1 B 20100103
2 C 20100104
2 A 20100205
2 D 20100605
3 A 20100101
;
run;
proc sort data=prod;
by ID Open_DT;run;
proc transpose data=prod
out=prod_trans(drop=_name_)
prefix=ITEM;
by id;
var Product;
run;
proc print data=prod_trans noobs;
run;

SQLServer Calculate Average of Multiple Columns

I have generated a table using PIVOT and the ouput of columns are dynamic. One of the output is as given below:
user test1 test2 test3
--------------------------------
A1 10 20 30
A2 90 87 75
A3 78 12 34
The output of above table represents a list of users attending tests. The tests will be added dynamically, so the columns are dynamic in nature.
Now, I want to find out average marks of each user as well as average marks of each test.
I am able to calculate the average of each test, but got puzzled to find out the average of each user.
Is there a way to do this??
Please help.
Mahesh
You can add the marks for each user then divide by the number of columns:
SELECT
user,
(test1 + test2 + test3) / 3 AS average_mark
FROM users
Or to ignore NULL values:
SELECT
user,
(ISNULL(test1, 0) + ISNULL(test2, 0) + ISNULL(test3, 0)) / (
CASE WHEN test1 IS NULL THEN 0 ELSE 1 END +
CASE WHEN test2 IS NULL THEN 0 ELSE 1 END +
CASE WHEN test3 IS NULL THEN 0 ELSE 1 END
) AS average_mark
FROM users
Your table structure has two disadvantages:
Because your table structure is created dynamically you would also have to construct this query dynamically.
Because some students will not have taken all tests yo may have some NULL values.
You may want to consider changing your table structure to fix both of these problems. I would suggest that you use the following structure for your table:
user test mark
-------------------
A1 1 10
A2 1 90
A3 1 78
A1 2 20
A2 2 87
A3 2 12
A1 3 30
A2 3 75
A3 3 34
Then you can do this to get the average mark per user:
SELECT user, AVG(mark) AS average_mark
FROM users
GROUP BY user
And this to get the average mark per test:
SELECT test, AVG(mark) AS average_mark
FROM users
GROUP BY test
Can you do it on your data source before you pivot it?
The simple answer is to UNPIVOT the same way you just PIVOTed. But the best answer is to not do the PIVOT in the first place! Store the unpivoted data in a table first, then from that do your PIVOT and your average.

Resources