Find Maximum of Array of Index/Match from Concatenated String - arrays

Given the following table:
I would like the Actual Start to show the Preferred Start value, if the Depends column is empty (easy).
If the Depends column contains one or more comma-separated Id values, I would like to split on comma, look up the array of "Preferred Start" values based on the corresponding Id value, and then select the maximum value.
The following formula will correctly split the "Depends" cell:
=FILTERXML("<t><s>"&SUBSTITUTE(G6,",","</s><s>")&"</s></t>","//s")
Which can be verified, by using an array-valued MAX function (this returns "4"):
{=MAX((FILTERXML("<t><s>"&SUBSTITUTE(G6,",","</s><s>")&"</s></t>","//s")))}
However, what I really want to do is:
{=MAX(INDEX(Table1[Preferred Start],MATCH((FILTERXML("<t><s>"&SUBSTITUTE(G6,",","</s><s>")&"</s></t>","//s")),Table1[Id],0)))}
Somewhere along the way, however, it loses its "arrayness" and simply returns the "Preferred Start" of the first Id in the split (Id 3, 17 Jan 18).
Is what I'm trying to do even possible without resorting to VBA? I suspect I will run into a circular reference in actuality, since I really need to take the maximum of the "Actual Start" (adjusted for dependencies), to properly cascade a chain of dependent items.
Thanks

This is a known issue with INDEX: it is reluctant to return an array without some coercion. Generically, this should work:
=INDEX(range,N(IF(1,{array})))
so that becomes the following in your specific scenario:
=MAX(INDEX(Table1[Preferred Start],N(IF(1,MATCH((FILTERXML("<t><s>"&SUBSTITUTE(G6,",","</s><s>")&"</s></t>","//s")),Table1[Id],0)))))
Confirm with Ctrl+Shift+Enter.
I assume that every row has a different Id, because the MATCH function will only find the first match for each Id.
Or, for a completely different approach, you can use the AGGREGATE function (with SEARCH instead of FILTERXML), which doesn't require array entry and returns the correct maximum even if Ids repeat:
=AGGREGATE(14,6,Table1[Preferred Start]/SIGN(SEARCH(","&Table1[Id]&",",","&G6&",")),1)
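To check the intended result outside Excel, the lookup logic can be sketched in Python. This only illustrates the logic (split on comma, look up each Id, take the maximum). It is not how Excel evaluates the formulas, and the table contents and the name latest_dependency_start are hypothetical.

```python
from datetime import date

# Hypothetical stand-in for Table1: Id -> Preferred Start.
preferred_start = {
    1: date(2018, 1, 10),
    3: date(2018, 1, 17),
    4: date(2018, 1, 24),
}

def latest_dependency_start(depends, own_start):
    """Return own_start when depends is empty, otherwise the latest
    Preferred Start among the comma-separated Ids in depends."""
    ids = [int(part) for part in depends.split(",") if part.strip()]
    if not ids:
        return own_start
    return max(preferred_start[i] for i in ids)
```

Calling latest_dependency_start("3,4", ...) with this sample data picks the later of the two dates, which is what the MAX over the looked-up array is meant to do.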

Reorder the match to include the max in it:
=INDEX(Table1[Preferred Start],MATCH(MAX((FILTERXML("<t><s>"&SUBSTITUTE(G6,",","</s><s>")&"</s></t>","//s"))),Table1[Id],0))
Enter as an array formula using Ctrl-Shift-Enter.

Related

How to modify array under specific JSONB key in PostgreSQL?

We're storing various heterogeneous data in a JSONB column called ext, and under some keys we have arrays of values. I know how to replace the whole key (||). If I want to add one or two values, I still need to extract the original values (that would be ext->'key2' in the example below), and in some cases there may be too many of them.
I realize this is a trivial problem in the relational world and that PG still needs to overwrite the whole row anyway, but at least I wouldn't need to pull the unchanged part of the data from the DB to the application and push it back.
I can construct the final value of the array in the select, but I don't know how to merge this into the final value of ext so that it is usable in an UPDATE statement:
select ext, -- whole JSONB
ext->'key2', -- JSONB array
ARRAY(select jsonb_array_elements_text(ext->'key2')) || array['asdf'], -- array + concat
ext || '{"key2":["new", "value"]}' -- JSONB with whole "key2" key replaced (not what I want)
from (select '{"key1": "val1", "key2": ["val2-1", "val2-2"]}'::jsonb ext) t
So the question: how do I write such a modification into the UPDATE statement?
The example uses the jsonb_*_text function. Some values are non-textual (e.g. numbers) and would need the non-_text function, but I know the type when I construct the query, so that is not a problem.
We also need to remove values from the arrays, in which case, if the array ends up completely empty, we would like to remove the key from the JSONB altogether.
Currently we achieve this with the following expression in the UPDATE statement:
coalesce(ext, '{}')::jsonb - <array of items to delete> || <jsonb with additions>
The <parts> are symbolic here; we use a single JDBC parameter for each value. If the final value of the array is empty, the key for that value goes into the first array; otherwise the final value appears in the JSONB after the || operator.
To be clear:
I know the path to the JSONB value I want to change; it's actually always a single key on the top level.
I know whether that key stores a single value (no problem for those) or an array (that's where I don't have a satisfying solution yet), because the definition of each key is known and stored separately.
I need to add and/or remove multiple values I provide, but I don't know what is in the array at that moment. That's the whole point: the application shouldn't need to read it.
I may also want to replace the whole array under the key, but this is a trivial case and I know how to do it.
Finally, if removal results in an empty array, we'd like to get rid of the key as well.
I could probably write a function doing it all if necessary but I've not committed to that yet.
Obviously, restructuring the data out of that JSONB column is not an option. Eventually I want to make it more flexible and data with these characteristics would go to some other table, but at this moment we're not able to do it with our application.
You can use jsonb_set to modify an array that is stored under some key.
To update a value in the array, specify its zero-based index in place of <index> in the example below.
To add a new element at the start or end, specify a negative or positive index, respectively, whose magnitude is greater than the array's length.
UPDATE <table>
SET ext = jsonb_set(ext, '{key2, <index>}', '5')
WHERE <condition>
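The jsonb_set call above addresses one element by index. The behaviour the question asks for (add and remove several values under one top-level key, and drop the key when the array ends up empty) can be modelled in Python as a reference when testing whichever SQL expression is chosen. This is a sketch of the desired semantics only, not PostgreSQL code, and the function name update_array_key is hypothetical.

```python
def update_array_key(ext, key, add=(), remove=()):
    """Return a copy of ext with values removed from and appended to the
    array stored under key; the key is dropped if the array ends up empty."""
    result = dict(ext or {})  # treat a missing document like '{}'
    values = [v for v in result.get(key, []) if v not in remove]
    values.extend(add)        # assumption: additions are appended, not de-duplicated
    if values:
        result[key] = values
    else:
        result.pop(key, None)
    return result
```

For the sample document from the question, adding "asdf" to key2 keeps key1 untouched, and removing both existing values removes key2 altogether.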

How to concatenate multiple ranges within a Match function

I have a list of values that I would like to match against the combination of multiple ranges.
So, for example, my ranges are A1:A100 and B1:B100.
Instead of concatenating A with B in a new column C, i.e.
CONCAT(A1,B1)...CONCAT(A100,B100)
and then matching my value against that new column - I would like to do something like this:
MATCH(value,CONCATENATE(A1:B100),0)
And copy this down a column near my list of values.
I have a feeling this can be done with some sort of array formula...
Yes as an array formula:
=MATCH(value,$A$1:$A$100 & $B$1:$B$100,0)
Being an array formula it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
Though the two may seem similar in approach, they are not. CONCATENATE returns a single string to MATCH, not an array: all 200 values in one long string. The formula above instead returns an array of 100 values, one concatenated value per row, which MATCH can search.
One further note: array formulas are inherently slower, so if performance becomes an issue, adding the helper column and using a regular MATCH will improve responsiveness.
This should work. Basically, you just need to concatenate the ranges yourself using &:
=MATCH(D1,A1:A10&B1:B10,0)
D1 is the value you're trying to look for.
This is an array formula, so remember to hit Ctrl+Shift+Enter when you enter it.
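The difference between the row-wise & concatenation and one long string can be illustrated with a small Python sketch. The sample data and the helper name match are made up for illustration; match only mimics an exact-match MATCH(value, keys, 0).

```python
col_a = ["x", "y", "z"]
col_b = ["1", "2", "3"]

# Counterpart of $A$1:$A$3 & $B$1:$B$3: one concatenated key per row.
row_keys = [a + b for a, b in zip(col_a, col_b)]

# Counterpart of joining everything into a single string.
one_long_string = "".join(row_keys)

def match(value, keys):
    """1-based position of value in keys (exact match), or None if absent."""
    return keys.index(value) + 1 if value in keys else None
```

Searching row_keys for "y2" finds row 2, while the single long string is only one value and cannot be matched against an individual row key.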

How to use index match (array formula) to return corresponding values from a drop down list?

Excel Screenshot
Excel Screenshot with Formulas
I have attached photos to show an idea of what I am trying to do. Basically, I have a very large list of features that are shared between certain groups. I want to use a drop down list of the features, and then have a formula that will output the group that has the lowest cost of that feature along with the cost of that feature within the group.
(Also you will see that I purposefully ignore zero values. I do this because not every group has a certain feature and those cells default to zero).
I figured out how to get the cost of the feature to output, but I'm having trouble getting it to output the group name. I assume an array formula can do this, but I am just starting to learn those and I'm having trouble with this one.
You could use the same approach you used to pull in the value: find the position of the computed minimum within the row for the selected feature (OFFSET picks that row), then use that position to index into the column headings:
=+INDEX($B$1:$D$1,MATCH($B$10,OFFSET($B$1:$D$1,MATCH($A$7,$A$2:$A$4,0),0),0))
One caveat: I'm not sure how you want to handle ties. If two vendors have the same price, this will match the first one in the list.
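The same lookup, including the zero-skipping and the first-match tie behaviour, can be sketched in Python. The group names, costs and the function name cheapest_group are hypothetical.

```python
def cheapest_group(costs_by_group):
    """Return (group, cost) for the lowest non-zero cost, or None if every
    cost is zero. On ties the first group in the mapping wins."""
    nonzero = {group: cost for group, cost in costs_by_group.items() if cost != 0}
    if not nonzero:
        return None
    group = min(nonzero, key=nonzero.get)  # min() keeps the first minimum it sees
    return group, nonzero[group]
```

With costs of 0, 5 and 5 for three groups, the zero is ignored and the first of the two tied groups is returned, mirroring what MATCH does in the formula.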

Returning multiple adjacent cell results from an min array which may include multiple duplicate values

I'm trying to set up a formula that will return the contents of a related cell (my related cell is on another sheet) for the two smallest results in an array. This is what I'm using right now:
=INDEX('Sheet1'!$A$40:'Sheet1'!$A$167,MATCH(SMALL(F1:F128,1),F1:F128,0),1)
And
=INDEX('Sheet1'!$A$40:'Sheet1'!$A$167,MATCH(SMALL(F1:F128,2),F1:F128,0),1)
The problem I've run into is twofold.
First, if there are multiple lowest results I get whichever one appears first in the array for both entries.
Second, if the second lowest result is duplicated but the first is not I get whichever one shows up on the list first, but any subsequent duplicates are ignored. I would like to be able to display the names associated with the duplicated scores.
You will have to adjust the k parameter of the SMALL function to raise the k according to duplicates. The COUNTIF function should be sufficient for this. Once all occurrences of the top two scores are retrieved, standard 'lookup multiple values' formulas can be applied. Retrieving successive row positions with the AGGREGATE¹ function and passing those into an INDEX of the names works well.
    
The formulas in H2:I2 are,
=IF(SMALL(F$40:F$167, ROW(1:1))<=SMALL(F$40:F$167, 1+COUNTIF(F$40:F$167, MIN(F$40:F$167))), SMALL(F$40:F$167, ROW(1:1)), "") '◄ H2
=IF(LEN(H40), INDEX(A$40:A$167, AGGREGATE(15, 6, ROW($1:$128)/(F$40:F$167=H40), COUNTIF(H$40:H40, H40))), "") '◄ I2
Fill down as necessary. The scores are designed to terminate after the last second place so it would be a good idea to fill down several rows more than is immediately necessary for future duplicates.
¹ The AGGREGATE function was introduced with Excel 2010². It is not available in earlier versions.
² Related article for pre-xl2010 functions - see Multiple Ranked Returns from INDEX().
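As a cross-check of what the two formulas above are meant to produce, here is a Python sketch that lists every name whose score is among the two lowest distinct scores, ties included. It is an illustration under that reading of the formulas; the function name lowest_two_ranks and the sample data are hypothetical.

```python
def lowest_two_ranks(names, scores):
    """Return (name, score) pairs for all entries whose score is one of the
    two lowest distinct scores, ordered by score and then by position."""
    if not scores:
        return []
    cutoff = sorted(set(scores))[:2][-1]  # second-lowest distinct score, if any
    order = sorted(range(len(scores)), key=lambda i: (scores[i], i))
    return [(names[i], scores[i]) for i in order if scores[i] <= cutoff]
```

For scores 5, 3, 3, 7, 5 this returns both entries scoring 3 followed by both entries scoring 5, and leaves out the 7.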
The following formula will do what I think you want:
=IF(OR(ROW(1:1)=1,COUNTIF($E$1:$E1,INDEX(Sheet1!$A$40:$A$167,MATCH(SMALL($F$1:$F$128,ROW(1:1)),$F$1:$F$128,0)))>0,ROW(1:1)=2),INDEX(Sheet1!$A$40:$A$167,MATCH(1,INDEX(($F$1:$F$128=SMALL($F$1:$F$128,ROW(1:1)))*(COUNTIF($E$1:$E1,Sheet1!$A$40:$A$167)=0),),0)),"")
NOTE:
This is an array formula and must be confirmed with Ctrl-Shift-Enter.
There are two references to $E$1:$E1. This formula assumes that it will be entered in E2 and copied down. If it is going in a different column, change these two references. It must go in the second row or it will throw a circular reference.
What it will do
If there is a tie for first place it will only list those teams that are tied for first.
If there is only one first place but multiple tied for second places it will list all those in second.
So make sure you copy the formula down far enough to cover all possible ties. It will put "" in any that do not fill, so err on the high side.
To get the Scores use this simple formula, I put mine in Column F:
=IF(E2<>"",SMALL($F$1:$F$128,ROW(1:1)),"")
Again change the E reference to the column you use for the output.
I did a small test:

Using a string key to return a value from an array

I have a named array of 14 rows by 2 columns. The first column holds a string key (e.g. Country), and the second an attribute (e.g. Owner). I want to retrieve the Owner by supplying the Country.
I only know how to use =INDEX to retrieve values from named arrays, but that expects col/row numbers.
How might I achieve my requirement?
For the sake of an answer.
Feed the INDEX function with a MATCH function to provide the requisite row number, along the lines:
=INDEX(B:B,MATCH(A2,A:A,0))
VLOOKUP will work, but INDEX/MATCH is more powerful, so if you are already comfortable with INDEX it might be better to add MATCH to your arsenal rather than to bother with VLOOKUP/HLOOKUP.
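The division of labour between the two functions (MATCH finds the row position, INDEX reads the value at that position) can be sketched in Python. The table contents and the function name owner_of are made up for illustration.

```python
# Hypothetical two-column table: (Country, Owner).
table = [
    ("France", "Alice"),
    ("Japan", "Bob"),
    ("Peru", "Carol"),
]

def owner_of(country, rows):
    """Look up the owner for a country: find the row position first
    (the MATCH step), then read the second column there (the INDEX step)."""
    keys = [row[0] for row in rows]
    position = keys.index(country)  # raises ValueError if the key is absent
    return rows[position][1]
```

A missing key raises ValueError here, which plays the role of the #N/A error that MATCH returns in Excel.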

Resources