I have a table in Matlab crsp and and cell array of numbers that serve as keys. I want to retrieve information from the table using that cell array which is stored as a variable. My code is as follows:
function y = generateWeights(permno_vector, this_datenum, crsp)
crsp(crsp.PERMNO == permno_vector,:);
crsp is defined as a table while permno_vector is the cell array. It contains a couple permnos that are used to retrieve information.
In this case, my code is not working and will not allow me to access the values in crsp. How do we access table values using a vector array?
As James Johnstone points out, the first problem with the code you've posted is that it doesn't assign anything to y, so as written your function doesn't return a value. Once you've fixed that, I assume the error you are seeing is Undefined operator '==' for input arguments of type 'cell'. It's always helpful to include this sort of detail when asking a question.
The syntax
crsp(crsp.PERMNO == x,:)
would return those rows of crsp that had PERMNO equal to x. However if you want to supply a list of possible values, and get back all the rows of your table where your target variable matches one of the values in the list, you need to use ismember:
crsp(ismember(crsp.PERMNO, cell2mat(permno_vector)),:)
if permno_vector is a cell array, or simply:
crsp(ismember(crsp.PERMNO, permno_vector),:)
if you can instead supply permno_vector as a numeric vector (assuming of course the data in crsp.PERMNO is also numeric).
Related
We're storing various heterogeneous data in a JSONB column called ext and under some keys we have arrays of values. I know how to replace the whole key (||). If I want to add one or two values I still need to extract the original values (that would be ext->'key2' in the example lower) - in some cases this may be too many.
I realize this is trivial problem in relational world and that PG still needs to overwrite the whole row anyway, but at least I don't need to pull the unchanged part of the data from DB to the application and push them back.
I can construct the final value of the array in the select, but I don't know how to merge this into the final value of ext so it is usable in UPDATE statement:
select ext, -- whole JSONB
ext->'key2', -- JSONB array
ARRAY(select jsonb_array_elements_text(ext->'key2')) || array['asdf'], -- array + concat
ext || '{"key2":["new", "value"]}' -- JSONB with whole "key2" key replaced (not what I want)
from (select '{"key1": "val1", "key2": ["val2-1", "val2-2"]}'::jsonb ext) t
So the question: How to write such a modification into the UPDATE statement?
Example uses jsonb_*_text function, some values are non-textual, e.g. numbers, that would need non _text function, but I know what type it is when I construct the query, no problem here.
We also need to remove the values from the arrays as well, in which case if the array is completely empty we would like to remove the key from the JSONB altogether.
Currently we achieve this with this expression in the UPDATE statement
coalesce(ext, '{}')::jsonb - <array of items to delete> || <jsonb with additions> (<parts> are symbolic here, we use single JDBC parameter for each value). If the final value of the array is empty, the key for that value goes into the first array, otherwise the final value appears int he JSONB after || operator.
To be clear:
I know the path to the JSONB value I want to change - it's actually always a single key on the top level.
I know whether that key stores single value (no problem for those) or array (that's where I don't have satisfying solution yet), because we know the definitions of each key, this is stored separately.
I need to add and/or remove multiple values I provide, but I don't know what is in the array at that moment - that's the whole point, so that application doesn't need to read it.
I may also want to replace the whole array under the key, but this is trivial case and I know how to do this.
Finally, if removal results in an empty array, we'd like to get rid of the key as well.
I could probably write a function doing it all if necessary but I've not committed to that yet.
Obviously, restructuring the data out of that JSONB column is not an option. Eventually I want to make it more flexible and data with these characteristics would go to some other table, but at this moment we're not able to do it with our application.
You can use jsonb_set to modify an array which is placed under some key.
To update a value in an array you should specify a zero-based index within the array in the below example.
To add a new element on a start/end - specify negative/positive index which is greter than array's length.
UPDATE <table>
SET ext = jsonb_set(ext, '{key2, <index>}', '5')
WHERE <condition>
I have a working array formula, which creates a list of values in column A. The length of the list depends on the input values of the formula and will vary over time. Values could be removed and added anywhere in the list, not necessarily at the end. Now in column B I want to manually add comments on the values in column A. When the length of the list changes, the values in column B don't move along with the values in column A, so they are not longer on the correct line. Is there a way to solve this?
You can try to use the OFFSET function for the dynamic variable.
Let's say I have in B2 the formula =ROW(A1:A3). This formula gives an array of {1;2;3}. Because a cell only holds one value, cell B2 displays a value of 1. Any place in the worksheet, the formula =B2 gives 1, not the array. Yet, Excel still remembers the array because the formula is still in B2.
Is there any way to get the array back so it, the whole array, not its individual elements, can be used for further manipulation? I'm looking for something like OPERATION(B2) = {1;2;3}. For example, SUMPRODUCT(OPERATION(B2)) = SUMPRODUCT(ROW(A1:A3)) = 6.
As a workaround, you can store your formula in Name Manager, e.g.:
Then you can use it as a reference in Excel formulas, like =INDEX(Rows,2,1):
I realize that this is not the answer to the OP's question as they do not have the latest Office 365. I put this here for those who come later and do have it.
With the latest Office 365 with Dynamic array formulas this problem is now gone.
Putting =ROW(1:3) or the equivalent dynamic array formula =SEQUENCE(3) Excel will spill the array down automatically without the need of Ctrl-Shift-Enter:
And now one can refer to the array by only refering to A1 by putting # after the reference:
=SUM(A1#)
Then no matter how the array in A1 changes the other formula does not need to be changed.
The caveat is that the formula in A1 needs the room to spill down. If there is a cell that has anything in it that resides in the spill down array it will return an error.
Just to try and clarify this, arrays in Excel before the September 2018 update for Office 365 don't self-expand like they do in Google sheets, so if you put an array into a single cell you just get the first element of the array (see column A for example).
If you want to enter the array directly into the sheet, you have to select the cells you want it to occupy (say B2:B4) and enter it as an array formula (see column B) using CtrlShiftEnter
You can still use the array in another formula as long as it's expecting an array - e.g. with sumproduct you get the correct total, 6 (see column C).
=SUMPRODUCT(ROW(A1:A3))
Unfortunately this doesn't always work without entering the whole formula as an array formula (see column D).
=SUM(ROW(A1:A3))
EDIT
#JvDV is correct, you don't always get the first element of an array expression entered into a single cell - see this reference for explanation of the implicit intersection mechanism.
I have two methods, which might qualify as solutions or workarounds, depending on what technology you're willing to accept.
Both rely on retrieving the formula that produces the array from B2 and then evaluating that formula each time you invoke your OPERATION(B2).
Method 1: VBA
Create a UDF:
Function Arr(R as Range)
Arr = Application.Evaluate(R.Formula)
End Function
which you then invoke in a cell, e.g. =SUM(Arr(B2)).
Method 2: Name Manager + EVALUATE()
Note that this is different from Justyna's answer. The idea is to allow you to specify the address of the cell holding the array-generating formula.
Create a name that holds =EVALUATE(FORMULATEXT(INDIRECT(INDEX($1:$1048576;ROW();COLUMN()+1)))). This way you can specify that you want the content of the cell B2 by writing the string B2 to the nearest cell to the right of the cell in which you're using your newly created name.
Unfortunately, EVALUATE() cannot be used within a cell.
I have a list of values that I would like to match against the combination of multiple ranges.
So, for example, my ranges are A1:A100 and B1:B100.
Instead of concatenating A with B in a new column C, i.e.
CONCAT(A1,B1)...CONCAT(A100,B100)
and then matching my value against that new column - I would like to do something like this:
MATCH(value,CONCATENATE(A1:B100),0)
And copy this down a column near my list of values.
I have a feeling this can be done with some sort of array formula...
Yes as an array formula:
=MATCH(value,$A$1:$A$100 & $B$1:$B$100,0)
Being an array formula it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
Though they may seem similar in approach they are not. CONCATENATE will return a string not an array to the MATCH with all 200 values in one long string. Where the above will return 100 values, each row concatenated, as an array which can be used to search.
One further note, If performance becomes a issue, Array formulas are inherently slower, adding the helper column and using a regular MATCH will improve the responsiveness.
This should work, basically you just need to concatenate it yourself using &
=MATCH(D1,A1:A10&B1:B10,0)
D1 is the value you're trying to look for.
This is an array, so remember to hit Ctrl+Shift+Enter when you input it.
How do I convert a string from a cell to a text array inside of a function, that should use the array, without using VBA and without adding the array into any other part of the document? It will be one of these arrays on more than 1000 rows. The string format is ^[a-zA-Z0-9.,-]*$ with "," as delimiter.
This is the functionality I would like to achieve
I have an excel table with the following columns
A: ID numbers to compare, separated by comma (delimiter can be changed if needed). About 100 ID's would be good to support at least.
B: ID (Each value on the rows in the column are unique but not sorted and can't be sorted because sorting is needed based on other criterias)
C: Value (Several rows in the column can have the same value)
D: Output the one ID of the comma separated ID's that has the highest value on its row
The problem part of the output
So far I have made a function which find the correct ID in column B based on the values in column C but only if I enter the string from column A as an array constant manually within the function. I have not managed to get the function to create the array itself from column A, which is required.
Working part of the code
In this code I have entered the values from column A manually and this is working as it should.
=INDEX({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},MATCH(MAX(INDEX(C$10:C$20,N(IF(1,MATCH({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},B$10:B$20,0))))),INDEX(C$10:C$20,N(IF(1,MATCH({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},B$10:B$20,0)))),0))
Note that the start row is not the first row and the array is used 3 times in the function.
Code to try to convert the string to a text array
Not working but if wrapped in SUMPRODUCT() it provide an array to the SUMPRODUCT() function, of course not usable since I then can't pass on the array. The background to this code can be found in question Split a string (cell) in Excel without VBA (e.g. for array formula)!.
=TRIM(MID(SUBSTITUTE(A10,",",REPT(" ",99)),(ROW(OFFSET($A$1,,,LEN(A10)-LEN(SUBSTITUTE(A10,",",""))+1))-1)*99+((ROW(OFFSET($A$1,,,LEN(A10)-LEN(SUBSTITUTE(A10,",",""))+1)))=1),99))
The second code output the first item of the array and inserted in the first code do not change this result as it did when wrapping the second code in SUMPRODUCT().
Here is a picture of my simplified test setup in Excel for this case, identical to what is described above.
Simplified test setup
I'm not really sure what you are doing with your formula.
But to convert contents of a cell to a comma separated text array to be used as the array argument to the INDEX or MATCH functions, you can use the FILTERXML function. You'll need to educate yourself about XML and XPATH to understand what's going on, but there are plenty of web resource for this.
For example, with
A10: "1.01-1","1.01-3","1.08-1","1.01-1-1A"
The formula below will return 1.08-1. Note the 3 for the row argument to the INDEX function.
=INDEX(FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(A10,"""",""), ",", "</s><s>") & "</s></t>", "//s"),3)