How does row ptr for the Compressed Row Storage method work? - sparse-matrix

I am trying to understand how does the row ptr for compressed row storage method work? I am reading this article: http://netlib.org/linalg/html_templates/node91.html#SECTION00931100000000000000
The val and col_ind are kind of straight forward but I am confused about how the row ptr array are generated?
For example, how did you get the second 3 and third 6?
According to the article, row_ptr[n+1] should equal to nnz +1.
However, the definition given in the above article does not make sense to me. NNZ is the number of non zeros in the original matrix which should be a constant. If so, we would end up with the same val for all elements inside the row ptr array.
Can someone please help me understand this? Thanks for your help!

I think I finally understand how it works. 3 refers to the fact that the first row has only only two non zeros element. 6 refers to the fact that the second row has 3. Hope that helps someone. So the nnz here is a variable rather than a constant

Related

Deleting consecutive and non-consecutive columns from a cell array in Matlab

I'm trying to delete multiple consecutive and non-consecutive columns from a 80-column, 1-row cell array mycells. My question is: what's the correct indexing of a vector of columns in Matlab?
What I tried to do is: mycells(1,[4:6,8,9]) = [] in an attempt to remove columns 4 to 6, column 8 and 9. But I get the error: A null assignment can have only one non-colon index.
Use a colon for the first index. That way only the 2nd index is "non-colon". E.g.,
mycells(:,[4:6,8,9]) = []
MATLAB could have been smart enough to recognize that when there is only one row the 1 and : amount to the same thing and you will still get a rectangular array result, but it isn't.
Before getting the above VERY VERY HELPFUL AND MUCH SIMPLER answers, I ended up doing something more convoluted. As it worked in my case, I'll post it here for anyone in future:
So, I had a cell array vector, of which I wanted to drop specific cells. I created another cell array of the ones I wanted to remove:
remcols = mycells(1,[4:6,8,9])
Then I used the bellow function to overwrite onto mycells only those cells which are different between remcols and mycells (these were actually the cells I wanted to keep from mycells):
mycells = setdiff(mycells,remcols)
This is not neat at all but hopefully serves the purpose of someone somewhere in the world.

Excel: creating an array with n times a constant

I have been looking around for a while but unable to find an answer to my question.
In Excel, what compact formula can I use to create an array made up of a single element repeated n times, where n is an input (potentially hard-coded)?
For example, something that would look like this (the formula below does not work but gives an idea of what I am looking for):
{={"Constant"}*3}
Note: I am not looking for a VBA-based solution.
EDIT Reading #AxelRichter answer, I see I should also indicate that the formulas below assume Constant is a number. If Constant is text, then this solution will not work.
Volatile:
=ROW(INDIRECT("1:" & Repts))/ROW(INDIRECT("1" & ":" & Repts)) * Constant
non-Volatile:
=ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))/ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))*Constant
If
Constant = 14
Repts = 3
then
Result = {14;14;14}
The first part of the formulas create an array of 1's repeated Repts times. Then we multiply that array by Constant to get the desired result.
And after reading #MacroMarc's comment, the following non-volatile formula shouyld also work for numbers:
=(ROW($A$1:INDEX($A:$A,Repts))>0)*Constant
One could concatenate 1:n empty cells to the "Constant" to create a string array having n items "Constant":
"Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3)
There 3 is n.
Used in Formula
=INDEX("Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3),0)
Evaluate Formula shows that it works:
Here column XFD is used because in most cases this column will be empty and a column which is guaranteed to be empty is needed for this solution.
If used
"Constant"&T(ROW($A$1:INDEX($A:$A,3)))
=INDEX("Constant"&T(ROW($A$1:INDEX($A:$A,3))),0)
the need of an empty column disappears. The function ROW returns numbers but the T returns an empty string if its parameter is not text. So empty strings will be concatenated for each 1:3 (n).
Thanks to #MacroMarc for the hint.
Try:
REPT("Constant", SEQUENCE(3,1,1,0))
Or, if the reference is to a dynamic array:
REPT("Constant", SEQUENCE(A1#,1,1,0))
The dynamic array spills, and has your constant repeated one time.
Using SEQUENCE with a step of 0 is a much cleaner way to make an array of constants. You can choose whether you want rows or columns (or both!) as well.
=SEQUENCE(Repts,1,Constant,0)
I will generally use a sequence (like Claire (above) said). But if you want to provide an output of text objects, I would do it this way:
=IF(SEQUENCE(A1,A2,1,0),A3)
Where:
A1 has the number of rows
A2 has the number of columns
A3 has the thing you want repeated into an array
The sequence will create a matrix of 1's, which the IF statement will default to the TRUE expression (being the contents of A3).
So, if you wanted a vertical list of 3 items that says "Constant", this would do it:
=IF(SEQUENCE(3,,1,0),"Constant")
If you would prefer it be arranged horizontally instead of vertically, just amend the SEQUENCE function:
=IF(SEQUENCE(,3,1,0),"Constant")

Excel Array formulas details and guidance needed (INDEX,MATCH and SUM/COUNTIF)

So I'm having difficulties understanding fully how arrays works and when they are used by excel and specifically what happens in the background.
From reading the past few hours I understand that one of the reasons my Index Match doesn't work without array is simply because its a multicriteria Match that I use as below:
{=INDEX(D30:E36,MATCH(F33&G33,B30:B36&C30:C36),2)}
From what I understand the reason is that Match returns a {x,y} result which classifies it as an array formula. But considering the point is to get a row number, if the row I'm looking for is 5 then Match will return a {5,5} for row number for Index. And then Index interprets this as just 5? or what exactly happens in the background here?
Then I found an article which showed how to circumvent the array formula and not need ctrl+shift+enter as shown below. How does the below change things and what happens in the background?
=INDEX(D30:E36,MATCH(F33&G33,INDEX(B30:B36&C30:C36)),2)
The below is a an array SUM/COUNTIF formula which counts unique cells only which does not work without array brackets. Why is that and how does it work? It involves maths so I'm not sure.
{=SUM(1/(COUNTIF(A1:A5,A1:A5)))}
Thank you!

Change array dimensions, using spreadsheet functions, when used inside SUMPRODUCT

I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces an multi-cell array, within INDEX and have 2nd and/or 3rd arguments as array of consecutive values which will represent rows/columns.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15)-1));1)
First argument within INDEX is the resulting array coming from a formula to shrink (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered; second argument is the array 1..15 given by row numbers from 1 to the number of total rows of the "array from formula to shrink" MINUS 1: the first 15 (out of 16) values in the array from a formula; 3rd argument is the column of the shrank array (if need be, more than one could be selected using an analogue to the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to row_num/column_num argument behaviour. Very kindly XOR LX has shown me how that approach can work, and also thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry " The following formula has been checked in Excel 2010 and has the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: row_num argument of first INDEX, a ROW generated auxiliary array, has been nested inside N(INDEX([...],,)); at least one comma is necessary to account for the two arguments minimum of the nested INDEX. It is in itself interesting the discussion that applies generally to INDEX's arguments, and other functions', that need to be coerced to take arrays (see, here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog
Unlike OFFSET, (...) for which the first parameter must be a
reference (...) in the worksheet, INDEX can also accept –
and manipulate – for its reference arrays which consist of values
generated e.g. via other subfunctions within the formula. XOR LX's blog
That would be indeed the case in changing the dimension in an array as in this question, but also useful in reversing or displacing the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when files are open in Excel.
Regrettably, this type of coercion is not passed correctly to Open Office, and formula need to be "decoerced" to work, at least in my casual tests.
In order to use a formula that would work in both spreadsheet programs regarding shortening arrays, the only thing I have managed is the following (required: arrays must be single-column)
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)

Cell arrays in MATLAB

I know what a cell array is. I just came to the following line:
cell_array = cell([], 1);
What does that mean? How can we read the above line?
Thanks.
So this makes a 0x1 empty cell array. As in literally 0 rows and 1 column. You could make a 0x0 cell array like this:
cell_array = {}
which makes sense to me. I do that sometimes like if you can't preallocate before a loop, sometimes it's useful to concatenate onto or to go cell_array(end) = ... in the loop.
I really don't know why you'd prefer a 0x1 but this question shows how it differs from a 0x0. So I mean, if for some weird reason you are running a for loop on this empty array you know it will run at least once :/ But that's really scraping the barrel for a reason. I think just stick with = {}.
EDIT:
As #radarhead points out, this is one way to preset the number of columns if you plan on concatenating new rows in a loop.
cell is used to allocate a cell array, in this case: cell_array.
So an empty cell of size 0x1 is allocated.
However, this is no usefull syntax for cells. For other (custom) objects it can be useful to generate such empty arrays
myArray = myObject.empty(0, 1);

Resources