Array constants aren't working as expected in Excel - arrays

I was trying to use an array constant to do some calculations. I saw this thread: Array Constants in Excel, but I am using the array constant within the formula, so it is not relevant. If I use =SUM({1,2,3}) the result is 6 as expected. However, if I use it with DCOUNT, it doesn't work as expected:
A
1 Colour
2 Red
3 Yellow
4 Green
5 Red
6
7 Colour
8 =Red
The result of =DCOUNT(A1:A5;;A7:A8) is 2.
The result of =DCOUNT(A1:A5;;{"Colour";"=Red"}) is #VALUE!. The error message is: Value used in formula is wrong data type.
Is this some inconsistency in MS Excel 2010? Or have I done something wrong?
EDIT
It was suggested that "=Red" was the issue, but the reference to this page, at the heading Elements you can use in constants, IMO doesn't really explain it. If it were the issue, however, then the following should work:
A
1 Number
2 1
3 2
4 3
5 1
6
7 Number
8 1
The formula =DCOUNT(A1:A5;;A7:A8) gives 2, but the formulas =DCOUNT(A1:A5;;{"Number";1}) and =DCOUNT(A1:A5;;{"Number";"1"}) both still give the same error as my previous example.

A range can be used as an array, but an array cannot be used as a range.
Since DCOUNT specifies a range for that parameter, an array constant is an illegal type for it.
According to these pages:
Introducing array formulas in Excel
Putting advanced array formulas to work
they imply that array constants are meant for arguments that do not take ranges but instead take:
Either an array, which would result in a single value
-or-
Single values, resulting in an array (Ctrl-Shift-Enter must be used to generate the array result)
To do what I was trying to do (count all of the cells that contained the string Red in range A2:A5), I would do something like this:
A
1 Colour
2 Red
3 Yellow
4 Green
5 Red
=SUM(IF(A2:A5="Red", 1, 0)) which would count the number of Red entries by creating the intermediate array {1;0;0;1} and then add all the elements together resulting in 2.
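For anyone who finds the intermediate array easier to see outside Excel, here is a minimal Python sketch of the same two steps. The list colours is just a hypothetical stand-in for cells A2:A5, and note that the comparison here is case-sensitive, whereas Excel's = comparison on text is not.

```python
# Hypothetical stand-in for the cells A2:A5.
colours = ["Red", "Yellow", "Green", "Red"]

# Equivalent of IF(A2:A5="Red", 1, 0): one flag per cell.
flags = [1 if colour == "Red" else 0 for colour in colours]

# Equivalent of SUM(...) over the intermediate array.
red_count = sum(flags)

print(flags)      # [1, 0, 0, 1]
print(red_count)  # 2
```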

Related

(arrays) how to move a range of elements to another position

i have a dynamic html table, generated with php, with data from mariadb
i need an algorithm (preferably in sql, or in php if not) to re-order the rows and submit... like this (where x = checkbox, o = radiobutton):
range dest data
---------------
. . one
x . two
x o three
x . four
. . five
[submit move]
the idea is similar to excel > select rows > move+insert rows... first you select a source RANGE (i did it with js: just select the first and last rows, and the range selects itself), then you select an insert destination position, then submit... in this example the result will be
one
five
two
three
four
as you can see the selected range [two..four] has advanced to the specified position, shifting the overwritten items into the space left by the range, preserving the length and just changing the indexes... i imagine some sql commands like
UPDATE my_table SET index=index+$diff WHERE index>=$start AND index<=$end;
seems easy but it is not (for me at least), since there are many different cases with different range lengths, positions, etc... i couldn't find any native sql or php function or snippet to do it
please, if you have some knowledge or idea about a related algorithm i will be very thankful ... and forgive my bad english xD
due to the question's popularity xD i tried to find an answer myself... and i share it in the hope it will be useful to somebody, and hopefully motivate someone else to find a better solution
consider this initial set of elements (only indexes shown), and that we want to move elements from 2 to 4 to position 6... we expect a final result
orig   1 2 3 4 5 6 7 8 9
src      2 3 4
dest             6
final  1 5 6 7 8 2 3 4 9
the first thing i did was to calculate the source range length, that is 4 - 2 = 2
then i saved the source range in a temp table (it could also be done by marking the elements somehow, or by writing down unique id's, etc...)
although the indexes could simply be duplicated, imagine instead that we take away the source range and leave a "hole" of "length +1" elements in the list
  2 3 4
  ^ ^ ^
1 . . . 5 6 7 8 9
then the key is to shift the remaining elements to the opposite side by "length +1" positions, which can be done by subtracting "length +1" (3) from the indexes of the remaining elements
< < <
1 5 6 7 8 . . . 9
note that the number of remaining items you shifted is 4: this is the difference between the last destination position and the last source element (8 - 4 = 4)... the last destination position is destination + "length" (6 + 2 = 8)
in case the movement is done in the opposite direction (i.e. the source range moves to its left), all ranges, shifts and operations must be corrected accordingly
now you can set the source range to its new position by simply adding the "remaining length" (4) to its indexes
  > > > > 2 3 4
1 5 6 7 8 . . . 9
be aware that the destination position can't go outside the (first) and (last - "length") positions, i.e. (1 and 7) in this case... and if the first source index equals the destination, the list doesn't change
so to conclude, there are 3 cases with slightly different operations:
move the source range to their right
move it to same position (do nothing)
move it to left
now you could post a pseudocode snippet below as a complement :) ... thanks
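as a complement, here is a minimal python sketch of the index arithmetic described above, covering the three cases. the function names new_index and move_range are made up for this illustration, indexes are 1-based like in the example, and size here counts the elements in the range (so it equals "length +1" in the wording above). it is only a sketch of the idea, not a tested sql/php solution.

```python
def new_index(i, start, end, dest):
    """Return the new 1-based index of the element currently at index i."""
    size = end - start + 1          # number of elements in the source range
    if dest == start:               # same position: nothing moves
        return i
    if dest > start:                # the range moves to the right
        if start <= i <= end:
            return i + (dest - start)
        if end < i <= dest + size - 1:
            return i - size         # elements in between slide left into the hole
        return i
    # otherwise the range moves to the left
    if start <= i <= end:
        return i - (start - dest)
    if dest <= i < start:
        return i + size             # elements in between slide right into the hole
    return i


def move_range(items, start, end, dest):
    """Return a new list with items[start..end] (1-based, inclusive) moved to dest."""
    size = end - start + 1
    if not 1 <= start <= end <= len(items):
        raise ValueError("invalid source range")
    if not 1 <= dest <= len(items) - size + 1:
        raise ValueError("destination out of bounds")
    result = [None] * len(items)
    for i, item in enumerate(items, start=1):
        result[new_index(i, start, end, dest) - 1] = item
    return result


print(move_range([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 4, 6))
# [1, 5, 6, 7, 8, 2, 3, 4, 9]
print(move_range(["one", "two", "three", "four", "five"], 2, 4, 3))
# ['one', 'five', 'two', 'three', 'four']
```

each branch of new_index adds or subtracts a constant over a contiguous index interval, so each one could in principle become an UPDATE ... WHERE index BETWEEN ... statement, provided the moved range is set aside first so the indexes don't collide.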

Complex NumPy Array Manipulation

I have two numpy arrays:
e.g.
import numpy as np
array_1 = np.array([
[5,2,0],
[4,3,0],
[4,2,0],
[3,2,1],
[4,1,1]
])
array_2 = np.array([
[5,2,10],
[4,2,52],
[3,2,80],
[1,2,4],
[5,3,6]
])
In array_1, 0 and 1 at index 2 represent two different categories. For argument's sake, say 0 = Red and 1 = Blue.
So, where the first two elements match in the two numpy arrays, I need to average the third element in np.array_2 by category. For example, [5,2,10] and [4,2,52] both match with category 0 i.e. Red. The code will return the average of the elements at index 2 for the Red category. It will also do the same for the Blue category.
I have no idea where to start with this, any ideas welcome.
You tagged your post NumPy because of the type of the source arrays, but it is much easier and more intuitive to generate your result using Pandas.
Start by converting both arrays to pandas DataFrames. While converting the first array, also convert 0 and 1 in the last column to Red and Blue:
import pandas as pd
df1 = pd.DataFrame(array_1, columns=['A', 'B', 'key'])
df1['key'] = df1['key'].replace({0: 'Red', 1: 'Blue'})
df2 = pd.DataFrame(array_2, columns=['A', 'B', 'C'])
Then, to generate the result, run:
result = df2.merge(df1, on=['A', 'B']).groupby('key').C.mean().rename('Mean')
The result is:
key
Blue 80.0
Red 31.0
Name: Mean, dtype: float64
Details:
df2.merge(df1, on=['A', 'B']) - Generates:
   A  B   C   key
0  5  2  10   Red
1  4  2  52   Red
2  3  2  80  Blue
eliminating at the same time rows which don't belong to any group (are neither Red nor Blue).
groupby('key') - From the above result, generates groups by key (Red / Blue).
C.mean() - The last step is to take the C column (from each group) and compute its mean. The result is a Series with:
index - the grouping key,
value - the value computed for the corresponding group.
rename('Mean') - Changes the name from the source column name (C) to a more meaningful Mean.
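If you would rather not pull in Pandas, the same match-then-average logic can be sketched in plain Python. This is only an illustration under a few assumptions: mean_by_category is a made-up helper name, the rows are given as plain lists (for NumPy arrays you could pass array_1.tolist() and array_2.tolist()), and each pair of leading values appears at most once in the first array.

```python
from collections import defaultdict


def mean_by_category(keyed_rows, value_rows):
    """Average the third value of value_rows per category taken from keyed_rows."""
    # Map each (first, second) pair to its category (0 or 1 in the question).
    category_of = {(a, b): category for a, b, category in keyed_rows}
    totals = defaultdict(lambda: [0, 0])   # category -> [sum, count]
    for a, b, value in value_rows:
        category = category_of.get((a, b))
        if category is None:               # no matching pair: skip the row
            continue
        totals[category][0] += value
        totals[category][1] += 1
    return {category: s / n for category, (s, n) in totals.items()}


rows_1 = [[5, 2, 0], [4, 3, 0], [4, 2, 0], [3, 2, 1], [4, 1, 1]]
rows_2 = [[5, 2, 10], [4, 2, 52], [3, 2, 80], [1, 2, 4], [5, 3, 6]]
means = mean_by_category(rows_1, rows_2)
print(means)  # {0: 31.0, 1: 80.0}
```

Category 0 (Red) averages 10 and 52, and category 1 (Blue) has the single value 80, which agrees with the Pandas result.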

Issues with SUMPRODUCT in Excel: Trying to count the number of average subtractions above a given threshold

I have a fairly simple issue that I cannot seem to work out. It may be familiar to some of you now.
I have the following matrix (which I will refer to as two arrays):
F  G  H  I  J  ...  R  S  T  U  V
1  0  0  1  1
4  4  2  3  5       1  2  3  1  2
2     2  3  1       2  0  1
2  1  0  0  4       0  0  3  0  0
I would like to take the difference between the average of each row in array 1 (columns F:J) and the average of each row in array 2 (columns R:V). For example, the average of F2:J2 = 3.6, the average of R2:V2 = 1.8, and the overall difference = 1.8. I would then like to count the number of overall differences which exceed a given threshold (e.g., 1), but I want to ignore rows which have no entries (see R1:V1) and/or partially missing entries (see the 2nd entry in row F3:J3 and 4th and 5th entry in row R3:V3).
I was lucky enough to be introduced to array formulae by @Tom Sharpe, and have attempted to adapt his code for a similar issue I had, e.g.:
=SUMPRODUCT(--((SUBTOTAL(1,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))-SUBTOTAL(1,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))>1)*(SUBTOTAL(2,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))=COLUMNS(F1:J1))*(SUBTOTAL(2,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))=COLUMNS(R1:V1))>0))
From what I understand, the formula attempts to count the number of differences between the row averages of the two arrays that exceed 1, so long as the product of the two full-entry checks is >0 (i.e. both rows have full data). However, it keeps throwing the #DIV/0! error, which I believe stems from the fact that it is still trying to subtract the averages of F1:J1 and R1:V1 (i.e. the empty row), which would produce this kind of error. The correct answer for the current example is 1 (i.e. F2:J2 [3.6] - R2:V2 [1.8] = 1.8, and 1.8 > 1).
Does anyone have any ideas as to how the code can be adapted for the current purposes, and perhaps a very brief explanation of what is going awry in the current code?
You're right, SUBTOTAL falls over when it's trying to find the average of a range containing only empty cells.
If you want to persevere and do it the same way as before with an array formula, you have to turn it round: put the condition that all the cells in both ranges are non-blank in an IF statement, so that it doesn't try to take the average unless both ranges have no blanks:
=SUM(IF((SUBTOTAL(2,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))=COLUMNS(F1:J1))*(SUBTOTAL(2,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))=COLUMNS(R1:V1)),
--(SUBTOTAL(1,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))-SUBTOTAL(1,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))>1)))
This time, unfortunately, I found I couldn't SUMPRODUCT it (I think this is because of the presence of the IF statement), so you have to enter it as an array formula using Ctrl+Shift+Enter.
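To make the order of the checks explicit (completeness first, averages only afterwards), here is a minimal Python sketch of the same logic. It is not a translation of the formula: count_rows_above is a made-up name, None stands for a blank cell, and the third row's blank positions are placed according to the description in the question.

```python
def count_rows_above(rows_a, rows_b, threshold):
    """Count rows whose mean difference (a - b) exceeds threshold; None marks a blank cell."""
    count = 0
    for a, b in zip(rows_a, rows_b):
        # Skip the row unless every cell in both halves is filled,
        # so the averages are never taken over blank cells.
        if any(v is None for v in a) or any(v is None for v in b):
            continue
        if sum(a) / len(a) - sum(b) / len(b) > threshold:
            count += 1
    return count


# Columns F:J and R:V from the question (blank positions in row 3 assumed).
f_to_j = [[1, 0, 0, 1, 1], [4, 4, 2, 3, 5], [2, None, 2, 3, 1], [2, 1, 0, 0, 4]]
r_to_v = [[None] * 5, [1, 2, 3, 1, 2], [2, 0, 1, None, None], [0, 0, 3, 0, 0]]

result = count_rows_above(f_to_j, r_to_v, 1)
print(result)  # 1
```

Only the second row is both complete and above the threshold (3.6 - 1.8 = 1.8); the fourth row is complete but its difference is 0.8.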
Will this work for you?
=IF(NOT(OR(IFERROR(MATCH(TRUE,ISBLANK(F1:J1),0),FALSE),IFERROR(MATCH(TRUE,ISBLANK(R1:V1),0),FALSE))), SUBTOTAL(1,F1:J1)-SUBTOTAL(1,R1:V1), "Missing Value(s)")
My approach is a little different from what you tried to adapt from @Tom Sharpe, in that I first validate that the cells have data (are not blank) and only then perform the calculation, otherwise returning an error message. This is still an array formula, so when you enter it, press Ctrl+Shift+Enter.
The condition part of the opening IF() checks that none of the cells in either range is blank. If MATCH(TRUE,ISBLANK(range),0) finds a match, a cell is blank (bad). If there is no match, i.e. no blank cells, MATCH returns an #N/A error, which IFERROR turns into FALSE (good). So FALSE for both ranges means no blank cells were found.

Excel average rows to array formula

I want to take the average of rows which would result in a column (array). Example input:
3 4
4 4
4 6
With an array formula I want to create:
3.5
4
5
The average is the sum of the numbers divided by the count of those numbers.
So first add them (A1:A3+B1:B3)
3+4 = 7
4+4 = 8
4+6 = 10
Then divide by the number of numbers(/2):
7/2 = 3.5
8/2 = 4
10/2 = 5
{=(A1:A3+B1:B3)/2}
Edit after comment from OP:
A formula that does the addition without listing each column manually, from https://productforums.google.com/forum/#!topic/docs/Q9x44sclzfY
{=mmult(A1:B3,sign(transpose(column(A1:B3))))/Columns(A1:B3)}
This is one way to do that in Excel
=SUBTOTAL(1,OFFSET(A1:B3,ROW(A1:B3)-MIN(ROW(A1:B3)),0,1))
OFFSET supplies an "array of ranges", each range being a single row, and SUBTOTAL, with 1 as its first argument, averages each of those ranges. You can use this inside another formula or function, or enter it in a range on the worksheet.
The advantage over Siphor's suggestion with MMULT is that this will still work even with blanks or text values in the range (those will be ignored).
If the first column is A and the second is B, then enter this formula in column C:
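To illustrate what "ignoring blanks or text values" means for a per-row average, here is a small Python sketch. It is an analogy rather than a description of how SUBTOTAL works internally; row_averages is a made-up name and None stands for a blank cell.

```python
def row_averages(rows):
    """Average each row, ignoring None and non-numeric cells."""
    result = []
    for row in rows:
        # Keep only real numbers; bool is excluded because it is a subclass of int.
        numbers = [v for v in row
                   if isinstance(v, (int, float)) and not isinstance(v, bool)]
        # A row with no numbers has no average; report it as None.
        result.append(sum(numbers) / len(numbers) if numbers else None)
    return result


print(row_averages([[3, 4], [4, 4], [4, 6]]))   # [3.5, 4.0, 5.0]
print(row_averages([[3, None], ["n/a", 4]]))    # [3.0, 4.0]
```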
=AVERAGE(A1,B1)
and fill it down to the last row.
You can also use a range if you have more than 2 columns (this function allows some cells to be empty):
=AVERAGE(A1:F1)

How do you pick out rows from an array based on a specific value in a column?

I have an array of data. For simplicity, let's call it a 3 x 4 matrix. Let's say I want to find a data point in column 2 that has a value of 5. Then, I want to take all rows that contain the value 5 in column 2 and place them in their own array. My data is much larger than the one displayed below, so I don't want to go through by eye and look at every line of data and identify all the 5's.
% My idea of the code:
data = [1 2 3 4; 5 5 5 6; 6 4 5 6]
if data(:,2) == 5
% This is the part I can't figure out
end
Let's call finaldata the array in which the rows with 5's will be stored. How do I do this?
You should use logical indexing:
all_fives_rows = data(data(:, 2) == 5, :)
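For readers more used to Python, the same mask-then-select idea can be sketched with plain lists. This is only an analogy to the MATLAB line above, and note that Python indexes from 0, so column 2 is row[1].

```python
data = [[1, 2, 3, 4],
        [5, 5, 5, 6],
        [6, 4, 5, 6]]

# One boolean per row: is the value in column 2 equal to 5?
mask = [row[1] == 5 for row in data]

# Keep only the rows where the mask is True.
finaldata = [row for row, keep in zip(data, mask) if keep]

print(mask)       # [False, True, False]
print(finaldata)  # [[5, 5, 5, 6]]
```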
You can use the find function to search for that value; it gives back the indices of the matching rows (possibly a vector), which you can then use to retrieve the rows:
data(find(data(:,2)==5),:)
Why not use logical indexing? Performance
