Array formula not working in Excel - arrays

I have the following table in Excel (blank spaces are empty):
A B C D
1 1
2 3
3 4
4 -2
5 4
6 9
7 8
8
9
10
I would like to return the minimum of column A from A1 to A1000000, using the QUARTILE function, while excluding all negative values. The reason I want it from A1 to A1000000 and not A1 to A7 is because I want to update the table (adding new rows starting from A8) and have the formula also automatically update. The reason I want the QUARTILE and not MIN function is because I will be extending it to calculate other statistics like 1st and 3rd quartile.
This function works correctly and returns 1 (pressing ctrl+shift+enter):
QUARTILE(IF(A1:A7 > -1, A1:A7), 0)
However, when I tried the following, it returned 0 when it should still return 1 (pressing ctrl+shift+enter):
QUARTILE(IF(A1:A1000000 > -1, A1:A1000000), 0)
I also tried the following and it returned 0 (pressing ctrl+shift+enter):
QUARTILE(IF(AND(NOT(ISBLANK(A1:A1000000)), A1:A1000000 > -1), A1:A1000000), 0)
Anybody have a solution to my problem?

Create a dynamic named range, called for example, rng, defined by =OFFSET($A$1,0,0,COUNT($A1:$A10000),1)
Then modify your array formula to refer to rng, via =QUARTILE(IF(rng >-1,rng), 0)

Actually what you have works. Try doing:
=QUARTILE(IF(A:A > 0,A:A ),0)
The formula returns 0 because a blank cell is treated as the value 0 when this formula is run. For example, erase one of the values in the A1:A7 range and your original formula will return 0. Also, I would run the formula on the entire A column if possible (for readability, etc.).
Or do you need to return a "0" if that number is in the list?
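The blank-cell behaviour described above can be mimicked outside Excel. This is only a minimal Python sketch, not Excel itself: the list `cells` and the helper `excel_like_filter` are hypothetical names, blanks are modelled as `None`, and the coercion of blanks to 0 is written by hand on the assumption that it mirrors what the array formula does.

```python
from statistics import quantiles

# Column A from the question: seven numbers followed by blank cells (None).
cells = [1, 3, 4, -2, 4, 9, 8, None, None, None]


def excel_like_filter(values, threshold):
    """Keep values above threshold, treating blanks as 0 (assumed Excel-like coercion)."""
    coerced = [0 if v is None else v for v in values]
    return [v for v in coerced if v > threshold]


# IF(A:A > -1, A:A): blanks become 0 and pass the test, so the minimum is 0.
with_blanks = excel_like_filter(cells, -1)
min_with_blanks = min(with_blanks)

# IF(A:A > 0, A:A): blanks (0) and negatives are dropped, so the minimum is 1.
positives = excel_like_filter(cells, 0)
min_positives = min(positives)

# Quartile cut points using inclusive interpolation
# (assumed to correspond to QUARTILE with quart = 1, 2, 3).
q1, q2, q3 = quantiles(positives, n=4, method="inclusive")

print(min_with_blanks, min_positives, q1, q2, q3)
```

Note that the `> 0` filter also drops genuine zeros, which is the trade-off the answer asks about.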

Related

Spreadsheet Formula read the wrong Value

I run the VLOOKUP formula on data like this in the spreadsheet:
A65 B65 C65 D65 E65 F65 G65
AGR 1 1 Penjualan 12/12/2022 Makanan 15,500
I used formula :
=IFNA(Vlookup(A65,A:D,4,0),"")
It should return the value "Penjualan", or "" if nothing is found,
but it returns a different value.
Your formula is correct unless there is another AGR in range A1:A64, because VLOOKUP with an exact match returns the first match it finds from the top of the range. Try:
=IFNA(VLOOKUP(A65, A65:D65, 4, 0))
to see that it is working properly.
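The first-match behaviour can be sketched in Python. This is an illustration under assumptions, not the spreadsheet engine: `vlookup_exact` is a hypothetical helper, and the first two rows are invented to stand in for a duplicate key somewhere above row 65.

```python
# Hypothetical sheet rows (columns A..D). Only the last row comes from the
# question; the earlier rows are assumptions added for illustration.
rows = [
    ("AGR", 1, 1, "Pembelian"),  # hypothetical earlier row with the same key
    ("BDG", 2, 1, "Penjualan"),  # hypothetical unrelated row
    ("AGR", 1, 1, "Penjualan"),  # the row shown in the question
]


def vlookup_exact(key, table, col_index):
    """Return column col_index (1-based) of the first row whose first column equals key."""
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return ""  # stands in for the IFNA(..., "") fallback


whole_range = vlookup_exact("AGR", rows, 4)     # searches from the top
single_row = vlookup_exact("AGR", rows[2:], 4)  # searches only the last row
print(whole_range, single_row)
```

Searching the whole range returns the earlier row's value, while restricting the range to the single row returns the expected one.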

Using the window function "last_value", when the values of the sorted field are same, the value snowflake returns is not the last value

As we all know, the window function "last_value" returns the last value within an ordered group of values.
In the following example, rows are partitioned by field "A" and sorted by field "B" in ascending order.
In the group "A = 1", the last value is returned, that is, the C value 4 where B = 2.
However, in the group "A = 2", the values of field "B" are all the same.
In this case, instead of the last value (the C value 4 in row 6), the first C value, 1, among the rows with B = 2 is returned.
I am puzzled as to why the last value within an ordered group is not returned when the values of the field I sort by are tied.
Example
row_number  A  B  C  LAST_VALUE(C) IGNORE NULLS OVER (PARTITION BY A ORDER BY B ASC)
1           1  1  2  4
2           1  1  1  4
3           1  1  3  4
4           1  2  4  4
5           2  2  1  1
6           2  2  4  1
For partition A = 2 there is a tie on column B.
The sort is NOT stable. To achieve a stable sort, the column or combination of columns in the ORDER BY clause must be unique.
To illustrate:
SELECT C
FROM tab
WHERE A = 2
ORDER BY B
LIMIT 1;
It could return either 1 or 4.
If you sort by B within A then any duplicate rows (same A and B values) could appear in any order and therefore last_value could give any of the possible available values.
If you want a specific row, based on some logic, then you need to sort by all the columns within the group that reflect that logic. So in your case you would need to sort by B and C.
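The effect of a tie on the sort key can be sketched in Python. This is an illustration, not Snowflake: Python's own sort is stable, so the engine's freedom to return tied rows in any order is simulated by feeding the same two rows in two different input orders. The helper `last_value` is a hypothetical stand-in for the window function over a single partition.

```python
def last_value(rows, order_key):
    """Return C from the last row after sorting rows of (A, B, C) by order_key."""
    return sorted(rows, key=order_key)[-1][2]


# Partition A = 2 from the question, in two possible physical orders.
order_one = [(2, 2, 1), (2, 2, 4)]
order_two = [(2, 2, 4), (2, 2, 1)]

# ORDER BY B only: the tie on B leaves the arrival order untouched,
# so the "last" row depends on how the rows happened to arrive.
by_b_one = last_value(order_one, lambda r: r[1])
by_b_two = last_value(order_two, lambda r: r[1])

# ORDER BY B, C: the sort key is unique, so both arrival orders agree.
by_bc_one = last_value(order_one, lambda r: (r[1], r[2]))
by_bc_two = last_value(order_two, lambda r: (r[1], r[2]))

print(by_b_one, by_b_two, by_bc_one, by_bc_two)
```

With the tie-breaker on C the result no longer depends on arrival order, which is the point of sorting by both B and C.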
Good day Bill!
Right, the sorting is not stable and it will return different output each time.
To get stable results, we can run something like below
select
    column1,
    column2,
    column3,
    -- ordering by column2, column3 makes the sort key unique within each partition
    last_value(column3) over (
        partition by column1
        order by column2, column3
    ) as column3_last
from values
    (1,1,2), (1,1,1), (1,1,3),
    (1,2,4), (2,2,1), (2,2,4)
order by column1;

Is there an Excel array formula to return 1 for the first occurrence of a value, and 0 for subsequent occurrences

I need an Excel formula that returns 1 for the first occurrence of each text value within a sequence and 0 for its duplicates in the same sequence.
I included a screenshot.
Desired output: 1 for every unique text value and 0 for its duplicates, with the counting restarting when the ID moves to the next number.
Thanks
Gianluca
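The desired output can be pinned down with a short Python sketch. It is only a sketch of the logic, not an Excel formula: the screenshot is not available here, so the `(ID, text)` layout and the sample rows are assumptions, and `first_occurrence_flags` is a hypothetical name.

```python
def first_occurrence_flags(rows):
    """Return 1 the first time a text value appears under an ID, otherwise 0."""
    seen = set()
    flags = []
    for id_, text in rows:
        key = (id_, text)  # pairing with the ID restarts the count for each ID
        flags.append(0 if key in seen else 1)
        seen.add(key)
    return flags


# Hypothetical data: (ID, text) pairs standing in for the two sheet columns.
sample = [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "a"), (2, "c")]
flags = first_occurrence_flags(sample)
print(flags)
```

Keying on the pair rather than on the text alone is what makes "a" count as new again under ID 2.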

How to make Google Studio Handle Blank values in my source (google sheet) when creating calculated fields

I'm trying to create a field in Google Data Studio that sums 5 different fields from my source. The problem is that those values are currently blank (the sheet reads from another system), but I want to create the field anyway for when the values are fed in. Google Data Studio is telling me the field cannot be deleted.
I've tried to handle this with CASE (if X is null then 0...), but this isn't working. When I manually add data to the sheet (the source) it works, but when I delete it again the scorecard I'm using returns an error.
This is the code for one of the 5 fields i'm trying create.
The 1 to 5 range is the range of all possible values in the spreadsheet.
Field A:
case
when X IS NULL then 0
when X = 1 then 1
when X = 2 then 2
when X = 3 then 3
when X = 4 then 4
when X = 5 then 5
else 0
end
The error "Failed to create field. Please try again later..." appears when I try to create the field while the values of the fields in the spreadsheet are blank.
0) Summary
IFNULL (#1) is the new recommended suggestion; NARY_MAX (#2) was the original suggestion:
1) Update - Recommend Suggestion (IFNULL)
The 01 Apr 2021 Update introduced the IFNULL function, which is specifically designed to assign a numeric value to NULL values, whereas the original NARY_MAX Calculated Field below sets the minimum value to 0, which would lead to negative values (such as -1) being captured as 0:
IFNULL(Field1, 0) +
IFNULL(Field2, 0) +
IFNULL(Field3, 0) +
IFNULL(Field4, 0) +
IFNULL(Field5, 0)
Added a New Page to the Editable Google Data Studio Report (Embedded Google Sheets Data Source) and a GIF to elaborate:
2) Original Suggestion (NARY_MAX)
It can be achieved using the NARY_MAX function:
NARY_MAX(Field1, 0) + NARY_MAX(Field2, 0) + NARY_MAX(Field3, 0) + NARY_MAX(Field4, 0) + NARY_MAX(Field5, 0)
How it works:
NARY_MAX takes the MAX value from a range of fields, thus in the above case, when evaluating NARY_MAX(NULL, 0), 0 is greater than NULL, thus 0 is preferred in the calculation.
NULL vs 0
While a number has a value (negative, 0 or positive), NULL is not a value, thus can't be used in a calculation.
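The difference between the two suggestions can be sketched in Python. This models the behaviour as described in this answer rather than Data Studio itself: `None` stands in for NULL, and `ifnull` and `nary_max` are hypothetical helpers, with `nary_max` assumed to skip NULL inputs as the explanation above states.

```python
def ifnull(value, default):
    """Return default when value is missing (None stands in for NULL)."""
    return default if value is None else value


def nary_max(*values):
    """Return the largest non-missing value (simplified model that skips None)."""
    return max(v for v in values if v is not None)


# Hypothetical values for Field1..Field5, including blanks and a negative number.
fields = [3, None, -1, None, 5]

# IFNULL keeps the negative value: 3 + 0 + (-1) + 0 + 5
sum_ifnull = sum(ifnull(f, 0) for f in fields)

# NARY_MAX clamps the negative value to 0: 3 + 0 + 0 + 0 + 5
sum_nary_max = sum(nary_max(f, 0) for f in fields)

print(sum_ifnull, sum_nary_max)
```

The two sums agree whenever no field is negative, which is why NARY_MAX was a workable suggestion for the 1 to 5 range in the question.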
Google Data Studio Report to demonstrate:

Link two tables based on conditions in matlab

I am using matlab to prepare my dataset in order to run it in certain data mining models and I am facing an issue with linking the data between two of my tables.
So, I have two tables, A and B, which contain sequential recordings of certain values at certain timestamps, and I want to create a third table, C, in which I will place columns of both A and B in the same rows according to some conditions.
Tables A and B don't have the same amount of rows (A has more measurements) but they both have two columns:
1st column: time of the recording (hh:mm:ss) and
2nd column: recorded value in that time
Columns of A and B are going to be added to table C when all the following conditions hold:
The time difference between A and B is more than 3 sec but less than 5 sec.
The recorded value of A is 40% to 50% of the recorded value of B.
Any help would be greatly appreciated.
For the first condition you need something like [row,col,val]=find((A(:,1)-B(:,1))>3sec & (A(:,1)-B(:,1))<5sec), where you need to use datenum or an equivalent to transform your timestamps. Note the element-wise & operator: && only accepts scalar operands. The second condition works the same way: use [row,col,val]=find(A(:,2)>0.4*B(:,2) & A(:,2)<0.5*B(:,2)).
datenum allows you to transform your arrays, so do that first:
A(:,1) = datenum(A(:,1));
B(:,1) = datenum(B(:,1));
you might need to check the documentation on datenum, regarding the format your string is in.
time1 = [datenum([0 0 0 0 0 3]) datenum([0 0 0 0 0 5])];
creates the datenums for 3 and 5 seconds. All combined:
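% convert the timestamps to serial date numbers
A(:,1) = datenum(A(:,1));
B(:,1) = datenum(B(:,1));
% thresholds of 3 and 5 seconds, expressed as date numbers
time1 = [datenum([0 0 0 0 0 3]) datenum([0 0 0 0 0 5])];
% element-wise & is required for arrays; && only accepts scalars
[row1,col1,val1]=find((A(:,1)-B(:,1))>time1(1) & (A(:,1)-B(:,1))<time1(2));
[row2,col2,val2]=find(A(:,2)>0.4*B(:,2) & A(:,2)<0.5*B(:,2));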
The variables of row and col you might not need when you want only the values though. val1 contains the values of condition 1, val2 of condition 2. If you want both conditions to be valid at the same time, use both in the find command:
[row3,col3,val3]=find((A(:,1)-B(:,1))>time1(1) & ...
(A(:,1)-B(:,1))<time1(2) & A(:,2)>0.4*B(:,2) ...
& A(:,2)<0.5*B(:,2));
The actual adding of your two arrays based on the conditions:
C = A(row3,2)+B(row3,2);
Thank you for your response and help! However, for the time being I followed a different approach, converting hh:mm:ss to seconds, which will make the comparison easier later on:
dv1 = datevec(A, 'dd.mm.yyyy HH:MM:SS.FFF ');
secs = [3600,60,1];
dv1(:,6) = floor(dv1(:,6));
timestamp = dv1(:,4:6)*secs.';
Now I am working on combining both time and weight conditions in a piece of code that will run. Should I use an if condition inside a for loop or is a for loop not necessary?
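One way to combine both conditions once the times are in seconds is to compare every row of A with every row of B, which also copes with the two tables having different lengths. The following is a minimal sketch in Python rather than MATLAB, under stated assumptions: the time difference is taken as an absolute value, both bounds are strict, the sample recordings are invented, and `to_seconds` and `match_rows` are hypothetical helpers.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def match_rows(table_a, table_b):
    """Pair each row of A with each row of B that satisfies both conditions."""
    matches = []
    for time_a, value_a in table_a:
        for time_b, value_b in table_b:
            gap = abs(to_seconds(time_a) - to_seconds(time_b))
            time_ok = 3 < gap < 5  # more than 3 s but less than 5 s apart
            ratio_ok = 0.4 * value_b < value_a < 0.5 * value_b  # A is 40-50% of B
            if time_ok and ratio_ok:
                matches.append((time_a, value_a, time_b, value_b))
    return matches


# Hypothetical recordings; A has more rows than B, as in the question.
table_a = [("10:00:00", 45.0), ("10:00:10", 20.0), ("10:00:21", 48.0)]
table_b = [("10:00:04", 100.0), ("10:00:14", 100.0)]

table_c = match_rows(table_a, table_b)
print(table_c)
```

The nested loop is the simplest form to reason about; for large tables the same pairwise comparison could be expressed with array operations instead.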

Resources