I'm trying to calculate the best segment times in a run. For example, the per-km splits give a list such as:
5:40 6:00 5:45 5:55 6:21 6:30
I'm trying to gather the best segments of 2 km, 3 km, 4 km, etc., and would like a simple way to do it. At the moment, I'm using the formula
=MIN(IF(B1=0,9:9:9,SUM(A1:B1)),IF(C1=0,9:9:9,SUM(B1:C1)))
but this goes all the way to 50 km, meaning a very long formula that I then have to repeat slightly differently at 3 km, then 4 km, then 5 km, etc. Surely there must be a way of
generating an array of the sums of every n consecutive columns, then iterating over that to find the minimum while ignoring values of 0?
I can do it manually for now, but what if I want to go over 50 km? I might want to incorporate bike rides or car drives in the future for some data analysis, so I figured it best to find an ideal formula now.
It's frustrating, as I could code it, but ideally I want to avoid VBA and stick to formulas in Excel.
Here is a draft for the case where there aren't any zeroes, just for groups of 2 km. I decided the simplest approach initially was to add a couple of helper rows containing the running total of times (and, for later use, counts) and use a formula like this to subtract them in pairs:
=MIN(INDEX(A2:J2,SEQUENCE(1,9,2))-IF(SEQUENCE(1,9,0)=0,0,INDEX(A2:J2,SEQUENCE(1,9,0))))
but if you have access to recent additions to Excel 365 such as SCAN, you can do it without helper rows.
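For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of the same running-total idea. It is an illustration, not a translation of the exact formula: the function name `best_segment` is made up, and the times are assumed to be already converted to seconds.

```python
from itertools import accumulate

def best_segment(times, leg):
    """Return the smallest sum of `leg` consecutive values in `times`."""
    if leg < 1 or leg > len(times):
        raise ValueError("leg must be between 1 and len(times)")
    # Running total with a leading 0, so running[i] is the sum of the first i values.
    running = [0, *accumulate(times)]
    # Each window sum is the difference of two running totals that are `leg` apart.
    return min(running[end] - running[end - leg] for end in range(leg, len(running)))

# Splits from the question, converted to seconds (5:40 -> 340, 6:00 -> 360, ...).
splits = [340, 360, 345, 355, 381, 390]
print(best_segment(splits, 2))  # 700, i.e. 11:40
print(best_segment(splits, 3))  # 1045, i.e. 17:25
```

The leading 0 plays the same role as the IF(...=0,0,...) branch in the formula: it supplies a "running total before the first column" so the first window needs no special case.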
Here is a more realistic scenario with a couple of zeroes thrown in:
=LET(runningSum,Y$4:AP$4,runningCount,Y$5:AP$5,cols,COLUMNS(runningSum),leg,X7,
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
Note that no run of more than seven consecutive legs is free of zeroes, so leg lengths of 8, 9, 10, etc. just work out to 0.
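The zero handling can be sketched the same way in Python: keep a running count of non-zero entries next to the running sum, and only accept a window when its count equals the leg length. The function name and the sample data below are hypothetical, and the sketch returns None where the Excel formula shows 0.

```python
from itertools import accumulate

def best_segment_skip_zeros(times, leg):
    """Smallest sum of `leg` consecutive non-zero values, or None if no such window exists."""
    # Leading 0 so that index i holds the total (or count) of the first i entries.
    running_sum = [0, *accumulate(times)]
    running_count = [0, *accumulate(1 if t > 0 else 0 for t in times)]
    candidates = [
        running_sum[end] - running_sum[end - leg]
        for end in range(leg, len(times) + 1)
        # A window is valid only if all `leg` entries in it are non-zero.
        if running_count[end] - running_count[end - leg] == leg
    ]
    return min(candidates) if candidates else None

# Hypothetical splits in seconds, with two missing (zero) kilometres.
splits = [340, 360, 0, 345, 355, 381, 0, 390]
print(best_segment_skip_zeros(splits, 2))  # 700
print(best_segment_skip_zeros(splits, 3))  # 1081
print(best_segment_skip_zeros(splits, 4))  # None (every 4 km window contains a zero)
```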
As mentioned, you could dispense with the helper rows by using SCAN, but not everyone has access to it, so I will add it separately:
=LET(data,Y$3:AP$3,runningSum,SCAN(0,data,LAMBDA(a,b,a+b)),
runningCount,SCAN(0,data,LAMBDA(a,b,a+(b>0))),leg,X7,cols,COLUMNS(data),
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
Tom, that worked! I learnt a few things on the way too: using INDEX alongside SEQUENCE and COLUMNS is something I had not thought of. I'd never heard of the LET function before, and I can already see that it is going to really help with some of the bigger calculations in the future.
Thank you so much. I'd like to show you how it now looks. Row 3087 is my old formula, and row 3088 is a copy of the same data using the new formula. As you can see, I get exactly the same results, so it's clear that it works, and it can be easily duplicated.
I'm having an issue getting the Typeahead feature to work properly in UI Bootstrap with large datasets. I've got nearly 92,000 records coming back, and it seems the maximum number of records that can be in an array is 10,000. That means I have 10 arrays that contain data.
However, currently I am only able to search through one array at a time, so if I set it to response.data[0], I am going to be missing 81,000+ records to run typeahead on.
I'm sure there has to be a way to set it so that it can:
A) Either put the data in a single array for it to work with
OR, preferably:
B) wait until the user types in a certain number of keystrokes, say 3, and then do a "Get" call to the server with that data, and only bring back the data that matches those 3 keystrokes, which will likely be far lower than 10,000 items for it to search through...
Can anyone help with either scenario? Preferably Scenario B?
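To make the two scenarios concrete, here is a rough Python sketch of the logic only, not of the Angular wiring. All names are hypothetical: `flatten` stands for Scenario A (merge the arrays into one), and `search` stands for what a server-side lookup might do in Scenario B (ignore short input, then return a capped list of prefix matches). The threshold of 3 comes from the question; the cap of 20 is an assumption.

```python
from itertools import chain, islice

MIN_CHARS = 3     # threshold from the question ("say 3")
MAX_RESULTS = 20  # hypothetical cap on the number of suggestions returned

def flatten(chunks):
    """Scenario A: merge a list of arrays into one flat list."""
    return list(chain.from_iterable(chunks))

def search(records, typed, min_chars=MIN_CHARS, limit=MAX_RESULTS):
    """Scenario B: return at most `limit` records that start with `typed`."""
    if len(typed) < min_chars:
        return []  # too few keystrokes: do not search at all
    needle = typed.casefold()
    matches = (r for r in records if r.casefold().startswith(needle))
    return list(islice(matches, limit))

# Tiny made-up stand-in for the 10 arrays of 10,000 records.
chunks = [["Alabama", "Alaska"], ["Albany", "Boston"]]
records = flatten(chunks)
print(search(records, "ala"))  # ['Alabama', 'Alaska']
print(search(records, "al"))   # [] (below the 3-character threshold)
```

With Scenario B the client only ever receives the short filtered list, so the 10,000-per-array limit never comes into play.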
I was wondering what is wrong with the following formula.
IF [Age] = Null() THEN Average([Age]) ELSE [Age] ENDIF
What I am trying to do is: "If the cell is blank, then fill the cell with the average of all the other cells in [Age]."
Many thanks all!
We do a lot of imputation to correct null values during our ETL process, and there are really two ways of accomplishing it.
The First Way: Imputation tool. You can use the "Imputation" tool in the Preparation category. In the tool options, select the fields you wish to impute, click the radio button for "Null" on Incoming Value to Replace, and then click the radio button for "Average" in the Replace With Value section. The advantage of using the tool directly is that it is much less complicated than the other way of doing it. The downsides are 1) if you are attempting to fix a large number of rows relative to machine specs, it can be incredibly slow (much slower than the next way), and 2) it occasionally errors when we use it in our process, without much explanation.
The Second Way: Calculate averages and use formulas. You can also use the "Summarize" tool in the Transform category to generate an average field for each column. After generating the averages, use the "Append Fields" tool in the Join category to join them back into the stream. Every row in your data will then carry the same average values. At that point, you can use the Formula tool as you attempted in your question. E.g.
IF [Age] = Null() THEN [Ave_Age] ELSE [Age] ENDIF
The second way is significantly faster to run for extremely large datasets (e.g. fixing possible nulls in a few dozen columns over 70 million rows), but is much more time intensive to set up and must be created for each column.
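The logic of the second way (compute the average once over the non-null values, then substitute it wherever the field is null) can be sketched in a few lines of Python. This is only an illustration of the idea, not Alteryx syntax; the function name `impute_mean` is made up, and None stands in for Null.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the non-None entries."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to average, so leave the column unchanged
    # Step 1: one aggregate over the whole column (the "Summarize" step).
    avg = mean(known)
    # Step 2: a row-by-row substitution (the "Formula" step).
    return [avg if v is None else v for v in values]

ages = [30, None, 40, 50, None]
print(impute_mean(ages))
```

The key point is that the average is computed over the whole column before the row-level formula runs, which is why the original expression with Average([Age]) inside the IF does not do what was intended.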
That is not the way the Average function works. You need to pass it the entire list of values, not just one.
I'm using Microsoft SQL Server Report Builder to list some production data in a table, mainly part numbers. I would like it to change the fill color of the part number cells based on the number in the cell.
Previously we have been using a solution that uses mod10 to color the cell based on the last digit. This causes a repeat for every tenth part number, but that is fine. However, we have now started a new series, which means that I need to deal with the numbers 1-9. Obviously, the mod10 trick does not work here. Is there a smarter way of getting the last digit that also works on numbers from 1-9, or do I have to make some sort of IIF statement?
Here is an example of the code I use, though with mod5 rather than mod10:
=Choose(1 + Fields!cPri_runnr.Value Mod 5,"DarkOliveGreen","Olive","LimeGreen","Yellow","Khaki")
There are several options here. If Mod is sufficient, you can use Choose, Switch, or even IIF. In my opinion, the best solution would use Custom Code to hash the part number (or even take multiple inputs from the details row) and return a color directly. This could then be easily reused in multiple sections of the report (chart colors, additional cell back-colors, even cell text color).
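As a rough illustration of the hashing idea, here is a Python sketch. Report Builder custom code is written in VB.NET, so this would need porting; it only shows the shape of the function. The name `color_for` is hypothetical, the palette is the one from the question, and CRC32 is just one convenient stable hash, not something the report requires.

```python
import zlib

# Palette taken from the Choose(...) expression in the question.
PALETTE = ["DarkOliveGreen", "Olive", "LimeGreen", "Yellow", "Khaki"]

def color_for(part_number):
    """Map a part number (int or string) to a palette color, stably across runs."""
    key = str(part_number).encode("utf-8")
    # CRC32 gives the same value every time for the same input,
    # so a given part number always gets the same color.
    return PALETTE[zlib.crc32(key) % len(PALETTE)]

for part in (7, 17, 1234):
    print(part, color_for(part))
```

Because the function works on the text of the part number, it behaves the same for single-digit and multi-digit numbers, and it can be called from any expression that needs a color.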
I have a very simple grid (30 rows and 20 columns) with some numbers, and I have to perform a lot of operations on this data at runtime. That is, the user enters data into a cell, and the program checks rows, columns, and some single cells, then writes the results to other cells in this grid.
I found that reading is quite fast but writing is horribly slow, even in such a small grid.
I also found that when I put the pair:
suspendLayouts() and resumeLayouts(true)
before and after a block of grid operations, the speed is much better. But in this grid I use the celledit plugin, and the speed problem remains.
Could you suggest some rules for writing such code so that it runs as fast as possible?