I want to get some rows calculated according to some parameters. For example, I would like to get the average and percentiles of all columns. I've got about 10 columns, and all of them need to be calculated in the same row.
Edit: I tried to include an ASCII table, but it did not work, so I will describe it here instead:
I've got about 6 rows holding the percentiles and the average for all columns. Within a row, every column is computed with the same function and parameter, for example average(Row) or Percentile_Cont(0.4)(Row).
Is it possible to put the calculation only once for all rows?
Is it possible to parameterize the percentiles?
EDIT:
alejandro zuleta asked me to post a picture made in Excel of what I am trying to achieve. Here it is:
So I've got 6 columns of numeric values. So far, I have written a stored procedure in which I would like to parameterize the calculation rows (they may change) and the percentiles. Sometimes the percentile could be 20 instead of 25.
I'm not sure if this is possible to achieve.
Thank you for any insights in advance.
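To make the request concrete, here is a minimal sketch in Python rather than SQL, assuming the data is available as a mapping of column name to values. The helper names `percentile_cont` and `summary_rows` and the sample columns are hypothetical; the interpolation follows the linear-interpolation convention that, as far as I understand it, PERCENTILE_CONT uses. The calculation is written once and applied to every column, and the percentiles are passed in as a parameter.

```python
from statistics import mean


def percentile_cont(values, fraction):
    """Percentile by linear interpolation between the two nearest sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to summarise")
    position = fraction * (len(ordered) - 1)  # zero-based fractional position
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


def summary_rows(columns, fractions):
    """Build one 'average' row plus one row per requested percentile."""
    rows = {"average": {name: mean(vals) for name, vals in columns.items()}}
    for fraction in fractions:
        label = f"p{round(fraction * 100)}"
        rows[label] = {
            name: percentile_cont(vals, fraction) for name, vals in columns.items()
        }
    return rows


# Hypothetical sample data: two columns, percentiles chosen at call time.
columns = {"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40]}
result = summary_rows(columns, [0.25, 0.4])
for label, row in result.items():
    print(label, row)
```

Changing 0.25 to 0.20 in the call is the only edit needed to switch percentiles.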
I'm trying to calculate the sum of best segments in a run. For example, each Km gives a list as such:
5:40 6:00 5:45 5:55 6:21 6:30
I'm trying to gather the best segments of 2km/3km/4km etc and would like a simple code to do it. At the moment, I'm using the formula
=MIN(IF(B1=0,9:9:9,SUM(A1:B1)),IF(C1=0,9:9:9,SUM(B1:C1)))
but this goes all the way to 50km, meaning a very long formula that I then have to repeat slightly differently at 3km, then 4km, then 5km etc. Surely there must be a way of generating an array of the sums of every n consecutive columns, then iterating over that to find the minimum while ignoring values of 0?
I can do it manually for now, but what if I want to go over 50km? I might want to incorporate bike rides/car drives in the future just for some data analysis, so I figured it best to find an ideal formula now.
It's frustrating, as I could code it, but ideally I want to avoid VBA and stick to formulae in Excel.
Here is a draft of the case where there aren't any zeroes just for groups of 2Km. I decided the simplest approach initially was to add a couple of helper rows containing the running total of times (and for later use counts) and use a formula like this to subtract them in pairs:
=MIN(INDEX(A2:J2,SEQUENCE(1,9,2))-IF(SEQUENCE(1,9,0)=0,0,INDEX(A2:J2,SEQUENCE(1,9,0))))
but if you have access to recent additions to Excel 365 like Scan you can do it without helper rows.
Here is a more realistic scenario with a couple of zeroes thrown in
=LET(runningSum,Y$4:AP$4,runningCount,Y$5:AP$5,cols,COLUMNS(runningSum),leg,X7,
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
Note that there are no runs of more than seven consecutive legs that don't contain a zero so 8, 9, 10 etc. just work out to 0.
As mentioned you could dispense with the helper rows by using Scan, but not everyone has access to this so I will add it separately:
=LET(data,Y$3:AP$3,runningSum,SCAN(0,data,LAMBDA(a,b,a+b)),
runningCount,SCAN(0,data,LAMBDA(a,b,a+(b>0))),leg,X7,cols,COLUMNS(data),
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
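For anyone who finds the logic easier to follow outside a spreadsheet, the same running-total idea can be sketched in Python. This is only an illustration under my own assumptions: times are given in seconds, a zero means a missing leg, and the function name `best_segment` is hypothetical. Unlike the Excel formula, which works out to 0 when no complete window exists, this sketch returns None in that case.

```python
from itertools import accumulate


def best_segment(times, leg):
    """Smallest total over `leg` consecutive entries, skipping windows with a zero."""
    # Prefix sums with a leading 0, so window (start, end] = prefix[end] - prefix[start].
    running_sum = [0] + list(accumulate(times))
    running_count = [0] + list(accumulate(1 if t > 0 else 0 for t in times))

    best = None
    for end in range(leg, len(times) + 1):
        start = end - leg
        # A window is complete only if every one of its legs is non-zero.
        if running_count[end] - running_count[start] != leg:
            continue
        total = running_sum[end] - running_sum[start]
        if best is None or total < best:
            best = total
    return best


# The six kilometre splits from the question, converted to seconds.
splits = [340, 360, 345, 355, 381, 390]
print(best_segment(splits, 2))  # best 2 km
print(best_segment(splits, 3))  # best 3 km
```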
Tom, that worked! I learnt a few things on the way too, and using the indexing method alongside SEQUENCE and COLUMNS is something I had not thought of. I'd never heard of the LET function before, and I can already see that it is going to really help with some of the bigger calculations in the future.
Thank you so much. I'd like to show you how it now looks. Row 3087 is my old formula and row 3088 is a copy of the same data using the new formula. As you can see, I get exactly the same results, so it clearly works and can be easily duplicated.
Referring to the snippet of a pivot table below in the image, there are 6,000 J####### models (i.e. J2253993, J2254008, J2254014 ... etc).
How can the difference between the last Odometer reading and the first Odometer reading for each model be calculated? There is no consistency in the number of recorded months for each model and there is no consistency between the first and last timestamps for each model.
i.e.
For model J2253993:
Desired answer is: 378
Because 2501 minus 2123
For model J2254008:
Desired answer is: 178
Because 1231 minus 1053
... And so on for the remaining 6,000 models
Would a dynamic array be needed?
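To pin down the calculation being asked for, here is a minimal Python sketch of "last reading minus first reading per model". It assumes each record is a (model, timestamp, reading) tuple whose timestamps sort chronologically; the function name `odometer_change`, the month labels and the intermediate reading are hypothetical, and only the first and last readings come from the examples above.

```python
def odometer_change(records):
    """Return {model: last reading - first reading}, ordered by timestamp."""
    first_last = {}
    # Sort by model, then timestamp, so each model's readings arrive in time order.
    for model, _timestamp, reading in sorted(records, key=lambda r: (r[0], r[1])):
        if model not in first_last:
            first_last[model] = [reading, reading]  # first and (so far) last
        else:
            first_last[model][1] = reading  # keep overwriting the last reading
    return {model: last - first for model, (first, last) in first_last.items()}


# Hypothetical rows; the number of months differs per model, as in the question.
records = [
    ("J2253993", "2021-01", 2123),
    ("J2253993", "2021-03", 2501),
    ("J2253993", "2021-02", 2300),
    ("J2254008", "2021-05", 1053),
    ("J2254008", "2021-09", 1231),
]
changes = odometer_change(records)
print(changes)
```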
Messy SUM/INDIRECT Solution
EDIT: A similar formula for Max-Min in column B (my first idea):
=INDEX(INDIRECT("B"&MATCH(E4,A$1:A$50000,0)+1&":B50000"),MATCH("",INDIRECT("B"&MATCH(E4,A$1:A$50000,0)+1&":B50000"),0)-1)-INDEX(B$1:B$50000,MATCH(E4,A$1:A$50000,0)+1)
I abandoned it because the image wasn't showing any empty cells.
EDIT-END
The formula calculates the sums of column C. A drawback is that you have to insert ="" in all the empty cells of column C, unless you know a way to make the MATCH function find a truly empty cell. In column E, write the IDs starting from the 4th row, and in F4 write the formula:
=SUM(INDIRECT("C"&MATCH(E4,A$1:A$50000,0)+2&":C"&MATCH("",INDIRECT("C"&MATCH(E4,A$1:A$50000,0)+2&":C44"),0)-1+MATCH(E4,A$1:A$50000,0)+2))
Copy/Paste down.
If I am understanding you correctly, it looks like you just need to add a sum of the "Odometer Reading Change" column in your pivot table. When I sum them for J2253993 I get 378 like you say.
Pivot table will total all of the rows by model based on the way you have built it, no matter how many rows are there.
I am using a nested IF statement within a QUARTILE wrapper, and it only kind of works: the values it returns are slightly off from what I get when I calculate the quartile of the same range manually.
I've looked around, but most of the posts and research are about designing the formula. I haven't come across anything that explains the odd behaviour I'm observing.
My formula (Ctrl+Shift+Enter, as it's an array formula): =QUARTILE(IF(((F2:$F$10=$W$4)*($Q$2:$Q$10=$W$3))*($E$2:$E$10=W$2),IF($O$2:$O$10<>"",$O$2:$O$10)),1)
The full dataset:
0.868997877*
0.99480118
0.867040346*
0.914032128*
0.988150438
0.981207615*
0.986629288
0.984750004*
0.988983643*
*The formula has 3 AND conditions that need to be met and should return range:
0.868997877
0.867040346
0.914032128
0.981207615
0.984750004
0.988983643
At which 25% is calculated based on the range.
If I take the output from the formula, 25%-ile (QUARTILE,1) is 0.8803, but if I calculate it manually based on the data points right above, it comes out to 0.8685 and I can't see why.
I suspect the IF statements are picking up a slightly different range, or the values that meet the conditions come from different rows, or something similar.
If you look at the table here you can see that there is more than one way of estimating a quartile (or other percentile) from a sample, and Excel has two. The one you are doing by hand must be like QUARTILE.EXC, and the one you are using in the formula is like QUARTILE.INC.
Basically both formulas work out the rank of the quartile value. If it isn't an integer it interpolates (e.g. if it was 1.5, that means the quartile lies half way between the first and second numbers in ascending order). You might think that there wouldn't be much difference, but for small samples there is a massive difference:
QUARTILE.EXC: rank of the first quartile = (N+1)/4
QUARTILE.INC: rank of the first quartile = (N+3)/4
Here's how it would look with your data
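As a check on the two rank formulas, here is a small Python sketch that applies them to the six qualifying values from the question. The helper name `value_at_rank` is my own; it simply interpolates between neighbouring sorted values at a fractional 1-based rank, which is the behaviour described above rather than a call into Excel itself.

```python
def value_at_rank(ordered, rank):
    """Interpolate linearly at a 1-based, possibly fractional rank."""
    if not 1 <= rank <= len(ordered):
        raise ValueError("rank falls outside the sample")
    lower = int(rank)          # 1-based position of the lower neighbour
    fraction = rank - lower    # how far to move towards the next value
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


# The six values that satisfy all three conditions, sorted ascending.
data = sorted([
    0.868997877, 0.867040346, 0.914032128,
    0.981207615, 0.984750004, 0.988983643,
])
n = len(data)

q1_inc = value_at_rank(data, (n + 3) / 4)  # rank 2.25, as QUARTILE.INC
q1_exc = value_at_rank(data, (n + 1) / 4)  # rank 1.75, as QUARTILE.EXC

print(round(q1_inc, 4))  # 0.8803, the value the formula returns
print(round(q1_exc, 4))  # 0.8685, the value calculated by hand
```

So both numbers in the question are reproduced from the same six data points; only the rank convention differs.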
I have a pivot table
Year and Week are Rows. Orders and Return Loaded are calculations based on sum(OrderSum) and sum(Reutilized) respectively
What I need is a third calculation that divides Return Loaded by Orders, to show what percentage of the total orders Reutilized represents. In the first case it would be 14.04% (8 is about 14% of 57).
That type of calculation is not possible within the pivot itself. There are two general workarounds.
You can set up the calculation in the underlying data in a way that lets you pull the value through to the pivot. I would need to know how that data is laid out to explain further.
You can create an additional table where you read the values from your pivot using GETPIVOTDATA() formulas and then do your calculation there. You would manually add new rows here for new weeks.
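Whichever workaround is used, the point to get right is that the percentage is the ratio of the two weekly sums, not a sum or average of per-row ratios. A minimal Python sketch of that arithmetic, assuming source rows of (year, week, OrderSum, Reutilized); the function name `reutilized_share` and the individual rows are hypothetical, chosen so that the first week totals 57 orders and 8 reutilized as in the question.

```python
from collections import defaultdict


def reutilized_share(rows):
    """Return {(year, week): sum(Reutilized) / sum(OrderSum)}."""
    totals = defaultdict(lambda: [0, 0])  # [orders, reutilized] per week
    for year, week, order_sum, reutilized in rows:
        totals[(year, week)][0] += order_sum
        totals[(year, week)][1] += reutilized
    # Divide the sums; weeks with no orders are skipped to avoid dividing by zero.
    return {
        key: returned / orders
        for key, (orders, returned) in totals.items()
        if orders
    }


# Hypothetical source rows.
rows = [
    (2020, 1, 30, 5),
    (2020, 1, 27, 3),
    (2020, 2, 40, 10),
]
shares = reutilized_share(rows)
for key, share in shares.items():
    print(key, f"{share:.2%}")
```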
I have a matrix in SQL reporting and I would like it to print on an A4 page. If the matrix has less than 4 columns then it fits but for more than 4 columns I would like the matrix to wrap and show only 4 columns per page. Is this possible? I am using SQL Reporting 2005 in localmode.
I found a work around:
First I added a field to my datasource called column count. Because the datasource is built in a business object, it was easy for me to tell how many columns of data there are.
Next I created a list on my report and moved my matrix into the list.
I made the group expression =Ceiling(Fields!ColumnCount.Value/4) for the list.
In short I am telling the list to break every 4 columns. This causes the matrix to be split after 4 columns.
This will not work in all scenarios and probably screws up subtotalling but it worked for my application.
Disclaimer: this was not my idea...I adapted it from Chris Hays's Sleezy Hacks.
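To see what the group expression does, here is a small Python sketch of the same arithmetic. It assumes the column count field holds a running column index (1, 2, 3, ...) for each row of data, which is my reading of the description; the function name `page_groups` is hypothetical.

```python
import math


def page_groups(column_count, per_page=4):
    """Group key produced by Ceiling(index / per_page) for each column index."""
    return [math.ceil(index / per_page) for index in range(1, column_count + 1)]


# Ten columns end up in three groups: four, four and two columns.
print(page_groups(10))  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
```

Each distinct key becomes one instance of the list, and therefore one copy of the matrix holding at most four columns.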
There is no way to intrinsically wrap columns. MBoy's solution above is very similar to what I have done in the past, so I won't repeat his steps here, although I will warn you: for matrices with a large number of columns, the number of pages in your report will grow considerably. In your case this may not be a problem, but we have found that in most cases it is cheaper (in terms of page output) not to wrap columns.
Further to MBoy's answer, I wanted to show multiple charts on one page, but the number of charts would vary depending on the data. What I wanted was to show two charts on a row with as many rows as necessary. I did as follows:
As suggested by MBoy, I created a 'Count' field called [ChartNumber] in the data that increases by one for each chart (so if I had 7 charts, rows would be numbered 1-7).
To achieve this I used the DENSE_RANK() SQL function to create a field in my query, such as DENSE_RANK() OVER (ORDER BY [Data].[ItemtoCount]) AS [ChartNumber].
So if I wanted a different chart for each department I might use DENSE_RANK() OVER (ORDER BY [Data].[Department]) AS [ChartNumber]
I added a list to the form and bound it to my dataset
I then set the row group to group on =Ceiling(Fields!ChartNumber.Value/2)
I then added a column group on =Ceiling(Fields!ChartNumber.Value Mod 2)
Create a chart inside the list and preview, and you should see two charts side-by-side on each row.
I used charts, but you could easily put a matrix or any other item inside the list.
Edit: A more general solution for n columns is =Ceiling(Fields!ChartNumber.Value / n) for the row group and =Ceiling(Fields!ChartNumber.Value Mod n) for the column group
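The grid placement can be checked with a few lines of Python. This is a sketch of the arithmetic only, using Ceiling(ChartNumber / n) as the row key and ChartNumber Mod n as the column key, as in the two-column case; the function name `grid_position` is hypothetical.

```python
import math


def grid_position(chart_number, n):
    """Row and column group keys for a chart when laying out n charts per row."""
    row = math.ceil(chart_number / n)
    column = chart_number % n  # 0 marks the last chart in each row
    return row, column


# Seven charts, two per row: four rows, the last one holding a single chart.
positions = [grid_position(k, 2) for k in range(1, 8)]
print(positions)
# [(1, 1), (1, 0), (2, 1), (2, 0), (3, 1), (3, 0), (4, 1)]
```

Note that the column key for the last chart in each row is 0, so the order in which the column groups are displayed depends on how the group is sorted.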
I don't think so. I've found that exporting to Excel and then printing is the most flexible way of printing SSRS matrix reports, especially since most of my users know Excel well.
According to MSDN, Tablix data regions do pagination horizontally in much the same way a table does it vertically, which is to say you can specify a page break on a group change. There is another MSDN article that suggests the use of a pagination expression, but this technique is already explained by MBoy so I won't repeat it, except to say that it is an endorsed technique.