Excel: Need to Generate IDs based on multiple criteria with repeating IDs - arrays

Looking to create pricing groups based on multiple criteria. Each group could have multiple items within it. I'm struggling with automatically creating the name of each group. I estimate there should be about 6.5K pricing groups out of 14K items.
Below are the criteria:
QTY per case - is the number of bottles in a case
Size - size of the bottle
Family Brand - contains a group of like items
Code - CS1 - This is my unique code for each group that contains each of the above and lowest possible case price.
The "Thinking" column is how I want each group to look, but how do I do this with 14K items quickly?

If I understood correctly, your pricing group name consists of two parts: a simple combination of columns and a "special" column that should be counted.
Part 1 is simple: =C2&"-"&B2&"-"&A2&"-"
To make Part 2 easier, sort the data by Part 1, then by CODE-CS1.
After that you can use helper columns. If Part 1 is in column X and CODE-CS1 is in column Y, a numeric counter in column Z gives
Part 2 (column Z): =IF(X1=X2;IF(Y1=Y2;Z1;Z1+1);1)
That means: if Part 1 changes, the counter restarts at 1; if Part 1 stays the same but CODE-CS1 changes, it counts up; otherwise it keeps the last number. The counter is kept numeric so that Z1+1 works; the "T" prefix is added when the code is assembled.
The resulting code would be =X2&"T"&Z2
This is untested and I use German Excel (which separates arguments with semicolons; use commas in an English locale), so the formulas may need some adaptation, but in general it should work.
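For anyone who wants to sanity-check the sort-and-count idea outside Excel, here is a minimal Python sketch of the same logic. The field names (`family`, `size`, `qty`, `code`) and the sample values are assumptions made for illustration, not the asker's actual column names or data.

```python
from itertools import groupby


def part1(item):
    # Same idea as the Part 1 formula: family brand, size and qty joined by "-".
    return f"{item['family']}-{item['size']}-{item['qty']}-"


def assign_group_ids(items):
    """Return copies of the items with a 'group' key such as 'Acme-750ml-12-T1'.

    Items are sorted by Part 1 and then by code, mirroring the sort step.
    The counter restarts for every new Part 1 and increases whenever the
    code changes within the same Part 1.
    """
    ordered = sorted(items, key=lambda it: (part1(it), it["code"]))
    result = []
    for prefix, rows in groupby(ordered, key=part1):
        counter = 0
        last_code = object()  # sentinel that never equals a real code
        for row in rows:
            if row["code"] != last_code:
                counter += 1
                last_code = row["code"]
            result.append({**row, "group": f"{prefix}T{counter}"})
    return result


# Hypothetical sample data
items = [
    {"family": "Acme", "size": "750ml", "qty": 12, "code": "P1"},
    {"family": "Acme", "size": "750ml", "qty": 12, "code": "P2"},
    {"family": "Acme", "size": "750ml", "qty": 12, "code": "P1"},
    {"family": "Brio", "size": "1L", "qty": 6, "code": "P9"},
]
groups = [row["group"] for row in assign_group_ids(items)]
```

Items sharing the same Part 1 and the same code end up with the same group name, which is the repeating-ID behaviour asked for.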

Related

Using multiple arrays in order to find the lowest "Ranking" in a character-defined ranking system

Wordy title, but I was interviewed on this question, couldn't derive the answer, and really would love to better understand array usage in Excel.
Question:
You have two arrays. One depicts the credit ratings given by various companies (i.e. Company 1, Company 2...) and then an array showing the rankings of all credit ratings. Your goal is to use a formula to find the Lowest Rating by "CUSIP" and which company gave this rating. NOTE: A rating of NR = "NOT RATED" and should be excluded.
Arrays (Red & Black Array is the desired outcome of the formula)
Figured out my own answer... Thought I would document it here for you. One of the issues I had in the interview was to do with me setting up the "Ranking" table as a vertical table rather than a longwise table. This did not allow the max-matching array function to properly work.
I had to rearrange the ranking table to be elongated: | AAA | AA | A |
Lowest Rating Formula:
=INDEX((Ranking Table Array),
2,
MAX(IF((Credit Ratings Array)<>"NR",MATCH((CUSIP row in the table),
(Credit Ratings Array),0))))
I found that this accurately returns the lowest rating across all of the companies (a lower rating sits further along the ranking table, so it has a higher position number than a better rating).
The company formula:
=INDEX(Table1[[#Headers],[Company 1]:[Company 3]],1,MATCH(C11,B4:D4,0))
Piggybacking off the other formula, this formula matches the lowest credit rating found by the previous formula against the ratings each company gave in that row.
Issues with answers:
They require the answer table to be laid out in the same format as the data being read. Because the formulas assume the CUSIPs are assessed in ascending order, each formula moves one row down as it is copied down. This is not scalable (even though this is an interview question).
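As a cross-check of the logic (not of the Excel syntax), here is a small Python sketch of "find the worst rating that is not NR, and who gave it". The rating scale and the company names in the example are assumptions, since the original tables were only shown as an image.

```python
# Hypothetical ranking table, ordered from best to worst rating.
RANKING = ["AAA", "AA", "A", "BBB", "BB", "B"]


def lowest_rating(ratings_by_company, ranking=RANKING):
    """Return (rating, company) for the worst rating, ignoring 'NR'.

    The position in the ranking table plays the role of MATCH, and taking
    the maximum position plays the role of MAX in the array formula.
    Returns None if every company reported 'NR'.
    """
    rated = {c: r for c, r in ratings_by_company.items() if r != "NR"}
    if not rated:
        return None
    company = max(rated, key=lambda c: ranking.index(rated[c]))
    return rated[company], company


example = {"Company 1": "AA", "Company 2": "NR", "Company 3": "BBB"}
result = lowest_rating(example)
```

Because the lookup is keyed by company rather than by row position, this version does not depend on the answer table having the same layout as the data.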

How to multiply values within a nested array...times values in an another array (in Google Sheets)?

This is hard to explain so my title sucks, and is just my best guess at how I might be able to approach this. I have a Google Sheet of sales data for cases of various bottle sizes of kombucha. Column E is the sale date, Column G contains the item code, and column J is the quantity sold of said cases. See my (vastly simplified) sample data:
https://docs.google.com/spreadsheets/d/17-LzGrNJtBr-FwOZtdaoCws3ayeGOHu_TdtGOfXj4cA/edit?usp=sharing
See my current test code below (also present in the Formula tab of the linked spreadsheet). It successfully gives me the combined number of cases sold of half-liter bottles and Growlers. The values in E4 and E5 are cells containing my start and end dates, respectively, so I'm constraining the results only to those which fall within a certain date range.
This code works, but now I need to figure out a way to sum the total number of bottles sold instead of the number of cases. The data set is already massive and pushing the limits of Google Sheets, so adding a column to the source data sheet with the number of bottles per case is not an option. Half liter cases hold 13 bottles, and growlers hold 5. Is there any way to do this with my current approach, using another array perhaps? Or any other approach that keeps the formula as simple as possible?
FYI the current formula is a proof of concept and I will be adding many additional types of cases to the existing formula, each containing a different number of bottles per case, and using it as part of a larger dynamic formula that allows you to switch between showing # cases vs # bottles vs # of actual liters sold, so this is why I am hoping to find an array-based approach that will let me do this without needing to resort to an absurdly long and complex formula of nested IF statements.
=SUMPRODUCT(--((XeroInvoiceData!$E$3:$E>=B4)*(XeroInvoiceData!$E$3:$E<=B5)), (--(ISNUMBER(MATCH(XeroInvoiceData!$G$3:$G, {"HalfLiterCase","GrowlerCase"}, 0)))), XeroInvoiceData!$J$3:$J)
I would be eternally grateful for any assistance.
Here is my solution:
https://docs.google.com/spreadsheets/d/1ig0krumJu4Lj9-nIKJyRfPLTYbU-mzOL0JokRUDEqNc/edit?usp=sharing
My idea was to filter your table on date and sum by the type of container.
I wanted also to allow new types of containers that contain smaller units (bottles or liters).
I divided this job into 3 stages.
First we have to filter this table according to selected dates and container types.
I prepared a list that may be extended (all you need is to extend the filter range).
Then I have to vlookup values of units in each container and I try to do it inside the same formula.
General idea is
={[query results],arrayformula(ifna(vlookup([first column of query],$C$21:$D$26,2,0)*[second column of query]))}
I divide it into 2 stages.
First stage refers to query results in adjacent table:
Second stage uses indexes of query so formula is quite long:
Tell me if it solves your problem.
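To make the "filter, look up units per container, multiply, sum" idea concrete, here is a minimal Python sketch. The bottles-per-case figures (13 and 5) come from the question; the row layout, function name, and sample dates are assumptions for illustration.

```python
from datetime import date

# Units per container, taken from the question; extend with more case types.
BOTTLES_PER_CASE = {"HalfLiterCase": 13, "GrowlerCase": 5}


def bottles_sold(rows, start, end, units=BOTTLES_PER_CASE):
    """Sum quantity * units-per-case for rows inside the date range.

    rows: iterable of (sale_date, item_code, cases_sold) tuples.
    Items missing from the lookup table are skipped, like the MATCH filter.
    """
    return sum(
        qty * units[item]
        for sale_date, item, qty in rows
        if start <= sale_date <= end and item in units
    )


# Hypothetical sample data
rows = [
    (date(2020, 1, 5), "HalfLiterCase", 2),
    (date(2020, 1, 6), "GrowlerCase", 3),
    (date(2020, 2, 1), "HalfLiterCase", 10),  # outside the date range
    (date(2020, 1, 7), "OtherCase", 4),       # not in the lookup table
]
total = bottles_sold(rows, date(2020, 1, 1), date(2020, 1, 31))
```

Swapping the lookup table for one holding liters per case would give liters sold with the same function, which is the kind of switch the question describes.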

How many trucks came empty but bought something

I think this is the hardest to date I have had to crack - so hard I had a hard time finding a good headline.
So we have a site where trucks come and buy say Gravel, or sand or other building materials.
Sometimes they also unload demolition waste first.
I need to find out a couple of things:
how many trucks (and from what companies) came empty
if they came empty, what did they buy from us
what companies are sending full trucks and what companies are sending empty trucks
a top 10 of the materials they will drive to us to buy even when coming empty to our facility
a list of all the order numbers that they drove to us to fill when they came with empty trucks (I have distances linked to order numbers, so I can then estimate the value of our products)
The data I have available:
I have a full data set of when what customer buys what and / or pay to deliver.
E.G.:
I can see the parts I need to split the data into. I think it should be something like this:
find all unique licence plates
somehow map if they bought materials within 30 minutes of offloading demolition waste (most trucks will come between 2 and 10 times per day)
Present all this data (on a normal day we have about 800 trucks = 2000 lines since they weigh in, weigh out, and then some buy something = 2 more weigh lines)
I can easily find unique licence plates per day (either by formula or with Excel's Data > Remove Duplicates),
but after that I have no clue where to start.
I think I need some sheets in between, where I somehow mark if a material was bought from an "empty truck" and I need a counter for that .. somehow...
Any help on how to get started is appreciated.
It seems like the best way to start is with a helper column (in the following examples, I have chosen "Column M") to flag whether the truck arrived empty.
In the helper column, you can use something similar to the following formula.
{=IF(ISBLANK(B2),0,IF(C2="In",0,IF(B2=$B$2:$B$13,IF($C$2:$C$13="In",IF($A$2:$A$13>(A2-TIME(0,30,0)),0,1),1),1)))}
This is an array formula, which means you have to press ctrl+shift+enter after pasting it in the cell. Then you can copy that cell down the column.
Just to explain, after an initial ISBLANK check that skips rows without a licence plate, the first if statement knows the truck is not arriving empty if Column C is 'In'. The second if statement creates an array and tests to see if the same truck appears in other rows. The third if statement checks to see if the same truck checked 'In' in the matching rows, and the fourth if statement verifies if the time they checked in was less than thirty minutes ago. You can adjust the length by editing the TIME(0,30,0) function. The format is TIME(hours,minutes,seconds). Unless the truck matches all three of the second, third and fourth if statements, it is marked as coming empty.
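The flagging rule can also be sketched in Python, which may help when checking the helper column against a few known rows. The tuple layout and sample plates are assumptions; unlike the formula, this sketch also requires the 'In' row to be no later than the current row, which is an added assumption.

```python
from datetime import datetime, timedelta


def came_empty_flags(rows, window=timedelta(minutes=30)):
    """Return a 0/1 flag per row; 1 means the truck is treated as arriving empty.

    rows: list of (time, plate, direction) tuples, direction being "In" for
    an unloading row. A row is flagged 0 when the plate is blank, when the
    row itself is "In", or when the same plate has an "In" row within the
    preceding window.
    """
    flags = []
    for time, plate, direction in rows:
        if not plate or direction == "In":
            flags.append(0)
            continue
        recent_in = any(
            p == plate and d == "In" and time - window < t <= time
            for t, p, d in rows
        )
        flags.append(0 if recent_in else 1)
    return flags


# Hypothetical sample data
base = datetime(2020, 1, 1, 8, 0)
rows = [
    (base, "AB123", "In"),
    (base + timedelta(minutes=15), "AB123", "Out"),   # unloaded 15 min earlier
    (base + timedelta(minutes=20), "CD456", "Out"),   # never unloaded
    (base + timedelta(minutes=120), "AB123", "Out"),  # unloading was 2 h ago
]
flags = came_empty_flags(rows)
```

Changing `window` corresponds to editing TIME(0,30,0) in the formula.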
Once you have this helper column, just about all of your tasks are quite simple.
1a: How many trucks came empty? Sum Column M
1b: How many trucks from what company? Create a unique list of companies. Then create a COUNTIFS formula based on Column M = 1 and Column K = Company. For example, if C32 had Company B then the formula =COUNTIFS($M$2:$M$13,1,$K$2:$K$13,C32) would return 2
1c: How many times did a truck come empty? Similar to 1b, create a unique list of License Plates, then use a COUNTIFS based on Column M = 1 and Column B = License Plate.
2: Similar to 1b, just use a unique list of products tested against Column F
3: Similar to 1b, just create a second column, next to the first that uses =COUNTIFS($M$2:$M$13,0,$K$2:$K$13,C53,$C$2:$C$13,"In") Which tests that Column M reports the truck did not come empty, that matches the company in Column K and that the truck came 'In' so you don't double count the same truck when it goes 'out'
4: Just sort list created by number 2. You can highlight the range, right-click and select "Sort" > "Custom Sort", then select the column you want to sort on and largest to smallest.
5: There are a couple of different ways you could do this. The formula
{=TEXTJOIN(", ",TRUE,IF($M$2:$M$13=1,$J$2:$J$13,""))}
(again, entered as an array formula)
would create a comma-separated list of order numbers. An alternative, if you want a column of order numbers (this only works if they are actually numbers), is to paste the formula {=MAX(IF($M$2:$M$13=1,$J$2:$J$13,))} in the first row of the column (in my example, it's O2) and then {=MAX(IF($M$2:$M$13=1,IF($J$2:$J$13<O2,$J$2:$J$13,)))} in the row below (change the reference to O2 if you pasted it in a different spot). Again, note that both of these are array formulas. Then copy and paste the second formula down the column. When the order numbers of trucks that came in empty are exhausted, the formula will report 0.
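Once the flag exists, the counting and listing steps are simple aggregations. The following Python sketch mirrors the COUNTIFS per company and the two ways of listing order numbers; the row layout and sample values are assumptions for illustration.

```python
from collections import Counter


def empty_arrivals_by_company(rows):
    """Count flagged rows per company, like COUNTIFS on the flag and company.

    rows: iterable of (company, order_number, flag) tuples.
    """
    return Counter(company for company, _order, flag in rows if flag == 1)


def empty_order_numbers(rows):
    """Distinct order numbers of flagged rows, largest first.

    This matches the repeated MAX(IF(...)) column, which walks down from the
    largest order number to the smallest.
    """
    return sorted({order for _c, order, flag in rows if flag == 1}, reverse=True)


# Hypothetical sample data
rows = [
    ("Company A", 101, 0),
    ("Company B", 102, 1),
    ("Company B", 105, 1),
    ("Company C", 103, 0),
]
per_company = empty_arrivals_by_company(rows)
orders = empty_order_numbers(rows)
joined = ", ".join(str(o) for o in orders)  # the TEXTJOIN-style variant
```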

SSRS - Display only Top N Category Groups excluding duplicate groups at the end in Bar Chart

Consider the following table
I need to generate a bar chart with the Category Group = "Country". The chart should only display the top 3 groups based on the count of records for a country. I have already applied a filter for the Category Group, specifying the Top N condition as 3 for Count(Country). The chart applies the filter as expected based on count, but I need only 3 bars to be shown even if there are bars with duplicate values.
Below is the chart that I get.
Expected Result
Now I know that I can create an additional column in my dataset with ranking values and then apply a filter on this column to get the expected result (I have tried this, and it works).
Is there a way to achieve the expected result without changing the underlying Dataset?
Note: The dataset shown above is a highly simplified version of my dataset. In reality I have a huge dataset with a lot of columns. The same dataset has been used for various charts (with groupings on different columns).
This was an interesting question as I've always just "solved" the tiebreaker in the dataset without much thought. However, I do see a fairly easy way to use the rnd() function to dissolve the ties as long as you don't care which of the tied countries is shown:
=(Count(Fields!Country.Value) * 1000) + (Rnd() * 100)
Which essentially just weights the count per country into the thousands and then tiebreaks with a random small value:
New York: 30XX
France: 20XX
China: 10XX
Italy: 10XX
Singapore: 10XX
If you wanted to actually solve the tiebreaker with an alphabetical preference, you could do something similar but incorporate the numeric value for the first letter of the country etc...
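To see why the weighting keeps the count order intact while still breaking ties, here is a small Python sketch of the same scoring idea. It is only an illustration of the arithmetic, not SSRS code, and the sample list is made up to match the counts shown above.

```python
import random
from collections import Counter


def top_n_with_tiebreak(values, n=3, rng=random):
    """Return the n most frequent values, breaking ties randomly.

    Each count is scaled by 1000 and a random amount below 100 is added,
    so a higher count always wins and equal counts are ordered at random.
    """
    counts = Counter(values)
    score = {v: k * 1000 + rng.random() * 100 for v, k in counts.items()}
    return sorted(score, key=score.get, reverse=True)[:n]


# Hypothetical sample data matching the counts listed above
data = ["New York"] * 3 + ["France"] * 2 + ["China", "Italy", "Singapore"]
top = top_n_with_tiebreak(data)
```

The third entry is whichever of the three tied values happens to draw the largest random offset, so it can differ between runs.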
I just found out I could use an x-axis minimum of 0 and a maximum of 10.5 together with an interval setting of 1.
So, I was able to achieve a top 10 limitation and the axis labels show the names - (it may be a side effect but when I changed the axis maximum to a whole number, the axis no longer shows the names but numbers).
I was not really happy with the other approaches. They seem like overkill to me for such a simple requirement as limiting the number of bars shown in a chart.

Return the smallest value from a list, in which only certain values are eligible - excel

I am having some troubles formulating my problem but I hope you understand!
I have a table of firms building production plants in foreign countries in certain years (columns A to C).
In a separate table I have so-called cross-national distance measures (based on the difference in GDP of the countries) in columns G to M. Note that the distances change per year.
A simplified version of the Excel sheet would look like this:
https://new.wu.ac.at/fileadmin/wu/d/i/iib/photo/stack.JPG
What I want is a formula for the manually entered results in column D. It shall give me a result which is the following:
It shall look in which countries the specific company has previously (years before) built plants
It shall find the smallest cross-national distance from the current country to any of the countries previously entered
The value should be for the year of the current plant-construction
Let me illustrate my request with the example result I would want in cell D8:
The formula would have to find the list of countries that were previously entered, in this case Turkey and Bulgaria
It would then have to look into the second table and give me the minimum of the distances from Kosovo, but only to Turkey and Bulgaria
This would have to be done in the rows for 2008 (the current year)
I really hope you guys can help me. I figured out a way to find a minimum in a list, and I can do it for certain years as well, but the issue I am having is that Excel first needs to find the previously entered countries, memorize them in some kind of array, and then consider only these countries when finding the minimum distance.
Thank you very much!
Try this "array formula" for D2 copied down
=IFERROR(SMALL(IF(COUNTIFS(A$2:A$11,A2,B$2:B$11,"<"&B2,C$2:C$11,"<>"&C2,C$2:C$11,I$1:M$1)*(G$2:G$31=B2)*(H$2:H$31=C2),I$2:M$31),1),"N/A")
confirmed with CTRL+SHIFT+ENTER
That checks three conditions for your larger table - that the header row matches a qualifying country (using COUNTIFS function based on criteria in the small table), that column G matches the current year and column H matches the current country.
If all those criteria are satisfied then the relevant values in the table are returned, and SMALL finds the smallest. If there's an error (because there are no qualifying values) then N/A is returned
In Excel 2010 or later versions you can use AGGREGATE function instead of SMALL - this is useful because it doesn't require "array entry"
=IFERROR(AGGREGATE(15,6,I$2:M$31/(COUNTIFS(A$2:A$11,A2,B$2:B$11,"<"&B2,C$2:C$11,"<>"&C2,C$2:C$11,I$1:M$1)>0)/(G$2:G$31=B2)/(H$2:H$31=C2),1),"N/A")
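The three conditions the formulas combine can be spelled out step by step in Python, which may help when verifying a few cells by hand. The data layout, function name, and all distance values below are hypothetical; only the Kosovo/Turkey/Bulgaria/2008 scenario comes from the question.

```python
def min_distance_to_previous(plants, distances, firm, year, country):
    """Smallest distance from `country` to countries the firm entered earlier.

    plants: list of (firm, year, country) tuples (the small table).
    distances: dict mapping (year, from_country, to_country) to a distance
    (the large table, flattened). Returns None when nothing qualifies,
    which corresponds to the "N/A" result of IFERROR.
    """
    previous = {
        c for f, y, c in plants
        if f == firm and y < year and c != country
    }
    candidates = [
        distances[(year, country, p)]
        for p in previous
        if (year, country, p) in distances
    ]
    return min(candidates) if candidates else None


# Hypothetical sample data
plants = [
    ("Firm X", 2005, "Turkey"),
    ("Firm X", 2006, "Bulgaria"),
    ("Firm X", 2008, "Kosovo"),
    ("Firm Y", 2007, "Serbia"),
]
distances = {
    (2008, "Kosovo", "Turkey"): 4.2,
    (2008, "Kosovo", "Bulgaria"): 1.5,
    (2008, "Kosovo", "Serbia"): 0.3,    # ignored: Firm X never entered Serbia
    (2007, "Kosovo", "Bulgaria"): 0.1,  # ignored: wrong year
}
d8 = min_distance_to_previous(plants, distances, "Firm X", 2008, "Kosovo")
```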

Resources