Matching Duplicate values with unique attributes in a horizontal excel spreadsheet - arrays

Hopefully someone has had my problem before. I'm in the process of building an Excel model that sorts the prices that a certain product was sold for and the sales associated with that price. One spreadsheet houses the data and another sorts that data by sales and then matches the price that it sold for.
The problem is that there are cases where the number of sales is the same but the prices are different. In these cases, the first price is duplicated wherever the number of sales matches. See below for a visual. I've looked tirelessly for a solution, but because the formula needs to work horizontally, I haven't found one.
This is the sales volume sorting formula:
=IFERROR(LARGE('2016 Data Tab '!$B3:$BY3,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76}),"")
This formula matches the price with the sales, and it is where I'm having the problem:
=IFERROR(INDEX(DataTableLanes16,$A3*$C$1,MATCH('2016 Input Lanes '!C3,'2016 Data Tab '!$A3:$BY3,0)),"")
See the pictures below:
This is where the data is housed:
This is where the data is sorted by sales:
Thanks in advance for your assistance.
James

It is certainly possible to just sort the data. Excel can sort left to right.
If you MUST use formulas, you need to calculate the rank of each sales number in a manner that accounts for duplicates.
You can add a "helper row" and name it SalesRank.
You can then use this formula in B4 (as in the screenshot):
B4: =RANK(B3,Sales,0)+COUNTIF($B$3:B3,B3)-1
and fill right
For the desired results, we first list the sales in descending order:
B9: =LARGE(Sales,COLUMNS($A:A))
and fill right
And then, for the associated prices:
B10: =INDEX(Price,1,MATCH(COLUMNS($A:A),SalesRank,0))
In the above formulas:
| Price | Refers to: | =Sheet3!$B$2:$K$2 |
| Sales | Refers to: | =Sheet3!$B$3:$K$3 |
| SalesRank | Refers to: | =Sheet3!$B$4:$K$4 |
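For anyone who wants to check the tie-breaking logic outside Excel, here is a minimal Python sketch of the same idea. The sample prices and sales are invented for illustration, and the function names `sales_rank` and `sorted_with_prices` are hypothetical, not part of the formulas above.

```python
def sales_rank(sales):
    """Mimic =RANK(B3,Sales,0)+COUNTIF($B$3:B3,B3)-1 for each position."""
    ranks = []
    for i, value in enumerate(sales):
        # RANK(value, Sales, 0): 1 + number of strictly larger values
        base = 1 + sum(1 for other in sales if other > value)
        # COUNTIF over the cells up to and including this one, minus 1
        earlier_ties = sales[: i + 1].count(value) - 1
        ranks.append(base + earlier_ties)
    return ranks


def sorted_with_prices(prices, sales):
    """Return (sales, price) pairs in descending sales order, one per column."""
    ranks = sales_rank(sales)
    result = []
    for k in range(1, len(sales) + 1):
        pos = ranks.index(k)  # plays the role of MATCH(k, SalesRank, 0)
        result.append((sales[pos], prices[pos]))
    return result


# Invented sample data: two products share 30 sales at different prices.
prices = [10.0, 12.5, 9.0, 11.0]
sales = [30, 45, 30, 20]

print(sales_rank(sales))  # [2, 1, 3, 4]
print(sorted_with_prices(prices, sales))
```

Because every rank is unique, each tied sales figure maps back to its own price instead of the first match.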


Tableau remove duplicates based on a condition

I am trying to remove duplicates from the Ticket field in my database, but I want to remove the duplicates that have older dates. For example:
Ticket | Date
MG17000 | 1/1/2017
MG17000 | 1/1/2018
MG17010 | 1/1/2018
So I want the result to be:
MG17000 | 1/1/2018
MG17010 | 1/1/2018
I used countd(Ticket), but it does not remove the right tickets (it removes the ticket that corresponds to 1/1/2018 instead of 1/1/2017). Any suggestions on how to perform this task?
Thanks!
Try this:
Create a calculated field [Rank - Date] with the code below:
RANK_UNIQUE((MAX(SPLIT([database field],'|',2))))
// This creates a rank value for every ticket
Now create one more calculated field that keeps only the row with the top rank, drag it to Filters, and select True:
[Rank - Date]=1
You should then be able to get the required data.
Use a level-of-detail (LOD) calculation. Create the calculation with this formula and it will give you the number of records per ticket, regardless of which dimensions you have on the Rows and Columns shelves.
{FIXED [ticket] : count([date])}
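Note that FIXED calculations are evaluated before ordinary dimension filters, so this count includes records outside any date filter range. If you want the count to respect the date filter, switch FIXED to INCLUDE or add the filter to context.
Drag that in as one of your measures. Then use max([date]) to show the most recent date.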
From the sample data you showed in the question, you will see something like
MG17000 1/1/2018 2
MG17010 1/1/2018 1
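To make the intended result concrete outside Tableau, here is a small Python sketch that computes the same two values per ticket: the most recent date and the number of records. It uses the sample rows from the question; the function name `latest_per_ticket` is just an illustrative choice.

```python
from datetime import date

# Sample rows from the question: (ticket, date)
rows = [
    ("MG17000", date(2017, 1, 1)),
    ("MG17000", date(2018, 1, 1)),
    ("MG17010", date(2018, 1, 1)),
]


def latest_per_ticket(rows):
    """Return {ticket: (latest_date, record_count)}."""
    summary = {}
    for ticket, day in rows:
        if ticket in summary:
            latest, count = summary[ticket]
            summary[ticket] = (max(latest, day), count + 1)
        else:
            summary[ticket] = (day, 1)
    return summary


for ticket, (latest, count) in sorted(latest_per_ticket(rows).items()):
    print(ticket, latest.isoformat(), count)
# MG17000 2018-01-01 2
# MG17010 2018-01-01 1
```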

SQL Server database design for evaluations

I'm designing this employee evaluation web page, and was wondering if my current database design is the correct one or if it could be improved.
This is my current design
Table Agenda:
+--------------+----------+----------+-----------+------+-------+-------+
| idEvaluation | Location | Employee | #Employee | Date | Date1 | Date2 |
+--------------+----------+----------+-----------+------+-------+-------+
Date is the date on which the evaluation is scheduled to be performed.
Date1 and Date2 define a period of time used to retrieve some metrics from another database.
Table Evaluations:
+--------------+---------+------------+------+----------+
| idEvaluation | Manager | Department | Date | Comments |
+--------------+---------+------------+------+----------+
Table Scores:
+--------------+----------+-------+
| idEvaluation | idFactor | Score |
+--------------+----------+-------+
idFactor relates to another table which contains the factor and a description of it. As I said, is this a correct design?
My concern is this: currently there are 60 employees, 11 managers and 12 factors, and each employee is evaluated twice a year by every manager. In the Agenda table there's not much trouble, since there is only one record per evaluation (60 employees = 60 records). However, in the Evaluations table there are 11 records for every evaluation, so it goes to 660 records (60 employees * 11 managers = 660), and the Scores table gets even bigger, since there are 12 factors for every evaluation: 7920 records (660 evaluations * 12 factors each = 7920).
Is this normal? Am I doing it wrong? Any input is appreciated.
EDIT
Location, Employee, #Employee, Manager and Department are loaded automatically by the vb.net page. They are "imported" from Active Directory and checked before insertion, so duplicate names, misspelled names, and that sort of thing are not an issue.
The main idea is that you don't want to repeat string literals.
So if you have
id Department
1 Sales
2 IT
3 Admin
Instead of repeating Sales many times, you store only the id 1, which is smaller, so things also get faster.
Second, if you have users
id user
1 Jhon Alexander
2 Maria Jhonson
If Jhon decides to change his name, you will have to check all tables and change the name. There is also the problem that if two people have the same name, you won't know which one you are evaluating.
So go for separate tables and use the ID.
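As a rough illustration of the lookup-table idea, here is a sketch using Python's built-in sqlite3 module. Note the assumptions: it uses SQLite rather than SQL Server, and the table and column names (`departments`, `users`, `evaluations`, `employee_id`, and so on) are hypothetical, not taken from the design in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE evaluations (
    id INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES users(id),
    manager_id INTEGER NOT NULL REFERENCES users(id),
    department_id INTEGER NOT NULL REFERENCES departments(id)
);
""")
conn.executemany(
    "INSERT INTO departments (id, name) VALUES (?, ?)",
    [(1, "Sales"), (2, "IT"), (3, "Admin")],
)
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "Jhon Alexander"), (2, "Maria Jhonson")],
)
# The evaluation row stores only ids, never the names themselves.
conn.execute(
    "INSERT INTO evaluations (employee_id, manager_id, department_id) VALUES (1, 2, 1)"
)

# A name change touches exactly one row in one table.
conn.execute("UPDATE users SET name = ? WHERE id = ?", ("John Alexander", 1))

row = conn.execute("""
SELECT e.name, m.name, d.name
FROM evaluations ev
JOIN users e ON e.id = ev.employee_id
JOIN users m ON m.id = ev.manager_id
JOIN departments d ON d.id = ev.department_id
""").fetchone()
print(row)  # ('John Alexander', 'Maria Jhonson', 'Sales')
```

Two people with the same name would simply get different ids, so the evaluation still points at the right person.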

Relevance and Solr Grouping

Say I have the following collection of webpages in a Solr index:
+-----+----------+----------------+--------------+
| ID | Domain | Path | Content |
+-----+----------+----------------+--------------+
| 1 | 1.com | /hello1.html | Hello dude |
| 2 | 1.com | /hello2.html | Hello man |
| 3 | 1.com | /hello3.html | Hello fella |
| 4 | 2.com | /hello1.html | Hello sir |
...
And I want a query for hello to show results grouped by domain like:
Results from 1.com:
/hello1.html
/hello2.html
/hello3.html
Results from 2.com:
/hello1.html
How is ordering determined if I sort by score? I use a combination of TF/IDF and PageRank for my results normally, but since that calculates scores for each individual item, how does it determine how to order the groups? What if 1.com/hello3.html and 1.com/hello2.html have very low relevance but two results while 2.com/hello1.html has really high relevance and only one result? Or vice versa? Or is relevance summed when there are multiple items in a grouping field?
I've looked around, but haven't been able to find a good answer to this.
Thanks.
It sounds to me like you are using Result Grouping. If that's the case, then the groups are sorted according to the sort parameter, and the records within each group are sorted according to the group.sort parameter. If you sort the groups by sort=score desc (this is the default, so you wouldn't actually need to specify it), then it sorts the groups according to the score of each group. How this score is determined isn't made very clear, but if you look through the examples in the linked documentation you can see this statement:
The groups are sorted by the score of the top document within each group.
So, in your example, if 2.com's hello1.html was the most relevant document in your result set, "Results from 2.com" would be your most relevant group even though "Results from 1.com" includes three times the document count.
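Here is a small Python sketch of that ordering rule, only to illustrate the behavior described in the quoted documentation; it is not how Solr implements it. The scores are invented, and `group_by_top_score` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented scores for the documents in the question.
docs = [
    {"domain": "1.com", "path": "/hello1.html", "score": 0.9},
    {"domain": "1.com", "path": "/hello2.html", "score": 0.2},
    {"domain": "1.com", "path": "/hello3.html", "score": 0.1},
    {"domain": "2.com", "path": "/hello1.html", "score": 1.5},
]


def group_by_top_score(docs):
    """Group by domain, then order groups by their best document's score."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["domain"]].append(doc)
    # Within each group, best document first.
    for members in groups.values():
        members.sort(key=lambda d: d["score"], reverse=True)
    # Groups are ordered by the score of their top document only.
    return sorted(groups.items(), key=lambda item: item[1][0]["score"], reverse=True)


ordered = group_by_top_score(docs)
print([domain for domain, _ in ordered])  # ['2.com', '1.com']
```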
If this isn't what you want, your best options are to provide a different sort parameter or result post-processing. For example, for one project I was involved in, (where we had a very modest number of groups,) we chose to pull the top three results for each group and in post processing we calculated our own sort order for the groups based on the combination of their scores and numFound values. This sort of strategy might have been prohibitive for cases with too many groups, and may not be a good idea if the more numerous groups run the risk of making the most relevant documents harder to find.
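A post-processing step along those lines might look like the sketch below. The exact way we combined scores and numFound is not given above, so the formula here (mean of the top scores, boosted by log10 of numFound) is purely an assumption for illustration, as are the sample numbers and the name `combined_score`.

```python
import math

# Hypothetical post-processing input: top three scores per group plus numFound.
groups = {
    "1.com": {"top_scores": [0.9, 0.8, 0.7], "num_found": 100},
    "2.com": {"top_scores": [1.5], "num_found": 1},
}


def combined_score(group):
    """Assumed formula: mean of the top scores, boosted by log10 of numFound."""
    mean_top = sum(group["top_scores"]) / len(group["top_scores"])
    return mean_top * (1 + math.log10(group["num_found"]))


reranked = sorted(groups, key=lambda name: combined_score(groups[name]), reverse=True)
print(reranked)  # ['1.com', '2.com']
```

With this particular weighting the larger group wins even though 2.com holds the single best document, which is exactly the trade-off to watch for.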

How to Store purchases of "N of the same Product" in an Orders table

So we have our basic tables for the categories, products, and variants of products:
categories
id | name | active | parent_id
products
id | name | price | active
c_p_link
category_id | product_id
variants
id | product_id | price | price_override | active | stock
Which works great.
But I have two queries.
The first being how to structure the orders.
We have an orders table
id | customer_id | ordered | status
And we also have an order_products table
id | order_id ..?
This is the one I am curious about.
Say a customer orders 30 of a product. Do we:
1. Add 30 rows, and add the price for each individual item on each row.
2. Add one row, and add the combined total onto the row.
3. Add one row, and add the individual price onto the row.
The next part is that later we are expecting to add voucher support to the cart, e.g. 10% off, buy two get one free, etc. I am not too fussed about the overall design of this right now (it is a couple of months off at least), but I am wondering whether it is going to affect which version of the order_products table I should choose.
Disclaimer: I have never written a database model dealing with "Shopping Carts" or "Orders".
I think the price at time of purchase should be encoded into the purchase data: just like a paper receipt from a store. Let's call this total_price which represents each itemized "line" on the receipt and should not be confused with total_purchase_price.
That is, the amount charged is fixed. It doesn't matter if the product price changes later; changes to prices should not be reflected in how much was [to be] paid.
Thus I would have these fields: product, unit_price, quantity, total_price. A computed column of say, base_total_price (unit_price * quantity) can be easily added if required.
Now, the total_price might be a computed value based on, say, base_total_price and a percent_discount field, but whatever it ends up being, I hold that total_price should exist and should be fixed at time of purchase. (This implies that, if it is a computed column, all inputs are also fixed at time of purchase.)
Addendum: As stated above, I've never designed a model like this before, but one thing I have observed at stores is discounts being applied as a negative-cost itemized entry. That is, items are bought "at full price" and then the register adds an entry to offset the cost per whatever promotion is occurring. I do not know the merits/reasoning of such an approach.
Simply add the quantity of the product to your order_products table :)
I prefer the 3rd solution; I think it's the best for the performance of your database.
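To make the "one row with unit price and quantity" option concrete, here is a minimal Python sketch of a single order line. The class name `OrderLine`, the `product_id` value, and the way the percentage discount is applied are assumptions for illustration; in a real table you would store these values at purchase time rather than recompute them from the current product price.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)  # frozen: values are fixed once the order is placed
class OrderLine:
    product_id: int
    unit_price: Decimal  # price at time of purchase, not the current price
    quantity: int
    percent_discount: Decimal = Decimal("0")

    @property
    def base_total_price(self) -> Decimal:
        return self.unit_price * self.quantity

    @property
    def total_price(self) -> Decimal:
        factor = (Decimal("100") - self.percent_discount) / Decimal("100")
        return (self.base_total_price * factor).quantize(Decimal("0.01"))


# 30 of the same product is one row, not 30 rows.
line = OrderLine(
    product_id=7,
    unit_price=Decimal("2.50"),
    quantity=30,
    percent_discount=Decimal("10"),
)
print(line.base_total_price)  # 75.00
print(line.total_price)       # 67.50
```

Keeping unit_price and quantity separate also leaves room for later vouchers, whether they end up as a discount field on the line or as a separate negative line.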

How to merge two Excel sheets

I have an Excel document with 10000 rows of data in two sheets. One of these sheets has the product costs, and the other has the category and other information. Both are imported automatically from SQL Server, so I don't want to move them to Access, but I still want to link the product codes so that when I merge the product tables, with product name and cost in the same table, I can be sure that I'm getting the right information.
For example:
Code | name | category
------------------------------
1 | mouse | OEM
4 | keyboard | OEM
2 | monitor | screen
Code | cost |
------------------------------
1 | 123 |
4 | 1234 |
2 | 1232 |
7 | 587 |
Let's say my two sheets have tables like these. As you can see, the second one has a code that doesn't exist in the first. I put it there because in reality one sheet has a few more entries, preventing a perfect match. Therefore I couldn't just sort both tables A-Z and get the costs that way. As I said, there are more than 10000 products in that database, and I wouldn't want to risk a slight shift of costs, caused by those extra entries in the other table, that would ruin the whole table.
So what would be a good solution to get the entry from another sheet and insert it into the right row when merging? Linking two tables by field name? Checking a field and trying to match it with the other sheet? Anything at all.
Note: When I use Access I would make relationships and when I would run a query it would match them automatically... I was wondering if there's a way to do that in excel too.
Why not use VLOOKUP? If there is a match, it will list the cost. Assuming the top table is on Sheet1 and the other on Sheet2, and both start in cell A1, you just need this in cell D2:
=VLOOKUP(A2,Sheet2!A:B,2,0)
You can then drag it down. The easiest way to fill all 10000 rows is to hover over the bottom-right corner of the cell with your cursor. It will turn from a white plus sign into a thin black one. Then simply double-click.
Just use VLOOKUP: you can add a column to your first sheet and find the cost based on the code in the other sheet.
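If it helps to see what the lookup does, here is a rough Python equivalent of an exact-match VLOOKUP using the sample tables from the question. One difference to keep in mind: a missing code gives `None` here, whereas VLOOKUP shows #N/A. The variable names are illustrative only.

```python
# Sheet1: (code, name, category)
products = [
    (1, "mouse", "OEM"),
    (4, "keyboard", "OEM"),
    (2, "monitor", "screen"),
]
# Sheet2: (code, cost), including an extra code 7 with no matching product
costs = [(1, 123), (4, 1234), (2, 1232), (7, 587)]

# Build the lookup once, keyed on the product code.
cost_by_code = dict(costs)

# For each product row, fetch the cost by code, independent of row order.
merged = [
    (code, name, category, cost_by_code.get(code))
    for code, name, category in products
]

for row in merged:
    print(row)
# (1, 'mouse', 'OEM', 123)
# (4, 'keyboard', 'OEM', 1234)
# (2, 'monitor', 'screen', 1232)
```

Because the match is made on the code and not on row position, the extra entry in the cost sheet cannot shift any costs.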
