Selecting unique records based on date of effect, ending on date of discontinue - sql-server

I have an interesting conundrum and I am using SQL Server 2012 or SQL Server 2016 (T-SQL obviously). I have a list of products, each with their own UPC code. These products have a discontinue date and the UPC code gets recycled to a new product after the discontinue date. So let's say I have the following in the Item_UPCs table:
Item Key | Item Desc | UPC | UPC Discontinue Date
123456 | Shovel | 0009595959 | 2018-04-01
123456 | Shovel | 0007878787 | NULL
234567 | Rake | 0009595959 | NULL
As you can see, I have a UPC that gets recycled to a new product. Unfortunately, I don't have an effective date for the item UPC table, but I do in an items table for when an item was added to the system. But let's ignore that.
Here's what I want to do:
For every inventory record up to the discontinue date, show the unique UPC associated with that date. An inventory record consists of the "Inventory Date", the "Purchase Cost", the "Purchase Quantity", the "Item Description", and the "Item UPC".
Once the discontinue date is over with (e.g.: it's the next day), start showing only the UPC that is in effect.
Make sure that no duplicate data exists and the UPCs are truly being "attached" to each row per whatever the date is in the query.
Here is an example of the inventory details table:
Inv_Key | Trans_Date | Item_Key | Purch_Qty | Purch_Cost
123 | 2018-05-12 | 123456 | 12.00 | 24.00
108 | 2018-03-22 | 123456 | 8.00 | 16.00
167 | 2018-07-03 | 234567 | 12.00 | 12.00
An example query:
SELECT DISTINCT
s.SiteID
,id.Item_Key
,iu.Item_Desc
,iu.Item_Department
,iu.Item_Category
,iu.Item_Subcategory
,iu.UPC
,iu.UPC_Discontinue_Date
,id.Trans_Date
,id.Purch_Cost
,id.Purch_Qty
FROM Inventory_Details id
INNER JOIN Item_UPCs iu ON iu.Item_Key = id.Item_Key
INNER JOIN Sites s ON s.Site_Key = id.Site_Key
The real query I have is far too long to post here. It has three CTEs and the resultant query. This is simply a mockup. Here is an example result set:
Site_ID | Item_Key | Item_Desc | Item_Department | Item_Category | UPC | UPC_Discontinue Date | Trans_Date | Purch_Cost | Purch_Qty
2457 | 123456 | Shovel | Digging Tools | Shovels | 0009595959 | 2018-04-01 | 2018-03-22 | 16.00 | 8.00
2457 | 123456 | Shovel | Digging Tools | Shovels | 0007878787 | NULL | 2018-03-22 | 16.00 | 8.00
2457 | 234567 | Rakes | Garden Tools | Rakes | 0009595959 | NULL | 2018-07-03 | 12.00 | 12.00
2457 | 123456 | Shovel | Digging Tools | Shovels | 0007878787 | NULL | 2018-05-12 | 24.00 | 12.00
Do any of you know how I can "assign" a UPC to a specific range of dates in my query and then "assign" an updated UPC to the item for every effective date thereafter?
Many thanks!

Given your current Item_UPCs table, you can derive effective start dates from the discontinue dates using the LAG analytic function:
with Effective_UPCs as (
    select [Item_Key]
         , [Item_Desc]
         , [UPC]
         , coalesce(lag([UPC_Discontinue_Date])
                      over (partition by [Item_Key]
                            order by coalesce([UPC_Discontinue_Date], datefromparts(9999,12,31))),
                    lag([UPC_Discontinue_Date])
                      over (partition by [UPC]
                            order by coalesce([UPC_Discontinue_Date], datefromparts(9999,12,31)))
                   ) [UPC_Start_Date]
         , [UPC_Discontinue_Date]
    from Item_UPCs i
)
select * from Effective_UPCs;
This yields the following results:
| Item_Key | Item_Desc | UPC | UPC_Start_Date | UPC_Discontinue_Date |
|----------|-----------|------------|----------------|----------------------|
| 123456 | Shovel | 0007878787 | 2018-04-01 | (null) |
| 123456 | Shovel | 0009595959 | (null) | 2018-04-01 |
| 234567 | Rake | 0009595959 | 2018-04-01 | (null) |
The CTE produces open-ended intervals: either the start date or the discontinue date can be NULL, and a UPC with both NULL is treated as effective for all time. To use this in your query, reference the Effective_UPCs CTE in place of the Item_UPCs table and add two predicates that take the effective dates into account. (Add any other columns the query selects, such as Item_Department, to the CTE's select list.)
SELECT DISTINCT
s.SiteID
,id.Item_Key
,iu.Item_Desc
,iu.Item_Department
,iu.Item_Category
,iu.Item_Subcategory
,iu.UPC
,iu.UPC_Discontinue_Date
,id.Trans_Date
,id.Purch_Cost
,id.Purch_Qty
FROM Inventory_Details id
INNER JOIN Effective_UPCs iu
ON iu.Item_Key = id.Item_Key
and (iu.UPC_Start_Date is null or iu.UPC_Start_Date < id.Trans_Date)
and (iu.UPC_Discontinue_Date is null or id.Trans_Date <= iu.UPC_Discontinue_Date)
INNER JOIN Sites s ON s.Site_Key = id.Site_Key
Note that the above query uses a half-open range (UPC_Start_Date < Trans_Date <= UPC_Discontinue_Date, rather than <= for both inequalities). This prevents a transaction that occurs exactly on the discontinue date from matching both the old and the new UPC record. If transactions that occur exactly on the discontinue date should match the new record and not the old, simply swap the two inequalities:
and (iu.UPC_Start_Date is null or iu.UPC_Start_Date <= id.Trans_Date)
and (iu.UPC_Discontinue_Date is null or id.Trans_Date < iu.UPC_Discontinue_Date)
instead of
and (iu.UPC_Start_Date is null or iu.UPC_Start_Date < id.Trans_Date)
and (iu.UPC_Discontinue_Date is null or id.Trans_Date <= iu.UPC_Discontinue_Date)
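As a quick check of the approach above, here is a minimal, self-contained sketch that runs the CTE and the date-range join against the sample rows from the question. The Sample_Item_UPCs and Sample_Inventory names are hypothetical stand-ins for the real tables, only a few columns are carried through, and it assumes SQL Server 2012 or later (for LAG and DATEFROMPARTS).

```sql
-- Sample rows copied from the question; the Sample_* names are hypothetical.
WITH Sample_Item_UPCs AS (
    SELECT v.Item_Key, v.Item_Desc, v.UPC,
           CAST(v.Discontinue AS date) AS UPC_Discontinue_Date
    FROM (VALUES
        (123456, 'Shovel', '0009595959', '2018-04-01'),
        (123456, 'Shovel', '0007878787', NULL),
        (234567, 'Rake',   '0009595959', NULL)
    ) AS v (Item_Key, Item_Desc, UPC, Discontinue)
),
Sample_Inventory AS (
    SELECT v.Inv_Key, CAST(v.Trans_Date AS date) AS Trans_Date, v.Item_Key
    FROM (VALUES
        (123, '2018-05-12', 123456),
        (108, '2018-03-22', 123456),
        (167, '2018-07-03', 234567)
    ) AS v (Inv_Key, Trans_Date, Item_Key)
),
-- Same logic as the answer: the start date is the previous discontinue date,
-- looked up first within the item and then within the UPC.
Effective_UPCs AS (
    SELECT Item_Key, Item_Desc, UPC,
           COALESCE(
               LAG(UPC_Discontinue_Date) OVER (PARTITION BY Item_Key
                   ORDER BY COALESCE(UPC_Discontinue_Date, DATEFROMPARTS(9999, 12, 31))),
               LAG(UPC_Discontinue_Date) OVER (PARTITION BY UPC
                   ORDER BY COALESCE(UPC_Discontinue_Date, DATEFROMPARTS(9999, 12, 31)))
           ) AS UPC_Start_Date,
           UPC_Discontinue_Date
    FROM Sample_Item_UPCs
)
SELECT id.Inv_Key, id.Trans_Date, id.Item_Key, iu.UPC
FROM Sample_Inventory id
INNER JOIN Effective_UPCs iu
    ON iu.Item_Key = id.Item_Key
   AND (iu.UPC_Start_Date IS NULL OR iu.UPC_Start_Date < id.Trans_Date)
   AND (iu.UPC_Discontinue_Date IS NULL OR id.Trans_Date <= iu.UPC_Discontinue_Date)
ORDER BY id.Inv_Key;
```

On these rows each inventory record should pair with exactly one UPC: Inv_Key 108 (2018-03-22) with 0009595959, Inv_Key 123 (2018-05-12) with 0007878787, and Inv_Key 167 (2018-07-03) with 0009595959.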

Related

How to get the top row from a SQL Server record set query and other constraint

I have two SQL Server tables as below:
Event
+------+-------------+----------+----------+----------+-----------------------------+
| Id   | EventTypeId | PersonId | UCNumber | Name     | DateEvent                   |
+------+-------------+----------+----------+----------+-----------------------------+
| 2307 | 3           | 2189     | 004947   | Migrated | 1900-01-01 00:00:00.6780000 |
| 2308 | 15          | 2189     | 004947   | Birthday | 2020-09-18 16:48:32.6870000 |
| 3400 | 15          | 2190     | 006857   | Birthday | 1900-01-01 00:00:00.0000000 |
| 3401 | 2           | 2190     | 006857   | Migrated | 2016-03-12 00:00:00.0000000 |
+------+-------------+----------+----------+----------+-----------------------------+
Person
+------+----------+-------+----------+-----------------------------+
| Id   | UCNumber | Name  | LastName | AnotherDate                 |
+------+----------+-------+----------+-----------------------------+
| 2189 | 004947   | John  | Smith    | 1900-01-01 00:00:00.0000000 |
| 2190 | 006857   | Alice | Timo     | 2020-02-20 00:00:00.0000000 |
+------+----------+-------+----------+-----------------------------+
I need to retrieve the top row (latest in time) based on the Event's Id (the higher the Id, the more recent the event), and its EventTypeId should be 15.
I tried this:
Select P.Id, P.UCNUMBER, P.AnotherDate from
db.dbo.Person P
Inner join db.dbo.Event L on L.PersonId = P.Id
where P.Id in (
SELECT TOP (1) PersonId
FROM
db.dbo.Event
where PersonId = P.Id --and EventTypeID = 15
ORDER BY
Id DESC)
and EventTypeId = 15
but it does not work properly. I have posted only samples from the two tables here. The query also returns other events that are not the latest ones (that is, not the highest Id). Something is missing from it.
In this case, for instance, it should return only 1 row:
2189 004947 1900-01-01 00:00:00.0000000
Sounds like you just want ORDER BY and TOP 1.
SELECT TOP 1
p.id,
p.ucnumber,
p.anotherdate
FROM event e
LEFT JOIN person p
ON p.id = e.personid
WHERE e.eventtypeid = 15
ORDER BY e.dateevent DESC;
If several events share the same latest time and you want all of them, replace TOP 1 with TOP 1 WITH TIES.
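If the requirement is instead read per person (each person's most recent event, by Id as the question describes, must itself be of type 15), a ROW_NUMBER() variant is one possible sketch. The alias latest and the column rn are illustrative names, and this interpretation of the question is an assumption.

```sql
-- Rank each person's events from newest to oldest by Event.Id,
-- then keep the person only if the newest event has EventTypeId = 15.
SELECT p.Id, p.UCNumber, p.AnotherDate
FROM (
    SELECT e.PersonId,
           e.EventTypeId,
           ROW_NUMBER() OVER (PARTITION BY e.PersonId ORDER BY e.Id DESC) AS rn
    FROM db.dbo.Event e
) latest
INNER JOIN db.dbo.Person p
        ON p.Id = latest.PersonId
WHERE latest.rn = 1
  AND latest.EventTypeId = 15;
```

With the sample rows, person 2189's newest event (Id 2308) has type 15 and is kept, while person 2190's newest event (Id 3401) has type 2 and is dropped, which matches the single row the question expects.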

Complex SQL Server Pivot Involving Null and blank vs. Filled values

I need to split the "dep" column into two columns, "Recorded" and "Unrecorded". The "Unrecorded" column should contain the sum of time for that day's entries where "dep" is blank or null. The "Recorded" column should contain the sum of time for that day's entries where "dep" is neither null nor blank.
So far this is what I have
SELECT Cast(Start_Time AS DATE) AS Date, dep,
sum(time) as "Total Time"
FROM A6K_Events
Group By Cast(Start_Time AS DATE), dep
and it yields this
+------------+--------------+------------+
| Date | Dep | Total Time |
+------------+--------------+------------+
| 2018-06-29 | Null | 3544 |
+------------+--------------+------------+
| 2018-06-29 | Other | 268 |
+------------+--------------+------------+
| 2018-06-29 | Training | 471 |
+------------+--------------+------------+
| 2018-06-29 | Change Point | 371 |
+------------+--------------+------------+
| 2018-06-28 | Null | 4519 |
+------------+--------------+------------+
| 2018-06-28 | Training | 1324 |
+------------+--------------+------------+
| 2018-06-28 | | 50 |
+------------+--------------+------------+
This is what I would like the end result to be
+------------+----------+------------+
| Date | Recorded | Unrecorded |
+------------+----------+------------+
| 2018-06-29 | 1110 | 3544 |
+------------+----------+------------+
| 2018-06-28 | 1324 | 4569 |
+------------+----------+------------+
Any suggestions or help would be appreciated. I cannot figure out how to pivot and filter out null and blank values into one column while the other filled values into another column.
Thank you.
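A pivot is not required; conditional aggregation with CASE expressions is enough. Note that the "Recorded" test needs AND (not OR) so that blank values are excluded, and the grouping uses the date part of Start_Time:
SELECT Cast(Start_Time AS DATE) AS Date,
       sum(case when dep is not NULL and dep <> '' then time end) as "Recorded",
       sum(case when dep is NULL or dep = '' then time end) as "Unrecorded"
FROM A6K_Events
Group By Cast(Start_Time AS DATE)
order by 1 desc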

T-SQL Select one row for multiple groups from one table

I have the following table:
NUMBER | DATE | VALUE_1 | VALUE_2
145789 | 2016-10-01 | A | Carrot
145789 | 2016-10-03 | B | Apple
145789 | 2016-10-14 | C | Banana
748596 | 2016-10-07 | Mango | Watermelon
748596 | 2016-10-19 | Pear | Strawberry
748596 | 2016-10-30 | Orange | Avocado
I want to select the first record for each number (the record with the minimum date).
How can I have a result like this?
NUMBER | DATE | VALUE_A | VALUE_B
145789 | 2016-10-01 | A | Carrot
748596 | 2016-10-07 | Mango | Watermelon
Use ROW_NUMBER() for this, as shown below. The inner query numbers the rows within each NUMBER group in date order, and the outer query keeps only the row with the earliest date (RNO = 1).
SELECT [NUMBER], [DATE], [VALUE_1], [VALUE_2]
FROM
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY [NUMBER] ORDER BY [DATE] ASC) AS RNO
    FROM TABLE1
) A
WHERE RNO = 1

SQL Server - tricky query to update contact id using ROW_NUMBER to reference index value

A previous developer used an index rather than the actual contactID to reference which of the associated contacts are the primary contact. The index works well when the app gets the contacts and sets the primary contact in the list on the page, but try joining for a report! Not easy; so I want to update the main table with the actual contact ID to make for a simple join and to avoid this buggary.
In this particular case, I need to update tblInquiry with the claimantContactID and agentContactID. Those two fields I just created and defaulted to 0. However, the challenge is to use the claimantContactIndex and agentContactIndex values from tblInquiry, to get the respective nth row from tblContacts. The index is 0 based, so if the index value is 2, then get the ID of the 3rd contact, for example.
Also, claimantContactIndex and agentContactIndex can either be NULL or some number. If NULL, then assume the first contact (index 0).
I will also add that the contacts query cannot have an ORDER BY on it, because the application relies on the natural order when getting the contacts list (there is no ORDER BY in the stored procedure) and then selects the contact by index accordingly.
DB Platform: SQL Server 2008 R2 Express Edition.
I have the following table structure:
tblInquiry
id | claimantID | agentID | claimantContactIndex | agentContactIndex | claimantContactID | agentContactID
--------------------------------
1 | 1001 | 2001 | 2 | 0 | 0 | 0
2 | 1002 | NULL | 0 | NULL | 0 | 0
tblClaimant
id | name | address | phone | email
--------------------------------
1001 | Widgets Inc. | 123 W. Main | 5550000 | widgets@here.com
1002 | Thingies LLC. | 456 W. Main | 5551111 | thingies@here.com
tblAgent
id | name | address | phone | email
--------------------------------
2001 | Simon Bros. | 789 W. Main | 5552222 | simon@here.com
tblContacts
id | claimantID | agentID | fn | ln | phone | email
--------------------------------
3001 | 1001 | NULL | John | Doe | 5553333 | john@here.com
3002 | 1001 | NULL | Fred | Flynn | 5554444 | fred@here.com
3003 | 1001 | NULL | Mike | Brown | 55555555 | mike@here.com
3004 | 1001 | NULL | Susan | Pierce | 5556666 | susan@here.com
3005 | NULL | 2001 | Jeff | Bridges | 5557777 | jeff@here.com
3006 | NULL | 2001 | Karry | Sinclair | 5558888 | Karry@here.com
3007 | NULL | 2001 | Steve | Green | 5559999 | steve@here.com
3008 | NULL | 2001 | Peter | White | 5550001 | peter@here.com
Update:
I have worked out the select part of this solution and I can now get the correct claimant contact info using ROW_NUMBER() and a JOIN. I will add more to get correct agent contact info. I also handled the case where an index is NULL. And ultimately I will work this out to update the inquiry table now that I have the right contactID.
SELECT
i.id inquiryID, i.claimantContactIndex, i.agentContactIndex, i.claimantContactID, i.agentContactID
,r.id contactID, r.claimantID, r.agentID
,r.*
FROM
(
SELECT ROW_NUMBER()
OVER (Partition by con.claimantid Order by (SELECT NULL)) AS RowNumber, *
FROM tblContacts con
) r
INNER JOIN
tblInquiry i on i.claimantid = r.claimantid and ((isnull(i.claimantContactIndex, 0) + 1 = r.RowNumber ))
WHERE
i.id in (1, 2, 3, 4, 5)
ORDER BY
i.id
This issue was resolved by doing the following:
As I posted above, I used ROW_NUMBER() with ORDER BY (SELECT NULL), along with ISNULL to handle null index values, to get the correct contacts.
I selected the results into a temp table.
I then updated the inquiry table by joining it to the temp table.
I dropped the temp table.
I had to do this in two passes: once for claimants and a second time for agents.
Thx @EricH for pointing me in the right direction.
You could do something like:
Using ideas from here:
https://msdn.microsoft.com/en-us/library/ms186734.aspx
SELECT
ROW_NUMBER() OVER (Order by Id) AS RowNumber,
claimantID, agentID, (etc...)
FROM
tblContacts
This gives you an index-based result set. I'd drop it into a temp table and select from that where RowNumber matches the index you want (RowNumber starts at 1, so use index + 1 for a 0-based index).
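The temp-table approach described in this thread could be sketched as follows for the claimant pass. The temp table name #ClaimantContacts is hypothetical, and ORDER BY (SELECT NULL) is kept only because the question rules out an explicit ordering; SQL Server does not guarantee that this numbering matches the order in which the application sees the contacts, so the result is worth spot-checking before running the update.

```sql
-- Pass 1 (claimants): number each claimant's contacts starting at 1.
-- ORDER BY (SELECT NULL) mirrors the question and gives no guaranteed order.
SELECT con.id AS contactID,
       con.claimantID,
       ROW_NUMBER() OVER (PARTITION BY con.claimantID
                          ORDER BY (SELECT NULL)) AS RowNumber
INTO   #ClaimantContacts              -- hypothetical temp table name
FROM   tblContacts con
WHERE  con.claimantID IS NOT NULL;

-- The stored index is 0-based and may be NULL (treated as 0),
-- while RowNumber is 1-based, hence the + 1.
UPDATE i
SET    i.claimantContactID = c.contactID
FROM   tblInquiry i
INNER JOIN #ClaimantContacts c
        ON c.claimantID = i.claimantID
       AND c.RowNumber  = ISNULL(i.claimantContactIndex, 0) + 1;

DROP TABLE #ClaimantContacts;
```

The agent pass would follow the same shape, partitioning by agentID and writing agentContactID from agentContactIndex.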

Sum worked hours

I have an issues table where users can log worked hours and estimated hours; it looks like this
id | assignee | task | timespent | original_estimate | date
--------------------------------------------------------------------------
1 | john | design | 2 | 3 | 2013-01-01
2 | john | mockup | 2 | 3 | 2013-01-02
3 | john | design | 2 | 3 | 2013-01-01
4 | rick | mockup | 5 | 4 | 2013-01-04
And I need to sum and group the worked and estimated hours by task and date to get this
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | design | 5 | 4 | 2013-01-04
Ok, this is easy, I've already got this:
SELECT assignee, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date FROM issues GROUP BY assignee, task, date
My problem is that I also need to show the assignees that did not log hours on any task that day. I mean:
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | design | 5 | 4 | 2013-01-04
pete | design | 0 | 0 | 2013-01-01
pete | mockup | 0 | 0 | 2013-01-02
liz | design | 0 | 0 | 2013-01-04
liz | mockup | 0 | 0 | 2013-01-04
The goal is to draw a chart like this http://jsfiddle.net/uUjst/embedded/result/
You need the Assignees in their own separate table to join from.
SELECT tblAssignee.Name, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date
FROM tblAssignee
LEFT JOIN issues ON issues.assignee = tblAssignee.Name
GROUP BY tblAssignee.Name, task, date
Assuming that you have a users table but no tasks or dates table, these values have to be derived from the values present in issues. (user is a reserved word in T-SQL, hence the brackets; COALESCE turns the NULL sums for missing rows into 0.)
;WITH dates AS (
    SELECT DISTINCT date
    FROM issues
), tasks AS (
    SELECT DISTINCT task
    FROM issues
)
SELECT
    u.[user] as assignee,
    t.task,
    COALESCE(SUM(i.timespent), 0) as total_spent,
    COALESCE(SUM(i.original_estimate), 0) AS total_estimate,
    d.date
FROM
    users u CROSS JOIN
    dates d CROSS JOIN
    tasks t LEFT OUTER JOIN
    issues i ON
        i.assignee = u.[user]
        AND i.task = t.task
        AND i.date = d.date
GROUP BY u.[user], t.task, d.date
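Another option, assuming an Assignee table with a name column, wraps the sums in ISNULL so that missing hours show as 0:
SELECT
    A.name,
    task,
    ISNULL(SUM(timespent), 0) as total_spent,
    ISNULL(SUM(original_estimate), 0) AS total_estimate,
    date
FROM Assignee A
LEFT JOIN issues
    ON issues.assignee = A.Name
GROUP BY A.name, task, date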
