How to choose the first record based on time - sql-server

I have two tables: a call table and an after call work table. The two have a one-to-one relationship, even though the after call work records are stored in the call table. You can link a call to its after call work record by joining on three values in the call table. The call table also holds the start and end time of each call.
The data in the after call work table is a total mess: there are occasions where one call has many after call work records. My client wants me to pick out the first record based on the start time of the call and take only that one row of data.
It's been suggested that I use a ranking function, but I'm unfamiliar with these. Has anybody got any ideas?
If anything needs explaining further, let me know.
Thank you

Self-extracting ranking and dense ranking example, ASSUMING you are using SQL Server 2008 or newer.
declare @Person table (personID int identity, person varchar(8));
insert into @Person values ('Brett'),('Sean'),('Chad'),('Michael'),('Ray'),('Erik'),('Queyn');
declare @Orders table (OrderID int identity, PersonID int, Desciption varchar(32), Amount int);
insert into @Orders values (1, 'Shirt', 20),(1, 'Shoes', 50),(2, 'Shirt', 22),(2, 'Shoes', 20),(3, 'Shirt', 20),(3, 'Shoes', 50),(3, 'Hat', 20),(4, 'Shirt', 20),(5, 'Shirt', 20),(5, 'Pants', 30),
(6, 'Shirt', 20),(6, 'RunningShoes', 70),(7, 'Shirt', 22),(7, 'Shoes', 40),(7, 'Coat', 80);
with a as
(
    select
        p.person
        , o.Desciption
        , o.Amount
        -- rank() leaves gaps after ties (1, 1, 3); dense_rank() does not (1, 1, 2)
        , rank() over(partition by p.personID order by o.Amount) as Ranking
        , dense_rank() over(partition by p.personID order by o.Amount) as DenseRanking
    from @Person p
    join @Orders o on p.personID = o.PersonID
)
select *
from a
where Ranking <= 2 -- determine top 2, 3, etc.... whatever you want.
order by person, Amount;
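Applied to the question itself, the same idea might look like the sketch below. The table variable @AfterCallWork and its columns (AcwID, CallID, StartTime) are hypothetical stand-ins, since the real schema and the three join columns were not posted. ROW_NUMBER() is used instead of RANK() so that exactly one row per call survives even when two records share a start time.

```sql
-- Hypothetical schema: one row per after-call-work record, keyed to its call.
DECLARE @AfterCallWork TABLE (AcwID int, CallID int, StartTime datetime);
INSERT INTO @AfterCallWork VALUES
    (1, 100, '2014-01-06T09:00:00'),
    (2, 100, '2014-01-06T09:05:00'),  -- second record for the same call
    (3, 200, '2014-01-06T10:30:00');

WITH ordered AS
(
    SELECT AcwID, CallID, StartTime,
           -- number the records of each call from earliest to latest;
           -- AcwID breaks ties so the numbering is deterministic
           ROW_NUMBER() OVER (PARTITION BY CallID
                              ORDER BY StartTime, AcwID) AS rn
    FROM @AfterCallWork
)
SELECT AcwID, CallID, StartTime
FROM ordered
WHERE rn = 1;  -- keep only the first record per call
```

With real data, the PARTITION BY list would hold whichever columns identify a call (the three join values mentioned in the question).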

Related

Select all Main table rows with detail table column constraints with GROUP BY

I have two tables, tblMain and tblDetail, on SQL Server that are linked with tblMain.id=tblDetail.OrderID for handling orders. I've not found exactly the same situation on StackOverflow.
Here below is the sample table design:
/* create and populate tblMain: */
CREATE TABLE tblMain (
ID int IDENTITY(1,1) NOT NULL,
DateOrder datetime NULL,
CONSTRAINT PK_tblMain PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblMain (DateOrder) VALUES('2021-05-20T12:12:10');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-21T09:13:13');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-22T21:30:28');
GO
/* create and populate tblDetail: */
CREATE TABLE tblDetail (
ID int IDENTITY(1,1) NOT NULL,
OrderID int NULL,
Gencod VARCHAR(255),
Quantity float,
Price float,
CONSTRAINT PK_tblDetail PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '1234567890123', 8, 12.30);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '5825867890321', 2, 2.88);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '7788997890333', 1, 1.77);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9882254656215', 3, 5.66);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9665464654654', 4, 10.64);
GO
Here is my SELECT with grouping:
SELECT tblMain.id,SUM(tblDetail.Quantity*tblDetail.Price) AS TotalPrice
FROM tblMain LEFT JOIN tblDetail ON tblMain.id=tblDetail.orderid
WHERE (tblDetail.Quantity<>0) GROUP BY tblMain.id;
GO
This gives:
The wished output:
We see that id=2 is not shown even with the LEFT JOIN, as there are no records with OrderID=2 in tblDetail.
How can I design a new query that shows tblMain.id = 2? Meanwhile, I must keep the WHERE (tblDetail.Quantity<>0) constraint. Many thanks.
EDIT:
The above query serves as a CTE (Common Table Expression) for a main query that also takes the payments table tblPayments into account.
After testing, both solutions work.
In my case, the main table has 15K records, while the detail table has some millions. With (tblDetail.Quantity<>0 OR tblDetail.Quantity IS NULL) AND tblDetail.IsActive=1 added to the JOIN ON clause it takes 37s to run, while with the first solution from @pwilcox, where the condition is added to the WHERE clause, it finishes in 29s. So a time gain of about 20%.
The tblDetail.IsActive column lets me ignore detail rows that are temporarily disabled by setting it to false.
So for me it's @pwilcox's answer.
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
Change
WHERE (tblDetail.Quantity<>0)
to
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
as the former will omit id = 2 because the corresponding quantity would be null in a left join, and NULL <> 0 does not evaluate to true.
And as HABO mentions, you can also make the condition part of your join logic rather than your WHERE clause, avoiding the need for the 'or' condition.
select m.id,
totalPrice = sum(d.quantity * d.price)
from tblMain m
left join tblDetail d
on m.id = d.orderid
and d.quantity <> 0
group by m.id;
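One thing worth checking before choosing between the two forms is that they are not strictly equivalent. A minimal sketch, using hypothetical table variables @Main and @Detail rather than the real tables: an order whose only detail rows have Quantity = 0 is dropped by the WHERE version but kept (with a NULL total) by the ON version.

```sql
-- Hypothetical miniature of tblMain / tblDetail.
DECLARE @Main TABLE (ID int PRIMARY KEY);
DECLARE @Detail TABLE (OrderID int, Quantity float, Price float);
INSERT INTO @Main VALUES (1), (2), (3);
INSERT INTO @Detail VALUES
    (1, 8, 12.30),   -- order 1: a normal detail row
    (3, 0, 5.00);    -- order 3: only a zero-quantity row; order 2 has no rows

-- Filter in WHERE: order 3 joins to its zero-quantity row, which then fails
-- both branches of the predicate, so order 3 disappears from the result.
SELECT m.ID, SUM(d.Quantity * d.Price) AS TotalPrice
FROM @Main m
LEFT JOIN @Detail d ON m.ID = d.OrderID
WHERE d.Quantity <> 0 OR d.Quantity IS NULL
GROUP BY m.ID;

-- Filter in ON: the zero-quantity row simply does not match, so order 3 is
-- still returned, NULL-extended, alongside orders 1 and 2.
SELECT m.ID, SUM(d.Quantity * d.Price) AS TotalPrice
FROM @Main m
LEFT JOIN @Detail d ON m.ID = d.OrderID AND d.Quantity <> 0
GROUP BY m.ID;
```

If every main row must appear in the output, the ON form is the one that guarantees it.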

Can I pull Max and Min values in SQL without using group by for non aggregate values?

I have a table of user data for when they enroll in a program. The fields include a user ID, start date, end date, entry reason, exit reason and program type. For each year the user is enrolled in a specific program they will have an entry and exit date for that year along with an entry reason. They only get an exit reason when they are exited from the program completely. Here is an example of the data in the table.
Data Table
Desired Result
I need to pull one line for each user that has their original start date in the program, most recent start date, and most recent end date. I also need to pull the exit reason if one exists and entry reason associated with the most recent start date and this is what is getting me hung up. I’m assuming the problem is related to having to group by the entry reason. Is there any way around using an aggregate function to get the min/max dates?
My query is:
Select
    Table1.userID,
    CAST(Min(table2.startdate) as date) as Originalstartdate,
    CAST(Max(table2.startdate) as date) as Maxstartdate,
    CAST(Max(table2.enddate) as date) as ExitDate,
    CASE
        WHEN table2.exitreason = NULL then ''
        ELSE table2.exitreason
    END as Exitcode,
    Table2.entryreason
From
    Table1 left outer join
    Table2 on Table1.userID = Table2.userID
Where
    Table1.status = 'active' and Table2.programID = 'Program1' and (Table2.exitreason <> 'NULL' or Table2.entryreason <> 'NULL')
Group By
    Table1.userID, Table2.exitreason, Table2.entryreason
I used the below sample code in order to generate this.
The idea here is to utilize the userID as the anchor (you want one row per user, right?), aggregating the rest of the information but with the situation you requested.
CREATE TABLE SCRIPT:
CREATE TABLE table1
(
userID INT IDENTITY(1, 1) PRIMARY KEY,
name VARCHAR(200),
stat CHAR(1) NOT NULL
DEFAULT 'A');
CREATE TABLE table2
(
t2ID INT IDENTITY(1, 1),
StartDate DATE,
UserID INT FOREIGN KEY REFERENCES table1(userID),
ProgramID VARCHAR(150) DEFAULT 'Program1',
EndDate DATE,
EntryReason VARCHAR(2000),
ExitReason VARCHAR(2000));
INSERT INTO table1 (name)
VALUES ('First name'), ('Second name'), ('Third name');
INSERT INTO table2 (StartDate, UserID, ProgramID, EndDate, EntryReason, ExitReason)
VALUES
    ('20180101', 1, 'Program1', '20181231', 11, NULL),
    ('20190101', 1, 'Program1', '20191231', 12, NULL),
    ('20200101', 1, 'Program1', NULL, 11, NULL),
    ('20170101', 2, 'Program1', '20171231', 11, NULL),
    ('20180101', 2, 'Program1', '20171231', 14, 2),
    ('20200101', 3, 'Program1', NULL, 11, NULL);
QUERY:
SELECT t1.userID,
       CAST(MIN(t2.StartDate) AS DATE) AS OriginalStartDate, -- This uses your logic to grab the earliest date
       CAST(MAX(t2.StartDate) AS DATE) AS RecentStartDate,   -- This utilizes your logic to grab the last start date
       CAST(MAX(t2.EndDate) AS DATE) AS ExitDate,
       -- MAX ignores NULLs, so this is the latest populated end date
       -- (NULL only when the user has no end date at all)
       ISNULL(MAX(t2.ExitReason), '') AS ExitCode, -- This is just a cleaner way to handle nulls.
       STUFF(
           (
               SELECT CONCAT(',', EntryReason)
               FROM table2
               WHERE table2.UserID = t1.userID
               FOR XML PATH('')
           ), 1, 1, '') AS EntryReasonList
       -- this solution creates a list of entry reasons; we could pick a best winner
       -- (e.g. first entry code, last entry code..) but I created a list because
       -- I didn't understand your intent.
FROM table1 AS t1
LEFT JOIN table2 AS t2
    ON t1.userID = t2.UserID
WHERE t1.stat = 'A'             -- you would use status = 'active'
  AND t2.ProgramID = 'Program1' -- same as before (this also drops users with no table2 rows)
  AND NOT EXISTS
      -- a NOT EXISTS clause filters out users who already have an exit reason (graduates)
      (
          SELECT 1
          FROM table2 AS t2self
          WHERE t2.UserID = t2self.UserID
            AND t2self.ExitReason IS NOT NULL
      )
GROUP BY t1.userID;
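If the goal is the entry reason that belongs to the most recent start date, rather than a list, one possible sketch avoids GROUP BY altogether by using window functions. The table variable @Enroll is a hypothetical, trimmed-down stand-in for table2, and taking the end date and exit reason from the latest row is an assumption about the intent, not something the question confirms.

```sql
-- Hypothetical, trimmed-down stand-in for the enrollment table.
DECLARE @Enroll TABLE (UserID int, StartDate date, EndDate date,
                       EntryReason varchar(20), ExitReason varchar(20));
INSERT INTO @Enroll VALUES
    (1, '20180101', '20181231', '11', NULL),
    (1, '20190101', '20191231', '12', NULL),
    (1, '20200101', NULL,       '11', NULL),
    (2, '20170101', '20171231', '11', NULL),
    (2, '20180101', '20181231', '14', '2');

WITH ranked AS
(
    SELECT UserID, StartDate, EndDate, EntryReason, ExitReason,
           -- earliest start date per user, repeated on every row of that user
           MIN(StartDate) OVER (PARTITION BY UserID) AS OriginalStartDate,
           -- 1 marks the row with the most recent start date per user
           ROW_NUMBER() OVER (PARTITION BY UserID
                              ORDER BY StartDate DESC) AS rn
    FROM @Enroll
)
SELECT UserID,
       OriginalStartDate,
       StartDate              AS RecentStartDate,
       EndDate                AS ExitDate,
       ISNULL(ExitReason, '') AS ExitCode,
       EntryReason            -- entry reason of the most recent enrollment row
FROM ranked
WHERE rn = 1;
```

Because every column in the final SELECT comes from a single row per user, no non-aggregated column has to be added to a GROUP BY list.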

SQL Server CTE left outer join

I have 2 tables in SQL Server 2008: customertest, with columns for the customer id (cid) and its boss id (upid), and conftest, with cid, confname and confvalue.
customertest schema and data:
conftest schema and data:
I want to know how to design a CTE so that, if a cid in conftest doesn't have a confvalue for that confname, it keeps searching up through upid until it finds an ancestor that has a confname and confvalue.
For example, I want to get a value of 100 if I search for cid=4 (this is the normal case), and I want to get a value of 200 if I search for cid=7 or 8.
And if cid 7 and cid 8 have child nodes, those should all return 200 (from cid 5) when I search using this CTE.
I don't have a clue how to do this. I think it might use a CTE and some left outer join; could you please give me an example? Thanks a lot.
What if it's unknown how many levels there are in the hierarchy?
Then such a challenge is often solved with a recursive CTE.
Example Snippet:
--
-- Using table variables for testing reasons
--
declare @customertest table (cid int primary key, upid int);
declare @conftest table (cid int, confname varchar(6) default 'budget', confvalue int);
--
-- Sample data
--
insert into @customertest (cid, upid) values
(1,0), (2,1), (3,1), (4,2), (5,2), (6,3),
(7,5), (8,5), (9,8), (10,9);
insert into @conftest (cid, confvalue) values
(1,1000), (2,700), (3,300), (4,100), (5,200), (6,300);
-- The customer that has his own budget, or not.
declare @customerID int = 10;
;with RCTE AS
(
    --
    -- the recursive CTE starts from here. The seed records, as one could call it.
    --
    select cup.cid as orig_cid, 0 as lvl, cup.cid, cup.upid, budget.confvalue
    from @customertest as cup
    left join @conftest budget on (budget.cid = cup.cid and budget.confname = 'budget')
    where cup.cid = @customerID -- This is where we limit on the customer
    union all
    --
    -- This is where the Recursive CTE loops till it finds nothing new
    --
    select RCTE.orig_cid, RCTE.lvl+1, cup.cid, cup.upid, budget.confvalue
    from RCTE
    join @customertest as cup on (cup.cid = RCTE.upid)
    outer apply (select b.confvalue from @conftest b where b.cid = cup.cid and b.confname = 'budget') as budget
    where RCTE.confvalue is null -- Loop till a budget is found
)
select
    orig_cid as cid,
    confvalue
from RCTE
where confvalue is not null;
Result :
cid confvalue
--- ---------
10 200
Btw, the recursive CTE uses OUTER APPLY because MS SQL Server doesn't allow a LEFT OUTER JOIN in the recursive part of a CTE.
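The same recursion can also resolve every customer in one pass instead of a single @customerID. The following is a sketch of that variation under the same sample data, not part of the original answer; the only structural change is that the seed query has no WHERE filter.

```sql
declare @customertest table (cid int primary key, upid int);
declare @conftest table (cid int, confname varchar(6) default 'budget', confvalue int);
insert into @customertest (cid, upid) values
    (1,0), (2,1), (3,1), (4,2), (5,2), (6,3), (7,5), (8,5), (9,8), (10,9);
insert into @conftest (cid, confvalue) values
    (1,1000), (2,700), (3,300), (4,100), (5,200), (6,300);

with RCTE as
(
    -- seed: every customer, with its own budget if it has one
    select cup.cid as orig_cid, 0 as lvl, cup.upid, budget.confvalue
    from @customertest as cup
    left join @conftest budget
        on (budget.cid = cup.cid and budget.confname = 'budget')
    union all
    -- step: climb one level, but only for customers still without a budget
    select RCTE.orig_cid, RCTE.lvl + 1, cup.upid, budget.confvalue
    from RCTE
    join @customertest as cup on (cup.cid = RCTE.upid)
    outer apply (select b.confvalue
                 from @conftest b
                 where b.cid = cup.cid and b.confname = 'budget') as budget
    where RCTE.confvalue is null
)
select orig_cid as cid,
       lvl as levels_up,   -- 0 means the customer's own budget was used
       confvalue
from RCTE
where confvalue is not null
order by orig_cid;
```

SQL Server stops a recursive CTE with an error after 100 recursion levels by default; a deeper hierarchy would need an OPTION (MAXRECURSION n) hint on the final SELECT, and a cycle in upid would hit that limit rather than loop forever.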
And what if it's certain that the upid with a budget is at most 1 level up?
Then simple left joins and a COALESCE would do.
For example:
select cup.cid, coalesce(cBudget.confvalue, upBudget.confvalue) as confvalue
from @customertest as cup
left join @conftest cBudget on (cBudget.cid = cup.cid and cBudget.confname = 'budget')
left join @conftest upBudget on (upBudget.cid = cup.upid and upBudget.confname = 'budget')
where cup.cid = 8;
I don't think you are looking for a CTE to do that, from what I understand:
CREATE TABLE CustomerTest(
CID INT,
UPID INT
);
CREATE TABLE ConfTest(
CID INT,
ConfName VARCHAR(45),
ConfValue INT
);
INSERT INTO CustomerTest VALUES
(1, 0),
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 3),
(7, 5),
(8, 5);
INSERT INTO ConfTest VALUES
(1, 'Budget', 1000),
(2, 'Budget', 700),
(3, 'Budget', 300),
(4, 'Budget', 100),
(5, 'Budget', 200),
(6, 'Budget', 300);
SELECT MAX(CNT.CID) AS CID,
CNT.ConfName,
MIN(CNT.ConfValue) AS ConfValue
FROM ConfTest CNT INNER JOIN CustomerTest CMT ON CMT.CID = CNT.CID
OR CMT.UPID = CNT.CID
WHERE CMT.CID = 7 -- You can test for values (8, 4) or any value you want :)
GROUP BY
CNT.ConfName;

T-SQL prepare dynamic COALESCE

As attached in screenshot, there are two tables.
Configuration:
Detail
Using the Configuration and Detail tables, I would like to populate the IdentificationType and IDerivedIdentification columns in the Detail table.
The following logic should be used while deriving the above columns:
The Configuration table holds an order of preference, which the user can change dynamically (i.e. if the country is Austria then the ID preference should be LEI, then TIN (in case LEI is blank), then CONCAT (if both are blank, some other logic applies)).
In the case of contract ID = 3, the country is BG, so LEI should be checked first; since it's NULL, CCPT = 456 will be picked.
I could have used COALESCE and a CASE statement if hardcoding were allowed.
Can you please suggest an alternative approach?
Regards
Digant
Assuming that this is some horrendous data dump and you are trying to clean it up, here is some SQL to throw at it. :) Firstly, I was able to capture your image text via Adobe Acrobat > Excel.
(I also built the schema for you at: http://sqlfiddle.com/#!6/8f404/12)
The correct thing to do is fix the glaring problem, and that's the table structure. Assuming you can't, here's a solution.
So, here it is. It unpivots the columns LEI, NIND, CCPT and TIN from the Detail table, as well as FirstPref, SecondPref and ThirdPref from the Configuration table. Doing this helps to normalize the data, although it costs you major performance if there are no plans to fix the data structure or you cannot fix it. After that you are simply joining Detail.ContactId to DerivedTypes.ContactId, then DerivedPrefs.ISOCountryCode to Detail.CountrylSOCountryCode and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType.
If you use an inner join rather than the left join you can remove the RANK() function, but it will not show all ContactIds, only those that have a value in their LEI, NIND, CCPT or TIN columns. I think that's a better solution anyway, because why would you want to see an error mixed into a report? Write a separate report for those with no values in those columns.
Lastly, the TOP (1) WITH TIES lets you display one record per ContactId and still allows the record with the error to display. Hope this helps.
CREATE TABLE Configuration
(ISOCountryCode varchar(2), CountryName varchar(8), FirstPref varchar(6), SecondPref varchar(6), ThirdPref varchar(6))
;
INSERT INTO Configuration
(ISOCountryCode, CountryName, FirstPref, SecondPref, ThirdPref)
VALUES
('AT', 'Austria', 'LEI', 'TIN', 'CONCAT'),
('BE', 'Belgium', 'LEI', 'NIND', 'CONCAT'),
('BG', 'Bulgaria', 'LEI', 'CCPT', 'CONCAT'),
('CY', 'Cyprus', 'LEI', 'NIND', 'CONCAT')
;
CREATE TABLE Detail
(ContactId int, FirstName varchar(1), LastName varchar(3), BirthDate varchar(4), CountrylSOCountryCode varchar(2), Nationality varchar(2), LEI varchar(9), NIND varchar(9), CCPT varchar(9), TIN varchar(9))
;
INSERT INTO Detail
(ContactId, FirstName, LastName, BirthDate, CountrylSOCountryCode, Nationality, LEI, NIND, CCPT, TIN)
VALUES
(1, 'A', 'DES', NULL, 'AT', 'AT', '123', '4345', NULL, NULL),
(2, 'B', 'DEG', NULL, 'BE', 'BE', NULL, '890', NULL, NULL),
(3, 'C', 'DEH', NULL, 'BG', 'BG', NULL, '123', '456', NULL),
(4, 'D', 'DEi', NULL, 'BG', 'BG', NULL, NULL, NULL, NULL)
;
SELECT TOP (1) with ties Detail.ContactId,
FirstName,
LastName,
BirthDate,
CountrylSOCountryCode,
Nationality,
LEI,
NIND,
CCPT,
TIN,
ISNULL(DerivedPrefs.ldentificationType, 'ERROR') ldentificationType,
IDerivedIdentification,
RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END) AS PrefRank
FROM
Detail
LEFT JOIN
(
SELECT
ContactId,
LEI,
NIND,
CCPT,
TIN
FROM Detail
) DetailUNPVT
UNPIVOT
(IDerivedIdentification FOR ldentificationType IN
(LEI, NIND, CCPT, TIN)
)AS DerivedTypes
ON DerivedTypes.ContactId = Detail.ContactId
LEFT JOIN
(
SELECT
ISOCountryCode,
CountryName,
FirstPref,
SecondPref,
ThirdPref
FROM
Configuration
) ConfigUNPIVOT
UNPIVOT
(ldentificationType FOR Pref IN
(FirstPref, SecondPref, ThirdPref)
)AS DerivedPrefs
ON DerivedPrefs.ISOCountryCode = Detail.CountrylSOCountryCode
and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType
ORDER BY RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END)
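For comparison, here is a sketch of an alternative that avoids UNPIVOT and instead builds both lists with VALUES inside an OUTER APPLY. It is not the answer's method. The table variables @Config and @Detail are hypothetical, trimmed-down copies of the two tables (CountryCode stands in for the country column), only NULL is treated as "blank", and the CONCAT fallback is left unresolved because the question does not define it.

```sql
-- Hypothetical trimmed copies of Configuration and Detail.
DECLARE @Config TABLE (ISOCountryCode varchar(2), FirstPref varchar(6),
                       SecondPref varchar(6), ThirdPref varchar(6));
DECLARE @Detail TABLE (ContactId int, CountryCode varchar(2), LEI varchar(9),
                       NIND varchar(9), CCPT varchar(9), TIN varchar(9));
INSERT INTO @Config VALUES
    ('AT', 'LEI', 'TIN',  'CONCAT'),
    ('BE', 'LEI', 'NIND', 'CONCAT'),
    ('BG', 'LEI', 'CCPT', 'CONCAT');
INSERT INTO @Detail VALUES
    (1, 'AT', '123', '4345', NULL,  NULL),
    (2, 'BE', NULL,  '890',  NULL,  NULL),
    (3, 'BG', NULL,  '123',  '456', NULL),
    (4, 'BG', NULL,  NULL,   NULL,  NULL);

SELECT d.ContactId,
       ISNULL(pick.IdType, 'ERROR') AS IdentificationType,
       pick.IdValue                 AS DerivedIdentification
FROM @Detail d
JOIN @Config c ON c.ISOCountryCode = d.CountryCode
OUTER APPLY
(
    -- walk the country's preferences in order and take the first identifier
    -- type that actually has a value on this Detail row
    SELECT TOP (1) p.IdType, v.IdValue
    FROM (VALUES (1, c.FirstPref), (2, c.SecondPref), (3, c.ThirdPref))
             AS p(PrefOrder, IdType)
    JOIN (VALUES ('LEI', d.LEI), ('NIND', d.NIND), ('CCPT', d.CCPT), ('TIN', d.TIN))
             AS v(IdType, IdValue)
        ON v.IdType = p.IdType
    WHERE v.IdValue IS NOT NULL
    ORDER BY p.PrefOrder
) AS pick
ORDER BY d.ContactId;
```

Since the preference order is read from @Config at query time, changing a country's preferences changes the result without touching the query, which is the "dynamic COALESCE" behaviour the question asks for.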

T_SQL Rank function not working as expected

Once again I really need your expertise. It looks like I am not ranking well here...
I have two records returned and they are both ranked 1, even though their date times differ by some minutes. Thanks guys, I really appreciate your help as always.
SELECT CD.MEMACT,
       CD.DATETIME,--DATETIME
       CD.AG_ID,
       RANK() OVER (PARTITION BY
                        CD.MEMACT,
                        CD.DATETIME,
                        CD.AG_ID
                    ORDER BY CD.DATETIME) RANKED
FROM MEM_ACT_TBL CD
WHERE CD.MEMACT='1024518'
You provided this sample data as violating your expectations:
MEMACT DATETIME AG_ID RANK
------- --------------- ------- ----
1024518 12/26/2013 7:43 Ag_1541 1
1024518 12/26/2013 7:53 Ag_2488 1
But in your example code, you PARTITION BY CD.MEMACT, CD.DATETIME, CD.AG_ID. This explicitly means to treat all rows that do not share these values as completely separate ranking series. Since the two rows you presented above do not have the same MEMACT, DATETIME, and AG_ID values, then they each start their ranking from 1.
If you were expecting the two rows above to ascend in rank, then you must not partition by anything but MEMACT, since only that column is the same between the rows. You would then most likely order by DATETIME and then AG_ID (to get determinism if you have two rows with the same date) like so:
PARTITION BY CD.MEMACT ORDER BY CD.DATETIME, CD.AG_ID
It never makes sense in any of the ranking functions to both PARTITION BY and ORDER BY the same column as this will ensure that the ORDER BY has no differing data to work with as you've already segregated all the separate series by the same column.
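The effect is easy to reproduce with the two sample rows. In this sketch, the table variable @MemAct is a hypothetical stand-in for MEM_ACT_TBL, with the column types assumed; it shows both window definitions side by side.

```sql
-- Hypothetical stand-in for MEM_ACT_TBL holding the two sample rows.
DECLARE @MemAct TABLE (MEMACT varchar(10), [DATETIME] datetime, AG_ID varchar(10));
INSERT INTO @MemAct VALUES
    ('1024518', '2013-12-26T07:43:00', 'Ag_1541'),
    ('1024518', '2013-12-26T07:53:00', 'Ag_2488');

SELECT CD.MEMACT,
       CD.[DATETIME],
       CD.AG_ID,
       -- every row is alone in its partition, so both rows rank 1
       RANK() OVER (PARTITION BY CD.MEMACT, CD.[DATETIME], CD.AG_ID
                    ORDER BY CD.[DATETIME]) AS OverPartitioned,
       -- one partition per MEMACT, so the rows rank 1 and 2 by time
       RANK() OVER (PARTITION BY CD.MEMACT
                    ORDER BY CD.[DATETIME], CD.AG_ID) AS PerMember
FROM @MemAct AS CD
WHERE CD.MEMACT = '1024518';
```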
Without seeing your data, I would say it is due to the partitioning. The 'partition by' clause is essentially WHAT you group by to decide where the numbering resets. The 'order by' determines HOW the rows are sequenced within each group. See this simple example, where in my first windowed function I use one too many partition by columns and thus just get a whole bunch of redundant ones.
Simple self extracting example:
declare @Person table (personID int identity, person varchar(8));
insert into @Person values ('Brett'),('Sean'),('Chad'),('Michael'),('Ray'),('Erik'),('Queyn');
declare @Orders table (OrderID int identity, PersonID int, Desciption varchar(32), Amount int);
insert into @Orders values (1, 'Shirt', 20),(1, 'Shoes', 50),(2, 'Shirt', 22),(2, 'Shoes', 52),(3, 'Shirt', 20),(3, 'Shoes', 50),(3, 'Hat', 20),(4, 'Shirt', 20),(5, 'Shirt', 20),(5, 'Pants', 30),
(6, 'Shirt', 20),(6, 'RunningShoes', 70),(7, 'Shirt', 22),(7, 'Shoes', 40),(7, 'Coat', 80);
select
    p.person
    , o.Desciption
    , o.Amount
    , row_number() over(partition by person, Desciption order by Desciption) as [Wrong I Used Too Many Partitions]
    , row_number() over(partition by person order by Desciption) as [Correct By Alpha of Description]
    , row_number() over(partition by person order by Amount desc) as [Correct By Amount of Highest First]
from @Person p
join @Orders o on p.personID = o.PersonID;
I would guess you could do this instead:
SELECT
    CD.MEMACT,
    CD.DATETIME,--DATETIME
    CD.AG_ID,
    RANK() OVER (PARTITION BY CD.MEMACT ORDER BY CD.DATETIME, CD.AG_ID) RANKED
FROM MEM_ACT_TBL CD
WHERE CD.MEMACT='1024518'
