Exclude Secondary ID Records from Original SELECT - sql-server

I'm relatively new to SQL and am running into a lot of issues trying to figure this one out. I've tried using a LEFT JOIN, and dabbled in using functions to get this to work but to no avail.
For every UserID, if there is a NULL value, I need to remove all records of the Product ID for that UserID from my SELECT.
I am using SQL Server 2014.
Example Table
+--------------+-------------+---------------+
| UserID | ProductID | DateTermed |
+--------------+-------------+---------------+
| 578 | 2 | 1/7/2017 |
| 578 | 2 | 1/7/2017 |
| 578 | 1 | 1/15/2017 |
| 578 | 1 | NULL |
| 649 | 1 | 1/9/2017 |
| 649 | 2 | 1/11/2017 |
+--------------+-------------+---------------+
Desired Output
+--------------+-------------+---------------+
| UserID | ProductID | DateTermed |
+--------------+-------------+---------------+
| 578 | 2 | 1/7/2017 |
| 578 | 2 | 1/7/2017 |
| 649 | 1 | 1/9/2017 |
| 649 | 2 | 1/11/2017 |
+--------------+-------------+---------------+

Try the following:
SELECT a.userid, a.productid, a.datetermed
FROM yourtable a
LEFT OUTER JOIN (SELECT DISTINCT userid, productid FROM yourtable WHERE
datetermed IS NULL) b
ON a.userid = b.userid AND a.productid = b.productid
WHERE b.userid IS NULL
This left outer joins every record to the records that have a NULL date and share its UserID and ProductID. If you keep only the records with no match in the joined table (b.userid IS NULL), you are left with the UserID and ProductID pairs that never have a NULL date. The DISTINCT keeps rows from being duplicated when a pair has more than one NULL record.
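The anti-join idea can be checked against the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server 2014, and the function name rows_without_null_pairs is made up for the illustration.

```python
import sqlite3

def rows_without_null_pairs():
    # In-memory SQLite database standing in for SQL Server (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable (userid INTEGER, productid INTEGER, datetermed TEXT)"
    )
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?, ?)",
        [
            (578, 2, "1/7/2017"),
            (578, 2, "1/7/2017"),
            (578, 1, "1/15/2017"),
            (578, 1, None),
            (649, 1, "1/9/2017"),
            (649, 2, "1/11/2017"),
        ],
    )
    # Anti-join: keep only rows whose (userid, productid) pair has no NULL date.
    query = """
        SELECT a.userid, a.productid, a.datetermed
        FROM yourtable a
        LEFT OUTER JOIN (SELECT DISTINCT userid, productid
                         FROM yourtable
                         WHERE datetermed IS NULL) b
          ON a.userid = b.userid AND a.productid = b.productid
        WHERE b.userid IS NULL
        ORDER BY a.userid, a.productid
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

for row in rows_without_null_pairs():
    print(row)
```

Both records for UserID 578 and ProductID 1 drop out, because one of them has a NULL DateTermed.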

You can use this WHERE condition:
SELECT
UserID,ProductID,DateTermed
FROM
[YourTableName]
WHERE
(CONVERT(VARCHAR,UserId)+'-'+
CONVERT(VARCHAR,ProductID) NOT IN (
select CONVERT(VARCHAR,UserId)+'-'+ CONVERT(VARCHAR,ProductID)
from
[YourTableName]
where DateTermed is null)
)
When you concatenate the UserId and the ProductId with a separator between them, you get a unique value for each pair. You can then use that value as a "key" to exclude the pairs that have a NULL value in the DateTermed field.
Hope this helps.
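A quick check of why a separator between the two IDs matters when building such a key. The helper names naive_key and delimited_key are hypothetical and only mirror the string concatenation done in the query.

```python
def naive_key(user_id, product_id):
    # Plain concatenation, with nothing between the two numbers.
    return str(user_id) + str(product_id)

def delimited_key(user_id, product_id):
    # Concatenation with a separator that cannot appear inside an integer.
    return str(user_id) + "-" + str(product_id)

# Two different pairs collapse to the same key without a separator.
print(naive_key(1, 12), naive_key(11, 2))
# With the separator the keys stay distinct.
print(delimited_key(1, 12), delimited_key(11, 2))
```

Without the separator, the pairs (1, 12) and (11, 2) both become "112", so excluding one would also exclude the other.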

Related

How to get the top row from a SQL Server record set query and other constraint

I have two SQL Server tables as below:
Event
+------------+----------------------------+-------------+------------+-----------------------------+
| Id | EventTypeId | PersonId | UCNumber | Name | DateEvent |
+------------+----------------------------+-------------+------------+-----------------------------+
| 2307 | 3 | 2189 | 004947 | Migrated | 1900-01-01 00:00:00.6780000 |
| 2308 | 15 | 2189 | 004947 | Birthday | 2020-09-18 16:48:32.6870000 |
| 3400 | 15 | 2190 | 006857 | Birthday | 1900-01-01 00:00:00.0000000 |
| 3401 | 2 | 2190 | 006857 | Migrated | 2016-03-12 00:00:00.0000000 |
Person
+------------+----------------+-------------------+-----------+-------------------------------+
| Id | UCNumber | Name |LastName | AnotherDate |
+------------+----------------+-------------------+-----------+-------------------------------+
| 2189 | 004947 | John | Smith | 1900-01-01 00:00:00.0000000 |
| 2190 | 006857 | Alice | Timo | 2020-02-20 00:00:00.0000000 |
I need to retrieve the top row (the latest in time) based on the Event's Id (the higher the Id, the more recent the Event), and its EventTypeId should be 15.
I tried this:
Select P.Id, P.UCNUMBER, P.AnotherDate from
db.dbo.Person P
Inner join db.dbo.Event L on L.PersonId = P.Id
where P.Id in (
SELECT TOP (1) PersonId
FROM
db.dbo.Event
where PersonId = P.Id --and EventTypeID = 15
ORDER BY
Id DESC)
and EventTypeId = 15
but it does not work properly. I have posted only samples from the two tables here. In general, the query also returns events that are not the latest ones (that is, not the ones with the highest Id). Something is missing in it.
In this case, for instance, it should return only 1 row:
2189 004947 1900-01-01 00:00:00.0000000
Sounds like you just want ORDER BY and TOP 1.
SELECT TOP 1
p.id,
p.ucnumber,
p.anotherdate
FROM event e
LEFT JOIN person p
ON p.id = e.personid
WHERE e.eventtypeid = 15
ORDER BY e.dateevent DESC;
If you want all ties when several events share the same latest time, you can replace TOP 1 with TOP 1 WITH TIES.
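The question can also be read differently: a person qualifies only when their most recent event (the one with the highest Id) has EventTypeId 15. That reading is an assumption, not something the answer above states. A minimal sketch of it, using Python's built-in sqlite3 as a stand-in for SQL Server and a made-up function name:

```python
import sqlite3

def people_whose_latest_event_is_type_15():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Person (Id INTEGER, UCNumber TEXT, Name TEXT, "
        "LastName TEXT, AnotherDate TEXT)"
    )
    conn.execute(
        "CREATE TABLE Event (Id INTEGER, EventTypeId INTEGER, PersonId INTEGER, "
        "UCNumber TEXT, Name TEXT, DateEvent TEXT)"
    )
    conn.executemany(
        "INSERT INTO Person VALUES (?, ?, ?, ?, ?)",
        [
            (2189, "004947", "John", "Smith", "1900-01-01 00:00:00.0000000"),
            (2190, "006857", "Alice", "Timo", "2020-02-20 00:00:00.0000000"),
        ],
    )
    conn.executemany(
        "INSERT INTO Event VALUES (?, ?, ?, ?, ?, ?)",
        [
            (2307, 3, 2189, "004947", "Migrated", "1900-01-01 00:00:00.6780000"),
            (2308, 15, 2189, "004947", "Birthday", "2020-09-18 16:48:32.6870000"),
            (3400, 15, 2190, "006857", "Birthday", "1900-01-01 00:00:00.0000000"),
            (3401, 2, 2190, "006857", "Migrated", "2016-03-12 00:00:00.0000000"),
        ],
    )
    # Keep a type-15 event only if the same person has no event with a higher Id.
    query = """
        SELECT p.Id, p.UCNumber, p.AnotherDate
        FROM Person p
        JOIN Event e ON e.PersonId = p.Id
        WHERE e.EventTypeId = 15
          AND NOT EXISTS (SELECT 1
                          FROM Event later
                          WHERE later.PersonId = e.PersonId
                            AND later.Id > e.Id)
        ORDER BY p.Id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(people_whose_latest_event_is_type_15())
```

On the sample rows this returns only person 2189, because person 2190 has a later event (Id 3401) that is not of type 15.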

Sql server- joining with a temp table does not return the right values

I am not sure why my query isn't returning the right values when I join a temp table with an actual table. I am using SQL Server 2008 as my database.
I have the below query
WITH TradeHistory AS (
SELECT
tran_num,
version_number,
row_creation AS last_update,
tran_status,
ROW_NUMBER() OVER(PARTITION BY tran_num ORDER BY tran_num DESC, row_creation DESC, version_number DESC) AS RowNum
FROM tran_history_view (NOLOCK)
where
row_creation BETWEEN '2018-01-22 17:02:47.083' AND '2019-01-23 19:02:47.083'
AND tran_status IN (3,5)
AND update_type NOT IN (15,52,24)
AND deal_tracking_num= 10738416
)
select ab.tran_num as ab_tran_num, ab.version_number as ab_version_num, hist.version_number as hist_version_num, ab.deal_tracking_num as ab_track_num,
ab.tran_status as ab_tran_status, hist.tran_status as hist_tran_status from
TradeHistory hist join ab_tran ab on ab.tran_num = hist.tran_num
and hist.RowNum =1
I am not sure why my join does not pick the matching row for tran_num and version_num? I get a different version_number and tran_status from tran_history table. If I join on version_number across the two tables, I don't get any rows (as expected based on the values in the final output).
My 'TradeHistory' temp table returns the below values
+----------+----------------+-------------------------+-------------+--------+
| tran_num | version_number | last_update | tran_status | RowNum |
+----------+----------------+-------------------------+-------------+--------+
| 10738416 | 2 | 2019-01-23 16:02:09.760 | 3 | 1 |
| 10738416 | 1 | 2019-01-23 16:01:51.803 | 3 | 2 |
| 10738422 | 4 | 2019-01-23 16:02:30.600 | 3 | 1 |
| 10738422 | 3 | 2019-01-23 16:02:30.243 | 3 | 2 |
| 10738422 | 2 | 2019-01-23 16:02:09.973 | 3 | 3 |
+----------+----------------+-------------------------+-------------+--------+
The result of joining the select query with the temp table is shown below. The version_number and tran_status from the two tables are different and I don't understand why. Can someone please explain?
+---------------+-------------+----------------+--------------+--------------+----------------+------------------+
| hist_tran_num | ab_tran_num | ab_version_num | hist_ver_num | ab_track_num | ab_tran_status | hist_tran_status |
+---------------+-------------+----------------+--------------+--------------+----------------+------------------+
| 10738416 | 10738416 | 4 | 2 | 10738416 | 10 | 3 |
| 10738422 | 10738422 | 6 | 4 | 10738416 | 10 | 3 |
+---------------+-------------+----------------+--------------+--------------+----------------+------------------+

SQL Server: Returning rows with multiple and distinct values

I've been working on this issue for the last day and a half and just can't seem to find another question on here that works for my code.
I have a table here:
Table_D
Policynumber | EntryDate | BI_Limit | Premium
------------------------------------------------------
ABCD100001 | 5/1/16 | 15/30 | 919
ABCD100001 | 5/13/16 | 15/30 | 1008
ABCD100002 | 5/24/16 | 100/300 | 1380
ABCD100003 | 5/30/16 | 25/50 | 1452
ABCD100003 | 6/2/16 | 25/50 | 1372
ABCD100003 | 6/4/16 | 30/60 | 951
ABCD100004 | 6/11/16 | 100/300 | 1038
ABCD100005 | 6/22/16 | 100/300 | 1333
ABCD100005 | 7/2/16 | 50/100 | 1208
ABCD100006 | 7/10/16 | 250/500 | 1345
ABCD100007 | 7/18/16 | 15/30 | 996
in which I'm trying to extract rows in which a policynumber has multiple listings and a different BI_Limit. So the output should be:
Output
Policynumber | EntryDate | BI_Limit | Premium
---------------------------------------------------
ABCD100003 | 5/30/16 | 25/50 | 1452
ABCD100003 | 6/2/16 | 25/50 | 1372
ABCD100003 | 6/4/16 | 30/60 | 951
ABCD100005 | 6/22/16 | 100/300 | 1333
ABCD100005 | 7/2/16 | 50/100 | 1208
I'm storing Policynumber as VARCHAR(Max), EntryDate as DATE, BI_Limit as VARCHAR(Max), and Premium as INTEGER.
The code that I'd expect to work would be something along the lines of:
SELECT * FROM Table_D
WHERE BI_Limit IN (
SELECT BI_Limit
FROM Table_D
GROUP BY BI_Limit
HAVING COUNT(DISTINCT BI_Limit)>1);
But this returns nothing for me. Can anyone help to show me what I'm doing wrong? Thank you.
You could also try exists
select a.*
from Table_D a
where
exists (
select 1
from Table_D b
where a.Policynumber = b.Policynumber
and a.BI_Limit <> b.BI_Limit
)
SELECT d.*
FROM ( -- find the policy numbers with multiple listings and different BI_Limit values
SELECT PolicyNumber
FROM Table_D
GROUP BY PolicyNumber
HAVING count(*) > 1
AND MIN (BI_Limit) <> MAX (BI_Limit)
) m -- join back to Table_D for the other columns
INNER JOIN Table_D d
ON m.PolicyNumber = d.PolicyNumber
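To see why the query in the question returns nothing while grouping by Policynumber works, here is a small sketch on the sample rows. It assumes Python's built-in sqlite3 as a stand-in for SQL Server; the function names are made up, and the COUNT(DISTINCT BI_Limit) > 1 form is a variant of the answers above rather than a copy of either.

```python
import sqlite3

ROWS = [
    ("ABCD100001", "5/1/16", "15/30", 919),
    ("ABCD100001", "5/13/16", "15/30", 1008),
    ("ABCD100002", "5/24/16", "100/300", 1380),
    ("ABCD100003", "5/30/16", "25/50", 1452),
    ("ABCD100003", "6/2/16", "25/50", 1372),
    ("ABCD100003", "6/4/16", "30/60", 951),
    ("ABCD100004", "6/11/16", "100/300", 1038),
    ("ABCD100005", "6/22/16", "100/300", 1333),
    ("ABCD100005", "7/2/16", "50/100", 1208),
    ("ABCD100006", "7/10/16", "250/500", 1345),
    ("ABCD100007", "7/18/16", "15/30", 996),
]

def _connect():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table_D (Policynumber TEXT, EntryDate TEXT, "
        "BI_Limit TEXT, Premium INTEGER)"
    )
    conn.executemany("INSERT INTO Table_D VALUES (?, ?, ?, ?)", ROWS)
    return conn

def original_attempt():
    # Grouping by BI_Limit: each group holds exactly one distinct BI_Limit,
    # so COUNT(DISTINCT BI_Limit) is always 1 and the HAVING never passes.
    conn = _connect()
    result = conn.execute(
        """
        SELECT * FROM Table_D
        WHERE BI_Limit IN (SELECT BI_Limit FROM Table_D
                           GROUP BY BI_Limit
                           HAVING COUNT(DISTINCT BI_Limit) > 1)
        """
    ).fetchall()
    conn.close()
    return result

def policies_with_changed_limit():
    # Grouping by Policynumber: count the distinct limits per policy.
    conn = _connect()
    result = conn.execute(
        """
        SELECT d.Policynumber, d.EntryDate, d.BI_Limit, d.Premium
        FROM Table_D d
        WHERE d.Policynumber IN (SELECT Policynumber FROM Table_D
                                 GROUP BY Policynumber
                                 HAVING COUNT(DISTINCT BI_Limit) > 1)
        ORDER BY d.Policynumber, d.rowid
        """
    ).fetchall()
    conn.close()
    return result

print(original_attempt())
for row in policies_with_changed_limit():
    print(row)
```

The first query comes back empty, while the second returns the five rows for ABCD100003 and ABCD100005.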

Delete partial duplicate rows - sql

I have some troubles with deleting partial duplicate rows
The structure is like this:
+-----+--------+--+-----------+--+------+
| id | userid | | location | | week |
+-----+--------+--+-----------+--+------+
| 1 | 001 | | amsterdam | | 11 |
| 2 | 001 | | amsterdam | | 23 |
| 3 | 002 | | berlin | | 28 |
| 4 | 002 | | berlin | | 22 |
| 5 | 003 | | paris | | 19 |
| 6 | 003 | | paris | | 35 |
+-----+--------+--+-----------+--+------+
I only need to keep one row from each userid, it doesn't matter which week number it has.
Thanks,
Maxcim
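This should work across most databases:
DELETE
FROM yourTable
WHERE id <> (SELECT MIN(t.id)
FROM yourTable t
WHERE t.userid = yourTable.userid)
This query deletes from each userid group all records except the one with the lowest id in that group. The outer column is written as yourTable.userid because an unqualified userid inside the subquery would resolve to t.userid and make the condition always true. I assume that id is a unique column.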
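The per-group delete can be tried on the sample rows with a small sketch. It assumes Python's built-in sqlite3 as a stand-in database and a made-up function name; the subquery compares against the outer table's userid by qualifying it with the table name.

```python
import sqlite3

def delete_all_but_lowest_id_per_user():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (id INTEGER PRIMARY KEY, userid TEXT, "
        "location TEXT, week INTEGER)"
    )
    conn.executemany(
        "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
        [
            (1, "001", "amsterdam", 11),
            (2, "001", "amsterdam", 23),
            (3, "002", "berlin", 28),
            (4, "002", "berlin", 22),
            (5, "003", "paris", 19),
            (6, "003", "paris", 35),
        ],
    )
    # Correlated subquery: the lowest id among rows sharing the outer row's userid.
    conn.execute(
        """
        DELETE FROM yourTable
        WHERE id <> (SELECT MIN(t.id)
                     FROM yourTable t
                     WHERE t.userid = yourTable.userid)
        """
    )
    remaining = conn.execute(
        "SELECT id, userid FROM yourTable ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining

print(delete_all_but_lowest_id_per_user())
```

One row per userid survives, namely the one with the lowest id.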
Here is another method you can try.
We number the rows within each UserID and Location group, and then we delete only the rows numbered higher than 1, keeping the first one. The id column is carried into the temp table so that the delete matches individual rows rather than every row of the user.
BEGIN TRANSACTION
SELECT id, UserID, Location,
RN = ROW_NUMBER() OVER (PARTITION BY UserID, Location ORDER BY id)
INTO #test1
FROM dbo.MyTbl
DELETE MyTbl
FROM MyTbl
INNER JOIN #test1
ON #test1.id = MyTbl.id
WHERE #test1.RN > 1
IF @@ERROR <> 0 GOTO Errlbl
COMMIT TRANSACTION
RETURN
Errlbl:
ROLLBACK TRANSACTION
GO

Where to use Outer Apply

MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
I am getting the same results when LEFT JOIN and OUTER APPLY are used.
LEFT JOIN
SELECT T1.ID,T1.NAME,T2.PERIOD,T2.QTY
FROM MASTER T1
LEFT JOIN DETAILS T2 ON T1.ID=T2.ID
OUTER APPLY
SELECT T1.ID,T1.NAME,TAB.PERIOD,TAB.QTY
FROM MASTER T1
OUTER APPLY
(
SELECT ID,PERIOD,QTY
FROM DETAILS T2
WHERE T1.ID=T2.ID
)TAB
Where should I use LEFT JOIN and where should I use OUTER APPLY?
A LEFT JOIN should be replaced with OUTER APPLY in the following situations.
1. If we want to join two tables based on TOP n results
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which forms the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This gives the wrong result: the derived table returns only the two latest dates from the whole Details table, regardless of Id, even though we join on Id afterwards. The proper solution is to use OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
Here is how it works. With the LEFT JOIN, the derived table D is evaluated first, so its TOP 2 picks two rows from the whole Details table, and only then are they joined to Master. With OUTER APPLY, the condition WHERE M.ID=D.ID sits inside the applied query, so the TOP 2 is evaluated once for each Id in Master, which gives the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
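The per-row evaluation described above can be imitated outside SQL Server. The following is only a sketch of the semantics, not of how SQL Server executes APPLY: it assumes Python's built-in sqlite3 (which has no APPLY operator), runs the inner query once per Master row from a Python loop, and uses LIMIT 2 in place of TOP 2. The function name is made up.

```python
import sqlite3

def outer_apply_top2():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Master (Id INTEGER, Name TEXT)")
    conn.execute("CREATE TABLE Details (Id INTEGER, Period TEXT, Qty INTEGER)")
    conn.executemany(
        "INSERT INTO Master VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")]
    )
    conn.executemany(
        "INSERT INTO Details VALUES (?, ?, ?)",
        [
            (1, "2014-01-13", 10),
            (1, "2014-01-11", 15),
            (1, "2014-01-12", 20),
            (2, "2014-01-06", 30),
            (2, "2014-01-08", 40),
        ],
    )
    result = []
    outer_rows = conn.execute("SELECT Id, Name FROM Master ORDER BY Id").fetchall()
    for master_id, name in outer_rows:
        # Inner query evaluated once per outer row, like the applied subquery.
        inner_rows = conn.execute(
            "SELECT Period, Qty FROM Details WHERE Id = ? "
            "ORDER BY Period DESC LIMIT 2",
            (master_id,),
        ).fetchall()
        if not inner_rows:
            # OUTER APPLY keeps the outer row and fills the inner columns with NULL.
            result.append((master_id, name, None, None))
        for period, qty in inner_rows:
            result.append((master_id, name, period, qty))
    conn.close()
    return result

for row in outer_apply_top2():
    print(row)
```

Dropping the `if not inner_rows` branch would give the behaviour of CROSS APPLY instead, where Master rows without details disappear.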
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used as a replacement with LEFT JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And the function goes here.
CREATE FUNCTION FnGetQty
(
@Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=@Id
)
which generates the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
3. Retain NULL values when unpivoting
Consider you have the below table
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE AND TODATE to one column, it will eliminate NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the below result. Note that we have missed the record of Id number 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases an APPLY can be used (either CROSS APPLY or OUTER APPLY; the two are interchangeable here because the VALUES list always returns rows).
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which forms the following result and retains Id where its value is 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
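The NULL-retaining unpivot can be reproduced with a portable stand-in. This sketch assumes Python's built-in sqlite3, which has neither UNPIVOT nor APPLY, so it swaps in a plain UNION of the two date columns; UNION also removes duplicates, which plays the role of the DISTINCT above. The function name is made up.

```python
import sqlite3

def unpivot_keeping_nulls():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Id INTEGER, FromDate TEXT, ToDate TEXT)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?, ?)",
        [
            (1, "2014-01-11", "2014-01-13"),
            (1, "2014-02-23", "2014-02-27"),
            (2, "2014-05-06", "2014-05-30"),
            (3, None, None),
        ],
    )
    # Each branch contributes one of the two columns; NULL values are kept.
    result = conn.execute(
        """
        SELECT Id, FromDate AS Dates FROM MyTable
        UNION
        SELECT Id, ToDate FROM MyTable
        ORDER BY Id, Dates
        """
    ).fetchall()
    conn.close()
    return result

for row in unpivot_keeping_nulls():
    print(row)
```

Id 3 stays in the output with a single NULL date, matching the table above.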
In your example queries the results are indeed the same.
But OUTER APPLY can do more: For each outer row you can produce an arbitrary inner result set. For example you can join the TOP 1 ORDER BY ... row. A LEFT JOIN can't do that.
The computation of the inner result set can reference outer columns (like your example did).
OUTER APPLY is strictly more powerful than LEFT JOIN. This is easy to see because each LEFT JOIN can be rewritten as an OUTER APPLY, just like you did. Its syntax is more verbose, though.

Resources