Removing lines from a query - sql-server

I have the following table:
OrderID | OldOrderID | Action | EntryDate | Source
1 | NULL | Insert | 2016-01-12| A
1 | NULL | Remove | 2016-01-13| A
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
4 | 3 | Insert | 2016-01-13| C
4 | NULL | Remove | 2016-01-14| C
I want to query all orders that are currently active, that is, orders that don't have the action Remove. Currently I do it with this query:
WITH Active AS
(
SELECT *, rn = ROW_NUMBER()
OVER (PARTITION BY OrderID,Source ORDER BY EntryDate DESC)
FROM Orders
)
SELECT *
FROM Active WHERE [Action] <> 'Remove' AND rn = 1;
The problem is that some orders get child orders (OrderID 3 gets a child, OrderID 4), and if a child ever gets the action Remove the query should also ignore the parent. The current query doesn't.
In short the current query gets me this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
But I need this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
Is it possible to fix the query to get a result like this?

Try this:
;WITH CTE AS (
SELECT OrderID, OldOrderID, Action, EntryDate, Source,
COUNT(CASE WHEN Action = 'Remove' THEN 1 END)
OVER (PARTITION BY OrderID) AS IsRemoved,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY EntryDate) AS rn
FROM Orders
)
SELECT c1.*
FROM CTE AS c1
LEFT JOIN CTE AS c2 ON c1.OrderID = c2.OldOrderID AND c2.IsRemoved >= 1
WHERE c1.rn = 1 AND c1.IsRemoved = 0 AND c2.IsRemoved IS NULL
The above query uses COUNT() OVER() in order to count the number of occurrences of Action = 'Remove' within each OrderID partition. Hence, a value of IsRemoved that is equal to or greater than 1 identifies a 'removed' order.
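To see what IsRemoved looks like per order, here is a minimal sketch that runs the same window expression against the sample rows from the question. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example can be run without a server; it needs an SQLite build with window functions), and it brackets [Action] as the question's own query does.

```python
import sqlite3

# SQLite stands in for SQL Server here; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INT, OldOrderID INT, [Action] TEXT, EntryDate TEXT, Source TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "Insert", "2016-01-12", "A"),
        (1, None, "Remove", "2016-01-13", "A"),
        (2, None, "Insert", "2016-01-12", "B"),
        (3, None, "Insert", "2016-01-12", "C"),
        (4, 3, "Insert", "2016-01-13", "C"),
        (4, None, "Remove", "2016-01-14", "C"),
    ],
)

# COUNT skips the NULLs that CASE yields for non-'Remove' rows,
# so every row of an order carries that order's number of removals.
removed_counts = dict(conn.execute("""
    SELECT DISTINCT OrderID,
           COUNT(CASE WHEN [Action] = 'Remove' THEN 1 END)
               OVER (PARTITION BY OrderID) AS IsRemoved
    FROM Orders
""").fetchall())

# The full query: keep orders that were never removed and have no removed child.
active = [row[0] for row in conn.execute("""
    WITH CTE AS (
        SELECT OrderID, OldOrderID,
               COUNT(CASE WHEN [Action] = 'Remove' THEN 1 END)
                   OVER (PARTITION BY OrderID) AS IsRemoved,
               ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY EntryDate) AS rn
        FROM Orders
    )
    SELECT c1.OrderID
    FROM CTE AS c1
    LEFT JOIN CTE AS c2 ON c1.OrderID = c2.OldOrderID AND c2.IsRemoved >= 1
    WHERE c1.rn = 1 AND c1.IsRemoved = 0 AND c2.IsRemoved IS NULL
""")]

print(removed_counts)
print(active)
```

Orders 1 and 4 get a count of 1, orders 2 and 3 a count of 0. Order 3 is then dropped by the LEFT JOIN because its child (order 4) has a non-zero count, which leaves only order 2.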

I also asked the question on dba stackexchange and got the following answer, which works well.

Related

How to select all PK's (column 1) where the MAX(ISNULL(value, 0)) in column 3 grouped by a value in column 2?

I couldn't find an answer to my question, since none of the similar questions use a nullable int as the max value and return only one column.
My table is as follows:
| ContractId | ContractNumber | ContractVersion |
+------------+----------------+-----------------+
| 1 | 11 | NULL |
| 2 | 11 | 1 |
| 3 | 11 | 2 |
| 4 | 11 | 3 | --get this one
| 5 | 24 | NULL |
| 6 | 24 | 1 | --get this one
| 7 | 75 | NULL | --get this one
The first version is NULL and all following versions get a number starting with 1.
So now I only want to get the rows of the latest contracts (as shown in the comments behind the rows).
So for each ContractNumber I want to select the ContractId from the latest ContractVersion.
The MAX() function won't work since it's a nullable int.
So I was thinking to use the ISNULL(ContractVersion, 0) in combination with the MAX() function, but I wouldn't know how.
I tried the following code:
SELECT
ContractNumber,
MAX(ISNULL(ContractVersion, 0))
FROM
Contracts
GROUP BY
ContractNumber
...which returned all of the latest version numbers combined with the ContractNumber, but I need the ContractId. When I add ContractId in the SELECT and the GROUP BY, I'm getting all the versions again.
The result should be:
| ContractId |
+------------+
| 4 |
| 6 |
| 7 |
It's just a simple application of ROW_NUMBER() when you're wanting to select rows based on Min/Max:
declare @t table (ContractId int, ContractNumber int, ContractVersion int)
insert into @t(ContractId,ContractNumber,ContractVersion) values
(1,11,NULL ),
(2,11, 1 ),
(3,11, 2 ),
(4,11, 3 ),
(5,24,NULL ),
(6,24, 1 ),
(7,75,NULL )
;With Numbered as (
select *,ROW_NUMBER() OVER (
PARTITION BY ContractNumber
order by ContractVersion desc) rn
from @t
)
select
*
from
Numbered
where rn = 1
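The ROW_NUMBER() query above works because SQL Server sorts NULL lower than any value, so with ORDER BY ContractVersion DESC the NULL version ranks last within its ContractNumber. A minimal sketch that spells this out with COALESCE instead of relying on the sort order is shown here. It runs through Python's sqlite3 module purely so it can be executed without a server (SQLite is a stand-in, not SQL Server), and the table name Contracts is the one from the question.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Contracts (ContractId INT, ContractNumber INT, ContractVersion INT)"
)
conn.executemany(
    "INSERT INTO Contracts VALUES (?, ?, ?)",
    [(1, 11, None), (2, 11, 1), (3, 11, 2), (4, 11, 3),
     (5, 24, None), (6, 24, 1), (7, 75, None)],
)

# COALESCE maps the initial NULL version to 0, so it explicitly sorts
# below version 1 regardless of how the engine orders NULLs.
latest_ids = [row[0] for row in conn.execute("""
    WITH Numbered AS (
        SELECT ContractId,
               ROW_NUMBER() OVER (
                   PARTITION BY ContractNumber
                   ORDER BY COALESCE(ContractVersion, 0) DESC) AS rn
        FROM Contracts
    )
    SELECT ContractId FROM Numbered WHERE rn = 1 ORDER BY ContractId
""")]

print(latest_ids)
```

This returns ContractId 4, 6 and 7, matching the expected result in the question.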
This will work:
select ContractId, ContractNumber from (select *, rank() over (partition by
ContractNumber order by isnull(ContractVersion, 0) desc) as rnk from tablename) t
where rnk = 1;

Where clause if there are multiple of the same ID

I have following table:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SAP | Mike | 21 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I want to have one record for each ID. The idea is that if an ID occurs only once (no matter the source), that record is taken. But if there are 2 records for one ID, the one with SQL as source is used.
So, In this case, the result will be:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I did this with a partition over (ordered by Source desc), but that wouldn't work well if a third source is added one day.
Any other options/ideas?
The easiest approach (in my opinion) is using a CTE with a ranking function:
with cte as
(
select ID, source, Name, Age, ... ,
rn = row_number() over (partition by ID order by case when source = 'sql'
then 0 else 1 end asc)
from dbo.tablename
)
select ID, source, Name, Age, ...
from cte
where rn = 1
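The CASE expression in the ORDER BY is also what keeps this approach usable if a third source shows up: each source just gets its own priority number. A minimal sketch follows, under stated assumptions: the source 'CRM', the extra row for ID 3, the table name People and the priority order SQL, then SAP, then anything else are all made up for illustration and are not in the question. It runs through Python's sqlite3 module as a stand-in for SQL Server.

```python
import sqlite3

# SQLite stands in for SQL Server; People is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INT, source TEXT, Name TEXT, Age INT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        (1, "SQL", "John", 18),
        (2, "SAP", "Mike", 21),
        (2, "SQL", "Mike", 20),
        (3, "SAP", "Jill", 25),
        (3, "CRM", "Jill", 26),  # hypothetical third source, not in the question
    ],
)

# Lower number = higher priority; unknown sources fall into the ELSE branch.
picked = conn.execute("""
    WITH cte AS (
        SELECT ID, source, Name, Age,
               ROW_NUMBER() OVER (
                   PARTITION BY ID
                   ORDER BY CASE source WHEN 'SQL' THEN 0
                                        WHEN 'SAP' THEN 1
                                        ELSE 2 END) AS rn
        FROM People
    )
    SELECT ID, source, Name, Age FROM cte WHERE rn = 1 ORDER BY ID
""").fetchall()

print(picked)
```

With these rows, ID 3 keeps its SAP record because SAP outranks the assumed CRM source, while IDs 1 and 2 still resolve to their SQL records.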
You can use ROW_NUMBER:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY ID
ORDER BY CASE WHEN [Source] = 'SQL' THEN 1 ELSE 2 END)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
You can use the WITH TIES clause and the window function Row_Number()
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By ID Order By Case When Source = 'SQL' Then 0 Else 1 End)
How about
SELECT *
FROM YourTable
WHERE ID in (
SELECT ID FROM YourTable
group by ID
having count(ID) = 1)
OR source = 'SQL'

Update several columns with latest values from another table

Here's the data:
[ TABLE_1 ]
id | prod1 | date1 | prod2 | date2 | prod3 | date3 |
---|--------|--------|--------|--------|--------|-------|
1 | null | null | null | null | null | null |
2 | null | null | null | null | null | null |
3 | null | null | null | null | null | null |
[ TABLE_2 ]
id | date | product |
-----|-------------|-----------|
1 | 20140101 | X |
1 | 20140102 | Y |
1 | 20140103 | Z |
2 | 20141201 | data |
2 | 20141201 | Y |
2 | 20141201 | Z |
3 | 20150101 | data2 |
3 | 20150101 | data3 |
3 | 20160101 | X |
Both tables have other columns not listed here.
date is formatted: yyyymmdd and datatype is int.
[ TABLE_2 ] doesn't have empty rows, just tried to make sample above more readable.
Here's the Goal:
I need to update [ TABLE_1 ] prod1,date1,prod2,date2,prod3,date3
with product collected from [ TABLE_2 ] with corresponding date values.
Data must be sorted so that "latest" product becomes prod1,
2nd latest product will be prod2 and 3rd is prod3.
Latest product = biggest date (int).
If dates are equal, order doesn't matter. (see id=2 and id=3).
Updated [ TABLE_1 ] should be:
id | prod1 | date1 | prod2 | date2 | prod3 | date3 |
---|--------|----------|--------|----------|--------|----------|
1 | Z | 20140103 | Y | 20140102 | X | 20140101 |
2 | data | 20141201 | Y | 20141201 | Z | 20141201 |
3 | X | 20160101 | data2 | 20150101 | data3 | 20150101 |
Ultimate goal is to get the following :
[ TABLE_3 ]
id | order1 | order2 | order3 | + Columns from [ TABLE_1 ]
---|--------------------|----------------------|------------|--------------------------
1 | 20140103:Z | 20140102:Y | 20140101:X |
2 | 20141201:data:Y:Z | NULL | NULL |
3 | 20160101:X | 20150101:data2:data3 | NULL |
I have to admit this exceeds my knowledge and I haven't tried anything.
Should I do it with JOIN or SELECT subquery?
Should I try to make it in one SQL -clause or perhaps in 3 steps,
each prod&date -pair at the time ?
What about creating [ TABLE_3 ] ?
It has to have columns from [ TABLE_1 ].
Is it easiest to create it from [ TABLE_2 ] -data or Updated [ TABLE_1 ] ?
Any help would be highly appreciated.
Thanks in advance.
I'll post some of my own shots on comments.
After looking into it (following my comment), I think a stored procedure would be best: you can call it to view the data as a pivot and do away with TABLE_1. If this needs to be dynamic, you'll have to look into dynamic pivots. The version below is a bit of a hack with CTEs:
CREATE PROCEDURE DBO.VIEW_AS_PIVOTED_DATA
AS
;WITH CTE AS (
SELECT ID, [DATE], 'DATE' + CAST(ROW_NUMBER() OVER(PARTITION BY ID ORDER BY [DATE] DESC) AS VARCHAR) AS [RN]
FROM TABLE_2)
, CTE2 AS (
SELECT ID, PRODUCT, 'PROD' + CAST(ROW_NUMBER() OVER(PARTITION BY ID ORDER BY [DATE] DESC) AS VARCHAR) AS [RN]
FROM TABLE_2)
, CTE3 AS (
SELECT ID, [DATE1], [DATE2], [DATE3]
FROM CTE
PIVOT(MAX([DATE]) FOR RN IN ([DATE1],[DATE2],[DATE3])) PIV)
, CTE4 AS (
SELECT ID, [PROD1], [PROD2], [PROD3]
FROM CTE2
PIVOT(MAX(PRODUCT) FOR RN IN ([PROD1],[PROD2],[PROD3])) PIV)
SELECT A.ID, [PROD1], [DATE1], [PROD2], [DATE2], [PROD3], [DATE3]
FROM CTE3 AS A
JOIN CTE4 AS B
ON A.ID=B.ID
Construction:
WITH ranked AS (
SELECT [id]
,[date]
,[product]
,row_number() over (partition by id order by date desc) rn
FROM [sistemy].[dbo].[TABLE_2]
)
SELECT id, [prod1],[date1],[prod2],[date2],[prod3],[date3]
FROM
(
SELECT id, type+cast(rn as varchar(1)) col, value
FROM ranked
CROSS APPLY
(
SELECT 'date', CAST([date] AS varchar(8))
UNION ALL
SELECT 'prod', product
) ca(type, value)
) unpivoted
PIVOT
(
max(value)
for col IN ([prod1],[date1],[prod2],[date2],[prod3],[date3])
) pivoted
You need to take a few steps to achieve the goal.
Rank your products by date:
SELECT [id]
,[date]
,[product]
,row_number() over (partition by id order by date desc) rn
FROM [sistemy].[dbo].[TABLE_2]
Unpivot your date and product columns into one column. You can use either UNPIVOT or CROSS APPLY. I prefer CROSS APPLY:
SELECT id, type+cast(rn as varchar(1)) col, value
FROM ranked
CROSS APPLY
(
SELECT 'date', CAST([date] AS varchar(8))
UNION ALL
SELECT 'prod', product
) ca(type, value)
or the same result using UNPIVOT
SELECT id, type+cast(rn as varchar(1)) col, value
FROM (
SELECT [id],
rn,
CAST([date] AS varchar(500)) date,
CAST([product] AS varchar(500)) prod
FROM ranked) t
UNPIVOT
(
value FOR type IN (date, prod)
) unpvt
Finally, use PIVOT to get the result.
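If PIVOT feels heavy, the rank-then-spread idea can also be written with conditional aggregation, which is a different technique from the PIVOT and CROSS APPLY queries above but yields the same prod1/date1 ... prod3/date3 layout. A minimal sketch on the TABLE_2 sample data follows; it runs through Python's sqlite3 module because SQLite has no PIVOT and is only a stand-in for SQL Server here.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the TABLE_2 sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_2 (id INT, date INT, product TEXT)")
conn.executemany(
    "INSERT INTO TABLE_2 VALUES (?, ?, ?)",
    [
        (1, 20140101, "X"), (1, 20140102, "Y"), (1, 20140103, "Z"),
        (2, 20141201, "data"), (2, 20141201, "Y"), (2, 20141201, "Z"),
        (3, 20150101, "data2"), (3, 20150101, "data3"), (3, 20160101, "X"),
    ],
)

# Step 1: rank each id's rows by date, latest first.
# Step 2: MAX(CASE ...) picks the single row with a given rank per id.
rows = conn.execute("""
    WITH ranked AS (
        SELECT id, date, product,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
        FROM TABLE_2
    )
    SELECT id,
           MAX(CASE WHEN rn = 1 THEN product END) AS prod1,
           MAX(CASE WHEN rn = 1 THEN date END)    AS date1,
           MAX(CASE WHEN rn = 2 THEN product END) AS prod2,
           MAX(CASE WHEN rn = 2 THEN date END)    AS date2,
           MAX(CASE WHEN rn = 3 THEN product END) AS prod3,
           MAX(CASE WHEN rn = 3 THEN date END)    AS date3
    FROM ranked
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

For id 1 this gives Z, Y, X with their dates. Where dates tie (id 2, and the two older rows of id 3) the order among the tied products is arbitrary, which the question says is acceptable.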

Rebuild window function row_number in sybase

I have a problem that I could easily solve if I had window functions available in Sybase, but I don't:
Consider a table test:
+------------+----------------+-------------+
| Account_Id | Transaction_Id | CaptureDate |
+------------+----------------+-------------+
| 1 | 1 | 2014-01-01 |
| 1 | 2 | 2013-12-31 |
| 1 | 3 | 2015-07-20 |
| 2 | 1 | 2012-02-20 |
| 2 | 2 | 2010-01-10 |
| ... | ... | ... |
+------------+----------------+-------------+
I want to get a result set containing, for each account, the most recent CaptureDate with the corresponding Transaction_Id. With the window function row_number this would be easy:
select Account_Id, CaptureDate, Transaction_Id from
(select
Account_Id,
CaptureDate,
Transaction_Id,
ROW_NUMBER() OVER(partition by Account_Id order by CaptureDate desc) row
from test) tbl
where tbl.row = 1
but my Sybase version does not have this. Obviously, something like
select max(CaptureDate), max(Transaction_Id), Account_Id
from test
group by Account_Id
does not work because it does not always give me the correct Transaction_Id.
How can I do this then in Sybase and not make it terribly verbose?
Thanks!
Try below:
SELECT Account_Id, Transaction_Id, CaptureDate
FROM test a
WHERE CaptureDate = (
SELECT MAX(CaptureDate)
FROM test b
WHERE a.Account_Id = b.Account_Id
)
EDIT 1:
Duplicate CaptureDate was not in your example, so I did not take care of that scenario. Try below:
SELECT Account_Id, Transaction_Id, CaptureDate
FROM test a
WHERE CaptureDate = (
SELECT MAX(CaptureDate)
FROM test b
WHERE a.Account_Id = b.Account_Id
)
AND Transaction_Id =
(
SELECT MAX(Transaction_Id)
FROM test c
WHERE a.Account_Id = c.Account_Id
AND a.CaptureDate = c.CaptureDate
)
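To check that the second query really returns one row per account when CaptureDate is duplicated, here is a minimal sketch on the question's rows plus one extra row that creates a tie. The tied row (Account_Id 1, Transaction_Id 4) is made up for illustration, and Python's sqlite3 module stands in for Sybase only so the query can be run.

```python
import sqlite3

# SQLite stands in for Sybase; the query itself uses no window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (Account_Id INT, Transaction_Id INT, CaptureDate TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, 1, "2014-01-01"),
        (1, 2, "2013-12-31"),
        (1, 3, "2015-07-20"),
        (1, 4, "2015-07-20"),  # hypothetical duplicate of the latest date
        (2, 1, "2012-02-20"),
        (2, 2, "2010-01-10"),
    ],
)

# First subquery: keep rows on the account's latest date.
# Second subquery: among those, keep the highest Transaction_Id.
latest = conn.execute("""
    SELECT Account_Id, Transaction_Id, CaptureDate
    FROM test a
    WHERE CaptureDate = (
        SELECT MAX(CaptureDate) FROM test b
        WHERE a.Account_Id = b.Account_Id
    )
    AND Transaction_Id = (
        SELECT MAX(Transaction_Id) FROM test c
        WHERE a.Account_Id = c.Account_Id
          AND a.CaptureDate = c.CaptureDate
    )
    ORDER BY Account_Id
""").fetchall()

print(latest)
```

Account 1 has two transactions on 2015-07-20, and only the one with the higher Transaction_Id comes back, so each account still yields exactly one row.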

Efficient way to update column with arithmetic sequence of numbers after delete operation

I have a PresentationSlide table:
PresentationSlide
PresentationSlideId
PresentationId
Content
Order
and example rows:
+---------------------+----------------+---------+-------+
| PresentationSlideId | PresentationId | Content | Order |
+---------------------+----------------+---------+-------+
| 123 | 3 | "bla" | 1 |
| 23 | 3 | "bla2" | 2 |
| 22 | 3 | "bla3" | 3 |
| 100 | 3 | "bla4" | 4 |
| 150 | 3 | "bla5" | 5 |
+---------------------+----------------+---------+-------+
I want to maintain an arithmetic sequence of numbers (1, 2, 3, 4, ...) in the Order column after a DELETE operation.
For example, if I delete the third row (PresentationSlideId = 22), the values in the Order column will be (1, 2, 4, 5). I want to update Order this way:
PresentationSlideId = 100: update order from 4 to 3
PresentationSlideId = 150: update order from 5 to 4
What is the most efficient way to do this kind of update? Is there any way to do it with only one UPDATE statement? I could do this using a cursor and a loop, but that doesn't seem efficient.
1) Order is a very poor name for a column, since it's an SQL Keyword
2) It would be a lot better if you could cope with gaps in the order (and possibly switch to using a float, so you can insert fractional values), because in your current model, every insert, update or delete is potentially going to affect the entire table. This doesn't scale well. Computing an order using ROW_NUMBER() during selects would generally be better.
3)
create table #PresentationSlide (
PresentationSlideID int not null,
PresentationId int not null,
Content varchar(10) not null,
[Order] int not null
)
insert into #PresentationSlide (PresentationSlideId , PresentationId , Content , [Order])
select 123,3,'bla',1 union all
select 23,3,'bla2',2 union all
select 22,3,'bla3',3 union all
select 100,3,'bla4',4 union all
select 150,3,'bla5',5
delete from #PresentationSlide where PresentationSlideId = 22
;With Reorder as (select PresentationSlideId,ROW_NUMBER() OVER (ORDER BY [Order]) as NewOrder from #PresentationSlide)
update ps set [Order] = NewOrder
from #PresentationSlide ps inner join Reorder r on ps.PresentationSlideId = r.PresentationSlideId
select * from #PresentationSlide order by [Order]
drop table #PresentationSlide
;with C as
(
select [Order],
row_number() over(order by [Order]) as rn
from PresentationSlide
)
update C set
[Order] = rn
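Both ROW_NUMBER() updates above renumber every row. When exactly one slide is deleted and the sequence was contiguous beforehand, a cheaper alternative (a different technique, not the one used in the answers) is to decrement only the rows that came after the deleted slide in the same presentation. A minimal sketch follows; the helper delete_slide is hypothetical, and Python's sqlite3 module stands in for SQL Server so the example can be run.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PresentationSlide ("
    "PresentationSlideId INT PRIMARY KEY, PresentationId INT, Content TEXT, [Order] INT)"
)
conn.executemany(
    "INSERT INTO PresentationSlide VALUES (?, ?, ?, ?)",
    [(123, 3, "bla", 1), (23, 3, "bla2", 2), (22, 3, "bla3", 3),
     (100, 3, "bla4", 4), (150, 3, "bla5", 5)],
)


def delete_slide(conn, slide_id):
    """Hypothetical helper: delete one slide and close the gap it leaves."""
    row = conn.execute(
        "SELECT PresentationId, [Order] FROM PresentationSlide "
        "WHERE PresentationSlideId = ?",
        (slide_id,),
    ).fetchone()
    if row is None:
        return  # nothing to delete
    presentation_id, deleted_order = row
    with conn:  # commit both statements together, roll back on error
        conn.execute(
            "DELETE FROM PresentationSlide WHERE PresentationSlideId = ?",
            (slide_id,),
        )
        # Assumes Order had no gaps before the delete: shift later slides up by one.
        conn.execute(
            "UPDATE PresentationSlide SET [Order] = [Order] - 1 "
            "WHERE PresentationId = ? AND [Order] > ?",
            (presentation_id, deleted_order),
        )


delete_slide(conn, 22)
orders = conn.execute(
    "SELECT PresentationSlideId, [Order] FROM PresentationSlide ORDER BY [Order]"
).fetchall()
print(orders)
```

Slides 100 and 150 move from 4 and 5 to 3 and 4, and slides of other presentations are never touched because of the PresentationId filter. If gaps may already exist, the ROW_NUMBER() renumbering is the safer choice.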
