SQL Server: join on derived table that contains WITH clause?

I'd like to join on a subquery / derived table that contains a WITH clause (the WITH clause is necessary to filter on ROW_NUMBER() = 1). In Teradata something similar would work fine, but Teradata uses QUALIFY ROW_NUMBER() = 1 instead of a WITH clause.
Here is my attempt at this join:
-- want to join row with max StartDate on JobModelID
INNER JOIN (
WITH AllRuns AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY JobModelID ORDER BY StartDate DESC) AS RowNumber
FROM Runs
)
SELECT * FROM AllRuns WHERE RowNumber = 1
) Runs
ON JobModels.JobModelID = Runs.JobModelID
What am I doing wrong?

You could use multiple WITH clauses. Something like
;WITH AllRuns AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY JobModelID ORDER BY StartDate DESC) AS RowNumber
FROM Runs
),
Runs AS(
SELECT *
FROM AllRuns
WHERE RowNumber = 1
)
SELECT *
FROM ... INNER JOIN
Runs ON JobModels.JobModelID = Runs.JobModelID
For more detail on the usages/structure/rules see WITH common_table_expression (Transact-SQL)
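To make the chained-CTE approach concrete, here is a minimal self-contained sketch. Only JobModelID and StartDate come from the question; the table variables, the ModelName and RunID columns, the LatestRuns name and the sample rows are assumptions made up for illustration.

```sql
-- Hypothetical sample data; only JobModelID and StartDate appear in the question.
DECLARE @JobModels TABLE (JobModelID int PRIMARY KEY, ModelName varchar(20));
DECLARE @Runs TABLE (RunID int PRIMARY KEY, JobModelID int, StartDate datetime);

INSERT INTO @JobModels VALUES (1, 'Nightly'), (2, 'Weekly');
INSERT INTO @Runs VALUES
    (10, 1, '20200101'),
    (11, 1, '20200201'),
    (12, 2, '20200115');

-- The WITH clause sits at the start of the statement, not inside the join.
WITH AllRuns AS (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY r.JobModelID ORDER BY r.StartDate DESC) AS RowNumber
    FROM @Runs AS r
),
LatestRuns AS (
    -- Keep only the most recent run per JobModelID.
    SELECT * FROM AllRuns WHERE RowNumber = 1
)
SELECT jm.JobModelID, jm.ModelName, lr.RunID, lr.StartDate
FROM @JobModels AS jm
INNER JOIN LatestRuns AS lr ON jm.JobModelID = lr.JobModelID;
```

With this sample data, job model 1 pairs with run 11 and job model 2 with run 12. Giving the second CTE its own name (LatestRuns here) rather than reusing Runs avoids any confusion between the CTE and the base table.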

Adding a join condition is probably less efficient, but usually works fine for me.
INNER JOIN (
SELECT *,
ROW_NUMBER() OVER
(PARTITION BY JobModelID
ORDER BY StartDate DESC) AS RowNumber
FROM Runs
) Runs
ON JobModels.JobModelID = Runs.JobModelID
AND Runs.RowNumber = 1

Related

How to select a specific range from first row with each id MSSQL

I have created a table in HTML that displays the first row for each id and lets the user select the specific range they want to see, but I'm not sure how to write the query.
WITH results AS
(
SELECT i.*,
ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY i.ID DESC) AS [RN]
FROM HolidayList AS i
INNER JOIN (
SELECT ID, MIN(CreateDate) AS CreateDate FROM HolidayList GROUP BY ID
)
AS j ON i.ID = j.ID AND i.CreateDate = j.CreateDate
)
SELECT * FROM results WHERE [RN] = 1;
UPDATED
I have tried adding a second ROW_NUMBER(), but what I get back is row 10 and row 25, which happen to be the first rows of some specific ids.
WITH results AS
(
SELECT i.*,
ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY i.ID DESC) AS [RN],
ROW_NUMBER() OVER (ORDER BY i.ID) AS [R]
FROM HolidayList AS i
INNER JOIN (
SELECT ID, MIN(CreateDate) AS CreateDate FROM HolidayList GROUP BY ID
)
AS j ON i.ID = j.ID AND i.CreateDate = j.CreateDate
)
SELECT * FROM results WHERE [RN] = 1 AND BETWEEN [R]>10 AND [R]<25
What I am really looking for is a way to select a specific range from the set of first rows for each id.
FINAL
Thanks to McGlothlin, I finally solved it. What I needed was a second CTE built on top of the first.
WITH First_CTE AS
(
SELECT i.*,
ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY i.ID DESC) AS [RN]
FROM HolidayList AS i
INNER JOIN (
SELECT ID, MIN(CreateDate) AS CreateDate FROM HolidayList GROUP BY ID
)
AS j ON i.ID = j.ID AND i.CreateDate = j.CreateDate
),
results AS
(
SELECT k.*,
ROW_NUMBER() OVER (ORDER BY k.ID DESC) AS [R]
FROM First_CTE AS k WHERE k.[RN] = 1
)
SELECT * FROM results WHERE [R] > 10 AND [R] < 25;

The problem lies in this line:
ROW_NUMBER() OVER I.ID AS [R]
I think what you meant was:
ROW_NUMBER() OVER (ORDER BY I.ID) AS [R]
This is assuming you want to give it a row count over every row that is returned in the CTE. The ROW_NUMBER() function requires an ORDER BY to be specified, and the OVER clause would still require parentheses if you were using a function like SUM that could be used without an ORDER BY.
Edit: Based on your comment, it sounds like what you're looking for is an OFFSET. In that case you would remove the second ROW_NUMBER and include something like this:
ORDER BY ID OFFSET 10 ROWS FETCH FIRST 15 ROWS ONLY
This returns rows 11 to 25, sorted by ID.
Since the OFFSET syntax doesn't work in SQL Server 2008, you should do something like this as your end result:
WITH results AS
(
SELECT i.*,
ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY i.ID DESC) AS [RN],
ROW_NUMBER() OVER (ORDER BY i.ID) AS [R]
FROM HolidayList AS i
INNER JOIN (
SELECT ID, MIN(CreateDate) AS CreateDate FROM HolidayList GROUP BY ID
)
AS j ON i.ID = j.ID AND i.CreateDate = j.CreateDate
)
SELECT *
FROM results
WHERE [RN] = 1
AND [R] BETWEEN 11 AND 25
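On SQL Server 2012 or later, the OFFSET approach mentioned above could be sketched as follows. This is only an illustration under assumptions: the table variable and sample rows are invented, the page is shrunk to "skip 2, take 3" so the data stays small, and the first row per ID is picked with ROW_NUMBER() ordered by CreateDate instead of the join on MIN(CreateDate) used in the question.

```sql
-- Hypothetical sample: six IDs, each with two CreateDate values.
DECLARE @HolidayList TABLE (ID int, CreateDate datetime);

INSERT INTO @HolidayList (ID, CreateDate)
SELECT n.ID, d.CreateDate
FROM (VALUES (1), (2), (3), (4), (5), (6)) AS n(ID)
CROSS JOIN (VALUES ('20200101'), ('20200201')) AS d(CreateDate);

WITH FirstRows AS (
    -- RN = 1 marks the earliest row for each ID.
    SELECT h.ID, h.CreateDate,
           ROW_NUMBER() OVER (PARTITION BY h.ID ORDER BY h.CreateDate) AS RN
    FROM @HolidayList AS h
)
SELECT ID, CreateDate
FROM FirstRows
WHERE RN = 1
ORDER BY ID
OFFSET 2 ROWS FETCH NEXT 3 ROWS ONLY;  -- skip the first 2 IDs, return the next 3
```

With this sample data the query returns the earliest row for IDs 3, 4 and 5. Because the paging is applied after the RN = 1 filter, the range counts first rows only, which is what the question asks for.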

MSSQL DISTINCT again

This query works perfectly on MySQL, but I need to rewrite it for MSSQL, and there it doesn't work:
SELECT DISTINCT TOP 20 [UF].[id], [UF].[created], [Company].[name]
FROM [user_functions] AS [UF]
LEFT JOIN [companies] AS [Company] ON ([Company].[code] = [UF].[company_code])
WHERE [UF].[user_id] = 8923 AND [UF].[state] != 500
ORDER BY [UF].[created] DESC
This query returns duplicated rows, even though I set DISTINCT.
But when I remove [Company].[name] from the SELECT, it returns the correct rows.
I would like to use many fields from the [Company] and [UF] tables.

You can try row_number and get the first rownum as below:
select * from
(
SELECT DISTINCT TOP 20 [UF].[id], [UF].[created], [Company].[name],
RowNum = row_number() over(partition by [Company].[name] order by [UF].[ID])
FROM [user_functions] AS [UF]
LEFT JOIN [companies] AS [Company] ON ([Company].[code] = [UF].[company_code])
WHERE [UF].[user_id] = 8923 AND [UF].[state] != 500
) a where RowNum = 1
order by a.Created Desc

SQL select the row with max value using row_number() or rank()

I have data of following kind:
RowId  Name  Value
1      s1    12
22     s1    3
13     s1    4
10     s2    14
22     s2    5
3      s2    100
I want to have the following output:
RowId  Name  Value
1      s1    12
3      s2    100
I am currently using temp tables to get this in two step. I have been trying to use row_number() and rank() functions but have not been successful.
Can someone please help me with syntax as I feel row_number() and rank() will make it cleaner?
Edit:
I changed the rowId to make it a general case
Edit:
I am open to ideas better than row_number() and rank() if there are any.

If you use rank() you can get multiple results when a name has more than 1 row with the same max value. If that is what you are wanting, then switch row_number() to rank() in the following examples.
For the highest value per name (top 1 per group), using row_number()
select sub.RowId, sub.Name, sub.Value
from (
select *
, rn = row_number() over (
partition by Name
order by Value desc
)
from t
) as sub
where sub.rn = 1
I cannot say that there are any 'better' alternatives, but there are alternatives. Performance may vary.
cross apply version:
select distinct
x.RowId
, t.Name
, x.Value
from t
cross apply (
select top 1
*
from t as i
where i.Name = t.Name
order by i.Value desc
) as x;
top with ties using row_number() version:
select top 1 with ties
*
from t
order by
row_number() over (
partition by Name
order by Value desc
)
This inner join version has the same issue as using rank() instead of row_number() in that you can get multiple results for the same name if a name has more than one row with the same max value.
inner join version:
select t.*
from t
inner join (
select MaxValue = max(value), Name
from t
group by Name
) as m
on t.Name = m.Name
and t.Value = m.MaxValue;
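The tie behaviour described above is easier to see side by side. Here is a small sketch, assuming a made-up table variable in which name s1 has two rows sharing its maximum value. The sample rows are hypothetical and are not the data from the question.

```sql
-- Hypothetical data: s1 has two rows tied at its maximum Value of 12.
DECLARE @t TABLE (RowId int, Name varchar(10), Value int);

INSERT INTO @t VALUES
    (1,  's1', 12),
    (22, 's1', 12),
    (13, 's1', 4),
    (10, 's2', 14),
    (3,  's2', 100);

SELECT RowId, Name, Value,
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Value DESC) AS rn,
       RANK()       OVER (PARTITION BY Name ORDER BY Value DESC) AS rnk
FROM @t
ORDER BY Name, Value DESC, RowId;
```

Both tied s1 rows get rnk = 1, so filtering on rnk = 1 returns two rows for s1, while only one of them gets rn = 1. Which of the tied rows receives rn = 1 is not guaranteed unless a tiebreaker such as RowId is added to the ORDER BY inside the OVER clause.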

If you really want to use ROW_NUMBER() you can do it this way:
With Cte As
(
Select *,
Row_Number() Over (Partition By Name Order By Value Desc) RN
From YourTable
)
Select RowId, Name, Value
From Cte
Where RN = 1;

Unless I'm missing something... Why use row_number() or rank?
select rowid, name, max(value) as value
from table
group by rowid, name

Deleting duplicates in a time series

I have a large set of measurements, taken every 1 millisecond, stored in a SQL Server 2012 table. Whenever 3 or more consecutive rows have the same value, I would like to delete the middle duplicates. The highlighted values in this image of sample data are the ones that I want to delete. Is there a way to do this with a SQL query?

You can do this using a CTE and ROW_NUMBER:
WITH CteGroup AS(
SELECT *,
grp = ROW_NUMBER() OVER(ORDER BY MS) - ROW_NUMBER() OVER(PARTITION BY Value ORDER BY MS)
FROM YourTable
),
CteFinal AS(
SELECT *,
RN_FIRST = ROW_NUMBER() OVER(PARTITION BY grp, Value ORDER BY MS),
RN_LAST = ROW_NUMBER() OVER(PARTITION BY grp, Value ORDER BY MS DESC)
FROM CteGroup
)
DELETE
FROM CteFinal
WHERE
RN_FIRST > 1
AND RN_LAST > 1
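To see which rows the query above would remove before running a DELETE, the same two CTEs can be pointed at a small sample and turned into a SELECT. This is a sketch under assumptions: the @Readings table variable, its rows and the WouldDelete column are invented for illustration.

```sql
-- Hypothetical readings: a run of four 5s, a run of two 7s, then a run of two 5s.
DECLARE @Readings TABLE (MS int PRIMARY KEY, Value int);

INSERT INTO @Readings VALUES
    (1, 5), (2, 5), (3, 5), (4, 5),
    (5, 7), (6, 7),
    (7, 5), (8, 5);

WITH CteGroup AS (
    -- The difference of the two row numbers is constant within each
    -- consecutive run of the same Value.
    SELECT MS, Value,
           ROW_NUMBER() OVER (ORDER BY MS)
         - ROW_NUMBER() OVER (PARTITION BY Value ORDER BY MS) AS grp
    FROM @Readings
),
CteFinal AS (
    -- Number each run from both ends so the first and last rows can be kept.
    SELECT MS, Value, grp,
           ROW_NUMBER() OVER (PARTITION BY grp, Value ORDER BY MS)      AS RN_FIRST,
           ROW_NUMBER() OVER (PARTITION BY grp, Value ORDER BY MS DESC) AS RN_LAST
    FROM CteGroup
)
SELECT MS, Value, grp,
       CASE WHEN RN_FIRST > 1 AND RN_LAST > 1 THEN 1 ELSE 0 END AS WouldDelete
FROM CteFinal
ORDER BY MS;
```

With this sample only MS 2 and 3 are flagged: they are the interior rows of the four-row run. The two-row runs have no interior rows, so nothing in them is flagged.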

I'm sure there must be a more efficient way to do this, but you could join the table to itself twice to find the previous and next value in the list, and then delete all of the entries where all three values are the same.
DELETE FROM tbl
WHERE ms IN
(
SELECT T.ms
FROM tbl T
INNER JOIN tbl T1 ON T.ms = T1.ms + 1
INNER JOIN tbl T2 ON T.ms = T2.ms - 1
WHERE T.value = T1.value AND T.value = T2.value
)
If the table is really big, I can see this blowing tempdb though.

Yes there is
select * from table group by table.field ->value

RANK() Over Partition BY not working

When I run the code below the ROWID is always 1.
I need the ID to start at 1 for each item with the same Credit Value.
;WITH CTETotal AS (SELECT
TranRegion
,TranCustomer
,TranDocNo
,SUM(TranSale) 'CreditValue'
FROM dbo.Transactions
LEFT JOIN customers AS C
ON custregion = tranregion
AND custnumber = trancustomer
LEFT JOIN products AS P
ON prodcode = tranprodcode
GROUP BY
TranRegion
,TranCustomer
,TranDocNo)
SELECT
r.RegionDesc
,suppcodedesc
,t.tranreason as [Reason]
,t.trandocno as [Document Number]
,sum(tranqty) as Qty
,sum(tranmass) as Mass
,sum(transale) as Sale
,cte.CreditValue AS 'Credit Value'
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
FROM transactions t
LEFT JOIN dbo.Regions AS r
ON r.RegionCode = TranRegion
LEFT JOIN CTETotal AS cte
ON cte.TranRegion = t.TranRegion
AND cte.TranCustomer = t.TranCustomer
AND cte.TranDocNo = t.TranDocNo
GROUP BY
r.RegionDesc
,suppcodedesc
,t.tranreason
,t.trandocno
,cte.CreditValue
ORDER BY CreditValue ASC
EDIT
All the credit values with 400 must have the ROWID set to 1. And all the credit values with 200 must have the ROWID set to 2. And so on and so on.

Do you need something like this?
with cte (item,CreditValue)
as
(
select 'a',8 as CreditValue union all
select 'b',18 union all
select 'a',8 union all
select 'b',18 union all
select 'a',8
)
select CreditValue,dense_rank() OVER (ORDER BY CreditValue)AS ROWID from cte
Result
CreditValue ROWID
----------- --------------------
8 1
8 1
8 1
18 2
18 2
In your code replace
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
by
,DENSE_RANK() OVER (ORDER BY cte.CreditValue)AS ROWID
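The reason the original expression always returns 1 is that every row inside a partition shares the same CreditValue, so all rows tie on the ORDER BY key. Here is a small sketch comparing the variants, assuming an invented @Credits table variable and sample values. The DESC ordering is also an assumption, chosen so that 400 gets 1 and 200 gets 2 as in the question's edit; the replacement above sorts ascending.

```sql
-- Hypothetical credit totals per document.
DECLARE @Credits TABLE (DocNo int PRIMARY KEY, CreditValue int);

INSERT INTO @Credits VALUES (1, 400), (2, 400), (3, 200), (4, 200), (5, 100);

SELECT DocNo, CreditValue,
       -- Partitioning and ordering by the same column: every row ties, so always 1.
       RANK()       OVER (PARTITION BY CreditValue ORDER BY CreditValue) AS AlwaysOne,
       -- RANK leaves gaps after ties: 1, 1, 3, 3, 5.
       RANK()       OVER (ORDER BY CreditValue DESC) AS RankWithGaps,
       -- DENSE_RANK numbers the distinct values consecutively: 1, 1, 2, 2, 3.
       DENSE_RANK() OVER (ORDER BY CreditValue DESC) AS DenseRankDesc
FROM @Credits
ORDER BY CreditValue DESC, DocNo;
```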

You don't need PARTITION BY at all; just use DENSE_RANK() OVER (ORDER BY cte.CreditValue)

I think the problem is with the RANK() OVER (PARTITION BY ...) clause:
you have to partition by item, not by CreditValue.

Try this
RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.RegionDesc)AS ROWID

Edit: The issue here isn't actually the nesting of the subquery. It is more likely that the PARTITION BY columns make each row unique within its partition, so every rank comes out as 1.
Rather than ranking within your complex query like this
select
rank() over(partition by...),
*
from
data_source
join table1
join table2
join table3
join table4
order by
some_column
Try rank() or row_number() on the resulting data set, not within it.
For example, using the query above, remove rank() and implement it this way:
select
rank() over(partition by...),
results.*
from (
select
*
from
data_source
join table1
join table2
join table3
join table4
) as results
order by
some_column
