Finding the max and min date values in SQL Server tables - sql-server

I have two tables:
A lookup table (tabOne):
KEY | Group | Name | Desc | Val_Key
----------------------------------------
1 | a | NameA | DescA | 10
2 | b | NameB | DescB | 20
3 | c | NameC | DescC | 30
4 | d | NameD | DescD | 40
5 | e | NameE | DescE | 50
6 | f | NameF | DescF | 60
A second table containing readings (tabTwo):
KEY | Date | Reading | Val_Key
----------------------------------------
1 | Date | Read | 10
2 | Date | Read | 20
3 | Date | Read | 40
4 | Date | Read | 40
5 | Date | Read | 30
6 | Date | Read | 20
7 | Date | Read | 40
8 | Date | Read | 20
9 | Date | Read | 10
10 | Date | Read | 20
11 | Date | Read | 50
12 | Date | Read | 60
What I need to do is join tabTwo with TabOne and create a column with the newest Reading and a column with the oldest reading for each item in the group column of TabOne.
At the end of the day I want a table that look as follow:
KEY | Group | Name | Desc | Val_Key | LastReading | FirstReading |
-------------------------------------------------------------------------
1 | a | NameA | DescA | 10 | | |
2 | b | NameB | DescB | 20 | | |
3 | c | NameC | DescC | 30 | | |
4 | d | NameD | DescD | 40 | | |
5 | e | NameE | DescE | 50 | | |
6 | f | NameF | DescF | 60 | | |
Thanks!
Freddie

If this is Sql Server 2005 or newer, outer apply will help:
select TabOne.*,
last.Reading LastReading,
first.Reading FirstReading
from TabOne
outer apply
(
select top 1
Reading
from TabTwo
where TabTwo.Val_Key = TabOne.val_Key
order by TabTwo.Date desc
) last
outer apply
(
select top 1
Reading
from TabTwo
where TabTwo.Val_Key = TabOne.val_Key
order by TabTwo.Date asc
) first
Live test is # Sql Fiddle.

#Nikola Markovinović's solution can be made more universally applicable if the subqueries are moved directly to the main query's SELECT clause, which is possible each of them retrieves only one value and is, therefore, valid as a scalar expression:
SELECT
t1.[KEY],
t1.[Group],
t1.Name,
t1.[Desc],
t1.Val_Key,
(
SELECT TOP 1 Reading
FROM TabTwo
WHERE Val_Key = t1.Val_Key
ORDER BY Date DESC
) AS LastReading,
(
SELECT TOP 1 Reading
FROM TabTwo
WHERE Val_Key = t1.Val_Key
ORDER BY Date ASC
) AS FirstReading
FROM TabOne t1
If you needed e.g. dates along the way, you would probably have to stick to Nikola's solution. There is an alternative to it, but it's more cumbersome (albeit more standard too): it would involve grouping TabTwo's data by Val_Key to get earliest/latest dates per Val_Key, then joining back to TabTwo to access entire rows corresponding to the found dates to finally pull the necessary columns, and ultimately joining both result sets to TabOne to get the final column set.

Related

What's an efficient way to count "previous" rows in SQL?

Hard to phrase the title for this one.
I have a table of data which contains a row per invoice. For example:
| Invoice ID | Customer Key | Date | Value | Something |
| ---------- | ------------ | ---------- | ------| --------- |
| 1 | A | 08/02/2019 | 100 | 1 |
| 2 | B | 07/02/2019 | 14 | 0 |
| 3 | A | 06/02/2019 | 234 | 1 |
| 4 | A | 05/02/2019 | 74 | 1 |
| 5 | B | 04/02/2019 | 11 | 1 |
| 6 | A | 03/02/2019 | 12 | 0 |
I need to add another column that counts the number of previous rows per CustomerKey, but only if "Something" is equal to 1, so that it returns this:
| Invoice ID | Customer Key | Date | Value | Something | Count |
| ---------- | ------------ | ---------- | ------| --------- | ----- |
| 1 | A | 08/02/2019 | 100 | 1 | 2 |
| 2 | B | 07/02/2019 | 14 | 0 | 1 |
| 3 | A | 06/02/2019 | 234 | 1 | 1 |
| 4 | A | 05/02/2019 | 74 | 1 | 0 |
| 5 | B | 04/02/2019 | 11 | 1 | 0 |
| 6 | A | 03/02/2019 | 12 | 0 | 0 |
I know I can do this using either a CTE like this...
(
select
count(*)
from table
where
[Customer Key] = t.[Customer Key]
and [Date] < t.[Date]
and Something = 1
)
But I have a lot of data and that's pretty slow. I know I can also use cross apply to achieve the same thing, but as far as I can tell that's not any better performing than just using a CTE.
So; is there a more efficient means of achieving this, or do I just suck it up?
EDIT: I originally posted this without the requirement that only rows where Something = 1 are counted. Mea culpa - I asked it in a hurry. Unfortunately I think that this means I can't use row_number() over (partition by [Customer Key])
Assuming you're using SQL Server 2012+ you can use Window Functions:
COUNT(CASE WHEN Something = 1 THEN CustomerKey END) OVER (PARTITION BY CustomerKey ORDER BY [Date]
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -1 AS [Count]
Old answer before new required logic:
COUNT(CustomerKey) OVER (PARTITION BY CustomerKey ORDER BY [Date]
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -1 AS [Count]
If you're not using 2012 an alternative is to use ROW_NUMBER
ROW_NUMBER() OVER (PARTITION BY CustomerKey ORDER BY [Date]) - 1 AS Count

Selecting grouped rows after first two rows SQL Server

This is a bit of a tricky question/situation and my search fu failed me.
Lets say i have the following data
| UID | SharedID | Type | Date |
|-----|----------|------|-----------|
| 1 | 1 | foo | 2/4/2016 |
| 2 | 1 | foo | 2/5/2016 |
| 3 | 1 | foo | 2/8/2016 |
| 4 | 1 | foo | 2/11/2016 |
| 5 | 2 | bar | 1/11/2016 |
| 6 | 2 | bar | 2/11/2016 |
| 7 | 3 | baz | 2/1/2016 |
| 8 | 3 | baz | 2/3/2016 |
| 9 | 3 | baz | 2/11/2016 |
And I would like to ommit a variable number of leading rows (most recent date in this case) and lets say that number is 2 in this example. The resulting table would be something like this:
| UID | SharedID | Type | Date |
|-----|----------|------|-----------|
| 1 | 1 | foo | 2/4/2016 |
| 2 | 1 | foo | 2/5/2016 |
| 7 | 3 | baz | 2/1/2016 |
Is this possible in SQL? Essentially I want to filter on an unknown number of rows which uses the date column as the order by. The goal is to get the oldest types and get a list of UID's in the process.
Sure, it's possible. Use a ROW_NUMBER function to assign a value to each row, partitioning by the SharedID column so that the count restarts every time that ID changes, and select those rows with a value greater than your limit.
WITH cteNumberedRows AS (
SELECT UID, SharedID, Type, Date,
ROW_NUMBER() OVER(PARTITION BY SharedID ORDER BY Date DESC) AS RowNum
FROM YourTable
)
SELECT UID, SharedID, Type, Date
FROM cteNumberedRows
WHERE RowNum > 2;
Not sure if I understand what you mean but something like this?
SELECT * FROM MyTable t1 JOIN MyTable T2 ON t2.id NOT IN (
SELECT TOP 2 UID FROM myTable
WHERE SharedID = t1.sharedID
ORDER BY [Date] DESC
)

Group by Date bigger than SQL Server

I'm working with Micrososft SqlServer 2012 and I have this table:
Table
+-----------+--------+------------+
| Id_Client | Amount | Date |
+-----------+--------+------------+
| 1 | 100 | 24/08/2015 |
| 2 | 100 | 24/07/2015 |
| 3 | 100 | 24/06/2015 |
| 3 | 100 | 24/05/2015 |
+-----------+--------+------------+
And I need to make a query like this:
Query
SELECT ID_CLIENT,
CASE WHEN DATE <= '01/07/2015' THEN 'OLD' ELSE 'NEW' END,
SUM(AMOUNT) FROM TABLE
GROUP BY ID_CLIENT
How do I Group by Date with the condition, instead of each Date?
I expect something like:
Expected result
+---+-----+-----+
| 1 | NEW | 100 |
| 2 | NEW | 100 |
| 3 | OLD | 200 |
+---+-----+-----+

Where to use Outer Apply

MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
I am getting the same results when LEFT JOIN and OUTER APPLY is used.
LEFT JOIN
SELECT T1.ID,T1.NAME,T2.PERIOD,T2.QTY
FROM MASTER T1
LEFT JOIN DETAILS T2 ON T1.ID=T2.ID
OUTER APPLY
SELECT T1.ID,T1.NAME,TAB.PERIOD,TAB.QTY
FROM MASTER T1
OUTER APPLY
(
SELECT ID,PERIOD,QTY
FROM DETAILS T2
WHERE T1.ID=T2.ID
)TAB
Where should I use LEFT JOIN AND where should I use OUTER APPLY
A LEFT JOIN should be replaced with OUTER APPLY in the following situations.
1. If we want to join two tables based on TOP n results
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which forms the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This will bring wrong results ie, it will bring only latest two dates data from Details table irrespective of Id even though we join with Id. So the proper solution is using OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
Here is the working : In LEFT JOIN , TOP 2 dates will be joined to the MASTER only after executing the query inside derived table D. In OUTER APPLY, it uses joining WHERE M.ID=D.ID inside the OUTER APPLY, so that each ID in Master will be joined with TOP 2 dates which will bring the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used as a replacement with LEFT JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And the function goes here.
CREATE FUNCTION FnGetQty
(
#Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=#Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
3. Retain NULL values when unpivoting
Consider you have the below table
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE AND TODATE to one column, it will eliminate NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the below result. Note that we have missed the record of Id number 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases an APPLY can be used(either CROSS APPLY or OUTER APPLY, which is interchangeable).
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which forms the following result and retains Id where its value is 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
In your example queries the results are indeed the same.
But OUTER APPLY can do more: For each outer row you can produce an arbitrary inner result set. For example you can join the TOP 1 ORDER BY ... row. A LEFT JOIN can't do that.
The computation of the inner result set can reference outer columns (like your example did).
OUTER APPLY is strictly more powerful than LEFT JOIN. This is easy to see because each LEFT JOIN can be rewritten to an OUTER APPLY just like you did. It's syntax is more verbose, though.

SQL Server. Join two tables n a view, take rows from one and turn into columns [duplicate]

This question already has answers here:
Efficiently convert rows to columns in sql server
(5 answers)
Closed 8 years ago.
I'm pretty new to SQL Server so don't really know what I'm doing with this. I have two tables, which might look like this:
table 1
| ID | customer | Date |
| 1 | company1 | 01/08/2014 |
| 2 | company2 | 10/08/2014 |
| 3 | company3 | 25/08/2014 |
table 2
| ID | Status | Days |
| 1 | New | 6 |
| 1 | In Work | 25 |
| 2 | New | 17 |
| 3 | New | 14 |
| 3 | In Work | 72 |
| 3 | Complete | 25 |
What I need to do is join based on the ID, and create new columns to show how long each ID has been in each status. Every time an order goes to a new status, a new line is added and the number of days is counted as in the 2nd table above. What I need to create from this, should look like this:
| ID | customer | Date | New | In Work | Complete |
| 1 | company1 | 01/08/2014 | 6 | 25 | |
| 2 | company2 | 10/08/2014 | 17 | | |
| 3 | company3 | 25/08/2014 | 14 | 72 | 25 |
So what do I need to to to create this?
Thanks for any help, as I say I'm pretty new to this.
I would suggest that AHiggins' link is a better candidate to mark this as a dupe rather than the one that's actually been selected because his link involves a join.
WITH [TimeTable] AS (
SELECT
T1.ID,
T1.[Date],
T2.[Status] AS [Status],
T2.[Days]
FROM
dbo.Table1 T1
INNER JOIN dbo.Table2 T2
ON T2.ID = T1.ID
)
SELECT *
FROM
[TimeTable]
PIVOT (MAX([Days]) FOR [Status] IN ([New], [Complete], [In Work])) TT
;

Resources