How to join columns from the same table? - sql-server

How can I add an extra column with the time difference between the GO = 'SO' and GO = 'ZW' rows for the same SALESID?
SELECT SALESID, [DATETIME] AS [Time], [GO]
FROM [Mer_PRD].[dbo].[TRACKINGTABLE]
WHERE [GO] IN ('ZW', 'SO')
Result example:
SALESID | TIME SO | TIME ZW | DIFF
ZS/0033428/2020 | 2020-07-16 08:37:00 | 2020-07-16 08:40:00 | 00:03:00

You could use a self join: join the table to itself on SALESID, taking the 'SO' row from one alias and the 'ZW' row from the other.
SELECT goSo.SALESID
, goSo.[Time] AS [TIME SO]
, goZw.[Time] AS [TIME ZW]
, CAST(goZw.[Time] - goSo.[Time] AS time) AS Difference
FROM GOTRACKING goSo
INNER JOIN GOTRACKING goZw ON goSo.SALESID = goZw.SALESID
WHERE goSo.[GO] = 'SO'
AND goZw.[GO] = 'ZW'
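The self join can be tried out with a small script. This is only a sketch: the table variable @Tracking and its two sample rows are hypothetical stand-ins for the real tracking table, and DATEDIFF is used here in place of the CAST above so that the difference comes back as a plain number of seconds.

```sql
-- Hypothetical sample data; @Tracking stands in for the real tracking table.
DECLARE @Tracking TABLE (SALESID varchar(20), [Time] datetime, [GO] varchar(2));

INSERT INTO @Tracking (SALESID, [Time], [GO]) VALUES
    ('ZS/0033428/2020', '2020-07-16T08:37:00', 'SO'),
    ('ZS/0033428/2020', '2020-07-16T08:40:00', 'ZW');

-- Pair each 'SO' row with the 'ZW' row of the same SALESID.
SELECT so.SALESID,
       so.[Time] AS [TIME SO],
       zw.[Time] AS [TIME ZW],
       DATEDIFF(SECOND, so.[Time], zw.[Time]) AS DiffSeconds
FROM @Tracking AS so
INNER JOIN @Tracking AS zw
    ON zw.SALESID = so.SALESID
WHERE so.[GO] = 'SO'
  AND zw.[GO] = 'ZW';
```

For the sample rows this gives a difference of 180 seconds. Note that an inner self join drops any SALESID that has only one of the two rows; a LEFT JOIN from the 'SO' side would keep those with a NULL difference.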

Try this:
SELECT SALESID, [Time] FROM [Mer_PRD].[dbo].[TRACKINGTABLE]
;WITH cte AS (
    SELECT SALESID, [Time],
           ROW_NUMBER() OVER (PARTITION BY SALESID ORDER BY [Time]) AS RN
    FROM [Mer_PRD].[dbo].[TRACKINGTABLE]
)
SELECT a.SALESID, a.[Time], b.[Time] AS Time2
FROM cte a
LEFT JOIN cte b
    ON a.SALESID = b.SALESID
    AND a.RN = b.RN - 1
WHERE a.RN = 1

Related

SSIS audit step encountering errors

I have a step in my SSIS package where I'd like to update the latest row in my execution log (T1) with information from the latest row in another table (T2).
I get an error around the 'Where' statement
UPDATE T1
SET
[Survey_Size] = ssd.[FileName]
,Survey_Start_Date = ssd.[Start_Date]
,Survey_End_Date = ssd.[End_Date]
,[EndTime] = getdate()
,loaded = 1
FROM (SELECT max(log_sk) AS maxSk FROM T1) A
JOIN (SELECT max(PK) AS maxPK FROM T2) SS
JOIN (SELECT PK, [FileName], Start_Date, End_Date, Survey_Size FROM T2) ssd ON ss.maxPK = ssd.pk
WHERE log_sk = a.maxSk
Table 1 looks like this:
log_sk | FileName | Survey_Size | Start_Date | End_Date
and I'd like to update the information from Table 2 which looks like below, where FileName would be a joining key in both
PK | FileName | Start_Date | End_Date | Survey_Size

I rewrote it with CTEs because that is more readable and may also be more efficient.
;With LastT1 as (
    Select
        log_sk as ID,
        Survey_Size,
        Survey_Start_Date,
        Survey_End_Date,
        EndTime,
        loaded,
        ROW_NUMBER() over (order by log_sk Desc) as Row_No
    From T1
), LatestT2 as (
    Select
        PK as ID,
        [FileName],
        [Start_Date],
        End_Date,
        Survey_Size,
        ROW_NUMBER() over (order by PK Desc) as Row_No
    From T2
)
Update Source
Set
    Source.[Survey_Size] = LatestT2.[FileName],
    Source.Survey_Start_Date = LatestT2.[Start_Date],
    Source.Survey_End_Date = LatestT2.[End_Date],
    Source.[EndTime] = getdate(),
    Source.loaded = 1
From LastT1 as Source
-- Pair the latest T1 row with the latest T2 row; log_sk and PK are unrelated keys
Inner Join LatestT2 on LatestT2.Row_No = Source.Row_No
Where Source.Row_No = 1

SQL Server multiple select on an Offset...fetch...next query

I'm trying to load data into DataTables (a JS library for data tables) using server-side processing.
The data should be produced as below:
+---------+--------+--------+
| Name | TotalA | TotalB |
+---------+--------+--------+
| Person1 | 10 | 40 |
+---------+--------+--------+
The query that I tried
select
a.Name,
(select count(*) from SummaryA where id = a.id) as TotalA,
(select count(*) from SummaryB where id = a.id) as TotalB
from
records a
order by
a.Name
offset 0 rows fetch next 10 rows only
and
select
aa.Name,
(select count(*) from SummaryA where id = aa.id) as TotalA,
(select count(*) from SummaryB where id = aa.id) as TotalB
from
(select
a.Name, a.id
from
records a
order by
a.Name
offset 0 rows fetch next 10 rows only) as aa
However, these queries will result in an error as below
Error in query: Invalid usage of the option NEXT in the FETCH statement.
Running below query is not a problem
select
a.Name
from
records a
offset 0 rows fetch next 10 rows only

Issue: offset_row_count_expression can be a variable, parameter, or constant scalar subquery. When a subquery is used, it cannot reference any columns defined in the outer query scope.
Try counting each summary table in its own derived table, so that joining both of them does not multiply the counts:
;with temp as (
    select a.name,
           isnull(b.cnt, 0) as TotalA,
           isnull(c.cnt, 0) as TotalB
    from records a
    left join (select id, count(*) as cnt from SummaryA group by id) b
        on b.id = a.id
    left join (select id, count(*) as cnt from SummaryB group by id) c
        on c.id = a.id
)
select * from temp
order by temp.name
offset 0 rows
fetch next 10 rows only
This can also be solved by paging first and counting afterwards:
;with tmp as (
    select a.name, a.id
    from records a
    order by a.name
    offset 0 rows
    fetch next 10 rows only
)
select t.name,
       isnull(b.cnt, 0) as TotalA,
       isnull(c.cnt, 0) as TotalB
from tmp t
left join (select id, count(*) as cnt from SummaryA group by id) b
    on b.id = t.id
left join (select id, count(*) as cnt from SummaryB group by id) c
    on c.id = t.id
order by t.name
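As a self-contained sketch of the paging itself: OFFSET ... FETCH needs an ORDER BY clause and SQL Server 2012 or later, and the offset and page size can be supplied as variables. The table variables, the sample rows, and the names @offset and @pageSize below are made up for illustration.

```sql
-- Hypothetical sample data standing in for records, SummaryA and SummaryB.
DECLARE @records  TABLE (id int, Name varchar(20));
DECLARE @SummaryA TABLE (id int);
DECLARE @SummaryB TABLE (id int);

INSERT INTO @records  VALUES (1, 'Person1'), (2, 'Person2');
INSERT INTO @SummaryA VALUES (1), (1), (2);
INSERT INTO @SummaryB VALUES (1), (1), (1);

-- Paging parameters, as a server-side DataTables request would supply them.
DECLARE @offset int = 0, @pageSize int = 10;

SELECT r.Name,
       (SELECT COUNT(*) FROM @SummaryA AS a WHERE a.id = r.id) AS TotalA,
       (SELECT COUNT(*) FROM @SummaryB AS b WHERE b.id = r.id) AS TotalB
FROM @records AS r
ORDER BY r.Name                       -- required by OFFSET ... FETCH
OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY;
```

With this data Person1 has TotalA = 2 and TotalB = 3, and Person2 has TotalA = 1 and TotalB = 0. The correlated subqueries count each summary table independently, so the two counts never multiply each other.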

Using RowNumber and Partition

Consider this code:
Select U.[user_id] As UserID
,Max(AL.entry_dt) As LastLoginDate
From Users U with (nolock)
Inner Join activity_log AL with (nolock) On AL.[user_id] = U.[user_id]
And AL.activity_type = 'LOGIN'
And U.external_user = 1
Group By U.[user_id]
Having Max(al.entry_dt) < GetDate() - 30
Order By U.[user_id]
I was curious if the Row_Number / Partition could be used here? Perhaps to make this more effective, or if it can be used at all?
Essentially, I want 1 row per user with the last instance the user logged in where the user hasn't logged in during the last 30 days.
Bring on the pain.....

To use the result of the row_number() in a where clause, wrap the query in a subquery/derived table or common table expression:
Original answer for users that have logged in within the last 30 days:
select UserId, LastLoginDate
from (
Select
U.[user_id] As UserID
, AL.entry_dt As LastLoginDate
, rn = row_number() over(partition by u.user_id order by AL.entry_dt desc)
From Users U with (nolock)
Inner Join activity_log AL with (nolock)
On AL.[user_id] = U.[user_id]
And AL.activity_type = 'LOGIN'
And U.external_user = 1
Where AL.entry_dt > GetDate() - 30 -- swapped < for >
) sub
where rn = 1
Order By sub.[userid]
rextester demo: http://rextester.com/XZU40394
returns:
+--------+---------------+
| UserId | LastLoginDate |
+--------+---------------+
| 1 | 2017-09-13 |
| 2 | 2017-09-10 |
| 3 | 2017-09-07 |
+--------+---------------+
Updated answer for users who have not logged in in the last 30 days:
select UserId, LastLoginDate
from (
Select
U.[user_id] As UserID
, AL.entry_dt As LastLoginDate
, rn = row_number() over(partition by u.user_id order by AL.entry_dt desc)
From Users U with (nolock)
Inner Join activity_log AL with (nolock)
On AL.[user_id] = U.[user_id]
And AL.activity_type = 'LOGIN'
And U.external_user = 1
) sub
where rn = 1
and lastlogindate < getdate() - 30
Order By [userid]
rextester demo: http://rextester.com/XZU40394
returns:
+--------+---------------+
| UserId | LastLoginDate |
+--------+---------------+
| 4 | 2016-09-13 |
| 6 | 2016-09-10 |
+--------+---------------+
from test setup:
create table users (user_id int, external_user bit)
create table activity_log (user_id int, activity_type varchar(32), entry_dt date)
insert into users values (1,1),(2,1),(3,1),(4,1),(5,0),(6,1)
insert into activity_log values
(1,'login','20170913') ,(1,'login','20170912') ,(1,'login','20170911'),(1,'login','20160908')
,(2,'login','20170910') ,(2,'login','20170909') ,(2,'login','20170908')
,(3,'login','20170907') ,(3,'login','20170906') ,(3,'login','20170905')
,(4,'login','20160913') ,(4,'login','20160912') ,(4,'login','20160908')
,(5,'login','20160910') ,(5,'login','20160909') ,(5,'login','20160908')
,(6,'login','20160910') ,(6,'login','20160909') ,(6,'login','20160908')
To correct your query in the question, move your where to having like so:
Select U.[user_id] As UserID
,Max(AL.entry_dt) As LastLoginDate
From Users U with (nolock)
Inner Join activity_log AL with (nolock) On AL.[user_id] = U.[user_id]
And AL.activity_type = 'LOGIN'
And U.external_user = 1
Group By U.[user_id]
having max(al.entry_dt) < GetDate() - 30
Order By U.[user_id]

CROSS APPLY and OUTER APPLY let you return n rows from a correlated query for each row of the outer table. CROSS APPLY is similar to an inner join, except that the correlated query runs for each row of the outer table; outer rows for which it returns nothing are dropped. OUTER APPLY is similar to a left outer join: it returns every row of the outer table, with NULLs where the correlated query finds no match. I think CROSS APPLY is what you want here, since users with no login row at all should not appear in the results.
So in the example below, for each user, we return the top 1 record in descending order of entry_dt. OUTER APPLY would resemble a left join, so all users would be returned even if no activity occurred.
Modified demo: http://rextester.com/UQEI69366 (covers all three queries below); thanks again to SQLZim for the tester and data.
SELECT U.[user_id] As UserID
, AL.entry_dt As LastLoginDate
FROM Users U with (nolock)
CROSS APPLY (SELECT top 1 *
FROM activity_log IAL
WHERE U.User_ID = IAL.User_ID
AND IAL.activity_type = 'LOGIN'
ORDER BY IAL.entry_DT Desc) AL
WHERE U.external_user = 1
AND AL.entry_dt < GetDate() - 30 -- use the APPLY alias; IAL is out of scope here
ORDER BY U.[user_id]
If all you're after is users who haven't logged in in the past 30 days, a simple NOT EXISTS should work. The date of the last login doesn't matter in that case; you only need the list of users with no login in the last 30 days.
SELECT U.[user_id] As UserID
FROM Users U
WHERE not exists (SELECT *
FROM activity_log IAL
WHERE IAL.activity_type = 'LOGIN'
AND IAL.entry_dt > GetDate() - 30
AND IAL.[user_id] = U.[user_id])
AND U.external_user = 1
ORDER BY U.[user_id]
A simple left join would work as well (it returns all external users who have not had a login in the 30 days before the present date):
SELECT U.[user_id] As UserID
FROM Users U with (nolock)
LEFT JOIN activity_log AL
ON AL.[user_id] = U.[user_id]
AND AL.activity_type = 'LOGIN'
AND AL.entry_dt > GetDate() - 30
WHERE U.external_user = 1
and AL.user_ID is null
ORDER BY U.[user_id]
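The CROSS APPLY variant can be checked with a small script. This is a sketch only: the table variables and rows are made-up sample data, and @today is a hypothetical fixed reference date used instead of GetDate() so that the result does not depend on when the script is run.

```sql
-- Hypothetical sample data modelled on the Users / activity_log tables.
DECLARE @users TABLE (user_id int, external_user bit);
DECLARE @activity_log TABLE (user_id int, activity_type varchar(32), entry_dt date);

INSERT INTO @users VALUES (1, 1), (2, 1), (3, 0);
INSERT INTO @activity_log VALUES
    (1, 'LOGIN', '20170913'), (1, 'LOGIN', '20160908'),
    (2, 'LOGIN', '20160910'), (2, 'LOGIN', '20160909');

-- Fixed reference date (an assumption for repeatability).
DECLARE @today date = '20170915';

SELECT u.user_id AS UserID, al.entry_dt AS LastLoginDate
FROM @users AS u
CROSS APPLY (SELECT TOP (1) ial.entry_dt
             FROM @activity_log AS ial
             WHERE ial.user_id = u.user_id
               AND ial.activity_type = 'LOGIN'
             ORDER BY ial.entry_dt DESC) AS al   -- latest login per user
WHERE u.external_user = 1
  AND al.entry_dt < DATEADD(DAY, -30, @today);   -- filter on the APPLY alias
```

Only user 2 is returned, with 2016-09-10 as the last login. User 1 logged in two days before the reference date, and user 3 is not external and has no login rows, so CROSS APPLY drops that user before the filter is even applied.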

I was curious if the Row_Number / Partition could be used here? Perhaps to make this more effective, or if it can be used at all?
I prefer GROUP BY to ROW_NUMBER in your case, since ROW_NUMBER needs an additional index that GROUP BY does not. Read on for details.
Assuming you use the same query you posted, these are the indexes needed.
For the Users table:
create index nci_test on
dbo.Users(user_id, external_user)
For the activity_log table, you will need to know more about the data. For example, if the join filters out more rows than the WHERE clause does, the index can be:
create index nci_test1 on
dbo.activity_log(user_id, entry_dt, activity_type)
If the entry_dt column filters out more rows, entry_dt can be the leading column of the index above.
If you use ROW_NUMBER, it will need a POC (partitioning, ordering, covering) index, and because your query spans two tables, this can't be done.

Group By and inner join with latest records based on TimeStamp

I have a History table as below:
ID | GroupCode | Category | TimeStamp
---+-----------+----------+-----------
1 | x | shoes | 2016-09-01
2 | y | blach | 2016-09-01
History table gets updated every month and a single entry for each GroupCode gets inserted in the table.
I have also a Current table which holds the latest position.
Before or after I update the History table with the current position I would like to find out whether the Category has changed from last month to this month.
I need to compare the last Category with the current Category and, if it has changed, then flag the CategoryChanged in the Current table.
Current table:
ID | GroupCode | Category | CategoryChanged
---+-----------+----------+----------------
1 | x | shoes | True
2 | y | blah | False
I tried to achieve this with an INNER JOIN, but I am having difficulty joining to only the latest month and year entries in the History table.

-- Get the latest history row for each group code, based on timestamp
;with LatestHistory
as
(
    select top 1 with ties groupcode, category
    from history
    order by row_number() over (partition by groupcode order by [timestamp] desc)
)
-- Now do a left join with the current table
select
    ct.ID,
    ct.GroupCode,
    ct.Category,
    case when ct.category = ht.category or ht.category is null then 'False'
         else 'True'
    end as changed
from currenttable ct
left join LatestHistory ht
    on ht.groupcode = ct.groupcode
Use the statement below to update, after checking that the values from the select are correct. The CTE has to be repeated because it only lives for one statement.
;with LatestHistory
as
(
    select top 1 with ties groupcode, category
    from history
    order by row_number() over (partition by groupcode order by [timestamp] desc)
)
update ct
set ct.CategoryChanged =
    case when ct.category = ht.category or ht.category is null then 'False'
         else 'True'
    end
from currenttable ct
left join LatestHistory ht
    on ht.groupcode = ct.groupcode

If you build a CTE that gives the History records a ROW_NUMBER for each GroupCode, ordered by date descending, then you are interested in rows 1 and 2. You can therefore join the CTE to itself on GroupCode, select rows 1 and 2, and see whether the category has changed between them:
;WITH CTE AS (SELECT *, row_number() OVER (PARTITION BY GroupCode ORDER BY TimeStamp Desc) RN FROM History)
SELECT
C1.ID,
C1.GroupCode,
C1.Category,
CASE WHEN C1.Category = C2.Category THEN
'false'
else
'true'
end AS CategoryChanged
FROM CTE C1
JOIN
CTE C2
ON C1.GroupCode = C2.GroupCode
AND C1.Rn=1 AND C2.RN = 2;
If you have NULL categories, you can handle them as below. You will need to decide how you want NULLs to be treated, since the question does not mention them.
;WITH CTE AS (SELECT *, row_number() OVER (PARTITION BY GroupCode ORDER BY TimeStamp Desc) RN FROM History)
SELECT
C1.ID,
C1.GroupCode,
C1.Category,
CASE WHEN C1.Category = C2.Category OR (C1.Category IS NULL AND C2.Category IS NULL) THEN
'false'
else
'true'
end AS CategoryChanged
FROM CTE C1
JOIN
CTE C2
ON C1.GroupCode = C2.GroupCode
AND C1.Rn=1 AND C2.RN = 2;
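As an alternative sketch to the self join on rows 1 and 2, the previous category can be fetched with the LAG window function (SQL Server 2012 or later). This swaps in a different technique from the one above; the @History table variable and its rows are made-up sample data.

```sql
-- Hypothetical sample data: two monthly snapshots per group code.
DECLARE @History TABLE (ID int, GroupCode varchar(10), Category varchar(20), [TimeStamp] date);

INSERT INTO @History VALUES
    (1, 'x', 'boots', '20160801'),
    (2, 'y', 'blah',  '20160801'),
    (3, 'x', 'shoes', '20160901'),
    (4, 'y', 'blah',  '20160901');

WITH WithPrevious AS (
    SELECT GroupCode,
           Category,
           -- Category of the chronologically preceding row in the same group
           LAG(Category) OVER (PARTITION BY GroupCode ORDER BY [TimeStamp]) AS PreviousCategory,
           -- 1 marks the most recent row of each group
           ROW_NUMBER() OVER (PARTITION BY GroupCode ORDER BY [TimeStamp] DESC) AS RN
    FROM @History
)
SELECT GroupCode,
       Category,
       CASE WHEN Category = PreviousCategory THEN 'false' ELSE 'true' END AS CategoryChanged
FROM WithPrevious
WHERE RN = 1;
```

Here group x comes out as changed (boots to shoes) and group y as unchanged. Unlike the inner self join, a group with only one history row is still returned; its PreviousCategory is NULL, so it is reported as 'true', which may or may not be what you want.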

How to get the latest entry for each item for in a Month with a single SQL query [duplicate]

I am trying to write a query to pick one entry for each item for each month but the latest in the month from the following table:
Name | Date | Value
a |2015-01-01 | 1
a |2015-01-02 | 2
b |2015-01-03 | 1
b |2015-01-04 | 1
b |2015-01-03 | 3
c |2015-01-02 | 2
c |2015-01-29 | 10
a |2015-02-10 | 2
a |2015-02-20 | 1
c |2015-02-10 | 2
c |2015-02-22 | 23
b |2015-02-25 | 1
b |2015-02-19 | 2
return should be:
a |2015-01-02 | 2
b |2015-01-04 | 1
c |2015-01-29 | 10
a |2015-02-20 | 1
b |2015-02-25 | 1
c |2015-02-22 | 23
Rather than sending a separate query to SQL Server for each month, I would like to load all the values with one query and then filter the collection in memory. Otherwise I would end up writing a query like the one below:
SELECT Name, Date, Value FROM MyTable mt
INNER JOIN (
select max(Date) as MaxDate
FROM [MyTable] m WHERE YEAR(Date) = YEAR(@date)
AND MONTH(Date) = MONTH(@date)) mx ON mt.Date = mx.MaxDate
And this query would need to be run for each month.
Is there a better way to return all entries with a single query?

Try grouping by year and month in the derived table:
SELECT t1.Name, t1.[Date], t1.Value
FROM MyTable t1
INNER JOIN (
SELECT Name, YEAR(Date) AS y, MONTH([Date]) AS m, MAX([Date]) as MaxDate
FROM MyTable
GROUP BY Name, YEAR(Date), MONTH([Date])
) t2 ON t1.Name = t2.Name AND
YEAR(t1.[Date]) = t2.y AND MONTH(t1.[Date]) = t2.m AND
t1.[Date] = t2.MaxDate

SELECT *
FROM (
SELECT NAME, DATE, VALUE,
ROW_NUMBER() OVER (PARTITION BY NAME, YEAR(Date), MONTH(Date)
ORDER BY Date DESC) rn
FROM MyTable) AS t
WHERE t.rn = 1

Assuming that you are using a SQL Server version that supports it, you can use the ROW_NUMBER() windowing function to return a sequence number for each row, then you can subsequently use that to restrict to only the rows that you require.
SELECT [Name],[Date],[Value]
,ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Date] DESC) AS [Seq]
FROM myTable
Things to consider:
What happens when there is a tie? ROW_NUMBER will always return a sequence number, but if your data has more than one row with the same Date value, the order among those rows will be arbitrary. To solve this, add additional tie-break ORDER BY entries.
How do I filter this? Put it into a common table expression, an inline view, or a real view, and filter on [Seq] there.
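Both points can be sketched together. The script below is an illustration under assumptions: @MyTable holds a subset of the question's sample rows, the partition also includes year and month so that one row per name per month is kept, and Value is used as a tie-break purely as an example.

```sql
-- Hypothetical subset of the question's sample data.
DECLARE @MyTable TABLE ([Name] char(1), [Date] date, [Value] int);

INSERT INTO @MyTable VALUES
    ('a', '20150101', 1), ('a', '20150102', 2),
    ('b', '20150103', 1), ('b', '20150104', 1), ('b', '20150103', 3),
    ('a', '20150210', 2), ('a', '20150220', 1);

WITH Sequenced AS (
    SELECT [Name], [Date], [Value],
           ROW_NUMBER() OVER (
               PARTITION BY [Name], YEAR([Date]), MONTH([Date])
               ORDER BY [Date] DESC, [Value] DESC   -- Value is an assumed tie-break
           ) AS Seq
    FROM @MyTable
)
SELECT [Name], [Date], [Value]
FROM Sequenced
WHERE Seq = 1          -- keep only the latest row per name and month
ORDER BY [Date];
```

For these rows the query keeps (a, 2015-01-02, 2), (b, 2015-01-04, 1) and (a, 2015-02-20, 1), which matches the corresponding lines of the expected result in the question.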

I think you need a correlated query once you have a set of distinct (Name, Year, Month). There are various ways of doing this; one is to use CROSS APPLY:
select itemMonths.Name, latest.[Date], latest.Value
from (select distinct Name, Year([Date]) as [Year], Month([Date]) as [Month]
      from theTable) itemMonths
cross apply (select top 1 t.[Date], t.Value
             from theTable t
             where Year(t.[Date]) = itemMonths.[Year]
               and Month(t.[Date]) = itemMonths.[Month]
               and t.Name = itemMonths.Name
             order by t.[Date] desc) latest

You could try the following:
WITH MyTable AS
(SELECT 'a' AS name, GETDATE() AS date, 1 AS value
UNION ALL
SELECT 'a', GETDATE()+1, 2
)
, res AS (
SELECT Name,date,MAX(Date) OVER(PARTITION BY Name, DATEPART(yyyy,date), DATEPART(mm, date)) AS max_date , Value FROM MyTable
)
SELECT name,date,res.value FROM res WHERE date=max_date
You still need a filter though as the Max Over will return all rows.
If you were using Teradata I'd suggest using the Qualify Clause but Itzik hasn't had any luck getting this ported to SQL server!
https://connect.microsoft.com/SQLServer/feedback/details/532474

Use CROSS APPLY, with DISTINCT so that each name and month appears only once:
SELECT DISTINCT b.*
FROM mytable mt
CROSS apply (SELECT TOP 1 NAME, date, value
FROM [mytable] m
WHERE m.NAME = mt.NAME
AND Month(m.date) = Month(mt.date)
AND Year(m.date) = Year(mt.date)
ORDER BY m.date DESC) b
