SQL Server: Compare two columns in two tables - sql-server

I have two tables (from two different systems) that track employees' hours. In both tables, the employees enter the date and hours. I need to create an audit report that shows the discrepancies. The report needs to show all the columns and display null/no match where there is a mismatch. One table might have more or fewer entries than the other, or even duplicate entries (two entries on the same day for the same number of hours in one table), and I need to catch that. Both tables have a UserID that can be joined on.
If there is a match based on date and hours, show all the values.
If there is a mismatch based on hours, show null or no match for the missing side.
If there is a duplicate entry, like the image below, match the first entry and report the second one as null or no match.
I tried joining the tables on UserID, date, and hours, but I am not able to tell where the mismatch came from.
Table A:
Table B:
Left Join on UserID, date, and hours

SET NOCOUNT ON;
-- temp tables, so the #table1 / #table2 references below resolve
create table #table1
(
userid int,
entry_date date,
[hours] varchar(10)
)
create table #table2
(
userid int,
entry_date datetime,
[hours] varchar(10)
)
INSERT INTO #table1
select 1,'8/14/2017','10:00'
INSERT INTO #table1
select 2,'8/14/2017','5:00'
INSERT INTO #table1
select 2,'8/14/2017','5:00'
INSERT INTO #table1
select 2,'8/14/2017','5:00'
INSERT INTO #table1
select 2,'8/14/2017','5:00'
INSERT INTO #table1
select 3,'8/14/2017','5:00'
INSERT INTO #table1
select 3,'8/14/2017','6:00'
INSERT INTO #table1
select 3,'8/14/2017','6:00'
INSERT INTO #table1
select 3,'8/14/2017','6:00'
INSERT INTO #table2
select 1,'8/14/2017','10:00'
INSERT INTO #table2
select 2,'8/14/2017','8:00'
INSERT INTO #table2
select 3,'8/14/2017','6:00'
INSERT INTO #table2
select 4,'8/14/2017','2:00'
INSERT INTO #table2
select 1,'8/14/2017','10:00'
INSERT INTO #table2
select 3,'8/14/2017','6:00'
;WITH CTE_TABLE1 AS
(
select t.userid as userid, CAST(t.entry_date as DATE) as entry_date, t.[hours] as [hours],
ROW_NUMBER() OVER(PARTITION BY t.userid, t.entry_date, t.[hours] ORDER BY t.[Hours]) as rnk
from #Table1 t
), CTE_TABLE2 AS
(
select t.userid as userid, CAST(t.entry_date as DATE) as entry_date, t.[hours] as [hours],
ROW_NUMBER() OVER(PARTITION BY t.userid, t.entry_date, t.[hours] ORDER BY t.[Hours]) as rnk
from #Table2 t
), CTE_MATCHES AS
(
select t1.userid as userid, t1.entry_date as entry_date, t1.[hours] as [hours], t1.rnk
from CTE_TABLE1 t1
inner join CTE_TABLE2 t2
on (t1.userid = t2.userid AND t1.entry_date = t2.entry_date AND t1.[hours] = t2.[hours] AND t1.rnk = t2.rnk)
),CTE_MATCH_DUPLICATES AS
(
select 'Table1MatchDuplicate' as ErrorType, *
from
(
select t.*
from (select userid, entry_date, [hours], max(rnk) as rnk from CTE_MATCHES group by userid, entry_date, [hours]) m
inner join CTE_TABLE1 t
on (t.userid = m.userid AND t.entry_date = m.entry_date AND t.[hours] = m.[hours] AND t.rnk > m.rnk)
)q
UNION ALL
select 'Table2MatchDuplicate' as ErrorType, *
from
(
select t.*
from (select userid, entry_date, [hours], max(rnk) as rnk from CTE_MATCHES group by userid, entry_date, [hours]) m
inner join CTE_TABLE2 t
on (t.userid = m.userid AND t.entry_date = m.entry_date AND t.[hours] = m.[hours] AND t.rnk > m.rnk)
)q
)
, CTE_Table1_UNMATCHED AS
(
select t.userid, t.entry_date, t.[hours]
from #Table1 t
left outer join CTE_MATCHES m
on (t.userid = m.userid AND CAST(t.entry_date as DATE) = m.entry_date AND t.[hours] = m.[hours])
where m.userid is null
), CTE_Table2_UNMATCHED AS
(
select t.userid, t.entry_date, t.[hours]
from #Table2 t
left outer join CTE_MATCHES m
on (t.userid = m.userid AND CAST(t.entry_date as DATE) = m.entry_date AND t.[hours] = m.[hours])
where m.userid is null
)
select null as ErrorType, userid, entry_date, [hours] from CTE_MATCHES
UNION ALL
select 'Table1Mismatch' as ErrorType, userid, entry_date, [hours] from CTE_Table1_UNMATCHED
UNION ALL
select 'Table2Mismatch' as ErrorType, userid, entry_date, [hours] from CTE_Table2_UNMATCHED
UNION ALL
select ErrorType, userid, entry_date, [hours] from CTE_MATCH_DUPLICATES
order by ErrorType
http://rextester.com/UDG95824
If you also need to find duplicates that had no matches, add these CTEs and UNION branches:
,CTE_Table1_Unmatched_Duplicates AS
(
select userid, entry_date, [hours]
from CTE_Table1_UNMATCHED
group by userid, entry_date, [hours]
having count(*) > 1
),CTE_Table2_Unmatched_Duplicates AS
(
select userid, entry_date, [hours]
from CTE_Table2_UNMATCHED
group by userid, entry_date, [hours]
having count(*) > 1
)
...
UNION
select 'Table1UnmatchedDuplicates' as ErrorType, userid, entry_date, [hours] from CTE_Table1_Unmatched_Duplicates
UNION
select 'Table2UnmatchedDuplicates' as ErrorType, userid, entry_date, [hours] from CTE_Table2_Unmatched_Duplicates
http://rextester.com/KEJJF79330
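The occurrence-numbering idea used above can also be sketched more compactly with a FULL OUTER JOIN, which puts both sides on one row and leaves NULLs where the other table has no counterpart. This is only a minimal sketch under stated assumptions: the temp tables #hoursA and #hoursB, their sample rows, and the status labels are hypothetical and not part of the original answer.

```sql
-- Hypothetical sample data; #hoursA and #hoursB are illustrative names.
CREATE TABLE #hoursA (userid int, entry_date date, [hours] varchar(10));
CREATE TABLE #hoursB (userid int, entry_date date, [hours] varchar(10));

INSERT INTO #hoursA VALUES (1, '20170814', '10:00'), (2, '20170814', '5:00'), (2, '20170814', '5:00');
INSERT INTO #hoursB VALUES (1, '20170814', '10:00'), (2, '20170814', '5:00'), (4, '20170814', '2:00');

-- Number identical rows within each table so the n-th copy in A
-- can only pair with the n-th copy in B.
WITH a AS (
    SELECT userid, entry_date, [hours],
           ROW_NUMBER() OVER (PARTITION BY userid, entry_date, [hours]
                              ORDER BY (SELECT NULL)) AS rn
    FROM #hoursA
), b AS (
    SELECT userid, entry_date, [hours],
           ROW_NUMBER() OVER (PARTITION BY userid, entry_date, [hours]
                              ORDER BY (SELECT NULL)) AS rn
    FROM #hoursB
)
SELECT a.userid AS a_userid, a.entry_date AS a_date, a.[hours] AS a_hours,
       b.userid AS b_userid, b.entry_date AS b_date, b.[hours] AS b_hours,
       CASE WHEN a.userid IS NULL THEN 'Only in B'
            WHEN b.userid IS NULL THEN 'Only in A'
            ELSE 'Match' END AS status
FROM a
FULL OUTER JOIN b
  ON a.userid = b.userid
 AND a.entry_date = b.entry_date
 AND a.[hours] = b.[hours]
 AND a.rn = b.rn
ORDER BY COALESCE(a.userid, b.userid), COALESCE(a.rn, b.rn);
```

With this sample data, user 1 and the first 5:00 entry for user 2 come back as matches, the second 5:00 entry for user 2 is reported as only in A, and user 4 is reported as only in B.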

Related

How to use CTE to get a query repeated for multiple inputs?

I have the following query:
SELECT TOP 1 account, date, result
FROM table_1 as t1
JOIN table_2 as t2 ON t1.accountId = t2.frn_accountId
WHERE accountID = 1
ORDER BY date
This query returns the result that I want; however, I want that result for multiple accountIDs. The query should return the top 1 value for each accountID.
The query that produces the list of accountIDs is:
SELECT accountID from lskin WHERE refname LIKE '%BHA%' and isactive = 1
How can I write this query so it can produce the desired result? I have been playing around with CTE but haven't been able to make it correct. It doesn't have to be with CTE, I just thought it can be easier using CTE...
Here is a ROW_NUMBER() solution (written as a derived table; the same query works as a CTE).
SELECT *
FROM (SELECT account
, date
, result
, ROW_NUMBER() OVER (PARTITION BY t1.accountId ORDER BY date DESC) AS Rownum
FROM table_1 AS t1
INNER JOIN table_2 AS t2
ON t1.accountId = t2.frn_accountId
INNER JOIN lskin AS l
ON l.accountID = t1.accountID
WHERE l.refname LIKE '%BHA%'
AND l.isactive = 1
) a
WHERE a.Rownum = 1;
Use max on your date and group by the account, or whatever columns are appropriate.
SELECT
account,
DT = max(date),
result
FROM table_1 as t1
JOIN table_2 as t2 ON t1.accountId = t2.frn_accountId
JOIN lskin as l on l.accountID = t1.accountID
WHERE l.refname like '%BHA%'
GROUP BY
account
,result
If the grouping isn't correct, just join to a sub-query to limit it with max date. Just change the table names as necessary.
SELECT
account,
date,
result
FROM table_1 as t1
JOIN table_2 as t2 ON t1.accountId = t2.frn_accountId
JOIN lskin as l on l.accountID = t1.accountID
INNER JOIN (select max(date) dt, accountID from table_1 group by accountID) tt on tt.dt = t1.date and tt.accountId = t1.accountId
WHERE l.refname like '%BHA%'
Ignore the CTE at the top. That's just test data.
/* CTE Test Data */
; WITH table_1 AS (
SELECT 1 AS accountID, 'acc1' AS account UNION ALL
SELECT 2 AS accountID, 'acc2' AS account UNION ALL
SELECT 3 AS accountID, 'acc3' AS account
)
, table_2 AS (
SELECT 1 AS frn_accountID, 'new1' AS result, GETDATE() AS [date] UNION ALL
SELECT 1 AS frn_accountID, 'mid1' AS result, GETDATE()-1 AS [date] UNION ALL
SELECT 1 AS frn_accountID, 'old1' AS result, GETDATE()-2 AS [date] UNION ALL
SELECT 2 AS frn_accountID, 'new2' AS result, GETDATE() AS [date] UNION ALL
SELECT 2 AS frn_accountID, 'mid2' AS result, GETDATE()-1 AS [date] UNION ALL
SELECT 2 AS frn_accountID, 'old2' AS result, GETDATE()-2 AS [date] UNION ALL
SELECT 3 AS frn_accountID, 'new3' AS result, GETDATE() AS [date] UNION ALL
SELECT 3 AS frn_accountID, 'mid3' AS result, GETDATE()-1 AS [date] UNION ALL
SELECT 3 AS frn_accountID, 'old3' AS result, GETDATE()-2 AS [date]
)
, lskin AS (
SELECT 1 AS accountID, 'purple' AS refName, 1 AS isActive UNION ALL
SELECT 2 AS accountID, 'blue' AS refName, 1 AS isActive UNION ALL
SELECT 3 AS accountID, 'orange' AS refName, 0 AS isActive UNION ALL
SELECT 4 AS accountID, 'blue' AS refName, 1 AS isActive
)
,
/* Just use the below and remove the comment markers around WITH to build theCTE. */
/* ; WITH */
theCTE AS (
SELECT s1.accountID, s1.account, s1.result, s1.[date]
FROM (
SELECT t1.accountid, t1.account, t2.result, t2.[date], ROW_NUMBER() OVER (PARTITION BY t1.account ORDER BY t2.[date]) AS rn
FROM table_1 t1
INNER JOIN table_2 t2 ON t1.accountID = t2.frn_accountID
) s1
WHERE s1.rn = 1
)
SELECT lskin.accountID
FROM lskin
INNER JOIN theCTE ON theCTE.accountid = lskin.accountID
WHERE lskin.refName LIKE '%blue%'
AND lskin.isActive = 1
;
EDITED:
I'm still making a lot of assumptions about your data structure. And again, make sure you're querying what you need. CTEs are awesome, but you don't want to accidentally filter out expected results.
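Another way to express "top 1 per account" is CROSS APPLY with TOP 1, which runs the original single-account query once per qualifying account. The sketch below is illustrative only: the temp tables, their sample rows, the placement of date and result in table_2, and the alias first_row are all assumptions, since the question does not show the table definitions.

```sql
-- Hypothetical sample data; which table holds which column is an assumption.
CREATE TABLE #table_1 (accountId int, account varchar(20));
CREATE TABLE #table_2 (frn_accountId int, [date] datetime, result varchar(20));
CREATE TABLE #lskin (accountID int, refname varchar(50), isactive bit);

INSERT INTO #table_1 VALUES (1, 'acc1'), (2, 'acc2');
INSERT INTO #table_2 VALUES (1, '20170101', 'old1'), (1, '20170201', 'new1'),
                            (2, '20170301', 'old2'), (2, '20170401', 'new2');
INSERT INTO #lskin VALUES (1, 'xxBHAxx', 1), (2, 'other', 1);

-- For each active account whose refname matches, fetch its first row by date.
SELECT t1.account, first_row.[date], first_row.result
FROM #lskin AS l
INNER JOIN #table_1 AS t1
        ON t1.accountId = l.accountID
CROSS APPLY (
    SELECT TOP 1 t2.[date], t2.result
    FROM #table_2 AS t2
    WHERE t2.frn_accountId = t1.accountId
    ORDER BY t2.[date]          -- same ordering as the original TOP 1 query
) AS first_row
WHERE l.refname LIKE '%BHA%'
  AND l.isactive = 1;
```

With this sample data only account 1 passes the lskin filter, and its earliest row (old1) is returned. Change the inner ORDER BY to DESC if the most recent row is wanted instead.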

SQL: How can I get from grouped table also one first row from each group

I have the following table
CREATE TABLE [dbo].[Table_1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [nchar](10) NULL,
[DateOf] [nvarchar](20) NULL
) ON [PRIMARY]
and data as:
ID|Name|DateOf
1|A|2016-11-29 00:01:00
2|A|2016-11-29 00:02:00
3|A|2016-11-29 00:03:00
4|B|2016-11-29 00:01:00
5|B|2016-11-29 00:02:00
If I run
select name, COUNT(name) from Table_1 group by Name
I'll have
A|3
B|2
So, how do I get a result like the one above, with only one row taken from each group and sorted?
A|3|2016-11-29 00:01:00
B|2|2016-11-29 00:01:00
where the last column is sorted as a datetime (right now it is nvarchar).
You can use a CTE with the ranking function ROW_NUMBER() and PARTITION BY, plus a windowed COUNT(*) for the group size:
WITH CTE AS
( select name, dateof,
cnt = count(*) over (partition by NAME),
rn = row_number() over (partition by NAME order by dateof)
from Table_1
)
SELECT name, cnt, dateof FROM CTE WHERE RN = 1
Or
select name, cnt, dateof from (
select name, dateof, count(*) over (partition by NAME) as cnt, ROW_NUMBER() over(partition by NAME order by dateof) as rnk
from Table_1
) a where rnk=1
You can use CROSS APPLY:
select t1.name, t1.[Count], t2.MinDate
from (
select name, count(*) as [Count]
from Table_1
group by name
) AS t1
cross apply (
select min(DateOf) as MinDate
from Table_1 t2
where t1.Name = t2.Name
) as t2
If you want more data from the row than just the min date, you can modify the subselect:
select t1.name, t1.[Count], t2.MinDate, t2.ID
from (
select name, count(*) as [Count]
from Table_1
group by name
) AS t1
cross apply (
select top 1 DateOf as MinDate, ID
from Table_1 t2
where t1.Name = t2.Name
order by DateOf
) as t2
SELECT T1.Name,A.[COUNT],MIN(DateOf)
FROM Your_table_Name T1
JOIN
(
SELECT COUNT(*) [COUNT],T2.Name [Name]
FROM Your_table_Name T2
GROUP BY T2.Name
)A ON A.Name = T1.Name
GROUP BY T1.Name,A.[COUNT]
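Since the question also asks for the last column to be treated as a datetime rather than nvarchar, a single GROUP BY that converts the text before taking the minimum is worth sketching. This assumes SQL Server 2012 or later (for TRY_CONVERT) and that the strings follow the yyyy-mm-dd hh:mi:ss layout shown in the question; the temp table #Table_1 and the alias FirstDate are illustrative.

```sql
-- Hypothetical copy of the question's table and rows.
CREATE TABLE #Table_1 (
    ID     int IDENTITY(1,1) NOT NULL,
    [Name] nchar(10)    NULL,
    DateOf nvarchar(20) NULL
);

INSERT INTO #Table_1 ([Name], DateOf) VALUES
    (N'A', N'2016-11-29 00:01:00'),
    (N'A', N'2016-11-29 00:02:00'),
    (N'A', N'2016-11-29 00:03:00'),
    (N'B', N'2016-11-29 00:01:00'),
    (N'B', N'2016-11-29 00:02:00');

-- Style 120 is the ODBC canonical format (yyyy-mm-dd hh:mi:ss).
-- TRY_CONVERT yields NULL for unparseable text, and MIN ignores NULLs.
SELECT [Name],
       COUNT(*) AS [Count],
       MIN(TRY_CONVERT(datetime2(0), DateOf, 120)) AS FirstDate
FROM #Table_1
GROUP BY [Name]
ORDER BY [Name];
```

This returns one row per name with the row count and the earliest date as a real datetime2 value, so later sorting or date arithmetic does not depend on string ordering.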

Performance comparison between CTE and CTE with fewer columns with inner join

I have the following queries. I am not sure why the one using the inner join takes less time to execute. Shouldn't it take more time than the first one?
Query #1:
CREATE TABLE #TEMPORAL
(
ID INT,
NAME NVARCHAR(300),
LASTNAME NVARCHAR(100),
PARENTID INT,
[LEVEL] INT
)
;WITH CTE AS
(
SELECT
ID, PARENTID, 0 [LEVEL], FIRSTNAME, LASTNAME
FROM
PERSON
WHERE
ID = 1123643
UNION ALL
SELECT
P.ID, P.PARENTID, C.[LEVEL] + 1, P.FIRSTNAME, P.LASTNAME
FROM
PERSON P
INNER JOIN
CTE C ON C.ID = P.PARENTID
)
INSERT INTO #TEMPORAL (ID, NAME, LASTNAME, PARENTID, [LEVEL])
(SELECT
ID, FIRSTNAME, LASTNAME, PARENTID, [LEVEL]
FROM
CTE)
SELECT ID, NAME, LASTNAME, PARENTID, [LEVEL]
FROM #TEMPORAL
Query #2:
CREATE TABLE #TEMPORAL
(
ID INT,
NAME NVARCHAR(300),
LASTNAME NVARCHAR(100),
PARENTID INT,
[LEVEL] INT
)
;WITH CTE AS
(
SELECT
ID, PARENTID, 0 [LEVEL]
FROM
PERSON
WHERE
ID = 1123643
UNION ALL
SELECT
P.ID, P.PARENTID, C.[LEVEL] + 1
FROM
PERSON P
INNER JOIN
CTE C ON C.ID = P.PARENTID
)
INSERT INTO #TEMPORAL (ID, PARENTID, [LEVEL])
(SELECT ID, PARENTID, [LEVEL]
FROM CTE)
SELECT
T.ID, T.PARENTID, P.FIRSTNAME, P.LASTNAME
FROM
#TEMPORAL T
INNER JOIN
PERSON P ON T.ID = P.ID
So, I am very confused about why this is happening. Can you give me an explanation? Also, if there is a better way to accomplish what I want, it would be great to know.

Converting Rows to Columns

I have a table with columns UserID and CountryName.
Now I want to get the records in this way:
[UserId] [CountryName1] [CountryName2] [CountryName3].........
Fiddle here : http://sqlfiddle.com/#!6/cd6f1/1
DECLARE @SQL AS NVARCHAR(MAX);
WITH CTE AS
(
SELECT USERID,COUNTRYNAME,ROW_NUMBER() OVER(PARTITION BY USERID ORDER BY COUNTRYNAME) AS RN
FROM CNTRIES
)
SELECT @SQL = 'WITH CTE1 AS
(
SELECT USERID,COUNTRYNAME,ROW_NUMBER() OVER(PARTITION BY USERID ORDER BY COUNTRYNAME) AS RN
FROM CNTRIES
)
SELECT *
FROM
(SELECT USERID,COUNTRYNAME,RN FROM CTE1)C
PIVOT (MAX(COUNTRYNAME) FOR RN IN (['+STUFF((SELECT '],['+CAST(RN AS VARCHAR(100))
FROM CTE
GROUP BY RN
FOR XML PATH('')),1,3,'')+'])) AS PIVOTT';
-- Run the generated PIVOT statement
EXEC sp_executesql @SQL;
PIVOT is your best option if your version is SQL Server 2005 or above, but you don't state the version, and using PIVOT without a natural aggregate can be difficult for some to grasp. If your version is below 2005, you have bigger problems. Without PIVOT, you'll need to left join the table to itself to get the same result. You can use a ranking function to make it a little easier. Something like this, while inefficient, will produce similar results.
/*
IF OBJECT_ID('Countries','U') IS NOT NULL
DROP TABLE Countries
CREATE TABLE Countries
(
UserID INT
, CountryName VARCHAR(255)
)
INSERT Countries
VALUES (1, 'India')
, (1, 'UK')
, (2, 'USA')
, (2, 'India')
, (2, 'Canada')
*/
SELECT DISTINCT x.UserID, x.CountryName Country1, y.CountryName Country2, z.CountryName Country3
FROM Countries c
LEFT JOIN
(
SELECT *, RANK() OVER(PARTITION BY UserID ORDER BY UserID, CountryName) AS UserRank
FROM Countries
)x ON x.UserID = c.UserID AND x.UserRank=1
LEFT JOIN
(
SELECT *, RANK() OVER(PARTITION BY UserID ORDER BY UserID, CountryName) AS UserRank
FROM Countries
)y ON y.UserID = c.UserID AND y.UserRank=2
LEFT JOIN
(
SELECT *, RANK() OVER(PARTITION BY UserID ORDER BY UserID, CountryName) AS UserRank
FROM Countries
)z ON z.UserID = c.UserID AND z.UserRank=3
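For comparison, when the maximum number of countries per user is known in advance, the PIVOT mentioned above can be written statically, without dynamic SQL. This is a minimal sketch assuming at most three countries per user; the temp table #Countries and the output column names Country1 to Country3 are illustrative.

```sql
-- Hypothetical sample data matching the commented-out setup above.
CREATE TABLE #Countries (UserID int, CountryName varchar(255));

INSERT INTO #Countries VALUES
    (1, 'India'), (1, 'UK'),
    (2, 'USA'), (2, 'India'), (2, 'Canada');

-- Number each user's countries, then turn the numbers into columns.
-- MAX is only there because PIVOT requires an aggregate; each
-- (UserID, rn) pair holds exactly one CountryName.
SELECT UserID, [1] AS Country1, [2] AS Country2, [3] AS Country3
FROM (
    SELECT UserID, CountryName,
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY CountryName) AS rn
    FROM #Countries
) AS src
PIVOT (MAX(CountryName) FOR rn IN ([1], [2], [3])) AS p
ORDER BY UserID;
```

With this data, user 1 gets India and UK with a NULL third column, and user 2 gets Canada, India, and USA. A user with more than three countries would silently lose the extras, which is the case the dynamic version handles.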

SQL Server: How to use UNION with two queries that BOTH have a WHERE clause?

Given:
Two queries that require filtering:
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
And:
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
Separately, these return the IDs I'm looking for: (13, 11 and 12, 6)
Basically, I want the two most recent records for two specific types of data.
I want to union these two queries together like so:
select top 2 t1.ID, t2.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by ReceivedDate desc
union
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc
Problem:
The problem is that this query is invalid because the first select cannot have an order by clause if it is being unioned. And it cannot have top 2 without having order by.
How can I fix this situation?
You should be able to alias them and use them as subqueries. Part of the reason your first effort was invalid is that the first select had two columns (ID and ReceivedDate) but the second had only one (ID). Also, Table is a reserved word in SQL Server and cannot be used unquoted as a table name; Type is not reserved, but the sample below uses ItemType for that column:
create table #Tbl1 (ID int, ReceivedDate datetime, ItemType Varchar(10))
create table #Tbl2 (ID int, ReceivedDate datetime, ItemType Varchar(10))
insert into #Tbl1 values(1, '20010101', 'Type_1')
insert into #Tbl1 values(2, '20010102', 'Type_1')
insert into #Tbl1 values(3, '20010103', 'Type_3')
insert into #Tbl2 values(10, '20010101', 'Type_2')
insert into #Tbl2 values(20, '20010102', 'Type_3')
insert into #Tbl2 values(30, '20010103', 'Type_2')
SELECT a.ID, a.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from #tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) a
union
SELECT b.ID, b.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from #tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) b
select * from
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
) t1
union
select * from
(
select top 2 t2.ID, t2.ReceivedDate
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
) t2
Or, using CTEs (SQL Server 2005+):
;with One as
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
)
,Two as
(
select top 2 t2.ID, t2.ReceivedDate
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
)
select * from One
union
select * from Two
create table #T1 (ID int, ReceivedDate datetime, [type] varchar(10))
create table #T2 (ID int, ReceivedDate datetime, [type] varchar(10))
insert into #T1 values(1, '20010101', '1')
insert into #T1 values(2, '20010102', '1')
insert into #T1 values(3, '20010103', '1')
insert into #T2 values(10, '20010101', '2')
insert into #T2 values(20, '20010102', '2')
insert into #T2 values(30, '20010103', '2')
;with cte1 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from #T1
where [type] = '1'
),
cte2 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from #T2
where [type] = '2'
)
select *
from cte1
where rn <= 2
union all
select *
from cte2
where rn <= 2
The basic premise of the question and the answers is wrong. Every SELECT in a UNION can have a WHERE clause. It's the ORDER BY in the first query that's giving you the error.
The answer is misleading because it attempts to fix a problem that is not a problem. You actually CAN have a WHERE clause in each segment of a UNION. You cannot have an ORDER BY except in the last segment. Therefore, the following runs without error. Note, however, that the final ORDER BY sorts the combined result, so each TOP 2 picks arbitrary rows rather than the two most recent ones:
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
-----remove this-- order by ReceivedDate desc
union
select top 2 t2.ID, t2.ReceivedDate --- add second column
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc
Create views for the first two selects and union them.
Notice that each SELECT statement within the UNION must have the same number of columns. The columns must also have similar data types. Also, the columns in each SELECT statement must be in the same order.
you are selecting
t1.ID, t2.ReceivedDate
from Table t1
union
t2.ID
from Table t2
which is incorrect.
so you have to write
t1.ID, t1.ReceivedDate from Table t1
union
t2.ID, t2.ReceivedDate from Table t2
You can use a subquery here:
SELECT tbl1.ID, tbl1.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) tbl1
union
SELECT tbl2.ID, tbl2.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) tbl2
Because UNION removes duplicates by default, this returns only distinct rows from both tables.
select top 2 t1.ID, t1.ReceivedDate, 1 SortBy
from Table t1
where t1.Type = 'TYPE_1'
union
select top 2 t2.ID, t2.ReceivedDate, 2 SortBy
from Table t2
where t2.Type = 'TYPE_2'
order by 3,2
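Because the question reads both types from the same table, the two most recent rows per type can also be fetched in one pass, with no UNION at all, by partitioning ROW_NUMBER() on the type column. The sketch below is only an illustration: the temp table #Items, the column name ItemType, and the sample rows are assumptions.

```sql
-- Hypothetical single table holding both types.
CREATE TABLE #Items (ID int, ReceivedDate datetime, ItemType varchar(10));

INSERT INTO #Items VALUES
    (1, '20010101', 'TYPE_1'), (2, '20010102', 'TYPE_1'), (3, '20010103', 'TYPE_1'),
    (4, '20010101', 'TYPE_2'), (5, '20010102', 'TYPE_2'), (6, '20010103', 'TYPE_2'),
    (7, '20010104', 'TYPE_3');

-- Rank rows newest-first within each type, then keep the top two per type.
SELECT ID, ReceivedDate, ItemType
FROM (
    SELECT ID, ReceivedDate, ItemType,
           ROW_NUMBER() OVER (PARTITION BY ItemType
                              ORDER BY ReceivedDate DESC) AS rn
    FROM #Items
    WHERE ItemType IN ('TYPE_1', 'TYPE_2')
) AS ranked
WHERE rn <= 2
ORDER BY ItemType, ReceivedDate DESC;
```

With this data the query returns IDs 3 and 2 for TYPE_1 and IDs 6 and 5 for TYPE_2. Adding another type only requires extending the IN list.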
