SQL Query to Find All distinct Duplicates across 2 columns - sql-server

Sale ID Group ID
JED93B53QEJYST4102 01A42QEJAXT17A7
JED93B53QEJYST4102 01A42QEJAXT17A7
JED93B53QEJYST4102 01A42QEJAXT17A7
JED93B53QEJYST4102 01A42QEJAXT17A7
JED8754AQEJEHT4119 01C49QEJPJT133E
JED8754AQEJEHT4119 01C49QEJPJT133E
JED8754AQEJEHT4119 01C49QEJPJT133E
JEDA67C1QEJEQR4A4A 03D80QEJRSR1BC5
JEDA67C1QEJEQR4A4A 03D80QEJRSR1BC5
JED46D04QEJXOR468B 040E5QEJGQR174D
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED99C53QEJNMR4973 053BDQEJNSS11D4
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED457D4QEJFGR468F 0B829QEJHJR18F5
JED457D4QEJFGR468F 0B829QEJHJR18F5
JEDA98F8QEJCZQ4F6A 0B829QEJHJR18F5
I am trying to write a SQL query that returns only those records which have duplicate Group IDs but different Sale IDs. My expected output is similar to the rows below. Is there any way to achieve this?
JED99C53QEJNMR4973 053BDQEJNSS11D4
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED457D4QEJFGR468F 0B829QEJHJR18F5
JEDA98F8QEJCZQ4F6A 0B829QEJHJR18F5
Any help appreciated. Thanks a lot.
EDIT: Using GROUP BY I can get as far as this:
Sale ID Group ID
JED93B53QEJYST4102 01A42QEJAXT17A7
JED8754AQEJEHT4119 01C49QEJPJT133E
JEDA67C1QEJEQR4A4A 03D80QEJRSR1BC5
JED46D04QEJXOR468B 040E5QEJGQR174D
JED658D9QEJIOS4F38 053BDQEJNSS11D4
JED99C53QEJNMR4973 053BDQEJNSS11D4
JED457D4QEJFGR468F 0B829QEJHJR18F5
JEDA98F8QEJCZQ4F6A 0B829QEJHJR18F5
EDIT FINAL: Thanks for all the responses; I was finally able to sort this out the way I wanted, and I managed to learn something new. Apologies if my question was not clear. I needed it precisely that way because the table has more than 100,000 records and I need to audit those with different Sale IDs for a single Group ID. The query below by Giorgos Betsos worked:
SELECT t1.[Sale ID], t1.[Group ID]
FROM mytable AS t1
JOIN (
SELECT [Group ID]
FROM mytable
GROUP BY [Group ID]
HAVING COUNT(*) > 1 AND COUNT(DISTINCT [Sale ID]) > 1
) AS t2 ON t1.[Group ID] = t2.[Group ID]
Group By t1.[Group ID], t1.[Sale ID]

It seems like the required GROUP ID values can be obtained by the following query:
SELECT [Group ID]
FROM mytable
GROUP BY [Group ID]
HAVING COUNT(*) > 1 AND COUNT(DISTINCT [Sale ID]) > 1
You can use the above query as a derived table and join it back to the original table to get the Sale ID values as well:
SELECT t1.[Sale ID], t1.[Group ID]
FROM mytable AS t1
JOIN (
SELECT [Group ID]
FROM mytable
GROUP BY [Group ID]
HAVING COUNT(*) > 1 AND COUNT(DISTINCT [Sale ID]) > 1
) AS t2 ON t1.[Group ID] = t2.[Group ID]

Try this:
DECLARE @Table TABLE (
[Sale ID] VARCHAR(100)
,[Group ID] VARCHAR(100)
)
insert into @Table values
('JED93B53QEJYST4102','01A42QEJAXT17A7')
,('JED93B53QEJYST4102','01A42QEJAXT17A7')
,('JED93B53QEJYST4102','01A42QEJAXT17A7')
,('JED93B53QEJYST4102','01A42QEJAXT17A7')
,('JED8754AQEJEHT4119','01C49QEJPJT133E')
,('JED8754AQEJEHT4119','01C49QEJPJT133E')
,('JED8754AQEJEHT4119','01C49QEJPJT133E')
,('JEDA67C1QEJEQR4A4A','03D80QEJRSR1BC5')
,('JEDA67C1QEJEQR4A4A','03D80QEJRSR1BC5')
,('JED46D04QEJXOR468B','040E5QEJGQR174D')
,('JED658D9QEJIOS4F38','053BDQEJNSS11D4')
,('JED658D9QEJIOS4F38','053BDQEJNSS11D4')
,('JED99C53QEJNMR4973','053BDQEJNSS11D4')
,('JED658D9QEJIOS4F38','053BDQEJNSS11D4')
,('JED658D9QEJIOS4F38','053BDQEJNSS11D4')
,('JED457D4QEJFGR468F','0B829QEJHJR18F5')
,('JED457D4QEJFGR468F','0B829QEJHJR18F5')
,('JEDA98F8QEJCZQ4F6A','0B829QEJHJR18F5')
SELECT DISTINCT t1.[Sale ID]
,t1.[Group ID]
FROM @Table AS t1
JOIN (
SELECT [Group ID]
FROM @Table
GROUP BY [Group ID]
HAVING COUNT(DISTINCT [Sale ID]) > 1
) AS t2 ON t1.[Group ID] = t2.[Group ID]

"write a SQL Query that will give me only those records which has duplicate group IDs but with unique Sale IDs"
Here is how to do this verbatim:
SELECT t1.[Sale ID],
t1.[Group ID]
FROM yourTable t1
INNER JOIN
(
SELECT [Group ID]
FROM yourTable
GROUP BY [Group ID]
HAVING COUNT(*) > 1 AND -- duplicate group IDs
COUNT(DISTINCT [Sale ID]) = COUNT(*) -- but all sale IDs unique
) t2
ON t1.[Group ID] = t2.[Group ID]
However, your desired output seems to suggest something simpler:
SELECT DISTINCT [Sale ID], [Group ID]
FROM yourTable
or
SELECT [Sale ID], [Group ID]
FROM yourTable
GROUP BY [Sale ID], [Group ID]
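
The requirement "groups that contain more than one distinct Sale ID" can also be written with EXISTS instead of a derived table. The following is only a sketch, not one of the posted answers: the table variable @Sales is a hypothetical stand-in for the real table, loaded with a few rows from the question.

```sql
-- Hypothetical table variable standing in for the real table.
DECLARE @Sales TABLE (
    [Sale ID]  VARCHAR(100),
    [Group ID] VARCHAR(100)
);

INSERT INTO @Sales ([Sale ID], [Group ID]) VALUES
    ('JED93B53QEJYST4102', '01A42QEJAXT17A7'),
    ('JED93B53QEJYST4102', '01A42QEJAXT17A7'),
    ('JED658D9QEJIOS4F38', '053BDQEJNSS11D4'),
    ('JED99C53QEJNMR4973', '053BDQEJNSS11D4'),
    ('JED658D9QEJIOS4F38', '053BDQEJNSS11D4');

-- Keep a row only if the same group also holds a different Sale ID;
-- DISTINCT collapses the repeated (Sale ID, Group ID) pairs.
SELECT DISTINCT t1.[Sale ID], t1.[Group ID]
FROM @Sales AS t1
WHERE EXISTS (
    SELECT 1
    FROM @Sales AS t2
    WHERE t2.[Group ID] = t1.[Group ID]
      AND t2.[Sale ID] <> t1.[Sale ID]
);
```

With this sample data only the two rows for group 053BDQEJNSS11D4 come back, because group 01A42QEJAXT17A7 contains a single Sale ID repeated.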

Related

SQL Server : restricting data range

The query below is supposed to show details for 2 types of products: DIS001 and DIS002.
When a DIS002 sale exists, it should "reset" the query, so that it only shows DIS001 products which were sold after the date when DIS002 was sold.
To be honest, I'm not even sure if this is possible. I'll be grateful for any suggestions.
SELECT DISTINCT
Sales.RaisedDateTime AS [Date],
Contacts.ContactID AS [Contact ID],
Contacts.SiteID AS [Site ID],
Sales.ProductID AS [Product],
CASE
WHEN Sales.ProductID = 'DIS002'
THEN Sales.RaisedDate
ELSE CONVERT(DATETIME, CONVERT(VARCHAR(10), '2019-10-28', 101) + ' 00:00:00')
END AS [Start Date]
FROM
((Bookings.Bookings Bookings
INNER JOIN
Contacts.Contacts Contacts ON (Bookings.ContactID = Contacts.ContactID))
INNER JOIN
Sales.Sales Sales ON (Bookings.ContactID = Sales.ContactID))
WHERE
(Sales.ProductID = 'DIS001' AND
Sales.RaisedDate >= MAX([Start Date])

There are many different ways to write this, but here is one:
;with DIS002 as (
select ContactID, max(RaisedDate) as DIS002Date
from Sales.Sales
where ProductID = 'DIS002'
group by ContactID
)
SELECT DISTINCT
Sales.RaisedDateTime AS [Date],
Contacts.ContactID AS [Contact ID],
Contacts.SiteID AS [Site ID],
Sales.ProductID AS [Product]
FROM
((Bookings.Bookings Bookings
INNER JOIN
Contacts.Contacts Contacts ON (Bookings.ContactID = Contacts.ContactID))
INNER JOIN
Sales.Sales Sales ON (Bookings.ContactID = Sales.ContactID))
LEFT JOIN
DIS002 on DIS002.ContactID = Sales.ContactID
WHERE
Sales.ProductID = 'DIS001' AND
Sales.RaisedDate >= isnull(DIS002date,'1900-01-01')
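
To see the LEFT JOIN plus ISNULL pattern in isolation, here is a reduced sketch. It assumes a hypothetical table variable @Sales with only three columns and invented rows; the Bookings and Contacts joins from the original query are left out.

```sql
-- Hypothetical, simplified stand-in for Sales.Sales.
DECLARE @Sales TABLE (
    ContactID  INT,
    ProductID  VARCHAR(10),
    RaisedDate DATE
);

INSERT INTO @Sales (ContactID, ProductID, RaisedDate) VALUES
    (1, 'DIS001', '2019-01-05'),  -- before the reset, should be hidden
    (1, 'DIS002', '2019-03-01'),  -- the "reset" sale for contact 1
    (1, 'DIS001', '2019-04-10'),  -- after the reset, should be shown
    (2, 'DIS001', '2019-02-20');  -- contact 2 has no DIS002 sale at all

-- Latest DIS002 date per contact.
WITH LastReset AS (
    SELECT ContactID, MAX(RaisedDate) AS ResetDate
    FROM @Sales
    WHERE ProductID = 'DIS002'
    GROUP BY ContactID
)
SELECT s.ContactID, s.ProductID, s.RaisedDate
FROM @Sales AS s
LEFT JOIN LastReset AS r
    ON r.ContactID = s.ContactID
WHERE s.ProductID = 'DIS001'
  -- Contacts without a DIS002 sale fall back to a very early date,
  -- so all of their DIS001 rows pass the filter.
  AND s.RaisedDate >= ISNULL(r.ResetDate, '19000101')
ORDER BY s.ContactID, s.RaisedDate;
```

For contact 1 only the 2019-04-10 sale is returned; for contact 2 the single DIS001 sale is returned because the LEFT JOIN finds no reset date.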

SQL: How can I get from grouped table also one first row from each group

I have the following table
CREATE TABLE [dbo].[Table_1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [nchar](10) NULL,
[DateOf] [nvarchar](20) NULL
) ON [PRIMARY]
and data as:
ID|Name|DateOf
1|A|2016-11-29 00:01:00
2|A|2016-11-29 00:02:00
3|A|2016-11-29 00:03:00
4|B|2016-11-29 00:01:00
5|B|2016-11-29 00:02:00
If I run
select name, COUNT(name) from Table_1 group by Name
I'll have
A|3
B|2
So, how do I get the same result, plus one row taken from each group, sorted?
A|3|2016-11-29 00:01:00
B|2|2016-11-29 00:01:00
where the last column is sorted as a datetime (right now it is nvarchar)

You can use a CTE and the ranking function ROW_NUMBER with PARTITION BY:
WITH CTE AS
( select name, dateof,
rn = row_number() over (partition by NAME order by dateof desc)
from Table_1
)
SELECT name, dateof FROM CTE WHERE RN = 1
Or
select * from (
select name, dateof, ROW_NUMBER() over(partition by NAME order by dateof desc) as rnk
from Table_1
) a where rnk=1
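
The two queries above return the name and date only, and ordering by dateof desc picks the latest row per name, whereas the expected output in the question lists the earliest. A possible variant that also carries the count is sketched below. It assumes SQL Server 2012 or later for TRY_CONVERT, uses a hypothetical table variable @Table_1 in place of the real table, and the CTE and column names (Ranked, Cnt, DateOfDt) are illustrative.

```sql
-- Hypothetical table variable mirroring dbo.Table_1 from the question.
DECLARE @Table_1 TABLE (
    ID     INT IDENTITY(1,1) NOT NULL,
    Name   NCHAR(10) NULL,
    DateOf NVARCHAR(20) NULL
);

INSERT INTO @Table_1 (Name, DateOf) VALUES
    (N'A', N'2016-11-29 00:01:00'),
    (N'A', N'2016-11-29 00:02:00'),
    (N'A', N'2016-11-29 00:03:00'),
    (N'B', N'2016-11-29 00:01:00'),
    (N'B', N'2016-11-29 00:02:00');

WITH Ranked AS (
    SELECT Name,
           -- group size, repeated on every row of the group
           COUNT(*) OVER (PARTITION BY Name) AS Cnt,
           -- convert the nvarchar value so it sorts as a date/time
           TRY_CONVERT(DATETIME2(0), DateOf) AS DateOfDt,
           -- ascending order: rn = 1 is the earliest row per name
           ROW_NUMBER() OVER (PARTITION BY Name
                              ORDER BY TRY_CONVERT(DATETIME2(0), DateOf)) AS rn
    FROM @Table_1
)
SELECT Name, Cnt, DateOfDt
FROM Ranked
WHERE rn = 1
ORDER BY Name;
```

This returns one row per name with its count and earliest timestamp. TRY_CONVERT yields NULL for values it cannot parse, so malformed strings would need separate handling.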

You can use CROSS APPLY:
select t1.name, t1.[Count], t2.MinDate
from (
select name, count(*) as [Count]
from Table_1
group by name
) AS t1
cross apply (
select min(DateOf) as MinDate
from Table_1 t2
where t1.Name = t2.Name
) as t2
If you want more data from the row than just the min date, you can modify the subselect:
select t1.name, t1.[Count], t2.MinDate, t2.ID
from (
select name, count(*) as [Count]
from Table_1
group by name
) AS t1
cross apply (
select top 1 DateOf as MinDate, ID
from Table_1 t2
where t1.Name = t2.Name
order by DateOf
) as t2

SELECT T1.Name,A.[COUNT],MIN(DateOf)
FROM Your_table_Name T1
JOIN
(
SELECT COUNT(*) [COUNT],T2.Name [Name]
FROM Your_table_Name T2
GROUP BY T2.Name
)A ON A.Name = T1.Name
GROUP BY T1.Name,A.[COUNT]

Select Multiple Columns join SQL Server

I have an Employee table like this
And a second table for EmployeeComments with the EmployeeID as foreign key:
I would like to query the employees with their comments in the following format:
select Name, Comment
from Employee emp
left join EmployeeComments empC on empC.EmployeeID = emp.ID
I would like the results to be like:
I have already looked at PIVOT, but it doesn't resolve my issue.

Use a window function:
select case when row_number() over(partition by emp.name order by empC.ID) = 1
then Name
else '' end as Name,
Comment
from Employee emp
left join EmployeeComments empC On empC.EmployeeID = emp.ID

This might give you some kind of order in your result window at least
WITH cte AS(
SELECT emp.Name ,
empC.Comment,
RANK() OVER (ORDER BY emp.Name) NameOrder,
ROW_NUMBER() OVER (PARTITION BY emp.Name ORDER BY empC.ID) RN
FROM Employee emp
LEFT JOIN EmployeeComments empC ON empC.EmployeeID = emp.ID
)
SELECT
Name = (CASE WHEN RN = 1 THEN Name ELSE '' END),
Comment
FROM
cte
ORDER BY
NameOrder,
RN
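
The sample tables from the question are not reproduced here, so the following sketch uses invented data to show the "print the name only on the first row" technique end to end. The table variables, their columns, and the rows are all assumptions. It partitions by the employee ID rather than the name, so two employees who share a name are not merged into one block.

```sql
-- Hypothetical tables and rows; the question's real data is not available.
DECLARE @Employee TABLE (ID INT, Name VARCHAR(50));
DECLARE @EmployeeComments TABLE (ID INT, EmployeeID INT, Comment VARCHAR(200));

INSERT INTO @Employee (ID, Name) VALUES
    (1, 'Alice'),
    (2, 'Bob');

INSERT INTO @EmployeeComments (ID, EmployeeID, Comment) VALUES
    (1, 1, 'First comment'),
    (2, 1, 'Second comment'),
    (3, 2, 'Only comment');

WITH Numbered AS (
    SELECT emp.ID AS EmployeeID,
           emp.Name,
           empC.Comment,
           -- number each employee's comments in insertion (ID) order
           ROW_NUMBER() OVER (PARTITION BY emp.ID ORDER BY empC.ID) AS rn
    FROM @Employee AS emp
    LEFT JOIN @EmployeeComments AS empC
        ON empC.EmployeeID = emp.ID
)
SELECT CASE WHEN rn = 1 THEN Name ELSE '' END AS Name,
       Comment
FROM Numbered
-- an explicit ORDER BY keeps each employee's rows together
ORDER BY EmployeeID, rn;
```

With this data the output is three rows: Alice with her first comment, a blank name with her second comment, then Bob with his only comment.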

Use a cross join:
Query:
select case t.cnt
when 1 then t.Name
else ''
end as Name, t.comment
from
(
select t1.Name, t2.comment, row_number()
over(partition by t1.Name order by t2.ID)
as cnt
from
Employee t1
cross join
EmployeeComments t2
where t1.ID = t2.EmployeeID
)t

Add other column (time) to DISTINCT rows

I am selecting unique rows with
SELECT DISTINCT
LogContent
FROM
[WebAppLog] WITH (NOLOCK)
WHERE
LogName = 'frontendErrorLog'
But how do I pair the result with another column? I want to select each unique LogContent and assign it the corresponding earliest LogTime, just like in:
SELECT
LogTime, LogContent
FROM
[WebAppLog] WITH (NOLOCK)
WHERE
LogName = 'frontendErrorLog'

@Ultra
Would a grouping work for you, like this? MIN(LogTime) gives the earliest time for each LogContent.
SELECT LogContent,MIN(LogTime) LogTime
FROM [WebAppLog] WITH (NOLOCK)
WHERE LogName = 'frontendErrorLog'
GROUP BY LogContent

You can use ROW_NUMBER:
WITH CTE AS
(
SELECT LogTime, LogContent,
rn = ROW_NUMBER() OVER (PARTITION BY LogContent ORDER BY LogTime ASC)
FROM [WebAppLog] WITH (NOLOCK)
WHERE LogName = 'frontendErrorLog'
)
SELECT LogTime, LogContent FROM CTE WHERE rn = 1

Place Data from Same Column in Different Columns in Resultset

I have a request I wasn't sure how to handle. I was thinking of using PIVOT, but I wasn't sure if that would be the way to go.
I have the following Data:
EmployeeA, DepartmentB, 1/10/2010
EmployeeA, DepartmentA, 1/1/2000
EmployeeB, DepartmentC, 1/3/2011
They want output for only the employees that have been in different departments. Something that looks like this (order is important due to the dates):
EmployeeA, DepartmentA, DepartmentB
Any help is appreciated. For some reason, my mind isn't finding a good solution.

You can do this by doing a self JOIN on the table and then using a PIVOT to get the data in the format that you want:
SELECT *
FROM
(
SELECT t1.emp, t1.dept, t1.dt
FROM test t1
INNER JOIN test t2
ON t1.emp = t2.emp
AND t1.dept != t2.dept
) x
PIVOT
(
min(dt)
for dept in ([A], [B], [C], [D], [E])
) p
See SQL Fiddle with Demo
If you remove the JOIN you will get all records, but you stated you only want the records that have been in more than one department.

Here's the answer I arrived at, largely based on your work. PIVOT doesn't work because I don't know the categories (in this case Department) ahead of time and I can only have two of them.
Maybe there's an easier way. I didn't use a CTE, because I believe this should work for Sybase as well which I don't think supports that.
select Meta1.[Employee ID],
Meta1.Department as PreviousDepartment,
Meta2.Department as CurrentDepartment
from
(
SELECT t1.[First Name], t1.[Last Name],
t1.[Employee ID], t1.Department, t1.[Hire Date],
ROW_NUMBER() over(PARTITION by t1.[EMPLOYEE ID] order by t1.[Hire Date]) as RowNum
FROM EMPLOYEE t1
INNER JOIN EMPLOYEE t2
ON t1.[Employee ID] = t2.[Employee ID]
AND t1.Department != t2.Department
) Meta1
inner join
(
SELECT t1.[Employee ID], t1.Department, t1.[Hire Date],
ROW_NUMBER() over(PARTITION by t1.[EMPLOYEE ID] order by t1.[Hire Date]) as RowNum
FROM EMPLOYEE t1
INNER JOIN EMPLOYEE t2
ON t1.[Employee ID] = t2.[Employee ID]
AND t1.Department != t2.Department
) Meta2
on Meta1.[Employee ID]=Meta2.[Employee ID]
where Meta1.RowNum=1
and Meta2.RowNum=2
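
If portability to Sybase is not required, the previous/current pairing can also be sketched with LAG, which is available in SQL Server 2012 and later. This is an alternative technique, not the approach posted above. The table variable @EmpDept and its column names are hypothetical, and the dates from the question are written in ISO format on the assumption that they are month/day/year.

```sql
-- Hypothetical table variable holding the three rows from the question.
DECLARE @EmpDept TABLE (
    Emp  VARCHAR(20),
    Dept VARCHAR(20),
    Dt   DATE
);

INSERT INTO @EmpDept (Emp, Dept, Dt) VALUES
    ('EmployeeA', 'DepartmentB', '2010-01-10'),
    ('EmployeeA', 'DepartmentA', '2000-01-01'),
    ('EmployeeB', 'DepartmentC', '2011-01-03');

WITH Ordered AS (
    SELECT Emp,
           Dept,
           -- department on the employee's previous row by date, NULL on the first row
           LAG(Dept) OVER (PARTITION BY Emp ORDER BY Dt) AS PrevDept
    FROM @EmpDept
)
SELECT Emp,
       PrevDept AS PreviousDepartment,
       Dept     AS CurrentDepartment
FROM Ordered
WHERE PrevDept IS NOT NULL   -- skip each employee's first row
  AND PrevDept <> Dept;      -- keep only actual department changes
```

With the sample rows this returns EmployeeA, DepartmentA, DepartmentB. An employee with more than two departments would produce one row per change rather than a single row.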
