Remove duplicates based on column in inner join - sql-server

This query returns exactly what I want, except that some rows need to be removed: the inner join matches multiple rows when I only want the first match.
select table1.IDa, table1.IDb, table1.name,
table1b.IDa, table1b.IDb, table1b.name
from
(select IDa,IDb,name from mytable) table1
inner join
(select IDa,IDb,name from mytable) table1b
ON
table1.IDa = table1b.IDa
and table1.IDb = table1b.IDb
order By table1.IDa
So I'm getting this:
IDa IDb name IDa IDb name
1 1 bob 1 1 public
1 1 bob 1 1 smith
1 2 sally 1 2 jones
2 1 nancy 2 1 dole
But I want to receive this:
IDa IDb name IDa IDb name
1 1 bob 1 1 public
1 2 sally 1 2 jones
2 1 nancy 2 1 dole
I only want the first match for the IDa+IDb combination returned.

Based on the asker's comment:
"That would be the oldest entry into the database, it would also be the same as order by IDa,IDb. It would also be the first match seen in the returned results"
Try this query:
select table1.IDa, table1.IDb, table1.name,
table1b.IDa, table1b.IDb, table1b.name
from
(select IDa,IDb,name from mytable) table1
inner join
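-- PARTITION BY restarts the numbering for each IDa+IDb pair, so r = 1 keeps one row per pair.
-- Rows within a pair tie on this ORDER BY; order by an insertion-time column if the table has one.
(select IDa,IDb,name, ROW_NUMBER() OVER(PARTITION BY IDa,IDb ORDER BY IDa,IDb) as r from mytable ) table1b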
ON
table1.IDa = table1b.IDa
and table1.IDb = table1b.IDb
and table1b.r=1
order By table1.IDa

As per your comments this should work. However, Smith and Public have the same IDa and IDb values; I hope that is a data issue.
;WITH cte
AS (SELECT rn=Row_number()OVER(partition BY table1.IDa, table1.IDb ORDER BY table1.IDa, table1.IDb),
table1.IDa AS t1_ida,
table1.IDb AS t1_idb,
table1.name AS t1_name,
table1b.IDa AS t2_ida,
table1b.IDb AS t2_idb,
table1b.name AS t2_name
FROM mytable table1
INNER JOIN mytable table1b
ON table1.IDa = table1b.IDa
AND table1.IDb = table1b.IDb)
SELECT *
FROM cte
WHERE rn = 1
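The "first match per IDa+IDb" idea from the answers above can be sketched as follows. This is only an illustration, written in Python with the standard-library sqlite3 module so it can be run as-is (it assumes SQLite 3.25 or later for window functions). The tables `first_names` and `last_names` and the `id` insertion-order column are hypothetical stand-ins for the asker's data; the SELECT itself should carry over to SQL Server largely unchanged.

```python
import sqlite3


def first_match_per_pair():
    """Return one joined row per (IDa, IDb), keeping the oldest match."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Hypothetical tables; id stands in for insertion order.
        CREATE TABLE first_names (IDa INT, IDb INT, name TEXT);
        CREATE TABLE last_names (id INTEGER PRIMARY KEY, IDa INT, IDb INT, name TEXT);
        INSERT INTO first_names VALUES (1, 1, 'bob'), (1, 2, 'sally'), (2, 1, 'nancy');
        INSERT INTO last_names (IDa, IDb, name) VALUES
            (1, 1, 'public'), (1, 1, 'smith'), (1, 2, 'jones'), (2, 1, 'dole');
    """)
    rows = con.execute("""
        SELECT f.IDa, f.IDb, f.name, l.name
        FROM first_names AS f
        INNER JOIN (
            -- Number the candidates inside each (IDa, IDb) pair, oldest first.
            SELECT IDa, IDb, name,
                   ROW_NUMBER() OVER (PARTITION BY IDa, IDb ORDER BY id) AS r
            FROM last_names
        ) AS l
          ON l.IDa = f.IDa AND l.IDb = f.IDb AND l.r = 1
        ORDER BY f.IDa, f.IDb
    """).fetchall()
    con.close()
    return rows


print(first_match_per_pair())
```

The PARTITION BY clause is what restarts the numbering for every pair; a ROW_NUMBER() without it numbers the whole table once, so only a single row overall would get r = 1.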

Related

Select the sum of a count in select statement where date between

I have a query that selects the count of tickets for each day. I want to modify it so that it returns the totals for a range between two dates.
The first query is as follows:
SELECT ROW_NUMBER() OVER (ORDER BY q.english_Name DESC) as id,
COUNT(t.id) AS ticket,
q.english_name queue_name,
ts.code current_status,
COUNT(t.assigned_to) AS assigned,
(COUNT(t.id)-COUNT(t.assigned_to)) AS not_assigned
,trunc(t.create_date) create_Date
FROM ticket t
INNER JOIN ref_queue q
ON (q.id = t.queue_id)
INNER JOIN ref_ticket_status ts
ON(ts.id=t.current_status_id)
GROUP BY q.english_name,
ts.code
,trunc(t.create_date)
But when I modify it to:
SELECT ROW_NUMBER() OVER (ORDER BY q.english_Name DESC) as id,
COUNT(t.id) AS ticket,
q.english_name queue_name,
ts.code current_status,
COUNT(t.assigned_to) AS assigned,
(COUNT(t.id)-COUNT(t.assigned_to)) AS not_assigned
,trunc(t.create_date) create_Date
FROM ticket t
INNER JOIN ref_queue q
ON (q.id = t.queue_id)
INNER JOIN ref_ticket_status ts
ON(ts.id=t.current_status_id)
where t.create_date between '18-FEB-19' and '24-FEB-19'
GROUP BY q.english_name,
ts.code
,trunc(t.create_date)
the output is:
1 1 Technical Support Sec. CLOSED 0 1 19-FEB-19
2 6 Technical Support Sec. OPEN 4 2 18-FEB-19
3 1 Technical Support Sec. OPEN 0 1 21-FEB-19
4 3 Network Sec. OPEN 2 1 18-FEB-19
5 1 Network Sec. OPEN 0 1 21-FEB-19
How can I get the totals across all the days, so that the output is:
1 7 Technical Support Sec. OPEN 4 3
2 4 Network Sec. OPEN 2 2

When you GROUP BY in a query, the result set includes one row for every distinct combination of values in the GROUP BY list. For example, you are getting two rows for the OPEN records of "Technical Support Sec." because there are two distinct values of TRUNC(t.create_date), which produce two groups and therefore two rows in the result set.
To avoid that, stop grouping by TRUNC(t.create_date) and drop it from the select list.
SELECT ROW_NUMBER() OVER (ORDER BY q.english_Name DESC) as id,
COUNT(t.id) AS ticket,
q.english_name queue_name,
ts.code current_status,
COUNT(t.assigned_to) AS assigned,
(COUNT(t.id)-COUNT(t.assigned_to)) AS not_assigned
-- ,trunc(t.create_date) create_Date
FROM ticket t
INNER JOIN ref_queue q
ON (q.id = t.queue_id)
INNER JOIN ref_ticket_status ts
ON(ts.id=t.current_status_id)
where t.create_date between '18-FEB-19' and '24-FEB-19'
GROUP BY q.english_name,
ts.code
-- ,trunc(t.create_date)
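The effect of the GROUP BY list described in this answer can be sketched with a small runnable example. It is written in Python with the standard-library sqlite3 module rather than Oracle, and it makes several assumptions for brevity: a single denormalized `ticket` table, a made-up assignee name, and days stored as ISO text so that BETWEEN compares them correctly as strings.

```python
import sqlite3


def ticket_totals():
    """Return (per-day counts, counts with the day removed from GROUP BY)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ticket (queue_name TEXT, status TEXT, assigned_to TEXT, create_day TEXT)"
    )
    # Hypothetical rows: three tickets on 18 Feb (two assigned), one on 21 Feb.
    data = [("Network Sec.", "OPEN", "amy", "2019-02-18")] * 2 + [
        ("Network Sec.", "OPEN", None, "2019-02-18"),
        ("Network Sec.", "OPEN", None, "2019-02-21"),
    ]
    con.executemany("INSERT INTO ticket VALUES (?, ?, ?, ?)", data)

    # Grouping by the day as well yields one row per day.
    per_day = con.execute("""
        SELECT queue_name, status, create_day, COUNT(*), COUNT(assigned_to)
        FROM ticket
        WHERE create_day BETWEEN '2019-02-18' AND '2019-02-24'
        GROUP BY queue_name, status, create_day
        ORDER BY create_day
    """).fetchall()

    # Dropping the day from GROUP BY collapses those rows into one total.
    total = con.execute("""
        SELECT queue_name, status, COUNT(*), COUNT(assigned_to),
               COUNT(*) - COUNT(assigned_to)
        FROM ticket
        WHERE create_day BETWEEN '2019-02-18' AND '2019-02-24'
        GROUP BY queue_name, status
    """).fetchall()
    con.close()
    return per_day, total


print(ticket_totals())
```

COUNT(assigned_to) skips NULLs, which is why subtracting it from the row count gives the number of unassigned tickets.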

T-SQL query to show all the past steps, active and future steps

I have 3 tables in SQL Server:
map_table: (workflow map path)
stepId step_name
----------------
1 A
2 B
3 C
4 D
5 E
history_table:
stepId timestamp author
----------------------------
1 9:00am John
2 9:20am Mary
current_stageTable:
Id currentStageId waitingFor
------------------------------------
12345 3 Kat
I would like to write a query to show the map with the workflow status. Like this result here:
step name time author
----------------------------
1 A 9:00am John
2 B 9:20am Mary
3 C waiting Kat
4 D
5 E
I tried a left join:
select
m.stepId, m.step_name, h.timestamp, h.author
from
map_table m
left join
history_table h on m.stepId = h.stepId
I thought it would list all the records from the map table, since I am using a left join, but somehow it only shows the 3 records that come from the history table.
So I changed it to:
select
m.stepId, m.step_name, h.timestamp, h.author
from
map_table m
left join
history_table h on m.stepId = h.stepId
union
select
m.stepId, m.step_name, '' as timestamp, '' as author
from
map_table m
where
m.stepId not in (select stepId from history_table)
order by
m.stepId
Then it lists the result almost as I expected, but how do I add the third table to show the current active stage?
Thank you very much for all your help!! Much appreciated.

This looks like what you asked for:
with map_table as (
select * from (values (1,'A')
,(2,'B')
,(3,'C')
,(4,'D')
,(5,'E')) t(stepId, step_name)
)
, history_table as (
select * from (values
(1,'9:00am','John')
,(2,'9:20am','Mary')) t(stepId, timestamp, author)
)
, current_stageTable as (
select * from (values (12345, 3, 'Kat')) t(Id, currentStageId, waitingFor)
)
select
m.stepId, m.step_name
, time = coalesce(h.timestamp, case when c.waitingFor is not null then 'waiting' end)
, author = coalesce(h.author, c.waitingFor)
from
map_table m
left join history_table h on m.stepId = h.stepId
left join current_stageTable c on m.stepId = c.currentStageId

I think a union fits the data well and avoids coalescing values across multiple joins.
with timeline as (
select stepId, "timestamp" as ts, author from history_table
union all
select currentStageId, 'waiting', waitingFor from current_stageTable
)
select m.stepId, m.step_name, t.ts, t.author
from
map_table as m left outer join timeline as t
on t.stepId = m.stepId
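The union approach can be checked end to end with the sample data from the question. The sketch below uses Python's standard-library sqlite3 module in place of SQL Server and assumes the time values are stored as plain text (which is what lets 'waiting' share a column with them); the table and column names are the ones from the question.

```python
import sqlite3


def workflow_status():
    """List every workflow step with its history entry or waiting marker."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE map_table (stepId INT, step_name TEXT);
        CREATE TABLE history_table (stepId INT, "timestamp" TEXT, author TEXT);
        CREATE TABLE current_stageTable (Id INT, currentStageId INT, waitingFor TEXT);
        INSERT INTO map_table VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'E');
        INSERT INTO history_table VALUES (1,'9:00am','John'), (2,'9:20am','Mary');
        INSERT INTO current_stageTable VALUES (12345, 3, 'Kat');
    """)
    rows = con.execute("""
        WITH timeline AS (
            -- Completed steps and the active step, stacked into one shape.
            SELECT stepId, "timestamp" AS ts, author FROM history_table
            UNION ALL
            SELECT currentStageId, 'waiting', waitingFor FROM current_stageTable
        )
        SELECT m.stepId, m.step_name, t.ts, t.author
        FROM map_table AS m
        LEFT JOIN timeline AS t ON t.stepId = m.stepId
        ORDER BY m.stepId
    """).fetchall()
    con.close()
    return rows


for row in workflow_status():
    print(row)
```

Future steps (D and E here) have no row in the timeline, so the left join returns NULL for their time and author, which Python shows as None.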

Get distinct data from 2 tables using join

Below is my query. I take the social security number from the Employee table and match it with the child_excel table. Using that I get the employee ID and match it against the EmployeeCh table (which has the employee ID). The employee has 2 children, so when I add the gender join I get 4 records: 2 for each child, one per gender (i.e. Male, Female). I want only 2 rows, 1 for each child, with their respective genders.
SELECT distinct
[SSN],
empch2.GenderID,
gen.Name1
FROM [child_excel] as t
INNER JOIN Employee as empch on t.SSN = empch.SocialSecurityNo
INNER JOIN EmployeeCh as empch2 on empch.ID = empch2.EmployeeID
INNER JOIN Gender as gen on empch2.GenderID = gen.ID
I am getting this output:
12345 1 Male
12345 2 Female
99999 1 Male
99999 2 Female
The expected output is:
12345 1 Male
99999 2 Female
When I add First Name to the join, it gives the proper output, but I don't want to use First Name:
SELECT distinct
[SSN],
empch2.GenderID,
gen.Name1
FROM [child_excel] as t
INNER JOIN Employee as empch on t.SSN = empch.SocialSecurityNo
INNER JOIN EmployeeCh as empch2 on empch.ID = empch2.EmployeeID and t.First_Name= empch2.FirstName
INNER JOIN Gender as gen on empch2.GenderID = gen.ID

Use ROW_NUMBER():
Select * from (
SELECT distinct
[SSN],
empch2.GenderID,
gen.Name1,
ROW_NUMBER() OVER (PARTITION BY [SSN] ORDER BY empch2.GenderID) AS RN
FROM [child_excel] as t
INNER JOIN Employee as empch on t.SSN = empch.SocialSecurityNo
INNER JOIN EmployeeCh as empch2 on empch.ID = empch2.EmployeeID and t.First_Name= empch2.FirstName
INNER JOIN Gender as gen on empch2.GenderID = gen.ID )T
WHERE T.RN = 1

How to select distinct values after left outer join operation

I want to select some values from three tables with aggregate function but without duplicates in one of the columns, for example:
select t3.ValueDesc as FeatureType,
count(t2.Strategic) as TotalCount
,t2.RequestID,t1.StoryID --these are not needed, but put for better vision
from tblRequests t2
left outer join (select * from tblAgileMultiDD where Type=18) t3
on t3.FormulaValue = t2.Strategic
left outer join tblAgileStory t1
on t1.Feature = t2.RequestID
where t2.RequestID > 0
and t1.DemoStatus = 1
group by t3.ValueDesc
,t2.RequestID, t1.StoryID --these are not needed but put for better vision
order by t3.ValueDesc
And then it returns me something like this:
FeatureType TotalCount RequestID StoryID
Protect Base 1 311 1629
Protect Base 1 311 1630
Protect Base 1 312 1631
Protect Base 1 312 1637
New Market 1 313 1640
New Market 1 313 1645
And if I comment out lines with ",t2.RequestID, t1.StoryID", it gives me the result:
FeatureType TotalCount
Protect Base 4
New Market 2
So, for each unique combination of RequestID and StoryID it returns a new row. How can I make it return a new row only for each unique RequestID, regardless of StoryID?
I want this query to return a result like this:
FeatureType TotalCount
Protect Base 2 (for RequestID = 311, 312)
New Market 1 (for RequestID = 313)
Putting the word "distinct" at the beginning has no effect on it.
Can you help me with this?

select distinct FeatureType,TotalCount from (
select t3.ValueDesc as FeatureType,
count(t2.Strategic) as TotalCount
,t2.RequestID
-- ,t1.StoryID --these are not needed, but put for better vision
from tblRequests t2
left outer join (select * from tblAgileMultiDD where Type=18) t3
on t3.FormulaValue = t2.Strategic
left outer join tblAgileStory t1
on t1.Feature = t2.RequestID
where t2.RequestID > 0
and t1.DemoStatus = 1
group by t3.ValueDesc
,t2.RequestID
-- , t1.StoryID --these are not needed but put for better vision
) as T
order by FeatureType -- t3 is not visible outside the derived table, so order by the output column
Could you try this?
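An alternative worth considering, not taken from the answer above, is to count distinct RequestID values per feature type, so that several stories on one request are counted once. The sketch below runs the idea in Python's standard-library sqlite3 module; the table names come from the question, while the Strategic/FormulaValue codes 1 and 2 are made up for illustration.

```python
import sqlite3


def feature_counts():
    """Count each request once per feature type, however many stories it has."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tblRequests (RequestID INT, Strategic INT);
        CREATE TABLE tblAgileMultiDD (Type INT, FormulaValue INT, ValueDesc TEXT);
        CREATE TABLE tblAgileStory (StoryID INT, Feature INT, DemoStatus INT);
        -- The codes 1 and 2 are hypothetical; the IDs mirror the question's output.
        INSERT INTO tblRequests VALUES (311, 1), (312, 1), (313, 2);
        INSERT INTO tblAgileMultiDD VALUES (18, 1, 'Protect Base'), (18, 2, 'New Market');
        INSERT INTO tblAgileStory VALUES
            (1629, 311, 1), (1630, 311, 1), (1631, 312, 1),
            (1637, 312, 1), (1640, 313, 1), (1645, 313, 1);
    """)
    rows = con.execute("""
        SELECT t3.ValueDesc AS FeatureType,
               COUNT(DISTINCT t2.RequestID) AS TotalCount
        FROM tblRequests AS t2
        LEFT OUTER JOIN tblAgileMultiDD AS t3
          ON t3.FormulaValue = t2.Strategic AND t3.Type = 18
        LEFT OUTER JOIN tblAgileStory AS t1
          ON t1.Feature = t2.RequestID
        WHERE t2.RequestID > 0 AND t1.DemoStatus = 1
        GROUP BY t3.ValueDesc
        ORDER BY t3.ValueDesc
    """).fetchall()
    con.close()
    return rows


print(feature_counts())
```

Note that filtering on t1.DemoStatus in the WHERE clause discards requests with no matching story, so the second left join behaves like an inner join here, as it does in the original query.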

Select Count Top Inner Join and Where Clause in SQL

This is my Query:
SELECT TOP 3 tablestudentanswer.examid,
tablestudentanswer.studentid,
tablestudentanswer.itemno,
tablestudentanswer.studentanswer,
tablescore.score
FROM tablestudentanswer
INNER JOIN tablescore
ON tablestudentanswer.studentid = tablescore.studentid
AND tablestudentanswer.examid = tablescore.examid
WHERE tablestudentanswer.examid = 1
AND tablestudentanswer.itemno = 1
ORDER BY tablescore.score ASC
It returns this table:
ExamID StudentID ItemNo StudentAnswer Score
1006 1 1 A 25
1005 1 2 B 30
1004 1 3 A 35
What I want is for it to return 2 if StudentAnswer = 'A' and 1 if StudentAnswer = 'B'.
There is nothing wrong with the query above; what I am asking is what I should add to it.
I have this, which in my mind should return 2, but it gives an error:
Select COUNT(*) From (
Select Top 3 TableStudentAnswer.ExamID, TableStudentAnswer.StudentID, TableStudentAnswer.ItemNo, TableStudentAnswer.StudentAnswer, TableScore.Score
from TableStudentAnswer
Inner join TableScore on TableStudentAnswer.StudentID=TableScore.StudentID and TableStudentAnswer.ExamID=TableScore.ExamID
where TableStudentAnswer.ExamID=1 and TableStudentAnswer.ItemNo=1
Order By TableScore.Score Asc) where TableStudentAnswer.StudentAnswer = 'A'
It should return:
2
Please help me!

Will this do?
SELECT TOP 3 tablestudentanswer.examid,
tablestudentanswer.studentid,
tablestudentanswer.itemno,
tablestudentanswer.studentanswer,
tablescore.score,
case
when tablestudentanswer.studentanswer = 'A' then 2
when tablestudentanswer.studentanswer = 'B' then 1
else NULL
end as [MyColumn]
FROM tablestudentanswer
INNER JOIN tablescore
ON tablestudentanswer.studentid = tablescore.studentid
AND tablestudentanswer.examid = tablescore.examid
WHERE tablestudentanswer.examid = 1
AND tablestudentanswer.itemno = 1
ORDER BY tablescore.score ASC
Your question is a bit unclear. Perhaps you want the number of answers for each value?
count(1) over (partition by tablestudentanswer.studentanswer)
This adds a column that gives, for each row in the result set, the number of rows sharing that row's studentanswer. However, note that this could be quite slow. If you can, you're better off using a normal group by.

Do you mean you would like the query to return the number of answers? If so, using COUNT may help.
SELECT tablestudentanswer.studentid,
tablestudentanswer.studentanswer,
COUNT(1) AS NumberOfAnswers
FROM tablestudentanswer
INNER JOIN tablescore
ON tablestudentanswer.studentid = tablescore.studentid
AND tablestudentanswer.examid = tablescore.examid
GROUP BY tablestudentanswer.studentid, tablestudentanswer.studentanswer
Please correct me if I am wrong.
By the way, why doesn't your result table include itemno even though you have it in your SELECT statement?
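If the goal is the one the asker's failing COUNT(*) query aims at (how many of the top three rows have a given answer), one possible shape is sketched below. In SQL Server a derived table must be given an alias, and the outer WHERE has to refer to the column through that alias rather than through TableStudentAnswer. The example runs in Python's standard-library sqlite3 module, so it uses LIMIT 3 where SQL Server would use SELECT TOP 3, and the rows are made up for illustration.

```python
import sqlite3


def count_answer_in_top3(answer):
    """Count how many of the three lowest-scoring rows gave `answer`."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE TableStudentAnswer (ExamID INT, StudentID INT, ItemNo INT, StudentAnswer TEXT);
        CREATE TABLE TableScore (ExamID INT, StudentID INT, Score INT);
        -- Hypothetical rows: four students answering item 1 of exam 1.
        INSERT INTO TableStudentAnswer VALUES
            (1, 1004, 1, 'A'), (1, 1005, 1, 'B'), (1, 1006, 1, 'A'), (1, 1007, 1, 'A');
        INSERT INTO TableScore VALUES
            (1, 1004, 35), (1, 1005, 30), (1, 1006, 25), (1, 1007, 40);
    """)
    (count,) = con.execute("""
        SELECT COUNT(*)
        FROM (
            SELECT a.StudentAnswer
            FROM TableStudentAnswer AS a
            INNER JOIN TableScore AS s
              ON a.StudentID = s.StudentID AND a.ExamID = s.ExamID
            WHERE a.ExamID = 1 AND a.ItemNo = 1
            ORDER BY s.Score ASC
            LIMIT 3            -- SQLite spelling; SQL Server uses SELECT TOP 3
        ) AS top3              -- the derived table gets an alias
        WHERE top3.StudentAnswer = ?
    """, (answer,)).fetchone()
    con.close()
    return count


print(count_answer_in_top3("A"), count_answer_in_top3("B"))
```

The filter on the answer is applied after the top three rows are chosen, so the fourth student's 'A' is not counted.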
