Return one row when multiples are returned - sql-server

Firstly apologies for this as a number of similar posts have been posted but I just can't seem to return what I would like
My data returns
desc | date | taken | result | text | notes | page | group | q | answer | value | state | time |
------------------------------------------------------------------------------
Asess1 | 20170101 | John | 5 | Injury | xxx | Page1 | Assess11 | 1 | 1234567 | 1 | 1 | 0 |
Asess1 | 20170101 | John | 5 | Injury | xxx | Page1 | Assess11 | 1 | 1234567 | 1 | 1 | 0 |
Asess1 | 20170101 | John | 5 | Injury | xxx | Page1 | Assess11 | 1 | 1234567 | 1 | 1 | 0 |
Asess1 | 20170101 | John | 5 | Injury | xxx | Page1 | Assess11 | 1 | 1234567 | 1 | 1 | 0 |
Asess1 | 20170101 | John | 5 | Injury | xxx | Page1 | Assess11 | 1 | 1234567 | 1 | 1 | 0 |
Code as follows
select t.desc,a.date,a.taken,a.result,a.text,a.notes,d.page,d.group,d.q,d.answer,d.value,d.state,d.timeSpanSeconds
from cc_clientAssessments a
inner join cs_assessmentData d on a.assessmentId=d.assessment
inner join cs_clients c on c.person=a.residentId
inner join cs_facilities f on f.guid=a.facilityId
inner join cs_assessmentTypes t on t.assessmentTypeId=a.assessmentTypeId
where c.surname='smith'
and f.name='home'
and t.description ='injury'
and a.dateTaken='2017-05-28 00:00:00.000'
and d.questionName='1'
and d.answer='1234567'
order by t.desc, a.date desc,d.page,d.group,d.q
any help would be great.

One (or more) of your joins is causing this duplication, because you have not been specific enough in your join criteria.
As other commenters have said, remove all your joins and then add them back in one by one until you start to see duplicates. Using select * you can see what additional data is being pulled back and therefore what additional filters you need to include in that join. Once you have no duplication, add in the next join and repeat the whole process.
This is the sensible way to resolve this as nine times out of ten you can stop this duplication with more specific join criteria, which will also ensure your query is processing less data and will therefore be more efficient.

Although the elegant solution is to fix your joins so they don't result in as many rows, a very quick fix would be to just use distinct to eliminate duplicates and convert your text field to an string, so it can be compared.
select distinct t.desc,a.date,a.taken,a.result,substring(a.text,1,512) as text,a.notes,d.page,d.group,d.q,d.answer,d.value,d.state,d.timeSpanSeconds
from cc_clientAssessments a
inner join cs_assessmentData d on a.assessmentId=d.assessment
inner join cs_clients c on c.person=a.residentId
inner join cs_facilities f on f.guid=a.facilityId
inner join cs_assessmentTypes t on t.assessmentTypeId=a.assessmentTypeId
where c.surname='smith'
and f.name='home'
and t.description ='injury'
and a.dateTaken='2017-05-28 00:00:00.000'
and d.questionName='1'
and d.answer='1234567'
order by t.desc, a.date desc,d.page,d.group,d.q

Related

SQL Server - identify combinations of values and assign combination identifier

I am trying to assign what amounts to a 'combinationid' to rows of my table, based on the values in the two columns below. Each product has a number of customers linked to it. For every combination of customers, I need to create a combination ID.
For example, the combination of customers for product 'a' is the same combination of customers for product 'c' (they both have customers 1, 2 and 3), so products a and c should have the same combination identifier ('customergroup'). However, products should not share the same customergroup if they only share some of the same customers - e.g. product b only has customers 1 and 2 (not 3), so should have a different customergroup to products 'a' and 'c'.
Input:
| productid | customerid |
|-----------|------------|
| a | 1 |
| a | 2 |
| a | 3 |
| b | 1 |
| b | 2 |
| c | 3 |
| c | 2 |
| c | 1 |
| d | 1 |
| d | 3 |
| e | 1 |
| e | 2 |
| f | 1 |
| g | 2 |
| h | 3 |
Desired output:
| productid | customerid | customergroup |
|-----------|------------|---------------|
| a | 1 | 1 |
| a | 2 | 1 |
| a | 3 | 1 |
| b | 1 | 2 |
| b | 2 | 2 |
| c | 3 | 1 |
| c | 2 | 1 |
| c | 1 | 1 |
| d | 1 | 3 |
| d | 3 | 3 |
| e | 1 | 2 |
| e | 2 | 2 |
| f | 1 | 4 |
| g | 2 | 5 |
| h | 3 | 6 |
or just
| productid | customergroupid |
|-----------|-----------------|
| a | 1 |
| b | 2 |
| c | 1 |
| d | 3 |
| e | 2 |
| f | 4 |
| g | 5 |
| h | 6 |
Edit: first version of this did include a description of my attempts. I currently have nested queries that basically give me a column for customer 1, 2, 3 etc and then uses dense rank to get the grouping. The problem is that is not dynamic for different numbers of customers and I did not know where to start for getting a dynamic result as above. Thanks for the replies.
Considering you haven't shown your efforts, or confirmed the version you're using, I've assumed you have the latest ("and greatest") version of SQL Server, which means you have access to STRING_AGG.
This doesn't give the groupings in the same order, but I'm going to also also that doesn't matter, and the grouping is just arbitrary. This gives you the following:
WITH VTE AS(
SELECT *
FROM (VALUES('a',1),
('a',2),
('a',3),
('b',1),
('b',2),
('c',3),
('c',2),
('c',1),
('d',1),
('d',3),
('e',1),
('e',2),
('f',1),
('g',2),
('h',3)) V(productid,customerid)),
Groups AS(
SELECT productid,
STRING_AGG(customerid,',') WITHIN GROUP (ORDER BY customerid) AS CustomerIDs
FROM VTE
GROUP BY productid),
Rankings AS(
SELECT productid,
CustomerIDs,
DENSE_RANK() OVER (ORDER BY CustomerIDs ASC) AS Grouping
FROM Groups)
SELECT V.productid,
V.customerid,
R.Grouping AS customergroupid
FROM VTE V
JOIN Rankings R ON V.productid = R.productid
ORDER BY V.productid,
V.customerid;
db<>fiddle.
If you aren't using SQL Server 2017, I suggest looking up the FOR XML PATH method for string aggregation.
Using Larnu's answer this is how I got the result for 2008:
WITH VTE AS(
SELECT *
FROM (VALUES('a','1'),
('a','2'),
('a','3'),
('b','1'),
('b','2'),
('c','3'),
('c','2'),
('c','1'),
('d','1'),
('d','3'),
('e','1'),
('e','2'),
('f','1'),
('g','2'),
('h','3')) V(productid,customerid)),
Groups AS(
SELECT productid, CustomerIDs = STUFF((SELECT N', ' + customerid
FROM VTE AS p2
WHERE p2.productid = p.productid
ORDER BY customerid
FOR XML PATH(N'')), 1, 2, N'')
FROM VTE AS p
GROUP BY productid),
Rankings AS(
SELECT productid,
CustomerIDs,
DENSE_RANK() OVER (ORDER BY CustomerIDs ASC) AS Grouping
FROM Groups)
SELECT V.productid,
V.customerid,
R.Grouping AS customergroupid
FROM VTE V
JOIN Rankings R ON V.productid = R.productid
ORDER BY V.productid,
V.customerid;
Thanks again for your assistance.

Where to use Outer Apply

MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
I am getting the same results when LEFT JOIN and OUTER APPLY is used.
LEFT JOIN
SELECT T1.ID,T1.NAME,T2.PERIOD,T2.QTY
FROM MASTER T1
LEFT JOIN DETAILS T2 ON T1.ID=T2.ID
OUTER APPLY
SELECT T1.ID,T1.NAME,TAB.PERIOD,TAB.QTY
FROM MASTER T1
OUTER APPLY
(
SELECT ID,PERIOD,QTY
FROM DETAILS T2
WHERE T1.ID=T2.ID
)TAB
Where should I use LEFT JOIN AND where should I use OUTER APPLY
A LEFT JOIN should be replaced with OUTER APPLY in the following situations.
1. If we want to join two tables based on TOP n results
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which forms the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This will bring wrong results ie, it will bring only latest two dates data from Details table irrespective of Id even though we join with Id. So the proper solution is using OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
Here is the working : In LEFT JOIN , TOP 2 dates will be joined to the MASTER only after executing the query inside derived table D. In OUTER APPLY, it uses joining WHERE M.ID=D.ID inside the OUTER APPLY, so that each ID in Master will be joined with TOP 2 dates which will bring the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used as a replacement with LEFT JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And the function goes here.
CREATE FUNCTION FnGetQty
(
#Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=#Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
3. Retain NULL values when unpivoting
Consider you have the below table
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE AND TODATE to one column, it will eliminate NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the below result. Note that we have missed the record of Id number 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases an APPLY can be used(either CROSS APPLY or OUTER APPLY, which is interchangeable).
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which forms the following result and retains Id where its value is 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
In your example queries the results are indeed the same.
But OUTER APPLY can do more: For each outer row you can produce an arbitrary inner result set. For example you can join the TOP 1 ORDER BY ... row. A LEFT JOIN can't do that.
The computation of the inner result set can reference outer columns (like your example did).
OUTER APPLY is strictly more powerful than LEFT JOIN. This is easy to see because each LEFT JOIN can be rewritten to an OUTER APPLY just like you did. It's syntax is more verbose, though.

SQL Server. Join two tables n a view, take rows from one and turn into columns [duplicate]

This question already has answers here:
Efficiently convert rows to columns in sql server
(5 answers)
Closed 8 years ago.
I'm pretty new to SQL Server so don't really know what I'm doing with this. I have two tables, which might look like this:
table 1
| ID | customer | Date |
| 1 | company1 | 01/08/2014 |
| 2 | company2 | 10/08/2014 |
| 3 | company3 | 25/08/2014 |
table 2
| ID | Status | Days |
| 1 | New | 6 |
| 1 | In Work | 25 |
| 2 | New | 17 |
| 3 | New | 14 |
| 3 | In Work | 72 |
| 3 | Complete | 25 |
What I need to do is join based on the ID, and create new columns to show how long each ID has been in each status. Every time an order goes to a new status, a new line is added and the number of days is counted as in the 2nd table above. What I need to create from this, should look like this:
| ID | customer | Date | New | In Work | Complete |
| 1 | company1 | 01/08/2014 | 6 | 25 | |
| 2 | company2 | 10/08/2014 | 17 | | |
| 3 | company3 | 25/08/2014 | 14 | 72 | 25 |
So what do I need to to to create this?
Thanks for any help, as I say I'm pretty new to this.
I would suggest that AHiggins' link is a better candidate to mark this as a dupe rather than the one that's actually been selected because his link involves a join.
WITH [TimeTable] AS (
SELECT
T1.ID,
T1.[Date],
T2.[Status] AS [Status],
T2.[Days]
FROM
dbo.Table1 T1
INNER JOIN dbo.Table2 T2
ON T2.ID = T1.ID
)
SELECT *
FROM
[TimeTable]
PIVOT (MAX([Days]) FOR [Status] IN ([New], [Complete], [In Work])) TT
;

Sum worked hours

I have a issues table where users can log worked hours and estimate hours that looks like this
id | assignee | task | timespent | original_estimate | date
--------------------------------------------------------------------------
1 | john | design | 2 | 3 | 2013-01-01
2 | john | mockup | 2 | 3 | 2013-01-02
3 | john | design | 2 | 3 | 2013-01-01
4 | rick | mockup | 5 | 4 | 2013-01-04
And I need to sum and group the worked and estimated hours by task and date to get this
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | design | 5 | 4 | 2013-01-04
Ok, this is easy, I've already got this:
SELECT assignee, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date FROM issues GROUP BY assignee, task, date
My problem is I need to also show the assignees that did not logged hours on any task that day, I mean:
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | design | 5 | 4 | 2013-01-04
pete | design | 0 | 0 | 2013-01-01
pete | mockup | 0 | 0 | 2013-01-02
liz | design | 0 | 0 | 2013-01-04
liz | mockup | 0 | 0 | 2013-01-04
The goal is to draw a chart like this http://jsfiddle.net/uUjst/embedded/result/
You need the Assignees in their own separate table to join from.
SELECT tblAssignee.Name, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date
FROM tblAssignee
LEFT JOIN issue ON issues.assignee = tblAssignee.Name
GROUP BY tblAssignee.Name, task, date
Assuming that you have a user table, but not a tasks or dates table... meaning that we have to derive these values from the values present in issues:
;WITH dates AS (
SELECT DISTINCT date
FROM issues
), tasks AS (
SELECT DISTINCT task
FROM issues
)
SELECT
u.user as assignee,
t.task,
SUM(i.timespent) as total_spent,
SUM(i.original_estimate) AS total_estimate,
d.date
FROM
users u CROSS JOIN
dates d CROSS JOIN
tasks t LEFT OUTER JOIN
issues i ON
i.assignee = u.user
AND i.task = t.task
AND i.date = d.date
GROUP BY u.user, t.task, d.date
SELECT
A.name,
task,
ISNULL(SUM(timespent), 0) as total_spent,
ISNULL(SUM(original_estimate), 0) AS total_estimate,
date
FROM Assignee A
LEFT JOIN issue
ON issues.assignee = A.Name
GROUP BY A.name, task, date

Finding the max and min date values in SQL Server tables

I have two tables:
A lookup table (tabOne):
KEY | Group | Name | Desc | Val_Key
----------------------------------------
1 | a | NameA | DescA | 10
2 | b | NameB | DescB | 20
3 | c | NameC | DescC | 30
4 | d | NameD | DescD | 40
5 | e | NameE | DescE | 50
6 | f | NameF | DescF | 60
A second table containing readings (tabTwo):
KEY | Date | Reading | Val_Key
----------------------------------------
1 | Date | Read | 10
2 | Date | Read | 20
3 | Date | Read | 40
4 | Date | Read | 40
5 | Date | Read | 30
6 | Date | Read | 20
7 | Date | Read | 40
8 | Date | Read | 20
9 | Date | Read | 10
10 | Date | Read | 20
11 | Date | Read | 50
12 | Date | Read | 60
What I need to do is join tabTwo with TabOne and create a column with the newest Reading and a column with the oldest reading for each item in the group column of TabOne.
At the end of the day I want a table that look as follow:
KEY | Group | Name | Desc | Val_Key | LastReading | FirstReading |
-------------------------------------------------------------------------
1 | a | NameA | DescA | 10 | | |
2 | b | NameB | DescB | 20 | | |
3 | c | NameC | DescC | 30 | | |
4 | d | NameD | DescD | 40 | | |
5 | e | NameE | DescE | 50 | | |
6 | f | NameF | DescF | 60 | | |
Thanks!
Freddie
If this is Sql Server 2005 or newer, outer apply will help:
select TabOne.*,
last.Reading LastReading,
first.Reading FirstReading
from TabOne
outer apply
(
select top 1
Reading
from TabTwo
where TabTwo.Val_Key = TabOne.val_Key
order by TabTwo.Date desc
) last
outer apply
(
select top 1
Reading
from TabTwo
where TabTwo.Val_Key = TabOne.val_Key
order by TabTwo.Date asc
) first
Live test is # Sql Fiddle.
#Nikola Markovinović's solution can be made more universally applicable if the subqueries are moved directly to the main query's SELECT clause, which is possible each of them retrieves only one value and is, therefore, valid as a scalar expression:
SELECT
t1.[KEY],
t1.[Group],
t1.Name,
t1.[Desc],
t1.Val_Key,
(
SELECT TOP 1 Reading
FROM TabTwo
WHERE Val_Key = t1.Val_Key
ORDER BY Date DESC
) AS LastReading,
(
SELECT TOP 1 Reading
FROM TabTwo
WHERE Val_Key = t1.Val_Key
ORDER BY Date ASC
) AS FirstReading
FROM TabOne t1
If you needed e.g. dates along the way, you would probably have to stick to Nikola's solution. There is an alternative to it, but it's more cumbersome (albeit more standard too): it would involve grouping TabTwo's data by Val_Key to get earliest/latest dates per Val_Key, then joining back to TabTwo to access entire rows corresponding to the found dates to finally pull the necessary columns, and ultimately joining both result sets to TabOne to get the final column set.

Resources