Sum of rows with certain values T-SQL - sql-server

Thanks for all the fast and great answers. I just realized that the example table I gave did not describe the real situation accurately, so I have updated it; the new table is shown below.
I have the following setup:
a temporary table #d listing the selected domains, a temporary table #total listing the total value for each of these domains, and a temporary table #speed listing the speed of each of these domains.
I then run:
SELECT d.domains, t.total, s.speed
FROM #d AS d
LEFT JOIN #total AS t ON d.domains = t.domains
LEFT JOIN #speed as S ON d.domains = s.domains
ORDER BY d.domains DESC
The result is as follows:
Domains   Total   Speed
-----------------------
XYZ AB    10      1
XYZ CD    12      2
XYZ EF    14      3
Bhzu      6       4
Cjuki     19      5
What I wish to have is the SUM of XYZ AB, XYZ CD and XYZ EF like this:
Domains   Total   Speed   Rate (Speed/Total)
---------------------------------------------
XYZ       36      6       6/36
Bhzu      6       4       4/6
Cjuki     19      5       5/19
In reality, there are other rows whose domain names have other lengths.
This will be an automatically generated SSRS report, and I am using T-SQL.
How could I make it work?
Thank you!

If you want to remove the part of the domain that follows the (first) dot, and group by that, you can do:
select x.domain,
sum(t.total) total,
sum(s.speed) speed,
1.0 * sum(s.speed) / sum(t.total) rate
from #d as d
cross apply (values (left(d.domains, charindex('.', d.domains + '.') - 1))) x(domain)
left join #total as t on d.domains = t.domains
left join #speed as s on d.domains = s.domains
group by x.domain
order by x.domain desc
The upside is that this works regardless of how many characters there are before the dot.
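As a minimal sketch of what the prefix expression does on its own, the query below runs it against a few hypothetical literal names (the dotted values are made up for illustration). Appending '.' before calling CHARINDEX means a name without a dot is returned whole instead of raising an invalid-length error.

```sql
-- Hypothetical sample names; the dot delimiter is the one the answer assumes.
SELECT v.domains,
       LEFT(v.domains, CHARINDEX('.', v.domains + '.') - 1) AS prefix
FROM (VALUES ('XYZ.AB'), ('XYZ.CD'), ('Bhzu')) AS v(domains);
-- prefix is 'XYZ' for the first two rows and 'Bhzu' for the last one
```

For the space-separated names shown in the updated question (such as XYZ AB), the same expression should work with ' ' in place of '.' in both positions.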

If the data is literally in the form A.a, you could replace Domains with LEFT(Domains, 1), and group by that, then replace Total and Speed with SUM(Total) and SUM(Speed).
SELECT LEFT(d.domains,1) AS domains, SUM(t.total) AS total, SUM(s.speed) AS speed
FROM #d AS d
LEFT JOIN #total AS t ON d.domains=t.domains
LEFT JOIN #speed AS s ON d.domains=s.domains
GROUP BY LEFT(d.domains,1)
ORDER BY LEFT(d.domains,1)

Something like:
WITH T AS
(
SELECT d.domains, t.total, s.speed
FROM #d AS d
LEFT JOIN #total AS t ON d.domains=t.domains
LEFT JOIN #speed AS s ON d.domains=s.domains
)
-- Appending '.' keeps LEFT from failing on names that contain no dot
SELECT LEFT(Domains, CHARINDEX('.', Domains + '.') -1) AS DOMAINS, SUM(Total) AS TOTAL, SUM(Speed) AS SPEED, 1.0 * SUM(Speed) / SUM(Total) AS RATE
FROM T
GROUP BY LEFT(Domains, CHARINDEX('.', Domains + '.') -1)

Related

Match single row from second table with multiple matches with many columns

I have the two joined tables below. I'd like to get only the one line from the REQUIREMENTS table with the most recent date (3/8/2019).
PART                  REQUIREMENTS
ID     OH   TIME      PART   ORDER   QTY   DATE
5512   5    21        5512   74619   102   3/8/2019
                      5512   74907   25    3/10/2019
                      5512   74908   41    3/19/2019
                      5512   74243   59    3/21/2019
When I use Min(REQUIREMENTS.DATE), I still get all four rows because of the unique data in both the ORDER and QTY columns. I'm pretty sure I need to use Select Top 1 [...] but I'm having trouble figuring out where to use it. Ultimately I'm looking to return:
PART DATE OH TIME ORDER QTY
5512 3/8/2019 5 21 74619 102
Can anyone point me in the right direction (SQL Server 2012)? Thanks in advance!
Dan
You can use a correlated subquery to do this:
SELECT *
FROM PART P
INNER JOIN REQUIREMENTS R ON
P.ID = R.PART
-- MIN returns the earliest date (3/8/2019 in the sample); use MAX for the latest one
WHERE R.[DATE] = (SELECT MIN(R2.[DATE]) FROM REQUIREMENTS R2 WHERE R2.PART = R.PART)
You can use APPLY; whether you want OUTER or CROSS is your choice (OUTER keeps parts that have no requirements).
SELECT p.ID, p.OH, p.[TIME]
, r.[ORDER], r.QTY, r.[DATE]
FROM dbo.Part p
OUTER APPLY (
select top 1 [ORDER], QTY, [DATE]
from dbo.Requirements
where PART = p.ID
order by [DATE]
) as r
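The APPLY approach can be tried end to end with the sample rows from the question. This is only a sketch: the table variables @Part and @Requirements stand in for the real tables, and the column types are assumptions.

```sql
-- Stand-in tables; names and types are assumptions based on the question.
DECLARE @Part TABLE (ID int, OH int, [TIME] int);
DECLARE @Requirements TABLE (PART int, [ORDER] int, QTY int, [DATE] date);

INSERT INTO @Part VALUES (5512, 5, 21);
INSERT INTO @Requirements VALUES
    (5512, 74619, 102, '20190308'),
    (5512, 74907,  25, '20190310'),
    (5512, 74908,  41, '20190319'),
    (5512, 74243,  59, '20190321');

-- For each part, pick the single requirement row with the earliest date.
SELECT p.ID AS PART, r.[DATE], p.OH, p.[TIME], r.[ORDER], r.QTY
FROM @Part AS p
OUTER APPLY (
    SELECT TOP (1) rq.[ORDER], rq.QTY, rq.[DATE]
    FROM @Requirements AS rq
    WHERE rq.PART = p.ID
    ORDER BY rq.[DATE]
) AS r;
-- One row: 5512, 2019-03-08, 5, 21, 74619, 102
```

Adding DESC to the inner ORDER BY would return the latest requirement instead.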

SQL Server 2008 Is it Possible to Have Select Top Return Nulls

(Select top 1 pvd.Code from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder) as "DX1",
(Select top 1 a.code from (Select top 2 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX2",
(Select top 1 a.code from (Select top 3 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX3",
(Select top 1 a.code from (Select top 4 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX4",
(Select top 1 a.code from (Select top 5 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX5"
The above code is what I am using currently (It is not optimal but is only being used once for a one time Data Export).
In the database that we are currently exporting from, there is a table PatientVisitDiags that has columns "ListOrder" and "Code". There can be between 1 and 5 codes. The ListOrder holds the number of that code. For example:
ListOrder|Code |
1 |M51.27 |
2 |M54.17 |
3 |G83.4 |
I am trying to export the Code to its corresponding Column in the new table(DX1,DX2..etc). If I sort by ListOrder I can get them in the order I need (Row 1 to DX1 | Row 2 to DX2 etc.) However when I run the above SQL code, If the source table only has 3 Codes DX4 and DX5 will repeat DX3. For Example:
DX1 |DX2 |DX3 |DX4 |DX5
M51.27 |M54.17 |G83.4 |G83.4 |G83.4
Is there a way to have TOP return NULL values if you select TOP more rows than exist? SQL Server 2008 does not support OFFSET/FETCH, which is what I would normally have used to select individual rows.
TL;DR
ID | Name
1 | Joe
2 | Eric
3 | Steve
4 | John
If I have a table like the one above and run
SELECT TOP 5 Name FROM Table
is there any way to return the following?
Joe
Eric
Steve
John
NULL
What you're really doing is pivoting. So pivot! Try this little query:
WITH Top5 AS (
SELECT TOP 5
Dx = 'DX' + Convert(varchar(11), Row_Number() OVER (ORDER BY pvd.Listorder)),
pvd.Code
FROM dbo.PatientVisitDiags pvd
WHERE pvd.PatientVisitId = @patientVisitId
ORDER BY pvd.ListOrder
)
SELECT *
FROM
Top5 t
PIVOT (Max(Code) FOR Dx IN (DX1, DX2, DX3, DX4, DX5)) p
;
To answer your second question about getting an unpivoted rowset, basically do the same thing but provide the 5 rows somehow and left join to the desired data.
WITH Data AS (
SELECT TOP 5
Seq = Row_Number() OVER(ORDER BY ID),
Name
FROM dbo.[Table]
ORDER BY ID
)
SELECT
n.Seq,
t.Name
FROM
(VALUES
(1), (2), (3), (4), (5) -- or a numbers-generating CTE perhaps
) n (Seq)
LEFT JOIN Data t
ON n.Seq = t.Seq
;
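Here is a self-contained sketch of the same padding idea using the four names from the question. The table variable @People is a stand-in for the real table.

```sql
-- Stand-in for the question's table; the name @People is an assumption.
DECLARE @People TABLE (ID int, Name varchar(50));
INSERT INTO @People VALUES (1, 'Joe'), (2, 'Eric'), (3, 'Steve'), (4, 'John');

WITH Data AS (
    SELECT Seq = ROW_NUMBER() OVER (ORDER BY ID), Name
    FROM @People
)
SELECT n.Seq, d.Name
FROM (VALUES (1), (2), (3), (4), (5)) AS n (Seq)  -- always five slots
LEFT JOIN Data AS d
    ON d.Seq = n.Seq
ORDER BY n.Seq;
-- Five rows: Joe, Eric, Steve, John, NULL
```

The fixed list of sequence numbers drives the row count, so any slot without a matching source row comes back as NULL.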
Side note
The fact that you're doing this:
where pvd.PatientVisitId = pv.PatientVisitId
tells me you're not using ANSI joins. Stop. Don't do that any more. Put this join condition in the ON clause of a JOIN. It's the year 2016... why are you using join syntax from the last century?
Oh, and prefix the schema on the table names. Look it up--you'll find actual performance reasons why you should do that. It's not just about the time taken to find the correct schema, but also about the execution plan cache...
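Taking one question at a time, this answers the last one.
Create a table that holds a bunch of NULL rows, append it to the real rows with UNION ALL (a plain UNION would collapse the NULLs into a single row), and sort the padding rows last:
select top (5) tt.col
from
(
select col, 0 as isPad from table1
union all
select nullCol, 1 from nullTable
) tt
order by tt.isPad, tt.col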

How do I exclude rows when an incremental value starts over?

I am a newbie poster but have spent a lot of time researching answers here. I can't quite figure out how to create a SQL result set using SQL Server 2008 R2 that should probably be using lead/lag from more modern versions. I am trying to aggregate data based on sequencing of one column, but there can be varying numbers of instances in each sequence. The only way I know a sequence has ended is when the next row has a lower sequence number. So it may go 1-2, 1-2-3-4, 1-2-3, and I have to figure out how to make 3 aggregates out of that.
Source data is joined tables that look like this (please help me format):
recordID instanceDate moduleID iResult interactionNum
1356 10/6/15 16:14 1 68 1
1357 10/7/15 16:22 1 100 2
1434 10/9/15 16:58 1 52 1
1435 10/11/15 17:00 1 60 2
1436 10/15/15 16:57 1 100 3
1437 10/15/15 16:59 1 100 4
I need to find a way to separate the first 2 rows from the last 4 rows in this example, based on values in the last column.
What I would love to ultimately get is a result set that looks like this, which averages the iResult column based on the grouping and takes the first instanceDate from the grouping:
instanceDate moduleID iResult
10/6/15 1 84
10/9/15 1 78
I can aggregate to get this result using MIN and AVG if I can just find a way to separate the groups. The data is ordered by instanceDate (please ignore the date formatting here) and then by interactionNum, and a new group should start whenever the query finds a row whose interactionNum is less than or equal to that of the previous row (it will usually start over at 1, but not always, so I would prefer to split on any lower or equal integer value).
Here is the query I have so far (includes the joins that give the above data set):
SELECT
X.*
FROM
(SELECT TOP 100 PERCENT
instanceDate, b.ModuleID, iResult, b.interactionNum
FROM
(firstTable a
INNER JOIN
secondTable b ON b.someID = a.someID)
WHERE
a.someID = 2
AND b.otherID LIKE 'xyz'
AND a.ModuleID = 1
ORDER BY
instanceDate) AS X
OUTER APPLY
(SELECT TOP 1
*
FROM
(SELECT
instanceDate, d.ModuleID, iResult, d.interactionNum
FROM
(firstTable c
INNER JOIN
secondTable d ON d.someID = c.someID)
WHERE
c.someID = 2
AND d.otherID LIKE 'xyz'
AND c.ModuleID = 1
AND d.interactionNum = X.interactionNum
AND c.instanceDate < X.instanceDate) X2
ORDER BY
instanceDate DESC) Y
WHERE
NOT EXISTS (SELECT Y.interactionNum INTERSECT SELECT X.interactionNum)
But this is returning an interim result set like this:
instanceDate ModuleID iResult interactionNum
10/6/15 16:10 1 68 1
10/6/15 16:14 1 100 2
10/15/15 16:57 1 100 3
10/15/15 16:59 1 100 4
and the problem is that interactionNum 3, 4 do not belong in this result set. They would go in the next result set when I loop over this query. How do I keep them out of the result set in this iteration? I need the result set from this query to just include the first two rows, 'seeing' that row 3 of the source data has a lower value for interactionNum than row 2 has.
Not sure what ModuleID was supposed to be used, but I guess you're looking for something like this:
select min (instanceDate), [moduleID], avg([iResult])
from (
select *,row_number() over (partition by [moduleID] order by instanceDate) as RN
from Table1
) X
group by [moduleID], RN - [interactionNum]
The idea here is to create a running number with row_number for each moduleid, and then use the difference between that and InteractionNum as grouping criteria.
Example in SQL Fiddle
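To see why the difference works as a grouping key, this sketch lists the intermediate values for the sample rows from the question. The table variable @t and the date literals are stand-ins. Note the assumption the technique relies on: interactionNum rises by exactly 1 within each sequence.

```sql
-- Sample rows from the question, loaded into a stand-in table variable.
DECLARE @t TABLE (recordID int, instanceDate datetime, moduleID int,
                  iResult int, interactionNum int);
INSERT INTO @t VALUES
    (1356, '2015-10-06T16:14:00', 1,  68, 1),
    (1357, '2015-10-07T16:22:00', 1, 100, 2),
    (1434, '2015-10-09T16:58:00', 1,  52, 1),
    (1435, '2015-10-11T17:00:00', 1,  60, 2),
    (1436, '2015-10-15T16:57:00', 1, 100, 3),
    (1437, '2015-10-15T16:59:00', 1, 100, 4);

-- RN counts 1..6; RN - interactionNum stays constant inside each sequence.
SELECT recordID, interactionNum,
       ROW_NUMBER() OVER (PARTITION BY moduleID ORDER BY instanceDate) AS RN,
       ROW_NUMBER() OVER (PARTITION BY moduleID ORDER BY instanceDate)
           - interactionNum AS grp
FROM @t
ORDER BY instanceDate;
-- grp is 0, 0 for the first two rows and 2, 2, 2, 2 for the last four
```

Grouping on grp then gives AVG(iResult) of 84 and 78, matching the result the question asks for.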
Here is my solution, although it should be said that I think @JamesZ's answer is cleaner.
I created a new field called newinstance, which is 1 wherever your interactionNum is 1. I then created a rolling SUM(newinstance) called rollinginstance to group on.
Change the last select to SELECT * FROM cte2 to show all the fields I added.
IF OBJECT_ID('tempdb..#tmpData') IS NOT NULL
DROP TABLE #tmpData
CREATE TABLE #tmpData (recordID INT, instanceDate DATETIME, moduleID INT, iResult INT, interactionNum INT)
INSERT INTO #tmpData
SELECT 1356,'10/6/15 16:14',1,68,1 UNION
SELECT 1357,'10/7/15 16:22',1,100,2 UNION
SELECT 1434,'10/9/15 16:58',1,52,1 UNION
SELECT 1435,'10/11/15 17:00',1,60,2 UNION
SELECT 1436,'10/15/15 16:57',1,100,3 UNION
SELECT 1437,'10/15/15 16:59',1,100,4
;WITH cte1 AS
(
SELECT *,
CASE WHEN interactionNum=1 THEN 1 ELSE 0 END AS newinstance,
ROW_NUMBER() OVER(ORDER BY recordID) as rowid
FROM #tmpData
), cte2 AS
(
SELECT *,
(select SUM(newinstance) from cte1 b where b.rowid<=a.rowid) as rollinginstance
FROM cte1 a
)
SELECT MIN(instanceDate) AS instanceDate, moduleID, AVG(iResult) AS iResult
FROM cte2
GROUP BY moduleID, rollinginstance

How to develop a recursive CTE in T-SQL?

I am new to recursive CTEs. I am trying to develop a CTE which will return all of the employees under each manager name. So I have two tables: people_rv and staff_rv
People_rv table contains all of the people, both managers and employees. Staff_rv only contains manager information. Uniqueidentifier staff values are stored in Staff_rv. Uniqueidentifier employee values are stored in people_rv. People_rv contains varchar first and last name values for both managers and employees.
But when I run the following CTE I get an error:
WITH
cteStaff (ClientID, FirstName, LastName, SupervisorID, EmpLevel)
AS
(
SELECT p.people_id, p.first_name, p.last_name, s.supervisor_id,1
FROM people_rv p JOIN staff_rv s on s.people_id = p.people_id
WHERE s.supervisor_id = '95E16819-8C3A-4098-9430-08F0E3B764E1'
UNION ALL
SELECT p2.people_id, p2.first_name, p2.last_name, s2.supervisor_id, r.EmpLevel + 1
FROM people_rv p2 JOIN staff_rv s2 on s2.people_id = p2.people_id
INNER JOIN cteStaff r on s2.staff_id = r.ClientID
)
SELECT
FirstName + ' ' + LastName AS FullName,
EmpLevel,
(SELECT first_name + ' ' + last_name FROM people_rv p join staff_rv s on s.people_id = p.people_id
WHERE s.staff_id = cteStaff.SupervisorID) AS Manager
FROM cteStaff
OPTION (MAXRECURSION 0);
My output is:
Barbara G 1 Melanie K
Dawn P 1 Melanie K
Garrett M 1 Melanie K
Stephanie P 1 Melanie K
Amanda F 1 Melanie K
Amanda T 1 Melanie K
Stephanie G 1 Melanie K
Carlos H 1 Melanie K
So it is not iterating any more than the first level. Why not?
Melanie is the top most supervisor, but each of the persons in the leftmost column are also supervisors. So this query should also return level 2.
You may be in an infinite loop with your join. I would check how many levels you expect the table to actually go down. Generally, the recursive member joins back to the CTE on a condition similar to
ID = ParentID
where the columns come either from a table or from an expression. Keep in mind that you can also define a CTE before the recursive CTE if you have to build the relationship first.
Here is a self-contained example that will run as is; it may help.
Declare @Table table ( PersonId int identity, PersonName varchar(512), Account int, ParentId int, Orders int);
insert into @Table values ('Brett', 1, NULL, 1000),('John', 1, 1, 100),('James', 1, 1, 200),('Beth', 1, 2, 300),('John2', 2, 4, 400);
select
PersonID
, PersonName
, Account
, ParentID
from @Table
; with recursion as
(
select
t1.PersonID
, t1.PersonName
, t1.Account
--, t1.ParentID
, cast(isnull(t2.PersonName, '')
+ Case when t2.PersonName is not null then '\' + t1.PersonName else t1.PersonName end
as varchar(255)) as fullheirarchy
, 1 as pos
, cast(t1.orders +
isnull(t2.orders,0) -- if the parent has no orders then zero
as int) as Orders
from @Table t1
left join @Table t2 on t1.ParentId = t2.PersonId
union all
select
t.PersonID
, t.PersonName
, t.Account
--, t.ParentID
, cast(r.fullheirarchy + '\' + t.PersonName as varchar(255))
, pos + 1 -- increases
, r.orders + t.orders
from @Table t
join recursion r on t.ParentId = r.PersonId
)
, b as
(
select *, max(pos) over(partition by PersonID) as maxrec -- I find the maximum occurrence of position by person
from recursion
)
select *
from b
where pos = maxrec -- finds the furthest down tree
-- and Account = 2 -- I could find just someone from a different department
Your problem, as far as I can tell, is that you have no join connecting managers to their employees.
This join
INNER JOIN cteStaff r on r.StaffID = s2.staff_id
Just joins the same initial level 1 staffer back to himself.
UPDATE:
Still not quite right! You have a supervisor_id, but again you're still not actually using that to join back to the CTE.
So for each recursion of this CTE you need to (excluding the name join):
select {Level 1 Boss}, NULL (no supervisor)
union
select {new employee}, {that employee's boss}
So the join must connect the CTE's ClientID (the level 1 boss) to the second UNION query's supervisor field, which looks to be supervisor_id , not staff_id.
The JOIN to accomplish this second task is (from what I can tell of your staff_rv table schema):
SELECT p2.people_id, p2.first_name, p2.last_name, s2.supervisor_id, r.EmpLevel + 1
FROM people_rv p2 JOIN staff_rv s2 on s2.people_id = p2.people_id
INNER JOIN cteStaff r on s2.supervisor_id = r.ClientID
Note the bottom join joins the r.ClientID (the level 1 boss) to the staffer's supervisor_id field.
(NB: I think your staff_id and supervisor_id's mimic your people_id values from the people_rv table, so this join should work fine. But if they are different (i.e. a staffer's supervisor_id isn't that supervisor's people_id) then you'll need to write the join such that the staffer's supervisor_id can be joined to their people_id you're storing as ClientID in the CTE.)
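Putting the corrected join into the full query, a runnable sketch might look like the following. Everything about the data is hypothetical: integer IDs replace the uniqueidentifier values, the table variables stand in for people_rv and staff_rv, "Eve X" is an invented second-level employee, and supervisor_id is assumed to hold the supervisor's people_id.

```sql
-- Hypothetical stand-ins for people_rv and staff_rv, with int keys.
DECLARE @people_rv TABLE (people_id int, first_name varchar(50), last_name varchar(50));
DECLARE @staff_rv  TABLE (people_id int, supervisor_id int NULL);

INSERT INTO @people_rv VALUES
    (1, 'Melanie', 'K'), (2, 'Barbara', 'G'), (3, 'Dawn', 'P'), (4, 'Eve', 'X');
INSERT INTO @staff_rv VALUES
    (1, NULL), (2, 1), (3, 1), (4, 2);  -- Eve reports to Barbara

WITH cteStaff (ClientID, FirstName, LastName, SupervisorID, EmpLevel) AS
(
    -- Anchor: direct reports of the top supervisor (people_id 1)
    SELECT p.people_id, p.first_name, p.last_name, s.supervisor_id, 1
    FROM @people_rv p
    JOIN @staff_rv s ON s.people_id = p.people_id
    WHERE s.supervisor_id = 1
    UNION ALL
    -- Recursive step: anyone whose supervisor is already in the CTE
    SELECT p2.people_id, p2.first_name, p2.last_name, s2.supervisor_id, r.EmpLevel + 1
    FROM @people_rv p2
    JOIN @staff_rv s2 ON s2.people_id = p2.people_id
    JOIN cteStaff r ON s2.supervisor_id = r.ClientID
)
SELECT FirstName + ' ' + LastName AS FullName, EmpLevel
FROM cteStaff
ORDER BY EmpLevel, FullName;
-- Barbara G and Dawn P at level 1, Eve X at level 2
```

With the original join on staff_id, the recursive member would never reach Eve, which is the symptom described in the question.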
Here's a good simple Recursive CTE to review (it may not be the answer, but someone else searching on how to make a recursive CTE may need it):
-- Recursive CTE
;
WITH Years ( myYear )
AS (
-- Base case
SELECT DATEPART(year, GETDATE())
UNION ALL
-- Recursive
SELECT Years.myYear - 1
FROM Years
WHERE Years.myYear >= 2002
)
SELECT *
FROM Years
Note that this probably won't solve your problem, but is a means to hopefully seeing where you're going wrong in the original query.
The default is 100 levels of recursion - you can set it to unlimited by using the MAXRECURSION query hint where you're selecting from your CTE:
...
FROM cteStaff
OPTION (MAXRECURSION 0);
From MSDN:
MAXRECURSION number
Specifies the maximum number of recursions allowed for this query. number is a nonnegative integer between 0 and 32767. When 0 is specified, no limit is applied. If this option is not specified, the default limit for the server is 100.
When the specified or default number for MAXRECURSION limit is reached during query execution, the query is ended and an error is returned.
Because of this error, all effects of the statement are rolled back. If the statement is a SELECT statement, partial results or no results may be returned. Any partial results returned may not include all rows on recursion levels beyond the specified maximum recursion level.
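A small sketch of the limit in action, using a made-up counter CTE named Numbers: generating 150 rows needs 149 recursive steps, so it fails under the default limit of 100 but succeeds once the hint raises it.

```sql
-- Hypothetical counter CTE: one row per integer from 1 to 150.
WITH Numbers (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1
    FROM Numbers
    WHERE n < 150
)
SELECT COUNT(*) AS cnt
FROM Numbers
OPTION (MAXRECURSION 200);  -- without this hint the query errors out at 100 recursions
-- cnt = 150
```

A finite limit like 200 is usually a safer choice than 0, since a join mistake then ends in an error rather than a query that never stops.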

How to group ranged values using SQL Server

I have a table of values like this
978412, 400
978813, 20
978834, 50
981001, 20
As you can see, the second number added to the first is one less than the first number of the next row in the sequence. The last row is not in the range (it doesn't follow directly on from the previous row). What I need is a CTE (yes, ideally) that will output this:
978412, 472
981001, 20
The first row contains the start number of the range then the sum of the nodes within. The next row is the next range which in this example is the same as the original data.
From the article that Josh posted, here's my take (tested and working):
SELECT
MAX(t1.gapID) as gapID,
t2.gapID-MAX(t1.gapID)+t2.gapSize as gapSize
-- max(t1) is the specific lower bound of t2 because of the group by.
FROM
( -- t1 is the lower boundary of an island.
SELECT gapID
FROM gaps tbl1
WHERE
NOT EXISTS(
SELECT *
FROM gaps tbl2
WHERE tbl1.gapID = tbl2.gapID + tbl2.gapSize + 1
)
) t1
INNER JOIN ( -- t2 is the upper boundary of an island.
SELECT gapID, gapSize
FROM gaps tbl1
WHERE
NOT EXISTS(
SELECT * FROM gaps tbl2
WHERE tbl2.gapID = tbl1.gapID + tbl1.gapSize + 1
)
) t2 ON t1.gapID <= t2.gapID -- For all t1, we get all bigger t2 and opposite.
GROUP BY t2.gapID, t2.gapSize
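The lower-boundary test used for t1 can be checked in isolation against the sample rows from the question. In this sketch the table variable @gaps is a stand-in, and the column names gapID and gapSize are taken from the query above.

```sql
-- Sample rows from the question in a stand-in table variable.
DECLARE @gaps TABLE (gapID int, gapSize int);
INSERT INTO @gaps VALUES (978412, 400), (978813, 20), (978834, 50), (981001, 20);

-- A row starts an island when no other row ends immediately before it.
SELECT g.gapID, g.gapSize,
       CASE WHEN EXISTS (SELECT *
                         FROM @gaps p
                         WHERE g.gapID = p.gapID + p.gapSize + 1)
            THEN 0 ELSE 1 END AS isIslandStart
FROM @gaps g
ORDER BY g.gapID;
-- isIslandStart is 1, 0, 0, 1
```

The two rows flagged 1 (978412 and 981001) are exactly the first column of the desired output; for the first island, 978834 - 978412 + 50 gives the 472 shown in the question.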
Check out this MSDN article. It gives you a solution to your problem; whether it will work for you depends on the amount of data you have and your performance requirements for the query.
Edit:
Using the example from the article, I went with its last solution, the second way to get islands (the first way resulted in an error on SQL Server 2005):
SELECT MIN(start) AS startGroup, endGroup, (endgroup-min(start) +1) as NumNodes
FROM (SELECT g1.gapID AS start,
(SELECT min(g2.gapID) FROM #gaps g2
WHERE g2.gapID >= g1.gapID and NOT EXISTS
(SELECT * FROM #gaps g3
WHERE g3.gapID - g2.gapID = 1)) as endGroup
FROM #gaps g1) T1 GROUP BY endGroup
The thing I added is (endgroup-min(start) +1) as NumNodes. This will give you the counts.

Resources