SQL split values with same ID from 2 tables - sql-server

I have 2 tables (both retrieve data from a main table). Example:
Table1
     id   GroupX   Source   GroupNum   Amount
-------------------------------------------------------
1    23   School   SH001    1          700
2    23   Bank     BA001    2          300
3    23   Music    MU001    3          500
4    23   School   SH999    1          900
Table2
     id   GroupNum   SourceAmt
----------------------------------
1    23   1          700
2    23   2          100
3    23   3          500
4    23   1          900
My dilemma is with the query I'm using: it returns additional rows for split values (notice that in Table2, GroupNum 1 is split into two values, 700 and 900).
My results should be
     id   GroupX   Source   GroupNum   Amount   SourceAmt
-----------------------------------------------------------------
1    23   School   SH001    1          700      700
2    23   Bank     BA001    2          300      100
3    23   Music    MU001    3          500      500
4    23   School   SH999    1          900      900
But instead I get this
     id   GroupX   Source   GroupNum   Amount   SourceAmt
-----------------------------------------------------------------
1    23   School   SH001    1          700      700
2    23   School   SH001    1          700      900
3    23   Bank     BA001    2          300      100
4    23   Music    MU001    3          500      500
5    23   School   SH999    1          900      900
6    23   School   SH999    1          900      700
Here's my query:
SELECT
    t1.id,
    t1.GroupX,
    t1.Source,
    t1.GroupNum,
    t1.Amount,
    t2.SourceAmt
FROM
    table1 AS t1
INNER JOIN
    table2 AS t2 ON t1.id = t2.id
                AND t1.GroupNum = t2.GroupNum
WHERE
    t1.id = 23
I've tried using Distinct as well. Assistance will be greatly appreciated.

If I understand correctly, you would like to join Table1 and Table2 such that the id, GroupNum, and amounts align. If that is the case, you'll need to join on the amounts as well, e.g.:
SELECT t1.id, t1.GroupX, t1.Source, t1.GroupNum, t1.Amount, t2.SourceAmt
FROM table1 AS t1
INNER JOIN table2 AS t2
    ON t1.id = t2.id AND t1.GroupNum = t2.GroupNum AND t1.Amount = t2.SourceAmt
WHERE t1.id = 23
If this is not what you want, or you would prefer not to join using the amounts (e.g., the amounts can differ between the tables, as in the Bank row where Amount is 300 but SourceAmt is 100, or you cannot guarantee that you won't see the same amount more than once), then you're in a bit of a conundrum; you'll note that (id, GroupNum) tuples are not unique in either table, and thus your join is not one-to-one. You may want to include Source in table2 or otherwise provide a transactionId in table1 that maps onto a unique ID column in table2.

You need an additional join key. There is no obvious candidate, except perhaps for amount -- but I'm not sure that is what you intend. SQL tables represent unordered sets, so there is no concept of matching by "line number".
You can assign a row number using row_number(). The following will do the match, but you need to specify the ordering column:
SELECT t1.id, t1.GroupX, t1.Source, t1.GroupNum, t1.Amount, t2.SourceAmt
FROM (SELECT t1.*,
             row_number() OVER (PARTITION BY t1.id ORDER BY ?) AS seqnum
      FROM table1 t1
     ) t1 INNER JOIN
     (SELECT t2.*,
             row_number() OVER (PARTITION BY t2.id ORDER BY ?) AS seqnum
      FROM table2 t2
     ) t2
     ON t1.id = t2.id AND t1.GroupNum = t2.GroupNum AND
        t1.seqnum = t2.seqnum
WHERE t1.id = 23;
The ? is for the ordering column in each table.
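For anyone who wants to try the row_number() idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption; window functions need SQLite 3.25 or later). It differs from the query above in two ways that are my own choices, not part of the answer: it partitions by (id, GroupNum) rather than id alone, and it fills in the ? with Amount and SourceAmt as the ordering columns. Whether ordering by amount is the right pairing rule depends on your real data.

```python
import sqlite3

# SQLite stands in for SQL Server here (assumption); the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, GroupX TEXT, Source TEXT, GroupNum INT, Amount INT);
CREATE TABLE table2 (id INT, GroupNum INT, SourceAmt INT);
INSERT INTO table1 VALUES
  (23, 'School', 'SH001', 1, 700), (23, 'Bank',   'BA001', 2, 300),
  (23, 'Music',  'MU001', 3, 500), (23, 'School', 'SH999', 1, 900);
INSERT INTO table2 VALUES (23, 1, 700), (23, 2, 100), (23, 3, 500), (23, 1, 900);
""")

# Number the rows inside each (id, GroupNum) group on both sides, then join on
# the group keys plus the sequence number so every row pairs with exactly one row.
paired = conn.execute("""
SELECT t1.id, t1.GroupX, t1.Source, t1.GroupNum, t1.Amount, t2.SourceAmt
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id, GroupNum ORDER BY Amount) AS seqnum
      FROM table1) AS t1
JOIN (SELECT *, ROW_NUMBER() OVER (PARTITION BY id, GroupNum ORDER BY SourceAmt) AS seqnum
      FROM table2) AS t2
  ON t1.id = t2.id AND t1.GroupNum = t2.GroupNum AND t1.seqnum = t2.seqnum
WHERE t1.id = 23
ORDER BY t1.GroupNum, t1.Amount
""").fetchall()

for row in paired:
    print(row)
```

With the sample data this returns four rows, and the Bank row keeps its 300/100 pair because amounts are only used for ordering, not for matching.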

I would choose a different approach from a simple INNER JOIN, because there is only so much you can do with its result set.
I would do multiple joins.
First, I would LEFT JOIN on the default condition plus table1.[Amount] = table2.[SourceAmt]. This gives me the set of rows where [Amount] and [SourceAmt] are equal.
After this, I INNER JOIN on the default condition to get the amounts that don't match.
Here's the query
with t1 as
(
select 23 as [id], 'School' as [Group], 'SH001' as [Source], 1 as [GroupNum], 700 as [Amount]
union all
select 23, 'Bank', 'BA001', 2, 300
union all
select 23, 'Music', 'MU001', 3, 500
union all
select 23, 'School', 'SH999', 1, 900
),
t2 as
(
select 23 as [id], 1 as [GroupNum], 700 as [SourceAmt]
union all
select 23, 2, 100
union all
select 23, 3, 500
union all
select 23, 1, 900
)
select t1.*, a.*, b.*
from t1
left join t2 as a on
t1.[id] = a.[id]
and t1.[GroupNum] = a.[GroupNum]
and t1.[Amount] = a.[SourceAmt]
inner join t2 as b on
t1.[id] = b.[id]
and t1.[GroupNum] = b.[GroupNum]
where t1.[id] = 23
I use this result set as an intermediate result and apply a small trick with CASE and a [takeIt] column. Here's the final query:
with t1 as
(
select 23 as [id], 'School' as [Group], 'SH001' as [Source], 1 as [GroupNum], 700 as [Amount]
union all
select 23, 'Bank', 'BA001', 2, 300
union all
select 23, 'Music', 'MU001', 3, 500
union all
select 23, 'School', 'SH999', 1, 900
),
t2 as
(
select 23 as [id], 1 as [GroupNum], 700 as [SourceAmt]
union all
select 23, 2, 100
union all
select 23, 3, 500
union all
select 23, 1, 900
),
res as
(
select t1.[id],
t1.[Group],
t1.[Source],
t1.[GroupNum],
t1.[Amount],
isnull(a.[SourceAmt], b.[SourceAmt]) as [SourceAmt],
case when a.[SourceAmt] is null or a.[SourceAmt] = b.[SourceAmt] then 1
else 0
end as [takeIt]
from t1
left join t2 as a on
t1.[id] = a.[id]
and t1.[GroupNum] = a.[GroupNum]
and t1.[Amount] = a.[SourceAmt]
inner join t2 as b on
t1.[id] = b.[id]
and t1.[GroupNum] = b.[GroupNum]
where t1.[id] = 23
)
select [id],
[Group],
[Source],
[GroupNum],
[Amount],
[SourceAmt]
from res
where [takeIt] = 1

Related

Most recent record MS SQL

Need only the most recent record
Current Data:
RequestID  RequestCreateDate        VehID  DeviceNum  ProgramStatus  InvID
1          08/12/2018 13:00:00:212  110    20178      Submitted      A1
2          08/11/2018 11:12:33:322  110    20178      Pending        A1
3          09/08/2018 4:14:28:132   110    Null       Cancelled      A1
4          11/11/2019 10:12:00:123  188    21343      Open           B3
5          12/02/2019 06:15:00:321  188    21343      Submitted      B3
Required result:
RequestID  RequestCreateDate        VehID  DeviceNum  ProgramStatus  InvID
3          09/08/2018 4:14:28:132   110    Null       Cancelled      A1
5          12/02/2019 06:15:00:321  188    21343      Submitted      B3
InvID is from tableB that I am joining.
Here is the query that I am currently trying but there are duplicate records:
Select
max(t1.RequestID) ReqID,
max(t1.RequestCreateDate) NewDate,
t1.VehID,
t1.DeviceNum,
t1.ProgramStatus,
t2.InvID
FROM table1 t1
LEFT JOIN table2 t2 ON t1.VehID = t2.VehID
GROUP BY t1.VehID, t1.DeviceNum, t1.ProgramStatus, t2.InvID
I need only the most recent record for each VehID. Thanks
One option is to filter with a subquery:
select t1.*, t2.invid
from table1 t1
left join table2 t2 on t2.vehid = t1.vehid
where t1.requestCreateDate = (
    select max(t11.requestCreateDate)
    from table1 t11
    where t11.vehid = t1.vehid
)
For performance, consider an index on table1(vehid, requestCreateDate).
You can also use row_number():
select *
from (
    select t1.*, t2.invid,
           row_number() over(partition by t1.vehid order by t1.requestCreateDate desc) rn
    from table1 t1
    left join table2 t2 on t2.vehid = t1.vehid
) t
where rn = 1
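Here is a small runnable sketch of both variants, again using Python's sqlite3 module in place of SQL Server (an assumption; SQLite 3.25 or later is needed for row_number()). The dates are rewritten as ISO text so they sort correctly as strings, and the shape of table2 (one InvID per VehID) is guessed from the question, so treat both as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (RequestID INT, RequestCreateDate TEXT, VehID INT,
                     DeviceNum INT, ProgramStatus TEXT);
CREATE TABLE table2 (VehID INT, InvID TEXT);  -- assumed shape
INSERT INTO table1 VALUES
  (1, '2018-08-12 13:00:00.212', 110, 20178, 'Submitted'),
  (2, '2018-08-11 11:12:33.322', 110, 20178, 'Pending'),
  (3, '2018-09-08 04:14:28.132', 110, NULL,  'Cancelled'),
  (4, '2019-11-11 10:12:00.123', 188, 21343, 'Open'),
  (5, '2019-12-02 06:15:00.321', 188, 21343, 'Submitted');
INSERT INTO table2 VALUES (110, 'A1'), (188, 'B3');
""")

# Variant 1: keep the row whose date equals the per-vehicle maximum.
by_subquery = conn.execute("""
SELECT t1.RequestID, t1.RequestCreateDate, t1.VehID, t1.DeviceNum,
       t1.ProgramStatus, t2.InvID
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t2.VehID = t1.VehID
WHERE t1.RequestCreateDate = (SELECT MAX(t11.RequestCreateDate)
                              FROM table1 AS t11
                              WHERE t11.VehID = t1.VehID)
ORDER BY t1.VehID
""").fetchall()

# Variant 2: rank rows per vehicle, newest first, and keep rank 1.
by_row_number = conn.execute("""
SELECT RequestID, RequestCreateDate, VehID, DeviceNum, ProgramStatus, InvID
FROM (SELECT t1.*, t2.InvID,
             ROW_NUMBER() OVER (PARTITION BY t1.VehID
                                ORDER BY t1.RequestCreateDate DESC) AS rn
      FROM table1 AS t1
      LEFT JOIN table2 AS t2 ON t2.VehID = t1.VehID) AS t
WHERE rn = 1
ORDER BY VehID
""").fetchall()

print(by_row_number)
```

One design difference worth knowing: if two requests for the same vehicle share the exact same timestamp, the subquery variant returns both, while row_number() picks one of them arbitrarily.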

Find recursively the original Id from data in one table

First look at simple data and expected result that I want to achieve:
SAMPLE DATA
Id ParentId Mode
----------- ----------- ---------
28 0 A
29 30 B
30 0 R
31 32 C
32 33 T
33 34 Y
34 0 G
I can get my expected results using this query:
select
t1.Id,
coalesce(t5.Id,t4.Id,t3.Id,t2.Id,t1.Id) as BaseId,
coalesce(t5.Mode,t4.Mode,t3.Mode,t2.Mode,t1.Mode) as BaseMode
from
#Table t1
left join
#Table t2 on t2.ParentId = t1.Id
left join
#Table t3 on t3.ParentId = t2.Id
left join
#Table t4 on t4.ParentId = t3.Id
left join
#Table t5 on t5.ParentId = t4.Id
Expected result:
Id BaseId BaseMode
----------- ----------- ---------
28 28 A
29 29 B
30 29 B
31 31 C
32 31 C
33 31 C
34 31 C
But the problem is that I don't know how many times I will have to left join.
It could be any number.
I tried to use a recursive CTE, but it blows my mind and I can't figure it out. Can anyone show me how to achieve it?
Here are simple data to paste in your management studio:
select *
into #Table
from (select 28 as Id, 0 as ParentId, 'A' as Mode
union all select 29, 30, 'B'
union all select 30, 0, 'R'
union all select 31, 32, 'C'
union all select 32, 33, 'T'
union all select 33, 34, 'Y'
union all select 34, 0, 'G') data
You can try with this:
WITH ParentIdCTE (Id, BaseId, BaseMode, RecursionLevel)
AS
(
SELECT
Id,
Id,
Mode,
0 As RecursionLevel
FROM
#Table
UNION ALL
SELECT
p.Id,
e.Id,
e.Mode,
p.RecursionLevel + 1
FROM
ParentIdCTE As p
INNER JOIN
#Table AS e
ON
e.ParentId = p.BaseId
WHERE
p.BaseId <> 0
AND p.RecursionLevel < 10
)
SELECT
Id,
BaseId,
BaseMode
FROM
ParentIdCTE
INNER JOIN
(
SELECT
Id As MaxRecId,
MAX(RecursionLevel) as MaxRecLevel
FROM
ParentIdCTE AS s
GROUP BY
Id
) AS MaxRec
ON
MaxRecId = Id
AND MaxRecLevel = RecursionLevel
Of course, it can be improved in many ways; it's just an example. The recursion level is capped (at 10 here) as a safeguard, so the query cannot run away if the data contains a cycle.
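To see the recursion work on the sample data without SQL Server, here is a minimal sketch using Python's sqlite3 module (an assumption; SQLite spells it WITH RECURSIVE, and the table name nodes is hypothetical, standing in for #Table). The logic follows the answer: start every row as its own base, keep stepping to the row whose ParentId equals the current base, and keep the deepest step per Id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (Id INT, ParentId INT, Mode TEXT);  -- stands in for #Table
INSERT INTO nodes VALUES
  (28, 0, 'A'), (29, 30, 'B'), (30, 0, 'R'), (31, 32, 'C'),
  (32, 33, 'T'), (33, 34, 'Y'), (34, 0, 'G');
""")

base_rows = conn.execute("""
WITH RECURSIVE chain (Id, BaseId, BaseMode, lvl) AS (
    -- anchor: every row starts as its own base
    SELECT Id, Id, Mode, 0 FROM nodes
    UNION ALL
    -- step: move to the row that names the current base as its parent
    SELECT c.Id, e.Id, e.Mode, c.lvl + 1
    FROM chain AS c
    JOIN nodes AS e ON e.ParentId = c.BaseId
    WHERE c.lvl < 10          -- same safety cap as the answer
)
SELECT c.Id, c.BaseId, c.BaseMode
FROM chain AS c
JOIN (SELECT Id, MAX(lvl) AS max_lvl FROM chain GROUP BY Id) AS m
  ON m.Id = c.Id AND m.max_lvl = c.lvl
ORDER BY c.Id
""").fetchall()

for row in base_rows:
    print(row)
```

This assumes each Id has at most one row pointing at it as a parent, as in the sample; with branching chains the deepest level could be reached by more than one row.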

How to find the cumulative sum in SubQuery? [duplicate]

create table #t
(
    id int,
    SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
The above select returns the following:
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id srome CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
SQL Fiddle example
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!
The latest version of SQL Server (2012) permits the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.
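As a quick sanity check that the self-join and the window-function forms agree, here is a sketch run through Python's sqlite3 module instead of SQL Server (an assumption; SQLite 3.25 or later is required for the OVER clause). The table name t is a stand-in for #t, and the data is the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, SomeNumt INT);  -- stands in for #t
INSERT INTO t VALUES (1, 10), (2, 12), (3, 3), (4, 15), (5, 23);
""")

# Portable form: join every row to all rows with a smaller or equal id.
self_join = conn.execute("""
SELECT t1.id, t1.SomeNumt, SUM(t2.SomeNumt) AS CumSrome
FROM t AS t1
JOIN t AS t2 ON t1.id >= t2.id
GROUP BY t1.id, t1.SomeNumt
ORDER BY t1.id
""").fetchall()

# Window form: a running frame from the first row up to the current row.
windowed = conn.execute("""
SELECT id, SomeNumt,
       SUM(SomeNumt) OVER (ORDER BY id
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumSrome
FROM t
ORDER BY id
""").fetchall()

print(windowed)
```

Both queries return the same five rows here. The self-join touches on the order of n squared row pairs, which is why the window form tends to scale better on large tables.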
For SQL Server 2012 onwards it could be easy:
SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM #t
because when an ORDER BY clause is given for SUM without an explicit frame, the window frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ("General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx)
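The default RANGE frame only behaves differently from ROWS when the ordering column has ties, which is easy to miss. Here is a small sketch of that difference, run through Python's sqlite3 module as a stand-in for SQL Server (an assumption; SQLite 3.25 or later, which uses the same default frame). The table t and its columns k and v are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (k INT, v INT);            -- hypothetical table with a tie on k = 1
INSERT INTO t VALUES (1, 10), (1, 5), (2, 1);
""")

rows = conn.execute("""
SELECT k, v,
       SUM(v) OVER (ORDER BY k) AS range_sum,   -- default frame: RANGE ... CURRENT ROW
       SUM(v) OVER (ORDER BY k
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
FROM t
ORDER BY k
""").fetchall()

range_sums = [r[2] for r in rows]
rows_sums = [r[3] for r in rows]
print(range_sums, rows_sums)
```

With RANGE, both k = 1 rows see the full total of their peer group (15), whereas ROWS counts them one at a time, in an order that is not guaranteed when the key is tied. With a unique ordering column such as id, the two frames give the same result.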
Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining same table (self joining)
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Now we just sum the SomeValue of c2 and we'll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum (SomeValue) Over (Order By c1.ID )
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumulativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum
A CTE version, just for fun:
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
Late answer but showing one more possibility...
Cumulative sum generation can be optimized further with CROSS APPLY.
In the actual query plans I analyzed, it performed better than the INNER JOIN and OVER clause versions:
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
       SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
    ON T1.id >= T2.id
GROUP BY T1.id
/* Using OVER clause
Query cost relative to the batch 37%
*/
-- ORDER BY (not PARTITION BY) is what makes this a running total
SELECT T1.id,
       SUM(T1.id) OVER (ORDER BY T1.id) AS CumSum
FROM #TMP T1
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15
Select
*,
(Select Sum(SOMENUMT)
From #t S
Where S.id <= M.id)
From #t M
You can use this simple query for progressive calculation :
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t
There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
The problem in this thread can be expressed like this:
DECLARE @RT INT
SELECT @RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
        id
       ,SomeNumt
       ,MySum
     FROM #t -- assumes #t has an extra MySum column to receive the running total
     order by id
   )
update abcd
set @RT = MySum = @RT + SomeNumt
output inserted.*
For example, if you have a table with two columns, an ID and a number, and want to find the cumulative sum:
SELECT ID,Number,SUM(Number)OVER(ORDER BY ID) FROM T
Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id
The SQL solution which combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what I wanted to achieve.
Thank you so much!
In case it helps anyone, here was my case. I wanted to add 1 to a running count in a column whenever the maker is "Some Maker" (example). Otherwise there is no increment, and the previous count is shown.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 Some Maker 1
Explanation of the above: the count of "Some Maker" starts at 0, and each time Some Maker is found we add 1. For User 1, when MakerC is found we don't add 1, so the running count stays at 2 until the next Some Maker row.
Partitioning is by user, so when the user changes, the cumulative count goes back to zero.
I am at work and don't want any credit for this answer; I just want to say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION, but the amazing syntax "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" completed the task.
Thanks!
Groaker
Above (pre-SQL Server 2012) we see examples like this:
SELECT
    T1.id, SUM(T2.id) AS CumSum
FROM
    #TMP T1
    JOIN #TMP T2 ON T2.id <= T1.id
GROUP BY
    T1.id
More efficient...
-- LEFT JOIN keeps the first row, which has no smaller id to join to
SELECT
    T1.id, ISNULL(SUM(T2.id), 0) + T1.id AS CumSum
FROM
    #TMP T1
    LEFT JOIN #TMP T2 ON T2.id < T1.id
GROUP BY
    T1.id
Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;
Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id
Without using any kind of JOIN, the cumulative salary per person can be fetched with the following query (note the MySQL-style backtick quoting):
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name

Querying SQL Server to obtain sum total using CTE and joins

I have the query below, which I have been working on since yesterday. It returns some 33 Employee records with EmployeeId, based on various conditions:
With CTE as
(
select EmployeeId, and other columns with joins and conditions.
)
Now I want to join this query to obtain the sum of invoices for each employee from the tables below, table1 and table2.
table1 has employeeid, and since my CTE has employeeid too, I can join it with table1:
With CTE as
(
select EmployeeId, and other columns with joins and conditions.
)
select *, table1.invoiceId
from CTE
left join table1 on table1.employeeid = CTE.employeeId
left join table2 on table2.invoiceid = table1.invoiceid
group by
But table1 only has invoices; the amount spent on each invoice is in another table, table2. table2 has a column "amount" that I need to sum up per invoiceid.
For more clarity, I am writing out the table structure and expected output below.
I am trying the query above, but it is not showing correct results.
Assume CTE has
EmployeeId  empName  Empaddress  empcode
1           john     America     121
2           sandy    America     122
Now table1 has
InvoiceId  EmployeeId  RecordId  PayeeId
1          1           223       202
2          1           222       212
3          1           121       378
4          2           229       987
5          2           345       333
table2 has the column amount that we need for each invoice of an employee.
Now table2
InvLine  Invoiceid  Amount
1        1          30
2        1          30
3        1          20
4        2          10
5        2          10
6        2          10
In the output, employee john has two invoices in table1, with Ids 1 and 2, and the amounts for invoices 1 and 2 need to be added up:
EmployeeId  empName  Empaddress  empcode  Amount
1           john     America     121      80
With CTE as
(
    select EmployeeId, and other columns with joins and conditions.
),
CTE1 as
(
    select CTE.EmployeeId, table1.invoiceid
    from CTE
    left join table1 on table1.employeeid = CTE.employeeId
)
select sum(amount), cte1.employeeId
from CTE1
left join table2 on table2.invoiceid = cte1.invoiceid
group by cte1.employeeId
However, you can join table1 in the first CTE itself. There is no need for a second CTE if the first one is simple.
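Here is a runnable sketch of the single-query version on the question's sample rows, using Python's sqlite3 module in place of SQL Server (an assumption). The employees table is hypothetical and stands in for the asker's CTE, whose real definition is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (EmployeeId INT, empName TEXT);  -- stands in for the CTE
CREATE TABLE table1 (InvoiceId INT, EmployeeId INT, RecordId INT, PayeeId INT);
CREATE TABLE table2 (InvLine INT, InvoiceId INT, Amount INT);
INSERT INTO employees VALUES (1, 'john'), (2, 'sandy');
INSERT INTO table1 VALUES (1, 1, 223, 202), (2, 1, 222, 212), (3, 1, 121, 378),
                          (4, 2, 229, 987), (5, 2, 345, 333);
INSERT INTO table2 VALUES (1, 1, 30), (2, 1, 30), (3, 1, 20),
                          (4, 2, 10), (5, 2, 10), (6, 2, 10);
""")

# Employee -> invoices -> invoice lines, summed per employee.
# LEFT JOINs keep employees with no invoice lines; COALESCE turns their NULL into 0.
totals = conn.execute("""
SELECT e.EmployeeId, e.empName, COALESCE(SUM(t2.Amount), 0) AS Amount
FROM employees AS e
LEFT JOIN table1 AS t1 ON t1.EmployeeId = e.EmployeeId
LEFT JOIN table2 AS t2 ON t2.InvoiceId = t1.InvoiceId
GROUP BY e.EmployeeId, e.empName
ORDER BY e.EmployeeId
""").fetchall()

print(totals)
```

With the rows exactly as posted, john comes out at 110 (80 for invoice 1 plus 30 for invoice 2); the 80 in the question's expected output matches invoice 1 on its own.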

How to get sum of a column for three different tables without Foreign Key relationship

I have three tables with common id and different cost for each table.
I need to display the sum of cost1, cost2, cost3 for all unique id as single ResultSet.
table1:
id cost1
1 100
1 100
2 200
table2:
id cost2
1 100
2 100
2 100
table 3:
id cost3
1 100
2 100
1 100
The output should show the sum of the cost column from each table:
Output:
id cost1 cost2 cost3
1 200 100 200
2 200 200 100
Could anyone suggest me the best solution for this.
Your question is not exactly clear but I think you are looking for something along these lines.
with costs1 as
(
select ID
, SUM(cost1) as cost1
from table1
group by ID
)
, costs2 as
(
select ID
, SUM(cost2) as cost2
from table2
group by ID
)
, costs3 as
(
select ID
, SUM(cost3) as cost3
from table3
group by ID
)
select c1.ID
, c1.cost1
, c2.cost2
, c3.cost3
from costs1 c1
join costs2 c2 on c2.ID = c1.ID
join costs3 c3 on c3.ID = c1.ID
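The reason for aggregating each table first is that joining the raw tables on id multiplies rows before the SUM runs. Here is a small sketch that shows both results side by side, using Python's sqlite3 module as a stand-in for SQL Server (an assumption) and the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, cost1 INT);
CREATE TABLE table2 (id INT, cost2 INT);
CREATE TABLE table3 (id INT, cost3 INT);
INSERT INTO table1 VALUES (1, 100), (1, 100), (2, 200);
INSERT INTO table2 VALUES (1, 100), (2, 100), (2, 100);
INSERT INTO table3 VALUES (1, 100), (2, 100), (1, 100);
""")

# Naive: join first, sum afterwards. Each id's rows fan out across the tables.
naive = conn.execute("""
SELECT a.id, SUM(a.cost1), SUM(b.cost2), SUM(c.cost3)
FROM table1 AS a
JOIN table2 AS b ON b.id = a.id
JOIN table3 AS c ON c.id = a.id
GROUP BY a.id
ORDER BY a.id
""").fetchall()

# Pre-aggregated: one row per id from each table, then a one-to-one join.
pre_aggregated = conn.execute("""
WITH costs1 AS (SELECT id, SUM(cost1) AS cost1 FROM table1 GROUP BY id),
     costs2 AS (SELECT id, SUM(cost2) AS cost2 FROM table2 GROUP BY id),
     costs3 AS (SELECT id, SUM(cost3) AS cost3 FROM table3 GROUP BY id)
SELECT c1.id, c1.cost1, c2.cost2, c3.cost3
FROM costs1 AS c1
JOIN costs2 AS c2 ON c2.id = c1.id
JOIN costs3 AS c3 ON c3.id = c1.id
ORDER BY c1.id
""").fetchall()

print(naive)
print(pre_aggregated)
```

Note that the inner joins drop any id missing from one of the three tables; if that can happen in your data, a full list of ids with LEFT JOINs would be needed instead.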
A UNION ALL followed with a PIVOT will solve this more efficiently than a JOIN to each table.
SQL Fiddle Demo
WITH t1(id,costnum,cost) AS (
SELECT id,'cost1',cost1 FROM Table1 UNION ALL
SELECT id,'cost2',cost2 FROM Table2 UNION ALL
SELECT id,'cost3',cost3 FROM Table3
)
SELECT *
FROM t1
PIVOT(SUM(cost) FOR costnum IN ([cost1],[cost2],[cost3])) t2
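PIVOT is specific to SQL Server, so for trying the UNION ALL idea elsewhere, here is a sketch that swaps PIVOT for conditional aggregation (SUM over CASE), which is a different but equivalent technique for this data. It runs through Python's sqlite3 module, which is an assumption on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (id INT, cost1 INT);
CREATE TABLE Table2 (id INT, cost2 INT);
CREATE TABLE Table3 (id INT, cost3 INT);
INSERT INTO Table1 VALUES (1, 100), (1, 100), (2, 200);
INSERT INTO Table2 VALUES (1, 100), (2, 100), (2, 100);
INSERT INTO Table3 VALUES (1, 100), (2, 100), (1, 100);
""")

# Stack all three tables into (id, costnum, cost), then fold each costnum
# back into its own column with SUM(CASE ...), standing in for PIVOT.
pivoted = conn.execute("""
WITH u (id, costnum, cost) AS (
    SELECT id, 'cost1', cost1 FROM Table1 UNION ALL
    SELECT id, 'cost2', cost2 FROM Table2 UNION ALL
    SELECT id, 'cost3', cost3 FROM Table3
)
SELECT id,
       SUM(CASE WHEN costnum = 'cost1' THEN cost END) AS cost1,
       SUM(CASE WHEN costnum = 'cost2' THEN cost END) AS cost2,
       SUM(CASE WHEN costnum = 'cost3' THEN cost END) AS cost3
FROM u
GROUP BY id
ORDER BY id
""").fetchall()

print(pivoted)
```

Unlike the join-based version, this keeps an id even when it is missing from one of the tables; the corresponding column is then NULL.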
We can achieve the same using UNION ALL and a derived table:
select A.id,
       SUM(A.cost1) as cost1,
       SUM(A.cost2) as cost2,
       SUM(A.cost3) as cost3
from (
    select id, SUM(cost1) as cost1, 0 as cost2, 0 as cost3 from table1
    group by id
    UNION ALL
    select id, 0 as cost1, SUM(cost2) as cost2, 0 as cost3 from table2
    group by id
    UNION ALL
    select id, 0 as cost1, 0 as cost2, SUM(cost3) as cost3 from table3
    group by id
) A
group by A.id
