Having a transaction table with the following rows:
Id UserId PlatformId TransactionTypeId
-------------------------------------------------
0 1 3 1
1 1 1 2
2 2 3 2
3 3 2 1
4 2 3 1
How do I write a stored procedure that can aggregate the rows into a new table with the following format?
Id UserId Platforms TransactionTypeId
-------------------------------------------------
0 1 {"p3":1,"p1":1} {"t1":1,"t2":1}
1 2 {"p3":2} {"t2":1,"t1":1}
3 3 {"p2":1} {"t1":1}
So the rows are grouped by user, each platform/transaction type is counted, and the counts are stored as a key/value JSON string.
Ref: My previous related question
You could use GROUP BY and FOR JSON:
SELECT MIN(ID) AS ID, UserId, MIN(sub.x) AS Platforms, MIN(sub2.x) AS Transactions
FROM tab t
OUTER APPLY (SELECT CONCAT('p', platformId) AS platform, cnt
FROM (SELECT PlatformId, COUNT(*) AS cnt
FROM tab t2 WHERE t2.UserId = t.UserId
GROUP BY PlatformId) s
FOR JSON AUTO) sub(x)
OUTER APPLY (SELECT CONCAT('t', TransactiontypeId) AS Transactions, cnt
FROM (SELECT TransactiontypeId, COUNT(*) AS cnt
FROM tab t2 WHERE t2.UserId = t.UserId
GROUP BY TransactiontypeId) s
FOR JSON AUTO) sub2(x)
GROUP BY UserId;
The result is a bit different (an array of key-value objects), but please treat it as a starting point.
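To make the target shape concrete, here is a minimal sketch that produces the key/value form the question asks for. It is not a stored procedure: it assumes the counting is done with a plain GROUP BY (run here against an in-memory SQLite database) and the JSON strings are assembled on the client in Python. The table name `tab` follows the query above; the helper names `aggregate` and `counts` are hypothetical.

```python
import json
import sqlite3
from collections import defaultdict

# Sample rows from the question: (Id, UserId, PlatformId, TransactionTypeId).
ROWS = [(0, 1, 3, 1), (1, 1, 1, 2), (2, 2, 3, 2), (3, 3, 2, 1), (4, 2, 3, 1)]


def aggregate(rows):
    """Return {UserId: (platforms_json, types_json)} with per-user counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tab (Id INT, UserId INT, PlatformId INT, TransactionTypeId INT)"
    )
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", rows)

    def counts(column, prefix):
        # `column` is one of two fixed identifiers, never user input.
        sql = (
            f"SELECT UserId, {column}, COUNT(*) FROM tab "
            f"GROUP BY UserId, {column} ORDER BY UserId, {column}"
        )
        per_user = defaultdict(dict)
        for user_id, key, cnt in conn.execute(sql):
            per_user[user_id][f"{prefix}{key}"] = cnt
        return per_user

    platforms = counts("PlatformId", "p")
    types = counts("TransactionTypeId", "t")
    conn.close()
    return {
        user: (
            json.dumps(platforms[user], separators=(",", ":")),
            json.dumps(types[user], separators=(",", ":")),
        )
        for user in sorted(platforms)
    }


result = aggregate(ROWS)
for user, (platforms_json, types_json) in result.items():
    print(user, platforms_json, types_json)
```

Keys come out sorted by id here (`p1` before `p3`), which differs from the order in the question's sample; JSON object key order carries no meaning, so the content is the same.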
Your sample JSON is not really a json, but since you want it that way:
SELECT u.UserId, plt.pValue, ttyp.ttValue
FROM Users AS [u]
CROSS APPLY (
SELECT '{'+STUFF( (SELECT ',"'+pn.pName+'":'+LTRIM(STR(pn.pCount))
FROM (SELECT p.Name AS pName, COUNT(*) AS pCount
FROM transactions t
left JOIN Platforms p ON p.PlatformID = t.PlatformId
WHERE t.UserId = u.UserId
GROUP BY p.PlatformId, p.Name
) pn
FOR XML PATH('')),1,1,'')+'}'
) plt(pValue)
CROSS APPLY (
SELECT '{'+STUFF( (SELECT ',"'+tty.ttName+'":'+LTRIM(STR(tty.ttCount))
FROM (SELECT tt.Name AS ttName, COUNT(*) AS ttCount
FROM transactions t
left JOIN dbo.TransactionType tt ON tt.TransactionTypeId = t.TransactionTypeID
WHERE t.UserId = u.UserId
GROUP BY tt.TransactionTypeId, tt.Name
) tty
FOR XML PATH('')),1,1,'')+'}'
) ttyp(ttValue)
WHERE EXISTS (SELECT * FROM transactions t WHERE u.UserId = t.UserId)
ORDER BY UserId;
How do I create an MSSQL query that joins TableA with TableB on the ID field, but only to the TableB row that has the highest value in the Number column for each ID?
TableA
ID
1
2
3
4
TableB
ID Number
1 1
1 2
1 3
2 1
3 1
3 2
4 1
4 2
4 3
I would want this as my output
TableJoined
ID Number
1 3
2 1
3 2
4 3
Is there a way to use a LEFT JOIN to achieve this or using max()?
Both. Use aggregation on the left join.
Select t1.id, max(t2.number)
From table1 t1
Left join table2 t2 on t1.id= t2.id
Group by t1.id;
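A quick way to check the LEFT JOIN plus MAX approach against the question's data is to run it on an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for SQL Server, and the extra TableA row with ID 5 is a hypothetical addition to show what the LEFT JOIN does when TableB has no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INT);
    CREATE TABLE TableB (ID INT, Number INT);
    -- IDs 1-4 are from the question; ID 5 is a hypothetical unmatched row.
    INSERT INTO TableA VALUES (1), (2), (3), (4), (5);
    INSERT INTO TableB VALUES
        (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (4, 1), (4, 2), (4, 3);
""")

# One output row per TableA.ID, carrying the largest matching Number.
joined = conn.execute("""
    SELECT a.ID, MAX(b.Number) AS Number
    FROM TableA AS a
    LEFT JOIN TableB AS b ON a.ID = b.ID
    GROUP BY a.ID
    ORDER BY a.ID
""").fetchall()
conn.close()

print(joined)
```

The unmatched ID is kept with a NULL (Python `None`) maximum; an INNER JOIN would drop it instead.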
You can query as below:
Select a.Id, Number from #a a join
(
Select top(1) with ties * from #b
order by row_number() over(partition by id order by number desc)
) b on a.id = b.id
Select A.Id, Max(Number) MaxNo from A
join B on A.Id=B.Id
Group by A.Id
create table #a(
id int
)
go
create table #b(
id int,
number int
)
go
insert into #a values(1),(2),(3),(4)
insert into #b values(1,1),(1,2),(1,3),(2,1),(3,1),(3,2),(4,1),(4,2),(4,3)
select #b.id,MAX(number) as maximum
from #b left outer join #a on #b.id=#a.id
group by #b.id
create table #t
(
id int,
SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
The above select returns the following:
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id SomeNumt CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!
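The self-join above can be sketched end to end with Python's built-in sqlite3 module, used here only as a convenient stand-in for "most db platforms". The table is named `t` rather than `#t` because SQLite has no `#` temp-table syntax. The second query counts how many row pairs the join produces, which hints at why platform-specific alternatives can be faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INT, SomeNumt INT);
    INSERT INTO t VALUES (1, 10), (2, 12), (3, 3), (4, 15), (5, 23);
""")

# Each row is joined to itself and every earlier row, then summed.
running = conn.execute("""
    SELECT t1.id, t1.SomeNumt, SUM(t2.SomeNumt) AS CumSrome
    FROM t AS t1
    INNER JOIN t AS t2 ON t1.id >= t2.id
    GROUP BY t1.id, t1.SomeNumt
    ORDER BY t1.id
""").fetchall()

# Number of joined pairs before grouping: n * (n + 1) / 2 for n rows.
pairs = conn.execute(
    "SELECT COUNT(*) FROM t AS t1 INNER JOIN t AS t2 ON t1.id >= t2.id"
).fetchone()[0]
conn.close()

print(running)
print(pairs)
```

For 5 rows the join materializes 15 pairs; the pair count grows quadratically with the number of rows.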
SQL Server 2012 and later permit the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.
For SQL Server 2012 onwards it could be easy:
SELECT id, SomeNumt, sum(SomeNumt) OVER (ORDER BY id) as CumSrome FROM #t
because, when ORDER BY is specified without an explicit frame, the default window frame for SUM is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (see "General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx)
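The default RANGE frame only differs from an explicit ROWS frame when the ORDER BY column has duplicate values. A small sketch of that difference, run on SQLite from Python (SQLite 3.25 or later is assumed for window functions, and its default frame is also RANGE); the table `t` and its data are hypothetical, with two rows sharing `k = 1`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (k INT, val INT);
    -- Hypothetical data: two rows tie on the ordering column k.
    INSERT INTO t VALUES (1, 10), (1, 20), (2, 5);
""")

# Default frame (RANGE): tied rows are peers and share one running total.
range_sums = sorted(
    row[0] for row in conn.execute("SELECT SUM(val) OVER (ORDER BY k) FROM t")
)

# Explicit ROWS frame: each physical row gets its own running total.
rows_sums = sorted(
    row[0]
    for row in conn.execute("""
        SELECT SUM(val) OVER (
            ORDER BY k ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        )
        FROM t
    """)
)
conn.close()

print(range_sums)
print(rows_sums)
```

With RANGE both tied rows report 30. With ROWS the first tied row reports 10 or 20, depending on which of the two the engine happens to visit first. When `id` is unique, as in the question, the two frames give the same result.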
Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining same table (self joining)
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Now just sum the SomeValue of c2 and we'll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum (SomeValue) Over (Order By c1.ID )
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumulativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum
A CTE version, just for fun:
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63
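The recursive CTE above joins on `a.id = t.id - 1`, so it relies on the ids being consecutive and starting at 1. One possible way around that, sketched here on SQLite from Python, is to number the rows first and recurse on the row number. The gappy ids and the `numbered` / `rn` names are hypothetical; SQLite needs the `RECURSIVE` keyword, which T-SQL omits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INT, SomeNumt INT);
    -- Hypothetical ids with gaps; the values are the question's.
    INSERT INTO t VALUES (1, 10), (2, 12), (4, 3), (7, 15), (9, 23);
""")

gap_safe = conn.execute("""
    WITH RECURSIVE
    numbered AS (
        -- Dense 1..n numbering, independent of gaps in id.
        SELECT id, SomeNumt, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM t
    ),
    abcd AS (
        SELECT id, SomeNumt, SomeNumt AS MySum, rn FROM numbered WHERE rn = 1
        UNION ALL
        SELECT n.id, n.SomeNumt, n.SomeNumt + a.MySum, n.rn
        FROM numbered AS n
        JOIN abcd AS a ON n.rn = a.rn + 1
    )
    SELECT id, SomeNumt, MySum FROM abcd ORDER BY id
""").fetchall()
conn.close()

print(gap_safe)
```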
Late answer but showing one more possibility...
Cumulative sum generation can also be written with CROSS APPLY.
Judging by the relative costs in the actual query plan (shown in the comments below), it came out cheaper than the INNER JOIN and OVER clause versions on this small table.
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
ON T1.id > = T2.id
GROUP BY T1.id
/* Using OVER clause
Query cost relative to the batch 37% (measured with PARTITION BY id;
ORDER BY id is what actually makes the sum cumulative)
*/
SELECT T1.id,
SUM(T1.id) OVER (ORDER BY T1.id) AS CumSum
FROM #TMP T1
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15
Select
*,
(Select Sum(SOMENUMT)
From #t S
Where S.id <= M.id)
From #t M
You can use this simple query for progressive calculation :
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t
There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
The problem in this thread can be expressed like this:
DECLARE @RT INT
SELECT @RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
id
,SomeNumt
,MySum
FROM #t -- assumes #t has an extra MySum int column to receive the running total
order by id
)
update abcd
set @RT = MySum = @RT + SomeNumt
output inserted.*
For example, if you have a table with two columns, ID and Number, and want to find the cumulative sum:
SELECT ID,Number,SUM(Number)OVER(ORDER BY ID) FROM T
Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id
The SQL solution which combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what I wanted to achieve.
Thank you so much!
If it can help anyone, here was my case. I wanted to cumulate +1 in a column whenever a maker is found as "Some Maker" (example). If not, no increment but show previous increment result.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 Some Maker 1
Explanation of the above: the count of "Some Maker" starts at 0, and each time Some Maker is found we add 1. For User 1, MakerC is found at Rank5, so we don't add 1 and the running count stays at 2 until the next matching row.
Partitioning is by User so when we change user, cumulative count is back to zero.
I am at work and I don't want any credit for this answer; I just want to say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION BY, but the "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" syntax completed the task.
Thanks!
Groaker
Above (pre-SQL Server 2012) we see examples like this:
SELECT
T1.id, SUM(T2.id) AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id < = T1.id
GROUP BY
T1.id
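More efficient (the LEFT JOIN with ISNULL keeps the first row, which has no smaller id to join to)...
SELECT
T1.id, ISNULL(SUM(T2.id), 0) + T1.id AS CumSum
FROM
#TMP T1
LEFT JOIN #TMP T2 ON T2.id < T1.id
GROUP BY
T1.id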
Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;
Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id
Without using any type of JOIN, the cumulative salary per person can be fetched with the following query (note the MySQL backtick quoting):
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name
Here is a small simplified snippet of my data
OrderID QTY ItemID ActualQTY(does not exist in database)
1 2 1
2 1 2
3 1 1
4 5 3
Now I need a query that will fill in the ActualQTY based on the ItemID's. So summing total QTY for ItemID 1 = 3, and for ItemID 2 = 1, and last for ItemID 3 = 5
It should look like this
OrderID QTY ItemID ActualQTY
1 2 1 3
2 1 2 1
3 1 1 3
4 5 3 5
The problem is I am new to TSQL and I can't figure out a good way to do this.
--EDIT--
Someone else helped me with this problem and gave the solution below, which seems like the most efficient one to me. However, it doesn't work if you need to apply it to an XSD file in Visual Studio, so I turned it into a table-valued function on the server.
SELECT OrderID, QTY, ItemID, SUM(QTY) OVER(PARTITION BY ItemID) AS ActualQTY
FROM tablename -- placeholder table name
If this solution doesn't work for you, fall back to the answers below.
You can do this:
SELECT
t2.OrderID,
t2.QTY,
t2.ItemID,
t1.ActualQTY
FROM
(
SELECT
ItemId,
SUM(QTY) AS ActualQTY
FROM tablename AS t1
GROUP BY ItemId
) AS t1
INNER JOIN tablename AS t2 ON t1.ItemId = t2.ItemID;
But if you want to update the column ActualQTY values, not just select, you can do this:
UPDATE t1
SET t1.ActualQTY = t2.ActualQTY
FROM tablename AS t1
INNER JOIN
(
SELECT
ItemId,
SUM(QTY) AS ActualQTY
FROM tablename AS t1
GROUP BY ItemId
) AS t2 ON t1.ItemID = t2.ItemID
SELECT b.OrderID,
b.QTY,
b.ItemID,
a.total_sum AS ActualQTY
FROM (SELECT ItemID,
Sum(QTY) AS total_sum
FROM tablename
GROUP BY ItemID) AS a
INNER JOIN tablename b
ON a.ItemID = b.ItemID
Select t.orderid, t.qty, t.itemid, a.actualq as actualqty
From yourtable t join (
Select sum(qty) as actualq, itemid
From yourtable
Group by itemid) a
On t.itemid = a.itemid
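The window-function form from the question's edit and the join-to-aggregate form used in the answers can be checked against each other. A minimal sketch on SQLite from Python (SQLite 3.25 or later is assumed for `OVER`); the table name `orders` is hypothetical, since the thread only uses placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (OrderID INT, QTY INT, ItemID INT);
    INSERT INTO orders VALUES (1, 2, 1), (2, 1, 2), (3, 1, 1), (4, 5, 3);
""")

# Window form: every row keeps its identity and gains the per-item total.
windowed = conn.execute("""
    SELECT OrderID, QTY, ItemID,
           SUM(QTY) OVER (PARTITION BY ItemID) AS ActualQTY
    FROM orders
    ORDER BY OrderID
""").fetchall()

# Join form: aggregate per item first, then join the totals back.
joined = conn.execute("""
    SELECT o.OrderID, o.QTY, o.ItemID, s.ActualQTY
    FROM orders AS o
    INNER JOIN (
        SELECT ItemID, SUM(QTY) AS ActualQTY
        FROM orders
        GROUP BY ItemID
    ) AS s ON s.ItemID = o.ItemID
    ORDER BY o.OrderID
""").fetchall()
conn.close()

print(windowed)
print(windowed == joined)
```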
I have three tables: tableA (id, A), tableB (id, B) and tableC (id, C). id is unique and is the primary key in each table. I want to run a query on these three tables to find the sum of the values of A, B and C for every id (e.g. if id 1 is present in tableA but not in tableB, then B should be treated as 0 for id 1).
Example:
tableA:
id A
1 5
2 6
3 2
5 7
tableB:
id B
2 5
3 8
4 1
tableC:
id C
5 2
the output should be:
id Sum
1 (5 + 0 + 0 =)5
2 (6 + 5 + 0 =)11
3 (2 + 8 + 0 =)10
4 (0 + 1 + 0 =)1
5 (7 + 0 + 2 =)9
First get a distinct list ( UNION ) of the IDs so that you include all, then LEFT JOIN to add the values together.
Something like
SELECT IDs.ID,
IFNULL(tableA.A,0) + IFNULL(tableB.B,0) + IFNULL(tableC.C,0) SumVal
FROM (
SELECT ID
FROM tableA
UNION
SELECT ID
FROM tableB
UNION
SELECT ID
FROM tableC
) IDs LEFT JOIN
tableA ON IDs.ID = tableA.ID LEFT JOIN
tableB ON IDs.ID = tableB.ID LEFT JOIN
tableC ON IDs.ID = tableC.ID
Something like this should work:
-- identifiers are left unquoted: MySQL treats "..." as string literals unless ANSI_QUOTES is enabled
select id, sum(val) as total from
( select id, A as val from tableA
union all
select id, B as val from tableB
union all
select id, C as val from tableC ) as joined
group by id
order by id
I could not test it with MySQL, but this works on my databases (HSQLDB, Oracle):
select ID, sum(X) from
(SELECT ID, A as X FROM tableA
UNION ALL
SELECT ID, B as X FROM tableB
UNION ALL
SELECT ID, C as X FROM tableC) t
group by ID
Not sure about the exact MySQL syntax, but this works in SQL Server:
SELECT ID, SUM(ColToSum) As SumValue FROM
(
SELECT ID, A As ColToSum FROM TableA
UNION ALL
SELECT ID, B As ColToSum FROM TableB
UNION ALL
SELECT ID, C As ColToSum FROM TableC
) Combined
GROUP BY ID
Remember to use "UNION ALL", not just "UNION" which strips out duplicate rows as it combines (see http://dev.mysql.com/doc/refman/5.0/en/union.html)
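The difference only shows up when two tables hold the same (id, value) pair, which the question's sample data happens not to contain. A small sketch on SQLite from Python: it first runs the UNION ALL query on the question's data, then adds one hypothetical row, `tableC (2, 5)`, which duplicates the pair already present in tableB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, A INT);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, B INT);
    CREATE TABLE tableC (id INTEGER PRIMARY KEY, C INT);
    INSERT INTO tableA VALUES (1, 5), (2, 6), (3, 2), (5, 7);
    INSERT INTO tableB VALUES (2, 5), (3, 8), (4, 1);
    INSERT INTO tableC VALUES (5, 2);
""")

# {op} is filled with one of two fixed keywords below, never user input.
QUERY = """
    SELECT id, SUM(val) AS SumValue FROM (
        SELECT id, A AS val FROM tableA
        {op}
        SELECT id, B FROM tableB
        {op}
        SELECT id, C FROM tableC
    ) AS combined
    GROUP BY id
    ORDER BY id
"""

question_data = conn.execute(QUERY.format(op="UNION ALL")).fetchall()

# Hypothetical extra row: the same (id, value) pair as tableB's (2, 5).
conn.execute("INSERT INTO tableC VALUES (2, 5)")
with_union_all = dict(conn.execute(QUERY.format(op="UNION ALL")).fetchall())
with_union = dict(conn.execute(QUERY.format(op="UNION")).fetchall())
conn.close()

print(question_data)
print(with_union_all[2], with_union[2])
```

With the extra row, UNION ALL sums 6 + 5 + 5 for id 2, while UNION collapses the two identical (2, 5) rows and silently under-counts.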