Cumulative totals field - sql-server

I have a data set that looks like this:
Key TypeSeq Type Duration
-------------------------------------
29671461 10 1001 4
29671461 20 1002 2
29671461 30 1003 0
29671461 40 1004 0
29671461 70 1007 261
29671463 10 1001 3
29671463 20 1002 5
29671463 30 1003 7
29671463 40 1004 8
29671463 70 1007 261
I have found this, but I am trying to group by ID rather than sum by it:
select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
I need a 5th column that does a cumulative total by key column

You can use sum() as a window function. If an order by is used together with that, you get a cumulative sum:
select "Key", TypeSeq, type, duration,
sum(duration) over (partition by "Key" order by TypeSeq) as sum
from the_table
order by "Key", TypeSeq;
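The windowed SUM with ORDER BY needs SQL Server 2012 or later. On older versions, the self-join from the question can be adapted by also joining on the key. The sketch below checks that idea against the sample data. It runs the query through Python's built-in sqlite3 module purely as a stand-in for SQL Server (an assumption, so the example can be run anywhere), and the alias CumulativeDuration is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (Key, TypeSeq, Type, Duration)
rows = [
    (29671461, 10, 1001, 4),
    (29671461, 20, 1002, 2),
    (29671461, 30, 1003, 0),
    (29671461, 40, 1004, 0),
    (29671461, 70, 1007, 261),
    (29671463, 10, 1001, 3),
    (29671463, 20, 1002, 5),
    (29671463, 30, 1003, 7),
    (29671463, 40, 1004, 8),
    (29671463, 70, 1007, 261),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute(
    'CREATE TABLE the_table ("Key" INTEGER, TypeSeq INTEGER, Type INTEGER, Duration INTEGER)'
)
conn.executemany("INSERT INTO the_table VALUES (?, ?, ?, ?)", rows)

# Self-join: each row t1 is paired with every row t2 of the same key
# that comes at or before it in TypeSeq order, then the pairs are summed.
query = """
SELECT t1."Key", t1.TypeSeq, t1.Type, t1.Duration,
       SUM(t2.Duration) AS CumulativeDuration
FROM the_table AS t1
INNER JOIN the_table AS t2
        ON t2."Key" = t1."Key"
       AND t2.TypeSeq <= t1.TypeSeq
GROUP BY t1."Key", t1.TypeSeq, t1.Type, t1.Duration
ORDER BY t1."Key", t1.TypeSeq
"""
cumulative = conn.execute(query).fetchall()
for row in cumulative:
    print(row)
```

The running total restarts at each new key because the join never pairs rows from different keys. The self-join does work proportional to the square of the rows per key, so the window function is usually the better choice where it is available.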

Related

Grouping rows to minimise deviation

I have a Employee Wages table like this, with their EmpID and their wages.
EmpId | Wages
================
101 | 1280
102 | 1600
103 | 1400
104 | 1401
105 | 1430
106 | 1300
I need to write a stored procedure in SQL Server that groups the employees by their wages, so that employees with similar salaries end up in the same group and the deviation within each group is as small as possible.
There are no other conditions or rules mentioned.
The output should look like this
EmpId | Wages | Group
=======================
101 | 1280 | 1
106 | 1300 | 1
103 | 1400 | 2
104 | 1401 | 2
105 | 1430 | 2
102 | 1600 | 3
You can use a query like the following:
SELECT EmpId, Wages,
DENSE_RANK() OVER (ORDER BY CAST(Wages - t.min_wage AS INT) / 100) AS grp
FROM mytable
CROSS JOIN (SELECT MIN(Wages) AS min_wage FROM mytable) AS t
The query calculates the distance of each wage from the minimum wage and then uses integer division by 100 in order to place records in slices. So all records that have a deviation that is between 0 - 99 off the minimum wage are placed in the first slice. The second slice contains records off by 100 - 199 from the minimum wage, etc.
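To see what the slicing does on the sample data, here is the same arithmetic as a small Python sketch. The names SLICE_WIDTH, slices and groups are made up for illustration, and the width of 100 is simply the value used in the query above.

```python
# Sample rows from the question: EmpId -> Wages
wages = {101: 1280, 102: 1600, 103: 1400, 104: 1401, 105: 1430, 106: 1300}
SLICE_WIDTH = 100  # same divisor as in the query

# Distance from the minimum wage, integer-divided into slices
min_wage = min(wages.values())
slices = {emp: (wage - min_wage) // SLICE_WIDTH for emp, wage in wages.items()}

# DENSE_RANK: number the distinct slices 1, 2, 3, ... without gaps
rank_of_slice = {s: i for i, s in enumerate(sorted(set(slices.values())), start=1)}
groups = {emp: rank_of_slice[s] for emp, s in slices.items()}

for emp in sorted(wages, key=wages.get):
    print(emp, wages[emp], groups[emp])
```

On this data the groups match the expected output in the question. Note that the slice boundaries are fixed relative to the minimum wage, so two wages that differ by 1 (for example 1379 and 1380 here) can still fall into different groups.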
For a deviation of ±30, you can use the following:
DECLARE @Tbl TABLE (EmpId INT, Wages INT)
INSERT INTO @Tbl
VALUES
(99, 99),
(100, 101),
(101, 1280),
(102, 1600),
(103, 1400),
(104, 1401),
(105, 1430),
(106, 1300)
;WITH CTE AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY Wages) AS RowId FROM @Tbl )
SELECT
A.EmpId ,
A.Wages ,
DENSE_RANK() OVER (ORDER BY MIN(B.RowId)) [Group]
FROM
CTE A CROSS JOIN CTE B
WHERE
ABS(B.Wages - A.Wages) BETWEEN 0 AND 30 -- Here +-30
GROUP BY A.EmpId, A.Wages
ORDER BY A.Wages
Result:
EmpId Wages Group
----------- ----------- --------------------
99 99 1
100 101 1
101 1280 2
106 1300 2
103 1400 3
104 1401 3
105 1430 3
102 1600 4
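The logic of this query can be traced step by step in plain Python. This is only a sketch for checking the result above. TOLERANCE, row_id and anchor are illustrative names, and it assumes all wages are distinct so that the row numbering is unambiguous.

```python
# Sample rows from the answer: EmpId -> Wages
wages = {99: 99, 100: 101, 101: 1280, 102: 1600,
         103: 1400, 104: 1401, 105: 1430, 106: 1300}
TOLERANCE = 30

# ROW_NUMBER() OVER (ORDER BY Wages): position of each employee in wage order
ordered = sorted(wages.items(), key=lambda item: item[1])
row_id = {emp: i for i, (emp, _) in enumerate(ordered, start=1)}

# MIN(B.RowId) over all rows B whose wage is within the tolerance of A's wage
anchor = {
    emp: min(row_id[other] for other, w in wages.items()
             if abs(w - wage) <= TOLERANCE)
    for emp, wage in wages.items()
}

# DENSE_RANK() OVER (ORDER BY MIN(B.RowId))
rank_of_anchor = {a: i for i, a in enumerate(sorted(set(anchor.values())), start=1)}
groups = {emp: rank_of_anchor[a] for emp, a in anchor.items()}

for emp, wage in ordered:
    print(emp, wage, groups[emp])
```

Each row is attached to the lowest-wage row within the tolerance, so groups are not chained: with wages 100, 125 and 150, the row with 150 would get its own group even though it is within 30 of 125.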

Need select 2 rows from Table2, which is joined with Table1. See description

For example, I have a Table1:
ID Specified TIN Value DateCreated
----------------------------------
1 0 tin1 45 2014-12-30
2 1 tin2 34 2013-01-05
3 0 tin3 23 2015-02-20
4 3 tin4 47 2013-06-04
5 3 tin5 12 2012-04-02
And a Table2:
ID Table1ID RegistrationDate
----------------------------------
1 1 2015-10-12
2 2 2015-07-21
3 1 2015-11-26
4 1 2015-12-04
5 2 2015-09-18
I need to select all columns from Table1 together with the first and last RegistrationDate from Table2. The result should be:
ID Specified TIN Value DateCreated FirstRegDate LastRegDate
---------------------------------------------------------------
1 0 tin1 45 2014-12-30 2015-10-12 2015-12-04
2 1 tin2 34 2013-01-05 2015-07-21 2015-09-18
3 0 tin3 23 2015-02-20 NULL NULL
4 3 tin4 47 2013-06-04 NULL NULL
5 3 tin5 12 2012-04-02 NULL NULL
One possible solution is something like the pseudo-query below (if you can prepare the tables, I will modify it to reflect the actual query):
SELECT table1.*, inlineTable2.firstRegDate, inlineTable2.lastRegDate
FROM Table1
LEFT JOIN
(
SELECT
Table1ID AS id,
MIN(registrationDate) as firstRegDate,
MAX(registrationDate) as lastRegDate
FROM table2
GROUP BY table1ID
) AS inlineTable2
ON table1.id = inlineTable2.id
You can group by all columns in table1, and look up the minimum and maximum registration date for the group:
select t1.ID
, t1.Specified
, ... other columns from table1 ...
, min(t2.RegistrationDate) as FirstRegDate
, max(t2.RegistrationDate) as LastRegDate
from Table1 t1
left join
Table2 t2
on t1.ID = t2.Table1ID
group by
t1.ID
, t1.Specified
, ... other columns from table1 ...
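As a quick check of the LEFT JOIN plus MIN/MAX approach, the sketch below loads the sample rows and runs the grouped query with every Table1 column written out. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable), stores the dates as ISO text, and qualifies every column with its table alias because both tables have an ID column.

```python
import sqlite3

table1 = [
    (1, 0, "tin1", 45, "2014-12-30"),
    (2, 1, "tin2", 34, "2013-01-05"),
    (3, 0, "tin3", 23, "2015-02-20"),
    (4, 3, "tin4", 47, "2013-06-04"),
    (5, 3, "tin5", 12, "2012-04-02"),
]
table2 = [
    (1, 1, "2015-10-12"),
    (2, 2, "2015-07-21"),
    (3, 1, "2015-11-26"),
    (4, 1, "2015-12-04"),
    (5, 2, "2015-09-18"),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute(
    "CREATE TABLE Table1 (ID INTEGER, Specified INTEGER, TIN TEXT, Value INTEGER, DateCreated TEXT)"
)
conn.execute("CREATE TABLE Table2 (ID INTEGER, Table1ID INTEGER, RegistrationDate TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?)", table1)
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", table2)

# LEFT JOIN keeps Table1 rows without registrations; MIN/MAX then return NULL for them.
query = """
SELECT t1.ID, t1.Specified, t1.TIN, t1.Value, t1.DateCreated,
       MIN(t2.RegistrationDate) AS FirstRegDate,
       MAX(t2.RegistrationDate) AS LastRegDate
FROM Table1 AS t1
LEFT JOIN Table2 AS t2 ON t2.Table1ID = t1.ID
GROUP BY t1.ID, t1.Specified, t1.TIN, t1.Value, t1.DateCreated
ORDER BY t1.ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Rows 3 to 5 come back with None (NULL) in both date columns, which is the behaviour the question asks for.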

Query is very slow

I have tables
table1
epid etid id EValue reqdate
----------- ----------- ----------- ------------ ----------
15 1 1 498925307069 2012-01-01
185 1 2 A5973FC43CE3 2012-04-04
186 1 2 44C6A4B776A2 2012-04-05
205 1 2 7A0ED3F1DA13 2012-09-19
206 1 2 77771D65F9C4 2012-09-19
207 1 2 AD74A4AA41BD 2012-09-19
208 1 2 9595ABE5A0C8 2012-09-19
209 1 2 7611D2FB395B 2012-09-19
210 1 2 04A510D6067A 2012-09-19
211 1 2 24D43EC268F8 2012-09-19
table2
PEId Id EPId
----------- ----------- -----------
43 9 15
44 10 15
45 11 15
46 12 15
47 13 15
48 14 15
49 15 15
50 16 15
51 17 15
52 18 15
table3
PLId PEId Id ToPayId
----------- ----------- ----------- -----------
71 43 9 1
72 43 9 2
73 44 10 1
74 44 10 2
75 45 11 1
76 45 11 2
77 46 12 1
78 46 12 2
79 47 13 1
80 47 13 2
I want to get one id whose count in table3 is less than 8, ordered by PEId in table2.
I have written this query:
SELECT Top 1 ToPayId FROM
(
SELECT Count(pl.ToPayId) C, pl.ToPayId
FROM table3 pl
INNER JOIN table2 pe ON pl.peid = pe.peid
INNER JOIN table1 e ON pe.epid = e.epid
WHERE e.EtId=1 GROUP BY pl.ToPayId
) As T
INNER JOIN table2 p ON T.ToPayId= p.Id
WHERE C < 8 ORDER BY p.PEId ASC
This query runs more than 1000 times inside a stored procedure, in a WHILE loop driven by the entries in a user-defined table type.
It is very slow because each table has millions of rows.
Can anyone suggest a better query?
Maybe try a HAVING clause to get rid of the derived table (the SELECT in the FROM clause):
select table2.id as due
from table3 inner join table2 on table2.PEId=table3.PEId...
group by ...
having count(table2.id) < 8
order by ...
You also have a redundant Id column in table3: the pair (PEId, Id) appears to be unique, so the column seems unnecessary. Removing it would shrink table3 by about 25% and should improve database performance.
Well, since you did not provide enough sample data and I am not sure what exactly your business logic is, I can only modify the code blind.
SELECT ToPayId
FROM (
SELECT TOP 1 Count(pl.ToPayId) C, pl.ToPayId, pe.PEId
FROM table3 as pl
INNER JOIN table2 as pe ON pl.peid = pe.peid AND pl.ToPayId = pe.Id
INNER JOIN table1 e ON pe.epid = e.epid
WHERE e.EtId=1
GROUP BY pl.ToPayId, pe.PEId
HAVING Count(pl.ToPayId) < 8
ORDER BY pe.PEId ASC
) AS T

Fetch Only Last Entry by user daily

I am working on a small reporting application. I have two tables
Agent Table Data
AgentID AgentName
------- ---------
1001 ABC
1002 XYZ
1003 POI
1004 JKL
Report Table Data
ReportID AgentId Labor Mandays Amount SubmitDate
-------- ------- ----- ------- ------ ----------
1 1001 30 10 5000 11/12/2011
2 1001 44 18 8000 11/14/2011
3 1002 33 75 3022 11/12/2011
4 1001 10 10 1500 11/14/2011
5 1002 10 10 1800 11/14/2011
6 1001 10 10 1400 11/14/2011
7 1003 40 40 1500 11/14/2011
8 1003 40 40 1800 11/14/2011
I want to generate a report which gives us output like
ReportID AgentId Labor Mandays Amount SubmitDate
-------- ------- ----- ------- ------ ----------
1 1001 30 10 5000 11/12/2011
3 1002 33 75 3022 11/12/2011
6 1001 10 10 1400 11/14/2011
5 1002 10 10 1800 11/14/2011
8 1003 40 40 1800 11/14/2011
Thanks in Advance
You didn't mention what VERSION of SQL Server you're using - if you're on 2005 or newer, you can use a CTE (Common Table Expression) with the ROW_NUMBER function:
;WITH LastPerAgent AS
(
SELECT
AgentID, ReportID, Labor, Mandays, Amount, SubmitDate,
ROW_NUMBER() OVER(PARTITION BY AgentID,SubmitDate
ORDER BY ReportID DESC) AS 'RowNum'
FROM dbo.Report
)
SELECT
AgentID, ReportID, Labor, Mandays, Amount, SubmitDate
FROM LastPerAgent
WHERE RowNum = 1
This CTE "partitions" your data by AgentID and SubmitDate, and for each partition, the ROW_NUMBER function hands out sequential numbers, starting at 1 and ordered by ReportID DESC - so the "last" row (with the highest ReportID) for each (AgentID, SubmitDate) pair gets RowNum = 1 which is what I select from the CTE in the SELECT statement after it.
PS: this doesn't work 100% on your input data, since you've not defined how to group and how to eliminate rows.... you might need to adapt this query a bit, based on your requirements...
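For versions without ROW_NUMBER, or simply to cross-check the result, the same "last report per agent per day" rule can be written with GROUP BY and MAX(ReportID). This is a different technique from the CTE above, not a rewrite of it. The sketch runs it on the sample data through Python's sqlite3 module as a stand-in for SQL Server; it assumes ReportID is unique and increases over time, and the dates are rewritten in ISO form so that text ordering matches date ordering. The names latest and LastReportID are illustrative.

```python
import sqlite3

# Sample rows: (ReportID, AgentId, Labor, Mandays, Amount, SubmitDate)
reports = [
    (1, 1001, 30, 10, 5000, "2011-11-12"),
    (2, 1001, 44, 18, 8000, "2011-11-14"),
    (3, 1002, 33, 75, 3022, "2011-11-12"),
    (4, 1001, 10, 10, 1500, "2011-11-14"),
    (5, 1002, 10, 10, 1800, "2011-11-14"),
    (6, 1001, 10, 10, 1400, "2011-11-14"),
    (7, 1003, 40, 40, 1500, "2011-11-14"),
    (8, 1003, 40, 40, 1800, "2011-11-14"),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute(
    "CREATE TABLE Report (ReportID INTEGER, AgentId INTEGER, Labor INTEGER, "
    "Mandays INTEGER, Amount INTEGER, SubmitDate TEXT)"
)
conn.executemany("INSERT INTO Report VALUES (?, ?, ?, ?, ?, ?)", reports)

# The derived table finds the highest ReportID per (AgentId, SubmitDate);
# joining back on that ID returns the full row for each of those reports.
query = """
SELECT r.ReportID, r.AgentId, r.Labor, r.Mandays, r.Amount, r.SubmitDate
FROM Report AS r
INNER JOIN (
    SELECT AgentId, SubmitDate, MAX(ReportID) AS LastReportID
    FROM Report
    GROUP BY AgentId, SubmitDate
) AS latest ON latest.LastReportID = r.ReportID
ORDER BY r.SubmitDate, r.AgentId
"""
last_reports = conn.execute(query).fetchall()
for row in last_reports:
    print(row)
```

On the sample data this returns ReportIDs 1, 3, 6, 5 and 8, the same five rows as the desired output in the question.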

SQL Server : Update a column

I have a TableA
ID MatCh01 Match02 Status
1 1001 12
2 1001 12
3 1001 12
4 1002 44
5 1002 47
6 1003 22
7 1003 22
8 1004 55
9 1004 57
I want to set the Status column to "FAIL" when rows with the same Match01 have different Match02 values. Expected TableA:
ID MatCh01 Match02 Status
1 1001 12 NULL
2 1001 12 NULL
3 1001 12 NULL
4 1002 44 FAIL
5 1002 47 FAIL
6 1003 22 NULL
7 1003 22 NULL
8 1004 55 FAIL
9 1004 57 FAIL
Please NOTE: FAIL all 'match01' if its corresponding 'match02' is different.
Thanks
Basically this says Update all Values in TableA when the MAX and MIN of Column Match02 are not equal (meaning match01 has multiple rows with different values for match 02).
UPDATE A
SET Status = 'FAIL'
FROM TableA A
INNER JOIN (SELECT
a2.Match01
FROM TableA A2
GROUP BY a2.Match01
HAVING MAX(Match02) <> MIN(Match02)) B ON
A.Match01 = B.Match01
When there's more than one distinct value of match02 for any match01, update those rows with the same match01.
UPDATE t1
SET Status = 'FAIL'
FROM TableA t1
WHERE t1.Match01 in
(
SELECT t2.Match01
FROM TableA t2
GROUP BY t2.Match01
HAVING COUNT(DISTINCT t2.Match02) > 1
)
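Both answers can be checked against the sample data. The sketch below uses the portable WHERE ... IN form with the MIN/MAX test from the first answer, run through Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable; the UPDATE ... FROM syntax above is T-SQL specific and is not used here).

```python
import sqlite3

# Sample rows: (ID, Match01, Match02); Status starts out as NULL
rows = [
    (1, 1001, 12), (2, 1001, 12), (3, 1001, 12),
    (4, 1002, 44), (5, 1002, 47),
    (6, 1003, 22), (7, 1003, 22),
    (8, 1004, 55), (9, 1004, 57),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("CREATE TABLE TableA (ID INTEGER, Match01 INTEGER, Match02 INTEGER, Status TEXT)")
conn.executemany("INSERT INTO TableA (ID, Match01, Match02) VALUES (?, ?, ?)", rows)

# A Match01 group has more than one distinct Match02 exactly when MIN <> MAX.
conn.execute("""
UPDATE TableA
SET Status = 'FAIL'
WHERE Match01 IN (
    SELECT Match01
    FROM TableA
    GROUP BY Match01
    HAVING MIN(Match02) <> MAX(Match02)
)
""")

statuses = [status for (status,) in conn.execute("SELECT Status FROM TableA ORDER BY ID")]
print(statuses)
```

MIN, MAX and COUNT(DISTINCT ...) all ignore NULLs, so with either answer a group that has one real Match02 value plus some NULLs is not marked as FAIL.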
