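How can I apply a moving average in a temporal database?
My data includes temperature readings, and I want to apply a moving average over every 15 records.
You can use a window-function query like the one below, which averages each row with its neighbours (1 preceding, 1 following):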
marc=# SELECT entity, name, salary, start_date,
avg(salary) OVER (ORDER BY entity, start_date
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM salary;
entity | name | salary | start_date | avg
-----------+-----------+---------+---------------+----------------------
Accounting | millicent | 850.00 | 2006-01-01 | 825.0000000000000000
Accounting | jack | 800.00 | 2010-05-01 | 916.6666666666666667
R&D | tom | 1100.00 | 2005-01-01 | 966.6666666666666667
R&D | john | 1000.00 | 2008-07-01 | 933.3333333333333333
R&D | maria | 700.00 | 2009-01-01 | 733.3333333333333333
R&D | kevin | 500.00 | 2009-05-01 | 633.3333333333333333
R&D | marc | 700.00 | 2010-02-15 | 600.0000000000000000
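For the 15-record case in the question, the same idea can be sketched with a wider frame. This is only an illustration: the table readings and its columns recorded_at and temperature are made-up names, the data is synthetic, and SQLite (3.25 or later, driven from Python) stands in for the real database.

```python
import sqlite3

# Hypothetical table and column names; SQLite is a stand-in for the real engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (recorded_at INTEGER, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 21)],  # 20 synthetic readings: 1.0 .. 20.0
)

# Trailing frame: the current row plus the 14 rows before it (15 records).
rows = conn.execute(
    """
    SELECT recorded_at,
           temperature,
           AVG(temperature) OVER (
               ORDER BY recorded_at
               ROWS BETWEEN 14 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM readings
    ORDER BY recorded_at
    """
).fetchall()

for recorded_at, temperature, moving_avg in rows:
    print(recorded_at, temperature, moving_avg)
```

The first 14 rows are averaged over fewer than 15 records because the frame is cut off at the start of the data. For a centred window, ROWS BETWEEN 7 PRECEDING AND 7 FOLLOWING would be the analogous frame.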
WITH moving_avg AS (
SELECT 0 AS [lag] UNION ALL
SELECT 1 AS [lag] UNION ALL
SELECT 2 AS [lag] UNION ALL
SELECT 3 AS [lag] --ETC
)
SELECT
DATEADD(day,[lag],[date]) AS [reference_date],
[otherkey1],[otherkey2],[otherkey3],
AVG([value1]) AS [avg_value1],
AVG([value2]) AS [avg_value2]
FROM [data_table]
CROSS JOIN moving_avg
GROUP BY [otherkey1],[otherkey2],[otherkey3],DATEADD(day,[lag],[date])
ORDER BY [otherkey1],[otherkey2],[otherkey3],[reference_date];
I have two SQL Server tables as below:
Event
+------+-------------+----------+----------+----------+-----------------------------+
| Id   | EventTypeId | PersonId | UCNumber | Name     | DateEvent                   |
+------+-------------+----------+----------+----------+-----------------------------+
| 2307 | 3           | 2189     | 004947   | Migrated | 1900-01-01 00:00:00.6780000 |
| 2308 | 15          | 2189     | 004947   | Birthday | 2020-09-18 16:48:32.6870000 |
| 3400 | 15          | 2190     | 006857   | Birthday | 1900-01-01 00:00:00.0000000 |
| 3401 | 2           | 2190     | 006857   | Migrated | 2016-03-12 00:00:00.0000000 |
+------+-------------+----------+----------+----------+-----------------------------+
Person
+------+----------+-------+----------+-----------------------------+
| Id   | UCNumber | Name  | LastName | AnotherDate                 |
+------+----------+-------+----------+-----------------------------+
| 2189 | 004947   | John  | Smith    | 1900-01-01 00:00:00.0000000 |
| 2190 | 006857   | Alice | Timo     | 2020-02-20 00:00:00.0000000 |
+------+----------+-------+----------+-----------------------------+
I need to retrieve the top row (the latest in time) based on the Event's Id (the higher the Id, the more recent the Event), and its EventTypeId should be 15.
I tried this:
Select P.Id, P.UCNUMBER, P.AnotherDate from
db.dbo.Person P
Inner join db.dbo.Event L on L.PersonId = P.Id
where P.Id in (
SELECT TOP (1) PersonId
FROM
db.dbo.Event
where PersonId = P.Id --and EventTypeID = 15
ORDER BY
Id DESC)
and EventTypeId = 15
but it does not work properly. I have posted only samples from the two tables here. In general, the query also returns other events that are not the latest ones (that is, not the ones with the highest Id). Something is missing in it.
In this case, for instance, it should return only 1 row:
2189 004947 1900-01-01 00:00:00.0000000
Sounds like you just want ORDER BY and TOP 1.
SELECT TOP 1
p.id,
p.ucnumber,
p.anotherdate
FROM event e
LEFT JOIN person p
ON p.id = e.personid
WHERE e.eventtypeid = 15
ORDER BY e.dateevent DESC;
If several events can share the same latest time and you want all of them, replace TOP 1 with TOP 1 WITH TIES.
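Another way to read the question is "return each person whose most recent event (highest Id) has EventTypeId 15". Under that assumption, a ROW_NUMBER per person does the job. The sketch below runs the idea against the sample rows, using SQLite from Python as a stand-in for SQL Server; the CTE name ranked and the alias rn are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Person (Id INTEGER, UCNumber TEXT, Name TEXT,
                         LastName TEXT, AnotherDate TEXT);
    CREATE TABLE Event (Id INTEGER, EventTypeId INTEGER, PersonId INTEGER,
                        UCNumber TEXT, Name TEXT, DateEvent TEXT);
    INSERT INTO Person VALUES
        (2189, '004947', 'John',  'Smith', '1900-01-01 00:00:00.0000000'),
        (2190, '006857', 'Alice', 'Timo',  '2020-02-20 00:00:00.0000000');
    INSERT INTO Event VALUES
        (2307, 3,  2189, '004947', 'Migrated', '1900-01-01 00:00:00.6780000'),
        (2308, 15, 2189, '004947', 'Birthday', '2020-09-18 16:48:32.6870000'),
        (3400, 15, 2190, '006857', 'Birthday', '1900-01-01 00:00:00.0000000'),
        (3401, 2,  2190, '006857', 'Migrated', '2016-03-12 00:00:00.0000000');
    """
)

# Rank each person's events from newest (highest Id) to oldest,
# then keep a person only if the newest event is of type 15.
latest_is_15 = conn.execute(
    """
    WITH ranked AS (
        SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY PersonId ORDER BY Id DESC) AS rn
        FROM Event e
    )
    SELECT p.Id, p.UCNumber, p.AnotherDate
    FROM Person p
    JOIN ranked r ON r.PersonId = p.Id
    WHERE r.rn = 1 AND r.EventTypeId = 15
    ORDER BY p.Id
    """
).fetchall()

print(latest_is_15)
```

On the sample data this keeps person 2189 only, because the latest event of person 2190 (Id 3401) has EventTypeId 2.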
Let's say I have a historical table keeping who has modified data
-------------------------------------------------------------
| ID | Last_Modif | User_Modif | Col3, Col4...
-------------------------------------------------------------
| 1 | 2018-04-09 12:12:00 | John
| 2 | 2018-04-09 11:10:00 | Jim
| 3 | 2018-04-09 11:05:00 | Mary
| 4 | 2018-04-09 11:00:00 | John
| 5 | 2018-04-09 10:56:00 | David
| 6 | 2018-04-09 10:53:00 | John
| 7 | 2018-04-08 19:50:00 | Eric
| 8 | 2018-04-08 18:50:00 | Chris
| 9 | 2018-04-08 15:50:00 | John
| 10 | 2018-04-08 12:50:00 | Chris
----------------------------------------------------------
I would like to find the modifications made by John together with the version immediately before each of them, to check what he changed. For example, in this scenario I would like to return rows 1, 2, 4, 5, 6, 7, 9 and 10.
I am thinking of ranking by Last_Modif first and then doing a join to pick up the next row, but somehow the result is not correct. This does not seem to be a LAG/LEAD case, since I am not picking a single value from the next row but the whole next row. Any ideas?
-- sample 1000 rows with RowNumber
with TopRows as
(select top 1000 *, ROW_NUMBER() OVER(ORDER BY Last_modif desc) RowNum from [Table])
--Reference rows : Rows modif by John
, ModifByJohn as
(Select * from TopRows where USER_MODIF = 'John')
select * from ModifByJohn
UNION
select ModifByNext.* from ModifByJohn join TopRows ModifbyNext on ModifByJohn.RowNum + 1 = ModifByNext.RowNum
order by RowNum
What would the code look like if we wanted to return the last 2 modifications before each of John's instead of 1?
Maybe you can take advantage of your current ID:
with x as
(
select t1.*,
(select top 1 id from tbl where id > t1.id order by id) prev_id -- order by makes top 1 deterministic
from tbl t1
where t1.User_Modif = 'John'
)
select * from x;
GO
ID | Last_Modif | User_Modif | prev_id
-: | :------------------ | :--------- | ------:
1 | 09/04/2018 12:12:00 | John | 2
4 | 09/04/2018 11:00:00 | John | 5
6 | 09/04/2018 10:53:00 | John | 7
9 | 08/04/2018 15:50:00 | John | 10
with x as
(
select t1.*,
(select top 1 id from tbl where id > t1.id order by id) prev_id -- order by makes top 1 deterministic
from tbl t1
where t1.User_Modif = 'John'
)
select ID, Last_Modif, User_Modif from x
union all
select ID, Last_Modif, User_Modif
from tbl
where ID in (select prev_id from x)
order by ID
GO
ID | Last_Modif | User_Modif
-: | :------------------ | :---------
1 | 09/04/2018 12:12:00 | John
2 | 09/04/2018 11:10:00 | Jim
4 | 09/04/2018 11:00:00 | John
5 | 09/04/2018 10:56:00 | David
6 | 09/04/2018 10:53:00 | John
7 | 08/04/2018 19:50:00 | Eric
9 | 08/04/2018 15:50:00 | John
10 | 08/04/2018 12:50:00 | Chris
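For the follow-up about returning the last 2 modifications instead of 1, one option is to rank the rows by Last_Modif and join each of John's rows to a range of ranks. This is a sketch under assumptions rather than a tested SQL Server solution: it runs on SQLite from Python, the helper john_and_previous and the aliases j and prev are made up for the example, and the table is called tbl as in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER, Last_Modif TEXT, User_Modif TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "2018-04-09 12:12:00", "John"),
        (2, "2018-04-09 11:10:00", "Jim"),
        (3, "2018-04-09 11:05:00", "Mary"),
        (4, "2018-04-09 11:00:00", "John"),
        (5, "2018-04-09 10:56:00", "David"),
        (6, "2018-04-09 10:53:00", "John"),
        (7, "2018-04-08 19:50:00", "Eric"),
        (8, "2018-04-08 18:50:00", "Chris"),
        (9, "2018-04-08 15:50:00", "John"),
        (10, "2018-04-08 12:50:00", "Chris"),
    ],
)


def john_and_previous(conn, n):
    """Return IDs of John's rows plus the n rows preceding each one in time."""
    sql = """
    WITH ranked AS (
        SELECT ID, Last_Modif, User_Modif,
               ROW_NUMBER() OVER (ORDER BY Last_Modif DESC) AS rn
        FROM tbl
    )
    SELECT DISTINCT prev.ID
    FROM ranked AS j
    JOIN ranked AS prev
      ON prev.rn BETWEEN j.rn AND j.rn + ?  -- John's own row plus n older rows
    WHERE j.User_Modif = 'John'
    ORDER BY prev.ID
    """
    return [row[0] for row in conn.execute(sql, (n,))]


one_back = john_and_previous(conn, 1)
two_back = john_and_previous(conn, 2)
print(one_back)
print(two_back)
```

With n = 1 this reproduces the rows listed in the question. With n = 2 every row of the sample is returned, since each of the remaining rows falls within two steps of one of John's modifications.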
From my table, I want to select for each project ID the ID with the latest deploymentDate and if there are two identical latest deployment dates for the same project ID, select the ID with the latest submittedOn datetime. So if my table looks like this:
id | projectId | deploymentDate | submittedOn |
1 | 1 | 2017-01-02 | 2017-01-02 13:00:00 |
2 | 1 | 2017-01-04 | 2017-01-04 11:00:00 |
3 | 2 | 2017-01-06 | 2017-01-06 17:00:00 |
4 | 2 | 2017-01-06 | 2017-01-01 12:00:00 |
5 | 3 | 2017-01-02 | 2017-01-02 13:30:00 |
6 | 3 | 2017-01-02 | 2017-01-05 15:00:00 |
7 | 3 | 2017-01-02 | 2017-01-04 10:00:00 |
The desired rows are:
id | projectId | deploymentDate | submittedOn |
2 | 1 | 2017-01-04 | 2017-01-04 11:00:00 |
3 | 2 | 2017-01-06 | 2017-01-06 17:00:00 |
6 | 3 | 2017-01-02 | 2017-01-05 15:00:00 |
You can try the below. Adjust the sorting in the row_number as needed.
select
a.id,
a.projectid,
a.deploymentdate,
a.submittedOn
from project a
inner join
(select
id,
row_number() over (partition by projectid order by deploymentdate desc, submittedOn desc, id) as rid
from project
) as b
on b.id = a.id
and b.rid = 1
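To check the ROW_NUMBER approach against the sample rows from the question, here is a small runnable sketch. It uses SQLite from Python as a stand-in engine and filters on the rank directly instead of joining back to the table; the derived-table alias ranked is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project (id INTEGER, projectId INTEGER, "
    "deploymentDate TEXT, submittedOn TEXT)"
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-01-02", "2017-01-02 13:00:00"),
        (2, 1, "2017-01-04", "2017-01-04 11:00:00"),
        (3, 2, "2017-01-06", "2017-01-06 17:00:00"),
        (4, 2, "2017-01-06", "2017-01-01 12:00:00"),
        (5, 3, "2017-01-02", "2017-01-02 13:30:00"),
        (6, 3, "2017-01-02", "2017-01-05 15:00:00"),
        (7, 3, "2017-01-02", "2017-01-04 10:00:00"),
    ],
)

# Rank rows inside each project: latest deploymentDate first,
# ties broken by latest submittedOn, then by id.
latest = conn.execute(
    """
    SELECT id, projectId, deploymentDate, submittedOn
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY projectId
                   ORDER BY deploymentDate DESC, submittedOn DESC, id
               ) AS rid
        FROM project p
    ) AS ranked
    WHERE rid = 1
    ORDER BY projectId
    """
).fetchall()

latest_ids = [row[0] for row in latest]
print(latest_ids)
```

The selected ids match the desired rows in the question (2, 3 and 6).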
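This would work for the sample data (note that the two maxima are computed independently, so a project whose latest submittedOn belongs to a row with an earlier deploymentDate may return no row):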
select t.id, latest.*
from tab t join (
select projectid, max(deploymentdate) deploymentdate, max(submittedon) submittedon
from tab
group by projectid
) latest on t.projectid = latest.projectid and t.deploymentdate = latest.deploymentdate and t.submittedon = latest.submittedon
I found the latest based on the project id and then, joined with the source table to find the corresponding id.
I have an issues table where users can log worked hours and estimated hours. It looks like this:
id | assignee | task | timespent | original_estimate | date
--------------------------------------------------------------------------
1 | john | design | 2 | 3 | 2013-01-01
2 | john | mockup | 2 | 3 | 2013-01-02
3 | john | design | 2 | 3 | 2013-01-01
4 | rick | mockup | 5 | 4 | 2013-01-04
And I need to sum and group the worked and estimated hours by task and date to get this
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | mockup | 5 | 4 | 2013-01-04
Ok, this is easy, I've already got this:
SELECT assignee, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date FROM issues GROUP BY assignee, task, date
My problem is that I also need to show the assignees who did not log hours on any task that day, I mean:
assignee | task | total_spent | total_estimate | date
------------------------------------------------------------------
john | design | 4 | 6 | 2013-01-01
john | mockup | 2 | 3 | 2013-01-02
rick | mockup | 5 | 4 | 2013-01-04
pete | design | 0 | 0 | 2013-01-01
pete | mockup | 0 | 0 | 2013-01-02
liz | design | 0 | 0 | 2013-01-04
liz | mockup | 0 | 0 | 2013-01-04
The goal is to draw a chart like this http://jsfiddle.net/uUjst/embedded/result/
You need the Assignees in their own separate table to join from.
SELECT tblAssignee.Name, task, SUM(timespent) as total_spent, SUM(original_estimate) AS total_estimate, date
FROM tblAssignee
LEFT JOIN issues ON issues.assignee = tblAssignee.Name
GROUP BY tblAssignee.Name, task, date
Assuming that you have a user table, but not a tasks or dates table... meaning that we have to derive these values from the values present in issues:
;WITH dates AS (
SELECT DISTINCT date
FROM issues
), tasks AS (
SELECT DISTINCT task
FROM issues
)
SELECT
u.[user] as assignee, -- user is a reserved word in SQL Server, so it is bracketed
t.task,
SUM(i.timespent) as total_spent,
SUM(i.original_estimate) AS total_estimate,
d.date
FROM
users u CROSS JOIN
dates d CROSS JOIN
tasks t LEFT OUTER JOIN
issues i ON
i.assignee = u.[user]
AND i.task = t.task
AND i.date = d.date
GROUP BY u.[user], t.task, d.date
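The cross join idea can be tried out end to end with the sample data. The sketch below is an assumption-laden illustration: the users table and its name column are hypothetical (the question only shows issues), SQLite from Python stands in for the real engine, and COALESCE turns the NULL sums of missing combinations into 0, as the desired output suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (name TEXT);  -- hypothetical users table
    INSERT INTO users VALUES ('john'), ('rick'), ('pete'), ('liz');

    CREATE TABLE issues (id INTEGER, assignee TEXT, task TEXT,
                         timespent INTEGER, original_estimate INTEGER, date TEXT);
    INSERT INTO issues VALUES
        (1, 'john', 'design', 2, 3, '2013-01-01'),
        (2, 'john', 'mockup', 2, 3, '2013-01-02'),
        (3, 'john', 'design', 2, 3, '2013-01-01'),
        (4, 'rick', 'mockup', 5, 4, '2013-01-04');
    """
)

# Build every user/date/task combination, then attach logged hours where present.
rows = conn.execute(
    """
    SELECT u.name, t.task, d.date,
           COALESCE(SUM(i.timespent), 0)         AS total_spent,
           COALESCE(SUM(i.original_estimate), 0) AS total_estimate
    FROM users u
    CROSS JOIN (SELECT DISTINCT date FROM issues) d
    CROSS JOIN (SELECT DISTINCT task FROM issues) t
    LEFT JOIN issues i
           ON i.assignee = u.name AND i.task = t.task AND i.date = d.date
    GROUP BY u.name, t.task, d.date
    """
).fetchall()

# Index by (assignee, task, date) for easy lookup.
totals = {(name, task, date): (spent, est) for name, task, date, spent, est in rows}
print(totals[("john", "design", "2013-01-01")])
print(totals[("pete", "mockup", "2013-01-02")])
```

Note that this returns one row for every user, task and date combination (4 x 2 x 3 = 24 rows here), which is more than the sample output in the question lists, so some filtering may still be wanted for the chart.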
SELECT
A.name,
task,
ISNULL(SUM(timespent), 0) as total_spent,
ISNULL(SUM(original_estimate), 0) AS total_estimate,
date
FROM Assignee A
LEFT JOIN issues
ON issues.assignee = A.Name
GROUP BY A.name, task, date
I have a dataset (DATASET1) that lists all employees with their Dept IDs, the date they started and the date they were terminated.
I'd like my query to return a dataset in which every row represents one day on which an employee was employed, along with the number of days worked so far (from the start date to that date).
How do I write this query? Thanks in advance for your help.
DATASET1
DeptID EmployeeID StartDate EndDate
--------------------------------------------
001 123 20100101 20120101
001 124 20100505 20130101
DATASET2
DeptID EmployeeID Date #ofDaysWorked
--------------------------------------------
001 123 20100101 1
001 123 20100102 2
001 123 20100103 3
001 123 20100104 4
.... .... ........ ...
EDIT: My goal is to build a fact table which would be used to derive measures in SSAS. The measure I am building is 'average length of employment'. The measure will be deployed in a dashboard, and the users will be able to select a calendar period and drill down into months, weeks and days. That's why I need to start with such a large dataset. Maybe I can accomplish this goal by using MDX queries, but how?
You can use a recursive CTE to perform this:
;with data (deptid, employeeid, inc_date, enddate) as
(
select deptid, employeeid, startdate, enddate
from yourtable
union all
select deptid, employeeid,
dateadd(d, 1, inc_date),
enddate
from data
where dateadd(d, 1, inc_date) <= enddate
)
select deptid,
employeeid,
inc_date,
rn NoOfDaysWorked
from
(
select deptid, employeeid,
inc_date,
row_number() over(partition by deptid, employeeid
order by inc_date) rn
from data
) src
OPTION(MAXRECURSION 0)
The result is similar to this:
| DEPTID | EMPLOYEEID | DATE | NOOFDAYSWORKED |
-----------------------------------------------------
| 1 | 123 | 2010-01-01 | 1 |
| 1 | 123 | 2010-01-02 | 2 |
| 1 | 123 | 2010-01-03 | 3 |
| 1 | 123 | 2010-01-04 | 4 |
| 1 | 123 | 2010-01-05 | 5 |
| 1 | 123 | 2010-01-06 | 6 |
| 1 | 123 | 2010-01-07 | 7 |
| 1 | 123 | 2010-01-08 | 8 |
| 1 | 123 | 2010-01-09 | 9 |
| 1 | 123 | 2010-01-10 | 10 |
| 1 | 123 | 2010-01-11 | 11 |
| 1 | 123 | 2010-01-12 | 12 |
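If only the running day count is needed, it can also be derived from the date difference instead of ROW_NUMBER. The following is a minimal sketch, assuming ISO-formatted dates and using SQLite from Python as a stand-in: date() and julianday() are SQLite functions, and in SQL Server the equivalent expression would be DATEDIFF(day, startdate, inc_date) + 1. The table name dataset1 and the column WorkDate are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset1 (DeptID TEXT, EmployeeID INTEGER, "
    "StartDate TEXT, EndDate TEXT)"
)
conn.executemany(
    "INSERT INTO dataset1 VALUES (?, ?, ?, ?)",
    [
        ("001", 123, "2010-01-01", "2012-01-01"),
        ("001", 124, "2010-05-05", "2013-01-01"),
    ],
)

# One row per employee per day; the day count comes from the date difference.
days = conn.execute(
    """
    WITH RECURSIVE data (DeptID, EmployeeID, StartDate, EndDate, WorkDate) AS (
        SELECT DeptID, EmployeeID, StartDate, EndDate, StartDate
        FROM dataset1
        UNION ALL
        SELECT DeptID, EmployeeID, StartDate, EndDate, date(WorkDate, '+1 day')
        FROM data
        WHERE date(WorkDate, '+1 day') <= EndDate
    )
    SELECT DeptID, EmployeeID, WorkDate,
           CAST(julianday(WorkDate) - julianday(StartDate) AS INTEGER) + 1
               AS NoOfDaysWorked
    FROM data
    ORDER BY EmployeeID, WorkDate
    """
).fetchall()

emp_123 = [row for row in days if row[1] == 123]
print(emp_123[:4])
print(len(emp_123))
```

For employee 123 this produces 731 rows, from 2010-01-01 (day 1) through 2012-01-01 (day 731).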
SELECT DeptID, EmployeeID, Date, DATEDIFF(DAY, StartDate, '3/1/2011') AS ofDaysWorked
FROM DATASET1
See if that works!