SQL Server: smoothed average by day of year

I have a subquery with the average reservoir inflow per day of year (1-365). Now I would like to calculate a smoothed/moving average for each day of year in a new column.
Example: for January 1st (DayOfYear = 1) I would like to calculate a smoothed average over 21 days (the day itself plus 10 days before and 10 days after), i.e. an average of the days wrapping around the year end from 356 to 11. For day of year 55 the average should be calculated over days 45 to 65.
Here is the unfinished query, based on a subquery called 'sub' in which the 10 years of inflow are first averaged by day of year:
DECLARE @Dager int;
SET @Dager = 10; /* number of days before and after the actual day of year to include in the average */
Select sub.Magasin, sub.DayOfYear, AVG(sub.Inflow) as AvgInflow
FROM (SELECT Date, Magasin, Datepart(dy,Date) as DayOfYear, Value as Inflow
FROM inputtable
WHERE Date >= DATEFROMPARTS(2008,1,1) and Date <= DATEFROMPARTS(2017,12,31)) sub
GROUP By sub.Magasin, sub.DayOfYear
ORDER BY sub.magasin, sub.DayOfYear

Without any sample data, I'm going to suggest this for SQL Server 2012+
(your SQL looks like SQL Server 2012+):
SELECT
Magasin,
Datepart(dy,Date) AS DayOfYear,
AVG(Value) OVER (
PARTITION BY Magasin
ORDER BY YEAR(Date), Datepart(dy,Date)
ROWS BETWEEN 10 PRECEDING AND 10 FOLLOWING) AS SmoothedInflow
FROM
inputtable
WHERE
Date >= DATEFROMPARTS(2008,1,1) and Date <= DATEFROMPARTS(2017,12,31)
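The window in the query above runs over the daily rows in date order. The wrap-around the question describes (day 1 drawing on days 356-365 as well as 2-11) can be illustrated outside SQL. The following is a minimal Python sketch, not the answer's method: it assumes exactly 365 day-of-year averages are already available, and the function names are hypothetical.

```python
def window_days(day, half_window, days_in_year=365):
    """Day-of-year numbers (1-based) in a circular window centred on `day`."""
    return [(day - 1 + offset) % days_in_year + 1
            for offset in range(-half_window, half_window + 1)]


def circular_moving_average(values, half_window):
    """Smooth `values` (one entry per day of year) with a window that
    wraps around the end of the list back to its start."""
    n = len(values)
    width = 2 * half_window + 1
    smoothed = []
    for i in range(n):
        # Python's % always returns a non-negative index, so i - 10 wraps to the end
        total = sum(values[(i + offset) % n]
                    for offset in range(-half_window, half_window + 1))
        smoothed.append(total / width)
    return smoothed


# Assumed toy input: the "inflow" on each day equals its day number.
day_numbers = list(range(1, 366))
smoothed = circular_moving_average(day_numbers, 10)
```

With this toy input, `window_days(1, 10)` runs from 356 to 11 and the smoothed value for day 55 is the mean of days 45 to 65.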

Related

How to get the data for the past 24 months in Snowflake

I have the data for the current month in Snowflake which I am extracting with the below mentioned query
select distinct HPOLICY
, ANNUALPREMIUMAMOUNT
, year(dateadd(year, 0, CURRENT_DATE))
, month(dateadd(month, 0, CURRENT_DATE)) yearmonth
from hub_test
I want to extend this to the past 24 months, which means getting the same data for Sep 2019, Aug 2019 and so on, back through the past 24 months.
That query gets the distinct data for all time and puts a fake current year/month column on it.
If you were doing something like:
select distinct HPOLICY
,ANNUALPREMIUMAMOUNT
,date_part('year', date_column) as year
,date_part('month', date_column) as month
from hub_test
where date_column >= date_trunc('month',CURRENT_DATE());
you would have the current month's data, assuming date_column holds the date of the data in each row.
Therefore to get the last 24 months you would alter that to:
select distinct HPOLICY
,ANNUALPREMIUMAMOUNT
,date_part('year', date_column) as year
,date_part('month', date_column) as month
from hub_test
where date_column >= dateadd('month',-24, date_trunc('month',CURRENT_DATE()));
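To make the cutoff in that filter concrete, here is a minimal Python sketch of the same arithmetic: truncate to the first day of the current month, then step back a number of months. The function name is hypothetical and this only mirrors the date logic, not Snowflake itself.

```python
from datetime import date


def month_window_start(today, months_back):
    """First day of the month that lies `months_back` months before `today`'s month."""
    # Count months from year 0 so the subtraction can cross year boundaries
    month_index = today.year * 12 + (today.month - 1) - months_back
    return date(month_index // 12, month_index % 12 + 1, 1)


# Example: run on 15 October 2019 with a 24-month look-back
cutoff = month_window_start(date(2019, 10, 15), 24)
```

Rows dated on or after `cutoff` (here 1 October 2017) would pass the filter.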

update date column of a table in a database

How can I update the date column of a table in a database (MSSQL) by 1 year for the first 1000 rows, 2 years for the second 1000 rows, and so on? I know how to implement it by assigning a temporary id, but is there a way to update the data in a loop?
For example:
Suppose I have 6000 rows in a table with a joined_date column ranging from 2012-01-01 to 2017-01-01 in ascending order. I want to update the first thousand rows, increasing the date by 1 year, the 2nd thousand rows by 1 year as well, and so on...
If my first thousand rows contain joined dates in 2012, I want to update them to 2013, and if my second thousand rows contain joined dates in 2012 to 2013, then I want to increment them by 1 as well.
We can try assigning a row number to your table, then use it to do the updates:
WITH cte AS (
SELECT joined_date, ROW_NUMBER() OVER (ORDER BY joined_date) - 1 rn
FROM yourTable
)
UPDATE cte
SET joined_date = DATEADD(year, (rn / 1000) + 1, joined_date);
The trick here is that the first 1000 rows, which receive row numbers 0 up to and including 999, have an rn / 1000 value of 0 under integer division, to which we add 1 to get the number of years to add. The next 1000 records have 2 years added, and so on.
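The block arithmetic is easy to check in isolation. A minimal Python sketch (the function name is hypothetical) shows that integer division assigns every row in a block of 1000 the same offset, whereas the remainder cycles through 0 to 999 inside each block:

```python
def years_to_add(rn, block_size=1000):
    """Years to add for a zero-based row number: 1 for the first block, 2 for the next, ..."""
    return rn // block_size + 1


# Integer division is constant within a block of 1000 rows
first_block = {years_to_add(rn) for rn in range(0, 1000)}
second_block = {years_to_add(rn) for rn in range(1000, 2000)}

# The remainder, by contrast, takes 1000 different values within a single block
remainders_in_first_block = {rn % 1000 for rn in range(0, 1000)}
```

Here `first_block` contains only 1 and `second_block` only 2, while `remainders_in_first_block` holds 1000 distinct values.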

Compare the dates, compute the difference between them, postgres

I have a date column and a balance column for each user. Every time a user makes a transaction, a new row gets added to this table. A user might make 15 transactions in one day and then no transactions at all for 5 days.
Like this:
date balance
2017-06-01 95.63
2017-06-01 97.13
2017-06-01 72.14
2017-06-06 45.04
2017-06-08 20.04
2017-06-09 10.63
2017-06-09 -29.37
2017-06-09 -51.35
2017-06-13 -107.55
2017-06-13 -101.35
2017-06-15 -157.55
2017-06-16 -159.55
2017-06-17 -161.55
The goal is to take the positive and negative transactions made during the same day, compute their average or minimum value, and treat that as one transaction. If no transaction has been made on the following day, the amount from the previous day should be used.
This means that for each day in a month I should calculate interest, and if the balance has not been updated, the balance of the previous day should be used.
Hypothetically my table should look like
date balance
2017-06-01 72.14
2017-06-02 72.14
2017-06-03 72.14
2017-06-04 72.14
2017-06-05 72.14
2017-06-06 45.04
2017-06-07 45.04
2017-06-08 20.04
2017-06-09 -51.35
2017-06-10 -51.35
2017-06-11 -51.35
2017-06-12 -51.35
2017-06-13 -107.55
2017-06-14 -107.55
2017-06-15 -157.55
2017-06-16 -159.55
2017-06-17 -161.55
I have added the days that were missing and grouped the days that were duplicated.
Once I have this done, I can select the number of positive balance days, e.g. 8 days, compute the average positive balance, and multiply it by 0.4%.
8*58.8525*0.004=0.23
The same should be done with the negative balance, but with a different interest rate: the number of negative balance days, e.g. 9, multiplied by the average negative balance during those days and 8.49%.
9*-99.90555556*0.00849=-0.848
So my expected result is just to have these two columns
Neg Pos
-0.848 0.23
How can I do that in Postgres? The OVERLAPS operator does not really help, since I need to specify the dates.
Besides, I do not know how to:
loop over the days and see if there is a duplicate;
see which days are missing and use the previous balance for each of these missing days.
Please try this, replacing table with your table name:
with cte as
(
Select "date" as date
,min(balance) as balance
,lead("date") over(order by "date") next_date
,Coalesce(ABS("date" - lead("date") over(order by "date")),1) date_diff
from table
group by "date"
),
cte2 as
(
Select date_diff*balance as tot_bal , date_diff
from cte
Where balance > 0
),
cte3 as
(
Select date_diff*balance as tot_bal , date_diff
from cte
Where balance < 0
)
Select (sum(cte2.tot_bal) / sum(cte2.date_diff) ) * 0.004 as pos
,(sum(cte3.tot_bal) / sum(cte3.date_diff) ) * 0.00849 as neg
from cte2
,cte3;
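The same calculation can be traced in a few lines of Python. This is a minimal sketch under the assumptions the query makes: the minimum balance of a day stands for that day, each balance is carried forward until the next recorded date, and the last date counts as one day. The function name is hypothetical; the rates 0.004 and 0.00849 are the multipliers used in the question's arithmetic.

```python
from datetime import date

# Sample rows from the question: (date, balance)
rows = [
    (date(2017, 6, 1), 95.63), (date(2017, 6, 1), 97.13), (date(2017, 6, 1), 72.14),
    (date(2017, 6, 6), 45.04), (date(2017, 6, 8), 20.04),
    (date(2017, 6, 9), 10.63), (date(2017, 6, 9), -29.37), (date(2017, 6, 9), -51.35),
    (date(2017, 6, 13), -107.55), (date(2017, 6, 13), -101.35),
    (date(2017, 6, 15), -157.55), (date(2017, 6, 16), -159.55), (date(2017, 6, 17), -161.55),
]


def interest_by_sign(rows, pos_rate=0.004, neg_rate=0.00849):
    """Return (pos, neg): day-weighted average balance times the rate, per sign."""
    # Collapse duplicates: keep the minimum balance seen on each date
    per_day = {}
    for day, balance in rows:
        per_day[day] = min(balance, per_day.get(day, balance))

    days = sorted(per_day)
    totals = {"pos": [0.0, 0], "neg": [0.0, 0]}  # [weighted balance sum, day count]
    for i, day in enumerate(days):
        # A balance holds until the next recorded date; the last one counts for 1 day
        span = (days[i + 1] - day).days if i + 1 < len(days) else 1
        balance = per_day[day]
        if balance > 0:
            key = "pos"
        elif balance < 0:
            key = "neg"
        else:
            continue  # a zero balance earns no interest either way
        totals[key][0] += balance * span
        totals[key][1] += span

    pos = totals["pos"][0] / totals["pos"][1] * pos_rate if totals["pos"][1] else 0.0
    neg = totals["neg"][0] / totals["neg"][1] * neg_rate if totals["neg"][1] else 0.0
    return pos, neg


pos, neg = interest_by_sign(rows)
```

On the sample data this gives roughly 0.235 and -0.848, which matches the expected Pos and Neg columns. Those figures correspond to the average balance times the rate, without a further multiplication by the number of days.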

MSSQL order by previous 7 days

I run this query in MSSQL to get the items from the last 7 days, grouped by day of the week:
SELECT COUNT(Date_Entered), DATENAME(WEEKDAY, Date_Entered)
FROM my_table
WHERE Board_Name = 'Board'
AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATENAME(WEEKDAY, Date_Entered)
In the result, days of the week are sorted in alphabetical order: Friday > Monday > Saturday > Sunday > Thursday > Tuesday > Wednesday
How do I sort by the normal/correct/common sense order, starting with the weekday of 7 days ago and ending with yesterday?
Ordering by MAX(Date_Entered) should work too:
SELECT
COUNT(Date_Entered),
DATENAME(WEEKDAY, Date_Entered)
FROM my_table
WHERE Board_Name = 'Board' AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATENAME(WEEKDAY, Date_Entered)
ORDER BY MAX(Date_Entered);
Normally you would order by the date ascending, but because the query uses an aggregate function you would have to group by the date, which would defeat the weekday grouping. Since the MAX(Date_Entered) of each group is that group's date, you can order by it instead.
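The idea of ordering groups by their latest date can be sketched outside SQL as well. The following minimal Python example assumes, as the answer does, that the data covers about one week; the function name and the sample dates are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Fixed English names so the result does not depend on the locale
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def counts_by_weekday(dates):
    """Count entries per weekday name, ordered by the latest date in each group."""
    groups = defaultdict(list)
    for d in dates:
        groups[WEEKDAY_NAMES[d.weekday()]].append(d)
    # Sorting on max(date) gives chronological order instead of alphabetical order
    ordered = sorted(groups.items(), key=lambda item: max(item[1]))
    return [(name, len(ds)) for name, ds in ordered]


sample = [date(2024, 1, 5), date(2024, 1, 3), date(2024, 1, 1),
          date(2024, 1, 5), date(2024, 1, 4)]
result = counts_by_weekday(sample)
```

For these sample dates the groups come out as Monday, Wednesday, Thursday, Friday rather than in alphabetical order.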
DATEPART is your friend, try it like this:
SELECT COUNT(Date_Entered), DATENAME(WEEKDAY, Date_Entered),DATEPART(WEEKDAY,Date_Entered)
FROM my_table
WHERE Board_Name = 'Board'
AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATEPART(WEEKDAY,Date_Entered),DATENAME(WEEKDAY, Date_Entered)
ORDER BY DATEPART(WEEKDAY,Date_Entered)
If you can't count on data being available for every week then you'd need to do something more based on date calculations. Off the top of my head I think this will be more reliable:
ORDER BY (DATEDIFF(dd, MAX(Date_Entered), CURRENT_TIMESTAMP) + 77777) % 7
EDIT: I wrote that not realizing that the data was already limited to a single week. I thought the intention was to group in buckets by day of week for a longer range of dates.
I'll also comment that to me it is more natural to do the grouping on cast(Date_Entered as date) rather than on a string value and I wouldn't be surprised if it's a more efficient query.

Get record based on year in oracle

I am creating a query to give the number of days between two dates, broken down by year. I have the following kind of date range:
From Date: TO_DATE('01-Jun-2011','dd-Mon-yyyy')
To Date: TO_DATE('31-Dec-2013','dd-Mon-yyyy')
My Result should be:
Year Number of day
------------------------------
2011 XXX
2012 XXX
2013 XXX
I've tried the query below:
WITH all_dates AS
(SELECT start_date + LEVEL - 1 AS a_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= end_date + 1 - start_date
)
SELECT TO_CHAR ( TRUNC (a_date, 'YEAR') , 'YYYY' ) AS YEAR,
COUNT (*) AS num_days
FROM all_dates
WHERE a_date - TRUNC (a_date, 'IW') < 7
GROUP BY TRUNC (a_date, 'YEAR')
ORDER BY TRUNC (a_date, 'YEAR') ;
I got exactly the output I wanted:
Year Number of day
------------------------------
2011 286
2012 366
2013 176
My question is: if I use CONNECT BY, the query takes a long time to execute because I have millions of records in the table, so I don't want to use the CONNECT BY clause.
The CONNECT BY clause creates virtual rows for each record.
Any help or suggestion would be greatly appreciated.
From your vague expected results I think you want the number of records between those dates, not the number of days; but it's rather unclear. Since you refer to a table in the question I assume you want something related to the table data, not simply days between two dates which wouldn't depend on a table at all. (I have no idea what the connect by clause reference means though). This should give you that, if it is what you want:
select extract(year from date_field), count(*)
from t42
where date_field >= to_date('01-Jun-2011', 'DD-MON-YYYY')
and date_field < to_date('31-Dec-2013', 'DD-MON-YYYY') + interval '1' day
group by extract(year from date_field)
order by extract(year from date_field);
The where clause is as you'd expect between two dates; I've assumed there might be times in your date field (i.e. not all at midnight) and that you want to count all records on the last date in your range. Then it's grouping and counting based on the year for each record.
SQL Fiddle.
If you want the number of days that have records within the range, then you can just vary the count slightly:
select extract(year from date_field), count(distinct trunc(date_field))
...
SQL Fiddle.
You can use the query below to reduce the number of virtual rows by considering only the years in between. You can check the SQLFIDDLE to check the performance.
First, consider only the number of days between the start date and the end of that year, or the end date if it is in the same year.
Then consider the years in between, from the year after the start date to the year before the end date's year.
Finally, consider the number of days from the start of the end date's year to the end date.
Hence, instead of iterating over all the days between the start date and end date, we only need to iterate over the years.
WITH all_dates AS
(SELECT (TO_CHAR(START_DATE,'yyyy') + LEVEL - 1) YEARS_BETWEEN,start_date,end_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= (TO_CHAR(end_date,'yyyy')) - (TO_CHAR(start_date,'yyyy')-1)
)
SELECT DECODE(TO_CHAR(END_DATE,'yyyy'),YEARS_BETWEEN,END_DATE
,to_date('31-12-'||years_between,'dd-mm-yyyy'))
- DECODE(TO_CHAR(START_DATE,'yyyy'),YEARS_BETWEEN,START_DATE
,to_date('01-01-'||years_between,'dd-mm-yyyy'))+1,years_between
FROM ALL_DATES;
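The per-year approach can be checked against the question's figures with a minimal Python sketch. The function name is hypothetical; the logic clamps each calendar year to the requested range and counts days inclusively, so the loop runs once per year rather than once per day.

```python
from datetime import date


def days_per_year(start, end):
    """Inclusive number of days falling in each calendar year between start and end."""
    result = {}
    for year in range(start.year, end.year + 1):
        # Clamp the calendar year to the requested range
        first = max(start, date(year, 1, 1))
        last = min(end, date(year, 12, 31))
        result[year] = (last - first).days + 1
    return result


# Same range as the CONNECT BY example: 21/03/2011 to 25/06/2013
counts = days_per_year(date(2011, 3, 21), date(2013, 6, 25))
```

For that range `counts` is `{2011: 286, 2012: 366, 2013: 176}`, the same figures as the day-by-day query in the question.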
In Oracle you can perform addition and subtraction on dates like this:
SELECT
TO_DATE('31-Dec-2013','dd-Mon-yyyy') - TO_DATE('01-Jun-2011','dd-Mon-yyyy')
DAYS FROM DUAL;
It will return the difference in days between the two dates.
select to_date(2011, 'yyyy'), to_date(2012, 'yyyy'), to_date(2013, 'yyyy')
from dual;
TO_DATE(2011,'Y TO_DATE(2012,'Y TO_DATE(2013,'Y
--------------- --------------- ---------------
01-MAY-11 01-MAY-12 01-MAY-13
select to_char(date_field,'yyyy'), count(*)
from your_table
where date_field between to_date('01-Jun-2011', 'DD-MON-YYYY')
and to_date('31-Dec-2013 23:59:59', 'DD-MON-YYYY hh24:mi:ss')
group by to_char(date_field,'yyyy')
order by to_char(date_field,'yyyy');
