I have a table of instances that have a Start Date and an End Date column. Here is a simple example:
ID StartDate EndDate
1 1/8/2015 1/10/2015
2 1/8/2015 1/15/2015
3 2/6/2015 3/2/2015
4 1/6/2015 2/20/2015
5 3/18/2015 4/2/2015
I'm trying to write a query that counts how many unique days occur in a given month, but some of the instances overlap and span multiple months, which makes it difficult. The results I want would look something like this:
Month # of days
January 26 (earliest is ID 4 starting 1/6)
February 28 (entire month because of ID 3 and 4)
March 16 (2 days from ID 3, 14 days from ID 5)
April 2 (first 2 days of the month from ID 5)
May 0
Any help would be greatly appreciated. Thanks!!
J,
Please check my SQL script below.
Note that the script relies on a SQL dates table; more precisely, a SQL function that returns a temporary table of dates.
You can find the source code in the given tutorial.
I also used multiple CTE queries:
;with dates as (
select
cast(date as date) date
from [dbo].[DateTable]('1/1/2015','12/31/2015')
), cte as (
select
distinct date
from instances, dates
where dates.date between instances.startdate and instances.enddate
)
select
year(date) year, month(date) month, count(*) dayscount
from cte
group by year(date), month(date)
By the way, March returns 16 days: 2 from one instance and 14 from the other.
I hope the SELECT statement is useful.
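To see why the dates-table join plus DISTINCT handles the overlaps, here is a minimal Python sketch of the same idea, using the sample rows from the question. The function name unique_days_per_month is made up for illustration; it is not part of the SQL above.

```python
from collections import Counter
from datetime import date, timedelta

# Sample rows from the question: (ID, StartDate, EndDate).
instances = [
    (1, date(2015, 1, 8), date(2015, 1, 10)),
    (2, date(2015, 1, 8), date(2015, 1, 15)),
    (3, date(2015, 2, 6), date(2015, 3, 2)),
    (4, date(2015, 1, 6), date(2015, 2, 20)),
    (5, date(2015, 3, 18), date(2015, 4, 2)),
]

def unique_days_per_month(rows):
    """Expand every range into single days, de-duplicate, count per month."""
    covered = set()  # plays the role of SELECT DISTINCT date
    for _id, start, end in rows:
        day = start
        while day <= end:  # BETWEEN is inclusive on both ends
            covered.add(day)
            day += timedelta(days=1)
    # Equivalent of GROUP BY year(date), month(date) with COUNT(*).
    return Counter((d.year, d.month) for d in covered)

counts = unique_days_per_month(instances)
for (year, month), days in sorted(counts.items()):
    print(year, month, days)
```

Because the days go into a set before they are counted, a day covered by several instances contributes only once, which is what the DISTINCT in the CTE does.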
The problem is fairly complicated.
In my opinion, you need to write a function that counts the number of "unique" days covered by the ranges in the records.
I didn't write the function, but the design of this new function, "num", is like this:
1- It takes the month and year as parameters (named aMonth and aYear).
2- It finds all records that have at least one day in aYear/aMonth:
(month(startDate)=aMonth and year(startDate)=aYear)
or
(month(endDate)=aMonth and year(endDate)=aYear)
or
(
((month(startDate)<aMonth and year(startDate)=aYear) or (year(startDate)<aYear))
and
((month(endDate)>aMonth and year(endDate)=aYear) or (year(endDate)>aYear))
)
3- Over these records, it opens a cursor and processes the records one by one.
4- While processing each record of the cursor, it marks the days of the month that the record covers in an array (or a 28-31 character string of 0/1, for example).
5- It counts the number of 1's in this array (or string) and returns it.
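The design steps above can be sketched in Python as follows. This is only an illustration of the 0/1 array idea, not the SQL function itself. The ROWS list holds the sample data from the question, and the loop over rows stands in for the cursor.

```python
import calendar
from datetime import date

# Sample (StartDate, EndDate) pairs from the question.
ROWS = [
    (date(2015, 1, 8), date(2015, 1, 10)),
    (date(2015, 1, 8), date(2015, 1, 15)),
    (date(2015, 2, 6), date(2015, 3, 2)),
    (date(2015, 1, 6), date(2015, 2, 20)),
    (date(2015, 3, 18), date(2015, 4, 2)),
]

def num(a_month, a_year, rows=ROWS):
    """Count the distinct days of a_year/a_month covered by any range."""
    days_in_month = calendar.monthrange(a_year, a_month)[1]
    first = date(a_year, a_month, 1)
    last = date(a_year, a_month, days_in_month)
    flags = [0] * days_in_month  # one 0/1 slot per day of the month
    for start, end in rows:  # stands in for the cursor loop
        if end < first or start > last:
            continue  # the record has no day in this month
        lo = max(start, first)  # clamp the range to the month
        hi = min(end, last)
        for day in range(lo.day, hi.day + 1):
            flags[day - 1] = 1
    return sum(flags)

for month in range(1, 6):
    print(calendar.month_name[month], num(month, 2015))
```

Testing "end < first or start > last" is a shorter way to express the record filter from step 2: a range touches the month unless it ends before the month starts or starts after it ends.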
With this function ("num") written, the top-level query will look like this:
Select 'January', dbo.num(1, 2015) as days
union all
Select 'February', dbo.num(2, 2015) as days
union all
Select 'March', dbo.num(3, 2015) as days
union all
Select 'April', dbo.num(4, 2015) as days
union all
Select 'May', dbo.num(5, 2015) as days
union all
Select 'June', dbo.num(6, 2015) as days
union all
Select 'July', dbo.num(7, 2015) as days
union all
Select 'August', dbo.num(8, 2015) as days
union all
Select 'September', dbo.num(9, 2015) as days
union all
Select 'October', dbo.num(10, 2015) as days
union all
Select 'November', dbo.num(11, 2015) as days
union all
Select 'December', dbo.num(12, 2015) as days
If you only count days within the same year, you can try this. I only built the code for two months, but it's easy to extend.
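SELECT
-- Clamp each range to the month before measuring it; the end date is inclusive.
-- Days covered by several overlapping rows are still counted once per row.
(SELECT SUM(CASE WHEN sdate>='2015-2-1' OR edate<'2015-1-1' THEN 0
ELSE datediff(day,
CASE WHEN sdate<'2015-1-1' THEN '2015-1-1' ELSE sdate END,
CASE WHEN edate>='2015-2-1' THEN '2015-2-1' ELSE dateadd(day, 1, edate) END) END)
FROM a1)
AS Jan_Days,
(SELECT SUM(CASE WHEN sdate>='2015-3-1' OR edate<'2015-2-1' THEN 0
ELSE datediff(day,
CASE WHEN sdate<'2015-2-1' THEN '2015-2-1' ELSE sdate END,
CASE WHEN edate>='2015-3-1' THEN '2015-3-1' ELSE dateadd(day, 1, edate) END) END)
FROM a1)
AS Feb_Days,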
...
It's far from efficient. A script or stored procedure that runs through your records and calculates the results will be more efficient.
I have the data for the current month in Snowflake, which I am extracting with the query below:
select distinct HPOLICY
, ANNUALPREMIUMAMOUNT
, year(dateadd(year, 0, CURRENT_DATE))
, month(dateadd(month, 0, CURRENT_DATE)) yearmonth
from hub_test
I want to extrapolate this data to the past 24 months, which means getting the same data for Sep 2019, Aug 2019 and so on, back through the past 24 months.
That query gets the all-time distinct data and puts a fake current year/month column on it.
If you were doing something like:
select distinct HPOLICY
,ANNUALPREMIUMAMOUNT
,date_part('year', date_column) as year
,date_part('month', date_column) as month
from hub_test
where date_column >= date_trunc('month',CURRENT_DATE());
you would have the current month's data, assuming date_column holds the date of the data in the row.
Therefore, to get the last 24 months you would alter that to:
select distinct HPOLICY
,ANNUALPREMIUMAMOUNT
,date_part('year', date_column) as year
,date_part('month', date_column) as month
from hub_test
where date_column >= dateadd('month',-24, date_trunc('month',CURRENT_DATE()));
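To make the cutoff concrete, here is a small Python sketch of what dateadd('month', -24, date_trunc('month', CURRENT_DATE())) evaluates to. The helper name and the example "today" of 2019-10-15 are assumptions for illustration only.

```python
from datetime import date

def month_start_months_ago(today, months_back):
    """First day of the month that lies months_back months before today's month."""
    # Count months since year 0 so the year rolls over correctly.
    total = today.year * 12 + (today.month - 1) - months_back
    return date(total // 12, total % 12 + 1, 1)

# Assumed "today"; in the query this would be CURRENT_DATE().
cutoff = month_start_months_ago(date(2019, 10, 15), 24)
print(cutoff)  # 2017-10-01
```

Every row whose date_column is on or after this cutoff passes the WHERE clause, so the result covers the current month plus the 24 full months before it.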
Here is a weird one for you all.
I need to determine the number of days in a Month
;WITH cteNetProfit AS
(
---- NET PROFIT
SELECT DT.CreateDate
, SUM(DT.Revenue) as Revenue
, SUM(DT.Cost) as Cost
, SUM(DT.GROSSPROFIT) AS GROSSPROFIT
FROM
(
SELECT CAST([createDTG] AS DATE) as CreateDate
, SUM(Revenue) as Revenue
, SUM(Cost) as Cost
, SUM(REVENUE - COST) AS GROSSPROFIT
FROM [dbo].[CostRevenueSpecific]
WHERE CAST([createDTG] AS DATE) > CAST(GETDATE() - 91 AS DATE)
AND CAST([createDTG] AS DATE) <= CAST(GETDATE() - 1 AS DATE)
GROUP BY createDTG
UNION ALL
SELECT CAST([CallDate] AS DATE) AS CreateDate
, SUM(Revenue) as Revenue
, SUM(Cost) as Cost
, SUM(REVENUE - COST) AS GROSSPROFIT
FROM abc.PublisherCallByDay
WHERE CAST([CallDate] AS DATE) > CAST(GETDATE() - 91 AS DATE)
AND CAST([CallDate] AS DATE) <= CAST(GETDATE() - 1 AS DATE)
GROUP BY CALLDATE
) DT
GROUP BY DT.CreateDate
)
select distinct MONTH(CREATEDATE), DateDiff(Day,CreateDate,DateAdd(month,1,CreateDate))
FROM cteNetProfit
For some reason it returns two different results for March 2016: one is 30 and the other is 31 (which of course is correct). I validated that the underlying data only has 31 days' worth of data for March. Since 2016 is a leap year, can this affect the DATEDIFF function? The remaining months return the correct number. Here is the output:
2 29
3 31
3 30
4 30
5 31
Thanks for the input; however, I found the solution elsewhere:
select Distinct MONTH(CREATEDATE), Day(EOMONTH(CreateDate))
FROM cteNetProfit
The difference appears when you hit 2016-03-31. If you run the query below for 2016-03-30 and 2016-03-31, adding 1 month with DATEADD returns 2016-04-30 in both cases, because a day that does not exist in the target month is clamped to that month's last day. DATEDIFF therefore gives 31 for 2016-03-30 but only 30 for 2016-03-31.
SELECT DATEADD(MONTH,1,'2016-03-30') , DATEADD(MONTH,1,'2016-03-31')
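The same arithmetic can be reproduced outside SQL Server. The Python sketch below mimics the clamping of DATEADD(month, 1, ...) with a hand-written helper (add_one_month is a made-up name, not a library call) and compares it with reading the month length directly, which is what Day(EOMONTH(CreateDate)) does.

```python
import calendar
from datetime import date

def add_one_month(d):
    """Add one month, clamping the day to the last day of the target month."""
    year = d.year + d.month // 12  # December rolls into the next year
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))

def days_in_month(d):
    """Length of d's month, independent of which day of the month d is."""
    return calendar.monthrange(d.year, d.month)[1]

for d in (date(2016, 3, 30), date(2016, 3, 31)):
    diff = (add_one_month(d) - d).days  # same idea as the DATEDIFF expression
    print(d, add_one_month(d), diff, days_in_month(d))
```

Both dates land on 2016-04-30, so the difference is 31 for the 30th and 30 for the 31st, while days_in_month returns 31 for both.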
This syntax seemed to work (courtesy of https://raresql.com/2013/01/06/sql-server-get-number-of-days-in-month/).
SELECT DAY(DATEADD(ms,-2,DATEADD(MONTH, DATEDIFF(MONTH,0,@DATE)+1,0))) AS [Current Month]
I run this query in MSSQL to get the items from the last 7 days, grouped by day of the week:
SELECT COUNT(Date_Entered), DATENAME(WEEKDAY, Date_Entered)
FROM my_table
WHERE Board_Name = 'Board'
AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATENAME(WEEKDAY, Date_Entered)
In the result, days of the week are sorted in alphabetical order: Friday > Monday > Saturday > Sunday > Thursday > Tuesday > Wednesday
How do I sort by the normal/correct/common sense order, starting with the weekday of 7 days ago and ending with yesterday?
Ordering by MAX(Date_Entered) should work too:
SELECT
COUNT(Date_Entered),
DATENAME(WEEKDAY, Date_Entered)
FROM my_table
WHERE Board_Name = 'Board' AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATENAME(WEEKDAY, Date_Entered)
ORDER BY MAX(Date_Entered);
Normally you would order by the date ascending, but because the query uses an aggregate function you would have to group by the date, which would defeat the grouping by weekday name. Since each weekday occurs only once in the 7-day window, MAX(Date_Entered) within each group is that day's date, so you can order by it instead.
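As a rough illustration of why ordering by the maximum date works, here is a Python sketch that groups dates by weekday name and then sorts the groups by their latest date. The sample dates are invented for the example.

```python
from collections import defaultdict
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

def weekday_counts_in_date_order(dates):
    """Group by weekday name, then order the groups by their latest date."""
    groups = defaultdict(list)
    for d in dates:
        groups[WEEKDAY_NAMES[d.weekday()]].append(d)  # GROUP BY DATENAME(...)
    ordered = sorted(groups.items(), key=lambda kv: max(kv[1]))  # ORDER BY MAX(...)
    return [(name, len(ds)) for name, ds in ordered]

# Hypothetical entries within one 7-day window (2024-01-01 was a Monday).
entries = [date(2024, 1, 5), date(2024, 1, 3), date(2024, 1, 5),
           date(2024, 1, 8), date(2024, 1, 7)]
result = weekday_counts_in_date_order(entries)
print(result)
```

The groups come out in calendar order for the window (Wednesday, Friday, Sunday, Monday) rather than alphabetically.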
DATEPART is your friend, try it like this:
SELECT COUNT(Date_Entered), DATENAME(WEEKDAY, Date_Entered),DATEPART(WEEKDAY,Date_Entered)
FROM my_table
WHERE Board_Name = 'Board'
AND DATEDIFF(DAY,Date_Entered,GETDATE()) <= 7
GROUP BY DATEPART(WEEKDAY,Date_Entered),DATENAME(WEEKDAY, Date_Entered)
ORDER BY DATEPART(WEEKDAY,Date_Entered)
If you can't count on data being available for every week then you'd need to do something more based on date calculations. Off the top of my head I think this will be more reliable:
ORDER BY (DATEDIFF(dd, MAX(Date_Entered), CURRENT_TIMESTAMP) + 77777) % 7
EDIT: I wrote that not realizing that the data was already limited to a single week. I thought the intention was to group in buckets by day of week for a longer range of dates.
I'll also comment that to me it is more natural to do the grouping on cast(Date_Entered as date) rather than on a string value and I wouldn't be surprised if it's a more efficient query.
I would like to select a static range of dates (May 1 through June 30, for example) and find out whether anyone has more than 5 calendar entries in one week (week1, week2, week3, week4). If it is easier, I could select a week number instead of a range of dates and then show anyone working more than 5 times in week1, for example.
This will tell me approximately if anyone has overtime scheduled.
EmpCalendar table (relevant columns and sample rows shown):
Cal_ID, user_id, days_date, WeekNumber
1, 34, 2015-04-01, Week1
3, 34, 2015-04-02, Week1
5, 34, 2015-04-03, Week1
7, 34, 2015-04-04, Week1
8, 34, 2015-04-05, Week1
9, 34, 2015-04-06, Week1
So in the above table we see that the Employee with user_id '34' has worked 6 times on WeekNumber of 'Week1'. I need it to return something like:
Tom Thumb (user_id = 34) worked 6 times in Week1 or within dates falling in the same week. Something to that effect. I am using ColdFusion 8 and MS SQL 2008.
Simple group by (assuming week numbers are not duplicated - depends on how they are assigned in the table):
Select user_id, WeekNumber
, count(distinct days_date) as DaysWorked
From EmpCalendar
--optional, if you want to limit the dates you're searching
where days_date between @startDate and @endDate
--not optional
group by user_id, WeekNumber
--"more than 5 entries in one week", as asked in the question
having count(distinct days_date) > 5
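For reference, the group-by logic can be sketched in Python with the sample rows from the question. The function name over_limit and the default limit of 5 are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (Cal_ID, user_id, days_date, WeekNumber).
rows = [
    (1, 34, date(2015, 4, 1), "Week1"),
    (3, 34, date(2015, 4, 2), "Week1"),
    (5, 34, date(2015, 4, 3), "Week1"),
    (7, 34, date(2015, 4, 4), "Week1"),
    (8, 34, date(2015, 4, 5), "Week1"),
    (9, 34, date(2015, 4, 6), "Week1"),
]

def over_limit(rows, limit=5):
    """Return (user_id, WeekNumber, days_worked) where days_worked > limit."""
    days = defaultdict(set)  # a set per group gives COUNT(DISTINCT days_date)
    for _cal_id, user_id, days_date, week in rows:
        days[(user_id, week)].add(days_date)
    return sorted((user, week, len(worked))
                  for (user, week), worked in days.items()
                  if len(worked) > limit)  # the HAVING clause

print(over_limit(rows))  # [(34, 'Week1', 6)]
```

Like the SQL, this trusts the stored WeekNumber value; if the same label is reused in other months or years, the grouping key would need a year or month component as well.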
Week1 can be duplicated across different years or months, so the correct way to do this is to group by year and month together with week:
Select user_id, Year(days_date), Month(days_date), DATEPART(wk, days_date), Count(*) As Days
From EmpCalendar
Where days_date Between @StartDate And @EndDate
Group by user_id, Year(days_date), Month(days_date), DATEPART(wk, days_date)
Having Count(*) > 5
I am creating a query to give the number of days between two dates, broken down by year. I have the following date range:
From Date: TO_DATE('01-Jun-2011','dd-MM-yyyy')
To Date: TO_DATE('31-Dec-2013','dd-MM-yyyy')
My Result should be:
Year Number of day
------------------------------
2011 XXX
2012 XXX
2013 XXX
I've tried the query below:
WITH all_dates AS
(SELECT start_date + LEVEL - 1 AS a_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= end_date + 1 - start_date
)
SELECT TO_CHAR ( TRUNC (a_date, 'YEAR') , 'YYYY' ) AS YEAR,
COUNT (*) AS num_days
FROM all_dates
WHERE a_date - TRUNC (a_date, 'IW') < 7
GROUP BY TRUNC (a_date, 'YEAR')
ORDER BY TRUNC (a_date, 'YEAR') ;
I got the exact output:
Year Number of day
------------------------------
2011 286
2012 366
2013 176
My question: if I use CONNECT BY, the query takes a long time to execute because I have millions of records in the table, so I don't want to use the CONNECT BY clause.
The CONNECT BY clause creates virtual rows for each record.
Any help or suggestion would be greatly appreciated.
From your vague expected results I think you want the number of records between those dates, not the number of days; but it's rather unclear. Since you refer to a table in the question I assume you want something related to the table data, not simply days between two dates which wouldn't depend on a table at all. (I have no idea what the connect by clause reference means though). This should give you that, if it is what you want:
select extract(year from date_field), count(*)
from t42
where date_field >= to_date('01-Jun-2011', 'DD-MON-YYYY')
and date_field < to_date('31-Dec-2013', 'DD-MON-YYYY') + interval '1' day
group by extract(year from date_field)
order by extract(year from date_field);
The where clause is as you'd expect between two dates; I've assumed there might be times in your date field (i.e. not all at midnight) and that you want to count all records on the last date in your range. Then it's grouping and counting based on the year for each record.
If you want the number of days that have records within the range, then you can just vary the count slightly:
select extract(year from date_field), count(distinct trunc(date_field))
...
You can use the query below to reduce the number of virtual rows by generating only the years in between. You can check the SQL Fiddle to compare the performance.
First, consider only the number of days between the start date and the end of that year (or the end date, if it is in the same year).
Then consider the full years in between, from the year after the start date to the year before the end date.
Finally, consider the number of days from the start of the end date's year to the end date.
Hence, instead of iterating over all the days between the start date and end date, we only need to iterate over the years.
WITH all_dates AS
(SELECT (TO_CHAR(START_DATE,'yyyy') + LEVEL - 1) YEARS_BETWEEN,start_date,end_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= (TO_CHAR(end_date,'yyyy')) - (TO_CHAR(start_date,'yyyy')-1)
)
SELECT DECODE(TO_CHAR(END_DATE,'yyyy'),YEARS_BETWEEN,END_DATE
,to_date('31-12-'||years_between,'dd-mm-yyyy'))
- DECODE(TO_CHAR(START_DATE,'yyyy'),YEARS_BETWEEN,START_DATE
,to_date('01-01-'||years_between,'dd-mm-yyyy'))+1,years_between
FROM ALL_DATES;
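The year-by-year idea can be checked with a short Python sketch. The function name days_per_year is made up for this example; the dates are the ones used in the query above.

```python
from datetime import date

def days_per_year(start, end):
    """Inclusive day count per calendar year, iterating over years only."""
    result = {}
    for year in range(start.year, end.year + 1):
        first = max(start, date(year, 1, 1))   # start date or Jan 1
        last = min(end, date(year, 12, 31))    # end date or Dec 31
        result[year] = (last - first).days + 1
    return result

totals = days_per_year(date(2011, 3, 21), date(2013, 6, 25))
print(totals)  # {2011: 286, 2012: 366, 2013: 176}
```

The loop runs once per year instead of once per day, which is the saving the query gets from generating rows per year rather than per date.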
In Oracle you can perform addition and subtraction on dates like this:
SELECT
TO_DATE('31-Dec-2013','dd-MM-yyyy') - TO_DATE('01-Jun-2011','dd-MM-yyyy')
DAYS FROM DUAL;
It will return the difference in days between the two dates.
select to_date(2011, 'yyyy'), to_date(2012, 'yyyy'), to_date(2013, 'yyyy')
from dual;
TO_DATE(2011,'Y TO_DATE(2012,'Y TO_DATE(2013,'Y
--------------- --------------- ---------------
01-MAY-11 01-MAY-12 01-MAY-13
select to_char(date_field,'yyyy'), count(*)
from your_table
where date_field between to_date('01-Jun-2011', 'DD-MON-YYYY')
and to_date('31-Dec-2013 23:59:59', 'DD-MON-YYYY hh24:mi:ss')
group by to_char(date_field,'yyyy')
order by to_char(date_field,'yyyy');