How to group by month when dateid is formatted as yyyymmdd - sql-server

I am working in SQL Server Management Studio. I am trying to get my query to group the number of repeat visitors by location by month for the year 2014-2015. I am almost done with the query, but I don't know how to group the result by month. What code should I insert to group it by month? Thanks!
SELECT
loc_id, date_id,
COUNT(visit_type) AS RepeatVisit
FROM
visit_fact
WHERE
visit_type = 'REPEAT'
AND date_id >= 20140101 AND date_id < 20150101
GROUP BY
loc_id, date_id
dateid is a varchar(8) column.

Assuming date_id is a text type, group on its first six characters (yyyymm). SQL Server has no substr(); use LEFT() or SUBSTRING():
SELECT loc_id, LEFT(date_id, 6) AS year_month, COUNT(visit_type) AS RepeatVisit
FROM visit_fact
WHERE visit_type = 'REPEAT'
AND date_id >= '20140101'
AND date_id < '20150101'
GROUP BY loc_id, LEFT(date_id, 6)
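If separate year and month columns are preferred over a yyyymm string, one option is to convert the text to a date first. This is only a sketch: the sample rows are made up for illustration, and CONVERT with style 112 (yyyymmdd) will raise an error on any value that is not a valid date, so TRY_CONVERT (SQL Server 2012 and later) may be safer on dirty data.

```sql
-- Hypothetical sample rows standing in for visit_fact; date_id is varchar(8) in yyyymmdd form.
WITH visit_fact (loc_id, date_id, visit_type) AS (
    SELECT * FROM (VALUES
        (1, '20140105', 'REPEAT'),
        (1, '20140120', 'REPEAT'),
        (1, '20140203', 'REPEAT'),
        (2, '20140110', 'NEW')
    ) AS v (loc_id, date_id, visit_type)
)
SELECT loc_id,
       YEAR(CONVERT(date, date_id, 112))  AS visit_year,   -- style 112 = yyyymmdd
       MONTH(CONVERT(date, date_id, 112)) AS visit_month,
       COUNT(*) AS RepeatVisit
FROM visit_fact
WHERE visit_type = 'REPEAT'
  AND date_id >= '20140101' AND date_id < '20150101'        -- string comparison is safe for fixed-width yyyymmdd
GROUP BY loc_id,
         YEAR(CONVERT(date, date_id, 112)),
         MONTH(CONVERT(date, date_id, 112))
ORDER BY loc_id, visit_year, visit_month;
```

With these sample rows, location 1 gets one row for January with two visits and one row for February with one visit; location 2 is filtered out because its only visit is not a repeat.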

Related

SQL- Calculating average of differences between times

I have an sql table that has transaction history of all the clients. I want to find what is the average difference in time between two transactions.
ClientCode Date
DL2xxx 2016-04-18 00:00:00.000
DL2xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
JDZxxx 2016-04-18 00:00:00.000
The rows above are the first few lines of the table; Date is the date on which the transaction happened. I want the average difference in days between successive transactions for each client. Say a client makes transactions on Day 1, Day 3, Day 10, and Day 15. The differences are {2, 7, 5}, whose average is 4.67. If a client has only one transaction, the result should be 0.
ClientCode AverageDays
DL2xxx <float_value>
E19xxx <float_value>
JDZxxx <float_value>
This is what the output should look like where each unique client code occurs only once.
You can use a query like the one below, assuming your table is named t:
select
ClientCode,
AvgDays = ISNULL(AVG(CAST(d AS float)), 0) -- cast first: AVG over an int column returns a truncated int
from
(
select
*,
d=DATEDIFF(
d,
dateofT,
LEAD(DateofT) over(
partition by ClientCode
order by DateofT asc ))
from t
)t
group by ClientCode
If Windowing functions aren't available to you, here's an alternative
--CREATE SAMPLE DATA
CREATE TABLE #TMP(ClientID INT, EventDate DATE)
GO
INSERT INTO #TMP VALUES
(1,DATEADD(DD,RAND()*365,'20180101'))
,(2,DATEADD(DD,RAND()*365,'20180101'))
,(3,DATEADD(DD,RAND()*365,'20180101'))
,(4,DATEADD(DD,RAND()*365,'20180101'))
,(5,DATEADD(DD,RAND()*365,'20180101'))
GO 50
--PRE SQL 2012 Compatible
SELECT A.ClientID
,AVG(CAST(DATEDIFF(DD,C.EventDate,A.EventDate) AS FLOAT)) AS ClientAvg -- cast to avoid integer averaging
FROM #TMP A
CROSS APPLY (SELECT ClientID, MAX(EventDate) EventDate FROM #TMP B
WHERE A.ClientID = B.ClientID AND A.EventDate > B.EventDate
GROUP BY ClientID) C
GROUP BY A.ClientID
ORDER BY A.ClientID
You can use the LAG() function to compare each date to the previous date for the same client, then group by client and calculate the average.
IF OBJECT_ID('tempdb..#Transactions') IS NOT NULL
DROP TABLE #Transactions
CREATE TABLE #Transactions (
ClientCode VARCHAR(100),
Date DATE)
INSERT INTO #Transactions (
ClientCode,
Date)
VALUES
('DL2', '2016-04-18'),
('DL2', '2016-04-19'),
('DL2', '2016-04-26'),
('E19', '2016-01-01'),
('E19', '2016-01-11'),
('E19', '2016-01-12')
;WITH DayDifferences AS
(
SELECT
T.ClientCode,
T.Date,
DayDifference = DATEDIFF(
DAY,
LAG(T.Date) OVER (PARTITION BY T.ClientCode ORDER BY T.Date ASC),
T.Date)
FROM
#Transactions AS T
)
SELECT
D.ClientCode,
AverageDayDifference = ISNULL(AVG(CONVERT(FLOAT, D.DayDifference)), 0) -- AVG skips the NULL of each client's first row; 0 only when there is a single transaction
FROM
DayDifferences AS D
GROUP BY
D.ClientCode
Using the observation that the sum of differences within a group is simply the max - min of that group, you can use the simple group by select:
select IIF(COUNT(*) > 1,
(CAST(DATEDIFF(day, MIN(DateofT), MAX(DateofT)) AS FLOAT)) / (COUNT(*) - 1), 0.0)
AS AVGDays, ClientCode
FROM t GROUP BY ClientCode
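The observation holds because successive differences telescope: for ordered dates d1, d2, ..., dn the sum (d2 - d1) + (d3 - d2) + ... + (dn - d(n-1)) collapses to dn - d1, and there are n - 1 differences. A minimal sketch that checks this against the Day 1, 3, 10, 15 example from the question; the sample rows and the column name DateofT are assumptions made for illustration:

```sql
-- Hypothetical data: one client transacting on days 1, 3, 10 and 15, and one client with a single transaction.
WITH t (ClientCode, DateofT) AS (
    SELECT * FROM (VALUES
        ('DL2xxx', CAST('2016-04-01' AS date)),
        ('DL2xxx', CAST('2016-04-03' AS date)),
        ('DL2xxx', CAST('2016-04-10' AS date)),
        ('DL2xxx', CAST('2016-04-15' AS date)),
        ('E19xxx', CAST('2016-04-18' AS date))
    ) AS v (ClientCode, DateofT)
)
SELECT ClientCode,
       CASE WHEN COUNT(*) > 1
            -- (max - min) / (n - 1): here (15 - 1) / (4 - 1) = 14 / 3
            THEN CAST(DATEDIFF(day, MIN(DateofT), MAX(DateofT)) AS float) / (COUNT(*) - 1)
            ELSE 0.0
       END AS AvgDays
FROM t
GROUP BY ClientCode;
```

For DL2xxx this gives 14 / 3, matching the average of {2, 7, 5}; E19xxx has a single row and falls into the ELSE branch. CASE is used here instead of IIF only so the sketch also runs on versions before SQL Server 2012.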

T-SQL Get Records for this year grouped by month

I have a table of data which looks like this
ID CreatedDate
A123 2015-01-01
B124 2016-01-02
A125 2016-01-03
A126 2016-01-04
What I would like to do is group by month (as text) for this year only. I have come up with the following query, but it returns data from all years, not just this one:
Select Count(ID), DateName(month,createddate) from table
Where (DatePart(year,createddate)=datepart(year,getdate())
Group by DateName(month,createddate)
This returns
Count CreatedDate
4 January
Instead of
Count CreatedDate
3 January
Where have I gone wrong? I'm sure it's something to do with converting the date to month where it goes wrong
Just tested your code:
;WITH [table] AS (
SELECT *
FROM (VALUES
('A123', '2015-01-01'),
('B124', '2016-01-02'),
('A125', '2016-01-03'),
('A126', '2016-01-04')
) as t(ID, CreatedDate)
)
SELECT COUNT(ID),
DATENAME(month,CreatedDate)
FROM [table]
WHERE DATEPART(year,CreatedDate)=DATEPART(year,getdate())
GROUP BY DATENAME(month,CreatedDate)
Output was
3 January
The only change I made was removing the stray ( after WHERE.
select count(id) as Count,
case when month(createddate)=1 THEN 'January' END as CreatedDate
from [table]
--where year(createddate)=2016 optional if you only want the 2016 count
group by month(createddate),year(createdDate)
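As a variation, the year can be filtered with a date range instead of wrapping the column in a function, which keeps the predicate usable by an index on CreatedDate. This is only a sketch: the @Year variable is introduced here for illustration, and DATEFROMPARTS needs SQL Server 2012 or later.

```sql
-- Fixed to 2016 so the sample rows match; use YEAR(GETDATE()) for the current year.
DECLARE @Year int = 2016;

WITH [table] (ID, CreatedDate) AS (
    SELECT ID, CAST(CreatedDate AS date)
    FROM (VALUES
        ('A123', '2015-01-01'),
        ('B124', '2016-01-02'),
        ('A125', '2016-01-03'),
        ('A126', '2016-01-04')
    ) AS v (ID, CreatedDate)
)
SELECT COUNT(ID) AS [Count],
       DATENAME(month, CreatedDate) AS CreatedMonth
FROM [table]
WHERE CreatedDate >= DATEFROMPARTS(@Year, 1, 1)        -- first day of the year, inclusive
  AND CreatedDate <  DATEFROMPARTS(@Year + 1, 1, 1)    -- first day of the next year, exclusive
GROUP BY DATENAME(month, CreatedDate), MONTH(CreatedDate)
ORDER BY MONTH(CreatedDate);                           -- calendar order rather than alphabetical
```

With the sample rows this returns a single row, 3 for January. Grouping on MONTH() alongside DATENAME() is there only so the months can be sorted in calendar order.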

GROUP BY DAY, CUMULATIVE SUM

I have a table in MSSQL with the following structure:
PersonId
StartDate
EndDate
I need to be able to show the number of distinct people in the table within a date range or at a given date.
As an example, I need to show the running totals per day, e.g. if we have 2 entries on the 1st June, 3 on the 2nd June and 1 on the 3rd June, the system should show the following result:
1st June: 2
2nd June: 5
3rd June: 6
If, however, one of the entries on the 2nd June also has an end date of the 2nd June, then the 3rd June result would show just 5.
Would someone be able to assist with this.
Thanks
UPDATE
This is what I have so far, which seems to work. Is there a better solution, though? My solution only gets me the employed figures. I also need the unemployed count in another column; unemployed means a person either has no entry in the table, or has no entry whose date range covers the given date.
CREATE TABLE #Temp(CountTotal int NOT NULL, CountDate datetime NOT NULL);
DECLARE @StartDT DATETIME
SET @StartDT = '2015-01-01 00:00:00'
WHILE @StartDT < '2015-08-31 00:00:00'
BEGIN
INSERT INTO #Temp(CountTotal, CountDate)
SELECT COUNT(DISTINCT PERSON.Id) AS CountTotal, @StartDT AS CountDate FROM PERSON
INNER JOIN DATA_INPUT_CHANGE_LOG ON PERSON.DataInputTypeId = DATA_INPUT_CHANGE_LOG.DataInputTypeId AND PERSON.Id = DATA_INPUT_CHANGE_LOG.DataItemId
LEFT OUTER JOIN PERSON_EMPLOYMENT ON PERSON.Id = PERSON_EMPLOYMENT.PersonId
WHERE PERSON.Id > 0 AND DATA_INPUT_CHANGE_LOG.Hidden = '0' AND DATA_INPUT_CHANGE_LOG.Approved = '1'
AND ((PERSON_EMPLOYMENT.StartDate <= DATEADD(MONTH,1,@StartDT) AND PERSON_EMPLOYMENT.EndDate IS NULL)
OR (@StartDT BETWEEN PERSON_EMPLOYMENT.StartDate AND PERSON_EMPLOYMENT.EndDate) AND PERSON_EMPLOYMENT.EndDate IS NOT NULL)
SET @StartDT = DATEADD(MONTH,1,@StartDT)
END
select * from #Temp
drop TABLE #Temp
You can use the following query. The cte part is to generate a set of serial dates between the start date and end date.
DECLARE @ViewStartDate DATETIME
DECLARE @ViewEndDate DATETIME
SET @ViewStartDate = '2015-01-01 00:00:00.000';
SET @ViewEndDate = '2015-02-25 00:00:00.000';
;WITH Dates([Date])
AS
(
SELECT @ViewStartDate
UNION ALL
SELECT DATEADD(DAY, 1,Date)
FROM Dates
WHERE DATEADD(DAY, 1,Date) <= @ViewEndDate
)
-- Count a column from the joined table so that dates with no match show 0, not 1
SELECT [Date], COUNT(PersonData.PersonId)
FROM Dates
LEFT JOIN PersonData ON Dates.Date >= PersonData.StartDate
AND Dates.Date <= PersonData.EndDate
GROUP By [Date]
Replace PersonData with your table name.
If the StartDate and EndDate columns can be NULL, you need to add additional conditions to the join.
This assumes each person has only one record covering any given date.
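A minimal sketch of what those additional conditions might look like, treating a NULL EndDate as "still active" and counting distinct people so overlapping records for the same person are not double counted. The sample rows and the PersonId column name are assumptions for illustration, and DECLARE with an initializer needs SQL Server 2008 or later.

```sql
DECLARE @ViewStartDate date = '2015-01-01';
DECLARE @ViewEndDate   date = '2015-01-05';

WITH PersonData (PersonId, StartDate, EndDate) AS (
    -- Hypothetical rows; person 3 has no end date yet.
    SELECT * FROM (VALUES
        (1, CAST('2015-01-01' AS date), CAST('2015-02-01' AS date)),
        (2, CAST('2015-01-02' AS date), CAST('2015-01-03' AS date)),
        (3, CAST('2015-01-01' AS date), CAST(NULL AS date))
    ) AS v (PersonId, StartDate, EndDate)
),
Dates ([Date]) AS (
    -- Recursive CTE producing one row per day in the requested window.
    SELECT @ViewStartDate
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date])
    FROM Dates
    WHERE DATEADD(DAY, 1, [Date]) <= @ViewEndDate
)
SELECT d.[Date],
       COUNT(DISTINCT p.PersonId) AS PersonCount   -- 0 for days with nobody active
FROM Dates AS d
LEFT JOIN PersonData AS p
       ON p.StartDate <= d.[Date]
      AND (p.EndDate IS NULL OR p.EndDate >= d.[Date])   -- open-ended rows stay active
GROUP BY d.[Date]
ORDER BY d.[Date]
OPTION (MAXRECURSION 0);   -- lift the default limit of 100 recursion levels for longer ranges
```

Whether an end date counts as the last active day (as here, with >=) or as the first inactive day is a business rule; adjust the comparison accordingly.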
You could do this by creating data where every start date is a +1 event and end date is -1 and then calculate a running total on top of that.
For example if your data is something like this
PersonId StartDate EndDate
1 20150101 20150201
2 20150102 20150115
3 20150101
You first create a data set that looks like this:
EventDate ChangeValue
20150101 +2
20150102 +1
20150115 -1
20150201 -1
And if you use running total, you'll get this:
EventDate Total
2015-01-01 2
2015-01-02 3
2015-01-15 2
2015-02-01 1
You can get it with something like this:
select
p.eventdate,
sum(p.changevalue) over (order by p.eventdate asc) as total
from
(
select startdate as eventdate, sum(1) as changevalue from personnel group by startdate
union all
select enddate, sum(-1) from personnel where enddate is not null group by enddate
) p
order by p.eventdate asc
Using SUM() as a window function with ORDER BY requires SQL Server 2012 or later. If you're using an older version, you can check other options for running totals.
My example in SQL Fiddle
If you have dates that don't have any events and you need to show those too, then the best option is probably to create a separate table of dates for the whole range you'll ever need, for example 1.1.2000 - 31.12.2099.
-- Edit --
To get count for a specific day, it's possible use the same logic, but just sum everything up to that day:
declare @eventdate date
set @eventdate = '20150117'
select
sum(p.changevalue)
from
(
select startdate as eventdate, 1 as changevalue from personnel
where startdate <= @eventdate
union all
select enddate, -1 from personnel
where enddate < @eventdate
) p
Hopefully this is ok, can't test since SQL Fiddle seems to be unavailable.

Get record based on year in oracle

I am creating a query to give the number of days between two dates, broken down by year. I have the following kind of date range:
From Date: TO_DATE('01-Jun-2011','dd-MM-yyyy')
To Date: TO_DATE('31-Dec-2013','dd-MM-yyyy')
My Result should be:
Year Number of day
------------------------------
2011 XXX
2012 XXX
2013 XXX
I've tried below query
WITH all_dates AS
(SELECT start_date + LEVEL - 1 AS a_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= end_date + 1 - start_date
)
SELECT TO_CHAR ( TRUNC (a_date, 'YEAR') , 'YYYY' ) AS YEAR,
COUNT (*) AS num_days
FROM all_dates
WHERE a_date - TRUNC (a_date, 'IW') < 7
GROUP BY TRUNC (a_date, 'YEAR')
ORDER BY TRUNC (a_date, 'YEAR') ;
I got exact output
Year Number of day
------------------------------
2011 286
2012 366
2013 176
My problem is that if I use CONNECT BY, the query takes a long time to execute because I have millions of records in the table, so I don't want to use the CONNECT BY clause.
The CONNECT BY clause creates virtual rows for each record.
Any help or suggestion would be greatly appreciated.
From your vague expected results I think you want the number of records between those dates, not the number of days; but it's rather unclear. Since you refer to a table in the question I assume you want something related to the table data, not simply days between two dates which wouldn't depend on a table at all. (I have no idea what the connect by clause reference means though). This should give you that, if it is what you want:
select extract(year from date_field), count(*)
from t42
where date_field >= to_date('01-Jun-2011', 'DD-MON-YYYY')
and date_field < to_date('31-Dec-2013', 'DD-MON-YYYY') + interval '1' day
group by extract(year from date_field)
order by extract(year from date_field);
The where clause is as you'd expect between two dates; I've assumed there might be times in your date field (i.e. not all at midnight) and that you want to count all records on the last date in your range. Then it's grouping and counting based on the year for each record.
SQL Fiddle.
If you want the number of days that have records within the range, then you can just vary the count slightly:
select extract(year from date_field), count(distinct trunc(date_field))
...
SQL Fiddle.
You can use the query below to reduce the number of virtual rows by iterating over years instead of days. You can check the SQLFIDDLE to compare the performance.
First, count the days from the start date to the end of that year (or to the end date, if it falls in the same year).
Then, count every full year between the year after the start date and the year before the end date.
Finally, count the days from the start of the end date's year to the end date.
So instead of generating a row for every day between the start and end dates, we only generate one row per year.
WITH all_dates AS
(SELECT (TO_CHAR(START_DATE,'yyyy') + LEVEL - 1) YEARS_BETWEEN,start_date,end_date
FROM
(SELECT TO_DATE ('21/03/2011', 'DD/MM/YYYY') AS start_date ,
TO_DATE ('25/06/2013', 'DD/MM/YYYY') AS end_date
FROM dual
)
CONNECT BY LEVEL <= (TO_CHAR(end_date,'yyyy')) - (TO_CHAR(start_date,'yyyy')-1)
)
SELECT DECODE(TO_CHAR(END_DATE,'yyyy'),YEARS_BETWEEN,END_DATE
,to_date('31-12-'||years_between,'dd-mm-yyyy'))
- DECODE(TO_CHAR(START_DATE,'yyyy'),YEARS_BETWEEN,START_DATE
,to_date('01-01-'||years_between,'dd-mm-yyyy'))+1,years_between
FROM ALL_DATES;
In Oracle you can add and subtract dates directly:
SELECT
TO_DATE('31-Dec-2013','DD-MON-YYYY') - TO_DATE('01-Jun-2011','DD-MON-YYYY')
DAYS FROM DUAL;
This returns the difference in days between the two dates.
select to_date(2011, 'yyyy'), to_date(2012, 'yyyy'), to_date(2013, 'yyyy')
from dual;
TO_DATE(2011,'Y TO_DATE(2012,'Y TO_DATE(2013,'Y
--------------- --------------- ---------------
01-MAY-11 01-MAY-12 01-MAY-13
select to_char(date_field,'yyyy'), count(*)
from your_table
where date_field between to_date('01-Jun-2011', 'DD-MON-YYYY')
and to_date('31-Dec-2013 23:59:59', 'DD-MON-YYYY hh24:mi:ss')
group by to_char(date_field,'yyyy')
order by to_char(date_field,'yyyy');

SQL Server multiple date languages in same query

On a datawarehouse project with SSIS/SSAS, I have to generate my own time dimension because I've personal data to integrate with.
My problem is with SSAS, because I also need to integrate translations. After reading the documentation, I found that I can set the language for the current session with SET LANGUAGE ENGLISH, but I can't use a different language for individual fields of the same query.
Is there a way to generate MONTH_NAME in French and also get MONTH_NAME_DE in German ?
Here is the script that I've found on Internet
WITH Mangal as
(
SELECT Cast ('1870-01-01' as DateTime) Date --Start Date
UNION ALL
SELECT Date + 1
FROM Mangal
WHERE Date + 1 < = '2015-12-31' --End date
)
SELECT
Row_Number() OVER (ORDER BY Date) as ID
, Date as DATE_TIME
, YEAR (date) as YEAR_NB
, MONTH (date) as MONTH_NB
, DAY (date) as DAY_NUMBER
, DateName (mm, date) as MONTH_NAME
, LEFT ( DateName (mm, date), 3) KMONTH_NAME
, DateName (dw, date) as DAY_NAME
, LEFT (DateName (dw, date), 3) as KDAY_NAME
, (SELECT TOP 1 FIELD
FROM TABLEXY
WHERE Date BETWEEN TABLEXY.DATE_FROM AND TABLEXY.DATE_TO
AND LANGAGE = 'FR'
) as PERSONAL_FIELD
, (SELECT TOP 1 FIELD
FROM TABLEXY
WHERE Date BETWEEN TABLEXY.DATE_FROM AND TABLEXY.DATE_TO
AND LANGAGE = 'DE'
) as PERSONAL_FIELD_DE
FROM Mangal
OPTION (MAXRECURSION 0)
SQL Server has a table containing names of Months and week days. However, they are stored as comma delimited values:
select
months,
shortmonths,
days
from
master.dbo.syslanguages
where
alias in ('English','French', 'German')
You might use this in your query.
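Another option, if you are on SQL Server 2012 or later, may be the FORMAT() function, which accepts a culture argument per call and so does not depend on the session's SET LANGUAGE. A minimal sketch with two made-up dates; the column aliases follow the question's naming, and the culture codes 'fr-FR' and 'de-DE' are assumptions about which locales are wanted:

```sql
-- Two hypothetical dates standing in for the rows of the time dimension.
SELECT d.[Date],
       FORMAT(d.[Date], 'MMMM', 'fr-FR') AS MONTH_NAME,      -- full month name in French
       FORMAT(d.[Date], 'MMMM', 'de-DE') AS MONTH_NAME_DE,   -- full month name in German
       FORMAT(d.[Date], 'dddd', 'fr-FR') AS DAY_NAME,        -- full weekday name in French
       FORMAT(d.[Date], 'dddd', 'de-DE') AS DAY_NAME_DE      -- full weekday name in German
FROM (VALUES
        (CAST('2015-03-15' AS date)),
        (CAST('2015-12-01' AS date))
     ) AS d ([Date]);
```

FORMAT() relies on the .NET runtime and can be noticeably slower than DATENAME() on large row counts; for a one-off load of a date dimension that is usually acceptable, but it is worth measuring.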
