T-SQL Get Records for this year grouped by month - sql-server

I have a table of data which looks like this
ID CreatedDate
A123 2015-01-01
B124 2016-01-02
A125 2016-01-03
A126 2016-01-04
What I would like to do is group by month (as text) for this year only. I have some up with the following query but it returns data from all years not just this one:
Select Count(ID), DateName(month,createddate) from table
Where (DatePart(year,createddate)=datepart(year,getdate())
Group by DateName(month,createddate)
This returns
Count CreatedDate
4 January
Instead of
Count CreatedDate
3 January
Where have I gone wrong? I'm sure it's something to do with converting the date to month where it goes wrong

Just tested your code:
;WITH [table] AS (
SELECT *
FROM (VALUES
('A123', '2015-01-01'),
('B124', '2016-01-02'),
('A125', '2016-01-03'),
('A126', '2016-01-04')
) as t(ID, CreatedDate)
)
SELECT COUNT(ID),
DATENAME(month,CreatedDate)
FROM [table]
WHERE DATEPART(year,CreatedDate)=DATEPART(year,getdate())
GROUP BY DATENAME(month,CreatedDate)
Output was
3 January
I removed ( near WHERE

select count(id) as Count,
case when month(createddate)=1 THEN 'Januray' END as CreatedDate
from [table]
--where year(createddate)=2016 optional if you only want the 2016 count
group by month(createddate),year(createdDate)

Related

Calculating Year to Date Total

I want to generate a Payroll type query whereby the values in Payroll 1 (say for the previous month) should be included in Payroll 2 (for the current month) Year-to-Date Totals.
This can best be explained with an example:
DECLARE #MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO #MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM #MyTable
When I run the above I get
ID EmployeeID PayrollID Description Current Month Month to Date
1 1 1 Basic Salary 100 100
2 1 1 Normal Over Time 50 50
3 1 1 Work on Saturday 150 150
4 1 1 Work on Sunday 200 200
5 1 2 Basic Salary 100 200
The Year-to-Date running totals are per each Description meaning Basic Salary Category has its own running total and so does Saturday and Sunday etc, etc. You will notice that for Basic Salary in Payroll 2 the running Year-to-Date total is 200 (i.e. 100 from Payroll 1 + 100 from Payroll 2)
The challenge I have is that Payroll 1 has data for Basic Salary, Work on Saturday and Work on Sunday whereas Payroll 2 only has Basic Salary as the employee did not work on Saturday nor on Sunday in Payroll 2 (the current month).
However, in the cumulative Year-to-Date column the data from Payroll 1 (previous month) should still be selected and included in the Year-to-Date running Total -
something like this:
ID EmployeeID PayrollID Description Current Month Month to Date
1 1 1 Basic Salary 100 100
2 1 1 Normal Over Time 50 50
3 1 1 Work on Saturday 150 150
4 1 1 Work on Sunday 200 200
5 1 2 Basic Salary 100 200
2 1 1 Normal Over Time NULL 50
3 1 1 Work on Saturday NULL 150
4 1 1 Work on Sunday NULL 200
Although the employee did not work on Saturday nor Sunday in the current month (Payroll 2) the running (Year-to-Date) totals for working on a Saturday should be 150 that he/she worked in the previous month (Payroll 1). The same should apply to working on Sunday where the running total in the current month (Payroll 2) should be the 200 that he/she worked in the previous month (Payroll 1).
How do I do that with a simple Select Statement without writing a complicated Procedure?
EDIT:
I have cleaned up the ode as follows:
DECLARE #MyTable TABLE(ID INT IDENTITY, EmployeeID INT, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO #MyTable
VALUES (1,1,'Basic Salary',100),
(1,1,'Normal Over Time',50),
(1,1,'Work on Saturday',150),
(1,1,'Work on Sunday',200),
(1,2,'Basic Salary',100)
WITH pay_elements AS
(
SELECT Description
FROM #MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM #MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,PS.EmployeeID
,PS.PayrollID
,PE.Description
,ISNULL(mt.[Current Month], 0) AS [Current Month]
FROM
pay_slips AS ps
OUTER APPLY
pay_elements AS pe
LEFT JOIN
#MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
And I get this error:
Msg 319, Level 15, State 1, Line 10
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Line 17
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Line 23
Incorrect syntax near ','.
You first need to build a "structure" of row headings, and then join that onto the actual data.
So for example:
WITH pay_elements AS
(
SELECT Description
FROM #MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM #MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,pay_slips.EmployeeID
,pay_slips.PayrollID
,pay_elements.Description
,ISNULL(mt.Current_Month, 0) AS Current_Month
FROM
pay_slips AS ps
OUTER APPLY
pay_elements AS pe
LEFT JOIN
#MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
What we're doing here is getting a list of the different kind of pay elements in your table. Then we're getting a list of Employees and Payrolls done to date, and manually forcing every Payroll to include a row in respect of all possible pay elements.
Once that structure is built, we join onto the base table to get the actual values (replacing NULLs with zeros, for those pay elements that weren't originally included in the base table).
Then we simply query this padded-out table in the same way you did originally.
Note, I've written this on the fly and haven't checked this code so please excuse any minor errors.
I am little confused with the column you mentioned Year-to-Date in your description. I assume this might be [Month to Date] column present in your query. Please correct me if I am wrong.
I think what you are trying to achieve is - the descriptions which are not present in payroll ID 2 like Work on Saturday and Work on Sunday should also be selected below the result set.
Problem is:
Summation of NULL value is always NULL so if [Current Month] value is NULL then you can not achieve to display 50,150,200 in the [Month to Date] column
You can have fixed categories against each payroll id:
Normal Over Time
Work on Saturday
Work on Sunday
Basic Salary
Query:
DECLARE #MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO #MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100),
(2,'Normal Over Time',0),
(2,'Work on Saturday',0),
(2,'Work on Sunday',0)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM #MyTable order by ID,PayrollID

How to group by month when dateid is formatted as yyyymmdd

I am working in SQL Server Management Studio. I am trying to get my query to group the number of repeat visitors by location by month for the year 2014-2015. I am almost done with the query, but I don't know how to group the result by month. What code should I insert to group it by month? Thanks!
SELECT
loc_id, date_id,
COUNT(visit_type) AS RepeatVisit
FROM
visit_fact
WHERE
visit_type = 'REPEAT'
AND date_id >= 20140101 AND date_id < 20150101
GROUP BY
loc_id, date_id
dateid is a varchar(8) column.
I assume dateid is a text type, use substr():
Select loc_id, substr(date_id,1,6) as year_month, count(visit_type) as RepeatVisit
FROM visit_fact
where visit_type = 'REPEAT'
and date_id >= 20140101
and date_id < 20150101
group by loc_id, substr(date_id,1,6)

SQL- Finding a gap that is x amount of months with the same foreign key

I am editing this to clarify my question.
Let's say I have a table that holds patient information. I need to find new patients for this year, and the date of their prescription first prescription when they were considered new. Anytime there is a six month gap they are considered a new patient.
How do I accomplish this using SQL. I can do this in Java and any other imperative language easily enough, but I am having problems doing this in SQL. I need this script to be run in Crystal by non-SQL users
Table:
Patient ID Prescription Date
-----------------------------------------
1 12/31/16
1 03/13/17
2 10/10/16
2 05/11/17
2 06/11/17
3 01/01/17
3 04/20/17
4 01/31/16
4 01/01/17
4 07/02/17
So Patients 2 and 4 are considered new patients. Patient 4 is considered a new patient twice, so I need dates for each time patient 4 was considered new 1/1/17 and 7/2/17. Patients 1 and 3 are not considered new this year.
So far I have the code below which tells me if they are new this year, but not if they had another six month gap this year.
SELECT DISTINCT
this_year.patient_id
,this_year.date
FROM (SELECT
patient_id
,MIN(prescription_date) as date
FROM table
WHERE prescription_date BETWEEN '2017-01-01 00:00:00.000' AND '2017-
12-31 00:00:00.000'
GROUP BY [patient_id]) AS this_year
LEFT JOIN (SELECT
patient_id
,MAX(prescription_date) as date
FROM table
WHERE prescription_date BETWEEN '2016-01-01 00:00:00.000' AND '2016-
12-31 00:00:00.000'
GROUP BY [patient_id]) AS last_year
WHERE DATEDIFF(month, last_year.date, this_year.date) > 6
OR last_year.date IS NULL
Patient 2 in your example does not meet the criteria you specified ... that being said ...
You can try something like this ... untested but should be similar (assuming you can put this in a stored procedure):
WITH ordered AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY [Prescription Date]) rn
FROM table1
)
SELECT o1.[PatientID], DATEDIFF(s, o1.[Prescription Date], o2.[Prescription Date]) diff
FROM ordered o1 JOIN ordered o2
ON o1.rn + 1 = o2.rn
WHERE DATEDIFF(m, o1.[Prescription Date], o2.[Prescription Date]) > 6
Replace table1 with the name of your table.
I assume that you mean the patient has not been prescribed in the last 6 months.
SELECT DISTINCT user_id
FROM table_name
WHERE prescribed_date >= DATEADD(month, -6, GETDATE())
This gives you the list of users that have been prescribed in the last 6 months. You want the list of users that are not in this list.
SELECT DISTINCT user_id
FROM table_name
WHERE user_id NOT IN (SELECT DISTINCT user_id
FROM table_name
WHERE prescribed_date >= DATEADD(month, -6, GETDATE()))
You'll need to amend the field and table names.

T-SQL Query to remove duplicate records in the output based on one particular column

I am running SQL Server 2014 and I have the following T-SQL query:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
RESERVATIONLIST mentioned in the code above is a view. The query gives me the following output (extract):
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
258 JV 2015-01-28 2015-02-03 2 FEBRUARY 2015
The above output consist of around 12,000 records.
I need to modify my query so that it eliminates all duplicate ID and give me the following results:
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
I tried something like this, but it's not working:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015', 'FEBRUARY 2015')
GROUP BY [ID]
HAVING COUNT ([MTH]) > 1
Following query will return one row per ID :
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
Note : this will return a random row from multiple rows having same ID. IF you want to select some specific row then you have to define it in order by. For e.g. :
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DOA DESC) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
definitely, it will return the row having max(DOA).
You are trying to do a GROUP BY statement which IMHO is the right way to go. You should formulate all columns that are a constant, and roll-up the others. Depending on the value of DOD and DOA I can see two solutions:
SELECT ID,NAME,DOA,DOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME,DOA,DOD
OR
SELECT ID,NAME,min(DOA) as firstDOA,max(DOD) as lastDOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME

GROUP BY DAY, CUMULATIVE SUM

I have a table in MSSQL with the following structure:
PersonId
StartDate
EndDate
I need to be able to show the number of distinct people in the table within a date range or at a given date.
As an example i need to show on a daily basis the totals per day, e.g. if we have 2 entries on the 1st June, 3 on the 2nd June and 1 on the 3rd June the system should show the following result:
1st June: 2
2nd June: 5
3rd June: 6
If however e.g. on of the entries on the 2nd June also has an end date that is 2nd June then the 3rd June result would show just 5.
Would someone be able to assist with this.
Thanks
UPDATE
This is what i have so far which seems to work. Is there a better solution though as my solution only gets me employed figures. I also need unemployed on another column - unemployed would mean either no entry in the table or date not between and no other entry as employed.
CREATE TABLE #Temp(CountTotal int NOT NULL, CountDate datetime NOT NULL);
DECLARE #StartDT DATETIME
SET #StartDT = '2015-01-01 00:00:00'
WHILE #StartDT < '2015-08-31 00:00:00'
BEGIN
INSERT INTO #Temp(CountTotal, CountDate)
SELECT COUNT(DISTINCT PERSON.Id) AS CountTotal, #StartDT AS CountDate FROM PERSON
INNER JOIN DATA_INPUT_CHANGE_LOG ON PERSON.DataInputTypeId = DATA_INPUT_CHANGE_LOG.DataInputTypeId AND PERSON.Id = DATA_INPUT_CHANGE_LOG.DataItemId
LEFT OUTER JOIN PERSON_EMPLOYMENT ON PERSON.Id = PERSON_EMPLOYMENT.PersonId
WHERE PERSON.Id > 0 AND DATA_INPUT_CHANGE_LOG.Hidden = '0' AND DATA_INPUT_CHANGE_LOG.Approved = '1'
AND ((PERSON_EMPLOYMENT.StartDate <= DATEADD(MONTH,1,#StartDT) AND PERSON_EMPLOYMENT.EndDate IS NULL)
OR (#StartDT BETWEEN PERSON_EMPLOYMENT.StartDate AND PERSON_EMPLOYMENT.EndDate) AND PERSON_EMPLOYMENT.EndDate IS NOT NULL)
SET #StartDT = DATEADD(MONTH,1,#StartDT)
END
select * from #Temp
drop TABLE #Temp
You can use the following query. The cte part is to generate a set of serial dates between the start date and end date.
DECLARE #ViewStartDate DATETIME
DECLARE #ViewEndDate DATETIME
SET #ViewStartDate = '2015-01-01 00:00:00.000';
SET #ViewEndDate = '2015-02-25 00:00:00.000';
;WITH Dates([Date])
AS
(
SELECT #ViewStartDate
UNION ALL
SELECT DATEADD(DAY, 1,Date)
FROM Dates
WHERE DATEADD(DAY, 1,Date) <= #ViewEndDate
)
SELECT [Date], COUNT(*)
FROM Dates
LEFT JOIN PersonData ON Dates.Date >= PersonData.StartDate
AND Dates.Date <= PersonData.EndDate
GROUP By [Date]
Replace the PersonData with your table name
If startdate and enddate columns can be null, then you need to add
addditional conditions to the join
It assumes one person has only one record in the same date range
You could do this by creating data where every start date is a +1 event and end date is -1 and then calculate a running total on top of that.
For example if your data is something like this
PersonId StartDate EndDate
1 20150101 20150201
2 20150102 20150115
3 20150101
You first create a data set that looks like this:
EventDate ChangeValue
20150101 +2
20150102 +1
20150115 -1
20150201 -1
And if you use running total, you'll get this:
EventDate Total
2015-01-01 2
2015-01-02 3
2015-01-15 2
2015-02-01 1
You can get it with something like this:
select
p.eventdate,
sum(p.changevalue) over (order by p.eventdate asc) as total
from
(
select startdate as eventdate, sum(1) as changevalue from personnel group by startdate
union all
select enddate, sum(-1) from personnel where enddate is not null group by enddate
) p
order by p.eventdate asc
Having window function with sum() requires SQL Server 2012. If you're using older version, you can check other options for running totals.
My example in SQL Fiddle
If you have dates that don't have any events and you need to show those too, then the best option is probably to create a separate table of dates for the whole range you'll ever need, for example 1.1.2000 - 31.12.2099.
-- Edit --
To get count for a specific day, it's possible use the same logic, but just sum everything up to that day:
declare #eventdate date
set #eventdate = '20150117'
select
sum(p.changevalue)
from
(
select startdate as eventdate, 1 as changevalue from personnel
where startdate <= #eventdate
union all
select enddate, -1 from personnel
where enddate < #eventdate
) p
Hopefully this is ok, can't test since SQL Fiddle seems to be unavailable.

Resources