Joining multiple date fields to calendar table - sql-server

I have a table, let's call it Records, where the relevant data is organized as such:
| Employee | SubmissionDate | FirstReviewDate | SecondReviewDate |
| Anne     | 2017-10-02     | 2017-10-03      | 2017-10-10       |
| Bernard  | 2017-10-03     | 2017-10-05      | 2017-10-10       |
| Charlene | 2017-10-06     | 2017-10-09      | 2017-10-09       |
| Danielle | 2017-10-02     | 2017-10-03      | 2017-10-09       |
| Anne     | 2017-10-03     | 2017-10-03      | 2017-10-09       |
Every time an employee makes a submission, a new entry is added with a SubmissionDate. The record is later edited to include when the first and second reviews take place.
I also have a calendar table called Calendar with a field called TheDate, which has a row for every day of this year. What I would like to do is associate SubmissionDate, FirstReviewDate, and SecondReviewDate with Calendar.TheDate so that I can do a count of all three fields for any given day.
I have tried the following code:
SELECT Employee,
Count(SubmissionDate) AS "Submissions",
Count(FirstReviewDate) AS "First Reviews",
Count(SecondReviewDate) AS "Second Reviews"
FROM Records
LEFT JOIN Calendar ON Records.SubmissionDate = Calendar.TheDate
AND Records.FirstReviewDate = Calendar.TheDate
AND Records.SecondReviewDate = Calendar.TheDate
WHERE TheDate = '2017-10-11'
All of the variations of this I have tried output nothing:
| Employee | Submissions | First Reviews | Second Reviews |
My desired code would look like this:
SELECT Employee,
Count(SubmissionDate),
Count(FirstReviewDate),
Count(SecondReviewDate)
FROM ???
WHERE TheDate = '2017-10-03'
where ??? would be the proper join. Using the desired code block (including the WHERE clause), the desired output would look like this for the example data provided:
| Employee | Submissions | First Reviews | Second Reviews |
| Anne     | 1           | 2             | 0              |
| Bernard  | 1           | 0             | 0              |
| Charlene | 0           | 0             | 0              |
| Danielle | 0           | 1             | 0              |
I am not sure how to do this join. I have read many resources about calendar tables, but the examples always include joining tables that each have a single date identifier. My problem is that I have one table with three date fields that need to be associated with Calendar.TheDate.
I am arranging my data like this so that the data can be visualized in Qlik Sense. I would like users to be able to select TheDate from a filter panel and have it aggregate all three fields for the date specified (essentially identical to the WHERE clause in my example code).

Try the code below (here DimDate and its Date_Value column play the role of your Calendar table and TheDate):
create table #recs (
Employee varchar(10),
SubmissionDate date,
FirstReviewDate date,
SecondReviewDate date
)
insert into #recs values
('Anne', '2017-10-02', '2017-10-03', '2017-10-10'),
('Bernard', '2017-10-03', '2017-10-05', '2017-10-10'),
('Charlene', '2017-10-06', '2017-10-09', '2017-10-09'),
('Danielle', '2017-10-02', '2017-10-03', '2017-10-09'),
('Anne', '2017-10-03', '2017-10-03', '2017-10-09')
select Employee
, sum(IIF(SubD.Date_Value = #recs.SubmissionDate, 1, 0)) AS Submissions
, sum(IIF(SubD.Date_Value = #recs.FirstReviewDate, 1, 0)) AS [First Reviews]
, sum(IIF(SubD.Date_Value = #recs.SecondReviewDate, 1, 0)) AS [Second Reviews]
from #recs
left join DimDate SubD on SubD.Date_Value = '2017-10-03' and
(SubD.Date_Value = #recs.SubmissionDate
or SubD.Date_Value = #recs.FirstReviewDate
or SubD.Date_Value = #recs.SecondReviewDate)
group by Employee
If your SQL Server version is older than 2012, use CASE instead of IIF:
select Employee
, sum(case when SubD.Date_Value = #recs.SubmissionDate then 1 else 0 end) AS Submissions
, sum(case when SubD.Date_Value = #recs.FirstReviewDate then 1 else 0 end) AS [First Reviews]
, sum(case when SubD.Date_Value = #recs.SecondReviewDate then 1 else 0 end) AS [Second Reviews]
from #recs
left join DimDate SubD on SubD.Date_Value = '2017-10-03' and
(SubD.Date_Value = #recs.SubmissionDate
or SubD.Date_Value = #recs.FirstReviewDate
or SubD.Date_Value = #recs.SecondReviewDate)
group by Employee

This should work. I've included some simple DDL and DML for sample data:
--Table Setup
IF OBJECT_ID('Records') IS NOT NULL
DROP TABLE Records;
IF OBJECT_ID('Calendar') IS NOT NULL
DROP TABLE Calendar;
CREATE TABLE Records (
Employee varchar(100),
SubmissionDate datetime,
FirstReviewDate datetime,
SecondReviewDate datetime
)
CREATE TABLE Calendar (
TheDate datetime
)
INSERT Records
VALUES ('Anne', '2017-10-02', '2017-10-03', '2017-10-10'),
('Bernard', '2017-10-03', '2017-10-05', '2017-10-10'),
('Charlene', '2017-10-06', '2017-10-09', '2017-10-09'),
('Danielle', '2017-10-02', '2017-10-03', '2017-10-09'),
('Anne', '2017-10-03', '2017-10-03', '2017-10-09')
INSERT Calendar
VALUES ('2017-10-03'), ('2017-10-04'), ('2017-10-05')
--Main Query
DECLARE @TheDate datetime = '2017-10-03'
SELECT
Employee,
SUM(CASE
WHEN c1.TheDate IS NOT NULL THEN 1
ELSE 0
END) AS Submissions,
SUM(CASE
WHEN c2.TheDate IS NOT NULL THEN 1
ELSE 0
END) AS [First Reviews],
SUM(CASE
WHEN c3.TheDate IS NOT NULL THEN 1
ELSE 0
END) AS [Second Reviews]
FROM Records r
LEFT JOIN Calendar c1
ON r.SubmissionDate = c1.TheDate
AND c1.TheDate = @TheDate
LEFT JOIN Calendar c2
ON r.FirstReviewDate = c2.TheDate
AND c2.TheDate = @TheDate
LEFT JOIN Calendar c3
ON r.SecondReviewDate = c3.TheDate
AND c3.TheDate = @TheDate
GROUP BY Employee
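If the goal is a result set that a tool such as Qlik Sense can filter on TheDate, another option is to unpivot the three date columns into one row per event and join that to the calendar once. The following is only a sketch: the table variables @Records and @Calendar stand in for the real tables, and the EventType labels and the alias v(EventType, EventDate) are names made up for illustration.

```sql
-- Stand-ins for the real Records and Calendar tables (sample data from the question).
DECLARE @Records TABLE (
    Employee         varchar(100),
    SubmissionDate   date,
    FirstReviewDate  date,
    SecondReviewDate date
);
DECLARE @Calendar TABLE (TheDate date);

INSERT INTO @Records VALUES
    ('Anne',     '2017-10-02', '2017-10-03', '2017-10-10'),
    ('Bernard',  '2017-10-03', '2017-10-05', '2017-10-10'),
    ('Charlene', '2017-10-06', '2017-10-09', '2017-10-09'),
    ('Danielle', '2017-10-02', '2017-10-03', '2017-10-09'),
    ('Anne',     '2017-10-03', '2017-10-03', '2017-10-09');
INSERT INTO @Calendar VALUES ('2017-10-03'), ('2017-10-04'), ('2017-10-05');

-- Turn each record into three (EventType, EventDate) rows, then join on the single date.
SELECT c.TheDate,
       r.Employee,
       SUM(CASE WHEN v.EventType = 'Submission'   THEN 1 ELSE 0 END) AS Submissions,
       SUM(CASE WHEN v.EventType = 'FirstReview'  THEN 1 ELSE 0 END) AS [First Reviews],
       SUM(CASE WHEN v.EventType = 'SecondReview' THEN 1 ELSE 0 END) AS [Second Reviews]
FROM @Records AS r
CROSS APPLY (VALUES ('Submission',   r.SubmissionDate),
                    ('FirstReview',  r.FirstReviewDate),
                    ('SecondReview', r.SecondReviewDate)) AS v(EventType, EventDate)
JOIN @Calendar AS c
  ON c.TheDate = v.EventDate
WHERE c.TheDate = '2017-10-03'
GROUP BY c.TheDate, r.Employee;
```

Because this is an inner join followed by a WHERE filter, employees with no event on the selected date (Charlene for 2017-10-03) do not appear at all, rather than appearing with zeros as in the LEFT JOIN queries.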

Related

create date range report based on history table

We have been keeping track of some changes in a History Table like this:
ChangeID EmployeeID PropertyName OldValue NewValue ModifiedDate
100 10 EmploymentStart Not Set 1 2013-01-01
101 10 SalaryValue Not Set 55000 2013-01-01
102 10 SalaryValue 55000 61500 2013-03-20
103 10 SalaryEffectiveDate 2013-01-01 2013-04-01 2013-03-20
104 11 EmploymentStart Not Set 1 2013-01-21
105 11 SalaryValue Not Set 43000 2013-01-21
106 10 SalaryValue 61500 72500 2013-09-20
107 10 SalaryEffectiveDate 2013-04-01 2013-10-01 2013-09-20
Basically, if an employee's salary changes, we log two rows in the history table: one row for the salary value itself and another row for the salary effective date. These two rows have an identical modification date/time, and it is fairly safe to assume that they always follow each other in the database. We can also assume that the salary value is always logged first (so it is one record before the corresponding effective date).
Now we are looking into creating reports based on a given date range into a table like this:
Annual Salary Change Report (2013)
EmployeeID Date1 Date2 Salary
10 2013-01-01 2013-04-01 55000
10 2013-04-01 2013-10-01 61500
10 2013-10-01 2013-12-31 72500
11 2013-03-21 2013-12-31 43000
I have done something similar in the past by joining the table to itself, but in those cases the effective date and the new value were in the same row. Now I have to create each row of the output table by looking at several rows of the existing history table. Is there a straightforward way of doing this without using cursors?
Edit #1:
I'm reading up on this, and apparently it's doable using PIVOT.
Thank you very much in advance.
You can use a self join to get the result you want. The trick is to create a CTE and add two rows for each EmployeeID as follows (I call the history table ht):
with cte1 as
(
select EmployeeID, PropertyName, OldValue, NewValue, ModifiedDate
from ht
union all
select t1.EmployeeID,
(case when t1.PropertyName = 'EmploymentStart' then 'SalaryEffectiveDate' else t1.PropertyName end),
(case when t1.PropertyName = 'EmploymentStart' then t1.ModifiedDate else t1.NewValue end),
(case when t1.PropertyName = 'SalaryValue' then t1.NewValue
when t1.PropertyName = 'SalaryEffectiveDate' then '2013-12-31'
when t1.PropertyName = 'EmploymentStart' then '2013-12-31' end),
'2013-12-31'
from ht t1
where t1.ModifiedDate = (select max(t2.ModifiedDate) from ht t2 where t1.EmployeeID = t2.EmployeeID)
)
select t3.EmployeeID, t4.OldValue Date1, t4.NewValue Date2, t3.OldValue Salary
from cte1 t3
inner join cte1 t4 on t3.EmployeeID = t4.EmployeeID
and t3.ModifiedDate = t4.ModifiedDate
where t3.PropertyName = 'SalaryValue'
and t4.PropertyName = 'SalaryEffectiveDate'
order by t3.EmployeeID, Date1
I hope this helps.
PIVOT is a little overkill here since you only need two properties. GROUP BY can also achieve this (LEAD requires SQL Server 2012 or later):
;WITH cte_salary_history(EmployeeID,SalaryEffectiveDate,SalaryValue)
AS
(
SELECT EmployeeID,
MAX(CASE WHEN PropertyName='SalaryEffectiveDate' THEN NewValue ELSE NULL END) AS SalaryEffectiveDate,
MAX(CASE WHEN PropertyName='SalaryValue' THEN NewValue ELSE NULL END) AS SalaryValue
FROM yourtable
GROUP BY EmployeeID,ModifiedDate
)
SELECT EmployeeID,SalaryEffectiveDate,
LEAD(SalaryEffectiveDate,1,'9999-12-31') OVER(PARTITION BY EmployeeID ORDER BY SalaryEffectiveDate) AS SalaryEndDate,
SalaryValue
FROM cte_salary_history

Using UPDATE + SET to change a value to another date's value

Forgive me if the answer is obvious here, however I have been stuck for days; my unsuccessful query below.
If a 'Retailer' reports sales figures, but not inventory values for a certain day, I want to update that missing value using the value for the day prior.
Here's a sample table:
Retailer Date ItemID Sold Inventory
Joe's 2017-10-30 00:00:00.000 111111 10 0
Joe's 2017-10-29 00:00:00.000 111111 10 999999
Mary's 2017-10-30 00:00:00.000 123123 10 0
Mary's 2017-10-29 00:00:00.000 123123 10 888888
Betty's 2017-10-30 00:00:00.000 111111 10 499990
Betty's 2017-10-29 00:00:00.000 111111 10 500000
And here is the query I'm trying to use:
SET T1.Inventory = (SELECT T2.Inventory
FROM [dbo].[TEST] T2
WHERE CAST(T2.Date AS DATE) = CONVERT(date,getDate()-2))
FROM [dbo].[TEST] T1
WHERE Inventory = '0'
Use the DATEADD function instead of getDate()-2
And if you want the day before today, you should use GetDate and subtract 1, rather than 2.
If you want the day before the record you are looking at with the same retailer, then you should use t1.Date and make sure you have correlated the subquery:
SET T1.Inventory = (SELECT T2.Inventory
FROM [dbo].[TEST] T2
WHERE CAST(T2.Date AS DATE) = DATEADD(day,-1,CONVERT(date,T1.Date))
AND t1.Retailer=t2.Retailer
)
...
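Written out in full, the correlated update might look like the sketch below. It uses a self join instead of a subquery, so rows with no previous-day record are left untouched rather than set to NULL. The temp table #Test is a stand-in for [dbo].[TEST], the column types are guesses, and matching on ItemID as well as Retailer is an assumption that goes beyond the snippet above.

```sql
-- #Test stands in for [dbo].[TEST]; column types are assumed.
CREATE TABLE #Test (
    Retailer  varchar(50),
    [Date]    datetime,
    ItemID    int,
    Sold      int,
    Inventory int
);

INSERT INTO #Test VALUES
    ('Joe''s',   '2017-10-30', 111111, 10, 0),
    ('Joe''s',   '2017-10-29', 111111, 10, 999999),
    ('Mary''s',  '2017-10-30', 123123, 10, 0),
    ('Mary''s',  '2017-10-29', 123123, 10, 888888),
    ('Betty''s', '2017-10-30', 111111, 10, 499990),
    ('Betty''s', '2017-10-29', 111111, 10, 500000);

-- Copy the previous day's inventory for the same retailer and item
-- into every row whose inventory was reported as 0.
UPDATE T1
SET    T1.Inventory = T2.Inventory
FROM   #Test AS T1
JOIN   #Test AS T2
  ON   T2.Retailer = T1.Retailer
 AND   T2.ItemID   = T1.ItemID   -- assumption: inventory is tracked per item
 AND   CAST(T2.[Date] AS date) = DATEADD(day, -1, CAST(T1.[Date] AS date))
WHERE  T1.Inventory = 0;

SELECT * FROM #Test ORDER BY Retailer, [Date];

DROP TABLE #Test;
```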
Undoubtedly the reason for the difficulty with updating the Inventory column is that the table lacks a unique column, which most people consider absolutely essential in any database table. So I have added an identity column, RID, as a unique row ID.
ALTER TABLE T1 ADD RID INT IDENTITY(1,1)
GO -- the new column must exist before the batch below is compiled
DECLARE @RID INT = (SELECT MIN(RID) FROM T1 WHERE Inventory = 0)
DECLARE @INVZERO INT = (SELECT COUNT(*) FROM T1 WHERE Inventory = 0)
WHILE @INVZERO > 0
BEGIN
UPDATE T1 SET INVENTORY =
(
SELECT INVENTORY FROM T1
WHERE RETAILER = (SELECT RETAILER FROM T1 WHERE RID = @RID)
AND [DATE] = DATEADD(DAY,-1,(SELECT [DATE] FROM T1 WHERE RID = @RID))
)
WHERE RID = @RID
SET @RID = (SELECT MIN(RID) FROM T1 WHERE Inventory = 0 AND RID > @RID)
SET @INVZERO = (SELECT COUNT(*) FROM T1 WHERE Inventory = 0)
END
SELECT * FROM T1

Optimization of SQL Server Query

My problem is that I created a query that takes too long to execute.
City | Department | Employee | Attendance Date | Attendance Status
------------------------------------------------------------------------
C1 | Dept 1 | Emp 1 | 2016-01-01 | ABSENT
C1 | Dept 1 | Emp 2 | 2016-01-01 | LATE
C1 | Dept 2 | Emp 3 | 2016-01-01 | VACANCY
So I want to create a view that contains the same data and adds a column with the total number of employees (which I will use later in an SSRS project to determine the percentage of each status).
So I created a function that does a simple select, filtering by department and date.
This is the query that uses the function:
SELECT City, Department, Employee, [Attendence Date], [Attendance Status], [Get Department Employees By Date](Department, [Attendence Date]) AS TOTAL
FROM attendenceTable
This is the function:
CREATE FUNCTION [dbo].[Get Department Employees By Date]
(
@deptID int = null,
@date datetime = null
)
RETURNS nvarchar(max)
AS
BEGIN
declare @result int = 0;
select @result = count(*) from attendenceTable where DEPT_ID = @deptID and ATT_DATE_G = @date;
RETURN @result;
END
The problem is that the query takes a very long time to execute.
Any suggestions for optimization?
Your function is a scalar function, which runs once for every row in the result set (~600,000 times) and is a known performance killer. It can be rewritten as an inline table-valued function or, if the logic is not required elsewhere, a simple group, count and join would suffice:
WITH EmployeesPerDeptPerDate
AS ( SELECT DEPT_ID ,
ATT_DATE_G ,
COUNT(DISTINCT Employee) AS EmployeeCount
FROM attendenceTable
GROUP BY DEPT_ID ,
ATT_DATE_G
)
SELECT A.City ,
A.Department ,
A.Employee ,
A.[Attendence Date] ,
A.[Attendance Status] ,
ISNULL(B.EmployeeCount, 0) AS EmployeeCount
FROM attendenceTable AS A
LEFT OUTER JOIN EmployeesPerDeptPerDate AS B ON A.DEPT_ID = B.DEPT_ID
AND A.ATT_DATE_G = B.ATT_DATE_G;
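A windowed COUNT is a third way to attach the per-department, per-date total to every row without a function or a join. This is a different technique from the CTE above, shown only as a sketch: the table variable @attendance stands in for attendenceTable, DEPT_ID and ATT_DATE_G are taken from the function body, and the other column names and types are assumptions.

```sql
-- @attendance stands in for attendenceTable; Employee and ATT_STATUS are assumed names.
DECLARE @attendance TABLE (
    DEPT_ID    int,
    Employee   varchar(50),
    ATT_DATE_G date,
    ATT_STATUS varchar(20)
);

INSERT INTO @attendance VALUES
    (1, 'Emp 1', '2016-01-01', 'ABSENT'),
    (1, 'Emp 2', '2016-01-01', 'LATE'),
    (2, 'Emp 3', '2016-01-01', 'VACANCY');

-- COUNT(*) OVER (...) counts the rows sharing the same department and date,
-- which mirrors the count(*) in the original scalar function.
SELECT DEPT_ID,
       Employee,
       ATT_DATE_G,
       ATT_STATUS,
       COUNT(*) OVER (PARTITION BY DEPT_ID, ATT_DATE_G) AS TOTAL
FROM @attendance;
```

With this sample data, the two department 1 rows get TOTAL = 2 and the department 2 row gets TOTAL = 1.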

TSQL get COUNT of rows that are missing from right table

There was one other similar answer, but it is two pages long and my requirement doesn't need that. I have two tables, tableA and tableB, and I need to find the counts of rows that are present in tableA but not present in tableB, or whose updated_on in tableB is not today's date.
My tables:
tableA:
release_id book_name release_begin_date
----------------------------------------------------
1122 midsummer 2016-01-01
1123 fool's errand 2016-06-01
1124 midsummer 2016-04-01
1125 fool's errand 2016-08-01
tableB:
release_id book_name updated_on
-----------------------------------------
1122 midsummer 2016-08-17
1123 fool's errand 2016-08-16
Expected result: Since each book is missing one release id, 1 is count. But in addition fool's errand's existing row in tableB has updated_on date of yesterday and not today, it needs to be counted in count_of_not_updated.
book_name count_of_missing count_of_not_updated
-------------------------------------------------------
midsummer 1 0
fool's errand 1 1
Note: Even though fool's errand is present in tableB, I need to show it in count_of_missing because its updated_on date is yesterday and not today. I know it has to be a combination of a left join and something else, but the kicker here is not only getting the missing rows from the left table but at the same time checking whether the updated_on value is today's date and, if not, counting that row in count_of_not_updated.
select sum(case when b.release_id is null then 1 else 0 end) as noReleaseID
, sum(case when datediff(d, b.updated_on, getdate()) > 0 then 1 else 0 end) as updatedOnNotToday
, a.release_id
from tableA a
left outer join tableB b on a.release_id = b.release_id
Group by a.release_id
This example uses SUM over a CASE expression to count the rows for which the condition is true. Note that the current code assumes, as in your example, that you want to count every old updated_on date in tableB. More steps would be required if each book has multiple rows in tableB and you only want to compare against the most recent date.
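To get one row per book, as in the expected result, the same SUM over CASE idea can be grouped by book_name instead of release_id. A minimal sketch under some assumptions: the table variables stand in for tableA and tableB, and @today is a hypothetical variable fixed to 2016-08-17 so that the example is reproducible (in practice it would be CAST(GETDATE() AS date)).

```sql
-- Stand-ins for tableA and tableB with the sample data from the question.
DECLARE @tableA TABLE (release_id int, book_name nvarchar(50), release_begin_date date);
DECLARE @tableB TABLE (release_id int, book_name nvarchar(50), updated_on datetime);

INSERT INTO @tableA VALUES
    (1122, 'midsummer',      '2016-01-01'),
    (1123, 'fool''s errand', '2016-06-01'),
    (1124, 'midsummer',      '2016-04-01'),
    (1125, 'fool''s errand', '2016-08-01');
INSERT INTO @tableB VALUES
    (1122, 'midsummer',      '2016-08-17'),
    (1123, 'fool''s errand', '2016-08-16');

-- Hypothetical fixed "today"; replace with CAST(GETDATE() AS date) for real use.
DECLARE @today date = '2016-08-17';

SELECT a.book_name,
       -- no matching release in tableB
       SUM(CASE WHEN b.release_id IS NULL THEN 1 ELSE 0 END) AS count_of_missing,
       -- matched, but not updated today
       SUM(CASE WHEN b.release_id IS NOT NULL
                 AND CAST(b.updated_on AS date) <> @today THEN 1 ELSE 0 END) AS count_of_not_updated
FROM @tableA AS a
LEFT JOIN @tableB AS b
  ON b.release_id = a.release_id
GROUP BY a.book_name;
```

With @today set to 2016-08-17, this returns midsummer with 1 and 0, and fool's errand with 1 and 1.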
Try this:
DECLARE @tableA TABLE (release_id INT, book_name NVARCHAR(50), release_begin_date DATETIME)
DECLARE @tableB TABLE (release_id INT, book_name NVARCHAR(50), updated_on DATETIME)
INSERT INTO @tableA
VALUES
(1122, 'midsummer', '2016-01-01'),
(1123, 'fool''s errand', '2016-06-01'),
(1124, 'midsummer', '2016-04-01'),
(1125, 'fool''s errand', '2016-08-01')
INSERT INTO @tableB
VALUES
(1122, 'midsummer', '2016-08-17'),
(1123, 'fool''s errand', '2016-08-16')
;WITH TmpTableA
AS
(
SELECT
book_name,
COUNT(1) CountOfTableA
FROM
@tableA
GROUP BY
book_name
), TmpTableB
AS
(
SELECT
book_name,
COUNT(1) CountOfTableB,
SUM(CASE WHEN CONVERT(VARCHAR(11), updated_on, 112) = CONVERT(VARCHAR(11), GETDATE(), 112) THEN 0 ELSE 1 END) count_of_not_updated
FROM
@tableB
GROUP BY
book_name
)
SELECT
A.book_name ,
A.CountOfTableA - ISNULL(B.CountOfTableB, 0) AS count_of_missing,
ISNULL(B.count_of_not_updated, 0) AS count_of_not_updated
FROM
TmpTableA A LEFT JOIN
TmpTableB B ON A.book_name = B.book_name
Result:
book_name count_of_missing count_of_not_updated
-------------------- ---------------- --------------------
fool's errand 1 1
midsummer 1 1

Select Statement Returning Multiple of Same Dates

I am having issues with a select statement that is returning multiple dates that are the same.
I will start off with the tables that I have:
Calendar: Very basic, just a table that contains all of the days for the next 20 years.
PKDate
------
2015-04-01
2015-04-02
2015-04-03
etc...
DaysWorked: This table contains all days that all equipment in the company has worked. There is a foreign key constraint to the PKDate in calendar to the DayWorked in this table.
DayWorked | Unit
----------------
2015-04-01 | 102
2015-04-05 | 103
Event: This is the table that is behind our scheduling system. It holds all of the days that can be booked off as days off. The user can select a start and end date for the days off or vacation. There are no foreign key constraints in this table.
Name | EventStart | EventEnd | Unit
-----------------------------------------
Days Off | 2015-04-06 | 2015-04-08 | 103
Days Off | 2015-04-03 | 2015-04-09 | 102
This is the stored procedure that I am executing:
select distinct PKDate as 'Date', case when PKDate not in (select DayWorked
from DaysWorked
where Unit='124')
then 'AVAILABLE'
else ''
end
as 'Available',
case when PKDate in (select DayWorked
from DaysWorked
where Unit='124')
then 'WORKED'
else ''
end
as 'Worked',
case when PKDate between E.EventStart and DATEADD(day, -1, E.EventEnd)
and E.ResourceID='124'
then UPPER(E.Name)
else ''
end
as 'Schedule'
from Event E
full outer join Calendar C
on PKDate between E.EventStart and E.EventEnd
where PKDate between '2015-04-01' and GETDATE()
order by PKDate asc
This stored procedure almost works as planned. I want the result of the procedure to show every day in the calendar in one column (Date), then display if the equipment was available (available), if the equipment worked (worked), and if the equipment was booked for days off or vacation (Schedule).
What is happening when I run the procedure is the date displays more than once for the same date. An example is shown in the photo below:
The days from April 13th to April 16th are repeated. I believe this is because I have something for those days in the Event table, but I do not know why each day is displayed twice. How can I get these days to display only once?
select C.PKDate
,case when not exists ( select * from DaysWorked where Unit = '124' and DayWorked = C.PKDate )
and not exists ( select * from Event E where E.EventStart <= C.PKDate and E.EventEnd >= C.PKDate and E.ResourceID = '124')
then 'AVAILABLE' else '' end as Available
,case when exists ( select * from DaysWorked where Unit = '124' and DayWorked = C.PKDate ) then 'WORKED' else '' end as Worked
,isnull((select max(E.Name) from Event E where E.EventStart <= C.PKDate and E.EventEnd >= C.PKDate and E.ResourceID = '124' ), '') as Schedule
from Calendar C
where C.PKDate between '2015-4-1' and getdate()
order by c.PKDate
