How to optimize my SQL query shown below - sql-server

This query finds the users who did not log in to the system between 1 July and 31 July.
When we run the query in Query Analyzer it takes more than 2 minutes, and on the application side it fails with the error 'Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding'.
The query below starts at 1 July 2022, fetches the users for that date, adds them to a table variable called '@TABLE_TEMP', and then moves on to the next date.
The WHILE loop then runs again and fetches the users for 2 July, and so on until it reaches 31 July.
Can anyone help optimize the query using a CTE or any other mechanism?
How can we avoid the WHILE loop for better performance?
DECLARE @TABLE_TEMP TABLE
(
Row int IDENTITY(1,1),
[UserId] int,
[UserName] nvarchar(100),
[StartDate] nvarchar(20),
[FirstLogin] nvarchar(20),
[LastLogout] nvarchar(20)
)
DECLARE @START_DATE datetime = '2022-07-01';
DECLARE @END_DATE datetime = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';
DECLARE @QUERY nvarchar(max) = '';
WHILE(@START_DATE < @END_DATE OR @START_DATE = @END_DATE)
BEGIN
SET @QUERY = 'SELECT
s.userid AS [UserId],
s.username AS [UserName],
''' + CAST(@START_DATE as nvarchar) + ''' AS [StartDate],
MAX(h.START_TIME) as [FirstLogin],
MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]
FROM USER s
LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid
LEFT JOIN TEMP_USER_INACTIVATION TUI ON TUI.userid = s.userid AND ('''+ CAST(@START_DATE as nvarchar) +''' BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE)
WHERE s.userid IN (' + @USER_ID + ')
AND h.userid NOT IN (SELECT userid FROM USER_LOGIN_HISTORY WHERE CAST(START_TIME AS DATE) = '''+ CONVERT(nvarchar,(CAST(@START_DATE AS DATE))) +''') AND ACTIVATED_DATE IS NOT NULL
GROUP BY s.userid, h.userid, s.username, s.last_seen_time
HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <> '''+ CONVERT(nvarchar,(CAST(@START_DATE AS DATE))) + '''
ORDER BY [User Name]'
INSERT INTO @TABLE_TEMP
EXEC(@QUERY)
SET @START_DATE = DATEADD(DD, 1, @START_DATE)
END

Without the query plan, it's hard to say for sure.
But there are some clear efficiencies to be had.
Firstly, there is no need for a WHILE loop. Create a Dates table which has every single date in it. Then you can simply join to it.
Furthermore, do not inject the @USER_ID values. Instead, pass them through as a Table Valued Parameter. At the very least, split what you have now into a temp table or table variable.
Do not cast columns that you want to join or filter on. For example, to check if START_TIME falls on a certain date, you can do WHERE START_TIME >= BeginningOfDate AND START_TIME < BeginningOfNextDate.
The LEFT JOINs are suspicious, especially given you are filtering on those tables in the WHERE, which effectively turns them into inner joins.
Use NOT EXISTS instead of NOT IN, or you could get incorrect results when the subquery returns NULLs.
DECLARE @START_DATE date = '2022-07-01';
DECLARE @END_DATE date = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';
DECLARE @userIds TABLE (userId int PRIMARY KEY);
INSERT @userIds (userId)
SELECT CAST(value AS int)
FROM STRING_SPLIT(@USER_ID, ',');
SELECT
  s.userid as [UserId],
  s.username as [UserName],
  d.Date as [StartDate],
  MAX(h.START_TIME) as [FirstLogin],
  MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]
FROM Dates d
JOIN USER s
    LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid
    LEFT JOIN TEMP_USER_INACTIVATION TUI
      ON TUI.userid = s.userid
  ON d.Date BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE -- specify table alias (don't know which?)
WHERE s.userid in (SELECT u.userId FROM @userIds u)
  AND NOT EXISTS (SELECT 1
      FROM USER_LOGIN_HISTORY ulh
      WHERE ulh.START_TIME >= CAST(d.date AS datetime)
        AND ulh.START_TIME < CAST(DATEADD(day, 1, d.date) AS datetime)
        AND ulh.userid = h.userid
  )
  AND ACTIVATED_DATE IS NOT NULL
  AND d.Date BETWEEN @START_DATE AND @END_DATE
GROUP BY
  d.Date,
  s.userid,
  s.username,
  s.last_seen_time
HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <> d.date
ORDER BY -- do you need this? remove if possible.
  s.username;
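The Table Valued Parameter suggested above could look roughly like the sketch below. The type name dbo.UserIdList and the procedure name dbo.GetInactiveUsers are hypothetical, and the procedure body only echoes the ids back to show where the real query would read them from.

```sql
-- Hypothetical table type holding the user ids (not part of the original schema).
CREATE TYPE dbo.UserIdList AS TABLE (userId int PRIMARY KEY);
GO
-- Hypothetical procedure; table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.GetInactiveUsers
    @userIds   dbo.UserIdList READONLY,
    @startDate date,
    @endDate   date
AS
BEGIN
    SET NOCOUNT ON;
    -- Placeholder body: the real query would filter on this table
    -- (WHERE s.userid IN (SELECT u.userId FROM @userIds u))
    -- instead of concatenating a CSV string into dynamic SQL.
    SELECT u.userId
    FROM @userIds u;
END;
GO
-- Caller side: fill a variable of the table type and pass it in.
DECLARE @ids dbo.UserIdList;
INSERT @ids (userId) VALUES (1), (2), (3);
EXEC dbo.GetInactiveUsers
     @userIds   = @ids,
     @startDate = '2022-07-01',
     @endDate   = '2022-07-31';
```

Because the ids arrive as typed rows, there is no string to split and nothing to inject.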

It is better to collect the dates in a table than to run the query in a loop. Use the following query to collect the dates of a given month:
DECLARE @day INT= 1
DECLARE @dates TABLE(datDate DATE)
--create the dates table first, then add a row for each date in the given month.
WHILE ISDATE('2022-8-' + CAST(@day AS VARCHAR)) = 1
BEGIN
INSERT INTO @dates
VALUES (DATEFROMPARTS(2022, 8, @day))
SET @day = @day + 1
END
Then, to get all dates on which a user did not log in, use a Cartesian (cross) join and a left join, as illustrated below:
SELECT allDates.userID,
allDates.userName,
allDates.datDate notLoggedOn
FROM
(
--This will return all users for all dates in a month i.e. 31 rows for August for every user
SELECT *
FROM Users
CROSS JOIN @dates
) allDates
LEFT JOIN
(
--now get every distinct login date for every user within the given date range
SELECT DISTINCT userID,
CAST(login_date AS DATE) login_day
FROM USER_LOGIN_HISTORY
WHERE login_date >= '2022-08-01' AND login_date < '2022-09-01'
) loggedDates ON loggedDates.userID = allDates.userID
AND loggedDates.login_day = allDates.datDate
WHERE loggedDates.login_day IS NULL --keep only the user/date pairs with no login
ORDER BY allDates.userID,
allDates.datDate
From this query you will get every day of the month on which a user did not log in.
If there is no need to list every single date when user did not log in, then Cartesian join can be omitted. This will further improve the performance.
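As a rough sketch of that simpler case (only the users with no login at all in the range, no per-day listing), a NOT EXISTS check is enough. The table and column names are the ones used in the query above; the variables @from and @to are introduced here for illustration.

```sql
DECLARE @from date = '2022-08-01';
DECLARE @to   date = '2022-09-01';  -- exclusive upper bound: first day after the range

SELECT u.userID,
       u.userName
FROM Users u
WHERE NOT EXISTS (
          -- any login inside the half-open range [@from, @to) disqualifies the user
          SELECT 1
          FROM USER_LOGIN_HISTORY h
          WHERE h.userID = u.userID
            AND h.login_date >= @from
            AND h.login_date <  @to
      )
ORDER BY u.userID;
```

The half-open range also picks up logins late on the last day when login_date carries a time component.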
I hope this will help.

Related

Can I "left join" days between 2 dates in sql server?

There is a table in SQL Server where data is entered day by day. In this table, data is not filled in some days.
Therefore, there are no records in the table.
Sample: dataTable
I need to generate a report like the one below from this table.
Create a table with all the days of the year. I know that I can output a report by "joining" the "dataTable" table.
But this solution seems a bit strange to me.
Is there another way?
The code I use for the temp date table:
CREATE TABLE tempDate (
calendarDate date,
PRIMARY KEY (calendarDate)
)
DECLARE
@start DATE= '2021-01-01',
@dateCount INT= 730,
@rowNumber INT=1
WHILE (@rowNumber < @dateCount)
BEGIN
INSERT INTO tempDate values (DATEADD(DAY, @rowNumber, @start))
set @rowNumber=@rowNumber+1
END
GO
select * from tempDate
This is how I join using this table
SELECT
*
FROM
tempDate td WITH (NOLOCK)
LEFT JOIN dataTable dt WITH (NOLOCK) ON dt.reportDate = td.calendarDate
WHERE
td.calendarDate BETWEEN '2021-09-05' AND '2021-09-15'
Create a table with all the days of the year. I know that I can output a report by "joining" the "dataTable" table.
This is the way. You can generate that "table" on the fly if you really want to, but normally the best way is to simply have a calendar table.
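If you keep a calendar table, it can be filled in a single set-based statement rather than a WHILE loop. The sketch below assumes the tempDate table from the question exists and is still empty, and it uses sys.all_objects purely as a convenient row source; any sufficiently large table would do.

```sql
DECLARE @start date = '2021-01-01';
DECLARE @dateCount int = 730;

WITH n AS (
    -- one row per day offset: 0, 1, 2, ... @dateCount - 1
    SELECT TOP (@dateCount)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int) AS i
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
)
INSERT INTO tempDate (calendarDate)
SELECT DATEADD(DAY, i, @start)
FROM n;
```

Unlike the loop in the question, this starts at offset 0, so the start date itself is included.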
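You can use a recursive common table expression (CTE) to generate the dates. Note that a recursive CTE stops with an error after 100 recursion levels by default, so longer ranges need OPTION (MAXRECURSION n) on the final query. The code you need: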
IF(OBJECT_ID('tempdb..#t') IS NOT NULL)
BEGIN
DROP TABLE #t
END
CREATE TABLE #t
(
id int,
dt date,
dsc varchar(100),
)
INSERT INTO #t
VALUES
(1, '2021.09.08', 'a'),
(1, '2021.09.09', 'b'),
(1, '2021.09.12', 'c')
DECLARE @minDate AS DATE
SET @minDate = (SELECT MIN(dt) FROM #t)
DECLARE @maxDate AS DATE
SET @maxDate = (SELECT MAX(dt) FROM #t)
;WITH cte
AS
(
SELECT @minDate AS [dt]
UNION ALL
SELECT DATEADD(DAY, 1, [dt])
FROM cte
WHERE DATEADD(DAY, 1, [dt])<=@maxDate
)
SELECT
ISNULL(CAST(t.id AS VARCHAR(10)), '') AS [id],
cte.dt AS [dt],
ISNULL(t.dsc, 'No record has been entered in the table.') AS [dsc]
FROM
cte
LEFT JOIN #t t on t.dt=cte.dt
A fast method is to use a numbers table; with it you can get the list of dates between two dates:
DECLARE @Date1 DATE, @Date2 DATE
SET @Date1 = '20200528'
SET @Date2 = '20200625'
SELECT DATEADD(DAY,number+1,@Date1) [Date]
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY,number+1,@Date1) < @Date2
If you LEFT JOIN this select with your table, you get the result that you want.
SELECT *
FROM (SELECT DATEADD(DAY,number+1,@Date1) [Date]
FROM master..spt_values WITH (NOLOCK)
WHERE type = 'P'
AND DATEADD(DAY,number+1,@Date1) < @Date2 ) as a
LEFT JOIN yourTable dt WITH (NOLOCK) ON a.[Date] = dt.reportDate
WHERE a.[Date] BETWEEN '2021-09-05' AND '2021-09-15'

Get data from the last day of the month without the use of loops or variables

I wrote a query that should select the last record of each month in a year. I'd like to create a View based on this select, that I could run later in my project, but unfortunately I can't use any while loops or variables in a view command. Is there a way to select all these records - last days of a month in a View that I can use later?
My desired effect of the view:
The query that I'm trying to implement in a view:
DECLARE @var_day01 DATETIME;
DECLARE @month int;
SET @month = 1;
DROP TABLE IF EXISTS #TempTable2;
CREATE TABLE #TempTable2 (ID int, date datetime, INP2D float, INP3D float, ID_device varchar(max));
WHILE @month < 13
BEGIN
SELECT @var_day01 = CONVERT(nvarchar, date) FROM (SELECT TOP 1 * FROM data
WHERE DATEPART(MINUTE, CONVERT(nvarchar, date)) = '59'
AND
MONTH(CONVERT(nvarchar, date)) = (CONVERT(nvarchar, @month))
ORDER BY date DESC
) results
ORDER BY date DESC;
INSERT INTO #TempTable2 (ID, date, INP2D,INP3D,ID_device)
SELECT * FROM data
WHERE DATEPART(MINUTE, CONVERT(nvarchar, date)) = '59'
AND
MONTH(CONVERT(nvarchar, date)) = (CONVERT(nvarchar, @month))
AND
DAY(CONVERT(nvarchar, date)) = CONVERT(datetime, DATEPART(DAY, @var_day01))
ORDER BY date DESC
PRINT @var_day01
SET @month = @month +1;
END
SELECT * FROM #TempTable2;
If you are actually just after the single most recent row for each month, there is no need for a while loop to achieve this. You just need to identify the max date value for each month and then filter your source data for those rows.
One way to achieve this is via a row_number window function:
declare @t table(id int,id_device int,dt datetime2);
insert into @t values(1,1,getdate()-40),(2,1,getdate()-35),(3,1,getdate()-25),(4,1,getdate()-10),(5,1,getdate());
select id
,id_device
,dt
from(select id
,id_device
,dt
,row_number() over (partition by id_device, year(dt), month(dt) order by dt desc) as rn
from @t
) as d
where rn = 1;
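Since the question asks for a view, the same row_number idea can be wrapped in one, because it needs neither loops nor variables. This is only a sketch: the view name is hypothetical, the dbo schema is assumed, and the column list is taken from the #TempTable2 definition in the question.

```sql
-- Hypothetical view name; assumes the source table is dbo.data.
CREATE VIEW dbo.LastReadingPerMonth
AS
SELECT ID, [date], INP2D, INP3D, ID_device
FROM (
    SELECT ID, [date], INP2D, INP3D, ID_device,
           -- number the rows of each device and month, newest first
           ROW_NUMBER() OVER (
               PARTITION BY ID_device, YEAR([date]), MONTH([date])
               ORDER BY [date] DESC
           ) AS rn
    FROM dbo.data
) AS d
WHERE rn = 1;
```

It can then be queried like any table, for example SELECT * FROM dbo.LastReadingPerMonth WHERE YEAR([date]) = 2021.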
You can add a simple WHERE clause to your SELECT statement: add one day to the date field and take the day of the resulting date. Only if that day is 1 (that is, the original date is the last day of its month) is the record selected.
The WHERE clause for your query will be: WHERE DAY(DATEADD(d,1,[date])) = 1

Pivot Activity Code Days Months

I am trying to get the activity codes for specific days to show the 31 days in every month of the year for a specific staff member.
If the staff member was present, sick, holiday leave, etc... I want those activity codes to display based on the output below for a year act_date range.
Thanks!
This can be achieved with pivoting. Here you can enter the staff id in the query to fetch the results for that particular staff.
--create table
create table staff_info
(
staffId int,
actDate datetime,
activityCode int
)
--insert values
insert into staff_info values
(2699, '01/02/2017', 101),
(2699, '05/14/2017', 303),
(2699, '08/06/2017', 101),
(1927, '10/25/2017', 105)
--actual solution
select * from
(
select staffId, day(actDate) as act_day,month(actDate) as actual_month,
activityCode
from staff_info
where staffId=2699 ----- enter the staff id here
) src
pivot
(
sum(activityCode)
for act_day in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],
[14],[15],
[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],
[31]
)
) p
Result:
Firstly, create a function which gives the date values for a specific range:
CREATE FUNCTION [dbo].[GetAllDatesBetweenRange]
(
@FromDate DATE
,@ToDate DATE
)
RETURNS @Dates TABLE
(
DateVal DATE
)
AS
BEGIN
;WITH CTE
AS
(
SELECT @FromDate AS FromDate
UNION ALL
SELECT DATEADD(DD,1,FromDate)
FROM CTE
WHERE FromDate < @ToDate
)
INSERT INTO @Dates
SELECT FromDate FROM CTE
OPTION (MAXRECURSION 0)
RETURN;
END
GO
GO
Use the dynamic query below to pivot for the specific date range:
DECLARE @Sql NVARCHAR(MAX);
DECLARE @DateVal NVARCHAR(MAX);
SELECT @DateVal = STUFF((SELECT ',['+CAST(DateVal AS NVARCHAR(50))+']'
FROM [dbo].[GetAllDatesBetweenRange]('2017-01-01','2017-12-31')
FOR XML PATH('')),1,1,'')
SET @Sql = '
;WITH CTE
AS
(
SELECT A.STAFF_ID
,Res1.DateVal
,A.ACTIVITY_CODE
FROM [dbo].[GetAllDatesBetweenRange](''2017-01-01'',''2017-12-31'') Res1
LEFT JOIN TableA A ON A.ACT_DATE = Res1.DateVal
)
SELECT *
FROM CTE
PIVOT
(
MAX(ACTIVITY_CODE)
FOR DateVal IN ('+@DateVal+')
) AS P'
EXEC SP_EXECUTESQL @Sql

Using date parameter in SQL Server CTE

I'm trying to use a start and end date parameter in a T-SQL common table expression. I'm very new to SQL Server development and I'm unsure of what I'm missing in the query.
I can specify values for @startdate and @enddate and get correct results.
However, I'm trying to figure out how to make the two parameters open so a user can specify start and end date values. The query will be used in an SSRS report.
DECLARE @startdate Datetime,
@enddate Datetime;
SET @startdate = '2017-02-09';
SET @enddate = '2017-02-10';
WITH ManHours AS
(
SELECT DISTINCT
a.plant_name AS Plant, SUM(tc.total_hr) AS TotalHours
FROM
area AS a
INNER JOIN
tf_department AS dep ON a.plant_id = dep.plant_id
INNER JOIN
tf_timecard AS tc ON dep.department_id = tc.department_id
WHERE
tc.timecard_dt BETWEEN @startdate AND @enddate
AND tc.department_id IN (266, 453, ...endlessly long list of IDs......)
AND tc.hourtype_id = 1
GROUP BY
a.plant_name),
Tonnage AS
(
SELECT DISTINCT
a.plant_name AS Plant, SUM(tglt.postqty) AS TotalTonnage
FROM
area AS a
INNER JOIN
plantgl AS pgl ON a.plant_id = pgl.plant_id
INNER JOIN
tgltransaction AS tglt ON pgl.glacckey = tglt.glacctkey
WHERE
tglt.postdate BETWEEN @startdate AND @enddate
GROUP BY
a.plant_name
)
SELECT DISTINCT
ManHours.Plant,
SUM(TotalTonnage) as 'Production Tons' ,
SUM(TotalHours) as 'Man Hours',
TotalTonnage / TotalHours AS TonsPerManHour
FROM
ManHours
LEFT OUTER JOIN
Tonnage ON ManHours.Plant = tonnage.Plant
GROUP BY
ManHours.Plant, ManHours.TotalHours, Tonnage.TotalTonnage
Below is an example of a stored procedure you could use. In addition, I provided two alternatives to the "endlessly long list of IDs" that is specified in the CTE. In my opinion, it is best to pull this logic out of the query and place it at the beginning of the stored procedure. This will enable you, or others, to easily go back and modify the list if and when it changes. Even better, I provided a table variable (@ListOfDeptIdsFromTable) that you can use to actually retrieve this data as opposed to hard-coding a string.
CREATE PROCEDURE Report
@startdate DATETIME,
@enddate DATETIME
AS
BEGIN
SET NOCOUNT ON;
DECLARE @ListOfDeptIds VARCHAR(MAX) = '266, 453, ...endlessly long list of IDs......';
DECLARE @ListOfDeptIdsFromTable TABLE
(
department_id INT
)
INSERT INTO @ListOfDeptIdsFromTable (department_id)
SELECT department_id
FROM -- Table here
WHERE -- Where criteria to retrieve the long list
;WITH ManHours AS
(
SELECT DISTINCT
a.plant_name AS Plant, SUM(tc.total_hr) AS TotalHours
FROM
area AS a
INNER JOIN
tf_department AS dep ON a.plant_id = dep.plant_id
INNER JOIN
tf_timecard AS tc ON dep.department_id = tc.department_id
WHERE
tc.timecard_dt BETWEEN @startdate AND @enddate
-- IN (@ListOfDeptIds) would treat the whole string as one value; to use the string, split it first (e.g. with STRING_SPLIT)
AND tc.department_id IN (SELECT department_id FROM @ListOfDeptIdsFromTable)
AND tc.hourtype_id = 1
GROUP BY
a.plant_name),
Tonnage AS
(
SELECT DISTINCT
a.plant_name AS Plant, SUM(tglt.postqty) AS TotalTonnage
FROM
area AS a
INNER JOIN
plantgl AS pgl ON a.plant_id = pgl.plant_id
INNER JOIN
tgltransaction AS tglt ON pgl.glacckey = tglt.glacctkey
WHERE
tglt.postdate BETWEEN @startdate AND @enddate
GROUP BY
a.plant_name
)
SELECT DISTINCT
ManHours.Plant,
SUM(TotalTonnage) as 'Production Tons' ,
SUM(TotalHours) as 'Man Hours',
TotalTonnage / TotalHours AS TonsPerManHour
FROM
ManHours
LEFT OUTER JOIN
Tonnage ON ManHours.Plant = tonnage.Plant
GROUP BY
ManHours.Plant, ManHours.TotalHours, Tonnage.TotalTonnage
END
GO

Conditional FROM clause

My coworkers are using entity framework and have got 3 (schematically) identical databases. These databases are updated and modified by their application. I am writing another, separate application to gather information about their application.
I am trying to use stored procedures but having trouble. It seems I must have three copies of my query in every stored procedure (one for each database) and JOIN them all at the end. I don't want to have three copies of every query with only the table name changed. Can I specify using a parameter, CASE statement, or something else the table I use in my FROM Clause?
Two options: dynamic SQL, or a UNION ALL statement.
SELECT columnlist
FROM TABLE1
WHERE @param = 'Table1'
UNION ALL
SELECT columnlist
FROM TABLE2
WHERE @param = 'Table2'
UNION ALL
SELECT columnlist
FROM TABLE3
WHERE @param = 'Table3'
Since you are working with stored procedures, you can pass the table name from which you want to query as parameter like
create procedure sp_test
@tab_name varchar(10)
as
begin
if(@tab_name = 'Table1')
select * from Table1
else if (@tab_name = 'Table2')
select * from Table2
else
select * from Table3
end
Then run your SP like
exec sp_test 'Table1'
EDIT:
As per your comment you want to change the DB name in your query. So in DB.HistoryOne JOIN DB.HistoryTwo you want to change the DB to DB1. You can do it like below in a procedure
create procedure sp_DB_change
@DBname varchar(10)
as
begin
-- varchar(max): the statement below is longer than 200 characters
declare @sql varchar(max);
set @sql = 'SELECT AVG(DATEDIFF(s, StartDate, OtherStartDate)) AS time1 ,
CAST(OtherStartDate AS Date) AS [Date]
FROM DB.HistoryOne
JOIN DB.HistoryTwo ON HistoryOne.Id = HistoryTwo.Id
WHERE StartDate IS NOT NULL
AND OtherStartDate IS NOT NULL
AND OtherStartDate > DATEADD(d, -7, GETDATE())
GROUP BY CAST(OtherStartDate AS DATE)';
select @sql = REPLACE(@sql,'DB',@DBname)
exec (@sql)
end
Then run your SP like
exec sp_DB_change 'testDB'
So your original query
SELECT AVG(DATEDIFF(s, StartDate, OtherStartDate)) AS time1 ,
CAST(OtherStartDate AS Date) AS [Date]
FROM DB.HistoryOne
JOIN DB.HistoryTwo ON HistoryOne.Id = HistoryTwo.Id
WHERE StartDate IS NOT NULL
AND OtherStartDate IS NOT NULL
AND OtherStartDate > DATEADD(d, -7, GETDATE())
GROUP BY CAST(OtherStartDate AS DATE)
Will be converted to
SELECT AVG(DATEDIFF(s, StartDate, OtherStartDate)) AS time1 ,
CAST(OtherStartDate AS Date) AS [Date]
FROM testDB.HistoryOne
JOIN testDB.HistoryTwo ON HistoryOne.Id = HistoryTwo.Id
WHERE StartDate IS NOT NULL
AND OtherStartDate IS NOT NULL
AND OtherStartDate > DATEADD(d, -7, GETDATE())
GROUP BY CAST(OtherStartDate AS DATE)
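Because the database name is spliced into the statement text, it is worth validating and quoting it before execution. The following is a sketch only: the procedure name is hypothetical, the dbo schema is an assumption (the original uses two-part names), and the aliases h1 and h2 are introduced just for the join condition.

```sql
-- Hypothetical procedure name; the dbo schema is an assumption.
CREATE PROCEDURE dbo.usp_HistoryByDb
    @DBname sysname
AS
BEGIN
    SET NOCOUNT ON;

    -- Refuse anything that is not the name of an existing database.
    IF DB_ID(@DBname) IS NULL
    BEGIN
        RAISERROR('Unknown database: %s', 16, 1, @DBname);
        RETURN;
    END;

    -- QUOTENAME wraps the name in [ ] and escapes any ] inside it.
    DECLARE @sql nvarchar(max) = N'
SELECT AVG(DATEDIFF(s, StartDate, OtherStartDate)) AS time1,
       CAST(OtherStartDate AS date) AS [Date]
FROM ' + QUOTENAME(@DBname) + N'.dbo.HistoryOne AS h1
JOIN ' + QUOTENAME(@DBname) + N'.dbo.HistoryTwo AS h2 ON h1.Id = h2.Id
WHERE StartDate IS NOT NULL
  AND OtherStartDate IS NOT NULL
  AND OtherStartDate > DATEADD(d, -7, GETDATE())
GROUP BY CAST(OtherStartDate AS date);';

    EXEC sys.sp_executesql @sql;
END;
GO
EXEC dbo.usp_HistoryByDb @DBname = N'testDB';
```

Building the name with QUOTENAME instead of a blanket REPLACE also avoids accidentally rewriting other occurrences of the placeholder text inside the statement.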

Resources