I have a table that lists all users for my company. There are multiple entries for each staff member showing how they have been employed.
RowID  UserID  FirstName  LastName  Title     StartDate   Active  EndDate
---------------------------------------------------------------------------
1      1       John       Smith     Manager   2017-01-01  0       2017-01-31
2      1       John       Smith     Director  2017-02-01  0       2017-02-28
3      1       John       Smith     CEO       2017-03-01  1       NULL
4      2       Sam        Davey     Manager   2017-01-01  0       2017-02-28
5      2       Sam        Davey     Manager   2017-03-01  0       NULL
6      3       Hugh       Holland   Admin     2017-02-01  1       NULL
7      4       David      Smith     Admin     2017-01-01  0       2017-02-28
I am trying to write a query that will tell me someone's length of service at any given time.
The part I am having trouble with is that a single person is represented by multiple rows as their information changes over time, so I need to combine multiple rows...
I have a query to report on who is employed at a point in time, which is as far as I have gotten.
DECLARE @DateCheck datetime
SET @DateCheck = '2017/05/10'
SELECT *
FROM UsersTest
WHERE @DateCheck >= StartDate AND @DateCheck <= ISNULL(EndDate, @DateCheck)
You need to use the DATEDIFF function. The key will be choosing the appropriate unit: days, months or years. The return value is an integer, so if you choose years it will be rounded (and remember, it will round for each record, not for the summary). I've chosen months below. The following has been added to get the most recent information for the user name:
WITH CurrentName AS
(SELECT UserID, FirstName, LastName
from
UserStartStop
where Active = 1 -- You can replace this with a date check
)
SELECT uss.UserID,
    MAX(cn.FirstName) as FirstName, -- the max is necessary because we are
                                    -- grouping. Could include in group by
    MAX(cn.LastName) as LastName,
    SUM(DATEDIFF(mm,uss.StartDate,COALESCE(uss.EndDate,GETDATE()))) as MonthsOfService
from UserStartStop uss
JOIN CurrentName cn
    on uss.UserID = cn.UserID
GROUP BY uss.UserID -- qualified, because UserID exists in both uss and cn
order by uss.UserID
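To illustrate the per-record rounding caveat: DATEDIFF counts the datepart boundaries crossed between the two dates rather than the time elapsed. A minimal sketch using John Smith's first two date ranges from the question (the column aliases are only illustrative):

```sql
-- DATEDIFF counts month boundaries crossed, not full months elapsed.
SELECT DATEDIFF(month, '2017-01-01', '2017-01-31') AS Row1Months,   -- 0: no month boundary crossed
       DATEDIFF(month, '2017-02-01', '2017-02-28') AS Row2Months,   -- 0: no month boundary crossed
       DATEDIFF(month, '2017-01-31', '2017-02-01') AS OneDayApart,  -- 1: one boundary, one day apart
       DATEDIFF(day,   '2017-01-01', '2017-01-31')
     + DATEDIFF(day,   '2017-02-01', '2017-02-28') AS TotalDays;    -- 30 + 27 = 57
```

Summing months row by row therefore gives 0 for these two rows even though they span almost two months. Summing days and converting afterwards may be closer to what you want.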
For months in service, change 'd' to 'mm':
Create table #UsersTest (
RowId int
, UserID int
, FirstName nvarchar(100)
, LastName nvarchar(100)
, Title nvarchar(100)
, StartDate date
, Active bit
, EndDate date)
Insert #UsersTest values (1, 1, 'John', 'Smith', 'Manager', '2017-01-01', 0, '2017-01-31')
Insert #UsersTest values (2, 1, 'John', 'Smith', 'Director', '2017-02-01', 0, '2017-02-28')
Insert #UsersTest values (3, 1, 'John', 'Smith', 'CEO', '2017-03-01', 1, null)
Insert #UsersTest values (4, 2, 'Sam', 'Davey', 'Manager', '2017-01-01', 0, '2017-02-28')
Insert #UsersTest values (5, 2, 'Sam', 'Davey', 'Manager', '2017-03-01', 0, null)
Insert #UsersTest values (6, 3, 'Hugh', 'Holland', 'Admin', '2017-02-01', 1, null)
Insert #UsersTest values (7, 4, 'David', 'Smith', 'Admin', '2017-01-01', 0, '2017-02-28')
Declare @DateCheck as datetime = '2017/05/10'
Select UserID, FirstName, LastName
, Datediff(d, Min([StartDate]), iif(isnull(Max([EndDate]),'1900-01-01')<@DateCheck, @DateCheck ,Max([EndDate]))) as [LengthOfService]
from #UsersTest
Group by UserID, FirstName, LastName
Try this:
Select
    FirstName,
    LastName,
    Min(StartDate) as StartDate,
    Max(isnull(EndDate, getdate())) as EndDate
from UsersTest
group by UserID, FirstName, LastName
For example, my table has a record for each date, and each date's record could be the same as the previous date's record or could be different. In my case, from date 1 to date 3 all of the records are the same; on date 4 the record changes; on date 5 the record changes too, but back to the same values as on date 3. Now I want a way to query the table and get the records for date 1, date 4 and date 5. Any idea how to do it? Thanks
As I read the issue above, a) you take daily logs of all rows, and b) you want to report on any row that is different from the previous day's.
SQL Server has a great function for dealing with differences across a large number of columns: EXCEPT. It also has the advantage of comparing NULLs like ordinary values: two NULLs count as equal, so a change from something to NULL, or vice versa, counts as a change. This is not true for most equality/inequality checks, where any comparison with NULL evaluates to UNKNOWN.
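A minimal sketch of that NULL behaviour, using made-up literal rows rather than a real table (the aliases UserID, UserEmail and ChangeCheck are only illustrative):

```sql
-- EXCEPT treats the two NULLs as equal, so this returns no rows (nothing changed).
SELECT 1 AS UserID, CAST(NULL AS nvarchar(100)) AS UserEmail
EXCEPT
SELECT 1, CAST(NULL AS nvarchar(100));

-- A change from an email to NULL is reported: this returns one row.
SELECT 1 AS UserID, CAST(NULL AS nvarchar(100)) AS UserEmail
EXCEPT
SELECT 1, N'Bob@gm.com';

-- An ordinary inequality test misses the same change, because NULL <> 'x' is UNKNOWN.
SELECT CASE WHEN CAST(NULL AS nvarchar(100)) <> N'Bob@gm.com'
            THEN 'changed'
            ELSE 'not detected'
       END AS ChangeCheck;  -- 'not detected'
```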
Here is a version where I create a daily snapshot of some fields from a 'users' table.
The SELECT query finds all rows from the log, except where the previous entry in the log is the same.
CREATE TABLE #UserLog (LogDate date, UserID int, UserName nvarchar(100), UserEmail nvarchar(100), LastLogonDate datetime, PRIMARY KEY (LogDate, UserID));
INSERT INTO #UserLog (LogDate, UserID, UserName, UserEmail, LastLogonDate) VALUES
('20201011', 1, 'Bob', NULL, '20201009 15:38'),
('20201012', 1, 'Bob', NULL, '20201009 15:38'),
('20201013', 1, 'Bob', 'Bob@gm.com', '20201012 09:15'),
('20201014', 1, 'Bob', 'Bob@gm.com', '20201013 19:02'),
('20201015', 1, 'Bob', 'Bob@gm.com', '20201013 19:02'),
('20201017', 1, 'Bob', 'Bob@gm.com', '20201013 19:02'),
('20201013', 2, 'Pat', 'Pat@hm.com', NULL),
('20201014', 2, 'Pat', 'Pat@hm.com', NULL),
('20201015', 2, 'Pat', 'Pat@hm.com', '20201014 20:55'),
('20201017', 2, 'Pat', 'Pat@hm.com', '20201016 13:22');
SELECT LogDate, UserID, UserName, UserEmail, LastLogonDate
FROM #UserLog
EXCEPT
SELECT LEAD(LogDate) OVER (PARTITION BY UserID ORDER BY LogDate), UserID, UserName, UserEmail, LastLogonDate
FROM #UserLog
ORDER BY UserID, LogDate;
In the 'EXCEPT' segment, it basically gets the data for each given row, then changes the date to the next date in sequence for that user e.g., it turns
('20201012', 1, 'Bob', NULL, '20201009 15:38'),
into
('20201013', 1, 'Bob', NULL, '20201009 15:38'),
As this is not the same as the actual row for Bob on the 13th, the row in the top part of the statement shows.
My initial test run of this simply had a DATEADD(day, 1, Logdate) in the EXCEPT portion, and that would show all rows that were different from yesterday's. However, the updated version above allows for breaks in the sequence (e.g., in the above, the logging failed on the 16th).
Here's a DB<>fiddle with the code above.
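For comparison, the earlier DATEADD variant mentioned above would look roughly like this. It is a sketch that assumes the same #UserLog table and sample rows:

```sql
-- Compare each row with the row dated exactly one day earlier.
SELECT LogDate, UserID, UserName, UserEmail, LastLogonDate
FROM #UserLog
EXCEPT
-- Shift every logged row forward by one calendar day.
SELECT DATEADD(day, 1, LogDate), UserID, UserName, UserEmail, LastLogonDate
FROM #UserLog
ORDER BY UserID, LogDate;
```

With the sample data this variant also returns Bob's row for the 17th, even though nothing changed since the 15th, because there is no row for the 16th to compare against. The LEAD version avoids that by comparing against the previous logged row, whatever its date.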
UPDATE - data posted in comment in another answer.
Here's a version with that data.
CREATE TABLE #tLog (LogDate date, v_1 int, v_2 varchar(100), v_3 int, v_4 varchar(10), v_5 int, v_6 varchar(10));
INSERT INTO #tLog (Logdate, v_1, v_2, v_3, v_4, v_5, v_6) VALUES
('20200101', 100, 'test_1', 0, '123', 120, 'JJ'),
('20200102', 100, 'test_1', 0, '123', 120, 'JJ'),
('20200103', 100, 'test_1', 0, '123', 120, 'JJ'),
('20200104', 101, 'test_1', 1, '123', 120, 'JJ'),
('20200105', 100, 'test_1', 0, '123', 120, 'JJ'),
('20200106', 101, 'test_1', 1, '12345', 120, 'JJ'),
('20200107', 101, 'test_1', 1, '12345', 120, 'JJ'),
('20200108', 101, 'test_2', 2, '12345', 200, 'JJ'),
('20200109', 101, 'test_1', 1, '12345', 120, 'TT'),
('20200110', 100, 'test_1', 0, '123', 120, 'JJ');
SELECT LogDate, v_1, v_2, v_3, v_4, v_5, v_6
FROM #tLog
EXCEPT
SELECT LEAD(LogDate) OVER (ORDER BY LogDate), v_1, v_2, v_3, v_4, v_5, v_6
FROM #tLog
ORDER BY LogDate;
And here's a copy of the results of the above. Note that only on the 2nd, 3rd and 7th did the data not change from the previous day.
LogDate     v_1  v_2     v_3  v_4    v_5  v_6
----------  ---  ------  ---  -----  ---  ---
2020-01-01  100  test_1  0    123    120  JJ
2020-01-04  101  test_1  1    123    120  JJ
2020-01-05  100  test_1  0    123    120  JJ
2020-01-06  101  test_1  1    12345  120  JJ
2020-01-08  101  test_2  2    12345  200  JJ
2020-01-09  101  test_1  1    12345  120  TT
2020-01-10  100  test_1  0    123    120  JJ
Note that I have removed the 'PARTITION BY' in the LEAD as there are no real partitions - it's just one row after the next. However there's a distinct chance you may need this when it comes to actual data.
Here's a DB<>fiddle with both the original and this cut-down one with the OP's data.
The Problem
I'm trying to detect and react to changes in a table where each update is being recorded as a new row with some values being the same as the original, some changed (the ones I want to detect) and some NULL values (not considered changed).
For example, given the following table MyData, and assuming the OrderNumber is the common value,
ID  OrderNumber  CustomerName  PartNumber  Qty  Price  OrderDate
1   123          Acme Corp.    WG301       4    15.02  2020-01-02
2   456          Base Inc.     AL337       7    20.15  2020-02-03
3   123          NULL          WG301b      5    19.57  2020-01-02
If I execute the query for OrderNumber = 123 I would like the following data returned:
Column      OldValue  NewValue
ID          1         3
PartNumber  WG301     WG301b
Qty         4         5
Price       15.02     19.57
Or possibly a single row result with only the changes filled, like this (however, I would strongly prefer the former format):
ID  OrderNumber  CustomerName  PartNumber  Qty  Price  OrderDate
3   NULL         NULL          WG301b      5    19.57  NULL
My Solution
I have not had a chance to test this, but I was considering writing the query with the following approach (pseudo-code):
select
NewOrNull(last.ID, prev.ID) as ID,
NewOrNull(last.OrderNumber, prev.OrderNumber) as OrderNumber,
NewOrNull(last.CustomerName, prev.CustomerName) as CustomerName,
...
from last row with OrderNumber = 123
join previous row where OrderNumber = 123
Where the function NewOrNull(lastVal, prevVal) returns NULL if the values are equal or lastVal value is NULL, otherwise the lastVal.
Why I'm Looking for an Answer
I'm afraid that the ugly join, the number of calls to the function, and the procedural approach may make this approach not scalable. Before I start down the rabbit hole, I was wondering...
The Question
...are there any other approaches I should try, or any best practices to solving this specific type of problem?
I came up with a solution for the second (less preferred) format:
The Data
Using the following data:
INSERT INTO MyData
([ID], [OrderNumber], [CustomerName], [PartNumber], [Qty], [Price], [OrderDate])
VALUES
(1, 123, 'Acme Corp.', 'WG301', '4', '15.02', '2020-01-02'),
(2, 456, 'Base Inc.', 'AL337', '7', '20.15', '2020-02-03'),
(3, 123, NULL, 'WG301b', '5', '19.57', '2020-01-02'),
(4, 123, 'ACME Corp.', 'WG301b', NULL, NULL, '2020-01-02'),
(6, 456, 'Base Inc.', NULL, '7', '20.15', '2020-02-05');
The Function
This function returns the updated value if it has changed, otherwise NULL:
CREATE FUNCTION dbo.NewOrNull
(
    @newValue sql_variant,
    @oldValue sql_variant
)
RETURNS sql_variant
AS
BEGIN
    DECLARE @ret sql_variant
    SELECT @ret = CASE
        WHEN @newValue IS NULL THEN NULL
        WHEN @oldValue IS NULL THEN @newValue
        WHEN @newValue = @oldValue THEN NULL
        ELSE @newValue
    END
    RETURN @ret
END;
The Query
This query returns the history of changes for the given order number:
select dbo.NewOrNull(new.ID, old.ID) as ID,
dbo.NewOrNull(new.OrderNumber, old.OrderNumber) as OrderNumber,
dbo.NewOrNull(new.CustomerName, old.CustomerName) as CustomerName,
dbo.NewOrNull(new.PartNumber, old.PartNumber) as PartNumber,
dbo.NewOrNull(new.Qty, old.Qty) as Qty,
dbo.NewOrNull(new.Price, old.Price) as Price,
dbo.NewOrNull(new.OrderDate, old.OrderDate) as OrderDate
from MyData new
left join MyData old
on old.ID = (
select top 1 ID
from MyData pre
where pre.OrderNumber = new.OrderNumber
and pre.ID < new.ID
order by pre.ID desc
)
where new.OrderNumber = 123
The Result
ID  OrderNumber  CustomerName  PartNumber  Qty     Price   OrderDate
1   123          Acme Corp.    WG301       4       15.02   2020-01-02
3   (null)       (null)        WG301b      5       19.57   (null)
4   (null)       ACME Corp.    (null)      (null)  (null)  (null)
The Fiddle
Here's the SQL Fiddle that shows the whole thing in action.
http://sqlfiddle.com/#!18/b720f/5/0
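For the first (preferred) format, one possible approach is to pair the latest row with the one before it and unpivot both with CROSS APPLY (VALUES ...). This is an untested sketch, not part of the fiddle above. It assumes the MyData table from the question, and the names Ordered, rn, cur, prev, v and ColumnName are made up for illustration:

```sql
-- Sketch only: compares the latest row for one order with its predecessor
-- and returns one result row per changed column.
WITH Ordered AS (
    SELECT ID, CustomerName, PartNumber, Qty, Price, OrderDate,
           ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn   -- 1 = latest row
    FROM MyData
    WHERE OrderNumber = 123
)
SELECT v.ColumnName, v.OldValue, v.NewValue
FROM Ordered AS cur
JOIN Ordered AS prev
    ON prev.rn = cur.rn + 1                             -- the row recorded just before cur
CROSS APPLY (VALUES
    -- Cast everything to one type so old and new values share a column.
    ('ID',           CAST(prev.ID AS varchar(100)),           CAST(cur.ID AS varchar(100))),
    ('CustomerName', CAST(prev.CustomerName AS varchar(100)), CAST(cur.CustomerName AS varchar(100))),
    ('PartNumber',   CAST(prev.PartNumber AS varchar(100)),   CAST(cur.PartNumber AS varchar(100))),
    ('Qty',          CAST(prev.Qty AS varchar(100)),          CAST(cur.Qty AS varchar(100))),
    ('Price',        CAST(prev.Price AS varchar(100)),        CAST(cur.Price AS varchar(100))),
    ('OrderDate',    CAST(prev.OrderDate AS varchar(100)),    CAST(cur.OrderDate AS varchar(100)))
) AS v (ColumnName, OldValue, NewValue)
WHERE cur.rn = 1                                        -- only the latest row
  AND v.NewValue IS NOT NULL                            -- a NULL new value is not a change
  AND (v.OldValue IS NULL OR v.OldValue <> v.NewValue);
```

The string comparison follows the database collation, so under a case-insensitive collation a change such as 'Acme Corp.' to 'ACME Corp.' would not be reported.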
I have two tables named EmployeeInfo and Leave, in which I store which employee has taken which type of leave in a month, and how many times.
I am trying to calculate the number of leaves of the same type, but I have been stuck at one point for a long time.
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('Leave'))
BEGIN;
DROP TABLE [Leave];
END;
GO
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('EmployeeInfo'))
BEGIN;
DROP TABLE [EmployeeInfo];
END;
GO
CREATE TABLE [EmployeeInfo] (
[EmpID] INT NOT NULL PRIMARY KEY,
[EmployeeName] VARCHAR(255)
);
CREATE TABLE [Leave] (
[LeaveID] INT NOT NULL PRIMARY KEY,
[LeaveType] VARCHAR(255) NULL,
[DateFrom] VARCHAR(255),
[DateTo] VARCHAR(255),
[Approved] Binary,
[EmpID] INT FOREIGN KEY REFERENCES EmployeeInfo(EmpID)
);
GO
INSERT INTO EmployeeInfo([EmpID], [EmployeeName]) VALUES
(1, 'Marcia'),
(2, 'Lacey'),
(3, 'Fay'),
(4, 'Mohammad'),
(5, 'Mike')
INSERT INTO Leave([LeaveID],[LeaveType],[DateFrom],[DateTo], [Approved], [EmpID]) VALUES
(1, 'Annual Leave','2018-01-08 04:52:03','2018-01-10 20:30:53', 1, 1),
(2, 'Sick Leave','2018-02-10 03:34:41','2018-02-14 04:52:14', 0, 2),
(3, 'Casual Leave','2018-01-04 11:06:18','2018-01-05 04:11:00', 1, 3),
(4, 'Annual Leave','2018-01-17 17:09:34','2018-01-21 14:30:44', 0, 4),
(5, 'Casual Leave','2018-01-09 23:31:16','2018-01-12 15:11:17', 1, 3),
(6, 'Annual Leave','2018-02-16 18:01:03','2018-02-19 17:16:04', 1, 2)
The query I have tried so far looks something like this:
SELECT Info.EmployeeName, Leave.LeaveType, SUM(DATEDIFF(Day, Leave.DateFrom, Leave.DateTo)) [#OfLeaves], DatePart(MONTH, Leave.DateFrom)
FROM EmployeeInfo Info, Leave
WHERE Info.EmpID = Leave.EmpID AND Approved = 1
GROUP BY Info.EmployeeName, Leave.LeaveType, [Leave].[DateFrom], [Leave].[DateTo]
It returns the following records:
EmployeeName    LeaveType          #OfLeaves    MonthNumber
--------------  -----------------  -----------  -----------
Fay             Casual Leave       1            1
Fay             Casual Leave       3            1
Lacey           Annual Leave       3            2
Marcia          Annual Leave       2            1
I want the record to look like this
EmployeeName    LeaveType          #OfLeaves    MonthNumber
--------------  -----------------  -----------  -----------
Fay             Casual Leave       4            1
Lacey           Annual Leave       3            2
Marcia          Annual Leave       2            1
If you don't want to modify the existing query due to some constraint, this might work:
Select iq.EmployeeName, iq.LeaveType, SUM(iq.[#OfLeaves]) as [#OfLeaves], iq.MonthNumber
From (
SELECT Info.EmployeeName, Leave.LeaveType, SUM(DATEDIFF(Day, Leave.DateFrom, Leave.DateTo)) [#OfLeaves], DatePart(MONTH, Leave.DateFrom) as MonthNumber
FROM EmployeeInfo Info, Leave
WHERE Info.EmpID = Leave.EmpID AND Approved = 1
GROUP BY Info.EmployeeName, Leave.LeaveType, [Leave].[DateFrom], [Leave].[DateTo]
)iq
group by iq.EmployeeName, iq.LeaveType, iq.MonthNumber
This just needs a small adjustment to the GROUP BY clause of your query. Instead of grouping by [Leave].[DateFrom] and [Leave].[DateTo], which causes the rows to be separated, you need to group by the calculated column that uses DATEPART.
SELECT Info.EmployeeName,
Leave.LeaveType,
SUM(DATEDIFF(Day, Leave.DateFrom, Leave.DateTo)) [#OfLeaves],
DatePart(MONTH, Leave.DateFrom)
FROM EmployeeInfo Info
INNER JOIN Leave
ON Info.EmpID = Leave.EmpID
WHERE Approved = 1
GROUP BY Info.EmployeeName,
Leave.LeaveType,
DatePart(MONTH, Leave.DateFrom) -- <<<< change only this part
Here's a Demo.
I have also changed the join to ANSI JOIN syntax.
I need to run some query against each rowset in a table (Azure SQL):
ID  CustomerID  MsgTimestamp         Msg
--------------------------------------------------------------
1   123         2017-01-01 10:00:00  Hello
2   123         2017-01-01 10:01:00  Hello again
3   123         2017-01-01 10:02:00  Can you help me with my order
4   123         2017-01-01 11:00:00  Are you still there
5   456         2017-01-01 10:07:00  Hey I'm a new customer
What I want to do is to extract "chat session" for every customer from message records, that is, if the gap between someone's two consecutive messages is less than 30 minutes, they belong to the same session. I need to record the start and end time of each session in a new table. In the example above, start and end time of the first session for customer 123 are 10:00 and 10:02.
I know I can always use cursor and temp table to achieve that goal, but I'm thinking about utilizing any pre-built mechanism to reach better performance. Please kindly give me some input.
You can use window functions instead of a cursor. Something like this should work:
declare @t table (ID int, CustomerID int, MsgTimestamp datetime2(0), Msg nvarchar(100))
insert @t values
(1, 123, '2017-01-01 10:00:00', 'Hello'),
(2, 123, '2017-01-01 10:01:00', 'Hello again'),
(3, 123, '2017-01-01 10:02:00', 'Can you help me with my order'),
(4, 123, '2017-01-01 11:00:00', 'Are you still there'),
(5, 456, '2017-01-01 10:07:00', 'Hey I''m a new customer')
;with x as (
    -- g = 1 when the gap since this customer's previous message exceeds 30 minutes (a new session starts)
    select *, case when datediff(minute, lag(msgtimestamp, 1, '19000101') over(partition by customerid order by msgtimestamp), msgtimestamp) > 30 then 1 else 0 end as g
    from @t
),
y as (
    -- running total of session starts per customer; partitioned so other customers' messages cannot split a session
    select *, sum(g) over(partition by customerid order by msgtimestamp) as gg
    from x
)
select customerid, min(msgtimestamp) as SessionStart, max(msgtimestamp) as SessionEnd
from y
group by customerid, gg
I'm having some difficulty understanding the best approach to get the following result set.
I have a result set (thousands of rows) that I want to update from:
ID   Question  Answer
---  --------  --------
1    Business  NULL
1    Job       Other
1    Location  UK
2    Business  Legal
3    Location  US
4    Location  UK
To this:
ID   Business  Job    Location
---  --------  -----  --------
1    NULL      Other  UK
2    Legal     NULL   NULL
3    NULL      NULL   US
4    NULL      NULL   UK
I have been looking at SELF JOINS and PIVOT tables but wanted to understand the best method as I have not been able to achieve the desired output.
Thanks
Gary
If you want to use pivot, you can do it like this:
CREATE TABLE #Table1
([ID] int, [Question] varchar(8), [Answer] varchar(5))
;
INSERT INTO #Table1
([ID], [Question], [Answer])
VALUES
(1, 'Business', NULL),
(1, 'Job', 'Other'),
(1, 'Location', 'UK'),
(2, 'Business', 'Legal'),
(3, 'Location', 'US'),
(4, 'Location', 'UK')
;
select * from
(select * from #Table1) S
pivot (
max(Answer) for Question in (Business, Job, Location)
) P
Alternatively, you can use conditional aggregation:
select
    id,
    max(case when question='Business' then answer end) as Business,
    max(case when question='Job' then answer end) as Job,
    max(case when question='Location' then answer end) as Location
from #Table1
group by id