Need help comparing difference between datetime stamps in SQL DB - sql-server

I have an MSSQL Server 2012 Express DB that logs user activities. I need some help creating a query that compares timestamps on those activities based on the text in the notes, so I can figure out how long it takes my users to perform certain activities. The activities are stored as text in a NOTES column. I want a query that gives me the time difference, for each [INVOICEID], between the ‘START NOTE’ and the note that immediately follows it, entered by the same USERID for the same INVOICEID. The note that starts the timer always has the same text (I used ‘START NOTE’ here; I have multiple activities to do this for, so I plan to simply change that text in the query), but the note that ends the timer will vary because it is user-entered data. Please see the SQL Fiddle for an example of my data:
http://sqlfiddle.com/#!3/a00d7/1
With the data in the sql fiddle I would want the results of this query to be:
INVOICEID  USERID  TIME_Difference
100        5       1 day
101        5       3 days
102        5       9 days
(time_difference does not need to be formatted like that, standard SQL formatting is fine)
I don’t really know where to start with this. Please let me know if you can help.
Thanks

select a.userid, a.invoiceid, min(a.added) as start_added, min(b.added) as next_added,
       datediff(DAY, min(a.added), min(b.added)) as time_difference
from om_note a
left join om_note b on a.userid = b.userid and a.invoiceid = b.invoiceid and a.added < b.added
where a.notes = 'START NOTE'
group by a.userid, a.invoiceid

;with x as (
select
o.*, sum(case when notes='START NOTE' then 1 else 0 end)
over(partition by o.invoiceid, o.userid order by o.added) as grp
from om_note o
),
y as (
select *,
row_number() over(partition by x.invoiceid, x.userid, x.grp order by x.added) as rn
from x
where grp > 0
)
select y1.invoiceid, y1.userid, datediff(hour, y1.added, y2.added)
from y y1
inner join y y2
on y1.invoiceid=y2.invoiceid and y1.userid=y2.userid and y1.grp=y2.grp
where y1.rn=1 and y2.rn=2
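
Another way to read "the note that immediately follows START NOTE" is a LEAD() window function, which SQL Server 2012 also supports. Below is a minimal sketch of that idea, run through Python's sqlite3 module so it is self-contained. The sample rows are made up (they are not the fiddle's data), and julianday() stands in for SQL Server's DATEDIFF(day, ...); it also assumes a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows (invoiceid, userid, added, notes); not the fiddle's data.
ROWS = [
    (100, 5, "2014-01-01 10:00:00", "START NOTE"),
    (100, 5, "2014-01-02 10:00:00", "called customer"),
    (100, 5, "2014-01-05 10:00:00", "closed"),
    (101, 5, "2014-01-01 09:00:00", "START NOTE"),
    (101, 7, "2014-01-02 09:00:00", "note by another user"),
    (101, 5, "2014-01-04 09:00:00", "invoice corrected"),
]

# LEAD() fetches the timestamp of the next note by the same user on the same invoice.
# The START NOTE filter is applied outside the CTE, after LEAD() has seen every row.
QUERY = """
WITH nxt AS (
    SELECT invoiceid, userid, notes, added,
           LEAD(added) OVER (PARTITION BY invoiceid, userid ORDER BY added) AS next_added
    FROM om_note
)
SELECT invoiceid, userid,
       CAST(ROUND(julianday(next_added) - julianday(added)) AS INTEGER) AS days_elapsed
FROM nxt
WHERE notes = 'START NOTE' AND next_added IS NOT NULL
ORDER BY invoiceid, userid
"""


def time_to_next_note(rows):
    """Return (invoiceid, userid, days_elapsed) for each START NOTE that has a follow-up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE om_note (invoiceid INT, userid INT, added TEXT, notes TEXT)")
    conn.executemany("INSERT INTO om_note VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Note that the note by user 7 on invoice 101 is skipped, because the partition is per invoice and per user.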

Related

trying to break down results of SQL query to show data for each month

I'm very new to SQL and have a problem I can't figure out.
I'm trying to replace an Excel spreadsheet with a Power BI report. Currently our team runs the following query to get the number of active users every month and types it into an Excel sheet, which then graphs the number of users each month to show the increase. Since I don't want to input data manually each month, my goal is to break this query down so that it gives the cumulative number of users as of each month.
Desired result would look something like this
dateCreated # of Users
----------------------
2008-10 295
2008-11 355
2008-12 470
2009-01 522
I was able to break it down enough to give me the amount created each month, but that doesn't give me the total amount each month. This is the query that I used and a sample of the results I got.
SELECT
FORMAT(USERADDR.DateCreated, 'yyyy-MM') AS 'dateCreated',
COUNT(s.UserId) AS "# of Users"
FROM
ER.dbo.ssUser s,
ER.dbo.ssUserAddress USERADDR,
ER.dbo.ssAddress ADDRESS
WHERE
s.UserId = USERADDR.UserId
AND USERADDR.AddressId = ADDRESS.AddressId
AND Isdefault = 1
AND Type = 'soldto'
GROUP BY
FORMAT(USERADDR.DateCreated, 'yyyy-MM')
result sample:
dateCreated # of Users
2008-10 295
2008-11 41
2008-12 22
2009-01 19
This is almost there, but I need a running total. I've tried a lot of different things including SUM, SUM OVER, COUNT OVER etc. My boss suggested a while loop. I can't get that to work either and everything I've read says that should be the last resort. Here is one example of my failed attempts
SELECT
FORMAT(USERADDR.DateCreated, 'yyyy-MM') as 'dateCreated',
COUNT(s.UserId)
OVER(
PARTITION BY Month(USERADDR.DateCreated)
GROUP BY FORMAT(USERADDR.DateCreated, 'yyyy-MM')
)
AS "# of Users"
FROM
ER.dbo.User s,
ER.dbo.UserAddress USERADDR,
ER.dbo.Address ADDRESS
WHERE
s.UserId = USERADDR.UserId
AND USERADDR.AddressId = ADDRESS.AddressId
AND Isdefault = 1
AND Type = 'soldto'
--original query which gives total number of users right now.
SELECT
count(s.UserId) AS "# of Users"
FROM
ER.dbo.User s,
ER.dbo.UserAddress USERADDR,
ER.dbo.Address ADDRESS
WHERE
s.UserId = USERADDR.UserId
AND USERADDR.AddressId = ADDRESS.AddressId
AND Isdefault = 1
AND Type = 'soldto'
You can do a window sum() on the aggregated count of users per month, like so:
SELECT
FORMAT(USERADDR.DateCreated, 'yyyy-MM') [dateCreated],
SUM(COUNT(s.UserId)) OVER(ORDER BY FORMAT(USERADDR.DateCreated, 'yyyy-MM')) [# of Users]
FROM
ER.dbo.ssUser s
INNER JOIN ER.dbo.ssUserAddress USERADDR
ON s.UserId = USERADDR.UserId
INNER JOIN ER.dbo.ssAddress ADDRESS
ON USERADDR.AddressId = ADDRESS.AddressId
WHERE Isdefault = 1 AND Type = 'soldto'
group by FORMAT(USERADDR.DateCreated, 'yyyy-MM')
Notes:
always prefer proper, explicit join syntax (with the ON keyword) over implicit, old-school comma joins, which have been superseded since ANSI SQL-92 - I modified your query accordingly
SQL Server uses square brackets to quote identifiers - you should avoid single quotes for aliases, as they are generally used for string literals
you have unqualified column names in the WHERE clause (Isdefault, Type): always qualify column names in your query, so it is easy to tell which table they belong to
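
To see why a window sum over the aggregated count gives a running total, it can help to spell the computation out in two steps: first count per month, then accumulate. The following is only a sketch, run against SQLite through Python so it is self-contained; the signups table and its rows are hypothetical, and strftime() stands in for SQL Server's FORMAT(). It assumes a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical creation dates, one row per user.
SIGNUPS = [
    ("2008-10-03",), ("2008-10-20",), ("2008-10-28",),
    ("2008-11-02",),
    ("2008-12-15",), ("2008-12-30",),
]

# Step 1 (CTE): users created per month. Step 2: running sum of those counts.
QUERY = """
WITH monthly AS (
    SELECT strftime('%Y-%m', date_created) AS month, COUNT(*) AS new_users
    FROM signups
    GROUP BY strftime('%Y-%m', date_created)
)
SELECT month, new_users, SUM(new_users) OVER (ORDER BY month) AS total_users
FROM monthly
ORDER BY month
"""


def running_totals(rows):
    """Return (month, new_users, total_users) rows, oldest month first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE signups (date_created TEXT)")
    conn.executemany("INSERT INTO signups VALUES (?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

With the sample rows, the monthly counts 3, 1, 2 accumulate to 3, 4, 6.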

Selecting Different rows if the values of one column are equal

I am trying to write a SQL query that selects the top 4 from a random query so I can do quality checks on the certain cases. Each case has an account number tied to a client. The problem is that each case has a unique number but may have the same account number.
What I am looking to do is if the account number is the same on two cases to have the SQL select a new row with a different account number.
Select Top 4
Account,
CaseNum
From dbo.tblRequest
Where LoggedDate Between GetDate() - 7 and GetDate() - 1
Order By NewId();
The Results will display 4 accounts but at times it is possible that the same account is displayed twice. As stated I want to only display distinct accounts for a 7 day period.
I have tried the DISTINCT keyword, and it still displays the same account twice in some query results.
Try the following statement. It uses ROW_NUMBER() to keep only one row per account:
SELECT TOP 4 t.Account, t.CaseNum FROM (
Select
Account,
CaseNum,
ROW_NUMBER()OVER(PARTITION BY Account ORDER BY GETDATE()) AS rn
From dbo.tblRequest
Where LoggedDate Between GetDate() - 7 and GetDate() - 1
) AS t WHERE t.rn=1
Order By
NewId()
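
The same pattern (number the rows within each account, keep row 1, then take a random handful) can be tried out without a SQL Server instance. This is a rough sketch against SQLite via Python: random() and LIMIT stand in for NEWID() and TOP, the case data is invented, and the LoggedDate filter is left out for brevity. It assumes a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical (account, casenum) pairs; several accounts have more than one case.
CASES = [
    ("A1", 1001), ("A1", 1002), ("A2", 1003), ("A3", 1004), ("A3", 1005),
    ("A4", 1006), ("A5", 1007), ("A5", 1008), ("A6", 1009),
]

# Inner query: pick one random case per account (rn = 1).
# Outer query: shuffle the surviving rows and keep the first n.
QUERY = """
SELECT account, casenum
FROM (
    SELECT account, casenum,
           ROW_NUMBER() OVER (PARTITION BY account ORDER BY random()) AS rn
    FROM tblRequest
) AS t
WHERE t.rn = 1
ORDER BY random()
LIMIT ?
"""


def sample_distinct_accounts(cases, n=4):
    """Return up to n random cases, no two of which share an account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblRequest (account TEXT, casenum INT)")
    conn.executemany("INSERT INTO tblRequest VALUES (?, ?)", cases)
    result = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return result
```

Ordering the window by a random value rather than a constant also means the case chosen for each account varies from run to run.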

Querying a running-percentage over a date range from MSSQL?

I want to graph the % of users over time that have their Twitter account connected. The number of users changes constantly, and so does the % of them that connect their Twitter account.
The table has a user account specific createDateTime column as well as a tw_connectDateTime column.
Let's say I'm interested in the trend of % connected over the last 7 days. Is there a way I can have MSSQL calculate the percentage for every day in the specified range, or do I need to do it myself using multiple queries?
Doing it in app logic would look something like (pseudocode):
for day in days:
query:
select
count(userId) as totalUsers
,c.connected
,cast(c.connected as float)/count(userId) as percentage
from
Users
outer apply (
select
count(userId) as connected
from
Users
where
tw_connectDateTime <= $day
) as c
where
createDateTime <= $day
group by
c.connected
What I'm unsure of is how, if it's possible, to expand this to run for each day, so that the results include the date as a column and the same values that I would get from running the above query for each date in the range.
Is it possible? If so, how?
Actually, you can use your query joined with a list of days, like this:
with cte_days as (
select @DateStart as [day]
union all
select dateadd(dd, 1, c.[day]) as [day]
from cte_days as c
where c.[day] < @DateEnd
)
select
d.[day],
count(u.userId) as totalUsers,
c.connected,
cast(c.connected as float)/count(u.userId) as percentage
from cte_days as d
inner join Users as u on u.createDateTime <= d.[day]
outer apply (
select
count(T.userId) as connected
from Users as T
where T.tw_connectDateTime <= d.[day]
) as c
group by d.[day], c.connected
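
For anyone who wants to experiment with the "generate the days, then join" idea, here is a small sketch against SQLite via Python. It is a variation rather than the query above: it swaps the OUTER APPLY for conditional aggregation (a SUM over a CASE expression), so Users is only joined once per day. The table contents are invented, and the columns hold date-only strings (createDate, tw_connectDate) to keep the comparisons simple.

```python
import sqlite3

# Hypothetical users: (userId, createDate, tw_connectDate or None if never connected).
USERS = [
    (1, "2024-01-01", "2024-01-02"),
    (2, "2024-01-01", None),
    (3, "2024-01-02", "2024-01-03"),
    (4, "2024-01-03", None),
]

# The recursive CTE yields one row per day from :start to :end inclusive.
# Each day is joined to every user created on or before it.
QUERY = """
WITH RECURSIVE days(day) AS (
    SELECT :start
    UNION ALL
    SELECT date(day, '+1 day') FROM days WHERE day < :end
)
SELECT d.day,
       COUNT(u.userId) AS totalUsers,
       SUM(CASE WHEN u.tw_connectDate <= d.day THEN 1 ELSE 0 END) AS connected,
       1.0 * SUM(CASE WHEN u.tw_connectDate <= d.day THEN 1 ELSE 0 END)
           / COUNT(u.userId) AS percentage
FROM days AS d
JOIN Users AS u ON u.createDate <= d.day
GROUP BY d.day
ORDER BY d.day
"""


def connected_share_by_day(users, start, end):
    """Return (day, totalUsers, connected, percentage) for each day in the range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (userId INT, createDate TEXT, tw_connectDate TEXT)")
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", users)
    result = conn.execute(QUERY, {"start": start, "end": end}).fetchall()
    conn.close()
    return result
```

As with the inner join in the answer, a day on which no users exist yet produces no row at all.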

Filtering a complex SQL Query

Unit - hmy, scode, hProperty
InsurancePolicy - hmy, hUnit, dtEffective, sStatus
Select MAX(i2.dtEffective) as maxdate, u.hMy, MAX(i2.hmy) as InsuranceId,
i2.sStatus
from unit u
left join InsurancePolicy i2 on i2.hUnit = u.hMy
and i2.sStatus in ('Active', 'Cancelled', 'Expired')
where u.hProperty = 2
Group By u.hmy, i2.sStatus
order by u.hmy
This query will return values for the Insurance Policy with the latest Effective Date (Max(dtEffective)). I added Max(i2.hmy) so if there was more than one Insurance Policy for the latest Effective Date, it will return the one with the highest ID (i2.hmy) in the database.
Suppose there was a Unit that had 3 Insurance Policies attached with the same latest effective date and all have different sStatus'.
The result would look like this:
maxdate UnitID InsuranceID sStatus
1/23/12 2949 1938 'Active'
1/23/12 2949 2343 'Cancelled'
1/23/12 2949 4323 'Expired'
How do I filter the results so that, if there are multiple Insurance Policies with different statuses for the same unit and same date, we choose the policy with the 'Active' status first; if one doesn't exist, choose 'Cancelled'; and if that doesn't exist, choose 'Expired'?
This seems to be a matter of proper ranking of InsurancePolicy's rows and then joining Unit to the set of the former's top-ranked rows:
;
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (
PARTITION BY hUnit
ORDER BY dtEffective DESC, sStatus, hmy DESC
)
FROM InsurancePolicy
)
SELECT
i2.dtEffective AS maxdate,
u.hMy,
i2.hmy AS InsuranceId,
i2.sStatus
FROM Unit u
LEFT JOIN ranked i2 ON i2.hUnit = u.hMy AND i2.rnk = 1
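
One thing worth noticing: ordering by sStatus gives the requested priority only because 'Active', 'Cancelled' and 'Expired' happen to sort alphabetically in that order. A hedged variation is to make the priority explicit with a CASE expression inside the ORDER BY. The sketch below runs that idea against SQLite via Python with invented policy rows; the ELSE 4 branch for unknown statuses is an assumption, and the join back to Unit is omitted.

```python
import sqlite3

# Hypothetical policies: (hmy, hUnit, dtEffective, sStatus).
POLICIES = [
    (1938, 2949, "2012-01-23", "Active"),
    (2343, 2949, "2012-01-23", "Cancelled"),
    (4323, 2949, "2012-01-23", "Expired"),
    (5000, 3000, "2012-01-23", "Expired"),
    (5001, 3000, "2012-01-23", "Cancelled"),
    (5002, 3000, "2011-06-01", "Active"),  # older date, so it must not win
]

# Rank within each unit: latest date first, then explicit status priority, then highest id.
QUERY = """
WITH ranked AS (
    SELECT hmy, hUnit, dtEffective, sStatus,
           ROW_NUMBER() OVER (
               PARTITION BY hUnit
               ORDER BY dtEffective DESC,
                        CASE sStatus WHEN 'Active' THEN 1
                                     WHEN 'Cancelled' THEN 2
                                     WHEN 'Expired' THEN 3
                                     ELSE 4 END,
                        hmy DESC
           ) AS rnk
    FROM InsurancePolicy
)
SELECT hUnit, dtEffective, hmy, sStatus
FROM ranked
WHERE rnk = 1
ORDER BY hUnit
"""


def top_policy_per_unit(policies):
    """Return one (hUnit, dtEffective, hmy, sStatus) row per unit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE InsurancePolicy (hmy INT, hUnit INT, dtEffective TEXT, sStatus TEXT)"
    )
    conn.executemany("INSERT INTO InsurancePolicy VALUES (?, ?, ?, ?)", policies)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

The CASE expression is plain SQL, so the same ORDER BY should carry over to the T-SQL query unchanged.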
You could make this work with one SQL statement but it will be nearly unreadable to your everyday t-sql developer. I would suggest breaking this query up into a few steps.
First, I would declare a table variable and place all the records that require no manipulation into this table (ie - Units that do not have multiple statuses for the same date = good records).
Then, get a list of your records that need work done on them (multiple statuses on the same date for the same UnitID) and place them in a table variable. I would create a "rank" column within this table variable using a case statement as illustrated here:
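Pseudocode: CASE sStatus WHEN 'Active' THEN 1 WHEN 'Cancelled' THEN 2 WHEN 'Expired' THEN 3 END
Then delete the rank 2 and rank 3 records wherever a rank 1 record exists for the same unit and date.
Then delete the rank 3 records wherever a rank 2 record exists for the same unit and date.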
Finally, merge this updated table variable with your table variable containing your "good" records.
It is easy to get sucked into trying to do too much within one SQL statement. Break up the tasks to make it easier for you to develop and more manageable in the future. If you have to edit this SQL in a few years time you will be thanking yourself, not to mention any other developers that may have to take over your code.

Count number of 'overlapping' rows in SQL Server

I've been asked to look at a database that records user login and logout activity - there's a column for login time and then another column to record logout, both in OLE format. I need to pull together some information about user concurrency - i.e. how many users were logged in at the same time each day.
Does anyone know how to do this in SQL? I don't really need to know the detail, just the count per day.
Thanks in advance.
Easiest way is to make a times_table from an auxiliary numbers table (by adding from 0 to 24 * 60 minutes to the base time) to get every time in a certain 24-hour period:
SELECT MAX(simul) FROM (
SELECT test_time
,COUNT(*) AS simul
FROM your_login_table
INNER JOIN times_table -- a table/view/subquery of all times during the day
ON your_login_table.login_time <= times_table.test_time AND times_table.test_time <= your_login_table.logout_time
GROUP BY test_time
) AS simul_users (test_time, simul)
I think this will work.
Select C.Day, Max(C.Concurrency) as MostConcurrentUsersByDay
FROM
(
SELECT convert(varchar(10),L1.StartTime,101) as day, count(*) as Concurrency
FROM login_table L1
INNER JOIN login_table L2
ON (L2.StartTime>=L1.StartTime AND L2.StartTime<=L1.EndTime) OR
(L2.EndTime>=L1.StartTime AND L2.EndTime<=L1.EndTime)
WHERE (L1.EndTime IS NOT NULL) AND (L2.EndTime IS NOT NULL) AND (L1.ID <> L2.ID)
GROUP BY convert(varchar(10),L1.StartTime,101)
) as C
Group BY C.Day
Unchecked, but the idea is: strip the date portion from the values, count the rows whose time range contains the time of interest, and use "end of day" for users who are still logged in.
This assumes "logintime" is a date and a time. If not, the derived table can be removed (you still need ISNULL though). Of course, SQL Server 2008 has the "time" data type, which makes this easier too.
SELECT
COUNT(*)
FROM
(
SELECT
DATEADD(day, DATEDIFF(day, logintime, 0), logintime) AS inTimeOnly,
ISNULL(DATEADD(day, DATEDIFF(day, logouttime, 0), logouttime), '1900-01-01 23:59:59.997') AS outTimeOnly
FROM
mytable
) foo
WHERE
inTimeOnly <= @TheTimeOnly AND outTimeOnly >= @TheTimeOnly
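
If the peak number of simultaneous logins per day is all that is needed, one alternative to joining against a times table is a sweep over sorted login/logout events. The sketch below shows that technique in plain Python rather than SQL. It assumes the OLE timestamps have already been converted to datetime values, that every session has a logout time, and that days without any login or logout event can be left out of the result. The function name and sample sessions are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def peak_concurrency_by_day(sessions):
    """Map each calendar day to the highest number of simultaneously open sessions.

    sessions is an iterable of (login, logout) datetime pairs.
    """
    events = []
    for login, logout in sessions:
        events.append((login, 1))    # a session opens
        events.append((logout, -1))  # a session closes
    # At equal timestamps, logins are handled before logouts, so a session that
    # ends at the instant another begins counts as overlapping (inclusive bounds).
    events.sort(key=lambda e: (e[0], -e[1]))

    peaks = defaultdict(int)
    active = 0
    for moment, delta in events:
        before = active
        active += delta
        day = moment.date()
        # Record the count on both sides of the event, so sessions carried over
        # from the previous day are included when the first event is a logout.
        peaks[day] = max(peaks[day], before, active)
    return dict(peaks)


# Hypothetical sessions over two days.
SESSIONS = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 0)),
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 12, 0)),
    (datetime(2024, 1, 1, 10, 30), datetime(2024, 1, 1, 10, 45)),
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 10, 0)),
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 2, 11, 0)),
]
```

The sweep does a single pass over the sorted events, so it avoids comparing every session against every minute of the day.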
