Calculating criteria occurrence across results per row - sql-server

I am trying to determine on a per row basis, how many of these requests exist at a specific moment in time.
The date and time are formatted specifically to appear this way in the results; however, they are stored in the default yyyy-mm-dd and 00:00:00.000 formats respectively in the database.
Request Data:
ID | CDate | CTime | LDate | LTime
---------------------------------------------------------------
230700 | 13/07/2016 | 6:52am | 13/07/2016 | 7:21am
746970 | 13/07/2016 | 7:05am | 13/07/2016 | 7:10am
746971 | 13/07/2016 | 7:09am | 13/07/2016 | 7:09am
746972 | 13/07/2016 | 7:16am | 13/07/2016 | 7:27am
746973 | 13/07/2016 | 7:20am | 13/07/2016 | 7:29am
CTime refers to Issue Creation time, with LTime referring to the time the issue has been logged into by a user.
I wish to add a new column at the end of these results, based on the results of the entire query. The new column would count how many issues are visible at any given time. Issues are visible as soon as they are created, and disappear when a user logs into the request, creating an LTime entry.
In this example, we will use the 2nd row of data for ID: 746970. We can see that the creation time was 7:05am, however the issue wasn't logged into until 7:10am. At that login time, 2 other issues had already been created, however hadn't yet been logged into (230700 and 746971), with a creation time of 6:52am/7:21am and 7:09am/7:09am respectively. As such, the new column would report a value of 3 for number of issues visible at the time of logging in.
My thought process so far leads me to believe this would need a 2-3 part query, potentially storing results in a Temp Table. The first part of the query would obtain the results as they are shown above (already created). The second part of the query would determine on a 'per row' basis how many rows have a CTime less than each row's LTime. The 3rd query would then run another check on the results of the 2nd query to count the number of rows where the LTime of the current row is equal to or less than the LTime of other rows.
The results upon running this would appear as below. The bracketed text would not show in the results, merely included to show working.
New data:
ID | CDate | CTime | LDate | LTime | #Active
----------------------------------------------------------------------------
230700 | 13/07/2016 | 6:52am | 13/07/2016 | 7:21am | 3 (230700, 746972, 746973)
746970 | 13/07/2016 | 7:05am | 13/07/2016 | 7:10am | 2 (230700, 746970)
746971 | 13/07/2016 | 7:09am | 13/07/2016 | 7:09am | 3 (230700, 746970, 746971)
746972 | 13/07/2016 | 7:16am | 13/07/2016 | 7:27am | 2 (746972, 746973)
746973 | 13/07/2016 | 7:20am | 13/07/2016 | 7:29am | 1 (746973)
I'm at a loss on this one, I know the logic for it, but can't for the life of me put it into MS SQL code. Any assistance would be greatly appreciated.

The first thing I'm going to say is that you should consider altering your table such that the DATE and TIME fields are not separate. Unless you're running a whole bunch of queries that only care about time or only care about date and you have indexes built around this, you're better off using a single DATETIME column - it makes running queries that require both date and time so much easier.
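As a minimal sketch of what a combined value looks like, the separate DATE and TIME parts can be cast to DATETIME and added. The table variable @Requests and the column alias CDateTime are hypothetical names used only for illustration.

```sql
-- Hypothetical table variable; @Requests and CDateTime are illustrative names only.
DECLARE @Requests TABLE (
    ID    INT  PRIMARY KEY,
    CDate DATE NOT NULL,
    CTime TIME NOT NULL
);

INSERT INTO @Requests (ID, CDate, CTime)
VALUES (746970, '2016-07-13', '07:05:00');

-- Cast both parts to DATETIME before adding them:
-- the DATE becomes midnight on that day, and the TIME becomes an offset
-- from 1900-01-01 (DATETIME's zero point), so the sum is the combined value.
SELECT ID,
       CAST(CDate AS DATETIME) + CAST(CTime AS DATETIME) AS CDateTime
FROM @Requests;
```

On a real table, the same expression could be used to populate a single DATETIME column once, so that later queries compare one value instead of two.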
Moving on to what I believe is the solution to your problem (assuming the criteria for "active" is correct)... The short answer is that you don't need any temporary tables or anything (my solution uses a CTE but it doesn't even need that):
DECLARE @ TABLE (ID INT PRIMARY KEY, CDate DATE NOT NULL, CTime TIME NOT NULL, LDate DATE NOT NULL, LTime TIME NOT NULL)
INSERT @ VALUES (230700, '2016-07-13', '06:52:00.000', '2016-07-13', '07:21:00.000')
, (746970, '2016-07-13', '07:05:00.000', '2016-07-13', '07:10:00.000')
, (746971, '2016-07-13', '07:09:00.000', '2016-07-13', '07:09:00.000')
, (746972, '2016-07-13', '07:16:00.000', '2016-07-13', '07:27:00.000')
, (746973, '2016-07-13', '07:20:00.000', '2016-07-13', '07:29:00.000');
WITH CTE AS (
SELECT ID
, CDate + CAST(CTime AS DATETIME) CDateTime
, LDate + CAST(LTime AS DATETIME) LDateTime
FROM @)
SELECT T.ID, T.CDateTime, T.LDateTime, Z.ActiveCount
FROM CTE T
OUTER APPLY (
SELECT CAST(COUNT(*) AS VARCHAR(255))
+ ' (' +
STUFF((SELECT ', ' + CAST(ID AS VARCHAR(255))
FROM CTE
WHERE CDateTime <= T.LDateTime
AND LDateTime >= T.LDateTime
ORDER BY ID
FOR XML PATH ('')), 1, 2, '') + ')'
-- COUNT(*) -- this just produces the number
FROM CTE
WHERE CDateTime <= T.LDateTime
AND LDateTime >= T.LDateTime) Z(ActiveCount)
If you need to keep the date and time fields separate, the statement can be rewritten as:
DECLARE @ TABLE (ID INT PRIMARY KEY, CDate DATE NOT NULL, CTime TIME NOT NULL, LDate DATE NOT NULL, LTime TIME NOT NULL)
INSERT @ VALUES (230700, '2016-07-13', '06:52:00.000', '2016-07-13', '07:21:00.000')
, (746970, '2016-07-13', '07:05:00.000', '2016-07-13', '07:10:00.000')
, (746971, '2016-07-13', '07:09:00.000', '2016-07-13', '07:09:00.000')
, (746972, '2016-07-13', '07:16:00.000', '2016-07-13', '07:27:00.000')
, (746973, '2016-07-13', '07:20:00.000', '2016-07-13', '07:29:00.000');
SELECT T.ID
, CONVERT(VARCHAR(255), T.CDate, 103) CDate
, CONVERT(VARCHAR(255), T.CTime, 100) CTime
, CONVERT(VARCHAR(255), T.LDate, 103) LDate
, CONVERT(VARCHAR(255), T.LTime, 100) LTime
, Z.ActiveCount
FROM @ T
OUTER APPLY (
SELECT COUNT(*) -- this just produces the number
FROM @
WHERE CDate + CAST(CTime AS DATETIME) <= T.LDate + CAST(T.LTime AS DATETIME)
AND LDate + CAST(LTime AS DATETIME) >= T.LDate + CAST(T.LTime AS DATETIME)) Z(ActiveCount)

Related

Get HH:MM:SS from a datetime in MSSQL

I have the following issue:
I have a datetime field, which contains entries like this: "1970-01-01 22:09:26.000"
I would like to extract only the 22:09:26 (hh:mm:ss) part, but I am unable to convert it into 24h format. I used FORMAT and CONVERT, but received the am/pm format (for CONVERT I tried the 13 style value).
What is the simplest way to construct the formula to give back the above mentioned format?
Thank you!
1st way
You can select the format you wish from https://www.mssqltips.com/sqlservertip/1145/date-and-time-conversions-using-sql-server/
select replace(convert(nvarchar(20), CAST('1970-01-01 22:09:26.000' AS datetime), 114),'-',':')
2nd way
It is not a conversion, but if your entries all have the same format then you can use the below:
select right('1970-01-01 22:09:26.000',12)
Updated, if you have null dates as well:
1.
select case when value is not null
then replace(convert(nvarchar(20), CAST(value AS datetime), 114),'-',':')
else null
end
2.
select case when value is not null then right(value,12)
else null end
To get just the time portion of a datetime you just need to cast or convert it into the appropriate data type. If you really want to be formatting your data right in the query, this is very possible with format and I am not sure what issues you were facing there:
declare @t table(d datetime);
insert into @t values(dateadd(minute,-90,getdate())),(dateadd(minute,-60,getdate())),(dateadd(minute,-30,getdate())),(dateadd(minute,90,getdate()));
select d
,cast(d as time) as TimeValue
,format(d,'HH:mm:ss') as FormattedTimeValue
from @t;
Output
+-------------------------+------------------+--------------------+
| d | TimeValue | FormattedTimeValue |
+-------------------------+------------------+--------------------+
| 2020-08-10 11:51:15.560 | 11:51:15.5600000 | 11:51:15 |
| 2020-08-10 12:21:15.560 | 12:21:15.5600000 | 12:21:15 |
| 2020-08-10 12:51:15.560 | 12:51:15.5600000 | 12:51:15 |
| 2020-08-10 14:51:15.560 | 14:51:15.5600000 | 14:51:15 |
+-------------------------+------------------+--------------------+
Using style code 108 with CONVERT, we can get the time in 'HH:mm:ss' (24-hour) format.
DECLARE @now DATETIME = GETDATE()
SELECT CONVERT(NVARCHAR(20), @now, 108)
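The approaches from this thread can be compared side by side on the sample value from the question. This is a minimal sketch; the variable @d and the column aliases are illustrative names, and FORMAT assumes SQL Server 2012 or later.

```sql
-- Sample value taken from the question; @d is an illustrative variable name.
DECLARE @d DATETIME = '1970-01-01T22:09:26.000';

SELECT CONVERT(VARCHAR(8), @d, 108) AS ConvertStyle108, -- text '22:09:26' (style 108 is hh:mi:ss, 24-hour)
       CAST(@d AS TIME(0))          AS TimeValue,       -- TIME value 22:09:26, still a time type, not text
       FORMAT(@d, 'HH:mm:ss')       AS Format24Hour,    -- upper-case 'HH' is the 24-hour clock
       FORMAT(@d, 'hh:mm:ss')       AS Format12Hour;    -- lower-case 'hh' is the 12-hour clock (hour shown as 10)
```

A lower-case 'hh' in the FORMAT pattern is one plausible cause of 12-hour output; whether that was the issue in the question is an assumption.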

T-SQL n rows for amount of days between 2 days

I have a question which I cannot answer myself. I'm using T-SQL and a basic query:
SELECT OpenArt, DayFrom, Dayto
FROM Locations
WHERE OpenArt = 'closed' AND S_ID = '123'
I want to get every date, where my location is closed. This works so far, as the output is something like:
| OpenArt | DayFrom | DayTo |
+---------+------------+------------+
| Closed | 06.12.2019 | 09.12.2019 |
| Closed | 23.12.2019 | 31.12.2019 |
Basically, it shows a range, when a location is closed. However, for an API, I need to send 1 row for each closed day. So for the range 23.12.2019 - 31.12.2019, I'd need 9 single rows like:
| OpenArt | DayClosed |
+---------+------------+
| Closed | 23.12.2019 |
| Closed | 24.12.2019 |
| Closed | 25.12.2019 |
and so on. The naming of the headers isn't that important; I can adjust that. I simply don't know how to "dupe" the results, depending on the range between the 2 days. I know there is datediff(), but that is all I could come up with. Thanks in advance.
There are no restrictions, there can be a new temp_table, an UDF or anything that works.
One option is to use an ad-hoc tally table in concert with a CROSS APPLY.
Example
Set Dateformat DMY
Declare @YourTable Table ([OpenArt] varchar(50),[DayFrom] date,[DayTo] date) Insert Into @YourTable Values
('Closed','06.12.2019','09.12.2019')
,('Closed','23.12.2019','31.12.2019')
Select OpenArt
,DayClosed = D
From @YourTable
Cross Apply (
Select Top (DateDiff(DAY,[DayFrom],[DayTo])+1)
D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),[DayFrom])
From master..spt_values n1,master..spt_values n2
) B
Or yet another option, with a known date range:
Declare @Date1 date = '2019-01-01'
Declare @Date2 date = '2020-12-31'
Select OpenArt
,DayClosed = D
From @YourTable
Join (
Select Top (DateDiff(DAY,@Date1,@Date2)+1)
D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),@Date1)
From master..spt_values n1,master..spt_values n2
) B on D between [DayFrom] and [DayTo]
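A recursive CTE is another way to expand each range into one row per day, without relying on master..spt_values. This is a minimal sketch, assuming a hypothetical table variable @Ranges with the same columns as the question; the MAXRECURSION limit of 366 is an assumption that no single range is longer than a year.

```sql
-- Hypothetical table variable holding the closed ranges from the question.
DECLARE @Ranges TABLE (OpenArt VARCHAR(50), DayFrom DATE, DayTo DATE);

INSERT INTO @Ranges (OpenArt, DayFrom, DayTo)
VALUES ('Closed', '20191206', '20191209'),
       ('Closed', '20191223', '20191231');

WITH Days AS (
    -- Anchor: the first day of every range
    SELECT OpenArt, DayFrom AS DayClosed, DayTo
    FROM @Ranges
    UNION ALL
    -- Recursive step: add one day until the end of the range is reached
    SELECT OpenArt, DATEADD(DAY, 1, DayClosed), DayTo
    FROM Days
    WHERE DayClosed < DayTo
)
SELECT OpenArt, DayClosed
FROM Days
ORDER BY DayClosed
OPTION (MAXRECURSION 366); -- the default limit is 100 recursion levels
```

With the two sample ranges this yields 4 + 9 = 13 rows. For very long ranges the tally-table approaches above are likely to scale better.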

Partition by syntax

I have the following statement which works to get the most recent row of data for a particular DDI. What I now want to do is replace the single DDI in the where statement with a long list of them but still have only the most recent row for each. I'm pretty sure that I need to use OVER and PARTITION BY to get a separate window for each DDI but even reading the microsoft documentation and a more simplified tutorial I still can't get the syntax right. I suspect I just need a nudge in the right direction. Can anyone help?
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
http://www.sqltutorial.org/sql-window-functions/sql-partition-by/
SELECT TOP 1
[Start Time]
,[Agent Name]
,[Reference]
,[charged op. (sec)]
,[Type]
,[Activation ID] as [actid]
FROM [iPR].[dbo].[InboundCallsView]
Where [type] = 'Normal operator call'
AND [DDI] = @DDI
Order By [Start Time] Desc
Not sure how you plan on handling the multiple values for DDI but that may be an issue. The best approach would be to use a table valued parameter. If you pass in a delimited list you have to split the string too which is not a good way of handling this type of thing.
This query will return the most recent for every DDI.
SELECT
[Start Time]
, [Agent Name]
, [Reference]
, [charged op. (sec)]
, [Type]
, [actid]
from
(
SELECT
[Start Time]
, [Agent Name]
, [Reference]
, [charged op. (sec)]
, [Type]
, [Activation ID] as [actid]
, RowNum = ROW_NUMBER() over(partition by DDI order by [Start Time] desc)
FROM [iPR].[dbo].[InboundCallsView]
where [type] = 'Normal operator call'
--and [DDI] = @DDI
) x
where x.RowNum = 1
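To restrict the result to a list of DDIs, the list can be held in a table and joined in before the row numbers are assigned. The sketch below is self-contained: @Calls is a hypothetical stand-in for the view, @DDIs is a hypothetical list (in a stored procedure it could be a table-valued parameter), and the column names and types are assumptions.

```sql
-- Hypothetical stand-in for InboundCallsView, with simplified column names.
DECLARE @Calls TABLE (DDI VARCHAR(20), StartTime DATETIME, AgentName VARCHAR(50));
INSERT INTO @Calls (DDI, StartTime, AgentName)
VALUES ('1', '2019-03-28T08:00:00', 'agent1'),
       ('1', '2019-03-28T09:00:00', 'agent2'),
       ('2', '2019-03-27T08:00:00', 'agent3'),
       ('2', '2019-03-27T09:00:00', 'agent4'),
       ('3', '2019-03-26T10:00:00', 'agent5');

-- Hypothetical list of DDIs to report on.
DECLARE @DDIs TABLE (DDI VARCHAR(20) PRIMARY KEY);
INSERT INTO @DDIs (DDI) VALUES ('1'), ('2');

SELECT x.DDI, x.StartTime, x.AgentName
FROM (
    SELECT c.DDI, c.StartTime, c.AgentName,
           -- Number the rows of each DDI from newest to oldest
           ROW_NUMBER() OVER (PARTITION BY c.DDI ORDER BY c.StartTime DESC) AS RowNum
    FROM @Calls AS c
    INNER JOIN @DDIs AS d ON d.DDI = c.DDI -- keep only the requested DDIs
) AS x
WHERE x.RowNum = 1; -- newest row per DDI: agent2 for DDI 1, agent4 for DDI 2
```

DDI 3 is absent from the result because it is not in the list.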
So let's assume a table with this data (notice how I cleaned up the column names to remove spaces, special characters, etc.):
+---+------------------+--------+------+----+------+---+
| 1 | 2019-03-28 08:00 | agent1 | foo1 | 60 | foo1 | 1 |
+---+------------------+--------+------+----+------+---+
| 1 | 2019-03-28 09:00 | agent2 | foo2 | 70 | foo2 | 2 |
| 2 | 2019-03-27 08:00 | agent3 | foo3 | 80 | foo3 | 3 |
| 2 | 2019-03-27 09:00 | agent4 | foo4 | 90 | foo4 | 4 |
+---+------------------+--------+------+----+------+---+
As you say, you can use a window function to get what you want. However, let me show you a method that doesn't require a window function first.
You want records where the StartTime is the max value for that DDI. You can obtain the max StartTime for each DDI with the following query:
SELECT
ddi,
max_start = MAX(StartTime)
FROM InboundCallsView
GROUP BY ddi
You can then join that query to your base table/view to get the records you want. Using an intermediate CTE, you can do the following:
WITH
ddiWithMaxStart AS
(
SELECT
ddi,
max_start = MAX(StartTime)
FROM InboundCallsView
GROUP BY ddi
)
SELECT InboundCallsView.*
FROM InboundCallsView
INNER JOIN ddiWithMaxStart ON
ddiWithMaxStart.ddi = InboundCallsView.ddi
AND ddiWithMaxStart.max_start = InboundCallsView.StartTime
Now, if you really want to use WINDOW functions, you can use ROW_NUMBER for a similar effect:
WITH
ddiWithRowNumber AS
(
SELECT
InboundCallsView.*,
rn = ROW_NUMBER() OVER
(
PARTITION BY ddi
ORDER BY ddi, StartTime DESC
)
FROM InboundCallsView
)
SELECT *
FROM ddiWithRowNumber
WHERE rn = 1
Notice that with this method, you don't need to join the base view/table to the intermediate CTE.
You can test out performance of each method to see which works best for you.

T-SQL - Finding records with chronological gaps

This is my first post here. I'm still a novice SQL user at this point though I've been using it for several years now. I am trying to find a solution to the following problem and am looking for some advice, as simple as possible, please.
I have this 'recordTable' with the following columns related to transactions; 'personID', 'recordID', 'item', 'txDate' and 'daySupply'. The recordID is the primary key. Almost every personID should have many distinct recordID's with distinct txDate's.
My focus is on one particular 'item' for all of 2017. It's expected that once the item daySupply has elapsed for a recordID that we would see a newer recordID for that person with a more recent txDate somewhere between five days before and five days after the end of the daySupply.
What I'm trying to uncover are the number of distinct recordID's where there wasn't an expected new recordID during this ten day window. I think this is probably very simple to solve but I am having a lot of difficulty trying to create a query for it, let alone explain it to someone.
My thought thus far is to create two temp tables. The first temp table stores all of the records associated with the desired items and I'm just storing the personID, recordID and txDate columns. The second temp table has the personID, recordID and the two derived columns from the txDate and daySupply; these would represent the five days before and five days after.
I am trying to find some way to determine the number of recordID's from the first table that don't have expected refills for that personID in the second. I thought a simple EXCEPT would do this, but I don't think there's any way of getting around a recursive type statement to answer this, and I have never gotten comfortable with recursive queries.
I searched Stackoverflow and elsewhere but couldn't come up with an answer to this one. I would really appreciate some help from some more clever data folks. Here is the code so far. Thanks everyone!
CREATE TABLE #temp1 (personID VARCHAR(20), recordID VARCHAR(10), txDate
DATE)
CREATE TABLE #temp2 (personID VARCHAR(20), recordID VARCHAR(10), startDate
DATE, endDate DATE)
INSERT INTO #temp1
SELECT [personID], [recordID], txDate
FROM recordTable
WHERE item = 'desiredItem'
AND txDate > '12/31/16'
AND txDate < '1/1/18';
INSERT INTO #temp2
SELECT [personID], [recordID], (txDate + (daySupply - 5)), (txDate +
(daySupply + 5))
FROM recordTable
WHERE item = 'desiredItem'
AND txDate > '12/31/16'
AND txDate < '1/1/18';
I agree with mypetlion that you could have been more concise with your question, but I think I can figure out what you are asking.
SQL Window Functions to the rescue!
Here's the basic idea...
CREATE TABLE #fills(
personid INT,
recordid INT,
item NVARCHAR(MAX),
filldate DATE,
dayssupply INT
);
INSERT #fills
VALUES (1, 1, 'item', '1/1/2018', 30),
(1, 2, 'item', '2/1/2018', 30),
(1, 3, 'item', '3/1/2018', 30),
(1, 4, 'item', '5/1/2018', 30),
(1, 5, 'item', '6/1/2018', 30)
;
SELECT *,
ABS(
DATEDIFF(
DAY,
LAG(DATEADD(DAY, dayssupply, filldate)) OVER (PARTITION BY personid, item ORDER BY filldate),
filldate
)
) AS gap
FROM #fills
ORDER BY filldate;
... outputs ...
+----------+----------+------+------------+------------+------+
| personid | recordid | item | filldate | dayssupply | gap |
+----------+----------+------+------------+------------+------+
| 1 | 1 | item | 2018-01-01 | 30 | NULL |
| 1 | 2 | item | 2018-02-01 | 30 | 1 |
| 1 | 3 | item | 2018-03-01 | 30 | 2 |
| 1 | 4 | item | 2018-05-01 | 30 | 31 |
| 1 | 5 | item | 2018-06-01 | 30 | 1 |
+----------+----------+------+------------+------------+------+
You can insert the results into a temp table and pull out only the ones you want (gap > 5), or use the query above as a CTE and pull out the results without the temp table.
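The CTE variant could look like the sketch below. A window function cannot be referenced directly in a WHERE clause, so the gap is computed in the CTE and filtered outside it. The table variable @fills is a hypothetical copy of the sample data, with dates written in the unambiguous yyyymmdd form.

```sql
-- Hypothetical table variable with the same sample rows as above.
DECLARE @fills TABLE (personid INT, recordid INT, item VARCHAR(50), filldate DATE, dayssupply INT);

INSERT INTO @fills (personid, recordid, item, filldate, dayssupply)
VALUES (1, 1, 'item', '20180101', 30),
       (1, 2, 'item', '20180201', 30),
       (1, 3, 'item', '20180301', 30),
       (1, 4, 'item', '20180501', 30),
       (1, 5, 'item', '20180601', 30);

WITH gaps AS (
    SELECT personid, recordid, item, filldate, dayssupply,
           -- Days between the previous fill's expected end and this fill
           ABS(DATEDIFF(DAY,
                        LAG(DATEADD(DAY, dayssupply, filldate))
                            OVER (PARTITION BY personid, item ORDER BY filldate),
                        filldate)) AS gap
    FROM @fills
)
SELECT personid, recordid, filldate, gap
FROM gaps
WHERE gap > 5; -- with this sample data only recordid 4 is returned (gap = 31)
```

Note that the row returned is the late refill (recordid 4), not the record whose supply ran out without a timely refill (recordid 3); swapping LAG for LEAD would attach the gap to the earlier record instead.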
This could be stated as follows: "Given a set of orders, return a subset for which there is no order within +/- 5 days of the expected resupply date (defined as txDate + DaysSupply)."
This can be solved simply with NOT EXISTS. Define the range of orders you wish to examine, and this query will find the subset of those orders for which there is no resupply order (NOT EXISTS) within 5 days of either side of the expected resupply date (txDate + daysSupply).
SELECT
gappedOrder.personID
, gappedOrder.recordID
, gappedOrder.item
, gappedOrder.txDate
, gappedOrder.daysSupply
FROM
recordTable as gappedOrder
WHERE
gappedOrder.item = 'desiredItem'
AND gappedOrder.txDate > '12/31/16'
AND gappedOrder.txDate < '1/1/18'
--order not refilled within date range tolerance
AND NOT EXISTS
(
SELECT
1
FROM
recordTable AS refilledOrder
WHERE
refilledOrder.personID = gappedOrder.personID
AND refilledOrder.item = gappedOrder.item
--5 days prior to (txDate + daysSupply)
AND refilledOrder.txDate >= DATEADD(day, -5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
--5 days after (txDate + daysSupply)
AND refilledOrder.txDate <= DATEADD(day, 5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
);

Netezza: Show dates even if 0 data for that day

I have this query through an odbc connection in excel for a refreshable report with data for every 4 weeks. I need to show the dates in each of the 4 weeks even if there is no data for that day because this data is then linked to a Graph. Is there a way to do this?
thanks.
Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a, fct_dly_invoice_header b, dim_invoice_customer c
where a.INV_HDR_SK = b.INV_HDR_SK
and b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870
)
group by 1
The general purpose solution to this is to create a date dimension table, and then perform an outer join to that date dimension table on the INV_DT column.
There are tons of good resources you can search for on creating a good date dimension table, so I'll just create a quick and dirty (and trivial) example here. I highly recommend some research in that area if you'll be doing a lot of BI/reporting.
If our table we want to report from looks like this:
Table "TABLEZ"
Attribute | Type | Modifier | Default Value
-----------+--------+----------+---------------
AMOUNT | BIGINT | |
INV_DT | DATE | |
Distributed on random: (round-robin)
select * from tablez order by inv_dt
AMOUNT | INV_DT
--------+------------
1 | 2015-04-04
1 | 2015-04-04
1 | 2015-04-06
1 | 2015-04-06
(4 rows)
and our report looks like this:
SELECT inv_dt,
SUM(amount)
FROM tablez
WHERE inv_dt BETWEEN CURRENT_DATE - 5 AND CURRENT_DATE
GROUP BY inv_dt;
INV_DT | SUM
------------+-----
2015-04-04 | 2
2015-04-06 | 2
(2 rows)
We can create a date dimension table that contains a row for every date (or at least 1024 days in the past and 1024 days in the future, using the _v_vector_idx view in this example).
create table date_dim (date_dt date);
insert into date_dim select current_date - idx from _v_vector_idx;
insert into date_dim select current_date + idx +1 from _v_vector_idx;
Then our query would look like this:
SELECT d.date_dt,
SUM(amount)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | SUM
------------+-----
2015-04-01 |
2015-04-02 |
2015-04-03 |
2015-04-04 | 2
2015-04-05 |
2015-04-06 | 2
(6 rows)
If you actually needed a zero value instead of a NULL for the days where you had no data, you could use a COALESCE or NVL like this:
SELECT d.date_dt,
COALESCE(SUM(amount),0)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | COALESCE
------------+----------
2015-04-01 | 0
2015-04-02 | 0
2015-04-03 | 0
2015-04-04 | 2
2015-04-05 | 0
2015-04-06 | 2
(6 rows)
I agree with @ScottMcG that you need to get the list of dates. However, if you are in a situation where you aren't allowed to create a table, you can simplify things. All you need is a table that has at least 28 rows. Using your example, this should work.
select date_list.dt_nm, nvl(results.Ordered,0) as Ordered, nvl(results.Shipped,0) as Shipped
from
(select row_number() over(order by sub.arb_nbr)+ (current_date -28) as dt_nm
from (select rowid as arb_nbr
from fct_dly_invoice_detail b
limit 28) sub ) date_list left outer join
( Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a inner join
fct_dly_invoice_header b
on a.INV_HDR_SK = b.INV_HDR_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870)
inner join
dim_invoice_customer c
on b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
group by 1 ) results
on date_list.dt_nm = results.inv_dt
