Sql query to create graph from a table of events [duplicate] - sql-server

This question already has answers here:
Is there a way to access the "previous row" value in a SELECT statement?
(9 answers)
I have a plain old table in SQL Server like so:
JOB  Step   Timestamp
---  -----  ------------------
1    begin  12/25/2021 1:00 AM
1    foo    12/25/2021 1:01 AM
1    bar    12/25/2021 1:02 AM
1    end    12/25/2021 1:03 AM
It is a list of steps that transition from one to the next, and the order of the transitions is determined by the timestamp. I would like to render it as a graph of events, so I am trying to query it to get results like:
JOB  Source  Target  Timestamp
---  ------  ------  ------------------
1    begin   foo     12/25/2021 1:01 AM
1    foo     bar     12/25/2021 1:02 AM
1    bar     end     12/25/2021 1:03 AM
This is not a SQL Server graph table but I'd like it to behave like one in this case.
This is ultimately going to be rendered in PowerBI using a force directed graph visualization, so answers in T-SQL or DAX would work for my use case.

In T-SQL this is simple with the lead() window function. The following should produce your expected results:
select JOB, [Source], [Target], [TimeStamp]
from (
    select JOB, Step [Source],
        Lead(Step) over(partition by JOB order by [Timestamp]) [Target],
        Lead([Timestamp]) over(partition by JOB order by [Timestamp]) [TimeStamp]
    from t
) t
where [Target] is not null;
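As a rough way to check the LEAD approach without a SQL Server instance, the sketch below runs an equivalent query in SQLite through Python's standard sqlite3 module. This is an assumption-laden illustration, not the answer's exact code: the table name t comes from the answer, but the lowercase column names, the ISO-formatted timestamps, and the edge_ts alias are made up here, and SQLite's dialect differs from T-SQL (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample rows.
# Timestamps are stored as ISO-8601 text so they sort chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (job INTEGER, step TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "begin", "2021-12-25 01:00"),
        (1, "foo", "2021-12-25 01:01"),
        (1, "bar", "2021-12-25 01:02"),
        (1, "end", "2021-12-25 01:03"),
    ],
)

# Pair each step with the step that follows it within the same job.
# The last step of a job has no successor, so its row is filtered out.
edges = conn.execute(
    """
    SELECT job, source, target, edge_ts
    FROM (
        SELECT job,
               step AS source,
               LEAD(step) OVER (PARTITION BY job ORDER BY ts) AS target,
               LEAD(ts)   OVER (PARTITION BY job ORDER BY ts) AS edge_ts
        FROM t
    )
    WHERE target IS NOT NULL
    ORDER BY job, edge_ts
    """
).fetchall()

for edge in edges:
    print(edge)
```

Each returned tuple is one edge of the graph (job, source step, target step, time of the transition).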

With DAX
You can write two measures like this to achieve the end goal
Source =
VAR _time1 =
    MAX ( 'Table'[Timestamp] )
VAR _beginTime =
    CALCULATE (
        MAX ( 'Table'[Timestamp] ),
        FILTER (
            ALLSELECTED ( 'Table' ),
            'Table'[JOB] = MAX ( 'Table'[JOB] )
                && 'Table'[Timestamp] < _time1
        )
    )
VAR _begin =
    CALCULATE (
        MAX ( 'Table'[Step] ),
        FILTER (
            ALLSELECTED ( 'Table' ),
            'Table'[JOB] = MAX ( 'Table'[JOB] )
                && 'Table'[Timestamp] = _beginTime
        )
    )
RETURN
    _begin

Target =
IF ( [Source] <> BLANK (), MAX ( 'Table'[Step] ) )

Related

SQL Duration that value was true

Edited: to include sample data
Looking for guidance on a TSQL query.
I have a table that stores readings from a sensor (Amperage). The table basically has a Date/Time and a Value column.
The date/time increments every 5 seconds (a new record is added on 5 second intervals).
I am trying to build a query to determine the duration of time that the value was >X.
Example Data:
http://sqlfiddle.com/#!18/f15c0/1/0
The example data is missing chunks to keep it small, but I think you will get the idea.
I am trying to go from the first record to the next record where the value goes above 7. Then I would do a datediff to get the duration in seconds from when the data started to that first record over 7. I then need to repeat this, but now finding when the value goes back below 7.
This way I can see the cycle time duration.
Think of it as your Fridge. The sensor checks in every 5 seconds and sees that the fridge is off and records that. Eventually the fridge turns on and remains on for a period of time. I am trying to get all those cycle times.
I am trying to use the Lead and Lag functions, but I am getting lost when it comes to pulling the data.
Any help?
declare @val numeric(10,5) = 7.0
select v1.entrydate,
       v1.Amps,
       case when v1.fl = 1 and v1.lg is null then 1
            when v1.lg != v1.fl then 1
            else 0
       end as fl_new
from (
    select v1.entrydate,
           v1.Amps,
           case when v1.Amps > @val then 1
                else 0
           end as fl,
           lag(case when v1.Amps > @val then 1
                    else 0
               end) over(order by v1.entrydate) as lg
    from (
        select t.entrydate as entrydate,
               t.Amps as Amps
        from YourTable t
    ) v1
) v1
where case when v1.fl = 1 and v1.lg is null then 1
           when v1.lg != v1.fl then 1
           else 0
      end = 1
order by v1.entrydate
And don't forget to set the YourTable name and @val (which is "X").
Images are blocked at my current location, so I can't see your structure. I'll assume you have the following table (I'll ignore PK and other constraints):
create table reading(
entryDate datetime,
amps int
)
Assuming anything above 3 amps is ON, and you want to compute the duty cycles in seconds, then
declare @threshold int = 3;
with
state as (
    select entryDate,
           case when amps>@threshold then 'ON' else 'OFF' end state,
           lag( case when amps>@threshold then 'ON' else 'OFF' end )
               over(order by entryDate) prev_state
    from reading
),
transition as (
    select entryDate, state
    from state
    where state <> coalesce(prev_state,'')
)
select entryDate,
       state,
       dateDiff(
           s,
           entryDate,
           lead(entryDate) over(order by entryDate)
       ) duration
from transition
order by 1
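To make the transition logic concrete, here is a minimal sketch that runs the same three stages (classify each reading, keep only state changes, measure the gap to the next change) in SQLite via Python's sqlite3 module. The sample readings, the snake_case column names, and the use of strftime('%s', ...) in place of T-SQL's dateDiff are assumptions for illustration only.

```python
import sqlite3

THRESHOLD = 3  # assumed cut-off: readings above this count as ON

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (entry_date TEXT, amps INTEGER)")
# Made-up readings at 5-second intervals: off, on for a while, then off again.
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [
        ("2020-01-01 00:00:00", 0),
        ("2020-01-01 00:00:05", 0),
        ("2020-01-01 00:00:10", 8),
        ("2020-01-01 00:00:15", 9),
        ("2020-01-01 00:00:20", 8),
        ("2020-01-01 00:00:25", 1),
        ("2020-01-01 00:00:30", 0),
    ],
)

cycles = conn.execute(
    """
    WITH state AS (
        -- classify every reading and remember the previous classification
        SELECT entry_date,
               CASE WHEN amps > :t THEN 'ON' ELSE 'OFF' END AS cur_state,
               LAG(CASE WHEN amps > :t THEN 'ON' ELSE 'OFF' END)
                   OVER (ORDER BY entry_date) AS prev_state
        FROM reading
    ),
    transition AS (
        -- keep only the rows where the state changes (plus the very first row)
        SELECT entry_date, cur_state
        FROM state
        WHERE cur_state <> COALESCE(prev_state, '')
    )
    -- seconds from each state change to the next one; NULL for the last state
    SELECT entry_date,
           cur_state,
           CAST(strftime('%s', LEAD(entry_date) OVER (ORDER BY entry_date)) AS INTEGER)
             - CAST(strftime('%s', entry_date) AS INTEGER) AS duration_s
    FROM transition
    ORDER BY entry_date
    """,
    {"t": THRESHOLD},
).fetchall()

for row in cycles:
    print(row)
```

The last state has no following transition, so its duration comes back as NULL (None in Python), just as the lead() call in the T-SQL version would return NULL for the final row.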
Not sure how fast it'll be, but if you want to try with LAG?
Here's an example that checks for a difference of X>=2
SELECT entrydate, amps
FROM
(
SELECT
entrydate, amps,
amps - LAG(amps) OVER (ORDER BY entrydate) AS PrevAmpsDiff
FROM YourTable
) q
WHERE ABS(FLOOR(PrevAmpsDiff)) >= 2
ORDER BY entrydate;

How to select the top 1 in case distinct returns 2 rows

I have a select distinct query that can return 2 rows with the same code, since not all columns have the same value. Now my boss wants to get only the first one. How do I do it? Below is the sample result. I want to return only the first row for each unique pro.
Use row_number in your query.
; with cte as (
select row_number() over (partition by pro order by actual_quantity) as Slno, * from yourtable
) select * from cte where slno = 1
Your chances to get the proper answer can be much higher if you spend some time to prepare the question properly. Provide the DDL and sample data, as well as add the desired result.
To solve your problem, you need to decide on the ordering that picks exactly one record per window group. Search for window functions. In my example the rule is: a single row for every pro, with the earliest proforma_invoice_received_date and the smallest amount on that date.
DROP TABLE IF EXISTS #tmp;
GO
CREATE TABLE #tmp
(
pro VARCHAR(20) ,
actual_quantity DECIMAL(12, 2) ,
proforma_invoice_received_date DATE ,
import_permit DATE
);
GO
INSERT INTO #tmp
( pro, actual_quantity, proforma_invoice_received_date, import_permit )
VALUES ( 'N19-00945', 50000, '20190516', '20190517' ),
( 'N19-00945', 50001, '20190516', '20190517' )
, ( 'N19-00946', 50002, '20190516', '20190517' )
, ( 'N19-00946', 50003, '20190516', '20190517' );
SELECT a.pro ,
a.actual_quantity ,
a.proforma_invoice_received_date ,
a.import_permit
FROM ( SELECT pro ,
actual_quantity ,
proforma_invoice_received_date ,
import_permit ,
ROW_NUMBER() OVER ( PARTITION BY pro ORDER BY proforma_invoice_received_date, actual_quantity ) AS rn
FROM #tmp
) a
WHERE rn = 1;
-- you can also use WITH TIES for that to save some lines of code
SELECT TOP ( 1 ) WITH TIES
pro ,
actual_quantity ,
proforma_invoice_received_date ,
import_permit
FROM #tmp
ORDER BY ROW_NUMBER() OVER ( PARTITION BY pro ORDER BY proforma_invoice_received_date, actual_quantity );
DROP TABLE #tmp;
Try this-
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY pro ORDER BY Pro) RN
-- You need to add other columns in the ORDER BY clause
-- with 'pro' to get your desired row. other case you
-- will get first row returned by the query with only
-- order by 'pro' and this can vary for different execution
FROM your_table
)A
WHERE RN = 1
CREATE TABLE T (
A [numeric](10, 2) NULL,
B [numeric](10, 2) NULL
)
INSERT INTO T VALUES (100,20)
INSERT INTO T VALUES (100,30)
INSERT INTO T VALUES (200,40)
INSERT INTO T VALUES (200,50)
select *
from T
/*
A B
100.00 20.00
100.00 30.00
200.00 40.00
200.00 50.00
*/
select U.A, U.B
from
(select row_number() over(Partition By A Order By B) as row_num, *
from T ) U
where row_num = 1
/*
A B
100.00 20.00
200.00 40.00
*/
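The answers above all rely on the same pattern: number the rows inside each group, then keep row 1. A small sketch of that pattern, run in SQLite through Python's sqlite3 module, is shown here. The table name tmp and the shortened column name received are assumptions; the sample values follow the #tmp example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp (pro TEXT, actual_quantity INTEGER, received TEXT)"
)
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?)",
    [
        ("N19-00945", 50000, "2019-05-16"),
        ("N19-00945", 50001, "2019-05-16"),
        ("N19-00946", 50002, "2019-05-16"),
        ("N19-00946", 50003, "2019-05-16"),
    ],
)

# Number the rows within each pro, earliest date first and smallest
# quantity as the tie-breaker, then keep only the first row per pro.
first_rows = conn.execute(
    """
    SELECT pro, actual_quantity, received
    FROM (
        SELECT pro, actual_quantity, received,
               ROW_NUMBER() OVER (
                   PARTITION BY pro
                   ORDER BY received, actual_quantity
               ) AS rn
        FROM tmp
    )
    WHERE rn = 1
    ORDER BY pro
    """
).fetchall()

for row in first_rows:
    print(row)
```

Which row counts as "first" depends entirely on the ORDER BY inside the OVER clause, so that ordering should be deterministic for the result to be repeatable.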

How to filter on set of ID values without using subquery

I was wondering if someone can help me, I need to show this in oracle.
For that I use this select:
SELECT m.idMedicamento, m.nombre, m.precio
FROM medicamento m
WHERE m.idMedicamento IN (
SELECT idMedicamento
FROM (
SELECT idMedicamento
FROM medicamento
ORDER BY precio ASC)
WHERE ROWNUM <=3
)
OR m.idMedicamento IN (
SELECT idMedicamento
FROM (
SELECT idMedicamento
FROM medicamento
ORDER BY precio DESC)
WHERE ROWNUM <=3
)
ORDER BY m.precio DESC;
But the problem is that I can't use subselects; I need to use functions or procedures, and I thought of this function:
CREATE OR REPLACE FUNCTION MAXI RETURN FLOAT IS
total INT := 0;
CURSOR ANIO IS
SELECT idMedicamento
FROM medicamento
ORDER BY precio DESC;
a anio%ROWTYPE;
BEGIN
OPEN ANIO;
LOOP
FETCH ANIO INTO a;
EXIT WHEN ANIO%NOTFOUND;
END LOOP;
CLOSE ANIO;
RETURN ROUND(a ,2);
END;
This function is only meant to return the maximum, but I can't return the cursor. I don't know if I have made myself clear; thanks for your time.
No, you cannot return a cursor and use that in a SQL statement WHERE clause the way you seem to want to do.
You could write a function to return the id values you are looking for as a TABLE OF NUMBER, but you would still need to use a subquery to access its results in your main query. I.e.,
WHERE idMedicamento IN ( SELECT * FROM TABLE(my_custom_function()) )
So, I think a function that returns the values of idMedicamento that you want to filter on is not going to work well with your "no subqueries" limitation.
Alternate approach #1
Incidentally, on Oracle 12c, you can write your query with no subqueries and no functions like this:
SELECT m.idMedicamento, m.nombre, m.precio,
case when row_number() over ( order by m.precio asc ) <= 3
or row_number() over ( order by m.precio desc ) <= 3 THEN 'Y' ELSE null END include_flag
FROM medicamento m
ORDER BY include_flag nulls last, m.precio DESC
FETCH FIRST 6 ROWS ONLY;
Alternate approach #2
I assume you cannot use subqueries because you are using a tool or a framework that is building your SQL for you. If that is the case, perhaps you cannot use windowing functions (i.e., row_number() over (...)) either. You can get around this by making a view and then writing your query against the view.
Like this:
CREATE OR REPLACE VIEW medicamento_v AS
SELECT m.idMedicamento, m.nombre, m.precio,
row_number() over ( order by m.precio asc) asc_order,
row_number() over ( order by m.precio desc) desc_order
FROM medicamento m;
SELECT m.idMedicamento, m.nombre, m.precio
FROM medicamento_v m
WHERE ( asc_order <= 3 OR desc_order <= 3 )
ORDER BY m.precio DESC;
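The view-based approach can be tried out with the sketch below, which uses SQLite through Python's sqlite3 module rather than Oracle. The eight sample rows and their prices are invented, and the ranking is done by precio on the assumption that the goal is the three cheapest and three most expensive items, as in the question's original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medicamento (idMedicamento INTEGER, nombre TEXT, precio INTEGER)"
)
# Invented sample data: eight items priced 10, 20, ..., 80.
conn.executemany(
    "INSERT INTO medicamento VALUES (?, ?, ?)",
    [(i, f"med{i}", i * 10) for i in range(1, 9)],
)

# The view carries each row's rank from both ends of the price range,
# so the final query needs neither a subquery nor a window function.
conn.execute(
    """
    CREATE VIEW medicamento_v AS
    SELECT idMedicamento, nombre, precio,
           ROW_NUMBER() OVER (ORDER BY precio ASC)  AS asc_order,
           ROW_NUMBER() OVER (ORDER BY precio DESC) AS desc_order
    FROM medicamento
    """
)

rows = conn.execute(
    """
    SELECT idMedicamento, nombre, precio
    FROM medicamento_v
    WHERE asc_order <= 3 OR desc_order <= 3
    ORDER BY precio DESC
    """
).fetchall()

prices = [precio for _, _, precio in rows]
print(prices)
```

With ties in precio, ROW_NUMBER picks an arbitrary winner among equal prices; adding idMedicamento as a secondary sort key would make the choice deterministic.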

TSQL matching the first instances of multiple values in a resultset

Say I have part of a large query, as below, that returns a resultset with multiple rows of the same key information (PolNum) with different value information (PolPremium) in a random order.
Would it be possible to select the first matching row for each PolNum and sum up the PolPremium values? In this case I know that there are 2 PolNums in use, so given the screenshot of the resultset (yes, I know it starts at row 14, for illustration purposes), I want to take the first value for each PolNum and sum the result.
First match for PolNum 000035789547
(ROW 14) PolPremium - 32.00
First match for PolNum 000071709897
(ROW 16) PolPremium - 706043.00
Total summed should be 32.00 + 706043.00 = 706075.00
Query
OUTER APPLY
(
SELECT PolNum, PolPremium
FROM PN20
WHERE PolNum IN(SELECT PolNum FROM SvcPlanPolicyView
WHERE SvcPlanPolicyView.ControlNum IN (SELECT val AS ServedCoverages FROM ufn_SplitMax(
(SELECT TOP 1 ServicedCoverages FROM SV91 WHERE SV91.AccountKey = 3113413), ';')))
ORDER BY PN20.PolEffDate DESC
)
Resultset
Suppose that picture is the final result your query produces. Then you can do something like:
DECLARE @t TABLE
(
    PolNum VARCHAR(20) ,
    PolPremium MONEY
)
INSERT INTO @t
VALUES ( '000035789547', 32 ),
       ( '000035789547', 76 ),
       ( '000071709897', 706043.00 ),
       ( '000071709897', 1706043.00 )
SELECT t.PolNum ,
       SUM(PolPremium) AS PolPremium
FROM ( SELECT * ,
              ROW_NUMBER() OVER ( PARTITION BY PolNum ORDER BY PolPremium ) AS rn
       FROM @t
     ) t
WHERE rn = 1
GROUP BY GROUPING SETS(t.PolNum, ( ))
Output:
PolNum PolPremium
000035789547 32.00
000071709897 706043.00
NULL 706075.00
Just replace @t with your query. I also assume that the row with the minimum premium is the first one. You could probably filter for the top row inside the outer apply, but it is not really clear to me what is going on there without some sample data.
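For readers who want to verify the arithmetic, here is a rough equivalent in Python with SQLite. SQLite has no GROUPING SETS, so this sketch (an assumption-based substitute, not the answer's method) picks the first row per PolNum with ROW_NUMBER and computes the grand total in Python instead. The snake_case names are made up; the values are the ones from the table variable above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pol_num TEXT, pol_premium REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("000035789547", 32),
        ("000035789547", 76),
        ("000071709897", 706043),
        ("000071709897", 1706043),
    ],
)

# Keep the lowest premium per policy number, treating it as the "first" row.
first_rows = conn.execute(
    """
    SELECT pol_num, pol_premium
    FROM (
        SELECT pol_num, pol_premium,
               ROW_NUMBER() OVER (
                   PARTITION BY pol_num ORDER BY pol_premium
               ) AS rn
        FROM t
    )
    WHERE rn = 1
    ORDER BY pol_num
    """
).fetchall()

# The grand total plays the role of the extra row that the empty
# grouping set produces in the T-SQL version.
total = sum(premium for _, premium in first_rows)

for row in first_rows:
    print(row)
print("total", total)
```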

How to avoid duplicate rows while inserting a set of row from flatfile in SQL SERVER by considering existing column values

I have a table containing a set of rows with the same RecordtypeCode,
then a single row or a set of rows comes in from a flat file or another source, like below,
and finally I need one unique row in my table, eliminating the duplicate RecordtypeCode and taking the max of the other fields.
Finally, my table should look like this:
What have I tried so far?
I fetch all the rows from my table, union them with the new set of records, then run a stored procedure (using GROUP BY and MAX) to get the desired output into a temp table, and finally truncate my table and insert the temp table data into it.
Is there a better way that avoids performance issues? I am going to be working with millions of records here.
Difficult to answer without more details, but you could try something like this to get grouped results:
SELECT RecordTypeCode,
Max(AgeGroupFemale60_64),
Max(AgeGroupFemale65_69),
Max(AgeGroupFemale70_74)
FROM [TempTable]
GROUP BY RecordTypeCode
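A minimal sketch of that grouping step, run in SQLite through Python's sqlite3 module, might look like the following. The table name temp_table, the shortened flag column names, and the 0/1 sample values are all assumptions, since the question's actual data was only shown as images.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE temp_table (
        record_type_code TEXT,
        f60_64 INTEGER,
        f65_69 INTEGER,
        f70_74 INTEGER
    )
    """
)
# Invented rows: code A appears twice with different flags set.
conn.executemany(
    "INSERT INTO temp_table VALUES (?, ?, ?, ?)",
    [
        ("A", 1, 0, 0),
        ("A", 0, 1, 0),
        ("B", 0, 0, 1),
    ],
)

# Collapse duplicates: one row per code, keeping the maximum of each flag.
deduped = conn.execute(
    """
    SELECT record_type_code,
           MAX(f60_64),
           MAX(f65_69),
           MAX(f70_74)
    FROM temp_table
    GROUP BY record_type_code
    ORDER BY record_type_code
    """
).fetchall()

for row in deduped:
    print(row)
```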
Assuming you are using SQL Server 2005+, you could use MAX() OVER to determine maximum flag values within every Recordtypecode group:
SELECT
Recordtypecode,
AgeGroupFemale60_64,
AgeGroupFemale65_69,
AgeGroupFemale70_74,
MAX(AgeGroupFemale60_64) OVER (PARTITION BY Recordtypecode),
MAX(AgeGroupFemale65_69) OVER (PARTITION BY Recordtypecode),
MAX(AgeGroupFemale70_74) OVER (PARTITION BY Recordtypecode)
FROM
dbo.TempTable
and update all the flags with those values:
WITH maximums AS (
SELECT
Recordtypecode,
AgeGroupFemale60_64,
AgeGroupFemale65_69,
AgeGroupFemale70_74,
MaxFemale60_64 = MAX(AgeGroupFemale60_64) OVER (PARTITION BY Recordtypecode),
MaxFemale65_69 = MAX(AgeGroupFemale65_69) OVER (PARTITION BY Recordtypecode),
MaxFemale70_74 = MAX(AgeGroupFemale70_74) OVER (PARTITION BY Recordtypecode)
FROM
dbo.TempTable
)
UPDATE
maximums
SET
AgeGroupFemale60_64 = MaxFemale60_64,
AgeGroupFemale65_69 = MaxFemale65_69,
AgeGroupFemale70_74 = MaxFemale70_74
;
Next, you could use ROW_NUMBER() to enumerate all the rows within the groups:
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
FROM
dbo.TempTable
and delete all the rows with rn > 1:
WITH enumerated AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
FROM
dbo.TempTable
)
DELETE FROM
enumerated
WHERE
rn > 1
;
Alternatively, instead of the two statements, UPDATE and DELETE, you could use one, MERGE (which now assumes SQL Server 2008+), like this:
WITH enumerated AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
FROM
dbo.TempTable
),
maximums AS (
SELECT
Recordtypecode,
MaxFemale60_64 = MAX(AgeGroupFemale60_64),
MaxFemale65_69 = MAX(AgeGroupFemale65_69),
MaxFemale70_74 = MAX(AgeGroupFemale70_74),
rn = 1
FROM
dbo.TempTable
GROUP BY
Recordtypecode
)
MERGE INTO
enumerated AS tgt
USING
maximums AS src
ON
tgt.Recordtypecode = src.Recordtypecode AND tgt.rn = src.rn
WHEN MATCHED THEN
UPDATE SET
tgt.AgeGroupFemale60_64 = src.MaxFemale60_64,
tgt.AgeGroupFemale65_69 = src.MaxFemale65_69,
tgt.AgeGroupFemale70_74 = src.MaxFemale70_74
WHEN NOT MATCHED BY SOURCE THEN
DELETE
;
More information:
OVER Clause (Transact-SQL)
MERGE (Transact-SQL)
Note that there are known issues with the MERGE statement that you need to be aware of before deciding to use it. You can start with this article to learn more about them and see whether any of them apply to your situation:
Use Caution with SQL Server's MERGE Statement
