How to use recursive CTE to add resolution to a data set - sql-server

I'm attempting to create a recursive CTE statement that adds blank rows in between data points that will later be used for interpolation. I'm a beginner with SQL, this is my first time using CTEs, and I'm having some difficulty finding the proper way to do this.
I've attempted a few slight variations on the code below after some research, but I don't yet understand it well enough to see my issue. The following code should simulate sparse sampling by taking an observation every 4 hours from the sample data set, and the second portion should add rows with their respective x values every 0.1 of an hour, which will later be filled with interpolated values derived from a cubic spline.
--Sample Data
create table #temperatures (hour integer, temperature double precision);
insert into #temperatures (hour, temperature) values
(0,18.5),
(1,16.9),
(2,15.3),
(3,14.1),
(4,13.8),
(5,14.7),
(6,14.7),
(7,13.5),
(8,12.2),
(9,11.4),
(10,10.9),
(11,10.5),
(12,12.3),
(13,16.4),
(14,22.3),
(15,27.2),
(16,31.1),
(17,34),
(18,35.6),
(19,33.1),
(20,25.1),
(21,21.3),
(22,22.3),
(23,20.3),
(24,18.4),
(25,16.8),
(26,15.6),
(27,15.4),
(28,14.7),
(29,14.1),
(30,14.2),
(31,14),
(32,13.9),
(33,13.9),
(34,13.6),
(35,13.1),
(36,15),
(37,18.2),
(38,21.8),
(39,24.1),
(40,25.7),
(41,29.9),
(42,28.9),
(43,31.7),
(44,29.4),
(45,30.7),
(46,29.9),
(47,27);
--1
WITH xy (x,y)
AS
(
SELECT TOP 12
CAST(hour AS double precision) AS x
,temperature AS y
FROM #temperatures
WHERE cast(hour as integer) % 4 = 0
)
Select x,y
INTO #xy
FROM xy
Select [x] As [x_input]
INTO #x_series
FROM #xy
--2
with recursive
, x_series(input_x) as (
select
min(x)
from
#xy
union all
select
input_x + 0.1
from
x_series
where
input_x + 0.1 < (select max(x) from x)
)
, x_coordinate as (
select
input_x
, max(x) over(order by input_x) as previous_x
from
x_series
left join
#xy on abs(x_series.input_x - xy.x) < 0.001
)
The first CTE works as expected and produces a list of 12 rows (a sample every 4 hours for two days), but the second produces a syntax error. The expected output would be something like
(4,13.8), (4.1,null/0), (4.2,null/0),....., (8,12.2)

I don't think you need recursion.
What about this:
SQL DEMO
SELECT DISTINCT n = number *1.0 /10 , #xy.x, #xy.y
FROM master..[spt_values] step
LEFT JOIN #xy
ON step.number*1.0 /10 = #xy.x
WHERE number BETWEEN 40 AND 480
This 480 is based on the two days you mention.
You don't even need the temporary table:
SELECT DISTINCT n = number *1.0 /10 , #temperatures.temperature
FROM master..[spt_values] step
LEFT JOIN #temperatures
ON step.number *1.0 / 10 = #temperatures.hour
AND #temperatures.hour % 4 = 0
WHERE number BETWEEN 40 AND 480;

I don't think you need a recursive CTE here. I think a solution like this would be a better approach. Modify accordingly.
DECLARE @max_value FLOAT =
(SELECT MAX(hour) FROM #temperatures) * 10;
-- Note: hour must be a decimal or float column to hold the fractional values
INSERT INTO #temperatures (hour, temperature)
SELECT X.N / 10, NULL
FROM (
-- ROW_NUMBER over a cross join yields 1, 2, 3, ... as FLOAT
select CAST(ROW_NUMBER() over(order by t1.number) AS FLOAT) AS N
from master..spt_values t1
cross join master..spt_values t2
) X
WHERE X.N <= @max_value
AND X.N / 10 NOT IN (SELECT hour FROM #temperatures);

Using the temp table #xy that you already produce in --1, the following will give you an x series:
;with x_series(input_x)
as
(
select min(x) AS input_x
from #xy
union all
select input_x + 0.1
from x_series
where input_x + 0.1 < (select max(x) from #xy)
)
SELECT * FROM x_series;
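A note on the original attempt: T-SQL does not accept the RECURSIVE keyword (a CTE becomes recursive simply by referencing itself), and the default recursion limit is 100 levels, which a 0.1-hour grid over two days exceeds. The sketch below is one possible way to handle both, under some assumptions: it counts in integer tenths of an hour to avoid floating-point drift from repeatedly adding 0.1, it uses a three-row stand-in CTE named xy instead of #xy so that it runs on its own, and the names steps, n and last_n are made up for illustration.

```sql
-- Stand-in for #xy (hypothetical three-row sample taken from the question's data)
WITH xy (x, y) AS
(
    SELECT CAST(v.x AS float), CAST(v.y AS float)
    FROM (VALUES (0, 18.5), (4, 13.8), (8, 12.2)) AS v (x, y)
),
steps (n, last_n) AS
(
    -- Anchor: first and last grid point, expressed as integer tenths of an hour
    SELECT CAST(ROUND(MIN(x) * 10, 0) AS int),
           CAST(ROUND(MAX(x) * 10, 0) AS int)
    FROM xy
    UNION ALL
    -- Recursive member: advance one tenth of an hour per level
    SELECT n + 1, last_n
    FROM steps
    WHERE n < last_n
)
SELECT s.n / 10.0 AS input_x,   -- back to hours
       xy.y                     -- NULL wherever there is no observation
FROM steps AS s
LEFT JOIN xy
       ON CAST(ROUND(xy.x * 10, 0) AS int) = s.n
ORDER BY s.n
OPTION (MAXRECURSION 1000);     -- raise the default limit of 100 levels
```

With the sample rows this yields 81 rows from 0.0 to 8.0, with y filled in only at 0, 4 and 8. Against the real #xy table (hours 0 to 44) the recursion runs 440 levels deep, which is why the MAXRECURSION hint is needed.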

Related

Return all records with a balance below a threshold value

I'm trying to set up a query to return all order line items with an outstanding balance below a certain threshold value (5%, for example). I managed this query without any concerns, but there is a complication. I only want to return these line items in cases where there aren't any line items outside of this threshold.
For example, if line item 1 has an Ordered Qty of 100, and 98 have been received, this line item would be returned unless there is a line item 2 with an Ordered Qty of 100 and 50 received (since this is above the 5% threshold).
This might be more easily demonstrated than explained, so I set up a simplified SQL Fiddle to show what I have thus far. I'm using a CTE to add a remaining balance field and then querying against that within my threshold. I appreciate any advice.
In the fiddle example, OrderNum 987654 should NOT be returned since that order has a second line item with 50% remaining.
SQL Fiddle
;WITH cte as (
SELECT
h.OrderNum
,d.ItemNumber
,d.OrderedQty
,d.ReceivedQty
,100.0 * (1 - (CAST(d.ReceivedQty as Numeric(10, 2)) / d.OrderedQty)) as RemainingBal
FROM OrderHeader h
INNER JOIN OrderDetail d
ON h.OrderNum = d.OrderNum
)
SELECT * FROM Cte
WHERE RemainingBal >0 and RemainingBal <= 5.0
I got this to work...
;WITH cte as (
SELECT
h.OrderNum
,d.ItemNumber
,d.OrderedQty
,d.ReceivedQty
,100.0 * (1 - (CAST(d.ReceivedQty as Numeric(10, 2)) / d.OrderedQty)) as RemainingBal
FROM OrderHeader h
INNER JOIN OrderDetail d
ON h.OrderNum = d.OrderNum
)
SELECT * FROM Cte WHERE OrderNum IN(
SELECT OrderNum
FROM Cte
GROUP BY OrderNum
HAVING CAST((SUM(OrderedQty)) - (SUM(ReceivedQty)) AS DECIMAL(10,2))
/ CAST(SUM(OrderedQty) AS DECIMAL(10,2)) <= .05
)
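The query above tests the order's total remaining balance, which is not quite the same as checking each line item individually. If the stricter reading is wanted (no single line item of the order above the threshold), a NOT EXISTS check is one way to sketch it. The inline OrderDetail rows below are a made-up stand-in for the fiddle's tables; order 123456 and its quantities are hypothetical, while 987654 mirrors the example from the question.

```sql
WITH OrderDetail (OrderNum, ItemNumber, OrderedQty, ReceivedQty) AS
(
    -- Hypothetical sample rows standing in for the real OrderDetail table
    SELECT *
    FROM (VALUES
        (123456, 1, 100,  98),
        (123456, 2, 200, 195),
        (987654, 1, 100,  98),
        (987654, 2, 100,  50)
    ) AS v (OrderNum, ItemNumber, OrderedQty, ReceivedQty)
),
cte AS
(
    SELECT d.*,
           100.0 * (d.OrderedQty - d.ReceivedQty) / d.OrderedQty AS RemainingBal
    FROM OrderDetail AS d
)
SELECT c.*
FROM cte AS c
WHERE c.RemainingBal > 0
  AND c.RemainingBal <= 5.0
  -- Drop the whole order if any of its lines is above the threshold
  AND NOT EXISTS (SELECT 1
                  FROM cte AS o
                  WHERE o.OrderNum = c.OrderNum
                    AND o.RemainingBal > 5.0);
```

With these rows, both lines of order 123456 are returned (2% and 2.5% remaining) and order 987654 is excluded because its second line still has 50% remaining.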

Consuming values from 2 columns

I am trying to figure out how to make a sort of "consumption" query where an INT value column (X) is subtracted from another INT column (Y) until it reaches 0, then stop. The column DesiredResult and DesiredResultExplanation are here only for reference to the math being performed. This takes place in DESC date order (future consuming back to the present)
My initial approach was to use window functionality, but the problem is once the value (Y) reaches 0, it needs to stop performing a running total. Had similar issues using a CTE as well.
If changing the table structure will help at all, this can be done.
Version: SQL Server 2014 or higher
Thanks!
DECLARE @test TABLE
(
ID INT IDENTITY (1,1)
,PeriodDate DATE
,X INT
,Y INT
,DesiredResult INT
,DesiredResultExplanation VARCHAR(100)
)
INSERT INTO @test VALUES ('2017-05-01', 100,0, 100,'Nothing left to subtract. Value is unchanged')
INSERT INTO @test VALUES ('2017-05-08', 200,0, 200,'Nothing left to subtract. Value is unchanged')
INSERT INTO @test VALUES ('2017-05-15', 300,0, 100,'300 - 200 = 100 (Orig -1100 has been consumed)')
INSERT INTO @test VALUES ('2017-05-22', 400,0,-200,'400 - 600 = -200 ')
INSERT INTO @test VALUES ('2017-05-29', 500,-1100,-600, '500 - 1100 = -600')
SELECT *
FROM @test
ORDER BY PeriodDate DESC
DEMO
WITH cte as (
SELECT *,
SUM(X) OVER (ORDER BY PeriodDate DESC) accumulated
FROM @test
), parameter as (
SELECT 1100 as startY
)
SELECT *,
CASE WHEN accumulated <= startY
THEN accumulated - startY
WHEN LAG(accumulated) OVER (ORDER BY PeriodDate DESC) < startY
THEN accumulated - startY
ELSE X
END as newDesire
FROM cte
CROSS JOIN parameter
ORDER BY PeriodDate DESC;
EDIT: You can replace the LAG condition with
WHEN accumulated - X < startY
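As long as every X is positive, the condition accumulated - X < startY also covers the first CASE branch, so the whole expression can probably be reduced to a single test. The following self-contained sketch assumes positive X values, inlines the question's five rows in a CTE named test (instead of the table variable) and hard-codes the 1100 starting value; the CTE name running is made up for illustration.

```sql
WITH test (PeriodDate, X) AS
(
    -- The five rows from the question, inlined so the query runs on its own
    SELECT *
    FROM (VALUES
        (CAST('2017-05-01' AS date), 100),
        ('2017-05-08', 200),
        ('2017-05-15', 300),
        ('2017-05-22', 400),
        ('2017-05-29', 500)
    ) AS v (PeriodDate, X)
),
running AS
(
    -- Running total of X from the latest period back to the earliest
    SELECT PeriodDate, X,
           SUM(X) OVER (ORDER BY PeriodDate DESC) AS accumulated
    FROM test
)
SELECT PeriodDate, X,
       CASE WHEN accumulated - X < 1100   -- the 1100 was not yet used up before this row
            THEN accumulated - 1100       -- so this row absorbs what is left of it
            ELSE X                        -- otherwise the row is untouched
       END AS newDesire
FROM running
ORDER BY PeriodDate DESC;
```

For the sample data this gives -600, -200, 100, 200 and 100 from the latest period to the earliest, matching the DesiredResult column.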

T-SQL select value where value contains less than 3 of the declared characters

I'm trying to write a select statement which returns the value if it doesn't have at least 3 of the declared characters, but I can't think of how to get it working. Can someone point me in the right direction?
One thing to consider: I am not allowed to create a temporary table for this exercise.
I haven't really got any SQL so far, as I can't think of a way to do it without a temp table.
The declared characters are any alpha characters between a and z, so if the value in the db is '1873' then it would be returned because it doesn't have at least 3 of the declared characters, but if the value was 'abcdefg' then it would not be returned, as it has at least 3 of the declared characters.
Is anyone able to point me in a starting direction for this?
This will find all sys.objects with an x or a z:
Some explanations, as this is an exercise and you want to learn something:
You can split a delimited string by transforming it into XML: x,z becomes <x>x</x><x>z</x>. You can use this to create a derived table.
I use a CTE to avoid a created or declared table.
You can use CROSS APPLY for row-wise actions. Here I use CHARINDEX to find the position(s) of the characters you are looking for.
If none of them is found, their SUM is zero. I use GROUP BY and HAVING to check this.
Hope this is clear :-)
DECLARE @chars VARCHAR(100)='x,z';
WITH Splitted AS
(
SELECT A.B.value('.','char') AS TheChar
FROM
(
SELECT CAST('<x>' + REPLACE(@chars,',','</x><x>')+ '</x>' AS XML) AS AsXml
) AS tbl
CROSS APPLY AsXml.nodes('/x') AS A(B)
)
SELECT name
FROM sys.objects
CROSS APPLY (SELECT CHARINDEX(TheChar,name) AS Found FROM Splitted) AS Found
GROUP BY name,Found
HAVING SUM(Found)>0
With
SrcTab As (
Select *
From (values ('Contains x y z')
, ('Contains x and y')
, ('Contains y only')) v (SrcField)),
CharList As ( --< CTE instead of temporary table
Select *
From (values ('x')
, ('y')
, ('z')) v (c))
Select SrcField
From SrcTab, CharList
Group By SrcField
Having SUM(SIGN(CharIndex(C, SrcField))) < 3 --< Count hits
;
If Distinct is not desirable and we need to only check count for each row:
With
SrcTab As ( --< Sample Data CTE
Select *
From (values ('Contains x y z')
, ('Contains x and y')
, ('Contains y only')
, ('Contains y only')) v (SrcField))
Select SrcField
From SrcTab
Where (
Select Count(*) --< Count hits
From (Values ('x'), ('y'), ('z')) v (c)
Where CharIndex(C, SrcField) > 0
) < 3
;
Using a Numbers table and joins. I used only 3 declared characters for demo purposes.
Input:
12345
abcdef
ab
Declared characters table (only 3 for demo purposes):
a
b
c
Output:
12345
ab
Demo:
---Table population Scripts
Create table #t
(
val varchar(20)
)
insert into #t
select '12345'
union all
select 'abcdef'
union all
select 'ab'
create table #declarecharacters
(
dc char(1)
)
insert into #declarecharacters
select 'a'
union all
select 'b'
union all
select 'c'
Query used
;with cte
as
(
select * from #t
cross apply
(
select substring(val,n,1) as strr from numbers where n<=len(val))b(outputt)
)
select val from
cte c
left join
#declarecharacters dc1
on
dc1.dc=c.outputt
group by val
having
sum(case when dc is null then 0 else 1 end ) <3
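If the declared characters really are just the letters a to z, a pattern match may be enough, with no helper table at all. The sketch below is an alternative to the approaches above, not a rewrite of them; the sample values are made up apart from '1873' and 'abcdefg' from the question, and whether upper-case letters are matched by [a-z] depends on the column's collation.

```sql
-- Keep the value unless it contains three letters anywhere in the string
SELECT v.val
FROM (VALUES ('1873'), ('abcdefg'), ('ab'), ('a1b2c3')) AS v (val)
WHERE v.val NOT LIKE '%[a-z]%[a-z]%[a-z]%';
```

With these rows the query returns '1873' and 'ab'. Note that this counts letter occurrences, so 'aaa' would be treated as having three letters.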

Creating test data for calculation using RAND()

I attempted to populate a table with two columns of random FLOATs, but every row generated was identical.
;WITH CTE (x, y) AS (
SELECT RAND(), RAND()
UNION ALL
SELECT x, y FROM CTE
)
--INSERT INTO CalculationTestData (x, y)
SELECT TOP 5000000 x, y
FROM CTE
OPTION (MAXRECURSION 0)
I can accomplish what I need just fine by not using the CTE, but this has piqued my curiosity.
Is there a way to do this quickly?
I know "quickly" is a relative term; by it I mean roughly as fast as the query above executes.
What do you expect other than for the CTE to repeat the rows? Your recursion is just selecting them again:
SELECT RAND(), RAND() -- SELECT 9 , 10
UNION ALL
SELECT x, y -- SELECT 9 , 10
What you want to do is more like this:
SELECT RAND(), RAND()
UNION ALL
SELECT RAND(), RAND() -- but the problem is that this 'row' will be duplicated
So you need to seed and reseed for each row, giving you something like:
SELECT RAND(CAST(NEWID() AS VARBINARY)),
RAND(CAST(NEWID() AS VARBINARY))
UNION ALL
SELECT RAND(CAST(NEWID() AS VARBINARY)),
RAND(CAST(NEWID() AS VARBINARY))
Using NEWID() as the seed is one way; there may well be others that are more efficient.
Try this instead of RAND(): it will give a random positive whole number on each entry. I had the same issue with RAND() recently.
ABS(Checksum(NewID()))
Float:
cast(ABS(Checksum(NewID()) ) as float)
To be Clear:
;WITH CTE (x, y) AS (
SELECT cast(ABS(Checksum(NewID()) ) as float), cast(ABS(Checksum(NewID()) ) as float)
UNION ALL
SELECT x, y FROM CTE
)
Did that not give a random entry on each row?
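The last query above still copies x and y from the previous row in its recursive member, so every row repeats the anchor values. Here is a minimal sketch of a recursive CTE that should produce a fresh pair per row, assuming RAND seeded with CHECKSUM(NEWID()) is acceptable as a random source. The counter column n and the 1000-row limit are illustrative and not from the original.

```sql
WITH CTE (n, x, y) AS
(
    -- Anchor: first row, each column seeded from its own NEWID()
    SELECT 1,
           RAND(CHECKSUM(NEWID())),
           RAND(CHECKSUM(NEWID()))
    UNION ALL
    -- Recursive member: generate new values instead of re-selecting x and y
    SELECT n + 1,
           RAND(CHECKSUM(NEWID())),
           RAND(CHECKSUM(NEWID()))
    FROM CTE
    WHERE n < 1000                  -- explicit stop condition
)
SELECT x, y
FROM CTE
OPTION (MAXRECURSION 0);            -- lift the default limit of 100 levels
```

The explicit counter gives the recursion a termination condition of its own, rather than relying on TOP to cut off an otherwise endless CTE.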

How to group ranged values using SQL Server

I have a table of values like this
978412, 400
978813, 20
978834, 50
981001, 20
As you can see, the second number added to the first is 1 less than the next number in the sequence. The last number is not in the range (it doesn't follow directly from the previous value). What I need is a CTE (yes, ideally) that will output this
978412, 472
981001, 20
The first row contains the start number of the range then the sum of the nodes within. The next row is the next range which in this example is the same as the original data.
From the article that Josh posted, here's my take (tested and working):
SELECT
MAX(t1.gapID) as gapID,
t2.gapID-MAX(t1.gapID)+t2.gapSize as gapSize
-- max(t1) is the specific lower bound of t2 because of the group by.
FROM
( -- t1 is the lower boundary of an island.
SELECT gapID
FROM gaps tbl1
WHERE
NOT EXISTS(
SELECT *
FROM gaps tbl2
WHERE tbl1.gapID = tbl2.gapID + tbl2.gapSize + 1
)
) t1
INNER JOIN ( -- t2 is the upper boundary of an island.
SELECT gapID, gapSize
FROM gaps tbl1
WHERE
NOT EXISTS(
SELECT * FROM gaps tbl2
WHERE tbl2.gapID = tbl1.gapID + tbl1.gapSize + 1
)
) t2 ON t1.gapID <= t2.gapID -- For all t1, we get all bigger t2 and opposite.
GROUP BY t2.gapID, t2.gapSize
Check out this MSDN article. It gives you a solution to your problem; whether it will work for you depends on the amount of data you have and your performance requirements for the query.
Edit:
Using the example from the question and the article's last solution, the second way to get islands (the first way resulted in an error on SQL 2005):
SELECT MIN(start) AS startGroup, endGroup, (endgroup-min(start) +1) as NumNodes
FROM (SELECT g1.gapID AS start,
(SELECT min(g2.gapID) FROM #gaps g2
WHERE g2.gapID >= g1.gapID and NOT EXISTS
(SELECT * FROM #gaps g3
WHERE g3.gapID - g2.gapID = 1)) as endGroup
FROM #gaps g1) T1 GROUP BY endGroup
The thing I added is (endgroup-min(start) +1) as NumNodes. This will give you the counts.
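On SQL Server 2012 or later (a version assumption, since the answers above target SQL 2005), the same grouping can be sketched with window functions: flag each row that does not continue the previous range, turn the flags into a running group number, then aggregate per group. The gaps rows are the question's sample data inlined; the names flagged, grouped, isStart and grp are made up for illustration.

```sql
WITH gaps (gapID, gapSize) AS
(
    -- Sample rows from the question
    SELECT *
    FROM (VALUES (978412, 400), (978813, 20), (978834, 50), (981001, 20))
         AS v (gapID, gapSize)
),
flagged AS
(
    -- A row starts a new range unless the previous row ends right before it
    SELECT gapID, gapSize,
           CASE WHEN LAG(gapID + gapSize + 1) OVER (ORDER BY gapID) = gapID
                THEN 0 ELSE 1
           END AS isStart
    FROM gaps
),
grouped AS
(
    -- Running sum of the start flags numbers the ranges 1, 2, 3, ...
    SELECT gapID, gapSize,
           SUM(isStart) OVER (ORDER BY gapID ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT MIN(gapID) AS gapID,
       MAX(gapID + gapSize) - MIN(gapID) AS gapSize
FROM grouped
GROUP BY grp
ORDER BY MIN(gapID);
```

For the sample data this returns (978412, 472) and (981001, 20), the output requested in the question.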
