Interpolate missing values when joining two tables - sql-server

I have two tables with data at different densities, and I'd like to join them while interpolating the values in the lower-frequency table to fill in the gaps.
I have no idea how to approach this, other than that it is probably a lag/lead problem, but the gaps between rows are irregular.
Here is my setup:
CREATE TABLE #HighFreq
(MD INT NOT NULL,
LOSS float)
INSERT INTO #HighFreq
VALUES
(6710,0.5)
,(6711,0.6)
,(6712,0.6)
,(6713,0.5)
,(6714,0.5)
,(6715,0.4)
,(6716,0.9)
,(6717,0.9)
,(6718,0.9)
,(6719,1)
,(6720,0.8)
,(6721,0.9)
,(6722,0.7)
,(6723,0.7)
,(6724,0.7)
,(6725,0.7)
CREATE TABLE #LowFreq
(MD INT NOT NULL
,X FLOAT
,Y FLOAT)
INSERT INTO #LowFreq
VALUES
(6710,12,1000)
,(6711,8,1001)
,(6718,10,1007)
,(6724,8,1013)
,(6730,11,1028)
And I want my output to look like this:

Here is an approach using a recursive CTE and window functions. The recursive CTE generates the list of MDs spanning the values available in both tables. The idea is then to put adjacent "missing" #LowFreq records into groups, using the gaps-and-islands technique. You can then do the interpolation in the outer query, by projecting values between the first (and only) non-null value in the group and the first non-null value of the next group. Note that SQL Server stops a recursive CTE after 100 levels by default, so a wider MD range needs an OPTION (MAXRECURSION n) hint on the query.
with cte as (
select min(coalesce(h.md, l.md)) md, max(coalesce(h.md, l.md)) md_max
from #HighFreq h
full join #LowFreq l on l.md = h.md
union all
select md + 1, md_max from cte where md < md_max
)
select
md,
loss,
coalesce(x, min(x) over(partition by grp)
+ (min(lead_x) over(partition by grp) - min(x) over(partition by grp))
* (row_number() over(partition by grp order by md) - 1)
/ count(*) over(partition by grp)
) x,
coalesce(y, min(y) over(partition by grp)
+ (min(lead_y) over(partition by grp) - min(y) over(partition by grp))
* (row_number() over(partition by grp order by md) - 1)
/ count(*) over(partition by grp)
) y
from (
select
c.md,
h.loss,
l.x,
l.y,
sum(case when l.md is null then 0 else 1 end) over(order by c.md) grp,
lead(l.x) over(order by c.md) lead_x,
lead(l.y) over(order by c.md) lead_y
from cte c
left join #HighFreq h on h.md = c.md
left join #LowFreq l on l.md = c.md
) t
Demo on DB Fiddle:
md | loss | x | y
---: | ---: | ---------------: | ---------------:
6710 | 0.5 | 12 | 1000
6711 | 0.6 | 8 | 1001
6712 | 0.6 | 8.28571428571429 | 1001.85714285714
6713 | 0.5 | 8.57142857142857 | 1002.71428571429
6714 | 0.5 | 8.85714285714286 | 1003.57142857143
6715 | 0.4 | 9.14285714285714 | 1004.42857142857
6716 | 0.9 | 9.42857142857143 | 1005.28571428571
6717 | 0.9 | 9.71428571428571 | 1006.14285714286
6718 | 0.9 | 10 | 1007
6719 | 1 | 9.66666666666667 | 1008
6720 | 0.8 | 9.33333333333333 | 1009
6721 | 0.9 | 9 | 1010
6722 | 0.7 | 8.66666666666667 | 1011
6723 | 0.7 | 8.33333333333333 | 1012
6724 | 0.7 | 8 | 1013
6725 | 0.7 | 8.5 | 1015.5
6726 | null | 9 | 1018
6727 | null | 9.5 | 1020.5
6728 | null | 10 | 1023
6729 | null | 10.5 | 1025.5
6730 | null | 11 | 1028
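As a cross-check on the arithmetic, here is a minimal Python sketch of the same linear interpolation, independent of the SQL above. The function name `interpolate` and the list `low_x` are made up for illustration; the data is the X column of #LowFreq. One difference worth noting: the query divides by the row count of each group, which relies on MD increasing in steps of 1, while the sketch divides by the actual MD distance.

```python
from bisect import bisect_right

def interpolate(known, md):
    """Linearly interpolate at md from (md, value) pairs sorted by md.

    Returns None when md lies outside the range of known points.
    """
    mds = [m for m, _ in known]
    i = bisect_right(mds, md) - 1      # last known point at or before md
    if i < 0:
        return None                    # before the first known point
    m0, v0 = known[i]
    if m0 == md:
        return v0                      # exact match, nothing to interpolate
    if i + 1 == len(known):
        return None                    # past the last known point
    m1, v1 = known[i + 1]
    return v0 + (v1 - v0) * (md - m0) / (m1 - m0)

# X column of #LowFreq as (MD, X) pairs
low_x = [(6710, 12), (6711, 8), (6718, 10), (6724, 8), (6730, 11)]
x_6712 = interpolate(low_x, 6712)
x_6725 = interpolate(low_x, 6725)
```

Here `x_6712` is 8 + 2 * 1/7 (about 8.2857) and `x_6725` is 8.5, which agrees with the rows for 6712 and 6725 in the result above.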

Related

How to check if SQL records are in a specific order

I'm having trouble figuring out how I can check if records on a table are in a specific order. The simplified table design is essentially this:
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 |
| BB22001... | 8/17/2019 3:33 | 21 |
| 4D40B12... | 8/17/2019 3:47 | 21 |
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
The ID column is a non-sequential GUID. The table is sorted by default on the StartDate when data is entered. However I am trying to flag instances where the NumCol values are out of descending order. The NumCol values can be identical on adjacent records, but ultimately they must be descending.
+--------+
| NumCol |
+--------+
| 22 |
| *20 | <- I want the ID where this occurs
| 21 |
| 20 |
| 20 |
| 19 |
| 19 |
| 18 |
+--------+
I've tried to LEFT JOIN this table to itself, but can't seem to come up with an ON clause that gives the right results:
ON a.ID <> b.ID AND a.NumCol > b.NumCol
I also thought I could use OFFSET n ROWS to compare the default sorted table against one with an ORDER BY NumCol performed on it. I can't come up with anything that works.
I need a solution that will work for both SQL Server and SQL Compact.
With EXISTS:
select t.* from tablename t
where exists (
select 1 from tablename
where numcol > t.numcol and startdate > t.startdate
)
Or with row_number() window function:
select t.id, t.startdate, t.numcol
from (
select *,
row_number() over (order by startdate desc) rn1,
row_number() over (order by numcol, startdate desc) rn2 -- tie-breaker keeps equal numcol values in a deterministic order
from tablename
) t
where rn1 > rn2
See the demo.
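To try the EXISTS version without a SQL Server instance, a minimal sketch using Python's built-in sqlite3 module could look like this. SQLite standing in for SQL Server is an assumption, and the table name `readings`, the shortened IDs, and the ISO-formatted dates are invented for the example. The second row is deliberately out of order, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ID TEXT, StartDate TEXT, NumCol INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("A", "2019-08-17 03:11", 22),
        ("B", "2019-08-17 03:33", 20),  # out of order: a later row has 21
        ("C", "2019-08-17 03:47", 21),
        ("D", "2019-08-17 04:06", 20),
        ("E", "2019-08-17 04:22", 20),
        ("F", "2019-08-17 04:40", 19),
        ("G", "2019-08-17 05:09", 19),
        ("H", "2019-08-17 05:24", 18),
    ],
)

# A row is flagged when some later row carries a larger NumCol.
flagged = [
    row[0]
    for row in conn.execute(
        """
        SELECT r.ID
        FROM readings r
        WHERE EXISTS (
            SELECT 1 FROM readings later
            WHERE later.NumCol > r.NumCol
              AND later.StartDate > r.StartDate
        )
        ORDER BY r.StartDate
        """
    )
]
```

With this data only row B ends up in `flagged`. The query uses nothing beyond a correlated subquery, which is why it is a reasonable candidate when window functions are not available.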
This might be easiest:
select * from T t1
where NumCol < (select max(NumCol) from T t2 where t2.StartDate > t1.StartDate);
The exists version is probably better to optimize though.
Using analytic functions you could try this approach which finds breaks in the monotonicity of consecutive rows. It might not return all the rows you're interested in seeing:
with data as (
select *, lag(NumCol) over (order by StartDate desc) as prevNumCol
from T
)
select * from data where prevNumCol > NumCol;
Here's a better solution that's probably not available in both of your environments:
with data as (
select *,
max(NumCol) over (
order by StartDate desc
rows between unbounded preceding and current row
) as prevMax
from T
)
select * from data where prevMax > NumCol;
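The running-maximum idea can be sketched outside SQL as well. The following Python function is an illustration only (the name `out_of_order` and the sample rows are hypothetical): it walks the rows from latest to earliest, tracks the largest NumCol seen so far, and flags any row that falls below it.

```python
def out_of_order(rows):
    """rows: iterable of (id, start_date, num_col).

    Returns the ids, earliest first, whose num_col is lower than
    the num_col of at least one later row.
    """
    flagged = []
    later_max = None
    # Latest row first, mirroring ORDER BY StartDate DESC in the query.
    for row_id, _, num in sorted(rows, key=lambda r: r[1], reverse=True):
        if later_max is not None and num < later_max:
            flagged.append(row_id)
        later_max = num if later_max is None else max(later_max, num)
    return flagged[::-1]

sample = [
    ("A", "2019-08-17 03:11", 22),
    ("B", "2019-08-17 03:33", 20),  # followed by 21
    ("C", "2019-08-17 03:47", 21),
    ("D", "2019-08-17 04:06", 20),
    ("E", "2019-08-17 04:22", 18),  # followed by 19
    ("F", "2019-08-17 04:40", 19),
    ("G", "2019-08-17 05:09", 19),
    ("H", "2019-08-17 05:24", 18),
]
result = out_of_order(sample)
```

For this sample, `result` contains B and E. A plain lag comparison also finds both here, but it would miss a low row whose immediate successor is equally low while a higher value appears further on, which is the case the running maximum covers.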

Reconstructing Balances By Weekly Transaction Sums

I am looking for some advice or pointers on how to construct this. I have spent the last year teaching myself SQL. I am at work and only have access to the query interface in Report Builder, which for me means no procedures, no CREATE TABLE, and no IDE :(. So those are the limitations!
I am trying to reconstruct account balances. I have no intervening balances, only the current balance and a table containing the full transaction history.
My current approach is to sum the transactions by posting week (which I have done) in my CTE named
[SUMTRANSREF]
+--------------+------------+-----------+
| TNCY-SYS-REF | POSTING-WK | SUM-TRANS |
+--------------+------------+-----------+
| 1 | 47 | 37.95 |
| 1 | 46 | 37.95 |
| 1 | 45 | 37.95 |
| 2 | 47 | 50.00 |
| 2 | 46 | 25.00 |
| 2 | 45 | 25.00 |
+--------------+------------+-----------+
I then get the current balances in another CTE called
[CBAL]
+--------------+-------------+-----------+
| TNCY-SYS-REF | CUR-BALANCE | CURR-WEEK |
+--------------+-------------+-----------+
| 1 | 27.52 | 47 |
| 2 | 52.00 | 47 |
+--------------+-------------+-----------+
Now I am assuming I could create intervening CTEs to sum and then splice those altogether but is there a smarter (more automated) way?
Ideally my result should be
+--------------+-------------+----------+----------+
| TNCY-SYS-REF | CUR-BALANCE | BAL-WK46 | BAL-Wk45 |
+--------------+-------------+----------+----------+
| 1 | 27.52 | -10.43 | -48.38 |
| 2 | 52.00 | 2.00 | -48.00 |
+--------------+-------------+----------+----------+
I am just uncertain because each column requires the sum of the intervening transactions.
So BAL-WK46 is (CUR-BALANCE) - SUM(Transactions from 47)
So BAL-WK45 is (CUR-BALANCE) - SUM(Transactions 46+47)
So BAL-WK44 is (CUR-BALANCE) - SUM(Transactions 45+46+47)
and so on.
Normally I have an idea where to start but I am flummoxed by this one.
Any help you can give would be appreciated. Thank you
Here is some T-SQL that gets the result you require. It should be easy enough to adapt to what you want.
It makes use of a recursive CTE and a PIVOT. The temp tables only stand in for your existing CTEs.
IF OBJECT_ID('Tempdb..#SUMTRANSREF') IS NOT NULL
DROP TABLE #SUMTRANSREF
IF OBJECT_ID('Tempdb..#CBAL') IS NOT NULL
DROP TABLE #CBAL
IF OBJECT_ID('Tempdb..#TEMP') IS NOT NULL
DROP TABLE #TEMP
CREATE TABLE #SUMTRANSREF
(
[TNCY-SYS-REF] int,
[POSTING-WK] int,
[SUM-TRANS] float
)
CREATE TABLE #CBAL
(
[TNCY-SYS-REF] int ,
[CUR-BALANCE] float , [CURR-WEEK] int
)
INSERT INTO #SUMTRANSREF
VALUES (1 ,47 , 37.95),
(1 ,46 , 37.95),
(1 ,45 , 37.95),
(2 ,47 , 50.00),
(2 ,46 , 25.00),
(2 ,45 , 25.00 )
INSERT INTO #CBAL
VALUES (1,27.52,47),(2,52.00,47);
WITH CBAL AS
(SELECT * FROM #CBAL),
SUMTRANSREF AS(SELECT * FROM #SUMTRANSREF),
RecursiveTotals([TNCY-SYS-REF],[CURR-WEEK],[CUR-BALANCE],RunningBalance)
AS
(
select C.[TNCY-SYS-REF], C.[CURR-WEEK],C.[CUR-BALANCE],C.[CUR-BALANCE] + S.RunningTotal RunningBalance from CBAL C
JOIN (select *,-SUM([SUM-TRANS]) OVER (PARTITION BY [TNCY-SYS-REF] ORDER BY [POSTING-WK] DESC) RunningTotal
from SUMTRANSREF) S
ON C.[CURR-WEEK]=S.[POSTING-WK] AND C.[TNCY-SYS-REF]=S.[TNCY-SYS-REF]
UNION ALL
select RT.[TNCY-SYS-REF], RT.[CURR-WEEK] -1 [CURR_WEEK],RT.[CUR-BALANCE],RT.[CUR-BALANCE] + S.RunningTotal RunningBalance FROM RecursiveTotals RT
JOIN (select *,-SUM([SUM-TRANS]) OVER (PARTITION BY [TNCY-SYS-REF] ORDER BY [POSTING-WK] DESC) RunningTotal
from SUMTRANSREF) S ON RT.[TNCY-SYS-REF] = S.[TNCY-SYS-REF] AND RT.[CURR-WEEK]-1 = S.[POSTING-WK]
)
select [TNCY-SYS-REF],[CUR-BALANCE],[46] as 'BAL-WK46',[45] as 'BAL-WK45',[44] as 'BAL-WK44'
FROM (
select [TNCY-SYS-REF],[CUR-BALANCE],RunningBalance,BalanceWeek from (SELECT *,R.[CURR-WEEK]-1 'BalanceWeek' FROm RecursiveTotals R
) RT) AS SOURCETABLE
PIVOT
(
AVG(RunningBalance)
FOR BalanceWeek in ([46],[45],[44])
) as PVT
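The arithmetic behind the query can be sketched in a few lines of Python, which may help when checking the pivoted numbers. This is an illustration under assumptions, not a replacement for the SQL: the function name `prior_balances` and the dictionary layout are invented, and `Decimal` is used instead of float to keep the cents exact.

```python
from decimal import Decimal
from itertools import accumulate

# Weekly transaction sums per tenancy, as in SUMTRANSREF.
weekly_sums = {
    1: {47: Decimal("37.95"), 46: Decimal("37.95"), 45: Decimal("37.95")},
    2: {47: Decimal("50.00"), 46: Decimal("25.00"), 45: Decimal("25.00")},
}
# Current balance and current week per tenancy, as in CBAL.
current = {1: (Decimal("27.52"), 47), 2: (Decimal("52.00"), 47)}

def prior_balances(cur_balance, cur_week, sums_by_week):
    """Return {week: balance} obtained by backing out weekly sums.

    The balance for week w - 1 is the current balance minus all
    transactions posted from week w up to the current week.
    """
    weeks = sorted((w for w in sums_by_week if w <= cur_week), reverse=True)
    running = accumulate(sums_by_week[w] for w in weeks)
    return {w - 1: cur_balance - total for w, total in zip(weeks, running)}

balances = {
    ref: prior_balances(bal, week, weekly_sums[ref])
    for ref, (bal, week) in current.items()
}
```

For tenancy 1 this gives -10.43 for week 46 and -48.38 for week 45. The running total plays the same role as the `SUM(...) OVER (... ORDER BY [POSTING-WK] DESC)` window in the query.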

Find max and min of a column and update the first column sql server

Based on the product and product key, I need to update the column ord_by. There should be only one Min and one Max row for each product and product_key.
E.g: Table
+-------------+---------+-------+--------+
| Product_key | product | price | ord_by |
+-------------+---------+-------+--------+
| 1 | ABC | 10 | |
| 1 | ABC | 10 | |
| 1 | ABC | 20 | |
| 1 | ABC | 100 | |
| 1 | ABC | 100 | |
| 2 | EFG | 20 | |
| 2 | EFG | 40 | |
| 3 | ABC | 100 | |
+-------------+---------+-------+--------+
Expected output:
+-------------+---------+-------+--------+
| Product_key | product | price | ord_by |
+-------------+---------+-------+--------+
| 1 | ABC | 10 | Min |
| 1 | ABC | 10 | Mid |
| 1 | ABC | 20 | Mid |
| 1 | ABC | 100 | Mid |
| 1 | ABC | 100 | Max |
| 2 | EFG | 20 | Min |
| 2 | EFG | 40 | Max |
| 3 | ABC | 100 | None |
+-------------+---------+-------+--------+
My attempt:
;WITH ord_cte
AS (
SELECT product
,product_key
,max(price) as max_price
,min(price) as min_price
FROM t_prod_ord
group by product,product_key
)
UPDATE t1
SET ord_by = case
when t2.max_price =t2.min_price then 'none'
when t2.max_price=t1.price then 'max'
when t2.min_price=t1.price then 'min'
else 'mid' end
FROM t_prod_ord t1
INNER JOIN ord_cte t2 ON t1.product_key = t2.product_key and t1.product=t2.product
With this query, more than one row per group gets marked as max or min in the ord_by column.
Generate a row number within each Product_key, ordered by price in both ASC and DESC order, then use those row numbers in a CASE expression to find the Min/Max rows.
The Count() Over() window aggregate gives the total row count for each Product_key, which we can use to detect the None case.
Here is one way:
;WITH cte
AS (SELECT *,
Row_number()OVER(PARTITION BY Product_key ORDER BY price) AS Min_KEY,
Row_number()OVER(PARTITION BY Product_key ORDER BY price DESC) AS Max_KEY,
Count(1)OVER(partition BY Product_key) AS cnt
FROM Yourtable)
SELECT Product_key,
product,
price,
CASE
WHEN cnt = 1 THEN 'None'
WHEN Min_KEY = 1 THEN 'Min'
WHEN Max_Key = 1 THEN 'Max'
ELSE 'Mid'
END AS ord_by
FROM cte
Another way to do it, without a CTE:
SELECT [Product_key],
[product],
[price],
CASE
WHEN Max(RN)
OVER(
PARTITION BY PRODUCT_KEY, PRODUCT
)=1 AND RN=1 THEN 'NONE'
WHEN Min(RN)
OVER(
PARTITION BY PRODUCT_KEY, PRODUCT
) = RN THEN 'MIN'
WHEN Max(RN)
OVER(
PARTITION BY PRODUCT_KEY, PRODUCT
) = RN THEN 'MAX'
ELSE 'MID'
END ORDER_BY
FROM (SELECT *,
Row_number()
OVER(
PARTITION BY PRODUCT_KEY, PRODUCT
ORDER BY PRICE) RN
FROM TABLE1)Z
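The row-number approach can be tried against the sample data with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite (version 3.25 or later, for window functions) stands in for SQL Server, and the partition uses both product_key and product as in the question. With tied prices, which of the tied rows receives the Min or Max label is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_prod_ord (product_key INTEGER, product TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO t_prod_ord VALUES (?, ?, ?)",
    [
        (1, "ABC", 10), (1, "ABC", 10), (1, "ABC", 20),
        (1, "ABC", 100), (1, "ABC", 100),
        (2, "EFG", 20), (2, "EFG", 40),
        (3, "ABC", 100),
    ],
)

query = """
WITH cte AS (
    SELECT product_key, product, price,
           ROW_NUMBER() OVER (PARTITION BY product_key, product
                              ORDER BY price)      AS min_key,
           ROW_NUMBER() OVER (PARTITION BY product_key, product
                              ORDER BY price DESC) AS max_key,
           COUNT(*)     OVER (PARTITION BY product_key, product) AS cnt
    FROM t_prod_ord
)
SELECT product_key, price,
       CASE WHEN cnt = 1     THEN 'None'   -- single-row group
            WHEN min_key = 1 THEN 'Min'    -- exactly one row per group
            WHEN max_key = 1 THEN 'Max'    -- exactly one row per group
            ELSE 'Mid'
       END AS ord_by
FROM cte
ORDER BY product_key, price, min_key
"""
labels = list(conn.execute(query))
```

Each entry of `labels` is a `(product_key, price, ord_by)` tuple. For product_key 1 the five rows carry one Min, three Mid, and one Max, so the duplicate prices no longer produce duplicate Min or Max labels.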

Grouping by a same range of multiple values with sum and counts in SQL

I want to group different columns into the same ranges, with one row per range. Example:
Amount1 | Amount2
------------------------
20,00 | 30,00
35,00 | 32,00
12,00 | 51,00
101,00 | 100,00
Here the result should be:
Range |TotalAmount1 |TotalAmount2 | CountAmount1 | CountAmount2 | RateOfCountAmount1
-----------------------------------------------------------------------------
0-50 | 67,00 | 62,00 | 3 | 2 | %75
50-100 | 0,00 | 151,00 | 0 | 2 | %0
100+ | 101,00 | 0,00 | 1 | 0 | %25
Total | 168,00 | 213,00 | 4 | 4 | %100
Here is the example : http://sqlfiddle.com/#!9/05fd3
You can query it like this:
;with cte as (
-- CASE branches are tested in order, so an amount of exactly 50 lands in '0-50'
select case when amount1 <= 50 then '0-50'
when amount1 <= 100 then '50-100'
when amount1 > 100 then '100+' end as rngamt1,
case when amount2 <= 50 then '0-50'
when amount2 <= 100 then '50-100'
when amount2 > 100 then '100+' end as rngamt2,
* from amounts
), cte2 as (select coalesce(rngamt1, rngamt2) as [Range], isnull(a.TotalAmount1,0) as TotalAmount1, isnull(b.TotalAmount2, 0) as TotalAmount2, isnull( a.TotalCount1 , 0) as TotalCount1, isnull(b.TotalCount2, 0) as Totalcount2 from
(select rngamt1, sum(amount1) TotalAmount1, count(amount1) TotalCount1 from cte c
group by rngamt1) a
full join
(select rngamt2, sum(amount2) TotalAmount2, count(amount2) TotalCount2 from cte c
group by rngamt2) b
on a.rngamt1 = b.rngamt2
)
select *, (TotalCount1 * 100 )/sum(TotalCount1) over () as RateCount1
from cte2
union
select 'Total' as [Range], sum(TotalAmount1) as TotalAmount1, sum(totalAmount2) as TotalAmount2,
sum(TotalCount1) as TotalCount1, sum(Totalcount2) as TotalCount2, (sum(TotalCount1)*100)/Sum(TotalCount1) as RateCount1 from cte2
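To check the bucketing logic independently of the database, here is a small Python sketch over the sample amounts. The function names `bucket` and `summarize` are invented for the example, and placing the boundary values 50 and 100 in the lower range is an assumption (the sample only shows that 100 belongs to 50-100).

```python
from collections import defaultdict

def bucket(amount):
    """Map an amount to its range label; 50 and 100 go to the lower range."""
    if amount <= 50:
        return "0-50"
    if amount <= 100:
        return "50-100"
    return "100+"

def summarize(rows):
    """rows: list of (amount1, amount2).

    Returns {range: [total1, total2, count1, count2]}; each column is
    bucketed on its own, as the two grouped subqueries do.
    """
    stats = defaultdict(lambda: [0, 0, 0, 0])
    for amount1, amount2 in rows:
        stats[bucket(amount1)][0] += amount1
        stats[bucket(amount1)][2] += 1
        stats[bucket(amount2)][1] += amount2
        stats[bucket(amount2)][3] += 1
    return dict(stats)

rows = [(20, 30), (35, 32), (12, 51), (101, 100)]
summary = summarize(rows)
total_count1 = sum(s[2] for s in summary.values())
# Integer percentage of Amount1 rows per range, like RateCount1.
rates = {rng: 100 * s[2] // total_count1 for rng, s in summary.items()}
```

For the sample rows, the 0-50 range totals 67 for Amount1 and 62 for Amount2, and `rates` comes out as 75, 0, and 25 for the three ranges.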

TSQL pivot issue

Hello I have a temp table (#tempResult) that contains results like the following...
-----------------------------------------
| DrugAliasID | Dosage1 | Unit1 | rowID |
-----------------------------------------
| 322 | 10 | MG | 1 |
| 322 | 50 | ML | 2 |
| 441 | 20 | ML | 3 |
| 443 | 15 | ML | 4 |
-----------------------------------------
I'm looking to get the results to be like the following, pivoting the rows that have the same DrugAliasID.
--------------------------------------------------
| DrugAliasID | Dosage1 | Unit1 | Dosage2 | Unit2 |
--------------------------------------------------
| 322 | 10 | MG | 50 | ML |
| 441 | 20 | ML | NULL | NULL |
| 443 | 15 | ML | NULL | NULL |
--------------------------------------------------
So far I have a solution that isn't using pivot. I'm not too good with pivot and was wondering if anyone knew how to use it in this scenario. Or solve it some other way. Thanks
SELECT
tr.drugAliasID,
MIN(trmin.dosage1) AS dosage1,
MIN(trmin.unit1) AS unit1,
MIN(trmax.dosage1) AS dosage2,
MIN(trmax.unit1) AS unit2
FROM
#tempResult tr
JOIN
#tempResult trmin ON trmin.RowID = tr.rowid AND trmin.drugAliasID = tr.drugAliasID
JOIN
#tempResult trmax ON trmax.RowID = tr.rowid AND trmax.drugAliasID = tr.drugAliasID
JOIN
(SELECT
MIN(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr1 ON tr1.rowid = trmin.RowID
JOIN
(SELECT
MAX(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr2 ON tr2.rowid = tr.RowID
GROUP BY
tr.drugAliasID
HAVING
count(tr.drugAliasID) > 1
Assuming your version of SQL Server supports the use of CTEs, you can simplify your query thus:
;with cte as
(select *, row_number() over (partition by drugaliasid order by rowid) rn
from #tempResult
)
select c.drugaliasid, c.dosage1, c.unit1, c2.dosage1 as dosage2, c2.unit1 as unit2
from cte c
left join cte c2 on c.drugaliasid = c2.drugaliasid and c.rn = 1 and c2.rn = 2
where c.rn = 1
Demo
This will give you the desired result, without having to use the pivot keyword.
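An alternative that avoids the self-join is conditional aggregation over the same row numbers, which also extends more easily to a third or fourth dosage. The sketch below runs it through Python's built-in sqlite3 module; SQLite (3.25 or later) standing in for SQL Server and the table name `tempResult` without the `#` prefix are assumptions made so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tempResult "
    "(DrugAliasID INTEGER, Dosage1 INTEGER, Unit1 TEXT, rowID INTEGER)"
)
conn.executemany(
    "INSERT INTO tempResult VALUES (?, ?, ?, ?)",
    [(322, 10, "MG", 1), (322, 50, "ML", 2), (441, 20, "ML", 3), (443, 15, "ML", 4)],
)

query = """
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY DrugAliasID ORDER BY rowID) AS rn
    FROM tempResult
)
SELECT DrugAliasID,
       MAX(CASE WHEN rn = 1 THEN Dosage1 END) AS Dosage1,
       MAX(CASE WHEN rn = 1 THEN Unit1   END) AS Unit1,
       MAX(CASE WHEN rn = 2 THEN Dosage1 END) AS Dosage2,  -- NULL if no 2nd row
       MAX(CASE WHEN rn = 2 THEN Unit1   END) AS Unit2
FROM cte
GROUP BY DrugAliasID
ORDER BY DrugAliasID
"""
pivoted = list(conn.execute(query))
```

Each `CASE` picks the value from one numbered row and `MAX` collapses the group to that single value, so drugs with only one row get NULL (Python `None`) in the second pair of columns.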
