Join tables with Implied / Inferred data

This would be an easy join except that Table A is explicit for all times and values, while Table B only records a row when there is a change from the previous value. Looking at Table B, one can easily infer the missing times and values, but how do I put that into a query?
Table A contains a row for every minute, each with a corresponding A.Value.
A.Time    A.Value
9:00      3.4
9:01      5.0
9:02      5.3
9:03      5.3
9:04      5.3
and so on...
Table B only contains rows where the B.value has changed from the previous value.
B.Time    B.Value
9:00      4
9:01      4.1
(no row; I know it to be 9:02 / 4.1)
(no row; I know it to be 9:03 / 4.1)
9:04      4.7
and so on...
I need a query that links A.Time and B.Value, but the query has to understand that a time missing from Table B should take the B.Value of the closest B.Time preceding it.
Final table should be
A.Time    B.Value
9:00      4
9:01      4.1
9:02      4.1
9:03      4.1
9:04      4.7
I am currently writing this for SQL Server, but I need an Oracle solution too.
Thanks in advance.

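In Oracle, you can LEFT JOIN to get all the times and then use LAST_VALUE(b.value) IGNORE NULLS ... to fill in the blanks. (NOTE: the ROWS BETWEEN ... part is effectively redundant, because an ORDER BY in the OVER() clause already defaults the window to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which gives the same result when a.time is unique. I like to spell it out for extra clarity.)
Like this: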
SELECT a.time,
       LAST_VALUE (b.VALUE)
         IGNORE NULLS
         OVER (PARTITION BY NULL
               ORDER BY a.time
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   table_a a
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;
Here is a full example with test data:
with table_a ( time, value ) as
     ( SELECT '9:00', 3.4 FROM DUAL UNION ALL
       SELECT '9:01', 5.0 FROM DUAL UNION ALL
       SELECT '9:02', 5.3 FROM DUAL UNION ALL
       SELECT '9:03', 5.3 FROM DUAL UNION ALL
       SELECT '9:04', 5.3 FROM DUAL ),
     table_b ( time, value ) as
     ( SELECT '9:00', 4 FROM DUAL UNION ALL
       SELECT '9:01', 4.1 FROM DUAL UNION ALL
       SELECT '9:04', 4.7 FROM DUAL )
SELECT a.time,
       LAST_VALUE (b.VALUE)
         IGNORE NULLS
         OVER (PARTITION BY NULL
               ORDER BY a.time
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   table_a a
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;
An alternative (which might work on SQL Server) is to use OUTER APPLY. Like so:
SELECT a.time, b.value
FROM   table_a a
OUTER APPLY ( SELECT *
              FROM   table_b b
              WHERE  b.time <= a.time
              ORDER BY b.time desc
              FETCH FIRST 1 ROW ONLY ) b
ORDER BY a.time;
Basically, this finds the most recent non-null value from table B for each row in table A.
SQL Server solution
Here is the OUTER APPLY syntax translated to SQL Server:
with table_a ( time, value ) as
     ( SELECT '9:00', 3.4 UNION ALL
       SELECT '9:01', 5.0 UNION ALL
       SELECT '9:02', 5.3 UNION ALL
       SELECT '9:03', 5.3 UNION ALL
       SELECT '9:04', 5.3 ),
     table_b ( time, value ) as
     ( SELECT '9:00', 4 UNION ALL
       SELECT '9:01', 4.1 UNION ALL
       SELECT '9:04', 4.7 )
SELECT a.time, b.value
FROM   table_a a OUTER APPLY (
         SELECT * FROM table_b b
         WHERE  b.time <= a.time
         ORDER BY b.time desc
         OFFSET 0 ROWS
         FETCH NEXT 1 ROWS ONLY ) b
ORDER BY a.time;
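The same "latest B row at or before each A time" lookup can also be written as a correlated scalar subquery. Below is a minimal sketch, run against SQLite through Python's sqlite3 module purely so that it is self-contained. The zero-padded time strings (so that text comparison orders correctly) and the LIMIT 1 syntax are assumptions of this sketch; SQL Server would use TOP 1 and Oracle FETCH FIRST 1 ROW ONLY.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server / Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (time TEXT, value REAL);
    CREATE TABLE table_b (time TEXT, value REAL);
    INSERT INTO table_a VALUES ('09:00', 3.4), ('09:01', 5.0), ('09:02', 5.3),
                               ('09:03', 5.3), ('09:04', 5.3);
    INSERT INTO table_b VALUES ('09:00', 4.0), ('09:01', 4.1), ('09:04', 4.7);
""")

# For each row of A, take the value of the latest B row at or before A's time.
rows = conn.execute("""
    SELECT a.time,
           (SELECT b.value
              FROM table_b b
             WHERE b.time <= a.time
             ORDER BY b.time DESC
             LIMIT 1) AS b_value
      FROM table_a a
     ORDER BY a.time
""").fetchall()

for time, b_value in rows:
    print(time, b_value)
```

If the real columns hold proper time or datetime values rather than strings, the comparison in the WHERE clause works without any padding.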

Related

Snowflake - is it possible to "merge" array output to table query

I have two inputs:
a. Table A with 3 columns (a, b, c) and 7 rows
b. Table B with an array column d; the array has 7 values.
Is there a way to "merge" Table A and Table B in a query, so that the first row of A and the first array value of B are printed in the same row?
Input:
For example:
Table A- Column 1
a
b
c
Table A- Column 2
d
e
f
Table B - column 1 (the only column)
k
r
j
Output:
There should be three columns:
the first two columns are columns 1 and 2 from Table A;
the third column is column 1 from Table B.

You can use Flatten to convert your array into a table. Then you just need to define ordering and how to join them together.
Here's an example, where I use a ROW_NUMBER() to provide an ordering. And since I added it to both tableA and the flattened tableB, it can also be used to join the records together.
If you have IDs for ordering or joining, then that might be slightly cleaner, but with the simple columns provided in the example, you need to do something like this to line up the array values to the table rows.
with tA as (
    select col1, col2,
           ROW_NUMBER() OVER (ORDER BY col1) rnum
    from tableA
)
, tB as (
    select x.value::string col1,
           -- x.index is the element's position in the array;
           -- ordering by a constant would not guarantee array order
           ROW_NUMBER() OVER (ORDER BY x.index) rnum
    from tableB,
         lateral flatten(input => array) x
)
SELECT a.col1, a.col2, b.col1
FROM tA a
JOIN tB b
  ON a.rnum = b.rnum;
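The three steps in that query (flatten the array, number both sides, join on the number) can be traced on the sample data from the question. This is only an illustrative sketch in plain Python, not Snowflake code; `table_a` and `table_b_array` are hypothetical stand-ins for the two inputs.

```python
# Hypothetical stand-ins for tableA's rows and the single array held in tableB.
table_a = [("a", "d"), ("b", "e"), ("c", "f")]
table_b_array = ["k", "r", "j"]

# FLATTEN step: one (index, value) pair per array element.
flattened = list(enumerate(table_b_array))

# ROW_NUMBER step: number tableA's rows under an explicit ordering (here col1).
numbered_a = list(enumerate(sorted(table_a, key=lambda row: row[0])))

# JOIN step: keep rows whose positions match on both sides.
by_index = dict(flattened)
merged = [(col1, col2, by_index[i])
          for i, (col1, col2) in numbered_a
          if i in by_index]

for row in merged:
    print(row)
```

The pairing is only as meaningful as the ordering chosen for tableA; with a real ID column on both sides, joining on that ID would be the cleaner option.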

There is no "first row" in a database; there is only a first row with respect to an order you place on the data. Thus there are a number of ways to get just the "first row" of table A and table B.
One is to select only the first rows. I will use CTEs, but it can also be done in a sub-select, via QUALIFY and ROW_NUMBER; to make it 'stable' I will order by the selected output column. To join the two tables I will then use a CROSS JOIN; since each CTE returns only one row, this gives just one output row. But this does not feel like what you want.
WITH first_from_table_a AS (
    SELECT column1, column2
    FROM table_a
    QUALIFY ROW_NUMBER() OVER (ORDER BY column1) = 1
), first_from_table_b AS (
    SELECT column1
    FROM table_b
    QUALIFY ROW_NUMBER() OVER (ORDER BY column1) = 1
)
SELECT a.column1, a.column2, b.column1
FROM first_from_table_a AS a
CROSS JOIN first_from_table_b AS b
The problem with this is that it doesn't scale if there are other rows you want to do this over.
FIRST_VALUE is a function that could also help if you join your data on some other schema and want to choose a value from a larger set, but really the problem needs to be clarified more.
Another way to read your question is to use the same ROW_NUMBER idea and join on the row numbers, thus:
WITH first_from_table_a AS (
    SELECT column1,
           column2,
           ROW_NUMBER() OVER (ORDER BY column1) AS rn
    FROM table_a
), first_from_table_b AS (
    SELECT column1,
           ROW_NUMBER() OVER (ORDER BY column1) AS rn
    FROM table_b
)
SELECT a.column1, a.column2, b.column1
FROM first_from_table_a AS a
JOIN first_from_table_b AS b
  ON a.rn = b.rn
ORDER BY a.rn

SQL Server Group By - Aggregate NULL or empty values into all other values

I am trying to group by a column. The problem is that the NULL values of the column are grouped as a separate group.
I want the NULL values to be added to each of the other group values instead.
Example of a table:
The results I want to get from group by with sum aggregation over the 'val' column:
Can anyone help me?
Thanks!

You can precalculate the value to spread through the rows and then just do arithmetic:
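select t.id,
       sum(t.val) + (tt.null_sum / tt.cnt_id)
from t cross join
     (select count(distinct id) as cnt_id,
             sum(case when id is null then val else 0 end) as null_sum
      from t
     ) tt
-- null_sum and cnt_id are not aggregated, so SQL Server needs them in the GROUP BY
group by t.id, tt.null_sum, tt.cnt_id;
Note that some databases (SQL Server included) do integer division, so you might need null_sum * 1.0 / cnt_id.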
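To see what this arithmetic does on concrete numbers, here is a small sketch run against SQLite via Python's sqlite3 module (chosen only so the example is self-contained). The table contents are made up, and the `WHERE t.id IS NOT NULL` filter and the extra GROUP BY columns are additions of this sketch, assuming the NULL group itself should not appear in the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, val INTEGER);
    -- Hypothetical data: two real groups plus one NULL-id row worth 8.
    INSERT INTO t VALUES (1, 10), (1, 20), (2, 5), (NULL, 8);
""")

rows = conn.execute("""
    SELECT t.id,
           SUM(t.val) + (tt.null_sum * 1.0 / tt.cnt_id) AS val
      FROM t
     CROSS JOIN (SELECT COUNT(DISTINCT id) AS cnt_id,   -- NULLs are not counted
                        SUM(CASE WHEN id IS NULL THEN val ELSE 0 END) AS null_sum
                   FROM t) tt
     WHERE t.id IS NOT NULL                             -- assumption: hide the NULL group
     GROUP BY t.id, tt.null_sum, tt.cnt_id
     ORDER BY t.id
""").fetchall()

for group_id, val in rows:
    print(group_id, val)
```

Note that this divides the NULL total evenly between the groups (8 / 2 = 4 each); if every group should receive the full NULL total instead, drop the division by cnt_id.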

A GROUP BY operation can't really generate values for each group on the fly, so logically the missing records need to actually be present.
One approach is a calendar-table-style CROSS JOIN that generates one copy of each NULL record for every id group:
WITH ids AS (
    SELECT DISTINCT id FROM yourTable
    WHERE id IS NOT NULL
),
cte AS (
    SELECT t1.id, t2.val
    FROM ids t1
    CROSS JOIN yourTable t2
    WHERE t2.id IS NULL
)
SELECT t.id, SUM(t.val) AS val
FROM
(
    SELECT id, val FROM yourTable WHERE id IS NOT NULL
    UNION ALL
    SELECT id, val FROM cte
) t
GROUP BY
    id;
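As a quick check of what this produces, the query can be run on made-up data. The sketch below uses SQLite through Python's sqlite3 module only to keep it self-contained; the rows in `yourTable` are hypothetical, and an ORDER BY is added so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id INTEGER, val INTEGER);
    -- Hypothetical data: two real groups plus one NULL-id row worth 8.
    INSERT INTO yourTable VALUES (1, 10), (1, 20), (2, 5), (NULL, 8);
""")

rows = conn.execute("""
    WITH ids AS (
        SELECT DISTINCT id FROM yourTable WHERE id IS NOT NULL
    ),
    cte AS (
        -- one copy of every NULL-id row for each real id
        SELECT t1.id, t2.val
          FROM ids t1
         CROSS JOIN yourTable t2
         WHERE t2.id IS NULL
    )
    SELECT t.id, SUM(t.val) AS val
      FROM (SELECT id, val FROM yourTable WHERE id IS NOT NULL
            UNION ALL
            SELECT id, val FROM cte) t
     GROUP BY id
     ORDER BY id
""").fetchall()

for group_id, val in rows:
    print(group_id, val)
```

Here each group receives the full NULL total (8) rather than a share of it, which is the main behavioural difference from the arithmetic approach in the other answer.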

Transforming and repeating multiple rows

I have a table with two IDs in it, named FamilyID and PersonID. I need to be able to repeat these rows with all combinations, as the screenshot below shows; note that each of the numbers gets an extra row.
Here is some SQL to create the table with some sample data. There is no set number of occurrences that could occur.
Is anyone aware of how this could be achieved?
CREATE TABLE #TempStackOverflow
(
    FamilyID int,
    PersonID int
)
insert into #TempStackOverflow (FamilyID, PersonID)
select 1012, 1
union
select 1013, 1
union
select 1014, 1
union
select 1015, 2
union
select 14774, 3
union
select 1019, 5

I understand that you need some sort of a complete list of matches within groups, but honestly, it would be much better if you would explain the business context, using plain English, in the first place.
The following query seems to produce your sample result:
with cte as (
    select a.FamilyID, a.PersonID, a.PersonID as [GroupId] from #TempStackOverflow a
    union all
    select b.PersonID, b.FamilyID, b.PersonID from #TempStackOverflow b
)
select distinct c.FamilyID, s.PersonID
from cte c
inner join cte s on s.GroupId = c.GroupId
where c.FamilyID != s.PersonID;

Here is the simplest version I can come up with that groups the items by PersonId, as you do above. Obviously if you don't want that, then you can remove the outer query.
SELECT FamilyId,
       PersonID
FROM (
    SELECT FamilyId, PersonId, PersonID as SortBy
    FROM #TempStackOverflow t1
    UNION
    SELECT PersonId, FamilyId, PersonId as SortBy
    FROM #TempStackOverflow t1
    UNION
    SELECT t1.FamilyID, t2.FamilyID, t1.PersonID as SortBy
    FROM #TempStackOverflow t1
    FULL OUTER JOIN #TempStackOverflow t2
      ON t1.PersonID = t2.PersonID
    WHERE t1.FamilyID != t2.FamilyID
) as Src
ORDER BY SortBy

MAX() SQL Server multiple rows. How fix to return only 1 row per month year?

I need help using the MAX() function properly: I seem to be getting more than one row even though I have clearly stated that I want MAX(monthid), which should return the last monthyear row for the customer.
What I need is the last monthyear row for either customer_segment or agreement. When I finally attach the customer_segment and agreement columns to the original, I get up to 6 different monthyear rows with different customer_segment names when I only want 1 row.
How do I fix this?
--Finding customer segment
SELECT
    a.[cust_no]
    ,Customer_Segment
    ,max(monthid) AS monthyear
INTO #Segment
FROM Original_table a
INNER JOIN Customer_Segment ku
    on ku.Cust_no=a.cust_no
GROUP BY a.cust_no,Customer_Segment
--------------------------------------------------------------------------
--Finding agreement(yes/no)
SELECT DISTINCT
    a.cust_no,
    Agreement,
    max(monthid) as Monthyear
into #Agreement
FROM Original_table a
INNER JOIN Cust_Details zx
    ON zx.cust_no=a.cust_no
GROUP BY a.cust_no,
         zx.Agreement
------------------------------------------------
-- Attaching columns to original file on cust_no
select DISTINCT
    A.cust_no,
    B.Customer_Segment,
    d.Agreement
from Original_table A
LEFT JOIN ( SELECT DISTINCT * FROM #Segment ) b
    on b.cust_no=A.cust_no
LEFT JOIN ( SELECT distinct * FROM #Agreement ) d
    ON d.cust_no=a.cust_no

Aren't you missing some info on the joins?
(...)
LEFT JOIN ( SELECT DISTINCT * FROM #Segment ) b
    on b.cust_no=A.cust_no and
       b.Customer_Segment = A.Customer_Segment
LEFT JOIN ( SELECT distinct * FROM #Agreement ) d
    ON d.cust_no=a.cust_no and
       d.Agreement = A.Agreement

Try this:
select
    A.cust_no, b.monthyear,
    e.Customer_Segment,
    d.Agreement
from Original_table A
JOIN (SELECT a.[cust_no] cust_no, max(monthid) AS monthyear
      FROM Original_table a
      -- GROUP BY is required because cust_no is selected alongside MAX()
      GROUP BY a.[cust_no]) b on b.cust_no=A.cust_no
OUTER APPLY
    ( SELECT TOP 1 Agreement FROM Cust_Details d
      WHERE d.cust_no=a.cust_no
      ORDER BY Agreement
    ) d
OUTER APPLY
    ( SELECT TOP 1 Customer_Segment FROM Customer_Segment e
      WHERE e.cust_no=a.cust_no
      ORDER BY Customer_Segment
    ) e
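The underlying reason for the extra rows in the question is that grouping by cust_no and Customer_Segment together yields one MAX(monthid) per distinct segment, not per customer. A common pattern is to compute the maximum per customer alone and join back to pick up the matching row. The sketch below shows that pattern on SQLite via Python's sqlite3 module; the `segment_history` table and its columns are hypothetical, and it assumes monthid is stored on the same row as the segment, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one segment per customer per month.
    CREATE TABLE segment_history (cust_no INTEGER, monthid INTEGER,
                                  customer_segment TEXT);
    INSERT INTO segment_history VALUES
        (1, 202001, 'Bronze'), (1, 202002, 'Silver'), (1, 202003, 'Gold'),
        (2, 202001, 'Bronze'), (2, 202002, 'Bronze');
""")

latest = conn.execute("""
    SELECT s.cust_no, s.monthid, s.customer_segment
      FROM segment_history s
      JOIN (SELECT cust_no, MAX(monthid) AS monthid   -- one row per customer
              FROM segment_history
             GROUP BY cust_no) m
        ON m.cust_no = s.cust_no
       AND m.monthid = s.monthid
     ORDER BY s.cust_no
""").fetchall()

for row in latest:
    print(row)
```

Customer 1 has three different segments over time, yet only the row for the latest month comes back.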

need empty row in output between rows with data Report Builder 3.0 T-SQL

I have reports that pull 50 random records. I would like to insert a blank row in the output between each row of data. For example, rows 1, 3, 5, 7... are populated with data, and the even-numbered rows are empty.
Thanks

A CROSS APPLY against a single-row table of NULLs yields the same number of rows as the original table, all blank.
We can then generate a ROW_NUMBER in both SELECTs and sort by it so that data rows and blank rows alternate.
select C.id, C.name, ROW_NUMBER() OVER ( ORDER BY C.id ) as seq from TableC C
UNION ALL
SELECT T.id, T.name, seq as seq
FROM
(
    select T.id, T.name, ROW_NUMBER() OVER ( ORDER by C.id ) as seq from TableC C
    cross apply ( select NULL as id, NULL as name ) T
) T
ORDER BY seq
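One detail worth pinning down: each data row and its blank twin share the same seq, so sorting by seq alone leaves their relative order unspecified. Adding a second sort key fixes that. Below is a minimal sketch of the idea on SQLite via Python's sqlite3 module; it uses id directly as the sequence (assuming id is unique and defines the desired order) instead of ROW_NUMBER, and the `is_blank` column and sample rows are inventions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableC (id INTEGER, name TEXT);
    INSERT INTO TableC VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

rows = conn.execute("""
    SELECT id, name
      FROM (SELECT id, name, id AS seq, 0 AS is_blank FROM TableC
            UNION ALL
            -- one all-NULL row per data row, carrying the same seq
            SELECT NULL, NULL, id, 1 FROM TableC)
     ORDER BY seq, is_blank   -- data row first, then its blank row
""").fetchall()

for row in rows:
    print(row)
```

As in the answer above, this also emits a blank row after the last data row; filtering out the blank row with the highest seq would remove it if that matters for the report.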

A very simple solution could be:
Select id+Temp id, name+Temp name from TableA A
Cross Apply
    (select '' as 'Temp'
     union all
     Select null) X
If you have a large number of columns, build the column_name + Temp concatenations dynamically and use them in dynamic SQL.