Snowflake - is it possible to "merge" array output to table query

I have two inputs:
a. Table A with 3 columns (a, b, c) and 7 rows.
b. Table B with an array column d; the array has 7 values.
Is there a way to "merge" Table A and Table B in a query, so that the first row of A and the first value of B's array are printed in the same row?
For example (shortened to three rows):
Input:
Table A:
Column1 Column2
a d
b e
c f
Table B, column 1 (the only column), holding the values:
k
r
j
Output:
Should be three columns:
the first two columns are columns 1 and 2 from Table A;
the third column is column 1 from Table B.

You can use FLATTEN to convert your array into a table. Then you just need to define an ordering and decide how to join the two row sets together.
Here's an example where I use ROW_NUMBER() to provide that ordering. Because the same numbering is added to both tableA and the flattened tableB, it can also be used to join the records together. On the array side, order by the INDEX column that FLATTEN returns so the numbering follows the element positions; ordering by a constant such as ORDER BY 1 leaves the order undefined.
If you have IDs for ordering or joining, then that might be slightly cleaner, but with the simple columns provided in the example, you need to do something like this to line up the array values with the table rows.
with tA as (
    select col1, col2,
           ROW_NUMBER() OVER (ORDER BY col1) rnum
    from tableA
)
,tB as (
    select
        x.value::string col1,
        -- x.index is the 0-based position of the element within the array
        ROW_NUMBER() OVER (ORDER BY x.index) rnum
    from tableB,
         -- "array" stands for the name of your array column
         lateral flatten(input => array) x
)
SELECT a.col1, a.col2, b.col1
FROM tA a
JOIN tB b
    ON a.rnum = b.rnum;
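The same idea can be tried outside Snowflake. The following is only a sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in: the array is flattened in Python with enumerate(), which plays the role of FLATTEN's INDEX column, and the hypothetical flat_b table holds the result. The sample values come from the question.

```python
import sqlite3

table_a = [("a", "d"), ("b", "e"), ("c", "f")]
array_d = ["k", "r", "j"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (col1 TEXT, col2 TEXT)")
conn.execute("CREATE TABLE flat_b (idx INTEGER, value TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?)", table_a)
# enumerate() keeps each element's array position, like FLATTEN's INDEX column.
conn.executemany("INSERT INTO flat_b VALUES (?, ?)", list(enumerate(array_d)))

rows = conn.execute("""
    WITH ta AS (
        SELECT col1, col2, ROW_NUMBER() OVER (ORDER BY col1) AS rnum
        FROM table_a
    ),
    tb AS (
        SELECT value, ROW_NUMBER() OVER (ORDER BY idx) AS rnum
        FROM flat_b
    )
    SELECT ta.col1, ta.col2, tb.value
    FROM ta
    JOIN tb ON ta.rnum = tb.rnum
    ORDER BY ta.rnum
""").fetchall()

for row in rows:
    print(row)
```

Because the array side is numbered by its stored position rather than by a constant, the pairing does not depend on the order in which the engine happens to return the flattened rows.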

There is no "first row" in a database table. There is only a first row with respect to an ordering you place on the data. Given that, there are a number of ways to get just the "first row" of table A and table B.
One is to select only the first row of each. I will use CTEs, but it can also be done with sub-selects. Each CTE uses QUALIFY with ROW_NUMBER, and to make the result stable I order by the selected output column. To combine the two, I use a CROSS JOIN; since each CTE returns exactly one row, this gives just one output row. But this does not feel like what you want.
WITH first_from_table_a AS (
SELECT column1, column2
FROM table_a
QUALIFY ROW_NUMBER() OVER (ORDER BY column1) = 1
), first_from_table_b AS (
SELECT column1
FROM table_b
QUALIFY ROW_NUMBER() OVER (ORDER BY column1) = 1
)
SELECT a.column1, a.column2, b.column1
FROM first_from_table_a AS a
CROSS JOIN first_from_table_b AS b
The problem with this is that it doesn't scale if you want to do the same thing for more than one row.
FIRST_VALUE is a function that could also help if you join your data on some other key and want to choose one value from a larger set, but really the problem needs to be clarified more.
Another way to read your question is to use the same ROW_NUMBER idea on both tables and join on the row numbers, thus:
WITH first_from_table_a AS (
SELECT column1,
column2,
ROW_NUMBER() OVER (ORDER BY column1) AS rn
FROM table_a
), first_from_table_b AS (
SELECT column1,
ROW_NUMBER() OVER (ORDER BY column1) AS rn
FROM table_b
)
SELECT a.column1, a.column2, b.column1
FROM first_from_table_a AS a
JOIN first_from_table_b AS b
ON a.rn = b.rn
ORDER BY a.rn

Related

How to join a table while using flatten logic in Snowflake

I have used LATERAL FLATTEN logic in Snowflake in the query below. The query works up to alias A.
I'm getting an invalid identifier error when I use the join condition based on the common column: ON B.product_id=A.product_id
SELECT A.ID, INDEX, purchase_list,
CASE WHEN INDEX = 1 and purchase_list NOT IN('121=find-values','122=find_results','123=item_details','',' ')
THEN purchase_list END as item_no
FROM (SELECT ID,index,d.value::string AS purchase_list FROM (
SELECT ID,c.value::string AS purchase_list
FROM table_1,lateral flatten(INPUT=>split(po_purchase_list, '|')) c
),
LATERAL flatten(INPUT=>split(purchase_list, ';')) d
) A -- The query would be correct till here
JOIN
table_2 B -- This is the table I need to join with table_1
ON B.product_id=A.product_id
WHERE date='2022-03-03'
AND b.item_src NOT IN('0','1','3','4')

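From the error message, it looks like Snowflake cannot find the column product_id in one of the tables or subqueries. The subquery aliased A only selects ID, index, and purchase_list, so A.product_id does not exist outside it. Can you include product_id in the SELECT list at each level of the subquery that makes up A and see if it works?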
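The effect is easy to reproduce on a small scale. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of Snowflake and hypothetical sample rows; the flatten steps are left out because only the visibility of the column matters here. The derived table A exposes just the columns in its own SELECT list, so the join fails until product_id is added to that list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, product_id INTEGER, purchase_list TEXT);
    CREATE TABLE table_2 (product_id INTEGER, item_src TEXT);
    INSERT INTO table_1 VALUES (1, 10, 'x'), (2, 20, 'y');
    INSERT INTO table_2 VALUES (10, '2'), (20, '5');
""")

# A does not select product_id, so A.product_id cannot be resolved.
broken = """
    SELECT A.id, B.item_src
    FROM (SELECT id, purchase_list FROM table_1) A
    JOIN table_2 B ON B.product_id = A.product_id
"""

# Same query with product_id carried through the subquery.
fixed = """
    SELECT A.id, B.item_src
    FROM (SELECT id, product_id, purchase_list FROM table_1) A
    JOIN table_2 B ON B.product_id = A.product_id
    ORDER BY A.id
"""

try:
    conn.execute(broken)
    broken_error = None
except sqlite3.OperationalError as exc:
    broken_error = str(exc)

fixed_rows = conn.execute(fixed).fetchall()
print("broken query:", broken_error)
print("fixed query:", fixed_rows)
```

With nested subqueries, as in the question, the column has to be selected at every level between the base table and the alias used in the ON clause.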

Merging two columns from two tables

I have to merge two columns from two unrelated tables with the same number of rows into another table, like
Table A:
AColumn
'ABC'
'152'
'XXX'
Table B:
BColumn
'FF'
'CD'
'91'
Expected result for the destination table (table C):
CColumn1 CColumn2
'ABC' 'FF'
'152' 'CD'
'XXX' '91'
Apparently this looks very simple but I can't find a way to achieve it.
My attempt would be something like:
SELECT A.AColumn as CColumn1, B.BColumn as CColumn2 into C
FROM A INNER JOIN B ON 1=1
but this obviously generates all the possible combinations over the elements, while I just want the first row from A matched with the first row from B, the second row with the second row, etc.
Any help?

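You need to add a row_number to each table and join on that. Here is an example using two CTEs. Note that order by (select null) numbers the rows in whatever order the engine returns them, so use a real ordering column if the pairing must be deterministic: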
with a as (
select AColumn, row_number() over(order by (select null)) rn
from Table1
),
b as (
select BColumn, row_number() over(order by (select null)) rn
from Table2
)
select a.AColumn, b.BColumn
from a full outer join b
on a.rn = b.rn
Sql Fiddle: http://sqlfiddle.com/#!18/291c7/4
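To see what the full outer join on the row number buys you, here is a small Python sketch of the same pairing logic, not of SQL Server itself. It assumes the question's sample values, with one row deliberately dropped from the second list: enumerate() plays the role of row_number(), and itertools.zip_longest pads the shorter side with None the way the outer join pads with NULL.

```python
from itertools import zip_longest

a_column = ["ABC", "152", "XXX"]
b_column = ["FF", "CD"]  # one row short on purpose

# Pair values by position; the missing partner for 'XXX' becomes None,
# just as FULL OUTER JOIN ... ON a.rn = b.rn would return NULL for it.
paired = [
    (rn, a, b)
    for rn, (a, b) in enumerate(zip_longest(a_column, b_column), start=1)
]

for row in paired:
    print(row)
```

With an inner join instead, the unmatched third row would be dropped silently, which is why the outer join is the safer choice when the two row counts are not guaranteed to match.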

Join tables with Implied / Inferred data

This would be an easy join except that Table A is explicit for all times and values, but Table B only records rows when there is a change from the previous value. Looking at Table B, one can easily infer the missing times and values, but how do I put that into a query?
Data in A.time contains every minute and a corresponding A.Value.
A.Time...........A.Value
9:00...............3.4
9:01...............5.0
9:02...............5.3
9:03...............5.3
9:04...............5.3
and so on…..
Table B only contains rows where the B.value has changed from the previous value.
B.Time..............B.Value
9:00...................4
9:01...................4.1
This is blank, but I know it to be 9:02 / 4.1
This is blank, but I know it to be 9:03 / 4.1
9:04....................4.7
and so on…
I need a query that links A.Time and B.Value, but the query must understand that a time missing from Table B should take the B.Value of the closest preceding B.Time.
Final table should be
A.Time...............B.Value
9:00...................4
9:01...................4.1
9:02...................4.1
9:03...................4.1
9:04...................4.7
I am currently writing this for SQL Server, but I need an Oracle solution too
Thanks in advance;

In Oracle, you can LEFT JOIN to get all the times and then use LAST_VALUE(b.value) IGNORE NULLS... to fill in the blanks. (NOTE: with unique times, the ROWS BETWEEN... part gives the same result as the default window frame you get from the ORDER BY in the OVER() clause, but I like it for extra clarity.)
Like this:
SELECT a.time,
LAST_VALUE (b.VALUE)
IGNORE NULLS
OVER (PARTITION BY NULL
ORDER BY a.time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM table_a a
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;
Here is a full example with test data:
with table_a ( time, value ) as
( SELECT '9:00', 3.4 FROM DUAL UNION ALL
SELECT '9:01', 5.0 FROM DUAL UNION ALL
SELECT '9:02', 5.3 FROM DUAL UNION ALL
SELECT '9:03', 5.3 FROM DUAL UNION ALL
SELECT '9:04', 5.3 FROM DUAL ),
table_b ( time, value ) as
( SELECT '9:00', 4 FROM DUAL UNION ALL
SELECT '9:01', 4.1 FROM DUAL UNION ALL
SELECT '9:04', 4.7 FROM DUAL )
SELECT a.time,
LAST_VALUE (b.VALUE)
IGNORE NULLS
OVER (PARTITION BY NULL
ORDER BY a.time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM table_a a
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;
An alternative (which might work on SQL Server) is to use OUTER APPLY. Like so:
SELECT a.time, b.value
FROM table_a a
OUTER APPLY ( SELECT *
FROM table_b b
WHERE b.time <= a.time
ORDER BY b.time desc
FETCH FIRST 1 ROW ONLY ) b
ORDER BY a.time;
Basically, this finds the most recent non-null value from table B for each row in table A.
SQL Server solution
Here is the OUTER APPLY syntax translated to SQL Server:
with table_a ( time, value ) as
( SELECT '9:00', 3.4 UNION ALL
SELECT '9:01', 5.0 UNION ALL
SELECT '9:02', 5.3 UNION ALL
SELECT '9:03', 5.3 UNION ALL
SELECT '9:04', 5.3 ),
table_b ( time, value ) as
( SELECT '9:00', 4 UNION ALL
SELECT '9:01', 4.1 UNION ALL
SELECT '9:04', 4.7 )
SELECT a.time, b.value
FROM table_a a OUTER APPLY (
SELECT * FROM table_b b
WHERE b.time <= a.time
ORDER BY b.time desc
OFFSET 0 ROWS
FETCH NEXT 1 ROWS ONLY ) b
ORDER BY a.time;
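For experimenting with the "most recent value at or before this time" lookup without an Oracle or SQL Server instance, here is a sketch that assumes Python's built-in sqlite3 module. A correlated subquery with LIMIT 1 takes the place of OUTER APPLY. One deliberate change from the sample data: the times are zero-padded ('09:00' rather than '9:00'), because string comparison would otherwise sort '10:00' before '9:00'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (time TEXT, value REAL);
    CREATE TABLE table_b (time TEXT, value REAL);
    INSERT INTO table_a VALUES
        ('09:00', 3.4), ('09:01', 5.0), ('09:02', 5.3),
        ('09:03', 5.3), ('09:04', 5.3);
    INSERT INTO table_b VALUES
        ('09:00', 4.0), ('09:01', 4.1), ('09:04', 4.7);
""")

# For each row of table_a, pick the latest table_b row whose time is
# not after a.time (the last observation carried forward).
filled = conn.execute("""
    SELECT a.time,
           (SELECT b.value
            FROM table_b b
            WHERE b.time <= a.time
            ORDER BY b.time DESC
            LIMIT 1) AS b_value
    FROM table_a a
    ORDER BY a.time
""").fetchall()

for row in filled:
    print(row)
```

In a real schema the time column would normally be a proper date or time type, which avoids the padding concern entirely.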

T-SQL Full outer join on two subqueries or arbitrary static values?

I'm trying to basically combine the columns from two outputs into one row.
Here's one example:
SELECT * FROM (SELECT 'Today' AS Txt) t1
FULL OUTER JOIN (SELECT * FROM (SELECT GETDATE() AS D) t2)
-- desired result is one row with a 'Txt' column with value 'Today' and a 'D' column with the result of the GETDATE function
And another:
SELECT * FROM (SELECT * FROM dbo.myTableFunc()) t1 -- returns 5 rows
FULL OUTER JOIN (SELECT * FROM (SELECT * FROM dbo.myOtherTableFunc())) t2 -- also returns 5 rows
The thing I cannot figure out how to do is to do the "outer join" on the two subqueries. In the first example, I'm basically trying to combine the result of two scalars into a single row result. In the second I'm trying to take two tables, each with five rows, and combine their columns, without any relationship between the data in the two tables.
I'm trying to do the above in a UDF and also in a view, so anything that involves creating temporary tables will not work.
In both of the above cases I get syntax errors around the closing ) signs in the outer join.

You're just missing the join conditions. In the first example, your join condition is "always", or 1 = 1:
SELECT * FROM (SELECT 'Today' AS Txt) t1
FULL OUTER JOIN (SELECT * FROM (SELECT GETDATE() AS D) t2) t2 on 1=1
In the second example you don't want any relationship between the rows in each data set - well, if you want to join them then there needs to be SOME relationship, even if it's spurious. Using a row number like this would work (assumes you have a unique column called Id in both tables):
select * from (
select row_number() over (order by Id asc) rn, * from dbo.myTableFunc()
) t1
full join (
select row_number() over (order by Id asc) rn, * from dbo.myOtherTableFunc()
) t2 on t1.rn=t2.rn
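The difference between the two join conditions shows up clearly on small data. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or later) rather than SQL Server, and the tables f1 and f2 are hypothetical stand-ins for the two table functions. Joining on 1 = 1 is fine for two single-row inputs, but with multi-row inputs it returns every combination, while joining on the row number pairs the rows by position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f1 (Id INTEGER, x TEXT);
    CREATE TABLE f2 (Id INTEGER, y TEXT);
    INSERT INTO f1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO f2 VALUES (7, 'p'), (8, 'q'), (9, 'r');
""")

# Two scalars combined into a single row: 1 row x 1 row = 1 row.
one_row = conn.execute("""
    SELECT t1.Txt, t2.D
    FROM (SELECT 'Today' AS Txt) t1
    JOIN (SELECT date('now') AS D) t2 ON 1 = 1
""").fetchall()

# The same condition on 3-row inputs yields all 9 combinations.
every_pair = conn.execute(
    "SELECT x, y FROM f1 JOIN f2 ON 1 = 1"
).fetchall()

# Joining on a generated row number pairs the rows one-to-one.
by_position = conn.execute("""
    SELECT t1.x, t2.y
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY Id) AS rn, x FROM f1) t1
    JOIN (SELECT ROW_NUMBER() OVER (ORDER BY Id) AS rn, y FROM f2) t2
      ON t1.rn = t2.rn
    ORDER BY t1.rn
""").fetchall()

print(len(one_row), len(every_pair), by_position)
```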

need empty row in output between rows with data Report Builder 3.0 T-SQL

I have reports that pull 50 random records. I would like to insert a blank row in the output between each row of data. For example, rows 1, 3, 5, 7... are populated with data, and even-numbered rows are empty.
Thanks

A CROSS APPLY against a one-row table of NULLs yields as many rows as the original table, each filled with NULLs.
We can then generate a ROW_NUMBER for both SELECTs and sort the combined result by that number, so that data rows and blank rows alternate.
select C.id, C.name, ROW_NUMBER() OVER ( ORDER BY C.id) as seq from TableC C
UNION ALL
SELECT T.id, T.name, seq as seq
FROM
(
select T.id, T.name ,ROW_NUMBER() OVER ( ORDER by C.id ) as seq from TableC C
cross apply ( select NULL as id,NULL as name ) T
) T
ORDER BY seq
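Here is a runnable variation on the interleaving idea, sketched with Python's built-in sqlite3 module (SQLite 3.25 or later) and hypothetical sample rows. It makes one addition that is not in the query above: an is_blank column used as a second sort key, since sorting on seq alone leaves the relative order of a data row and its blank row undefined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableC (id INTEGER, name TEXT);
    INSERT INTO TableC VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

# First branch: the real rows. Second branch: one all-NULL row per real row.
# Both branches share the same seq; is_blank breaks the tie so that the
# data row always sorts before its blank partner.
rows = conn.execute("""
    SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS seq, 0 AS is_blank
    FROM TableC
    UNION ALL
    SELECT NULL, NULL, ROW_NUMBER() OVER (ORDER BY id) AS seq, 1 AS is_blank
    FROM TableC
    ORDER BY seq, is_blank
""").fetchall()

for row in rows:
    print(row)
```

The helper columns seq and is_blank only drive the sort; the report itself can simply leave them out of the displayed fields.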

A very simple solution could be:
Select id+Temp id, name+Temp name from TableA A
Cross Apply
(select '' as 'Temp'
union all
Select null) X
If you have a large number of columns, build the column_name + Temp concatenations dynamically and run them with dynamic SQL.
