I have JSON that starts with a top-level array, and I can't manage to query it. The JSON is in this format:
[
{"#id":1,
"field1":"qwerty",
"#field2":{"name":"my_name", "name2":"my_name_2"},
"field3":{"event":[{"event_type":"OP",...}]}
},
{"#id":2 ...
}
]
Any suggestions on how to query this?
If I try to use LATERAL FLATTEN, I don't know what key to use:
select
'???'.Value:#id::string as id
from tabl1
,lateral flatten (tabl1_GB_RECORD:???) as gb_record
Your SQL was close but not complete; the following will give you the #id values:
with tbl1 (v) as (
select parse_json('
[
{"#id":1,
"field1":"qwerty",
"#field2":{"name":"my_name", "name2":"my_name_2"},
"field3":{"event":[{"event_type":"OP"}]}
},
{"#id":2
}
]')
)
select t.value:"#id" id from tbl1
, lateral flatten (input => v) as t
Result:
id
___
1
2
Let me know if you have any other questions.
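Conceptually, FLATTEN over a top-level array just iterates its elements. A minimal Python sketch of the same extraction (illustrative only, not Snowflake):

```python
import json

# Parse the top-level JSON array and read "#id" from each element,
# mirroring LATERAL FLATTEN + value:"#id" in the query above.
doc = json.loads('[{"#id": 1, "field1": "qwerty"}, {"#id": 2}]')
ids = [row["#id"] for row in doc]
print(ids)  # [1, 2]
```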
When the JSON begins with a top-level array, you pass the whole column as the FLATTEN input and then pull fields out of each element. Something along these lines:
WITH x AS (
SELECT parse_json('[
{"#id":1,
"field1":"qwerty",
"#field2":{"name":"my_name", "name2":"my_name_2"},
"field3":{"event":[{"event_type":"OP"}]}
},
{"#id":2,
"field1":"qwerty",
"#field2":{"name":"my_name", "name2":"my_name_2"},
"field3":{"event":[{"event_type":"OP"}]}
}
]') as json_data
)
SELECT y.value,
y.value:"#id"::number as id,
y.value:field1::string as field1,
y.value:"#field2"::variant as field2,
y.value:field3::variant as field3,
y.value:"#field2":name::varchar as name
FROM x,
LATERAL FLATTEN (input=>json_data) y;
Related
I'm trying to create a MERGE statement where I keep all the rows in my FINAL_TABLE whose DATE column is before today's date,
and insert new rows from today's date onward from my LANDING_TABLE.
The working example with a DELETE and INSERT statement can be seen here:
DELETE FROM FINAL_TABLE
WHERE "DATE" >= CURRENT_DATE();
INSERT INTO FINAL_TABLE
SELECT X, Y.value :: string AS Y_SPLIT, "DATE", "PUBLIC"
FROM LANDING_TABLE, LATERAL FLATTEN (INPUT => STRTOK_TO_ARRAY(LANDING_TABLE.column, ', '), OUTER => TRUE) y
WHERE "PUBLIC" ILIKE 'TRUE' AND "DATE" >= CURRENT_DATE();
I'd like to keep the FLATTEN statement and the WHERE conditions while having the whole statement in a single MERGE statement.
Is it possible or should I first create a temporary table with the values I want to insert and then use that in the merge statement?
The MERGE statement can use a subquery or CTE as its source:
MERGE INTO <target_table> USING <source>
ON <join_expr> { matchedClause | notMatchedClause } [ ... ]
source:
Specifies the table or subquery to join with the target table.
MERGE INTO FINAL_TABLE
USING (
SELECT X, Y.value :: string AS Y_SPLIT, "DATE" AS col1, "PUBLIC" AS col2
FROM LANDING_TABLE
,LATERAL FLATTEN(INPUT=>STRTOK_TO_ARRAY(LANDING_TABLE.column, ', '), OUTER=>TRUE) y
WHERE "PUBLIC" ILIKE 'TRUE' AND "DATE" >= CURRENT_DATE()
) AS SRC
ON ...
WHEN ...;
I'm creating a new table (my_new_table) from another table (my_existing_table) that has 4 columns; the PRODUCT and MONTHLY_BUDGETS columns have nested values that I'm trying to extract.
The PRODUCT column is a dictionary like this:
{"name": "Display", "full_name": "Ad Bundle"}
The MONTHLY_BUDGETS column is a list of several dictionaries; it looks like this:
[{"id": 123, "quantity_booked": "23", "budget_booked": "0.0", "budget_booked_loc": "0.0"} ,
{"id": 234, "quantity_booked": "34", "budget_booked": "0.0", "budget_booked_loc": "0.0"},
{"id": 455, "quantity_booked": "44", "budget_booked": "0.0", "budget_booked_loc": "0.0"}]
Below is what I'm doing to create the new table and unnest the columns from the other table:
CREATE OR REPLACE TABLE my_new_table as (
with og_table as (
select
id,
parse_json(product) as PRODUCT,
IO_NAME,
parse_json(MONTHLY_BUDGETS) as MONTHLY_BUDGETS
from my_existing_table
)
select
id,
PRODUCT:name::string as product_name,
PRODUCT:full_name::string as product_full_name,
IO_NAME,
MONTHLY_BUDGETS:id::integer as monthly_budgets_id,
MONTHLY_BUDGETS:quantity_booked::float as monthly_budgets_quantity_booked,
MONTHLY_BUDGETS:budget_booked_loc::float as monthly_budgets_budget_booked_loc
from og_table,
lateral flatten( input => PRODUCT) as PRODUCT,
lateral flatten( input => MONTHLY_BUDGETS) as MONTHLY_BUDGETS);
However, once my new table is created and I run this:
select distinct id, count(*)
from my_new_table
where id = '123'
group by 1;
I see 18 under the count(*) column when I should only have 1, so it looks like there are a lot of duplicates. Why, and how do I prevent this?
LATERAL FLATTEN produces a CROSS JOIN between the input row and the flatten results.
So if we have this data
Id, Array
1, [10,20,30]
2, [40,50,60]
and you do a flatten on Array, via something like:
SELECT d.id,
d.array,
f.value as val
FROM data d,
LATERAL FLATTEN(input => d.array) f
you get:
Id, Array, val
1, [10,20,30], 10
1, [10,20,30], 20
1, [10,20,30], 30
2, [40,50,60], 40
2, [40,50,60], 50
2, [40,50,60], 60
For your case, given you are doing two flattens, each ID will appear in many duplicate rows.
Just like above, if I ran SELECT id, count(*) FROM output GROUP BY 1 on my output, I would get (1, 3) and (2, 3).
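The multiplication can be sketched in Python; the names here are hypothetical, but the shape mirrors the question (an object with 2 keys and an array with 3 elements):

```python
from itertools import product

# Flattening an object yields one row per key; flattening an array
# yields one row per element. Two LATERAL FLATTENs multiply like a
# cross join, so one source row fans out into 2 x 3 = 6 rows.
product_obj = {"name": "Display", "full_name": "Ad Bundle"}
monthly_budgets = [{"id": 123}, {"id": 234}, {"id": 455}]

rows = list(product(product_obj.items(), monthly_budgets))
print(len(rows))  # 6
```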
I'm fairly new to working with JSON and to our newly upgraded Oracle 19c DB.
I'm receiving a JSON array back from an api and storing it in an Oracle 19c table column with IS JSON (STRICT) constraint.
[ {"key1":"valueA", "key2":"valueB"}, {"key1":"valueC", "key2":"valueD"} ]
I need to select values in column form:
KEY1 KEY2
valueA valueB
valueC valueD
This returns one row with null columns.
Select jt.*
From json_data,
json_table(myData, '$.[*]'
columns( key1, key2)) jt;
I can't seem to make the Oracle functions (json_table, json_query, json_value, ...) handle this without wrapping the array in an object.
{ "base":[ {"key1":"valueA", "key2":"valueB"}, {"key1":"valueC", "key2":"valueD"} ] }
Then this query works:
Select jt.*
From json_data,
json_table(myData, '$.base[*]'
columns( key1, key2)) jt;
Is there a shortcoming with the Oracle functions or what am I doing wrong?
Use '$[*]' (without the dot) as the path expression:
Select jt.*
From json_data,
json_table(myData, '$[*]'
columns( key1, key2)) jt;
Full test case with results:
with json_data(myData) as (
select '[ {"key1":"valueA", "key2":"valueB"}, {"key1":"valueC", "key2":"valueD"} ]' from dual
)
Select jt.*
From json_data,
json_table(myData, '$[*]'
columns( key1, key2)) jt;
KEY1 KEY2
-------------------- --------------------
valueA valueB
valueC valueD
You want $[*], not $.[*]:
SELECT jt.*
FROM json_data
CROSS APPLY json_table(
myData,
'$[*]'
columns(
key1,
key2
)
) jt;
Which for the sample data:
CREATE TABLE json_data ( myData VARCHAR2(2000) CHECK( myData IS JSON(STRICT) ) );
INSERT INTO json_data ( myData )
VALUES ( '[ {"key1":"valueA", "key2":"valueB"}, {"key1":"valueC", "key2":"valueD"} ]' );
Outputs:
KEY1                 KEY2
-------------------- --------------------
valueA               valueB
valueC               valueD
Table structure is
+------------+---------+
| Animals | Herbs |
+------------+---------+
| [Cat, Dog] | [Basil] |
| [Dog, Lion]| [] |
+------------+---------+
Desired output (don't care about sorting of this list):
unique_things
+------------+
[Cat, Dog, Lion, Basil]
My first attempt was something like:
SELECT ARRAY_CAT(ARRAY_AGG(DISTINCT(animals)), ARRAY_AGG(herbs))
But this produces
[[Cat, Dog], [Dog, Lion], [Basil], []]
since the DISTINCT operates on each array as a whole, not on the distinct elements across all the arrays.
If I understand your requirements right, and assuming a source table populated like this:
insert into tabarray select array_construct('cat', 'dog'), array_construct('basil');
insert into tabarray select array_construct('lion', 'dog'), null;
I would say the query would look like this:
select array_agg(distinct value) from
(
select
value from tabarray
, lateral flatten( input => col1 )
union all
select
value from tabarray
, lateral flatten( input => col2 ))
;
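For intuition, here is the flatten-then-aggregate idea sketched in Python (illustrative only):

```python
# Two source rows, mirroring the Animals/Herbs arrays above.
rows = [(["Cat", "Dog"], ["Basil"]), (["Dog", "Lion"], [])]

# DISTINCT over whole arrays compares each array as a unit...
distinct_arrays = {tuple(arr) for row in rows for arr in row}
# ...while flattening first makes DISTINCT see individual elements.
distinct_elems = {x for row in rows for arr in row for x in arr}
print(sorted(distinct_elems))  # ['Basil', 'Cat', 'Dog', 'Lion']
```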
UPDATE
It is possible without using FLATTEN, by using ARRAY_UNION_AGG:
Returns an ARRAY that contains the union of the distinct values from the input ARRAYs in a column.
For sample data:
CREATE OR REPLACE TABLE t AS
SELECT ['Cat', 'Dog'] AS Animals, ['Basil'] AS Herbs
UNION SELECT ['Dog', 'Lion'], [];
Query:
SELECT ARRAY_UNION_AGG(ARRAY_CAT(Animals, Herbs)) AS Result
FROM t
or:
SELECT ARRAY_UNION_AGG(Animals) AS Result
FROM (SELECT Animals FROM t
UNION ALL
SELECT Herbs FROM t);
You could flatten the combined array and then aggregate back:
SELECT ARRAY_AGG(DISTINCT F."VALUE") AS unique_things
FROM tab, TABLE(FLATTEN(ARRAY_CAT(tab.Animals, tab.Herbs))) f
Here is another variation that handles NULLs in case they appear in the data set:
SELECT ARRAY_AGG(DISTINCT a.VALUE) unique_things
FROM tab, TABLE (FLATTEN(array_compact(array_append(tab.Animals, tab.Herbs)))) a
We are trying to use FOR JSON PATH in SQL Server 2016 to form a nested array from a SQL query.
SQL Query:
SELECT A,
B.name as [child.name],
B.date as [child.date]
from Table 1 join Table 2 on Table 1.ID=Table 2.ID FOR JSON PATH
Desired Output:
[{
   "A": "text",
   "child": [
      {"name": "value", "date": "value"},
      {"name": "value", "date": "value"}
   ]
}]
However, what we are getting is:
[{
   "A": "text",
   "child": {"name": "value", "date": "value"}
},
{
   "A": "text",
   "child": {"name": "value", "date": "value"}
}]
How can we use FOR JSON PATH to form a nested child array?
Instead of a join, use a nested query, e.g.:
SELECT A
, child=(
SELECT B.name as [child.name]
, B.date as [child.date]
FROM Table 2
WHERE Table 2.ID = Table 1.ID
FOR JSON PATH
)
from Table 1 FOR JSON PATH
(The query in the question is broken, so this query is just as broken, but it should give you the idea.)
Assuming this schema:
create table parent(id int primary key, name varchar(100));
create table child(id int primary key, name varchar(100), parent_id int references parent(id));
Here is a working solution, albeit more convoluted, that doesn't involve correlated subqueries and only uses FOR JSON PATH:
SELECT
    parent.name AS [name],
    child.json_agg AS [children]
FROM parent
JOIN (
    SELECT
        child.parent_id,
        JSON_QUERY(CONCAT('[', STRING_AGG(child.json, ','), ']')) AS json_agg
    FROM (
        SELECT
            child.parent_id,
            (SELECT
                child.name AS [name]
             FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
            ) AS json
        FROM child
    ) AS child
    GROUP BY child.parent_id
) AS child
    ON child.parent_id = parent.id
FOR JSON PATH
If you have an index on child.parent_id, then using a correlated subquery as suggested, or the equivalent with CROSS/OUTER APPLY might be more efficient:
SELECT
parent.name AS [name],
child.json AS [children]
FROM parent
OUTER APPLY (
SELECT
name AS [name]
FROM child
WHERE child.parent_id = parent.id
FOR JSON PATH
) child(json)
FOR JSON PATH
Both queries will return:
[
{
"name": "foo",
"children": [
{ "name": "bar" },
{ "name": "baz" }
]
}
]
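Outside SQL Server, the grouping that FOR JSON PATH performs here can be sketched in Python (hypothetical rows, mirroring the parent/child schema above):

```python
import json
from collections import defaultdict

parents = [(1, "foo")]                     # (id, name)
children = [(1, "bar", 1), (2, "baz", 1)]  # (id, name, parent_id)

# Group child rows under their parent, then emit one object per parent.
by_parent = defaultdict(list)
for _, name, pid in children:
    by_parent[pid].append({"name": name})

result = [{"name": pname, "children": by_parent[pid]} for pid, pname in parents]
print(json.dumps(result))
```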