How to extract elements of a JSONB array?

Running PostgreSQL v10.5.
My table table_a has a column metadata of type jsonb. One of its keys, array_key, holds a JSON array with a value like this:
[{"key1":"value11", "key2":"value21", "key3":"value31"},
{"key1":"value21", "key2":"value22", "key3":"value23"}]
This is how I can query that key:
SELECT metadata->>'array_key' from table_a
This gives me the entire array. Is there any way I can query only selected keys and perhaps format them?
The type of the array is text, i.e. pg_typeof(metadata->>'array_key') is text.
An ideal output would be:
"value11, value31", "value21, value23"

Use jsonb_array_elements() to get elements of the array (as value) which can be filtered by keys:
select value->>'key1' as key1, value->>'key3' as key3
from table_a
cross join jsonb_array_elements(metadata->'array_key');
key1 | key3
---------+---------
value11 | value31
value21 | value23
(2 rows)
Use an aggregate to get the output as a single value for each row, e.g.:
select string_agg(concat_ws(', ', value->>'key1', value->>'key3'), '; ')
from table_a
cross join jsonb_array_elements(metadata->'array_key')
group by id;
string_agg
------------------------------------
value11, value31; value21, value23
(1 row)
Working example in rextester.
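The same extract-and-aggregate pipeline can be sketched in plain Python with the standard json module (the metadata value below is a made-up stand-in for one row of table_a, not real data):

```python
import json

# Stand-in for one row's jsonb "metadata" column, serialized as JSON text.
metadata = json.dumps({
    "array_key": [
        {"key1": "value11", "key2": "value21", "key3": "value31"},
        {"key1": "value21", "key2": "value22", "key3": "value23"},
    ]
})

# Mirror jsonb_array_elements(metadata->'array_key'):
elements = json.loads(metadata)["array_key"]

# Mirror concat_ws(', ', ...) per element, then string_agg(..., '; '):
aggregated = "; ".join(", ".join([e["key1"], e["key3"]]) for e in elements)
print(aggregated)  # value11, value31; value21, value23
```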

Related

Preserve order while converting string array into int array in hive

I'm trying to convert a string array to an int array while keeping the original order.
Here is a sample of what my data looks like:
id attribut string_array
id1 attribut1, 10283:990000 ["10283","990000"]
id2 attribut2, 10283:36741000 ["10283","36741000"]
id3 attribut3, 10283:37871000 ["10283","37871000"]
id4 attribut4, 3215:90451000 ["3215","90451000"]
and here's how I convert the field "string_array" into an array of integers:
select
    id,
    attribut,
    string_array,
    collect_list(cast(array_explode as int)) as int_array
from table
lateral view outer explode(string_array) r as array_explode
group by id, attribut, string_array
it gives me:
id attribut string_array int_array
id1 attribut1,10283:990000 ["10283","990000"] [990000,10283]
id2 attribut2,10283:36741000 ["10283","36741000"] [10283,36741000]
id3 attribut3,10283:37871000 ["10283","37871000"] [37871000,10283]
id4 attribut4,3215:90451000 ["3215","90451000"] [90451000,3215]
As you can see, the order in "string_array" has not been preserved in "int_array", and I need it to be exactly the same as in "string_array".
Does anyone know how to achieve this?
Any help would be much appreciated.
For Hive: use posexplode, and in a subquery before collect_list do distribute by id sort by position:
select
    id,
    attribut,
    string_array,
    collect_list(cast(element as int)) as int_array
from
(select *
   from table t
        lateral view outer posexplode(string_array) e as pos, element
 distribute by t.id, attribut, string_array -- distribute by group key
 sort by pos                                -- sort by initial position
) t
group by id, attribut, string_array
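The idea behind the posexplode fix can be illustrated with a small Python sketch: each element is paired with its position, and sorting by that position before re-collecting restores the original order (the shuffle step merely simulates Hive returning exploded rows in arbitrary order):

```python
string_array = ["10283", "990000"]

# posexplode: pair each element with its index.
exploded = list(enumerate(string_array))

# Simulate rows arriving in arbitrary order after the shuffle.
shuffled = sorted(exploded, reverse=True)

# sort by pos, then collect_list(cast(... as int)):
int_array = [int(elem) for pos, elem in sorted(shuffled)]
print(int_array)  # [10283, 990000]
```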
Another way is to extract a substring from your attribute and split it without exploding (as you asked in the comment):
select split(regexp_extract(attribut, '[^,]+,(.*)$',1),':')
Regexp '[^,]+,(.*)$' means:
[^,]+ - not a comma 1+ times
, - comma
(.*)$ - everything else after the comma, in capturing group 1, till the end of the string
Demo:
select split(regexp_extract('attribut3,10283:37871000', '[^,]+,(.*)$',1),':')
Result:
["10283","37871000"]
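The same regexp can be checked with Python's re module (re.search with group 1 stands in for Hive's regexp_extract):

```python
import re

attribut = "attribut3,10283:37871000"

# '[^,]+,(.*)$': skip everything up to the first comma,
# capture the rest, then split on ':'.
m = re.search(r"[^,]+,(.*)$", attribut)
result = m.group(1).split(":")
print(result)  # ['10283', '37871000']
```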

Sort the result according to the ARRAY elements?

I have the following query :
SELECT id,word FROM map
WHERE id::integer in (SELECT unnest(ary) FROM abc WHERE id = 11)
the problem is that the result comes back in random order.
What I want is for the result to come in the order defined by the content of the array "ary".
How do I do that?
I would unnest first, keeping each element's position with WITH ORDINALITY, and join the other table on the id column, ordering by that position (a plain join does not guarantee that the unnest order survives):
SELECT
    id,
    word
FROM (
    SELECT
        u.elem AS id,
        u.ord
    FROM
        abc,
        unnest(ary) WITH ORDINALITY AS u(elem, ord)
    WHERE
        abc.id = 11
) a JOIN map
USING
    (id)
ORDER BY
    a.ord
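The order-preserving lookup can be illustrated in Python (ary and word_map are hypothetical stand-ins for abc.ary and the map table): iterating over the unnested array, rather than over the lookup table, is what keeps the array's order in the result:

```python
# Hypothetical stand-ins for abc.ary (where abc.id = 11) and the map table.
ary = [30, 10, 20]
word_map = {10: "alpha", 20: "beta", 30: "gamma"}

# Walk the array in its stored order and look each id up in the map.
result = [(i, word_map[i]) for i in ary if i in word_map]
print(result)  # [(30, 'gamma'), (10, 'alpha'), (20, 'beta')]
```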

Flatten and aggregate two columns of arrays via distinct in Snowflake

Table structure is
+------------+---------+
| Animals | Herbs |
+------------+---------+
| [Cat, Dog] | [Basil] |
| [Dog, Lion]| [] |
+------------+---------+
Desired output (don't care about sorting of this list):
unique_things
+------------+
[Cat, Dog, Lion, Basil]
First attempt was something like
SELECT ARRAY_CAT(ARRAY_AGG(DISTINCT(animals)), ARRAY_AGG(herbs))
But this produces
[[Cat, Dog], [Dog, Lion], [Basil], []]
since the DISTINCT operates on each array as a whole, not on the distinct elements across all arrays.
If I understand your requirements right, and assuming source data like
insert into tabarray select array_construct('cat', 'dog'), array_construct('basil');
insert into tabarray select array_construct('lion', 'dog'), null;
the query would look like this:
select array_agg(distinct value)
from (
    select value
    from tabarray,
         lateral flatten( input => col1 )
    union all
    select value
    from tabarray,
         lateral flatten( input => col2 )
);
UPDATE
It is possible without using FLATTEN, by using ARRAY_UNION_AGG:
Returns an ARRAY that contains the union of the distinct values from the input ARRAYs in a column.
For sample data:
CREATE OR REPLACE TABLE t AS
SELECT ['Cat', 'Dog'] AS Animals, ['Basil'] AS Herbs
UNION SELECT ['Dog', 'Lion'], [];
Query:
SELECT ARRAY_UNION_AGG(ARRAY_CAT(Animals, Herbs)) AS Result
FROM t
or:
SELECT ARRAY_UNION_AGG(Animals) AS Result
FROM (SELECT Animals FROM t
UNION ALL
SELECT Herbs FROM t);
You could flatten the combined array and then aggregate back:
SELECT ARRAY_AGG(DISTINCT F."VALUE") AS unique_things
FROM tab, TABLE(FLATTEN(ARRAY_CAT(tab.Animals, tab.Herbs))) f
Here is another variation to handle NULLs in case they appear in data set.
SELECT ARRAY_AGG(DISTINCT a.VALUE) AS unique_things
FROM tab,
     TABLE(FLATTEN(array_compact(array_append(tab.Animals, tab.Herbs)))) a
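As an illustrative cross-check, the flatten-then-deduplicate logic shared by these queries can be sketched in Python (the rows list is a stand-in for the Snowflake table; first-seen order is kept only for readability, since ARRAY_AGG(DISTINCT ...) makes no ordering promise):

```python
rows = [
    {"Animals": ["Cat", "Dog"], "Herbs": ["Basil"]},
    {"Animals": ["Dog", "Lion"], "Herbs": []},
]

# Flatten both array columns across all rows, then deduplicate,
# keeping first-seen order via dict insertion order.
seen = {}
for row in rows:
    for value in row["Animals"] + row["Herbs"]:
        seen.setdefault(value, True)
unique_things = list(seen)
print(unique_things)  # ['Cat', 'Dog', 'Basil', 'Lion']
```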

Extraction all values between special characters SQL

I have the following values in the contractTypes column of a SQL Server table, and I need to build a query whose output lists each contract type on its own row next to its offerId.
I know that I should probably use a combination of SUBSTRING and CHARINDEX, but I have no idea how to do it.
Could you please help me with how the query should look?
Thank you!
Try the following, it may work.
SELECT
offerId,
cTypes
FROM yourTable AS mt
CROSS APPLY
EXPLODE(mt.contractTypes) AS dp(cTypes);
You can use the string_split function:
select t.offerid, trim(translate(tt.value, '[]"', ' ')) as contractTypes
from table t cross apply
string_split(t.contractTypes, ',') tt(value);
The data in each row in the contractTypes column is a valid JSON array, so you may use OPENJSON() with explicit schema (result is a table with columns defined in the WITH clause) to parse this array and get the expected results:
Table:
CREATE TABLE Data (
offerId int,
contractTypes varchar(1000)
)
INSERT INTO Data
(offerId, contractTypes)
VALUES
(1, '[ "Hlavni pracovni pomer" ]'),
(2, '[ "ÖCVS", "Staz", "Prahovne" ]')
Query:
SELECT d.offerId, j.contractTypes
FROM Data d
OUTER APPLY OPENJSON(d.contractTypes) WITH (contractTypes varchar(100) '$') j
Result:
offerId contractTypes
1 Hlavni pracovni pomer
2 ÖCVS
2 Staz
2 Prahovne
As an additional option, if you want to return the position of the contract type in the contractTypes array, you may use OPENJSON() with default schema (result is a table with columns key, value and type and the value in the key column is the 0-based index of the element in the array):
SELECT
d.offerId,
CONVERT(int, j.[key]) + 1 AS contractId,
j.[value] AS contractType
FROM Data d
OUTER APPLY OPENJSON(d.contractTypes) j
ORDER BY CONVERT(int, j.[key])
Result:
offerId contractId contractType
1 1 Hlavni pracovni pomer
2 1 ÖCVS
2 2 Staz
2 3 Prahovne
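The OPENJSON behaviour used here, where each array element comes back with its 0-based key, can be mirrored with Python's json module and enumerate (the rows list repeats the sample data from the INSERT above):

```python
import json

# (offerId, contractTypes) pairs from the sample data.
rows = [
    (1, '[ "Hlavni pracovni pomer" ]'),
    (2, '[ "ÖCVS", "Staz", "Prahovne" ]'),
]

# enumerate yields the 0-based element index (OPENJSON's "key" column),
# shifted to a 1-based contractId as in the query above.
result = [
    (offer_id, idx + 1, contract_type)
    for offer_id, contract_types in rows
    for idx, contract_type in enumerate(json.loads(contract_types))
]
print(result)
```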

How to get a list of only keys from Hive map

I have a map stored in a column in Hive, where the keys in each row can differ. How can I get a list of only the keys from each map?
Function map_keys(Map) returns an unordered array containing the keys of the input map.
Example, see comments in the code:
with mydata as (
select 1 id, map('key11','val11','key12','val12','key13','val13') as mymap
union all
select 2 id, map('key21','val21','key22','val22','key13','val13') as mymap --Key13 also exist in first row
)
select id, map_keys(d.mymap) keys
from mydata d
;
Result:
id keys
1 ["key11","key12","key13"]
2 ["key21","key22","key13"]
If you need a list of the unique keys across all rows, explode the array and collect it again using collect_set; it will return an array of distinct keys:
with mydata as (
select 1 id, map('key11','val11','key12','val12','key13','val13') as mymap
union all
select 2 id, map('key21','val21','key22','val22','key13','val13') as mymap --Key13 also exist in first row
)
select --id,
collect_set(key) as keys
from mydata d
lateral view outer explode(map_keys(d.mymap)) e as key
--group by id --without id in groupby you get the distinct list of keys in all rows
--with id in groupby you get list of map keys for each row
;
Result:
["key11","key12","key13","key21","key22"]
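Both variants, keys per row and the distinct set across rows, can be mirrored in Python with plain dicts (the rows list reproduces the two maps from the example):

```python
rows = [
    {"key11": "val11", "key12": "val12", "key13": "val13"},
    {"key21": "val21", "key22": "val22", "key13": "val13"},
]

# map_keys(mymap) per row:
per_row_keys = [list(m) for m in rows]

# explode + collect_set: distinct keys across all rows
# (dict.fromkeys deduplicates while keeping first-seen order).
all_keys = list(dict.fromkeys(k for m in rows for k in m))
print(per_row_keys)  # [['key11', 'key12', 'key13'], ['key21', 'key22', 'key13']]
print(all_keys)      # ['key11', 'key12', 'key13', 'key21', 'key22']
```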
