Postgres: Need to select keywords as separate array values

Datatype:
id: int4
keywords: text
objectivable_id: int4
Postgres version: PostgreSQL 9.5.3
Business_objectives table:
id keywords objectivable_id
1 keyword1a,keyword1b,keyword1c 6
2 keyword2a 6
3 testing 5
Currently the query I'm using is:
select array(select b.keywords from business_objectives b where b.objectivable_id = 6)
It selects the keywords of matched objectivable_id as:
{"keyword1a,keyword1b,keyword1c","keyword2a"}
What I want the result to be is:
{"keyword1a","keyword1b","keyword1c","keyword2a"}
I tried using "string_agg(text, delimiter)", but it just combines all the keywords into one single element of the array.

You can simply (and cheaply!) use:
SELECT string_to_array(string_agg(keywords, ','), ',')
FROM business_objectives
WHERE objectivable_id = 6;
Concatenate your comma-separated lists with string_agg(), and then convert the complete text to an array with string_to_array().
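To see why joining first and splitting afterwards yields one flat array, here is a minimal sketch of the same two steps outside the database. It is written in Python purely for illustration; the rows list is an assumption that mirrors the sample data for objectivable_id = 6.

```python
# Rows as they might come back for objectivable_id = 6 (sample data from the question).
rows = ["keyword1a,keyword1b,keyword1c", "keyword2a"]

# Step 1, like string_agg(keywords, ','): concatenate all rows into one string.
joined = ",".join(rows)

# Step 2, like string_to_array(..., ','): split the combined string into elements.
keywords = joined.split(",")

print(keywords)
```

Because the row separator and the in-row separator are the same character, the row boundaries simply disappear after the join.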

Something like this gives you the expected result:
SELECT array_agg( j.keys )
FROM business_objectives b,
LATERAL ( SELECT k
FROM unnest ( string_to_array( b.keywords, ',' ) ) u( k )
) j( keys )
WHERE b.objectivable_id = 6;
array_agg
-------------------------------------------
{keyword1a,keyword1b,keyword1c,keyword2a}
(1 row)
The LATERAL subquery can reference columns of the preceding FROM item (here b.keywords). It splits each row's keywords into a set of rows, which you can then feed into the array_agg() function.
See more about LATERAL: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-LATERAL
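The split-per-row-then-aggregate idea can be sketched outside SQL as well. The following Python is only an illustration of the logic, not of how Postgres executes it; the rows list and the keywords_for helper are hypothetical names built from the question's sample table.

```python
from itertools import chain

# (id, keywords, objectivable_id), copied from the sample table in the question.
rows = [
    (1, "keyword1a,keyword1b,keyword1c", 6),
    (2, "keyword2a", 6),
    (3, "testing", 5),
]

def keywords_for(objectivable_id, table):
    # For each matching row, split its keywords (the LATERAL unnest step) ...
    per_row = (kw.split(",") for _id, kw, obj in table if obj == objectivable_id)
    # ... then flatten the per-row lists into one list (the array_agg step).
    return list(chain.from_iterable(per_row))

print(keywords_for(6, rows))
```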


Replacing elements in an array column in snowflake

I have sample data as follows:
team_id mode
123 [1,2]
Here mode is an array. The goal is to replace the values in column mode with literal values: 1 stands for Ocean, and 2 stands for Air.
Expected Output
team_id mode
123 [Ocean,Air]
Present Approach
As an attempt, I tried to first flatten the data into multiple rows:
team_id mode
123 1
123 2
Then we can define a new column assigning literal values to mode column using a case statement, followed by aggregating the values into an array to get desired output.
Can I get some help here to do the replacement directly in the array? Thanks in advance.
Using FLATTEN and ARRAY_AGG:
CREATE OR REPLACE TABLE tab(team_id INT, mode ARRAY) AS SELECT 123, [1,2];
SELECT TEAM_ID,
ARRAY_AGG(CASE f.value::TEXT
WHEN 1 THEN 'Ocean'
WHEN 2 THEN 'Air'
ELSE 'Unknown'
END) WITHIN GROUP(ORDER BY f.index) AS new_mode
FROM tab
,LATERAL FLATTEN(tab.mode) AS f
GROUP BY TEAM_ID;
Output:
TEAM_ID NEW_MODE
123 [ "Ocean", "Air" ]
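Conceptually, the flatten / CASE / re-aggregate pipeline is an order-preserving lookup over the array elements. A minimal Python sketch of that mapping follows; MODE_NAMES and replace_modes are hypothetical names, and the 'Unknown' fallback mirrors the ELSE branch above.

```python
# Lookup table standing in for the CASE expression.
MODE_NAMES = {1: "Ocean", 2: "Air"}

def replace_modes(modes):
    # Map each element in order, falling back to 'Unknown' like the ELSE branch.
    return [MODE_NAMES.get(m, "Unknown") for m in modes]

print(replace_modes([1, 2]))
```

Iterating in list order plays the role of WITHIN GROUP (ORDER BY f.index), which keeps the output elements in their original positions.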
As an alternative with easy array manipulation, you could create a JS UDF:
create or replace function replace_vals_in_array(A variant)
returns variant
language javascript
as $$
dict = {1:'a', 2:'b', 3:'c', 4:'d'};
return A.map(x => dict[x]);
$$;
Then to update your table:
update arrs
set arr = replace_vals_in_array(arr);
Example setup:
create or replace temp table arrs as (
select 1 id, [1,2,3] arr
union all select 2, [2,4]
);
select *, replace_vals_in_array(arr)
from arrs;

How to cast a string to array of struct in HiveQL

I have a hive table with the column "periode", the type of the column is string.
The column has values like the following:
[{periode:20160118-20160205,nb:1},{periode:20161130-20161130,nb:1},{periode:20161130-20161221,nb:1}]
[{periode:20161212-20161217,nb:0}]
I want to cast this column to array<struct<periode:string, nb:int>>.
The final goal is to have one row per periode.
For this I want to use lateral view with explode on the column periode.
That's why I want to convert it to array<struct<string, int>>.
Thanks for help.
Sidi
You don't need to "cast" anything, you just need to explode the array and then unpack the struct. I added an index to your data to make it more clear where things are ending up.
Data:
idx arr_of_structs
0 [{periode:20160118-20160205,nb:1},{periode:20161130-20161130,nb:1},{periode:20161130-20161221,nb:1}]
1 [{periode:20161212-20161217,nb:0}]
Query:
SELECT idx -- index
, my_struct.periode AS periode -- unpacks periode
, my_struct.nb AS nb -- unpacks nb
FROM database.table
LATERAL VIEW EXPLODE(arr_of_structs) exptbl AS my_struct
Output:
idx periode nb
0 20160118-20160205 1
0 20161130-20161130 1
0 20161130-20161221 1
1 20161212-20161217 0
It's a bit unclear from your question what the desired result is, but as soon as you update it I'll modify the query accordingly.
EDIT:
The above solution is incorrect; I didn't catch that your input is a STRING.
Query:
SELECT REGEXP_EXTRACT(tmp_arr[0], "([0-9]{8}-[0-9]{8})") AS periode
, REGEXP_EXTRACT(tmp_arr[1], ":([0-9]*)") AS nb
FROM (
SELECT idx
, pos
, COLLECT_SET(tmp_col) AS tmp_arr
FROM (
SELECT idx
, tmp_col
, CASE WHEN PMOD(pos, 2) = 0 THEN pos+1 ELSE pos END AS pos
FROM (
SELECT *
, ROW_NUMBER() OVER () AS idx
FROM database.table ) x
LATERAL VIEW POSEXPLODE(SPLIT(periode, ',')) exptbl AS pos, tmp_col ) y
GROUP BY idx, pos) z
Output:
periode nb
20160118-20160205 1
20161130-20161130 1
20161130-20161221 1
20161212-20161217 0
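For readers who want to check the parsing logic outside Hive, here is a rough Python sketch that extracts the same (periode, nb) pairs from the raw string with one regular expression. It assumes every element has exactly the shape {periode:YYYYMMDD-YYYYMMDD,nb:N} shown in the question; STRUCT_RE and parse_periodes are names made up for this example.

```python
import re

# One match per struct-like element; assumes the exact shape from the question.
STRUCT_RE = re.compile(r"\{periode:(\d{8}-\d{8}),nb:(\d+)\}")

def parse_periodes(raw):
    # Return one (periode, nb) tuple per element, with nb converted to int.
    return [(periode, int(nb)) for periode, nb in STRUCT_RE.findall(raw)]

sample = "[{periode:20160118-20160205,nb:1},{periode:20161130-20161130,nb:1}]"
print(parse_periodes(sample))
```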
What about using the split function? You should be able to do something like:
select nb, period from
(select split(periode, "-") as periods, nb from yourtable) t
LATERAL VIEW explode(periods) sss AS period;
I haven't tried it, but it should work :)
EDIT: the above should work if you have a column periodes following the pattern date-date-date... and a separate column nb, but that doesn't seem to be the case here. The following query should work for you (verbose, but it works):
select period, nb from (
select
regexp_replace(split(split(tok1,",")[1],":")[1], "[\\]|}]", "") as nb,
split(split(split(tok1,",")[0],":")[1],"-") as periods
from
(select split(YOURSTRINGCOLUMN, "},") as s1 from YOURTABLE)
r1 LATERAL VIEW explode(s1) ss1 AS tok1
) r2 LATERAL VIEW explode(periods) ss2 AS period;
I realize this question is a year old, but I ran into the same issue and tackled it by using the json_split UDF from Brickhouse.
SELECT EXPLODE(
json_split(
'[{"periode":"20160118-20160205","nb":1},{"periode":"20161130-20161130","nb":1},{"periode":"20161130-20161221","nb":1}]'
));
col
{"periode":"20160118-20160205","nb":1}
{"periode":"20161130-20161130","nb":1}
{"periode":"20161130-20161221","nb":1}
Sorry for the spaghetti code.
There's also a similar question here using JSON arrays instead of JSON strings. It's not the same case, but for anyone facing this kind of task it might be useful in a bigger context.

How to convert json array into postgres int array in postgres 9.3

I have a scenario where I need to convert a JSON array into a Postgres int array and query it for the result. Below is my data:
ID DATA
1 {"bookIds" : [1,2,3,5], "storeIds": [2,3]}
2 {"bookIds" : [4,5,6,7], "storeIds": [1,3]}
3 {"bookIds" : [11,12,10,9], "storeIds": [4,3]}
I want to convert the bookIds array into an int array and later query it. Is it possible in Postgres 9.3? I know 9.4+ provides much more JSON support, but I can't upgrade my database at the moment.
The query below gives me an error:
Select data::json->>'bookIds' :: int[] from table
ERROR: malformed array literal: "bookIds"
LINE 1: Select data::json->>'bookIds' :: int[] from table
Is it possible to query elements inside a JSON array in Postgres 9.3? Thanks in advance.
The setup in the question should look like this:
create table a_table (id int, data json);
insert into a_table values
(1, '{"bookIds": [1,2,3,5], "storeIds": [2,3]}'),
(2, '{"bookIds": [4,5,6,7], "storeIds": [1,3]}'),
(3, '{"bookIds": [11,12,10,9], "storeIds": [4,3]}');
Note the proper syntax of json values.
You can use the function json_array_elements()
select id, array_agg(e::text::int)
from a_table, json_array_elements(data->'bookIds') e
group by 1
order by 1;
id | array_agg
----+--------------
1 | {1,2,3,5}
2 | {4,5,6,7}
3 | {11,12,10,9}
(3 rows)
Use any() to search for an element in the arrays, e.g.:
select *
from (
select id, array_agg(e::text::int) arr
from a_table, json_array_elements(data->'bookIds') e
group by 1
) s
where
1 = any(arr) or
11 = any(arr);
id | arr
----+--------------
1 | {1,2,3,5}
3 | {11,12,10,9}
(2 rows)
Read also about the <@ operator.
You can also search in the JSON array (without converting it to an int array) by examining its elements, e.g.:
select t.*
from a_table t, json_array_elements(data->'bookIds') e
where e::text::int in (1, 11);
id | data
----+-----------------------------------------------
1 | {"bookIds" : [1,2,3,5], "storeIds": [2,3]}
3 | {"bookIds" : [11,12,10,9], "storeIds": [4,3]}
(2 rows)
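The logic of the queries above (expand bookIds, convert each element to an integer, keep rows where any element matches) can be sketched in Python for comparison. This is an illustration only; table and rows_with_book are hypothetical names using the sample data from the question.

```python
import json

# (id, data) pairs mirroring a_table.
table = [
    (1, '{"bookIds": [1,2,3,5], "storeIds": [2,3]}'),
    (2, '{"bookIds": [4,5,6,7], "storeIds": [1,3]}'),
    (3, '{"bookIds": [11,12,10,9], "storeIds": [4,3]}'),
]

def rows_with_book(rows, wanted):
    wanted = set(wanted)
    result = []
    for row_id, data in rows:
        # Like array_agg(e::text::int) over json_array_elements(data->'bookIds').
        book_ids = [int(b) for b in json.loads(data)["bookIds"]]
        # Like "1 = any(arr) or 11 = any(arr)".
        if wanted.intersection(book_ids):
            result.append((row_id, book_ids))
    return result

print(rows_with_book(table, [1, 11]))
```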
These two functions (for json/jsonb), modified from a fantastic answer to this question, work perfectly (note that json_array_elements_text() and the jsonb type require PostgreSQL 9.4 or later):
CREATE OR REPLACE FUNCTION json_array_castint(json) RETURNS int[] AS $f$
SELECT array_agg(x)::int[] || ARRAY[]::int[] FROM json_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;
CREATE OR REPLACE FUNCTION jsonb_array_castint(jsonb) RETURNS int[] AS $f$
SELECT array_agg(x)::int[] || ARRAY[]::int[] FROM jsonb_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;
You can use them as follows:
SELECT json_array_castint('[1,2,3]')
This gives the expected result {1,2,3} as an integer[]. If you wonder why I'm concatenating with an empty array in each SELECT statement: array_agg() returns NULL when there are no input rows, so without the concatenation, casting an empty json/jsonb array to an integer[] would give you NULL (not desired) instead of an empty array (as expected). With the above method, when you do
SELECT json_array_castint('[]')
You'll get {} instead of nothing. See here for more on why I added that.
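The empty-array guard can be mimicked in Python to make the behaviour concrete. In this sketch the array_agg helper is a stand-in that deliberately returns None for empty input (modelling SQL NULL), and the fallback to [] plays the role of the || ARRAY[]::int[] concatenation; both helpers are hypothetical and only model the behaviour.

```python
import json

def array_agg(values):
    # Stand-in for the SQL aggregate: no input rows gives None (SQL NULL).
    values = list(values)
    return values if values else None

def json_array_castint(text):
    aggregated = array_agg(int(x) for x in json.loads(text))
    # Equivalent of "|| ARRAY[]::int[]": fall back to an empty list on NULL.
    return aggregated if aggregated is not None else []

print(json_array_castint("[1,2,3]"))
print(json_array_castint("[]"))
```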
In my case I had to cast json data stored in a table col to pg array format and this was handy :
-- username is the table column, which has values like ["john","pete","kat"]
select id, ARRAY(SELECT json_array_elements_text((username)::json)) usernames
from public.table_name;
-- this produces : {john,pete,kat}
I would go a bit simpler:
select * from
(
select t.id, value::text::int as bookvalue
from testjson t, json_array_elements(t.data->'bookIds')
) as t
where bookvalue in (1,11)
See it working here: http://sqlfiddle.com/#!15/e69aa/37

Slicing the word to rows -TERADATA

I want to slice a word, e.g. SMILE, into:
S
M
I
L
E
I did it like this
SEL SUBSTR(EMP_NAME,1,1) FROM etlt5.employe where EMP_ID='28008'
UNION ALL
SEL SUBSTR(EMP_NAME,2,1) FROM etlt5.employe where EMP_ID='28008'
UNION ALL
SEL SUBSTR(EMP_NAME,3,1) FROM etlt5.employe where EMP_ID='28008'
I also tried a recursive query but got no final result. Is there a better way of doing this? This looks too hardcoded.
You could use STRTOK_SPLIT_TO_TABLE to do this. STRTOK_SPLIT_TO_TABLE splits a field by a delimiter and then takes each token (the stuff between the delimiters) and sticks it in its own record of a new derived table.
In your case you don't have a delimiter between the characters of "SMILE" so we can use some REGEXP_REPLACE magic to stick a comma between each letter, and then split that to a table:
WITH test (id, word) AS (SELECT 1, 'SMILE')
SELECT D.*
FROM TABLE (strtok_split_to_table(test.id, REGEXP_REPLACE(test.word, '([a-zA-Z])', ',\1'), ',')
RETURNS
( id integer
, rownum integer
, new_col varchar(100)character set unicode)
) as d
I've used this STRTOK_SPLIT_TO_TABLE(REGEXP_REPLACE()) before to split apart document numbers in order to determine a check digit, so it definitely has its uses.
May I ask why you want to do that?
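To make the delimiter trick easier to follow, here is the same idea as a small Python sketch: put a comma in front of every letter, then split on the comma. Dropping the empty leading token is an assumption of this sketch, and split_letters is a hypothetical helper name.

```python
import re

def split_letters(word):
    # Like REGEXP_REPLACE(word, '([a-zA-Z])', ',\1'): prefix each letter with a comma.
    delimited = re.sub(r"([a-zA-Z])", r",\1", word)
    # Split on the delimiter and skip empty tokens (assumed here for the leading comma).
    return [token for token in delimited.split(",") if token]

print(split_letters("SMILE"))
```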
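You need a table with a sequence from 1 to the max length of EMP_NAME (here number_table, with a column n), and you should only keep positions that fall inside the name:
select SUBSTR(EMP_NAME,n,1)
FROM etlt5.employe CROSS JOIN number_table
where EMP_ID='28008'
and n <= CHAR_LENGTH(EMP_NAME)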
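The numbers-table approach pairs the name with every position n and takes one character per position. A minimal Python sketch of that pairing follows; slice_with_numbers is a hypothetical name, and the range stands in for the rows of number_table.

```python
def slice_with_numbers(word, numbers):
    # SUBSTR is 1-based, so position n maps to Python index n - 1.
    # Positions beyond the end of the word are skipped.
    return [word[n - 1] for n in numbers if n <= len(word)]

# A "number table" holding 1..10, longer than the word itself.
print(slice_with_numbers("SMILE", range(1, 11)))
```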

Sum of values of json array in PostgreSQL

In PostgreSQL 9.3, I have a table like this
id | array_json
---+----------------------------
1 | ["{123: 456}", "{789: 987}", "{111: 222}"]
2 | ["{4322: 54662}", "{123: 5121}", "{1: 5345}" ... ]
3 | ["{3232: 413}", "{5235: 22}", "{2: 5453}" ... ]
4 | ["{22: 44}", "{12: 4324}", "{234: 4235}" ... ]
...
I want to get the sum of all values in array_json column. So, for example, for first row, I want:
id | total
---+-------
1 | 1665
Where 1665 = 456 + 987 + 222 (the values of all the elements of the JSON array). Nothing is known in advance about the keys of the JSON elements (they are just random numbers).
I'm reading the documentation page about JSON functions in PostgreSQL 9.3, and I think I should use json_each, but can't find the right query. Could you please help me with it?
Many thanks in advance
You started looking at the right place (going to the docs is always the right place).
Since your values are JSON arrays, I would suggest using json_array_elements(json).
And since it's a JSON array that you have to explode into several rows and then combine back by summing over json_each_text(json), it would be best to create your own function (Postgres allows it).
As for your specific case, assuming the structure you provided is correct, some string parsing plus heavy JSON wizardry can be used. Let's say your table name is "json_test_table" and the columns are "id" and "json_array"; here is a query that does it:
select id, sum(val) from
(select id,
substring(
json_each_text(
replace(
replace(
replace(
replace(
replace(json_array,':','":"')
,'{',''),
'}','')
,']','}')
,'[','{')::json)::varchar
from '\"(.*)\"')::int as val
from json_test_table) j group by id ;
If you plan to run it on a huge dataset, keep in mind that string manipulations are expensive in terms of performance.
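The chain of replace() calls is easier to follow when each intermediate string is visible. Below is a rough Python sketch of the same rewriting, assuming the input text has exactly the shape shown in the question; total_from_text is a hypothetical name. One caveat of the sketch: a Python dict collapses duplicate keys, so it only matches the query when the keys within a row are distinct.

```python
import json

def total_from_text(json_array_text):
    # '{123: 456}' becomes '{123":" 456}': quote both sides of every colon.
    text = json_array_text.replace(":", '":"')
    # Drop the inner braces so the elements become plain "key":"value" pairs.
    text = text.replace("{", "").replace("}", "")
    # Turn the outer array brackets into object braces.
    text = text.replace("]", "}").replace("[", "{")
    pairs = json.loads(text)
    # int() tolerates the leading space left over in each value.
    return sum(int(value) for value in pairs.values())

print(total_from_text('["{123: 456}", "{789: 987}", "{111: 222}"]'))
```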
You can get it using this:
/*
Sorry, sqlfiddle is busy :p
CREATE TABLE my_table
(
id bigserial NOT NULL,
array_json json[]
--,CONSTRAINT my_table_pkey PRIMARY KEY (id)
)
INSERT INTO my_table(array_json)
values (array['{"123": 456}'::json, '{"789": 987}'::json, '{"111": 222}'::json]);
*/
select id, sum(json_value::integer)
from
(
select id, json_data->>json_object_keys(json_data) as json_value from
(
select id, unnest(array_json) as json_data from my_table
) A
) B
group by id
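As a cross-check of what this query computes, here is a small Python sketch of the same steps: take each element of the array, read its values whatever the key is, and sum them. It assumes each element is valid JSON with integer values, as in the INSERT above; row_total is a hypothetical name.

```python
import json

def row_total(array_json):
    total = 0
    # Like unnest(array_json): visit each JSON element of the array.
    for element in array_json:
        obj = json.loads(element)
        # Like json_data->>json_object_keys(json_data): take every value, ignoring the key.
        total += sum(int(value) for value in obj.values())
    return total

print(row_total(['{"123": 456}', '{"789": 987}', '{"111": 222}']))
```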
