How can I parse JSON arrays in postgresql? - arrays

I am using PostgreSQL 9.5.14, and have a column in a table that contains JSON arrays that I need to parse for their contents.
Using a select I can see that the structure of the JSON is of this kind:
SELECT rule_results from table limit 5;
Result:
[{"rule_key":"applicant_not_lived_outside_eu"},{"rule_key":"family_assets_exceed_limit"},{"rule_key":"owned_a_deed"}]
[]
[]
[{"rule_key":"family_category","details":"apply_with_parents_under_25"}]
[]
I have been unable to create an SQL command to give me the values of the rule_key keys.
I've attempted to use the documentation for json-functions in postgresql to find a solution from
https://www.postgresql.org/docs/9.5/functions-json.html
SELECT rule_results::json->'rule_key' as results from table;
This gives me null values only.
SELECT jsonb_object_keys(rule_results::jsonb) from table;
This results in the error msg "cannot call jsonb_object_keys on a scalar", which seems to mean that the query is limited to a single row.
This looks simple enough, an array with key:value pairs, but somehow the answer eludes me. I would appreciate any help.

demo: db<>fiddle
Different solutions are possible. It depends on what you are expecting finally. But all solutions would use the function json_array_elements(). This expands every element into one row. With that you can do whatever you want.
This results in one row per value:
SELECT
value -> 'rule_key'
FROM
data,
json_array_elements(rule_results)

Related

Big query nested json strings to arrays then new tables

Bigquery Database
I've got a webhook that's pushing to my big query table. The problem is it has lots of nested json strings which are brought in as strings. I ultimately want to make each column with these json strings into their own tables but I'm getting stuck because I can't figure out how to get them unnested and into an array.
[{"id":"63bddc8cfe21ec002d26b7f4","description":"General Admission", "src_currency":"USD","src_price":50.0,"src_fee":0.0,"src_commission":1.79,"src_discount":0.0,"applicable_pass_id":null,"seats_label":null,"seats_section_label":null,"seats_parent_type":null,"seats_parent_label":null,"seats_self_type":null,
"seats_self_label":null,"rate_type":"Rate","option_name":null,"price_level_id":null,"src_discount_price":50.0,"rate_id":"636d6d5cea8c6000222c640d","cost_item_id":"63bddc8cfe21ec002d26b7f4"}]
Here's the sample return from the original source and below is a screenshot of what I'm working with.
[Current Database
I've tried a number of things but the multiple nestings and string to array issue are really hampering everything I've tried.
I'm honestly not sure exactly what output/structure is best for this data set. I assume that each of the json returns probably just needs to be its own table and I can reference or join them based off that first "id" value in the json strings but I'm wide open to suggestions.
You can use a combination of JSON functions, and array functions to manipulate this kind of data.
JSON_EXTRACT_ARRAY can convert the JSON formatted string into an array, UNNEST then can make each entry into rows, and finally JSON_EXTRACT_SCALAR can pull out individual columns.
So here's an example of what I think you're trying to accomplish:
with sampledata as (
select """[{"id":"63bddc8cfe21ec002d26b7f4","description":"General Admission", "src_currency":"USD","src_price":50.0,"src_fee":0.0,"src_commission":1.79,"src_discount":0.0,"applicable_pass_id":null,"seats_label":null,"seats_section_label":null,"seats_parent_type":null,"seats_parent_label":null,"seats_self_type":null,"seats_self_label":null,"rate_type":"Rate","option_name":null,"price_level_id":null,"src_discount_price":50.0,"rate_id":"636d6d5cea8c6000222c640d","cost_item_id":"63bddc8cfe21ec002d26b7f4"},{"id":"63bddc8cfe21ec002d26b7f4","description":"General Admission", "src_currency":"USD","src_price":50.0,"src_fee":0.0,"src_commission":1.79,"src_discount":0.0,"applicable_pass_id":null,"seats_label":null,"seats_section_label":null,"seats_parent_type":null,"seats_parent_label":null,"seats_self_type":null,"seats_self_label":null,"rate_type":"Rate","option_name":null,"price_level_id":null,"src_discount_price":50.0,"rate_id":"636d6d5cea8c6000222c640d","cost_item_id":"63bddc8cfe21ec002d26b7f4"}]""" as my_json_string
)
select JSON_EXTRACT_SCALAR(f,'$.id') as id, JSON_EXTRACT_SCALAR(f,'$.rate_type') as rate_type, JSON_EXTRACT_SCALAR(f,'$.cost_item_id') as cost_item_id
from sampledata, UNNEST(JSON_EXTRACT_ARRAY(my_json_string)) as f
Which creates rows with specific columns from that data, like this:

How to query multiple JSON document schemas in Snowflake?

Could anyone tell me how to change the Stored Procedure in the article below to recursively expand all the attributes of a json file (multiple JSON document schemas)?
https://support.snowflake.net/s/article/Automating-Snowflake-Semi-Structured-JSON-Data-Handling-part-2
Craig Warman's stored procedure posted in that blog is a great idea. I asked him if it was okay to refactor his code, and he agreed. I've used the refactored version in the field, so I know the SP well as well as how it works.
It may be possible to modify the SP to work on your JSON. It will depend on whether or not Snowflake types the JSON in your variant column. The way you have it structured, it may not type everything. You can check by running this SQL and seeing if the result set includes all the columns you need:
set VARIANT_TABLE = 'WEATHER';
set VARIANT_COLUMN = 'V';
with MAIN_TABLE as
(
select * from identifier($VARIANT_TABLE) sample (1000 rows)
)
select distinct REGEXP_REPLACE(REGEXP_REPLACE(f.path, '\\[(.+)\\]'),'[^a-zA-Z0-9]','_') AS path_name, -- This generates paths with levels enclosed by double quotes (ex: "path"."to"."element"). It also strips any bracket-enclosed array element references (like "[0]")
typeof(f.value) AS attribute_type, -- This generates column datatypes.
path_name AS alias_name -- This generates column aliases based on the path
from
MAIN_TABLE,
LATERAL FLATTEN(identifier($VARIANT_COLUMN), RECURSIVE=>true) f
where TYPEOF(f.value) != 'OBJECT'
AND NOT contains(f.path, '[');
Be sure to replace the variables to your table and column names. If this picks up the type information for the columns in your JSON, then it's possible to modify this SP to do what you need. If it doesn't but there's a way to modify the query to get it to pick up the columns, that would work too.
If it doesn't pick up the columns, based on Craig's idea I decided to write type inference for non variant (such as strings from CSV log files without type information). Try the SQL above and see what results first.

How to update keys in a JSON array Postgresql

I am using PostgreSQL 9.4.1 and have run into an issue with a column that I need to update. The column is of type JSON with elements in the following format:
["a","b","c",...]
["a","c","d",...]
["c","d","e",...]
etc.
so that each element is a string. It is my understanding that each of these elements are considered keys to the JSON array (please correct me if I am a bit confused here, I haven't ever worked with JSON datatype columns before, so I'm still trying to get a grip on them anyways). There is not an actual pattern to these arrays, and their contents are dependent on user input from somewhere else. My goal is to update any of the arrays that contain a particular element (say "b" for the purpose of explaining my question more thoroughly) and replace the content "b" with say "b1". Meaning that:
["a","b","c",...]
would be updated to
["a","b1","c",...]
I have found a few ways listed on this site (I don't currently have the links, but I can find them again if necessary) to update VALUES for a particular KEY, but I haven't found a way mentioned to change the KEY itself. I have already found a way to target the specific rows of interest by doing something similar to:
SELECT *
FROM TableA
WHERE column::json ?| ["b", other string elements of interest]
Any suggestions would be greatly appreciated. Thanks for your help!
So I went ahead and gave that a check (because it looks like it should work, and it's more or less what I ended up doing), but I figured out what I was trying to do! What I got was this:
UPDATE TableA
SET column = REPLACE(column::TEXT,'b','b1')::JSON
WHERE column::JSON ?| ['b']
And now that I think about it, I probably don't even need the last where condition because the replace won't affect anything that doesn't have 'b' in it. But that worked for me, and it looks like yours probably should too! Thanks for the help!
I wanted to rename a specific key for json array column.
I tried and it worked on PostgreSQL 9.4:
UPDATE Your_Table_Name
SET Your_Column_Name = replace(Your_Column_Name::TEXT,'Key_Old_Name','Key_New_Name')::json
WHERE attributes::jsonb ? 'Key_Old_Name'
Basically, solution is to go over the list of json_array_elements, and based on json value using CASE condition replace certain value with other one. After all, need to re-build new json array using array_agg() and to_json() description of aggregate functions in psql is here.
Possible query can be the following:
-- Sample DDL and JSON data
CREATE TABLE jsontest (data JSON);
INSERT INTO jsontest VALUES ('["a","b","c"]'::JSON);
-- QUERY
WITH result AS (
SELECT to_json( -- creating updated json structure
array_agg( -- create array with new element "b1"
CASE WHEN element::TEXT = '"b"' -- here we process array elements to find "b"
THEN to_json('b1'::TEXT)
ELSE element
END)) as new_json
FROM jsontest,json_array_elements(jsontest.data) as element
)
UPDATE jsontest SET data = result.new_json FROM result;

How to populate a CTE with a list of values in Sqlite

I am working with SQLite and straight C. I have a C array of ids of length N that I am compiling into a string with the following format:
'id1', 'id2', 'id3', ... 'id[N]'
I need to build queries to do several operations that contain comparisons to this list of ids, an example of which might be...
SELECT id FROM tableX WHERE id NOT IN (%s);
... where %s is replaced by the string representation of my array of ids. For complicated queries and high values of N, this obviously produces some very ungainly queries, and I would like to clean them up using common-table-expressions. I have tried the following:
WITH id_list(id) AS
(VALUES(%s))
SELECT * FROM id_list;
This doesn't work because SQLite expects a column for for each value in my string. My other failed attempt was
WITH id_list(id) AS
(SELECT (%s))
SELECT * FROM id_list;
but this throws a syntax error at the comma. Does SQLite syntax exist to accomplish what I'm trying to do?
SQLite supports VALUES clauses with multiple rows, so you can write:
WITH id_list(id) AS (VALUES ('id1'), ('id2'), ('id3'), ...
However, this is not any more efficient than just listing the IDs in IN.
You could write all the IDs into a temporary table, but this would not make sense unless you have measured the performance improvement.
One solution I have found is to reformat my string as follows:
VALUES('id1') UNION VALUES('id2') ... UNION VALUES('id[N]')
Then, the following query achieves the desired result:
WITH id_list(id) AS
(%s)
SELECT * FROM id_list;
However, I am not totally satisfied with this solution. It seems inefficient.

Inserting an array column into the db in Yii

I needed to insert an array field into a database and I was pleased to notice that PostGreSQL had that functionality. But now I am not able to insert the data using the tables active record.
I have tried the below calls with no success
$active_record->array_column = $_array_of_values;
which gives me the exception
Exception Raised:CDbCommand failed to execute the SQL statement: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: array value must start with "{" or dimension information
I have also tried this using
foreach($_array_of_values as $value){
$active_record->array_column[] = $value;
}
which tells me
Indirect modification of overloaded property FeatureRaw::$colors_names has no effect
Can anyone help me with this?
Thanks!
Data must be inserted in the form (text representation of an ARRAY):
INSERT INTO tbl (arr_col) VALUES ('{23,45}')
Or:
INSERT INTO tbl (arr_col) VALUES ('{foo,"bar, with comma"}')
So you need to enclose your array values in '{}' and separate them with comma ,. Use double quotes "" around text values that include a comma.
I listed more syntax variants to insert arrays in a related answer.
For those who also have same problem:
I didn't check the Yii1 behavior, but in Yii2 you simply can insert array as properly formed string as Erwin Brandstetter mentioned in his comment:
$activeRecord->arrayField = '{' . implode(',',$array_values) . '}';
Of course you need to make additional efforts when your $array_values has strings with commas, etc. And you still need to convert value back to array after you load ActiveRecord.
You can make these conversions in ActiveRecord's beforeSave() and afterLoad() and you will not need to convert values manually.
UPD. Recently I made a simple behavior for Yii2 to use array fields with ActiveRecord without manual field building: kossmoss/yii2-postgresql-array-field. It is more generalized way to solve the problem and I hope it will help. For those who use Yii1: you can investigate the package code and create your own solutuion compatible with your framework.

Resources