I have a tricky query that tries to find matches by comparing a list of JSON values against a JSON array stored in a column.
The "Things" table with "Keywords" column would contain something such as:
'["car", "house", "boat"]'::JSONB
The query would contain the values:
'["car", "house"]'::JSONB
I'd like to find all the rows that have BOTH "car" and "house" contained in the listing. Here's my (mostly) feeble attempt:
SELECT
*
FROM
"Things"
WHERE
"Keywords"::JSONB ?| ARRAY(
SELECT * FROM JSONB_ARRAY_ELEMENTS('["house","car"]'::JSONB)
)::TEXT[]
Also when it comes to indexing, I'm assuming adding a GIST index would be my best option.
I'd like to find all the rows that have BOTH "car" and "house"
So the right operator is ?& - Do all of these array strings exist as top-level keys?
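As a quick illustration of the difference between the two operators, here is a minimal sketch using made-up literal values rather than the table from the question:

```sql
SELECT '["car", "boat"]'::jsonb ?| array['house', 'car'] AS any_exists  -- true: "car" is present
     , '["car", "boat"]'::jsonb ?& array['house', 'car'] AS all_exist;  -- false: "house" is missing
```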
Your query is almost correct; change the operator and use jsonb_array_elements_text():
WITH "Things"("Keywords") AS (
VALUES
('["car", "house", "boat"]'::jsonb),
('["car", "boat"]'),
('["car", "house"]'),
('["house", "boat"]')
)
SELECT
*
FROM
"Things"
WHERE
"Keywords" ?& array(
SELECT jsonb_array_elements_text('["house","car"]')
)
Keywords
--------------------------
["car", "house", "boat"]
["car", "house"]
(2 rows)
The query would be simpler if the argument could be written down as a regular array of text:
SELECT
*
FROM
"Things"
WHERE
"Keywords" ?& array['house', 'car']
In both cases you can use a GIN index:
The default GIN operator class for jsonb supports queries with top-level key-exists operators ?, ?& and ?| operators (...)
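A minimal sketch of such an index, assuming the "Keywords" column is of type jsonb (the index name is made up for illustration):

```sql
-- Uses the default operator class (jsonb_ops), which supports ?, ?| and ?&
CREATE INDEX things_keywords_gin_idx ON "Things" USING gin ("Keywords");
```

Note that the non-default jsonb_path_ops operator class does not support the key-exists operators, so it would not help with this particular query.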
Related
I have a table that has cast as jsonb column, which looks like this:
cast: [
{ name: 'Clark Gable', role: 'Rhett Butler' },
{ name: 'Vivien Leigh', role: 'Scarlett' },
]
I'm trying to query name in the jsonb array of objects. This is my query:
SELECT DISTINCT actor as name
FROM "Title",
jsonb_array_elements_text(jsonb_path_query_array("Title".cast,'$[*].name')) actor
WHERE actor ILIKE 'cla%';
Is there a way to index a query like this? I've tried using BTREE, GIN, GIN with gin_trgm_ops with no success.
My attempts:
CREATE INDEX "Title_cast_idx_jsonb_path" ON "Title" USING GIN ("cast" jsonb_path_ops);
CREATE INDEX "Title_cast_idx_on_expression" ON "Title" USING GIN(jsonb_array_elements_text(jsonb_path_query_array("Title".cast, '$[*].name')) gin_trgm_ops);
One of the issues is that jsonb_array_elements_text(jsonb_path_query_array()) returns a set, which can't be indexed. Using array_agg doesn't seem useful, since I need to extract the name value, and not just check for existence.
I have a table with the following structure -
Column Name | Data Type
--------------------------
user_id | uuid
profile | jsonb
An example profile field would be something like -
{ "data": { "races": [ "white", "asian" ] } }
I want to query this table for users whose races contain at least one of the following (for example) - "white", "african american"
I would expect my example user above to be returned since their races field contains "white".
I have tried something like this with no success -
SELECT user_id from table
WHERE profile -> 'data' ->> 'races' = ANY('{"white", "african american"}')
Using Postgres 13.x
Thank you!
Use the ?| operator:
select user_id
from my_table
where profile -> 'data' -> 'races' ?| array['white', 'african american']
According to the documentation:
jsonb ?| text[] -> boolean
Do any of the strings in the text array exist as top-level keys or array elements?
tl;dr use the ?| operator.
There are two problems with your query.
->> returns text not jsonb. So you're asking if the text ["white", "asian"] matches white or african american.
You probably did that because otherwise you got type errors trying to use any with JSONB. any wants a Postgres array of things to compare, and it has to be an array of jsonb. We can do that...
select user_id
from users
where profile -> 'data' -> 'races' = ANY(array['"white"', '"african american"']::jsonb[]);
But this has the same problem as before, it's checking if the json array [ "white", "asian" ] equals "white" or "african american".
You need an operator which will match against each element of the JSON. Use the ?| operator.
select user_id
from users
where profile -> 'data' -> 'races' ?| array['white', 'african american'];
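To try this without the real table, here is a self-contained sketch. The integer user_id values and the sample rows are made up for illustration (the real column is a uuid):

```sql
WITH users(user_id, profile) AS (
   VALUES
      (1, '{"data": {"races": ["white", "asian"]}}'::jsonb)
    , (2, '{"data": {"races": ["asian"]}}')
    , (3, '{"data": {"races": ["african american"]}}')
)
SELECT user_id
FROM   users
-- -> (not ->>) keeps the value as jsonb, so ?| can test each array element
WHERE  profile -> 'data' -> 'races' ?| array['white', 'african american'];
-- returns user_id 1 and 3
```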
I'd like to create GIN index on a scalar text column using an ARRAY[] expression like so:
CREATE TABLE mytab (
scalar_column TEXT
);
CREATE INDEX idx_gin ON mytab USING GIN(ARRAY[scalar_column]);
Postgres reports an error on ARRAY keyword.
I'll use this index later in a query like so:
SELECT * FROM mytab WHERE ARRAY[scalar_column] <@ ARRAY['some', 'other', 'values'];
How do I create such an index?
You forgot to add an extra pair of parentheses that is necessary for syntactical reasons:
CREATE INDEX idx_gin ON mytab USING gin ((ARRAY[scalar_column]));
The index does not make a lot of sense. If you need to search for membership in a given array, use a regular B-tree index with = ANY.
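A minimal sketch of that alternative, reusing the table from the question (the index name is made up):

```sql
-- Plain B-tree index on the scalar column
CREATE INDEX mytab_scalar_column_idx ON mytab (scalar_column);

-- Membership test that can use the B-tree index
SELECT *
FROM   mytab
WHERE  scalar_column = ANY (ARRAY['some', 'other', 'values']);
```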
Currently I'm writing queries against a JSONB table with 8 million+ rows. How can I query from the parent and the friends objects in the most efficient manner possible?
Query (Postgres 9.6):
select distinct id, data->>'_id' jsonID, data->>'email' email, friends->>'name' friend_name, parent->>'name' parent
from temp t
CROSS JOIN jsonb_array_elements(t.data->'friends') friends
CROSS JOIN jsonb_array_elements(friends->'parent') parent
where friends ->> 'name' = 'Chan Franco'
and parent->>'name' = 'Hannah Golden'
Example DDL (with data): https://pastebin.com/byN7uyKx
Your regularly structured data would be cleaner, smaller and faster as a normalized relational design.
That said, to make the setup you have much faster (if not as fast as a normalized design with matching indexes), add a GIN index on the expression data->'friends':
CREATE INDEX tbl_data_friends_gin_idx ON tbl USING gin ((data->'friends'));
Then add a matching WHERE clause to your query with the contains operator @>:
SELECT DISTINCT -- why DISTINCT ?
id, data->>'_id' AS json_id, data->>'email' AS email
, friends->>'name' AS friend_name, parent->>'name' AS parent
FROM tbl t
CROSS JOIN jsonb_array_elements(t.data->'friends') friends
CROSS JOIN jsonb_array_elements(friends->'parent') parent
WHERE t.data->'friends' @> '[{"name": "Chan Franco", "parent": [{"name": "Hannah Golden"}]}]'
AND friends->>'name' = 'Chan Franco'
AND parent ->>'name' = 'Hannah Golden';
The huge difference: With the help of the index, Postgres can now identify matching rows before unnesting each and every nested "friends" array in the whole table. Only after having identified matching rows in the underlying table, jsonb_array_elements() is called and resulting rows with qualifying array elements are kept.
Note that the search expression has to be valid JSON, matching the structure of the JSON array data->'friends' - including the outer brackets []. But omit all key/value pairs that are not supposed to serve as filter.
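To illustrate the point about structure, here is a small sketch with a made-up document. The "age" key and the second parent are hypothetical and only there to show that extra keys and elements are ignored:

```sql
SELECT doc @> '[{"name": "Chan Franco"}]'                 AS with_brackets     -- true: array of objects vs. array of objects
     , doc @> '{"name": "Chan Franco"}'                   AS without_brackets  -- false: outer [] missing, structure differs
     , doc @> '[{"parent": [{"name": "Hannah Golden"}]}]' AS nested_only       -- true: unlisted keys are not compared
FROM  (
   SELECT '[{"name": "Chan Franco", "age": 30,
             "parent": [{"name": "Hannah Golden"}, {"name": "Someone Else"}]}]'::jsonb
   ) AS t(doc);
```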
Related:
Index for finding an element in a JSON array
I avoided the table name temp as this is an SQL key word, that might lead to confusing errors. Using the name tbl instead.
I have a jsonb column, called "product", that contains a similar jsonb object as the one below. I'm trying to figure out how to do a LIKE statement against the same data in a postgresql 9.5.
{
"name":"Some Product",
"variants":[
{
"color":"blue",
"skus":[
{
"uom":"each",
"code":"ZZWG002NCHZ-65"
},
{
"uom":"case",
"code":"ZZWG002NCHZ-65-CASE"
}
]
}
]}
The following query works for exact match.
SELECT * FROM products WHERE product #> '{variants}' @> '[{"skus":[{"code":"ZZWG002NCHZ-65"}]}]';
But I need to support LIKE statements like "begins with", "ends with" and "contains". How would this be done?
Example: Let's say I want all products returned that have a sku code that begins with "ZZWG00".
You should unnest variants and skus (using jsonb_array_elements()), so you could examine sku->>'code':
SELECT DISTINCT p.*
FROM
products p,
jsonb_array_elements(product->'variants') as variants(variant),
jsonb_array_elements(variant->'skus') as skus(sku)
WHERE
sku->>'code' like 'ZZW%';
Use DISTINCT as you'll have multiple rows as a result of multiple matches in one product.
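As a variation (not what the query above does), the same unnesting can be wrapped in EXISTS, which returns each product at most once without needing DISTINCT. A sketch using the "begins with" pattern from the question; the "ends with" and "contains" patterns in the comments are only illustrative values based on the sample data:

```sql
SELECT p.*
FROM   products p
WHERE  EXISTS (
   SELECT 1
   FROM   jsonb_array_elements(p.product -> 'variants') AS variants(variant)
        , jsonb_array_elements(variant -> 'skus')       AS skus(sku)
   WHERE  sku ->> 'code' LIKE 'ZZWG00%'    -- begins with
   -- WHERE sku ->> 'code' LIKE '%-CASE'   -- ends with
   -- WHERE sku ->> 'code' LIKE '%002N%'   -- contains
   );
```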