PostgreSQL 9.5 JSONB nested arrays LIKE statement - arrays

I have a jsonb column, called "product", that contains a jsonb object like the one below. I'm trying to figure out how to run a LIKE statement against the same data in PostgreSQL 9.5.
{
  "name": "Some Product",
  "variants": [
    {
      "color": "blue",
      "skus": [
        {
          "uom": "each",
          "code": "ZZWG002NCHZ-65"
        },
        {
          "uom": "case",
          "code": "ZZWG002NCHZ-65-CASE"
        }
      ]
    }
  ]
}
The following query works for an exact match.
SELECT * FROM products WHERE product #> '{variants}' @> '[{"skus":[{"code":"ZZWG002NCHZ-65"}]}]';
But I need to support LIKE-style matches such as "begins with", "ends with" and "contains". How would this be done?
Example: Lets say I want all products returned that have a sku code that begins with "ZZWG00".

You should unnest variants and skus (using jsonb_array_elements()) so that you can examine sku->>'code':
SELECT DISTINCT p.*
FROM
  products p,
  jsonb_array_elements(p.product->'variants') AS variants(variant),
  jsonb_array_elements(variant->'skus') AS skus(sku)
WHERE
  sku->>'code' LIKE 'ZZWG00%';
Use DISTINCT because multiple matches within one product would otherwise yield multiple result rows.
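"Ends with" and "contains" work the same way; only the LIKE pattern changes. A minimal sketch (the sample patterns are illustrative, not from the original answer):
-- "contains": wrap the pattern in % on both sides
-- ("ends with" would be LIKE '%-CASE'; use ILIKE for case-insensitive matching)
SELECT DISTINCT p.*
FROM
  products p,
  jsonb_array_elements(p.product->'variants') AS variants(variant),
  jsonb_array_elements(variant->'skus') AS skus(sku)
WHERE
  sku->>'code' LIKE '%NCHZ%';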

Related

Create Postgres JSONB Index on Array Sub-Object for ILIKE operator

I have a table that has a cast column of type jsonb, which looks like this:
cast: [
  { name: 'Clark Gable', role: 'Rhett Butler' },
  { name: 'Vivien Leigh', role: 'Scarlett' }
]
I'm trying to query the name field in the jsonb array of objects. This is my query:
SELECT DISTINCT actor AS name
FROM "Title",
  jsonb_array_elements_text(jsonb_path_query_array("Title"."cast", '$[*].name')) actor
WHERE actor ILIKE 'cla%';
Is there a way to index a query like this? I've tried BTREE, GIN, and GIN with gin_trgm_ops, with no success.
My attempts:
CREATE INDEX "Title_cast_idx_jsonb_path" ON "Title" USING GIN ("cast" jsonb_path_ops);
CREATE INDEX "Title_cast_idx_on_expression" ON "Title" USING GIN (jsonb_array_elements_text(jsonb_path_query_array("Title"."cast", '$[*].name')) gin_trgm_ops);
One of the issues is that jsonb_array_elements_text(jsonb_path_query_array()) returns a set, which can't be indexed. Using array_agg doesn't seem helpful either, since I need to extract the name values rather than just check for their existence.
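One possible workaround (not from the original post; it assumes the pg_trgm extension is installed) is to index the text form of the whole names array: jsonb_path_query_array() is immutable, so its result cast to text can be indexed with gin_trgm_ops:
-- requires: CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX "Title_cast_names_trgm_idx" ON "Title"
  USING GIN ((jsonb_path_query_array("cast", '$[*].name')::text) gin_trgm_ops);

-- the query must repeat the indexed expression verbatim;
-- note the leading %: the serialized array starts with [", so a
-- prefix pattern like 'cla%' would no longer match
SELECT *
FROM "Title"
WHERE jsonb_path_query_array("cast", '$[*].name')::text ILIKE '%cla%';
This matches anywhere in the serialized array, so it behaves like "contains" rather than "begins with", and it cannot tell which element matched.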

BigQuery ARRAY_TO_STRING based on condition in non-array field

I have a table that I query like this...
select *
from table
where productId = 'abc123'
Which returns 2 rows (even though the productId is unique) because one of the columns (orderName) is an Array...
productId | productName     | created    | featureCount | orderName
abc123    | someProductName | 2020-01-01 | 12           | someOrderName
          |                 |            |              | someOtherOrderName
I'm not sure whether the missing values in the 2nd row are empty strings or nulls, because of the way the orderName array expands my search results. But I now want to run a query like this...
select productName, ARRAY_TO_STRING(orderName,'-')
from table
where productId = 'abc123'
and ifnull(featureCount,0) > 0
But this query returns...
someProductName, someOrderName-someOtherOrderName
i.e. both array values came back even though I specified the condition featureCount > 0.
I'm sure I'm missing something very basic about how arrays work in BigQuery, but from Google's ARRAY_TO_STRING documentation I don't see any way to apply a condition when extracting ARRAY values. I'd appreciate any thoughts on the best way to go about this.
From what I understand, this happens because you are querying a single row of data that has a column of type ARRAY<STRING>. Since ARRAY_TO_STRING accepts an ARRAY<STRING> value and joins its elements, all of the array's values end up in a single cell.
So when you run your script, the output fits your criteria, and the array column is displayed across additional rows for visibility.
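For illustration, ARRAY_TO_STRING simply joins the elements of one array value into one string (a minimal sketch using the literal values from the question):
SELECT ARRAY_TO_STRING(['someOrderName', 'someOtherOrderName'], '-') AS joined;
-- joined: someOrderName-someOtherOrderName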
The visualization in the UI should look like the one you mention in your question:
Row | productId | productName     | created    | featureCount | orderName
1   | abc123    | someProductName | 2020-01-01 | 12           | someOrderName
    |           |                 |            |              | someOtherOrderName
Note: In BigQuery this additional row is grayed out; it is part of row 1 but is displayed as an additional row for visibility. So this output has only 1 row in the table.
And the visualization as JSON would be:
[
  {
    "productId": "abc123",
    "productName": "someProductName",
    "created": "2020-01-01",
    "featureCount": "12",
    "orderName": [
      "someOrderName",
      "someOtherOrderName"
    ]
  }
]
I don't think there is specific documentation about how arrays are visualized in the UI, but I can share the docs that cover working with arrays and flattening them into rows:
Working with Arrays
Flattening Arrays
I used the following to replicate your issue:
CREATE OR REPLACE TABLE `project-id.dataset.working_table` (
  productId STRING,
  productName STRING,
  created STRING,
  featureCount STRING,
  orderName ARRAY<STRING>
);

INSERT INTO `project-id.dataset.working_table` (productId, productName, created, featureCount, orderName)
VALUES ('abc123', 'someProductName', '2020-01-01', '12', ['someOrderName', 'someOtherOrderName']);

INSERT INTO `project-id.dataset.working_table` (productId, productName, created, featureCount, orderName)
VALUES ('abc123X', 'someProductNameX', '2020-01-02', '15', ['someOrderName', 'someOtherOrderName', 'someData']);
Output:

Row | productId | productName      | created    | featureCount | orderName
1   | abc123    | someProductName  | 2020-01-01 | 12           | someOrderName
    |           |                  |            |              | someOtherOrderName
2   | abc123X   | someProductNameX | 2020-01-02 | 15           | someOrderName
    |           |                  |            |              | someOtherOrderName
    |           |                  |            |              | someData

Note: The table contains 2 rows.
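As a side note (an addition, not part of the original answer): if you do want to apply a condition per array element, flatten the array with UNNEST as described in the docs linked above. A minimal sketch against the replicated table:
SELECT productId, orderName_element
FROM `project-id.dataset.working_table`,
  UNNEST(orderName) AS orderName_element
WHERE productId = 'abc123';
-- each element of orderName arrives on its own row,
-- so ordinary WHERE conditions now apply per element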

How to compare two columns not having the same value using the Sequelize ORM

I have two fields in my table, dispatchCount and qty.
I want to findOne tuple where dispatchCount is not equal to qty.
I want to do something similar to this (Mysql Select Rows Where two columns do not have the same value) but using the Sequelize ORM.
I don't want to write the raw query myself because there are a lot of aliases and things like that. So how can I do the following using Sequelize?
SELECT *
FROM my_table
WHERE column_a != column_b
The following is the solution, without writing a raw query:
const { Op } = require('sequelize');

// find one row where dispatchCount differs from the value of the qty column
const en = await Entity.findOne({
  where: {
    dispatchCount: {
      [Op.ne]: sequelize.col('qty')
    }
  }
});

Most efficient way to query data nested deep in JSON arrays?

Currently I'm writing queries against a JSONB table with 8 million+ rows. How can I query from the parent and the friends objects in the most efficient manner possible?
Query (Postgres 9.6):
SELECT DISTINCT id, data->>'_id' jsonID, data->>'email' email,
       friends->>'name' friend_name, parent->>'name' parent
FROM temp t
CROSS JOIN jsonb_array_elements(t.data->'friends') friends
CROSS JOIN jsonb_array_elements(friends->'parent') parent
WHERE friends->>'name' = 'Chan Franco'
  AND parent->>'name' = 'Hannah Golden'
Example DDL (with data): https://pastebin.com/byN7uyKx
Your regularly structured data would be cleaner, smaller and faster as a normalized relational design.
That said, to make the setup you have much faster (if not as fast as a normalized design with matching indexes), add a GIN index on the expression data->'friends':
CREATE INDEX tbl_data_friends_gin_idx ON tbl USING gin ((data->'friends'));
Then add a matching WHERE clause to your query with the containment operator @>:
SELECT DISTINCT -- why DISTINCT ?
       id, data->>'_id' AS json_id, data->>'email' AS email
     , friends->>'name' AS friend_name, parent->>'name' AS parent
FROM tbl t
CROSS JOIN jsonb_array_elements(t.data->'friends') friends
CROSS JOIN jsonb_array_elements(friends->'parent') parent
WHERE t.data->'friends' @> '[{"name": "Chan Franco", "parent": [{"name": "Hannah Golden"}]}]'
  AND friends->>'name' = 'Chan Franco'
  AND parent->>'name' = 'Hannah Golden';
db<>fiddle here
The huge difference: with the help of the index, Postgres can now identify matching rows before unnesting each and every nested "friends" array in the whole table. Only after matching rows in the underlying table have been identified is jsonb_array_elements() called, and only rows with qualifying array elements are kept.
Note that the search expression has to be valid JSON, matching the structure of the JSON array data->'friends', including the outer brackets []. But omit all key/value pairs that are not supposed to serve as filters.
Related:
Index for finding an element in a JSON array
I avoided the table name temp, as this is an SQL keyword that might lead to confusing errors, and used the name tbl instead.
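A possible refinement (an assumption about the workload, not part of the original answer): if the index only ever has to serve @> containment queries, the jsonb_path_ops operator class typically yields a smaller, faster index:
CREATE INDEX tbl_data_friends_path_gin_idx ON tbl
  USING gin ((data->'friends') jsonb_path_ops);
-- unlike the default jsonb_ops class, this supports only the @> operator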

Values of JSON exist in columns' JSONB array

I have a tricky query that attempts to find matches by comparing a JSON array of values against a JSONB array in a column.
The "Things" table with a "Keywords" column would contain something such as:
'["car", "house", "boat"]'::JSONB
The query would contain the values:
'["car", "house"]'::JSONB
I'd like to find all the rows that have BOTH "car" and "house" contained in the listing. Here's my (mostly) feeble attempt:
SELECT *
FROM "Things"
WHERE
  "Keywords"::JSONB ?| ARRAY(
    SELECT * FROM JSONB_ARRAY_ELEMENTS('["house","car"]'::JSONB)
  )::TEXT[]
Also, when it comes to indexing, I'm assuming that adding a GIST index would be my best option.
I'd like to find all the rows that have BOTH "car" and "house"
So the right operator is ?& - Do all of these array strings exist as top-level keys?
Your query is almost correct; change the operator and use jsonb_array_elements_text():
WITH "Things"("Keywords") AS (
VALUES
('["car", "house", "boat"]'::jsonb),
('["car", "boat"]'),
('["car", "house"]'),
('["house", "boat"]')
)
SELECT
*
FROM
"Things"
WHERE
"Keywords" ?& array(
SELECT jsonb_array_elements_text('["house","car"]')
)
Keywords
--------------------------
["car", "house", "boat"]
["car", "house"]
(2 rows)
The query would be simpler if the argument could be written down as a regular array of text:
SELECT *
FROM "Things"
WHERE
  "Keywords" ?& array['house', 'car']
In both cases you can use a GIN index:
The default GIN operator class for jsonb supports queries with the top-level key-exists operators ?, ?& and ?| (...)
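A minimal sketch of such an index (the index name is illustrative):
CREATE INDEX "Things_Keywords_gin_idx" ON "Things" USING GIN ("Keywords");
-- the default jsonb_ops operator class covers the ?, ?| and ?& queries above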
