I have a jsonb-type column called 'payload' on a table called 'tweets'.
SELECT "payload" FROM "tweets";
payload
-------
"{\"id\":\"1596568061574008832\", ...lots more attributes}"
(1 row)
BUT
SELECT "payload" FROM "tweets" WHERE "payload"->>'id'='1596568061574008832';
payload
-------
(0 rows)
I have verified that the column type is indeed jsonb by running \d+ tweets. Storage on that column is "extended", if that helps.
In addition, I have followed multiple tutorials (I rarely interface with postgres directly, usually through an ORM), such as this one and this one, to no avail.
What am I doing wrong?
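One thing to check: the escaped output above suggests the stored jsonb value may be a JSON string that itself contains JSON (i.e. double-encoded), in which case ->>'id' finds nothing on the outer string. A diagnostic sketch under that assumption (not a confirmed fix):
SELECT jsonb_typeof(payload) FROM tweets LIMIT 1;  -- 'string' would confirm double encoding
-- if confirmed, unwrap the string once and query the inner object
SELECT payload
FROM tweets
WHERE (payload #>> '{}')::jsonb ->> 'id' = '1596568061574008832';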
In a project I have a table called documents from which I need to select all records, pairing each documents.id with each entry in documents.recipients.
The result set I need looks like this:
 documentId | recipientId
------------+------------------------
 1          | gmiG_duuBQOX6WblPXpUk
 1          | TQ7o1lBsrfDPtBeqGnyYt
 2          | gmiG_duuBQOX6WblPXpUk
 2          | TQ7o1lBsrfDPtBeqGnyYt
The problem, as you can see from the table screenshot, is that recipients is an array, and I have no idea how to do this matching.
A document can have many recipients, not only the 2 shown in this dummy table, and I have to match every document to one recipient ID.
E.g. [{r1,d1},{r1,d2},{r2,d3}]
I need this result so that later I can construct a payload for an API where I have to send an object like:
{
  "phone": null,
  "email": "foo@bar.com",
  "recipients": [{r1,d1},{r1,d2},{r2,d3}]
}
However, the above object was added only for completeness of information.
The help I need is to construct a PostgreSQL query that produces the result described above and resolves the issue of recipients being an array.
I've been trying the following, but I haven't really understood how to do it:
select
id,
recipients
from
documents d
where
recipients -> '0' = 'gmiG_duuBQOX6WblPXpUk'
Edit: I added a screenshot of this table's properties.
Use the containment operator @> to find all rows where recipients contains a certain recipient:
SELECT d.id, 'gmiG_duuBQOX6WblPXpUk' AS recipient_id
FROM documents d
WHERE d.recipients @> ARRAY['gmiG_duuBQOX6WblPXpUk'];
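If the goal is the full documentId/recipientId listing from the question rather than filtering on a single recipient, the array can be expanded into one row per element. A minimal sketch, assuming recipients is a text[] column (if it is jsonb, use jsonb_array_elements_text(d.recipients) instead of unnest(d.recipients)):
-- one output row per (document, recipient) pair
SELECT d.id AS document_id,
       r.recipient_id
FROM documents d
CROSS JOIN LATERAL unnest(d.recipients) AS r(recipient_id);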
I have a Postgres 10 database in my Flask app. I'm trying to paginate filtered results on a table with millions of rows. The problem is that the paginate method counts the total number of query results in a totally inefficient way.
Here's an example with a dummy filter:
paginate = Buildings.query.filter(Buildings.height > 10).paginate(1, 10)
Under the hood it performs 2 queries:
SELECT * FROM buildings where height > 10
SELECT count(*) FROM (
SELECT * FROM buildings where height > 10
) AS anon_1
The count returns 200,000 rows.
The problem is that the count on the raw select without the subquery is quite fast (~30 ms), but the paginate method wraps it in a subquery, which takes ~30 s.
The query plan on a cold database:
Is there a way to use the default paginate method from Flask-SQLAlchemy in a performant way?
EDIT:
To give a better understanding of my problem, here are the real filter operations used in my case, but with dummy field names:
paginate = Buildings.query.filter_by(owner_id=None).filter(Buildings.address.like('%A%')).paginate(1,10)
So the SQL the ORM produces is:
SELECT count(*) AS count_1
FROM (SELECT foo_column, [...]
FROM buildings
WHERE buildings.owner_id IS NULL AND buildings.address LIKE '%A%' ) AS anon_1
That query is already optimized with indexes created by:
CREATE INDEX ix_trgm_buildings_address ON public.buildings USING gin (address gin_trgm_ops);
CREATE INDEX ix_buildings_owner_id ON public.buildings USING btree (owner_id);
The problem is just this count function, which is very slow.
So it looks like a disk-reading problem. The solutions would be to get faster disks, to get more RAM if it can all be cached, or, if you already have enough RAM, to use pg_prewarm to get all the data into the cache ahead of need. Or try increasing effective_io_concurrency, so that the bitmap heap scan can have more than one I/O request outstanding at a time.
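A quick sketch of those last two suggestions (assuming the pg_prewarm extension is available and the table is named buildings):
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('buildings');        -- read the whole table into shared buffers
SET effective_io_concurrency = 200;    -- allow more concurrent I/O during bitmap heap scans (session setting)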
Your actual query seems to be more complex than the one you show, based on the Filter: entry and on the Rows Removed by Index Recheck: entry, in combination with the lack of lossy blocks. There might be some other things to try, but we would need to see the real query and the index definition (which apparently is not just an ordinary btree index on "height").
I have a Postgres table posts with a column of type jsonb which is basically a flat array of tags.
What I need to do is somehow run a LIKE query on the elements of that tags column so that I can find posts that have a tag beginning with some partial string.
Is such a thing possible in Postgres? I keep finding super complex examples and no one ever describes such a basic and simple scenario.
My current code works fine for checking if there are posts having specific tags:
select * from posts where tags @> '"TAG"'
and I'm looking for a way of running something along the lines of
select * from posts where tags @> '"%TAG%"'
SELECT *
FROM posts p
WHERE EXISTS (
SELECT FROM jsonb_array_elements_text(p.tags) tag
WHERE tag LIKE '%TAG%'
);
Related, with explanation:
Search a JSON array for an object containing a value matching a pattern
Or simpler with the @? operator since Postgres 12 implemented SQL/JSON:
SELECT *
-- optional to show the matching item:
-- , jsonb_path_query_first(tags, '$[*] ? (@ like_regex "^tag" flag "i")')
FROM posts
WHERE tags @? '$[*] ? (@ like_regex "TAG")';
The operator @? is just a wrapper around the function jsonb_path_exists(). So this is equivalent:
...
WHERE jsonb_path_exists(tags, '$[*] ? (@ like_regex "TAG")');
Neither has index support. (It may be added for the @? operator later, but it's not there in Postgres 13 yet.) So those queries are slow for big tables. A normalized design, like Laurenz already suggested, would be superior - with a trigram index:
PostgreSQL LIKE query performance variations
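A rough sketch of that normalized alternative (hypothetical post_tags table with one row per tag; requires the pg_trgm extension):
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE post_tags (post_id int, tag text);
CREATE INDEX post_tags_tag_trgm_idx ON post_tags USING gin (tag gin_trgm_ops);
-- the trigram index can support LIKE even with a leading wildcard
SELECT DISTINCT post_id
FROM post_tags
WHERE tag LIKE '%TAG%';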
For just prefix matching (LIKE 'TAG%', no leading wildcard), you could make it work with a full text index:
CREATE INDEX posts_tags_fts_gin_idx ON posts USING GIN (to_tsvector('simple', tags));
And a matching query:
SELECT *
FROM posts p
WHERE to_tsvector('simple', tags) @@ 'TAG:*'::tsquery
Or use the english dictionary instead of simple (or whatever fits your case) if you want stemming for natural English language.
to_tsvector(json(b)) requires Postgres 10 or later.
Related:
Get partial match from GIN indexed TSVECTOR column
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
I am going to express the idea in SQL:
SELECT key,value
FROM table1
WHERE value > 10
Or do we always need to know the key?
I suppose you can use secondary indexes, which have been available since version 0.7 of Cassandra.
You might also check out the following answer: Cassandra and Secondary-Indexes, how do they work internally?
It is recommended to use secondary indexes only for low-cardinality columns, i.e. columns that do not have many different values (e.g. columns like 'status' or 'priority', which usually have only a handful of different values such as 'high', 'medium', 'low').
In case you are using Hector as your Cassandra client, you can find information on how to use them here:
https://github.com/rantav/hector/wiki/User-Guide
Yes, of course. For example, you can use:
select * from CF where value = 10
If you use the Hector API (e.g. CqlQuery), you can get a list of rows back from this query.
Note: currently for secondary indexes you must have at least one equality condition, so your query with just value > 10 would not work. See this question
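A small CQL sketch of the points above (hypothetical column family users with an indexed value column):
CREATE INDEX users_value_idx ON users (value);
-- works: the indexed column appears in an equality condition
SELECT * FROM users WHERE value = 10;
-- would not work on its own: secondary index queries need at least one
-- equality condition, so a pure range predicate such as value > 10 is rejected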
Cassandra uses a timestamp system to serve up the most recent records. How do I display a list of all values & timestamps for a particular column?
For example, I run this command for a Column family called 'Users':
set Users[jsmith][first]='John'
When I get the 'first' column, I see the following:
get Users[jsmith][first]
=> (column=first, value=John, timestamp=1287604215498000)
Then, I update the 'first' column to Charlie.
set Users[jsmith][first]='Charlie'
I will now see the following
get Users[jsmith][first]
=> (column=first, value=Charlie, timestamp=1299980101189000)
My question is how do I get all values (over time) for this column? I want to see something like get Users[jsmith][first] ==> John (timestamp), Charlie (timestamp).
You don't. Cassandra departs from the BigTable model here: only the most recent version is retained.