I have a map/reduce view:
emit([doc.State, doc.City, doc.ZipCode.toString()], [doc.TotalLoadWeight]);
I want to search it using keys, e.g. key=["Arizona", "Chandler"].
This works if my key is a single value, but is searching possible when the keys are arrays?
If you want to use multiple keys in your search, you should use the keys parameter, which takes a JSON array. You'd then have keys=[["Arizona", "Chandler"],["xxx", "yyy"]]. The view will then return all entries whose key matches either ["Arizona", "Chandler"] or ["xxx", "yyy"].
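For illustration, a minimal sketch with Python's requests library; CouchDB also accepts the same keys array as a JSON body in a POST to the view. The database, design document, and view names below are placeholders, not taken from the question.

import requests

# POST the keys as a JSON body; CouchDB returns one row per matching key.
resp = requests.post(
    "http://localhost:5984/loads/_design/report/_view/by_location",
    json={"keys": [["Arizona", "Chandler"], ["xxx", "yyy"]]},
)
rows = resp.json()["rows"]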
I usually use a jsonb field to store array data.
For example, to store customers' barcode info I create a table like this:
create table customers(fcustomerid bigint, fcodes jsonb);
One customer has one row, and all barcode info is stored in its fcodes field, just like below:
[
{
"barcode":"000000001",
"codeid":1,
"product":"Coca Cola",
"createdate":"2021-01-19",
"lottorry":true,
"lottdate":"2021-01-20",
"bonus":50
},
{
"barcode":"000000002",
"codeid":2,
"product":"Coca Cola",
"createdate":"2021-01-19",
"lottorry":false,
"lottdate":"",
"bonus":0
}
...
{
"barcode":"000500000",
"codeid":500000,
"product":"Pepsi Cola",
"createdate":"2021-01-19",
"lottorry":false,
"lottdate":"",
"bonus":0
}
]
The jsonb array may store millions of barcode objects with the same structure. Perhaps this is not a good idea, but when I have thousands of customers I can keep all the data in one table: one customer has one row, and all of that customer's data is stored in one field. It looks very compact and easy to manage.
For this kind of application scenario, how can I efficiently insert, modify, or query the data?
I can use jsonb_insert to insert one object, just like:
update customers
set fcodes=jsonb_insert(fcodes,'{-1}','{...}'::jsonb)
where fcustomerid=999;
When I want to modify an object, I found it is a little difficult: I have to know the index of the object first. If I use the incremental key codeid as the array index, things look easier, and I can use jsonb_set, just like below:
update customers
set fcodes=jsonb_set(fcodes, array[(mycodeid-1)::text, 'lottorry'], 'true'::jsonb)
where fcustomerid=999;
But if I want to query the objects in the jsonb array by createdate or bonus or lottorry or product, I have to use a jsonpath operator, just like:
select jsonb_path_query_array(fcodes, '$[*] ? (@.product == "Pepsi Cola")')
from customers
where fcustomerid=999;
or like:
select jsonb_path_query_array(fcodes, '$[*] ? (@.lottdate.datetime() >= "2021-01-01".datetime() && @.lottdate.datetime() <= "2021-01-31".datetime())')
from customers
where fcustomerid=999;
The jsonb index looks useful, but it helps when searching across different rows, and my operations mostly work inside one row's jsonb field.
I am very worried about efficiency. For millions of objects stored in one row's jsonb field, is this a good idea? And how can I improve efficiency in this scenario, especially for queries?
You are right to worry. With a huge JSON like that, you will never get good performance.
Your data don't need JSON at all. Create a table that stores a single barcode and has a foreign key reference to customers. Then everything will be simple and efficient.
Using JSON in the database is almost always the wrong choice, judging from the questions in this forum.
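For illustration, a rough sketch of that normalized design, using psycopg2. It assumes fcustomerid is (or can be made) the primary key of customers; the connection string is a placeholder.

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
cur = conn.cursor()

# One row per barcode, linked to the owning customer.
cur.execute("""
    create table if not exists barcodes (
        fcustomerid bigint references customers(fcustomerid),
        codeid      bigint,
        barcode     text,
        product     text,
        createdate  date,
        lottorry    boolean,
        lottdate    date,
        bonus       integer,
        primary key (fcustomerid, codeid)
    )
""")

# The jsonpath query from the question becomes a plain WHERE clause,
# which ordinary indexes can support.
cur.execute(
    "select barcode, bonus from barcodes where fcustomerid = %s and product = %s",
    (999, "Pepsi Cola"),
)
rows = cur.fetchall()
conn.commit()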
I'm using a PostgreSQL database, which allows an array datatype; in addition, Django provides PostgreSQL-specific model fields for it.
My question is: how can I filter objects based on the last element of the array?
from django.contrib.postgres.fields import ArrayField
from django.db import models

class Example(models.Model):
    tags = ArrayField(models.CharField(...))

example = Example.objects.create(tags=['tag1', 'tag2', 'tag3'])
example_tag3 = Example.objects.filter(tags__2='tag3')
I want to filter, but I don't know the size of the tags array. Is there any dynamic filtering, something like:
example_tag3 = Example.objects.filter(tags__last='tag3')
I don't think there is a way to do that without "killing the performance" other than using raw SQL (see this; a sketch follows after the quoted tip). But you should avoid doing things like this; from the docs:
Tip: Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
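A hedged sketch of the raw-SQL escape hatch mentioned above; "myapp_example" is only a guess at Django's default table name for the Example model.

# PostgreSQL arrays are 1-indexed, so tags[array_length(tags, 1)] is the
# last element of the array.
last_is_tag3 = Example.objects.raw(
    "SELECT * FROM myapp_example WHERE tags[array_length(tags, 1)] = %s",
    ["tag3"],
)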
Adding to the above answer and comment, if changing the table structure isn't an option, you may filter your query based on the first element in an array by using field__0:
example_tag3 = Example.objects.filter(tags__0='tag1')
However, I don't see a way to access the last element directly in the documentation.
Is it a good idea to save multiple addresses in a jsonb field in PostgreSQL? I'm new to NoSQL and I'd like to test PostgreSQL for that. I don't want to have another table for addresses; I prefer to keep them in the same table.
But I'm in doubt: I've seen that PostgreSQL has jsonb and jsonb[].
Which one is better to store multiple addresses?
If I use jsonb, I think I must add a prefix to every field, like this:
"1_adresse_line-1"
"1_adresse_line-2"
"1_postalcode"
"2_adresse_line-1"
"2_adresse_line-2"
"2_postalcode"
"3_adresse_line-1"
"3_adresse_line-2"
"3_postalcode"
etc.
Is it better to use jsonb[]? How does it work?
Use a jsonb (not jsonb[]!) column with the structure like this:
select
'[{
"adresse_line-1": "a11",
"adresse_line-2": "a12",
"postalcode": "code1"
},
{
"adresse_line-1": "a21",
"adresse_line-2": "a22",
"postalcode": "code2"
}
]'::jsonb;
Though, a regular table related to the main one is a better option.
Why not jsonb[]? Take a look at the JSON definition:
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
In a jsonb column you can therefore store an array of objects. Attempts to use an array of jsonb are probably due to a misunderstanding of this data type. I have never seen a reasonable need for such a solution.
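As an illustration of querying such a column, a minimal sketch with psycopg2, assuming a hypothetical customers table with an addresses jsonb column holding the array shown above (names and connection string are placeholders).

import json
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
cur = conn.cursor()

# The containment operator @> matches rows where some object in the array
# has this postal code; it can be supported by a GIN index on addresses.
cur.execute(
    "select * from customers where addresses @> %s::jsonb",
    (json.dumps([{"postalcode": "code2"}]),),
)
rows = cur.fetchall()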
I have a listing site with a model that has many properties on which I would like to use filters. I would like to use memcache and a cursor for querying, e.g.:
results = Model.all().filter("x =", a).filter("y =", b).with_cursor(cursor).fetch(20)
How should I handle the cursor and pagination when the user changes the filter criteria, e.g.
from `x=a` to `x=c`?
Should I store the cursor with key = query string? But then the query string changes with page numbers :( . I guess I will need to parse the query string, remove the page numbers, and use that as the key for the cursor. Is that how I should do it?
You can make a hash of your current filter and pass it to the view, where it can be stored as a hidden field, like <input type="hidden" name="prev_query" value="{{query_hash}}"/>.
On the next request you check that the current filter's hash equals the one passed as a parameter.
The 'hash' may be an md5 of your filter params, or just a concatenation of them.
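A minimal sketch of such a hash; the parameter names are hypothetical, and sorting keeps the hash stable regardless of the order the parameters arrive in.

import hashlib

def filter_hash(filters):
    # e.g. filters = {"x": "a", "y": "b"}
    canonical = "&".join("%s=%s" % (k, filters[k]) for k in sorted(filters))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()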
Think of a cursor like a bookmark, holding place in a result set. Cursors are specific to the query they are for. You can't use the same cursor for two different queries - that would be akin to expecting a bookmark from one book to show you how far through you are in another.
If you want to store cursors elsewhere, you'll need to key them by the filter criteria so you can look up the appropriate one. Memcache is a poor choice, though, as elements may be evicted at any time. Why not just make the cursor part of the 'next page' URL?
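A hedged sketch of that idea, using the old google.appengine.ext.db API from the question (Python 2 on App Engine). Model is the model from the question; the handler path and parameter names are made up. The point is that the cursor travels in the next-page URL together with the exact filter values it was produced for.

import urllib

def fetch_page(a, b, cursor=None):
    query = Model.all().filter("x =", a).filter("y =", b)
    if cursor:
        query.with_cursor(cursor)  # resume where the previous page ended
    results = query.fetch(20)
    next_url = "/listing?" + urllib.urlencode({
        "x": a,
        "y": b,
        "cursor": query.cursor(),  # only valid for this exact query
    })
    return results, next_url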
My structure
cat:id:name -> name of category
cat:id:subcats -> set of subcategories
cat:list -> list of category ids
The following gives me a list of cat ids:
lrange cat:list 0 -1
Do I have to iterate over each id from the above command to get the name field in my script? That seems inefficient. How can I get a list of category names from Redis?
There are a couple different approaches. You may want to have the values in the list be delimited/encoded strings that contain both the id, the name, and any other value you need quick access to. I recommend JSON for interoperability and efficient string length, but there are other formats which are more performant.
Another option is to iterate, like you said. You can make this more efficient by getting all your keys in a single request and then using MGET, pipelining, or MULTI/EXEC to fetch all the names in a single, efficient operation.
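For example, a minimal sketch with the redis-py client (connection details omitted); one LRANGE for the ids, one MGET for all the names.

import redis

r = redis.Redis(decode_responses=True)

ids = r.lrange("cat:list", 0, -1)  # all category ids
names = r.mget(["cat:%s:name" % i for i in ids]) if ids else []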