How can I sort based on children? - database

Using Vapor and Fluent (PostgreSQL if that matters) I have entity B that has aID: Node (A is B's parent) to reference A and A has a one-to-many relationship with B. How can I make a query to fetch all A's sorted by the count of B's?
I want the result to look something like this:
All A's in DB
[
{
"id": 4,
"name": "Hi",
"bCount": 1000
},
{
"id": 3,
"name": "Another",
"bCount": 800
},
{
"id": 5,
"name": "Test",
"bCount": 30
}
]

Firstly,
Create a modal for A
Turn that JSON string to array of A
if you do this your sorting becomes as easy as -
array.sort { $0.bCount < $1.bCount }

This is going to be tricky to implement entirely in Fluent using Entity. Firstly, you will need to use raw SQL to get your bCount. Secondly, you will need to change your init(node:) to accept bCount, though it shouldn't be in your makeNode() because we don't want to create a stored database field for it.
Try this for your raw SQL (untested):
SELECT
A.*,
(
SELECT COUNT(*)
FROM B
WHERE B.aID = A.id
) AS bCount
FROM A
ORDER BY bCount
Then, run that query to get your A models.
var models: [A] = []
if let driver = drop.database?.driver as? PostgreSQLDriver {
if case .array(let array) = try driver.raw(sql) {
for result in array {
do {
var model = try A(node: result)
models.append(model)
}
}
}
}
As I said before, your init method on A will be receiving bCount so you will need to store it there.

Related

MongoDB update array of documents and replace by an array of replacement documents

Is it possible to bulk update (upsert) an array of documents with MongoDB by an array of replacement fields (documents)?
Basically to get rid of the for loop in this pseudo code example:
for user in users {
db.users.replaceOne(
{ "name" : user.name },
user,
{ "upsert": true }
}
The updateMany documentation only documents the following case where all documents are being updated in the same fashion:
db.collection.updateMany(
<query>,
{ $set: { status: "D" }, $inc: { quantity: 2 } },
...
)
I am trying to update (upsert) an array of documents where each document has it's own set of replacement fields:
updateOptions := options.UpdateOptions{}
updateOptions.SetUpsert(true)
updateOptions.SetBypassDocumentValidation(false)
_, error := collection.Col.UpdateMany(ctx, bson.M{"name": bson.M{"$in": names}}, bson.M{"$set": users}, &updateOptions)
Where users is an array of documents:
[
{ "name": "A", ...further fields},
{ "name": "B", ...further fields},
...
]
Apparently, $set cannot be used for this case since I receive the following error: Error while bulk writing *v1.UserCollection (FailedToParse) Modifiers operate on fields but we found type array instead.
Any help is highly appreciated!
You may use Collection.BulkWrite().
Since you want to update each document differently, you have to prepare a different mongo.WriteModel for each document update.
You may use mongo.ReplaceOneModel for individual document replaces. You may construct them like this:
wm := make([]mongo.WriteModel, len(users)
for i, user := range users {
wm[i] = mongo.NewReplaceOneModel().
SetUpsert(true).
SetFilter(bson.M{"name": user.name}).
SetReplacement(user)
}
And you may execute all the replaces with one call like this:
res, err := coll.BulkWrite(ctx, wm)
Yes, here we too have a loop, but that is to prepare the write tasks we want to carry out. All of them is sent to the database with a single call, and the database is "free" to carry them out in parallel if possible. This is likely to be significantly faster that calling Collection.ReplaceOne() for each document individually.

Getting values from json array using an array of object and keys in Python

I'm a Python newbie and I'm trying to write a script to extract json keys by passing the keys dinamically, reading them from a csv.
First of all this is my first post and I'm sorry if my questions are banals and if the code is incomplete but it's just a pseudo code to understand the problem (I hope not to complicate it...)
The following partial code retrieves the values from three key (group, user and id or username) but I'd like to load the objects and key from a csv to make them dinamicals.
Input json
{
"fullname": "The Full Name",
"group": {
"user": {
"id": 1,
"username": "John Doe"
},
"location": {
"x": "1234567",
"y": "9876543"
}
},
"color": {
"code": "ffffff",
"type" : "plastic"
}
}
Python code...
...
url = urlopen(jsonFile)
data = json.loads(url.read())
id = (data["group"]["user"]["id"])
username = (data["group"]["user"]["username"])
...
File.csv loaded into an array. Each line contains one or more keys.
fullname;
group,user,id;
group,user,username;
group,location,x;
group,location,y;
color,code;
The questions are: can I use a variable containing the object or key to be extract?
And how can I specify how many keys there are in the keys array to put them into the data([ ][ ]...) using only one line?
Something like this pseudo code:
...
url = urlopen(jsonFile)
data = json.loads(url.read())
...
keys = line.split(',')
...
# using keys[] to identify the objects and keys
value = (data[keys[0]][keys[1]][keys[2]])
...
But the line value = (data[keys[0]][keys[1]][keys[2]]) should have the exact number of the keys per line read from the csv.
Or I must to make some "if" lines like these?:
...
if len(keys) == 3:
value = (data[keys[0]][keys[1]][keys[2]])
if len(keys) == 2:
value = (data[keys[0]][keys[1]])
...
Many thanks!
I'm not sure I completely understand your question, but I would suggest you to try and play with pandas. It might be as easy as this:
import pandas as pd
df = pd.read_json(<yourJsonFile>, orient='columns')
name = df.fullname[0]
group_user = df.group.user
group_location = df.group.location
color_type = df.color.type
color_code = df.color.code
(Where group_user and group_location will be python dictionaries).

How to update PostgreSQL array of jsonb

I have a table like:
id: integer,
... other stuff...,
comments: array of jsonb
where the comments column has the following structure:
[{
"uid": "comment_1",
"message": "level 1 - comment 1",
"comments": [{
"uid": "subcomment_1",
"message": "level 2 - comment 1",
"comments": []
}, {
"uid": "subcomment_2",
"message": "level 1 - comment 2",
"comments": []
}]
},
{
"uid": "P7D1hbRq4",
"message": "level 1 - comment 2",
"comments": []
}
]
I need to update a particular field, for example:comments[1](with uid = comment_1) -> comments[2] (with uid = subcomment_2) -> message = 'comment edited'.
I'm brand new to postgresql and I can't figure it out how to do this, not even close. I manage to merge objects and change message for level 1 with:
UPDATE tasks
set comments[1] = comments[1]::jsonb || $$
{
"message": "something",
}$$::jsonb
where id = 20;
but that's as far as I could go.
Any hints towards the right direction?
LE:
I got this far:
UPDATE tasks
set comments[1] = jsonb_set(comments[1], '{comments,1, message}', '"test_new"')
where id = 20;
Sure, I can get this path from javascript but it's that a best practice? Not feeling comfortable using indexes from javascript arrays.
Should I try to write a sql function to get the array and use the 'uid' as key? Any other simpler way to search/select using the 'uid' ?
LLE
I can't get it to work using suggestion at:this question (which I read and tried)
Code bellow returns nothing:
-- get index for level 2
select pos as elem_index
from tasks,
jsonb_array_elements(comments[0]->'comments') with ordinality arr(elem, pos)
where tasks.id = 20 and
elem ->>'uid'='subcomment_1';
and I need it for several levels so it's not quite a duplicate.
First, you cannot update a part of a column (an element of an array) but only a column as a whole.
Next, you should understand what the path (the second argument of the jsonb_set() function) means.
Last, the third argument of the function is a valid json, so a simple text value must be enclosed in both single and double quotes.
update tasks
set comments = jsonb_set(comments, '{0, comments, 1, message}', '"comment edited"')
where id = 1;
Path:
0 - the first element of the outer array (elements are indexed from
0)
comments - an object with key comments
1 - the second element of
the comments array
message - an object message in the above
element.
See Db<>fiddle.

How to transform a JSON array nested inside an object inside another array in Postgres?

I'm using Postgres 9.6 and have a JSON field called credits with the following structure; A list of credits, each with a position and multiple people that can be in that position.
[
{
"position": "Set Designers",
people: [
"Joe Blow",
"Tom Thumb"
]
}
]
I need to transform the nested people array, which are currently just strings representing their names, into objects that have a name and image_url field, like this
[
{
"position": "Set Designers",
people: [
{ "name": "Joe Blow", "image_url": "" },
{ "name": "Tom Thumb", "image_url": "" }
]
}
]
So far I've only been able to find decent examples of doing this on either the parent JSON array or on an array field nested inside a single JSON object.
So far this is all I've been able to manage and even it is mangling the result.
UPDATE campaigns
SET credits = (
SELECT jsonb_build_array(el)
FROM jsonb_array_elements(credits::jsonb) AS el
)::jsonb
;
Create an auxiliary function to simplify the rather complex operation:
create or replace function transform_my_array(arr jsonb)
returns jsonb language sql as $$
select case when coalesce(arr, '[]') = '[]' then '[]'
else jsonb_agg(jsonb_build_object('name', value, 'image_url', '')) end
from jsonb_array_elements(arr)
$$;
With the function the update is not so horrible:
update campaigns
set credits = (
select jsonb_agg(jsonb_set(el, '{people}', transform_my_array(el->'people')))
from jsonb_array_elements(credits::jsonb) as el
)::jsonb
;
Working example in rextester.

Multiple sorting in ArangoDB

My webapp needs to display several sorted lists of document attributes in a graph. These are hours, cycles, and age.
I have an AQL query that beautifully traverses the graph and gets me all the data my app needs in 2 ms. I'm very impressed! But I need it sorted for each graph. The query currently returns an array of json objects that contain all three of the attributes and the id for which they apply. Awesome. The query also very easily sorts on one of the attributes.
My problem is: I need to have a sorted list of all three, and would prefer not to query the database three times since the data is all in the same documents my traversal returned.
I would like to return three sorted arrays of json objects: one containing hours and the id, one containing cycles and the id, and one containing age and the id. This way, my graphs can easily display all three graphs without client-side sorting.
HTTP requests themselves are time consuming although the database is very fast, which is why I'd like to pull all three at once, as the data itself is small.
My current query is a simple graph traversal:
for v, e, p in outbound startNode graph 'myGraph'
filters & definitions...
sort v.hours desc
return {"hours": v.hours, "cycles": v.cycles, "age": v.age, "id": v.id}
Is there an easy way I can tell Arango to return me this structure?
{
[
{
"id": 47,
"hours": 123
},
{
"id": 23,
"hours": 105
}...
],
[
{
"id": 47,
"cycles": 18
},
{
"id": 23,
"cycles": 5
}...
],
[
{
"id": 47,
"age": 4.2
},
{
"id": 23,
"age": 0.9
}
]
}
Although the traversal is fast, I would prefer if I didn't have to re-traverse the graph three times to do it, if possible.
My solution:
let data = (for v, e, p in outbound startNode graph 'myGraph'
filters & definitions...
return {"hours": v.hours, "cycles": v.cycles, "age": v.age, "id": v.id})
let byHours = (for thing in data
sort thing.hours desc
return {"hours": thing.hours, "id": thing.id})
let byCycles = (for thing in data
sort thing.cycles desc
return {"cycles": thing.cycles, "id": thing.id})
let byAge = (for thing in data
sort thing.age desc
return {"age": thing.age, "id": thing.id})
return {"hours": byHours, "cycles": byCycles, "age": byAge}
I'm not sure how this compares against your solution performance-wise, but the most obvious solution would be to traverse once and then create three sorted results like this:
LET nodes = (
FOR v, e, p IN OUTBOUND startNode GRAPH 'myGraph'
FILTER ...
RETURN v
)
RETURN {
hours: (
FOR n IN nodes
SORT n.hours DESC
RETURN KEEP(n, ['hours', 'id'])
),
cycles: (
FOR n IN nodes
SORT n.cycles DESC
RETURN KEEP(n, ['cycles', 'id'])
),
age: (
FOR n IN nodes
SORT n.age DESC
RETURN KEEP(n, ['age', 'id'])
)
}
This would traverse the graph only once but sort the result three times.

Resources