Cypher - How to call a procedure multiple times in a loop?

In Neo4j Browser, I tried to call a procedure multiple times in a loop, but Neo4j reported the same error: Query cannot conclude with CALL (must be RETURN or an update clause). Specifically,
With UNWIND (documentation):
UNWIND [10, 20] AS age_num
MATCH (n:User {name: 'a', age: age_num})
CALL apoc.nodes.delete(n)
...got Neo.ClientError.Statement.SyntaxError:
Query cannot conclude with CALL (must be RETURN or an update clause) (line 3, column 1 (offset: 68))
"CALL apoc.nodes.delete(n)"
^
With apoc.periodic.iterate() (documentation):
CALL apoc.periodic.iterate(
"UNWIND [10, 20] AS age_num MATCH (n:User {name: 'a', age: age_num}) RETURN n",
"CALL apoc.nodes.delete(n)",
{batchMode: 'SINGLE', parallel: false}
)
...got errorMessages:
{
"Query cannot conclude with CALL (must be RETURN or an update clause) (line 1, column 15 (offset: 14))\r\n\" WITH $n AS n CALL apoc.nodes.delete(n)\"\r\n ^": 1
}
The procedure apoc.nodes.delete() here is just an example. Please don't advise me on using DETACH DELETE instead.
Question: In Cypher, how are you supposed to call a procedure multiple times in a loop, where each call might take a different parameter, e.g. a different property value?
Environment: Neo4j Desktop v4.0.4, Windows 8.1 x64.

You have to add a RETURN clause at the end of the query, as the error states. If the query consists of a single procedure call, Cypher won't complain. But if anything such as a MATCH precedes the procedure call, the query has to end with RETURN or an update clause. You could also just use the DETACH DELETE Cypher statement instead.
Version with DETACH DELETE:
UNWIND [10, 20] AS age_num
MATCH (n:User {name: 'a', age: age_num})
DETACH DELETE n
Version with APOC:
UNWIND [10, 20] AS age_num
MATCH (n:User {name: 'a', age: age_num})
CALL apoc.nodes.delete(n) YIELD value
RETURN distinct 'done'
Edit: I have fixed the output as per the OP comment

Related

Split a query with too many arguments in SQLAlchemy

In my program, when I have not run a database update for a long time and then try to update my data,
my SQLAlchemy script generates a PostgreSQL upsert query with >4000 params, each with >8 items.
When the query is executed with databases.Database.execute(query) I end up with this error:
asyncpg.exceptions._base.InterfaceError: the number of query arguments cannot exceed 32767
My idea is to automatically split the query based on the number of arguments as threshold and execute it in two parts and merge the results.
Do you have an idea how to resolve that problem?
I ended up checking the number of query arguments per row. Every dictionary in my list of argument dictionaries has the same keys, so the length of the first one gives the number of arguments per list item:
args_per_row = len(args_dict_list[0])
PSQL_QUERY_ALLOWED_MAX_ARGS = 32767
# Integer division: how many rows fit into one query without exceeding the limit.
allowed_args_per_query = PSQL_QUERY_ALLOWED_MAX_ARGS // args_per_row
Then I divided the args_dict_list into parts that have the size of allowed args per query:
query_args_sets = [
    args_dict_list[x:x + allowed_args_per_query]
    for x in range(0, len(args_dict_list), allowed_args_per_query)
]
and finally looped over the query_args_sets and generated and executed a separate query for each set:
for arg_set in query_args_sets:
    query = query_builder.build_upsert(values=arg_set)
    await database.execute(query)
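The three steps above can be folded into one helper. This is a minimal sketch, assuming every row dictionary has the same keys; `chunk_rows` and `MAX_QUERY_ARGS` are illustrative names, not part of SQLAlchemy, databases, or asyncpg:

```python
from typing import Any, Dict, Iterator, List

# Limit quoted in the asyncpg error message above.
MAX_QUERY_ARGS = 32767


def chunk_rows(
    rows: List[Dict[str, Any]], max_args: int = MAX_QUERY_ARGS
) -> Iterator[List[Dict[str, Any]]]:
    """Yield slices of rows whose total argument count stays within max_args."""
    if not rows:
        return
    # Assumption: all rows share the same keys, so the first row is representative.
    args_per_row = max(len(rows[0]), 1)
    rows_per_query = max_args // args_per_row
    if rows_per_query == 0:
        raise ValueError("a single row already exceeds the argument limit")
    for start in range(0, len(rows), rows_per_query):
        yield rows[start:start + rows_per_query]


# Example: 4000 rows with 9 columns each, i.e. 36000 arguments in total.
rows = [{f"col{i}": n for i in range(9)} for n in range(4000)]
chunks = list(chunk_rows(rows))
print([len(chunk) for chunk in chunks])
```

Each chunk could then be passed to query_builder.build_upsert(values=chunk) and executed as in the loop above.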

Looping inside a Postgres UPDATE query

(Postgres 10.10)
I have the following fields in my_table:
loval INTEGER
hival INTEGER
valcount INTEGER
values INTEGER[]
I need to set values to an array containing valcount random integers each between loval and hival inclusive. So for:
loval: 3
hival: 22
valcount: 6
I'm looking to set values to something like:
{3, 6, 6, 13, 17, 22}
I know how to do this with an inefficient "loop through the cursor" solution, but I'm wondering if Postgres has a way to do a looping computation inline.
Note: I looked at generate_series, but I don't think it produces what I need.
generate_series() is indeed the solution:
update my_table
set "values" = array(select (random() * (hival - loval) + loval)::int
from generate_series(1, valcount));
Note that values is a reserved keyword, so it's not a good idea to use it as a column name.
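One detail about the expression in the query above: in Postgres, random() returns a value in [0, 1) and casting a double to int rounds to the nearest integer, so loval and hival each come up about half as often as the values in between. The Python sketch below works through that arithmetic with exact fractions and compares it with a floor-based variant, floor(random() * (hival - loval + 1) + loval), which would weight every value equally. The function names are illustrative and it assumes loval < hival:

```python
from fractions import Fraction
from typing import Dict


def rounded_weights(lo: int, hi: int) -> Dict[int, Fraction]:
    """Probability of each result of round(u * (hi - lo) + lo), u uniform in [0, 1)."""
    if lo >= hi:
        raise ValueError("expected lo < hi")
    span = hi - lo
    half = Fraction(1, 2)
    weights = {}
    for v in range(lo, hi + 1):
        # v is produced when the unrounded value falls in [v - 0.5, v + 0.5),
        # clipped to the reachable interval [lo, hi).
        low = max(Fraction(lo), v - half)
        high = min(Fraction(hi), v + half)
        weights[v] = (high - low) / span
    return weights


def floored_weights(lo: int, hi: int) -> Dict[int, Fraction]:
    """Probability of each result of floor(u * (hi - lo + 1) + lo): uniform."""
    count = hi - lo + 1
    return {v: Fraction(1, count) for v in range(lo, hi + 1)}


w = rounded_weights(3, 22)
print(w[3], w[10], w[22])
```

For loval 3 and hival 22, the rounding version gives 3 and 22 a weight of 1/38 each and every other value 1/19, while the floor-based version gives all twenty values 1/20.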

How can I use an array as input for FILTER function in Google Spreadsheet?

So this might be trivial, but it's kinda hard to ask. I'd like to FILTER a range based on the results of another FILTER.
I'll try to explain from inside out (related to image below):
I use filter to find all names for given id (the results are joined in column B). This works fine and returns an array of values. This is the inner FILTER.
I want to use this array of names to find all values for them using another outer FILTER.
In other words: Find maximum value for all names for given id.
Here is what I've figured:
=MAX(FILTER(J:J, CONTAINS???(FILTER(G:G, F:F = A2), I:I)))
^--- imaginary function returning TRUE for every value in I
that is contained in the array
=MAX(FILTER(J:J, I:I = FILTER(G:G, F:F = A2)))
^--- equal does not work here when filter returns more than 1 value
=MAX(FILTER(J:J, REGEXMATCH(JOIN(",", FILTER(G:G, F:F = A2)), I:I)))
^--- this approach WORKS but is ineffective and slow on 10k+ cells
https://docs.google.com/spreadsheets/d/1k5lOUYMLebkmU7X2SLmzWGiDAVR3u3CSAF3dYZ_VnKE
I hope to find a better CONTAINS function than the REGEXMATCH/JOIN combo, or to do the task using another approach.
Try this in cell A2 (after you delete everything in the A2:C range):
=SORTN(SORT({INDIRECT("F2:F"&COUNTA(F2:F)+1),
TRANSPOSE(SUBSTITUTE(TRIM(QUERY(QUERY(QUERY({F2:G},
"select max(Col2) group by Col2 pivot Col1"), "offset 1"),,999^99)), " ", ",")),
VLOOKUP(INDIRECT("G2:G"&COUNTA(F2:F)+1), I2:J, 2, 0)}, 1, 1, 3, 0), 999^99, 2, 1, 1)

Redis: Get all score available for a sorted set

I need to get all the scores available in a Redis sorted set.
redis> ZADD myzset 10 "one"
(integer) 1
redis> ZADD myzset 20 "two"
(integer) 1
redis> ZADD myzset 30 "three"
(integer) 1
Now I want to retrieve all the scores for myzset, i.e. 10, 20, 30.
EDIT: Since your problem with the size of the values wasn't obvious before, I did some additional research.
According to the current documentation, there is no way to get just the scores from a sorted set.
What you'll need to do to get just the scores is to simultaneously add them to a separate set and get them from there when needed.
What you should probably do first though is to try to map your problem differently into data structures. I can't tell from your question why you'd need to get the scores, but there may be other ways to structure the problem that will map better to Redis.
--
I'm not sure there is any way to get all scores without getting the keys, but ZRANGE will at least get the information you're looking for:
redis> ZADD myzset 10 "one"
(integer) 1
redis> ZADD myzset 20 "two"
(integer) 1
redis> ZADD myzset 30 "three"
(integer) 1
redis> ZRANGE myzset 0 -1 WITHSCORES
["one","10","two","20","three","30"]
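If fetching the members along with the scores is acceptable, the scores can be picked out of that reply on the client. A minimal sketch in Python, assuming the reply arrives as a flat list of strings as shown above (some client libraries return member/score pairs instead); `unique_scores` is an illustrative name:

```python
from typing import List, Sequence


def unique_scores(flat_reply: Sequence[str]) -> List[float]:
    """Extract distinct scores from a flat [member, score, member, score, ...] reply."""
    seen = set()
    scores = []
    for raw in flat_reply[1::2]:  # scores sit at every second position
        score = float(raw)
        if score not in seen:  # several members may share one score
            seen.add(score)
            scores.append(score)
    return scores


reply = ["one", "10", "two", "20", "three", "30"]
print(unique_scores(reply))
```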
One way to address this problem is to use server-side Lua scripting.
Consider the following script:
local res = {}
local result = {}
local tmp = redis.call( 'zrange', KEYS[1], 0, -1, 'withscores' )
for i=1,#tmp,2 do
  res[tmp[i+1]]=true
end
for k,_ in pairs(res) do
  table.insert(result,k)
end
return result
You can execute it by using the EVAL command.
It uses the zrange command to extract the content of the zset (with scores), then builds a set (represented as a table in Lua) to remove redundant scores, and finally builds the reply table. So the values of the zset are never sent over the network.
This script has a flaw if the number of items in the zset is really high, because it copies the entire zset into a Lua object (so it takes memory). However, it is easy to alter it to iterate over the zset incrementally (20 items at a time). For instance:
local res = {}
local result = {}
local n = redis.call( 'zcard', KEYS[1] )
local i=0
while i<n do
  -- zrange bounds are inclusive, so i..i+19 fetches 20 items
  local tmp = redis.call( 'zrange', KEYS[1], i, i+19, 'withscores' )
  for j=1,#tmp,2 do
    res[tmp[j+1]]=true
    i = i + 1
  end
end
for k,_ in pairs(res) do
  table.insert(result,k)
end
return result
Please note I am a total newbie in Lua, so there are perhaps more elegant ways to achieve the same thing.
You need to pass the optional argument WITHSCORES. See documentation here:
ZREVRANGE key start stop [WITHSCORES] Return a range of members in a
sorted set, by index, with scores ordered from high to low
In Ruby, the following command will do it:
redis.zrange("zset", 0, -1, :with_scores => true)
# => [["a", 32.0], ["b", 64.0]]
source Ruby Docs

Solr, multivalued field: how can I return documents where ALL values in the field are contained within a set?

For example, if I have these 2 Documents:
id: 1
multifield: 2, 5
id: 2
multifield: 2, 5, 9
Then say I have a set that I'm querying with, which is {2, 5, 7}. What I would want is document 1 returned because 2 and 5 are both contained in the set. But document 2 should not be returned because 9 is not in the set.
Both the multivalued field and my set are of arbitrary length. Hopefully that makes sense.
Figured this out. This was the inspiration, specifically the answer suggesting to use Function Queries.
Using the same data in the question, I will add a calculated field to my documents which contains the number of values in my multivalued field.
id: 1
multifield: 2, 5
nummultifield: 2
id: 2
multifield: 2, 5, 9
nummultifield: 3
Then I'll use an frange with some function queries. For each item in my set, I'll use the termfreq function which will return 1 or 0. I will then sum up all of these values. Finally, if that sum equals the calculated field nummultifield, then I know that for that document, every value in the document is present in the set. Remember my set is 2,5,7 so my function query will look something like this:
fq={!frange l=0 u=0}sub( nummultifield, sum( termfreq(multifield,2), termfreq(multifield,5), termfreq(multifield,7)))
If we fill in the values for Document 1 and 2, it will look like this:
Document 1: sub( 2, sum( 1,1,0 ) ) = 0 ' in my range of {0,0} so Doc 1 is returned
Document 2: sub( 3, sum( 1,1,0 ) ) = 1 ' not in the range of {0,0} so not returned
I've tested it out and it works great. You need to make sure you don't duplicate any values in multifield or you'll get weird results. Incidentally, this trick of using frange could be used whenever you want to fake a boolean result from one or more function queries.
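Since the set can be of arbitrary length, the filter string has to be generated. The sketch below is one way to do that in Python, together with a small local check that mirrors the sub/sum/termfreq arithmetic. The function names are illustrative, the field names are the ones used in this answer, and the returned string is the unencoded fq value (it would still need URL encoding before being sent to Solr):

```python
from typing import Iterable, Set


def build_subset_fq(field: str, count_field: str, values: Iterable[int]) -> str:
    """Build the frange filter value for the given set of query values."""
    unique = sorted(set(values))  # each value should be counted only once
    terms = ", ".join(f"termfreq({field},{v})" for v in unique)
    return f"{{!frange l=0 u=0}}sub({count_field}, sum({terms}))"


def would_match(doc_values: Set[int], query_set: Set[int]) -> bool:
    """Mirror the filter: value count minus hits in the set must equal 0."""
    hits = sum(1 for v in query_set if v in doc_values)
    return len(doc_values) - hits == 0


print(build_subset_fq("multifield", "nummultifield", [2, 5, 7]))
print(would_match({2, 5}, {2, 5, 7}), would_match({2, 5, 9}, {2, 5, 7}))
```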
Faceting may be what you are looking for.
http://wiki.apache.org/solr/SolrFacetingOverview
http://www.lucidimagination.com/devzone/technical-articles/faceted-search-solr
how to search for more than one facet in solr?
I adapted this from the Lucid Imagination link.
Choose all documents that have values 2 or 5 or 7:
http://localhost:8983/solr/select?q=*
&facet=on
&facet.field=multifield
&fq=multifield:2
&fq=multifield:5
&fq=multifield:7
Incomplete: I don't know of any option to exclude all other values.
