I am confused: as I see it, it should not be deterministic, because there will be transitions for (a, z0/az0) going to q1 and (a, z0/epsilon) going to the final state. Is this true or not?
The Pact documentation clearly states how to select on a single condition in the where clause, but it is not so clear on how to select on multiple conditions, which seems much more general and important for real-world use cases than a single-condition example.
Pact-lang select row function link
For instance I was trying to select a set of dice throws across the room name and the current round.
(select 'throws (where (and ('room "somename") ('round 2)))
But this guy didn't resolve and the error was not so clear. How do I select across multiple conditions in the select function?
The first thing we tried was to simply select via a single clause which returns a list:
(select 'throws (where 'room (= "somename")))
A: [object{throw-schema},object{throw-schema}]
And then we applied the list operator "filter" to the result:
(filter (= 'with-read-function' 1) (select 'throws (where 'room (= "somename"))))
Please keep in mind we had a further function that read the round and spit back the round number and we filtered for equality of the round value.
This ended up working but it felt very janky.
The second thing we tried was to play around with the and syntax and we eventually found a nice way to express it although not quite so intuitive as we would have liked. It just took a little elbow grease.
The general syntax is:
(select 'throws (and? (where condition1...) (where condition2...)))
In this case and? takes two predicates, applies both to each row, and short-circuits, hence the ? in its name. We didn't think that we would have to declare where twice, but it's much cleaner than the filter method we first tried.
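To make the shape of that expression easier to see, here is a rough Python analogy (not Pact code). The helper names where, and_q and select, and the sample rows, are made up for illustration; the only point is that each where builds a predicate on a row and and_q combines two such predicates.

```python
# Python analogy only; these helpers are hypothetical and are not the Pact API.

def where(field, pred):
    """Build a row predicate that applies pred to row[field]."""
    return lambda row: pred(row[field])

def and_q(p, q):
    """Combine two row predicates with a short-circuiting 'and'."""
    return lambda row: p(row) and q(row)

def select(rows, pred):
    """Keep only the rows for which the predicate holds."""
    return [row for row in rows if pred(row)]

# Made-up sample data standing in for the 'throws table.
throws = [
    {"room": "somename", "round": 1, "value": 4},
    {"room": "somename", "round": 2, "value": 6},
    {"room": "other", "round": 2, "value": 3},
]

matching = select(
    throws,
    and_q(where("room", lambda v: v == "somename"),
          where("round", lambda v: v == 2)),
)
print(matching)
```

Each condition stays a separate where, which is why where appears twice in the Pact expression.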
The third thing we tried, at the direction of the Kadena team, was a function we had yet to look at: fold-db.
(let*
  ((qry (lambda (k obj) true)) ;; query predicate: accept every row
   (f (lambda (x) [(at 'firstName x), (at 'b x)])))
  (fold-db people (qry) (f)))
This actually is the most correct answer but it was not obvious from the initial scan and would be near inscrutable for a new user to put together with no pact experience.
We suggest a simple sentence ->
"For multiple conditions use fold-db function."
In the documentation.
This fooled us because we are so used to using SQL syntax that we didn't imagine that there was a nice function like this lying around and we got stuck in our ways trying to figure out conditional logic.
I have noticed that Apache Flink does not optimise the order in which tables are joined. At the moment, it keeps the user-specified join order (basically, it takes the query literally). I suppose that Apache Calcite can optimise the order of joins, but for some reason these rules are not in use in Apache Flink.
If, for example, we have two tables 'R' and 'S'
private val tableEnv: BatchTableEnvironment = TableEnvironment.getTableEnvironment(env)
private val fileNumber = 1
tableEnv.registerTableSource("R", getDataSourceR(fileNumber))
tableEnv.registerTableSource("S", getDataSourceS(fileNumber))
private val r = tableEnv.scan("R")
private val s = tableEnv.scan("S")
and we suppose that 'S' is empty and we want to join these tables in two ways:
val tableOne = r.as("x1, x2").join(r.as("x3, x4")).where("x2 === x3").select("x1, x4")
.join(s.as("x5, x6")).where("x4 === x5 ").select("x1, x6")
val tableTwo = s.as("x1, x2").join(r.as("x3, x4")).where("x2 === x3").select("x1, x4")
.join(r.as("x5, x6")).where("x4 === x5 ").select("x1, x6")
If we want to count the number of rows in tableOne and in tableTwo the result will be zero in both cases.
The problem is that evaluating tableOne will take much longer than evaluating tableTwo.
Is there any way to automatically optimise the order in which the joins are executed, or even to enable cost-based planning by adding some statistics? How can these statistics be added?
In the documentation at this link it is written that maybe it is necessary to change the Table environment CalciteConfig but it is not clear to me how to do it.
Please help.
Join reordering is not enabled because Flink does not handle statistics well. Reordering joins without somewhat accurate cardinality estimates is basically gambling. Therefore, join reordering is disabled and tables are joined in the order as provided by the user. This gives a deterministic and controllable behavior.
However, you can pass optimization rules into the optimizer by passing a TableConfig with a CalciteConfig when creating the TableEnvironment, i.e., TableEnvironment.getTableEnvironment(env, yourTableConfig). In the CalciteConfig you can add optimization rules to different optimization phases. You probably want to add JoinCommuteRule and JoinAssociateRule to the logical optimization phase. You probably also have to dig into the code to check how to pass statistics into the optimizer.
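To see why cardinalities matter for the two join orders in the question, here is a toy sketch in Python (not Flink code). It assumes a naive nested-loop join and counts compared row pairs as the cost; Flink's actual join strategies and cost model are different, and the table contents are made up.

```python
# Toy cost model: a nested-loop equi-join that also reports how many
# row pairs it had to compare. This is an illustration, not how Flink joins.

def join_count(left, right):
    """Join on left[1] == right[0], project (left[0], right[1])."""
    rows = [(l[0], r[1]) for l in left for r in right if l[1] == r[0]]
    comparisons = len(left) * len(right)
    return rows, comparisons

R = [(i, i + 1) for i in range(200)]  # made-up non-empty table
S = []                                # the empty table from the question

# tableOne: (R join R) join S
rr, c1 = join_count(R, R)
table_one, c2 = join_count(rr, S)
cost_one = c1 + c2

# tableTwo: (S join R) join R
sr, c3 = join_count(S, R)
table_two, c4 = join_count(sr, R)
cost_two = c3 + c4

print(len(table_one), len(table_two), cost_one, cost_two)
```

Both orders return no rows, but the first one pays for the full R-with-R join before it ever touches the empty table. An optimizer can only pick the cheaper order if it knows, or can estimate, that S is empty.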
With Neo4j, I want to select the users, the number of action movies each of them watched, and how many of those action movies were directed by Roland Emmerich.
I tried in various forms this query:
match (u:User)-[:watched]->(m:Movie)-[belongs_to]->
(Category{category_name:"Action"}) with count(m) as actionMovies, u
match (m:Movie)<-[directed]-(Director{director_name:"Roland Emmerich"})
return u, count(m) as MoviesFromRE, actionMovies
But the query execution never finishes. So I assume I'm doing something like a cross join.
Actually, I expected the first count to be independent of the second, since it is already calculated by the time the second match clause is evaluated.
Here is the relevant db view
Thanks for any suggestions and helps
OK, after getting rid of the typos and including the movies in the with clause:
match (u:User)-[:watched]->(m:Movie)-[:belongs_to]->
(:Category{category_name:"Action"}) with count(m) as actionMovies, u, m
match (m)<-[:directed]-(:Director{director_name:"Roland Emmerich"})
return u, count(m) as MoviesFromRE, actionMovies order by u.user_name
Now I always get 1 for actionMovies. I think it's because I now group by movie in the first with clause. I think I need a way to carry the movies from the first clause to the second without grouping my first result by them.
The reason you are always getting 1 for actionMovies is in this clause:
WITH COUNT(m) AS actionMovies, u, m
That clause is saying (in part): "count the number of m nodes for every unique pair of u and m nodes". That count must always be 1.
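The same grouping effect can be reproduced outside Cypher. The following Python sketch uses made-up (user, movie) pairs and collections.Counter to mimic the two grouping keys; it is an analogy, not how Neo4j evaluates the query.

```python
from collections import Counter

# Made-up (user, movie) pairs standing in for watched action movies.
watched = [("ann", "m1"), ("ann", "m2"), ("ann", "m3"), ("bob", "m1")]

# Like WITH COUNT(m) AS actionMovies, u, m: the grouping key is (u, m),
# so every group holds exactly one movie.
per_user_and_movie = Counter((u, m) for u, m in watched)

# Like WITH u, COUNT(m) AS actionMovies: the grouping key is u alone.
per_user = Counter(u for u, _ in watched)

print(sorted(per_user_and_movie.values()))
print(dict(per_user))
```

Every non-aggregated variable in a WITH clause becomes part of the grouping key, so adding m to the clause shrinks each group to a single movie.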
This query should work better:
MATCH (u:User)-[:watched]->(am:Movie)-[:belongs_to]->(:Category{category_name:"Action"})
WITH u, COLLECT(am) AS ams
UNWIND ams AS m
OPTIONAL MATCH (m)<-[dir:directed]-(:Director{director_name:"Roland Emmerich"})
RETURN u, COUNT(dir) AS MoviesFromRE, SIZE(ams) AS actionMovies
ORDER BY u.user_name;
Initially I was trying to find out why it is so slow to run a spatial query with multiple SDO_RELATE calls in a single SELECT statement like this one:
SELECT * FROM geom_table a
WHERE SDO_RELATE(a.geom_column, SDO_GEOMETRY(...), 'mask=inside')='TRUE' AND
SDO_RELATE(a.geom_column, SDO_GEOMETRY(...), 'mask=anyinteract')='TRUE';
Note that the two SDO_GEOMETRY values are not necessarily the same, so this is a bit different from SDO_RELATE(a.geom_column, the_same_geometry, 'mask=inside+anyinteract')='TRUE'.
Then I found this paragraph from oracle documentation for SDO_RELATE:
Although multiple masks can be combined using the logical Boolean
operator OR, for example, 'mask=touch+coveredby', better performance
may result if the spatial query specifies each mask individually and
uses the UNION ALL syntax to combine the results. This is due to
internal optimizations that Spatial can apply under certain conditions
when masks are specified singly rather than grouped within the same
SDO_RELATE operator call. (There are two exceptions, inside+coveredby
and contains+covers, where the combination performs better than the
UNION ALL alternative.) For example, consider the following query using the logical
Boolean operator OR to group multiple masks:
SELECT a.gid FROM polygons a, query_polys B WHERE B.gid = 1 AND
SDO_RELATE(A.Geometry, B.Geometry,
'mask=touch+coveredby') = 'TRUE';
The preceding query may result in better performance if it is
expressed as follows, using UNION ALL to combine results of multiple
SDO_RELATE operator calls, each with a single mask:
SELECT a.gid
FROM polygons a, query_polys B
WHERE B.gid = 1
AND SDO_RELATE(A.Geometry, B.Geometry,
'mask=touch') = 'TRUE'
UNION ALL
SELECT a.gid
FROM polygons a, query_polys B
WHERE B.gid = 1
AND SDO_RELATE(A.Geometry, B.Geometry,
'mask=coveredby') = 'TRUE';
This partly answers my question, but it still only says: "due to internal optimizations that Spatial can apply under certain conditions". So I have two questions:
What is meant by "internal optimizations"? Does it have something to do with the spatial index? (I'm not sure if I'm asking too much here; maybe only developers at Oracle know.)
The Oracle documentation doesn't say anything about my original problem, i.e. SDO_RELATE(..., 'mask=inside') AND SDO_RELATE(..., 'mask=anyinteract') in a single SELECT. Why does that also perform so badly? Does it work similarly to SDO_RELATE(..., 'mask=inside+anyinteract')?
I have a situation where I need to update votes for a candidate.
Citizens can vote for this candidate, with more than one vote per candidate. i.e. one person can vote 5 votes, while another person votes 2. In this case this candidate should get 7 votes.
I use Django, and here is what the pseudocode looks like:
votes = candidate.votes      # read the current total
votes += citizen.vote        # add this citizen's votes in Python
candidate.votes = votes      # write the new total back
candidate.save()
The problem, as you can see, is a race condition: the candidate's votes can be overwritten by another citizen's vote if that request did its select earlier and sets the value now.
How can I avoid this with an ORM like Django's?
If this is purely an arithmetic expression, then Django has a nice API called F expressions. From the docs:
Updating attributes based on existing fields
Sometimes you'll need to perform a simple arithmetic task on a field, such as incrementing or decrementing the current value. The obvious way to achieve this is to do something like:
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold += 1
>>> product.save()
If the old number_sold value retrieved from the database was 10, then the value of 11 will be written back to the database.
This can be optimized slightly by expressing the update relative to the original field value, rather than as an explicit assignment of a new value. Django provides F() expressions as a way of performing this kind of relative update. Using F() expressions, the previous example would be expressed as:
>>> from django.db.models import F
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold = F('number_sold') + 1
>>> product.save()
This approach doesn't use the initial value from the database. Instead, it makes the database do the update based on whatever value is current at the time that the save() is executed.
Once the object has been saved, you must reload the object in order to access the actual value that was applied to the updated field:
>>> product = Product.objects.get(pk=product.pk)
>>> print(product.number_sold)
42
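Applied to the voting question, the idea is that the increment happens inside a single UPDATE statement rather than in Python. The sketch below uses only the standard-library sqlite3 module to show the kind of SQL such a relative update boils down to; the candidate table, its columns, and the add_votes helper are assumptions made for illustration, not the asker's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical candidate table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidate (id INTEGER PRIMARY KEY, votes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO candidate (id, votes) VALUES (1, 0)")

def add_votes(conn, candidate_id, n):
    """Relative update: the database computes votes + n itself, so no
    previously read value is written back from application code."""
    conn.execute(
        "UPDATE candidate SET votes = votes + ? WHERE id = ?",
        (n, candidate_id),
    )
    conn.commit()

add_votes(conn, 1, 5)  # one citizen casts 5 votes
add_votes(conn, 1, 2)  # another citizen casts 2 votes

total = conn.execute("SELECT votes FROM candidate WHERE id = 1").fetchone()[0]
print(total)  # 7
```

In Django terms, and assuming a Candidate model with a votes field, the equivalent would be something like Candidate.objects.filter(pk=candidate.pk).update(votes=F('votes') + citizen.vote), which issues one UPDATE of this shape.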
Perhaps the select_for_update QuerySet method is helpful for you.
An excerpt from the docs:
All matched entries will be locked until the end of the transaction block, meaning that other transactions will be prevented from changing or acquiring locks on them.
Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.
Mind that this is only available in the Django development release (i.e. > 1.3).