I want to paginate a union query in CakePHP 3.0.0. By using a custom finder, I have it working almost perfectly, but I can't find any way to get limit and offset to apply to the union, rather than either of the subqueries.
In other words, this code:
$articlesQuery = $articles->find('all');
$commentsQuery = $comments->find('all');
$unionQuery = $articlesQuery->unionAll($commentsQuery);
$unionQuery->limit(7)->offset(7); // never mind the weirdness of applying this manually
produces this query:
(SELECT {article stuff} ORDER BY created DESC LIMIT 7 OFFSET 7)
UNION ALL
(SELECT {comment stuff})
instead of what I want, which is this:
(SELECT {article stuff})
UNION ALL
(SELECT {comment stuff})
ORDER BY created DESC LIMIT 7 OFFSET 7
I could manually construct the correct query string like this:
$unionQuery = $articlesQuery->unionAll($commentsQuery);
$sql = $unionQuery->sql();
$sql = "($sql) ORDER BY created DESC LIMIT 7 OFFSET 7";
but my custom finder method needs to return a \Cake\Database\Query object, not a string.
So,
Is there a way to apply methods like limit() to an entire union query?
If not, is there a way to convert a SQL query string into a Query object?
Note:
There's a closed issue that describes something similar to this (except using paginate($unionQuery)) without a suggestion of how to overcome the problem.
Apply limit and offset to each subquery?
scrowler kindly suggested this option, but I think it won't work. If limit is set to 5 and the full result set would be this:
Article 9 --|
Article 8 |
Article 7 -- Page One
Article 6 |
Article 5 --|
Article 4 --|
Comment 123 |
Article 3 -- Here be dragons
Comment 122 |
Comment 121 --|
...
Then the query for page 1 would work: (the first five articles) + (the first five comments), sorted manually by date and trimmed to the first five of the combined result, would yield Articles 9 through 5.
But page 2 won't work, because the offset of 5 would be applied to both articles and comments, meaning the first 5 comments (which weren't included in page 1), will never show up in the results.
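To make the page 2 problem concrete, here is a small simulation in plain Python. The `created` values are made up so that the combined order matches the listing above; the function names are hypothetical and nothing here is CakePHP API.

```python
# Hypothetical rows mirroring the listing above: (label, created), larger = newer.
articles = [("Article 9", 10), ("Article 8", 9), ("Article 7", 8), ("Article 6", 7),
            ("Article 5", 6), ("Article 4", 5), ("Article 3", 3)]
comments = [("Comment 123", 4), ("Comment 122", 2), ("Comment 121", 1)]


def newest_first(rows):
    """Sort rows by their created value, newest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


def page_of_union(page, limit=5):
    """LIMIT/OFFSET applied once, to the combined result (the desired behaviour)."""
    offset = (page - 1) * limit
    combined = newest_first(articles + comments)
    return [label for label, _ in combined[offset:offset + limit]]


def page_per_subquery(page, limit=5):
    """LIMIT/OFFSET applied to each subquery, then merged, sorted and trimmed."""
    offset = (page - 1) * limit
    merged = (newest_first(articles)[offset:offset + limit]
              + newest_first(comments)[offset:offset + limit])
    return [label for label, _ in newest_first(merged)[:limit]]


for page in (1, 2):
    print(page, page_of_union(page), page_per_subquery(page))
```

Both functions agree on page 1, but on page 2 the per-subquery version skips every comment, because the offset of 5 is larger than the number of comments.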
As far as I know, applying these clauses directly to the query returned by unionAll() is not possible. It would require API changes that make the compiler aware of where to put the SQL, be it via options, a new type of query object, or something else.
Query::epilog() to the rescue
Luckily, it's possible to append SQL to queries using Query::epilog(), be it raw SQL fragments
$unionQuery->epilog('ORDER BY created DESC LIMIT 7 OFFSET 7');
or query expressions
$unionQuery->epilog(
$connection->newQuery()->order(['created' => 'DESC'])->limit(7)->offset(7)
);
This should give you the desired query.
It should be noted that, according to the docs, Query::epilog() expects either a string or a concrete \Cake\Database\ExpressionInterface implementation in the form of a \Cake\Database\Expression\QueryExpression instance, not just any ExpressionInterface implementation. So theoretically the latter example is invalid, even though the query compiler works with any ExpressionInterface implementation.
Use a subquery
It's also possible to use the union query as a subquery. This makes things easier when using the pagination component: you don't have to take care of anything other than building and injecting the subquery, since the paginator component can simply apply the order/limit/offset on the main query.
/* @var $connection \Cake\Database\Connection */
$connection = $articles->connection();
$articlesQuery = $connection
->newQuery()
->select(['*'])
->from('articles');
$commentsQuery = $connection
->newQuery()
->select(['*'])
->from('comments');
$unionQuery = $articlesQuery->unionAll($commentsQuery);
$paginatableQuery = $articles
->find()
->from([$articles->alias() => $unionQuery]);
This could of course also be moved into a finder.
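For reference, the shape of SQL that the subquery approach aims for can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: the two-column tables, the sample rows and the `feed` alias are invented for illustration, and the SQL that CakePHP actually generates will differ in its details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (title TEXT, created INTEGER);
    CREATE TABLE comments (title TEXT, created INTEGER);
    INSERT INTO articles VALUES ('Article 3', 3), ('Article 4', 5), ('Article 5', 6);
    INSERT INTO comments VALUES ('Comment 121', 1), ('Comment 122', 2), ('Comment 123', 4);
""")

# The union lives in the FROM clause, so ORDER BY / LIMIT / OFFSET on the
# outer query apply to the combined rows rather than to either subquery.
PAGE_SQL = """
    SELECT title, created
    FROM (
        SELECT title, created FROM articles
        UNION ALL
        SELECT title, created FROM comments
    ) AS feed
    ORDER BY created DESC
    LIMIT ? OFFSET ?
"""


def fetch_page(page, limit=2):
    """Return the titles on the given 1-based page of the combined feed."""
    offset = (page - 1) * limit
    return [row[0] for row in conn.execute(PAGE_SQL, (limit, offset))]


for page in (1, 2, 3):
    print(page, fetch_page(page))
```

Articles and comments interleave correctly across pages because the paging clauses sit on the outer query only.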
I have a table in my model including some feature and I want to execute select query like this in Django:
SELECT * FROM TABLE WHERE column1-column2 > 10000
I tried filter(), but after a little search I found out that I should use .annotate() and I changed my query to:
Account.objects.annotate(realcharge=(F('charge')-F('amount')), realcharge__lt=10000)
But I get this error:
'int' object has no attribute 'resolve_expression'
How should I write my query?
My Django version is 1.11.
I don't know if you still need an answer, but you could try the .filter() method after the call to .annotate(), like this:
# Note: I split this query in 2 lines for better readability
qs = Account.objects.annotate(realcharge=(F('charge')-F('amount')))
qs = qs.filter(realcharge__lt=10000)
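In SQL terms, annotating and then filtering amounts to selecting a computed column and comparing against it. The sketch below uses the stdlib sqlite3 module so it runs without a Django project; the `account` table and its rows are assumptions, and the SQL Django really emits will look somewhat different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, charge INTEGER, amount INTEGER);
    INSERT INTO account (charge, amount) VALUES (50000, 1000), (12000, 5000), (9000, 0);
""")

# annotate(realcharge=F('charge') - F('amount')) adds the computed column;
# filter(realcharge__lt=10000) then restricts the rows by that value.
rows = conn.execute("""
    SELECT id, charge - amount AS realcharge
    FROM account
    WHERE charge - amount < 10000
    ORDER BY id
""").fetchall()

print(rows)
```

Note that the SQL in the question uses `> 10000`; in Django that comparison corresponds to the `__gt` lookup rather than `__lt`.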
I have a visit Model and I'm getting the data I want like that:
$app_visits = Visit::select([
'start',
'end',
'machine_name'
])->where('user_id', $chosen_id)->get();
But I want to add points for every visit. Every visit has an interaction, but there's no visit_id (because of another system, I cannot add it).
Last developer left it like that:
$interactions = Interaction::where([
'machine_name' => $app_visit->machine_name,
])->whereBetween('date', [$app_visit->start, $app_visit->end])->get();
$points = 0;
foreach ($interactions as $interaction) {
$points += (int)$interaction->app_stage;
}
$app_visits[$key]['points'] = $points;
But I really don't like it as it's slow and messy. I wanted to just add 'points' sum to the first query, to touch database only once.
Edit, as someone asked for the database structure:
visit:
|id | start | end | machine_name | user_id
interaction:
|id | time | machine_name | points
You can use a few things in Eloquent. Probably the most useful for this case is select(DB::raw(sql...)), as you will have to add a bit of raw SQL to retrieve a count.
For example:
return $query
->join(...)
->where(...)
->select(DB::raw(
'COUNT(DISTINCT res.id) AS count'
))
->groupBy(...);
Failing that, I'd just replace the Eloquent code with raw SQL. We've had to do that a fair bit, as our data sets are massive, and Eloquent model building has proven a little slow.
Update as you've added structure. Why not just add a relation to Interaction, based upon machine_name (or even a custom method using raw sql that calculates the points), and use: Visits::with('interaction.visitPoints')->...blah ?
Take a look at DB instead of Eloquent:
https://laravel.com/docs/5.6/queries
For more complex and efficient queries.
There is also a possibility to use raw SQL with this facade.
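As a rough idea of what the raw SQL could look like, here is a single query that sums points per visit with a LEFT JOIN on machine_name and the visit's time window. It is a sketch run against an in-memory SQLite database: the column names follow the structure posted in the question (`time`, `points`), the integer timestamps and sample rows are invented, and the syntax may need adjusting for your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visit (id INTEGER PRIMARY KEY, "start" INTEGER, "end" INTEGER,
                        machine_name TEXT, user_id INTEGER);
    CREATE TABLE interaction (id INTEGER PRIMARY KEY, time INTEGER,
                              machine_name TEXT, points INTEGER);
    INSERT INTO visit VALUES (1, 100, 200, 'm1', 7), (2, 300, 400, 'm1', 7),
                             (3, 500, 600, 'm1', 7), (4, 100, 200, 'm2', 8);
    INSERT INTO interaction VALUES (1, 150, 'm1', 2), (2, 180, 'm1', 3),
                                   (3, 350, 'm1', 4), (4, 150, 'm2', 9);
""")

# One round trip: each visit is matched to the interactions on the same machine
# that fall inside its time window; visits without interactions get 0 points.
POINTS_SQL = """
    SELECT v."start", v."end", v.machine_name, COALESCE(SUM(i.points), 0) AS points
    FROM visit AS v
    LEFT JOIN interaction AS i
           ON i.machine_name = v.machine_name
          AND i.time BETWEEN v."start" AND v."end"
    WHERE v.user_id = ?
    GROUP BY v.id
    ORDER BY v.id
"""

visits_with_points = conn.execute(POINTS_SQL, (7,)).fetchall()
print(visits_with_points)
```

The same statement could be passed through the DB facade or wrapped in DB::raw(); the join condition is the part that replaces the per-visit loop.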
I want the number of records available in the database for the current query, without considering the LIMIT.
$this->Orders->find('all')
->where(['order_quantity' => 5])
->limit(5);
Say there are 50 records matching the query above. I want that total for the current query. I don't think I can use count(), because with the limit in place I expect it to always return 5 or fewer. Is there any solution in CakePHP?
This page in the CakePHP 3 book, explains EXACTLY the answer to your question including how and why it works:
Returning the Total Count of Records
Using a single query object, it is possible to obtain the total number
of rows found for a set of conditions:
$total = $articles->find()->where(['is_active' => true])->count();
The count() method will ignore the limit, offset and page clauses,
thus the following will return the same result:
$total = $articles->find()->where(['is_active' => true])->limit(10)->count();
This is useful when you need to know the total result set size in
advance, without having to construct another Query object. Likewise,
all result formatting and map-reduce routines are ignored when using
the count() method.
Notice the bit about "... will ignore the limit, offset, and page clauses"
So try something like this:
$data = $articles->find()->where(['is_active' => true])->limit(10);
$count = $data->count();
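The idea behind this, shown outside CakePHP: the page query and the count query share the same conditions, and only the page query carries the LIMIT. A minimal sketch with the stdlib sqlite3 module, assuming an `orders` table with 50 matching rows as in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_quantity INTEGER)")
conn.executemany("INSERT INTO orders (order_quantity) VALUES (?)",
                 [(5,)] * 50 + [(3,)] * 10)

# Shared conditions, reused by both statements below.
where = "order_quantity = ?"
params = (5,)

# The page itself honours the LIMIT ...
page = conn.execute(f"SELECT id FROM orders WHERE {where} LIMIT 5", params).fetchall()
# ... while the total is computed from the same conditions without it.
total = conn.execute(f"SELECT COUNT(*) FROM orders WHERE {where}", params).fetchone()[0]

print(len(page), total)
```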
It looks like there may be some confusion about what limit does in a find, so I suggest studying the docs.
The limit in your query means that the query will return only the first 5 rows, even if 50 rows match.
So, to get the actual number of matching rows, you just need to remove the limit and call count() on the query (CakePHP 3 has no 'count' finder):
$this->Orders->find()->where(['order_quantity' => 5])->count();
I've got a very complex query and trying to give a simple example of one of the sub-tables I'm having problems with, if you need more information or context please let me know.
I've posted a CSV file with some sample data here:
https://drive.google.com/open?id=0B4xdnV0LFZI1dzE5S29QSFhQSmM
We make cakes, and 99% of our cakes are made by us. The 1% is when we have a cake delivered to us from a subcontractor and we 'Receive' and 'Audit' it.
What I wanted to do was to write something like this:
SELECT
Cake.Cake,
Instruction.Cake_Instruction_Key,
Steps
FROM
Cake
Join Instruction
ON Cake.Cake_Key = Instruction.Cake_Key
JOIN Steps
ON Instruction.Step_Key = Steps.Step_Key
WHERE
MIN(Steps.Step_Key) = 1
This fails because you can't have an aggregate in the WHERE clause.
The desired results would be:
Cake C 13 Receive
Cake C 14 Audit
Cake D 15 Receive
Cake D 16 Audit
Thank you in advance for your help!
Take a look at the HAVING keyword:
https://msdn.microsoft.com/en-us/library/ms180199.aspx
It works more or less the same as the WHERE clause but for aggregate functions after the GROUP BY clause.
Beware, however, that this can be slow. You should try filtering down the number of records as much as possible in the WHERE clause, and even consider aggregating the data into a temporary table first.
What you're talking about is the GROUP BY/HAVING clause, so in your case you would need to add something like
GROUP BY Cake.Cake, Instruction.Cake_Instruction_Key, Steps
HAVING MIN(Steps.Step_Key) = 1
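One possible reading of the desired output is "every instruction row of each cake whose smallest step key is 1". Under that reading, the HAVING test can sit in a grouped subquery that picks the cakes, while the outer query lists their rows. The sketch below runs on SQLite with invented data; the `Step` name column, the Cake A rows and the step keys are assumptions, since the CSV is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cake (Cake_Key INTEGER PRIMARY KEY, Cake TEXT);
    CREATE TABLE Steps (Step_Key INTEGER PRIMARY KEY, Step TEXT);
    CREATE TABLE Instruction (Cake_Instruction_Key INTEGER PRIMARY KEY,
                              Cake_Key INTEGER, Step_Key INTEGER);
    INSERT INTO Cake VALUES (1, 'Cake A'), (3, 'Cake C');
    INSERT INTO Steps VALUES (1, 'Receive'), (2, 'Audit'), (3, 'Mix'), (4, 'Bake');
    INSERT INTO Instruction VALUES (10, 1, 3), (11, 1, 4), (13, 3, 1), (14, 3, 2);
""")

# The aggregate condition moves into HAVING on a grouped subquery;
# the outer WHERE then only has to test membership.
rows = conn.execute("""
    SELECT c.Cake, i.Cake_Instruction_Key, s.Step
    FROM Cake AS c
    JOIN Instruction AS i ON c.Cake_Key = i.Cake_Key
    JOIN Steps AS s ON i.Step_Key = s.Step_Key
    WHERE c.Cake_Key IN (
        SELECT Cake_Key
        FROM Instruction
        GROUP BY Cake_Key
        HAVING MIN(Step_Key) = 1
    )
    ORDER BY i.Cake_Instruction_Key
""").fetchall()

print(rows)
```

Grouping by cake alone in the subquery keeps MIN() meaningful; grouping by every selected column would evaluate the aggregate over one row at a time.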
Select all records whose ID is not in a list
How can I write something like this:
query = Story.all()
query.filter('ID **NOT IN** =', [100,200,..,..])
There's no way to do this efficiently in App Engine. You should simply select everything without that filter, and filter out any matching entities in your code.
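Filtering in code can be as simple as the following sketch. Plain dicts stand in for the fetched entities, and the `ID` key and the `exclude_ids` helper are hypothetical; with real entities you would read the corresponding property instead.

```python
def exclude_ids(entities, excluded_ids):
    """Keep only the entities whose ID is not in excluded_ids."""
    excluded = set(excluded_ids)  # set membership checks are O(1) on average
    return [entity for entity in entities if entity["ID"] not in excluded]


# Stand-ins for the result of an unfiltered query.
stories = [{"ID": 50}, {"ID": 100}, {"ID": 150}, {"ID": 200}]
kept = exclude_ids(stories, [100, 200])
print(kept)
```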
This is now supported via GQL query
The 'IN' and '!=' operators in the Python runtime are actually
implemented in the SDK and translate to multiple queries 'under the
hood'.
For example, the query "SELECT * FROM People WHERE name IN ('Bob',
'Jane')" gets translated into two queries, equivalent to running
"SELECT * FROM People WHERE name = 'Bob'" and "SELECT * FROM People
WHERE name = 'Jane'" and merging the results. Combining multiple
disjunctions multiplies the number of queries needed, so the query
"SELECT * FROM People WHERE name IN ('Bob', 'Jane') AND age != 25"
generates a total of four queries, for each of the possible conditions
(age less than or greater than 25, and name is 'Bob' or 'Jane'), then
merges them together into a single result set.
source: appengine blog
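The expansion described in that quote can be sketched in a few lines of Python. This is an illustration of the counting argument only, not the SDK's actual implementation; the `expand` function and the tuple representation of filters are made up for the example.

```python
from itertools import product


def expand(in_filters, not_equal_filters):
    """Expand IN and != filters into the simple queries they stand for.

    Each returned query is a list of (property, operator, value) tuples.
    """
    alternatives = []
    for prop, values in in_filters.items():
        # name IN ('Bob', 'Jane')  ->  name = 'Bob'  or  name = 'Jane'
        alternatives.append([(prop, "=", value) for value in values])
    for prop, value in not_equal_filters.items():
        # age != 25  ->  age < 25  or  age > 25
        alternatives.append([(prop, "<", value), (prop, ">", value)])
    # One simple query per combination of alternatives.
    return [list(combo) for combo in product(*alternatives)]


queries = expand({"name": ["Bob", "Jane"]}, {"age": 25})
for query in queries:
    print(query)
```

For the quoted example this yields four queries, and each additional IN or != filter multiplies that number again.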
This is an old question, so I'm not sure if the ID is a non-key property. But in order to answer this:
query = Story.all()
query.filter('ID **NOT IN** =', [100,200,..,..])
...With ndb models, you can definitely query for items that are in a list. For example, see the docs here for IN and !=. Here's the IN form of the OP's filter (note that it matches the listed IDs rather than excluding them):
query = Story.query().filter(Story.id.IN([100,200,..,..]))
We can even query for items that are in a list of repeated keys:
def all(user_id):
    # See if my user_id is associated with any Group.
    groups_belonged_to = Group.query().filter(user_id == Group.members)
    print [group.to_dict() for group in groups_belonged_to]
Some caveats:
There are docs out there that mention that, in order to perform these types of queries, Datastore performs multiple queries behind the scenes, which (1) might take a while to execute, (2) will take longer if you are searching in repeated properties, and (3) will increase your costs with more operations.