CakePHP 3 left join and union in same query

I have a products table and a metadata table that I want to allow searching of. Having managed to write the query so it works super fast, I'm having trouble migrating it to CakePHP 3.x
The tables are standard parent->child setup with foreign keys and fulltext indexes on the relevant data fields.
The query I want to emulate is:
SELECT products.*, SUM(hits.relevance) AS relevance
FROM (
    SELECT products.id, MATCH(products.code, products.title) AGAINST('"mm mmm"' IN BOOLEAN MODE) AS relevance
    FROM products
    WHERE MATCH(products.code, products.title) AGAINST('"mm mmm"' IN BOOLEAN MODE)
    UNION ALL
    SELECT pim1.product_id AS id, MATCH(pim1.value) AGAINST('"mm mmm"' IN BOOLEAN MODE) AS relevance
    FROM pim1
    WHERE MATCH(pim1.value) AGAINST('"mm mmm"' IN BOOLEAN MODE)
) AS hits
LEFT JOIN products ON products.id = hits.id
GROUP BY products.id
ORDER BY relevance DESC
Essentially, this lets MySQL use the fulltext indexes much faster than a left join does: it unions the results, then uses that union as the primary table and left joins the products data to it, ready to hand off to paginate() and the view.
I have the unions all sorted, but I can't seem to get the outer query to work.
$pim = $this->Products
->association("Pim1")
->find('all')
->select(['fk' => 'product_id'])
->select(['relevance' => 'MATCH(value) AGAINST(:search IN BOOLEAN MODE)'])
->where("MATCH(value) AGAINST(:search IN BOOLEAN MODE)")
->bind(":search", $this->request->session()->read('search.wild_terms'));
$prd = $this->Products
->find('all')
->select(['fk' => 'id'])
->select(['relevance' => 'MATCH(code, title) AGAINST(:search IN BOOLEAN MODE)'])
->where("MATCH(code, title) AGAINST(:search IN BOOLEAN MODE)")
->bind(":search", $this->request->session()->read('search.wild_terms'));
This bit works and runs the union as per the subquery above
$query = $prd->unionAll($pim);
This bit doesn't then allow me to attach the products data to the results of that union
$query->leftJoinWith("Products", function ($q) { return $q->where(['Products.id' => 'fk']); });
It throws an error
Products is not associated with Products
Any guidance on how to convert my successful SQL into Cake would be greatly appreciated.

leftJoinWith() is used for joining associations. Since your Products table is not associated with itself, you cannot use it here. Instead, use leftJoin(). Because leftJoin() cannot infer anything from an association, you will need to pass it all the information needed to build the join: the table to join and the join conditions.

Related

How to improve poor performance of EF Core SQL query that sorts on a child collection

My issue is with the queries that EF Core generates for fetching ordered items from a child collection of a parent.
I have a parent class which has a collection of child objects. I'm using Entity Framework Core 5.0.5 (code first) against a SQL Server database. I've tried to boil down the scenario, so let's call it an Owner with a collection of Pets.
I often want a list of owners with their oldest pet, so I'll do something like
Context.Owners
    .Select(owner =>
        new {
            Owner = owner,
            OldPet = owner.Pets.OrderBy(pet => pet.Age).LastOrDefault()
        })
    .Where(x => x.Owner.Id == 1);
This worked fine before (on ef6) and works functionally now. However, the issue I have is that now EF Core translates these sub collection queries into something apparently cleverer, something like
SELECT *
FROM [Owners] AS [c]
LEFT JOIN (
SELECT *
FROM (
SELECT [c0].[Id] ... , ROW_NUMBER() OVER(PARTITION BY [c0].[OwnerId] ORDER BY [c0].[Age] DESC) AS [row]
FROM [Pets] AS [c0]
) AS [t]
WHERE [t].[row] <= 1
) AS [t0] ON [c].[Id] = [t0].[OwnerId]
The problem I'm having is that it seems to perform terribly. Looking at the execution plan it's doing a clustered index seek on the pets table, then sorting them. The 'number of rows read' is massive and the 'sorting' takes tens or hundreds of milliseconds.
The way EF6 does the same functionality seemed way more performant in this sort of scenario.
Is there a way to change the behaviour so I can choose? Or a way to rewrite this type of query such that I don't have this problem? I've tried many variations of using GroupBy etc and still have the same result.

If you use FirstOrDefault (or LastOrDefault) inside a projection, EF Core has to generate that kind of join, which uses the ROW_NUMBER window function. To get the SQL you want, rewrite the query so that it is more predictable for the LINQ translator:
var query =
    from owner in Context.Owners
    from pet in owner.Pets
    where owner.Id == 1
    orderby pet.Age descending
    select new
    {
        Owner = owner,
        OldPet = pet
    };
var result = query.FirstOrDefault();

SOQL Query for Left Join for custom objects

I have a requirement to fetch data from Salesforce. I need to get the data from two custom objects. I have written the query in SQL; can anyone help me convert it to SOQL?
SELECT ID, Name, Crop_Year__c, Targeted_Enrollment_Segments__c, Description__c, Start_Date__c,
End_Date__c from Enrollment_Program__c EP
Left Join Account_Enrollment__c AE on EP.Crop_Year__c = AE.Crop_Year__c and EP.ID =
AE.Enrollment_Program__c
where AE.Account__c = 'xyz'

As you probably know, Salesforce SOQL doesn't have explicit JOIN clauses. It joins implicitly through relationship fields. That means you'll have to query Account_Enrollment__c and traverse its Enrollment_Program__c lookup relationship to reach the fields of the related Enrollment_Program__c record.
Another problem is that Salesforce only performs joins based on primary and foreign keys, so the EP.Crop_Year__c = AE.Crop_Year__c condition in your query won't work as a join condition.
So, with that said, you can try this:
SELECT Enrollment_Program__c, Enrollment_Program__r.Name,
    Enrollment_Program__r.Crop_Year__c, Enrollment_Program__r.Targeted_Enrollment_Segments__c,
    Enrollment_Program__r.Description__c, Enrollment_Program__r.Start_Date__c,
    Enrollment_Program__r.End_Date__c
FROM Account_Enrollment__c WHERE Account__c = 'xyz'
If you know beforehand what the Crop_Year__c value is, you can just add this to your query:
AND Crop_Year__c = :year AND Enrollment_Program__r.Crop_Year__c = :year
Some details on the queries:
The __r suffix is how you address the looked-up object in the query. If you are interested only in its id, you can use the __c field instead.
The :year is how you bind the variable year into the query. If you want to append it as text, you can just use ... Crop_Year__c = ' + year + '.

SqlServer Many to Many AND

I have 3 (hypothetical) tables.
Photos (a list of photos)
Attributes (things describing the photos)
PhotosToAttributes (a table to link the first 2)
I want to retrieve the Names of all the Photos that have a list of attributes.
For example, all photos that have both dark lighting and are portraits (AttributeID 1 and 2). Or, for example, all photos that have dark lighting, are portraits and were taken at a wedding (AttributeID 1 and 2 and 5). Or any arbitrary number of attributes.
The scale of the database will be maybe 10,000 rows in Photos, 100 Rows in Attributes and 100,000 rows in PhotosToAttributes.
This question: SQL: Many-To-Many table AND query is very close. (I think.) I also read the linked answers about performance. That leads to something like the following. But, how do I get Name instead of PhotoID? And presumably my code (C#) will build this query and adjust the attribute list and count as necessary?
SELECT PhotoID
FROM PhotosToAttributes
WHERE AttributeID IN (1, 2, 5)
GROUP by PhotoID
HAVING COUNT(1) = 3
I'm a bit database illiterate (it's been 20 years since I took a database class); I'm not even sure this is a good way to structure the tables. I wanted to be able to add new attributes and photos at will without changing the data access code.

It is probably a reasonable way to structure the database. An alternate would be to keep all the attributes as a delimited list in a varchar field, but that would lead to performance issues as you search the field.
Your code is close. To take it to the final step, you just need to join the other two tables like this:
Select p.Name, p.PhotoID
From Photos As p
Join PhotosToAttributes As pta On p.PhotoID = pta.PhotoID
Join Attributes As a On pta.AttributeID = a.AttributeID
Where a.Name In ('Dark Light', 'Portrait', 'Wedding')
Group By p.Name, p.PhotoID
Having Count(*) = 3;
By joining the Attributes table like that it means you can search for attributes by their name, instead of their ID.
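The question also asks how calling code might adjust the attribute list and the count together. A minimal sketch of that idea follows, written in Python against an in-memory SQLite database rather than C# and SQL Server, so the function name photos_with_all, the sample rows, and the use of COUNT(DISTINCT ...) are illustrative assumptions and not part of the original answer. The placeholders and the HAVING count are both derived from the same de-duplicated list, so they cannot drift apart.

```python
import sqlite3

def photos_with_all(conn, attribute_names):
    """Return names of photos that carry every attribute in attribute_names.

    Hypothetical helper: builds the IN list and the HAVING count from one list.
    """
    wanted = sorted(set(attribute_names))  # de-duplicate so the count is exact
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.Name
        FROM Photos AS p
        JOIN PhotosToAttributes AS pta ON p.PhotoID = pta.PhotoID
        JOIN Attributes AS a ON pta.AttributeID = a.AttributeID
        WHERE a.Name IN ({placeholders})
        GROUP BY p.PhotoID, p.Name
        HAVING COUNT(DISTINCT a.AttributeID) = ?
        ORDER BY p.Name
    """
    # Values are passed as parameters, never concatenated into the SQL text.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

# Made-up sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Photos (PhotoID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Attributes (AttributeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PhotosToAttributes (PhotoID INTEGER, AttributeID INTEGER);
    INSERT INTO Photos VALUES (1, 'bride.jpg'), (2, 'sunset.jpg'), (3, 'groom.jpg');
    INSERT INTO Attributes VALUES (1, 'Dark Light'), (2, 'Portrait'), (5, 'Wedding');
    INSERT INTO PhotosToAttributes VALUES (1, 1), (1, 2), (1, 5), (2, 1), (3, 2), (3, 5);
""")

print(photos_with_all(conn, ["Dark Light", "Portrait", "Wedding"]))
print(photos_with_all(conn, ["Portrait", "Wedding"]))
```

Counting distinct attribute IDs, rather than rows, keeps the result correct even if the link table happens to contain a duplicate (PhotoID, AttributeID) pair.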

First, create a view from your joins:
create view vw_PhotosWithAttributes
as
select
p.PhotoId,
a.AttributeID,
p.Name PhotoName,
a.Name AttributeName
from Photos p
inner join PhotosToAttributes pa on p.PhotoId = pa.PhotoId
inner join Attributes a on a.AttributeID = pa.AttributeID
With the view in place you can easily query by attribute, name, or ID, but don't forget to index the underlying columns properly.

NHibernate Criteria SQL Inner Join on Sub Select Same Table

I can't for the life of me figure out how to translate the following SQL query using NHibernate's Criteria API:
SELECT r.* from ContentItemVersionRecords as r
INNER JOIN (
SELECT ContentItemId as CID, Max(Number) as [Version]
FROM ContentItemVersionRecords
GROUP BY ContentItemId
) AS l
ON r.ContentItemId = l.CID and r.Number = l.[Version]
WHERE Latest = 0 and Published = 0
The table looks like this:
The result of the SQL query above will return the highlighted records.
The idea is to select the latest version of content items, so I basically need to group by ContentItemId and get the record with the highest Number.
So the result will look like this:
I started out with a detached criteria, but I am clueless as to how to use it in the criteria:
// Sub select for the inner join:
var innerJoin = DetachedCriteria.For<ContentItemVersionRecord>()
.SetProjection(Projections.ProjectionList()
.Add(Projections.GroupProperty("ContentItemId"), "CID")
.Add(Projections.Max("Number"), "Version"));
// What next?
var criteria = session.CreateCriteria<ContentItemVersionRecord>();
Please note that I have to use the Criteria API - I can't use LINQ, HQL or SQL.
Is this at all possible with the Criteria API?
UPDATE: I just came across this post which looks very similar to my question. However, when I apply that as follows:
var criteria = session
.CreateCriteria<ContentItemVersionRecord>()
.SetProjection(
Projections.ProjectionList()
.Add(Projections.GroupProperty("ContentItemId"))
.Add(Projections.Max("Number")))
.SetResultTransformer(Transformers.AliasToBean<ContentItemVersionRecord>());
I get 2 results, which looks promising, but all of the integer properties are 0:
UPDATE 2: I found out that if I supply aliases, it will work (meaning I will get a list of ContentItemVersionRecords with populated objects):
var criteria = session
.CreateCriteria<ContentItemVersionRecord>()
.SetProjection(
Projections.ProjectionList()
.Add(Projections.Max("Id"), "Id")
.Add(Projections.GroupProperty("ContentItemId"), "ContentItemId")
.Add(Projections.Max("Number"), "Number"))
.SetResultTransformer(Transformers.AliasToBean<ContentItemVersionRecord>());
However, I can't use the projected values as the end result - I need to use these results as some sort of input into the outer query, e.g.
SELECT * FROM ContentItemVersionRecord WHERE Id IN ('list of record ids as a result from the projection / subquery / inner join')
But that won't work, since the projection returns 3 scalar values (Id, ContentItemId and Number). If it would just return "Id", then it might work. But I need the other two projections to group by ContentItemId and order by Max("Number").

OK, so in a nutshell, you need to unwind that nested query and do a GROUP BY with a HAVING clause, which is pretty much a WHERE on aggregated values, as in the following HQL:
SELECT civ.ContentItem.Id, MAX(civ.Number) AS VersionNumber
FROM ContentItemVersionRecord civ
JOIN civ.ContentItem ci
GROUP BY civ.ContentItem.Id
HAVING MAX(civ.Latest) = 0 AND MAX(civ.Published) = 0
This gives you, for each deleted content item (one whose Latest and Published flags are zero on all of its content item version records), the maximum version number, i.e. the latest version of that deleted content item.
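To see what the HAVING clause actually filters, here is a small sketch of the same GROUP BY / HAVING shape. It is plain SQL run through Python's sqlite3 module, not HQL or the Criteria API, and the table layout and sample rows are assumptions made up for illustration. Because the flags are 0 or 1, MAX(Latest) = 0 holds only when every version of the item has Latest = 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified layout of the version table.
    CREATE TABLE ContentItemVersionRecords (
        Id INTEGER PRIMARY KEY,
        ContentItemId INTEGER,
        Number INTEGER,
        Latest INTEGER,
        Published INTEGER
    );
    -- Item 10: no version is latest or published (treated as deleted).
    -- Item 20: version 2 is both latest and published.
    INSERT INTO ContentItemVersionRecords VALUES
        (1, 10, 1, 0, 0),
        (2, 10, 2, 0, 0),
        (3, 20, 1, 0, 0),
        (4, 20, 2, 1, 1);
""")

rows = conn.execute("""
    SELECT ContentItemId, MAX(Number) AS VersionNumber
    FROM ContentItemVersionRecords
    GROUP BY ContentItemId
    HAVING MAX(Latest) = 0 AND MAX(Published) = 0
    ORDER BY ContentItemId
""").fetchall()

print(rows)  # only item 10 survives, with its highest version number
```

Note that this is stricter than the SQL in the question, which checks the flags only on the row with the highest Number; the HAVING form requires the flags to be zero on every version of the item.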

Sql Server - How compare hash of two rows in merge

I have several working tables that I am merging together into one final table that will be used for display. If the display table does not contain the primary key compiled from the working tables (hereafter called src), then I insert the row into display. This works fine; the next part is what confuses me.
If the primary key is already in display, I only want to update the display row if the src row with the same primary key differs from the display row in at least one column. I'd like to implement this using the HASHBYTES() function with the MD5 algorithm.
From MSDN, the syntax should look like this: HASHBYTES('MD5', {@variable | 'string'})
I want to be able to do something like this in my merge statement:
WHEN MATCHED AND HASHBYTES('MD5', display) != HASHBYTES('MD5', src) THEN ...(stuff)
How do I complete the HASHBYTES function?
Here is my current merge statement
MERGE dbo.DisplayCases AS display
USING (SELECT CaseId, Title, projects.ProjectName, categories.CategoryTitle, Root, milestones.MilestoneName,
milestones.MilestoneDate, Priority, statuses.StatusTitle, EstimatedHours, ElapsedHours, personAssigned.Name as AssignedTo,
personResolved.Name as ResolvedBy, cases.IsResolved, IsOpen, Opened, Resolved, Uri, ResolveUri,
OutlineUri, SpecUri, ParentId, Backlog
FROM fogbugz.Cases cases
JOIN fogbugz.Projects projects ON cases.ProjectId = projects.ProjectId
JOIN fogbugz.Categories categories ON cases.CategoryId = categories.CategoryId
JOIN fogbugz.Milestones milestones ON cases.MilestoneId = milestones.MilestoneId
JOIN fogbugz.Statuses statuses ON cases.Status = statuses.StatusId
JOIN fogbugz.People personAssigned ON cases.AssignedTo = personAssigned.Id
LEFT JOIN fogbugz.People personResolved ON cases.ResolvedBy = personResolved.Id
) as src
ON display.CaseId = src.CaseId
WHEN NOT MATCHED THEN
INSERT(CaseId, CaseTitle, ProjectName, CategoryTitle, RootId, MilestoneName, MilestoneDate, Priority,
StatusTitle, EstHrs, ElapsedHrs, AssignedTo, ResolvedBy, IsOpen, IsResolved, Opened, Resolved, Uri,
ResolveUri, OutlineUri, Spec, ParentId, Backlog)
VALUES(src.CaseId, src.Title, src.ProjectName, src.CategoryTitle, src.Root, src.MilestoneName,
src.MilestoneDate, src.Priority, src.StatusTitle, src.EstimatedHours, src.ElapsedHours,
src.AssignedTo, src.ResolvedBy, src.IsResolved, src.IsOpen, src.Opened, src.Resolved,
src.Uri, src.ResolveUri, src.OutlineUri, src.SpecUri, src.ParentId, src.Backlog);

From Martin Smith's comment...
You could do WHEN MATCHED AND EXISTS(SELECT Source.* EXCEPT SELECT Target.*) THEN UPDATE ...
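The EXCEPT approach works without any hashing because set operators compare whole rows and treat two NULLs as equal, which a chain of column-by-column != comparisons does not. The sketch below shows that behaviour in an uncorrelated form, using Python's sqlite3 module instead of SQL Server's MERGE; the two cut-down tables and their rows are made up for illustration and are not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of the display and src row sets.
    CREATE TABLE display (CaseId INTEGER PRIMARY KEY, Title TEXT, ResolvedBy TEXT);
    CREATE TABLE src     (CaseId INTEGER PRIMARY KEY, Title TEXT, ResolvedBy TEXT);
    INSERT INTO display VALUES (1, 'Login bug', NULL), (2, 'Crash', 'Ann');
    INSERT INTO src     VALUES (1, 'Login bug', NULL), (2, 'Crash', 'Bob'),
                               (3, 'New case', NULL);
""")

# Rows in src that have no identical row in display: new or changed cases.
# Case 1 is identical, including its NULL ResolvedBy, so it is not returned.
changed = conn.execute("""
    SELECT CaseId, Title, ResolvedBy FROM src
    EXCEPT
    SELECT CaseId, Title, ResolvedBy FROM display
    ORDER BY CaseId
""").fetchall()

print(changed)
```

In the MERGE itself, the comment's correlated form compares the current source row with the matched target row in the same way, so both sides need to list the same columns in the same order.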
