EF LINQ Count by Grouped field - sql-server

I have the following data schema:
With the following LINQ query:
var profiles = (
from p in context.BusinessProfiles
join u in context.Users on p.UserId equals u.Id
join addr in context.BusinessAddress on p.ProfileId equals addr.ProfileId into addrj
from addr in addrj.DefaultIfEmpty()
join pa in context.BusinessProfileActivities on p.ProfileId equals pa.ProfileId into paj
from paIfNull in paj.DefaultIfEmpty()
where p.ProfileId >= 137 && p.ProfileId <= 139
group new { p, u, addr, paIfNull }
by new {
p.ProfileId,
p.CompanyName,
p.Email,
UserEmail = u.Email,
addr.City, addr.Region,
addr.Country,
ActivityProfileId = paIfNull.ProfileId }
into pg
select new {
pg.Key.ProfileId,
pg.Key.CompanyName,
Email = pg.Key.Email ?? pg.Key.UserEmail,
pg.Key.City,
pg.Key.Region,
pg.Key.Country,
MatchingActivities = pg.Key.ActivityProfileId > 0 ? pg.Count() : 0
} into result
orderby result.MatchingActivities descending
select result
);
Which results in:
This result is correct (ProfileId 137 has 0 activities, 138 has 1 and 139 has 2), but it produces the following SQL:
SELECT [b].[ProfileId], [b].[CompanyName], COALESCE([b].[Email], [a].[Email]) AS [Email], [b0].[City], [b0].[Region], [b0].[Country],
CASE WHEN [b1].[ProfileId] > CAST(0 AS bigint) THEN COUNT(*)
ELSE 0
END AS [MatchingActivities]
FROM [BusinessProfiles] AS [b]
INNER JOIN [AspNetUsers] AS [a] ON [b].[UserId] = [a].[Id]
LEFT JOIN [BusinessAddress] AS [b0] ON [b].[ProfileId] = [b0].[ProfileId]
LEFT JOIN [BusinessProfileActivities] AS [b1] ON [b].[ProfileId] = [b1].[ProfileId]
WHERE ([b].[ProfileId] >= CAST(137 AS bigint)) AND ([b].[ProfileId] <= CAST(139 AS bigint))
GROUP BY [b].[ProfileId], [b].[CompanyName], [b].[Email], [a].[Email], [b0].[City], [b0].[Region], [b0].[Country], [b1].[ProfileId]
ORDER BY CASE
WHEN [b1].[ProfileId] > CAST(0 AS bigint) THEN COUNT(*)
ELSE 0
END DESC
In SQL, I can avoid both CASE WHEN if I use COUNT([b1].[ProfileId]) like this:
SELECT [b].[ProfileId], [b].[CompanyName], COALESCE([b].[Email], [a].[Email]) AS [Email], [b0].[City], [b0].[Region], [b0].[Country],
COUNT([b1].[ProfileId]) AS [MatchingActivities]
FROM [BusinessProfiles] AS [b]
INNER JOIN [AspNetUsers] AS [a] ON [b].[UserId] = [a].[Id]
LEFT JOIN [BusinessAddress] AS [b0] ON [b].[ProfileId] = [b0].[ProfileId]
LEFT JOIN [BusinessProfileActivities] AS [b1] ON [b].[ProfileId] = [b1].[ProfileId]
WHERE ([b].[ProfileId] >= CAST(137 AS bigint)) AND ([b].[ProfileId] <= CAST(139 AS bigint))
GROUP BY [b].[ProfileId], [b].[CompanyName], [b].[Email], [a].[Email], [b0].[City], [b0].[Region], [b0].[Country], [b1].[ProfileId]
ORDER BY [MatchingActivities] DESC
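The difference between COUNT(*) and COUNT(column) over a LEFT JOIN can be checked in isolation. The following is a minimal sketch that uses an in-memory SQLite database as a stand-in for SQL Server; the table and column names (profiles, activities, profile_id) are simplified assumptions, not the schema from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profiles (profile_id INTEGER PRIMARY KEY);
CREATE TABLE activities (activity_id INTEGER PRIMARY KEY, profile_id INTEGER);
INSERT INTO profiles VALUES (137), (138), (139);
INSERT INTO activities (profile_id) VALUES (138), (139), (139);
""")

query = """
SELECT p.profile_id,
       COUNT(*)            AS count_star,    -- counts the NULL-extended row too
       COUNT(a.profile_id) AS count_column   -- skips rows where the column is NULL
FROM profiles AS p
LEFT JOIN activities AS a ON a.profile_id = p.profile_id
GROUP BY p.profile_id
ORDER BY p.profile_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Profile 137 has no activities, so the LEFT JOIN yields one row with NULLs for it: COUNT(*) reports 1, while COUNT(a.profile_id) reports 0.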
My question is: how can I count by the grouped ActivityProfileId = paIfNull.ProfileId using LINQ and get EF to generate the above SQL?
I have tried many variations, most of which result in EF-to-SQL translation errors:
MatchingActivities = pg.Count(t => t.ActivityProfileId != 0)
MatchingActivities = pg.Select(t => t.paIfNull.ProfileId).Distinct().Count(),
MatchingActivities = pg.Count(t => t.paIfNull != null),
All of them either fail with errors like System.InvalidOperationException: The LINQ expression ... could not be translated. or return MatchingActivities as 1 instead of 0.
Related Q/A:
LINQ Count returning 1 instead of zero for an empty group
Group by in LINQ
How to write left join, group by and average in c# entity framework Linq

In short, you can't: EF Core still doesn't support that.
See this issue:
https://github.com/dotnet/efcore/issues/17376
See also:
https://stackoverflow.com/a/61878332/9212040

Related

Combining multiple SQL's into a single SQL

I have the following queries:
(SELECT COUNT(DISTINCT KUNDNR) CHECKED_CUSTOMER from CLNT0001.TCM_CHECK_SUMMARY
where '20170322000000000' <= HISTVON and HISTVON < '20170323000000000' and INSTITUTSNR='0001')
and
SELECT clientNumber
,creationDate
,customerNumber
,checkedCustomer
,CLNT0001.TCM_CHECK_SUMMARY.COUNTRY_CODE countryCode
,CLNT0001.TCM_CHECK_SUMMARY.PST_KURZTEXT personStatus
,CLNT0001.TCM_CASE_COUNTRY_GROUP.COUNTRY_CODE homeCountryCode
,CLNT0001.TCM_CASE_COUNTRY_GROUP.PST_LFD_NR personStatusId
,CLNT0001.TCM_CASE_COUNTRY_GROUP.REGULATION regulation
,caseStatus
,COC_SCORE_COUNT cocCaseCount
FROM (
SELECT GEPRUEFT_JN checkedCustomer
,INSTITUTSNR clientNumber
,KUNDNR customerNumber
,CASE_STATUS caseStatus
,MAX(CREATION_DATE) creationDate
FROM CLNT0001.TAXACTCASE
WHERE GEPRUEFT_JN = 'J' AND CREATION_DATE>='20170322000000000' AND
CREATION_DATE<='20170323000000000'
GROUP BY KUNDNR
,INSTITUTSNR
,GEPRUEFT_JN
,CASE_STATUS
) T1
INNER JOIN CLNT0001.TCM_CHECK_SUMMARY ON T1.customerNumber = CLNT0001.TCM_CHECK_SUMMARY.KUNDNR
INNER JOIN CLNT0001.TCM_CASE_COUNTRY_GROUP ON T1.customerNumber = CLNT0001.TCM_CASE_COUNTRY_GROUP.KUNDNR
WHERE T1.creationDate <= CLNT0001.TCM_CHECK_SUMMARY.HISTBIS
AND T1.creationDate >= CLNT0001.TCM_CHECK_SUMMARY.HISTVON
I need the CHECKED_CUSTOMER column as part of the second query's result set, but I am not able to figure out a way to do this. Is this possible?
SELECT clientNumber,creationDate,customerNumber,checkedCustomer
,CLNT0001.TCM_CHECK_SUMMARY.COUNTRY_CODE countryCode
,CLNT0001.TCM_CHECK_SUMMARY.PST_KURZTEXT personStatus
,CLNT0001.TCM_CASE_COUNTRY_GROUP.COUNTRY_CODE homeCountryCode
,CLNT0001.TCM_CASE_COUNTRY_GROUP.PST_LFD_NR personStatusId
,CLNT0001.TCM_CASE_COUNTRY_GROUP.REGULATION regulation
,caseStatus,COC_SCORE_COUNT cocCaseCount ,CHECKED_CUSTOMER
FROM (
SELECT GEPRUEFT_JN checkedCustomer,INSTITUTSNR clientNumber ,KUNDNR customerNumber ,CASE_STATUS caseStatus,MAX(CREATION_DATE) creationDate,COUNT(DISTINCT b.KUNDNR) CHECKED_CUSTOMER
FROM CLNT0001.TAXACTCASE
LEFT JOIN CLNT0001.TCM_CHECK_SUMMARY b ON CLNT0001.TAXACTCASE.KUNDNR=b.KUNDNR
WHERE GEPRUEFT_JN = 'J' AND CREATION_DATE>='20170322000000000' AND
CREATION_DATE<='20170323000000000'
GROUP BY KUNDNR,INSTITUTSNR ,GEPRUEFT_JN,CASE_STATUS
) T1 INNER JOIN CLNT0001.TCM_CHECK_SUMMARY ON T1.customerNumber = CLNT0001.TCM_CHECK_SUMMARY.KUNDNR
INNER JOIN CLNT0001.TCM_CASE_COUNTRY_GROUP ON T1.customerNumber = CLNT0001.TCM_CASE_COUNTRY_GROUP.KUNDNR
WHERE T1.creationDate <= CLNT0001.TCM_CHECK_SUMMARY.HISTBIS
AND T1.creationDate >= CLNT0001.TCM_CHECK_SUMMARY.HISTVON

How to supply multiple values in between clause after where clause

SELECT
ROW_NUMBER() OVER (ORDER BY Vendor_PrimaryInfo.Vendor_ID ASC) AS RowNumber,
*
FROM
Unit_Table
INNER JOIN
Vendor_Base_Price ON Unit_Table.Unit_ID = Vendor_Base_Price.Unit_ID
INNER JOIN
Vendor_PrimaryInfo ON Vendor_Base_Price.Vendor_ID = Vendor_PrimaryInfo.Vendor_ID
INNER JOIN
Vendor_Registration ON Vendor_Base_Price.Vendor_ID = Vendor_Registration.Vendor_ID
AND Vendor_PrimaryInfo.Vendor_ID = Vendor_Registration.Vendor_ID
INNER JOIN
Category_Table ON Vendor_Registration.Category_ID = Category_Table.Category_ID
LEFT JOIN
Vendor_Value_Table ON Vendor_Registration.Vendor_ID = Vendor_Value_Table.Vendor_ID
LEFT JOIN
Feature_Table ON Vendor_Value_Table.Feature_ID = Feature_Table.Feature_ID
WHERE
Vendor_Registration.Category_ID = 5
AND Vendor_PrimaryInfo.City = 'City'
AND (value_text in ('sample value') or
(SELECT
CASE WHEN ISNUMERIC(value_text) = 1
THEN CAST(value_text AS INT)
ELSE -1
END) BETWEEN 0 AND 100)
The column holds multiple kinds of values, which may be text or integers; that is why I cast inside a CASE expression. My question is: I want to fetch the records whose value is between 0 and 100, or between 300 and 400, or is like 'sample value'.
I want to place the condition after the WHERE clause without repeating column_name in several BETWEEN operators, because these values come from the URL.
Thanks in advance; any help would be appreciated.
You can try it this way:
WHERE
Vendor_Registration.Category_ID = 5
AND Vendor_PrimaryInfo.City = 'City'
AND (value_text in ('sample value') or
(CASE WHEN (ISNUMERIC(value_text) = 1)
THEN CAST(value_text AS INT)
ELSE -1
END) BETWEEN 0 AND 100)
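The answer above covers only the 0 to 100 range. One way to add the second range without repeating the CASE expression is to compute the numeric value once in a derived table and filter on it. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for SQL Server; the vendor_value table is hypothetical, and since SQLite has no ISNUMERIC, a GLOB test for non-digit characters replaces it here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendor_value (value_text TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO vendor_value VALUES (?)",
    [("50",), ("350",), ("200",), ("sample value",), ("other",)],
)

query = """
SELECT value_text
FROM (
    SELECT value_text,
           -- SQLite has no ISNUMERIC; treat digit-only strings as numeric.
           CASE WHEN value_text = '' OR value_text GLOB '*[^0-9]*'
                THEN -1
                ELSE CAST(value_text AS INTEGER)
           END AS num
    FROM vendor_value
) AS v
WHERE value_text IN ('sample value')
   OR num BETWEEN 0 AND 100
   OR num BETWEEN 300 AND 400
ORDER BY value_text
"""
matches = [row[0] for row in conn.execute(query)]
print(matches)
```

The cast is written once, and each extra range becomes one more OR on the derived num column. In SQL Server the same idea could be expressed with the original ISNUMERIC-based CASE inside a derived table.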

Linq results different than the sql result

I have an MVC4 web application in which I am using LINQ.
I have the query below, which returns 53 rows in SQL.
select * from table1 t join
[table2] tpf on t.TestID=tpf.TestID
join
table3 pf on tpf.Test2ID =pf.Test2ID
join table4 pfp on
pf.Test3ID = pfp.Test3ID
join table5 p on pfp.Test5ID = p.Test5ID where t.testtypeid=1
order by pfp.Test3ID,pf.Test2ID
If I convert the same query to LINQ as below, it returns more records.
trvm.MyTestVMs = (
from tt in db.table1s
join ttpf in db.table2s on tt.TestID equals ttpf.TestID
join pf in db.table3s on ttpf.Test2ID equals pf.Test2ID
join pfp in db.table4s on pf.Test3ID equals pfp.Test3ID
join p in table5s on pfp.Test5ID equals p.Test5ID
where tt.testtypeid == 1
orderby pfp.Test3ID
orderby pf.Test2ID
select new MyTestVM
{
FamilyID = pf.Test2ID,
ProductID = p.Test3ID,
Desc = p.Description
}
).ToList();
The results from SQL and from the LINQ query above differ. Specifically, I am getting some duplicate rows from the LINQ query. What is causing this difference?
It turns out that the LINQ query is not equivalent to the SQL query posted, because it uses a products query variable (not shown in the post) that causes one of the many-to-many link tables to be included twice, thus producing more records.
One way to fix the problem is to replace products with db.Products and apply the same filters as in the query variables you were trying to reuse.
But if you want to reuse query variables, here is the correct way to do that:
// Eliminate the need of DbFunctions.TruncateTime(dt) inside the queries
dt = dt.Date;
// Queries
var productFamilys = (
from tt in db.TestTypes
join ttpf in db.TestTypeProductFamilys on tt.TestTypeID equals ttpf.TestTypeID
join pf in db.ProductFamilys on ttpf.ProductFamilyID equals pf.ProductFamilyID
where tt.TestTypeID == TestTypeID
where DbFunctions.TruncateTime(pf.StartDate) <= dt
where DbFunctions.TruncateTime(pf.EndDate) > dt
select pf
);
var productFamilyProducts = (
from pf in productFamilys
join pfp in db.ProductFamilyProducts on pf.ProductFamilyID equals pfp.ProductFamilyID
join p in db.Products on pfp.ProductID equals p.ProductID
where DbFunctions.TruncateTime(p.StartDate) <= dt
where DbFunctions.TruncateTime(p.EndDate) > dt
select new { Family = pf, Product = p }
);
var products = (
from pfp in productFamilyProducts
select pfp.Product
);
var productFamilyProductVMs = (
from pfp in productFamilyProducts
orderby pfp.Product.ProductID, pfp.Family.ProductFamilyID
select new ProductFamilyProductVM
{
ProductFamilyID = pfp.Family.ProductFamilyID,
ProductID = pfp.Product.ProductID,
ProdDesc = pfp.Product.Description
}
);
// Results
trvm.ProductFamilys = productFamilys.ToList();
trvm.Products = products.ToList();
trvm.ProductFamilyProductVMs = productFamilyProductVMs.ToList();
Now the SQL for the last query (the one in question) looks like this:
SELECT
[Project1].[ProductFamilyID] AS [ProductFamilyID],
[Project1].[ProductID] AS [ProductID],
[Project1].[Description] AS [Description]
FROM ( SELECT
[Extent2].[ProductFamilyID] AS [ProductFamilyID],
[Extent4].[ProductID] AS [ProductID],
[Extent4].[Description] AS [Description]
FROM [dbo].[TestTypeProductFamilies] AS [Extent1]
INNER JOIN [dbo].[ProductFamilies] AS [Extent2] ON [Extent1].[ProductFamilyID] = [Extent2].[ProductFamilyID]
INNER JOIN [dbo].[ProductFamilyProducts] AS [Extent3] ON [Extent2].[ProductFamilyID] = [Extent3].[ProductFamilyID]
INNER JOIN [dbo].[Products] AS [Extent4] ON [Extent3].[ProductID] = [Extent4].[ProductID]
WHERE ([Extent1].[TestTypeID] = @p__linq__0) AND ((convert (datetime2, convert(varchar(255), [Extent2].[StartDate], 102) , 102)) <= @p__linq__1) AND ((convert (datetime2, convert(varchar(255), [Extent2].[EndDate], 102) , 102)) > @p__linq__2) AND ((convert (datetime2, convert(varchar(255), [Extent4].[StartDate], 102) , 102)) <= @p__linq__3) AND ((convert (datetime2, convert(varchar(255), [Extent4].[EndDate], 102) , 102)) > @p__linq__4)
) AS [Project1]
ORDER BY [Project1].[ProductID] ASC, [Project1].[ProductFamilyID] ASC
i.e. pretty similar to the sample SQL query and should produce the same results.
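The row multiplication attributed above to a link table being joined twice can be reproduced on a small scale. This is a minimal sketch, assuming an in-memory SQLite database and hypothetical product and family_product tables rather than the poster's actual model; the derived table ps plays the role of a reused products query that was itself built through the link table.

```python
import sqlite3

# In-memory SQLite database; table names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY);
CREATE TABLE family_product (family_id INTEGER, product_id INTEGER);
INSERT INTO product VALUES (10), (11);
-- Product 10 belongs to two families, product 11 to one.
INSERT INTO family_product VALUES (1, 10), (2, 10), (1, 11);
""")

# Link table joined once: one row per (family, product) pair.
direct = conn.execute("""
SELECT fp.family_id, p.product_id
FROM family_product AS fp
JOIN product AS p ON p.product_id = fp.product_id
""").fetchall()

# Link table joined twice: the reused "products" query already went through
# family_product, so product 10 appears in it twice and multiplies the rows.
reused = conn.execute("""
SELECT fp.family_id, ps.product_id
FROM family_product AS fp
JOIN (
    SELECT p.product_id
    FROM family_product AS fp2
    JOIN product AS p ON p.product_id = fp2.product_id
) AS ps ON ps.product_id = fp.product_id
""").fetchall()

print(len(direct), len(reused))
```

With this data the direct join returns 3 rows and the version that goes through the link table twice returns 5, the extra rows being duplicates of the pairs involving product 10.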
The result is different because your second query has two separate orderby clauses. The second orderby operates on the output of the first one and reorders the items.
Change
orderby pfp.ProductID
orderby pf.ProductFamilyID
in your second query to
orderby pfp.ProductID, pf.ProductFamilyID
to get the same results.
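The effect of two successive sorts versus one sort on a composite key can be illustrated outside LINQ. The following is only an analogy in Python, with made-up (product_id, family_id) tuples; it does not show how EF translates orderby, only why re-sorting is not the same as adding a secondary key.

```python
# Hypothetical (product_id, family_id) pairs, deliberately unsorted.
rows = [(2, 1), (1, 2), (1, 1), (2, 2)]

# One sort with a composite key: product_id first, family_id as tie-breaker.
composite = sorted(rows, key=lambda r: (r[0], r[1]))

# Two successive sorts: the second sort reorders the output of the first,
# so family_id ends up as the primary key.
by_product = sorted(rows, key=lambda r: r[0])
resorted = sorted(by_product, key=lambda r: r[1])

print(composite)
print(resorted)
```

Both results contain the same four rows; only their order differs.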

SQL - add one more column as result of subquery? SQL Server 2008

I have a query which gives me perfectly good results:
select
A.ID_acc, A.ID_us, A.st, table3.KFL,
'100' as myattribute,
'101' as myattribute2
from
SOURCE1 as A
left join
(select
table2.ID_us, table2.ID_acc,
CASE WHEN table2.KFL_type = 'KFL' THEN P.index_num ELSE table2.KFL_type END as KFL
from
(select
table1.ID_us, table1.ID_acc,
CASE WHEN sum(table1.count_kfl) > 1 THEN '9999' WHEN sum(table1.count_kfl) = 1 THEN 'KFL' END as KFL_type
from
(SELECT
ID_us, ID_acc, count(*) as count_kfl
FROM
payments
WHERE
index_num IN (200, 201, 203)
AND (date >= XXXX or date2 >= 'XXXXX')
GROUP BY
1, 2) as table1
group by
1, 2) as table2
join
SOURCE2 as P on table2.ID_us = P.ID_us
and table2.ID_acc = P.ID_acc
where
(P.date>= XXXX or P.date2 >= 'XXXXX')
and index_num in (201,201,203)
group by
1, 2
order by
1, 2) as table3 on table3.ID_us = A.ID_us
and table3.ID_acc = A.ID_acc
where
A.not_deleted >= XXXXXX
This query is not my main question; I copied it only for context. What I am wondering is how I can add one more column (the result of a count operation) at the end of my first query, so that I do not have to run two separate queries and then merge the results. Naturally, I don't want to affect the results of the existing fields.
I have a second query, which looks like this:
select A.ID_us, count(*)/2 as number
from
SOURCE1 as A
left join
SOURCE3 as B
on A.ID_acc = B.ID_acc
where A.date >= XXXX
group by 1
The link between the two queries is the attribute ID_acc in SOURCE1 (alias A), which appears in both the first and the second query.
But I have no idea how to do it.
select A.ID_acc, A.ID_us, A.st, table3.KFL, '100' as myattribute, '101' as myattribute2, NEWSOURCE.MYNEW_attribute
from SOURCE1 as A
left join
(
...
)
as table3 on table3.ID_us = A.ID_us and table3.ID_acc = A.ID_acc
where A.not_deleted >= XXXXXX
left join
(
.
.
.
)
as NEWSOURCE
Something like this, of course, doesn't work.
Have you tried a correlated subquery?
select A.ID_acc, A.ID_us, A.st, table3.KFL, '100' as myattribute,
'101' as myattribute2,
( select count(*)/2 as number from SOURCE1 as IA left join SOURCE3 as IB on
IA.ID_acc = IB.ID_acc and IA.ID_Acc = A.ID_Acc where IA.date >= XXXX ) as NewColumn
from ...
Note the use of new aliases in the correlated subquery.
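A correlated subquery in the SELECT list adds one computed column per outer row without touching the other columns. The following is a simplified sketch of that pattern, not the exact query from the answer; it assumes an in-memory SQLite database and reduced source1 and source3 tables, and it drops the date filter and the division by 2.

```python
import sqlite3

# In-memory SQLite database; tables are reduced stand-ins (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source1 (id_acc INTEGER, id_us INTEGER);
CREATE TABLE source3 (id_acc INTEGER);
INSERT INTO source1 VALUES (1, 100), (2, 200), (3, 300);
INSERT INTO source3 VALUES (1), (1), (2);
""")

query = """
SELECT a.id_acc,
       a.id_us,
       -- Correlated subquery: evaluated per outer row via a.id_acc.
       (SELECT COUNT(*)
        FROM source3 AS b
        WHERE b.id_acc = a.id_acc) AS new_column
FROM source1 AS a
ORDER BY a.id_acc
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The inner query uses its own alias (b) and refers to the outer alias (a) only in the correlation predicate, which is the point the answer makes about new aliases.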

paging over SELECT UNION super slow and killing my server

I have an SP that returns paged data from a query that contains a UNION. This is killing my DB and sometimes takes 30 seconds to run. Am I missing something obvious here? What can I do to improve its performance?
Tables Involved: Products, Categories, CategoryProducts
Goal:
Take any Products that are not in a Category or have been deleted from a category, UNION them with all Products currently in a category, and page over the result for a web service.
I have indexes on all columns that I am joining on, and there are 427,996 Products, 6148 Categories and 409,691 CategoryProducts in the database.
Here is my query, which takes between 6 and 30 seconds to run:
SELECT * FROM (
SELECT ROW_NUMBER() OVER(ORDER BY Products.ItemID, Products.ManufacturerID) AS RowNum, *
FROM
(
SELECT Products.*,
CategoryID = NULL, CategoryName = NULL,
CategoryProductID = NULL,
ContainerMinimumQuantity =
CASE COALESCE(Products.ContainerMinQty, 0)
WHEN 0 THEN Products.OrderMinimumQuantity
ELSE Products.ContainerMinQty
END,
Products.IsDeleted,
SortOrder = NULL
FROM CategoryProducts RIGHT OUTER JOIN Products
ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
WHERE (Products.ManufacturerID = @ManufacturerID)
AND (Products.ModifiedOn > @tStamp )
AND ((CategoryProducts.IsDeleted = 1) OR (CategoryProducts.IsDeleted IS NULL))
UNION
SELECT Products.*,
CategoryProducts.CategoryID , CategoryProducts.CategoryName,
CategoryProducts.CategoryProductID ,
ContainerMinimumQuantity =
CASE COALESCE(Products.ContainerMinQty, 0)
WHEN 0 THEN Products.OrderMinimumQuantity
ELSE Products.ContainerMinQty
END,
CategoryProducts.IsDeleted,
CategoryProducts.SortOrder
FROM Categories INNER JOIN
CategoryProducts ON Categories.CategoryID = CategoryProducts.CategoryID INNER JOIN
Products ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
WHERE (Products.ManufacturerID = @ManufacturerID)
AND (Products.ModifiedOn > @tStamp OR CategoryProducts.ModifiedOn > @tStamp))
AS Products) AS C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
Any insight would be greatly appreciated.
If I read your situation correctly, the only reason for having two distinct queries is the treatment of missing/deleted CategoryProducts. I tried to address this with a left join on IsDeleted = 0, which turns all deleted CategoryProducts into nulls, so I don't have to test them again. The ModifiedOn part got an additional test for null, to cover the missing/deleted CategoryProducts you wish to retrieve.
select *
from (
SELECT
Products.*,
-- Following three columns will be null for deleted/missing categories
CategoryProducts.CategoryID,
CategoryProducts.CategoryName,
CategoryProducts.CategoryProductID ,
ContainerMinimumQuantity = COALESCE(nullif(Products.ContainerMinQty, 0),
Products.OrderMinimumQuantity),
CategoryProducts.IsDeleted,
CategoryProducts.SortOrder,
ROW_NUMBER() OVER(ORDER BY Products.ItemID,
Products.ManufacturerID) AS RowNum
FROM Products
LEFT JOIN CategoryProducts
ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
-- Filter IsDeleted in join so we get nulls for deleted categories
-- And treat them the same as missing ones
AND CategoryProducts.IsDeleted = 0
LEFT JOIN Categories
ON Categories.CategoryID = CategoryProducts.CategoryID
WHERE Products.ManufacturerID = @ManufacturerID
AND (Products.ModifiedOn > @tStamp
-- Deleted/missing categories
OR CategoryProducts.ModifiedOn is null
OR CategoryProducts.ModifiedOn > @tStamp)
) C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
On a third look, I don't see Category being used at all except as a filter on CategoryProducts. If this is the case, the second LEFT JOIN should be changed to INNER JOIN and this section should be enclosed in parentheses.
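The key move in this answer is placing the IsDeleted = 0 test in the join condition rather than in WHERE. The difference can be seen on a tiny data set. This is a minimal sketch, assuming an in-memory SQLite database and cut-down products and category_products tables with hypothetical column names.

```python
import sqlite3

# In-memory SQLite database; schema is a reduced stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (item_id INTEGER PRIMARY KEY);
CREATE TABLE category_products (item_id INTEGER, category_id INTEGER, is_deleted INTEGER);
INSERT INTO products VALUES (1), (2), (3);
-- Item 1 is in a category, item 2 was deleted from one, item 3 has none.
INSERT INTO category_products VALUES (1, 7, 0), (2, 8, 1);
""")

# Filter in the ON clause: deleted links simply fail to match, so items 2 and 3
# both come back with NULL category columns.
filter_in_join = conn.execute("""
SELECT p.item_id, cp.category_id
FROM products AS p
LEFT JOIN category_products AS cp
       ON cp.item_id = p.item_id AND cp.is_deleted = 0
ORDER BY p.item_id
""").fetchall()

# Filter in WHERE: rows with NULL or deleted links are discarded entirely.
filter_in_where = conn.execute("""
SELECT p.item_id, cp.category_id
FROM products AS p
LEFT JOIN category_products AS cp ON cp.item_id = p.item_id
WHERE cp.is_deleted = 0
ORDER BY p.item_id
""").fetchall()

print(filter_in_join)
print(filter_in_where)
```

With the filter in the join, deleted and missing category links look identical (NULL), which is what lets a single query replace the two UNION branches.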
