How to make query with multi value parameter run faster? - sql-server

I have the following query that is running very slow:
SELECT
DISTINCT a.Role as Role
FROM
[Table_A] a
JOIN
[Table_B] b ON (a.Key = b.Key)
WHERE
b.Date BETWEEN @StartDate AND @EndDate
AND ISNULL(a.ID, -1) IN (@People)
The values of the variables @StartDate, @EndDate and @People come from parameters in an SSRS report. The date parameters are just dates. The @People parameter is a multi-value parameter.
The problem is that @People contains over 3000 values, so the query has to go through all of them using the IN clause. This really slows down my query when running it in SSRS.
I wanted to use an EXISTS clause to replace the IN clause, but I can't seem to get that to work in this scenario. I'd need to somehow select the values from the @People variable in the EXISTS clause and join them back to the first table, but I don't even know if that is possible.
Perhaps I am going down the wrong direction with this trying to use the EXISTS in this scenario. But I still need to fix the query so it runs faster.
Can anyone help with this?

ISNULL(a.ID, -1) is going to make the query non-SARGable, because wrapping the column in a function prevents an index seek on a.ID. You would be better off using (a.ID IN (@People) OR a.ID IS NULL); however, an IN with that many arguments is unlikely to run well.
I'm going from memory here (I don't have SSRS at home), but if I recall correctly SSRS does some "magic" with multi-value parameters and IN that doesn't scale well. Perhaps you would be better off using an EXISTS and a splitter (such as DelimitedSplit8K). This specific example relies on @People having fewer than 8000 characters.
SELECT DISTINCT a.Role
FROM [Table_A] a
JOIN [Table_B] b ON a.Key = b.Key
WHERE b.Date BETWEEN @StartDate AND @EndDate
AND (EXISTS (SELECT 1
FROM dbo.DelimitedSplit8K(@People,',') DS
WHERE DS.Item = a.ID)
OR a.ID IS NULL);
Considering, however, that ordinal position doesn't matter here, then other splitters are available. For example the XML Splitter.
For completeness, a quickly written XML Splitter Function:
CREATE FUNCTION dbo.XMLSplitter (@DelimitedString varchar(MAX))
RETURNS TABLE AS RETURN
SELECT n.d.value('.','varchar(MAX)') AS Item
FROM (VALUES(CONVERT(xml,'<d>'+ REPLACE(@DelimitedString,',','</d><d>') + '</d>'))) V(X)
CROSS APPLY V.X.nodes('d') n(d);
GO
Added a full example without a function:
SELECT DISTINCT a.Role
FROM [Table_A] a
JOIN [Table_B] b ON a.Key = b.Key
WHERE b.Date BETWEEN @StartDate AND @EndDate
AND (EXISTS (SELECT 1
FROM (VALUES(CONVERT(xml,'<d>'+ REPLACE(@People,',','</d><d>') + '</d>'))) V(X)
CROSS APPLY V.X.nodes('d') n(d)
WHERE n.d.value('.','varchar(MAX)') = a.ID)
OR a.ID IS NULL);
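On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function is another splitter that could be dropped into the same EXISTS pattern. The following is only a sketch under assumptions: the table variables and sample rows are hypothetical stand-ins for Table_A and Table_B, a.ID is assumed to be an int, and @People is assumed to arrive as a single comma-delimited string (in SSRS this is typically done by setting the dataset parameter to an expression such as =Join(Parameters!People.Value, ",")).

```sql
-- Hypothetical sample data standing in for Table_A and Table_B.
DECLARE @Table_A TABLE ([Key] int, ID int NULL, Role varchar(20));
DECLARE @Table_B TABLE ([Key] int, [Date] date);

INSERT INTO @Table_A VALUES (1, 10, 'Admin'), (2, 20, 'Clerk'), (3, NULL, 'Guest');
INSERT INTO @Table_B VALUES (1, '2020-01-05'), (2, '2020-01-06'), (3, '2020-01-07');

-- Stand-ins for the report parameters.
DECLARE @StartDate date = '2020-01-01';
DECLARE @EndDate   date = '2020-01-31';
DECLARE @People    varchar(MAX) = '10,30';  -- multi-value parameter joined into one string

SELECT DISTINCT a.Role
FROM @Table_A a
JOIN @Table_B b ON a.[Key] = b.[Key]
WHERE b.[Date] BETWEEN @StartDate AND @EndDate
  AND (EXISTS (SELECT 1
               FROM STRING_SPLIT(@People, ',') AS ss
               -- TRY_CONVERT returns NULL instead of failing on a non-numeric item
               WHERE TRY_CONVERT(int, ss.value) = a.ID)
       OR a.ID IS NULL);
```

With this sample data the query should return the Admin and Guest roles: ID 10 is in the list, and the NULL ID is kept by the OR a.ID IS NULL branch. Unlike DelimitedSplit8K, STRING_SPLIT accepts varchar(MAX) input, so the 8000-character limit does not apply.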

Related

How do I properly add this query into my existing query within Query Designer?

I currently have the below query written within Query Designer. I asked a question yesterday and it worked on its own but I would like to incorporate it into my existing report.
SELECT Distinct
i.ProductNumber
,i.ProductType
,i.ProductPurchaseDate
,ih.SalesPersonComputerID
,ih.SalesPerson
,ic2.FlaggedComments
FROM [Products] i
LEFT OUTER JOIN
(SELECT Distinct
MIN(c2.Comments) AS FlaggedComments
,c2.SalesKey
FROM [SalesComment] AS c2
WHERE(c2.Comments like 'Flagged*%')
GROUP BY c2.SalesKey) ic2
ON ic2.SalesKey = i.SalesKey
LEFT JOIN [SalesHistory] AS ih
ON ih.SalesKey = i.SalesKey
WHERE
i.SaleDate between @StartDate and @StopDate
AND ih.Status = 'SOLD'
My question yesterday was how to select only the first comment made for each sale. I have a query for selecting the flagged comments, but I want both the first comment and the flagged comment. They would both be pulled from the same table. This was the query provided, and it worked on its own, but I can't figure out how to make it work with my existing query.
SELECT a.DateTimeCommented, a.ProductNumber, a.Comments, a.SalesKey
FROM (
SELECT
DateTimeCommented, ProductNumber, Comments, SalesKey,
ROW_NUMBER() OVER(PARTITION BY ProductNumber ORDER BY DateTimeCommented) as RowN
FROM [SalesComment]
) a
WHERE a.RowN = 1
Thank you so much for your assistance.
You can use a combination of row-numbering and aggregation to get both the Flagged% comments, and the first comment.
You may want to change the PARTITION BY clause to suit.
DISTINCT on the outer query is probably spurious, on the inner query it definitely is, as you have GROUP BY anyway. If you are getting multiple rows, don't just throw DISTINCT at it, instead think about your joins and whether you need aggregation.
The second LEFT JOIN logically becomes an INNER JOIN due to the WHERE predicate. Perhaps that predicate should have been in the ON instead?
SELECT
i.ProductNumber
,i.ProductType
,i.ProductPurchaseDate
,ih.SalesPersonComputerID
,ih.SalesPerson
,ic2.FlaggedComments
,ic2.FirstComments
FROM [Products] i
LEFT OUTER JOIN
(SELECT
MIN(CASE WHEN c2.RowN = 1 THEN c2.Comments END) AS FirstComments
,c2.SalesKey
,MIN(CASE WHEN c2.Comments like 'Flagged*%' THEN c2.Comments END) AS FlaggedComments
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ProductNumber ORDER BY DateTimeCommented) as RowN
FROM [SalesComment]
) AS c2
GROUP BY c2.SalesKey
) ic2 ON ic2.SalesKey = i.SalesKey
JOIN [SalesHistory] AS ih
ON ih.SalesKey = i.SalesKey
WHERE
i.SaleDate between @StartDate and @StopDate
AND ih.Status = 'SOLD'
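To see the row-numbering plus conditional-aggregation idea in isolation, here is a small sketch against a table variable. The sample rows are made up, and the window is partitioned by SalesKey (an assumption, chosen so that the numbering lines up with the GROUP BY); adjust the partition to whatever defines "first comment" in your data.

```sql
-- Hypothetical sample data standing in for [SalesComment].
DECLARE @SalesComment TABLE (
    SalesKey          int,
    DateTimeCommented datetime,
    Comments          varchar(100)
);

INSERT INTO @SalesComment VALUES
    (1, '2020-01-01T09:00:00', 'Customer called'),
    (1, '2020-01-02T10:00:00', 'Flagged* needs review'),
    (2, '2020-01-03T11:00:00', 'Delivered');

SELECT
    c.SalesKey,
    -- Only the row numbered 1 survives the CASE; MIN collapses the group to it.
    MIN(CASE WHEN c.RowN = 1 THEN c.Comments END) AS FirstComments,
    -- Same pattern as the question: the asterisk is matched literally by LIKE.
    MIN(CASE WHEN c.Comments LIKE 'Flagged*%' THEN c.Comments END) AS FlaggedComments
FROM (
    SELECT SalesKey, Comments,
           ROW_NUMBER() OVER (PARTITION BY SalesKey ORDER BY DateTimeCommented) AS RowN
    FROM @SalesComment
) AS c
GROUP BY c.SalesKey;
```

For SalesKey 1 this should give 'Customer called' as the first comment and 'Flagged* needs review' as the flagged one; SalesKey 2 has no flagged comment, so that column is NULL.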

SQL Server Aggregate Subquery Error for Condition Referencing Parent Query

I'm trying to add an aggregate sub query in select statement for the following code.
DECLARE @Date1 Date
DECLARE @Date2 Date
SET @Date1 = '2017-01-01'
SET @Date2 = '2017-03-01'
SELECT
p.PracticeName [Practice Name],
dbo.getFormattedName(l.Userid) [User Name],
MAX(EventDate) [Last Activity],
COUNT(*) [Activity Count],
(SELECT COUNT(*)
FROM UserEvent EVT (NOLOCK)
WHERE EVT.EventTypeID = 1
AND EVT.UserID = au.userID
AND EVT.EventDate >= @Date1
AND EVT.EventDate <= DATEADD(DAY, 1, @Date2)
GROUP BY
au.userID) [Login Count]
FROM
dbo.AudLog l (NOLOCK)
JOIN
Appuser au (NOLOCK) ON l.UserID = au.UserID
JOIN
Practice p (NOLOCK) ON au.PracticeID = p.PracticeID
WHERE
l.EnvironmentID = 1
AND EventDate >= @Date1
AND EventDate <= DATEADD(DAY,1,@Date2)
GROUP BY
p.PracticeName,
dbo.getFormattedName(l.Userid)
ORDER BY
p.PracticeName,
dbo.getFormattedName(l.Userid)
I'm getting the following error:
Column 'Appuser.UserID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I don't understand why that error applies to my sub query because I'm not selecting AppUser.UserID, I'm just using as a reference for a condition to align the sub query with the parent query. Also, it is indeed in the GROUP BY statement within the sub query.
I referenced this question but based on the explanation I would think my query would be working.
Any help appreciated.
The question you are referencing is different because there are no aggregate functions in the outer query that require a GROUP BY.
When you use a correlated subquery as you have (EVT.UserID = au.userID), the outer column it references, in this case au.userID, is treated as part of the outer select list (because the subquery needs it), so it has to be covered by the outer GROUP BY.
The simple fix is to add the au.userID column to your outer group by.
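A stripped-down sketch of that fix, using hypothetical table variables in place of AudLog and UserEvent: the correlated column (l.UserID here) appears in the outer GROUP BY. The subquery's own GROUP BY is left out in this sketch, since a correlated COUNT(*) already returns a single value per outer row.

```sql
-- Hypothetical sample data; the table and column names only mirror the question.
DECLARE @AudLog    TABLE (UserID int, EventDate date);
DECLARE @UserEvent TABLE (UserID int, EventTypeID int, EventDate date);

INSERT INTO @AudLog    VALUES (1, '2017-01-05'), (1, '2017-01-06'), (2, '2017-02-01');
INSERT INTO @UserEvent VALUES (1, 1, '2017-01-05'), (1, 2, '2017-01-05'),
                              (2, 1, '2017-02-01'), (2, 1, '2017-02-02');

SELECT
    l.UserID,
    MAX(l.EventDate) AS LastActivity,
    COUNT(*)         AS ActivityCount,
    (SELECT COUNT(*)
     FROM @UserEvent evt
     WHERE evt.EventTypeID = 1
       AND evt.UserID = l.UserID) AS LoginCount   -- correlated on a grouped column
FROM @AudLog l
GROUP BY l.UserID;   -- the correlated column must be listed here
```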
If you join on a field it is present in the intermediary data set that is part of the join. Even if it is not in your select list you need to consider it as part of the list.
If you really think it is a non-issue then you can add it to your group by clause and still get the results you desire.
The error is not referencing your sub query.
The following line dbo.ch_getFormattedProviderName(l.Userid) in the select clause of your main query is the problem.
You are using the function call dbo.ch_getFormattedProviderName(l.Userid) in your select statement but you are grouping by a different function dbo.getFormattedName(l.Userid) in your group by clause.
Change the function dbo.ch_getFormattedProviderName to dbo.getFormattedName or vice versa.
This is not an answer to your question, but there are several things I would like to mention about your query.
Avoid scalar functions; try not to use them at all. There are almost no good use cases for them. If you wish to encapsulate the logic, use table-valued functions instead. Scalar functions are performance killers, especially if you need to filter by them: their results are not indexed, and they execute on every single row.
Personally, I have never seen a situation where such inline subqueries outperform a normal JOIN, but I have seen many where the performance was worse. You can always rewrite such a query with JOIN (SELECT ...).
NOLOCK hints are not recommended. If you have locking problems, solve them with indexing or redesign techniques. It is better to fix the root of the problem than to apply workarounds.
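As a rough illustration of the JOIN (SELECT ...) rewrite, the login count can be pre-aggregated once in a derived table and joined in, rather than evaluated per row. The table variables and rows below are hypothetical, and whether this is actually faster depends on your data and indexes.

```sql
-- Hypothetical sample data; names only mirror the question.
DECLARE @AudLog    TABLE (UserID int, EventDate date);
DECLARE @UserEvent TABLE (UserID int, EventTypeID int, EventDate date);

INSERT INTO @AudLog    VALUES (1, '2017-01-05'), (1, '2017-01-06'), (2, '2017-02-01');
INSERT INTO @UserEvent VALUES (1, 1, '2017-01-05'), (2, 1, '2017-02-01'), (2, 1, '2017-02-02');

SELECT
    act.UserID,
    act.LastActivity,
    act.ActivityCount,
    ISNULL(lg.LoginCount, 0) AS LoginCount
FROM (
    -- Activity aggregated once per user.
    SELECT UserID, MAX(EventDate) AS LastActivity, COUNT(*) AS ActivityCount
    FROM @AudLog
    GROUP BY UserID
) AS act
LEFT JOIN (
    -- Logins aggregated once per user, instead of once per outer row.
    SELECT UserID, COUNT(*) AS LoginCount
    FROM @UserEvent
    WHERE EventTypeID = 1
    GROUP BY UserID
) AS lg ON lg.UserID = act.UserID;
```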

The multi-part identifier "Registry.categoryID" could not be bound [duplicate]

I've seen similar errors on SO, but I don't find a solution for my problem.
I have a SQL query like:
SELECT DISTINCT
a.maxa ,
b.mahuyen ,
a.tenxa ,
b.tenhuyen ,
ISNULL(dkcd.tong, 0) AS tongdkcd
FROM phuongxa a ,
quanhuyen b
LEFT OUTER JOIN ( SELECT maxa ,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(DATETIME, ngaylap, 103) BETWEEN 'Sep 1 2011'
AND
'Sep 5 2011'
GROUP BY maxa
) AS dkcd ON dkcd.maxa = a.maxa
WHERE a.maxa <> '99'
AND LEFT(a.maxa, 2) = b.mahuyen
ORDER BY maxa;
When I execute this query, the error result is:
The multi-part identifier "a.maxa" could not be bound. Why?
P.S.: If I split the query into two individual queries, they run OK.
SELECT DISTINCT
a.maxa ,
b.mahuyen ,
a.tenxa ,
b.tenhuyen
FROM phuongxa a ,
quanhuyen b
WHERE a.maxa <> '99'
AND LEFT(a.maxa, 2) = b.mahuyen
ORDER BY maxa;
and
SELECT maxa ,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(DATETIME, ngaylap, 103) BETWEEN 'Sep 1 2011'
AND 'Sep 5 2011'
GROUP BY maxa;
You are mixing implicit joins with explicit joins. That is allowed, but you need to be aware of how to do that properly.
The thing is, explicit joins (the ones that are implemented using the JOIN keyword) take precedence over implicit ones (the 'comma' joins, where the join condition is specified in the WHERE clause).
Here's an outline of your query:
SELECT
…
FROM a, b LEFT JOIN dkcd ON …
WHERE …
You are probably expecting it to behave like this:
SELECT
…
FROM (a, b) LEFT JOIN dkcd ON …
WHERE …
that is, the combination of tables a and b is joined with the table dkcd. In fact, what's happening is
SELECT
…
FROM a, (b LEFT JOIN dkcd ON …)
WHERE …
that is, as you may already have understood, dkcd is joined specifically against b and only b, then the result of the join is combined with a and filtered further with the WHERE clause. In this case, any reference to a in the ON clause is invalid, a is unknown at that point. That is why you are getting the error message.
If I were you, I would probably try to rewrite this query, and one possible solution might be:
SELECT DISTINCT
a.maxa,
b.mahuyen,
a.tenxa,
b.tenhuyen,
ISNULL(dkcd.tong, 0) AS tongdkcd
FROM phuongxa a
INNER JOIN quanhuyen b ON LEFT(a.maxa, 2) = b.mahuyen
LEFT OUTER JOIN (
SELECT
maxa,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(datetime, ngaylap, 103) BETWEEN 'Sep 1 2011' AND 'Sep 5 2011'
GROUP BY maxa
) AS dkcd ON dkcd.maxa = a.maxa
WHERE a.maxa <> '99'
ORDER BY a.maxa
Here the tables a and b are joined first, then the result is joined to dkcd. Basically, this is the same query as yours, only using a different syntax for one of the joins, which makes a great difference: the reference a.maxa in the dkcd's join condition is now absolutely valid.
As @Aaron Bertrand has correctly noted, you should probably qualify maxa with a specific alias, probably a, in the ORDER BY clause.
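The precedence rule can be reproduced with three tiny table variables. This is only a sketch with made-up tables; the failing form is left in a comment so that the script still runs. Writing the comma as an explicit CROSS JOIN is another way to get left-to-right evaluation.

```sql
-- Hypothetical tables, just to demonstrate how the joins are grouped.
DECLARE @a TABLE (id int);
DECLARE @b TABLE (id int);
DECLARE @c TABLE (id int);

INSERT INTO @a VALUES (1), (2);
INSERT INTO @b VALUES (10);
INSERT INTO @c VALUES (1);

-- Fails with: The multi-part identifier "a.id" could not be bound,
-- because it is grouped as  @a a, (@b b LEFT JOIN @c c ON c.id = a.id)
--
-- SELECT a.id, b.id, c.id
-- FROM @a a, @b b
-- LEFT JOIN @c c ON c.id = a.id;

-- Works: every join is explicit, so they are evaluated left to right
-- and a is already in scope when the ON clause is bound.
SELECT a.id AS a_id, b.id AS b_id, c.id AS c_id
FROM @a a
CROSS JOIN @b b
LEFT JOIN @c c ON c.id = a.id;
```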
Sometimes this error occurs when you use the schema name (dbo) in your query the wrong way.
For example, if you write:
select dbo.prd.name
from dbo.product prd
you will get the error.
In this situation, change it to:
select prd.name
from dbo.product prd
If you have given the tables aliases, refer to them by the alias rather than by the full table name.
For example:
SELECT
A.name,A.date
FROM [LoginInfo].[dbo].[TableA] as A
join
[LoginInfo].[dbo].[TableB] as B
on [LoginInfo].[dbo].[TableA].name=[LoginInfo].[dbo].[TableB].name;
Change that to:
SELECT
A.name,A.date
FROM [LoginInfo].[dbo].[TableA] as A
join
[LoginInfo].[dbo].[TableB] as B
on A.name=B.name;
I was struggling with the same error message in SQL SERVER, since I had multiple joins, changing the order of the joins solved it for me.
In my case the issue turned out to be the alias name I had given to the table. "oa" seems to be not acceptable for SQL Server.
What worked for me was to change my WHERE clause into a SELECT subquery
FROM:
DELETE FROM CommentTag WHERE [dbo].CommentTag.NoteId = [dbo].FetchedTagTransferData.IssueId
TO:
DELETE FROM CommentTag WHERE [dbo].CommentTag.NoteId = (SELECT NoteId FROM FetchedTagTransferData)
I was having the same error from JDBC. Checked everything and my query was fine. Turned out, in where clause I have an argument:
where s.some_column = ?
And the value of the argument I was passing in was null. This also gives the same error, which is misleading: when you search the internet you are led to believe that something is wrong with the query structure, but that was not so in my case. I just thought someone else might face the same issue.
I'm new to SQL, but came across this issue in a course I was taking and found that assigning the query to the project specifically helped to eliminate the multi-part error. For example the project I created was CTU SQL Project so I made sure I started my script with USE [CTU SQL Project] as my first line like below.
USE [CTU SQL Project]
SELECT Advisors.First_Name, Advisors.Last_Name...and so on.
If this error happens in an UPDATE, double-check the JOIN on the table with the column/field that is causing the error.
In my case this was due to the lack of the JOIN itself, which generated the same error due to an unknown field (as Andriy pointed out).
Instead you can try joining tables like,
select
....
from
dkcd
right join
a
, b
This should work
SELECT DISTINCT
phuongxa.maxa ,
quanhuyen.mahuyen ,
phuongxa.tenxa ,
quanhuyen.tenhuyen ,
ISNULL(dkcd.tong, 0) AS tongdkcd
FROM phuongxa ,
quanhuyen
LEFT OUTER JOIN ( SELECT khaosat.maxa ,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(DATETIME, ngaylap, 103) BETWEEN 'Sep 1 2011'
AND
'Sep 5 2011'
GROUP BY khaosat.maxa
) AS dkcd ON dkcd.maxa = maxa
WHERE phuongxa.maxa <> '99'
AND LEFT(phuongxa.maxa, 2) = quanhuyen.mahuyen
ORDER BY maxa;
My error was using a field that did not exist in the table.
table1.field1 => does not exist
table2.field1 => is correct
Correct your table name.
my error occurred because of using WITH
WITH RCTE AS (
SELECT...
)
SELECT RCTE.Name, ...
FROM
RCTE INNER JOIN Customer
ON RCTE.CustomerID = Customer.ID
when used in join with other tables ...
Did you forget to join some tables? If not then you probably need to use some aliases.
I was also struggling with this error and ended up with the same strategy as the answer. I am including my answer just to confirm that this is a strategy that should work.
Here is an example where I first do an inner join between two tables that I know contain data, and then two left outer joins to tables whose corresponding rows may be missing. You mix inner joins and outer joins to get results with data across tables, instead of using the default comma-separated syntax between tables and missing out on rows in your desired join.
use somedatabase
go
select o.operationid, o.operatingdate, p.pasid, p.name as patientname, o.operationalunitid, f.name as operasjonsprogram, o.theaterid as stueid, t.name as stuenavn, o.status as operasjonsstatus from operation o
inner join patient p on o.operationid = p.operationid
left outer join freshorganizationalunit f on f.freshorganizationalunitid = o.operationalunitid
left outer join theater t on t.theaterid = o.theaterid
where (p.Name like '%Male[0-9]%' or p.Name like '%KFemale [0-9]%')
First: do the inner joins between the tables you expect to have matching data.
Second: continue with outer joins to try to retrieve data from the other tables. These will not filter rows out of your result set if the outer-joined table has no corresponding data or no match on the condition in the ON predicate.
This error can also be caused by simply missing a comma , between the column names in the SELECT statement.
eg:
SELECT MyCol1, MyCol2 MyCol3 FROM SomeTable;
For me the issue was that I was stupidly calling a DB function without the empty brackets: select [apo].[GenerateNationalIdFrance] instead of select [apo].[GenerateNationalIdFrance](). It took me a few minutes to realize that, but it is worth mentioning for juniors out there :-)
For me, I was using misspelled aliases; it worked after I corrected the spellings.

Joining a calculated field to a field in another table

I have created a variable table called Table_A which has two columns, Age and Age_Range. The Datatype for Age is integer.
The next stage is a select statement where I’m pulling the Order_Number and a calculated field from Table_B. I want to join the calculated field from Table_B with Age from Table_A, so that I can see what the range is against the calculated field and its order number.
My first attempt was:
SELECT Order_Number, DATEDIFF(DAY,Order_Date,CAST(GETDATE()AS DATE)) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON Table_B.Ageing = Table_A.Age_Range
This didn’t work and I understand why. Usually in Access, I would just build the first query with the calculated field and then build the second query joining the calculated field with the desired field from the table. I’ve been looking at sub queries and derived tables, which I believe may solve my problem, but I’m not having any luck. I know this is a basic question, but I’ve just started out with SQL.
Thanks
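You cannot join like that because the SELECT list is logically evaluated after the FROM and JOIN clauses, so the alias Ageing does not exist yet when the ON condition is checked.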
You can read about it here: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/70efeffe-76b9-4b7e-b4a1-ba53f5d21916/order-of-execution-of-sql-queries
You can make a workaround using CROSS APPLY
SELECT Order_Number
, T.Ageing
, A.Age_Range
FROM Table_B AS B
CROSS APPLY (SELECT DATEDIFF(DAY, B.Order_Date, GETDATE())) AS T(Ageing)
LEFT JOIN Table_A AS A
ON T.Ageing = A.Age_Range
If the beauty of the code is not necessary:
SELECT Order_Number, DATEDIFF(DAY,Order_Date,CAST(GETDATE()AS DATE)) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON DATEDIFF(DAY,Order_Date,CAST(GETDATE()AS DATE)) = Table_A.Age_Range
Otherwise use CROSS APPLY as already suggested (performance will be the same). By the way, you do not need to CAST getdate() to date, DATEDIFF will work without that, so you can easily write like that:
SELECT Order_Number, DATEDIFF(DAY,Order_Date,GETDATE()) AS Ageing, Age_Range
FROM Table_B LEFT JOIN Table_A ON DATEDIFF(DAY,Order_Date,GETDATE()) = Table_A.Age_Range
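Since the question mentions derived tables, here is a sketch of that variant, which is the closest equivalent to the two-query approach in Access: the inner query computes Ageing, and the outer query joins on it. The table variables and rows are hypothetical, and the join is on Age (the integer column), as the question describes wanting, rather than on Age_Range.

```sql
-- Hypothetical sample data standing in for Table_A and Table_B.
DECLARE @Table_A TABLE (Age int, Age_Range varchar(20));
DECLARE @Table_B TABLE (Order_Number int, Order_Date date);

INSERT INTO @Table_A VALUES (0, '0-1 days'), (1, '0-1 days'), (2, '2+ days');
INSERT INTO @Table_B VALUES
    (100, DATEADD(DAY, -1, CAST(GETDATE() AS date))),   -- one day old
    (101, DATEADD(DAY, -5, CAST(GETDATE() AS date)));   -- five days old, no matching Age row

SELECT d.Order_Number, d.Ageing, a.Age_Range
FROM (
    -- "First query": compute the calculated field once and give it a name.
    SELECT Order_Number,
           DATEDIFF(DAY, Order_Date, GETDATE()) AS Ageing
    FROM @Table_B
) AS d
LEFT JOIN @Table_A AS a
    ON a.Age = d.Ageing;   -- "second query": join on the named calculated field
```

Order 101 has no matching Age row in this sample, so the LEFT JOIN keeps it with a NULL Age_Range.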

Want to Avoid Sorting Full-Text Search Results

I'm using the following SQL Server query, which searches a full-text index and appears to be working correctly. Some additional work is included so the query works with paging.
However, my understanding is that full-text searches return results sorted according to ranking, which would be nice.
But I get an error if I remove the OVER clause near the top. Can anyone tell me how this query could be modified so that it does not re-sort the results?
DECLARE @StartRow int;
DECLARE @MaxRows int;
SET @StartRow = 0;
SET @MaxRows = 10;
WITH ArtTemp AS
(SELECT TOP (@StartRow + @MaxRows) ROW_NUMBER() OVER (ORDER BY ArtViews DESC) AS RowID,
Article.ArtID,Article.ArtTitle,Article.ArtSlug,Category.CatID,Category.CatTitle,
Article.ArtDescription,Article.ArtCreated,Article.ArtUpdated,Article.ArtUserID,
[User].UsrDisplayName AS UserName
FROM Article
INNER JOIN Subcategory ON Article.ArtSubcategoryID = Subcategory.SubID
INNER JOIN Category ON Subcategory.SubCatID = Category.CatID
INNER JOIN [User] ON Article.ArtUserID = [User].UsrID
WHERE CONTAINS(Article.*,'FORMSOF(INFLECTIONAL,"htmltag")'))
SELECT ArtID,ArtTitle,ArtSlug,CatID,CatTitle,ArtDescription,ArtCreated,ArtUpdated,
ArtUserID,UserName
FROM ArtTemp
WHERE RowID BETWEEN @StartRow + 1 AND (@StartRow + @MaxRows)
ORDER BY RowID
Thanks.
I'm really not an expert in FTS but hopefully this helps get you started.
First, ROW_NUMBER requires OVER (ORDER BY xxx) in SQL Server. Even if you tell it to order by a constant value, it still might end up rearranging the results. So, if you depend on row numbering to handle your pagination, you're stuck with some kind of sorting.
When I dig around on FTS for that "return results sorted according to ranking" bit, I find a couple articles that describe ordering by rank. In a nutshell, they say that RANK is a column explicitly returned by CONTAINSTABLE. So if you can't find a way to dig out the results ranking from CONTAINS, you might try joining against CONTAINSTABLE instead and use the RANK column explicitly as your order by value with ROW_NUMBER. Example (syntax may be a little off):
SELECT TOP (@StartRow + @MaxRows)
ROW_NUMBER() OVER (ORDER BY MyFTS.RANK DESC) AS RowID,
Article.ArtID,Article.ArtTitle,Article.ArtSlug,Category.CatID,Category.CatTitle,
Article.ArtDescription,Article.ArtCreated,Article.ArtUpdated,Article.ArtUserID,
[User].UsrDisplayName AS UserName
FROM Article
INNER JOIN Subcategory ON Article.ArtSubcategoryID = Subcategory.SubID
INNER JOIN Category ON Subcategory.SubCatID = Category.CatID
INNER JOIN [User] ON Article.ArtUserID = [User].UsrID
INNER JOIN CONTAINSTABLE(Article, *, 'FORMSOF(INFLECTIONAL,"htmltag")') AS MyFTS
ON Article.ArtID = MyFTS.[KEY] -- assumes ArtID is the full-text key column of Article
The end result is that you're still sorting, but you're doing so on your rankings.
Also, the MSDN page says that CONTAINSTABLE has an ability to limit results on a TOP N basis, too. Maybe this would also be of use to you.
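If you are on SQL Server 2012 or later, one alternative to ROW_NUMBER for paging is OFFSET ... FETCH, ordered directly on the rank that CONTAINSTABLE returns. This is a swapped-in technique, not what the question's query does, and the sketch below is untested here: it needs the Article table and its full-text index from the question, and it assumes ArtID is that index's unique key column.

```sql
-- Sketch only: requires the question's Article table with a full-text index.
-- Assumes ArtID is the full-text key column and SQL Server 2012+ (OFFSET/FETCH).
DECLARE @StartRow int = 0;
DECLARE @MaxRows  int = 10;

SELECT a.ArtID, a.ArtTitle, a.ArtSlug, ft.[RANK] AS FtsRank
FROM Article AS a
INNER JOIN CONTAINSTABLE(Article, *, 'FORMSOF(INFLECTIONAL,"htmltag")') AS ft
    ON ft.[KEY] = a.ArtID
-- Highest-ranked matches first; ArtID is a tie-breaker so paging is stable.
ORDER BY ft.[RANK] DESC, a.ArtID
OFFSET @StartRow ROWS FETCH NEXT @MaxRows ROWS ONLY;
```

This still sorts, but only on the full-text rank, and the paging window comes straight from @StartRow and @MaxRows without a separate row-number column.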

Resources