SQL Server: Selecting Value That was met in the Where Criteria - sql-server

My query selects records where given item types were ordered and I would like it to return a column that has the value for the criteria which a given record has met.
For example (since the above explanation is probably confusing):
DECLARE @Item1 VARCHAR(20) = 'Red Shoes',
@Item2 VARCHAR(20) = 'Brown Belt',
@Item3 VARCHAR(20) = 'Blue Shoes',
@Item4 VARCHAR(20) = 'Black Belt'
SELECT DISTINCT ord.Order_number,
ord.Item_number,
ord.Item_type,
ord.Item_desc,
link.Item_number AS linked_item_number
FROM Ordertbl ord
LEFT JOIN Item_tables link
ON link.item_number = ord.item_number
WHERE link.Item_number IN (@Item1,@Item2,@Item3,@Item4) AND
ord.Item_number NOT IN (@Item1,@Item2,@Item3,@Item4)
Desired Outcome: All items that were ordered whenever Item1,2,3, or 4 were ordered and, for each record, a field that depicts what item (1,2,3, or 4) was the source for that record being returned.
Using multiple Union queries with where criteria set to a single item provides the desired outcome if I set the linked_item_number field to the queried item, but that method is less than ideal because, at times, large numbers of items may be queried.

Edited: I've updated my answer a bit and expanded on a few areas using my best guesses; hopefully they help illuminate the points I'm making.
Using NOT IN in a WHERE clause can hurt performance, especially as the list of values grows. It is often better to convert your filters into tables, and then JOIN them to your order table.
But before we get to that, let's make a few schema (DDL) assumptions here that will help keep things clear. Let's say you have two tables like the following:
CREATE TABLE Ordertbl
(
Order_number INT
,Item_number INT
--you might have more columns in your table
)
CREATE TABLE Item_tables
(
Item_number INT
,Item_type INT
,Item_desc VARCHAR(20)
--again, you might have more columns in your table
)
I'm also going to assume that the details about an item are in Item_tables and not in Ordertbl, because that makes the most sense for a database design to me.
My original answer had the following block of text next:
In this scenario, you'd need two additional tables, one for the list
of Items in where Item_number in (@Item1,@Item2,@Item3,@Item4) which
would have the corresponding Item_numbers. The other table would be
the list of Subjs in Item_number Not in
(@Subj1,@Subj2,@Subj3,@Subj4,@Subj5,@Subj6,@Subj7), again, including
the Item_number for those Subj records.
The question has been updated so that the WHERE clause is different than it was when I wrote the original version of my answer. The design pattern still applies here, even if the variables being used are different.
So let's create a temp table to hold all of our "triggering" items, and then populate it.
DECLARE @Item1 VARCHAR(20) = 'Red Shoes',
@Item2 VARCHAR(20) = 'Brown Belt',
@Item3 VARCHAR(20) = 'Blue Shoes',
@Item4 VARCHAR(20) = 'Black Belt'
CREATE TABLE #TriggeringItems
(
Item_desc VARCHAR(20)
)
INSERT INTO #TriggeringItems
(
Item_desc
)
SELECT @Item1
UNION
SELECT @Item2
UNION
SELECT @Item3
UNION
SELECT @Item4
If you had more filter variables to add, you could keep UNIONing them onto the INSERT.
So now we have our temp table and we can filter our query. Great!
...right?
Well, not quite. Our input parameters are descriptions, and our foreign key is an INT, which means we'll have to do a few extra joins to get our key values into the query. But the general idea is that you'd use an INNER JOIN to replace WHERE ... IN ..., and a LEFT JOIN to replace the WHERE ... NOT IN ... (adding a WHERE X IS NULL clause, where X is the key of the LEFT JOINed table)
If you didn't care about getting the triggering items back in your SELECT, then you could just go ahead with replacing the WHERE ... IN ... with an INNER JOIN, and that would be the end of it. But let's say you only wanted the list of items that WEREN'T the triggering items. For that, you would need to join Ordertbl to itself, to get the list of Order_numbers with triggering items within them. Then you could INNER JOIN one side to the temp table, and LEFT JOIN the other half. I know my explanation might be hard to follow, so let me show you what I mean in code:
SELECT DISTINCT onums.Order_number,
orditems.Item_number,
orditems.Item_type,
orditems.Item_desc,
tinums.Item_number AS linked_item_number
FROM #TriggeringItems ti
INNER JOIN Item_tables tinums ON ti.Item_desc = tinums.Item_desc
INNER JOIN Ordertbl onums ON tinums.Item_number = onums.Item_number
INNER JOIN Ordertbl ord ON onums.Order_number = ord.Order_number
INNER JOIN Item_tables orditems ON ord.Item_number = orditems.Item_number
LEFT JOIN #TriggeringItems excl ON orditems.Item_desc = excl.Item_desc
WHERE excl.Item_desc IS NULL
onums is our list of order numbers, but ord is where we're going to pull our items from. But we only want to return items that aren't triggers, so we LEFT JOIN our temp table at the end and add the WHERE excl.Item_desc IS NULL.
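To make the join pattern above easier to try out, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, the temp table is modeled as an ordinary table called TriggeringItems, and the sample rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item_tables (Item_number INTEGER PRIMARY KEY, Item_desc TEXT);
CREATE TABLE Ordertbl (Order_number INTEGER, Item_number INTEGER);
CREATE TABLE TriggeringItems (Item_desc TEXT);  -- stand-in for #TriggeringItems

INSERT INTO Item_tables VALUES
    (1, 'Red Shoes'), (2, 'Brown Belt'), (3, 'White Socks'), (4, 'Green Hat');
-- Orders 100 and 300 each contain a triggering item; order 200 does not.
INSERT INTO Ordertbl VALUES (100, 1), (100, 3), (200, 4), (300, 2), (300, 4);
INSERT INTO TriggeringItems VALUES ('Red Shoes'), ('Brown Belt');
""")

rows = conn.execute("""
SELECT DISTINCT onums.Order_number,
       orditems.Item_desc,
       tinums.Item_desc AS linked_item
FROM TriggeringItems ti
INNER JOIN Item_tables tinums ON ti.Item_desc = tinums.Item_desc
INNER JOIN Ordertbl onums ON tinums.Item_number = onums.Item_number
INNER JOIN Ordertbl ord ON onums.Order_number = ord.Order_number
INNER JOIN Item_tables orditems ON ord.Item_number = orditems.Item_number
-- anti-join: keep only items that are NOT themselves triggers
LEFT JOIN TriggeringItems excl ON orditems.Item_desc = excl.Item_desc
WHERE excl.Item_desc IS NULL
ORDER BY onums.Order_number
""").fetchall()

for row in rows:
    print(row)
```

Each returned row is a non-trigger item together with the triggering item that caused its order to be picked up; order 200 never appears because nothing in it is a trigger.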

Related

Returning Grouped Data with Extra Row for each group in SQL Server

I have tables Composition, CompositionDetails and Tracks
Composition holds composition Name while compositionDetails is for mapping tracks with composition. One composition can have multiple tracks.
Table Structures are like this:
Composition table - CompositionId, CompositionName
Tracks table - TrackId, TrackName
CompositionDetails - CompositionId (FK), TrackId (FK)
Now with My query I am able to achieve this:
But I want this:
I mean one extra row above each group of composition.
I achieved it by creating temp tables and looping to insert the extra row. But with millions of rows, it is very slow.
Any suggestions on how I can achieve this without creating a temp table and looping to insert new rows?
Select Componame,TrackTitle from
(
Select Componame,TrackTitle,Componame as h,1 as Sort
from Composition
UNION
Select Componame,Componame,MIN(Componame),0 as Sort
from Composition
group by Componame
) a
Order by h,Sort,Componame,TrackTitle
Try this
SELECT c2.CompositionName, c2.CompositionName, c2.CompositionId
FROM Composition c2
UNION
SELECT c.CompositionName, t.TrackName, cd.CompositionId
FROM CompositionDetails cd join Composition c ON cd.CompositionId = c.CompositionId
JOIN Tracks t on t.Trackid = cd.Trackid
Assuming the CompositionDetails table doesn't have rows with NULLs in the TrackId column:
1. Using UNION, add one blank (NULL) TrackId per CompositionId into CompositionDetails.
2. Inner-join the resulting set to Composition.
3. Left-join the result of the inner join with Tracks. Obviously, this will yield NULL track names for the rows added in step 1.
4. In the ORDER BY clause, sort by CompositionName first, then by TrackName. (In Transact-SQL, NULLs go before any values. Therefore, rows with blank track names will go before the others in their respective groups.)
5. In the SELECT clause, use either COALESCE or ISNULL to replace blank track names with composition names.
SELECT
c.CompositionName,
COALESCE(t.TrackName, c.CompositionName)
FROM
Composition AS c
INNER JOIN (
SELECT CompositionId, TrackId FROM CompositionDetails
UNION ALL
SELECT DISTINCT CompositionId, NULL FROM CompositionDetails
) AS d ON c.CompositionId = d.CompositionId
LEFT JOIN Tracks AS t ON d.TrackId = t.TrackId
ORDER BY
c.CompositionName,
t.TrackName
;
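As a quick check of the header-row approach above, the following sketch runs the same query against an in-memory SQLite database from Python. SQLite is used only as a convenient stand-in (it also sorts NULLs first in ascending order), the sample data is made up, and the DisplayName alias is an addition for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Composition (CompositionId INTEGER PRIMARY KEY, CompositionName TEXT);
CREATE TABLE Tracks (TrackId INTEGER PRIMARY KEY, TrackName TEXT);
CREATE TABLE CompositionDetails (CompositionId INTEGER, TrackId INTEGER);
INSERT INTO Composition VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO Tracks VALUES (10, 'Intro'), (11, 'Outro'), (12, 'Theme');
INSERT INTO CompositionDetails VALUES (1, 10), (1, 11), (2, 12);
""")

rows = conn.execute("""
SELECT c.CompositionName,
       COALESCE(t.TrackName, c.CompositionName) AS DisplayName
FROM Composition AS c
INNER JOIN (
    SELECT CompositionId, TrackId FROM CompositionDetails
    UNION ALL
    -- one extra row per composition with a NULL track: the future header row
    SELECT DISTINCT CompositionId, NULL FROM CompositionDetails
) AS d ON c.CompositionId = d.CompositionId
LEFT JOIN Tracks AS t ON d.TrackId = t.TrackId
ORDER BY c.CompositionName, t.TrackName  -- NULL track names sort first
""").fetchall()

for row in rows:
    print(row)
```

The header row for each composition comes out first in its group, followed by that composition's tracks in name order.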

SQL WHERE NOT EXISTS (skip duplicates)

Hello I'm struggling to get the query below right. What I want is to return rows with unique names and surnames. What I get is all rows with duplicates
This is my sql
DECLARE @tmp AS TABLE (Name VARCHAR(100), Surname VARCHAR(100))
INSERT INTO @tmp
SELECT CustomerName,CustomerSurname FROM Customers
WHERE
NOT EXISTS
(SELECT Name,Surname
FROM @tmp
WHERE Name=CustomerName
AND Surname=CustomerSurname
GROUP BY Name,Surname )
Please can someone point me in the right direction here.
//Desperate (I tried without GROUP BY as well but get same result)
DISTINCT would do the trick.
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
If you only want the records that really don't have duplicates (as opposed to getting duplicates represented as a single record) you could use GROUP BY and HAVING:
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1
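The difference between the two queries above is easy to see on a tiny dataset. This is a small sketch using Python's built-in sqlite3 module in place of SQL Server, with an assumed Customers table holding one duplicated name pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    Id INTEGER PRIMARY KEY,
    CustomerName TEXT,
    CustomerSurname TEXT
);
INSERT INTO Customers (CustomerName, CustomerSurname) VALUES
    ('Ann', 'Lee'), ('Ann', 'Lee'), ('Bob', 'Ray');
""")

# DISTINCT collapses duplicates into a single row per name pair.
collapsed = conn.execute("""
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
ORDER BY CustomerName, CustomerSurname
""").fetchall()

# GROUP BY ... HAVING COUNT(*) = 1 keeps only pairs that occur exactly once.
singletons = conn.execute("""
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1
ORDER BY CustomerName, CustomerSurname
""").fetchall()

print(collapsed)
print(singletons)
```

With this data, DISTINCT returns both Ann Lee and Bob Ray once each, while the HAVING version returns only Bob Ray.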
At first, I thought that @David's answer was what you wanted. But rereading your comments, perhaps you want all combinations of Names and Surnames:
SELECT n.CustomerName, s.CustomerSurname
FROM
( SELECT DISTINCT CustomerName
FROM Customers
) AS n
CROSS JOIN
( SELECT DISTINCT CustomerSurname
FROM Customers
) AS s ;
Are you doing that while your @tmp table is still empty?
If so: your entire "select" is fully evaluated before the "insert" statement, it doesn't do "run the query and add one row, insert the row, run the query and get another row, insert the row, etc."
If you want to insert unique Customers only, use that same "Customer" table in your not exists clause
SELECT c.CustomerName,c.CustomerSurname FROM Customers c
WHERE
NOT EXISTS
(SELECT 1
FROM Customers c1
WHERE c.CustomerName = c1.CustomerName
AND c.CustomerSurname = c1.CustomerSurname
AND c.Id <> c1.Id)
If you want to insert a unique set of customers, use "distinct"
Typically, if you're doing a WHERE NOT EXISTS or WHERE EXISTS, or WHERE NOT IN subquery,
you should use what is called a "correlated subquery", as in ypercube's answer above, where table aliases are used for both inside and outside tables (where inside table is joined to outside table). ypercube gave a good example.
And often, NOT EXISTS is preferred over NOT IN (unless the WHERE NOT IN is selecting from a totally unrelated table that you can't join on.)
Sometimes, if you're tempted to do a WHERE EXISTS (SELECT from a small table with no duplicate values in the column), you could also do the same thing by joining the main query to that table on the column you want in the EXISTS. That is not always the best or safest solution: it might make the query slower if there are many rows in that table, and it could produce many duplicate rows if there are duplicate values for that column in the joined table, in which case you'd have to add DISTINCT to the main query, which causes it to SORT the data on all columns. Not efficient at all.
Similarly, a WHERE NOT EXISTS correlated subquery (or a WHERE NOT IN, as long as no NULLs are involved) can be rewritten, often with the same execution plan, as a LEFT OUTER JOIN to the table you were going to subquery plus a WHERE <joined column> IS NULL.
You have to be careful using that, but you don't need a DISTINCT. Frankly, I prefer to use the WHERE NOT IN subqueries or NOT EXISTS correlated subqueries, because the syntax makes the intention clear and it's hard to go wrong.
And you do not need a DISTINCT in the SELECT inside such subqueries (correlated or not). It would be a waste of processing (and for WHERE EXISTS or WHERE IN subqueries, the SQL optimizer would ignore it anyway and just use the first value that matched for each row in the outer query). (Hope that makes sense.)
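To illustrate the point about NOT EXISTS versus LEFT OUTER JOIN plus IS NULL, here is a rough sketch run on SQLite via Python rather than on SQL Server. The Orders table and its CustomerId column are hypothetical names introduced only for this example; the duplicate order rows are there to show that the anti-join form needs no DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (Id INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (CustomerId INTEGER);  -- hypothetical table for illustration
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
-- Customer 1 has two orders, customer 3 has one, customer 2 has none.
INSERT INTO Orders VALUES (1), (1), (3);
""")

# Correlated NOT EXISTS: customers with no matching order row.
not_exists = conn.execute("""
SELECT c.Id
FROM Customers c
WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id)
ORDER BY c.Id
""").fetchall()

# Same question as an anti-join: unmatched rows carry NULLs from Orders.
left_join = conn.execute("""
SELECT c.Id
FROM Customers c
LEFT OUTER JOIN Orders o ON o.CustomerId = c.Id
WHERE o.CustomerId IS NULL
ORDER BY c.Id
""").fetchall()

print(not_exists, left_join)
```

Both forms return only customer 2. The duplicated order rows for customer 1 never reach the output of the anti-join, because every matched row is filtered out by the IS NULL test.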

Need help improving query performance

I need help with improving the performance of the following SQL query. The database design of this application is based on OLD mainframe entity designs. All the query does is returns a list of clients based on some search criteria:
@advisers: Only returns clients which were captured by this adviser.
@outlets: just ignore this one
@searchText: (firstname, surname, suburb, policy number) any combination of these
What I'm doing is creating a temporary table, then querying all the tables involved to build my own dataset, and then inserting that dataset into an easily understandable table (@Clients)
This query takes 20 seconds to execute and currently only returns 7 rows!
Screenshot of all table count can be found here: Table Record Count
Any ideas where I can start to optimize this query?
ALTER PROCEDURE [dbo].[spOP_SearchDashboard]
@advisers varchar(1000),
@outlets varchar(1000),
@searchText varchar(1000)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Set the prefixes to search for (firstname, surname, suburb, policy number)
DECLARE @splitSearchText varchar(1000)
SET @splitSearchText = REPLACE(@searchText, ' ', ',')
DECLARE @AdvisersListing TABLE
(
adviser varchar(200)
)
DECLARE @SearchParts TABLE
(
prefix varchar(200)
)
DECLARE @OutletListing TABLE
(
outlet varchar(200)
)
INSERT INTO @AdvisersListing(adviser)
SELECT part as adviser FROM SplitString (@advisers, ',')
INSERT INTO @SearchParts(prefix)
SELECT part as prefix FROM SplitString (@splitSearchText, ',')
INSERT INTO @OutletListing(outlet)
SELECT part as outlet FROM SplitString (@outlets, ',')
DECLARE @Clients TABLE
(
source varchar(2),
adviserId bigint,
integratedId varchar(50),
rfClientId bigint,
ifClientId uniqueidentifier,
title varchar(30),
firstname varchar(100),
surname varchar(100),
address1 varchar(500),
address2 varchar(500),
suburb varchar(100),
state varchar(100),
postcode varchar(100),
policyNumber varchar(100),
lastAccess datetime,
deleted bit
)
INSERT INTO @Clients
SELECT
source, adviserId, integratedId, rfClientId, ifClientId, title,
firstname, surname, address1, address2, suburb, state, postcode,
policyNumber, max(lastAccess) as lastAccess, deleted
FROM
(SELECT DISTINCT
'RF' as Source,
advRel.SourceEntityId as adviserId,
cast(pe.entityId as varchar(50)) AS IntegratedID,
pe.entityId AS rfClientId,
cast(ifClient.Id as uniqueidentifier) as ifClientID,
ISNULL(p.title, '') AS title,
ISNULL(p.firstname, '') AS firstname,
ISNULL(p.surname, '') AS surname,
ISNULL(ct.address1, '') AS address1,
ISNULL(ct.address2, '') AS address2,
ISNULL(ct.suburb, '') AS suburb,
ISNULL(ct.state, '') AS state,
ISNULL(ct.postcode, '') AS postcode,
ISNULL(contract.policyNumber,'') AS policyNumber,
coalesce(pp.LastAccess, d_portfolio.dateCreated, pd.dateCreated) AS lastAccess,
ISNULL(client.deleted, 0) as deleted
FROM
tbOP_Entity pe
INNER JOIN tbOP_EntityRelationship advRel ON pe.EntityId = advRel.TargetEntityId
AND advRel.RelationshipId = 39
LEFT OUTER JOIN tbOP_Data pd ON pe.EntityId = pd.entityId
LEFT OUTER JOIN tbOP__Person p ON pd.DataId = p.DataId
LEFT OUTER JOIN tbOP_EntityRelationship ctr ON pe.EntityId = ctr.SourceEntityId
AND ctr.RelationshipId = 79
LEFT OUTER JOIN tbOP_Data ctd ON ctr.TargetEntityId = ctd.entityId
LEFT OUTER JOIN tbOP__Contact ct ON ctd.DataId = ct.DataId
LEFT OUTER JOIN tbOP_EntityRelationship ppr ON pe.EntityId = ppr.SourceEntityId
AND ppr.RelationshipID = 113
LEFT OUTER JOIN tbOP_Data ppd ON ppr.TargetEntityId = ppd.EntityId
LEFT OUTER JOIN tbOP__Portfolio pp ON ppd.DataId = pp.DataId
LEFT OUTER JOIN tbOP_EntityRelationship er_policy ON ppd.EntityId = er_policy.SourceEntityId
AND er_policy.RelationshipId = 3
LEFT OUTER JOIN tbOP_EntityRelationship er_contract ON er_policy.TargetEntityId = er_contract.SourceEntityId AND er_contract.RelationshipId = 119
LEFT OUTER JOIN tbOP_Data d_contract ON er_contract.TargetEntityId = d_contract.EntityId
LEFT OUTER JOIN tbOP__Contract contract ON d_contract.DataId = contract.DataId
LEFT JOIN tbOP_Data d_portfolio ON ppd.EntityId = d_portfolio.EntityId
LEFT JOIN tbOP__Portfolio pt ON d_portfolio.DataId = pt.DataId
LEFT JOIN tbIF_Clients ifClient on pe.entityId = ifClient.RFClientId
LEFT JOIN tbOP__Client client on client.DataId = pd.DataId
where
p.surname <> ''
AND (advRel.SourceEntityId IN (select adviser from @AdvisersListing)
OR
pp.outlet COLLATE SQL_Latin1_General_CP1_CI_AS in (select outlet from @OutletListing)
)
) as RFClients
group by
source, adviserId, integratedId, rfClientId, ifClientId, title,
firstname, surname, address1, address2, suburb, state, postcode,
policyNumber, deleted
SELECT * FROM @Clients --THIS ONLY RETURNS 10 RECORDS WITH MY CURRENT DATASET
END
Clarifying questions
What is the MAIN piece of data that you are querying on - advisers, search-text, outlets?
It feels like your criteria allow users to search in many different ways. A sproc will generally reuse the SAME cached plan for every question you ask of it. You tend to get better performance by using several sprocs, each tuned for a specific search scenario (i.e. I bet you could write something blazingly fast for querying just by policy-number).
If you can separate your search-text into INDIVIDUAL parameters then you may be able to:
Search for adviser relationships matching your supplied list - store in temp table (or table variable).
IF ANY surnames have been specified then delete all records from temp which aren't for people with your supplied names.
Repeat for other criteria lists - all the time reducing your temp records.
THEN join to the outer-join stuff and return the results.
In your notes you say that outlets can be ignored. If this is true then taking them out would simplify your query. The "or" clause in your example means that SQL-Server needs to find ALL relationships for ALL portfolios before it can realistically get down to the business of filtering the results that you actually want.
Breaking the query up
Most of your query consists of outer-joins that are not involved in filtering. Try moving these joins into a separate select (i.e. AFTER you have applied all of your criteria). When SQL-Server sees lots of tables then it switches off some of its possible optimisations. So your first step (assuming that you always specify advisers) is just:
SELECT advRel.SourceEntityId as adviserId,
advRel.TargetEntityId AS rfClientId
INTO #temp1
FROM @AdvisersListing advisers
INNER JOIN tbOP_EntityRelationship advRel
ON advRel.SourceEntityId = advisers.adviser
AND advRel.RelationshipId = 39;
The link to tbOP_Entity (aliased as "pe") does not look like it is needed for its data. So you should be able to replace all references to "pe.EntityId" with "advRel.TargetEntityId".
The DISTINCT clause and the GROUP-BY are probably trying to achieve the same thing - and both of them are really expensive. Normally you find ONE of these used when a previous developer has not been able to get his results right. Get rid of them - check your results - if you get duplicates then try to filter the duplicates out. You may need ONE of them if you have temporal data - you definitely don't need both.
Indexes
Make sure that the @AdvisersListing.adviser column is the same data type as SourceEntityId and that SourceEntityId is indexed. If the column has a different data type then SQL-Server won't want to use the index (so you would want to change the data type on @AdvisersListing).
The tbOP_EntityRelationship table sounds like it should have an index something like:
CREATE UNIQUE INDEX advRel_idx1 ON tbOP_EntityRelationship (SourceEntityId,
RelationshipId, TargetEntityId);
If this exists then SQL-Server should be able to get everything it needs by ONLY going to the index pages (rather than to the table pages). This is known as a "covering" index.
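The idea of a covering index can be observed on a small scale. The sketch below uses SQLite through Python, not SQL Server, so the planner and its output format are different; it only shows that a lookup by SourceEntityId and RelationshipId that returns TargetEntityId can be answered from the index alone. The extra Notes column is an invented placeholder for the rest of the table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbOP_EntityRelationship (
    SourceEntityId INTEGER,
    RelationshipId INTEGER,
    TargetEntityId INTEGER,
    Notes TEXT  -- placeholder for columns that are NOT in the index
);
CREATE UNIQUE INDEX advRel_idx1 ON tbOP_EntityRelationship
    (SourceEntityId, RelationshipId, TargetEntityId);
""")

# Ask SQLite how it would execute the lookup; the last column of each
# plan row is the human-readable description of that step.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT TargetEntityId
FROM tbOP_EntityRelationship
WHERE SourceEntityId = 7 AND RelationshipId = 39
""").fetchall()

plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

The plan text reports a search using a covering index, meaning the table rows themselves are never read. In SQL Server you would look for an index seek without a key lookup in the execution plan instead.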
There should be a slightly different index on tbOP_Data (assuming it has a clustered primary key on DataId):
CREATE INDEX tbOP_Data_idx1 ON tbOP_Data (entityId) INCLUDE (dateCreated);
SQL-Server will store the keys from the table's clustered index (which I assume will be DataId) along with the value of "dateCreated" in the index leaf pages. So again we have a "covering" index.
Most of the other tables (tbOP__Client, etc) should have indexes on DataId.
Query plan
Unfortunately I couldn't see the explain-plan picture (our firewall ate it). However, one useful tip is to hover your mouse over some of the join lines. It tells you how many records will be accessed.
Watch out for full-table-scans. If SQL-Server needs to use them then it's pretty much given up on your indexes.
Database structure
It's been designed as a transaction database. The level of normalization (and all of the EntityRelationship-this and Data-that) is really painful for reporting. You really need to consider having a separate reporting database that unravels some of this information into a more usable structure.
If you are running reports directly against your production database then I would expect a bunch of locking problems and resource contention.
Hope this has been useful - it's the first time I've posted here. It has been ages since I last tuned a query in my current company (they have a bunch of stern-faced DBAs for sorting this sort of thing out).
Looking at your execution plan... 97% of the cost of your query is in processing the DISTINCT clause. I'm not sure it is even necessary since you are taking all that data and doing a group by on it anyway. You might want to take it out and see how that affects the plan.
That kind of query is just going to take time; with that many joins and that many temp tables, there's just nothing easy or efficient about it. One trick I have been using is copying the parameters into local variables. It might not be an all-out solution, but if it shaves a few seconds, it's worth it.
DECLARE @Xadvisers varchar(1000)
DECLARE @Xoutlets varchar(1000)
DECLARE @XsearchText varchar(1000)
SET @Xadvisers = @advisers
SET @Xoutlets = @outlets
SET @XsearchText = @searchText
Believe me, I have tested it thoroughly, and it helps with complicated scripts. It has to do with the way SQL Server handles local variables: the optimizer cannot sniff their values at compile time, so the cached plan is not tailored to whatever parameter values the first call happened to use. (You then use the @X variables, not the parameters, in the rest of the procedure.) Good luck!

Full-text Search on documents and related data mssql

Currently in the middle of building a knowledge base app and am a bit unsure on the best way to store and index the document information.
The user uploads the document and when doing so selects a number of options from dropdown lists (such as category,topic,area..., note these are not all mandatory) they also enter some keywords and a description of the document. At the moment the category (and others) selected is stored as foreign key in the documents table using the id from the categories table.
What we want to be able to do is do a FREETEXTTABLE or CONTAINSTABLE on not only the information within the varchar(max) column where the document is located but also on the category name, topic name and area name etc.
I looked at the option of creating an indexed view but this wasn't possible due to the LEFT JOIN against the category column. So I'm not sure how to go about this; any ideas would be most appreciated.
I assume that you want to AND the two searches together. For example, find all documents containing the text "foo" AND in the category "Automotive Repair".
Perhaps you don't need to full text the additional data and can just use = or like? If the additional data is reasonably small it may not warrant the complication of full text.
However, if you want to use full text on both, use a stored procedure that pulls the results together for you. The trick here is to stage the results rather than trying to get a result set back straight away.
This is a rough starting point.
-- a staging table variable for the document results
declare @documentResults table (
Id int,
Rank int
)
insert into @documentResults
select d.Id, results.[rank]
from containstable (documents, (text), '"foo*"') results
inner join documents d on results.[key] = d.Id
-- now you have all of the primary keys that match the search criteria
-- whittle this list down to only include keys that are in the correct categories
-- a staging table variable for each of the metadata results
declare @categories table (
Id int
)
insert into @categories
select results.[KEY]
from containstable (Categories, (Category), '"Automotive Repair*"') results
declare @topics table (
Id int
)
insert into @topics
select results.[KEY]
from containstable (Topics, (Topic), '"Automotive Repair*"') results
declare @areas table (
Id int
)
insert into @areas
select results.[KEY]
from containstable (Areas, (Area), '"Automotive Repair*"') results
-- the staging tables hold only keys, so join back to the lookup tables for the names
-- (assumes the full-text key column of each lookup table is called Id)
select d.text, cat.Category, tp.Topic, ar.Area
from @documentResults r
inner join documents d on d.Id = r.Id
inner join @categories c on c.Id = d.CategoryId
inner join Categories cat on cat.Id = c.Id
inner join @topics t on t.Id = d.TopicId
inner join Topics tp on tp.Id = t.Id
inner join @areas a on a.Id = d.AreaId
inner join Areas ar on ar.Id = a.Id
You could create a new column for your full text index which would contain the original document plus the categories appended as metadata. Then a search on that column could search both the document and the categories simultaneously. You'd need to invent a tagging system that would keep them unique within your document yet the tags would not be likely to be used as search phrases themselves. Perhaps something like:
This is my regular document text. <FTCategory: Automotive Repair> <FTCategory: Transmissions>
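A minimal sketch of how such a combined column value might be assembled in application code before it is saved is shown below. The function name build_search_text and the exact tag format are assumptions taken from the example line above, not part of any SQL Server API; how the full-text engine tokenizes the tags is something you would need to verify yourself.

```python
def build_search_text(document_text, categories):
    """Append one <FTCategory: ...> tag per category to the document text."""
    tags = " ".join(f"<FTCategory: {name}>" for name in categories)
    # strip() avoids a trailing space when there are no categories
    return f"{document_text} {tags}".strip()


combined = build_search_text(
    "This is my regular document text.",
    ["Automotive Repair", "Transmissions"],
)
print(combined)
```

The combined value would be stored in the extra full-text-indexed column and regenerated whenever the document or its category assignments change.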

T-SQL filtering on dynamic name-value pairs

I'll describe what I am trying to achieve:
I am passing down to a SP an xml with name value pairs that I put into a table variable, let's say @nameValuePairs.
I need to retrieve a list of IDs for expressions (a table) with those exact match of name-value pairs (attributes, another table) associated.
This is my schema:
Expressions table --> (expressionId, attributeId)
Attributes table --> (attributeId, attributeName, attributeValue)
After trying complicated stuff with dynamic SQL and evil cursors (which works but it's painfully slow) this is what I've got now:
--do the magic plz!
-- retrieve number of name-value pairs
SELECT @noOfAttributes = count(*) from @nameValuePairs
select distinct
e.expressionId, a.attributeName, a.attributeValue
into
#temp
from
expressions e
join
attributes a
on
e.attributeId = a.attributeId
join --> this join does the filtering
@nameValuePairs nvp
on
a.attributeName = nvp.name and a.attributeValue = nvp.value
group by
e.expressionId, a.attributeName, a.attributeValue
-- now select the IDs I need
-- since I did a select distinct above if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select distinct
expressionId
from
#temp
group by expressionId
having count(*) = @noOfAttributes
Can people please review and see if they can spot any problems? Is there a better way of doing this?
Any help appreciated!
I believe that this would satisfy the requirement you're trying to meet. I'm not sure how much prettier it is, but it should work and wouldn't require a temp table:
SELECT @noOfAttributes = count(*) from @nameValuePairs
SELECT e.expressionid
FROM expressions e
LEFT JOIN (
SELECT a.attributeid
FROM attributes a
JOIN @nameValuePairs nvp ON nvp.name = a.attributeName AND nvp.value = a.attributeValue
) t ON t.attributeid = e.attributeid
GROUP BY e.expressionid
HAVING SUM(CASE WHEN t.attributeid IS NULL THEN (@noOfAttributes + 1) ELSE 1 END) = @noOfAttributes
EDIT: After doing some more evaluation, I found an issue where certain expressions would be included that shouldn't have been. I've modified my query to take that in to account.
One error I see is that you have no table with an alias of b, yet you are using: a.attributeId = b.attributeId.
Try fixing that and see if it works, unless I am missing something.
EDIT: I think you just fixed this in your edit, but is it supposed to be a.attributeId = e.attributeId?
This is not a bad approach, depending on the sizes and indexes of the tables, including @nameValuePairs. If these row counts are high or it otherwise becomes slow, you may do better to put @nameValuePairs into a temp table instead, add appropriate indexes, and use a single query instead of two separate ones.
I do notice that you are putting columns into #temp that you are not using; it would be faster to exclude them (though it would mean duplicate rows in #temp). Also, your second query has both a "distinct" and a "group by" on the same columns. You don't need both, so I would drop the "distinct" (it probably won't affect performance, because the optimizer has already figured this out).
Finally, #temp would probably be faster with a clustered non-unique index on expressionid (I am assuming that this is SQL 2005). You could add it after the SELECT..INTO, but it is usually as fast or faster to add it before you load. This would require you to CREATE #temp first, add the clustered and then use INSERT..SELECT to load it instead.
I'll add an example of merging the queries in a minute... Ok, here's one way to merge them into a single query (this should be 2000-compatible also):
-- retrieve number of name-value pairs
SELECT @noOfAttributes = count(*) from @nameValuePairs
-- now select the IDs I need
-- since the derived table below is a select distinct, if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select
expressionId
from
(
select distinct
e.expressionId, a.attributeName, a.attributeValue
from
expressions e
join
attributes a
on
e.attributeId = a.attributeId
join --> this join does the filtering
@nameValuePairs nvp
on
a.attributeName = nvp.name and a.attributeValue = nvp.value
) as Temp
group by expressionId
having count(*) = @noOfAttributes
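For anyone who wants to experiment with the merged query, here is a small sketch that runs the same shape on SQLite from Python. SQLite replaces SQL Server, the table variable is modeled as an ordinary table named nameValuePairs, the count is inlined as a scalar subquery instead of a variable, and the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attributes (
    attributeId INTEGER PRIMARY KEY, attributeName TEXT, attributeValue TEXT);
CREATE TABLE expressions (expressionId INTEGER, attributeId INTEGER);
CREATE TABLE nameValuePairs (name TEXT, value TEXT);  -- stand-in for the table variable

INSERT INTO attributes VALUES
    (1, 'color', 'red'), (2, 'size', 'L'), (3, 'color', 'blue');
-- 10 has exactly the wanted pairs, 20 and 30 have only one of them,
-- 40 has both wanted pairs plus an extra attribute.
INSERT INTO expressions VALUES
    (10, 1), (10, 2), (20, 1), (30, 3), (30, 2), (40, 1), (40, 2), (40, 3);
INSERT INTO nameValuePairs VALUES ('color', 'red'), ('size', 'L');
""")

rows = conn.execute("""
SELECT expressionId
FROM (
    SELECT DISTINCT e.expressionId, a.attributeName, a.attributeValue
    FROM expressions e
    JOIN attributes a ON e.attributeId = a.attributeId
    JOIN nameValuePairs nvp
      ON a.attributeName = nvp.name AND a.attributeValue = nvp.value
) AS Temp
GROUP BY expressionId
HAVING COUNT(*) = (SELECT COUNT(*) FROM nameValuePairs)
ORDER BY expressionId
""").fetchall()

print(rows)
```

Note that expression 40 is returned as well as expression 10: counting matches finds expressions that have at least the requested pairs, so extra attributes do not disqualify a row. If an exact match is required, the LEFT JOIN variant from the other answer, which penalizes non-matching attributes, is closer to that.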
