I am currently working on a project that includes an automatic flairing part.
Basically, this is what it does:
I have a table called Fox with various columns, and some other tables referring to Fox (e.g. CaughtChickens).
I want another table that I can expand at any time, with 3 columns (other than ID, of course): FlairName, FlairColor, and FlairStoredProcedure.
I want a stored procedure that returns every FlairName whose FlairStoredProcedure returns 1 for a given FoxID.
This way I can write a stored procedure that checks whether a certain Fox caught a chicken and returns 1 if it did, and show a Hunter flair in the user UI.
There are some downsides to this:
Every time I want a new flair I have to write a new stored procedure for it (I can't really get around that one).
The stored procedures need to have the same input parameters (i.e. @FoxID) and need to return 1 or 0 (or select nothing when false and select the name when true?).
I need to use dynamic SQL in the stored procedure that collects these flairs, and I'd rather not use dynamic SQL at all.
Is there a much easier way to do this that I am missing?
EDIT:
Example:
I have a table Fox:

FoxID  FoxName  FoxColor  FoxSize  Valid
1      Swiper   red       12       1

I would have a table Flairs:

FlairID  FlairName  FlairStoredProcedure  Valid
1        Big        pFlairs_IsFoxBig      1
2        Green      pFlairs_IsFoxGreen    1
I would have 3 stored procedures:

pFox_Flairs
    @FoxID int

    DECLARE @CurrentFlairSP varchar(100)
    DECLARE @CurrentIDIndex int = 1
    DECLARE @ResultFlairs table (FlairName varchar(50))

    WHILE @CurrentIDIndex <= (SELECT MAX(FlairID) FROM Flairs WHERE Valid <> 0)
    BEGIN
        IF EXISTS (SELECT * FROM Flairs WHERE FlairID = @CurrentIDIndex AND Valid <> 0)
        BEGIN
            SET @CurrentFlairSP = CONCAT((SELECT TOP 1 FlairStoredProcedure FROM Flairs WHERE FlairID = @CurrentIDIndex AND Valid <> 0), ' @FoxID = ', @FoxID)
            INSERT INTO @ResultFlairs
            EXEC (@CurrentFlairSP)
        END
        SET @CurrentIDIndex += 1
    END
    SELECT * FROM @ResultFlairs

pFlairs_IsFoxBig
    @FoxID int

    SELECT 'Big' WHERE EXISTS (SELECT TOP 1 * FROM Fox WHERE FoxID = @FoxID AND FoxSize > 10)

pFlairs_IsFoxGreen
    @FoxID int

    SELECT 'Green' WHERE EXISTS (SELECT TOP 1 * FROM Fox WHERE FoxID = @FoxID AND FoxColor = 'green')
You could create a single table-valued function that checks all the conditions:
CREATE OR ALTER FUNCTION dbo.GetFlairs ( @FoxID int )
RETURNS TABLE
AS RETURN
SELECT v.FlairName
FROM Fox f
CROSS APPLY (
SELECT 'Big'
WHERE f.FoxSize > 10
UNION ALL
SELECT 'Green'
WHERE f.FoxColor = 'green'
) v(FlairName)
WHERE f.FoxID = @FoxID;
GO
Then you can use it like this:
SELECT *
FROM dbo.GetFlairs(123);
If or when you add more attributes or conditions, simply add them to the function as another UNION ALL branch, as in the sketch below.
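For example, a Hunter flair based on the CaughtChickens table mentioned in the question could be added as one more branch. This is just a sketch; it assumes CaughtChickens has a FoxID column:

CREATE OR ALTER FUNCTION dbo.GetFlairs ( @FoxID int )
RETURNS TABLE
AS RETURN
SELECT v.FlairName
FROM Fox f
CROSS APPLY (
    SELECT 'Big'
    WHERE f.FoxSize > 10
    UNION ALL
    SELECT 'Green'
    WHERE f.FoxColor = 'green'
    UNION ALL
    -- assumed schema: CaughtChickens(FoxID, ...)
    SELECT 'Hunter'
    WHERE EXISTS (SELECT 1 FROM CaughtChickens cc WHERE cc.FoxID = f.FoxID)
) v(FlairName)
WHERE f.FoxID = @FoxID;
GO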
Related
For example, there is a table with columns:
int type
int number
int value
How can I make it so that, when inserting a row into the table, the numbering starts from 1 separately for each type?
type 1 => number 1, 2, 3...
type 2 => number 1, 2, 3...
That is, it will look like this:

type  number  value
1     1       -
1     2       -
1     3       -
2     1       -
1     4       -
2     2       -
3     1       -
6     1       -
1     5       -
2     3       -
6     2       -
Special thanks to @Larnu.
As a result, in my case, the best solution would be to create a table for each type.
As I mentioned in the comments, neither IDENTITY nor SEQUENCE supports using another column to denote which "identity set" it should draw from. You can have multiple SEQUENCEs that you use for a single table; however, this doesn't scale. If you are limited to 2 or 3 types, for example, you might choose to create 3 SEQUENCE objects and then use a stored procedure to handle your INSERT statements. Then, when a user/application wants to INSERT data, they call the procedure, and that procedure has logic to use the right SEQUENCE based on the value of the parameter for the type column.
As mentioned, however, this doesn't scale well. If you have an indeterminate number of values for type, then you can't easily pick the right SEQUENCE, and handling new values for type would be difficult too. In that case, you would be better off using an IDENTITY and a VIEW. The VIEW uses ROW_NUMBER to create your per-type identifier, while the IDENTITY gives you an always-incrementing value.
CREATE TABLE dbo.YourTable (id int IDENTITY(1,1),
[type] int NOT NULL,
number int NULL,
[value] int NOT NULL);
GO
CREATE VIEW dbo.YourTableView AS
SELECT ROW_NUMBER() OVER (PARTITION BY [type] ORDER BY id ASC) AS Identifier,
[type],
number,
[value]
FROM dbo.YourTable;
Then, instead, you query the VIEW, not the TABLE.
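For example (a minimal sketch against the objects above), rows are inserted into the table without a number, and the per-type Identifier comes from the view:

INSERT INTO dbo.YourTable ([type], [value])
VALUES (1, 10), (1, 20), (2, 30);

-- Identifier restarts at 1 for each [type]
SELECT Identifier, [type], [value]
FROM dbo.YourTableView
ORDER BY [type], Identifier;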
If you need the Identifier column to be stable, you'll also need to ensure that rows can't be DELETEd from the table, most likely by adding an IsDeleted column to the table, defined as a bit (0 for not deleted, 1 for deleted), and then filtering to the non-deleted rows in the VIEW:
CREATE VIEW dbo.YourTableView AS
WITH CTE AS(
SELECT id,
ROW_NUMBER() OVER (PARTITION BY [type] ORDER BY id ASC) AS Identifier,
[type],
number,
[value],
IsDeleted
FROM dbo.YourTable)
SELECT id,
Identifier,
[type],
number,
[value]
FROM CTE
WHERE IsDeleted = 0;
You could, if you wanted, even handle the DELETEs on the VIEW (INSERTs and UPDATEs would be handled implicitly, as it's an updatable VIEW):
CREATE TRIGGER trg_YourTableView_Delete ON dbo.YourTableView
INSTEAD OF DELETE AS
BEGIN
SET NOCOUNT ON;
UPDATE YT
SET IsDeleted = 1
FROM dbo.YourTable YT
JOIN deleted d ON d.id = YT.id;
END;
GO
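As a quick usage sketch, deleting through the view then flips the flag instead of removing the row, so the remaining Identifier values stay stable:

DELETE FROM dbo.YourTableView
WHERE id = 4; -- the row stays in dbo.YourTable, with IsDeleted = 1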
For completeness, if you wanted to use separate SEQUENCE objects, it would look like this. Notice that this does not scale easily: I have to CREATE a SEQUENCE for every value of type. For a small, known range of values this would be a solution, but if you end up with more values for type, or already have a large range, it stops being feasible pretty quickly:
CREATE TABLE dbo.YourTable (identifier int NOT NULL,
[type] int NOT NULL,
number int NULL,
[value] int NOT NULL);
CREATE SEQUENCE dbo.YourTable_Type1
START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE dbo.YourTable_Type2
START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE dbo.YourTable_Type3
START WITH 1 INCREMENT BY 1;
GO
CREATE PROC dbo.Insert_YourTable @Type int, @Number int = NULL, @Value int AS
BEGIN
    DECLARE @Identifier int;
    IF @Type = 1
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type1;
    IF @Type = 2
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type2;
    IF @Type = 3
        SELECT @Identifier = NEXT VALUE FOR dbo.YourTable_Type3;
    INSERT INTO dbo.YourTable (identifier, [type], number, [value])
    VALUES (@Identifier, @Type, @Number, @Value);
END;
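Inserts then go through the procedure rather than a plain INSERT, for example:

EXEC dbo.Insert_YourTable @Type = 1, @Value = 10;
EXEC dbo.Insert_YourTable @Type = 2, @Value = 20;
EXEC dbo.Insert_YourTable @Type = 1, @Value = 30;
-- identifier comes out as 1, 1, 2 respectively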
Is there a way, within SSMS or via a third-party app, to select a batch of lines to toggle their commented status?
Background: in my query, there are two "search modes" that I want to alternate between.
To do this, I need to comment lines 19, 107, 108, 112, and uncomment line 232. Then I need to do the opposite to go back to the other "mode".
Rather than scrolling through the query, is there some way to automate this process?
Example:
1 --select distinct x.name from (
2 select name, dateofbirth, favouritecolour
3 from classmates
4 where dateofbirth between '01-Mar-1990' and '17-Apr-1995'
5 --)x
6 union all
7 --select distinct x.name from (
8 select name, relationship, location
9 from family
10 where relationship = 'uncle'
11 --)x
For full detail, I could have the query like this. If I just wanted the names, I would uncomment lines 1, 5, 7 and 11.
My real life example is spread across hundreds of lines, and would involve commenting and uncommenting as part of the same "transition"
A simple way to do this is to use a variable:
--Searchmode 0 = Path A
--Searchmode 1 = Path B
DECLARE @SearchMode int = 0 --Change this to change path

IF @SearchMode = 0
BEGIN
    SELECT blah
    FROM tableA
END
IF @SearchMode = 1
BEGIN
    SELECT blah
    FROM tableB
END
You could also make it a procedure:
CREATE PROCEDURE dbo.exampleProc @SearchMode int
AS
BEGIN
    IF @SearchMode = 0
    BEGIN
        SELECT blah
        FROM tableA
    END
    IF @SearchMode = 1
    BEGIN
        SELECT blah
        FROM tableB
    END
END
Then just execute it and feed in the parameter value like this:
EXEC dbo.exampleProc 0
EXEC dbo.exampleProc 1
Edit:
You could also have the repeated parts of the query always run, and then apply the extra filter only if @SearchMode = 1. Something like:
DECLARE @SearchMode int = 0

select name, dateofbirth, favouritecolour
into #temp
from classmates
where dateofbirth between '01-Mar-1990' and '17-Apr-1995'
union all
select name, relationship, location
from family
where relationship = 'uncle'

IF @SearchMode = 0
BEGIN
    SELECT *
    FROM #temp
END
IF @SearchMode = 1
BEGIN
    SELECT DISTINCT Name
    FROM #temp
END
I searched the web but cannot find a solution for my problem (but perhaps I am using the wrong keywords ;) ).
I've got a stored procedure which does some automatic validation (every night) for a bunch of records. However, sometimes a user wants to run the same validation for a single record manually. I thought about calling the stored procedure with a parameter; when it is set, the original SELECT statement (which loops through all the records) should get an extra AND condition with the specified record ID. I want to do it this way so that I don't have to copy the entire SELECT statement and modify it just for the manual part.
The original statement is as follows:
DECLARE GenerateFacturen CURSOR LOCAL FOR
SELECT TOP 100 PERCENT becode, dtreknr, franchisebecode, franchisenemer, fakgroep, vonummer, vovolgnr, count(*) as nrVerOrd,
FaktuurEindeMaand, FaktuurEindeWeek
FROM (
SELECT becode, vonummer, vovolgnr, FaktuurEindeMaand, FaktuurEindeWeek, uitgestfaktuurdat, levdat, voomschrijving, vonetto,
faktureerperorder, dtreknr, franchisebecode, franchisenemer, fakgroep, levscandat
FROM vwOpenVerOrd WHERE becode=@BecondeIN AND levdat IS NOT NULL AND fakstatus = 0
AND isAllFaktuurStukPrijsChecked = 1 AND IsAllFaktuurVrChecked = 1
AND (uitgestfaktuurdat IS NULL OR uitgestfaktuurdat<=@FactuurDate)
) sub
WHERE faktureerperorder = 1
GROUP BY becode, dtreknr, franchisebecode, franchisenemer, fakgroep, vonummer, vovolgnr,
FaktuurEindeMaand, FaktuurEindeWeek
ORDER BY MIN(levscandat)
For the WHERE faktureerperorder = 1 part, I came up with something like this:
WHERE faktureerperorder = 1 AND CASE WHEN @myParameterManual = 1 THEN vonummer=@vonummer ELSE 1=1 END
But this doesn't work. The @myParameterManual parameter indicates whether it should select only a specific record, and vonummer=@vonummer matches that record's ID. I thought that by using 1=1 I would get all the records.
Any ideas how to achieve my goal (perhaps more efficient ideas or better ideas)?
I'm finding it difficult to read your query, but this is hopefully a simple example of what you're trying to achieve.
I've used a WHERE clause with an OR operator to give you two options for the filter. With the same query, you get different outputs depending on the filter value:
CREATE TABLE #test ( id INT, val INT );
INSERT INTO #test
( id, val )
VALUES ( 1, 10 ),
( 2, 20 ),
( 3, 30 );
DECLARE @filter INT;

-- null filter returns all rows
SET @filter = NULL;

SELECT *
FROM #test
WHERE ( @filter IS NULL
        AND id < 5
      )
   OR ( @filter IS NOT NULL
        AND id = @filter
      );

-- filter a specific record
SET @filter = 2;

SELECT *
FROM #test
WHERE ( @filter IS NULL
        AND id < 5
      )
   OR ( @filter IS NOT NULL
        AND id = @filter
      );
DROP TABLE #test;
First query returns all:
id val
1 10
2 20
3 30
Second query returns a single row:
id val
2 20
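Applied to the cursor query in the question (a sketch using its parameter and column names), the final filter would become something like:

WHERE faktureerperorder = 1
AND ( @myParameterManual = 0      -- nightly run: all records
      OR vonummer = @vonummer )   -- manual run: just the requested record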
I currently have a stored procedure in MSSQL where I execute a SELECT statement multiple times based on the variables I give the stored procedure. The stored procedure counts how many results would be returned for every filter a user can enable.
The stored procedure isn't the issue; I transformed the SELECT statement from the stored procedure into a regular SELECT statement, which looks like this:
DECLARE @contentRootId int = 900589
DECLARE @RealtorIdList varchar(2000) = ';880;884;1000;881;885;'
DECLARE @publishSoldOrRentedSinceDate int = 8
DECLARE @isForSale BIT = 1
DECLARE @isForRent BIT = 0
DECLARE @isResidential BIT = 1
--...(another 55 variables)...

--Table to be returned
DECLARE @resultTable TABLE
(
    variableName varchar(100),
    [value] varchar(200)
)

-- Create table based on input variable. Example: turns ';18;118;' into a table containing the two ints 18 and 118
DECLARE @RealtorIdTable TABLE (RealtorId int)
INSERT INTO @RealtorIdTable SELECT * FROM dbo.Split(@RealtorIdList, ';') OPTION (MAXRECURSION 150)

INSERT INTO @resultTable ([value], variableName)
SELECT [Value], VariableName FROM (
    SELECT COUNT(*) as TotalCount,
        ISNULL(SUM(CASE WHEN reps.ForRecreation = 1 THEN 1 ELSE 0 END), 0) as ForRecreation,
        ISNULL(SUM(CASE WHEN reps.IsQualifiedForSeniors = 1 THEN 1 ELSE 0 END), 0) as IsQualifiedForSeniors,
        --...(A whole bunch more SUM(CASE)...
    FROM TABLE1 reps
    LEFT JOIN temp t ON
        t.ContentRootID = @contentRootId
        AND t.RealEstatePropertyID = reps.ID
    WHERE
        (EXISTS (SELECT 1 FROM @RealtorIdTable WHERE RealtorId = reps.RealtorID))
        AND (@SelectedGroupIds IS NULL OR EXISTS (SELECT 1 FROM @SelectedGroupIdtable WHERE GroupId = t.RealEstatePropertyGroupID))
        AND (ISNULL(reps.IsForSale, 0) = ISNULL(@isForSale, 0))
        AND (ISNULL(reps.IsForRent, 0) = ISNULL(@isForRent, 0))
        AND (ISNULL(reps.IsResidential, 0) = ISNULL(@isResidential, 0))
        AND (ISNULL(reps.IsCommercial, 0) = ISNULL(@isCommercial, 0))
        AND (ISNULL(reps.IsInvestment, 0) = ISNULL(@isInvestment, 0))
        AND (ISNULL(reps.IsAgricultural, 0) = ISNULL(@isAgricultural, 0))
        --...(Around 50 more of these WHERE-statements)...
) as tbl
UNPIVOT (
    [Value]
    FOR [VariableName] IN (
        [TotalCount],
        [ForRecreation],
        [IsQualifiedForSeniors],
        --...(All the other things I selected in the above query)...
    )
) as d

SELECT * FROM @resultTable
The combination of a Realtor- and contentID gives me a default set of X records. When I choose a combination which gives me ~4600 records, the execution time is around 250ms. When I execute the statement with a combination that gives me ~600 records, the execution time is about 20ms.
I would like to know why this is happening. I tried removing all the SUM(CASE...) expressions from the SELECT, I tried removing almost everything from the WHERE clause, and I tried removing the JOIN, but I keep seeing the huge difference between the result sets of 4600 and 600 records.
Table variables can perform worse when the number of records is large. Consider using a temporary table instead. See When should I use a table variable vs temporary table in sql server?
Also, consider replacing the UNPIVOT with alternative SQL code. Writing your own T-SQL will give you more control and may even improve performance. See for example PIVOT, UNPIVOT and performance
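As a sketch of the first suggestion, the @RealtorIdTable table variable from the question could be swapped for a temp table (reusing the question's dbo.Split call; this is not a drop-in script), which gives the optimizer real statistics and a useful index:

-- instead of: DECLARE @RealtorIdTable TABLE (RealtorId int)
CREATE TABLE #RealtorIdTable (RealtorId int PRIMARY KEY);
INSERT INTO #RealtorIdTable
SELECT * FROM dbo.Split(@RealtorIdList, ';') OPTION (MAXRECURSION 150);

-- ...run the main query against #RealtorIdTable instead of @RealtorIdTable...

DROP TABLE #RealtorIdTable;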
I have the following tables:
tbl_File:
FileID | Filename
-----------------
1 | test.jpg
and
tbl_Tag:
TagID | TagName
---------------
1 | Red
and
tbl_TagFile:
ID | TagID | FileID
-------------------
1 | 1 | 1
I need to run a query against these tables that matches all selected tags. For example, imagine a list of checkboxes to select one or more tags, and then a search button. I need to pass the TagIDs to the query as a pipe-delimited string, such as "1|2|5|".
The search results must meet all the criteria: if 3 tags are selected, the results should be only those files that have all 3 tags associated with them.
I think I've made this too complicated. I tried iterating over the tags using CHARINDEX and such to work my way through the string, but it seems there must be an easier way.
I'd like to do this as a function, such as:

SELECT FileID, Filename
FROM tbl_Files
WHERE dbo.udf_FileExistswithTags(@Tags, FileID) = 1
Any efficient way to do this?
It doesn't sound from your example scenario that you actually need to pass a pipe-delimited string. I would highly suggest abandoning that idea and using a Table-Valued Parameter in your stored procedure. This has numerous advantages: you will not hit a datatype limit or a "number of parameters" limit that might occur with very large sets of criteria, and it avoids any need to run a (potentially very slow) UDF.
Split the string into tokens on the application side, and then insert each token as a row in the TVP. Example below:
Create the TVP type in your database:
CREATE TYPE [dbo].[FileNameType] AS TABLE
(
fileName varchar(1000)
)
On the application side, build your list of filename tokens into a recordset:
private static List<SqlDataRecord> BuildFileNameTokenRecords(IEnumerable<string> tokens)
{
    var records = new List<SqlDataRecord>();
    foreach (string token in tokens)
    {
        var record = new SqlDataRecord(
            new SqlMetaData[]
            {
                new SqlMetaData("fileName", SqlDbType.VarChar, 1000),
            }
        );
        record.SetString(0, token); // populate the row with the token value
        records.Add(record);
    }
    return records;
}
Wherever you run your proc from (rough code here):
var records = BuildFileNameTokenRecords(listofstrings);
var sqlCmd = sqlDb.GetStoredProcCommand("FileExists");
sqlDb.AddInParameter(sqlCmd, "tvpFilenameTokens", SqlDbType.Structured, records);
ExecuteNonQuery(sqlCmd);
Filtering your select statement then simply becomes a matter of joining on the tokens in the table parameter. Something like this:
CREATE PROCEDURE dbo.FileExists
(
    -- Put additional parameters here
    @tvpFilenameTokens dbo.FileNameType READONLY
)
AS
BEGIN
    SELECT f.FileID, f.Filename
    FROM tbl_Files f
    INNER JOIN @tvpFilenameTokens t
        ON f.Filename = t.fileName
END
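From T-SQL, the procedure can be exercised directly by declaring a variable of the table type (a sketch using the names above, with hypothetical filenames):

DECLARE @tokens dbo.FileNameType;
INSERT INTO @tokens (fileName) VALUES ('test.jpg'), ('other.png');

EXEC dbo.FileExists @tvpFilenameTokens = @tokens;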
Here is an option that should scale. All of the functionality is available back to SQL Server 2005. It uses a CTE to separate the portion of the query that finds only the FileIDs that have all of the TagIDs passed in, and then that list of FileIDs is joined to the [File] table to get the details. It also uses an INNER JOIN instead of an IN list to match the TagID's.
Please note that the example below uses a SQLCLR splitter that is freely available in the SQL# library (which I wrote, but this function is in the free version). The specific splitter used is not the important part; it should just be one that is either SQLCLR, an inline tally table (like the one used in @wewesthemenace's answer), or the XML method. Just don't use a splitter based on a WHILE loop or a recursive CTE.
---- TEST SETUP
DECLARE @File TABLE
(
    FileID INT NOT NULL PRIMARY KEY,
    [Filename] NVARCHAR(200) NOT NULL
);

DECLARE @TagFile TABLE
(
    TagID INT NOT NULL,
    FileID INT NOT NULL,
    PRIMARY KEY (TagID, FileID)
);

INSERT INTO @File VALUES (1, 'File1.txt');
INSERT INTO @File VALUES (2, 'File2.txt');
INSERT INTO @File VALUES (3, 'File3.txt');

INSERT INTO @TagFile VALUES (1, 1);
INSERT INTO @TagFile VALUES (2, 1);
INSERT INTO @TagFile VALUES (5, 1);
INSERT INTO @TagFile VALUES (1, 2);
INSERT INTO @TagFile VALUES (2, 2);
INSERT INTO @TagFile VALUES (4, 2);
INSERT INTO @TagFile VALUES (1, 3);
INSERT INTO @TagFile VALUES (2, 3);
INSERT INTO @TagFile VALUES (5, 3);
INSERT INTO @TagFile VALUES (6, 3);
---- DONE WITH TEST SETUP
DECLARE @TagsToGet VARCHAR(100); -- this would be the proc input parameter
SET @TagsToGet = '1|2|5';

CREATE TABLE #Tags (TagID INT NOT NULL PRIMARY KEY);
DECLARE @NumTags INT;

INSERT INTO #Tags (TagID)
    SELECT split.SplitVal
    FROM SQL#.String_Split4k(@TagsToGet, '|', 1) split;
SET @NumTags = @@ROWCOUNT;

;WITH files AS
(
    SELECT tf.FileID
    FROM @TagFile tf
    INNER JOIN #Tags tg
        ON tg.TagID = tf.TagID
    GROUP BY tf.FileID
    HAVING COUNT(*) = @NumTags
)
SELECT fl.*
FROM @File fl
INNER JOIN files
    ON files.FileID = fl.FileID
ORDER BY fl.[Filename] ASC;
DROP TABLE #Tags; -- don't need this if code above is placed in a proc
Results:
FileID Filename
1 File1.txt
3 File3.txt
Notes
As much as I love TVPs (and I do, when they are done correctly and used appropriately), I would say that they are a bit much for this type of small-scale, single-dimensional array scenario. There won't really be any performance gain over a SQLCLR streaming TVF string splitter, and the TVP requires more app code plus an additional User-Defined Table Type, which can't be updated without first dropping all procs that reference it. That doesn't happen all the time, but it needs to be considered in terms of long-term maintenance costs.
The JOIN between TagFile and the temporary table populated from the split operation should be much more efficient than using an IN list with a subquery for the split operation. An IN list is short-hand for all of the values in it to be their own OR conditions. Hence the JOIN is a fully set-based approach that lets the Query Optimizer do its thang.
The structure I used for the test @TagFile table only has the two relevant IDs in it: TagID and FileID. It does not have the ID field that I assume is an IDENTITY field on this table. Unless there is a very specific reason for needing that IDENTITY field, I would suggest removing it. It adds no inherent benefit, as the combination of TagID and FileID is a natural key (i.e. it is both NOT NULL and unique). And if the clustered PK of this table were simply those two fields, the JOIN to the temp table of split-out TagIDs would be quite fast, even with millions of rows in TagFile.
One reason that this approach works so much better than trying to handle this via a function per FileID (outside of the obvious set-based is better than cursor-based reason) is that the list of TagIDs is the same for all files to be checked. So splitting that out more than one time is a waste of effort.
By not splitting the TagID list inline in the query I am able to capture the number of elements in that list with no additional effort. Hence this saves from needing to do a secondary calculation.
Here is a function called DelimitedSplit8K by Jeff Moden. This is used to split strings of length up to 8000. For more info, read this: http://www.sqlservercentral.com/articles/Tally+Table/72993/
CREATE FUNCTION [dbo].[DelimitedSplit8K](
    @pString VARCHAR(8000), --WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
    @pDelimiter CHAR(1)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (--10E+1 or 10 rows
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (
    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
    SELECT 1 UNION ALL
    SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString, t.N, 1) = @pDelimiter
),
cteLen(N1, L1) AS (--==== Return start and length (for use in SUBSTRING)
    SELECT
        s.N1,
        ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, s.N1), 0) - s.N1, 8000)
    FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT
    ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
    Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
Your query would now be:
DECLARE @pString VARCHAR(8000) = '1|3|5'

SELECT
    f.*
FROM tbl_File f
INNER JOIN tbl_TagFile tf ON tf.FileID = f.FileID
WHERE
    tf.TagID IN (SELECT CAST(Item AS INT) FROM dbo.DelimitedSplit8K(@pString, '|'))
GROUP BY f.FileID, f.FileName
HAVING COUNT(tf.ID) = (LEN(@pString) - LEN(REPLACE(@pString, '|', '')) + 1)
The expression below counts the number of TagIDs in the parameter by counting the occurrences of the delimiter | and adding 1:

(LEN(@pString) - LEN(REPLACE(@pString, '|', '')) + 1)

For example, with '1|3|5': LEN('1|3|5') = 5 and LEN('135') = 3, so the expression gives 5 - 3 + 1 = 3 tags.
Here is an option that does not require UDFs.
It can be argued that this is also complicated.
DECLARE @TagList VARCHAR(50)

-- pass in this
SET @TagList = '1|3|6'
SELECT
    FinalSet.FileID,
    FinalSet.Tag,
    FinalSet.TotalMatches
FROM
(
    SELECT
        tbl_TagFile.FileID,
        tbl_TagFile.Tag,
        COUNT(*) OVER (PARTITION BY tbl_TagFile.FileID) TotalMatches
    FROM
    (
        SELECT 1 FileID, '1' Tag UNION ALL
        SELECT 1, '2' UNION ALL
        SELECT 1, '3' UNION ALL
        SELECT 1, '6' UNION ALL
        SELECT 2, '1' UNION ALL
        SELECT 2, '3'
    ) tbl_TagFile
    INNER JOIN
    (
        SELECT tbl_Tag.Tag
        FROM
        (
            SELECT '1' Tag UNION ALL
            SELECT '2' UNION ALL
            SELECT '3' UNION ALL
            SELECT '4' UNION ALL
            SELECT '5' UNION ALL
            SELECT '6'
        ) tbl_Tag
        WHERE '|' + @TagList + '|' LIKE '%|' + Tag + '|%'
    ) LimitedTagTable
        ON LimitedTagTable.Tag = tbl_TagFile.Tag
) FinalSet
WHERE
    FinalSet.TotalMatches = (LEN(@TagList) - LEN(REPLACE(@TagList, '|', '')) + 1)
There are some complications in this around data types and indexes and such, but you can see the concept: you only get the records that match your passed-in string.
Subquery LimitedTagTable is your tag list filtered by your pipe-delimited input string.
Subquery FinalSet joins your limited tag list to your list of files.
Column TotalMatches works out how many tag matches each file had.
Finally, this line limits the output to those files that had enough matches:

FinalSet.TotalMatches = (LEN(@TagList) - LEN(REPLACE(@TagList, '|', '')) + 1)

Please experiment with different inputs and datasets to see if it suits, as I have made a number of assumptions.
I'm answering my own question in the hope that someone can tell me if/how flawed it is. So far it seems to be working, but this is only early testing.
Function:
ALTER FUNCTION [dbo].[udf_FileExistsByTags]
(
    @FileID int
    ,@Tags nvarchar(max)
)
RETURNS bit
AS
BEGIN
    DECLARE @Exists bit = 0
    DECLARE @Count int = 0
    DECLARE @TagTable TABLE ( FileID int, TagID int )
    DECLARE @Tag int

    WHILE LEN(@Tags) > 0
    BEGIN
        SET @Tag = CAST(LEFT(@Tags, CHARINDEX('|', @Tags + '|') - 1) as int)
        SET @Count = @Count + 1
        IF EXISTS (SELECT * FROM tbl_FileTag WHERE FileID = @FileID AND TagID = @Tag)
        BEGIN
            INSERT INTO @TagTable ( FileID, TagID ) VALUES ( @FileID, @Tag )
        END
        SET @Tags = STUFF(@Tags, 1, CHARINDEX('|', @Tags + '|'), '')
    END

    SET @Exists = CASE WHEN @Count = (SELECT COUNT(*) FROM @TagTable) THEN 1 ELSE 0 END
    RETURN @Exists
END
Then in the query:
SELECT * FROM tbl_File a WHERE dbo.udf_FileExistsByTags(a.FileID, @Tags) = 1
So now I'm looking for errors.
What do you think? It's probably not very efficient, but this search will only be used on a periodic basis.