I am currently working on a project that includes an automatic flairing feature.
Basically, this is what it does:
I have a table called Fox, with various columns, and some other tables referring to Fox (e.g. CaughtChickens).
I want to have another table, which I can expand at any time, with three columns (besides the ID, of course): FlairName, FlairColor, and FlairStoredProcedure.
I want to have a stored procedure that returns every FlairName whose FlairStoredProcedure returns 1 for a given FoxID.
This way I can write a stored procedure that checks whether a certain Fox caught a chicken and returns 1 if it did, and show a Hunter flair in the user UI.
There are some downsides to this:
Every time I want a new flair I have to write a new stored procedure (I probably can't get around that one).
The stored procedures all need to have the same input parameters (i.e. @FoxID) and need to return 1 or 0 (or select nothing when false and the flair name when true?).
I need to use dynamic SQL in the stored procedure that collects these flairs, and I'd rather not use any dynamic SQL at all.
Is there a much easier way to do this that I am missing?
EDIT:
Example:
I have a table Fox:
FoxID  FoxName  FoxColor  FoxSize  Valid
1      Swiper   red       12       1
I would have a table Flairs
FlairID  FlairName  FlairStoredProcedure  Valid
1        Big        pFlairs_IsFoxBig      1
2        Green      pFlairs_IsFoxGreen    1
I would have 3 stored procedures:
pFox_Flairs
@FoxID int

DECLARE @CurrentFlairSP varchar(100)
DECLARE @CurrentIDIndex int = 1
DECLARE @ResultFlairs table (FlairName varchar(50), FlairColor int)
WHILE @CurrentIDIndex <= (SELECT MAX(FlairID) FROM Flairs WHERE Valid <> 0)
BEGIN
    IF EXISTS (SELECT * FROM Flairs WHERE FlairID = @CurrentIDIndex AND Valid <> 0)
    BEGIN
        SET @CurrentFlairSP = CONCAT((SELECT TOP 1 FlairStoredProcedure FROM Flairs WHERE FlairID = @CurrentIDIndex AND Valid <> 0), ' @FoxID = ', @FoxID)
        INSERT INTO @ResultFlairs
        EXEC (@CurrentFlairSP)
    END
    SET @CurrentIDIndex += 1
END
SELECT * FROM @ResultFlairs
pFlairs_IsFoxBig
@FoxID int

SELECT 'Big' WHERE EXISTS (SELECT TOP 1 * FROM Fox WHERE FoxID = @FoxID AND FoxSize > 10)
pFlairs_IsFoxGreen
@FoxID int

SELECT 'Green' WHERE EXISTS (SELECT TOP 1 * FROM Fox WHERE FoxID = @FoxID AND FoxColor = 'green')
You could create a single table-valued function that checks all the conditions:
CREATE OR ALTER FUNCTION dbo.GetFlairs ( @FoxID int )
RETURNS TABLE
AS RETURN
    SELECT v.FlairName
    FROM Fox f
    CROSS APPLY (
        SELECT 'Big'
        WHERE f.FoxSize > 10
        UNION ALL
        SELECT 'Green'
        WHERE f.FoxColor = 'green'
    ) v(FlairName)
    WHERE f.FoxID = @FoxID;
GO
Then you can use it like this:
SELECT *
FROM dbo.GetFlairs(123);
If or when you add more attributes or conditions, simply add them to the function as another UNION ALL branch.
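For example, a hypothetical Hunter flair based on the CaughtChickens table mentioned in the question could be added as one more branch (this assumes CaughtChickens has a FoxID column; adjust to the real schema):

CREATE OR ALTER FUNCTION dbo.GetFlairs ( @FoxID int )
RETURNS TABLE
AS RETURN
    SELECT v.FlairName
    FROM Fox f
    CROSS APPLY (
        SELECT 'Big'
        WHERE f.FoxSize > 10
        UNION ALL
        SELECT 'Green'
        WHERE f.FoxColor = 'green'
        UNION ALL
        -- hypothetical: any fox with at least one caught chicken gets the Hunter flair
        SELECT 'Hunter'
        WHERE EXISTS (SELECT 1 FROM CaughtChickens cc WHERE cc.FoxID = f.FoxID)
    ) v(FlairName)
    WHERE f.FoxID = @FoxID;
GO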
I currently have a stored procedure in MSSQL in which I execute a SELECT statement multiple times, based on the variables I pass to the stored procedure. The stored procedure counts how many results will be returned for every filter a user can enable.
The stored procedure isn't the issue. I transformed the SELECT statement from the stored procedure into a regular SELECT statement, which looks like this:
DECLARE @contentRootId int = 900589
DECLARE @RealtorIdList varchar(2000) = ';880;884;1000;881;885;'
DECLARE @publishSoldOrRentedSinceDate int = 8
DECLARE @isForSale BIT = 1
DECLARE @isForRent BIT = 0
DECLARE @isResidential BIT = 1
--...(another 55 variables)...

-- Table to be returned
DECLARE @resultTable TABLE
(
    variableName varchar(100),
    [value] varchar(200)
)

-- Create a table based on the input variable. Example: turns ';18;118;' into a table containing two ints, 18 and 118
DECLARE @RealtorIdTable table(RealtorId int)
INSERT INTO @RealtorIdTable SELECT * FROM dbo.Split(@RealtorIdList, ';') OPTION (MAXRECURSION 150)

INSERT INTO @resultTable ([value], variableName)
SELECT [Value], VariableName FROM(
    SELECT COUNT(*) as TotalCount,
    ISNULL(SUM(CASE WHEN reps.ForRecreation = 1 THEN 1 ELSE 0 END), 0) as ForRecreation,
    ISNULL(SUM(CASE WHEN reps.IsQualifiedForSeniors = 1 THEN 1 ELSE 0 END), 0) as IsQualifiedForSeniors,
    --...(A whole bunch more SUM(CASE)...)
    FROM TABLE1 reps
    LEFT JOIN temp t ON
        t.ContentRootID = @contentRootId
        AND t.RealEstatePropertyID = reps.ID
    WHERE
        (EXISTS(SELECT 1 FROM @RealtorIdTable WHERE RealtorId = reps.RealtorID))
        AND (@SelectedGroupIds IS NULL OR EXISTS(SELECT 1 FROM @SelectedGroupIdtable WHERE GroupId = t.RealEstatePropertyGroupID))
        AND (ISNULL(reps.IsForSale, 0) = ISNULL(@isForSale, 0))
        AND (ISNULL(reps.IsForRent, 0) = ISNULL(@isForRent, 0))
        AND (ISNULL(reps.IsResidential, 0) = ISNULL(@isResidential, 0))
        AND (ISNULL(reps.IsCommercial, 0) = ISNULL(@isCommercial, 0))
        AND (ISNULL(reps.IsInvestment, 0) = ISNULL(@isInvestment, 0))
        AND (ISNULL(reps.IsAgricultural, 0) = ISNULL(@isAgricultural, 0))
        --...(Around 50 more of these WHERE conditions)...
) as tbl
UNPIVOT (
    [Value]
    FOR [VariableName] IN(
        [TotalCount],
        [ForRecreation],
        [IsQualifiedForSeniors],
        --...(All the other things I selected in the above query)...
    )
) as d

SELECT * FROM @resultTable
The combination of a realtor ID and content root ID gives me a default set of X records. When I choose a combination that gives me ~4600 records, the execution time is around 250 ms. When I execute the statement with a combination that gives me ~600 records, the execution time is about 20 ms.
I would like to know why this is happening. I tried removing all of the SUM(CASE ...) expressions from the SELECT, I tried removing almost everything from the WHERE clause, and I tried removing the JOIN, but I keep seeing the huge difference between the result set of 4600 records and the one of 600.
Table variables can perform worse when the number of records is large. Consider using a temporary table instead. See When should I use a table variable vs temporary table in sql server?
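For example, a minimal sketch of that change for the realtor ID list (assuming dbo.Split is the same helper used above and the rest of the statement stays unchanged):

-- A temporary table gets statistics, unlike a table variable,
-- which helps the optimizer once the list grows large.
CREATE TABLE #RealtorIdTable (RealtorId int)

INSERT INTO #RealtorIdTable
SELECT * FROM dbo.Split(@RealtorIdList, ';') OPTION (MAXRECURSION 150)

-- ...use #RealtorIdTable in the EXISTS(...) check exactly as before...

DROP TABLE #RealtorIdTable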
Also, consider replacing the UNPIVOT with alternative SQL code. Writing your own T-SQL will give you more control and may even increase performance. See for example PIVOT, UNPIVOT and performance.
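A minimal sketch of such a hand-written alternative using CROSS APPLY (VALUES ...), assuming the same inner aggregate query as above (only two columns shown; add one row per counter):

SELECT v.VariableName, v.[Value]
FROM (
    SELECT COUNT(*) AS TotalCount,
           ISNULL(SUM(CASE WHEN reps.ForRecreation = 1 THEN 1 ELSE 0 END), 0) AS ForRecreation
           -- ...the rest of the aggregates, the JOIN and the WHERE clause as before...
    FROM TABLE1 reps
) AS tbl
CROSS APPLY (VALUES
    ('TotalCount',    CAST(tbl.TotalCount    AS varchar(200))),
    ('ForRecreation', CAST(tbl.ForRecreation AS varchar(200)))
) AS v(VariableName, [Value])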
I have an existing stored procedure. I have been asked to find a way to fit a specific set of logic into that procedure in order to avoid having to create a new one. I am not the best with SQL, but I would still like to do everything I can to accomplish the task.
My current goal: use the existing result set generated by the SELECT TOP 400 statement and somehow fit the UPDATE I wrote (second chunk of code) to work with it.
My existing procedure:
USE [cph]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [dbo].[PatientSynch]
@EnvironmentKey varchar(1)
AS
BEGIN
DECLARE @patId VARCHAR(25)
select top 400
c.pat_id as cpPatId,
(LEFT(c.fname, 1)
+LEFT(c.lname, 1)
+ @EnvironmentKey
+ RIGHT('00000000'
+ convert(varchar,c.pat_id),8 )) AS PRN,
c.pref_meth_cont_cn as PreferredChannel
--,p.cppatid,p.prn
from
dbo.cppat c
left outer join dbo.patient p on c.pat_id=p.cppatid
where
p.cppatid is null or p.prn is null
order by
c.pat_id desc
END
The statement I have created to suit my needs:
UPDATE dbo.cppat
SET chart_id = CONVERT(VARCHAR(10), pat_id) + '+'
WHERE pat_id IN
(
SELECT pat_id
FROM cppat
)
I've added Chart_Id to the query since you need it there in order to update it. The following creates a common table expression that you can use to update the records:
;WITH Update_Complex_Query AS
(
SELECT TOP 400
c.pat_id AS cpPatId
, (LEFT(c.fname, 1) + LEFT(c.lname, 1) + @EnvironmentKey + RIGHT('00000000' + convert(VARCHAR, c.pat_id), 8)) AS PRN
, c.pref_meth_cont_cn AS PreferredChannel
--,p.cppatid,p.prn
, c.Chart_Id
FROM dbo.cppat c
LEFT JOIN dbo.patient p
ON c.pat_id = p.cppatid
WHERE p.cppatid IS NULL
OR p.prn IS NULL
ORDER BY c.pat_id DESC
)
UPDATE Update_Complex_Query
SET Chart_id = CONVERT(VARCHAR(10), cpPatId) + '+'
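If you want to preview which rows the UPDATE will touch, one option (a sketch reusing the same CTE body) is to run it with a SELECT first:

;WITH Update_Complex_Query AS
(
    SELECT TOP 400
        c.pat_id AS cpPatId
        , c.Chart_Id
    FROM dbo.cppat c
    LEFT JOIN dbo.patient p
        ON c.pat_id = p.cppatid
    WHERE p.cppatid IS NULL
        OR p.prn IS NULL
    ORDER BY c.pat_id DESC
)
SELECT cpPatId, Chart_Id, CONVERT(VARCHAR(10), cpPatId) + '+' AS NewChartId
FROM Update_Complex_Query;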
I have the following tables:
tbl_File:
FileID | Filename
-----------------
1 | test.jpg
and
tbl_Tag:
TagID | TagName
---------------
1 | Red
and
tbl_TagFile:
ID | TagID | FileID
-------------------
1 | 1 | 1
I need to run a non-inclusive query against these tables. For example, imagine a list of checkboxes to select one or more tags, and then a search button. I need to pass the TagIDs to the query as a pipe-delimited string, such as "1|2|5|".
The search results need to be non-inclusive, meaning a file must meet all of the criteria: if 3 tags are selected, the results should be files that have all 3 tags associated with them.
I think I've made this too complicated. I tried iterating over the tags using CHARINDEX and the like to work my way through the string, but it seems there must be an easier way.
I'd like to do this as a function... Such as
SELECT FileID, Filename
FROM tbl_Files
WHERE dbo.udf_FileExistswithTags(@Tags, FileID) = 1
Any efficient way to do this?
It doesn't sound from your example scenario that the actual "need" is to pass a pipe-delimited string. I would highly suggest abandoning that idea and using a table-valued parameter (TVP) in your stored procedure. This has numerous advantages: you will not hit a datatype limit or a "number of parameters" limit that might occur with very large sets of criteria, and it gets away from any need to run a (potentially very slow) UDF.
Split the string into tokens on the application side, and then insert each token as a row in the TVP. Example below:
Create the TVP type in your database:
CREATE TYPE [dbo].[FileNameType] AS TABLE
(
fileName varchar(1000)
)
On the application side, build your list of filename tokens into a recordset:
private static List<SqlDataRecord> BuildFileNameTokenRecords(IEnumerable<string> tokens)
{
    var records = new List<SqlDataRecord>();
    foreach (string token in tokens)
    {
        var record = new SqlDataRecord(
            new SqlMetaData[]
            {
                new SqlMetaData("fileName", SqlDbType.VarChar, 1000),
            }
        );
        record.SetString(0, token); // copy the token value into the record
        records.Add(record);
    }
    return records;
}
Wherever you run your proc from (rough code here):
var records = BuildFileNameTokenRecords(listofstrings);
var sqlCmd = sqlDb.GetStoredProcCommand("FileExists");
sqlDb.AddInParameter(sqlCmd, "tvpFilenameTokens", SqlDbType.Structured, records);
ExecuteNonQuery(sqlCmd);
Filtering your select statement then simply becomes a matter of joining on the tokens in the table parameter. Something like this:
CREATE PROCEDURE dbo.FileExists
(
-- Put additional parameters here
@tvpFilenameTokens dbo.FileNameType READONLY
)
AS
BEGIN
SELECT FileID, Filename
FROM tbl_Files
INNER JOIN @tvpFilenameTokens tokens
    ON tbl_Files.FileID = tokens.fileName
END
Here is an option that should scale. All of the functionality is available back to SQL Server 2005. It uses a CTE to separate the portion of the query that finds only the FileIDs that have all of the TagIDs passed in, and then that list of FileIDs is joined to the [File] table to get the details. It also uses an INNER JOIN instead of an IN list to match the TagID's.
Please note that the example below uses a SQLCLR splitter that is freely available in the SQL# library (which I wrote, but this function is in the Free version). The specific splitter used is not the important part; it should just be one that is either SQLCLR, an inline tally table (like the one used in @wewesthemenace's answer), or the XML method. Just don't use a splitter based on a WHILE loop or a recursive CTE.
---- TEST SETUP
DECLARE @File TABLE
(
    FileID INT NOT NULL PRIMARY KEY,
    [Filename] NVARCHAR(200) NOT NULL
);

DECLARE @TagFile TABLE
(
    TagID INT NOT NULL,
    FileID INT NOT NULL,
    PRIMARY KEY (TagID, FileID)
);

INSERT INTO @File VALUES (1, 'File1.txt');
INSERT INTO @File VALUES (2, 'File2.txt');
INSERT INTO @File VALUES (3, 'File3.txt');

INSERT INTO @TagFile VALUES (1, 1);
INSERT INTO @TagFile VALUES (2, 1);
INSERT INTO @TagFile VALUES (5, 1);
INSERT INTO @TagFile VALUES (1, 2);
INSERT INTO @TagFile VALUES (2, 2);
INSERT INTO @TagFile VALUES (4, 2);
INSERT INTO @TagFile VALUES (1, 3);
INSERT INTO @TagFile VALUES (2, 3);
INSERT INTO @TagFile VALUES (5, 3);
INSERT INTO @TagFile VALUES (6, 3);
---- DONE WITH TEST SETUP

DECLARE @TagsToGet VARCHAR(100); -- this would be the proc input parameter
SET @TagsToGet = '1|2|5';

CREATE TABLE #Tags (TagID INT NOT NULL PRIMARY KEY);
DECLARE @NumTags INT;

INSERT INTO #Tags (TagID)
    SELECT split.SplitVal
    FROM SQL#.String_Split4k(@TagsToGet, '|', 1) split;
SET @NumTags = @@ROWCOUNT;

;WITH files AS
(
    SELECT tf.FileID
    FROM @TagFile tf
    INNER JOIN #Tags tg
        ON tg.TagID = tf.TagID
    GROUP BY tf.FileID
    HAVING COUNT(*) = @NumTags
)
SELECT fl.*
FROM @File fl
INNER JOIN files
    ON files.FileID = fl.FileID
ORDER BY fl.[Filename] ASC;

DROP TABLE #Tags; -- don't need this if code above is placed in a proc
Results:
FileID Filename
1 File1.txt
3 File3.txt
Notes
As much as I love TVPs (and I do, when they are done correctly and used appropriately), I would say that they are a bit much for this type of small-scale, single-dimensional array scenario. There won't really be any performance gain over a SQLCLR streaming TVF string splitter, but a TVP would require more app code and an additional user-defined table type, which can't be updated without first dropping all procs that reference it. That doesn't happen all of the time, but it needs to be considered in terms of long-term maintenance costs.
The JOIN between TagFile and the temporary table populated from the split operation should be much more efficient than using an IN list with a subquery for the split operation. An IN list is short-hand for all of the values in it to be their own OR conditions. Hence the JOIN is a fully set-based approach that lets the Query Optimizer do its thang.
The structure I used for the test @TagFile table only has the two relevant IDs in it: TagID and FileID. It does not have the ID field that I assume is an IDENTITY field on this table. Unless there is a very specific reason for needing that IDENTITY field, I would suggest removing it. It adds no inherent benefit, as the combination of TagID and FileID is a natural key (i.e. it is both NOT NULL and unique). And if the clustered PK of this table were simply those two fields, the JOIN to the temp table of those split-out TagIDs would be quite fast, even with millions of rows in TagFile (see the sketch after these notes).
One reason that this approach works so much better than trying to handle this via a function per FileID (outside of the obvious set-based is better than cursor-based reason) is that the list of TagIDs is the same for all files to be checked. So splitting that out more than one time is a waste of effort.
By not splitting the TagID list inline in the query I am able to capture the number of elements in that list with no additional effort. Hence this saves from needing to do a secondary calculation.
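A sketch of what that natural-key structure could look like, assuming the column names from the question (the constraint name is just illustrative):

CREATE TABLE dbo.tbl_TagFile
(
    TagID  INT NOT NULL,
    FileID INT NOT NULL,
    CONSTRAINT PK_TagFile PRIMARY KEY CLUSTERED (TagID, FileID)
);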
Here is a function called DelimitedSplit8K by Jeff Moden. This is used to split strings of length up to 8000. For more info, read this: http://www.sqlservercentral.com/articles/Tally+Table/72993/
CREATE FUNCTION [dbo].[DelimitedSplit8K](
@pString VARCHAR(8000), --WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
@pDelimiter CHAR(1)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (--10E+1 or 10 rows
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString, t.N, 1) = @pDelimiter
),
cteLen(N1, L1) AS(--==== Return start and length (for use in substring)
SELECT
s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, s.N1), 0) - s.N1, 8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
Your query would now be:
DECLARE @pString VARCHAR(8000) = '1|3|5'

SELECT
    f.*
FROM tbl_File f
INNER JOIN tbl_TagFile tf ON tf.FileID = f.FileID
WHERE
    tf.TagID IN (SELECT CAST(Item AS INT) FROM dbo.DelimitedSplit8K(@pString, '|'))
GROUP BY f.FileID, f.FileName
HAVING COUNT(tf.ID) = (LEN(@pString) - LEN(REPLACE(@pString, '|', '')) + 1)
The expression below counts the number of TagIDs in the parameter by counting the occurrences of the delimiter | and adding 1.
(LEN(@pString) - LEN(REPLACE(@pString, '|', '')) + 1)
Here is an option that does not require UDF's.
It can be argued that this is also complicated.
DECLARE @TagList VARCHAR(50)
-- pass in this
SET @TagList = '1|3|6'
SELECT
FinalSet.FileID,
FinalSet.Tag,
FinalSet.TotalMatches
FROM
(
SELECT
tbl_TagFile.FileID,
tbl_TagFile.Tag,
COUNT(*) OVER(PARTITION BY tbl_TagFile.FileID) TotalMatches
FROM
(
SELECT 1 FileID, '1' Tag UNION ALL
SELECT 1 , '2' UNION ALL
SELECT 1 , '3' UNION ALL
SELECT 1 , '6' UNION ALL
SELECT 2 , '1' UNION ALL
SELECT 2 , '3'
) tbl_TagFile
INNER JOIN
(
SELECT tbl_Tag.Tag
FROM
(
SELECT '1' Tag UNION ALL
SELECT '2' UNION ALL
SELECT '3' UNION ALL
SELECT '4' UNION ALL
SELECT '5' UNION ALL
SELECT '6'
) tbl_Tag
WHERE '|' + @TagList + '|' LIKE '%|' + Tag + '|%'
) LimitedTagTable
ON LimitedTagTable.Tag = tbl_TagFile.Tag
) FinalSet
WHERE
FinalSet.TotalMatches = (LEN(@TagList) - LEN(REPLACE(@TagList, '|', '')) + 1)
There are some complications in this around data types and indexes, but you can see the concept: you are only getting the records that match the string you passed in.
The subquery LimitedTagTable is your tag list filtered by your input pipe-delimited string.
The subquery FinalSet joins that limited tag list to your list of files.
The column TotalMatches works out how many tag matches each file had.
Finally this line limits the output to those files that had enough matches:
FinalSet.TotalMatches = (LEN(@TagList) - LEN(REPLACE(@TagList, '|', '')) + 1)
Please experiment with different inputs and datasets to see if it suits, as I have made a number of assumptions.
I'm answering my own question in the hope that someone can let me know if (or how) it is flawed. So far it seems to be working, but this is only early testing.
Function:
ALTER FUNCTION [dbo].[udf_FileExistsByTags]
(
    @FileID int
    ,@Tags nvarchar(max)
)
RETURNS bit
AS
BEGIN
    DECLARE @Exists bit = 0
    DECLARE @Count int = 0
    DECLARE @TagTable TABLE ( FileID int, TagID int )
    DECLARE @Tag int

    WHILE len(@Tags) > 0
    BEGIN
        SET @Tag = CAST(LEFT(@Tags, charindex('|', @Tags + '|') - 1) as int)
        SET @Count = @Count + 1

        IF EXISTS (SELECT * FROM tbl_FileTag WHERE FileID = @FileID AND TagID = @Tag)
        BEGIN
            INSERT INTO @TagTable ( FileID, TagID ) VALUES ( @FileID, @Tag )
        END

        SET @Tags = STUFF(@Tags, 1, charindex('|', @Tags + '|'), '')
    END

    SET @Exists = CASE WHEN @Count = (SELECT COUNT(*) FROM @TagTable) THEN 1 ELSE 0 END

    RETURN @Exists
END
Then in the query:
SELECT * FROM tbl_File a WHERE dbo.udf_FileExistsByTags(a.FileID, @Tags) = 1
So now I'm looking for errors.
What do you think? It's probably not very efficient; however, this search will only be used on a periodic basis.
I store positions in a SQL Server 2012 database, where each position is defined by a position number and a company number.
The position numbers are unique for each company only.
For instance, my database could have the following
POSITION_NO COMPANY_NO
1 1
2 1
3 1
1 2
2 2
3 2
1 3
I need a function which takes a company number as a parameter, and returns the next sequential position number, which in the example table above would be 2 for COMPANY_NO = 3
What I use at the moment is:
CREATE PROCEDURE [DB].[GenerateKey]
    @p_company_no float(53),
    @return_value_argument float(53) OUTPUT
AS
BEGIN
    DECLARE
        @v_position_no numeric(5, 0)

    SELECT @v_position_no = max(POSITION_NO) + 1
    FROM DB.POSITION_TABLE with (nolock)
    WHERE COMPANY_NO = @p_company_no

    SET @return_value_argument = @v_position_no
    RETURN
END
I am aware of the potential pitfalls of using WITH (NOLOCK), but it was added in an unsuccessful attempt to prevent locking in my database. Besides the fact that well-written code is obviously preferable, the main reason I am asking this question is to cut down the number of places that could be causing the locking.
Is there any way my code could be improved?
Create an auxiliary table with sequences, with one row for every company (as you already did):
create table seq (company int, sequence int);
go
Seed the counters, one for every company (say there are two companies, 1 and 2):
insert seq values
(1, 1), (2, 1);
go
Then all you need is a way to both update and select the new value in a single statement to avoid race conditions. This is how to do it:
declare @next int;
declare @company int;
set @company = 2;

update seq
set @next = sequence = sequence + 1
where company = @company;

select @next
It would be nice to wrap this in a scalar function, but unfortunately updates are not allowed in functions. However, you already have a stored procedure in place, so just modify the code in it.
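A minimal sketch of what that modified procedure could look like, assuming the seq table above is seeded with the same company numbers (and keeping the original float parameters, although ints would be preferable):

ALTER PROCEDURE [DB].[GenerateKey]
    @p_company_no float(53),
    @return_value_argument float(53) OUTPUT
AS
BEGIN
    DECLARE @next int;

    -- atomically increment and read the counter for this company
    UPDATE seq
    SET @next = sequence = sequence + 1
    WHERE company = @p_company_no;

    SET @return_value_argument = @next;
END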
And please tell me the datatypes used are not really floats. Why not ints?
WHILE (1 = 1)
BEGIN
    SELECT @v_position_no = max(POSITION_NO)
    FROM DB.POSITION_TABLE with (nolock)
    WHERE COMPANY_NO = @p_company_no

    INSERT INTO DB.POSITION_TABLE
        (COMPANY_NO, POSITION_NO)
    SELECT TOP 1 @p_company_no, @v_position_no + 1
    FROM DB.POSITION_TABLE with (nolock)
    WHERE NOT EXISTS (SELECT 1
                      FROM DB.POSITION_TABLE with (nolock)
                      WHERE COMPANY_NO = @p_company_no
                      AND POSITION_NO = @v_position_no + 1)

    IF (@@ROWCOUNT > 0)
        BREAK;
END

SET @return_value_argument = @v_position_no + 1
Note that the second statement only inserts if POSITION_NO + 1 hasn't been added in the meantime; if it has, the loop tries again.