TSQL Variable With List of Values for IN Clause - sql-server

I want to use a clause along the lines of "CASE WHEN ... THEN 1 ELSE 0 END" in a select statement. The tricky part is that I need it to work with "value IN @List".
If I hard-code the list it works fine, and it performs well:
SELECT
CASE WHEN t.column_a IN ( 'value a', 'value b' ) THEN 1 ELSE 0 END AS priority
, t.column_b
, t.column_c
FROM
table AS t
ORDER BY
priority DESC
What I would like to do is:
-- @AvailableValues would be a list (array) of strings.
DECLARE
@AvailableValues ???
SELECT
@AvailableValues = ???
FROM
lookup_table
SELECT
CASE WHEN t.column_a IN @AvailableValues THEN 1 ELSE 0 END AS priority
, t.column_b
, t.column_c
FROM
table AS t
ORDER BY
priority DESC
Unfortunately, it seems that SQL Server doesn't do this - you can't use a variable with an IN clause. So this leaves me with some other options:
Make '@AvailableValues' a comma-delimited string and use a LIKE statement. This does not perform well.
Use an inline SELECT statement against 'lookup_table' in place of the variable. Again, this doesn't perform well (I think) because it has to look up the table on each row.
Write a function wrapping around the SELECT statement in place of the variable. I haven't tried this yet (will try it now) but it seems that it will have the same problem as a direct SELECT statement.
???
Performance is very important for the query: it has to be really fast because it feeds a real-time search result page (i.e. no caching) for a web site.
Are there any other options here? Or is there a way to improve one of the options above so that it performs well?
Thanks in advance for any help given!
UPDATE: I should have mentioned that the 'lookup_table' in the example above is already a table variable. I've also updated the sample queries to better demonstrate how I'm using the clause.
UPDATE II: It occurred to me that the IN clause is operating on an NVARCHAR/NCHAR field (due to historical table design reasons). If I were to make changes so that it dealt with integer fields (i.e. through PK/FK relationship constraints), could this have much impact on performance?

You can use a variable in an IN clause, but not in the way you're trying to do. For instance, you could do this:
declare @i int
declare @j int
select @i = 10, @j = 20
select * from YourTable where SomeColumn IN (@i, @j)
The key is that the variables cannot represent more than one value.
To answer your question, use the inline select. As long as you don't reference an outer value in the query (which could change the results on a per-row basis), the engine will not repeatedly select the same data from the table.
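A minimal sketch of the inline select, assuming a table variable named @lookup (a stand-in for the asker's lookup table) and a hypothetical dbo.MyTable with the columns from the question. The subquery never references t, so it is not correlated with the outer row:

```sql
-- @lookup and dbo.MyTable are illustrative names, not taken from the question.
DECLARE @lookup TABLE (SomeValue nvarchar(100) NOT NULL PRIMARY KEY);
INSERT INTO @lookup (SomeValue) VALUES (N'value a'), (N'value b');

SELECT
    -- Uncorrelated subquery: no column of t appears inside it
    CASE WHEN t.column_a IN (SELECT l.SomeValue FROM @lookup AS l)
         THEN 1 ELSE 0 END AS priority
  , t.column_b
  , t.column_c
FROM dbo.MyTable AS t
ORDER BY priority DESC;
```

Comparing the execution plan of this form against the hard-coded list is probably the quickest way to check whether it is fast enough for the search page.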

Based on your update and assuming the lookup table is small, I suggest trying something like the following:
DECLARE @MyLookup table
(SomeValue nvarchar(100) not null)
SELECT
case when ml.SomeValue is not null then 1 else 0 end AS Priority
,t.column_b
,t.column_c
from MyTable t
left outer join @MyLookup ml
on ml.SomeValue = t.column_a
order by case when ml.SomeValue is not null then 1 else 0 end desc
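(SQL Server does let you reference a plain column alias such as "Priority" in the ORDER BY clause, so order by Priority desc would also work; the expression only has to be repeated when the alias would appear inside a larger expression. Alternatively, you could use the ordinal position like so:
order by 1 desc
but that's generally not recommended.)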
As long as the lookup table is small, this really should run fairly quickly, but your comment implies that it's a pretty big table, and that could slow down performance.
As for n[Var]char vs. int, yes, integers would be faster, if only because the CPU has fewer bytes to juggle around. That should only matter when processing a lot of rows, so it might be worth trying.

I solved this problem by using the CHARINDEX function. I wanted to pass the list in as a single string parameter, so I built a string with a leading and trailing comma around each value I wanted to test for. Then I concatenated a leading and a trailing comma onto the value I wanted to check for being "in" the parameter, and finally tested for CHARINDEX > 0.
DECLARE @CTSPST_Profit_Centers VARCHAR (256)
SELECT @CTSPST_Profit_Centers = ',CS5000U37Y,CS5000U48B,CS5000V68A,CS5000V69A,CS500IV69A,CS5000V70S,CS5000V79B,CS500IV79B,'
SELECT
CASE
WHEN CHARINDEX(','+ISMAT.PROFIT_CENTER+',' ,@CTSPST_Profit_Centers) > 0 THEN 'CTSPST'
ELSE ISMAT.DESIGN_ID + ' 1 CPG'
END AS DESIGN_ID
You can also do it in the where clause
WHERE CHARINDEX(','+ISMAT.PROFIT_CENTER+',',@CTSPST_Profit_Centers) > 0
If you were trying to compare numbers you'd need to convert the number to a text string for the CHARINDEX function to work.
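For the numeric case, a small sketch of the same trick; the variable names @ids and @candidate are made up for illustration. The integer is cast to text before the delimiters are added:

```sql
-- @ids and @candidate are hypothetical names used only for this example.
DECLARE @ids varchar(256) = ',10,20,35,';
DECLARE @candidate int = 20;

SELECT CASE
         -- varchar(11) is wide enough for any int, including the minus sign
         WHEN CHARINDEX(',' + CAST(@candidate AS varchar(11)) + ',', @ids) > 0
         THEN 1 ELSE 0
       END AS is_in_list;
```

The surrounding commas are what stop 2 from matching inside 20.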

This might be along the lines of what you need.
Note that this assumes that you have permissions and the input data has been sanitized.
From Running Dynamic Stored Procedures
CREATE PROCEDURE MyProc (@WHEREClause varchar(255))
AS
-- Create a variable @SQLStatement, sized to hold the fixed prefix plus @WHEREClause
DECLARE @SQLStatement varchar(512)
-- Enter the dynamic SQL statement into the
-- variable @SQLStatement (T-SQL string literals use single quotes)
SELECT @SQLStatement = 'SELECT * FROM TableName WHERE ' + @WHEREClause
-- Execute the SQL statement
EXEC(@SQLStatement)
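Where only the compared value changes rather than the whole WHERE clause, a parameterized call through sp_executesql avoids concatenating user input into the statement. This is a sketch under that assumption; dbo.TableName and SomeColumn are placeholder names:

```sql
-- dbo.TableName and SomeColumn are placeholders, not objects from the original post.
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.TableName WHERE SomeColumn = @value';

-- The value travels as a typed parameter instead of being spliced into the text.
EXEC sys.sp_executesql
     @sql,
     N'@value nvarchar(100)',
     @value = N'value a';
```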

Related

SQL SUBSTRING & PATINDEX of varying lengths

SQL Server 2017.
Given the following 3 records with field of type nvarchar(250) called fileString:
_318_CA_DCA_2020_12_11-01_00_01_VM6.log
_319_CA_DCA_2020_12_12-01_VM17.log
_333_KF_DCA01_00_01_VM232.log
I would want to return:
VM6
VM17
VM232
Attempted thus far with:
SELECT
SUBSTRING(fileString, PATINDEX('%VM[0-9]%', fileString), 3)
FROM dbo.Table
But of course that only returns VM and 1 number.
How would I define the parameter for number of characters when it varies?
EDIT: to pre-emptively answer a question that may come up, yes, the VM pattern will always be followed immediately by .log and nothing else. But even if I took that approach and worked backwards, I still don't understand how to define the number of characters to take when the number varies.
Here is one way:
DECLARE @test TABLE( fileString varchar(500))
INSERT INTO @test VALUES
('_318_CA_DCA_2020_12_11-01_00_01_VM6.log')
,('_319_CA_DCA_2020_12_12-01_00_01_VM17.log')
,('_333_KF_DCA_2020_12_15-01_00_01_VM232.log')
-- 5 is the length of the file extension '.log' (always 4 characters) + 1
SELECT
REVERSE(SUBSTRING(REVERSE(fileString),5,CHARINDEX('_',REVERSE(fileString))-5))
FROM @test AS t
This will dynamically grab the length and location of the last _ and remove the .log.
It is not the most efficient approach. If you are able to write a CLR function using C# and import it into SQL Server, that will be much more efficient. Or you can use this as a starting point and tweak it as needed.
You can remove the variable and replace it with your table, like below:
DECLARE @TESTVariable as varchar(500)
Set @TESTVariable = '_318_CA_DCA_2020_12_11-01_00_01_VM6adf.log'
SELECT REPLACE(SUBSTRING(@TESTVariable, PATINDEX('%VM[0-9]%', @TESTVariable), PATINDEX('%[_]%', REVERSE(@TESTVariable))), '.log', '')
select *,
part = REPLACE(SUBSTRING(filestring, PATINDEX('%VM[0-9]%', filestring), PATINDEX('%[_]%', REVERSE(filestring))), '.log', '')
from table
Your lengths are consistent at the beginning. So get away from patindex and use substring to crop out the beginning. Then just replace the '.log' with an empty string at the end.
select *,
part = replace(substring(filestring,33,255),'.log','')
from table;
Edit:
Okay, from your edit you show differing prefix portions. Then patindex is in fact correct. Here's my solution, which is not better or worse than the other answers but differs with respect to the fact that it avoids reverse and delegates the patindex computation to a cross apply section. You may find it a bit more readable.
select filestring,
part = replace(substring(filestring, ap.vmIx, 255),'.log','')
from table
cross apply (select
-- [_] matches a literal underscore; a bare _ is a single-character wildcard in PATINDEX
vmIx = patindex('%[_]VM%', filestring) + 1
) ap
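To address the "how many characters" part directly, here is a self-contained sketch that computes the length instead of stripping the extension afterwards. It assumes every value ends in exactly '.log'; the table variable @files and the alias dotIx are made up for the example:

```sql
-- @files stands in for dbo.Table; rows are the three samples from the question.
DECLARE @files TABLE (fileString nvarchar(250) NOT NULL);
INSERT INTO @files (fileString) VALUES
    (N'_318_CA_DCA_2020_12_11-01_00_01_VM6.log'),
    (N'_319_CA_DCA_2020_12_12-01_VM17.log'),
    (N'_333_KF_DCA01_00_01_VM232.log');

SELECT f.fileString,
       -- length = distance from the start of 'VM' to the dot of '.log'
       part = SUBSTRING(f.fileString, ap.vmIx, ap.dotIx - ap.vmIx)
FROM @files AS f
CROSS APPLY (
    SELECT vmIx  = PATINDEX(N'%[_]VM[0-9]%', f.fileString) + 1, -- first char of VM...
           dotIx = LEN(f.fileString) - 3                        -- position of the '.' in '.log'
) AS ap;
```

For the three sample rows this should yield VM6, VM17 and VM232.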

Issues using multiple parameters in SSRS report (stored procedure)

I've read countless posts on this topic but I can't seem to get any of the recommendations to apply to my particular situation (which isn't different than others...)
I have an SSRS report. Dataset 1 is using a stored procedure and in the where clause I have
and (@param is null or alias.column in
(select Item from dbo.ufnSplit(@param,',')))
I borrowed the dbo.ufnSplit function from this post here: https://stackoverflow.com/a/512300/22194
CREATE FUNCTION [dbo].[ufnSplit]
(@RepParam nvarchar(max), @Delim char(1)= ',')
RETURNS @Values TABLE (Item nvarchar(max))AS
--based on John Sansoms StackOverflow answer:
--https://stackoverflow.com/a/512300/22194
BEGIN
DECLARE @chrind INT
DECLARE @Piece nvarchar(100)
SELECT @chrind = 1
WHILE @chrind > 0
BEGIN
SELECT @chrind = CHARINDEX(@Delim,@RepParam)
IF @chrind > 0
SELECT @Piece = LEFT(@RepParam,@chrind - 1)
ELSE
SELECT @Piece = @RepParam
INSERT @Values(Item) VALUES(@Piece)
SELECT @RepParam = RIGHT(@RepParam,LEN(@RepParam) - @chrind)
IF LEN(@RepParam) = 0 BREAK
END
RETURN
END
In dataset 2 I am getting the values that I want to pass to dataset 1
select distinct list from table
My parameter for @param is configured to look at dataset 2 for available values.
My issue is that if I select a single value from my parameter dropdown for @param, the report works. If I select multiple values from the dropdown, I only get data back for the first value selected.
My values in dataset 2 do not contain any commas.
Did I miss anything or fail to provide enough information? I'm open to criticism, feedback, do's and don'ts for this. I've struggled with this issue for some time, and I'm by no means a SQL expert :)
Cheers,
MD
Update So SQL Profiler is showing me this:
exec sp... @param=N'value1,value2 ,value3 '
Questions are:
1. Shouldn't every value be wrapped in single quotes?
2. What's with the N before the list?
3. Guessing the trailing spaces need to be trimmed out
When you select multiple values from a parameter dropdown list they are stored in an array. In order to convert that to a string that you can pass to SQL you can use the Join function. Go to your dataset properties and then to the Parameters tab. Replace the Parameter Value with this expression:
=Join(Parameters!param.Value, ",")
Now your split function will get one comma separated string like it's supposed to. I would also suggest having the split function trim off spaces from the values after it has separated them.
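One way the trimming could be done without touching the function body is to trim at the call site. This sketch assumes the dbo.ufnSplit function from the question already exists in the database, and reuses the value list seen in the Profiler trace:

```sql
-- Assumes dbo.ufnSplit from the question has been created.
DECLARE @param nvarchar(max) = N'value1,value2 ,value3 ';

-- LTRIM/RTRIM strip stray spaces around each piece after the split.
SELECT LTRIM(RTRIM(s.Item)) AS Item
FROM dbo.ufnSplit(@param, ',') AS s;
```

The same expression can be dropped into the stored procedure's subquery, i.e. select LTRIM(RTRIM(Item)) from dbo.ufnSplit(@param, ',').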
So I figured it out and wanted to post my results here in hopes it helps someone else.
Bad data. One trailing space was blowing up my entire result set, and I didn't notice it until I ran through several scenarios (choosing many combinations of parameters)
My result set had trailing spaces - once I did an rtrim on it I didn't have to do any fancy join/split's in SSRS.

Stored Procedure - loop through results without cursor

Everywhere I look I see that in order to loop through results you have to use a cursor, and in the same post someone says cursors are bad and you shouldn't use them (which has always been my philosophy). But now I am stuck: I need to loop through a result set!
Here's the situation. I need to come up with a list of ProductIDs that have 2 different statuses set to a specific value. I start the stored procedure, run the query that finds my products that meet the criteria.
So, now I have a list of ProductIDs that I need to run through my validation process:
16050
16052
41817
48255
Now, for each of those products (there may be 1, there may be 1000, I don't know), I need to check a whole list of conditions:
Is a specific field = 'SIMPLE'? if so, perform a bunch of other queries and make sure everything is good
If it is not 'SIMPLE' then run a whole other set of queries and make sure that information is all good.
Is another field = 'YES'? if so, perform a bunch of other queries, if it is not, then do other queries.
Is a cursor what I need to use? Is there some other way to do what I need that I just am not seeing?
Thanks,
Leslie
I ended up using a WHILE loop that passes each ProductID into a series of checks!
declare @counter int
declare @productKey varchar(20)
SET @counter = (select COUNT(*) from ##Magento)
-- loop only while rows remain, so an empty table does not spin forever
while (@counter > 0)
begin
SET @productKey = (select top 1 ProductKey from ##Magento)
print @productKey;
delete from ##Magento Where ProductKey = @productKey
SET @counter-=1;
end
go
It's hard to say without knowing the specifics of your process, but one approach is to create a function that performs your logic and call that.
eg:
delete from yourtable
where productid in (select ProductID from FilteredProducts)
and dbo.ShouldBeDeletedFunction(ProductID) = 1
In general, cursors are bad, but there are always exceptions. Try to avoid them by thinking in terms of sets, rather than the attributes of an individual record.
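As a rough illustration of thinking in sets, the branching from the question can often be expressed as CASE expressions evaluated over every product at once. Everything here is hypothetical: the table variable @Products, its columns, and the labels standing in for the real checks:

```sql
-- Hypothetical stand-in for the filtered product list; column names are assumptions.
DECLARE @Products TABLE (
    ProductID   int         NOT NULL PRIMARY KEY,
    ProductType varchar(20) NOT NULL,
    IsActive    varchar(3)  NOT NULL
);
INSERT INTO @Products (ProductID, ProductType, IsActive) VALUES
    (16050, 'SIMPLE', 'YES'),
    (16052, 'BUNDLE', 'NO'),
    (41817, 'SIMPLE', 'NO'),
    (48255, 'BUNDLE', 'YES');

-- One pass over the whole set: each CASE branch stands in for one family of checks.
SELECT p.ProductID,
       CASE WHEN p.ProductType = 'SIMPLE'
            THEN 'simple checks' ELSE 'non-simple checks' END AS TypeCheck,
       CASE WHEN p.IsActive = 'YES'
            THEN 'yes checks' ELSE 'no checks' END AS FlagCheck
FROM @Products AS p;
```

In a real procedure each label would be replaced by a join or EXISTS test against the tables the individual checks read from.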

SQL Server Query: Fast with Literal but Slow with Variable

I have a view that returns 2 ints from a table using a CTE. If I query the view like this it runs in less than a second
SELECT * FROM view1 WHERE ID = 1
However if I query the view like this it takes 4 seconds.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
I've checked the two query plans. The first query performs a clustered index seek on the main table, returning 1 record, and then applies the rest of the view query to that result set, whereas the second query performs an index scan that returns about 3000 records rather than just the one I'm interested in, and only later filters the result set.
Is there anything obvious that I'm missing to get the second query to use the index seek rather than an index scan? I'm using SQL 2008 but anything I do needs to also run on SQL 2005. At first I thought it was some sort of parameter sniffing problem but I get the same results even if I clear the cache.
Probably it is because in the parameter case, the optimizer cannot know that the value is not null, so it needs to create a plan that returns correct results even when it is. If you have SQL Server 2008 SP1 you can try adding OPTION(RECOMPILE) to the query.
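For reference, the hint goes at the end of the statement. A minimal sketch using the view and variable from the question (the dbo schema is an assumption):

```sql
DECLARE @id int = 1;

-- RECOMPILE makes the optimizer build a plan for this execution,
-- at which point the current value of @id is known to it.
SELECT *
FROM dbo.View1
WHERE ID = @id
OPTION (RECOMPILE);
```

The trade-off is a compilation on every execution, which may or may not matter depending on how often the query runs.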
You could add an OPTIMIZE FOR hint to your query, e.g.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id OPTION (OPTIMIZE FOR (@id = 1))
In my case the table column was defined as VarChar while the parameter in the parameterized query was defined as NVarChar. This introduced a CONVERT_IMPLICIT in the actual execution plan to match the data types before comparing, and that was the culprit for the slow performance (2 sec vs 11 sec). Just correcting the parameter type made the parameterized query as fast as the non-parameterized version.
One possible way to do that is to CAST the parameters, as such:
SELECT ...
FROM ...
WHERE name = CAST(:name AS varchar)
Hope this may help someone with similar issue.
I ran into this problem myself with a view that ran < 10ms with a direct assignment (WHERE UtilAcctId=12345), but took over 100 times as long with a variable assignment (WHERE UtilAcctId = @UtilAcctId).
The execution-plan for the latter was no different than if I had run the view on the entire table.
My solution didn't require tons of indexes, optimizer-hints, or a long-statistics-update.
Instead I converted the view into a User-Table-Function where the parameter was the value needed on the WHERE clause. In fact this WHERE clause was nested 3 queries deep and it still worked and it was back to the < 10ms speed.
Eventually I changed the parameter to be a TYPE that is a table of UtilAcctIds (int). Then I can limit the WHERE clause to a list from the table.
WHERE UtilAcctId = [parameter-List].UtilAcctId.
This works even better. I think the user-table-functions are pre-compiled.
When SQL starts to optimize the query plan for the query with the variable it will match the available index against the column. In this case there was an index so SQL figured it would just scan the index looking for the value. When SQL made the plan for the query with the column and a literal value it could look at the statistics and the value to decide if it should scan the index or if a seek would be correct.
Using the optimize hint with a value tells SQL that “this is the value which will be used most of the time so optimize for this value” and a plan is stored as if this literal value was used. Using the optimize hint with the sub-hint UNKNOWN tells SQL you do not know what the value will be, so SQL looks at the statistics for the column, decides whether a seek or a scan will be best, and makes the plan accordingly.
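The UNKNOWN form, sketched against the view from the question; the dbo schema is assumed, and this variant needs SQL Server 2008 or later, so it would not cover the asker's SQL 2005 requirement:

```sql
DECLARE @id int = 1;

-- Plan is built from column statistics rather than from a specific value of @id.
SELECT *
FROM dbo.View1
WHERE ID = @id
OPTION (OPTIMIZE FOR (@id UNKNOWN));
```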
I know this is long since answered, but I came across this same issue and have a fairly simple solution that doesn't require hints, statistics-updates, additional indexes, forcing plans etc.
Based on the comment above that "the optimizer cannot know that the value is not null", I decided to move the values from a variable into a table:
Original Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
SELECT * FROM ...
WHERE
C.CreateDtTm >= @StartTime
AND C.CreateDtTm < @EndTime
New Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
CREATE TABLE #Times (StartTime datetime2(0) NOT NULL, EndTime datetime2(0) NOT NULL)
INSERT INTO #Times(StartTime, EndTime) VALUES(@StartTime, @EndTime)
SELECT * FROM ...
WHERE
C.CreateDtTm >= (SELECT MAX(StartTime) FROM #Times)
AND C.CreateDtTm < (SELECT MAX(EndTime) FROM #Times)
This performed instantly as opposed to several minutes for the original code (obviously your results may vary).
I assume if I changed my data type in my main table to be NOT NULL, it would work as well, but I was not able to test this at this time due to system constraints.
Came across this same issue myself and it turned out to be a missing index involving a (left) join on the result of a subquery.
select *
from foo A
left outer join (
select x, count(*)
from bar
group by x
) B on A.x = B.x
Added an index named bar_x for bar.x
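The index described above would be created roughly like this; the dbo schema and the choice of a nonclustered index are assumptions, since the answer gives only the name and column:

```sql
-- Supports the GROUP BY x in the subquery and the join on A.x = B.x.
CREATE NONCLUSTERED INDEX bar_x ON dbo.bar (x);
```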
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
Do this
DECLARE @sql varchar(max)
SET @sql = 'SELECT * FROM View1 WHERE ID =' + CAST(@id as varchar)
EXEC (@sql)
Solves your problem

How to compare two column values which are comma separated values?

I have one table with specific columns, in that there is a column which contains comma separated values like test,exam,result,other.
I will pass a string like result,sample,unknown,extras as a parameter to the stored procedure. and then I want to get the related records by checking each and every phrase in this string.
For Example:
TableA
ID Name Words
1 samson test,exam,result,other
2 john sample,no query
3 smith tester,SE
Now I want to search for result,sample,unknown,extras
Then the result should be
ID Name Words
1 samson test,exam,result,other
2 john sample,no query
because in the first record result matched and in the second record sample matched.
That's not a great design, you know. Better to split Words off into a separate table (id, word).
That said, this should do the trick:
set nocount on
declare @words varchar(max) = 'result,sample,unknown,extras'
declare @split table (word varchar(64))
declare @word varchar(64), @start int, @end int, @stop int
-- string split in 8 lines
select @words += ',', @start = 1, @stop = len(@words)+1
while @start < @stop begin
select
@end = charindex(',',@words,@start)
, @word = rtrim(ltrim(substring(@words,@start,@end-@start)))
, @start = @end+1
insert @split values (@word)
end
select * from TableA a
where exists (
select * from @split w
where charindex(','+w.word+',',','+a.words+',') > 0
)
)
May I burn in DBA hell for providing you this!
Edit: replaced STUFF w/ SUBSTRING slicing, an order of magnitude faster on long lists.
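The separate (id, word) table suggested at the start of this answer could look roughly like the sketch below. The name dbo.TableAWords and the key layout are assumptions, and TableA is taken to have the ID and Name columns shown in the question:

```sql
-- Hypothetical child table: one row per (TableA.ID, word).
CREATE TABLE dbo.TableAWords (
    id   int         NOT NULL,
    word varchar(64) NOT NULL,
    CONSTRAINT PK_TableAWords PRIMARY KEY (word, id)
);

-- Rows of TableA that have at least one of the searched words.
SELECT a.ID, a.Name
FROM dbo.TableA AS a
WHERE EXISTS (
    SELECT 1
    FROM dbo.TableAWords AS w
    WHERE w.id = a.ID
      AND w.word IN ('result', 'sample', 'unknown', 'extras')
);
```

With the key leading on word, each searched phrase becomes an index seek instead of a CHARINDEX scan over every Words string.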
Personally I think you'd want to look at your application/architecture and think carefully about whether you really want to do this in the database or the application. If it isn't appropriate or not an option then you'll need to create a custom function. The code in the article here should be easy enough to modify to do what you want:
Quick T-Sql to parse a delimited string (also look at the code in the comments)
Like the others have already said -- what you have there is a bad design. Consider using proper relations to represent these things.
That being said, here's a detailed article about how to do this using SQL Server:
http://www.sommarskog.se/arrays-in-sql-2005.html
One thing no one has covered so far, because it's often a very bad idea -- but then, you are already working with a bad idea, and sometimes two wrongs make a right -- is to extract all rows that match ANY of your strings (using LIKE or some such) and doing the intersection yourself, client-side. If your strings are fairly rare and highly correlated, this may work pretty well; it will be god-awful in most other cases.
