SQL code inside SQL Scalar-value function - want to generalize - optimize - sql-server

I got this code, I would like to optimize.
I basically can add new columns to "Disp" table later on, and I don't want to come back modify this function.
I cannot use dynamic SQL. Right? Is there anything else that would work in my case?
This is the function:
ALTER FUNCTION [GetDate]
(#hdrnumber INT, #DateColName VARCHAR(50))
RETURNS DATETIME
AS
BEGIN
DECLARE #dt DATETIME
SELECT #dt = CASE
WHEN #DateColName = 'ord_bookdate' THEN [ord_bookdate]
WHEN #DateColName = 'ord_startdate' THEN [ord_startdate]
WHEN #DateColName = 'ord_completiondate' THEN [ord_completiondate]
WHEN #DateColName = 'pack_date_from' THEN [pack_date_from]
WHEN #DateColName = 'pack_date_to' THEN [pack_date_to]
END
FROM [Disp]
WHERE [hdrnumber] = #hdrnumber
RETURN #dt
END
(removed some of the code, because it's a long one, but hopefully what I left in here will make sense to you guys)
how do i use this function?
well it basically looks like this:
insert into tablename (...)
select somedate, [GetDate](somedate, somecolumn)
from sometable
where 1 = 1

Certainly agree with comments provided in previous two answers.
Anyways, you could write following function to get Column names of a given table and then
compare the column names to #DatecolumName to return the value from it..
Create
function [dbo].[ftTableSchema](#TableName varchar(100)) returns table as
return
--Declare #tableName varchar(30); select #TABLENAME='excelInBom'
SELECT ColumnName=Column_Name
,DataType= case data_type
When 'DECIMAL' then 'DECIMAL('+convert(varchar,Numeric_precision)+','+Convert(varchar,Numeric_scale)+')'
When 'NUMERIC' then 'DECIMAL('+convert(varchar,Numeric_precision)+','+Convert(varchar,Numeric_scale)+')'
when 'VARCHAR' then 'VARCHAR('+Convert(varchar,Character_maximum_length)+')'
WHEN 'CHAR' THEN 'CHAR('+Convert(varchar,Character_maximum_length)+')'
ELSE data_type
end
,ColumnOrder=Ordinal_Position,* FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=#tableName

You either need to use dynamic sql which you can't do within a user-defined function (so you'd need to remove it out of the function), or like you are doing with a CASE statement (which isn't fully generic).
I think I'd personally go with the CASE approach to start with and consider dynamic sql if proves to give better performance (maybe with a larger number of different possible fields).
You would have some maintenance work to do to keep the CASE up to date, but I can't imagine more fields would be added that often?

I think you have a bigger design issue here. How in the world are you using this? If you need to specify the column as a string (??!) then dynamic SQL will be >>SO<< much faster in many key situations. Something like this:
declare #sql NVARCHAR(MAX);
set #sql = 'SELECT ' + #DateColName + ' FROM disp WHERE hdrnumber = #hdrnumber';
EXEC (#sql, #hdrnumber);
(just watch out that someone cant SQL inject into #DateColName).
But a bigger issue is the design smell that's all over this code. The fact that this function exists worries me about how you could possibly be using it. Take a bigger look, and maybe post another, more detailed question about your design in general and I think you could have even more helpful answers.

Related

SQL SUBSTRING & PATINDEX of varying lengths

SQL Server 2017.
Given the following 3 records with field of type nvarchar(250) called fileString:
_318_CA_DCA_2020_12_11-01_00_01_VM6.log
_319_CA_DCA_2020_12_12-01_VM17.log
_333_KF_DCA01_00_01_VM232.log
I would want to return:
VM6
VM17
VM232
Attempted thus far with:
SELECT
SUBSTRING(fileString, PATINDEX('%VM[0-9]%', fileString), 3)
FROM dbo.Table
But of course that only returns VM and 1 number.
How would I define the parameter for number of characters when it varies?
EDIT: to pre-emptively answer a question that may come up, yes, the VM pattern will always be proceeded immediately by .log and nothing else. But even if I took that approach and worked backwards, I still don't understand how to define the number of characters to take when the number varies.
here is one way :
DECLARE #test TABLE( fileString varchar(500))
INSERT INTO #test VALUES
('_318_CA_DCA_2020_12_11-01_00_01_VM6.log')
,('_319_CA_DCA_2020_12_12-01_00_01_VM17.log')
,('_333_KF_DCA_2020_12_15-01_00_01_VM232.log')
-- 5 is the length of file extension + 1 which is always the same size '.log'
SELECT
REVERSE(SUBSTRING(REVERSE(fileString),5,CHARINDEX('_',REVERSE(fileString))-5))
FROM #test AS t
This will dynamically grab the length and location of the last _ and remove the .log.
It is not the most efficient, if you are able to write a CLR function usnig C# and import it into SQL, that will be much more efficient. Or you can use this as starting point and tweak it as needed.
You can remove the variable and replace it with your table like below
DECLARE #TESTVariable as varchar(500)
Set #TESTVariable = '_318_CA_DCA_2020_12_11-01_00_01_VM6adf.log'
SELECT REPLACE(SUBSTRING(#TESTVariable, PATINDEX('%VM[0-9]%', #TESTVariable), PATINDEX('%[_]%', REVERSE(#TESTVariable))), '.log', '')
select *,
part = REPLACE(SUBSTRING(filestring, PATINDEX('%VM[0-9]%', filestring), PATINDEX('%[_]%', REVERSE(filestring))), '.log', '')
from table
Your lengths are consistent at the beginning. So get away from patindex and use substring to crop out the beginning. Then just replace the '.log' with an empty string at the end.
select *,
part = replace(substring(filestring,33,255),'.log','')
from table;
Edit:
Okay, from your edit you show differing prefix portions. Then patindex is in fact correct. Here's my solution, which is not better or worse than the other answers but differs with respect to the fact that it avoids reverse and delegates the patindex computation to a cross apply section. You may find it a bit more readable.
select filestring,
part = replace(substring(filestring, ap.vmIx, 255),'.log','')
from table
cross apply (select
vmIx = patindex('%_vm%', filestring) + 1
) ap

Stored Procedure - loop through results without cursor

Everywhere I look I see that in order to loop through results you have to use a cursor and in the same post someone saying cursors are bad don't use them (which has always been my philosophy) but now I am stuck. I need to loop through a result set!
Here's the situation. I need to come up with a list of ProductIDs that have 2 different statuses set to a specific value. I start the stored procedure, run the query that finds my products that meet the criteria.
So, now I have a list of ProductIDs that I need to run through my validation process:
16050
16052
41817
48255
Now I need for each of those products (there may be 1 there may be 1000, i don't know) to check a whole list of conditions:
Is a specific field = 'SIMPLE'? if so, perform a bunch of other queries and make sure everything is good
If it is not 'SIMPLE' then run a whole other set of queries and make sure that information is all good.
Is another field = 'YES'? if so, perform a bunch of other queries, if it is not, then do other queries.
Is a cursor what I need to use? Is there some other way to do what I need that I just am not seeing?
Thanks,
Leslie
I ended up using a WHILE loop that I can pass each ProductID into a series of checks!!
declare #counter int
declare #productKey varchar(20)
SET #counter = (select COUNT(*) from ##Magento)
while (1=1)
begin
SET #productKey = (select top 1 ProductKey from ##Magento)
print #productKey;
delete from ##Magento Where ProductKey = #productKey
SET #counter-=1;
IF (#counter=0) BREAK;
end
go
It's hard to say without knowing the specifics of your process, but one approach is to create a function that performs your logic and call that.
eg:
delete from yourtable
where productid in (select ProductID from FilteredProducts)
and dbo.ShouldBeDeletedFunction(ProductID) = 1
In general, cursors are bad, but there are always exceptions. Try to avoid them by thinking in terms of sets, rather than the attributes of an individual record.

TSQL Variable With List of Values for IN Clause

I want to use a clause along the lines of "CASE WHEN ... THEN 1 ELSE 0 END" in a select statement. The tricky part is that I need it to work with "value IN #List".
If I hard code the list it works fine - and it performs well:
SELECT
CASE WHEN t.column_a IN ( 'value a', 'value b' ) THEN 1 ELSE 0 END AS priority
, t.column_b
, t.column_c
FROM
table AS t
ORDER BY
priority DESC
What I would like to do is:
-- #AvailableValues would be a list (array) of strings.
DECLARE
#AvailableValues ???
SELECT
#AvailableValues = ???
FROM
lookup_table
SELECT
CASE WHEN t.column_a IN #AvailableValues THEN 1 ELSE 0 END AS priority
, t.column_b
, t.column_c
FROM
table AS t
ORDER BY
priority DESC
Unfortunately, it seems that SQL Server doesn't do this - you can't use a variable with an IN clause. So this leaves me with some other options:
Make '#AvailableValues' a comma-delimited string and use a LIKE statement. This does not perform well.
Use an inline SELECT statement against 'lookup_table' in place of the variable. Again, doesn't perform well (I think) because it has to lookup the table on each row.
Write a function wrapping around the SELECT statement in place of the variable. I haven't tried this yet (will try it now) but it seems that it will have the same problem as a direct SELECT statement.
???
Are there any other options? Performance is very important for the query - it has to be really fast as it feeds a real-time search result page (i.e. no caching) for a web site.
Are there any other options here? Is there a way to improve the performance of one of the above options to get good performance?
Thanks in advance for any help given!
UPDATE: I should have mentioned that the 'lookup_table' in the example above is already a table variable. I've also updated the sample queries to better demonstrate how I'm using the clause.
UPDATE II: It occurred to me that the IN clause is operating off an NVARCHAR/NCHAR field (due to historical table design reasons). If I was to make changes that dealt with integer fields (i.e through PK/FK relationship constraints) could this have much impact on performance?
You can use a variable in an IN clause, but not in the way you're trying to do. For instance, you could do this:
declare #i int
declare #j int
select #i = 10, #j = 20
select * from YourTable where SomeColumn IN (#i, #j)
The key is that the variables cannot represent more than one value.
To answer your question, use the inline select. As long as you don't reference an outer value in the query (which could change the results on a per-row basis), the engine will not repeatedly select the same data from the table.
Based on your update and assuming the lookup table is small, I suggest trying something like the following:
DECLARE #MyLookup table
(SomeValue nvarchar(100) not null)
SELECT
case when ml.SomeValue is not null then 1 else 0 end AS Priority
,t.column_b
,t.column_c
from MyTable t
left outer join #MyLookup ml
on ml.SomeValue = t.column_a
order by case when ml.SomeValue is not null then 1 else 0 end desc
(You can't reference the column alias "Priority" in the ORDER BY clause. Alternatively, you could use the ordinal position like so:
order by 1 desc
but that's generally not recommended.)
As long as the lookup table is small , this really should run fairly quickly -- but your comment implies that it's a pretty big table, and that could slow down performance.
As for n[Var]char vs. int, yes, integers would be faster, if only because the CPU has fewer bytes to juggle around... which shoud only be a problem when processing a lot of rows, so it might be worth trying.
I solved this problem by using a CHARINDEX function. I wanted to pass the string in as a single parameter. I created a string with leading and trailing commas for each value I wanted to test for. Then I concatenated a leading and trailing commas to the string I wanted to see if was "in" the parameter. At the end I checked for CHARINDEX > 0
DECLARE #CTSPST_Profit_Centers VARCHAR (256)
SELECT #CTSPST_Profit_Centers = ',CS5000U37Y,CS5000U48B,CS5000V68A,CS5000V69A,CS500IV69A,CS5000V70S,CS5000V79B,CS500IV79B,'
SELECT
CASE
WHEN CHARINDEX(','+ISMAT.PROFIT_CENTER+',' ,#CTSPST_Profit_Centers) > 0 THEN 'CTSPST'
ELSE ISMAT.DESIGN_ID + ' 1 CPG'
END AS DESIGN_ID
You can also do it in the where clause
WHERE CHARINDEX(','+ISMAT.PROFIT_CENTER+',',#CTSPST_Profit_Centers) > 0
If you were trying to compare numbers you'd need to convert the number to a text string for the CHARINDEX function to work.
This might be along the lines of what you need.
Note that this assumes that you have permissions and the input data has been sanitized.
From Running Dynamic Stored Procedures
CREATE PROCEDURE MyProc (#WHEREClause varchar(255))
AS
-- Create a variable #SQLStatement
DECLARE #SQLStatement varchar(255)
-- Enter the dynamic SQL statement into the
-- variable #SQLStatement
SELECT #SQLStatement = "SELECT * FROM TableName WHERE " + #WHEREClause
-- Execute the SQL statement
EXEC(#SQLStatement)

Performance of stored proc when updating columns selectively based on parameters?

I'm trying to figure out if this is relatively well-performing T-SQL (this is SQL Server 2008). I need to create a stored procedure that updates a table. The proc accepts as many parameters as there are columns in the table, and with the exception of the PK column, they all default to NULL. The body of the procedure looks like this:
CREATE PROCEDURE proc_repo_update
#object_id bigint
,#object_name varchar(50) = NULL
,#object_type char(2) = NULL
,#object_weight int = NULL
,#owner_id int = NULL
-- ...etc
AS
BEGIN
update
object_repo
set
object_name = ISNULL(#object_name, object_name)
,object_type = ISNULL(#object_type, object_type)
,object_weight = ISNULL(#object_weight, object_weight)
,owner_id = ISNULL(#owner_id, owner_id)
-- ...etc
where
object_id = #object_id
return ##ROWCOUNT
END
So basically:
Update a column only if its corresponding parameter was provided, and leave the rest alone.
This works well enough, but as the ISNULL call will return the value of the column if the received parameter was null, will SQL Server optimize this somehow? This might be a performance bottleneck on the application where the table might be updated heavily (insertion will be uncommon so the performance there is not a problem). So I'm trying to figure out what's the best way to do this. Is there a way to condition the column expressions with something like CASE WHEN or something? The table will be indexed up the wazoo as well for read performance. Is this the best approach? My alternative at this point is to create the UPDATE expression in code (e.g. inline SQL) and execute it against the server. This would solve my doubts about performance, but I'd rather leave this in a stored proc if possible.
Take a look at Hugo Kornelis' blog post at http://sqlblog.com/blogs/hugo_kornelis/archive/2007/09/30/what-if-null-if-null-is-null-null-null-is-null.aspx. Scoll down a bit to the discussion on COALESCE vs. ISNULL. If portability is a future consideration, look at COALESCE.
However, from a performance perspective, take a look at Adam's performance-centric blog post at http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/performance-isnull-vs-coalesce.aspx. ISNULL is the speedier.
Your choice...
BTW, I have a bunch of SP's that are just like your example and have no performance issues using ISNULL. (Being a bit lazy, I like to type 6 vs. 8 chars and being a littel prone to finger-dyslexia, ISNULL is much easier to type :-) )
ISNULL is the fastest way- the only way you'll improve is if you pass in NULL or the actual value, and do the ISNULL in the application.

T-SQL is it possible to use a variable in a select statement to specify the database

Is there a way to do something like this without converting the sql to a string and calling exec
DECLARE #source_database varvhar(200)
SELECT #source_database = 'wibble'
SELECT * FROM SELECT #source_database.dbo.mytable
No. I'm afraid not.
It is necessary to use dynamic sql in order to use a variable for either a database or column name.
Only for stored procs without using linked server or dynamic SQL
DECLARE #myProc varchar(200)
SELECT #myProc = 'wibble.dbo.foobar'
EXEC #myProc
There is another (not necessarily pretty) alternative:
IF (#source_database = 'wibble')
USE wibble;
ELSE IF (#source_database = 'wibble2')
USE wibble2;
ELSE
RAISERROR(....)
SELECT * FROM dbo.myTable
If you have any real number of databases, this may be tiresome. But it's an option nonetheless.

Resources