I remember reading a while back that SQL Server can randomly slow down, or take a stupidly long time, to execute a stored procedure when it is written like this:
CREATE PROCEDURE spMyExampleProc
(
    @myParameter INT
)
AS
BEGIN
    SELECT something FROM myTable WHERE myColumn = @myParameter
END
The way to fix this is to do the following:
CREATE PROCEDURE spMyExampleProc
(
    @myParameter INT
)
AS
BEGIN
    DECLARE @newParameter INT
    SET @newParameter = @myParameter
    SELECT something FROM myTable WHERE myColumn = @newParameter
END
Now my question is: firstly, is it bad practice to follow the second example for all my stored procedures? This seems like a bug that could be easily prevented with little work, but would there be any drawbacks to doing this, and if so, why?
When I read about this, the problem was that the same proc would take varying times to execute depending on the value in the parameter. If anyone can tell me what this problem is called and why it occurs, I would be really grateful; I can't seem to find the link to the post anywhere, and it seems like a problem that could occur for our company.
The problem is "parameter sniffing" (SO Search)
The pattern with #newParameter is called "parameter masking" (also SO Search)
You could always use this masking pattern, but it isn't always needed. For example, a simple select by unique key, with no child tables or other filters, should behave as expected every time.
Since SQL Server 2008, you can also use OPTIMIZE FOR UNKNOWN (SO). Also see Alternative to using local variables in a where clause and Experience with when to use OPTIMIZE FOR UNKNOWN.
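For example, applied to the proc from the question (a sketch, reusing the same made-up table and column names):

CREATE PROCEDURE spMyExampleProc
(
    @myParameter INT
)
AS
BEGIN
    -- The hint tells the optimizer to build a plan for an "average" value
    -- instead of sniffing the specific value passed on first compilation.
    SELECT something
    FROM myTable
    WHERE myColumn = @myParameter
    OPTION (OPTIMIZE FOR UNKNOWN)
END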
Related
I'm facing a quite annoying barrier enforced by SQL Server and would like to check if there is an elegant solution for this.
I have a sequence of procedures' invocations (meaning, A calls B which calls C). The procedures are due to return different results sets, where (for instance) "A" generates its result using a set of records returned by "B".
Now, SQL Server does not allow nested INSERT INTO ... EXEC <stored procedure>, so to cope with this limitation I converted the lowest procedure into a function that returns a table, and hence INSERT INTO ... SELECT * FROM <function call>.
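A minimal sketch of what this looks like (all object names here are illustrative):

-- Lowest-level logic, converted from a procedure into a table-valued function
CREATE FUNCTION dbo.fnGetRowsC (@someId INT)
RETURNS TABLE
AS
RETURN (SELECT col1, col2 FROM dbo.SomeTable WHERE parent_id = @someId);
GO

-- The calling procedure can now capture the rows without INSERT ... EXEC
CREATE PROCEDURE dbo.procB @someId INT
AS
BEGIN
    DECLARE @rows TABLE (col1 INT, col2 INT);
    INSERT INTO @rows (col1, col2)
    SELECT col1, col2 FROM dbo.fnGetRowsC(@someId);
    -- ...build B's own result set from @rows...
    SELECT col1, col2 FROM @rows;
END;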
Now, there are situations in which the FUNCTION cannot return a result due to conditions of the data, and I would like the function to return a sort of code indicating the result of the execution (e.g. 0 would mean success, 1 would mean "missing input data").
Since SQL Server does not allow functions with OUTPUT parameters, I can't think of any elegant way of conveying these two outputs.
Can anyone suggest an elegant alternative?
there are situations in which the FUNCTION cannot return a result due to conditions of the data, and I would like the function to return a sort of code indicating the result of the execution
You really should use THROW to signal the result of execution, and since THROW is not allowed inside a function, that also precludes using a table-valued function.
So you need to use a stored procedure. To avoid the restriction on nested INSERT ... EXEC, you can use temporary tables to pass data back to the calling procedure, e.g.:
create or alter procedure foo
as
begin
    if object_id('tempdb..#foo_results') is null
    begin
        print 'create table #foo_results(id int primary key, a int);';
        THROW 51000, 'The results table #foo_results does not exist. Create it before calling this procedure.', 1;
    end;
    insert into #foo_results(id, a)
    values (1, 1);
end;
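The caller owns the temp table, so the calling side looks like this:

create table #foo_results(id int primary key, a int);
exec foo;                        -- foo inserts into the caller's #foo_results
select id, a from #foo_results;
drop table #foo_results;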
Can anyone suggest an ELEGANT alternative?
I'm not sure any of the alternatives is elegant.
SQL Server has a feature called Deferred Name Resolution; read here for the details:
https://msdn.microsoft.com/en-us/library/ms190686(v=sql.105).aspx
That page only talks about stored procedures, so it seems Deferred Name Resolution works for stored procedures but not for functions. I did some testing.
create or alter function f2(@i int)
returns table
as
return (select fff from xxx)
go
Note the table xxx does not exist. When I execute the above CREATE statement, I got the following message:
Msg 208, Level 16, State 1, Procedure f2, Line 4 [Batch Start Line 22]
Invalid object name 'xxx'.
It seems that SQL Server immediately resolved the non-existent table xxx, which would prove that Deferred Name Resolution doesn't work for functions. However, when I slightly change it as follows:
create or alter function f1(@i int)
returns int
as
begin
    declare @x int;
    select @x = fff from xxx;
    return @x
end
go
I can successfully execute it:
Commands completed successfully.
When executing the following statement:
select dbo.f1(3)
I got this error:
Msg 208, Level 16, State 1, Line 34
Invalid object name 'xxx'.
So here it seems the resolution of the table xxx was deferred. The most important difference between these two cases is the return type. However, I can't explain when Deferred Name Resolution will work for functions and when it won't. Can anyone help me understand this? Thanks in advance.
It feels like you were looking for an understanding of why your particular example didn't work. Quassnoi's answer is correct but didn't offer a reason, so I went searching and found this MSDN Social answer by Erland Sommarskog. The interesting part:
However, it does not extend to views and inline-table functions. For stored procedures and scalar functions, all SQL Server stores in the database is the text of the module. But for views and inline-table functions (which are parameterised views by another name) SQL Server stores metadata about the columns etc. And that is not possible if the table is missing.
Hope that helps with understanding why :-)
EDIT:
I did take some time to confirm Quassnoi's comment that sys.columns, as well as several other system views, does contain some metadata about the inline function, so I am unsure whether there is other metadata that is not written. However, I thought I would add a few other notes I was able to find that may help explain things in conjunction.
First a quote from Wayne Sheffield's blog:
In the MTVF, you see only an operation called “Table Valued Function”. Everything that it is doing is essentially a black box – something is happening, and data gets returned. For MTVFs, SQL can’t “see” what it is that the MTVF is doing since it is being run in a separate context. What this means is that SQL has to run the MTVF as it is written, without being able to make any optimizations in the query plan to optimize it.
Then from the SQL Server 2016 Exam 70-761 by Itzik Ben-Gan (Skill 3.1):
The reason that it's called an inline function is because SQL Server inlines, or expands, the inner query definition, and constructs an internal query directly against the underlying tables.
So it seems the inline function essentially returns a query and is able to optimize it with the outer query, not allowing the black-box approach and thus not allowing deferred name resolution.
What you have in your first example is an inline function (it does not have BEGIN/END).
Inline functions can only be table-valued.
If you used a multi-statement table-valued function for your first example, like this:
CREATE OR ALTER FUNCTION fn_test(@a INT)
RETURNS @ret TABLE
(
    a INT
)
AS
BEGIN
    INSERT INTO @ret
    SELECT a
    FROM xxx
    RETURN
END
, it would compile fine and fail at runtime (if xxx did not exist), just as a stored procedure or a scalar UDF would.
So yes, DNR does work for all multi-statement functions (those with BEGIN/END), regardless of their return type.
I'm trying to create a wrapper in T-SQL for a procedure where I'm not sure what the data types are. I can run the wrapper without an INSERT INTO statement and I get the data just fine, but I need to have it in a table.
Whenever I use the INSERT INTO I get an error:
Column name or number of supplied values does not match table definition
I've parsed back through my code and can't see where any column names don't match up, so I'm thinking that it has to be a data type. I've looked through the procedure I'm wrapping to see if I can find what the data types are, but some aren't defined there; I've referenced the tables they pull some data from to find the definitions; I've run SQL_VARIANT_PROPERTY on all of the data to see what data type it is (although some of them come up null).
Is there some better way for me to track down exactly where the error is?
I think you can find out your stored procedure's result schema using sp_describe_first_result_set (available from SQL Server 2012) and FMTONLY. Something like this:
EXEC sp_describe_first_result_set
    @tsql = N'SET FMTONLY OFF; EXEC yourProcedure <params are embedded here>'
More details can be found here.
However, if I remember correctly, this works only if your procedure uses deterministic schemas (no SELECT INTO #tempTable or similar things).
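A self-contained sketch against a hypothetical procedure (dbo.GetOrders and its @customerId parameter are made up for illustration):

EXEC sp_describe_first_result_set
    @tsql   = N'EXEC dbo.GetOrders @customerId = @p1',
    @params = N'@p1 INT';
-- Returns one row per output column: name, system_type_name, is_nullable, etc.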
One trick to find out the schema of your result is to actually materialize the result into an ad-hoc created table. However, this is not easy, since SELECT INTO does not work with EXEC procedure. One work-around is this:
1) Define a linked-server to the instance itself. E.g. loopback
2) Execute your procedure like this (for SQL 2008R2):
'SELECT * INTO tempTableToHoldDataAndStructure
FROM OPENQUERY(' + @LoopBackServerName + ', ''set fmtonly off exec ' + @ProcedureFullName + ' ' + @ParamsStr + ''')'
where
@LoopBackServerName = 'loopback'
@ProcedureFullName = loopback.database.schema.procedure_name
@ParamsStr = embedded parameters
For SQL 2012, I think the execution might fail if RESULT SETS are not provided (i.e. a schema definition of the expected result, which is a kind of chicken-and-egg problem in this case):
' WITH RESULT SETS (( ' + @ResultSetStr + '))'');
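Assembled end to end, a hedged sketch of the dynamic SQL (the variable values are illustrative, and the loopback linked server must have data access enabled):

DECLARE @LoopBackServerName sysname       = N'loopback';
DECLARE @ProcedureFullName  nvarchar(300) = N'mydb.dbo.myProcedure';  -- hypothetical name
DECLARE @ParamsStr          nvarchar(100) = N'42';                    -- embedded parameter values
DECLARE @sql                nvarchar(max);

SET @sql = N'SELECT * INTO tempTableToHoldDataAndStructure
FROM OPENQUERY(' + @LoopBackServerName + N', ''set fmtonly off exec '
    + @ProcedureFullName + N' ' + @ParamsStr + N''')';

EXEC (@sql);
-- tempTableToHoldDataAndStructure now holds both the data and the inferred schema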
Okay, I have a solution to my problem. It's tedious, but tedious I can do; random guessing is what drives me crazy. The procedure I'm wrapping dumps 51 columns, and I already know I can get it to work without putting anything into a table. So I commented out part of the select statement in the procedure I'm wrapping so it only selects one column. (First I made a copy of that procedure so I wouldn't screw up the original; then I referenced the copy from my wrapper.) Saved both, ran it, and it worked. So far so good. I could have done it line by line, but I'm more of a binary-search kind of guy, so I went about halfway down, including about 25 columns in both the select statement and my table, and it still worked. Repeat until it doesn't work any more, then backtrack until it does again. My error was in identifying one of the data types followed by "IDENTITY". I'm not sure what will happen when I leave that out, but at least my wrapper works.
I have a relatively simple stored procedure that runs an insert and then attempts to return the last inserted ID, so I can get the ID via SCOPE_IDENTITY(). This was working great for me, but then I got reports that on some machines the stored proc would cause duplicate results.
After investigating it, I found that the cause was the ReturnsRecords property. When true, it will run the query twice! For a SELECT, who cares; in this case, though, it was causing duplicates in my database.
Setting ReturnsRecords to false gets rid of the problem, but then it defeats the purpose of the stored proc (I absolutely must get the proper last inserted ID for the record)!
My question is simply this: How would I go about inserting this record and getting the ID of the new record, while getting around this problem?
Additional Info:
- I am currently using DAO.
- I have tried the ADO Command method, but it is very error-prone and doesn't seem to work with output parameters for me.
- I am using the stored proc solely for the purpose of retaining scope. I do not have my heart set on using a stored proc; I simply need a reliable way to get the ID of the last inserted row.
- This is an ACCDB.
- This is happening in Access 2007.
- My DB backend is MS SQL Server 2008.
Any help or insight is appreciated.
One of the parameters in the procedure can be declared as OUTPUT. Still don't return any rows, but set the value of that parameter to SCOPE_IDENTITY():
create proc ReturnTheNewID
    @NewValue int
    , @ReturnNewID int output
as
set nocount on
insert ....
set @ReturnNewID = Scope_identity()
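From T-SQL the call looks like this; from Access, bind @ReturnNewID as an output parameter the same way:

DECLARE @NewID int;
EXEC ReturnTheNewID @NewValue = 42, @ReturnNewID = @NewID OUTPUT;
SELECT @NewID AS NewID;   -- the SCOPE_IDENTITY() captured inside the proc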
I'm trying to figure out if this is relatively well-performing T-SQL (this is SQL Server 2008). I need to create a stored procedure that updates a table. The proc accepts as many parameters as there are columns in the table, and with the exception of the PK column, they all default to NULL. The body of the procedure looks like this:
CREATE PROCEDURE proc_repo_update
    @object_id bigint
    ,@object_name varchar(50) = NULL
    ,@object_type char(2) = NULL
    ,@object_weight int = NULL
    ,@owner_id int = NULL
    -- ...etc
AS
BEGIN
    update
        object_repo
    set
        object_name = ISNULL(@object_name, object_name)
        ,object_type = ISNULL(@object_type, object_type)
        ,object_weight = ISNULL(@object_weight, object_weight)
        ,owner_id = ISNULL(@owner_id, owner_id)
        -- ...etc
    where
        object_id = @object_id
    return @@ROWCOUNT
END
So basically:
Update a column only if its corresponding parameter was provided, and leave the rest alone.
This works well enough, but since ISNULL returns the value of the column when the parameter is null, will SQL Server optimize this somehow? This might be a performance bottleneck in the application, since the table might be updated heavily (insertion will be uncommon, so performance there is not a problem). So I'm trying to figure out the best way to do this. Is there a way to condition the column expressions with something like CASE WHEN? The table will be indexed up the wazoo as well for read performance. Is this the best approach? My alternative at this point is to build the UPDATE statement in code (i.e. inline SQL) and execute it against the server. That would solve my doubts about performance, but I'd rather keep this in a stored proc if possible.
Take a look at Hugo Kornelis' blog post at http://sqlblog.com/blogs/hugo_kornelis/archive/2007/09/30/what-if-null-if-null-is-null-null-null-is-null.aspx. Scroll down a bit to the discussion on COALESCE vs. ISNULL. If portability is a future consideration, look at COALESCE.
However, from a performance perspective, take a look at Adam Machanic's performance-centric blog post at http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/performance-isnull-vs-coalesce.aspx. ISNULL is the speedier of the two.
Your choice...
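For reference, the portable form is the same statement with COALESCE swapped in (one caveat: COALESCE returns the highest-precedence type among its arguments, while ISNULL returns the type of its first argument):

update object_repo
set
    object_name = COALESCE(@object_name, object_name)
    ,object_type = COALESCE(@object_type, object_type)
    ,object_weight = COALESCE(@object_weight, object_weight)
    ,owner_id = COALESCE(@owner_id, owner_id)
where
    object_id = @object_id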
BTW, I have a bunch of SPs that are just like your example and have no performance issues using ISNULL. (Being a bit lazy, I like typing 6 characters vs. 8, and being a little prone to finger dyslexia, ISNULL is much easier to type. :-))
ISNULL is the fastest way; the only way you'll improve on it is to pass in either NULL or the actual current value and do the ISNULL logic in the application.