SQL Server: Table-valued Functions vs. Stored Procedures - sql-server

I have been doing a lot of reading up on execution plans and the problems of dynamic parameters in stored procedures. I know the suggested solutions for this.
My question, though: everything I have read indicates that SQL Server caches the execution plan for stored procedures. No mention is made of table-valued functions. I assume it does so for views (out of interest).
Does it recompile each time a table-valued function is called?
When is it best to use a table-valued function as opposed to a stored procedure?

An inline table valued function (TVF) is like a macro: it's expanded into the outer query. It has no plan as such: the calling SQL has a plan.
A multi-statement TVF has a plan (will find a reference).
TVFs are useful where you want to vary the SELECT list for a parameterised input. Inline TVFs are expanded and the outer select/where will be considered by the optimiser. For multi-statement TVFs optimisation is not really possible because it must run to completion, then filter.
Personally, I'd use a stored proc over a multi-statement TVF. Stored procs are more flexible (e.g. hints, can change state, SET NOCOUNT ON, SET XACT_ABORT, etc.).
I have no objection to inline TVFs but don't tend to use them for client facing code because of the inability to use SET and change state.
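The two kinds of TVF described above can be sketched side by side. This is a minimal illustration only: the table dbo.Orders_Demo and both function names are hypothetical, and GO is the batch separator understood by SSMS and sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical table, created only for this illustration.
CREATE TABLE dbo.Orders_Demo
(
    OrderID    int   NOT NULL PRIMARY KEY,
    CustomerID int   NOT NULL,
    Amount     money NOT NULL
);
GO

-- Inline TVF: a single RETURN (SELECT ...), expanded into the calling query.
CREATE FUNCTION dbo.OrdersForCustomer_Inline (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, CustomerID, Amount
    FROM dbo.Orders_Demo
    WHERE CustomerID = @CustomerID
);
GO

-- Multi-statement TVF: fills a table variable that is returned at the end.
CREATE FUNCTION dbo.OrdersForCustomer_Multi (@CustomerID int)
RETURNS @Result TABLE (OrderID int, CustomerID int, Amount money)
AS
BEGIN
    INSERT INTO @Result (OrderID, CustomerID, Amount)
    SELECT OrderID, CustomerID, Amount
    FROM dbo.Orders_Demo
    WHERE CustomerID = @CustomerID;

    RETURN;
END;
GO

-- The outer WHERE can be optimised together with the inline function's query.
SELECT OrderID FROM dbo.OrdersForCustomer_Inline(42) WHERE Amount > 100;

-- Here @Result is populated in full first, and the outer WHERE filters it afterwards.
SELECT OrderID FROM dbo.OrdersForCustomer_Multi(42) WHERE Amount > 100;
```

Comparing the actual execution plans of the last two statements should show the base table in the first plan and only a table-valued function operator in the second.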

I haven't verified this, but I take for granted that the execution plans for functions are also cached. I can't see a reason why that would not be possible.
The execution plan for a view, however, is not cached. The query in the view becomes part of the query that uses the view, so the execution plan can be cached for the query that uses the view, but not for the view itself.
The choice between functions and stored procedures depends on what result you need. A table-valued function can return only a single result set, while a stored procedure can return one result set, many result sets, or none at all.
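As a rough sketch of that difference, the hypothetical procedure below returns two result sets from one call, which a table-valued function cannot do. It reads only catalog views, so it needs no user tables; the procedure name and the table name passed to it are assumptions for illustration.

```sql
-- Hypothetical procedure: one call, two result sets.
CREATE PROCEDURE dbo.DescribeTable_Demo
    @TableName sysname
AS
BEGIN
    SET NOCOUNT ON;

    -- Result set 1: the table itself.
    SELECT t.name, t.create_date
    FROM sys.tables AS t
    WHERE t.name = @TableName;

    -- Result set 2: its columns.
    SELECT c.name, c.column_id
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(@TableName);
END;
GO

-- N'Orders' is a placeholder; pass the name of any table in the current database.
EXEC dbo.DescribeTable_Demo @TableName = N'Orders';
```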

Related

Difference between scalar, table-valued, and aggregate functions in SQL server?

What is the difference between scalar-valued, table-valued, and aggregate functions in SQL server? And does calling them from a query need a different method, or do we call them in the same way?
Scalar Functions
Scalar functions (sometimes referred to as User-Defined Functions / UDFs) return a single value as a return value, not as a result set, and can be used in most places within a query or SET statement, except for the FROM clause (and maybe other places?). Also, scalar functions can be called via EXEC, just like Stored Procedures, though there are not many occasions to make use of this ability (for more details on this ability, please see my answer to the following question on DBA.StackExchange: Why scalar valued functions need execute permission rather than select?). These can be created in both T-SQL and SQLCLR.
T-SQL (UDF):
Prior to SQL Server 2019: these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and always prohibit parallel execution plans.
Starting in SQL Server 2019: certain T-SQL scalar UDFs can be inlined, that is, have their definitions placed directly into the query such that the query does not call the UDF (similar to how iTVFs work (see below)). There are restrictions that can prevent a UDF from being inlineable (if that wasn't a word before, it is now), and UDFs that can be inlined will not always be inlined due to several factors. This feature can be disabled at the database, query, and individual UDF levels. For more information on this really cool new feature, please see: Scalar UDF Inlining (be sure to review the "requirements" section).
SQLCLR (UDF): these scalar functions also typically run per each row returned or scanned, but there are two important benefits over T-SQL UDFs:
Starting in SQL Server 2012, return values can be constant-folded into the execution plan IF the UDF does not do any data access, and if it is marked IsDeterministic = true. In this case the function wouldn't run per each row.
SQLCLR scalar functions can work in parallel plans ( 😃 ) if they do not do any database access.
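The inlining controls mentioned above for T-SQL scalar UDFs can be sketched as follows, assuming SQL Server 2019 or later. The function dbo.AddTax_Demo is hypothetical; the is_inlineable column, the USE HINT name, the database scoped configuration, and the INLINE option are the ones documented for the Scalar UDF Inlining feature.

```sql
-- Hypothetical scalar UDF, created only for this illustration.
CREATE FUNCTION dbo.AddTax_Demo (@Amount money)
RETURNS money
AS
BEGIN
    RETURN @Amount * 1.2;
END;
GO

-- Does this UDF meet the inlining requirements?
SELECT OBJECT_NAME(m.object_id) AS function_name, m.is_inlineable
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.AddTax_Demo');

-- Disable inlining for a single query.
SELECT TOP (5) o.name, dbo.AddTax_Demo(100) AS with_tax
FROM sys.objects AS o
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));

-- Disable inlining for the current database.
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;
GO

-- Disable inlining for one UDF.
ALTER FUNCTION dbo.AddTax_Demo (@Amount money)
RETURNS money
WITH INLINE = OFF
AS
BEGIN
    RETURN @Amount * 1.2;
END;
GO
```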
Table-Valued Functions
Table-Valued Functions (TVFs) return result sets, and can be used in a FROM clause, JOIN, or CROSS APPLY / OUTER APPLY of any query, but unlike simple Views, cannot be the target of any DML statements (INSERT / UPDATE / DELETE). These can also be created in both T-SQL and SQLCLR.
T-SQL MultiStatement (TVF): these TVFs, as their name implies, can have multiple statements, similar to a Stored Procedure. Whatever results they are going to return are stored in a Table Variable and returned at the very end; meaning, nothing is returned until the function is done processing. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) depends on the version of SQL Server:
Prior to SQL Server 2014: these always report 1 (yes, just 1) row.
SQL Server 2014 and 2016: these always report 100 rows.
Starting in SQL Server 2017: default is to report 100 rows, BUT under some conditions the row count will be fairly accurate (based on current statistics) thanks to the new Interleaved Execution feature.
T-SQL Inline (iTVF): these TVFs can only ever be a single statement, and that statement is a full query, just like a View. And in fact, Inline TVFs are essentially a View that accepts input parameters for use in the query. They also do not cache their own query plan as their definition is placed into the query in which they are used (unlike the other objects described here), hence they can be optimized much better than the other types of TVFs ( 😃 ). These TVFs perform quite well and are preferred if the logic can be handled in a single query.
SQLCLR (TVF): these TVFs are similar to T-SQL MultiStatement TVFs in that they build up the entire result set in memory (even if it is swap / page file) before releasing all of it at the very end. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
SQLCLR Streaming (sTVF): these TVFs allow for complex C# / VB.NET code just like regular SQLCLR TVFs, but are special in that they return each row to the calling query as they are generated ( 😃 ). This model allows the calling query to start processing the results as soon as the first one is sent so the query doesn't need to wait for the entire process of the function to complete before it sees any results. And it requires less memory since the results aren't being stored in memory until the process completes. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
Aggregate Functions
User-Defined Aggregates (UDA) are aggregates similar to SUM(), COUNT(), MIN(), MAX(), etc. and typically require a GROUP BY clause. These can only be created in SQLCLR, and that ability was introduced in SQL Server 2005. Also, starting in SQL Server 2008, UDAs were enhanced to allow for multiple input parameters ( 😃 ). One particular deficiency is that there is no knowledge of row ordering within the group, so creating a running total, which would be relatively easy if ordering could be guaranteed, is not possible within a SAFE Assembly.
Please also see:
CREATE FUNCTION (MSDN documentation)
CREATE AGGREGATE (MSDN documentation)
CLR Table-Valued Function Example with Full Streaming (STVF / TVF) (article I wrote)
A scalar function returns a single value. It might not even be related to tables in your database.
A table-valued function returns your specified columns for rows in your table meeting your selection criteria.
An aggregate function returns a calculation across the rows of a table -- for example summing values.
Scalar function
Returns a single value. It is just like writing functions in other programming languages using T-SQL syntax.
Table Valued function
Is a little different compared to the above. Returns a table value. Inside the body of this function you write a query that will return the exact table.
For example:
CREATE FUNCTION <function name>(@parameter datatype)
RETURNS TABLE
AS
RETURN
(
-- *write your query here* ---
)
Note that there are no BEGIN and END statements here.
Aggregate Functions
Includes built-in functions that are used alongside the GROUP BY clause. For example: SUM(), MAX(), MIN(), AVG(), COUNT() are aggregate functions.
Aggregate and scalar functions both return a single value, but scalar functions operate on a single input value, while aggregate functions operate on a set of input values (a collection or column name). Examples of scalar functions are the string functions, ISNULL, and ISNUMERIC; examples of aggregate functions are AVG, MAX, and the others listed in the Aggregate Functions section of the Microsoft website.
Table-valued functions return a table, whether or not they take any input arguments. You execute them by using them as you would a regular physical table, e.g.: SELECT * FROM fnGetMulEmployee()
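To make the calling conventions concrete, here is a small sketch that uses one function of each kind against the sys.objects catalog view. The inline function dbo.ObjectsOfType_Demo is hypothetical and exists only for this example.

```sql
-- Hypothetical inline table-valued function over a catalog view.
CREATE FUNCTION dbo.ObjectsOfType_Demo (@Type char(2))
RETURNS TABLE
AS
RETURN
(
    SELECT name, object_id, create_date
    FROM sys.objects
    WHERE type = @Type
);
GO

-- Scalar functions: called per value, anywhere an expression is allowed.
SELECT UPPER(name) AS upper_name, ISNULL(principal_id, 0) AS owner_id
FROM sys.objects;

-- Aggregate functions: one value per group of rows.
SELECT type_desc, COUNT(*) AS object_count, MAX(create_date) AS newest
FROM sys.objects
GROUP BY type_desc;

-- Table-valued functions: used in the FROM clause like a table.
SELECT name, create_date
FROM dbo.ObjectsOfType_Demo('U');
```

A user-defined scalar function is called in the same position as UPPER above, but with its schema name, for example dbo.SomeScalarFunction(column), where the function name is again only a placeholder.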
This following link is very useful to understand the difference: https://www.dotnettricks.com/learn/sqlserver/different-types-of-sql-server-functions

Do SQL Server functions such as inline table-values functions persist?

I am aware that derived table and Common table expression (CTE) do not persist. They live in memory til the end of the outer query. Every call is a repeated execution.
Do functions such as inline table-valued functions persist, meaning they are only calculated once ? Can we index an inline table-valued function?
An inline function is basically the same thing as a view or a CTE, except that it has parameters. If you look at the query plan you'll see that the logic from the function is included in the query using it -- so no, you can't index it, and SQL Server doesn't cache its results as such, but of course the pages will be in the buffer pool for future use.
I wouldn't say that each call to CTE is a repeated execution either, since SQL server can freely decide how to run the query, as long as the results are correct.
For multi statement UDF each of the calls (at least in versions up to 2014) are separate executions, as far as I know, every time, and not cached in the sense I assume you mean.
Do functions such as inline table-valued functions persist
No. If you look at the syntax of an inline table-valued function, it in essence returns the result of a SELECT statement, so it doesn't store the fetched data anywhere (same as a view). So no, there is no question of creating an index on it, since the data doesn't get stored.
Unless, that is, you store the fetched data in another table as below; then you can create an index (or anything else) on that test table:
SELECT *
INTO TestTable
FROM yourInlineTableValuedFunction(parameter);
Can we index an inline table-valued function?
No, but if you make the table into a temp table, sure you can index and speed up. The overhead of creation of a temp table will more than pay off in improved indexed access, caching, and, based on your use case, repeated use of the same temp table in a multi-user scenario.
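The temp-table approach might look like the sketch below. The function dbo.ColumnsOfObject_Demo and the index name are hypothetical, and the example reads the sys.all_columns catalog view so that it runs without any user tables.

```sql
-- Hypothetical inline TVF over a catalog view.
CREATE FUNCTION dbo.ColumnsOfObject_Demo (@ObjectID int)
RETURNS TABLE
AS
RETURN
(
    SELECT object_id, column_id, name
    FROM sys.all_columns
    WHERE object_id = @ObjectID
);
GO

DECLARE @ObjectID int = OBJECT_ID(N'sys.objects');

-- Materialise the function's result once...
SELECT object_id, column_id, name
INTO #Cols
FROM dbo.ColumnsOfObject_Demo(@ObjectID);

-- ...index the materialised copy...
CREATE CLUSTERED INDEX IX_Cols_ColumnID ON #Cols (column_id);

-- ...and query it repeatedly without re-running the function's query.
SELECT name FROM #Cols WHERE column_id = 1;

DROP TABLE #Cols;
```

Whether this pays off depends on how large the result is and how often it is reused, so it is worth measuring rather than assuming.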

Will the OPTIMIZE option work in a multi-statement table function?

I have SQL Server 2008 Express, so I don't have all the tools to see what is happening under the hood. Someone suggested to me that since a multi-statement table function is a "black box", that SQL Server may ignore the following:
OPTION (OPTIMIZE FOR (@JobID UNKNOWN, @Status UNKNOWN, @ResellerID UNKNOWN))
Does anyone have proof of this either way?
I know that if I were using a stored procedure, this wouldn't be an issue. However, using a multi-statement table function offers a lot of convenience for what I need to do.
With Express you have the same information at your disposal as with any other edition; you just don't have the GUI tools to display it. For instance, execution plans are still available through DMVs such as sys.dm_exec_query_plan.
I'm not sure what question you are asking, but it is true that inline table functions are a much better choice than multi-statement table functions. With an inline TVF the optimizer can see what the function does and can properly optimize it in the context of the entire query, perhaps eliminating unnecessary calls to the function or choosing an access path (an index) that helps reduce the overall, aggregate time of the entire query. With a multi-statement TVF the plan is forced to effectively call and evaluate the function each time (i.e. for each candidate row) and see what the result is. This is probably what your friend means when he says that multi-statement TVFs are a 'black box'.
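A minimal sketch of reading cached plans from the DMVs without any GUI tooling, assuming the login has VIEW SERVER STATE permission. The commented-out filter uses a placeholder function name.

```sql
-- Most frequently reused cached plans, with their text and XML plan.
SELECT TOP (20)
    cp.objtype,
    cp.usecounts,
    st.text       AS sql_text,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
-- To narrow this to one module, a filter such as the following could be added
-- (dbo.YourFunction is a placeholder, not a real object):
-- WHERE st.objectid = OBJECT_ID(N'dbo.YourFunction')
ORDER BY cp.usecounts DESC;
```

The query_plan column is XML, so it can be saved to a .sqlplan file or inspected as text for the operators and any hints that were applied.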

Stored procedures and functions

What are the differences between stored procedures and functions?
Whenever there are several input and output parameters I go for a stored procedure. If there is only one I go for a function.
Besides that, is there any performance issue if I use more stored procedures? I am worried as I have close to 50 stored procedures in my project.
How they differ conceptually.
Thanks in advance!
EDITED:-
When I executed a calculation in a stored procedure and in a function, I found that the stored procedure took 0.15 sec, while the function took 0.45 sec.
Surprisingly, functions are taking more time than stored procedures. Maybe functions are worthwhile for their reusability.
Inline functions execute more quickly than stored procedures. I think this is because multi-statement functions can't use statistics, which slows them down, but inline table-valued functions can use statistics.
Difference between stored procedure and functions in SQL Server ...
http://www.dotnetspider.com/resources/18920-Difference-between-Stored-Procedure-Functions.aspx
Difference between Stored procedures and User Defined functions[UDF]
http://www.go4expert.com/forums/showthread.php?t=329
Stored procedures vs. functions
http://searchsqlserver.techtarget.com/tip/Stored-procedures-vs-functions
What are the differences between stored procedure and functions in ...
http://www.allinterview.com/showanswers/28431.html
Difference between Stored procedure and functions
http://www.sqlservercentral.com/Forums/Topic416974-8-1.aspx
To decide between the two, keep in mind the fundamental difference between them: stored procedures are designed to return their output to the application. A UDF returns table variables, while a SPROC can't return a table variable although it can create a table. Another significant difference is that UDFs can't change the server environment or your operating system environment, while a SPROC can. Operationally, when T-SQL encounters an error the function stops, while T-SQL will ignore an error in a SPROC and proceed to the next statement in your code (provided you've included error handling support). You'll also find that although a SPROC can be used in a FOR XML clause, a UDF cannot be.
If you have an operation such as a query with a FROM clause that requires a rowset be drawn from a table or set of tables, then a function will be your appropriate choice. However, when you want to use that same rowset in your application the better choice would be a stored procedure.
There's quite a bit of debate about the performance benefits of UDFs vs. SPROCs. You might be tempted to believe that stored procedures add more overhead to your server than a UDF. Depending upon how you write your code and the type of data you're processing, this might not be the case. It's always a good idea to test your data in important or time-consuming operations by trying both methods on them.
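One way to run such a comparison is SET STATISTICS TIME, which prints parse, compile, and execution times for each statement. The procedure and function below are hypothetical stand-ins that do the same trivial work; substitute your own objects.

```sql
-- Hypothetical procedure and inline function doing the same work.
CREATE PROCEDURE dbo.CountObjects_Proc_Demo
AS
BEGIN
    SET NOCOUNT ON;
    SELECT COUNT(*) AS object_count FROM sys.objects;
END;
GO

CREATE FUNCTION dbo.CountObjects_Fn_Demo ()
RETURNS TABLE
AS
RETURN
(
    SELECT COUNT(*) AS object_count FROM sys.objects
);
GO

-- Timings appear in the Messages output, one entry per statement.
SET STATISTICS TIME ON;

EXEC dbo.CountObjects_Proc_Demo;
SELECT object_count FROM dbo.CountObjects_Fn_Demo();

SET STATISTICS TIME OFF;
```

Run each variant several times and discard the first run, since the first execution includes compilation and may read pages from disk.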

Profiling statements inside a User-Defined Function

I'm trying to use SQL Server Profiler (2005) to track down some application performance problems. One of the calls being made is to a table-valued user-defined function. This function wraps a select that joins several tables together.
In SQL Server Profiler, the call to the UDF is logged. However, the select that underlies the UDF isn't being logged at all. Because of this, I'm not getting useful data on which tables & indexes are being hit. I'd like to feed this info into the Database Tuning Advisor for some indexing advice.
Is there any way (short of unwrapping the queries themselves) to log the tables called by UDFs in Profiler?
You can't: a multi-statement TVF is a black box, and you can only get CPU, Reads, Writes, etc.
By "black box" I mean it's a fully encapsulated and opaque series of statements inside another query, and there is no "flow" like you'd get line by line through a stored proc.
An inline TVF is expanded like a view or macro into the main query and can be seen.
Edit: related: Table Valued Function where did my query plan go?
