When to create a function rather than a procedure - sql-server

I have many stored procedures that are in production but no functions.
Here in MSDN is the definition of CREATE FUNCTION. It says the following:
Creates a user-defined function in SQL Server 2012. A user-defined
function is a Transact-SQL or common language runtime (CLR) routine
that accepts parameters, performs an action, such as a complex
calculation, and returns the result of that action as a value. The
return value can either be a scalar (single) value or a table. Use
this statement to create a reusable routine that can be used in these
ways:
My stored procedures already seem to achieve all of the above.
What is a definite situation when one chooses a function rather than a stored procedure?

When you want to use the result integrated as part of a larger query.
e.g. you can join or cross apply table valued functions and you can evaluate a Scalar UDF for each row in a result (with care as this can have performance implications).
For stored procedures you would need to execute them and capture the result into a temporary table etc to be able to join on it in a wider query and would require cursors or similar to simulate the Scalar UDF behaviour and execute it for each row.

Related

Memoizable functions - Snowflake

When querying INFORMATION_SCHEMA or SHOW FUNCTION we could find a column IS_MEMOIZABLE.
SELECT IS_MEMOIZABLE, *
FROM INFORMATION_SCHEMA.FUNCTIONS;
None of built-in function is memoizable:
SHOW BUILTIN FUNCTIONS;
SELECT "is_memoizable", *
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "is_memoizable" <> 'N';
-- 0 rows
Memoization
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
The question is how to create user defined function that has IS_MEMOIZABLE property equals to 'Y'(true)?
Is there any specific keyword required and/or does it apply to specific types of functions(external/Python/Java/immutable/...)?
Memoizable UDFs
A scalar SQL UDF can be memoizable. A memoizable function caches the result of calling a scalar SQL UDF and then returns the cached result when the output is needed at a later time. The benefit of using a memoizable function is to improve performance for complex queries, such as multiple column lookups in mapping tables referenced within a row access policy or masking policy.
You can define a scalar SQL UDF to be memoizable in the CREATE FUNCTION statement by specifying the MEMOIZABLE keyword. Memoizable functions do not include arguments.
Example:
CREATE OR REPLACE FUNCTION test()
RETURNS INT
MEMOIZABLE
AS
$$SELECT 1$$;

Why ,when and where we will use stored procedures, views and functions?

I often use a stored procedure for data access purpose but don't know which one is best - a view or a stored procedure or a function?
Please tell me which one of the above is best for data access purpose and why it is best, list down the reason with the example please.
I searched Google to learn which one is best but got no expected answer
View
A view is a โ€œvirtualโ€ table consisting of a SELECT statement, by means of โ€œvirtualโ€
I mean no physical data has been stored by the view -- only the definition of the view is stored inside the database; unless you materialize the view by putting an index on it.
By definition you can not pass parameters to the view
NO DML operations (e.g. INSERT, UPDATE, and DELETE) are allowed inside the view; ONLY SELECT statements.
Most of the time, view encapsulates complex joins so it can be reusable in the queries or stored procedures. It can also provide level of isolation and security by hiding sensitive columns from the underlying tables.
Stored procedure
A stored procedure is a group of Transact-SQL statements compiled into a single execution plan or in other words saved collection of Transact-SQL statements.
A stored procedure:
accepts parameters
can NOT be used as building block in a larger query
can contain several statements, loops, IF ELSE, etc.
can perform modifications to one or several tables
can NOT be used as the target of an INSERT, UPDATE or DELETE statement
A view:
does NOT accept parameters
can be used as building block in a larger query
can contain only one single SELECT query
can NOT perform modifications to any table
but can (sometimes) be used as the target of an INSERT, UPDATE or DELETE statement.
Functions
Functions are subroutines made up of one or more Transact-SQL statements that can be used to encapsulate code for reuse
There are three types (scalar, table valued and inline mutlistatement) UDF and each of them server different purpose you can read more about functions or UDF in BOL
UDF has a big limitation; by definition it cannot change the state of the database. What I mean by this you cannot perform data manipulation operation inside UDF (INSERT, UPDATE , DELETE) etc.
SP are good for doing DDL statements that you can't do with functions. SP and user defined functions accept parameters and can returns values but they can't do the same statements.
User defined functions can only do DML statements.
View doesn't accept parameters, and only accept DML statements.
I hope below information will help you to understand the use of the SQL procedure, view, and function.
Stored Procedure - Stored Procedure can be used for any database operation like insert, update, delete and fetch which you mentioned that you are already using.
View - View only can be used to fetch the data but it has limitations as you can't pass the parameters to the view. e.g. filter the data based on the passed parameter
Function - Function usually used for a specific operation like you have many in-built SQL server functions also for the date, for math, for string manipulation etc.
I'll make it very short and straight.
When you are accessing data from different tables and don't want to pass parameter use View.
When you want to perform DML statement go for Function.
When you want to perform DDL statement go for Stored Procedure.
Rest is upon your knowledge and idea hit in your mind at particular point of time.
And for performance reasons many would argue
- avoid functions (especially scalar) if possible
It's easier to tweak stored procedures (query plans) and views
IMO, View (and Indexed View) are just Fancier SELECT
Stored Procedure are versatile as you can transform/manipulate within

Difference between scalar, table-valued, and aggregate functions in SQL server?

What is the difference between scalar-valued, table-valued, and aggregate functions in SQL server? And does calling them from a query need a different method, or do we call them in the same way?
Scalar Functions
Scalar functions (sometimes referred to as User-Defined Functions / UDFs) return a single value as a return value, not as a result set, and can be used in most places within a query or SET statement, except for the FROM clause (and maybe other places?). Also, scalar functions can be called via EXEC, just like Stored Procedures, though there are not many occasions to make use of this ability (for more details on this ability, please see my answer to the following question on DBA.StackExchange: Why scalar valued functions need execute permission rather than select?). These can be created in both T-SQL and SQLCLR.
T-SQL (UDF):
Prior to SQL Server 2019: these scalar functions are typically a performance issue because they generally run for every row returned (or scanned) and always prohibit parallel execution plans.
Starting in SQL Server 2019: certain T-SQL scalar UDFs can be inlined, that is, have their definitions placed directly into the query such that the query does not call the UDF (similar to how iTVFs work (see below)). There are restrictions that can prevent a UDF from being inlineable (if that wasn't a word before, it is now), and UDFs that can be inlined will not always be inlined due to several factors. This feature can be disabled at the database, query, and individual UDF levels. For more information on this really cool new feature, please see: Scalar UDF Inlining (be sure to review the "requirements" section).
SQLCLR (UDF): these scalar functions also typically run per each row returned or scanned, but there are two important benefits over T-SQL UDFs:
Starting in SQL Server 2012, return values can be constant-folded into the execution plan IF the UDF does not do any data access, and if it is marked IsDeterministic = true. In this case the function wouldn't run per each row.
SQLCLR scalar functions can work in parallel plans ( ๐Ÿ˜ƒ ) if they do not do any database access.
Table-Valued Functions
Table-Valued Functions (TVFs) return result sets, and can be used in a FROM clause, JOIN, or CROSS APPLY / OUTER APPLY of any query, but unlike simple Views, cannot be the target of any DML statements (INSERT / UPDATE / DELETE). These can also be created in both T-SQL and SQLCLR.
T-SQL MultiStatement (TVF): these TVFs, as their name implies, can have multiple statements, similar to a Stored Procedure. Whatever results they are going to return are stored in a Table Variable and returned at the very end; meaning, nothing is returned until the function is done processing. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) depends on the version of SQL Server:
Prior to SQL Server 2014: these always report 1 (yes, just 1) row.
SQL Server 2014 and 2016: these always report 100 rows.
Starting in SQL Server 2017: default is to report 100 rows, BUT under some conditions the row count will be fairly accurate (based on current statistics) thanks to the new Interleaved Execution feature.
T-SQL Inline (iTVF): these TVFs can only ever be a single statement, and that statement is a full query, just like a View. And in fact, Inline TVFs are essentially a View that accepts input parameters for use in the query. They also do not cache their own query plan as their definition is placed into the query in which they are used (unlike the other objects described here), hence they can be optimized much better than the other types of TVFs ( ๐Ÿ˜ƒ ). These TVFs perform quite well and are preferred if the logic can be handled in a single query.
SQLCLR (TVF): these TVFs are similar to T-SQL MultiStatement TVFs in that they build up the entire result set in memory (even if it is swap / page file) before releasing all of it at the very end. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
SQLCLR Streaming (sTVF): these TVFs allow for complex C# / VB.NET code just like regular SQLCLR TVFs, but are special in that they return each row to the calling query as they are generated ( ๐Ÿ˜ƒ ). This model allows the calling query to start processing the results as soon as the first one is sent so the query doesn't need to wait for the entire process of the function to complete before it sees any results. And it requires less memory since the results aren't being stored in memory until the process completes. The estimated number of rows that they will return, as reported to the Query Optimizer (which impacts the execution plan) is always 1000 rows. Given that a fixed row count is far from ideal, please support my request to allow for specifying the row count: Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer
Aggregate Functions
User-Defined Aggregates (UDA) are aggregates similar to SUM(), COUNT(), MIN(), MAX(), etc. and typically require a GROUP BY clause. These can only be created in SQLCLR, and that ability was introduced in SQL Server 2005. Also, starting in SQL Server 2008, UDAs were enhanced to allow for multiple input parameters ( ๐Ÿ˜ƒ ). One particular deficiency is that there is no knowledge of row ordering within the group, so creating a running total, which would be relatively easy if ordering could be guaranteed, is not possible within a SAFE Assembly.
Please also see:
CREATE FUNCTION (MSDN documentation)
CREATE AGGREGATE (MSDN documentation)
CLR Table-Valued Function Example with Full Streaming (STVF / TVF) (article I wrote)
A scalar function returns a single value. It might not even be related to tables in your database.
A tabled-valued function returns your specified columns for rows in your table meeting your selection criteria.
An aggregate-valued function returns a calculation across the rows of a table -- for example summing values.
Scalar function
Returns a single value. It is just like writing functions in other programming languages using T-SQL syntax.
Table Valued function
Is a little different compared to the above. Returns a table value. Inside the body of this function you write a query that will return the exact table.
For example:
CREATE FUNCTION <function name>(parameter datatype)
RETURN table
AS
RETURN
(
-- *write your query here* ---
)
Note that there is no BEGIN & END statements here.
Aggregate Functions
Includes built in functions that is used alongside GROUP clause. For example: SUM(),MAX(),MIN(),AVG(),COUNT() are aggregate functions.
Aggregate and Scalar functions both return a single value but Scalar functions operate based on a single input value argument while Aggregate functions operate on a single input set of values (a collection or column name). Examples of Scalar functions are string functions, ISNULL, ISNUMERIC, for Aggregate functions examples are AVG, MAX and others you can find in Aggregate Functions section of Microsoft website.
Table-Valued functions return a table regardless existence of any input argument. Execution of this functions is done by using them as a regular physical table e.g: SELECT * FROM fnGetMulEmployee()
This following link is very useful to understand the difference: https://www.dotnettricks.com/learn/sqlserver/different-types-of-sql-server-functions

SQL Server: Table-valued Functions vs. Stored Procedures

I have been doing a lot of reading up on execution plans and the problems of dynamic parameters in stored procedures. I know the suggested solutions for this.
My question, though, is everything I have read indicated that SQL Server caches the execution plan for stored procedures. No mention is made of Table-value functions. I assume it does so for Views (out of interest).
Does it recompile each time a Table-value function is called?
When is it best to use a Table-value function as opposed to a stored procedure?
An inline table valued function (TVF) is like a macro: it's expanded into the outer query. It has no plan as such: the calling SQL has a plan.
A multi-statement TVF has a plan (will find a reference).
TVFs are useful where you want to vary the SELECT list for a parameterised input. Inline TVFs are expanded and the outer select/where will be considered by the optimiser. For multi-statement TVFs optimisation is not really possible because it must run to completion, then filter.
Personally, I'd use a stored proc over a multi-statement TVF. They are more flexible (eg hints, can change state, SET NOCOUNT ON, SET XACTABORT etc).
I have no objection to inline TVFs but don't tend to use them for client facing code because of the inability to use SET and change state.
I haven't verified this, but I take for granted that the execution plan for functions are also cached. I can't see a reason why that would not be possible.
The execution plan for views are however not cached. The query in the view will be part of the query that uses the view, so the execution plan can be cached for the query that uses the view, but not for the view itself.
The use of functions versus stored procedured depends on what result you need from it. A table-valued function can return a single result, while a stored procedure can return one result, many results, or no result at all.

Stored procedures and functions

What are the differences between stored procedures and functions.
Whenever there are more input, output parameters i go for stored procedure. If it is only one i will go for functions.
Besides that, is there any performance issue if i use more stored procedures? I am worried as i have close to 50 stored procedures in my project.
How they differ conceptually.
Thanks in advance!
EDITED:-
When i executed a calculation in stored procedure and in functions, i have found that in stored procedures it is taking 0.15 sec, while in function it takes 0.45sec.
Surprisingly functions are taking more time than stored procedures. May be functions are worth for its reusability.
Inline functions executes quicker than strored procedures. I think, this is because multi-select functions can't use statastics, which slows them down, but inline table-value functions can use statistics.
Difference between stored procedure and functions in SQL Server ...
http://www.dotnetspider.com/resources/18920-Difference-between-Stored-Procedure-Functions.aspx
Difference between Stored procedures and User Defined functions[UDF]
http://www.go4expert.com/forums/showthread.php?t=329
Stored procedures vs. functions
http://searchsqlserver.techtarget.com/tip/Stored-procedures-vs-functions
What are the differences between stored procedure and functions in ...
http://www.allinterview.com/showanswers/28431.html
Difference between Stored procedure and functions
http://www.sqlservercentral.com/Forums/Topic416974-8-1.aspx
To decide between using one of the two, keep in mind the fundamental difference between them: stored procedures are designed to return its output to the application. A UDF returns table variables, while a SPROC can't return a table variable although it can create a table. Another significant difference between them is that UDFs can't change the server environment or your operating system environment, while a SPROC can. Operationally, when T-SQL encounters an error the function stops, while T-SQL will ignore an error in a SPROC and proceed to the next statement in your code (provided you've included error handling support). You'll also find that although a SPROC can be used in an XML FOR clause, a UDF cannot be.
If you have an operation such as a query with a FROM clause that requires a rowset be drawn from a table or set of tables, then a function will be your appropriate choice. However, when you want to use that same rowset in your application the better choice would be a stored procedure.
There's quite a bit of debate about the performance benefits of UDFs vs. SPROCs. You might be tempted to believe that stored procedures add more overhead to your server than a UDF. Depending upon how your write your code and the type of data you're processing, this might not be the case. It's always a good idea to text your data in important or time-consuming operations by trying both types of methods on them.

Resources