How to use SELECT/INSERT statements in a Snowflake UDF

I have a requirement to create a UDF in Snowflake that contains complex SELECT and INSERT statements. But as per the documentation, a SELECT statement cannot be used in a UDF. Is there any possible workaround?

You can actually use a SELECT statement in a UDF.
For example:
create or replace function orders_for_product(prod_id varchar)
returns table (product_id varchar, quantity_sold numeric(11, 2))
as
$$
select product_id, quantity_sold
from orders
where product_id = prod_id
$$
;
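A tabular UDF like this one is queried by wrapping the call in TABLE(...) in the FROM clause. The following is a minimal usage sketch, assuming the orders table from the example exists; the product ID 'P-100' is made up for illustration.

```sql
-- Call the tabular UDF defined above.
-- 'P-100' is a hypothetical product ID, not a value from the original post.
select product_id, quantity_sold
from table(orders_for_product('P-100'))
order by quantity_sold desc;
```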

User-defined functions accept parameters, perform an operation (such as a calculation), and return the result of that operation as a value. UDFs are generally used to define operations that are not available through the built-in, system-defined functions.
However, you can incorporate a SELECT in a UDF as shown below; the link that follows has more examples of how a UDF can be used.
CREATE FUNCTION profit()
RETURNS NUMERIC(11, 2)
AS
$$
SELECT SUM((retail_price - wholesale_price) * number_sold)
FROM purchases
$$
;
https://docs.snowflake.com/en/sql-reference/udf-overview.html#sql
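The INSERT half of the question is a different matter: a Snowflake SQL UDF body cannot contain DML, so an INSERT has to live somewhere else, typically in a stored procedure. The following is only a sketch of that idea using Snowflake Scripting; the profit_history table and the record_profit procedure are hypothetical names that do not appear in the original posts.

```sql
-- Hypothetical target table for the computed value.
CREATE TABLE IF NOT EXISTS profit_history (
    computed_at  TIMESTAMP_LTZ,
    total_profit NUMERIC(11, 2)
);

-- Hypothetical procedure: DML is allowed here, unlike in a UDF body.
CREATE OR REPLACE PROCEDURE record_profit()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
    INSERT INTO profit_history (computed_at, total_profit)
        SELECT CURRENT_TIMESTAMP(),
               SUM((retail_price - wholesale_price) * number_sold)
        FROM purchases;
    RETURN 'profit recorded';
END;
$$
;

CALL record_profit();
```

A procedure is invoked with CALL rather than from inside a SELECT, so this pattern fits batch-style work and not per-row computation.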

Related

Would it negatively affect performance to write "WHERE" on the result set returned from a table-valued function rather than in the query itself?

I have a SQL query that currently returns its result set using a table-valued function, as below:
CREATE FUNCTION [dbo].[fnGetUserTable] ()
RETURNS @tblUser TABLE (
UserDeviceId BIGINT,
UserDeviceName NVARCHAR(50),
UserName NVARCHAR(50)
)
AS
BEGIN
INSERT INTO @tblUser
Select Device.DeviceId, Device.DeviceName, [User].UserName
FROM Device INNER JOIN [User] ON Device.DeviceId = [User].UserDeviceId
RETURN
END
However, I am filtering the result set returned from the function as follows:
SELECT UserDeviceId, UserDeviceName, UserName
FROM [dbo].[fnGetUserTable]()
WHERE UserName ='myuser@users.com'
Assuming there are 1,000,000 records in table User, what would be best practice, performance wise: to filter the result set returned from the function as done in the previous example,
or to add the where condition directly into the SQL query within the [dbo].[fnGetUserTable] function as follows:
CREATE FUNCTION [dbo].[fnGetUserTable] ()
RETURNS @tblUser TABLE (
UserDeviceId BIGINT,
UserDeviceName NVARCHAR(50),
UserName NVARCHAR(50)
)
AS
BEGIN
INSERT INTO @tblUser
Select Device.DeviceId, Device.DeviceName, [User].UserName
FROM Device INNER JOIN [User] ON Device.DeviceId = [User].UserDeviceId
WHERE UserName ='myuser@users.com'
RETURN
END
Or would both have the same performance impact, since SQL Server processes the FROM clause before the WHERE clause in a SELECT query? That would mean the result set of the function's query is inserted into the temp table first and only then filtered by the where condition, which would lead to the same performance either way.
Please Advise,
Thanks in advance,
One way to find out is to run each approach and compare the execution plans to see which one costs less.
Here is a reference on how to display the execution plan:
https://learn.microsoft.com/en-us/sql/relational-databases/performance/display-an-actual-execution-plan?view=sql-server-ver15
what would be best practice, performance wise, to filter the returned result set from the function as done in the previous example, or to add the where condition directly into the SQL query
In this particular case adding the predicate inside the function is clearly better, as it would enable using an index on the column.
But that's because this is a multi-statement Table Valued Function, and those are "opaque" to the query optimizer (QO). The logic inside the function is executed and the results returned to the outer query. If this were an Inline Table-Valued Function, the case would be different. That would look like this:
CREATE FUNCTION [dbo].[fnGetUserTable] ()
RETURNS TABLE
as
RETURN
Select Device.DeviceId, Device.DeviceName, [User].UserName
FROM Device INNER JOIN [User] ON Device.DeviceId = [User].UserDeviceId
An Inline Table-Valued Function is parsed and folded into the query plan of the outer query, so the QO can optimize the whole thing. In particular, it can push down query predicates from the outer query into the function.
So favor Inline Table-Valued Functions precisely because you don't have to add function parameters to get efficient query performance when filtering by one of the function's returned columns.
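As a rough illustration of that pushdown, the filter can stay in the outer query when the function is inline. This sketch assumes the inline version of fnGetUserTable shown above; the index name IX_User_UserName is hypothetical and is only there to give the optimizer something to seek on.

```sql
-- Hypothetical supporting index on the filtered column.
CREATE INDEX IX_User_UserName ON [User] (UserName);
GO

-- The WHERE clause is written outside the function, but because the function
-- is inline, the optimizer can apply the predicate to the underlying join.
SELECT DeviceId, DeviceName, UserName
FROM [dbo].[fnGetUserTable]()
WHERE UserName = 'myuser@users.com';
```

Comparing the actual execution plan of this query against the multi-statement version is a direct way to check whether the predicate was pushed down in a given environment.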

Use of table valued function in order by clause

Can I use my table-valued function in the ORDER BY clause of my SELECT query?
Like this:
declare @ID int
set @ID=9011
Exec ('select top 10 * from cs_posts order by ' + (select * from dbo.gettopposter(@ID)) desc)
GetTopPoster(ID) is my table valued function.
Please help me on this.
You can use a table-valued function with a join. That also allows you to choose any combination of columns to sort by:
select top 10 *
from cs_posts p
join dbo.gettopposter(@ID) as gtp
on p.poster_id = gtp.poster_id
order by
gtp.col1
, gtp.col2
Yes. You can use a Table Valued Function just as a normal table.
Your query is not valid SQL though, despite the TVF.
For further reference:
http://msdn.microsoft.com/en-us/library/ms191165.aspx
You can't do it like that - how does it know what to order by? It doesn't know how the TVF relates to the original query. You can join the two, however (I assume cs_posts has an id column which relates to the TVF), and then order by the TVF id column.

SQL Server 2008, can I refer to a temporary table in a select statement within a udf?

I'm trying to run a select query on a temporary table within a udf. I can't find documentation stating this isn't allowed, yet the function below won't compile when I change tblDailyPricingAndVol to #dailyPricingAndVolBySymbol (my temporary table, of course). The temp table is created at a higher level (in a stored procedure that runs before the stored procedure that uses this function), if that affects anything. Thanks in advance.
Edit:
The udf is meant to just be a helper for the stored procedure that calls it.. I'm trying to query a temporary table with it due to the fact that it'll get called thousands of times each time it runs. The data that it retrieves and then aggregates is in a table with millions of rows. So I pare down the data into several hundred records, into the temporary table. This will speed the function up dramatically, even though it'll still take a fair bit of time to run.
ALTER FUNCTION dbo.PricingVolDataAvailableToDateProvided
(@Ticker nchar(10),
@StartDate DATE,
@NumberOfDaysBack int)
RETURNS nchar(5)
AS
BEGIN
DECLARE @Result nchar(5)
DECLARE @RecordCount int
SET @RecordCount = (SELECT COUNT(TradeDate) AS Expr1
FROM (SELECT TOP (100) PERCENT TradeDate
FROM tblDailyPricingAndVol WHERE (Symbol = @Ticker) AND (TradeDate IN
(SELECT TOP (@NumberOfDaysBack) CAST(TradingDate AS DATE) AS Expr1
FROM tblTradingDays
WHERE (TradingDate <= @StartDate)
ORDER BY TradingDate DESC))
ORDER BY TradeDate DESC) AS TempTable)
IF @RecordCount = @NumberOfDaysBack
SET @Result = 'True'
ELSE
SET @Result = 'False'
RETURN @Result
END
As has been mentioned by other posters, you can't use a temporary table in a UDF. What you can do is pass a table-valued parameter of a user-defined table type to your function.
User-Defined Table Types
In SQL Server 2008, a user-defined table type is a user-defined type
that represents the definition of a table structure. You can use a
user-defined table type to declare table-valued parameters for stored
procedures or functions, or to declare table variables that you want
to use in a batch or in the body of a stored procedure or function.
A quick fix for changing your code could be
CREATE TYPE DailyPricingAndVolBySymbolType AS TABLE (<Columns>)
DECLARE @DailyPricingAndVolBySymbol DailyPricingAndVolBySymbolType
INSERT INTO @DailyPricingAndVolBySymbol SELECT * FROM #DailyPricingAndVolBySymbol
ALTER FUNCTION dbo.PricingVolDataAvailableToDateProvided (
@DailyPricingAndVolBySymbol DailyPricingAndVolBySymbolType READONLY,
@Ticker nchar(10),
@StartDate DATE,
@NumberOfDaysBack int
) ...
Looks like you're out of luck. I created a quick function below and got an explicit compiler message that says you can't reference temp tables in a function. I'm not sure why you would need to reference temp tables within a UDF, that's not really the spirit of UDF. Could you show how you were planning to call this UDF? Maybe we could help on that refactor.
Temp tables cannot be accessed from within a function. I suggest using a staging table instead. To better organize these in your DB you could create a schema called Staging, a table called Staging.dailyPricingAndVolBySymbol, and call that from your UDF.
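A minimal sketch of the staging-table suggestion might look like the following. The column list is an assumption based on the two columns the function reads (Symbol and TradeDate), and the ticker value is a placeholder; the real temp table in the question may hold more.

```sql
CREATE SCHEMA Staging;
GO

-- Permanent table that stands in for #dailyPricingAndVolBySymbol.
-- Columns are assumed from what the function queries.
CREATE TABLE Staging.dailyPricingAndVolBySymbol (
    Symbol    nchar(10) NOT NULL,
    TradeDate date      NOT NULL
);
GO

-- In the calling procedure: refresh the staging table instead of filling a temp table.
DECLARE @Ticker nchar(10) = N'ABC';   -- hypothetical ticker value

TRUNCATE TABLE Staging.dailyPricingAndVolBySymbol;

INSERT INTO Staging.dailyPricingAndVolBySymbol (Symbol, TradeDate)
SELECT Symbol, TradeDate
FROM tblDailyPricingAndVol
WHERE Symbol = @Ticker;
```

The function would then select from Staging.dailyPricingAndVolBySymbol in place of tblDailyPricingAndVol. Unlike a temp table, a staging table is shared by all sessions, so concurrent runs of the procedure would need a session key column or some other form of isolation.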

sql server stored procedure

Can I join a table with a stored procedure that returns a table?
Thanks
You need to use INSERT.. EXEC to store the data from the SP into a table or table-variable. Then you can join to that.
Say the SP returns a table (a int, b varchar(10), c datetime)
declare @temp table (a int, b varchar(10), c datetime)
;
insert @temp
exec myproc 1, 10, 'abcdef'
;
select *
from @temp t join othertable o on ... etc
Without creating a temp table, and if you also rule out a table variable, the only option, provided the SP does not take any parameters, is to use OPENQUERY to run the SP and return a table. Pseudocode:
select *
from OPENQUERY(local_server, 'spname_no_params') t
join othertable o on ... etc
You can't join directly onto a stored procedure. So you either need to use the approach per Richard's answer, or you could convert the sproc to a table valued function.
e.g.
CREATE FUNCTION dbo.fxnExample(@Something INTEGER)
RETURNS TABLE
AS
RETURN
(
SELECT A, B
FROM MyTable
WHERE Something = @Something
)
which you then use/JOIN on in a query like this:
SELECT t1.Foo, f.A, f.B
FROM Table1 t1
JOIN dbo.fxnExample(1) f ON t1.A = f.A
The thing to note is you can't do everything in a user defined function that you can in a sproc so depending on what your sproc does, this may not be possible. Also, for best performance you should make it an inline table valued function like my example above. The alternative is a multi-statement table valued function which could give you poor performance due to the way that the execution plan produced will be based on an assumption of a very low number of rows being returned by it (i.e. 1) - so if it returned a larger number of rows then performance could be poor.
Here's a good MSDN article on it: http://blogs.msdn.com/b/psssql/archive/2010/10/28/query-performance-and-multi-statement-table-valued-functions.aspx
No, it's not possible. What you can do is put the output of that SP into a temporary table and use that table in your join.

Is it possible to use a Stored Procedure as a subquery in SQL Server 2008?

I have two stored procedures, one of which returns a list of payments, while the other returns a summary of those payments, grouped by currency. Right now, I have a duplicated query: the main query of the stored procedure that returns the list of payments is a subquery of the stored procedure that returns the summary of payments by currency. I would like to eliminate this duplication by making the stored procedure that returns the list of payments a subquery of the stored procedure that returns the summary of payments by currency. Is that possible in SQL Server 2008?
You are better off converting the first proc into a TABLE-VALUED function. If it involves multiple statements, you need to first define the return table structure and populate it.
Sample:
CREATE proc getRecords @t char(1)
as
set nocount on;
-- other statements --
-- final select
select * from master..spt_values where type = @t
GO
-- becomes --
CREATE FUNCTION fn_getRecords(@t char(1))
returns @output table(
name sysname,
number int,
type char(1),
low int,
high int,
status int) as
begin
-- other statements --
-- final select
insert @output
select * from master..spt_values where type = @t
return
end;
However, if it is a straight select (or can be written as a single statement), then you can use the INLINE tvf form, which is highly optimized
CREATE FUNCTION fn2_getRecords(@t char(1))
returns table as return
-- **NO** other statements; single statement table --
select * from master..spt_values where type = @t
The second proc then simply selects from the function that replaced the first proc
create proc getRecordsByStatus @t char(1)
as
select status, COUNT(*) CountRows from dbo.fn2_getRecords(@t)
group by status
And where you used to call
EXEC firstProc @param
to get a result, you now select from it
SELECT * FROM firstProc(@param)
You can capture the output from a stored procedure in a temp table and then use the table in your main query.
Capture the output of a stored procedure returning columns ID and Name to a table variable.
declare @T table (ID int, Name nvarchar(50))
insert into @T
exec StoredProcedure
Inserting the results of your stored proc into a table variable or temp table will do the trick.
If you're trying to reuse code in SQL Server from one query to the next, you have more flexibility with Table Functions. Views are all right if you don't need to pass parameters or use any kind of flow control logic. These may be used like tables in any other function, procedure, view or t-sql statement.
If you made the procedure that returns the list into a table-valued function, then I believe you could use it in a sub-query.
I would use a view, unless it needs to be parameterized, in which case I would use an inline table-valued function if possible, unless it needs to be a multi-statement operation, where you can still use a table-valued function, but they are usually less efficient.
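For the unparameterized case, the view option could be sketched roughly as below. The dbo.Payments table, its columns, and the view name are all hypothetical, since the original question does not show its schema.

```sql
-- Hypothetical view holding the shared "list of payments" query.
CREATE VIEW dbo.vPayments
AS
SELECT PaymentId, Currency, Amount
FROM dbo.Payments;
GO

-- The summary reuses the view instead of duplicating the query.
SELECT Currency, SUM(Amount) AS TotalAmount, COUNT(*) AS PaymentCount
FROM dbo.vPayments
GROUP BY Currency;
```

If the list needs parameters (for example a date range), the same SELECT would move into an inline table-valued function instead, as the answers above describe.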

Resources