I managed to create the following function in PG 8.4.x:
CREATE OR REPLACE FUNCTION foo()
RETURNS VOID
AS $function$
BEGIN
select concat('a','b');
END;$function$
LANGUAGE plpgsql;
The function is created without any errors. But when I try to use it, I get:
select foo();
ERROR: function concat(unknown, unknown) does not exist
LINE 1: select concat('a','b')
               ^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
QUERY: select concat('a','b')
CONTEXT: PL/pgSQL function "foo" line 2 at SQL statement
How come PG succeeds in creating a function that actually calls an unknown function? (CONCAT is only supported in PG 9.x.)
PL/pgSQL checks only the syntax of embedded SQL at validation time. The semantics (identifiers, functions, and so on) are checked immediately before the first evaluation, at run time. Have a look at the plpgsql_check extension; it does a complete check of embedded SQL.
Because functions get compiled the first time you call them. Otherwise it would not be possible to define a set of mutually recursive functions where one calls the other :).
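To illustrate, a minimal sketch (hypothetical functions, not from the question): each body references the other function before it exists, yet both CREATE statements succeed, because resolution only happens at the first call.
CREATE FUNCTION is_even(n int) RETURNS boolean AS $$
BEGIN
    IF n = 0 THEN RETURN true; END IF;
    RETURN is_odd(n - 1);  -- is_odd() does not exist yet; CREATE still succeeds
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION is_odd(n int) RETURNS boolean AS $$
BEGIN
    IF n = 0 THEN RETURN false; END IF;
    RETURN is_even(n - 1);
END;
$$ LANGUAGE plpgsql;
SELECT is_even(4);  -- true; the names are resolved here, at run time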
EDIT (thanks to Nick Barnes): Somewhat unrelated to the question, there is a switch
SET check_function_bodies = true;
but this only enables basic syntax checks for PL/pgSQL functions; the binding is still done on first call. Postgres will only attempt to resolve function/table names for LANGUAGE sql.
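For contrast, a minimal sketch (assuming PG 8.4, where concat() is missing): the same call in a LANGUAGE sql function fails at CREATE time already, because Postgres resolves names in SQL-language bodies when check_function_bodies is on.
SET check_function_bodies = true;  -- the default
CREATE OR REPLACE FUNCTION foo_sql()
RETURNS text
AS $$ SELECT concat('a', 'b'); $$  -- on 8.4 this errors at CREATE time:
LANGUAGE sql;                      -- function concat(unknown, unknown) does not exist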
We are working on migrating Netezza to Snowflake. Netezza stored procedures have a mechanism, PROC_ARGUMENT_TYPES, that allows a procedure to be called with any number of arguments. Is there a similar function in Snowflake?
Like
c:= PROC_ARGUMENT_TYPES.count;
returns the number of arguments passed.
Please note: we are working on SQL stored procedures in Snowflake.
Snowflake does not allow procedures or UDFs with an arbitrary number of input parameters. However, it's possible to approximate this capability using any combination of procedure overloading, arrays, objects, and variants.
Here's one example that uses procedure overloading and variants. The first procedure has only the required parameters. The second procedure has the required parameters plus an additional parameter that accepts a variant.
If the calling SQL specifies two parameters, it will call the procedure (overload) with only two parameters in the signature. That procedure in turn just calls the main stored procedure specifying NULL for the third parameter and returns the results.
The main stored procedure with three inputs has a variant for the final input. It can accept an array or an object. An array requires positional awareness of the inputs. An object does not. An object allows passing name/value pairs.
create or replace procedure VARIABLE_SIGNATURE(REQUIRED_PARAM1 string, REQUIRED_PARAM2 string)
returns variant
language javascript
as
$$
var rs = snowflake.execute({sqlText:`call VARIABLE_SIGNATURE(?,?,null)`, binds:[REQUIRED_PARAM1, REQUIRED_PARAM2]});
rs.next();
return rs.getColumnValue(1);
$$;
create or replace procedure VARIABLE_SIGNATURE(REQUIRED_PARAM1 string, REQUIRED_PARAM2 string, OPTIONAL_PARAMS variant)
returns variant
language javascript
as
$$
var out = {};
out.REQUIRED_PARAM1 = REQUIRED_PARAM1;
out.REQUIRED_PARAM2 = REQUIRED_PARAM2;
out.OPTIONAL_PARAMS = OPTIONAL_PARAMS;
return out;
$$;
-- Call the SP overload different ways:
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2');
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', array_construct('PARAM3', 'PARAM4', 'PARAM5'));
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', object_construct('PARAM3_NAME', 'PARAM3_VALUE', 'PARAM10_NAME', 'PARAM10_VALUE'));
While these SPs are JavaScript, overloading and the use of arrays, objects, and variants works the same way for SQL Script stored procedures.
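For example, a minimal SQL Script sketch of the variant-accepting overload (untested; the procedure name and the trivial body are just for illustration):
create or replace procedure VARIABLE_SIGNATURE_SQL(
    REQUIRED_PARAM1 string,
    REQUIRED_PARAM2 string,
    OPTIONAL_PARAMS variant)
returns variant
language sql
as
$$
begin
    -- Echo the optional variant back; a real procedure would branch on
    -- whether it received an array, an object, or NULL.
    return OPTIONAL_PARAMS;
end;
$$;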
Some things I have noticed about valid notations for this in Snowflake.
To avoid maintaining duplicate, overloaded versions of a stored procedure, an alternative kludge might be to require passing some sort of testable falsy variant, or NULL, when no additional values are wanted.
-- Call the SP by passing a testable, falsy value:
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2'); -- This would fail without the matching two-string overload defined above.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', NULL); -- This will work.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', ''::variant); -- This will work.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', array_construct()); -- This will work.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', object_construct()); -- This will work.
Of course, array_construct('PARAM3', 'PARAM4', 'PARAM5') can also be written as parse_json('["PARAM3", "PARAM4", "PARAM5"]').
Similarly, object_construct('PARAM3_NAME', 'PARAM3_VALUE', 'PARAM10_NAME', 'PARAM10_VALUE') can also be written as parse_json('{"PARAM3_NAME": "PARAM3_VALUE", "PARAM10_NAME": "PARAM10_VALUE"}').
Neither of these alternatives gives us anything especially useful unless you just like parse_json() more than the other two functions.
Also, I am not sure if this has always worked (maybe Greg Pavlik knows?), but the notation for these variant types can be abbreviated a little by constructing an object with {} or an array with [], which makes the calls slightly cleaner and more readable.
To explore the notations that Snowflake will accept, here are examples of code that will work:
-- Call the SP using different notations:
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', (select array_construct('PARAM3', 'PARAM4', 'PARAM5'))); -- Works, but the subquery makes the notation awkward & hard to read.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', (select ['PARAM3', 'PARAM4', 'PARAM5'])); -- Works, but the subquery makes the notation awkward & hard to read.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', ['PARAM3', 'PARAM4', 'PARAM5']); -- This also works & is easy to read.
call VARIABLE_SIGNATURE('PARAM1', 'PARAM2', {'PARAM3_NAME': 'PARAM3_VALUE', 'PARAM10_NAME': 'PARAM10_VALUE'}); -- This also works & is easy to read.
I have copied some code from an example for accessing an SQLite database. It uses an agent to handle the returned rows:
check_db (input_line: STRING)
    local
        df_db: SQLITE_DATABASE
        df_db_query: SQLITE_QUERY_STATEMENT
        test_val: STRING
    do
        test_val := "whatever"
        create df_db.make_open_read_write ("Large.db")
        create df_db_query.make ("SELECT * FROM test_table WHERE test_val =%
            % :TEST_VAL%
            % ;", df_db)
        check
            df_db_query_is_compiled: df_db_query.is_compiled
        end
        df_db_query.execute_with_arguments (agent (returned_row: SQLITE_RESULT_ROW): BOOLEAN
            do
                if returned_row.is_null (1) then
                    insert_into_db
                end
            end,
            << create {SQLITE_STRING_ARG}.make (":TEST_VAL", test_val) >>)
    end -- check_db
The problem I have is that I would like to pass input_line to the procedure insert_into_db.
The inline agent used by execute_with_arguments isn't able to see any variables outside its own scope, but I presume there must be a way to pass an extra parameter to it? Everything I have tried simply refuses to compile, with syntax errors.
In this case, I simply want to add a database entry if it doesn't already exist, but I can easily see the case where I would want to send the returned row along with some extra data to another procedure, so it must be doable.
As you correctly point out, local variables in Eiffel are not, at the moment, automatically passed into inline agents. The solution is to add explicit formal arguments to the inline agent and to pass the corresponding actual arguments to it.
The inline agent from the example can be adapted as follows (the outer context with the argument input_line is omitted for brevity):
agent (returned_row: SQLITE_RESULT_ROW; s: STRING): BOOLEAN
    do
        -- `s` is attached to `input_line` here.
        if returned_row.is_null (1) then
            insert_into_db
        end
    end (?, input_line)
In addition to the formal argument s, which will get the value of input_line, there is now an explicit list of actual arguments, (?, input_line). The question mark denotes an open argument that will be passed to the agent by execute_with_arguments as before; input_line stands for a closed argument.
When the list has no closed arguments, as in the original code, it can be omitted. However, one could have written (?) after the keyword end of the inline agent in the original code to be absolutely explicit.
First attempt at a CLR Integration, so I can use Regex tools.
I create the assembly:
CREATE ASSEMBLY SQLRegexTools
FROM 'D:\CODE\SQLRegexTools\SQLRegexTools\bin\Release\SQLRegexTools.dll'
This succeeds, and the assembly appears in SELECT * FROM sys.assemblies.
But no records are returned by SELECT * FROM sys.assembly_modules.
And when I try to CREATE FUNCTION to call one of the methods,
CREATE FUNCTION RegExMatch(@pattern varchar(max)
, @SearchIn varchar(max)
, @options int)
RETURNS varchar
EXTERNAL NAME SQLRegexTools.Functions.RegExMatch
I get an error: Msg 6505, Could not find Type 'Functions' in assembly 'SQLRegexTools'.
The class name of the VB module is "Functions". Why is this called a type in the error, and why might I not be seeing anything in the modules?
This error occurs when SQL Server is unable to resolve the name as provided. Try the snippet below to see if this resolves the issue.
CREATE FUNCTION RegExMatch(@pattern varchar(max)
, @SearchIn varchar(max)
, @options int)
RETURNS varchar
AS EXTERNAL NAME SQLRegexTools.[Functions].RegExMatch
To SQL Server, the name that resolves an individual function or method in native code is essentially indistinguishable from a multipart object name (e.g. a type), which is why the error message says "Type".
An additional note: native code that uses nested namespaces must fully qualify them within a single pair of brackets. For example, if RegExMatch were located at SQLRegexTools.A.B.C.RegExMatch, you would reference it as SQLRegexTools.[A.B.C].RegExMatch when using EXTERNAL NAME in SQL Server.
That nails it, Joey. Thank you.
There are several gotchas in this process that I learned about after much sleuthing and testing. They are often buried in long posts covering the entire process, so I will summarize them here:
As Joey explained, the fully qualified name is needed. To be more complete...
Example:
[SQLRegexToolsASM].[SQLRegexToolsNS.RegexFunctionsClass].RegExMatch
1. SQLRegexToolsASM is the assembly name given in the CREATE ASSEMBLY step.
2. SQLRegexToolsNS is the Root Namespace from the properties of the assembly project. If you declared a namespace or two, they must be included in their proper order, after the root namespace and before the class name, all enclosed in one pair of square brackets together with the class name: assembly.[rootnamespace.namespace.namespace.classname].methodname
3. RegexFunctionsClass is the Class Name you assigned to the class, declared something like: Public Class RegexFunctionsClass
4. RegExMatch is the name of the method defined in your VB or C# assembly.
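Putting the pieces together, a hedged sketch using the names from the example above (the method's parameter list and return type are assumptions):
CREATE FUNCTION dbo.RegExMatch(@pattern varchar(max)
, @SearchIn varchar(max)
, @options int)
RETURNS varchar(max)
AS EXTERNAL NAME [SQLRegexToolsASM].[SQLRegexToolsNS.RegexFunctionsClass].RegExMatch;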
We wrote a function get_timestamp() defined as
CREATE OR REPLACE FUNCTION get_timestamp()
RETURNS integer AS
$$
SELECT (FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 10) - 13885344000)::int;
$$
LANGUAGE SQL;
This was used on INSERT and UPDATE to set or update a value in the created and modified fields of the database record. However, we found that when adding or updating records consecutively, it returned the same value.
On inspecting the function in pgAdmin III, we noted that after running the SQL to build the function, the keyword IMMUTABLE had been injected after the LANGUAGE SQL clause. The documentation states that the default is VOLATILE ("If none of these appear, VOLATILE is the default assumption"), so I am not sure why IMMUTABLE was injected; however, changing it to STABLE has solved the issue of repeated values.
NOTE: As stated in the accepted answer, IMMUTABLE is never added to a function by pgAdmin or Postgres and must have been added during development.
I am guessing that the function was being evaluated once and the result cached as an optimization, since marking it IMMUTABLE tells the Postgres engine that the return value should not change given the same (empty) parameter list. However, when not used within a trigger but directly in the INSERT statement, the function would return a distinct value five times before returning the same value from then on. Is this due to some optimisation algorithm that says something like "if an IMMUTABLE function is used more than five times in a session, cache the result for future calls"?
Any clarification on how these keywords should be used in Postgres functions would be appreciated. Is STABLE the correct option for us, given that we use this function in triggers, or is there something more to consider? For example, the docs say:
(It is inappropriate for AFTER triggers that wish to query rows
modified by the current command.)
But I am not altogether clear on why.
The key word IMMUTABLE is never added automatically by pgAdmin or Postgres. Whoever created or replaced the function did that.
The correct volatility for the given function is VOLATILE (also the default), not STABLE. Otherwise it wouldn't make sense to use clock_timestamp(), which is VOLATILE, in contrast to now() or CURRENT_TIMESTAMP, which are STABLE: those return the same timestamp within the same transaction. The manual:
clock_timestamp() returns the actual current time, and therefore its
value changes even within a single SQL command.
The manual warns that function volatility STABLE ...
is inappropriate for AFTER triggers that wish to query rows modified
by the current command.
... because repeated evaluation of the trigger function can return different results for the same row. So: not STABLE.
You ask:
Do you have an idea as to why the function returned correctly five
times before sticking on the fifth value when set as IMMUTABLE?
The Postgres Wiki:
With 9.2, the planner will use specific plans regarding to the
parameters sent (the query will be planned at execution), except if
the query is executed several times and the planner decides that the
generic plan is not too much more expensive than the specific plans.
Bold emphasis mine. This doesn't seem to make sense for an IMMUTABLE function without input parameters, but the false label is overridden by the VOLATILE function in the body (which voids function inlining): a different query plan can still make sense.
Related:
PostgreSQL Stored Procedure Performance
Aside
trunc() is slightly faster than floor() and does the same here, since positive numbers are guaranteed:
SELECT (trunc(EXTRACT(EPOCH FROM clock_timestamp()) * 10) - 13885344000)::int
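Putting it together, a sketch of the function as it should have been declared (VOLATILE is the default and could be omitted; it is spelled out here for clarity):
CREATE OR REPLACE FUNCTION get_timestamp()
  RETURNS integer AS
$$
SELECT (trunc(EXTRACT(EPOCH FROM clock_timestamp()) * 10) - 13885344000)::int;
$$
LANGUAGE sql VOLATILE;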
I've been converting an Oracle schema to a SQL Server one and got the following error:
Invalid use of a side-effecting operator 'SET COMMAND' within a function.
In my case modifying the database involved this
set @originalDateFirst = @@DATEFIRST;
set datefirst 1;
set @DayOfWeek = datepart(weekday, @DATE); -- 1 to 5 = Weekday
set datefirst @originalDateFirst;
Ideally this wouldn't have modified any session state, but the DATEPART function depends on the DATEFIRST setting.
I'm not really from a database background, so I was slightly baffled by this, but reading other answers it looked like all I needed to do was swap the word FUNCTION for PROCEDURE and I'd be away. However, I then got the following error:
Incorrect syntax near 'RETURNS'.
Reading around a bit, it turns out stored procedures aren't allowed to return anything they like, only integers. And those integers normally have the same semantics as a console application's return code: 0 is success and anything else is an error.
Luckily, the type I wanted to return was an integer, so fixing the next error:
Incorrect syntax near 'RETURNS'.
Involved just removing
RETURNS INTEGER
from the procedure. However, I'm unsure whether this error-code interpretation has any weird side effects that will be outside of my control. The function actually just returns either 0 or 1 as a true/false flag (where 1 is true and 0 is false, as you might expect), so one of my return values would count as an "error".
What if any are the consequences of piggybacking on the return code of a procedure rather than using an out parameter? Is it just a bad practice? If it's safe to do this I'd certainly prefer to so I don't need to change any calling code.
This isn't an answer to your question as posed, but may be a better solution to the overall problem.
Rather than having to rely on a particular DATEFIRST setting, or changing the DATEFIRST setting, why not use an expression that always returns reliable results no matter what the DATEFIRST setting is.
For example, this expression:
select (DATEPART(weekday,GETDATE()) + 7 - DATEPART(weekday,'20140406')) % 7
always returns 1 on Mondays, 2 on Tuesdays, ..., 5 on Fridays. No matter what settings are in effect.
So, your entire original block of 4 lines of code could just be:
set @DayOfWeek = (DATEPART(weekday, @Date) + 7 -
DATEPART(weekday, '20140406')) % 7; -- 1 to 5 = Weekday
And now you should be able to continue writing it as a function rather than a stored procedure.
If it's safe to do this I'd certainly prefer to so I don't need to change any calling code.
Which you would have to do if you did change your function into a stored procedure. There's no syntax where you can look at the call and ever be in doubt of whether a stored procedure or a function is being invoked - they always use different syntaxes. A procedure is executed by being the first piece of text in a batch or by being preceded by the EXEC keyword and no parentheses.
A function, on the other hand, always has to have parentheses applied when calling it, and must appear as an expression within a larger statement (such as SELECT). You cannot EXEC a function, nor call one by it being the first piece of text in a batch.
An OUT parameter can be of (almost) any valid datatype; RETURN is always an int, and not necessarily 0 or 1.
Because you can't use a procedure as a query source (it's not a table), to consume a return value from a procedure you declare a variable and exec the procedure like this:
create procedure p as
-- some code
return 13
go
declare @r int
exec @r = p
select @r
I wouldn't call it piggybacking; it's a regular way to return a success/error code, for example. But how you interpret the return value is entirely up to the calling code.
Functions, on the other hand, can be used as a query source if table-valued, or as a scalar value in a select list or where clause, etc. But you can't modify data inside functions, and there are other restrictions on them (as you've learned already). Furthermore, functions can have a nasty impact on performance (except inline table-valued functions, which are pretty much safe to use).
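For completeness, a minimal sketch of the OUTPUT-parameter alternative (the procedure and parameter names here are made up for illustration; the weekday expression is the DATEFIRST-independent one from the other answer):
create procedure GetDayOfWeek
    @Date date,
    @DayOfWeek int output
as
begin
    -- '20140406' is a Sunday, so the result is stable under any DATEFIRST
    set @DayOfWeek = (DATEPART(weekday, @Date) + 7
                      - DATEPART(weekday, '20140406')) % 7;
end
go
declare @dow int
exec GetDayOfWeek @Date = '20240102', @DayOfWeek = @dow output
select @dow  -- 2, since 2024-01-02 is a Tuesday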