Snowflake - system/built-in stored procedures

Is there any stored procedure shipped by Snowflake?
SHOW PROCEDURES IN ACCOUNT;
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) WHERE "is_builtin" != 'N';
-- 0 rows
SHOW FUNCTIONS IN ACCOUNT;
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) WHERE "is_builtin" != 'N';
-- 571
We have quite an extensive function list, but I am unable to locate a single stored procedure.
Some of the functions (e.g. SYSTEM$CANCEL_ALL_QUERIES) have side effects, which means they could be stored procedures (they even support execution via CALL):
SELECT SYSTEM$CANCEL_ALL_QUERIES(CURRENT_SESSION()::INT);
CALL SYSTEM$CANCEL_ALL_QUERIES(CURRENT_SESSION()::INT);
Is there any rationale behind this approach?

Snowflake has been an evolving product over the last six years. When it was first released there was no support for dynamic SQL, aka procedures, and there was very little support for database introspection like you have in the system tables of PostgreSQL.
So many of the "do X" or "tell me about Y" operations were done via rather ugly function calls that clearly showed you were escaping the sandpit (as compared to PostgreSQL, where DB health/state is also exposed as tables). I remember our team coding up support to read results from the SHOW commands because RESULT_SCAN did not exist.
So the platform has changed a lot, and with the mental model of "functions only read, procedures alter", it seems not everything fits that model yet.
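For what it's worth, the RESULT_SCAN pattern the question uses is exactly what replaced that home-grown code; here is a sketch listing the SYSTEM$ escape-hatch functions (the lower-case column names "name", "arguments" and "is_builtin" come from the SHOW output):
SHOW FUNCTIONS IN ACCOUNT;
SELECT "name", "arguments", "is_builtin"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "name" LIKE 'SYSTEM$%'
ORDER BY "name";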

Related

What is the return value of @@ProcId in an ad-hoc SQL query?

I've been looking at logging procedure executions on our reporting database to a table, and aim to come up with a generic snippet of code that can be copied into any proc we want to log.
The above led me to play around with @@PROCID. The Microsoft documentation explains that this will provide the object ID of the proc, UDF, or trigger within which it is contained. That makes sense, but I'm also seeing it return a value when run directly from a new query window. I've not been able to relate this integer to an object ID in the database, so I have no idea what this ID represents. I'm sysadmin on the server I'm trying this on, so there shouldn't be any permission restrictions.
I haven't managed to find anything online about this - the only search result which looked relevant is on a login restricted SAP support forum.
Use Master
select @@procid -- returns an integer
select object_name(@@procid) -- NULL
select * from sys.objects where object_id = @@ProcId -- 0 rows
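For context, the kind of generic logging snippet I'm aiming for looks roughly like this (the log table dbo.ProcExecutionLog is just a hypothetical name):
CREATE TABLE dbo.ProcExecutionLog (          -- hypothetical log table
    LogId      INT IDENTITY PRIMARY KEY,
    ProcId     INT,
    ProcName   SYSNAME NULL,
    ExecutedAt DATETIME2 DEFAULT SYSDATETIME()
);
-- pasted near the top of any proc we want to log:
INSERT INTO dbo.ProcExecutionLog (ProcId, ProcName)
VALUES (@@PROCID, OBJECT_NAME(@@PROCID));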
While this isn't documented, the value corresponds to the objectid attribute of the cached query plan, as returned by sys.dm_exec_plan_attributes. The meaning of that is documented: "For plans of type "Adhoc" or "Prepared", it is an internal hash of the batch text."
To confirm, the following query returns the text of the query itself (and thus serves as a form of quine for SQL Server, albeit one that cheats as it inspects runtime values):
SELECT t.[text]
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t
CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) a
WHERE a.attribute = 'objectid' AND a.value = @@PROCID
It depends what tool you are using to submit the command. Many tools will create a temporary stored procedure containing your commands (using an ODBC prepared statement, for example) and then run that procedure.
Speculating, it may be that the tool is detecting that the statement is unchanged and is therefore re-using the previous prepared statement. In this case SQL Server would not be involved; it would be the client library.
Alternatively, it may be that the server is detecting that the SQL is unchanged, and the preserved proc ID is a consequence of the query-plan caching system. (SQL Server attempts to detect repeated ad-hoc statements and optimises by re-using the plans for them.)
Either way, you should consider this a curiosity, not something to rely on for correct operation of your system, as it may well change with updates to SQL Server or your client library.

Getting data of temp table while debugging

While debugging, I am unable to watch a temp table's values in SQL Server 2012. I can see all of my variables' values and can even print them, but I am struggling with the temp tables. Is there any way to watch a temp table's values?
SQL Server provides the concept of temporary tables, which help the developer in a great way. These tables can be created at runtime and support all the operations that a normal table can. However, depending on the table type, their scope is limited. These tables are created inside the tempdb database.
While debugging, you can pause the SP at some point; if you add a SELECT statement to your SP before the DROP TABLE statement, the # table is available for querying.
select * from #temp
I placed this code inside my stored procedure and I am able to see the temp table contents inside the "Locals" window.
INSERT INTO #temptable (columns) SELECT columns FROM sometable; -- populate your temp table
-- for debugging, comment out in production
DECLARE @temptable XML = (SELECT * FROM #temptable FOR XML AUTO); -- now view @temptable in the Locals window
This works on older SQL Server 2008, but newer versions also support a friendlier FOR JSON output. Credit: https://stackoverflow.com/a/6748570/1129926
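On SQL Server 2016 or later, the same trick with FOR JSON would presumably look like this (untested sketch):
DECLARE @temptable NVARCHAR(MAX) = (SELECT * FROM #temptable FOR JSON AUTO); -- view @temptable in the Locals window as before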
I know this is old; I've also been trying to make this work so that I can view temp table data as I debug my stored procedure. So far nothing works.
I've seen many links to methods on how to do this, but ultimately they don't work the way a developer would want them to work. For example: suppose one has several processes in the stored procedure that update and modify data in the same temp table; there is no way to see those updates on the fly for each process in the SP.
This is a VERY common request, yet no one seems to have a solution other than don't use stored procedures for complex processing due to how difficult they are to debug. If you're a .NET Core/EF 6 developer and have the correct PK/FK relationships set up for the database, you shouldn't really need stored procedures at all, as it can all be handled by EF 6 and debug code to view data results in your entities/models directly (usually in a web API using models/entities).
Trying to retrieve the data from tempdb is not possible, even with the same connection (as has been suggested).
What is sometimes used is:
PRINT '#temptablename'
SELECT * FROM #temptablename
Dotted throughout the code; you can add a debug flag to the SP and selectively output the debug data. Not ideal at all, but it works for many situations.
But this MUST already be in the stored procedure before execution (not during), and you must remember to remove the code prior to deployment to a production environment.
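A minimal sketch of that pattern (the procedure, table and column names are only placeholders):
CREATE PROCEDURE dbo.MyProc      -- placeholder name
    @debug BIT = 0
AS
BEGIN
    CREATE TABLE #work (id INT, amount DECIMAL(10, 2));
    INSERT INTO #work (id, amount) VALUES (1, 10.00), (2, 20.00);

    IF @debug = 1
    BEGIN
        PRINT '#work after initial load';
        SELECT * FROM #work;     -- extra result set, only emitted when @debug = 1
    END;

    -- ... further processing against #work ...
    DROP TABLE #work;
END;
Call it with EXEC dbo.MyProc @debug = 1; while investigating, and leave the default @debug = 0 everywhere else.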
I'm surprised that in 2022 we still have no solution to this other than don't use complex stored procedures or use .NET Core/EF 6, which in my humble opinion is the best approach for 2022, since SSMS and other tools like dbForge and Redgate can't accomplish this either.

Performance of Postgresql stored procedures/functions in a multi-tenant environment that has one db with many schemata (one for each tenant)

I am new to PostgreSQL and I am trying to figure out some details about stored procedures (which I think are actually called functions in PostgreSQL) when used in a multi-schema environment.
The application I have in mind involves a multi-tenant DB design where one schema is used for each tenant and all schemata, which have the same table structure and names, are part of the same database. As far as I know from DBs in general, stored procedures/functions are pre-compiled and therefore faster, so I would like to use them for performing operations on each schema's tables by sending the required parameters from the application server instead of sending a list of SQL commands. In addition, I would like to have a SINGLE set of functions that implement all the SELECT (including JOIN type), INSERT, UPDATE, etc. operations on the tables of each schema. This will make it easy to perform changes in each function and avoid SQL code replication and redundancy. As I found out, it is possible to create a set of functions in a schema s0 and then create s1, s2, ... schemata (having all the same tables) that use these functions.
For example, I can create a template schema named s0 (identical to all the others) and create an SQL or PL/pgSQL function that belongs to this schema and contains operations on the schema's tables. In this function, the table names are written without the schema prefix, i.e.
first_table and not s0.first_table
An example function could be:
CREATE FUNCTION sel() RETURNS BIGINT
AS 'SELECT count(a) from first_table;'
LANGUAGE SQL;
As I have tested, this function works well. If I move to schema s1 by entering:
set search_path to s1;
and then call the function again, the function acts upon s1 schema's identically named table first_table.
The function could also take the parameter path so it can be called with a schema name, together with a command to change the search_path, similar to this:
CREATE OR REPLACE FUNCTION doboth(path TEXT, is_local BOOLEAN DEFAULT false) RETURNS BIGINT AS $$
SELECT set_config('search_path', regexp_replace(path, '[^\w ,]', '', 'g'), is_local);
SELECT count(a) from first_table;
$$ LANGUAGE sql;
as shown in the proposed solution in PostgreSQL: how do I set the search_path from inside a function?
However, when I tried this and called the function for a schema, I noticed that the second SELECT of the function was executed before the first, which led to the second SELECT being executed on the wrong schema! This was really unexpected. Does anybody know the explanation for this behavior?
In order to bypass this issue, I created a plpgsql function that does the same thing and it worked without any execution order issues:
CREATE OR REPLACE FUNCTION doboth(path TEXT, is_local BOOLEAN DEFAULT false) RETURNS BIGINT AS $$
DECLARE result BIGINT;
BEGIN
PERFORM set_config('search_path', regexp_replace(path, '[^\w ,]', '', 'g'), is_local);
SELECT count(a) from first_table INTO result;
RETURN result;
END
$$ LANGUAGE plpgsql;
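For completeness, a call then looks like this (s1 being one of the tenant schemata; passing true keeps the search_path change local to the current transaction):
SELECT doboth('s1', true); -- counts rows in s1.first_table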
So, now some questions about performance this time:
1) Apart from a) having the selection of the schema to operate on and the specified operation on that schema in one transaction, which is necessary for my multi-tenant implementation, and b) bundling SQL commands together and avoiding some extra data exchange between the application server and the DB server, which is beneficial, do PostgreSQL functions have any performance benefits over executing the same code as separate SQL commands?
2) In the described multi-tenant scenario with many schemata and one DB, does a function that is defined once and called against any schema identical to the one it was defined in lose any of its performance benefits (if any)?
3) Is there any difference in performance between an SQL function and a PL/pgSQL function that contains the same operations?
Before I answer your questions, a remark about your SQL function.
It does not fail because the statements are executed in the wrong order, but because both queries are parsed before the first one is executed. The error message you get is somewhat like
ERROR: relation "first_table" does not exist
[...]
CONTEXT: SQL function "doboth" during startup
Note the “during startup”.
Answers
You may experience a slight performance boost, particularly if the SQL statements are complicated, because the plans of SQL statements in a PL/pgSQL function are cached for the duration of a database session or until they are invalidated.
If the plan for the query is cached by the PL/pgSQL function, but the SQL statement calling the function has to be planned every time, you might actually be worse off from a performance angle because of the overhead of executing the function.
Whenever you call the function with a different schema name, the query plan will be invalidated and has to be created anew. So if you change the schema name for every invocation, you won't gain anything.
SQL functions don't cache query plans, so they don't perform better than the plain SQL query.
Note, however, that the gains from caching simple SQL statements in functions are not tremendous.
Use functions that just act as containers for SQL statements only if it makes life simpler for you, otherwise use plain SQL.
Do not focus only on performance during design, but on a good architecture and a simple design.
If the same statements keep repeating over and over, you might gain more performance using prepared statements than using functions.
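For reference, the prepared-statement route for the running example would look roughly like this (prepared statements live only for the current session):
PREPARE count_first AS SELECT count(a) FROM first_table;
EXECUTE count_first;      -- reuses the plan prepared above
DEALLOCATE count_first;   -- optional; dropped automatically at session end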
Firstly, I do not really believe there can be any issues with line execution order in functions. If you have any issues, it's your code not working, not Postgres.
Secondly, multi-tenant behavior is well implemented with set search_path to s1, s0; (see the sketch at the end of this answer). There is usually no need for switching anything inside procedures.
Thirdly, there are no performance benefits in using stored procedures except for minimizing data flows between DB and the application. If you consider a query like SELECT count(*) FROM mytable WHERE somecolumn = $1 there is absolutely nothing you can optimize before you know the value of $1.
And finally, no, there is no significant difference between functions in SQL and PL/pgSQL. The most time is still consumed by reading through tables, so focus on perfecting that.
Hope that clarifies the situation. Also, you may want to consider the security benefits of stored procedures. Just a hint.
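A minimal sketch of the search_path approach from the second point above, reusing sel() and the s0/s1/s2 schemata from the question:
SET search_path TO s1, s0;   -- tenant tables resolve first, shared functions in s0 second
SELECT sel();                -- first_table is found in s1, sel() in s0
SET search_path TO s2, s0;
SELECT sel();                -- same function, now counting s2.first_table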

Parse all stored procedures in a database

Does anyone know of a way to verify the correctness of the queries in all stored procedures in a database?
I'm thinking of the scenario where, if you modify something in a code file, simply doing a rebuild would show you compilation errors that point you to the places where you need to fix things. In a database scenario, say you modify a table and remove a column which is used in a stored procedure; you won't know anything about this problem until the first time that procedure runs.
What you describe is what unit testing is for. Stored procedures and functions often require parameters to be set, and if the stored procedure or function encapsulates dynamic SQL, there's a chance that a corner case is missed.
Also, all you mention is checking for basic errors, nothing about validating the data returned. For example, I can change the precision on a numeric column...
This also gets into the basic testing that should occur for the immediate issue, and regression testing to ensure there aren't unforeseen issues.
You could create all of your objects with SCHEMABINDING, which would prevent you from changing any underlying tables without dropping and recreating the views and procedures built on top of them.
Depending on your development process, this could be pretty cumbersome. I offer it as a solution though, because if you want to ensure the correctness of all procedures in the db, this would do it.
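As an illustration (table and view names here are hypothetical), a schema-bound view blocks exactly the kind of change the question worries about:
CREATE TABLE dbo.Orders (OrderId INT PRIMARY KEY, Amount DECIMAL(10, 2));
GO
CREATE VIEW dbo.vOrderAmounts
WITH SCHEMABINDING
AS
SELECT OrderId, Amount FROM dbo.Orders;
GO
-- This now fails, because the view is schema-bound to the column:
-- ALTER TABLE dbo.Orders DROP COLUMN Amount;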
I found this example on MSDN (SQL Server 2012). I guess it can be used in some scenarios:
USE AdventureWorks2012;
GO
SELECT p.name, r.*
FROM sys.procedures AS p
CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, 0) AS r;
Source: sys.dm_exec_describe_first_result_set_for_object
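If I read its documentation correctly, the same DMF also reports binding errors per procedure (error_number, error_message and related columns are part of its result set), so you can list just the broken ones:
SELECT p.name, r.error_number, r.error_message
FROM sys.procedures AS p
CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, 0) AS r
WHERE r.error_number IS NOT NULL;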

Recompile stored procs?

Is there a way to re-compile or at least 'check compile' stored procedures en masse? Sometimes we'll make schema changes - add or drop a column, etc. - and do our best to identify affected procs, only to be bitten by one we missed, which pukes when it runs next. SQL Server 2k5 or 2k8.
I understand your question as 'when I make a schema change, I want to validate all procedures, to check that they still execute correctly with the new schema'. I.e. if you drop a column that is referenced in a SELECT in a procedure, then you want it flagged as requiring changes. So, specifically, I do not understand your question as 'I want the procedure to recompile on next execution', since that job is taken care of for you by the engine, which will detect the metadata version change associated with any schema alteration and discard the existing cached execution plans.
My first observation is that what you describe in your question is usually the job of a TEST and you should have a QA step in your deployment process that validates the new 'build'. The best solution you could have is to implement a minimal set of unit tests that, at the very least, iterates through all your stored procedures and validates the execution of each for correctness, in a test deployment. That would pretty much eliminate all surprises, at least eliminate them where it hurts (in production, or at customer site).
Your next best option is to rely on your development tools to track these dependencies. The Visual Studio Database 2008 Database Edition provides such functionality out-of-the box and it will take care of validating any change you make in the schema.
And finally, your last option is to do something similar to what KM suggested: automate an iteration through all your procedures depending on the modified object (and all procedures depending on the dependent ones, and so on and so forth recursively). It won't suffice to mark the procedures for recompilation; what you really need is to run ALTER PROCEDURE to trigger a parsing of its text and a validation of the schema (things are a bit different in T-SQL vs. your usual language compile/execute cycle; the 'compilation' per se occurs only when the procedure is actually executed). You can start by iterating through sys.sql_dependencies to find all dependencies of your altered object, and also find the 'module definition' of the dependencies from sys.sql_modules:
with cte_dep as (
select object_id
from sys.sql_dependencies
where referenced_major_id = object_id('<your altered object name>')
union all
select d.object_id
from sys.sql_dependencies d
join cte_dep r on d.referenced_major_id = r.object_id
)
, cte_distinct as (
select distinct object_id
from cte_dep)
select object_name(c.object_id)
, c.object_id
, m.definition
from cte_distinct c
join sys.sql_modules m on c.object_id = m.object_id
You can then run through the dependent 'modules' and re-create them (i.e. drop them and run the code in the 'definition'). Note that a 'module' is more generic than a stored procedure and also covers views, triggers, functions, rules, defaults and replication filters. Encrypted 'modules' will not have the definition available, and to be absolutely correct you must also account for the various settings captured in sys.sql_modules (ANSI nulls, schema binding, execute as clauses etc.).
If you use dynamic SQL, that cannot be verified. It will not be captured by sys.sql_dependencies, nor will it be validated by 're-creating' the module.
Overall I think your best option, by a large margin, is to implement the unit tests validation.
If you have problems changing a table and breaking stored procedures try sp_depends:
sp_depends [ @objname = ] '<object>'
<object> ::=
{
    [ database_name. [ schema_name ] . | schema_name. ]
    object_name
}
and identify them before they break. Use it this way:
EXECUTE sp_depends YourChangedTableName
Also, you could use sp_recompile:
EXEC sp_recompile YourChangedTable
but that only marks the associated stored procedures for recompilation the next time they are run.
You could use Management Studio or your source control to generate a concatenated CREATE script of all procedures into a single file and then run that.
I know what you mean, and in many scenarios I recognize your need. You might have a look at sp_refreshsqlmodule.
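A rough sketch of driving sp_refreshsqlmodule over every non-schema-bound module (the cursor and error handling are just one way of wiring it up, not the only one):
DECLARE @name NVARCHAR(776);
DECLARE module_cursor CURSOR FOR
    SELECT QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name)
    FROM sys.sql_modules m
    JOIN sys.objects o ON o.object_id = m.object_id
    WHERE m.is_schema_bound = 0;   -- sp_refreshsqlmodule cannot refresh schema-bound modules
OPEN module_cursor;
FETCH NEXT FROM module_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        EXEC sys.sp_refreshsqlmodule @name = @name;
    END TRY
    BEGIN CATCH
        PRINT @name + ': ' + ERROR_MESSAGE();   -- refresh failed; likely broken by the schema change
    END CATCH;
    FETCH NEXT FROM module_cursor INTO @name;
END;
CLOSE module_cursor;
DEALLOCATE module_cursor;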
Good luck, Ron
You may be able to use DBCC FREEPROCCACHE
http://msdn.microsoft.com/en-us/library/ms174283.aspx
Just iterate through them, after getting the list from sysobjects, and run sp_recompile:
Here is a link showing a sample script:
http://database.ittoolbox.com/groups/technical-functional/sql-server-l/recompile-all-stored-procedures-2764478
