Snowflake stored procedure queries not showing up in Query Profiler - snowflake-cloud-data-platform

I have a Snowflake stored procedure that has been running for 8 hours. When I check the query profiler to see which query is taking so long, I see only a single entry for the CALL statement of the stored procedure.
However, from the logs I know that an insert statement is running, which looks like this:
insert into Snowflake_table
select *
from External_table(over S3 bucket)
I want to find out why reading from the external table takes so long, but the insert query does not show up in the query profiler. I have tried querying information_schema.Query_history, but it shows no running query other than the CALL statement:
SELECT *
FROM Table(information_schema.Query_history_by_warehouse('ANALYTICS_WH_PROD'))
WHERE execution_status = 'RUNNING'
ORDER BY start_time desc;
Please suggest how to find the bottleneck here.

The docs state that queries on INFORMATION_SCHEMA views do not guarantee consistency with respect to concurrent DDL: https://docs.snowflake.com/en/sql-reference/info-schema.html
This means your insert statement may be running without appearing in the result of your query. It could be included, but that is not guaranteed.
You could change the filter to execution_status IN ('SUCCESS', 'FAILED_WITH_ERROR') and check again after the procedure has finished.
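As a rough sketch of that follow-up check, the query below looks at finished statements on the same warehouse. The warehouse name comes from the question; the result limit is an arbitrary choice, and the uppercase status values are the ones Snowflake is assumed to report, so adjust them if your account shows different ones.

```sql
-- Sketch only: run this after the CALL statement has finished.
-- 'ANALYTICS_WH_PROD' is taken from the question; result_limit is an arbitrary assumption.
SELECT query_id,
       query_text,
       execution_status,
       start_time,
       total_elapsed_time          -- elapsed time in milliseconds
FROM TABLE(information_schema.query_history_by_warehouse(
         warehouse_name => 'ANALYTICS_WH_PROD',
         result_limit   => 1000))
WHERE execution_status IN ('SUCCESS', 'FAILED_WITH_ERROR')
ORDER BY start_time DESC;
```

If the insert appears here, its query_id can be opened in the query profile to see where the time was spent.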

conditional stored procedure performance vs. code build statement

Currently I am working on a report system for our data archive.
The aim is to select data for every 1st of a month, every full hour, and so on.
So I have a bunch of parameters to narrow the selection down to a single hour.
To achieve that, I used CASE statements to adjust the SELECT like this:
SELECT
MIN(cd.Timestamp) as Mintime,
--Hours
CASE
WHEN
@SelHour IS NOT NULL
THEN
DATEPART(HOUR, cd.Timestamp)
END as Hour,
... -- more CASES up to DATEPART(YEAR, cd.Timestamp)
FROM dbo.CustomerData cd
... -- filter data and other stuff
This statement works well for me so far, but I am a bit worried about the performance of the stored procedure, because I don't know how the server will behave with this "changing" statement. Depending on the given parameters, the result can vary from 20 rows up to 250,000 rows and more. As far as I know, SQL Server saves the query plan and reuses it for future executions.
If it saves the plan for the 20-row result, the performance for the 250,000-row result is probably pretty poor.
Now I am wondering which is the better approach: using this stored procedure, or building the statement inside my C# backend and passing the "adjusted" statement to SQL Server?
Thanks and greetings
For a 20-row result set it will perform well either way. Returning 250k records to C# code, however, suggests the design should change: loading 250k records into memory and looping over them consumes significant memory, and concurrent requests from different sessions or users will multiply that load.
To address SQL Server reusing the same query plan, you can recompile query plans selectively or on every execution. These are the available options for recompiling an execution plan:
OPTION(RECOMPILE) query hint on the statement
SELECT
MIN(cd.Timestamp) as Mintime,
--Hours
CASE
WHEN
@SelHour IS NOT NULL
THEN
DATEPART(HOUR, cd.Timestamp)
END as Hour,
... -- more CASES up to DATEPART(YEAR, cd.Timestamp)
FROM dbo.CustomerData cd
... -- filter data and other stuff
OPTION(RECOMPILE)
WITH RECOMPILE option in the procedure definition; this recompiles the execution plan on every execution
NOTE: this requires CREATE PROCEDURE permission in the database and ALTER permission on the schema in which the procedure is being created.
CREATE PROCEDURE dbo.uspStoredPrcName
@ParamName varchar(30) = 'abc'
WITH RECOMPILE
AS
...
WITH RECOMPILE option on EXECUTE; this recompiles the plan for that one call
NOTE: this requires EXECUTE permission on the procedure.
EXECUTE uspStoredPrcName WITH RECOMPILE;
GO
sp_recompile System Stored Procedure
NOTE: Requires ALTER permission on the specified procedure.
EXEC sp_recompile N'dbo.uspStoredPrcName';
GO
For more details on recompilation, refer to the Microsoft docs:
https://learn.microsoft.com/en-us/sql/relational-databases/stored-procedures/recompile-a-stored-procedure?view=sql-server-ver15

Temporary tables and constant statement recompilation

I am stress testing a system that is using a temporary table in dynamic SQL. The table is created early on in the transaction and is filled by several dynamic SQL statements in several stored procedures that are executed as part of the batch using statements of the form:
INSERT #MyTable (...)
SELECT ...
where the SELECT statement is reasonably complicated in that it may contain UNION ALL and UNPIVOT statements and refer to several UDFs. All strings are executed using sp_executesql and Parameter Sniffing is enabled.
I have noticed that under load I am seeing a lot of RESOURCE_SEMAPHORE_QUERY_COMPILE waits, where the query text being recompiled is present and identical in several waits at the same time and appears throughout the stress test, which lasts about 5 minutes. The memory consumption on the server usually sits around 60% utilization and there is no limit on how much SQL Server can consume. The limiting factor appears to be CPU, which is constantly at >95% during the test.
I have profiled the server during the test to observe the SQL:StmtRecompile event, which shows that the reason for the recompile is:
5 - Temp table changed
but the temp table is the same every time and there are no DDL statements performed against the table once it has been created, apart from when it is dropped at the end of the batch.
So far, I have tried:
Enabling the "optimize for ad hoc workloads" option
OPTION(KEEPFIXED PLAN)
Changing the dynamic statement to just the SELECT and then using INSERT ... EXEC so the temp table is not in the executed string
All of these have made no difference and the waits persist.
Why would SQL Server think that it needs to recompile these identical queries each time they are executed, and how can I get it to keep and reuse the cached plans it is creating?
Note: I cannot change the temp table to an In-Memory table because sometimes the stored procedures using this may have to query another database on the same instance.
This is using SQL Server 2016 SP1 CU7.
It appears that removing the insertion into the temp table in the dynamic SQL strings improves performance significantly. For example, changing this:
EXEC sp_executesql
N'INSERT #tempTable (...) SELECT ... FROM ...'
where the SELECT statement is non-trivial, to this:
INSERT #tempTable (...)
EXEC sp_executesql
N'SELECT ... FROM ...'
greatly reduced the number of blocks created during compilation. Unfortunately, recompilation of the queries is not avoided; the queries being recompiled are simply much simpler now and therefore less CPU intensive.
I have also found it more performant to create an In-Memory Table Type with the same columns as the temp table, perform the complex insertions into a table variable of that type and perform a single insert from the table variable into the temp table at the end.
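A minimal sketch of that staging pattern follows. The type name, column names, and sample rows are hypothetical, and it assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup; the real type would mirror the columns of the actual temp table.

```sql
-- Hypothetical type and columns; the real type should match #tempTable.
-- Assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup.
CREATE TYPE dbo.StagingRows AS TABLE
(
    Id    INT           NOT NULL,
    Value NVARCHAR(100) NULL,
    INDEX IX_StagingRows_Id NONCLUSTERED (Id)   -- memory-optimized table types need at least one index
)
WITH (MEMORY_OPTIMIZED = ON);
GO

CREATE TABLE #tempTable (Id INT NOT NULL, Value NVARCHAR(100) NULL);

DECLARE @staging dbo.StagingRows;

-- The complex insertions target the table variable
-- (literal rows stand in for the real SELECT statements here).
INSERT @staging (Id, Value)
VALUES (1, N'first'), (2, N'second');

-- A single, simple insert moves the staged rows into the temp table at the end.
INSERT #tempTable (Id, Value)
SELECT Id, Value
FROM @staging;
```

The idea is that only the last, trivial statement touches the temp table, so any recompile triggered by the temp table applies to a cheap statement.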

Different execution count for queries in a stored procedure

I've executed a stored procedure with the following T-SQL code:
exec myStoredProcedure
go 10
After executing the procedure I reviewed the information in sys.dm_exec_query_stats and observed that for some queries in the stored procedure the value in Execution Count is different.
Apparently, the execution plans for some queries in the stored procedure have been run only 7 times out of 10.
The data in the above screenshot is being returned with the following query:
select ...
qs.execution_count [Execution Count]
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp
where st.objectid = object_id('myStoredProcedure')
As you can see, no other execution plan with an execution count of 3 is stored for this procedure, which I would expect to see if the optimizer had decided to run a query with another plan.
In fact, some of the queries with an execution count of 7 are inserts into different temporary tables using SELECT INTO #temptable, but not all of them.
So, my question is why do some queries have a plan which has been executed a smaller number of times than others and how did those queries execute and with what plan?
I'd like to mention that there is no logic in the stored procedure that would generate different execution flows so that some queries do not get executed. (no IFs)
Also, no statistics updates or DML queries have been run in the meantime that would have changed row counts or indexes.
Is my query which goes over the DMV not correct and does not pick up these "rogue" plans? Or has the data been cached in memory / tempdb for the temporary tables and read from there on subsequent executions?
Update:
Added a screenshot containing the column with plan_generation_num requested by @MartinBrown
The Execution_Count field is defined as:
"Number of times that the plan has been executed since it was last compiled."
That would suggest that on the fourth run some of the plans were recompiled. I suspect this happened because the original plans fell out of the cache.
See https://msdn.microsoft.com/en-us/library/ms189741.aspx
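One way to check this, sketched here on the assumption that the procedure is named myStoredProcedure as in the question, is to list each statement's plan_generation_num and creation_time next to its execution count. A higher generation number or a later creation time on the statements with the lower count would be consistent with a recompile partway through the ten runs.

```sql
-- Sketch: per-statement execution counts plus recompile indicators.
SELECT
    SUBSTRING(st.text,
              (qs.statement_start_offset / 2) + 1,
              ((CASE qs.statement_end_offset
                    WHEN -1 THEN DATALENGTH(st.text)   -- -1 means "to the end of the batch"
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset) / 2) + 1) AS statement_text,
    qs.execution_count,
    qs.plan_generation_num,    -- increases each time the statement's plan is recompiled
    qs.creation_time,          -- when the current plan was compiled
    qs.last_execution_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.objectid = OBJECT_ID('myStoredProcedure')
ORDER BY qs.statement_start_offset;
```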

Alter table query plan for sql server 2008

I am trying to view the query plan for an ALTER TABLE query in SQL Server Management Studio for SQL Server 2008.
The ALTER query is something like:
alter table myTable add my_timestamp datetime not null default(getdate())
When I try to see the 'estimated execution plan' for this query, it shows the result: Estimated Operator Cost 0%.
When I try to look at the 'actual execution plan' for the query, no result is shown. How can I see the query plan for this query?
The plan is not available for DDL statements, alas. I assume you want to know whether the statement will scan or update all rows, or whether it is just a metadata operation. The way to find that out is:
Read the docs
Test it
Displaying the execution plan is supported only for Data Manipulation Language (DML) statements. The execution plan is not displayed for Data Definition Language (DDL) statements.
Your query is DDL. Hence the observed behaviour.
Raj

Get schema of proc's select output

I'd like to put the results of a stored proc into a temp table. It seems that the temp table must be defined beforehand and an INSERT INTO will not work.
Anyone know how to get the schema of the recordset being returned from a select statement?
sp_help only gets info on parameters.
You should be able to insert into a temp table without defining the schema using OPENQUERY:
SELECT * INTO #TempTable
FROM OPENQUERY(ServerName, 'EXEC DataBaseName.dbo.StoredProcedureName paramvalues1, paramvalues1')
Where ServerName is the name of your SQL Server instance. See this article for more info
Sometimes you just need to know the schema without creating a table. This command outputs the structure of the resultset without actually executing the stored procedure.
From rachmann on 16 April, 2015 from the Microsoft SQL Server forum article How to get schema of resultset returned by a stored procedure without using OPENQUERY?:
SELECT * FROM sys.dm_exec_describe_first_result_set ('owner.sprocName', NULL, 0) ;
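As a small sketch of how that output can be narrowed to what is needed for writing the temp table definition by hand, the query below returns only the column order, name, type, and nullability. The procedure name owner.sprocName is the placeholder from the answer above; the function is available from SQL Server 2012 onward.

```sql
-- Sketch: describe the first result set of a procedure without executing it.
-- 'owner.sprocName' is a placeholder; replace it with the real procedure name.
SELECT column_ordinal,
       name,
       system_type_name,   -- e.g. int, nvarchar(50)
       is_nullable,
       error_message       -- populated when the result set cannot be determined
FROM sys.dm_exec_describe_first_result_set(N'EXEC owner.sprocName', NULL, 0)
ORDER BY column_ordinal;
```

If the metadata cannot be determined, the column details come back NULL and error_message explains why.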
Can you execute the logical content including INSERT INTO in a query window? That should generate a temp table that you can use as a model.
Worst case you build the schema by hand, once, which shouldn't be onerous if you are the one writing the SP.
For the benefit of future documentation, I like to hand-craft DDL in SPs anyway. It helps when debugging to have the schema explicitly at hand.
If you are able, change the stored procedure into a user-defined function.
http://www.scottstonehouse.ca/blog/2007/03/stored-procedures-are-not-parameterized.html
