Snowflake procedure returns duplicates - snowflake-cloud-data-platform

I'm calling procedures via Talend in parallel. Sometimes one procedure runs twice at the same time, which is why I get duplicates. Is there a way to block one of the calls to the procedure?

Related

How to execute sql server query statements in parallel?

I have a scenario in which there is one activity in my Azure Data Factory pipeline. This activity copies data from history tables to archive tables, and a history table can have up to 600 million records. There is a SQL Server stored procedure (SP) in this activity which executes three child SPs using a while loop:
DECLARE @i int = 0;
WHILE @i < 3
BEGIN
    EXEC child_proc;  -- placeholder name for the child SP
    SET @i = @i + 1;
END
The 3 SPs copy data from history table to archive table in SQL DW. This activity is common to 600 pipelines and different activities copy different number of tables.
But, while loop executes the child SPs one by one.
I tried searching for a way to parallelize the 3 SPs but found nothing in SQL Server.
I want to trigger all the child SPs at once. Is there any way I can do this? Any solution in SQL Server, Data Factory, a Python script, or Spark will suffice.
You cannot execute 3 child stored procedures in parallel inside a parent stored procedure. But you can execute the 3 child procedures directly, without any parent procedure.
Please follow the demonstration below, where I executed 3 stored procedures in parallel using Azure Data Factory (ForEach activity):
I have 3 stored procedures sp1, sp2 and sp3 that I want to execute in parallel. I created a parameter (Array) that holds the names of these stored procedures.
This parameter supplies the Items value (@pipeline().parameters.sp_names) of the ForEach activity. In the ForEach activity, leave the Sequential checkbox unchecked and set the batch count to 3.
Now, inside the ForEach activity, create a Stored Procedure activity and create the necessary linked service. When selecting the stored procedure name, check the Edit box and give @item() as the dynamic content for the stored procedure name.
This approach runs the stored procedures in parallel. Compare the outputs when the same pipeline is executed with the ForEach activity set to sequential execution and to batch execution.
With sequential execution: (screenshot not included)
With batch execution: (screenshot not included)

Snowflake orchestration of tasks

I have a batch load process that loads data into a staging database. I have a number of tasks that execute stored procedures which move the data to a different database. The tasks are executed when the SYSTEM$STREAM_HAS_DATA condition is satisfied on a given table.
I have a separate stored procedure that I want to execute only after the tasks have completed moving the data.
However, I have no way to know which tables will receive data and therefore do not know which tasks will be executed.
How can I know when all the tasks that satisfied the SYSTEM$STREAM_HAS_DATA condition are finished and I can now kick off the other stored procedure? Is there a way to orchestrate this step by step process similar to how you would in a SQL job?
There is no automated way but you can do it with some coding.
You may create a stored procedure to check the STATE column of the task_history view to see if the tasks are completed or skipped:
https://docs.snowflake.com/en/sql-reference/functions/task_history.html
You can call this stored procedure periodically using a task (for example, every 5 minutes).
Based on the checks inside the stored procedure (all tasks succeeded, the target SP has not yet been executed today, and so on), you can execute the target stored procedure that needs to run after all tasks have completed.
You can also check the status of each stream with SELECT SYSTEM$STREAM_HAS_DATA('<stream_name>'), which does not consume the stream, or with SELECT COUNT(*) FROM <stream_name>.
Look into using IDENTIFIER for dynamic queries.
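As a rough sketch of the checks described above, the queries below summarize recent task runs by state and test a single stream. The one-hour lookback window, the RESULT_LIMIT value, and the stream name my_stream are assumptions for illustration only; a polling stored procedure would apply its own rules to these results.

```sql
-- Count task runs by state for runs scheduled in the last hour
-- (the one-hour window is an assumption; adjust to your load schedule).
SELECT state, COUNT(*) AS runs
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
         SCHEDULED_TIME_RANGE_START => DATEADD('hour', -1, CURRENT_TIMESTAMP()),
         SCHEDULED_TIME_RANGE_END   => CURRENT_TIMESTAMP(),
         RESULT_LIMIT               => 1000))
GROUP BY state;

-- Check whether a stream still has unconsumed changes.
-- my_stream is a hypothetical stream name.
SELECT SYSTEM$STREAM_HAS_DATA('my_stream');
```

If no run is in the EXECUTING state and the relevant streams report no data, the polling procedure could treat the load as finished and call the target procedure.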

Task with multiple stored procedures

Is there a way to create a task within snowflake to call multiple stored procedures?
For example I have three stored procedures to check for duplicated information over multiple tables, I'd like to call all three through the task without having to create a new SP to loop through them all.
A task can only trigger one SQL statement or one Stored Procedure.
So you have to decide:
One task for each procedure with dependencies between the tasks
One task with a wrapper procedure that calls all the three Stored Procedures (the solution you do not want to have)
I think chaining the tasks is a good solution. You have to use the AFTER clause in your CREATE TASK statement to set up the correct dependencies: https://docs.snowflake.com/en/sql-reference/sql/create-task.html
A task can only call 1 SP so if you don't want to write one SP that calls the others then how about creating a chain of 3 tasks?
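A minimal sketch of such a chain follows. The task names, the warehouse my_wh, the 60-minute schedule, and the procedure names sp_check_dupes_1 to sp_check_dupes_3 are all hypothetical placeholders, not names from the question.

```sql
-- Root task: runs on a schedule and calls the first procedure.
CREATE OR REPLACE TASK check_dupes_1
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'
AS
  CALL sp_check_dupes_1();

-- Child tasks: each runs after its predecessor finishes successfully.
CREATE OR REPLACE TASK check_dupes_2
  WAREHOUSE = my_wh
  AFTER check_dupes_1
AS
  CALL sp_check_dupes_2();

CREATE OR REPLACE TASK check_dupes_3
  WAREHOUSE = my_wh
  AFTER check_dupes_2
AS
  CALL sp_check_dupes_3();

-- Tasks are created suspended; resume the children before the root.
ALTER TASK check_dupes_3 RESUME;
ALTER TASK check_dupes_2 RESUME;
ALTER TASK check_dupes_1 RESUME;
```

Only the root task carries a schedule; the child tasks are triggered by the AFTER dependency.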

Execution order after a stored procedure in SQL Server

I have a stored procedure that does some inserts into a table. If I execute that same stored procedure repeatedly, are the inserts of each execution reflected in the table by the time it ends, or could some inserts occur after the stored procedure has finished and overlap with the execution of the second instance of that stored procedure?
I hope I was clear; if not, please correct me.
Thanks
Whatever work is done in a Stored Procedure, unless explicitly rolled back or automatically rolled back due to an error, will be there when the Stored Procedure exits. Once a Stored Procedure exits, there is no more work that it could be doing.
This means that within a single session, any number of executions of a stored procedure are handled serially -- one after the other, no overlap.
However, across multiple sessions / connections, the work being done in a Stored Procedure certainly can overlap if that same code (Stored Procedure or even ad hoc SQL) is run at the same time across other sessions / connections.
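If overlapping executions across sessions are a problem, one possible guard in SQL Server is an application lock. The sketch below is an illustration only: the procedure name dbo.usp_insert_once, the table dbo.target_table, its column col1, and the lock resource name are hypothetical.

```sql
-- Hypothetical procedure; table, column, and lock names are placeholders.
CREATE PROCEDURE dbo.usp_insert_once
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;

    -- Try to take an exclusive application lock owned by this transaction.
    DECLARE @lock_result int;
    EXEC @lock_result = sp_getapplock
        @Resource    = 'usp_insert_once',
        @LockMode    = 'Exclusive',
        @LockOwner   = 'Transaction',
        @LockTimeout = 0;   -- do not wait if another session holds the lock

    IF @lock_result < 0
    BEGIN
        -- Another session is already running this procedure; skip this call.
        ROLLBACK TRANSACTION;
        RETURN;
    END;

    INSERT INTO dbo.target_table (col1) VALUES (1);

    -- Committing releases the transaction-owned lock.
    COMMIT TRANSACTION;
END;
```

With @LockTimeout = 0 the second caller returns immediately instead of waiting; a positive timeout would instead make the calls run one after the other.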

difference between procedure and stored procedure sql server?

What is the difference between a procedure and a stored procedure on sql server?
There is no difference. There is no concept of "unstored" procedures in SQL Server.
CREATE PROCEDURE
will create a stored procedure.
select * from sys.procedures
will show you the stored procedures.
This is as opposed to sending ad hoc SQL statements or prepared SQL statements.
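For illustration, here is a small sketch that creates a procedure and then finds it in the catalog view. The name dbo.usp_demo is a hypothetical placeholder.

```sql
-- Create a trivial procedure (dbo.usp_demo is a made-up name).
CREATE PROCEDURE dbo.usp_demo
AS
BEGIN
    SELECT 1 AS result;
END;
GO  -- batch separator understood by SSMS and sqlcmd

-- The new procedure now appears in the catalog view.
SELECT name, create_date
FROM sys.procedures
WHERE name = 'usp_demo';
```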
A procedure is a specified series of actions, acts or operations which have to be executed in the same manner in order to always obtain the same result under the same circumstances
A stored procedure is a subroutine available to applications accessing a relational database system. Stored procedures (sometimes called a proc, sproc, StoPro, or SP) are actually stored in the database data dictionary.
In a procedure you have to start the transaction manually, handle the rollback manually, and so on.
In a stored procedure, the database system usually takes care of the main transaction in case of errors. You can even use atomic transactions to keep your information consistent.
Also, a stored procedure executes a bit faster than a single procedure because of the indexing in the database.
If it's an actual procedure, in the database, it's a stored procedure -- regardless of whether people pronounce the "stored" part.
Stored procedures are in opposition to the client's issuing the SQL statements of the procedure one by one. That's what an un-"stored procedure" would be.
