How can I schedule a SQL script in a Snowflake database to run every day, and set the output file name to include the current date? E.g. if the code ran today then the file name should be 20200906*****.csv.gz, and similarly for tomorrow 20200907******.csv.gz.
You could use Snowflake TASKS in order to schedule execution of SQL statements.
A task can execute a single SQL statement, including a call to a stored procedure.
Tasks run according to a specified execution configuration, using any combination of a set interval and/or a flexible schedule using a subset of familiar cron utility syntax.
For your goal I would create a stored procedure, so that you can use variables to manage the changing file name and anything more complex.
SF Doc: https://docs.snowflake.com/en/sql-reference/sql/create-task.html
--create a new task that executes a single SQL statement based on CRON definition
CREATE TASK mytask_hour
WAREHOUSE = mywh
SCHEDULE = 'USING CRON 0 9-17 * * SUN America/Los_Angeles'
TIMESTAMP_INPUT_FORMAT = 'YYYY-MM-DD HH24'
AS
INSERT INTO mytable(ts) VALUES(CURRENT_TIMESTAMP);
--create a new task that executes a Stored Procedure every hour
create task my_copy_task
warehouse = mywh
schedule = '60 minute'
as
call my_unload_sp();
After creating a task, you must execute ALTER TASK … RESUME in order to enable it.
Use SHOW TASKS to check your task's definition/configuration and then query TASK_HISTORY in order to check executions.
Your Snowflake JS Stored Procedure could be something like this:
create or replace procedure SP_TASK_EXPORT()
RETURNS VARCHAR(256) NOT NULL
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
as $$
function getToday_yyyymmdd()
{
var v_out_Today;
var rs = snowflake.execute({ sqlText: `SELECT to_char(current_date, 'yyyymmdd');` });
if( rs.next())
{
v_out_Today = rs.getColumnValue(1); // get current date as yyyymmdd
}
return v_out_Today;
}
var result = new String('Successfully Executed');
var v_Today = getToday_yyyymmdd();
try {
var sql_command = `copy into @unload_gcs/LH_TBL_FIRST` + v_Today + `.csv.gz from ........`; // stage references start with @
var stmt = snowflake.createStatement({sqlText: sql_command});
var res = stmt.execute();
}
catch (err) {
result = "Failed: Code: " + err.code + " | State: " + err.state;
result += "\n Message: " + err.message;
result += "\nStack Trace:\n" + err.stackTraceTxt;
}
return result;
$$;
Before creating and scheduling your task, test your stored procedure by invoking it:
call SP_TASK_EXPORT();
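The date-stamped file name can also be built in plain JavaScript, which makes the naming logic easy to try outside Snowflake. This is only a sketch: `toYyyymmdd` and `buildUnloadTarget` are hypothetical helper names, and it assumes UTC dates are acceptable, whereas the procedure above asks Snowflake for `current_date`.

```javascript
// Formats a Date as yyyymmdd using its UTC fields.
// Assumption: UTC is acceptable; the procedure above uses Snowflake's
// current_date instead, which follows the session time zone.
function toYyyymmdd(date) {
  const yyyy = String(date.getUTCFullYear());
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return yyyy + mm + dd;
}

// Hypothetical helper: builds the unload target for a stage and file prefix.
// Stage references in Snowflake start with "@".
function buildUnloadTarget(stage, prefix, date) {
  return `@${stage}/${prefix}${toYyyymmdd(date)}.csv.gz`;
}

console.log(buildUnloadTarget("unload_gcs", "LH_TBL_FIRST", new Date(Date.UTC(2020, 8, 6))));
// prints "@unload_gcs/LH_TBL_FIRST20200906.csv.gz"
```

Keeping the name-building step separate from the COPY statement means a wrong date format shows up before anything is unloaded.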
Related
I have created a stored procedure that returns a CREATE TABLE SQL statement. I now want to call that procedure and assign the result to a variable, like:
set create_table_statement = call sp_create_stage_table(target_db, table_name);
Snowflake will not let me do this, so is there a way I can?
Context
We have just been handed our new MDP, which is built on AWS S3, dbt and Snowflake. Next week we go into production, but we have 200+ tables and Snowpipes to code out. I wanted to semi-automate this by generating the CREATE TABLE statements from the tables' metadata and then executing those results to create the tables. At the moment we have to run the SQL, copy and paste the results, and then run that, which is fine in dev/pre-production when it's a handful of tables. But with just two of us, it will be a lot of work to get all those tables and pipes created.
So I've found a workaround: a second procedure calls the first one via a SQL string to get the result as a string, then executes that string as a SQL statement. Like:
create or replace procedure sp_create_stage_table("db_name" string, "table_name" string)
returns string
language javascript
as
$$
var sql_string = "call sp_get_create_table_statement('" + db_name + "','" + table_name + "');";
var get_sql_query = snowflake.createStatement({sqlText: sql_string});
var get_result_set = get_sql_query.execute();
get_result_set.next();
var get_query_value = get_result_set.getColumnValue(1);
sql_string = get_query_value.toString();
try {
var main_sql_query = snowflake.createStatement({sqlText: sql_string});
main_sql_query.execute();
return "Stage Table " + table_name + " Successfully created in " + db_name + " database."
}
catch (err){
return "An error occurred! \n error_code: " + err.code + "\n error_state: " + err.state + "\n error_message: " + err.message;
}
$$;
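A variation on this workaround passes the arguments as bind variables instead of concatenating them into the CALL string, so a quote in a name cannot break the statement. This is a sketch under assumptions: `createStageTable` is a hypothetical name, `sf` stands in for the `snowflake` object available inside a JavaScript stored procedure (passed as a parameter only so the logic can be tried outside Snowflake), and it assumes `?` binds are accepted in a CALL statement in your environment.

```javascript
// Sketch: run the generator procedure with bind variables, then run the SQL it returns.
// `sf` is expected to behave like the `snowflake` object in a JavaScript stored procedure.
function createStageTable(sf, dbName, tableName) {
  const generator = sf.createStatement({
    sqlText: "call sp_get_create_table_statement(?, ?)",
    binds: [dbName, tableName], // values are bound, not spliced into the SQL text
  });
  const rs = generator.execute();
  if (!rs.next()) {
    return "No statement returned for " + tableName;
  }
  const ddl = String(rs.getColumnValue(1)); // the generated CREATE TABLE text
  try {
    sf.createStatement({ sqlText: ddl }).execute();
    return "Stage table " + tableName + " successfully created in " + dbName + " database.";
  } catch (err) {
    return "An error occurred: " + err.message;
  }
}
```

Inside a procedure body the call would be `createStageTable(snowflake, db_name, table_name)`.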
It is possible to assign the scalar result of a stored procedure to a session variable. Instead of:
SET var = CALL sp();
The pattern is:
SET var = (SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())));
Sample:
CREATE OR REPLACE PROCEDURE TEST()
RETURNS VARCHAR
LANGUAGE SQL
AS
BEGIN
RETURN 'Result from stored procedure';
END;
CALL TEST();
SET variable = (SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())));
SELECT $variable;
-- Result from stored procedure
I would like to return logging and status messages from a stored procedure to the TASK that calls it.
create or replace procedure status_return()
returns string not null
language javascript
as
$$
var result_status = 'The return status and debug information in string format';
return result_status; // Statement returned for info/debug purposes
$$;
I would like to pass the result of the stored procedure call status_return() back to the task:
-- Create a task that calls the stored procedure every minute
create or replace task call_SP
warehouse = SMALL
schedule = '1 minute'
as
call status_return();
When I query TASK_HISTORY, RETURN_VALUE is always empty.
select *
from table(information_schema.task_history(SCHEDULED_TIME_RANGE_START => dateadd(hours, -5, current_timestamp()) ,
TASK_NAME => 'call_sp'));
How can I view the result of a stored procedure in task_history for SUCCESS, FAILURE, or ERRORS?
I have tried creating a task in the following way, but it failed with errors.
create or replace task call_SP
warehouse = EDS_SMALL
schedule = '1 minute'
as
call system$set_return_value(call status_return());
Can I use JavaScript in tasks to store the result of a stored procedure call in a variable and return it as the task result?
To get a RETURN_VALUE in your TASK_HISTORY, you have to set the return value inside your stored procedure by calling system$set_return_value().
Examples can be found in the Snowflake documentation.
This is what it should look like if you want the return_value field of task_history to contain your result_status variable when the task runs:
create or replace procedure status_return()
returns string not null
language javascript
as
$$
var result_status = 'The return status and debug information in string format';
var rv_stmt = snowflake.createStatement({sqlText:`call system$set_return_value('` + result_status + `');`});
var rv_res = rv_stmt.execute(); // Set return_value
return result_status; // Statement returned for info/debug purposes
$$;
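Because the status text is spliced into the SQL string, a message containing a single quote would break the call. A small escaping helper avoids that. This is a sketch: `toSqlStringLiteral` and `buildSetReturnValueSql` are hypothetical names, and the escaping assumes Snowflake's usual single-quoted string rules (a doubled quote for a quote, a doubled backslash for a backslash).

```javascript
// Hypothetical helper: turns an arbitrary message into a single-quoted SQL literal.
// Backslashes and single quotes are escaped so the message cannot end the literal early.
function toSqlStringLiteral(text) {
  const escaped = String(text)
    .replace(/\\/g, "\\\\") // backslash -> double backslash
    .replace(/'/g, "''");   // single quote -> doubled single quote
  return "'" + escaped + "'";
}

// Builds the statement that records the task's return value.
function buildSetReturnValueSql(status) {
  return "call system$set_return_value(" + toSqlStringLiteral(status) + ");";
}

console.log(buildSetReturnValueSql("Couldn't load 3 rows"));
// prints: call system$set_return_value('Couldn''t load 3 rows');
```

The resulting string can be passed as `sqlText` to `snowflake.createStatement` in the procedure above.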
I am trying to write a Snowflake UDF that accepts a stage name and a specific folder name as input parameters and returns the latest file ID (stripped from the full file name) as the output. Could anyone help me with some simple code to achieve this?
I'm not sure whether you want a UDF or a stored procedure. The syntax to create either is similar, so I think this can help. Here is a stored procedure that fetches the latest staged file from a given stage and path. Be aware of the LIMIT 1 in the query: multiple staged files may share the same last-modified date, while this procedure returns a scalar (single) value.
Stored Procedure Definition
create or replace procedure "MYDB"."MYSCHEMA"."LATEST_STAGED_FILE"(stage_name text, folder text)
returns string not null
language javascript
execute as caller
as
$$
var sql_text = "list @" + STAGE_NAME + "/" + FOLDER; // stage references start with @
var sql_command0 = snowflake.createStatement({ sqlText: sql_text});
var sql_command1 = snowflake.createStatement({ sqlText:`SELECT "name" FROM table(result_scan(last_query_id())) WHERE "last_modified" = (select MAX("last_modified") from table(result_scan(last_query_id()))) LIMIT 1;`});
try {
sql_command0.execute();
var resultSet = sql_command1.execute();
while(resultSet.next())
{
var resultFile = resultSet.getColumnValue('name').split("/")
return resultFile[resultFile.length - 1]
}
}
catch (err) {
return "Failed: " + err;
}
$$;
You can then call the stored procedure like this:
call "MYDB"."MYSCHEMA"."LATEST_STAGED_FILE"('MYDB.MYSCHEMA.MYSTAGE', 'mypath/myotherpath');
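One thing worth checking: if `last_modified` comes back from LIST as text, `MAX` compares strings, so "Wed, ..." would sort after "Mon, ..." regardless of the actual date. The sketch below picks the latest file by parsing the dates instead. It assumes the values are HTTP-date style strings that `Date.parse` understands; `latestFileName` and the sample bucket paths are hypothetical.

```javascript
// Sketch: pick the most recently modified file from rows shaped like LIST output.
// Assumption: "last_modified" looks like "Mon, 07 Sep 2020 10:00:00 GMT".
function latestFileName(rows) {
  let latest = null;
  for (const row of rows) {
    const modified = Date.parse(row.last_modified);
    if (Number.isNaN(modified)) continue; // skip rows with an unparseable date
    if (latest === null || modified > latest.modified) {
      latest = { modified, name: row.name };
    }
  }
  if (latest === null) return null; // nothing usable under that path
  const parts = latest.name.split("/");
  return parts[parts.length - 1]; // keep only the file name, drop the path
}

const sampleRows = [
  { name: "s3://bucket/mypath/a_001.csv.gz", last_modified: "Wed, 02 Sep 2020 10:00:00 GMT" },
  { name: "s3://bucket/mypath/b_002.csv.gz", last_modified: "Mon, 07 Sep 2020 10:00:00 GMT" },
];
console.log(latestFileName(sampleRows)); // prints "b_002.csv.gz"
```

In the sample, a plain string comparison would have chosen the "Wed" row even though the "Mon" file is five days newer. Returning `null` for an empty listing also makes the no-files case explicit.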
References
select from list @
list stage via SP
I need help with a practical scenario. I have a table called CONFIG_TBL in Snowflake. This table holds SQL statements, one per row. My goal is to use Snowflake stored procedures or Snowflake UDFs, or a combination of the two, to return the result set obtained by executing such a statement.
The statements are simple select statements, like "select * from ABC".
I could have done this very easily in SQL Server, since procedures there can return table values. However, I don't know how to do this in Snowflake.
Any help will be greatly appreciated.
Thanks in advance.
Here's something to get you started at least. Procedures use JavaScript (SQL stored procedures are coming soon), and they can run dynamic queries like the ones you are looking for.
You can get the results in a couple of ways: either by returning a variant object or by using result_scan after calling the procedure.
This example runs just one query, so your final solution will differ depending on what you want the output to look like.
CREATE OR REPLACE PROCEDURE SCHEMA.PROCEDURE_NAME()
RETURNS VARIANT
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS $$
var retrieve_queries_sql = "select top 1 query from CONFIG_TBL";
var retrieve_queries_result_set = snowflake.execute({sqlText: retrieve_queries_sql});
// next() returns a boolean, so advance the cursor first and read the value afterwards
retrieve_queries_result_set.next();
var query_to_run = retrieve_queries_result_set.getColumnValue(1);
var rs = snowflake.execute({sqlText: query_to_run});
var return_value = "";
// First row, without a leading newline
if (rs.next()) {
return_value += rs.getColumnValue(1);
return_value += ", " + rs.getColumnValue(2);
}
// Remaining rows, one per line
while (rs.next()) {
return_value += "\n";
return_value += rs.getColumnValue(1);
return_value += ", " + rs.getColumnValue(2);
}
return return_value;
$$;
CALL SCHEMA.PROCEDURE_NAME();
SELECT *
FROM table(result_scan(last_query_id()));
Edit: Fixed to have example correctly return a result which can then be used by the result_scan. Example taken from here. There are various more examples for getting results out of a procedure, including using JSON output.
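For the variant-object route, one possible shape is an array with one object per row, keyed by column name. This is a minimal sketch: `collectRows` is a hypothetical helper, and `stmt` is expected to behave like the Statement object returned by `snowflake.createStatement()`, using its `execute`, `getColumnCount` and `getColumnName` methods.

```javascript
// Sketch: collect every row of a statement's result into an array of plain objects,
// which a procedure declared with RETURNS VARIANT could return as-is.
function collectRows(stmt) {
  const rs = stmt.execute();
  const columnCount = stmt.getColumnCount();
  const rows = [];
  while (rs.next()) {
    const row = {};
    for (let i = 1; i <= columnCount; i++) { // column indexes start at 1
      row[stmt.getColumnName(i)] = rs.getColumnValue(i);
    }
    rows.push(row);
  }
  return rows;
}
```

Unlike the comma-joined string, this does not hard-code two columns, so it works for any of the stored select statements. Very large results may not fit in a single variant value, so it suits small result sets best.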
We can write the procedure with LANGUAGE SQL and declare RETURNS TABLE, which lets it return table values.
CREATE OR REPLACE PROCEDURE DDL_TEST_SQL()
RETURNS TABLE(CARRIERFILEID integer, BLOBPATH varchar)
LANGUAGE SQL
EXECUTE AS OWNER
AS
$$
DECLARE
QUERY STRING;
res resultset;
BEGIN
QUERY := 'SELECT TOP 10 CARRIERFILEID,BLOBPATH from CONFIG.CARRIERFILE';
res := (EXECUTE IMMEDIATE :QUERY);
return table(res);
END;
$$;
CALL DDL_TEST_SQL();
I have created a task:
CREATE OR REPLACE TASK TASK_1
WAREHOUSE = WAREHOUSE
SCHEDULE = 'USING CRON 30 1 * * * America/Detroit'
AS
....
This runs at 1:30 am daily.
Is there a way to execute this query on demand?
i.e. something like:
TRIGGER TASK TASK_1;
mRainey's answer is right. You can't execute a task outside of a schedule or task dependency. That's the correct answer to the OP's question.
For others though who stumble upon this answer, you can make scheduling a task at a different time easier on yourself:
CREATE OR REPLACE PROCEDURE "SCHEDULE_TASK_AT_TIME"(TASK_NAME VARCHAR, HOUR float, MINUTE float)
RETURNS VARIANT
LANGUAGE JAVASCRIPT
AS $$
var return_rows = [];
var task_name = TASK_NAME;
var h = HOUR;
var m = MINUTE;
var default_timezone = 'America/Los_Angeles';
var new_chron = 'USING CRON ' + m + ' ' + h + ' * * * ' + default_timezone;
var stmt = snowflake.createStatement({sqlText: `
DESCRIBE TASK IDENTIFIER(:1)
`, binds:[task_name]});
res = stmt.execute();
res.next();
var old_chron = res.getColumnValue(8);
var stmt = snowflake.createStatement({sqlText: `
ALTER TASK IDENTIFIER(:1) SUSPEND
`, binds:[task_name]});
res = stmt.execute();
var stmt = snowflake.createStatement({sqlText: `
ALTER TASK IDENTIFIER(:1) SET SCHEDULE = :2
`, binds:[task_name, new_chron]});
res = stmt.execute();
var stmt = snowflake.createStatement({sqlText: `
ALTER TASK IDENTIFIER(:1) RESUME
`, binds:[task_name]});
res = stmt.execute();
return_rows.push('Old Chron: ' + old_chron);
return_rows.push('New Chron: ' + new_chron);
return return_rows;
$$;
Then you can schedule your task like this, which would run at the next 22:38:
call SCHEDULE_TASK_AT_TIME('DEMO_TASK', 22, 38);
The output of this procedure gives you the old cron schedule and the new cron schedule, so you can easily set it back when you're done.
Just be careful with this and note its limitations: in my version you have to hard-code the time zone, for example.
Also, I didn't look into whether you can set up a cron schedule to execute just once, so whatever hour and minute you set, the task will run at that time every day unless you take further action.
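Since the procedure accepts the hour and minute as floats and concatenates them straight into the schedule, a small validation step can catch bad input before the task is altered. The helper below is a sketch: `buildDailyCron` is a hypothetical name, and taking the time zone as a parameter is one possible way around the hard-coded value.

```javascript
// Hypothetical helper: validates the inputs and builds a daily cron schedule string.
function buildDailyCron(hour, minute, timeZone) {
  if (!Number.isInteger(hour) || hour < 0 || hour > 23) {
    throw new RangeError("hour must be an integer from 0 to 23");
  }
  if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
    throw new RangeError("minute must be an integer from 0 to 59");
  }
  // Fields: minute, hour, day of month, month, day of week, then the time zone.
  return `USING CRON ${minute} ${hour} * * * ${timeZone}`;
}

console.log(buildDailyCron(22, 38, "America/Los_Angeles"));
// prints "USING CRON 38 22 * * * America/Los_Angeles"
```

The same function body could replace the `new_chron` concatenation inside the procedure, with the time zone added as a third procedure argument.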
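At the time of writing, there was no way to explicitly execute a task outside of either a schedule or a task dependency. Snowflake has since added an EXECUTE TASK command that triggers a single run on demand.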