Snowflake Warehouse Cannot Be Suspended?

I have a Snowflake warehouse that has used 10 credits so far this month and that I believe is no longer in use. To safely verify it's no longer in use, I tried:
ALTER WAREHOUSE MY_WAREHOUSE SUSPEND
But I received the message:
Invalid state. Warehouse 'MY_WAREHOUSE' cannot be suspended.
Why might the warehouse not be suspendable? What else do I need to check and run?
Additionally, is there a way to check whether any queries/loads have used this warehouse in the past X days?

The ALTER WAREHOUSE ... SUSPEND command returns the error "Invalid state. Warehouse 'MY_WAREHOUSE' cannot be suspended." when the warehouse is already suspended.
Before issuing the SUSPEND command, check the status of the warehouse using the command below:
show warehouses like '%MY_WAREHOUSE%' in account;
The STATE column of the result set gives the status of the warehouse.
To check whether any queries have used this warehouse, you can use the query below:
select * from snowflake.account_usage.query_history where warehouse_name='Warehouse name' and START_TIME > 'Timestamp';
https://docs.snowflake.com/en/sql-reference/account-usage/query_history.html#query-history-view
https://docs.snowflake.com/en/sql-reference/functions/query_history.html#query-history-query-history-by
You could also check the warehouse credit usage from the Account -> Usage tab in the Snowflake web UI.
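For a concrete "past X days" check, here is a minimal sketch against the ACCOUNT_USAGE views (assuming the warehouse is named MY_WAREHOUSE and a 7-day window; note that these views can lag behind real time):
-- queries that ran on the warehouse in the last 7 days
select query_id, user_name, start_time, total_elapsed_time
from snowflake.account_usage.query_history
where warehouse_name = 'MY_WAREHOUSE'
and start_time > dateadd('day', -7, current_timestamp())
order by start_time desc;
-- credits consumed by the warehouse in the last 7 days
select start_time, end_time, credits_used
from snowflake.account_usage.warehouse_metering_history
where warehouse_name = 'MY_WAREHOUSE'
and start_time > dateadd('day', -7, current_timestamp())
order by start_time desc;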

Try recreating the warehouse with auto-suspend set, e.g. by running something like this:
CREATE OR REPLACE WAREHOUSE SOME_FANCY_WAREHOUSE_NAME
WITH WAREHOUSE_SIZE = 'MEDIUM'
MAX_CLUSTER_COUNT = 1
MIN_CLUSTER_COUNT = 1
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
;
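If recreating the warehouse is not an option, setting auto-suspend on the existing warehouse should have the same effect; a minimal sketch, reusing the warehouse name from the question:
ALTER WAREHOUSE MY_WAREHOUSE SET AUTO_SUSPEND = 60;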

Related

MariaDB replication is not working when no database is selected

I'm using MariaDB 10.6.8 and have one master DB and two slave DBs. Those DBs are set up for replication.
When I execute an INSERT or UPDATE query without selecting a database, replication doesn't seem to work. In other words, the master DB's data is changed but the slave DBs' data remains intact.
/* no database is selected */
MariaDB [(none)]> show master status \G
*************************** 1. row ***************************
File: maria-bin.000007
Position: 52259873
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.000 sec)
MariaDB [(none)]> UPDATE some_database.some_tables SET some_datetime_column = now() WHERE primary_key_column = 1;
Query OK, 1 row affected (0.002 sec)
Rows matched: 1 Changed: 1 Warnings: 0
MariaDB [(none)]> show master status \G
*************************** 1. row ***************************
File: maria-bin.000007
Position: 52260068
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.000 sec)
/* only change master database's record even though the replication position is changed */
However, after selecting the database, replication works fine.
/* but, after selecting the database */
MariaDB [(none)]> USE some_database;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [some_database]> UPDATE some_tables SET some_datetime_column = now() WHERE primary_key_column = 1;
Query OK, 1 row affected (0.002 sec)
Rows matched: 1 Changed: 1 Warnings: 0
/* then change master and slave database's record */
Can anyone tell me what could be the cause of this situation?
Regardless of the binary log format (MIXED, STATEMENT, ROW), all DML commands are written to the binary log file as soon as the transaction is committed.
When using ROW format, a TABLE_MAP event is logged first, which contains a unique ID, the database name and the table name. The ROW_EVENT (Delete/Insert/Update) refers to one or more table IDs to identify the tables used.
The STATEMENT format logs a query event, which contains the default database name, the timestamp and the SQL statement. If there is no default database, the statement itself will contain the database name.
Binlog dump example for STATEMENT format (I removed non-relevant parts such as timestamps and user variables from the output).
Without a default database:
#230210 4:42:41 server id 1 end_log_pos 474 CRC32 0x1fa4fa55 Query thread_id=5 exec_time=0 error_code=0 xid=0
insert into test.t1 values (1),(2)
/*!*/;
# at 474
#230210 4:42:41 server id 1 end_log_pos 505 CRC32 0xfecc5d48 Xid = 28
COMMIT/*!*/;
# at 505
with default database:
#230210 4:44:35 server id 1 end_log_pos 639 CRC32 0xfc862172 Query thread_id=5 exec_time=0 error_code=0 xid=0
use `test`/*!*/;
insert into t1 values (1),(2)
/*!*/;
# at 639
#230210 4:44:35 server id 1 end_log_pos 670 CRC32 0xca70b57f Xid = 56
COMMIT/*!*/;
If a session doesn't use a default database on the source server, the statement may not be replicated if a binary log filter (e.g. replicate_do_db) was specified on the replica, since the replica doesn't parse the statement but only checks whether the default database name matches the filter.
To avoid inconsistent data on your replicas, I would recommend using ROW format instead.
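A minimal sketch of how to verify both sides, assuming you have the required privileges (these are the standard MariaDB variable names):
-- on the replica: check whether a database filter is configured
SHOW GLOBAL VARIABLES LIKE 'replicate_do_db';
-- on the source: check the current format and switch to ROW
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SET GLOBAL binlog_format = 'ROW'; -- add binlog_format=ROW to my.cnf to persist it across restarts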

singleton pattern implementation in Snowflake?

We need to implement some singleton pattern to ensure a stored procedure cannot be run several times simultaneously.
As I cannot see this functionality in place, I thought about implementing this via a "Lock" table.
We are in a "batch" environment so waiting a few seconds is no problem.
CREATE TABLE SHARED.LOCK (
LOCK_NAME STRING NOT NULL PRIMARY KEY
,SESSION_ID STRING
,ACQUIRED_AT TIMESTAMP_NTZ
);
LOCK_NAME is forced to upper case and used as a Primary Key
SESSION_ID is the current session
ACQUIRED_AT is just useful information
I then create a stored proc to "acquire" the lock $LOCK_NAME; it tries to update the lock record with its own session ID, as long as the record is not "locked" already:
UPDATE SHARED.LOCK
SET SESSION_ID = CURRENT_SESSION()
,ACQUIRED_AT = CURRENT_TIMESTAMP()
WHERE LOCK_NAME = $LOCK_NAME
AND SESSION_ID IS NULL;
To avoid Snowflake optimistic locking side effects, I would ensure that this stored procedure is not called as part of an explicit transaction.
I then check whether I successfully "acquired" this lock:
SELECT 1
FROM SHARED.LOCK
WHERE LOCK_NAME = $LOCK_NAME
AND SESSION_ID = CURRENT_SESSION();
If I get a record, then I have the lock.
Otherwise, I could wait X seconds and try again later, up to a certain number of attempts.
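For the wait between attempts, SYSTEM$WAIT can be called from the procedure; a minimal sketch, assuming a 10-second backoff:
CALL SYSTEM$WAIT(10, 'SECONDS');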
Once I am done, I can release the lock with a simple UPDATE statement:
UPDATE SHARED.LOCK
SET SESSION_ID = NULL
,ACQUIRED_AT = NULL
WHERE LOCK_NAME = $LOCK_NAME
AND SESSION_ID = CURRENT_SESSION();
And of course we'll have to do something about locks that are not released within a certain amount of time, or that are held by a session that is no longer live, etc.
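As an illustration, a hypothetical stale-lock sweep, assuming an arbitrary 1-hour timeout:
UPDATE SHARED.LOCK
SET SESSION_ID = NULL
,ACQUIRED_AT = NULL
WHERE SESSION_ID IS NOT NULL
AND ACQUIRED_AT < DATEADD('hour', -1, CURRENT_TIMESTAMP());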
I think this should work... but maybe there is a simpler way to implement a singleton in Snowflake?
Any better ideas?
Depending on requirements, if the stored procedure is going to be run on a schedule, a TASK could be used, which has overlap protection built in:
CREATE OR REPLACE TASK my_task
WAREHOUSE = compute_wh
SCHEDULE = '1 minute'
ALLOW_OVERLAPPING_EXECUTION = FALSE
AS
CALL procedure_call();
CREATE TASK - ALLOW_OVERLAPPING_EXECUTION :
ALLOW_OVERLAPPING_EXECUTION = TRUE | FALSE
Specifies whether to allow multiple instances of the task tree to run concurrently
FALSE ensures only one instance of a particular tree of tasks is allowed to run at a time.
Demo:
CREATE TABLE log(id INT NOT NULL IDENTITY(1,1), d TIMESTAMP);
CREATE OR REPLACE procedure insert_log()
returns string
language javascript
execute as owner
as
$$
snowflake.execute ({sqlText: "INSERT INTO log (d) SELECT CURRENT_TIMESTAMP()"});
snowflake.execute ({sqlText: "CALL SYSTEM$WAIT(2, 'MINUTES')"});
return "Succeeded.";
$$
;
-- recreate the task so it calls the demo procedure, then start it
CREATE OR REPLACE TASK my_task
WAREHOUSE = compute_wh
SCHEDULE = '1 minute'
ALLOW_OVERLAPPING_EXECUTION = FALSE
AS
CALL insert_log();
ALTER TASK my_task RESUME;
SELECT * FROM log;

Flink job submitted through sql-client.sh sometimes has no checkpoints (what is the way to alter it), and how to recover in case of failure

For example, running sql-client.sh embedded and submitting:
insert into wap_fileused_daily(orgId, pdate, platform, platform_count) select u.orgId, u.pdate, coalesce(p.platform,'other'), sum(u.isMessage) as platform_count from users as u left join ua_map_platform as p on u.uaType = p.uatype where u.isMessage = 1 group by u.orgId, u.pdate, p.platform
The job shows up as running (screenshot omitted), but there will never be any checkpoint.
Questions:
1) How do I enable checkpointing (alter the job)?
2) How do I recover in case of failure?
You can specify execution configuration parameters in the SQL Client YAML file. For example, the following should work:
configuration:
  execution.checkpointing.interval: 42
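In newer Flink versions you may also be able to set this per session directly in the SQL client (this assumes a version that supports setting configuration options via SET):
SET 'execution.checkpointing.interval' = '1min';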
There is a related feature request on Flink: https://cwiki.apache.org/confluence/display/FLINK/FLIP-147%3A+Support+Checkpoints+After+Tasks+Finished

How to use copy Storage Integration in a Snowflake task statement?

I'm testing Snowflake. To do this I created a Snowflake instance on GCP.
One of the tests is to try the daily load of data from a STORAGE INTEGRATION.
To do that I created the STORAGE INTEGRATION and the stage.
I tested the copy:
copy into DEMO_DB.PUBLIC.DATA_BY_REGION from @sg_gcs_covid pattern='.*data_by_region.*'
and all goes fine.
Now it's time to test the daily scheduling with the task statement.
I created this task:
CREATE TASK schedule_regioni
WAREHOUSE = COMPUTE_WH
SCHEDULE = 'USING CRON 42 18 9 9 * Europe/Rome'
COMMENT = 'Test Schedule'
AS
copy into DEMO_DB.PUBLIC.DATA_BY_REGION from @sg_gcs_covid pattern='.*data_by_region.*';
And I enabled it:
alter task schedule_regioni resume;
I got no errors, but the task doesn't load any data.
To resolve the issue I had to put the COPY in a stored procedure and call the stored procedure from the task instead of the copy:
DROP TASK schedule_regioni;
CREATE TASK schedule_regioni
WAREHOUSE = COMPUTE_WH
SCHEDULE = 'USING CRON 42 18 9 9 * Europe/Rome'
COMMENT = 'Test Schedule'
AS
call sp_upload_c19_regioni();
The question is: is this the desired behavior or an issue (as I suppose)?
Can someone give me some information about this?
I've just tried (but with a storage integration and stage on AWS S3) and it works fine, also using the COPY command directly in the SQL part of the task, without calling a stored procedure.
In order to start investigating the issue, I would check the following (for debugging, I would create the task scheduled to run every few minutes):
Check task_history and verify the executions:
select *
from table(information_schema.task_history(
scheduled_time_range_start=>dateadd('hour',-1,current_timestamp()),
result_limit => 100,
task_name=>'YOUR_TASK_NAME'));
If the previous step is successful, check copy_history and verify that the input file name, target table, and number of records/errors are the expected ones:
SELECT *
FROM TABLE (information_schema.copy_history(TABLE_NAME => 'YOUR_TABLE_NAME',
start_time=> dateadd(hours, -1, current_timestamp())))
ORDER BY 3 DESC;
Check whether the results are the same as those you get when the task with the SP call is executed.
Please also confirm that you are loading new files not yet loaded into your table with the COPY command (otherwise you need to specify the FORCE = TRUE parameter in the copy command, or truncate your target table to remove the load metadata so the same files can be reloaded).
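For illustration, a sketch of such a forced reload, reusing the stage and table from the question:
copy into DEMO_DB.PUBLIC.DATA_BY_REGION
from @sg_gcs_covid
pattern = '.*data_by_region.*'
force = TRUE;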

Successfully created a task in Snowflake, but it does not show up when running "show tasks"

I am new to Snowflake and am trying to create my first task.
CREATE TASK task_update_table
WAREHOUSE = "TEST"
SCHEDULE = 'USING CRON 0 5 * * * America/Los_Angeles'
AS
INSERT INTO "TEST"."WEB"."SOME_TABLE" (ID,VALUE1,VALUE2,VALUE3)
WITH CTE AS
(SELECT
ID
,VALUE1
,VALUE2
,VALUE3
FROM OTHER_TABLE
WHERE ID NOT IN (SELECT ID FROM "TEST"."WEB"."SOME_TABLE")
)
SELECT
ID,VALUE1,VALUE2,VALUE3
FROM CTE
I got a message that the task was created successfully
"Task task_update_table successfully created"
I then try to run SHOW TASKS IN SCHEMA "TEST"."WEB" and get 0 rows as a result. What am I doing wrong? Why is the task not showing?
I did all of this under sysadmin and was using the same warehouse, db and schema.
There are some limitations around show commands that might be blocking you,
particularly "SHOW commands only return objects for which the current user’s current role has been granted the necessary access privileges".
https://docs.snowflake.com/en/sql-reference/sql/show.html#general-usage-notes
I suspect the task was created by a different role (therefore owned by a different role), or perhaps it was created in different database or schema.
To find it, I'd recommend running the following using a role such as ACCOUNTADMIN.
show tasks in account;
SELECT *
FROM (
SELECT *
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())))
WHERE "name" = 'TASK_UPDATE_TABLE';
While testing and learning in Snowflake, it is critical you set your session "context" correctly, using commands like this:
USE ROLE my_role_here;
USE WAREHOUSE my_warehouse_here;
USE DATABASE my_database_here;
USE SCHEMA my_schema_here;
Running those four commands, or setting defaults for them on your user, will help you tremendously when learning.
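A minimal sketch of setting those defaults on a user (the names are placeholders):
ALTER USER my_user_here SET
DEFAULT_ROLE = 'my_role_here'
DEFAULT_WAREHOUSE = 'my_warehouse_here'
DEFAULT_NAMESPACE = 'my_database_here.my_schema_here';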
I hope this helps...Rich
