Snowflake Alert Long Running Queries - snowflake-cloud-data-platform

How can I send long-running query alerts to multiple users in Snowflake?
Right now the alert is sent only to users with the account admin role.
Is there any way to send the long-running query alert to the user running the query, or to
multiple users who belong to a particular warehouse/database?
Is there any way to leverage a Snowflake notification integration for the above alerts?
Thanks In Advance
Sundar

This requirement can be met by using alerts and email notifications.
Setting Up Alerts Based on Data in Snowflake:
In some cases, you might want to be notified or take action when data in Snowflake meets certain conditions. For example, you might want to receive a notification when:
- The warehouse credit usage increases by a specified percentage of your current quota.
- The resource consumption for your pipelines, tasks, materialized views, etc. increases beyond a specified amount.
- A data access request is received from an unauthorized user.
- Your data fails to comply with a particular business rule that you have set up.
To do this, you can set up a Snowflake alert. A Snowflake alert is a schema-level object that specifies:
- A condition that triggers the alert (e.g. the presence of queries that take longer than a second to complete).
- The action to perform when the condition is met (e.g. send an email notification, capture some data in a table, etc.).
- When and how often the condition should be evaluated (e.g. every 24 hours, every Sunday at midnight, etc.).
Sample:
CREATE OR REPLACE ALERT alert_long_queries
  WAREHOUSE = my_warehouse_name
  SCHEDULE = '5 MINUTE'
  IF (EXISTS (
    SELECT *
    FROM TABLE(SNOWFLAKE.INFORMATION_SCHEMA.QUERY_HISTORY())
    WHERE EXECUTION_STATUS ILIKE 'RUNNING'
      AND start_time < current_timestamp() - INTERVAL '5 MINUTES'
  ))
  THEN CALL SYSTEM$SEND_EMAIL(...);

The only notification available out of the box in Snowflake is the resource monitor, and only members of the ACCOUNTADMIN role can subscribe to its notifications.
https://docs.snowflake.com/en/user-guide/resource-monitors.html#resource-monitor-properties


TASKS history in Snowflake

Is there an efficient way to see the logs of task runs in Snowflake?
I am currently using the query below. Is it possible to wipe the history from here?
select *
from table(information_schema.task_history(
    scheduled_time_range_start => dateadd('hour', -1, current_timestamp()),
    result_limit => 1000,
    task_name => 'TASKNAME'));
Is there an efficient way to see the logs of task runs in Snowflake?
Depending on what "efficient" means, Snowflake offers a UI to monitor task dependencies and run history.
Run History
Task run history includes details about each execution of a given task. You can view the scheduled time, the actual start time, duration of a task and other information.
Account Level Task History:
Task history displays task information at the account level, and is divided into three sections:
Selection (1) - Defines the set of task history to display, and includes types of tasks, date range and other information.
Histogram (2) - Displays a bar graph of task runs over time.
Task list (3) - a list of selected tasks.
Is there a possibility to wipe off the history from here?
Task History
This Account Usage view enables you to retrieve the history of task usage within the last 365 days (1 year). The view displays one row for each run of a task in the history.

What is the best way to query number of customers subscribed during a particular time period

We want to enable our users to get insights from the data. We have been using Tableau as a self-service BI platform. Our dashboard users have a specific request: they want to see data within a particular time period.
Below is what my dataset looks like (dates are mm/dd/yyyy).
User request: to see how many customers were subscribed within a time period regardless of their current status, i.e. even if their current status is cancelled, they should be counted as long as they were active during the user-provided time period.
Example: the user selects the time range 01/01/2020 - 03/31/2020. Running a query on the data set below should return a count of 3. [CUST1 as they cancelled after 03/31/2020, CUST3 as they signed up before 01/01/2020 but are still active, CUST5 as they were active at some point during that period]
Problem - While I can write a SQL query with abundant where clauses to achieve this, we want a self-service automated way i.e. we want users to just provide us the time range and get the number. How do we achieve this in a BI tool like Tableau? I am also open to other tools, changing the data model design and other options. The goal is to just make this automated rather than having a person manually update and run a SQL query
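| Customer ID | Subscription Start Date | Subscription End Date | Subscription Status |
|-------------|-------------------------|-----------------------|---------------------|
| CUST1       | 10/11/2019              | 04/12/2020            | Cancelled           |
| CUST2       | 01/12/2020              |                       | Active              |
| CUST3       | 05/01/2019              |                       | Active              |
| CUST4       | 06/07/2012              | 07/08/2012            | Cancelled           |
| CUST5       | 01/12/2020              | 03/14/2020            | Cancelled           |
| CUST6       | 04/12/2020              |                       | Active              |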
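
One way to phrase the request is as an interval-overlap test: a customer counts if the subscription started on or before the end of the selected range and either has no end date or ended on or after the start of the range. The sketch below applies that rule to the sample rows in plain Python; the names `Subscription` and `subscribed_during` are made up for illustration, and the same predicate could be written as a parameterised SQL filter or a BI-tool calculated field driven by two date parameters. Note that under this rule CUST2 (started 01/12/2020, still active) also overlaps the example range, so the sketch selects four customers for the sample rows rather than the three listed in the example.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Subscription(NamedTuple):
    customer_id: str
    start: date
    end: Optional[date]  # None means the subscription is still active


# Sample rows from the question (dates given there as mm/dd/yyyy).
SUBSCRIPTIONS = [
    Subscription("CUST1", date(2019, 10, 11), date(2020, 4, 12)),
    Subscription("CUST2", date(2020, 1, 12), None),
    Subscription("CUST3", date(2019, 5, 1), None),
    Subscription("CUST4", date(2012, 6, 7), date(2012, 7, 8)),
    Subscription("CUST5", date(2020, 1, 12), date(2020, 3, 14)),
    Subscription("CUST6", date(2020, 4, 12), None),
]


def subscribed_during(subs: List[Subscription],
                      range_start: date,
                      range_end: date) -> List[str]:
    """Return IDs of customers whose subscription overlaps the range (inclusive)."""
    return [
        s.customer_id
        for s in subs
        if s.start <= range_end and (s.end is None or s.end >= range_start)
    ]


if __name__ == "__main__":
    matched = subscribed_during(SUBSCRIPTIONS, date(2020, 1, 1), date(2020, 3, 31))
    print(len(matched), matched)
```

Because the two range bounds are the only inputs, users only need to supply the dates; nobody has to edit the query by hand.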

How to check if no new opportunity has been created in past 1 year for an account in salesforce?

I have to create an automated process that checks whether no new opportunities have been created for an account in the past 12 months and updates an account field based on that.
Tried process builder, but it doesn't seem to work.
Tricky
A flow/workflow/process builder needs some triggering condition to fire. If an account was created 5 years ago, has not been updated since and hasn't had any opportunities, it will not trigger any flows until somebody touches it.
And even if you somehow manage to make a time-based workflow, for example (to enqueue making a Task 1 year from now if there are no Opps by then), it'll "queue" actions only from the moment it was created; it will not retroactively tag old unused accounts.
The time-based actions suck a bit. Say you made it work, it enqueued some future tasks/field updates/whatevers. Then you realise you need to exclude Accounts of certain record type from it. You need to deactivate the workflow/flow to do it - and deactivation wipes the enqueued actions out. So you'd need to save your changes and somehow "touch" all accounts again so they're checked again.
Does it have to be a field on Account? Can it be just a report (which you could make a reporting snapshot of if needed)? You could embed a report on account layout right? A query? Worst case some apex nightly job that runs and tags the accounts? It would dutifully run through them all and set/clear your helper field, easy to change (well, for a developer).
SELECT Id, Name
FROM Account
WHERE Id NOT IN (SELECT AccountId FROM Opportunity WHERE CreatedDate = LAST_N_DAYS:365)
Reporting way would be "cross filter": https://salesforce.vidyard.com/watch/aQ6RWvyPmFNP44brnAp8tf, https://help.salesforce.com/s/articleView?id=sf.reports_cross_filters.htm&type=5
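
The logic a nightly job would apply can be sketched outside Salesforce in a few lines of Python. This is only an illustration of the set difference the query above expresses, not Apex: the function name, the `(account_id, created)` tuple shape and the sample IDs are assumptions, and the cutoff here is a plain 365-day window, which only approximates how `LAST_N_DAYS:365` is evaluated.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def accounts_without_recent_opps(account_ids: Iterable[str],
                                 opportunities: Iterable[Tuple[str, datetime]],
                                 now: datetime,
                                 days: int = 365) -> List[str]:
    """Return accounts that have no opportunity created within the last `days` days."""
    cutoff = now - timedelta(days=days)
    # Accounts that do have at least one recent opportunity.
    recent = {account_id for account_id, created in opportunities if created >= cutoff}
    return [a for a in account_ids if a not in recent]


if __name__ == "__main__":
    # Hypothetical data: account A has a recent opportunity, B an old one, C none.
    opps = [("A", datetime(2023, 6, 1)), ("B", datetime(2021, 1, 1))]
    print(accounts_without_recent_opps(["A", "B", "C"], opps, now=datetime(2024, 1, 1)))
```

A scheduled job would then set the helper field on the returned accounts and clear it on the rest.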

How to find the count of total connections in Snowflake

We know that we have "show transactions" to see the transactions currently connected to the database.
But I am interested in:
- The count of active users for each warehouse
- The history of connection counts for each warehouse
Is there a way to get the above information using SQL commands (not the web UI)?
If I understood correctly, you want to see the mapping between warehouses and active users. As far as I know there is no view that provides this directly, but you can use the query below: filtering on warehouse size != '0' ties the warehouse and the user together. See the link below:
https://docs.snowflake.com/en/sql-reference/account-usage/query_history.html
Before that, note the following:
- Snowflake sessions are not tagged with a user name or account; they are identified by system-generated IDs.
- The relationship between users and warehouses is zero-to-many: an active user can use multiple warehouses in parallel, and a warehouse can be used by multiple users at the same time.
- A user can have an active session without a running warehouse.
- A warehouse can keep running without any active user.
- Finally, queries can also be executed without starting a warehouse.
SELECT TO_CHAR(DATE_TRUNC('minute', query_history.START_TIME), 'YYYY-MM-DD HH24:MI') AS "query_history.start_time",
       query_history.WAREHOUSE_NAME AS "query_history.warehouse_name",
       query_history.USER_NAME AS "query_history.user_name"
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY AS query_history
WHERE (query_history.WAREHOUSE_SIZE != '0')
GROUP BY DATE_TRUNC('minute', query_history.START_TIME), 2, 3
ORDER BY 1 DESC
Note: the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view can have a latency of up to 45 minutes.
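
The query above returns one row per minute, warehouse and user. To turn that into the count the question asks for (active users per warehouse), the rows still need to be collapsed to a distinct-user count per minute and warehouse. The Python sketch below shows that aggregation on already-fetched rows; the function name, the tuple layout and the sample values are assumptions made for illustration, and the same result could be obtained in SQL by counting distinct user names per minute and warehouse.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Row = Tuple[datetime, str, str]  # (start_time, warehouse_name, user_name)


def active_users_per_minute(rows: Iterable[Row]) -> Dict[Tuple[datetime, str], int]:
    """Count distinct users per (minute, warehouse) bucket."""
    buckets = defaultdict(set)
    for start_time, warehouse, user in rows:
        # Truncate the timestamp to the minute, mirroring DATE_TRUNC('minute', ...).
        minute = start_time.replace(second=0, microsecond=0)
        buckets[(minute, warehouse)].add(user)
    return {key: len(users) for key, users in buckets.items()}


if __name__ == "__main__":
    # Hypothetical rows, shaped like the query result.
    sample = [
        (datetime(2023, 1, 1, 10, 0, 5), "WH_A", "alice"),
        (datetime(2023, 1, 1, 10, 0, 40), "WH_A", "bob"),
        (datetime(2023, 1, 1, 10, 0, 50), "WH_A", "alice"),
        (datetime(2023, 1, 1, 10, 1, 0), "WH_B", "alice"),
    ]
    for (minute, warehouse), count in sorted(active_users_per_minute(sample).items()):
        print(minute, warehouse, count)
```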

How to avoid double processing of data on a busy site?

On my website, people can add an item to their wishlist. When X number of people have added it to their list, then all those peoples' credit cards are charged.
The problem I'm facing is how to ensure that if two customers add it to their wishlist at the same time, then the payment processing code won't run twice. Any ideas?
An example of what can happen is:
We are waiting for 20 people to add the item to their wishlist, and we have 19.
Bob and Sally visit the site and click the 'add to wishlist' button
The server receives Bob's request, sees that 20 requests are now met, and charges the payments.
At the same time the server receives Sally's request, and still seeing 19 requests in db since Bob's order was simultaneously received, begins to process the payments. Hence, the payments are charged twice.
Any ideas on how to avoid this?
I am using a MySQL database and PHP for the programming.
This is the type of thing for which transactions are designed. The charging of the cards and the resetting of the wishlist count must be in the same transaction so that they occur as an atomic unit. Furthermore, to avoid the problem you are describing, you must set the transaction isolation level to at least "Repeatable Read".
Additional information:
Here's how to do it:
1. The app opens a transaction on the database.
2. The app does a select on the wishlist tables to retrieve the count.
3. If the count is >= n, the app does another select on the wishlist and related tables to retrieve the pending wishlist orders, users, card info, etc.
4. Depending on the business rules regarding card transactions, the app then deletes the pending orders, or does whatever else is needed to reset the wishlist count back to zero.
5. The app then closes the transaction.
Here's why it works: when the app does a select on the wishlist tables to retrieve the count inside a transaction, the db places a read lock on the data read by this query (in MySQL's InnoDB engine this requires a locking read such as SELECT ... FOR UPDATE, because a plain SELECT is a non-locking consistent read). If another transaction that opened while the prior transaction was pending tries to perform the same locking read, it must wait until the prior transaction has either a COMMIT or a ROLLBACK. If the prior transaction COMMITS, then the next transaction will see a count of 0 and all the other modifications. Otherwise, if the app executes a ROLLBACK for any reason, none of the data changes and the next transaction sees the data as it existed prior to the first transaction.
I am doing a similar site at the moment. Seems to be popular...
It is important that your processes are idempotent. In this context, this means that if you run your charging service multiple times, the orders which have already been charged are not charged twice.
I accomplish this by setting the OrderStatus to 'NotProcessed' when the order is placed.
Once the service runs and charges for an order the OrderStatus changes to 'PaymentPending'.
I charge for the order only if the OrderStatus is 'NotProcessed'.
PSEUDO CODE:
void ProcessPendingOrders()
{
    var orders = getAllOrders();
    foreach (Order order in orders)
    {
        // Charge only orders that have not been processed yet
        if (order.OrderStatus == NotProcessed)
            ChargeOrder(order);
    }
}
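
If two workers can run the loop at the same time, checking the status and then charging leaves a small window between the check and the status change. A common way to close it is to claim the order with a conditional UPDATE and charge only when that UPDATE actually changed a row. The sketch below shows the idea with Python's built-in `sqlite3`; the `orders` table, the `charge_once` name and the `charge` callback are assumptions for illustration, not part of the original service.

```python
import sqlite3
from typing import Callable


def charge_once(conn: sqlite3.Connection,
                order_id: int,
                charge: Callable[[int], None]) -> bool:
    """Move the order from 'NotProcessed' to 'PaymentPending' and charge it,
    but only if this call is the one that changed the status."""
    with conn:  # commits on success, rolls back if charge() raises
        cursor = conn.execute(
            "UPDATE orders SET status = 'PaymentPending' "
            "WHERE id = ? AND status = 'NotProcessed'",
            (order_id,),
        )
        if cursor.rowcount == 1:
            charge(order_id)  # hypothetical payment call
            return True
    return False


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    connection.execute("INSERT INTO orders (id, status) VALUES (1, 'NotProcessed')")
    connection.commit()

    charged = []
    print(charge_once(connection, 1, charged.append))  # first call claims the order
    print(charge_once(connection, 1, charged.append))  # second call finds nothing to claim
    print(charged)
```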
