I know how to set up a job to alert when it's running.
But I'm writing a job that is meant to run many times a day, and I don't want to be bombarded by emails. Instead, I'd like a solution where I get an alert when the job hasn't executed for X minutes.
This can be achieved by setting the job to alert on execution, and then setting up some process which checks for these alerts and warns when no such alert has been seen for X minutes.
I'm wondering if anyone's already implemented such a thing (or equivalent).
Supporting multiple jobs with different X values would be great.
The danger of this approach is this: suppose you set this up. One day you receive no emails. What does this mean?
It could mean
the supposed-to-be-running job is running successfully (and silently), and so the absence-of-running monitor job has nothing to say
or alternatively
the supposed-to-be-running job is NOT running successfully, but the absence-of-running monitor job has ALSO failed
or even
your server has caught fire, and can't send any emails even if it wants to
Don't seek to avoid receiving success messages - instead, devise a strategy for coping with them, because the only way to know that a job is running successfully is to get a message which says precisely that.
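If you do go the route described in the question, the checking process can itself be another Agent job that polls the job history in msdb and emails only when a monitored job has had no successful run within its threshold. A minimal sketch, assuming SQL Server Agent history in msdb and Database Mail with a hypothetical profile named 'AlertProfile'; the job names and per-job X values are placeholders:

-- Hypothetical list of jobs to watch, each with its own "X minutes" threshold.
DECLARE @thresholds TABLE (job_name sysname, max_minutes int);
INSERT INTO @thresholds VALUES (N'LoadOrders', 30), (N'RefreshCache', 120);

-- Find monitored jobs whose last successful outcome (step_id = 0, run_status = 1)
-- is older than the threshold, or that have never succeeded at all.
-- msdb.dbo.agent_datetime is the (undocumented) helper shipped in msdb that turns
-- sysjobhistory's integer run_date/run_time columns into a datetime.
DECLARE @stale TABLE (job_name sysname, last_success datetime NULL);
INSERT INTO @stale (job_name, last_success)
SELECT t.job_name,
       MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time))
FROM @thresholds t
JOIN msdb.dbo.sysjobs j
  ON j.name = t.job_name
LEFT JOIN msdb.dbo.sysjobhistory h
  ON h.job_id = j.job_id
 AND h.step_id = 0        -- job outcome rows only
 AND h.run_status = 1     -- succeeded
GROUP BY t.job_name, t.max_minutes
HAVING MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time)) IS NULL
    OR MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time))
         < DATEADD(MINUTE, -t.max_minutes, GETDATE());

-- One email per check run, and only when something is overdue.
IF EXISTS (SELECT 1 FROM @stale)
    EXEC msdb.dbo.sp_send_dbmail
         @profile_name = N'AlertProfile',      -- hypothetical Database Mail profile
         @recipients   = N'ops@example.com',
         @subject      = N'SQL Agent job silence detected',
         @body         = N'At least one monitored job has not succeeded within its threshold.';

Schedule that as its own Agent job every few minutes. The caveat above still applies, of course: if the monitor job (or the server itself) dies, you get silence rather than an alert.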
OK, not sure what is going on here. I have runaway queries that won't cancel. I have one query that selects all rows from a table that has only 250 rows and is 1.5 KB in size. It's been running for 30 minutes now, and it should only take a few milliseconds.
I've tried canceling by hitting the abort button on the worksheet, going into history and selecting the query and hitting abort, aborting based on the query ID via SQL, and aborting based on the session ID via SQL.
Ironically, whenever I try to abort via SQL it reports that the queries have been terminated, yet they still show as running. I wait a few minutes and re-run the query, and it again shows them as terminated, but they are still running.
I also tried logging out and logging back in again, and am seeing all kinds of weird errors:
Internal Error: Unable to retrieve the current roles.
Error
Problem with your MFA Enrollment: There was an issue with your enrollment
process. Please try again.
Worksheet Not Loaded
I have no idea what is going on but it seems like everywhere I turn there is an issue. Any assistance would be greatly appreciated.
Try logging out completely, close the browser, reboot your machine, and start from there. Here's my guess:
Sometimes the query history (which I assume is where you were seeing things still running) needs a browser refresh, but based on the MFA errors, refreshing your browser appears to have logged you out of your SAML/MFA process.
Once you successfully log in, you'll likely see that the query had already completed before you even tried to cancel it.
If that isn't the case and you are still seeing issues, then we'd probably need more information, or a quick call to Snowflake Support can walk you through things. My guess is that this is all a display issue in your browser/UI, rather than something going wonky in Snowflake itself.
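One way to separate "genuinely still running" from "the UI is showing stale history" is to ask the server directly from a fresh worksheet. A short sketch; the query id and session id placeholders come from your own results:

-- Ask Snowflake itself what it thinks is still running, independent of the UI's history page.
SELECT query_id, session_id, execution_status, start_time, total_elapsed_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(RESULT_LIMIT => 50))
WHERE execution_status = 'RUNNING'
ORDER BY start_time DESC;

-- If something really is still running, cancel it by query id, or abort the whole session.
SELECT SYSTEM$CANCEL_QUERY('<query_id>');
SELECT SYSTEM$ABORT_SESSION(<session_id>);

If the first query comes back empty, that supports the display-issue theory, and logging out and back in should clear it up.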
I'm wondering if there is a way to recognize that an OfflineCommand has been executed, or an internal flag or something to indicate that the command has reached the server and been executed successfully. With an unstable internet connection I have trouble recognizing whether a command went through or not. I keep retrieving the records from the database and comparing them each and every time to see whether it went through, but due to the flow of my application I'm finding it very difficult to avoid duplicates. Is there any automatic process to make sure commands are executed, or something else I could use?
Second question: on the forms I can use a UITimer to check isOffline() to tell whether the internet is connected or not. Is there something equivalent on the server page, or where the queries are written, to see that the internet has been disconnected? When control moves to the queries and the internet is disconnected, the dialog opened from the form page stays frozen indefinitely and never ends, and I have to close and re-open the app to continue the synchronization process. At the same time I cannot set a timeout for the dialog, because I'm not sure how long the synchronization process will take to complete. Please advise.
Extending the same topic, I have created a new issue just to give more clarity on my questions:
executeOfflineCommand skips a command while executing from storage on Android
There is no way to know whether a connection will stay stable, as that requires knowledge of the future. You can work the way transaction services do, where the server side processes an offline command as a transaction using a 2-phase-commit-style approach.
In this approach you have an algorithm similar to this:
Client sends command to server
Server returns a special unique ID for the command
Client asks the server to perform the command identified by that unique ID
Server acknowledges that the command was performed
If the first 2 stages didn't complete, you just do them again; the worst that can happen is some orphaned commands sitting on the server.
If the 3rd stage didn't complete, you just do it again: the server knows whether it already processed the command and will simply acknowledge it if it was already performed.
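A rough client-side sketch of that flow in plain Java (the /commands endpoints, the text payload and the retry delays are made up for illustration; in a Codename One app you could issue the same two requests with ConnectionRequest and persist the returned id in Storage so a restart resumes at the execute step):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TwoPhaseCommandClient {
    // Hypothetical server endpoints; the real URLs and payload format depend on your backend.
    private static final String BASE = "https://example.com/commands";
    private final HttpClient http = HttpClient.newHttpClient();

    // Stages 1-2: register the command; the server stores it and returns a unique id.
    String register(String commandPayload) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(BASE + "/register"))
                .POST(HttpRequest.BodyPublishers.ofString(commandPayload))
                .build();
        HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
        if (resp.statusCode() != 200) {
            throw new IllegalStateException("register failed: " + resp.statusCode());
        }
        return resp.body().trim();   // the unique command id
    }

    // Stages 3-4: ask the server to perform the id; the server must treat this as idempotent.
    boolean execute(String commandId) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(BASE + "/execute/" + commandId))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
        return resp.statusCode() == 200;   // 200 means "performed" (possibly performed earlier)
    }

    // Retry each phase independently; duplicates are harmless because execution is keyed by id.
    void submit(String commandPayload) throws Exception {
        String id = null;
        while (id == null) {                     // worst case: orphan ids pile up on the server
            try { id = register(commandPayload); } catch (Exception e) { Thread.sleep(2000); }
        }
        boolean done = false;
        while (!done) {                          // server just acks again if it already ran this id
            try { done = execute(id); } catch (Exception e) { /* offline, try again */ }
            if (!done) Thread.sleep(2000);
        }
    }
}

The server side keeps a record of the ids it has already executed, so when an execute request arrives for an id it has seen, it just acknowledges without running the command again; that is what makes the client-side retries safe.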
I have several HTTP callouts that are in a schedulable class and set to run every hour or so. After I deployed the app on the AppExchange and had a Salesforce user download it to test, it seems the jobs are not executing.
I can see the jobs are being scheduled to run accordingly; however, the database never seems to change. Is there any reason this could be happening, or is there a good chance the flaw lies in my code?
I was thinking that it could be permissions, but I am not sure (it's the first app I am deploying).
Check whether your end user's organisation has added your endpoint to "Remote Site Settings" in Setup. By endpoint I mean the address that's being called (or just its domain).
If the class is scheduled properly (which I believe would be a manual action, not just something that magically happens after installation... unless you've used a post-install script?) you could also examine Setup -> Apex Jobs and check whether there are any errors. If I'm right, there will be an error about a callout not being allowed due to remote site settings. If not, there's still a chance you'll see something that will make you think. For example, a batch job that executed successfully but with 0 iterations -> problem?
Last but not least - you can always try the debug logs :) Enable them in Setup (or open the developer console), fire the scheduled class's execute() manually and observe the results. How to fire it manually? Something like this pasted into "execute anonymous":
MySchedulableClass sched = new MySchedulableClass();
sched.execute(null);
Or - since you know what's inside the scheduled class - simply experiment.
Please note that if the updates you are performing somehow violate, for example, validation rules your client has, then yes, the database will be unchanged. But in such a case you should still be able to see the failures in Setup -> Apex Jobs.
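If clicking through Setup is awkward, the same information can be pulled from "execute anonymous" by querying AsyncApexJob. A small sketch (the LIMIT and the job types in the filter are arbitrary); a remote-site problem would typically surface in Status/ExtendedStatus:

// Surface the most recent scheduled/batch runs and any error details they carry.
for (AsyncApexJob job : [
        SELECT Id, JobType, Status, ExtendedStatus, NumberOfErrors, CompletedDate
        FROM AsyncApexJob
        WHERE JobType IN ('ScheduledApex', 'BatchApex')
        ORDER BY CreatedDate DESC
        LIMIT 20]) {
    System.debug(job.JobType + ' ' + job.Status
                 + ' errors=' + job.NumberOfErrors
                 + ' ' + job.ExtendedStatus);
}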
I'm hoping to avoid a hacked-together mishmash to achieve something. I know it can be done with a mishmash, but let's see if I'm missing a SIMPLE, easy way. This is Nagios Core 3.
I have a service. That service is checked 24x7x365. Notifications are sent 24x7x365, on WARNING and also on CRITICAL.
That is good--that is what I want.
However...now I want one single exception to that notification setup. NOTE: I do not want an exception to the monitoring setup--I want the console to always show the correct status, 24x7. I just want to make one exception for the notification (via email) on this service.
Here is the exception:
IF service state is WARN AND time of day is between 0300 and 0600, do NOT notify.
That's it. If it's CRITICAL, email-notify 24x7 (as it already does). If it's not between 3 and 6 a.m., notify regardless of WARN vs. CRIT (as it already does). The only exception is WARNING and 3-6 a.m.
Background: This is because we have maintenance that occurs every night between 3 and 6, which we've customized to produce a WARNING (not a CRITICAL). I want notifications at any time outside of this window (an admin may have accidentally launched maintenance in the middle of the day), and I want CRITICAL notifications at any time. I don't want to simply skip CHECKS during that time, because I do want the console to be correct (a big bunch of yellows 0300-0600).
So, anyway, it seems like I can kludge together a bunch of constructs, but does anybody have a simple way to define this one "boolean AND" condition in the notification (only) schedule?
This is what scheduled downtime is for. If you create a scheduled downtime window, alerts will be suppressed during that timeframe.
If that's not an option, then you need two different contacts for this service: one that is notified 24x7 and only on CRITICAL, and another that is notified 24x7 (minus 3-6 a.m.) and only receives WARNING notifications. Have them both point to the same contact email address.
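A sketch of that second option in Nagios 3 object configuration; the contact names, notification commands and email address are placeholders, so reuse whatever your existing contact definition already has:

define timeperiod{
        timeperiod_name 24x7_minus_maint
        alias           24x7 except the 03:00-06:00 maintenance window
        sunday          00:00-03:00,06:00-24:00
        monday          00:00-03:00,06:00-24:00
        tuesday         00:00-03:00,06:00-24:00
        wednesday       00:00-03:00,06:00-24:00
        thursday        00:00-03:00,06:00-24:00
        friday          00:00-03:00,06:00-24:00
        saturday        00:00-03:00,06:00-24:00
        }

define contact{
        contact_name                    ops-critical
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    c
        host_notification_options       d,u,r
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        email                           ops@example.com
        }

define contact{
        contact_name                    ops-warning
        service_notification_period     24x7_minus_maint
        host_notification_period        24x7
        service_notification_options    w
        host_notification_options       n
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        email                           ops@example.com
        }

Attach both contacts (or a contactgroup containing them) to the service in place of the current one. WARNINGs are then silent only between 03:00 and 06:00, while CRITICALs still go out around the clock; add r to whichever contact should also receive recovery notifications.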
I have a task queue with several tasks. If I delete a particular task from the Admin Console, it disappears from the task queue but GAE doesn't terminate it. The task is still being executed in the background.
Is this common behavior?
Yeah, I see the same behavior. It seems you can only delete pending tasks from the admin console. Once they've started, they continue to run until they finish or hit an exception (which could be as long as 10 minutes with the new update).
I've noticed they don't stop on version upgrades either, which is a little weird if you aren't expecting it... if the task takes a long time you end up with handlers running in two versions of the app simultaneously. It makes sense though.