StackDriver is alerting even when policy is met

I have a cron job set up on Google App Engine that runs every day at 11 am.
Since the execution of the cron is crucial to my system, I decided to set up an alert for any failures. It was set up as follows:
Create a metric with
Resource Type: GAE Application
Metric: Response Count
And filters:
Response Code >= 200
Response Code <= 299
Module ID: my-application
With an empty Group By and Aggregator set to NONE
Condition triggers if: any time series violates. Condition: is absent. For: 1 day.
So my intended behavior was: "If this app doesn't respond with a 200 to the cron for 1 day (long enough for it to be hit), alert me."
But I'm getting an email with the alert almost daily.
Is this a bug, or is there another way to set this up? I receive an alert, yet when I check the cron page in App Engine, the job ran at the correct time and returned 200 (success).

Related

Google App Engine Cron not triggering endpoint at specific times

We have multiple App Engine Cron entries triggering our App Engine application, but recently we detected a decrease in the number of processed events handled by one of the endpoints of our application. Looking at the App Engine Cron logs for this specific Cron entry in StackDriver, we found that there are missing entries during the days we investigated (March 11-15). Most of the missing triggers fall at the same times each day (12:15, 14:15, 16:15, 18:15, 20:15, 22:15, 00:15).
The screenshot below displays one specific day, and the red lines indicate the missing entries:
There are no requests with HTTP status code different than 200.
This is the configuration of the specific Cron entry (replaced some words with XXX due to business restrictions):
- description: 'Hourly job for XXX'
  url: /schedule/bigquery/XXX
  schedule: every 1 hours from 00:15 to 23:15
  timezone: UTC
  target: XXX
  retry_parameters:
    min_backoff_seconds: 2.5
    max_doublings: 5
Could someone on the GCP side take a look? The task name is 53751dd6a70fb9af38f49993b122b79f.

It seems that if a request takes longer than an hour, the next one gets skipped (i.e. cron doesn't launch the next iteration while the current iteration is still running).
Consider doing the actual work in a separate task, so that the only thing the cron request does is launch that task.
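
To illustrate the hypothesis above, here is a small simulation in plain Python. It assumes (as the answer guesses, without confirmation from Google) that a trigger is skipped whenever the previous run is still active. The function name simulate_cron and the sample durations are made up for illustration.

```python
from datetime import datetime, timedelta


def simulate_cron(first_start, interval, durations):
    """Return (scheduled_time, ran) pairs for consecutive cron triggers.

    Assumption (not confirmed behavior): a trigger is skipped when the
    previous run has not finished yet.
    """
    results = []
    busy_until = None
    for i, duration in enumerate(durations):
        scheduled = first_start + i * interval
        if busy_until is not None and scheduled < busy_until:
            # Previous run is still active, so this trigger is dropped.
            results.append((scheduled, False))
            continue
        busy_until = scheduled + duration
        results.append((scheduled, True))
    return results


start = datetime(2019, 3, 11, 0, 15)
hour = timedelta(hours=1)

# Second run takes 90 minutes, so the 02:15 trigger is skipped.
slow = simulate_cron(start, hour, [timedelta(minutes=m) for m in (30, 90, 10, 10)])

# If the cron request only enqueues a task, every run is short.
fast = simulate_cron(start, hour, [timedelta(seconds=1)] * 4)
```

Under that assumption, keeping the cron request short (it only enqueues the real work) means no trigger overlaps the previous one.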

Scheduling cron job to trigger cloud function

Is there a way to schedule a cron job using cron.yaml to trigger an HTTP Cloud Function? I tried to implement it, but passing the entire URL throws an error.
cron:
- description: "Test Call"
  url: https://us-central1-***.cloudfunctions.net/helloGET
  schedule: every 1 mins
I see this error in the console when I try to deploy the cron job
Unable to assign value 'https://us-central1-***.cloudfunctions.net/helloGET' to attribute 'url':
Value 'https://us-central1-***.cloudfunctions.net/helloGET' for url does not match expression '^(?:^/.*$)$'
in "/Users/xyz/Desktop/cron.yaml", line 3, column 8
I know the error is thrown because I have the full URL. If I pass just the path instead:
cron:
- description: "Test Call"
  url: /helloGET
  schedule: every 1 mins
then the cron job deploys, but it returns a 404 when it runs. I believe that with only the path it looks for the URL in App Engine, and since I don't have any code in App Engine (my service is in the Cloud Function), it can't find it.
Also, is there a way to set the schedule to run every 1 second instead of every 1 minute?

The url in cron.yaml needs to be a URL handled by your app, not an arbitrary one, which is why only the relative path works. From Syntax (emphasis mine):
url
Required. The url field specifies a URL in your application that
will be invoked by the Cron Service.
What you can do is have your application cron handler reach out to the arbitrary URL you need to trigger your Cloud Function. See Issuing HTTP(S) Requests
As for going below 1-minute intervals: that's not supported by cron itself. But there are ways to achieve something almost equivalent; see, for example, High frequency data refresh with Google App Engine.
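
A minimal sketch of the relay idea described above: a WSGI handler in your app serves /helloGET (the path cron.yaml points at) and forwards the call to the Cloud Function. It uses only the standard library; make_cron_app and the injectable fetch parameter are hypothetical names introduced here, and the function URL keeps the elided project ID from the question.

```python
import urllib.error
import urllib.request

# Placeholder from the question; the project ID is elided there and stays elided here.
FUNCTION_URL = "https://us-central1-***.cloudfunctions.net/helloGET"


def make_cron_app(fetch=None, function_url=FUNCTION_URL):
    """Build a WSGI app whose /helloGET handler relays to the Cloud Function."""

    def default_fetch(url):
        # Return the upstream HTTP status code, including error statuses.
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            return err.code

    do_fetch = fetch or default_fetch

    def app(environ, start_response):
        headers = [("Content-Type", "text/plain")]
        if environ.get("PATH_INFO") != "/helloGET":
            start_response("404 Not Found", headers)
            return [b"not found"]
        status = do_fetch(function_url)
        if 200 <= status <= 299:
            start_response("200 OK", headers)
            return [b"ok"]
        # Surface upstream failures so the cron run is recorded as failed.
        start_response("502 Bad Gateway", headers)
        return [b"upstream error"]

    return app
```

The fetch parameter exists only so the relay can be exercised without network access; in a deployed app the default would be used.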

Deadline exceeded while waiting for HTTP response, Python, Google App Engine

I am using a cron job to download a web page and save it, using Google App Engine.
After 5 seconds I get a Deadline exceeded error.
How can I avoid the error? Searching around this site, I see that I can extend the time limit for urlfetch. But it isn't urlfetch that is causing issues; it is the fact that I can't run the task for over 5 seconds.
For example, I have tried this fix, but it only works if I am running the page myself, not via a cron job:
HTTPException: Deadline exceeded while waiting for HTTP response from URL: #deadline

Cron jobs can run for a maximum of 10 minutes, but that doesn't mean URLFetch can't time out before then. The default timeout for URLFetch is 5 seconds, but you can raise it.
"You can set a deadline for a request, the most amount of time the service will wait for a response. By default, the deadline for a fetch is 5 seconds. The maximum deadline is 60 seconds for HTTP requests and 10 minutes for task queue and cron job requests. When using the URLConnection interface, the service uses the connection timeout (setConnectTimeout()) plus the read timeout (setReadTimeout()) as the deadline."
see: https://developers.google.com/appengine/docs/java/urlfetch/?csw=1#Requests
"A cron job will invoke a URL, using an HTTP GET request, at a given time of day. An HTTP request invoked by cron can run for up to 10 minutes, but is subject to the same limits as other HTTP requests."
see: https://developers.google.com/appengine/docs/python/config/cron
Also check out this older Stack Overflow question with a very similar issue:
If Google App Engine cron jobs have a 10 minute limit, then why do I get a DeadlineExceededError after the normal 30 seconds?
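
The limits quoted above can be captured in a small helper. This is only a sketch: choose_deadline is a hypothetical name, and the constants simply restate the quoted documentation (5 s default, 60 s maximum for HTTP requests, 10 minutes for task queue and cron requests).

```python
DEFAULT_DEADLINE = 5            # seconds, default for a fetch
MAX_HTTP_DEADLINE = 60          # seconds, ordinary HTTP requests
MAX_BACKGROUND_DEADLINE = 600   # seconds, task queue and cron requests


def choose_deadline(requested=None, from_cron_or_task=False):
    """Clamp a requested URLFetch deadline to the quoted limits."""
    if requested is None:
        return DEFAULT_DEADLINE
    limit = MAX_BACKGROUND_DEADLINE if from_cron_or_task else MAX_HTTP_DEADLINE
    return min(requested, limit)


# On the Python runtime the value would typically be passed as the
# deadline argument, e.g. urlfetch.fetch(url, deadline=choose_deadline(45)).
```

The point is that the 5-second failure comes from the fetch default, not from the cron request limit, so the deadline has to be raised explicitly on the fetch call.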

How do I run a cron job on Google App Engine immediately?

I have configured Google App Engine to record exception with ereporter.
The cron job is configured to run every 59 minutes. The cron.yaml is as follows:
cron:
- description: Daily exception report
  url: /_ereporter?sender=xxx.xxx@gmail.com # The sender must be an app admin.
  schedule: every 59 minutes
How do I run this immediately?
What I am trying to do here is simulate a 500 HTTP error and see the stack trace delivered immediately via the cron job.

Just go to the URL from your browser.

You can't do this with cron. Cron is a scheduling system; the closest you could get is to have it run every minute.
Alternatively, you could wrap your entire handler in a try/except block and try to catch everything (you can do this for some DeadlineExceededErrors, for instance), then fire off a task that invokes the ereporter handler, and then re-raise the exception.
However, in many cases Google infrastructure can be the cause of the Error 500 and you won't be able to catch the error. To be honest, you are only likely to be able to trigger an email for a subset of all possible Error 500s. The most reliable way would probably be to have a process continuously monitor the logs and send email from there.
Mind you, email isn't considered reliable or fast, so a 1-minute cron cycle is probably fast enough.
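
The try/except idea above might look like the following sketch. report_and_reraise is a hypothetical decorator; the report callable stands in for whatever enqueues the task that invokes the ereporter handler.

```python
import functools


def report_and_reraise(report):
    """Decorator: call report(exc) on any exception, then re-raise it."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                # Hypothetical hook: e.g. enqueue a task that hits the
                # ereporter URL so the report goes out right away.
                report(exc)
                raise  # keep the original error and stack trace

        return wrapper

    return decorator
```

As the answer notes, this only covers errors raised inside your own handler; failures in the surrounding infrastructure never reach the except block.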

I came across this thread as I was trying to do this as well. A (hacky) solution I found was to add a curl command at the end of my cloudbuild.yaml file that triggers the file immediately per this thread. Hope this helps!
Make a curl request in Cloud Build CI/CD pipeline

How can Google App Engine be prevented from immediately rescheduling tasks after status code 500?

I have a Google App Engine servlet that is cron configured to run once a week. Since it will take more than 1 minute of execution time, it launches a task (i.e. another servlet task/clear) on the application's default push task queue.
Now what I'm observing is this: if the task causes an exception (e.g. NullPointerException inside its second servlet), this gets translated into HTTP status 500 (i.e. HttpURLConnection.HTTP_INTERNAL_ERROR) and Google App Engine apparently reacts by immediately relaunching the same task again. It announces this by printing:
Web hook at http://127.0.0.1:8888/task/clear returned status code 500. Rescheduling..
I can see how this can sometimes be a feature, but in my scenario it's inappropriate. Can I request that Google App Engine should not do such automatic rescheduling, or am I expected to use other status codes to indicate error conditions that would not cause rescheduling by its rules? Or is this something that happens only on the dev. server?
BTW, I am currently also running other tasks (with different frequencies) on the same task queue, so throttling reschedules on the level of task queue configuration would be inconvenient (so I hope there is another/better option too.)

As per https://developers.google.com/appengine/docs/java/taskqueue/overview-push#Java_Task_execution - the task must return a response code between 200 and 299.
You can either return a response code in that range, set the taskRetryLimit in RetryOptions, or check the X-AppEngine-TaskExecutionCount header when the task launches to see how many times it has been launched, and act accordingly.
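
A sketch of the header-based option, written in Python rather than the asker's Java. It assumes the X-AppEngine-TaskExecutionCount header holds the number of earlier failed attempts (0 on the first run); make_task_app, work, and max_retries are hypothetical names.

```python
def make_task_app(work, max_retries=0):
    """WSGI task handler that stops retries after max_retries failures."""

    def app(environ, start_response):
        headers = [("Content-Type", "text/plain")]
        # WSGI exposes the X-AppEngine-TaskExecutionCount request header
        # under this key. Assumed here to be 0 on the first attempt.
        previous = int(environ.get("HTTP_X_APPENGINE_TASKEXECUTIONCOUNT", "0"))
        try:
            work()
        except Exception:
            if previous >= max_retries:
                # Answer with a 2xx so the queue does not reschedule the task.
                start_response("200 OK", headers)
                return [b"failed, giving up"]
            # A non-2xx status asks the queue to retry.
            start_response("500 Internal Server Error", headers)
            return [b"failed, retry"]
        start_response("200 OK", headers)
        return [b"done"]

    return app
```

With max_retries=0 the first failure is already answered with a 200, so the task is never rescheduled. A real handler should still log the swallowed exception.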

I think I've found a solution: in the Java API, there is a method RetryOptions#taskRetryLimit, which serves my case.
