In App Engine Flexible can the nginx.health_check logging be disabled?

App Engine Flexible creates an nginx.health_check log. It logs all health check requests, not just failed health checks. If your health check interval is under 10 seconds, the log can grow to multiple gigabytes in just a few days. Is there any way to configure it to record only failed checks, or to disable the log altogether?

The only way to disable this right now is to completely disable health checking (which I wouldn't recommend). We're looking at ways to fix this - apologies!

Related

Google App Engine: debugging Dashboard > Traffic > Sent

I have a GAE app (PHP72, env: standard) which hangs intermittently (once or twice a day, for about 5 minutes).
When this occurs I see a large spike in GAE dashboard's Traffic Sent graph.
I've reviewed all uses of file_get_contents and curl_exec within the app's scripts, not including those in /vendor/, and don't believe these to be the cause.
Is there a simple way in which I can review more info on these outbound requests?

There is no way to get more details in that dashboard. You're going to need to check your logs at the corresponding times. Obscure things to check for:
Cron jobs coming in at the same times
Task Queues spinning up

Health API hammered on my GCP Flex application - how do I dial that back?

The GCP infra hits your app on '/_ah/health' to determine that it is still alive. My app is a 'Flex' type, and it is being hit a lot on that URL.
It is making the logs hard to navigate, apart from anything else :(
How do I dial back the frequency of GCP testing the health end-point?

You can configure how App Engine performs health checks against your VM instances hosting your apps as part of the health_check: configuration in your app.yaml.
Health checks
Periodic health check requests are used to confirm that a VM instance
has been successfully deployed, and to check that a running instance
maintains a healthy status. Each health check must be answered within
a specified time interval. An instance is unhealthy when it fails to
respond to a specified number of consecutive health check requests. An
unhealthy instance will not receive any client requests, but health
checks will still be sent. If an unhealthy instance continues to fail
to respond to a predetermined number of consecutive health checks, it
will be restarted.
Health check requests are enabled by default, with default threshold
values. You can customize VM health checking by adding an optional
health check section to your configuration file:
health_check:
  enable_health_check: True
  check_interval_sec: 5
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2
You can use the following options with health checks:
enable_health_check - Enable/disable health checks. Health checks are enabled by default. To disable health checking, set to False. Default: True
check_interval_sec - Time interval between checks. Default: 1 second
timeout_sec - Health check timeout interval. Default: 1 second
unhealthy_threshold - An instance is unhealthy after failing this number of consecutive checks. Default: 1 check
healthy_threshold - An unhealthy instance becomes healthy again after successfully responding to this number of consecutive checks. Default: 1 check
restart_threshold - When the number of failed consecutive health checks exceeds this threshold, the instance is restarted. Default: 300 checks
Health checks are turned on by default, and turning them off is not recommended, because they keep App Engine from sending requests to a VM that is not responding. Instead, adjust check_interval_sec to control how often your VM is health checked.
You can also filter entries out when viewing the logs. Check out the documentation on the advanced log filtering UI and syntax.
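To get a feel for how much check_interval_sec affects log volume, here is a rough Python sketch that counts how many health check requests one instance receives per day. The function name and the `checkers` parameter (how many checkers probe the instance in parallel) are assumptions made for illustration; they are not App Engine settings.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def checks_per_day(check_interval_sec: int, checkers: int = 1) -> int:
    """Approximate health check requests per instance per day.

    `checkers` is a hypothetical knob for the number of parallel checkers.
    """
    if check_interval_sec <= 0:
        raise ValueError("check_interval_sec must be positive")
    return checkers * SECONDS_PER_DAY // check_interval_sec


if __name__ == "__main__":
    for interval in (1, 5, 10, 60):
        print(f"check_interval_sec={interval}: ~{checks_per_day(interval)} requests/day")
```

Under these assumptions, moving from a 5-second interval to a 60-second one cuts the count from 17280 to 1440 requests per day per checker.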

As others have mentioned, there's an app.yaml config page.
As of the date of posting, it doesn't say what the default is.
Here's what's super weird about the check_interval_sec setting: the GCP deployer blocks my deployments for certain values.
A setting of 200 for 'check_interval_sec', caused the deployment to fail with the following message:
"Invalid value for field 'resource.checkIntervalSec': '6000'.
Must be less than or equal to 300"
A setting of 20, caused the deployment to fail with the following message:
"Invalid value for field 'resource.checkIntervalSec': '600'.
Must be less than or equal to 300"
A setting of 10 worked, and it is indeed ten seconds in the console (though a bunch of threads hit the JVM at the same moment for some reason).
TL;DR: seconds are not seconds in 'check_interval_sec' - at least for 'flex' apps.
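The two error messages can be checked with a little arithmetic. The Python sketch below infers the multiplier between the app.yaml value and the value echoed in the error, then derives the largest setting that would stay under the quoted limit of 300. The factor is only inferred from these two data points, and the helper names are made up for this example; nothing here is a documented rule.

```python
# app.yaml check_interval_sec -> value echoed back in the deployment error
OBSERVED = {200: 6000, 20: 600}
API_MAX = 300  # upper limit quoted in the error message


def implied_factor(observed: dict) -> int:
    """Return the single integer factor relating configured to reported values."""
    factors = set()
    for configured, reported in observed.items():
        if reported % configured != 0:
            raise ValueError(f"{reported} is not a multiple of {configured}")
        factors.add(reported // configured)
    if len(factors) != 1:
        raise ValueError(f"inconsistent factors: {sorted(factors)}")
    return factors.pop()


def max_accepted_setting(factor: int, api_max: int = API_MAX) -> int:
    """Largest app.yaml value whose scaled value stays within api_max."""
    return api_max // factor


if __name__ == "__main__":
    factor = implied_factor(OBSERVED)
    print(f"implied factor: {factor}")
    print(f"largest accepted check_interval_sec: {max_accepted_setting(factor)}")
```

Both observations imply a factor of 30, which would make 10 the largest accepted value. That is consistent with 10 deploying while 20 and 200 were rejected.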

How can I disable logging of Google App Engine Health Checks: /_ah/vm_health

I created a DotNet Core App
Deployed it on the Google App Engine (Custom / Flex)
I opened the logging tab
I noticed the following entry: _ah/vm_health
It is not there just once or twice, it is there a very large number of times
Questions:
How can I exclude this one from the logs? I know the system is checking that everything is healthy, and this is good, I just don't want it logged.
How can I exclude anything from the logs? For example, if something keeps writing entries to the logs and I want the logging system to ignore them.

You can't disable logging of the health checks: they're still requests hitting your app and they're logged like any other request.
In the Stackdriver Logs Viewer you might be able to use the Advanced Logs Filters to hide the undesired entries. I can't give an actual example, though, as I haven't used this facility yet. Just to be clear: this only hides the entries while the filter is applied; the logging system still records them.
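As a starting point, here is a hedged Python sketch that builds such a filter string. It assumes the request URL is exposed on the entry as httpRequest.requestUrl and that the entries have resource.type "gae_app"; depending on which log the health checks land in, the URL may live in a different field, so `url_field` is left as a parameter. The function name is made up for this example.

```python
HEALTH_CHECK_PATHS = ("/_ah/health", "/_ah/vm_health")


def exclusion_filter(paths=HEALTH_CHECK_PATHS, url_field="httpRequest.requestUrl"):
    """Build a filter string that hides entries whose URL contains any of `paths`.

    `url_field` is an assumption; swap it for whichever field holds the URL.
    """
    clauses = ['resource.type="gae_app"']
    for path in paths:
        # ":" is the substring operator; NOT negates the comparison
        clauses.append(f'NOT {url_field}:"{path}"')
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Paste the printed string into the advanced filter box
    print(exclusion_filter())
```

As with any viewer filter, this only changes what is displayed.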

Why are App Engine task queues forced into a lower processing rate?

While executing tasks, I see the following message in the 'Task queues' section of the admin console:
"To conserve system resources during peak usage, App Engine is enforcing a processing rate lower than the maximum rate for this queue"
This is a paid app, and I am prepared to use more resources and pay for them. Any clue what might be causing it?

Your app is not scaling properly.
Hover your mouse over the question mark at the top-right of the console screen. You will most likely see a tooltip message like:
"App Engine is enforcing a processing rate lower than the maximum rate for this queue either because your application is returning HTTP 503 codes or because currently there is no instance available to execute a request."
So, check your logs for 503s, and check your resource settings to make sure your app can properly handle the traffic.
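If you export the request logs, a quick tally shows how often 503s occur. The Python sketch below assumes each entry is a dict with a "status" key; that shape and the function names are hypothetical, so adapt them to however your logs are actually exported.

```python
from collections import Counter


def count_statuses(entries):
    """Tally HTTP status codes from an iterable of log entry dicts."""
    return Counter(entry["status"] for entry in entries if "status" in entry)


def rate_503(entries):
    """Fraction of entries with a status that returned HTTP 503."""
    counts = count_statuses(entries)
    total = sum(counts.values())
    return counts[503] / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample data for illustration only
    sample = [{"status": 200}, {"status": 503}, {"status": 503}, {"status": 200}]
    print(count_statuses(sample))
    print(rate_503(sample))
```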

The API call datastore_v3.Put() required more quota than is available

How do I reset the quota once the Datastore Write Operations limit is reached?
Any operation on the datastore (both from the admin console and from my code) reports the following error:
The API call datastore_v3.Put() required more quota than is available.
I tried disabling the application and waiting for the quota to reset, but it did not work.
When the app is enabled, it produces a lot of tasks that in turn operate on the datastore, which obviously consumes the quota.
I have now paused the task queues and will try waiting another 24 hours.
Is this the right solution?

The quota resets every 24 hours, so either wait that long or enable billing. Disabling and re-enabling the application won't reset the quota.

You should assign a daily budget to your app even with billing enabled.
Maybe you forgot to do this.
Go to the Cloud Console, select your project,
go to Compute > App Engine > Settings in the left-hand navigation bar,
and set a daily budget.