I have an app on App Engine (flexible environment) and have configured a few cron jobs. These jobs should take several minutes, but they fail after about 30 seconds with a 502 error. The documentation (Scheduling Jobs with cron.yaml) is not very clear about the maximum run time of cron jobs, although it states that "An HTTP request invoked by cron can run for up to 24 hours".
Any ideas on how to overcome this?
Thanks in advance
This is an answer to my own question.
The problem I had was that I only had one Gunicorn worker. The App Engine health checks were happening every 30 seconds and there was no worker able to reply to the health checks, so the server was restarted.
I should have added more workers in the app.yaml file. For example, I've added the following line.
entrypoint: gunicorn -b :$PORT main:app --workers 12
Hope this helps.
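The worker count can also live in a Gunicorn config file instead of on the command line. The following is a minimal sketch, assuming a file named gunicorn.conf.py (a hypothetical addition, not part of my original setup) and the common rule of thumb of 2 x CPU cores + 1 workers. The timeout value is also an assumption: Gunicorn's default worker timeout is 30 seconds, so a multi-minute cron request may need a higher value in addition to more workers.

```python
# gunicorn.conf.py -- hypothetical config file, loaded from app.yaml with:
#   entrypoint: gunicorn -c gunicorn.conf.py main:app
import multiprocessing
import os

# App Engine provides the port in $PORT; 8080 is an assumed local fallback.
bind = ":" + os.environ.get("PORT", "8080")

# Rule of thumb (assumption): 2 x cores + 1, so at least one worker is
# usually free to answer health checks while another runs a long cron job.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before Gunicorn restarts it.
# 600 is an assumed value; size it to the longest cron job.
timeout = 600
```

With this file in place, the --workers flag on the entrypoint line is no longer needed.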
I am trying to deploy a new version of an App Engine service with Google Cloud Build, using the following steps:
Deploy a maintenance dispatch.yml to route all requests to a maintenance page
Upgrade the database
Deploy the new version
Deploy dispatch.yml to route requests back to the default service
The first three steps are working, but step 4 results in the following error:
an operation is already in progress
The running GAE process is the one that is stopping the previous version. So how can I find the running process and wait until it has stopped before I deploy the dispatch.yml?
Per the documentation, to migrate traffic from one version to another, you should use the set-traffic command (gcloud app services set-traffic). So I think your step 4 should be replaced by the set-traffic command.
I could solve the problem by myself with the following statement:
gcloud app operations wait $(gcloud app operations list --format="value(id)" --pending --limit=1) || true
This waits for one running operation. In my case I had to add this line twice because there were two running operations to wait for.
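Instead of repeating the line once per operation, the same two gcloud commands can be run in a loop until nothing is pending. This is only a sketch: the function names, the injectable run parameter, and the max_rounds cap are my own assumptions for illustration, and it assumes that --format="value(id)" prints one ID per line, as in the one-liner above.

```python
import subprocess


def pending_operation_ids(run=subprocess.run):
    """Return the IDs of App Engine operations that are still pending."""
    result = run(
        ["gcloud", "app", "operations", "list",
         "--pending", "--format=value(id)"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def wait_for_pending_operations(run=subprocess.run, max_rounds=10):
    """Wait on pending operations until none are left.

    Returns the list of operation IDs that were waited on.
    max_rounds is an assumed safety cap so the loop cannot run forever.
    """
    waited = []
    for _ in range(max_rounds):
        ids = pending_operation_ids(run)
        if not ids:
            break
        for op_id in ids:
            # check=False mirrors the "|| true" in the shell one-liner:
            # a failed wait does not abort the build step.
            run(["gcloud", "app", "operations", "wait", op_id], check=False)
            waited.append(op_id)
    return waited
```

Calling wait_for_pending_operations() right before deploying dispatch.yml would then cover any number of running operations.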
This may be the wrong place for this question, so please re-direct me if necessary.
I have deployed a couple of simple functions using Google Cloud Functions that do the following:
1. Read files from AWS and write to Cloud SQL
2. Aggregate Cloud SQL data and write a CSV file to a Cloud Storage bucket
3. Run a simple OLS prediction model on the aggregated data
I have these as separate functions because (1) often takes longer than the Cloud Functions maximum timeout. Because of this, I am considering moving this whole thing to App Engine as a service. My questions about App Engine Standard are:
What do the request timeouts mean? If I were to run this service, do I still have a short time-limit after which it will no longer run?
Is App Engine the best thing to use for this task?
Thanks for all your help
According to the Google documentation, GAE Standard has a maximum timeout of 1 minute for HTTP requests and 10 minutes for cron/task requests in the older environments. Newer environments allow 10 minutes for both HTTP requests and tasks. If your functions take longer than that, GAE Standard won't work for you. In that case, take a look at GAE Flex (see the Google documentation that compares Flex to Standard).
Secondly, it seems to me that what you have are endpoints that are only hit occasionally or at specific scheduled times. If that is the case, I would also recommend taking a look at Cloud Run. We have a blog article about it, which includes this:
"...Another thing to note about Cloud Run is that it only runs when it receives an HTTP request. It plays dead and comes alive to execute your code when an HTTP request comes in. When it is done executing the request, it goes 'dead' again till the next request comes in. This means you're not paying for time spent idling, i.e. when it is not doing anything..."
You can keep your Cloud Functions and the strong cohesion implemented by each of your 3 functions, and use Cloud Workflows, a serverless solution, to orchestrate the 3 CF calls. The drawback: you pay for 3 CF invocations and 3 Workflows steps. But does it matter? 2 million CF invocations and 5,000 Workflows steps are free.
As proposed by @NoCommandLine, Cloud Run is indeed an alternative, with its timeout of 3600 s (1 h). The drawback: you need to wrap your code in an HTTP request handler and provide a web server such as Express or Gunicorn.
A hack is to build a Docker container for your code, with no need for a web server, and run it using Cloud Build, which has a timeout of 24 hours.
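For the Cloud Run option, "wrapping your code in an HTTP request" can be quite small. Below is a rough sketch using only the Python standard library rather than Gunicorn or Express; run_job, JobHandler and make_server are hypothetical names, and run_job stands in for one of the three functions. It assumes the container is told which port to listen on through the PORT environment variable.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_job():
    """Hypothetical placeholder for the real work (e.g. the aggregation step)."""
    return "done"


class JobHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request stays open while the job runs, so the job has to
        # finish within the platform's request timeout.
        body = run_job().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=None):
    """Build the server; 8080 is an assumed fallback when PORT is not set."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("", port), JobHandler)
```

The container entrypoint would then call make_server().serve_forever(), and a scheduler or workflow step triggers the job with a POST request.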
My ReactJS app is a simple app that contains an embedded SurveyJS widget. It is deployed on Google Cloud Run and takes 20 seconds to load the first time; subsequent access is faster.
How can I troubleshoot this? I am not sure whether it is a Google Cloud Run config issue or a problem with my Dockerfile.
Appreciate your inputs.
Thanks,
Please note that these 20 seconds can be due to the Cloud Run cold start.
The first time a Cloud Run instance starts, it has to download the container image and start the container. This is called a "cold start". The opposite is a "warm start", which means that the container is already running and is either waiting for requests or already processing them.
Please have a look at the official Cloud Run documentation on minimizing cold starts. Also have a look at the external tutorials, which explain the cold start and possible ways to minimize it.
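To check whether the delay really comes from the cold start rather than from the app itself, you can time the first request against a few follow-up requests. This is only a rough sketch: the function names and the opener parameter are my own, and the first measurement is only a true cold start if the service had no running instance at the time.

```python
import time
import urllib.request


def time_request(url, opener=urllib.request.urlopen):
    """Return the seconds taken to fetch url and read the full response."""
    start = time.perf_counter()
    with opener(url) as resp:
        resp.read()
    return time.perf_counter() - start


def compare_cold_and_warm(url, warm_requests=3, opener=urllib.request.urlopen):
    """Time one (hopefully cold) request, then several warm ones."""
    cold = time_request(url, opener)
    warm = [time_request(url, opener) for _ in range(warm_requests)]
    return cold, warm
```

If the first value is around 20 seconds and the rest are well below a second, the cold start is the likely cause; if all of them are slow, the problem is more likely in the app or the image.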
I am trying to deploy a simple Node.js app to the GAE flexible environment.
I followed the official guide and used this command:
gcloud app deploy --verbosity=debug
I have tried many times.
The logs keep repeating the following indefinitely:
DEBUG: Operation [apps/just-aloe-212502/operations/b1e812f6-299c-438e-b335-e35aa343242a] not complete. Waiting to retry.
Updating service [flex-env-get-started] (this may take several minutes)...⠛DEBUG: Operation [apps/just-aloe-212502/operations/b1e812f6-299c-438e-b335-e35aa343242a] not complete. Waiting to retry.
Updating service [flex-env-get-started] (this may take several minutes)...⠛DEBUG: Operation [apps/just-aloe-212502/operations/b1e812f6-299c-438e-b335-e35aa343242a] not complete. Waiting to retry.
Updating service [flex-env-get-started] (this may take several minutes)...⠹DEBUG: Operation [apps/just-aloe-212502/operations/b1e812f6-299c-438e-b335-e35aa343242a] not complete. Waiting to retry.
Updating service [flex-env-get-started] (this may take several minutes)...⠼DEBUG: Operation [apps/just-aloe-212502/operations/b1e812f6-299c-438e-b335-e35aa343242a] not complete. Waiting to retry.
What happened?
I can run my simple Node.js hello-world app successfully on my local machine, and the GAE standard environment works fine.
I should note that the App Engine flexible environment is based on Google Compute Engine, so it takes time to configure the infrastructure when you deploy your app.
The first deployment of a new version of an App Engine flexible application takes some time because the internal infrastructure has to be set up. Subsequent deployments should be relatively fast, since they only modify some GCP resources and then wait on the health checks.
Deployment also requires building a Docker image. Using a pre-built Docker image that is already uploaded to gcr.io skips the Docker build step and shortens the deployment time.
So I'm using an AE managed VM to host a website with the Node.js Docker image. It works great: the site works, etc. However, I can't seem to get an AE cron job registered. I added a cron.yaml file right next to my app.yaml file, and I'm not excluding it in my Dockerfile.
Is there some extra step I need to take for the cron job to be registered? Or are the cron jobs not supported on managed VMs?
cron.yaml:
cron:
- description: daily summary job
  url: /cron/socialmedia/twitter
  schedule: every 2 minutes
At least on regular GAE (i.e. not managed VM) simply uploading the application with appcfg.py update doesn't always also update the cron jobs.
Updating cron jobs specifically, using appcfg.py update_cron should work in such cases.
You can deploy your cron.yaml jobs using gcloud preview app deploy cron.yaml (in current Cloud SDK releases the command is gcloud app deploy cron.yaml).