Protecting cron scheduling endpoint on AppEngine (Flexible Environment) - google-app-engine

I am trying to get my Dataflow job scheduled via cron.yaml in an App Engine flexible environment. This works flawlessly when I leave the endpoint unprotected. However, when I try to secure the endpoint, I see 403 status responses, even when triggering it from within the TaskQueues interface.
My app.yaml looks like this:
runtime: java
env: flex

handlers:
- url: /.*
  script: this field is required, but ignored
- url: /dataflow/schedule
  script: this field is required, but ignored
  login: admin

runtime_config:
  jdk: openjdk8

resources:
  cpu: .5
  memory_gb: 1.3
  disk_size_gb: 10

manual_scaling:
  instances: 1

Secure handlers (like login: admin) are not supported on App Engine flexible, which is why you get the 403.
To secure that handler, check the "X-AppEngine-Cron" request header in your app. It is a trusted header that App Engine strips from external requests, so it is only present on traffic that really comes from App Engine cron.
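For illustration, here is a minimal sketch of that check. The question uses the Java flex runtime, but the idea is framework-agnostic; this sketch assumes a Python/Flask handler and mirrors the /dataflow/schedule route from the question.

from flask import Flask, request, abort

app = Flask(__name__)

@app.route("/dataflow/schedule")
def schedule_dataflow():
    # App Engine strips X-AppEngine-Cron from external requests, so only
    # genuine cron traffic can carry it.
    if request.headers.get("X-Appengine-Cron") != "true":
        abort(403)
    # ... kick off the Dataflow job here ...
    return "scheduled", 200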

Related

Deploying FrontEnd and BackEnd as two separate applications with Google Cloud App Engine

I have two applications that I want to deploy with Google Cloud App Engine.
One of them is a React front end, which I want to serve through www.videoo.io.
The second one is the back end, which will be served via api.videoo.io.
Frontend yaml file react.yaml :
runtime: nodejs16
env: standard

handlers:
- url: /static
  static_dir: static
  secure: always
- url: www.videoo.io/*
  service: frontend
  script: auto
  secure: always
API yaml file, api.yaml :
runtime: python37
entrypoint: gunicorn -b :$PORT videoo.wsgi
service: "videoo-api"
env: standard

handlers:
- url: api.videoo.io/*
  service: backend
  script: auto
  secure: always
Is this the correct way to achieve this?
What is the best strategy to serve these two separate applications that need to communicate with each other (the frontend will make calls to the API to get object information that is stored in the Django app)?
Here is also my domain name information in the Google App Engine settings:
You are on the right path. You are using the microservices architecture, which basically means deploying individual apps as parts (services) of a single project.
Your frontend service seems to be your default so you don't need a service name for it. Every GAE App needs a default service.
Rename react.yaml to app.yaml (since it will be your default service) and update the contents to
runtime: nodejs16
env: standard

handlers:
- url: /static
  static_dir: static
  secure: always
- url: /.*
  script: auto
  secure: always
Also rename your api.yaml to backend.yaml, since that is what you called your service (not sure if this is required, but I do it to easily keep track of which file controls which service). Update the contents of the file to
service: backend
runtime: python37
entrypoint: gunicorn -b :$PORT videoo.wsgi
env: standard

handlers:
- url: /.*
  script: auto
  secure: always
You'll need a dispatch.yaml file to route traffic to the different services. Something like
dispatch:
  # Send all api traffic to the backend service.
  - url: "api.videoo.io/*"
    service: backend
  # Send all other traffic to the default (frontend) service.
  - url: "*/*"
    service: default
The final step is that, during your deploy, you deploy the two services in addition to your dispatch.yaml file. The dispatch.yaml file has to be in your project root folder:
gcloud app deploy app.yaml dispatch.yaml <path_to_backend.yaml>

How do I force google app engine to use https ssl protocol?

I have a Flask web app deployed to Google App Engine, but it is not forcing my link to HTTPS. If I refresh the page, it goes to the HTTPS version, but a user can still remove the "s" and jump back to the HTTP version. Is there any way to completely remove the HTTP protocol on my site and have it redirect to the SSL version? Here is the app.yaml file I'm currently using. I also tried adding redirect_http_response_code: 301, with no luck in removing the HTTP protocol.
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

runtime_config:
  python_version: 3.7

# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

handlers:
- url: /.*
  secure: always
  script: auto
I prefer not to install additional software packages for relatively simple things, so I do it myself. For GAE flex there are a few things to handle. I've added comments below to help explain.
from flask import request, redirect

@app.before_request
def redirect_http():
    # http -> https
    if (
        # Flask tells us when the request is http. This might not be needed for you,
        # but I need it because I use this code for GAE standard as well.
        not request.is_secure and
        # Load balancers forward https requests as http but set headers to let you
        # know that the original request was https.
        not request.headers.get('X-Forwarded-Proto') == 'https' and
        # GAE cron urls must be http.
        not request.path.startswith("/cron")
    ):
        return redirect("https" + request.url[4:], code=301)

    # naked domain -> www
    if request.url.startswith("https://example.com"):
        return redirect('https://www.' + request.url[8:], code=301)
The Flask packages recommended by @tzovourn do other important things as well, so you may want to consider those (I personally do all of this myself, since it isn't hard to do).
I noticed that you are using App Engine Flexible. As per documentation, setting secure: always in app.yaml doesn't work for App Engine Flexible.
The documentation recommends securing your HTTP requests by using the Flask-Talisman library.
Another way to configure your Flask application to redirect all incoming requests to HTTPS is the Flask-SSLify extension.
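As a rough sketch of the Talisman approach (assuming the package is installed as flask-talisman), wrapping the app is enough to get the HTTPS redirects; force_https is on by default:

from flask import Flask
from flask_talisman import Talisman

app = Flask(__name__)

# Talisman forces HTTPS (redirects) by default; passing
# content_security_policy=None keeps it from also injecting a CSP header
# if all you want is the redirect behaviour.
Talisman(app, content_security_policy=None)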

Clear App Engine Flex static files cache

I set a Cache-Control max-age of 1 year on my server.
How do I tell App Engine to clear the cache and take a new version from the server?
The configuration is the Flex custom environment:
runtime: custom
env: flex
env_variables:
  writecontrolEnv: 'prod'
handlers:
- url: /.*
  script: this field is required, but ignored
service: gateway-prod
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 2
resources:
  cpu: 1
  memory_gb: 2
  disk_size_gb: 10
skip_files:
- node_modules/
network:
  instance_tag: gateway
Assuming that your app is the one serving the static files, the cache parameters sent by the server are controlled by your application code, which means that once you deploy a new version with updated parameters the server will send the updated values.
But the problem is that caching is actually performed by the client (or some middle-man network device), so the end user will not reach the server until the (in your case very long) cache expiration time is reached, and so won't see the update until then.
You can try to clear your browser cache, hoping that the browser was the one doing the caching.
To prevent such occurrences in the future you may want to choose a shorter cache expiration time or use some cache-busting technique like this one.
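The question's gateway service is a custom runtime, so purely as an illustration of the general idea (not the specific technique linked above), here is a hypothetical Flask-style helper that appends a content hash to static URLs; a new deploy then produces new URLs, and the old year-long cache entries are simply never requested again:

import hashlib
import os

from flask import Flask, url_for

app = Flask(__name__)

@app.template_global()
def busted_url(filename):
    # Hash the file contents so the query string changes whenever the file does.
    path = os.path.join(app.static_folder, filename)
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()[:8]
    return url_for("static", filename=filename) + "?v=" + digest

# In a template: <script src="{{ busted_url('app.js') }}"></script>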

Google App Engine handler won't work unless a catch all handler is set

I'm struggling to set up URL handlers in the handlers section of my app.yaml file for my standard environment PHP application.
The following configuration makes both mydomain.com/abc.php and mydomain.com display the abc.php file in my dist folder:
handlers:
- url: /(abc\.php)
  script: dist/abc.php
- url: /(.*)
  script: dist/abc.php
However, I don't want a catch-all handler, so I removed it so that only one handler exists:
handlers:
- url: /(abc\.php)
  script: dist/abc.php
Now when I go to mydomain.com/abc.php, I'm getting a 500 error:
How can I make a targeted URL work on GAE without using a handler for all URLs?
My app.yaml file contained the following manual_scaling and resources sections:
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 1
  disk_size_gb: 10
Since these aren't supported by the App Engine standard environment, simply commenting them out resolved the issue.

Google App Engine service (module) not starting, and flooding 404's to /_ah/start

I'm refactoring an existing codebase. I switched from using the appcfg.py to using the gcloud command, which seemed to go fine. Our entire codebase was running on one default frontend instance, which I'm now trying to break into services. To start, I created one "worker" backend service, and I'm using a cron job to test.
I can see the worker in the console, but no instance is started. The logs for that service are rapidly flooded with 404s to /_ah/start. I've tried manual and basic scaling. The documentation states that it's okay not to have a startup script, and that a 404 at that endpoint is considered success. However, the instance is not starting.
Logs
worker.yaml
service: worker
runtime: python27
api_version: 1
instance_class: B2
manual_scaling:
  instances: 1
threadsafe: false
handlers:
- url: /work/.*
  script: worker.app
  secure: always
  login: admin
worker.py
import webapp2

import handlers

config = {
    # ...
}

app = webapp2.WSGIApplication([
    webapp2.Route(
        '/work/test<:/?>',
        handlers.Test,
        methods=['GET'],
    ),
], debug=True, config=config)
dispatch.yaml
dispatch:
- url: "*/work/*"
  module: worker
